Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce memory requirements of checkstyleNohttp #60

Open
wilkinsona opened this issue Jan 19, 2023 · 7 comments
Open

Reduce memory requirements of checkstyleNohttp #60

wilkinsona opened this issue Jan 19, 2023 · 7 comments

Comments

@wilkinsona
Copy link
Contributor

In Spring Boot's build, we currently allocate a 1GB max heap to checkstyleNohttp:

tasks.named("checkstyleNohttp").configure {
    maxHeapSize = "1g"
}

Unfortunately, this isn't always sufficient and the build can still sometimes fail with an OutOfMemoryError:

A failure occurred while executing org.gradle.api.plugins.quality.internal.CheckstyleAction
> An unexpected error occurred configuring and executing Checkstyle.
  > java.lang.OutOfMemoryError: GC overhead limit exceeded
    > GC overhead limit exceeded

Looking at some data from ge.spring.io, @mhalbritter and I have both seen this failure in the last 7 days. We could just bump the max heap size but a gigabyte already feels excessive for a check that can, presumably, be performed on a file-by-file basis.

@mhalbritter
Copy link

For me, it initially died on this commit: spring-projects/spring-boot@e8d809f

@wilkinsona
Copy link
Contributor Author

This continues to be a problem. In my clone of Spring Boot 2.7.x branch, it's checking 9744 files and is failing quite reliably when configured with a heap of 1g. The task succeeds with a heap of 1536m.

@philwebb
Copy link

I think this might be a general checkstyle issue, I can't see anything in nohttp that looks funky. The Gradle plugin delegates to an Ant task which delegate to RootChecker. The API takes a List<File> parameter so it's possible the number of files is the problem. Perhaps an alternative task that doesn't use checkstyle could be developed.

@mhalbritter
Copy link

mhalbritter commented Feb 22, 2023

I looked at it yesterday. Checkstyle reads the whole file in memory (in com.puppycrawl.tools.checkstyle.api.FileText#FileText(java.io.File, java.lang.String)). Maybe there's some big file which checkstyle analyzes and then causes the OOM?

@wilkinsona
Copy link
Contributor Author

wilkinsona commented Feb 22, 2023

Looking at a heap dump in YourKit taken while the task is running, I think the problem is the sheer number of files rather than any one big file. There also appears to be some inefficiencies in Ant or the Checkstyle Ant task.

In my Spring Boot 2.7.x checkout, I have 79297 files according to find . -type f | wc -l yet the heap shows over 1 million strings referenced by Ant FileSet instances. These strings appear to be paths of files in the checkout, including those beneath build directories that are ultimately ignored. Some of them are absolute paths but the majority seem to be relative paths. At the time of the heap dump, there were 8653 FileSet instances with a retained size of roughly 390MB.

@wilkinsona
Copy link
Contributor Author

I've exported all of the strings from the memory snapshot and can see that there's some duplication. For example:

$ grep TestFailuresPluginIntegrationTests.java strings.txt
src/test/java/org/springframework/boot/build/testing/TestFailuresPluginIntegrationTests.java
buildSrc/src/test/java/org/springframework/boot/build/testing/TestFailuresPluginIntegrationTests.java
buildSrc/src/test/java/org/springframework/boot/build/testing/TestFailuresPluginIntegrationTests.java
buildSrc/src/test/java/org/springframework/boot/build/testing/TestFailuresPluginIntegrationTests.java
buildSrc/src/test/java/org/springframework/boot/build/testing/TestFailuresPluginIntegrationTests.java
buildSrc/src/test/java/org/springframework/boot/build/testing/TestFailuresPluginIntegrationTests.java
buildSrc/src/test/java/org/springframework/boot/build/testing/TestFailuresPluginIntegrationTests.java

In total there were 1,341,465 strings exported but only 298,247 of those were unique.

@wilkinsona
Copy link
Contributor Author

The inefficiency appears to be in how Gradle maps a FileTree into Ant. The single FileTree in Gradle has become 8653 FileSet instances in Ant, each only containing a single file. Each FileSet also contains a DirectoryScanner that contains information about every file and directory that it didn't select. There are 6 FileSet instance for files at the root of the checkout. The DirectoryScanner of each of these consumes almost 48MB heap storing the files and directories that it didn't select as well as every directory that it scanned.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants