Tomszt/hang on commit #1387

TomaszSzt · 2021-11-04T16:33:35Z

SOLVED PROBLEM

Commit page consumes to much time and memory when being subjected to huge, binary mostly commit. In effect server freezes and reset is necessary.

Issue reported: #1385

DESCRIPTION:

This is a pull request which basically does following:

Prevents commit page from chewing up too many files. The number of files is actually configurable through "defaults.properties" web.maxCommitPaths option.
Extends user access to logging configuration by providing the ability to append options seen in $basepath/log4j.properties to options seen in internally jar contained log4j.properties. This is explained in "defaults.properties" web.debugMode option description and in example log4-debug-example.properties file.
2.1 Adds some logging instructions across the application at TRACE level.
2.2. Adds early messages displaying logging configuration on System.out if any --dailyLogFile or web.debugMode or log4.properties was used to override default settings.
Prevents server from sending "Internal Error" if some text resources are missing and using default dumb text instead. This is basically a robustness against human mistakes during development and testing. This is easy to loose some text by mistake. This is especially true with GitBlit context-less keys like: gb.mirrorWarning which do encourage wide use of same text in many places.

Each commit carefully describes steps taken.

TESTING

Only manual tests were made on the problematic repository.
Only GitBlit GO was tested.

No new automatic tests are included since:

I can hardly bite through the entire test suite without a detailed guide. I am used to very different work flow;
I can't imagine how to formulate an automatic test which would check if OutOfMemory was thrown or freeze have happen. I could not identify a single point responsible for a problem, like ie. some exception thrown somewhere or something like that.

Old tests are FAILING but they do fail on "master" too. I do not think they are kept up to date or something in my setup is screwed up. I do not feel competent to fix it.

MANUAL TEST CONDITIONS
PART 1.Commit pages.

FAULT:
Following manual test conditions should apply:
1.Build server
2.Ensure that web.maxCommitPaths=-1
3.Check that Your JVM has 256MB max heap size (default for JDK8)
4.Start server.
5.Create repository with an initial commit.
6.Clone it.
7.Create a huge commit of at least 16k files and at least 1GB in raw size.
8.Push it.
6.Enter it through web interface. Click the commit page.
Observe that page takes about 30 seconds to load or more, depending on machine (mine server: 2 minutes)
7.Do it few times, quickly, not waiting for page to load.
8.Try to display any other page, ie. repository list.

Expected result: No pages loads, server seems to be stuck.

CORRECTED EFFECT
Do same steps, except:
(...)
2.Ensure that web.maxCommitPaths=1000
(...)
6.Observe, that page loads withing a few seconds or faster.
Observe that commited files list is now 1000 lines long and headers over it do contain "+more changes" and "or more" texts to indicate that something was trimmed off.
(...)
8.All works as expected.

Repeat it again, but commit initial commit so that the problematic commit will be parent-less. Same effects are expected.

Repeat the test with small repository and observe that headers above commited files list should NOT contain "+ more changes" or "or more" texts.

Otherwise web interface should be visually in agreement with "master" branch.

Note: Polish translation for some of added texts is provided. I will possibly add more of them in next pull requests, since my users are Polish.

PART.2 Logging.

1.Build server.
2.Start it with --dailyLogFile
Notice: "Using --dailyLogFile to override log output." on console and dump of log4j.properties file.
3.Stop it, start with "web.debugMode=true" option.
Notice: "Using web.debugMode to override debug levels." on console and dump of log4j.properties file.
4.Stop it, rename data/log4j-debug-example.properties to log4j.properties, start it.
Notice: "Using web.debugMode to override debug levels." on console and dump of log4j.properties file.
5.Try mixing --dailyLogFile and web.debugMode with log4j.properties.
Notice different messages and prioritization of settings from log4j.properties if options do conflict.

EXPECTATIONS

I do hope that some developers would at least "cherry-pick" some of proposed changes and pin point obvious mistakes on my part. This is a first time I do actually publish fixes to open source projects, so please excuse me if I offended someone or made some wrongs.

the compiler version. Was: Directly call javac. This depended on system search path settings (ie. PATH) on windows and might differ from what ANT deduced from JAVA_HOME (if present) or automatically. Is: A <javac> ant task is asked to compile "empty.java" file which contains nothing except a comment. As a side effect of compiler argument -version the version is displayed to user. 2.Added <presetdef name="javac"> <javac includeantruntime="false" /> </presetdef> what removes the "warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last;..." ANT warning during builds (ant >1.10) This change may be incompatible with older ANT versions. I do not have environment to test it.

not clearly described. This should be restored at final deployment.

…T source code. Added "javadoc" target to build.xml. 2.Modified .gitignore to not include build docs to repository 3.Modified .gitignore to ignore standard backup files *.java~ and etc.

used library this java doc is not specific enough.

maintenace. 2.Changed logging policy to following: - log4j configuration is loaded, as previously, from an internal log4j.properties file (from gitblit.jar) - THEN it is overriden from $baseFolder/log4j.properties if this file exists. - THEN web.debugMode is used to override logging level properties IF THEY ARE AT DEFAULT STATE. - THEN --dailyLogFile is used to override appenders (output for log) if in default state. - if any overrides are present loaded configuration is dumped to System.out 3.Updated comments in log4j.properties AND default.properties to inform users how to control debugging level in fain grained way. 4.Prepared to add trace level commands to CommitPage since this branch is motivated mainly due to CommitPage having problems with large repositories.

- potential environment variables needed - SSL conflict with pre 7.171 JRE version used to run ANT 2.Added Windows script setEnv.bat which explains what needs to be set up and how to let ANT 1.10 or ANT 1.9 to use more modern JRE to avoid SSL exceptions and still compile using JDK 1.7.

logging set-up. 2.Added example log setup file with excessive manual inside. 3.Added some notes to build file how to boost compilation speed which with moxie is horrible when compared to pure ANT solution. It basically repetively tries do download everything and compile from scratch. We have "clean" target to enforce it, don't we? Plain compile should be hell fast.

1.Detected that RepositoryPage is using logger field which duplicates logger() call from superclass BasePage. Since logger() is lazy initialized I did remove the field. Subsequent commit will be used to detect and re-factor it to use base class logger() call.

Finished refactoring from logger field to logger() call.

I think it is important to not need to dig into Wicket docs to figure it out. 2.Changed javadoc target to chew into dependencies to correctly display all implemented interfaces.

2.Added fast access checks if log levels are enabled. 3.Provided recommended use patterns in javadocs.

"large commit crashes server" bug. 2.Pin-pointed it to JGitUtils.getFilesInCommit

Server GitBlitServer do initialize static field in instance constructor. This is not a problem. The problem is, that it also initializes background loggers overriding some settings. Most of gitblit.utils classes is using private static final Logger logger = initialization scheme, that is inside a "class constructor". Java strictly says, that classes are initialized before first use. The "use" is any kind of "touch" to class, either directly or in-directly. Since this is very hard to control one may assume, that some static initializers may get called before GitBlitServer reaches logging set-up and construction. Depending on underlying back-end previously fetched loggers may be miss-configured, non-configured or something unpredictable may happen. Changing initialization scheme to be as soon as possible.

to ensure, at best possible level, that utils classes static initializers are NOT initialized before logging is set up.

if resource is missing. This may result in strange texts to be shown to user if resource is missing, "[Warning: Property for 'xxxx' not found]" but it is better than Internal Server Error which do totally disable functionality.

Use of AtomicInteger as "mutable interger" is an overkill and performance loss.

wrappers over wrappers over.... where a simple array would be sufficient. Now I start to understand what kind of thinking is behind high memory consumption and low overall performance. It wasn't like that when I was a lad and had 960 bytes of RAM on board... ;)

2.Added property which controls limit of files to process to show diff stats in commit. 3.Updated code responsible for computing diffs to respect it. 4.Ensured that server will not report Internal Error if missing resource string is found. 5.Added necessary resource strings in EN and PL translations. Noticed that there is no script which transforms UTF-8 props to ASCII what is a pain for translators, but ignore id. 6.Updated some comments regarding loggers and their retrival startegy 7.Updated CommitLegendPanel and DiffStatsPanel to inform user that there are some more changes if limit set in 2. was tripped. 8.Code is ready for clean-up and full testing.

2.Refactored GitBlitServer main() method to serve for tests where tests do need to run main related code before the instance of server is created. This seems to be a design flaw in it since some elements are processed before and some inside constructor. Sadly the way of loggers configuration enforces such a trick.

after ant test but after checking out master noticed, that those are also present in master build and are not related to changes made. Damn... I do not feel right to dig into it.

regarding peculiar problem after server upgrade when web.maxActivityCommits is modified in server configuration to not be one of -1, 0, 25, 50, 75, 100, 150, 200, 250, 500. The detailed and exact code line which is responsible for why this happens is something beond my understanding.

TomaszSzt added 21 commits November 2, 2021 13:34

Added version indicator. Possibly in wrong place since it is

81270e6

not clearly described. This should be restored at final deployment.

1.Noticed that there is no ANT target to generate javadocs for GITBLI…

3f9015a

…T source code. Added "javadoc" target to build.xml. 2.Modified .gitignore to not include build docs to repository 3.Modified .gitignore to ignore standard backup files *.java~ and etc.

Added missing JavaDoc. Due to lack of good documentation of

6508c64

used library this java doc is not specific enough.

Non Compiling.

ffb83de

1.Detected that RepositoryPage is using logger field which duplicates logger() call from superclass BasePage. Since logger() is lazy initialized I did remove the field. Subsequent commit will be used to detect and re-factor it to use base class logger() call.

Compiling.

b7dc1fc

Finished refactoring from logger field to logger() call.

1.Added some comments about multithread safety in BasePage javadocs.

375b4fc

I think it is important to not need to dig into Wicket docs to figure it out. 2.Changed javadoc target to chew into dependencies to correctly display all implemented interfaces.

1.Updated some comments in BasePage according to logging.

82ee8ee

2.Added fast access checks if log levels are enabled. 3.Provided recommended use patterns in javadocs.

1.Armed CommitPage with trace logs which will help pin-point

dac46a4

"large commit crashes server" bug. 2.Pin-pointed it to JGitUtils.getFilesInCommit

Moved logging system initialization to static code

a53a5e9

to ensure, at best possible level, that utils classes static initializers are NOT initialized before logging is set up.

Basically a question expained in commit.

3301fea

Use of AtomicInteger as "mutable interger" is an overkill and performance loss.

Detected some test errors

44fc7c1

after ant test but after checking out master noticed, that those are also present in master build and are not related to changes made. Damn... I do not feel right to dig into it.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Tomszt/hang on commit #1387

Tomszt/hang on commit #1387

TomaszSzt commented Nov 4, 2021

Tomszt/hang on commit #1387

Are you sure you want to change the base?

Tomszt/hang on commit #1387

Conversation

TomaszSzt commented Nov 4, 2021