Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tomszt/hang on commit #1387

Open
wants to merge 21 commits into
base: master
Choose a base branch
from

Conversation

TomaszSzt
Copy link
Contributor

SOLVED PROBLEM

Commit page consumes to much time and memory when being subjected to huge, binary mostly commit. In effect server freezes and reset is necessary.

Issue reported: #1385

DESCRIPTION:

This is a pull request which basically does following:

  1. Prevents commit page from chewing up too many files. The number of files is actually configurable through "defaults.properties" web.maxCommitPaths option.
  2. Extends user access to logging configuration by providing the ability to append options seen in $basepath/log4j.properties to options seen in internally jar contained log4j.properties. This is explained in "defaults.properties" web.debugMode option description and in example log4-debug-example.properties file.
    2.1 Adds some logging instructions across the application at TRACE level.
    2.2. Adds early messages displaying logging configuration on System.out if any --dailyLogFile or web.debugMode or log4.properties was used to override default settings.
  3. Prevents server from sending "Internal Error" if some text resources are missing and using default dumb text instead. This is basically a robustness against human mistakes during development and testing. This is easy to loose some text by mistake. This is especially true with GitBlit context-less keys like: gb.mirrorWarning which do encourage wide use of same text in many places.

Each commit carefully describes steps taken.

TESTING

Only manual tests were made on the problematic repository.
Only GitBlit GO was tested.

No new automatic tests are included since:

  • I can hardly bite through the entire test suite without a detailed guide. I am used to very different work flow;
  • I can't imagine how to formulate an automatic test which would check if OutOfMemory was thrown or freeze have happen. I could not identify a single point responsible for a problem, like ie. some exception thrown somewhere or something like that.

Old tests are FAILING but they do fail on "master" too. I do not think they are kept up to date or something in my setup is screwed up. I do not feel competent to fix it.

MANUAL TEST CONDITIONS
PART 1.Commit pages.

FAULT:
Following manual test conditions should apply:
1.Build server
2.Ensure that web.maxCommitPaths=-1
3.Check that Your JVM has 256MB max heap size (default for JDK8)
4.Start server.
5.Create repository with an initial commit.
6.Clone it.
7.Create a huge commit of at least 16k files and at least 1GB in raw size.
8.Push it.
6.Enter it through web interface. Click the commit page.
Observe that page takes about 30 seconds to load or more, depending on machine (mine server: 2 minutes)
7.Do it few times, quickly, not waiting for page to load.
8.Try to display any other page, ie. repository list.

Expected result: No pages loads, server seems to be stuck.

CORRECTED EFFECT
Do same steps, except:
(...)
2.Ensure that web.maxCommitPaths=1000
(...)
6.Observe, that page loads withing a few seconds or faster.
Observe that commited files list is now 1000 lines long and headers over it do contain "+more changes" and "or more" texts to indicate that something was trimmed off.
(...)
8.All works as expected.

Repeat it again, but commit initial commit so that the problematic commit will be parent-less. Same effects are expected.

Repeat the test with small repository and observe that headers above commited files list should NOT contain "+ more changes" or "or more" texts.

Otherwise web interface should be visually in agreement with "master" branch.

Note: Polish translation for some of added texts is provided. I will possibly add more of them in next pull requests, since my users are Polish.

PART.2 Logging.

1.Build server.
2.Start it with --dailyLogFile
Notice: "Using --dailyLogFile to override log output." on console and dump of log4j.properties file.
3.Stop it, start with "web.debugMode=true" option.
Notice: "Using web.debugMode to override debug levels." on console and dump of log4j.properties file.
4.Stop it, rename data/log4j-debug-example.properties to log4j.properties, start it.
Notice: "Using web.debugMode to override debug levels." on console and dump of log4j.properties file.
5.Try mixing --dailyLogFile and web.debugMode with log4j.properties.
Notice different messages and prioritization of settings from log4j.properties if options do conflict.

EXPECTATIONS

I do hope that some developers would at least "cherry-pick" some of proposed changes and pin point obvious mistakes on my part. This is a first time I do actually publish fixes to open source projects, so please excuse me if I offended someone or made some wrongs.

the compiler version.

  Was:
	Directly call javac. This depended on system search path
  settings (ie. PATH) on windows and might differ from what ANT
  deduced from JAVA_HOME (if present) or automatically.

  Is:
 	A <javac> ant task is asked to compile "empty.java" file
  which contains nothing except a comment. As a side effect of
  compiler argument -version the version is displayed to user.

2.Added
	<presetdef name="javac">
		<javac includeantruntime="false" />
	</presetdef>
  what removes the "warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last;..."
  ANT warning during builds (ant >1.10)

  This change may be incompatible with older ANT versions. I do not have environment to test it.
not clearly described. This should be restored at final deployment.
…T source

code. Added "javadoc" target to build.xml.
2.Modified .gitignore to not include build docs to repository
3.Modified .gitignore to ignore standard backup files *.java~ and etc.
used library this java doc is not specific enough.
maintenace.
2.Changed logging policy to following:
	- log4j configuration is loaded, as previously, from
	an internal log4j.properties file (from gitblit.jar)
	- THEN it is overriden from $baseFolder/log4j.properties
	if this file exists.
	- THEN web.debugMode is used to override logging level properties
	IF THEY ARE AT DEFAULT STATE.
	- THEN --dailyLogFile is used to override appenders (output for log)
	if in default state.
	- if any overrides are present loaded configuration is dumped to
	System.out
3.Updated comments in log4j.properties AND default.properties to
inform users how to control debugging level in fain grained way.
4.Prepared to add trace level commands to CommitPage since
this branch is motivated mainly due to CommitPage having problems with
large repositories.
	- potential environment variables needed
	- SSL conflict with pre 7.171 JRE version used to
	 run ANT
2.Added Windows script setEnv.bat which explains what needs
to be set up and how to let ANT 1.10 or ANT 1.9 to use
more modern JRE to avoid SSL exceptions and still compile using
JDK 1.7.
logging set-up.
2.Added example log setup file with excessive manual inside.
3.Added some notes to build file how to boost compilation speed
which with moxie is horrible when compared to pure ANT solution.
It basically repetively tries do download everything and compile
from scratch. We have "clean" target to enforce it, don't we?
Plain compile should be hell fast.
1.Detected that RepositoryPage is using logger field
 which duplicates logger() call from superclass BasePage.
 Since logger() is lazy initialized I did remove the
 field. Subsequent commit will be used to detect and re-factor
 it to use base class logger() call.
Finished refactoring from logger field to logger() call.
 I think it is important to not need to dig into Wicket docs to
 figure it out.
2.Changed javadoc target to chew into dependencies to correctly display
all implemented interfaces.
2.Added fast access checks if log levels are enabled.
3.Provided recommended use patterns in javadocs.
"large commit crashes server" bug.
2.Pin-pointed it to JGitUtils.getFilesInCommit
	Server GitBlitServer do initialize static field in
	instance constructor. This is not a problem. The problem
	is, that it also initializes background loggers overriding
	some settings.

	Most of gitblit.utils classes is using

		private static final Logger logger =

	initialization scheme, that is inside a "class constructor".

	Java strictly says, that classes are initialized before first
	use. The "use" is any kind of "touch" to class, either directly
	or in-directly. Since this is very hard to control one may
	assume, that some static initializers may get called before
	GitBlitServer reaches logging set-up and construction.

	Depending on underlying back-end previously fetched loggers
	may be miss-configured, non-configured or something unpredictable
	may happen.

 Changing initialization scheme to be as soon as possible.
to ensure, at best possible level, that utils classes
static initializers are NOT initialized before logging is
set up.
if resource is missing.

This may result in strange texts to be shown to user if resource is
missing, "[Warning: Property for 'xxxx' not found]" but it is better
than Internal Server Error which do totally disable functionality.
Use of AtomicInteger as "mutable interger" is an overkill and
performance loss.
wrappers over wrappers over.... where a simple array would be
sufficient. Now I start to understand what kind of thinking is
behind high memory consumption and low overall performance.

It wasn't like that when I was a lad and had 960 bytes of RAM
on board... ;)
2.Added property which controls limit of files to process to show diff stats
in commit.
3.Updated code responsible for computing diffs to respect it.
4.Ensured that server will not report Internal Error if missing resource string is found.
5.Added necessary resource strings in EN and PL translations.
Noticed that there is no script which transforms
UTF-8 props to ASCII what is a pain for translators, but ignore id.
6.Updated some comments regarding loggers and their retrival startegy
7.Updated CommitLegendPanel and DiffStatsPanel to inform user that there
are some more changes if limit set in 2. was tripped.
8.Code is ready for clean-up and full testing.
2.Refactored GitBlitServer main() method to serve
for tests where tests do need to run main related
code before the instance of server is created.

This seems to be a design flaw in it since some
elements are processed before and some inside constructor.
Sadly the way of loggers configuration enforces such
a trick.
after

	ant test

but after checking out master noticed, that those are also present
in master build and are not related to changes made. Damn... I do not
feel right to dig into it.
regarding peculiar problem after server upgrade when  web.maxActivityCommits
is modified in server configuration to not be one of -1, 0, 25, 50, 75, 100, 150, 200, 250, 500.

The detailed and exact code line which is responsible for why this happens is something beond my understanding.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant