Update GlassFish to a version beyond 4.0 #131
Comments
Might really be related to the "ThreadLocal" warning. I found https://github.com/mjiderhamn/classloader-leak-prevention, and when adding the necessary three jars in "org.jboss.arquillian.warp.impl.client.deployment.DeploymentEnricher.addWarpExtensionsDeployment" to the deployed WAR file, the time for executing a single test goes down to 30 seconds (which is still horrible, but better than 90 seconds).
@chengfang Do you have further knowledge of arquillian internals? Below is the glassfish output when running only a single test with this command - all ThreadLocal messages point to arquillian core classes, so I don't know where they are linked to the warp extension:
|
@WolfgangHG I don't know off the top of my head. It could be a removal is missing in some application stop callback somewhere in arquillian-core or glassfish container impl, or more complicated if there is no obvious place to do so. I'll take a closer look. |
@chengfang How do we continue with this? Do you have any idea or know someone to ask ;-)? |
We don't have necessary glassfish knowledge to tackle it. As this is a glassfish specific issue, I think we should be able to lower its priority and move forward without fixing it now. |
I still would like to sort it out... Probably, the ThreadLocal warnings are not related to the delay, but they are the only hint currently. Seems the problems are caused by org.jboss.arquillian.core.spi.context.AbstractContext which really contains a threadlocal stack. I tried to add a "javax.servlet.ServletContextListener" to the deployment and can confirm that "contextDestroyed" is called before the ThreadLocal warnings are printed. Maybe it would be sufficient to call "AbstractContext.destroy" in this method. But how to get the context? The inspiration for this came from https://stackoverflow.com/questions/22032045/why-am-i-receiving-jdbc-driver-warning-and-threadlocal-errors |
@chengfang Do you have any idea on how to access the Arquillian context? |
WildFly might have the same ThreadLocal leak issue - when adding the "classloader-leak-prevention" jars to the testing war files, the same message is printed as for GlassFish. I have the suspicion that it is caused by |
I tried two approaches to work around the ThreadLocal leaks. They are both applied to First approach: force-cleanup ThreadLocals by using reflection (inspired by https://stackoverflow.com/questions/3869026/how-to-clean-up-threadlocals ). Second approach: remove the usage of ThreadLocal and use a Hashtable that maps a thread id to a value. Running a single test still takes about 1 minute, so there is no improvement at all. Using the "classloader-leak-prevention" jars, the time goes down to 30 seconds, but the tool does not provide any information on whether it fixes anything. Further research is necessary... |
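For readers who want to picture the second approach, here is a minimal sketch of a holder that maps a thread id to a value instead of using ThreadLocal. It is not the code from the attached diff: the class name `ThreadKeyedValue` and its methods are hypothetical, and it uses `ConcurrentHashMap` where the comment mentions a Hashtable.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical ThreadLocal replacement: values live in a map keyed by thread id. */
final class ThreadKeyedValue<T> {

    // One entry per thread that has called get(); the holder owns the map,
    // so nothing is stored in the Thread objects of the server's thread pool.
    private final Map<Long, T> values = new ConcurrentHashMap<>();
    private final Supplier<T> initial;

    ThreadKeyedValue(Supplier<T> initial) {
        this.initial = initial;
    }

    /** Returns the calling thread's value, creating it on first access. */
    T get() {
        return values.computeIfAbsent(Thread.currentThread().getId(), id -> initial.get());
    }

    /** Drops only the calling thread's value. */
    void remove() {
        values.remove(Thread.currentThread().getId());
    }

    /** Drops the values of all threads, e.g. when the application is stopped. */
    void clear() {
        values.clear();
    }

    /** Number of threads that currently have a value. */
    int size() {
        return values.size();
    }
}
```

The possible advantage over ThreadLocal is that a single `clear()` call from any thread releases everything at undeploy time. The cost is that entries of finished threads stay in the map until `remove()` or `clear()` is called. As noted above, this kind of change did not shorten the test time.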
The delay does not occur when using linux. Seems to be a windows issue. This thread pointed me to a possible cause: https://stackoverflow.com/questions/48296933/glassfish-undeployment-does-not-work-on-windows-but-works-fine-on-linux After the "test.war" file was undeployed, all jar files from "WEB-INF\lib" still reside in the glassfish deploy directory, and I cannot delete them using Windows explorer, until glassfish is stopped. Maybe used by some classloader? |
I sent a pull request (and forgot to create a branch first...) - the glassfish profiles are now linux only ;-). After updating to JakartaEE10 and recent GlassFish versions, we could take another look at it. But as it works on linux, it does not block further progress with this extension. @chengfang I would like to get rid of the "ThreadLocal" warnings in glassfish console, which means a change to arquillian. What do you think of my suggestion (file "cleanup_threadlocal_diff.txt" in previous comment)? |
@WolfgangHG it seems to involve quite a few hacks, including using reflection to get non-public data. I'm not sure whether all this hard-to-maintain code is worth it just to fix the warning. |
@chengfang I don't know whether those ThreadLocals could cause memory issues e.g. on a WildFly server (where this warning is not reported). So it gives me a better feeling to clean them up properly ;-). Unfortunately, the usage of Arquillian in the warp extension (where cleanup is done in a |
I did some research on the "locked files prevent undeploy" issue with the help of File Leak Detector. For my own records: start it with this command and attach it to a running GlassFish process (processid is "1234" in the sample):
Run it with the same java version that GlassFish uses. That's the reason why I use 1.14 - newer versions are not compatible. It prints out a file with stack traces for each file opened and closed. I filtered only operations for one of the "WEB-INF\lib" jar files (here: "arquillian-jacoco.jar") and removed all open/close pairs. This left two "open without close" calls. The first call stack is a GlassFish internal class:
Here is the code for BeanDeploymentArchiveImpl.populate and GlassFish 5.0. So hopefully this is fixed with recent versions. The other "file open" stacktrace is by arquillian, see following reply. |
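The filtering step described above (keep only the events for one jar, drop every open that has a matching close) can be sketched as follows. The `FileEvent` class is a hypothetical, simplified model and not the real output format of File Leak Detector; in particular, the `handleId` that ties a close to its open is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class UnclosedOpens {

    /** Hypothetical, simplified event; this is not the tool's real output format. */
    static final class FileEvent {
        final boolean open;   // true = file opened, false = file closed
        final int handleId;   // assumed id that ties a close to its open
        final String path;    // file the event refers to
        final String stack;   // stack trace text recorded for the call

        FileEvent(boolean open, int handleId, String path, String stack) {
            this.open = open;
            this.handleId = handleId;
            this.path = path;
            this.stack = stack;
        }
    }

    /** Returns the open events for files ending in pathSuffix that never got a matching close. */
    static List<FileEvent> findUnclosed(List<FileEvent> events, String pathSuffix) {
        Map<Integer, FileEvent> pending = new LinkedHashMap<>();
        for (FileEvent event : events) {
            if (!event.path.endsWith(pathSuffix)) {
                continue; // event belongs to a different file
            }
            if (event.open) {
                pending.put(event.handleId, event);
            } else {
                pending.remove(event.handleId); // open/close pair cancels out
            }
        }
        return new ArrayList<>(pending.values());
    }
}
```

Whatever remains in the returned list corresponds to an "open without close" call, and its stack trace shows where the file handle was created.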
The second "file open" is caused by arquillian:
The code in
But you can simply avoid using the cache by calling
Now, the @chengfang What do you think about this change to arquillian? Unfortunately, we cannot test whether this fixes the windows file leak issues until we upgrade to GlassFish 7 ;-). So we could also keep this on hold until we upgrade arquillian-warp to JakartaEE10. I think there is no reason to cache the jar files in Arquillian, as they are accessed only once. So, there should be no performance impact. WildFly is not affected by this issue, because the implementation of "URLConnection" is completely different (org.jboss.vfs.protocol.VirtualFileURLConnection) and does not do caching/file locking. |
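To illustrate the idea of reading a jar entry without the URL cache, here is a minimal sketch that uses only the JDK. It assumes the call in question is `URLConnection.setUseCaches(false)`, which is the standard JDK switch for this; it is not the actual Arquillian change. The class `UncachedJarRead` and its methods are hypothetical names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

final class UncachedJarRead {

    /** Reads one entry of a jar through a "jar:" URL with URL caching switched off. */
    static String readEntry(Path jar, String entryName) throws IOException {
        URL url = URI.create("jar:" + jar.toUri() + "!/" + entryName).toURL();
        URLConnection connection = url.openConnection();
        // With caching off, the JDK's jar connection does not keep the jar file
        // in its shared cache; it is released when the stream is closed.
        connection.setUseCaches(false);
        try (InputStream in = connection.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    /** Creates a throwaway jar with a single text entry (demo data only). */
    static Path createSampleJar(String entryName, String content) throws IOException {
        Path jar = Files.createTempFile("sample", ".jar");
        try (OutputStream out = Files.newOutputStream(jar);
             JarOutputStream jarOut = new JarOutputStream(out)) {
            jarOut.putNextEntry(new JarEntry(entryName));
            jarOut.write(content.getBytes(StandardCharsets.UTF_8));
            jarOut.closeEntry();
        }
        return jar;
    }
}
```

After `readEntry` returns, the jar can be deleted with `Files.delete(jar)`. On Linux that works even with an open handle, so only a run on Windows would show whether the lock is really gone.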
@chengfang I can reproduce an OutOfMemoryError when running the warp tests in the "wildfly-remote" profile after several runs (about 5-7 turns). After testing with an arquillian 1.7.2-SNAPSHOT including my ThreadLocal workaround, I did not observe this error after 20 test runs. So it seems to really have a positive effect ;-). Can I convince you to add this cleanup to arquillian? I can send a pull request, if it is OK for you. |
@WolfgangHG can you open a PR to arquillian-core so it will be easier to review and also get input from wider audience? Thanks for looking into these difficult issues. |
Pull requests sent.... |
The arquillian pull requests for file leak and So, this issue is about upgrading to a JakartaEE10 version of glassfish, where hopefully more file leaks were fixed, see comment above. |
The original arquillian glassfish adapter seems to support only glassfish versions up to 5: There is a port for Glassfish 6, which seems to be abandoned: This seems to be a maintained fork, but it requires a custom repository that currently does not work: But the file leak problem still seems to exist - on windows every test takes 20 seconds and there are errors about non-deletable jar files in the log. Also, the test "org.jboss.arquillian.warp.jsf.ftest.producer.TestJSFResourceNotFound" is failing. Further research required... |
Thanks for the research. What a mess, that whole 'arquillian-container-glassfish6' should have been a branch and NOT a new project. It would be ideal if these activities united under one project, but I am not looking at more projects to sponsor :-) I don't know how much OmniFish diverges from upstream Glassfish, but if that's not much, that one seems to be the most maintained? Taking a long time is perhaps not the end of the world, as that can run on CI (might need to bump the timeout).
Again, if need be, can be ignored conditionally using |
Keeping open per request on #252 (comment) |
I think I tracked the locked files ("arquillian-core.jar" and "arquillian-testenricher-cdi-jakarta.jar") down to two stack traces and created issues: https://issues.redhat.com/browse/WELD-2800 After having built a new Weld version with the fix and replaced it in my GlassFish bundle, the same file leak was reported from "org.jboss.arquillian.core.impl.loadable.JavaSPIExtensionLoader": arquillian/arquillian-core#637 |
Currently, the tests use GlassFish 4.0
Updating to a newer version causes severe timing problems: each test runs about 2 minutes (but in the end works), which makes the test suite unusable. This happens with 4.1.2 and 5.1.0.
When using a remote GlassFish server, there is console output like this:
But this is also printed with 4.0, so I don't think it is related.