Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmdLineTester_criu_nonPortableRestore NotCheckpointSafeDeadlock Checkpoint blocked #20832

Open
pshipton opened this issue Dec 13, 2024 · 2 comments
Assignees
Labels
comp:vm criu Used to track CRIU snapshot related work test failure

Comments

@pshipton
Copy link
Member

https://openj9-jenkins.osuosl.org/job/Test_openjdk21_j9_sanity.functional_aarch64_linux_Nightly_testList_1/367
cmdLineTester_criu_nonPortableRestore_10

20:10:53  Testing: Create and Restore Criu Checkpoint Image once - NotCheckpointSafeDeadlock
20:10:53  Test start time: 2024/12/13 01:10:50 Coordinated Universal Time
20:10:53  Running command: bash /home/jenkins/workspace/Test_openjdk21_j9_sanity.functional_aarch64_linux_Nightly_testList_1/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu/criuScript.sh /home/jenkins/workspace/Test_openjdk21_j9_sanity.functional_aarch64_linux_Nightly_testList_1/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu /home/jenkins/workspace/Test_openjdk21_j9_sanity.functional_aarch64_linux_Nightly_testList_1/jdkbinary/j2sdk-image/bin/java " -XX:+DebugOnRestore -Xjit:count=0  -XX:+ThrowOnDelayedCheckpointOperation -Xtrace:print=j9vm.731 --add-opens java.base/jdk.internal.misc=ALL-UNNAMED  --add-exports java.base/openj9.internal.criu=ALL-UNNAMED" org.openj9.criu.DeadlockTest NotCheckpointSafeDeadlock 1 false false
20:10:53  Time spent starting: 6 milliseconds
20:12:01  Time spent executing: 68989 milliseconds
20:12:01  Test result: FAILED
20:12:01  Output from test:
20:12:01   [OUT] start running script
20:12:01   [OUT] export GLIBC_TUNABLES=glibc.cpu.hwcaps=-XSAVEC,-XSAVE,-AVX2,-ERMS,-AVX,-AVX_Fast_Unaligned_Load
20:12:01   [OUT] export LD_BIND_NOT=on
20:12:01   [OUT] /home/jenkins/workspace/Test_openjdk21_j9_sanity.functional_aarch64_linux_Nightly_testList_1/jdkbinary/j2sdk-image/bin/java -XX:+EnableCRIUSupport  -XX:+DebugOnRestore -Xjit:count=0  -XX:+ThrowOnDelayedCheckpointOperation -Xtrace:print=j9vm.731 --add-opens java.base/jdk.internal.misc=ALL-UNNAMED  --add-exports java.base/openj9.internal.criu=ALL-UNNAMED -cp /home/jenkins/workspace/Test_openjdk21_j9_sanity.functional_aarch64_linux_Nightly_testList_1/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu/criu.jar org.openj9.criu.DeadlockTest NotCheckpointSafeDeadlock 1
20:12:01   [OUT] Pre-checkpoint
20:12:01   [OUT] main: Fri Dec 13 01:11:19 UTC 2024, Performing CRIUSupport.checkpointJVM(), System.currentTimeMillis(): 1734052296128, System.nanoTime(): 7317567116127810
20:12:01   [OUT] 01:11:38.603*0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.613 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.624 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.634 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.644 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.654 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.665 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.675 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.685 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.695 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.705 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.716 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.726 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.736 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.746 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.757 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.767 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.777 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.787 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.797 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.808 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.818 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.828 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.838 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.848 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.859 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.869 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.879 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.889 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.900 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.910 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.920 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.930 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.941 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.951 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.961 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.971 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.981 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:38.992 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:39.002 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:39.012 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:39.022 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:39.033 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:39.043 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:39.053 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:39.063 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:39.073 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:39.084 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:39.094 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:39.104 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:39.114 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:39.124 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:39.135 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:39.145 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:39.155 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:39.165 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:39.175 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] 01:11:39.186 0x19200            j9vm.731      - Checkpoint blocked because thread=00000000001D7800 is in method=00000000002297B8 due to delay code 1
20:12:01   [OUT] JVMJITM048W AOT load and compilation disabled pre-checkpoint and post-restore.
20:12:01   [OUT] JVMJITM044W Some or all compiled code in the code cache invalidated post restore.
20:12:01   [OUT] TEST FAILED
20:12:01   [OUT] initiate restore
20:12:01   [OUT] Removed test output files
20:12:01   [OUT] finished script
20:12:01   [ERR] /home/jenkins/workspace/Test_openjdk21_j9_sanity.functional_aarch64_linux_Nightly_testList_1/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu/criuScript.sh: line 41: 3855892 Killed                  $2 -XX:+EnableCRIUSupport $3 -cp "$1/criu.jar" $4 $5 $6 > testOutput 2>&1
20:12:01  >> Success condition was not found: [Output match: User requested Java dump using]
20:12:01  >> Failure condition was not found: [Output match: AOT load and compilation disabled post restore]
20:12:01  >> Success condition was not found: [Output match: TEST PASSED]
20:12:01  >> Required condition was found: [Output match: Pre-checkpoint]
20:12:01  >> Required condition was found: [Output match: Checkpoint blocked because thread]
20:12:01  >> Failure condition was found: [Output match: Killed]
20:12:01  >> Failure condition was not found: [Output match: CRIU is not enabled]
20:12:01  >> Failure condition was not found: [Output match: Operation not permitted]
20:12:01  >> Success condition was not found: [Output match: Thread pid mismatch]
20:12:01  >> Success condition was not found: [Output match: do not match expected]
20:12:01  >> Success condition was not found: [Output match: Unable to create a thread:]
20:12:01  >> Failure condition was found: [Output match: TEST FAILED]
20:12:01  >> Failure condition was not found: [Output match: Could not dump the JVM processes, err=-70]
@pshipton pshipton added test failure criu Used to track CRIU snapshot related work labels Dec 13, 2024
@pshipton
Copy link
Member Author

@tajila

@JasonFengJ9
Copy link
Member

JasonFengJ9 commented Dec 13, 2024

This JVMCheckpointException was thrown when the checkpoint couldn't be performed due to a method with NotCheckpointSafe annotation in the current stack within a default sleepMillisecondsForNotCheckpointSafe (10ms) * maxRetries (100) total 1s.
If this occurs more often, it can be mitigated via -XX:sleepMillisecondsForNotCheckpointSafe= or -XX:maxRetryForNotCheckpointSafe= w/ a larger value.

Note: a javacore file is expected to be generated when this JVMCheckpointException is thrown but not seen in the result file (a few from other tests).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
comp:vm criu Used to track CRIU snapshot related work test failure
Projects
None yet
Development

No branches or pull requests

3 participants