jdk22 java/foreign/TestAddressDereference crash vmState=0x00000000 #18999

Closed
pshipton opened this issue Feb 22, 2024 · 6 comments · Fixed by #19002
Labels
jdk22, segfault, test failure

Comments

@pshipton
Member

pshipton commented Feb 22, 2024

Happening across platforms
https://openj9-jenkins.osuosl.org/job/Pipeline-Build-Test-JDK22/19/
https://openj9-jenkins.osuosl.org/job/Pipeline-Build-Test-JDK22-with-System/9/

https://openj9-jenkins.osuosl.org/job/Test_openjdk22_j9_sanity.openjdk_x86-64_linux_Nightly_testList_0/12
java/foreign/TestAddressDereference.java

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk22_j9_sanity.openjdk_x86-64_linux_Nightly_testList_0/12/openjdk_test_output.tar.gz

01:51:29  Type=Segmentation error vmState=0x00000000
01:51:29  J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
01:51:29  Handler1=00007F904FDBF810 Handler2=00007F905531C730 InaccessibleAddress=0000000000000008
01:51:29  RDI=0000000000000001 RSI=0000000000000000 RAX=00007F8FAC0097C0 RBX=0000000000000000
01:51:29  RCX=FFFFFFFFC5777630 RDX=0000000000000001 R8=0000000000000000 R9=1FFFFFFFFFFFFFFF
01:51:29  R10=0000000000000002 R11=00007F9047E6E050 R12=0000000000000000 R13=00007F900D5E5600
01:51:29  R14=00007F90503CFB00 R15=00007F900D5E5900
01:51:29  RIP=00007F904FE59E5C GS=0000 FS=0000 RSP=00007F900D5E53A0
01:51:29  EFlags=0000000000010246 CS=0033 RBP=0000000000000000 ERR=0000000000000004
01:51:29  TRAPNO=000000000000000E OLDMASK=0000000000000000 CR2=0000000000000008
01:51:29  xmm0 549fba7eac00935f (f: 2885718784.000000, d: 4.337386e+99)
01:51:29  xmm1 553c96ee01a32c90 (f: 27470992.000000, d: 4.002084e+102)
01:51:29  xmm2 01a32c90b7654321 (f: 3076866816.000000, d: 8.947237e-301)
01:51:29  xmm3 65706f5f6176614a (f: 1635148160.000000, d: 4.262343e+180)
01:51:29  xmm4 0000000000000002 (f: 2.000000, d: 9.881313e-324)
01:51:29  xmm5 00007f8fac2f4c58 (f: 2888780800.000000, d: 6.929520e-310)
01:51:29  xmm6 00007f8fac009770 (f: 2885719808.000000, d: 6.929520e-310)
01:51:29  xmm7 00007f8fac00ec88 (f: 2885741824.000000, d: 6.929520e-310)
01:51:29  xmm8 dddddddddddddddd (f: 3722305024.000000, d: -1.456816e+144)
01:51:29  xmm9 4015555555555555 (f: 1431655808.000000, d: 5.333333e+00)
01:51:29  xmm10 f606aa67e2e1ab07 (f: 3806440192.000000, d: -3.484933e+260)
01:51:29  xmm11 0000ff0000000000 (f: 0.000000, d: 1.385239e-309)
01:51:29  xmm12 0000013d00000140 (f: 320.000000, d: 6.726727e-312)
01:51:29  xmm13 000001380000013f (f: 319.000000, d: 6.620627e-312)
01:51:29  xmm14 0000000008001800 (f: 134223872.000000, d: 6.631540e-316)
01:51:29  xmm15 000001420000013b (f: 315.000000, d: 6.832826e-312)
01:51:29  Module=/home/jenkins/workspace/Test_openjdk22_j9_sanity.openjdk_x86-64_linux_Nightly_testList_0/jdkbinary/j2sdk-image/lib/default/libj9vm29.so
01:51:29  Module_base_address=00007F904FD7C000
01:51:29  Target=2_90_20240221_20 (Linux 3.10.0-1160.88.1.el7.x86_64)
01:51:29  CPU=amd64 (4 logical CPUs) (0x1e8cbd000 RAM)
01:51:29  ----------- Stack Backtrace -----------
01:51:29  _ZN26VM_BytecodeInterpreterFull3runEP10J9VMThread+0x12adc (0x00007F904FE59E5C [libj9vm29.so+0xdde5c])
01:51:29  bytecodeLoopFull+0xca (0x00007F904FE4736A [libj9vm29.so+0xcb36a])
01:51:29   (0x00007F904FF2B6F2 [libj9vm29.so+0x1af6f2])
01:51:29  ---------------------------------------
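As a reading aid for the dump above: on x86-64, TRAPNO=0xE is a page fault and the low bits of ERR describe the faulting access. The following is a minimal sketch of decoding those bits; the class and method names are hypothetical and are not part of OpenJ9.

```java
public class PageFaultErrDecode {
    // Decodes the low bits of an x86 page-fault error code (the ERR field in the dump).
    // bit 0: 0 = page not present, 1 = protection violation
    // bit 1: 0 = read, 1 = write
    // bit 2: 0 = kernel mode, 1 = user mode
    static String decode(long err) {
        String access = (err & 0x2) != 0 ? "write" : "read";
        String mode = (err & 0x4) != 0 ? "user mode" : "kernel mode";
        String cause = (err & 0x1) != 0 ? "protection violation" : "page not present";
        return access + " in " + mode + ", " + cause;
    }

    public static void main(String[] args) {
        // ERR=0000000000000004 from the crash output above.
        System.out.println(decode(0x4L)); // read in user mode, page not present
    }
}
```

With ERR=0x4 and CR2=0x8, the fault is a user-mode read of an unmapped page at address 0x8, which appears consistent with dereferencing a small offset from a null pointer.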

@ChengJin01
da2c022...4b15be6
I assume #18930

@pshipton added the test failure, segfault, and jdk22 labels Feb 22, 2024
@pshipton added this to the Java 22 milestone Feb 22, 2024
@pshipton
Member Author

We need to take some action asap to stop this test from failing every night, either backing out the cause or excluding the test.

@ChengJin01

I don't see any connection between the crash and the heap-argument PR #18930. That support is only enabled by Linker.Option.critical(true), which doesn't appear in the failing test (related to upcall) at https://github.com/ibmruntimes/openj9-openjdk-jdk22/blob/441da504657d0e2d6b41be7ea971839a6b7605b2/test/jdk/java/foreign/TestAddressDereference.java#L104

    public void testNativeReturn(long alignment, ValueLayout layout) throws Throwable {
        boolean badAlign = layout.byteAlignment() > alignment;
        try {
            MethodHandle get_addr_handle = LINKER.downcallHandle(GET_ADDR_SYM,
                    FunctionDescriptor.of(ValueLayout.ADDRESS.withTargetLayout(layout), ValueLayout.ADDRESS));
            MemorySegment deref = (MemorySegment)get_addr_handle.invokeExact(MemorySegment.ofAddress(alignment)); // <-----
            assertFalse(badAlign);
            assertEquals(deref.byteSize(), layout.byteSize());
        } catch (IllegalArgumentException ex) {
            assertTrue(badAlign);
            assertTrue(ex.getMessage().contains("alignment constraint for address"));
        }
    }

In any case, I will exclude it for the time being via https://github.com/adoptium/aqa-tests.

@pshipton
Member Author

The only change besides the openj9 changes was ibmruntimes/openj9-openjdk-jdk22@7ef956c...441da50, but it seems unrelated.

@ChengJin01

I will exclude the test first, then investigate what caused the crash.

ChengJin01 pushed a commit to ChengJin01/aqa-tests that referenced this issue Feb 22, 2024
The change disables the failing FFI test suite detected
in JDK22+ for the moment and will be re-enabled once the
issue is resolved.

Related: #eclipse-openj9/openj9/issues/18999

Signed-off-by: ChengJin01 <[email protected]>
@ChengJin01

PR adoptium/aqa-tests#5088 has been created to exclude the failing test suite.

JasonFengJ9 pushed a commit to adoptium/aqa-tests that referenced this issue Feb 22, 2024
The change disables the failing FFI test suite detected
in JDK22+ for the moment and will be re-enabled once the
issue is resolved.

Related: #eclipse-openj9/openj9/issues/18999

Signed-off-by: ChengJin01 <[email protected]>
@ChengJin01

ChengJin01 commented Feb 22, 2024

[1] The javacore shows it crashed when performing the downcall as follows:

...
XMTHREADINFO3           Java callstack:
4XESTACKTRACE                at openj9/internal/foreign/abi/InternalDowncallHandler.runNativeMethod(InternalDowncallHandler.java:648) <------
4XESTACKTRACE                at java/lang/invoke/LambdaForm$DMH/0x00000000c415a600.invokeSpecial(LambdaForm$DMH)
4XESTACKTRACE                at java/lang/invoke/LambdaForm$MH/0x00000000c41e6600.invoke(LambdaForm$MH)
4XESTACKTRACE                at java/lang/invoke/LambdaForm$MH/0x0000000058259130.invokeExact_MT(LambdaForm$MH)
4XESTACKTRACE                at TestAddressDereference.testNativeReturn(TestAddressDereference.java:104)
...

and debugging indicates that the crash occurs when the heap-related code unexpectedly reads the invalid address value at:

Thread 27 "MainThread" hit Breakpoint 1, VM_BytecodeInterpreterCompressed::inlInternalDowncallHandlerInvokeNative (this=0x7fffd435e8e0, _sp=@0x7fffd435e860: 0x234ee8, _pc=@0x7fffd435e868: 0x7fff9d8b5fa0 "\267\300") at /root/docker_openj9/jchau_ffi/openj9-openjdk-jdk22/openj9/runtime/vm/BytecodeInterpreter.hpp:5292
5291                                 if ((U_64)J9_FFI_DOWNCALL_HEAP_ARGUMENT_ID       == pointerValues[i]) {
5292                                            j9object_t heapBase = (j9object_t)J9JAVAARRAYOFOBJECT_LOAD(
(gdb) x/10x  pointerValues[i]  <---------
0x1:    Cannot access memory at address 0x1
(gdb) p/x   pointerValues[i]
$4 = 0x1

Thread 27 "MainThread" received signal SIGSEGV, Segmentation fault.
0x00007ffff7926e8f in j9javaArrayOfObject_load (vmThread=0x22e100, array=0x0, index=0) at /root/docker_openj9/jchau_ffi/openj9-openjdk-jdk22/openj9/runtime/oti/j9accessbarrierhelpers.h:51
51                      U_32 *loadAddress = J9JAVAARRAY_EA(vmThread, array, index, U_32);

This is triggered by the test passing the alignment as an invalid address value at https://github.com/ibmruntimes/openj9-openjdk-jdk22/blob/cdc12a56665796db5270385de01dc6a30804ba03/test/jdk/java/foreign/TestAddressDereference.java#L104

    @Test(dataProvider = "layoutsAndAlignments")
    public void testNativeReturn(long alignment, ValueLayout layout) throws Throwable {
        boolean badAlign = layout.byteAlignment() > alignment;
        try {
            MethodHandle get_addr_handle = LINKER.downcallHandle(GET_ADDR_SYM,
                    FunctionDescriptor.of(ValueLayout.ADDRESS.withTargetLayout(layout), ValueLayout.ADDRESS));
            MemorySegment deref = (MemorySegment)get_addr_handle.invokeExact(MemorySegment.ofAddress(alignment)); // <-----
            assertFalse(badAlign);
            assertEquals(deref.byteSize(), layout.byteSize());
        } catch (IllegalArgumentException ex) {
            assertTrue(badAlign);
            assertTrue(ex.getMessage().contains("alignment constraint for address"));
        }
    }

plus the data provider at https://github.com/ibmruntimes/openj9-openjdk-jdk22/blob/cdc12a56665796db5270385de01dc6a30804ba03/test/jdk/java/foreign/TestAddressDereference.java#L160

    @DataProvider(name = "layoutsAndAlignments")
    static Object[][] layoutsAndAlignments() {
        List<Object[]> layoutsAndAlignments = new ArrayList<>();
        for (LayoutKind lk : LayoutKind.values()) {
            for (int align : new int[]{ 1, 2, 4, 8 }) {  // <-- 1 is passed to downcall via MemorySegment.ofAddress(alignment)
                layoutsAndAlignments.add(new Object[] { align, lk.layout });
            }
        }
        return layoutsAndAlignments.toArray(Object[][]::new);
    }

The test case deliberately passes one of the alignment values (1, 2, 4, 8) as the address value to the downcall, which simply returns it from https://github.com/ibmruntimes/openj9-openjdk-jdk22/blob/cdc12a56665796db5270385de01dc6a30804ba03/test/jdk/java/foreign/libAddressDereference.c#L31

EXPORT void* get_addr(void* align) {
    return align;  // <---------
}
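To make the expected outcomes of testNativeReturn concrete, here is a minimal sketch of the badAlign rule applied to the four address values from the data provider. The class name and the sample byte alignments (1, 2, 4, 8, assumed to be the natural alignments of byte, short, int and long layouts) are illustrative assumptions; the real test iterates over LayoutKind.values().

```java
public class BadAlignSketch {
    // Mirrors `boolean badAlign = layout.byteAlignment() > alignment;` from the test.
    // `layoutByteAlignment` stands in for layout.byteAlignment().
    static boolean badAlign(long alignment, long layoutByteAlignment) {
        return layoutByteAlignment > alignment;
    }

    public static void main(String[] args) {
        // Assumed sample layout alignments, for illustration only.
        long[] layoutAlignments = {1, 2, 4, 8};
        // Address values produced by the data provider.
        long[] addresses = {1, 2, 4, 8};

        for (long layoutAlignment : layoutAlignments) {
            for (long address : addresses) {
                // When badAlign is true the test expects an IllegalArgumentException;
                // otherwise the downcall must return normally with the same address.
                System.out.printf("address=0x%x layoutAlignment=%d expectException=%b%n",
                        address, layoutAlignment, badAlign(address, layoutAlignment));
            }
        }
    }
}
```

In every combination the downcall itself has to complete, so the raw address 0x1 must travel through the downcall path untouched.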

[2] Our code defines the heap argument identifier as:

private static final long DOWNCALL_HEAP_ARGUMENT_ID = 0x1;

This sets the heap address ID to 0x1, which happens to collide with the invalid address value 0x1 passed by the failing test, so that raw address is mistaken for a heap argument.

We therefore have to choose a less collision-prone value to identify the heap address and avoid this kind of issue.
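The collision can be sketched in plain Java. This is only a model of the in-band check, not the interpreter code: the class name, the helper methods and the replacement constant are hypothetical, and the replacement shown is not the value chosen by the actual fix.

```java
public class HeapArgumentSentinelSketch {
    // Value quoted in this issue.
    static final long DOWNCALL_HEAP_ARGUMENT_ID = 0x1L;
    // Hypothetical replacement, for illustration only.
    static final long HYPOTHETICAL_HEAP_ARGUMENT_ID = 0xFFFF_FFFF_FFFF_FFFEL;

    // In-band check: a pointer slot equal to the sentinel is treated as a heap argument.
    static boolean looksLikeHeapArgument(long slotValue, long sentinel) {
        return slotValue == sentinel;
    }

    // Out-of-band alternative: a parallel flag array marks heap arguments,
    // so no address value can ever be misread as a marker.
    static boolean isHeapArgument(boolean[] heapFlags, int index) {
        return heapFlags[index];
    }

    public static void main(String[] args) {
        long rawAddressFromTest = 0x1L; // MemorySegment.ofAddress(1) in the failing test

        // The raw address 0x1 is indistinguishable from the marker.
        System.out.println(looksLikeHeapArgument(rawAddressFromTest, DOWNCALL_HEAP_ARGUMENT_ID));

        // A less common marker avoids this particular collision.
        System.out.println(looksLikeHeapArgument(rawAddressFromTest, HYPOTHETICAL_HEAP_ARGUMENT_ID));

        // With out-of-band flags, the slot is classified by metadata rather than by its value.
        boolean[] heapFlags = {false};
        System.out.println(isHeapArgument(heapFlags, 0));
    }
}
```

Any in-band marker can in principle coincide with a caller-supplied address, so a rarer value lowers the likelihood of a clash rather than ruling it out. The flag-array variant is shown only as a design alternative.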

ChengJin01 pushed a commit to ChengJin01/openj9 that referenced this issue Feb 23, 2024
The tiny changes modify the constant used to
identify the heap address in downcall to avoid
any potential conflict with the address value
passed to downcall.

Fixes: eclipse-openj9#18999

Signed-off-by: ChengJin01 <[email protected]>