Flaky fails on SYCL :: AtomicRef/atomic_memory_order_acq_rel.cpp #8847

aelovikov-intel · 2023-03-28T23:25:24Z

It seems the issue has been seen at least twice recently:

https://github.com/intel/llvm/actions/runs/4547459972/jobs/8018077332 (post-commit run for 96acbff). Failed on GEN9 L0 like this:

$ "env" "ONEAPI_DEVICE_SELECTOR=ext_oneapi_level_zero:gpu" "/__w/llvm/llvm/build-e2e/AtomicRef/Output/atomic_memory_order_acq_rel.cpp.tmp.out"
# command output:
Testing acquire
Testing release

# command stderr:
atomic_memory_order_acq_rel.cpp.tmp.out: /__w/llvm/llvm/llvm/sycl/test-e2e/AtomicRef/atomic_memory_order_acq_rel.cpp:143: void test_release_global() [order = sycl::memory_order::release]: Assertion `error == 0' failed.

error: command failed with exit status: -6

It was also seen in one of the pre-commit CI runs in #8510 (see the discussion in the comments there).
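For context, the release/acquire guarantee that a test of this kind typically checks can be illustrated on the host. The sketch below is not the test source: it uses std::atomic and std::thread as stand-ins for sycl::atomic_ref and device work-items, and the function name count_ordering_errors is hypothetical. It only shows why an error count of 0 is the expected outcome when release/acquire ordering is honored.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Host-side analogue only (assumption: std::atomic stands in for
// sycl::atomic_ref). Returns how many iterations observed the flag without
// the payload written before it. Correct release/acquire ordering gives 0.
int count_ordering_errors(int iterations) {
  assert(iterations > 0);
  int errors = 0;
  for (int i = 0; i < iterations; ++i) {
    int data = 0;              // plain, non-atomic payload
    std::atomic<int> flag{0};  // synchronization variable

    std::thread writer([&] {
      data = 42;                                 // (1) write the payload
      flag.store(1, std::memory_order_release);  // (2) publish it
    });
    std::thread reader([&] {
      // (3) spin until the flag is observed with acquire ordering
      while (flag.load(std::memory_order_acquire) == 0) {
      }
      // (4) the payload write must be visible at this point
      if (data != 42)
        ++errors;
    });
    writer.join();
    reader.join();
  }
  return errors;
}
```

Calling count_ordering_errors(200) should return 0 on a conforming implementation. A non-zero count would correspond to the kind of failure reported by the `error == 0` assertion above.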


bader · 2023-03-30T04:59:26Z

@aelovikov-intel, I see that it fails quite consistently in both pre- and post-commit runs. Please disable or fix the test as soon as possible.

jandres742 · 2023-03-30T15:01:50Z

This test is flaky here: #8732


bader · 2023-03-31T01:51:07Z

@andylshort, I think this issue might be caused by your PR, #8825.
The way the test is built assumes that we build for CUDA only. Right?

// RUN: %clangxx -fsycl -fsycl-targets=%sycl_triple %s -O3 -o %t.out -Xsycl-target-backend=nvptx64-nvidia-cuda --cuda-gpu-arch=sm_70

bader · 2023-03-31T02:11:50Z

I also noticed that the test doesn't check the atomic_memory_order_capabilities of the tested devices.
Maybe the devices in GitHub CI do not support the acq_rel memory order.
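A capability check of the kind suggested here could look roughly like the sketch below. It is written in plain C++ so it compiles without a SYCL toolchain: the memory_order enum is a stand-in for sycl::memory_order, and the helper names supports_order and should_run_acq_rel_test are hypothetical. In SYCL 2020 the capability list would come from `dev.get_info<sycl::info::device::atomic_memory_order_capabilities>()`, which returns a `std::vector<sycl::memory_order>`.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Stand-in for sycl::memory_order (assumption made so that the sketch
// builds without SYCL headers).
enum class memory_order { relaxed, acquire, release, acq_rel, seq_cst };

// True when `wanted` appears in the capability list reported by the device.
bool supports_order(const std::vector<memory_order> &caps,
                    memory_order wanted) {
  return std::find(caps.begin(), caps.end(), wanted) != caps.end();
}

// Hypothetical guard: run the acquire/release test only when the device
// reports every ordering the test relies on.
bool should_run_acq_rel_test(const std::vector<memory_order> &caps) {
  return supports_order(caps, memory_order::acquire) &&
         supports_order(caps, memory_order::release) &&
         supports_order(caps, memory_order::acq_rel);
}

// Example usage with made-up capability lists.
void example_usage() {
  // A device that only reports relaxed atomics: the test should be skipped.
  assert(!should_run_acq_rel_test({memory_order::relaxed}));
  // A device that reports all required orderings: the test should run.
  assert(should_run_acq_rel_test({memory_order::relaxed, memory_order::acquire,
                                  memory_order::release,
                                  memory_order::acq_rel}));
}
```

With a guard like this, a device that lacks the required orderings would lead to a skipped test rather than an assertion failure. Whether that applies to the GEN9 machines in CI is not established in this thread.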


andylshort · 2023-03-31T10:42:36Z

> @andylshort, I think this issue might be caused by your PR, #8825. The way the test is built assumes that we build for CUDA only. Right?
> // RUN: %clangxx -fsycl -fsycl-targets=%sycl_triple %s -O3 -o %t.out -Xsycl-target-backend=nvptx64-nvidia-cuda --cuda-gpu-arch=sm_70

Yes, but those options are unused in the failing GEN9 test runs, so I think they're okay to keep. The RUN line could be written better, though, to avoid this warning: clang++: warning: argument unused during compilation: '-Xsycl-target-backend=nvptx64-nvidia-cuda --cuda-gpu-arch=sm_70' [-Wunused-command-line-argument].

> I also noticed that the test doesn't check the atomic_memory_order_capabilities of the tested devices. Maybe the devices in GitHub CI do not support the acq_rel memory order.

I also think they might not properly support acq_rel. My earlier patch should have implemented the PI call for atomic_memory_order_capabilities, so the devices are actually being queried for this information now. I'm currently working on another bug that uses memory orders and atomic_ref in a similar way to this failing test, and that one also fails on the L0 and OCL GPU backends.

I believe both are caused by the same issue, and it's likely a lower-level issue with SPIRV or the GPU driver.

maarquitos14 · 2023-04-24T09:28:21Z

Fixed by #9111.

aelovikov-intel added the bug label Mar 28, 2023

bader added the confirmed label Mar 28, 2023

aelovikov-intel mentioned this issue Mar 29, 2023

[SYCL] Optimize UR 2 PI convert #8737

Merged

This was referenced Mar 30, 2023

[SYCL] Host pipe runtime implementation #7468

Merged

[SYCL] Fix unused variable warnings and unittests #8875

Merged

jandres742 mentioned this issue Mar 30, 2023

[SYCL][UR][L0] Add UR_L0 environment variables #8732

Merged

aelovikov-intel added a commit to aelovikov-intel/llvm that referenced this issue Mar 30, 2023

[SYCL] Disable test-e2e/AtomicRef/atomic_memory_order_acq_rel.cpp

3ab7f50

See intel#8847

aelovikov-intel mentioned this issue Mar 30, 2023

[SYCL] Disable test-e2e/AtomicRef/atomic_memory_order_acq_rel.cpp #8885

Merged

0x12CC mentioned this issue Mar 30, 2023

[SYCL] Fix handling of subgroup info queries #8859

Merged

YuriPlyakhin mentioned this issue Mar 30, 2023

[SYCL][Matrix] Clean-up for couple tests #8866

Merged

aelovikov-intel mentioned this issue Mar 30, 2023

[SYCL] Add sycl/test-e2e tests to in-tree build #8884

Merged

bmyates mentioned this issue Mar 30, 2023

[SYCL][UR] Link UR PI against UR Loader #8865

Merged

This was referenced Mar 30, 2023

[SYCL][L0] Rework how we maintain per-thread queue groups #8896

Merged

[SYCL][L0] optimize re-use of command-lists #8870

Merged

jandres742 mentioned this issue Mar 31, 2023

[SYCL][UR] Use official device info property for c-slices #8891

Merged

bader assigned andylshort Mar 31, 2023

bader added a commit that referenced this issue Mar 31, 2023

[SYCL] Disable test-e2e/AtomicRef/atomic_memory_order_acq_rel.cpp (#8885)

f5d9891

See #8847

Co-authored-by: Alexey Bader <[email protected]>

0x12CC mentioned this issue Apr 18, 2023

[SYCL] Fix buffer range in atomic_memory_order_acq_rel #9111

Merged

maarquitos14 closed this as completed Apr 24, 2023
