Between the two runs the latest Julia version changed, but on 2024-02-02 I reran the jobs with Julia v1.9.3 in https://github.com/JuliaParallel/MPI.jl/actions/runs/7759323957 and the tests hang again, so the Julia version doesn't seem to be relevant.
There haven't been significant source code changes in this package between the two runs, so I would rule out a problem in the package itself.
More interestingly, I can't reproduce the issue locally in the containers we're using for CI (I tried both ghcr.io/juliaparallel/github-actions-buildcache:intel-oneapi-mpi-jq and ghcr.io/juliaparallel/github-actions-buildcache:intel-oneapi-mpi-2021.7.0-gzc7es2p27ftwyk4sdplynlj6d54xzi6.spack): I installed both Julia v1.9.4 and v1.9.3 and couldn't reproduce the hangs with either of them; the tests run fine every single time.
I suspect that a change in the configuration of the GitHub-hosted runners is causing this, but this hypothesis is hard to test. One last option is to try a newer version of Intel oneAPI MPI, in case this is an old bug that was later fixed, but I'll postpone that test for another day. In the meantime, I'm opening this ticket to keep track of the issue. Edit: it does appear that just upgrading to oneAPI 2021.11.0 magically fixes the hang without any other change on our side; this is implemented in #818.
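Since the hang does not show up in every environment, one way to look for it locally is to run the test suite several times under a time limit and count how many runs had to be killed. The following is only a minimal sketch of that idea, not what the CI does: `count_hangs` is a hypothetical helper, it relies on GNU coreutils `timeout` (which exits with status 124 when the limit is reached), and the Julia command line in the final comment is an assumption about how the tests would be launched inside the container.

```shell
# Hypothetical helper (not part of MPI.jl or its CI setup):
# run a command repeatedly under a per-run time limit and
# report how many runs were killed for exceeding the limit.
# Usage: count_hangs <runs> <limit_seconds> <command> [args...]
count_hangs() {
    local runs limit hangs i status
    runs="$1"
    limit="$2"
    shift 2
    hangs=0
    i=1
    while [ "$i" -le "$runs" ]; do
        status=0
        # GNU `timeout` exits with 124 if the command was still
        # running when the limit expired.
        timeout "$limit" "$@" || status=$?
        if [ "$status" -eq 124 ]; then
            hangs=$((hangs + 1))
            echo "run $i: killed after ${limit}s" >&2
        fi
        i=$((i + 1))
    done
    echo "hangs: $hangs/$runs"
}

# Assumed invocation inside one of the CI containers:
# count_hangs 10 1800 julia --project -e 'using Pkg; Pkg.test()'
```

A run that fails quickly for another reason is not counted as a hang, because only the exit status 124 is treated as a timeout.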
CI jobs using Intel oneAPI MPI 2021.7.0 are hanging forever after the test_shared_win.jl test set, causing jobs to eventually time out.