Add has_rocm for OpenMPI #821
Conversation
src/environment.jl (outdated)
flag = get(ENV, "JULIA_MPI_HAS_ROCM", nothing)
if flag === nothing
    # Only Open MPI provides a function to check ROCm support
    @static if MPI_LIBRARY == "OpenMPI"
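For context, the excerpt presumably continues with a ccall into Open MPI's extension API, mirroring the existing has_cuda implementation. A minimal sketch of what the whole function could look like (the libmpi handle and the parse fallback are assumptions modelled on the CUDA path, not the exact diff):

function has_rocm()
    flag = get(ENV, "JULIA_MPI_HAS_ROCM", nothing)
    if flag === nothing
        # Only Open MPI provides a function to check ROCm support
        @static if MPI_LIBRARY == "OpenMPI"
            # int MPIX_Query_rocm_support(void)
            return 0 != ccall((:MPIX_Query_rocm_support, libmpi), Cint, ())
        else
            return false
        end
    else
        # explicit user override via the environment variable
        return parse(Bool, flag)
    end
end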
We probably need to use MPI_LIBRARY_VERSION to check that this is indeed v5.
The current compat is on 4. Should we hold this off until newer JLLs are supported?
Someone will have to debug what the problem with OpenMPI 5 is: #789 (comment)
Someone could already run with a local version of OpenMPI 5 before we get around to the JLL.
If that helps for testing, here's a docker container with openmpi v5.0.2: https://github.com/JuliaParallel/github-actions-buildcache/pkgs/container/github-actions-buildcache/183303325?tag=openmpi-5.0.2-wufrzwmzj3h642rou7vsrphzpysokjui.spack
For the record, MPI.jl tests pass in that container. I'll open a PR to add it to the matrix so that we test OpenMPI v5. We still need to figure out what the problem with the JLL was.
@avik-pal, with #826 now merged we actually have coverage for OpenMPI v5 on CI (although not with an AMD GPU available...), so you can do a test like
@static if MPI_LIBRARY == "OpenMPI"
@static if MPI_LIBRARY == "OpenMPI" && MPI_LIBRARY_VERSION >= v"5"
as suggested above by @vchuravy and we can at least ensure the ccall is working correctly.
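Assuming MPI_LIBRARY_VERSION is a VersionNumber (which the suggested comparison implies), the v"5" bound admits any 5.x release; a quick sanity check of the comparison semantics:

julia> v"4.1.6" >= v"5"
false

julia> v"5.0.2" >= v"5"
true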
Tested locally that the ccall works by bumping the JLL compat to 5:
julia> using MPI
julia> MPI.has_cuda()
false
julia> MPI.has_rocm()
false
Can also document
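Incidentally, the JULIA_MPI_HAS_ROCM override from the diff can be exercised even without a ROCm-enabled build, assuming the flag is parsed as a Bool at call time like the CUDA counterpart (that behaviour is an assumption here, not something tested above):

julia> ENV["JULIA_MPI_HAS_ROCM"] = "true";

julia> MPI.has_rocm()  # assumed: the environment override takes precedence over the library query
true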
@avik-pal, are you interested in moving this forward?
This completely fell off my radar; I will finish it over the weekend.
Once this is merged, I can implement
Maybe one could add a sentence similar to the CUDA one (https://github.com/JuliaParallel/MPI.jl/blob/master/docs/src/usage.md) in the ROCm-aware section of the docs?
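A hedged sketch of what that sentence plus snippet could look like (the wording is only a suggestion, not the actual docs text): "If your MPI implementation has been compiled with ROCm support, you can check it at runtime with MPI.has_rocm():"

julia> MPI.has_rocm()
true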
Done.
Thanks!
Last documentation request and then we should be good to go ☝️
Updated the docs.
Can you erase on master?
What do you want me to erase?
I presume that was meant to be "rebase".
See https://docs.open-mpi.org/en/main/man-openmpi/man3/MPIX_Query_rocm_support.3.html#mpix-query-rocm-support
It seems to be available from v5 of OpenMPI.
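A raw query against a locally installed OpenMPI >= 5 could then look like the following sketch (MPI.API.libmpi as the library handle is an assumption):

julia> using MPI

julia> ccall((:MPIX_Query_rocm_support, MPI.API.libmpi), Cint, ())  # nonzero means ROCm support is available
0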