Add has_rocm for OpenMPI #821

Merged · 4 commits · Jun 23, 2024
3 changes: 2 additions & 1 deletion docs/src/knownissues.md
@@ -180,7 +180,7 @@ Make sure to:
```
- Then in Julia, upon loading MPI and CUDA modules, you can check
- CUDA version: `CUDA.versioninfo()`
- If MPI has CUDA: `MPI.has_cuda()`
- If MPI has CUDA: [`MPI.has_cuda()`](@ref)
- If you are using correct MPI library: `MPI.libmpi`

After that, it may be preferred to run the Julia MPI script (as suggested [here](https://discourse.julialang.org/t/cuda-aware-mpi-works-on-system-but-not-for-julia/75060/11)) launching it from a shell script (as suggested [here](https://discourse.julialang.org/t/cuda-aware-mpi-works-on-system-but-not-for-julia/75060/4)).
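For reference, these checks boil down to a few calls in a Julia session; the sketch below assumes CUDA.jl and MPI.jl are installed and MPI has been initialized (required for the Open MPI query):

```julia
# Illustrative sketch of the CUDA checks listed above.
using MPI, CUDA

MPI.Init()                                    # required before has_cuda() on Open MPI
CUDA.versioninfo()                            # CUDA toolkit and driver versions
println("CUDA-aware MPI: ", MPI.has_cuda())   # query the MPI build
println("MPI library:    ", MPI.libmpi)       # library actually loaded
MPI.Finalize()
```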
@@ -197,6 +197,7 @@ Make sure to:
```
- Then in Julia, upon loading MPI and CUDA modules, you can check
- AMDGPU version: `AMDGPU.versioninfo()`
- If MPI has ROCm: [`MPI.has_rocm()`](@ref)
- If you are using correct MPI implementation: `MPI.identify_implementation()`

After that, [this script](https://gist.github.com/luraess/c228ec08629737888a18c6a1e397643c) can be used to verify if ROCm-aware MPI is functional (modified after the CUDA-aware version from [here](https://discourse.julialang.org/t/cuda-aware-mpi-works-on-system-but-not-for-julia/75060/11)). It may be preferred to run the Julia ROCm-aware MPI script launching it from a shell script (as suggested [here](https://discourse.julialang.org/t/cuda-aware-mpi-works-on-system-but-not-for-julia/75060/4)).
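The corresponding checks for the ROCm case might look like the following sketch (assuming AMDGPU.jl and MPI.jl are installed):

```julia
# Illustrative sketch of the ROCm checks listed above.
using MPI, AMDGPU

MPI.Init()
AMDGPU.versioninfo()                                        # ROCm stack versions
println("ROCm-aware MPI: ", MPI.has_rocm())                 # query added in this PR
println("Implementation: ", MPI.identify_implementation())  # implementation name and version
MPI.Finalize()
```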
2 changes: 2 additions & 0 deletions docs/src/reference/library.md
@@ -14,5 +14,7 @@ MPI.MPI_LIBRARY_VERSION_STRING
```@docs
MPI.versioninfo
MPI.has_cuda
MPI.has_rocm
MPI.has_gpu
MPI.identify_implementation
```
3 changes: 2 additions & 1 deletion docs/src/usage.md
@@ -98,7 +98,8 @@ should confirm your MPI implementation to have the ROCm support (AMDGPU) enabled
[alltoall\_test\_rocm\_multigpu.jl](https://gist.github.com/luraess/a47931d7fb668bd4348a2c730d5489f4) should confirm
your ROCm-aware MPI implementation to use multiple AMD GPUs (one GPU per rank).

The status of ROCm (AMDGPU) support cannot currently be queried.
If using OpenMPI, the status of ROCm support can be checked via the
[`MPI.has_rocm()`](@ref) function.
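
One way these queries could be used in device-agnostic code is to pick the buffer type at runtime; a hedged sketch (assuming CUDA.jl and AMDGPU.jl are available alongside MPI.jl) is:

```julia
# Hedged sketch: choose an array type based on what the MPI build reports.
using MPI, CUDA, AMDGPU

MPI.Init()
ArrayType = MPI.has_cuda() ? CuArray :
            MPI.has_rocm() ? ROCArray : Array
send_buf = ArrayType(ones(Float64, 4))   # buffer passed directly to MPI calls
```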

## Writing MPI tests

47 changes: 45 additions & 2 deletions src/environment.jl
@@ -320,21 +320,23 @@ Wtime() = API.MPI_Wtime()

Check if the MPI implementation is known to have CUDA support. Currently only Open MPI
provides a mechanism to check, so it will return `false` with other implementations
(unless overridden).
(unless overridden). For "IBMSpectrumMPI" it will return `true`.

This can be overridden by setting the `JULIA_MPI_HAS_CUDA` environment variable to `true`
or `false`.

!!! note
For OpenMPI or OpenMPI-based implementations you first need to call [Init()](@ref).

See also [`MPI.has_rocm`](@ref) for ROCm support.
"""
function has_cuda()
flag = get(ENV, "JULIA_MPI_HAS_CUDA", nothing)
if flag === nothing
# Only Open MPI provides a function to check CUDA support
@static if MPI_LIBRARY == "OpenMPI"
# int MPIX_Query_cuda_support(void)
return 0 != ccall((:MPIX_Query_cuda_support, libmpi), Cint, ())
return @ccall libmpi.MPIX_Query_cuda_support()::Bool
elseif MPI_LIBRARY == "IBMSpectrumMPI"
return true
else
@@ -344,3 +346,44 @@ function has_cuda()
return parse(Bool, flag)
end
end

"""
MPI.has_rocm()

Check if the MPI implementation is known to have ROCm support. Currently only Open MPI
provides a mechanism to check, so it will return `false` with other implementations
(unless overridden).

This can be overridden by setting the `JULIA_MPI_HAS_ROCM` environment variable to `true`
or `false`.

See also [`MPI.has_cuda`](@ref) for CUDA support.
"""
function has_rocm()
flag = get(ENV, "JULIA_MPI_HAS_ROCM", nothing)
if flag === nothing
# Only Open MPI provides a function to check ROCm support
@static if MPI_LIBRARY == "OpenMPI" && MPI_LIBRARY_VERSION ≥ v"5"
# int MPIX_Query_rocm_support(void)
return @ccall libmpi.MPIX_Query_rocm_support()::Bool
else
return false
end
else
return parse(Bool, flag)
end
end

"""
MPI.has_gpu()

Checks if the MPI implementation is known to have GPU support. Currently this checks for the
following GPUs:

1. CUDA: via [`MPI.has_cuda`](@ref)
2. ROCm: via [`MPI.has_rocm`](@ref)

See also [`MPI.has_cuda`](@ref) and [`MPI.has_rocm`](@ref) for more fine-grained
checks.
"""
has_gpu() = has_cuda() || has_rocm()
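
For the environment-variable overrides and the combined query above, usage might look like this sketch (the forced value is only an example):

```julia
# Hedged sketch: force the CUDA answer for a build that cannot be queried
# (e.g. a CUDA-aware MPICH), then use the combined GPU query.
ENV["JULIA_MPI_HAS_CUDA"] = "true"   # read each time has_cuda() is called

using MPI
MPI.Init()
MPI.has_cuda()   # true, from the override
MPI.has_rocm()   # false unless Open MPI >= 5 reports support or JULIA_MPI_HAS_ROCM is set
MPI.has_gpu()    # true if either of the above is true
```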
14 changes: 13 additions & 1 deletion test/test_basic.jl
@@ -8,10 +8,22 @@ MPI.Init()

@test MPI.has_cuda() isa Bool

if get(ENV,"JULIA_MPI_TEST_ARRAYTYPE","") == "CuArray"
if get(ENV, "JULIA_MPI_TEST_ARRAYTYPE", "") == "CuArray"
@test MPI.has_cuda()
end

@test MPI.has_rocm() isa Bool

if get(ENV, "JULIA_MPI_TEST_ARRAYTYPE", "") == "ROCArray"
@test MPI.has_rocm()
end

@test MPI.has_gpu() isa Bool

if get(ENV, "JULIA_MPI_TEST_ARRAYTYPE", "") == "CuArray" || get(ENV, "JULIA_MPI_TEST_ARRAYTYPE", "") == "ROCArray"
@test MPI.has_gpu()
end

@test !MPI.Finalized()
MPI.Finalize()
@test MPI.Finalized()