
When testing ROCm D2D transfers with UCX_TLS=rc, how does setting UCX_IB_GPU_DIRECT_RDMA=0 affect the osu_bw test results? #10077

shuiYizero opened this issue Aug 20, 2024 · 5 comments

@shuiYizero

When using UCX_TLS=rc to test ROCm D2D transfers, setting UCX_IB_GPU_DIRECT_RDMA=0 doesn't affect the osu_bw test results. Is this because rc doesn't use GPUDirect RDMA technology, or is it because GPUDirect RDMA is enabled by default when using rc?

@rakhmets
Collaborator

rc transports can use the GPUDirect RDMA feature.
The default value of UCX_IB_GPU_DIRECT_RDMA is 'try', which means GPUDirect RDMA is used if UCX finds the necessary driver on the target system; in the case of ROCm, that is the ROCm KFD driver.
Please try setting UCX_IB_GPU_DIRECT_RDMA=1. You will see an error message if the driver cannot be found on your system.
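
A quick way to verify this (a sketch, assuming the ucx_info tool from the same UCX build is on your PATH) is to inspect what UCX actually detected on the node:

# list the transports/memory domains UCX detected; rocm entries (e.g. rocm_copy, rocm_ipc) should appear if ROCm support was built in
ucx_info -d | grep -i rocm

# print the effective configuration; this should list UCX_IB_GPU_DIRECT_RDMA and its default if IB support was built in
ucx_info -c -f | grep -i gpu_direct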

@shuiYizero
Author

You’re right, but what puzzles me is that when I set UCX_IB_GPU_DIRECT_RDMA=0, my test results are the same as when UCX_IB_GPU_DIRECT_RDMA=1. Do you know why this happens?

mpirun -np 2 -H a:1,b:1 -mca pml ucx -x UCX_NET_DEVICES=mlx5_0:1,mlx5_1:1,mlx5_2:1,mlx5_3:1  -x UCX_TLS=rc -x  UCX_IB_GPU_DIRECT_RDMA=0 -x LD_LIBRARY_PATH osu_bw -d rocm D D
# OSU MPI-ROCM Bandwidth Test v7.3
# Send Buffer on DEVICE (D) and Receive Buffer on DEVICE (D)
# Size      Bandwidth (MB/s)
# Datatype: MPI_CHAR.
1                       0.79
2                       1.57
4                       3.14
8                       6.29
16                      6.69
32                      7.57
64                      8.24
128                     8.39
256                     8.45
512                     8.56
1024                    8.59
2048                    8.62
4096                    8.63
8192                    8.63
16384                5958.54
32768                3811.03
65536                3251.07
131072               3263.04
262144               3273.16
524288               3272.21
1048576              3277.51
2097152              3277.63
4194304              3275.10
mpirun -np 2 -H a:1,b:1 -mca pml ucx -x UCX_NET_DEVICES=mlx5_0:1,mlx5_1:1,mlx5_2:1,mlx5_3:1 -x UCX_TLS=rc -x UCX_IB_GPU_DIRECT_RDMA=1 -x LD_LIBRARY_PATH osu_bw -d rocm D D

1                       0.78
2                       1.57
4                       3.14
8                       6.28
16                      7.09
32                      7.57
64                      8.25
128                     8.39
256                     8.43
512                     8.53
1024                    8.59
2048                    8.62
4096                    8.63
8192                    8.64
16384                5924.55
32768                3822.05
65536                3252.99
131072               3269.29
262144               3269.66
524288               3274.54
1048576              3278.45
2097152              3276.73
4194304              3276.40

@edgargabriel
Contributor

I would not set UCX_TLS=rc; you are basically excluding the rocm components. At a bare minimum, UCX will not be able to detect/recognize the rocm memory types, i.e. it will not be able to tell that it is dealing with GPU memory, and I am not 100% sure what the impact of that is. I would recommend setting at least UCX_TLS=rocm,rc.
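
For example (only the transport list in your command above changes; everything else stays the same):

mpirun -np 2 -H a:1,b:1 -mca pml ucx -x UCX_NET_DEVICES=mlx5_0:1,mlx5_1:1,mlx5_2:1,mlx5_3:1 -x UCX_TLS=rocm,rc -x LD_LIBRARY_PATH osu_bw -d rocm D D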

@edgargabriel
Contributor

I am not entirely sure which generation of IB hardware you are using, but the bandwidth values you show are very low; most likely the data is being funneled through CPU memory in your case. I would recommend: a) first try only one HCA at a time (ideally the one closest to the GPU you are using), and b) double-check that ACS is disabled on your system, since it can prevent direct GPU-to-HCA communication. You should not have to worry about the UCX_IB_GPU_DIRECT_RDMA setting; we usually do not set that value in order to achieve full line-rate bandwidth.
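
For example (a sketch of both checks; mlx5_0 is just the first device from your command and may not be the HCA closest to your GPU):

# a) run over a single HCA
mpirun -np 2 -H a:1,b:1 -mca pml ucx -x UCX_NET_DEVICES=mlx5_0:1 -x UCX_TLS=rocm,rc -x LD_LIBRARY_PATH osu_bw -d rocm D D

# b) check whether PCIe ACS is enabled on the bridges between the GPU and the HCA; '+' flags in the ACSCtl lines mean ACS is active
sudo lspci -vvv | grep -i acsctl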

@edgargabriel
Contributor

edgargabriel commented Aug 21, 2024

Also, are you using a Mellanox OFED driver on your system, or the standard Linux RDMA packages? I would recommend MOFED for easier interaction with the GPUs.
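
To check which stack is installed (assuming a typical installation):

ofed_info -s    # prints the MOFED version string if Mellanox OFED is installed; if the command is missing, you are likely on the inbox rdma-core packages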
