What does dma_buf do when gpuDirectRdma is disabled ? #245

Open
Pavani-Panakanti opened this issue Aug 2, 2024 · 1 comment

@Pavani-Panakanti

I'm running the NCCL tests on 2 nodes, each with one A10G GPU, with GDR (GPUDirect RDMA) disabled.
Why do I see the line "DMA-BUF is available on GPU device 0" in the logs? Will DMA-BUF be used when GDR is disabled?
I'd appreciate the help!

 [0] NCCL INFO NET/OFI Could not disable CUDA API usage for HMEM, disabling GDR
 [0] NCCL INFO NET/OFI Setting NCCL_PROTO to "simple"
[0] NCCL INFO NET/OFI Could not disable CUDA API usage for HMEM, disabling GDR
[0] NCCL INFO NET/OFI Setting NCCL_PROTO to "simple"
[0] NCCL INFO DMA-BUF is available on GPU device 0
[0] NCCL INFO DMA-BUF is available on GPU device 0
[0] NCCL INFO comm 0x2515e00 rank 0 nranks 2 cudaDev 0 nvmlDev 0 busId 1e0 commId 0x1c71981584deedae - Init START
[0] NCCL INFO comm 0x2b049f0 rank 1 nranks 2 cudaDev 0 nvmlDev 0 busId 1e0 commId 0x1c71981584deedae - Init START
[0] NCCL INFO NET/OFI Libfabric provider associates MRs with domains
[0] NCCL INFO NET/OFI Libfabric provider associates MRs with domains
[0] NCCL INFO Channel 00/02 :    0   1
[0] NCCL INFO Channel 01/02 :    0   1
@kiskra-nvidia
Member

That message comes from a generic test that's always done during initialization, irrespective of the communication layer used or its configuration.
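
To tell the two messages apart when reading a log like the one above, a small script along these lines might help. This is only a sketch: `summarize_nccl_log` is a hypothetical helper, not part of NCCL or the OFI plugin, and the regular expressions assume the exact message wording shown in this thread. It reports the DMA-BUF availability message and the plugin's "disabling GDR" message as two separate facts, since the first is printed regardless of the second.

```python
import re

# Hypothetical helper for reading NCCL INFO logs; not part of NCCL itself.
# The patterns assume the message wording quoted in this thread.
DMABUF_RE = re.compile(r"NCCL INFO DMA-BUF is available on GPU device (\d+)")
GDR_OFF_RE = re.compile(r"NCCL INFO NET/OFI .*disabling GDR")


def summarize_nccl_log(text):
    """Return which devices reported DMA-BUF and whether GDR was disabled."""
    dmabuf_devices = set()
    gdr_disabled = False
    for line in text.splitlines():
        match = DMABUF_RE.search(line)
        if match:
            dmabuf_devices.add(int(match.group(1)))
        if GDR_OFF_RE.search(line):
            gdr_disabled = True
    return {
        "dmabuf_available_on": sorted(dmabuf_devices),
        "gdr_disabled": gdr_disabled,
    }


# Lines copied from the log in the question.
sample = """\
[0] NCCL INFO NET/OFI Could not disable CUDA API usage for HMEM, disabling GDR
[0] NCCL INFO NET/OFI Setting NCCL_PROTO to "simple"
[0] NCCL INFO DMA-BUF is available on GPU device 0
"""

summary = summarize_nccl_log(sample)
print(summary)
```

Both facts can be true in the same run, as in the log above: the availability message only reflects the initialization-time test, while the GDR line reflects what the network plugin decided.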
