# nccl.torch

Torch7 FFI bindings for the NVIDIA NCCL library.

## Installation
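These bindings require the NCCL library (https://github.com/NVIDIA/nccl) to be built and installed, along with a CUDA-enabled Torch7 setup that includes `cutorch`.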

## Collective operations supported

* allReduce
* reduce
* broadcast
* allGather

## Example usage

The argument to a collective call should be a table of contiguous tensors, one per device. For example, to perform an in-place allReduce on a table of tensors:

```lua
require 'nccl'
-- in-place: every tensor in `inputs` ends up holding the elementwise sum
nccl.allReduce(inputs)
```

where `inputs` is a table of contiguous tensors of equal size, each located on a different device.
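For a self-contained sketch, assuming a multi-GPU machine and the standard `cutorch` package (the tensor setup below is illustrative and not part of this binding's API):

```lua
require 'cutorch'
require 'nccl'

-- Create one tensor per visible GPU, filled with that device's index.
local nDevices = cutorch.getDeviceCount()
local inputs = {}
for i = 1, nDevices do
   cutorch.setDevice(i)
   inputs[i] = torch.CudaTensor(1000):fill(i)
end

-- After the call, every tensor holds 1 + 2 + ... + nDevices elementwise.
nccl.allReduce(inputs)
```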