This repository has been archived by the owner on Feb 29, 2024. It is now read-only.

Commit

refactor
ImmanuelSegol committed Feb 21, 2024
1 parent 85c01e4 commit c21a836
Showing 2 changed files with 156 additions and 0 deletions.
6 changes: 6 additions & 0 deletions docs/icicle/multi-gpu.md
@@ -43,6 +43,12 @@ To dive deeper and learn about the API checkout the docs for our different ICICL
- C++ Multi GPU APIs


## Best practices

- Never hardcode device IDs. If you want your software to take advantage of all GPUs on a machine, use methods such as `get_device_count` to support an arbitrary number of GPUs.

- Launch one thread per GPU. To avoid hard-to-debug errors and hard-to-read code, we suggest creating a dedicated thread for every GPU task you launch. This keeps your code manageable, readable, and performant.
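
As a minimal sketch of both practices, the following uses only the Rust standard library. `get_device_count` and `set_device` here are local stand-ins that mimic the ICICLE functions of the same name (the real ones call into CUDA and return a `CudaResult`); `run_on_all_devices`, the closure passed to it, and the device count of 4 are hypothetical and chosen only for illustration.

```rust
use std::thread;

// Stand-in for ICICLE's `get_device_count`; a real program would query CUDA.
// The value 4 is an arbitrary assumption for this sketch.
fn get_device_count() -> Result<usize, String> {
    Ok(4)
}

// Stand-in for ICICLE's `set_device`; rejects IDs outside the device range.
fn set_device(device_id: usize) -> Result<(), String> {
    let count = get_device_count()?;
    if device_id < count {
        Ok(())
    } else {
        Err(format!("invalid device id {}", device_id))
    }
}

// Spawns one dedicated thread per device and collects each thread's result,
// ordered by device ID. `task` is a placeholder for the real GPU work.
fn run_on_all_devices<T, F>(task: F) -> Result<Vec<T>, String>
where
    T: Send + 'static,
    F: Fn(usize) -> T + Send + Clone + 'static,
{
    let count = get_device_count()?; // never hardcode the number of GPUs
    let handles: Vec<_> = (0..count)
        .map(|device_id| {
            let task = task.clone();
            thread::spawn(move || -> Result<T, String> {
                set_device(device_id)?; // bind this thread to its GPU first
                Ok(task(device_id))
            })
        })
        .collect();

    let mut results = Vec::with_capacity(count);
    for handle in handles {
        let value = handle
            .join()
            .map_err(|_| "worker thread panicked".to_string())??;
        results.push(value);
    }
    Ok(results)
}

fn main() {
    let results = run_on_all_devices(|device_id| device_id * 10).expect("device setup failed");
    println!("{:?}", results);
}
```

Because the device count is queried at runtime, the same code adapts to a machine with one GPU or eight, and each thread touches only the device it selected.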

## ZKContainer support for multi GPUs

Multi GPU support should work with ZK-Containers by simply defining which devices the docker container should interact with:
150 changes: 150 additions & 0 deletions docs/icicle/rust-bindings/multi-gpu.md
@@ -1,2 +1,152 @@
# Multi GPU APIs

To learn more about the theory of Multi GPU programming, refer to [this part](../multi-gpu.md) of the documentation.

## Device management API

To streamline device management, the `icicle-cuda-runtime` package provides methods for dealing with devices.

#### [`set_device`](https://github.com/vhnatyk/icicle/blob/275eaa99040ab06b088154d64cfa50b25fbad2df/wrappers/rust/icicle-cuda-runtime/src/device.rs#L6)

Sets the current CUDA device by its ID. Calling `set_device` binds the calling thread to that CUDA device.

**Parameters:**

- `device_id: usize`: The ID of the device to set as the current device. Device IDs start from 0.

**Returns:**

- `CudaResult<()>`: An empty result indicating success if the device is set successfully. In case of failure, returns a `CudaError`.

**Errors:**

- Returns a `CudaError` if the specified device ID is invalid or if a CUDA-related error occurs during the operation.

**Example:**

```rust
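// Import path assumed from the crate layout linked above (src/device.rs).
use icicle_cuda_runtime::device::set_device;

let device_id = 0; // Device ID to set
match set_device(device_id) {
    Ok(()) => println!("Device set successfully."),
    Err(e) => eprintln!("Failed to set device: {:?}", e),
}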
```

#### [`get_device_count`](https://github.com/vhnatyk/icicle/blob/275eaa99040ab06b088154d64cfa50b25fbad2df/wrappers/rust/icicle-cuda-runtime/src/device.rs#L10)

Retrieves the number of CUDA devices available on the machine.

**Returns:**

- `CudaResult<usize>`: The number of available CUDA devices. On success, contains the count of CUDA devices. On failure, returns a `CudaError`.

**Errors:**

- Returns a `CudaError` if a CUDA-related error occurs during the retrieval of the device count.

**Example:**

```rust
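// Import path assumed from the crate layout linked above (src/device.rs).
use icicle_cuda_runtime::device::get_device_count;

match get_device_count() {
    Ok(count) => println!("Number of devices available: {}", count),
    Err(e) => eprintln!("Failed to get device count: {:?}", e),
}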
```

#### [`get_device`](https://github.com/vhnatyk/icicle/blob/275eaa99040ab06b088154d64cfa50b25fbad2df/wrappers/rust/icicle-cuda-runtime/src/device.rs#L15)

Retrieves the ID of the current CUDA device.

**Returns:**

- `CudaResult<usize>`: The ID of the current CUDA device. On success, contains the device ID. On failure, returns a `CudaError`.

**Errors:**

- Returns a `CudaError` if a CUDA-related error occurs during the retrieval of the current device ID.

**Example:**

```rust
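// Import path assumed from the crate layout linked above (src/device.rs).
use icicle_cuda_runtime::device::get_device;

match get_device() {
    Ok(device_id) => println!("Current device ID: {}", device_id),
    Err(e) => eprintln!("Failed to get current device: {:?}", e),
}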
```
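
Because `set_device` applies to the calling thread, `get_device` reports whichever device that thread last selected. The sketch below models this with a thread-local variable so it runs without CUDA. `set_device` and `get_device` here are local stand-ins, not the ICICLE functions, and `DEVICE_COUNT` and `device_seen_by_worker` are hypothetical names introduced for the illustration.

```rust
use std::cell::Cell;
use std::thread;

// Assumed number of GPUs for this sketch; real code would query the runtime.
const DEVICE_COUNT: usize = 2;

thread_local! {
    // Each thread tracks its own current device, starting at device 0.
    static CURRENT_DEVICE: Cell<usize> = Cell::new(0);
}

// Stand-in for `set_device`: changes the device of the calling thread only.
fn set_device(device_id: usize) -> Result<(), String> {
    if device_id >= DEVICE_COUNT {
        return Err(format!("invalid device id {}", device_id));
    }
    CURRENT_DEVICE.with(|d| d.set(device_id));
    Ok(())
}

// Stand-in for `get_device`: reads the calling thread's current device.
fn get_device() -> Result<usize, String> {
    Ok(CURRENT_DEVICE.with(|d| d.get()))
}

// Selects `device_id` on a fresh thread and returns what that thread observes.
fn device_seen_by_worker(device_id: usize) -> Result<usize, String> {
    thread::spawn(move || {
        set_device(device_id)?;
        get_device()
    })
    .join()
    .map_err(|_| "worker thread panicked".to_string())?
}

fn main() {
    let worker = device_seen_by_worker(1).expect("worker failed");
    let main_thread = get_device().expect("query failed");
    println!("worker thread: {}, main thread: {}", worker, main_thread);
}
```

In this model the worker thread observes device 1 while the main thread, which never called `set_device`, still observes device 0. This is why each GPU task should select its device at the start of its own thread.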

## Device context API

The `DeviceContext` is embedded into `NTTConfig`, `MSMConfig` and `PoseidonConfig`, meaning you can simply pass a `device_id` to your existing config and the same computation will be triggered on a different device automatically.

#### [`DeviceContext`](https://github.com/vhnatyk/icicle/blob/eef6876b037a6b0797464e7cdcf9c1ecfcf41808/wrappers/rust/icicle-cuda-runtime/src/device_context.rs#L11)

Represents the configuration of a CUDA device, encapsulating the device's stream, ID, and memory pool. The default device is always `0`, unless configured otherwise.

```rust
pub struct DeviceContext<'a> {
    pub stream: &'a CudaStream,
    pub device_id: usize,
    pub mempool: CudaMemPool,
}
```

##### Fields

- **`stream: &'a CudaStream`**

A reference to a `CudaStream`. This stream is used for executing CUDA operations. By default, it points to the null stream, which is CUDA's default execution stream.

- **`device_id: usize`**

The index of the GPU currently in use. The default value is `0`, indicating the first GPU in the system.

- **`mempool: CudaMemPool`**

Represents the memory pool used for CUDA memory allocations. The default is set to a null pointer, which signifies the use of the default CUDA memory pool.

##### Implementation Notes

- The `DeviceContext` structure implements `Clone` and `Debug`, which makes it easy to duplicate contexts and to log them when needed.


#### [`DeviceContext::default_for_device(device_id: usize) -> DeviceContext<'static>`](https://github.com/vhnatyk/icicle/blob/eef6876b037a6b0797464e7cdcf9c1ecfcf41808/wrappers/rust/icicle-cuda-runtime/src/device_context.rs#L30C12-L30C30)

Creates a `DeviceContext` for the given device, using default settings for everything else. Ideal for straightforward setups.

#### Parameters

- **`device_id: usize`**: The ID of the device for which to create the context.

#### Returns

A `DeviceContext` instance configured with:
- The default stream (`null_mut()`).
- The provided `device_id`.
- The default memory pool (`null_mut()`).
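
To illustrate how a per-device context can be threaded through a config, here is a self-contained sketch. The `DeviceContext` below is a simplified stand-in (the stream and memory pool are modelled as `Option`s rather than CUDA handles), and `SketchConfig`, its `ctx` and `batch_size` fields, and `configs_for_devices` are hypothetical placeholders for configs such as `NTTConfig` or `MSMConfig`. None of this is the ICICLE implementation.

```rust
// Simplified stand-in for ICICLE's `DeviceContext`. `None` plays the role of
// the null stream / null memory pool, i.e. "use the CUDA default".
#[derive(Debug, Clone, PartialEq)]
struct DeviceContext {
    stream: Option<u64>,
    device_id: usize,
    mempool: Option<u64>,
}

impl DeviceContext {
    // Mirrors the documented behaviour: caller-chosen device, default stream
    // and default memory pool.
    fn default_for_device(device_id: usize) -> Self {
        DeviceContext {
            stream: None,
            device_id,
            mempool: None,
        }
    }
}

// Hypothetical config that embeds a device context alongside its own settings.
#[derive(Debug, Clone)]
struct SketchConfig {
    ctx: DeviceContext,
    batch_size: usize,
}

// Builds one config per device; only the embedded context differs.
fn configs_for_devices(device_count: usize, batch_size: usize) -> Vec<SketchConfig> {
    (0..device_count)
        .map(|device_id| SketchConfig {
            ctx: DeviceContext::default_for_device(device_id),
            batch_size,
        })
        .collect()
}

fn main() {
    for cfg in configs_for_devices(2, 1024) {
        println!("{:?}", cfg);
    }
}
```

Keeping the device choice inside the context means the rest of the config can be shared unchanged across GPUs.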


#### [`check_device(device_id: i32)`](https://github.com/vhnatyk/icicle/blob/eef6876b037a6b0797464e7cdcf9c1ecfcf41808/wrappers/rust/icicle-cuda-runtime/src/device_context.rs#L42)

Validates that the specified `device_id` matches the ID of the currently active device, ensuring operations are targeted correctly.

#### Parameters

- **`device_id: i32`**: The device ID to verify against the currently active device.

#### Behavior

- **Panics** if the `device_id` does not match the active device's ID, preventing cross-device operation errors.

#### Example

```rust
let device_id: i32 = 0; // Example device ID
check_device(device_id);
// Ensures that the current context is correctly set for the specified device ID.
```
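
The panic-on-mismatch behaviour can be sketched without CUDA. Here `current_device` is a hypothetical stand-in for querying the active device (fixed at 0), and `device_matches` and `check_device_sketch` illustrate the documented behaviour only; they are not the ICICLE source.

```rust
// Hypothetical stand-in for querying the active device; fixed at 0 here.
fn current_device() -> i32 {
    0
}

// Returns true when `device_id` is the device currently active.
fn device_matches(device_id: i32) -> bool {
    device_id == current_device()
}

// Panics when `device_id` differs from the active device, mirroring the
// behaviour documented for `check_device`.
fn check_device_sketch(device_id: i32) {
    assert!(
        device_matches(device_id),
        "attempt to use device {} while device {} is active",
        device_id,
        current_device()
    );
}

fn main() {
    check_device_sketch(0); // matches the active device, so no panic
    println!("device 1 matches: {}", device_matches(1));
}
```

Failing fast this way surfaces a wrong-device call at the point of the mistake rather than as a harder-to-trace CUDA error later on.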

