cuBLAS Extension APIs - `cublasAxpyEx`

Description

This sample demonstrates how to use the cuBLAS `cublasAxpyEx` function to multiply a vector by a scalar and add the result to a second vector (B = alpha * A + B), using the following vectors:

A = | 1.0 | 2.0 | 3.0 | 4.0 |
B = | 5.0 | 6.0 | 7.0 | 8.0 |

This function generalizes the `cublas<t>axpy` routines: the input data type, output data type, and compute type can be specified independently.

See the cuBLAS documentation for further details.
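
To make the arithmetic concrete, here is a minimal CPU reference sketch of the same operation in plain C++. It is not the sample's source and does not call cuBLAS. The helper names `axpy_reference` and `run_readme_example` are hypothetical, and the value alpha = 2.1 is an assumption inferred from the sample output shown under Usage. The separate template parameters loosely mirror the idea that `cublasAxpyEx` lets the scalar, input, output, and compute types be chosen independently.

```cpp
#include <cstddef>
#include <vector>

// CPU reference for y = alpha * x + y with independently chosen types.
// Hypothetical helper for illustration; not part of cuBLAS or of this sample.
template <typename Compute, typename Alpha, typename X, typename Y>
void axpy_reference(Alpha alpha, const std::vector<X> &x, std::vector<Y> &y) {
    // Process only the elements present in both vectors.
    const std::size_t n = x.size() < y.size() ? x.size() : y.size();
    for (std::size_t i = 0; i < n; ++i) {
        // Promote the operands to the compute type, then store in the output type.
        const Compute acc = static_cast<Compute>(alpha) * static_cast<Compute>(x[i]) +
                            static_cast<Compute>(y[i]);
        y[i] = static_cast<Y>(acc);
    }
}

// Applies the reference to the vectors A and B from this README.
std::vector<double> run_readme_example() {
    const std::vector<double> A = {1.0, 2.0, 3.0, 4.0};
    std::vector<double> B = {5.0, 6.0, 7.0, 8.0};
    const double alpha = 2.1;  // assumption: inferred from the sample output
    axpy_reference<double>(alpha, A, B);  // compute in double precision
    return B;
}
```

Under that assumption, `run_readme_example()` returns values that round to 7.10, 10.20, 13.30, and 16.40, matching the final B in the sample output. Calling `axpy_reference<double>(2.1f, xs, ys)` with `std::vector<float>` arguments would instead accumulate in double and store the result as float.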

Supported SM Architectures

All GPUs supported by CUDA Toolkit (https://developer.nvidia.com/cuda-gpus)

Supported OSes

Linux
Windows

Supported CPU Architecture

x86_64
ppc64le
arm64-sbsa

CUDA APIs involved

cublasAxpyEx API

Building (make)

Prerequisites

A Linux/Windows system with recent NVIDIA drivers.
CMake version 3.18 or later.

Build command on Linux

$ mkdir build
$ cd build
$ cmake ..
$ make

Make sure that CMake finds the expected CUDA Toolkit. If it does not, add the argument `-DCMAKE_CUDA_COMPILER=/path/to/cuda/bin/nvcc` to the cmake command.

Build command on Windows

$ mkdir build
$ cd build
$ cmake -DCMAKE_GENERATOR_PLATFORM=x64 ..

Then open the cublas_examples.sln project in Visual Studio and build it.

Usage

$ ./cublas_AxpyEx_example

Sample output:

A
1.00 2.00 3.00 4.00
=====
B
5.00 6.00 7.00 8.00
=====
B
7.10 10.20 13.30 16.40
=====
