There are four ways to use Triton Model Analyzer:
- **Triton SDK Container:**

  The recommended way to use Model Analyzer is with the Triton SDK docker container, available on the NVIDIA GPU Cloud Catalog. You can pull the SDK container with the following command:

  ```
  $ docker pull nvcr.io/nvidia/tritonserver:21.07-py3-sdk
  ```
  If you are not planning to run Model Analyzer with `--triton-launch-mode=docker`, you can run the container with the following command:

  ```
  $ docker run -it --gpus all --net=host nvcr.io/nvidia/tritonserver:21.07-py3-sdk
  ```
  If you intend to use `--triton-launch-mode=docker`, you will need to mount the following:

  - `-v /var/run/docker.sock:/var/run/docker.sock`: allows running docker containers as sibling containers from inside the Triton SDK container. Model Analyzer requires this when run with `--triton-launch-mode=docker`.
  - `-v <path-to-output-model-repo>:<path-to-output-model-repo>`: the absolute path to the directory where the output model repository will be located (i.e. the parent directory of the output model repository). This is needed so that the launched Triton container has access to the model config variants that Model Analyzer creates.

  ```
  $ docker run -it --gpus all \
      -v /var/run/docker.sock:/var/run/docker.sock \
      -v <path-to-output-model-repo>:<path-to-output-model-repo> \
      --net=host nvcr.io/nvidia/tritonserver:21.07-py3-sdk
  ```
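  For context, a typical invocation from inside the SDK container might then look like the sketch below. The model name `add_sub` and the repository paths are placeholders, and flags other than `--triton-launch-mode` belong to the `profile` subcommand and may differ across versions; check `model-analyzer profile --help` for your release.

  ```
  $ model-analyzer profile \
      --model-repository <path-to-triton-model-repository> \
      --profile-models add_sub \
      --triton-launch-mode=docker \
      --output-model-repository-path <path-to-output-model-repo>/output
  ```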
  Model Analyzer uses `pdfkit` for report generation. If you are running Model Analyzer inside the Triton SDK container, you will need to install `wkhtmltopdf`:

  ```
  $ sudo apt-get update && sudo apt-get install wkhtmltopdf
  ```
  Once you do this, Model Analyzer will be able to use `pdfkit` to generate reports.
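  To confirm that `pdfkit` can find `wkhtmltopdf`, you can run a quick sanity check. This is a minimal sketch, assuming `pdfkit` is already present in the container as a Model Analyzer dependency; the HTML string and output path are arbitrary:

  ```
  $ python3 -c "import pdfkit; pdfkit.from_string('<h1>hello</h1>', '/tmp/test.pdf')"
  ```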
- **Building the Dockerfile:**

  You can also build Model Analyzer's dockerfile yourself. First, clone Model Analyzer's git repository, then build the docker image from inside it:

  ```
  $ git clone https://github.com/triton-inference-server/model_analyzer
  $ cd model_analyzer
  $ docker build --pull -t model-analyzer .
  ```
  The above command will pull all the containers that Model Analyzer needs to run. Model Analyzer's Dockerfile bases the container on the latest `tritonserver` container from NGC. Now you can run the container with:

  ```
  $ docker run -it --rm --gpus all \
      -v /var/run/docker.sock:/var/run/docker.sock \
      -v <path-to-triton-model-repository>:<path-to-triton-model-repository> \
      -v <path-to-output-model-repo>:<path-to-output-model-repo> \
      --net=host model-analyzer
  root@hostname:/opt/triton-model-analyzer#
  ```
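  Since `--triton-launch-mode=docker` relies on the mounted docker socket, you can sanity-check the mount from inside the container before profiling. This sketch assumes the `docker` Python package (a Model Analyzer dependency) is available in the image; `ping()` returns `True` if the docker daemon is reachable:

  ```
  root@hostname:/opt/triton-model-analyzer# python3 -c "import docker; print(docker.from_env().ping())"
  True
  ```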
- **Using `pip3`:**

  You can install `pip3` using:

  ```
  $ sudo apt-get update && sudo apt-get install python3-pip
  ```
  Model Analyzer can be installed with:

  ```
  $ pip3 install nvidia-pyindex
  $ pip3 install triton-model-analyzer
  ```
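  The first command installs `nvidia-pyindex`, which registers NVIDIA's python package index so that `pip3` can find `triton-model-analyzer`. You can then confirm the installation with a standard pip query:

  ```
  $ pip3 show triton-model-analyzer
  ```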
  If you encounter any errors installing dependencies like `numba`, make sure that you have the latest version of `pip`:

  ```
  $ pip3 install --upgrade pip
  ```

  You can then try installing Model Analyzer again.
  If you are using this approach, you need to install DCGM on your machine. For example, to install DCGM on Ubuntu 20.04 you can use the following commands:

  ```
  $ export DCGM_VERSION=2.0.13
  $ wget -q https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/datacenter-gpu-manager_${DCGM_VERSION}_amd64.deb && \
      sudo dpkg -i datacenter-gpu-manager_${DCGM_VERSION}_amd64.deb
  ```
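  Once DCGM is installed, you can check that it recognizes your GPUs. This assumes the DCGM host engine is running; if it is not started as a service, it can be launched manually with `nv-hostengine`:

  ```
  $ dcgmi discovery -l
  ```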
- **Building from source:**

  To build Model Analyzer from source, you'll need to install the same dependencies (tritonclient and DCGM) mentioned in the "Using `pip3`" section above. After that, you can use the following commands:

  ```
  $ git clone https://github.com/triton-inference-server/model_analyzer
  $ cd model_analyzer
  $ ./build_wheel.sh <path to perf_analyzer> true
  ```
  The final command above builds the triton-model-analyzer wheel. You will need to provide the `build_wheel.sh` script with two arguments. The first is the path to the `perf_analyzer` binary that you would like Model Analyzer to use. The second is whether you want this wheel to be linux-specific; currently, this argument must be set to `true`, as perf_analyzer is supported only on linux. The script will create a wheel file in the `wheels` directory named `triton-model-analyzer-<version>-py3-none-manylinux1_x86_64.whl`, which you can install with:

  ```
  $ pip3 install wheels/triton-model-analyzer-*.whl
  ```
  After these steps, the `model-analyzer` executable should be available in `$PATH`.
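  Whichever installation method you used, you can verify that the executable resolves and runs:

  ```
  $ which model-analyzer
  $ model-analyzer --help
  ```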
Notes:
- Triton Model Analyzer supports all the GPUs supported by the DCGM library. See DCGM Supported GPUs for more information.