The Qualcomm® AI Hub Models are a collection of state-of-the-art machine learning models optimized for deployment on Qualcomm® devices.
See below for the supported on-device runtimes, hardware targets and precisions, chipsets, and devices.
The package is available via pip:
# NOTE for Snapdragon X Elite users:
# Only AMD64 (64-bit) Python is supported on Windows.
# Installation will fail when using Windows ARM64 Python.
pip install qai_hub_models
Some models (e.g. YOLOv7) require additional dependencies that can be installed as follows:
pip install "qai_hub_models[yolov7]"
Many features of AI Hub Models (such as model compilation, on-device profiling, etc.) require access to Qualcomm® AI Hub:
- Create a Qualcomm® ID, and use it to log in to Qualcomm® AI Hub.
- Configure your API token:
qai-hub configure --api_token API_TOKEN
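Once configured, you can sanity-check the token from Python. A minimal sketch using the qai_hub client (installed as a dependency of qai_hub_models):

```python
import qai_hub as hub

# If the API token is configured correctly, this lists the cloud-hosted
# devices available for compile, profile, and inference jobs.
for device in hub.get_devices():
    print(device.name)
```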
All models in our directory can be compiled and profiled on a hosted Qualcomm® device:
pip install "qai_hub_models[yolov7]"
python -m qai_hub_models.models.yolov7.export [--target-runtime ...] [--device ...] [--help]
Using Qualcomm® AI Hub, the export script will:
- Compile the model for the chosen device and target runtime (see: Compiling Models on AI Hub).
- If applicable, quantize the model (see: Quantization on AI Hub).
- Profile the compiled model on a real device in the cloud (see: Profiling Models on AI Hub).
- Run inference with sample input data on a real device in the cloud, and compare the on-device output with PyTorch output (see: Running Inference on AI Hub).
- Download the compiled model to disk.
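The same flow can also be driven from Python with the qai_hub client. A minimal sketch, assuming YOLOv7 and an example device name (use hub.get_devices() to see what your account can access):

```python
import qai_hub as hub
import torch

from qai_hub_models.models.yolov7 import Model

# Load the pretrained PyTorch model and trace it so AI Hub can compile it.
torch_model = Model.from_pretrained()
torch_model.eval()
input_spec = torch_model.get_input_spec()  # e.g. {"image": ((1, 3, 640, 640), "float32")}
sample_inputs = tuple(torch.rand(shape) for shape, _ in input_spec.values())
traced_model = torch.jit.trace(torch_model, sample_inputs)

# Compile for a cloud-hosted device, then profile the compiled model on it.
device = hub.Device("Samsung Galaxy S24 (Family)")  # example device name
compile_job = hub.submit_compile_job(traced_model, device=device, input_specs=input_spec)
profile_job = hub.submit_profile_job(compile_job.get_target_model(), device=device)
```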
Most models in our directory contain CLI demos that run the model end-to-end:
pip install "qai_hub_models[yolov7]"
# Predict and draw bounding boxes on the provided image
python -m qai_hub_models.models.yolov7.demo [--image ...] [--on-device] [--help]
End-to-end demos:
- Preprocess human-readable input into model input
- Run model inference
- Postprocess model output to a human-readable format
Many end-to-end demos can use AI Hub to run inference on a real cloud-hosted device when the --on-device
flag is set. All end-to-end demos also run locally via PyTorch.
Native applications that can run our models (with pre- and post-processing) on physical devices are published in the AI Hub Apps repository.
Python applications are defined for all models (from qai_hub_models.models.<model_name> import App). These apps wrap model inference with pre- and post-processing steps written using torch and numpy. They are written to be easy-to-follow examples rather than to minimize prediction time.
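For example, a minimal sketch with YOLOv7 (the image path is a placeholder, and the prediction entry point varies per model; see the model's demo source for the exact call):

```python
from PIL import Image

from qai_hub_models.models.yolov7 import App, Model

# Wrap the pretrained model with the pre- and post-processing described above.
app = App(Model.from_pretrained())

# "input.jpg" is a placeholder path; predict() is assumed here -- check the
# model's demo for the actual method name and return type.
image = Image.open("input.jpg")
results = app.predict(image)
```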
| Runtime | Supported OS |
|---|---|
| Qualcomm AI Engine Direct | Android, Linux, Windows |
| LiteRT (TensorFlow Lite) | Android, Linux |
| ONNX | Android, Linux, Windows |
| Device Compute Unit | Supported Precision |
|---|---|
| CPU | FP32, INT16, INT8 |
| GPU | FP32, FP16 |
| NPU (includes Hexagon DSP, HTP) | FP16*, INT16, INT8 |

*Some older chipsets do not support FP16 inference on their NPU.
- Snapdragon 8 Elite, 8 Gen 3, 8 Gen 2, and 8 Gen 1 Mobile Platforms
- Snapdragon X Elite Compute Platform
- SA8255P, SA8295P, SA8650P, and SA8775P Automotive Platforms
- QCS6490, QCS8250, and QCS8550 IoT Platforms
- QCS8450 XR Platform
and many more.
- Samsung Galaxy S21, S22, S23, and S24 Series
- Xiaomi 12 and 13
- Snapdragon X Elite CRD (Compute Reference Device)
- Qualcomm RB3 Gen 2, RB5
and many more.
| Model | README |
|---|---|
| TrOCR | qai_hub_models.models.trocr |
| OpenAI-Clip | qai_hub_models.models.openai_clip |
Slack: https://aihub.qualcomm.com/community/slack
GitHub Issues: https://github.com/quic/ai-hub-models/issues
Email: [email protected]
Qualcomm® AI Hub Models is licensed under the BSD 3-Clause License. See the LICENSE file.