Getting Started

Installation

The Intel® Neural Compressor library is released as part of the Intel® oneAPI AI Analytics Toolkit (AI Kit). The AI Kit provides a consolidated package of Intel's latest deep learning and machine learning optimizations in one place for ease of development. Along with Neural Compressor, the AI Kit includes Intel-optimized versions of deep learning frameworks (such as TensorFlow and PyTorch) and high-performing Python libraries to streamline end-to-end data science and AI workflows on Intel architectures.

Linux Installation

You can install just the library from binary or source, or you can get the Intel-optimized framework together with the library by installing the Intel® oneAPI AI Analytics Toolkit.

Install from binary

# install from pip
pip install neural-compressor

# install from conda
conda install neural-compressor -c conda-forge -c intel 

Install from source

git clone https://github.com/intel/neural-compressor.git
cd neural-compressor
pip install -r requirements.txt
python setup.py install

Install from AI Kit

The AI Kit, which includes the library, is distributed through many common channels, including Intel's website, YUM, APT, Anaconda, and more. Select and download the AI Kit distribution package that best suits you, then follow the Get Started Guide for post-installation instructions.

Download AI Kit | AI Kit Get Started Guide

Windows Installation

Prerequisites

The following prerequisites and requirements must be satisfied for a successful installation:

  • Python version: 3.6, 3.7, 3.8, or 3.9

  • Download and install Anaconda.

  • Create a virtual environment named nc in Anaconda:

    # This example installs Python 3.7; you can also choose Python 3.6, 3.8, or 3.9.
    conda create -n nc python=3.7
    conda activate nc 
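
Once the environment is active, you may want to confirm that its interpreter matches one of the listed Python versions. The following is a minimal sketch; the helper name `is_supported_python` is hypothetical and not part of Neural Compressor.

```python
import sys

# (major, minor) pairs listed in the prerequisites above.
SUPPORTED_PYTHON = {(3, 6), (3, 7), (3, 8), (3, 9)}


def is_supported_python(version_info=None):
    """Return True if the (major, minor) pair is one of the listed versions."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) in SUPPORTED_PYTHON


if __name__ == "__main__":
    major, minor = sys.version_info[:2]
    status = "listed" if is_supported_python() else "not listed"
    print(f"Python {major}.{minor} is {status} in the prerequisites.")
```

Run it with the `python` executable from the nc environment so that it reports on that interpreter rather than the system one.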

Install from binary

# install from pip
pip install neural-compressor

# install from conda
conda install neural-compressor -c conda-forge -c intel 

Install from source

git clone https://github.com/intel/neural-compressor.git
cd neural-compressor
pip install -r requirements.txt
python setup.py install
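
After any of the install paths above, a quick import check can confirm that the package is visible to the interpreter. This sketch assumes the import name is `neural_compressor` and that the module exposes a `__version__` attribute; if it does not, the helper falls back to "unknown". The helper `installed_version` is hypothetical.

```python
import importlib
import importlib.util


def installed_version(module_name="neural_compressor"):
    """Return the module's version string, "unknown" if it defines none,
    or None if the module cannot be found."""
    if importlib.util.find_spec(module_name) is None:
        return None
    module = importlib.import_module(module_name)
    return str(getattr(module, "__version__", "unknown"))


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("neural_compressor is not installed in this environment.")
    else:
        print(f"neural_compressor version: {version}")
```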

Tutorials and Examples

Read the following resources to learn how to use Neural Compressor.

Tutorial

The Tutorial provides comprehensive instructions, with examples, on how to use Intel® Neural Compressor's features.

Examples

Examples are provided to demonstrate the usage of Intel® Neural Compressor in different frameworks: TensorFlow, PyTorch, MXNet, and ONNX Runtime. Hello World examples are also available.

Developer Documentation

See the Neural Compressor Documentation for getting-started guides, deep dives, and advanced resources to help you use and develop Neural Compressor.

System Requirements

Intel® Neural Compressor supports systems based on Intel 64 architecture or compatible processors, and is specially optimized for the following CPUs:

  • Intel Xeon Scalable processors (formerly Skylake, Cascade Lake, Cooper Lake, and Ice Lake)
  • Future Intel Xeon Scalable processors (code-named Sapphire Rapids)

Intel® Neural Compressor requires the Intel-optimized version of the supported DL framework you use: TensorFlow, PyTorch, MXNet, or ONNX Runtime.

Note: For some TensorFlow versions, Intel Neural Compressor supports both the Intel-optimized and the official framework. Refer to Supported Frameworks for specifics.

Validated Hardware/Software Environment

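| Platform | OS | Python | Framework | Version |
|---|---|---|---|---|
| Cascade Lake, Cooper Lake, Skylake, Ice Lake | CentOS 8.3, Ubuntu 18.04 | 3.6, 3.7, 3.8, 3.9 | TensorFlow | 2.5.0, 2.4.0, 2.3.0, 2.2.0, 2.1.0, 1.15.0 UP1, 1.15.0 UP2, 1.15.0 UP3, 1.15.2 |
| Cascade Lake, Cooper Lake, Skylake, Ice Lake | CentOS 8.3, Ubuntu 18.04 | 3.6, 3.7, 3.8, 3.9 | PyTorch | 1.5.0+cpu, 1.6.0+cpu, 1.8.0+cpu, IPEX |
| Cascade Lake, Cooper Lake, Skylake, Ice Lake | CentOS 8.3, Ubuntu 18.04 | 3.6, 3.7, 3.8, 3.9 | MXNet | 1.7.0, 1.6.0 |
| Cascade Lake, Cooper Lake, Skylake, Ice Lake | CentOS 8.3, Ubuntu 18.04 | 3.6, 3.7, 3.8, 3.9 | ONNX Runtime | 1.6.0, 1.7.0, 1.8.0 |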
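
To check an installed framework against the validated versions listed above, a small lookup may help. This is a sketch only: the version strings are transcribed from the validated environment list, and the helper `is_validated` and its name normalization are assumptions made for illustration, not part of Neural Compressor. A version that is absent from the list is simply unvalidated, not necessarily unsupported.

```python
# Validated versions transcribed from the list above.
# "IPEX" appears under PyTorch there without a version number, so it is omitted here.
VALIDATED = {
    "tensorflow": {
        "2.5.0", "2.4.0", "2.3.0", "2.2.0", "2.1.0",
        "1.15.0 UP1", "1.15.0 UP2", "1.15.0 UP3", "1.15.2",
    },
    "pytorch": {"1.5.0+cpu", "1.6.0+cpu", "1.8.0+cpu"},
    "mxnet": {"1.7.0", "1.6.0"},
    "onnxruntime": {"1.6.0", "1.7.0", "1.8.0"},
}


def is_validated(framework, version):
    """Return True if the exact version string is listed for the framework."""
    key = framework.lower().replace(" ", "")
    return version in VALIDATED.get(key, set())


if __name__ == "__main__":
    for name, version in [("TensorFlow", "2.4.0"), ("ONNX Runtime", "1.9.0")]:
        verdict = "validated" if is_validated(name, version) else "not in the validated list"
        print(f"{name} {version}: {verdict}")
```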

Validated Models

Intel® Neural Compressor provides numerous examples that show minimal accuracy loss alongside the best performance gain. The full list of quantized models on various frameworks is available in the Model List.

| Framework | Version | Model | Dataset | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Acc Ratio [(INT8-FP32)/FP32] | Performance speed up (Realtime Latency Ratio [FP32/INT8]) |
|---|---|---|---|---|---|---|---|
| tensorflow | 2.4.0 | resnet50v1.5 | ImageNet | 76.70% | 76.50% | 0.26% | 3.23x |
| tensorflow | 2.4.0 | Resnet101 | ImageNet | 77.20% | 76.40% | 1.05% | 2.42x |
| tensorflow | 2.4.0 | inception_v1 | ImageNet | 70.10% | 69.70% | 0.57% | 1.88x |
| tensorflow | 2.4.0 | inception_v2 | ImageNet | 74.10% | 74.00% | 0.14% | 1.96x |
| tensorflow | 2.4.0 | inception_v3 | ImageNet | 77.20% | 76.70% | 0.65% | 2.36x |
| tensorflow | 2.4.0 | inception_v4 | ImageNet | 80.00% | 80.30% | -0.37% | 2.59x |
| tensorflow | 2.4.0 | inception_resnet_v2 | ImageNet | 80.10% | 80.40% | -0.37% | 1.97x |
| tensorflow | 2.4.0 | Mobilenetv1 | ImageNet | 71.10% | 71.00% | 0.14% | 2.88x |
| tensorflow | 2.4.0 | ssd_resnet50_v1 | Coco | 37.90% | 38.00% | -0.26% | 2.97x |
| tensorflow | 2.4.0 | mask_rcnn_inception_v2 | Coco | 28.90% | 29.10% | -0.69% | 2.66x |
| tensorflow | 2.4.0 | vgg16 | ImageNet | 72.50% | 70.90% | 2.26% | 3.75x |
| tensorflow | 2.4.0 | vgg19 | ImageNet | 72.40% | 71.00% | 1.97% | 3.79x |

| Framework | Version | Model | Dataset | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Acc Ratio [(INT8-FP32)/FP32] | Performance speed up (Realtime Latency Ratio [FP32/INT8]) |
|---|---|---|---|---|---|---|---|
| pytorch | 1.5.0+cpu | resnet50 | ImageNet | 75.96% | 76.13% | -0.23% | 2.63x |
| pytorch | 1.5.0+cpu | resnext101_32x8d | ImageNet | 79.12% | 79.31% | -0.24% | 2.61x |
| pytorch | 1.6.0a0+24aac32 | bert_base_mrpc | MRPC | 88.90% | 88.73% | 0.19% | 1.98x |
| pytorch | 1.6.0a0+24aac32 | bert_base_cola | COLA | 59.06% | 58.84% | 0.37% | 2.19x |
| pytorch | 1.6.0a0+24aac32 | bert_base_sts-b | STS-B | 88.40% | 89.27% | -0.97% | 2.28x |
| pytorch | 1.6.0a0+24aac32 | bert_base_sst-2 | SST-2 | 91.51% | 91.86% | -0.37% | 2.30x |
| pytorch | 1.6.0a0+24aac32 | bert_base_rte | RTE | 69.31% | 69.68% | -0.52% | 2.15x |
| pytorch | 1.6.0a0+24aac32 | bert_large_mrpc | MRPC | 87.45% | 88.33% | -0.99% | 2.73x |
| pytorch | 1.6.0a0+24aac32 | bert_large_squad | SQUAD | 92.85% | 93.05% | -0.21% | 2.01x |
| pytorch | 1.6.0a0+24aac32 | bert_large_qnli | QNLI | 91.20% | 91.82% | -0.68% | 2.69x |
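
The two ratio columns above follow directly from their bracketed definitions: the accuracy ratio is (INT8 - FP32) / FP32, and the speed-up is the FP32 latency divided by the INT8 latency. The sketch below reproduces the accuracy ratio for the resnet50v1.5 row. The function names are illustrative, and the latency values are hypothetical placeholders, since the list reports only the resulting ratio.

```python
def accuracy_ratio(int8_acc, fp32_acc):
    """Relative accuracy change of the INT8 model versus the FP32 baseline."""
    return (int8_acc - fp32_acc) / fp32_acc


def latency_ratio(fp32_latency, int8_latency):
    """Speed-up factor: values above 1.0 mean the INT8 model is faster."""
    return fp32_latency / int8_latency


if __name__ == "__main__":
    # resnet50v1.5 row: INT8 76.70%, FP32 76.50%
    print(f"{accuracy_ratio(76.70, 76.50):.2%}")  # prints 0.26%

    # Hypothetical latencies in milliseconds, for illustration only.
    print(f"{latency_ratio(20.0, 8.0):.2f}x")  # prints 2.50x
```

A negative accuracy ratio means the INT8 model scored below the FP32 baseline, while a positive one means quantization did not reduce the measured accuracy.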