The VMAF Python library offers full functionalities from running basic VMAF command lines, software testing, training and validating a new VMAF model on video datasets, data visualization tools, etc. It is the playground to experiment with VMAF and other video quality metrics.
Make sure you have python3
(python 3.6 or higher). You can check the version by python3 --version
.
On Debian Buster (10), or Debian-based systems such as Ubuntu 18.04 or higher:
sudo apt install nasm doxygen python3-dev
On Fedora 22 or higher, or CentOS 8 or higher:
sudo dnf install nasm doxygen python3-devel
On older CentOS and RHEL:
sudo yum install nasm doxygen python3-devel
Make sure nasm
is 2.13.02 or higher (check by nasm --version
).
First, install Homebrew.
If you don't already have a python3
installation on your Mac, run the following to install Python 3 via Homebrew:
brew install python3
Install the remaining dependencies:
brew install nasm doxygen llvm libomp
Note that brew
requires no sudo
.
Follow the steps below to set up a clean virtual environment
python3 -m pip install virtualenv
python3 -m virtualenv .venv
source .venv/bin/activate
From this point forward python3
and pip3
will be relative to the virtualenv and isolated from the system python. Returning to the project in subsequent shell sessions will require re-activating the virtualenv with source .venv/bin/activate
.
Now install the tools required to build VMAF into the virtualenv.
pip3 install cython numpy meson ninja
Make sure ninja
is 1.7.1 or higher (check by ninja --version
).
Clean build the binary by:
make clean; make
Check if build is successful:
./libvmaf/build/tools/vmaf --version
Install the rest of the required Python packages:
pip3 install -r python/requirements.txt
On macOS it's important to use the LLVM from homebrew as the macOS clang does not include support for OpenMP, which is needed for libsvm-official
CC=$HOMEBREW_PREFIX/opt/llvm/bin/clang CXX=$HOMEBREW_PREFIX/opt/llvm/bin/clang++ pip3 install -r python/requirements.txt
Run unittests and make sure they all pass:
./unittest
One can run VMAF in the command line by run_vmaf
, which allows the input videos to be the .yuv
format. To run VMAF on a single reference/distorted video pair, run:
PYTHONPATH=python ./python/vmaf/script/run_vmaf.py \
format width height \
reference_path \
distorted_path \
[--out-fmt output_format]
The arguments are the following:
format
can be one of:yuv420p
,yuv422p
,yuv444p
(8-Bit YUV)yuv420p10le
,yuv422p10le
,yuv444p10le
(10-Bit little-endian YUV)yuv420p12le
,yuv422p12le
,yuv444p12le
(12-Bit little-endian YUV)yuv420p16le
,yuv422p16le
,yuv444p16le
(16-Bit little-endian YUV)
width
andheight
are the width and height of the videos, in pixelsreference_path
anddistorted_path
are the paths to the reference and distorted video filesoutput_format
can be one of:text
xml
json
For example, the following command runs VMAF on a pair of .yuv
inputs (src01_hrc00_576x324.yuv
, src01_hrc01_576x324.yuv
):
PYTHONPATH=python ./python/vmaf/script/run_vmaf.py \
yuv420p 576 324 \
src01_hrc00_576x324.yuv \
src01_hrc01_576x324.yuv \
--out-fmt json
This will generate JSON output like:
{
...
"aggregate": {
"VMAF_feature_adm2_score": 0.93458780776205741,
"VMAF_feature_motion2_score": 3.8953518541666665,
"VMAF_feature_vif_scale0_score": 0.36342081156994926,
"VMAF_feature_vif_scale1_score": 0.76664738784617292,
"VMAF_feature_vif_scale2_score": 0.86285338927816291,
"VMAF_feature_vif_scale3_score": 0.91597186913930484,
"VMAF_score": 76.699271371151269,
"method": "mean"
}
}
where VMAF_score
is the final score and the others are the scores for VMAF's elementary metrics:
adm2
,vif_scalex
scores range from 0 (worst) to 1 (best)motion2
score typically ranges from 0 (static) to 20 (high-motion)
VMAF follows a machine-learning based approach to first extract a number of quality-relevant features (or elementary metrics) from a distorted video and its reference full-quality video, followed by fusing them into a final quality score using a non-linear regressor (e.g. an SVM regressor), hence the name “Video Multi-method Assessment Fusion”.
In addition to the basic commands, the VMAF package also provides a framework to allow any user to train his/her own perceptual quality assessment model. For example, directory model
contains a number of pre-trained models, which can be loaded by the aforementioned commands:
PYTHONPATH=python ./python/vmaf/script/run_vmaf.py \
format width height \
reference_path \
distorted_path \
[--model model_path]
For example:
PYTHONPATH=python ./python/vmaf/script/run_vmaf.py \
yuv420p 576 324 \
python/test/resource/yuv/src01_hrc00_576x324.yuv \
python/test/resource/yuv/src01_hrc01_576x324.yuv \
--model model/other_models/nflxtrain_vmafv3.pkl
A user can customize the model based on:
- The video dataset it is trained on
- The list of features used
- The regressor used (and its hyper-parameters)
Once a model is trained, the VMAF package also provides tools to cross validate it on a different dataset and visualization.
To begin with, create a dataset file following the format in example_dataset.py
. A dataset is a collection of distorted videos. Each has a unique asset ID and a corresponding reference video, identified by a unique content ID. Each distorted video is also associated with subjective quality score, typically a MOS (mean opinion score), obtained through subjective study. An example code snippet that defines a dataset is as follows:
dataset_name = 'example'
yuv_fmt = 'yuv420p'
width = 1920
height = 1080
ref_videos = [
{'content_id':0, 'path':'checkerboard.yuv'},
{'content_id':1, 'path':'flat.yuv'},
]
dis_videos = [
{'content_id':0, 'asset_id': 0, 'dmos':100, 'path':'checkerboard.yuv'},
{'content_id':0, 'asset_id': 1, 'dmos':50, 'path':'checkerboard_dis.yuv'},
{'content_id':1, 'asset_id': 2, 'dmos':100, 'path':'flat.yuv'},
{'content_id':1, 'asset_id': 3, 'dmos':80, 'path':'flat_dis.yuv'},
]
See the directory resource/dataset
for more examples. Also refer to the Datasets document regarding publicly available datasets.
Once a dataset is created, first validate the dataset using existing VMAF or other (PSNR, SSIM or MS-SSIM) metrics. Run:
PYTHONPATH=python ./python/vmaf/script/run_testing.py \
quality_type \
test_dataset_file \
[--vmaf-model optional_VMAF_model_path] \
[--cache-result] \
[--parallelize]
where quality_type
can be VMAF
, PSNR
, SSIM
, MS_SSIM
, etc.
Enabling --cache-result
allows storing/retrieving extracted features (or elementary quality metrics) in a data store (under workspace/result_store_dir/file_result_store
), since feature extraction is the most expensive operations here.
Enabling --parallelize
allows execution on multiple reference-distorted video pairs in parallel. Sometimes it is desirable to disable parallelization for debugging purpose (e.g. some error messages can only be displayed when parallel execution is disabled).
For example:
PYTHONPATH=python ./python/vmaf/script/run_testing.py \
VMAF \
resource/example/example_dataset.py \
--cache-result \
--parallelize
Make sure matplotlib
is installed to visualize the MOS-prediction scatter plot and inspect the statistics:
- PCC – Pearson correlation coefficient
- SRCC – Spearman rank order correlation coefficient
- RMSE – root mean squared error
When creating a dataset file, one may make errors (for example, having a typo in a file path) that could go unnoticed but make the execution of run_testing
fail. For debugging purposes, it is recommended to disable --parallelize
.
If the problem persists, one may need to run the script:
PYTHONPATH=python ./python/vmaf/script/run_cleaning_cache.py \
quality_type \
test_dataset_file
to clean up corrupted results in the store before retrying. For example:
PYTHONPATH=python ./python/vmaf/script/run_cleaning_cache.py \
VMAF \
resource/example/example_dataset.py
Now that we are confident that the dataset is created correctly and we have some benchmark result on existing metrics, we proceed to train a new quality assessment model. Run:
PYTHONPATH=python ./python/vmaf/script/run_vmaf_training.py \
train_dataset_filepath \
feature_param_file \
model_param_file \
output_model_file \
[--cache-result] \
[--parallelize]
For example:
PYTHONPATH=python ./python/vmaf/script/run_vmaf_training.py \
resource/example/example_dataset.py \
resource/feature_param/vmaf_feature_v2.py \
resource/model_param/libsvmnusvr_v2.py \
workspace/model/test_model.pkl \
--cache-result \
--parallelize
feature_param_file
defines the set of features used. For example, both dictionaries below:
feature_dict = {'VMAF_feature': 'all', }
and
feature_dict = {'VMAF_feature': ['vif', 'adm'], }
are valid specifications of selected features. Here VMAF_feature
is an "aggregate" feature type, and vif
, adm
are the "atomic" feature types within the aggregate type. In the first case, all
specifies that all atomic features of VMAF_feature
are selected. A feature_dict
dictionary can also contain more than one aggregate feature types.
model_param_file
defines the type and hyper-parameters of the regressor to be used. For details, refer to the self-explanatory examples in directory resource/model_param
. One example is:
model_type = "LIBSVMNUSVR"
model_param_dict = {
# ==== preprocess: normalize each feature ==== #
'norm_type':'clip_0to1', # rescale to within [0, 1]
# ==== postprocess: clip final quality score ==== #
'score_clip':[0.0, 100.0], # clip to within [0, 100]
# ==== libsvmnusvr parameters ==== #
'gamma':0.85, # selected
'C':1.0, # default
'nu':0.5, # default
'cache_size':200, # default
}
The trained model is output to output_model_file
. Once it is obtained, it can be used by the run_vmaf
, or by run_testing
to validate another dataset.
Above are two example scatter plots obtained from running the run_vmaf_training
and run_testing
commands on a training and a testing dataset, respectively.
The commands run_vmaf_training
and run_testing
also support custom subjective models (e.g. MLE_CO_AP2 (default), MOS, DMOS, SR_MOS (i.e. ITU-R BT.500), BR_SR_MOS (i.e. ITU-T P.913) and more), through the sureal package.
The subjective model option can be specified with option --subj-model subjective_model
, for example:
PYTHONPATH=python ./python/vmaf/script/run_vmaf_training.py \
resource/example/example_raw_dataset.py \
resource/feature_param/vmaf_feature_v2.py \
resource/model_param/libsvmnusvr_v2.py \
workspace/model/test_model.pkl \
--subj-model MLE_CO_AP2 \
--cache-result \
--parallelize
PYTHONPATH=python ./python/vmaf/script/run_testing.py \
VMAF \
resource/example/example_raw_dataset.py \
--subj-model MLE_CO_AP2 \
--cache-result \
--parallelize
Note that for the --subj-model
option to have effect, the input dataset file must follow a format similar to example_raw_dataset.py. Specifically, for each dictionary element in dis_videos
, instead of having a key named dmos
or groundtruth
as in example_dataset.py, it must have a key named os
(stands for opinion score), and the value must be a list of numbers. This is the "raw opinion score" collected from subjective experiments, which is used as the input to the custom subjective models.
run_vmaf_cross_validation.py
provides tools for cross-validation of hyper-parameters and models. run_vmaf_cv
runs training on a training dataset using hyper-parameters specified in a parameter file, output a trained model file, and then test the trained model on another test dataset and report testing correlation scores. run_vmaf_kfold_cv
takes in a dataset file, a parameter file, and a data structure (list of lists) that specifies the folds based on video content's IDs, and run k-fold cross valiation on the video dataset. This can be useful for manually tuning the model parameters.
You can also customize VMAF by plugging in third-party features or inventing new features, and specify them in a feature_param_file
. Essentially, the "aggregate" feature type (for example: VMAF_feature
) specified in the feature_dict
corresponds to the TYPE
field of a FeatureExtractor
subclass (for example: VmafFeatureExtractor
). All you need to do is to create a new class extending the FeatureExtractor
base class.
Similarly, you can plug in a third-party regressor or invent a new regressor and specify them in a model_param_file
. The model_type
(for example: LIBSVMNUSVR
) corresponds to the TYPE
field of a TrainTestModel
sublass (for example: LibsvmnusvrTrainTestModel
). All needed is to create a new class extending the TrainTestModel
base class.
For instructions on how to extending the FeatureExtractor
and TrainTestModel
base classes, refer to CONTRIBUTING.md
.
Overtime, a number of helper tools have been incorporated into the package, to facilitate training and validating VMAF models. An overview of the tools available can be found in this slide deck.
A Bjøntegaard-Delta (BD) rate implementation is added. Example usage can be found here. The implementation is validated against MPEG JCTVC-L1100.
An implementation of LIME is also added as part of the repository. For more information, refer to our analysis tools presentation. The main idea is to perform a local linear approximation to any regressor or classifier and then use the coefficients of the linearized model as indicators of feature importance. LIME can be used as part of the VMAF regression framework, for example:
PYTHONPATH=python ./python/vmaf/script/run_vmaf.py \
yuv420p 576 324 \
src01_hrc00_576x324.yuv \
src01_hrc00_576x324.yuv \
--local-explain
Naturally, LIME can also be applied to any other regression scheme as long as there exists a pre-trained model. For example, applying to BRISQUE:
PYTHONPATH=python ./python/vmaf/script/run_vmaf.py yuv420p 576 324 \
src01_hrc00_576x324.yuv \
src01_hrc00_576x324.yuv \
--local-explain \
--model model/other_models/nflxall_vmafv1.pkl
A tool to convert a model file (currently support libsvm model) from pickle to json is added at python/vmaf/script/convert_model_from_pkl_to_json.py
. Usage:
usage: convert_model_from_pkl_to_json.py [-h] --input-pkl-filepath
INPUT_PKL_FILEPATH
--output-json-filepath
OUTPUT_JSON_FILEPATH
optional arguments:
-h, --help show this help message and exit
--input-pkl-filepath INPUT_PKL_FILEPATH
path to the input pkl file, example:
model/vmaf_float_v0.6.1.pkl or
model/vmaf_float_b_v0.6.3/vmaf_float_b_v0.6.3.pkl
--output-json-filepath OUTPUT_JSON_FILEPATH
path to the output json file, example:
model/vmaf_float_v0.6.1.json or model/vmaf_float_b_v0.6.3.json
Examples:
python/vmaf/script/convert_model_from_pkl_to_json.py \
--input-pkl-filepath model/vmaf_float_b_v0.6.3/vmaf_float_b_v0.6.3.pkl \
--output-json-filepath ./vmaf_float_b_v0.6.3.json
python/vmaf/script/convert_model_from_pkl_to_json.py \
--input-pkl-filepath model/vmaf_float_v0.6.1.pkl \
--output-json-filepath ./vmaf_float_v0.6.1.json
The core classes of the VMAF Python library can be depicted in the diagram below:
An Asset is the most basic unit with enough information to perform a task on a media. It includes basic information about a distorted video and its undistorted reference counterpart, as well as the auxiliary preprocessing information that can be understood by the Executor
and its subclasses. For example:
- The frame range on which to perform a task (i.e.
dis_start_end_frame
andref_start_end_frame
) - At what resolution to perform a task (e.g. a video frame is upscaled with a
resampling_type
method to the resolution specified byquality_width_hight
before feature extraction)
Asset extends the WorkdirEnabled
mixin, which comes with a thread-safe working directory to facilitate parallel execution.
An Executor
takes a list of Assets
as input, run computations on them, and return a list of corresponding Results
. An Executor
extends the TypeVersionEnabled
mixin, and must specify a unique type and version combination (by the TYPE
and VERSION
attribute), so that the Result
generated by it can be uniquely identified. This facilitates a number of shared housekeeping functions, including storing and reusing Results
(result_store
), creating FIFO pipes (fifo_mode
), etc. Executor
understands the preprocessing steps specified in its input Assets
. It relies on FFmpeg to do the processing for it (FFmpeg must be pre-installed and its path specified in the FFMPEG_PATH
field in the python/vmaf/externals.py
file).
An Executor
and its subclasses can take optional parameters during initialization. There are two fields to put the optional parameters:
optional_dict
: a dictionary field to specify parameters that will impact numerical result (e.g. which wavelet transform to use).optional_dict2
: a dictionary field to specify parameters that will NOT impact numerical result (e.g. outputting optional results).
Executor
is the base class for FeatureExtractor
and QualityRunner
(and the sub-subclass VmafQualityRunner
).
A Result
is a key-value store of read-only execution results generated by an Executor
on an Asset
. A key corresponds to an "atom" feature type or a type of a quality score, and a value is a list of score values, each corresponding to a computation unit (i.e. in the current implementation, a frame).
The Result
class also provides a number of tools for aggregating the per-unit scores into an average score. The default aggregatijon method is the mean, but Result.set_score_aggregate_method()
allows customizing other methods (see test_to_score_str()
in test/result_test.py
for examples).
ResultStore
provides capability to save and load a Result
. Current implementation FileSystemResultStore
persists results by a simple file system that save/load result in a directory. The directory has multiple subdirectories, each corresponding to an Executor
. Each subdirectory contains multiple files, each file storing the dataframe for an Asset
.
FeatureExtractor
subclasses Executor
, and is specifically for extracting features (aka elementary quality metrics) from Assets
. Any concrete feature extraction implementation should extend the FeatureExtractor
base class (e.g. VmafFeatureExtractor
). The TYPE
field corresponds to the "aggregate" feature name, and the ATOM_FEATURES
/DERIVED_ATOM_FEATURES
field corresponds to the "atom" feature names.
FeatureAssembler
assembles features for an input list of Assets
on a input list of FeatureExtractor
subclasses. The constructor argument feature_dict
specifies the list of FeatureExtractor
subclasses (i.e. the "aggregate" feature) and selected "atom" features. For each asset on a FeatureExtractor
, it outputs a BasicResult
object. FeatureAssembler
is used by a QualityRunner
to assemble the vector of features to be used by a TrainTestModel
.
TrainTestModel
is the base class for any concrete implementation of regressor, which must provide a train()
method to perform training on a set of data and their groud-truth labels, and a predict()
method to predict the labels on a set of data, and a to_file()
and a from_file()
method to save and load trained models.
A TrainTestModel
constructor must supply a dictionary of parameters (i.e. param_dict
) that contains the regressor's hyper-parameters. The base class also provides shared functionalities such as input data normalization/output data denormalization, evaluating prediction performance, etc.
Like an Executor
, a TrainTestModel
extends TypeVersionEnabled
and must specify a unique type and version combination (by the TYPE
and VERSION
attribute).
CrossValidation
provides a collection of static methods to facilitate validation of a TrainTestModel
object. As such, it also provides means to search the optimal hyper-parameter set for a TrainTestModel
object.
QualityRunner
subclasses Executor
, and is specifically for evaluating the quality score for Assets
. Any concrete implementation to generate the final quality score should extend the QualityRunner
base class (e.g. VmafQualityRunner
, PsnrQualityRunner
).
There are two ways to extend a QualityRunner
base class -- either by directly implementing the quality calculation (e.g. by calling a C executable, as in PsnrQualityRunner
), or by calling a FeatureAssembler
(with indirectly calls a FeatureExtractor
) and a TrainTestModel
subclass (as in VmafQualityRunner
).