Chunxu Liu*,
Guozhen Zhang*,
Rui Zhao,
Limin Wang,
Nanjing University, SenseTime Research
TL;DR: We introduce a Sparse Global Matching pipeline for the video frame interpolation task:
0. Estimate the intermediate initial flows with local information.
1. Identify flaws in the initial flows.
2. Estimate flow compensation by sparse global matching.
3. Merge the flow compensation with the initial flows.
4. Compute the intermediate frame using the flows from step 3, and keep refining.
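As a rough illustration, the identify-and-merge steps (1 and 3) can be sketched with NumPy as below. `merge_sparse_compensation`, its arguments, and the additive merge are illustrative assumptions for exposition, not the repository's actual implementation:

```python
import numpy as np

def merge_sparse_compensation(init_flow, error_map, compensation, ratio=0.5):
    """Replace the most unreliable flow vectors with globally matched ones.

    init_flow:    (H, W, 2) intermediate flow from the local branch
    error_map:    (H, W)    estimated flaw/uncertainty per pixel
    compensation: (H, W, 2) flow correction from sparse global matching
    ratio:        fraction of pixels treated as flawed (e.g. 1/2)
    """
    h, w = error_map.shape
    k = int(h * w * ratio)
    # Step 1: pick the k most unreliable pixels from the flaw map.
    flawed = np.argpartition(error_map.reshape(-1), -k)[-k:]
    ys, xs = np.unravel_index(flawed, (h, w))
    # Step 3: merge the sparse compensation into the initial flow
    # only at the flawed positions; the rest keeps the local estimate.
    merged = init_flow.copy()
    merged[ys, xs] += compensation[ys, xs]
    return merged

# Toy example with random flows (for shape-checking only).
rng = np.random.default_rng(0)
flow = rng.standard_normal((8, 8, 2))
err = rng.random((8, 8))
comp = rng.standard_normal((8, 8, 2))
out = merge_sparse_compensation(flow, err, comp, ratio=0.5)
```

With `ratio=0.5`, half of the flow vectors are corrected sparsely, matching the "1/2" key-point setting used by the `ours-small-1/2` and `ours-1-2-points` models below.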
To evaluate the effectiveness of our method in handling large motion, we carefully curated a more challenging subset from commonly used benchmarks. Experiments show that our method brings improvements on these challenging large-motion benchmarks.
We need X4K1000FPS for fine-tuning our sparse global matching branch, and Vimeo90K for training our local branch. After downloading and processing the datasets, you can place them in the following folder structure:
.
├── ...
└── datasets
    ├── X4K1000FPS
    │   ├── train
    │   ├── val
    │   └── test
    └── vimeo_triplet (needed if training the local branch)
        ├── ...
        ├── tri_trainlist.txt
        └── sequences
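A quick way to sanity-check the layout above is a small script like the following. The `DATA_ROOT` constant and `missing_paths` helper are hypothetical conveniences for this README's folder convention, not part of the repository:

```python
import os

# Expected layout relative to the project root; adjust DATA_ROOT if your
# datasets live elsewhere.
DATA_ROOT = "datasets"
EXPECTED = [
    "X4K1000FPS/train",
    "X4K1000FPS/val",
    "X4K1000FPS/test",
    "vimeo_triplet/tri_trainlist.txt",  # only needed for local-branch training
    "vimeo_triplet/sequences",
]

def missing_paths(root=DATA_ROOT):
    """Return the expected sub-paths that are not present under `root`."""
    return [p for p in EXPECTED if not os.path.exists(os.path.join(root, p))]

if __name__ == "__main__":
    for p in missing_paths():
        print(f"missing: {os.path.join(DATA_ROOT, p)}")
```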
conda create -n sgm-vfi python=3.8
conda activate sgm-vfi
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 --extra-index-url https://download.pytorch.org/whl/cu117
pip install -r requirements.txt
We provide the pretrained local branch model for a quicker launch of sparse global matching. You can download the pretrained model here and place it at [project_folder]/log/ours-local/ckpt/ours-local.pth.
Furthermore, for the global feature extractor GMFlow, you can download the pretrained model here, then unzip it and place gmflow_sintel-0c07dcb3.pth at [project_folder]/pretrained/gmflow_sintel-0c07dcb3.pth.
Finally, for fine-tuning the sparse global matching branch, the file folder should look like this:

.
├── train_x4k.py
├── Trainer_x4k.py
├── dataset_x4k.py
├── config.py
├── pretrained
│   └── gmflow_sintel-0c07dcb3.pth
├── log
│   └── ours-local
│       └── ckpt
│           └── ours-local.pth
└── model
    ├── __init__.py
    ├── flow_estimation_global.py
    ├── matching.py
    ├── gmflow.py
    └── ...

After the preparation, you can modify and check the settings in config.py; the default setting is for ours-small-1/2 fine-tuning. Finally, you can start the fine-tuning with the following command:

torchrun --nproc_per_node=4 train_x4k.py --batch_size 8 --need_patch --train_data_path path/to/X4K/train --val_data_path path/to/X4K/val
We also provide scripts for training the local branch. After preparing the Vimeo90K dataset and checking the settings in config_base.py (the default setting is for ours-local-branch model training), you can start the training process with the following command:
torchrun --nproc_per_node=4 train_base.py --batch_size 8 --data_path ./vimeo_triplet
.
├── train_base.py
├── Trainer_base.py
├── dataset.py
├── config_base.py
└── model
    ├── __init__.py
    ├── flow_estimation_local.py
    └── ...
In our paper, we analyzed the mean motion magnitude and motion sufficiency (the minimum of the top 5% of per-pixel flow magnitudes) in the most frequently used large-motion benchmarks.
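Both statistics can be computed from a dense flow field (e.g. a RAFT output) as sketched below. The `motion_stats` helper is a hypothetical name for illustration; note that the minimum of the top 5% of magnitudes is simply the 95th percentile of the magnitude distribution:

```python
import numpy as np

def motion_stats(flow):
    """Mean motion magnitude and motion sufficiency of a dense flow field.

    flow: (H, W, 2) optical flow, e.g. estimated by RAFT.
    Motion sufficiency is the minimum of the top 5% per-pixel flow
    magnitudes, i.e. the 95th percentile of the magnitude distribution.
    """
    mag = np.linalg.norm(flow, axis=-1)       # per-pixel motion magnitude
    mean_mag = mag.mean()
    sufficiency = np.quantile(mag, 0.95)      # min of the top 5%
    return mean_mag, sufficiency

# Toy flow: mostly static, with a fast-moving region covering ~6% of pixels.
flow = np.zeros((64, 64, 2))
flow[:16, :16] = 30.0                         # moves 30*sqrt(2) px
mean_mag, suff = motion_stats(flow)
```

Sorting a benchmark's frame pairs by these statistics and keeping the top half is how the "most challenging half" subsets below can be reproduced in spirit.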
As a result, we curated the most challenging half of Xiph and of SNU-FILM hard and extreme, with the help of the raft-sintel.pth checkpoint provided by RAFT.
The resulting benchmark is available here.
You can put top-half-motion-sufficiency_test-hard.txt and top-half-motion-sufficiency_test-extreme.txt in the SNU-FILM dataset folder, and top-half-motion-sufficiency-gap2.txt in the Xiph dataset folder.
We provide the checkpoints here for evaluation. Please download and place them in the following folder structure:
.
├── ...
└── log
    └── ours-1-2-points
        └── ckpt
            └── ours-1-2-points.pth
We provide the evaluation scripts for ours-1-2-points as follows:
python benchmark/XTest_interval.py --path path/to/XTest/test --exp_name ours-1-2-points --num_key_points 0.5
python benchmark/SNU_FILM.py --path ./data/SNU-FILM --exp_name ours-1-2-points --num_key_points 0.5
(Suggestion: You can use ln -s path/to/SNUFILM (project folder)/data/SNU-FILM to avoid extra processing of the input path name.)
python benchmark/Xiph.py --path ./xiph --exp_name ours-1-2-points --num_key_points 0.5
(Suggestion: You can use ln -s path/to/Xiph (project folder)/xiph to avoid extra processing of the input path name.)
You can try out our simple 2x inference demo with the following command:
python demo_2x.py
(You need to prepare the model checkpoint at log/ours-1-2-points/ckpt/ours-1-2-points.pth and the GMFlow pretrained model at pretrained/gmflow_sintel-0c07dcb3.pth.)
If you find this project helpful for your research or applications, please feel free to leave a star ⭐️ and cite our paper:
@InProceedings{Liu_2024_CVPR,
    author    = {Liu, Chunxu and Zhang, Guozhen and Zhao, Rui and Wang, Limin},
    title     = {Sparse Global Matching for Video Frame Interpolation with Large Motion},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {19125-19134}
}
This project is released under the Apache 2.0 license. The code is based on GMFlow, RAFT, EMA-VFI, RIFE, and IFRNet; please also follow their licenses. Thanks for their awesome works!