Frame interpolation consists of adding an intermediate frame between every pair of consecutive frames in a video to raise its overall FPS. This project provides TensorFlow implementations of two papers: Video Frame Interpolation via Adaptive Convolution and Video Frame Interpolation via Adaptive Separable Convolution. We have implemented both papers separately and included results after training the respective models. We have also included pre-trained models in this repository so you can double the FPS of your own videos; to get started, follow the steps below.
Interpolated Video:
As this project is built in Python, we recommend using Anaconda to install all the dependencies easily.
For non-GPU users:
Note: If you don't have a GPU better than a GTX 1050 Ti, you should probably use Google Colaboratory. There you won't need to install any of the dependencies either.
pip install tensorflow
For GPU users:
pip install tensorflow-gpu
Note: Install the required version of cudatoolkit so that tensorflow-gpu can work.
pip install opencv-contrib-python
pip install scikit-learn
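Once the dependencies are installed, you can quickly check that TensorFlow sees your GPU with a minimal sanity check like the one below (assuming TensorFlow 2.x; on 1.x use tf.test.is_gpu_available() instead):

```python
import tensorflow as tf
import cv2

print("TensorFlow:", tf.__version__)
print("OpenCV:", cv2.__version__)

# Lists the GPUs visible to TensorFlow 2.x; an empty list means CPU-only execution.
print("GPUs:", tf.config.list_physical_devices("GPU"))
```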
If you don't want to create your own dataset and would rather use ours, follow this Google Drive link and use all the files with the .tfrecord extension in it. Our dataset currently contains nearly 70,000 × 3 (three frames per sample) image patches.
To create your own dataset, open create_dataset_config.py and specify the path to the folder containing the videos to be used as the dataset and the folder where you want to save your tfrecord files; a sketch of what those values might look like is shown below.
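This is only an illustration: the variable names below are placeholders, so keep whatever names create_dataset_config.py actually defines and just fill in your own paths.

```python
# create_dataset_config.py -- illustrative values only; the variable names
# here are placeholders, match them to the ones already in the file.
VIDEOS_DIR = "/path/to/your/videos"        # folder with the source videos
TFRECORD_SAVE_DIR = "/path/to/tfrecords"   # folder where the .tfrecord files will be written
```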
Now run:
python create_dataset_from_video.py
This will take time depending on the total length of all of your videos, so sit back and relax.
If you don't want to train your own model and would rather use our pre-trained models, you can skip to step 4.
First, choose the model you want to use.
Go to the AdapConv folder, open adap_conv_model_utils.py, and specify paths for the following (a filled-in sketch follows this list):
TFRECORD_DATASET_DIR - Path to the folder containing your tfrecord files
CHECKPOINT_PATH - Path to the folder where you want to save your model weights.
CHECKPOINT_NAME - Name of the file containing your model weights.
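For example, the relevant lines in adap_conv_model_utils.py might be filled in like this (all paths and the checkpoint file name below are placeholders):

```python
# adap_conv_model_utils.py -- fill in your own paths
TFRECORD_DATASET_DIR = "/path/to/tfrecords"   # folder containing the .tfrecord files
CHECKPOINT_PATH = "/path/to/checkpoints"      # folder where the model weights will be saved
CHECKPOINT_NAME = "adap_conv_model.ckpt"      # placeholder name for the weights file
```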
We trained our model for 25 epochs and got the following graphs.
This model can take up to 2 minutes per epoch to train on Google Colab.
You can also train your own Adaptive Separable Convolution model by running train_adaSepConv.py in the same directory. To train the model with your own hyperparameters, edit the config.py file in the same directory and change hyperparameters such as BATCH_SIZE (a sketch is shown below). To include more data, just download the .tfrecord files from the drive link, paste them into the dataset folder of the repository, and run train_adaSepConv.py.
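As a rough sketch, config.py could look like the following. Only BATCH_SIZE is named in this README; the other entries are assumptions about what such a config might hold, and the exact names and values in the repository's config.py may differ.

```python
# adaSepConv/config.py -- illustrative values only
BATCH_SIZE = 16          # number of frame triplets per training batch
EPOCHS = 20              # we trained for 20 epochs (see the graphs below)
LEARNING_RATE = 1e-4     # optimizer learning rate (assumed value)
DATASET_DIR = "dataset"  # folder in the repository holding the .tfrecord files
```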
We trained our model for 20 epochs and got the following graphs.
Training on the full dataset takes about 40 minutes per epoch on Google Colab.
First, open adap_conv_model_config.py and specify paths and names for the following (a filled-in sketch follows this list):
VIDEO_PATH - Path of the folder containing the video to be interpolated
VIDEO_NAME - Name of the video
INTERPOLATED_VIDEO_PATH - Path to the folder where interpolated video is to be saved
CHECKPOINT_PATH - Path to the folder containing your model weights.
CHECKPOINT_NAME - Name of file containing your model weights.
FRAME1_PATH and FRAME2_PATH - Paths of folders containing frames to be interpolated.
FRAME1_NAME and FRAME2_NAME - Names of the frames
INTERPOLATED_FRAME_PATH - Path to the folder where interpolated frame is to be saved
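A filled-in adap_conv_model_config.py could look roughly like this. All paths are placeholders; the file names are only examples borrowed from elsewhere in this README (juggle.mp4, frame1.jpg, frame3.jpg), so substitute your own.

```python
# adap_conv_model_config.py -- placeholder values, replace with your own paths
VIDEO_PATH = "/path/to/videos"                   # folder containing the input video
VIDEO_NAME = "juggle.mp4"                        # name of the video to interpolate
INTERPOLATED_VIDEO_PATH = "/path/to/output"      # folder for the interpolated video

CHECKPOINT_PATH = "AdapConv/pre-trained-models"  # folder containing the model weights
CHECKPOINT_NAME = "adap_conv_model.ckpt"         # placeholder name of the weights file

FRAME1_PATH = "/path/to/frames"                  # folder containing the first frame
FRAME2_PATH = "/path/to/frames"                  # folder containing the second frame
FRAME1_NAME = "frame1.jpg"                       # name of the first frame
FRAME2_NAME = "frame3.jpg"                       # name of the second frame
INTERPOLATED_FRAME_PATH = "/path/to/output"      # folder for the interpolated frame
```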
If you want to use our weights for this model, download them from here and paste them into AdapConv/pre-trained-models.
Go to the AdapConv directory:
cd AdapConv
To interpolate a frame, run:
python predict_image_adap_conv_model.py
To interpolate a video, run:
python predict_video_adap_conv_model.py
With this model, it takes around 5 minutes to predict one 1080p frame on Google Colab. However, this time drops significantly for lower-resolution frames. If you want faster predictions, use the Adaptive Separable Convolution model instead.
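Conceptually, doubling the FPS of a video boils down to reading consecutive frame pairs, predicting the in-between frame with the model, and writing everything back out at twice the original frame rate. Below is a minimal OpenCV sketch of that loop; interpolate_pair() is a hypothetical stand-in for the prediction step that predict_video_adap_conv_model.py performs, not the script's actual code.

```python
import cv2

def double_fps(in_path, out_path, interpolate_pair):
    """Insert one predicted frame between every consecutive pair and write at 2x FPS."""
    cap = cv2.VideoCapture(in_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    writer = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"), 2 * fps, (w, h))

    ok, prev = cap.read()
    while ok:
        ok, curr = cap.read()
        writer.write(prev)
        if ok:
            # interpolate_pair is a placeholder for the model's prediction step.
            writer.write(interpolate_pair(prev, curr))
            prev = curr
    cap.release()
    writer.release()
```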
After you have verified the above installations, open a terminal and navigate to the adaSepConv folder in the repository using
cd adaSepConv
Run test_video.py using
python test_video.py
It will prompt you for the path of the video file whose FPS you want to double. You can provide a video file of any resolution; for example, we have included juggle.mp4
in the repository and you can provide its path. (Note: don't provide video files longer than 20 seconds, as a 20-second 1080p video takes about 15 minutes to process.)
enter path to video file : /path/to/video/file.mp4
Alternatively, you can see results by interpolating a frame between two given frames, which takes far less time. To do this, run test_frame.py
in the terminal; it will prompt you for the paths of the two frames between which you want the interpolated frame. For a quick example, just provide the paths to frame1.jpg
and frame3.jpg,
which are already in the repository, and a file interpolated_frame.jpg
will be generated.
enter path to first image file : /path/to/first/image.jpg
enter path to second image file : /path/to/second/image.jpg
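If you also have the ground-truth middle frame available (for example the actual frame between frame1.jpg and frame3.jpg in the source video, which is not necessarily included in the repository), you can get a rough quality number by comparing it against the generated interpolated_frame.jpg. The file name frame2.jpg below is hypothetical.

```python
import cv2
import numpy as np

pred = cv2.imread("interpolated_frame.jpg").astype(np.float64)
truth = cv2.imread("frame2.jpg").astype(np.float64)  # hypothetical ground-truth middle frame

# Peak signal-to-noise ratio in dB; higher is better.
mse = np.mean((pred - truth) ** 2)
psnr = float("inf") if mse == 0 else 10 * np.log10(255.0 ** 2 / mse)
print("PSNR (dB):", psnr)
```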
Note: The Adaptive Separable Convolution model makes predictions much faster than the Adaptive Convolution model: it takes about 5 minutes to predict one 1080p frame with the Adaptive Convolution model, whereas the Adaptive Separable Convolution model predicts a 1080p frame in only about 3 seconds. This is because the Adaptive Convolution model requires height × width
passes to produce one frame, whereas the Adaptive Separable Convolution model requires only ONE pass.
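The speed difference comes from how the per-pixel kernels are formed and applied: the Adaptive Convolution model estimates a dense 2D kernel for every output pixel, while the separable model approximates that kernel as the outer product of a vertical and a horizontal 1D kernel, which needs only 2k coefficients instead of k² and lets the whole frame be produced in a single pass. The tiny NumPy sketch below illustrates the equivalence at one pixel; the kernel size is arbitrary and the code is not taken from the repository.

```python
import numpy as np

k = 5
patch = np.random.rand(k, k)  # input patch P centred at the output pixel
kv = np.random.rand(k)        # per-pixel vertical 1D kernel
kh = np.random.rand(k)        # per-pixel horizontal 1D kernel

# Full 2D adaptive convolution: dense k x k kernel, k*k coefficients per pixel.
K = np.outer(kv, kh)
full = np.sum(patch * K)

# Separable version: the same value from two 1D kernels, only 2*k coefficients.
separable = kh @ (patch.T @ kv)  # apply the vertical kernel, then the horizontal one
print(np.isclose(full, separable))  # True
```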
The following images illustrate the architectures of the two models.
This repository was made possible by the following papers:
- Video Frame Interpolation via Adaptive Convolution
- Video Frame Interpolation via Adaptive Separable Convolution
There are several areas in which the code can be further optimized, such as using less GPU memory during prediction and a faster implementation of adaptive convolution, which might be achieved with the shift-and-stitch technique. You can also add the tfrecords you create to make predictions even better!