FAZE: Few-Shot Adaptive Gaze Estimation

This repository contains the code for training and evaluating the models from our ICCV 2019 paper, which was presented as an oral. FAZE is a framework for few-shot adaptation of gaze estimation networks. It consists of equivariance learning (via the DT-ED, or Disentangling Transforming Encoder-Decoder, architecture) and meta-learning with gaze-direction embeddings as input.

Links

NVIDIA Project Page
ETH Zurich Project Page
arXiv Page
CVF Open Access PDF
ICCV 2019 Presentation
Pre-processing Code GitHub Repository (also included as a submodule in this repository)

Training and Evaluation

1. Datasets

Pre-process the GazeCapture and MPIIGaze datasets using the code-base at https://github.com/swook/faze_preprocess, which is also available as a git submodule at the relative path preprocess/.

If you have already cloned this few_shot_gaze repository without pulling the submodules, please run:

git submodule update --init --recursive

After the dataset preprocessing procedures have been performed, we can move on to the next steps.

2. Prerequisites

This codebase should run on most standard Linux systems. We specifically used Ubuntu.

Please install the following prerequisites manually (as well as their dependencies), by following the instructions found below:

PyTorch 1.3 - https://pytorch.org/get-started/locally/
NVIDIA Apex - https://github.com/NVIDIA/apex#quick-start
- please note that only NVIDIA Volta and newer architectures can benefit from AMP training via NVIDIA Apex.
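
As a rough way to tell whether a machine falls under the AMP note above, the sketch below compares a GPU's CUDA compute capability against 7.0, the value that corresponds to Volta. The helper names amp_benefits and report_local_gpus are illustrative and not part of this repository.

```python
def amp_benefits(compute_capability):
    """Return True if a (major, minor) compute capability is Volta (7.0) or newer."""
    major, _minor = compute_capability
    return major >= 7


def report_local_gpus():
    """Print an AMP hint for every visible GPU (hypothetical helper)."""
    try:
        import torch  # imported lazily so amp_benefits works without PyTorch
    except ImportError:
        print("PyTorch is not installed")
        return
    if not torch.cuda.is_available():
        print("No CUDA device is visible")
        return
    for index in range(torch.cuda.device_count()):
        capability = torch.cuda.get_device_capability(index)
        name = torch.cuda.get_device_name(index)
        print(name, capability, "AMP benefit:", amp_benefits(capability))
```

Calling report_local_gpus() after installing PyTorch prints one line per GPU. On pre-Volta cards the code should still run, but mixed-precision training is unlikely to be faster.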

The remaining Python package dependencies can be installed by running:

pip3 install --user --upgrade -r requirements.txt

3. Pre-trained weights for the DT-ED architecture and MAML models

You can obtain a copy of the pre-trained weights for the Disentangling Transforming Encoder-Decoder and for the various MAML models from the following location.

cd src/
wget -N https://ait.ethz.ch/projects/2019/faze/downloads/outputs_of_full_train_test_and_plot.zip
unzip -o outputs_of_full_train_test_and_plot.zip

4. Training, Meta-Learning, and Final Evaluation

Run the all-in-one example bash script with:

cd src/
bash full_train_test_and_plot.bash

The bash script should be self-explanatory and can be edited to replicate the final FAZE model evaluation procedure, provided that the hardware requirements are satisfied (8 Tesla V100 GPUs, each with 32GB of memory).

The pre-trained DT-ED weights should be loaded automatically by the script 1_train_dt_ed.py. Please note that this model can take a long time to train from scratch, so we recommend adjusting batch sizes and using multiple GPUs (the code is multi-GPU-ready).

The meta-learning step is also very time-consuming, particularly because it must be run once for every value of k, the number of calibration samples. The code for this step is in 2_meta_learning.py, and we recommend running it in parallel as shown in full_train_test_and_plot.bash.
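
One way to picture the parallel execution is a round-robin assignment of k values to GPUs, with each job pinned to its GPU through the CUDA_VISIBLE_DEVICES environment variable. The sketch below is only an illustration of that idea: assign_gpus, launch, and make_command are hypothetical names, and the actual command line for 2_meta_learning.py should be taken from full_train_test_and_plot.bash rather than from here.

```python
import os
import subprocess


def assign_gpus(k_values, num_gpus):
    """Pair each k with a GPU index in round-robin order."""
    return [(k, i % num_gpus) for i, k in enumerate(k_values)]


def launch(k_values, num_gpus, make_command):
    """Start one process per k and wait for all of them.

    make_command is a caller-supplied function (an assumption of this
    sketch) that returns the argument list to run for a given k.
    """
    processes = []
    for k, gpu in assign_gpus(k_values, num_gpus):
        # Restrict the child process to a single GPU
        env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu))
        processes.append(subprocess.Popen(make_command(k), env=env))
    # Collect the exit code of every job
    return [process.wait() for process in processes]
```

Note that this starts all jobs at once, so with more k values than GPUs several jobs share a device. Batching the launches is a straightforward extension if memory is tight.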

5. Outputs

When the full pipeline successfully runs, you will find some outputs in the path src/outputs_of_full_train_test_and_plot, in particular:

walks/: mp4 videos of latent space walks in gaze direction and head orientation
Zg_OLR1e-03_IN5_ILR1e-05_Net64/: outputs of the meta-learning step.
Zg_OLR1e-03_IN5_ILR1e-05_Net64 MAML MPIIGaze.pdf: plotted results of the few-shot learning evaluations on MPIIGaze.
Zg_OLR1e-03_IN5_ILR1e-05_Net64 MAML GazeCapture (test).pdf: plotted results of the few-shot learning evaluations on the GazeCapture test set.
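
To confirm that a run produced everything listed above, a small check along the following lines may help. It is a minimal sketch that assumes the output names match the list exactly; missing_outputs is a hypothetical helper and not part of this repository.

```python
from pathlib import Path

RUN_NAME = "Zg_OLR1e-03_IN5_ILR1e-05_Net64"

# Names taken from the list of outputs above
EXPECTED = [
    "walks",
    RUN_NAME,
    RUN_NAME + " MAML MPIIGaze.pdf",
    RUN_NAME + " MAML GazeCapture (test).pdf",
]


def missing_outputs(output_dir):
    """Return the expected output names that do not exist under output_dir."""
    root = Path(output_dir)
    return [name for name in EXPECTED if not (root / name).exists()]
```

For example, missing_outputs("src/outputs_of_full_train_test_and_plot") returns an empty list when all four outputs are present.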

Realtime Demo

We also provide a realtime demo, located in the demo/ folder, that runs with live input from a webcam. Please check the separate demo instructions for details on how to set up and run it.

Bibtex

Please cite our paper when referencing or using our code.

@inproceedings{Park2019ICCV,
  author    = {Seonwook Park and Shalini De Mello and Pavlo Molchanov and Umar Iqbal and Otmar Hilliges and Jan Kautz},
  title     = {Few-Shot Adaptive Gaze Estimation},
  year      = {2019},
  booktitle = {International Conference on Computer Vision (ICCV)},
  location  = {Seoul, Korea}
}

Acknowledgements

Seonwook Park carried out this work during his internship at NVIDIA. This work was supported in part by the ERC Grant OPTINT (StG-2016-717054).
