The DigiFace-1M dataset is a collection of over one million diverse synthetic face images for face recognition.
It was introduced in our paper "DigiFace-1M: 1 Million Digital Face Images for Face Recognition" and can be used to train deep learning models for face recognition.
The dataset contains:
- 720K images with 10K identities (72 images per identity). For each identity, 4 different sets of accessories are sampled and 18 images are rendered for each set.
- 500K images with 100K identities (5 images per identity). For each identity, only one set of accessories is sampled.
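As a quick sanity check, the counts listed above can be combined to confirm the "over one million" figure. The snippet below is only an arithmetic sketch; the variable names are illustrative and not part of any dataset tooling.

```python
# Figures taken from the two dataset parts described above.
SETS_PER_IDENTITY = 4    # accessory sets sampled per identity (first part)
IMAGES_PER_SET = 18      # images rendered per accessory set (first part)

images_per_identity_a = SETS_PER_IDENTITY * IMAGES_PER_SET  # 72 images per identity
images_a = 10_000 * images_per_identity_a                   # first part: 10K identities
images_b = 100_000 * 5                                      # second part: 100K identities, 5 images each

total_images = images_a + images_b
total_identities = 10_000 + 100_000

print(f"images per identity (first part): {images_per_identity_a}")
print(f"total images: {total_images:,}")
print(f"total identities: {total_identities:,}")
```

The two parts together give 1.22M images across 110K identities.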
The DigiFace-1M dataset may be used for non-commercial research and is licensed under the terms found in LICENSE.
For convenience, the dataset is split into 8 parts, which can be downloaded here:
72 images per identity
5 images per identity
The DigiFace-1M dataset contains cropped color images in the following layout.
subj_id_n
├── 0.png # First rendered image of subject subj_id_n
├── 1.png # Second rendered image of subject subj_id_n
...
└── k.png # (k+1)-th rendered image of subject subj_id_n
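The layout above can be walked with a few lines of Python. This is a minimal sketch, assuming the downloaded parts have been extracted so that all `subj_id_n` folders sit directly under one root directory; the function name `index_identities` and the `root` argument are hypothetical and not part of any official DigiFace-1M tooling.

```python
from pathlib import Path


def index_identities(root):
    """Map each identity folder name to its PNG files, ordered by image index."""
    index = {}
    for subject_dir in sorted(Path(root).iterdir()):
        if not subject_dir.is_dir():
            continue  # skip stray files next to the identity folders
        # Keep only files named <number>.png, as in the layout above.
        images = [p for p in subject_dir.glob("*.png") if p.stem.isdigit()]
        # Sort numerically so that 10.png comes after 9.png, not after 1.png.
        images.sort(key=lambda p: int(p.stem))
        if images:
            index[subject_dir.name] = images
    return index
```

Calling `index_identities("path/to/extracted/root")` returns a dictionary whose keys can serve as class labels, with 72 or 5 image paths per key depending on which part of the dataset was extracted.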
Some of our rendered faces may be close in appearance to the faces of real people. Any such similarity is naturally unintentional, as it would be in a dataset of real images, where people may appear similar to others unknown to them.
If you use the DigiFace-1M dataset in your work, please cite the following paper:
@inproceedings{bae2023digiface1m,
title={DigiFace-1M: 1 Million Digital Face Images for Face Recognition},
author={Bae, Gwangbin and de La Gorce, Martin and Baltru{\v{s}}aitis, Tadas and Hewitt, Charlie and Chen, Dong and Valentin, Julien and Cipolla, Roberto and Shen, Jingjing},
booktitle={2023 IEEE Winter Conference on Applications of Computer Vision (WACV)},
year={2023},
organization={IEEE}
}