Based on the paper *XGAN: Unsupervised Image-to-Image Translation for Many-to-Many Mappings* (https://arxiv.org/abs/1711.05139).
This repo aims to contribute to the daunting problem of generating a cartoon from the picture of a face.
This is an image-to-image translation problem, which touches on many classic computer vision tasks such as style transfer, super-resolution, colorization, and semantic segmentation. It is also a many-to-many mapping: for a given face there are multiple valid cartoons, and for a given cartoon there are multiple valid faces.
- Faces dataset: the VggFace dataset (https://www.robots.ox.ac.uk/~vgg/data/vgg_face/) from the University of Oxford.
- Cartoon dataset: the CartoonSet dataset from Google (https://google.github.io/cartoonset/), in both its 10,000- and 100,000-item versions.
We filtered the data to keep only realistic cartoon and face images; this code lives in `scripts`. To download the dataset:
```bash
pip3 install gdown
gdown https://drive.google.com/uc?id=1tfMW5vZ0aUFnl-fSYpWexoGRKGSQsStL
unzip datasets.zip
```
- `config.json`: contains the model configuration used to train the model.
- `weights`: contains the weights we saved the last time we trained the model.
```
├── api.py
├── config.json
├── images
│   ├── Cartoons_example.jpeg
│   └── Faces_example.jpeg
├── LICENSE
├── losses
│   └── __init__.py
├── models
│   ├── avatar_generator_model.py
│   ├── cdann.py
│   ├── decoder.py
│   ├── denoiser.py
│   ├── discriminator.py
│   ├── encoder.py
│   └── __init__.py
├── README.md
├── requirements.txt
├── scripts
│   ├── copyFiles.sh
│   ├── download_faces.py
│   ├── keepFiles.sh
│   ├── plot_utils.py
│   └── preprocessing_cartoons_data.py
├── train.py
├── utils
│   └── __init__.py
└── weights
```
Our codebase is in Python 3. We suggest creating a new virtual environment (e.g. `python3 -m venv venv && source venv/bin/activate`).
- Install the required packages by running `pip3 install -r requirements.txt`.
- If you want to specify the GPU to use, set `N_CUDA` by running `export N_CUDA=<gpu_number>` (a sketch of how this maps to a device follows this list).
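As a rough illustration, here is how a training script might honour `N_CUDA` when picking a device. This is a hypothetical sketch, not the repo's actual handling (check `train.py` for that), and it assumes a PyTorch backend:

```python
import os

import torch

# Hypothetical sketch: map the N_CUDA environment variable to a torch
# device, falling back to CPU when no GPU is available.
gpu = os.environ.get("N_CUDA", "0")
device = torch.device(f"cuda:{gpu}" if torch.cuda.is_available() else "cpu")
print(f"training on {device}")
```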
The model is based on the XGAN paper, omitting the Teacher Loss and appending an autoencoder at the end. The latter was trained to learn only the representation of the cartoons, so as to "denoise" the spots and wrong colorisation from the face-to-cartoon outputs of the XGAN.
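For intuition, the inference path looks roughly like the sketch below. The modules are hypothetical stand-ins for `models/encoder.py`, `models/decoder.py`, and `models/denoiser.py` (the real layer shapes differ), and a PyTorch implementation is assumed:

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the repo's encoder, decoder and denoiser,
# just to illustrate the inference path: a shared encoder maps a face
# into the joint embedding, the cartoon decoder produces a first
# cartoon, and the denoising autoencoder (trained on cartoons only)
# cleans up spots and colorisation artifacts in that output.
encoder = nn.Sequential(                 # face -> shared embedding
    nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
)
cartoon_decoder = nn.Sequential(         # shared embedding -> cartoon
    nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Tanh(),
)
denoiser = nn.Sequential(                # cartoon -> cleaned cartoon
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1), nn.Tanh(),
)

face = torch.randn(1, 3, 64, 64)         # dummy input image
raw_cartoon = cartoon_decoder(encoder(face))
clean_cartoon = denoiser(raw_cartoon)
print(clean_cartoon.shape)               # torch.Size([1, 3, 64, 64])
```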
The model was trained using the hyperparameters located in `config.json`:
- Change `root_path` in `config.json`. It specifies the location of the `datasets` folder, which contains the datasets (see the snippet after this list).
- Run `python3 train.py --no-wandb`.
- To launch TensorBoard, run `tensorboard --logdir=<tensorboard-dir>`.
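As a minimal sketch of how these settings are consumed, the snippet below reads `config.json` and pulls out `root_path`. The hyperparameter keys shown are hypothetical placeholders, not the repo's actual schema:

```python
import json

# Load the training configuration. "root_path" is referenced in this
# README; "learning_rate" and "epochs" are hypothetical examples.
with open("config.json") as f:
    cfg = json.load(f)

datasets_dir = cfg["root_path"]          # parent folder of `datasets`
lr = cfg.get("learning_rate", 2e-4)      # hypothetical key
epochs = cfg.get("epochs", 100)          # hypothetical key
print(datasets_dir, lr, epochs)
```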
The codebase contains a REST API endpoint for testing the model:
- Update `model_path` in `config.json` to point to your model.
- Run `api.py`.
- `POST` an image file to `0.0.0.0:9999/generate` as a form parameter with name `image` (an example client is sketched below).
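A minimal client sketch using `requests`, assuming the endpoint returns the generated cartoon as raw image bytes (the response format is an assumption; the endpoint and form field name come from this README):

```python
import requests

# Hypothetical client: POST a face image as the multipart form field
# "image" and save whatever the endpoint returns.
with open("face.jpg", "rb") as f:
    resp = requests.post(
        "http://0.0.0.0:9999/generate",
        files={"image": f},      # form parameter name from this README
    )
resp.raise_for_status()
with open("cartoon.png", "wb") as out:
    out.write(resp.content)      # assumes the response body is the image
```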