TensorFlow 2.0 implementation of Fast Style Transfer, which merges the style of one picture with the content of another.
Code based on emla2805/fast-style-transfer
The algorithm is based on Perceptual Losses for Real-Time Style Transfer and Super-Resolution with the addition of Instance Normalization.
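To make the Instance Normalization step concrete, here is a minimal NumPy sketch. It is not taken from this repository: the function name `instance_norm`, the channels-last layout, and the `eps` value are assumptions for illustration, and the learnable scale and offset that real layers usually carry are omitted.

```python
import numpy as np


def instance_norm(x, eps=1e-5):
    """Normalize every channel of every image over its own spatial dimensions.

    x is assumed to have shape (batch, height, width, channels).
    Unlike batch normalization, statistics are never shared across the batch.
    """
    mean = x.mean(axis=(1, 2), keepdims=True)
    var = x.var(axis=(1, 2), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


# Hypothetical feature maps: 2 images, 8x8 pixels, 4 channels.
rng = np.random.default_rng(0)
features = rng.normal(loc=3.0, scale=2.0, size=(2, 8, 8, 4))
normalized = instance_norm(features)
```

After the call, each (image, channel) slice of `normalized` has approximately zero mean and unit variance, independent of the other images in the batch.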
Requires Python 3.7+ (in a virtual environment or otherwise). Install the required dependencies:
pip install -r requirements.txt
To style an image using a pre-trained model, specify the input and output image paths and the log directory containing the model checkpoints.
python style.py \
--image-path path/to/content/image.jpg \
--log-dir log/dir/ \
--output-path path/to/output/image.png
You can also add --cpu to prevent use of the GPU.
python style_dir.py \
--image-path path/to/content/ \
--log-dir log/dir/ \
--output-path path/to/output/
You can also add --cpu to prevent use of the GPU, --png to save the output as PNG, and --prefix to add a prefix to each filename.
python train.py \
--log-dir log/dir/ \
--style-image path/to/style/image.jpg \
--test-image path/to/test/image.jpg
and --content-weight
adjust the balance between preserving style and content, respectively.
sets the number of passes over the dataset.
and --batch-size
set the training parameters.
is the size to which dataset images are resized and cropped.
command to use the VGG19 network, which leads to styling similar, but not identical, to lengstrom/fast-style-transfer.
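As a rough illustration of how the style and content weights interact, the sketch below assumes the training objective is the usual weighted sum of a style term and a content term. The function `total_loss` and the raw loss values are hypothetical, and the exact loss in train.py may differ.

```python
def total_loss(style_loss, content_loss, style_weight, content_weight):
    """Weighted sum of the two perceptual loss terms (assumed form)."""
    return style_weight * style_loss + content_weight * content_loss


# Hypothetical raw loss values with a style weight of 100 and a content weight of 10.
loss = total_loss(style_loss=0.5, content_loss=2.0,
                  style_weight=100.0, content_weight=10.0)

# Fraction of the objective that comes from the style term.
style_share = (100.0 * 0.5) / loss
```

Raising the style weight relative to the content weight increases the style term's share of the objective, so the optimizer trades content fidelity for stronger stylization.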
One epoch of training on the COCO 2014 train dataset takes about 40 minutes on a GTX 1080 Ti.
If the dataset is not already present in the TensorFlow Datasets folder and format, it is downloaded (~38 GB) on first start and repacked, which requires an additional 38+ GB of disk space and a few hours.
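For a rough sense of what the 40-minute epoch means per training step, the arithmetic can be sketched as below. It assumes the COCO 2014 train split holds 82,783 images and uses the batch size of 16 mentioned elsewhere in this README; actual timings depend on hardware and input pipeline.

```python
import math

NUM_IMAGES = 82_783        # assumed size of the COCO 2014 train split
BATCH_SIZE = 16            # batch size referenced in this README
EPOCH_SECONDS = 40 * 60    # "about 40 minutes" on a GTX 1080 Ti

# The last batch is partial, hence the ceiling.
steps_per_epoch = math.ceil(NUM_IMAGES / BATCH_SIZE)
seconds_per_step = EPOCH_SECONDS / steps_per_epoch
images_per_second = NUM_IMAGES / EPOCH_SECONDS

print(f"{steps_per_epoch} steps, {seconds_per_step:.2f} s/step, "
      f"{images_per_second:.1f} images/s")
```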
Training: udnie_479neg - style-weight 100, content-weight 10, epoch 1, batch 16, learning-rate 0.001, vgg19
With default parameters, the image COCO_train2014_000000105396.jpg is cropped to a white rectangle without any details. This (possibly by confusing the content estimator) turns a few of the network's weights into NaNs with ~70% probability, or a large number of weights with ~10% probability, in which case the whole network becomes unusable. To prevent this, dataset shuffling is turned off and the specific training step is skipped, but this workaround is valid only for the default batch size of 16. A network with a few NaN weights seems not to work on CPU, although it can still train and run on GPU.
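A quick way to check whether a trained network was affected is to count NaN entries per weight array. This is a minimal sketch rather than part of the repository: `count_nan_weights` and the weight names are hypothetical, and the weights are plain NumPy arrays.

```python
import numpy as np


def count_nan_weights(weights):
    """Return {name: NaN count} for every array that contains at least one NaN."""
    report = {}
    for name, array in weights.items():
        nan_count = int(np.isnan(array).sum())
        if nan_count:
            report[name] = nan_count
    return report


# Hypothetical weights with a single corrupted bias value.
weights = {
    "conv1/kernel": np.ones((3, 3, 3, 8), dtype=np.float32),
    "conv1/bias": np.zeros(8, dtype=np.float32),
}
weights["conv1/bias"][2] = np.nan

bad = count_nan_weights(weights)
print(bad)
```

With a Keras model, the dictionary could be built as `{v.name: v.numpy() for v in model.trainable_variables}`, assuming the checkpoint has already been restored into `model`.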