FloWaveNet : A Generative Flow for Raw Audio

This is a PyTorch implementation of our work "FloWaveNet : A Generative Flow for Raw Audio".

To enable parallel sampling, we propose FloWaveNet, a flow-based generative model for raw audio synthesis. FloWaveNet can generate audio samples as fast as ClariNet and Parallel WaveNet, while training is simple and stable thanks to a single-stage pipeline. Our generated audio samples are available at http://bit.ly/2zpsElV. Our implementation of ClariNet (Gaussian WaveNet and Gaussian IAF) is available at https://github.com/ksw0306/ClariNet
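As a rough illustration of what "flow-based" means here, the sketch below shows a generic affine coupling step of the kind used in coupling flows. It is not taken from this repository: `toy_conditioner` is a hypothetical stand-in for the WaveNet-style network, and the function names and the direction of the transform are assumptions made for the example. The point is that both directions are computed for all elements at once, and the log-determinant needed for the likelihood is just a sum.

```python
import math

def toy_conditioner(x_a):
    """Hypothetical stand-in for the conditioning network: returns (log_s, t)."""
    log_s = [0.5 * math.tanh(v) for v in x_a]  # log-scale for each element
    t = [0.1 * v for v in x_a]                 # shift for each element
    return log_s, t

def coupling_forward(x_a, x_b):
    """Transform x_b conditioned on x_a; x_a passes through unchanged."""
    log_s, t = toy_conditioner(x_a)
    y_b = [b * math.exp(ls) + ti for b, ls, ti in zip(x_b, log_s, t)]
    log_det = sum(log_s)  # log|det J| of the elementwise affine map
    return x_a, y_b, log_det

def coupling_inverse(y_a, y_b):
    """Undo coupling_forward; no element depends on another element of y_b."""
    log_s, t = toy_conditioner(y_a)
    x_b = [(b - ti) * math.exp(-ls) for b, ls, ti in zip(y_b, log_s, t)]
    return y_a, x_b

x_a, x_b = [0.3, -1.2, 0.8], [1.0, 0.5, -0.4]
y_a, y_b, log_det = coupling_forward(x_a, x_b)
_, x_b_rec = coupling_inverse(y_a, y_b)
print(x_b_rec, log_det)
```

Because the inverse is exact, a model built from such steps can be trained directly on the data likelihood, which is consistent with the single-stage training described above.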

Requirements

PyTorch 0.4.1
Python 3.6
Librosa

Examples

Step 1. Download Dataset

LJSpeech : https://keithito.com/LJ-Speech-Dataset/

Step 2. Preprocessing (Preparing Mel Spectrogram)

python preprocessing.py --in_dir ljspeech --out_dir DATASETS/ljspeech

Step 3. Train

python train.py --model_name flowavenet --batch_size 8 --n_block 8 --n_flow 6 --n_layer 2 --causal no

Step 4. Synthesize

--load_step CHECKPOINT : the global training step of the pre-trained model to load (also shown in the name of the trained weight file)

--temp : Temperature, i.e. the standard deviation of the Gaussian prior that z is sampled from: z ~ N(0, TEMPERATURE^2)

Example: python synthesize.py --model_name flowavenet --n_block 8 --n_flow 6 --n_layer 2 --causal no --load_step 100000 --temp 0.7 --num_samples 10
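To make the temperature option concrete, here is a minimal, framework-free sketch of temperature-scaled sampling from a Gaussian prior. It only illustrates the idea that temperature acts as a standard deviation; the helper name `sample_prior` and its arguments are hypothetical and do not come from synthesize.py.

```python
import random
import statistics

def sample_prior(num_samples, temperature, seed=None):
    """Draw zero-mean Gaussian noise whose standard deviation is `temperature`."""
    rng = random.Random(seed)
    # Scaling unit-variance noise by the temperature sets its standard deviation.
    return [temperature * rng.gauss(0.0, 1.0) for _ in range(num_samples)]

z_full = sample_prior(20000, temperature=1.0, seed=0)
z_cool = sample_prior(20000, temperature=0.7, seed=0)
print(round(statistics.pstdev(z_full), 2), round(statistics.pstdev(z_cool), 2))
```

A temperature of 0.0 collapses every draw to zero, while 1.0 samples from the unscaled standard normal.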

Sample Link

Sample Link : http://bit.ly/2zpsElV

Our implementation of ClariNet (Gaussian WaveNet, Gaussian IAF) : https://github.com/ksw0306/ClariNet

Results 1 : Model Comparisons (WaveNet (MoL, Gaussian), ClariNet and FloWaveNet)
Results 2 : Temperature effect on Audio Quality Trade-off (Temperature T : 0.0 ~ 1.0, Model : Gaussian IAF and FloWaveNet)
Results 3 : Analysis of ClariNet Loss Terms (Loss functions : 1. KL divergence + frame loss, 2. KL divergence only, 3. Frame loss only)
Results 4 : Context Block and Long term Dependency (FloWaveNet : 8 Context Blocks, FloWaveNet_small : 6 Context Blocks)
Results 5 : Causality of WaveNet Dilated Convolutions (FloWaveNet : Non-causal WaveNet Affine Coupling Layers, FloWaveNet_causal : Causal WaveNet Affine Coupling Layers)

Reference

WaveNet vocoder : https://github.com/r9y9/wavenet_vocoder
glow-pytorch : https://github.com/rosinality/glow-pytorch

