Implementation of MaskCycleGAN-VC with OneFlow.
Non-parallel voice conversion (VC) is a technique for training voice converters without a parallel corpus. MaskCycleGAN-VC is the state-of-the-art CycleGAN-based method for non-parallel voice conversion. It is trained with a novel auxiliary task, filling in frames (FIF): a temporal mask is applied to the input Mel-spectrogram, and the converter is encouraged to fill in the missing frames from the surrounding ones. It shows marked improvements over prior models such as CycleGAN-VC (2018), CycleGAN-VC2 (2019), and CycleGAN-VC3 (2020).
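The temporal masking behind FIF can be illustrated with a minimal sketch. This is not the code used in this repository: it uses NumPy rather than OneFlow so that it runs on its own, and the function name `apply_temporal_mask`, the 80 x 64 spectrogram shape, and the 25-frame upper bound on the mask length are all assumptions chosen for illustration.

```python
import numpy as np


def apply_temporal_mask(mel, max_mask_frames=25, rng=None):
    """Zero out one random contiguous run of frames in a Mel-spectrogram.

    mel is a 2-D array of shape (n_mels, n_frames).
    Returns (masked_mel, mask), where mask is 1 for kept frames and 0 for
    masked frames. max_mask_frames is an illustrative default, not a value
    taken from this repository.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_mels, n_frames = mel.shape

    # The mask length may be 0, in which case nothing is masked.
    limit = min(max_mask_frames, n_frames)
    mask_len = int(rng.integers(0, limit + 1))
    start = int(rng.integers(0, n_frames - mask_len + 1))

    mask = np.ones_like(mel)
    mask[:, start:start + mask_len] = 0.0
    return mel * mask, mask


# Hypothetical input: 80 Mel bins, 64 frames of random data.
rng = np.random.default_rng(0)
mel = rng.standard_normal((80, 64)).astype(np.float32)
masked_mel, mask = apply_temporal_mask(mel, rng=rng)
```

In a training loop, the masked spectrogram and the mask would typically both be passed to the generator, so that it knows which frames it is expected to reconstruct; how the two are combined here (for example, as separate input channels) depends on the model code.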
Install the dependencies:
pip install -r requirements.txt
Train the model:
sh train.sh
Run inference:
sh infer.sh