PyTorch implementation of Frame-level Signal-to-Noise Ratio Estimation using Deep Learning.
This implementation supports distributed training. The model was trained on the LibriSpeech dataset (train-clean-100.tar.gz) and noise collected from different sources.
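For readers new to the task, frame-level SNR is usually defined per frame as the ratio of speech energy to noise energy, expressed in dB. The sketch below illustrates that general definition only. The function name `frame_snr_db`, the frame length, and the hop size are assumptions for illustration and are not taken from this repo's code or configuration.

```python
import math


def frame_snr_db(speech, noise, frame_len=400, hop=160, eps=1e-10):
    """Return one SNR value (in dB) per frame for aligned speech and noise.

    `speech` and `noise` are equal-rate sequences of float samples.
    `frame_len` and `hop` are in samples (hypothetical defaults:
    25 ms and 10 ms at 16 kHz).
    """
    n = min(len(speech), len(noise))
    snrs = []
    for start in range(0, n - frame_len + 1, hop):
        # Energy of each signal inside the current frame
        s_energy = sum(x * x for x in speech[start:start + frame_len])
        n_energy = sum(x * x for x in noise[start:start + frame_len])
        # eps avoids division by zero and log(0) on silent frames
        snrs.append(10.0 * math.log10((s_energy + eps) / (n_energy + eps)))
    return snrs
```

A model for this task would then be trained to predict these per-frame values from the noisy mixture alone.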
The image below is taken from training on LibriSpeech.
The images below are samples showing the results.
- Download and extract LibriSpeech
- Clone this repo:
git clone https://github.com/msalhab96/SNR-Estimation-Using-Deep-Learning
- Change into the repo directory:
cd SNR-Estimation-Using-Deep-Learning
- Install the requirements:
pip install -r requiremnts.txt
To train the model, follow the steps below:
- Preprocess all the audio files and make sure all of them are single-channel (mono)
- Change the configuration in the config/configs.yaml file
- Run
python train.py
to train from scratch, or
python train.py checkpoint=path/to/checkpoint
to train the model from a checkpoint
- Run
tensorboard --logdir=logdir/
to monitor the training (optional)
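The preprocessing step above asks for single-channel audio. As a minimal sketch, the helper below downmixes a 16-bit PCM WAV file to mono by averaging its channels, using only the Python standard library. The name `wav_to_mono` is hypothetical and not part of this repo. LibriSpeech ships as FLAC, which the standard library cannot read, so files in that format would need an audio library or an external tool instead.

```python
import array
import sys
import wave


def wav_to_mono(src_path, dst_path):
    """Downmix a 16-bit PCM WAV file to one channel by averaging channels."""
    with wave.open(src_path, "rb") as src:
        n_channels = src.getnchannels()
        sample_width = src.getsampwidth()
        frame_rate = src.getframerate()
        frames = src.readframes(src.getnframes())

    if sample_width != 2:
        raise ValueError("this sketch only handles 16-bit PCM")

    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian

    # Samples are interleaved (ch0, ch1, ..., ch0, ch1, ...);
    # take the integer average of each group of n_channels samples.
    mono = array.array(
        "h",
        (sum(samples[i:i + n_channels]) // n_channels
         for i in range(0, len(samples), n_channels)),
    )
    if sys.byteorder == "big":
        mono.byteswap()  # back to little-endian before writing

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(frame_rate)
        dst.writeframes(mono.tobytes())
```

Files that are already mono pass through unchanged, so the helper can be applied to a whole directory of WAV noise recordings without checking each file first.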
You can download the pretrained model from here