This repo provides utilities for preprocessing, segmenting, and building audio datasets, as well as for extracting embeddings.
- Create a conda environment and activate it:
conda create --name audioext -c conda-forge python=3.11
conda activate audioext
- Make sure torch>=2.0 and torchaudio>=2.0 are installed
- Install the requirements
pip install -r ./requirements.txt
- For Jukebox embeddings, install jukemirlib from its GitHub repo:
pip install git+https://github.com/rodrigo-castellon/jukemirlib.git
- Install ffmpeg
conda install -c conda-forge ffmpeg
Extract OpenAI's Jukebox embeddings from the audio files contained in a directory. For help on the parameters, run from the project directory:
python -m audioext.pipelines.jukebox_extractor -h
Note that running this script for the first time will automatically download the model weights to the machine. Refer to jukemirlib for more information.
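Downstream of extraction, a common step is pooling the frame-level Jukebox activations into one clip-level vector. A minimal NumPy sketch of mean-pooling (the array shapes here are illustrative assumptions, not the script's actual internals; Jukebox layer activations are 4800-dimensional per frame):

```python
import numpy as np

def pool_embedding(frame_embeddings: np.ndarray) -> np.ndarray:
    """Mean-pool frame-level activations of shape (frames, dims)
    into a single clip-level vector of shape (dims,)."""
    if frame_embeddings.ndim != 2:
        raise ValueError("expected a (frames, dims) array")
    return frame_embeddings.mean(axis=0)

# Hypothetical example: 345 frames of 4800-dim activations.
frames = np.random.rand(345, 4800).astype(np.float32)
clip_vec = pool_embedding(frames)
assert clip_vec.shape == (4800,)
```

Mean-pooling discards temporal order but keeps the clip representation compact and fixed-size regardless of audio length.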
Create MusicGen-ready samples and metadata from the audio files contained in a directory, and assign them to train, val, and test splits. Prompting is still unconditional (use 8H38fNdtri as the prompt for all models). For help on the parameters, run from the project directory:
python -m audioext.pipelines.musicgen_dataset -h
Reduce noise and optionally remove silence between sounds for the files in a given directory. For help on the parameters, run from the project directory:
python -m audioext.pipelines.denoise -h
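Silence removal is commonly done by dropping frames whose RMS energy falls below a threshold. A self-contained NumPy sketch of that technique for a mono signal (the frame length and threshold are illustrative assumptions, not the pipeline's actual defaults):

```python
import numpy as np

def remove_silence(audio: np.ndarray, sr: int, frame_ms: int = 20,
                   threshold: float = 1e-3) -> np.ndarray:
    """Drop fixed-size frames whose RMS energy is below `threshold`."""
    frame_len = sr * frame_ms // 1000
    n_frames = len(audio) // frame_len
    # Trim the tail so the signal reshapes cleanly into frames.
    frames = audio[: n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt((frames ** 2).mean(axis=1))
    return frames[rms >= threshold].reshape(-1)

# 1 s of silence, 1 s of a 440 Hz tone, 1 s of silence.
sr = 16000
tone = 0.5 * np.sin(2 * np.pi * 440 * np.arange(sr) / sr)
signal = np.concatenate([np.zeros(sr), tone, np.zeros(sr)]).astype(np.float32)
trimmed = remove_silence(signal, sr)
assert len(trimmed) < len(signal)
```

An energy gate like this is crude compared to spectral denoising, but it is fast and works well once background noise has already been reduced.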
- Reformat code to have a decoupled audio_segmenter utility
- Reformat code to have a decoupled audio_processer utility
- Convert the Jukebox extraction notebook into a callable script
- Reformat pipelines for Musicgen Dataset Generation.
- Add Denoise + Remove Silence.
- Reformat pipeline for Music Emotion Recognition.
- Create Audio Metadata enriching pipelines.
- Add bandwidth extension and audio super-resolution?
- Handle single audio file with varying sample_rate
- Debug multi-channel normalization