This project was created for the Independent Study and Mentorship (ISM) program. The ISM program allows Frisco ISD students to explore fields of interest through research and mentorship, developing lifelong skills in the process. Students showcase their knowledge through semester-long projects known as the Original Work and Final Product. Students also receive mentorship from professionals, developing communication skills and valuable connections.
This year in ISM, I am exploring the field of machine learning with a special focus on audio classification. This repository documents the code for my Original Work and provides an easy-to-use UI for navigating through it.
- Commits: Documentation of my work over time in the form of incremental updates.
- Digital Portfolio: My digital portfolio showcasing my work in ISM.
This guide assumes you have basic familiarity with the terminal.
- Open a terminal environment.
- Clone this repository:
git clone https://github.com/AnanthVivekanand/audio-classification.git
- Navigate into the GUI/ directory:
cd audio-classification/GUI/
- Install all Python dependencies:
pip3 install -r requirements.txt
- Then run application.py:
python3 ./application.py
- Navigate to http://localhost:5000/ in your web browser.
Documentation:
- GUI/application.py: This is a simple Flask application that creates an HTTP API for the audio classification model.
- GUI/templates/home.html: This is the front-end HTML code that provides a file upload and visualization interface.
- final_model/weights.h5: This binary file contains the "weights" of the trained machine learning model.
- preprocessing/extract_fft_data.ipynb: This notebook extracts Short-time Fourier Transform (STFT) data from the .wav files in the dataset. This allows for faster training, as the STFT is computed only once and stored indefinitely.
- training/optimized_TCN.ipynb: This notebook implements a Temporal Convolutional Network (TCN) that classifies STFT inputs into one of 10 categories. This is the model that the Flask application uses.
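The "compute the STFT once and store it" idea from the preprocessing step can be sketched as follows. This is only an illustration, not the code in the notebook: the function names `stft_magnitudes` and `load_or_compute`, the frame size, the hop length, and the JSON cache format are all assumptions made for this example. It uses a naive DFT from the standard library so it runs without extra packages; a real pipeline would more likely rely on a numerical library.

```python
import cmath
import json
import math
from pathlib import Path


def stft_magnitudes(samples, frame_size=8, hop=4):
    """Return one magnitude spectrum per Hann-windowed frame (naive DFT).

    frame_size and hop are illustrative defaults, not the project's settings.
    """
    # A Hann window tapers each frame to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_size)
              for n in range(frame_size)]
    spectra = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        # Keep only the non-negative frequency bins (0 .. frame_size // 2).
        spectrum = []
        for k in range(frame_size // 2 + 1):
            total = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n in range(frame_size)
            )
            spectrum.append(abs(total))
        spectra.append(spectrum)
    return spectra


def load_or_compute(samples, cache_path):
    """Compute the STFT once, then reuse the stored copy on later calls."""
    cache_path = Path(cache_path)
    if cache_path.exists():
        # Cache hit: skip the transform entirely.
        return json.loads(cache_path.read_text())
    spectra = stft_magnitudes(samples)
    cache_path.write_text(json.dumps(spectra))
    return spectra
```

Calling `load_or_compute(samples, "clip.json")` (a hypothetical file name) a second time reads the stored result instead of recomputing the transform, which is the saving that matters when the same clips are revisited every training epoch.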
Acknowledgements:
- Dr. Abhishek Sehgal at Samsung Research America
- Mr. Gautum Bhat at UTD's Statistical Signal Processing Research Laboratory