Audio Classification

Project Description

This project was created for the Independent Study and Mentorship (ISM) program. ISM allows Frisco ISD students to explore fields of interest through research and mentorship, developing lifelong skills in the process. Students showcase their knowledge through semester-long projects known as the Original Work and the Final Product. Students are also mentored by professionals, building communication skills and valuable connections.

This year in ISM, I am exploring the field of Machine Learning with a special focus on audio classification. This repository documents my code for my Original Work and provides an easy-to-use UI for navigating through my code.

Quick Links

Commits: Documentation of my work over time in the form of incremental updates.
Digital Portfolio: My digital portfolio showcasing my work in ISM.

Getting started

This guide assumes you have basic familiarity with the terminal.

Open a terminal environment.
Clone this repository: git clone https://github.com/AnanthVivekanand/audio-classification.git
Navigate into the GUI/ directory: cd audio-classification/GUI/
Install all Python dependencies: pip3 install -r requirements.txt
Run application.py: python3 ./application.py
Navigate to http://localhost:5000/ in your web browser.

Documentation:

GUI/application.py: This is a simple Flask application that creates an HTTP API for the audio classification model.
GUI/templates/home.html: This is the front-end HTML code that provides a file upload and visualization interface.
final_model/weights.h5: This binary file contains the "weights" of the trained machine learning model.
preprocessing/extract_fft_data.ipynb: This notebook extracts Short-time Fourier Transform (STFT) data from the .wav files in the dataset. This allows for faster training, as the STFT is computed only once and stored indefinitely.
training/optimized_TCN.ipynb: This notebook implements a Temporal Convolutional Neural Network (TCN) that classifies STFT inputs into one of 10 categories. This is the model that the Flask application uses.
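
For readers unfamiliar with the STFT, here is a minimal sketch of the idea behind the preprocessing step. It is not the code from extract_fft_data.ipynb: the function names, frame length, hop size, and the synthetic test tone are all assumptions chosen for illustration, and it uses only the Python standard library with a naive DFT, whereas a real pipeline would most likely rely on an optimized FFT library. Storing the result to disk is left out.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def stft_magnitudes(samples, frame_len=64, hop=32):
    """Split samples into overlapping windowed frames and return, for each
    frame, the magnitudes of its frame_len // 2 + 1 non-negative frequency bins."""
    window = hann(frame_len)
    n_bins = frame_len // 2 + 1
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + i] * window[i] for i in range(frame_len)]
        # Naive DFT: fine for a demo, far too slow for real audio.
        spectrum = [
            abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / frame_len)
                    for t in range(frame_len)))
            for k in range(n_bins)
        ]
        frames.append(spectrum)
    return frames


# Hypothetical input: a synthetic 1000 Hz tone sampled at 8000 Hz,
# standing in for samples read from a .wav file.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(256)]

spec = stft_magnitudes(tone)
# Strongest frequency bin in the first frame, converted to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
print(len(spec), len(spec[0]), peak_hz)  # 7 33 1000.0
```

Each row of the result is one time frame and each column one frequency bin, which is the kind of time-frequency grid a classifier like the TCN takes as input.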

Acknowledgements:

Dr. Abhishek Sehgal at Samsung Research America
Mr. Gautum Bhat at UTD's Statistical Signal Processing Research Laboratory
