
Using temporal convolutional neural networks (TCNNs) with a Flask web application to classify audio files


AnanthVivekanand/audio-classification


Audio Classification

Project Description

This project was created for the Independent Study and Mentorship (ISM) program. The ISM program allows Frisco ISD students to explore fields of interest through research and mentorship, developing lifelong skills in the process. Students showcase their knowledge through semester-long projects known as the Original Work and the Final Product. Students also receive mentorship from professionals, building communication skills and valuable connections.

This year in ISM, I am exploring the field of Machine Learning with a special focus on audio classification. This repository documents my code for my Original Work and provides an easy-to-use UI for navigating through my code.

Quick Links

  • Commits: Documentation of my work over time in the form of incremental updates.
  • Digital Portfolio: My digital portfolio showcasing my work in ISM.

Getting started

This guide assumes you have basic familiarity with the terminal.

  1. Open a terminal environment.
  2. Clone this repository: git clone https://github.com/AnanthVivekanand/audio-classification.git
  3. Navigate into the GUI/ directory: cd audio-classification/GUI/
  4. Install all Python dependencies: pip3 install -r requirements.txt
  5. Run application.py: python3 ./application.py
  6. Navigate to http://localhost:5000/ in your web browser.

Documentation:

  • GUI/application.py: This is a simple Flask application that creates an HTTP API for the audio classification model.
  • GUI/templates/home.html: This is the front-end HTML code that provides a file upload and visualization interface.
  • final_model/weights.h5: This binary file contains the "weights" of the trained machine learning model.
  • preprocessing/extract_fft_data.ipynb: This file extracts Short-time Fourier Transform (STFT) data from .wav files in the dataset. This allows for faster training because the STFT is computed only once and stored indefinitely.
  • training/optimized_TCN.ipynb: This file implements a Temporal Convolutional Neural Network that classifies STFT inputs into one of 10 categories. This is the model that the Flask application uses.
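
To illustrate the kind of STFT features the preprocessing step produces, here is a minimal sketch using only NumPy. It is not the code from extract_fft_data.ipynb, which may use a different library and different settings. The function name stft_magnitude, the frame_length and hop_length values, the Hann window, and the synthetic 440 Hz test tone are all assumptions made for illustration.

```python
import numpy as np


def stft_magnitude(signal, frame_length=256, hop_length=128):
    """Return the magnitude STFT of a 1-D signal.

    Output shape is (n_frames, frame_length // 2 + 1).
    frame_length and hop_length are illustrative defaults, not the
    values used in this repository.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if signal.size < frame_length:
        # Zero-pad short clips so that at least one full frame exists.
        signal = np.pad(signal, (0, frame_length - signal.size))
    n_frames = 1 + (signal.size - frame_length) // hop_length
    window = np.hanning(frame_length)
    # Slice the signal into overlapping frames and apply the window.
    frames = np.stack([
        signal[i * hop_length:i * hop_length + frame_length] * window
        for i in range(n_frames)
    ])
    # Real-input FFT of each frame; keep only the magnitude.
    return np.abs(np.fft.rfft(frames, axis=1))


# Hypothetical input: one second of a 440 Hz tone sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440 * t)
features = stft_magnitude(tone)
```

An array like features could then be written to disk once with np.save and reloaded with np.load during training, which is the "compute once, store, reuse" idea described above.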

Acknowledgements:

  • Dr. Abhishek Sehgal at Samsung Research America
  • Mr. Gautum Bhat at UTD's Statistical Signal Processing Research Laboratory
