Speech Commands Dataset 🗣️

This is a speech recognition example using spiking and artificial recurrent neural networks.

Check out this awesome demonstration built by Tim Shea at Accenture using Intel Kapoho Bay:

https://vimeo.com/intelpr/review/486105332/2b3ac2e263

Spiking model LSNN 🧠

This model implements the recurrent Long short-term memory Spiking Neural Network (LSNN) and reproduces the Google Speech Commands results from the paper:

Salaj, D., Subramoney, A., Kraisnikovic, C., Bellec, G., Legenstein, R. and Maass, W., 2020.
Spike-frequency adaptation provides a long short-term memory to networks of spiking neurons. bioRxiv.

To reproduce the result from the paper (91.2% test accuracy), run the following command:

python3 train.py --model_architecture=lsnn --window_stride_ms=1
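
To get a feel for what --window_stride_ms=1 means for the input, the sketch below estimates how many feature frames a clip yields at a given stride. It follows the frame-count arithmetic used in the TensorFlow speech commands example; the function name num_frames and the values for clip length (1000 ms), sample rate (16 kHz) and window size (30 ms) are assumptions made for illustration, not settings confirmed for this repository.

```python
def num_frames(clip_duration_ms=1000, sample_rate=16000,
               window_size_ms=30.0, window_stride_ms=10.0):
    """Estimate the number of feature frames per clip (illustrative helper)."""
    # Convert all durations from milliseconds to sample counts.
    desired_samples = int(sample_rate * clip_duration_ms / 1000)
    window_size_samples = int(sample_rate * window_size_ms / 1000)
    window_stride_samples = int(sample_rate * window_stride_ms / 1000)

    # A clip shorter than one analysis window yields no frames.
    length_minus_window = desired_samples - window_size_samples
    if length_minus_window < 0:
        return 0
    # One frame for the first window, plus one per full stride after it.
    return 1 + length_minus_window // window_stride_samples


if __name__ == "__main__":
    for stride in (10.0, 1.0):
        print(f"stride {stride} ms: {num_frames(window_stride_ms=stride)} frames")
```

Under these assumed settings, a 1 ms stride produces roughly ten times as many time steps per clip as a 10 ms stride, which is the higher temporal resolution the spiking network can exploit.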

The details that allow the spiking network to achieve this accuracy are:

The spiking network can exploit the higher temporal resolution of the input, so we use --window_stride_ms=1.
For classification, we consider the output of the spiking network throughout the sequence: --avg_spikes=True.
We use a larger number of neurons: --n_hidden=2048.
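
The idea of classifying from the output over the whole sequence, rather than from the last time step only, can be sketched as follows. This is a minimal NumPy illustration and not the code from spiking_models.py; the function name classify, the argument avg_over_time and the (time, classes) array layout are assumptions.

```python
import numpy as np


def classify(outputs, avg_over_time=True):
    """Pick a class from per-step output scores of shape (time, classes).

    Hypothetical helper: with avg_over_time=True the scores are averaged
    across all time steps before the argmax; otherwise only the final
    time step is used.
    """
    outputs = np.asarray(outputs, dtype=float)
    if avg_over_time:
        scores = outputs.mean(axis=0)  # average over the time axis
    else:
        scores = outputs[-1]           # last time step only
    return int(np.argmax(scores))


if __name__ == "__main__":
    # Toy scores: class 1 dominates early, class 0 wins only at the last step.
    toy = [[0.1, 0.9],
           [0.2, 0.8],
           [0.6, 0.4]]
    print(classify(toy, avg_over_time=True))
    print(classify(toy, avg_over_time=False))
```

Averaging makes the decision depend on activity across the entire clip, so a single noisy final step has less influence on the predicted label.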

Resulting accuracy:

Iteration	Validation	Test
400	68.6%
1200	79.4%
2400	85.3%
4800	87.5%
18000	91.5%	91.2%

LSTM model 🤖

Run with:

python3 train.py --model_architecture=lstm --n_hidden=512 --n_layer=1 --dropout_prob=0.4 --optimizer=adam

Resulting accuracy:

Iteration	Validation	Test
400	81.4%
1200	90.9%
18000	94.6%	94.4%

Default CNN model

The CNN model is the default one used in the TensorFlow Google Speech Commands (GSC) example, which is based on cnn-trad-fpool3 from the paper 'Convolutional Neural Networks for Small-footprint Keyword Spotting'.

Run with:

python3 train.py --model_architecture=conv

Resulting accuracy:

Iteration	Validation	Test
18000	88.4%	87.6%

Environment

Tested with TensorFlow 2.0 and 2.1.

To get started, create the conda environment from the environment file, activate it, and start training:

conda env create -f environment.yml
conda activate venv2.1
python3 train.py --model_architecture=lsnn --window_stride_ms=1

