Google Speech Commands-Musan test set

Trinh, Viet Anh; Kavaki, Hassan Salami; Mandel, Michael

doi:10.5281/zenodo.6066174

Published April 27, 2022 | Version 1.0

Dataset Open

1. CUNY Graduate Center
2. Brooklyn College, CUNY

This noisy speech test set was created from the Google Speech Commands v2 dataset [1] and the Musan dataset [2]. It was introduced in our ICASSP 2022 paper [3].

Specifically, we created this test set by mixing the speech in the Google Speech Commands v2 test set with randomly selected noise from the Musan dataset at signal-to-noise ratios (SNRs) of -12.5, -10, 0, 10, 20, 30, and 40 decibels (dB).
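
The mixing step described above can be sketched roughly as follows. This is a minimal illustration and not the authors' script: the function name `mix_at_snr`, the random cropping (and tiling) of the noise, and the power-based definition of SNR are all assumptions. Only the list of SNR values is taken from the description.

```python
import numpy as np

# SNR values listed in the dataset description (in dB).
SNRS_DB = [-12.5, -10, 0, 10, 20, 30, 40]


def mix_at_snr(speech, noise, snr_db, rng=None):
    """Add a random noise segment to `speech`, scaled to a target SNR in dB.

    Hypothetical helper: SNR is taken here as the ratio of mean signal
    power to mean noise power, which is an assumption about the recipe.
    """
    rng = np.random.default_rng() if rng is None else rng
    speech = np.asarray(speech, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)

    # Repeat the noise if it is shorter than the speech clip.
    if len(noise) < len(speech):
        reps = int(np.ceil(len(speech) / len(noise)))
        noise = np.tile(noise, reps)

    # Pick a random noise segment with the same length as the speech.
    start = rng.integers(0, len(noise) - len(speech) + 1)
    segment = noise[start:start + len(speech)]

    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(segment ** 2)
    if speech_power == 0 or noise_power == 0:
        # SNR is undefined for silent inputs; return the speech unchanged.
        return speech.copy()

    # Choose the gain so that
    # 10 * log10(speech_power / (gain**2 * noise_power)) == snr_db.
    gain = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + gain * segment
```

Under these assumptions, calling `mix_at_snr(speech, noise, snr)` once per value in `SNRS_DB` would yield one noisy copy of each utterance per SNR level. Clipping and file I/O are left out of the sketch.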

The Google Speech Commands v2 dataset is licensed under Creative Commons BY 4.0. It can be downloaded from http://download.tensorflow.org/data/speech_commands_v0.02.tar.gz

The Musan dataset is licensed under Creative Commons Attribution 4.0 International (CC BY 4.0). It can be downloaded from https://www.openslr.org/17/

Citations:

[1] Pete Warden, “Speech commands: A dataset for limited-vocabulary speech recognition,” arXiv preprint arXiv:1804.03209, 2018.

[2] David Snyder, Guoguo Chen, and Daniel Povey, “Musan: A music, speech, and noise corpus,” arXiv preprint arXiv:1510.08484, 2015.

[3] V. A. Trinh, H. Salami Kavaki, and M. I. Mandel, “ImportantAug: A data augmentation agent for speech,” ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022, pp. 8592-8596, doi: 10.1109/ICASSP43922.2022.9747003.

Notes

Supported by NSF grant IIS-1750383.

Files

Files (4.0 GB)

Name	Size	MD5
SpeechCommands_Musan.tar.gz	4.0 GB	668630823307b5d0efc5cf035b037c8e
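
After downloading, the archive can be checked against the MD5 checksum listed above. The following is a minimal sketch using Python's standard `hashlib`; the helper names `md5_of_file` and `verify_archive` and the local file path are hypothetical.

```python
import hashlib

# Checksum as listed for SpeechCommands_Musan.tar.gz in the file table.
EXPECTED_MD5 = "668630823307b5d0efc5cf035b037c8e"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected checksum."""
    return md5_of_file(path) == expected.lower()


# Example (assumes the archive was saved in the current directory):
# verify_archive("SpeechCommands_Musan.tar.gz")
```

A mismatch would usually indicate an incomplete or corrupted download, in which case fetching the file again is the simplest remedy.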

Additional details

Is derived from: Dataset: arXiv:1804.03209 (arXiv); Dataset: arXiv:1510.08484 (arXiv)
Is published in: Conference paper: 10.1109/ICASSP43922.2022.9747003 (DOI)


