Dataset Open Access

Silent Speech EMG

Gaddy, David

Facial electromyography recordings during both silent and vocalized speech.

This data is described in the publication "Digital Voicing of Silent Speech" at EMNLP 2020 (https://arxiv.org/abs/2010.02960).

Code for processing this data can be found at https://github.com/dgaddy/silent_speech.

Each data sample has 5 data files:
{i}_emg.npy - a saved numpy array of size (T, 8) with the raw EMG signals
{i}_audio.flac - the raw audio recording
{i}_audio_clean.flac - audio with background noise reduced
{i}_info.json - JSON with extra information, such as the text prompt that was read
{i}_button.npy - a numpy array containing device button state, which is generally unused

Note that some samples do not represent actual datapoints, but are used as reference EMG or audio signals. These samples are marked with "sentence_index: -1" in the associated info file.
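
The per-sample file layout described above can be sketched in Python as follows. This is a minimal illustration, not the loader from the linked repository: the helper names (sample_paths, load_info, is_reference_sample, load_emg) are hypothetical, and the sketch assumes that all five files of a sample sit in the same directory, which the description does not state. Only the file-name suffixes, the (T, 8) EMG shape, and the "sentence_index" key come from the description.

```python
import json
from pathlib import Path

# File-name suffixes taken from the dataset description.
SAMPLE_SUFFIXES = {
    "emg": "_emg.npy",
    "audio": "_audio.flac",
    "audio_clean": "_audio_clean.flac",
    "info": "_info.json",
    "button": "_button.npy",
}


def sample_paths(directory, i):
    """Return the five expected file paths for sample index i.

    Assumption: all files of one sample live in the same directory.
    """
    directory = Path(directory)
    return {key: directory / f"{i}{suffix}" for key, suffix in SAMPLE_SUFFIXES.items()}


def load_info(directory, i):
    """Read {i}_info.json and return its contents as a dict."""
    with open(sample_paths(directory, i)["info"], encoding="utf-8") as f:
        return json.load(f)


def is_reference_sample(info):
    """True for reference EMG/audio samples, marked with sentence_index -1."""
    return info.get("sentence_index") == -1


def load_emg(directory, i):
    """Load {i}_emg.npy and check that it has the documented (T, 8) shape."""
    import numpy as np  # third-party dependency, imported only when needed

    emg = np.load(sample_paths(directory, i)["emg"])
    if emg.ndim != 2 or emg.shape[1] != 8:
        raise ValueError(f"expected an array of shape (T, 8), got {emg.shape}")
    return emg
```

A typical use would be to call load_info for each sample index and skip the sample when is_reference_sample returns True, so that reference recordings are not treated as datapoints.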
Files (3.9 GB)
Name: emg_data.tar.gz
Size: 3.9 GB
md5: 7f97d2182b896652999b1b2d0c69fd7b
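
The MD5 checksum listed for emg_data.tar.gz can be used to check that a download completed intact. The following is a minimal sketch using only the Python standard library; the function names md5_of_file and verify_archive are hypothetical, and the local path passed to them is whatever location the archive was saved to.

```python
import hashlib

# Checksum listed for emg_data.tar.gz on this page.
EXPECTED_MD5 = "7f97d2182b896652999b1b2d0c69fd7b"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file at path matches the expected MD5 digest."""
    return md5_of_file(path) == expected.lower()
```

For example, verify_archive("emg_data.tar.gz") would return True for an intact copy saved in the current directory. Reading in chunks keeps memory use small for a 3.9 GB file.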