There is a newer version of this record available.

Dataset Open Access

DeepPredSpeech: computational models of predictive speech coding based on deep learning

Hueber, Thomas; Tatulli, Eric; Girin, Laurent; Schwartz, Jean-Luc

This dataset contains all data, source code, pre-trained computational predictive models and experimental results related to:  

Hueber T., Tatulli E., Girin L., Schwartz J.-L., "How predictive can be predictions in the neurocognitive processing of auditory and audiovisual speech? A deep learning study." (bioRxiv preprint)

  • Raw data are extracted from the publicly available database NTCD-TIMIT (10.5281/zenodo.260228). 
    • Audio recordings are available in the audio_clean/ directory
    • Post-processed lip image sequences are available in the lips_roi/ directory (67x67 pixels, 8 bits, obtained by lossless inverse 2D DCT from the DCT features available in the original NTCD-TIMIT repository)
    • Phonetic segmentation (extracted from the original NTCD-TIMIT Zenodo repository) is available in the HTK MLF file volunteer_labelfiles.mlf
  • Audio features (MFCC-spectrogram and log-spectrogram) are available in the mfcc_16k/ and fft_16k/ directories. 
  • Models (audio-only, video-only and audiovisual, based on deep feed-forward neural networks and/or convolutional neural networks, in .h5 format, trained with the Keras 2.0 toolkit) and data normalization parameters (in .dat scikit-learn format) are available in the models_mfcc/ and models_logspectro/ directories
  • Predicted and target (ground-truth) MFCC-spectrograms (resp. log-spectrograms) for the test databases (1909 sentences), for the different values of \(\tau_p\) or \(\tau_f\), are available in the pred_testdb_mfccspectro/ (resp. pred_testdb_logspectro/) directory
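
As an illustration of how the phonetic segmentation could be read, here is a minimal Python sketch of a parser for the HTK Master Label File (MLF) layout. It assumes the usual HTK conventions: a `#!MLF!#` header, one quoted label-file pattern per utterance, lines of the form `start end label` with times in units of 100 ns, and a lone `.` closing each utterance. Whether volunteer_labelfiles.mlf follows exactly this layout has not been checked here. The function name `parse_mlf` and the sample text are hypothetical.

```python
from typing import Dict, List, Tuple

# One segment: (start time in seconds, end time in seconds, phone label)
Segment = Tuple[float, float, str]

# Assumption: times use the standard HTK unit of 100 ns
HTK_TIME_UNIT_S = 1e-7


def parse_mlf(text: str) -> Dict[str, List[Segment]]:
    """Parse the text of an HTK MLF into {utterance pattern: list of segments}."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or lines[0] != "#!MLF!#":
        raise ValueError("missing #!MLF!# header")

    labels: Dict[str, List[Segment]] = {}
    current = None
    for ln in lines[1:]:
        if ln.startswith('"') and ln.endswith('"'):
            # A quoted line opens a new utterance entry
            current = ln.strip('"')
            labels[current] = []
        elif ln == ".":
            # A lone dot closes the current utterance
            current = None
        elif current is not None:
            # Assumed layout: "start end label [extra fields ignored]"
            fields = ln.split()
            start, end, label = int(fields[0]), int(fields[1]), fields[2]
            labels[current].append(
                (start * HTK_TIME_UNIT_S, end * HTK_TIME_UNIT_S, label)
            )
    return labels


# Hypothetical two-segment example, not taken from the dataset
example = """#!MLF!#
"*/utt1.lab"
0 1200000 sil
1200000 2500000 sh
.
"""
segments = parse_mlf(example)
```

For the real file, the text would be read with `open("volunteer_labelfiles.mlf").read()` and passed to `parse_mlf`. Entries without time stamps would need a small extension of the segment-parsing branch.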

Source code for extracting audio features, training and evaluating the models is available on GitHub

All directories have been zipped before upload.
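
Since each directory is distributed as a zip archive, a short Python sketch such as the following could unpack all downloaded archives in one pass. It uses only the standard library. The function name `extract_archives` and the folder names in the usage note are hypothetical, and the archive file names on this record are not assumed.

```python
import zipfile
from pathlib import Path
from typing import List


def extract_archives(archive_dir: str, out_dir: str) -> List[str]:
    """Extract every .zip file found in archive_dir into out_dir.

    Returns the names of the archives that were extracted, in sorted order.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    extracted = []
    for archive in sorted(Path(archive_dir).glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(out)  # keeps the directory layout stored in the archive
        extracted.append(archive.name)
    return extracted
```

A call such as `extract_archives("downloads", "DeepPredSpeech")` would unpack everything under one root folder. Note that the full dataset is about 31.8 GB compressed, so enough free disk space is needed.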

Feel free to contact me for more details.

Thomas Hueber, Ph.D., CNRS research fellow, GIPSA-lab, Grenoble, France

Files (31.8 GB)
Name Size
854.4 MB
1.1 GB
2.1 GB
112.7 MB
2.7 GB
429.3 MB
18.4 GB
6.1 GB
4.3 MB