LibriSpeech Alignments

Loren Lugosch

doi:10.5281/zenodo.2619474

Published March 31, 2019 | Version 1.0

Other Open

Loren Lugosch¹

1. Mila

This record contains phoneme alignments and word alignments (that is, a label for each timestep) for all 980 hours of LibriSpeech.

We obtained these alignments with the Montreal Forced Aligner, using its pre-trained LibriSpeech acoustic model. We provide them so that you can replicate the experiments in our paper without running the aligner yourself. Note that the aligner could not compute an alignment for a small number of audio files; we did not use these files during training.
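To illustrate what "labels for each timestep" means, here is a minimal sketch that expands interval-style alignments into one label per frame. It rests on assumptions not stated in this record: the `(start_sec, end_sec, label)` tuple layout, the function name `intervals_to_frame_labels`, and the 10 ms frame shift are all hypothetical, and the actual file format inside the archive may differ.

```python
def intervals_to_frame_labels(intervals, frame_shift=0.01, fill=""):
    """Expand (start_sec, end_sec, label) intervals into one label per frame.

    frame_shift is an assumed frame spacing in seconds (10 ms here);
    frames not covered by any interval receive the `fill` label.
    """
    if not intervals:
        return []

    # The number of frames is set by the latest interval end time.
    end_time = max(end for _, end, _ in intervals)
    n_frames = int(round(end_time / frame_shift))
    labels = [fill] * n_frames

    for start, end, label in intervals:
        # Round boundaries to the nearest frame index to absorb float error.
        first = int(round(start / frame_shift))
        last = min(int(round(end / frame_shift)), n_frames)
        for i in range(first, last):
            labels[i] = label
    return labels


# Hypothetical phoneme intervals for a 50 ms snippet.
example = [(0.0, 0.03, "HH"), (0.03, 0.05, "AH")]
frame_labels = intervals_to_frame_labels(example)
```

Rounding boundaries to the nearest frame keeps adjacent intervals from overlapping or leaving a one-frame gap; a different convention (for example, labelling by frame centre) would be equally reasonable.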

If you find these alignments or other parts of our experiment useful, please cite our paper:

Loren Lugosch, Mirco Ravanelli, Patrick Ignoto, Vikrant Singh Tomar, and Yoshua Bengio, "Speech Model Pre-training for End-to-End Spoken Language Understanding", Interspeech 2019.

as well as the Montreal Forced Aligner paper:

Michael McAuliffe, Michaela Socolof, Sarah Mihuc, Michael Wagner, and Morgan Sonderegger. "Montreal Forced Aligner: trainable text-speech alignment using Kaldi", Interspeech 2017.

Files

Files (623.0 MB)

Name	Size	MD5
librispeech_alignments.zip	623.0 MB	2bab567d0ace651a4ba254e813629f46
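Since the record lists an MD5 checksum for librispeech_alignments.zip, a downloaded copy can be checked against it before unpacking. The following is a minimal sketch using Python's standard `hashlib`; the helper names `md5_of_file` and `verify_download` and the local file path are illustrative assumptions.

```python
import hashlib

# Checksum copied from the file listing in this record.
EXPECTED_MD5 = "2bab567d0ace651a4ba254e813629f46"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

For example, `verify_download("librispeech_alignments.zip")` would return `True` for an intact download, assuming the archive was saved under that name in the current directory.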

