Published May 24, 2021 | Version 1.0
Dataset Open

Segmented DAPS (Device and Produced Speech) Dataset

  • 1. Northwestern University
  • 2. Adobe Research

Description

This is a modified version of a subset of the Device and Produced Speech (DAPS) dataset. The original dataset is available under DOI 10.5281/zenodo.4660670. This dataset contains text-aligned audio of the first script of the "clean" partition of the DAPS dataset for all 20 speakers. The audio and alignments are segmented into single sentences. For each sentence, phoneme and word alignments are provided as JSON files, the raw text is provided as a txt file, and the audio is provided as a 44.1 kHz WAV file.
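As an illustration of how the per-sentence files described above might be read together, here is a minimal Python sketch. The function name `load_sentence` and the idea of passing three explicit paths are assumptions made for this example; the archive's directory layout and the JSON alignment schema are not documented on this page, so the alignment is returned as parsed without interpretation. The sketch also assumes the WAV files are PCM-encoded, which is what the standard-library `wave` module can read.

```python
import json
import wave
from pathlib import Path

# Sample rate stated in the dataset description (44.1 kHz).
EXPECTED_SAMPLE_RATE = 44100


def load_sentence(wav_path, txt_path, json_path):
    """Load one segmented sentence: audio metadata, raw text, alignment.

    The three paths are hypothetical; pass whatever files in the archive
    belong to the same sentence.
    """
    # Read only the header information; assumes a PCM-encoded WAV file.
    with wave.open(str(wav_path), "rb") as wav:
        sample_rate = wav.getframerate()
        num_frames = wav.getnframes()

    if sample_rate != EXPECTED_SAMPLE_RATE:
        raise ValueError(
            f"expected {EXPECTED_SAMPLE_RATE} Hz audio, got {sample_rate} Hz"
        )

    text = Path(txt_path).read_text(encoding="utf-8").strip()

    # The alignment schema is not described here, so return it unmodified.
    with open(json_path, encoding="utf-8") as f:
        alignment = json.load(f)

    return {
        "text": text,
        "alignment": alignment,
        "sample_rate": sample_rate,
        "duration_s": num_frames / sample_rate,
    }
```

Checking the sample rate up front is a cheap way to catch files that were resampled or mixed in from another source before they reach a training or evaluation pipeline.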

If you use this work as part of an academic publication, please cite the paper corresponding to the original dataset:

Gautham J. Mysore, “Can We Automatically Transform Speech Recorded on Common Consumer Devices in Real-World Environments into Professional Production Quality Speech? - A Dataset, Insights, and Challenges”, in the IEEE Signal Processing Letters, Vol. 22, No. 8, August 2015

Files

Files (228.7 MB)

Size: 228.7 MB
Checksum: md5:5d9faddade18a3df7034507d651651fc
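The MD5 checksum listed for the archive can be used to confirm that a download completed intact. The following is a minimal sketch using Python's standard `hashlib`; the helper names `md5_of_file` and `verify_download` are made up for this example, and the path you pass is wherever you saved the archive locally.

```python
import hashlib

# Checksum as listed in the Files section of this record.
EXPECTED_MD5 = "5d9faddade18a3df7034507d651651fc"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` matches the expected MD5 digest."""
    return md5_of_file(path) == expected
```

Reading in chunks keeps memory use flat, which matters little for a 228.7 MB archive but makes the helper reusable for larger files.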

Additional details

Related works

Cites
Journal article: 10.1109/LSP.2014.2379648 (DOI)
Is derived from
Dataset: 10.5281/zenodo.4660670 (DOI)