Dataset Open Access
The DAPS (Device and Produced Speech) dataset is a collection of aligned versions of professionally produced studio speech recordings and recordings of the same speech on common consumer devices (tablet and smartphone) in real-world environments. It has 15 versions of audio (3 professional versions and 12 consumer device/real-world environment combinations). Each version consists of about 4 1/2 hours of data (about 14 minutes from each of 20 speakers). Please see this paper for a detailed description of the dataset:
Gautham J. Mysore, “Can We Automatically Transform Speech Recorded on Common Consumer Devices in Real-World Environments into Professional Production Quality Speech? - A Dataset, Insights, and Challenges”, IEEE Signal Processing Letters, vol. 22, no. 8, August 2015.
The primary goal of the dataset is to help develop methods that automatically convert real-world device recordings into professional-sounding recordings. It can also be used for other applications such as voice conversion, traditional speech enhancement, and automatic production of studio recordings.
Name | MD5 | Size |
---|---|---|
daps.tar.gz | 303c130b7ce2e02b59c7ca5cd595a89c | 16.1 GB |
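
Because the archive is large, it may be worth checking the download against the MD5 checksum listed above before extracting it. The following is a minimal sketch, assuming the file was saved as `daps.tar.gz` in the current directory; the helper names `md5_of_file` and `verify_archive` are hypothetical and not part of the dataset.

```python
import hashlib
from pathlib import Path

# Checksum published alongside daps.tar.gz on this page.
EXPECTED_MD5 = "303c130b7ce2e02b59c7ca5cd595a89c"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks.

    Chunked reading keeps memory use small even for a multi-gigabyte archive.
    """
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local filename; adjust to wherever the archive was saved.
    archive = Path("daps.tar.gz")
    if archive.exists():
        print("checksum OK" if verify_archive(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

A mismatch usually indicates an incomplete or corrupted download, in which case fetching the file again is the simplest remedy.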