Dataset Open Access
The DAPS (Device and Produced Speech) dataset is a collection of aligned versions of professionally produced studio speech recordings and recordings of the same speech on common consumer devices (tablet and smartphone) in real-world environments. It has 15 versions of audio (3 professional versions and 12 consumer device/real-world environment combinations). Each version consists of about 4 1/2 hours of data (about 14 minutes from each of 20 speakers). Please see this paper for a detailed description of the dataset:
Gautham J. Mysore, “Can We Automatically Transform Speech Recorded on Common Consumer Devices in Real-World Environments into Professional Production Quality Speech? - A Dataset, Insights, and Challenges,” IEEE Signal Processing Letters, vol. 22, no. 8, August 2015.
The primary goal of the dataset is to help develop methods that automatically convert real-world device recordings into professional-sounding recordings. It can also be used for other applications such as voice conversion, traditional speech enhancement, and automatic production of studio recordings.
Name | Size | |
---|---|---|
daps.tar.gz (md5:303c130b7ce2e02b59c7ca5cd595a89c) | 16.1 GB | Download |
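
Because the archive is large, it may be worth checking the download against the MD5 checksum listed with the file before extracting it. The following is a minimal sketch, not an official tool: the local path `daps.tar.gz` in the working directory and the helper names `md5_of_file` and `verify_archive` are assumptions made for illustration. The file is read in chunks so the 16.1 GB archive is never loaded into memory at once.

```python
import hashlib
from pathlib import Path

# MD5 checksum published with daps.tar.gz in the file listing.
EXPECTED_MD5 = "303c130b7ce2e02b59c7ca5cd595a89c"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed location of the downloaded archive (hypothetical path).
    archive = Path("daps.tar.gz")
    if archive.exists():
        print("checksum OK" if verify_archive(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found")
```

A mismatch usually indicates an incomplete or corrupted download, in which case downloading the file again is the simplest remedy.
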
All versions | This version | |
---|---|---|
Views | 4,092 | 4,091 |
Downloads | 3,545 | 3,545 |
Data volume | 56.9 TB | 56.9 TB |
Unique views | 3,727 | 3,726 |
Unique downloads | 2,123 | 2,123 |