Published September 20, 2021 | Version 0.8
Video/Audio Open

DDS (Device-Degraded Speech) Dataset - VCTK portion - Part 2

  • 1. National Institute of Informatics

Description

The DDS (Device-Degraded Speech) dataset provides aligned parallel recordings of high-quality speech (recorded in professional studios) and many low-quality versions of the same speech, totaling approximately 2,000 hours of speech data.

DDS is built on top of two datasets: DAPS and VCTK. We play clean speech recordings (4 hours from DAPS and 8 hours from VCTK) and re-record the waveforms in nine environments (two offices, two conference rooms, three studios, one living room, one waiting room) on three devices (one MEMS and two condenser microphones), producing 27 recording conditions. Moreover, each condition consists of recordings made at 6 different microphone positions to simulate various signal-to-noise ratios (SNRs) and reverberation levels.
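The figures in the description can be cross-checked with a little arithmetic. The sketch below is illustrative only: it assumes that every clean hour is re-recorded once for each condition and each microphone position, which the description implies but does not state outright. The function and constant names are hypothetical and are not part of any DDS tooling.

```python
# Counts taken from the dataset description.
ENVIRONMENTS = 9      # 2 offices, 2 conference rooms, 3 studios, 1 living room, 1 waiting room
DEVICES = 3           # 1 MEMS microphone + 2 condenser microphones
MIC_POSITIONS = 6     # microphone positions per condition
CLEAN_HOURS = {"DAPS": 4, "VCTK": 8}


def recording_conditions(environments: int = ENVIRONMENTS, devices: int = DEVICES) -> int:
    """Number of (environment, device) recording conditions."""
    return environments * devices


def estimated_degraded_hours(
    clean_hours: dict = CLEAN_HOURS,
    conditions: int = ENVIRONMENTS * DEVICES,
    positions: int = MIC_POSITIONS,
) -> int:
    """Estimated hours of re-recorded speech.

    Assumption (not stated in the description): each clean hour is
    re-recorded once per condition and per microphone position.
    """
    return sum(clean_hours.values()) * conditions * positions


if __name__ == "__main__":
    print(recording_conditions())       # 27
    print(estimated_degraded_hours())   # 1944
```

Under that assumption, 12 clean hours × 27 conditions × 6 positions gives 1,944 hours, consistent with the stated total of approximately 2,000 hours.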

arXiv: https://arxiv.org/abs/2109.07931

 

The whole dataset is split into three repositories: one for the DAPS portion and two for the VCTK portion. This repository contains part 2 of the VCTK portion.

Repository links for all parts of DDS v0.8:

  • DAPS portion: https://zenodo.org/record/5464104
  • VCTK portion, part 1: https://zenodo.org/record/5499506
  • VCTK portion, part 2: https://zenodo.org/record/5501697

Notes

This dataset is based on the VCTK dataset (DOI: 10.7488/ds/2645). The VCTK dataset is distributed under the Open Data Commons Attribution License (ODC-By) v1.0.

Files

Total size: 49.6 GB

MD5 checksum                            Size
md5:ee25ef77005d7e8db04db34091292cf7    10.0 GB
md5:db0228cd330133c25bb88c0851b11747    9.9 GB
md5:16b56ead4e04dd7cc7538c7057499bc6    10.0 GB
md5:369abc1b67aa03c404af69807c3ea107    9.9 GB
md5:93138494242b4159f196d70ae728d258    9.8 GB
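Each file is listed with an MD5 checksum, which can be used to confirm that a multi-gigabyte download arrived intact. The following is a minimal sketch using only the Python standard library. The helper names are hypothetical, and the file names are not shown in this listing, so any path passed in is a placeholder you must supply yourself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so that a 10 GB archive never has to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Example call (the path is a placeholder, not a real DDS file name):
# matches_listed_md5("path/to/downloaded_part", "md5:ee25ef77005d7e8db04db34091292cf7")
```

A mismatch usually indicates a truncated or corrupted download, in which case the file should be fetched again before extraction.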