DDS (Device-Degraded Speech) Dataset - VCTK portion - Part 2

Li, Haoyu; Yamagishi, Junichi

doi:10.5281/zenodo.5501697

Published September 20, 2021 | Version 0.8

Video/Audio Open

1. National Institute of Informatics

The DDS (Device-Degraded Speech) dataset provides aligned parallel recordings of high-quality speech (recorded in professional studios) and many low-quality versions of the same speech, totaling approximately 2,000 hours of speech data.

DDS is built on two datasets: DAPS and VCTK. We play back clean speech recordings (4 hours from DAPS and 8 hours from VCTK) and re-record the waveforms in nine environments (two offices, two conference rooms, three studios, one living room, one waiting room) on three devices (one MEMS and two condenser microphones), producing 27 recording conditions. Moreover, each condition comprises multiple recordings made at 6 different microphone positions to simulate various signal-to-noise ratio (SNR) and reverberation levels.
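The counts in this description can be checked with a few lines of arithmetic. The sketch below enumerates the environment, device, and position combinations. Five of the environment labels match archive names in this repository's file list; the other four environment labels and all three device labels are hypothetical placeholders, not names used by the dataset. The hour total assumes that every source hour was re-recorded at every position in every condition, which the description suggests but does not state outright.

```python
from itertools import product

# Environment labels: five match archive names in this repository;
# office1, office2, confroom2 and studio1 are assumed for illustration.
environments = [
    "office1", "office2",
    "confroom1", "confroom2",
    "studio1", "studio2", "studio3",
    "livingroom1", "waitingroom1",
]

# Device labels are hypothetical; the description only says
# "one MEMS and two condenser microphones".
devices = ["mems", "condenser_a", "condenser_b"]

# Six microphone positions per condition.
positions = range(1, 7)

conditions = list(product(environments, devices))
recordings = list(product(environments, devices, positions))

# 4 hours of clean speech from DAPS plus 8 hours from VCTK.
source_hours = 4 + 8

# Assumption: each source hour is replayed once per (condition, position).
total_hours = source_hours * len(recordings)

print(len(conditions))  # 27
print(len(recordings))  # 162
print(total_hours)      # 1944
```

Under that assumption the total comes to 1,944 hours, which is consistent with the "approximately 2,000 hours" figure quoted in the description.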

arXiv: https://arxiv.org/abs/2109.07931

The whole dataset is split into three repositories (one for the DAPS portion, two for the VCTK portion). This repository contains part 2 of the VCTK portion.

Repository links for DDS v0.8:

DAPS portion: https://zenodo.org/record/5464104
VCTK portion part1: https://zenodo.org/record/5499506
VCTK portion part2: https://zenodo.org/record/5501697

Notes

This dataset is based on the VCTK dataset (DOI: 10.7488/ds/2645). The VCTK dataset is distributed under the Open Data Commons Attribution License (ODC-By) v1.0.

Files

Files (49.6 GB)

Name	MD5	Size
VCTK_16k_confroom1.tar.gz	ee25ef77005d7e8db04db34091292cf7	10.0 GB
VCTK_16k_livingroom1.tar.gz	db0228cd330133c25bb88c0851b11747	9.9 GB
VCTK_16k_studio2.tar.gz	16b56ead4e04dd7cc7538c7057499bc6	10.0 GB
VCTK_16k_studio3.tar.gz	369abc1b67aa03c404af69807c3ea107	9.9 GB
VCTK_16k_waitingroom1.tar.gz	93138494242b4159f196d70ae728d258	9.8 GB
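Because each archive is roughly 10 GB, it is worth checking the MD5 sums after downloading. The following is a minimal sketch using only the Python standard library. The checksums are copied from the file list above; the function names `md5_of` and `verify`, the chunk size, and the assumption that the archives sit in a single local directory are illustrative choices, not part of the dataset's tooling.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file list of this repository.
EXPECTED_MD5 = {
    "VCTK_16k_confroom1.tar.gz": "ee25ef77005d7e8db04db34091292cf7",
    "VCTK_16k_livingroom1.tar.gz": "db0228cd330133c25bb88c0851b11747",
    "VCTK_16k_studio2.tar.gz": "16b56ead4e04dd7cc7538c7057499bc6",
    "VCTK_16k_studio3.tar.gz": "369abc1b67aa03c404af69807c3ea107",
    "VCTK_16k_waitingroom1.tar.gz": "93138494242b4159f196d70ae728d258",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory="."):
    """Map each archive name to True (match), False (mismatch) or None (not found)."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = (md5_of(path) == expected) if path.exists() else None
    return results


if __name__ == "__main__":
    labels = {True: "OK", False: "MISMATCH", None: "missing"}
    for name, ok in verify().items():
        print(f"{name}: {labels[ok]}")
```

Reading in chunks keeps memory use flat regardless of archive size. An archive reported as MISMATCH is most likely a truncated or corrupted download and should be fetched again.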
