DDS (Device-Degraded Speech) Dataset - VCTK portion - Part 1

Li, Haoyu; Yamagishi, Junichi

doi:10.5281/zenodo.5499506

Published September 20, 2021 | Version 0.8

Video/Audio Open

1. National Institute of Informatics

The DDS (Device-Degraded Speech) dataset provides aligned parallel recordings of high-quality speech (recorded in professional studios) and many low-quality versions of the same speech, for a total of approximately 2,000 hours of speech data.

DDS is built on top of two datasets: DAPS and VCTK. We play back clean speech recordings (4 hours from DAPS and 8 hours from VCTK) and re-record the waveforms in nine environments (two offices, two conference rooms, three studios, one living room, one waiting room) on three different devices (one MEMS microphone and two condenser microphones), producing 27 different recording conditions. Moreover, each condition consists of multiple recordings made at 6 different microphone positions to simulate various signal-to-noise ratio (SNR) and reverberation levels.
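The condition count above can be checked with a short enumeration. This is only an illustrative sketch: the environment and device labels are hypothetical placeholders (only some environment names appear in this record's file names), and the hour estimate assumes that all 12 hours of clean speech are re-recorded in every condition at every microphone position.

```python
from itertools import product

# Hypothetical labels: nine environments, three devices, six positions.
ENVIRONMENTS = [
    "office1", "office2",
    "confroom1", "confroom2",
    "studio1", "studio2", "studio3",
    "livingroom1", "waitingroom1",
]
DEVICES = ["mems", "condenser1", "condenser2"]
POSITIONS = range(1, 7)

# One recording condition per (environment, device) pair.
conditions = list(product(ENVIRONMENTS, DEVICES))

# One re-recording per (environment, device, microphone position) triple.
recordings = list(product(ENVIRONMENTS, DEVICES, POSITIONS))

# Clean source material: 4 hours from DAPS plus 8 hours from VCTK.
clean_hours = 4 + 8
total_hours = clean_hours * len(recordings)

print(len(conditions))   # 27
print(len(recordings))   # 162
print(total_hours)       # 1944
```

Under these assumptions the estimate of 1,944 hours is consistent with the figure of approximately 2,000 hours quoted in the description.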

arXiv: https://arxiv.org/abs/2109.07931

The whole dataset is split into 3 repositories (one for the DAPS portion, two for the VCTK portion). This repository contains part 1 of the VCTK portion.

Repository links for DDS v0.8:

DAPS portion: https://zenodo.org/record/5464104
VCTK portion part1: https://zenodo.org/record/5499506
VCTK portion part2: https://zenodo.org/record/5501697

Notes

This dataset is based on the VCTK dataset (DOI: 10.7488/ds/2645). The VCTK dataset is distributed under the Open Data Commons Attribution License (ODC-By) v1.0.

Files

Files (41.3 GB)

Name	MD5	Size
misc.zip	1b53eadecbc7de904cc31ad01e1f1a3f	44.9 MB
README	6699cddb5cfc8869c2c3f55b00b55f81	5.7 kB
text.zip	f0d254542aad998c9e3e266a261d19a0	2.3 MB
VCTK_16k_clean.tar.gz	ddda85d3c2004cf4b077c30cbc4c20f5	478.0 MB
VCTK_16k_confroom2.tar.gz	eb7f20bb17c5802e086f7b9268a78617	10.3 GB
VCTK_16k_office1.tar.gz	a43220935957870768a0c5e77a3b4584	10.1 GB
VCTK_16k_office2.tar.gz	ac1909622d8536343cc81b9d806526d8	10.1 GB
VCTK_16k_studio1.tar.gz	09a5bb607916937ac623d91130127a6d	10.2 GB
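Because several of these archives are around 10 GB each, it may be worth checking the downloads against the MD5 digests in the file listing above before extracting them. The following is a minimal sketch using only the Python standard library; the helper names `md5_of_file` and `verify_downloads` and the `downloads` directory are illustrative assumptions, not part of the dataset or of any Zenodo tooling.

```python
import hashlib
from pathlib import Path

# MD5 digests copied from the file listing of this record.
EXPECTED_MD5 = {
    "misc.zip": "1b53eadecbc7de904cc31ad01e1f1a3f",
    "README": "6699cddb5cfc8869c2c3f55b00b55f81",
    "text.zip": "f0d254542aad998c9e3e266a261d19a0",
    "VCTK_16k_clean.tar.gz": "ddda85d3c2004cf4b077c30cbc4c20f5",
    "VCTK_16k_confroom2.tar.gz": "eb7f20bb17c5802e086f7b9268a78617",
    "VCTK_16k_office1.tar.gz": "a43220935957870768a0c5e77a3b4584",
    "VCTK_16k_office2.tar.gz": "ac1909622d8536343cc81b9d806526d8",
    "VCTK_16k_studio1.tar.gz": "09a5bb607916937ac623d91130127a6d",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory, expected=EXPECTED_MD5):
    """Map each expected file name to True if it exists and its digest matches."""
    results = {}
    for name, md5 in expected.items():
        path = Path(directory) / name
        results[name] = path.is_file() and md5_of_file(path) == md5
    return results


if __name__ == "__main__":
    # "downloads" is a hypothetical local directory holding the fetched files.
    for name, ok in verify_downloads("downloads").items():
        print(f"{name}: {'OK' if ok else 'MISSING OR MISMATCH'}")
```

Reading in fixed-size chunks keeps memory use constant regardless of archive size; a file reported as missing or mismatched would typically just need to be downloaded again.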
