ADEPT: A Dataset for Evaluating Prosody Transfer

Torresquintero, Alexandra; Teh, Tian Huey; Wallis, Christopher G. R.; Staib, Marlene; Mohan, Devang S Ram; Hu, Vivian; Foglianti, Lorenzo; Gao, Jiameng; King, Simon

doi:10.5281/zenodo.5117102

Published June 14, 2021 | Version 1

Dataset Open

1. Papercup Technologies Ltd
2. University of Edinburgh

The ADEPT dataset consists of prosodically varied natural speech samples for evaluating prosody transfer in English text-to-speech models. The samples include global variations reflecting emotion and interpersonal attitude, and local variations reflecting topical emphasis, propositional attitude, syntactic phrasing and marked tonicity.

Text (.txt) and audio (.wav) files are organised according to the folder structure {speech_class}/{subcategory_or_interpretation}/{filename}, where filename follows the naming convention {speaker}_{utterance_id}. The two speakers are 'ad00' (female voice) and 'ad01' (male voice). For classes with multiple interpretations, we provide the interpretations used in the disambiguation tasks in 'adept_prompts.json'.
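As an illustration of the folder structure and naming convention described above, the following is a minimal sketch of how a file path might be split into its parts. The function name `parse_adept_path`, the `AdeptFile` record, and the example path are hypothetical and not part of the dataset; the sketch also assumes paths are given relative to the folder that contains the speech-class directories, and that the speaker code never contains an underscore.

```python
from pathlib import PurePosixPath
from typing import NamedTuple

# Speaker codes as stated in the dataset description.
SPEAKERS = {"ad00": "female", "ad01": "male"}


class AdeptFile(NamedTuple):
    """Hypothetical record for one parsed path (not defined by the dataset)."""
    speech_class: str
    subcategory_or_interpretation: str
    speaker: str
    utterance_id: str
    extension: str


def parse_adept_path(relative_path: str) -> AdeptFile:
    """Split '{speech_class}/{subcategory_or_interpretation}/{speaker}_{utterance_id}.ext'."""
    path = PurePosixPath(relative_path)
    if len(path.parts) != 3:
        raise ValueError(f"expected three path components, got: {relative_path!r}")
    speech_class, subcategory, _filename = path.parts

    # Assumption: everything before the first underscore is the speaker code.
    speaker, separator, utterance_id = path.stem.partition("_")
    if not separator or speaker not in SPEAKERS:
        raise ValueError(f"unexpected filename in: {relative_path!r}")

    return AdeptFile(
        speech_class=speech_class,
        subcategory_or_interpretation=subcategory,
        speaker=speaker,
        utterance_id=utterance_id,
        extension=path.suffix.lstrip("."),
    )
```

For example, calling `parse_adept_path("some_class/some_subcategory/ad01_0001.wav")` on this placeholder path would yield a record whose `speaker` field is `ad01`, which `SPEAKERS` maps to the male voice.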

The corpus only includes prosodic variations that listeners are able to distinguish with reasonable accuracy, and we report these figures as a benchmark against which text-to-speech prosody transfer can be compared. More details can be found in our pre-print about the dataset (https://arxiv.org/abs/2106.08321).

Files

Files (37.8 MB)

Name	Size
ADEPT.zip (md5:9c5078e11bbaa77ed95ea25e13c3a995)	37.8 MB
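The MD5 checksum listed above can be used to check that a downloaded copy of ADEPT.zip is intact. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `verify_download` are illustrative, and the local file location is an assumption left to the reader.

```python
import hashlib

# Checksum as listed for ADEPT.zip on this record.
EXPECTED_MD5 = "9c5078e11bbaa77ed95ea25e13c3a995"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` matches the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

Calling `verify_download("ADEPT.zip")` on a local copy should return `True` for an uncorrupted download; a `False` result suggests the file should be fetched again.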

Additional details

Is cited by: Preprint: arXiv:2106.08321 (arXiv)

	All versions	This version
Views	1,244	1,206
Downloads	383	378
Data volume	17.8 GB	17.6 GB
