Published September 29, 2023 | Version 1.0.0
Dataset | Open

ODSS: An Open Dataset of Synthetic Speech

  • 1. Fraunhofer Institute for Digital Media Technology
  • 2. Centre for Research and Technology Hellas

Description

ODSS is a multilingual, multispeaker dataset of synthetic and natural speech, designed to support research on, and benchmarking of, synthetic speech detection.

ODSS comprises audio utterances generated from text by state-of-the-art synthesis methods, paired with their corresponding natural counterparts. The synthetic audio data covers several languages, with equal representation of genders.

Natural and synthetic speech audio files within ODSS are released under the CC-BY-SA 4.0 license. Use, extension, and redistribution by the research community are strongly encouraged.

Files

Files (2.4 GB)

Name      Size    Checksum
odss.zip  2.4 GB  md5:9fd343968a12d6599a313b243a176ac6
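
The MD5 checksum listed above can be used to confirm that the 2.4 GB archive downloaded intact. The following is a minimal sketch in Python, not part of the dataset's own tooling. It assumes the archive was saved as odss.zip in the current working directory, and the helper names md5_of_file and verify_archive are hypothetical, introduced here only for illustration.

```python
import hashlib
from pathlib import Path

# Checksum copied from the file listing above.
EXPECTED_MD5 = "9fd343968a12d6599a313b243a176ac6"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use low for a multi-gigabyte archive.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    archive = Path("odss.zip")  # assumed local download location
    if archive.exists():
        print("checksum OK" if verify_archive(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

A mismatch usually indicates an incomplete or corrupted download, in which case fetching the archive again is the simplest remedy.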

Additional details

Related works

Is derived from
Dataset: 10.7488/ds/2645 (DOI)
Dataset: 10.21437/Interspeech.2021-1599 (DOI)
Dataset: 979-10-95546-34-4 (ISBN)
Dataset: 10.1007/978-3-030-87626-5_15 (DOI)
Is described by
Conference proceeding: 10.5281/zenodo.10124945 (DOI)

Funding

vera.ai – VERification Assisted by Artificial Intelligence (grant 101070093)
European Commission
AI4Media – A European Excellence Centre for Media, Society and Democracy (grant 951911)
European Commission