Published September 29, 2023 | Version 1.0.0
Dataset Open

ODSS: An Open Dataset of Synthetic Speech

  • 1. Fraunhofer Institute for Digital Media Technology
  • 2. Centre for Research and Technology Hellas


ODSS is a multilingual, multispeaker dataset of synthetic and natural speech, designed to support research on synthetic speech detection and the benchmarking of new detection methods.

ODSS comprises audio utterances generated from text by state-of-the-art synthesis methods, paired with their corresponding natural counterparts. The synthetic audio data includes several languages, with an equal representation of genders.

Natural and synthetic speech audio files within ODSS are released under the CC-BY-SA 4.0 license. Use, extension, and redistribution by the research community are strongly encouraged.


Files (2.4 GB)


Additional details

Related works

Is derived from
Dataset: 10.7488/ds/2645 (DOI)
Dataset: 10.21437/Interspeech.2021-1599 (DOI)
Dataset: 979-10-95546-34-4 (ISBN)
Dataset: 10.1007/978-3-030-87626-5_15 (DOI)
Is described by
Conference proceeding: 10.5281/zenodo.10124945 (DOI)

Funding

VERification Assisted by Artificial Intelligence (101070093)
European Commission
AI4Media – A European Excellence Centre for Media, Society and Democracy (951911)
European Commission