Published May 30, 2025 | Version v1
Dataset Open

Dataset: EmoSpoof-TTS

  • 1. Center for Language and Speech Processing, Johns Hopkins University
  • 2. Language Technologies Institute, Carnegie Mellon University

Description

We introduce and release EmoSpoof-TTS, a corpus of emotionally expressive synthetic speech generated with recent text-to-speech (TTS) models, to facilitate research on how emotion affects anti-spoofing models. It contains 36,000 synthesized speech samples covering four emotions (Happiness, Anger, Sadness, and Neutral), 10 speakers (5 male, 5 female), and 3 TTS models (StyleTTS2, F5-TTS, CosyVoice). The bona fide samples are from the Emotional Speech Database (ESD).

Files

CosyVoice.zip

Files (6.7 GB)

Checksum                              Size
md5:b4a8257cb6acbba8c99903929b8eaab4  2.7 GB
md5:2f05f42cc810a90f5870f22b9636bf73  1.1 GB
md5:58fa272f832d8d0ae5944368865bb25e  5.5 kB
md5:cc70b4d6da9f2ecde9637b57b5e56726  2.9 GB
md5:cbb3439294792e906d18d12a6988f89e  116.6 kB
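
The listing above gives an MD5 checksum for each file, so a download can be checked for corruption before it is unpacked. The following is a minimal sketch of one way to do that in Python using only the standard library. The helper names `md5_of_file` and `verify_md5` are hypothetical and are not part of the dataset, and the local file path is whatever name the archive was saved under.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so that multi-gigabyte archives do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected_md5):
    """Compare a file's MD5 digest with the value shown in the file listing.

    Accepts the checksum with or without the leading "md5:" label
    (hypothetical convenience, matching how the listing prints it).
    """
    expected = expected_md5.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

To use it, pass the path of a downloaded file together with the checksum listed for that same file; a `False` result suggests an incomplete or corrupted download. MD5 is adequate here for detecting transfer errors, though it is not a guarantee against deliberate tampering.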

Additional details