STOPA: A Dataset of Systematic VariaTion Of DeePfake Audio for Open-Set Source Tracing and Attribution
Description
STOPA is a dataset for source tracing and attribution of deepfake audio, i.e., identifying which synthesis system generated a given utterance. It includes over 700,000 synthetic speech samples generated by 13 distinct synthesis systems, with controlled variation across 8 acoustic models and 6 vocoders.
The dataset follows the ASVspoof 2019 Logical Access protocols and uses speakers from the VCTK corpus. It supports open-world evaluation, where test utterances are compared against individual source hypotheses without assuming closed-set conditions. Rich metadata and pairwise trial protocols enable fine-grained attribution at the level of attack, acoustic model, or vocoder.
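
To illustrate the pairwise, open-set trial setup, the sketch below scores a single test-versus-source trial with cosine similarity between embeddings and thresholds the score to accept or reject the source hypothesis. The embedding extractor, dimensionality, and threshold here are placeholders, not part of the STOPA protocols.

import numpy as np

def cosine_score(test_emb: np.ndarray, source_emb: np.ndarray) -> float:
    # Cosine similarity between the test utterance and the enrolled source hypothesis.
    return float(np.dot(test_emb, source_emb) /
                 (np.linalg.norm(test_emb) * np.linalg.norm(source_emb)))

def accept_hypothesis(test_emb, source_emb, threshold=0.5):
    # Open-set decision: accept only if the score clears the threshold;
    # otherwise the utterance may come from an unseen (out-of-set) system.
    return cosine_score(test_emb, source_emb) >= threshold

# Hypothetical 256-dimensional embeddings standing in for a real extractor.
rng = np.random.default_rng(0)
test_emb, source_emb = rng.normal(size=256), rng.normal(size=256)
print(accept_hypothesis(test_emb, source_emb))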
All audio is provided as 16-bit PCM WAV at 16 kHz. Metadata includes transcriptions, silence regions, word error rate (WER), and system labels.
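
As a reading aid, the following sketch loads one 16-bit PCM WAV file with Python's standard wave module and parses a metadata table. The paths wav/example_utt.wav and metadata.tsv and the tab-separated layout are assumptions; consult README.txt for the actual file names and columns.

import csv
import wave

import numpy as np

def load_utterance(path):
    # Read a 16-bit PCM WAV file (mono assumed) into an int16 array.
    with wave.open(path, "rb") as wav:
        assert wav.getsampwidth() == 2      # 2 bytes per sample = 16-bit PCM
        rate = wav.getframerate()           # expected to be 16000 Hz
        samples = np.frombuffer(wav.readframes(wav.getnframes()), dtype=np.int16)
    return samples, rate

def load_metadata(path):
    # Read metadata rows (transcription, silence regions, WER, system labels).
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))

samples, rate = load_utterance("wav/example_utt.wav")   # hypothetical path
rows = load_metadata("metadata.tsv")                     # hypothetical file name
print(rate, samples.shape, len(rows))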
License: CC BY 4.0
Audio type: Synthetic only
Language: English (VCTK-based)
Files (44.9 GB)
- README.txt
Additional details
Related works
- Is described by: 10.21437/Interspeech.2025-2065 (DOI)
Dates
- Accepted: 2025-05-19 (accepted to Interspeech 2025)