There is a newer version of this record available.

Video/Audio Open Access

PartialSpoof Database - Partially Spoofed Audio Dataset for Anti-spoofing

Zhang, Lin; Wang, Xin; Cooper, Erica; Yamagishi, Junichi; Patino, Jose; Evans, Nicholas

All existing databases of spoofed speech contain attack data that is spoofed in its entirety. In practice, it is entirely plausible that successful attacks can be mounted with utterances that are only partially spoofed. By definition, partially-spoofed utterances contain a mix of both spoofed and bona fide segments, which will likely degrade the performance of countermeasures trained with entirely spoofed utterances. This hypothesis raises the obvious question: ‘Can we detect partially spoofed audio?’ This paper introduces a new database of partially-spoofed data, named PartialSpoof, to help address this question. This new database enables us to investigate and compare the performance of countermeasures on both utterance- and segmental-level labels. Experimental results using the utterance-level labels reveal that the reliability of countermeasures trained to detect fully-spoofed data degrades substantially when they are tested with partially-spoofed data, whereas countermeasures trained on partially-spoofed data perform reliably on both fully- and partially-spoofed utterances. Additional experiments using segmental-level labels show that spotting injected spoofed segments within an utterance is a much more challenging task, even when the latest countermeasure models are used.

  • For the initial version of PartialSpoof v1.0
    • Arxiv: https://arxiv.org/abs/2104.02518
    • Samples: https://nii-yamagishilab.github.io/zlin-demo/IS2021/index.html
    • PartialSpoof Database v1.0: https://zenodo.org/record/4817532#.YQH3eRMzZhF
  • For the multi-task version of PartialSpoof v1.1
    • Arxiv: https://arxiv.org/abs/2107.14132
    • PartialSpoof Database v1.1 (only the segmental-level labels and README_v1.1 are updated): https://zenodo.org/record/5112031#.YQQA4S2l3iE

P.S. The file database_eval.tar.gz is split into 3 parts. Please download all parts, then concatenate and extract them using the following commands (the archive is gzip-compressed, so tar needs -z rather than -j):

cat database_eval.tar.gz.a* > database_eval.tar.gz
tar -zxvf database_eval.tar.gz
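
As a minimal sketch of the same steps, the reassembly can be wrapped in a small helper that refuses to run when no parts are present and checks the joined archive before extraction. The function name `join_parts` is hypothetical and not part of the database; the sketch assumes a POSIX shell (sh or bash) with `cat` and `gzip` available.

```shell
# Hypothetical helper (not shipped with the database): rebuild an archive
# that was split into <name>.aa, <name>.ab, ... and sanity-check the result.
join_parts() {
    archive="$1"
    # The shell expands the glob in sorted order (.aa, .ab, .ac),
    # which is the order in which the parts must be concatenated.
    set -- "$archive".a*
    if [ ! -e "$1" ]; then
        echo "no parts found for $archive" >&2
        return 1
    fi
    cat "$@" > "$archive" || return 1
    # gzip -t reads the whole stream and fails if it is truncated or corrupt,
    # which catches a missing or incomplete part.
    gzip -t "$archive"
}

# Example usage (run in the directory holding the three downloaded parts):
#   join_parts database_eval.tar.gz && tar -zxvf database_eval.tar.gz
```

Running the integrity test before `tar` means a partial download is reported up front rather than partway through extraction.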

 

This database was partially supported by the Japanese-French joint national VoicePersonae project, funded by JST CREST (JPMJCR18A6) and the ANR (ANR-18-JSTS-0001); by JST CREST Grants (JPMJCR20D3) and MEXT KAKENHI Grants (16H06302, 18H04120, 18H04112, 18KT0051), Japan; and by the Google AI for Japan program.
Files (9.9 GB)

Name                         Size       MD5
database_dev.tar.gz          2.0 GB     ddd4cd3221b7210ac879f67452fb209e
database_eval.tar.gz.aa      2.1 GB     2f2087bb3b9c32b9f2ac24028e77a734
database_eval.tar.gz.ab      2.1 GB     2e79c7cec8b05f4231e0144bd633174d
database_eval.tar.gz.ac      1.5 GB     df620d7f685514b95409530f6b9b1253
database_protocols.tar.gz    5.4 MB     699d81f020e4b7fa8f33747010e1cba8
database_train.tar.gz        2.0 GB     c4853ddd831e8e96b0e279fc0a512e7e
README                       19.8 kB    f0a0bf55be87a9b6b0b5116d69e3ec9f
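
The MD5 values in the file listing can be used to check each download before extracting it. The following is a minimal sketch, assuming GNU `md5sum` is installed (macOS ships `md5` instead, with a different output format); the helper name `verify_md5` is hypothetical and not part of the database.

```shell
# Hypothetical helper: compare a file's MD5 digest with an expected value.
verify_md5() {
    file="$1"
    expected="$2"
    # md5sum prints "<digest>  <filename>"; keep only the digest.
    # If the file is missing, the digest is empty and the check fails.
    actual=$(md5sum "$file" 2>/dev/null | cut -d ' ' -f 1)
    if [ "$actual" = "$expected" ]; then
        echo "OK   $file"
    else
        echo "FAIL $file" >&2
        return 1
    fi
}

# Example usage with checksums copied from the file listing:
#   verify_md5 database_dev.tar.gz       ddd4cd3221b7210ac879f67452fb209e
#   verify_md5 database_eval.tar.gz.aa   2f2087bb3b9c32b9f2ac24028e77a734
#   verify_md5 database_protocols.tar.gz 699d81f020e4b7fa8f33747010e1cba8
```

Note that the listing gives checksums for the three evaluation parts individually, so they should be verified before concatenation rather than after.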
  • Zhang, L., Wang, X., Cooper, E., Yamagishi, J., Patino, J., & Evans, N. (2021). An Initial Investigation for Detecting Partially Spoofed Audio. arXiv preprint arXiv:2104.02518.

  • Wang, X., Yamagishi, J., Todisco, M., Delgado, H., Nautsch, A., Evans, N., ... & Ling, Z. H. (2020). ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech. Computer Speech & Language, 64, 101114.
