Published January 25, 2021 | Version v1.0.0
Software (Open Access)

nii-yamagishilab/ASVspoof2019-LA-human-assessment-data: Release of the ASVspoof2019 LA human assessment results, ver1.0

  • National Institute of Informatics

Description

=====================================================================

               Human Perceptual Assessment Data on ASVspoof2019 LA Database

=====================================================================

Authors:

Xin Wang(1), Junichi Yamagishi(1), Massimiliano Todisco(2), 

Hector Delgado(2), Andreas Nautsch(2), Nicholas Evans(2), 

Md Sahidullah(3), Ville Vestman(4), Tomi Kinnunen(4), Kong Aik Lee(5)

 

Affiliations (at the time of data collection): 

(1)National Institute of Informatics, Japan 

(2)EURECOM, France

(3)Université de Lorraine, CNRS, Inria, France

(4)University of Eastern Finland, Finland

(5)NEC Corp., Japan

 

Introduction

Automatic speaker verification (ASV) is one of the most natural and convenient means of biometric person recognition. Unfortunately, just like all other biometric systems, ASV is vulnerable to spoofing, also referred to as "presentation attacks." These vulnerabilities are generally unacceptable and call for spoofing countermeasures, or "presentation attack detection" systems. In addition to impersonation, ASV systems are vulnerable to replay, speech synthesis, and voice conversion attacks.

The ASVspoof challenge initiative was created to foster research on anti-spoofing and to provide common platforms for the assessment and comparison of spoofing countermeasures. The first edition, ASVspoof 2015, focused on countermeasures for the detection of text-to-speech synthesis (TTS) and voice conversion (VC) attacks. The second edition, ASVspoof 2017, focused instead on replay spoofing attacks and countermeasures. The ASVspoof 2019 edition is the first to consider all three spoofing attack types within a single challenge. While the attacks originate from the same source database and the same underlying protocol, they are explored in two specific use-case scenarios. Spoofing attacks within a logical access (LA) scenario are generated with the latest speech synthesis and voice conversion technologies, including state-of-the-art neural acoustic and waveform modeling techniques. Replay spoofing attacks within a physical access (PA) scenario are generated through carefully controlled simulations that support much more revealing analysis than was previously possible.

In the ASVspoof 2019 database paper, we described a human assessment of the spoofed data in the logical access scenario. It demonstrated that the spoofed data in the ASVspoof 2019 database have varying degrees of perceived quality and similarity to the target speakers, including spoofed data that cannot be distinguished from bona fide utterances even by human subjects. 

In this repository, we release the human assessment data for the ASVspoof 2019 database LA scenario.

 

Notice

This repository releases only the assessment results and related meta-information. Waveforms used in the human assessment should be downloaded from the official ASVspoof 2019 database link: https://datashare.is.ed.ac.uk/handle/10283/3336

Waveform files used for the human assessment are listed in ./wav_file_sample_A.txt and ./wav_file_sample_B.txt. Please check the README for more details.
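As a convenience, the two list files can be loaded programmatically to check a local copy of the database. The sketch below is a minimal example, assuming the list files are plain text with one waveform filename per line; the local directory layout (`LA_WAV_DIR`) is hypothetical and should be adjusted to wherever you extracted the official ASVspoof 2019 LA data.

```python
from pathlib import Path

# Hypothetical paths: adjust to your local copies of this repository
# and of the official ASVspoof 2019 LA database.
REPO_DIR = Path(".")
LA_WAV_DIR = Path("ASVspoof2019_LA_eval/flac")  # assumed layout, not verified


def load_file_list(list_path):
    """Read one waveform filename per line, skipping blank lines.

    Assumes the list files are plain text, one entry per line.
    """
    with open(list_path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


# Load both sample lists if present, then report which listed
# waveforms are missing from the local database copy.
listed = []
for name in ("wav_file_sample_A.txt", "wav_file_sample_B.txt"):
    path = REPO_DIR / name
    if path.exists():
        entries = load_file_list(path)
        print(f"{name}: {len(entries)} files")
        listed.extend(entries)

missing = [n for n in listed if not (LA_WAV_DIR / n).exists()]
print(f"{len(missing)} of {len(listed)} listed files not found locally")
```

Running this after downloading the official database should report zero missing files; a non-zero count indicates an incomplete download or a different directory layout.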

Details of the spoofed speech data and the TTS/VC systems are described in this paper: https://doi.org/10.1016/j.csl.2020.101114 (or the arXiv version: https://arxiv.org/abs/1911.01601).

 

References

If you publish using any of the data in this dataset, please cite the following papers: 

 

@article{WANG2020101114,

 author = {Wang, Xin and Yamagishi, Junichi and Todisco, Massimiliano and Delgado, H{\'{e}}ctor and Nautsch, Andreas and Evans, Nicholas and Sahidullah, Md and Vestman, Ville and Kinnunen, Tomi and Lee, Kong Aik and Juvela, Lauri and Alku, Paavo and Peng, Yu-Huai and Hwang, Hsin-Te and Tsao, Yu and Wang, Hsin-Min and Maguer, S{\'{e}}bastien Le and Becker, Markus and Henderson, Fergus and Clark, Rob and Zhang, Yu and Wang, Quan and Jia, Ye and Onuma, Kai and Mushika, Koji and Kaneda, Takashi and Jiang, Yuan and Liu, Li-Juan and Wu, Yi-Chiao and Huang, Wen-Chin and Toda, Tomoki and Tanaka, Kou and Kameoka, Hirokazu and Steiner, Ingmar and Matrouf, Driss and Bonastre, Jean-Fran{\c{c}}ois and Govender, Avashna and Ronanki, Srikanth and Zhang, Jing-Xuan and Ling, Zhen-Hua},

 doi = {10.1016/j.csl.2020.101114},

 issn = {0885-2308},

 journal = {Computer Speech {\&} Language},

 keywords = {ASVspoof challenge,Anti-spoofing,Automatic speaker verification,Biometrics,Countermeasure,Media forensics,Presentation attack,Presentation attack detection,Replay,Text-to-speech synthesis,Voice conversion},

 pages = {101114},

 title = {{ASVspoof 2019: a large-scale public database of synthesized, converted and replayed speech}},

 url = {http://www.sciencedirect.com/science/article/pii/S0885230820300474},

 year = {2020}

}

 

@inproceedings{Todisco2019,

 author = {Todisco, Massimiliano and Wang, Xin and Vestman, Ville and Sahidullah, Md. and Delgado, H{\'{e}}ctor and Nautsch, Andreas and Yamagishi, Junichi and Evans, Nicholas and Kinnunen, Tomi H and Lee, Kong Aik},

 booktitle = {Proc. Interspeech},

 doi = {10.21437/Interspeech.2019-2249},

 pages = {1008--1012},

 title = {{ASVspoof 2019: future horizons in spoofed and fake audio detection}},

 url = {http://dx.doi.org/10.21437/Interspeech.2019-2249},

 year = {2019}

}

Files

nii-yamagishilab/ASVspoof2019-LA-human-assessment-data-v1.0.0.zip (3.3 MB)
