ASVspoof 2019 LA Listening Test Data for Partial Rank Similarity MOS Prediction
Creators
- 1. National Institute of Informatics, Japan; University of Edinburgh, UK
- 2. EURECOM, France
- 3. Inria, France
- 4. National Institute of Informatics, Japan
- 5. University of Eastern Finland, Finland
- 6. NEC, Japan
Description
This dataset is a derivative work of the ASVspoof 2019 LA condition listening test data found here:
https://datashare.ed.ac.uk/handle/10283/3336
-> LA.zip
"ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech"
Xin Wang, Junichi Yamagishi, Massimiliano Todisco, Héctor Delgado, Andreas Nautsch, Nicholas Evans, Md Sahidullah, Ville Vestman, Tomi Kinnunen, Kong Aik Lee, Lauri Juvela, Paavo Alku, Yu-Huai Peng, Hsin-Te Hwang, Yu Tsao, Hsin-Min Wang, Sébastien Le Maguer, Markus Becker, Fergus Henderson, Rob Clark, Yu Zhang, Quan Wang, Ye Jia, Kai Onuma, Koji Mushika, Takashi Kaneda, Yuan Jiang, Li-Juan Liu, Yi-Chiao Wu, Wen-Chin Huang, Tomoki Toda, Kou Tanaka, Hirokazu Kameoka, Ingmar Steiner, Driss Matrouf, Jean-François Bonastre, Avashna Govender, Srikanth Ronanki, Jing-Xuan Zhang, Zhen-Hua Ling.
Computer Speech and Language Volume 64, 2020.
This form of the data was used for the PRS paper accepted to ASRU 2023:
"Partial Rank Similarity Minimization Method for Quality MOS Prediction of
Unseen Speech Synthesis Systems in Zero-shot and Semi-supervised Setting."
Hemant Yadav, Erica Cooper, Junichi Yamagishi, Sunayana Sitaram, Rajiv Ratn Shah.
Modifications to the original data include converting audio from FLAC to WAV, sv56 normalization, conversion of labels from a 0-9 rating scale to a 1-5 scale, and creation of training/development/testing splits.
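The exact label mapping is not specified in this record. As an illustration only, the sketch below assumes a simple linear rescaling from the 0-9 scale to the 1-5 scale; the function name `rescale_rating` and its parameters are hypothetical and are not taken from the dataset or the paper.

```python
def rescale_rating(score, src_min=0.0, src_max=9.0, dst_min=1.0, dst_max=5.0):
    """Linearly map a rating from [src_min, src_max] to [dst_min, dst_max].

    ASSUMPTION: a linear mapping is used here for illustration; the dataset
    authors may have used a different conversion.
    """
    if not src_min <= score <= src_max:
        raise ValueError(f"score {score} is outside [{src_min}, {src_max}]")
    # Position of the score within the source range, between 0 and 1.
    fraction = (score - src_min) / (src_max - src_min)
    # Same relative position within the destination range.
    return dst_min + fraction * (dst_max - dst_min)


if __name__ == "__main__":
    for raw in (0, 3, 9):
        print(raw, "->", round(rescale_rating(raw), 3))
```

Under this assumption, the endpoints 0 and 9 map to 1 and 5, and intermediate ratings keep their relative position on the scale.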
Files
LICENSE.txt
Additional details
References
- Wang, Xin et al. "ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech." Computer Speech and Language Volume 64, 2020.