There is a newer version of the record available.

Published May 23, 2022 | Version v1
Dataset Open

Data for the VoiceMOS Challenge 2022

  • 1. National Institute of Informatics
  • 2. Nagoya University

Description

This is the public release of the data for the first VoiceMOS Challenge.  The challenge had two tracks: a main track and an out-of-domain (OOD) track.  The data for the main track is known as the BVCC dataset, and contains samples from past Blizzard Challenges, Voice Conversion Challenges, and public samples from ESPnet-TTS, along with their mean opinion score (MOS) ratings collected in one unified listening test.  Standard training/development/testing splits from the challenge are also provided.  The OOD track contains samples from the Blizzard Challenge 2019 along with their ratings from the original, separate listening test.  We also include the scoring scripts that were used for the challenge.
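As a rough illustration of what a MOS rating is, the sketch below averages individual listener scores per utterance and then averages the utterance-level values per system. The rating rows, the 1 to 5 scale, the identifiers, and the function names are all assumptions made for this example; they are not taken from the dataset or from the challenge's scoring scripts.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical ratings: (system_id, utterance_id, listener_id, score).
# These rows are invented for illustration; a 1-5 scale is assumed.
ratings = [
    ("sysA", "sysA-utt1", "l1", 4),
    ("sysA", "sysA-utt1", "l2", 5),
    ("sysA", "sysA-utt2", "l1", 3),
    ("sysA", "sysA-utt2", "l3", 4),
    ("sysB", "sysB-utt1", "l2", 2),
    ("sysB", "sysB-utt1", "l3", 1),
]


def utterance_mos(rows):
    """Average all listener scores given to each utterance."""
    by_utt = defaultdict(list)
    for _system, utt, _listener, score in rows:
        by_utt[utt].append(score)
    return {utt: mean(scores) for utt, scores in by_utt.items()}


def system_mos(rows):
    """Average the utterance-level MOS values over each system's utterances."""
    utt_system = {utt: system for system, utt, _listener, _score in rows}
    by_system = defaultdict(list)
    for utt, mos in utterance_mos(rows).items():
        by_system[utt_system[utt]].append(mos)
    return {system: mean(values) for system, values in by_system.items()}


print(utterance_mos(ratings))
print(system_mos(ratings))
```

Averaging per utterance first, rather than pooling every rating of a system, keeps utterances with more listeners from dominating the system-level value; whether the released scripts aggregate this way is not stated here.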

Samples from Blizzard Challenges may NOT be redistributed.  Blizzard samples are not included in this dataset, but the scripts to download and preprocess them are included.  Please run all of the included scripts to obtain the full dataset.

BVCC reference:
Erica Cooper and Junichi Yamagishi, "How do Voices from Past Speech Synthesis Challenges Compare Today?" SSW 2021. https://arxiv.org/abs/2105.02373

The VoiceMOS Challenge:
Wen-Chin Huang, Erica Cooper, Yu Tsao, Hsin-Min Wang, Tomoki Toda, Junichi Yamagishi, "The VoiceMOS Challenge 2022," submitted to Interspeech 2022.  https://arxiv.org/abs/2203.11389
https://voicemos-challenge-2022.github.io

The Blizzard Challenges 2008, 2009, 2010, 2011, 2013, 2016, 2019:
V. Karaiskos, S. King, R. A. Clark, and C. Mayo, "The Blizzard Challenge 2008," in Proc. Blizzard Challenge Workshop, 2008.
A. W. Black, S. King, and K. Tokuda, "The Blizzard Challenge 2009," in Proc. Blizzard Challenge, 2009.
S. King and V. Karaiskos, "The Blizzard Challenge 2010," 2010.
S. King and V. Karaiskos, "The Blizzard Challenge 2011," 2011.
S. King and V. Karaiskos, "The Blizzard Challenge 2013," 2013.
S. King and V. Karaiskos, "The Blizzard Challenge 2016," 2016.
Z. Wu, Z. Xie, and S. King, "The Blizzard Challenge 2019," 2019.

The Voice Conversion Challenges 2016, 2018, and 2020:
T. Toda, L.-H. Chen, D. Saito, F. Villavicencio, M. Wester, Z. Wu, and J. Yamagishi, "The Voice Conversion Challenge 2016," Interspeech, 2016.
J. Lorenzo-Trueba, J. Yamagishi, T. Toda, D. Saito, F. Villavicencio, T. Kinnunen, and Z. Ling, "The Voice Conversion Challenge 2018: Promoting development of parallel and nonparallel methods."
Z. Yi, W.-C. Huang, X. Tian, J. Yamagishi, R. K. Das, T. Kinnunen, Z. Ling, and T. Toda, "Voice Conversion Challenge 2020 — intra-lingual semi-parallel and cross-lingual voice conversion —," in Proc. Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020, 2020, pp. 80–98.

ESPnet-TTS:
S. Watanabe, T. Hori, S. Karita, T. Hayashi, J. Nishitoba, Y. Unno, N. Enrique Yalta Soplin, J. Heymann, M. Wiesner, N. Chen, A. Renduchintala, and T. Ochiai, "ESPnet: End-to-end speech processing toolkit," in Proceedings of Interspeech, 2018, pp. 2207–2211. [Online]. Available: http://dx.doi.org/10.21437/Interspeech.2018-1456

Files (286.9 MB)

  • 286.7 MB, md5:fc880c2a208c3285a47bd9a64f34eb11
  • 152.6 kB, md5:3f606518cd99138784380482d0c38e48
  • 34.0 kB, md5:11e21b41dd88f33b64c17481a9281722
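
The MD5 checksums listed above can be used to confirm that a download completed without corruption. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `verify` and the path in the usage comment are hypothetical, since the file names do not appear in this listing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected):
    """Compare a file's MD5 digest with a value such as 'md5:fc88...'."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex


# Usage (the path is a placeholder for wherever the archive was saved):
# verify("path/to/downloaded_archive", "md5:fc880c2a208c3285a47bd9a64f34eb11")
```

Reading in chunks keeps memory use flat for the 286.7 MB archive. A mismatch usually means the download was truncated or that the file belongs to a different version of the record.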
