Pretrained Models for An Investigation of the Relation Between Grapheme Embeddings and Pronunciation for Tacotron-based Systems
- 1. Univ Rennes, CNRS, IRISA, France
- 2. National Institute of Informatics, Japan
Description
Pretrained models for "An Investigation of the Relation Between Grapheme Embeddings and Pronunciation for Tacotron-based Systems," Antoine Perquin, Erica Cooper, Junichi Yamagishi.
https://arxiv.org/abs/2010.10694
This is a derived work of the SIWIS corpus:
Yamagishi, Junichi; Honnet, Pierre-Edouard; Garner, Philip; Lazaridis, Alexandros. (2017). The SIWIS French Speech Synthesis Database, 2016 [dataset]. University of Edinburgh. School of Informatics. The Centre for Speech Technology Research. https://doi.org/10.7488/ds/1705.
End-to-end models, particularly Tacotron-based ones, are currently a popular solution for text-to-speech synthesis. They allow the production of high-quality synthesized speech with little to no text preprocessing. Indeed, they can be trained using either graphemes or phonemes as input directly. However, in the case of grapheme inputs, little is known concerning the relation between the underlying representations learned by the model and word pronunciations. This work investigates this relation in the case of a Tacotron model trained on French graphemes. Our analysis shows that grapheme embeddings are related to phoneme information despite no such information being present during training. Thanks to this property, we show that grapheme embeddings learned by Tacotron models can be useful for tasks such as grapheme-to-phoneme conversion and control of the pronunciation in synthetic speech.
Files
| Name | Size | MD5 |
|---|---|---|
| french-tacotron-models.zip | 1.1 GB | 8f37e55acdeca8d84480d1c35a462a02 |
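Because the archive is large, it may be worth checking a download against the MD5 checksum listed for it. The following is a minimal sketch of one way to do that with Python's standard library; it is not part of the released models. The function names `md5_of_file` and `matches_expected` are illustrative, and the script assumes the archive was saved under its original name in the current working directory.

```python
import hashlib
from pathlib import Path

# Checksum and filename as listed in this record's Files section.
EXPECTED_MD5 = "8f37e55acdeca8d84480d1c35a462a02"
ARCHIVE_NAME = "french-tacotron-models.zip"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 equals the expected digest (case-insensitive)."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumption: the archive sits in the current working directory.
    archive = Path(ARCHIVE_NAME)
    if not archive.exists():
        print(f"{archive} not found; download it first")
    elif matches_expected(archive):
        print("checksum OK")
    else:
        print("checksum mismatch; the download may be incomplete or corrupted")
```

Reading in 1 MiB chunks avoids loading the whole 1.1 GB file into memory. MD5 is used here only because it is the checksum the record provides; it detects accidental corruption, not deliberate tampering.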
Additional details
References
- Perquin, Antoine et al. (2021). "An Investigation of the Relation Between Grapheme Embeddings and Pronunciation for Tacotron-based Systems."
- Yamagishi, Junichi et al. (2017). "The SIWIS French Speech Synthesis Database."