Pretrained Models for An Investigation of the Relation Between Grapheme Embeddings and Pronunciation for Tacotron-based Systems

Perquin, Antoine; Cooper, Erica; Yamagishi, Junichi

doi:10.5281/zenodo.6534268

Published April 4, 2021 | Version v1

Resource type: Report | Access: Open

1. Univ Rennes, CNRS, IRISA, France
2. National Institute of Informatics, Japan

Pretrained models for "An Investigation of the Relation Between Grapheme Embeddings and Pronunciation for Tacotron-based Systems," Antoine Perquin, Erica Cooper, Junichi Yamagishi.
https://arxiv.org/abs/2010.10694

This is a derived work of the SIWIS corpus:
Yamagishi, Junichi; Honnet, Pierre-Edouard; Garner, Philip; Lazaridis, Alexandros. (2017). The SIWIS French Speech Synthesis Database, 2016 [dataset]. University of Edinburgh. School of Informatics. The Centre for Speech Technology Research. https://doi.org/10.7488/ds/1705.

End-to-end models, particularly Tacotron-based ones, are currently a popular solution for text-to-speech synthesis. They allow the production of high-quality synthesized speech with little to no text preprocessing. Indeed, they can be trained using either graphemes or phonemes as input directly. However, in the case of grapheme inputs, little is known concerning the relation between the underlying representations learned by the model and word pronunciations. This work investigates this relation in the case of a Tacotron model trained on French graphemes. Our analysis shows that grapheme embeddings are related to phoneme information despite no such information being present during training. Thanks to this property, we show that grapheme embeddings learned by Tacotron models can be useful for tasks such as grapheme-to-phoneme conversion and control of the pronunciation in synthetic speech.

Files (1.1 GB)

Name	Size	MD5
french-tacotron-models.zip	1.1 GB	8f37e55acdeca8d84480d1c35a462a02
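Because the archive is large, it may be worth checking the download against the MD5 digest listed for it before unpacking. The following is a minimal sketch, not part of the original record: it assumes the archive was saved under its listed name in the current directory, and the helper names `md5_of_file` and `verify_archive` are hypothetical.

```python
import hashlib
from pathlib import Path

# MD5 digest copied from the file listing of this record.
EXPECTED_MD5 = "8f37e55acdeca8d84480d1c35a462a02"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` matches the expected MD5 digest."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local path: the archive saved in the current working directory.
    archive = Path("french-tacotron-models.zip")
    if archive.exists():
        print("checksum OK" if verify_archive(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

A mismatch usually points to an incomplete or corrupted download, in which case fetching the file again is the simplest remedy.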

Additional details

Perquin, Antoine et al. (2021). "An Investigation of the Relation Between Grapheme Embeddings and Pronunciation for Tacotron-based Systems."
Yamagishi, Junichi et al. (2017). "The SIWIS French Speech Synthesis Database."

