CRPIH_UVigo-GL-Voices: Galician TTS dataset

Centro Ramón Piñeiro para a Investigación en Humanidades (CRPIH); Multimedia Technology Group (GTM) – atlanTTic Research Center for Telecommunication Technologies

doi:10.5281/zenodo.8027725

Published June 13, 2023 | Version 1.0.0.

Dataset Open

Centro Ramón Piñeiro para a Investigación en Humanidades (CRPIH)
Multimedia Technology Group (GTM) – atlanTTic Research Center for Telecommunication Technologies¹

1. Universidade de Vigo

CRPIH_UVigo-GL-Voices is a Galician TTS multi-speaker dataset containing audio recordings from four different speakers (two female and two male voices). The characteristics of each voice are detailed in the table below:

Voice name	Gender	Speaker	Recording	Utterances	Duration	Sampling rate	Format
Sabela	Female	Professional radio broadcaster	Professional studio	9,999	14h 28m	16kHz / 44kHz	16-bit PCM
Icía	Female	Amateur	Semi-professional studio	2,950	4h 5m	16kHz / 96kHz	16/24-bit PCM
Iago	Male	Amateur	Radio studio	1,316	1h 13m	16kHz / 48kHz	16-bit PCM
Paulo	Male	Amateur	Radio studio	1,316	1h 15m	16kHz	16-bit PCM

Each speaker recorded a subset of utterances from a text corpus of 10,000 sentences, each between 1 and 44 words long. The corpus consists mainly of press excerpts, together with a small subset of manually designed sentences. The press excerpts were extracted from newspapers published before 2010 ("O Correo Galego", "Galicia Hoxe" and "Vieiros"), whereas the hand-crafted sentences were created at the CRPIH in 1999.

The data is organized into one folder per speaker. Each speaker's folder contains the following subdirectories:

txt → Audio transcripts encoded in ISO 8859-1.
fon → Phoneme-level forced alignment between phonemic transcriptions (provided by Cotovía) and audio recordings.
wav_[bit depth]bits_[sampling rate]kHz → WAV format audio files at [sampling rate]kHz [bit depth]-bit (see table above).
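
Because the transcripts are stored in ISO 8859-1 rather than UTF-8, they need to be decoded explicitly. The following is a minimal Python sketch, assuming an archive has been extracted so that a speaker folder contains a txt subdirectory. The function names and the example path are hypothetical and not part of the dataset.

```python
from pathlib import Path


def decode_transcript(raw: bytes) -> str:
    """Decode raw transcript bytes as ISO 8859-1 and trim surrounding whitespace."""
    # The dataset description states ISO 8859-1; decoding as UTF-8 would
    # fail or garble accented characters such as the "í" in "Icía".
    return raw.decode("iso-8859-1").strip()


def load_transcripts(txt_dir):
    """Map each file stem (used here as the utterance ID) to its transcript text."""
    transcripts = {}
    for path in sorted(Path(txt_dir).glob("*.txt")):
        transcripts[path.stem] = decode_transcript(path.read_bytes())
    return transcripts


# Hypothetical usage; the path depends on where the archive was extracted:
# transcripts = load_transcripts("sabela/txt")
```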

The file naming convention is as follows: two lowercase elements indicating the creators of the dataset (“crpih_uvigo”), the ISO code for the Galician language (“gl”), the name of the voice (e.g., “sabela”), and a 5-digit number identifying the utterance. All components are separated by underscores (e.g., “crpih_uvigo_gl_sabela_00001.txt”). For some speakers, an additional element identifies the dataset split (e.g., “crpih_uvigo_gl_iago_a_00001.txt” and “crpih_uvigo_gl_iago_b_00001.txt”).
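
The naming convention described above can be parsed with a regular expression. The sketch below is one possible approach, not an official tool. It assumes that voice names in file names are plain lowercase ASCII (e.g., "icia" rather than "icía") and that the optional split element is a single lowercase letter, as in the two examples given. The names FILENAME_RE and parse_name are hypothetical.

```python
import re

# Pattern inferred from the naming convention and examples in the description.
# Assumptions: ASCII lowercase voice name, optional one-letter split element.
FILENAME_RE = re.compile(
    r"^crpih_uvigo_gl_"
    r"(?P<voice>[a-z]+)"
    r"(?:_(?P<split>[a-z]))?"
    r"_(?P<utt>\d{5})"
    r"\.(?P<ext>\w+)$"
)


def parse_name(filename):
    """Split a dataset file name into voice, split (or None), utterance number, extension."""
    match = FILENAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    return {
        "voice": match["voice"],
        "split": match["split"],  # None when the name has no split element
        "utterance": int(match["utt"]),
        "ext": match["ext"],
    }
```

Names that do not follow the convention raise ValueError, which makes stray files easy to spot when iterating over a folder.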

Acknowledgements

We would like to thank the speakers for recording and donating their voices.

Files

Files (9.6 GB)

Name	Size
iago.zip md5:1575e5672ddbd192951dd987737b7b9d	433.8 MB
icia.zip md5:e9e8fc990305f251a926efec8e92a61a	4.2 GB
paulo.zip md5:7980ba6ea4c580933647f8788d956643	119.3 MB
sabela.zip md5:7abfd88d35b6859a9b6680d90ca225be	4.9 GB
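
The MD5 checksums listed above can be used to check that an archive downloaded intact. The following Python sketch shows one way to do this with the standard hashlib module. The function names are hypothetical, and the archives are assumed to have been saved locally under their original file names.

```python
import hashlib

# Checksums copied from the file listing of this record.
EXPECTED_MD5 = {
    "iago.zip": "1575e5672ddbd192951dd987737b7b9d",
    "icia.zip": "e9e8fc990305f251a926efec8e92a61a",
    "paulo.zip": "7980ba6ea4c580933647f8788d956643",
    "sabela.zip": "7abfd88d35b6859a9b6680d90ca225be",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use low for the multi-gigabyte archives.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, archive_name):
    """Return True if the file at `path` matches the listed checksum for `archive_name`."""
    return md5_of_file(path) == EXPECTED_MD5[archive_name]


# Hypothetical usage, assuming iago.zip is in the current directory:
# print(is_intact("iago.zip", "iago.zip"))
```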
