Published March 9, 2025 | Version v0.2
Dataset Open

Blizzard Challenge 2025 - Training Material

  • 1. University of Groningen

Description

Blizzard Challenge 2025 - Participant Information

The 2025 edition of the Blizzard Challenge focuses on synthesizing speech for Bildts. Bildts (Indo-European > West Germanic) is a language variety spoken in the north of the Netherlands, specifically in the Bildt region of Friesland, one of the country’s twelve provinces. It is a living example of language diversity in Europe, with its own distinct characteristics and rich cultural heritage. It has about 10,000 speakers, who acquired it as a first or second language. The language has been documented through grammatical resources, dictionaries, literature, and media, including weekly radio broadcasts, theater productions, and regular newspaper columns. Choosing Bildts for the Blizzard Challenge 2025 fits our theme "Scaling down: sustainable synthesis for language diversity": it is an opportunity to advance speech synthesis for languages beyond the major ones usually covered, while working with carefully curated but naturally limited data resources.

Material

For this challenge, we provide around 7 hours of speech from one male speaker (Jan de Groot from Omrop Fryslân):

  • WAV files (44.1 kHz, 16-bit, mono), normalized using sv56demo
  • TextGrid files (in UTF-8) composed of three tiers:
    • Full Text (graphemes) :: the full text
    • Segments (graphemes) :: the segmented text, obtained using pydub and then hand-corrected
    • Expanded Segments (graphemes) :: the expanded version of the segmented text, in which numbers and some acronyms are spelled out (words written entirely in uppercase are acronyms that should be spelled out)
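As an illustration of the uppercase convention in the Expanded Segments tier, the sketch below picks out tokens written entirely in uppercase from a string. It is not part of the official tooling: the function name `find_acronyms`, the minimum length of two letters, and the sample sentence are all assumptions made for this example.

```python
import re

# A "word" here is any run of Unicode letters (no digits, no underscore).
_WORD = re.compile(r"[^\W\d_]+")


def find_acronyms(text, min_len=2):
    """Return tokens written entirely in uppercase.

    Assumption: tokens shorter than `min_len` letters are ignored so that
    single capital letters are not reported as acronyms.
    """
    return [
        word
        for word in _WORD.findall(text)
        if len(word) >= min_len and word.isupper()
    ]


if __name__ == "__main__":
    # Hypothetical sample text, not taken from the dataset.
    sample = "The NOS and the KNMI reported it in 2025."
    print(find_acronyms(sample))
```

A front end could then map each returned token to a letter-by-letter reading before synthesis; how letters are pronounced in Bildts is outside the scope of this sketch.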

We also recommend that participants familiarize themselves with the online resources listed at https://wiki.mercator-research.eu/languages:bildts_in_the_netherlands#online_learning_resources.

More information

Change Log

  • v0.2 - fixed heterogeneous sampling rates (44.1 kHz and 48 kHz => 44.1 kHz), fixed some number expansion issues
  • v0.1 - initial release
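Since earlier versions mixed sampling rates, participants may want to confirm that their local copy matches the stated audio format (44.1 kHz, 16-bit, mono). The following is a minimal sketch using Python's standard `wave` module; the helper names and the directory name in the usage example are hypothetical, and the archive layout is not specified on this page.

```python
import wave
from pathlib import Path

# Format stated in the Material section: 44.1 kHz, 16-bit (2 bytes), mono.
EXPECTED = {"sample_rate": 44100, "sample_width_bytes": 2, "channels": 1}


def describe_wav(path):
    """Read the header of a PCM WAV file and return its basic format."""
    with wave.open(str(path), "rb") as wav:
        return {
            "sample_rate": wav.getframerate(),
            "sample_width_bytes": wav.getsampwidth(),
            "channels": wav.getnchannels(),
        }


def find_mismatches(directory):
    """Return (path, format) pairs for WAV files that differ from EXPECTED."""
    mismatches = []
    for path in sorted(Path(directory).rglob("*.wav")):
        fmt = describe_wav(path)
        if fmt != EXPECTED:
            mismatches.append((path, fmt))
    return mismatches


if __name__ == "__main__":
    # "blizzard2025" is a hypothetical extraction directory.
    for path, fmt in find_mismatches("blizzard2025"):
        print(f"{path}: {fmt}")
```

An empty result means every WAV file found under the directory has the expected header values; it does not check the audio content itself.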

Files

Files (1.7 GB)

md5:78ce84a4c2595af29fc181265b292350 (1.7 GB)
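After downloading, the archive can be checked against the md5 checksum listed above. The sketch below uses Python's standard `hashlib` and reads the file in chunks so the 1.7 GB archive does not need to fit in memory. The archive file name is not given on this page, so the path in the usage example is a placeholder.

```python
import hashlib

# Checksum as listed in the Files section of this record.
EXPECTED_MD5 = "78ce84a4c2595af29fc181265b292350"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading 1 MiB at a time."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's md5 digest matches the expected value."""
    return md5_of_file(path) == expected


if __name__ == "__main__":
    # Placeholder path: replace with the actual name of the downloaded file.
    archive_path = "path/to/downloaded_archive"
    print("checksum OK" if verify_download(archive_path) else "checksum MISMATCH")
```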