Dataset Open Access

Baule speech dataset

Dougban Monsia

This dataset was created to support research on automatic speech recognition (ASR) for the Baoulé (Baule) language. It was built specifically for the Google NLP Hack Series: Intro to ASR Africa Challenge, hosted on the Zindi Africa platform. It contains about 565 recordings of participants reading Baule text, as spoken in Côte d’Ivoire, one sentence at a time. Each example consists of an audio file and its associated transcription. The speakers recorded the audio on their own Android phones in a relatively quiet environment. The dataset is multi-speaker: 4 volunteers (2 male and 2 female) each contributed up to 141 recordings. The recordings were made in Abidjan, Côte d’Ivoire, in April 2022.

Files (46.2 MB)
Name                Size
bci-datasets.zip    46.2 MB
md5:0af939f1f4672ec887792ba1715ae3c1
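
The MD5 digest listed with the archive can be used to check that a download is intact. The following is a minimal sketch, assuming the archive has been saved locally as bci-datasets.zip in the working directory. That path and the helper names md5_of_file and verify_archive are illustrative and not part of the dataset.

```python
import hashlib
from pathlib import Path

# Digest copied from the file listing for bci-datasets.zip.
EXPECTED_MD5 = "0af939f1f4672ec887792ba1715ae3c1"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    archive = Path("bci-datasets.zip")  # assumed local download location
    if archive.exists():
        print("checksum OK" if verify_archive(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

Reading in chunks keeps memory use constant regardless of archive size. MD5 is suitable here only as an integrity check against accidental corruption, not as a security guarantee.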