Dataset Open Access
This dataset was created to enable research on automatic speech recognition (ASR) in the Boulé (Baule) language. It was built specifically for this task, for participation in the Google NLP Hack Series: Intro to ASR Africa Challenge hosted on the Zindi Africa platform. It contains about 565 recordings of participants reading sentences in Baule, as spoken in Côte d’Ivoire, one sentence at a time. Each example consists of an audio file and its associated text. The speakers recorded the audio themselves on their Android phones in a relatively quiet environment. The dataset is multi-speaker: it contains recordings from 4 volunteers (2 male and 2 female), each of whom contributed up to 141 recordings. The recordings were made in Abidjan, Côte d’Ivoire, in April 2022.