ANV-SOT-Sample-1: Sesotho Sample Dataset - Next Voices-ZA (South Africa)
Creators
- Marivate, Vukosi (Work package leader)
- Olaleye, Kayode (Researcher)1
- Mundia, Sitwala (Researcher)1
- van Wyk, Nia Zion (Researcher)1
- Bakainaga, Andinda (Researcher)1
- Morrissey, Graham (Data collector)2
- Dunbar, Dale (Data collector)2
- Smit, Francois (Data collector)2
- Mogale, Hope Tsholofelo (Researcher)1
- Okorie, Chijioke (Researcher)3
- 1. University of Pretoria
- 2. Way With Words
- 3. Penguide Advisory
Description
## Sesotho Sample Dataset - Next Voices-ZA (South Africa) - Multilingual Speech Dataset
This dataset includes **scripted and unscripted speech** across various domains such as agriculture, health, finance, sports, transport, culture, society and general topics. It is primarily designed for automatic speech recognition (ASR).
### Use Restriction:
The persons whose voices are included in this dataset, and the creators and owners of this dataset* do not give consent in any manner or form to, and strictly prohibit any use of this dataset for any form of text-to-speech (TTS), voice cloning, voice synthesis, or any technology or activity intended to replicate, mimic or generate human voices or any technology or activity resulting in the replication, mimicry or generation of human voices.
These restrictions are in place until further notice.
## Folder Structure
The dataset is organised hierarchically as follows:
ANV-ZA-SOT-1h/
├── sot/ # Folder for Sesotho
│ ├── recorder_uuid/ # Contains all audio files
│ │ ├── recording-1731053452.wav
│ │ ├── ...
│ ├── transcripts.csv # Contains transcripts of all audio recordings
│ ├── meta.csv # Contains additional metadata
├── README.md # Description of the dataset
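
The layout above can be traversed with a few lines of Python. This is a minimal sketch, not an official loader: it assumes that `recorder_uuid/` stands for one folder per speaker inside `sot/`, and the function name `list_recordings` is hypothetical.

```python
from pathlib import Path


def list_recordings(language_dir):
    """Map each speaker folder name to a sorted list of its WAV file paths.

    Assumption: every sub-folder of the language folder (e.g. `sot/`)
    is a `recorder_uuid` folder holding that speaker's recordings.
    """
    recordings = {}
    for speaker_dir in sorted(Path(language_dir).iterdir()):
        # Skip transcripts.csv, meta.csv and any other plain files.
        if speaker_dir.is_dir():
            recordings[speaker_dir.name] = sorted(speaker_dir.glob("*.wav"))
    return recordings
```

Calling `list_recordings("ANV-ZA-SOT-1h/sot")` would then return a dictionary keyed by speaker folder name, provided the dataset has been extracted to that path.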
## Data Details
### Audio
- Format: **16-bit PCM WAV**
- Sample rate: **48 kHz**
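
A downloaded file can be checked against the stated audio format with Python's standard `wave` module. The sketch below is illustrative only; the function name `check_wav_format` and its default parameters are assumptions chosen to match the values listed above.

```python
import wave


def check_wav_format(path, expected_rate=48000, expected_sample_width=2):
    """Return (ok, details) for a WAV file against the expected format.

    A sample width of 2 bytes corresponds to 16-bit PCM.
    """
    with wave.open(str(path), "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()  # bytes per sample
        channels = wav.getnchannels()
        frames = wav.getnframes()
    details = {
        "sample_rate": rate,
        "bits_per_sample": width * 8,
        "channels": channels,
        "duration_seconds": frames / rate if rate else 0.0,
    }
    ok = rate == expected_rate and width == expected_sample_width
    return ok, details
```

The returned `details` dictionary also reports the channel count and duration, which the description above does not specify, so no expectation is placed on them.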
### Transcriptions
- Provided in `transcript.csv` with fields:
  - `file_name`: Name of the audio file.
  - `transcript`: Text transcription of the audio.
  - `duration`: Duration of the recording in seconds.
  - `type`: Scripted or unscripted.
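
As an illustration of how these fields might be used, the sketch below sums recording time per speech type. It assumes a comma-delimited, UTF-8 file with a header row using exactly the field names listed above; the function name `total_duration_by_type` is hypothetical, and the file path is left as a parameter because the file is named `transcripts.csv` in the folder tree and `transcript.csv` in this list.

```python
import csv
from collections import defaultdict


def total_duration_by_type(transcripts_csv):
    """Sum the `duration` column (seconds) for each value of `type`."""
    totals = defaultdict(float)
    with open(transcripts_csv, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            # Normalise case so "Scripted" and "scripted" are counted together.
            speech_type = row["type"].strip().lower()
            totals[speech_type] += float(row["duration"])
    return dict(totals)
```

Dividing each total by 3600 gives hours of speech per type.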
### Metadata
- Provided in `meta.csv` with fields:
  - `recorder_uuid`: Unique speaker identifier.
  - `age_range`
  - `gender`
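
The metadata can be summarised in a similar way. The following sketch counts distinct speakers per gender and age range. It assumes a comma-delimited, UTF-8 `meta.csv` with a header row; because it is not stated whether the file holds one row per speaker or one per recording, rows are de-duplicated on `recorder_uuid`. The function name `speaker_counts` is hypothetical.

```python
import csv
from collections import Counter


def speaker_counts(meta_csv):
    """Count distinct speakers for each (gender, age_range) pair."""
    seen = set()
    counts = Counter()
    with open(meta_csv, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            speaker = row["recorder_uuid"]
            if speaker in seen:
                continue  # count each speaker once, even if listed repeatedly
            seen.add(speaker)
            counts[(row["gender"], row["age_range"])] += 1
    return counts
```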
## Contact Person
Please contact vukosi.marivate@cs.up.ac.za if you have any questions.
## Citation
TBA
## Funding
Funding for this project was generously made possible through a grant from the Bill & Melinda Gates Foundation and a gift from Meta.