Published December 6, 2022 | Version 1.0.0
Dataset
Open
Voice of America: Ukrainian ASR Dataset of Broadcast Speech
Creators
Contributors
Data collectors:
Description
The dataset is built from public Voice of America recordings (https://ukrainian.voanews.com); the audio was extracted from their videos.
The dataset contains 398 hours of speech.
It was created with the ASR Corpus Creator (https://zenodo.org/record/7396705).
File format: WAV, sampled at 16 kHz.
The WAV files can be downloaded from https://nx16725.your-storageshare.de/s/f4NYHXdEw2ykZKa
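The description states that the audio is WAV at 16 kHz. A minimal sketch of how a downloaded file could be checked against that figure, using only Python's standard `wave` module, is shown here. The helper names `wav_info` and `has_expected_rate` are hypothetical and not part of the dataset or the ASR Corpus Creator; the sketch also assumes the files are uncompressed PCM WAV, which the `wave` module requires.

```python
import wave

# Sample rate stated in the dataset description.
EXPECTED_SAMPLE_RATE = 16_000  # Hz


def wav_info(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        duration = wav.getnframes() / rate
    return rate, channels, duration


def has_expected_rate(path, expected=EXPECTED_SAMPLE_RATE):
    """Return True if the file's sample rate equals the expected rate."""
    rate, _, _ = wav_info(path)
    return rate == expected
```

Calling `has_expected_rate` on each extracted file before training would flag any recording that does not match the stated 16 kHz rate.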
Files
Total size: 274.8 MB
Checksum | Size |
---|---|
md5:13e486a0fe76ef0b519485bf6d6f4ef1 | 127.2 MB |
md5:920447b4bd9056612dfe6b877083ba89 | 147.6 MB |
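The MD5 checksums listed for the two files can be used to confirm that a download arrived intact. The following is a minimal sketch using Python's standard `hashlib`; the function names `md5_of_file` and `matches_checksum` are hypothetical, and the file name in the usage note is a placeholder because the record does not show the actual file names.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 with an expected value such as 'md5:13e4...'."""
    # Accept the value with or without the 'md5:' prefix used on the record page.
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex
```

For example, `matches_checksum("downloaded_file", "md5:13e486a0fe76ef0b519485bf6d6f4ef1")` would return `True` only if the placeholder file `downloaded_file` is byte-identical to the 127.2 MB file on the record.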
Additional details
References
- Speech Recognition for Ukrainian, https://github.com/egorsmkv/speech-recognition-uk
- Smoliakov, Yehor. (2022). ASR Corpus Creator (1.5.1). Zenodo. https://doi.org/10.5281/zenodo.7396705