Published April 10, 2021 | Version v2
Dataset Open

mohammadalihumayun/kadazan_digits: Kadazan digits

Description

The dataset contains recordings of the ten spoken digits in the Kadazan language by 50 speakers. The ten digits (1-10) in Kadazan are 'Iso', 'Duvo', 'Tohu', 'Apat', 'Himo', 'Onom', 'Tuu', 'Vahu', 'Sizam', and 'Opod'. The samples were recorded with microphones of varying quality in noisy conditions. The least significant digit in the file name indicates which digit is spoken in the recording (e.g. 0 for 'Iso', 1 for 'Duvo', and so on), while the remaining digits in the file name identify the speaker.
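The naming convention above can be decoded with a few lines of Python. This is only a sketch under stated assumptions: the `.wav` extension, the example file names, and the helper name `parse_recording_name` are hypothetical and not taken from the dataset, and the mapping of 1 to 9 onto 'Duvo' to 'Opod' is inferred from "0 for 'Iso', and so on". It also assumes every file name consists of at least two digits (speaker digits followed by one digit index).

```python
from pathlib import Path

# Digit words in spoken order; index 0 is 'Iso' (the digit 1), as the
# description states. Indices 1-9 are inferred from "and so on".
KADAZAN_DIGITS = [
    "Iso", "Duvo", "Tohu", "Apat", "Himo",
    "Onom", "Tuu", "Vahu", "Sizam", "Opod",
]


def parse_recording_name(file_name):
    """Split a name such as '120.wav' into (speaker_id, digit_index, word).

    Assumption: the file stem is purely numeric, with the speaker digits
    first and a single digit index last. The extension is ignored.
    """
    stem = Path(file_name).stem
    if len(stem) < 2 or not (stem.isascii() and stem.isdigit()):
        raise ValueError(f"unexpected file name: {file_name!r}")
    speaker_id = stem[:-1]        # everything except the last digit
    digit_index = int(stem[-1])   # least significant digit, 0-9
    return speaker_id, digit_index, KADAZAN_DIGITS[digit_index]


if __name__ == "__main__":
    # Hypothetical file names, used only to illustrate the convention.
    for name in ["120.wav", "459.wav"]:
        print(name, parse_recording_name(name))
```

The speaker ID is kept as a string so that any leading zeros in the file name survive; convert it with `int()` if numeric speaker labels are preferred.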

The dataset has been used in the following papers. Please cite accordingly.

Humayun MA, Yassin H, Abas PE. 2021. Spatial position constraint for unsupervised learning of speech representations. PeerJ Computer Science 7:e650  
https://doi.org/10.7717/peerj-cs.650

 

Files

mohammadalihumayun/kadazan_digits-v2.zip (38.4 MB)
md5:e85fcb8246f8911d595d3ae266efb803
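
The MD5 checksum listed above can be used to confirm that a downloaded copy of the archive is intact. The following is a minimal sketch using Python's standard `hashlib` module; the local path `kadazan_digits-v2.zip` and the helper names `md5_of_file` and `archive_is_intact` are assumptions made for illustration.

```python
import hashlib
import os

# Checksum as listed for the archive in the Files section.
EXPECTED_MD5 = "e85fcb8246f8911d595d3ae266efb803"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_is_intact(path, expected=EXPECTED_MD5):
    """Compare the file's MD5 digest with the expected checksum."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local file name; adjust to wherever the archive was saved.
    local_path = "kadazan_digits-v2.zip"
    if os.path.exists(local_path):
        print("checksum OK" if archive_is_intact(local_path) else "checksum mismatch")
    else:
        print(f"{local_path} not found")
```

A mismatch usually points to an incomplete or corrupted download, in which case fetching the archive again is the simplest remedy.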

Additional details