Published March 12, 2024 | Version 1.0.0
Dataset Open

SpokeN-100: A Cross-Lingual Benchmarking Dataset for The Classification of Spoken Numbers in Different Languages

  • 1. Friedrich-Alexander-Universität Erlangen-Nürnberg

Description

SpokeN-100 is a novel, entirely artificially generated benchmarking dataset tailored for speech recognition, a core challenge in the field of tiny deep learning. It consists of the numbers 0 to 99, spoken by 32 different speakers in each of four languages (English, Mandarin, German and French), resulting in 12,800 audio samples (100 numbers × 32 speakers × 4 languages).
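The sample count follows directly from the numbers in the description. The sketch below enumerates one entry per language, speaker and number to confirm the total. The tuple layout and speaker indices are illustrative assumptions, not the archive's actual file naming, which this page does not specify.

```python
from itertools import product

# Values taken from the description; nothing here reads the archive itself.
LANGUAGES = ["English", "Mandarin", "German", "French"]
NUM_SPEAKERS = 32      # speakers per language
NUMBERS = range(100)   # spoken numbers 0..99

# Hypothetical (language, speaker_index, number) triples. The real file
# layout inside SpokeN-100.zip may be organised differently.
samples = list(product(LANGUAGES, range(NUM_SPEAKERS), NUMBERS))

print(len(samples))  # 12800
```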

Files

Files (2.0 GB)

Name: SpokeN-100.zip
Size: 2.0 GB
Checksum: md5:139ff40e163e7dc26ec6a4d50d796e45
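After downloading, the archive can be checked against the MD5 checksum listed above. This is a minimal sketch using Python's standard `hashlib`. The helper names `md5_of_file` and `matches_expected` are illustrative, and the local file path is whatever you saved the archive as.

```python
import hashlib

# Checksum as listed for SpokeN-100.zip on this page.
EXPECTED_MD5 = "139ff40e163e7dc26ec6a4d50d796e45"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Chunked reads avoid loading the whole 2.0 GB archive into memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected hex string."""
    return md5_of_file(path) == expected.lower()
```

For example, `matches_expected("SpokeN-100.zip")` should return `True` for an intact download saved under that name in the current directory.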

Additional details

Software

Repository URL
https://github.com/ankilab/SpokeN-100
Programming language
Python