SpokeN-100: A Cross-Lingual Benchmarking Dataset for The Classification of Spoken Numbers in Different Languages

Groh, René; Goes, Nina; Kist, Andreas M.

doi:10.5281/zenodo.10810044

Published March 12, 2024 | Version 1.0.0

Dataset Open

1. Friedrich-Alexander-Universität Erlangen-Nürnberg

SpokeN-100 is a novel, entirely artificially generated benchmarking dataset for speech recognition, a core challenge in the field of tiny deep learning. It contains the numbers 0 to 99, spoken by 32 different speakers in each of four languages (English, Mandarin, German, and French), for a total of 12,800 audio samples (100 numbers × 32 speakers × 4 languages).
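The sample count follows directly from the three factors in the description. As a minimal sketch, the combinations can be enumerated to confirm the total. The lowercase language strings and zero-based speaker indices below are illustrative assumptions, not the naming scheme used inside the archive.

```python
from itertools import product

# Factors taken from the dataset description.
# The label strings and speaker indices are hypothetical placeholders.
LANGUAGES = ["english", "mandarin", "german", "french"]
NUM_SPEAKERS = 32
NUMBERS = range(100)  # spoken numbers 0..99


def enumerate_samples():
    """Return one (language, speaker_index, number) tuple per expected sample."""
    return list(product(LANGUAGES, range(NUM_SPEAKERS), NUMBERS))


samples = enumerate_samples()
print(len(samples))
```

A list like this can serve as a checklist when indexing the extracted audio files, once the real file naming has been looked up in the repository.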

Files (2.0 GB)

Name	Size	MD5
SpokeN-100.zip	2.0 GB	139ff40e163e7dc26ec6a4d50d796e45

Additional details

Repository URL: https://github.com/ankilab/SpokeN-100
Programming language: Python

	All versions	This version
Views	605	605
Downloads	172	172
Data volume	1.9 TB	1.9 TB

Resource type: Dataset
Publisher: Zenodo
Conference: In Proceedings of tinyML Research Symposium (tinyML Research Symposium'24), Burlingame, CA, USA, 22 April 2024
Languages: English, German, French, Mandarin Chinese

License: Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: March 12, 2024
Modified: March 12, 2024
