Afro-MNIST: Synthetic generation of MNIST-style datasets for low-resource languages

Wu, Daniel J; Yang, Andrew C; Prabhu, Vinay U

doi:10.5281/zenodo.4050071

Published April 26, 2020 | Version v1

Dataset Open

1. Stanford University
2. Unify ID Inc.

We present Afro-MNIST, a set of synthetic MNIST-style datasets for four orthographies used in Afro-Asiatic and Niger-Congo languages: Ge'ez (Ethiopic), Vai, Osmanya, and N'Ko.
These datasets serve as "drop-in" replacements for MNIST. We hope that MNIST-style datasets will be developed for other numeral systems, and that these datasets will vitalize machine learning education in nations underrepresented in the research community.

Files

Name	Size
AfroMNIST.zip md5:1e8bd7d8b6257de1de4b2d837fd96bc1	99.0 MB
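
The MD5 checksum listed above can be used to confirm that a downloaded copy of the archive is intact. The following is a minimal sketch, not part of the original record: the helper names md5_of_file and verify_archive are hypothetical, and the script assumes the archive was saved as AfroMNIST.zip in the current working directory.

```python
import hashlib
from pathlib import Path

# Checksum as listed on the record page for AfroMNIST.zip.
EXPECTED_MD5 = "1e8bd7d8b6257de1de4b2d837fd96bc1"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` matches the expected MD5 checksum."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    archive = Path("AfroMNIST.zip")  # assumed local download location
    if archive.exists():
        print("checksum OK" if verify_archive(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

Reading the file in chunks keeps memory use small for the 99.0 MB archive. MD5 is suitable here only as an integrity check against accidental corruption, not as protection against tampering.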

	All versions	This version
Views	215	215
Downloads	57	57
Data volume	6.5 GB	6.5 GB

Resource type: Dataset
Publisher: Zenodo
License: Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: September 25, 2020
Modified: September 26, 2020
