Published April 26, 2020 | Version v1
Dataset Open

Afro-MNIST: Synthetic generation of MNIST-style datasets for low-resource languages

  • 1. Stanford University
  • 2. Unify ID Inc.

Description

We present Afro-MNIST, a set of synthetic MNIST-style datasets for four orthographies used in Afro-Asiatic and Niger-Congo languages: Ge'ez (Ethiopic), Vai, Osmanya, and N'Ko.
These datasets serve as "drop-in" replacements for MNIST. We hope that MNIST-style datasets will be developed for other numeral systems, and that these datasets will vitalize machine learning education in nations that are underrepresented in the research community.

Files (99.0 MB)

Name: AfroMNIST.zip
Size: 99.0 MB
md5:1e8bd7d8b6257de1de4b2d837fd96bc1
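
The MD5 digest listed for the archive can be used to check that a download completed without corruption. The following is a minimal sketch using only the Python standard library. The local path `AfroMNIST.zip` and the helper names `md5_of_file` and `verify_download` are assumptions made for illustration; they are not part of the dataset or of this record.

```python
import hashlib
from pathlib import Path

# MD5 digest as listed on this record for the archive.
EXPECTED_MD5 = "1e8bd7d8b6257de1de4b2d837fd96bc1"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so the whole archive never sits in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches `expected` (case-insensitive)."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed location of the downloaded archive (hypothetical; adjust as needed).
    archive = Path("AfroMNIST.zip")
    if archive.exists():
        print("checksum OK" if verify_download(archive) else "checksum mismatch")
    else:
        print(f"{archive} not found; download it first")
```

A mismatch most likely indicates an incomplete or corrupted download, in which case fetching the archive again is the simplest remedy. MD5 is suitable here as an integrity check only, not as a guarantee of authenticity.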