Published August 14, 2022 | Version v1
Dataset Open

AfroMAFT Corpus: Language Adaptation Corpus for African languages

  • 1. Saarland University

Description

Language Adaptation Corpus for 17 African languages, English, French, and Arabic.

We used this corpus to train the following pre-trained language models:

If you use this corpus, please cite the MAFAND paper and the mC4 paper.

Files

Processed.zip

Files (4.4 GB)

Name Size
Processed.zip 4.4 GB
md5:c40de77a970bc6a0db761a7e3059c169
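
Since the record lists an md5 checksum for the archive, a downloaded copy can be checked against it before unpacking. The following is a minimal sketch, assuming the file has been saved locally as Processed.zip; the helper names md5_of_file and verify_archive are hypothetical and not part of the dataset or of any tool provided with it.

```python
import hashlib

# Checksum as listed on this record for Processed.zip.
EXPECTED_MD5 = "c40de77a970bc6a0db761a7e3059c169"


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex md5 digest of a file, reading it in chunks.

    Chunked reading avoids loading a multi-gigabyte archive into memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's md5 matches the expected value (case-insensitive)."""
    return md5_of_file(path) == expected.lower()
```

Calling verify_archive("Processed.zip") returns True when the local file matches the listed checksum; the local path here is an assumption about where the download was saved. A mismatch usually points to an incomplete or corrupted download.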