A consolidated lexical dataset for Dogon languages
Authors/Creators
- 1. CNRS Délégation Paris-Villejuif
- 2. LLACAN - Langage, Langues et Cultures d'Afrique
- 3. Researcher
- 4. IRD-MISELI-Mali
Description
This record is a staged release of a consolidated lexical dataset for Dogon languages. It brings together heterogeneous source layers: RefLex-derived data, Dogon and Bangime Linguistics materials, CLDF/LexiBank-derived working files, and subsequent BANG project curation. The workflow covers transcription standardization, source and language-name normalization, staged merging, Concepticon and part-of-speech enrichment, manual revision, source-village and GPS verification, doculect construction, and Glottolog alignment.
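The normalization and staged-merging steps can be pictured with a minimal Python sketch. It is not taken from the project repository: the record fields (`language`, `form`, `gloss`, `source`), the alias table, and the function names are all hypothetical, and the forms used with it should be treated as placeholders rather than real lexical data.

```python
import unicodedata

# Hypothetical alias table (assumption): the project's real mapping is not shown here.
LANGUAGE_ALIASES = {
    "jamsay": "Jamsay",
    "jamsay (dogon)": "Jamsay",
    "toro so": "Toro So",
}


def normalize_form(form: str) -> str:
    """Trim whitespace and apply Unicode NFC so equivalent transcriptions compare equal."""
    return unicodedata.normalize("NFC", form.strip())


def normalize_language(name: str) -> str:
    """Map a variant language name to one label; unknown names pass through trimmed."""
    key = " ".join(name.split()).lower()
    return LANGUAGE_ALIASES.get(key, name.strip())


def merge_layers(*layers):
    """Merge source layers in order, keeping one entry per (language, form, gloss).

    Each merged entry records every source layer that attested it.
    """
    merged = {}
    for layer in layers:
        for rec in layer:
            fields = {
                "language": normalize_language(rec["language"]),
                "form": normalize_form(rec["form"]),
                "gloss": rec["gloss"].strip().lower(),
            }
            key = (fields["language"], fields["form"], fields["gloss"])
            entry = merged.setdefault(key, {**fields, "sources": []})
            if rec["source"] not in entry["sources"]:
                entry["sources"].append(rec["source"])
    return list(merged.values())
```

Keeping a `sources` list on each merged entry, rather than discarding later attestations, is one way to preserve the source-level attribution that the later auditing steps depend on.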
This release should be treated as provisional. Open issues include resolving duplicates, auditing language-level attribution, and verifying that earlier manual curation of verbal paradigms has been incorporated. Full source-level attribution and contributor roles are documented in ATTRIBUTION.md.
Files
ATTRIBUTION.md
Additional details
Software
- Repository URL: https://github.com/IndianaTones/dogon_consolidation
- Programming language: Python