10.5281/zenodo.3926769
https://zenodo.org/records/3926769
oai:zenodo.org:3926769
Hämäläinen, Mika
Mika
Hämäläinen
0000-0001-9315-1278
University of Helsinki
Partanen, Niko
Niko
Partanen
0000-0001-8584-3880
University of Helsinki
Rueter, Jack
Jack
Rueter
0000-0002-3076-7929
University of Helsinki
Alnajjar, Khalid
Khalid
Alnajjar
0000-0002-7986-2994
University of Helsinki
Neural models for morphological generation, analysis and lemmatization in 22 languages
Zenodo
2020
morphology
fst
endangered languages
neural models
2020-07-01
fin
10.5281/zenodo.3926768
1.0
Creative Commons Attribution 4.0 International
Morphological models for generation, lemmatization and analysis in 22 languages. The models are trained with OpenNMT-py (https://github.com/OpenNMT/OpenNMT-py). Feed the models one word at a time, split into space-separated characters (kissa -> k i s s a).
Supported languages: German (deu), Kven (fkv), Komi-Zyrian (kpv), Moksha (mdf), Mansi (mns), Erzya (myv), Norwegian Bokmål (nob), Russian (rus), South Sami (sma), Lule Sami (smj), Skolt Sami (sms), Võro (vro), Finnish (fin), Komi-Permyak (koi), Latvian (lav), Eastern Mari (mhr), Western Mari (mrj), Namonuito (nmt), Olonets-Karelian (olo), Pite Sami (sje), Northern Sami (sme), Inari Sami (smn) and Udmurt (udm).
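The character-split input format mentioned in the description (kissa -> k i s s a) can be produced with a few lines of Python. This is a minimal sketch, not part of the released models: the function names `to_model_input` and `prepare_lines` are hypothetical, and the sketch covers only the character splitting. Any further input requirements of the individual models are not stated in this record and are not assumed here.

```python
from typing import Iterable, List


def to_model_input(word: str) -> str:
    """Split one word into space-separated characters.

    Hypothetical helper: it only reproduces the 'kissa -> k i s s a'
    format given in the description.
    """
    # Joining a string with " " puts a single space between its characters.
    return " ".join(word.strip())


def prepare_lines(words: Iterable[str]) -> List[str]:
    """Convert several words into one character-split line per word,
    since the models expect one word at a time. Empty entries are skipped."""
    return [to_model_input(w) for w in words if w.strip()]


if __name__ == "__main__":
    for line in prepare_lines(["kissa", "koira"]):
        print(line)
```

Each resulting line can then be written to a plain-text source file, one word per line, for the OpenNMT-py model of the chosen language.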
Cite:
Hämäläinen, M., Partanen, N., Rueter, J., & Alnajjar, K. (2021). Neural Morphology Dataset and Models for Multiple Languages, from the Large to the Endangered. In Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa 2021).