Dataset Open Access
Hämäläinen, Mika; Partanen, Niko; Rueter, Jack; Alnajjar, Khalid
Morphological models for generation, lemmatization and analysis in 22 languages. The models were trained with OpenNMT-py (https://github.com/OpenNMT/OpenNMT-py). Feed the models one word at a time, split into space-separated characters (kissa -> k i s s a).
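The character splitting described above can be sketched in Python as follows. This is a minimal illustration, not the authors' preprocessing code: the helper name `to_model_input` is hypothetical, and splitting by Unicode code point is an assumption. Any extra symbols a particular task may expect (for example morphological tags for generation) are not covered here; the training scripts in train_scripts.zip are the place to check.

```python
def to_model_input(word: str) -> str:
    """Turn one word into a space-separated character sequence.

    Hypothetical helper: splits by Unicode code point, which is an
    assumption about how the training data was prepared.
    """
    word = word.strip()
    # The models expect exactly one word per input line.
    if not word or any(ch.isspace() for ch in word):
        raise ValueError("expected exactly one word")
    return " ".join(word)


print(to_model_input("kissa"))  # k i s s a
```

Each resulting line can then be written to a plain-text source file, one word per line, for translation with OpenNMT-py.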
Supported languages: German (deu), Kven (fkv), Komi-Zyrian (kpv), Moksha (mdf), Mansi (mns), Erzya (myv), Norwegian Bokmål (nob), Russian (rus), South Sami (sma), Lule Sami (smj), Skolt Sami (sms), Võro (vro), Finnish (fin), Komi-Permyak (koi), Latvian (lav), Eastern Mari (mhr), Western Mari (mrj), Namonuito (nmt), Olonets-Karelian (olo), Pite Sami (sje), Northern Sami (sme), Inari Sami (smn) and Udmurt (udm).
Cite:
Hämäläinen, M., Partanen, N., Rueter, J., & Alnajjar, K. (2021). Neural Morphology Dataset and Models for Multiple Languages, from the Large to the Endangered. In Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa 2021).
Name | Checksum | Size |
---|---|---|
deu.zip | md5:0114cc3a6cd69647ee544c412003989a | 2.0 GB |
fin.zip | md5:a6f6605c42855be9ef84e45d35d2bc47 | 3.9 GB |
fkv.zip | md5:3a856cbe5f0842627222d4ac430e0c32 | 2.0 GB |
koi.zip | md5:232d391b1f1086b847c07788f2b3f756 | 2.1 GB |
kpv.zip | md5:e72b3a7ef10824890b217f78c22470cf | 2.3 GB |
lav.zip | md5:4e027e3f2ede42527153c01b85191697 | 2.0 GB |
mdf.zip | md5:45726260d34ce9486bbb72d72fd18107 | 2.4 GB |
mhr.zip | md5:52a04f40ef146e24c9f671db99fbd7d8 | 2.2 GB |
mns.zip | md5:9d32cafa8c6ac2bc752ee682930d8b53 | 2.0 GB |
mrj.zip | md5:ec83397fa63b14f54a67993f6ac5b10d | 2.0 GB |
myv.zip | md5:841b4a8a2daf0e57bc1a624f77dd3e89 | 2.0 GB |
nob.zip | md5:8be159dca0b1f0c990eab9b5ca0052b1 | 2.0 GB |
olo.zip | md5:0c32432cde219350fb3f8861fc7bce3c | 2.1 GB |
rus.zip | md5:764076cafd78aa16981fc33b0400d050 | 2.0 GB |
sje.zip | md5:9acfd9a42576c20c0afc1ee96c5b8092 | 2.0 GB |
sma.zip | md5:9fa0233618d3bd27aadb0bd88dd7b21c | 2.0 GB |
sme.zip | md5:7464f055a3d2a4763cc763d40be6c13b | 2.1 GB |
smj.zip | md5:b5c26d520a0f930596f369613bd4dea2 | 2.0 GB |
smn.zip | md5:8f72eeff3afd6836fedabc0e08449fdc | 2.1 GB |
sms.zip | md5:97a4db0d83d98d2c424169d6cdd7eb7b | 2.3 GB |
train_scripts.zip | md5:8f0db15754e43d8bfc11a29d4f57186b | 40.1 kB |
udm.zip | md5:9eba670c1b3813e02471929504779f8a | 2.2 GB |
vro.zip | md5:ed401d0852cf880a67ee4944b2474c59 | 2.0 GB |
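Since most archives are around 2 GB, it may be worth checking a download against the MD5 checksum listed above before unpacking it. The following is a minimal sketch using only the Python standard library; the function names `md5_of_file` and `matches_checksum` are hypothetical, and the local path `deu.zip` in the usage part assumes the archive was saved under its listed name in the current directory.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum given as 'md5:<hex>' or plain '<hex>'."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex


if __name__ == "__main__":
    archive = Path("deu.zip")  # assumed local file name
    if archive.exists():
        ok = matches_checksum(archive, "md5:0114cc3a6cd69647ee544c412003989a")
        print("checksum OK" if ok else "checksum mismatch, download again")
```

MD5 is used here only because it is the checksum the record provides; it detects corrupted or truncated downloads, not deliberate tampering.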