Dataset Open Access

Romance Verbal Inflection Dataset 2.0.0

Beniamine, Sacha; Maiden, Martin; Round, Erich

We introduce the Romance Verbal Inflection Dataset 2.0, a multilingual lexicon of Romance inflection covering 73 varieties. The lexicon provides verbal paradigm forms in broad IPA phonemic notation. Both lexemes and paradigm cells are organized to reflect cognacy. Multilingual inflected lexicons annotated for these two dimensions of cognacy are necessary to study the evolution of inflectional paradigms and to test linguistic hypotheses systematically. However, such resources seldom exist, and when they do, they are not usually encoded in computationally usable ways. The Oxford Online Database of Romance Verb Morphology provides this kind of information, but it is no longer maintained and is available only as a web service without a machine-readable interface. We collected its data, then cleaned and corrected it for consistency using both heuristics and expert annotator judgements. Most resources used to study language evolution computationally rely strictly on contemporary multilingual information and lack information about prior stages of the languages. To provide such information, we augmented the database with Latin paradigms from the LatInFlexi lexicon. Finally, to make it widely available, the resource is released under a GPLv3 license in CLDF format.

Files (1.2 MB)
Name           Size
v2.0.4.zip     1.2 MB
md5:4fdcb76ba4450754f8c99c3d95c35b18
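The MD5 checksum listed above can be used to confirm that a downloaded copy of the archive is intact. The following is a minimal sketch, assuming the archive has been saved locally as v2.0.4.zip in the current directory; the function names md5_of_file and verify_archive are illustrative and not part of the dataset or of any CLDF tooling.

```python
import hashlib
from pathlib import Path

# Checksum as listed for v2.0.4.zip on this record.
EXPECTED_MD5 = "4fdcb76ba4450754f8c99c3d95c35b18"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local file name; adjust to wherever the archive was saved.
    archive = Path("v2.0.4.zip")
    if archive.exists():
        print("checksum OK" if verify_archive(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

A mismatch usually indicates an incomplete or corrupted download, in which case the archive should be fetched again before it is unpacked.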
                    All versions    This version
Views               309             49
Downloads           42              13
Data volume         403.5 MB        15.2 MB
Unique views        242             45
Unique downloads    36              12
