Gabay, Simon
Clérice, Thibault
Camps, Jean-Baptiste
Tanguy, Jean-Baptiste
Gille-Levenson, Matthias
2020-10-13
<p>With the development of big corpora of various periods, it becomes crucial to standardise linguistic annotation (e.g. lemmas, POS tags, morphological annotation) to increase the interoperability of the data produced, despite diachronic variations. In the present paper, we describe both methodologically (by proposing annotation principles) and technically (by creating the required training data and the relevant models) the production of a linguistic tagger for (early) modern French (16-18th c.), taking as much as possible into account already existing standards for contemporary and, especially, medieval French.</p>
https://doi.org/10.5281/zenodo.4084499
oai:zenodo.org:4084499
eng
Zenodo
https://doi.org/10.5281/zenodo.4084498
info:eu-repo/semantics/openAccess
Creative Commons Attribution 4.0 International
https://creativecommons.org/licenses/by/4.0/legalcode
linguistic annotation, pre-orthographic language, lemmatisation, POS-tagging
Standardizing linguistic data: method and tools for annotating (pre-orthographic) French
info:eu-repo/semantics/other