00000ngm##2200000uu#4500 4084499 doi 10.5281/zenodo.4084499 oai:zenodo.org:4084499 Clérice, Thibault École des Chartes Camps, Jean-Baptiste École des Chartes Tanguy, Jean-Baptiste Sorbonne Université Gille-Levenson, Matthias École normale supérieure de Lyon Standardizing linguistic data: method and tools for annotating (pre-orthographic) French Gabay, Simon Universités de Neuchâtel et de Genève info:eu-repo/semantics/openAccess Creative Commons Attribution 4.0 International https://creativecommons.org/licenses/by/4.0/legalcode cc-by-4.0 spdx linguistic annotation, pre-orthographic language, lemmatisation, POS-tagging <p>With the development of big corpora of various periods, it becomes crucial to standardise linguistic annotation (e.g. lemmas, POS tags, morphological annotation) to increase the interoperability of the data produced, despite diachronic variations. In the present paper, we describe both methodologically (by proposing annotation principles) and technically (by creating the required training data and the relevant models) the production of a linguistic tagger for (early) modern French (16-18th c.), taking as much as possible into account already existing standards for contemporary and, especially, medieval French.</p> eng Zenodo 2020-10-13 info:eu-repo/semantics/other 20201013082930.0 36479539 md5:bdb7905bc80f09612a21f6d967723254 https://zenodo.org/records/4084499/files/Gabay-al-1.webm open 10.5281/zenodo.4084498 isVersionOf doi