Published October 13, 2020 | Version v1
Video/Audio
Open
Standardizing linguistic data: method and tools for annotating (pre-orthographic) French
Creators
- 1. Universités de Neuchâtel et de Genève
- 2. École des Chartes
- 3. Sorbonne Université
- 4. École normale supérieure de Lyon
Description
With the development of big corpora of various periods, it becomes crucial to standardise linguistic annotation (e.g. lemmas, POS tags, morphological annotation) to increase the interoperability of the data produced, despite diachronic variations. In the present paper, we describe both methodologically (by proposing annotation principles) and technically (by creating the required training data and the relevant models) the production of a linguistic tagger for (early) modern French (16th-18th c.), taking as much as possible into account already existing standards for contemporary and, especially, medieval French.
Files
Name | Size |
---|---|
Gabay-al-1.webm (md5:bdb7905bc80f09612a21f6d967723254) | 36.5 MB |