Published June 30, 2023 | Version v1
Conference paper
Open
Developing a Pipeline for Automatic Linguistic Analysis of Historical Manuscripts and Early Printings: The Pre-Modern Slavic Case
Authors/Creators
- 1. University of Freiburg, Germany
- 2. Bavarian Academy of Sciences and Humanities, Germany
- 3. University of Kragujevac, Serbia
Contributors
Data managers:
Hosting institution:
- 1. University of Graz
- 2. Belgrade Center for Digital Humanities
- 3. Le Mans Université
- 4. Digital Humanities im deutschsprachigen Raum
Description
We report on experiments with Handwritten Text Recognition models to automatically create large pre-modern Slavic text corpora and to use these corpora without manual post-correction (as raw data and with uncorrected POS tags) for quantitative linguistic analysis (inferential statistics, stylometry); we evaluate the actual noise in the data.
Files (115.4 kB)
RABUS_Achim_Developing_a_Pipeline_for_Automatic_Linguistic_A.pdf
| Checksum | Size |
|---|---|
| md5:de930461d48c17e6e1118de302c6cc4c | 98.9 kB |
| md5:95e0fdadd3b8befd0234a6b1963bea9e | 16.5 kB |
Additional details
Related works
- Is part of: Book, 10.5281/zenodo.7961822 (DOI)