Computational Modeling of Diachronic Variation in Late and Medieval Latin
Description
Can a computer determine when a Latin text was written? To explore this question, we assess the feasibility of automatically dating historical Latin writings based solely on linguistic features, addressing a methodological gap in computational philology where stylometric techniques (commonly employed for authorship attribution) have rarely been applied to chronological classification. Drawing on a large corpus of Late and Medieval Latin, we investigate whether syntactic patterns exhibit systematic diachronic variation. A prototype model distinguishes between two distant periods (4th-5th vs. 11th-12th centuries) using part-of-speech bigram frequencies and a linear SVM classifier, achieving 86.9% accuracy. Analysis of bigram distributions reveals interpretable structural shifts, suggesting that syntactic tendencies can function as reliable and transparent chronological markers even in the absence of lexical information. Future work will refine temporal granularity and integrate additional linguistic dimensions, aiming to improve predictive performance while maintaining philological interpretability.
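As a rough illustration of the feature representation mentioned above, the following sketch computes relative part-of-speech bigram frequencies for a tagged text and aligns two texts on a shared bigram vocabulary. It is a minimal example under stated assumptions, not the prototype's actual code: the function names, the UPOS-style tag labels, and the toy tag sequences are all hypothetical, and the description does not specify the tagger or tagset used.

```python
from collections import Counter


def pos_bigram_frequencies(tags):
    """Return relative frequencies of adjacent POS-tag pairs in one text.

    `tags` is a list of POS labels in running-text order. Texts with fewer
    than two tags have no bigrams and yield an empty dict.
    """
    bigrams = Counter(zip(tags, tags[1:]))
    total = sum(bigrams.values())
    if total == 0:
        return {}
    return {bigram: count / total for bigram, count in bigrams.items()}


def to_vector(freqs, vocabulary):
    """Project a frequency dict onto a fixed, ordered bigram vocabulary."""
    return [freqs.get(bigram, 0.0) for bigram in vocabulary]


# Hypothetical tag sequences for illustration only (not taken from the corpus).
text_a = ["NOUN", "ADJ", "NOUN", "VERB"]
text_b = ["ADP", "NOUN", "VERB", "NOUN", "ADJ"]

freqs_a = pos_bigram_frequencies(text_a)
freqs_b = pos_bigram_frequencies(text_b)

# A shared, sorted vocabulary keeps feature positions aligned across texts.
vocabulary = sorted(set(freqs_a) | set(freqs_b))
vector_a = to_vector(freqs_a, vocabulary)
vector_b = to_vector(freqs_b, vocabulary)
```

Vectors of this kind, one per text and labelled by period, could then be passed to a linear SVM implementation such as scikit-learn's `LinearSVC`. Because each feature is a named tag pair, the learned weights remain readable as statements about word-order tendencies.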
Files
QUINTIN_Guillaume_DHBenelux26_proposal.pdf (289.0 kB, md5:c610af53a0a277843edcf087d70c0625)