Dataset Open Access

Dataset of EvaLatin 2020

Rachele Sprugnoli; Marco Passarotti; Flavio Massimiliano Cecchini; Matteo Pellegrini

This repository contains the training and test data of EvaLatin 2020, the first campaign devoted to the evaluation of Natural Language Processing tools for Latin. It also includes the evaluation script.

The first edition of EvaLatin has 2 tasks (i.e. Lemmatization and PoS tagging), each with 3 sub-tasks (i.e. Classical, Cross-Genre, Cross-Time).

The EvaLatin 2020 dataset is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license.

Files (4.6 MB)
  • Sprugnoli, R., Passarotti, M., Cecchini, F. M., & Pellegrini, M. (2020, May). Overview of the EvaLatin 2020 evaluation campaign. In Proceedings of LT4HALA 2020 - 1st Workshop on Language Technologies for Historical and Ancient Languages (pp. 105-110).



Cite as