Published November 16, 2024 | Version 2.0
Poster Open

SEFLAG: Systematic Evaluation Framework for NLP Models and Datasets in Latin and Ancient Greek

  • 1. Humboldt-Universität zu Berlin

Description

The poster presents SEFLAG, a Systematic Evaluation Framework for NLP models and datasets in Latin and Ancient Greek, developed at Humboldt-Universität zu Berlin. It addresses three core research questions: how to help literary scholars select suitable NLP models, how to document language resources systematically, and how to unify similar but distinct annotation schemas.

SEFLAG combines model evaluation, data curation, and documentation, using tools such as spaCy, flair, Hugging Face, and Zenodo. It supports tasks including named entity recognition, lemmatization, and dependency parsing. A key feature is the mapping between different annotation schemas, which makes results comparable across resources. The poster reports performance results for both Latin and Ancient Greek across several models and datasets, using metrics such as F1 and accuracy.
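
To illustrate why schema mapping matters for comparability, the following minimal sketch maps two dataset-specific NER tag sets onto one shared label set before scoring. It is not SEFLAG's implementation: the tag names, the `SCHEMA_MAPS` table, and the helpers `harmonize` and `micro_f1` are hypothetical, and the score is a simple token-level micro F1 rather than an entity-level metric.

```python
# Hypothetical mapping from two dataset-specific NER tag sets to a shared set.
# The tag names are illustrative and are not taken from SEFLAG.
SCHEMA_MAPS = {
    "dataset_a": {"PERS": "PER", "LOC": "LOC", "GRP": "ORG"},
    "dataset_b": {"PERSON": "PER", "PLACE": "LOC", "ETHNIC": "ORG"},
}


def harmonize(tags, schema, default="O"):
    """Translate dataset-specific tags into the shared label set.

    Tags without a mapping entry fall back to `default` (the outside label).
    """
    mapping = SCHEMA_MAPS[schema]
    return [mapping.get(tag, default) for tag in tags]


def micro_f1(gold, pred, outside="O"):
    """Token-level micro F1 over all labels except the outside label."""
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        if g == p:
            if g != outside:
                tp += 1
        else:
            if p != outside:
                fp += 1  # predicted an entity label that is wrong
            if g != outside:
                fn += 1  # missed or mislabeled a gold entity token
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Gold annotations follow schema A; the model emits schema B tags.
gold = harmonize(["PERS", "O", "LOC", "GRP"], "dataset_a")
pred = harmonize(["PERSON", "O", "PLACE", "O"], "dataset_b")
score = micro_f1(gold, pred)
print(f"micro F1 on shared labels: {score:.2f}")
```

Without the harmonization step, `PERS` and `PERSON` would count as a mismatch even though both mark the same kind of entity, so scores from resources with different schemas would not be comparable.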

Challenges in this domain include linguistic variation, limited resources, interoperability issues, and the need for sustainable, interdisciplinary research. SEFLAG contributes solutions such as publishing model cards and datasheets, using Linked Data for evaluation results, and offering case-specific mappings.
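
As a rough idea of what an evaluation result expressed as Linked Data could look like, the sketch below builds a JSON-LD-style record with Python's standard library. The vocabulary under `https://example.org/eval#`, the property names, and the `evaluation_record` helper are placeholders invented for illustration; the record does not say which vocabulary SEFLAG actually uses, and the value shown is a dummy, not a reported result.

```python
import json


def evaluation_record(model, dataset, task, metric, value):
    """Build a JSON-LD-style dict describing one evaluation result.

    The "ex" vocabulary is a placeholder, not a published ontology.
    """
    return {
        "@context": {"ex": "https://example.org/eval#"},
        "@type": "ex:EvaluationResult",
        "ex:model": model,
        "ex:dataset": dataset,
        "ex:task": task,
        "ex:metric": metric,
        "ex:value": value,
    }


# All argument values are dummies; 0.0 stands in for a real score.
record = evaluation_record(
    model="example-model",
    dataset="example-dataset",
    task="lemmatization",
    metric="accuracy",
    value=0.0,
)
print(json.dumps(record, indent=2))
```

Keeping model, dataset, task, and metric as separate properties is what would let results from different evaluations be queried and compared side by side.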

Future plans include expanding to more tasks, models, and datasets, creating educational materials on NLP evaluation, and fully integrating the framework into the Daidalos research infrastructure. All resources are open-access, with code and evaluation data available online.

Files

poster_seflag_v2.pdf (5.1 MB)
md5:e8aab0ca54479b3eb28ed8e85f25095b

Additional details

Funding

Deutsche Forschungsgemeinschaft
Daidalos project - Development of an infrastructure for the use of Natural Language Processing by researchers in Classical Philology (518919950)

Software

Repository URL
https://github.com/daidalos-project/seflag
Programming language
Python
Development Status
Work in progress