Published February 20, 2026 | Version 1.0
Dataset Open

LAPIS-GT

Description

This gold standard (LAPIS-GT: Labeled Annotations for Proper Nouns in Inscriptional Sources) contains manual annotations for 1,000 Latin inscriptions sampled from the Epigraphik-Datenbank Clauss-Slaby. Stratified sampling was used to obtain a representative collection of inscriptions. The annotations cover named entities, chiefly persons and locations. In addition to person tags, the gold standard is annotated for the fine-grained components of personal names (tria nomina). It therefore has nine categories: the coarse-grained tags LOC (for places) and PERS (for persons), and the fine-grained, name-specific tags PERS:PRAE, PERS:NOMEN, PERS:COG, PERS:AG, PERS:TRIB, PERS:TITLE, and PERS:FILI. 129 inscriptions were annotated by two additional annotators, yielding an overall Fleiss' kappa of 0.658.
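For readers unfamiliar with the agreement measure, the following is a minimal sketch of how Fleiss' kappa is computed from a table of rating counts. It is not the script used to obtain the 0.658 reported above. The record does not state the unit at which agreement was measured (for example token or entity span), so the function name `fleiss_kappa` and the toy input are illustrative assumptions only.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a table of rating counts.

    counts[i][j] is the number of raters who assigned item i to category j.
    Every item must have been rated by the same number of raters (>= 2).
    """
    n_items = len(counts)
    n_categories = len(counts[0])
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each item needs the same number (>= 2) of ratings")

    # Share of all ratings that fell into each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement per item: fraction of rater pairs that agree.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_item) / n_items
    p_expected = sum(p * p for p in p_cat)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Toy input (not taken from LAPIS-GT): 4 items, 3 raters, 2 categories.
toy_counts = [[3, 0], [0, 3], [2, 1], [1, 2]]
toy_kappa = fleiss_kappa(toy_counts)
```

With three annotators, each row of the count table sums to 3. How the nine tags (and any "no entity" label) map onto the columns depends on the chosen unit of comparison.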

The 1,000 annotated inscriptions are available as .json and .csv files.

The subset of 150 inscriptions, 129 of which are annotated by three annotators, is available as a .json file (lapis-gt-150-3-annotators.json).

To distinguish the annotators:

  • Annotator 1 produced the annotations in the original 2025 upload.
  • Annotator 2 produced their annotations up to and including February 4, 2026.
  • Annotator 3 produced their annotations after February 4, 2026 (as a check: there are 129 annotations by annotator 3 in total).
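The date rule above can be sketched as a small helper. This is only an illustration under stated assumptions: the record does not describe the JSON schema, so how a date is obtained for each annotation is left to the reader, and the function name `guess_annotator` is hypothetical. The sketch also assumes that every annotation dated 2025 or earlier belongs to annotator 1, which the description implies but does not state outright.

```python
from datetime import date

# Cut-off named in the description: annotator 2 worked up to and
# including this day, annotator 3 only afterwards.
CUTOFF = date(2026, 2, 4)


def guess_annotator(annotated_on: date) -> int:
    """Map an annotation date to annotator 1, 2, or 3.

    Assumption (not confirmed by the record): anything dated 2025 or
    earlier is part of the original upload by annotator 1.
    """
    if annotated_on.year <= 2025:
        return 1
    if annotated_on <= CUTOFF:
        return 2
    return 3
```

Applied to every annotation in lapis-gt-150-3-annotators.json, a correct date extraction should assign exactly 129 annotations to annotator 3, which matches the check given in the list.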

Files

LAPIS-GT.zip

Files (521.7 kB)

md5:580603f62eb7c1d2796816dda0e88468
521.7 kB

Additional details

Additional titles

Subtitle: Labeled Annotations for Proper Nouns in Inscriptional Sources