There is a newer version of the record available.

Published July 16, 2022 | Version 0.2

LIST

  • 1. University of West Bohemia
  • 2. Aarhus University

Description

The LIST dataset (Latin Inscriptions in space and time) is an aggregate of the EDH and EDCS epigraphic datasets.

In total, the dataset consists of 528,484 inscriptions, enriched by 69 attributes. Of these, 76,583 inscriptions appear in both source datasets (i.e. EDH and EDCS), 3,316 inscriptions are exclusively from EDH, and 448,585 inscriptions are exclusively from EDCS.
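The three counts above partition the dataset, so they can be cross-checked with simple arithmetic. The short sketch below only restates the published figures; the variable names are illustrative and are not attributes of the dataset, and the per-source totals are merely what the stated counts imply.

```python
# Counts as stated in the dataset description.
overlap = 76_583      # inscriptions present in both EDH and EDCS
edh_only = 3_316      # inscriptions exclusively from EDH
edcs_only = 448_585   # inscriptions exclusively from EDCS

# The three groups are disjoint, so their sum is the dataset size.
total = overlap + edh_only + edcs_only

# Totals per source implied by the figures above.
edh_total = overlap + edh_only     # inscriptions with an EDH record
edcs_total = overlap + edcs_only   # inscriptions with an EDCS record

print(total, edh_total, edcs_total)
```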

500,030 inscriptions have valid geospatial coordinates (the `geometry` attribute). This information is also used to determine the urban context of each inscription, i.e. whether it lies in the neighborhood (within a 5000 m buffer) of a large, medium, or small city, or is rural (more than 5000 m from any type of city); see the attributes `urban_context`, `urban_context_city`, and `urban_context_pop`.
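The buffer rule can be illustrated with a minimal sketch. It is not the code used to build the dataset: the great-circle distance, the choice of the nearest city when several buffers overlap, the label strings (`"large"`, `"medium"`, `"small"`, `"rural"`), and the small city table with approximate coordinates are all assumptions made for illustration.

```python
from math import radians, sin, cos, asin, sqrt

BUFFER_M = 5000  # buffer radius given in the description

def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres on a sphere of radius 6,371 km."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6_371_000 * asin(sqrt(a))

def urban_context(lon, lat, cities):
    """Return (context, city_name) for a point.

    `cities` is an iterable of (name, lon, lat, size_class) tuples.
    Assumption: when several cities are within the buffer, the nearest wins.
    """
    best = None
    for name, clon, clat, size_class in cities:
        d = haversine_m(lon, lat, clon, clat)
        if d <= BUFFER_M and (best is None or d < best[0]):
            best = (d, size_class, name)
    if best is None:
        return ("rural", None)  # more than 5000 m from every city
    return (best[1], best[2])

# Hypothetical city table with approximate coordinates (illustration only).
cities = [
    ("Roma", 12.4853, 41.8925, "large"),
    ("Ostia", 12.2886, 41.7556, "small"),
]

near_rome = urban_context(12.49, 41.89, cities)
countryside = urban_context(13.5, 42.5, cities)
```

A production pipeline would more likely project the coordinates and use a spatial index rather than a linear scan over all cities; the loop here is kept only for readability.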

199,249 inscriptions have a numerical date of origin, expressed as an interval or a singular year by means of the attributes `not_before` and `not_after`.

187,934 inscriptions have both geospatial coordinates and a numerically expressed date of origin (see the "geotemporal?" attribute).
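How such a flag could be derived from the other attributes is sketched below. This is a hedged illustration, not the dataset's own logic: it assumes each record is a plain dictionary, that `geometry` holds a (longitude, latitude) pair, and that an inscription counts as dated when at least one of `not_before` and `not_after` is present. The helper names are hypothetical.

```python
def has_coordinates(geometry):
    """True when geometry is a (lon, lat) pair within valid ranges (assumed layout)."""
    if geometry is None:
        return False
    lon, lat = geometry
    return -180 <= lon <= 180 and -90 <= lat <= 90

def has_date(not_before, not_after):
    """True when at least one bound of the dating interval is given (assumption)."""
    return not_before is not None or not_after is not None

def is_geotemporal(record):
    """Combine both checks for a record given as a dictionary."""
    return (has_coordinates(record.get("geometry"))
            and has_date(record.get("not_before"), record.get("not_after")))

# Made-up records for illustration; negative years stand for years BCE.
records = [
    {"geometry": (12.49, 41.89), "not_before": -50, "not_after": 50},
    {"geometry": (12.49, 41.89), "not_before": None, "not_after": None},
    {"geometry": None, "not_before": 101, "not_after": 200},
]
flags = [is_geotemporal(r) for r in records]
```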

The dataset also employs a machine learning model to classify the inscriptions covered exclusively by EDCS in terms of 22 categories employed by EDH.

We publish the dataset in the Parquet file format. A description of the individual attributes is available in a separate file. The scripts used to generate the dataset are available via GitHub: https://github.com/sdam-au/LIRE_ETL.

Files

LIST_v0-2_metadata.csv

Files (18.5 kB)

md5:19ac72d6fdaf9430f968018d657ecb41
18.5 kB