S14: Modelling Data Quality in archaeological Linked Open Data

Kai-Christian Bruhn; Allard Mees; Florian Thiery

Archaeological institutions are publishing increasing quantities of data. At the same time, interconnecting these data following the concept of “Linked Data” is becoming more and more popular. The current evolution from “Linked Data” via “Linked Open Data” (LOD) towards “Linked Open Usable Data” (LOUD) enables a wide array of archaeological applications. However, the growth of the LO(U)D cloud brings challenges in handling the complex facets of data quality, so modelling data quality becomes an increasingly important issue. This is especially true of archaeological data, which are based on a complicated network of concepts from different knowledge domains. Even very carefully compiled datasets can contain errors and ambiguities. Unrecognised errors propagate and multiply when data are reused: the result is not only incorrect data and conclusions, but possibly also a loss of confidence in web-based resources. Moreover, modelling data quality in order to share knowledge about uncertainty is necessary to produce and publish transparent Linked Open Usable Data. The success of the session “Guaranteeing data quality in archaeological Linked Open Data” at CAA International 2018 raised awareness of many challenges related to this topic and encourages us to pursue the debate.

For this session we invite contributions that address, for example, the following issues:

- Identifying inconsistencies within the data and strategies for correcting them;

- Identifying sources and dangers of incorrect or ambiguous data;

- Identifying duplicates across different LOD sources;

- Keeping track of the provenance of data as a means of solving errors and identifying their source;

- Defining metrics in order to rate data with respect to their quality;

- Setting up methodologies and tools in order to label or certify data sets based on their quality;

- Compiling trust levels based on various inputs such as provenance and quality level;

- Modelling uncertainty and vagueness in LOD (e.g. thesauri and CIDOC CRM);

- Dealing with ambiguities resulting from multiple links in the LOD cloud.

We encourage presenters to derive the problems from real-world datasets and to formulate proposals for solutions, preferably demonstrating (prototypes of) realised data-driven web applications. Because the topic is relevant to a broad and diverse audience, the challenges described should also be placed in their archaeological context (excavation, museum, archive, etc.).
