Published June 5, 2024 | Version v2
Presentation Open

Accessing the Republic. Entity extraction from the resolutions of the Dutch States-General.

Description

This repository contains the abstract and presentation of our paper for the Digital Humanities in the BeNeLux 2024 (DH Benelux 2024) conference, held 4-7 June 2024 at KU Leuven in Leuven, Belgium.

In this paper we report on our approach to extracting entities from the REPUBLIC corpus of the resolutions of the States-General of the Dutch Republic (1576-1796). We describe 1) the construction of ground truth data for different types of entities, 2) the evaluation of NER taggers based on various types of embeddings for historical Dutch, 3) our findings from curating millions of occurrences of the different entity types, and 4) the insights the curation gives into the characteristics of the corpus.

Files

DH-Benelux-2024-REPUBLIC-NER.pdf

Files (5.0 MB)

md5:b0e63ca80439985ccdf3e14768129a94 (4.5 MB)
md5:73670d5fc52cc3d371278536b8b6aa63 (553.0 kB)

Additional details

Funding

Dutch Research Council
REPUBLIC 175.217.024