Conference paper Open Access

Textual repetition and variation in the Resolutions of the States General of the Dutch Republic

Marijn Koolen; Rik Hoekstra; Rutger van Koert; Ida Nijenhuis; Ronald Sluijter

In the NWO REPUBLIC project, we are creating digital access to the corpus of the Resolutions of the States General of the Dutch Republic (1576-1796). This corpus contains the decisions made in the States General each day over a 220-year period. The resolutions were recorded using a standard structure and contain many standard formulations for aspects of the decision-making process, including the source of the topic that was decided on (a formal request, a missive, etc.), whether a decision was reached, and what that decision was.

We discuss the techniques we use to identify formulaic expressions and how we iteratively build a corpus-specific phrase model. With this model we can identify 1) the date and attendees of each meeting, which are followed by all the resolutions of that day, 2) resolution boundaries, i.e. where each resolution starts and stops in the running text, so that we know which text belongs to which resolution, 3) the types of opening phrases that correspond to different types of sources (requests, missives, reports, etc.), and 4) the decision paragraphs that state what decision, if any, was reached.
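To make the idea of matching formulaic expressions in noisy text concrete, the following is a minimal sketch of fuzzy phrase search using only Python's standard library. It is not the project's implementation: the phrase labels, the example phrases, the `fuzzy_find` helper and the similarity threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical phrase model: labels and spellings are illustrative assumptions,
# not taken from the REPUBLIC project.
PHRASE_MODEL = {
    "opening_missive": "Ontfangen een Missive van",
    "opening_request": "Op de Requeste van",
    "decision": "is goedgevonden ende verstaen",
}


def fuzzy_find(text, phrase, threshold=0.8):
    """Return (offset, matched_text, score) for non-overlapping windows of
    `text` whose similarity to `phrase` is at least `threshold`."""
    width = len(phrase)
    target = phrase.lower()
    scored = []
    # Slide a window of the phrase's length over the text and score each one.
    for start in range(max(1, len(text) - width + 1)):
        window = text[start:start + width]
        score = SequenceMatcher(None, target, window.lower(), autojunk=False).ratio()
        if score >= threshold:
            scored.append((score, start, window))
    # Keep the best-scoring windows first and drop any that overlap them.
    scored.sort(key=lambda item: (-item[0], item[1]))
    chosen = []
    for score, start, window in scored:
        if all(abs(start - other) >= width for _, other, _ in chosen):
            chosen.append((score, start, window))
    return sorted((start, window, score) for score, start, window in chosen)


def label_text(text, phrase_model=PHRASE_MODEL, threshold=0.8):
    """Map each phrase label to the fuzzy matches found in `text`."""
    return {
        label: fuzzy_find(text, phrase, threshold)
        for label, phrase in phrase_model.items()
    }


# A made-up line with a typical recognition error ("Miffive" for "Missive").
sample = "Ontfangen een Miffive van den Resident"
matches = label_text(sample)
```

In this sketch a window counts as a match when its character-level similarity to a model phrase exceeds the threshold, so spelling variation and recognition errors do not prevent a formula from being found. An iterative workflow could inspect near-misses just below the threshold and add recurring variants to the phrase model.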

We discuss how we built ground truth to evaluate the phrase model and the fuzzy searching and extraction process. Finally, we discuss how this approach generalised to other corpora and text genres.
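As an illustration of how extracted resolution boundaries might be scored against ground truth, the sketch below computes precision and recall over character offsets. The abstract does not state which metrics were used, so the choice of precision and recall, the `precision_recall` helper, the `tolerance` parameter and the example offsets are all assumptions.

```python
def precision_recall(predicted, gold, tolerance=0):
    """Compare predicted boundary offsets with ground-truth offsets.

    A prediction counts as correct if it lies within `tolerance` characters
    of a ground-truth offset that has not already been matched.
    """
    unmatched = sorted(gold)
    true_positives = 0
    for offset in sorted(predicted):
        hit = next((g for g in unmatched if abs(g - offset) <= tolerance), None)
        if hit is not None:
            unmatched.remove(hit)  # each gold boundary can be matched only once
            true_positives += 1
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Made-up offsets: two of three predictions fall near a gold boundary.
precision, recall = precision_recall([0, 120, 300], [0, 118, 450, 600], tolerance=5)
```

Allowing a small tolerance reflects the fact that a fuzzy match may start a few characters before or after the annotated boundary.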
