Published November 26, 2024 | Version v1
Dataset Open

Paragraphs of the resolutions of the States General of the Dutch Republic (1576-1796)

  • 1. Huygens Institute for History and Culture of the Netherlands
  • 2. KNAW Humanities Cluster

Description

This TSV file contains the plain-text paragraphs of all the resolutions of the States General of the Dutch Republic (1576-1796). They were created as part of the REPUBLIC project (REsolutions PUBLished In a Computational environment), which is funded by NWO grant 175.2017.024.

There are a total of 1,061,962 paragraphs from 680,593 resolutions, making up a corpus of 129,585,089 words (according to Unix word count).
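The totals above could be recomputed along the following lines. This is only a sketch, not the procedure the authors used. It assumes the rows have already been parsed into dictionaries with `resolution_id` and `text` keys, and it counts whitespace-separated tokens in the `text` column only. A Unix word count run over the whole file would also count tokens in the other columns, so the resulting figure may differ from the one reported. The function name `corpus_stats` and the sample rows are hypothetical.

```python
def corpus_stats(rows):
    """Return (paragraph count, distinct resolution count, word count)."""
    paragraphs = 0
    words = 0
    resolution_ids = set()
    for row in rows:
        paragraphs += 1
        resolution_ids.add(row["resolution_id"])
        # str.split() without arguments splits on runs of whitespace,
        # which is close to how `wc -w` delimits words.
        words += len(row["text"].split())
    return paragraphs, len(resolution_ids), words


# Hypothetical placeholder rows, not taken from the dataset.
rows = [
    {"resolution_id": "res-1", "text": "first paragraph of one resolution"},
    {"resolution_id": "res-1", "text": "second paragraph of it"},
    {"resolution_id": "res-2", "text": "another resolution"},
]
stats = corpus_stats(rows)
print(stats)
```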

The file contains six columns:

  • `session_date`: the date on which the resolution (decision) was reached.
  • `resolution_id`: the identifier of the resolution.
  • `para_id`: the identifier of the paragraph in the resolution.
  • `line_start`: the identifier of the first line in the paragraph. This identifier contains the scan ID and the x,y,w,h coordinates of the line in the scan, so the resolution text can be traced to where it starts in the scan.
  • `line_end`: the identifier of the last line in the paragraph. This identifier contains the scan ID and the x,y,w,h coordinates of the line in the scan, so the resolution text can be traced to where it ends in the scan (which is not necessarily the same scan as that of the first line, as paragraphs can cross scan boundaries).
  • `text`: the text of the paragraph.
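
As a minimal sketch of how the six columns could be read, the following Python uses only the standard library. It assumes that the file has a header row naming the columns as listed above, that fields are tab-separated without quoting, and that the encoding is UTF-8; none of this is confirmed by the description. The helper names `read_paragraphs` and `group_by_resolution` are hypothetical, and the in-memory sample uses invented placeholder identifiers that do not reflect the real identifier formats.

```python
import csv
import io
from collections import defaultdict

COLUMNS = ["session_date", "resolution_id", "para_id",
           "line_start", "line_end", "text"]


def read_paragraphs(handle):
    """Yield one dict per paragraph row from an open TSV file handle.

    Assumes a header row with the six column names and unquoted fields
    (both are assumptions about the file layout).
    """
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    missing = set(COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    yield from reader


def group_by_resolution(rows):
    """Collect paragraph texts per resolution_id, preserving file order."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["resolution_id"]].append(row["text"])
    return dict(grouped)


# Hypothetical sample rows; all identifiers are placeholders.
SAMPLE = (
    "session_date\tresolution_id\tpara_id\tline_start\tline_end\ttext\n"
    "1700-01-02\tres-1\tres-1-para-1\tline-a\tline-b\tFirst paragraph.\n"
    "1700-01-02\tres-1\tres-1-para-2\tline-c\tline-d\tSecond paragraph.\n"
    "1700-01-04\tres-2\tres-2-para-1\tline-e\tline-f\tAnother resolution.\n"
)

resolutions = group_by_resolution(read_paragraphs(io.StringIO(SAMPLE)))
print(len(resolutions))
```

For the downloaded file, the same functions could be given a handle opened with `open(path, encoding="utf-8", newline="")`, where `path` is wherever the TSV was saved. Reading row by row keeps memory use low for a file of this size.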

For more information on how this was generated, see the following publications:

Marijn Koolen, Rik Hoekstra, Joris Oddens, Ronald Sluijter, Rutger van Koert, Ger Brouwer and Hennie Brugman, ‘The Value of Preexisting Structures for Digital Access: Modelling the Resolutions of the Dutch States General’, Journal of Computing and Cultural Heritage 16:1 (2023). https://dl.acm.org/doi/10.1145/3575864

Marijn Koolen and Rik Hoekstra, ‘Detecting Formulaic Language Use in Historical Administrative Corpora’, in: F. Karsdorp, A. Lassche and K. Nielbo eds., Proceedings of the Computational Humanities Research Conference 2022 (Antwerp 2022) 127-151. CEUR Workshop Proceedings, http://ceur-ws.org, ISSN 1613-0073. https://ceur-ws.org/Vol-3290/long_paper5740.pdf

Marijn Koolen, Rik Hoekstra, Joris Oddens and Ronald Sluijter, ‘Formulas and Decision-Making: The Case of the States General of the Dutch Republic’, in: F. Karsdorp, A. Lassche and K. Nielbo eds., Proceedings of the Computational Humanities Research Conference 2023 (Paris 2023) 772-798. CEUR Workshop Proceedings, http://ceur-ws.org, ISSN 1613-0073. https://ceur-ws.org/Vol-3558/paper9465.pdf

Rutger van Koert, Stefan Klut, Tim Koornstra, Martijn Maas and Luke Peters, ‘Loghi: An End-to-End Framework for Making Historical Documents Machine-Readable’, Document Analysis and Recognition – ICDAR 2024 Workshops (Cham: Springer 2024) 73-88. https://link.springer.com/chapter/10.1007/978-3-031-70645-5_6

Files

Files (273.9 MB)

md5:ed00235b0afbb05035be68632c791df1
273.9 MB

Additional details

Funding

Investment Grant NWO Large 175.2017.024
Dutch Research Council

Software

Repository URL: https://github.com/huygensING/republic-project/
Programming language: Python
Development status: Active