paragraphs of the resolutions of the States General of the Dutch Republic (1576-1796)
Description
This TSV file contains the plain-text paragraphs of all the resolutions of the States General of the Dutch Republic (1576-1796). The paragraphs were created as part of the REPUBLIC project (REsolutions PUBLished In a Computational environment), which is funded by NWO grant 175.2017.024.
There are a total of 1,061,962 paragraphs from 680,593 resolutions, making up a corpus of 129,585,089 words (according to Unix word count).
The file contains six columns:
- `session_date`: the date on which the resolution (decision) was reached.
- `resolution_id`: the identifier of the resolution.
- `para_id`: the identifier of the paragraph in the resolution.
- `line_start`: the identifier of the first line in the paragraph. This identifier contains the scan ID and the x,y,w,h coordinates of the line in the scan, so the resolution text can be traced to where it starts in the scan.
- `line_end`: the identifier of the last line in the paragraph. This identifier contains the scan ID and the x,y,w,h coordinates of the line in the scan, so the resolution text can be traced to where it ends in the scan (which is not necessarily the same scan as the first line, as paragraphs can cross scan boundaries).
- `text`: the text of the paragraph.
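The column layout above can be read with Python's standard `csv` module. The following is a minimal sketch, not the project's own loading code. It assumes that the file starts with a header row, that fields are tab-separated without quoting, and that columns appear in the order listed above. The sample rows and identifier values (`res-1`, `line-a`, and so on) are invented placeholders and do not reflect the real identifier format.

```python
import csv
import io
from collections import defaultdict

# Column names as listed in the description; the order is an assumption.
COLUMNS = ["session_date", "resolution_id", "para_id",
           "line_start", "line_end", "text"]


def read_paragraphs(handle, has_header=True):
    """Yield one dict per paragraph row from an open TSV file handle."""
    # QUOTE_NONE treats quote characters in the text as ordinary characters.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    if has_header:
        next(reader, None)  # skip the (assumed) header row
    for row in reader:
        if len(row) != len(COLUMNS):
            continue  # skip rows that do not have exactly six fields
        yield dict(zip(COLUMNS, row))


def group_by_resolution(paragraphs):
    """Collect paragraph texts per resolution, preserving file order."""
    grouped = defaultdict(list)
    for para in paragraphs:
        grouped[para["resolution_id"]].append(para["text"])
    return dict(grouped)


def count_words(texts):
    """Count whitespace-separated tokens, roughly like Unix `wc -w`."""
    return sum(len(text.split()) for text in texts)


# Invented sample data standing in for the real file.
SAMPLE_ROWS = [
    COLUMNS,
    ["1700-01-01", "res-1", "res-1-para-1", "line-a", "line-b", "first paragraph"],
    ["1700-01-01", "res-1", "res-1-para-2", "line-c", "line-d", "second paragraph here"],
    ["1700-01-02", "res-2", "res-2-para-1", "line-e", "line-f", "another one"],
]
SAMPLE = "\n".join("\t".join(row) for row in SAMPLE_ROWS) + "\n"

resolutions = group_by_resolution(read_paragraphs(io.StringIO(SAMPLE)))
total_words = count_words(
    text for texts in resolutions.values() for text in texts
)
```

For the real file, the `io.StringIO(SAMPLE)` handle would be replaced by something like `open(path, encoding="utf-8", newline="")`, where `path` points to the downloaded TSV. Because the generator yields rows one at a time, the full 273.9 MB file does not need to be held in memory unless the grouped result is kept.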
For more information on how this was generated, see the following publications:
Marijn Koolen, Rik Hoekstra, Joris Oddens, Ronald Sluijter, Rutger van Koert, Ger Brouwer and Hennie Brugman, ‘The Value of Preexisting Structures for Digital Access: Modelling the Resolutions of the Dutch States General’, Journal of Computing and Cultural Heritage 16:1 (2023). https://dl.acm.org/doi/10.1145/3575864
Marijn Koolen and Rik Hoekstra, ‘Detecting Formulaic Language Use in Historical Administrative Corpora’, in: F. Karsdorp, A. Lassche and K. Nielbo eds., Proceedings of the Computational Humanities Research Conference 2022 (Antwerpen 2022) 127-151. CEUR Workshop Proceedings, ISSN 1613-0073. https://ceur-ws.org/Vol-3290/long_paper5740.pdf
Marijn Koolen, Rik Hoekstra, Joris Oddens and Ronald Sluijter, ‘Formulas and Decision-Making: The Case of the States General of the Dutch Republic’, in: F. Karsdorp, A. Lassche and K. Nielbo eds., Proceedings of the Computational Humanities Research Conference 2023 (Paris 2023) 772-798. CEUR Workshop Proceedings, ISSN 1613-0073. https://ceur-ws.org/Vol-3558/paper9465.pdf
Rutger van Koert, Stefan Klut, Tim Koornstra, Martijn Maas and Luke Peters, ‘Loghi: An End-to-End Framework for Making Historical Documents Machine-Readable’, Document Analysis and Recognition – ICDAR 2024 Workshops (Cham: Springer 2024) 73-88. https://link.springer.com/chapter/10.1007/978-3-031-70645-5_6
Files
- Size: 273.9 MB
- Checksum: md5:ed00235b0afbb05035be68632c791df1
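The MD5 checksum listed for the file can be used to check that a download is complete and uncorrupted. The sketch below uses Python's standard `hashlib` and reads the file in chunks so that the 273.9 MB file is not loaded into memory at once. The function names and the `path` argument are illustrative; the local file name is not given here and has to be supplied by the reader.

```python
import hashlib

# Checksum as published on this page (without the "md5:" prefix).
EXPECTED_MD5 = "ed00235b0afbb05035be68632c791df1"


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path):
    """Check a downloaded file against the published MD5 checksum."""
    return md5_of_file(path) == EXPECTED_MD5
```

Calling `matches_published_checksum(path)` on the downloaded TSV should return `True` if the file is intact; a `False` result suggests an incomplete or altered download.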
Additional details
Software
- Repository URL: https://github.com/huygensING/republic-project/
- Programming language: Python
- Development status: Active