Published August 28, 2022 | Version 1.0.0
Other Open

Corpus of OSCE Elections Monitoring Reports

Description

This is a repository for the corpus of election monitoring reports collected from the official website of the Organization for Security and Co-operation in Europe (OSCE) and its subsidiary Office for Democratic Institutions and Human Rights (ODIHR). Monitoring reports stored as machine-readable PDF files were text-mined and reorganized into a convenient tabular format (i.e. a data frame). The corpus covers the period 1995-2020 and comprises 415 reports parsed into 11 529 topical sections. The corpus is stored in R's native .RDS format. Apart from the main corpus, the repository also contains annotated speeches in the full CoNLL-U format. The annotation was done using the Trankit analytical pipeline with the default English language model.
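For readers who want to work with the annotated files outside R, the following is a minimal sketch of reading CoNLL-U data in Python. It relies only on the general CoNLL-U conventions (ten tab-separated columns per token, comment lines starting with "#", blank lines between sentences). The sample sentence, the function name read_conllu and the field names are illustrative assumptions and are not taken from the corpus or its codebook.

```python
from typing import Dict, Iterable, Iterator, List

# The ten standard CoNLL-U columns, in order.
CONLLU_FIELDS = [
    "id", "form", "lemma", "upos", "xpos",
    "feats", "head", "deprel", "deps", "misc",
]


def read_conllu(lines: Iterable[str]) -> Iterator[List[Dict[str, str]]]:
    """Yield one sentence at a time as a list of token dictionaries."""
    sentence: List[Dict[str, str]] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            # Comment lines carry sentence-level metadata; skipped here.
            continue
        cols = line.split("\t")
        if len(cols) != len(CONLLU_FIELDS):
            raise ValueError(f"expected 10 columns, got {len(cols)}: {line!r}")
        sentence.append(dict(zip(CONLLU_FIELDS, cols)))
    if sentence:
        # Handle a final sentence that is not followed by a blank line.
        yield sentence


# Invented sample for illustration only (not a line from the corpus).
SAMPLE = (
    "# text = Observers monitored elections.\n"
    "1\tObservers\tobserver\tNOUN\tNNS\tNumber=Plur\t2\tnsubj\t_\t_\n"
    "2\tmonitored\tmonitor\tVERB\tVBD\tTense=Past\t0\troot\t_\t_\n"
    "3\telections\telection\tNOUN\tNNS\tNumber=Plur\t2\tobj\t_\tSpaceAfter=No\n"
    "4\t.\t.\tPUNCT\t.\t_\t2\tpunct\t_\t_\n"
    "\n"
)

sentences = list(read_conllu(SAMPLE.splitlines()))
lemmas = [token["lemma"] for token in sentences[0]]
print(lemmas)
```

To read a real file, an open file handle can be passed in place of SAMPLE.splitlines(). The actual file names and any corpus-specific comment fields should be checked against the codebook.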

If you use the dataset, please cite it with the meta link for the whole repository: Mochtak, Michal and Adam Drnovsky (2022): Corpus of OSCE Elections Monitoring Reports, v1 (1995-2020), https://doi.org/10.5281/zenodo.7030098.

For the paper introducing the dataset, please cite: Mochtak, Michal, Adam Drnovsky, and Christophe Lesschaeve (2022): "Bias in the Eye of Beholder? 25 Years of Election Monitoring in Europe". Democratization, 29 (5): 899-917.

Notes

Please notify the authors if you notice any systematic problems with the corpus. Although we did our best to eliminate potential issues caused by text mining, it was impossible to check all data entries manually.

Files

CODEBOOK_OSCE_corpus.pdf

Files (85.8 MB)

Checksum Size
md5:6035f51086979cc6f3fea60581e86643 171.8 kB
md5:004c8e4160b36777fb0b34abe3db5f9b 71.2 MB
md5:c79b1aa0a094d13afc81ab077433b7d1 14.3 MB
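
The MD5 values listed for the files can be used to confirm that a download is complete and uncorrupted. The sketch below shows one way to do this with Python's standard hashlib module. The helper names md5_of_file and matches_listed_checksum are hypothetical, and the demonstration runs on a small temporary file rather than on the repository files themselves.

```python
import hashlib
import os
import tempfile


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path: str, listed: str) -> bool:
    """Compare a file against a checksum written as 'md5:<hex>' or plain hex."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Demonstration on a temporary file (stand-in for a downloaded corpus file).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    demo_path = tmp.name

demo_digest = md5_of_file(demo_path)
demo_ok = matches_listed_checksum(demo_path, "md5:" + demo_digest)
os.remove(demo_path)

print(demo_digest, demo_ok)
```

For an actual download, the path of the local file and the corresponding md5 string from the listing above would be passed to matches_listed_checksum.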

Additional details

Related works

Is supplement to
Journal article: 10.1080/13510347.2021.2019219 (DOI)