Published October 15, 2019 | Version 0.1
Dataset Open

Benchmark for the evaluation of Named Entity Linking over ancient documents

  • 1. University of La Rochelle

Description

Elvys Linhares Pontes, Ahmed Hamdi, Nicolas Sidere, and Antoine Doucet
University of Avignon: elvys.linhares-pontes@univ-avignon.fr; University of La Rochelle: {elvys.linhares_pontes,ahmed.hamdi,nicolas.sidere,antoine.doucet}@univ-lr.fr

These are the supplementary materials for the ICADL 2019 paper Impact of OCR Quality on Named Entity Linking. If you use all or part of this resource, please cite it as follows:

  • Linhares Pontes, E., Hamdi, A., Sidere, N., and Doucet, A. (2019). Impact of OCR Quality on Named Entity Linking. In Proceedings of the 21st International Conference on Asia-Pacific Digital Libraries (ICADL 2019), Kuala Lumpur, Malaysia.

Alternatively, use the following BibTeX entry:

@inproceedings{linhares2019icadl,
  title={Impact of OCR Quality on Named Entity Linking},
  author={Linhares Pontes, Elvys and Hamdi, Ahmed and Sidere, Nicolas and Doucet, Antoine},
  year={2019},
  booktitle={Proceedings of the 21st International Conference on Asia-Pacific Digital Libraries (ICADL 2019)},
  address={Kuala Lumpur, Malaysia}
}

Files
This archive contains six folders, one per dataset, as well as this README. Each folder contains the degraded images, the noisy texts extracted by OCR, and versions of those texts aligned with the clean data. This work is licensed under a [Creative Commons Attribution-ShareAlike 4.0 International License](http://creativecommons.org/licenses/by-sa/4.0/).

Acknowledgments
This work has been supported by the European Union's Horizon 2020 research and innovation programme under grant 770299 [NewsEye](https://www.newseye.eu/).

Files (3.1 GB)

nel_dataset-ocr_degradation.zip (3.1 GB)
md5:8758a9cf1bd37968ac21e64daa95bf4a
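
Since the archive is large, it may be worth checking the download against the MD5 checksum listed above before unpacking it. The following is a minimal sketch, not part of the original materials: the function names and the local path are illustrative assumptions, and only the file name and checksum come from this record.

```python
import hashlib
from pathlib import Path

# Checksum listed on this record for nel_dataset-ocr_degradation.zip.
EXPECTED_MD5 = "8758a9cf1bd37968ac21e64daa95bf4a"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected checksum."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed location of the download; adjust to wherever the file was saved.
    archive = Path("nel_dataset-ocr_degradation.zip")
    if archive.exists():
        print("checksum OK" if verify_archive(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

A mismatch usually indicates an incomplete or corrupted download, in which case fetching the archive again is the simplest remedy.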

Additional details

Funding

NewsEye: A Digital Investigator for Historical Newspapers (grant 770299)
European Commission