Benchmark for the evaluation of Named Entity Linking over ancient documents
Elvys Linhares Pontes, Ahmed Hamdi, Nicolas Sidere, and Antoine Doucet
University of Avignon: elvys.linhares-pontes@univ-avignon.fr; University of La Rochelle: {elvys.linhares_pontes,ahmed.hamdi,nicolas.sidere,antoine.doucet}@univ-lr.fr
These are the supplementary materials for the ICADL 2019 paper "Impact of OCR Quality on Named Entity Linking". If you use all or part of this resource, please cite:
- Linhares Pontes, E., Hamdi, A., Sidere, N., and Doucet, A. (2019). Impact of OCR Quality on Named Entity Linking. In Proceedings of the 21st International Conference on Asia-Pacific Digital Libraries (ICADL 2019), Kuala Lumpur, Malaysia.
or alternatively use the following BibTeX entry:
@inproceedings{linhares2019icadl,
title={Impact of {OCR} Quality on Named Entity Linking},
author={Linhares Pontes, Elvys and Hamdi, Ahmed and Sidere, Nicolas and Doucet, Antoine},
year={2019},
booktitle={Proceedings of the 21st International Conference on Asia-Pacific Digital Libraries (ICADL 2019)},
address={Kuala Lumpur, Malaysia}
}
Files
This archive contains six folders, one per dataset, as well as this README. Each folder contains the degraded images, the noisy text extracted from them by OCR, and a version of that noisy text aligned with the clean data. This work is licensed under a [Creative Commons Attribution-ShareAlike 4.0 International License](http://creativecommons.org/licenses/by-sa/4.0/).
Acknowledgments
This work has been supported by the European Union's Horizon 2020 research and innovation programme under grant 770299 [NewsEye](https://www.newseye.eu/).
Files
Name | Size | MD5 |
---|---|---|
nel_dataset-ocr_degradation.zip | 3.1 GB | 8758a9cf1bd37968ac21e64daa95bf4a |
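
Since the archive is large, it may be worth checking the download against the MD5 checksum listed for it. The following is a minimal sketch, assuming the archive was saved in the current directory under its listed name; the helper names `md5_of_file` and `verify_archive` are illustrative and not part of the dataset.

```python
import hashlib
from pathlib import Path

# Checksum as listed for nel_dataset-ocr_degradation.zip.
EXPECTED_MD5 = "8758a9cf1bd37968ac21e64daa95bf4a"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected checksum."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local path: adjust to wherever the archive was downloaded.
    archive = Path("nel_dataset-ocr_degradation.zip")
    if archive.exists():
        print("checksum OK" if verify_archive(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found")
```

A mismatch usually indicates an incomplete or corrupted download, in which case fetching the archive again is the simplest remedy.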