Published June 24, 2025 | Version 1.0
Dataset Open

UoS Data Rescue

Description

The UoS_Data_Rescue dataset contains 1,113 historical logbooks with 594,000 annotated text cells. It targets challenges such as handwritten entries, aging artifacts, and intricate layouts.

Cite: Singh, L.G., Middleton, S.E. Tabular context-aware optical character recognition and tabular data reconstruction for historical records. IJDAR (2025). https://doi.org/10.1007/s10032-025-00543-9

Files

UoS_Data_Rescue.zip (1.4 GB)
md5:4a80004d8cf9d943bee94371d61ca565
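
The MD5 checksum listed for the archive can be used to confirm that a download completed without corruption. The following is a minimal sketch, not part of the dataset's own tooling: the helper names md5_of_file and verify_archive are hypothetical, and the script assumes the archive was saved as UoS_Data_Rescue.zip in the current working directory.

```python
import hashlib
from pathlib import Path

# Checksum copied from the file listing on this record.
EXPECTED_MD5 = "4a80004d8cf9d943bee94371d61ca565"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks.

    Chunked reading keeps memory use low for the 1.4 GB archive.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local download location; adjust to wherever the file was saved.
    archive = Path("UoS_Data_Rescue.zip")
    if archive.exists():
        print("checksum OK" if verify_archive(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

A mismatch usually indicates an incomplete or corrupted download, in which case fetching the archive again is the simplest remedy.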

Additional details

Related works

Continues
Dataset: 10.5281/zenodo.5363456 (DOI)

Funding

UK Research and Innovation
Global Surface Air Temperature (GloSAT) NE/S015604/1

Software

Repository URL: https://github.com/gyanendrol9/context-aware_table_extraction
Programming language: Python
Development Status: Active