There is a newer version of the record available.

Published February 19, 2024 | Version 1.1
Dataset Open

Rumsey Train and Validation Data for ICDAR'24 MapText Competition

Description

Dataset of 2K×2K image tiles cropped from maps in the David Rumsey Map Collection, prepared for the ICDAR'24 Competition on Historical Map Text Detection, Recognition, and Linking.

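Annotations and images follow the format described on the competition website (https://rrc.cvc.uab.es/?ch=28&com=tasks). Predictions in that format can be scored with the script in the official evaluation repository (https://github.com/icdar-maptext/evaluation).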

Important: v1.1 fixes an image channel order error, superseding the prior version.

                  Train               Validation
Annotations       rumsey_train.json   rumsey_val.json
Images            train.zip           val.zip
Files             rumsey/train/*.png  rumsey/val/*.png
Tiles             200                 40
Map Sheets        196                 40
Words             34,521              5,543
Label Groups      27,729              4,959
Illegible Words   1,741               291
Truncated Words   3,582               643
Valid Words       30,683              4,881
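
The counts in the table can be recomputed from an annotation file. The sketch below is illustrative only and rests on assumptions not stated in this record: that each file is a JSON list with one entry per tile, that each entry has a "groups" list of label groups, and that each word carries boolean "illegible" and "truncated" flags. It also assumes a "valid" word is one that is neither illegible nor truncated. The function name `summarize` is hypothetical. Check the field names against the competition website before relying on it.

```python
from collections import Counter


def summarize(annotations):
    """Count tiles, label groups, and word flags in a parsed annotation list.

    Assumed layout (not confirmed by this record):
    [{"image": ..., "groups": [[{"text": ..., "illegible": bool,
                                 "truncated": bool, ...}, ...], ...]}, ...]
    """
    stats = Counter()
    for tile in annotations:
        stats["tiles"] += 1
        for group in tile.get("groups", []):
            stats["label_groups"] += 1
            for word in group:
                illegible = bool(word.get("illegible", False))
                truncated = bool(word.get("truncated", False))
                stats["words"] += 1
                stats["illegible"] += illegible
                stats["truncated"] += truncated
                # Assumption: "valid" means neither flag is set.
                stats["valid"] += not (illegible or truncated)
    return dict(stats)
```

To use it, load rumsey_train.json or rumsey_val.json with `json.load` and pass the result to `summarize`. If the assumed layout holds, the returned counts should be comparable to the corresponding table column.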

 

Annotations: Copyright 2024 UMN Knowledge Computing Lab, CC-BY-NC-SA 4.0 International.
Images: David Rumsey Map Collection, David Rumsey Map Center, Stanford Libraries. CC-BY-NC-SA 3.0 Unported.

Files

Files (2.0 GB)

MD5                                 Size
d289899af4b1f8a45187073a3c8ead91    24.5 MB
e9f6047fde6f5197d6d03350c2fb8a7f    1.9 MB
601866498e4b010e74fd340a48fc7a73    1.7 GB
cf20cd06be1d96345315d6c434db4365    312.1 MB
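
The MD5 checksums above can be used to confirm that a download completed intact. The following is a minimal sketch using Python's standard `hashlib`. The helper names `md5_of_file` and `matches_listed_md5` are hypothetical. This listing does not say which checksum belongs to which file, so the expected value is passed in as a parameter rather than hard-coded.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file's MD5 with a listed value such as 'md5:<hex>' or '<hex>'."""
    expected = listed.strip().split(":", 1)[-1].lower()
    return md5_of_file(path) == expected
```

For example, after downloading an archive, call `matches_listed_md5` with its local path and the checksum shown for that file on the record page. A result of `False` suggests a truncated or corrupted download.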

Additional details

Related works

Is derived from
Dataset: https://davidrumsey.com (URL)
Is described by
Publication: https://rrc.cvc.uab.es/?ch=28&com=tasks (URL)
Is supplemented by
Software: https://github.com/icdar-maptext/evaluation (URL)