Published December 14, 2024 | Version v1
Dataset Open

SynthMap+ (English) Synthetic Train Data for ICDAR'25 MapText Competition

  • 1. University of Minnesota

Description

Dataset of synthetic map images in English for the ICDAR'25 Competition on Historical Map Text Detection, Recognition, and Linking.

Annotations and images follow the format described on the competition website.

See [1] for the generation process and usage. This dataset extends the process in [1] to also provide grouping labels for location phrases.
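As a rough illustration of what the grouping labels allow, the sketch below joins the words of each group into a location phrase. The field names (`image`, `groups`, `text`) reflect our reading of the competition format and should be checked against the competition website. The assumption that words within a group are stored in reading order, the sample tile, and its file name are all hypothetical.

```python
def group_phrases(tile):
    """Join the words of each label group into one phrase.

    Assumes (not confirmed by this record) that `tile["groups"]` is a list
    of groups, each group a list of word dicts with a "text" field, and
    that words within a group are stored in reading order.
    """
    return [" ".join(word["text"] for word in group) for group in tile["groups"]]


# Hypothetical tile, made up for illustration only.
sample_tile = {
    "image": "en25synth/train/example.jpg",
    "groups": [
        [{"text": "Lake"}, {"text": "Superior"}],  # two-word phrase
        [{"text": "Duluth"}],                      # single-word group
    ],
}

phrases = group_phrases(sample_tile)
print(phrases)
```

Groups with a single word correspond to labels that are not part of a longer phrase.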

 

                                 Train
Annotations                      en25synth_train.json
Images                           train.zip
Files                            en25synth/train/*.jpg
Tiles                            35,000
Map Sheets                       -
Words                            348,494
Label Groups                     157,483
Label Groups (Group Size > 1)    133,955
Illegible Words                  0
Truncated Words                  0
Valid Words                      348,494
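The counts in the table above could be recomputed from the annotation file with a sketch like the following. It assumes the annotations are a JSON list with one entry per tile, each holding a `groups` list of word lists, and that each word carries `illegible` and `truncated` flags. These field names are our reading of the competition format, not something this record states. Treating a word as "valid" when it is neither illegible nor truncated is also an assumption, and the sample data is made up.

```python
def summarize(annotations):
    """Count tiles, words, and label groups in MapText-style annotations."""
    stats = {
        "tiles": 0,
        "words": 0,
        "label_groups": 0,
        "label_groups_gt1": 0,
        "illegible_words": 0,
        "truncated_words": 0,
        "valid_words": 0,
    }
    for tile in annotations:
        stats["tiles"] += 1
        for group in tile.get("groups", []):
            stats["label_groups"] += 1
            if len(group) > 1:
                stats["label_groups_gt1"] += 1
            for word in group:
                illegible = bool(word.get("illegible", False))
                truncated = bool(word.get("truncated", False))
                stats["words"] += 1
                stats["illegible_words"] += int(illegible)
                stats["truncated_words"] += int(truncated)
                # Assumption: "valid" means neither illegible nor truncated.
                stats["valid_words"] += int(not (illegible or truncated))
    return stats


# Hypothetical two-group tile, for illustration only.
sample = [
    {
        "image": "en25synth/train/example.jpg",
        "groups": [
            [
                {"text": "Lake", "illegible": False, "truncated": False},
                {"text": "Superior", "illegible": False, "truncated": False},
            ],
            [{"text": "Duluth", "illegible": False, "truncated": False}],
        ],
    }
]

summary = summarize(sample)
print(summary)
```

To run this against the real data, load `en25synth_train.json` with `json.load` and pass the result to `summarize`. If the assumptions hold, the output should match the table; for example, the table implies 157,483 − 133,955 = 23,528 single-word groups.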

 

[1] Lin, Y., & Chiang, Y.-Y. (2024). Hyper-local deformable transformers for text spotting on historical maps. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (pp. 5387-5397).

Files

Files (38.8 GB)

Checksum                              Size
md5:93f6f07f139eca0ad71fd1f45498e4de  247.8 MB
md5:cfe6a8a5197fca41e7f158a41a48289c  38.6 GB
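Given the size of the download, it may be worth checking a file against its listed MD5 checksum before use. The sketch below hashes a file in chunks so that a multi-gigabyte archive does not need to fit in memory. The helper names `md5_of_file` and `matches_listing` are hypothetical. The listing here does not pair file names with checksums, so which checksum belongs to which file should be confirmed on the record page.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listing("train.zip", "md5:<checksum from the listing>")` returns `True` only if the downloaded archive is intact, assuming that checksum is the one listed for that file.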

Additional details

Related works

Cites
Publication: 10.1145/3637528.3671589 (DOI)
Is described by
Publication: https://rrc.cvc.uab.es/?ch=32&com=tasks (URL)