Published May 18, 2026 | Version v1
Conference paper Open

Introducing the VISQAM Dataset: Toward Automated Map Interpretation

  • 1. University of Münster

Description

We introduce VISQAM, an open dataset designed specifically for visual question-answering (VQA) on thematic geographic maps. Comprising 1200 annotated images and 4594 QA pairs from four permissive-license sources, VISQAM enables the development of models capable of interpreting and understanding the complex spatial and thematic information encoded in maps. We fine-tuned Qwen3-VL-2B-Instruct on this dataset, achieving substantial performance improvements: BERTScore-F1 increased from 0.43 (base model) to 0.72 (fine-tuned), with exact match rising from 0.0 to 0.24. Our experiments highlight the challenges of spatial and relational reasoning in map interpretation. The fine-tuned model performs best on object-related questions, while relation questions prove most difficult, indicating the need for more examples of this question type in future expansions of the dataset. As automated map interpretation can improve access to spatial information, e.g. for visually impaired users, and facilitate knowledge extraction from cartographic products, VISQAM lays the groundwork for developing more advanced VQA systems.
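For readers unfamiliar with the exact-match figure reported above, the following is a minimal sketch of how such a score is commonly computed. The normalization steps (lowercasing, punctuation stripping, whitespace collapsing) and the function names `normalize` and `exact_match` are assumptions for illustration; the paper's actual evaluation procedure may differ. The sample answers are made up and are not taken from VISQAM.

```python
import string


def normalize(answer: str) -> str:
    """Lowercase, strip ASCII punctuation, and collapse whitespace (assumed normalization)."""
    lowered = answer.lower()
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    return " ".join(no_punct.split())


def exact_match(predictions, references):
    """Return the fraction of predictions equal to their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        return 0.0
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return hits / len(predictions)


# Hypothetical model answers and gold answers, for illustration only.
preds = ["Germany.", "blue", "3"]
golds = ["germany", "red", "3"]
score = exact_match(preds, golds)  # two of the three answers match
```

Exact match only credits answers that are identical after normalization, so it is far stricter than an embedding-based similarity metric such as BERTScore-F1, which also rewards semantically close answers.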

Files

_GeoAI__Introducing_the_VISQAM_Dataset.pdf

Files (5.5 MB)

md5:5e4fa1fbc07cc854caa9675a6634b639
Size: 5.5 MB

Additional details

Funding

Deutsche Forschungsgemeinschaft
NFDI4Earth 460036893
Erasmus+
Geospatial Technologies 101049796