Introducing the VISQAM Dataset: Toward Automated Map Interpretation
Description
We introduce VISQAM, an open dataset designed specifically for visual question answering (VQA) on thematic geographic maps. Comprising 1,200 annotated images and 4,594 QA pairs drawn from four permissively licensed sources, VISQAM supports the development of models that can interpret the complex spatial and thematic information encoded in maps. We fine-tuned Qwen3-VL-2B-Instruct on this dataset and observed substantial improvements: BERTScore-F1 increased from 0.43 (base model) to 0.72 (fine-tuned), and exact match rose from 0.0 to 0.24. Our experiments highlight the challenges of spatial and relational reasoning in map interpretation. The fine-tuned model performs best on object-related questions, while relation questions prove most difficult, which indicates that future expansions of the dataset need more examples of this question type. Because automated map interpretation can improve access to spatial information (e.g., for visually impaired users) and facilitate knowledge extraction from cartographic products, VISQAM lays the groundwork for developing more advanced VQA systems.
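The exact-match figures above, and the comparison between object and relation questions, can be illustrated with a minimal sketch. It assumes a simple normalization (lowercasing, stripping punctuation, collapsing whitespace) before comparison; the description does not state how the authors normalize answers, so the `normalize` helper, the record layout, and the toy records below are all hypothetical and not taken from VISQAM.

```python
import string
from collections import defaultdict


def normalize(text: str) -> str:
    """Lowercase, strip punctuation, collapse whitespace (an assumed normalization)."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())


def exact_match_by_type(records):
    """Return {question_type: exact-match rate}, plus an 'overall' entry.

    Each record is a (question_type, prediction, reference) tuple.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for qtype, pred, ref in records:
        match = int(normalize(pred) == normalize(ref))
        for key in (qtype, "overall"):
            hits[key] += match
            totals[key] += 1
    return {key: hits[key] / totals[key] for key in totals}


# Hypothetical toy records for illustration only.
records = [
    ("object", "Population density", "population density."),
    ("object", "Blue", "blue"),
    ("relation", "North of the river", "south of the river"),
    ("relation", "Region A", "Region B"),
]
scores = exact_match_by_type(records)
print(scores)
```

Exact match is strict by design: a prediction that is semantically close but worded differently scores zero, which is one reason a softer metric such as BERTScore-F1 is reported alongside it.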
Files
| Name | Size | Checksum |
|---|---|---|
| _GeoAI__Introducing_the_VISQAM_Dataset.pdf | 5.5 MB | md5:5e4fa1fbc07cc854caa9675a6634b639 |