There is a newer version of the record available.

Published February 4, 2026 | Version v3
Dataset Open

LLM-GeoDis

Authors/Creators

Description

This dataset provides subnational geocoding for global disaster events recorded in the EM-DAT database (2000–2024). Each record was processed automatically with a large language model (GPT-4o) to extract and standardize its textual location information, which was then cross-referenced with GADM, OpenStreetMap (OSM), and Wikidata. The dataset contains 14,215 disaster events across 17,948 unique locations, each linked to a GADM administrative unit (levels 1–2). It includes point geometries from Wikidata and OSM, as well as reprojected, gap-filled GADM geometries to ensure complete spatial coverage. The full database (~30 GB) is divided into five parts for easier distribution and handling. All records are linked to their corresponding EM-DAT entries and associated metadata.

Files

input_emdat.csv

Files (14.8 GB)

Checksum                                Size
md5:79b9e864059c19e6bf76b084a4aba9ee    6.1 GB
md5:f57303a3322bd0ce79ee13c347111581    1.7 MB
md5:e0b90fa7ffff0ca9a4ef102118db8056    1.9 GB
md5:e8137895a92159c6d9b19f21a4559709    1.9 GB
md5:817e75737a4834b782568e6a079e3dc5    1.8 GB
md5:667f69347acd41261bf397cadd8f11a3    1.5 GB
md5:be427a1290f6eea5eb88a7bf87b2141d    1.6 GB
md5:c6a669466c815a0882deb5b4cc648acb    4.9 MB
md5:fbb290425afbb64a3de62d8373fb839d    3.5 MB
md5:391085a505a50c948b95ffdb257ab058    104.3 kB
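
Because several of the files are multiple gigabytes, it may be worth checking a download against its listed md5 value before using it. The following is a minimal sketch using only the Python standard library. The helper names `md5_of_file` and `matches_listed_checksum` are illustrative and not part of the dataset, and the local path is whatever name the file was saved under.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so that multi-gigabyte files do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a local file against a listed value such as 'md5:79b9...'.

    The optional 'md5:' prefix is dropped before comparing.
    """
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listed_checksum(local_path, "md5:79b9e864059c19e6bf76b084a4aba9ee")` should return `True` for an intact copy of the 6.1 GB file, where `local_path` stands for wherever that file was saved. A `False` result usually points to an incomplete or corrupted download.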