FloodCastBench: A Large-Scale Dataset and Foundation Models for Flood Modeling and Forecasting

Xu, Qingsong; Shi, Yilei; Zhao, Jie; Zhu, Xiao Xiang

doi:10.5281/zenodo.14017092

Published October 31, 2024 | Version v2

Dataset | Open

1. Technical University of Munich

Effective flood forecasting is crucial for informed decision-making and emergency response. Existing flood datasets mainly describe flood events but lack dynamic process data suitable for machine learning (ML). This work introduces the FloodCastBench dataset, designed for ML-based flood modeling and forecasting, featuring four major flood events: Pakistan 2022, UK 2015, Australia 2022, and Mozambique 2019. FloodCastBench provides comprehensive low-fidelity and high-fidelity flood forecasting datasets specifically for ML.

This dataset comprises three folders: the low-fidelity flood forecasting folder, the high-fidelity flood forecasting folder, and the relevant data folder.

The low-fidelity flood forecasting folder includes data on the 2022 Pakistan flood and the 2019 Mozambique flood, both at a spatial resolution of 480 m. The high-fidelity flood forecasting folder contains two subfolders: one holds the 2022 Australia flood and the 2015 UK flood at a spatial resolution of 30 m, and the other holds the same two floods at 60 m. All data files are stored in TIFF format at a temporal resolution of 300 seconds. File names are numbered sequentially, advancing one 300-second step per file until the simulation endpoint.

The relevant data folder includes five types of files: digital elevation model (DEM), land use and land cover, rainfall data, georeferenced files, and initial condition files. The DEM, land use and land cover, rainfall, and initial condition data are all provided in TIFF format. The rainfall data are organized by timestamp in year-month-day-hour-minute-second format. The georeferenced files provide the geographic extent and spatial reference needed to view and analyze the associated TIFF files in GIS software.
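The sequential, 300-second file numbering described above can be turned into a time axis with a few lines of Python. The following is a minimal sketch, not part of the dataset's own tooling. It assumes that each TIFF file stem is a plain integer and that the first file corresponds to t = 0 s; the exact naming pattern is not stated here, so check it against the extracted archive. The function name `order_frames` is hypothetical.

```python
from pathlib import Path

TIME_STEP_SECONDS = 300  # temporal resolution stated in the description


def order_frames(folder):
    """Return (elapsed_seconds, path) pairs for numerically named TIFF files.

    Assumption: each file stem is a plain integer. Files with any other
    kind of name are skipped.
    """
    frames = []
    for path in Path(folder).iterdir():
        if path.suffix.lower() not in {".tif", ".tiff"}:
            continue
        if not path.stem.isdecimal():
            continue
        frames.append((int(path.stem), path))
    # Sort by integer value, since a plain string sort would put
    # "3000" before "600".
    frames.sort(key=lambda item: item[0])
    # Assumption: the first file is the state at t = 0 s.
    return [(i * TIME_STEP_SECONDS, path) for i, (_, path) in enumerate(frames)]
```

Elapsed time is derived from each file's position in the sorted sequence rather than from the number in its name, so the sketch gives the same result whether the files are numbered by step or by seconds.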

FloodCastBench details the process of flood dynamics data acquisition, starting with input data preparation (e.g., topography, land use, rainfall) and flood measurement data collection (e.g., SAR-based maps, surveyed outlines) for hydrodynamic modeling. We use a widely recognized finite-difference numerical scheme to construct high-resolution spatiotemporal dynamic processes with 30-m spatial and 300-second temporal resolutions. Flood measurement data are used to calibrate the hydrodynamic model parameters and to validate the flood inundation maps. Furthermore, we establish a benchmark of foundation models for neural flood forecasting using FloodCastBench, validating its effectiveness in supporting ML models for spatiotemporal, cross-regional, and downscaled flood forecasting.

Files

Files (21.6 GB)

Name	Size	Checksum
FloodCastBench.zip	21.6 GB	md5:c43f3009c82e212ef21a65739f4ada3d
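Because the archive is large, it may be worth checking the download against the listed MD5 checksum before extracting it. The following is a minimal sketch using only Python's standard library. The function names and the chunk size are illustrative choices, and the local file path is whatever location the archive was saved to.

```python
import hashlib

# MD5 checksum taken from the file listing for FloodCastBench.zip.
EXPECTED_MD5 = "c43f3009c82e212ef21a65739f4ada3d"


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks
    so that a multi-gigabyte archive is never loaded into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_is_intact(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()
```

For example, `archive_is_intact("FloodCastBench.zip")` returns `False` for a truncated or corrupted download, in which case the archive should be fetched again.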

Statistic	All versions	This version
Views	1,057	938
Downloads	335	307
Data volume	20.8 TB	19.4 TB
