Published October 29, 2024 | Version 0.3
Dataset Open

GRDC-Caravan: extending the original dataset with data from the Global Runoff Data Centre

  • 1. Global Runoff Data Centre (GRDC), Federal Institute of Hydrology (BfG), Koblenz, Germany
  • 2. Google Research, Vienna, Austria
  • 3. Fathom, Bristol, UK
  • 4. Google Research, Tel Aviv, Israel

Description

Large-sample datasets are essential in hydrological science to support modelling studies and global assessments. This dataset is an extension to Caravan, a global community dataset of meteorological forcing data, catchment attributes, and discharge data for catchments around the world (Kratzert et al. 2023).

The extension includes the subset of hydrological discharge data and station-based watersheds from the Global Runoff Data Centre (GRDC) that is covered by an open data policy (Attribution 4.0 International; CC BY 4.0). In total, the dataset covers 5357 catchments in 25 countries worldwide, with time series records spanning 1950 to 2023.

GRDC is an international data centre operating under the auspices of the World Meteorological Organization (WMO) at the German Federal Institute of Hydrology (BfG). Established in 1988, it holds the most substantive collection of quality-assured river discharge data worldwide. The primary providers of river discharge data and associated metadata are the National Hydrological and Hydro-Meteorological Services of WMO Member States.

Reference:

Kratzert, F., Nearing, G., Addor, N. et al. Caravan - A global community dataset for large-sample hydrology. Sci Data 10, 61 (2023). https://doi.org/10.1038/s41597-023-01975-w

 

Update:

Version 0.2: Fixed a bug that affected the time series of four bands for all GRDC gauges in the extension. The affected bands were total_precipitation, surface_net_solar_radiation, surface_net_thermal_radiation and potential_evaporation, i.e. all features that ERA5-Land defines as accumulated over the day.
For details, see https://github.com/kratzert/Caravan/issues/26.
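
Users who worked with the earlier version may want to identify which time series columns belong to the four affected bands. The following is a minimal sketch, not part of the dataset tooling. It assumes that column names start with the band name followed by an aggregation suffix (for example total_precipitation_sum). That naming is an assumption and should be checked against the data description file. The function name affected_columns is hypothetical.

```python
# Band names as listed in the update note for version 0.2.
AFFECTED_BANDS = (
    "total_precipitation",
    "surface_net_solar_radiation",
    "surface_net_thermal_radiation",
    "potential_evaporation",
)


def affected_columns(column_names):
    """Return the column names that start with one of the affected band names.

    Assumption: columns are named "<band>_<aggregation>", so a prefix match
    is enough to identify them.
    """
    # str.startswith accepts a tuple and returns True if any prefix matches.
    return [name for name in column_names if name.startswith(AFFECTED_BANDS)]
```

Passing the header row of one time series file to affected_columns gives the columns worth re-downloading or re-checking in analyses based on the earlier version.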

Version 0.3: Data description file added.

 

Dataset structure:

The dataset is provided in the following two file formats:
1. caravan-grdc-extension-csv.tar.gz: provides the time series data as comma-separated text files (CSV) (downloadable as a 9.6 GB gzip-compressed tar archive)
2. caravan-grdc-extension-nc.tar.gz: provides the time series data in the Network Common Data Form (NetCDF) (downloadable as an 8.0 GB gzip-compressed tar archive)

The data in both versions are identical; users can choose whether they need the time series data in CSV or NetCDF format.

Further details of the structure of the dataset are described in the data description file.
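
Because both archives are several gigabytes in size, it can help to inspect their contents before unpacking them. The sketch below uses Python's standard tarfile module to list the first few member names of a .tar.gz archive. It makes no assumption about the internal folder layout, which is documented in the data description file. The function name list_archive_members is hypothetical.

```python
import tarfile
from itertools import islice


def list_archive_members(archive_path, limit=10):
    """Return the names of the first `limit` members of a .tar.gz archive."""
    # Mode "r:gz" opens a gzip-compressed tar archive for reading. Iterating
    # over the archive reads member headers in order and extracts nothing
    # to disk.
    with tarfile.open(archive_path, mode="r:gz") as archive:
        return [member.name for member in islice(archive, limit)]
```

For example, list_archive_members("caravan-grdc-extension-csv.tar.gz") returns the first ten paths stored in the CSV archive, assuming the file has been downloaded to the working directory.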

Files (17.6 GB)

Name                                 Size      Checksum
caravan-grdc-extension-csv.tar.gz    9.6 GB    md5:8d525d8c89a4760c9640ae551980dcfc
caravan-grdc-extension-nc.tar.gz     8.0 GB    md5:e2e447c4a0be7af6026f5ce6523f0dc8
grdc-caravan_data_description.pdf    38.6 kB   md5:61b196a33734fec2f261e20a5c98b21f
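
The MD5 checksums listed for the files above can be used to confirm that a download completed without corruption. The following is a minimal sketch using Python's standard hashlib module. The pairing of each checksum with a file name is an assumption based on the listed file sizes, and the names md5_of_file and matches_expected are hypothetical.

```python
import hashlib

# Checksums as listed for this record. Pairing each checksum with a file name
# is an assumption based on the listed file sizes.
EXPECTED_MD5 = {
    "caravan-grdc-extension-csv.tar.gz": "8d525d8c89a4760c9640ae551980dcfc",
    "caravan-grdc-extension-nc.tar.gz": "e2e447c4a0be7af6026f5ce6523f0dc8",
    "grdc-caravan_data_description.pdf": "61b196a33734fec2f261e20a5c98b21f",
}


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Reading in fixed-size chunks keeps memory use low for
        # multi-gigabyte archives.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_md5):
    """Return True if the file's MD5 digest equals the expected hex string."""
    return md5_of_file(path) == expected_md5.lower()
```

A call such as matches_expected("caravan-grdc-extension-nc.tar.gz", EXPECTED_MD5["caravan-grdc-extension-nc.tar.gz"]) returning False would suggest downloading the file again.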

Additional details

Related works

Is supplement to
Dataset: 10.5281/zenodo.6522634 (DOI)