GRDC-Caravan: extending the original dataset with data from the Global Runoff Data Centre
Creators
- 1. Global Runoff Data Centre (GRDC), Federal Institute of Hydrology (BfG), Koblenz, Germany
- 2. Google Research, Vienna, Austria
- 3. Fathom, Bristol, UK
- 4. Google Research, Tel Aviv, Israel
Description
Large-sample datasets are essential in hydrological science to support modelling studies and global assessments. This dataset is an extension to Caravan, a global community dataset of meteorological forcing data, catchment attributes, and discharge data for catchments around the world (Kratzert et al. 2023).
The extension includes the subset of hydrological discharge data and station-based watersheds from the Global Runoff Data Centre (GRDC) that is covered by an open data policy (Attribution 4.0 International; CC BY 4.0). In total, the dataset covers 5357 catchments in 25 countries, with time series records spanning 1950 to 2023.
GRDC is an international data centre operating under the auspices of the World Meteorological Organization (WMO) at the German Federal Institute of Hydrology (BfG). Established in 1988, it holds the most substantive collection of quality-assured river discharge data worldwide. The primary providers of river discharge data and associated metadata are the National Hydrological and Hydro-Meteorological Services of WMO Member States.
Reference:
Kratzert, F., Nearing, G., Addor, N. et al. Caravan - A global community dataset for large-sample hydrology. Sci Data 10, 61 (2023). https://doi.org/10.1038/s41597-023-01975-w
Update:
Version 0.2: Fixed a bug that affected the time series of four bands for all GRDC gauges in the GRDC extension. The affected bands were total_precipitation, surface_net_solar_radiation, surface_net_thermal_radiation and potential_evaporation, i.e. all features that ERA5-Land defines as accumulated over the day.
For details, see https://github.com/kratzert/Caravan/issues/26.
Version 0.3: Data description file added.
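To illustrate what "accumulated over the day" means for the four affected bands, the sketch below converts a running daily total into per-step increments. It is a generic illustration only: the function name `deaccumulate` and the sample values are hypothetical, and it does not reproduce the actual fix described in the linked issue.

```python
def deaccumulate(cumulative):
    """Convert a running total within one day into per-step increments.

    `cumulative` holds the accumulated value at each time step since the
    start of the day. The first step is kept unchanged; every later step is
    the difference to the step before it.
    """
    if not cumulative:
        return []
    increments = [cumulative[0]]
    for previous, current in zip(cumulative, cumulative[1:]):
        increments.append(current - previous)
    return increments


# Hypothetical running precipitation total over three steps of one day.
running_total = [1.0, 3.0, 6.0]
per_step = deaccumulate(running_total)
# The increments add up to the last accumulated value of the day.
assert sum(per_step) == running_total[-1]
```

Treating an accumulated series as if it were instantaneous (or the other way round) changes daily aggregates substantially, which is why errors in such bands affect every gauge.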
Dataset structure:
The dataset is provided in the following two file formats:
1. caravan-grdc-extension-csv.tar.gz: provides the time series data as comma-separated text files (CSV) (downloadable as a 9.6 GB compressed archive)
2. caravan-grdc-extension-nc.tar.gz: provides the time series data in the Network Common Data Form (NetCDF) (downloadable as an 8 GB compressed archive)
The data in both versions are identical; users can choose whether they need the time series data in CSV or NetCDF format.
Further details of the structure of the dataset are described in the data description file.
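Because the internal folder layout is documented only in the data description file, a cautious first step is to inspect an archive before unpacking it. The following is a minimal sketch using Python's standard `tarfile` module; the helper name `summarize_archive` is hypothetical, and nothing is assumed about the directory structure inside the archive.

```python
import tarfile
from collections import Counter
from pathlib import PurePosixPath


def summarize_archive(archive_path):
    """Count the regular files in a .tar.gz archive by file extension."""
    counts = Counter()
    with tarfile.open(archive_path, mode="r:gz") as archive:
        # Iterating streams through the archive without extracting anything.
        for member in archive:
            if member.isfile():
                suffix = PurePosixPath(member.name).suffix.lower() or "(none)"
                counts[suffix] += 1
    return counts


# Example usage, assuming the archive was downloaded to the working directory:
# print(summarize_archive("caravan-grdc-extension-csv.tar.gz"))
```

Streaming through a multi-gigabyte gzip archive takes a while, but it avoids writing any files to disk before the contents are known.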
Files
(17.6 GB)
Name | Size | MD5 |
---|---|---|
caravan-grdc-extension-csv.tar.gz | 9.6 GB | 8d525d8c89a4760c9640ae551980dcfc |
caravan-grdc-extension-nc.tar.gz | 8.0 GB | e2e447c4a0be7af6026f5ce6523f0dc8 |
grdc-caravan_data_description.pdf | 38.6 kB | 61b196a33734fec2f261e20a5c98b21f |
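Given the size of the archives, it may be worth checking a download against the MD5 checksums listed above before unpacking. The sketch below uses Python's standard `hashlib`; the helper names are hypothetical, and the pairing of each checksum with a file name is an assumption inferred from the file sizes stated in the dataset structure section.

```python
import hashlib
from pathlib import Path

# Checksums as listed in the file table. Matching them to file names by
# size is an assumption, not something the record states explicitly.
EXPECTED_MD5 = {
    "caravan-grdc-extension-csv.tar.gz": "8d525d8c89a4760c9640ae551980dcfc",
    "caravan-grdc-extension-nc.tar.gz": "e2e447c4a0be7af6026f5ce6523f0dc8",
    "grdc-caravan_data_description.pdf": "61b196a33734fec2f261e20a5c98b21f",
}


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_downloads(directory="."):
    """Return {file name: checksum matches} for each expected file found."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        if path.is_file():
            results[name] = md5_of_file(path) == expected
    return results


# Example usage: check_downloads("/path/to/downloads")
```

Reading in chunks keeps memory use constant regardless of file size; files that are not present in the directory are simply skipped.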
Additional details
Related works
- Is supplement to
- Dataset: 10.5281/zenodo.6522634 (DOI)