Published May 25, 2022 | Version v1.1
Dataset Open

Time to Update the Split-Sample Approach in Hydrological Model Calibration v1.1

  • 1. University of Waterloo

Description

Time to Update the Split-Sample Approach in Hydrological Model Calibration

Hongren Shen1, Bryan A. Tolson1, Juliane Mai1

1Department of Civil and Environmental Engineering, University of Waterloo, Waterloo, Ontario, Canada

Corresponding author: Hongren Shen (hongren.shen@uwaterloo.ca)

Abstract

Model calibration and validation are critical in hydrological model robustness assessment. Unfortunately, the commonly-used split-sample test (SST) framework for data splitting requires modelers to make subjective decisions without clear guidelines. This large-sample SST assessment study empirically assesses how different data splitting methods influence post-validation model testing period performance, thereby identifying optimal data splitting methods under different conditions. This study investigates the performance of two lumped conceptual hydrological models calibrated and tested in 463 catchments across the United States using 50 different data splitting schemes. These schemes are established regarding the data availability, length and data recentness of the continuous calibration sub-periods (CSPs). A full-period CSP is also included in the experiment, which skips model validation. The assessment approach is novel in multiple ways including how model building decisions are framed as a decision tree problem and viewing the model building process as a formal testing period classification problem, aiming to accurately predict model success/failure in the testing period. Results span different climate and catchment conditions across a 35-year period with available data, making conclusions quite generalizable. Calibrating to older data and then validating models on newer data produces inferior model testing period performance in every single analysis conducted and should be avoided. Calibrating to the full available data and skipping model validation entirely is the most robust split-sample decision. Experimental findings remain consistent no matter how model building factors (i.e., catchments, model types, data availability, and testing periods) are varied. Results strongly support revising the traditional split-sample approach in hydrological modeling.

Version updates

v1.1 Updated on May 19, 2022. We added hydrographs for each catchment.

The v1.1 archive is split into eight zipped parts. Download all eight parts and unzip them together.

In this update, we added two zipped files in each gauge subfolder:

    (1) GR4J_Hydrographs.zip and

    (2) HMETS_Hydrographs.zip

Each of the zip files contains 50 CSV files. These CSV files are named with keywords of model name, gauge ID, and the calibration sub-period (CSP) identifier.

Each hydrograph CSV file contains four key columns:

    (1) Date time (note that the hour column is less significant since this is daily data);

    (2) Precipitation in mm, which is the aggregated basin mean precipitation;

    (3) Simulated streamflow in m3/s, in a column named "subXXX", where XXX is the ID of the catchment as specified in the CAMELS_463_gauge_info.txt file; and

    (4) Observed streamflow in m3/s, in a column named "subXXX(observed)".

Note that these hydrograph CSV files report period-ending time-averaged flows. They were produced directly by the Raven hydrological modeling framework. More information about the format of the hydrograph CSV files can be found on the Raven webpage.
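As a minimal sketch of how one of these hydrograph CSV files could be read, the snippet below locates the simulated and observed columns by the "sub" prefix and the "observed" keyword described above. The function name `read_hydrograph`, the assumption that the date column is named "date", and the sample header with unit suffixes are illustrative assumptions; check them against an actual file before use.

```python
import csv
import io


def _to_float(text):
    """Convert a CSV cell to float; blank cells become None."""
    text = text.strip()
    return float(text) if text else None


def read_hydrograph(handle):
    """Return a list of (date, simulated, observed) tuples from a hydrograph CSV.

    Assumes a header row in which the simulated column starts with "sub" and
    the observed column starts with "sub" and contains "observed".
    """
    reader = csv.reader(handle)
    header = [name.strip() for name in next(reader)]
    sub_cols = [i for i, name in enumerate(header) if name.startswith("sub")]
    obs_idx = next(i for i in sub_cols if "observed" in header[i])
    sim_idx = next(i for i in sub_cols if "observed" not in header[i])
    date_idx = header.index("date")  # assumed column name
    rows = []
    for record in reader:
        if not record:
            continue  # skip blank lines
        rows.append((record[date_idx].strip(),
                     _to_float(record[sim_idx]),
                     _to_float(record[obs_idx])))
    return rows


# Hypothetical two-row sample; real files hold the full daily series.
SAMPLE = (
    "time,date,hour,precip [mm/day],sub123 [m3/s],sub123 (observed) [m3/s]\n"
    "1,1990-01-02,00:00:00,0.5,1.20,1.10\n"
    "2,1990-01-03,00:00:00,0.0,1.10,\n"
)
rows = read_hydrograph(io.StringIO(SAMPLE))
```

For a file on disk, pass an open file handle (for example `open(path, newline="")`) instead of the in-memory sample.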

v1.0 First version published on Jan 29, 2022.

Data description

This data was used in the paper entitled "Time to Update the Split-Sample Approach in Hydrological Model Calibration" by Shen et al. (2022).

Catchment, meteorological forcing, and streamflow data are provided for hydrological modeling use. Specifically, the forcing and streamflow data are archived in the format required by the Raven hydrological modeling framework. The GR4J and HMETS model building results in the paper, i.e., reference KGE and KGE metrics in the calibration, validation, and testing periods, are provided so that the split-sample assessment performed in the paper can be replicated.

Data content

The data folder contains a gauge info file (CAMELS_463_gauge_info.txt), which reports basic information of each catchment, and 463 subfolders, each having four files for a catchment, including:

    (1) Raven_Daymet_forcing.rvt, which contains Daymet meteorological forcing (i.e., daily precipitation in mm/d, minimum and maximum air temperature in deg_C, shortwave radiation in MJ/m2/day, and day length in days) from Jan 1, 1980 to Dec 31, 2014 in the format required by Raven.

    (2) Raven_USGS_streamflow.rvt, which contains daily discharge data (in m3/s) from Jan 1, 1980 to Dec 31, 2014 in the format required by Raven.

    (3) GR4J_metrics.txt, which contains reference KGE and GR4J-based KGE metrics in calibration, validation and testing periods.

    (4) HMETS_metrics.txt, which contains reference KGE and HMETS-based KGE metrics in calibration, validation and testing periods.

Data collection and processing methods

       Data source

  •  Catchment information and the Daymet meteorological forcing are retrieved from the CAMELS data set, which can be found here.
  •  The USGS streamflow data are collected from the U.S. Geological Survey's (USGS) National Water Information System (NWIS), which can be found here.
  • The GR4J and HMETS performance metrics (i.e., reference KGE and KGE) are produced in the study by Shen et al. (2022).

       Forcing data processing

  • A quality assessment procedure was performed. For example, the daily maximum air temperature should be larger than the daily minimum air temperature; where it was not, the two values were swapped.
  • Units were converted to those required by Raven. Precipitation: mm/day, unchanged; daily minimum/maximum air temperature: deg_C, unchanged; shortwave: W/m2 to MJ/m2/day; day length: seconds to days.
  • Data for a catchment is archived in an RVT (ASCII-based) file, in which the second line specifies the start time of the forcing series, the time step (= 1 day), and the total number of time steps in the series (= 12784), respectively; the third and fourth lines specify the forcing variables and their corresponding units, respectively.
  • More details of Raven formatted forcing files can be found in the Raven manual (here).
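The temperature check, the day-length conversion, and the series length quoted above can be sketched as follows. This is an illustration under stated assumptions, not the authors' processing script: the function names are hypothetical, and the shortwave conversion is left out because the exact formula used is not given here.

```python
from datetime import date

SECONDS_PER_DAY = 86400.0


def fix_temperature_pair(tmin, tmax):
    """Swap the pair when the reported minimum exceeds the maximum."""
    return (tmax, tmin) if tmin > tmax else (tmin, tmax)


def day_length_seconds_to_days(seconds):
    """Convert a day length from seconds to a fraction of a day."""
    return seconds / SECONDS_PER_DAY


def daily_steps(start, end):
    """Number of daily time steps from start to end, both inclusive."""
    return (end - start).days + 1


# Jan 1, 1980 to Dec 31, 2014 at a 1-day time step
n_steps = daily_steps(date(1980, 1, 1), date(2014, 12, 31))
```

The step count can serve as a quick sanity check on the value written in the second line of each RVT file.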

       Streamflow data processing

  • Units are converted to Raven-required ones. Daily discharge originally in cfs is converted to m3/s.
  • Missing data are replaced with -1.2345 as Raven requires. Those missing time steps will not be counted in performance metrics calculation.
  • The streamflow series is archived in an RVT (ASCII-based) file, which opens with eight commented lines specifying relevant gauge and streamflow data information, such as gauge name, gauge ID, USGS reported catchment area, calculated catchment area (based on the catchment shapefiles in the CAMELS dataset), streamflow data range, data time step, and missing data periods. The first line after the commented lines specifies the data type (default is HYDROGRAPH), the subbasin ID (i.e., SubID), and the discharge unit (m3/s), respectively. The next line specifies the start of the streamflow data, the time step (= 1 day), and the total number of time steps in the series (= 12784), respectively.
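A minimal sketch of the two streamflow processing steps described above: converting cfs to m3/s and flagging missing values with -1.2345 so that they can be excluded from metric calculations. The function names are hypothetical, and missing inputs are assumed to be represented as `None`; the conversion factor follows from 1 ft = 0.3048 m.

```python
RAVEN_MISSING = -1.2345          # missing-value flag stated in this record
CFS_TO_CMS = 0.028316846592      # 0.3048 ** 3, cubic feet to cubic metres


def to_raven_discharge(cfs_values):
    """Convert daily discharge from cfs to m3/s, flagging gaps (None)."""
    return [RAVEN_MISSING if value is None else value * CFS_TO_CMS
            for value in cfs_values]


def drop_missing(simulated, observed):
    """Keep only the (simulated, observed) pairs with a valid observation."""
    return [(s, o) for s, o in zip(simulated, observed) if o != RAVEN_MISSING]


observed = to_raven_discharge([100.0, None, 50.0])
pairs = drop_missing([2.9, 2.0, 1.5], observed)
```

Filtering on the flag before computing any metric mirrors the note that missing time steps are not counted.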

GR4J and HMETS metrics 

The GR4J and HMETS metrics files consist of the reference KGE and KGE in the model calibration, validation, and testing periods, which were derived in the large split-sample test experiment performed in the paper.

  • Columns in these metrics files are gauge ID, calibration sub-period (CSP) identifier, KGE in calibration, validation, testing1, testing2, and testing3, respectively.
  • We proposed 50 different CSPs in the experiment. "CSP_identifier" is the unique name of each CSP. For example, the CSP identifier "CSP-3A_1990" means that the model is built on Jan 1, 1990, calibrated on the first 3-year sample (1981-1983), and validated on the remaining years of the 1980-1989 period. Note that 1980 is always used for spin-up.
  • We defined three testing periods (independent of the calibration and validation periods) for each CSP: the first 3 years from the model build year (inclusive), the first 5 years from the model build year (inclusive), and all years from the model build year (inclusive) onward. For example, "testing1", "testing2", and "testing3" for CSP-3A_1990 are 1990-1992, 1990-1994, and 1990-2014, respectively.
  • Reference flow is the interannual mean daily flow based on a specific period, which is derived for a one-year period and then repeated in each year in the calculation period.
    • For calibration, its reference flow is based on spin-up + calibration periods.
    • For validation, its reference flow is based on spin-up + calibration periods.
    • For testing, its reference flow is based on spin-up + calibration + validation periods.
  • Reference KGE is calculated from the reference flow and the observed streamflow in a specific calculation period (e.g., calibration). It is computed using the KGE equation with the reference flow substituted for the simulated flow in that period. Note that the reference KGEs for the three testing periods are based on the same historical period but differ from one another, because each testing period spans a different time range and covers a different series of observed flow.

More details of the split-sample test experiment and the analysis of the modeling results can be found in the paper by Shen et al. (2022).
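To make the reference KGE definition above concrete, here is a minimal sketch. It assumes the commonly used KGE formulation of Gupta et al. (2009), i.e. KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2), with r the linear correlation, alpha the ratio of standard deviations, and beta the ratio of means; the exact variant used in the paper should be checked there. The function names and the Feb 29 fallback are assumptions made for illustration.

```python
import math
from collections import defaultdict
from datetime import date

MISSING = -1.2345  # missing-value flag used in the streamflow files


def interannual_mean_daily_flow(dates, flows):
    """Mean observed flow per calendar day (month, day) over a base period."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for day, flow in zip(dates, flows):
        if flow == MISSING:
            continue
        totals[(day.month, day.day)] += flow
        counts[(day.month, day.day)] += 1
    return {key: totals[key] / counts[key] for key in totals}


def reference_series(climatology, dates):
    """Repeat the one-year reference flow over the calculation period."""
    series = []
    for day in dates:
        key = (day.month, day.day)
        if key == (2, 29) and key not in climatology:
            key = (2, 28)  # assumption: reuse Feb 28 when Feb 29 is absent
        series.append(climatology[key])
    return series


def kge(simulated, observed):
    """KGE (assumed Gupta et al., 2009 form), skipping missing observations."""
    pairs = [(s, o) for s, o in zip(simulated, observed) if o != MISSING]
    n = len(pairs)
    mean_s = sum(s for s, _ in pairs) / n
    mean_o = sum(o for _, o in pairs) / n
    sd_s = math.sqrt(sum((s - mean_s) ** 2 for s, _ in pairs) / n)
    sd_o = math.sqrt(sum((o - mean_o) ** 2 for _, o in pairs) / n)
    cov = sum((s - mean_s) * (o - mean_o) for s, o in pairs) / n
    r = cov / (sd_s * sd_o)
    alpha = sd_s / sd_o
    beta = mean_s / mean_o
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


def reference_kge(base_dates, base_flows, calc_dates, calc_observed):
    """KGE with the reference flow used in place of the simulated flow."""
    climatology = interannual_mean_daily_flow(base_dates, base_flows)
    return kge(reference_series(climatology, calc_dates), calc_observed)
```

For a testing period, `base_dates` and `base_flows` would cover the spin-up, calibration, and validation years, while `calc_dates` and `calc_observed` would cover the testing years.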

Citation

Journal Publication

This study:

Shen, H., Tolson, B. A., & Mai, J. (2022). Time to update the split-sample approach in hydrological model calibration. Water Resources Research, 58, e2021WR031523. https://doi.org/10.1029/2021WR031523

Original CAMELS dataset:

A. J. Newman, M. P. Clark, K. Sampson, A. Wood, L. E. Hay, A. Bock, R. J. Viger, D. Blodgett, L. Brekke, J. R. Arnold, T. Hopson, and Q. Duan (2015). Development of a large-sample watershed-scale hydrometeorological dataset for the contiguous USA: dataset characteristics and assessment of regional variability in hydrologic model performance. Hydrol. Earth Syst. Sci., 19, 209-223, http://doi.org/10.5194/hess-19-209-2015

Data Publication

This study:

H. Shen, B. A. Tolson, and J. Mai (2022). Time to Update the Split-Sample Approach in Hydrological Model Calibration. Zenodo. http://doi.org/10.5281/zenodo.5915374

Original CAMELS dataset:

A. Newman; K. Sampson; M. P. Clark; A. Bock; R. J. Viger; D. Blodgett, 2014. A large-sample watershed-scale hydrometeorological dataset for the contiguous USA. Boulder, CO: UCAR/NCAR. https://dx.doi.org/10.5065/D6MW2F4D

Files

Files (7.7 GB)

Name Size
md5:76a84eb46b8939f8441e8cd135ed5190 1.0 GB
md5:b860f6aadead57420834885204d156ed 1.0 GB
md5:87fc0a9f6751c586ed240e984b24da38 1.0 GB
md5:60f2efdf3c1c6bd83918434fafdcf867 1.0 GB
md5:0e82001c8530b3bbe64e7e8ac67bc251 1.0 GB
md5:3ced33b90fae61185e93e91e3116875d 1.0 GB
md5:75595c0990fc10f34b739823b6f7214c 1.0 GB
md5:ecfb9575e70f4c3d7b492e9c4738aa4a 335.9 MB
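Because the archive is split into several large parts, it may be worth checking each downloaded part against its listed md5 value before unzipping. The sketch below uses Python's standard `hashlib`; the function names are hypothetical, and the local file name of each part is whatever it was saved as.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed_value):
    """Compare a file against a listed value such as "md5:76a8...5190"."""
    expected = listed_value.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

A part whose checksum does not match should be downloaded again before the eight parts are unzipped together.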

Additional details

Related works

Is published in
Journal article: 10.1029/2021WR031523 (DOI)