Published September 7, 2022 | Version 1.3
Dataset Open

Global River Water Quality Archive (GRQA)

  • 1. University of Tartu
  • 2. Yale University, School of the Environment
  • 3. Spatial-Ecology, Meaderville House, Wheal Buller, Redruth, TR16 6ST, UK

Description

A major problem in large-scale water quality modeling has been the lack of available observation data with good spatiotemporal coverage. This has affected the reproducibility of previous studies and limited the potential improvement of existing models. In addition to the observation data itself, insufficient or poor-quality metadata has discouraged researchers from integrating the datasets that are already available. Improving both the availability and the quality of open water quality data would therefore increase the potential to implement predictive modeling on a global scale. We aim to address these issues by presenting the new Global River Water Quality Archive (GRQA), which integrates data from five existing global and regional sources: Canadian Environmental Sustainability Indicators program (CESI), Global Freshwater Quality Database (GEMStat), GLObal RIver Chemistry database (GLORICH), European Environment Agency (Waterbase) and USGS Water Quality Portal (WQP). The resulting dataset covers the timeframe 1898–2020 and contains over 17 million observations for 42 different forms of some of the most important water quality parameters, focusing on nutrients, carbon, oxygen and sediments. Supplementary metadata and statistics are provided with the observation time series to improve the usability of the dataset.

GRQA data processing scripts are available at https://doi.org/10.5281/zenodo.7056302.

Last update: 2022-09-07

Changes since GRQA_v1.2

It was discovered that, due to a preprocessing error in previous versions of GRQA, some of the parameters originating from WQP were assigned an incorrect code. The error was caused by certain source parameter codes being shifted during the creation of the corresponding code map (WQP_code_map.csv). As a result, for example, pH received the code for BOD5, while TEMP received the code for TSS. The parameters affected by this error were TPP, TDP, TP, TN, TDN, POC, DOC, TOC, BOD5, pH, TSS and TEMP.

Because of this processing error, the statistics and outlier counts calculated for those parameters were also affected. In GRQA_v1.3, the error in WQP_code_map.csv has been fixed, and all statistics and plots have been updated to account for the change. The data catalog has been updated as well.
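For users who still hold results derived from GRQA_v1.2 or earlier, the affected parameter list above can be turned into a simple check. The following is only a minimal sketch: the record layout (dictionaries with the keys `source` and `param_code`) and the source label `"WQP"` are assumptions made for illustration and may not match the actual column names or values in the GRQA files.

```python
# Parameter codes listed as affected in the GRQA_v1.3 change notes.
AFFECTED_WQP_PARAMETERS = frozenset({
    "TPP", "TDP", "TP", "TN", "TDN", "POC",
    "DOC", "TOC", "BOD5", "pH", "TSS", "TEMP",
})


def is_affected(source, param_code):
    """Return True if a pre-v1.3 observation may carry a wrong parameter code.

    `source` and `param_code` are hypothetical field names, not confirmed
    GRQA column names.
    """
    return source == "WQP" and param_code in AFFECTED_WQP_PARAMETERS


def split_records(records):
    """Split records (dicts) into (suspect, unaffected) lists."""
    suspect, unaffected = [], []
    for record in records:
        if is_affected(record["source"], record["param_code"]):
            suspect.append(record)
        else:
            unaffected.append(record)
    return suspect, unaffected
```

Records flagged as suspect would need to be replaced with their GRQA_v1.3 counterparts rather than corrected in place, since the change notes do not spell out the full mapping between wrong and right codes.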

An overview of all the files in the dataset can be found in README_v1.3.md.

A statistical overview of all 42 parameters is given in the data catalog file GRQA_data_catalog_v1.3.pdf.

For more information about the development of this dataset, see Virro, H., Amatulli, G., Kmoch, A., Shen, L., and Uuemaa, E.: GRQA: Global River Water Quality Archive, Earth Syst. Sci. Data, 13, 5483–5507, https://doi.org/10.5194/essd-13-5483-2021, 2021.

Files

GRQA_data_v1.3.zip

Files (4.9 GB)

Checksum | Size
md5:bee508b73a581c47691d976a3f8f17da | 1.1 GB
md5:1ccf6dc4b583b4e0897f8ca882aeb243 | 38.4 MB
md5:99d4f5bacea7f9fd8c11321ae97c8dcf | 13.7 MB
md5:49eedd789acbf3599c35c418434cd7f6 | 3.7 GB
md5:9811621f10686f898b62bde6523af98d | 3.8 kB
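
The MD5 values listed above can be used to confirm that a download completed without corruption. The following is a minimal sketch using only the Python standard library; the function names are illustrative, and which checksum belongs to which file must be taken from the record itself, since the listing here does not pair file names with checksums.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks.

    Chunked reading keeps memory use low for multi-gigabyte archives.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 with a listed value, with or without 'md5:'."""
    expected_hex = expected.strip().lower()
    if expected_hex.startswith("md5:"):
        expected_hex = expected_hex[len("md5:"):]
    return md5_of_file(path) == expected_hex
```

For example, `matches_checksum("GRQA_data_v1.3.zip", "md5:<value from the record>")` returns `True` only if the local file is byte-for-byte identical to the published one.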

Additional details

Funding

GLOMODAT – Enhancing data fusion, parallelisation for hydrological modelling and estimating sensitivity to spatial parameterization of SWAT to model nitrogen and phosphorus runoff at local and global scale (grant no. 795625)
European Commission