Dataset Open Access

SCDNA: a serially complete precipitation and temperature dataset in North America from 1979 to 2018 (Version 1.1)

Guoqiang Tang; Martyn P. Clark; Andrew J. Newman; Andrew W. Wood; Simon Michael Papalexiou; Vincent Vionnet; Paul H. Whitfield

Version updates:

Version 1.1 is generally consistent with Version 1 in data estimation but (1) provides source flags in the final dataset, (2) adds individual station files for every merged station, and (3) excludes some stations because of quality problems.

Station-based serially complete datasets (SCDs) of precipitation and temperature observations are important for hydrometeorological studies. We developed an SCD for North America (SCDNA) of precipitation, minimum temperature, and maximum temperature from 1979 to 2018. Raw meteorological station data were obtained from the Global Historical Climatology Network Daily (GHCN-D), the Global Surface Summary of the Day (GSOD), Environment and Climate Change Canada (ECCC), and a compiled station database in Mexico (Livneh et al. 2015).

There are three types of missing values that are infilled/reconstructed by this dataset:

  1. Missing values within the observation period, while the station is operating.
  2. Missing values outside the observation period (the reconstruction period), before the station was deployed or after it ceased operating.
  3. Station measurements that fail quality control checks, which are treated as missing values and imputed.
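The three cases can be told apart mechanically for a single station series. The following is a minimal sketch, not the procedure used to build SCDNA: it assumes the observation period runs from the first to the last non-missing raw value, and that a per-step quality control result is already available. The names `Gap`, `classify_gaps`, and `qc_ok` are hypothetical.

```python
from enum import Enum
from typing import List, Optional


class Gap(Enum):
    NONE = "observed"                  # valid observation, kept as-is
    IN_PERIOD = "missing_in_period"    # type 1: gap while the station operates
    OUT_OF_PERIOD = "reconstruction"   # type 2: before deployment or after closure
    QC_FAILED = "failed_qc"            # type 3: observation rejected by quality control


def classify_gaps(values: List[Optional[float]], qc_ok: List[bool]) -> List[Gap]:
    """Label every time step of one raw station series.

    Assumption: the observation period spans the first to the last
    non-missing raw value (None marks a missing value).
    """
    observed = [i for i, v in enumerate(values) if v is not None]
    if not observed:
        # No raw observations at all: everything would be reconstructed.
        return [Gap.OUT_OF_PERIOD] * len(values)

    first, last = observed[0], observed[-1]
    labels = []
    for i, v in enumerate(values):
        if i < first or i > last:
            labels.append(Gap.OUT_OF_PERIOD)
        elif v is None:
            labels.append(Gap.IN_PERIOD)
        elif not qc_ok[i]:
            labels.append(Gap.QC_FAILED)
        else:
            labels.append(Gap.NONE)
    return labels
```

Every step labelled anything other than `Gap.NONE` is a value that a serially complete dataset has to estimate.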

This dataset is useful for applications that require:

  1. Quality-controlled actual station observations from multiple datasets in North America;
  2. Station observations without missing values in the observation period;
  3. Serially complete station observations. Users should be cautious when using this dataset for trend analysis because trends may not be well reconstructed.

Three types of dataset files are provided:

  1. “SCDNA_v1.1.nc4”. This NetCDF file contains basic station information (ID, location, elevation) and the final variables. For each variable (precipitation, minimum temperature, and maximum temperature), this file provides the serially complete data, the estimation flag indicating whether a value is from observation or estimation, and the accuracy index (KGE) of the estimated data.
  2. “” to “”. These ten compressed files contain complete data for the production of the SCD, including quality flags, estimates from 16 strategies (quantile mapping, interpolation, machine learning, and multiple-strategy merging), corrected/uncorrected SCD estimates, accuracy indices, etc.
  3. "". This file contains the list and data of stations that have the same latitude and longitude records for various reasons, such as the same station appearing in different sources, naming rules, recording bias, network design, etc.
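To illustrate what the third file type captures, stations that share identical coordinates can be found by grouping on the (latitude, longitude) pair. This is a minimal sketch with made-up station IDs and coordinates; the function name `group_by_location` is hypothetical and the code is not taken from the dataset's own scripts.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Station = Tuple[str, float, float]  # (station ID, latitude, longitude)


def group_by_location(stations: List[Station]) -> Dict[Tuple[float, float], List[str]]:
    """Return only the locations that are shared by more than one station."""
    groups = defaultdict(list)
    for station_id, lat, lon in stations:
        # Exact comparison is intended: the records must be identical.
        groups[(lat, lon)].append(station_id)
    return {loc: ids for loc, ids in groups.items() if len(ids) > 1}


# Hypothetical example records, for illustration only.
example = [
    ("A001", 51.05, -114.07),
    ("B777", 51.05, -114.07),
    ("C123", 19.43, -99.13),
]
shared = group_by_location(example)
```

Here `shared` holds a single entry, mapping `(51.05, -114.07)` to the two station IDs recorded at that location.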

We recommend that users download "SCDNA_v1.1.nc4" for quick and direct application, and adopt the second type for in-depth investigation of different strategies and potential methodology improvement. Please refer to Readme.txt for more details.
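Once the values, estimation flags, and KGE of a station have been read from the NetCDF file, a common first step is to decide which estimates to trust. The sketch below works on plain Python lists and makes several assumptions: the flag has been converted to a boolean (`True` for estimated), the KGE follows the original formulation of Gupta et al. (2009), which may differ from the variant used for SCDNA, and the threshold of 0.5 is arbitrary. The function names are hypothetical.

```python
import math
from statistics import fmean, pstdev
from typing import List, Optional, Sequence


def kge(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Kling-Gupta efficiency in its original (2009) form.

    KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2), where r is
    the linear correlation, alpha the ratio of standard deviations, and
    beta the ratio of means (simulated over observed).
    """
    mu_o, mu_s = fmean(obs), fmean(sim)
    sd_o, sd_s = pstdev(obs), pstdev(sim)
    cov = fmean([(o - mu_o) * (s - mu_s) for o, s in zip(obs, sim)])
    r = cov / (sd_o * sd_s)
    alpha = sd_s / sd_o
    beta = mu_s / mu_o
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


def keep_reliable(
    values: Sequence[float],
    is_estimated: Sequence[bool],
    station_kge: float,
    min_kge: float = 0.5,  # arbitrary illustrative threshold
) -> List[Optional[float]]:
    """Keep all observations; keep estimates only if the station KGE is high enough."""
    trusted = station_kge >= min_kge
    return [
        v if (not est or trusted) else None
        for v, est in zip(values, is_estimated)
    ]
```

A KGE of 1 indicates a perfect match between estimates and observations, so a higher `min_kge` keeps fewer estimated values.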

The code used to produce this dataset is available on GitHub (

Files (105.8 GB)