Published September 12, 2024 | Version v1

Dataset for "On the Extrapolation of Generative Adversarial Networks for downscaling precipitation extremes in warmer climates"

  • 1. NIWA - The National Institute of Water and Atmospheric Research Ltd.
  • 2. National Institute of Water and Atmospheric Research Wellington

Description

Code and Dataset for "On the Extrapolation of Generative Adversarial Networks for downscaling precipitation extremes in warmer climates"

This dataset accompanies the research paper titled "On the Extrapolation of Generative Adversarial Networks for downscaling precipitation extremes in warmer climates", currently under review at the AGU journal Geophysical Research Letters (GRL). The study introduces a novel Regional Climate Model (RCM) emulator for high-resolution climate downscaling over the New Zealand region. For additional details and access to the code used in this research, please refer to our GitHub repository.

The code is also provided as a ".zip" file: On-the-Extrapolation-of-Generative-Adversarial-Networks-for-downscaling-precipitation-extremes-main.zip.

Aims

Our study focuses on two important gaps in the literature regarding the extrapolation of empirical downscaling algorithms. First, we examine how well relationships learned from a historical period extrapolate to future unobserved climates. We compare two widely used algorithms, a GAN and a deterministic CNN baseline, that use a similar architecture (i.e. convolutional layers) trained in a model-as-truth framework to downscale daily precipitation over New Zealand. We evaluate their accuracy in capturing climate change signals in mean and extreme precipitation. Second, we explore whether training on future vs. only historical periods combined with different-sized training datasets can improve extrapolation skill. 

Geographic Focus

Our research focuses only on the New Zealand Region (165°E-184°W, 33°S-51°S).

 

Data Overview

Training and Evaluation Data

The training data used in this study (for our RCM emulator) spans the historical and future (SSP370) simulation periods. It comprises daily accumulated precipitation as the target variable, alongside large-scale predictor variables.

  • Resolution: The target variable is provided at a 12km resolution, reflecting the highest-resolution face of the RCM for the New Zealand region. Predictor variables are coarsened to a 1.5-degree resolution from the original CCAM outputs using conservative interpolation.

  • Period Coverage:

    • Training Data: 1960-2100 (depending on the experiment; see Table 1 for the experiment configurations)
    • Validation Data: 1985-2014 + 2070-2099 (to compute the climate change signal)
  • Models:

    • Trained on: ACCESS-CM2
    • Validated on: EC-Earth3, NorESM2-MM, CNRM-CM6-1, AWI-MR-1 
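The two validation periods listed above exist so that a climate change signal can be computed as the difference between a future (2070-2099) and a historical (1985-2014) statistic. The following is a minimal sketch of that calculation in plain Python for a single grid point. The input layout (a mapping from year to daily values), the function names, and the choice of a relative (percent) change are assumptions made for illustration and are not taken from the paper's code.

```python
from statistics import mean

HIST = (1985, 2014)    # historical validation period (inclusive years)
FUTURE = (2070, 2099)  # future validation period (inclusive years)

def period_values(daily_by_year, period):
    """Collect the daily values of every year inside an inclusive (start, end) period."""
    start, end = period
    return [
        value
        for year, days in daily_by_year.items()
        if start <= year <= end
        for value in days
    ]

def climate_change_signal(daily_by_year, stat=mean, relative=True):
    """Future-minus-historical change in `stat`, in percent if `relative` is True."""
    hist = stat(period_values(daily_by_year, HIST))
    fut = stat(period_values(daily_by_year, FUTURE))
    if relative:
        return 100.0 * (fut - hist) / hist
    return fut - hist

# Toy data (hypothetical values, mm/day); the year 2050 lies outside both periods.
toy = {1990: [1.0, 3.0], 2000: [2.0, 2.0], 2050: [100.0], 2080: [3.0, 3.0]}
signal_percent = climate_change_signal(toy)
signal_absolute = climate_change_signal(toy, relative=False)
```

Passing a different `stat` (for example `max`, or a percentile function) gives the corresponding signal for an extreme-precipitation index instead of the mean.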

File Structure

  • Training Data:

    • Target/Ground Truth (Y): target_ACCESS-CM2_hist_ssp370_pr.nc
    • Predictor (X): predictor_ACCESS-CM2_hist_ssp370.nc
  • Evaluation Data:
    All other GCMs are provided in a single file per variable type. The predictor and target variables have the dimensions (time, lat, lon, GCM).

    • Target/Ground Truth (Y): Other_GCMs_hist_SSP370_target_fields_pr.nc
    • Predictor (X): Other_GCMs_hist_SSP370_predictor_fields.nc
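Because the evaluation files stack all GCMs along a trailing dimension, working with a single model means indexing that last axis. The sketch below uses a small synthetic NumPy array as a stand-in for a variable read from the NetCDF file. The order of the GCM names along the axis is an assumption made for illustration; the real order should be read from the file's own GCM coordinate.

```python
import numpy as np

# Assumed, illustrative ordering of the GCM axis; check the file for the real order.
GCMS = ["EC-Earth3", "NorESM2-MM", "CNRM-CM6-1", "AWI-MR-1"]

def select_gcm(field, gcm_name, gcms=GCMS):
    """Return the (time, lat, lon) slice for one GCM from a (time, lat, lon, GCM) array."""
    if field.ndim != 4 or field.shape[-1] != len(gcms):
        raise ValueError("expected dimensions (time, lat, lon, GCM)")
    return field[..., gcms.index(gcm_name)]

# Synthetic stand-in: 3 days, 2 latitudes, 2 longitudes, 4 GCMs.
demo = np.arange(3 * 2 * 2 * 4, dtype=float).reshape(3, 2, 2, 4)
noresm = select_gcm(demo, "NorESM2-MM")
```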

Methodological Insights

  • Regional Climate Model: Our RCM training data comes from the Conformal Cubic Atmospheric Model (CCAM), a global non-hydrostatic atmospheric model known for its variable-resolution cubic grid. For more information about CCAM, please see the following paper.

  • Predictor and Target Variables: Daily-averaged large-scale prognostic variables (zonal wind, meridional wind, temperature, and specific humidity) at the 500mb and 850mb pressure levels are used as predictors. These are normalized (see the GitHub repository for the mean and standard deviation fields). Precipitation is taken as is from CCAM and accumulated for each day. Static predictors are also used in our model; these are stored in the GitHub repository.

  • Training Framework: Our dataset follows the "perfect framework" training strategy, which uses CCAM-coarsened predictor variables. For more information about the perfect and imperfect training frameworks, see the following review.
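The predictor normalization mentioned above can be sketched as a grid-point standardization with the mean and standard deviation fields. This is a minimal illustration in plain Python: the formula (x - mean) / std, the function name, and the toy values are assumptions, and the actual preprocessing in the GitHub repository may differ.

```python
def normalize(field, mean_field, std_field):
    """Standardize a 2-D field grid point by grid point: (x - mean) / std."""
    return [
        [(x - m) / s for x, m, s in zip(row, mean_row, std_row)]
        for row, mean_row, std_row in zip(field, mean_field, std_field)
    ]

# Toy 2x2 temperature field (K) for one day, with hypothetical mean and std fields.
temperature = [[280.0, 285.0], [290.0, 295.0]]
mean_field = [[282.0, 282.0], [288.0, 288.0]]
std_field = [[2.0, 3.0], [4.0, 7.0]]

normalized = normalize(temperature, mean_field, std_field)
```

The same mean and standard deviation fields would need to be applied to the evaluation GCMs so that all inputs share one scale.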

Algorithm              | Training Data                  | Period
-----------------------|--------------------------------|--------------------------
Deterministic Baseline | Historical                     | 1960-2014 (~21,000 days)
Deterministic Baseline | Future (SSP370)                | 2044-2099 (~21,000 days)
Deterministic Baseline | Historical and Future (SSP370) | 1960-2099 (~51,000 days)
Residual GAN           | Historical                     | 1960-2014
Residual GAN           | Future (SSP370)                | 2044-2099
Residual GAN           | Historical and Future (SSP370) | 1960-2099

Table 1: The six RCM emulator experiments performed in this study.
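For scripting over the experiments, the configurations in Table 1 can be written down as a small data structure. The sketch below is only one way to encode them: the class name, field names, and helper function are hypothetical and do not come from the study's code.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Experiment:
    algorithm: str
    training_data: str
    start_year: int
    end_year: int

    def covers(self, year):
        """True if `year` falls inside this experiment's training period."""
        return self.start_year <= year <= self.end_year

# Training periods and algorithms as listed in Table 1.
PERIODS = [
    ("Historical", 1960, 2014),
    ("Future (SSP370)", 2044, 2099),
    ("Historical and Future (SSP370)", 1960, 2099),
]
ALGORITHMS = ("Deterministic Baseline", "Residual GAN")

EXPERIMENTS = [
    Experiment(algorithm, name, start, end)
    for algorithm in ALGORITHMS
    for name, start, end in PERIODS
]

def experiments_training_on(year):
    """List the experiments whose training period includes `year`."""
    return [exp for exp in EXPERIMENTS if exp.covers(year)]
```

For example, `experiments_training_on(2030)` returns only the two combined historical-and-future experiments, since 2030 lies in the gap between the historical and future-only training periods.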

Files

On-the-Extrapolation-of-Generative-Adversarial-Networks-for-downscaling-precipitation-extremes-main.zip

Files (25.4 GB)

MD5 checksum                         | Size
md5:8e4700e910169c1aaafaaaea4f6ce484 | 59.1 MB
md5:3fbc51ddb98b1e0f5968151711802214 | 103.8 MB
md5:07308f21392ed3636825c6be740e0352 | 2.4 GB
md5:66f69daa94a5c97b133f91092d74daab | 8.5 GB
md5:840c5af88ac1b46904b5378123b333ca | 1.8 GB
md5:93a39971bbfc2bf90abc8edc70b67d76 | 12.6 GB
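Since several of the files are multiple gigabytes, it can be worth checking a download against the MD5 checksums listed above. The following is a minimal sketch using Python's standard `hashlib` module; the helper names are illustrative, and which checksum belongs to which file has to be taken from the record page, as the pairing is not shown in this listing.

```python
import hashlib

def md5sum(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_checksum(path, expected):
    """Compare a file against a checksum given as 'md5:<hex>' or plain '<hex>'."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5sum(path) == expected_hex
```

Reading in chunks keeps memory use constant, which matters for the larger NetCDF files.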

Additional details