This dataset is associated with the publication "Bias correction and statistical modeling of variable oceanic forcing of Greenland outlet glaciers" by Verjans V., Robel A., Thompson A. F., and Seroussi H.
It includes all the corrections specified in the corrigendum of the publication.

Author: Vincent Verjans
Contact: vverjans3@gatech.edu

Part 1: Description of the final output thermal forcing (TF) time series only (located in the FinalOutput subdirectory)
Part 2: Description of code to compute the TF time series (follows the methodology described in the publication associated with this dataset)
Part 3: More detailed description of each intermediary file (located in the InputOutput subdirectory)

---------------------------------------------------------

Part 1: Description of the final output time series only

There are 4 files of final TF series, generated following the method described in the publication associated with this dataset. The 4 files are located in the subdirectory FinalOutput. The 4 files are:

(1) generatedTF_allglaciersinshore_MIROCES2L_MembersAverage_hist2100ssp126.nc
    1850-2100 time series of MIROC-ES2L under the hist (1850-2015) and ssp126 (2015-2100) emission scenarios
(2) generatedTF_allglaciersinshore_MIROCES2L_MembersAverage_hist2100ssp585.nc
    1850-2100 time series of MIROC-ES2L under the hist (1850-2015) and ssp585 (2015-2100) emission scenarios
(3) generatedTF_allglaciersinshore_IPSLCM6A_MembersAverage_hist2100ssp126.nc
    1850-2100 time series of IPSL-CM6A under the hist (1850-2015) and ssp126 (2015-2100) emission scenarios
(4) generatedTF_allglaciersinshore_IPSLCM6A_MembersAverage_hist2100ssp585.nc
    1850-2100 time series of IPSL-CM6A under the hist (1850-2015) and ssp585 (2015-2100) emission scenarios

Each of these NetCDF files contains 1000 stochastic realizations of TF time series at the 226 glacier front locations investigated in the publication associated with this dataset.
The 1000 stochastic realizations have been generated with the statistical models and with the glacier-to-glacier correlation that are described in the publication associated with this dataset.

Each file has 6 variables:
(a) time in years
(b) lat for latitude coordinate of each glacier
(c) lon for longitude coordinate of each glacier
(d) Glacier for name of each glacier
(e) RealizationNumber for the number of each given realization of TF (ranges between 0 and 999)
(f) TF for the final TF time series. The dimensions are 1000*226*3000, where 1000 is the number of stochastic realizations, 226 is the number of different glaciers, and 3000 is the number of monthly time steps in the time series (1850-2100).

---------------------------------------------------------
---------------------------------------------------------

Part 2: Description of code to compute the TF time series

For more information on each step of the procedure, please refer to the publication associated with this dataset.

1) Extract the raw CMIP6 model output in a domain around Greenland
   - download CMIP6 model output for variables thetao and so for the experiments of interest (e.g., hist, ssp585, ssp126) (https://esgf-node.llnl.gov/search/cmip6/)
   - store the downloaded products in the subdirectory InputOutput/rawCMIP6files/
   - run membersCMIP_boxGr.py with your selected model (variable SelModel) and selected experiment (e.g., To2100histssp585 = True)

2) Depth-averaging of TF
   - run tfDpAvgModCMIPscenarios.py with your selected model (variable SelModel) and selected experiment (e.g., To2100histssp585 = True)
   Note that you can select the depth range for depth-averaging (variable DepthRange) and the minimum bathymetry threshold used to exclude gridpoints that are too shallow (variable ShallowThreshold).
   example of output file:
   ensembleMIROCES2L_hist2100ssp585_Mr1_TFdpavg_Dp0to500_bathymin100.nc
       contains the grid of TF(0-500) around Greenland

3) Quantile Delta Mapping
   - run qdmtfCMIPmodwithEN4nrnb.py
   This will perform the QDM correction based on the EN4 reanalysis product.
   Note that you can select the period of calibration (variable PeriodObs0).
   example of output file:
   ensembleMIROCES2L_qdmhist2100ssp585_Mr1_ObsPer1950to2015.nc
       contains the grid of QDM-corrected TF(0-500) around Greenland

4) Computing cost functions for selection of offshore predictors
   - run matsOlHu_FrGl.py with your choice of variables:
       numPointsToGl: the number of nearest ECCO grid points to each individual glacier front over which TF is averaged to represent the ECCO inshore TF
       decayDist: the decay lengthscale in the localization function
       weightscostmat: the weights for the Quality, Strength, and Localization functions in the final cost function
   example of output files:
   (1) offshorePredsCostTotMIROCES2L_QMen4anlnrnbTFdpavg_Dp0to500_bathymin100_ObsPer1950to2015.csv
       contains the index pair (x,y) of the offshore predictor for each ensemble member of the AOGCM
   (2) info_offshorePredsCostTotMIROCES2L_QMen4anlnrnbTFdpavg_Dp0to500_bathymin100_ObsPer1950to2015.csv
       contains the parameter values used in the matsOlHu_FrGl.py code

5) Extrapolation based on ECCO relations
   - run extrapolOlHo_FrGl.py
   This performs the extrapolation and produces a single inshore TF time series per marine-terminating glacier.
   example of output files:
   (1) inshoreTFseriesMIROCES2L_hist2100ssp585_QMen4anlnrnbTFdpavg_Dp0to500_bathymin100_ObsPer1950to2015.csv
       contains all the inshore TF time series
   (2) infoIDinshoreTFseriesMIROCES2L_hist2100ssp585_QMen4anlnrnbTFdpavg_Dp0to500_bathymin100_ObsPer1950to2015.csv
       contains the identification information corresponding to each TF time series of file (1) (i.e., AOGCM name, ensemble member, marine glacier name)
   (3) extrapolParametersMIROCES2L_hist2100ssp585_QMen4anlnrnbTFdpavg_Dp0to500_bathymin100_ObsPer1950to2015.csv
       contains the values of the extrapolation parameters (alpha, gamma, beta) for each TF extrapolation corresponding to file (1)
   (4) valuesInshoreParametersMIROCES2L_hist2100ssp585_QMen4anlnrnbTFdpavg_Dp0to500_bathymin100_ObsPer1950to2015.csv
       contains the values of the inshore mean, the 12 inshore monthly effects, and the inshore standard deviation resulting from the extrapolation for each TF time series of file (1)

6) Find best ARMA model to fit the residual variability TF'
   - run findfullBestARMA.py
   Tests all ARMA models specified with the variables evalarorders and evalmaorders, and selects the optimal ARMA model for each combination of AOGCM, glacier, and member.
   Note that the TF series are detrended and deseasonalized to yield TF'.
   The order (TrendOrderFit) and number of breakpoints (TrendNbBreaks) for the detrending can be chosen.
   The order (SsnOrderFit) and breakpoint (SsnDateBreak) to fit the monthly effects for deseasonalizing can be chosen.
   example of output file:
   combsARMAsel_allmembersallglaciersinshore_MIROCES2L_hist2100ssp585.csv
       contains the AR and MA order selected for each AOGCM, glacier, member combination.

7) Fit the ARMA model coefficients
   - run fitStat_rsdyrARMA.py
   Fits all the coefficients of the Ensemble Mean mean-plus-trend component, of the Ensemble Mean monthly effect components, and of the optimal ARMA model. This is performed for each glacier.
   example of output file:
   fitARMAparams_allglaciersinshore_MIROCES2L_hist2100ssp585_criterionBIC_fitcoefAveraged.csv
       contains all the parameters for each individual glacier.

8) Compute the sparse correlation matrix between glaciers
   - run corrmat_rsdyr.py
   Estimates the sparse correlation matrix using the graphical lasso algorithm. The matrix is estimated across all the inshore TF' time series of a given member, and the correlation matrices of all the different members are then averaged.
   example of output file:
   correlationMatrix_allglaciersinshore_MIROCES2L_MembersAverage_hist2100ssp585.csv
       contains the 226*226 sparse correlation matrix (note that 226 is the number of marine-terminating glaciers used in this study).

9) Generate TF time series
   - run finalfullTFgeneration_v1.py, setting the number of stochastically generated samples that you want (variable nsamples) and keeping all the other settings identical to those used in the previous steps of the procedure
   This will generate nsamples stochastic realizations of TF time series, using the calibrated ARMA models and the sparse correlation matrix to prescribe covariance between the different glaciers.
   example of output file:
   generatedTF_allglaciersinshore_MIROCES2L_MembersAverage_hist2100ssp585.nc
       contains the nsamples TF realizations at each glacier.
   It has 6 variables:
   (a) time in years
   (b) lat for latitude coordinate of each glacier
   (c) lon for longitude coordinate of each glacier
   (d) Glacier for name of each glacier
   (e) RealizationNumber for the number of each given realization of TF (ranges between 0 and nsamples-1)
   (f) TF for the final TF time series.
   The dimensions are (nsamples)*(number of glaciers)*(length of time series)

---------------------------------------------------------
---------------------------------------------------------

Part 3: More detailed description of each intermediary file (located in the InputOutput subdirectory)

(3.1) rawCMIP6files:
    subdirectory in which to store the thetao and so files downloaded from the CMIP6 dataset (https://esgf-node.llnl.gov/search/cmip6/)
(3.2) ensembleMIROCES2L_hist2100ssp585_Mr1_TFdpavg_Dp0to500_bathymin100.nc:
    contains the gridded TF variable from the raw MIROC-ES2L member r1 around Greenland from 1850 to 2100 (1850-2015: hist experiment, 2015-2100: ssp585 experiment)
(3.3) ensembleMIROCES2L_qdmhist2100ssp585_Mr1_ObsPer1950to2015.nc:
    contains the gridded QDM-corrected TF variable from the raw MIROC-ES2L member r1 around Greenland from 1850 to 2100 (1850-2015: hist experiment, 2015-2100: ssp585 experiment). QDM-correction was performed with the EN4 gridded reanalysis product.
(3.4) offshorePredsCostTotMIROCES2L_QMen4anlnrnbTFdpavg_Dp0to500_bathymin100_ObsPer1950to2015.csv:
    contains the index pair (x,y) of the offshore predictor in the MIROC-ES2L grid for each ensemble member and each glacier.
(3.5) info_offshorePredsCostTotMIROCES2L_QMen4anlnrnbTFdpavg_Dp0to500_bathymin100_ObsPer1950to2015.csv:
    contains the parameter values used in the code for computing the optimal offshore predictor points (weighting of the Quality, Strength, and Localization functions, decay lengthscale for the Localization function, number of ECCO gridpoints used to compute the average ECCO inshore TF).
(3.6) inshoreTFseriesMIROCES2L_hist2100ssp585_QMen4anlnrnbTFdpavg_Dp0to500_bathymin100_ObsPer1950to2015.csv:
    contains all the inshore TF time series after extrapolation for the MIROC-ES2L member r1 in scenario ssp585.
(3.7) infoIDinshoreTFseriesMIROCES2L_hist2100ssp585_QMen4anlnrnbTFdpavg_Dp0to500_bathymin100_ObsPer1950to2015.csv:
    contains the identification information corresponding to each TF time series of file (3.6) (i.e., AOGCM name, ensemble member, marine glacier name).
(3.8) extrapolParametersMIROCES2L_hist2100ssp585_QMen4anlnrnbTFdpavg_Dp0to500_bathymin100_ObsPer1950to2015.csv:
    contains the values of the extrapolation parameters (alpha, gamma, beta) for each corresponding TF extrapolation in file (3.6).
(3.9) valuesInshoreParametersMIROCES2L_hist2100ssp585_QMen4anlnrnbTFdpavg_Dp0to500_bathymin100_ObsPer1950to2015.csv:
    contains the values of the inshore mean, the 12 inshore monthly effects, and the inshore standard deviation resulting from the extrapolation for each corresponding TF time series in file (3.6).
(3.10) combsARMAsel_allmembersallglaciersinshore_MIROCES2L_hist2100ssp585.csv:
    contains the AR and MA orders selected after fitting ARMA(p,q) models to the residual TF' time series for each glacier, member combination of MIROC-ES2L under scenario ssp585. We tested all possible combinations of p and q ranging between 0 and 4. The optimal (p,q) pair was selected by minimizing the Bayesian Information Criterion (BIC).
(3.11) fitARMAparams_allglaciersinshore_MIROCES2L_hist2100ssp585_criterionBIC_fitcoefAveraged.csv:
    contains all the parameters of the final statistical models calibrated to the glacier TF time series of MIROC-ES2L under scenario ssp585. Note that these parameters are averaged across ensemble members, as explained in the publication associated with this dataset.
(3.12) correlationMatrix_allglaciersinshore_MIROCES2L_MembersAverage_hist2100ssp585.csv:
    contains the 226*226 sparse correlation matrix capturing the correlation in the residual TF' time series of the glaciers in MIROC-ES2L under scenario ssp585. Note that 226 is the number of marine-terminating glaciers used in this study.
    Note that the correlation matrix is an average of the correlation matrices of the individual ensemble members, as explained in the publication associated with this dataset.
(3.13) dpavg_tf_ECCO2arcticmonthly_Dp0to500_bathymin100.nc:
    the gridded TF values of ECCO in a domain around Greenland (Nguyen et al., 2012).
(3.14) dpavg_tf_EN4anl_Dp0to500_bathymin100.nc:
    the gridded TF values of EN4 in a domain around Greenland (Good et al., 2013).
(3.15) ecco2arctic_removeinds.csv:
    the indices of the ECCO gridpoints removed because they show unphysical variability in TF.
(3.16) ensembleMIROCES2L_qdmhist_Mr1_ObsPer1950to2015.nc:
    similar to file (3.3) but only over the hist experiment (1850-2015).
(3.17) frontGlacierPos.csv:
    contains the coordinates of each glacier of the dataset of Wood et al. (2021) in the EPSG3413 projection.
(3.18) nrngb_MIROCES2L_toECCO2arcticAndEN4_bathymin100.csv:
    contains, for each MIROC-ES2L gridpoint around Greenland, its Lat-Lon coordinates, its indices on the MIROC-ES2L grid, the indices of the ECCO nearest neighbor gridpoint, the indices of the EN4 nearest neighbor gridpoint, and the oceanic sector to which it belongs.
(3.19) tfECCO2arctic_only1992.nc:
    contains the full temperature and salinity product of ECCO for the year 1992 (Nguyen et al., 2012).

---------------------------------------------------------
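
Illustrative sketch of the generation step (step 9 of Part 2)

The following is a minimal, standard-library-only sketch of how stochastic TF realizations could be assembled from a mean-plus-trend component, 12 monthly effects, an ARMA residual, and a glacier-to-glacier correlation matrix. It is not the code of finalfullTFgeneration_v1.py. All function names, the dictionary keys, the parameter values, the linear trend, and the choice to apply the correlation to the ARMA innovations through a Cholesky factor are assumptions made for illustration only.

```python
import math
import random


def cholesky(corr):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(corr)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(corr[i][i] - s)
            else:
                low[i][j] = (corr[i][j] - s) / low[j][j]
    return low


def correlated_noise(low, nsteps, rng):
    """Draw nsteps vectors of unit-variance Gaussian noise correlated via low."""
    n = len(low)
    out = []
    for _ in range(nsteps):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        out.append([sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(n)])
    return out


def arma_series(noise, ar, ma, sigma):
    """ARMA(p,q) recursion: x_t = sum(ar_i * x_{t-i}) + e_t + sum(ma_j * e_{t-j})."""
    x, e = [], []
    for eps in noise:
        e_t = sigma * eps
        val = e_t
        val += sum(a * x[-i - 1] for i, a in enumerate(ar) if i < len(x))
        val += sum(m * e[-j - 1] for j, m in enumerate(ma) if j < len(e))
        x.append(val)
        e.append(e_t)
    return x


def generate_tf(glaciers, corr, nsteps, nsamples, seed=0):
    """Return TF realizations indexed as [realization][glacier][monthly step]."""
    rng = random.Random(seed)
    low = cholesky(corr)
    samples = []
    for _ in range(nsamples):
        noise = correlated_noise(low, nsteps, rng)
        realization = []
        for g, par in enumerate(glaciers):
            resid = arma_series([row[g] for row in noise],
                                par["ar"], par["ma"], par["sigma"])
            # Assumed linear trend per year; monthly effect cycles every 12 steps.
            realization.append([
                par["mean"] + par["trend"] * (t / 12.0)
                + par["monthly"][t % 12] + resid[t]
                for t in range(nsteps)
            ])
        samples.append(realization)
    return samples


# Hypothetical parameters for two glaciers (values are made up).
glaciers = [
    {"mean": 3.0, "trend": 0.01, "monthly": [0.0] * 12,
     "ar": [0.7], "ma": [0.2], "sigma": 0.1},
    {"mean": 4.5, "trend": 0.02, "monthly": [0.0] * 12,
     "ar": [0.5, 0.2], "ma": [], "sigma": 0.15},
]
corr = [[1.0, 0.6], [0.6, 1.0]]
tf = generate_tf(glaciers, corr, nsteps=120, nsamples=5, seed=42)
print(len(tf), len(tf[0]), len(tf[0][0]))  # 5 2 120
```

The nested list mirrors the (nsamples)*(number of glaciers)*(length of time series) layout of the TF variable. In practice the actual parameters would come from the fitARMAparams and correlationMatrix files described above, and the publication should be consulted for the exact trend and seasonal formulations.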