Data files belonging to the paper "Dealing with clustered samples for assessing map accuracy by cross-validation"
Creators
- 1. Wageningen University & Research
- 2. The University of Sydney
Description
Mapping of environmental variables often relies on map accuracy assessment through cross-validation with the data used for calibrating the underlying mapping model. When the data points are spatially clustered, conventional cross-validation leads to optimistically biased estimates of map accuracy. Several papers have promoted spatial cross-validation as a means to tackle this over-optimism. Many of these papers blame spatial autocorrelation for the bias and propagate the widespread misconception that spatial proximity of calibration points to validation points invalidates classical statistical validation of maps. In the paper related to these data, we present and evaluate alternative cross-validation approaches for assessing map accuracy from clustered sample data.
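The difference between conventional and spatial cross-validation mentioned above comes down to how sample points are assigned to folds. The following sketch is illustrative only and is not the procedure evaluated in the paper; the function names, the 100 km block size and the toy coordinates are all assumptions.

```python
import random
from collections import defaultdict


def random_folds(n_points, k, seed=0):
    """Conventional K-fold: shuffle point indices and deal them into k folds."""
    rng = random.Random(seed)
    idx = list(range(n_points))
    rng.shuffle(idx)
    return [sorted(idx[i::k]) for i in range(k)]


def block_folds(coords, k, block_size, seed=0):
    """Spatial K-fold: points in the same square block always share a fold."""
    blocks = defaultdict(list)
    for i, (x, y) in enumerate(coords):
        blocks[(int(x // block_size), int(y // block_size))].append(i)
    block_ids = sorted(blocks)
    rng = random.Random(seed)
    rng.shuffle(block_ids)
    folds = [[] for _ in range(k)]
    for j, block in enumerate(block_ids):
        folds[j % k].extend(blocks[block])
    return [sorted(fold) for fold in folds]


# Hypothetical clustered sample (coordinates in metres): two tight clusters.
coords = [(1000.0 + i, 2000.0 + i) for i in range(5)]
coords += [(250000.0 + i, 400000.0 + i) for i in range(5)]

conventional = random_folds(len(coords), k=2)
spatial = block_folds(coords, k=2, block_size=100_000.0)
```

With random folds, validation points usually have calibration points from the same cluster right next to them; with block folds, each cluster is held out as a whole.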
The study area is western Europe, bounded to the north at 52° latitude and lying between -10° and 24° longitude. The projection is IGNF:ETRS89LAEA (Lambert azimuthal equal-area projection).
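To get a rough feel for where a longitude/latitude pair falls in the projected grid, the forward Lambert azimuthal equal-area formula can be sketched as below. This is an approximation under stated assumptions: the projection centre (52° N, 10° E) and the false easting and northing are the values commonly given for ETRS89-LAEA and are not stated in this record, and the code uses a sphere with the GRS80 authalic radius instead of the ellipsoid, so results can differ from the true grid coordinates by kilometres. Use a proper projection library for real work.

```python
import math

# Assumed ETRS89-LAEA parameters (not given in this record).
LAT0_DEG, LON0_DEG = 52.0, 10.0
FALSE_EASTING, FALSE_NORTHING = 4_321_000.0, 3_210_000.0
R_AUTHALIC = 6_371_007.181  # authalic sphere radius for GRS80, in metres


def laea_forward(lon_deg, lat_deg):
    """Spherical Lambert azimuthal equal-area forward projection (approximate)."""
    lam = math.radians(lon_deg - LON0_DEG)
    phi = math.radians(lat_deg)
    phi0 = math.radians(LAT0_DEG)
    denom = (1.0 + math.sin(phi0) * math.sin(phi)
             + math.cos(phi0) * math.cos(phi) * math.cos(lam))
    k = math.sqrt(2.0 / denom)
    x = R_AUTHALIC * k * math.cos(phi) * math.sin(lam)
    y = R_AUTHALIC * k * (math.cos(phi0) * math.sin(phi)
                          - math.sin(phi0) * math.cos(phi) * math.cos(lam))
    return x + FALSE_EASTING, y + FALSE_NORTHING


# The projection centre maps onto the false easting and northing.
centre_x, centre_y = laea_forward(10.0, 52.0)
```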
Files:
agb.tif = above ground biomass (AGB) map from version 3 of the 2017 CCI-Biomass product (https://catalogue.ceda.ac.uk/uuid/5f331c418e9f4935b8eb1b836f8a91b8)
AGBstack.tif = covariates used for predicting AGB
aggArea.tif = coarse grid used for simulation in the model-based methods
ocs.tif = soil organic carbon stock (OCS) map (0-30 cm) from Soilgrids (https://www.isric.org/explore/soilgrids)
OCSstack.tif = covariates used for predicting OCS
strata.xxx = 100 compact geo-strata (ESRI shape) created with the spcosa package; used for generating clustered samples
TOTmask.tif = mask of the area covered by the covariates
Details and data sources of the covariates in AGBstack.tif and OCSstack.tif:
Name | Description | Source | Note |
---|---|---|---|
ai | Aridity Index | Version 2.1 | |
bio1 | Mean annual air temperature [°C] | https://chelsa-climate.org/downloads/ | Version 2.1 |
bio5 | Mean daily maximum air temperature of the warmest month [°C] | https://chelsa-climate.org/downloads/ | Version 2.1 |
bio7 | Annual range of air temperature [°C] | https://chelsa-climate.org/downloads/ | Version 2.1 |
bio12 | Annual precipitation [kg/m2] | https://chelsa-climate.org/downloads/ | Version 2.1 |
bio15 | Precipitation seasonality [kg/m2] | https://chelsa-climate.org/downloads/ | Version 2.1 |
gdd10 | Growing degree days heat sum above 10°C | https://chelsa-climate.org/downloads/ | Version 2.1 |
clay | Clay content [g/kg] of the 0-5 cm layer | | Only used for AGB |
sand | Sand content [g/kg] of the 0-5 cm layer | https://soilgrids.org/ | as above |
pH | Acidity (pH(water)) of the 0-5 cm layer | https://soilgrids.org/ | as above |
glc2017 | Landcover 2017 | https://land.copernicus.eu/global/products/lc, reclassified to: closed forest, open forest, natural non-forest veg., bare & sparse veg., cropland, built-up, water | Categorical variable |
dem | Elevation | https://www.eea.europa.eu/data-and-maps/data/copernicus-land-monitoring-service-eu-dem | |
cosasp | Cosine of slope aspect | Computed with the terra package from elevation | Computed at 25 m resolution; then aggregated to 0.5 km |
sinasp | Sine of slope aspect | Computed with the terra package from elevation | as above |
slope | Slope | Computed with the terra package from elevation | as above |
TPI | Topographic position index | Computed with the terra package from elevation | as above |
TRI | Terrain ruggedness index | Computed with the terra package from elevation | as above |
TWI | Topographic wetness index | Computed with SAGA from 500 m resolution (aggregated) dem | |
gedi | Forest height | Zone: NAFR | |
xcoord | X coordinate | Using a mask created from the other covariates | |
ycoord | Y coordinate | Using a mask created from the other covariates | |
Dcoast | Distance from coast | Using a land mask created from the other covariates | |
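Several of the terrain covariates listed above are simple functions of elevation or aspect. Aspect is a circular quantity (359° and 1° are almost the same direction), which is why it is stored as a cosine and a sine. The sketch below shows these transformations for single values. It is not the code used to produce the dataset, which relied on the terra package; it assumes aspect is measured in degrees clockwise from north and uses the 8-neighbour definitions of TPI and TRI as documented for terra's terrain function. The function names are hypothetical.

```python
import math


def aspect_components(aspect_deg):
    """Return (cosasp, sinasp) for an aspect in degrees clockwise from north."""
    a = math.radians(aspect_deg)
    return math.cos(a), math.sin(a)


def tpi_tri(window):
    """TPI and TRI for the centre cell of a 3x3 elevation window (3 rows of 3).

    TPI: centre value minus the mean of its 8 neighbours.
    TRI: mean absolute difference between the centre and its 8 neighbours.
    """
    centre = window[1][1]
    neighbours = [window[r][c]
                  for r in range(3) for c in range(3)
                  if (r, c) != (1, 1)]
    tpi = centre - sum(neighbours) / 8.0
    tri = sum(abs(centre - v) for v in neighbours) / 8.0
    return tpi, tri


# An east-facing slope has cosine close to 0 and sine close to 1.
cos_east, sin_east = aspect_components(90.0)

# A hypothetical isolated peak 8 m above flat surroundings.
peak = [[10.0, 10.0, 10.0],
        [10.0, 18.0, 10.0],
        [10.0, 10.0, 10.0]]
peak_tpi, peak_tri = tpi_tri(peak)
```

For a pit instead of a peak, TPI changes sign while TRI stays positive, which is why the two indices carry different information.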
Files (2.4 GB)
Checksum | Size |
---|---|
md5:c959521dbb7d1891edf13d47ed777eef | 53.8 MB |
md5:8febe5916e7f2a9add74d77f6babf4bf | 1.2 GB |
md5:e036e2a24d4158baf440798805201c70 | 59.9 kB |
md5:985d0d5e7f107eefc45b8c469dd1441d | 49.7 MB |
md5:2b4f786d7fe99e72ef00658447666300 | 1.0 GB |
md5:c7f498d34d4957883711069f072b870a | 2.6 kB |
md5:b0c293ffe47b5ce0166f9720013931dc | 337.2 kB |
md5:ab982163e2ae5d88a747b6760bcb9a20 | 900 Bytes |
md5:378f005d4d76dd0a9eb4e60ae417d065 | 3.4 MB |
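The md5 values listed above can be used to check that a download is complete and uncorrupted. A minimal sketch using only the Python standard library follows; the function names are hypothetical, and the file is read in chunks so that the larger rasters do not need to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listed_checksum(local_path, "md5:...")` returns `True` when the file at `local_path` has the given checksum; which checksum belongs to which file has to be taken from the record's download page.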
Additional details
Related works
- Is supplement to
- Software: 10.5281/zenodo.6514923 (DOI)
- Journal article: 10.1016/j.ecoinf.2022.101665 (DOI)
References
- de Bruin et al., 2022. Dealing with clustered samples for assessing map accuracy by cross-validation. https://doi.org/10.1016/j.ecoinf.2022.101665