This repository is split into the following subfolders which contain different portions of the project. Descriptions of each file contained in these folders are given the corresponding subsection below.
Analysis
This folder, empty in the archive file, is used to save summary tables during some steps of data processing. We have included it for easier replication of some portions of the R code in the “Code” folder.
ModelObjects
Files in ~/ModelObjects include variable selection outputs from the VSURF package v. 1.1.0, and Random Forest (RF) model objects developed in the ranger package v. 0.13.1. VSURF objects were used in selecting final variables for RF models, while RF objects were used to predict land cover classes throughout the study area in central Arizona, USA from 1985 to 2020. - VSURF_vegetated.rds: VSURF model selection outputs for the model separating forest from non-forest vegetation - VSURF_forestType.rds: VSURF model selection outputs for the model separating different forest types (mixed conifer, ponderosa pine, pine-oak, and pinyon-juniper) - RFVegetatedMod.rds: Final RF model used to separate forest from non-forest (grass/shrub and bare ground) cover types - RFForestTypeMod.rds: Final RF model used to separate different forest types (mixed conifer, ponderosa pine, pine-oak, pinyon-juniper)
Code
This folder contains all Google Earth Engine (.txt files) and R (.r files) code used in our analyses and in producing figures. Brief descriptions of these scripts are given below, split by subfolder. To repeat any of these analyses, make sure that “PrTo_LandscapeChange.Rproj” is open in a new R session. This R project file works in conjunction with the “here” package to read in datasets with relative path names for repeatability on different computers and operating systems.
Step1-Processing
- Step1a-LandTrendrFunctions.txt: Creates functions to extract Landsat imagery within an area of interest, harmonize Landsat 8/9 (OLI) to Landsat 5 (TM) equivalents, and perform spatial and temporal segmentation to create Landsat-derived predictors of annual land cover.
- Step1b-LandTrendrLayerCreation.txt: Uses functions defined in “Step1a…” to develop seasonal Landsat composites for predictions of annual land cover.
- Step1c-GetElevHliTpi.txt: Uses 10-m digital elevation models to extract elevation and topographic position index (TPI) throughout the study area and obtains the CHILI dataset of Theobald et al. (2015)
- Step1d-FireSeverity.txt: Uses fire perimeter dataset found in “Data/Fires/AllFires.shp” to calculate RdNBR with a phenology offset following methods in Parks et al. (2018).
- Step1e-GetFDSI.R: Uses PRISM data from the study area to calculate the Forest Drought Stress Index (FDSI) following Williams et al. (2012), an indicator of moisture stress and tree growth potential in Southwestern forests.
- Step1f-GetBBData.R: Extracts data from Hicke et al. (2020), derived from USFS aerial detection surveys, to get estimates of bark beetle-caused tree mortality in each year (1997-2018), aggregated across different agent and tree host combinations.
- Step1g-GetTreeRingData.R: Processes raw tree ring data, formats it for use in dplR package, and creates standardized tree ring chronologies for the two landscapes.
- Step1h-ReprojectSpatialData.R: Reads in all spatial data files and reprojects them to the same system (NAD83 UTM12N). Raster data are also aligned to the same 30-m reference grid, which is based on the Landsat data and plot centers.
Step2-Analysis
- Step2a-SpatialExtraction.R: Extracts spatial data (e.g., Landsat, terrain data) to each training data point location, for subsequent use in classification models.
- Step2b-ClasssificationRF.R: Uses data extracted in “Step2a…” to perform variable selection, tune, and fit Random Forest classification models. These models are saved in the “ModelObjects” folder for later use.
- Step2c-PredictionRF.R: Uses fitted Random Forest models from “Step2b…”, as well as Landsat and terrain data, to predict annual land cover types in the study area from 1985 to 2020.
- Step2d-AccuracyAssessmentRF.R: Performs 20-fold cross-validation with models from “Step2b…” to assess final classification accuracy.
- Step2e-VisualizeChange.R: Extracts error-adjusted area proportion (following Olofsson et al. 2014) for each year, land cover type, and landscape. This is then visualized using line graphs for each landscape and cover type.
- Step2f-FireInsectsHarvestGrowth.R: Pulls in spatial data describing bark beetle-caused tree mortality, fire activity, tree harvest, tree-ring indices, FDSI, and other potential drivers of forest change to create a large panel figure.
- Step2g-CARTModel_CausalDrivers.R: Fits, visualizes, and evaluates a spatiotemporal classification tree model to predict the drivers of annual transitions from forest to non-forest cover types.
Data
This folder contains spatial data, training data, and tree-ring records used throughout the study area. Unless otherwise specified, all spatial data have been clipped to either the bounding box of the study area, or the study landscapes themselves (found in “Data/StudyBound/BothForests_Merged.shp”) and reprojected to NAD83 UTM12N.
AnnualClassifications
All files contained in this folder are geotiff-format raster files (30-m resolution), which give the Random Forest-predicted land cover type in each pixel and year. File naming conventions are as follows. Class[year].tif, where [year] is the specific year of a land cover map. Files are in 8-bit unsigned integer format, and the following information defines the land cover class values for each number.
- Ponderosa Pine
- Mixed Conifer
- Pinyon-Juniper
- Pine-Oak
- Non-forest
Bark Beetles
All files contained in this folder are geotiff-format raster files which give estimates of percent beetle-caused tree mortality (across all species combinations) in a given pixel and year. Files were originally developed for the western US at a 1-km spatial resolution by Hicke et al. (2020) and resampled to 30-m using “average” interpolation to align with other datasets and maintain original values. File naming conventions are as follows. BBCausedMort_[year].tif, where [year] is the specific year of mortality estimates. Files are in 16-bit unsigned integer format, and values range 0-900, with cell numbers representing “mortality area” (m) within a 30-m cell. This involved multiplying mortality area estimates of Hicke et al. by 11.1111 to convert from units of ha/sq km to sq m/30-m pixel. This allowed for easier area-based summation at a 30-m resolution.
Climate/FDSI
All files contained in this folder are geotiff-format raster files which give values of FDSI in a given pixel and year. Files were originally developed for the study area using PRISM data (4-km resolution) and resampled to 30-m using “average” interpolation to align with other datasets and maintain original values. File naming conventions are as follows. FDSI_[year].tif, where [year] is the specific year of FDSI. Files are in 16-bit signed integer format, with cell numbers giving FDSI x 1000 (to preserve three decimals of precision).
Climate/TreeRings
All files contained in this folder were involved in tree-ring data collection or processing. Descriptions of sampled stand locations on each landscape are in “Data/Climate/TreeRings/Locations”, information about sampled trees and protocol are in “Data/Climate/TreeRings/ProtocolAndTreeData”, ring width measurements (from Measure J2X) and crossdating information (COFECHA) are in “Data/Climate/TreeRings/RawData”. - rawRingWidths_cleaned.csv: This file gives summarized ring width information from the 99 sampled ponderosa pine individuals described in the study. These individuals were located in untreated stands from each landscape, and spanned a range of tree sizes. Columns represent individual tree ring samples, rows represent years, and cell values give measured ring width values (in mm). This file wass used to develop standardized tree ring chronologies.
FieldPlots
All files contained in this folder give training data information used to train Random Forest classifiers for the prediction of land cover types in each year.
- spatialExtract.csv: This file gives information from training data points, as well as extracted spatial data values (i.e., Landsat, terrain) used in land cover classification. Columns are as follows…
- NatForest: Name of study landscape in which a training data point is located (Prescott or Tonto)
- SurveyYr: Year in which a training data point was collected (for field data) or the year of aerial imagery (for aerial image interpretation). Values are 2019 or 2020
- Plot: The “cluster” of training data points. Field plots were collected in clusters of ~3 points in the same area. For points derived from interpretation of 2019 NAIP imagery, Plot is “AirInterp”
- Point: The unique training data row number within a given landscape and plot
- UniqueID: Text string that combines NatForest, Plot, and Point to a single variable as a unique row identifier. This should be used when identifying a particular training data point.
- VegClassText: The identified land cover type from field surveys or aerial interpretations. Codes vary slightly from terms used in manuscript text and are as follows… MC = Mixed Conifer, PIPO = Ponderosa Pine, PIPO/Oak = Pine-Oak, PJ = Pinyon-Juniper, GRASS/SHRUB = Non-forest, BARE/DEVELOPED = Non-forest.
- [index]&[season]: Between columns “G” and “BH”, these give extracted values from Landsat data. [season] is defined based on date ranges for use in Landsat composites (Spring, March 15-June 14; Summer, June 15-September 14; Fall, September 15-December 14). [index] gives the specific spectral index as below. For further information on each index, see Table 2 of the main text in Rodman et al. (in review)
- EVI (Enhanced Vegetation Index)
- RGI (Normalized Red-Green Index)
- NBR (Normalized Burn Ratio)
- NDMI (Normalized Difference Moisture Index)
- NDVI (Normalized Difference Vegetation Index)
- TCB (Tasselled Cap Brightness)
- TCG (Tasselled Cap Greenness)
- TCW (Tasselled Cap Wetness)
- TCA (Tasselled Cap Angle)
- [index]&[season]_seg5: Variables are the same as above, but including simple non-iterative clustering (SNIC, Achanta and Süsstrunk 2017) spatial segmentation for spatial smoothing and incorporation of context around each pixel.
- chili: Extracted values of the continuous heat load index (CHILI; HLI) from Theobald et al. (2015).
- elev_m: Extracted values of elevation (m) from a digital elevation model of the study area.
- tpi_450mWindow: Topographic position index (TPI) based on a 450m window around each training data point.
- TrainingPts_all.csv: Similar to above file, but excluding spatial data extractions. Additionally, this file includes the following fields…
- NAD83_12N_UTMx: Easting (m) in NAD83 UTM Zone 12 N of the training data location (either the center of the field plot or aerial photo point)
- NAD83_12N_UTMy: Northing (m) in NAD83 UTM Zone 12 N of the training data location (either the center of the field plot or aerial photo point)
Fires
All files contained in this folder give information about fire occurrence and severity throughout the study area.
- Fires/RdNBR: Files in this subfolder give fire severity (RdNBR) within all known fires 1985-2020 throughout the study area. RdNBR was calculated using Landsat data from years immediately before and after fire occurrence following Parks et al. (2018). Fires were grouped by year for simplicity, and the naming convention is RdNBR_[year].tif, where [year] is the year of fire occurrence. All files are geotiff-format raster data with a 30-m spatial resolution and in 16-bit unsigned integer format. Areas outside of known fire perimeters, and unburned areas within fire perimeters have values of 0.
- AllFires.shp: Shapefile of known fire perimeters throughout the study area. A combination of MTBS, NIFC, and US Forest Service records are given here, with polygons merged and dissolved by fire year for processing.
FuelTreatments
All files contained in this folder give information about timber harvests and treatments related to biomass removal throughout the study area.
- AllTreatments.shp: Shapefile gives the location and completion year of all known treatments related to woody biomass removal in either study landscape. Data were derived from USFS FACTS datasets shown here (https://data.fs.usda.gov/geodata/edw/datasets.php)
LandsatHarmonized
This folder, empty in the archive file, contained Landsat imagery for use in mapping annual land cover classes. However, as these data are publicly available, have large file sizes (> 15 gb of data total), and can be replicated using archived code (specifically “Step1a…” and “Step1b…” text files), we have not included them in this archive
StudyBound
All files located in this folder give the locations of the study area boundary in central Arizona, USA, and associated information.
- BothForests_Merged.shp: Shapefile with precise boundaries of the study landscapes. Used to clip and restrict extent of many other spatial datasets, and to summarize results by landscape.
- forestCode.tif: Rasterized (geotiff; 30-m resolution) version of “BothForests_Merged.shp”, used to extract values to training data and in making predictive maps of land cover. Areas outside of each landscape are assigned “NA” values
- forestCode.tif: Rasterized (geotiff; 30-m resolution) version of “BothForests_Merged.shp”, used to extract values to training data and in making predictive maps of land cover. Areas outside of each landscape are assigned “0” values rather than NAs. Useful for particular steps in modeling process.
- PrTo_BoundaryForLandsat.shp: Shapefile is similar to “BothForests_Merged.shp”, but buffered by ca. 1,500 m to accomodate all training data points, some of which fall outside of study area polygons (e.g., developed and urban areas). This file was used to extract Landsat data for the study area.
Terrain
All files contained in this folder are geotiff-format raster files which give information about terrain throughout the study landscapes. All fires are created in Google Earth Engine (using script “Step1c…”) Files were originally developed for the study area using USGS 3dep data (10-m resolution) and resampled to 30-m using to align with other datasets, using the mean value of all 10-m pixels within a given 30-m pixel.
- chili.tif: Continuous heat load index of Theobald et al. (2015). Values range 0-255 (8-bit unsigned integer), and were calculated using a combination of slope, aspect, and latitude to estimate terrain-driven solar heating.
- elev_m.tif: Elevation (m) of each pixel based on aggregated USGS 3dep data.
- tpi_450mWindow.tif: Topographic position index (following Weiss 2001) in a 450-m window surrounding each pixel. TPI was first calculated at a 10-m resolution and then aggregated to a 30-m resolution to align with other datasets.
WorkingDirectory
This folder, empty in the archive file, is used to save temporary files during some steps of data processing. We have included it for easier replication of some portions of the R code in the “Code” folder.