Data from: Global conservation prioritisation approach provides credible results at a regional scale
Description
Overview
This repository contains code and data for Roswell and Espíndola "Global conservation prioritization approach provides credible results at a regional scale" (doi:10.1111/ddi.13969), a manuscript about predicting which unassessed regional taxa are likely to face conservation threats... using occurrence data, covariates, and random forest classifiers. The development version is on GitHub: https://github.com/mikeroswell/threatRF.git
General organization
- code contains R scripts to download occurrence data and GIS layers, process them, and fit the Random Forests.
- data contains all the downloaded and manufactured datasets (these are often large) for this project
- data/fromR mainly contains tables generated by the scripts in code
- data/GIS_downloads contains raster layers downloaded from various sources
Workflow within `code/`:
Utilities
data cleanup
1. tidy_flora.R uses regex matching to turn .pdf into a flat file
2. robust_gbif_namesearch.R wraps an `rgbif` function to try to get nice matches for taxon names without returning synonyms if a valid match exists.
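The "prefer a valid match over a synonym" idea can be sketched without calling GBIF at all. The example below is a minimal illustration, not the code in robust_gbif_namesearch.R: the function `pick_best_match`, the column names (`canonicalName`, `status`, `confidence`), and the placeholder taxon names are all assumptions made for this sketch.

```r
# Hypothetical helper: choose one row from a table of candidate name matches.
# `candidates` is assumed to have columns `canonicalName`, `status`, and
# `confidence`; these names are illustrative, not taken from the repository.
pick_best_match <- function(candidates) {
  if (nrow(candidates) == 0) return(candidates)
  # %in% (rather than ==) keeps rows with a missing status out of `accepted`
  accepted <- candidates[candidates$status %in% "ACCEPTED", , drop = FALSE]
  # Fall back to every candidate (synonyms included) only if none is accepted
  pool <- if (nrow(accepted) > 0) accepted else candidates
  pool[which.max(pool$confidence), , drop = FALSE]
}

# Placeholder names, not real taxa
candidates <- data.frame(
  canonicalName = c("Aus bus", "Aus cus", "Aus dus"),
  status = c("SYNONYM", "ACCEPTED", "ACCEPTED"),
  confidence = c(99, 90, 95),
  stringsAsFactors = FALSE
)
best <- pick_best_match(candidates)
print(best$canonicalName)
```

Here the synonym has the highest confidence but is skipped because accepted matches exist; a synonym is returned only when it is the sole kind of match available.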
model fitting, etc. 
1. fix_mod.R handles novel factor levels when using various `predict` functions.
2. RF_tuner.R specifies how to tune and fit the random forests
3. RF_setup.R creates folds for model fitting, cleans up model formulae
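For readers unfamiliar with the novel-factor-level problem: many `predict` methods fail when the new data contain a factor level that was absent from the training data. One common workaround, shown below as a sketch and not necessarily what fix_mod.R does, is to recode unseen levels as `NA` before predicting. The function `mask_novel_levels` and the `habitat` column are hypothetical.

```r
# Hypothetical helper: recode factor levels that were not seen in training
# as NA, so that predict() does not stop on unknown levels.
mask_novel_levels <- function(newdata, train) {
  for (col in names(newdata)) {
    if (is.factor(train[[col]])) {
      known <- levels(train[[col]])
      # factor() with explicit levels turns any value outside `known` into NA
      newdata[[col]] <- factor(as.character(newdata[[col]]), levels = known)
    }
  }
  newdata
}

# Made-up example data
train <- data.frame(habitat = factor(c("forest", "wetland", "forest")))
new <- data.frame(habitat = factor(c("forest", "dune")))
masked <- mask_novel_levels(new, train)
print(masked$habitat)
```

Whether `NA` is then imputed, dropped, or passed through depends on the model and is a separate decision.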
Data download and analysis scripts (may call 1 or more utilities above)
- download_gis.R documents the sources of many of the GIS layers used downstream. It was created a long time ago and is unstable; do not run it.
- download_occurrences_and_statuses.R documents the queries in GBIF and NatureServe. Largely stable but not rerun; the dataset is liable to change if rerun.
- crunch_GIS.R should be relatively stable; all GIS work is done in R.
- fit_RF.R Fits random forests
- graphing_model_outputs.R generates figures and tabular results
Data
The data input for analyses is saved as an .RDA file: data/fromR/lfs/to_predict.RDA
This dataset is generated by cleaning and harmonizing occurrence data (GBIF.org (08 May 2024) GBIF Occurrence Download https://doi.org/10.15468/dl.9jrwwd.) with conservation status data from NatureServe and geographic covariates from a variety of sources; further details are in the scripts described above.
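An .RDA file restores objects under whatever names they were saved with, so it can help to load it into a separate environment and inspect the names first. The sketch below demonstrates this on a temporary file; the helper `load_rda` and the object `demo_df` are made up for illustration, and the names of the objects stored in to_predict.RDA are not assumed here.

```r
# Load an .RDA file into its own environment and report the object names,
# so nothing in the global workspace is overwritten.
load_rda <- function(path) {
  env <- new.env()
  loaded <- load(path, envir = env)  # load() returns the restored names
  list(names = loaded, env = env)
}

# Demonstration with a temporary file; `demo_df` is a made-up object.
demo_df <- data.frame(taxon = c("a", "b"), n_occ = c(10L, 3L))
tmp <- tempfile(fileext = ".RDA")
save(demo_df, file = tmp)

res <- load_rda(tmp)
print(res$names)
str(get(res$names[1], envir = res$env))
```

With the archive unpacked, the same helper could be pointed at `data/fromR/lfs/to_predict.RDA`.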
Files
Roswell_Espíndola_Zenodo_code.zip
Total size: 18.7 GB

| Checksum | Size |
|---|---|
| md5:dae736de76d5aeeda2240c8a5807d53f | 34.6 kB |
| md5:6c6d29a7fe839dbf741461001afdc037 | 18.7 GB |
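Because one of the files is large, it may be worth verifying a download against its listed checksum. The following is a minimal sketch using `tools::md5sum` from base R; the helper name `check_md5` is an assumption, and the demonstration uses a temporary file in place of the actual archive.

```r
# Compare a file's MD5 checksum with an expected value. The expected value
# may carry the "md5:" prefix used in the file listing.
check_md5 <- function(path, expected) {
  expected <- sub("^md5:", "", tolower(expected))
  actual <- unname(tools::md5sum(path))  # NA if the file does not exist
  identical(actual, expected)
}

# Demonstration on a temporary file
tmp <- tempfile()
writeLines("hello", tmp)
expected <- unname(tools::md5sum(tmp))
print(check_md5(tmp, paste0("md5:", expected)))
print(check_md5(tmp, "md5:00000000000000000000000000000000"))
```

For a real download, the call would look like `check_md5("path/to/downloaded/file", "md5:6c6d29a7fe839dbf741461001afdc037")`, where the path is a placeholder for wherever the file was saved.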
Additional details
Related works
- Is described by
- Journal article: 10.1111/ddi.13969 (DOI)
Funding
- University of Maryland, College Park
Dates
- Available
- 2025-01-03
Software
- Programming language
- R
- Development Status
- Inactive