This README_Williamsetal2022PNAS.txt file was generated on 2022-06-06 by David R Williams


GENERAL INFORMATION

1. Title of Dataset: Data and analysis code for: Global protected areas seem insufficient to safeguard half of the world's mammals from human-induced extinction

2. Author Information
	Corresponding Investigator 
		Name: Dr David R Williams
		Institution: Sustainability Research Institute, University of Leeds, UK
		Email: d.r.williams@leeds.ac.uk

	Co-investigator 1
		Name: Prof Carlo Rondinini
		Institution: Sapienza University of Rome, Italy; State University of New York, USA

	Co-investigator 2
		Name: Prof David Tilman
		Institution: University of Minnesota, USA; University of California Santa Barbara, USA

3. Date of data collection: N/A (secondary data only)

4. Geographic location of data collection: Global

5. Funding sources that supported the collection of the data: None

6. Recommended citation for this dataset: Williams et al (2022) Global protected areas seem insufficient to safeguard half of the world's mammals from human-induced extinction, PNAS


DATA & FILE OVERVIEW

1. Description of dataset
The directory is assumed to be set up with the following subdirectories:
	•	Data
	•	GeneratedData
	•	Scripts
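
A minimal sketch for checking this layout before running anything. The helper name check_project_dirs and the idea of creating missing folders are illustrative assumptions, not part of the supplied scripts:

```r
# Hypothetical helper (not part of the supplied scripts): reports which of the
# three expected subdirectories exist under a project root, and optionally
# creates any that are missing.
check_project_dirs <- function(root = ".", create = FALSE) {
  expected <- c("Data", "GeneratedData", "Scripts")
  paths <- file.path(root, expected)
  present <- dir.exists(paths)
  if (create && any(!present)) {
    for (p in paths[!present]) dir.create(p, recursive = TRUE)
    present <- dir.exists(paths)
  }
  stats::setNames(present, expected)
}

# Usage: report the layout of the current working directory
print(check_project_dirs("."))
```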

1.1 Scripts
There are six scripts: a functions file (0_Functions.R) and five analysis scripts, with file names indicating the order in which they should be run. Scripts should be run using R, either on a local machine or on a cluster. Note that Script 2 is very memory hungry!

0_Functions.R 
Functions used by the other scripts. It should not be run on its own; it is called by the other scripts

1_DataPrep_ClimateDensitiesDispersalMVPs.R 
Data preparation: 
    1) Creates rasters of climate variables needed to estimate population densities. Data from https://www.worldclim.org/data/worldclim21.html;
    2) Uses data from Santini et al 2018a and those climate variables to refit the population density models from Santini et al 2018b. Data from https://figshare.com/articles/TetraDENSITY_Population_Density_dataset/5371633;
    3) Estimates dispersal distances based on body mass from Jones et al (2009) https://esajournals.onlinelibrary.wiley.com/doi/abs/10.1890/08-1494.1;
    4) Estimates minimum viable populations (MVPs) based on Hilbers et al (2016) https://doi.org/10.1111/cobi.12846

2_Habitat_PA_overlap.R 
Calculates the overlap between each species' extent of suitable habitat (ESH) and protected areas (PAs). PA data were downloaded from WDPA (https://www.protectedplanet.net/) and rasterised at 1.5 km resolution, following WDPA suggestions on data processing. Mammal habitat maps are from Carlo Rondinini

3_ClimateInPAs.R 
Extracts the mean net primary productivity (NPP) and coefficient of variation of precipitation (Pcv), both calculated in 1_DataPrep_ClimateDensitiesDispersalMVPs.R, together with species richness, for each species' PA groups (from 2_Habitat_PA_overlap.R). These covariates are needed to estimate population densities

4_ModelledPopsInPAs.R 
Takes the areas of habitats (from 2_Habitat_PA_overlap.R) and combines them with population density estimates (from 1_DataPrep_ClimateDensitiesDispersalMVPs.R) to estimate population sizes in each PA cluster. Finally, it compares these to the population targets (also from 1_DataPrep_ClimateDensitiesDispersalMVPs.R)
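
The logic of this step can be sketched roughly as below. This is an illustration only: it assumes population size is simply habitat area multiplied by a single density value per species, and that a cluster meets its target when the estimate is at least as large as the target. The script itself may do something more detailed. All numbers and the column names density, target, population and meets_target are made up for the example:

```r
# Illustrative values only: none of these numbers come from the dataset.
clusters <- data.frame(
  species  = c("Species a", "Species a", "Species b"),
  PA_group = c(1, 2, 1),
  area     = c(1200, 45, 300),  # habitat area in the PA cluster (km2)
  density  = c(0.5, 0.5, 12),   # individuals per km2 (hypothetical)
  target   = c(500, 500, 2000)  # population target (hypothetical)
)

# Assumed relationship: population = area x density
clusters$population <- clusters$area * clusters$density

# Assumed comparison: the target is met if population >= target
clusters$meets_target <- clusters$population >= clusters$target

print(clusters)
```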

5_OverlapWithCountryEcoregions.R 
Overlaps PA maps with ecoregion and country maps to estimate how many countries/ecoregions have viable populations

All scripts are commented and should be self-explanatory.
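
A minimal sketch of running the analysis scripts in order. It assumes the working directory is the project root and that the scripts sit in the Scripts subdirectory; the function run_pipeline and its dry_run argument are hypothetical and not part of the supplied code. 0_Functions.R is left out because it is called by the other scripts:

```r
# Script names are taken from the list above. run_pipeline() is a hypothetical
# wrapper, not part of the supplied scripts.
analysis_scripts <- c(
  "1_DataPrep_ClimateDensitiesDispersalMVPs.R",
  "2_Habitat_PA_overlap.R",
  "3_ClimateInPAs.R",
  "4_ModelledPopsInPAs.R",
  "5_OverlapWithCountryEcoregions.R"
)

run_pipeline <- function(script_dir = "Scripts", scripts = analysis_scripts,
                         dry_run = TRUE) {
  paths <- file.path(script_dir, scripts)
  # Order by the leading number in each file name
  paths <- paths[order(as.integer(sub("_.*$", "", basename(paths))))]
  for (p in paths) {
    if (dry_run) {
      message("Would run: ", p)
    } else {
      message("Running: ", p)
      source(p, echo = FALSE)
    }
  }
  invisible(paths)
}

# Dry run: only lists the scripts in the order they would be sourced
run_pipeline(dry_run = TRUE)
```

With dry_run = FALSE the scripts are sourced one after another in the same R session, so the memory demands of Script 2 apply to that session.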


1.2 Data
Data used in the analysis:
BodyMassEstimates_AllSpecies.csv 
Body mass estimates for all species analysed

HilbersRegressionCoefficients.csv
Coefficients for estimating minimum viable populations from Hilbers et al (2016) https://doi.org/10.1111/cobi.12846

IUCN_CountryPresence_mammals.csv
Data from IUCN (https://www.iucnredlist.org/) of the country presence for each species

Mammals_Richness.tif
Data derived from Carlo Rondinini's habitat maps, showing the number of mammal species with available habitat in each 1.5km cell

npp_geotiff.tif
Net primary productivity from https://www.worldclim.org/data/worldclim21.html

PanTHERIA_Aug2008_Combined.csv
Data from Jones et al (2009) https://esajournals.onlinelibrary.wiley.com/doi/abs/10.1890/08-1494.1

Reed2003_MVPEstimates.csv
Data from Reed et al (2003) https://doi.org/10.1016/S0006-3207(02)00346-4

Santini2013_DispersalData.csv
Data from Santini et al (2013) https://doi.org/10.4404/hystrix-24.2-8746


1.3 Generated data
Data generated by the analysis, in case users do not want to run all of the analyses:
DispersalDistances_Estimated.csv
Estimated dispersal distances from 1_DataPrep_ClimateDensitiesDispersalMVPs.R

MVPEstimates_All.csv
Estimated minimum viable populations from 1_DataPrep_ClimateDensitiesDispersalMVPs.R

NPPForSantiniModels.tif
Net primary productivity for estimating population densities from 1_DataPrep_ClimateDensitiesDispersalMVPs.R

PASizes_AllMammals.csv
Estimated size of mammal populations in each protected area cluster from 4_ModelledPopsInPAs.R

PcvForSantiniModels.tif
Coefficient of variation of precipitation for estimating population densities from 1_DataPrep_ClimateDensitiesDispersalMVPs.R

RefittedSantiniModelCoefficients_fixed.csv
RefittedSantiniModelCoefficients_random_binomial.csv
RefittedSantiniModelCoefficients_random_family.csv
RefittedSantiniModelCoefficients_random_order.csv
Series of files with the coefficients needed for estimating population densities in individual cells, from 1_DataPrep_ClimateDensitiesDispersalMVPs.R
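
For users who skip Script 1, the four coefficient files could be read in with something like the sketch below. It assumes the files sit in the GeneratedData subdirectory; the function name load_santini_coefficients is hypothetical, and no assumption is made about the columns inside each file:

```r
# Hypothetical loader (not part of the supplied scripts): reads the four
# coefficient files into a named list of data frames.
load_santini_coefficients <- function(dir = "GeneratedData") {
  parts <- c("fixed", "random_binomial", "random_family", "random_order")
  files <- file.path(
    dir, paste0("RefittedSantiniModelCoefficients_", parts, ".csv")
  )
  missing_files <- files[!file.exists(files)]
  if (length(missing_files) > 0) {
    stop("Missing coefficient file(s): ",
         paste(missing_files, collapse = ", "))
  }
  out <- lapply(files, read.csv, stringsAsFactors = FALSE)
  names(out) <- parts
  out
}

# Usage (assumes the working directory is the project root):
# coefs <- load_santini_coefficients("GeneratedData")
# str(coefs$fixed)
```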



2 METHODOLOGICAL INFORMATION
All methods are described in detail in the individual scripts



3 DATA SPECIFICATION
Throughout, missing data are specified by "NA" in .csv files
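
As a small illustration of handling this convention in R, the sketch below writes a toy .csv to a temporary file and reads it back. The toy rows are made up; only the "NA" convention comes from this README:

```r
# Toy file with made-up rows, used only to show how "NA" entries are read.
tmp <- tempfile(fileext = ".csv")
writeLines(c("species,area,mean_npp",
             "Species a,12.5,NA",
             "Species b,NA,0.8"), tmp)

# na.strings = "NA" is the read.csv default; it is spelled out here for clarity
dat <- read.csv(tmp, na.strings = "NA", stringsAsFactors = FALSE)

# Count missing values per column
na_counts <- colSums(is.na(dat))
print(na_counts)

# Summaries need na.rm = TRUE to ignore missing values
mean_area <- mean(dat$area, na.rm = TRUE)
print(mean_area)
```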

All column headings are non-abbreviated and self-explanatory. The possible exception is PASizes_AllMammals.csv, whose columns are:
taxon: mammal, same for all species
species: species name
WDPAID: World Database on Protected Areas ID for specific protected area
PA_group: Grouping of the protected area within a cluster, assuming no dispersal by the species. Note: group numbers are NOT unique across species. Numbering restarts for each species, so a cluster is identified by species and PA_group together!
PA_group_buffered: Grouping of the protected area within a cluster, assuming median dispersal distance for the species. Note: group numbers are NOT unique across species. Numbering restarts for each species, so a cluster is identified by species and PA_group_buffered together!
PA_group_buffered_max: Grouping of the protected area within a cluster, assuming maximum dispersal distance for the species. Note: group numbers are NOT unique across species. Numbering restarts for each species, so a cluster is identified by species and PA_group_buffered_max together!
no_cells: number of 1.5 x 1.5 km cells within the protected area / habitat combination.
area: the area of the protected area / habitat combination in square kilometres
mean_npp: the mean net primary productivity for the protected area / habitat combination (from 1_DataPrep_ClimateDensitiesDispersalMVPs.R and 3_ClimateInPAs.R)
mean_pcv: the mean coefficient of variation of precipitation for the protected area / habitat combination (from 1_DataPrep_ClimateDensitiesDispersalMVPs.R and 3_ClimateInPAs.R)
mean_richness: the mean mammalian species richness for the protected area / habitat combination (from 1_DataPrep_ClimateDensitiesDispersalMVPs.R and 3_ClimateInPAs.R)
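
Because the PA_group numbering is specific to each species, any summary per cluster should group by species and PA_group together. A minimal sketch with a toy data frame follows; the rows are made up and only the column names are taken from the list above:

```r
# Toy rows (made up) using the column names described above.
pa <- data.frame(
  species  = c("Species a", "Species a", "Species a", "Species b", "Species b"),
  WDPAID   = c(101, 102, 103, 101, 104),
  PA_group = c(1, 1, 2, 1, 1),
  area     = c(10, 5, 20, 7, 3)
)

# Total habitat area per cluster: group by species AND PA_group
cluster_area <- aggregate(area ~ species + PA_group, data = pa, FUN = sum)
cluster_area <- cluster_area[order(cluster_area$species,
                                   cluster_area$PA_group), ]
print(cluster_area)

# Grouping by PA_group alone would wrongly merge group 1 of both species
wrong <- aggregate(area ~ PA_group, data = pa, FUN = sum)
print(wrong)
```

The same pattern applies to PA_group_buffered and PA_group_buffered_max.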


