Published April 17, 2020 | Version v1
Dataset Open

Data from: Megafauna decline have reduced pathogen dispersal which may have increased emergent infectious diseases

  • 1. Northern Arizona University
  • 2. University of Gothenburg
  • 3. Aarhus University
  • 4. University of Miami


The Late Quaternary extinctions of megafauna (defined as animal species > 44.5 kg) reduced the dispersal of seeds and nutrients, and likely also microbes and parasites. Here we use body-mass based scaling and range maps for extinct and extant mammal species to show that these extinctions led to an almost seven-fold reduction in the movement of gut-transported microbes, such as Escherichia coli (3.3–0.5 km² d⁻¹). Similarly, the extinctions led to a seven-fold reduction in the mean home ranges of vector-borne pathogens (7.8–1.1 km²). To understand the impact of this, we created an individual-based model where an order of magnitude decrease in home range increased maximum aggregated microbial mutations 4-fold after 20 000 yr. We hypothesize that pathogen speciation and hence endemism increased with isolation, as global dispersal distances decreased through a mechanism similar to the theory of island biogeography. To investigate if such an effect could be found, we analysed where 145 zoonotic diseases have emerged in human populations and found quantitative estimates of reduced dispersal of ectoparasites and fecal pathogens significantly improved our ability to predict the locations of outbreaks (increasing variance explained by 8%). There are limitations to this analysis which we discuss in detail, but if further studies support these results, they broadly suggest that reduced pathogen dispersal following megafauna extinctions may have increased the emergence of zoonotic pathogens moving into human populations.
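The fold reductions quoted above follow directly from the before and after values. A quick arithmetic check in Python, using only the numbers stated in the abstract (the variable and function names are illustrative, not taken from the authors' code):

```python
# Values quoted in the abstract (before and after the extinctions).
gut_microbe_dispersal = (3.3, 0.5)   # km^2 per day
vector_home_range = (7.8, 1.1)       # km^2

def fold_reduction(before, after):
    """Return how many times smaller `after` is than `before`."""
    return before / after

gut_fold = fold_reduction(*gut_microbe_dispersal)
vector_fold = fold_reduction(*vector_home_range)
print(f"gut-transported microbes: {gut_fold:.1f}-fold")   # 6.6-fold
print(f"vector-borne pathogens:   {vector_fold:.1f}-fold")  # 7.1-fold
```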


Figure map

Equations 1 and 3 - To determine the initial allometric scaling relationships, run the program codeHR.m.  This requires the .csv data file MergeTraitsMSW_slim.  The program will produce graphs with the coefficients for Eqs 1 and 3.  The data and statistics are also in the spreadsheet HRandDR.xls.
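For readers without MATLAB, the general idea of fitting an allometric relationship can be sketched in Python. This is not codeHR.m: it assumes a power-law form (response = a × mass^b) fitted on log-transformed data, and it uses synthetic values with made-up coefficients rather than MergeTraitsMSW_slim.

```python
import numpy as np

def fit_allometry(mass_kg, response):
    """Fit response = a * mass**b by least squares on log10-transformed data."""
    log_m = np.log10(np.asarray(mass_kg, dtype=float))
    log_y = np.log10(np.asarray(response, dtype=float))
    b, log_a = np.polyfit(log_m, log_y, 1)  # returns slope first, then intercept
    return 10.0 ** log_a, b

# Synthetic illustration only: NOT values from MergeTraitsMSW_slim.
rng = np.random.default_rng(0)
mass = 10.0 ** rng.uniform(0, 4, size=200)      # 1 kg to 10 t
true_a, true_b = 0.05, 1.1                      # made-up coefficients
home_range = true_a * mass ** true_b * 10.0 ** rng.normal(0, 0.1, size=200)

a_hat, b_hat = fit_allometry(mass, home_range)
print(f"a = {a_hat:.3f}, b = {b_hat:.3f}")
```

The fitted values should land close to the made-up coefficients; the real coefficients for Eqs 1 and 3 come from the graphs produced by codeHR.m.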

Figure 1 and table 1

To create the dispersal maps, run the program diseasezfirst.m.  The top part of the code creates the past dispersal maps.  These include not just extinct mammals but also the former range maps of extant species, such as elephants, that now have a much reduced range.  To avoid double counting, I use column 3 of the extinct variable to identify these species.  I then use the code spnotextinct.m to find the current maps to subtract from these.  This is done in lines 21-31.  From line 68, I calculate the same variables for the current IUCN dataset.  Line 72 removes bats from the analysis.
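One plausible reading of the double-counting correction is sketched below in Python. It is an interpretation, not a port of diseasezfirst.m or spnotextinct.m: the species names, the five-cell grid, and the `extant_with_former_range` flag (standing in for column 3 of the extinct variable) are all hypothetical.

```python
import numpy as np

# Hypothetical presence/absence maps on a 5-cell grid (1 = species present).
current_maps = {
    "deer":     np.array([1, 1, 0, 0, 1]),
    "elephant": np.array([0, 0, 1, 1, 0]),   # today's reduced range
}
past_maps = {
    "mammoth":  np.array([1, 1, 1, 0, 0]),   # extinct species
    "elephant": np.array([1, 1, 1, 1, 0]),   # former range of an extant species
}
# Hypothetical flag playing the role of "column 3 of the extinct variable".
extant_with_former_range = {"elephant"}

def past_species_count(current_maps, past_maps, flagged):
    """Sum both datasets, then subtract current maps of flagged species."""
    total = sum(current_maps.values()) + sum(past_maps.values())
    for name in flagged:
        total = total - current_maps[name]
    return total

naive = sum(current_maps.values()) + sum(past_maps.values())
corrected = past_species_count(current_maps, past_maps, extant_with_former_range)
print("naive:    ", naive)      # [3 3 3 2 1]
print("corrected:", corrected)  # [3 3 2 1 1]
```

In the naive sum the elephant is counted twice wherever its current and former ranges overlap; after the subtraction each species contributes at most once per cell.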

To create Table 1, run createtable1.m; the output is on line 110.

To create the maps in Figure 1, change the input on line 118.

Figure 2 - The data for this figure are in 20000all.mat.  The code that creates those data is IBMdisease2.m.  To get Tables S7 and S8, we vary parameters in this same code.

Table 2 - Input datasets are created in diseasezfirst.m, as explained above.  Use the sortcountriez.m code to create the country-level JID and population datasets.  The saved maps from these datasets go into the datasets for the code finaldiseasecode.m.

Modify q in line 6 to choose the dataset to analyze: q=1 for vector, q=2 for notvector, and q=3 for all of the EIDs from Jones et al.

Code description - Line 50 eliminates points that are too close to each other, to reduce spatial autocorrelation.  Line 122 calculates the key megafauna variables.  From line 158, the code finds and aggregates all the data for the disease pixels.  From line 270, it finds random points on the land surface and generates data for these pixels.  From line 360, it saves the data to analyze with the R code.  Dataz3.xls is the dataset used as input into the R code.
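The point-thinning step can be illustrated with a minimal Python sketch. It is not the MATLAB implementation: it assumes a simple greedy rule and planar Euclidean distance, and the coordinates and the `min_dist` threshold are hypothetical. For latitude/longitude data a great-circle distance would be needed instead.

```python
import numpy as np

def thin_points(coords, min_dist):
    """Greedily keep points that are at least `min_dist` from every kept point."""
    coords = np.asarray(coords, dtype=float)
    kept = []
    for i, p in enumerate(coords):
        # Keep point i only if it is far enough from all points kept so far.
        if all(np.hypot(*(p - coords[j])) >= min_dist for j in kept):
            kept.append(i)
    return kept

# Hypothetical outbreak locations in projected x/y units (not real data).
points = [(0.0, 0.0), (0.5, 0.0), (3.0, 0.0), (3.0, 0.4), (10.0, 10.0)]
kept = thin_points(points, min_dist=1.0)
print(kept)  # [0, 2, 4]
```

A greedy pass depends on the order of the points; it is shown here only because it is short and easy to follow.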

Then run the R code SAR.r.  In line 7, choose one of the three datasets from the MATLAB code (dataz1, dataz2, dataz3).  Choose the model variables by modifying the dz variable.  To produce the data in Table 2, run SAR.r and uncomment the dz term for each of the three different models.  For instance, to get model FD, uncomment: dz = cbind(y1, y2, y6, y11)

Table 2 and Table S2 take the pseudo-R², the AIC values, the VIF, and the model coefficients from these data.
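As a reminder of what the VIF measures, here is a small Python sketch of the standard definition, VIF_j = 1 / (1 − R_j²), where R_j² comes from regressing predictor j on the other predictors. It is not the calculation in SAR.r, and the three predictors are synthetic.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of predictor matrix X."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    out = []
    for j in range(k):
        y = X[:, j]
        # Regress column j on an intercept plus all other columns.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ beta
        r2 = 1.0 - resid.var() / y.var()
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

# Synthetic predictors: x3 is nearly collinear with x1, x2 is independent.
rng = np.random.default_rng(1)
x1 = rng.normal(size=500)
x2 = rng.normal(size=500)
x3 = x1 + 0.1 * rng.normal(size=500)
vifs = vif(np.column_stack([x1, x2, x3]))
print(vifs)
```

The nearly collinear pair gets large VIFs while the independent predictor stays close to 1, which is the pattern to look for when comparing the dz variable sets.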

Table S3 - To produce the data in Table S3, use SARnovector.r and input dataz2.xls.  Uncomment the dz term for each of the three different models.  For instance, to get model FD, uncomment: dz = cbind(y1, y2, y6, y11)

Table S4 - To produce the data in Table S4, use SARvector.r and input dataz1.xls.  Uncomment the dz term for each of the three different models.  For instance, to get model FD, uncomment: dz = cbind(y1, y2, y6, y11)

Figure 3 – This figure was produced by Victor Leshyk and the data come from Table 1.

Figure 4 - We use the equations listed in Table 2 for model FD to estimate EID likelihood, using the code predictor.m.  To produce the numbers listed in the final results paragraph, on line 37 choose 0 for model FD and 1 for model HR.  Use lines 80 and 82 to create maps 4a and 4b.



Files (258.4 MB)
