The following contains an overview of the methods used in the paper.

Data

The detailed and repeatable version of this section can be found here.

TODO: We used data.

MCMCglmm

The detailed and repeatable version of this section can be found here.

To estimate the variance-covariance matrix for the whole phylogeny and for each clade (see 01-Data_preparation.Rmd), we ran multivariate generalised linear mixed models using the MCMCglmm package (Hadfield 2010).

We ran a general multi-response model with §§§N_traits as fixed effects and the residual covariance structure. We used §§§N_clades + 1 random effects (one for the whole phylogeny and one for each clade):

Data ~ residuals + whole phylogeny + phylogeny per clade
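As an illustration only, the structure above could be written in MCMCglmm's formula notation along the following lines. The trait names, the clade labels and the `clade` and `animal` column names are hypothetical placeholders, and the snippet only builds the formulas (it does not fit a model), so it should be read as a sketch of the "one random term for the whole phylogeny plus one per clade" idea rather than the exact specification used for the paper.

```r
## Hypothetical inputs: replace with the real trait and clade names
trait_names <- c("trait_1", "trait_2", "trait_3")
clade_names <- c("clade_A", "clade_B")

## Fixed effects: one intercept per trait (multi-response model)
fixed_formula <- as.formula(
    paste0("cbind(", paste(trait_names, collapse = ", "), ") ~ trait - 1"))

## Random effects: one term for the whole phylogeny...
phylo_term <- "us(trait):animal"
## ...and one term per clade (assumes a factor column named 'clade')
clade_terms <- paste0("us(at.level(clade, '", clade_names, "'):trait):animal")
random_terms <- c(phylo_term, clade_terms)
random_formula <- as.formula(paste("~", paste(random_terms, collapse = " + ")))

## Residuals: unstructured covariance between the traits
rcov_formula <- ~ us(trait):units

## Number of random terms: number of clades + 1
length(random_terms)
```

These three formulas would then be passed to the `fixed`, `random` and `rcov` arguments of `MCMCglmm()`, together with the data, the tree and the priors.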

To speed up the analysis and take phylogenetic uncertainty into account, we used a "mini-chains" approach. It runs multiple small MCMCglmm analyses on multiple trees and pools the results into one larger MCMCglmm output that contains more variation due to phylogenetic uncertainty (similar to Guillerme and Healy (2014)).

To do so, we first ran three independent MCMC chains with the model and data described above, using the consensus tree and flat priors. We ran these models for §§§N_iterations, sampling every §§§N generations with no burnin (resulting in an effective sample size of §§§ and a convergence of §§§ measured as the ASDS§§§). We then extracted the burnin period from each of these three chains (defined as the generation when the chain reaches the median likelihood value times 1.25) and used the resulting residual and random term structures as priors for the mini-chains.
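The burnin definition above can be sketched in base R as follows. This is one possible reading, not necessarily the exact implementation: it assumes that the burnin is the first sampled generation at which the likelihood trace reaches its median, multiplied by 1.25. The function name `estimate_burnin` and the toy trace are made up for the example.

```r
## Sketch of a burnin estimate from a trace of sampled (log-)likelihood values.
## Assumption: "reaching the median" means the first sample at or beyond the
## median of the whole trace, approached from the side the chain starts on.
estimate_burnin <- function(trace, generations = seq_along(trace), buffer = 1.25) {
    target <- median(trace)
    if (trace[1] <= target) {
        ## The chain starts below the median and climbs towards it
        reached <- which(trace >= target)[1]
    } else {
        ## The chain starts above the median and descends towards it
        reached <- which(trace <= target)[1]
    }
    ## Generation at which the median is reached, extended by the buffer
    ceiling(generations[reached] * buffer)
}

## Toy trace sampled every 10 generations
trace <- c(-100, -50, -20, -10, -11, -9, -10, -12, -10, -11)
generations <- seq(from = 10, to = 100, by = 10)
burnin <- estimate_burnin(trace, generations)
burnin # 50: the median (-11) is first reached at generation 40
```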

We set up a mini-chain to be an MCMCglmm run with 1) the model described above, 2) a random tree, and 3) the estimated priors and burnin, and ran each chain to reach only 100 samples after the burnin. We ran §§§n_chains mini-chains, thus providing us with §§§n_chains * 100 posterior data points. The two main advantages of this mini-chain approach are that 1) the chains are much faster to run, since no diagnosis of convergence is necessary and each chain only runs for a relatively short time (which allows several chains to crash or fail without losing all the outputs); and 2) they take tree uncertainty into account without having to run the complete MCMCglmm on all trees (cf. Guillerme and Healy (2014)).

Figure: Mini-chains overview.
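The pooling step can be illustrated with a small base R sketch. Here `run_mini_chain()` is a hypothetical stand-in that returns 100 random posterior samples instead of running an actual MCMCglmm, and one chain is made to crash on purpose to show that the remaining outputs are still usable.

```r
## Hypothetical stand-in for one mini-chain: returns a matrix of posterior
## samples (rows) for a few parameters (columns). Chain 3 fails on purpose.
run_mini_chain <- function(chain_id, n_samples = 100, n_params = 4) {
    if (chain_id == 3) {
        stop("simulated crash")
    }
    matrix(rnorm(n_samples * n_params), nrow = n_samples,
           dimnames = list(NULL, paste0("param_", seq_len(n_params))))
}

set.seed(42)
n_chains <- 5

## Run every mini-chain, returning NULL for the ones that fail
outputs <- lapply(seq_len(n_chains), function(id) {
    tryCatch(run_mini_chain(id), error = function(e) NULL)
})

## Keep the completed chains and pool their samples into one posterior
completed <- Filter(Negate(is.null), outputs)
combined_posterior <- do.call(rbind, completed)

dim(combined_posterior) # 4 completed chains * 100 samples, 4 parameters
```

Because each mini-chain is independent, they can also be run in parallel or on different machines before pooling.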

Elaboration and exploration

The detailed and repeatable version of this section can be found here.

To analyse the exploration and elaboration aspects of both clades and tips, we first selected 1000 random variance-covariance matrices from the combined mini-chains MCMCglmm results. This resulted in 1000 \(\times\) (§§§N_clades + 1) matrices (1000 for each random effect). For each of these matrices, we measured the main axis of the 95% confidence interval ellipsoid in §§§N_dimensions, which corresponds to the main axis of variation for each random effect. We then centred each axis on its respective clade's centroid coordinates.

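## space.maker() comes from the dispRity package; ellipse() from the ellipse package.
## get.axes() is not defined in this section: it is assumed to be loaded from the
## project's own helper functions.
library(dispRity)

set.seed(42)
## Space plot: 50 elements in 2 correlated dimensions
cor_matrix <- matrix(c(1, 0.8, 0.8, 1), nrow = 2)
space <- space.maker(50, 2, rnorm, cor.matrix = cor_matrix)
lim <- c(floor(range(space)[1]), ceiling(range(space)[2]))
plot(space, pch = 19, xlab = "Trait 1", ylab = "Trait 2", xlim = lim, ylim = lim)

## Plotting the 95% confidence ellipse of the variance-covariance matrix
lines(ellipse::ellipse(cor_matrix), col = "blue", lwd = 3)
## Plotting the major axis
lines(get.axes(list(list(list(VCV = cor_matrix, Sol = c(0,0)))))[[1]][[1]], lwd = 3, col = "orange")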

Figure: the main axis of variation (in orange) of the 95% CI ellipse of the variance-covariance matrix (in blue) of 50 elements in 2 dimensions (in black).
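For readers who want to see what the major axis of such an ellipse looks like numerically, here is a minimal base R sketch. It assumes that the axis is defined by the leading eigenvector of the variance-covariance matrix, scaled by the chi-squared quantile of the confidence level; the function name `major_axis` is made up for the example and is not the project's own implementation.

```r
## Major axis of the confidence ellipsoid of a variance-covariance matrix.
## Returns a 2-row matrix with the coordinates of the two ends of the axis.
major_axis <- function(vcv, centre = rep(0, ncol(vcv)), level = 0.95) {
    ## eigen() returns the eigenvalues in decreasing order, so the first
    ## eigenvector is the direction of greatest variance
    decomposition <- eigen(vcv, symmetric = TRUE)
    direction <- decomposition$vectors[, 1]
    ## Half length of the axis for the requested confidence level
    half_length <- sqrt(decomposition$values[1] * qchisq(level, df = ncol(vcv)))
    rbind(start = centre - half_length * direction,
          end   = centre + half_length * direction)
}

vcv <- matrix(c(1, 0.8, 0.8, 1), nrow = 2)
main_axis <- major_axis(vcv)
main_axis
```

With this matrix the leading eigenvalue is 1.8 and the axis lies along the diagonal where trait 1 equals trait 2. The same function works for matrices of any dimension.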

We then calculated the elaboration and exploration relative to each of these axes using vector projection and rejection from linear algebra. The elaboration of a clade or a species is calculated as the projection of the species' coordinates or the clade's major axis onto the main axis of reference. The exploration ("innovation" in Endler et al. (2005)) of a clade or a species is calculated as its rejection from the main axis of reference. In other words, the elaboration corresponds to where a species or a clade sits along the major evolutionary axis and the exploration corresponds to how far a species or a clade deviates from that axis.

Figure: in a space with five elements: A, B, C, D, E (in grey); where D and E represent the major axes (e.g. the major phylogenetic effect axes from the MCMCglmm), we can rotate and rescale each element so that D and E become the unit vector of length 1 (the black letters D and E) and get the exploration and elaboration scores for either the elements (e.g. element A's projection in blue and its rejection in orange) or from any other axes (e.g. the major phylogenetic effect axes from the MCMCglmm for a specific clade).
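The projection and rejection described above can be sketched in a few lines of base R. This is an illustrative version under our own assumptions: both scores are expressed in units of the reference axis length, so that a position of 0 is the start of the axis, 0.5 its centre and 1 its end. The function name `project_on_axis` is made up for the example.

```r
## Project a point onto a reference axis defined by its two end points.
## Returns the position along the axis (elaboration) and the distance from
## the axis (exploration), both in units of the axis length.
project_on_axis <- function(point, axis_start, axis_end) {
    axis_vector <- axis_end - axis_start
    point_vector <- point - axis_start
    ## Scalar projection, rescaled so that the axis has length 1
    position <- sum(point_vector * axis_vector) / sum(axis_vector^2)
    ## Rejection: what is left of the point once the projection is removed
    rejection <- point_vector - position * axis_vector
    c(elaboration = position,
      exploration = sqrt(sum(rejection^2)) / sqrt(sum(axis_vector^2)))
}

## A point sitting above the centre of a horizontal axis of length 2
scores <- project_on_axis(point = c(1, 1), axis_start = c(0, 0), axis_end = c(2, 0))
scores # elaboration = 0.5, exploration = 0.5
```

Since the function only uses sums over coordinates, it applies unchanged to points and axes in any number of dimensions.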

Measuring the exploration and elaboration scores of clades or species in this way has the advantage of generalising to any number of dimensions and of being relative to any main axis of reference. In our case we use the major axis of the variance-covariance matrix of the random terms of an MCMCglmm model, but this reference could also be an arbitrary axis defined in the space (e.g. the "duckness" axis or the "blueness" axis - REF Frane?).

We measured these elaboration and exploration scores for 1) each clade's main axis compared to the overall phylogenetic main axis and 2) each species compared to the overall phylogenetic main axis and to its respective clade's main axis. For the first approach (clades' elaboration and exploration) we also measured the angle between the two axes (this measurement is redundant with the projection and rejection, yet sometimes more intuitive). In both approaches, a high exploration value indicates that the clade or the species deviates more from the main axis of reference, and an elaboration value away from 0.5 means that the clade or species elaborates along the major axis more than the central/"average" shape does.
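The angle between two axes can be obtained from the dot product of their direction vectors. The sketch below assumes each axis is stored as a 2-row matrix of end point coordinates and treats the axes as directed, so the angle ranges from 0 to 180 degrees; the function name `axis_angle` is hypothetical.

```r
## Angle (in degrees) between two axes, each given as a 2-row matrix
## containing the coordinates of its start and end points.
axis_angle <- function(axis_1, axis_2) {
    vector_1 <- axis_1[2, ] - axis_1[1, ]
    vector_2 <- axis_2[2, ] - axis_2[1, ]
    cosine <- sum(vector_1 * vector_2) /
        (sqrt(sum(vector_1^2)) * sqrt(sum(vector_2^2)))
    ## Guard against rounding errors pushing the cosine outside [-1, 1]
    cosine <- max(-1, min(1, cosine))
    acos(cosine) * 180 / pi
}

## Toy example: a clade axis orthogonal to the phylogenetic axis
phylo_axis <- rbind(c(-1, -1), c(1, 1))
clade_axis <- rbind(c(-1, 1), c(1, -1))
angle <- axis_angle(phylo_axis, clade_axis)
```

An angle of 0 degrees means that a clade's main axis is aligned with the reference axis, and 90 degrees that it is orthogonal to it.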

Running and summarising the analyses

To fully reproduce the results of the publication, you can use the running and summarising.

References

Endler, John A, David A Westcott, Joah R Madden, and Tim Robson. 2005. “Animal Visual Systems and the Evolution of Color Patterns: Sensory Processing Illuminates Signal Evolution.” Evolution 59 (8): 1795–1818.

Guillerme, Thomas, and Kevin Healy. 2014. mulTree: a package for running MCMCglmm analysis on multiple trees. Zenodo. doi:10.5281/zenodo.12902.

Hadfield, Jarrod D. 2010. “MCMC Methods for Multi-Response Generalized Linear Mixed Models: The MCMCglmm R Package.” Journal of Statistical Software 33 (2): 1–22. https://www.jstatsoft.org/v33/i02/.