Published November 25, 2020 | Version v1
Dataset Open

Data from: Properties of Markov chain Monte Carlo performance across many empirical alignments --part II

  • 1. University of Hawaii at Manoa

Description

Nearly all current Bayesian phylogenetic applications rely on Markov chain Monte Carlo (MCMC) methods to approximate the posterior distribution for trees and other parameters of the model. These approximations are only reliable if Markov chains adequately converge and sample from the joint posterior distribution. While several studies of phylogenetic MCMC convergence exist, these have focused on simulated datasets or select empirical examples. Therefore, much that is considered common knowledge about MCMC in empirical systems derives from a relatively small family of analyses under ideal conditions. To address this, we present an overview of commonly applied phylogenetic MCMC diagnostics and an assessment of patterns of these diagnostics across more than 18,000 empirical analyses. Many analyses appeared to perform well, and failures in convergence were most likely to be detected using the average standard deviation of split frequencies, a diagnostic that compares topologies among independent chains. Different diagnostics yielded different information about failed convergence, demonstrating that multiple diagnostics must be employed to reliably detect problems. The number of taxa and average branch lengths in analyses have clear impacts on MCMC performance, with more taxa and shorter branches making convergence more difficult. We show that the use of models that include both Γ-distributed among-site rate variation and a proportion of invariable sites is not broadly problematic for MCMC convergence but is also unnecessary. Changes to heating and the use of model-averaged substitution models can both offer improved convergence in some cases, but neither is a panacea.
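As an illustration of the diagnostic named in the description, the following is a minimal sketch of an average standard deviation of split frequencies calculation. It is not the code used to produce this dataset. The function name `asdsf`, the input layout (one dictionary of split frequencies per independent run), the `min_freq` cutoff of 0.1, and the use of the sample standard deviation are all assumptions made for the example.

```python
from statistics import stdev


def asdsf(run_split_freqs, min_freq=0.1):
    """Average standard deviation of split frequencies across runs.

    run_split_freqs: list of dicts, one per independent run, mapping a
        split label (any hashable value) to its sampled frequency.
    min_freq: hypothetical cutoff; a split is included only if it reaches
        this frequency in at least one run.
    """
    if len(run_split_freqs) < 2:
        raise ValueError("need at least two independent runs")

    # Every split seen in any run; a run that never sampled it counts as 0.
    all_splits = set().union(*run_split_freqs)

    deviations = []
    for split in all_splits:
        freqs = [run.get(split, 0.0) for run in run_split_freqs]
        if max(freqs) >= min_freq:
            # Sample standard deviation (n - 1 denominator), an assumption.
            deviations.append(stdev(freqs))

    return sum(deviations) / len(deviations) if deviations else 0.0


# Toy example with two runs over made-up split labels.
run_a = {"AB|CD": 1.0}
run_b = {"AB|CD": 0.5, "AC|BD": 0.5}
print(asdsf([run_a, run_b]))
```

Values near zero indicate that the runs sampled splits at similar frequencies. Any threshold used to declare convergence is a separate choice and is not implied by this sketch.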

Notes

Funding provided by: National Science Foundation
Crossref Funder Registry ID: http://dx.doi.org/10.13039/100000001
Award Number: DBI-1356796

Files

Harrington_mcmcOutput_repo_2_readme.txt

Files (237.8 GB)

Checksum Size
md5:33db0c1a8c6e61e3c5eefcff0755c54a 10.0 GB
md5:aef508ceb61f927640d7dec06ed30133 10.0 GB
md5:470714a5ae9a4d7f680fbb6503754542 10.0 GB
md5:5d2cc486c82138c40fbb7db2f3bf446e 10.0 GB
md5:98a89f9846ec8aa8f206e55aa22f9aeb 10.0 GB
md5:9aa23cb21e7b3c42b57d03bdd7266cb5 10.0 GB
md5:f72902f9331634f8aa6c6e18334ed4df 10.0 GB
md5:f935ae4ec7c3a09a1640f104670cccbc 10.0 GB
md5:9e83d716b304ad70a66a7854870fa57c 10.0 GB
md5:ffb147f0465862d01613c7cf6b95e637 10.0 GB
md5:cd34462dbf057c276067cbf16b8ae3d8 10.0 GB
md5:8329496510b669c5830981d938fcb179 10.0 GB
md5:b48a9263362ae1b566e092681d3a5f98 10.0 GB
md5:40489b5e14895522e3ef1de03e8bbcdf 10.0 GB
md5:938f6ea42dab91c691b1c51f39ffaef6 10.0 GB
md5:942145dbceb26ce47d0bc063677f9dc1 10.0 GB
md5:ec2d38e4de3790b60af2676b8b5e3788 10.0 GB
md5:c098ba00d63c9cb656c165c5feb89c61 10.0 GB
md5:4efa3511aa012cb7e5e6bcb81d9c02d5 4.4 GB
md5:1b74622d0920e315e8b58421855a357d 6.0 kB
md5:b27246dc21871442afaf0d2bc3022af0 10.0 GB
md5:43d94fc712748ca2d9697e7daf478945 10.0 GB
md5:c1d66f63badc8a8230e17f06c3978ae3 10.0 GB
md5:875181af1a529505d3623e2041a7e036 10.0 GB
md5:b2ac848831f3b5a0573bb388511576d3 10.0 GB
md5:2edc1fdb43e254fc4df85cc71ea8949c 3.6 GB
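Because the archive is split into many large parts, it may be worth checking each download against the md5 values listed above. The following is a minimal sketch of one way to do that. The helper name `md5_matches`, the chunk size, and the file path in the usage comment are hypothetical; the file names for the individual parts are not shown in this listing.

```python
import hashlib


def md5_matches(path, expected, chunk_size=1 << 20):
    """Return True if the file at `path` has the MD5 digest `expected`.

    `expected` may carry the "md5:" prefix used in the file listing.
    The file is read in chunks so that multi-gigabyte parts do not have
    to fit in memory.
    """
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]

    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)

    return digest.hexdigest() == expected.strip().lower()


# Usage (hypothetical local path; substitute the downloaded file name):
# ok = md5_matches("downloaded_part", "md5:33db0c1a8c6e61e3c5eefcff0755c54a")
```

A mismatch most often means an interrupted or corrupted download, in which case fetching that part again is the usual remedy.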