Presentation Open Access

Hamiltonian Monte Carlo on the Space of Phylogenies

Dinh, Vu; Bilge, Arman; Matsen, Erick

Talk given at Evolution 2016 in the SSB Spotlight: Next generation phylogenetic inference 1.

Evolutionary tree inference, or phylogenetics, is an essential tool for understanding biological systems, from deep-time divergences to recent viral transmission. The Bayesian paradigm is now commonly used in phylogenetics to describe support for estimated phylogenies or to test hypotheses that can be expressed in phylogenetic terms. However, current Bayesian phylogenetic inference algorithms are limited to about 1,000 sequences, far fewer than modern sequencing technology makes available.

Here we develop phylogenetic Hamiltonian Monte Carlo (HMC) as a new approach to enable phylogenetic inference on larger data sets. HMC is an existing computational statistical method that scales to large data sets by using Newton's laws of motion to efficiently explore parameter values. However, because a phylogenetic tree parameter includes both branch lengths and a topology, we must go beyond current implementations of HMC, which cannot handle this special structure of trees. To do so, we develop a probabilistic version of the physics simulator within HMC that can explore tree space. This algorithm generalizes previous algorithms: it performs classical HMC on the branch lengths while it stays within a single topology, and makes a random choice among tree topologies at the "intersection" between trees. We show that our algorithm correctly explores the entire tree space and provide a proof-of-concept implementation in open-source software.
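The idea of running classical HMC within a topology and choosing randomly at an intersection can be illustrated with a toy sketch. The code below is not the authors' implementation. It assumes a hypothetical state made of a topology index k and a single non-negative branch length q, a made-up potential U_k(q) = RATES[k] * q, and the convention that a branch length reaching zero marks the intersection, where the path is reflected and a topology is drawn uniformly at random. All names (RATES, leapfrog_path, hmc_step, run_chain) are illustrative.

```python
import math
import random

# Hypothetical toy target: topology k has potential U_k(q) = RATES[k] * q
# on its single branch length q >= 0. All potentials agree at q = 0.
RATES = [1.0, 2.0, 0.5]


def potential(k, q):
    return RATES[k] * q


def grad_potential(k, q):
    # Gradient of U_k with respect to q (constant for this toy potential).
    return RATES[k]


def leapfrog_path(k, q, p, eps, n_steps, rng):
    """Simulate one leapfrog path with random topology choices at q = 0."""
    for _ in range(n_steps):
        p -= 0.5 * eps * grad_potential(k, q)  # half step in momentum
        q += eps * p                           # full step in position
        if q < 0.0:
            # Assumed boundary rule: the branch length crossed zero, so we
            # are at the "intersection". Reflect and pick a topology at random.
            q, p = -q, -p
            k = rng.randrange(len(RATES))
        p -= 0.5 * eps * grad_potential(k, q)  # half step in momentum
    return k, q, p


def hmc_step(k, q, eps, n_steps, rng):
    """One HMC proposal followed by a Metropolis accept/reject step."""
    p0 = rng.gauss(0.0, 1.0)                   # resample momentum
    h0 = potential(k, q) + 0.5 * p0 * p0       # energy before the path
    k1, q1, p1 = leapfrog_path(k, q, p0, eps, n_steps, rng)
    h1 = potential(k1, q1) + 0.5 * p1 * p1     # energy after the path
    if rng.random() < math.exp(min(0.0, h0 - h1)):
        return k1, q1
    return k, q


def run_chain(n_samples, seed=0):
    rng = random.Random(seed)
    k, q = 0, 1.0
    samples = []
    for _ in range(n_samples):
        k, q = hmc_step(k, q, eps=0.1, n_steps=10, rng=rng)
        samples.append((k, q))
    return samples
```

Calling run_chain(500) returns a list of (topology, branch length) pairs. Real tree space has many branch lengths per topology and a likelihood-based potential, so this sketch only shows the control flow of the path simulation, not its statistical guarantees.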
