Published July 6, 2023 | Version v1
Dataset Open

Data from: Summary tests of introgression are highly sensitive to rate variation across lineages

  • 1. University of Wisconsin-Madison

Description

The evolutionary implications and frequency of hybridization and introgression are increasingly being recognized across the tree of life. To detect hybridization from multi-locus and genome-wide sequence data, a popular class of methods is based on summary statistics from subsets of 3 or 4 taxa. However, these methods often carry the assumption of a constant substitution rate across lineages and genes, which is commonly violated in many groups. In this work, we quantify the effects of rate variation on the D test (also known as ABBA-BABA test), the D3 test, and HyDe. All three tests are used widely across a range of taxonomic groups, in part because they are very fast to compute. We consider rate variation across species lineages, across genes, their lineage-by-gene interaction, and residual variation across gene-tree edges. We do so by simulating gene trees within species networks according to a birth-death-hybridization process so as to capture a range of realistic species phylogenies. For all three methods tested, we found a marked increase in the false discovery of reticulation (type-1 error rate) when there is rate variation across species lineages. The D3 test was the most sensitive, with around 80% type-1 error, such that D3 appears to be more sensitive to a departure from the clock than to the presence of reticulation. For all three tests, the power to detect hybridization events decreased as the number of hybridization events increased, indicating that multiple hybridization events can obscure one another if they occur within a small subset of taxa. Our study highlights the need to consider rate variation when using site-based summary statistics and points to the advantages of methods that do not require assumptions on evolutionary rates across lineages or across genes.
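As an illustration of the kind of site-based summary statistic discussed above, the following is a minimal sketch of the D (ABBA-BABA) statistic for a four-taxon alignment, D = (nABBA - nBABA) / (nABBA + nBABA). It is not the code used to produce this dataset. The function names, the taxon labels (P1, P2, P3, outgroup), and the toy sequences are assumptions made for illustration only. Assessing significance (for example with a block jackknife) is not shown.

```python
from typing import Sequence, Tuple

VALID_BASES = set("ACGT")


def abba_baba_counts(p1: Sequence[str], p2: Sequence[str],
                     p3: Sequence[str], outgroup: Sequence[str]) -> Tuple[int, int]:
    """Count ABBA and BABA site patterns in four aligned sequences.

    The outgroup allele is treated as ancestral ("A"); the allele carried
    by P3 at a counted site is the derived one ("B").
    Hypothetical helper written for illustration only.
    """
    abba = 0
    baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        # Skip gaps and ambiguous characters.
        if not {a, b, c, o} <= VALID_BASES:
            continue
        # Only sites where P3 carries a derived allele are informative.
        if c == o:
            continue
        if a == o and b == c:
            abba += 1  # P1 ancestral, P2 and P3 derived
        elif b == o and a == c:
            baba += 1  # P2 ancestral, P1 and P3 derived
    return abba, baba


def d_statistic(abba: int, baba: int) -> float:
    """Return D = (ABBA - BABA) / (ABBA + BABA)."""
    total = abba + baba
    if total == 0:
        raise ValueError("no ABBA or BABA sites: D is undefined")
    return (abba - baba) / total


if __name__ == "__main__":
    # Toy alignment (assumed data): three ABBA sites and one BABA site.
    p1 = "ACGAT"
    p2 = "TCAGC"
    p3 = "TCGGC"
    out = "ACAAT"
    n_abba, n_baba = abba_baba_counts(p1, p2, p3, out)
    print(n_abba, n_baba, d_statistic(n_abba, n_baba))
```

Without introgression, the two patterns are expected to be equally frequent, so D should be close to 0. As the description notes, that expectation relies on a constant substitution rate across lineages, which is the assumption examined in this work.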

Notes

Funding provided by: National Science Foundation
Crossref Funder Registry ID: http://dx.doi.org/10.13039/100000001
Award Number: DGE-2137424

Funding provided by: National Science Foundation
Crossref Funder Registry ID: http://dx.doi.org/10.13039/100000001
Award Number: DMS-1902892

Funding provided by: National Science Foundation
Crossref Funder Registry ID: http://dx.doi.org/10.13039/100000001
Award Number: DMS-2023239

Files

output.zip

Files (11.6 MB)

md5:f7d268f6205cd301b31e709991c4d94f (11.6 MB)
md5:8ac6677a2f653a9c38bb6d2c4ba12b51 (8.7 kB)

Additional details

Related works

Is source of
10.5281/zenodo.8121872 (DOI)