Published February 1, 2022 | Version v1
Dataset Open

Full-likelihood genomic analysis clarifies a complex history of species divergence and introgression: the example of the erato-sara group of Heliconius butterflies

  • 1. Harvard University
  • 2. University College London

Description

Introgressive hybridization plays a key role in adaptive evolution and species diversification in many groups of species. However, frequent hybridization and gene flow between species make estimation of the species phylogeny and key population parameters challenging. Here, we show that by accounting for phasing and using full-likelihood methods, introgression histories and population parameters can be estimated reliably from whole-genome sequence data. We employ the multispecies coalescent (MSC) model with and without gene flow to infer the species phylogeny and cross-species introgression events using genomic data from six members of the erato-sara clade of Heliconius butterflies. The methods naturally accommodate random fluctuations in genealogical history across the genome due to deep coalescence. To avoid heterozygote phasing errors in haploid sequences commonly produced by genome assembly methods, we process and compile unphased diploid sequence alignments and use analytical methods to average over uncertainties in heterozygote phase resolution. There is robust evidence for introgression across the genome, both among distantly related species deep in the phylogeny and between sister species in shallow parts of the tree. We obtain chromosome-specific estimates of key population parameters such as introgression directions, times and probabilities, as well as species divergence times and population sizes for modern and ancestral species. We confirm ancestral gene flow between the sara clade and an ancestral population of H. telesiphe, a likely hybrid speciation origin for H. hecalesia, and gene flow between the sister species H. erato and H. himera. Inferred introgression among ancestral species also explains the history of two chromosomal inversions deep in the phylogeny of the group. 
This study illustrates how a full-likelihood approach based on the multispecies coalescent makes it possible to extract rich historical information about species divergence and gene flow from genomic data.

Files

README.txt

Files (301.9 MB)

Checksum                                Size
md5:4d65cc4cea26c529d7444b34ca43377f    176.8 MB
md5:2f16d68a622789ee1b01b91725f9825f    41.5 MB
md5:2a33c65720ac96472904a7c265261115    41.9 MB
md5:454f346ca1ebe00cd57d19f9a424312e    41.7 MB
md5:0b69a289ef68728269810e4feccbaecf    607 Bytes
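
The MD5 checksums listed above can be used to check that a downloaded file arrived intact. The following is a minimal sketch in Python, not part of the dataset itself. The file names are not shown in this listing, so the sketch only tests whether a local file's digest matches any of the listed checksums. The helper names `md5_of_file` and `is_listed` are hypothetical.

```python
import hashlib

# MD5 digests copied from the file listing of this record.
# The listing does not show file names, so no name-to-digest mapping is assumed.
LISTED_MD5 = {
    "4d65cc4cea26c529d7444b34ca43377f",
    "2f16d68a622789ee1b01b91725f9825f",
    "2a33c65720ac96472904a7c265261115",
    "454f346ca1ebe00cd57d19f9a424312e",
    "0b69a289ef68728269810e4feccbaecf",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_listed(path):
    """Return True if the file's MD5 digest matches one of the listed checksums."""
    return md5_of_file(path) in LISTED_MD5
```

Calling `is_listed("path/to/downloaded_file")` (a placeholder path) returns `True` only when the file's digest equals one of the five values above. A `False` result suggests an incomplete or corrupted download, or a file that is not part of this record.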

Additional details

Related works

Is cited by
10.1101/2021.02.10.430600 (DOI)
Is source of
10.5281/zenodo.5941161 (DOI)
Is supplemented by
https://zenodo.org/record/5078147 (URL)