Published November 4, 2022 | Version v1
Software Open

Faster‐haplodiploid evolution under divergence‐with‐gene‐flow: Simulations and empirical data from pine‐feeding hymenopterans

  • 1. University of Kentucky
  • 2. University of Lisbon

Description

Although haplodiploidy is widespread in nature, the evolutionary consequences of this mode of reproduction are not well characterized. Here, we examine how genome-wide hemizygosity and a lack of recombination in haploid males affect genomic differentiation in populations that diverge via natural selection while experiencing gene flow. First, we simulated diploid and haplodiploid "genomes" (500-kb loci) evolving under an isolation-with-migration model with mutation, drift, selection, migration, and recombination, and examined differentiation at neutral sites both tightly and loosely linked to a divergently selected site. So long as there is divergent selection and migration, sex-limited hemizygosity and recombination cause elevated differentiation (i.e., produce a "faster-haplodiploid effect") in haplodiploid populations relative to otherwise equivalent diploid populations, for both recessive and codominant mutations. Second, we used genome-wide SNP data to model divergence history and describe patterns of genomic differentiation between sympatric populations of Neodiprion lecontei and N. pinetum, a pair of pine sawfly species (order: Hymenoptera; family: Diprionidae) that are specialized on different pine hosts. These analyses support a history of continuous gene exchange throughout divergence and reveal a pattern of heterogeneous genomic differentiation that is consistent with divergent selection on many unlinked loci. Third, using simulations of haplodiploid and diploid populations evolving according to the estimated divergence history of N. lecontei and N. pinetum, we found that divergent selection would lead to higher differentiation in haplodiploids. Based on these results, we hypothesize that haplodiploids undergo divergence-with-gene-flow and sympatric speciation more readily than diploids.

Notes

The ZIP files contain all input files (Dryad) and scripts (Zenodo) needed to run a set of population genetic analyses (plus a README file).

ABBA-BABA:

Data and scripts used to perform the ABBA-BABA tests (D-statistic), including the following files:

DATA:
- data_files folder: VCF and individual info files

SCRIPTS:
- ProcessData_Neodiprion_filterDPperind.sh: bash script to filter the VCF file based on read depth (DP) and Hardy-Weinberg equilibrium (HWE) and obtain a genotype matrix file
- analysis_NeoLecPin_feb2017.r: R script to read the genotype matrix file, compute the ABBA-BABA tests (D-statistics), and assess their significance using a block jackknife
- RScripts folder: files defining the functions used by the bash script ProcessData_Neodiprion_filterDPperind.sh and the R script analysis_NeoLecPin_feb2017.r
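
For readers unfamiliar with the test, the following is a minimal Python sketch of a D-statistic with a delete-one block jackknife. It is not the archived R code: the function names, the input layout (one tuple of derived-allele frequencies per site for populations P1, P2, P3 and the outgroup), and the use of equal-sized consecutive blocks are all assumptions made for illustration.

```python
import math


def d_statistic(sites):
    """D = sum(ABBA - BABA) / sum(ABBA + BABA).

    `sites` is a list of (p1, p2, p3, po) derived-allele frequencies
    (hypothetical input layout, not the archived genotype matrix format).
    """
    num = 0.0
    den = 0.0
    for p1, p2, p3, po in sites:
        abba = (1 - p1) * p2 * p3 * (1 - po)
        baba = p1 * (1 - p2) * p3 * (1 - po)
        num += abba - baba
        den += abba + baba
    return num / den if den > 0 else float("nan")


def block_jackknife_d(sites, n_blocks):
    """Return (D, standard error, Z) using a delete-one block jackknife.

    Sites are split into `n_blocks` equal-sized consecutive blocks
    (an assumption); leftover sites at the end are dropped.
    """
    sites = list(sites)
    size = len(sites) // n_blocks
    if n_blocks < 2 or size == 0:
        raise ValueError("need at least 2 non-empty blocks")
    blocks = [sites[i * size:(i + 1) * size] for i in range(n_blocks)]
    used = [s for block in blocks for s in block]
    d_full = d_statistic(used)

    # Recompute D with each block left out in turn.
    leave_one_out = []
    for i in range(n_blocks):
        rest = [s for j, block in enumerate(blocks) if j != i for s in block]
        leave_one_out.append(d_statistic(rest))

    mean_loo = sum(leave_one_out) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum(
        (d - mean_loo) ** 2 for d in leave_one_out
    )
    se = math.sqrt(variance)
    z = d_full / se if se > 0 else float("nan")
    return d_full, se, z
```

In practice the blocks would usually be defined by genomic position (for example, fixed-length windows along scaffolds) rather than by site index, so that linked sites fall into the same block.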

ADMIXTURE:

Data and scripts to run the admixture analysis for the hybrids, N. pinetum, and N. lecontei in Kentucky, including the following files:

DATA:
- KY_Pin_F1_nohet_7x_0.5miss_0.01maf.recode.vcf is the VCF file containing the SNPs used in this analysis
- KY_pin_F1_sitesincommon.ped and KY_pin_F1_sitesincommon.map are the input files for this analysis

SCRIPTS:
- test_run_admixture.sh is the script to run the admixture analysis

Demography_Fastsimcoal:

Data and scripts to perform demographic analyses of Neodiprion sawflies using fastsimcoal2

DATA:
- Define_NeutralFstThreshold folder: output files used to obtain the FST threshold based on simulations
- input_data folder: contains the VCF files that are converted to 2D SFS, the 2D SFS files, and the fastsimcoal2 input files

SCRIPTS:
- Build_2DSFS folder: scripts to obtain the 2D SFS from VCF files
- Define_NeutralFstThreshold folder: scripts to obtain the FST threshold based on simulations
- ScriptsLaunchFastsimcoal2 folder: scripts to launch fastsimcoal2 analyses
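
As an illustration of what a two-population (2D) site frequency spectrum is, the sketch below tabulates one in Python from per-individual allele counts. It is not the code in the Build_2DSFS folder: the function name, the input layout, the fixed ploidy argument, and the choice to skip any site with missing genotypes are assumptions made for this example.

```python
def joint_sfs(genos_pop1, genos_pop2, ploidy=2):
    """Tabulate a 2D SFS from two populations genotyped at the same sites.

    genos_popX: list of sites; each site is a list with one alternate-allele
    count per individual (0..ploidy), or None for a missing genotype.
    Entry sfs[i][j] counts sites with i alternate copies in population 1
    and j alternate copies in population 2.
    """
    if len(genos_pop1) != len(genos_pop2):
        raise ValueError("both populations must cover the same sites")
    if not genos_pop1:
        raise ValueError("no sites supplied")

    n1 = len(genos_pop1[0]) * ploidy  # gene copies sampled in population 1
    n2 = len(genos_pop2[0]) * ploidy  # gene copies sampled in population 2
    sfs = [[0] * (n2 + 1) for _ in range(n1 + 1)]

    for g1, g2 in zip(genos_pop1, genos_pop2):
        # Assumption: sites with any missing genotype are skipped entirely.
        if None in g1 or None in g2:
            continue
        sfs[sum(g1)][sum(g2)] += 1
    return sfs
```

With two diploid individuals in the first population and one in the second, the returned table has 5 rows and 3 columns. Real data with missing genotypes are often handled by down-projecting or resampling to a fixed sample size rather than by dropping sites, and haploid males would need their own ploidy setting.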

Genomewide_patterns_of_divergence:

Data for calculating pi and Fst for N. lecontei and N. pinetum

DATA:
- KY.text and Pintetum.txt contain the sample names for N. lecontei and N. pinetum, respectively
- For all files in this folder, KY refers to the Kentucky population of N. lecontei
- KY_Pin_7x_50%miss_nothetexc_0.01maf.recode.vcf is the input vcf file for all three analyses
- The .log files contain the command line code that was used to run the three analyses
- .pi and .fst files are the output files containing the raw pi and Fst values
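
For orientation, the two summary statistics can be sketched in Python as follows. This is an illustration only, not the commands recorded in the .log files: it uses per-site nucleotide diversity for biallelic sites and Hudson's FST estimator (ratio of sums across sites), which may differ from the estimator behind the archived .fst files. The function names and the allele-count input layout are assumptions.

```python
def site_pi(alt_count, n_copies):
    """Per-site nucleotide diversity for a biallelic site.

    alt_count: copies of the alternate allele; n_copies: gene copies sampled.
    """
    if n_copies < 2:
        raise ValueError("need at least 2 gene copies")
    k = alt_count
    n = n_copies
    return 2.0 * k * (n - k) / (n * (n - 1))


def hudson_fst(sites):
    """Hudson's FST as a ratio of sums over sites.

    sites: list of (alt1, n1, alt2, n2) allele counts per site
    (hypothetical input layout).
    """
    num = 0.0
    den = 0.0
    for alt1, n1, alt2, n2 in sites:
        if n1 < 2 or n2 < 2:
            raise ValueError("need at least 2 gene copies per population")
        p1 = alt1 / n1
        p2 = alt2 / n2
        # Numerator: squared frequency difference, corrected for sampling.
        num += (
            (p1 - p2) ** 2
            - p1 * (1 - p1) / (n1 - 1)
            - p2 * (1 - p2) / (n2 - 1)
        )
        # Denominator: probability that two copies, one from each
        # population, carry different alleles.
        den += p1 * (1 - p2) + p2 * (1 - p1)
    return num / den if den > 0 else float("nan")
```

Averaging per-site pi over all callable sites in a window (not only the variable ones) gives the windowed diversity; the FST ratio of sums can be applied genome-wide or per window in the same way.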

Process_Neodiprion_data:

Data and scripts to go from raw fastq reads to vcf files. 

DATA:
- Deduplication folder contains fastq_files folder with one fastq file per lane per library for removal of PCR duplicates

SCRIPTS:
- Deduplication folder contains the Python_scripts folder, which holds the Python scripts for removal of PCR duplicates. There is a single Python script for each lane and library.
- Process_fastq.sh has the command-line input to go from fastq files to BAM files. The output is a single filtered BAM file per sample.

Funding provided by: National Science Foundation
Crossref Funder Registry ID: http://dx.doi.org/10.13039/100000001
Award Number: DEB-CAREER-1750946

Funding provided by: National Science Foundation
Crossref Funder Registry ID: http://dx.doi.org/10.13039/100000001
Award Number: DEB-1257739

Funding provided by: National Institute of Food and Agriculture
Crossref Funder Registry ID: http://dx.doi.org/10.13039/100005825
Award Number: 2015-67011-22803

Funding provided by: Fundação para a Ciência e a Tecnologia
Crossref Funder Registry ID: http://dx.doi.org/10.13039/501100001871
Award Number: UIDB/00329/2020

Funding provided by: Fundação para a Ciência e a Tecnologia
Crossref Funder Registry ID: http://dx.doi.org/10.13039/501100001871
Award Number: CEECIND/02391/2017

Funding provided by: Fundação para a Ciência e a Tecnologia
Crossref Funder Registry ID: http://dx.doi.org/10.13039/501100001871
Award Number: CEECINST/00032/2018/CP1523/CT0008

Funding provided by: HORIZON EUROPE Marie Sklodowska-Curie Actions
Crossref Funder Registry ID: http://dx.doi.org/10.13039/100018694
Award Number: 799729

Funding provided by: Fundação para a Ciência e a Tecnologia
Crossref Funder Registry ID: http://dx.doi.org/10.13039/501100001871
Award Number: CPCA/A0/7303/2020

Files

ABBA-BABA.zip

Files (1.7 MB)

md5:c132e8bc6f2e1e95afa10aebf46ac365 (37.7 kB)
md5:ee01c32e3b29f05b0140ba69f0a28073 (1.7 kB)
md5:52b76775d982e6239ca4493e83dbfcd7 (1.6 MB)
md5:9227815fb8c9c64f85ed809a7006320a (77.3 kB)
md5:ae6d398ed0185e29d4b89e33d70626b7 (6.1 kB)

Additional details

Related works