Published December 17, 2019 | Version v1
Dataset Open

Data from: The role of structural genomic variants in population differentiation and ecotype formation in Timema cristinae walking sticks

  • 1. University of Sheffield
  • 2. Utah State University

Description

Theory predicts that structural genomic variants such as inversions can promote adaptive diversification and speciation. Despite increasing empirical evidence that adaptive divergence can be triggered by one or a few large inversions, the degree to which widespread genomic regions under divergent selection are associated with structural variants remains unclear. Here we test for an association between structural variants and genomic regions that underlie parallel host-plant associated ecotype formation in Timema cristinae stick insects. Using mate-pair re-sequencing of 20 new whole genomes we find that modest-sized structural variants such as inversions, deletions, and duplications are widespread across the genome, being retained as standing variation within and among populations. Using 160 previously published, standard-orientation whole genome sequences we find little to no evidence that the DNA sequences within inversions exhibit accentuated differentiation between ecotypes. In contrast, a formerly described large region of reduced recombination that harbors genes controlling color-pattern exhibits evidence for accentuated differentiation between ecotypes, which is consistent with differences in the frequency of color-pattern morphs between host-associated ecotypes. Our results suggest that some types of structural variants (e.g., large inversions) are more likely to underlie adaptive divergence than others, and that structural variants are not required for subtle yet genome-wide genetic differentiation with gene flow.

Notes

Deletion variants

VCF file with the 194 deletion structural variants identified using Lumpy and Delly. Data for all 20 Timema cristinae individuals are included.

mod_del_genotyped.vcf.gz

Duplication variants

VCF file with the 223 duplication structural variants identified using Lumpy and Delly. Data for all 20 Timema cristinae individuals are included.

mod_dup_genotyped.vcf.gz

Inversion variants

VCF file with the 492 inversion structural variants identified using Lumpy and Delly. Data for all 20 Timema cristinae individuals are included.

mod_lumpy_inversions_genotyped.vcf.gz
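
A quick sanity check after downloading the three SV files is to count their data records and compare against the totals quoted above (194, 223, 492). The Python sketch below assumes each structural variant occupies exactly one non-header line, which is an assumption about these files and not something stated in this record; the function names are illustrative only.

```python
import gzip


def count_vcf_records(lines):
    """Count data records: lines that are neither blank nor '#' header lines."""
    return sum(1 for line in lines if line.strip() and not line.startswith("#"))


def count_records_in_gz(path):
    """Count data records in a gzip-compressed VCF file at the given path."""
    with gzip.open(path, "rt") as handle:
        return count_vcf_records(handle)


# Example call, assuming the file is in the working directory:
# count_records_in_gz("mod_del_genotyped.vcf.gz")
```

If a count differs from the quoted total, the one-record-per-variant assumption may simply not hold for that file.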

SV population genetics script

R script for population genetic analyses and plots of the structural variant data. This includes calculations for Fst.

svSummary.R
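
For readers unfamiliar with Fst, the Python sketch below shows one common textbook form, Fst = (H_T - H_S) / H_T, for a single biallelic locus in two equally weighted populations. It is not taken from svSummary.R, and the estimator used in that script may differ; the function name and the equal weighting are assumptions made for illustration.

```python
def fst_two_pops(p1, p2):
    """Fst = (H_T - H_S) / H_T for one biallelic locus and two populations.

    p1 and p2 are the non-reference allele frequencies in each population.
    Populations are weighted equally (an assumption of this sketch).
    """
    p_bar = (p1 + p2) / 2.0
    # Expected heterozygosity of the pooled population.
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    # Mean expected heterozygosity within populations.
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    if h_t == 0.0:
        # Locus is fixed for the same allele in both populations: Fst undefined.
        return float("nan")
    return (h_t - h_s) / h_t
```

Equal frequencies in both populations give a value of 0, and fixation for alternative alleles gives 1.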

SV allele frequencies

This compressed directory includes maximum likelihood allele frequency estimates for the SVs. There is one file per SV type (inv = inversion, del = deletion, dup = duplication) and population. Files without population IDs are for all individuals together. In each file, there is one row per SV, the first column gives the locus ID, and the third column gives the non-reference SV allele frequency.

svAlleleFreqs.tar.gz
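
The column layout described above (locus ID in column 1, non-reference allele frequency in column 3) can be read with a few lines of Python. This sketch assumes the files are whitespace-delimited with no header row; neither detail is stated in this record, and the file name in the usage comment is hypothetical.

```python
def parse_allele_freqs(lines):
    """Map locus ID (column 1) to non-reference allele frequency (column 3).

    Assumes whitespace-delimited rows with no header; rows with fewer than
    three fields (for example blank lines) are skipped.
    """
    freqs = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue
        freqs[fields[0]] = float(fields[2])
    return freqs


# Usage with a hypothetical file extracted from the archive:
# with open("inv_example.txt") as handle:
#     freqs = parse_allele_freqs(handle)
```

The same function should apply to the SNP allele frequency files, which are described with the same column layout.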

MeasureOrientationFreqs

One of two complementary Perl scripts used to identify the inversions from the whole-genome comparative alignment.

ExtractOrientInversions

One of two complementary Perl scripts used to identify the inversions from the whole-genome comparative alignment.

SNP variant file

VCF file with SNPs from the 160 Timema cristinae genomes.

filtered1X_tcr_wgs_variants_x.vcf.gz

SNP allele frequencies

This compressed directory includes maximum likelihood allele frequency estimates for the SNPs from the 160 genomes. There is one file per population. In each file, there is one row per SNP, the first column gives the locus ID, and the third column gives the non-reference allele frequency.

snpAlleleFreqs.tar.gz

R population genomics script

This R script contains the core analyses of genetic variation within inversion sequences, based on SNPs from the 160 Timema cristinae genomes.

popgen.R

Funding provided by: H2020 European Research Council
Crossref Funder Registry ID: http://dx.doi.org/10.13039/100010663
Award Number: R/129639

Funding provided by: Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung
Crossref Funder Registry ID: http://dx.doi.org/10.13039/501100001711
Award Number: P2BEP3_152103

Files

Total size: 1.2 GB

Checksum                                Size
md5:8e535f725d019eb41af864423e598a33    2.0 kB
md5:617f6e3a02664f1af8fb4bcf3f67f778    1.2 GB
md5:e1dfe65c1be84ff01309fd156c4dd222    1.2 kB
md5:649226e63a235d6f55f2554c69d1439d    50.8 kB
md5:9017adc02d1adeae8ee865cc3740a2ad    51.3 kB
md5:3863e25a265f0534848ecb7d4027767c    105.7 kB
md5:352e44e57fa8a83c23d99bbbb1dc5551    36.0 kB
md5:d7c9d2a56c0138c612880a3f1d9ad5c1    26.9 MB
md5:544f1d2ee1abbb60464ecb2b3454a4ee    40.7 kB
md5:8652a0ed6c20691dc5a7640c378a145e    9.2 kB
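
The MD5 checksums above can be used to confirm that a download completed intact. The Python sketch below computes the MD5 digest of a local file in chunks, so that the 1.2 GB file does not need to fit in memory. File names are not paired with checksums in this listing, so the computed digest has to be compared against the entry with the matching size; the function name and the path in the usage comment are illustrative.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage with a hypothetical local path:
# md5_of_file("downloads/svSummary.R")
```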

Additional details

Related works

Is cited by
10.1111/mec.15016 (DOI)