Published September 1, 2022 | Version 1
Dataset Open

Supplementary Information of "Introgression between highly divergent sea squirt genomes: an adaptive breakthrough?"

Description

Supplementary Figures

Figure S1 Population genetic statistics calculated in non-overlapping 10 Kb windows along the 14 chromosomes in the sea squirt genome.
Figure S2 C. robusta introgression into C. intestinalis shown across the 14 chromosomes.
Figure S3 Population genetic statistics of the C. robusta introgressed coding sequences.
Figure S4 ABBA-BABA introgression patterns using C. edwardsi as an outgroup.
Figure S5 Inference of the divergence history between C. robusta and C. intestinalis with moments.
Figure S6 Selection tests.
Figure S7 C. robusta ancestry along chromosome 5 in C. intestinalis individuals.
Figure S8 Neighbor-joining trees of 50 Kb windows framing the “missing data region” (grey band) at the center of the chromosome 5 hotspot.
Figure S9 Copy number variation at candidate SNPs in the introgression hotspot on chromosome 5 (700 Kb - 1.5 Mb).
Figure S10 Structural analysis of the “missing data region” on chromosome 5 (from 1,009,000 to 1,055,000 bp).

Supplementary Tables

Table S1 Sample information.
Table S2 Correlation between chromosomes of the individual C. robusta ancestry fraction.
Table S3 Demographic results with moments – excluding chromosome 5.
Table S4 Demographic results with moments – including chromosome 5.
Table S5 Description of the Supplementary Data.

Supplementary Scripts

Bioinformatic pipeline used for genotyping and haplotyping.

Script #1: prepare the reference genome for BWA and GATK.
reference_bwa_GATK_CF.sh
Script #2: mapping the reads to the reference with BWA.
mapping_bwa-mem_CF.sh
Script #3: indel realignment with GATK.
indel_realignment_CF.sh
Script #4: individual variant calling in gVCF format with GATK.
snpindel_callingGVCF_raw_CF.sh
Script #5: joint genotyping with GATK.
joint_genotyping_raw_CF.sh
Script #6: genotype refinement with GATK.
genotype_refinement_raw_CF.sh
Script #7: SNP and indel recalibration with GATK.
snpindel_recalibration_CF.sh
Script #8: genotype refinement after recalibration with GATK.
genotype_refinement_recal_CF.sh
Script #9: genotype correction.
phase_by_transmission_correctCalling_CF@2020.sh
Script #10: phasing with GATK and BEAGLE.
phase_by_transmission_clean_CF@2020.sh

Pipeline used for the demographic inferences with moments.

Script #11: define the demographic models.
moments_models_2pop_bb_parallel_folded_2periods.py
Script #12: run the demographic inferences.
moments_inference_dualanneal_bb_parallel_folded_2periods_bounds.py

Files

FigureS1-S10.pdf

Files (15.2 MB)

Checksum  Size
md5:aa31c1a23077874043ac1df3dd31819a  14.8 MB
md5:b1882eba8812527955941c66206543e2  886 Bytes
md5:e25e8d317562bbddb0fe4510b170e775  29.1 kB
md5:bde21d14becedc2fe60be304df5cb2b4  58.7 kB
md5:30b4be1799453792abbe846d84273d19  57.2 kB
md5:3fbb5c01e056ddb6b12031e04c9eecbe  5.0 kB
md5:064900e7819852613eba5a302be26664  1.7 kB
md5:363daa6247060495fd3cdc908ed1de59  4.9 kB
md5:d4ec4141faf1799d46d8a77190ae3e0e  2.7 kB
md5:df08a2a93e21ce35be8a77817c5cc568  9.7 kB
md5:a457848444f7b4c90432a1c3d7a83372  5.8 kB
md5:b34b9276b3038bbd111f5a8030aa1f84  5.1 kB
md5:d8bad7afafb4e808592e8f64d26ea978  23.7 kB
md5:31067cec716e501def9ddbf12bdde76c  27.9 kB
md5:900151559ae2ab5e929281c087b9b3fb  29.9 kB
md5:d373bf6a8a7868c3b663be6d4101ec6c  35.4 kB
md5:c8aff50f3359fe971475fe50d7b60078  31.8 kB
md5:f3936254690a3b6d645c120b524af07c  24.4 kB
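
The md5 checksums listed for the files can be used to confirm that a download is intact. The following is a minimal sketch in Python using only the standard library; the helper names md5_of_file and matches_checksum are hypothetical and are not part of the deposited scripts. This listing does not show which file name goes with which checksum, so that pairing has to be taken from the record itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal md5 digest of the file at `path`.

    The file is read in chunks so that large files (such as the
    figure PDF) do not need to fit in memory at once.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'.

    The comparison is case-insensitive; the 'md5:' prefix is optional.
    """
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

A call such as matches_checksum(local_path, "md5:aa31c1a23077874043ac1df3dd31819a"), where local_path is wherever the corresponding file was saved, returns True only if the local copy hashes to the listed value.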

Additional details

Related works

Is cited by
10.1101/2022.03.22.485319 (DOI)

Funding

MUSE – MUSE 16-IDEX-0006
Agence Nationale de la Recherche
HYSEA – Hybridization, a pivotal but neglected contributor to marine biodiversity dynamics ANR-12-BSV7-0011
Agence Nationale de la Recherche