Published February 12, 2021 | Version v1
Dataset Open

Drivers of linkage disequilibrium across a species' geographic range

  • 1. Kay
  • 2. Yvonne

Description

Data from "Drivers of linkage disequilibrium across a species’ geographic range" by Lucek & Willi

The files contain all LD estimates used in the analyses of the study, estimated separately for genic and intergenic regions of the Arabidopsis lyrata genome.

Three files contain the LD estimates from Pool-Seq data (Genbank project PRJEB19338), computed with the software Ldx (https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0048588):

genic_ld.txt contains the following columns:

GeneID => ID of the respective gene based on the V2 annotation of A. lyrata
Scaffold => Scaffold ID (1-8)
Population => Population ID (see Table S1 in the study)
UncorrectedLD => LD estimates from Ldx
Distance_between_LD_pairs => Distance in bps between SNPs used to calculate LD
Fis => Inbreeding coefficient (Fis) based on microsatellite estimates
Expansion_distance => Range expansion distance (in km) taken from Willi et al. 2018 Mol Biol Evol
Genetic_cluster => Phylogenetic cluster (1= East; 2= West)
LD => Distance corrected LD used for all analyses
Average_distance_to_nearest_gene => Average distance to the nearest gene, used as an estimate of gene density
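
As a minimal sketch of how the columns listed above might be used, the following Python snippet reads a table with this header and averages the distance-corrected LD per population. The tab delimiter and the helper name mean_ld_by_population are assumptions (the description does not state the delimiter), and the sample rows are made up for illustration only.

```python
import csv
import io
from collections import defaultdict


def mean_ld_by_population(handle, delimiter="\t"):
    """Return {Population: mean of the LD column} for an open text file.

    Assumes a header row with the documented column names and a tab
    delimiter (an assumption; adjust if the real file differs).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    # Rows are streamed one at a time, so large files are not loaded whole.
    for row in csv.DictReader(handle, delimiter=delimiter):
        try:
            ld = float(row["LD"])
        except (TypeError, ValueError):
            continue  # skip rows with a missing or non-numeric LD value
        sums[row["Population"]] += ld
        counts[row["Population"]] += 1
    return {pop: sums[pop] / counts[pop] for pop in sums}


# Made-up rows that only mimic part of the documented header; not real data.
sample = io.StringIO(
    "GeneID\tScaffold\tPopulation\tUncorrectedLD\tLD\n"
    "g1\t1\tpopA\t0.30\t0.20\n"
    "g2\t1\tpopA\t0.50\t0.40\n"
    "g3\t2\tpopB\t0.10\t0.05\n"
)
means = mean_ld_by_population(sample)
```

For the downloaded file, the same function could be called on an open handle, for example with open("genic_ld.txt", newline="") as fh, provided the delimiter assumption holds.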

 

LD_genic_with_SIFT_annotations.txt
A subset of genic_ld.txt for which SNPs were annotated by SIFT4G https://www.nature.com/articles/nprot.2015.123
The file contains one additional column:
SIFT_comparison => 3 types of comparisons, T_T (between Tolerated and Tolerated SNPs), T_D (between Tolerated and Deleterious SNPs), D_D (between Deleterious and Deleterious SNPs)
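
The three comparison types lend themselves to a grouped summary. The sketch below, under the same assumptions as before (tab-delimited file with a header row; the function name ld_by_sift_class is hypothetical), collects the LD column by SIFT_comparison value and reports the mean for each class. The sample rows are invented and do not come from the dataset.

```python
import csv
import io
from statistics import mean

# The three comparison labels documented for the SIFT_comparison column.
SIFT_CLASSES = ("T_T", "T_D", "D_D")


def ld_by_sift_class(handle, delimiter="\t"):
    """Return {SIFT_comparison: mean LD} for classes that have data."""
    values = {cls: [] for cls in SIFT_CLASSES}
    for row in csv.DictReader(handle, delimiter=delimiter):
        cls = row.get("SIFT_comparison")
        if cls in values:  # ignore rows with an unexpected or missing label
            values[cls].append(float(row["LD"]))
    return {cls: mean(v) for cls, v in values.items() if v}


# Invented rows for illustration; only two of the documented columns are shown.
sample = io.StringIO(
    "LD\tSIFT_comparison\n"
    "0.10\tT_T\n"
    "0.30\tT_T\n"
    "0.25\tT_D\n"
    "0.50\tD_D\n"
)
class_means = ld_by_sift_class(sample)
```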

 

intergenic_ld.txt contains the following columns:

Scaffold => Scaffold ID (1-8)
Population => Population ID (see Table S1 in the study)
UncorrectedLD => LD estimates from Ldx
Distance_between_LD_pairs => Distance in bps between SNPs used to calculate LD
Fis => Inbreeding coefficient (Fis) based on microsatellite estimates
Expansion_distance => Range expansion distance (in km) taken from Willi et al. 2018 Mol Biol Evol
Genetic_cluster => Phylogenetic cluster (1= East; 2= West)
LD => Distance corrected LD used for all analyses
Size_of_intergenic_region_in_bps => Size in bps of each intergenic region

Two files contain the LD estimates from individually re-sequenced genomes (Genbank project PRJEB30473), computed with the software mlrho (https://onlinelibrary.wiley.com/doi/full/10.1111/j.1365-294X.2009.04482.x):

Each file contains the correlation of zygosity estimated for each individual for each bp distance as distinct columns, for either all genic (genic_mlrho.txt) or all intergenic (intergenic_mlrho.txt) regions. All values were scaled by genome-wide theta.

Files

Files (820.5 MB)

md5:d1e07db63c812f374b3b3b06b7f55119 (490.9 MB)
md5:91beca578c75bd4d07237b0e0edc4bb1 (1.6 MB)
md5:d9641b97a907a28af687d4af9ed3fccd (189.4 MB)
md5:35a1d5ee3fc1cd8f41bd7dbbea30790d (2.0 MB)
md5:8655fedf04cdc6d38fcdf5a65859f36a (136.7 MB)
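
The MD5 values listed above can be used to check a download for corruption. The following is a minimal sketch using Python's standard hashlib module; the function names md5_of_file and matches_listed_md5 are hypothetical, and the self-check runs on a throwaway temporary file rather than on any file from this dataset.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a value written as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Self-check on a temporary file containing the bytes b"hello".
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
demo_ok = matches_listed_md5(tmp.name, "md5:5d41402abc4b2a76b9719d911017c592")
os.remove(tmp.name)
```

Which checksum belongs to which file is not stated in this listing, so the pairing has to be taken from the repository page before calling matches_listed_md5 on a downloaded file.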

Additional details

Funding

The genetics of adaptation in quantitative traits PP00P3_123396
Swiss National Science Foundation
Evolutionary dynamics of drift load and its role in species distribution limits 31003A_166322
Swiss National Science Foundation
The genetic basis of evolutionary constraints PP00P3_146342
Swiss National Science Foundation