Published January 26, 2021 | Version 2.1
Dataset Open

Local adaptation and archaic introgression shape global diversity at human structural variant loci

  • 1. Department of Biology, Johns Hopkins University

Description

Supporting data associated with the manuscript "Local adaptation and archaic introgression shape global diversity at human structural variant loci". The files in this directory are described below.

Structural variant genotypes

SVs_paragraphFormat.vcf.gz - merged long-read structural variant calls

SVs_1KGP_pgGTs.vcf.gz - genotypes for 1000 Genomes samples in VCF format
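
As a minimal sketch of how the genotype file might be summarized, the snippet below computes a per-SV genotyping rate (the fraction of samples with a non-missing call). It assumes the file follows the standard VCF layout with a GT key in the FORMAT column and "." for missing alleles; the inline records, sample names, and the helper name genotyping_rates are hypothetical and not taken from the dataset.

```python
import io

# Tiny made-up VCF fragment for illustration only (not from the dataset).
EXAMPLE_VCF = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3\tS4\n"
    "chr21\t1000\tsv1\tA\t<DEL>\t.\tPASS\tSVLEN=-50\tGT\t0/1\t./.\t1/1\t0/0\n"
    "chr21\t2000\tsv2\tA\t<INS>\t.\tPASS\tSVLEN=80\tGT\t./.\t./.\t0/1\t./.\n"
)


def genotyping_rates(handle):
    """Return {variant ID: fraction of samples with a non-missing GT}."""
    rates = {}
    for line in handle:
        # Skip meta-information lines, the column header, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        variant_id = fields[2]
        # Locate GT within the FORMAT column rather than assuming it is first.
        gt_index = fields[8].split(":").index("GT")
        calls = [sample.split(":")[gt_index] for sample in fields[9:]]
        # A call counts as missing if any allele is ".".
        called = sum(1 for gt in calls if "." not in gt)
        rates[variant_id] = called / len(calls)
    return rates


rates = genotyping_rates(io.StringIO(EXAMPLE_VCF))
for variant_id, rate in rates.items():
    print(variant_id, rate)
```

For the real file, the same function could be given a handle opened with gzip.open(path, "rt") from the Python standard library.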

eQTL mapping results

fastqtl_out.txt - results from fastQTL permutation pass; see http://fastqtl.sourceforge.net/ for column descriptions

caviar_out.txt - results from fine-mapping SNPs and SVs at significant SV eQTL loci with CAVIAR. Description of columns:

  • query_sv: SV that was a significant eQTL and underwent fine-mapping
  • gene_id: gene exhibiting an expression association with the query_sv
  • var_id: variant (SNV or SV) that was tested for expression association with the above gene in the fine-mapping analysis
  • var_in_credible_causal_set: Boolean variable denoting whether the above variant is in the 95% credible causal set
  • prob_in_pcausal_set: the amount that this variant contributes to the 95% credible causal set
  • causal_post_prob: the posterior probability that the variant is causal in the expression association
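
The columns above can be used, for example, to pick the most likely causal variant within each credible set. The following is a sketch under stated assumptions: it assumes caviar_out.txt is tab-delimited with a header row, and that the Boolean column is written as TRUE/FALSE (or similar); neither is confirmed by this description. The inline rows and the helper name top_credible_variant are hypothetical.

```python
import csv
import io

# Made-up rows for illustration; the delimiter and Boolean encoding are assumptions.
EXAMPLE = (
    "query_sv\tgene_id\tvar_id\tvar_in_credible_causal_set\t"
    "prob_in_pcausal_set\tcausal_post_prob\n"
    "sv1\tgeneA\tsv1\tTRUE\t0.60\t0.55\n"
    "sv1\tgeneA\tsnp1\tTRUE\t0.35\t0.30\n"
    "sv1\tgeneA\tsnp2\tFALSE\t0.00\t0.02\n"
)

# Accepted spellings of "true"; adjust to match the actual file.
TRUE_VALUES = {"true", "t", "1"}


def top_credible_variant(handle):
    """Map (query_sv, gene_id) to the credible-set variant with the highest
    causal posterior probability, as a (var_id, probability) tuple."""
    best = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        in_set = row["var_in_credible_causal_set"].strip().lower()
        if in_set not in TRUE_VALUES:
            continue  # variant is outside the 95% credible causal set
        key = (row["query_sv"], row["gene_id"])
        prob = float(row["causal_post_prob"])
        if key not in best or prob > best[key][1]:
            best[key] = (row["var_id"], prob)
    return best


best = top_credible_variant(io.StringIO(EXAMPLE))
print(best)
```

In this toy input the query SV itself has the highest posterior probability in its credible set, which is the kind of case where the SV, rather than a linked SNP, is the leading causal candidate.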

Structural variant selection scan results

chr21_pruned_50_Q.matrix - admixture proportion matrix (generated by Ohana; https://github.com/jade-cheng/ohana)

chr21_pruned_50_F.matrix - matrix of inferred ancestral allele frequencies (generated by Ohana)

chr21_pruned_50_C.matrix - matrix of ancestry component covariances (generated by Ohana). Entries of the matrix can be modified to produce "selection hypothesis" matrices where allele frequencies are allowed to vary in one ancestry component (https://github.com/jade-cheng/ohana/wiki/Population-or-ancestry-specific-selection-scan).
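
A rough sketch of what modifying the covariance matrix could look like is given below. It rests on two assumptions that should be checked against the Ohana wiki page linked above: that the matrix file is plain text with a "rows columns" header line followed by whitespace-separated values, and that a selection hypothesis is formed by adding a scalar h to one diagonal entry. The helper names, the example values, and the choice of h are hypothetical, and the mapping from matrix index to ancestry component is left to the Ohana documentation.

```python
def parse_matrix(text):
    """Parse a matrix given as a 'rows cols' header line plus one line per row."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    n_rows, n_cols = (int(x) for x in lines[0].split())
    rows = [[float(x) for x in ln.split()] for ln in lines[1:]]
    if len(rows) != n_rows or any(len(r) != n_cols for r in rows):
        raise ValueError("matrix body does not match its header dimensions")
    return rows


def format_matrix(rows):
    """Serialize a matrix back to the same header-plus-rows text layout."""
    header = f"{len(rows)} {len(rows[0])}"
    body = [" ".join(f"{value:.6f}" for value in row) for row in rows]
    return "\n".join([header] + body) + "\n"


def selection_hypothesis(rows, index, h):
    """Return a copy of the matrix with h added to diagonal entry [index][index].

    Assumption: inflating one diagonal entry is how extra allele frequency
    variance is allowed in a single component; see the Ohana wiki for the
    exact recipe.
    """
    modified = [list(row) for row in rows]
    modified[index][index] += h
    return modified


# Made-up 2 x 2 covariance matrix, not taken from chr21_pruned_50_C.matrix.
EXAMPLE_C = "2 2\n0.10 0.02\n0.02 0.20\n"

genome_wide = parse_matrix(EXAMPLE_C)
alternative = selection_hypothesis(genome_wide, index=1, h=10.0)
print(format_matrix(alternative))
```

The copy-then-modify design keeps the genome-wide matrix intact, so one alternative matrix per tested component can be derived from the same starting point.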

selscan_50_k8_p*.txt.gz - raw output of Ohana selscan (see https://github.com/jade-cheng/ohana)

selscan_res.txt.gz - Ohana selection scan results. These results have been filtered to exclude SVs that have low genotyping rates (<50% of samples), violate Hardy-Weinberg equilibrium expectations (excess of heterozygotes) in more than half of populations, or have extreme global log likelihood estimate (LLE) values. Description of columns:

  • ID: SV ID
  • #CHROM: SV chromosome
  • POS: SV start position
  • SVLEN: SV length (negative for deletions)
  • step: number of steps needed to interpolate between genome-wide and selection hypothesis models
  • lle_ratio: likelihood ratio statistic (LRS) of the genome-wide vs. selection hypothesis model
  • global-lle: log likelihood of the genome-wide model
  • local-lle: log likelihood of the selection hypothesis model
  • f-pop0: inferred allele frequency in ancestry component 0
  • f-pop1: inferred allele frequency in ancestry component 1
  • f-pop2: inferred allele frequency in ancestry component 2
  • f-pop3: inferred allele frequency in ancestry component 3
  • f-pop4: inferred allele frequency in ancestry component 4
  • f-pop5: inferred allele frequency in ancestry component 5
  • f-pop6: inferred allele frequency in ancestry component 6
  • f-pop7: inferred allele frequency in ancestry component 7
  • ancestry_component: ancestry component tested by the selection hypothesis model. Note that we have added 1 to the ancestry component numbers to match the terminology used in the paper (which orders the components from 1-8 rather than 0-7 for interpretability)
  • snp_perc: SV's percentile in the LRS distribution for frequency-matched SNPs
  • p_nominal: nominal p-value calculated from the likelihood ratio
  • p_adj: adjusted p-value calculated from the likelihood ratio
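
Because ancestry_component is 1-based while the f-pop columns are 0-based, the allele frequency in the tested component is found in column f-pop(ancestry_component - 1). The sketch below illustrates that lookup for SVs passing a significance threshold. It assumes selscan_res.txt.gz is tab-delimited with a header row; the rows, the 0.05 threshold, and the helper name significant_hits are hypothetical.

```python
import csv
import io

FREQ_COLS = [f"f-pop{i}" for i in range(8)]
HEADER = ["ID", "#CHROM", "POS", "SVLEN", "step", "lle_ratio",
          "global-lle", "local-lle", *FREQ_COLS,
          "ancestry_component", "snp_perc", "p_nominal", "p_adj"]

# Made-up rows for illustration; values are not taken from the dataset.
ROWS = [
    ["sv1", "chr21", "1000", "-50", "10", "30.0", "-500.0", "-485.0",
     "0.1", "0.1", "0.9", "0.1", "0.1", "0.1", "0.1", "0.1",
     "3", "0.999", "1e-7", "1e-4"],
    ["sv2", "chr21", "2000", "80", "10", "1.0", "-400.0", "-399.5",
     "0.2", "0.2", "0.2", "0.2", "0.2", "0.2", "0.2", "0.2",
     "1", "0.5", "0.3", "0.9"],
]
EXAMPLE = "\n".join("\t".join(r) for r in [HEADER] + ROWS) + "\n"


def significant_hits(handle, alpha=0.05):
    """Return (ID, component, frequency in that component) for p_adj < alpha."""
    hits = []
    for row in csv.DictReader(handle, delimiter="\t"):
        if float(row["p_adj"]) >= alpha:
            continue
        component = int(row["ancestry_component"])  # 1-based, as in the paper
        # Subtract 1 to recover the 0-based index used by the f-pop columns.
        freq = float(row[f"f-pop{component - 1}"])
        hits.append((row["ID"], component, freq))
    return hits


hits = significant_hits(io.StringIO(EXAMPLE))
print(hits)
```

For the real file, the handle could be opened with gzip.open(path, "rt", newline="") and passed to the same function.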


Files (4.4 GB)

  • md5:dbdc48322243cd7ce0c19455e333c3c5, 182.4 MB
  • md5:da79631539c3d8f1a11a33ca9e973d67, 1.1 kB
  • md5:58c6f5d9836fb7deb4482c76cd91e9c4, 19.6 MB
  • md5:0e5b1016921cf5264c82d15acb2dc84a, 460.7 kB
  • md5:8664874969222830a3932ad12fe82a90, 2.7 MB
  • md5:8320c99df8a6365a84b9da3ff64d40b9, 28.7 MB
  • md5:4e4c00ace4f483cf7c8d5c9ebad0934d, 28.7 MB
  • md5:fb8acea93f726ab5525c0f49c2e6b2bd, 28.7 MB
  • md5:378e4a23670d5a21209e95242d6b1984, 28.7 MB
  • md5:6438a41d3a14a97584b66bd8db124c82, 28.7 MB
  • md5:9c6f1ec744a6d53ecc57009c4f01249e, 28.7 MB
  • md5:b09f36c8f53609faeacab7e3d175229e, 28.7 MB
  • md5:b4a8bb4c70207fc084cd776b2badc084, 28.7 MB
  • md5:45296d19e23f8c7ab783275c0d1cd877, 37.3 MB
  • md5:343911c5b2fa92cbba10397545c06078, 3.3 GB
  • md5:ae612f5fd83cf0e521b1cf60757cf11a, 679.7 MB