There is a newer version of this record available.

Dataset Open Access

GWAS summary statistics imputation support data and integration with PrediXcan MASHR

Barbeira, Alvaro Numa; Im, Hae Kyung

Research group(s)
The GTEx Consortium

# GWAS summary statistics imputation, integration with PrediXcan MASHR-M

 

The file `sample_data.tar` contains all the files needed to impute GWAS summary statistics to the GTEx v8 QTL data set.

It includes genotypes of 1000 Genomes individuals as a reference panel.

Once extracted, the `.tar` archive contains the following:

```
data/
├── eur_ld.bed.gz
├── gtex_v8_eur_filtered_maf0.01_monoallelic_variants.txt.gz
├── coordinate_map
├── gwas
├── liftover
├── models
│   ├── eqtl
│   │   └── mashr
│   └── sqtl
│       └── mashr
└── reference_panel_1000G
```
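A minimal sketch of unpacking the archive from Python, assuming `sample_data.tar` has already been downloaded locally. The function name `extract_archive` and the destination directory are illustrative, not part of any MetaXcan tool; running `tar -xf sample_data.tar` on the command line achieves the same result.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir="."):
    """Extract a tar archive and return the entry names found under <dest_dir>/data."""
    dest = Path(dest_dir).resolve()
    with tarfile.open(archive_path, "r") as tar:
        # Refuse members that would be written outside the destination directory.
        for member in tar.getmembers():
            target = (dest / member.name).resolve()
            if target != dest and dest not in target.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
        tar.extractall(dest)
    return sorted(p.name for p in (dest / "data").iterdir())


# Example (hypothetical local path; the archive is about 14 GB):
# print(extract_archive("sample_data.tar", "."))
```

The returned list should match the top-level entries of the `data/` tree shown above.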

 

`data/eur_ld.bed.gz` contains definitions of approximately independent LD regions in hg38 (Berisa-Pickrell regions, lifted over).
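As an illustration of how such region definitions might be used, the sketch below loads a gzip-compressed BED-like file and looks up the region containing a given position. The column layout (chromosome, start, end, whitespace-separated, with an optional header row) and the half-open `[start, end)` convention are assumptions based on the usual BED format, not something confirmed for this file; the function names are hypothetical.

```python
import gzip


def read_ld_regions(path):
    """Read (chrom, start, end) tuples from a gzip-compressed BED-like file."""
    regions = []
    with gzip.open(path, "rt") as handle:
        for line in handle:
            fields = line.split()
            # Skip blank or short lines and a possible header row (non-numeric start).
            if len(fields) < 3 or not fields[1].isdigit():
                continue
            regions.append((fields[0], int(fields[1]), int(fields[2])))
    return regions


def find_region(regions, chrom, position):
    """Return the first region containing position, treating intervals as [start, end)."""
    for region_chrom, start, end in regions:
        if region_chrom == chrom and start <= position < end:
            return (region_chrom, start, end)
    return None


# Example (hypothetical path and position):
# regions = read_ld_regions("data/eur_ld.bed.gz")
# print(find_region(regions, "chr1", 1_000_000))
```

Inspect the first few lines of the actual file before relying on these assumptions.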

`data/gtex_v8_eur_filtered_maf0.01_monoallelic_variants.txt.gz` is a SNP annotation file listing all GTEx v8 variants with MAF > 0.01 in Europeans.
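Because the column names of the annotation file are not listed here, a cautious first step is to look at its header and first few rows without loading the whole file. The helper below is a hypothetical sketch that assumes a tab-delimited text file whose first line is a header; adjust `delimiter` if the file turns out to be laid out differently.

```python
import gzip
from itertools import islice


def peek_table(path, n_rows=5, delimiter="\t"):
    """Return (header, first n_rows rows) of a gzip-compressed delimited text file."""
    with gzip.open(path, "rt") as handle:
        header = handle.readline().rstrip("\n").split(delimiter)
        rows = [line.rstrip("\n").split(delimiter) for line in islice(handle, n_rows)]
    return header, rows


# Example (path taken from the archive layout):
# header, rows = peek_table(
#     "data/gtex_v8_eur_filtered_maf0.01_monoallelic_variants.txt.gz"
# )
# print(header)
```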

`data/coordinate_map` contains precomputed mapping tables that MetaXcan tools can use to convert GWAS genomic coordinates between genome assemblies.

`data/gwas` contains a sample GWAS file for tutorial purposes (data obtained from Nikpay et al. (Nat Gen 2016), https://www.ncbi.nlm.nih.gov/pubmed/26343387).

`data/liftover` contains liftOver chain files for mapping coordinates between human genome assemblies (used by the full harmonization tools).

`data/models` contains PrediXcan MASHR-M models and the cross-tissue S-MultiXcan LD compilation, for both eQTL and sQTL.

`data/reference_panel_1000G` contains 1000G hg38 genotypes in Parquet format, to be used by the imputation tools.

 

Files (14.0 GB)

| Name | Size | Checksum |
| --- | --- | --- |
| sample_data.tar | 14.0 GB | md5:8b8cdbaf29ee92e16d434bdb177bf3a2 |

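
Given the size of the archive, it is worth checking the download against the listed MD5 checksum before extracting it. The following is a minimal sketch using only the Python standard library; the function names are illustrative, and the file is read in chunks so that the 14 GB archive is never held in memory at once.

```python
import hashlib

# Checksum listed for sample_data.tar in the file table.
EXPECTED_MD5 = "8b8cdbaf29ee92e16d434bdb177bf3a2"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True when the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected


# Example (hypothetical local path):
# print(verify_download("sample_data.tar"))
```
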
  • Barbeira et al. (bioRxiv 2019) "Widespread dose-dependent effects of RNA expression and splicing on complex diseases and traits", doi:10.1101/814350

  • Nikpay et al. (Nat Gen 2016) "A comprehensive 1000 Genomes–based genome-wide association meta-analysis of coronary artery disease", doi:10.1038/ng.3396
