Published December 10, 2019 | Version v2
Dataset Open

GWAS summary statistics imputation support data and integration with PrediXcan MASHR

  • 1. The University of Chicago

Description

# GWAS summary statistics imputation, integration with PrediXcan MASHR-M

 

The file `sample_data.tar` contains all necessary files to perform imputation of GWAS summary statistics to the GTEx v8 QTL data set.

It includes the genotypes of 1000 Genomes individuals as a reference panel.

Once extracted, the `.tar` archive contains the following folder structure:

```

data
|-- coordinate_map
|-- gwas
|-- liftover
|-- models
|   |-- eqtl
|   |   `-- mashr
|   `-- sqtl
|       `-- mashr
|-- reference_panel_1000G
`-- ucsc

```
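The layout above can be checked after extraction with a short script. This is a minimal sketch using only the Python standard library; the helper names `extract_archive` and `missing_folders` are illustrative and not part of any MetaXcan tool, and the list of expected folders is copied from the tree above.

```python
import tarfile
from pathlib import Path

# Folders expected under `data/`, copied from the tree shown above.
EXPECTED_FOLDERS = [
    "coordinate_map",
    "gwas",
    "liftover",
    "models/eqtl/mashr",
    "models/sqtl/mashr",
    "reference_panel_1000G",
    "ucsc",
]


def extract_archive(archive_path, dest_dir):
    """Extract a tar archive into dest_dir (created if it does not exist)."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path) as tar:
        tar.extractall(dest)
    return dest


def missing_folders(data_root):
    """Return the expected sub-folders that are absent under data_root."""
    root = Path(data_root)
    return [rel for rel in EXPECTED_FOLDERS if not (root / rel).is_dir()]
```

For example, `extract_archive("sample_data.tar", ".")` followed by `missing_folders("data")` should return an empty list if the extraction completed and the archive matches the tree above.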

 

`data/eur_ld.bed.gz` contains definitions of approximately independent LD regions in hg38 (Berisa-Pickrell regions, lifted over).
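As an illustration of how such region definitions might be used, the sketch below looks up the LD region that contains a given variant position. It assumes a whitespace-separated BED-like layout (chromosome, start, end), half-open intervals, non-overlapping regions, and an optional header row; none of these details are stated above, so check them against the actual file. The function names are hypothetical.

```python
import bisect
import gzip
from collections import defaultdict


def parse_bed_lines(lines):
    """Parse BED-like lines into {chromosome: sorted list of (start, end)}."""
    regions = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 3 or not fields[1].isdigit():
            continue  # skip blank lines and a possible header row
        regions[fields[0]].append((int(fields[1]), int(fields[2])))
    for chrom in regions:
        regions[chrom].sort()
    return dict(regions)


def load_regions(path):
    """Read a gzip-compressed BED-like file (assumed layout, see above)."""
    with gzip.open(path, "rt") as handle:
        return parse_bed_lines(handle)


def find_region(regions, chrom, pos):
    """Return the (start, end) region with start <= pos < end, or None."""
    intervals = regions.get(chrom, [])
    # Index of the last interval whose start is <= pos.
    i = bisect.bisect_right(intervals, (pos, float("inf"))) - 1
    if i >= 0 and intervals[i][0] <= pos < intervals[i][1]:
        return intervals[i]
    return None
```

With the real file, `regions = load_regions("data/eur_ld.bed.gz")` would build the index once, after which each `find_region` call is a binary search within one chromosome.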

`data/gtex_v8_eur_filtered_maf0.01_monoallelic_variants.txt.gz` is a SNP annotation file listing all GTEx v8 variants with MAF > 0.01 in Europeans.

`data/coordinate_map` contains precomputed mapping tables that MetaXcan tools can use to convert a GWAS's genomic coordinates between genome assemblies.

`data/gwas` contains a sample GWAS file for tutorial purposes (data obtained from Nikpay et al. (Nat Gen 2016), https://www.ncbi.nlm.nih.gov/pubmed/26343387).

`data/liftover` contains liftOver chain files to map coordinates between human genome assemblies (used by the full harmonization tools).

`data/models` contains PrediXcan MASHR-M models and the cross-tissue S-MultiXcan LD compilation, from eQTL and sQTL.

`data/reference_panel_1000G` contains 1000G hg38 genotypes in Parquet format, to be used by the imputation tools.

`data/ucsc` contains the genomic coordinates of rsIDs in hg17, hg18 and hg19. You can use these to add chromosome and start position information to a GWAS based on its rsIDs. The `end` column is not used.
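The rsID-based annotation described above amounts to a lookup join. The following is a minimal sketch on in-memory records, not the MetaXcan implementation; the field names `rsid`, `chromosome` and `start`, and the function name `annotate_by_rsid`, are assumptions chosen for illustration and may not match the column names in the actual files.

```python
def annotate_by_rsid(gwas_rows, rsid_coords):
    """Add chromosome and start position to GWAS rows using their rsIDs.

    gwas_rows:   list of dicts, each with an "rsid" key (assumed field name).
    rsid_coords: dict mapping rsid -> (chromosome, start).
    Rows whose rsID has no known coordinates are dropped.
    """
    annotated = []
    for row in gwas_rows:
        coords = rsid_coords.get(row["rsid"])
        if coords is None:
            continue  # no coordinates available for this rsID
        chromosome, start = coords
        annotated.append({**row, "chromosome": chromosome, "start": start})
    return annotated


# Toy records, made up for illustration only.
toy_gwas = [{"rsid": "rsA", "beta": 0.1}, {"rsid": "rsB", "beta": -0.2}]
toy_coords = {"rsA": ("chr1", 1000)}
toy_result = annotate_by_rsid(toy_gwas, toy_coords)
```

Dropping unmatched rows is one possible design choice; keeping them with empty coordinates would be equally reasonable, depending on what the downstream harmonization step expects.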

 

Files

Files (16.7 GB)

md5:8b42c388953d016e1112051d3b6140ed (16.7 GB)
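Given the size of the download, it may be worth verifying the file against the MD5 checksum listed above before extracting it. The sketch below streams the file in chunks so it does not need to fit in memory. It assumes the checksum refers to `sample_data.tar`; the function name `md5_of_file` is illustrative.

```python
import hashlib

# Checksum as listed on this record.
EXPECTED_MD5 = "8b42c388953d016e1112051d3b6140ed"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A download would then be accepted only if `md5_of_file("sample_data.tar") == EXPECTED_MD5`.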

Additional details

References

  • Nikpay et al. (Nat Gen 2016) "A comprehensive 1000 Genomes–based genome-wide association meta-analysis of coronary artery disease" doi:10.1038/ng.3396
  • Barbeira et al. (bioRxiv 2019) "Widespread dose-dependent effects of RNA expression and splicing on complex diseases and traits" doi:10.1101/814350