# GWAS summary statistics imputation, integration with PrediXcan MASHR-M
The file `sample_data.tar` contains all necessary files to perform imputation of GWAS summary statistics to the GTEx v8 QTL data set.
It includes 1000 Genomes individuals' genotypes as reference panel.
Once extracted, the `.tar` archive contains the following:
│   ├── eqtl
│   │   └── mashr
│   └── sqtl
│       └── mashr
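To inspect the layout yourself, the archive can be unpacked with Python's standard `tarfile` module. This is a minimal sketch: the helper name `extract_archive` is hypothetical, and it assumes `sample_data.tar` has already been downloaded to the current directory.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir="."):
    """Extract a tar archive into dest_dir and return its sorted member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # The default mode "r" detects compression (if any) automatically.
    with tarfile.open(archive_path) as tar:
        names = sorted(tar.getnames())
        tar.extractall(dest)
    return names


if __name__ == "__main__":
    # Assumption: the archive sits in the current working directory.
    archive = Path("sample_data.tar")
    if archive.exists():
        for name in extract_archive(archive)[:20]:
            print(name)
```

Printing only the first few member names keeps the output readable; the full list is available from the returned value.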
`data/eur_ld.bed.gz` contains definitions of approximately independent LD regions in hg38 (Berisa-Pickrell regions, lifted over).
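As an illustration of how such a region file might be used, the sketch below loads the intervals and looks up the region containing a given position. It assumes the first three whitespace-separated columns are chromosome, start and end (the usual BED convention); the exact header and chromosome naming in `eur_ld.bed.gz` are not specified here, so check the file before relying on this. The function names are hypothetical.

```python
import gzip
from collections import defaultdict


def load_ld_regions(path):
    """Return {chromosome: [(start, end), ...]} from a gzipped BED-like file."""
    regions = defaultdict(list)
    with gzip.open(path, "rt") as handle:
        for line in handle:
            fields = line.split()
            # Skip blank lines and any header row (non-numeric coordinates).
            if len(fields) < 3:
                continue
            if not (fields[1].isdigit() and fields[2].isdigit()):
                continue
            regions[fields[0]].append((int(fields[1]), int(fields[2])))
    for chrom in regions:
        regions[chrom].sort()
    return dict(regions)


def find_region(regions, chrom, position):
    """Return the (start, end) interval containing position, or None."""
    for start, end in regions.get(chrom, []):
        # Assumption: intervals are half-open, as in standard BED files.
        if start <= position < end:
            return (start, end)
    return None
```

A linear scan is adequate for a per-chromosome list of LD blocks; for many lookups, a binary search over the sorted starts would be the natural refinement.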
`data/gtex_v8_eur_filtered_maf0.01_monoallelic_variants.txt.gz` is a SNP annotation file listing all GTEx v8 variants with MAF > 0.01 in Europeans.
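The column layout of the annotation file is not described here, so a reasonable first step is to look at its header and first few rows without loading the whole file. The sketch below does that with the standard library; the helper name `peek_table` is hypothetical, and the tab delimiter is an assumption that may need adjusting.

```python
import gzip
from itertools import islice


def peek_table(path, n_rows=5, sep="\t"):
    """Return (header, rows) for the first n_rows of a gzipped delimited file."""
    with gzip.open(path, "rt") as handle:
        header = handle.readline().rstrip("\n").split(sep)
        # islice reads only n_rows lines, so large files are not loaded fully.
        rows = [line.rstrip("\n").split(sep) for line in islice(handle, n_rows)]
    return header, rows
```

For example, `peek_table("data/gtex_v8_eur_filtered_maf0.01_monoallelic_variants.txt.gz")` would return the column names and the first five variants, assuming the archive was extracted into the current directory.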
`data/coordinate_map` contains precomputed mapping tables that MetaXcan tools can use to convert a GWAS's genomic coordinates between genome assemblies.
`data/gwas` contains a sample GWAS file for tutorial purposes (data obtained from Nikpay et al (Nat Gen 2016), https://www.ncbi.nlm.nih.gov/pubmed/26343387).
`data/liftover` contains liftOver chain files for mapping coordinates between human genome assemblies (used by the full harmonization tools).
`data/models` contains PrediXcan MASHR-M models and the cross-tissue S-MultiXcan LD compilation, for both eQTL and sQTL.
`data/reference_panel_1000G` contains 1000G hg38 genotypes, in parquet format, to be used by imputation tools.
Nikpay et al (Nat Gen 2016) "A comprehensive 1000 Genomes–based genome-wide association meta-analysis of coronary artery disease", doi: 10.1038/ng.3396

Barbeira et al (bioRxiv 2019) "Widespread dose-dependent effects of RNA expression and splicing on complex diseases and traits", doi: 10.1101/814350