Datasets used in the benchmarking study of MR methods

Xianghong, Hu (Data manager)

doi:10.5281/zenodo.13832454

Published January 4, 2024 | Version v2

Dataset Open

We conducted a benchmarking analysis of 16 summary-level data-based MR methods for causal inference with five real-world genetic datasets, focusing on three key aspects: type I error control, the accuracy of causal effect estimates, and replicability and power.

The datasets used in the MR benchmarking study can be downloaded here:

"dataset-GWASATLAS-negativecontrol.zip": the GWASATLAS dataset for evaluation of type I error control in confounding scenario (a): Population stratification;
"dataset-NealeLab-negativecontrol.zip": the Neale Lab dataset for evaluation of type I error control in confounding scenario (a): Population stratification;
"dataset-PanUKBB-negativecontrol.zip": the Pan UKBB dataset for evaluation of type I error control in confounding scenario (a): Population stratification;
"dataset-Pleiotropy-negativecontrol.zip": the dataset used for evaluation of type I error control in confounding scenario (b): Pleiotropy;
"dataset-familylevelconf-negativecontrol.zip": the dataset used for evaluation of type I error control in confounding scenario (c): Family-level confounders;
"dataset_ukb-ukb.zip": the dataset used for evaluation of the accuracy of causal effect estimates;
"dataset-LDL-CAD.zip": the dataset used for evaluation of replicability and power.

Each of the datasets contains the following files:

"Tested Trait pairs": the exposure-outcome trait pairs to be analyzed;
"MRdat": the summary statistics after IV selection (p-value < 5e-05) and PLINK LD clumping with a clumping window of 1000 kb and an r^2 threshold of 0.001;
"bg_paras": the estimated background parameters "Omega" and "C", which are used for MR estimation in MR-APSS.
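
To make the IV selection and clumping settings above concrete, the following is a minimal sketch that assembles a PLINK 1.9 clumping command using the stated thresholds (p-value 5e-05, r^2 0.001, 1000 kb window). It is written in Python for illustration only; the study's own code is in R, and the exact invocation the authors used is not given here. The reference panel prefix, summary statistics file name, and output prefix are hypothetical placeholders.

```python
import shlex


def build_plink_clump_command(bfile, sumstats, out_prefix,
                              p_threshold=5e-5, r2=0.001, window_kb=1000):
    """Assemble a PLINK 1.9 LD clumping command as an argument list.

    The default thresholds mirror the values stated in the description.
    All file names passed in are supplied by the caller; nothing is executed.
    """
    return [
        "plink",
        "--bfile", bfile,                  # LD reference panel (bed/bim/fam prefix)
        "--clump", sumstats,               # summary statistics file
        "--clump-p1", str(p_threshold),    # index-variant p-value threshold
        "--clump-r2", str(r2),             # LD r^2 threshold
        "--clump-kb", str(window_kb),      # clumping window in kb
        "--out", out_prefix,
    ]


# Hypothetical file names, used only to show the resulting command line.
command = build_plink_clump_command(
    bfile="reference_panel",
    sumstats="exposure_sumstats.txt",
    out_prefix="exposure_clumped",
)
print(shlex.join(command))
```

By default PLINK reads the variant identifier and p-value from columns named SNP and P, so a summary statistics file with other column names would need renaming or the corresponding PLINK options. The MR-APSS tutorial remains the reference for the procedure actually used.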

Note:

The formatted datasets after quality control are available in our GitHub repository (https://github.com/YangLabHKUST/MRbenchmarking).
Details on quality control of GWAS summary statistics, formatting of GWASs, and LD clumping for IV selection can be found in the MR-APSS software tutorial (https://github.com/YangLabHKUST/MR-APSS).
R code for running the MR methods is also available at https://github.com/YangLabHKUST/MRbenchmarking.

Files

Files (56.4 MB)

Name	MD5	Size
dataset-familylevelconf-negativecontrol.zip	9d685fc35228d227dcf25c3ea250f240	1.9 MB
dataset-GWASATLAS-negativecontrol.zip	7e9dbb36101995dc4a198870bc4e8811	27.2 MB
dataset-LDL-CAD.zip	97304fc02abd1ec0e3a7b14a40663c70	125.8 kB
dataset-NealeLab-negativecontrol.zip	f592479ff87f671f58ffd8fd74eab4d5	18.9 MB
dataset-PanUKBB-negativecontrol.zip	515bfbcb8e0ca1f1310b6b912abc1116	7.3 MB
dataset-Pleiotropy-negativecontrol.zip	29e414a0a5e010bf73f5b340975358f2	843.9 kB
dataset_ukb-ukb.zip	312b6362a57c1755ff78c113e600326b	69.6 kB
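
The MD5 checksums in the file listing above can be used to confirm that downloads are intact. The following is a minimal sketch in Python, assuming the archives were saved under their listed names in a single directory; the function names and the directory name are illustrative and not part of the record.

```python
import hashlib
from pathlib import Path

# MD5 checksums copied from the file listing above.
EXPECTED_MD5 = {
    "dataset-familylevelconf-negativecontrol.zip": "9d685fc35228d227dcf25c3ea250f240",
    "dataset-GWASATLAS-negativecontrol.zip": "7e9dbb36101995dc4a198870bc4e8811",
    "dataset-LDL-CAD.zip": "97304fc02abd1ec0e3a7b14a40663c70",
    "dataset-NealeLab-negativecontrol.zip": "f592479ff87f671f58ffd8fd74eab4d5",
    "dataset-PanUKBB-negativecontrol.zip": "515bfbcb8e0ca1f1310b6b912abc1116",
    "dataset-Pleiotropy-negativecontrol.zip": "29e414a0a5e010bf73f5b340975358f2",
    "dataset_ukb-ukb.zip": "312b6362a57c1755ff78c113e600326b",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory, expected=EXPECTED_MD5):
    """Map each expected file name to 'ok', 'mismatch', or 'missing'."""
    results = {}
    for name, checksum in expected.items():
        path = Path(directory) / name
        if not path.is_file():
            results[name] = "missing"
        elif md5_of_file(path) == checksum:
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results


if __name__ == "__main__":
    # "downloads" is a hypothetical directory holding the archives.
    for name, status in verify_downloads("downloads").items():
        print(f"{status:9s} {name}")
```

A "mismatch" status usually indicates an incomplete or corrupted download, in which case the file should be fetched again.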

Additional details

DOI: 10.1016/j.ajhg.2024.06.016

Is new version of: Dataset: https://zenodo.org/records/10929572 (Other)

Available: 2024-08-08

Repository URL: https://github.com/YangLabHKUST/MRbenchmarking
Programming language: R

Xianghong Hu, Mingxuan Cai, Jiashun Xiao, Xiaomeng Wan, Zhiwei Wang, Hongyu Zhao, Can Yang. Benchmarking Mendelian randomization methods for causal inference using genome-wide association study summary statistics. The American Journal of Human Genetics, Volume 111, Issue 8, 1717-1735 (2024). medRxiv preprint: https://medrxiv.org/cgi/content/short/2024.01.03.24300765v1
