Integrated methylome and phenome study of the circulating proteome reveals markers pertinent to brain health
Creators
- Danni A Gadd
- Robert F Hillary
- Daniel L McCartney
- Liu Shi
- Aleks Stolicyn
- Neil Robertson
- Rosie M Walker
- Robert I McGeachan
- Archie Campbell
- Shen Xueyi
- Miruna C Barbu
- Claire Green
- Stewart W Morris
- Mathew A Harris
- Ellen V Backhouse
- Joanna M Wardlaw
- J Douglas Steele
- Diego A Oyarzún
- Graciela Muniz-Terrera
- Craig Ritchie
- Alejo Nevado-Holgado
- Tamir Chandra
- Caroline Hayward
- Kathryn L Evans
- David J Porteous
- Simon R Cox
- Heather C Whalley
- Andrew M McIntosh
- Riccardo E Marioni
Description
This repository houses fully-adjusted methylome-wide association study (MWAS) summary statistics for 4,231 SomaScan protein measurements. These were generated as part of the study titled ‘Integrated methylome and phenome study of the circulating proteome reveals markers pertinent to brain health’ by Gadd et al. The Stratifying Resilience and Depression Longitudinally (STRADL) cohort used in this study is a subset of individuals from Generation Scotland: The Scottish Family Health Study. Complete protein measurements and DNA methylation measurements at 772,619 CpG probes were available for 744 individuals. Each MWAS was performed with protein residuals as the outcome and DNA methylation as the exposure, using the Omics-data-based complex trait analysis (OSCA) software.
Fully-adjusted models were run using M-values that were adjusted for age, sex, DNA methylation-derived immune cell estimates, depression status, DNA methylation batch and set, body mass index and a DNA methylation-derived smoking score. Protein levels were rank-based inverse normalised and scaled to have a mean of 0 and a standard deviation of 1. Protein levels were residualised for age, sex, available pQTLs, technical covariates and 20 genetic principal components.
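As an illustration of the protein-level transformation described above, the following is a minimal sketch of a rank-based inverse normal transformation followed by scaling to mean 0 and standard deviation 1. It is not the study's code: the rank offset of 3/8 (the Blom offset), the averaging of tied ranks, and the function names `average_ranks`, `rank_inverse_normal` and `scale` are all assumptions made for this example, since the description does not state which variant was used.

```python
from statistics import NormalDist, mean, stdev


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def rank_inverse_normal(values, offset=3 / 8):
    """Map ranks to standard normal quantiles.

    The offset of 3/8 is an assumption (Blom); other offsets are also in use.
    """
    n = len(values)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [
        std_normal.inv_cdf((rank - offset) / (n - 2 * offset + 1))
        for rank in average_ranks(values)
    ]


def scale(values):
    """Centre to mean 0 and scale to sample standard deviation 1."""
    centre = mean(values)
    spread = stdev(values)
    return [(value - centre) / spread for value in values]


# Hypothetical protein levels for five individuals (two are tied).
levels = [5.1, 2.3, 9.8, 2.3, 7.4]
transformed = scale(rank_inverse_normal(levels))
```

The transformation keeps the ordering of the original values, so tied inputs receive identical transformed values.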
Four of the 4,235 protein MWAS models did not converge (15509-2 - NAGLU, 15584-9 - CFHR2, 4407-10 - MST1 and 6402-8 - PILRA). Therefore, summary statistics are provided for 4,231 protein levels.
Each protein MWAS summary statistics file has been saved with the following naming system: "MWAS_SeqId_Protein_gene.csv", where "SeqId" is the SomaScan SeqId and "Protein_gene" is the gene name of the protein. For example, the protein with gene name CRYBB2 and SeqId 10000-28 has the following file name: "MWAS_10000-28_CRYBB2.csv".
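The naming system above can be handled programmatically. The sketch below builds and parses such file names; the helper names `mwas_filename` and `parse_mwas_filename` are hypothetical, and the pattern assumes that every SeqId consists of two integers joined by a hyphen, as in the examples given in this description.

```python
import re

# Assumed pattern: "MWAS_", a SeqId such as 10000-28, "_", a gene name, ".csv".
MWAS_NAME = re.compile(r"^MWAS_(?P<seq_id>\d+-\d+)_(?P<gene>.+)\.csv$")


def mwas_filename(seq_id, gene):
    """Build the summary statistics file name for a SeqId and gene name."""
    return f"MWAS_{seq_id}_{gene}.csv"


def parse_mwas_filename(name):
    """Split a summary statistics file name into (SeqId, gene name)."""
    match = MWAS_NAME.match(name)
    if match is None:
        raise ValueError(f"not an MWAS summary statistics file name: {name!r}")
    return match["seq_id"], match["gene"]


example_name = mwas_filename("10000-28", "CRYBB2")
example_parts = parse_mwas_filename(example_name)
```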
The SeqIds, UniProt codes, gene names and full UniProt names can be found in "annotation_formatted_for_paper.csv" and the full summary statistics are found within "compressed-protein-ewas.tar.gz".
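Because the archive is large, it may be convenient to pull out the file for a single protein rather than unpack everything. The following is a minimal sketch using Python's standard `tarfile` module. The internal directory layout of "compressed-protein-ewas.tar.gz" is not documented here, so the sketch matches members by base name only; the function name `extract_member_by_name` is hypothetical.

```python
import shutil
import tarfile
from pathlib import Path, PurePosixPath


def extract_member_by_name(archive_path, file_name, dest_dir):
    """Copy the first archive member whose base name equals file_name.

    Returns the path of the extracted copy, or None if no member matches.
    The archive is read sequentially, so a member near the end of a large
    archive can take a long time to reach.
    """
    with tarfile.open(archive_path, mode="r:gz") as archive:
        for member in archive:
            # Compare base names only, since the folder layout is an unknown.
            if member.isfile() and PurePosixPath(member.name).name == file_name:
                target = Path(dest_dir) / file_name
                with archive.extractfile(member) as source, open(target, "wb") as out:
                    shutil.copyfileobj(source, out)
                return target
    return None
```

For example, `extract_member_by_name("compressed-protein-ewas.tar.gz", "MWAS_10000-28_CRYBB2.csv", ".")` would write the CRYBB2 file to the current directory if a member with that name is present.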
Please contact either riccardo.marioni@ed.ac.uk or danni.gadd@ed.ac.uk with any queries. All code is available at the following GitHub repository: https://github.com/DanniGadd/Epigenome-and-phenome-wide-study-of-brain-health-outcomes.
Files (47.7 GB)
Name | Size | Checksum |
---|---|---|
annotation_formatted_for_paper.csv | 234.5 kB | md5:7830626e2c87ed783332d19f0ff1a67b |
compressed-protein-ewas.tar.gz | 47.7 GB | md5:af21d893f81af09f76b31149b91ddfcc |
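A download can be checked against the MD5 checksums listed above. The sketch below uses Python's standard `hashlib` module and reads the file in chunks so that the 47.7 GB archive does not need to fit in memory; the function names `md5_of_file` and `matches_listed_md5` are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hexadecimal MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a checksum written as 'md5:<hex digest>'."""
    expected = listed.removeprefix("md5:").strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listed_md5("compressed-protein-ewas.tar.gz", "md5:af21d893f81af09f76b31149b91ddfcc")` should return `True` for an intact copy of the 47.7 GB file.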