Combining genome-wide studies of breast, prostate, ovarian and endometrial cancers maps cross-cancer susceptibility loci and identifies new genetic associations
Description
Data set linked to the paper, "Combining genome-wide studies of breast, prostate, ovarian and endometrial cancers maps cross-cancer susceptibility loci and identifies new genetic associations". A pre-print of the paper is available at https://doi.org/10.1101/2020.06.16.146803.
cross_cancer_sum_stats.txt.gz contains summary genome-wide association statistics for susceptibility to single cancers and from the cross-cancer meta-analyses. The single cancers are breast (BR), prostate (PR), ovarian (OV), endometrial (EN), estrogen receptor (ER)-positive breast (POS), ER-negative breast (NEG) and high-grade serous ovarian (HGS) cancers. The meta-analyses are the main analysis (main) and the subtype-focused analysis (sub).
In the header, EA is the effect allele, OA is the other allele, EAF is the effect allele frequency in the largest of the single cancer data sets (BR), IMPR2 is the imputation quality in the largest of the single cancer data sets (BR), SE is the standard error, PVAL is the P-value, RE2Cs1 is the mean effect part of the RE2C statistic, RE2Cs2 is the heterogeneity part of the RE2C statistic, and RE2Cp* is the RE2C* P-value. More on RE2Cp* can be found here: http://software.buhmhan.com/RE2C/index.php?mid=contact&act=dispBoardWrite and in https://academic.oup.com/bioinformatics/article/33/14/i379/3953957
SNP names in cross_cancer_sum_stats.txt.gz include the chromosome and build 37 position.
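As a rough illustration of how the summary statistics file might be read, the Python sketch below lists the header columns and keeps the rows that fall below a P-value threshold. It assumes the file is whitespace- or tab-delimited with a single header line and no empty fields, none of which is stated in this description. The function names, the threshold default of 5e-8 and the pval_column argument are illustrative. The exact column names should be taken from the file header and are not guessed here.

```python
import gzip


def column_names(path):
    """Return the header fields of a gzipped, whitespace-delimited text file."""
    with gzip.open(path, "rt") as handle:
        return handle.readline().split()


def iter_sum_stats(path):
    """Yield each data row as a dict keyed by the header fields.

    Assumption: one header line, whitespace/tab delimited, no empty fields.
    """
    with gzip.open(path, "rt") as handle:
        header = handle.readline().split()
        for line in handle:
            fields = line.split()
            if len(fields) != len(header):
                continue  # skip blank or malformed lines
            yield dict(zip(header, fields))


def genome_wide_hits(path, pval_column, threshold=5e-8):
    """Collect rows whose P-value in `pval_column` is below `threshold`.

    `pval_column` must be a name that appears in the file header; a
    KeyError is raised otherwise so that a wrong name is not silently ignored.
    """
    hits = []
    for row in iter_sum_stats(path):
        try:
            p_value = float(row[pval_column])
        except ValueError:
            continue  # non-numeric entries such as "NA"
        if p_value < threshold:
            hits.append(row)
    return hits
```

A typical use would be to call column_names("cross_cancer_sum_stats.txt.gz") first, pick the P-value column of interest from the returned list, and pass that name to genome_wide_hits. Reading line by line keeps memory use low for a genome-wide file, at the cost of speed compared with a data-frame library.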
main_tetrachoric_corr_matrix.txt and subtype_tetrachoric_corr_matrix.txt provide the tetrachoric correlation matrices used in the main and subtype-focused meta-analyses. These were also used to specify the cryptic.cor argument of the exh.abf function of MetABF. More on MetABF can be found here: https://github.com/trochet/metabf and in https://onlinelibrary.wiley.com/doi/abs/10.1002/gepi.22202
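Before passing a correlation matrix to another tool, a basic sanity check can be useful. The Python sketch below assumes the matrix file holds only whitespace-delimited numeric rows; if the files carry row or column labels, those would need to be stripped first. The function names and the tolerance are illustrative and are not part of MetABF.

```python
def parse_matrix(text):
    """Parse whitespace-delimited numeric rows into a list of lists of floats.

    Assumption: the text contains numbers only (no row or column labels).
    """
    return [
        [float(value) for value in line.split()]
        for line in text.splitlines()
        if line.strip()
    ]


def check_correlation_matrix(matrix, tol=1e-8):
    """Return a list of problems found; an empty list means the checks passed.

    Checks that the matrix is square, has ones on the diagonal, is symmetric,
    and has off-diagonal entries within [-1, 1].
    """
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        return ["matrix is not square"]
    problems = []
    for i in range(n):
        if abs(matrix[i][i] - 1.0) > tol:
            problems.append(f"diagonal entry ({i}, {i}) is not 1")
        for j in range(i + 1, n):
            if abs(matrix[i][j] - matrix[j][i]) > tol:
                problems.append(f"entries ({i}, {j}) and ({j}, {i}) differ")
            if not -1.0 <= matrix[i][j] <= 1.0:
                problems.append(f"entry ({i}, {j}) is outside [-1, 1]")
    return problems
```

For example, the contents of main_tetrachoric_corr_matrix.txt could be read as text, passed through parse_matrix, and then through check_correlation_matrix; any returned messages point to entries worth inspecting by hand.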
prior_sigmas_for_metabf.txt contains the values used to specify the prior.sigma argument of the exh.abf function in MetABF.
The breast cancer data used are described in PMID 29059683 and can be downloaded from http://bcac.ccge.medschl.cam.ac.uk/bcacdata/oncoarray/oncoarray-and-combined-summary-result/gwas-summary-results-breast-cancer-risk-2017/ (this link also includes acknowledgements). The prostate cancer data are described in PMID 29892016 and can be downloaded from: http://practical.icr.ac.uk/blog/?page_id=8164 (this link also includes acknowledgements). The ovarian cancer data used are described in PMID 28346442 and can be downloaded from https://www.ebi.ac.uk/gwas/studies/GCST004415. The endometrial cancer data are described in PMID 30093612 and can be downloaded from https://www.ebi.ac.uk/gwas/studies/GCST006464. These links point to the same data that form the basis of the cross_cancer_sum_stats.txt.gz file.
The sample size and precision of the data presented should preclude identification of any individual study participant. However, in downloading these data, you undertake not to attempt to identify individual study participants and not to re-post these data to a third-party website. Please cite the PMIDs highlighted above along with the appropriate acknowledgements if you use the cross_cancer_sum_stats.txt.gz file.
If you have any questions about this repository, please email Siddhartha Kar at siddhartha dot kar at bristol dot ac dot uk