A scalable, accurate, and universal analysis framework using individual-level allele frequency for large-scale genetic association studies in an admixed population
Description
Including individuals with diverse or admixed genetic ancestries is crucial for discovering novel findings that may be missed by genomic analyses restricted to Caucasian populations. Here, we present SPAmix, an analysis framework that scales to large biobank analyses including hundreds of thousands of admixed individuals and applies to many types of complex traits, including binary, quantitative, time-to-event, and longitudinal traits. For each genetic variant, SPAmix uses genotype data and genetic principal components (PCs) to estimate individual-level allele frequencies, which are then used to calibrate p values via a retrospective analysis. A hybrid strategy that includes saddlepoint approximation (SPA) greatly increases accuracy when analyzing rare genetic variants, especially if the phenotypic distribution is unbalanced or extremely unbalanced. Unlike Tractor, SPAmix does not require local ancestry information and is straightforwardly applicable to multi-way admixed populations. SPAmix can also be extended to SPAmixlocal, which incorporates local ancestry when it is available. In addition, we propose SPAmixCCT, which combines the p values of SPAmix and SPAmixlocal via the Cauchy combination test (CCT). SPAmixlocal performs close to Tractor when analyzing quantitative traits and is more accurate when analyzing binary traits with an unbalanced case-control ratio, and SPAmixCCT is an optimal unified approach for various cross-ancestry genetic architectures. Extensive simulation studies and real data analyses of 369,314 UK Biobank individuals from multiple ancestries demonstrated that SPAmix is scalable and can discover novel hits while controlling type I error rates well.
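To make the p-value combination step concrete, the following is a minimal sketch of the standard Cauchy combination test, not the SPAmixCCT implementation itself. The function name `cauchy_combination`, the equal default weights, and the large-statistic cutoff are assumptions made for illustration; the description above does not specify how SPAmixCCT weights the two p values.

```python
import math


def cauchy_combination(p_values, weights=None):
    """Combine p values with the Cauchy combination test.

    Illustrative sketch only: equal weights are assumed when none are given.
    """
    n = len(p_values)
    if n == 0:
        raise ValueError("at least one p value is required")
    if weights is None:
        weights = [1.0 / n] * n  # assumption: equal weights
    if len(weights) != n:
        raise ValueError("weights and p_values must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    if any(not 0.0 < p < 1.0 for p in p_values):
        raise ValueError("p values must lie strictly between 0 and 1")

    # Each p value is mapped to a standard Cauchy variate; the weighted sum
    # is again standard Cauchy under the null hypothesis.
    statistic = sum(
        w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, p_values)
    )

    # For very large statistics, use the Cauchy tail approximation to avoid
    # losing precision in 0.5 - atan(statistic) / pi.
    if statistic > 1e15:
        return 1.0 / (statistic * math.pi)
    return 0.5 - math.atan(statistic) / math.pi
```

Under these assumptions, a call such as `cauchy_combination([p_spamix, p_spamixlocal])`, where both argument names are hypothetical, would return a single combined p value for a variant.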
Files
Longitudinal_Beta_G_SPAmix_loci_allTraits.csv
Total size: 21.6 GB

Checksum | Size |
---|---|
md5:98f148356f80058a6d704323969bdd32 | 1.0 GB |
md5:9e20041b511fe9148af1e5fed93c2527 | 1.0 GB |
md5:306da23a66a71f5bacdddbe0a288ea7b | 1.0 GB |
md5:72345a4f430c75b6bf6a81ef6bd16acb | 1.0 GB |
md5:b107eeeb9da233dd37b323d39ed685d8 | 1.0 GB |
md5:721e942b895da735db2c0ffec9262c82 | 1.0 GB |
md5:03603c22fb83166e4255e3d1bfbb6125 | 1.0 GB |
md5:39eca1315fa4a1ebdd4864064fa546d5 | 1.0 GB |
md5:f8b6a3a3803eee059effda0c2396d99c | 1.0 GB |
md5:133008987cb9b8131091f56322c1f772 | 99.6 kB |
md5:b8f6fb67c580a6870b3b7b661d330f6a | 1.0 GB |
md5:6e83af4b8cb0833f20862c9bbc1019d3 | 1.0 GB |
md5:ee3d128159842040ba4968f5a1a3b480 | 1.0 GB |
md5:c6925fb9e7548585aaa64babb6b181d1 | 1.0 GB |
md5:d24529df3549150a73d387834ad4436d | 1.0 GB |
md5:16928a8a1accff26fe3e82b75e74e827 | 1.0 GB |
md5:3f4691517b05648b96a642dde567ee14 | 1.0 GB |
md5:f7b2f2b369a45da4b73bc6d41814693c | 1.0 GB |
md5:1ecb9651eccdded4e803efb15d133533 | 1.0 GB |
md5:727ece5d379c9215a4e60eeca016eef3 | 1.0 GB |
md5:ac75e4a9187a9ce4ca1e9281318d2c50 | 1.0 GB |
md5:8a309676707e73f081fabf073d8cc765 | 28.0 kB |
md5:5c6f037be47e8f6928ba5e9a50f4de45 | 1.0 GB |
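Because each file in the listing carries an md5 checksum, a downloaded copy can be checked against it before analysis. The sketch below is one way to do this with the Python standard library; the helper names `md5_of_file` and `matches_listing` are hypothetical, and the local file path is whatever name the file was saved under.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed_checksum):
    """Compare a local file against a checksum written as 'md5:<hex>'."""
    # Accept the value with or without the 'md5:' prefix used in the listing.
    expected = listed_checksum.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

Reading in chunks keeps memory use flat for the 1.0 GB files; a `False` result from `matches_listing` suggests an incomplete or corrupted download.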