The simulation dataset of GRIPT: a novel case-control analysis method for Mendelian disease gene discovery
Authors/Creators
- 1. Baylor College of Medicine
- 2. Fudan University
Description
To uncompress the data:
tar -xvf Simulation_data.tar.gz
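For scripted workflows, the same extraction can be sketched in Python. This is a minimal sketch, not part of the dataset: the helper name `extract_all` is hypothetical, and it assumes that the outer archive contains the nested `sim_*.tar.gz` archives described below, which the description suggests but does not state explicitly.

```python
import tarfile
from pathlib import Path


def _extract(tar, dest):
    # Use the safer "data" extraction filter when this Python version has it.
    if hasattr(tarfile, "data_filter"):
        tar.extractall(dest, filter="data")
    else:
        tar.extractall(dest)


def extract_all(outer_archive, dest):
    """Extract the outer archive, then every nested sim_*.tar.gz found in it.

    Assumption: nested archives follow the sim_*.tar.gz naming pattern.
    Each one is unpacked into a sibling folder named after the archive.
    """
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(outer_archive, "r:*") as tar:
        _extract(tar, dest)

    extracted = []
    # sorted() materialises the list before new files are written.
    for nested in sorted(dest.rglob("sim_*.tar.gz")):
        target = nested.parent / nested.name[: -len(".tar.gz")]
        target.mkdir(exist_ok=True)
        with tarfile.open(nested, "r:*") as tar:
            _extract(tar, target)
        extracted.append(target)
    return extracted


# Example call (paths are placeholders):
# extract_all("Simulation_data.tar.gz", "Simulation_data")
```

The full dataset is 12.0 GB compressed, so extracting everything at once may need considerably more disk space than the download itself.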
The meaning of the datasets:
In each archive name, X is the maximum population allele frequency cutoff and Y is the sample size used to generate the simulation.
sim_X_Y.tar.gz is generated based on the average allele frequency in the population.
sim_X_Y_AMR.tar.gz is generated based on the allele frequency in the Latino population.
sim_X_Y_500_Z.tar.gz is a mixture of the Latino population, with a proportion of (500-Z)/500, and the African population, with a proportion of Z/500.
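The naming scheme above can be decoded programmatically. The following is a minimal sketch under stated assumptions: the function and class names are hypothetical, X is kept as a string because its exact format is not specified here, and Y and Z are assumed to be non-negative integers. The file names in the usage comment are invented for illustration.

```python
import re
from typing import NamedTuple, Optional

# Matches sim_X_Y, sim_X_Y_AMR and sim_X_Y_500_Z, each ending in .tar.gz.
_PATTERN = re.compile(
    r"^sim_(?P<cutoff>[^_]+)_(?P<size>\d+)"
    r"(?:_(?P<amr>AMR)|_500_(?P<z>\d+))?\.tar\.gz$"
)


class SimArchive(NamedTuple):
    cutoff: str                        # X, kept verbatim (format assumed unknown)
    sample_size: int                   # Y
    population: str                    # "average", "AMR" or "mixed"
    latino_fraction: Optional[float]   # (500 - Z) / 500, mixed archives only
    african_fraction: Optional[float]  # Z / 500, mixed archives only


def parse_archive_name(name):
    """Decode an archive name according to the naming scheme in the description."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognised archive name: {name!r}")
    cutoff = match.group("cutoff")
    size = int(match.group("size"))
    if match.group("amr"):
        return SimArchive(cutoff, size, "AMR", None, None)
    if match.group("z") is not None:
        z = int(match.group("z"))
        if z > 500:
            raise ValueError(f"Z must be at most 500, got {z}")
        return SimArchive(cutoff, size, "mixed", (500 - z) / 500, z / 500)
    return SimArchive(cutoff, size, "average", None, None)


# Example with an invented name:
# parse_archive_name("sim_0.01_1000_500_100.tar.gz")
```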
The allele frequencies are based on the ExAC database (Lek et al., Nature 2016).
Each folder contains a case folder and a control folder. The case folder contains simulations in which HGMD mutations of the given gene (i.e. RPE65 or TINF2) are spiked into the given percentage of individuals (e.g. 0.5%, 1%, 2%, 3%). The control folder contains simulations with no HGMD mutations spiked in.
Files (12.0 GB)
| Checksum | Size |
|---|---|
| md5:207fca59de5821920c3a2075139ab445 | 12.0 GB |
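After downloading, the file can be checked against the md5 checksum listed above. This is a minimal sketch: the function names are hypothetical, and the file name in the usage comment assumes the download is the `Simulation_data.tar.gz` archive mentioned in the description.

```python
import hashlib

# Checksum as listed in the Files section of this record.
EXPECTED_MD5 = "207fca59de5821920c3a2075139ab445"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat for a multi-gigabyte archive.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_MD5):
    """Return True if the file's md5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


# Example call (assumed file name):
# verify("Simulation_data.tar.gz")
```

A mismatch most likely indicates an incomplete or corrupted download, in which case the file should be fetched again before extraction.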