Published October 30, 2018 | Version v1

The simulation dataset of GRIPT: a novel case-control analysis method for Mendelian disease gene discovery

Description

To uncompress the data:

tar -xvf Simulation_data.tar.gz
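The per-simulation archives described below are themselves .tar.gz files, so a second extraction pass may be needed. The following is a minimal sketch, assuming the outer archive yields files matching sim_*.tar.gz somewhere under the current directory. The function name extract_inner_archives is hypothetical and the actual layout of the download may differ.

```shell
# Sketch: unpack every inner sim_*.tar.gz found under a directory.
# Assumption: the inner archives match the pattern sim_*.tar.gz;
# adjust the pattern if the extracted layout differs.
extract_inner_archives() (
    root=${1:-.}
    # find lists matching archives at any depth below $root
    find "$root" -type f -name 'sim_*.tar.gz' | while IFS= read -r archive; do
        # -C unpacks each archive next to where it was found
        tar -xzf "$archive" -C "$(dirname "$archive")"
    done
)

# Usage (not run here):
#   tar -xvf Simulation_data.tar.gz
#   extract_inner_archives .
```

Each archive is unpacked beside itself so that the folder names stay aligned with the archive names.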

The dataset names encode the simulation parameters:

sim_X_Y.tar.gz means the simulation was generated with a maximum population frequency cutoff of X and a sample size of Y.

sim_X_Y.tar.gz is generated based on the average allele frequency in the population.

sim_X_Y_AMR.tar.gz is generated based on the allele frequency in the Latino population.

sim_X_Y_500_Z.tar.gz is a mixture of the Latino population, at a proportion of (500-Z)/500, and the African population, at a proportion of Z/500.

The allele frequencies are taken from the ExAC database (Lek et al., Nature 2016).
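The naming convention above can be decoded mechanically. The following is a minimal sketch, assuming X and Y contain no underscores so that "_" cleanly separates the fields. The function name describe_sim and its output format are hypothetical, and the values in the usage comment are illustrative rather than taken from the dataset.

```shell
# Sketch: decode a simulation archive name into its parameters.
# Assumption: X and Y contain no "_" characters.
describe_sim() (
    set -f                          # no glob expansion while splitting
    name=$(basename "$1" .tar.gz)
    IFS=_
    set -- $name                    # fields: sim X Y [AMR | 500 Z]
    if [ "$1" != "sim" ] || [ "$#" -lt 3 ]; then
        echo "unrecognised name: $name" >&2
        exit 1
    fi
    case "$#:$4" in
        3:)    pop="average" ;;     # average allele frequency
        4:AMR) pop="AMR" ;;         # Latino allele frequency
        5:500) # Latino/African mixture: (500-Z)/500 and Z/500
               pop=$(awk -v z="$5" 'BEGIN {
                   printf "mixed latino=%.3f african=%.3f", (500 - z) / 500, z / 500
               }') ;;
        *)     echo "unrecognised name: $name" >&2; exit 1 ;;
    esac
    echo "cutoff=$2 sample_size=$3 population=$pop"
)

# Usage with a hypothetical archive name:
#   describe_sim sim_0.001_1000_500_100.tar.gz
```

The function body runs in a subshell, so the temporary changes to IFS and the positional parameters do not leak into the calling shell.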

Each folder contains case and control folders. The case folder contains simulations in which an HGMD mutation of the given gene (i.e. RPE65 or TINF2) is spiked into the given percentage of individuals (e.g. 0.5%, 1%, 2%, 3%). The control folder contains simulations with no HGMD mutation spiked in.
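To get a feel for what the spike-in percentages mean in absolute terms, the arithmetic can be sketched as follows. This assumes the percentage applies to the sample size Y from the archive name, which the description does not state explicitly. The function name spiked_individuals and the sample size in the usage comment are hypothetical.

```shell
# Sketch: expected number of individuals carrying the spiked-in mutation.
# Assumption: the spike-in percentage applies to the sample size Y.
spiked_individuals() {
    # $1 = sample size, $2 = spike-in percentage (e.g. 0.5 for 0.5%)
    awk -v n="$1" -v pct="$2" 'BEGIN { printf "%g\n", n * pct / 100 }'
}

# Usage with a hypothetical sample size of 1000:
#   spiked_individuals 1000 0.5
```

The value is printed without rounding, since the description does not say how fractional counts are handled.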

Files

Files (12.0 GB)

Checksum Size
md5:207fca59de5821920c3a2075139ab445 12.0 GB
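Given the size of the download, it may be worth checking the file against the listed MD5 digest before extracting. The following is a minimal sketch, assuming GNU md5sum is available (macOS ships md5 instead) and that the downloaded file is the Simulation_data.tar.gz named in the extraction command. The function name verify_md5 is hypothetical.

```shell
# Sketch: compare a file's MD5 digest with an expected value.
# Assumption: GNU coreutils md5sum is on the PATH.
verify_md5() (
    file=$1
    expected=$2
    # Reading from stdin makes md5sum print "<digest>  -"; keep the digest only
    actual=$(md5sum < "$file" | awk '{ print $1 }')
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (got $actual)" >&2
        exit 1
    fi
)

# Usage (not run here):
#   verify_md5 Simulation_data.tar.gz 207fca59de5821920c3a2075139ab445
```

A non-zero exit status signals a mismatch, so the check can gate the extraction step in a script.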