SelectSim Dataset
Description
Readme
- This repository contains two supplementary datasets, described below.
Supplementary Data 1
- Data Version: GENIE v15
- Raw Data available at: https://doi.org/10.7303/syn53210170
- There are 19 folders, one for each AACR Project GENIE participating center. Each folder contains .rds (R data) files with the processed GAMs, one file per gene sequencing panel used by that center. Each file holds a list of two GAMs: one built from missense mutations and one built from truncating mutations.
- How to read data in R:
`data <- readRDS('MSK/MSK-IMPACT468.rds')`
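Before using a file, it can help to check what the loaded list actually contains. The sketch below is a minimal example, not part of SelectSim: `summarise_rds_list` is a hypothetical helper name, and it assumes only what is stated above, namely that each file stores a list of matrix-like objects. The element names inside the list are not documented here, so the helper reports whatever names it finds.

```r
# Hypothetical helper (not part of SelectSim): report the name, class and
# dimensions of every element of a list stored in an .rds file.
summarise_rds_list <- function(path) {
  obj <- readRDS(path)
  stopifnot(is.list(obj))

  # Fall back to positional labels if the list is unnamed.
  labels <- if (is.null(names(obj))) as.character(seq_along(obj)) else names(obj)

  # dim() is NULL for plain vectors, so report NA in that case.
  dim_or_na <- function(x, i) if (is.null(dim(x))) NA_integer_ else as.integer(dim(x)[i])

  data.frame(
    element = labels,
    class   = vapply(obj, function(x) class(x)[1], character(1)),
    n_row   = vapply(obj, dim_or_na, integer(1), i = 1),
    n_col   = vapply(obj, dim_or_na, integer(1), i = 2),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Usage: the path is the example given above; it is only read if present.
path <- "MSK/MSK-IMPACT468.rds"
if (file.exists(path)) print(summarise_rds_list(path))
```

Which dimension corresponds to samples and which to genes is not stated in this README, so check `rownames()` and `colnames()` of each GAM before analysis.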
Supplementary Data 2
- There are two folders: pan_cancer, for the joint analysis of all tumor classes, and tumor_class, for the runs on individual tumor classes.
- Both folders contain three subfolders, corresponding to the GENIE Participating Centers MSK and DFCI, and the TCGA cohort (https://www.cancer.gov/ccg/research/genome-sequencing/tcga)
- Each subfolder contains .rds files with processed run_data objects for running SelectSim. Each object includes the missense and truncating mutation GAMs, a tumor mutational burden estimate for each sample, and the sample classes (tumor subtypes). Files placed directly in each of the three subfolders contain GAMs built from all available patients and OncoKB genes (n=396, e.g. "pan_can_dfci_primary_run_data_v15.rds"), whereas files in the nested folders named "gene_panel" contain GAMs restricted to the corresponding gene panel and to the patients sequenced with that panel. Each file name includes labels for the cohort (MSK, DFCI, or TCGA), the tumor class (or the label "pan_can"), the gene sequencing panel (e.g. "p_505"), and the metastatic status of the analyzed patients ("primary" or "meta").
- How to read data in R:
`data <- readRDS('pan_can_tcga_run_data.rds')`
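With many files spread over nested folders, a small index can make it easier to pick the right one. The sketch below is illustrative only: `index_run_data` is a hypothetical helper, and it assumes that file-name labels are separated by underscores and that panel-specific files sit in a folder literally named "gene_panel", as described above. It does not try to parse the cohort, tumor class, or panel labels, since their exact position in the name is not specified here.

```r
# Hypothetical helper (not part of SelectSim): list every .rds file under
# `root` and flag panel-specific files and the metastatic-status label.
index_run_data <- function(root = ".") {
  files <- list.files(root, pattern = "\\.rds$", recursive = TRUE)

  # Assumption: labels in the file name are separated by underscores.
  tokens <- strsplit(sub("\\.rds$", "", basename(files)), "_", fixed = TRUE)
  status <- vapply(tokens, function(tk) {
    hit <- intersect(c("primary", "meta"), tk)
    if (length(hit) == 1) hit else NA_character_
  }, character(1))

  data.frame(
    file       = files,
    # TRUE when any parent folder is named exactly "gene_panel".
    gene_panel = grepl("(^|/)gene_panel(/|$)", dirname(files)),
    status     = status,
    stringsAsFactors = FALSE
  )
}

# Usage: index the current directory, then load one file and look at
# the top-level structure of its run_data object.
idx <- index_run_data(".")
print(head(idx))
example_file <- "pan_can_dfci_primary_run_data_v15.rds"
if (file.exists(example_file)) str(readRDS(example_file), max.level = 1)
```

Files whose names carry neither "primary" nor "meta" get `NA` in the `status` column rather than a guessed value.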
Contact Details
- For any queries:
- Contact: Arvind Iyer (ayalurarvind@gmail.com, arvind.iyer@unil.ch), Miljan Petrovic (miljan.petrovic@unil.ch)
- Lead Contact: Giovanni Ciriello (giovanni.ciriello@unil.ch)
Acknowledgments
"The authors would like to acknowledge the American Association for Cancer Research and its financial and material support in the development of the AACR Project GENIE registry, as well as members of the consortium for their commitment to data sharing. Interpretations are the responsibility of the study authors."
Additional details
Software
- Repository URL
- https://github.com/CSOgroup/SelectSim
- Programming language
- R
- Development Status
- Active