Published October 3, 2024 | Version v1
Dataset Restricted

SelectSim Dataset

  • 1. University of Lausanne
  • 2. SIB Swiss Institute of Bioinformatics
  • 3. Swiss Cancer Center Léman

Description

Readme

- This repository contains two supplementary datasets, described below.

Supplementary Data 1

- Data Version: GENIE v15
- Raw Data available at: https://doi.org/10.7303/syn53210170

- There are 19 folders, one for each AACR Project GENIE Participating Center. Each folder contains .rds R data files with the processed GAMs, one file for each gene sequencing panel used by that Center. Each file holds a list of 2 GAMs, one built from missense mutations and one built from truncating mutations.

- How to read data in R:

`data <- readRDS('MSK/MSK-IMPACT468.rds')`
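
To load every panel of one Center at once, the single-file call above can be wrapped in a small helper. This is only a sketch: `read_center_gams` is a hypothetical name, not part of SelectSim, and it assumes the folder layout described above (one folder per Center, one .rds file per panel).

```r
# Hypothetical helper (not part of SelectSim): read every .rds file
# found directly inside one Center folder, e.g. "MSK".
read_center_gams <- function(center_dir) {
  files <- list.files(center_dir, pattern = "\\.rds$", full.names = TRUE)
  # Name each element after its panel file, e.g. "MSK-IMPACT468"
  names(files) <- tools::file_path_sans_ext(basename(files))
  # lapply() keeps the names, so the result is a named list of objects
  lapply(files, readRDS)
}

# Only runs if the (restricted) files have been downloaded locally.
if (dir.exists("MSK")) {
  msk <- read_center_gams("MSK")
  # Each element should be a list of two GAMs (missense, truncating);
  # str() shows the actual element names and dimensions.
  str(msk[[1]], max.level = 1)
}
```

Inspecting the result with `str()` rather than assuming element names is deliberate, since the README does not state how the two GAMs are named inside each list.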

Supplementary Data 2 

- There are two folders, holding the pan_cancer runs (joint analysis of all tumor classes) and the tumor_class runs, respectively.

- Both folders contain three subfolders, corresponding to the GENIE Participating Centers MSK and DFCI, and the TCGA cohort (https://www.cancer.gov/ccg/research/genome-sequencing/tcga)

- Each subfolder contains .rds files with processed run_data objects for running SelectSim. Each object includes the missense and truncating mutation GAMs, a tumor mutational burden estimate for each sample, and the sample classes (tumor subtypes). Within each of the three subfolders, some files contain GAMs built from all available patients and OncoKB genes (n=396, e.g. "pan_can_dfci_primary_run_data_v15.rds"), whereas files in the sub-subfolders named "gene_panel" contain GAMs subset to the corresponding gene panel and to the patients sequenced with that panel. Every file name includes labels for the cohort (MSK, DFCI, or TCGA), the tumor class (or the label "pan-can"), the gene sequencing panel (e.g. "p_505"), and the metastatic status of the analyzed patients ("primary" or "meta").

- How to read data in R:

`data <- readRDS('pan_can_tcga_run_data.rds')`
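
Because the file names carry the cohort, panel, and metastatic-status labels, they can be used to select files before loading them. The sketch below is a rough illustration only: `parse_run_data_name` is a hypothetical helper, and its patterns are guessed from the two example names in this README, so they may need adjusting against the real file list (for instance, a tumor class label that happens to contain "meta" would be misread).

```r
# Hypothetical helper (not part of SelectSim): extract labels from a
# run_data file name. Patterns are assumptions based on the README examples.
parse_run_data_name <- function(path) {
  name <- tolower(basename(path))
  pick <- function(pattern) {
    m <- regmatches(name, regexpr(pattern, name))
    # regmatches() returns character(0) when nothing matches
    if (length(m) == 0) NA_character_ else m
  }
  list(
    cohort = pick("msk|dfci|tcga"),  # cohort label
    status = pick("primary|meta"),   # metastatic status, if present
    panel  = pick("p_[0-9]+")        # gene panel label such as "p_505"
  )
}

labels <- parse_run_data_name("pan_can_dfci_primary_run_data_v15.rds")
str(labels)
```

A label that is absent from a file name comes back as `NA`, which makes it easy to separate the all-gene files from the panel-specific ones.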

Contact Details

- For any queries:
    - Contact: Arvind Iyer (ayalurarvind@gmail.com, arvind.iyer@unil.ch), Miljan Petrovic (miljan.petrovic@unil.ch)
    - Lead Contact: Giovanni Ciriello (giovanni.ciriello@unil.ch)

Acknowledgments

"The authors would like to acknowledge the American Association for Cancer Research and its financial and material support in the development of the AACR Project GENIE registry, as well as members of the consortium for their commitment to data sharing. Interpretations are the responsibility of the study authors."

Files

Restricted

The record is publicly accessible, but files are restricted to users with access.

Additional details

Software

- Repository URL: https://github.com/CSOgroup/SelectSim
- Programming language: R
- Development Status: Active