Dataset Open Access

Pan-cancer Aberrant Pathway Activity Analysis (PAPAA)

DANIEL BLANKENBERG; NAGAMPALLI, VIJAY

Information about the dataset files:

1) pancan_rnaseq_freeze.tsv.gz: Publicly available gene expression data for the TCGA Pan-cancer dataset. The PanCanAtlas file EBPlusPlusAdjustPANCAN_IlluminaHiSeq_RNASeqV2.geneExp.tsv was processed with the script process_sample_freeze.py by Gregory Way et al., as described in the data processing and initialization steps at https://github.com/greenelab/pancancer/. [http://api.gdc.cancer.gov/data/3586c0da-64d0-4b74-a449-5ff4d9136611] [https://doi.org/10.1016/j.celrep.2018.03.046]

2) pancan_mutation_freeze.tsv.gz: Publicly available mutational information for the TCGA Pan-cancer dataset. The file mc3.v0.2.8.PUBLIC.maf.gz was processed with the script process_sample_freeze.py by Gregory Way et al., as described in the data processing and initialization steps at https://github.com/greenelab/pancancer/. [http://api.gdc.cancer.gov/data/1c8cfe5f-e52d-41ba-94da-f15ea1337efc] [https://doi.org/10.1016/j.celrep.2018.03.046]

3) pancan_GISTIC_threshold.tsv.gz: Publicly available gene-level copy number information for the TCGA Pan-cancer dataset. This file was processed with the script process_copynumber.py by Gregory Way et al., as described in the data processing and initialization steps at https://github.com/greenelab/pancancer/. The files copy_number_loss_status.tsv.gz and copy_number_gain_status.tsv.gz generated from this data are used as inputs in our Galaxy pipeline. [https://xenabrowser.net/datapages/?cohort=TCGA%20Pan-Cancer%20(PANCAN)&removeHub=https%3A%2F%2Fxena.treehouse.gi.ucsc.edu%3A443] [https://doi.org/10.1016/j.celrep.2018.03.046]

4) mutation_burden_freeze.tsv.gz: Publicly available mutational information for the TCGA Pan-cancer dataset. The file mc3.v0.2.8.PUBLIC.maf.gz was processed with the script process_sample_freeze.py by Gregory Way et al., as described in the data processing and initialization steps at https://github.com/greenelab/pancancer/. [https://github.com/greenelab/pancancer/] [http://api.gdc.cancer.gov/data/1c8cfe5f-e52d-41ba-94da-f15ea1337efc] [https://doi.org/10.1016/j.celrep.2018.03.046]

5) sample_freeze.tsv or sampleset_freeze_version4_modify.csv: Lists the frozen samples as determined by the TCGA PanCancer Atlas consortium, along with raw RNAseq and mutation data. These samples were determined previously and are included in all downstream analyses. All other datasets were processed and subset according to the frozen samples. [https://github.com/greenelab/pancancer/]

6) cosmic_cancer_classification.tsv: Compendium of oncogenes (OG) and tumor suppressor genes (TSG) used for the analysis. It was built by adding genes from the COSMIC database to vogelstein_cancergenes.tsv. [https://github.com/greenelab/pancancer/]

7) CCLE_DepMap_18Q1_maf_20180207.txt.gz: Publicly available mutational data for CCLE cell lines from the Broad Institute Cancer Cell Line Encyclopedia (CCLE) / DepMap Portal. [https://depmap.org/portal/download/api/download/external?file_name=ccle%2FCCLE_DepMap_18Q1_maf_20180207.txt]

8) ccle_rnaseq_genes_rpkm_20180929_mod.tsv.gz: Publicly available expression data (RPKM) for 1019 cell lines from the Broad Institute Cancer Cell Line Encyclopedia (CCLE) / DepMap Portal. [https://depmap.org/portal/download/api/download/external?file_name=ccle%2Fccle_2019%2FCCLE_RNAseq_genes_rpkm_20180929.gct.gz]

9) CCLE_MUT_CNA_AMP_DEL_binary_Revealer.tsv: Publicly available merged mutation and copy number alteration data, including gene amplifications and deletions, for the CCLE cell lines. The data are in binary format and are provided by the Broad Institute Cancer Cell Line Encyclopedia (CCLE) / DepMap Portal. [https://data.broadinstitute.org/ccle_legacy_data/binary_calls_for_copy_number_and_mutation_data/CCLE_MUT_CNA_AMP_DEL_binary_Revealer.gct]

10) GDSC_cell_lines_EXP_CCLE_names.tsv.gz: Publicly available RMA-normalized expression data for Genomics of Drug Sensitivity in Cancer (GDSC) cell lines. The file gdsc_cell_line_RMA_proc_basalExp.csv was downloaded and subset to the 389 cell lines common to CCLE and GDSC. All GDSC cell line names were replaced with CCLE cell line names for further processing. [https://www.cancerrxgene.org/gdsc1000/GDSC1000_WebResources//Data/preprocessed/Cell_line_RMA_proc_basalExp.txt.zip]

11) GDSC_CCLE_common_mut_cnv_binary.tsv.gz: A subset of the merged mutation and copy number alteration data, including gene amplifications and deletions, for the cell lines common to GDSC and CCLE. This file was generated from CCLE_MUT_CNA_AMP_DEL_binary_Revealer.tsv and a list of common cell lines.

12) gdsc1_ccle_pharm_fitted_dose_data.txt.gz: Pharmacological data for GDSC1 cell lines. [ftp://ftp.sanger.ac.uk/pub/project/cancerrxgene/releases/current_release/GDSC1_fitted_dose_response_15Oct19.xlsx]

13) gdsc2_ccle_pharm_fitted_dose_data.txt.gz: Pharmacological data for GDSC2 cell lines. [ftp://ftp.sanger.ac.uk/pub/project/cancerrxgene/releases/current_release/GDSC2_fitted_dose_response_15Oct19.xlsx]

14) compounds_of_interest.txt: List of pharmacological compounds tested for our analysis, taken from ftp://ftp.sanger.ac.uk/pub4/cancerrxgene/releases/release-8.1/screened_compounds_rel_8.1.csv.

15) tcga_dictionary.tsv: List of cancer types used in the analysis.

16) seg_based_scores.tsv: Measurement of total copy number burden, expressed as the percent of the genome altered by copy number alterations. This file was used as part of the Pancancer analysis by Gregory Way et al., as described in the data processing and initialization steps at https://github.com/greenelab/pancancer/. [https://github.com/greenelab/pancancer/]

17) GSE69822_pi3k_sign.txt: File with values assigned for tumor [1] or normal [-1] in the given external samples (GSE69822).

18) vlog_trans.csv: Variance-stabilized, log-transformed expression values in the given external samples (GSE69822).

19) path_rtk_ras_pi3k_genes.txt: File with the list of RTK/RAS/PI3K pathway genes used in the analysis.

20) path_myc_genes.txt: File with the list of Myc pathway genes used in the analysis. (Sanchez-Vega, Francisco et al.)

21) path_ras_genes.txt: File with the list of RAS pathway genes used in the analysis. (Sanchez-Vega, Francisco et al.)

22) path_cell_cycle_genes.txt: File with the list of cell cycle pathway genes used in the analysis. (Sanchez-Vega, Francisco et al.)

23) path_wnt_genes.txt: File with the list of WNT pathway genes used in the analysis. (Sanchez-Vega, Francisco et al.)

24) GSE94937_rpkm_kras.csv: Expression values in given external samples (GSE94937)

25) GSE94937_kras_sign.txt: File with values assigned for KRAS mutant [1] or WT [-1] in the given external samples (GSE94937).
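
Item 5 notes that all other datasets were subset according to the frozen samples. The following is a minimal sketch of that kind of subsetting, using only the Python standard library and small in-memory stand-ins for the real files. The column name `sample_id`, the toy values, and the helper names `read_tsv` and `subset_to_frozen` are hypothetical illustrations; they are not taken from process_sample_freeze.py, and the real files may be laid out differently.

```python
import csv
import io


def read_tsv(text):
    """Parse tab-separated text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def subset_to_frozen(data_rows, frozen_rows, id_column="sample_id"):
    """Keep only the data rows whose sample ID appears in the freeze list."""
    frozen_ids = {row[id_column] for row in frozen_rows}
    return [row for row in data_rows if row[id_column] in frozen_ids]


# Toy stand-ins for a sample freeze table and an expression matrix.
# The column name "sample_id" is an assumption made for this sketch.
freeze_text = "sample_id\tdisease\nS1\tBRCA\nS3\tLUAD\n"
expr_text = (
    "sample_id\tGENE_A\tGENE_B\n"
    "S1\t1.5\t0.2\n"
    "S2\t3.1\t0.0\n"
    "S3\t0.7\t2.4\n"
)

frozen = read_tsv(freeze_text)
expression = read_tsv(expr_text)
subset = subset_to_frozen(expression, frozen)
kept_ids = [row["sample_id"] for row in subset]
print(kept_ids)
```

In this toy input, sample S2 is absent from the freeze list and is dropped, while S1 and S3 are retained in their original order.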

Files (3.5 GB)
Name | Size | md5
CCLE_DepMap_18Q1_maf_20180207.txt.gz | 62.0 MB | 59ab800fc0f5df8c8723d6abba6a3d7a
CCLE_MUT_CNA_AMP_DEL_binary_Revealer.tsv.gz | 3.0 MB | 01583df2f0e522715f1a16ba82ef7d54
ccle_rnaseq_genes_rpkm_20180929_mod.tsv.gz | 126.1 MB | 836dd2632c842fd2ac1ba8a667a95e07
compounds_of_interest.txt | 926 Bytes | 82057eceab3bfd24fbf3a0106eee86d2
copy_number_gain_status.tsv.gz | 903.0 kB | cac73b9db34f59d50dd96ef492aee091
copy_number_loss_status.tsv.gz | 803.1 kB | 0e7131a3099703449002c4c90e1fe5cb
cosmic_cancer_classification.tsv | 31.1 kB | e4f5bb0175ea2ebb03b7ebd3265bfb80
EBPlusPlusAdjustPANCAN_IlluminaHiSeq_RNASeqV2.geneExp.tsv | 1.9 GB | 02e72c33071307ff6570621480d3c90b
gdsc1_ccle_pharm_fitted_dose_data.txt.gz | 4.0 MB | e4687066af79b86f08228368b5752eec
gdsc2_ccle_pharm_fitted_dose_data.txt.gz | 1.4 MB | 39511d74034b932f7b7697dc00b0e6a1
GDSC_CCLE_common_mut_cnv_binary.tsv.gz | 1.2 MB | c456c808aeb698227613d5bfc2aa64fc
GDSC_EXP_CCLE_converted_name.tsv.gz | 26.1 MB | 79b4f9324f83592c21ec002922e6632f
GSE69822_pi3k_sign.txt | 30 Bytes | fcf3c09d769fc1bc8758974efce0b10c
GSE69822_pi3k_trans.csv | 13.3 MB | 0cf1d5c037a1ecbb5caa5883387b08db
GSE94937_kras_sign.txt | 24 Bytes | 7fe9f15b1ecb7d3bba7eea163261124f
GSE94937_rpkm_kras.csv | 3.5 MB | 9f2b7133206fa86b4cd64456e11def46
mc3.v0.2.8.PUBLIC.maf.gz | 704.8 MB | 32dda8ee9796ce23e98ab1ab29ca4f80
mutation_burden_freeze.tsv | 312.5 kB | 086a865f3ce79fffcb732165cb552a10
pancan_GISTIC_threshold.tsv.gz | 43.0 MB | 045c0b9f434e1e0d932e4477145feb84
pancan_mutation_freeze.tsv.gz | 3.5 MB | 6fbff2a2555876ca9c613e4e3ef3caf6
pancan_rnaseq_freeze.tsv.gz | 655.1 MB | c9c85757f4731c78fe8c860561d9b149
path_cell_cycle_genes.txt | 166 Bytes | 4f889b2453906919a777d4aa85eea840
path_myc_genes.txt | 75 Bytes | 74e179740a2079703585d80bc77645d7
path_ras_genes.txt | 377 Bytes | c5b8b4b0f303b9f9f75c0452f62eab5d
path_rtk_ras_pi3k_genes.txt | 236 Bytes | 97c10b16c6271999b95709d6e8bb949e
path_wnt_genes.txt | 276 Bytes | dbf551ebb5ab19b19a90e681156be2c0
sample_freeze.tsv | 454.8 kB | ce9f8d12eaf2d696974440371096d4f4
sampleset_freeze_version4_modify.csv | 413.1 kB | 9997626e6e0e3632d7d0b39c66618eae
seg_based_scores.tsv | 454.7 kB | 24ad6e2c29beb206ca068828ec99da74
tcga_dictionary.tsv | 975 Bytes | 794575895312e3d92d1b4b739a2e6760
vogelstein_cancergenes.tsv | 14.3 kB | e2486e5fe3b28c69e5237601feb48503

References
  • Way G.P., Sanchez-Vega F., La K., Armenia J., Chatila W.K., Luna A., Sander C., (...), The Cancer Genome Atlas Research Network (2018). Machine Learning Detects Pan-cancer Ras Pathway Activation in The Cancer Genome Atlas. Cell Reports, 23 (1), pp. 172-180.e3.

  • Sanchez-Vega F., Mina M., Armenia J., Chatila W.K., Luna A., La K.C., Dimitriadoy S., (...), The Cancer Genome Atlas Research Network (2018). Oncogenic Signaling Pathways in The Cancer Genome Atlas. Cell, 173 (2), pp. 321-337.e10.

  • Barretina, Caponigro, Stransky et al. (2012). The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature. doi:10.1038/nature11003

  • Ghandi M., Huang F. et al. (2019). Next-generation characterization of the Cancer Cell Line Encyclopedia. Nature. doi:10.1038/s41586-019-1186-3

  • Yang et al. (2013). Genomics of Drug Sensitivity in Cancer (GDSC): a resource for therapeutic biomarker discovery in cancer cells. Nucl. Acids Res. 41 (Database issue): D955-D961. (PMID:23180760)

  • Iorio et al. (2016). A landscape of pharmacogenomic interactions in cancer. Cell, Volume 166, Issue 3, 740-754. (PMID:27397505)

  • Garnett et al. (2012). Systematic identification of genomic markers of drug sensitivity in cancer cells. Nature, volume 483, pages 570-575. (PMID:22460902)

  • Mermel, C.H., Schumacher, S.E., Hill, B. et al. GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers. Genome Biol 12, R41 (2011). https://doi.org/10.1186/gb-2011-12-4-r41

  • Kiselev VY, Juvin V, Malek M, Luscombe N et al. Perturbations of PIP3 signalling trigger a global remodelling of mRNA landscape and reveal a transcriptional feedback loop. Nucleic Acids Res 2015 Nov 16;43(20):9663-79. PMID: 26464442
