This file lists the 42 patient IDs and the zip file names of the 84 genomic and proteomic datasets used in the paper "Gil, Y, Garijo, D, Ratnakar, V, Mayani, R, Adusumilli, A, Srivastava, R, Boyce, H, Mallick, P. Towards Continuous Scientific Data Analysis and Hypothesis Evolution", accepted at AAAI 2017.
The datasets themselves are not published here because of their size and access conditions. They can be retrieved, using the provided IDs, from the TCGA (https://gdc-portal.nci.nih.gov/legacy-archive/search/f) and CPTAC (https://cptac-data-portal.georgetown.edu/cptac/s/S022) archives.
These patient IDs are a subset of the nearly 90 samples used in "Zhang, B., Wang, J., Wang, X., Zhu, J., Liu, Q., et al. Proteogenomic characterization of human colon and rectal cancer. Nature 513, 382–387", and were used to test the system described in the AAAI 2017 paper. The remaining samples were not included in the analysis due to time constraints.