Published September 8, 2022 | Version v1
Dataset Open

A systematic assessment of deep learning methods for drug response prediction: from in-vitro to clinical application

Description

https://github.com/LihongLab/Suppl-data-Benchmark

## GDSC dataset

**Table S3.** GDSC gene expression profiles for 966 cancer cell lines, where each column represents a cell line in the form of its name and tissue collection site, and each row represents a gene in the form of the HGNC symbol.

**Table S4.** GDSC gene mutation profiles for 966 cancer cell lines, where each column represents a cell line in the form of its name and tissue collection site, and each row represents a gene in the form of the HGNC symbol. The mutant type is coded as 1 and the wild type as 0.

**Table S5.** GDSC copy number variation profiles for 966 cancer cell lines, where each column represents a cell line in the form of its name and tissue collection site, and each row represents a gene in the form of the HGNC symbol. Copy-neutral genes are coded as 0 and deletions or amplifications as 1.
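As a rough illustration of the 0/1 coding used in the alteration matrices, the sketch below computes, for each gene, the fraction of cell lines coded as 1. The inline sample, its gene and cell line labels, and the helper name `alteration_frequency` are hypothetical. It also assumes a comma-separated file with a header row whose first cell labels the gene column, which the description does not state.

```python
import csv
import io

# Hypothetical excerpt in the described layout: genes as rows, cell lines as columns.
SAMPLE = """gene,CELL_A_lung,CELL_B_breast,CELL_C_skin
TP53,1,0,1
KRAS,0,0,1
EGFR,0,0,0
"""

def alteration_frequency(handle):
    """Return {gene: fraction of columns coded 1} for a gene-by-sample 0/1 matrix."""
    reader = csv.reader(handle)
    header = next(reader)            # first cell labels the gene column (assumed)
    n_samples = len(header) - 1
    freqs = {}
    for row in reader:
        if not row:                  # tolerate blank lines
            continue
        gene, values = row[0], row[1:]
        # int(float(...)) accepts both "1" and "1.0" style encodings.
        freqs[gene] = sum(int(float(v)) for v in values) / n_samples
    return freqs

freqs = alteration_frequency(io.StringIO(SAMPLE))
```

For a real file, the same function could be called on `open(path, newline="")` in place of the in-memory sample.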

**Table S6.** GDSC drug response data for 966 cancer cell lines and 282 drugs in the form of the natural logarithm of the IC50 readout. The first column shows the cell line name and tissue collection site, the second column shows the drug name, and the third column shows the drug response readout.
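Because the readout is stored as a natural logarithm, the IC50 itself can be recovered with the exponential function. The following is a minimal sketch of reading the three-column layout and adding that back-transformed value. The sample rows, drug names, header names, and the helper `load_responses` are hypothetical, and a comma-separated file with a single header row is assumed.

```python
import csv
import io
import math

# Hypothetical rows in the described order: cell line, drug, ln(IC50).
SAMPLE = """cell_line,drug,ln_ic50
CELL_A_lung,DrugX,0.0
CELL_A_lung,DrugY,2.0
CELL_B_breast,DrugX,-1.5
"""

def load_responses(handle):
    """Read long-format responses and add IC50 = exp(ln IC50) to each record."""
    reader = csv.reader(handle)
    next(reader)                     # skip the header row (assumed present)
    records = []
    for row in reader:
        if not row:                  # tolerate blank lines
            continue
        cell_line, drug, ln_ic50 = row[0], row[1], float(row[2])
        records.append({
            "cell_line": cell_line,
            "drug": drug,
            "ln_ic50": ln_ic50,
            "ic50": math.exp(ln_ic50),   # undo the natural logarithm
        })
    return records

records = load_responses(io.StringIO(SAMPLE))
```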

**Table S7.** GDSC annotations for 282 drugs, including drug name, PubChem CID, PubChem canonical SMILES, RDKit canonical SMILES, Target Pathway, standard deviation, bimodality coefficient, and density coverage.

## TCGA dataset

**Table S8.** TCGA gene expression profiles, where each column represents a patient in the form of TCGA patient ID, and each row represents a gene in the form of the HGNC symbol.

**Table S9.** TCGA gene mutation profiles, where each column represents a patient in the form of TCGA patient ID, and each row represents a gene in the form of the HGNC symbol. The mutant type is coded as 1 and the wild type as 0.

**Table S10.** TCGA copy number variation profiles, where each column represents a patient in the form of TCGA patient ID, and each row represents a gene in the form of the HGNC symbol. Copy-neutral genes are coded as 0 and deletions or amplifications as 1.

**Table S11.** TCGA clinical response data. The first column shows the TCGA patient ID, the second column shows the drug name, the third column shows the clinical response category, the fourth column shows the cancer type, and the last column shows the clinical label as responder or non-responder.
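One simple use of the five-column layout is to tally responders and non-responders per drug. The sketch below does this by reading the drug from the second column and the label from the last column. The patient IDs, drug names, response categories, header names, and the exact label strings in the sample are hypothetical, as is the helper `label_counts_by_drug`. The real file may encode the label differently.

```python
import csv
import io
from collections import Counter

# Hypothetical rows in the described order:
# patient ID, drug, response category, cancer type, responder label.
SAMPLE = """patient_id,drug,response,cancer_type,label
TCGA-XX-0001,DrugX,Complete Response,BRCA,responder
TCGA-XX-0002,DrugX,Clinical Progressive Disease,BRCA,non-responder
TCGA-XX-0003,DrugY,Partial Response,LUAD,responder
TCGA-XX-0004,DrugX,Partial Response,COAD,responder
"""

def label_counts_by_drug(handle):
    """Count rows per (drug, label) pair."""
    reader = csv.reader(handle)
    next(reader)                     # skip the header row (assumed present)
    counts = Counter()
    for row in reader:
        if not row:                  # tolerate blank lines
            continue
        drug, label = row[1], row[-1]
        counts[(drug, label)] += 1
    return counts

counts = label_counts_by_drug(io.StringIO(SAMPLE))
```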

Files (1.3 GB)

| Checksum | Size |
| --- | --- |
| md5:beaa0e93f5ad5b8281231e0f561fc726 | 56.4 MB |
| md5:c774b9d0b10872558d4afaaac9647f3e | 92.7 kB |
| md5:4cd14f5e00814f1728ed1c673b733691 | 457.4 MB |
| md5:8fd663e42f37d070704a058e1309a894 | 39.7 MB |
| md5:153e897447b121069009ce70fb63d9d0 | 43.0 MB |
| md5:219b0569ef9a657b704173f416cff229 | 14.2 MB |
| md5:40d16a1966d45b87b6d2ead570877135 | 54.4 kB |
| md5:a1fefba102f3fb8a979bb4e9eb1ccd93 | 681.0 MB |
| md5:0177d96d062246722720615627d529d2 | 45.1 MB |
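The MD5 checksums in the file listing can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_listing` and the file name in the usage comment are hypothetical, and the listing does not show which checksum belongs to which file, so the pairing has to be taken from the record page.

```python
import hashlib

def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_listing(path, listed):
    """Compare a file against a listed value such as 'md5:<hex digest>'."""
    expected = listed.split(":", 1)[-1].strip().lower()   # drop the 'md5:' prefix
    return md5_of_file(path) == expected

# Usage (hypothetical local file name; pair it with the checksum shown for that file):
# matches_listing("downloaded_table.csv", "md5:beaa0e93f5ad5b8281231e0f561fc726")
```

Reading in chunks matters here because the largest files in the listing are several hundred megabytes.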