Published December 31, 2025 | Version v.1.0.0
Dataset Restricted

A single-cell atlas linking intratumoral states to therapeutic vulnerabilities across cancers

Description

The Therapeutic Cancer Cell Atlas (TCCA) is an integrated single-cell multi-omics resource comprising 1,839,242 cells (1,089,024 malignant + 750,218 non-malignant) from 853 samples across 36 studies and 34 cancer types.
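The headline counts above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated in this description; the variable names are illustrative, and the derived ratios are simple averages rather than statistics reported by the authors.

```python
# Counts copied from the description above.
total_cells = 1_839_242
malignant_cells = 1_089_024
non_malignant_cells = 750_218
n_samples = 853

# The two compartments should add up to the stated total.
assert malignant_cells + non_malignant_cells == total_cells

# Derived summaries (plain averages, not values reported by the authors).
malignant_fraction = malignant_cells / total_cells
mean_cells_per_sample = total_cells / n_samples

print(f"Malignant fraction: {malignant_fraction:.1%}")
print(f"Mean cells per sample: {mean_cells_per_sample:.0f}")
```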

The atlas integrates six complementary data layers across all cells/samples:

  • Expression: Single-cell RNA-seq with batch-corrected embeddings (scANVI).
  • Clinical: Curated sample metadata.
  • TME: 12 microenvironment archetypes from immune/stromal composition.
  • Genomic: Copy-number variation profiles for genetic subclones (SCEVAN).
  • Therapeutic: Drug sensitivity predictions and therapeutic clusters.
  • Functional: 43 recurrent transcriptional metaprograms (NMF).

File organization (13 files): 

  • tcca_expression_metadata_lvl2.h5ad (~23 GB): Integrated atlas containing all data layers (expression, clinical metadata, cell annotations, TME archetypes, genetic subclones, therapeutic clusters, embeddings, and metaprogram enrichment scores).
  • Standalone files (12 files, ~2 GB): Individual tabular extracts for targeted analyses without loading the full atlas, namely cell-level metadata (`obs`) from the AnnData object (1 file), source studies (1 file), clinical data (1 file), TME (2 files), CNV profiles (2 files), therapeutic predictions (3 files), and metaprograms (2 files).
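Because the integrated atlas is about 23 GB, it may be more practical to open it in backed (on-disk) mode than to load it fully into memory. The following is a minimal sketch, assuming the `anndata` Python package is installed and the file has been downloaded to the working directory. The helper names `open_atlas` and `describe` are hypothetical, and the sketch only lists whatever `obs` columns and embeddings it finds, since their names are not given in this description (README.md is the authoritative reference).

```python
from pathlib import Path

ATLAS_FILE = "tcca_expression_metadata_lvl2.h5ad"  # file name from this record


def open_atlas(path=ATLAS_FILE):
    """Open the atlas in backed mode so the expression matrix stays on disk."""
    # Imported lazily so the rest of the sketch works without anndata installed.
    import anndata as ad

    return ad.read_h5ad(path, backed="r")


def describe(adata):
    """Summarize an AnnData-like object without touching the expression matrix."""
    return {
        "n_cells": adata.n_obs,
        "n_genes": adata.n_vars,
        "obs_columns": list(adata.obs.columns),
        "embeddings": list(adata.obsm.keys()),
    }


if __name__ == "__main__":
    if Path(ATLAS_FILE).exists():
        print(describe(open_atlas()))
    else:
        print(f"{ATLAS_FILE} not found; request access and download it first")
```

For analyses that need only annotations, the standalone tabular extracts avoid opening the large file at all.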

See README.md for detailed file descriptions and a usage example.

Methods (English)

The code used to create and analyze the Therapeutic Cancer Cell Atlas is available at cnio-bu/tcca.

Notes

This work integrates single-cell RNA-seq data from 36 publicly available studies retrieved from GEO and other repositories: GSE137804, GSE132509, GSE185381, GSE235063, GSE141526, GSE162454, GSE161529, GSE176078, GSE186344, GSE200218, SCP1950, GSE157220, GSE138709, GSE132065, GSE166555, EGAS00001006469, GSE160269, GSE182109, GSE131907, GSE131309, GSE161801, GSE173682, GSE203612, E-MTAB-8107, E-MTAB-6149, E-MTAB-6653, EGAS00001005115, CRA001160, GSE197177, GSE163678, GSE141445, SCP1288, GSE215121, PRJNA662018, GSM4147091, and datasets from Zenodo (DOI: 10.5281/zenodo.7227571), CodeOcean (capsule 8321305), and Mendeley Data (DOI: 10.17632/g67bkbnhhg.1).
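When retrieving the source data, it can help to sort the accession identifiers above by the repository that issued them. The sketch below groups them by prefix. The prefix-to-repository mapping reflects common accession conventions and is an assumption of this example, not part of the record; the Zenodo, CodeOcean, and Mendeley Data entries are omitted because they are identified by DOI or capsule number rather than by accession prefix.

```python
from collections import defaultdict

# Accession identifiers copied from the note above.
ACCESSIONS = [
    "GSE137804", "GSE132509", "GSE185381", "GSE235063", "GSE141526",
    "GSE162454", "GSE161529", "GSE176078", "GSE186344", "GSE200218",
    "SCP1950", "GSE157220", "GSE138709", "GSE132065", "GSE166555",
    "EGAS00001006469", "GSE160269", "GSE182109", "GSE131907", "GSE131309",
    "GSE161801", "GSE173682", "GSE203612", "E-MTAB-8107", "E-MTAB-6149",
    "E-MTAB-6653", "EGAS00001005115", "CRA001160", "GSE197177", "GSE163678",
    "GSE141445", "SCP1288", "GSE215121", "PRJNA662018", "GSM4147091",
]

# Assumed mapping from accession prefix to issuing repository.
PREFIX_TO_REPOSITORY = {
    "GSE": "GEO series",
    "GSM": "GEO sample",
    "E-MTAB": "ArrayExpress",
    "EGAS": "European Genome-phenome Archive",
    "CRA": "Genome Sequence Archive",
    "SCP": "Single Cell Portal",
    "PRJNA": "NCBI BioProject",
}


def group_by_repository(accessions):
    """Return {repository: [accessions]} based on the accession prefix."""
    groups = defaultdict(list)
    for accession in accessions:
        for prefix, repository in PREFIX_TO_REPOSITORY.items():
            if accession.startswith(prefix):
                groups[repository].append(accession)
                break
        else:
            # No known prefix matched; keep the identifier visible.
            groups["unrecognized"].append(accession)
    return dict(groups)


groups = group_by_repository(ACCESSIONS)
for repository, ids in groups.items():
    print(f"{repository}: {len(ids)}")
```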

For complete bibliographic references, tumor types, and per-study sample/cell counts see source_studies.xlsx. We thank all original authors for making their data publicly available.

Other

References:

1. Yuan, H. et al. CancerSEA: a cancer single-cell state atlas. Nucleic Acids Res 47, D900–D908 (2019).
2. Kinker, G. S. et al. Pan-cancer single-cell RNA-seq identifies recurring programs of cellular heterogeneity. Nat Genet 52, 1208–1218 (2020).
3. Barkley, D. et al. Cancer cell states recur across tumor types and form specific interactions with the tumor microenvironment. Nat Genet 54, 1192–1201 (2022).
4. Tyler, M. et al. The Curated Cancer Cell Atlas provides a comprehensive characterization of tumors at single-cell resolution. Nat Cancer 6, 1088–1101 (2025).
5. Kang, J. et al. Systematic dissection of tumor-normal single-cell ecosystems across a thousand tumors of 30 cancer types. Nat Commun 15, 4067 (2024).
6. Huang, C. et al. scCancerExplorer: a comprehensive database for interactively exploring single-cell multi-omics data of human pan-cancer. Nucleic Acids Res 53, D1526–D1535 (2025).
7. Stuart, T. et al. Comprehensive Integration of Single-Cell Data. Cell 177, 1888–1902.e21 (2019).
8. Wolf, F. A., Angerer, P. & Theis, F. J. SCANPY: large-scale single-cell gene expression data analysis. Genome Biol 19, 15 (2018).
9. Xu, C. et al. Probabilistic harmonization and annotation of single-cell transcriptomics data with deep generative models. Mol Syst Biol 17, MSB20209620 (2021).
10. Luecken, M. D. et al. Benchmarking atlas-level data integration in single-cell genomics. Nat Methods 19, 41–50 (2022).
11. Hao, Y. et al. Integrated analysis of multimodal single-cell data. Cell 184, 3573–3587.e29 (2021).
12. Ianevski, A. et al. Single-cell transcriptomes identify patient-tailored therapies for selective co-inhibition of cancer clones. Nat Commun 15, 8579 (2024).
13. Fustero-Torre, C. et al. Beyondcell: targeting cancer therapeutic heterogeneity in single-cell RNA-seq data. Genome Med 13, 187 (2021).
14. De Falco, A., Caruso, F., Su, X.-D., Iavarone, A. & Ceccarelli, M. A variational algorithm to detect the clonal copy number substructure of tumors from scRNA-seq data. Nat Commun 14, 1074 (2023).
15. Combes, A. J. et al. Discovering dominant tumor immune archetypes in a pan-cancer census. Cell 185, 184–203.e19 (2022).
 

Files

Restricted

The record is publicly accessible, but files are restricted to users with access.