Published January 12, 2026 | Version v2
Dataset Open

Data from "Single-cell integration and multi-modal profiling reveals phenotypes and spatial organization of neutrophils in colorectal cancer"

  • 1. Biocenter, Institute of Bioinformatics, Medical University of Innsbruck, Austria
  • 2. Institute of Experimental Immunology, University of Zurich, Switzerland
  • 3. Department of Internal Medicine V, Haematology & Oncology, Comprehensive Cancer Center
  • 4. Department of Visceral, Transplant and Thoracic Surgery, Medical University Innsbruck, Austria
  • 5. Innpath, Tirol Kliniken, Medical University Innsbruck, Austria
  • 6. Tyrolpath Obrist Brunhuber GmbH, Zams, Austria
  • 7. Institute of Pathology, Neuropathology and Molecular Pathology, Medical University of Innsbruck, Austria
  • 8. Department of Therapeutic Radiology and Oncology, Medical University Innsbruck, Austria
  • 9. Department of Molecular Life Sciences, University of Zürich, Switzerland
  • 10. Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, Czech Republic
  • 11. Boehringer Ingelheim International Pharma GmbH & Co KG, Biberach, Germany
  • 12. Department of Dermatology, Venereology and Allergology, Medical University of Innsbruck, Austria
  • 13. Senckenberg Institute of Pathology, Goethe University Frankfurt, Germany
  • 14. University Cancer Center Frankfurt (UCT), Germany
  • 15. Senckenberg Institute of Pathology, Goethe University Frankfurt, Germany
  • 16. Translational Cancer Research and Institute of Experimental Cancer Therapy, Klinikum rechts der Isar, School of Medicine & Health, Technical University of Munich, Germany
  • 17. Institute of Pathology, Paracelsus Medical University, Salzburg, Austria
  • 18. Department of Internal Medicine III, Paracelsus Medical University, Salzburg, Austria
  • 19. Austrian Breast & Colorectal Cancer Study Group (ABCSG), Vienna, Austria
  • 20. Department of Pathology, Leiden University Medical Center, The Netherlands
  • 21. Comprehensive Cancer Center Zürich, Zürich, Switzerland

Description

This archive provides all datasets needed to reproduce the single-cell data integration detailed in the paper:

Single-cell integration and multi-modal profiling reveals phenotypes and spatial organization of neutrophils in colorectal cancer

DOI: 10.1016/j.ccell.2025.12.003


The archive comprises the following files:

  • crc_atlas_models.tar.xz: Trained scVI (unsupervised) and scANVI (cell-type aware) models for the global CRC atlas and tissue-specific subsets (normal, tumor, metastasis). The models enable projection of external data onto the CRC atlas and expect Ensembl IDs (e.g. ENSG00000105329) as var_names. Trained with scvi-tools v1.4.1.

  • crc_atlas_models_minified.tar.xz: Lightweight minified models that only retain the weights necessary for downstream inference. Optimized for scArches workflows, including reference mapping and automated cell-type label transfer.

  • MUI_Innsbruck-adata.h5ad: In-house scRNA-seq dataset from CRC cohort I (n = 12) comprising matched peripheral blood, adjacent normal, and tumor samples generated using the BD Rhapsody platform.

  • input_datasets.tar.xz: Preprocessed input datasets in .h5ad format required to build the CRC scRNA-seq atlas.

  • downstream_analyses.tar.xz: Fully executed HTML notebooks and corresponding analysis outputs used to generate the main single-cell atlas figures in the paper.

  • downstream_analyses_de_analysis.tar.xz: DESeq2-based differential expression analyses on pseudobulked data by cell type for various matched comparisons within the CRC atlas. Includes RDS files, result TSV tables, and short summaries for each comparison.

  • remove_ambient_rna.tar.xz: A subset of 24 .h5ad datasets with scAR-denoised counts. The original unfiltered count matrices are available in input_datasets.tar.xz.

  • containers.tar.xz: Singularity .sif images encapsulating all software dependencies required to fully reproduce the workflow.

  • shears_tutorial.tar.xz: Input datasets in .h5ad format required to execute the shears tutorial. Includes the single-cell CRC reference and combined bulk clinical cohorts to demonstrate both the quantitative deconvolution and the single-cell phenotypic modeling (e.g., mapping clinical outcomes to single cells) introduced in this paper.
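
Because the models in crc_atlas_models.tar.xz expect Ensembl IDs as var_names, it may help to check the gene identifiers of an external dataset before attempting a projection. The following is a minimal sketch using only the Python standard library. The function name `non_ensembl_ids` and the pattern (human `ENSG` prefix followed by 11 digits, without a version suffix, mirroring the example ID above) are assumptions made for illustration; they are not part of the archive or of scvi-tools.

```python
import re

# Assumed pattern: unversioned human Ensembl gene IDs ("ENSG" + 11 digits),
# modeled on the example ID ENSG00000105329. Not taken from the archive itself.
ENSEMBL_GENE_ID = re.compile(r"ENSG\d{11}")


def non_ensembl_ids(var_names):
    """Return the entries of var_names that do not look like Ensembl gene IDs."""
    return [
        name for name in var_names
        if ENSEMBL_GENE_ID.fullmatch(str(name)) is None
    ]


# Hypothetical query gene list mixing Ensembl IDs with a gene symbol.
query_genes = ["ENSG00000105329", "ENSG00000141510", "TP53"]
offending = non_ensembl_ids(query_genes)
print(offending)
```

With an AnnData object, the same check could be run as `non_ensembl_ids(adata.var_names)`. An empty list suggests the identifiers have the expected form. Under this assumed pattern, versioned IDs (with a trailing `.N`) would also be reported.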

The CRC atlas is publicly available for download and interactive exploration through a cell-x-gene instance with standardized metadata, which allows custom analyses of the atlas. For more information, check out the

Files

Files (159.2 GB)

Checksum                                Size
md5:2c3bb34735b1f3a1639555e25083de85    10.2 GB
md5:286269e2641fe1dffe63bb1bc2d787c5    41.0 GB
md5:938bcd7621e58fe1d8c871a0c4aab8c0    10.5 GB
md5:9c11cbea81ba61bea41ee16764e837a6    7.5 GB
md5:8b74a61fc648493ff2c8f2da0c9bb97f    12.0 GB
md5:6d73a9c6d9246970e1a02dc39265dd95    27.6 GB
md5:4bbe1687d323c735512223df983269e5    2.2 GB
md5:13aad307901012097d1faba9291f0e3d    13.9 GB
md5:fe6ad19b73b6affd4ec8ff61675f4066    34.3 GB
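
The MD5 checksums listed above can be used to confirm that a large archive was downloaded intact. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_checksum` are hypothetical, and the sketch does not assume which checksum belongs to which file.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks
    so that multi-gigabyte archives do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

To use it, pass the local path of a downloaded archive together with the checksum shown for that file on the record page. A return value of `False` suggests a corrupted or incomplete download.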

Additional details

Related works

Is published in
Peer review: 10.1016/j.ccell.2025.12.003 (DOI)

Funding

European Commission
EPIC - Enabling Precision Immuno-oncology in Colorectal cancer 786295
FWF Austrian Science Fund
10.55776/DOC82