Published November 26, 2025 | Version 0.1.0
Software Open

SLIDE-EX: a deep-learning framework for predicting cell type-specific gene expression and cell type abundance from H&E images

  • 1. Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
  • 2. Departments of Pathology and Laboratory Medicine and Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA
  • 3. Translational Research Institute and Jim and Eleanor Randall Department of Surgery, Cedars-Sinai Medical Center, Los Angeles, CA, USA

Description


Code associated with "Deep learning inference of cell type-specific gene expression from breast tumor histopathology", bioRxiv 2025, by Andrew T. Wang, Saugato R. Dhruba, Kun Wang, Emma M. Campagnolo, Eldad D. Shulman, Eytan Ruppin.

1. Introduction

SLIDE-EX (SLide-based Inference of DEconvolved gene EXpression) is a deep-learning framework trained on deconvolved bulk transcriptomics data that robustly predicts cell type-specific gene expression and cell type abundance from histopathology slides.

The SLIDE-EX architecture consists of four main components: (1) Cellular Deconvolution, (2) Image Processing, (3) Feature Extraction, and (4) Multilayer Perceptron Regression. 

2. Dependencies

SLIDE-EX uses the following packages:

python 3.9.7

numpy 1.20.3

pandas 1.3.4

scikit-learn 1.2.2

matplotlib 3.4.3

openslide 1.1.2

opencv 4.5.4

pillow 8.4.0 (imported as PIL)

pytorch 1.12.0
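
To compare an existing environment against the versions listed above, a small check along the following lines may help. This is only a sketch: the pip distribution names (`openslide-python`, `opencv-python`, `Pillow`, `torch`) are assumptions, since the list above gives package names rather than install names, and conda environments may use different ones.

```python
from importlib import metadata

# Versions from the list above, keyed by an ASSUMED pip distribution name.
# These install names are guesses and may differ (for example under conda).
PINNED = {
    "numpy": "1.20.3",
    "pandas": "1.3.4",
    "scikit-learn": "1.2.2",
    "matplotlib": "3.4.3",
    "openslide-python": "1.1.2",
    "opencv-python": "4.5.4",
    "Pillow": "8.4.0",
    "torch": "1.12.0",
}


def check_versions(pinned=PINNED):
    """Return {distribution: (expected, installed or None, status)}."""
    report = {}
    for name, expected in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed is None:
            status = "missing"
        elif installed == expected or installed.startswith(
            (expected + ".", expected + "+")  # e.g. 4.5.4.60 or 1.12.0+cu113
        ):
            status = "ok"
        else:
            status = "differs"
        report[name] = (expected, installed, status)
    return report


if __name__ == "__main__":
    for name, (expected, installed, status) in check_versions().items():
        print(f"{name:18s} expected {expected:8s} installed {installed} [{status}]")
```

A status of "differs" does not necessarily mean the code will fail; the versions above are simply the ones the authors report using.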

3. SLIDE-EX computational pipeline

Step 1: Deconvolution with CODEFACS

The first step employs CODEFACS, a cellular deconvolution tool developed in our lab [1]. CODEFACS takes the measured bulk expression of each sample as input and outputs the cell type-specific expression profiles together with the abundance of each cell type in that sample.
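
To make the relationship between the two outputs concrete, the sketch below assumes the linear mixing model commonly used in deconvolution: the bulk expression of a gene is the abundance-weighted sum of its cell type-specific expression. It illustrates that assumption only and is not the CODEFACS algorithm; the function name, cell types, gene names, and numbers are all invented for illustration.

```python
def reconstruct_bulk(abundance, cell_expr):
    """Recombine per-cell-type expression into bulk expression for one sample.

    abundance: {cell_type: fraction}, fractions assumed to sum to 1
    cell_expr: {cell_type: {gene: expression}}
    """
    genes = sorted({g for profile in cell_expr.values() for g in profile})
    return {
        g: sum(abundance[c] * cell_expr[c].get(g, 0.0) for c in abundance)
        for g in genes
    }


# Toy values, for illustration only
abundance = {"tumor": 0.5, "t_cell": 0.25, "fibroblast": 0.25}
cell_expr = {
    "tumor": {"GENE_A": 8.0, "GENE_B": 2.0},
    "t_cell": {"GENE_A": 0.0, "GENE_B": 12.0},
    "fibroblast": {"GENE_A": 4.0, "GENE_B": 4.0},
}

bulk = reconstruct_bulk(abundance, cell_expr)
print(bulk)  # {'GENE_A': 5.0, 'GENE_B': 5.0}
```

Deconvolution works in the opposite direction: given only the bulk values, it estimates the abundances and the per-cell-type profiles.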

Step 2: Image processing and feature extraction

- Run `python slide_processing/1main_processing.py <project_path> <slide_index>` to perform image pre-processing and feature extraction for a given slide. `<project_path>` is the path to your project files. `<slide_index>` is the index of the slide to be processed.

- Run `python slide_processing/collect_mask.py <project_path>` to collect the mask files for all slides into a single file that can be used to evaluate slide quality.

- Run `python slide_processing/collect_features.py <project_path>` to create a file that contains features of image tiles for all slides.
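
Because the first command handles one slide at a time, Step 2 is naturally scripted as a loop followed by the two collection commands. The sketch below shows one way to do that. The slide count, the 0-based indexing, and the helper names `build_step2_commands` and `run_step2` are assumptions, not part of SLIDE-EX; check how `1main_processing.py` interprets `<slide_index>` before relying on it.

```python
import subprocess
import sys


def build_step2_commands(project_path, n_slides):
    """List the Step 2 commands in the order given above.

    n_slides and the 0-based slide indexing are ASSUMPTIONS.
    """
    commands = [
        [sys.executable, "slide_processing/1main_processing.py", project_path, str(i)]
        for i in range(n_slides)
    ]
    commands.append([sys.executable, "slide_processing/collect_mask.py", project_path])
    commands.append([sys.executable, "slide_processing/collect_features.py", project_path])
    return commands


def run_step2(project_path, n_slides, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in build_step2_commands(project_path, n_slides):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # raises on the first failing command


if __name__ == "__main__":
    # Hypothetical project path and slide count; dry run only prints the commands.
    run_step2("/path/to/project", n_slides=3)
```

On a cluster, the per-slide commands are independent of each other and could instead be submitted as separate jobs, with the two collection commands run once all of them have finished.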

Step 3: Predicting cell type-specific expression from the slide image features

- Run `python prediction/1main_cs_exp_regression.py <project_path> <outer_fold> <inner_fold> <cell_type>` to train a model and predict cell type-specific expression values from the slide image features. `<outer_fold>` and `<inner_fold>` are the outer and inner fold indexes, respectively, for nested cross-validation. `<cell_type>` is the name of the cell type of interest, given as a string.
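
For readers unfamiliar with nested cross-validation, the sketch below shows how a pair of fold indexes can select training, validation, and test samples: the outer fold is held out for testing, and the inner fold, drawn from the remaining samples, is held out for validation. It is a generic illustration, not the splitting logic of the SLIDE-EX scripts; the fold counts, seed, and function names are hypothetical.

```python
import random


def k_fold_indices(items, k, seed):
    """Shuffle items and deal them into k nearly equal folds."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def nested_split(samples, outer_fold, inner_fold, n_outer=5, n_inner=5, seed=0):
    """Return (train, validation, test) samples for one fold pair.

    n_outer, n_inner and seed are hypothetical defaults, not values taken
    from the SLIDE-EX code.
    """
    outer = k_fold_indices(samples, n_outer, seed)
    test = outer[outer_fold]
    remaining = [s for i, fold in enumerate(outer) if i != outer_fold for s in fold]
    inner = k_fold_indices(remaining, n_inner, seed + 1)
    validation = inner[inner_fold]
    train = [s for i, fold in enumerate(inner) if i != inner_fold for s in fold]
    return train, validation, test


train, validation, test = nested_split(range(100), outer_fold=0, inner_fold=2)
print(len(train), len(validation), len(test))  # 64 16 20
```

Under this scheme each (outer fold, inner fold) pair corresponds to one invocation of the command above, so a full run covers every pair for each cell type of interest.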

Step 4: Predicting cell type abundance from the slide image features

- Run `python prediction/1main_abundance_regression.py <project_path> <outer_fold> <inner_fold>` to train a model and predict cell type abundances from the slide image features.

Step 5: Classifying responders and non-responders to chemotherapy from the slide image features (direct model)

- Run `python prediction/1main_direct.py <project_path> <outer_fold> <inner_fold>` to train a model and classify patients as responders or non-responders to chemotherapy directly from the slide image features.

- In our paper, we investigated whether cell type-specific gene expression inferred by SLIDE-EX could predict neoadjuvant chemotherapy response using DECODEM, a computational framework for predicting chemotherapy response from deconvolved gene expression profiles developed in our lab [2]. As a baseline for comparison, we designed this supervised model in 1main_direct.py to classify responders and non-responders directly from the slide image features without the intermediate step of cell type-specific gene expression prediction.

4. References

1. Wang, K. et al. Deconvolving clinically relevant cellular immune cross-talk from bulk gene expression using CODEFACS and LIRICS stratifies patients with melanoma to anti-PD-1 therapy. Cancer Discov. 12, 1088–1105 (2022).

2. Dhruba, S. R. et al. Enhanced prediction of breast cancer patient response to chemotherapy by integrating deconvolved expression patterns of immune, stromal and tumor cells. Cancer Lett. 218101 (2025).

Files

SLIDE_EX_code.zip (22.1 kB)
md5:b7d557a05f6eacb3d755f4f2b3947ec8

Additional details

Software

Programming language
Python
