SLIDE-EX: a deep-learning framework for predicting cell type-specific gene expression and cell type abundance from H&E images
Authors/Creators
- 1. Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
- 2. Departments of Pathology and Laboratory Medicine and Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- 3. Translational Research Institute and Jim and Eleanor Randall Department of Surgery, Cedars-Sinai Medical Center, Los Angeles, CA, USA
Description
Code associated with "Deep learning inference of cell type-specific gene expression from breast tumor histopathology", bioRxiv 2025, by Andrew T. Wang, Saugato R. Dhruba, Kun Wang, Emma M. Campagnolo, Eldad D. Shulman, Eytan Ruppin.
1. Introduction
SLIDE-EX (SLide-based Inference of DEconvolved gene EXpression) is a deep-learning framework trained on deconvolved bulk transcriptomics data that robustly predicts cell type-specific gene expression and cell type abundance from histopathology slides.
The SLIDE-EX architecture consists of four main components: (1) Cellular Deconvolution, (2) Image Processing, (3) Feature Extraction, and (4) Multilayer Perceptron Regression.
2. Dependencies
SLIDE-EX uses the following packages:
python 3.9.7
numpy 1.20.3
pandas 1.3.4
scikit-learn 1.2.2
matplotlib 3.4.3
openslide 1.1.2
openCV 4.5.4
PIL 8.4.0
pytorch 1.12.0
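To compare an existing environment against the versions listed above, a small check along the following lines may help. This is only a sketch: the distribution names (for example `openslide-python`, `opencv-python`, `Pillow`, `torch`) are assumptions about how these packages are published on PyPI, and the prefix comparison is a convenience so that build suffixes do not count as a mismatch.

```python
from importlib import metadata

# Versions copied from the dependency list above. The distribution names on
# the left are assumptions about how each package is published on PyPI.
PINNED = {
    "numpy": "1.20.3",
    "pandas": "1.3.4",
    "scikit-learn": "1.2.2",
    "matplotlib": "3.4.3",
    "openslide-python": "1.1.2",
    "opencv-python": "4.5.4",
    "Pillow": "8.4.0",
    "torch": "1.12.0",
}


def check_versions(pinned):
    """Return {name: (expected, installed or None, status)} for each package."""
    report = {}
    for name, expected in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (expected, None, "missing")
            continue
        # Prefix match, so builds such as 4.5.4.60 or 1.12.0+cu113 count as "ok".
        status = "ok" if installed.startswith(expected) else "differs"
        report[name] = (expected, installed, status)
    return report


if __name__ == "__main__":
    for name, (expected, installed, status) in check_versions(PINNED).items():
        print(f"{name:18s} expected {expected:8s} installed {installed} [{status}]")
```

A status of "differs" does not necessarily mean the code will fail; it only flags a version other than the one listed here.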
3. SLIDE-EX computational pipeline
Step 1: Deconvolution with CODEFACS
The first step employs CODEFACS, a cellular deconvolution tool developed in our lab [1]. CODEFACS takes the measured bulk expression of each sample as input and outputs the cell type-specific expression profiles and the corresponding cell type abundances for that sample.
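The relationship between the input and the two outputs of this step can be illustrated with the usual linear mixing assumption behind bulk deconvolution: the bulk expression of a gene is approximately the abundance-weighted sum of its cell type-specific expression. The sketch below only recombines toy outputs into a bulk profile. It is not the CODEFACS algorithm, and the cell type names, gene names, and values are invented for illustration.

```python
def reconstruct_bulk(abundance, cell_expr):
    """Recombine deconvolved outputs into bulk expression for one sample.

    abundance: {cell_type: fraction} for the sample
    cell_expr: {cell_type: {gene: expression}} for the same sample
    """
    genes = set()
    for profile in cell_expr.values():
        genes.update(profile)
    # Bulk expression of each gene = sum over cell types of fraction * expression.
    return {
        gene: sum(abundance[ct] * cell_expr[ct].get(gene, 0.0) for ct in abundance)
        for gene in sorted(genes)
    }


# Toy values, invented for illustration only.
abundance = {"tumor": 0.5, "T_cell": 0.25, "fibroblast": 0.25}
cell_expr = {
    "tumor": {"GENE_A": 8.0, "GENE_B": 2.0},
    "T_cell": {"GENE_A": 0.0, "GENE_B": 4.0},
    "fibroblast": {"GENE_A": 4.0, "GENE_B": 0.0},
}
bulk = reconstruct_bulk(abundance, cell_expr)
print(bulk)
```

Deconvolution works in the opposite direction: given only the bulk values, it estimates the abundances and the per-cell-type profiles.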
Step 2: Image processing and feature extraction
- Run `python slide_processing/1main_processing.py <project_path> <slide_index>` to perform image pre-processing and feature extraction for a given slide. `<project_path>` is the path to your project files. `<slide_index>` is the index of the slide to be processed.
- Run `python slide_processing/collect_mask.py <project_path>` to collect mask files for all slides into a single file that can be used to evaluate slide quality.
- Run `python slide_processing/collect_features.py <project_path>` to create a file that contains features of image tiles for all slides.
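Because the processing script handles one slide per call, the three commands above can be assembled for a whole cohort with a small driver. The following is a sketch under stated assumptions: the project path and the number of slides are placeholders, whether slide indices start at 0 or 1 is not stated here, and the helper names `build_step2_commands` and `run_all` are hypothetical.

```python
import subprocess
import sys


def build_step2_commands(project_path, slide_indices):
    """Return the Step 2 commands in the order they are listed above."""
    commands = [
        [sys.executable, "slide_processing/1main_processing.py", project_path, str(i)]
        for i in slide_indices
    ]
    commands.append([sys.executable, "slide_processing/collect_mask.py", project_path])
    commands.append([sys.executable, "slide_processing/collect_features.py", project_path])
    return commands


def run_all(commands, dry_run=True):
    """Print each command; execute it as well when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing script


if __name__ == "__main__":
    # "/path/to/project" and range(3) are placeholders; check the script
    # for the actual indexing convention before running for real.
    run_all(build_step2_commands("/path/to/project", range(3)), dry_run=True)
```

Keeping `dry_run=True` prints the commands without executing them, which is a cheap way to confirm the paths before starting the per-slide jobs.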
Step 3: Predicting cell type-specific expression from the slide image features
- Run `python prediction/1main_cs_exp_regression.py <project_path> <outer_fold> <inner_fold> <cell_type>` to train and predict cell type-specific expression values from the slide image features. `<outer_fold>` and `<inner_fold>` are the outer and inner fold indexes, respectively, for nested cross-validation. `<cell_type>` is the name of the cell type of interest, given as a string.
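To make the roles of the two fold indexes concrete, the sketch below shows one common way nested cross-validation partitions samples: the outer fold sets aside a test set, and the inner fold splits the remaining samples into training and validation sets. The fold counts (5 and 5), the round-robin assignment, and the sample IDs are assumptions for illustration; the splitting used by the scripts may differ.

```python
def k_fold(items, n_folds):
    """Split items into n_folds nearly equal, disjoint folds (round-robin)."""
    return [items[i::n_folds] for i in range(n_folds)]


def nested_split(samples, outer_fold, inner_fold, n_outer=5, n_inner=5):
    """Return (train, validation, test) sample lists for one fold pair."""
    outer = k_fold(samples, n_outer)
    test = outer[outer_fold]  # held out from all model fitting and selection
    remaining = [s for i, fold in enumerate(outer) if i != outer_fold for s in fold]
    inner = k_fold(remaining, n_inner)
    validation = inner[inner_fold]  # used for model selection / early stopping
    train = [s for i, fold in enumerate(inner) if i != inner_fold for s in fold]
    return train, validation, test


samples = [f"slide_{i:02d}" for i in range(20)]  # hypothetical sample IDs
train, validation, test = nested_split(samples, outer_fold=0, inner_fold=1)
print(len(train), len(validation), len(test))
```

Every sample lands in exactly one of the three sets for a given fold pair, so test samples never influence training or model selection.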
Step 4: Predicting cell type abundance from the slide image features
- Run `python prediction/1main_abundance_regression.py <project_path> <outer_fold> <inner_fold>` to train and predict cell type abundances from the slide image features.
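Since each call covers a single pair of fold indexes, a full nested cross-validation run needs one call per pair. A minimal sketch for generating those calls follows. The fold counts, the zero-based indexing, the project path, and the helper name `fold_grid_commands` are assumptions rather than details taken from the scripts.

```python
from itertools import product


def fold_grid_commands(script, project_path, n_outer, n_inner, extra_args=()):
    """Build one command per (outer_fold, inner_fold) pair for a prediction script."""
    return [
        ["python", script, project_path, str(outer), str(inner), *extra_args]
        for outer, inner in product(range(n_outer), range(n_inner))
    ]


# n_outer=5 and n_inner=5 are placeholders, as is zero-based fold indexing.
commands = fold_grid_commands(
    "prediction/1main_abundance_regression.py", "/path/to/project", 5, 5
)
for cmd in commands[:3]:
    print(" ".join(cmd))
```

The same helper could build the Step 3 calls by passing the expression script and supplying the cell type name through `extra_args`. The resulting lists can be handed to `subprocess.run` or written to a job-array file for a cluster scheduler.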
Step 5: Classifying responders and non-responders to chemotherapy from the slide image features (direct model)
- Run `python prediction/1main_direct.py <project_path> <outer_fold> <inner_fold>` to train and classify patients into responders or non-responders to chemotherapy directly from the slide image features.
- In our paper, we investigated whether cell type-specific gene expression inferred by SLIDE-EX could predict neoadjuvant chemotherapy response using DECODEM, a computational framework for predicting chemotherapy response from deconvolved gene expression profiles developed in our lab [2]. As a baseline for comparison, we designed this supervised model in 1main_direct.py to classify responders and non-responders directly from the slide image features without the intermediate step of cell type-specific gene expression prediction.
4. References
1. Wang, K. et al. Deconvolving clinically relevant cellular immune cross-talk from bulk gene expression using CODEFACS and LIRICS stratifies patients with melanoma to anti-PD-1 therapy. Cancer Discov. 12, 1088–1105 (2022).
2. Dhruba, S. R. et al. Enhanced prediction of breast cancer patient response to chemotherapy by integrating deconvolved expression patterns of immune, stromal and tumor cells. Cancer Lett. 218101 (2025).
Files
| Name | Size | Checksum |
|---|---|---|
| SLIDE_EX_code.zip | 22.1 kB | md5:b7d557a05f6eacb3d755f4f2b3947ec8 |
Additional details
Software
- Programming language
- Python