Prediction of cancer treatment response from histopathology images through imputed transcriptomics

doi:10.5281/zenodo.8242989

Published August 13, 2023 | Version v0.0.1

Software Restricted


1. Biological Data Science Institute, College of Science, Australian National University, Canberra, ACT, Australia
2. Pangea Biomed Ltd., Tel Aviv, Israel
3. Department of Immunology, University of Pittsburgh, Pittsburgh, PA, USA; Tumor Microenvironment Center, UPMC Hillman Cancer Center, University of Pittsburgh, Pittsburgh, PA, USA
4. Breast Cancer Now Toby Robins Research Centre, The Institute of Cancer Research, London, United Kingdom; The Royal Marsden Hospital NHS Foundation Trust, London, United Kingdom; Cancer Research UK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Cambridge, UK
5. Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
6. Laboratory of Pathology, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
7. Oncology Institute, Sheba Medical Center at Tel-Hashomer, Tel Aviv University, Tel Aviv, Israel
8. Division of Medical Oncology, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
9. Thoracic and GI Malignancies Branch, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
10. Center for Immuno-Oncology, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
11. Surgical Oncology Program, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
12. Center for Immuno-Oncology, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
13. Laboratory of Genitourinary Cancer Pathogenesis, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
14. Cancer Research UK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Cambridge, UK
15. Genitourinary Malignancy Branch, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA

The DeepPT code for the manuscript "Prediction of cancer treatment response from histopathology images through imputed transcriptomics" is uploaded here. Please see the README for details.

-----

Introduction:

DeepPT (Deep Pathology for Transcriptomics) is a deep learning framework that predicts gene expression from histopathology images. DeepPT consists of four main components:

1. Image pre-processing: Split each whole slide image into tiles (patches) and keep only the tiles that contain tissue, discarding background tiles. Color normalization is applied to minimize staining variation (heterogeneity and batch effects).

2. Feature extraction: Use a pre-trained ResNet50 CNN to extract image features from the tiles. Each image tile is thereby represented by a vector of 2,048 derived features (the pre-trained ResNet features).

3. Feature compression: Compress the 2,048 pre-trained ResNet features to 512 features using an autoencoder (AE) network. This helps to exclude noise, avoid overfitting, and reduce computational demands.

4. Prediction: This component takes the AE features as input and predicts gene expression values as output.
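
To make the tile-selection idea in component 1 concrete, here is a minimal sketch in Python with NumPy. It is not the DeepPT implementation: the function name `select_tissue_tiles`, the tile size, the brightness cutoff, and the minimum tissue fraction are all assumptions chosen for illustration, and color normalization is left out.

```python
import numpy as np


def select_tissue_tiles(slide, tile_size=4, background_cutoff=220,
                        min_tissue_fraction=0.5):
    """Split an RGB image into non-overlapping tiles and keep those with tissue.

    All parameter values are illustrative assumptions, not DeepPT's settings.
    """
    height, width, _ = slide.shape
    kept = []
    # Step over the image in tile-sized strides, ignoring incomplete edge tiles.
    for top in range(0, height - tile_size + 1, tile_size):
        for left in range(0, width - tile_size + 1, tile_size):
            tile = slide[top:top + tile_size, left:left + tile_size]
            # Treat a pixel as tissue if its mean brightness is below the
            # cutoff (background on a stained slide is close to white).
            tissue_mask = tile.mean(axis=2) < background_cutoff
            if tissue_mask.mean() >= min_tissue_fraction:
                kept.append(((top, left), tile))
    return kept


# Toy 8x8 "slide": white background with one stained 4x4 region
# in the top-left corner.
slide = np.full((8, 8, 3), 255, dtype=np.uint8)
slide[0:4, 0:4] = (180, 90, 160)

tiles = select_tissue_tiles(slide)
tile_origins = [origin for origin, _ in tiles]
```

On the toy input only the stained tile is kept. Real whole slide images are far larger and are usually read through a dedicated slide-reading library, so the array-based input here is a simplification.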

DeepPT computational pipeline:

- Step 1: Run “11slide_processing/1main_processing.py” to perform image pre-processing and feature extraction. The script processes the slides in parallel.

- Step 2: Run “11slide_processing/collect_mask.py” to collect mask files into a single file “mask.pdf” that will be used to evaluate slide quality.

- Step 3: Run “11slide_processing/collect_features.py” to create a file that contains features of image tiles.

- Step 4: Run “12AE/1main_AE.py” to compress the 2,048 pre-trained features to 512 AE features.

- Step 5: Run “13DeepPT_train/1main_train.py” to train the model and predict gene expression from the AE features.
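
The five steps above can be chained with a small driver script. The sketch below is hypothetical and not part of the DeepPT code: it assumes the scripts are launched from the repository root and take no command-line arguments, which the README may contradict. The helper names `build_commands` and `run_pipeline` are made up for this example.

```python
import subprocess
import sys

# Script paths exactly as listed in the steps above, in execution order.
PIPELINE_SCRIPTS = [
    "11slide_processing/1main_processing.py",
    "11slide_processing/collect_mask.py",
    "11slide_processing/collect_features.py",
    "12AE/1main_AE.py",
    "13DeepPT_train/1main_train.py",
]


def build_commands(python=sys.executable):
    """Return one argv list per step (assumes the scripts take no arguments)."""
    return [[python, script] for script in PIPELINE_SCRIPTS]


def run_pipeline(dry_run=True):
    """Run the steps in order, stopping at the first failure.

    With dry_run=True (the default) the commands are only printed.
    """
    for step, command in enumerate(build_commands(), start=1):
        print(f"Step {step}: {' '.join(command)}")
        if not dry_run:
            # check=True raises CalledProcessError if a script exits non-zero.
            subprocess.run(command, check=True)


run_pipeline(dry_run=True)
```

Running the steps strictly in sequence matters because each one consumes the output of the one before it: the collected features feed the autoencoder, and the AE features feed training. Calling `run_pipeline(dry_run=False)` would execute the scripts for real.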

Files

Restricted

The record is publicly accessible, but files are restricted to users with access.

Request access


We are currently submitting our manuscript to a journal. At this stage, the DeepPT code is available upon reasonable request.



