Published March 28, 2024 | Version 1.0

DEPLOY: An integrated deep learning model for predicting DNA methylation and tumor types from H&E images

  • 1. Biological Data Science Institute, College of Science, Australian National University, Canberra, ACT, Australia
  • 2. Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
  • 3. Laboratory of Pathology, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
  • 4. Division of Neuropathology, Department of Pathology and Laboratory Medicine, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA

Description

Code associated with “Deep Learning Prediction of DNA Methylation and Tumor Types from Histopathology in Central Nervous System Tumors”, Nature Medicine 2024, by Danh-Tai Hoang*, Eldad D. Shulman*, Rust Turakulov, Zied Abdullaev, Omkar Singh, Emma M. Campagnolo, H. Lalchungnunga, Eric A. Stone, MacLean P. Nasrallah, Eytan Ruppin**, Kenneth Aldape**.

1. Introduction

DEPLOY (Deep lEarning from histoPathoLOgy and methYlation) is a deep learning framework that predicts DNA methylation and classifies brain tumor types from histopathology slide images.

The DEPLOY architecture consists of six main components: (1) image processing, (2) feature extraction, (3) feature compression, (4) the indirect model, (5) the direct model, and (6) the demographic model.

2. Installation

DEPLOY requires the following packages:

python 3.9.7

numpy 1.20.3

pandas 1.3.4

scikit-learn 1.2.2

matplotlib 3.4.3

openslide 1.1.2

opencv 4.5.4

PIL 8.4.0

pytorch 1.12.0
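
As a convenience, the installed versions can be compared against the list above before running the pipeline. The following is a minimal sketch, not part of the released code. The distribution names ("openslide-python", "opencv-python", "Pillow", "torch") are assumptions about how these packages are published on PyPI, and check_environment is a hypothetical helper name.

```python
from importlib import metadata

# Pinned versions from the list above. The keys are assumed PyPI distribution
# names, which can differ from the import names (e.g. PIL is shipped as Pillow).
REQUIRED = {
    "numpy": "1.20.3",
    "pandas": "1.3.4",
    "scikit-learn": "1.2.2",
    "matplotlib": "3.4.3",
    "openslide-python": "1.1.2",
    "opencv-python": "4.5.4",
    "Pillow": "8.4.0",
    "torch": "1.12.0",
}


def check_environment(required=REQUIRED):
    """Return {name: (expected, installed_or_None)} for every mismatch."""
    problems = {}
    for name, expected in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Prefix match, so builds such as "4.5.4.60" or "1.12.0+cu113" pass.
        if installed is None or not installed.startswith(expected):
            problems[name] = (expected, installed)
    return problems


if __name__ == "__main__":
    for name, (expected, installed) in check_environment().items():
        print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

An empty result means every listed package matches its pinned version. The Python version itself (3.9.7) is not checked here.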

3. DEPLOY computational pipeline

Step 1: Image processing and feature extraction

- Run “slide_processing/1main_processing.py” to perform image pre-processing and feature extraction. The script processes the slides in parallel.

- Run “slide_processing/collect_mask.py” to collect mask files into a single file “mask.pdf” that will be used to evaluate slide quality.

- Run “slide_processing/collect_features.py” to create a file that contains features of image tiles.
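
To illustrate what tile-level processing in Step 1 involves, the sketch below splits a slide into a grid of tiles and discards tiles that are mostly background. It is an illustration under stated assumptions, not the logic of “1main_processing.py”: the tile size (512), the brightness threshold (220), the minimum tissue fraction (0.5), and all function names are hypothetical.

```python
def tile_origins(width, height, tile_size=512):
    """Top-left corners of non-overlapping tiles that fit fully inside the slide."""
    return [
        (x, y)
        for y in range(0, height - tile_size + 1, tile_size)
        for x in range(0, width - tile_size + 1, tile_size)
    ]


def tissue_fraction(gray_tile, background_threshold=220):
    """Fraction of pixels darker than the threshold.

    gray_tile is a list of rows of 0-255 grayscale values; bright pixels are
    treated as empty glass (an assumed rule for this sketch).
    """
    pixels = [p for row in gray_tile for p in row]
    return sum(p < background_threshold for p in pixels) / len(pixels)


def keep_tile(gray_tile, min_tissue=0.5):
    """Keep a tile only if enough of it looks like tissue."""
    return tissue_fraction(gray_tile) >= min_tissue
```

In a real run the pixel data would come from the slide reader (for example openslide) rather than from nested lists, and a per-slide mask of kept tiles is the kind of output that can later be reviewed for slide quality.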

Step 2: Feature compression

- Run “auto_encoder/1main_AE.py” to compress the 2,048 features from the pre-trained model into 512 autoencoder (AE) features.
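
The compression in Step 2 can be pictured with a small PyTorch autoencoder. This is a minimal sketch assuming a single linear encoder layer, a single linear decoder layer, and a mean-squared-error reconstruction loss. Only the 2,048 and 512 dimensions come from the description above; the layer layout, optimizer, learning rate, and the class name TileAutoEncoder are assumptions and may differ from “1main_AE.py”.

```python
import torch
from torch import nn


class TileAutoEncoder(nn.Module):
    """Compress 2,048-dimensional tile features to a 512-dimensional code."""

    def __init__(self, in_dim=2048, code_dim=512):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, code_dim), nn.ReLU())
        self.decoder = nn.Linear(code_dim, in_dim)

    def forward(self, x):
        code = self.encoder(x)
        return self.decoder(code), code


def train_step(model, optimizer, batch):
    """One reconstruction step: the input is also the training target."""
    optimizer.zero_grad()
    reconstruction, _ = model(batch)
    loss = nn.functional.mse_loss(reconstruction, batch)
    loss.backward()
    optimizer.step()
    return loss.item()


torch.manual_seed(0)
model = TileAutoEncoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
features = torch.randn(16, 2048)  # random stand-in for real tile features
loss = train_step(model, optimizer, features)

with torch.no_grad():
    _, ae_features = model(features)
print(ae_features.shape)  # torch.Size([16, 512])
```

After training, only the encoder output is needed: each tile is represented by its 512 AE features in the later steps.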

Step 3: Predicting DNA methylation

- Run “methylation_prediction/1main_methylation.py” to train the model and predict methylation beta values from the AE features.
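
For readers new to the term, a beta value is the standard array-based measure of methylation at a CpG site and lies between 0 (unmethylated) and 1 (fully methylated). The sketch below shows the conventional definition from methylated and unmethylated probe intensities. The offset of 100 is the commonly used default, and beta_value is a hypothetical helper, not a function from this repository.

```python
def beta_value(methylated, unmethylated, offset=100.0):
    """Beta = M / (M + U + offset), computed from non-negative probe intensities.

    The offset stabilizes the ratio when both intensities are low.
    """
    if methylated < 0 or unmethylated < 0:
        raise ValueError("intensities must be non-negative")
    return methylated / (methylated + unmethylated + offset)


print(beta_value(300, 100))  # 0.6
```

Because the prediction targets are bounded in this way, predicted values outside the range 0 to 1 are not meaningful.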

Step 4: Classifying tumor types from the inferred methylation (indirect model)

- Run “indirect/1main_indirect.py” to train the model and classify tumor types from the inferred methylation beta values.

Step 5: Classifying tumor types from the AE features directly (direct model)

- Run “direct/1main_direct.py” to train the model and classify tumor types from the AE features.

Step 6: Classifying tumor types from demographics (demographic model)

- Run “demographic/1main_demographics.py” to train the model and classify tumor types from the demographics (age, sex, surgical location).
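
The six steps above can be chained with a small driver script. This is a sketch only: the script paths are taken from the steps above, but it assumes that each script takes no command-line arguments and is meant to be launched from its own folder, neither of which is stated in this description. run_pipeline and its dry_run flag are hypothetical.

```python
import subprocess
import sys
from pathlib import Path

# Scripts in the order given in Steps 1 to 6.
PIPELINE = [
    "slide_processing/1main_processing.py",
    "slide_processing/collect_mask.py",
    "slide_processing/collect_features.py",
    "auto_encoder/1main_AE.py",
    "methylation_prediction/1main_methylation.py",
    "indirect/1main_indirect.py",
    "direct/1main_direct.py",
    "demographic/1main_demographics.py",
]


def run_pipeline(repo_root=".", scripts=PIPELINE, dry_run=False):
    """Run each script in order from its own folder; stop at the first failure.

    Returns the list of (working_directory, command) pairs. With dry_run=True
    nothing is executed and the plan is only returned.
    """
    commands = []
    for script in scripts:
        path = Path(repo_root) / script
        command = [sys.executable, path.name]
        commands.append((str(path.parent), command))
        if not dry_run:
            # check=True raises CalledProcessError if a step exits with an error.
            subprocess.run(command, cwd=path.parent, check=True)
    return commands
```

Calling run_pipeline(dry_run=True) returns the planned commands without executing anything, which is a cheap way to confirm the paths before starting a long run.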

4. License and Terms of use

This model and its associated code are the subject of a provisional US patent application (No. 63/626,277) and may be used solely for non-commercial, academic research purposes. Commercial use, sale, or any form of monetization of the DEPLOY model is strictly prohibited without prior approval. Commercial entities interested in using the model should contact the corresponding authors for authorization.
