DEPLOY: An integrated deep learning model for predicting DNA methylation and tumor types from H&E images
Creators
- 1. Biological Data Science Institute, College of Science, Australian National University, Canberra, ACT, Australia
- 2. Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
- 3. Laboratory of Pathology, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
- 4. Division of Neuropathology, Department of Pathology and Laboratory Medicine, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA
Description
DEPLOY: An integrated deep learning model for predicting DNA methylation and tumor types from H&E images
Code associated with “Deep Learning Prediction of DNA Methylation and Tumor Types from Histopathology in Central Nervous System Tumors”, Nature Medicine 2024, by Danh-Tai Hoang*, Eldad D. Shulman*, Rust Turakulov, Zied Abdullaev, Omkar Singh, Emma M. Campagnolo, H. Lalchungnunga, Eric A. Stone, MacLean P. Nasrallah, Eytan Ruppin**, Kenneth Aldape**.
1. Introduction
DEPLOY (Deep lEarning from histoPathoLOgy and methYlation) is a deep learning framework that predicts DNA methylation and classifies brain tumor types from histopathology slide images.
The DEPLOY architecture consists of six main components: (1) image processing, (2) feature extraction, (3) feature compression, (4) indirect model, (5) direct model, and (6) demographic model.
2. Installation
DEPLOY requires the following packages:
python 3.9.7
numpy 1.20.3
pandas 1.3.4
scikit-learn 1.2.2
matplotlib 3.4.3
openslide 1.1.2
OpenCV 4.5.4
Pillow (PIL) 8.4.0
pytorch 1.12.0
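The listed versions can be compared against the current environment with a short script. The following is a minimal sketch, not part of the DEPLOY code: the helper name check_versions is hypothetical, and the distribution names used for openslide, OpenCV, PIL and pytorch are assumptions about how these packages were installed. The Python version itself is not checked.

```python
from importlib import metadata

# Versions listed above, keyed by the name each package is ASSUMED to carry
# in the installed environment (pip distribution names).
REQUIRED = {
    "numpy": "1.20.3",
    "pandas": "1.3.4",
    "scikit-learn": "1.2.2",
    "matplotlib": "3.4.3",
    "openslide-python": "1.1.2",  # assumed distribution name for openslide
    "opencv-python": "4.5.4",     # assumed distribution name for OpenCV
    "Pillow": "8.4.0",            # assumed distribution name for PIL
    "torch": "1.12.0",            # assumed distribution name for pytorch
}


def check_versions(required=REQUIRED):
    """Return {package: (required, installed)} for every mismatch.

    `installed` is None when the package is missing. A prefix match is used
    so that a build such as 4.5.4.60 satisfies the listed 4.5.4.
    """
    problems = {}
    for name, wanted in required.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found is None or not found.startswith(wanted):
            problems[name] = (wanted, found)
    return problems


if __name__ == "__main__":
    for name, (wanted, found) in check_versions().items():
        print(f"{name}: need {wanted}, found {found or 'not installed'}")
```

An empty result means every listed package is installed at the listed version.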
3. DEPLOY computational pipeline
Step 1: Image processing and feature extraction
- Run “slide_processing/1main_processing.py” to perform image pre-processing and feature extraction. The script processes all slides in parallel.
- Run “slide_processing/collect_mask.py” to collect mask files into a single file “mask.pdf” that will be used to evaluate slide quality.
- Run “slide_processing/collect_features.py” to create a file that contains features of image tiles.
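To illustrate what tile-level pre-processing typically involves, the sketch below splits an image into non-overlapping tiles and keeps only those that contain enough tissue. It is an illustration under stated assumptions, not the code in “slide_processing”: the function name tile_image, the tile size, the tissue threshold and the brightness cut-off are all hypothetical, and a synthetic array stands in for a real slide.

```python
import numpy as np


def tile_image(image, tile_size=512, min_tissue=0.5, white_level=220):
    """Split an RGB image into non-overlapping tiles and keep tissue-rich ones.

    image: uint8 array of shape (H, W, 3).
    Returns a list of ((row, col), tile) pairs, where (row, col) is the
    top-left pixel of the tile. tile_size, min_tissue and white_level are
    illustrative values, not settings taken from DEPLOY.
    """
    height, width = image.shape[:2]
    kept = []
    for row in range(0, height - tile_size + 1, tile_size):
        for col in range(0, width - tile_size + 1, tile_size):
            tile = image[row:row + tile_size, col:col + tile_size]
            # A pixel counts as tissue when its mean intensity is below white_level.
            tissue_fraction = (tile.mean(axis=2) < white_level).mean()
            if tissue_fraction >= min_tissue:
                kept.append(((row, col), tile))
    return kept


# Synthetic "slide": white background with one dark square of tissue.
slide = np.full((1024, 1024, 3), 255, dtype=np.uint8)
slide[0:512, 0:512] = 120
tiles = tile_image(slide)
```

In this toy case only the tile covering the dark square passes the tissue filter; the three white tiles are discarded.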
Step 2: Feature compression
- Run “auto_encoder/1main_AE.py” to compress the 2,048 features extracted by the pre-trained model into 512 autoencoder (AE) features.
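The compression step can be pictured as a small autoencoder that maps 2,048 inputs to a 512-dimensional code and back. The sketch below is a minimal PyTorch example of that idea only; the single-layer encoder and decoder, the ReLU activation, the optimizer settings and the class name FeatureAutoencoder are assumptions, and random numbers stand in for real tile features.

```python
import torch
from torch import nn


class FeatureAutoencoder(nn.Module):
    """Compress 2,048-dimensional features to 512 and reconstruct them."""

    def __init__(self, in_dim=2048, code_dim=512):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, code_dim), nn.ReLU())
        self.decoder = nn.Linear(code_dim, in_dim)

    def forward(self, x):
        code = self.encoder(x)
        return self.decoder(code), code


torch.manual_seed(0)
features = torch.rand(64, 2048)  # stand-in for real tile features
model = FeatureAutoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for step in range(5):  # a few steps, just to show the training loop
    reconstruction, code = model(features)
    loss = loss_fn(reconstruction, features)  # reconstruction error
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

with torch.no_grad():
    ae_features = model.encoder(features)  # compressed features, shape (64, 512)
```

After training, only the encoder is needed to produce the AE features used in the later steps.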
Step 3: Predicting DNA methylation
- Run “methylation_prediction/1main_methylation.py” to train the model and predict methylation beta values from the AE features.
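The shape of this regression problem (one 512-dimensional feature vector in, one beta value per CpG site out, each between 0 and 1) can be shown with a simple stand-in model. The sketch below uses closed-form ridge regression in NumPy purely for illustration; it is not the model trained by “1main_methylation.py”, and the function names, the regularization strength and the synthetic data are all assumptions.

```python
import numpy as np


def fit_ridge(features, betas, alpha=1.0):
    """Closed-form ridge regression from features to beta values.

    features: (n_samples, n_features); betas: (n_samples, n_cpg_sites).
    Returns weights of shape (n_features + 1, n_cpg_sites); the last row is
    the intercept, which is left unpenalized.
    """
    x = np.hstack([features, np.ones((features.shape[0], 1))])
    penalty = alpha * np.eye(x.shape[1])
    penalty[-1, -1] = 0.0
    return np.linalg.solve(x.T @ x + penalty, x.T @ betas)


def predict_betas(features, weights):
    """Predict beta values and clip them to the valid range [0, 1]."""
    x = np.hstack([features, np.ones((features.shape[0], 1))])
    return np.clip(x @ weights, 0.0, 1.0)


rng = np.random.default_rng(0)
ae_features = rng.normal(size=(200, 512))  # synthetic AE features
true_weights = rng.normal(scale=0.01, size=(512, 10))
measured = np.clip(0.5 + ae_features @ true_weights, 0.0, 1.0)  # synthetic betas

weights = fit_ridge(ae_features[:150], measured[:150])   # train on 150 samples
predicted = predict_betas(ae_features[150:], weights)    # predict the other 50
```

Clipping keeps the predictions inside the range in which beta values are defined, whatever model produces them.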
Step 4: Classifying tumor types from the inferred methylation (indirect model)
- Run “indirect/1main_indirect.py” to train the classifier and predict tumor types from the inferred methylation beta values.
Step 5: Classifying tumor types from the AE features directly (direct model)
- Run “direct/1main_direct.py” to train the classifier and predict tumor types from the AE features.
Step 6: Classifying tumor types from demographics (demographic model)
- Run “demographic/1main_demographics.py” to train the classifier and predict tumor types from the demographics (age, sex, surgical location).
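The six steps above can be chained in a small driver script. The following is a sketch rather than part of the DEPLOY code: the script paths are the ones named in Steps 1 to 6, but the helper names, the choice to launch each script from its own folder, and the absence of command-line arguments are assumptions that may not match how the scripts expect to be run.

```python
import subprocess
import sys
from pathlib import Path

# Scripts in the order given in Steps 1 to 6 above.
PIPELINE = [
    "slide_processing/1main_processing.py",
    "slide_processing/collect_mask.py",
    "slide_processing/collect_features.py",
    "auto_encoder/1main_AE.py",
    "methylation_prediction/1main_methylation.py",
    "indirect/1main_indirect.py",
    "direct/1main_direct.py",
    "demographic/1main_demographics.py",
]


def build_commands(repo_root=".", python=sys.executable):
    """Return one (command, working_directory) pair per pipeline script.

    Each script is launched from its own folder with no arguments; both are
    assumptions, since the scripts may expect a different working directory
    or command-line options.
    """
    root = Path(repo_root)
    commands = []
    for script in PIPELINE:
        path = root / script
        commands.append(([python, path.name], path.parent))
    return commands


def run_pipeline(repo_root=".", dry_run=True):
    """Print each command and, unless dry_run is set, run it in order.

    With check=True the first failing script raises an error and stops the run.
    """
    for command, cwd in build_commands(repo_root):
        print(f"[{cwd}] {' '.join(command)}")
        if not dry_run:
            subprocess.run(command, cwd=cwd, check=True)


if __name__ == "__main__":
    run_pipeline(dry_run=True)
```

With dry_run=True the script only prints the commands, which is a safe way to check the order and paths before launching the long-running steps.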
4. License and Terms of use
This model and its associated code are the subject of a provisional US patent application (No. 63/626,277) and may be used solely for non-commercial, academic research purposes. Commercial use, sale, or any other form of monetization of the DEPLOY model is strictly prohibited without prior approval. Commercial entities interested in using the model should contact the corresponding authors for authorization.
Files
(97.8 MB)
MD5 checksum | Size |
---|---|
89c15e87aabd03fe88e0bbd5a839d447 | 76.2 kB |
ef69bed7d6770aa9114038918192ada6 | 128.9 kB |
aaefe0bdfb0986879ccfaa996f0e9f09 | 7.8 kB |
d3c1d76aa2f5caeba96b2b93764be74d | 4.2 kB |
bbd6dde2e2fa9e17f4809ac76d519f10 | 100.1 kB |
de3171292b4d117b2adccc499b77e836 | 2.1 MB |
d0bc206c5c99a67ec7b89c6cd1d88178 | 95.3 MB |