Published October 8, 2024 | Version v1
Journal article Open

Explainable multi-omics deep clustering reveals an important role of DNA methylation in PDAC


Patients with pancreatic ductal adenocarcinoma (PDAC) have the lowest survival rate among all cancer patients in Europe. Since Western societies have the highest incidence of pancreatic cancer, it has been projected that PDAC will soon become the second leading cause of cancer-related deaths. The main challenge of PDAC treatment is that patients with similar somatic genotypes exhibit a wide range of disease phenotypes. Artificial intelligence (AI) is currently transforming the field of healthcare and represents a promising technology for integrating various datasets and optimizing evidence-based decision-making. However, the interpretability of most AI models is limited, and it is challenging to understand how and why a decision is made. In this study, we developed a deep clustering model for PDAC patient stratification using integrated methylation and gene expression data. We placed specific emphasis on model explainability, with the aim of understanding the hidden multi-modal patterns learned by the model. The model identified two subgroups of PDAC patients with different prognoses and biological factors. We performed several follow-up analyses to measure the relative contribution of each modality to the clustering solution. This multi-omics profile analysis revealed an important role of DNA methylation, partially supported by previous experimental studies. We also show how the model learned the underlying patterns in a multi-modal setting, where individual hidden neurons are specialized either in single data modalities or in their combinations. We hope this study will help to promote more explainable AI in real-world clinical applications, where knowledge of the decision factors is crucial. The code of this project is publicly available on GitHub (


