There is a newer version of the record available.

Published November 21, 2021 | Version V1.0.0
Preprint Open

Translating Medicine to Mechanism: Enhancing Clinical Phenotypes with Mechanistic Knowledge

Authors/Creators

  • 1. Computational Bioscience Program, University of Colorado Anschutz Medical Campus

Description

Objective: Understanding the molecular drivers of disease is a vital component of personalized medicine. Unfortunately, molecular data are not currently available in most electronic health records (EHRs). To solve this problem, we created Med2Mech, a joint learning framework for inferring molecular characterizations of patients from clinical data and publicly available biomedical data.

Methods: Med2Mech was evaluated using pediatric EHR data from a subset of rare disease patients and other similarly medically complex patients. First, patient-level clinical embeddings were generated. Then, a PheKnowLator knowledge graph (KG) was used to generate mechanism embeddings. Finally, patient-level mechanism embeddings were derived by summing or averaging each patient’s unique set of mechanism embeddings. A one-vs-the-rest multiclass classification strategy, with five-fold cross-validation, was used to evaluate the discriminatory ability of the mechanism and clinical embeddings. Rare disease subphenotype differences, using both clinical and mechanism embeddings, were further investigated using K-Means clustering, and the resulting clusters were verified by PhD- and MD-level domain experts. As external validation, the ability to infer the genotype of the rare disease patients using an independent sample of publicly available transcriptomic data was examined.
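The aggregation step in the Methods (summing or averaging each patient's unique set of mechanism embeddings) can be illustrated with a minimal sketch. This is not the Med2Mech implementation: the function name, the toy concept identifiers, the three-dimensional vectors, and the choice to skip concepts that have no embedding are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Sequence

Vector = List[float]


def patient_mechanism_embedding(
    concept_ids: Iterable[str],
    mechanism_embeddings: Dict[str, Sequence[float]],
    how: str = "mean",
) -> Vector:
    """Combine the mechanism embeddings of a patient's unique concepts.

    Hypothetical helper: `how` selects element-wise "sum" or "mean".
    """
    if how not in ("sum", "mean"):
        raise ValueError("how must be 'sum' or 'mean'")

    # Deduplicate the patient's concepts so repeated codes count once.
    unique_ids = sorted(set(concept_ids))

    # Assumption: concepts without a mechanism embedding are skipped.
    vectors = [
        mechanism_embeddings[c] for c in unique_ids if c in mechanism_embeddings
    ]
    if not vectors:
        raise ValueError("no concept has a mechanism embedding")

    # Element-wise sum across the patient's concept vectors.
    total = [sum(values) for values in zip(*vectors)]
    if how == "sum":
        return total
    # Element-wise mean: divide each summed component by the vector count.
    return [value / len(vectors) for value in total]


# Toy data (hypothetical identifiers and values, not from the study).
toy_embeddings = {
    "concept_a": [1.0, 0.0, 2.0],
    "concept_b": [3.0, 4.0, 0.0],
}
patient_concepts = ["concept_a", "concept_b", "concept_a", "concept_unmapped"]

summed = patient_mechanism_embedding(patient_concepts, toy_embeddings, how="sum")
averaged = patient_mechanism_embedding(patient_concepts, toy_embeddings, how="mean")
```

Summing preserves information about how many concepts a patient has, while averaging normalizes for it; which behaves better for a given task is an empirical question, which is presumably why both variants were evaluated.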

Results: Clinical embeddings were built for four rare disease groups (n = 2,646) and 10,000 similarly complex patients using 6,382 conditions, 2,334 medications, and 272 measurements. Mechanism embeddings were generated from a PheKnowLator KG with 129,875 nodes and 3,838,935 edges. On classification, the mechanism embeddings outperformed 82.2% of the clinical embedding parameterizations. Domain expert review confirmed that the mechanism embeddings produced more clinically relevant clusters of comorbidities for each rare disease subphenotype than the clinical embeddings. External validation further demonstrated the utility of this framework by accurately inferring the genotype and phenotype of EHR-derived rare disease patients from publicly available molecular data.

Conclusion: These results illustrate the translational utility of PheKnowLator KGs and demonstrate that it is possible to derive clinically meaningful and biologically relevant patient representations from disparate sources of EHR data and expert-curated publicly available transcriptomic data.

Notes

This submission serves as a placeholder for a preprint that is being submitted to arXiv. As soon as a valid DOI has been produced, this submission will be updated with the preprint PDF, the DOI, and the submission authors.

Files

Med2Mech_KG_Overview.png (6.4 MB)
md5:52b7b8eb4ddc34850e9bfbfa401b97f9
