Translating Medicine to Mechanism: Enhancing Clinical Phenotypes with Mechanistic Knowledge
Authors/Creators
- 1. Computational Bioscience Program, University of Colorado Anschutz Medical Campus
Description
Objective: Understanding the molecular drivers of disease is a vital component of personalized medicine. Unfortunately, molecular data are not currently available in most electronic health records (EHRs). To address this gap, we created Med2Mech, a joint learning framework for inferring molecular characterizations of patients from clinical data and publicly available biomedical data.
Methods: Med2Mech was evaluated using pediatric EHR data from a subset of rare disease patients and other similarly medically complex patients. First, patient-level clinical embeddings were generated. Then, a PheKnowLator knowledge graph (KG) was used to generate mechanism embeddings. Finally, patient-level mechanism embeddings were derived by summing or averaging each patient’s unique set of mechanism embeddings. A one-vs-the-rest multiclass classification strategy with five-fold cross-validation was used to evaluate the discriminatory ability of the mechanism and clinical embeddings. Rare disease subphenotype differences were further investigated by clustering both the clinical and mechanism embeddings with K-Means; the resulting clusters were verified by PhD- and MD-level domain experts. As external validation, the ability to infer the genotype of the rare disease patients using an independent sample of publicly available transcriptomic data was examined.
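The aggregation step described in the Methods can be illustrated with a minimal sketch. It is not the Med2Mech implementation: the function name `aggregate_patient_embedding`, the concept identifiers, and the two-dimensional vectors are hypothetical and chosen only to show how a patient's unique set of mechanism embeddings might be summed or averaged into one patient-level vector.

```python
from typing import Dict, Iterable, List

Vector = List[float]


def aggregate_patient_embedding(
    concept_ids: Iterable[str],
    concept_embeddings: Dict[str, Vector],
    how: str = "mean",
) -> Vector:
    """Combine a patient's unique concept embeddings into one vector.

    Assumption (not from the source): concepts without an embedding
    are skipped rather than treated as an error.
    """
    if how not in ("sum", "mean"):
        raise ValueError("how must be 'sum' or 'mean'")

    # Deduplicate so that a concept recorded many times counts once,
    # mirroring "each patient's unique set of mechanism embeddings".
    unique_ids = sorted({c for c in concept_ids if c in concept_embeddings})
    if not unique_ids:
        raise ValueError("patient has no concepts with an embedding")

    dim = len(concept_embeddings[unique_ids[0]])
    total = [0.0] * dim
    for cid in unique_ids:
        for i, value in enumerate(concept_embeddings[cid]):
            total[i] += value

    if how == "sum":
        return total
    return [t / len(unique_ids) for t in total]


# Hypothetical toy data: two concepts with 2-dimensional embeddings.
toy_embeddings = {
    "concept_a": [1.0, 2.0],
    "concept_b": [3.0, 4.0],
}
# The repeated "concept_a" is counted once.
patient_concepts = ["concept_a", "concept_b", "concept_a"]

summed = aggregate_patient_embedding(patient_concepts, toy_embeddings, how="sum")
averaged = aggregate_patient_embedding(patient_concepts, toy_embeddings, how="mean")
```

With the toy data, `summed` is `[4.0, 6.0]` and `averaged` is `[2.0, 3.0]`. Summing preserves information about how many distinct concepts a patient has, while averaging normalizes for it; which choice works better is an empirical question that the classification evaluation described above is designed to answer.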
Results: Clinical embeddings were built for four rare disease groups (n = 2,646) and 10,000 similarly complex patients using 6,382 conditions, 2,334 medications, and 272 measurements. Mechanism embeddings were generated from a PheKnowLator KG with 129,875 nodes and 3,838,935 edges. On classification, the mechanism embeddings outperformed 82.2% of the clinical embedding parameterizations. Domain expert review confirmed that the mechanism embeddings produced more clinically relevant clusters of comorbidities for each rare disease subphenotype than the clinical embeddings. External validation further demonstrated the utility of this framework by accurately inferring the genotype and phenotype of EHR-derived rare disease patients from publicly available molecular data.
Conclusion: These results illustrate the translational utility of PheKnowLator KGs and demonstrate that it is possible to derive clinically meaningful and biologically relevant patient representations from disparate sources of EHR data and expert-curated publicly available transcriptomic data.
Files
| Name | Size |
|---|---|
| Med2Mech_KG_Overview.png (md5:52b7b8eb4ddc34850e9bfbfa401b97f9) | 6.4 MB |
Additional details
Related works
- References
- https://github.com/callahantiff/PheKnowLator (URL)
- https://github.com/callahantiff/OMOP2OBO (URL)