There is a newer version of the record available.

Published November 20, 2021 | Version v1.0.0
Preprint Open

Phenotype Knowledge Translator: A FAIR Ecosystem for Representing Large-Scale Biomedical Knowledge

  • 1. University of Colorado Anschutz Medical Campus

Description

Although knowledge graphs (KGs) are used extensively in the biomedical domain to model complex phenomena, existing construction methods remain largely unable to account for the use of different standards and knowledge models, are limited to specific use cases, are often difficult to use, and perform poorly as the size of the resulting KGs increases. To solve these problems, PheKnowLator, an ecosystem for FAIR construction of ontologically grounded KGs, was developed. PheKnowLator KGs are fully customizable: users can choose among alternative knowledge models, unidirectional or bidirectional relations, and builds with or without semantic property graph abstraction. The PheKnowLator Ecosystem was evaluated in two ways. First, a survey of existing open-source KG construction methods available on GitHub was performed. Then, PheKnowLator was used to construct KGs of human disease mechanisms, and the computational performance of the 12 builds was recorded. The survey identified 14 existing open-source KG construction methods and found that the PheKnowLator Ecosystem was as good as or better than the other methods in terms of its KG construction functionality, maturity, availability, usability, and reproducibility. The KGs of human disease mechanisms were built by applying PheKnowLator to 12 Open Biological and Biomedical Ontology (OBO) Foundry ontologies and 31 publicly available resources using 15 edge types. The resulting KGs varied significantly in size, from 737,556 nodes and 5,487,821 edges with 293 unique edge types to 15,903,225 nodes and 47,420,725 edges with 847 unique edge types. Computational performance varied by build step and by build. On average, the data download step used the fewest resources (3.5 min; 7.9 GiB) and the KG construction step used the most (319.6 min; 119.7 GiB). The subclass-based build with inverse relations and OWL-NETS transformation took the longest and used the most memory (615.9 min; 147.1 GiB).
Although additional experiments are needed to demonstrate the value of the 12 different builds, the PheKnowLator Ecosystem is one of the first fully customizable open-source KG construction frameworks able to provide a wide range of functionality without compromising usability.
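The choice between unidirectional and bidirectional relations mentioned above can be illustrated with a small, generic sketch. It is not PheKnowLator's implementation or API: the function name `add_inverse_edges`, the `INVERSE_OF` mapping, and all node and relation labels are hypothetical and chosen only for illustration. The sketch assumes a KG stored as a list of (subject, relation, object) triples and adds the inverse of each edge whose relation has a declared inverse.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical relation-to-inverse mapping; these labels are illustrative
# and are not taken from PheKnowLator or any specific ontology.
INVERSE_OF: Dict[str, str] = {
    "causes": "caused_by",
    "interacts_with": "interacts_with",  # a symmetric relation is its own inverse
}


def add_inverse_edges(edges: List[Triple], inverse_of: Dict[str, str]) -> List[Triple]:
    """Return the original edges plus an inverse edge wherever one is declared."""
    seen = set(edges)
    result = list(edges)
    for subj, rel, obj in edges:
        inv = inverse_of.get(rel)
        if inv is None:
            continue  # no declared inverse: the edge stays unidirectional
        candidate = (obj, inv, subj)
        if candidate not in seen:  # avoid duplicating edges already present
            seen.add(candidate)
            result.append(candidate)
    return result


# Hypothetical toy graph with unidirectional edges only.
unidirectional = [
    ("gene_A", "causes", "disease_X"),
    ("gene_A", "interacts_with", "gene_B"),
    ("disease_X", "has_phenotype", "phenotype_Y"),
]
bidirectional = add_inverse_edges(unidirectional, INVERSE_OF)
```

In this toy case the edge count grows from three to five, which gives some intuition for why builds with inverse relations are larger and more expensive to construct.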

Notes

This submission serves as a placeholder for a preprint that is being submitted to arXiv. As soon as a valid DOI has been produced, this submission will be updated with the preprint PDF, the DOI, and the submission authors.

Files

PheKnowLator_Ecosystem_Overview_.png

Files (4.0 MB)

Checksum | Size
md5:beb5dbb9350edb3c5fc702335c945f3b | 729.2 kB
md5:688dc3bc6f9f871159b258b2e169c491 | 791.2 kB
md5:7d7da6806ecb89a3ab4c94d3c06dddbe | 2.5 MB
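
A downloaded file can be checked against the md5 values listed above. The following is a minimal sketch using only Python's standard library; the helper names `md5_of_file` and `matches_listed_checksum` are hypothetical, and the path of the downloaded file is left to the reader because the file names are not shown in this listing.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path: str, listed: str) -> bool:
    """Compare a file against a listed value such as 'md5:<hex digest>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

Calling `matches_listed_checksum` with a local path and one of the listed `md5:` strings returns `True` only when the download is byte-for-byte identical to the deposited file.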

Additional details

Related works