PheKnowLator Human Disease Knowledge Graph Benchmarks Archive

PheKnowLator Ecosystem Developers

doi:10.5281/zenodo.10065431

Published November 1, 2023 | Version v3.6.0

Dataset Open

PheKnowLator Ecosystem Developers¹

1. CU Anschutz Medical Campus

PKT Human Disease KG Benchmark Builds

The PheKnowLator (PKT) Human Disease KG (PKT-KG) was built to model mechanisms of human disease. It includes the Central Dogma and represents multiple biological scales of organization, including the molecular, cellular, tissue, and organ levels. The knowledge representation was designed in collaboration with a PhD-level molecular biologist (Figure).

The PKT Human Disease KG was constructed using 12 OBO Foundry ontologies, 31 Linked Open Data sets, and results from two large-scale experiments (Supplementary Material). The 12 OBO Foundry ontologies were selected to represent chemicals and vaccines (ChEBI and the Vaccine Ontology), cells and cell lines (Cell Ontology and Cell Line Ontology), gene and gene product attributes (Gene Ontology), phenotypes and diseases (Human Phenotype Ontology and Mondo Disease Ontology), proteins, including complexes and isoforms (Protein Ontology), pathways (Pathway Ontology), types and attributes of biological sequences (Sequence Ontology), and anatomical entities (Uberon). The Relation Ontology (RO) provides the relationships between the core OBO Foundry ontologies and database entities.

The PKT Human Disease KG contained 18 node types and 33 edge types. Note that these counts reflect only the node and edge types explicitly added to the core set of OBO Foundry ontologies; they do not include the node and edge types provided by the ontologies themselves. These node and edge types were used to construct 12 different PKT Human Disease benchmark KGs by altering three parameters: the Knowledge Model (i.e., class- vs. instance-based), the Relation Strategy (i.e., standard vs. inverse relations), and Semantic Abstraction (i.e., OWL-NETS (yes/no), with and without Knowledge Model harmonization [OWL-NETS Only vs. OWL-NETS + Harmonization]). Benchmarks within the PheKnowLator ecosystem are different versions of a KG that can be built under alternative knowledge models and relation strategies, with or without semantic abstraction. They let users evaluate different modeling decisions (based on the parameters above) and examine the impact of these decisions on different downstream tasks.
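The count of 12 benchmark KGs follows from crossing the three build parameters: two knowledge models, two relation strategies, and three semantic abstraction settings (2 x 2 x 3 = 12). The sketch below enumerates these combinations. It assumes that the "no OWL-NETS" setting corresponds to the plain OWL build, and the label strings are illustrative rather than the official build identifiers.

```python
from itertools import product

# Illustrative labels only; these are not official PheKnowLator build identifiers.
KNOWLEDGE_MODELS = ("class-based", "instance-based")
RELATION_STRATEGIES = ("standard relations", "inverse relations")
# Assumption: "no OWL-NETS" is read here as the plain OWL build.
SEMANTIC_ABSTRACTIONS = (
    "OWL (no OWL-NETS)",
    "OWL-NETS Only",
    "OWL-NETS + Harmonization",
)


def enumerate_benchmarks():
    """Return every (knowledge model, relation strategy, abstraction) combination."""
    return list(product(KNOWLEDGE_MODELS, RELATION_STRATEGIES, SEMANTIC_ABSTRACTIONS))


if __name__ == "__main__":
    # Print a numbered list of the parameter combinations.
    for index, (model, relations, abstraction) in enumerate(enumerate_benchmarks(), start=1):
        print(f"{index:2d}. {model} | {relations} | {abstraction}")
```

Each tuple corresponds to one benchmark configuration, which may be a convenient key when comparing results across builds.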

The Figures and Tables explaining attributes in the builds can be found here.

Build Data Access

Important Build Information

The benchmarks were originally built and stored using Google Cloud Platform (GCP) resources. Details and a complete description of this process can be found on GitHub (here). Note that we have developed this Zenodo-based archive for the builds. While the original GCP resources contained all of the resources needed to generate the builds, the file size upload limits associated with each archive mean that we have limited the uploaded files to the KGs, associated metadata, and log files. The list of resources, including their URLs and dates of download, can be found in the logs associated with each build.

🗂 For additional information on the KG file types, please see the following Wiki page, which is also available as a download from this repository (PheKnowLator_HumanDiseaseKG_Output_FileInformation.xlsx).

v1.0.0

KGs: https://zenodo.org/doi/10.5281/zenodo.7030200
Embeddings: https://zenodo.org/doi/10.5281/zenodo.7030188

All Other Build Versions

Class-based Builds

Standard Relations

OWL Build
- v2.0.0: MAY2020; JAN2021; FEB2021
- v2.1.0: MAY2021; JUN2021; JUL2021; AUG2021; SEP2021
- v3.0.2: OCT2021; NOV2021
OWL-NETS Build
- v2.0.0: MAY2020; JAN2021; FEB2021
- v2.1.0: MAY2021; JUN2021; JUL2021; AUG2021; SEP2021
- v3.0.2: OCT2021; NOV2021

Inverse Relations

OWL Build
- v2.0.0: MAY2020; JAN2021; FEB2021
- v2.1.0: MAY2021; JUN2021; JUL2021; AUG2021; SEP2021
- v3.0.2: OCT2021; NOV2021
OWL-NETS Build
- v2.0.0: MAY2020; JAN2021; FEB2021
- v2.1.0: MAY2021; JUN2021; JUL2021; AUG2021; SEP2021
- v3.0.2: OCT2021; NOV2021

Instance-based Builds

Standard Relations

OWL Build
- v2.0.0: MAY2020; JAN2021; FEB2021
- v2.1.0: MAY2021; JUN2021; JUL2021; AUG2021; SEP2021
- v3.0.2: OCT2021; NOV2021
OWL-NETS Build
- v2.0.0: MAY2020; JAN2021; FEB2021
- v2.1.0: MAY2021; JUN2021; JUL2021; AUG2021; SEP2021
- v3.0.2: OCT2021; NOV2021

Inverse Relations

OWL Build
- v2.0.0: MAY2020; JAN2021; FEB2021
- v2.1.0: MAY2021; JUN2021; JUL2021; AUG2021; SEP2021
- v3.0.2: OCT2021; NOV2021
OWL-NETS Build
- v2.0.0: MAY2020; JAN2021; FEB2021
- v2.1.0: MAY2021; JUN2021; JUL2021; AUG2021; SEP2021
- v3.0.2: OCT2021; NOV2021

Files

Files (17.4 kB)

Name	MD5	Size
PheKnowLator_HumanDiseaseKG_Output_FileInformation.xlsx	71b044a87de10a34eda5ef0b9f859cb7	17.4 kB
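The MD5 checksum in the file listing can be used to confirm that a downloaded copy is intact. The following is a minimal sketch using only the Python standard library. The local path is an assumption about where the file was saved, and the helper names (md5_of_file, matches_checksum) are hypothetical and not part of any PheKnowLator tooling.

```python
import hashlib
from pathlib import Path

# Checksum copied from the file listing above.
EXPECTED_MD5 = "71b044a87de10a34eda5ef0b9f859cb7"
# Assumption: the file was downloaded into the current working directory.
LOCAL_PATH = Path("PheKnowLator_HumanDiseaseKG_Output_FileInformation.xlsx")


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """Return True if the file's MD5 digest equals the expected hex string."""
    return md5_of_file(path) == expected_md5.strip().lower()


if __name__ == "__main__":
    if not LOCAL_PATH.exists():
        print(f"File not found: {LOCAL_PATH}")
    elif matches_checksum(LOCAL_PATH, EXPECTED_MD5):
        print("Checksum matches the published MD5.")
    else:
        print("Checksum mismatch; the download may be incomplete or altered.")
```

Reading in chunks keeps memory use flat, so the same helper could also be applied to the much larger KG files in the linked build archives.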

Additional details

Is identical to: Dataset: https://github.com/callahantiff/PheKnowLator/wiki/Archived-Builds (URL)
References: Dataset: https://console.cloud.google.com/storage/browser/pheknowlator (URL)

