Link-prediction on Biomedical Knowledge Graphs
Description
Release of the experimental data from the paper "Towards Linking Graph Topology to Model Performance for Biomedical Knowledge Graph Completion" (accepted at the Machine Learning for Life and Material Sciences workshop @ ICML 2024).
Each test query (h,r,?) is scored against all entities in the KG, and we compute the rank of the score of the correct completion (h,r,t) after masking out the scores of other (h,r,t') triples contained in the graph.
In experimental_data.zip, the following files are provided for each dataset:
- {dataset}_preprocessing.ipynb: a Jupyter notebook for downloading and preprocessing the dataset. In particular, this generates the custom label->ID mapping for entities and relations, and the numerical tensor of (h_ID,r_ID,t_ID) triples for all edges in the graph, which can be used to compute graph topological metrics (e.g., using kg-topology-toolbox) and compare them with the edge prediction accuracy;
- test_ranks.csv: CSV table with columns ["h", "r", "t"] specifying the head, relation, and tail IDs of the test triples, and columns ["DistMult", "TransE", "RotatE", "TripleRE"] with the rank of the ground-truth tail in the ordered list of predictions made by the four models;
- entity_dict.csv: the list of entity labels, ordered by entity ID (as generated in the preprocessing notebook);
- relation_dict.csv: the list of relation labels, ordered by relation ID (as generated in the preprocessing notebook).
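As an illustration of how the rank columns in test_ranks.csv might be summarised, the sketch below computes the mean reciprocal rank (MRR) and Hits@K per model. It is not taken from the paper's code. It assumes that ranks are 1-based (rank 1 means the correct tail was scored highest), and the helper name summarize_ranks and the toy rows are made up for the example.

```python
import csv
import io

MODELS = ["DistMult", "TransE", "RotatE", "TripleRE"]  # rank columns of test_ranks.csv


def summarize_ranks(rows, models=MODELS, ks=(1, 10)):
    """Return {model: {"MRR": ..., "Hits@k": ...}} from an iterable of row dicts.

    Assumption: ranks are 1-based (rank 1 = correct tail scored highest).
    """
    rows = list(rows)
    summary = {}
    for model in models:
        ranks = [float(row[model]) for row in rows]
        metrics = {"MRR": sum(1.0 / r for r in ranks) / len(ranks)}
        for k in ks:
            metrics[f"Hits@{k}"] = sum(r <= k for r in ranks) / len(ranks)
        summary[model] = metrics
    return summary


# Synthetic stand-in for test_ranks.csv (made-up values, not real results).
toy_csv = io.StringIO(
    "h,r,t,DistMult,TransE,RotatE,TripleRE\n"
    "0,0,1,1,2,1,4\n"
    "2,1,3,20,5,2,1\n"
)
toy_summary = summarize_ranks(csv.DictReader(toy_csv))

# For the released data, pass csv.DictReader(open(path, newline="")) instead,
# where path points at a dataset's test_ranks.csv (placeholder, not a real path).
```

Reading columns by name rather than by position keeps the sketch working even if the released CSV carries an extra index column.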
The separate top_100_tail_predictions.zip archive contains, for each of the test queries in the corresponding test_ranks.csv table, the IDs of the top-100 tail predictions made by each of the four KGE models, ordered by decreasing likelihood. The predictions are released in a .npz archive of NumPy arrays (one array of shape (n_test_triples, 100) for each of the KGE models).
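A minimal sketch of reading such a .npz archive with NumPy and checking whether the true tail appears among the first k predictions. The array key "DistMult" is an assumption (the key names used in the released archives are not stated here, so inspect archive.files first), and the toy data uses 4 predictions per query instead of 100. Whether the released predictions exclude other known tails is also not stated, so values obtained this way need not match metrics derived from test_ranks.csv.

```python
import io

import numpy as np


def hits_at_k_from_topk(top_preds, true_tails, k=10):
    """Fraction of queries whose true tail is among the first k predicted IDs."""
    top_preds = np.asarray(top_preds)
    true_tails = np.asarray(true_tails)
    found = (top_preds[:, :k] == true_tails[:, None]).any(axis=1)
    return float(found.mean())


# Build a tiny synthetic .npz in memory (3 queries, top-4 instead of top-100).
buffer = io.BytesIO()
np.savez(buffer, DistMult=np.array([[5, 1, 2, 3], [0, 4, 7, 9], [8, 6, 2, 1]]))
buffer.seek(0)

with np.load(buffer) as archive:
    print(archive.files)  # list the array names actually stored in the archive
    toy_preds = archive["DistMult"]  # hypothetical key name

toy_tails = np.array([1, 9, 3])  # made-up "t" column for the 3 toy queries
toy_hits_at_2 = hits_at_k_from_topk(toy_preds, toy_tails, k=2)
toy_hits_at_4 = hits_at_k_from_topk(toy_preds, toy_tails, k=4)
```

For the released files, np.load would be given the path of the extracted .npz file in place of the in-memory buffer, with the true tails taken from the "t" column of the matching test_ranks.csv.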
All experiments (training and inference) have been run on Graphcore IPU hardware using the BESS-KGE distribution framework.
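The filtered ranking protocol from the Description can be illustrated on CPU with a few lines of NumPy. This is only a sketch of the general idea, not the BESS-KGE implementation. It assumes that higher scores mean more plausible triples and that ties are broken optimistically. The function name filtered_tail_rank and the toy scores are invented for the example.

```python
import numpy as np


def filtered_tail_rank(scores, true_tail, known_tails):
    """1-based rank of true_tail after masking other known tails of the same (h, r).

    scores: 1-D array with one score per entity (assumed: higher = more plausible).
    known_tails: entity IDs t' such that (h, r, t') is contained in the graph.
    """
    scores = np.asarray(scores, dtype=float).copy()
    mask_ids = np.array([t for t in known_tails if t != true_tail], dtype=int)
    scores[mask_ids] = -np.inf  # masked candidates can no longer outrank the true tail
    # Optimistic tie-breaking (an assumption): count only strictly higher scores.
    return int((scores > scores[true_tail]).sum()) + 1


toy_scores = [0.1, 0.9, 0.5, 0.7, 0.3]  # made-up scores for 5 entities
raw_rank = filtered_tail_rank(toy_scores, true_tail=2, known_tails=[])
filtered_rank = filtered_tail_rank(toy_scores, true_tail=2, known_tails=[1, 2])
```

In the toy example, entity 1 is another known tail for the same query, so masking it improves the rank of the true tail by one position.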
Files
(1.2 GB)
Checksum | Size |
---|---|
md5:84581f60a2d9a425e0ccfd33df8b1ec6 | 59.5 MB |
md5:c564137a64918c1ebe14f00ddaa39a35 | 1.1 GB |
Additional details
Dates
- Available
- 2021-06-25
Software
- Repository URL
- https://github.com/graphcore-research/bess-kge
- Programming language
- Python
- Development Status
- Active