Published October 10, 2025 | Version v1
Dataset Open

OnT (Language Models as Ontology Encoder) - Complete Datasets

Creators

  • 1. University of Manchester

Description

Complete dataset collection for OnT (Language Models as Ontology Encoder), a language model-based framework for ontology embeddings. OnT represents concepts as points in hyperbolic space and axioms as hierarchical relationships between concepts. It builds on HierarchyTransformer and extends it with specialized embedding techniques.
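As a rough illustration of what "points in hyperbolic space" means in practice, the sketch below computes the geodesic distance between two points in the unit Poincaré ball. This record does not say which model of hyperbolic space OnT uses, so the Poincaré ball with curvature -1 is an assumption here, and `poincare_distance` is a hypothetical helper, not part of the OnT code.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points inside the unit Poincare ball.

    Assumes curvature -1; this choice is an assumption, not taken from the record.
    """
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v)))
```

Distances grow quickly as points approach the boundary of the ball, which is why this geometry is often used for tree-like hierarchies: general concepts can sit near the origin and specific ones near the edge.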

This collection includes six dataset files, two for each of three ontologies (ANATOMY, GALEN, and Gene Ontology):

  • ANATOMY_predict.zip: Training/validation for ANATOMY ontology
  • ANATOMY_inference.zip: Inference/testing for ANATOMY ontology 
  • GALEN_predict.zip: Training/validation for GALEN ontology 
  • GALEN_inference.zip: Inference/testing for GALEN ontology 
  • GO_predict.zip: Training/validation for Gene Ontology 
  • GO_inference.zip: Inference/testing for Gene Ontology

Each dataset contains preprocessed data for ontological reasoning tasks. Pre-trained models are available on Hugging Face: https://huggingface.co/collections/Hui97/ontology-transformer-68e8fdea10cba273bfdc687c
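The internal layout of each archive is not described in this record, so it can help to inspect an archive before extracting it. The following is a minimal sketch using only the Python standard library; `list_archive` is a hypothetical helper name, and the path passed to it is whatever local path the archive was downloaded to.

```python
import zipfile


def list_archive(path):
    """Return (member name, uncompressed size in bytes) for each file in a zip archive."""
    with zipfile.ZipFile(path) as archive:
        return [
            (info.filename, info.file_size)
            for info in archive.infolist()
            if not info.is_dir()  # skip directory entries
        ]
```

For example, `list_archive("GO_predict.zip")` would list the members of the Gene Ontology training/validation archive, assuming it has been downloaded to the current directory.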

Notes

This dataset collection is part of the OnT (Language Models as Ontology Encoder) project.

Files

Files (291.8 MB)

Checksum                                Size
md5:569e269708e79737add4e2c4ace7cb07    22.7 MB
md5:196e8c06df0048d90c1abc6c2f3da8ed    22.2 MB
md5:c48292f5b56a1f48d3e725aa03dfaaf4    14.2 MB
md5:016f8dbd1f160b0a67addea518bed181    13.1 MB
md5:80cc93287c240de574f337dcf46d0ef8    115.9 MB
md5:d6afc378c41f0d69899fe47f37450689    103.8 MB
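The MD5 checksums above can be used to confirm that a download is intact. The following is a minimal sketch using the Python standard library. The listing does not pair checksums with file names, so the sketch only checks whether a file's digest appears in the published set; `md5_of` and `matches_published` are hypothetical helper names.

```python
import hashlib

# Checksums copied from the file listing of this record.
PUBLISHED_MD5 = {
    "569e269708e79737add4e2c4ace7cb07",
    "196e8c06df0048d90c1abc6c2f3da8ed",
    "c48292f5b56a1f48d3e725aa03dfaaf4",
    "016f8dbd1f160b0a67addea518bed181",
    "80cc93287c240de574f337dcf46d0ef8",
    "d6afc378c41f0d69899fe47f37450689",
}


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path):
    """Return True if the file's MD5 digest is one of the published checksums."""
    return md5_of(path) in PUBLISHED_MD5
```

A call such as `matches_published("GALEN_inference.zip")` returning `False` would suggest a corrupted or incomplete download. MD5 is adequate for detecting accidental corruption but should not be relied on as a defense against deliberate tampering.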

Additional details