Published March 6, 2023 | Version ACL 2023
Dataset Open

OntoLAMA: LAnguage Model Analysis for Ontology Subsumption Inference

  • 1. University of Oxford
  • 2. University of Manchester
  • 3. City, University of London

Description

About

OntoLAMA is a set of language model (LM) probing datasets for ontology subsumption inference. The work follows the "LMs-as-KBs" literature but focuses on conceptualised knowledge extracted from formalised KBs such as OWL ontologies. Specifically, the subsumption inference (SI) task is introduced and formulated in the style of Natural Language Inference (NLI): the sub-concept and the super-concept of a subsumption axiom are verbalised and fitted into a template to form the premise and the hypothesis, respectively. The sampled axioms are verified through ontology reasoning. The SI task is further divided into Atomic SI, which involves only atomic named concepts, and Complex SI, which involves both atomic and complex concepts. Real-world ontologies of different scales and domains are used to construct OntoLAMA; in total there are four Atomic SI datasets and two Complex SI datasets.
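The NLI-style formulation described above can be sketched as follows. This is only an illustration under stated assumptions: the template string, the `SIPair` class, and the `build_si_pair` helper are hypothetical names introduced here, and the actual templates and concept verbaliser used to build OntoLAMA may differ.

```python
from dataclasses import dataclass

# Hypothetical template; not necessarily the one used to build OntoLAMA.
TEMPLATE = "x is {article} {concept}."


@dataclass(frozen=True)
class SIPair:
    """One subsumption inference example in NLI form (illustrative only)."""
    premise: str      # built from the verbalised sub-concept
    hypothesis: str   # built from the verbalised super-concept
    entailed: bool    # True if the subsumption axiom holds


def indefinite_article(phrase: str) -> str:
    """Pick 'a' or 'an' from the first letter (a rough heuristic)."""
    return "an" if phrase[:1].lower() in set("aeiou") else "a"


def fill_template(concept: str) -> str:
    """Fit an already verbalised concept into the hypothetical template."""
    return TEMPLATE.format(article=indefinite_article(concept), concept=concept)


def build_si_pair(sub_concept: str, super_concept: str, entailed: bool) -> SIPair:
    """Form premise and hypothesis from the two sides of a subsumption axiom."""
    return SIPair(
        premise=fill_template(sub_concept),
        hypothesis=fill_template(super_concept),
        entailed=entailed,
    )


pair = build_si_pair("apple", "fruit", entailed=True)
print(pair.premise)     # x is an apple.
print(pair.hypothesis)  # x is a fruit.
```

Complex concepts would first need a verbaliser that turns a logical expression into a phrase; the sketch assumes that step has already produced plain strings.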

 

| Source | #Concepts | #EquivAxioms | #Datasets (Train/Dev/Test) |
|---|---|---|---|
| Schema.org | 894 | N/A | Atomic SI: 808/404/2,830 |
| DOID | 11,157 | N/A | Atomic SI: 90,500/11,312/11,314 |
| FoodOn | 30,995 | 2,383 | Atomic SI: 768,486/96,060/96,062; Complex SI: 3,754/1,850/13,080 |
| GO | 43,303 | 11,456 | Atomic SI: 772,870/96,608/96,610; Complex SI: 72,318/9,040/9,040 |
| MNLI | N/A | N/A | biMNLI: 235,622/26,180/12,906 |

 

Citation

The accompanying paper was published in Findings of ACL 2023: https://aclanthology.org/2023.findings-acl.213/.

```
@inproceedings{he-etal-2023-language,
    title = "Language Model Analysis for Ontology Subsumption Inference",
    author = "He, Yuan and Chen, Jiaoyan and Jimenez-Ruiz, Ernesto and Dong, Hang and Horrocks, Ian",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2023",
    month = jul,
    year = "2023",
    address = "Toronto, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.findings-acl.213",
    doi = "10.18653/v1/2023.findings-acl.213",
    pages = "3439--3453"
}
```

Files

bimnli.zip

Files (112.3 MB)

| MD5 checksum | Size |
|---|---|
| 6605b088295eb0797fcbd3141724d5af | 19.3 MB |
| 785b662e2a695fb1f8bca6078b990d97 | 3.2 MB |
| 9b3e4fb350625f5ffd89c64a41e6b496 | 28.5 MB |
| 356037cc1a9b6cffb884eec1d9a5cccc | 1.1 MB |
| 12c24de0efbb567479af4442a4cfbdcb | 32.4 MB |
| 749aa5cd4c45fd1dab45a9a00526d6a7 | 5.1 MB |
| c145893dd42cfefaacf54c7a240e2213 | 22.8 MB |
| cc44c818ed6e0253eddf1fa0246e4c58 | 82.6 kB |
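
A downloaded archive can be checked against the MD5 checksums listed above. The following is a minimal sketch using Python's standard `hashlib`; the helper names `md5_of_file` and `matches_checksum` are introduced here for illustration, and since the listing does not pair file names with checksums, the expected value has to be taken from the record page for the specific file.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 with an expected value such as 'md5:<hex>'."""
    # Accept both a bare hex digest and the 'md5:' prefixed form.
    expected_hex = expected.strip().lower().split(":")[-1]
    return md5_of_file(path) == expected_hex
```

For example, `matches_checksum("bimnli.zip", "md5:<checksum from the record page>")` returns `True` only if the local copy is intact; the path and the placeholder checksum in this call are assumptions, not values confirmed by the listing.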