Published September 2024
| Version NeurIPS 2024
Dataset
Open
HiT: Language Models as Hierarchy Encoders
Description
About
Datasets for training and evaluating the Hierarchy Transformer encoders (HiTs) proposed in the paper "Language Models as Hierarchy Encoders".
- Files with the `multi` suffix correspond to the Multi-hop Inference evaluation.
- Files with the `mixed` suffix correspond to the Mixed-hop Prediction evaluation (and its transfer setting).
- `schemaorg`, `foodon`, and `doid` are only involved in the transfer evaluation, but the datasets here for `foodon` and `doid` also include their training sets (see the paper for why we opted not to generate a training set for `schemaorg`).
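The naming convention above can be turned into a small lookup. The following is a minimal sketch, assuming archives are named `<hierarchy>-<suffix>.zip` (a pattern inferred from `doid-mixed.zip`, not confirmed for every file); the helper `describe_archive` and its return fields are hypothetical names used only for illustration.

```python
from pathlib import Path

# Suffix meanings as stated in the description above.
TASK_BY_SUFFIX = {
    "multi": "Multi-hop Inference",
    "mixed": "Mixed-hop Prediction (and its transfer setting)",
}

# Hierarchies that the description says are used only in the transfer evaluation.
TRANSFER_ONLY = {"schemaorg", "foodon", "doid"}


def describe_archive(filename: str) -> dict:
    """Split an archive name such as 'doid-mixed.zip' into hierarchy and task.

    Assumes the '<hierarchy>-<suffix>.zip' pattern; raises ValueError otherwise.
    """
    stem = Path(filename).stem  # e.g. "doid-mixed"
    hierarchy, _, suffix = stem.rpartition("-")
    if not hierarchy or suffix not in TASK_BY_SUFFIX:
        raise ValueError(f"Unrecognised archive name: {filename!r}")
    return {
        "hierarchy": hierarchy,
        "task": TASK_BY_SUFFIX[suffix],
        "transfer_only": hierarchy in TRANSFER_ONLY,
    }


info = describe_archive("doid-mixed.zip")
```

Here `info["task"]` names the evaluation the archive belongs to, and `info["transfer_only"]` flags hierarchies that appear only in the transfer setting.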
The previous version of this dataset collection has been marked as deprecated because it appears to contain broken files for `snomed`.
Hugging Face Datasets
We offer a convenient Hugging Face Datasets entry that lets users load the data directly with the `load_dataset` method. The datasets are available either as entity triplets or as labelled entity pairs. Note that datasets loaded this way do not retain the original entity IDs. To map entities back to their original hierarchies, refer to this Zenodo release.
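To illustrate how the two formats relate, here is a minimal sketch that expands entity triplets into labelled entity pairs. The field names `child`, `parent`, `negative`, and `label` are assumptions made for this example and should be checked against the actual dataset columns; the sample record is invented and not taken from the data.

```python
from typing import Dict, Iterable, List


def triplets_to_pairs(triplets: Iterable[Dict[str, str]]) -> List[Dict[str, object]]:
    """Expand each (child, parent, negative) triplet into two labelled pairs.

    The true parent yields a positive pair (label 1); the negative entity
    yields a negative pair (label 0). Field names are assumed, not confirmed.
    """
    pairs: List[Dict[str, object]] = []
    for t in triplets:
        pairs.append({"child": t["child"], "parent": t["parent"], "label": 1})
        pairs.append({"child": t["child"], "parent": t["negative"], "label": 0})
    return pairs


# Invented sample record, for illustration only.
sample = [{"child": "dog", "parent": "canine", "negative": "table"}]
pairs = triplets_to_pairs(sample)
```

In practice the input would come from `datasets.load_dataset` with the repository and configuration names listed on the Hub page; they are deliberately not spelled out here.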
Citation
@article{he2024language,
  title={Language models as hierarchy encoders},
  author={He, Yuan and Yuan, Moy and Chen, Jiaoyan and Horrocks, Ian},
  journal={Advances in Neural Information Processing Systems},
  volume={37},
  pages={14690--14711},
  year={2024}
}
Links
- GitHub repository: https://github.com/KRR-Oxford/HierarchyTransformers
- Models and datasets on the Hugging Face Hub: https://huggingface.co/Hierarchy-Transformers
- arXiv preprint: https://arxiv.org/abs/2401.11374
Contact
Yuan He (yuan.he(at)cs.ox.ac.uk)
Files
(236.0 MB)
MD5 checksum | Size |
---|---|
md5:f03bd0763063a1f0bbe2a51d1efcdaba | 1.5 MB |
md5:1e2a5d5a36c7c03e56f00f79999c8dd0 | 8.8 MB |
md5:55fa7f1b506ba6d71355d0eb45e50d0c | 29.8 MB |
md5:1ca33af8d7e847b562f14af72e5d37cd | 251.4 kB |
md5:6ac08c7266428122631a6fceb24e5906 | 80.8 MB |
md5:3ca80758a71566ed9c04ec0e392515fc | 73.5 MB |
md5:4b2c790a6974631ef35ee940ba881af5 | 20.7 MB |
md5:39c18ae770a4d84c650badc5f648f6d8 | 20.7 MB |
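Since an earlier release contained broken files, it may be worth checking a downloaded archive against its listed MD5 checksum. The following is a minimal sketch using only the Python standard library; the helper names `md5_of_file` and `matches_checksum` are hypothetical, and the path in the usage comment is a placeholder.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, listed: str) -> bool:
    """Compare a file against a listed checksum such as 'md5:<hex digest>'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


# Usage (placeholder path; use the checksum listed for the file you downloaded):
# ok = matches_checksum("<path-to-downloaded-zip>", "md5:<listed digest>")
```

Reading in chunks keeps memory use flat even for the larger archives.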