Inductive WN18RR and FB15k-237

Daniel Daza; Michael Cochez; Paul Groth

doi:10.5281/zenodo.4501273

Published February 4, 2021 | Version 1.0

Dataset Open

1. Vrije Universiteit Amsterdam; University of Amsterdam
2. Vrije Universiteit Amsterdam
3. University of Amsterdam

This repository contains knowledge graphs based on the WN18RR and FB15k-237 datasets. We generate new training, validation, and test splits for the inductive setting, where some entities are removed from the training set. The splits are used in the experiments described in the paper "Inductive Entity Representations from Text via Link Prediction".

To generate the inductive splits, we remove nodes under two constraints: no remaining node becomes isolated, and the number of edges of any relation type does not drop below 100.
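The removal procedure described above can be sketched as a greedy loop. This is only an illustration under stated assumptions, not the code used to build the released splits: the function name `remove_entities`, the random candidate order, and the `num_to_remove` and `seed` parameters are hypothetical. Only the two constraints and the threshold of 100 come from the description.

```python
import random
from collections import Counter, defaultdict


def remove_entities(triples, num_to_remove, min_rel_edges=100, seed=0):
    """Greedily pick entities to remove (hypothetical sketch).

    An entity is removed only if (1) every relation type keeps at least
    `min_rel_edges` edges and (2) no other entity becomes isolated.
    Returns the removed entities and the remaining triples.
    """
    rng = random.Random(seed)
    remaining = set(triples)

    # Index every triple by the entities it touches.
    incident = defaultdict(set)
    for triple in remaining:
        head, _, tail = triple
        incident[head].add(triple)
        incident[tail].add(triple)
    rel_count = Counter(rel for _, rel, _ in remaining)

    # Assumption: candidates are visited in random order.
    candidates = sorted(incident)
    rng.shuffle(candidates)

    removed = set()
    for entity in candidates:
        if len(removed) == num_to_remove:
            break
        edges = incident[entity]

        # Constraint: relation counts must stay at or above the threshold.
        lost = Counter(rel for _, rel, _ in edges)
        if any(rel_count[rel] - n < min_rel_edges for rel, n in lost.items()):
            continue

        # Constraint: a neighbour whose edges all touch `entity` would be isolated.
        neighbours = {x for h, _, t in edges for x in (h, t)} - {entity}
        if any(incident[n] <= edges for n in neighbours):
            continue

        # Commit the removal and update the indexes.
        for triple in list(edges):
            head, rel, tail = triple
            remaining.discard(triple)
            rel_count[rel] -= 1
            incident[head].discard(triple)
            incident[tail].discard(triple)
        removed.add(entity)

    return removed, remaining
```

In this sketch, the triples incident to removed entities are simply dropped from `remaining`; how they are distributed across validation and test sets is not specified here.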

The following are statistics for the datasets.

| Split      | Statistic | WN18RR-ind | FB15k-237-ind |
|------------|-----------|------------|---------------|
| All        | Relations |     11     |      237      |
| Training   | Entities  |   32,755   |     11,633    |
| Training   | Triples   |   69,585   |    215,082    |
| Validation | Entities  |    4,094   |     1,454     |
| Validation | Triples   |   11,381   |     42,164    |
| Test       | Entities  |    4,094   |     1,454     |
| Test       | Triples   |   12,087   |     52,870    |

The splits for each dataset are called ind-train.tsv, ind-dev.tsv, and ind-test.tsv. We also include textual descriptions for each entity, as well as type information.
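As a rough illustration of how the split files might be inspected, the sketch below reads the three files and counts triples and entities. It assumes each line holds one tab-separated triple in head, relation, tail order, which is not stated in this description; the helper names `read_triples`, `entities`, and `split_summary` are hypothetical.

```python
from pathlib import Path


def read_triples(path):
    """Read one triple per line; assumes tab-separated head, relation, tail."""
    triples = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if line:
                triples.append(tuple(line.split("\t")))
    return triples


def entities(triples):
    """Collect every entity that appears as a head or a tail."""
    return {e for head, _, tail in triples for e in (head, tail)}


def split_summary(data_dir):
    """Count triples per split and entities that never occur in training."""
    data_dir = Path(data_dir)
    train = read_triples(data_dir / "ind-train.tsv")
    dev = read_triples(data_dir / "ind-dev.tsv")
    test = read_triples(data_dir / "ind-test.tsv")
    train_entities = entities(train)
    return {
        "train_triples": len(train),
        "dev_triples": len(dev),
        "test_triples": len(test),
        "train_entities": len(train_entities),
        "new_dev_entities": len(entities(dev) - train_entities),
        "new_test_entities": len(entities(test) - train_entities),
    }
```

Calling `split_summary` on an extracted dataset directory gives counts that can be compared against the statistics table.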

Files (28.0 MB)

| Name             | MD5                              | Size    |
|------------------|----------------------------------|---------|
| FB15k-237.tar.gz | c7cf03efea81b5fbef958f6ce3bfc198 | 21.1 MB |
| WN18RR.tar.gz    | 3b3b9920a118207ea7b17355820c54a6 | 6.8 MB  |
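The MD5 checksums listed for the two archives can be used to check a download. A minimal sketch using Python's standard `hashlib` follows; the helper names `md5_of` and `verify`, and the local file paths passed to them, are assumptions for illustration.

```python
import hashlib

# Checksums as listed for the two archives in this record.
EXPECTED_MD5 = {
    "FB15k-237.tar.gz": "c7cf03efea81b5fbef958f6ce3bfc198",
    "WN18RR.tar.gz": "3b3b9920a118207ea7b17355820c54a6",
}


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, name):
    """Return True if the file at `path` matches the checksum listed for `name`."""
    return md5_of(path) == EXPECTED_MD5[name]
```

For example, `verify("WN18RR.tar.gz", "WN18RR.tar.gz")` should return `True` for an intact download saved under that name in the working directory.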
