Published February 4, 2021 | Version 1.0
Dataset Open

Inductive WN18RR and FB15k-237

  • 1. Vrije Universiteit Amsterdam; University of Amsterdam
  • 2. Vrije Universiteit Amsterdam
  • 3. University of Amsterdam

Description

This repository contains knowledge graphs based on the WN18RR and FB15k-237 datasets. We generate new training, validation, and test splits for the inductive setting, where some entities are removed from the training set. The splits are used in the experiments described in the paper "Inductive Entity Representations from Text via Link Prediction".

To generate the inductive splits, we remove nodes from the training graph subject to two constraints: no remaining node becomes isolated, and the number of edges of each relation type does not drop below 100.
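The two constraints can be illustrated with a short sketch. This is not the procedure used to build the released splits, only one plausible greedy reading of the description: entities are visited in random order, and an entity is held out only if doing so isolates no neighbour and leaves every relation type with at least `min_rel_edges` edges. The function name `hold_out_entities`, its parameters, and the random visiting order are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def hold_out_entities(triples, num_to_remove, min_rel_edges=100, seed=0):
    """Greedily pick entities to remove from a graph of (head, relation, tail) triples.

    Hypothetical helper: an entity is removed only if no neighbour becomes
    isolated and no relation type drops below `min_rel_edges` edges.
    """
    rng = random.Random(seed)

    # Map each entity to the indices of the triples it takes part in.
    incident = defaultdict(set)
    for idx, (h, _, t) in enumerate(triples):
        incident[h].add(idx)
        incident[t].add(idx)

    alive = set(range(len(triples)))            # triples still in the graph
    rel_count = Counter(r for _, r, _ in triples)
    degree = {e: len(ids) for e, ids in incident.items()}

    removed = set()
    candidates = sorted(incident)
    rng.shuffle(candidates)
    for e in candidates:
        if len(removed) >= num_to_remove:
            break
        edges = incident[e] & alive

        # Constraint 2: every affected relation keeps at least min_rel_edges edges.
        rel_loss = Counter(triples[i][1] for i in edges)
        if any(rel_count[r] - c < min_rel_edges for r, c in rel_loss.items()):
            continue

        # Constraint 1: no neighbour may lose all of its edges.
        nbr_loss = Counter()
        for i in edges:
            h, _, t = triples[i]
            other = t if h == e else h
            if other != e:                      # ignore self-loops
                nbr_loss[other] += 1
        if any(degree[n] - c == 0 for n, c in nbr_loss.items()):
            continue

        # Both constraints hold, so commit the removal.
        alive -= edges
        for r, c in rel_loss.items():
            rel_count[r] -= c
        for n, c in nbr_loss.items():
            degree[n] -= c
        degree[e] = 0
        removed.add(e)

    train = [triples[i] for i in sorted(alive)]
    held_out = [triples[i] for i in range(len(triples)) if i not in alive]
    return removed, train, held_out
```

In this sketch, a relation type that already has fewer than `min_rel_edges` edges can never lose another one, so entities touching it are never removed.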

The following are statistics for the datasets.

| Split      | Statistic | WN18RR-ind | FB15k-237-ind |
|------------|-----------|------------|---------------|
|            | Relations |     11     |      237      |
| Training   | Entities  |   32,755   |     11,633    |
| Training   | Triples   |   69,585   |    215,082    |
| Validation | Entities  |    4,094   |     1,454     |
| Validation | Triples   |   11,381   |     42,164    |
| Test       | Entities  |    4,094   |     1,454     |
| Test       | Triples   |   12,087   |     52,870    |

The splits for each dataset are called ind-train.tsv, ind-dev.tsv, and ind-test.tsv. We also include textual descriptions for each entity, as well as type information.
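As a rough sketch of how the split files might be read, the following assumes that each line of a split file holds one triple as three tab-separated fields (head, relation, tail). That layout is an assumption and is not stated in this description, so check it against the files before relying on it. The helper names `read_triples`, `load_splits`, and `unseen_entities` are hypothetical.

```python
from pathlib import Path


def read_triples(path):
    """Read (head, relation, tail) triples from a tab-separated file.

    Assumes exactly three tab-separated fields per non-empty line.
    """
    triples = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue
            head, relation, tail = line.split("\t")
            triples.append((head, relation, tail))
    return triples


def entities(triples):
    """Return the set of entities that occur as head or tail."""
    return {e for h, _, t in triples for e in (h, t)}


def load_splits(directory):
    """Load ind-train.tsv, ind-dev.tsv and ind-test.tsv from one dataset directory."""
    directory = Path(directory)
    return {
        name: read_triples(directory / f"ind-{name}.tsv")
        for name in ("train", "dev", "test")
    }


def unseen_entities(splits, name):
    """Entities of the given split that never occur in the training triples."""
    return entities(splits[name]) - entities(splits["train"])
```

For example, after `splits = load_splits("path/to/dataset")`, where the directory is a placeholder, `len(unseen_entities(splits, "dev"))` counts the validation entities that do not appear in training.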

Files (28.0 MB)

| Checksum                             | Size    |
|--------------------------------------|---------|
| md5:c7cf03efea81b5fbef958f6ce3bfc198 | 21.1 MB |
| md5:3b3b9920a118207ea7b17355820c54a6 | 6.8 MB  |
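
A downloaded file can be checked against the MD5 checksums listed above. The sketch below uses only Python's standard `hashlib` module. The helper names and any file path passed to them are placeholders, since the archive names are not given here.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or plain '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_checksum("downloaded_file", "md5:c7cf03efea81b5fbef958f6ce3bfc198")` should return `True` for an intact copy of the 21.1 MB file, where `"downloaded_file"` stands for wherever that file was saved.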