Published November 11, 2021 | Version v1
Dataset Open

AMiner-534K: Knowledge Graph of AMiner benchmark for Author Name Disambiguation

  • 1. FIZ-Karlsruhe
  • 2. University of Bologna

Description

This dataset is a knowledge graph extracted from an AMiner benchmark, built for a research project on knowledge graph embeddings (KGEs) for author name disambiguation (AND). The structural triples of the knowledge graph are split into training, validation, and test sets so that representation learning methods can be applied to them. Textual and numeric literals are stored separately to support multimodal KGE approaches (see arXiv:1802.00934). For the same reason, the textual literals are already encoded as sentence embeddings and the numeric literals as a numeric matrix, in the files textual_literals.npy and numeric_literals.npy respectively. The file and_eval.json contains the evaluation dataset used to evaluate our AND architecture. The script used to gather this dataset is available in the GitHub repository: https://github.com/sntcristian/and-kge/tree/main/aminer.
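As a minimal sketch of how the precomputed literals and the evaluation set might be read once the archive is extracted: the three file names come from the description above, while the function name load_literals, the directory argument, and the assumption that all three files sit in the same folder are illustrative only. The array shapes and the structure of the JSON are not documented here, so the sketch inspects them rather than assuming them.

```python
import json
from pathlib import Path

import numpy as np


def load_literals(data_dir):
    """Load the literal matrices and the evaluation set from data_dir.

    data_dir is a hypothetical path to the extracted archive; the layout
    inside the zip is an assumption, only the file names are documented.
    """
    data_dir = Path(data_dir)
    # Sentence embeddings of the textual literals.
    textual = np.load(data_dir / "textual_literals.npy")
    # Numeric matrix of the numeric literals.
    numeric = np.load(data_dir / "numeric_literals.npy")
    # Evaluation dataset for author name disambiguation.
    with open(data_dir / "and_eval.json", encoding="utf-8") as fh:
        and_eval = json.load(fh)
    return textual, numeric, and_eval


if __name__ == "__main__":
    # "AMiner-534K" is a placeholder for wherever the zip was extracted.
    textual, numeric, and_eval = load_literals("AMiner-534K")
    print(textual.shape, numeric.shape, type(and_eval))
```

Printing the shapes first is a cheap way to confirm that both matrices have the expected number of rows before feeding them to a multimodal KGE model.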

Files

Files (128.0 MB)

Name: AMiner-534K.zip
Size: 128.0 MB
md5: 67331fc6494b5591a765dd8342843a96
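The MD5 checksum listed for AMiner-534K.zip can be used to check that a download is intact. The following is a small sketch using only the Python standard library; the helper names md5_of_file and verify_archive and the local file path are assumptions made for illustration.

```python
import hashlib

# Checksum listed for AMiner-534K.zip on this page.
EXPECTED_MD5 = "67331fc6494b5591a765dd8342843a96"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file at path matches the expected checksum."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Hypothetical local path to the downloaded archive.
    print(verify_archive("AMiner-534K.zip"))
```

Reading in chunks keeps memory use constant, which is convenient for an archive of this size.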