There is a newer version of the record available.

Published April 30, 2020 | Version v1
Dataset Open

DBLP Article Similarities (DBLP-ArtSim) dataset

  • 1. Athena Research Center
  • 2. Univ. of the Peloponnese

Description

This dataset contains similarity scores among articles in AMiner's DBLP v10 dataset.

Similarities are calculated using the JoinSim [1] similarity measure on the derived citation network using the following metapaths: 

  • Paper - Author - Paper (PAP_similarities.csv)
  • Paper - Topic - Paper (PTP_similarities.csv)

The file ids.csv contains a mapping from AMiner's ids to our internal numeric ids used in the similarities files.
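As a minimal sketch of how the mapping might be applied, the Python below translates the internal numeric ids in a similarities file back to AMiner ids. The record does not document the file layout, so everything about it here is an assumption: tab-separated values, no header row, rows of (AMiner id, internal id) in ids.csv and rows of (internal id, internal id, score) in the similarities files. The function names `load_id_mapping` and `translate_similarities` are hypothetical, and the in-memory sample rows are placeholders, not records from the dataset.

```python
import csv
import io


def load_id_mapping(lines, delimiter="\t"):
    """Build {internal_numeric_id: aminer_id}.

    Assumes each row is (aminer_id, internal_id) with no header row;
    the actual layout of ids.csv is not documented in the record.
    """
    mapping = {}
    for row in csv.reader(lines, delimiter=delimiter):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        aminer_id, internal_id = row[0], row[1]
        mapping[int(internal_id)] = aminer_id
    return mapping


def translate_similarities(lines, mapping, delimiter="\t"):
    """Yield (aminer_id_a, aminer_id_b, score) tuples.

    Assumes each row is (internal_id_a, internal_id_b, score).
    """
    for row in csv.reader(lines, delimiter=delimiter):
        if len(row) < 3:
            continue  # skip blank or malformed rows
        id_a, id_b, score = int(row[0]), int(row[1]), float(row[2])
        yield mapping[id_a], mapping[id_b], score


# Placeholder rows standing in for ids.csv and a similarities file.
sample_ids = io.StringIO("aminer-id-a\t0\naminer-id-b\t1\n")
sample_sims = io.StringIO("0\t1\t0.5\n")

mapping = load_id_mapping(sample_ids)
pairs = list(translate_similarities(sample_sims, mapping))
```

With the real files, the `io.StringIO` objects would be replaced by open file handles, for example `open("ids.csv", newline="")`. Because the similarities files are large, iterating row by row as above avoids loading them into memory at once.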

[1] Xiong, Y., Zhu, Y., Yu, P.S.: Top-k similarity join in heterogeneous information networks. IEEE Transactions on Knowledge and Data Engineering 27(6), 1710–1723 (2015)

Notes

We acknowledge support of this work by the project "Moving from Big Data Management to Data Science" (MIS 5002437/3) which is implemented under the Action "Reinforcement of the Research and Innovation Infrastructure", funded by the Operational Programme "Competitiveness, Entrepreneurship and Innovation" (NSRF 2014-2020) and co-financed by Greece and the European Union (European Regional Development Fund).

Files

Files (8.1 GB)

Checksum (md5)                      Size
e4dd4b9d19976e84f16d347fb192b8d6    137.4 MB
72720e946d276182352be86386348ee1    1.1 GB
58d4f5e24c9c90739df07be491881abe    6.8 GB
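
The md5 values listed above can be used to check that a download completed without corruption. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_listed_checksum` are hypothetical. The listing does not state which checksum belongs to which file, so the pairing must be taken from the record page itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, reading it in 1 MiB chunks
    so that multi-gigabyte files do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file's md5 with a listed value such as 'md5:e4dd...'.

    The optional 'md5:' prefix is stripped before comparing.
    """
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listed_checksum("ids.csv", "md5:...")` would return `True` only if the local file's digest equals the value copied from the record for that file.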