Published July 1, 2022 | Version v1
Dataset Open

Algorithm and System Co-design for Efficient Subgraph-based Graph Representation Learning

  • 1. Purdue University West Lafayette
  • 2. Peking University
  • 3. Cornell University

Description

Following the format of the Open Graph Benchmark (OGB), we design four prediction tasks, two on relations (mag-write, mag-cite) and two on higher-order patterns (tags-math, DBLP-coauthor), and construct the corresponding datasets over heterogeneous graphs and hypergraphs [1]. The original ogb-mag dataset only contains features for 'paper'-type nodes, so we add the node embeddings provided by [2] as raw features for the other node types in MAG(P-A)/(P-P). For all four tasks, the model is evaluated on each positive query paired with a fixed number of randomly sampled negative queries (1 positive to 1000 negatives by default; 1 to 100 for tags-math).
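The evaluation protocol described above (one positive query scored against a pool of sampled negatives) can be sketched as follows. This is a minimal illustration, not the dataset's official evaluator: the description does not name the metric, so mean reciprocal rank (MRR) is an assumption here, and the function names `sample_negatives`, `rank_of_positive`, and `mean_reciprocal_rank` are hypothetical. Ties are broken pessimistically, which is also an assumption.

```python
import random


def sample_negatives(num_candidates, positive, k, rng):
    """Uniformly sample k distinct candidate ids, excluding the positive one.

    Hypothetical helper: the real datasets ship their own negative samples.
    """
    pool = [c for c in range(num_candidates) if c != positive]
    return rng.sample(pool, k)


def rank_of_positive(pos_score, neg_scores):
    """1-based rank of the positive among its negatives.

    A negative with a score equal to the positive counts as ranked ahead
    (pessimistic tie-breaking, an assumption of this sketch).
    """
    return 1 + sum(1 for s in neg_scores if s >= pos_score)


def mean_reciprocal_rank(pos_scores, neg_scores_per_query):
    """Average of 1/rank over all positive queries (assumed metric)."""
    ranks = [
        rank_of_positive(p, negs)
        for p, negs in zip(pos_scores, neg_scores_per_query)
    ]
    return sum(1.0 / r for r in ranks) / len(ranks)


if __name__ == "__main__":
    rng = random.Random(0)
    # 1:1000 is the default ratio in the description; tags-math uses 1:100.
    negatives = sample_negatives(num_candidates=5000, positive=42, k=1000, rng=rng)
    # Stand-in model scores; a real model would score each query.
    pos_score = 0.9
    neg_scores = [rng.random() for _ in negatives]
    print("rank of positive:", rank_of_positive(pos_score, neg_scores))
```

With a 1:1000 ratio the rank of the positive lies between 1 and 1001, so a single query contributes between roughly 0.001 and 1 to the MRR.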

Files

Files (4.5 GB)

Checksum Size
md5:1c3e6217ec5f6a9ae877892966113ce7 1.3 GB
md5:f0894dff84c13d6bdfe70fc459ee7a17 1.3 GB
md5:bc7569e0d9961088792e576431379467 1.7 GB
md5:5dc1c01aa0fc31ad753355238c9aaf58 126.6 MB
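The MD5 checksums listed above can be used to confirm that a download completed without corruption. The sketch below uses only Python's standard `hashlib` module; the helper names `md5_of_file` and `verify_md5` and the path in the usage comment are hypothetical, since the file names are not shown in this record.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so that multi-gigabyte archives do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected):
    """Compare a file's MD5 with an expected value.

    Accepts the checksum with or without the 'md5:' prefix used on this page.
    """
    expected_hex = expected.strip().lower()
    if expected_hex.startswith("md5:"):
        expected_hex = expected_hex[len("md5:"):]
    return md5_of_file(path) == expected_hex


# Usage (hypothetical path; substitute the file you actually downloaded):
# ok = verify_md5("path/to/downloaded_file", "md5:5dc1c01aa0fc31ad753355238c9aaf58")
# print("checksum matches" if ok else "checksum mismatch, re-download the file")
```

MD5 is adequate here for detecting accidental corruption during transfer; it is not a defense against deliberate tampering.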

Additional details

Additional titles

Alternative title
SGRL Dataset (Relation / Higher-order Prediction)

References

  • [1] Austin R Benson, Rediet Abebe, Michael T Schaub, Ali Jadbabaie, and Jon Kleinberg. 2018. Simplicial closure and higher-order link prediction. Proceedings of the National Academy of Sciences 115, 48 (2018), E11221–E11230.
  • [2] Le Yu, Leilei Sun, Bowen Du, Chuanren Liu, Weifeng Lv, and Hui Xiong. 2022. Heterogeneous graph representation learning with relation awareness. IEEE Transactions on Knowledge and Data Engineering (2022).