Published February 8, 2023 | Version v2
Dataset Restricted

A Fast Hop-Biased Approximation Algorithm for the Quadratic Group Steiner Tree Problem

Authors/Creators

Description

This is the dataset for our paper 'A Fast Hop-Biased Approximation Algorithm for the Quadratic Group Steiner Tree Problem'. It consists of 5 real KGs (Mondial, OpenCyc, LinkedMDB, YAGO, DBpedia) and 5 synthetic KGs (LUBM-10U, LUBM-50U, LUBM-250U, LUBM-2U, DBP-50K). Each KG is compressed into a single file, which includes the following (using LUBM-2U as an example):

  • lubm_2u_nodes.sql: the id, name, and weight of each node,

  • lubm_2u_edges.sql: the ids of the two nodes that each edge connects,

  • lubm_2u_queries.sql: the queries, each consisting of several keywords,

  • lubm_2u_keymap.sql: the mapping from each keyword to a set of nodes,

  • lubm_2u_nodevec.sql: the vector of each node, used to compute the quadratic function qw,

  • lubm_2u_hub_hop.sql: the hub labeling index used in Section 4.1,

  • lubm_2u_hub_mix_1.sql: the hub labeling index used in Section 4.1, with α=0.1,

  • lubm_2u_hub_mix_5.sql: the hub labeling index used in Section 4.1, with α=0.5,

  • lubm_2u_hub_mix_9.sql: the hub labeling index used in Section 4.1, with α=0.9.

You can load the data into a MySQL database. For example:

create database lubm_2u;
use lubm_2u;
source lubm_2u_nodes.sql;
…

Unfortunately, due to space limits, we do not directly provide the hub labeling data for the large KGs (DBpedia and LUBM-250U); that is, these two compressed files contain only the first 5 SQL files. You need to generate the hub labeling yourself; the process is detailed in the README of our project.
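As a convenience, the loading steps can be scripted. The following is a minimal sketch in Python, not part of the dataset or the project. It assumes that every KG follows the same `<prefix>_<suffix>.sql` naming pattern as the LUBM-2U example, which has not been verified for the other KGs. The function name `build_load_script` and its parameters are hypothetical. It prints the MySQL client statements for whichever of the nine files are present in a directory, so the missing hub labeling files of the large KGs are skipped.

```python
from pathlib import Path

# File suffixes taken from the LUBM-2U file list in the description.
SUFFIXES = [
    "nodes", "edges", "queries", "keymap", "nodevec",
    "hub_hop", "hub_mix_1", "hub_mix_5", "hub_mix_9",
]


def build_load_script(prefix, data_dir="."):
    """Return MySQL client statements that load every file found for one KG.

    `prefix` is the assumed file prefix (e.g. "lubm_2u"); it is also used
    as the database name, mirroring the example in the description.
    """
    lines = [f"create database {prefix};", f"use {prefix};"]
    for suffix in SUFFIXES:
        path = Path(data_dir) / f"{prefix}_{suffix}.sql"
        # Skip absent files, e.g. hub labeling files of DBpedia and LUBM-250U.
        if path.is_file():
            lines.append(f"source {path.as_posix()};")
    return "\n".join(lines)


if __name__ == "__main__":
    # Run from the directory holding the extracted .sql files.
    print(build_load_script("lubm_2u"))
```

The printed statements can be pasted into a MySQL client session started from the same directory. Generating the statements instead of executing them keeps the sketch free of database credentials and driver dependencies.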

 

Files

Restricted

The record is publicly accessible, but files are restricted to users with access.