A Fast Hop-Biased Approximation Algorithm for the Quadratic Group Steiner Tree Problem
Description
This is the dataset for our paper 'A Fast Hop-Biased Approximation Algorithm for the Quadratic Group Steiner Tree Problem'. It consists of 5 real KGs (Mondial, OpenCyc, LinkedMDB, YAGO, DBpedia) and 5 synthetic KGs (LUBM-10U, LUBM-50U, LUBM-250U, LUBM-2U, DBP-50K). Each KG is compressed into a single file, which includes the following (taking LUBM-2U as an example):
- lubm_2u_nodes.sql: the id, the name and the weight of a node,
- lubm_2u_edges.sql: the ids of two nodes an edge connects,
- lubm_2u_queries.sql: a query consists of some keywords,
- lubm_2u_keymap.sql: a keyword maps to a set of nodes,
- lubm_2u_nodevec.sql: the vector of a node, used to compute quadratic function qw,
- lubm_2u_hub_hop.sql: the hub labeling index to compute in Section 4.1,
- lubm_2u_hub_mix_1.sql: the hub labeling index to compute in Section 4.1 where α=0.1,
- lubm_2u_hub_mix_5.sql: the hub labeling index to compute in Section 4.1 where α=0.5,
- lubm_2u_hub_mix_9.sql: the hub labeling index to compute in Section 4.1 where α=0.9.
You can import the data into a MySQL database. For example:
create database lubm_2u;
use lubm_2u;
source lubm_2u_nodes.sql;
…
Unfortunately, due to space limits, we do not directly provide the hub labeling data for the large KGs (DBpedia and LUBM-250U); that is, these two compressed files contain only the first 5 SQL files. You need to generate the hub labeling yourself; the process is detailed in the README of our project.
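The import steps above can be scripted so that every file of a KG is loaded in one pass. The following is only a minimal sketch, not part of the released project: the function name build_import_script is hypothetical, and it assumes that the other KGs follow the same `<prefix>_<suffix>.sql` naming as LUBM-2U and that the database is named after the prefix, as in the example above. Please check the actual file names in each archive before relying on it.

```python
# Suffixes taken from the LUBM-2U file list above. The first five are the
# files that every archive contains; the hub labeling files are missing
# from the DBpedia and LUBM-250U archives.
BASE_SUFFIXES = ["nodes", "edges", "queries", "keymap", "nodevec"]
HUB_SUFFIXES = ["hub_hop", "hub_mix_1", "hub_mix_5", "hub_mix_9"]


def build_import_script(prefix, include_hub=True):
    """Return MySQL client commands that load one KG into its own database.

    Assumption (not confirmed by the dataset description): every KG uses
    the `<prefix>_<suffix>.sql` naming shown for LUBM-2U, and the database
    is named after the prefix.
    """
    suffixes = BASE_SUFFIXES + (HUB_SUFFIXES if include_hub else [])
    lines = [f"create database {prefix};", f"use {prefix};"]
    lines += [f"source {prefix}_{suffix}.sql;" for suffix in suffixes]
    return "\n".join(lines)


if __name__ == "__main__":
    # Print the commands for LUBM-2U, which ships with all nine files.
    print(build_import_script("lubm_2u"))
```

The printed commands can be pasted into a mysql client session started in the directory that holds the extracted files. For the two large KGs, pass include_hub=False so that only the first 5 files are loaded.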