Published April 8, 2020 | Version v1
Dataset Open

Datasets for Content-and-Structure (CAS) Indexing

  • University of Zurich

Description

We provide the datasets used in our paper "Dynamic Interleaving of Content and Structure for Robust Indexing of Semi-Structured Hierarchical Data".

There are three datasets:

  • ServerFarm (SF) dataset
  • XMark dataset
  • Amazon dataset

We created the ServerFarm dataset ourselves.

Our Amazon dataset is based on a subset of the Amazon dataset by Julian McAuley (see http://jmcauley.ucsd.edu/data/amazon/links.html).

The XMark dataset is a synthetic dataset based on the XMark benchmark (https://projects.cwi.nl/xmark/downloads.html).

Files (7.0 GB)

MD5 checksum                            Size
md5:720dcf55a4fd83dfe3e734d0bff77d9b    393.1 MB
md5:1ca1f6a51379fa91216fae449d333f71    6.0 GB
md5:187fea6cf2d1d60fbf4a0ba63ce9faef    638.8 MB