Published September 28, 2018 | Version 1.0 | Final
Project deliverable Open

BigDataGrapes D3.3 - Distributed Indexing Components

  • 1. CNR
  • 2. ONTOTEXT
  • 3. Agroknow

Description

The BigDataGrapes (BDG) platform aspires to provide components that go beyond the state of the art at various stages of the management, processing, and usage of grapevine-related big data assets, thus making it easier for grapevine-powered industries to make important business decisions. The platform employs the necessary components for carrying out rigorous analytics processes on complex and heterogeneous data, helping companies and organizations in the sector to evolve methods, standards and processes based on insights extracted from their data.

The goal of the BDG Distributed Indexing activity is to develop novel methodologies and components for realizing efficient indexing over distributed big data batch and cross-streaming sources.

The activities carried out in this first period focused on the design of time- and space-efficient indexing data structures for structured and unstructured data such as labelled trees, graphs, and text documents, including compression techniques for big data management that support a broad range of analytical queries over arbitrary data dimensions. Specifically, we investigated the efficiency and effectiveness of indexes for RDF triples based on inverted indexes, and designed a novel compression technique that makes these indexes more efficient in both space and time. This deliverable includes the first version of the software components developed and discusses the preliminary results obtained. An appendix shows how to access the software, install it, and reproduce the tests conducted.
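
To make the idea of an inverted index over RDF triples concrete, the following is a minimal Python sketch. It is not the component described in the deliverable: the compression shown is ordinary gap (delta) encoding with variable-length bytes, swapped in for illustration in place of the novel technique, and the function names and sample triples are hypothetical.

```python
from collections import defaultdict


def build_index(triples):
    """Map each (position, term) pair to the sorted IDs of the triples containing it."""
    index = defaultdict(list)
    for triple_id, triple in enumerate(triples):
        for position, term in zip("spo", triple):
            # IDs are appended in increasing order, so every list stays sorted.
            index[(position, term)].append(triple_id)
    return dict(index)


def encode_postings(ids):
    """Gap-encode a sorted ID list and write each gap as variable-length bytes."""
    out = bytearray()
    previous = 0
    for value in ids:
        gap = value - previous
        previous = value
        while gap >= 128:
            out.append((gap & 0x7F) | 0x80)  # low 7 bits, continuation flag set
            gap >>= 7
        out.append(gap)  # final byte of this gap, continuation flag clear
    return bytes(out)


def decode_postings(data):
    """Invert encode_postings: rebuild the sorted ID list from the byte string."""
    ids, current, gap, shift = [], 0, 0, 0
    for byte in data:
        gap |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            current += gap
            ids.append(current)
            gap, shift = 0, 0
    return ids


def match(compressed, s=None, p=None, o=None):
    """Return IDs of triples satisfying every given constraint ([] if none is given)."""
    result = None
    for position, term in (("s", s), ("p", p), ("o", o)):
        if term is None:
            continue
        ids = set(decode_postings(compressed.get((position, term), b"")))
        result = ids if result is None else result & ids
    return sorted(result) if result is not None else []


# Hypothetical sample data, for illustration only.
triples = [
    ("vine1", "hasVariety", "Merlot"),
    ("vine2", "hasVariety", "Merlot"),
    ("vine1", "locatedIn", "Tuscany"),
]
compressed = {key: encode_postings(ids) for key, ids in build_index(triples).items()}
print(match(compressed, p="hasVariety", o="Merlot"))
```

Storing gaps rather than absolute IDs keeps most values small, so most postings fit in a single byte; a triple pattern is then answered by intersecting the decoded lists of its bound positions.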

Files

D3.3 - Distributed Indexing Components.pdf (641.1 kB)
md5:29a93319c39022400e8ec4b46be02298

Additional details

Funding

BigDataGrapes – Big Data to Enable Global Disruption of the Grapevine-powered Industries 780751
European Commission