And synopses for all: A synopses data engine for extreme scale analytics-as-a-service

doi:10.1016/j.is.2023.102221

Published May 2, 2023 | Version v1

Journal article Open


1. Université Libre de Bruxelles
2. Technical University of Crete, Athena RC

In this work, we detail the design and structure of a Synopses Data Engine (SDE) which combines the virtues of parallel processing and stream summarization to deliver interactive analytics at extreme scale. Our SDE is built on top of Apache Flink and implements a novel synopsis-as-a-service paradigm. In doing so, it supports (i) concurrently maintaining thousands of synopses of various types for thousands of streams, on demand, (ii) reusing synopses that are common across various concurrent workflows, (iii) providing data summarization facilities even for cross-(Big Data) platform workflows, (iv) plugging in new synopses on the fly, and (v) increasing the potential for workflow execution optimization. The proposed SDE-as-a-service provides interactive analytics at scale by enabling three types of scalability: (i) enhanced horizontal scalability, i.e., not only scaling out the computation to a number of processing units available in a computer cluster, but also harnessing the processing load assigned to each unit by operating on carefully crafted data summaries, (ii) vertical scalability, i.e., scaling the computation to very high numbers of processed streams, and (iii) federated scalability, i.e., scaling across geo-distributed clusters and clouds by controlling the communication required to answer global queries.
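
To make the idea of an on-demand, reusable stream synopsis concrete, here is a minimal single-process sketch in Python. It is not the SDE's implementation and does not use Apache Flink: the Count-Min sketch is just one well-known synopsis type chosen for illustration, and the names `CountMinSketch`, `SynopsisRegistry`, `request`, and `ingest` are hypothetical. It only shows two of the properties listed above: a synopsis is created when a workflow first asks for it, and later requests for the same stream and synopsis type reuse that instance.

```python
import hashlib


class CountMinSketch:
    """Fixed-size frequency summary; estimates may overcount but never undercount."""

    def __init__(self, width=256, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, item):
        # One salted hash per row, so the rows behave like independent hash functions.
        digest = hashlib.blake2b(
            item.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.rows[row][self._index(row, item)] += count

    def estimate(self, item):
        # The minimum over rows is the estimate least inflated by collisions.
        return min(
            self.rows[row][self._index(row, item)] for row in range(self.depth)
        )


class SynopsisRegistry:
    """Hypothetical registry: one synopsis per (stream_id, kind), built on demand."""

    def __init__(self):
        self._synopses = {}

    def request(self, stream_id, kind="countmin"):
        # Assumption: only one synopsis kind exists in this sketch.
        key = (stream_id, kind)
        if key not in self._synopses:
            self._synopses[key] = CountMinSketch()
        return self._synopses[key]

    def ingest(self, stream_id, item):
        # Update every synopsis currently maintained for this stream.
        for (sid, _kind), synopsis in self._synopses.items():
            if sid == stream_id:
                synopsis.add(item)


registry = SynopsisRegistry()
workflow_a = registry.request("stock-ticks")
workflow_b = registry.request("stock-ticks")  # same object, shared across workflows
for symbol in ["AAPL", "MSFT", "AAPL", "AAPL"]:
    registry.ingest("stock-ticks", symbol)
```

Both workflows query the same summary, whose size stays fixed no matter how many tuples the stream delivers; that fixed footprint is what lets a summary stand in for the raw stream when answering approximate queries.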

Files

IS2023.pdf (1.7 MB), md5:a4249fbc1253ed5bda8e19e3a8fe5778

Additional details

DEDS – Data Engineering for Data Science 955895: European Commission
EVENFLOW – Robust Learning and Reasoning for Complex Event Forecasting 101070430: European Commission
STELAR – Spatio-TEmporal Linked data tools for the AgRi-food data space 101070122: European Commission

