Published July 21, 2016 | Keywords: big data, analytics, performance, power consumption, SPARK, Hadoop
Conference paper Open

Performance-Power Exploration of Software-Defined Big Data Analytics: The AEGLE Cloud Backend

Description

In this paper, we present the design and analyze the performance-energy characteristics of a software-defined infrastructure targeting Big Data analytics workloads. This software-defined Big Data framework forms the data analytics platform adopted in AEGLE, a European H2020-funded project for healthcare analytics. The developed framework utilizes state-of-the-art open source solutions and is flexible enough to enable the definition and automatic deployment of differing SPARK over Hadoop cluster configurations as analytics engines. In this paper, we exploit this flexibility of our software-defined infrastructure to explore the performance-energy trade-offs of Big Data analytics under variable resource allocation scenarios. Specifically, we show that on our local infrastructure, i.e., two Intel Xeon E5-2658A servers with 128 GB RAM each, virtual cluster configurations with many nodes achieve the highest performance, while virtual clusters with high available RAM are more power efficient, exhibiting higher instructions per cycle (IPC) per kilojoule values.
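
The efficiency metric named in the abstract, IPC per kilojoule, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the `RunSample` schema, the function names, and the counter values are hypothetical placeholders and are not taken from the paper or from its measurement setup. The sketch only assumes that retired instructions, elapsed cycles, and consumed energy are available per cluster configuration.

```python
from dataclasses import dataclass


@dataclass
class RunSample:
    """Counters for one run on one virtual cluster configuration (hypothetical schema)."""
    name: str
    instructions: int      # retired instructions over the run
    cycles: int            # CPU cycles over the run
    energy_joules: float   # energy consumed over the run, in joules


def ipc(run: RunSample) -> float:
    """Instructions per cycle for the run."""
    return run.instructions / run.cycles


def ipc_per_kilojoule(run: RunSample) -> float:
    """IPC divided by the energy of the run expressed in kilojoules."""
    return ipc(run) / (run.energy_joules / 1000.0)


def most_efficient(runs: list[RunSample]) -> RunSample:
    """Return the configuration with the highest IPC per kilojoule."""
    return max(runs, key=ipc_per_kilojoule)


if __name__ == "__main__":
    # Placeholder values invented for this example; they are NOT results from the paper.
    runs = [
        RunSample("config_a", instructions=800, cycles=1000, energy_joules=40_000.0),
        RunSample("config_b", instructions=900, cycles=1000, energy_joules=30_000.0),
    ]
    for run in runs:
        print(run.name, ipc(run), ipc_per_kilojoule(run))
    print("most efficient:", most_efficient(runs).name)
```

Normalizing IPC by energy, rather than reporting raw runtime, is what allows a slower configuration to rank as the more power-efficient one, which is the kind of trade-off the abstract describes.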

Files

SAMOS-2016.pdf (1.4 MB)
md5:61acb8a16b74788fb9d2e9640ebcd35e

Additional details

Funding

AEGLE – AEGLE (Ancient Greek: Αἴγλη) – An analytics framework for integrated and personalized healthcare services in Europe 644906
European Commission