Enabling Transparent Acceleration of Big Data Frameworks Using Heterogeneous Hardware
Creators
- 1. The University of Manchester
- 2. National Technical University of Athens
Description
The ever-increasing demand for high-performance Big Data analytics and data processing has paved the way for heterogeneous hardware accelerators, such as Graphics Processing Units (GPUs) and Field Programmable Gate Arrays (FPGAs), to be integrated into modern Big Data platforms. Currently, this integration comes at the cost of programmability, since the end-user Application Programming Interfaces (APIs) must be altered to access the underlying heterogeneous hardware. For example, current Big Data frameworks, such as Apache Spark, provide a new API that combines the existing Spark programming model with GPUs. For other Big Data frameworks, such as Flink, the integration of GPUs and FPGAs is achieved via external API calls that bypass their execution models completely.
In this paper, we rethink current Big Data frameworks from a systems and programming language perspective, and introduce a novel co-designed approach for integrating hardware acceleration into their execution models. The novelty of our approach is attributed to two key design decisions: a) support for arbitrary User Defined Functions (UDFs), and b) no modifications to the user-level API. The proposed approach has been prototyped in the context of Apache Flink, and enables unmodified applications written in Java to run on heterogeneous hardware, such as GPUs and FPGAs, transparently to the users. The performance evaluation of the proposed solution shows speedups of up to 65x on GPUs and 184x on FPGAs for suitable workloads of standard benchmarks and industrial use cases, compared with vanilla Flink running on traditional multi-core CPUs.
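The two design decisions above (arbitrary UDFs, unchanged user-level API) can be illustrated with a minimal Java sketch. It is not the paper's prototype and does not use Flink: `Backend`, `CpuBackend`, `SimulatedAcceleratorBackend`, and `MapOperator` are hypothetical names, and the "accelerator" is only simulated with a parallel stream. The point it tries to show is that the user writes one ordinary UDF, while the runtime decides where it runs.

```java
import java.util.Arrays;
import java.util.function.IntUnaryOperator;

public class TransparentDispatchSketch {

    /** Hypothetical backend abstraction; not part of Flink or of the paper's prototype. */
    interface Backend {
        boolean available();
        int[] map(int[] input, IntUnaryOperator udf);
        String name();
    }

    /** Plain sequential execution on the CPU. */
    static final class CpuBackend implements Backend {
        public boolean available() { return true; }
        public int[] map(int[] input, IntUnaryOperator udf) {
            return Arrays.stream(input).map(udf).toArray();
        }
        public String name() { return "cpu"; }
    }

    /** Stand-in for a GPU/FPGA backend: simulated here with a parallel stream (assumption). */
    static final class SimulatedAcceleratorBackend implements Backend {
        private final boolean present;
        SimulatedAcceleratorBackend(boolean present) { this.present = present; }
        public boolean available() { return present; }
        public int[] map(int[] input, IntUnaryOperator udf) {
            return Arrays.stream(input).parallel().map(udf).toArray();
        }
        public String name() { return "simulated-accelerator"; }
    }

    /** Runtime-side operator: picks the first available backend in preference order. */
    static final class MapOperator {
        private final Backend[] preferenceOrder;
        MapOperator(Backend... preferenceOrder) { this.preferenceOrder = preferenceOrder; }

        Backend select() {
            for (Backend b : preferenceOrder) {
                if (b.available()) return b;
            }
            throw new IllegalStateException("no backend available");
        }

        /** The user-facing call is identical regardless of the backend chosen. */
        int[] run(int[] input, IntUnaryOperator udf) {
            return select().map(input, udf);
        }
    }

    public static void main(String[] args) {
        IntUnaryOperator udf = x -> x * x + 1;   // the user's unmodified UDF
        int[] data = {1, 2, 3, 4};

        MapOperator withAccel = new MapOperator(new SimulatedAcceleratorBackend(true), new CpuBackend());
        MapOperator cpuOnly = new MapOperator(new SimulatedAcceleratorBackend(false), new CpuBackend());

        System.out.println(withAccel.select().name() + " " + Arrays.toString(withAccel.run(data, udf)));
        System.out.println(cpuOnly.select().name() + " " + Arrays.toString(cpuOnly.run(data, udf)));
    }
}
```

In this sketch the same `udf` and the same `run` call produce identical results on either backend. How the actual prototype compiles and offloads Java UDFs to GPUs and FPGAs is described in the paper itself, not in this abstract.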
Files
Name | Size | Checksum |
---|---|---|
xekalaki-VLDB.pdf | 2.1 MB | md5:184d726aff0aab7e99000591653c9e1e |
Additional details
Funding
- European Commission
- ELEGANT – Secure and Seamless Edge-to-Cloud Analytics 957286
- European Commission
- E2DATA – European Extreme Performing Big Data Stacks 780245
- European Commission
- ENCRYPT – A SCALABLE AND PRACTICAL PRIVACY-PRESERVING FRAMEWORK 101070670
- European Commission
- TANGO – Digital Technologies ActiNg as a Gatekeeper to information and data flOws 101070052