Published January 25, 2017 | Version v1
Conference paper Open

Dataflow Acceleration of scikit-learn Gaussian Process Regression

Description

The big data revolution has sparked the widespread use of predictive data analytics based on sophisticated machine learning tasks. Fast data analysis has become very important, which pushes software developers and computer architects to deliver more efficient design solutions that can meet the increased performance requirements. Dataflow computing engines from Maxeler have recently emerged as a promising way of performing high-performance computation using FPGA devices. In this paper, we focus on exploiting Maxeler’s dataflow computing to accelerate Gaussian Process Regression from the scikit-learn Python library, one of the most computationally intensive machine learning algorithms and one with poor scaling characteristics. Through extensive analysis over diverse datasets, we point out which NumPy and SciPy functions form the major performance bottlenecks and should be implemented in a dataflow acceleration engine, and then we discuss the mapping decisions that enable the generation of parameterized dataflow engines. Finally, we show that the proposed acceleration solution delivers significant speedups for the examined datasets, while also scaling well with increasing dataset sizes.
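
To make the scaling concern in the description concrete, the following is a minimal sketch of Gaussian Process Regression in plain Python. It is not the paper's code and not scikit-learn's implementation: the helper names (`rbf`, `cholesky_lower`, `solve_with_cholesky`, `gp_fit`, `gp_predict`), the squared-exponential kernel, and the one-dimensional inputs are all assumptions chosen for illustration. It only shows why fitting is expensive: the n-by-n kernel matrix has to be factorized, which costs on the order of n^3 operations. The bottleneck functions actually identified in the paper are not reproduced here.

```python
import math


def rbf(a, b, length_scale=1.0):
    """Squared-exponential covariance between two scalar inputs (assumed kernel)."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def cholesky_lower(A):
    """Return lower-triangular L with L * L^T = A.

    The triple-nested work here (two loops plus the inner sum) is the
    cubic-cost step that dominates fitting as the dataset grows.
    """
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def solve_with_cholesky(L, b):
    """Solve (L * L^T) x = b by forward, then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def gp_fit(X, y, noise=1e-8):
    """Build the kernel matrix K + noise * I and return alpha = K^-1 y."""
    K = [
        [rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(X)]
        for i, a in enumerate(X)
    ]
    return solve_with_cholesky(cholesky_lower(K), y)


def gp_predict(X_train, alpha, x_new):
    """Posterior mean at x_new: k(x_new, X_train) dot alpha."""
    return sum(rbf(x_new, xi) * ai for xi, ai in zip(X_train, alpha))


if __name__ == "__main__":
    X = [0.0, 1.0, 2.0, 3.0]
    y = [math.sin(x) for x in X]
    alpha = gp_fit(X, y)
    print(gp_predict(X, alpha, 1.5))
```

With a near-zero noise term the posterior mean interpolates the training targets, which gives a quick sanity check on the factorization and the two triangular solves.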

Files

PARMA-DITAM-2017.pdf

Files (455.9 kB)

md5:904269d6b3cc806dd82b5db176c9e072
455.9 kB

Additional details

Funding

AEGLE – AEGLE (Ancient Greek: Αἴγλη) – An analytics framework for integrated and personalized healthcare services in Europe 644906
European Commission