Info: Zenodo’s user support line is staffed on regular business days between Dec 23 and Jan 5. Response times may be slightly longer than normal.

Published May 2, 2018 | Version v1
Conference paper Open

Exploring Functional Acceleration of OpenCL on FPGAs and GPUs Through Platform-Independent Optimizations

  • 1. Queens University Belfast

Description

OpenCL has been proposed as a means of accelerating functional computation using FPGA and GPU accelerators. Although it provides ease of programmability and code portability, questions remain about the performance portability and underlying vendor's compiler capabilities to generate efficient implementations without user-dened, platform specic optimizations. In this work, we systematically evaluate this by formalizing a design space exploration strategy using platform-independent micro-architectural and application-specic optimizations only. The optimizations are then applied across Altera FPGA, NVIDIA GPU and ARM Mali GPU platforms for three computing examples, namely matrix-matrix multiplication, binomial-tree option pricing and 3-dimensional nite difference time domain. Our strategy enables a fair comparison across platforms in terms of throughput and energy efficiency by using the same design effort. Our results indicate that FPGA provides better performance portability in terms of achieved percentage of device's peak performance (68%) compared to NVIDIA GPU (20%) and also achieves better energy efficiency (up to 1:4X) for some of the considered cases without requiring in-depth hardware design expertise.

Files

Exploring Functional Acceleration of OpenCL_QUB.pdf

Files (965.8 kB)

Additional details

Funding

VINEYARD – Versatile Integrated Accelerator-based Heterogeneous Data Centres 687628
European Commission