Published February 20, 2023 | Version v1
Presentation Open

GPU Performance Portability Using Standard C++ and SYCL

Creators

  • 1. Codeplay Software

Description

The proliferation of accelerators, in particular GPUs, over the past decade is
changing the way software is developed. Most developers who have been using
CPU-based machines are now considering how to improve application performance
by offloading execution to many-core processors. Many emerging disciplines,
including AI, deep neural networks and machine learning, have shown that GPUs
can increase performance many times over compared to CPU-only architectures.
New hardware features such as "tensor cores" are also starting to emerge to
address specific problems, including mixed-precision computing. The new
challenge for developers is figuring out how to develop for heterogeneous
architectures that include GPUs made by different companies. Currently the most
common way to develop software for GPUs is the CUDA programming model, but this
has pitfalls. CUDA uses non-standard C++ syntax and semantics, is a proprietary
interface, and can only be used to target Nvidia GPUs. Alternatives include
HIP, which offers another proprietary programming interface only capable of
targeting AMD GPUs.
This presentation will demonstrate how standard C++ code with SYCL can be used
to achieve performance portability on processors from multiple vendors,
including Nvidia GPUs, AMD GPUs and Intel GPUs. The SYCL programming interface
is a royalty-free, industry-defined open standard designed to enable the latest
features of accelerators. Using an open source project, we’ll show how standard
C++ syntax and semantics are used to define the SYCL kernel and memory
management code required to offload parallel execution to a range of GPUs. We’ll
also explain how easy it is to compile this C++ code using a SYCL compiler so
that it can run on Nvidia, AMD and Intel GPUs, and compare its execution
performance with the same code written using the proprietary CUDA and HIP
environments.
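
To give a rough idea of what "SYCL kernel and memory management code" written in standard C++ can look like, here is a minimal sketch of an element-wise vector addition. It is not taken from the presentation or from the open source project it mentions; the function name vector_add and the SKETCH_USE_SYCL macro are hypothetical names chosen for illustration. The SYCL branch assumes a SYCL 2020 compiler that predefines SYCL_LANGUAGE_VERSION and provides <sycl/sycl.hpp>; with an ordinary C++17 compiler the sketch falls back to a plain host loop so that it still builds and runs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: a SYCL 2020 compiler predefines SYCL_LANGUAGE_VERSION and
// ships <sycl/sycl.hpp>. SKETCH_USE_SYCL is a hypothetical macro used only
// by this sketch to pick between the SYCL path and the host fallback.
#if defined(SYCL_LANGUAGE_VERSION) && __has_include(<sycl/sycl.hpp>)
#include <sycl/sycl.hpp>
#define SKETCH_USE_SYCL 1
#endif

// Hypothetical helper: returns the element-wise sum of two equally sized
// vectors, computed on a SYCL device when one is available at build time.
std::vector<float> vector_add(const std::vector<float>& a,
                              const std::vector<float>& b) {
  assert(a.size() == b.size());
  const std::size_t n = a.size();
  std::vector<float> out(n);
  if (n == 0) return out;  // nothing to offload

#ifdef SKETCH_USE_SYCL
  sycl::queue q;  // default device selection; may be a GPU or the host CPU
  {
    // Memory management: buffers wrap the host data for use on the device.
    sycl::buffer<float, 1> buf_a(a.data(), sycl::range<1>{n});
    sycl::buffer<float, 1> buf_b(b.data(), sycl::range<1>{n});
    sycl::buffer<float, 1> buf_out(out.data(), sycl::range<1>{n});

    q.submit([&](sycl::handler& h) {
      sycl::accessor in_a(buf_a, h, sycl::read_only);
      sycl::accessor in_b(buf_b, h, sycl::read_only);
      sycl::accessor res(buf_out, h, sycl::write_only, sycl::no_init);
      // The kernel is an ordinary C++ lambda invoked once per index.
      h.parallel_for(sycl::range<1>{n}, [=](sycl::id<1> i) {
        res[i] = in_a[i] + in_b[i];
      });
    });
  }  // buffer destructors wait for the kernel and copy results into `out`
#else
  // Host fallback with the same per-element logic as the kernel above.
  for (std::size_t i = 0; i < n; ++i) out[i] = a[i] + b[i];
#endif
  return out;
}
```

Calling vector_add({1.0f, 2.0f}, {3.0f, 4.0f}) yields a vector holding 4.0f and 6.0f on either path. Which device the SYCL branch runs on, and the compiler flags needed to target a particular vendor's GPU, depend on the SYCL implementation in use and are not covered by this sketch.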

Files

WAMTA23 Hugh.pdf

Files (2.2 MB)

md5:2505aaf853c6c52c2afd33bb29548a72