Efficient GPU Offloading with OpenMP for a Hyperbolic Finite Volume Solver on Dynamically Adaptive Meshes

Mario Wille; Tobias Weinzierl; Gonzalo Brito Gadeschi; Michael Bader

doi:10.5281/zenodo.7951195

Published May 19, 2023 | Version 1.0.2

Dataset | Open

1. Technical University of Munich
2. Durham University
3. NVIDIA

We identify and show how to overcome an OpenMP bottleneck in the administration of GPU memory. It arises for a wave equation solver on dynamically adaptive block-structured Cartesian meshes, which keeps all CPU threads busy and allows all of them to offload sets of patches to the GPU. Our studies show that multithreaded, concurrent, non-deterministic access to the GPU leads to performance breakdowns, since the GPU memory bookkeeping as offered through OpenMP's map clause, i.e., the allocation and freeing, becomes another runtime challenge besides expensive data transfer and actual computation. We, therefore, propose to retain the memory management responsibility on the host: A caching mechanism acquires memory on the accelerator for all CPU threads, keeps hold of this memory and hands it out to the offloading threads upon demand. We show that this user-managed, CPU-based memory administration helps us to overcome the GPU memory bookkeeping bottleneck and speeds up the time-to-solution of Finite Volume kernels by more than an order of magnitude.
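
The caching mechanism described above can be sketched roughly as follows. This is a minimal illustration and not the code shipped in the artifact: the class name DeviceMemoryPool and its methods acquire and release are hypothetical, and the pool is keyed by exact buffer size, which is an assumption made here for simplicity. So that the sketch runs without a GPU, the default allocator falls back to host malloc/free; in an offloading build one would presumably pass callbacks that wrap OpenMP's omp_target_alloc and omp_target_free for the chosen device.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <map>
#include <mutex>
#include <utility>

// Hypothetical host-side cache of accelerator buffers. All names are
// illustrative and not taken from the artifact.
class DeviceMemoryPool {
public:
  using Alloc = std::function<void*(std::size_t)>;
  using Free  = std::function<void(void*)>;

  // Assumption: the defaults use host malloc/free as a stand-in. On a GPU one
  // would pass e.g. [dev](std::size_t n) { return omp_target_alloc(n, dev); }
  // and [dev](void* p) { omp_target_free(p, dev); } instead.
  explicit DeviceMemoryPool(
      Alloc alloc = [](std::size_t n) { return std::malloc(n); },
      Free dealloc = [](void* p) { std::free(p); })
      : alloc_(std::move(alloc)), free_(std::move(dealloc)) {}

  // Only buffers currently parked in the pool are freed here; buffers still
  // held by a caller remain the caller's responsibility.
  ~DeviceMemoryPool() {
    for (auto& entry : idle_) free_(entry.second);
  }

  DeviceMemoryPool(const DeviceMemoryPool&) = delete;
  DeviceMemoryPool& operator=(const DeviceMemoryPool&) = delete;

  // Hand out a cached buffer of exactly this size if one is idle,
  // otherwise allocate a fresh one.
  void* acquire(std::size_t bytes) {
    assert(bytes > 0);
    {
      std::lock_guard<std::mutex> lock(mutex_);
      auto it = idle_.find(bytes);
      if (it != idle_.end()) {
        void* p = it->second;
        idle_.erase(it);
        ++hits_;
        return p;
      }
      ++misses_;
    }
    return alloc_(bytes);  // allocate outside the lock
  }

  // Return a buffer to the pool instead of freeing it, so that the next
  // offloading thread can reuse it.
  void release(void* p, std::size_t bytes) {
    std::lock_guard<std::mutex> lock(mutex_);
    idle_.emplace(bytes, p);
  }

  std::size_t hits() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return hits_;
  }

  std::size_t misses() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return misses_;
  }

private:
  Alloc alloc_;
  Free free_;
  mutable std::mutex mutex_;
  std::multimap<std::size_t, void*> idle_;  // size in bytes -> idle buffer
  std::size_t hits_ = 0;
  std::size_t misses_ = 0;
};
```

In such a design, each CPU thread would call acquire before offloading a set of patches and release afterwards, so the accelerator's allocator is only touched on a cache miss. The single mutex keeps the sketch short; a production version might use per-thread free lists or size classes instead.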

Files (4.2 MB)

Name	MD5	Size
AMD Results.pdf	ed795d349dde3726a0ee690788e81c8f	169.5 kB
amd-map.zip	37d6d46809f9bde7f83c7f98cf360a50	633.1 kB
amd-user-managed.zip	7407c4281fbf46cb97644bb9cd0d3ccc	650.4 kB
Artifact Description.pdf	3d61f86afbdfbc09fdd898fcf41016d7	159.4 kB
Compiler Bugs.pdf	fa7204a4b09fb4076ebee8157ad9d01f	239.8 kB
create-strong-plot.py	d13ed845a6cb26f40e0f460c178a07fa	6.6 kB
create-weak-plot.py	8fa0482e116febd38c159a5f3fffb6b7	6.3 kB
generate_results.sh	48af74cbbac7e56da2270bbe4173b779	3.7 kB
nvidia-cuda-managed.zip	a4f97fbc0a3812be92f1c70d6acee837	797.0 kB
nvidia-map.zip	bb68a45a27b26dfac59fd1f9837baaa0	760.2 kB
nvidia-user-managed.zip	0c8bfcec4d3fa6085988c6e1223c70e9	798.3 kB

	All versions	This version
Views	265	140
Downloads	452	299
Data volume	140.8 MB	108.9 MB
