HPC Software Codesign in GROMACS
Description
Abstract
GROMACS is a widely used molecular dynamics simulation package known for its versatility, performance, and portability, with excellent efficiency and scalability from laptops to supercomputers. This is enabled by state-of-the-art parallel algorithms and bottom-up performance optimizations that target all levels of hardware parallelism, from SIMD through multicore, NUMA, and accelerators to intra- and inter-node parallelism. Motivated by the end of Dennard scaling, the increasing need to parallelize, and the need to make efficient use of microprocessors, the GROMACS engine has evolved through a series of algorithmic and parallelization redesigns. Codesign has been at the core of these redesigns, and this talk gives an overview of them.
Fundamental molecular dynamics algorithms have been reformulated to target wide SIMD/SIMT-style architectures, combining a physics- and accuracy-based approach with computer science and HPC performance engineering. A SIMD abstraction layer and an algorithm benchmark miniapp were developed to facilitate codesign and to allow quick and relatively easy porting to new SIMD instruction sets without expert knowledge of the algorithms or the application, often in just hours to days.
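To illustrate the idea of a SIMD abstraction layer, the following is a minimal sketch, not the GROMACS implementation. All names (`SimdFloat`, `load`, `store`, `fusedMultiplyAdd`, `reduce`, `squaredDistance`) are hypothetical and chosen for illustration only. The sketch assumes a reference backend built on plain loops; porting to a new instruction set would then mean reimplementing only the small set of primitive operations with that hardware's intrinsics, while kernels written against the interface stay unchanged.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical portable SIMD type: a fixed-width pack of floats.
// This reference backend stores the lanes in a plain array; a hardware
// backend would wrap a native vector register behind the same interface.
template <std::size_t Width>
struct SimdFloat {
    std::array<float, Width> lane{};
};

// Primitive operations: the only code a new backend would need to provide.
template <std::size_t W>
SimdFloat<W> load(const float* p) {
    SimdFloat<W> r;
    for (std::size_t i = 0; i < W; ++i) r.lane[i] = p[i];
    return r;
}

template <std::size_t W>
void store(float* p, const SimdFloat<W>& v) {
    for (std::size_t i = 0; i < W; ++i) p[i] = v.lane[i];
}

template <std::size_t W>
SimdFloat<W> operator*(const SimdFloat<W>& a, const SimdFloat<W>& b) {
    SimdFloat<W> r;
    for (std::size_t i = 0; i < W; ++i) r.lane[i] = a.lane[i] * b.lane[i];
    return r;
}

// Computes a * b + c lane by lane (a real backend could map this to an FMA instruction).
template <std::size_t W>
SimdFloat<W> fusedMultiplyAdd(const SimdFloat<W>& a, const SimdFloat<W>& b,
                              const SimdFloat<W>& c) {
    SimdFloat<W> r;
    for (std::size_t i = 0; i < W; ++i) r.lane[i] = a.lane[i] * b.lane[i] + c.lane[i];
    return r;
}

// Horizontal sum over all lanes.
template <std::size_t W>
float reduce(const SimdFloat<W>& v) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < W; ++i) sum += v.lane[i];
    return sum;
}

// Example kernel written only against the abstraction: squared distances
// dx^2 + dy^2 + dz^2 for W particle pairs at once.
template <std::size_t W>
SimdFloat<W> squaredDistance(const float* dx, const float* dy, const float* dz) {
    assert(dx != nullptr && dy != nullptr && dz != nullptr);
    const SimdFloat<W> x = load<W>(dx);
    const SimdFloat<W> y = load<W>(dy);
    const SimdFloat<W> z = load<W>(dz);
    return fusedMultiplyAdd(z, z, fusedMultiplyAdd(y, y, x * x));
}
```

A caller would fill three arrays of `W` coordinate differences, call `squaredDistance<W>(dx, dy, dz)`, and write the result back with `store`. The kernel never mentions a specific instruction set, which is the property that keeps a port confined to the primitives.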
As early as 2011, a collaboration with NVIDIA became an integral part of GROMACS development. This collaboration had a two-way impact: learning from hardware and library engineers provided essential guidance, while our use cases motivated improvements and features exposed in CUDA, such as stream priorities, 3D FFT optimizations, and most recently the distributed cuFFTMp library. The collaboration evolved into an ongoing codesign project with important recent results, including a GPU-resident loop, the design and implementation of the direct GPU communication layer, and a distributed GPU-optimized particle mesh Ewald algorithm.
The Intel CoE collaboration project delivered results with long-term impact: a nearly complete SYCL backend for GROMACS, planned to become the new standards-based GPU portability layer with support for all major GPU platforms and the primary means of targeting AMD and Intel GPUs.
Files
GROMACS-HPC-codesign_compr.pdf (6.6 MB)
md5:8aa6bd605a807fbb4809e761faed3681