Executing dynamic heterogeneous workloads on Blue Waters with RADICAL-Pilot
- 1. Rutgers University
- 2. Intel Corporation
- 3. University of Edinburgh
Description
Traditionally, HPC systems such as Crays have been designed to support mostly monolithic workloads. However, the workloads of many important scientific applications are composed of spatially and temporally heterogeneous tasks that are often dynamically inter-related. These workloads can benefit from being executed at scale on HPC resources, but a tension exists between the workloads' resource utilization requirements and the capabilities of the HPC system software and usage policies. Pilot systems have the potential to relieve this tension. RADICAL-Pilot is a scalable and portable pilot system that enables the execution of such diverse workloads. In this paper, we describe the design and characterize the performance of RADICAL-Pilot's scheduling and executing components on Crays, which are engineered for efficient resource utilization while maintaining the full generality of the Pilot abstraction. We discuss four different implementations of support for RADICAL-Pilot on Cray systems and analyze and report on their performance.
Files
Name | Size |
---|---|
pap130s2-file1.pdf (md5:5189a2a4c0beeff92a8e3a0e1451d929) | 484.2 kB |
Additional details
Funding
- U.S. National Science Foundation
- SI2-SSE: RADICAL Cybertools: Scalable, Interoperable and Sustainable Tools for Science (award 1440677)