CQSim Plus
Description
Efficient job scheduling is crucial in high-performance computing (HPC), balancing user demands for quick job turnaround with facility goals for high resource utilization. Traditional scheduling requires users to specify a system at job submission, which can lead to inefficiencies. A unified scheduling approach, viewing the resources within a computing facility as an integrated pool, promises improved resource use and reduced job wait times. This paper presents CQSim+, an open-source, discrete event-driven simulator tailored for symbiotic multi-resource scheduling. CQSim+ supports dynamic simulation by continuously integrating real-time data from job schedulers, enabling adaptive scheduling based on the system’s current state. Through extensive experimentation, we demonstrate CQSim+’s ability to enhance resource utilization and decrease job wait times in both homogeneous and heterogeneous HPC environments. Additionally, we present a case study that coordinates job scheduling between two production systems, illustrating how CQSim+ can effectively optimize job scheduling across distinct systems.
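To make the contrast between per-system submission and a unified resource pool concrete, here is a toy discrete-event sketch. It is not CQSim+ code and does not use its API: the `Job` fields, the `simulate` function, the greedy first-fit policy, and the two-system setup are all assumptions chosen for illustration.

```python
import heapq
from dataclasses import dataclass

FINISH, SUBMIT = 0, 1  # finish events sort before submit events at equal times


@dataclass(frozen=True)
class Job:
    job_id: int
    submit: float   # submission time
    runtime: float  # execution time
    nodes: int      # nodes requested
    system: str     # system the user named at submission


def simulate(jobs, capacity, unified):
    """Greedy first-fit, in arrival order; returns the mean wait time.

    With unified=False a job may only run on the system it named.
    With unified=True it may run on any system with enough free nodes.
    """
    free = dict(capacity)
    # Event tuples are (time, kind, sequence number, payload); the unique
    # sequence number guarantees payloads are never compared.
    events = [(job.submit, SUBMIT, i, job) for i, job in enumerate(jobs)]
    heapq.heapify(events)
    counter = len(jobs)
    queue, waits = [], []
    while events:
        now, kind, _, payload = heapq.heappop(events)
        if kind == SUBMIT:
            queue.append(payload)
        else:
            system, nodes = payload
            free[system] += nodes  # a finished job releases its nodes
        still_waiting = []
        for job in queue:
            if unified:
                candidates = [job.system] + [s for s in free if s != job.system]
            else:
                candidates = [job.system]
            target = next((s for s in candidates if free[s] >= job.nodes), None)
            if target is None:
                still_waiting.append(job)
                continue
            free[target] -= job.nodes
            waits.append(now - job.submit)
            heapq.heappush(
                events, (now + job.runtime, FINISH, counter, (target, job.nodes))
            )
            counter += 1
        queue = still_waiting
    return sum(waits) / len(waits) if waits else 0.0


# Two 4-node jobs, both submitted to system "A" at time 0.
jobs = [Job(0, 0.0, 10.0, 4, "A"), Job(1, 0.0, 10.0, 4, "A")]
capacity = {"A": 4, "B": 4}
print(simulate(jobs, capacity, unified=False))  # 5.0: the second job waits for "A"
print(simulate(jobs, capacity, unified=True))   # 0.0: the second job runs on "B"
```

In this toy case the pooled view removes the wait entirely because system "B" would otherwise sit idle. Real workloads, backfilling policies, and heterogeneous hardware constraints are far more involved than this sketch.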
Files
| Name | Size | MD5 |
|---|---|---|
| CQSimPlus.zip | 64.2 MB | 329b6b722b8cdbbb83cad4dfd68aee81 |
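The MD5 checksum listed for CQSimPlus.zip (329b6b722b8cdbbb83cad4dfd68aee81) can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard `hashlib`; the helper names `md5_of_file` and `verify`, and the assumption that the archive was saved as `CQSimPlus.zip` in the current directory, are illustrative only.

```python
import hashlib

# Checksum published alongside the archive.
EXPECTED_MD5 = "329b6b722b8cdbbb83cad4dfd68aee81"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_MD5):
    """True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local path of the downloaded archive.
    print("checksum OK" if verify("CQSimPlus.zip") else "checksum MISMATCH")
```

Reading in chunks keeps memory use flat for a 64.2 MB archive. MD5 is adequate for detecting a corrupted download but is not a defense against deliberate tampering.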