Published April 13, 2025 | Version v4
Software Open

CQSim Plus

  • 1. University of Illinois Chicago

Description

Efficient job scheduling is crucial in high-performance computing (HPC), balancing user demands for quick job turnaround with facility goals for high resource utilization. Traditional scheduling requires users to specify a system at job submission, which can lead to inefficiencies. A unified scheduling approach, viewing the resources within a computing facility as an integrated pool, promises improved resource use and reduced job wait times. This paper presents CQSim+, an open-source, discrete event-driven simulator tailored for symbiotic multi-resource scheduling. CQSim+ supports dynamic simulation by continuously integrating real-time data from job schedulers, enabling adaptive scheduling based on the system’s current state. Through extensive experimentation, we demonstrate CQSim+’s ability to enhance resource utilization and decrease job wait times in both homogeneous and heterogeneous HPC environments. Additionally, we present a case study that coordinates job scheduling between two production systems, illustrating how CQSim+ can effectively optimize job scheduling across distinct systems.
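
The contrast between per-system submission and a unified resource pool can be illustrated with a toy discrete event-driven loop. The sketch below is not CQSim+ code and does not use its API: the `Job` fields, the `simulate` function, the two systems "A" and "B", and the plain first-come-first-served policy without backfilling are all assumptions chosen for illustration only.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Job:
    # Hypothetical job record; field names are illustrative only.
    job_id: int
    submit: float   # submission time
    runtime: float  # execution time once started
    nodes: int      # nodes requested
    system: str     # system the user named at submission


def simulate(jobs, capacity, pooled):
    """Toy event-driven FCFS simulation; returns {job_id: wait_time}.

    pooled=False: each job may only run on the system it named.
    pooled=True:  a job may run on any system with enough free nodes.
    """
    free = dict(capacity)          # free nodes per system
    events = []                    # min-heap of (time, seq, kind, payload)
    seq = 0                        # unique tie-breaker for the heap
    for job in jobs:
        heapq.heappush(events, (job.submit, seq, "submit", job))
        seq += 1

    queue = []                     # waiting jobs in arrival order
    waits = {}
    while events:
        now, _, kind, payload = heapq.heappop(events)
        if kind == "submit":
            queue.append(payload)
        else:                      # "finish": release the job's nodes
            system, nodes = payload
            free[system] += nodes

        # Try to start waiting jobs in FCFS order. A system becomes
        # "blocked" once an earlier job is still waiting on it, so later
        # jobs cannot jump ahead (no backfilling).
        blocked = set()
        still_waiting = []
        for job in queue:
            allowed = sorted(free) if pooled else [job.system]
            usable = [s for s in allowed if s not in blocked]
            target = next((s for s in usable if free[s] >= job.nodes), None)
            if target is None:
                blocked.update(allowed)
                still_waiting.append(job)
                continue
            free[target] -= job.nodes
            waits[job.job_id] = now - job.submit
            heapq.heappush(
                events, (now + job.runtime, seq, "finish", (target, job.nodes))
            )
            seq += 1
        queue = still_waiting
    return waits


if __name__ == "__main__":
    demo_jobs = [
        Job(1, submit=0, runtime=10, nodes=4, system="A"),
        Job(2, submit=1, runtime=5, nodes=4, system="A"),
        Job(3, submit=2, runtime=5, nodes=2, system="B"),
    ]
    demo_capacity = {"A": 4, "B": 4}
    for pooled in (False, True):
        waits = simulate(demo_jobs, demo_capacity, pooled)
        print("pooled" if pooled else "pinned", sum(waits.values()))
```

In this made-up three-job workload, job 2 waits behind job 1 on system "A" when jobs are pinned, while the pooled variant places it on the idle system "B" at once; total wait drops from 9 to 4 time units. The numbers say nothing about the results reported for CQSim+, which come from the authors' own experiments.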

Files

CQSimPlus.zip (64.2 MB)
md5:329b6b722b8cdbbb83cad4dfd68aee81