Quantum-Enhanced Carbon-Aware Scheduling for Large Language Model Inference
Authors/Creators
Description
As the computational demand for Large Language Models (LLMs) surges, minimizing the carbon footprint of inference has become a critical challenge. While classical schedulers optimize for throughput, they often neglect the spatial and temporal variance of grid carbon intensity. This paper presents a Hybrid Quantum-Classical (HQC) framework utilizing the Quantum Approximate Optimization Algorithm (QAOA) to solve the layer-to-hardware mapping problem with the explicit objective of minimizing gCO$_2$e emissions. We benchmark our QAOA optimizer against classical Brute Force and Genetic Algorithms across static, dynamic, and noisy environments. Our results demonstrate that QAOA achieves near-perfect optimality (optimality gap $< 10^{-5}$) and successfully adapts to 24-hour grid fluctuations, realizing a simulated carbon saving of 23.76 gCO$_2$e. However, the study also reveals a "Simulation Wall" at N=15 layers, where classical simulation of the quantum circuit becomes computationally prohibitive, whereas Genetic Algorithms maintain speed at the cost of theoretical guarantees. We conclude that QAOA represents a scalable, robust pathway for green AI, provided the optimizer is migrated from simulation to physical Quantum Processing Units (QPUs).
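To make the optimization objective concrete, the sketch below is a hypothetical illustration (not the paper's code) of the layer-to-hardware mapping problem the abstract describes, solved with the classical Brute Force baseline that QAOA is benchmarked against. All data values, names, and the problem size are assumptions for illustration only.

```python
from itertools import product

# Assumed toy data: per-layer compute cost (arbitrary energy units) and
# per-device grid carbon intensity (gCO2e per energy unit).
layer_cost = [4.0, 2.5, 3.0]        # N = 3 LLM layers
device_intensity = [0.9, 0.4, 0.7]  # 3 candidate hardware devices

def emissions(assignment):
    """Total gCO2e when layer i runs on device assignment[i]."""
    return sum(c * device_intensity[d] for c, d in zip(layer_cost, assignment))

# Exhaustive search over all devices^N mappings -- the Brute Force baseline.
# The search space grows exponentially in N, which is why the paper turns to
# QAOA (and why classical simulation of QAOA itself hits a wall at N = 15).
best = min(product(range(len(device_intensity)), repeat=len(layer_cost)),
           key=emissions)
print(best, round(emissions(best), 2))  # → (1, 1, 1) 3.8
```

With these toy numbers every layer lands on the lowest-intensity device; in the dynamic setting the abstract describes, `device_intensity` would vary over a 24-hour grid profile, shifting the optimal mapping over time.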
Files
QAOA.pdf (220.3 kB)
md5:80255a55d0ec2c6fedf1529db77e3e65