Published February 4, 2026 | Version 1.0.1
Preprint | Open Access

Quantum-Enhanced Carbon-Aware Scheduling for Large Language Model Inference

Authors/Creators

Description

As the computational demand for Large Language Models (LLMs) surges, minimizing the carbon footprint of inference has become a critical challenge. While classical schedulers optimize for throughput, they often neglect the spatial and temporal variance of grid carbon intensity. This paper presents a Hybrid Quantum-Classical (HQC) framework that uses the Quantum Approximate Optimization Algorithm (QAOA) to solve the layer-to-hardware mapping problem with the explicit objective of minimizing gCO$_2$e emissions. We benchmark our QAOA optimizer against classical brute-force and Genetic Algorithm baselines across static, dynamic, and noisy environments. Our results demonstrate that QAOA achieves near-perfect optimality (gap $< 10^{-5}$) and successfully adapts to 24-hour grid fluctuations, realizing a simulated carbon saving of 23.76 gCO$_2$e. However, the study also reveals a "Simulation Wall" at N = 15 layers, where classical simulation of the quantum circuit becomes computationally prohibitive, whereas Genetic Algorithms maintain speed at the cost of theoretical guarantees. We conclude that QAOA represents a scalable, robust pathway for green AI, provided the optimizer is migrated from simulation to physical Quantum Processing Units (QPUs).
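To make the optimization objective concrete, the sketch below illustrates a carbon-aware layer-to-hardware assignment in its simplest form, solved by the brute-force enumeration the abstract uses as a classical baseline. All numbers, names, and the cost model (emissions = layer energy × node grid carbon intensity) are illustrative assumptions, not the paper's actual formulation; the real problem adds constraints (capacity, latency) and scales as M^N, which is what motivates QAOA and Genetic Algorithm heuristics.

```python
from itertools import product

# Hypothetical example data (assumed, not from the paper):
# energy drawn by each LLM layer, and grid carbon intensity at each node.
layer_energy_kwh = [0.12, 0.30, 0.18]       # per-layer energy (kWh)
node_gco2_per_kwh = [50.0, 300.0]           # carbon intensity per node (gCO2e/kWh)

def total_emissions(assignment):
    """Total gCO2e for an assignment tuple mapping layer i -> node assignment[i]."""
    return sum(layer_energy_kwh[i] * node_gco2_per_kwh[n]
               for i, n in enumerate(assignment))

# Brute-force baseline: enumerate all M^N layer->node mappings, keep the minimum.
# Tractable here (2^3 = 8 candidates); prohibitive at realistic layer counts.
best = min(product(range(len(node_gco2_per_kwh)), repeat=len(layer_energy_kwh)),
           key=total_emissions)
print(best, round(total_emissions(best), 2))  # → (0, 0, 0) 30.0
```

Without placement constraints the optimum trivially routes every layer to the cleanest node; the QAOA formulation in the paper targets the constrained, exponentially large version of this same objective.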

Files

QAOA.pdf

220.3 kB · md5:80255a55d0ec2c6fedf1529db77e3e65