Published April 28, 2026 | Version v1
Journal article | Open Access

Behavioral Stability of Quantized Large Language Models Under Prompt Drift: The Resilio Evaluation Framework

  • MIT ADT University

Description

Large Language Models (LLMs) are increasingly deployed in resource-constrained environments using quantization techniques such as 8-bit and 4-bit precision, which reduce memory footprint and inference cost. While these models perform well on clean benchmarks, real-world deployment frequently encounters "Prompt Drift": typographical errors, informal phrasing, and structural degradation of the input. This paper introduces Resilio, a systematic evaluation framework investigating the interaction between model quantization and five progressive levels of prompt-quality degradation. We evaluate Llama 3.1 8B, Mistral 7B, and Phi-3 Mini using two novel metrics: the Task Performance Score (TPS) and the Behavioral Stability Score (BSS). Our study reveals a "Quantization Amplification" effect, in which 4-bit models exhibit disproportionately higher sensitivity to noise than FP16 baselines, particularly on reasoning-intensive tasks.
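The abstract does not specify how the five degradation levels are generated; as a minimal sketch, one plausible approach is character-level typo injection whose rate scales with the level. The function name, the per-level rate, and the edit operations below are assumptions for illustration, not the paper's actual perturbation pipeline.

```python
import random


def apply_prompt_drift(prompt: str, level: int, seed: int = 0) -> str:
    """Hypothetical prompt-drift simulator: inject character-level typos
    at one of five progressive severity levels (0 = clean, 4 = heavily
    degraded). The 5% per-level rate and the swap/drop/duplicate edit
    operations are illustrative assumptions, not taken from the paper.
    """
    if level == 0:
        return prompt  # level 0 is the clean baseline
    rng = random.Random(seed)  # seeded for reproducible perturbations
    rate = 0.05 * level  # fraction of alphabetic characters perturbed
    chars = list(prompt)
    for i, c in enumerate(chars):
        if c.isalpha() and rng.random() < rate:
            op = rng.choice(["swap", "drop", "dupe"])
            if op == "swap" and i + 1 < len(chars):
                chars[i], chars[i + 1] = chars[i + 1], chars[i]
            elif op == "drop":
                chars[i] = ""  # delete the character
            else:
                chars[i] = c * 2  # duplicate the character
    return "".join(chars)
```

A framework like Resilio would then compare TPS and BSS on model outputs for `apply_prompt_drift(p, 0)` versus the degraded variants of the same prompt `p`.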

Files

behavioral-stability-of-quantized-large-language-models-under-prompt-drift-the-resilio-evaluation-fr-IJERTV15IS042474.pdf

Additional details