Confidently Wrong: Failure of Mixture of Inputs in Low-Entropy Regimes
Description
Mixture of Inputs (MOI) is a recent training-free method for enhancing autoregressive generation in transformer models: each sampled token is blended with the model's output distribution via Bayesian posterior estimation. While MOI demonstrates consistent improvements on reasoning-intensive benchmarks, we identify a systematic failure mode: in low-entropy regimes, where the model produces peaked output distributions, MOI’s intervention weights collapse to near zero, and the method degenerates to an identity mapping equivalent to standard generation. We provide a theoretical analysis showing that this collapse is intrinsic to MOI’s Bayesian formulation, in which normalized entropy directly scales the prior concentration. Critically, the collapse occurs even when the model produces confident but incorrect outputs, which is precisely when intervention is most needed. Empirical validation across five arithmetic tasks and two language models confirms that MOI provides no benefit over the baseline in such settings: Gemma-2B exhibits high confidence (low entropy) on 98% of its incorrect predictions, while results on TinyLlama suggest that the severity of the collapse is model-dependent. Our findings delineate boundary conditions for distribution-mixing interventions and motivate the development of entropy-invariant methods that remain effective across the full spectrum of model confidence.
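The collapse described above can be illustrated with a small sketch. It assumes one simple form of entropy-scaled Bayesian mixing: a Dirichlet prior with concentration `H * p` (where `H` is the normalized entropy of the output distribution `p`) updated with `n_obs` pseudo-observations of the sampled token. The function names, the `n_obs` parameter, and the exact weighting formula are illustrative assumptions, not the paper's verified implementation.

```python
import math


def normalized_entropy(p):
    """Shannon entropy of p divided by log(len(p)), so the value lies in [0, 1]."""
    h = -sum(q * math.log(q) for q in p if q > 0.0)
    return h / math.log(len(p))


def mixing_weights(p, sampled, n_obs=1.0):
    """Posterior-mean weights under an assumed Dirichlet prior with
    concentration H * p and n_obs pseudo-observations of the sampled token.
    This form is an illustrative assumption, not the paper's exact formula."""
    h = normalized_entropy(p)
    return [
        (h * q + (n_obs if i == sampled else 0.0)) / (h + n_obs)
        for i, q in enumerate(p)
    ]


def intervention_mass(p, sampled, n_obs=1.0):
    """Total weight placed on tokens other than the sampled one.
    A value near zero means the mixture is effectively the one-hot input."""
    w = mixing_weights(p, sampled, n_obs)
    return 1.0 - w[sampled]


flat = [0.25, 0.25, 0.25, 0.25]          # high entropy: H is 1
peaked = [0.997, 0.001, 0.001, 0.001]    # low entropy: H is close to 0

print("flat:  ", intervention_mass(flat, 0))
print("peaked:", intervention_mass(peaked, 0))
```

Under these assumptions, the flat distribution leaves a substantial share of the weight on non-sampled tokens, whereas the peaked distribution leaves almost none, regardless of whether the confident token is correct.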
Files
| Name | Size |
|---|---|
| moi_failure_paper_arxiv_RC1.pdf (md5:4e08034ac51425e31d4e6bacd50d7363) | 382.9 kB |