Published January 17, 2026 | Version v1
Preprint Open

Predicting and Suppressing Repetitive Degeneration in Language Models via Hidden-State Risk Estimation

Authors/Creators

Description

Repetitive degeneration—the tendency of autoregressive language models to fall into loops or repeat phrases—remains a persistent failure mode in long-form generation. This work demonstrates that such degeneration is strongly predictable from internal hidden states before it occurs, rather than being a purely stochastic artifact of decoding.

We train a lightweight classifier (~50k parameters) on transformer hidden representations to predict whether the current token will reappear within a fixed future window. The predictor achieves high discrimination (F1 > 0.96); at the best checkpoint, predicted risk is up to 125× higher for tokens that will repeat than for those that will not. This indicates that repetition corresponds to a distinct internal regime already encoded in the model’s representations.
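The setup above can be sketched as follows. This is a minimal, hypothetical stand-in for the paper's classifier: labels mark whether a token reappears within a fixed future window, and a tiny logistic probe (rather than the actual ~50k-parameter architecture, which is not specified here) is fit on hidden-state vectors. All names and hyperparameters are illustrative.

```python
import numpy as np

def repetition_labels(token_ids, window=32):
    """label[t] = 1 if token t reappears within the next `window` tokens."""
    return np.array([
        1 if token_ids[t] in token_ids[t + 1 : t + 1 + window] else 0
        for t in range(len(token_ids))
    ])

class RiskProbe:
    """Tiny logistic-regression probe on hidden states (illustrative stand-in
    for the paper's lightweight classifier)."""

    def __init__(self, dim, lr=0.1):
        self.w = np.zeros(dim)
        self.b = 0.0
        self.lr = lr

    def predict(self, hidden):
        # Predicted repetition risk in [0, 1] for each hidden-state vector.
        return 1.0 / (1.0 + np.exp(-(hidden @ self.w + self.b)))

    def fit(self, hidden, labels, steps=500):
        # Plain gradient descent on the binary cross-entropy loss.
        for _ in range(steps):
            grad = self.predict(hidden) - labels   # dBCE/dlogits
            self.w -= self.lr * hidden.T @ grad / len(labels)
            self.b -= self.lr * grad.mean()

# Label construction on a toy sequence:
repetition_labels([1, 2, 1, 3], window=2)  # → [1, 0, 0, 0]
```

In practice the hidden states would come from a forward pass of the frozen base model (e.g. via hooks on a chosen layer); the probe itself stays external and adds no parameters to the model.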

Using this signal, we introduce a decode-time control mechanism that selectively applies repetition penalties only when predicted risk is high. The intervention leaves the base model’s forward pass unchanged and operates solely at the sampling stage. Across open-ended generation prompts, this approach reduces repetition rate by 48.4% while improving lexical diversity (Distinct-2) by 16.7%.
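A minimal sketch of such a gated, sampling-stage intervention is shown below, assuming a standard multiplicative repetition penalty. The threshold and penalty values are illustrative, not taken from the paper.

```python
import numpy as np

def gated_repetition_penalty(logits, recent_ids, risk,
                             threshold=0.5, penalty=1.3):
    """Apply a standard repetition penalty to recently generated tokens,
    but only when the probe's predicted risk exceeds `threshold`.

    `threshold` and `penalty` are illustrative values. The base model's
    forward pass is untouched; only the next-token logits are rescaled.
    """
    if risk < threshold:
        return logits                     # low risk: sample as usual
    out = logits.copy()
    for tok in set(recent_ids):
        # Multiplicative penalty: shrink positive logits, push negatives down.
        out[tok] = out[tok] / penalty if out[tok] > 0 else out[tok] * penalty
    return out
```

Because the gate only fires when predicted risk is high, the penalty's usual side effect of suppressing benign reuse of common tokens is avoided on the vast majority of decoding steps.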

We further document five attempted approaches to modifying attention patterns directly, all of which failed due to training–inference mismatch or signal collapse. These negative results suggest that sampling-stage control is substantially more robust than architectural modification for pretrained models.

Overall, this work establishes that repetition in language models is a predictable internal state and that lightweight, external control mechanisms can exploit this signal without retraining or modifying model architectures. Code, trained weights, and reproduction instructions are provided.

Files

technical_paper.md.pdf (99.5 kB)
md5:f90209b3aeaa5477a95d7b6b4c4ffb29

Additional details