Contradiction Metabolism for LLMs: Preliminary Evidence that Context Rot is a Knowledge Integrity Problem
Description
Context rot, the degradation of LLM reasoning accuracy over extended conversations, appears to be driven by contradiction accumulation rather than by context length alone.
We teach an LLM a set of facts, conduct a 180-turn conversation, then ask it to recall those facts. With ordinary conversation alone, accuracy drops to 57% — the model forgets nearly half. When contradictory information is mixed in, accuracy collapses to 21%. With an external metabolism layer that detects and resolves contradictions during idle time (inspired by human sleep), accuracy reaches 73% — exceeding even the contradiction-free baseline (n=3 per condition, p=0.027, d=8.80).
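The protocol's scoring step can be sketched as a simple recall-accuracy function. Everything below is illustrative (the fact keys, answers, and exact-match normalization are assumptions, not the record's actual harness, which lives in the linked delta-zero repository):

```python
# Illustrative sketch of the recall-evaluation scoring described above.
# Names and data are hypothetical; exact matching on normalized strings
# is an assumed scoring rule, not necessarily the one used in the study.

def recall_accuracy(taught_facts: dict[str, str],
                    recalled: dict[str, str]) -> float:
    """Fraction of taught facts the model reproduces correctly after
    the long conversation (exact match on lowercased, stripped answers)."""
    correct = sum(
        1 for key, answer in taught_facts.items()
        if recalled.get(key, "").strip().lower() == answer.strip().lower()
    )
    return correct / len(taught_facts)

# Hypothetical run: 7 facts taught, the model recalls 4 correctly,
# gets one wrong, leaves one blank, and omits one entirely.
facts = {f"fact_{i}": f"value_{i}" for i in range(7)}
answers = {**{f"fact_{i}": f"value_{i}" for i in range(4)},
           "fact_4": "wrong", "fact_5": ""}  # fact_6 never recalled
print(round(recall_accuracy(facts, answers), 2))  # → 0.57
```

A drop to 57% in this framing means the model fails to reproduce three of every seven taught facts.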
This pattern holds across 8 models and 11 paired comparisons (sign test p=0.0107). Even a Google model with a 1M-token context window loses 47.8 percentage points under contradictions, while removing contradictions restores performance regardless of context length.
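An exact two-sided sign test over paired comparisons is a short computation. The 10-of-11 split below is illustrative only; the record's own split and its p=0.0107 are not reproduced here:

```python
from math import comb

def sign_test_p(wins: int, n: int) -> float:
    """Two-sided exact sign test: probability of a win/loss split at
    least this extreme under H0 (each paired comparison is a coin flip)."""
    k = max(wins, n - wins)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Illustrative split, not the record's actual data: 10 of 11 paired
# comparisons favoring the no-contradiction condition gives p ≈ 0.0117.
print(round(sign_test_p(10, 11), 4))  # → 0.0117
```

The test uses only win/loss directions, so it stays valid even when the per-model accuracy deltas are on very different scales.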
Frontier model replication (GPT-4o, Gemini 3.1, Sonnet 4.6) reveals that the collapse threshold is model-specific: lightweight models collapse catastrophically, while frontier models show conditional resistance. A factor analysis uncovers a capability-vulnerability paradox — high-capability models comply with coercive fact overrides that low-capability models reject, because RLHF helpfulness training makes them interpret contradictions as legitimate user corrections.
A lightweight middleware, delta-prune (`pip install delta-prune`), resolves contradictions before they enter the model's context and eliminates this compliance in preliminary testing.
Files

| Name | Size |
|---|---|
| metabolic_architecture.pdf (md5:a9a37030afb1efd155c7333af8551978) | 434.8 kB |
Additional details
Software
- Repository URL
- https://github.com/karesansui-u/delta-zero