Evans' Law 7.0 From Threshold to Reliability Surface Preliminary Reformulation
Description
Evans’ Law (2025) proposed that coherence collapse in large language models occurs at a predictable threshold that scales sublinearly with benchmark capability, formalized as L ≈ 1969.8 × M^0.74, where L is the coherence collapse threshold in tokens and M is model capability. Since publication, multiple independent research teams (Microsoft/Salesforce, Anthropic, Apple, Google, Caltech/Stanford) have documented degradation regimes whose observed collapse inflection points cluster within the order-of-magnitude band predicted by the model.
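As a rough illustration of the threshold relation stated above, the following sketch evaluates L ≈ 1969.8 × M^0.74 and the ratio by which L grows when M is multiplied by a given factor. The function names are hypothetical, and this description does not specify the scale on which M is measured, so the inputs below are placeholders rather than scores for any real model.

```python
def collapse_threshold(m: float, coefficient: float = 1969.8, exponent: float = 0.74) -> float:
    """Evaluate L = coefficient * M**exponent, the threshold relation quoted in the description.

    `m` is a capability score on an unspecified scale (assumption: any positive number).
    The return value is the predicted coherence collapse threshold in tokens.
    """
    if m <= 0:
        raise ValueError("capability score must be positive")
    return coefficient * m ** exponent


def scaling_ratio(factor: float, exponent: float = 0.74) -> float:
    """Ratio L(factor * M) / L(M); it does not depend on M or on the coefficient."""
    return factor ** exponent


if __name__ == "__main__":
    # Placeholder capability scores, chosen only to show the shape of the curve.
    for m in (1.0, 2.0, 4.0):
        print(f"M = {m:>4}: L is about {collapse_threshold(m):,.0f} tokens")
    # Because the exponent is below 1, doubling M less than doubles L.
    print(f"doubling M multiplies L by about {scaling_ratio(2.0):.2f}")
```

Because the exponent is below 1, the ratio returned by `scaling_ratio` is always smaller than the factor itself for factors above 1, which is what "sublinear" means here.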
This paper makes three contributions. First, it constructs a preliminary cross-study quantitative overlay assessing whether independently observed degradation regimes align with the bounded-coherence hypothesis. Convergence across studies supports the claim that coherence degradation is structural, sublinearly scalable, and not eliminated by increased model intelligence.
Second, it presents new empirical findings from controlled multimodal verification testing conducted in February 2026 across six frontier models, building on the cross-modal degradation tax established in Evans’ Law 5.0. These findings demonstrate that for task classes requiring rigid referential binding, coherence instability manifests at baseline, independent of context accumulation.
Third, and most significantly, this paper proposes a fundamental reformulation. Evans’ Law as originally formulated describes one regime of a two-regime system. We introduce the reliability surface R(L, M, T), where the original threshold equation is a cross-section at T = 0. An accumulation regime governs degradation along the context-length axis, and a baseline instability regime governs degradation along the task-rigidity axis. We provide an operational definition of T as a continuous coefficient on the interval 0 to 1, a numerical worked example of the L-axis prediction, and a visualization of the full reliability surface.
The original formula was always a projection of a higher-dimensional structure. This paper reveals and formalizes that structure, connecting architectural failure modes to the significance deficit identified in the S-vector research program.
Files
| Name | Size | md5 |
|---|---|---|
| Evans' Law 7.0 From Threshold to Reliability Surface Preliminary ReformulationvFINAL.pdf | 737.9 kB | 796c45b4d9e0df73a94313260c63c72e |
Additional details
Related works
- Is new version of
- 10.5281/zenodo.17523735 (DOI)
- Is supplement to
- Preprint: 10.5281/zenodo.17878026 (DOI)
Dates
- Available
- 2025-02-26

Evans' Law 7.0 extends the original threshold model into a three-dimensional reliability surface, introducing a task-rigidity axis that explains both long-context collapse and baseline referential instability as consequences of a shared architectural constraint.