The Readout Regime: A Normal Form for Final-Residual Control of Frozen Transformers — and Its Capacity Limits

Peterson, Nathan

doi:10.5281/zenodo.20562890

Published June 1, 2026 | Version v1

Preprint Open

Inference-time interventions on a frozen transformer (steering behavior, injecting facts, suppressing outputs) are not interchangeable: where an additive intervention acts splits them into two regimes of sharply different expressive power, and we give the exact theory of one. An intervention that writes into the final residual stream (the unembedding/readout space; installing a scaled row of the output-projection matrix lm_head is the exactly-characterizable case) induces on the next-token logits, for every input, the transform 𝑧 ↦ 𝑠(𝑥) ⋅ 𝑧 + 𝑐(𝑥): a ranking-preserving scalar temperature 𝑠(𝑥) > 0 plus a re-ranking bias 𝑐(𝑥) confined to a fixed, low-dimensional set of directions chosen before any input is seen (T1; verified against direct forward-pass computation to ≈ 5 × 10⁻⁶). The result that matters is a structure theorem (T2): the input selects only a point in that fixed set and a temperature, so the reachable re-ranking directions, over all inputs, have affine dimension at most the installed-slot count. In plain terms: a readout install can re-weight and re-rank the options the model already has, but cannot synthesize a new answer direction or compute a hidden routing variable the addressing query does not already expose. We prove the readout regime’s confinement and cite (but do not prove) evidence that the representation regime is not so confined; that direct measurement is the main open item. The corollaries, carefully bounded: a bounded readout install is not a hard override of a peaked prior, can only tip a decision the context has already scaffolded near-balanced, and cannot compute a hidden intermediate. Capacity has two faces, both empirical: across key-disjoint decisions installs compose bit-exactly at scale (tens of modules, thousands of facts, Δ = 0); within one decision the readout is winner-take-all (≈ 2 targets co-winnable, against an output projection of entropy-effective rank ≈ 918). The control reading: tip a propensity in the readout regime, where it is an auditable, removable bias, but place any hard guarantee in a deterministic override outside the model. The contribution is the boundary, stated precisely.
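The affine form 𝑧 ↦ 𝑠(𝑥) ⋅ 𝑧 + 𝑐(𝑥) can be illustrated with a toy readout. What follows is a minimal sketch and not the paper's code. It assumes the final residual passes through an RMSNorm with a learned gain before lm_head, with no epsilon term and no bias; the abstract does not specify the normalization. The names `install`, `normal_form`, `slots` and the toy sizes `D` and `V` are hypothetical. Under those assumptions, adding α·W[k] to the residual rescales every logit by the ratio of RMS values before and after the install, and adds a bias along the fixed direction W·(gain ⊙ W[k]), which does not depend on the input.

```python
import math
import random

def rms_norm(h, gain):
    # Assumed final normalization: RMSNorm with learned gain, no epsilon.
    rms = math.sqrt(sum(v * v for v in h) / len(h))
    return [g * v / rms for g, v in zip(gain, h)], rms

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def logits(W, gain, h):
    normed, rms = rms_norm(h, gain)
    return matvec(W, normed), rms

def install(h, W, slots):
    # Additive readout install: add alpha * W[k] to the final residual
    # for each hypothetical (row index, scale) slot.
    out = list(h)
    for k, alpha in slots:
        out = [o + alpha * w for o, w in zip(out, W[k])]
    return out

def normal_form(W, gain, h, slots):
    # Returns (s, c, z) with: logits after install == s * z + c.
    z, rms_before = logits(W, gain, h)
    _, rms_after = logits(W, gain, install(h, W, slots))
    s = rms_before / rms_after  # positive scalar temperature
    c = [0.0] * len(W)
    for k, alpha in slots:
        # Fixed direction per slot, computable before any input is seen.
        direction = matvec(W, [g * w for g, w in zip(gain, W[k])])
        c = [ci + (alpha / rms_after) * d for ci, d in zip(c, direction)]
    return s, c, z

random.seed(0)
D, V = 8, 12  # toy hidden size and vocabulary size (assumptions)
W = [[random.gauss(0, 1) for _ in range(D)] for _ in range(V)]
gain = [random.uniform(0.5, 1.5) for _ in range(D)]
slots = [(3, 2.0), (7, -1.5)]

def max_error(h):
    # Compare the predicted affine form against the direct forward pass.
    s, c, z = normal_form(W, gain, h, slots)
    direct, _ = logits(W, gain, install(h, W, slots))
    predicted = [s * zi + ci for zi, ci in zip(z, c)]
    return max(abs(a - b) for a, b in zip(direct, predicted))

errors = [max_error([random.gauss(0, 1) for _ in range(D)]) for _ in range(5)]
print("max |direct - predicted| over 5 random inputs:", max(errors))
```

In this sketch the bias is always a combination of one fixed direction per installed slot, so over all inputs it cannot leave a subspace whose dimension is at most the slot count, which is the flavor of the T2 bound. The input only changes the scalar 𝑠(𝑥) and the weights on those fixed directions.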

Files

Files (218.3 kB)

Name	Size
main.pdf md5:6b0215e43ec0982975330040208f9548	218.3 kB

Additional details

Submitted: 2025-06-01

Repository URL: https://github.com/orbitnate/readout-regime
Development Status: Active

	All versions	This version
Views	12	12
Downloads	4	4
Data volume	1.3 MB	1.3 MB
