The residual growth landscape: A 43-model survey of residual stream dynamics and a post-hoc intervention study
Description
We measure the residual stream growth factor, the ratio of the last-layer residual norm to the first-layer norm, across 43 openly available language models spanning 15 architecture families and 70M to 4B parameters. Residual growth varies by over 500x across models (from 5x in OLMo-2 to 2,747x in Qwen3-1.7B) and shows no correlation with parameter count (Spearman r = 0.043, p = 0.78).
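The growth factor as defined above can be sketched in a few lines. This is a minimal illustration, not the survey's measurement code: the calibration text, norm aggregation (mean over token positions), and choice of "first-layer" state are assumptions here.

```python
import numpy as np

def residual_growth(hidden_states):
    """Ratio of the mean last-layer residual L2 norm to the mean first-layer norm.

    hidden_states: list of [seq_len, d_model] arrays, one per layer, where
    the first entry is the residual stream after layer 1 and the last entry
    is the stream after the final layer. (Aggregating by the mean over token
    positions is an assumption, not the survey's documented choice.)
    """
    first = np.linalg.norm(hidden_states[0], axis=-1).mean()
    last = np.linalg.norm(hidden_states[-1], axis=-1).mean()
    return last / first

# Toy example: residual norms double at each of 4 layer transitions,
# so the growth factor is 2**4 = 16.
states = [np.ones((8, 16)) * (2.0 ** i) for i in range(5)]
print(residual_growth(states))  # → 16.0
```

With a real model, the per-layer states could come from a forward pass that exposes hidden states, with the same ratio computed over a fixed calibration text.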
We conduct two intervention studies on 9–10 models. First, Norm Equalization (NormEq), an analytical rescaling that forces uniform residual growth, degrades perplexity in 8 of 9 cases, with catastrophic failure (+2,073%) in Qwen3-0.6B. Second, progressive layer dropping reveals that resilience to depth reduction is uncorrelated with residual growth (r = -0.09, p = 0.80): Falcon-H1 (RG = 9x) is the most fragile model, while GPT2-XL (RG = 510x) degrades gracefully.
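The record does not give the NormEq formula. One hypothetical reading of "an analytical rescaling that forces uniform residual growth" is: keep the end-to-end growth n_L / n_1 fixed, but spread it geometrically across layers, rescaling each layer's residual norm toward n_1 * g**l with g = (n_L / n_1) ** (1 / (L - 1)). The sketch below computes those per-layer scale factors; the function name and the geometric-target choice are assumptions.

```python
import numpy as np

def normeq_scales(layer_norms):
    """Per-layer rescaling factors forcing geometric (uniform) residual growth.

    layer_norms: observed mean residual norms n_1 .. n_L, one per layer.
    Returns factors s_l such that s_l * n_l = n_1 * g**l, where
    g = (n_L / n_1) ** (1 / (L - 1)). The endpooint norms are preserved,
    so the total growth factor is unchanged; only its layer-wise
    distribution is equalized. (Hypothetical reading of NormEq.)
    """
    n = np.asarray(layer_norms, dtype=float)
    L = len(n)
    g = (n[-1] / n[0]) ** (1.0 / (L - 1))
    target = n[0] * g ** np.arange(L)
    return target / n

# Example: uneven growth 1 -> 10 -> 12 -> 100 becomes a constant
# per-layer ratio of 100 ** (1/3) after rescaling.
print(normeq_scales([1.0, 10.0, 12.0, 100.0]))
```

Under this reading, the first and last layers get a scale of 1, which is consistent with the intervention changing only how growth is distributed, not the overall growth factor.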
We conclude that heterogeneous residual growth is a learned feature of Pre-LN Transformer training, not an architectural defect, and that layer-level criticality depends on architecture type rather than residual dynamics.
Residual growth values reported in the survey and intervention files were measured under different calibration and evaluation setups, so the same model may have different absolute RG values across files. Comparisons should therefore be made within a single experiment's values, not across files.
Files (1.1 MB)

| Name | Size | MD5 |
|---|---|---|
| | 68.0 kB | md5:1c97a58e2d297bf14e2ac1b2363a0108 |
| residual_growth_report_v2.pdf | 1.1 MB | md5:f89e296bc8cd85b41d5f9d31de5fb182 |
Additional details

Software
- Repository URL: https://github.com/Ono-Katsuki/residual-growth-report
- Programming language: Python