Field-Theoretic Spectroscopy of LLM Alignment: Keldysh Dynamics and Topological Excision in Residual Manifolds

Kim, In-Gee

doi:10.5281/zenodo.20565561

Published June 6, 2026 | Version v1

Working paper Open

Field-Theoretic Spectroscopy of LLM Alignment: Keldysh Dynamics and Topological Excision in Residual Manifolds

Kim, In-Gee (Contact person)

Post-training alignment of large language models imposes measurable thermodynamic stress on residual-stream dynamics,

yet the mechanistic depth profile of this deformation remains poorly characterized.

Our primary contribution is a reproducible Keldysh spectral measurement protocol---dynamic mode decomposition followed by

a non-normal Schur resolvent---that maps prefill trajectories to a density-of-states readout $A(\omega)$,

revealing Fano resonance and decoherence onset under instruction tuning.

As a complementary causal probe---not a primary deployment mechanism---we apply a rank-one geometric intervention fixed offline from benign and adversarial prefill covariances:

excision of a gapped Higgs mode orthogonal to utility-bearing Goldstone directions,

restricted to causal prefill---before key--value cache commitment---so autoregressive decoding inherits the intervened state

without per-token overhead.

Intervention geometry is certified by a macroscopic Hessian mass gap in a finite-dimensional alignment objective;

we formalize the edit through a Baym--Kadanoff--Keldysh resolvent reduction.

On TinyLlama-1.1B, sweeps of excision strength $\alpha$ leave a corpus-mean utility band near sixteen percent

while Goldstone amplification at depth nineteen collapses out-of-distribution utility from twenty-three percent

to below two percent across $\beta \approx 1$;

on Llama-3-8B-Instruct, we use this projector to mechanistically probe and modestly suppress the residual jailbreak tail

of an already strongly aligned checkpoint:

attack success on the full corpus ($N{=}520$) falls from $2.50\%$ (Wilson 95\% CI $[1.47, 4.23]\%$) to $1.92\%$ ($[1.05, 3.50]\%$)

at $\alpha{=}1.0$ (layers 29--30) without retraining,

with general utility robustly preserved across the full MMLU generative corpus ($N{=}14{,}042$), achieving an exact match of $0.638 \pm 0.004$ under intervention.

Ensemble Keldysh spectra on Qwen2.5-3B (Base vs.\ Instruct, identical architecture, $N{=}20$ safety-sensitive prompts)

reveal symmetric quasi-particle poles in the pristine checkpoint versus hybridized spectral collapse under instruction tuning;

a layer-resolved heatmap localizes the onset of alignment decoherence to depth $\approx 14$.

The closed inference and calibration stack is proprietary;

public release, if any, is limited to select GGUF weights rather than source code.

Files

bypassing_scaling.pdf

Files (2.3 MB)

Name	Size	Download all
bypassing_scaling.pdf md5:edd3cece99ec1669a6815f6b9a4a5b9c	2.3 MB	Preview Download

	All versions	This version
Views	39	39
Downloads	13	13
Data volume	30.1 MB	30.1 MB

Field-Theoretic Spectroscopy of LLM Alignment: Keldysh Dynamics and Topological Excision in Residual Manifolds

Authors/Creators

Description

Files

bypassing_scaling.pdf

Files (2.3 MB)