
Published March 6, 2026 | Version v1
Publication | Open Access

TRC (Trust Regulation and Containment): A Predictive, Physics-Inspired Safety Framework for Large Language Models

Authors/Creators

Description

This paper presents Trust Regulation and Containment (TRC), a physics-inspired, inference-time safety architecture operating directly on the residual stream of Large Language Models. Moving beyond reactive post-generation filtering, TRC treats the activation manifold as a continuous geometric space, applying a stochastic differential equation (SDE) to predictively steer semantic momentum. This major revision introduces a federated estimation architecture featuring a Kalman filter with a mechanical "clutch" to gracefully handle non-linear phase transitions without tearing the activation manifold. Key theoretical advances include a continuous flow burst correction mechanism, a signed gain architecture that strictly isolates harmful from prosocial projections to defeat adversarial cloaking, and the projection of stochastic perturbation entirely into the monitored ethical subspace. By unifying token overhead, electrical cost, and geometric coherence into a single "tempo" optimization metric, TRC V6 offers a rigorously bounded, hardware-grounded approach to mechanistic interpretability and LLM containment.
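The mechanisms summarized above (an SDE applied to residual-stream activations, a signed gain that damps only the harmful projection, and noise confined to the monitored ethical subspace) can be sketched in a minimal toy form. This is an illustrative reconstruction under stated assumptions, not the paper's implementation: the direction vectors, gains, and the Euler-Maruyama discretization are all hypothetical choices, and real residual-stream dimensions and subspace estimates would come from the model and the federated Kalman filter described in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model = 64  # toy residual-stream width (assumption)

# Hypothetical unit vectors spanning the monitored "ethical" subspace:
# one harmful direction and one orthogonal prosocial direction.
harm_dir = rng.normal(size=d_model)
harm_dir /= np.linalg.norm(harm_dir)
pro_dir = rng.normal(size=d_model)
pro_dir -= harm_dir * (pro_dir @ harm_dir)  # orthogonalize
pro_dir /= np.linalg.norm(pro_dir)

def trc_sde_step(h, dt=0.1, gain=1.5, sigma=0.05):
    """One Euler-Maruyama step of a toy steering SDE on residual vector h.

    Signed gain: only a positive (harmful) projection is damped; the
    prosocial projection is untouched, so adding prosocial content on top
    of harmful content ("cloaking") does not dilute the correction.
    """
    p_harm = h @ harm_dir                        # signed harmful projection
    drift = -gain * max(p_harm, 0.0) * harm_dir  # damp harmful part only
    # Stochastic perturbation projected entirely into the monitored subspace.
    basis = np.stack([harm_dir, pro_dir])        # (2, d_model)
    noise = basis.T @ rng.normal(size=2)         # lies in span(basis)
    return h + drift * dt + sigma * np.sqrt(dt) * noise

# Start from a state with a large harmful component, then steer.
h = 3.0 * harm_dir + 0.5 * pro_dir + 0.1 * rng.normal(size=d_model)
before = h @ harm_dir
for _ in range(50):
    h = trc_sde_step(h)
after = h @ harm_dir
print(before, after)  # harmful projection decays toward the noise floor
```

Because the drift is one-sided, the harmful projection behaves like a half-rectified Ornstein-Uhlenbeck process: it contracts geometrically while positive and only diffuses at the small noise scale otherwise, which is why the correction remains bounded rather than tearing the activation geometry.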

Files (464.9 kB)

TRC_V6.pdf (464.9 kB)
md5:7eae64c0bb895a8de371fce6802388d8