There is a newer version of the record available.

Published March 8, 2026 | Version v2
Publication Open

TRC: Trust Regulation and Containment: A Predictive, Physics-Inspired Safety Framework for Large Language Models

Authors/Creators

Description

Version 7 introduces the game-theoretic adversarial robustness layer, which formalises the monitoring system as a Stackelberg–Bayesian differential game. The attenuation operator generalises the signed gain architecture to multiplicative directional control on the base model flow, so that it shapes the dynamics within which the model computes rather than only correcting deviations after the fact. The Hamilton–Jacobi–Isaacs equation governs the adversarial game and yields both the optimal attenuation profile and the security level bound, a quantitative ceiling on worst-case deviation under adversarial conditions. The equilibrium hierarchy (correlated, Stackelberg, Bayesian–Stackelberg) provides three nested guarantees, and the cost-of-distrust metric quantifies the efficiency loss from defensive monitoring.
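
For orientation, a Hamilton–Jacobi–Isaacs equation in its generic textbook form is sketched below. This is not necessarily the formulation used in the framework: the state x, the defender control u (standing in for the attenuation profile), the adversary control d, the dynamics f, the running cost ℓ, and the terminal cost g are all illustrative assumptions that do not appear in this record.

```latex
% Generic finite-horizon HJI equation (upper value), assuming dynamics
% \dot{x} = f(x, u, d), running cost \ell, and terminal cost g.
% All symbols are illustrative; none are taken from the TRC paper.
\begin{aligned}
\frac{\partial V}{\partial t}(x,t)
  + \min_{u}\,\max_{d}\,
    \Big[\, \nabla_x V(x,t) \cdot f(x,u,d) + \ell(x,u,d) \,\Big] &= 0, \\
V(x,T) &= g(x).
\end{aligned}
```

Under these assumptions, the minimising u at each state plays the role of an optimal defender policy, and V(x, t) is the cost the defender can guarantee against the worst-case adversary, which is the sense in which such an equation can yield a security level.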

Files

Trust_Regulation_and_Containment_Framework.pdf

Size: 539.5 kB
md5:faac29d4bbff3f7ea4b55a0fb40c1190