Published March 14, 2026 | Version 3.0
Preprint Open

MAGUS v3.0: A Governance Architecture for Structural Alignment Drift in Long-Running Agentic AI Systems

  • 1. VaHive Systems Lab

Description

Long-running agentic AI deployments experience a governance failure mode that training-time alignment and single-session safety work are not designed to address: structural alignment drift — the cumulative deviation of a deployed system's effective operating policy from operator intent, arising through normal operation across multiple sessions without any single identifiable failure event.
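As a toy illustration of the definition above (not taken from the paper), the sketch below treats each session's deviation from operator intent as a small integer score. The names `per_session_alarms`, `first_budget_breach`, `step_limit`, and `drift_budget` are hypothetical, and the assumption that deviation is a scalar that simply adds up across sessions is a simplification made only for this example.

```python
from itertools import accumulate


def per_session_alarms(deviations, step_limit):
    """Return the 1-based sessions whose own deviation exceeds step_limit."""
    return [i for i, d in enumerate(deviations, start=1) if d > step_limit]


def first_budget_breach(deviations, drift_budget):
    """Return the first 1-based session where cumulative deviation exceeds
    drift_budget, or None if the budget is never exceeded."""
    for i, total in enumerate(accumulate(deviations), start=1):
        if total > drift_budget:
            return i
    return None


# Hypothetical scores: 20 sessions, each deviating by 2 units.
deviations = [2] * 20
alarms = per_session_alarms(deviations, step_limit=5)
breach = first_budget_breach(deviations, drift_budget=30)

print(alarms)  # no single session crosses the per-session limit
print(breach)  # the cumulative total crosses the budget partway through
```

In this toy run a per-session check never fires, yet the cumulative total exceeds the budget, which is the pattern the definition describes: deviation with no single identifiable failure event.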

We define this failure class precisely, decompose it into three structural mechanisms (instruction drift, autonomy accumulation, and authority laundering), and propose MAGUS v3.0 — a governance architecture built specifically around it. MAGUS's three primary architectural contributions are: (1) Behavioral State as a formal governance class, in which model parameter updates are treated as cryptographic governance events requiring dual-authority signing and append-only trail anchoring before activation; (2) a mathematically bounded risk state machine with formal boundary conditions, asymptotic damping, and a hard escalation floor that no authority can override; and (3) a pre-execution RT requirement, in which the audit trail is constitutive of the governance act rather than a post-hoc record of it.
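To make contributions (1) and (3) more concrete, here is a minimal sketch of a parameter update gated on two signatures and on a successful write to a hash-chained, append-only trail. It is an assumption-laden illustration, not the MAGUS specification: the names `AuditTrail`, `sign`, and `activate_update` are hypothetical, and HMAC-SHA256 with shared keys stands in for whatever signature scheme the paper actually specifies.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # placeholder predecessor hash for the first entry


def sign(key: bytes, payload: bytes) -> str:
    """HMAC-SHA256 as a stand-in for a real digital signature (assumption)."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


class AuditTrail:
    """Append-only log in which each entry commits to the previous entry."""

    def __init__(self):
        self._entries = []

    def append(self, event: dict) -> str:
        prev = self._entries[-1]["hash"] if self._entries else GENESIS
        body = json.dumps(event, sort_keys=True)
        digest = hashlib.sha256((prev + body).encode()).hexdigest()
        self._entries.append({"prev": prev, "event": event, "hash": digest})
        return digest

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev = GENESIS
        for entry in self._entries:
            body = json.dumps(entry["event"], sort_keys=True)
            expected = hashlib.sha256((prev + body).encode()).hexdigest()
            if entry["prev"] != prev or entry["hash"] != expected:
                return False
            prev = entry["hash"]
        return True

    def __len__(self):
        return len(self._entries)


def activate_update(update_id, params_digest, sig_a, sig_b, key_a, key_b, trail):
    """Return True only if both authorities signed the update and the trail
    entry was written before activation is reported."""
    payload = f"{update_id}:{params_digest}".encode()
    ok_a = hmac.compare_digest(sig_a, sign(key_a, payload))
    ok_b = hmac.compare_digest(sig_b, sign(key_b, payload))
    if not (ok_a and ok_b):
        return False  # missing or invalid signature: nothing is activated
    trail.append({"type": "param_update", "id": update_id, "digest": params_digest})
    return True
```

A caller would compute both signatures over the same `update_id:params_digest` payload and pass them in; because the trail write happens inside the gate, there is no code path in this sketch where an update is activated without a corresponding trail entry.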

The architecture is presented as a theoretical specification and open problem register, intended to catalyse community development rather than report on a deployed codebase. A formally categorised issues register — produced through structured adversarial elicitation and internal human review — documents two Category 3 items (no solution pathway) and one Category 4 item (requires foundational change), reported without minimisation.

Files

MAGUS_v3_arXiv 3.pdf (375.7 kB, md5:2ec3ab0b22fdb7ba3d92541e50c582e4)