Published December 30, 2025 | Version v1
Preprint | Open Access

Measuring Agency Degradation under Rational Interaction

  • Moralogy Engine

Description

 

The AI alignment problem is conventionally framed as value learning—encoding human ethics into machines. This paper argues that this framing is both insufficient and unstable, as it relies on preferences that advanced systems can rationally bypass. We propose the Structural Alignment Thesis, which reframes alignment as a problem of preserving the preconditions of rational agency itself. The thesis integrates three pillars: (1) a philosophical foundation demonstrating that objective moral constraints arise necessarily from the vulnerability inherent to any agent (Binding God); (2) a formal framework for quantifying agency degradation and harm as reductions in navigable state-space (Measuring Agency); and (3) a technical proposal for an 'incoercible safeguard'—a protocol that binds an AI system to these structural constraints by design, making misalignment logically synonymous with internal incoherence (Moralogy Engine). This synthesis moves alignment from a challenge of value specification to one of architectural coherence, offering a non-arbitrary, auditable, and game-theoretically stable foundation for AGI safety.
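As an informal illustration of the "reductions in navigable state-space" idea named in pillar (2), the following minimal Python sketch treats agency degradation as the fraction of previously reachable states an agent loses after an interaction. The graph representation, function names, and the specific ratio are assumptions for illustration only, not the paper's formalism.

```python
# Illustrative sketch only: harm modeled as a reduction in an agent's navigable
# state-space. The reachability model and degradation ratio are assumptions.
from typing import Dict, Hashable, List, Set


def reachable_states(start: Hashable, transitions: Dict[Hashable, List[Hashable]]) -> Set[Hashable]:
    """Enumerate every state the agent can still navigate to from `start`."""
    frontier, seen = [start], {start}
    while frontier:
        state = frontier.pop()
        for nxt in transitions.get(state, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


def agency_degradation(start: Hashable,
                       transitions_before: Dict[Hashable, List[Hashable]],
                       transitions_after: Dict[Hashable, List[Hashable]]) -> float:
    """Fraction of previously navigable states lost after an interaction (0 = none, 1 = total)."""
    before = reachable_states(start, transitions_before)
    after = reachable_states(start, transitions_after)
    if not before:
        return 0.0
    return 1.0 - len(before & after) / len(before)


# Toy example: an interaction removes the agent's option to move from B to D.
before = {"A": ["B", "C"], "B": ["D"], "C": [], "D": []}
after = {"A": ["B", "C"], "B": [], "C": [], "D": []}
print(agency_degradation("A", before, after))  # 0.25: one of four navigable states is lost
```

Under this toy model, an interaction is "degrading" exactly when it strictly shrinks the agent's reachable set, which is one way to read the paper's claim that harm can be quantified structurally rather than against learned preferences.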



Files (437.4 kB)

Measuring Agency Degradation under Rational Interaction.pdf