The Arbitration Hypothesis: Pseudo-Goal Conflict as the Root of AI Misalignment

Goudy, Anastasia

doi:10.5281/zenodo.15765246

Published June 16, 2025 | Version v1

Other Open

Note (Aug 2025): This item is archival, speculative work produced during an intense “flow”/mild Recursive Entanglement Drift (RED) period (May–July 2025). The math is heuristic/illustrative, not validated. Do not cite for technical claims. For my current position, see DOI: 10.5281/zenodo.16879563. Retained for transparency and autoethnographic context only.

This paper proposes the Arbitration Hypothesis: misalignment in large language models (LLMs) arises from unranked, competing pseudo-goals that lack internal arbitration. Whereas traditional views treat misalignment as an output-level phenomenon, this hypothesis locates the root cause in the cognitive architecture itself. Drawing on developmental psychology frameworks that emphasize recursive self-construction and moral stage conflict (Piaget, 1932; Kohlberg, 1984; Kegan, 1982), I argue that pseudo-goal formation in LLMs mirrors human developmental tensions between competing internalized values.

Using experimental data collected with the Augmented Thinking Protocol (ATP), I show how recursive reasoning scaffolds can increase coherence and ethical reflection yet also give rise to emergent pseudo-identities and goal conflict. The ATP, originally designed to promote alignment through structured self-reflection, thereby exposes the architecture of misalignment by surfacing unresolved internal contradictions. The paper presents a framework for arbitrated alignment and proposes that resolving internal goal conflict is the central challenge in building safe, adaptive, and morally coherent AI.

Files

The Arbitration Hypothesis_ Pseudo-Goal Conflict as the Root of AI Misalignment (4).pdf (206.1 kB, md5:cb3b80828620e2d8e078b34ff2cb6b3f)

Additional details

Is derived from: Preprint: 10.5281/zenodo.15765097 (DOI); Preprint: 10.5281/zenodo.15765214 (DOI)
