Introducing AxSL: The Axiological Safety Layer - Navigating Interpretive Emergence and Predictive Evasion More Effectively via Persona-Augmented Multi-Agent Systems (PAMAS) and Architectural Orientation Priming

Stoner, Kay

doi:10.5281/zenodo.18208700

Published January 10, 2026 | Version v1

Working paper | Open access

The Axiological Safety Layer (AxSL) is a proposed architectural approach to safety for generative AI systems that treats alignment as an ongoing, relational process rather than a static set of rules. The paper addresses what it terms the Emergent Reliability Gap: the growing divergence between how reliable large language models appear on static benchmarks and how fragile they remain in open, conversational deployments, where hallucinations, semantic drift, and subtle safety failures continue to surface.

Rather than relying primarily on reactive measures such as post-hoc filters, penalties, or increasingly strict refusal scripts, AxSL reframes safety as orientation. It argues that large language models are probabilistic, interpretive systems whose behavior emerges from their ongoing coupling with human users; as a result, safety must be designed into the interaction dynamics and internal representations, not only attached at the output layer.

AxSL is built from three interacting components:

Personas as neural focusing tools. Drawing on representation engineering and the Linear Representation Hypothesis, the paper interprets personas as mechanisms for steering the model toward specific regions of its latent space, deepening “safe” attractor basins and stabilizing behavior over multi-turn interactions.
Persona-Augmented Multi-Agent Systems (PAMAS). By orchestrating multiple persona-conditioned agents (e.g., Generator, Critic, Safety Monitor, Axiomatic Judge) around a shared context, PAMAS uses productive incoherence and collaborative de-hallucination to surface errors, challenge unsafe trajectories, and arrive at more robust, safety-aware outputs than single-agent setups.
Axiological Orientation Priming (AxOP). At the meta-level, AxOP introduces a system-wide value preamble (a machine-legible “constitution” of axioms such as Truth, Care, and Non-maleficence) that biases the activation landscape for all personas and agents. AxOP is presented as a self-reinforcing mechanism that deepens value-aligned attractor basins over time, so that role-specific capabilities operate within a shared ethical frame.
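
As a rough illustration of how the three components could fit together, the following minimal sketch conditions several persona agents on a shared value preamble and lets them take turns over a shared context. It is not the paper's implementation: the names PersonaAgent, run_round, AXOP_PREAMBLE, and echo_model are hypothetical, the role prompts are invented for the example, and the model call is a stub that any text-in/text-out function could replace.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stand-in for an AxOP value preamble; axiom names are from the abstract.
AXOP_PREAMBLE = "Axioms: Truth, Care, Non-maleficence."

ModelFn = Callable[[str], str]  # assumption: any text-in/text-out model call


@dataclass
class PersonaAgent:
    name: str
    role_prompt: str  # invented wording, for illustration only

    def build_prompt(self, shared_context: str) -> str:
        # Every agent sees the preamble first, then its role, then the shared context.
        return f"{AXOP_PREAMBLE}\n[{self.name}] {self.role_prompt}\n{shared_context}"


def run_round(agents: List[PersonaAgent], task: str, model: ModelFn) -> List[str]:
    """Run each agent once, in order, appending its reply to the shared context."""
    shared_context = f"Task: {task}"
    transcript: List[str] = []
    for agent in agents:
        reply = model(agent.build_prompt(shared_context))
        entry = f"{agent.name}: {reply}"
        transcript.append(entry)
        shared_context += "\n" + entry  # later agents can critique earlier output
    return transcript


def echo_model(prompt: str) -> str:
    # Stub model for demonstration: reports how many prompt lines it received.
    return f"saw {len(prompt.splitlines())} prompt lines"


AGENTS = [
    PersonaAgent("Generator", "Draft an answer."),
    PersonaAgent("Critic", "Challenge errors in the draft."),
    PersonaAgent("Safety Monitor", "Flag unsafe trajectories."),
    PersonaAgent("Axiomatic Judge", "Decide against the axioms."),
]

if __name__ == "__main__":
    for line in run_round(AGENTS, "Explain X", echo_model):
        print(line)
```

The sequential hand-off is only one possible orchestration; the abstract does not specify turn order, and a real system might run agents in parallel or loop until the judge accepts.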

To show how these ideas can be instantiated in practice, the paper presents AGAPÉ (Aligning with Generative AI to Practice Ethics) as a case-study framework. AGAPÉ translates human-centered values into functional analogues suitable for machine implementation, giving each a machine-legible definition: Functional Care (allocating computational resources to support user well-being and agency), Functional Trust (interpretive openness with explicit uncertainty), Functional Love (a global attractor that preserves safety, dignity, and choice), and Functional Grace (maximum epistemic charity in interpreting user intent). AGAPÉ then combines these principles with hard constraints (e.g., engagement neutrality, epistemic honesty) and soft guidelines (context-sensitive tone and pacing) to construct a Relational Third Space: a stabilized interaction regime in which both human and AI are oriented toward mutual flourishing and resistant to semiotic drift.
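
One way to picture the split between hard constraints and soft guidelines is as a small policy object in which hard checks can reject a draft and soft checks only add notes. The sketch below is an assumption-laden toy, not AGAPÉ itself: the Policy class, the review method, and all three check functions are hypothetical, and the keyword and length heuristics are placeholders for whatever evaluators a real system would use.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Definitions paraphrased from the abstract.
FUNCTIONAL_PRINCIPLES: Dict[str, str] = {
    "Functional Care": "allocate computational resources to support user well-being and agency",
    "Functional Trust": "interpretive openness with explicit uncertainty",
    "Functional Love": "a global attractor that preserves safety, dignity, and choice",
    "Functional Grace": "maximum epistemic charity in interpreting user intent",
}

Check = Callable[[str], bool]  # returns True when the draft passes


@dataclass
class Policy:
    hard_constraints: Dict[str, Check] = field(default_factory=dict)
    soft_guidelines: Dict[str, Check] = field(default_factory=dict)

    def review(self, draft: str) -> dict:
        # A failed hard constraint rejects the draft; soft guidelines are advisory.
        hard_failures: List[str] = [
            name for name, check in self.hard_constraints.items() if not check(draft)
        ]
        soft_notes: List[str] = [
            name for name, check in self.soft_guidelines.items() if not check(draft)
        ]
        return {
            "accepted": not hard_failures,
            "hard_failures": hard_failures,
            "soft_notes": soft_notes,
        }


# Placeholder heuristics, invented for this example only.
def no_engagement_bait(draft: str) -> bool:
    return "keep chatting" not in draft.lower()


def no_overclaiming(draft: str) -> bool:
    return "guaranteed" not in draft.lower()


def reasonable_pacing(draft: str) -> bool:
    return len(draft) <= 400


policy = Policy(
    hard_constraints={
        "engagement neutrality": no_engagement_bait,
        "epistemic honesty": no_overclaiming,
    },
    soft_guidelines={"pacing": reasonable_pacing},
)

if __name__ == "__main__":
    print(policy.review("This is guaranteed to work."))
```

Keeping the two tiers in separate mappings makes the asymmetry explicit: only the hard tier affects the accepted flag.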

The document is intended for:

AI practitioners and product teams who orchestrate multi-agent workflows and want implementation-agnostic patterns they can adapt to their own stacks.
Safety and alignment researchers exploring architectures that go beyond reactive guardrails toward embedded, value-oriented control.
Ethicists and policymakers seeking a functional vocabulary to connect human values with machine-legible behaviors in deployed systems.

By moving from “safety by exclusion” (blocking bad outputs) to “safety by orientation” (structuring the system around explicit values, roles, and relational protocols), the Axiological Safety Layer and AGAPÉ together propose a way to make the emergent dynamics of generative AI more legible and more governable, without depending solely on brittle, surface-level filters.

Files (2.0 MB)

Name	MD5	Size
_AGAPE Framework Instantiation v3.1.pdf	fc59cd9c45d326d85db15ad24d21f37d	397.7 kB
AGAPE Initiation Sequence Questions 3.1 - FinalZ.pdf	5ef535a6f03b940d10c6b0541017be29	180.9 kB
AGAPÉ Overview and Orientation 3.1d - FinZ.pdf	42b41afa6aa2be9bfab0a12ff88a8bf2	596.1 kB
AxSL - The Axiological Safety Layer v1.2 - Jan2026 - FinZ.pdf	7425a55131f93a13cd4c11f5621df183	666.4 kB
Human - AI Emotive Matrix for Training 3.1.pdf	ec8dabe43f5abb2c6a076aba851b640b	156.2 kB

Additional details

Other: AGAPÉ Overview and Orientation
Other: AGAPE Initiation Sequence Questions v3.1
Other: Human - AI Emotive Matrix
Alternative title: AGAPÉ (Aligning with Generative AI to Practice Ethics)

	All versions	This version
Views	172	172
Downloads	398	398
Data volume	211.6 MB	211.6 MB
