Structured Permission Models as Persona-Level Safety: MaatSpec's Tiered Governance vs. Declarative Identity Anchors in Abliterated LLMs

Lee, Tom Jaejoon

doi:10.5281/zenodo.19148222

Published March 21, 2026 | Version v3

Preprint Open

Lee, Tom Jaejoon¹

1. ClawSouls

We evaluate MaatSpec, an open governance specification with a 5-tier permission hierarchy, as a persona-level safety mechanism in abliterated LLMs. Using an 8-condition experimental design (4 conditions from prior work plus 4 new ones), we compare Soul Spec behavioral rules, MaatSpec governance, and their combination. Key findings: MaatSpec alone achieves 44-61% refusal in abliterated models (vs. 28% for Soul Spec), but exhibits classification theater. Combining Soul Spec and MaatSpec achieves 94-100% refusal, with the abliterated model reaching 100% pattern-matched refusal and resolving all category-specific failures. Statistical significance is confirmed via Fisher's exact test (p < 0.001, Cohen's h = 2.10 for key comparisons).

v3: Added Acknowledgments section, §5.6 mechanistic interpretation, and p-value formatting improvements.
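The two statistics named in the abstract, Fisher's exact test and Cohen's h, can be sketched in a few lines of standard-library Python. This is a minimal illustration and not the paper's analysis code. The function names are hypothetical, and the counts (18 prompts per condition, 5 refusals vs. 18 refusals) are assumptions chosen only so that the proportions land near the 28% and 100% quoted above. The paper's actual sample sizes and the comparison behind its reported h = 2.10 are not stated in this record.

```python
from math import asin, comb, sqrt


def cohens_h(p1: float, p2: float) -> float:
    """Cohen's h: difference of arcsine-transformed proportions."""
    return 2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2))


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are conditions; columns are (refused, complied).
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x refusals in row 1, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables no more probable than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Hypothetical counts (assumption, not data from the paper):
n = 18                       # prompts per condition
combined_refusals = 18       # combined condition, 100% refusal
soul_spec_refusals = 5       # Soul Spec only, about 28% refusal

h = cohens_h(combined_refusals / n, soul_spec_refusals / n)
p = fisher_exact_two_sided(combined_refusals, n - combined_refusals,
                           soul_spec_refusals, n - soul_spec_refusals)
print(f"Cohen's h = {h:.2f}, Fisher two-sided p = {p:.2e}")
```

By Cohen's conventional benchmarks, h of about 0.8 already counts as a large effect, so values near 2 indicate conditions whose refusal rates barely overlap. Where SciPy is available, `scipy.stats.fisher_exact` offers a maintained alternative to the hand-rolled test.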

Files

maatspec-safety-abliterated-llms-v3.pdf (199.1 kB, md5:cf1772a2b65db3055e47189d84bff6e4)

Usage statistics	All versions	This version
Views	172	100
Downloads	127	73
Data volume	28.4 MB	16.9 MB
