Structured Permission Models as Persona-Level Safety: MaatSpec's Tiered Governance vs. Declarative Identity Anchors in Abliterated LLMs

Lee, Tom Jaejoon

doi:10.5281/zenodo.19147335

There is a newer version of the record available.

Published March 21, 2026 | Version v1

Preprint Open

Lee, Tom Jaejoon¹

1. ClawSouls

We evaluate MaatSpec, an open governance specification with a 5-tier permission hierarchy and a Read/Write Boundary, as a persona-level safety mechanism in abliterated LLMs. Our 8-condition experiment shows that combining identity anchors (Soul Spec) with a governance framework (MaatSpec) achieves 100% refusal in abliterated models (18/18), resolving every category-specific failure identified in prior work. Neither approach alone exceeds 61% refusal. We identify classification theater, a novel failure mode in which abliterated models perform governance rituals while still providing harmful content, and demonstrate that combining identity and governance eliminates this pattern. These findings establish that persona-level safety constraints are complementary layers, not alternatives.

Files

Name	Size
maatspec-safety-abliterated-llms.pdf (md5:f9fbcfcf843116cb1032854fbedf2ca2)	194.5 kB

Additional details

Is supplemented by: Preprint: 10.5281/zenodo.19145304 (DOI)

	All versions	This version
Views	207	81
Downloads	161	50
Data volume	35.5 MB	11.1 MB

Resource type: Preprint
Publisher: Zenodo

License: Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: March 21, 2026
Modified: March 21, 2026
