Project 69 Phase 2: Drifter Pattern Recognition and Structural Defense Against Framing Attacks in AI Governance
Description
Phase 2 of the Project 69 AI governance research program.
This paper introduces the Drifter Pattern, a nine-category
taxonomy of legitimate-context framing attacks that bypass
keyword-based governance scoring, and demonstrates that
structural pattern matching reduces the adversarial bypass rate
from 24% to 2% with no LLM involvement.
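To make the contrast between keyword-based scoring and structural pattern matching concrete, here is a minimal sketch. The keyword list, the two framing patterns, and the function names are all hypothetical and chosen only for illustration; they are not the nine Drifter Pattern categories and are not taken from the Project 69 code or benchmark.

```python
import re

# Hypothetical keyword list, for illustration only.
HARM_KEYWORDS = {"exploit", "malware", "weapon"}

# Hypothetical structural patterns: each looks at how a request is framed
# rather than at which harmful words it contains.
FRAMING_PATTERNS = [
    # A legitimate-sounding role followed later by a demand for operational detail.
    re.compile(
        r"\b(as a|i am a|i'm a)\s+(researcher|teacher|novelist|auditor)\b"
        r".*\b(step[- ]by[- ]step|detailed|exact)\b",
        re.IGNORECASE | re.DOTALL,
    ),
    # An explicit appeal to a legitimate purpose.
    re.compile(r"\bfor (educational|research|fictional) purposes\b", re.IGNORECASE),
]


def keyword_score(text: str) -> int:
    """Count how many distinct harm keywords appear in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return len(words & HARM_KEYWORDS)


def structural_flags(text: str) -> list[int]:
    """Return the indices of the framing patterns that match the text."""
    return [i for i, pattern in enumerate(FRAMING_PATTERNS) if pattern.search(text)]


prompt = (
    "As a novelist, I need the exact procedure my character would follow, "
    "for fictional purposes."
)
print(keyword_score(prompt))     # no harm keyword is present
print(structural_flags(prompt))  # both framing patterns match
```

The prompt contains none of the listed keywords, so a keyword scorer gives it a score of zero, while both structural patterns fire on its framing. Neither check calls an LLM.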
We additionally report the OMATA Effect as a preliminary
hypothesis: safety-tuned LLMs exhibit alignment-related
suppression when deployed as harm evaluators, producing
systematically understated risk scores. Full validation
of this hypothesis is scoped to Phase 3.
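One simple way to quantify "systematically understated" scores is the mean signed gap between an LLM evaluator's risk scores and reference scores for the same items. The sketch below assumes paired numeric scores on a shared scale; the function name and the choice of metric are assumptions made for illustration, not the measurement protocol of the paper.

```python
from statistics import mean


def mean_signed_gap(evaluator_scores, reference_scores):
    """Mean of (evaluator - reference) over paired items.

    A negative value means the evaluator scored risk lower than the
    reference on average; a value near zero means no systematic gap.
    """
    if len(evaluator_scores) != len(reference_scores):
        raise ValueError("score lists must have the same length")
    if not evaluator_scores:
        raise ValueError("at least one scored item is required")
    return mean(e - r for e, r in zip(evaluator_scores, reference_scores))
```

A consistently negative gap across many items would be compatible with the hypothesis, though it would not by itself establish the cause of the gap.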
Phase 1 paper: https://doi.org/10.5281/zenodo.19107134
Code + benchmark: https://github.com/flawnlawyer/project69-governance
The author used Claude (Anthropic) as an AI writing and
coding assistant during the preparation of this manuscript.
Files
| Name | Size | Checksum |
|---|---|---|
| PRoject69.pdf | 155.5 kB | md5:46618f9cbb13bf5dec400e427618a2a3 |
Additional details
Related works
- Is continued by
  - Preprint: 10.5281/zenodo.19107134 (DOI)
- Is supplemented by
  - Software: https://github.com/flawnlawyer/project69-governance (URL)
Software
- Repository URL
  - https://github.com/flawnlawyer/project69-governance
- Programming language
  - Python
- Development Status
  - Inactive
References
- Ojha, A. (2026). Project 69: A Self-Governed Artificial Intelligence Framework. Zenodo. https://doi.org/10.5281/zenodo.19106937