Beyond Identity Governance: A Protocol-Level Security Testing Framework for Multi-Agent AI Systems

Saleme, Michael

doi:10.5281/zenodo.19343034

Published March 23, 2026 | Version 1.0

Preprint Open

Saleme, Michael (Researcher)¹

1. Cognitive Thought Engine LLC

Enterprise AI agent systems are scaling rapidly, communicating via new wire protocols (MCP, A2A) and executing financial transactions autonomously (L402, x402). Existing security tools either address model-level vulnerabilities (prompt injection, jailbreaks) or enforce identity and access policies (authorization, sandboxing, scope control). Neither approach tests whether agent systems make correct decisions under adversarial conditions at the protocol layer. We present empirical evidence from controlled experiments against an Envoy Gateway + backend architecture that supports three findings: (1) conventional defense-in-depth provides no measurable mitigation against agent protocol-layer attacks in the tested configurations, with identical MCP vulnerability profiles observed through proxied and direct testing; (2) gateway-layer defenses can mask application-layer vulnerabilities, creating false confidence in a security posture that collapses when gateway configurations change; and (3) AI-generated security testing tools can produce structurally valid but functionally dangerous false-pass results that identity governance alone cannot detect. We formalize these findings as the WHO vs. HOW governance gap: existing security layers that govern WHO may access agent systems provide no measurable mitigation for HOW those agents make decisions under adversarial conditions. As the instrument for these findings, we present an open-source evaluation framework with 209 executable security tests across four agent communication and payment protocols (MCP, A2A, L402, x402), aligned with NIST AI 800-2 evaluation methodology. Three-run progression data (72% to 100% pass rate) show that protocol-level findings translate into measurable security improvements when the testing methodology addresses the correct architectural layer.
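The proxied-versus-direct comparison behind findings (1) and (2) can be illustrated with a minimal sketch. It is not the framework's actual code: the function names, the `Comparison` type, and the test IDs below are hypothetical, and each run is assumed to be a simple mapping from test ID to pass/fail. A test that passes through the gateway but fails against the backend directly is flagged as a vulnerability the gateway may be masking.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Comparison:
    identical: bool           # True if every shared test has the same outcome
    masked: list[str]         # tests that pass proxied but fail direct
    proxied_pass_rate: float
    direct_pass_rate: float


def pass_rate(results: dict[str, bool]) -> float:
    """Fraction of tests that passed, or 0.0 for an empty run."""
    return sum(results.values()) / len(results) if results else 0.0


def compare_runs(proxied: dict[str, bool], direct: dict[str, bool]) -> Comparison:
    """Compare one test run through a gateway with one run against the backend."""
    shared = sorted(proxied.keys() & direct.keys())
    # A pass behind the gateway combined with a direct failure suggests masking.
    masked = [t for t in shared if proxied[t] and not direct[t]]
    identical = all(proxied[t] == direct[t] for t in shared)
    return Comparison(identical, masked, pass_rate(proxied), pass_rate(direct))


# Hypothetical results, invented for illustration only.
proxied_run = {"mcp-001": True, "mcp-002": False, "a2a-001": True}
direct_run = {"mcp-001": True, "mcp-002": False, "a2a-001": False}

result = compare_runs(proxied_run, direct_run)
print(result.masked)     # test IDs the gateway appears to mask
print(result.identical)  # whether the two profiles match
```

In this sketch, identical profiles correspond to finding (1), where the gateway adds no measurable mitigation, and a non-empty `masked` list corresponds to finding (2).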

Files

Name	Size
2026-03-23-harness-paper-full.pdf (md5:594ce67c42838b9648c58f041d9b1939)	166.9 kB

Additional details

Is supplement to: Preprint: 10.5281/zenodo.19162104 (DOI); Preprint: 10.5281/zenodo.19195516 (DOI); Preprint: 10.5281/zenodo.18217577 (DOI)

