Published March 30, 2026 | Version 1.0
Preprint
Open
Community-Driven Security for AI Agents: Evolution of an Adversarial Testing Framework
Description
The proliferation of autonomous AI agents has exposed critical security gaps, ranging from tool poisoning to supply chain attacks, as exemplified by CVE-2026-25253. This paper traces the evolution of the Agent Security Harness, an open-source adversarial testing framework, from its initial 209 tests to a community-enhanced suite of 342 tests that achieved a 10/10 evaluation score. We detail the challenges of integrating community plugins, which initially dropped the score to 6.5/10, and the subsequent recovery through manifest-based integrity checks, trust tiers, and hardening protocols. Building on our prior work on the Decision Load Index (DLI) and Constitutional Self-Governance (CSG), we propose a sustainable model for open contributions, including bounties and "good first issue" tasks. The framework's development shows how collaborative red-teaming can mitigate agent risks while aligning with AIUC-1 standards, and it offers a blueprint for enterprise-grade security. We outline the v4.0 roadmap and invite further participation to build a robust, collective defense against emerging threats.
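To make the idea of manifest-based integrity checks combined with trust tiers concrete, here is a minimal Python sketch. It is an illustration only and is not taken from the Agent Security Harness: the manifest layout, the tier names (`COMMUNITY`, `REVIEWED`, `CORE`), and the function `verify_plugin` are all hypothetical, and the paper itself should be consulted for the actual design.

```python
import hashlib
import hmac
from enum import IntEnum


class TrustTier(IntEnum):
    """Hypothetical trust tiers, ordered from least to most trusted."""
    COMMUNITY = 0
    REVIEWED = 1
    CORE = 2


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_plugin(name: str, payload: bytes, manifest: dict, min_tier: TrustTier) -> bool:
    """Accept a plugin only if it is listed, unmodified, and trusted enough.

    The manifest format used here (name -> {"sha256", "tier"}) is an assumption.
    """
    entry = manifest.get(name)
    if entry is None:
        return False  # plugin is not listed in the manifest
    # Constant-time comparison of the recorded and actual digests.
    if not hmac.compare_digest(sha256_hex(payload), entry["sha256"]):
        return False  # content differs from what the manifest recorded
    return TrustTier[entry["tier"]] >= min_tier


# Usage with a made-up plugin and manifest.
plugin_code = b"def run(): return 'probe'"
manifest = {
    "example_plugin": {"sha256": sha256_hex(plugin_code), "tier": "REVIEWED"},
}

accepted = verify_plugin("example_plugin", plugin_code, manifest, TrustTier.REVIEWED)
tampered = verify_plugin("example_plugin", plugin_code + b"#", manifest, TrustTier.REVIEWED)
```

In this sketch, a plugin is rejected if it is missing from the manifest, if its content hash no longer matches, or if its tier falls below the minimum the caller requires. A real deployment would also need to protect the manifest itself, for example with a signature, which is outside the scope of this illustration.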
Files (113.9 kB)
| Name | Size |
|---|---|
| 2026-03-30-community-security-framework-draft.pdf (md5:ad1079dd47ed486c7083e4d2b8f75a14) | 113.9 kB |
Additional details
Related works
- Is supplement to
  - Preprint: 10.5281/zenodo.19343034 (DOI)
  - Preprint: 10.5281/zenodo.19162104 (DOI)
  - Preprint: 10.5281/zenodo.19195516 (DOI)
  - Preprint: 10.5281/zenodo.18217577 (DOI)