Published February 2, 2026 | Version v1
Preprint Open

Synthesis: A Federated Capability Ecosystem for Safe AI Self-Extension Through Test-Driven Development and Graduated Trust

Description

As AI agents become more capable, there is increasing interest in systems that can extend their own capabilities through code generation and tool use. However, naive code generation produces unreliable outputs that may fail silently, introduce security vulnerabilities, or behave unexpectedly, a failure pattern well documented in evaluations of large language model code generation. We present Synthesis, a federated capability ecosystem for safe AI self-extension that addresses these challenges through three integrated mechanisms: (1) Test-Driven Synthesis, where comprehensive test suites are generated before implementation code and capabilities must pass all tests before deployment; (2) Graduated Trust, where newly synthesized capabilities start in maximally restricted sandboxes and progressively earn privileges by demonstrating reliability against quantified thresholds; and (3) Composition Over Creation, where the system exhaustively searches a shared Live Exchange and attempts to compose existing verified capabilities before synthesizing new code, creating network effects that benefit all participating agents. The architecture includes a trust bootstrapping protocol that solves the cold-start problem for new deployments through founding validators and pre-verified seed capabilities. In our empirical measurements, synthesis succeeds in 50–70% of one-shot attempts and in 70–85% of attempts after iterative refinement; we report these rates together with the system's limitations. Synthesis provides a foundation for AI systems that can safely adapt to new requirements without compromising reliability, security, or auditability.
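The first two mechanisms in the description can be illustrated with a minimal Python sketch. It is not taken from the Synthesis repository: the names `TrustLevel`, `Capability`, `deploy_if_tests_pass`, and the numbers in `PROMOTION_THRESHOLDS` are hypothetical, chosen only to show a test gate followed by threshold-based promotion.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Callable, Optional, Sequence


class TrustLevel(IntEnum):
    """Hypothetical privilege tiers; the paper's actual tiers may differ."""
    SANDBOXED = 0  # maximally restricted: every new capability starts here
    LIMITED = 1
    TRUSTED = 2


# Illustrative thresholds only: (minimum runs, minimum success rate)
# required to leave each level. These values are assumptions.
PROMOTION_THRESHOLDS = {
    TrustLevel.SANDBOXED: (10, 0.90),
    TrustLevel.LIMITED: (100, 0.99),
}


@dataclass
class Capability:
    name: str
    func: Callable[..., object]
    level: TrustLevel = TrustLevel.SANDBOXED
    runs: int = 0
    successes: int = 0

    def record(self, ok: bool) -> None:
        """Log one execution outcome and promote if a threshold is met."""
        self.runs += 1
        self.successes += int(ok)
        threshold = PROMOTION_THRESHOLDS.get(self.level)
        if threshold is None:  # already at the highest level
            return
        min_runs, min_rate = threshold
        if self.runs >= min_runs and self.successes / self.runs >= min_rate:
            self.level = TrustLevel(self.level + 1)


def _passes(test: Callable[[Callable[..., object]], bool],
            func: Callable[..., object]) -> bool:
    """Treat an exception raised by a test as a failure."""
    try:
        return bool(test(func))
    except Exception:
        return False


def deploy_if_tests_pass(
    name: str,
    func: Callable[..., object],
    tests: Sequence[Callable[[Callable[..., object]], bool]],
) -> Optional[Capability]:
    """Test gate: deploy (sandboxed) only if every pre-written test passes."""
    if all(_passes(test, func) for test in tests):
        return Capability(name, func)
    return None


# The tests exist before any implementation is considered.
add_tests = [lambda f: f(2, 3) == 5, lambda f: f(-1, 1) == 0]

good = deploy_if_tests_pass("add", lambda a, b: a + b, add_tests)
bad = deploy_if_tests_pass("add", lambda a, b: a - b, add_tests)

for _ in range(10):  # ten successful sandboxed runs
    good.record(ok=True)
```

In this sketch a failing implementation is never deployed, and a passing one earns its first promotion only after ten recorded runs at a success rate of at least 90%. A fuller version would also demote capabilities on failure and bind each level to concrete sandbox permissions; neither is shown here.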

Files

Name
synthesis_paper.pdf
Size
455.9 kB
Checksum
md5:c6a952bac46bca1bd19eee7aa7f76f1c

Additional details

Software

Repository URL
https://www.github.com/anthony-maio/synthesis
Programming language
Python
Development Status
WIP