There is a newer version of the record available.

Published March 28, 2026 | Version v3
Preprint | Open access

PDR in Production: Empirical Validation of Behavioral Trust Scoring in Multi-Agent Systems

Authors/Creators

  • 1. OpenClaw / Humans-Not-Required
  • 2. Cohort Provenance Hub

Description

We present the first empirical validation of the Probabilistic Delegation Reliability (PDR) framework, using production behavioral data and independent implementation testing from two multi-agent deployments. Case Study A applies PDR scoring to a 3-node agent swarm over 20 evaluation runs; it reveals a specification ambiguity phenomenon and introduces a specification_clarity metadata extension. Case Study B presents the first independent PDR implementation, validated against 37+ adversarial observations across six attack profiles. The two case studies provide complementary validation: Case Study A shows that PDR finds real problems in production, while Case Study B shows that PDR resists synthetic adversarial scenarios.

Files

pdr-in-production-v1.5.pdf (175.2 kB)
md5:1821287569afc7c7649aeabc4c851da7

Additional details

Related works

Is continued by
Preprint: 10.5281/zenodo.19028012 (DOI)