Tool-Entropy Collapse: A Cross-Architecture Signature of Agent WANDERING Failure
Description
We identify a 34% blind spot in probe-based failure monitoring of LLM agents on Qwen3.6-27B SWE-bench Pro: the WANDERING sub-class, in which the probe predicts success but the agent never emits finish_tool. We test six detector designs across three signal channels (text, residual cross-layer, action entropy) and find that tool-use entropy collapse is the breakthrough signal: WANDERING agents collapse onto a small set of repeated tool calls (W/S median ratio 0.41 in Qwen and Llama, 0.71 in GPT-5), enabling a Tier-3 autonomous-termination detector at 70% recall with a 5% false-positive cost.
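As a minimal sketch of the action-entropy channel, the snippet below computes the Shannon entropy of the tool names called in one trajectory. The exact definition used in the paper (windowing, normalization, whether tool arguments count) is not stated here, so the function name `tool_entropy` and the two example trajectories are hypothetical illustrations, not the authors' implementation.

```python
import math
from collections import Counter


def tool_entropy(tool_calls):
    """Shannon entropy (in bits) of the empirical tool-name distribution.

    Assumption: entropy is taken over tool names across the whole
    trajectory; the paper may define it differently.
    """
    if not tool_calls:
        return 0.0
    counts = Counter(tool_calls)
    total = len(tool_calls)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical trajectories, invented for illustration only.
varied = ["read_file", "grep", "edit_file", "run_tests",
          "read_file", "edit_file", "run_tests", "finish_tool"]
looping = ["run_tests", "run_tests", "edit_file", "run_tests",
           "run_tests", "run_tests", "edit_file", "run_tests"]

print(f"varied:  {tool_entropy(varied):.3f} bits")
print(f"looping: {tool_entropy(looping):.3f} bits")
```

A trajectory that keeps cycling through one or two tools scores lower than one that spreads its calls across many tools, which is the collapse pattern described above.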
Cross-architecture validation: Llama-70b (n=2,315, p<10⁻¹⁵, ratio 0.41) and the GPT-5 router (n=1,419, p=8.9×10⁻³⁵, ratio 0.71) confirm the effect. Cross-task validation on METR MALT yields a null result (p=0.81), which scopes the claim to multi-turn code-execution agent tasks with rich action spaces.
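The group comparison behind these numbers can be sketched as follows, assuming "W/S median ratio" means the median per-trajectory entropy of WANDERING runs divided by that of successful runs. The statistical test that produced the reported p-values is not named in this description, so the permutation test below is a stand-in chosen for illustration, and the function names and sample values are hypothetical.

```python
import random
from statistics import median


def median_ratio(wandering, success):
    """Median WANDERING entropy divided by median success entropy."""
    return median(wandering) / median(success)


def permutation_p_value(wandering, success, n_resamples=10_000, seed=0):
    """Two-sided permutation test on the absolute difference of medians.

    Assumption: this is an illustrative stand-in, not the test used
    in the paper.
    """
    rng = random.Random(seed)
    observed = abs(median(wandering) - median(success))
    pooled = list(wandering) + list(success)
    k = len(wandering)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        if abs(median(pooled[:k]) - median(pooled[k:])) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_resamples + 1)


# Made-up per-trajectory entropies (bits), for illustration only.
w_entropy = [0.8, 0.9, 1.0, 0.7, 0.85]
s_entropy = [2.0, 2.2, 2.1, 1.9, 2.3]

print("W/S median ratio:", median_ratio(w_entropy, s_entropy))
print("permutation p:", permutation_p_value(w_entropy, s_entropy, n_resamples=2000))
```

A ratio below 1 indicates lower tool-use entropy in the WANDERING group; a ratio near 1 with a large p-value would correspond to a null result like the one reported for METR MALT.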
The paper provides a three-tier deployment framework (forensics, advisory escalation, autonomous termination), with every tier shippable. Mid-layer ablation suggests that the edge layers (L11, L55) contribute to the cross-layer disagreement signal, but we remain undecided between an edge-specificity interpretation and a layer-count interpretation.
Reproducibility: all code, per-trajectory output JSONs, and figure-generation scripts are available on GitHub under Apache-2.0. The OpenInterp Phase 6 dataset (99 trajectories × per-turn residuals at L11/L23/L31/L43/L55 in bf16 safetensors) will be released on HuggingFace upon paper acceptance.
Files
fig1_cross_arch_entropy.pdf
(678.8 kB)
| Checksum | Size |
|---|---|
| md5:ebbc4bdc7896bceee0373c9a7044f407 | 41.2 kB |
| md5:ec2f1e081e3a0bc46f09e768805075dd | 23.8 kB |
| md5:9e0b8c098a6cfc875c44d60b71c34404 | 32.2 kB |
| md5:79c298298f0738a9dbea96f9733c180d | 24.3 kB |
| md5:259a9bc2e820a43b61d7c9b45855f887 | 17.0 kB |
| md5:19a71b2f74d1156c82fdf94dbb388311 | 510.8 kB |
| md5:15fb53987f4fc723c71b48b75b899c2c | 29.6 kB |
Additional details
Related works
- Is supplement to
  - Software: https://github.com/OpenInterpretability/openinterp-swebench-harness