Published May 19, 2026 | Version v1
Working paper Open

Explore-Consolidate Dynamics in Cross-Probe Coherence Separate Successful and Failed LLM Agent Trajectories

Authors/Creators

  • 1. OpenInterpretability

Description

We propose cross-probe coherence κ_t, the mean absolute pairwise correlation of N concurrent per-turn behavioral probes within a moving window of agent turns, as a meta-signal for LLM agent monitoring. On 99 SWE-bench Pro trajectories from Qwen3.6-27B we report two findings. (1) The per-trace mean κ̄ separates success from failure at AUROC 0.677 (Mann-Whitney p=0.0009). (2) κ_t traces a U-shape over each trajectory: it falls through an early exploration phase and rises through a late consolidation phase, and the amplitude of the U is markedly larger in successful traces (early-half slope p=0.0002, late-half slope p=0.00004). The pattern is the inverse of cardiac uncoupling: in the ICU literature, cross-vital decorrelation anticipates decompensation, whereas in LLM agents cross-probe coherence dips and then recovers during successful reasoning and stays flat during failure. A pre-registered robustness control caught a substantial trace-length confound in an earlier version of the headline result, which fit a single monolithic slope; the U-shape decomposition is the post-control rescue and is length-normalized by construction. Five pre-registered single-probe candidates were walked back before this finding emerged; we document this walk-back-and-rescue sequence explicitly as the source of methodological credibility.
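To make the definition concrete, the following is a minimal sketch of how κ_t and the early-half and late-half slopes could be computed. It is not the published run_kappa_t_v2.py. It assumes Pearson correlation, a sliding window of 8 turns, and a least-squares line fit on a position axis rescaled to [0, 1]; the description does not specify any of these. The function names kappa_t and half_slopes are hypothetical.

```python
import numpy as np


def kappa_t(probes: np.ndarray, window: int = 8) -> np.ndarray:
    """Mean absolute pairwise correlation of N probes in a sliding window.

    probes: array of shape (T, N), one row per agent turn, one column per probe.
    window: number of consecutive turns per window (assumed value, not from the paper).
    Returns an array of length T - window + 1.
    """
    n_turns, n_probes = probes.shape
    if n_probes < 2 or window < 3 or n_turns < window:
        raise ValueError("need at least 2 probes, window >= 3, and T >= window")

    upper = np.triu_indices(n_probes, k=1)  # each unordered probe pair once
    out = np.empty(n_turns - window + 1)
    for i in range(out.size):
        chunk = probes[i:i + window]
        # A probe that is constant inside the window has undefined correlation (NaN).
        with np.errstate(invalid="ignore", divide="ignore"):
            corr = np.corrcoef(chunk, rowvar=False)
        # Average |r| over the pairs that are defined in this window.
        out[i] = np.nanmean(np.abs(corr[upper]))
    return out


def half_slopes(kappa: np.ndarray) -> tuple[float, float]:
    """Slopes of kappa over the early and late halves of a trajectory.

    Position is rescaled to [0, 1], so the slopes do not depend on trace length.
    """
    if kappa.size < 4:
        raise ValueError("need at least 4 kappa values to fit two halves")
    x = np.linspace(0.0, 1.0, kappa.size)
    mid = kappa.size // 2
    early = np.polyfit(x[:mid], kappa[:mid], 1)[0]
    late = np.polyfit(x[mid:], kappa[mid:], 1)[0]
    return float(early), float(late)
```

Under these assumptions, a U-shaped κ_t gives a negative early-half slope and a positive late-half slope, and a larger U amplitude gives larger slope magnitudes. Rescaling the position axis is one way to obtain length normalization; the paper's construction may differ.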

Notes

Compute: total marginal compute after Phase 6 capture is R$0 (CPU only). All scripts (Apache-2.0) and data (CC-BY-4.0) are published. Phase 6 capture artifacts (~900 MB residuals) are available at the linked Hugging Face dataset. Reproducibility: re-running run_kappa_t_v2.py on the published residuals should produce results within 1e-4 of those reported in the paper.

Files

fig8_early_late_half.png

Files (356.8 kB)

md5:3aa1ae84423f73875868e287c68815b6 (223.2 kB)
md5:df13be702b2fbbdb65de965a797cafaf (37.3 kB)
md5:6097be708198df2c3d57c9f9785f2e66 (96.4 kB)

Additional details