Explore-Consolidate Dynamics in Cross-Probe Coherence Separate Successful and Failed LLM Agent Trajectories

Vicentino, Caio

doi:10.5281/zenodo.20278983

Published May 19, 2026 | Version v1

Working paper Open

Vicentino, Caio¹

1. OpenInterpretability

We propose cross-probe coherence κ_t, the mean absolute pairwise correlation of N concurrent per-turn behavioral probes within a moving window of agent turns, as a meta-signal for LLM agent monitoring. On 99 SWE-bench Pro trajectories from Qwen3.6-27B we report two findings. (1) The per-trace mean κ̄ separates success from failure at AUROC 0.677 (Mann-Whitney p=0.0009). (2) κ_t traces a U-shape over each trajectory: it falls through an early exploration phase and rises through a late consolidation phase, and the amplitude of the U is markedly larger in successful traces (early-half slope p=0.0002, late-half slope p=0.00004). The pattern is the inverse of cardiac uncoupling: in the ICU literature, cross-vital decorrelation anticipates decompensation, whereas in LLM agents the cross-probe trajectory oscillates during successful reasoning and stays flat during failure. A pre-registered robustness control caught a substantial trace-length confound in an earlier monolithic-slope version of the headline result; the U-shape decomposition is the post-control rescue and is length-normalized by construction. Five pre-registered single-probe candidates were walked back before this finding emerged; we document this walk-back-and-rescue trajectory explicitly as the source of methodological credibility.
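The quantities named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the published run_kappa_t_v2.py: the function names, the input layout (one list of per-turn scores per probe), the window length, the choice to skip constant probe windows, and the use of normalized trace position for the half-trace slopes are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length series; None if either is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # correlation is undefined for a constant series
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy)


def kappa_series(probes, window):
    """Mean absolute pairwise correlation across probes in each sliding window.

    probes: list of N lists, each holding one probe's score per agent turn
            (hypothetical layout). Returns one kappa value per window start.
    """
    n_turns = len(probes[0])
    if len(probes) < 2 or not 3 <= window <= n_turns:
        raise ValueError("need at least 2 probes and 3 <= window <= number of turns")
    out = []
    for start in range(n_turns - window + 1):
        rs = []
        for a, b in combinations(probes, 2):
            r = pearson(a[start:start + window], b[start:start + window])
            if r is not None:  # assumption: undefined pairs are skipped
                rs.append(abs(r))
        out.append(sum(rs) / len(rs) if rs else float("nan"))
    return out


def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)


def half_slopes(kappa):
    """Slopes of kappa over the early and late halves of a trace.

    Position is rescaled to [0, 1] so traces of different length are comparable
    (an assumed reading of "length-normalized").
    """
    n = len(kappa)
    if n < 4:
        raise ValueError("need at least 4 kappa values")
    pos = [i / (n - 1) for i in range(n)]
    mid = n // 2
    return ols_slope(pos[:mid], kappa[:mid]), ols_slope(pos[mid:], kappa[mid:])


def auroc(group_a, group_b):
    """Probability that a value from group_a exceeds one from group_b (ties count half).

    This equals the Mann-Whitney U statistic divided by len(group_a) * len(group_b).
    """
    wins = sum((a > b) + 0.5 * (a == b) for a in group_a for b in group_b)
    return wins / (len(group_a) * len(group_b))
```

Under these assumptions, a U-shaped trace would show a negative early-half slope and a positive late-half slope from half_slopes, and auroc applied to the per-trace means of successful and failed trajectories gives a separation score of the kind reported above.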

Notes

Compute: the total marginal compute cost after the Phase 6 capture is R$0 (CPU only). All scripts (Apache-2.0) and data (CC-BY-4.0) are published. The Phase 6 capture artifacts (~900 MB of residuals) are available at the linked Hugging Face dataset. Reproducibility: re-running run_kappa_t_v2.py on the published residuals should produce results within 1e-4 of those reported in the paper.
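A reproduction run could be checked against the 1e-4 tolerance along the following lines. This is only a sketch: the output format of run_kappa_t_v2.py is not described here, so the metric-name-to-value dictionaries and the function name are hypothetical.

```python
import math


def mismatched_metrics(reported, reproduced, atol=1e-4):
    """Return names of metrics that are missing or differ by more than atol.

    reported / reproduced: dicts mapping a metric name to a float
    (hypothetical layout; adapt to whatever the script actually writes).
    """
    return sorted(
        name
        for name, value in reported.items()
        if name not in reproduced
        or not math.isclose(value, reproduced[name], rel_tol=0.0, abs_tol=atol)
    )
```

An empty list means every reported metric was reproduced within the absolute tolerance; the comparison should use the full-precision values from the paper rather than the rounded figures in the abstract.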

Files (356.8 kB)

Name	MD5	Size
fig8_early_late_half.png	3aa1ae84423f73875868e287c68815b6	223.2 kB
kappa_t_coherence_buildup.md	df13be702b2fbbdb65de965a797cafaf	37.3 kB
kappa_t_coherence_buildup.pdf	6097be708198df2c3d57c9f9785f2e66	96.4 kB

Additional details

Is identical to: Working paper: https://openinterp.org/research/papers/kappa-t-coherence-buildup
Is supplement to: Software: https://github.com/OpenInterpretability/openinterp-swebench-harness
Is supplemented by: Dataset: https://huggingface.co/datasets/caiovicentino1/openinterp-kappa-t-coherence-buildup
