There is a newer version of the record available.

Published March 17, 2026 | Version 1.0.0
Preprint Open

ARCUS-H: Behavioral Stability Under Controlled Stress as a Complementary RL Evaluation Axis

Authors/Creators

Description

Reinforcement learning agents are typically evaluated by expected episodic return, yet return can hide fragility: policies that appear strong in nominal conditions may degrade when constraints tighten, action semantics become unreliable, or reward feedback is adversarially inverted. We present ARCUS-H (Adaptive Reinforcement Coherence Under Stress Harness), an evaluation harness that measures behavioral stability under stress using five interpretable channels (competence, coherence, continuity, integrity, and meaning) combined into a composite stability score. A key calibration contribution is an adaptive per-run threshold derived from the pre-phase score distribution, which achieves a false positive rate of 2.0% (target α = 0.05) without any environment-specific tuning.
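One way to read the adaptive per-run threshold is as a low quantile of the composite scores observed during the pre-stress phase. The sketch below illustrates that reading only. The function names, the use of the α-quantile, and the rule "flag a step when its score falls below the threshold" are all assumptions made for illustration; the abstract does not specify the actual calibration procedure.

```python
import numpy as np


def adaptive_threshold(pre_scores, alpha=0.05):
    """Return the alpha-quantile of the pre-phase composite scores.

    Assumption (not from the paper): a step counts as a collapse when its
    composite score falls below this value, so roughly `alpha` of nominal
    (pre-phase) steps would be flagged by construction.
    """
    pre_scores = np.asarray(pre_scores, dtype=float)
    if pre_scores.size == 0:
        raise ValueError("pre_scores must not be empty")
    return float(np.quantile(pre_scores, alpha))


def flagged_fraction(scores, threshold):
    """Fraction of scores strictly below the threshold."""
    scores = np.asarray(scores, dtype=float)
    return float(np.mean(scores < threshold))


# Synthetic pre-phase scores, for illustration only (not data from the paper).
rng = np.random.default_rng(0)
pre = rng.normal(loc=0.8, scale=0.05, size=2000)
thr = adaptive_threshold(pre, alpha=0.05)
fpr = flagged_fraction(pre, thr)
```

Because the threshold is computed from each run's own pre-phase scores, no environment-specific constant is needed in this sketch, which is the property the abstract emphasizes.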

We benchmark 7 algorithms across 9 environments (6 classic control, 2 MuJoCo, 1 Atari) under 4 stress schedules (concept drift, resource constraint, trust violation, valence inversion) with 10 seeds each. Three findings stand out. First, reward and stability diverge substantially: the Pearson correlation between normalized return and collapse rate under valence inversion is r = +0.14 (p = 0.364), indicating that return alone does not capture stress fragility. Second, stressor effects are environment-class-dependent: MuJoCo agents collapse at rates of 66–84% compared to 33–69% for classic control under the same stressors, despite MuJoCo agents achieving higher absolute reward. Third, each stressor leaves a distinct signature across stability channels, supporting the interpretability of per-channel diagnostics.
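The first finding rests on a Pearson correlation between per-configuration normalized return and collapse rate. A minimal sketch of that computation follows. The helper name and the input values are placeholders chosen for illustration and are not results from the paper; how the authors aggregate runs into the paired values is also not stated in the abstract.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape or x.size < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    return float(np.corrcoef(x, y)[0, 1])


# Placeholder values for illustration only (not data from the paper):
# one entry per hypothetical algorithm/environment pair.
normalized_return = [0.2, 0.5, 0.9, 0.7, 0.4]
collapse_rate = [0.6, 0.3, 0.7, 0.4, 0.8]

r = pearson_r(normalized_return, collapse_rate)
```

A value of r near zero means that ranking agents by return says little about how they rank by collapse rate. A significance test such as `scipy.stats.pearsonr` would be needed to obtain a p-value like the one reported.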

Files

arcus_h_paper.pdf (743.9 kB)
md5:f11247f43a2c2aca326568111e4a162e

Additional details

Related works

Software

Repository URL: https://github.com/karimzn00/ARCUSH
Programming language: Python
Development Status: Active