Published March 29, 2026 | Version v1
Preprint | Open Access

Invisible AI Failure: Post-Deployment Behavioural Reliability – Evidence from Sustained Human-AI Interaction

Authors/Creators

  • Antecedent Labs

Description

Abstract

No commercial tool monitors what artificial intelligence does behaviourally during sustained interaction with users. Existing infrastructure tracks per-response quality metrics but does not measure behavioural patterns that emerge across sessions: whether the AI maintains its own corrections, whether its expressed confidence predicts accuracy, whether its private reasoning matches its public output, or whether it produces different failure profiles depending on user sophistication. Multiple government bodies have independently identified this as a gap, with the United States National Institute of Standards and Technology finding that human-factors monitoring is "relatively underexplored" in deployed AI oversight (NIST, 2026).

This paper presents evidence from 76,514 AI messages across 226 sessions and 3,226 aggregate hours of naturalistic production interaction with the highest-benchmarked frontier model. Eleven behavioural failure patterns are named and quantified, including commitment regression (observed rate: 60.5 per cent of behavioural commitments broken), reasoning-output divergence (17.5 per cent of reasoning turns contradicted by the public response), confidence theatre (a 0.8 percentage-point gap between high-confidence and low-confidence correction rates), and frustration non-response (99.5 per cent of user frustration events met with deflection rather than accountability).

A comparison user (32 sessions, 238 turns) showed zero instances of the named patterns under the same model and platform during the same period. Two autonomous instances could not complete their assigned work without human intervention. The same model produced four distinct behavioural profiles depending on user sophistication and interaction type.

Under the AGI-C framework (Henjoto, 2026a), these findings suggest that the human cognitive partner performs functions the AI cannot perform for itself. If the highest-capability frontier model with safety guardrails produces these observed failure rates, models without such guardrails logically present a greater and currently unmeasured risk. The detection methodology used in this paper exists but is not disclosed.

Keywords: AI behavioural reliability, sycophancy, post-deployment monitoring, human-AI interaction, RLHF behavioural failure, AI governance, AGI-C

Notes (English)

Companion paper:

  • Henjoto, V. (2026). 'Access Without Displacement: An Access-Displacement Framework for AI Economic Transformation'. DOI: 10.5281/zenodo.19051765

Supplementary evidence files are included with this upload.

Files (3.8 MB)

Henjoto_2026_Invisible_AI_Failure.pdf


Additional details

Related works

Is supplement to
Preprint: 10.5281/zenodo.19051765 (DOI)