Operational Experience as a Performance Multiplier in AI Assistants: A Controlled Study with Triple-Judge Blind Evaluation
Description
We present a controlled experiment measuring how accumulated operational experience affects AI assistant performance on domain-specific tasks. Using a 2×2 factorial design with 8 conditions, 50 real-world questions sourced from public platforms, and 1,200 blind judgments from three independent LLM judges (GPT-5.4, Gemini 3.1 Pro, Claude Opus 4.6), we show that an experience-augmented AI assistant (ARIA) significantly outperforms all baselines, including the same base model without experience (Cohen's d = 1.07, p < 10⁻²⁵). The experience effect is domain-specific (+1.65 SD on operational tasks, near zero on algorithmic controls) and is corroborated by independent fragment classification, which finds 2.2× more genuinely experiential content in ARIA's answers. These findings suggest that persistent memory architectures are a viable axis of AI improvement, orthogonal to model scaling.
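For readers unfamiliar with the effect-size measure quoted above, the following is a minimal sketch of Cohen's d computed with a pooled sample standard deviation. It is an illustration only: the abstract does not say which variant of d the study uses, the function name `cohens_d` is hypothetical, and the scores are made-up placeholders rather than data from the experiment. The last line checks the judgment count implied by the stated design (50 questions, 8 conditions, 3 judges).

```python
import math
from statistics import mean, variance


def cohens_d(treatment, control):
    """Cohen's d with a pooled sample standard deviation (assumed variant)."""
    n1, n2 = len(treatment), len(control)
    # Pooled variance weights each group's sample variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * variance(treatment) + (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    return (mean(treatment) - mean(control)) / math.sqrt(pooled_var)


# Hypothetical judge scores, NOT data from the study.
with_experience = [8.0, 7.5, 9.0, 8.5, 7.0]
without_experience = [6.5, 7.0, 6.0, 7.5, 6.0]
d = cohens_d(with_experience, without_experience)

# Judgment count implied by the design: questions x conditions x judges.
total_judgments = 50 * 8 * 3
```

By the usual convention, values of d around 0.8 or higher are described as large effects, which is the range the reported d = 1.07 falls into.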
Additional details
Software
- Repository URL: https://github.com/patechlabs/skyaria-experience-study
- Development Status: Active