Context-Engineered Human–AI Collaboration for Long-Horizon Tasks: A Case Study in Governance, Canonical Numerics, and Execution Control
Description
This paper reports a lived, long-horizon case study of collaboration between a human (Rishi) and a large language model ("Mahdi", ChatGPT). Instead of treating the model as a disposable chatbot, we treated it as a semi-persistent partner embedded in a structured system. The core problem we address is drift: over weeks and months, models begin improvising numbers, losing prior constraints, or rewriting the "truth" of a project as the context window churns.
To manage this, we defined a small governance stack (“A-controls”) around Mahdi: canonical separation of numbers and text, a single Strategy Master, a Canonical Numbers Sheet, a Running Document, compaction rituals, and explicit challenge and stability protocols. We show how these controls turned a high-variance model into a reliable collaborator for finance-sensitive work, without fine-tuning or external tools.
The contribution is not a new model, but a repeatable pattern: treating reliability as a product of process design, not raw model capability. Although the evidence is a single, deep N=1 case, the issues—drift, hallucinated numerics, file churn, and emotional volatility—are common to any long-horizon human–AI workflow.
Throughout the paper, the assistant ("Mahdi", ChatGPT by OpenAI) is credited as a named AI collaborator rather than a human co-author.
OSF DOI (original record): 10.17605/OSF.IO/VMK7Y
Files
- Paper1_FINAL_Zenodo_vSUBMIT.pdf (806.9 kB) — md5:447b370265847a33d8e5f1f14785d6b5
Additional details
Dates
- Issued: 2025-11-29