Pratyakṣa: A Context-Engineering System for Long-Context, Hallucination-Resistant Agentic AI
Description
We present Pratyakṣa, a context-engineering system for long-context, hallucination-resistant agentic
AI, packaged as a Claude-Code- and Cursor-compatible plugin (15 Model Context Protocol [MCP] tools, 3 skills,
3 agents, 4 commands, 3 lifecycle hooks). The system operationalises seven constructs drawn from classical
Indian epistemology (specifically from Nyāya–Vaiśeṣika, Advaita Vedānta, Pūrva Mīmāṃsā, and Sāṃkhya)
as runtime mechanisms for an LLM agent’s working context: pratyakṣa (direct perception), avacchedaka (typed
limitor conditions), bādha (sublation), buddhi/manas (judging vs. attending faculties), sākṣī (witness invariants),
khyātivāda (a six-class taxonomy of cognitive error), and adaptive forgetting. We validate across three orthogonal
evidence layers: (L1) seven preregistered hypotheses (H1–H7) on six public long-context and hallucination
benchmarks (RULER, HELMET, NoCha, HaluEval, TruthfulQA, FACTS-Grounding) with multi-seed, multi-model
paired permutation tests; (L2) a deterministic, reproducible live case study (P6-B) on three real GitHub issues
spanning Django, Requests, and pandas; and (L3) a head-to-head A/B test on 120 SWE-bench Verified instances
(P6-C, 720 paired runs across 2 models × 3 seeds × 120 issues) under a fixed 512-token research-block budget.
SWE-bench Verified instantiates the harness in one challenging coding domain; the mechanisms themselves are
agent- and domain-agnostic, and the L1 evidence carries the general claim. Across 10 quantitative studies, the system
produces a Stouffer-combined 𝑍 = 9.114 (two-sided 𝑝 = 7.94 × 10⁻²⁰), with mean per-study delta +0.476 in the
harness’s favour and a 100% target-path-hit rate on SWE-bench Verified versus 50.3% for the budgeted baseline.
The khyātivāda 6-class hallucination annotator achieves Cohen’s 𝜅 = 0.736 (“substantial”) on 𝑛 = 3,000 jointly
annotated examples. The contribution is not a new model architecture but a typed, witness-tracked,
sublation-aware context discipline that any LLM-based agent (research assistants, document QA, multi-tool
orchestrators, code-review agents) can adopt today via a drop-in plugin. The system, the plugin, and the full reproducibility
manifest are open-sourced.
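
The combined statistic quoted above can be sanity-checked with a few lines of Python. This is a minimal sketch, assuming the unweighted form of Stouffer's method and a standard-normal reference distribution; the abstract does not say whether the studies were weighted, and the function names `stouffer_z` and `two_sided_p` are illustrative, not taken from the plugin. The per-study z-scores are not given here, so the first call uses made-up inputs.

```python
import math


def stouffer_z(z_scores):
    """Unweighted Stouffer combination: sum of z-scores divided by sqrt(k)."""
    k = len(z_scores)
    if k == 0:
        raise ValueError("need at least one z-score")
    return sum(z_scores) / math.sqrt(k)


def two_sided_p(z):
    """Two-sided standard-normal tail probability, P(|N(0,1)| >= |z|)."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical inputs, for illustration only (not the paper's per-study values).
example_z = stouffer_z([2.0, 2.0, 2.0, 2.0])  # 8 / sqrt(4) = 4.0

# The combined Z reported in the abstract, converted to a two-sided p-value.
reported_p = two_sided_p(9.114)
print(f"example Z = {example_z}, p for Z = 9.114: {reported_p:.3g}")
```

Under these assumptions, the p-value computed from 𝑍 = 9.114 comes out at roughly 7.9 × 10⁻²⁰, in line with the figure reported above. `math.erfc` is used rather than `1 - cdf` because the latter loses all precision this far into the tail.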
Files
| Name | Size | Checksum |
|---|---|---|
| main.pdf | 950.2 kB | md5:fe6677d9babb08d4ca10f41e6647f4f4 |
Additional details
Dates
- Created: 2026-02-08