Aham: A Metacognitive Architecture for Latent-Steered Theory-of-Mind in Large Language Models
Description
Abstract:
Large language models (LLMs) exhibit impressive reasoning behaviors through chain-of-thought (CoT) generation, yet they cannot revise, contextualize, or self-regulate their internal reasoning in real time. This limitation prevents adaptive memory usage, Theory of Mind (ToM) sensitivity, and consistent interpersonal behavior across long-term interactions. We introduce Aham, a modular cognitive architecture that combines symbolic meta-reasoning with subsymbolic latent state intervention. Aham intercepts the model’s internal CoT trace, evaluates it using a meta-reasoning “Arbiter,” and modulates the model’s behavior through two parallel pathways: (1) an explicit rewriting engine that adjusts the reasoning text, and (2) a Latent State Steering (LSS) mechanism that injects ToM-derived vectors directly into the model’s residual stream.
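To make the two-pathway design above concrete, the short Python sketch below shows one way such a control loop could be wired together. It is illustrative only: the names `ArbiterVerdict`, `aham_step`, `arbiter.evaluate`, `rewriter.rewrite`, `steerer.set_profile`, and `model.continue_from` are assumptions for exposition, not the released Aham API.

```python
# Illustrative sketch of Aham's two-pathway control loop (assumed interfaces,
# not the authors' actual implementation).
from dataclasses import dataclass

@dataclass
class ArbiterVerdict:
    revise_text: bool          # route through the explicit rewriting engine
    steer_latent: bool         # route through Latent State Steering (LSS)
    tom_profile: list[float]   # ToM-derived vector for the latent pathway

def aham_step(cot_trace: str, arbiter, rewriter, steerer, model) -> str:
    """Intercept one chain-of-thought trace and modulate the model's behavior."""
    verdict: ArbiterVerdict = arbiter.evaluate(cot_trace)  # symbolic meta-reasoning
    if verdict.revise_text:
        cot_trace = rewriter.rewrite(cot_trace)            # pathway 1: explicit rewriting
    if verdict.steer_latent:
        steerer.set_profile(verdict.tom_profile)           # pathway 2: latent steering
    return model.continue_from(cot_trace)                  # resume generation from the trace
```

The key design point the sketch tries to capture is that the two pathways are not mutually exclusive: the Arbiter may trigger a textual rewrite, a latent-state intervention, or both in the same step.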
Crucially, Aham implements a Dynamic Residual Injection protocol: the ToM profile is projected into the model’s hidden dimension and added to the final hidden state before the language modeling head, biasing the token distribution toward personality-consistent outputs without altering the pre-computed KV cache. The system is evaluated on the DeepSeek-R1-Distill-Qwen-32B backbone. Preliminary results demonstrate that this hybrid approach produces more coherent, grounded, and user-adaptive reasoning than text-only modulation alone.
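As a minimal illustration of the Dynamic Residual Injection idea, the PyTorch/transformers sketch below projects a ToM profile vector into the backbone's hidden dimension and adds it, via a forward hook on the final normalization layer, to the hidden states just before the language modeling head. Because the hook fires after all decoder layers, the pre-computed KV cache is left untouched. The hook point (`model.model.norm`), the profile dimensionality, and the steering strength `alpha` are assumptions made for this sketch, not values from the paper.

```python
# Minimal sketch of Dynamic Residual Injection (illustrative assumptions, not
# the authors' implementation).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(
    MODEL, torch_dtype=torch.bfloat16, device_map="auto"
)

TOM_DIM = 16                                 # assumed size of the ToM profile vector
norm = model.model.norm                      # final RMSNorm before the LM head (Qwen2-style)
device, dtype = norm.weight.device, norm.weight.dtype

# Hypothetical projection from ToM-profile space into the residual stream.
tom_proj = torch.nn.Linear(TOM_DIM, model.config.hidden_size, bias=False).to(device, dtype)
tom_profile = torch.randn(TOM_DIM, device=device, dtype=dtype)  # placeholder profile
alpha = 4.0                                  # assumed steering strength

def inject(module, inputs, output):
    # Bias the final hidden states toward the ToM profile. The hook sits after
    # the decoder layers, so the KV cache built inside them is never modified.
    return output + alpha * tom_proj(tom_profile)

handle = norm.register_forward_hook(inject)
ids = tok("How should I break difficult news to a close friend?", return_tensors="pt").to(model.device)
out = model.generate(**ids, max_new_tokens=64)
handle.remove()
print(tok.decode(out[0], skip_special_tokens=True))
```

Hooking the final norm rather than an intermediate layer is one way to realize the "before the language modeling head, without altering the KV cache" property stated in the abstract; the paper's actual injection site and scaling may differ.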
Files

AhamFinal.zip (166.1 kB)

| Name | Checksum | Size |
|---|---|---|
|  | md5:b96a0aed3132de9f4b80c01fe5f96dcc | 35.3 kB |
|  | md5:9b6e64f1f5b3f11099bbedfc74af57fb | 130.7 kB |