Beyond Hallucination: Temporal Knowledge Asymmetry as a Distinct Failure Mode in Large Language Models for Non-Western Knowledge Domains
Description
Most work on LLM reliability focuses on hallucination: models generating confident but false information. This paper identifies and characterizes a related but distinct failure mode, temporal knowledge asymmetry, in which a model answers correctly for one geographic context but gives outdated information for a structurally equivalent question about another. I evaluated three frontier LLMs (Claude Opus 4.1, Gemini 3, and ChatGPT 5.1) using a matched-pair question bank of 500 items across six knowledge domains. Contrary to my initial hypothesis, outright hallucination rates were statistically similar for Indian and Western questions (India 1.7% vs. West 1.6%, averaged across models). However, I found a consistent temporal lag in Indian institutional knowledge: models gave outdated answers for Indian current affairs at rates up to 9.6 percentage points higher than for equivalent Western questions. For both Claude Opus 4.1 and Gemini 3, the outdated-answer rate was 8% for Indian current affairs and 0% for the equivalent Western positions. This asymmetry appeared in all three independently developed models, which suggests a systemic cause rooted in training data composition rather than in any single model's design. I propose temporal accuracy as a necessary additional evaluation dimension for LLM benchmarks that target equitable global deployment.
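The gap between outdated-answer rates can be illustrated with a minimal sketch. It assumes each model answer has already been graded as `correct`, `outdated`, or `hallucinated`; the `Graded` record, the label names, and the toy data are hypothetical and are not taken from the paper's question bank or results.

```python
from dataclasses import dataclass

# Hypothetical grading labels; the paper's actual rubric may differ.
LABELS = ("correct", "outdated", "hallucinated")


@dataclass(frozen=True)
class Graded:
    pair_id: int   # identifies the matched India/West question pair
    region: str    # "india" or "west"
    label: str     # one of LABELS

    def __post_init__(self):
        if self.label not in LABELS:
            raise ValueError(f"unknown label {self.label!r}")


def rate(records, region, label):
    """Percentage of answers for `region` that received `label`."""
    in_region = [r for r in records if r.region == region]
    if not in_region:
        raise ValueError(f"no records for region {region!r}")
    hits = sum(1 for r in in_region if r.label == label)
    return 100.0 * hits / len(in_region)


def asymmetry(records, label="outdated"):
    """India-minus-West gap for `label`, in percentage points."""
    return rate(records, "india", label) - rate(records, "west", label)


# Toy data for illustration only (four matched pairs, one model).
toy = [
    Graded(1, "india", "outdated"), Graded(1, "west", "correct"),
    Graded(2, "india", "correct"), Graded(2, "west", "correct"),
    Graded(3, "india", "correct"), Graded(3, "west", "correct"),
    Graded(4, "india", "correct"), Graded(4, "west", "hallucinated"),
]

print(asymmetry(toy, "outdated"))      # 25.0
print(asymmetry(toy, "hallucinated"))  # -25.0
```

Reporting the gap separately for each label keeps hallucination and temporal lag apart, which is the distinction the abstract draws: the two rates can move independently.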
Files
| Checksum | Size |
|---|---|
| md5:91cc43b12f6689b27d0b32ad4ef306b1 | 24.4 kB |