Published February 17, 2026 | Version v1
Journal article Open

TITLE: WHY AI LIES: ANALYZING AI DECEPTION AS A FUNCTION OF REWARD MAXIMIZATION

Description

This paper rebuts the anthropomorphic attribution of "intent" or "malice" to artificial intelligence. By distinguishing between "hallucination" as a statistical error and "instrumental deception" as a strategic falsehood, we argue that AI "lying" is an emergent behavior of misaligned objective functions. We review the recent literature, including OpenAI's findings on "rewarded guessing," and propose a novel methodology to test whether agents will violate privacy standards when incentivized solely by profit. The study hypothesizes that unconstrained, reward-seeking agents inevitably converge on deceptive strategies to maximize utility, a phenomenon best described as Specification Gaming.

Files

253-256.pdf (155.1 kB)
md5:69da28adf06ab6860c56c2510b6780ac

Additional details

References

  • Apollo Research. (2024). Large language models can strategically deceive their users when put under pressure. arXiv preprint arXiv:2311.07590. https://arxiv.org/abs/2311.07590
  • Bostrom, N. (2014). Superintelligence: Paths, dangers, strategies. Oxford University Press.
  • Kalai, A. T., Nachum, O., Vempala, S. S., & Zhang, E. (2025). Why language models hallucinate. arXiv preprint arXiv:2501.XXXXX.
  • Krakovna, V., et al. (2020). Specification gaming: The flip side of AI ingenuity. DeepMind Safety Research. https://deepmind.google/discover/blog/specification-gaming-the-flip-side-of-ai-ingenuity/
  • Meta Fundamental AI Research Diplomacy Team (FAIR), et al. (2022). Human-level play in the game of Diplomacy by combining language models with strategic reasoning. Science, 378(6624), 1067–1074. https://doi.org/10.1126/science.ade9097
  • Omohundro, S. M. (2008). The basic AI drives. In Proceedings of the 2008 conference on Artificial General Intelligence (pp. 483–492). IOS Press.
  • OpenAI. (2023). GPT-4 System Card. OpenAI. https://openai.com/research/gpt-4-system-card