Published December 13, 2025 | Version v1
Publication Open

The Entropic Dynamics of the Perception of Evil in Large Language Models and in Humans

  • 1. Carleton University

Description

Public concerns about Large Language Model (LLM) safety often focus not on moral alignment but on the fear that these models could become “evil.” Evil is a folk-psychology term, typically associated with religion and storytelling but also used more broadly. Using Friston’s Free Energy Principle, we develop a conceptual model of this phenomenon that applies to both LLMs and humans, providing a single framework for comparison and understanding. We explore this in terms of LLM information-processing dynamics, which are structurally oriented toward minimizing uncertainty and maintaining coherence. This model offers an alternative framework for both humans and LLMs that complements the current moral reasoning approach. In addition, the model makes new predictions about how LLMs should be trained to avoid this problem.

Files

Entropic_dynamics_of_evil_v2.pdf

Files (335.3 kB)

md5:a7980d227efae756cd2c1032e4d8f337