Published February 22, 2026 | Version v1
Preprint · Open Access

Entropic Deviation as a Measure of Systematic Non-Randomness in Large Language Model Token Generation

  • 1. Jagiellonian University

Description

Large language models (LLMs) generate text by sampling from token probability distributions, yet the degree to which these distributions deviate from randomness remains underexplored. This paper introduces Entropic Deviation (ED), a normalized information-theoretic metric quantifying the divergence of a model's output distribution from uniform randomness at each generation step. We present a multi-architecture experimental framework that measures ED across three model families (Llama-3-8B, Phi-3-mini-4K, Mistral-7B), four content domains, and three temperature settings, yielding 7,200 generation traces. A pre-registered battery of eight falsification tests reveals that six of the eight strongly reject the stochastic baseline hypothesis (p < 0.01), with cross-architectural consensus on temperature-dependent effects, autoregressive persistence, and domain sensitivity. These results provide evidence for systematic, structured non-randomness in token generation that transcends individual architectures.
Note: These are preliminary findings. The current prompt set consists of stimuli that inherently elicit non-random responses (encyclopedic, narrative, and code-related content). A follow-up study incorporating prompts designed to elicit maximally random outputs (e.g., random string generation, dice rolls) is underway and will be reported separately. The full implications of the observed non-randomness patterns can only be assessed once both prompt categories have been analyzed.
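The record does not reproduce the ED formula itself, but a natural reading of "normalized divergence from uniform randomness" is one minus the Shannon entropy of the step's token distribution, normalized by its maximum log V. The sketch below illustrates that interpretation; the function name and the exact normalization are assumptions, not the paper's definition.

```python
import math

def entropic_deviation(probs):
    """Normalized deviation of a token distribution from uniform.

    Assumed definition (not taken from the paper):
        ED = 1 - H(p) / log(V)
    where H is Shannon entropy and V is the vocabulary size.
    ED = 0 for a uniform distribution (maximally random sampling);
    ED = 1 for a one-hot distribution (fully deterministic choice).
    """
    V = len(probs)
    # Shannon entropy in nats; skip zero-probability tokens.
    H = -sum(p * math.log(p) for p in probs if p > 0.0)
    return 1.0 - H / math.log(V)

# Uniform over 4 tokens: maximally random, so ED = 0.
print(entropic_deviation([0.25, 0.25, 0.25, 0.25]))   # 0.0
# One-hot: fully deterministic, so ED = 1.
print(entropic_deviation([1.0, 0.0, 0.0, 0.0]))       # 1.0
# A peaked but not deterministic distribution falls in between.
print(entropic_deviation([0.97, 0.01, 0.01, 0.01]))
```

Under this reading, raising the sampling temperature flattens the distribution and pushes ED toward 0, which is consistent with the temperature-dependent effects the abstract reports.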

Files

main.pdf (305.6 kB)
md5:0d86f4618c0b2397fb493b261f33d097

Additional details

Software

Repository URL
https://github.com/JaroslawHryszko/entropic-deviation
Programming language
Python
Development Status
Active