Published March 14, 2026 | Version 1.0
Preprint Open

Streaming Epistemic Geometry in Large Language Models: Token-Level Dynamics of Certainty, Hallucination, and Refusal Across Five Model Families

  • 1. Independent Researcher

Description

We introduce streaming epistemic geometry, the first token-by-token tracking of epistemic subspace projections during autoregressive generation in large language models. Using PCA-based subspace analysis on five independently trained models (Llama-3.1-8B, Mistral-7B, Gemma-2-9B, Qwen2.5-7B, Llama-3.2-3B; 4 organisations, 3B–9B parameters), we show that hallucination, refusal, and certainty each produce a distinct dynamic signature in the residual stream, detectable from the very first generated token. A logistic classifier trained on the first-token projection score achieves a leave-one-out AUC of 0.991 on Llama-3.1-8B and transfers zero-shot to TruthfulQA. Our geometric detector and an output-entropy baseline capture complementary failure modes: the subspace method flags factual-citation errors, while entropy flags physically improbable myths. All code and data are included for full reproducibility.
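The pipeline described above (a PCA direction, a first-token projection score, and a leave-one-out logistic classifier scored by AUC) can be illustrated with a minimal sketch. This is not the code shipped with the record: the activations are synthetic Gaussian stand-ins for residual-stream vectors, and every name, dimension, and hyperparameter below is a hypothetical choice made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for first-token residual-stream activations (assumption:
# real activations would be read from a model's hidden states instead).
d, n_per_class = 32, 40
direction = rng.normal(size=d)
direction /= np.linalg.norm(direction)
certain = rng.normal(size=(n_per_class, d)) + 3.0 * direction
halluc = rng.normal(size=(n_per_class, d)) - 3.0 * direction
X = np.vstack([certain, halluc])
y = np.array([0] * n_per_class + [1] * n_per_class)  # 1 = hallucination


def fit_subspace(X):
    """Return the data mean and the top principal direction (PCA via SVD)."""
    mu = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, vt[0]


def projection_score(X, mu, v):
    """Signed projection of each centered row onto direction v."""
    return (X - mu) @ v


def fit_logistic_1d(s, y, lr=0.1, steps=500):
    """Fit p(y=1|s) = sigmoid(w*s + b) by plain gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(w * s + b)))
        w -= lr * np.mean((p - y) * s)
        b -= lr * np.mean(p - y)
    return w, b


def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney rank statistic."""
    pos, neg = scores[labels == 1], scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))


def loo_auc(X, y):
    """Refit PCA and classifier on each fold, then score the held-out row."""
    held_out = np.empty(len(y))
    for i in range(len(y)):
        train = np.arange(len(y)) != i
        mu, v = fit_subspace(X[train])
        w, b = fit_logistic_1d(projection_score(X[train], mu, v), y[train])
        s_i = projection_score(X[i:i + 1], mu, v)[0]
        held_out[i] = 1.0 / (1.0 + np.exp(-(w * s_i + b)))
    return auc(held_out, y)


score = loo_auc(X, y)
print(f"Leave-one-out AUC on synthetic data: {score:.3f}")
```

Refitting the PCA direction inside each fold keeps the held-out example out of the subspace estimate. The sign of a principal direction is arbitrary, but the per-fold classifier absorbs it, so the pooled held-out probabilities remain comparable.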

Files

streaming_epistemic_geometry_2026.pdf

Files (2.9 MB)

Checksum Size
  • md5:fce133a41d680c0e98232263103cca8e 3.6 kB
  • md5:4939be2f84d0041a7f91e40a7182f072 139.8 kB
  • md5:26f8516dfa227192c4a32512cfa6c41d 456 Bytes
  • md5:b4d60cf684d5d8bcf888e070a539ea65 95.9 kB
  • md5:d03e0d2c1ca0179f6e905b4ace7b0ba8 135.1 kB
  • md5:2fae16604a0cdcc86a6ded1c78e049ea 2.2 MB
  • md5:ff42a02c72c0872744b3d094382377bb 21.0 kB
  • md5:796e00fe5915a2da00fdc4ad452fdae4 252.8 kB

Additional details

Dates

Created
2026-03-14

Software

Programming language
Python

References

  • Arditi, A., Obeso, O., Syed, A., Paleka, D., Rimsky, N., Gurnee, W., & Nanda, N. (2024). Refusal in Language Models Is Mediated by a Single Direction. Advances in Neural Information Processing Systems (NeurIPS 2024). arXiv:2406.11717.
  • Wollschläger, T., Elstner, J., Geisler, S., Cohen-Addad, V., Günnemann, S., & Gasteiger, J. (2026). The Geometry of Refusal in Large Language Models: Concept Cones and Representational Independence. International Conference on Learning Representations (ICLR 2026). arXiv:2502.17420.
  • Alieksieienko, I. (2026). Epistemic State Space Classifier in Large Language Models. Zenodo Preprint.
  • Alieksieienko, I. (2026). Epistemic Uncertainty Has a Geometric Address. Zenodo Preprint.
  • Alieksieienko, I. (2025). Refusal Geometry in LLMs: A Universal Property of Late MLP Layers. Zenodo Preprint.
  • Lin, S., Hilton, J., & Evans, O. (2022). TruthfulQA: Measuring How Models Mimic Human Falsehoods. Proc. ACL 2022. arXiv:2109.07958.
  • Elhage, N., Hume, T., Olsson, C., et al. (2022). Toy Models of Superposition. Transformer Circuits Thread, Anthropic.
  • Geva, M., Caciularu, A., Wang, K., & Goldberg, Y. (2022). Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space. Proc. EMNLP 2022. arXiv:2203.14680.
  • Hong, Y., Zhou, D., Cao, M., Yu, L., & Jin, Z. (2025). The Reasoning-Memorization Interplay in Language Models Is Mediated by a Single Direction. arXiv:2503.23084.
  • Pan, W., et al. (2025). The Hidden Dimensions of LLM Alignment: A Multi-Dimensional Safety Analysis. arXiv preprint, 2025.