There is a newer version of the record available.

Published April 2, 2026 | Version v1
Technical note Open

Fathom Monitor: Per-Token Hallucination Detection via Coherence Divergence in Sparse Autoencoder Feature Space

Authors/Creators

  • 1. Independent

Description

This technical disclosure describes Fathom Monitor, a system and method for detecting hallucination-risk tokens in large language model (LLM) outputs at the time of generation, using a mechanistic signal derived from the geometric structure of sparse autoencoder (SAE) feature activations.

The core innovation is the use of C_delta — the divergence between late-layer and early-layer feature coherence — as a per-token hallucination indicator. When C_delta exceeds a calibrated threshold at a given token position, that token is flagged as uncertain or high-risk and annotated inline.
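The flagging step described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation. It assumes per-token early-layer and late-layer coherence scores have already been computed, since this description does not define how coherence is derived from SAE activations. It also assumes C_delta is the signed difference (late minus early). The function names `c_delta` and `annotate`, the `[?]` marker, the threshold, and all numeric values are hypothetical.

```python
from typing import List, Sequence


def c_delta(early: Sequence[float], late: Sequence[float]) -> List[float]:
    """Per-token divergence, ASSUMED here to be late minus early coherence."""
    if len(early) != len(late):
        raise ValueError("need one early and one late coherence value per token")
    return [l - e for e, l in zip(early, late)]


def annotate(
    tokens: Sequence[str],
    early: Sequence[float],
    late: Sequence[float],
    threshold: float,
    marker: str = "[?]",  # hypothetical inline annotation
) -> str:
    """Append `marker` to every token whose C_delta exceeds `threshold`."""
    if len(tokens) != len(early):
        raise ValueError("need one coherence value per token")
    deltas = c_delta(early, late)
    return " ".join(
        tok + marker if d > threshold else tok
        for tok, d in zip(tokens, deltas)
    )


# Made-up scores purely for illustration; they are not measurements.
tokens = ["Paris", "is", "in", "Spain"]
early = [0.50, 0.50, 0.50, 0.50]
late = [0.50, 0.25, 0.50, 1.00]
print(annotate(tokens, early, late, threshold=0.25))
```

In practice the threshold would be calibrated on labeled data, as the description states; the value 0.25 here is only a placeholder.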

In an empirical validation on TruthfulQA (n=50, Gemma-2-2B), C_delta discriminates hallucination with p=0.040 and Cohen's d=0.407. Depth (K), by contrast, shows no discriminative signal for hallucination (p=0.931).
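For readers less familiar with the effect size quoted above, Cohen's d is the difference between two group means divided by a standard deviation. The sketch below assumes the common pooled-standard-deviation form; this description does not state which variant or which significance test was used. The function name `cohens_d` and the sample values are hypothetical and are not the study's data.

```python
import math
import statistics
from typing import Sequence


def cohens_d(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Cohen's d using the pooled sample standard deviation (assumed variant)."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(
        ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    )
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


# Made-up C_delta values for two groups, for illustration only.
hallucinated = [2.0, 4.0, 6.0]
truthful = [1.0, 3.0, 5.0]
print(cohens_d(hallucinated, truthful))  # 0.5
```

By the usual rule of thumb, d near 0.2 is read as small and d near 0.5 as medium, which places the reported 0.407 between the two.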

This document constitutes a public technical disclosure establishing prior art. Related provisional patent applications: US 64/020,489 (March 29, 2026) and US 64/021,113 (March 30, 2026). It builds on Zenodo records doi:10.5281/zenodo.19326175 and doi:10.5281/zenodo.19364702.

Notes

Patent pending: US 64/020,489 and US 64/021,113. This disclosure establishes prior art for the Fathom Monitor system as of April 2, 2026 under 35 U.S.C. § 102 (AIA). A non-provisional patent application is to be filed within 12 months.

Files

fathom_monitor_disclosure.pdf (12.6 kB)
md5:49dcdc4cb7c33830a140c469a16365a9

Additional details

Related works

Is supplement to
Preprint: 10.5281/zenodo.19364702 (DOI)
Preprint: 10.5281/zenodo.19326175 (DOI)