The Geometric Blind Spot of Perplexity: When Low Loss Hides Out-of-Distribution
Description
Perplexity is widely used as a proxy for out-of-distribution (OOD) detection in large language models, under the assumption that unfamiliar inputs yield higher loss. We show this assumption has a structural blind spot: domains where the model is fluent but the input does not belong to the task distribution. Concretely, we evaluate LLaMA-3-8B and Mistral-7B on a math-reasoning task with five OOD domains. Code snippets produce perplexity lower than in-distribution math (2.57 vs. 2.76 for LLaMA), yet their hidden-state intrinsic dimensionality collapses from 34 to 4, an 8.5× reduction. Perplexity achieves AUROC 0.352 (LLaMA) and 0.150 (Mistral) on code, worse than random. In contrast, intrinsic dimensionality and Mahalanobis distance achieve AUROC 1.000 across all OOD categories in both models. The dissociation is consistent across five transformer layers and robust to sample-size equalization. Our results demonstrate that perplexity measures model fluency, not task-distribution membership, and that geometric properties of hidden representations provide a more reliable OOD signal.
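The geometric detectors named in the abstract can be illustrated with a small, self-contained sketch. This is not the paper's pipeline: the synthetic Gaussian features stand in for LLM hidden states, the covariance ridge term is an assumed regularization choice, and the TwoNN estimator shown is the simple maximum-likelihood variant over nearest-neighbor distance ratios.

```python
import numpy as np

def fit_mahalanobis(train_feats):
    """Fit mean and regularized inverse covariance on in-distribution features."""
    mu = train_feats.mean(axis=0)
    cov = np.cov(train_feats, rowvar=False)
    cov += 1e-6 * np.eye(cov.shape[0])  # ridge term (assumption) for stability
    return mu, np.linalg.inv(cov)

def mahalanobis_score(feats, mu, cov_inv):
    """Per-sample Mahalanobis distance to the in-distribution mean."""
    d = feats - mu
    return np.sqrt(np.einsum("ij,jk,ik->i", d, cov_inv, d))

def twonn_id(feats):
    """TwoNN intrinsic-dimension estimate: MLE over ratios of the two
    nearest-neighbor distances of each point."""
    dists = np.linalg.norm(feats[:, None, :] - feats[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)
    nn = np.sort(dists, axis=1)
    ratios = nn[:, 1] / nn[:, 0]  # r2 / r1 for each point
    return len(feats) / np.log(ratios).sum()

def auroc(id_scores, ood_scores):
    """AUROC via the Mann-Whitney U statistic; OOD should score higher."""
    scores = np.concatenate([id_scores, ood_scores])
    labels = np.concatenate([np.zeros(len(id_scores)), np.ones(len(ood_scores))])
    order = scores.argsort()
    ranks = np.empty_like(order, dtype=float)
    ranks[order] = np.arange(1, len(scores) + 1)
    u = ranks[labels == 1].sum() - len(ood_scores) * (len(ood_scores) + 1) / 2
    return u / (len(id_scores) * len(ood_scores))

# Synthetic stand-ins for hidden states (assumption): ID and a shifted OOD cluster.
rng = np.random.default_rng(0)
id_feats = rng.normal(0.0, 1.0, size=(500, 16))
ood_feats = rng.normal(3.0, 1.0, size=(200, 16))

mu, cov_inv = fit_mahalanobis(id_feats)
score_auroc = auroc(mahalanobis_score(id_feats, mu, cov_inv),
                    mahalanobis_score(ood_feats, mu, cov_inv))
print("Mahalanobis AUROC:", score_auroc)       # near 1.0 for well-separated clusters
print("TwoNN ID estimate:", twonn_id(id_feats))
```

On real data, `id_feats` and `ood_feats` would be hidden-state vectors extracted from a transformer layer; a perplexity baseline would be compared against the same AUROC on the same split.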
Files
- geood_paper.pdf (503.8 kB, md5:4c944ec4510700ffff25908977aee0bb)
Additional details
Related works
- Is derived from: Publication 10.5281/zenodo.19039582 (DOI)