The Ainex Limit: Geometric Proof of LLM Semantic Collapse.
Description
As Large Language Models (LLMs) increasingly saturate the internet with synthetic content, the risk of future models being trained on generated data grows exponentially. This paper introduces the "Ainex Law," a mathematical principle defining the upper bound of semantic integrity in recursive self-learning systems. Through rigorous experimentation using a GPT-2 architecture within a closed feedback loop, we empirically demonstrate that without external human-grounded data, the model's semantic space, measured via the Convex Hull Volume (V_hull) of latent embeddings, suffers a deterministic decay. We observe a 66% reduction in semantic diversity within 20 generations, accompanied by a sharp increase in Centroid Drift (μ_AI) away from the human baseline. Our findings suggest that "Model Collapse" is not merely a quality degradation but a geometric inevitability akin to thermodynamic entropy. We propose the Ainex Score (A) as a standardized metric to quantify this decay.
Files

| Name | md5 | Size |
|---|---|---|
| Methodology.pdf | 7e28717e4487f360e304e249fe79cc49 | 180.8 kB |