Published February 5, 2026 | Version 1.0
Publication | Open Access

Fractal–Hyperbolic Degeneracy in Overparameterized Learning Manifolds v1.0

Description

Why do modern overparameterized neural networks train so efficiently despite massive degeneracy, wide flat minima, and exponentially many redundant paths?
Why does training consistently collapse to a low‑intrinsic‑dimension core, display hyperbolic curvature signatures, exhibit fractal roughness in loss boundaries, and undergo sharp phase transitions such as grokking?

These features are well‑documented but theoretically fragmented. Random‑matrix explanations account for low‑rank spectra; tangent‑kernel limits describe early training; hyperbolic embeddings explain hierarchy; and mode connectivity explains flat valleys. But none of these explains why they all co‑occur, or why overparameterization seems to help optimization rather than hinder it.

This paper introduces three minimal primitives that unify these observations into a single geometric account:

  1. Gradient erosion — training subtractively carves away low‑friction (near‑null) directions, leaving a resistant, low‑dimensional core.
  2. Fisher metric as parametric friction — curvature defines a direction‑wise friction field ϕ(θ; u) = uᵀF(θ)u that shapes flow after erosion (see the sketch after this list).
  3. Overparameterization as degeneracy amplifier — extra parameters enlarge the negative space available for erosion, accelerating convergence.
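As a concrete illustration of the second primitive, the sketch below evaluates ϕ(θ; u) = uᵀF(θ)u with F approximated by an empirical Fisher built from per‑example gradients. This is a minimal sketch under stated assumptions: the toy gradients, shapes, and helper names are illustrative, not the paper's implementation.

```python
# Minimal sketch (illustrative): direction-wise "friction" phi(theta; u) = u^T F(theta) u,
# with F approximated by the empirical Fisher built from per-example gradients.
import numpy as np

def empirical_fisher(per_example_grads):
    """Empirical Fisher F ~ (1/n) sum_i g_i g_i^T from per-example gradients (n x p)."""
    g = np.asarray(per_example_grads)      # shape (n, p)
    return g.T @ g / g.shape[0]            # shape (p, p)

def friction(F, u):
    """phi(theta; u) = u^T F(theta) u for a unit direction u."""
    u = u / np.linalg.norm(u)
    return float(u @ F @ u)

# Toy usage: random per-example gradients stand in for a real model's gradients.
rng = np.random.default_rng(0)
grads = rng.normal(size=(256, 10))         # n = 256 examples, p = 10 parameters
F = empirical_fisher(grads)
u = rng.normal(size=10)
print("friction along u:", friction(F, u))  # directions in F's null space give values near 0
```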

These primitives causally explain:

  • Fractal roughness → multiscale carving of redundant structure
  • Hyperbolic curvature → exponential rarity of high‑friction directions
  • Low intrinsic dimension → collapse onto the resistant core
  • Low‑rank Fisher spectrum → friction‑resolved degeneracy
  • Flat minima → symmetry as the fixed point of negative‑space collapse
  • Grokking‑like transitions → friction‑field phase changes at negative closure

From this geometry, a three‑phase optimization protocol emerges (Acquisition → Re‑Ask → Execution), with phase changes triggered by intrinsic‑dimension stall and Fisher‑rank concentration. These geometric signals drive learning‑rate and metric updates that track the evolving curvature, reducing the number of steps to matched accuracy by 15–35% in a quadratic toy model and in a small Vision Transformer on CIFAR‑10, under equal compute.
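To make the phase logic concrete, the sketch below shows one way the two trigger signals could be monitored — intrinsic dimension via a participation ratio of recent gradients, and Fisher‑rank concentration via an effective‑rank statistic — together with the resulting Acquisition → Re‑Ask → Execution switch. This is a minimal sketch, not the protocol as specified in the paper; all thresholds, window choices, and learning‑rate values are illustrative assumptions.

```python
# Minimal sketch (illustrative): phase switching driven by two geometric signals,
# an intrinsic-dimension proxy and a Fisher-rank-concentration proxy.
import numpy as np

def participation_ratio(grad_window):
    """Intrinsic-dimension proxy: (sum lambda_i)^2 / sum lambda_i^2 of the gradient covariance."""
    G = np.asarray(grad_window)                  # shape (window, p)
    cov = G.T @ G / G.shape[0]
    lam = np.clip(np.linalg.eigvalsh(cov), 0.0, None)
    return (lam.sum() ** 2) / (np.square(lam).sum() + 1e-12)

def effective_rank(F):
    """Fisher-rank-concentration proxy: entropy-based effective rank of F's spectrum."""
    lam = np.clip(np.linalg.eigvalsh(F), 0.0, None)
    p = lam / (lam.sum() + 1e-12)
    return float(np.exp(-(p * np.log(p + 1e-12)).sum()))

def next_phase(phase, id_history, eff_rank, p_total,
               stall_tol=0.05, concentration_frac=0.25):
    """Advance Acquisition -> Re-Ask -> Execution on intrinsic-dimension stall
    and Fisher-rank concentration (thresholds are illustrative)."""
    if phase == "acquisition" and len(id_history) >= 2:
        if abs(id_history[-1] - id_history[-2]) < stall_tol * id_history[-2]:
            return "re-ask"
    if phase == "re-ask" and eff_rank < concentration_frac * p_total:
        return "execution"
    return phase

# Example per-phase learning rates (purely illustrative values).
LR = {"acquisition": 1e-3, "re-ask": 3e-4, "execution": 1e-4}
```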

This work is foundational and reductionist. It offers no new architectures or benchmarks; instead, it provides precise, domain‑general objects for understanding and engineering overparameterized training flow. It exposes deep structural kinship between erosion, degeneracy, curvature, and symmetry—and presents an actionable systems‑level lens for future optimization design.

Files

Fractal Hyperbolic Degeneracy in Overparameterized Learning Manifolds v1.0.pdf

Additional details

Additional titles

Subtitle (English)
A Reductionist Framework for Manifold Geometry, Gradient Erosion, and Phased Optimization Dynamics