Neural Null Cones: Zero-Curvature Channels in Loss Landscapes from Symplectic Hessian Decomposition
Description
We discover that neural network loss landscapes contain zero-curvature channels: directions along which gradient updates incur no second-order penalty. These null directions are not eigenvectors of the Hessian H itself (the object studied by prior spectral analyses), but eigenvectors of H^{-1}_{reg}J -- a symplectic decomposition that couples the regularized Hessian with an external pairing matrix J. Null residuals reach 10^{-26} in GPT-2 (124M parameters), 10^{-21} in LeNet-5 (61K parameters), and 10^{-15} in a 22-parameter MLP. The structure holds across architectures (MLP, CNN, Transformer), layers, input texts, and random parameter subsets.
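As a sanity check on the stated identity, a minimal numerical sketch (not the paper's code) can reproduce near-zero null residuals on a toy problem. The construction below arranges the signs of a small diagonal "Hessian" so that H^{-1}_{reg}J is guaranteed to have real eigenpairs; H_reg, J, and all sizes are illustrative assumptions:

```python
# Minimal sketch (illustrative, not the paper's code): build a small
# indefinite "Hessian" H_reg whose sign pattern guarantees that
# H_reg^{-1} J has real eigenpairs, then check that every real
# eigenvector v satisfies v^T H_reg v ~ 0.
import numpy as np

m = 3  # half-dimension; total n = 2m so a canonical symplectic J exists
# Positive curvatures in the first block, negative in the second: the
# canonical J pairs coordinate j with j+m, and each pair (a_j, b_j) with
# a_j * b_j < 0 yields real eigenvalues lambda = +/- sqrt(-1/(a_j b_j)).
H_reg = np.diag([1.0, 2.0, 3.0, -0.5, -1.5, -2.5])  # indefinite by construction

# Canonical symplectic (antisymmetric) pairing matrix J.
J = np.block([[np.zeros((m, m)), np.eye(m)],
              [-np.eye(m), np.zeros((m, m))]])

K = np.linalg.solve(H_reg, J)  # H_reg^{-1} J, the object the theorem is about
eigvals, eigvecs = np.linalg.eig(K)

for lam, v in zip(eigvals, eigvecs.T):
    if abs(np.imag(lam)) < 1e-12:  # keep only real eigenpairs
        v = np.real(v) / np.linalg.norm(np.real(v))
        null_residual = abs(v @ H_reg @ v)  # should vanish along the channel
        print(f"lambda = {np.real(lam):+.4f}   |v^T H_reg v| = {null_residual:.2e}")
```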
The theoretical guarantee is the Spectral Null Cone Theorem: every real eigenvector of H^{-1}_{reg}J is H_{reg}-null -- an algebraic identity that holds whenever the Hessian is indefinite and H^{-1}_{reg}J has real eigenpairs. These conditions are not hypothetical: we observe them at 100% of training steps (22-parameter MLP) and in 6/7 layer types (GPT-2), with the required indefiniteness arising from a sharp phase transition analogous to that in the free-energy Hessian.
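A one-line derivation shows why the identity is purely algebraic, assuming J is antisymmetric (the defining property of a symplectic pairing, which the description implies but does not state):

```latex
% Sketch, assuming J = -J^T. Let (H_reg^{-1} J) v = \lambda v with
% \lambda real and nonzero and v real. Then:
\begin{align*}
  H_{\mathrm{reg}}^{-1} J v = \lambda v
  \;\Longrightarrow\; J v = \lambda\, H_{\mathrm{reg}} v
  \;\Longrightarrow\; v^{\top} J v = \lambda\, v^{\top} H_{\mathrm{reg}} v
  \;\Longrightarrow\; v^{\top} H_{\mathrm{reg}} v = 0,
\end{align*}
% since v^T J v = 0 for any antisymmetric J.
```

Under the same antisymmetry assumption, the indefiniteness condition also falls out: for positive-definite H_reg, H^{-1}_{reg}J is similar to H^{-1/2}_{reg} J H^{-1/2}_{reg}, which is antisymmetric and therefore has a purely imaginary spectrum, so real eigenpairs can only appear once the Hessian becomes indefinite.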
This geometric structure is not merely present but exploitable. We demonstrate that it is actionable (5-18% loss reduction), tunable (4.4x channel widening), persistent (+2.9% under strict verification), and universal (confirmed across MLPs, CNNs, and Transformers). Of 147 numerical tests, 139 pass; the 8 failures are documented boundary cases that do not contradict the core claim.
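The description does not spell out how the channels are exploited, but the mechanism behind a loss reduction can be illustrated: along a null direction the local quadratic model degenerates to its linear term, so a step against the gradient's component along the channel descends with no curvature penalty. A hypothetical sketch, with all names and numbers illustrative:

```python
# Hypothetical sketch (not the paper's procedure): stepping along a null
# channel. For an H_reg-null direction v, the quadratic model
#   L(theta + d) ~ L(theta) + g.d + 0.5 * d^T H_reg d
# loses its curvature term when d is parallel to v.
import numpy as np

H_reg = np.diag([1.0, 2.0, 3.0, -0.5, -1.5, -2.5])  # toy indefinite Hessian
v = np.array([1.0, 0.0, 0.0, np.sqrt(2.0), 0.0, 0.0])
v /= np.linalg.norm(v)                 # H_reg-null: v @ H_reg @ v == 0
g = np.array([0.3, -0.2, 0.1, 0.05, -0.4, 0.2])  # pretend gradient

alpha = 0.1
step = -alpha * np.sign(g @ v) * v     # descend along the channel
linear = g @ step                      # first-order loss change (negative)
quadratic = 0.5 * step @ H_reg @ step  # ~0 along the null direction
print(f"linear term: {linear:+.4f}   curvature term: {quadratic:+.2e}")
```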
Files

| Name | Size |
|---|---|
| Null Cones.pdf (md5:1e13a4293ef6a6488eb64affb447286d) | 224.5 kB |