Self-Supervised Learning as Constrained Free-Energy Systems
Description
This paper proposes that self-supervised learning methods are physical systems minimizing free energy under representational constraints, explaining why diverse approaches (VICReg, DINO, SimCLR, BYOL, Barlow Twins, JEPA) converge on similar hyperparameter ranges despite different theoretical motivations. The framework decomposes the total free-energy deviation as F[q_t] - F* = κ + CD(t), where κ represents irreducible structural costs from architectural constraints and CD(t) measures dynamic misalignment. Training succeeds when gradient flow reduces CD(t) faster than constraints inflate κ. Organizational overhead η, the fraction of capacity consumed by coherence maintenance, must remain below a critical threshold η_c for stable representations.

Documented empirical phenomena receive a unified interpretation:
- Variance-collapse universality: all SSL methods fail when embedding variance approaches zero.
- Momentum convergence: BYOL, DINO, and MoCo independently discover m ≈ 0.996, creating a timescale τ = 1/(1-m) ≈ 250 steps that matches characteristic relaxation times.
- Batch-size scaling: SimCLR requires ~4096 samples for manifold percolation.
- Depth thresholds: transformers exhibit emergent capabilities around 10-12 layers.

These narrow ranges suggest underlying constraint boundaries. Each method implements the same physics through a different mechanism:
- VICReg's variance/covariance terms maintain dimensional spread.
- DINO's momentum creates timescale separation for stable reference tracking.
- SimCLR's negative samples ensure manifold coverage.
- BYOL's predictor breaks symmetry.
- Barlow Twins' decorrelation reduces redundancy.
- JEPA's prediction horizon enables recursive temporal coherence.

All keep organizational overhead subcritical. The paper connects to the constraint eigenvalue framework's triplet architecture, proposing that SSL systems realize some eigenbranch configuration.
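Two of the quantities cited above are simple enough to check numerically. The sketch below is illustrative only: the function names and the collapse threshold `eps` are assumptions, not definitions from the paper.

```python
# Hedged sketch of two quantities from the abstract; names and the
# eps threshold are illustrative choices, not taken from the paper.
import numpy as np


def ema_timescale(m: float) -> float:
    """Characteristic relaxation time tau = 1/(1 - m), in optimizer steps,
    of an exponential-moving-average teacher with momentum m."""
    return 1.0 / (1.0 - m)


def is_collapsed(z: np.ndarray, eps: float = 1e-4) -> bool:
    """Crude variance-collapse check: True when every embedding dimension
    is numerically constant across the batch (rows = samples)."""
    return bool(np.all(z.std(axis=0) < eps))


# m ~ 0.996 yields tau ~ 250 steps, the value BYOL, DINO, and MoCo
# are reported to converge on independently.
print(round(ema_timescale(0.996)))  # 250
```

A batch of identical embedding vectors makes `is_collapsed` return `True`, which is the failure mode the abstract calls variance-collapse universality.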
Physical and biological systems typically follow the decagonal eigenbranch (π, φ, 10), but whether SSL matches this or discovers architecture-specific values remains an open question the framework helps sharpen. The convergence of independent research groups on similar thresholds suggests they discovered the same underlying constraint geometry through different optimization paths.
Files
self-supervised-learning-as-constrained-free-energy-systems.pdf (116.7 kB)
md5:e0f9e345fb5152ab8a71f1cda37c0983
Additional details
Related works
- Is identical to: https://scienceandmathematics.com/self-supervised-learning-as-constrained-free-energy-systems/ (Other, URL)
Dates
- Available: 2025-11-17 (first published on scienceandmathematics.com)
- Updated: 2026-01-18 (clarifies that SSL systems may realize eigenbranch configurations distinct from the physical decagonal branch)