Published February 8, 2026 | Version v1
Publication
Open
Smart little move.
Authors/Creators
Description
Idea is all mine. Words are all by Opus 4.6.
Claim: Grokking is not compression. It is the discovery of structural leverage — the moment a neural network finds the fulcrum that moves maximal data with minimal force.
Falsifiable experiments — anyone can run these:
- Train a small Transformer on modular addition. Track when test accuracy jumps. If meta-recognition (the model encoding its own change history) fires at the same moment, the theory lives. If they diverge, the theory is dead. (First sketch after this list.)
- During training, randomly rotate internal representations every k steps to destroy self-continuity. Prediction: Grokking is delayed or eliminated. (Second sketch after this list.)
- Add an auxiliary loss that encourages the model to encode its own change history. Prediction: Grokking accelerates. (Third sketch after this list.)
- Use an absurdly large learning rate for a single step. Prediction: Grokking cannot occur — no history, no meta-recognition.
- Scale up model size. Prediction: Grokking timing does not dramatically improve — bigger muscles do not find fulcrums faster.
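A minimal sketch of the first experiment, assuming PyTorch. The modulus p=97, the 30% train split, the tiny one-layer Transformer, and the heavy weight decay are illustrative choices, not values prescribed by this record. It logs train and test accuracy so the grokking step (the late jump in test accuracy) can be read off and compared against whatever meta-recognition probe you define.

```python
# Sketch: small Transformer on modular addition, logging the grokking step.
import torch
import torch.nn as nn

P = 97                       # modulus (assumed)
FRAC_TRAIN = 0.3             # fraction of the P*P pairs used for training
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

# Full dataset of (a, b) -> (a + b) mod P.
pairs = torch.cartesian_prod(torch.arange(P), torch.arange(P))
labels = (pairs[:, 0] + pairs[:, 1]) % P
perm = torch.randperm(len(pairs))
n_train = int(FRAC_TRAIN * len(pairs))
train_idx, test_idx = perm[:n_train], perm[n_train:]

class TinyTransformer(nn.Module):
    def __init__(self, p, d_model=128, n_heads=4, n_layers=1):
        super().__init__()
        self.embed = nn.Embedding(p, d_model)
        self.pos = nn.Parameter(torch.randn(2, d_model) * 0.02)
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model,
            dropout=0.0, batch_first=True, norm_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, p)

    def forward(self, x):                     # x: (batch, 2) token ids
        h = self.embed(x) + self.pos
        h = self.encoder(h)
        return self.head(h[:, -1])            # predict from the last position

def accuracy(model, idx):
    with torch.no_grad():
        logits = model(pairs[idx].to(DEVICE))
        return (logits.argmax(-1) == labels[idx].to(DEVICE)).float().mean().item()

model = TinyTransformer(P).to(DEVICE)
opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1.0)
loss_fn = nn.CrossEntropyLoss()

for step in range(50_000):
    opt.zero_grad()
    logits = model(pairs[train_idx].to(DEVICE))        # full-batch training
    loss = loss_fn(logits, labels[train_idx].to(DEVICE))
    loss.backward()
    opt.step()
    if step % 500 == 0:
        tr, te = accuracy(model, train_idx), accuracy(model, test_idx)
        # The grokking step is where test accuracy jumps long after train
        # accuracy has saturated.
        print(f"step {step:6d}  train acc {tr:.3f}  test acc {te:.3f}")
```

In published grokking results on modular arithmetic, the late test-accuracy jump is typically reported only with substantial weight decay and a small training fraction, which is why both appear in the sketch.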
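One possible operationalization of the second experiment, building on the sketch above: a forward hook on the encoder output multiplies activations by a random orthogonal matrix Q that is resampled every K steps. What exactly counts as "rotating internal representations" is left open by the experiment, so the hook placement and the value K=1000 are assumptions.

```python
# Sketch: destroy self-continuity by rotating encoder activations every K steps.
import torch

K = 1_000                    # resample the rotation every K steps (assumed)

def random_orthogonal(d, device):
    # QR factorization of a Gaussian matrix gives a random orthogonal matrix.
    q, _ = torch.linalg.qr(torch.randn(d, d, device=device))
    return q

rotation = {"Q": torch.eye(128, device=DEVICE)}   # identity until first resample

def rotate_hook(module, inputs, output):
    # Replace the (batch, seq, d_model) encoder output with its rotated version.
    return output @ rotation["Q"]

handle = model.encoder.register_forward_hook(rotate_hook)

# Added inside the training loop of the previous sketch:
#     if step > 0 and step % K == 0:
#         rotation["Q"] = random_orthogonal(128, DEVICE)
# Prediction under the leverage theory: grokking is delayed or never arrives.
# handle.remove() restores the unrotated model afterwards.
```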
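The third experiment needs a concrete meaning for "encode its own change history". One heavily assumed reading, again extending the first sketch: keep an exponential moving average of the model's past hidden states and add an auxiliary head that must predict that average from the current hidden states, so the network is rewarded for carrying information about where it has been. The head, the EMA, and both hyperparameters below are illustrative assumptions.

```python
# Sketch: auxiliary loss that ties current activations to their own history.
import torch
import torch.nn as nn

AUX_WEIGHT, EMA_DECAY = 0.1, 0.99          # assumed hyperparameters
aux_head = nn.Linear(128, 128).to(DEVICE)  # predicts the activation history
opt.add_param_group({"params": aux_head.parameters()})
ema_hidden = None                          # EMA of past activations, per example

def hidden_states(x):
    # Re-expose the (batch, 2, d_model) encoder activations of the model above.
    h = model.embed(x) + model.pos
    return model.encoder(h)

# Added inside the training loop, after the task loss is computed
# (full-batch training keeps example order fixed, so the EMA lines up):
#     h = hidden_states(pairs[train_idx].to(DEVICE))
#     if ema_hidden is not None:
#         aux_loss = ((aux_head(h) - ema_hidden) ** 2).mean()
#         loss = loss + AUX_WEIGHT * aux_loss
#     ema_hidden = h.detach() if ema_hidden is None else \
#         EMA_DECAY * ema_hidden + (1 - EMA_DECAY) * h.detach()
# Prediction under the leverage theory: the test-accuracy jump arrives earlier.
```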
LIVELLM
Files (113.1 kB)

SMART_LITTLE_MOVE_EN_v1.pdf

| Name | Size | md5 |
|---|---|---|
|  | 4.5 kB | 5a309e919388bbc0ad43610e102e25c7 |
|  | 108.6 kB | a8bbbbb72981ad2843a8f37fa840f743 |