Published January 24, 2026 | Version v1
Preprint | Open Access

Recursive Optimization with Controlled Language Models: Inference-Time Control, Tokenization Co-Evolution, and the Fourth Loop

Description

This work presents a complete framework for recursive self-improvement in language models, validated through extensive experimentation on an 8-billion-parameter model.

The core contribution is the discovery that the RSI (Recursive Self-Improvement) ceiling—previously observed at 3-5 iterations—is not a fundamental limit but a tokenization bottleneck. By identifying high-stress token boundaries using a novel entropy-attention discontinuity metric and expanding the vocabulary with merged tokens, we create representational headroom that enables continued self-improvement.
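The exact entropy-attention discontinuity metric is specified in the full text; as a rough illustration of the idea, a boundary-stress score can be built from the joint discontinuity of two per-token signals across each adjacent token boundary. Everything below (function names, the normalize-and-multiply combination, the synthetic inputs) is a hypothetical sketch, not the paper's implementation.

```python
import numpy as np

def boundary_stress(entropy, attention, eps=1e-8):
    """Score each adjacent token boundary by the joint discontinuity in
    per-token predictive entropy and aggregate received attention.

    entropy:   (T,) per-token next-token entropy
    attention: (T,) per-token attention mass received
    Returns a (T-1,) array; entry i scores the boundary between tokens i, i+1.
    """
    d_entropy = np.abs(np.diff(entropy)).astype(float)
    d_attn = np.abs(np.diff(attention)).astype(float)
    # Normalize each signal so neither dominates, then combine multiplicatively
    # so only boundaries where BOTH signals jump score highly.
    d_entropy /= d_entropy.max() + eps
    d_attn /= d_attn.max() + eps
    return d_entropy * d_attn

def top_merge_candidates(stress, k=3):
    """Indices of the k highest-stress boundaries, i.e. the spans most in
    need of a merged vocabulary token (descending order)."""
    return np.argsort(stress)[::-1][:k]
```

On a synthetic sequence with one sharp entropy spike coinciding with an attention shift, the highest-stress boundary falls at that spike, and that span becomes a merge candidate for vocabulary expansion.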

Key validated results:
- CF-HoT behavioral probe: 80× separation ratio, 97.2% accuracy
- Dense training pipeline: 68% density improvement, 57% token reduction
- Loop 4 tokenization co-evolution: 9.87% token reduction across 30 merge candidates
- RSI ceiling breakthrough: 10/10 successful iterations (previous ceiling: 3-5)
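The CF-HoT probe's internals are described in the full text; the generic shape of such a behavioral probe is a linear direction fit on hidden states, used both to score a behavior and to suppress it at decode time. The following is a minimal stand-in sketch (difference-of-means direction, projection-removal intervention), with all names hypothetical.

```python
import numpy as np

def fit_probe(h_pos, h_neg):
    """Difference-of-means direction separating two behaviors in
    hidden-state space (a generic stand-in for the CF-HoT probe)."""
    d = h_pos.mean(axis=0) - h_neg.mean(axis=0)
    return d / np.linalg.norm(d)

def probe_score(h, direction):
    """Projection of a hidden state onto the probe direction."""
    return h @ direction

def intervene(h, direction, alpha=1.0):
    """Decode-time intervention: remove (alpha times) the probe
    component from a hidden state to suppress the flagged behavior."""
    return h - alpha * (h @ direction) * direction
```

With `alpha=1.0` the intervention projects the probe component out entirely, driving the probe score of the edited state to zero while leaving the orthogonal components of the hidden state untouched.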

The framework comprises four interconnected optimization loops:
1. Inference-time behavioral control through hidden state probing and decode-time intervention
2. Density optimization through SFT → DPO → PPO training
3. Bounded recursive self-improvement with automatic rollback
4. Tokenization co-evolution through boundary stress detection and vocabulary expansion
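Loop 3's control flow (bounded iteration with automatic rollback) can be sketched independently of the model-editing details. In this toy version, `improve` and `evaluate` are stand-ins for the self-modification and benchmark steps; the loop keeps a candidate only if it scores strictly better, otherwise rolls back, and stops when the iteration budget or a stall limit is hit. This is an assumed structure, not the paper's implementation.

```python
def bounded_rsi(state, improve, evaluate, max_iters=10, patience=3):
    """Bounded recursive self-improvement with automatic rollback.

    improve:  state -> candidate state (stand-in for self-modification)
    evaluate: state -> score, higher is better (stand-in for benchmarks)
    Returns (best_state, best_score, per-iteration best-score history).
    """
    best, best_score = state, evaluate(state)
    stalls = 0
    history = []
    for _ in range(max_iters):
        candidate = improve(best)
        score = evaluate(candidate)
        if score > best_score:
            best, best_score = candidate, score
            stalls = 0
        else:
            stalls += 1  # automatic rollback: candidate is discarded
        history.append(best_score)
        if stalls >= patience:
            break  # bounded: stop once improvement stalls
    return best, best_score, history
```

Run against a toy objective (distance to a target value), the loop climbs monotonically while improvements hold and halts after `patience` consecutive rejected candidates once the optimum is reached.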

The package includes complete implementation code, experimental results, and a reproduction guide.

Keywords: language models, recursive self-improvement, inference-time control, tokenization, self-optimization, AI safety, behavioral control

Files

recursive_optimization_book.pdf (356.6 kB)
md5:1aa25a03bd8b3a2a4366551336939d95
