Recursive Optimization with Controlled Language Models: Inference-Time Control, Tokenization Co-Evolution, and the Fourth Loop
Description
This work presents a complete framework for recursive self-improvement in language models, validated through extensive experimentation on an 8-billion-parameter model.
The core contribution is the finding that the RSI (Recursive Self-Improvement) ceiling, previously observed at 3-5 iterations, is not a fundamental limit but a tokenization bottleneck. By identifying high-stress token boundaries with a novel entropy-attention discontinuity metric and expanding the vocabulary with merged tokens, we create representational headroom that enables continued self-improvement.
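The boundary-stress idea can be illustrated with a toy sketch. Everything below is an assumption for illustration only: the metric described above combines entropy and attention discontinuities, whereas this sketch uses per-token entropy jumps alone, and the names (`boundary_stress`, `propose_merges`) and the threshold value are invented, not taken from the released code.

```python
# Toy sketch: flag token boundaries where predictive entropy jumps sharply,
# then propose merging the most frequent adjacent token pairs at those
# boundaries into new vocabulary entries. Illustrative only.
from collections import Counter


def boundary_stress(entropies):
    """Absolute entropy discontinuity at each token boundary."""
    return [abs(entropies[i + 1] - entropies[i]) for i in range(len(entropies) - 1)]


def propose_merges(tokens, entropies, threshold=1.0, top_k=3):
    """Count adjacent token pairs at high-stress boundaries and return
    the most frequent pairs as merge candidates."""
    stress = boundary_stress(entropies)
    pairs = Counter(
        (tokens[i], tokens[i + 1])
        for i, s in enumerate(stress)
        if s > threshold
    )
    return [pair for pair, _ in pairs.most_common(top_k)]


# Usage with fabricated per-token entropies (high entropy at fragment starts):
tokens = ["re", "cursive", "self", "-", "improve", "ment", "re", "cursive"]
ents = [2.1, 0.3, 2.4, 2.2, 0.4, 0.2, 2.3, 0.3]
candidates = propose_merges(tokens, ents)
```

In this toy example the recurring high-stress pair `("re", "cursive")` surfaces as the top merge candidate; adding such merged tokens to the vocabulary is what the description calls creating representational headroom.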
Key validated results:
- CF-HoT behavioral probe: 80× separation ratio, 97.2% accuracy
- Dense training pipeline: 68% density improvement, 57% token reduction
- Loop 4 tokenization co-evolution: 9.87% token reduction across 30 merge candidates
- RSI ceiling breakthrough: 10/10 successful iterations (previous ceiling: 3-5)
The framework comprises four interconnected optimization loops:
1. Inference-time behavioral control through hidden state probing and decode-time intervention
2. Density optimization through SFT → DPO → PPO training
3. Bounded recursive self-improvement with automatic rollback
4. Tokenization co-evolution through boundary stress detection and vocabulary expansion
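As a rough illustration of Loop 3 (bounded recursive self-improvement with automatic rollback), the sketch below accepts a candidate update only if it scores higher on an evaluation function and rolls back otherwise, stopping at a fixed iteration bound. The function names and the numeric toy objective are assumptions; the actual loop would operate on model checkpoints rather than numbers.

```python
# Minimal sketch of a bounded improve/evaluate/rollback loop.
# All names here are illustrative, not from the released implementation.
import copy


def bounded_rsi(state, improve, evaluate, max_iters=10, min_gain=0.0):
    """Apply candidate improvements; keep only those that evaluate
    strictly better, rolling back otherwise. max_iters is the bound."""
    best_score = evaluate(state)
    history = []
    for _ in range(max_iters):
        candidate = improve(copy.deepcopy(state))  # never mutate the accepted state
        score = evaluate(candidate)
        if score > best_score + min_gain:
            state, best_score = candidate, score   # accept the improvement
            history.append(("accept", score))
        else:
            history.append(("rollback", score))    # automatic rollback
    return state, history


# Toy usage: a scalar "model" whose quality peaks at w = 0.5.
def improve(s):
    s["w"] += 0.1                     # candidate self-modification
    return s


def evaluate(s):
    return -(s["w"] - 0.5) ** 2       # higher is better, peak at 0.5


final, hist = bounded_rsi({"w": 0.0}, improve, evaluate, max_iters=10)
```

In the toy run, improvements are accepted until the objective peaks, after which every further candidate is rolled back; the bound and the rollback rule together are what keep the loop from degrading the model.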
Includes complete implementation code, experimental results, and a reproduction guide.
Keywords: language models, recursive self-improvement, inference-time control, tokenization, self-optimization, AI safety, behavioral control
Files
- recursive_optimization_book.pdf (356.6 kB), md5:1aa25a03bd8b3a2a4366551336939d95
Additional details
Software
- Repository URL: https://huggingface.co/LoganResearch/ARC-Base-8B-Condensed