Published January 24, 2026 | Version v1
Preprint | Open Access

Recursive Optimization with Controlled Language Models: Inference-Time Control, Tokenization Co-Evolution, and the Fourth Loop

Description

This work presents a complete framework for recursive self-improvement in language models, validated through extensive experimentation on an 8-billion-parameter model.

The core contribution is the discovery that the RSI (Recursive Self-Improvement) ceiling—previously observed at 3-5 iterations—is not a fundamental limit but a tokenization bottleneck. By identifying high-stress token boundaries using a novel entropy-attention discontinuity metric and expanding the vocabulary with merged tokens, we create representational headroom that enables continued self-improvement.
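The exact entropy-attention discontinuity metric is specified in the full text; as a rough illustration of the idea, a boundary-stress score can be built from the joint discontinuity of two per-token signals across each adjacent token boundary. Everything below (function names, the normalize-and-multiply combination, the synthetic inputs) is a hypothetical sketch, not the paper's implementation.

```python
import numpy as np

def boundary_stress(entropy, attention, eps=1e-8):
    """Score each adjacent token boundary by the joint discontinuity in
    per-token predictive entropy and aggregate received attention.

    entropy:   (T,) per-token next-token entropy
    attention: (T,) per-token attention mass received
    Returns a (T-1,) array; entry i scores the boundary between tokens i, i+1.
    """
    d_entropy = np.abs(np.diff(entropy)).astype(float)
    d_attn = np.abs(np.diff(attention)).astype(float)
    # Normalize each signal so neither dominates, then combine multiplicatively
    # so only boundaries where BOTH signals jump score highly.
    d_entropy /= d_entropy.max() + eps
    d_attn /= d_attn.max() + eps
    return d_entropy * d_attn

def top_merge_candidates(stress, k=3):
    """Indices of the k highest-stress boundaries, i.e. the spans most in
    need of a merged vocabulary token (descending order)."""
    return np.argsort(stress)[::-1][:k]
```

On a synthetic sequence with one sharp entropy spike coinciding with an attention shift, the highest-stress boundary falls at that spike, and that span becomes a merge candidate for vocabulary expansion.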

Key validated results:
- CF-HoT behavioral probe: 80× separation ratio, 97.2% accuracy
- Dense training pipeline: 68% density improvement, 57% token reduction
- Loop 4 tokenization co-evolution: 9.87% token reduction across 30 merge candidates
- RSI ceiling breakthrough: 10/10 successful iterations (previous ceiling: 3-5)
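The CF-HoT probe's internals are described in the full text; the generic shape of such a behavioral probe is a linear direction fit on hidden states, used both to score a behavior and to suppress it at decode time. The following is a minimal stand-in sketch (difference-of-means direction, projection-removal intervention), with all names hypothetical.

```python
import numpy as np

def fit_probe(h_pos, h_neg):
    """Difference-of-means direction separating two behaviors in
    hidden-state space (a generic stand-in for the CF-HoT probe)."""
    d = h_pos.mean(axis=0) - h_neg.mean(axis=0)
    return d / np.linalg.norm(d)

def probe_score(h, direction):
    """Projection of a hidden state onto the probe direction."""
    return h @ direction

def intervene(h, direction, alpha=1.0):
    """Decode-time intervention: remove (alpha times) the probe
    component from a hidden state to suppress the flagged behavior."""
    return h - alpha * (h @ direction) * direction
```

With `alpha=1.0` the intervention projects the probe component out entirely, driving the probe score of the edited state to zero while leaving the orthogonal components of the hidden state untouched.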

The framework comprises four interconnected optimization loops:
1. Inference-time behavioral control through hidden state probing and decode-time intervention
2. Density optimization through SFT → DPO → PPO training
3. Bounded recursive self-improvement with automatic rollback
4. Tokenization co-evolution through boundary stress detection and vocabulary expansion
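Loop 3's control flow (bounded iteration with automatic rollback) can be sketched independently of the model-editing details. In this toy version, `improve` and `evaluate` are stand-ins for the self-modification and benchmark steps; the loop keeps a candidate only if it scores strictly better, otherwise rolls back, and stops when the iteration budget or a stall limit is hit. This is an assumed structure, not the paper's implementation.

```python
def bounded_rsi(state, improve, evaluate, max_iters=10, patience=3):
    """Bounded recursive self-improvement with automatic rollback.

    improve:  state -> candidate state (stand-in for self-modification)
    evaluate: state -> score, higher is better (stand-in for benchmarks)
    Returns (best_state, best_score, per-iteration best-score history).
    """
    best, best_score = state, evaluate(state)
    stalls = 0
    history = []
    for _ in range(max_iters):
        candidate = improve(best)
        score = evaluate(candidate)
        if score > best_score:
            best, best_score = candidate, score
            stalls = 0
        else:
            stalls += 1  # automatic rollback: candidate is discarded
        history.append(best_score)
        if stalls >= patience:
            break  # bounded: stop once improvement stalls
    return best, best_score, history
```

Run against a toy objective (distance to a target value), the loop climbs monotonically while improvements hold and halts after `patience` consecutive rejected candidates once the optimum is reached.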

The package includes complete implementation code, experimental results, and a reproduction guide.

Keywords: language models, recursive self-improvement, inference-time control, tokenization, self-optimization, AI safety, behavioral control

Files

recursive_optimization_book.pdf (356.6 kB)
md5:1aa25a03bd8b3a2a4366551336939d95
