Algorithmic Capability Extraction at Extreme Compression: A 1.5B Parameter Model Matches Frontier Performance
Description
We demonstrate that complex algorithmic reasoning capabilities can be extracted from frontier language models (Claude 3.5 Haiku, ~100B parameters) and compressed into models as small as 1.5 billion parameters with no performance loss—and in some cases, performance gains.
Key Results:
• Qwen2.5-3B: 86.0% accuracy (baseline: 12%)
• Qwen2.5-1.5B: 82.7% accuracy (baseline: <10%)
• Claude 3.5 Haiku: 84.0% with LRL, 81.3% baseline
• Both student models exceed teacher baseline despite being 33-67× smaller
Methodology:
1. Teacher (Claude) learns an O(n log n) sweep-line algorithm through structured self-reflection (Linguistic Reinforcement Learning); see the sketch after this list
2. Learned strategy extracted as natural language pseudocode
3. Teacher generates 100 high-quality training examples
4. Student models learn via LoRA fine-tuning (rank 8, 3 epochs)
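The description does not spell out the exact scheduling task, so the following is only a minimal illustration of the kind of O(n log n) sweep-line strategy named in step 1: sort interval endpoints into +1/-1 events and scan them once while tracking how many intervals are open. The function name and interval format are assumptions for illustration, not code from the repository.

```python
from typing import List, Tuple

def max_concurrent(intervals: List[Tuple[int, int]]) -> int:
    """Maximum number of intervals active at once, via an O(n log n) sweep line.

    Each interval contributes a +1 event at its start and a -1 event at its end.
    Sorting the events and scanning them once keeps a running count of open intervals.
    """
    events = []
    for start, end in intervals:
        events.append((start, +1))   # interval opens
        events.append((end, -1))     # interval closes
    # Sort by time; process closings before openings at the same timestamp so that
    # back-to-back intervals (end == next start) are not counted as overlapping.
    events.sort(key=lambda e: (e[0], e[1]))

    best = current = 0
    for _, delta in events:
        current += delta
        best = max(best, current)
    return best

if __name__ == "__main__":
    print(max_concurrent([(0, 30), (5, 10), (15, 20)]))  # -> 2
```

The sorting step dominates the cost, giving the O(n log n) bound; the single pass over the sorted events is linear.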
The 1.5B model achieves 67× compression with better-than-teacher-baseline performance while eliminating per-query API costs after a one-time $10 training investment. This challenges the prevailing assumption that complex reasoning requires massive model scale and demonstrates that algorithmic intelligence is highly compressible through structured knowledge transfer.
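To make step 4 of the methodology concrete, here is a minimal sketch of a rank-8, 3-epoch LoRA fine-tune using the Hugging Face transformers, peft, and datasets libraries. The model variant, target modules, learning rate, batch size, and the `teacher_examples.jsonl` file with a `"text"` field are illustrative assumptions; only the rank and epoch count come from the description above, and the actual training code lives in the linked repository.

```python
# Sketch only: model name, data path, and most hyperparameters are assumptions;
# only the LoRA rank (8) and epoch count (3) come from the record description.
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base = "Qwen/Qwen2.5-1.5B-Instruct"  # assumed student variant
tokenizer = AutoTokenizer.from_pretrained(base)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16)

# Rank-8 LoRA adapters on the attention projections (target modules assumed).
lora = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                  target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)

# Hypothetical JSONL file of teacher-generated examples, one "text" field per record.
dataset = load_dataset("json", data_files="teacher_examples.jsonl", split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=1024)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="lora-student", num_train_epochs=3,
                           per_device_train_batch_size=2, learning_rate=2e-4),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("lora-student")  # saves only the small adapter weights
```

Because only the adapter weights are trained, a run of this size fits comfortably on a single consumer GPU, which is what keeps the one-time training cost low.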
Complete validation packages, reproducible experiments, and framework code available at: https://github.com/DRawson5570/linguistic-rl-scheduling-experiments
Files
| Name | Size | MD5 |
|---|---|---|
| Compressing Frontier Intelligence: A Framework for Algorithmic Knowledge Transfer.pdf | 102.5 kB | f080d34817afd89b23f68e52336a7330 |
Additional details
Software
- Repository URL: https://github.com/DRawson5570/linguistic-rl-scheduling
- Programming language: Python