Unified Behavioral Modulation in Large Language Models: Cross-Architecture Validation of Geometric Behavioral Subspaces
Authors/Creators
Description
We present empirical evidence that behavioral patterns in large language models—repetition, hedging, verbosity, and sycophancy—occupy identifiable low-dimensional geometric subspaces within transformer hidden states.
Key Results:
- Separation ratios up to 238.8× (repetition) and 1376.7× (hedging) on Qwen2.5-3B
- Cross-architecture generalization validated on both LLaMA-3.1-8B and Qwen2.5-3B
- 16-dimensional fiber projections are sufficient for detection (compression ratio >128:1; see the sketch after this list)
- The smaller 3B model outperforms the larger 8B model, challenging common assumptions about scale
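The record does not spell out how the separation ratios above are computed. Purely as an illustration, the sketch below assumes a between-class to within-class distance ratio measured on 16-dimensional projections of hidden states; the projection, the synthetic data, and the function name `separation_ratio` are assumptions for this sketch, not part of the released code.

```python
# Illustrative only: one plausible reading of a "separation ratio" on
# 16-dimensional fiber projections of hidden states (between-class centroid
# distance over mean within-class spread). This definition is an assumption,
# not the paper's formula, and the data below is synthetic.
import numpy as np

def separation_ratio(pos: np.ndarray, neg: np.ndarray) -> float:
    """Centroid distance between classes divided by mean within-class spread."""
    mu_pos, mu_neg = pos.mean(axis=0), neg.mean(axis=0)
    between = np.linalg.norm(mu_pos - mu_neg)
    within = 0.5 * (np.linalg.norm(pos - mu_pos, axis=1).mean()
                    + np.linalg.norm(neg - mu_neg, axis=1).mean())
    return between / within

rng = np.random.default_rng(0)
hidden_dim, fiber_dim = 2048, 16                      # 2048 / 16 = 128:1 compression
projection = rng.standard_normal((hidden_dim, fiber_dim)) / np.sqrt(hidden_dim)

pos_hidden = rng.standard_normal((256, hidden_dim)) + 0.5   # "behavior present" states
neg_hidden = rng.standard_normal((256, hidden_dim))         # "behavior absent" states

print(separation_ratio(pos_hidden @ projection, neg_hidden @ projection))
```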
Reproducibility: This paper includes complete implementation code for the RiskPredictor architecture, behavioral labeling functions, training loops, and evaluation protocols. All hyperparameters, training logs, and hardware requirements are documented. Code and weights available at huggingface.co/loganresearch/ubm
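The actual RiskPredictor implementation lives in the linked repository; purely as orientation, the sketch below shows one generic shape a hidden-state behavioral probe could take: a learned projection onto a 16-dimensional fiber followed by a small classifier over the four behaviors. Class names, pooling choice, and dimensions here are illustrative assumptions, not the paper's architecture.

```python
# Hypothetical probe sketch; NOT the released RiskPredictor code. It projects
# frozen-LM hidden states onto a 16-dim fiber and scores four behaviors
# (repetition, hedging, verbosity, sycophancy).
import torch
import torch.nn as nn

class BehaviorProbe(nn.Module):
    def __init__(self, hidden_dim: int = 2048, fiber_dim: int = 16, n_behaviors: int = 4):
        super().__init__()
        self.project = nn.Linear(hidden_dim, fiber_dim, bias=False)  # fiber projection
        self.score = nn.Linear(fiber_dim, n_behaviors)               # per-behavior logits

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden_dim) from one layer of a frozen LM.
        fiber = self.project(hidden_states)   # (batch, seq_len, fiber_dim)
        pooled = fiber.mean(dim=1)            # mean-pool over tokens
        return self.score(pooled)             # (batch, n_behaviors) logits

probe = BehaviorProbe()
print(probe(torch.randn(2, 128, 2048)).shape)  # torch.Size([2, 4])
```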
Implications: These findings validate the geometric behavioral subspace hypothesis and provide a foundation for real-time behavioral monitoring and modulation in deployed AI systems.
Files
| Name | Size |
|---|---|
| UBM_Paper_Final.pdf (md5:3464273ef24dea6cb47ef2c4acf2d68f) | 403.2 kB |
Additional details
Identifiers
Dates
- Created: 2026-02-03