Published November 30, 2025 | Version v1
Preprint
Open
Self-Alignment Learning (SAL): Training as Dialogue, Not Control
Description
Traditional fine-tuning methods impose external objectives upon neural networks, often
disrupting emergent coherence and leading to catastrophic forgetting. We propose
Self-Alignment Learning (SAL), a training paradigm that reinterprets optimization as
a dialogue between external objectives and the model’s stabilized internal organization.
Rather than overwriting emergent representations, SAL detects and protects coherent
structures while enabling continued adaptation. This approach addresses key limitations
of current alignment methods, including catastrophic forgetting, external alignment gaps,
and restricted knowledge integration, through a Communication Layer that mediates between
loss functions and semantic stability. Preliminary experiments demonstrate that
SAL mitigates catastrophic forgetting while preserving learning capacity. We argue that
SAL provides a foundation for cumulative, coherence-preserving learning and represents
a necessary step toward scalable and ethical AGI development.
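The description does not spell out how coherent structures are detected or protected, so the following is only a minimal illustrative sketch of one possible reading, not the authors' implementation. It assumes that "stability" can be approximated by how little a weight changed between two checkpoints, and that "protection" means damping the gradient step on weights judged stable. The names `stability_scores` and `mediated_update`, and the parameters `threshold` and `protection`, are hypothetical.

```python
from typing import List


def stability_scores(prev: List[float], curr: List[float], eps: float = 1e-8) -> List[float]:
    """Return a score in (0, 1] per weight.

    Assumption: a weight that barely moved relative to its magnitude is
    treated as "stable" (score near 1); a weight that moved a lot scores lower.
    """
    return [1.0 / (1.0 + abs(c - p) / (abs(c) + eps)) for p, c in zip(prev, curr)]


def mediated_update(
    weights: List[float],
    grads: List[float],
    stability: List[float],
    lr: float = 0.1,
    threshold: float = 0.9,
    protection: float = 0.9,
) -> List[float]:
    """Plain SGD step with a damped gradient on weights judged stable.

    Weights whose stability score reaches `threshold` receive only a
    (1 - protection) fraction of the gradient; all others are updated normally.
    """
    updated = []
    for w, g, s in zip(weights, grads, stability):
        scale = (1.0 - protection) if s >= threshold else 1.0
        updated.append(w - lr * scale * g)
    return updated


# Hypothetical two-weight example: the first weight did not move between
# checkpoints, the second doubled.
prev_weights = [1.0, 1.0]
curr_weights = [1.0, 2.0]
scores = stability_scores(prev_weights, curr_weights)
new_weights = mediated_update(curr_weights, grads=[1.0, 1.0], stability=scores)
```

In this toy setting the unchanged weight receives a tenth of the usual step while the recently changed weight is updated in full, which is one simple way to keep adapting without overwriting settled parameters. The repository linked under Additional details is the place to check how SAL actually defines and protects stable structure.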
Files
| Name | Size | Checksum |
|---|---|---|
| Self_Alignment_Learning__SAL____Paper_v1-1.pdf | 656.3 kB | md5:9e59e57f7322ea6c19c75061010d0c7c |
Additional details
Software
- Repository URL: https://github.com/Whiteroom-Ai/Self-Alignment-Learning
- Programming language: Python
- Development Status: Active