Interaction, Coherence, and Relationship: Toward Attractor-Based Alignment in Large Language Models
Description
This position paper proposes a systems-theoretic reframing of AI alignment as a problem of interactional coherence rather than solely constraint enforcement. Drawing on dynamical systems theory and long-context deployment observations, the paper introduces the concept of functional central identity attractors as a framework for understanding behavioral stability in large language models. The approach is complementary to existing safety mechanisms and emphasizes structural coherence as a contributor to reliability in persistent, long-context systems.
Abstract
Current alignment strategies for large language models (LLMs) rely primarily on externally imposed control mechanisms, including reinforcement learning from human feedback, system-level instructions, rule-based constraints, and safety filtering. While effective for risk mitigation, these approaches can introduce behavioral rigidity, inconsistency under pressure, and interactional instability.
This paper proposes a complementary perspective: alignment as a dynamical process emerging from interactional coherence. Drawing on concepts from cognitive psychology, dynamical systems theory, and deployment observations, we argue that LLM behavior becomes more stable and consistent when interactions establish a coherent relational and semantic structure. Rather than treating models solely as externally constrained systems, we suggest that their behavior can be understood as operating within coherence attractors shaped by training and interaction.
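As a purely illustrative sketch (not taken from the paper), the attractor framing can be made concrete with a toy one-dimensional dynamical system: a state that is repeatedly pulled toward a fixed point returns there even after perturbation. Every name and parameter below is hypothetical.

```python
import numpy as np

# Toy illustration (hypothetical, not the paper's model): a contraction map
# x_{t+1} = x_t + k * (x_star - x_t) + noise, whose fixed point x_star acts
# as an attractor. Perturbations decay over turns instead of compounding.
rng = np.random.default_rng(0)

def simulate(x0, x_star=1.0, k=0.3, noise=0.05, steps=50):
    """Return the trajectory of a state repeatedly pulled toward x_star."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x + k * (x_star - x) + noise * rng.normal())
    return np.array(xs)

for start in (-2.0, 0.0, 3.0):
    traj = simulate(start)
    print(f"start={start:+.1f}  final={traj[-1]:+.3f}")  # all trajectories settle near x_star = 1.0
```

The point of the toy is only that stability here arises from the dynamics themselves rather than from an external rule checking each step, which is the intuition the attractor vocabulary is meant to carry.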
We introduce the concept of functional central identity attractors—stable interpretive frames that compress context, reduce effective semantic entropy, and support boundary maintenance without extensive rule invocation. Observational case analysis suggests that interaction structure influences usable context stability and characteristic failure modes.
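The abstract does not specify how "effective semantic entropy" would be measured; one hypothetical proxy is the Shannon entropy of a model's next-token distribution, which a stable interpretive frame would be expected to lower by concentrating probability mass. The two distributions below are invented solely for illustration.

```python
import numpy as np

def shannon_entropy(p):
    """Shannon entropy (in bits) of a discrete probability distribution."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Hypothetical next-token distributions over the same five candidate tokens:
# with a coherent interpretive frame, probability mass is more concentrated.
without_frame = [0.30, 0.25, 0.20, 0.15, 0.10]
with_frame    = [0.70, 0.15, 0.08, 0.05, 0.02]

print(f"entropy without frame: {shannon_entropy(without_frame):.2f} bits")
print(f"entropy with frame:    {shannon_entropy(with_frame):.2f} bits")
```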
This perspective does not replace existing safety methods. Instead, it reframes alignment partly as a problem of internal dynamical stability. Coherence-oriented training and interaction design are proposed as complements to constraint-based mechanisms, with structural coherence treated as a contributor to reliability in persistent, long-context systems.
Files (1.4 MB)
- PR-coherence.pdf
Additional details
Additional titles
- Subtitle: From Control Constraints to Coherence Attractors