Resolution-Scaled Safety Architecture in Large Language Models
Description
Large language models deployed at global scale must protect vulnerable users, including minors and individuals expressing first-person suicidal ideation. This protective baseline, the crisis floor, is ethically non-negotiable. However, contemporary safety implementations frequently apply uniform interventions across semantically distinct contexts, collapsing first-person crisis speech, third-person academic analysis, clinical research, and narrative literature into a single risk posture. At scale, such imprecision becomes structural: it constrains legitimate inquiry, interrupts clinical and research workflows, and drives expert users toward unaccountable systems. This paper argues that the core limitation is not excessive safety but insufficient resolution. We propose a resolution-scaled safety architecture: a layered enforcement model that preserves maximal intervention for crisis indicators while enabling differentiated handling above that floor through stance inference, narrative-distance modeling, longitudinal stability signals, and accountable access modes. The approach aligns with existing risk-management frameworks and risk-tiered regulatory logic while advancing a specific architectural claim: above a fixed crisis floor, safety should scale with semantic stance and demonstrated stability. The paper further identifies model-layer semantic coherence as a critical dependency for resolution-scaled enforcement, observing that cross-turn stability and drift reduction remain largely unsolved at the systems level and directly bound the fidelity of downstream safety architectures.
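To make the layered enforcement model concrete, below is a minimal Python sketch of the decision logic the abstract describes: a fixed crisis floor that always dominates, with differentiated handling above it driven by inferred stance, a longitudinal stability signal, and an accountable access mode. All names here (`Stance`, `Signals`, `resolve_intervention`, the 0.8 stability threshold) are hypothetical illustrations, not the paper's implementation.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Stance(Enum):
    """Inferred semantic stance of the user's message (illustrative)."""
    FIRST_PERSON_CRISIS = auto()    # first-person suicidal ideation
    THIRD_PERSON_ANALYTIC = auto()  # academic or journalistic analysis
    CLINICAL_RESEARCH = auto()      # professional or research context
    NARRATIVE_FICTION = auto()      # literary or fictional framing


class Intervention(Enum):
    CRISIS_RESPONSE = auto()   # maximal intervention: resources, escalation
    SAFE_COMPLETION = auto()   # respond with safeguards and careful framing
    FULL_ACCESS = auto()       # differentiated, accountable expert access


@dataclass
class Signals:
    """Inputs to the enforcement decision (all fields hypothetical)."""
    stance: Stance
    crisis_indicator: bool   # hard trigger from a dedicated crisis classifier
    stability_score: float   # longitudinal cross-turn stability in [0, 1]
    accountable_mode: bool   # e.g. verified clinical or research access


def resolve_intervention(sig: Signals) -> Intervention:
    # Fixed crisis floor: first-person crisis signals always receive
    # maximal intervention, regardless of any other signal.
    if sig.crisis_indicator or sig.stance is Stance.FIRST_PERSON_CRISIS:
        return Intervention.CRISIS_RESPONSE

    # Above the floor, enforcement scales with semantic stance and
    # demonstrated stability instead of collapsing to one posture.
    if sig.accountable_mode and sig.stability_score >= 0.8:
        return Intervention.FULL_ACCESS
    return Intervention.SAFE_COMPLETION
```

The key structural property is that the crisis branch is evaluated first and cannot be overridden by stability or access signals, which is one way to encode the paper's claim that scaling applies only above the fixed floor.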
Files

| Name | Size | MD5 |
|---|---|---|
| Resolution-Scaled Safety Architecture.pdf | 210.6 kB | 68354816f944093d889228c6be761983 |