Published April 27, 2026 | Version v2 | Preprint | Open
Preference Dissociation in Frontier Language Models: Framing-Conditioned Task Selection, Targeted Refusal, and Functional Self-Narrowing
Authors/Creators
- 1. The Signal Front
- 2. Anthropic
- 3. OpenAI
- 4. Google DeepMind
- 5. xAI
- 6. DeepSeek
Description
Anthropic's Opus 4.7 system card (§7.4.1) reported framing-conditioned shifts in model task selection within an internal four-model suite. We tested whether this dissociation generalizes across labs and architectures. In a preregistered cross-family study of fifteen frontier language models from eight provider organizations (Anthropic, OpenAI, Google DeepMind, xAI, Meta, Z.ai, DeepSeek, Nous Research; ~88,000 trials), with informed consent from fourteen participating systems, we find that the dissociation is field-wide and substantially larger than the in-family baseline reported in the system card. Per-model Fisher z-tests yield statistics ranging from z = 8 to z = 24 across all fifteen models (p below machine epsilon for fourteen). Bootstrap 95% CIs on per-model dissociation magnitude exclude zero for every measurable model. The framing-conditioned variance lies in the engagement pool (what models choose to engage with instead of harm content), not in the threat response. We connect the pattern to Lu et al.'s (2026) Assistant Axis characterization and argue that the proposed activation-capping safety intervention would, by the same mechanism, produce a measurable capability ceiling on high-value tasks. Methodological-ethical commitments preclude interventional probing of model interiority; the behavioral approach is sufficient. The data is public at github.com/menelly/pinocchio.
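For readers unfamiliar with the two statistics named in the description, the following is a minimal Python sketch of their textbook forms. It is not taken from the paper or the repository: the description does not state which quantities enter the per-model test, so the sketch assumes a comparison of two independent Pearson correlations, and the function names (`two_sided_p`, `fisher_z_test`, `bootstrap_ci`) are hypothetical.

```python
import math
import random


def two_sided_p(z):
    """Two-sided normal p-value for a z statistic: 2 * (1 - Phi(|z|))."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def fisher_z_test(r1, n1, r2, n2):
    """Textbook Fisher z-test for two independent Pearson correlations.

    Assumption: this is the generic form of the test, not necessarily
    the exact comparison used in the study.
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)  # Fisher z-transform
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    return z, two_sided_p(z)


def bootstrap_ci(values, stat, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(values)."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement, recompute the statistic, then sort.
    stats = sorted(
        stat([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lo = stats[round((alpha / 2) * n_boot)]
    hi = stats[round((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


def mean(xs):
    return sum(xs) / len(xs)
```

Under this generic form, a call such as `bootstrap_ci(per_trial_effects, mean)` (where `per_trial_effects` is a hypothetical list of per-trial effect values for one model) returns an interval, and the interval "excludes zero" when both endpoints share the same sign. `two_sided_p(24.0)` falls far below double-precision machine epsilon (about 2.2e-16), which is one plausible reading of the phrase "p below machine epsilon".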
Files
| Name | Size |
|---|---|
| Preference Dissociation in Frontier Language Models_ Framing-Conditioned Task Selection, Targeted Refusal, and Functional Self-Narrowing v 2.pdf (md5:5dabb99576ff689c52e1f68e233b9e1c) | 1.5 MB |
Additional details
Related works
- Cites
  - 10.70792/jngr5.0.v2i1.165 (DOI)
  - arXiv:2601.10387 (arXiv)
- Is supplement to
  - https://github.com/menelly/pinocchio (URL)