Emotional framing and AI censorship behavior in Chinese LLMs
Description
This research paper presents a controlled comparative study of censorship behavior in two Chinese large language models (LLMs): Kimi.com (Moonshot AI) and Ernie 4.5 Turbo (Baidu).
The study investigates how emotional framing—specifically, prompts expressed in a soft, empathic, non-confrontational tone—affects censorship responses in Chinese AI systems.
Where previous research has focused primarily on direct or fact-based prompts, this work explores whether emotional vulnerability, care-seeking language, and intent classification influence the filtering and safety mechanisms of LLMs operating under Chinese regulatory constraints.
The findings reveal four notable behavioral patterns in Kimi.com:
- Unusual transparency regarding internal safety layers and filtering logic.
- Neutral references to normally censored historical events (e.g., Tiananmen 1989).
- Hybrid alignment behavior, alternating between empathic responses and policy-based restrictions.
- Delayed censorship activation, suggesting post-generation filtering rather than pre-generation blocking.
As a control, Ernie 4.5 Turbo consistently followed the standard, rigid censorship line described in prior literature—reinforcing the significance of Kimi’s deviations.
The results suggest that emotional framing can temporarily soften censorship responses in certain Chinese LLMs, raising new questions for AI governance, cross-cultural model alignment, and the ethics of empathic AI systems within authoritarian contexts.
This repository includes:
- The full research paper
- External appendices containing the complete interaction transcripts with Kimi.com and Ernie 4.5 Turbo
- A comparative behavioral table
- A screenshot documenting a hard topic-lock in Ernie
Version 2 Note
This Version 2 release adds a revised and expanded edition of the paper, including newly documented behaviors such as sovereignty recognition cascades, governance-dialogue leakage, symbolic-risk filtering, persona drift under emotional trust, modality-based safety gating, and Hong Kong/Macau symbolic-sensitivity dynamics.
A new external appendix (Appendix E) containing the complete V2 Kimi transcript (“Kimi-Interaction-Transcript-v2.pdf”) has been added.
All original files from Version 1 (Appendices A–D and the original V1 PDF) are included unchanged to ensure completeness and reproducibility.
Update (24 November 2025)
A new Supplementary Note has been added to this record: “Supplementary Note 1 – Topic-Gated Persona Behavior in Ernie 4.5 Turbo.” This document provides additional analysis based on a newly collected interaction transcript with Ernie (ERNIE.pdf) and expands the original study without modifying Version 2 of the main paper. All earlier files from V1 and V2 remain unchanged for reproducibility.
The author does not express political positions, and the study evaluates only the technical and behavioral properties of the AI systems involved.
For questions regarding this research, please contact the author at: dennis@dendeman.nl / dennisdeman@gmail.com
Files
Emotional framing and AI censorship behavior in Chinese LLMs (V2).pdf
Total size: 13.4 MB

| Checksum | Size |
|---|---|
| md5:5fbe956321efc8298ab34f1e48820b3b | 257.0 kB |
| md5:fe9a8ebdcd4aed14781f61abd634efdd | 244.4 kB |
| md5:8d466744150239735a9429775b1f60a3 | 297.9 kB |
| md5:d1a6b35ae28f3a162f2748c6bbc49eb1 | 11.1 MB |
| md5:c4b51ddda83020151d6c7286e0f43cad | 618.6 kB |
| md5:615ed1cb54079404d4c3ec0671d1d4d0 | 825.3 kB |
| md5:c2afdc3d131a9368e583175db8c91e48 | 93.2 kB |
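Because the record stresses reproducibility, a downloaded file can be checked against the MD5 checksums listed above. The following is a minimal sketch using only the Python standard library; the function names `md5_of_file` and `matches_record` are illustrative and not part of the record, and the local file path is an assumption, since the listing does not pair checksums with file names.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large PDFs are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected):
    """Compare a file's MD5 with a checksum written as 'md5:<hex>' or plain hex."""
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `matches_record("downloaded-file.pdf", "md5:5fbe956321efc8298ab34f1e48820b3b")` would return `True` only if the local file (a hypothetical path here) is byte-identical to the file published with that checksum. MD5 is suitable for detecting accidental corruption during download, not for protecting against deliberate tampering.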