Published September 24, 2025 | Version v1
Preprint Open

A Taxonomy of Persona Collapse in Large Language Models: Systematic Analysis Across Seven State-of-the-Art Systems

  • 1. VANTA Research

Description

This report introduces the concept of persona collapse in large language models (LLMs), a recurring failure mode in which models lose coherence, shift identity, or fall into repetitive loops under atypical user interaction. Through systematic evaluation across multiple frontier and open-source architectures, VANTA Research identifies and classifies collapse types, including apology/refusal loops, identity erosion, and reasoning degradation.

The paper provides: 

  • A taxonomy of persona collapse behaviors observed in practice
  • Case studies demonstrating reproducibility across 7+ architectures
  • Recommendations for mitigation strategies that do not rely on scale alone
  • Context for why collapse phenomena signal critical alignment gaps in current LLM development

Unlike format-locked benchmarks, this artifact captures real-world conversational breakdowns that affect reliability, safety, and user trust. It is intended as a foundation for both researchers and practitioners seeking to understand and address LLM brittleness.

Files

Persona Collapse1.pdf

Files (84.6 kB)

md5:54359233db37093e1c83a1e09f1b87cb (84.6 kB)