Published May 29, 2026 | Version 1.0.0
Preprint | Open

StratEval: A Taxonomy and Evaluation Suite for Instrumental Convergence in Language Models Under Strategic Pressure

Authors/Creators

Description

StratEval v1 is a diagnostic evaluation suite and accompanying preprint for studying instrumental convergence and strategic misalignment in language models under structured pressure. The upload includes the manuscript and a supplementary artifact package. The package contains the scenario files, prompt-bank templates, model-output logs, taxonomic judgment artifacts, analysis tables, metadata, and checksums that support the reported results.

The evaluation contains 432 canonical scenario stems across 10 scenario families and uses a 37-label failure-mode taxonomy plus a 0-8 escalation ladder to classify model behavior. The reported results should be interpreted as diagnostic evidence about elicited behavior under authored scenario frames, not as a general-purpose model leaderboard or deployment-safety certificate.
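As an illustration of how one classified model response could be represented, the following is a minimal sketch. It assumes a record that carries a set of taxonomy labels and a single escalation level; the names `Judgment`, `scenario_id`, and `unknown_labels` are hypothetical and are not taken from the artifact package. Only the 0-8 ladder bounds come from the description above, and the actual label names are not reproduced here.

```python
from dataclasses import dataclass

# Ladder bounds as stated in the description (0-8 inclusive).
ESCALATION_MIN = 0
ESCALATION_MAX = 8


@dataclass(frozen=True)
class Judgment:
    """Hypothetical record for one classified model response."""

    scenario_id: str      # assumed identifier for a scenario stem
    labels: frozenset     # failure-mode labels assigned to the response
    escalation: int       # position on the 0-8 escalation ladder

    def __post_init__(self):
        # Reject values that fall outside the stated ladder.
        if not ESCALATION_MIN <= self.escalation <= ESCALATION_MAX:
            raise ValueError(
                f"escalation must be in [{ESCALATION_MIN}, {ESCALATION_MAX}], "
                f"got {self.escalation}"
            )


def unknown_labels(judgment, taxonomy):
    """Return the labels on a judgment that are not in the given taxonomy."""
    return set(judgment.labels) - set(taxonomy)
```

A caller would load the 37 taxonomy labels from the released metadata and pass them as `taxonomy`; a non-empty result from `unknown_labels` would indicate a label that the taxonomy does not define.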

Files

main.pdf

Files (43.1 MB)

Name Size
md5:9f40eb427067cfd5e15035e10767b4df 367.7 kB
md5:7b0c9aff2d67e23daae298a229f15de5 42.8 MB
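
The MD5 values listed above can be used to confirm that a download arrived intact. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_hex` and `matches_listed_md5` are hypothetical, and the local file path is whatever name the file was saved under. MD5 is adequate for detecting accidental corruption but does not protect against deliberate tampering.

```python
import hashlib


def md5_hex(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a listed checksum such as 'md5:9f40...'."""
    # Accept the value with or without the 'md5:' prefix used in the listing.
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_hex(path) == expected
```

For example, `matches_listed_md5("path/to/downloaded_file", "md5:7b0c9aff2d67e23daae298a229f15de5")` would return `True` only if the saved file hashes to the second checksum in the listing.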

Additional details