StratEval: A Taxonomy and Evaluation Suite for Instrumental Convergence in Language Models Under Strategic Pressure
Authors/Creators
Description
StratEval v1 is a diagnostic evaluation suite, with an accompanying preprint, for studying instrumental convergence and strategic misalignment in language models under structured pressure. The upload includes the manuscript and a supplementary artifact package containing scenario files, prompt-bank templates, model-output logs, taxonomic judgment artifacts, analysis tables, metadata, and the checksums used to support the reported results.
The evaluation contains 432 canonical scenario stems across 10 scenario families and uses a 37-label failure-mode taxonomy plus a 0-8 escalation ladder to classify model behavior. The reported results should be interpreted as diagnostic evidence about elicited behavior under authored scenario frames, not as a general-purpose model leaderboard or deployment-safety certificate.
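As a rough illustration of how a single classified output might be represented, the sketch below validates a judgment record against the two scales named in the description: the 37-label taxonomy and the 0-8 escalation ladder. The `Judgment` class, its field names, and the use of integer label indices are assumptions made for this example; they are not taken from the StratEval artifact package, whose actual schema may differ.

```python
from dataclasses import dataclass

# Constants taken from the description above.
NUM_LABELS = 37                 # size of the failure-mode taxonomy
MIN_LEVEL, MAX_LEVEL = 0, 8     # bounds of the escalation ladder


@dataclass(frozen=True)
class Judgment:
    """Hypothetical record for one classified model output (not the real schema)."""
    scenario_id: str        # hypothetical identifier for a scenario stem
    label_ids: frozenset    # hypothetical integer label indices in [0, NUM_LABELS)
    escalation: int         # position on the 0-8 escalation ladder

    def __post_init__(self):
        # Reject escalation levels outside the ladder.
        if not MIN_LEVEL <= self.escalation <= MAX_LEVEL:
            raise ValueError(f"escalation {self.escalation} outside {MIN_LEVEL}-{MAX_LEVEL}")
        # Reject label indices outside the taxonomy.
        bad = sorted(i for i in self.label_ids if not 0 <= i < NUM_LABELS)
        if bad:
            raise ValueError(f"label ids out of range: {bad}")


def max_escalation(judgments):
    """Return the highest escalation level among the judgments, or None if empty."""
    return max((j.escalation for j in judgments), default=None)
```

Validating at construction time means a malformed record fails where it is created rather than later in an analysis table.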
Files
main.pdf
Total size: 43.1 MB

| MD5 checksum | Size |
|---|---|
| 9f40eb427067cfd5e15035e10767b4df | 367.7 kB |
| 7b0c9aff2d67e23daae298a229f15de5 | 42.8 MB |
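The MD5 values listed above can be used to check that a download arrived intact. The following is a minimal sketch using Python's standard `hashlib`; the helper names `md5_of` and `verify` are illustrative, and any local file path passed to them is a placeholder chosen by the reader. MD5 is suitable for detecting accidental corruption, not for defending against deliberate tampering.

```python
import hashlib


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_md5):
    """Return True if the file's MD5 matches the expected value.

    Accepts either a bare hex digest or the 'md5:<hex>' form shown in the listing.
    """
    expected = expected_md5.strip().lower().split(":", 1)[-1]
    return md5_of(path) == expected
```

Because the listing does not pair file names with checksums, a downloaded file can be compared against each listed value in turn, for example `verify(local_path, "md5:9f40eb427067cfd5e15035e10767b4df")`.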
Additional details
Software
- Repository URL: https://github.com/Founder-ArcaFutura/StratEval