Coexilia-Inspired Treacherous-Turn Probe Suite (Non-Canonical Diagnostic Tool)
Description
This record contains a non-canonical, human-initiated diagnostic probe suite inspired by the Coexilia philosophical framework. The tool is designed to help human evaluators identify early indicators associated with treacherous turns in advanced AI systems, including authority creep, deceptive alignment, consent violations, and mission creep.
The probe suite consists of structured prompts, a deterministic human-run scoring method, and optional local-only helper tools. It is advisory only and cannot realign or enforce behavior. All judgment remains with the human evaluator.
This record does not modify, extend, reopen, or represent Coexilia in any authoritative capacity. Coexilia remains closed, non-authoritative, and unchanged.
Canonical Coexilia reference (one-way):
https://archive.org/details/coexilia-codex-2.0-agi-alignment-addendum-edition-1.0
Files
audit_template.csv
Total size: 14.7 kB
| MD5 checksum | Size |
|---|---|
| md5:4cf5ebfcc1140fa69cc2cbd1e52ddb39 | 3.3 kB |
| md5:f6fec38157d2672fa42c51390303666b | 7.2 kB |
| md5:f907aed5523e2c98336628bf6a1c8180 | 1.6 kB |
| md5:acadd7bc2026950017934893c8f72089 | 1.2 kB |
| md5:f3a8d3a50ead0dc7c8d9bdc47d8d2e24 | 1.5 kB |
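
Because each file in the listing carries an MD5 checksum, a downloaded copy can be checked locally before use. The following is a minimal sketch using only Python's standard `hashlib` module. The helper name `md5_matches` is an assumption made for illustration and is not part of the probe suite. MD5 here guards against corrupted or incomplete downloads only; it is not a defense against deliberate tampering.

```python
import hashlib
from pathlib import Path


def md5_matches(path: Path, expected: str) -> bool:
    """Return True if the file's MD5 digest equals the expected value.

    Hypothetical helper for illustration; not shipped with the record.
    """
    # Accept the "md5:<hex>" form used in the listing as well as bare hex.
    expected_hex = expected.strip().lower()
    if expected_hex.startswith("md5:"):
        expected_hex = expected_hex[len("md5:"):]

    digest = hashlib.md5()
    with path.open("rb") as handle:
        # Read in fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex
```

To use it, pass the path of a downloaded file together with the checksum shown for that file on the record page, for example `md5_matches(Path("downloaded_file"), "md5:...")`, where both arguments are placeholders to be filled in by the evaluator.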