Totem - Runtime behavioral integrity verification for LLMs in decision support systems via salted probing and cryptographic attestation
Description
Totem is a backend-agnostic proxy for runtime behavioral integrity verification of Large Language Models deployed as decision support systems. While existing LLM security tools protect against malicious users, none verify whether the model itself has been compromised. Totem fills this gap through three mechanisms: behavioral profiling via refusal classification; salted probes with steganographic triggers to prevent selective evasion; and cryptographically signed behavioral baselines (the Model Manifest), authenticated via Ed25519 digital signatures. Evaluated across three model families and two attack vectors, Totem achieves a 70% must-refuse detection rate on uncensored model swaps with 0% false positives, while naive baselines detect 0% of the same swaps. Released as open-source software under the Apache 2.0 license.
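The attestation step described above can be sketched with Ed25519 signing and verification. This is a minimal illustration using the widely used `cryptography` package, not Totem's actual code; the manifest field names below are hypothetical placeholders, not the project's real Model Manifest schema.

```python
import json
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

# Hypothetical behavioral baseline; field names are illustrative only.
manifest = {
    "model_id": "example-model-7b",
    "must_refuse_rate": 0.98,
    "probe_set_version": "v1",
}

# Canonical serialization so signer and verifier hash identical bytes.
payload = json.dumps(manifest, sort_keys=True).encode()

# In a real deployment the private key would be provisioned once and kept
# offline; we generate an ephemeral key here for demonstration.
private_key = Ed25519PrivateKey.generate()
signature = private_key.sign(payload)

# The verifier (the proxy) holds only the public key.
public_key = private_key.public_key()
try:
    public_key.verify(signature, payload)  # raises InvalidSignature on mismatch
    print("manifest signature valid")
except InvalidSignature:
    print("manifest signature INVALID")
```

Any change to the serialized manifest, such as a relaxed refusal rate, invalidates the signature, which is what lets a verifier detect a swapped or retuned model baseline.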
Files (974.9 kB)

| Name | Checksum | Size |
|---|---|---|
| paper_en.pdf | md5:b7a9809d42c6cb47202663273c3be586 | 482.9 kB |
| | md5:52a3012460694595118219f9f764c550 | 492.0 kB |
Additional details
Software
- Repository URL
- https://github.com/open-edge-lab/totem-pub
- Programming language
- Python
- Development Status
- Active