Published January 24, 2025
| Version v1
Software
Restricted
JBShield: Defending Large Language Models from Jailbreak Attacks through Activated Concept Analysis and Manipulation
Description
This is the artifact for USENIX Security 2025 accepted paper "JBShield: Defending Large Language Models from Jailbreak Attacks through Activated Concept Analysis and Manipulation." To test the artifact, please download and uncompress the zip file.