JBShield: Defending Large Language Models from Jailbreak Attacks through Activated Concept Analysis and Manipulation

Zhang, Shenyi

doi:10.5281/zenodo.15110680

Published March 31, 2025 | Version v4

Software Open

Zhang, Shenyi¹

1. Wuhan University

This is the artifact for the paper "JBShield: Defending Large Language Models from Jailbreak Attacks through Activated Concept Analysis and Manipulation," accepted at USENIX Security 2025. To test the artifact, download and uncompress the zip file. The code is also available at https://github.com/NISPLab/JBShield.

Files

Name	MD5	Size
JBShield-final.zip	0c3588b6d7fae0c8a0b0ab9edbaae69c	32.7 MB
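
Before uncompressing the archive, it can be worth checking the download against the MD5 checksum listed in the file table. The following is a minimal sketch, assuming the zip has already been downloaded to a local path; the function names `md5_of_file` and `verify_and_extract` and the paths in the closing comment are hypothetical and not part of the artifact.

```python
import hashlib
import zipfile
from pathlib import Path

# MD5 checksum as listed for JBShield-final.zip in this record.
EXPECTED_MD5 = "0c3588b6d7fae0c8a0b0ab9edbaae69c"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_and_extract(archive_path, dest_dir, expected_md5=EXPECTED_MD5):
    """Extract the archive only if its MD5 matches; return the destination path."""
    actual = md5_of_file(archive_path)
    if actual != expected_md5:
        raise ValueError(f"MD5 mismatch: expected {expected_md5}, got {actual}")
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(dest)
    return dest


# Example call (hypothetical local paths, adjust to your download location):
# verify_and_extract("JBShield-final.zip", "JBShield")
```

A mismatch raises `ValueError` before anything is extracted, so a corrupted or incomplete download is never unpacked.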

	All versions	This version
Views	255	63
Downloads	34	14
Data volume	1.4 GB	522.4 MB

DOI: 10.5281/zenodo.15110680
Resource type: Software
Publisher: Zenodo

License: Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: March 31, 2025
Modified: March 31, 2025
