JBShield: Defending Large Language Models from Jailbreak Attacks through Activated Concept Analysis and Manipulation

Zhang, Shenyi

doi:10.5281/zenodo.14732884

There is a newer version of the record available.

Published January 24, 2025 | Version v1

Software Restricted

Zhang, Shenyi¹

1. Wuhan University

This is the artifact for the paper "JBShield: Defending Large Language Models from Jailbreak Attacks through Activated Concept Analysis and Manipulation," accepted at USENIX Security 2025. To test the artifact, download and uncompress the zip file.

Files

Restricted

The record is publicly accessible, but files are restricted.

	All versions	This version
Views	1,043	485
Downloads	112	11
Data volume	5.1 GB	453.3 MB

Resource type: Software

Publisher: Zenodo

License: Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: January 24, 2025
Modified: May 10, 2025
