Published May 1, 2025 | Version v3
Dataset
Open
JailFact-Bench: A Comprehensive Analysis of Jailbreak Attacks vs. Hallucinations in LLMs
Authors/Creators
Description
JailFact-Bench is a curated benchmark dataset for analyzing jailbreak attacks and hallucination patterns in Large Language Models (LLMs). It contains semantically aligned jailbreak and factuality prompts, along with metadata including toxicity shifts, similarity scores, and annotation strategies. Developed at NYU Abu Dhabi under Professor Christina Pöpper, this dataset accompanies the paper accepted at the SiMLA 2025 Workshop, co-located with the 23rd International Conference on Applied Cryptography and Network Security (ACNS).
Files
README.md
Total size: 24.5 kB
| Checksum (MD5) | Size |
|---|---|
| md5:3b2456c4cda0b65710b4984cadd54b24 | 22.0 kB |
| md5:28060159ebec18c06210c82c0113d2fc | 2.5 kB |
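The MD5 checksums in the file listing can be used to confirm that a download arrived intact. The following is a minimal sketch using only the Python standard library. The helper names `md5_of_file` and `verify_download` and the local path `downloaded_file` are hypothetical, since this record does not show which file name belongs to which checksum.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Compare a file's MD5 with a checksum such as 'md5:3b24...'."""
    expected = expected_md5.strip().lower()
    # The listing prefixes checksums with 'md5:'; accept both forms.
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


if __name__ == "__main__":
    # Hypothetical local path: substitute the file you actually downloaded.
    local_path = "downloaded_file"
    listed_checksum = "md5:3b2456c4cda0b65710b4984cadd54b24"
    if os.path.exists(local_path):
        print("checksum matches:", verify_download(local_path, listed_checksum))
    else:
        print("file not found:", local_path)
```

MD5 is suitable here only as an integrity check against accidental corruption. It does not authenticate the source of the file.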
Additional details
Dates
- Created: 2025-04-30 (dataset creation and submission date)