Published May 1, 2025 | Version v3
Dataset Open

JailFact-Bench: A Comprehensive Analysis of Jailbreak Attacks vs. Hallucinations in LLMs

  • 1. New York University Abu Dhabi
  • 2. New York University Abu Dhabi (NYUAD)
  • 3. Ruhr University Bochum

Description

JailFact-Bench is a curated benchmark dataset for analyzing jailbreak attacks and hallucination patterns in Large Language Models (LLMs). It contains semantically aligned jailbreak and factuality prompts, along with metadata including toxicity shifts, similarity scores, and annotation strategies. Developed at NYU Abu Dhabi under Professor Christina Pöpper, this dataset accompanies the paper accepted at the SiMLA 2025 Workshop, co-located with the 23rd International Conference on Applied Cryptography and Network Security (ACNS).

Files

README.md

Files (24.5 kB)

Checksum                               Size
md5:3b2456c4cda0b65710b4984cadd54b24   22.0 kB
md5:28060159ebec18c06210c82c0113d2fc   2.5 kB

Additional details

Dates

Created
2025-04-30
Dataset creation and submission date