Published June 9, 2025 | Version v3
Dataset Open

PrivacyXray: Detecting Privacy Breaches in LLMs through Semantic Consistency and Probability Certainty

Description

This artifact contains the code and dataset used in our paper to analyze and classify privacy leakage behaviors in LLMs. It includes scripts for model fine-tuning, hidden state extraction, and classification, as well as training data.

Files

privacyxray.zip (27.1 MB)
md5:8552f7e65d9fcad0b024ae2689bfbd2f
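
The MD5 digest listed for privacyxray.zip can be used to confirm that a downloaded copy is intact. The following is a minimal sketch, not part of the artifact itself: the helper names `md5_of_file` and `verify_archive` are hypothetical, and the script assumes the archive was saved as `privacyxray.zip` in the current working directory.

```python
import hashlib
from pathlib import Path

# Digest copied from the file listing on this page.
EXPECTED_MD5 = "8552f7e65d9fcad0b024ae2689bfbd2f"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local download location; adjust to wherever the file was saved.
    archive = Path("privacyxray.zip")
    if archive.exists():
        print("checksum OK" if verify_archive(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

A mismatch usually indicates an incomplete or corrupted download, in which case fetching the archive again is the simplest remedy. MD5 is adequate for detecting accidental corruption but should not be treated as a guarantee against deliberate tampering.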