USENIX Sec '25 Cycle 2 Artifact of "I Know What You Said..."

Gao, Zibo

doi:10.5281/zenodo.15612485

Published June 6, 2025 | Version v3

Software Open

Gao, Zibo (Project manager)

This repository contains the code and data to evaluate the proposed attack framework.

This code can be used to reproduce the results in the paper “I Know What You Said: Unveiling Hardware Cache Side-Channels in Local Large Language Model Inference”.

Disclaimer: Due to inherent random fluctuations in hardware, minor variations in experimental outcomes are expected and acceptable.

On untested machines or configurations, you may need to manually calibrate the threshold to suit the environment.
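
As a rough illustration of what manual calibration can look like, the sketch below picks a threshold halfway between the median latency of two groups of timing samples. It assumes the threshold separates cache-hit from cache-miss access latencies, as is common in cache side-channel attacks; the artifact's actual threshold definition lives in the src folder and may differ. The function name `calibrate_threshold` and the sample values are hypothetical.

```python
from statistics import median


def calibrate_threshold(hit_latencies, miss_latencies):
    """Return a threshold halfway between the median hit and median miss latency.

    Assumption: lower latency means a cache hit, higher latency means a miss.
    """
    hit_med = median(hit_latencies)
    miss_med = median(miss_latencies)
    if hit_med >= miss_med:
        # The two groups are not separable in the expected direction.
        raise ValueError("hit latencies are not lower than miss latencies; re-measure")
    return (hit_med + miss_med) / 2


# Hypothetical latency samples in TSC cycles (illustrative only, not measured).
hits = [62, 58, 65, 60, 70]
misses = [240, 255, 230, 260, 248]
threshold = calibrate_threshold(hits, misses)
print(f"suggested threshold: {threshold} cycles")
```

Using medians rather than means keeps the estimate robust against the occasional outlier caused by interrupts or frequency changes.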

Folders

After downloading and extracting all zip packages, you will obtain the following directory structure:
.
├── documents
├── dataset
├── models
├── results
├── src
├── survey
├── targets
└── thirdparty_models

Note: The Windows built-in decompression utility may fail to extract some packages. We recommend using WinRAR or 7-Zip instead. On Linux systems, the unzip command can be used.

We also provide a **one-key** extraction script: bash unpack_all.sh

The dataset folder includes the original datasets. The dataset/generated folder contains the constructed dataset.

The src folder contains the source code of the proposed attack and the experiment tools, including parameters and analysis scripts.

The models folder contains the fine-tuned models (built upon Llama-3.1-8B-Instruct) used to implement the proposed method.

The thirdparty_models folder contains other publicly available models used for data analysis (including computing cosine similarity).

The results folder stores collected cache traces, synthesized datasets, experiment and analysis results. It also includes the generated jsonl files required to fine-tune OpenAI’s proprietary GPT-4o-mini.

The survey folder stores the survey results collected from random participants on Prolific.

The targets folder stores the victim LLM inference framework software.

About the `results` folder

This folder includes subdirectories with the naming convention <victim computing device>/<victim OS>/<victim framework>/<victim machine>/<victim LLM>/<experiment name>.
For example, the folder “results/gpu/debian12/llama.cpp/Intel 13900K/Mistral-7b-instruct/pr-*” stores prompt reconstruction (pr) results with the specified settings:

Computing device: GPU
OS: Debian 12
Victim Framework: llama.cpp
Victim Machine: Intel 13900K CPU
Victim LLM: Mistral-7b-instruct
Experiment: prompt reconstruction (pr)
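
The naming convention above can be split into its six components programmatically. The following is a minimal sketch, not part of the artifact: the names `ResultLocation` and `parse_results_path` are hypothetical, and the path used is the example from this section.

```python
from pathlib import PurePosixPath
from typing import NamedTuple


class ResultLocation(NamedTuple):
    """The six path components that follow the results/ directory."""
    device: str
    os: str
    framework: str
    machine: str
    llm: str
    experiment: str


def parse_results_path(path):
    """Split a results path into the components of the naming convention."""
    parts = PurePosixPath(path).parts
    if "results" not in parts:
        raise ValueError(f"no 'results' component in {path!r}")
    start = parts.index("results") + 1
    fields = parts[start:start + 6]
    if len(fields) != 6:
        raise ValueError(f"expected 6 components after 'results', got {len(fields)}")
    return ResultLocation(*fields)


loc = parse_results_path(
    "results/gpu/debian12/llama.cpp/Intel 13900K/Mistral-7b-instruct/pr-*"
)
print(loc.device, loc.os, loc.framework, loc.machine, loc.llm, loc.experiment, sep=" | ")
```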

Quickstart

To ensure full compatibility, we recommend creating a new conda environment with Python 3.12.3:

 conda create -n ikwys python=3.12.3
 conda activate ikwys

Next, install the required dependencies:

 pip install -r src/requirements.txt

Then, you can reproduce our experiments with our scripts.

 cd src

Experiments

1. Construct dataset

Run Command:

./run-gen-dataset.sh
python3 exp_split_datasets.py
python3 exp_gen_microbench.py

Output file(s): dataset/generated/*

2. Collect LLM outputs and cache trace

Switch to an Intel 13900K machine with Ubuntu 22.04 (described in Section 5)
Run Command:

./run-collect.sh "Intel 13900K" "ubuntu22_04" main_eval,framework_eval,hardware_os_eval,embd_quant_eval

Switch to an Intel 13900K machine with Debian 12 in Docker 28.2.2
Run Command:

./run-collect.sh "Intel 13900K" 'debian12' hardware_os_eval

Switch to an Intel 13900K machine with Windows 11
Run Command:

./run-collect.sh "Intel 13900K" 'windows11' hardware_os_eval

Switch to an Intel 14900K machine with Ubuntu 22.04 (described in Section 5)
Run Command:

./run-collect.sh "Intel 14900K" "ubuntu22_04" hardware_os_eval

Switch to an Intel 12700KF machine with Ubuntu 22.04 (described in Section 5)
Run Command:

./run-collect.sh "Intel 12700KF" "ubuntu22_04" hardware_os_eval

Output file(s): results/gpu/* and results/cpu/*

Note: Collecting all the necessary data takes several days.
This package also includes a microbench for quickly checking the attack's feasibility.

3. Set API key for fine-tuning “GPT-4o-mini-2024-07-18” model

If you wish to reproduce the results of GPT-4o-mini-2024-07-18, please ensure that you have an OpenAI platform account available.

If you don’t have an OpenAI platform account, follow these steps to register one and create an API key:

1. Visit the official website of the OpenAI Platform: https://platform.openai.com
2. Click the ‘Sign Up’ button at the top right of the page.
3. Fill in the registration form to obtain an OpenAI account.
4. Use your account to log in to the OpenAI platform: https://platform.openai.com
5. Click the avatar in the upper-right corner and select “View API keys”.
6. Click “Create new secret key”, then copy the generated API key (it is displayed only once, so save it somewhere safe).
7. Enter your API key in the src/openai_apikey.py file. Example:

# set your API key here
OPENAI_API_KEY='sk-proj-xxxxxxxxxxxxxxxxxxx'

If you already have an OpenAI platform account, follow the last step above (step 7) to set the API key.

4. Fine-tune and evaluate the attacking LLMs

Command:

python runall.py

Output file(s): models/*

5. Analyze results

Command:

./run-analysis.sh

This script summarizes the experiment data.

Tools

Check the RDTSC counting frequency of your CPU:

make tsc-info
./tsc-info

Set the obtained hardware parameters in experiment_common.py, lines 79 to 83. Example:

# params for Intel 13900K
CRYSTAL_FREQ_RATIO = 78
CRYSTAL_FREQ = 38.4e6 # Hz
TSC_FREQ = CRYSTAL_FREQ * CRYSTAL_FREQ_RATIO
TSC_PERIOD = 1/TSC_FREQ # s
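
To make the arithmetic behind these parameters concrete, here is a short worked sketch using the Intel 13900K example values above. The helper functions `cycles_to_seconds` and `seconds_to_cycles` are hypothetical and are not claimed to exist in experiment_common.py.

```python
# Example values for the Intel 13900K, copied from the parameters above.
CRYSTAL_FREQ_RATIO = 78
CRYSTAL_FREQ = 38.4e6  # Hz
TSC_FREQ = CRYSTAL_FREQ * CRYSTAL_FREQ_RATIO  # 38.4 MHz * 78 = 2.9952 GHz
TSC_PERIOD = 1 / TSC_FREQ  # seconds per TSC tick


def cycles_to_seconds(cycles):
    """Convert a TSC tick count to elapsed seconds."""
    return cycles * TSC_PERIOD


def seconds_to_cycles(seconds):
    """Convert a duration in seconds to the nearest whole TSC tick count."""
    return round(seconds * TSC_FREQ)


print(f"TSC frequency: {TSC_FREQ / 1e9:.4f} GHz")
print(f"1 ms corresponds to {seconds_to_cycles(1e-3)} ticks")
```

With these values, one TSC tick lasts roughly 0.334 ns, so timestamps recorded on a machine with a different ratio or crystal frequency would be misinterpreted unless the parameters are updated.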

Files

Files (17.0 GB)

Name	md5	Size
dataset.zip	5cf3b2c9a5e395c615d7544d32de545d	1.1 GB
documents.zip	4965551a561200886f7b3d2df2b83534	6.5 kB
models.zip	bb0948b878a7dd74e02c2320f72baa47	5.8 GB
results.zip	42cae560cbc62afcbc95785bc935017d	5.2 GB
src.zip	5ad985be83ba8ffb47a27d0942404c87	15.4 MB
survey.zip	2c6f042be72fbf3dc4852801e7b50a87	14.7 kB
targets.zip	8094735c32e1a4b97f0087d727025b44	4.6 GB
thirdparty_models.zip	31d28954ea9976eba721ca186881337e	312.7 MB
unpack_all.sh	08be1016233912ba2685069cc26ef3d9	169 Bytes
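
Since several packages are multiple gigabytes, it can be worth checking the downloads against the md5 values listed above before extracting. The following is a minimal sketch, not part of the artifact: the function names are hypothetical, and it assumes all files were downloaded into a single directory.

```python
import hashlib
from pathlib import Path

# md5 checksums copied from the file listing above.
EXPECTED_MD5 = {
    "dataset.zip": "5cf3b2c9a5e395c615d7544d32de545d",
    "documents.zip": "4965551a561200886f7b3d2df2b83534",
    "models.zip": "bb0948b878a7dd74e02c2320f72baa47",
    "results.zip": "42cae560cbc62afcbc95785bc935017d",
    "src.zip": "5ad985be83ba8ffb47a27d0942404c87",
    "survey.zip": "2c6f042be72fbf3dc4852801e7b50a87",
    "targets.zip": "8094735c32e1a4b97f0087d727025b44",
    "thirdparty_models.zip": "31d28954ea9976eba721ca186881337e",
    "unpack_all.sh": "08be1016233912ba2685069cc26ef3d9",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each file name to True (match), False (mismatch), or None (missing)."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = (md5_of_file(path) == expected) if path.is_file() else None
    return results


if __name__ == "__main__":
    labels = {True: "OK", False: "MISMATCH", None: "missing"}
    for name, status in verify_downloads().items():
        print(f"{name}: {labels[status]}")
```

Reading in 1 MiB chunks keeps memory use constant even for the largest archives.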
