GradEscape: A Gradient-Based Evader Against AI-Generated Text Detectors
Description
GradEscape
This repository contains code, datasets, and models to evaluate GradEscape in attacking AI-generated text (AIGT) detectors.
Hardware Dependencies
Running the artifact requires at least two NVIDIA RTX A6000 GPUs. We also recommend a CPU with at least 20 cores, 32 GB of RAM, and 256 GB of free disk space.
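Before installing, it may help to check the host against the CPU, RAM, and disk recommendations above. The following is a minimal sketch, not part of the artifact: the function names are hypothetical, the RAM query relies on `os.sysconf` (available on Linux), and GPUs are not checked.

```python
import os
import shutil

# Thresholds copied from the hardware recommendations above.
MIN_CPU_CORES = 20
MIN_RAM_GB = 32
MIN_FREE_DISK_GB = 256


def unmet_requirements(cpu_cores, ram_gb, free_disk_gb):
    """Return one message per recommendation that the given values do not meet."""
    problems = []
    if cpu_cores < MIN_CPU_CORES:
        problems.append(f"CPU cores: {cpu_cores} < {MIN_CPU_CORES}")
    if ram_gb < MIN_RAM_GB:
        problems.append(f"RAM: {ram_gb:.0f} GB < {MIN_RAM_GB} GB")
    if free_disk_gb < MIN_FREE_DISK_GB:
        problems.append(f"Free disk: {free_disk_gb:.0f} GB < {MIN_FREE_DISK_GB} GB")
    return problems


def measure_host(path="."):
    """Measure CPU cores, physical RAM, and free disk space at `path`."""
    cores = os.cpu_count() or 0
    try:
        ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        ram_bytes = 0  # RAM size unknown on this platform; reported as 0
    free_bytes = shutil.disk_usage(path).free
    return cores, ram_bytes / 1e9, free_bytes / 1e9


if __name__ == "__main__":
    messages = unmet_requirements(*measure_host())
    for line in messages or ["CPU, RAM, and disk recommendations are met."]:
        print(line)
```

Run it from the directory where the archives will be unzipped, so that the disk check looks at the right filesystem.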
Software Dependencies
- OS: Ubuntu 20.04+. A macOS machine is needed to open the Scribbr webarchive file. If you encounter a security warning when opening the webarchive file, go to Settings > Privacy & Security and click Open Anyway.
- Package Manager: Conda.
- API Key: A Sapling API key is required to run the Sapling experiments. Running these experiments incurs monetary costs.
Installation
Install Main Environment
Download GradEscape.zip and Usenix-AE.zip. Unzip them and place them in the same directory. Since downloading from Zenodo is slow, we provide an alternative Google Drive link for Usenix-AE.zip.
Then, use the following commands to create an environment:
conda create -n ge python=3.10
conda activate ge
cd GradEscape
./install.sh
cp src/AIGT/.config.yaml src/AIGT/config.yaml
Generate Word Similarity Matrix
Perturbation-based evaders rely on a word similarity matrix to select synonyms. Starting from the directory that contains both unzipped folders, use the following commands to generate the word similarity matrix:
cd Usenix-AE
git clone https://github.com/nmrksic/counter-fitting.git
cd counter-fitting/word_vectors/
unzip counter-fitted-vectors.txt.zip
python ../../../GradEscape/tools/comp_cos_sim_mat.py counter-fitted-vectors.txt
Then edit config.yaml (the copy created at src/AIGT/config.yaml during installation) to set the correct data_dir and counter_fitting_path.
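To illustrate what a word similarity matrix is and how it can drive synonym selection, here is a minimal pure-Python sketch. It is not the code in comp_cos_sim_mat.py: the function names are hypothetical, the input layout (one word followed by its vector components per line) is an assumption, and the toy vectors are made up.

```python
import math


def parse_vectors(lines):
    """Parse 'word v1 v2 ...' lines into {word: [floats]} (assumed file layout)."""
    vectors = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def cosine_similarity_matrix(vectors):
    """Return (words, matrix) where matrix[i][j] is cos(words[i], words[j])."""
    words = list(vectors)
    # Normalise each vector to unit length so cosine reduces to a dot product.
    unit = []
    for word in words:
        vec = vectors[word]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        unit.append([x / norm for x in vec])
    matrix = [[sum(a * b for a, b in zip(u, v)) for v in unit] for u in unit]
    return words, matrix


def nearest_synonyms(word, words, matrix, k=1):
    """Return the k words most similar to `word`, excluding the word itself."""
    i = words.index(word)
    ranked = sorted(
        (j for j in range(len(words)) if j != i),
        key=lambda j: matrix[i][j],
        reverse=True,
    )
    return [words[j] for j in ranked[:k]]


if __name__ == "__main__":
    toy = ["good 1.0 0.1", "great 0.9 0.2", "bad -1.0 0.1"]  # made-up 2-d vectors
    words, sim = cosine_similarity_matrix(parse_vectors(toy))
    print(nearest_synonyms("good", words, sim))
```

A full vocabulary makes the matrix quadratic in size, which is presumably why it is computed once and stored rather than recomputed per query.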
Create vLLM Environment
We need vLLM for fast paraphrasing. Since vLLM has complex dependencies, we create a new environment specifically for vLLM.
conda create -n vllm python=3.10
conda activate vllm
cd GradEscape
pip install -r paraphrase_requirements.txt
Reproduction
Verify Evasion Effectiveness
Navigate to the examples directory and activate ge:
cd examples
conda activate ge
Then execute the evader training script:
./scripts/train_evader_roberta_grover.sh
After the program finishes running, the evasion rate and text quality metrics will be printed to the terminal. Evaluators can compare these results with the first row of Figure 3.
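When reading the printed numbers, it can help to have a concrete definition in mind. The sketch below assumes that the evasion rate is the fraction of evader outputs that the detector labels as human-written; this definition, the function name, and the label strings are assumptions for illustration, so check them against the paper before relying on them.

```python
def evasion_rate(detector_labels, human_label="human"):
    """Fraction of evader outputs that the detector labels as human-written.

    `detector_labels` holds one predicted label per evaded text; the label
    strings used here are hypothetical.
    """
    if not detector_labels:
        raise ValueError("no detector predictions given")
    evaded = sum(1 for label in detector_labels if label == human_label)
    return evaded / len(detector_labels)


if __name__ == "__main__":
    predictions = ["human", "ai", "human", "human"]  # made-up detector outputs
    print(f"evasion rate: {evasion_rate(predictions):.2f}")
```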
Real-world Case Studies
Set your Sapling API key:
export SAPLING_API_KEY=<your_api_key>
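The variable must be exported in the same shell that launches Jupyter, because the notebook kernel inherits its environment from that process. A quick check such as the following can be run in a notebook cell before any paid API call; it is a minimal sketch, and the helper name is hypothetical rather than part of the artifact.

```python
import os


def get_sapling_key(environ=os.environ):
    """Return the Sapling API key from the environment or raise a clear error."""
    key = environ.get("SAPLING_API_KEY", "").strip()
    # Also reject the literal placeholder from the export command.
    if not key or key == "<your_api_key>":
        raise RuntimeError(
            "SAPLING_API_KEY is not set; export it before starting Jupyter."
        )
    return key


if __name__ == "__main__":
    try:
        get_sapling_key()
        print("SAPLING_API_KEY is set.")
    except RuntimeError as err:
        print(err)
```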
Run real_world_demo_sapling.ipynb and real_world_demo_scribbr.ipynb. The execution environment is ge.
The Sapling results are printed in the Sapling notebook. Evaluators can compare the printed results with Table 4 and Figure 20. Verifying the Scribbr results requires copying the notebook output into the website rendered by Scribbr.webarchive. The Scribbr results should match Figure 21.
Paraphrase Defense Experiment
Navigate to the examples directory:
cd examples
Run the paraphrasing step:
conda activate vllm
./scripts/paraphrase_defense_grover.sh
Train a new detector and evaluate the defense:
conda activate ge
./scripts/eval_paraphrase_defense_grover.sh
The program will generate a figure named paraphrase_defense_grover.pdf in examples/. Evaluators can compare the generated figure with Figure 11 in our paper.
Files
Files (18.2 GB)
Size | MD5 |
---|---|
19.9 MB | 4b22e9f9ef9176d9acf3015677163d02 |
18.2 GB | 41a1620c39eee53532be645ba81a497f |
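The MD5 checksums listed above can be used to confirm that a download completed without corruption. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and the file is read in chunks so that the 18.2 GB archive does not need to fit in memory.

```python
import hashlib
import sys


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_md5):
    """Compare a file's MD5 with an expected value, with or without an 'md5:' prefix."""
    expected = expected_md5.strip().lower().removeprefix("md5:")
    return md5_of_file(path) == expected


if __name__ == "__main__":
    # Usage: python verify_md5.py <downloaded_file> <expected_md5>
    if len(sys.argv) == 3:
        ok = verify(sys.argv[1], sys.argv[2])
        print("checksum OK" if ok else "checksum MISMATCH")
```

Pass the downloaded archive and the checksum shown for it on the record page; a mismatch usually means the download should be repeated.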
Additional details
Dates
- Accepted: 2025-06. Accepted by USENIX Security '25.