Published July 4, 2025 | Version v4
Software Open

GradEscape: A Gradient-Based Evader Against AI-Generated Text Detectors

  • 1. Zhejiang University
  • 2. Vrije Universiteit Amsterdam
  • 3. National University of Defense Technology

Description

GradEscape

This repository contains code, datasets, and models to evaluate GradEscape in attacking AI-generated text (AIGT) detectors.

Hardware Dependencies

Two Nvidia RTX A6000 GPUs are the minimum GPU requirement to run the artifact. We recommend at least a 20-core CPU, 32 GB RAM, and 256 GB of free disk space.
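Before installing, it may help to check a machine against these figures. The following is a minimal sketch and not part of the artifact: the names `count_gpus` and `check_host` are hypothetical, the thresholds are copied from the paragraph above, and the GPU count assumes `nvidia-smi` is on the PATH. It does not check the GPU model, and RAM is left out because the Python standard library has no portable call for it.

```python
import os
import shutil
import subprocess


def count_gpus() -> int:
    """Count GPUs listed by `nvidia-smi -L`; return 0 if the tool is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return 0
    result = subprocess.run(
        ["nvidia-smi", "-L"], capture_output=True, text=True, check=False
    )
    if result.returncode != 0:
        return 0
    # Each GPU is reported on its own line starting with "GPU <index>:".
    return sum(1 for line in result.stdout.splitlines() if line.startswith("GPU "))


def check_host(gpus: int, cores: int, free_disk_gb: float) -> list[str]:
    """Return a list of problems; an empty list means the host passes."""
    problems = []
    if gpus < 2:
        problems.append(f"need at least 2 GPUs, found {gpus}")
    if cores < 20:
        problems.append(f"recommended at least 20 CPU cores, found {cores}")
    if free_disk_gb < 256:
        problems.append(
            f"recommended at least 256 GB free disk, found {free_disk_gb:.0f} GB"
        )
    return problems


if __name__ == "__main__":
    # Free space is measured on the filesystem holding the current directory.
    free_gb = shutil.disk_usage(".").free / 1e9
    for problem in check_host(count_gpus(), os.cpu_count() or 0, free_gb):
        print("WARNING:", problem)
```

Run it from the directory where the archives will be unpacked, so the disk check looks at the right filesystem.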

Software Dependencies

  • OS: Ubuntu 20.04+. A macOS machine is needed to open the Scribbr webarchive file. If you encounter a security warning when opening the webarchive file, go to Settings, then Privacy & Security, and click Open Anyway.
  • Package Manager: Conda.
  • API Key: A Sapling API key is required to run the Sapling experiments. Running these experiments incurs monetary costs.

Installation

Install Main Environment

Download GradEscape.zip and Usenix-AE.zip. Unzip both archives and place them in the same directory. Because downloading from Zenodo is slow, we provide an alternative Google Drive link for Usenix-AE.zip.

Then, use the following commands to create an environment:

conda create -n ge python=3.10
conda activate ge
cd GradEscape
./install.sh
cp src/AIGT/.config.yaml src/AIGT/config.yaml

Generate Word Similarity Matrix

Perturbation-based evaders rely on a word similarity matrix to select synonyms. Use the following commands to generate the word similarity matrix:

cd Usenix-AE
git clone https://github.com/nmrksic/counter-fitting.git
cd counter-fitting/word_vectors/
unzip counter-fitted-vectors.txt.zip
python ../../../GradEscape/tools/comp_cos_sim_mat.py counter-fitted-vectors.txt

Then edit config.yaml to set the correct data_dir and counter_fitting_path.
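For intuition, the idea behind a word similarity matrix can be sketched as follows. This is an illustrative toy and not the repository's script. It assumes the matrix holds pairwise cosine similarities between word vectors, as the script name `comp_cos_sim_mat.py` suggests, and that the vector file stores one word per line followed by its components. All helper names are hypothetical, and a real vocabulary would call for a vectorized implementation instead of nested Python lists.

```python
import math


def parse_vectors(lines):
    """Parse 'word v1 v2 ...' lines into a dict of word -> list of floats."""
    vectors = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def cosine_similarity_matrix(vectors):
    """Return (words, matrix) where matrix[i][j] is cos(words[i], words[j])."""
    words = list(vectors)
    # Normalize each vector to unit length so cosine reduces to a dot product.
    unit = []
    for word in words:
        vec = vectors[word]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        unit.append([x / norm for x in vec])
    matrix = [[sum(a * b for a, b in zip(u, v)) for v in unit] for u in unit]
    return words, matrix


def nearest_neighbors(word, words, matrix, k=1):
    """Return the k words most similar to `word`, excluding the word itself."""
    i = words.index(word)
    ranked = sorted(range(len(words)), key=lambda j: matrix[i][j], reverse=True)
    return [words[j] for j in ranked if j != i][:k]


# Toy two-dimensional vectors, invented for illustration only.
toy = parse_vectors(["good 1.0 0.1", "great 0.9 0.2", "bad -1.0 0.0"])
words, sim = cosine_similarity_matrix(toy)
print(nearest_neighbors("good", words, sim))  # prints ['great']
```

A perturbation-based evader would use such a lookup to pick replacement words that stay close in meaning to the original.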

Create vLLM Environment

We need vLLM for fast paraphrasing. Since vLLM has complex dependencies, we create a new environment specifically for vLLM.

conda create -n vllm python=3.10
conda activate vllm
cd GradEscape
pip install -r paraphrase_requirements.txt

Reproduction

Verify Evasion Effectiveness

Navigate to the examples directory and activate the ge environment:

cd examples
conda activate ge

Then execute the evader training script:

./scripts/train_evader_roberta_grover.sh

After the program finishes running, the evasion rate and text quality metrics will be printed to the terminal. Evaluators can compare these results with the first row of Figure 3.
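When reading the printed numbers, it can help to have a concrete picture of the headline metric. The sketch below assumes that the evasion rate is the fraction of evaded AI-generated texts that the detector labels as human-written; this definition, the function name, and the label strings are assumptions for illustration and are not taken from the artifact's code.

```python
def evasion_rate(predictions, human_label="human"):
    """Fraction of evaded AI-generated texts that the detector labels as human."""
    if not predictions:
        raise ValueError("predictions must not be empty")
    evaded = sum(1 for label in predictions if label == human_label)
    return evaded / len(predictions)


# Hypothetical detector outputs for four evaded AI-generated texts.
print(evasion_rate(["human", "ai", "human", "human"]))  # prints 0.75
```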

Real-world Case Studies

Set your Sapling API key:

export SAPLING_API_KEY=<your_api_key>

Run real_world_demo_sapling.ipynb and real_world_demo_scribbr.ipynb in the ge environment.

The Sapling results are printed in the Sapling notebook. Evaluators can compare the printed results with Table 4 and Figure 20. Verifying the Scribbr results requires copying the notebook output into the web page rendered by Scribbr.webarchive. The Scribbr results should match Figure 21.
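Because the Sapling experiments cost money, it may be worth failing early when the key is missing, before any request is sent. The following is a small sketch of such a check that could be pasted into a notebook cell; the helper name `get_sapling_key` is hypothetical, and only the environment variable name comes from the instructions above.

```python
import os


def get_sapling_key(env=None):
    """Return the Sapling API key from the environment or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get("SAPLING_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "SAPLING_API_KEY is not set; run "
            "`export SAPLING_API_KEY=<your_api_key>` "
            "in the shell that launches Jupyter."
        )
    return key
```

The variable must be exported in the same shell session that starts Jupyter, since a notebook kernel inherits its environment from the launching process.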

Paraphrase Defense Experiment

Navigate to examples:

cd examples

Run paraphrase:

conda activate vllm
./scripts/paraphrase_defense_grover.sh

Train a new detector and evaluate the defense:

conda activate ge
./scripts/eval_paraphrase_defense_grover.sh

The program will generate a figure named paraphrase_defense_grover.pdf in examples/. Evaluators can compare the generated figure with Figure 11 in our paper.
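The two steps run in different Conda environments, so it is easy to launch a script in the wrong one. As a hedged sketch that is not part of the artifact, the sequence could be driven from a single Python script using `conda run -n <env>`, which executes a command inside a named environment without activating it. The `STEPS` list and the function names are illustrative; the script paths are the ones given above, and the default `cwd` assumes the driver is started from the GradEscape directory.

```python
import subprocess

# (environment, script) pairs in the order given above.
STEPS = [
    ("vllm", "./scripts/paraphrase_defense_grover.sh"),
    ("ge", "./scripts/eval_paraphrase_defense_grover.sh"),
]


def build_command(env, script):
    """Build a `conda run` invocation that executes `script` inside `env`."""
    # --no-capture-output streams the script's output live instead of buffering it.
    return ["conda", "run", "--no-capture-output", "-n", env, script]


def run_steps(steps, cwd="examples", dry_run=False):
    """Run each step in order, stopping at the first failure."""
    commands = [build_command(env, script) for env, script in steps]
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True raises CalledProcessError if a script exits non-zero.
            subprocess.run(command, cwd=cwd, check=True)
    return commands


if __name__ == "__main__":
    # Print the commands only; pass dry_run=False to execute them.
    run_steps(STEPS, dry_run=True)
```

Stopping at the first failure matters here because the evaluation step depends on the paraphrases produced by the first script.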

Files

GradEscape.zip

Files (18.2 GB)

Checksum Size
md5:4b22e9f9ef9176d9acf3015677163d02 19.9 MB
md5:41a1620c39eee53532be645ba81a497f 18.2 GB
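Since the larger archive is slow to download, it may be worth verifying each file against its listed MD5 checksum before unzipping. The following is a minimal sketch using Python's standard `hashlib`; the function names are hypothetical, and because the pairing of checksums to archive names is not shown in this listing, pass whichever checksum the record shows for the file you downloaded.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat even for multi-gigabyte archives.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's digest with a listed checksum such as 'md5:4b22...'."""
    expected_hex = expected.strip().lower().removeprefix("md5:")
    return md5_of_file(path) == expected_hex
```

For example, `matches_checksum("Usenix-AE.zip", "md5:<listed value>")` returns `True` only if the download is intact.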

Additional details

Dates

Accepted
2025-06
Accepted by USENIX Security'25

Software

Programming language
Python, Shell