VIPER_code

Qi, Minfeng; He, Dongyang; Zhang, Lefeng; Wang, Qin

doi:10.5281/zenodo.17971722

Published December 18, 2025 | Version v1

Dataset Open

1. City University of Macau
2. Minzu University of China
3. CSIRO Data61

Breaking Visual Reasoning CAPTCHAs with VIPER: A Structured Vision-Language Attack Framework

VIPER_code is a Python framework for analyzing and attacking Visual Reasoning CAPTCHAs (VRCs), intended for both research and practical evaluation. It implements a structured, modular vision-language attack pipeline that covers six major VRC datasets as well as several non-VRC challenges. The framework automates evaluation and experimentation with large language models (LLMs) as CAPTCHA solvers, and the model and CAPTCHA type under test are both configurable.

File & Directory Structure

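Before running anything, extract all the compressed archives (`data_process.zip`, `dataset.zip`, `other_captcha.zip`, `recaptchaTest.zip`, and `result.zip`).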
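
The extraction step can be scripted. The following is a minimal sketch using only the standard library; the helper name `extract_archives` is hypothetical, and it assumes each archive already contains its own top-level folder (for example, that `dataset.zip` unpacks to `dataset/`), which the record does not confirm.

```python
import zipfile
from pathlib import Path


def extract_archives(root):
    """Extract every .zip file found directly in `root` into `root`.

    Assumption: each archive carries its own top-level folder, so the
    contents land in e.g. root/dataset/ rather than loose in root.
    Returns the names of the archives that were extracted, sorted.
    """
    root_path = Path(root)
    extracted = []
    for archive in sorted(root_path.glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(root_path)
        extracted.append(archive.name)
    return extracted
```

Calling `extract_archives(".")` from the download folder would unpack all archives in place. If an archive turns out to have no top-level folder, extract it into a folder named after the archive instead.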

| File/Folder | Description |
| --- | --- |
| `main.py` | Main entry point for experiment orchestration and accuracy testing. Change its imports to switch among LLM modules and VRC types. |
| `environment_variable.py` | LLM API and environment configuration. You must edit this file before running any experiment. |
| `r1_and_r2_main.py` | Script for running the "r1" and "r2" series of experiments. Update its imports to select the relevant experiment logic. |
| `DS_*.py`, `Grok_*.py`, `KIMI_*.py` | Model- and dataset-specific attack scripts. For example, `DS_geetest.py` holds the Geetest-specific logic. |
| `tools_*.py` | Toolkit and utility scripts for each VRC or LLM variant. |
| `best_28.pt` | Pretrained model weights (PyTorch). May be needed for certain experiments. |
| `data_process/` | Scripts and raw/intermediate data for transforming datasets. |
| `dataset/` | All datasets, including the six VRC datasets, their labels and prompts, and the evaluation scripts that consume them. |
| `other_captcha/` | Scripts and results for non-VRC experiments, such as traditional text-based CAPTCHAs or rotated CAPTCHA variants. |
| `result/` | Experimental result files, including cracking accuracy per LLM/model (`r1`, `r2`), the corresponding response times, and any processed data. |
| `recaptchaTest/` | Test scripts and resources specific to reCAPTCHA. |
| `README.md` | This documentation. |

Directions for Use

Configure the Environment:
- Edit `environment_variable.py` to set the LLM API keys, endpoints, and dataset locations required by your experiments.
Run Main Experiments:
- `main.py` is the central script for most experiments:
  - Run it with `python main.py`.
  - To test a different LLM or VRC type, change the import statement at the top of the file, e.g.:
```
from tools_gpt_xiaodun import *
# or: from Grok_xiaodun import *
# or: from KIMI_VTT import *
```
  - The script handles experiment orchestration, accuracy testing, and results logging.
Run r1 / r2 Experiments:
- Use `r1_and_r2_main.py` for the r1 and r2 accuracy experiments:
  - Edit the import line `from tools_r1 import *` to select your target (for example, `tools_r2`).
  - Change the VRC-specific logic by editing the `directory, sub_directory, width, height = getDirectory()` assignment or the `.Xiaodun_file()` invocation.
  - Select which VRC to attack by adjusting the arguments of the related function call.
Datasets and Output:
- Datasets must be placed in `dataset/`, where the test scripts expect them.
- Transformed and intermediate data is handled by the scripts in `data_process/`.
- Results and accuracy scores produced by the experiments are saved in `result/`.
Other CAPTCHA Experiments:
- `other_captcha/` contains scripts, data, and results for other CAPTCHA types, such as rotated or text-based CAPTCHAs.
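
The module names used when switching imports in `main.py` follow a `<prefix>_<VRC>` pattern that can be read off the file listing of this record. The sketch below resolves such a name and loads the module with `importlib`. This is an alternative to hand-editing the import line, not something the released scripts do: `PREFIXES`, `VRC_SUFFIXES`, `module_name`, and `load_attack_module` are hypothetical names, and because `main.py` relies on `from ... import *`, it would need adapting before it could work with a module object.

```python
import importlib

# Prefixes and VRC suffixes as they appear in the published file names.
PREFIXES = ("DS", "Grok", "KIMI", "tools_gpt")
VRC_SUFFIXES = ("DingXiang", "geetest", "netease", "shumei", "VTT", "xiaodun")


def module_name(prefix, vrc):
    """Build the module name for a given prefix and VRC (case-insensitive VRC)."""
    if prefix not in PREFIXES:
        raise ValueError(f"unknown prefix: {prefix!r}")
    matches = [s for s in VRC_SUFFIXES if s.lower() == vrc.lower()]
    if not matches:
        raise ValueError(f"unknown VRC: {vrc!r}")
    suffix = matches[0]
    # The listing has tools_gpt_dingxiang.py in lowercase, unlike DS_DingXiang.py.
    if prefix == "tools_gpt" and suffix == "DingXiang":
        suffix = "dingxiang"
    return f"{prefix}_{suffix}"


def load_attack_module(prefix, vrc):
    """Import the selected module; it must be importable from the working directory."""
    return importlib.import_module(module_name(prefix, vrc))
```

For example, `module_name("Grok", "xiaodun")` gives `"Grok_xiaodun"`, matching the second import shown above for `main.py`.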

Example Project Tree

VIPER_code/
├── main.py
├── environment_variable.py
├── r1_and_r2_main.py
├── DS_*.py
├── Grok_*.py
├── KIMI_*.py
├── tools_*.py
├── best_28.pt
├── README.md
├── data_process/
│   └── # Dataset processing scripts & data
├── dataset/
│   └── # All datasets and associated labels/questions
├── other_captcha/
│   └── # Non-VRC CAPTCHA code/experiments
├── recaptchaTest/
│   └── # Specific reCAPTCHA experiments
└── result/
    └── # Output/results for each experiment
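
After extracting the archives, it can be useful to confirm that the working directory matches the tree above. The sketch below checks only the literally named entries of the tree (wildcard entries such as `DS_*.py` are skipped); `EXPECTED_FILES`, `EXPECTED_DIRS`, and `missing_entries` are hypothetical names introduced for illustration.

```python
from pathlib import Path

# Entries named literally in the project tree above.
EXPECTED_FILES = (
    "main.py",
    "environment_variable.py",
    "r1_and_r2_main.py",
    "best_28.pt",
    "README.md",
)
EXPECTED_DIRS = ("data_process", "dataset", "other_captcha", "recaptchaTest", "result")


def missing_entries(root):
    """Return the expected files and folders that are absent under `root`."""
    root_path = Path(root)
    missing = [name for name in EXPECTED_FILES if not (root_path / name).is_file()]
    # Directories are reported with a trailing slash to tell them apart from files.
    missing += [name + "/" for name in EXPECTED_DIRS if not (root_path / name).is_dir()]
    return missing
```

An empty list from `missing_entries("VIPER_code")` means every named file and folder is in place.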

Reproducibility & Recommendations

- A Python 3.8+ environment is recommended.
- Install dependencies as needed (`pip install ...`). There may be no `requirements.txt`, so review the import statements in the main scripts.
- Some experiments require a CUDA-enabled GPU with a working PyTorch installation.
- Always check and update `environment_variable.py` before running a script so that it matches the experiment you intend to run.
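
Because a `requirements.txt` may be absent, the import review can be partly automated. The following sketch uses the standard `ast` module to collect the top-level module names imported by the scripts; `top_level_imports` and `scan_scripts` are hypothetical helper names. The result mixes standard-library modules, third-party packages, and local modules such as `tools_r1`, so it still has to be filtered by hand, and an import name does not always match the package name used with `pip install`.

```python
import ast
from pathlib import Path


def top_level_imports(source):
    """Return the top-level module names imported by a piece of Python source."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # Skip relative imports (level > 0); they refer to local packages.
            if node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    return names


def scan_scripts(root):
    """Collect imported module names from every .py file directly under `root`."""
    found = set()
    for path in Path(root).glob("*.py"):
        text = path.read_text(encoding="utf-8", errors="replace")
        found |= top_level_imports(text)
    return found
```

Running `sorted(scan_scripts("."))` in the project folder gives a starting list to compare against the installed packages.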

Files

Files (2.0 GB)

| Name | MD5 | Size |
| --- | --- | --- |
| `best_28.pt` | a53bb007c79662c107a76f92c7775869 | 19.3 MB |
| `data_process.zip` | a162fc86ebd517d7a938d4e115574ef5 | 74.0 MB |
| `dataset.zip` | c181e430f72a7809573e791b6f063b63 | 1.9 GB |
| `DS_DingXiang.py` | 10b59ca3cce091bce9f329bdfd3343e8 | 16.7 kB |
| `DS_geetest.py` | 019923d13d796fd38fb324a16df0b839 | 16.6 kB |
| `DS_netease.py` | b946e8c9a07b9385cfe5123522f335ea | 16.3 kB |
| `DS_shumei.py` | ffb29e3ee7d5c6c2d9de3aab0ce344bb | 16.2 kB |
| `DS_VTT.py` | 6aef3b0ed9164cfc24602abf7ea08322 | 16.3 kB |
| `DS_xiaodun.py` | a2a92fc639ee463086eb60588bd4d660 | 16.4 kB |
| `environment_variable.py` | 3c1d219ef68abfff46204307111e4132 | 3.2 kB |
| `Grok_DingXiang.py` | 5eb25ed26c919f869e4c51d8715cd6d0 | 16.0 kB |
| `Grok_geetest.py` | 0699f3a404703be78a48e3af7a53881c | 16.4 kB |
| `Grok_netease.py` | 14227a5e64d1f0fbe37f2c14569d2bb4 | 16.6 kB |
| `Grok_shumei.py` | 86e94e07d1b39f1bf728e6795104f40e | 16.7 kB |
| `Grok_VTT.py` | 91723500736f47083d634f6882a93d89 | 16.7 kB |
| `Grok_xiaodun.py` | 11ee6f7cb6deeffbe12767309aa8c0cb | 16.4 kB |
| `KIMI_DingXiang.py` | 9a8d6c5325da5bd8bc55007ce586f765 | 19.3 kB |
| `KIMI_geetest.py` | b506397777897e3111e0207eb0781763 | 19.1 kB |
| `KIMI_netease.py` | dfa9051bf7d6c4455fd345339268c9a7 | 19.4 kB |
| `KIMI_shumei.py` | 456231109ed47914d5039525826f2ff0 | 19.5 kB |
| `KIMI_VTT.py` | 748cdd7d32c0a880a8e4d3600fa77295 | 19.3 kB |
| `KIMI_xiaodun.py` | eb17902135869ba1774e790c7a446449 | 19.5 kB |
| `main.py` | d5aa3b75bac6a02fbe0795cf5fa62ef3 | 4.6 kB |
| `other_captcha.zip` | c04b85da201d3cfb9bcebf1b683b6f96 | 159.2 kB |
| `r1_and_r2_main.py` | 22b22f85be01a368341144fcd5625e49 | 4.2 kB |
| `README.md` | 134733835c1ee427c1ddaa9ece662c68 | 1.7 kB |
| `recaptchaTest.zip` | 8eac7375dc15193fbc32422024fc8895 | 15.9 MB |
| `result.zip` | 607983ae2e0726cef0008c81cf6ea6c9 | 1.1 MB |
| `tools_gpt_dingxiang.py` | 2e42c769538d5f7f0a2d3fc0c1f7a64d | 15.2 kB |
| `tools_gpt_geetest.py` | 925e01489cb34fa2a9a47d8537f05b85 | 15.0 kB |
| `tools_gpt_netease.py` | d3c2bf598d6f794fac7608f58084b921 | 15.2 kB |
| `tools_gpt_shumei.py` | 0353e296028f84a6b6f01534562a5eb0 | 15.2 kB |
| `tools_gpt_VTT.py` | 2c7253bf60708ad050a8fcf6a9369e4b | 15.1 kB |
| `tools_gpt_xiaodun.py` | c50789e7a59333b33abf9dda19f742e5 | 15.2 kB |
| `tools_r1.py` | 33482995db18548a7d9a5d6aaf2341f8 | 2.7 kB |
| `tools_r2.py` | fc2176845b8c384b81bda031ac0e153c | 4.5 kB |
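
The MD5 checksums in the listing can be used to confirm that a download is intact. A minimal sketch with the standard `hashlib` module follows; `md5_of_file` and `verify` are hypothetical helper names, and the file is read in chunks so that the 1.9 GB `dataset.zip` does not have to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_md5):
    """Compare a file's MD5 digest with the value from the file listing."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, `verify("best_28.pt", "a53bb007c79662c107a76f92c7775869")` should return `True` for an uncorrupted copy of the weights file.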
