Published March 8, 2026 | Version 1.0.6
Preprint Open

Basilisk: An Evolutionary AI Red-Teaming Framework for Systematic Security Evaluation of Large Language Models

  • 1. ROT Independent Security Research Lab

Description

The rapid deployment of large language models (LLMs) in production environments has introduced a new class of security vulnerabilities that traditional software testing methodologies are ill-equipped to address. I present Basilisk, an open-source AI red-teaming framework that applies evolutionary computation to the systematic discovery of adversarial vulnerabilities in LLMs. 

At its core, Basilisk introduces Smart Prompt Evolution (SPE-NL), a genetic algorithm that treats adversarial prompts as organisms subject to selection pressure, enabling the automated generation of novel attack variants that evade static guardrails. The framework covers 29 attack modules mapped to 8 categories of the OWASP LLM Top 10, supports differential testing across 100+ providers via a unified abstraction layer, and provides non-destructive guardrail posture assessment suitable for production environments. 

Basilisk produces audit-trails with cryptographic chain integrity and generates reports in five formats including SARIF 2.1.0 for integration with developer security workflows. Empirical evaluation demonstrates that evolutionary prompt mutation achieves a 92% relative improvement in attack success rate over static payload libraries. Basilisk is available as a Python package (pip install basilisk-ai), Docker image, desktop application, and GitHub Action for CI/CD integration.

Files

Basilisk.pdf

Files (677.6 kB)

Name Size Download all
md5:4462fd7114592b2cb33f2ec5c1f51084
677.6 kB Preview Download

Additional details

Software

Repository URL
https://github.com/regaan/basilisk
Programming language
Python , Go , C , TypeScript
Development Status
Active