Testing Static Analyzers via Semantic-Preserving Mutators Learned from Real-World Refactoring Practice
SAFuzzer
1. SAFuzzer Project Introduction
SAFuzzer is an innovative framework for testing Static Application Security Testing (SAST) tools through semantic-preserving code mutations. The framework employs a three-phase pipeline:
- Mutator Invention: Mines patterns from real-world refactoring commits and transforms them into executable Spoon-based mutators via LLM agents
- Mutator Refinement: Validates each mutator's semantic-preservation guarantee through rigorous dynamic equivalence checking
- Static Analyzer Testing: Applies validated mutators at scale to test static analyzers via metamorphic testing
SAFuzzer supports mainstream SAST tools including SpotBugs, PMD, Infer, CheckStyle, and SonarQube. The framework uses Java Spoon for AST manipulation.
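The metamorphic oracle behind the third phase can be illustrated with a minimal sketch. Because a validated mutator preserves semantics, an analyzer should report the same rules on the seed and on the mutant, so any difference is a bug candidate. The function name and the set-of-rule-names representation are assumptions made for illustration, not SAFuzzer's actual API.

```python
from typing import Dict, Set


def compare_reports(seed_rules: Set[str], mutant_rules: Set[str]) -> Dict[str, Set[str]]:
    """Diff the rules an analyzer reports for a seed and its equivalent mutant."""
    return {
        # Reported on the seed but lost after mutation: candidate false negative
        # on the mutant (assuming the seed warning was a true positive).
        "disappeared": seed_rules - mutant_rules,
        # Reported only on the mutant: candidate false positive.
        "appeared": mutant_rules - seed_rules,
    }


# Hypothetical reports: the warning vanishes after a semantic-preserving mutation.
diff = compare_reports({"IM_BAD_CHECK_FOR_ODD"}, set())
print(diff)
```

A discrepancy is only a candidate: it still needs manual triage to decide whether the seed or the mutant report is the wrong one.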
2. Top 10 Mutators Causing Bugs
The following ten mutators exposed the most bugs in SAST tools during testing:
| Rank | Mutator Name | Issue Count | Issues |
|---|---|---|---|
| 1 | EqualityCheckToInstanceofMutator | 4 | SpotBugs #3916, Infer #2001, Sonar S2259, PMD #6513 |
| 2 | ConditionalBlockInsertionMutator | 4 | SpotBugs #3894, Infer #2015, SpotBugs #3929, PMD #6518 |
| 3 | ParenthesesAdditionMutator | 4 | SpotBugs #3904, Infer #2015, PMD #6491, CheckStyle #19162 |
| 4 | IfConditionReorderingMutator | 3 | SpotBugs #3886, SpotBugs #3920, SpotBugs #3963 |
| 5 | VariableAdditionMutator | 3 | SpotBugs #3884, Infer #2015, Infer #1993 |
| 6 | NullCheckReorderingMutator | 3 | SpotBugs #3920, SpotBugs #3886, SpotBugs #3916 |
| 7 | ConditionalLogicInsertionMutator | 2 | PMD #6518, SpotBugs #3978 |
| 8 | MethodChainCallSwapMutator | 2 | SpotBugs #3966, PMD #6494 |
| 9 | ConditionNegationMutator | 2 | PMD #6435, SpotBugs #3963 |
| 10 | SingleLineReturnToBlockReturnMutator | 2 | PMD #6491, PMD #6519 |
3. SAFuzzer Usage Guide
Project Architecture Overview
SAFuzzer consists of three main components:
- Semantic_Equivalence_Knowledge_Base: A Python-based pipeline for mining semantic-preserving code patterns from real-world refactoring commits using LLM agents and dynamic execution validation. This component extracts transformation patterns from GitHub commits and validates their semantic equivalence.
- MutatorExecutor: A standalone Maven project containing the mutator implementations and a semantic-equivalence validation module. The knowledge base pipeline uses this component to dynamically validate the semantic equivalence of code transformations.
- Main SAFuzzer Framework: The core Java application that applies validated mutators to test SAST tools via metamorphic testing. This is the primary tool for detecting bugs in static analyzers.
Mutator Generation and Validation Pipeline
The framework includes three Python scripts that automate the mutator invention and refinement process:
1. Stage 1: Mutator Generation (stage1_generator.py)
- Input: Code pairs from GitHub refactoring commits (raw_diffs_chunk_*_output.json)
- Output: Mutator descriptions (JSON) and Java implementations
- Process:
  - Uses LLM to analyze code pairs and generate mutator descriptions
  - Converts descriptions into executable Spoon-based Java mutators
- Output Structure:
outputs/
├── 1_mutator_description/      # JSON descriptions
└── 2_mutator_implementation/   # Java implementations
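As a rough illustration of how the Stage 1 input files could be enumerated, the sketch below orders chunk files by index. It assumes the `*` in raw_diffs_chunk_*_output.json is a numeric chunk index; the `order_chunks` helper and its `start_chunk` parameter are hypothetical and not taken from stage1_generator.py.

```python
import re
from typing import List

# Assumption: the wildcard in the input file name is a numeric chunk index.
CHUNK_RE = re.compile(r"^raw_diffs_chunk_(\d+)_output\.json$")


def order_chunks(filenames: List[str], start_chunk: int = 0) -> List[str]:
    """Return matching chunk files sorted numerically, skipping earlier chunks."""
    indexed = []
    for name in filenames:
        match = CHUNK_RE.match(name)
        if match and int(match.group(1)) >= start_chunk:
            indexed.append((int(match.group(1)), name))
    return [name for _, name in sorted(indexed)]


files = ["raw_diffs_chunk_10_output.json", "raw_diffs_chunk_2_output.json", "README.md"]
print(order_chunks(files))
```

Sorting on the parsed integer keeps chunk 2 ahead of chunk 10, which a plain string sort would not.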
2. Stage 2: Compilation Verification (stage2_compilation_verification.py)
- Input: Java mutators from Stage 1
- Output: Compilable mutators
- Process:
  - Deploys mutators to sandbox environment
  - Validates compilation using javac
  - Automatically repairs compilation errors using LLM agents (max 5 attempts)
- Key Features:
  - Parallel processing (16 workers)
  - Sandbox isolation for each mutator
  - Intelligent repair with specialized tools
- Output Structure:
outputs/compilable_3-7/ # Compilable mutators
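The compile-and-repair loop of Stage 2 can be sketched as follows. The `try_compile` and `repair` callables are hypothetical stand-ins for the javac invocation and the LLM repair agent, and the sketch reads "max 5 attempts" as at most five repair rounds per mutator, which is one possible interpretation.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Optional

MAX_REPAIR_ATTEMPTS = 5  # from the stage description
WORKERS = 16             # from the stage description

CompileFn = Callable[[str], Optional[str]]  # returns an error message, or None on success
RepairFn = Callable[[str, str], str]        # (source, error) -> repaired source


def compile_with_repair(source: str, try_compile: CompileFn, repair: RepairFn,
                        max_attempts: int = MAX_REPAIR_ATTEMPTS) -> Optional[str]:
    """Return compilable source, or None if it still fails after all repair rounds."""
    error = try_compile(source)
    attempts = 0
    while error is not None and attempts < max_attempts:
        source = repair(source, error)
        attempts += 1
        error = try_compile(source)
    return source if error is None else None


def verify_all(sources: List[str], try_compile: CompileFn, repair: RepairFn) -> List[Optional[str]]:
    """Process mutators in parallel, one independent repair loop per mutator."""
    with ThreadPoolExecutor(max_workers=WORKERS) as pool:
        return list(pool.map(lambda s: compile_with_repair(s, try_compile, repair), sources))


# Toy stand-ins: "compilation" fails when the trailing semicolon is missing.
def toy_compile(src: str) -> Optional[str]:
    return None if src.endswith(";") else "missing semicolon"


def toy_repair(src: str, error: str) -> str:
    return src + ";"


results = verify_all(["int x = 1", "int y = 2;"], toy_compile, toy_repair)
print(results)
```

Bounding the loop keeps a mutator that the repair step cannot fix from consuming a worker indefinitely; such mutators are simply dropped (`None`).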
3. Stage 3: Fast Semantic Verification (stage3_fast_verify.py)
- Input: Compilable mutators + test seeds
- Output: Validation results + repair datasets
- Process:
  - Phase 1: Quick verification with 200 seeds
  - Phase 2: Extends to 500 seeds if pass rate < 90%
  - Phase 3: Extends to 1000 seeds if zero triggers
  - Repair loop: LLM-driven repair for failed mutators (max 3 attempts)
- Validation Criteria:
  - Pass: Trigger rate > 0 AND pass rate ≥ 90% (200 seeds) OR ≥ 80% (extended seeds)
  - Fail: Pass rate not met OR zero triggers
- Output Structure:
lists/
├── success_list_v2.txt   # Successfully validated mutators
└── fail_list_v2.txt      # Failed mutators
refine_dataset/           # Detailed repair datasets (JSON)
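The validation criteria can be written down as a small decision function. This is a sketch of one reading of the rules, not the logic of stage3_fast_verify.py: it assumes the pass rate is measured over the seeds on which the mutator triggered, and that a zero-trigger run is extended before the pass rate is considered. All function names are hypothetical.

```python
from typing import Optional


def threshold_percent(seeds_used: int) -> int:
    """90% for the initial 200-seed run, 80% once the seed set has been extended."""
    return 90 if seeds_used <= 200 else 80


def verdict(seeds_used: int, triggered: int, passed: int) -> str:
    """Pass only if the mutator triggered at least once and meets the pass-rate bar."""
    if triggered == 0:
        return "fail"
    # Integer arithmetic avoids floating-point rounding at the threshold.
    meets_bar = passed * 100 >= threshold_percent(seeds_used) * triggered
    return "pass" if meets_bar else "fail"


def next_seed_budget(seeds_used: int, triggered: int, passed: int) -> Optional[int]:
    """Return the extended seed count to try next, or None if no extension applies."""
    if triggered == 0:
        return 1000 if seeds_used < 1000 else None
    if seeds_used == 200 and passed * 100 < 90 * triggered:
        return 500
    return None


print(verdict(200, 50, 45), verdict(200, 100, 85), verdict(500, 100, 85))
```

Under this reading, a mutator with an 85% pass rate fails the quick check, is extended to 500 seeds, and can then pass against the relaxed 80% bar.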
Running the Pipeline
# Step 1: Generate mutators from refactoring commits
python stage1_generator.py
# Step 2: Verify and repair compilation errors
python stage2_compilation_verification.py [start_chunk]
# Step 3: Validate semantic preservation
python stage3_fast_verify.py
Environment Requirements
- Java 17 or higher (Maven compilation target)
- Python 3.8+ (for analysis scripts in Semantic_Equivalence_Knowledge_Base)
- Maven 3.6+ for building the project
- 8GB RAM minimum, 16GB RAM recommended
- 20GB free disk space for generated mutants and results
Quick Start with the Core Framework Package
Step 1: Extract and Setup
# Extract the ZIP file
unzip SAFuzzer_Core_Framework_*.zip
cd SAFuzzer_Core_Framework
# Make scripts executable
chmod +x run_complete_pipeline.sh test_pipeline_quick.sh
chmod +x Semantic_Equivalence_Knowledge_Base/run_pipeline.sh
Step 2: Install SAST Tools
Before running SAFuzzer, you need to install the SAST tools. Follow the instructions in tools/README.md to download and install:
- SpotBugs 4.9.8
- PMD 7.22.0
- CheckStyle 13.3.0
- Infer 1.2.0
- SonarQube Scanner 8.0.1
Step 3: Configure Tool Paths
# Copy the configuration template
cp config.properties.template config.properties
# Edit config.properties with your tool paths
nano config.properties # or use your favorite editor
Update the paths in config.properties:
spotbugs.jar.path=/absolute/path/to/spotbugs-4.9.8/lib/spotbugs.jar
pmd.cli.path=/absolute/path/to/pmd-bin-7.22.0/bin/pmd
checkstyle.jar.path=/absolute/path/to/checkstyle-13.3.0-all.jar
infer.cli.path=/absolute/path/to/infer-linux-x86_64-v1.2.0/bin/infer
sonar.scanner.path=/absolute/path/to/sonar-scanner-8.0.1.6346-linux-x64/bin/sonar-scanner
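Before building, it can help to confirm that every tool path has been filled in. The sketch below parses simple `key=value` lines and reports missing keys; it is a hypothetical helper, not part of SAFuzzer, and it deliberately ignores the rest of the Java .properties syntax (colon separators, line continuations, escapes).

```python
from typing import Dict, List

# Keys taken from the configuration listing above.
REQUIRED_KEYS = [
    "spotbugs.jar.path",
    "pmd.cli.path",
    "checkstyle.jar.path",
    "infer.cli.path",
    "sonar.scanner.path",
]


def load_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blanks and '#'/'!' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def missing_keys(props: Dict[str, str]) -> List[str]:
    """Return required keys that are absent or left empty."""
    return [key for key in REQUIRED_KEYS if not props.get(key)]


sample = """
# tool paths
spotbugs.jar.path=/absolute/path/to/spotbugs-4.9.8/lib/spotbugs.jar
pmd.cli.path=
"""
print(missing_keys(load_properties(sample)))
```

A further check that each configured path exists on disk (for example with `os.path.exists`) would catch typos as well as omissions.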
Step 4: Build the Project
# Build main SAFuzzer framework
mvn clean compile package
# Build MutatorExecutor (for semantic validation)
cd MutatorExecutor
mvn clean compile
cd ..
Step 5: Install Python Dependencies
# Install required Python packages
pip install -r Semantic_Equivalence_Knowledge_Base/requirements.txt
Step 6: Run Quick Verification Test
# Test if everything works correctly
./test_pipeline_quick.sh
If all tests pass, you're ready to run the full pipeline!
Running the Complete Pipeline
Option A: Run All Three Stages (Recommended)
# This runs the complete SAFuzzer pipeline end-to-end
./run_complete_pipeline.sh
The script will:
- Check environment and dependencies
- Build the project if needed
- Run the Semantic Equivalence Knowledge Base pipeline (Stage 1)
- Validate mutators using MutatorExecutor (Stage 2)
- Test SAST tools with validated mutators (Stage 3)
- Generate results and summary
Option B: Run Individual Stages
Stage 1: Mutator Invention (Pattern Mining)
cd Semantic_Equivalence_Knowledge_Base
./run_pipeline.sh
This stage mines refactoring patterns from GitHub commits. Note: This requires GitHub API access and may take several hours.
Stage 2: Mutator Refinement (Semantic Validation)
cd MutatorExecutor
mvn compile
# The validation is integrated into Stage 1 pipeline
Stage 3: Testing Static Analyzers
# Test a specific test case with SpotBugs
java -cp "target/SASTFuzz-1.0-.jar:target/classes:target/dependency/*" \
com.mutation.Main \
--project_path "." \
--target_case "seeds.PMD_Seeds.bestpractices_AccessorClassGeneration.AccessorClassGeneration1" \
--target_SAST "SpotBugs" \
--max_iter 10
# Test all SAST tools on a test case
java -cp "target/SASTFuzz-1.0-.jar:target/classes:target/dependency/*" \
com.mutation.Main \
--project_path "." \
--target_case "seeds.SpotBugs_Seeds.bestpractices_ArrayIsStoredDirectly.ArrayIsStoredDirectly1" \
--target_SAST "ALL" \
--max_iter 20
Command Line Parameters
--project_path <arg> Source code root directory (required)
--target_case <arg> Target Java class (package.ClassName format) (required)
--target_SAST <arg> SAST tool to test: SpotBugs, PMD, CheckStyle,
Infer, SonarQube, Semgrep, or ALL (required)
--max_iter <arg> Maximum mutation iterations (default: 50)
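When testing many seeds, the invocation can be assembled programmatically. The sketch below builds the argument list from the documented flags only; the `build_command` helper is hypothetical, the classpath string is copied verbatim from the examples above, and whether the tool validates names case-sensitively is an assumption.

```python
from typing import List

# Classpath copied as written in the usage examples above.
CLASSPATH = "target/SASTFuzz-1.0-.jar:target/classes:target/dependency/*"
TOOLS = ("SpotBugs", "PMD", "CheckStyle", "Infer", "SonarQube", "Semgrep", "ALL")


def build_command(target_case: str, target_sast: str,
                  project_path: str = ".", max_iter: int = 50) -> List[str]:
    """Assemble the java invocation for one seed class and one SAST tool."""
    if target_sast not in TOOLS:
        raise ValueError(f"unknown SAST tool: {target_sast}")
    return [
        "java", "-cp", CLASSPATH, "com.mutation.Main",
        "--project_path", project_path,
        "--target_case", target_case,
        "--target_SAST", target_sast,
        "--max_iter", str(max_iter),  # 50 is the documented default
    ]


cmd = build_command(
    "seeds.PMD_Seeds.bestpractices_AccessorClassGeneration.AccessorClassGeneration1",
    "SpotBugs",
    max_iter=10,
)
print(" ".join(cmd))
```

Passing the list to `subprocess.run(cmd)` avoids shell quoting issues; the `*` in the classpath is expanded by the JVM itself rather than by the shell.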
Output Structure
Results are organized in results/run_YYYYMMDD_HHMMSS/:
- safuzzer_output.log: Complete execution log
- final_results/: Generated mutants and SAST reports
  - 0/: Original seed code with baseline SAST analysis
  - 1..N/: Each iteration's mutated code and SAST results
  - iteration_history.txt: Trace of applied mutators
- verification_summary.txt: Pipeline verification results
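Since each run directory carries its timestamp in the name, the most recent run can be located by parsing that name. The helper names below are hypothetical; only the run_YYYYMMDD_HHMMSS naming pattern comes from the layout above.

```python
from datetime import datetime
from typing import Iterable, Optional


def parse_run_timestamp(name: str) -> Optional[datetime]:
    """Parse a run_YYYYMMDD_HHMMSS directory name; return None for anything else."""
    prefix = "run_"
    if not name.startswith(prefix):
        return None
    try:
        return datetime.strptime(name[len(prefix):], "%Y%m%d_%H%M%S")
    except ValueError:
        return None


def latest_run(names: Iterable[str]) -> Optional[str]:
    """Return the directory name with the most recent timestamp, if any."""
    dated = []
    for name in names:
        stamp = parse_run_timestamp(name)
        if stamp is not None:
            dated.append((stamp, name))
    return max(dated)[1] if dated else None


print(latest_run(["run_20260101_120000", "run_20260320_093000", "notes"]))
```

In practice the names would come from something like `os.listdir("results")`, and the baseline report in `final_results/0/` could then be compared against each numbered iteration.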
Advanced Configuration
Custom Mutator Selection
The framework automatically selects from all available mutators. To modify mutator behavior, edit the Scheduler.run() method in src/com/mutation/Scheduler.java.
Rule Coverage Experiment
Enable JaCoCo coverage measurement in config.properties:
jacoco.enabled=true
jacoco.agent.path=/path/to/jacoco-agent.jar
jacoco.cli.path=/path/to/jacoco-cli.jar
Custom SAST Tool Integration
To integrate a new SAST tool, implement a class that extends the SAST abstract class in src/com/mutation/config/.
4. Detected Bug Case Demonstrations
Case 1: PMD SimplifyConditional False Negative (#6513)
Bug Description: PMD fails to detect a redundant null check before instanceof when additional conditions are interleaved in the && chain by a semantic-preserving mutation.
Original Code (PMD correctly reports SimplifyConditional):
public class SimplifyConditionalDemo {
    public void foo() {
        String s = "a";
        if (s != null && s instanceof String) { // <- SimplifyConditional reported (TP)
            System.out.println(s);
        }
    }
}
Mutated Code (PMD silently misses the bug):
public class SimplifyConditionalDemo {
    public void foo() {
        String s = "a";
        String s2 = "a";
        if (s != null && s2 != null && s instanceof String) { // <- null check still redundant, but NOT reported (FN)
            System.out.println(s);
        }
    }
}
Triggering Mutator: NonNullVarRedundantNullCheckMutator — inserts an additional s2 != null guard into an existing && chain, a common defensive coding pattern that does not change the semantics of the original condition.
Analysis: In both versions the s != null check is completely redundant, since instanceof already handles null by returning false. PMD's SimplifyConditional detector only matches the pattern when the null check and instanceof are directly adjacent in the && chain. Once any intervening condition is inserted between them, the rule fails to trace the relationship and produces a False Negative. The issue was reported on Mar 20, 2026 and is still open.
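The difference between adjacent-only matching and chain-wide matching can be shown with a toy model. This is not PMD's implementation (PMD rules operate on the AST); here the && chain is simply a list of condition strings, and both detector functions are hypothetical.

```python
from typing import List


def adjacent_only(conjuncts: List[str], var: str) -> bool:
    """Flag only when `var != null` is immediately followed by `var instanceof ...`."""
    for left, right in zip(conjuncts, conjuncts[1:]):
        if left == f"{var} != null" and right.startswith(f"{var} instanceof "):
            return True
    return False


def chain_wide(conjuncts: List[str], var: str) -> bool:
    """Flag when any later conjunct in the chain is `var instanceof ...`.

    Only sound if the conjuncts in between do not dereference `var`;
    this toy model does not check that.
    """
    null_check = f"{var} != null"
    if null_check not in conjuncts:
        return False
    start = conjuncts.index(null_check) + 1
    return any(c.startswith(f"{var} instanceof ") for c in conjuncts[start:])


original = ["s != null", "s instanceof String"]
mutated = ["s != null", "s2 != null", "s instanceof String"]
print(adjacent_only(original, "s"), adjacent_only(mutated, "s"))  # True False
print(chain_wide(original, "s"), chain_wide(mutated, "s"))        # True True
```

The adjacent-only matcher reproduces the observed behavior: it flags the original chain but goes silent once an unrelated condition sits between the two checks.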
Case 2: SpotBugs IM_BAD_CHECK_FOR_ODD False Negative (#3886)
Bug Description: SpotBugs fails to detect the incorrect odd-number check pattern when the condition operands are reordered into Yoda-style by a semantic-preserving mutation.
Original Code (SpotBugs correctly reports IM_BAD_CHECK_FOR_ODD):
public class TestModulo {
    public void standardCheck(int i) {
        if (i % 2 == 1) { // <- IM_BAD_CHECK_FOR_ODD reported (TP)
            System.out.println("Odd");
        }
    }
}
Mutated Code (SpotBugs silently misses the bug):
public class TestModulo {
    public void yodaCheck(int i) {
        if (1 == i % 2) { // <- semantically identical, but IM_BAD_CHECK_FOR_ODD NOT reported (FN)
            System.out.println("Odd");
        }
    }
}
Triggering Mutator: IfConditionReorderingMutator — rewrites <expr> == <literal> into the Yoda-style <literal> == <expr>, a common and semantically equivalent code transformation.
Analysis: Both i % 2 == 1 and 1 == i % 2 are semantically identical and share the same bug: this check incorrectly returns false for negative odd integers (e.g., -3 % 2 == -1, not 1). SpotBugs' IM_BAD_CHECK_FOR_ODD detector only matches the canonical operand order and fails to recognize the Yoda variant, resulting in a False Negative. This bug was subsequently fixed via PR #3935.
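The underlying bug is easy to reproduce numerically. Python's `%` floors toward negative infinity, so the sketch below emulates Java's truncating remainder with `math.fmod` to show that both operand orders share the same defect. The function names are illustrative only.

```python
import math


def java_rem(a: int, b: int) -> int:
    """Emulate Java's % operator, whose result takes the sign of the dividend."""
    return int(math.fmod(a, b))


def buggy_is_odd(i: int) -> bool:
    return java_rem(i, 2) == 1   # mirrors `i % 2 == 1`


def yoda_is_odd(i: int) -> bool:
    return 1 == java_rem(i, 2)   # mirrors `1 == i % 2`


def correct_is_odd(i: int) -> bool:
    return java_rem(i, 2) != 0   # mirrors the usual fix `i % 2 != 0`


print(java_rem(-3, 2))                          # -1
print(buggy_is_odd(-3), yoda_is_odd(-3))        # False False
print(correct_is_odd(-3))                       # True
```

Because the two comparisons agree on every input, a detector that flags one form but not the other is inconsistent rather than more precise.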
Case 3: PMD ForLoopCanBeForeach False Negative (#6495)
Bug Description: PMD fails to detect that a traditional index-based for loop can be replaced by an enhanced foreach loop when the array length is first extracted into a pre-declared local variable by a semantic-preserving mutation.
Original Code (PMD correctly reports ForLoopCanBeForeach):
public class PMD_FN_Demo {
    public void testTruePositive(long[] counts) {
        double total = 0;
        for (int i = 0; i < counts.length; i++) { // <- ForLoopCanBeForeach reported (TP)
            total += counts[i];
        }
    }
}
Mutated Code (PMD silently misses the bug):
public class PMD_FN_Demo {
    public void testFalseNegative(long[] counts) {
        double total = 0;
        int len = counts.length; // array length extracted to a local variable
        for (int i = 0; i < len; i++) { // <- semantically identical, but ForLoopCanBeForeach NOT reported (FN)
            total += counts[i];
        }
    }
}
Triggering Mutator: ConditionalBlockInsertionMutator (combined with loop bound extraction) — hoists the array.length expression into a pre-declared local variable, a standard performance-oriented refactoring that does not change loop semantics.
Analysis: Both loops iterate over the entire array in the same order and produce identical results. PMD's ForLoopCanBeForeach rule performs pattern matching on the loop condition and expects i < array.length literally in the for header. When the bound is stored in an intermediate variable len, the rule's detector fails to trace back to the array and misses the violation. A PR (#6521) has been submitted to address this.
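The claim that both loops behave identically is the kind of property that dynamic equivalence checking exercises. Below is a minimal differential-testing sketch over Python translations of the two loops; it returns the total so the result is observable, which the Java demo does not, and all names are hypothetical. Random testing of this kind can reveal a disagreement but cannot prove equivalence.

```python
import random
from typing import Callable, List


def total_original(counts: List[int]) -> float:
    total = 0.0
    i = 0
    while i < len(counts):      # bound read from the array on every iteration
        total += counts[i]
        i += 1
    return total


def total_mutated(counts: List[int]) -> float:
    total = 0.0
    length = len(counts)        # bound hoisted into a local variable
    i = 0
    while i < length:
        total += counts[i]
        i += 1
    return total


def agree_on_random_inputs(f: Callable[[List[int]], float],
                           g: Callable[[List[int]], float],
                           trials: int = 200, seed: int = 0) -> bool:
    """Run both variants on the same random arrays and compare their outputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        data = [rng.randint(-1000, 1000) for _ in range(rng.randint(0, 50))]
        if f(data) != g(data):
            return False
    return True


print(agree_on_random_inputs(total_original, total_mutated))
```

Both variants add the same values in the same order, so exact comparison of the floating-point totals is appropriate here.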
5. Bugs Summary Table
Bug Statistics Overview
The following table summarizes bugs detected across different SAST tools and their current status:
| Issue Status | SpotBugs | PMD | Infer | SonarQube | CheckStyle | Overall |
|---|---|---|---|---|---|---|
| Reported | 18 | 10 | 8 | 4 | 2 | 42 |
| Confirmed | 12 | 6 | 0 | 3 | 1 | 22 |
| Fixed | 2 | 0 | 0 | 0 | 1 | 3 |
| Won't Fix | 1 | 0 | 0 | 0 | 0 | 1 |
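The Overall column can be cross-checked against the per-tool counts with a few lines of arithmetic. The dictionary below simply transcribes the table; the confirmed share assumes the Reported row counts every filed issue.

```python
# Per-tool counts transcribed from the statistics table.
STATS = {
    "Reported":  {"SpotBugs": 18, "PMD": 10, "Infer": 8, "SonarQube": 4, "CheckStyle": 2},
    "Confirmed": {"SpotBugs": 12, "PMD": 6,  "Infer": 0, "SonarQube": 3, "CheckStyle": 1},
    "Fixed":     {"SpotBugs": 2,  "PMD": 0,  "Infer": 0, "SonarQube": 0, "CheckStyle": 1},
    "Won't Fix": {"SpotBugs": 1,  "PMD": 0,  "Infer": 0, "SonarQube": 0, "CheckStyle": 0},
}

# Sum each row to reproduce the Overall column.
totals = {status: sum(per_tool.values()) for status, per_tool in STATS.items()}
confirmed_share = totals["Confirmed"] / totals["Reported"]

print(totals)
print(f"confirmed share: {confirmed_share:.1%}")
```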
Bug Details
| Bug Type | Rule | Status | Issue ID |
|---|---|---|---|
| FN | NN_NAKED_NOTIFY | Reported | #3884 |
| FN | IM_BAD_CHECK_FOR_ODD | Fixed | #3886 |
| FN | ST_WRITE_TO_STATIC_FROM_INSTANCE_METHOD | Confirmed | #3893 |
| FN | UCF_USELESS_CONTROL_FLOW | Confirmed | #3894 |
| FN | RV_RETURN_VALUE_IGNORED_NO_SIDE_EFFECT | Confirmed | #3900 |
| FP | IL_INFINITE_RECURSIVE_LOOP | Confirmed | #3904 |
| FN | SF_SWITCH_NO_DEFAULT | Confirmed | #3905 |
| FN | NULLPTR_DEREFERENCE | Reported | #1992 |
| FN | DIVIDE_BY_ZERO | Reported | #1993 |
| FN | UnconditionalIfStatement | Confirmed | #6435 |
| FP | INFINITE_EXECUTION_TIME | Reported | #2000 |
| FN | NP_LOAD_OF_KNOWN_NULL_VALUE | Confirmed | #3916 |
| FN | NULL_DEREFERENCE | Reported | #2001 |
| FN | S2259 (Null pointers should not be dereferenced) | Reported | #177381 |
| FP | DANGLING_POINTER_DEREFERENCE | Reported | #2002 |
| FN | INFINITE_EXECUTION_TIME | Reported | #2005 |
| FN | RCN_REDUNDANT_COMPARISON_OF_NULL_AND_NONNULL_VALUE | Confirmed | #3920 |
| FP | SA_LOCAL_SELF_ASSIGNMENT | Confirmed | #3929 |
| FN | URF_UNREAD_FIELD | Confirmed | #3955 |
| FN | NULLPTR_DEREFERENCE | Reported | #2015 |
| FN | UselessOverridingMethod | Reported | #6491 |
| FN | CloseResource | Reported | #6494 |
| FN | LeftCurly | Fixed | #19162 |
| FN | ForLoopCanBeForeach | Confirmed | #6495 |
| FN | NP_LOAD_OF_KNOWN_NULL_VALUE | Reported | #3961 |
| FN | NULLPTR_DEREFERENCE | Reported | #2019 |
| FP | CWO_CLOSED_WITHOUT_OPENED | Reported | #3962 |
| FP | IL_INFINITE_RECURSIVE_LOOP | Confirmed | #3963 |
| FN | DM_STRING_TOSTRING | Fixed | #3966 |
| FN | SimplifyConditional | Reported | #6513 |
| FN | SA_FIELD_DOUBLE_ASSIGNMENT | Reported | #3975 |
| FN | UselessPureMethodCall | Confirmed | #6517 |
| FN | NS_NON_SHORT_CIRCUIT | Reported | #3976 |
| FN | UnusedAssignment | Confirmed | #6518 |
| FP | DoNotUseThreads | Confirmed | #6520 |
| FN | SimplifyBooleanReturns | Confirmed | #6519 |
| FN | CollectionTypeMismatch | Reported | #6526 |
| FP | IL_INFINITE_RECURSIVE_LOOP | Reported | #3978 |
| FN | AvoidInstantiatingObjectsInLoops | Reported | #6560 |
| FN | NP_NULL_ON_SOME_PATH | Reported | #3985 |
| FP | INTEGER_OVERFLOW_L2 | Reported | #2027 |
| FN | Inconsistent synchronization | Reported | #3986 |
Files
SAFuzzer_Core_Framework.zip (2.1 MB, md5:3b132dc5407540186783f01e4ca8732e)