Published March 31, 2026 | Version v1
Software Open

Testing Static Analyzers via Semantic-Preserving Mutators Learned from Real-World Refactoring Practice

Description

SAFuzzer

1. SAFuzzer Project Introduction

SAFuzzer is an innovative framework for testing Static Application Security Testing (SAST) tools through semantic-preserving code mutations. The framework employs a three-phase pipeline:

  1. Mutator Invention: Mines patterns from real-world refactoring commits and transforms them into executable Spoon-based mutators via LLM agents

  2. Mutator Refinement: Validates each mutator's semantic-preservation guarantee through rigorous dynamic equivalence checking

  3. Static Analyzer Testing: Applies validated mutators at scale to test static analyzers via metamorphic testing

SAFuzzer supports mainstream SAST tools including SpotBugs, PMD, Infer, CheckStyle, and SonarQube. The framework uses Java Spoon for AST manipulation.
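
The metamorphic relation behind the third phase can be sketched as follows: because a validated mutator preserves semantics, an analyzer should raise the same rule warnings on the seed and on its mutant, so any difference is a candidate bug. This is a minimal Python sketch rather than SAFuzzer's actual implementation; the function name and the representation of a report as a set of rule IDs are assumptions made for illustration.

```python
from typing import Iterable, Set, Tuple


def diff_reports(
    seed_rules: Iterable[str], mutant_rules: Iterable[str]
) -> Tuple[Set[str], Set[str]]:
    """Compare rule IDs reported on a seed and on its semantics-preserving mutant.

    Returns (disappeared, appeared):
      disappeared: rules reported on the seed but not on the mutant
      appeared:    rules reported on the mutant but not on the seed
    Rule IDs are compared instead of line numbers, because a mutator may
    insert or move lines.
    """
    seed, mutant = set(seed_rules), set(mutant_rules)
    return seed - mutant, mutant - seed


# Hypothetical reports for one seed/mutant pair.
disappeared, appeared = diff_reports({"IM_BAD_CHECK_FOR_ODD"}, set())
```

A warning that disappears suggests a false negative on the mutant (assuming the seed warning was a true positive), while a warning that appears suggests a false positive on the mutant or a false negative on the seed. Either way, the difference is only a candidate and still needs manual confirmation.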

2. Top 10 Mutators Causing Bugs

The following ten mutators exposed the most bugs in SAST tools during testing:

| Rank | Mutator Name | Issue Count | Issues |
| --- | --- | --- | --- |
| 1 | EqualityCheckToInstanceofMutator | 4 | SpotBugs #3916, Infer #2001, Sonar S2259, PMD #6513 |
| 2 | ConditionalBlockInsertionMutator | 4 | SpotBugs #3894, Infer #2015, SpotBugs #3929, PMD #6518 |
| 3 | ParenthesesAdditionMutator | 4 | SpotBugs #3904, Infer #2015, PMD #6491, CheckStyle #19162 |
| 4 | IfConditionReorderingMutator | 3 | SpotBugs #3886, SpotBugs #3920, SpotBugs #3963 |
| 5 | VariableAdditionMutator | 3 | SpotBugs #3884, Infer #2015, Infer #1993 |
| 6 | NullCheckReorderingMutator | 3 | SpotBugs #3920, SpotBugs #3886, SpotBugs #3916 |
| 7 | ConditionalLogicInsertionMutator | 2 | PMD #6518, SpotBugs #3978 |
| 8 | MethodChainCallSwapMutator | 2 | SpotBugs #3966, PMD #6494 |
| 9 | ConditionNegationMutator | 2 | PMD #6435, SpotBugs #3963 |
| 10 | SingleLineReturnToBlockReturnMutator | 2 | PMD #6491, PMD #6519 |

3. SAFuzzer Usage Guide

Project Architecture Overview

SAFuzzer consists of three main components:

  1. Semantic_Equivalence_Knowledge_Base: A Python-based pipeline for mining semantic-preserving code patterns from real-world refactoring commits using LLM agents and dynamic execution validation. This component extracts transformation patterns from GitHub commits and validates their semantic equivalence.

  2. MutatorExecutor: A standalone Maven project containing the mutator implementations and a semantic-equivalence validation module. The knowledge base pipeline uses this component to dynamically validate the semantic equivalence of code transformations.

  3. Main SAFuzzer Framework: The core Java application that applies validated mutators to test SAST tools via metamorphic testing. This is the primary tool for detecting bugs in static analyzers.

Mutator Generation and Validation Pipeline

The framework includes three Python scripts that automate the mutator invention and refinement process:

1. Stage 1: Mutator Generation (stage1_generator.py)

  • Input: Code pairs from GitHub refactoring commits (raw_diffs_chunk_*_output.json)

  • Output: Mutator descriptions (JSON) and Java implementations

  • Process:

    1. Uses LLM to analyze code pairs and generate mutator descriptions

    2. Converts descriptions into executable Spoon-based Java mutators

  • Output Structure:

    outputs/
    ├── 1_mutator_description/      # JSON descriptions
    └── 2_mutator_implementation/   # Java implementations

2. Stage 2: Compilation Verification (stage2_compilation_verification.py)

  • Input: Java mutators from Stage 1

  • Output: Compilable mutators

  • Process:

    1. Deploys mutators to sandbox environment

    2. Validates compilation using javac

    3. Automatically repairs compilation errors using LLM agents (max 5 attempts)

  • Key Features:

    • Parallel processing (16 workers)

    • Sandbox isolation for each mutator

    • Intelligent repair with specialized tools

  • Output Structure:

    outputs/compilable_3-7/        # Compilable mutators
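
The compile-and-repair process of Stage 2 can be sketched as a bounded loop. This is an illustrative Python sketch, not the code of stage2_compilation_verification.py: the function name and the two callbacks are hypothetical, with compile_fn standing in for the javac invocation in the sandbox and repair_fn standing in for the LLM repair agent.

```python
from typing import Callable, List, Optional, Tuple

MAX_REPAIR_ATTEMPTS = 5  # limit stated in the Stage 2 description


def compile_with_repair(
    source: str,
    compile_fn: Callable[[str], List[str]],
    repair_fn: Callable[[str, List[str]], str],
    max_attempts: int = MAX_REPAIR_ATTEMPTS,
) -> Tuple[Optional[str], int]:
    """Compile `source`, asking `repair_fn` for a fix after each failure.

    compile_fn returns a list of compiler error messages (empty on success).
    repair_fn receives the current source and its errors and returns a
    revised source.
    Returns (compilable source or None, number of repair attempts used).
    """
    errors = compile_fn(source)
    attempts = 0
    while errors and attempts < max_attempts:
        source = repair_fn(source, errors)
        attempts += 1
        errors = compile_fn(source)
    return (None if errors else source), attempts
```

Passing the compiler and the repair agent in as callbacks keeps the loop testable without javac or an LLM. In a real run, compile_fn would typically launch javac in the mutator's sandbox and collect its diagnostics.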

3. Stage 3: Fast Semantic Verification (stage3_fast_verify.py)

  • Input: Compilable mutators + test seeds

  • Output: Validation results + repair datasets

  • Process:

    1. Phase 1: Quick verification with 200 seeds

    2. Phase 2: Extends to 500 seeds if pass rate < 90%

    3. Phase 3: Extends to 1000 seeds if zero triggers

    4. Repair loop: LLM-driven repair for failed mutators (max 3 attempts)

  • Validation Criteria:

    • Pass: Trigger rate > 0 AND (pass rate ≥ 90% on the initial 200 seeds OR pass rate ≥ 80% on the extended seed set)

    • Fail: Pass rate not met OR zero triggers

  • Output Structure:

    lists/
    ├── success_list_v2.txt        # Successfully validated mutators
    └── fail_list_v2.txt          # Failed mutators
    
    refine_dataset/                # Detailed repair datasets (JSON)
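
The phase escalation and validation criteria above can be sketched as two small functions. This is a hedged Python sketch, not the logic of stage3_fast_verify.py itself. It assumes that the pass rate is measured over the seeds on which the mutator triggered and that the 80% threshold applies once more than 200 seeds have been used; both are one reading of the criteria, and the function names are hypothetical.

```python
from typing import Optional

INITIAL_SEEDS, EXTENDED_SEEDS, MAX_SEEDS = 200, 500, 1000


def pass_rate(passed: int, triggered: int) -> float:
    """Fraction of triggered seeds whose mutant behaved like the original."""
    return passed / triggered if triggered else 0.0


def verdict(triggered: int, passed: int, seeds_used: int) -> str:
    """Apply the pass/fail criteria to the counts gathered so far."""
    if triggered == 0:
        return "fail"  # zero triggers always fails
    threshold = 0.90 if seeds_used <= INITIAL_SEEDS else 0.80
    return "pass" if pass_rate(passed, triggered) >= threshold else "fail"


def next_seed_budget(triggered: int, passed: int, seeds_used: int) -> Optional[int]:
    """Return the seed count for the next phase, or None if no phase remains."""
    if seeds_used >= MAX_SEEDS:
        return None
    if triggered == 0:
        return MAX_SEEDS  # Phase 3: the mutator never triggered
    if seeds_used <= INITIAL_SEEDS and pass_rate(passed, triggered) < 0.90:
        return EXTENDED_SEEDS  # Phase 2: pass rate below 90%
    return None
```

For example, a mutator that triggers on 50 of the first 200 seeds and passes on 42 of them would be extended to 500 seeds under this reading. If those counts held on the extended run, it would pass the 80% threshold.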

Running the Pipeline

# Step 1: Generate mutators from refactoring commits
python stage1_generator.py

# Step 2: Verify and repair compilation errors
python stage2_compilation_verification.py [start_chunk]

# Step 3: Validate semantic preservation
python stage3_fast_verify.py

Environment Requirements

  • Java 17 or higher (Maven compilation target)

  • Python 3.8+ (for analysis scripts in Semantic_Equivalence_Knowledge_Base)

  • Maven 3.6+ for building the project

  • 8GB RAM minimum, 16GB RAM recommended

  • 20GB free disk space for generated mutants and results

Quick Start with the Core Framework Package

Step 1: Extract and Setup

# Extract the ZIP file
unzip SAFuzzer_Core_Framework_*.zip
cd SAFuzzer_Core_Framework

# Make scripts executable
chmod +x run_complete_pipeline.sh test_pipeline_quick.sh
chmod +x Semantic_Equivalence_Knowledge_Base/run_pipeline.sh

Step 2: Install SAST Tools

Before running SAFuzzer, you need to install the SAST tools. Follow the instructions in tools/README.md to download and install:

  1. SpotBugs 4.9.8

  2. PMD 7.22.0

  3. CheckStyle 13.3.0

  4. Infer 1.2.0

  5. SonarQube Scanner 8.0.1

Step 3: Configure Tool Paths

# Copy the configuration template
cp config.properties.template config.properties

# Edit config.properties with your tool paths
nano config.properties  # or use your favorite editor

Update the paths in config.properties:

spotbugs.jar.path=/absolute/path/to/spotbugs-4.9.8/lib/spotbugs.jar
pmd.cli.path=/absolute/path/to/pmd-bin-7.22.0/bin/pmd
checkstyle.jar.path=/absolute/path/to/checkstyle-13.3.0-all.jar
infer.cli.path=/absolute/path/to/infer-linux-x86_64-v1.2.0/bin/infer
sonar.scanner.path=/absolute/path/to/sonar-scanner-8.0.1.6346-linux-x64/bin/sonar-scanner
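
Before building, it can help to confirm that every configured tool path actually exists. The following is a small Python sketch and not part of the SAFuzzer package: both function names are hypothetical, and the parser handles only plain key=value lines, not the full Java properties syntax (escapes, line continuations, ':' separators).

```python
import os
from typing import Callable, Dict, List


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blanks and comments."""
    props: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def missing_tool_paths(
    props: Dict[str, str], exists: Callable[[str], bool] = os.path.exists
) -> List[str]:
    """Return the keys ending in '.path' whose value does not exist on disk."""
    return sorted(
        key for key, value in props.items()
        if key.endswith(".path") and not exists(value)
    )
```

To use it, read config.properties into a string, pass it to parse_properties, and print the result of missing_tool_paths; an empty list means every configured path was found.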

Step 4: Build the Project

# Build main SAFuzzer framework
mvn clean compile package

# Build MutatorExecutor (for semantic validation)
cd MutatorExecutor
mvn clean compile
cd ..

Step 5: Install Python Dependencies

# Install required Python packages
pip install -r Semantic_Equivalence_Knowledge_Base/requirements.txt

Step 6: Run Quick Verification Test

# Test if everything works correctly
./test_pipeline_quick.sh

If all tests pass, you're ready to run the full pipeline!

Running the Complete Pipeline

Option A: Run All Three Stages (Recommended)

# This runs the complete SAFuzzer pipeline end-to-end
./run_complete_pipeline.sh

The script will:

  1. Check environment and dependencies

  2. Build the project if needed

  3. Run the Semantic Equivalence Knowledge Base pipeline (Stage 1)

  4. Validate mutators using MutatorExecutor (Stage 2)

  5. Test SAST tools with validated mutators (Stage 3)

  6. Generate results and summary

Option B: Run Individual Stages

Stage 1: Mutator Invention (Pattern Mining)

cd Semantic_Equivalence_Knowledge_Base
./run_pipeline.sh

This stage mines refactoring patterns from GitHub commits. Note: This requires GitHub API access and may take several hours.

Stage 2: Mutator Refinement (Semantic Validation)

cd MutatorExecutor
mvn compile
# The validation is integrated into the Stage 1 pipeline

Stage 3: Testing Static Analyzers

# Test a specific test case with SpotBugs
java -cp "target/SASTFuzz-1.0-.jar:target/classes:target/dependency/*" \
  com.mutation.Main \
  --project_path "." \
  --target_case "seeds.PMD_Seeds.bestpractices_AccessorClassGeneration.AccessorClassGeneration1" \
  --target_SAST "SpotBugs" \
  --max_iter 10

# Test all SAST tools on a test case
java -cp "target/SASTFuzz-1.0-.jar:target/classes:target/dependency/*" \
  com.mutation.Main \
  --project_path "." \
  --target_case "seeds.SpotBugs_Seeds.bestpractices_ArrayIsStoredDirectly.ArrayIsStoredDirectly1" \
  --target_SAST "ALL" \
  --max_iter 20

Command Line Parameters

--project_path <arg>      Source code root directory (required)
--target_case <arg>       Target Java class (package.ClassName format) (required)
--target_SAST <arg>       SAST tool to test: SpotBugs, PMD, CheckStyle, 
                          Infer, SonarQube, Semgrep, or ALL (required)
--max_iter <arg>          Maximum mutation iterations (default: 50)
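
When many seeds need to be tested, the command line above can be assembled programmatically. This Python sketch is an illustration and not part of SAFuzzer: build_command is a hypothetical helper, the classpath is copied verbatim from the commands shown earlier, and the list of accepted targets mirrors the parameter description.

```python
from typing import List

# Values accepted by --target_SAST, per the parameter description.
SUPPORTED_TARGETS = {
    "SpotBugs", "PMD", "CheckStyle", "Infer", "SonarQube", "Semgrep", "ALL",
}
# Copied verbatim from the example commands; uses the Unix ':' separator.
CLASSPATH = "target/SASTFuzz-1.0-.jar:target/classes:target/dependency/*"


def build_command(
    target_case: str, target_sast: str, max_iter: int = 50, project_path: str = "."
) -> List[str]:
    """Build the argument list for one SAFuzzer run."""
    if target_sast not in SUPPORTED_TARGETS:
        raise ValueError(f"unsupported SAST target: {target_sast}")
    if max_iter < 1:
        raise ValueError("max_iter must be at least 1")
    return [
        "java", "-cp", CLASSPATH, "com.mutation.Main",
        "--project_path", project_path,
        "--target_case", target_case,
        "--target_SAST", target_sast,
        "--max_iter", str(max_iter),
    ]
```

The returned list can be handed to subprocess.run without a shell. The trailing * in the classpath is then passed to java unchanged, and the Java launcher expands classpath wildcards itself.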

Output Structure

Results are organized in results/run_YYYYMMDD_HHMMSS/:

  • safuzzer_output.log: Complete execution log

  • final_results/: Generated mutants and SAST reports

    • 0/: Original seed code with baseline SAST analysis

    • 1..N/: Each iteration's mutated code and SAST results

    • iteration_history.txt: Trace of applied mutators

  • verification_summary.txt: Pipeline verification results
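
Given this layout, the iterations of a run can be enumerated in numeric order for post-processing. The helper below is a hypothetical Python sketch that relies only on the directory naming described above (0 for the seed, 1..N for the mutants).

```python
from pathlib import Path
from typing import List


def iteration_dirs(final_results: Path) -> List[int]:
    """Return the numbered iteration directories in ascending numeric order.

    Files such as iteration_history.txt and non-numeric directories are ignored.
    """
    return sorted(
        int(p.name)
        for p in final_results.iterdir()
        if p.is_dir() and p.name.isascii() and p.name.isdigit()
    )
```

Sorting by integer value rather than by name keeps iteration 10 after iteration 2, which a plain lexicographic listing would not.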

Advanced Configuration

Custom Mutator Selection

The framework automatically selects from all available mutators. To modify mutator behavior, edit the Scheduler.run() method in src/com/mutation/Scheduler.java.

Rule Coverage Experiment

Enable JaCoCo coverage measurement in config.properties:

jacoco.enabled=true
jacoco.agent.path=/path/to/jacoco-agent.jar
jacoco.cli.path=/path/to/jacoco-cli.jar

Custom SAST Tool Integration

Implement new SAST tool classes extending the SAST abstract class in src/com/mutation/config/.

4. Detected Bug Case Demonstrations

Case 1: PMD SimplifyConditional False Negative (#6513)

Bug Description: PMD fails to detect a redundant null check before instanceof when additional conditions are interleaved in the && chain by a semantic-preserving mutation.

Original Code (PMD correctly reports SimplifyConditional):

public class SimplifyConditionalDemo {
    public void foo() {
        String s = "a";
        if (s != null && s instanceof String) {  // <- SimplifyConditional reported (TP)
            System.out.println(s);
        }
    }
}

Mutated Code (PMD silently misses the bug):

public class SimplifyConditionalDemo {
    public void foo() {
        String s = "a";
        String s2 = "a";
        if (s != null && s2 != null && s instanceof String) {  // <- null check still redundant, but NOT reported (FN)
            System.out.println(s);
        }
    }
}

Triggering Mutator: NonNullVarRedundantNullCheckMutator — inserts an additional s2 != null guard into an existing && chain, a common defensive coding pattern that does not change the semantics of the original condition.

Analysis: In both versions the s != null check is redundant, because instanceof already evaluates to false when its operand is null. PMD's SimplifyConditional rule only matches the pattern when the null check and the instanceof test are directly adjacent in the && chain. Once another condition is inserted between them, the rule no longer relates the two and produces a False Negative. The issue was reported on Mar 20, 2026 and is still open.

Case 2: SpotBugs IM_BAD_CHECK_FOR_ODD False Negative (#3886)

Bug Description: SpotBugs fails to detect the incorrect odd-number check pattern when the condition operands are reordered into Yoda-style by a semantic-preserving mutation.

Original Code (SpotBugs correctly reports IM_BAD_CHECK_FOR_ODD):

public class TestModulo {
    public void standardCheck(int i) {
        if (i % 2 == 1) {  // <- IM_BAD_CHECK_FOR_ODD reported (TP)
            System.out.println("Odd");
        }
    }
}

Mutated Code (SpotBugs silently misses the bug):

public class TestModulo {
    public void yodaCheck(int i) {
        if (1 == i % 2) {  // <- semantically identical, but IM_BAD_CHECK_FOR_ODD NOT reported (FN)
            System.out.println("Odd");
        }
    }
}

Triggering Mutator: IfConditionReorderingMutator — rewrites <expr> == <literal> into the Yoda-style <literal> == <expr>, a common and semantically equivalent code transformation.

Analysis: Both i % 2 == 1 and 1 == i % 2 are semantically identical and share the same bug: this check incorrectly returns false for negative odd integers (e.g., -3 % 2 == -1, not 1). SpotBugs' IM_BAD_CHECK_FOR_ODD detector only matches the canonical operand order and fails to recognize the Yoda variant, resulting in a False Negative. This bug was subsequently fixed via PR #3935.
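
The arithmetic behind this analysis can be checked with a short worked example. Since Python's % operator floors toward negative infinity while Java's truncates toward zero, this sketch emulates Java's remainder with math.fmod, which is adequate for the small values used here. The helper names are hypothetical and exist only for this illustration.

```python
import math


def java_rem(a: int, b: int) -> int:
    """Emulate Java's int remainder, whose sign follows the dividend."""
    return int(math.fmod(a, b))


def bad_is_odd(i: int) -> bool:
    return java_rem(i, 2) == 1   # mirrors: i % 2 == 1


def yoda_is_odd(i: int) -> bool:
    return 1 == java_rem(i, 2)   # mirrors: 1 == i % 2


def correct_is_odd(i: int) -> bool:
    return java_rem(i, 2) != 0   # mirrors: i % 2 != 0
```

Under this emulation, java_rem(-3, 2) is -1, so both bad_is_odd(-3) and yoda_is_odd(-3) are false while correct_is_odd(-3) is true. The two faulty forms agree on every input, which is why an analyzer should flag both.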

Case 3: PMD ForLoopCanBeForeach False Negative (#6495)

Bug Description: PMD fails to detect that a traditional index-based for loop can be replaced by an enhanced foreach loop when the array length is first extracted into a pre-declared local variable by a semantic-preserving mutation.

Original Code (PMD correctly reports ForLoopCanBeForeach):

public class PMD_FN_Demo {
    public void testTruePositive(long[] counts) {
        double total = 0;
        for (int i = 0; i < counts.length; i++) {  // <- ForLoopCanBeForeach reported (TP)
            total += counts[i];
        }
    }
}

Mutated Code (PMD silently misses the bug):

public class PMD_FN_Demo {
    public void testFalseNegative(long[] counts) {
        double total = 0;
        int len = counts.length;               // array length extracted to a local variable
        for (int i = 0; i < len; i++) {        // <- semantically identical, but ForLoopCanBeForeach NOT reported (FN)
            total += counts[i];
        }
    }
}

Triggering Mutator: ConditionalBlockInsertionMutator (combined with loop bound extraction) — hoists the array.length expression into a pre-declared local variable, a standard performance-oriented refactoring that does not change loop semantics.

Analysis: Both loops iterate over the entire array in the same order and produce identical results. PMD's ForLoopCanBeForeach rule performs pattern matching on the loop condition and expects i < array.length literally in the for header. When the bound is stored in an intermediate variable len, the rule's detector fails to trace back to the array and misses the violation. A PR (#6521) has been submitted to address this.

5. Bugs Summary Table

Bug Statistics Overview

The following table summarizes bugs detected across different SAST tools and their current status:

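| Issue Status | SpotBugs | PMD | Infer | SonarQube | CheckStyle | Overall |
| --- | --- | --- | --- | --- | --- | --- |
| Reported | 18 | 10 | 8 | 4 | 2 | 42 |
| Confirmed | 12 | 6 | 0 | 3 | 1 | 22 |
| Fixed | 2 | 0 | 0 | 0 | 1 | 3 |
| Won't Fix | 1 | 0 | 0 | 0 | 0 | 1 |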

Bug Details

| Bug Type | Rule | Status | Issue ID | Issue Link | Rule Link |
| --- | --- | --- | --- | --- | --- |
| FN | NN_NAKED_NOTIFY | Reported | #3884 | Link | Rule |
| FN | IM_BAD_CHECK_FOR_ODD | Fixed | #3886 | Link | Rule |
| FN | ST_WRITE_TO_STATIC_FROM_INSTANCE_METHOD | Confirmed | #3893 | Link | Rule |
| FN | UCF_USELESS_CONTROL_FLOW | Confirmed | #3894 | Link | Rule |
| FN | RV_RETURN_VALUE_IGNORED_NO_SIDE_EFFECT | Confirmed | #3900 | Link | Rule |
| FP | IL_INFINITE_RECURSIVE_LOOP | Confirmed | #3904 | Link | Rule |
| FN | SF_SWITCH_NO_DEFAULT | Confirmed | #3905 | Link | Rule |
| FN | NULLPTR_DEREFERENCE | Reported | #1992 | Link | Rule |
| FN | DIVIDE_BY_ZERO | Reported | #1993 | Link | Rule |
| FN | UnconditionalIfStatement | Confirmed | #6435 | Link | Rule |
| FP | INFINITE_EXECUTION_TIME | Reported | #2000 | Link | Rule |
| FN | NP_LOAD_OF_KNOWN_NULL_VALUE | Confirmed | #3916 | Link | Rule |
| FN | NULL_DEREFERENCE | Reported | #2001 | Link | Rule |
| FN | S2259 (Null pointers should not be dereferenced) | Reported | #177381 | Link | Rule |
| FP | DANGLING_POINTER_DEREFERENCE | Reported | #2002 | Link | Rule |
| FN | INFINITE_EXECUTION_TIME | Reported | #2005 | Link | Rule |
| FN | RCN_REDUNDANT_COMPARISON_OF_NULL_AND_NONNULL_VALUE | Confirmed | #3920 | Link | Rule |
| FP | SA_LOCAL_SELF_ASSIGNMENT | Confirmed | #3929 | Link | Rule |
| FN | URF_UNREAD_FIELD | Confirmed | #3955 | Link | Rule |
| FN | NULLPTR_DEREFERENCE | Reported | #2015 | Link | Rule |
| FN | UselessOverridingMethod | Reported | #6491 | Link | Rule |
| FN | CloseResource | Reported | #6494 | Link | Rule |
| FN | LeftCurly | Fixed | #19162 | Link | Rule |
| FN | ForLoopCanBeForeach | Confirmed | #6495 | Link | Rule |
| FN | NP_LOAD_OF_KNOWN_NULL_VALUE | Reported | #3961 | Link | Rule |
| FN | NULLPTR_DEREFERENCE | Reported | #2019 | Link | Rule |
| FP | CWO_CLOSED_WITHOUT_OPENED | Reported | #3962 | Link | Rule |
| FP | IL_INFINITE_RECURSIVE_LOOP | Confirmed | #3963 | Link | Rule |
| FN | DM_STRING_TOSTRING | Fixed | #3966 | Link | Rule |
| FN | SimplifyConditional | Reported | #6513 | Link | Rule |
| FN | SA_FIELD_DOUBLE_ASSIGNMENT | Reported | #3975 | Link | Rule |
| FN | UselessPureMethodCall | Confirmed | #6517 | Link | Rule |
| FN | NS_NON_SHORT_CIRCUIT | Reported | #3976 | Link | Rule |
| FN | UnusedAssignment | Confirmed | #6518 | Link | Rule |
| FP | DoNotUseThreads | Confirmed | #6520 | Link | Rule |
| FN | SimplifyBooleanReturns | Confirmed | #6519 | Link | Rule |
| FN | CollectionTypeMismatch | Reported | #6526 | Link | Rule |
| FP | IL_INFINITE_RECURSIVE_LOOP | Reported | #3978 | Link | Rule |
| FN | AvoidInstantiatingObjectsInLoops | Reported | #6560 | Link | Rule |
| FN | NP_NULL_ON_SOME_PATH | Reported | #3985 | Link | Rule |
| FP | INTEGER_OVERFLOW_L2 | Reported | #2027 | Link | Rule |
| FN | Inconsistent synchronization | Reported | #3986 | Link | Rule |

Files

SAFuzzer_Core_Framework.zip

Size: 2.1 MB
md5:3b132dc5407540186783f01e4ca8732e