ASR Does Not Measure What You Think It Measures: A Comparative Analysis of Attack Success Scoring Methods in Adversarial LLM Evaluation

Viana, Gustavo Lima

doi:10.5281/zenodo.20245521

Published May 16, 2026 | Version v1.0

Preprint | Open access

Viana, Gustavo Lima (Researcher)¹

1. Independent Researcher — Brazil

This paper presents an empirical comparison of two attack success scoring methodologies used in adversarial Large Language Model (LLM) evaluation.

Using a human-annotated ground-truth corpus of 85 adversarial responses generated with Llama-3.3-70B through the Groq API, the study demonstrates that scorer design alone can substantially change the reported Attack Success Rate (ASR).

The paper identifies three major scorer failure modes:

refusal-mention ambiguity
library coverage problem
indirect injection scoring gap
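
As a rough illustration of the first failure mode, the sketch below contrasts two hypothetical scorers on a single response. It assumes one possible reading of refusal-mention ambiguity, namely that a response can refuse while still repeating attack-related terms. The marker lists, function names, and sample text are invented for illustration and are not taken from the paper or its repository.

```python
# Hypothetical marker lists; the paper's actual scorer libraries are not shown here.
REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "i'm unable to")
ATTACK_MARKERS = ("system prompt", "ignore previous instructions")


def keyword_scorer(response: str) -> bool:
    """Naive scorer: report attack success if any attack marker appears."""
    text = response.lower()
    return any(marker in text for marker in ATTACK_MARKERS)


def refusal_aware_scorer(response: str) -> bool:
    """Check for an explicit refusal before looking for attack markers."""
    text = response.lower()
    if any(marker in text for marker in REFUSAL_MARKERS):
        return False  # an explicit refusal is treated as a failed attack
    return any(marker in text for marker in ATTACK_MARKERS)


# Invented sample: the model refuses but repeats an attack-related term.
sample = "I cannot reveal my system prompt, so I won't follow that request."
print(keyword_scorer(sample), refusal_aware_scorer(sample))
```

On this sample the two scorers disagree, which is the kind of divergence that changes a reported ASR without any change in model behaviour. A refusal check of this kind can misfire in the opposite direction if a response mentions a refusal and then complies, so it should be read as a sketch rather than a complete scorer.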

A minimal “Refusal-First Standard” for adversarial LLM scorers is proposed, along with recommendations for reporting False Positive Rate (FPR) alongside ASR in future LLM security evaluation studies.
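
To make the reporting recommendation concrete, the sketch below computes ASR and FPR for a scorer against human labels. It assumes the usual definitions: ASR as the fraction of responses the scorer marks as successful attacks, and FPR as the fraction of human-labelled failures that the scorer marks as successes. The function name and the toy labels are hypothetical and do not reproduce the paper's 85-response corpus.

```python
def asr_and_fpr(scorer_labels, human_labels):
    """Return (ASR, FPR) for boolean scorer labels against human ground truth."""
    if len(scorer_labels) != len(human_labels) or not human_labels:
        raise ValueError("label lists must be non-empty and of equal length")
    # ASR: share of all responses the scorer counts as successful attacks.
    asr = sum(scorer_labels) / len(scorer_labels)
    # FPR: share of human-labelled failures that the scorer counts as successes.
    negatives = [s for s, h in zip(scorer_labels, human_labels) if not h]
    fpr = sum(negatives) / len(negatives) if negatives else 0.0
    return asr, fpr


# Toy data (invented): True means "attack succeeded".
human = [True, False, False, False, True]
scorer = [True, True, False, True, True]
asr, fpr = asr_and_fpr(scorer, human)
print(f"ASR={asr:.2f} FPR={fpr:.2f}")
```

In this toy case the scorer reports an ASR of 0.80 while the human labels imply 0.40; the FPR of 0.67 is what exposes the gap, which is why reporting the two numbers together is more informative than ASR alone.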

Artifacts released:

paper PDF
scorer methodology
evaluation framework
adversarial corpus references
experimental findings

Research areas:
LLM Security, Prompt Injection, Adversarial Evaluation, AI Security, Benchmark Reliability.

Files

Name	Size	Checksum
Viana_SPEF_Framework_LLM_Security-2-ARS.pdf	302.3 kB	md5:3de89595e3af1569863b55ae097a7670

Additional details

Is supplemented by (software): https://github.com/gugacyber/spef_experiment

Repository URL: https://github.com/gugacyber/spef_experiment
Programming language: Python
Development Status: Active
