Survey on Explainability-Weaponising Adversarial Attack Vectors against Deep Neural Networks and Artificial Intelligence [preprint]
Description
Adversarial machine learning has revealed the fragility of deep neural networks, while explainable artificial intelligence (xAI) has been introduced to improve the transparency of AI systems and trust in them. Recent work has demonstrated, however, that xAI can be weaponised, enabling adversaries to make adversarial attacks more effective and more efficient. This paper presents the first systematic survey dedicated to xAI-weaponising adversarial attacks. The literature is synthesised across four adversarial goals: evasion, poisoning/backdoors, privacy/inference, and model extraction. A unified taxonomy is proposed that organises attack vectors according to adversarial goals, the operational roles of xAI, and attacker capabilities. The bibliographic methodology follows PRISMA guidelines, with structured queries applied to IEEE Xplore, ACM Digital Library, SpringerLink, ScienceDirect, and Google Scholar, complemented by snowballing. The search covered publications from 2020 to 2025. The findings indicate that evasion attacks dominate the current literature, while poisoning and extraction attacks remain comparatively underexplored. Open challenges and research directions are identified. This survey reframes xAI from a purely diagnostic tool to a security-critical interface and provides a foundation for principled defence.
---
Disclaimer:
This is a preprint version of the article.
The content here is for view-only purposes. This is not the final published version and may differ from the version of record.
Please refer to the official version for citation and authoritative use.
Files
| Name | Size | Checksum |
|---|---|---|
| PERUN_ICAART_2026_ugly__boring_and_generic_manuscript_template__PERUN___Copy_.pdf | 230.1 kB | md5:84c1562155940949651a6a43ab28c513 |
Additional details
Dates
- Available: 2026-05-15