Adversarial AI Attack Detection: A Novel Approach Using Explainable AI and Deception Mechanisms
Authors/Creators
- Niculae, Maria (1)
- Sachian, Mari-Anais
- Suciu, George (2, 3)
- Stanescu, Vlad Andrei (4)
- Farao, Aristeidis (5)
- Xenakis, Christos (5)
- Xenakis, Dionysis (6)
- Sabazioti, Athanasia (5)
- Lacalle, Ignacio (7)
- Radoglou-Grammatikis, Panagiotis (8, 9, 10)
- Sachpelidis Brozos, Nikolaos (8)
- Lekka, Zacharenia (11)
- Bernardinetti, Giorgio (12)
- Tsiota, Anastasia (13)
- Kalpaktsoglou, Georgios (13)
- Karagiannis, Stylianos (14)
Affiliations
1. Beia Consult International
2. Universitatea Politehnica din Bucuresti, Facultatea de Electronica, Telecomunicatii si Tehnologia Informatiei
3. BEIA Consult International
4. Universitatea Națională de Știință și Tehnologie Politehnica București
5. University of Piraeus
6. National and Kapodistrian University of Athens
7. Universitat Politècnica de València
8. K3Y Ltd
9. MetaMind Innovations P.C.
10. University of Western Macedonia
11. K3Y
12. Consorzio Nazionale Interuniversitario per le Telecomunicazioni
13. Fogus Innovations and Services, Athens, Greece
14. PDMFC
Description
Detecting adversarial AI attacks has emerged as a critical issue as AI systems become integral across industries, from healthcare to finance and transportation. Adversarial attacks exploit weaknesses in machine learning and deep learning models and can cause serious disruptions and severe threats to the integrity of AI operations. In this light, this work focuses on developing robust mechanisms for detecting adversarial inputs in real time, ensuring that AI systems remain resilient against such sophisticated threats. While defenses such as input sanitization, anomaly detection, and adversarial training build on important foundational work, most existing approaches struggle to generalize across attack types or to operate in real time. This work introduces novelty by extending detection capabilities with explainable AI (XAI) and deception mechanisms: adversarial activity is detected through adversarial training combined with honeypots and digital twins, while XAI keeps the detection process transparent.
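As a concrete illustration of the adversarial-training component, the sketch below shows one common formulation, FGSM-based adversarial training in PyTorch. It is a minimal example under assumed names, hyperparameters, and input ranges, not the exact pipeline described in the paper.

```python
# Minimal sketch of FGSM-based adversarial training (one common technique),
# not necessarily the authors' exact method. Assumes PyTorch, image inputs
# scaled to [0, 1], and an illustrative epsilon; all names are placeholders.
import torch
import torch.nn.functional as F

def fgsm_perturb(model, x, y, epsilon=0.03):
    """Craft FGSM adversarial examples by stepping along the sign of the input gradient."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    # Move each input element in the direction that increases the loss,
    # then clamp back to the valid input range.
    return (x_adv + epsilon * x_adv.grad.sign()).clamp(0, 1).detach()

def adversarial_training_step(model, optimizer, x, y, epsilon=0.03):
    """One training step on a 50/50 mix of clean and adversarial examples."""
    model.train()
    x_adv = fgsm_perturb(model, x, y, epsilon)
    optimizer.zero_grad()  # clear gradients accumulated while crafting x_adv
    loss = 0.5 * F.cross_entropy(model(x), y) \
         + 0.5 * F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```

A model hardened this way tends to retain accuracy on perturbed inputs, which is the property a detection pipeline of this kind builds on.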
While honeypots and digital twins serve as decoys, observing attacker behavior within them further strengthens the detection methods. Results so far indicate substantial improvements in detecting adversarial attacks in high-risk AI applications, confirm the efficacy of honeypots in capturing malicious behavior, and show that XAI improves the interpretability and reliability of the detection process. Together, these techniques enhance the robustness of AI systems against adversarial threats. The presented research contributes practical tools for cybersecurity professionals and AI practitioners, offering new insights into AI for cybersecurity. The novelty of the paper lies in its integration of adversarial training, XAI, and deception techniques, which yields a combined, interpretable, and effective method for detecting adversarial AI attacks across industry sectors.
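To make the deception component concrete, the sketch below shows a minimal TCP honeypot that records connection attempts and payloads for later analysis. Real deception platforms (and the digital twins mentioned above) are far richer; the port, log path, and record format here are assumptions for illustration.

```python
# Minimal TCP honeypot sketch: accept connections, capture the first payload,
# and append one JSON record per interaction. Port and log path are placeholders.
import datetime
import json
import socket

def run_honeypot(host="0.0.0.0", port=2222, log_path="honeypot.jsonl"):
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind((host, port))
        srv.listen()
        while True:  # run until interrupted; a sketch, not a production service
            conn, addr = srv.accept()
            with conn:
                conn.settimeout(5)
                try:
                    data = conn.recv(4096)
                except socket.timeout:
                    data = b""
            record = {
                "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
                "src": addr[0],
                "payload": data.decode("utf-8", errors="replace"),
            }
            with open(log_path, "a") as f:
                f.write(json.dumps(record) + "\n")
```

The resulting JSONL log is the kind of behavioral trace that can feed detection models and XAI explanations downstream.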
Files
| Name | Size |
|---|---|
| SMART+CITIES+-+Resilient+Communities+Empowered+by+Collective+Intelligence+-+12+edition+2024+11.09.2025-623-647.pdf (md5:ebcbb953a43cb53f1aee05131e51f7bb) | 319.5 kB |