Published April 8, 2026 | Version v1
Preprint Open

Artificial Intelligence in Cybersecurity: Reproducible Evidence on Vulnerability Discovery and Security Operations

Description

This paper synthesizes publicly available evidence on the role of large language models (LLMs) in cybersecurity operations. Rather than proposing new benchmarks, the work audits existing sources, including Mozilla advisories, peer-reviewed security studies, and regulatory standards, to assess which cybersecurity tasks are currently well supported, weakly supported, or not supported for LLM deployment. The authors argue that current evidence supports LLM-assisted vulnerability analysis, penetration-testing workflows, and incident-response augmentation, but does not support claims of fully autonomous security operations without substantial governance controls. The paper contributes a machine-readable evidence matrix and task-level summary to enable independent verification of its claims.

Files

Artificial Intelligence in Cybersecurity.pdf (248.3 kB)
md5:41fefbd333a5891bf6cf4a30e8a17903

Additional details

Dates

Available
2026

References

  • Gelei Deng, Yi Liu, Víctor Mayoral-Vilches, Peng Liu, Yuekang Li, Yuan Xu, Tianwei Zhang, Yang Liu, Martin Pinzger, and Stefan Rass. PentestGPT: Evaluating and harnessing large language models for automated penetration testing. In 33rd USENIX Security Symposium (USENIX Security 24). USENIX Association, 2024.
  • European Parliament and Council of the European Union. Regulation (EU) 2024/1689 of the European Parliament and of the Council of 13 June 2024 laying down harmonised rules on artificial intelligence (Artificial Intelligence Act). Technical report, Publications Office of the European Union, 2024. CELEX 32024R1689. Accessed 2026-04-03.
  • Michael Howard and David LeBlanc. Writing Secure Code. Microsoft Press, 2nd edition, 2003. ISBN 9780735617223.
  • Diana Kramer, Lambert Rosique, Ajay Narotam, Elie Bursztein, Patrick Gage Kelley, Kurt Thomas, and Allison Woodruff. Integrating large language models into security incident response. In Twenty-First Symposium on Usable Privacy and Security (SOUPS 2025). USENIX Association, 2025. Study with 18 security analysts and 50 real-world incidents. Accessed 2026-04-03.
  • Peiyu Liu, Junming Liu, Lirong Fu, Kangjie Lu, Yifan Xia, Xuhong Zhang, Wenzhi Chen, Haiqin Weng, Shouling Ji, and Wenhai Wang. Exploring ChatGPT's capabilities on vulnerability management. In 33rd USENIX Security Symposium (USENIX Security 24). USENIX Association, 2024. USENIX Security 2024 program entry for the paper. Accessed 2026-04-03.
  • Microsoft Security. Microsoft unveils Microsoft Security Copilot agents and new protections for AI. https://www.microsoft.com/en-us/security/blog/2025/03/24/microsoft-unveils-microsoft-security-copilot-agents-and-new-protections-for-ai/, 2025. Official Microsoft Security Blog post on Security Copilot agents. Accessed 2026-04-03.
  • MITRE. MITRE ATT&CK: Frequently asked questions. https://attack.mitre.org/resources/faq/, 2025. Official overview of ATT&CK as a knowledge base and taxonomy of adversary behavior. Accessed 2026-04-03.
  • Mozilla Foundation. Security vulnerabilities fixed in Firefox 148. https://www.mozilla.org/en-US/security/advisories/mfsa2026-13/, 2026a. Mozilla advisory published February 2026. Accessed 2026-04-03.
  • Mozilla Foundation. Security vulnerabilities fixed in Firefox ESR 140.8. https://www.mozilla.org/security/advisories/mfsa2026-15/, 2026b. Mozilla ESR advisory published February 24, 2026. Accessed 2026-04-03.
  • National Institute of Standards and Technology. AI risk management framework (AI RMF 1.0). https://www.nist.gov/itl/ai-risk-management-framework, 2023. Official NIST framework. Accessed 2026-04-03.
  • Hammond Pearce, Baleegh Ahmad, Benjamin Tan, Brendan Dolan-Gavitt, and Ramesh Karri. Asleep at the keyboard? Assessing the security of GitHub Copilot's code contributions. In 2022 IEEE Symposium on Security and Privacy (SP), pages 754–768, 2022. doi: 10.1109/SP46214.2022.9833571.
  • Yifan Yao, Jiajun Duan, Kaidi Xu, Yutong Cai, Errui Sun, and Yue Zhang. A survey on large language model (LLM) security and privacy: The good, the bad, and the ugly. arXiv preprint arXiv:2312.02003, 2023.