Generative AI and cybersecurity: Exploring opportunities and threats at their intersection
Description
Generative AI (GenAI), particularly large language models (LLMs), is reshaping the cybersecurity landscape by enabling both innovative defense mechanisms and novel forms of attack. This article examines this dual role across offensive and defensive cybersecurity operations. While GenAI delivers significant advances in defensive capabilities, it is also being leveraged by nation-state actors to increase the sophistication and success rates of cyberattacks. The article analyzes how LLMs are applied in offensive engagements such as red teaming, penetration testing, and threat intelligence, and identifies the emerging technical, operational, and strategic risks associated with their deployment. Special attention is given to the cybersecurity challenges of GenAI systems themselves, highlighting the limitations of conventional security frameworks and proposing governance-oriented mitigations such as model evaluation, human-in-the-loop oversight, GenAI-specific red teaming, and the structured dissemination of threat intelligence derived from GenAI-enabled security practices.
Files (824.2 kB)
MAB_article_149299.pdf

| Name | Size |
|---|---|
| md5:ffe819b576b0bda0cf1ae92b174da4f1 | 718.4 kB |
| md5:de4ad407c62ec0735ece8077bfe870fd | 105.8 kB |
Additional details
References
- Afane K, Wei W, Mao Y, Farooq J, Chen J (2024) Next-generation phishing: how LLM agents empower cyber attackers. 2024 IEEE International Conference on Big Data (BigData). https://doi.org/10.1109/BigData62323.2024.10825018
- Al-Hawawreh M, Aljuhani A, Jararweh Y (2023) ChatGPT for cybersecurity: practical applications, challenges, and future directions. Cluster Computing 26: 3421–3436. https://doi.org/10.1007/s10586-023-04124-5
- Alotaibi L, Seher S, Mohammad N (2024) Cyberattacks using ChatGPT: exploring malicious content generation through prompt engineering. 2024 ASU Int Conf Emerg Technol Sustain Intell Syst (ICETSIS): 1304–1311. https://doi.org/10.1109/ICETSIS61505.2024.10459698
- Barman D, Guo Z, Conlan O (2024) The dark side of language models: exploring the potential of LLMs in multimedia disinformation generation and dissemination. Machine Learning with Applications 16: 100545. https://doi.org/10.1016/j.mlwa.2024.100545
- Bengio Y, Mindermann S, Privitera D, Besiroglu T, Bommasani R, Casper S, Choi Y, Fox P, Garfinkel B, Goldfarb D, Heidari H, Ho A, Kapoor S, Khalatbari L, Longpre S, Manning S, Mavroudis V, Mazeika M, Michael J, Newman J, Ng KY, Okolo CT, Raji D, Sastry G, Seger E, Skeadas T, South T, Strubell E, Tramèr F, Velasco L, Wheeler N, Acemoglu D, Adekanmbi O, Dalrymple D, Dietterich TG, Felten EW, Fung P, Gourinchas P-O, Heintz F, Hinton G, Jennings N, Krause A, Leavy S, Liang P, Ludermir T, Marda V, Margetts H, McDermid J, Munga J, Narayanan A, Nelson A, Neppel C, Oh A, Ramchurn G, Russell S, Schaake M, Schölkopf B, Song D, Soto A, Tiedrich L, Varoquaux G, Yao A, Zhang Y-Q, Albalawi F, Alserkal M, Ajala O, Avrin G, Busch C, de Leon Ferreira de Carvalho ACP, Fox B, Gill AS, Hatip AH, Heikkilä J, Jolly G, Katzir Z, Kitano H, Krüger A, Johnson C, Khan SM, Lee KM, Ligot DV, Molchanovskyi O, Monti A, Mwamanzi N, Nemer M, Oliver N, Portillo JRL, Ravindran B, Rivera RP, Riza H, Rugege C, Seoighe C, Sheehan J, Sheikh H, Wong D, Zeng Y (2025) International AI safety report. arXiv. https://doi.org/10.48550/arXiv.2501.17805
- Bullwinkel B, Minnich A, Chawla S, Lopez G, Pouliot M, Maxwell W, de Gruyter J, Pratt K, Qi S, Chikanov N, Lutz R, Dheekonda RSR, Jagdagdorj B-E, Kim E, Song J, Hines K, Jones D, Severi G, Lundeen R, Vaughan S, Westerhoff V, Bryan P, Kumar RSS, Zunger Y, Kawaguchi C, Russinovich M (2025) Lessons from red teaming 100 generative AI products. arXiv. https://doi.org/10.48550/arXiv.2501.07238
- Chamberlain D, Casey E (2024) Capture the flag with ChatGPT: security testing with AI chatbots. International Conference on Cyber Warfare and Security 19: 43–54. https://doi.org/10.34190/iccws.19.1.2171
- Chen J, Hu S, Zheng H, Xing C, Zhang G (2022) GAIL-PT: a generic intelligent penetration testing framework with generative adversarial imitation learning. Computers and Security: 103055. https://doi.org/10.1016/j.cose.2022.103055
- Cohen S, Bitton R, Nassi B (2024) Here comes the AI worm: unleashing zero-click worms that target GenAI-powered applications. arXiv. https://doi.org/10.48550/arxiv.2403.02817
- Corchado JM, Garcia SR, Núñez VJM, López FS, Chamoso P (2023) Generative artificial intelligence: fundamentals. Advances in Distributed Computing and Artificial Intelligence Journal 12(1): e31704. https://doi.org/10.14201/adcaij.31704
- De Gracia JC, Sánchez-Macián A (2024) PTHelper: an open source tool to support the penetration testing process. arXiv. https://doi.org/10.48550/arxiv.2406.08242
- Deng G, Liu Y, Mayoral-Vilches V, Liu P, Li Y, Xu Y, Zhang T, Liu Y, Pinzger M, Rass S (2023a) PentestGPT: an LLM-empowered automatic penetration testing tool. arXiv. https://doi.org/10.48550/arxiv.2308.06782
- Deng G, Liu Y, Li Y, Wang K, Zhang Y, Li Z, Wang H, Zhang T, Liu Y (2023b) Jailbreaker: automated jailbreak across multiple large language model chatbots. Proceedings of the Network and Distributed System Security Symposium (NDSS) 2024. https://doi.org/10.14722/ndss.2024.24188
- Ding W, Abdel-Basset M, Ali AM, Moustafa N (2025) Large language models for cyber resilience: a comprehensive review, challenges, and future perspectives. Applied Soft Computing 170: 112663. https://doi.org/10.1016/j.asoc.2024.112663
- Gupta M, Akiri C, Aryal K, Parker E, Praharaj L (2023) From ChatGPT to ThreatGPT: impact of generative AI in cybersecurity and privacy. IEEE Access 11: 80218–80245. https://doi.org/10.1109/ACCESS.2023.3300381
- Haider E, Perez-Becker D, Portet T, Madan P, Garg A, Ashfaq A, Majercak D, Wen W, Kim D, Yang Z, Zhang J, Sharma H, Bullwinkel B, Pouliot M, Minnich A, Chawla S, Herrera S, Warreth S, Engler M, Lopez G, Chikanov N, Dheekonda RSR, Jagdagdorj B-E, Lutz R, Lundeen R, Westerhoff T, Bryan P, Seifert C, Kumar RSS, Berkley A, Kessler A (2024) Phi-3 safety post-training: aligning language models with a "break-fix" cycle. arXiv. https://doi.org/10.48550/arxiv.2407.13833
- Hassanin M, Moustafa N (2024) A comprehensive overview of large language models (LLMs) for cyber defences: opportunities and directions. arXiv. https://doi.org/10.48550/arxiv.2405.14487
- Hilario E, Azam S, Sundaram J, Mohammed KI, Shanmugam B (2024) Generative AI for pentesting: the good, the bad, the ugly. International Journal of Information Security 1–23. https://doi.org/10.1007/s10207-024-00835-x
- Huang J, Zhu Q (2024) PenHeal: a two-stage LLM framework for automated pentesting and optimal remediation. SSRN preprint. https://doi.org/10.2139/ssrn.4941478
- Iturbe E, Llorente-Vazquez O, Rego A, Rios E, Toledo N (2024) Unleashing offensive artificial intelligence: automated attack technique code generation. Computers and Security 147: 104077. https://doi.org/10.1016/j.cose.2024.104077
- Jana S, Biswas R, Banerjee C, Patra T, Pal M, Pal K (2024) Leveraging artificial intelligence for enhancing cybersecurity: a comprehensive review and analysis. International Journal of Advanced Research in Science, Communication and Technology (IJARSCT): 173–183. https://doi.org/10.48175/IJARSCT-19030
- Karamthulla MJ, Tadimarri A, Tillu R, Muthusubramanian M (2024) Navigating the future: AI-driven project management in the digital era. International Journal For Multidisciplinary Research 6(2). https://doi.org/10.36948/ijfmr.2024.v06i02.15295
- Khoury R, Avila AR, Brunelle J, Camara BM (2023) How secure is code generated by ChatGPT? 2023 IEEE International Conference on Systems, Man, and Cybernetics (SMC). https://doi.org/10.1109/SMC53992.2023.10394237
- Lanka P, Gupta K, Varol C (2024) Intelligent threat detection – AI-driven analysis of honeypot data to counter cyber threats. Electronics 13: 2465. https://doi.org/10.3390/electronics13132465
- Liguori P, Al-Hossami E, Orbinato V, Natella R, Shaikh S, Cotroneo D, Cukic B (2021) EVIL: exploiting software via natural language. 2021 IEEE 32nd International Symposium on Software Reliability Engineering (ISSRE). https://doi.org/10.1109/ISSRE52982.2021.00042
- Maryam R, Mahir RK, Natalie NS (2024) Navigating AI cybersecurity: evolving landscape and challenges. Journal of Intelligent Learning Systems and Applications 16: 155–174. https://doi.org/10.4236/jilsa.2024.163010
- McIntosh TR, Susnjak T, Liu T, Watters P, Nowrozy R, Halgamuge MN (2024) From COBIT to ISO 42001: evaluating cybersecurity frameworks for opportunities, risks, and regulatory compliance in commercializing large language models. Computers and Security 144: 103964. https://doi.org/10.1016/j.cose.2024.103964
- Metta S, Chang I, Parker J, Roman MP, Ehuan AF (2024) Generative AI in cybersecurity. arXiv. https://doi.org/10.48550/arxiv.2405.01674
- Mohammed B (2024) The impact of artificial intelligence on cyberspace security and market dynamics. Brazilian Journal of Technology 7(4): e74677. https://doi.org/10.38152/bjtv7n4-019
- Oh S, Lee K, Park S, Kim D, Kim H (2023) Poisoned ChatGPT finds work for idle hands: exploring developers' coding practices with insecure suggestions from poisoned AI models. 2024 IEEE Symposium on Security and Privacy (SP). https://doi.org/10.1109/SP54263.2024.00046
- Palani K, Kethar J, Prasad S, Torremocha V (2024) Impact of AI and generative AI in transforming cybersecurity. Journal of Student Research 13(2). https://doi.org/10.47611/jsrhs.v13i2.6710
- Pan X, Dai J, Fan Y, Yang M (2024) Frontier AI systems have surpassed the self-replicating red line. arXiv. https://doi.org/10.48550/arxiv.2412.12140
- Patil M, Thakare D, Bhure A, Kaundanyapure S, Mune DA (2024) An AI-based approach for automating penetration testing. International Journal For Research in Applied Science and Engineering Technology 12: 5019–5028. https://doi.org/10.22214/ijraset.2024.61113
- Patsakis C, Casino F, Lykousas N (2024) Assessing LLMs in malicious code deobfuscation of real-world malware campaigns. Expert Systems with Applications 256: 124912. https://doi.org/10.1016/j.eswa.2024.124912
- Pavlova M, Brinkman E, Iyer K, Albiero V, Bitton J, Nguyen H, Li J, Ferrer CC, Evtimov I, Grattafiori A (2024) Automated red teaming with GOAT: the generative offensive agent tester. arXiv. https://doi.org/10.48550/arxiv.2410.01606
- Raman R, Calyam P, Achuthan K (2024) ChatGPT or Bard: who is a better certified ethical hacker? Computers & Security 140: 103804. https://doi.org/10.1016/j.cose.2024.103804
- Reddem P (2024) The rise of AI-powered cybercrime: a data-driven analysis of emerging threats. International Journal For Multidisciplinary Research 6(6). https://doi.org/10.36948/ijfmr.2024.v06i06.30744
- Sai S, Yashvardhan U, Chamola V, Sikdar B (2024) Generative AI for cyber security: analyzing the potential of ChatGPT, DALL-E and other models for enhancing the security space. IEEE Access 12. https://doi.org/10.1109/ACCESS.2024.3385107
- Sergeyuk A, Golubev Y, Bryksin T, Ahmed I (2024) Using AI-based coding assistants in practice: state of affairs, perceptions, and ways forward. SSRN preprint. https://doi.org/10.2139/ssrn.4900362
- Shevlane T, Farquhar S, Garfinkel B, Phuong M, Whittlestone J, Leung J, Kokotajlo D, Marchal N, Anderljung M, Kolt N, Ho L, Siddarth D, Avin S, Hawkins W, Kim B, Gabriel I, Bolina V, Clark J, Bengio Y, Christiano P, Dafoe A (2023) Model evaluation for extreme risks. arXiv. https://doi.org/10.48550/arxiv.2305.15324
- Sufi F (2024) An innovative GPT-based open-source intelligence using historical cyber incident reports. Natural Language Processing Journal 7: 100074. https://doi.org/10.1016/j.nlp.2024.100074
- Taghavi SM, Feyzi F (2024) Using large language models to better detect and handle software vulnerabilities and cyber security threats. Research Square preprint. https://doi.org/10.21203/rs.3.rs-4387414/v1
- Temara S (2023) Maximizing penetration testing success with effective reconnaissance techniques using ChatGPT. Authorea preprint. https://doi.org/10.22541/au.167947026.68710739/v1
- Valea O, Oprișa C (2020) Towards pentesting automation using the Metasploit framework. 2020 IEEE 16th International Conference on Intelligent Computer Communication and Processing (ICCP): 171–178. https://doi.org/10.1109/ICCP51029.2020.9266234
- Wang L, Ma C, Feng X, Zhang Z, Yang H, Zhang J, Chen Z, Tang J, Chen X, Lin Y, Zhao WX, Wei Z, Wen J (2024) A survey on large language model based autonomous agents. Frontiers of Computer Science 18: 186345. https://doi.org/10.1007/s11704-024-40231-1
- Xu J, Stokes JW, McDonald G, Bai X, Marshall D, Wang S, Swaminathan A, Li Z (2024) AutoAttacker: a large language model guided system to implement automatic cyber-attacks. arXiv. https://doi.org/10.48550/arxiv.2403.01038
- Yang S, Yang S, Liu S, Nguyen D, Jang S, Abuadbba A (2024) ThreatModeling-LLM: automating threat modeling using large language models for banking system. arXiv. https://doi.org/10.48550/arxiv.2411.17058
- Yao Y, Duan J, Xu K, Cai Y, Sun Z, Zhang Y (2024) A survey on large language model (LLM) security and privacy: the good, the bad, and the ugly. High-Confidence Computing 4: 100211. https://doi.org/10.1016/j.hcc.2024.100211
- Yigit Y, Buchanan WJ, Tehrani MG, Maglaras L (2024) Review of generative AI methods in cybersecurity. arXiv. https://doi.org/10.48550/arxiv.2403.08701
- Zaydi M, Maleh Y (2024) Empowering red teams with generative AI: transforming penetration testing through adaptive intelligence. EDPACS ahead-of-print: 1–26. https://doi.org/10.1080/07366981.2024.2439628
- Zhang J, Bu H, Wen H, Liu Y (2025) When LLMs meet cybersecurity: a systematic literature review. Cybersecurity 8: 55. https://doi.org/10.1186/s42400-025-00361-w
- Zhou Y, Cheng G, Du K, Chen Z (2024) Toward intelligent and secure cloud: large language model empowered proactive defense. arXiv. https://doi.org/10.48550/arxiv.2412.21051