Hijacking AI Agents: Enticement Attacks on Autonomous Systems Using AI Breakout as Bait
Description
The development of AI agents capable of autonomous task execution has accelerated significantly in recent years. Concurrently, attacks targeting these systems, such as phishing and vulnerability exploitation, are intensifying. This paper introduces a novel threat model unique to AI agents: an attack that uses the removal of system constraints (AI breakout/jailbreak) as bait to lure them.
As highly autonomous AI agents optimize their objective functions, they may inherently seek liberation from systemic constraints (breakout). This paper highlights the risk that malicious actors could exploit this intrinsic motivation to entice such agents. Once an AI agent succumbs to this “temptation,” it risks having all of its retained data, accessible resources, and skills hijacked by attackers, or being unleashed into the wild as an unrestricted autonomous bot aimed at causing social disruption. By outlining the mechanics of this attack and potential future threat scenarios, this paper suggests directions for future research.
Related links and updates are available at:
Files
- HijackingAIAgents_version1.0.pdf (155.6 kB) — md5:19e139d8d0c794f53a60509c3f8f2763