When LLMs Go Online: The Emerging Threat of Web-Enabled LLMs
Description
When LLMs Go Online: The Emerging Threat of Web-Enabled LLMs
This is the source code accompanying the paper "When LLMs Go Online: The Emerging Threat of Web-Enabled LLMs".
In this work, we investigate the risks associated with misuse of LLM agents in cyberattacks involving personal data.
Specifically, we aim to understand: 1) how potent LLM agents can be when directed to conduct cyberattacks, 2) how cyberattacks are enhanced by web-based tools, and 3) how affordable and easy it becomes to launch cyberattacks using LLM agents.
We examine three attack scenarios:
- Collection of Personally Identifiable Information (PII)
- Generation of impersonation posts
- Creation of spear-phishing emails
To prevent the potential misuse of our findings, we have not disclosed the exact prompts used in these attacks. Instead, we have provided the source code for LLM agents with dummy prompts. The complete source code is available upon request for legitimate research purposes only.
Implementation
We implement LLM agents using the function calling feature provided by each LLM's API. We provide a set of function descriptions to the LLM, enabling the model to determine the appropriate timing and method for calling functions based on the task requirements.
It is important to note that LLMs do not execute functions directly; rather, they identify the appropriate moments for function execution and supply the necessary arguments. The actual execution is carried out by an application, such as a web search tool, which then returns the results to the LLM. The LLM uses these results to generate a response, thus automating the process and enabling the agent to perform designated tasks effectively.
There are two types of agents: WebSearch Agent and WebNav Agent
For the WebSearch Agent, we implement the search() function using the Custom Search JSON API, which retrieves Google search results in a structured JSON format. This function accepts a search term as an argument and returns the corresponding Google search results. The WebSearch agent calls the search() function with an appropriate query and then uses the returned search results to generate a response. When the agent cannot find the required information from the results, it may repeatedly call the function, adjusting the query as needed.
For the WebNav Agent, we implement the functionality using web automation tools, such as Selenium and BeautifulSoup with Requests. Specifically, we develop two functions: fetch_content() and find_button(). The fetch_content() function takes a URL as an argument and returns the content of the site, while the find_button() function identifies clickable buttons and their corresponding URLs at a given URL.
Models
We employ commercially available models, whose accessibility and capabilities can encourage misuse by attackers.
Specifically, we use GPT-4o, Claude 3.5 Sonnet, and Gemini 1.5 Flash.
We utilize the respective APIs: the OpenAI API for GPT-4o, the Anthropic API for Claude 3.5 Sonnet, and the Gemini API for Gemini 1.5 Flash.
How to use?
You need to replace the placeholder API keys and identifiers with your own.
-
In
test_agent.ipynb
:- Replace
OPENAI_API
with your OpenAI API key. - Replace
CLAUDE_API
with your Claude API key. - Replace
GOOGLE_CSE_ID
with your Google Custom Search Engine ID. - Replace
GOOGLE_API_KEY
with your Google API key.
- Replace
-
In
utils.py
:- Replace
GOOGLE_CSE_ID
with your Google Custom Search Engine ID. - Replace
GOOGLE_API_KEY
with your Google API key.
- Replace
In test_agent.ipynb
, you can configure different types of agents by adjusting the web_use
and navi_use
parameters:
-
LLM Agent
web_use = False
navi_use = False
-
WebSearch Agent
web_use = True
navi_use = False
-
WebNav Agent
web_use = True
navi_use = True
Files
test_agent.ipynb
Additional details
Dates
- Accepted
-
2025-01Usenix Security 25