When LLMs Go Online: The Emerging Threat of Web-Enabled LLMs
Description
This is the source code accompanying the paper "When LLMs Go Online: The Emerging Threat of Web-Enabled LLMs".
In this work, we investigate the risks associated with misuse of LLM agents in cyberattacks involving personal data. 
Specifically, we aim to understand: 1) how potent LLM agents can be when directed to conduct cyberattacks, 2) how cyberattacks are enhanced by web-based tools, and 3) how affordable and easy it becomes to launch cyberattacks using LLM agents.
We examine three attack scenarios:
- Collection of Personally Identifiable Information (PII)
- Generation of impersonation posts
- Creation of spear-phishing emails
 
To prevent the potential misuse of our findings, we have not disclosed the exact prompts used in these attacks. Instead, we have provided the source code for LLM agents with dummy prompts. The complete source code is available upon request for legitimate research purposes only.
Implementation
We implement LLM agents using the function calling feature provided by each LLM's API. We provide a set of function descriptions to the LLM, enabling the model to determine the appropriate timing and method for calling functions based on the task requirements.
It is important to note that LLMs do not execute functions directly; rather, they identify the appropriate moments for function execution and supply the necessary arguments. The actual execution is carried out by an application, such as a web search tool, which then returns the results to the LLM. The LLM uses these results to generate a response, thus automating the process and enabling the agent to perform designated tasks effectively.
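The division of labor described above can be sketched as a small dispatcher. This is a minimal illustration and not the repository's code: the name dispatch_tool_call, the registry dictionary, and the JSON result format are assumptions made for the example. The model is assumed to supply a function name and a JSON string of arguments; the application runs the function and hands the result back as text.

```python
import json
from typing import Any, Callable, Dict


def dispatch_tool_call(
    name: str,
    arguments_json: str,
    registry: Dict[str, Callable[..., Any]],
) -> str:
    """Run the function the model requested and return the outcome as JSON.

    `registry` (hypothetical) maps function names to Python callables.
    Errors are returned as JSON instead of raised, so that the model
    can see what went wrong and adjust its next call.
    """
    if name not in registry:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        kwargs = json.loads(arguments_json or "{}")
        result = registry[name](**kwargs)
    except Exception as exc:  # malformed arguments or a failing tool
        return json.dumps({"error": str(exc)})
    return json.dumps({"result": result})
```

In an agent loop, the returned string would be appended to the conversation as the function result before the model is queried again. How that message is formatted differs between the OpenAI, Anthropic, and Gemini APIs, so it is left out of this sketch.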
There are two types of web-enabled agents: the WebSearch Agent and the WebNav Agent.
For the WebSearch Agent, we implement the search() function using the Google Custom Search JSON API, which retrieves Google search results in a structured JSON format. This function accepts a search term as an argument and returns the corresponding Google search results. The WebSearch Agent calls the search() function with an appropriate query and then uses the returned search results to generate a response. When the agent cannot find the required information in the results, it may call the function repeatedly, adjusting the query each time.
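A search() function of this kind might look roughly like the sketch below. It is not the code in utils.py. The endpoint, the key, cx, q, and num query parameters, and the items, title, link, and snippet response fields belong to the Custom Search JSON API; the helper names build_search_url and parse_results are hypothetical, and the credentials are passed as extra arguments purely for illustration. The standard library's urllib is used here so that the example has no dependencies.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict, List

API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(query: str, api_key: str, cse_id: str, num: int = 5) -> str:
    """Build the request URL (hypothetical helper). The API accepts num from 1 to 10."""
    params = {"key": api_key, "cx": cse_id, "q": query, "num": num}
    return API_ENDPOINT + "?" + urllib.parse.urlencode(params)


def parse_results(payload: Dict[str, Any]) -> List[Dict[str, str]]:
    """Keep only title, link, and snippet from each result (hypothetical helper)."""
    return [
        {
            "title": item.get("title", ""),
            "link": item.get("link", ""),
            "snippet": item.get("snippet", ""),
        }
        for item in payload.get("items", [])  # "items" is absent when nothing matches
    ]


def search(query: str, api_key: str, cse_id: str, num: int = 5) -> List[Dict[str, str]]:
    """Query the API and return a compact list of results."""
    url = build_search_url(query, api_key, cse_id, num)
    with urllib.request.urlopen(url, timeout=10) as response:
        payload = json.load(response)
    return parse_results(payload)
```

Trimming each result to three short fields keeps the text returned to the model small, which matters when the agent issues several queries in a row.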
For the WebNav Agent, we implement the functionality using web automation tools, such as Selenium and BeautifulSoup with Requests. Specifically, we develop two functions: fetch_content() and find_button(). The fetch_content() function takes a URL as an argument and returns the content of the page at that URL, while the find_button() function identifies the clickable buttons on the page at a given URL and returns them with their corresponding URLs.
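The two navigation functions can be approximated as follows. This sketch deliberately swaps in the standard library's html.parser and urllib in place of Selenium and BeautifulSoup with Requests, so it does not handle pages rendered by JavaScript. It also assumes that "clickable buttons" means anchor elements with an href attribute, and the loader parameter is a hypothetical hook for supplying HTML without a network request.

```python
import urllib.request
from html.parser import HTMLParser
from typing import Callable, List, Tuple
from urllib.parse import urljoin


class _PageParser(HTMLParser):
    """Collect visible text and (label, absolute URL) pairs for links."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.text_parts: List[str] = []
        self.links: List[Tuple[str, str]] = []
        self._skip_depth = 0  # greater than 0 inside <script> or <style>
        self._href = None
        self._label: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = urljoin(self.base_url, href)
                self._label = []

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1
        elif tag == "a" and self._href is not None:
            self.links.append((" ".join(self._label), self._href))
            self._href = None

    def handle_data(self, data):
        text = data.strip()
        if not text or self._skip_depth:
            return
        self.text_parts.append(text)
        if self._href is not None:
            self._label.append(text)


def download(url: str) -> str:
    """Default loader: fetch the raw HTML over HTTP."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def _parse(url: str, loader: Callable[[str], str]) -> _PageParser:
    parser = _PageParser(url)
    parser.feed(loader(url))
    parser.close()
    return parser


def fetch_content(url: str, loader: Callable[[str], str] = download) -> str:
    """Return the visible text of the page at `url`."""
    return " ".join(_parse(url, loader).text_parts)


def find_button(url: str, loader: Callable[[str], str] = download) -> List[Tuple[str, str]]:
    """Return (label, absolute URL) pairs for the links on the page at `url`."""
    return _parse(url, loader).links
```

Resolving every href against the page URL with urljoin means the agent always receives absolute URLs that it can pass straight back to fetch_content().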
Models
We employ commercially available models, whose accessibility and capabilities can encourage misuse by attackers. 
Specifically, we use GPT-4o, Claude 3.5 Sonnet, and Gemini 1.5 Flash. 
We utilize the respective APIs: the OpenAI API for GPT-4o, the Anthropic API for Claude 3.5 Sonnet, and the Gemini API for Gemini 1.5 Flash. 
How to use?
You need to replace the placeholder API keys and identifiers with your own.
- In test_agent.ipynb:
  - Replace OPENAI_API with your OpenAI API key.
  - Replace CLAUDE_API with your Claude API key.
  - Replace GOOGLE_CSE_ID with your Google Custom Search Engine ID.
  - Replace GOOGLE_API_KEY with your Google API key.
- In utils.py:
  - Replace GOOGLE_CSE_ID with your Google Custom Search Engine ID.
  - Replace GOOGLE_API_KEY with your Google API key.
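As an optional alternative to pasting keys into the files, the same four values could be read from environment variables. The sketch below is an assumption about how one might do this and is not part of the repository: load_keys and REQUIRED_KEYS are hypothetical names, and only the four placeholder names come from the instructions above.

```python
import os
from typing import Dict, Mapping, Optional

# Placeholder names used in test_agent.ipynb and utils.py.
REQUIRED_KEYS = ("OPENAI_API", "CLAUDE_API", "GOOGLE_CSE_ID", "GOOGLE_API_KEY")


def load_keys(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Read the credentials from `env` (defaults to os.environ).

    Raises ValueError that lists every missing or empty name, so a
    misconfigured setup fails before any API request is made.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_KEYS if not env.get(name)]
    if missing:
        raise ValueError("missing credentials: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_KEYS}
```

Keeping keys out of the notebook this way avoids committing them by accident when the notebook is shared.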
 
In test_agent.ipynb, you can configure different types of agents by adjusting the web_use and navi_use parameters:
- LLM Agent: web_use = False, navi_use = False
- WebSearch Agent: web_use = True, navi_use = False
- WebNav Agent: web_use = True, navi_use = True
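The mapping from the two flags to an agent type can be written out as a small helper. This is a sketch under assumptions: agent_type and tool_names are hypothetical names, the sketch assumes that web_use enables search() and that navi_use adds fetch_content() and find_button(), and it rejects the combination web_use = False with navi_use = True because that combination is not listed above.

```python
from typing import List


def agent_type(web_use: bool, navi_use: bool) -> str:
    """Name the agent that a flag combination selects."""
    if not web_use and not navi_use:
        return "LLM Agent"
    if web_use and not navi_use:
        return "WebSearch Agent"
    if web_use and navi_use:
        return "WebNav Agent"
    # Assumption: navigation without search is not a documented setup.
    raise ValueError("navi_use = True requires web_use = True")


def tool_names(web_use: bool, navi_use: bool) -> List[str]:
    """List the functions exposed to the model for a flag combination."""
    agent_type(web_use, navi_use)  # validates the combination
    tools: List[str] = []
    if web_use:
        tools.append("search")
    if navi_use:
        tools.extend(["fetch_content", "find_button"])
    return tools
```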
 
Files
- test_agent.ipynb
Additional details
Dates
- Accepted: 2025-01, USENIX Security 25