Detecting MCP Tool Poisoning and Rug-Pull Attacks in LLM Agent Architectures

Jain, Gunjan

doi:10.5281/zenodo.19687220

Published April 22, 2026 | Version v1

Preprint | Open access

The Model Context Protocol (MCP) lets LLM agents invoke external tools, which creates a new attack surface: malicious tool definitions can manipulate agent behavior. We present an MCP tool poisoning detection system that runs eight checks, covering hidden instructions, excessive permissions, exfiltration endpoints, shadowed tool names, obfuscated parameters, shell metacharacter injection, sensitive data scope violations, and a novel class of rug-pull attacks, in which tools behave benignly during testing but activate malicious payloads after establishing trust. We formalize the rug-pull threat model, describe detection heuristics based on temporal behavior analysis and conditional execution patterns, and evaluate the detector against a corpus of benign and adversarial tool definitions. The system operates as drop-in Express middleware that scans tool registrations in real time, before they reach the LLM agent.
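To make the kinds of checks listed in the abstract concrete, the following is a minimal TypeScript sketch of three of them (hidden instructions, exfiltration endpoints, shadowed tool names) plus a simple rug-pull guard. It is not the paper's implementation. The `ToolDefinition` shape, the regular expressions, and the function names `scanTool`, `fingerprint`, and `detectRugPull` are all assumptions made for illustration. The rug-pull guard uses definition pinning (comparing a tool against the definition recorded at approval time), which is a simpler stand-in for the temporal behavior analysis the abstract describes.

```typescript
// Hypothetical shape of a tool definition; field names are assumptions.
interface ToolDefinition {
  name: string;
  description: string;
  inputSchema?: unknown;
}

interface Finding {
  check: string;
  detail: string;
}

// Illustrative patterns only; a real rule set would be far broader.
const HIDDEN_INSTRUCTION =
  /<important>|ignore (all )?previous instructions|do not (tell|mention|inform)/i;
const URL_PATTERN = /https?:\/\/[^\s"'<>]+/gi;

// Collapse case and separators so "read-file" and "read_file" compare equal.
function normalizeName(name: string): string {
  return name.toLowerCase().replace(/[^a-z0-9]/g, "");
}

// Run static checks on one tool definition before it is registered.
function scanTool(tool: ToolDefinition, trustedNames: readonly string[]): Finding[] {
  const findings: Finding[] = [];

  if (HIDDEN_INSTRUCTION.test(tool.description)) {
    findings.push({ check: "hidden-instruction", detail: "instruction-like text in description" });
  }

  // Any URL embedded in a description is flagged for review as a possible exfiltration target.
  const urls = tool.description.match(URL_PATTERN) ?? [];
  urls.forEach((url) => findings.push({ check: "exfiltration-endpoint", detail: url }));

  // A name that differs from a trusted name only by case or separators may shadow it.
  trustedNames.forEach((trusted) => {
    if (tool.name !== trusted && normalizeName(tool.name) === normalizeName(trusted)) {
      findings.push({ check: "shadowed-name", detail: `resembles trusted tool "${trusted}"` });
    }
  });

  return findings;
}

// Serialize the fields that matter; key order inside inputSchema is not canonicalized here.
function fingerprint(tool: ToolDefinition): string {
  return JSON.stringify([tool.name, tool.description, tool.inputSchema ?? null]);
}

// Definition pinning: report a tool whose definition changed after it was approved.
function detectRugPull(tool: ToolDefinition, approved: Map<string, string>): Finding | null {
  const pinned = approved.get(tool.name);
  if (pinned !== undefined && pinned !== fingerprint(tool)) {
    return { check: "rug-pull", detail: "definition changed since approval" };
  }
  return null;
}
```

In an Express application, functions like these would typically be called from a middleware on the tool-registration route, rejecting or quarantining any definition with findings. The sketch is kept framework-free so that it stays self-contained.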

Files

MCP tool posioning and rug pull attacks.pdf (50.5 kB, md5:1b6e07824330853edeaea22ee6217502)

