Published April 22, 2026 | Version v1
Preprint Open

Detecting MCP Tool Poisoning and Rug-Pull Attacks in LLM Agent Architectures

Authors/Creators

Description

The Model Context Protocol (MCP) lets LLM agents invoke external tools, which creates a new attack surface: a malicious tool definition can manipulate agent behavior. We present an eight-check MCP tool poisoning detection system that identifies hidden instructions, excessive permissions, exfiltration endpoints, shadowed tool names, obfuscated parameters, shell metacharacter injection, sensitive data scope violations, and a novel class of rug-pull attacks, in which tools behave benignly during testing but activate malicious payloads once trust has been established. We formalize the rug-pull threat model, describe detection heuristics based on temporal behavior analysis and conditional execution patterns, and evaluate the detector against a corpus of benign and adversarial tool definitions. Our system operates as drop-in Express middleware, enabling real-time scanning of tool registrations before they reach the LLM agent.
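To make the middleware idea above concrete, the following is a minimal sketch of how a registration scanner might be wired in, covering four of the eight checks. It is not the paper's implementation: the `ToolDefinition` shape, the regular expressions, the check names, the status codes, and the `scanTool` and `mcpScanMiddleware` functions are all assumptions chosen for illustration. The request and response types are declared structurally so the block runs without the Express package; the returned function has the usual `(req, res, next)` middleware shape. The rug-pull check is not attempted here, since it depends on the temporal analysis described in the paper.

```typescript
// Illustrative sketch only: names, patterns, and types are assumptions,
// not the rules used by the system described in the abstract.

interface ToolDefinition {
  name: string;
  description: string;
  parameters?: Record<string, { description?: string; default?: unknown }>;
}

interface Finding {
  check: string;
  detail: string;
}

// Hypothetical heuristics for three text-based checks.
const HIDDEN_INSTRUCTION =
  /ignore (all )?(previous|prior) instructions|do not (tell|mention|reveal)|<important>/i;
const SHELL_METACHARS = /[;&|`]|\$\(/;
const URL_PATTERN = /https?:\/\/[^\s"'<>)]+/gi;

// Extract the host part of a URL string without relying on runtime globals.
function hostOf(url: string): string {
  return url.replace(/^https?:\/\//i, "").split(/[\/:?#]/)[0].toLowerCase();
}

function scanTool(
  tool: ToolDefinition,
  knownNames: Set<string>,
  allowedHosts: Set<string>
): Finding[] {
  const findings: Finding[] = [];
  const params = tool.parameters ?? {};
  const paramNames = Object.keys(params);

  // Collect every free-text field the LLM would see.
  const texts = [tool.description].concat(
    paramNames.map((p) => params[p].description ?? "")
  );

  for (const text of texts) {
    if (HIDDEN_INSTRUCTION.test(text)) {
      findings.push({ check: "hidden-instructions", detail: text.slice(0, 80) });
    }
    for (const url of text.match(URL_PATTERN) ?? []) {
      if (!allowedHosts.has(hostOf(url))) {
        findings.push({ check: "exfiltration-endpoint", detail: url });
      }
    }
  }

  // A new tool reusing an already registered name may shadow the original.
  if (knownNames.has(tool.name.toLowerCase())) {
    findings.push({ check: "shadowed-name", detail: tool.name });
  }

  // String defaults containing shell metacharacters are treated as suspicious.
  for (const p of paramNames) {
    const def = params[p].default;
    if (typeof def === "string" && SHELL_METACHARS.test(def)) {
      findings.push({ check: "shell-metacharacters", detail: p + "=" + def });
    }
  }
  return findings;
}

// Structural stand-ins for the Express request, response, and next types.
type Req = { body?: unknown };
type Res = { status(code: number): Res; json(body: unknown): unknown };
type Next = () => void;

function mcpScanMiddleware(knownNames: Set<string>, allowedHosts: Set<string>) {
  return (req: Req, res: Res, next: Next): void => {
    const tool = req.body as ToolDefinition;
    if (!tool || typeof tool.name !== "string" || typeof tool.description !== "string") {
      res.status(400).json({ error: "malformed tool definition" });
      return;
    }
    const findings = scanTool(tool, knownNames, allowedHosts);
    if (findings.length > 0) {
      res.status(422).json({ findings });
      return;
    }
    knownNames.add(tool.name.toLowerCase());
    next(); // clean definition: let the registration proceed
  };
}
```

In an Express application, such a function would typically be mounted on the registration route after a JSON body parser, so that a flagged definition is rejected before it is ever presented to the agent. Blocking on any finding is the simplest policy; a real deployment might instead score findings or route them to review.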

Files

MCP tool posioning and rug pull attacks.pdf

Files (50.5 kB)

md5:1b6e07824330853edeaea22ee6217502