Published February 12, 2026 | Version v1
Preprint Open

mcpbr: Benchmarking Model Context Protocol Servers on Software Engineering Tasks

Authors/Creators

  • 1. Georgia Institute of Technology

Description

The Model Context Protocol (MCP) lets developers expose tools and data sources to LLM-based agents through a standardized interface. Despite rapid ecosystem growth, no methodology exists for evaluating whether a given MCP server improves agent task completion. We present mcpbr, an open-source benchmark runner that isolates the effect of MCP tool augmentation through paired comparison experiments. We evaluate a code graph analysis MCP server on all 500 tasks from SWE-bench Verified using Claude Sonnet as the base agent. MCP augmentation reduced the resolution rate by 14.9% in relative terms (from 49.8% to 42.4%, a drop of 7.4 percentage points) while improving efficiency: 42.3% fewer tool calls, 14.0% fewer tokens, and 15.2% lower cost. Per-repository analysis shows that the effect varies across codebases, with the server helping on 1 of 12 repositories and hurting on 10. We analyze this efficiency-resolution tradeoff and show that MCP tools alter the agent's exploration strategy, trading general-purpose search for opinionated shortcuts that can narrow the solution space.
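The resolution-rate figures above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only and is not taken from mcpbr: the function name `relative_change` and the constants are hypothetical, and the resolved-task counts are derived on the assumption that both rates use all 500 tasks as the denominator.

```python
def relative_change(baseline: float, treatment: float) -> float:
    """Return the change from baseline to treatment as a fraction of baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (treatment - baseline) / baseline


# Figures reported in the description above.
TOTAL_TASKS = 500
BASELINE_RATE = 0.498  # resolution rate without the MCP server
MCP_RATE = 0.424       # resolution rate with the MCP server

# Assumption: both rates are fractions of the same 500 tasks.
baseline_resolved = round(BASELINE_RATE * TOTAL_TASKS)
mcp_resolved = round(MCP_RATE * TOTAL_TASKS)

# Absolute difference in percentage points versus relative change in percent.
point_drop = (BASELINE_RATE - MCP_RATE) * 100
relative_pct = relative_change(BASELINE_RATE, MCP_RATE) * 100

print(f"resolved: {baseline_resolved} -> {mcp_resolved} of {TOTAL_TASKS}")
print(f"absolute drop: {point_drop:.1f} percentage points")
print(f"relative change: {relative_pct:.1f}%")
```

Under that assumption the rates correspond to 249 and 212 resolved tasks, and the 14.9% figure is the relative change, not the percentage-point difference.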

Files

main.pdf (287.9 kB)
md5:de1efde19dc2e67e714b2269035d4d41

Additional details

Software

Repository URL
https://github.com/greynewell/mcpbr
Programming language
Python
Development Status
Active