Published March 4, 2026 | Version 1.0.0
Dataset
Open
Pact Benchmark: ICPC World Finals — Contract-First Multi-Agent vs Single-Agent Code Generation
Description
Benchmark comparing Pact, a contract-first multi-agent framework, against Claude Code on 5 ICPC World Finals competitive programming problems (212 test cases in total). Pact passes 100% of test cases (212/212), versus 79% (167/212) for single-shot Claude Code and 92% (196/212) for iterative Claude Code. All conditions use Claude Opus 4.6. The dataset includes test data, baseline results, full Pact state for both conditions (research and base), and reproduction scripts. The decisive problem is Trailing Digits (2020 World Finals): Claude Code scores 31/47 even with 5 retry iterations because the naive algorithm times out. Pact's interview and decomposition phases force upfront mathematical analysis, producing the correct O(log n) approach on the first attempt.
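The headline percentages follow directly from the pass counts quoted in the description. As a minimal sketch (the dictionary labels and the `pass_rate` helper are illustrative names, not part of the dataset or the Pact tooling), the rounding can be reproduced like this:

```python
# Pass counts as quoted in the description; the labels are illustrative only.
TOTAL_CASES = 212
PASSED = {
    "Pact": 212,
    "Claude Code (single-shot)": 167,
    "Claude Code (iterative)": 196,
}


def pass_rate(passed_cases, total_cases=TOTAL_CASES):
    """Return the pass rate as a percentage rounded to the nearest integer."""
    return round(100 * passed_cases / total_cases)


# Map each condition to its rounded pass rate.
rates = {name: pass_rate(count) for name, count in PASSED.items()}

for name, rate in rates.items():
    print(f"{name}: {PASSED[name]}/{TOTAL_CASES} = {rate}%")
```

The unrounded values are roughly 78.8% and 92.5% for the two Claude Code conditions, so the quoted 79% and 92% are rounded to the nearest whole percent.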
Files
(36.9 MB)
| File (checksum) | Size |
|---|---|
| md5:9452285c460280e27ad5112aba6788ea | 36.9 MB |
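The published MD5 checksum can be used to confirm that a downloaded copy of the archive is intact. The following is a sketch using Python's standard `hashlib` module; the function names are hypothetical, and the local file path is whatever name the archive was saved under, since the record listing above does not show a file name.

```python
import hashlib

# Checksum as published on this record.
EXPECTED_MD5 = "9452285c460280e27ad5112aba6788ea"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path):
    """Return True if the file at `path` has the checksum listed on this record."""
    return md5_of_file(path) == EXPECTED_MD5
```

Calling `matches_published_checksum` with the path of the downloaded archive returns `True` only when the file matches the listed checksum. Reading in chunks keeps memory use flat for the 36.9 MB file.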
Additional details
Related works
- Is part of
  - Software: https://github.com/jmcentire/pact
- Is supplement to
  - Software: https://github.com/jmcentire/pact-bench