Published January 7, 2026 | Version v7

Heterogeneous Prompting and Execution Feedback for SWE Issue Test Generation and Selection

Description

We share 8 folders of tests covering 2 benchmarks (SWT-Bench Lite and TDD-Bench Verified), 2 models (GPT-4o and Claude-3.7-Sonnet), and 2 approaches (Otter and e-Otter). Each folder contains 10 tests generated with different prompting techniques (e.g., planner, full, standard), along with the associated logs. We also share the JSON files containing the tests generated by e-Otter++.

Files

claude_e_otter_plus_swt_lite.json

Files (3.8 GB)

Checksum Size
md5:f9a6edc1bcfeb80bac940c578b68e787 554.6 kB
md5:11ae14b42b838b38c75f754d6d8ac5be 939.6 kB
md5:99331605793fab49344635c48b195ad3 289.0 MB
md5:006ab1c91b5c3fbad72213c5fc3e3407 287.2 MB
md5:90b46a6a020d10f92d901b7e58a89321 518.8 MB
md5:e94da042bc591f858d5ef29b87aab110 387.4 MB
md5:420bb3d2e38aec5dcbc07e24b89a6a61 412.6 MB
md5:477bc387f78178f6bd2a5ea03b4b6589 500.6 kB
md5:6a74b96a279fca6aa44daa6bc8a40247 782.4 kB
md5:30e3ce3e187c57d31802ee8fd1acf1f6 292.2 MB
md5:6237d2dde3b997c72703a1f1ed80e42b 289.1 MB
md5:7f96d7653746e1cb2f15fce8b3068e7c 522.4 MB
md5:efa67759a49030169e745ee3296f613f 384.9 MB
md5:57946c7226907d2af86f02dcd4f84e84 411.1 MB
md5:36b45a5edc212f8b2e1886959626ff7f 839.6 kB
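
The MD5 values listed above can be used to check that a download completed without corruption. The following is a minimal sketch, not part of the shared artifacts: the helper names `md5_of_file` and `matches_listing` are hypothetical, and because this listing does not show which checksum belongs to which file, the pairing has to be taken from the record page itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks
    so that multi-hundred-megabyte archives do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file's MD5 with a listed value such as 'md5:<32 hex digits>'.
    The 'md5:' prefix is optional; comparison is case-insensitive."""
    expected = listed.strip().split(":", 1)[-1].lower()
    return md5_of_file(path) == expected
```

To use it, call `matches_listing(local_path, listed_value)` with the path of a downloaded file and the checksum string shown next to that file on the record page; a `False` result suggests the file should be downloaded again.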