SigMap Benchmark Suite: 240-Repository Large-Scale AI Context Extraction Dataset
Description
The SigMap Benchmark Suite evaluates AI context extraction across 240 diverse open-source repositories spanning more than 30 programming languages. The dataset comprises 1,775 benchmark operations that record token reduction, execution performance, and code complexity.
Key Features:
• 240 repositories across 30+ languages
• 1,775 benchmark operations (5 modes per repository)
• 50+ metadata fields per repository
• 96.2% average token reduction
• Complete reproducibility package
• 4 data export formats (CSV, JSON, JSONL, SQL)
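As an illustration of what a token reduction figure such as 96.2% means, the sketch below computes the percentage of tokens removed for a single repository. The function name and the sample token counts are hypothetical, and the record does not say whether the reported average is a mean of per-repository percentages or an aggregate over all tokens.

```python
def token_reduction_pct(original_tokens: int, extracted_tokens: int) -> float:
    """Percentage of tokens removed by extraction, relative to the original.

    Assumed definition: 100 * (1 - extracted / original). The dataset's
    exact formula is not stated in this description.
    """
    if original_tokens <= 0:
        raise ValueError("original_tokens must be positive")
    return 100.0 * (1.0 - extracted_tokens / original_tokens)


# Hypothetical repository: 100,000 source tokens reduced to 3,800 context tokens.
print(round(token_reduction_pct(100_000, 3_800), 1))  # prints 96.2
```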
The dataset enables analysis of language-specific context extraction patterns, monorepo complexity, domain-specific compression characteristics, and AI context optimization strategies.
Complete methodology, reproducibility materials, and scripts are included.
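A language-level analysis of the kind described above could start from the JSONL export. The following is a minimal sketch, assuming one JSON object per line with hypothetical field names `language` and `token_reduction_pct`; the actual schema should be taken from the dataset's own documentation.

```python
import io
import json
from collections import defaultdict
from statistics import mean


def mean_reduction_by_language(lines):
    """Group JSONL records by language and average their token reduction.

    The keys "language" and "token_reduction_pct" are assumed field names,
    not confirmed against the dataset schema.
    """
    by_language = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the export
        record = json.loads(line)
        by_language[record["language"]].append(record["token_reduction_pct"])
    return {lang: mean(values) for lang, values in by_language.items()}


# Made-up records standing in for the real export file.
sample = io.StringIO(
    '{"repo": "example/a", "language": "Python", "token_reduction_pct": 95.0}\n'
    '{"repo": "example/b", "language": "Python", "token_reduction_pct": 97.0}\n'
    '{"repo": "example/c", "language": "Go", "token_reduction_pct": 98.0}\n'
)
result = mean_reduction_by_language(sample)
print(result)
```

To run this on the real data, pass an open file handle for the JSONL export in place of `sample`.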
Files (943.0 kB)
Dataset_Paper.md
| MD5 checksum | Size |
|---|---|
| 5677bb3c8910e4e17e334b8963a70846 | 11.5 kB |
| c3b94888c5d5f9f7269cfc22b4fcc987 | 22.5 kB |
| 8048853b5bfc8a651dea096fe8634f1c | 26.6 kB |
| 722afa7bf9d2644db3aba029ea223d17 | 7.1 kB |
| 9d535e3f62d5b1996a0252cc419e6c0e | 9.8 kB |
| d9c290d9cfa425ac6a9ee81e8534cdd1 | 8.6 kB |
| 52b703452f77fbdec4c1bc55cb6c90b6 | 4.2 kB |
| 000391ac0333c148347697d3e50a6103 | 17.3 kB |
| 6cbeb79cd6377d928c2228887a741691 | 2.9 kB |
| 9ac1a2702a53bb230f53412c4505286b | 16.8 kB |
| 0eeca47f09ee3f1a2797956b75dcb870 | 863 Bytes |
| 44a7051a72199e44c855b9321d19edde | 4.7 kB |
| e9cda9c43bc7e1714f9f1a26061cd360 | 5.7 kB |
| 82ab7f6fc295c77d9aed2bd7311ea88b | 1.4 kB |
| 6e5d9be419f4fed3cbd719d86641e904 | 6.4 kB |
| 6da468de0d3f7f41b717240cb5576fad | 27.2 kB |
| 2626b56ba054f518408bcf58c07bb75c | 1.4 kB |
| 4cef45ed34154262a462041de8dc8f67 | 50.2 kB |
| 295003109c0f5af355810ec3fd79bc1d | 350.5 kB |
| d6b9853b4fa6fc685e31a9b56d2d6400 | 277.8 kB |
| 82018927f57a3af509a148882fdb4f97 | 89.6 kB |
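The MD5 checksums in the file listing can be used to confirm that a download is intact. The sketch below uses Python's standard `hashlib`; the helper names are illustrative, and the path and checksum pairing in the usage comment is a placeholder, because the listing above does not show which checksum belongs to which file.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed_checksum):
    """Compare a file against a listed checksum, with or without an 'md5:' prefix."""
    expected = listed_checksum.strip().lower().split(":", 1)[-1]
    return md5_of_file(path) == expected


# Usage (placeholder path and checksum; substitute the real pairing):
# matches_listing("downloaded_file.csv", "md5:<checksum from the listing>")
```

MD5 is adequate here for detecting accidental corruption during transfer; it is not a defence against deliberate tampering.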