Published May 8, 2026 | Version v1
Dataset
Open
SAGEBench: Simulated Bruker timsTOF .d fixtures and ground-truth FDR/TPR benchmark for the Sage DDA search engine
Authors/Creators
Description
Simulated Bruker .d fixtures and a benchmark harness for the Sage DDA database-search engine.
Two purposes:
- CI-grade Bruker test data for Sage so regressions in the timsTOF code path get caught before release (motivated by lazear/sage#228).
- A larger ground-truth-backed evaluation set so anyone can compute true FDR / TPR for Sage on simulated DDA data, in the spirit of timsim-bench for DIA.
All datasets were generated with TimSim; the ground truth is exact (every injected peptide is recorded in synthetic_data.db alongside each .d).
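Because the injected peptides are known, error rates can be measured directly rather than estimated from decoys. The following is a minimal sketch of that idea, not the SAGEBench harness itself: the function name `true_fdr_tpr`, the peptide-level matching on plain sequence strings, and the toy peptide sets are all assumptions made for illustration. Strictly speaking, the first value it returns is the false discovery proportion observed in a single run.

```python
def true_fdr_tpr(accepted, ground_truth):
    """Return (true FDR, TPR) for accepted identifications.

    accepted:     peptide sequences the search engine reported at its
                  chosen q-value cutoff (hypothetical input format)
    ground_truth: peptide sequences that were actually simulated
    """
    accepted = set(accepted)
    ground_truth = set(ground_truth)

    true_pos = accepted & ground_truth   # reported and really present
    false_pos = accepted - ground_truth  # reported but never simulated

    # Guard against empty inputs so the ratios stay well defined.
    fdr = len(false_pos) / len(accepted) if accepted else 0.0
    tpr = len(true_pos) / len(ground_truth) if ground_truth else 0.0
    return fdr, tpr


# Toy data for illustration only; not taken from the archives.
truth = {"PEPTIDEK", "SAMPLER", "ELVISK", "LIVESK"}
accepted = ["PEPTIDEK", "SAMPLER", "DECOYISHK"]

fdr, tpr = true_fdr_tpr(accepted, truth)
print(f"true FDR = {fdr:.3f}, TPR = {tpr:.3f}")
```

In practice the ground-truth set would be read from synthetic_data.db and the accepted set from the search engine's results table. A real evaluation may also need to match on modifications, charge state, or retention time rather than on sequence alone.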
Files in this record:
- sagebench-ci-smoke.tar.gz (~457 MB): two 5-min HeLa .d files, seed CSV, configs, regen script. Drop-in CI fixture.
- sagebench-hela-150k-g30m.tar.gz (~3.6 GB): HeLa, 150 000 peptides, 30-min gradient (rep 001).
- sagebench-hla-10k-g40.tar.gz (~2.6 GB): HLA Thunder, 10 000 peptides, 40-min gradient, 3 replicates.
- sagebench-hla-100k-g3600.tar.gz (~6.6 GB): HLA Thunder, 100 000 peptides, 60-min gradient, 3 replicates.
- sagebench-results.tar.gz (~288 KB): first-run report (REPORT.html, RESULTS.md, eval CSVs) against Sage 0.15.0-beta.2.
Each archive contains its own README.md with usage
instructions. The SAGEBench repository
(github.com/theGreatHerrLebert/SAGEBench)
hosts the harness used to score search-engine output against the
recorded ground truth.
Files
(13.9 GB)
Additional details
Related works
- Is supplement to
  - https://github.com/theGreatHerrLebert/SAGEBench (URL)
  - https://github.com/lazear/sage (URL)
- References
  - https://github.com/theGreatHerrLebert/rustims (URL)