Boa Constrictor: A Mamba-based Lossless Compressor for High Energy Physics data
Description
BOA Constrictor is an open-source implementation of a Bytewise Online Autoregressive (BOA) compressor built on Mamba state-space models. It targets lossless compression of High Energy Physics (HEP) datasets.
The compressor couples a compact Mamba model (on the order of a few megabytes) with a parallel range coder to predict and encode bytes in a fully online fashion. On representative ATLAS and CMS datasets, BOA achieves substantially higher compression ratios than a strong classical baseline (LZMA-9), while remaining strictly lossless and respecting the original serialisation. The repository includes:
- Training and evaluation pipelines for Mamba-based bytewise models
- A streaming encoder/decoder interface suitable for large HEP files
- Scripts to reproduce key metrics from the paper (compression ratio, throughput, reliability diagrams, top-k accuracy, confusion matrices)
- Example configs for adapting BOA to new datasets
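The predict-then-encode loop and the compression-ratio metric can be illustrated with a minimal sketch. This is not the BOA implementation: a simple adaptive order-0 byte counter stands in for the Mamba model, and the range coder is replaced by the ideal code length (the sum of -log2 p over all bytes), which a range coder approaches closely. The function names and the synthetic sample are hypothetical. The LZMA-9 baseline uses the standard-library `lzma` module with `preset=9`.

```python
import lzma
import math


def adaptive_code_length_bits(data: bytes) -> float:
    """Ideal code length in bits under an adaptive order-0 byte model.

    Stand-in for a learned predictor: the probability of each byte is
    computed only from bytes already seen, so a decoder could rebuild
    the same model state and recover the input exactly.
    """
    counts = [1] * 256  # one pseudo-count per byte value (Laplace smoothing)
    total = 256
    bits = 0.0
    for byte in data:
        p = counts[byte] / total   # predict the byte that actually occurs
        bits += -math.log2(p)      # cost an ideal entropy coder would pay
        counts[byte] += 1          # update online, after coding the byte
        total += 1
    return bits


def compression_ratio(original_size: float, compressed_size: float) -> float:
    """Compression ratio: original size divided by compressed size."""
    return original_size / compressed_size


# Synthetic sample, for illustration only (not HEP data).
sample = bytes([0, 0, 0, 1] * 4096)

model_bytes = adaptive_code_length_bits(sample) / 8
lzma_blob = lzma.compress(sample, preset=9)  # classical LZMA-9 baseline

print("order-0 model ratio:", compression_ratio(len(sample), model_bytes))
print("LZMA-9 ratio:       ", compression_ratio(len(sample), len(lzma_blob)))
```

The numbers printed for this toy input say nothing about BOA's performance; an order-0 counter ignores context entirely, whereas a sequence model conditions each prediction on the preceding bytes. The sketch only shows how a per-byte probability turns into a code length and how the ratio is computed against a baseline.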
This Zenodo record provides an archived snapshot of the BOA Constrictor code corresponding to the results in the paper “Boa Constrictor: ML-Enhanced Lossless Compression Algorithms for HEP”. If you use this software in academic work, please cite both this software record and the associated paper.
Files
| Name | Size | MD5 |
|---|---|---|
| boa-constrictor-1.0.0.zip | 94.0 MB | 3de563127a1460a875dbebc071523135 |
Additional details
Related works
- Is described by: Preprint, arXiv:2511.11337
Software
- Repository URL: https://github.com/boa-collaboration/boa-constrictor/releases/tag/v1.0.0
- Programming language: Python, C++
- Development status: Active