There is a newer version of the record available.

Published November 10, 2025 | Version v1.0.0
Software | Open

Boa Constrictor: A Mamba-based Lossless Compressor for High Energy Physics data

  • 1. University of Manchester

Description

BOA Constrictor is an open-source implementation of a Bytewise Online Autoregressive (BOA) compressor built on Mamba state-space models. It targets lossless compression of High Energy Physics (HEP) datasets.
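As a rough illustration of what "bytewise online autoregressive" means, the sketch below replaces the Mamba model with a much simpler adaptive byte-frequency model (an assumption made purely for illustration; it is not the model shipped in this record). At each step the model assigns a probability to the next byte, an ideal entropy coder would spend about -log2(p) bits on it, and the model is updated only after the byte has been coded, so a decoder could repeat the same steps. The function name `ideal_code_length_bits` is hypothetical.

```python
import math


def ideal_code_length_bits(data: bytes) -> float:
    """Bits an ideal entropy coder would spend under an adaptive order-0 byte model."""
    counts = [1] * 256  # start from a uniform prior (one pseudo-count per byte value)
    total = 256
    bits = 0.0
    for b in data:
        p = counts[b] / total   # probability the model assigned to the byte that occurred
        bits += -math.log2(p)   # ideal cost of coding this byte
        counts[b] += 1          # online update, performed after coding the byte
        total += 1
    return bits


# Synthetic, highly repetitive input; not HEP data.
sample = bytes([0, 0, 0, 1] * 256)
bits = ideal_code_length_bits(sample)
print(f"{len(sample)} bytes -> about {bits / 8:.1f} bytes under the toy model")
```

A stronger predictor assigns higher probability to the bytes that actually occur, which lowers the total cost; that is the role a learned sequence model plays in this kind of compressor.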

The compressor couples a compact Mamba model (on the order of a few megabytes) with a parallel range coder: the model predicts each byte and the range coder encodes it, in a fully online fashion. On representative ATLAS and CMS datasets, BOA achieves substantially higher compression ratios than a strong classical baseline (LZMA-9) while remaining strictly lossless and preserving the original serialisation. The repository includes:

  • Training and evaluation pipelines for Mamba-based bytewise models

  • A streaming encoder/decoder interface suitable for large HEP files

  • Scripts to reproduce key metrics from the paper (compression ratio, throughput, reliability diagrams, Top-k accuracy, confusion matrices)

  • Example configs for adapting BOA to new datasets
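For orientation, the LZMA-9 baseline and the compression-ratio metric mentioned above can be approximated with Python's standard `lzma` module. This is a minimal sketch under stated assumptions: the ratio is taken as uncompressed size divided by compressed size (a common convention; the paper may define it differently), the input is synthetic float32 data rather than an ATLAS or CMS file, and the helper `compression_ratio` is hypothetical and not part of the repository.

```python
import lzma
import struct


def compression_ratio(original: bytes, compressed: bytes) -> float:
    """Uncompressed size divided by compressed size (assumed convention)."""
    return len(original) / len(compressed)


# Synthetic stand-in for a serialised column of float32 values; not real HEP data.
values = [100.0 + 0.25 * (i % 64) for i in range(4096)]
raw = struct.pack(f"<{len(values)}f", *values)

# LZMA at preset 9, the strongest standard preset.
baseline = lzma.compress(raw, preset=9)

# Lossless check: decompression must reproduce the input byte for byte.
assert lzma.decompress(baseline) == raw

ratio = compression_ratio(raw, baseline)
print(f"LZMA-9: {len(raw)} -> {len(baseline)} bytes, ratio {ratio:.2f}")
```

The same helper could be applied to the output of any other lossless compressor to compare it against this baseline on identical input bytes.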

This Zenodo record provides an archived snapshot of the BOA Constrictor code corresponding to the results in the paper “Boa Constrictor: ML-Enhanced Lossless Compression Algorithms for HEP”. If you use this software in academic work, please cite both this software record and the associated paper.

Files

boa-constrictor-1.0.0.zip (94.0 MB)
md5:3de563127a1460a875dbebc071523135
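The MD5 checksum listed for the archive can be used to confirm that a download is intact. The sketch below assumes the file was saved under its original name in the current working directory (a hypothetical location); the helper `md5_of_file` is illustrative and not part of the repository.

```python
import hashlib
from pathlib import Path

# Checksum as listed in this record.
EXPECTED_MD5 = "3de563127a1460a875dbebc071523135"


def md5_of_file(path, chunk_size: int = 1 << 20) -> str:
    """Hex MD5 digest of a file, read in chunks to keep memory use bounded."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Assumed download location; adjust to wherever the archive was saved.
archive = Path("boa-constrictor-1.0.0.zip")
if archive.exists():
    matches = md5_of_file(archive) == EXPECTED_MD5
    print("checksum OK" if matches else "checksum MISMATCH")
else:
    print(f"{archive} not found; download it first")
```

MD5 is adequate for detecting accidental corruption during transfer, which is its purpose here; it is not a safeguard against deliberate tampering.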

Additional details

Related works

Is described by
Preprint: arXiv:2511.11337 (arXiv)

Software