Published February 4, 2026 | Version V1.2.1
Software Open

Boa Constrictor: A Mamba-based Lossless Compressor for Scientific Data

  • University of Manchester

Description

BOA Constrictor is an open-source implementation of a Bytewise Online Autoregressive (BOA) compressor built on Mamba state-space models. It targets lossless compression of High Energy Physics (HEP) datasets.

The compressor couples a compact Mamba model (on the order of a few megabytes) with a parallel range coder to predict and encode bytes in a fully online fashion. On representative ATLAS and CMS datasets, BOA achieves substantially higher compression ratios than a strong classical baseline (LZMA-9), while remaining strictly lossless and preserving the original serialisation. The repository includes:

  • Training and evaluation pipelines for Mamba-based bytewise models

  • A streaming encoder/decoder interface suitable for large HEP files

  • Scripts to reproduce key metrics from the paper (compression ratio, throughput, reliability diagrams, Top-k accuracy, confusion matrices)

  • Example configs for adapting BOA to new datasets

  • A reference C++ implementation intended to address portability issues
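
The coupling of a byte predictor with an entropy coder described above can be illustrated with a toy example. The sketch below is not the BOA implementation: it swaps the Mamba model for a simple adaptive byte-frequency model and the parallel range coder for exact arithmetic coding over Python fractions, and every name in it (AdaptiveByteModel, encode, decode) is hypothetical. It only shows why such a scheme is lossless: the decoder rebuilds the same predictions from the bytes it has already decoded, so it can invert each coding step.

```python
import math
from fractions import Fraction


class AdaptiveByteModel:
    """Order-0 adaptive frequency model; a toy stand-in for a learned predictor."""

    def __init__(self):
        self.counts = [1] * 256  # start at 1 so no byte ever has probability zero
        self.total = 256

    def interval(self, byte):
        """Return (cumulative count below `byte`, count of `byte`, total count)."""
        return sum(self.counts[:byte]), self.counts[byte], self.total

    def symbol_at(self, target):
        """Return the byte whose cumulative-count range contains `target`."""
        cum = 0
        for byte, count in enumerate(self.counts):
            if target < cum + count:
                return byte
            cum += count
        raise ValueError("target outside the cumulative range")

    def update(self, byte):
        """Learn from a byte that has just been encoded or decoded."""
        self.counts[byte] += 1
        self.total += 1


def encode(data):
    """Narrow [low, low + width) once per byte; return (code, n_bits)."""
    low, width = Fraction(0), Fraction(1)
    model = AdaptiveByteModel()
    for byte in data:
        cum, count, total = model.interval(byte)
        low += width * Fraction(cum, total)
        width *= Fraction(count, total)
        model.update(byte)
    # Smallest n_bits with 2**-n_bits <= width / 2, so that a multiple of
    # 2**-n_bits is guaranteed to fall inside the final interval.
    n_bits = (math.ceil(2 / width) - 1).bit_length()
    code = math.ceil(low * 2**n_bits)
    return code, n_bits


def decode(code, n_bits, length):
    """Replay the encoder's interval narrowing to recover `length` bytes."""
    value = Fraction(code, 2**n_bits)
    low, width = Fraction(0), Fraction(1)
    model = AdaptiveByteModel()
    out = bytearray()
    for _ in range(length):
        target = math.floor((value - low) / width * model.total)
        byte = model.symbol_at(target)
        cum, count, total = model.interval(byte)
        low += width * Fraction(cum, total)
        width *= Fraction(count, total)
        model.update(byte)
        out.append(byte)
    return bytes(out)


data = b"run=1 lumi=42 evt=7\n" * 8  # made-up sample bytes
code, n_bits = encode(data)
assert decode(code, n_bits, len(data)) == data  # lossless round trip
print(f"{8 * len(data)} input bits -> {n_bits} coded bits")
```

The more probability the model assigns to the byte that actually occurs, the less the interval shrinks and the fewer bits are needed, which is where a stronger predictor pays off. Exact fractions keep the sketch short but are far too slow for real files, where a fixed-precision range coder is the usual choice.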

This Zenodo record provides an archived snapshot of the BOA Constrictor code corresponding to the results in the paper “Boa Constrictor: ML-Enhanced Lossless Compression Algorithms for scientific data”. If you use this software in academic work, please cite both this software record and the associated paper.

Files

boa-constrictor-1.2.1.zip (94.1 MB)
md5:1776ed77beaa334ba8472bb179e07b99
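
To check that a downloaded archive matches this record, its MD5 digest can be compared with the checksum listed above. A minimal sketch, assuming the archive was saved under its original name in the current working directory (the local path and the helper name md5_of_file are assumptions):

```python
import hashlib
import os

EXPECTED_MD5 = "1776ed77beaa334ba8472bb179e07b99"  # checksum listed in this record
ARCHIVE = "boa-constrictor-1.2.1.zip"  # assumed local path of the download


def md5_of_file(path, chunk_size=1 << 20):
    """Hash the file in chunks so a large archive is never loaded whole."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if os.path.exists(ARCHIVE):
    actual = md5_of_file(ARCHIVE)
    print("checksum OK" if actual == EXPECTED_MD5 else f"checksum mismatch: {actual}")
else:
    print(f"{ARCHIVE} not found; download it first")
```

MD5 is adequate here for detecting a corrupted or truncated download, but it is not a defence against deliberate tampering.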

Additional details

Related works

Is described by
Preprint: arXiv:2511.11337 (arXiv)

Software