Published June 21, 2026 | Version v0.2.0
Software
Open
ProteinTensor: AI-Native Biomolecular Tensor Storage for Structural Biology ML
Authors/Creators
Description
ProteinTensor is a Python library and file format (.ptt) that eliminates redundant preprocessing in structural biology machine learning pipelines. It converts mmCIF/PDB structures, or raw protein sequences, once into a Zarr-backed, LZ4-compressed, memory-mappable store. The store provides zero-parse access to atomic coordinates, backbone geometry, covalent bond graphs, MSA tokens, pairwise distance features, and protein language model embeddings. Sequence-only entries serve as direct input to AlphaFold- and Boltz-style predictors. Round-trip conversion is lossless, and structure loading is benchmarked at 2x to 95x faster than mmCIF parsing for proteins ranging from 74 to 3,525 residues.
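The convert-once, read-many idea in the description can be illustrated with a minimal standard-library sketch. This is not ProteinTensor's API and does not use the .ptt format, Zarr, or LZ4: it swaps in a flat float32 file read through `mmap`, and every name in it (`parse_coords`, `write_store`, `read_atom`, the toy records) is hypothetical. The toy data is split on whitespace for brevity, whereas real PDB files are fixed-column.

```python
import mmap
import os
import struct
import tempfile

# Toy ATOM-style records (illustrative data, not taken from this record).
PDB_TEXT = """\
ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N
ATOM      2  CA  MET A   1      26.266  25.413   2.842  1.00 10.38           C
ATOM      3  C   MET A   1      26.913  26.639   3.531  1.00  9.62           C
"""


def parse_coords(text):
    """Slow path: pull x, y, z out of each ATOM line by parsing text."""
    coords = []
    for line in text.splitlines():
        if line.startswith("ATOM"):
            fields = line.split()
            coords.extend(float(v) for v in fields[6:9])
    return coords


def write_store(path, coords):
    """One-time conversion: a uint32 atom count, then packed float32 values."""
    with open(path, "wb") as fh:
        fh.write(struct.pack("<I", len(coords) // 3))
        fh.write(struct.pack(f"<{len(coords)}f", *coords))


def read_atom(path, index):
    """Fast path: memory-map the file and decode one atom, with no text parsing."""
    with open(path, "rb") as fh:
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            (n_atoms,) = struct.unpack_from("<I", mm, 0)
            if not 0 <= index < n_atoms:
                raise IndexError(index)
            # 4-byte header, then 12 bytes (3 x float32) per atom.
            return struct.unpack_from("<3f", mm, 4 + 12 * index)


coords = parse_coords(PDB_TEXT)                      # parse the text once
store_path = os.path.join(tempfile.mkdtemp(), "toy.bin")
write_store(store_path, coords)                      # convert once
second_atom = read_atom(store_path, 1)               # later reads skip parsing
restored = ["%.3f" % v for v in second_atom]         # back to 3-decimal text
```

In this toy case the three-decimal coordinate strings survive the trip through float32 unchanged, which is the kind of property a lossless round-trip check would verify. A chunked, compressed store such as the one the description names would add per-array metadata and compression on top of the same basic pattern.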
Files
mooreneural/HelixDB-v0.2.0.zip (83.8 kB)
md5:3314b4860fdb174bcf642782cc4ae12d
Additional details
Related works
- Is supplement to
- Software: https://github.com/mooreneural/HelixDB/tree/v0.2.0 (URL)
Software
- Repository URL
- https://github.com/mooreneural/HelixDB