ROLV: A Universal Sparse Compute Primitive with Cross‑Vendor Reproducibility and Orders‑of‑Magnitude Real‑World Acceleration
Authors/Creators
Description
ROLV Primitive© is a software operator for sparse matrix multiplication in large language model inference.
Modern large language models contain hundreds of weight matrices. In a pruned model, the majority of arithmetic performed on these matrices during inference contributes nothing to the output, because it operates on parameters that have been set to zero. ROLV Primitive© eliminates that computation before it reaches the hardware: no multiply, no memory access, and no energy spent on parameters that do not contribute.
The operator is built once from a weight matrix and reused across all subsequent inference calls. It requires no changes to model architecture, no retraining, no new hardware, and no modifications to the inference pipeline beyond substituting the operator. It runs on GPU and CPU.
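The build-once, reuse-many pattern described above can be illustrated with a plain compressed sparse row (CSR) operator. This is only a minimal sketch of the general idea of skipping zeroed weights; it is not ROLV's method, which is not described here, and the names `CsrOperator`, `matvec`, and `matmul` are hypothetical.

```python
from typing import List, Sequence


class CsrOperator:
    """Hypothetical build-once sparse operator using plain CSR storage."""

    def __init__(self, weight: Sequence[Sequence[float]]) -> None:
        # Build step: scan the dense matrix once and keep only nonzero entries.
        self.n_rows = len(weight)
        self.n_cols = len(weight[0]) if weight else 0
        self.values: List[float] = []
        self.col_idx: List[int] = []
        self.row_ptr: List[int] = [0]
        for row in weight:
            for j, w in enumerate(row):
                if w != 0.0:
                    self.values.append(w)
                    self.col_idx.append(j)
            self.row_ptr.append(len(self.values))

    def matvec(self, x: Sequence[float]) -> List[float]:
        # Inference step: zeroed weights are never read or multiplied.
        if len(x) != self.n_cols:
            raise ValueError("input length does not match matrix width")
        y = [0.0] * self.n_rows
        for i in range(self.n_rows):
            acc = 0.0
            for k in range(self.row_ptr[i], self.row_ptr[i + 1]):
                acc += self.values[k] * x[self.col_idx[k]]
            y[i] = acc
        return y

    def matmul(self, batch: Sequence[Sequence[float]]) -> List[List[float]]:
        # Reuse the same prebuilt structure for every input in a batch.
        return [self.matvec(x) for x in batch]


def dense_matvec(weight: Sequence[Sequence[float]], x: Sequence[float]) -> List[float]:
    """Dense reference that multiplies every entry, including zeros."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in weight]
```

In this sketch the cost of `matvec` scales with the number of stored nonzeros rather than with the full matrix size, which is the property a pruned weight matrix makes exploitable. A production operator would run on GPU or vectorized CPU kernels rather than Python loops.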
Benchmark results, collected on NVIDIA H200, B200, Tesla T4, Intel CPU, and AMD EPYC, show speedups of 8–19× on production large language model weight matrices at sparsity levels achievable through standard pruning. Energy reductions of up to 99% have been measured directly via hardware power instrumentation under the same conditions. Every published result carries four SHA-256 hashes and a perturbation test; independent verification requires only downloading the same public model weights and recomputing the hashes.
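A hash-plus-perturbation check of this kind might look like the sketch below. The description does not say which four quantities are hashed or how the perturbation test is defined, so everything here is an assumption: the float32 little-endian byte encoding, the choice to hash an output vector, and the helper names `sha256_of_floats` and `perturbation_changes_hash` are all hypothetical.

```python
import hashlib
import struct
from typing import List, Sequence


def sha256_of_floats(values: Sequence[float]) -> str:
    """Hash values as little-endian float32 bytes (an assumed encoding)."""
    payload = struct.pack(f"<{len(values)}f", *values)
    return hashlib.sha256(payload).hexdigest()


def matvec(weight: Sequence[Sequence[float]], x: Sequence[float]) -> List[float]:
    """Dense reference product used to generate the output being hashed."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in weight]


def perturbation_changes_hash(
    weight: Sequence[Sequence[float]],
    x: Sequence[float],
    row: int = 0,
    col: int = 0,
    delta: float = 1.0,
) -> bool:
    """Return True if changing one weight changes the output hash.

    This only detects the change when x[col] is nonzero, so the caller
    should perturb a weight that actually participates in the output.
    """
    baseline = sha256_of_floats(matvec(weight, x))
    perturbed = [list(r) for r in weight]  # copy; leave the input untouched
    perturbed[row][col] += delta
    return sha256_of_floats(matvec(perturbed, x)) != baseline
```

A matching hash shows that two runs produced byte-identical results under the chosen encoding, while the perturbation check guards against a hash that would stay the same regardless of the weights.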
At 70% sparsity on the exact dimensions of LLaMA-3.1-70B MLP layers, ROLV Primitive© runs in 1.22 ms versus 13.97 ms for the leading vendor sparse operator on NVIDIA B200, an 11.45× speedup on the layers that account for the majority of inference compute. The speedup grows with batch size: at batch=2048, ROLV uses 0.41 µs per token versus 4.44 µs. Larger models benefit more than smaller ones.
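The ratios quoted above follow directly from the reported timings. The short sketch below reproduces that arithmetic and converts the per-token figures into per-batch time at batch=2048. The function name `speedup` is hypothetical, and the batch size behind the 1.22 ms measurement is not stated here, so the two ratios are not necessarily measured under the same conditions.

```python
def speedup(baseline: float, candidate: float) -> float:
    """Ratio of baseline time to candidate time (same units for both)."""
    return baseline / candidate


# Layer latency on NVIDIA B200 at 70% sparsity, in milliseconds.
layer_speedup = speedup(13.97, 1.22)

# Per-token time at batch=2048, in microseconds.
token_speedup = speedup(4.44, 0.41)

# Per-batch time implied by the per-token figures, in milliseconds.
batch = 2048
rolv_batch_ms = 0.41 * batch / 1000.0
vendor_batch_ms = 4.44 * batch / 1000.0

print(f"layer speedup: {layer_speedup:.2f}x")        # 11.45x
print(f"per-token speedup: {token_speedup:.2f}x")    # 10.83x
print(f"ROLV batch time: {rolv_batch_ms:.2f} ms")    # 0.84 ms
print(f"vendor batch time: {vendor_batch_ms:.2f} ms")  # 9.09 ms
```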
Files
ROLV_Validation.pdf
Total size: 314.3 kB
| MD5 | Size |
|---|---|
| 34e1bea6ba957acae0a7e5e845bb7d4b | 7.3 kB |
| 3af8d5295c66a35d6e3063eb6c988928 | 306.9 kB |
Additional details
Dates
- Available: 2026-03