The Redundancy Cliff: Discovering and Exploiting 50% Dispensable Compute in Transformers with a Single O(1) Spectral Gate
Authors/Creators
Description
This repository accompanies the paper "The Redundancy Cliff: Discovering and Exploiting 50% Dispensable Compute in Transformers with a Single O(1) Spectral Gate" (Dec 2025).
We introduce a simple, fast $O(1)$ spectral gate $\kappa_x$, derived from a modular-residue fingerprint of the input, and use it to dynamically shorten the attention KV cache and slice the MLP intermediate dimension. In our proof-of-concept (PoC) runs on TinyLlama-1.1B (WikiText-2), this gives a substantial compute reduction with negligible change in perplexity.
This release contains the final, tested code (scripts, patching logic, $\kappa_x$ computation, and required dependencies) used to run the experiments, along with the original run artifacts for maximum transparency. This is a proof-of-concept and is not intended for production deployment. See README.md for instructions, environment notes, and full reproducibility details.
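To make the mechanism in the description concrete, the sketch below shows how a scalar gate could drive both reductions. It is an illustration only, not the released code: `toy_kappa` is a hypothetical stand-in for $\kappa_x$ (the actual definition is in the paper and the repository scripts), and the choices to keep the most recent cache entries and the leading MLP channels are assumptions made for the example. All function names and parameters here are invented for illustration.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def toy_kappa(token_ids: Sequence[int],
              moduli: Sequence[int] = (3, 5, 7),
              window: int = 8) -> float:
    """Hypothetical stand-in for kappa_x: a score in (0, 1].

    Only a fixed-size tail of the input is inspected, so the cost does not
    grow with sequence length. This is NOT the paper's definition.
    """
    tail = token_ids[-window:]
    if not tail:
        return 1.0
    # For each modulus, the fraction of possible distinct residues observed.
    scores = [len({t % m for t in tail}) / min(m, len(tail)) for m in moduli]
    return sum(scores) / len(scores)


def keep_count(total: int, kappa: float, min_keep: int = 1) -> int:
    """Map a gate value to the number of units to keep (assumed linear rule)."""
    kappa = min(1.0, max(0.0, kappa))
    return max(min_keep, min(total, int(total * kappa)))


def prune_kv(keys: Matrix, values: Matrix, kappa: float) -> Tuple[Matrix, Matrix]:
    """Keep only the most recent cache entries (assumption: recency-based)."""
    n = keep_count(len(keys), kappa)
    return keys[-n:], values[-n:]


def slice_mlp(w_up: Matrix, w_down: Matrix, kappa: float) -> Tuple[Matrix, Matrix]:
    """Keep the first k intermediate channels of an MLP.

    w_up has one row per intermediate channel (d_ff x d_model);
    w_down has one column per intermediate channel (d_model x d_ff).
    """
    k = keep_count(len(w_up), kappa)
    return w_up[:k], [row[:k] for row in w_down]
```

In this sketch a gate value of 0.5 halves both the cache length and the MLP intermediate width. How the gate is actually computed and applied per layer in the PoC is documented in README.md and the accompanying scripts.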
Files
AfixForAICompute.pdf
Total size: 586.3 kB

| Checksum | Size |
|---|---|
| md5:eae566c22631e551bdab69837319ae08 | 539.9 kB |
| md5:bd3e0bacb39eaba4eb230cda53733eba | 1.9 kB |
| md5:10aaca69e0e40dec1113081f3140324c | 949 Bytes |
| md5:900b93a0339e611e2e0983777295efad | 2.4 kB |
| md5:ab66b35a740c4287d8eb24fa72167b1f | 14.2 kB |
| md5:43675a6838c46319dcbbb94c172e5fcb | 10.3 kB |
| md5:b3b7ca74dd1b6b6fcce7f844fe31304c | 7.3 kB |
| md5:eda5a9fc53122b9459416b1b6a265caf | 1.7 kB |
| md5:36f92ae68969e6c55db93e91ce4f375e | 748 Bytes |
| md5:9acbcbe1ab0eff579cb1dcc4b77dec73 | 72 Bytes |
| md5:18520255f890c2d9f84be82433015893 | 726 Bytes |
| md5:a76c559e3fdf6410bfb5b047dcd2867a | 5.3 kB |
| md5:ec0a05fa56e88475d20445736d96ae64 | 701 Bytes |
Additional details
Related works
- Is supplement to
- Publication: 10.5281/zenodo.17872873 (DOI)
- Publication: 10.5281/zenodo.17883257 (DOI)