Who Needs Attention? Spiking Language Modeling via Synaptogenic Adaptive Processing Units
Description
A spiking neural network generates coherent multi-turn conversation from pure next-token prediction, without attention, without RLHF, and without filtering — running on a $290 used GPU.
We introduce the Synaptogenic Adaptive Processing Unit Language Model (SAPU-LM), a multi-timescale spiking reservoir architecture that replaces attention entirely with trained recurrent dynamics in leaky integrate-and-fire neurons. The chatbot "Nemo" emerges from freezing the learned spiking topology and retraining only 8.5% of parameters on conversational data, achieving 38.05 test perplexity on DailyDialog.
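The core dynamics the abstract describes can be sketched with a minimal discrete-time leaky integrate-and-fire reservoir. This is an illustrative sketch only: the function and parameter names (`lif_step`, `tau`, `v_th`) are assumptions, and the actual SAPU-LM update rule, reset scheme, and surrogate-gradient training are defined in the released code.

```python
import numpy as np

def lif_step(v, spikes, x, W_in, W_rec, tau=10.0, v_th=1.0, dt=1.0):
    """One leaky integrate-and-fire update (illustrative, not the paper's exact rule).

    v       : membrane potentials, shape (n,)
    spikes  : previous binary spike vector, shape (n,)
    x       : input features at this step, shape (d,)
    tau     : membrane time constant; larger tau = slower leak = longer memory
    """
    decay = np.exp(-dt / tau)                 # exponential leak toward zero
    v = decay * v + W_rec @ spikes + W_in @ x # recurrent + input drive
    new_spikes = (v >= v_th).astype(v.dtype)  # threshold crossing emits a spike
    v = np.where(new_spikes > 0, 0.0, v)      # hard reset after spiking
    return v, new_spikes

# Toy run with random weights (the real model trains W_rec via surrogate gradients).
rng = np.random.default_rng(0)
n, d = 512, 64
W_rec = rng.normal(0, 0.05, (n, n))
W_in = rng.normal(0, 0.5, (n, d))
v, s = np.zeros(n), np.zeros(n)
for _ in range(20):
    v, s = lif_step(v, s, rng.normal(size=d), W_in, W_rec)
print(int(s.sum()))  # number of neurons spiking at the final step
```

A multi-timescale variant would run several such reservoirs with different `tau` values, which is how (per the abstract) the tiling variant differentiates fast and slow populations without separate weight matrices.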
The architecture spans a lineage from a frozen Echo State Network (~19,500 perplexity) to 84.15 perplexity (M-SAPU-LM) on a WikiText-103 10M-token subsample — an ~80× improvement from training reservoir weights via surrogate gradients. A Tiling Parallel SAPU (TPSAPU) shares a single 512×512 recurrent weight matrix across three timescales and recovers to 84.67 perplexity after L1 pruning, suggesting that membrane time constant τ alone creates functional differentiation. Ternary quantization compresses the learned recurrent core to ~45 KB at 93.6% sparsity.
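The ~45 KB figure follows from storing only ternary-valued nonzeros of the 512×512 recurrent core. A rough sketch of ternary quantization and the resulting storage estimate, assuming a common magnitude-threshold scheme (the paper's exact quantizer and on-disk format may differ):

```python
import numpy as np

def ternarize(W, threshold_scale=0.7):
    """Quantize weights to {-1, 0, +1}.

    The 0.7 * mean|W| threshold is a standard ternary-weight heuristic,
    assumed here for illustration; it is not taken from the paper.
    """
    thr = threshold_scale * np.abs(W).mean()
    T = np.zeros_like(W, dtype=np.int8)
    T[W > thr] = 1
    T[W < -thr] = -1
    return T

rng = np.random.default_rng(0)
W = rng.laplace(0, 0.02, (512, 512))   # stand-in for the trained recurrent core
T = ternarize(W)

sparsity = (T == 0).mean()
nnz = int((T != 0).sum())
# Sparse storage estimate: ~2 bytes of index plus 1 sign bit per nonzero.
approx_kb = nnz * (2 + 1 / 8) / 1024
print(f"sparsity={sparsity:.1%}, nonzeros={nnz}, ~{approx_kb:.1f} KB")
```

At the reported 93.6% sparsity, a 512×512 ternary matrix has roughly 16.8k nonzeros, so a sparse index-plus-sign encoding lands in the tens of kilobytes, consistent with the ~45 KB figure.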
L1 pruning reveals timescale-dependent topology emergence: fast reservoirs maintain distributed connectivity while slow reservoirs self-organize into diagonal self-excitatory memory cells — a structure discovered by the network, not imposed by design. The trained ternary spiking core maps directly to analog resistor-capacitor-comparator circuits; a proof-of-concept hardware exporter has been developed.
To our knowledge, this is the first demonstration of open-ended next-token prediction using a trained spiking reservoir with no attention mechanism. Code and checkpoints: https://gitlab.com/AntonioGCGonzalez/synaptogenic-adaptive-processing-unit-language-models
This is a preliminary technical report. Several experimental configurations are still running; results will be updated in subsequent revisions.
Files

| Name | Size | MD5 |
|---|---|---|
| WhoNeedsAttention.pdf | 1.7 MB | 89d3cf1348284e217163df4a3aa89f7d |
Additional details
Dates
- Submitted
- 2023-03-01 (pre-print submitted to guarantee priority rights)
Software
- Repository URL
- https://gitlab.com/AntonioGCGonzalez/synaptogenic-adaptive-processing-unit-language-models
- Programming language
- Python, HTML, JSON
- Development Status
- Active