Published March 1, 2026 | Version 1.0.0
Preprint | Open Access

Who Needs Attention? Spiking Language Modeling via Synaptogenic Adaptive Processing Units

  1. Jagiellonian University
  2. Jan Matejko Academy of Fine Arts

Description


A spiking neural network generates coherent multi-turn conversation from pure next-token prediction, without attention, without RLHF, and without filtering — running on a $290 used GPU.

We introduce the Synaptogenic Adaptive Processing Unit Language Model (SAPU-LM), a multi-timescale spiking reservoir architecture that replaces attention entirely with trained recurrent dynamics in leaky integrate-and-fire neurons. The chatbot "Nemo" emerges from freezing the learned spiking topology and retraining only 8.5% of parameters on conversational data, achieving 38.05 test perplexity on DailyDialog.
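The mechanism that replaces attention here is recurrent dynamics in leaky integrate-and-fire (LIF) neurons. As a toy illustration of one discrete-time LIF update (the function name, constants, and NumPy implementation below are ours, not the SAPU-LM code):

```python
import numpy as np

def lif_step(v, x, w_rec, spikes, tau=20.0, v_th=1.0, dt=1.0):
    """One leaky integrate-and-fire step (illustrative sketch, not the
    paper's API). The membrane potential v decays with time constant
    tau, integrates external input x plus recurrent feedback from the
    previous step's spike vector, then fires and resets at v_th."""
    decay = np.exp(-dt / tau)
    v = decay * v + x + w_rec @ spikes        # leak + integrate
    new_spikes = (v >= v_th).astype(v.dtype)  # hard threshold
    v = v * (1.0 - new_spikes)                # reset fired neurons
    return v, new_spikes
```

During training, the non-differentiable threshold would be handled by a surrogate gradient (e.g., the derivative of a steep sigmoid standing in for the step function), which is how the abstract says the reservoir weights are trained.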

The architecture spans a lineage from a frozen Echo State Network (~19,500 perplexity) to 84.15 perplexity (M-SAPU-LM) on a WikiText-103 10M-token subsample — an ~80× improvement from training reservoir weights via surrogate gradients. A Tiling Parallel SAPU (TPSAPU) shares a single 512×512 recurrent weight matrix across three timescales and recovers to 84.67 perplexity after L1 pruning, suggesting that membrane time constant τ alone creates functional differentiation. Ternary quantization compresses the learned recurrent core to ~45 KB at 93.6% sparsity.
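The TPSAPU claim, that a single shared weight matrix behaves differently under different membrane time constants, can be sketched in a few lines (a toy NumPy model with our own assumed sizes and constants, not the paper's implementation): under identical weights and identical sub-threshold drive, a fast reservoir forgets its input before reaching threshold, while a slow one integrates up to spiking.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 64                                    # toy size; the paper's TPSAPU uses 512x512
W = rng.normal(0, 1 / np.sqrt(N), (N, N)) # one recurrent matrix shared by all timescales

def run_reservoir(tau, steps=100):
    """Drive one timescale's copy of the shared weights with a constant
    sub-threshold input and count total spikes (illustrative only)."""
    v = np.zeros(N)
    s = np.zeros(N)
    decay = np.exp(-1.0 / tau)
    total = 0
    for _ in range(steps):
        v = decay * v + 0.1 + 0.5 * (W @ s)  # constant drive + shared recurrence
        s = (v >= 1.0).astype(float)         # threshold
        v *= 1.0 - s                         # reset
        total += int(s.sum())
    return total

# Same W, different membrane time constants -> different spiking regimes.
for tau in (2.0, 10.0, 50.0):
    print(f"tau={tau}: {run_reservoir(tau)} spikes")
```

With these constants the fast reservoir (tau=2) never reaches threshold, while the slow one (tau=50) spikes repeatedly, consistent with the abstract's suggestion that tau alone can create functional differentiation.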

L1 pruning reveals timescale-dependent topology emergence: fast reservoirs maintain distributed connectivity while slow reservoirs self-organize into diagonal self-excitatory memory cells — a structure discovered by the network, not imposed by design. The trained ternary spiking core maps directly to analog resistor-capacitor-comparator circuits; a proof-of-concept hardware exporter has been developed.
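The topology finding can be probed with generic magnitude pruning (a sketch under our own assumptions; the paper's L1 schedule and diagnostics may differ): prune to a target sparsity, then measure how much of the surviving weight mass sits on the diagonal.

```python
import numpy as np

def prune_by_magnitude(W, sparsity=0.936):
    """Zero out the smallest-|w| entries until the requested sparsity is
    reached (generic magnitude pruning, not the paper's procedure)."""
    flat = np.abs(W).ravel()
    k = int(sparsity * flat.size)
    thresh = np.partition(flat, k)[k]          # k-th smallest magnitude
    return np.where(np.abs(W) >= thresh, W, 0.0)

def diagonal_mass(W):
    """Fraction of remaining |weight| mass on the diagonal: a crude probe
    for self-excitatory 'memory cell' structure."""
    total = np.abs(W).sum()
    return float(np.abs(np.diag(W)).sum() / total) if total else 0.0

# A matrix with strong self-connections keeps its diagonal under pruning.
rng = np.random.default_rng(1)
W = rng.normal(0, 0.1, (32, 32)) + np.eye(32)
Wp = prune_by_magnitude(W, sparsity=0.9)
print(diagonal_mass(W), "->", diagonal_mass(Wp))
```

In this toy case pruning concentrates the surviving mass on the diagonal because the self-connections were built in; the abstract's point is the converse and stronger one, that the slow reservoirs arrive at this diagonal structure on their own.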

To our knowledge, this is the first demonstration of open-ended next-token prediction using a trained spiking reservoir with no attention mechanism. Code and checkpoints: https://gitlab.com/AntonioGCGonzalez/synaptogenic-adaptive-processing-unit-language-models

This is a preliminary technical report. Several experimental configurations are still running; results will be updated in subsequent revisions.

Files (1.7 MB)

WhoNeedsAttention.pdf (1.7 MB)
md5:89d3cf1348284e217163df4a3aa89f7d

Additional details

Dates

Submitted
2023-03-01
Submitted preprint to guarantee priority rights.