Published January 28, 2026 | Version v4
Working paper (open access)

Hybrid Spiking Neural Networks: Combining Spike Counts and Membrane Potentials for Energy-Efficient Language and Image Generation

Authors/Creators

  • Independent Researcher

Description

I propose a hybrid spiking neural network that combines spike counts and membrane potentials for output prediction, extended to both language modeling and image generation tasks.
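The core idea, combining spike counts with membrane potentials for the output prediction, can be sketched as a weighted readout. This is a hypothetical illustration, not the paper's exact code: the function name `hybrid_readout`, the normalization, and the `membrane_weight` parameter (0 = spike-only, 1 = membrane-only) are assumptions for the sketch.

```python
import numpy as np

def hybrid_readout(spike_counts, membrane_potentials, membrane_weight=0.5):
    """Combine spike counts and membrane potentials into one readout.

    Hypothetical sketch: both signals are normalized to a comparable
    scale, then mixed with a tunable membrane weight, so the output
    keeps the sparsity of spikes while recovering the sub-threshold
    information that discrete spike counts discard.
    """
    s = spike_counts / (spike_counts.max() + 1e-8)               # normalized counts
    v = membrane_potentials / (np.abs(membrane_potentials).max() + 1e-8)
    return (1.0 - membrane_weight) * s + membrane_weight * v
```

With `membrane_weight=0.5` (the setting the findings below report as the best quality/efficiency trade-off), spikes and membrane potentials contribute equally.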

Key Findings (v4, new: Image Generation):
- Spiking VAE with 50% membrane weight: 57% loss reduction vs. spike-only
- Posterior collapse avoided: KL divergence stays positive (the spike-only model collapsed to KL = 0)
- Image generation sparsity: 96% fewer spike operations
- Optimal trade-off: a 50% membrane weight balances quality and efficiency

Key Findings (v3 - Language Model):
- BitNet mixed precision: PPL 2.69, beating the standard SNN (3.29)
- RWKV time-mixing: 36.1% improvement in long-range memory
- Ultimate architecture (all techniques combined): 43.4% improvement
- Multiplication-free reservoir: 50-70% of operations are additions only
- 16-model ensemble achieves PPL 1.04
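The multiplication-free claim follows from spikes being binary: a weight matrix-vector product `W @ s` with a 0/1 spike vector reduces to summing the weight columns of the firing neurons, i.e. additions only. A minimal sketch of this idea (not the paper's implementation; `spike_matvec` is a name assumed here):

```python
import numpy as np

def spike_matvec(weights, spike_vector):
    """Multiplication-free matrix-vector product for binary spikes.

    Because each spike is 0 or 1, W @ s equals the sum of the columns
    of W at the indices where a spike occurred -- no multiplications.
    """
    active = np.flatnonzero(spike_vector)   # indices of firing neurons
    return weights[:, active].sum(axis=1)   # column sums: additions only
```

Sparsity compounds the saving: if only a small fraction of neurons fire (7.6% in the v1-v2 results below), only that fraction of columns is ever touched.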

Key Findings (v1-v2):
- SNN achieves the best perplexity (PPL = 9.90) vs. DNN (11.28) and LSTM (15.67)
- 14.7× more energy-efficient through sparse computation (only 7.6% of neurons fire)
- 39.7% quality improvement from the hybrid (spike + membrane) approach
- Extreme compressibility: quality holds up under 80% neuron pruning and 4-bit quantization
- Noise robustness: no degradation at 30% input noise
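The 4-bit quantization result can be illustrated with a generic symmetric quantizer; this is a sketch of the standard technique, not necessarily the paper's exact scheme (the per-tensor scale and the function name `quantize_4bit` are assumptions):

```python
import numpy as np

def quantize_4bit(weights):
    """Symmetric 4-bit weight quantization (generic sketch).

    Maps weights onto 16 signed integer levels in [-8, 7] with a single
    per-tensor scale, and returns the dequantized approximation along
    with the integer codes. Rounding error is at most half a step.
    """
    scale = np.abs(weights).max() / 7.0 + 1e-12        # one scale per tensor
    codes = np.clip(np.round(weights / scale), -8, 7).astype(np.int8)
    return codes * scale, codes
```

Storing only the `int8`-packed codes plus one scale cuts weight memory roughly 8× versus float32, which is what makes the "extreme compressibility" finding attractive for edge deployment.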

This v4 establishes hybrid SNNs as the optimal architecture for energy-efficient multimodal AI (language and vision) on edge devices.

Source code: https://github.com/hafufu-stack/snn-language-model

Files

Hybrid_SNN.pdf

532.2 kB · md5:d678bd581c4097782911ce7ebe517c9b
