Steklov Activations: Piecewise-Polynomial Gates with Compact Support and Tunable Sparsity
Description
We introduce Steklov activations, a piecewise-polynomial activation family derived from B-spline antiderivatives. Parameterized by order r (smoothness) and scale α (transition width), they produce exact zero output and gradient outside a compact support. At α=2 the activation approximates GELU (sup error <0.0091); at α=6 it is exactly HardSwish. On image classification (MNIST, CIFAR-10, CIFAR-100 across LeNet-5, ResNet-18, and WideResNet-28-10), Steklov achieves the highest accuracy on all benchmarks. On language modeling (GPT-2 124M/354M, LLaMA-style 105M), it matches GELU and improves over SiLU. The compact support induces tunable neuron inactivity (3–83%) that is stable across data splits and distributions. Pruning inactive neurons removes 7–11% of parameters with negligible quality loss; a Triton kernel then delivers 3–6% faster inference than unpruned GELU.
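The abstract does not give the closed form of the activation. As a minimal sketch, assume the gated form f(x) = x · S_r(x/α), where S_r is the antiderivative of a centered order-r B-spline (a piecewise-polynomial smooth step). Under that assumption, the r=0 case (clipped linear ramp) at α=6 reproduces the stated HardSwish identity exactly; the function and parameter names below are hypothetical, not the authors' code.

```python
import numpy as np

def steklov_gate(x, r=0):
    """Antiderivative of the centered order-r B-spline: a piecewise-
    polynomial step rising from 0 to 1 on a compact transition interval.
    This sketch implements only r=0, the clipped linear ramp."""
    if r != 0:
        raise NotImplementedError("sketch covers r=0 only")
    return np.clip(x + 0.5, 0.0, 1.0)

def steklov_activation(x, r=0, alpha=6.0):
    """Hypothetical gated form x * S_r(x / alpha). Output and gradient
    are exactly zero for x <= -alpha/2 (left of the gate's support)."""
    return x * steklov_gate(x / alpha, r=r)

def hardswish(x):
    # Reference: HardSwish(x) = x * clip(x/6 + 1/2, 0, 1)
    return x * np.clip(x / 6.0 + 0.5, 0.0, 1.0)

x = np.linspace(-8, 8, 1001)
# At r=0, alpha=6 the sketch coincides with HardSwish everywhere.
print(np.max(np.abs(steklov_activation(x, r=0, alpha=6.0) - hardswish(x))))  # → 0.0
```

Higher orders r would smooth the gate further (each antiderivative order adds one degree of continuity), which is consistent with the abstract's description of r as a smoothness parameter and α as the transition width.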
Files
steklov_activations.pdf (462.2 kB)
md5:44c3bbdfff0da11060f66b40cce4d2d1
Additional details
Dates
- Issued: 2026-03-26 (preprint v1.1)