Published March 26, 2026 | Version v2
Preprint | Open Access

Steklov Activations: Piecewise-Polynomial Gates with Compact Support and Tunable Sparsity

Description

We introduce Steklov activations, a family of piecewise-polynomial activation functions derived from B-spline antiderivatives. Parameterized by an order r (smoothness) and a scale α (transition width), they produce exactly zero output and zero gradient outside a compact support region. At α=2 the activation approximates GELU (sup-norm error < 0.0091); at α=6 it is exactly HardSwish. On image classification (MNIST, CIFAR-10, and CIFAR-100 across LeNet-5, ResNet-18, and WideResNet-28-10), Steklov achieves the highest accuracy of the compared activations on all benchmarks. On language modeling (GPT-2 124M/354M, LLaMA-style 105M), it matches GELU and improves over SiLU. The compact support induces tunable neuron inactivity (3–83%) that is stable across data splits and distributions. Pruning inactive neurons removes 7–11% of parameters with negligible quality loss; a Triton kernel then delivers 3–6% faster inference than unpruned GELU.
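
For intuition, here is a minimal Python (PyTorch) sketch of the gated form, assuming the order-1 (piecewise-linear) member can be written as x·g(x) with a linear ramp gate of width α. The names steklov1, steklov_gate, and inactive_mask are illustrative choices, not the paper's API, and the general order-r construction from higher-order B-spline antiderivatives is not reproduced here; the only property checked against the abstract is the exact α=6 HardSwish identity.

    import torch
    import torch.nn.functional as F

    def steklov_gate(x: torch.Tensor, alpha: float = 6.0) -> torch.Tensor:
        # Linear ramp from 0 to 1 over [-alpha/2, alpha/2]: the antiderivative
        # of a width-alpha box (an order-0 B-spline), rescaled to unit height.
        return torch.clamp(x / alpha + 0.5, min=0.0, max=1.0)

    def steklov1(x: torch.Tensor, alpha: float = 6.0) -> torch.Tensor:
        # Gated form x * g(x): for x <= -alpha/2 the output and the gradient
        # are both exactly zero, which is what makes inactive neurons prunable.
        return x * steklov_gate(x, alpha)

    def inactive_mask(preacts: torch.Tensor, alpha: float = 6.0) -> torch.Tensor:
        # One plausible inactivity test (an assumption, not the paper's method):
        # a unit whose pre-activation never rises above -alpha/2 on a
        # calibration batch has emitted zero output and zero gradient
        # throughout, so it is a candidate for pruning.
        return preacts.max(dim=0).values <= -alpha / 2

    # Sanity check of the abstract's alpha = 6 claim: the order-1 member
    # coincides with HardSwish, x * relu6(x + 3) / 6.
    x = torch.linspace(-8.0, 8.0, steps=1001)
    assert torch.allclose(steklov1(x, alpha=6.0), F.hardswish(x), atol=1e-6)

Only the r=1 ramp case is shown because the α=6 HardSwish identity makes it checkable against the abstract alone; smoother members of the family would, per the description above, replace the ramp with antiderivatives of higher-order B-splines.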

Files

steklov_activations.pdf (462.2 kB, md5:44c3bbdfff0da11060f66b40cce4d2d1)

Additional details

Dates

Issued: 2026-03-26 (preprint v1.1)