WiggleGPT: Revisiting the Monotonicity Assumption in Neural Networks via Oscillating Activation Functions
Authors/Creators
Description
Since Minsky and Papert's Perceptrons (1969), artificial neural networks have relied on monotonic activation functions (Sigmoid, ReLU, GELU). With monotonic activations, a single neuron cannot solve non-linearly separable problems like XOR and hidden layers become necessary, a limitation that contributed to the first AI winter.
WiggleGPT challenges this 56-year-old assumption by implementing learnable oscillating activations: f(x) = sin(ωx + φ) · tanh(x). We demonstrate that a single neuron using this activation solves XOR with 100% accuracy—mathematically impossible for standard monotonic neurons.
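As a minimal sketch of the idea (not the repository's actual code), the snippet below implements the activation f(x) = sin(ωx + φ)·tanh(x) with learnable ω and φ, and shows by direct construction that a single linear neuron followed by this activation can represent XOR. The parameter names, initial values, and hand-picked weights are illustrative assumptions, not values from the paper.

```python
import math
import torch
import torch.nn as nn

class OscillatingActivation(nn.Module):
    """f(x) = sin(omega * x + phi) * tanh(x), with learnable omega and phi.

    Parameter names and initial values are illustrative; the WiggleGPT
    repository may parameterize the activation differently.
    """
    def __init__(self, omega: float = 1.0, phi: float = 0.0):
        super().__init__()
        self.omega = nn.Parameter(torch.tensor(omega))  # learnable frequency
        self.phi = nn.Parameter(torch.tensor(phi))      # learnable phase

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.sin(self.omega * x + self.phi) * torch.tanh(x)

# A single linear neuron followed by the oscillating activation can represent XOR.
# The weights below are one hand-constructed solution, not learned values from the paper.
neuron = nn.Linear(2, 1, bias=True)
act = OscillatingActivation(omega=math.pi / 6, phi=0.0)
with torch.no_grad():
    neuron.weight.copy_(torch.tensor([[3.0, 3.0]]))
    neuron.bias.zero_()

X = torch.tensor([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
out = act(neuron(X)).squeeze(1)       # values ~0, ~0.995, ~0.995, ~0
print((out > 0.5).int().tolist())     # -> [0, 1, 1, 0]
```

With w = (3, 3), b = 0, ω = π/6, φ = 0, the pre-activations 0, 3, 3, 6 land on the zeros and peak of the sine factor, so the neuron outputs the XOR pattern exactly; no such weight setting exists for a single neuron with a monotonic activation.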
Scaling to 124M parameters (matching GPT-2 Small), WiggleGPT achieves validation loss of 3.1621 on OpenWebText, within 1.3% of the standard baseline, without increasing parameter count. Frequency analysis confirms the model actively utilizes oscillation: frequency variance increased 6x from initialization, with 95% of neurons retaining oscillatory behavior rather than collapsing to linear approximations.
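One way to reproduce this kind of frequency analysis is to track the spread of the learned frequencies before and after training. The helper below is a hedged sketch: it assumes each oscillating activation exposes its frequency as an `omega` parameter (scalar or per-neuron vector), as in the sketch above; the actual attribute name and layout in the WiggleGPT repository may differ.

```python
import torch

def omega_variance(model: torch.nn.Module) -> float:
    """Variance of the learned frequencies across all oscillating activations.

    Assumes modules expose their frequency as an `omega` parameter; flattening
    handles both scalar and per-neuron frequency parameterizations.
    """
    omegas = [m.omega.detach().flatten()
              for m in model.modules() if hasattr(m, "omega")]
    return torch.cat(omegas).var().item()
```

Comparing this value at initialization and after training gives the kind of variance ratio reported above (roughly a 6x increase).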
These results establish oscillating activations as a viable architectural alternative to standard deep learning primitives, reopening design questions closed since 1969.
v3 Note: The original 'RTX 5060 Ti' specification was correct; v2 erroneously 'corrected' it. The original date has also been restored to reflect the publication date on my own website.
Files

| Name | Size |
|---|---|
| WiggleGPT_Paper_Syncedv1.pdf (md5:0ac2fcb690a972259fe32bf035b0553d) | 1.1 MB |
Additional details
Software
- Repository URL: https://github.com/Eden-Eldith/WiggleGPT
- Programming language: Python
- Development Status: Active