Ano: Faster is Better in Noisy Landscapes
Creators
Description
Stochastic optimizers are central to deep learning, yet widely used methods like Adam and Adan exhibit performance degradation in non-stationary or noisy environments, partly due to their reliance on momentum-based magnitude estimates. We introduce Ano, a novel optimizer that decouples the direction and magnitude of parameter updates: momentum is applied exclusively to directional smoothing, while the step size is determined by instantaneous gradient magnitudes. This design improves robustness to gradient noise while retaining the simplicity and efficiency of first-order methods. We also propose Anolog, a variant of Ano that dynamically adjusts the momentum coefficient $\beta_1$, effectively expanding the momentum window over time. We provide convergence guarantees in standard nonconvex stochastic optimization settings. Empirically, we evaluate Ano across three major domains (computer vision, natural language processing, and deep reinforcement learning), using both default and tuned hyperparameters. Ano demonstrates competitive or superior performance across a range of tasks, particularly in noisy environments such as MuJoCo, where it improves cumulative rewards by 15\% over Adam and achieves comparable results to Adam with 50–70\% fewer training steps. The method maintains the memory footprint of Adam. All code and experimental logs are publicly released to support reproducibility. These results suggest that decoupling direction and magnitude offers a promising avenue for improving the robustness and generality of optimization algorithms in deep learning.
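The decoupling described above can be read as a small modification of momentum SGD, and a minimal PyTorch sketch may help make it concrete. The class below is not the released Ano implementation (that code lives in the ano-experiments repository linked under Software); the elementwise sign/magnitude form of the update, the default hyperparameters, and the name `AnoSketch` are assumptions made purely for illustration.

```python
import torch


class AnoSketch(torch.optim.Optimizer):
    """Illustrative sketch of a direction/magnitude-decoupled update.

    NOT the authors' implementation: the elementwise sign/|g| split and the
    defaults below are assumptions used only to illustrate the idea from the
    abstract (momentum smooths the direction; the step size comes from the
    instantaneous gradient magnitude).
    """

    def __init__(self, params, lr=1e-3, beta1=0.9):
        defaults = dict(lr=lr, beta1=beta1)
        super().__init__(params, defaults)

    @torch.no_grad()
    def step(self, closure=None):
        loss = closure() if closure is not None else None
        for group in self.param_groups:
            lr, beta1 = group["lr"], group["beta1"]
            for p in group["params"]:
                if p.grad is None:
                    continue
                g = p.grad
                state = self.state[p]
                if "m" not in state:
                    state["m"] = torch.zeros_like(p)
                m = state["m"]
                # Momentum is used only to smooth the update *direction*.
                m.mul_(beta1).add_(g, alpha=1 - beta1)
                # Step size is taken from the *instantaneous* gradient
                # magnitude rather than a momentum-based estimate.
                p.add_(m.sign() * g.abs(), alpha=-lr)
        return loss
```

Under this reading, the optimizer keeps a single momentum buffer per parameter (the same state as Adam's first moment), which is consistent with the claim that the memory footprint matches Adam's.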
Files

| Name | Size |
|---|---|
| AnoFaster-2.pdf (md5:1604b5792212ba995d8a79c0cc0d3353) | 5.4 MB |
Additional details

Dates
- Issued: 2025-07-25

Software
- Repository URL: https://github.com/Adrienkgz/ano-experiments
- Programming language: Python
- Development Status: Active