
Published July 25, 2025 | Version v1
Preprint | Open Access

Ano: Faster is Better in Noisy Landscapes

Description

Stochastic optimizers are central to deep learning, yet widely used methods such as Adam and Adan degrade in non-stationary or noisy environments, partly because they rely on momentum-based magnitude estimates. We introduce Ano, a novel optimizer that decouples the direction and magnitude of parameter updates: momentum is applied exclusively to directional smoothing, while the step size is determined by the instantaneous gradient magnitude. This design improves robustness to gradient noise while retaining the simplicity and efficiency of first-order methods. We also propose Anolog, a variant of Ano that dynamically adjusts the momentum coefficient $\beta_1$, effectively expanding the momentum window over time. We provide convergence guarantees in standard nonconvex stochastic optimization settings. Empirically, we evaluate Ano across three major domains: computer vision, natural language processing, and deep reinforcement learning, using both default and tuned hyperparameters. Ano demonstrates competitive or superior performance across a range of tasks, particularly in noisy environments such as MuJoCo, where it improves cumulative rewards by 15% over Adam and achieves results comparable to Adam with 50–70% fewer training steps. The method maintains the memory footprint of Adam. All code and experimental logs are publicly released to support reproducibility. These results suggest that decoupling direction and magnitude offers a promising avenue for improving the robustness and generality of optimization algorithms in deep learning.
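For intuition, the sketch below illustrates the direction/magnitude split described above in PyTorch. The specific update rule shown (sign of the momentum buffer scaled element-wise by the instantaneous gradient magnitude) and the `anolog_beta1` schedule are illustrative assumptions, not the exact formulas from the paper; the reference implementation lives in the linked repository.

```python
import torch

def decoupled_step(param, grad, m, beta1=0.9, lr=1e-3):
    """Illustrative decoupled update: direction from momentum, magnitude from the current gradient.

    NOT the exact Ano rule; it only sketches the direction/magnitude split
    described in the abstract.
    """
    # Momentum buffer smooths only the update *direction*.
    m.mul_(beta1).add_(grad, alpha=1 - beta1)
    # Step size comes from the instantaneous gradient magnitude,
    # so stale momentum cannot inflate or shrink the step.
    update = torch.sign(m) * grad.abs()
    param.sub_(lr * update)
    return m

def anolog_beta1(step, beta1_max=0.999):
    # Hypothetical schedule: beta1 -> 1 as training proceeds, so the
    # effective EMA window (~ 1 / (1 - beta1)) expands over time,
    # mimicking the "expanding momentum window" idea behind Anolog.
    return min(beta1_max, 1.0 - 1.0 / (step + 2))

# Toy usage on a single parameter tensor (no autograd bookkeeping shown).
w = torch.randn(8)
buf = torch.zeros_like(w)
for step in range(100):
    g = torch.randn(8)  # stand-in for a stochastic gradient
    buf = decoupled_step(w, g, buf, beta1=anolog_beta1(step))
```

Because the magnitude term reacts to each fresh gradient rather than to a smoothed history, a sudden change in the loss landscape changes the step size immediately, which is the robustness property the abstract attributes to Ano.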

Files

AnoFaster-2.pdf (5.4 MB)
md5:1604b5792212ba995d8a79c0cc0d3353

Additional details

Dates

Issued
2025-07-25

Software

Repository URL
https://github.com/Adrienkgz/ano-experiments
Programming language
Python
Development Status
Active