TURINICI, Gabriel
AYADI, Imen
2020-12-10
Presentation at the ICPR 2021 conference
The minimization of the loss function is of paramount importance in deep neural networks. Many popular optimization algorithms have been shown to correspond to some evolution equation of gradient flow type. Inspired by the numerical schemes used for general evolution equations, we introduce a second-order stochastic Runge-Kutta method and show that it yields a consistent procedure for the minimization of the loss function. In addition, it can be coupled, in an adaptive framework, with Stochastic Gradient Descent (SGD) to automatically adjust the learning rate of the SGD. The resulting adaptive SGD, called SGD-G2, shows good results in terms of convergence speed when tested on standard datasets.
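The abstract views SGD as an explicit Euler discretization of the gradient flow and pairs it with a second-order stochastic Runge-Kutta step to drive the learning-rate adaptation. The following is a minimal sketch of that idea on a toy least-squares problem, assuming minibatch gradients and a Heun-type (second-order Runge-Kutta) update; the step-size adaptation rule shown here (shrink or grow the learning rate from the gap between the SGD and RK2 steps) is a hypothetical illustration, not the exact SGD-G2 rule derived in the paper.

```python
# Sketch: Heun-type (second-order) stochastic Runge-Kutta step for the gradient
# flow theta' = -grad L(theta), used to adapt the learning rate of plain SGD.
# The adaptation heuristic below is illustrative only, not the paper's SGD-G2 rule.
import numpy as np

rng = np.random.default_rng(0)

# Toy stochastic objective: least squares over random minibatches.
A_full = rng.normal(size=(512, 10))
b_full = A_full @ np.ones(10) + 0.1 * rng.normal(size=512)

def minibatch_grad(theta, batch_size=32):
    idx = rng.choice(len(b_full), size=batch_size, replace=False)
    A, b = A_full[idx], b_full[idx]
    return A.T @ (A @ theta - b) / batch_size

def rk2_step(theta, h, grad_fn):
    """One Heun (second-order Runge-Kutta) step of the gradient flow."""
    k1 = grad_fn(theta)
    k2 = grad_fn(theta - h * k1)       # gradient at the Euler predictor
    return theta - 0.5 * h * (k1 + k2), k1

theta, h = np.zeros(10), 0.5
for step in range(200):
    theta_rk2, k1 = rk2_step(theta, h, minibatch_grad)
    theta_sgd = theta - h * k1         # plain SGD = explicit Euler step
    # Hypothetical adaptation: treat the SGD-vs-RK2 discrepancy as a local
    # error estimate and steer it toward a fixed tolerance.
    err = np.linalg.norm(theta_sgd - theta_rk2)
    h *= float(np.clip(np.sqrt(1e-3 / (err + 1e-12)), 0.5, 1.5))
    theta = theta_sgd                  # the base iterate remains the SGD one
```

In this sketch the second-order step is only used to estimate how far the Euler (SGD) step deviates from the underlying gradient flow, which is the general mechanism the abstract describes; the precise coupling used by SGD-G2 is given in the paper.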
https://doi.org/10.5281/zenodo.4314299
oai:zenodo.org:4314299
eng
Zenodo
https://zenodo.org/communities/ai_ml
https://doi.org/10.5281/zenodo.4314298
info:eu-repo/semantics/openAccess
Creative Commons Attribution Non Commercial No Derivatives 4.0 International
https://creativecommons.org/licenses/by-nc-nd/4.0/legalcode
ICPR, 25th International Conference on Pattern Recognition, Milano, Italy, 10-15 January 2021
deep learning
neural network
stochastic gradient descent
machine learning
adaptive learning rate
Stochastic Runge-Kutta methods and adaptive SGD-G2 stochastic gradient descent
info:eu-repo/semantics/lecture