Published March 31, 2021 | Version v1
Report Open

A Deep Neural Network Optimization Method Via A Traffic Flow Model

  • 1. IMT School for Advanced Studies Lucca
  • 2. University of Vienna

Description

We present a continuous-time formulation, via the solution of nonlinear parabolic partial differential equations (PDEs), for stochastic optimization algorithms used to train deep neural networks. Using the continuous-time formulation of stochastic differential equations (SDEs), relaxation approaches such as the stochastic gradient descent (SGD) method are interpreted as solutions of nonlinear PDEs that arise from modeling physical problems. Through homogenization of SDEs, we reinterpret the modified SGD algorithm as the solution of the viscous Burgers' equation, which models highway traffic flow.
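
For orientation, the two objects named in this description can be written in their standard textbook forms, as sketched below. The symbols are assumptions chosen for illustration and may not match the notation or the exact modified equations used in the report: $x$ stands for the network weights, $f$ for the training loss, $\beta$ for an inverse-temperature (noise) parameter, $W$ for a Wiener process, $u$ for the traffic-flow velocity field, and $\nu$ for the viscosity.

```latex
% Continuous-time (SDE) view of SGD: gradient flow plus noise (illustrative notation)
dx(t) = -\nabla f\bigl(x(t)\bigr)\,dt + \sqrt{2\beta^{-1}}\,dW(t)

% Viscous Burgers' equation in one space dimension (illustrative notation)
\frac{\partial u}{\partial t} + u\,\frac{\partial u}{\partial x}
  = \nu\,\frac{\partial^{2} u}{\partial x^{2}}, \qquad \nu > 0
```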

Notes

Final report submitted in partial fulfillment of the African Masters in Machine Intelligence (AMMI) Master's Degree program at the African Institute for Mathematical Sciences in Rwanda. This report reflects studies and research conducted by the first author during the program from 2019 to 2020. The official submission deadline for this report was 31 March 2021. For more information about the program, please visit AIMS-AMMI.

Thanks to the program sponsors, Google and Meta Platforms (previously "Facebook").

Files

adeoye-petersen-2021.pdf

Files (119.9 kB)

md5:b45d8ada31f292249db03cd5296e6c70
119.9 kB

Additional details

Dates

Submitted
2021-03-19
Updated
2021-03-31