Registered Data

[02116] Generalized Polyak Step Size for First Order Optimization with Momentum

  • Session Time & Room : 1E (Aug.21, 17:40-19:20) @F309
  • Type : Contributed Talk
  • Abstract : This paper presents a general framework for setting the learning rate adaptively in first-order optimization methods with momentum, motivated by the derivation of the Polyak step size. It is shown that the resulting methods are much less sensitive to the choice of momentum parameter and may avoid the oscillation of the heavy-ball method on ill-conditioned problems. These adaptive step sizes are further extended to the stochastic setting, where they are attractive choices for stochastic gradient descent with momentum. Our methods are demonstrated to be more effective for stochastic gradient methods than prior adaptive step size algorithms in large-scale machine learning tasks.
  • Classification : 90C15, 65K05, 90C06
  • Format : Online Talk on Zoom
  • Author(s) :
    • Xiaoyu Wang (Hong Kong University of Science and Technology)
    • Mikael Johansson (KTH Royal Institute of Technology)
    • Tong Zhang (Hong Kong University of Science and Technology)
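For context, the classical Polyak step size that motivates this work sets the learning rate from the current optimality gap, γ_k = (f(x_k) − f*) / ‖∇f(x_k)‖². The sketch below shows this classical rule on plain gradient descent; it is not the paper's generalized momentum variant, and the quadratic test function and all names are illustrative assumptions.

```python
import numpy as np

def polyak_gd(f, grad_f, f_star, x0, n_iter=200):
    """Gradient descent with the classical Polyak step size.

    At each iterate x, the step is (f(x) - f_star) / ||grad_f(x)||^2,
    which requires knowing the optimal value f_star (here assumed given).
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        g = grad_f(x)
        g_norm_sq = float(np.dot(g, g))
        if g_norm_sq == 0.0:  # already stationary
            break
        step = (f(x) - f_star) / g_norm_sq
        x = x - step * g
    return x

# Illustrative ill-conditioned quadratic: f(x) = 0.5*(x1^2 + 10*x2^2), f* = 0.
f = lambda x: 0.5 * (x[0] ** 2 + 10.0 * x[1] ** 2)
grad_f = lambda x: np.array([x[0], 10.0 * x[1]])

x_final = polyak_gd(f, grad_f, f_star=0.0, x0=[1.0, 1.0])
```

The paper's contribution, per the abstract, is generalizing this adaptive rule to momentum methods (e.g. heavy ball) and to the stochastic setting, where f* is not readily available.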