TITLE: Non-Convex Learning via Stochastic Gradient Langevin Dynamics
ABSTRACT:
Stochastic Gradient Langevin Dynamics (SGLD) is a popular variant of Stochastic Gradient Descent, where properly scaled isotropic Gaussian noise is added to an unbiased estimate of the gradient at each iteration. This modest change allows SGLD to escape local minima and suffices to guarantee asymptotic convergence to global minimizers for sufficiently regular non-convex objectives. I will present a nonasymptotic analysis in the context of non-convex learning problems and show that SGLD requires $\tilde{O}(\epsilon^{-4})$ iterations to sample $\tilde{O}(\epsilon)$-approximate minimizers of both empirical and population risk, where $\tilde{O}(\cdot)$ hides polynomial dependence on a temperature parameter, the model dimension, and a certain spectral gap parameter. As in the asymptotic setting, the analysis relates the discrete-time SGLD Markov chain to a continuous-time diffusion process. A new tool that drives the results is the use of weighted transportation cost inequalities to quantify the rate of convergence of SGLD to a stationary distribution in the Euclidean $2$-Wasserstein distance. This talk is based on joint work with Sasha Rakhlin and Matus Telgarsky.
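For readers unfamiliar with the iteration described in the abstract, the following is a minimal sketch of the standard SGLD update, $w_{k+1} = w_k - \eta\, g(w_k) + \sqrt{2\eta/\beta}\,\xi_k$ with $\xi_k \sim N(0, I)$, where $g$ is an unbiased gradient estimate, $\eta$ is the step size, and $\beta$ is the inverse temperature. The function names, the toy double-well objective, and all parameter values are assumptions chosen for illustration; they are not taken from the talk.

```python
import math
import random


def sgld(grad_estimate, w0, step_size, inverse_temp, num_iters, rng):
    """Run SGLD: w <- w - eta * g(w) + sqrt(2 * eta / beta) * xi, xi ~ N(0, I)."""
    w = list(w0)
    # Scale of the injected isotropic Gaussian noise.
    noise_scale = math.sqrt(2.0 * step_size / inverse_temp)
    for _ in range(num_iters):
        g = grad_estimate(w, rng)  # unbiased estimate of the gradient at w
        w = [
            wi - step_size * gi + noise_scale * rng.gauss(0.0, 1.0)
            for wi, gi in zip(w, g)
        ]
    return w


def toy_grad(w, rng):
    """Hypothetical non-convex objective f(w) = (w^2 - 1)^2 + 0.3 * w per coordinate.

    Returns the exact gradient plus zero-mean Gaussian noise, which stands in
    for a minibatch (unbiased) gradient estimate.
    """
    return [4.0 * wi * (wi * wi - 1.0) + 0.3 + rng.gauss(0.0, 0.5) for wi in w]


rng = random.Random(0)
w_final = sgld(toy_grad, [1.0], step_size=0.01, inverse_temp=5.0,
               num_iters=5000, rng=rng)
```

Larger values of `inverse_temp` shrink the injected noise, so the iteration behaves more like plain stochastic gradient descent; smaller values make it easier to cross the barrier between the two wells of the toy objective.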
Bio: Maxim Raginsky received the B.S. and M.S. degrees in 2000 and the Ph.D. degree in 2002 from Northwestern University, all in electrical engineering. He has held research positions with Northwestern, the University of Illinois at Urbana-Champaign (where he was a Beckman Foundation Fellow from 2004 to 2007), and Duke University. In 2012, he returned to UIUC, where he is currently an Assistant Professor and William L. Everitt Fellow in Electrical and Computer Engineering. He is also a faculty member of the Coordinated Science Laboratory. Dr. Raginsky received a Faculty Early Career Development (CAREER) Award from the National Science Foundation in 2013. His research interests lie at the intersection of information theory, machine learning, and control. He is a member of the editorial boards of Foundations and Trends in Communications and Information Theory and IEEE Transactions on Network Science and Engineering.