- Baptiste Goujaud
- Damien Scieur
- Aymeric Dieuleveut
- Adrien B. Taylor
- Fabian Pedregosa
Abstract
Cyclical step-sizes have become increasingly popular in deep learning. Motivated by recent observations on the spectral gaps of Hessians in machine learning, we show that these step-size schedules offer a simple way to exploit such properties. More precisely, we develop a convergence rate analysis for quadratic objectives that provides optimal parameters and shows that cyclical learning rates can improve upon traditional lower complexity bounds. We further propose a systematic approach to designing optimal first-order methods for quadratic minimization with a given spectral structure. Finally, we provide a local convergence rate analysis beyond quadratic minimization for these methods, and illustrate our findings through benchmarks on least squares and logistic regression problems.
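To make the idea concrete, below is a minimal sketch of gradient descent with a cyclical (alternating) step-size schedule on a toy quadratic whose Hessian spectrum has a gap, in the spirit of the clustered spectra the abstract refers to. The eigenvalue ranges and the two step-size values are heuristic choices for illustration only, not the optimal parameters derived in the paper.

```python
import numpy as np

# Toy quadratic whose Hessian spectrum has a gap: eigenvalues lie in
# [0.1, 1] and [9, 10] (hypothetical values, chosen for illustration).
rng = np.random.default_rng(0)
eigs = np.concatenate([rng.uniform(0.1, 1.0, 40), rng.uniform(9.0, 10.0, 10)])
H, b = np.diag(eigs), rng.standard_normal(eigs.size)
x_star = np.linalg.solve(H, b)  # minimizer of f(x) = 0.5 x^T H x - b^T x

def gradient_descent(x0, step_sizes, n_iter):
    """Gradient descent that cycles through the given list of step sizes."""
    x = x0.copy()
    for k in range(n_iter):
        x = x - step_sizes[k % len(step_sizes)] * (H @ x - b)
    return x

x0 = np.zeros(eigs.size)
L, mu = eigs.max(), eigs.min()

# Baseline: constant step size 2/(L + mu), the classical tuning for
# gradient descent on a strongly convex quadratic.
x_const = gradient_descent(x0, [2.0 / (L + mu)], 200)

# Cyclical schedule alternating a long step tuned to the low eigenvalue
# cluster and a short step tuned to the high cluster (heuristic values).
x_cyclic = gradient_descent(x0, [2.0 / (0.1 + 1.0), 2.0 / (9.0 + 10.0)], 200)

print("constant schedule, distance to optimum :", np.linalg.norm(x_const - x_star))
print("cyclical schedule, distance to optimum :", np.linalg.norm(x_cyclic - x_star))
```

On this example the alternating schedule contracts the error noticeably faster per cycle than the constant step size, which is the qualitative behavior the abstract describes for spectra with a gap.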