Debiasing a First-order Heuristic for Approximate Bi-level Optimization

Valerii Likhosherstov; Xingyou Song; Krzysztof Choromanski; Jared Davis; Adrian Weller

Thirty-eighth International Conference on Machine Learning (ICML 2021)

Abstract

Approximate bi-level optimization (ABLO) consists of (outer-level) optimization problems that involve numerical (inner-level) optimization loops. While ABLO has many applications across deep learning, it suffers from time and memory complexity proportional to the length r of its inner optimization loop. To address this complexity, an earlier first-order method (FOM) was proposed as a heuristic that omits second derivative terms, yielding significant speed gains and requiring only constant memory. Despite FOM's popularity, there is a lack of theoretical understanding of its convergence properties. We contribute by theoretically characterizing FOM's gradient bias under mild assumptions. We further demonstrate a rich family of examples where FOM-based SGD does not converge to a stationary point of the ABLO objective. We address this concern by proposing an unbiased FOM (UFOM) enjoying constant memory complexity as a function of r. We characterize the introduced time-variance tradeoff, demonstrate convergence bounds, and find an optimal UFOM for a given ABLO problem. Finally, we propose an efficient adaptive UFOM scheme.
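To make the source of FOM's gradient bias concrete, the following is a minimal sketch on a hypothetical scalar toy problem; it is not taken from the paper. It assumes a MAML-style setup in which the outer parameter theta initializes r steps of gradient descent on a quadratic inner loss, and the outer loss is evaluated at the final iterate. All names and constants (`inner_loop`, `exact_grad`, `fom_grad`, `alpha`, `a`, `c`, `t`) are illustrative assumptions. The exact gradient multiplies in the per-step Jacobian, which contains the inner loss's second derivative; the first-order variant drops that term and treats each Jacobian as 1.

```python
# Hypothetical toy problem (an assumption for illustration, not the paper's setup):
#   inner loss  L_in(phi)  = 0.5 * a * (phi - c)**2
#   outer loss  L_out(phi) = 0.5 * (phi - t)**2, evaluated at the last inner iterate

def inner_loop(theta, r, alpha=0.1, a=2.0, c=1.0):
    """Run r gradient-descent steps on the inner loss, starting from theta."""
    phi = theta
    for _ in range(r):
        phi = phi - alpha * a * (phi - c)  # phi - alpha * dL_in/dphi
    return phi

def outer_objective(theta, r, alpha=0.1, a=2.0, c=1.0, t=0.0):
    """Outer loss as a function of the outer parameter theta."""
    phi_r = inner_loop(theta, r, alpha, a, c)
    return 0.5 * (phi_r - t) ** 2

def exact_grad(theta, r, alpha=0.1, a=2.0, c=1.0, t=0.0):
    """Exact gradient: chain rule through every inner step."""
    phi_r = inner_loop(theta, r, alpha, a, c)
    jac = 1.0
    for _ in range(r):
        # Per-step Jacobian d(phi_{k+1})/d(phi_k) = 1 - alpha * L_in''(phi_k);
        # here the second derivative L_in'' is the constant a.
        jac *= 1.0 - alpha * a
    return (phi_r - t) * jac

def fom_grad(theta, r, alpha=0.1, a=2.0, c=1.0, t=0.0):
    """First-order heuristic: drop the second-derivative term, so each
    per-step Jacobian is treated as 1."""
    phi_r = inner_loop(theta, r, alpha, a, c)
    return phi_r - t

if __name__ == "__main__":
    theta, r = 3.0, 5
    print("exact gradient:", exact_grad(theta, r))
    print("FOM gradient:  ", fom_grad(theta, r))
    print("bias:          ", fom_grad(theta, r) - exact_grad(theta, r))
```

In this toy the inner second derivative is constant, so the exact gradient happens to be cheap. For a general inner loss, the per-step Jacobian depends on each intermediate iterate, which is why exact differentiation has time and memory cost growing with r while the first-order variant only needs the final iterate.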

Research Areas

Machine intelligence
