Google Research

Stochastic Optimization with Laggard Data Pipelines

Thirty-fourth Conference on Neural Information Processing Systems (NeurIPS), 2020 (to appear)

Abstract

State-of-the-art optimization has increasingly moved toward massively parallel pipelines with extremely large batches. As a consequence, the performance bottleneck is shifting towards CPU- and disk-bound data loading and preprocessing, rather than hardware-accelerated backpropagation. A recently proposed remedy in this regime is data echoing (Choi et al., 2019), which takes repeated gradient steps on the same batch. We provide the first convergence analysis of data-echoing-based extensions of ubiquitous optimization methods, exhibiting provable improvements over their synchronous counterparts. Specifically, we show that asynchronous batch reuse can magnify the gradient signal in a stochastic batch without harming the statistical rate.
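To make the batch-reuse idea concrete, below is a minimal sketch of data echoing in plain Python/NumPy, not the paper's actual algorithm or experimental setup: each freshly loaded batch is reused for several gradient steps before the next (slow) batch load. The least-squares objective, step size, and echo factor are illustrative assumptions; setting the echo factor to 1 recovers ordinary mini-batch SGD.

```python
# Hedged illustration of data echoing: reuse each loaded batch for
# `echo_factor` gradient steps, amortizing the cost of the data pipeline.
# Objective, step size, and echo factor are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
d, n_batches, batch_size = 10, 200, 32
w_true = rng.normal(size=d)      # hypothetical ground-truth parameters
w = np.zeros(d)
lr, echo_factor = 0.05, 4        # echo_factor = 1 gives plain mini-batch SGD


def load_batch():
    """Stand-in for a slow, CPU/disk-bound data loading step."""
    X = rng.normal(size=(batch_size, d))
    y = X @ w_true + 0.1 * rng.normal(size=batch_size)
    return X, y


for _ in range(n_batches):
    X, y = load_batch()                 # expensive: data loading / preprocessing
    for _ in range(echo_factor):        # cheap: repeated steps on the same batch
        grad = X.T @ (X @ w - y) / batch_size   # least-squares gradient
        w -= lr * grad

print("parameter error:", np.linalg.norm(w - w_true))
```

In this sketch the accelerator-side work (the inner loop) runs several times per pipeline invocation, which is the trade-off the abstract analyzes: more optimization progress per loaded batch, provided the reused gradients do not degrade the statistical rate.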
