Stochastic Frank-Wolfe for Constrained Finite-Sum Minimization
Abstract
We propose a novel Stochastic Frank-Wolfe (a.k.a.\ Conditional Gradient) algorithm with a fixed batch size, tailored to the constrained minimization of a finite sum of smooth objectives. Our method builds on a primal-dual interpretation of the Frank-Wolfe algorithm. Recent work on stochastic variants of Frank-Wolfe falls into two categories: algorithms with an increasing batch size, which have faster convergence rates but are impractical, and algorithms with a constant batch size, which are practical but slower. Our method combines the advantages of both: it converges for any constant batch size and has a faster theoretical worst-case rate than previous fixed-batch-size algorithms. Our experiments also show faster empirical convergence than previous fixed-batch-size methods on several tasks. Finally, we construct a stochastic estimator of the Frank-Wolfe gap that allows us to bound the true Frank-Wolfe gap, which is a primal-dual gap in the convex case and a measure of stationarity in general. Our gap estimator can therefore serve as a practical stopping criterion in all cases.
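For concreteness, the Frank-Wolfe gap mentioned above is the standard stationarity measure; a minimal sketch of its definition follows, using generic symbols ($f$ for the smooth objective and $\mathcal{C}$ for the compact convex constraint set, which may differ from the notation used in the body of the paper):
\[
g(x) \;=\; \max_{s \in \mathcal{C}} \, \langle \nabla f(x),\, x - s \rangle .
\]
The gap is always nonnegative and vanishes exactly at first-order stationary points; when $f$ is convex, it upper-bounds the primal suboptimality $f(x) - \min_{x' \in \mathcal{C}} f(x')$, which is why it doubles as a primal-dual certificate in the convex case.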