A Unified Approach to Adaptive Regularization in Online and Stochastic Optimization

Yoram Singer


We describe a unified adaptive optimization algorithm, which uses a potential function to construct a series of preconditioning matrices that shape the gradient based directions to update the learning parameters. We prove a regret bound for this, and show that various well known algorithms, such as AdaGrad and Online Newton Stepping can be obtained from this by choosing appropriate potential functions. Besides the uniform derivation of regret bounds for all these algorithms, this approach also enables us to vary the potential function and come up with new algorithms.