Discover: Deep Scalable Variance Reduction

Lionel Ngoupeyou Tondji
Moustapha Cisse
CoRR, vol. abs/2111.11828 (2021)


Most variance reduction methods for stochastic optimization are designed primarily for smooth and strongly convex functions, and they often come with high memory requirements. Consequently, they do not scale to large-scale deep learning settings, where we face massive neural networks and virtually infinite data due to the use of data augmentation strategies. In this work, we extend convex online variance reduction to the realm of deep learning. We exploit the clustering structure ubiquitous in the rich datasets used in deep learning to design a scalable variance-reduced optimization procedure. Our proposal leverages prior knowledge about a given problem to speed up the learning process. It is robust and theoretically well motivated. Our experiments show that on standard benchmark datasets it is superior to, or on par with, the most widely used optimizers in deep learning.
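To make the core idea concrete, here is a minimal sketch of how a clustering structure can cut the memory cost of a variance-reduced gradient estimator. This is an illustrative toy, not the paper's actual DISCOVER algorithm: it applies a SAGA-style control variate at cluster granularity to a synthetic least-squares problem, so the stored gradient table has one row per cluster (O(k·d) memory) instead of one row per example (O(n·d)). All names, the data-generating setup, and the step size are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic least-squares problem with an explicit clustering structure:
# points in the same cluster share a center, so their per-example
# gradients are strongly correlated (hypothetical setup for illustration).
n, d, k = 300, 5, 6
centers = rng.normal(size=(k, d))
cluster_ids = rng.integers(0, k, size=n)
X = centers[cluster_ids] + 0.1 * rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.01 * rng.normal(size=n)

def grad_i(w, i):
    """Gradient of the i-th squared-error term 0.5*(x_i.w - y_i)^2."""
    return (X[i] @ w - y[i]) * X[i]

# Per-cluster gradient snapshots: O(k*d) memory, versus the O(n*d)
# a standard per-example SAGA table would need.
w = np.zeros(d)
table = np.zeros((k, d))                         # last stored gradient per cluster
counts = np.bincount(cluster_ids, minlength=k)
weights = counts / n                             # sampling probability of each cluster

lr = 0.05
for step in range(2000):
    i = rng.integers(n)
    c = cluster_ids[i]
    g = grad_i(w, i)
    # SAGA-like control variate at cluster level: subtract the stale
    # cluster snapshot, add back its expectation (weighted table mean),
    # which keeps the estimator unbiased conditional on the table.
    v = g - table[c] + weights @ table
    table[c] = g                                 # refresh this cluster's snapshot
    w -= lr * v

final_loss = 0.5 * np.mean((X @ w - y) ** 2)
print(final_loss)
```

Because correlated examples share one table row, the control variate stays informative while the memory footprint depends only on the number of clusters, which is the kind of trade-off that makes variance reduction plausible at deep learning scale.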