Revisiting ResNets: Improved Training Methodologies and Scaling Principles

Irwan Bello
Liam B. Fedus
Xianzhi Du
Aravind Srinivas
Tsung-Yi Lin
Jon Shlens
Barret Richard Zoph
ICML 2021 (to appear)

Abstract

Novel ImageNet architectures monopolize the limelight when advancing the state of the art, but progress is often muddled by simultaneous changes to training methodology and scaling strategies. Our work disentangles these factors by revisiting the ResNet architecture with modern training and scaling techniques and, in doing so, shows that ResNets match recent state-of-the-art models. A ResNet trained to 79.0% top-1 ImageNet accuracy reaches 82.2% through improved training methodology alone; two small, popular architecture changes further improve this to 83.4%. We next offer new perspectives on the scaling strategy, which we summarize with two key principles: (1) increase model depth and image size, but not model width; (2) increase image size far more slowly than previously recommended. Using the improved training methodology and our scaling principles, we design a family of ResNet architectures, ResNet-RS, which are 1.9x - 2.3x faster than EfficientNets in supervised learning on ImageNet. Although the EfficientNets have significantly fewer FLOPs and parameters, training ResNet-RS is both faster and less memory-intensive, making it a strong baseline for researchers and practitioners.
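
To make the two scaling principles concrete, here is a minimal Python sketch of a scaling schedule that grows model depth while increasing image size only slowly and holding width fixed. The class name, the depth increments, and the ~10% resolution growth factor are illustrative placeholders, not the actual ResNet-RS configurations reported in the paper.

```python
# Hedged sketch of the scaling principles: scale depth aggressively,
# scale image size slowly, and leave width (channel multiplier) untouched.
# All specific numbers below are assumptions for illustration only.

from dataclasses import dataclass


@dataclass
class ScalingConfig:
    depth: int               # number of layers (e.g. 50, 101, 152, ...)
    image_size: int          # training resolution in pixels
    width_mult: float = 1.0  # kept fixed: principle (1) says do not scale width


def scale_up(base: ScalingConfig, step: int) -> ScalingConfig:
    """Return a larger config: more depth, slightly larger images, same width."""
    return ScalingConfig(
        depth=base.depth + 50 * step,                      # grow depth each step (placeholder increment)
        image_size=int(base.image_size * (1.1 ** step)),   # ~10% per step: far slower than compound scaling
        width_mult=base.width_mult,                        # unchanged, per principle (1)
    )


if __name__ == "__main__":
    base = ScalingConfig(depth=50, image_size=160)
    for step in range(4):
        print(scale_up(base, step))
```

A schedule like this trades extra depth for capacity while keeping the memory and compute cost of large images in check, which is the intuition behind increasing image size far more slowly than prior compound-scaling recipes.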