On the interplay between noise and curvature and its effect on optimization and generalization

Valentin Thomas

Fabian Pedregosa

Bart van Merriënboer

Pierre-Antoine Manzagol

Yoshua Bengio

Nicolas Le Roux

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS) (2020)

Abstract

This work revisits the notion of information criterion to characterize generalization for modern deep learning models. In particular, we empirically demonstrate the effectiveness of the Takeuchi Information Criterion (TIC), an extension of the Akaike Information Criterion to misspecified models, in estimating the generalization gap, shedding light on why quantities such as the number of parameters cannot quantify generalization. The TIC depends on both the Hessian of the loss $\rmH$ and the covariance matrix of the gradients $\rmSS$. By exploring the semantic and numerical similarities and differences between these two matrices, as well as the Fisher information matrix $\rmF$, we bring further evidence that flatness cannot in itself predict generalization. We also address the question of when $\rmSS$ is a reasonable approximation to $\rmF$, as is commonly assumed.
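
As an illustration of how the TIC combines the two matrices, the criterion is usually stated as a penalty of the form Tr(H^{-1} S) / n, where H is the Hessian of the average loss at the fitted parameters, S is the covariance of the per-example gradients, and n is the number of training examples. The sketch below is a minimal NumPy example on a toy linear regression, not the authors' implementation; the function name `tic_penalty`, the synthetic data, and the choice of squared loss are all assumptions made for illustration.

```python
import numpy as np


def tic_penalty(per_example_grads, hessian):
    """Return (Tr(H^{-1} S), Tr(H^{-1} S) / n) for illustration.

    per_example_grads: array of shape (n, d), one loss gradient per example.
    hessian: array of shape (d, d), Hessian of the average loss.
    Hypothetical helper, not taken from the paper's code.
    """
    n = per_example_grads.shape[0]
    # Empirical covariance of the per-example gradients (the S matrix).
    centered = per_example_grads - per_example_grads.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / n
    # Solve H Z = S instead of forming the inverse of H explicitly.
    trace_term = np.trace(np.linalg.solve(hessian, cov))
    return trace_term, trace_term / n


# Toy setup (assumed): linear model, loss 0.5 * (x @ theta - y) ** 2,
# with unit-variance Gaussian noise so the model is well specified.
rng = np.random.default_rng(0)
n, d = 20000, 5
X = rng.normal(size=(n, d))
theta_true = rng.normal(size=d)
y = X @ theta_true + rng.normal(size=n)

# Least-squares fit, then per-example gradients and Hessian at the optimum.
theta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = X @ theta_hat - y
grads = residuals[:, None] * X   # gradient of each example's loss
hessian = X.T @ X / n            # Hessian of the average squared loss

trace_term, penalty = tic_penalty(grads, hessian)
print(trace_term, penalty)
```

In this well-specified toy case S is approximately equal to H, so the trace term should come out close to the number of parameters d, which is the Akaike penalty. Under misspecification the two matrices can differ, and the trace term no longer reduces to a parameter count.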
