A unifying view on implicit bias in training linear neural networks

ICLR (2021)

Abstract

We study the implicit bias of gradient flow (i.e., gradient descent with infinitesimal step size) on linear neural network training. We consider separable classification and underdetermined linear regression problems where there exist many solutions that achieve zero training error, and characterize how the network architecture and initialization affect the final solution found by gradient flow. Our results apply to a general tensor formulation of neural networks that includes linear fully-connected networks, linear diagonal networks, and linear convolutional networks as special cases, while removing convergence assumptions required by prior research. We also provide experiments that corroborate our theoretical analysis.
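To make the setting concrete, here is a minimal numerical sketch (not from the paper) of the kind of implicit bias the abstract describes. Small-step gradient descent stands in for gradient flow on an underdetermined linear regression problem, comparing a directly parameterized linear model against a linear diagonal network with the parameterization w = u∘u − v∘v. That parameterization, the initialization scale alpha, and all constants below are illustrative assumptions; the well-known contrast is that the direct model initialized at zero converges to the minimum-ℓ2-norm interpolant, while the diagonal network with small initialization is biased toward sparser solutions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Underdetermined linear regression: n < d, so infinitely many
# weight vectors w satisfy Xw = y exactly.
n, d = 5, 20
X = rng.standard_normal((n, d))
w_true = np.zeros(d)
w_true[:2] = [3.0, -2.0]          # sparse target
y = X @ w_true

lr, steps = 1e-3, 200_000         # small step size approximates gradient flow

# (a) Direct parameterization: gradient descent from w = 0 on squared loss
# converges to the minimum-l2-norm solution of Xw = y.
w = np.zeros(d)
for _ in range(steps):
    r = X @ w - y                 # residuals
    w -= lr * (X.T @ r) / n

# (b) Linear diagonal network, w_eff = u*u - v*v (illustrative choice;
# one of the special cases named in the abstract). Small init alpha
# biases the dynamics toward sparser interpolating solutions.
alpha = 1e-3
u = np.full(d, alpha)
v = np.full(d, alpha)
for _ in range(steps):
    w_eff = u * u - v * v
    r = X @ w_eff - y
    g = (X.T @ r) / n             # gradient w.r.t. the effective weights
    u -= lr * (2 * u * g)         # chain rule through  u*u
    v -= lr * (-2 * v * g)        # chain rule through -v*v

print("min-l2 solution   :", np.round(w, 2))
print("diagonal-net soln :", np.round(u * u - v * v, 2))
```

Both parameterizations drive the training loss to zero, but they select different interpolants: the direct model spreads weight across all coordinates, while the diagonal network concentrates it near the two nonzero coordinates of the sparse target, illustrating how architecture and initialization shape the solution gradient flow finds.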
