
Tight Hardness Results for Training Depth-2 ReLU Networks

Surbhi Goel
Adam Klivans
Daniel Reichman
Innovations in Theoretical Computer Science (ITCS) (2021), 22:1-22:14

Abstract

We prove several hardness results for training neural networks with a single hidden layer and the ReLU activation function. The goal is to output a one-layer neural network that minimizes the square loss with respect to a given training set. We prove that this problem is NP-hard already for a network with a single ReLU. We also prove NP-hardness for outputting a weighted sum of k ReLUs minimizing the squared error (for k > 1), even in the realizable setting (i.e., when the labels are consistent with an unknown one-layer ReLU network). We also obtain lower bounds on the running time in terms of 1/epsilon, where epsilon is the desired additive error. To obtain these lower bounds, we use the Gap-ETH hypothesis as well as a new hypothesis regarding the hardness of approximating the well-known Densest k-Subgraph problem in subexponential time (the two hypotheses are used separately to prove different lower bounds). For example, we prove that under reasonable hardness assumptions, any proper learning algorithm for finding the best-fitting ReLU must run in time exponential in 1/epsilon^2. Combined with prior work of Goel et al., this implies the first separation between proper and improper algorithms for learning a ReLU. We also study the problem of properly learning a depth-2 network with ReLU activations and bounded weights, giving new (worst-case) upper bounds on the running time needed to learn such networks that essentially match our lower bounds in terms of the dependence on epsilon.
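
For concreteness, the training objectives discussed above can be written as empirical square-loss minimization problems. The LaTeX below is an illustrative sketch only; the notation (w, a_j, x_i, y_i, m, d) is ours, and the paper's exact formulation and normalization may differ.

  % Fitting a single ReLU to training samples (x_i, y_i), i = 1, ..., m (illustrative notation):
  \min_{w \in \mathbb{R}^d} \; \frac{1}{m} \sum_{i=1}^{m} \bigl( \max(0, w \cdot x_i) - y_i \bigr)^2

  % Fitting a weighted sum of k ReLUs, i.e., the depth-2 (one hidden layer) case:
  \min_{a \in \mathbb{R}^k,\; w_1, \dots, w_k \in \mathbb{R}^d} \; \frac{1}{m} \sum_{i=1}^{m} \Bigl( \sum_{j=1}^{k} a_j \max(0, w_j \cdot x_i) - y_i \Bigr)^2

The hardness results state that, under the stated complexity assumptions, no polynomial-time (or, for the epsilon-dependent bounds, subexponential-in-1/epsilon) algorithm can solve these minimization problems to within the required additive error.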