Leveraging Unlabeled Data to Predict Out-of-Distribution Performance

Saurabh Garg

Sivaraman Balakrishnan

Zachary Chase Lipton

Behnam Neyshabur

Hanie Sedghi

ICLR (2022)


Abstract

Distribution shift is a prevalent problem in the real-world deployment of machine learning models. Typically, a mismatch between the source (training) and target (test) distributions leads to a gap between the model's source and target performance. In this work, we investigate methods that leverage only unlabeled target data to predict accuracy under distribution shift. We propose a simple and effective method called Average Thresholded Confidence (ATC) that learns a scalar threshold on model confidence using source data and predicts target accuracy as the fraction of unlabeled target examples whose confidence exceeds that threshold. ATC outperforms previous approaches across several model architectures and various types of distribution shift (e.g., synthetic corruptions, shifts due to dataset reproduction, and shifts due to novel subpopulations) applied to the FMoW-WILDS, ImageNet, CIFAR, and MNIST datasets. ATC estimates target performance up to 2 to 3 times more accurately than recently proposed methods. Finally, we theoretically analyze our proposed method on a toy distribution shift model with varying degrees of spurious correlation.
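The thresholding idea in the abstract can be illustrated with a minimal sketch. The abstract does not spell out how the threshold is fit, so this sketch assumes it is chosen so that the fraction of labeled source examples above the threshold matches the observed source accuracy, and it assumes maximum softmax probability as the confidence score. The function names `fit_atc_threshold` and `predict_target_accuracy` are hypothetical and not taken from the authors' code.

```python
import numpy as np


def fit_atc_threshold(source_confidence, source_correct):
    """Fit a scalar threshold on labeled source data.

    Assumption (not stated in the abstract): the threshold is chosen so that
    the fraction of source examples with confidence above it roughly equals
    the source accuracy.
    """
    source_confidence = np.asarray(source_confidence, dtype=float)
    accuracy = float(np.mean(source_correct))
    # The (1 - accuracy) quantile leaves about an `accuracy` fraction above it.
    return float(np.quantile(source_confidence, 1.0 - accuracy))


def predict_target_accuracy(target_confidence, threshold):
    """Estimate target accuracy as the fraction of unlabeled target examples
    whose confidence exceeds the fitted threshold."""
    target_confidence = np.asarray(target_confidence, dtype=float)
    return float(np.mean(target_confidence > threshold))


if __name__ == "__main__":
    # Toy numbers for illustration only; these are not results from the paper.
    source_conf = np.linspace(0.1, 1.0, 10)
    source_correct = [1] * 7 + [0] * 3  # 70% source accuracy
    t = fit_atc_threshold(source_conf, source_correct)
    print("threshold:", t)
    print("estimated target accuracy:",
          predict_target_accuracy([0.2, 0.3, 0.5, 0.9], t))
```

Only the threshold fit needs labels; the target estimate uses confidences alone, which is what lets the method run on unlabeled target data.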

Research Areas

Machine Intelligence
