Simple and Principled Uncertainty Estimation with Deterministic Deep Learning via Distance Awareness

Zi Lin
Shreyas Padhy
Advances in Neural Information Processing Systems 33, Curran Associates, Inc. (2020) (to appear)

Abstract

Bayesian neural networks (BNNs) and Deep Ensembles are principled approaches to estimating the predictive uncertainty of a deep learning model. However, their practicality in real-time, industrial-scale applications is limited by their heavy memory and inference cost. This motivates us to study principled approaches to high-quality uncertainty estimation that require only a single deep neural network (DNN). By formalizing uncertainty quantification as a minimax learning problem, we first identify input distance awareness, i.e., the model’s ability to quantify the distance of a test example from the training data in the input space, as a necessary condition for a DNN to achieve high-quality (i.e., minimax optimal) uncertainty estimation. We then propose Spectral-normalized Gaussian Process (SNGP), a simple method that improves the distance-awareness ability of modern DNNs by adding a weight normalization step during training and replacing the activation of the penultimate layer. We visually illustrate the properties of the proposed method on two-dimensional datasets, and benchmark its performance against Deep Ensembles and other single-model approaches across both vision and language understanding tasks and on modern architectures (ResNet and BERT). Despite its simplicity, SNGP is competitive with Deep Ensembles in prediction, calibration, and out-of-domain detection, and significantly outperforms the other single-model approaches.
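
As an illustration of the weight normalization step mentioned in the abstract, the following is a minimal sketch of spectral normalization using power iteration, assuming the step rescales a layer's weight matrix so that its largest singular value stays below a fixed bound. The function name spectral_normalize and the parameters norm_bound, n_iter, and seed are hypothetical and chosen for illustration; this is not the authors' implementation.

```python
import numpy as np


def spectral_normalize(weight, norm_bound=0.95, n_iter=50, seed=0):
    """Rescale `weight` so its estimated spectral norm is at most `norm_bound`.

    Hypothetical helper for illustration only. The spectral norm (largest
    singular value) is estimated by power iteration on a 2-D weight matrix.
    """
    eps = 1e-12  # guards against division by zero for an all-zero matrix
    rng = np.random.default_rng(seed)

    # Random unit vector in the output space as the starting point.
    u = rng.standard_normal(weight.shape[0])
    u /= np.linalg.norm(u) + eps

    for _ in range(n_iter):
        # Alternate between right and left singular vector estimates.
        v = weight.T @ u
        v /= np.linalg.norm(v) + eps
        u = weight @ v
        u /= np.linalg.norm(u) + eps

    # Estimated largest singular value.
    sigma = float(u @ weight @ v)

    # Only shrink the matrix; weights already under the bound are unchanged.
    if sigma > norm_bound:
        return weight * (norm_bound / sigma)
    return weight
```

In a training loop, a step like this would typically be applied to each hidden layer's weights after every gradient update. Because power iteration only estimates the largest singular value, the bound holds approximately rather than exactly.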
