Simple and Principled Uncertainty Estimation with Deterministic Deep Learning via Distance Awareness
Abstract
Bayesian neural networks (BNN) and Deep Ensembles are principled approaches to estimate the predictive uncertainty of a deep learning model. However their practicality in real-time, industrial-scale applications are limited due to their heavy memory and inference cost. This motivates us to study principled approaches to high-quality uncertainty estimation that require only a single deep neural network (DNN). By formalizing the uncertainty quantification as a minimax learning problem, we first identify \textit{input distance awareness}, i.e., the model’s ability in quantifying the distance of a testing example from the training data in the input space, as a necessary condition for a DNN to achieve high-quality (i.e., minimax optimal) uncertainty estimation. We then propose \textit{Spectral-normalized Gaussian Process} (SNGP), a simple method that improves the distance-awareness ability of modern DNNs, by adding a weight normalization step during training and replacing the activation of the penultimate layer. We visually illustrate the property of the proposed method on two-dimensional datasets, and benchmark its performance against Deep Ensembles and other single-model approaches across both vision and language understanding tasks and on modern architectures (ResNet and BERT). Despite its simplicity, SNGP is competitive with Deep Ensembles in prediction, calibration and out-of-domain detection, and significantly outperforms the other single-model approaches.