Overparameterisation and worst-case generalisation: friend or foe?

Aditya Krishna Menon
International Conference on Learning Representations (ICLR) 2021

Abstract

Overparameterised neural networks have demonstrated the remarkable ability to perfectly fit training samples, while still generalising to unseen test samples. However, several recent works have revealed that such models' good average performance does not always translate to good worst-case performance: in particular, they may perform poorly on under-represented subgroups in the training set. In this paper, we show that in certain settings, overparameterised models' bias against under-represented samples may be easily corrected via post-hoc processing. Specifically, we demonstrate such models' bias can be restricted to their classification layers, and manifests in structured shifts in predictions for rare subgroups. We detail two post-hoc correction techniques to eliminate this bias, which operate purely on the original models' outputs. We empirically verify that with such post-hoc correction, overparameterisation can improve worst-case performance.
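The abstract does not spell out the two correction techniques, but to illustrate the general idea of post-hoc processing that operates purely on a trained model's outputs, here is a minimal sketch of one common approach of this flavour (logit adjustment using subgroup or class priors). The function name, the `group_priors` input, and the scaling parameter `tau` are illustrative assumptions, not the paper's stated method.

```python
import numpy as np

def posthoc_logit_adjustment(logits, group_priors, tau=1.0):
    """Adjust classifier logits by subtracting scaled log-priors.

    A generic post-hoc correction applied only to model outputs:
    classes (or subgroups) that are rare in the training data are
    boosted relative to frequent ones before taking the argmax.
    """
    # logits: (n_samples, n_classes) raw scores from the trained model.
    # group_priors: (n_classes,) empirical frequency of each class/subgroup.
    adjusted = logits - tau * np.log(np.asarray(group_priors) + 1e-12)
    return adjusted.argmax(axis=1)

# Hypothetical usage: a 3-class problem where class 2 is under-represented.
logits = np.array([[2.0, 1.5, 1.4],
                   [0.2, 0.1, 0.15]])
priors = [0.6, 0.35, 0.05]
print(posthoc_logit_adjustment(logits, priors))
```

Because the correction is a fixed shift of the outputs, it leaves the learned representation untouched, which matches the abstract's observation that the bias can be confined to the classification layer.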
