Towards NNGP-guided Neural Architecture Search

Daiyi Peng
Daniel S. Park
Jascha Sohl-Dickstein
arXiv (2020)

Abstract

Bayesian inference in the parameter space of deep neural networks can be approximated by Gaussian processes (GPs). While the exact kernels of these GPs are known for a class of models, computing them for competitive architectures is often expensive or intractable. These kernels can be approximated by Monte-Carlo estimation using finite networks at initialization. Monte-Carlo neural network Gaussian process (NNGP) training and inference are orders of magnitude cheaper in FLOPs than their gradient-based counterparts when the dataset size is small. Since NNGP inference provides a cheap measure of network performance, we investigate its potential as a signal for neural architecture search (NAS). We compute the NNGP performance of approximately 423k networks in the NAS-Bench-101 dataset on CIFAR-10 and compare its utility against conventional performance measures obtained by shortened gradient-based training. We carry out a similar analysis on 10k randomly sampled networks in the mobile neural architecture search (MNAS) space for ImageNet. We discover comparative advantages of NNGP-based metrics and discuss potential applications. In particular, we propose NNGP performance as an inexpensive signal, independent of metrics obtained from training, that can be used either to reduce large search spaces or to improve training-based performance measures.
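To make the Monte-Carlo NNGP estimate concrete, here is a minimal sketch of the idea: draw several independent random initializations of a finite-width network, average the output covariance across draws and output units to estimate the NNGP kernel, and then use that kernel for GP regression as a cheap performance proxy. The toy fully-connected ReLU network, layer sizes, bias scale, and ridge term below are illustrative assumptions, not the configuration used in the paper.

```python
# Minimal sketch of Monte-Carlo NNGP kernel estimation (assumptions:
# a toy fully-connected ReLU network in plain NumPy; the paper's
# experiments use far larger architectures and datasets).
import numpy as np


def init_params(rng, sizes):
    """Draw one random set of weights and biases."""
    return [(rng.standard_normal((m, n)), rng.standard_normal(n))
            for m, n in zip(sizes[:-1], sizes[1:])]


def forward(params, x):
    """Forward pass of a finite-width ReLU network at initialization,
    with 1/fan_in weight-variance scaling."""
    h = x
    for i, (w, b) in enumerate(params):
        h = h @ w / np.sqrt(w.shape[0]) + 0.1 * b
        if i < len(params) - 1:
            h = np.maximum(h, 0.0)
    return h


def mc_nngp_kernel(x, sizes, num_samples, seed=0):
    """Monte-Carlo NNGP kernel estimate: average the output covariance
    over independently initialized finite networks."""
    rng = np.random.default_rng(seed)
    k = np.zeros((x.shape[0], x.shape[0]))
    for _ in range(num_samples):
        f = forward(init_params(rng, sizes), x)  # (n_points, n_outputs)
        k += f @ f.T / f.shape[1]
    return k / num_samples


def nngp_predict(k_full, y_train, n_train, ridge=1e-3):
    """GP regression with the estimated kernel on [train; test] points;
    the test predictions serve as a cheap performance signal."""
    k_tt = k_full[:n_train, :n_train]
    k_st = k_full[n_train:, :n_train]
    alpha = np.linalg.solve(k_tt + ridge * np.eye(n_train), y_train)
    return k_st @ alpha
```

Because the estimate only requires forward passes of randomly initialized networks plus one kernel solve, its cost scales with the number of Monte-Carlo samples and the square of the (small) dataset size rather than with gradient-based training steps, which is what makes it attractive as a NAS signal.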

Research Areas