The role of permutation invariance in linear mode connectivity of neural networks
Abstract
Understanding the loss landscape of deep neural networks has been the subject of many studies due to its close connections to optimization and generalization. Prior work has shown that there is often a performance barrier along the linear interpolation of the weights of two models trained from different initial seeds. In this work, we first empirically investigate how different model parameters and data distributions affect such performance barriers. Next, we consider the invariances in the function space of neural networks that arise from permutations of the hidden units. Through extensive experiments, we provide several pieces of evidence that, once these invariances are taken into account, many of the barriers vanish.
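The two notions at the heart of the abstract are (i) evaluating the loss along the linear interpolation of two sets of weights and (ii) the permutation invariance of hidden units. The following minimal NumPy sketch illustrates both on a toy two-layer ReLU MLP; the random "trained" weights, the MSE loss, and all function names are hypothetical stand-ins, not the paper's experimental setup.

```python
# Minimal sketch (toy setup, random weights stand in for trained models).
import numpy as np

def mlp(x, W1, b1, W2, b2):
    """Two-layer ReLU MLP: x -> ReLU(x W1 + b1) W2 + b2."""
    return np.maximum(x @ W1 + b1, 0.0) @ W2 + b2

def mse_loss(params, x, y):
    W1, b1, W2, b2 = params
    return float(np.mean((mlp(x, W1, b1, W2, b2) - y) ** 2))

def interpolate(params_a, params_b, alpha):
    """Parameter-wise linear interpolation (1 - alpha) * theta_A + alpha * theta_B."""
    return [(1.0 - alpha) * pa + alpha * pb for pa, pb in zip(params_a, params_b)]

def permute_hidden(params, perm):
    """Permute hidden units: columns of W1, entries of b1, and rows of W2."""
    W1, b1, W2, b2 = params
    return [W1[:, perm], b1[perm], W2[perm, :], b2]

rng = np.random.default_rng(0)
d_in, d_hidden, d_out, n = 5, 16, 1, 128
x = rng.normal(size=(n, d_in))
y = rng.normal(size=(n, d_out))

# Two hypothetical "independently trained" solutions (random here, for illustration).
def random_params():
    return [rng.normal(size=(d_in, d_hidden)), rng.normal(size=d_hidden),
            rng.normal(size=(d_hidden, d_out)), rng.normal(size=d_out)]
params_a, params_b = random_params(), random_params()

# (i) Loss along the linear path between theta_A and theta_B; a "barrier" is the
# amount by which this curve rises above the line connecting the endpoint losses.
for alpha in np.linspace(0.0, 1.0, 5):
    loss = mse_loss(interpolate(params_a, params_b, alpha), x, y)
    print(f"alpha={alpha:.2f}  loss={loss:.4f}")

# (ii) Permutation invariance: a permuted copy of model B computes the same function,
# so it is an equally valid candidate endpoint for the interpolation above.
perm = rng.permutation(d_hidden)
params_b_perm = permute_hidden(params_b, perm)
assert np.allclose(mlp(x, *params_b), mlp(x, *params_b_perm))
```

The sketch makes the abstract's point concrete: because permuting hidden units does not change the function a network computes, the barrier along the path from theta_A to theta_B can differ from the barrier along the path from theta_A to a suitably permuted copy of theta_B.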