The Impact of Geometric Complexity on Neural Collapse in Transfer Learning

Abstract

Recent exceptional advances in computer vision and large language models can be largely attributed to transfer learning and the pre-training of foundation models. Despite this success, the mechanisms behind transfer learning remain poorly understood from a theoretical perspective. Parameter flatness and neural collapse (which characterizes how neural networks simplify their representations during the final stages of training) have emerged as strong indicators of transfer learning performance. In this paper, we explore the fundamental mechanisms that relate the two by examining the geometric complexity of a foundation model’s learned representations. In particular, we show that the same mechanisms used during pre-training to control geometric complexity in turn put pressure on the neural collapse of the model, thus improving overall performance on downstream tasks.