InfoNCE Loss Provably Learns Cluster-Preserving Representations

Advait Parulekar
Aryan Mokhtari
Liam Collins
Sanjay Shakkottai
COLT 2023

Abstract

The goal of contrastive learning is to learn a representation that preserves underlying clusters by keeping samples with similar content, e.g. the "dogness" of a dog, close to each other in the space generated by the representation. A common and successful approach to this unsupervised learning problem is minimizing the InfoNCE loss associated with the training samples, where each sample is paired with its augmentations (positive samples such as rotations and crops) and a batch of negative samples (unrelated samples). To the best of our knowledge, it was previously unknown whether the representation learned by minimizing the InfoNCE loss preserves the underlying data clusters, since this loss only promotes learning a representation that is faithful to augmentations, i.e., an image and its augmentations have the same representation. Our main result shows that the representation learned by InfoNCE with a finite number of negative samples is also consistent with the clusters in the data, under the condition that the augmentation sets within clusters may be non-overlapping but are close and intertwined, relative to the complexity of the learning function class.
