MixMatch: A Holistic Approach to Semi-Supervised Learning
Abstract
Semi-supervised learning has proven to be a powerful paradigm for leveraging
unlabeled data to mitigate the reliance on large labeled datasets. In this work, we
unify the current dominant approaches for semi-supervised learning to produce a
new algorithm called MixMatch. MixMatch works by guessing low-entropy labels
for data-augmented unlabeled examples and then mixing labeled and unlabeled
data using MixUp. We show that MixMatch obtains state-of-the-art results by a
large margin across many datasets and amounts of labeled data. We also demonstrate
how MixMatch can help achieve a dramatically better accuracy-privacy trade-off
for differential privacy. Finally, we perform an ablation study to tease apart which
components of MixMatch are most important for its success.
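
The two ingredients named above, low-entropy label guessing and MixUp, can be illustrated with a minimal sketch. The abstract does not specify the details, so the following are assumptions made for illustration only: the function names, the use of temperature sharpening to lower entropy, averaging predictions over several augmentations, and the hyperparameter values `temperature=0.5` and `alpha=0.75`. Model predictions are passed in as plain lists of class probabilities; no network or augmentation pipeline is included.

```python
import random


def sharpen(probs, temperature=0.5):
    """Lower the entropy of a class distribution.

    Assumed mechanism: raise each probability to 1/temperature and
    renormalize. A temperature below 1 pushes mass toward the largest class.
    """
    powered = [p ** (1.0 / temperature) for p in probs]
    total = sum(powered)
    return [p / total for p in powered]


def guess_label(predictions, temperature=0.5):
    """Guess a label for one unlabeled example.

    `predictions` holds one class distribution per augmented copy of the
    example. The distributions are averaged and then sharpened.
    """
    num_classes = len(predictions[0])
    mean = [
        sum(pred[c] for pred in predictions) / len(predictions)
        for c in range(num_classes)
    ]
    return sharpen(mean, temperature)


def mixup(x1, y1, x2, y2, alpha=0.75, rng=random):
    """Convexly combine two (input, label) pairs.

    The mixing weight is drawn from Beta(alpha, alpha). Taking the larger
    of lam and 1 - lam keeps the result closer to the first pair; this
    choice is an assumption of the sketch.
    """
    lam = rng.betavariate(alpha, alpha)
    lam = max(lam, 1.0 - lam)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


if __name__ == "__main__":
    # Two hypothetical predictions for two augmentations of one unlabeled example.
    guessed = guess_label([[0.6, 0.3, 0.1], [0.4, 0.4, 0.2]])
    # Mix a labeled example (one-hot label) with the unlabeled one (guessed label).
    mixed_x, mixed_y, lam = mixup(
        [1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0], guessed, rng=random.Random(0)
    )
    print(guessed, mixed_x, mixed_y, lam)
```

In this sketch the guessed label remains a valid probability distribution, and because the mixing weight is at least 0.5, each mixed example stays dominated by its first argument.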