Semi-supervised batch active learning via bilevel optimization

Andreas Krause
2021 IEEE International Conference on Acoustics, Speech and Signal Processing (to appear)

Abstract

\emph{Active learning} is an effective technique for reducing the labeling cost by improving data efficiency. In this work, we propose a novel \emph{batch acquisition strategy} for active learning in the setting when the model training is performed in a \emph{semi-supervised} manner. We formulate our approach as a \emph{data summarization} problem via \emph{bilevel optimization}, where the queried batch consists of the points that best summarize the unlabeled data pool. We show that our method is highly effective in \emph{keyword detection} tasks in the regime when only \emph{few labeled samples} are available.