Google Research

Active Learning of Classifiers with Label and Seed Queries

  • Andrea Paudice
  • Marco Bressan
  • Maximilian Thiessen
  • Nicolo Cesa-Bianchi
  • Silvio Lattanzi
NeurIPS 2022 (to appear)

Abstract

We study exact active learning of binary and multiclass classifiers with margin. Given an $n$-point set $X \subset \R^m$, we want to learn any unknown classifier on $X$ whose classes have finite \emph{strong convex hull margin}, a new notion extending the SVM margin. On the other hand, using the more powerful \emph{seed} queries (a variant of equivalence queries), the target classifier could be learned in $\scO(m \log n)$ queries via Littlestone's Halving algorithm; however, Halving is computationally inefficient. In this work we show that, by carefully combining the two types of queries, a binary classifier can be learned in time $\poly(n+m)$ using only $\scO(m^2 \log n)$ label queries and $\scO\big(m \log \frac{m}{\gamma}\big)$ seed queries; the result extends to $k$-class classifiers at the price of a $k!k^2$ multiplicative overhead. Similar results hold when the input points have bounded bit complexity, or when only one class has strong convex hull margin against the rest. We complement these upper bounds by showing that in the worst case any algorithm needs $\Omega\big(\frac{k m \log \nicefrac{1}{\gamma}}{\log m}\big)$ seed and label queries to learn a $k$-class classifier with strong convex hull margin $\gamma$.

Learn more about how we do research

We maintain a portfolio of research projects, providing individuals and teams the freedom to emphasize specific types of work