Online Learning with Sleeping Experts and Feedback Graphs

Google Scholar

Abstract

We consider the scenario of online learning with the sleeping experts where not all experts are available at each round and analyze the general framework of learning with stochastic feedback graphs, where loss observations associated to each expert are characterized by a graph, thereby including both the bandit and full information settings as special cases. A critical assumption in this framework is that the loss observations and the set of sleeping experts at each round is independent. We first extend the classical algorithm of Kleinberg et al 2008 to use the loss information encoded by any sequence of feedback graphs and prove matching upper and lower bounds for the sleeping regret of this algorithm. Our main contribution is then to relax this independence assumption, present a finer notion of sleeping regret, and derive a general algorithm with strong theoretical guarantees. We instantiate our framework to the important scenario of online learning with abstention, where a learner can elect to abstain from making a prediction at the price of a certain cost. We empirically validate the improvement of our algorithm against multiple abstention algorithms on several real-world datasets

Research Areas