Claudio Gentile
I am currently a Research Scientist at Google NY. I have been PC Chair or Area Chair of several conferences in the field of Machine Learning/Artificial Intelligence (e.g., COLT, ALT, NIPS, ICML, IJCAI). I held a Research Director position at INRIA (France), and visiting positions at a number of institutions, including UC Santa Cruz, Microsoft Research, and Technion (Israel). My current research interests include online learning with partial information, active learning, online clustering, and applications thereof.
More information can be found at my homepage.
Research Areas
Authored Publications
Sort By
Preview abstract
We present a novel way of training models in theweakly supervised setup of learning frombagsof examples with just aggregate label informa-tion. Unlike previous work, we focus on learninga classifier that can produce accurate predictionsat an individual instance level as opposed to simply matching a bag’s label proportion. The mainstrength of our algorithm lies on the simplicity ofits implementation. We demonstrate that a simplemodification to the loss function can yield accu-rate event level estimates. Moreover we show thatthe sample complexity of doing empirical riskminimization or stochastic gradient descent withlabel proportions increases only by a factor of √k compared to traditional supervised learning sce-narios. Finally, we validate our theoretical resultson multiple datasets and demonstrate that our pro-posed algorithm beats provides state of the artperformance for learning with label proportions.
View details
Preview abstract
We investigate a nonstochastic bandit setting in which the loss of an action is not immediately charged to the player, but rather spread over the subsequent rounds in an adversarial way. The instantaneous loss observed by the player at the end of each round is then a sum of many loss components of previously played actions. This setting encompasses as a special case the easier task of bandits with delayed feedback, a well-studied framework where the player observes the delayed losses individually.
Our first contribution is a general reduction transforming a standard bandit algorithm into one that can operate in the harder setting: We bound the regret of the transformed algorithm in terms of the stability and regret of the original algorithm. Then, we show that the transformation of a suitably tuned FTRL with Tsallis entropy has a regret of order $\sqrt{(d+1)KT}$, where $d$ is the maximum delay, $K$ is the number of arms, and $T$ is the time horizon. Finally, we show that our results cannot be improved in general by exhibiting a matching (up to a log factor) lower bound on the regret of any algorithm operating in this setting.
View details
Preview abstract
Motivated by problems of ranking with partial information, we introduce a variant of the cascading bandit model that considers flexible length sequences with varying rewards and losses. We formulate two generative models for this problem within the generalized linear setting, and design and analyze upper confidence algorithms for it. Our analysis delivers tight regret bounds which, when specialized to standard cascading bandits, results in sharper guarantees than previously available in the literature. We evaluate our algorithms against a representative sample of cascading bandit baselines on a number of real-world datasets and show significantly improved empirical performance.
View details
A Contextual Bandit Approach for Learning to Plan in Environments with Probabilistic Goal Configurations
Sohan Rudra
Saksham Goel
Gaurav Aggarwal
NeurIPS 5th Robot Learning Workshop: Trustworthy Robotics (2022) (to appear)
Preview abstract
We derive a novel active learning algorithm in the streaming setting for binary classification tasks. The algorithm leverages weak labels to minimize the number of label requests, and trains a model to optimize a surrogate loss on a resulting set of labeled and weak-labeled points. Our algorithm jointly admits two crucial properties: theoretical guarantees in the general agnostic setting and a strong empirical performance. Our theoretical analysis shows that the algorithm attains favorable generalization and label complexity bounds, while our empirical study on 18 real-world datasets demonstrate that the algorithm outperforms standard baselines, including the Margin Algorithm, or Uncertainty Sampling, a high-performing active learning algorithm favored by practitioners.
View details
Hierarchical Clustering of Data Streams: Scalable Algorithms and Approximation Guarantees
Anand Rajagopalan
Danny Vainstein
Fabio Vitale
Gui Citovsky
Magda Procopiuc
ICML (2021)
Preview abstract
We investigate the problem of hierarchically clustering data streams containing metric data in R^d. We introduce a desirable invariance property for such algorithms, describe a general family of hyperplane-based methods enjoying this property, and analyze two scalable instances of this general family against recently popularized similarity/dissimilarity-based metrics for hierarchical clustering. We prove a number of new results related to the approximation ratios of these algorithms, improving in various ways over the literature on this subject. Finally, since our algorithms are principled but also very practical, we carry out an experimental comparison on both synthetic and real-world datasets showing competitive results against known baselines.
View details
Batch Active Learning at Scale
Anand Rajagopalan
Gui Citovsky
Laz Karydas
NeurIPS 2021
Preview abstract
The ability to train complex and highly effective models often requires an abundance of training data, which can easily become a bottleneck in cost, time, and computational resources. Batch active learning, which adaptively issues batched queries to a labeling oracle, is a common approach for addressing this problem. The practical benefits of batch sampling come with the downside of less adaptivity and the risk of sampling redundant examples within a batch -- a risk that grows with the batch size. In this work, we analyze an efficient active learning algorithm, which focuses on the large batch setting. In particular, we show that our sampling method, which combines notions of uncertainty and diversity, easily scales to batch sizes (100K-1M) several orders of magnitude larger than used in previous studies and provides significant improvements in model training efficiency compared to recent baselines.
View details
Preview abstract
A general framework for online learning with partial information is one where feedback graphs specify which losses can be observed by the learner. We study a challenging scenario where feedback graphs vary with time stochastically and,more importantly, where graphs and losses are stochastically dependent. This scenario appears in several applications that we describe. We give a new algorithm for this scenario that exploits the stochastic properties of the graphs and that benefits from favorable regret guarantees in this challenging setting. We present a detailed theoretical analysis of this algorithm, and also report the results of a series of experiments on real-world datasets, which show that our algorithm outperforms standard baselines for online learning with feedback graphs.
View details
Preview abstract
We present a new active learning algorithm that adaptively partitions the input space into a finite number of regions, and subsequently seeks a distinct predictor for each region, both phases actively requesting labels. We prove theoretical guarantees for both the generalization error and the label complexity of our algorithm, and analyze the number of regions defined by the algorithm under some mild assumptions. We also report the results of an extensive suite of experiments on several real-world datasets demonstrating substantial empirical benefits over existing single-region and nonadaptive region-based active learning baselines.
View details
Preview abstract
We present two novel extensions of an on-line importance weighted active learning algorithm IWAL, using the properties of disagreement values among hypotheses. The first extension, DIWAL, prunes the hypothesis set with a more aggressive strategy based on concentration bounds with disagreement values. We show that DIWAL improves the generalization performance and the label complexity of the original IWAL, and further quantify the improvement in terms of the disagreement graph coefficient. The second extension, ZOOM, adaptively zooms into the function space near the best-in-class hypothesis, which effectively reduces the best-in-class error and thus simultaneously improves the generalization performance and the label complexity. We report experimental results on multiple datasets and demonstrate that the proposed algorithms achieve better test performances than IWAL given the same amount of labeling budget.
View details