Kushal Chauhan
I am a Research Software Engineer at Google Research working with Dr. Aravindan Raghuveer on privacy-preserving deep learning. Before this, I was a Pre-Doctoral Researcher at Google Research, where I worked with Dr. Pradeep Shenoy on robust and interpretable deep learning.
Authored Publications
Generalization and Learnability in Multiple Instance Regression
Lorne Applebaum
Ashwinkumar Badanidiyuru Varadaraja
Chandan Giri
Proc. UAI (2024)
Multiple instance regression (MIR) was introduced by Ray & Page (2001) as an analogue of multiple instance learning (MIL) in which we are given bags of feature-vectors (instances), and for each bag there is a bag-label which matches the label of one (unknown) primary instance from that bag. The goal is to compute a hypothesis regressor consistent with the underlying instance-labels. A natural approach is to find the best primary instance assignment and regressor optimizing (say) the MSE loss on the bags, though no formal generalization guarantees were known. Our work is the first to prove generalization error bounds for MIR when the bags are drawn i.i.d. at random. Essentially, with high probability, any MIR regressor with low error on sampled bags also has low error on the underlying instance-label distribution. We next study the complexity of linear regression on MIR bags, shown to be NP-hard in general by Ray & Page (2001), who however left open the possibility of arbitrarily good approximations. Significantly strengthening previous work, we prove a strong inapproximability bound: even if there exists a zero-loss MIR linear regressor on a collection of 2-sized bags with labels in [−1, 1], it is NP-hard to find an MIR linear regressor with loss < C for some absolute constant C > 0.
Our work also proposes two novel model training methods on MIR bags based on (i) weighted assignment loss and (ii) EM pseudo-labeling, handling the case of overlapping bags, which has not previously been studied. We conduct extensive empirical evaluations on synthetic and real-world datasets showing that our method outperforms the baseline MIR methods.
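To make the "best primary instance assignment" objective mentioned in the abstract concrete, here is a minimal sketch for a linear regressor. It is an illustration under stated assumptions, not the paper's training method: the function name `min_instance_bag_mse`, the toy bags, and the weights are all hypothetical. For each bag, the loss takes the instance whose prediction is closest to the bag-label and averages the resulting squared errors over bags.

```python
import numpy as np

def min_instance_bag_mse(w, bags, bag_labels):
    """Mean over bags of the smallest squared error attained by any instance.

    Hypothetical helper: each bag is an (instances x features) array and
    w is the weight vector of a linear regressor.
    """
    losses = []
    for X, y in zip(bags, bag_labels):
        preds = X @ w                              # one prediction per instance
        losses.append(np.min((preds - y) ** 2))    # best primary-instance choice
    return float(np.mean(losses))

# Toy data (made up for illustration): two bags of two instances each.
bags = [np.array([[1.0, 0.0], [0.0, 1.0]]),
        np.array([[2.0, 0.0], [0.0, 3.0]])]
bag_labels = [1.0, 6.0]

w = np.array([1.0, 2.0])
loss = min_instance_bag_mse(w, bags, bag_labels)
print(loss)
```

The inner minimum is what makes the objective hard to optimize: it is non-convex in `w` even though each per-instance squared error is convex.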
Reliable outlier detection is critical for real-world applications of deep learning models. Likelihoods produced by deep generative models, although extensively studied, have been largely dismissed as being impractical for outlier detection. For one, deep generative model likelihoods are readily biased by low-level input statistics. Second, many recent solutions for correcting these biases are computationally expensive or do not generalize well to complex, natural datasets. Here, we explore outlier detection with a state-of-the-art deep autoregressive model: PixelCNN++. We show that biases in PixelCNN++ likelihoods arise primarily from predictions based on local dependencies. We propose two families of bijective transformations that we term “shaking” and “stirring”, which ameliorate low-level biases and isolate the contribution of long-range dependencies to the PixelCNN++ likelihood. These transformations are computationally inexpensive and readily applied at evaluation time. We evaluate our approaches extensively with five grayscale and six natural image datasets and show that they achieve or exceed state-of-the-art outlier detection performance. In sum, lightweight remedies suffice to achieve robust outlier detection on images with deep autoregressive models.
The options framework in Hierarchical Reinforcement Learning breaks down overall goals into a combination of options or simpler tasks and associated policies, allowing for abstraction in the action space. Ideally, these options can be reused across different higher-level goals; indeed, many previous approaches have proposed limited forms of transfer of prelearned options to new task settings. We propose a novel "option indexing" approach to hierarchical learning (OI-HRL), where we learn an affinity function between options and the functionalities (or affordances) supported by the environment. This allows us to effectively reuse a large library of pretrained options, in zero-shot generalization at test time, by restricting goal-directed learning to only those options relevant to the task at hand. We develop a meta-training loop that learns the representations of options and environment affordances over a series of HRL problems, by incorporating feedback about the relevance of retrieved options to the higher-level goal. In addition to a substantial decrease in sample complexity compared to learning HRL policies from scratch, we also show significant gains over baselines that have the entire option pool available for learning the hierarchical policy.
Concept bottleneck models (CBMs) (Koh et al. 2020) are interpretable neural networks that first predict labels for human-interpretable concepts relevant to the prediction task, and then predict the final label based on the concept label predictions. We extend CBMs to interactive prediction settings where the model can query a human collaborator for the labels of some concepts. We develop an interaction policy that, at prediction time, chooses which concepts to request a label for so as to maximally improve the final prediction. We demonstrate that a simple policy combining concept prediction uncertainty and influence of the concept on the final prediction achieves strong performance and outperforms a static approach proposed in Koh et al. (2020) as well as active feature acquisition methods proposed in the literature. We show that the interactive CBM can achieve accuracy gains of 5-10% with only 5 interactions over competitive baselines on the Caltech-UCSD Birds dataset and the CheXpert dataset.
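One way to picture a policy that combines concept uncertainty with concept influence is the sketch below. The abstract does not specify how the two signals are measured or combined, so everything here is an assumption: uncertainty is taken to be the binary entropy of the predicted concept probability, influence is taken to be the absolute weight of the concept in a hypothetical linear label head, and the two are multiplied. The names `binary_entropy` and `pick_concepts_to_query` are illustrative only.

```python
import numpy as np

def binary_entropy(p, eps=1e-12):
    """Entropy (in nats) of Bernoulli(p), clipped for numerical safety."""
    p = np.clip(p, eps, 1.0 - eps)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))

def pick_concepts_to_query(concept_probs, influence, budget):
    """Return indices of the `budget` concepts with the highest score.

    Assumed score: uncertainty (entropy) times influence (absolute weight).
    """
    probs = np.asarray(concept_probs, dtype=np.float64)
    scores = binary_entropy(probs) * np.abs(np.asarray(influence))
    order = np.argsort(-scores)          # highest score first
    return order[:budget].tolist()

# Toy values (made up): four concepts with predicted probabilities and weights.
concept_probs = [0.5, 0.99, 0.6, 0.5]
influence = np.array([2.0, 5.0, 0.1, 0.0])

chosen = pick_concepts_to_query(concept_probs, influence, budget=2)
print(chosen)
```

Under this scoring, a maximally uncertain concept with zero influence is never queried, and a highly influential concept is deprioritized when the model is already confident about it.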
Deep networks often make confident yet incorrect predictions when tested with outlier data that is far removed from their training distributions. Likelihoods computed by deep generative models (DGMs) are a candidate metric for outlier detection with unlabeled data. Yet, previous studies have shown that DGM likelihoods are unreliable and can be easily biased by simple transformations to input data. Here, we examine outlier detection with variational autoencoders (VAEs), among the simplest of DGMs. We propose novel analytical and algorithmic approaches to ameliorate key biases with VAE likelihoods. Our bias corrections are sample-specific, computationally inexpensive, and readily computed for various decoder visible distributions. Next, we show that a well-known image pre-processing technique – contrast stretching – extends the effectiveness of bias correction to further improve outlier detection. Our approach achieves state-of-the-art accuracies with nine grayscale and natural image datasets, and demonstrates significant advantages – both with speed and performance – over four recent, competing approaches. In summary, lightweight remedies suffice to achieve robust outlier detection with VAEs.
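For readers unfamiliar with contrast stretching, the following is a minimal sketch of the generic technique: pixel intensities are linearly rescaled so that a chosen low and high percentile map to 0 and 1. The abstract does not say which variant or percentiles the paper uses, so the function name `contrast_stretch`, its default percentiles, and the toy image are assumptions made for illustration.

```python
import numpy as np

def contrast_stretch(image, low_pct=0.0, high_pct=100.0):
    """Linearly rescale intensities so the given percentiles map to [0, 1].

    Generic sketch; the defaults (min and max) are an assumption.
    """
    image = np.asarray(image, dtype=np.float64)
    lo, hi = np.percentile(image, [low_pct, high_pct])
    if hi <= lo:                         # constant image: nothing to stretch
        return np.zeros_like(image)
    return np.clip((image - lo) / (hi - lo), 0.0, 1.0)

# Toy 2x2 "image" (made up) with intensities in a narrow range.
img = np.array([[50.0, 100.0], [150.0, 200.0]])
stretched = contrast_stretch(img)
print(stretched)
```

Using percentiles other than 0 and 100 makes the rescaling less sensitive to a few extreme pixels, at the cost of clipping them.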