Jump to Content
Harikrishna Narasimhan

Harikrishna Narasimhan

I am a Senior Research Scientist at Google Research in Mountain View, USA. Prior to joining Google, I was a post-doctoral researcher at Harvard University. I completed my PhD from the Indian Institute of Science, Bangalore. My research interests broadly lie in the areas of machine learning and learning theory, with focus on complex evaluation metrics, constrained optimization and algorithmic fairness. Please see my personal web page for my full list of publications.
Authored Publications
Google Publications
Other Publications
Sort By
  • Title
  • Title, desc
  • Year
  • Year, desc
    Preview abstract The generalization ability of machine learning models degrades significantly when the test distribution shifts away from the training distribution. We investigate the problem of training models that are robust to shifts caused by changes in the distribution of class-priors or group-priors. The presence of skewed training priors can often lead to the models overfitting to spurious features. Unlike existing methods, which optimize for either the worst or the average performance over classes or groups, our work is motivated by the need for finer control over the robustness properties of the model. We present a lightweight post-hoc approach that performs scaling adjustments to predictions from a fixed pre-trained model, with the goal of minimizing a distributionally robust loss around a chosen target distribution. These adjustments are computed by solving a constrained optimization problem on a validation set and applied during test time. We propose a novel evaluation metric to test the models on a spectrum of controlled distribution shifts. While being extremely lightweight and fast, our method comes with provable guarantees and compares favorably with the existing more complex methods for this problem on standard class imbalance and group distributional robustness benchmarks. View details
    Preview abstract In real-world systems, models are frequently updated as more data becomes available, and in addition to achieving high accuracy, the goal is to also maintain a low difference in predictions compared to the base model (i.e. predictive churn). If model retraining results in vastly different behavior, then it could cause negative effects in downstream systems, especially if this churn can be avoided with limited impact on model accuracy. In this paper, we show an equivalence between training with distillation using the base model as the teacher and training with an explicit constraint on the predictive churn. We then show that distillation performs strongly for low churn training against a number of recent baselines on a wide range of datasets and model architectures, including fully-connected networks, convolutional networks, and transformers. View details
    Preview abstract Many practical settings allow a learner to defer predictions to one or more costly experts. For example, the learning to defer paradigm allows a learner to defer to a human expert, at some monetary cost. Similarly, the adaptive inference paradigm allows a base model to defer to one or more large models, at some computational cost. The goal in these settings is to learn classification and deferral mechanisms to optimise a suitable accuracy-cost tradeoff. To achieve this, a central issue studied in prior work is the design of a coherent loss function for both mechanisms. In this work, we demonstrate that existing losses have two subtle limitations: they can encourage underfitting when there is a high cost of deferring, and the deferral function can have a weak dependence on the base model predictions. To resolve these issues, we propose a post-hoc training scheme: we train a deferral function on top of a base model, with the objective of predicting to defer when the base model's error probability exceeds the cost of the expert model. This may be viewed as applying a partial surrogate to the ideal deferral loss, which can lead to a tighter approximation and thus better performance. Empirically, we verify the efficacy of post-hoc training on benchmarks for learning to defer and adaptive inference. View details
    Optimal Auctions through Deep Learning
    Paul Duetting
    Zhe Feng
    David C. Parkes
    Sai Srivatsa Ravindranath
    Communications of the ACM, vol. 64 (Issue 8) (2021), pp. 109-116
    Preview abstract Designing an incentive compatible auction that maximizes expected revenue is an intricate task. The single-item case was resolved in a seminal piece of work by Myerson in 1981. Even after 30-40 years of intense research the problem remains unsolved for settings with two or more items. We overview recent research results that show how tools from deep learning are shaping up to be become a powerful tool for the automated design of optimal auctions. In this approach, an auction is modeled as a multi-layer neural network, with optimal auction design framed as a constrained learning problem, and solved through standard machine learning pipelines. Through this approach, it is possible to recover essentially all known analytical solutions for multi-item settings, and obtain novel mechanisms for settings in which the optimal mechanism is unknown. View details
    Implicit rate-constrained optimization of non-decomposable objectives
    Andy Cotter
    Proceedings of the 38th International Conference on Machine Learning (ICML), 2021
    Preview abstract We consider a popular family of constrained optimization problems arising in machine learning that involve optimizing a non-decomposable evaluation metric with a certain thresholded form, while constraining another metric of interest. Examples of such problems include optimizing the false negative rate at a fixed false positive rate, optimizing precision at a fixed recall, optimizing the area under the precision-recall or ROC curves, etc. Our key idea is to formulate a rate-constrained optimization that expresses the threshold parameter as a function of the model parameters via the Implicit Function theorem. We show how the resulting optimization problem can be solved using standard gradient based methods. Experiments on benchmark datasets demonstrate the effectiveness of our proposed method over existing state-of-the-art approaches for these problems. The code for the proposed method is available at this url: https://github.com/google-research/google-research/tree/master/implicit_constrained_optimization. View details
    Optimizing Blackbox Metrics with Iterative Example Weighting
    Gaurush Hiranandani
    Jatin Mathur
    Mahdi Milani Fard
    Sanmi Koyejo
    Proceedings of the 38th International Conference on Machine Learning (ICML), 2021
    Preview abstract We consider learning to optimize a classification metric defined by a black-box function of the confusion matrix. Such black-box learning settings are ubiquitous, for example, when the learner only has query access to the metric of interest, or in noisy-label and domain adaptation applications where the learner must evaluate the metric via performance evaluation using a small validation sample. Our approach is to adaptively learn example weights on the training dataset such that the resulting weighted objective best approximates the metric on the validation sample. We show how to model and estimate the example weights and use them to iteratively post-shift a pre-trained class probability estimator to construct a classifier and analyze the resulting procedure's statistical properties. Experiments on various label noise, domain shift, and fair classification setups confirm that our proposal is better than the individual state-of-the-art baselines for each application. View details
    Fair Performance Metric Elicitation
    Gaurush Hiranandani
    Sanmi Koyejo
    NeurIPS 2020
    Preview abstract What is a fair performance metric? We consider the choice of fairness metrics through the lens of metric elicitation -- a principled framework for selecting performance metrics that best reflect implicit preferences. The use of metric elicitation enables a practitioner to tune the performance and fairness metrics to the task, context, and population at hand. Specifically, we propose novel strategies to elicit fair performance metrics for multiclass classification problems with multiple sensitive groups that also includes selecting the tradeoff between performance and fairness. The proposed elicitation strategies require only relative preference and are robust to both finite sample and feedback noise. View details
    Pairwise Fairness for Ranking and Regression
    Andy Cotter
    Maya Gupta
    33rd AAAI Conference on Artificial Intelligence (2020)
    Preview abstract Improving fairness for ranking and regression models has less mature algorithmic tooling than classifiers. Here, we present pairwise formulations of fairness for ranking and regression models that can express analogues of statistical fairness notions like equal opportunity or equal accuracy, as well as statistical parity. The proposed framework supports both discrete protected groups, and continuous protected attributes. We show that the resulting training problems can be efficiently and effectively solved using constrained optimization or robust optimization algorithms. Experiments illustrate the broad applicability and trade-offs of these methods. View details
    Preview abstract We present a consistent algorithm for constrained classification problems where the objective (e.g. F-measure, G-mean) and the constraints (e.g. demographic parity fairness, coverage) are defined by general functions of the confusion matrix. Our approach reduces the problem into a sequence of plug-in classifier learning tasks. The reduction is achieved by posing the learning problem as an optimization over the intersection of two sets: the set of confusion matrices that are achievable and those that are feasible. This decoupling of the constraint space then allows us to solve the problem using a Frank-Wolfe based algorithm. For objective and constraints that are convex functions of the confusion matrix, our algorithm requires $O(1/\epsilon^2)$ calls to the plug-in subroutine, which improves on the $O(1/\epsilon^3)$ rate needed by the reduction based algorithm of Narasimhan (2018). We show empirically that our algorithm is competitive with prior methods. View details
    Optimizing Black-box Metrics with Adaptive Surrogates
    Qijia Jiang
    Olaoluwa Adigun
    Mahdi Milani Fard
    Maya R. Gupta
    Proceedings of the 37th International Conference on Machine Learning (ICML 2020)
    Preview abstract We address the problem of training models with black-box and hard-to-optimize metrics by expressing the metric as a monotonic function of a small number of easy-to-optimize surrogates. We pose the training problem as an optimization over a relaxed surrogate space, which we solve by estimating local gradients for the metric and performing inexact convex projections. We analyze gradient estimates based on finite differences and local linear interpolations, and show convergence of our approach under smoothness assumptions with respect to the surrogates. Experimental results on classification and ranking problems verify the proposal performs on par with methods that know the mathematical formulation, and adds notable value when the form of the metric is unknown. View details
    Preview abstract In machine learning applications such as ranking fairness or fairness over intersectional groups, one often encounters optimization problems with an extremely large number of constraints. In particular, with ranking fairness tasks, there may even be a variable number of constraints, e.g. one for each query in the training set. In these cases, the standard approach of optimizing a Lagrangian while maintaining one Lagrange multiplier per constraint may no longer be practical. Our proposal is to associate a feature vector with each constraint, and to learn a "multiplier model" that maps each such vector to the corresponding Lagrange multiplier. We prove optimality and feasibility guarantees under assumptions on the flexibility of the multiplier model, and empirically demonstrate that our method is effective on real-world case studies. View details
    Metric-Optimized Example Weights
    Mahdi Milani Fard
    Maya Gupta
    Proceedings of the 36th International Conference on Machine Learning, PMLR, Long Beach, California, USA (2019), pp. 7533-7542
    Preview abstract Real-world machine learning applications often have complex test metrics, and may have training and test data that are not identically distributed. Motivated by known connections between complex test metrics and cost-weighted learning, we propose addressing these issues by using a weighted loss function with a standard loss, where the weights on the training examples are learned to optimize the test metric on a validation set. These metric-optimized example weights can be learned for any test metric, including black box and customized ones for specific applications. We illustrate the performance of the proposed method on diverse public benchmark datasets and real-world applications. We also provide a generalization bound for the method. View details
    On Making Stochastic Classifiers Deterministic
    Andy Cotter
    Maya Gupta
    33rd Conference on Neural Information Processing Systems (NeurIPS) (2019)
    Preview abstract Stochastic classifiers arise in a number of machine learning problems, and have become especially prominent of late, as they often result from constrained optimization problems, \eg for fairness, churn, or custom loss optimization. Despite their utility, the inherent randomness of stochastic classifiers may be problematic to use in practice for a variety of practical reasons. In this paper, we attempt to answer the theoretical question of how well a stochastic classifier can be approximated by a deterministic one, and compare several possible approaches, proving lower and upper bounds. We also experimentally investigate the pros and cons of these methods, not only in regard to how successfully each deterministic classifier approximates the original stochastic classifier, but also in terms of how well each addresses the other issues that can make stochastic classifiers undesirable. View details
    Optimal Auctions through Deep Learning
    Paul Duetting
    Zhe Feng
    David C. Parkes
    Sai Srivatsa Ravindranath
    Proceedings of the 36th International Conference on Machine Learning, PMLR, Long Beach, California, USA (2019), pp. 1706-1715
    Preview abstract Designing an incentive compatible auction that maximizes expected revenue is an intricate task. The single-item case was resolved in a seminal piece of work by Myerson in 1981. Even after 30-40 years of intense research the problem remains unsolved for seemingly simple multibidder, multi-item settings. In this work, we initiate the exploration of the use of tools from deep learning for the automated design of optimal auctions. We model an auction as a multi-layer neural network, frame optimal auction design as a constrained learning problem, and show how it can be solved using standard pipelines. We prove generalization bounds and present extensive experiments, recovering essentially all known analytical solutions for multi-item settings, and obtaining novel mechanisms for settings in which the optimal mechanism is unknown. View details
    Optimizing Generalized Rate Metrics with Three Players
    Andy Cotter
    Maya Gupta
    33rd Conference on Neural Information Processing Systems (NeurIPS) (2019)
    Preview abstract We present a general framework for solving a large class of learning problems with non-linear functions of classification rates. This includes problems where one wishes to optimize a non-decomposable performance metric such as the F-measure or G-mean, and constrained training problems where the classifier needs to satisfy non-linear rate constraints such as predictive parity fairness, distribution divergences or churn ratios. We extend previous two-player game approaches for constrained optimization to a game between three players to decouple the classifier rates from the nonlinear objective, and seek to find an equilibrium of the game. Our approach generalizes many existing algorithms, and makes possible new algorithms with more flexibility and tighter handling of non-linear rate constraints. We provide convergence guarantees for convex functions of rates, and show how our methodology can be extended to handle sums of ratios of rates. Experiments on different fairness tasks confirm the efficacy of our approach. View details
    No Results Found