Mukund Sundararajan

For up-to-date information visit: Mukund Sundararajan
Authored Publications
Google Publications
Other Publications
    In this work, we check whether a deep learning model that predicts rainfall from a variety of sensor readings behaves reasonably. Unlike traditional numerical weather prediction models that encode the physics of rainfall, our model relies purely on data and deep learning. Can we trust the model? Or should we take a rain check? We perform two types of analysis. First, we perform a one-at-a-time sensitivity analysis to understand properties of the input features. Holding all the other features fixed, we vary a single feature from its minimum to its maximum value and check whether the predicted rainfall obeys conventional intuition (e.g., more lightning implies more rainfall). Second, for a specific prediction at a certain location, we use an existing feature attribution technique to identify influential features (sensor readings) from this and other locations. Again, we check whether the feature importances match conventional wisdom (e.g., is ‘instant reflectivity’, a measure of the current rainfall, more influential than, say, surface temperature?). We compute influence not only on the predictions of the model but also on its error; the latter is perhaps a novel contribution to the literature on feature attribution. The model we chose to analyze is not the state of the art. It is flawed in several ways, and therefore makes for an interesting analysis target. We find several interesting issues. However, we should clarify that our analysis is not an indictment of machine learning approaches; indeed, we know of better models ourselves. Our goal is to demonstrate an interactive analysis technique.
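The one-at-a-time sensitivity analysis described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: `toy_rainfall_model`, the feature ordering, and the feature ranges are hypothetical stand-ins chosen only so the example runs.

```python
def one_at_a_time_sweep(model, baseline, feature_index, lo, hi, steps=11):
    """Vary one feature from lo to hi while holding the others at baseline."""
    outputs = []
    for k in range(steps):
        value = lo + (hi - lo) * k / (steps - 1)
        x = list(baseline)          # copy so the baseline is never mutated
        x[feature_index] = value
        outputs.append(model(x))
    return outputs


def is_non_decreasing(values, tol=1e-9):
    """Check the 'more of the feature implies more rainfall' intuition."""
    return all(b >= a - tol for a, b in zip(values, values[1:]))


# Hypothetical stand-in model (assumption): feature 0 plays the role of
# lightning intensity, feature 1 the role of surface temperature.
def toy_rainfall_model(x):
    lightning, temperature = x
    return 2.0 * lightning - 0.1 * (temperature - 20.0) ** 2


baseline = [0.5, 20.0]
lightning_sweep = one_at_a_time_sweep(toy_rainfall_model, baseline, 0, 0.0, 1.0)
temperature_sweep = one_at_a_time_sweep(toy_rainfall_model, baseline, 1, 0.0, 40.0)

print(is_non_decreasing(lightning_sweep))
print(is_non_decreasing(temperature_sweep))
```

A sweep that fails the monotonicity check is not necessarily a bug in the model; it only flags a feature whose behavior is worth inspecting by hand.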
    We study the attribution problem (cf. [SVZ13]) for deep networks applied to perception tasks. Traditionally, the attribution problem is formulated as blaming the network's prediction on the pixels of the input image, i.e., the space dimension. Often, signal is also present in the scale/frequency dimension. We propose a new technique called Blur Integrated Gradients that produces attributions in both space and scale. Furthermore, we use the scale-space axioms (cf. [Lindeberg]) to argue that the input perturbations used by Blur Integrated Gradients will not accidentally create features. The resulting explanations are cleaner and more faithful to how deep networks operate. We compare against some previously proposed techniques and demonstrate applications on three tasks: ImageNet object recognition, Diabetic Retinopathy prediction, and AudioSet audio event identification.
    We study interactions among players in cooperative games. We propose a new interaction index called the Shapley-Taylor Interaction Index. It decomposes the value of the game into terms that model the interactions between subsets of players, in a manner analogous to how the Taylor series represents a function in terms of its derivatives. We axiomatize the method using the axioms that axiomatize the Shapley value (linearity, dummy, and efficiency) and also an additional axiom that we call the interaction distribution axiom. This axiom explicitly characterizes how interactions are distributed for a class of games called interaction games. We contrast Shapley-Taylor values against the previously proposed Shapley Interaction Value (cf. [1]), which relies on a recursive construction rather than the efficiency and interaction distribution axioms.
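To make the decomposition concrete, here is a brute-force sketch of an order-k interaction index. The abstract does not state the formula, so the definition used here is an assumption: subsets smaller than k receive their discrete derivative at the empty set, and subsets of size exactly k receive k/n times a sum of discrete derivatives, each weighted by 1/C(n-1, |T|). The function names and `pair_game` are hypothetical, and the enumeration is exponential in the number of players.

```python
from itertools import combinations
from math import comb


def discrete_derivative(v, S, T):
    """delta_S v(T): alternating sum of v(T | W) over all subsets W of S."""
    total = 0.0
    for r in range(len(S) + 1):
        for W in combinations(S, r):
            sign = (-1) ** (len(S) - r)
            total += sign * v(frozenset(T) | frozenset(W))
    return total


def shapley_taylor(v, n, k):
    """Order-k interaction index for every non-empty subset of size <= k.

    Assumes the formula described in the lead-in; not taken from the abstract.
    """
    players = range(n)
    index = {}
    for size in range(1, k + 1):
        for S in combinations(players, size):
            if size < k:
                # Lower-order terms: derivative at the empty coalition.
                index[S] = discrete_derivative(v, S, ())
            else:
                rest = [p for p in players if p not in S]
                total = 0.0
                for t in range(len(rest) + 1):
                    for T in combinations(rest, t):
                        total += discrete_derivative(v, S, T) / comb(n - 1, t)
                index[S] = k / n * total
    return index


# Hypothetical game: value 1 only when players 0 and 1 are both present.
def pair_game(S):
    return 1.0 if {0, 1} <= S else 0.0


order_two = shapley_taylor(pair_game, n=3, k=2)
order_one = shapley_taylor(pair_game, n=3, k=1)
for subset, value in sorted(order_two.items()):
    print(subset, round(value, 6))
```

With k = 1 the same function reduces to the ordinary Shapley value, which splits the value of `pair_game` equally between players 0 and 1; with k = 2 the whole value is assigned to the pair (0, 1), and in both cases the terms sum to the value of the grand coalition (efficiency, for a game with v of the empty set equal to 0).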
    The problem of attributing a deep network’s prediction to its input/base features is well-studied (cf. Simonyan et al. (2013)). We introduce the notion of conductance to extend the notion of attribution to understanding the importance of hidden units. Informally, the conductance of a hidden unit of a deep network is the flow of attribution via this hidden unit. We can use conductance to understand the importance of a hidden unit to the prediction for a specific input, or over a set of inputs. We justify conductance in multiple ways: via a qualitative comparison with other methods, via some axiomatic results, and via an empirical evaluation based on a feature selection task. The empirical evaluations are done using the Inception network over ImageNet data, and a convolutional network over text data. In both cases, we demonstrate the effectiveness of conductance in identifying interesting insights about the internal workings of these networks.
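The "flow of attribution via a hidden unit" can be illustrated on a toy network. The sketch below assumes the integrated-gradients-style definition of conductance (the path integral, along the straight line from a baseline to the input, of dF/dy_j times the change in y_j along that line); the abstract does not spell this out, and the weights, the tanh activation, and the zero baseline are all hypothetical choices made for illustration.

```python
import math

# Hypothetical toy network: 2 inputs -> 3 tanh hidden units -> 1 linear output.
W1 = [[0.5, -1.0], [1.5, 0.25], [-0.75, 0.6]]
B1 = [0.1, -0.2, 0.0]
W2 = [1.0, -2.0, 0.5]


def hidden(x):
    """Activations y_j = tanh(W1[j] . x + B1[j])."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(W1, B1)]


def network(x):
    """F(x) = W2 . hidden(x)."""
    return sum(w * h for w, h in zip(W2, hidden(x)))


def conductance(x, baseline, steps=1000):
    """Approximate the conductance of each hidden unit with a midpoint rule."""
    diff = [xi - bi for xi, bi in zip(x, baseline)]
    cond = [0.0] * len(W2)
    for s in range(steps):
        alpha = (s + 0.5) / steps
        point = [bi + alpha * d for bi, d in zip(baseline, diff)]
        h = hidden(point)
        for j, row in enumerate(W1):
            dF_dy = W2[j]  # the output is linear in the hidden layer
            # sum_i (x_i - x'_i) * dy_j/dx_i, using tanh'(z) = 1 - tanh(z)^2
            dy_along_path = (1.0 - h[j] ** 2) * sum(w * d for w, d in zip(row, diff))
            cond[j] += dF_dy * dy_along_path / steps
    return cond


x = [1.0, 2.0]
baseline = [0.0, 0.0]
cond = conductance(x, baseline)
print([round(c, 4) for c in cond])
print(round(sum(cond), 4), round(network(x) - network(baseline), 4))
```

Under this definition the conductances of all units in one layer add up to the difference between the prediction at the input and at the baseline, which the last line checks numerically.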
    We analyze state-of-the-art deep learning models for three tasks: question answering on (1) images, (2) tables, and (3) passages of text. Using the notion of attribution (word importance), we find that these deep networks often ignore important question terms. Leveraging such behavior, we perturb questions to craft a variety of adversarial examples. Our strongest attacks drop the accuracy of a visual question answering model from 61.1% to 19%, and that of a tabular question answering model from 33.5% to 3.3%. Additionally, we show how attributions can strengthen attacks proposed by Jia and Liang (2017) on paragraph comprehension models. Our results demonstrate that attributions can augment standard measures of accuracy and empower investigation of model performance. When a model is accurate but for the wrong reasons, attributions can surface erroneous logic in the model that indicates inadequacies in the test data.
    Analyza: Exploring Data with Conversation
    Kevin McCurley
    Ralfi Nahmias
    Intelligent User Interfaces 2017, ACM, Limassol, Cyprus (to appear)
    We describe Analyza, a system that helps lay users explore data. Analyza has been used within two large real-world systems. The first is a question-and-answer feature in a spreadsheet product. The second provides convenient access to a revenue/inventory database for a large sales force. Both user bases consist of users who do not necessarily have coding skills, demonstrating Analyza's ability to democratize access to data. We discuss the key design decisions in implementing this system: for instance, how to mix structured and natural language modalities, how to use conversation to disambiguate and simplify querying, how to rely on the "semantics" of the data to compensate for the lack of syntactic structure, and how to efficiently curate the data.
    Universally optimal privacy mechanisms for minimax agents
    Mangesh Gupte
    Proc. ACM SIGMOD, ACM, Indianapolis, Indiana (2010), pp. 135-146
    A scheme that publishes aggregate information about sensitive data must resolve the trade-off between utility to information consumers and privacy of the database participants. Differential privacy is a well-established definition of privacy: it is a universal guarantee against all attackers, whatever their side-information or intent. Can we have a similar universal guarantee for utility? There are two standard models of utility considered in decision theory: Bayesian and minimax. Ghosh et al. show that a certain "geometric mechanism" gives optimal utility to all Bayesian information consumers. In this paper, we prove a similar result for minimax information consumers. Our result also works for a wider class of information consumers that includes Bayesian information consumers and subsumes the result from [8]. We model information consumers as minimax (risk-averse) agents, each endowed with a loss function that models their tolerance to inaccuracies and each possessing some side-information about the query. Further, information consumers are rational in the sense that they actively combine information from the mechanism with their side-information in a way that minimizes their loss. Under this assumption of rational behavior, we show that for every fixed count query, the geometric mechanism is universally optimal for all minimax information consumers. Additionally, our solution makes it possible to release query results, when information consumers are at different levels of privacy, in a collusion-resistant manner.
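For readers unfamiliar with the geometric mechanism mentioned in this abstract, the sketch below adds two-sided geometric noise to a count query. It is an illustration under stated assumptions, not the paper's construction: the privacy parameter is written as epsilon with alpha = exp(-epsilon), the output is not clamped to the valid range of the count, and the function names are hypothetical.

```python
import math
import random


def two_sided_geometric_pmf(z, epsilon):
    """Pr[noise = z] = (1 - alpha) / (1 + alpha) * alpha**|z|, alpha = exp(-epsilon)."""
    alpha = math.exp(-epsilon)
    return (1.0 - alpha) / (1.0 + alpha) * alpha ** abs(z)


def sample_geometric(alpha, rng):
    """Sample K >= 0 with Pr[K = k] = (1 - alpha) * alpha**k by inverse transform."""
    u = 1.0 - rng.random()  # uniform on (0, 1], so log(u) is always defined
    return int(math.floor(math.log(u) / math.log(alpha)))


def geometric_mechanism(true_count, epsilon, rng):
    """Release true_count plus two-sided geometric noise.

    The difference of two independent geometric variables with the same
    parameter follows the two-sided geometric distribution above.
    """
    alpha = math.exp(-epsilon)
    noise = sample_geometric(alpha, rng) - sample_geometric(alpha, rng)
    return true_count + noise


rng = random.Random(0)
released = [geometric_mechanism(10, 0.5, rng) for _ in range(20000)]
print(sum(released) / len(released))
```

Because neighbouring databases change a count by at most 1 and the probabilities of adjacent noise values differ by a factor of exactly alpha, the ratio of output probabilities on neighbouring databases stays within exp(epsilon).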
    Axiomatic Attribution for Multilinear Functions
    Yi Sun
    ACM Conference on Electronic Commerce (2011), pp. 177-178
    A Learning-Based Approach to Reactive Security
    Adam Barth
    Benjamin I. P. Rubinstein
    John C. Mitchell
    Dawn Song
    Peter L. Bartlett
    Financial Cryptography (2010), pp. 192-206
    Robust mechanisms for risk-averse sellers
    ACM Conference on Electronic Commerce (2010), pp. 139-148
    Universally optimal privacy mechanisms for minimax agents
    Mangesh Gupte
    PODS (2010), pp. 135-146
    Quantifying inefficiency in cost-sharing mechanisms
    Tim Roughgarden
    Journal of the ACM, vol. 56 (2009)
    Revenue Submodularity
    Shaddin Dughmi
    Tim Roughgarden
    AMMA (2009), pp. 89-91
    A Learning-Based Approach to Reactive Security
    Adam Barth
    Benjamin I. P. Rubinstein
    John C. Mitchell
    Dawn Xiaodong Song
    Peter L. Bartlett
    Financial Cryptography and Data Security, Springer-Verlag (2009), pp. 192-206
    New enhancements to the SOCKS communication network security protocol: Schemes and performance evaluation
    Mohammad S. Obaidat
    Journal of Systems and Software, vol. 82 (2009), pp. 1941-1949
    Revenue submodularity
    Shaddin Dughmi
    Tim Roughgarden
    ACM Conference on Electronic Commerce (2009), pp. 243-252
    An Automated Approach for Proving PCL Invariants
    John C. Mitchell
    Arnab Roy
    ENTCS, vol. 234 (2009), pp. 93-113
    Universally utility-maximizing privacy mechanisms
    Arpita Ghosh
    Tim Roughgarden
    STOC (2009), pp. 351-360
    On characterizations of truthful mechanisms for combinatorial auctions and scheduling
    Shahar Dobzinski
    ACM Conference on Electronic Commerce (2008), pp. 38-47
    Optimal marketing strategies over social networks
    Jason D. Hartline
    WWW (2008), pp. 189-198
    Is Shapley Cost Sharing Optimal?
    Shahar Dobzinski
    Tim Roughgarden
    SAGT (2008), pp. 327-336
    Computing Optimal Bundles for Sponsored Search
    Arpita Ghosh
    Hamid Nazerzadeh
    WINE (2007), pp. 576-583
    Beyond Moulin mechanisms
    Tim Roughgarden
    ACM Conference on Electronic Commerce (2007), pp. 1-10
    Stochastic Mechanism Design
    Samuel Ieong
    Anthony Man-Cho So
    WINE (2007), pp. 269-280
    Optimal Efficiency Guarantees for Network Design Mechanisms
    Tim Roughgarden
    IPCO (2007), pp. 469-483
    New trade-offs in cost-sharing mechanisms
    Tim Roughgarden
    STOC (2006), pp. 79-88
    Optimal Cost-Sharing Mechanisms for Steiner Forest Problems
    Shuchi Chawla
    Tim Roughgarden
    WINE (2006), pp. 112-123
    Chaining Algorithms for Alignment of Draft Sequence
    Michael Brudno
    Kerrin Small
    Arend Sidow
    Serafim Batzoglou
    WABI (2004), pp. 326-337