Google Research

Addressing Stability in Classifier Explanations

  • Amitabha Roy
  • Dake He
  • Nasrin Baratalipour
  • Pranjul Yadav
  • Siavash Samiei
2021 IEEE International Conference on Big Data

Abstract

Machine-learning classifiers are often black boxes when it comes to how much each input contributes to the output probability of a label, especially complex non-linear models such as neural networks. A popular model-independent way to explain model outputs is through Shapley values. We discuss the problem of instability in Shapley-value explanations: we found that explanations vary because of random sampling in the algorithm. We show how this problem can be effectively addressed using Monte Carlo integration, in the form of averaging the model output while varying only a subset of features in the example to be explained. This unlocks the use of Shapley-value-based explainers for a variety of classifiers, including neural networks.
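
The averaging idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation; it is one plausible reading under stated assumptions: the value of a feature subset is estimated by fixing those features to the explained example's values, drawing the remaining features from a fixed background sample, and averaging the model output. The names `coalition_value`, `shapley_values`, `toy_model`, and the synthetic background data are all hypothetical, and the exact enumeration over subsets is only practical for a handful of features.

```python
import itertools
import math

import numpy as np


def coalition_value(model, x, background, subset):
    """Monte Carlo estimate of the model output when the features in
    `subset` are fixed to the values in `x` and all other features
    vary over the rows of `background`."""
    samples = background.copy()
    idx = list(subset)
    if idx:
        samples[:, idx] = x[idx]
    return float(np.mean(model(samples)))


def shapley_values(model, x, background):
    """Exact Shapley values over the averaged value function above.
    Enumerates every subset, so it is only feasible for few features."""
    n = x.shape[0]
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Standard Shapley weight for a coalition of size k.
            weight = (math.factorial(k) * math.factorial(n - k - 1)
                      / math.factorial(n))
            for subset in itertools.combinations(others, k):
                with_i = coalition_value(model, x, background, subset + (i,))
                without_i = coalition_value(model, x, background, subset)
                phi[i] += weight * (with_i - without_i)
    return phi


def toy_model(X):
    """Hypothetical non-linear classifier returning a label probability."""
    logits = 2.0 * X[:, 0] - X[:, 1] + 0.5 * X[:, 0] * X[:, 2]
    return 1.0 / (1.0 + np.exp(-logits))


# Assumption: a fixed synthetic background sample stands in for real data.
rng = np.random.default_rng(0)
background = rng.normal(size=(200, 3))
x = np.array([1.0, -0.5, 2.0])

phi = shapley_values(toy_model, x, background)
print(phi)
```

Because the background sample is fixed and every subset is evaluated by the same average, repeated calls return identical attributions, and the attributions sum to the model output for `x` minus the mean output over the background. With many features, the subset enumeration would have to be replaced by some approximation, which this sketch does not attempt.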
