Local Explanation Methods for Deep Neural Networks Lack Sensitivity to Parameter Values

Julius Adebayo

Justin Gilmer

Ian Goodfellow

Been Kim

ICLR Workshop (2018)

Abstract

Explaining the output of a complicated machine learning model like a deep neural network (DNN) is a central challenge in machine learning. Increasingly, explanations are required for debugging models, building trust prior to model deployment, and potentially identifying unwanted effects like model bias. Several methods have been proposed to address this issue. Local explanation methods provide explanations of the output of a model on a single input. Given the importance of these explanations to the use and deployment of these models, we ask: can we trust local explanations for DNNs created using current methods? In particular, we seek to assess how specific local explanations are to the parameter values of DNNs. We compare explanations generated using fully trained DNNs to explanations of DNNs with some or all parameters replaced by random values. Somewhat surprisingly, we find that, for several local explanation methods, explanations derived from networks with randomized weights and networks with trained weights are both visually and quantitatively similar; in some cases, they are virtually indistinguishable. By randomizing different portions of the network, we find that local explanations rely significantly on lower-level features of the DNN.
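The comparison described in the abstract can be illustrated with a toy sketch. This is not the authors' code or experimental setup: it assumes a two-layer ReLU network with a scalar output, uses plain input gradients as the local explanation, and uses Spearman rank correlation as the similarity measure. The function names, network sizes, and the "trained" weights (a fixed random draw standing in for a trained model) are all hypothetical placeholders.

```python
import random


def gradient_saliency(x, W1, W2):
    """Input gradient of out = W2 . relu(W1 x) for a scalar-output toy network."""
    hidden = [sum(w * xi for w, xi in zip(row, x)) for row in W1]
    # d out / d x_j = sum_k W2[k] * 1[hidden_k > 0] * W1[k][j]
    return [
        sum(W2[k] * (1.0 if hidden[k] > 0 else 0.0) * W1[k][j]
            for k in range(len(W1)))
        for j in range(len(x))
    ]


def ranks(values):
    """Rank of each entry (0 = smallest); assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order):
        result[index] = rank
    return result


def spearman(a, b):
    """Spearman rank correlation using the no-ties formula."""
    ra, rb = ranks(a), ranks(b)
    n = len(a)
    d2 = sum((p - q) ** 2 for p, q in zip(ra, rb))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def random_matrix(rng, rows, cols):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n_inputs, n_hidden = 20, 16

# ASSUMPTION: placeholder weights standing in for a trained model.
W1_trained = random_matrix(rng, n_hidden, n_inputs)
W2_trained = [rng.gauss(0.0, 1.0) for _ in range(n_hidden)]
x = [rng.gauss(0.0, 1.0) for _ in range(n_inputs)]

# Freshly drawn weights used to overwrite parts of the network.
W1_random = random_matrix(rng, n_hidden, n_inputs)
W2_random = [rng.gauss(0.0, 1.0) for _ in range(n_hidden)]


def abs_map(saliency):
    # Compare magnitudes, as is common when visualizing saliency maps.
    return [abs(s) for s in saliency]


baseline = abs_map(gradient_saliency(x, W1_trained, W2_trained))
top_randomized = abs_map(gradient_saliency(x, W1_trained, W2_random))
all_randomized = abs_map(gradient_saliency(x, W1_random, W2_random))

print("top layer randomized :", spearman(baseline, top_randomized))
print("all layers randomized:", spearman(baseline, all_randomized))
```

A correlation near 1 after randomization would indicate an explanation that barely depends on the replaced parameters. The numbers printed by this toy network say nothing about the paper's results; the sketch only shows the shape of the comparison.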

Research Areas

Machine Intelligence
