Saccader: Accurate, interpretable image classification with hard attention

Simon Kornblith
Conference on Neural Information Processing Systems (NeurIPS), 2019
Abstract

Deep convolutional networks have achieved high accuracy on image classification tasks. Because of their complexity, however, these models are considered black boxes: the decisions they make are hard to interpret. This lack of interpretability has hindered their adoption in critical applications. One class of models that offers interpretability by design is models with hard attention mechanisms, which base their decisions on discrete, observable image regions. Training such models without attention supervision is challenging, however, and often results in low accuracy and poor attention locations. The difficulty stems from the fact that it is hard to quantify which locations in an image are salient, so these models are typically trained with reinforcement learning (RL) objectives such as REINFORCE. On large-scale datasets such as ImageNet, the action space is high-dimensional and the reward is sparse, which causes optimization to fail. Here we propose a novel hard-attention model (Saccader) that takes discrete attention actions, together with a self-supervised pretraining procedure that initializes the model to a state with more frequent rewards. We show that our model achieves high accuracy on ImageNet while providing interpretable decisions.
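The failure mode the abstract describes (a high-dimensional action space with sparse reward) is easiest to see in the REINFORCE objective itself. Below is a minimal sketch of a REINFORCE loss for a discrete glimpse policy, assuming attention locations are cells on a fixed grid; the function name, tensor shapes, and the mean-reward baseline are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def reinforce_loss(location_logits, rewards, actions):
    """REINFORCE policy-gradient loss for discrete attention actions.

    location_logits: (batch, num_cells) scores over a grid of candidate
        glimpse locations (hypothetical shape, not the paper's API).
    rewards: (batch,) e.g. 1.0 if the glimpse led to a correct
        classification, 0.0 otherwise -- the sparse reward signal
        described in the abstract.
    actions: (batch,) indices of the grid cells that were sampled.
    """
    log_probs = F.log_softmax(location_logits, dim=-1)
    chosen_log_probs = log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)
    # Subtract the batch-mean reward as a simple variance-reducing baseline.
    advantage = rewards - rewards.mean()
    # Negative sign: minimizing this loss ascends the expected reward.
    return -(advantage.detach() * chosen_log_probs).mean()

# Example: sample glimpse locations and apply the loss.
logits = torch.randn(8, 36, requires_grad=True)  # 8 images, 6x6 location grid
dist = torch.distributions.Categorical(logits=logits)
actions = dist.sample()                          # (8,) sampled cell indices
rewards = torch.randint(0, 2, (8,)).float()      # 1 if classification correct
loss = reinforce_loss(logits, rewards, actions)
loss.backward()
```

When nearly all rewards are zero, the advantage term carries almost no gradient signal, which is why a pretraining procedure that starts the policy in a state with more frequent rewards can make this objective tractable.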