Explanations based on the Missing: Towards Contrastive Explanations with\n Pertinent Negatives

Abstract

In this paper we propose a novel method that provides contrastive\nexplanations justifying the classification of an input by a black box\nclassifier such as a deep neural network. Given an input we find what should be\n%necessarily and minimally and sufficiently present (viz. important object\npixels in an image) to justify its classification and analogously what should\nbe minimally and necessarily \\emph{absent} (viz. certain background pixels). We\nargue that such explanations are natural for humans and are used commonly in\ndomains such as health care and criminology. What is minimally but critically\n\\emph{absent} is an important part of an explanation, which to the best of our\nknowledge, has not been explicitly identified by current explanation methods\nthat explain predictions of neural networks. We validate our approach on three\nreal datasets obtained from diverse domains; namely, a handwritten digits\ndataset MNIST, a large procurement fraud dataset and a brain activity strength\ndataset. In all three cases, we witness the power of our approach in generating\nprecise explanations that are also easy for human experts to understand and\nevaluate.\n