Journal of Machine Learning Research 20 (2019) 1-47
Submitted 8/18; Revised 6/19; Published 8/19

Logical Explanations for Deep Relational Machines Using Relevance Information

Ashwin Srinivasan
[email protected]
Department of Computer Sc. & Information Systems
BITS Pilani, K.K. Birla Goa Campus, Goa, India

Lovekesh Vig
[email protected]
TCS Research, New Delhi, India

Michael Bain
[email protected]
School of Computer Science and Engineering
University of New South Wales, Sydney, Australia.

Editor: Luc De Raedt

Abstract

Our interest in this paper is in the construction of symbolic explanations for predictions made by a deep neural network. We focus on deep relational machines (DRMs: a term introduced in Lodhi (2013)). A DRM is a deep network in which the input layer consists of Boolean-valued functions (features) that are defined in terms of relations provided as domain, or background, knowledge. Our DRMs differ from those in Lodhi (2013), which use an Inductive Logic Programming (ILP) engine to first select features; we instead use random selections from a space of features that satisfies some approximate constraints on logical relevance and non-redundancy. But why do the DRMs predict what they do? One way of answering this was provided in recent work by Ribeiro et al. (2016), which constructs readable proxies for a black-box predictor. The proxies are intended only to model the predictions of the black-box in local regions of the instance-space. But readability alone may not be enough: to be understandable, the local models must use relevant concepts in a meaningful manner.