Cognitive Psychology for Deep Neural Networks: A Shape Bias Case Study
Samuel Ritter* 1, David G.T. Barrett* 1, Adam Santoro 1, Matt M. Botvinick 1

*Equal contribution. 1DeepMind, London, UK. Correspondence to: Samuel Ritter <[email protected]>, David G.T. Barrett <[email protected]>.

Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, PMLR 70, 2017. Copyright 2017 by the author(s).

arXiv:1706.08606v2 [stat.ML] 29 Jun 2017

Abstract

Deep neural networks (DNNs) have achieved unprecedented performance on a wide range of complex tasks, rapidly outpacing our understanding of the nature of their solutions. This has caused a recent surge of interest in methods for rendering modern neural systems more interpretable. In this work, we propose to address the interpretability problem in modern DNNs using the rich history of problem descriptions, theories and experimental methods developed by cognitive psychologists to study the human mind. To explore the potential value of these tools, we chose a well-established analysis from developmental psychology that explains how children learn word labels for objects, and applied that analysis to DNNs. Using datasets of stimuli inspired by the original cognitive psychology experiments, we find that state-of-the-art one-shot learning models trained on ImageNet exhibit a similar bias to that observed in humans: they prefer to categorize objects according to shape rather than color. The magnitude of this shape bias varies greatly among architecturally identical, but differently seeded models, and even fluctuates within seeds throughout training, despite nearly equivalent classification performance. These results demonstrate the capability of tools from cognitive psychology for exposing hidden computational properties of DNNs, while concurrently providing us with a computational model for human word learning.

1. Introduction

During the last half-decade deep learning has significantly improved performance on a variety of tasks (for a review, see LeCun et al. (2015)). However, deep neural network (DNN) solutions remain poorly understood, leaving many to think of these models as black boxes, and to question whether they can be understood at all (Bornstein, 2016; Lipton, 2016). This opacity obstructs both basic research seeking to improve these models, and applications of these models to real world problems (Caruana et al., 2015).

Recent pushes have aimed to better understand DNNs: tailor-made loss functions and architectures produce more interpretable features (Higgins et al., 2016; Raposo et al., 2017) while output-behavior analyses unveil previously opaque operations of these networks (Karpathy et al., 2015). Parallel to this work, neuroscience-inspired methods such as activation visualization (Li et al., 2015), ablation analysis (Zeiler & Fergus, 2014) and activation maximization (Yosinski et al., 2015) have also been applied.

Altogether, this line of research developed a set of promising tools for understanding DNNs, each paper producing a glimmer of insight. Here, we propose another tool for the kit, leveraging methods inspired not by neuroscience, but instead by psychology. Cognitive psychologists have long wrestled with the problem of understanding another opaque intelligent system: the human mind. We contend that the search for a better understanding of DNNs may profit from the rich heritage of problem descriptions, theories, and experimental tools developed in cognitive psychology. To test this belief, we performed a proof-of-concept study on state-of-the-art DNNs that solve a particularly challenging task: one-shot word learning.
Specifically, we investigate Matching Networks (MNs) (Vinyals et al., 2016), which have state-of-the-art one-shot learning performance on ImageNet, and we investigate an Inception Baseline model (Szegedy et al., 2015a).

Following the approach used in cognitive psychology, we began by hypothesizing an inductive bias our model may use to solve a word learning task. Research in developmental psychology shows that when learning new words, humans tend to assign the same name to similarly shaped items rather than to items with similar color, texture, or size. To test the hypothesis that our DNNs discover this same "shape bias", we probed our models using datasets and an experimental setup based on the original shape bias studies (Landau et al., 1988).

Our results are as follows: 1) Inception networks trained on ImageNet do indeed display a strong shape bias. 2) There is high variance in the bias between Inception networks initialized with different random seeds, demonstrating that otherwise identical networks converge to qualitatively different solutions. 3) MNs also have a strong shape bias, and this bias closely mimics the bias of the Inception model that provides input to the MN. 4) By emulating the shape bias observed in children, these models provide a candidate computational account for human one-shot word learning. Altogether, these results show that the technique of testing hypothesized biases using probe datasets can yield both expected and surprising insights about solutions discovered by trained DNNs.

1.1. Related Work: Cognitive Modeling with Neural Networks

The use of behavioral probes to understand neural network function has been extensively applied within psychology itself, where neural networks have been employed effectively as models of human cognitive function (Rumelhart et al., 1988; Plaut et al., 1996; Rogers & McClelland, 2004; Mareschal et al., 2000). In contrast, in the present work we are advocating for the application of behavioral probes, along with associated theories and hypotheses from cognitive psychology, to address the interpretability problem in modern deep networks.

In spite of the widespread adoption of deep learning methods in recent years, to our knowledge, work applying behavioral probes to DNNs in machine learning for this purpose has been quite limited; we are only aware of Zoran et al. (2015) and Goodfellow et al. (2009), who used psychophysics-like experiments to better understand image processing models.

2. Inductive Biases, Statistical Learners and Probe Datasets

Before we delve into the specifics of the shape bias and one-shot word learning, we will describe our approach in the general context of inductive biases, probe datasets, and statistical learning. Suppose we have some data $\{y_i, x_i\}_{i=1}^N$ where $y_i = f(x_i)$. Our goal is to build a model of the data, $g(\cdot)$, to optimize some loss function $L$ measuring the disparity between $y$ and $g(x)$, e.g., $L = \sum_i \| y_i - g(x_i) \|^2$. Perhaps this data $x$ is images of ImageNet objects to be classified, images and histology of tumors to be classified as benign or malignant (Kourou et al., 2015), or medical history and vital measurements to be classified according to likely pneumonia outcomes (Caruana et al., 2015).
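As a deliberately minimal illustration of this setup, the sketch below spells out the squared-error objective for a generic model g. It is our own illustration rather than code from the paper; the function name, the NumPy implementation, and the toy data are assumptions made purely for exposition.

```python
import numpy as np

def squared_error_loss(g, xs, ys):
    """Total squared-error loss L = sum_i ||y_i - g(x_i)||^2 over a
    dataset {(x_i, y_i)}_{i=1}^N, for a candidate model g."""
    return sum(float(np.sum((np.asarray(y) - np.asarray(g(x))) ** 2))
               for x, y in zip(xs, ys))

# Toy usage with a hypothetical linear model g(x) = 2x on scalar data.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 7.0]
print(squared_error_loss(lambda x: 2 * x, xs, ys))  # -> 1.0
```

Any learner that drives this loss down must latch onto whatever properties of $x$ predict $y$, which is the sense of "inductive bias" developed in the next paragraph.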
A statistical learner such as a DNN will minimize $L$ by discovering properties of the input $x$ that are predictive of the labels $y$. These discovered predictive properties are, in effect, the properties of $x$ for which the trained model has an inductive bias. Examples of such properties include the shape of ImageNet objects, the number of nodes of a tumor, or a particular constellation of blood test values that often precedes an exacerbation of pneumonia symptoms.

Critically, in real-world datasets such as these, the discovered properties are unlikely to correspond to a single feature of the input $x$; instead, they correspond to complex conjunctions of those features. We could describe one of these properties using a function $h(x)$ which, for example, returns the shape of the focal object given an ImageNet image, or the number of nodes given a scan of a tumor. Indeed, one way to articulate the difficulty in understanding DNNs is to say that we often can't intuitively describe these conjunctions of features $h(x)$; although we often have numerical representations in intermediate DNN layers, they're often too arcane for us to interpret.

We advocate for addressing this problem using the following hypothesis-driven approach: First, propose a property $h_p(x)$ that the model may be using. Critically, it is not necessary that $h_p(x)$ be a function that can be evaluated using an automated method. Instead, the intention is that $h_p(x)$ is a function that humans (e.g. ML researchers and practitioners) can intuitively evaluate. $h_p(x)$ should be a property that is believed to be relevant to the problem, such as object shape or number of tumor nodes.

After proposing a property, the next step is to generate predictions about how the model should behave when given various inputs, if in fact it uses a bias with respect to the property $h_p(x)$. Then, construct and carry out an experiment wherein those predictions are tested. In order to execute such an experiment, it typically will be necessary to craft a set of probe examples $x$ that cover a relevant portion of the range of $h_p(x)$, for example a variety of object shapes. The results of this experiment will either support or fail to support the hypothesis that the model uses $h_p(x)$ to solve the task. This process can be especially valuable in situations where there is little or no training data available in important regions of the input space, and a practitioner needs to know how the trained model will behave in that region.

Psychologists have developed a repertoire of such hypotheses and experiments in their effort to understand the human mind. Here we explore the application of one of these theory-experiment pairs to state-of-the-art one-shot learning models.
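To make this procedure concrete for the shape-bias hypothesis studied here, the sketch below shows one way a shape-bias score could be computed for a generic image-embedding model. This is an illustrative reconstruction, not the authors' code: the `embed` function, the (probe, shape match, color match) triplet structure, and the cosine-similarity matching rule are our assumptions about how such a probe could be implemented.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def shape_bias(embed, probe_triplets):
    """Estimate a model's shape bias from a probe dataset.

    `embed` maps an image to a feature vector (e.g. a late layer of a
    trained classifier); `probe_triplets` is an iterable of
    (probe, shape_match, color_match) images, where the shape match shares
    only the probe's shape and the color match shares only its color.
    Returns the fraction of triplets for which the probe is closer, in
    cosine similarity, to the shape match than to the color match:
    1.0 indicates a pure shape bias, 0.0 a pure color bias, and 0.5 no
    preference either way.
    """
    triplets = list(probe_triplets)
    shape_choices = 0
    for probe, shape_match, color_match in triplets:
        p, s, c = embed(probe), embed(shape_match), embed(color_match)
        if cosine_similarity(p, s) > cosine_similarity(p, c):
            shape_choices += 1
    return shape_choices / len(triplets)
```

Under the hypothesized property $h_p(x)$ = object shape, a score well above 0.5 on such a probe set would support the hypothesis that the model's inductive bias favors shape over color.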