On the Biological Plausibility of Grandmother Cells: Implications for Neural Network Theories in Psychology and Neuroscience
Psychological Review © 2009 American Psychological Association 2009, Vol. 116, No. 1, 220–251 0033-295X/09/$12.00 DOI: 10.1037/a0014462

Jeffrey S. Bowers
University of Bristol

A fundamental claim associated with parallel distributed processing (PDP) theories of cognition is that knowledge is coded in a distributed manner in mind and brain. This approach rejects the claim that knowledge is coded in a localist fashion, with words, objects, and simple concepts (e.g., “dog”), that is, coded with their own dedicated representations. One of the putative advantages of this approach is that the theories are biologically plausible. Indeed, advocates of the PDP approach often highlight the close parallels between distributed representations learned in connectionist models and neural coding in the brain and often dismiss localist (grandmother cell) theories as biologically implausible. The author reviews a range of data that strongly challenge this claim and shows that localist models provide a better account of single-cell recording studies. The author also contrasts local and alternative distributed coding schemes (sparse and coarse coding) and argues that the common rejection of grandmother cell theories in neuroscience is due to a misunderstanding about how localist models behave. The author concludes that the localist representations embedded in theories of perception and cognition are consistent with neuroscience; biology only calls into question the distributed representations often learned in PDP models.

Keywords: grandmother cells, distributed representations, parallel distributed processing, localist representation, connectionism

I would like to thank Patricia Bowers, Colin Davis, Nina Kazanina, Chad Marsolek, and Sven Mattys for reading and commenting on drafts of this article. Finally, I would like to thank Colin Davis for creating Figure 4 and Markus Damian for creating Figures 5A–5B and 11.

Correspondence concerning this article should be addressed to Jeffrey S. Bowers, Department of Experimental Psychology, University of Bristol, 12A Prior Road, Clifton, Bristol BS8-ITU, England. E-mail: [email protected]

Of the many different schools of neural network modeling, the parallel distributed processing (PDP) approach is the most influential within psychology (McClelland, Rumelhart, & the PDP Research Group, 1986; Rumelhart, McClelland, & the PDP Research Group, 1986). A key claim of this approach is that knowledge is coded in a distributed manner in the mind and the brain. That is, knowledge is coded as a pattern of activation across many processing units, with each unit contributing to many different representations. As a consequence, there is no one unit devoted to coding a given word, object, or person. This contrasts with the classic view, according to which knowledge is coded in a localist fashion. That is, individual words, objects, simple concepts, and the like are coded distinctly, with their own dedicated representation. For example, the words mother and father would be coded with distinct and nonoverlapping mental representations. On any localist account, the words mother and father would be linked by virtue of sharing some features (e.g., letters), but the words themselves would be stored explicitly and separately in the mind.

The rejection of localist coding schemes in favor of distributed representations is a core principle of the PDP approach (e.g., Plaut & Shallice, 1993; Seidenberg, 1993). But it is important to note that it is not a principle of neural network theories in general (cf. Bowers, 2002; Feldman & Ballard, 1982). Indeed, many neural networks rely on local representations for their functioning. In some cases, localist representations are built into a model (e.g., Dell, 1986; Morton, 1979; McClelland & Elman, 1986; McClelland & Rumelhart, 1981), and in other cases, they are learned (e.g., Grossberg, 1980, 1987; Kruschke, 1992; Rumelhart & Zipser, 1985). At present, localist and distributed neural networks have been applied to a wide range of cognitive phenomena, and the goal in both cases is to identify a set of general principles that apply across domains.

Given that neural networks can learn either local or distributed representations, why have so many theorists rejected localist representations in favor of distributed ones? Of course, criticisms levied against localist models differ, case by case, but one underlying criticism is often raised against the approach as a whole; namely, the models are thought to be biologically implausible. By contrast, advocates of the PDP approach often highlight the link between the distributed representations learned in PDP models and the neural coding systems employed in the brain. For example, in an article titled “Six Principles for Biologically Based Computational Models of Cortical Cognition,” O’Reilly (1998) included distributed representations as Principle 2, writing,

The cortex is widely believed to use distributed representations to encode information. A distributed representation uses multiple active neuron-like processing units to encode information (as opposed to a single-unit, localist representation), and the same unit can participate in multiple representations . . . Electrophysiological recordings demonstrate that distributed representations are widely used in the cortex. (p. 456)

In a general review of cognitive neuroscience, McClelland (2001) wrote,

Studies relying on microelectrodes to record from neurons in the brains of behaving animals can allow researchers to study the representations that the brain uses to encode information, and the evolution of these representations over time . . . These studies indicate, among other things, that the brain relies on distributed representations. (p. 2137)

Countless other similar quotes could be provided.

At first blush, these claims regarding the relative biological plausibility of localist (grandmother) and distributed coding schemes seem well founded. As reviewed by Gross (2002), few neuroscientists have taken localist representations seriously. Indeed, the term grandmother cell (a term often attributed to Jerry Lettvin; cf. Gross, 2002) is generally used to ridicule the claim that complex and meaningful stimuli are coded by individual cells in the cortex. Finkel (1988) called them “infamous grandmother cells” (p. 787). As noted by Connor (2005), “No one wants to be accused of believing in grandmother cells” (p. 1036). Instead, after 50 years of single-cell recording studies, it is widely claimed (and implicitly assumed) that the brain codes information with some form of distributed coding. As Averbeck, Latham, and Pouget (2006) put it, “As in any good democracy, individual neurons count for little; it is the population of activity that matters” (p. 358).

Given all this, what is an advocate of localist representations to do? Grandmother cells in the brain are the physiological counterpart to localist representations in cognitive models (e.g., Coltheart, Curtis, Atkins, & Haller, 1993; Davis, 1999; Hummel & Holyoak, 2003; McClelland & Rumelhart, 1981; Morton, 1969; Norris, 1994; Page & Norris, 1998). If localist and PDP models are to be compared (and assessed) in terms of their biological plausibility, grandmother cells had

(e.g., Coltheart, 2004). Rather, I assume that the relevance of this data is not appreciated.

Of course, there is one more possible response for an advocate of localist models. That is, the common claim that distributed representations better capture key aspects of brain functioning can be challenged. Although this latter position is rarely adopted, there are a few prominent researchers in neuroscience who either endorse the grandmother neuron hypothesis or who consider it a serious option on the basis of the neurophysiological data (e.g., Barlow, 1972; Parker & Newsome, 1998; Thorpe, 1989, 2002). This is the position taken here. Indeed, the main goal in this article is to show that the current findings in neuroscience are compatible with localist models in psychology. It is not my claim that current data provide unambiguous support for localist coding schemes. But I do intend to show that there is no reason to prefer distributed over localist representations on the basis of their relative biological plausibility. Indeed, I argue that the distributed representations learned in PDP models are often inconsistent with much of the relevant neuroscience data.

Although this article is largely directed to cognitive psychologists, I hope the article is of some relevance to neuroscientists as well. Despite the widespread rejection of grandmother cells in the neuroscience literature, there is now a large body of evidence highlighting the extent to which single neurons respond selectively to inputs, including words, objects, and faces. So why then do neuroscientists favor distributed coding schemes? Part of the reason is that the term “distributed representation” is used somewhat differently in the two literatures. Indeed, the data taken to support distributed coding schemes in the brain rarely provide any support for the types of coding schemes employed in many PDP networks. More importantly, there seems to be confusion in the neuroscience literature with regard to how local coding schemes work, and what sorts of data support or refute this framework. It is striking how often data entirely consistent with localist coding in the brain is