Connectionism, Confusion, and Cognitive Science

MICHAEL R.W. DAWSON and KEVIN S. SHAMANSKI

Biological Computation Project, University of Alberta

CONTENTS

Synopsis
Introduction
PDP Networks and the Tri-Level Hypothesis
Computational Descriptions of PDP Networks
    PDP Networks Are Powerful Systems
    How Are Claims About PDP Competence Related to Cognitive Science?
    Computational Connectionism and Cognitive Science
Algorithmic Descriptions of PDP Networks
    PDP Networks Are Themselves Algorithms
    Problems With PDP Algorithms
    Algorithmic Connectionism and Cognitive Science
Implementational Descriptions of PDP Networks
    PDP Networks, Biological Plausibility, and Computational Relevance
    Problems Arise Because of Biologically Undefended Design Decisions
    Implementational Connectionism and Cognitive Science
From Connectionism to Cognitive Science
Acknowledgements
References


SYNOPSIS

This paper argues that while connectionist technology may be flourishing, connectionist cognitive science is languishing. Parallel distributed processing (PDP) networks can be proven to be computationally powerful, but these proofs offer few useful constraints for developing models in cognitive science. Connectionist algorithms — that is, the PDP networks themselves — can exhibit interesting behaviours, but are difficult to interpret and are based upon an incomplete functional architecture. While PDP networks are often touted as being more "biologically plausible" than classical AI models, they do not appear to have been widely endorsed by neurophysiologists because they incorporate many implausible assumptions, and they may not model appropriate physiological processes. In our view, connectionism faces such problems because the design decisions governing current connectionist theory are determined by engineering needs — generating the appropriate output — and not by cognitive or neurophysiological considerations. As a result, the true nature of connectionist theories, and their potential contribution to cognitive science, is unclear. We propose that the current confusion surrounding connectionism's role in cognitive science could be greatly alleviated by adopting a research programme in which connectionists paid much more attention to validating the PDP architecture.

Key Words

Parallel distributed processing, connectionism, cognitive science

INTRODUCTION

Connectionists appear to be making great advances in the technology of knowledge engineering, and now feel poised to answer the difficult questions about machine intelligence that seem to have passed classical AI by. Parallel distributed processing (PDP) models have been developed for a diverse range of phenomena, as a survey of almost any journal related to cognitive science will show. For example, in recent years


Psychological Review has published connectionist models concerned with aspects of reading (Hinton & Shallice, 1991; Seidenberg & McClelland, 1989), classical learning theory (Kehoe, 1988), automatic processing (Cohen, Dunbar, & McClelland, 1991), sentence production (Dell, 1986), apparent motion (Dawson, 1991), and dreaming (Antrobus, 1991). In addition, many basic connectionist ideas are being directly implemented in hardware (e.g., Jabri & Flower, 1991) under the assumption that increases in computer power and speed require radical new parallel architectures (e.g., Hillis, 1985; Müller & Reinhardt, 1990, p. 17). "The neural network revolution has happened. We are living in the aftermath" (Hanson & Olson, 1991, p. 332).

While connectionist technology may indeed be flourishing, connectionist cognitive science is languishing. PDP networks may generate interesting behaviour, but it is not clear that they do so by emulating the fundamental nature of human cognitive processes. In our view, the current design decisions governing connectionist theory are determined by engineering needs — generating the appropriate output — and not by cognitive or neurophysiological considerations.

The goal of this paper is to illustrate the gap between connectionist technology and connectionist cognitive science. Connectionist networks can be proven to be computationally powerful, but these proofs offer no meaningful constraints for designing cognitive models. Connectionist algorithms — that is, the PDP networks themselves — can exhibit interesting behaviours, but are difficult to interpret and are based upon an insufficient functional architecture. While PDP networks are often touted as being more "biologically plausible" than classical or symbolic artificial intelligence (AI) models, they do not appear to have been widely endorsed by neurophysiologists because they incorporate many implausible and unjustified assumptions, and they may not model appropriate physiological processes.


PDP NETWORKS AND THE TRI-LEVEL HYPOTHESIS

The explosion of interest in connectionist systems over the past decade has been accompanied by the development of diverse architectures (for overviews, see Cowan & Sharp, 1988; Hecht-Nielsen, 1990; Müller & Reinhardt, 1990). These have ranged from simulations designed to mimic (with varying detail) specific neural circuits (e.g., Granger, Ambros-Ingerson & Lynch, 1989; Grossberg, 1991; Grossberg & Rudd, 1989, 1992; Lynch, Granger, Larson & Baudry, 1989) to new computer designs that have little to do with human cognition, but which use parallel processing to solve problems that are ill-posed or that require simultaneous satisfaction of multiple constraints (e.g., Hillis, 1985; Abu-Mostafa & Psaltis, 1987). Given this diversity, it is important at the outset to identify the branch of connectionism with which we are particularly concerned. This paper examines the characteristics of what has been called generic connectionism (e.g., Anderson & Rosenfeld, 1988, p. xv), because it appears to have had the most impact on cognitive science.

A detailed description of the generic connectionist architecture is provided by Rumelhart, Hinton and McClelland (1986). PDP models are defined as networks of simple, interconnected processing units (see Figure 1a). Each processing unit in such a network operates in parallel, and is characterized by three components: a net input function which defines the total signal to the unit, an activation function which specifies the unit's current "numerical state", and an output function which defines the signal sent by the unit to others. Such signals are sent through connections between processing units, which serve as communication channels that transfer weighted numeric signals from one unit to another. Connection strengths can be modified by applying a learning rule, which serves to teach a network how to perform some desired task. For instance, the generalized delta rule (Rumelhart, Hinton & Williams, 1986a, 1986b) computes an error signal using the difference between the observed and desired responses of the network. This error signal is then "propagated backwards" through the network, and used to change connection weights, so that the network's performance will improve.
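To make these three components concrete, the following Python sketch implements a single generic processing unit. It is a minimal illustration under common assumptions — a summed net input, a logistic activation function, and an identity output function — and the function names are our own, not part of any standard connectionist architecture.

    import math

    def net_input(weights, signals, bias=0.0):
        # Net input function: the weighted sum of incoming signals.
        return sum(w * s for w, s in zip(weights, signals)) + bias

    def activation(net):
        # Activation function: the logistic "squashing" function,
        # mapping any net input into the interval (0, 1).
        return 1.0 / (1.0 + math.exp(-net))

    def output(act):
        # Output function: here simply the identity, so the unit
        # transmits its activation unchanged through its connections.
        return act

    # A unit with two weighted connections responding to one input pattern.
    print(output(activation(net_input([0.5, -1.2], [1.0, 0.0]))))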



Fig. 1: Components of the generic connectionist architecture. (A) A typical multilayer network of processing units. It is trained to generate a particular response to patterns that are presented to the input units. Hidden units can be viewed as feature detectors. (B) The three basic operations of a single processing unit. See the text for details. (C) Processing units communicate through a connection that has the numerical weight w_ij; learning rules are used to modify weight values.


Classical artificial intelligence (AI) systems embody the notion that information processing is the serial manipulation of physical symbols that represent semantic content. Defined in this fashion, classical systems need to be described at three different levels of analysis — computational, algorithmic, and implementational — if they are to be completely understood (e.g., Marr, 1982, Chap. 1; Pylyshyn, 1984). Models created from the generic connectionist architecture are usually described (and sometimes criticized) as being radically different from classical models (e.g., Broadbent, 1985; Churchland & Sejnowski, 1989; Clark, 1989; Fodor & Pylyshyn, 1988; Hawthorne, 1989; Hecht-Nielsen, 1990; McClelland, Rumelhart & Hinton, 1986; Rumelhart, Smolensky, McClelland & Hinton, 1986; Schneider, 1987; Smolensky, 1988). One issue that arises from the view that connectionism offers a "paradigm shift" to AI researchers is whether the so-called tri-level hypothesis also applies to PDP models.

For instance, some arguments indicate that PDP systems are implementational models: "In our view, people are smarter than today's computers because the brain employs a basic architecture that is more suited to deal with a central aspect of the natural information processing tasks that people are so good at" (McClelland, Rumelhart & Hinton, 1986, p. 1). However, when connectionism is described (and, in particular, criticized) as being merely implementational (e.g., Broadbent, 1985; Fodor & Pylyshyn, 1988), connectionists beg to differ. "Our primary concern is with the computations themselves, rather than the detailed neural implementation of these computations" (Rumelhart & McClelland, 1986, p. 138). Should connectionism thus be viewed as being primarily concerned with issues of competence? Not necessarily — it has been claimed that PDP models are essentially procedural or algorithmic (e.g., Rumelhart & McClelland, 1985). At first glance, this diversity of positions gives connectionism an uncertain — if not downright mysterious — position in mainstream cognitive science.

However, on closer consideration, this diversity is exactly what one should expect. This is because while connectionist researchers have proposed a type of information processing that is quite different from that

proposed by classicists, they have not abandoned the general cognitivist notion that intelligence is information processing (see Fodor & Pylyshyn, 1988, pp. 7-11). As a result, connectionist models must also be considered at each of the computational, algorithmic, and implementational levels. The citations in the preceding paragraph merely illustrate this possibility. In the following sections, we elaborate this perspective on connectionism by examining PDP networks from the tri-level perspective in order to ascertain the relationship between connectionist theory and cognitive science. First, we consider the computational power of these networks. Second, we consider the types of algorithms or procedures that these networks define. Third, we consider the relationship between the PDP architecture and neurophysiology. We show, at each level of description, that while current PDP models have some intriguing properties, their potential contribution to cognitive science is uncertain.

COMPUTATIONAL DESCRIPTIONS OF PDP NETWORKS

PDP Networks Are Powerful Information Processing Systems

A computational description of an information processor accounts for the system's competence — it defines the kinds of functions that a system can compute. In cognitive science, descriptions of this sort are generally used to fulfil two different purposes. First, these accounts can be used to rigorously define information processing problems as the first step in a top-down research programme that has as its ultimate goal the creation of a working computer model (e.g., Marr, 1982). Second, computational analyses can be used to assess the potential adequacy of a general class of model, by determining whether the kinds of functions it can compute are sufficiently rich to capture interesting cognitive regularities. Computational analyses of connectionist systems have focused on this second purpose.

Many researchers have argued that, in principle, PDP networks are extremely powerful information processing systems. Below we briefly

review the evidence for three different claims of this sort: that PDP networks are functionally equivalent to Universal Turing machines, that PDP networks are arbitrary pattern classifiers, and that PDP networks are universal function approximators.

Connectionist networks are equivalent to Universal Turing Machines. Even a cursory look at connectionist theory indicates that it is very similar to classical associationism (e.g., Bechtel, 1985; Bechtel & Abrahamsen, 1991, pp. 101-103). However, this resemblance is also disconcerting. It can be strongly argued that associationist models are formally equivalent to finite state automata, and as a result are not powerful enough in principle to instantiate human cognition (e.g., Bever, Fodor & Garrett, 1968). This is why classical AI attempts to design models that are equivalent to Universal Turing Machines (UTMs). If connectionist systems were equivalent to classical associationist models, then their limited computational power would make them extremely unattractive to cognitive science (see also Fodor & Pylyshyn, 1988; Lachter & Bever, 1988).

However, strong arguments have been made that PDP models have the same competence as classical AI systems. In some of the earliest work on neural networks, McCulloch and Pitts (1943/1988) examined finite networks whose components could perform simple logical operations like AND, OR, and NOT. They were able to prove that such systems could compute any function that required a finite number of these operations. From this perspective, the network was only a finite state automaton (see also Hopcroft & Ullman, 1979, p. 47; Minsky, 1972, Chap. 3). However, McCulloch and Pitts went on to show that a UTM could be constructed from such a network, by providing the network a means to move along, sense, and rewrite an external "tape" or memory. "To psychology, however defined, specification of the net would contribute all that could be achieved in that field" (McCulloch & Pitts, 1943/1988, p. 25).
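The flavour of McCulloch and Pitts' construction can be conveyed in a few lines of Python. The sketch below implements simple threshold units computing AND, OR, and NOT; the particular weights and thresholds are our illustrative choices, not values from the original paper.

    def mcp_unit(inputs, weights, threshold):
        # A McCulloch-Pitts unit fires (outputs 1) when its weighted
        # input sum reaches threshold, and stays silent (0) otherwise.
        total = sum(w * x for w, x in zip(weights, inputs))
        return 1 if total >= threshold else 0

    def AND(a, b): return mcp_unit([a, b], [1, 1], threshold=2)
    def OR(a, b):  return mcp_unit([a, b], [1, 1], threshold=1)
    def NOT(a):    return mcp_unit([a], [-1], threshold=0)

    # Any function requiring a finite number of such operations can be
    # computed by composing units like these into a finite network.
    assert AND(1, 1) == 1 and AND(1, 0) == 0
    assert OR(0, 1) == 1 and NOT(1) == 0 and NOT(0) == 1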

Connectionist networks are arbitrary pattern classifiers. Connectionist networks are also commonly used to classify patterns (for reviews, see Carpenter, 1989; Lippmann, 1987, 1989). Essentially, the set

of input activities for a particular stimulus defines the location of a point in a multidimensional pattern space. The network "carves" this pattern space into different decision regions, which potentially can have complex shapes. The network classifies the input pattern by generating the "name" (i.e., a unique pattern of output unit activity) of the decision region in which the stimulus point is located.

When a network is described as a pattern classifier, claims about computational power focus on the decision regions that it can "carve", because this defines the complexity of the classifications that can be performed. For example, a network with a monotonic activation function in its output unit, and no intermediate processors, has only the limited ability to decide whether an input belongs to one of two distinct categories: this network can only "carve" a single hyperplane through the multidimensional pattern space. Stimuli located to one side of the hyperplane are assigned one category label, and stimuli located to the other side are assigned the second category label (see Figure 2). Such systems cannot learn the simple XOR relationship, because it requires a more sophisticated partitioning of the pattern space — specifically, two parallel hyperplanes. This more complicated partitioning can be accomplished either by adding a single layer of hidden processors (e.g., Rumelhart, Hinton & Williams, 1986b, pp. 319-320) or by using a nonmonotonic activation function (see Figure 2c).
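The XOR limitation, and its solution, can be demonstrated directly. In the Python sketch below (our own hand-set weights, chosen for illustration), a single layer of hidden threshold units carves the two parallel hyperplanes that XOR requires:

    def threshold_unit(inputs, weights, bias):
        # Fires when the input falls on the positive side of the
        # hyperplane defined by the weights and bias.
        net = sum(w * x for w, x in zip(weights, inputs)) + bias
        return 1 if net > 0 else 0

    def xor_net(x1, x2):
        h1 = threshold_unit([x1, x2], [1, 1], bias=-0.5)  # "at least one on"
        h2 = threshold_unit([x1, x2], [1, 1], bias=-1.5)  # "both on"
        # The output unit responds only to points between the two planes.
        return threshold_unit([h1, h2], [1, -1], bias=-0.5)

    for pattern in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        print(pattern, xor_net(*pattern))  # prints 0, 1, 1, 0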

How many additional layers of processors are required to partition the pattern space into arbitrary decision regions, and thus make a network capable of any desired classification? Lippmann (1987, p. 16), by considering the shape of decision regions created by each additional layer of monotonic processors, has shown that a network with only two layers of hidden units (i.e., a three-layer network) is capable of "carving" a pattern space into arbitrary decision regions. "No more than three layers are required in perceptron-like feed-forward nets".



Fig. 2: (A) A simple pattern classification network with two input units and one output unit. Its "carving" of a two-dimensional pattern space depends upon the output unit's activation function. (B) With a monotonic "squashing" function like the logistic, the network carves the pattern space into two regions with a single plane, as illustrated by the "shadow" cast by the network's receptive field. (C) With the Gaussian function used in Dawson and Schopflocher's (1992b) value units, the pattern space is carved into three regions by two parallel planes. (D) With the Gaussian function used in RBF networks, highly localized regions of the pattern space can be selected.


Connectionist networks are universal approximators. Historically, PDP networks have been most frequently described as pattern classifiers. Recently, however, with the advent of so-called radial basis function (RBF) networks (e.g., Girosi & Poggio, 1990; Hartman, Keeler & Kowalski, 1989; Moody & Darken, 1989; Poggio & Girosi, 1989, 1990; Renals, 1989), connectionist systems are now often described as function approximators. Imagine, for example, a mathematical "surface" defined in N-dimensional space. At each location in this space this surface has a definite height. A function approximating network with N input units and one output unit would take as input the coordinates of a location in this space, and would output the height of the surface at this location.

How powerful a function approximator can a PDP network be? Rumelhart, Hinton and Williams (1986b, p. 319) have claimed that "if we have the right connections from the input units to a large enough set of hidden units, we can always find a representation that will perform any mapping from input to output". More recently, researchers have attempted to analytically justify this bold claim, and have in fact proven that many different kinds of networks are universal approximators. That is, if there are no restrictions on the number of hidden units or the size of the connection weights in the network, then in principle a network can be created to approximate — over a finite interval — any continuous mathematical function to an arbitrary degree of precision. This is true for networks with a single layer of hidden units whose activation function is a sigmoid-shaped "squashing" function (e.g., Cotter, 1990; Cybenko, 1989; Funahashi, 1989), for networks with multiple layers of such hidden units (Hornik, Stinchcombe & White, 1989), and for RBF networks (Hartman, Keeler & Kowalski, 1989).
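The flavour of these results can be conveyed numerically. The sketch below — our illustration, not a construction taken from the cited proofs — fixes random input-to-hidden weights for a single layer of logistic hidden units and fits only the hidden-to-output weights by least squares; over a finite interval, the approximation of sin(x) typically improves as hidden units are added.

    import numpy as np

    rng = np.random.default_rng(0)
    x = np.linspace(-np.pi, np.pi, 200)[:, None]
    target = np.sin(x).ravel()

    def max_error(n_hidden):
        # Logistic hidden layer with random incoming weights and biases.
        w = rng.normal(size=(1, n_hidden))
        b = rng.normal(size=n_hidden)
        hidden = 1.0 / (1.0 + np.exp(-(x @ w + b)))
        # Fit the hidden-to-output weights by least squares.
        out_w, *_ = np.linalg.lstsq(hidden, target, rcond=None)
        return np.max(np.abs(hidden @ out_w - target))

    for n in (2, 10, 50):
        print(n, "hidden units -> worst-case error:", max_error(n))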

How Are Claims About PDP Competence Related to Cognitive Science?

It is obvious from the discussion above that PDP networks can be described as very powerful computational systems. Nevertheless, there are a number of indications that such competence is not by itself sufficient to

claim that PDP models are appropriate for cognitive science:

Competence is irrelevant in the absence of performance. Proofs about computational ability can be extremely powerful, insofar as they may rule out a proposal for an architecture of cognition (e.g., Bever, Fodor & Garrett, 1968). Nevertheless, the demonstration that a connectionist architecture has the competence of a UTM only suggests its plausibility for psychological modelling; it does not justify adopting such networks as the most plausible candidate. This is because one can have powerful computational competence in machines whose performance characteristics clearly rule them out for modelling purposes. For example, the tape head of a Turing machine is limited to relative memory access, and thus performs painstakingly slow serial manipulations of input data. As a result, researchers do not seriously propose this particular type of machine for cognitive models, even though it has enormous competence. Instead, they explore architectures like production systems that have the same computational power, but have far more powerful performance characteristics (for an introduction, see Haugeland, 1985, Chap. 4). Thus, while computational claims are important, they do little to constrain or motivate the use of connectionist systems in cognitive science (see also Massaro, 1988).

The structure/process distinction and finite state automata. Dawson and Schopflocher (1992a) have argued that a fundamental difference between classical and connectionist models emerges when one considers how the two approaches distinguish between data structures and the processes that manipulate them. In a classical architecture a set of symbols does not itself comprise an autonomous representational system, because full-fledged computation requires that these symbols be manipulated by additional external processes that are sensitive to symbolic structure. A prototypical example is the Turing machine, whose processes are instantiated in the structure of the tape head, which in turn manipulates tokens written on an external tape. In contrast, PDP networks are designed to "exhibit intelligent behaviour without storing, retrieving, or otherwise operating on structured symbolic expressions" (Fodor & Pylyshyn, 1988,

p. 5). This rejection of the structure/process distinction is the characteristic that differentiates a PDP network from a classically defined symbol processor. "Knowledge is not directly accessible to interpretation by some separate processor, but it is built into the processor itself and directly determines the course of processing" (Rumelhart, Hinton & McClelland, 1986, pp. 75-76).

While the absence of a structure/process distinction may define how a PDP network differs from a classical system, it also imposes limitations on the network's computational power. Recall that in order to achieve the computational power of a UTM, an external memory structure had to be provided to the network (McCulloch & Pitts, 1943/1988). This implies that if one does not distinguish between structure and data in a connectionist system — if this external data store is not provided — then one may not be able to produce networks of sufficient computational power to be of interest to cognitive science. Furthermore, this problem is not avoided by proofs that networks are universal approximators: Levelt (1990) has argued from such proofs that PDP networks are merely finite state automata, because they can only approximate functions over a finite interval. This limitation is also at the heart of Fodor and Pylyshyn's (1988) criticism that PDP systems are insensitive to the constituent structure of complex tokens.

Do brains and PDP networks have the same competence? That PDP networks are universal approximators is quite interesting in principle, but less compelling in practice. Proofs of universal approximation place no limits on the number of processing units in the network, or on the strengths of its connections. Thus, to approximate some functions, one might require numbers of processors, or connection weight values, that are impossible in finite biological systems (see also Ballard's [1986] discussion of the packing problem). As well, it is not clear that the brain itself is designed to be a universal approximator, because such devices are not without interesting limitations. For example, after being trained to approximate some function, an RBF network can generalize its performance and respond correctly to new instances. However, this ability to generalize requires that the approximated function be smooth and piecewise continuous (e.g., Poggio & Girosi, 1990). This property is not true of


Boolean functions (i.e., functions of the form {0,1}^N → {0,1}^M) typically used to study pattern classification in neural networks. Indeed, universal approximators like RBF networks have difficulties in learning such functions (e.g., Moody & Darken, 1989). Neuroscientists often describe the brain in a fashion suggesting that its primary function is pattern categorization (e.g., Kuffler, Nicholls & Martin, 1984). If this is true — if it is not a function approximator — then proofs that PDP networks are universal approximators may not be pertinent to cognitive science.
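The localized response that underlies both the power and the smoothness requirement of RBF units is easy to exhibit (a minimal sketch, with a Gaussian of illustrative width):

    import math

    def rbf_unit(pattern, centre, width=1.0):
        # A radial basis function unit responds maximally when the input
        # lies at its centre and falls off smoothly with squared distance,
        # selecting a localized region of the pattern space (Figure 2d).
        dist_sq = sum((x - c) ** 2 for x, c in zip(pattern, centre))
        return math.exp(-dist_sq / (2.0 * width ** 2))

    print(rbf_unit([0.0, 0.0], centre=[0.0, 0.0]))  # 1.0 at the centre
    print(rbf_unit([3.0, 0.0], centre=[0.0, 0.0]))  # near 0.0 far away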

However, the nature of proofs about the pattern classification abilities of PDP networks also raises serious questions about the relationship between PDP models and brain-mediated psychological processes. Specifically, Lippmann (1987) shows that only two layers of hidden units are required to mediate arbitrary pattern classification. However, it is also clear that the brain can be described as being composed of a very large number of processing layers (e.g., Kuffler, Nicholls & Martin, 1984, Chap. 2). If adequate computational power can be achieved with a small number of processing layers, then why does the brain not have a simpler structure?

Furthermore, the demonstrated competence of relatively simple PDP architectures has discouraged researchers from considering more complicated many-layered architectures, which could bear a stronger relationship to actual brain function. For example, after noting that multilayer RBF networks are possible in principle, Poggio and Girosi (1989, p. 58) point out that "there is no reason to believe, however, that such 'multilayer' functions represent a large and interesting class". From the view of computational competence, this claim may indeed be true. However, if one of the intents of connectionism is to provide a bridge between psychology and neuroscience, then this view is disturbing.

Computational Connectionism and Cognitive Science

While neuroscientists have made tremendous strides in describing neural structure, they often have very little to say about why this structure

exists, in particular when other structures are logically plausible (e.g., Braitenberg, 1984, pp. 96-99). Unfortunately, most computational analyses of connectionist systems ignore such questions, and are concerned instead with general competence. We feel this is unfortunate because some researchers have shown that PDP networks provide a rich computational medium in which one can generate formal answers to "Why?" questions about brain structure.

For example, why are functions localized in the brain? Ballard (1986) provides a formal argument that this is a natural solution to the so-called packing problem. By localizing functions, a network with a finite number of processors can incorporate a greater diversity of computational abilities than an identically sized network that does not localize functions. Why does the brain have so many processing layers? Servan-Schreiber, Printz and Cohen (1990) have proven that if one manipulates the gain of activation functions in PDP networks (i.e., the "slope" of sigmoidal activation functions), then the ability to identify signals in noisy environments increases — but only if multiple layers of processing units are employed.
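The gain manipulation itself is simple to state: the net input is scaled by a gain parameter before the logistic is applied, so that high gain sharpens the unit's response. A minimal sketch with illustrative values:

    import math

    def logistic(net, gain=1.0):
        # Gain scales the slope of the sigmoid; as gain grows, the
        # unit approaches all-or-none behaviour around zero net input.
        return 1.0 / (1.0 + math.exp(-gain * net))

    weak_signal = 0.2   # a small net input, as if embedded in noise
    for gain in (0.5, 1.0, 5.0):
        print(gain, logistic(weak_signal, gain))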

In our opinion, if computational analyses of PDP systems are to contribute to cognitive science in general, then these analyses should move away from issues of general competence. Computational researchers should instead focus upon more specific issues concerning why the brain might have particular structural characteristics.

ALGORITHMIC DESCRIPTIONS OF PDP NETWORKS

PDP Networks Are Themselves Algorithms

Informally, an algorithm is a completely mechanical procedure for performing some computation — "an infallible, step-by-step recipe for obtaining a prespecified result" (Haugeland, 1985, p. 65). In cognitive science, an algorithm is viewed by many researchers not only as a fundamentally important theoretical notion, but also as a practical goal for

models-as-explanations. "If the long promised Newtonian revolution in the study of cognition is to occur, then qualitative explanations will have to be abandoned in place of effective procedures" (Johnson-Laird, 1983, p. 6).

In PDP connectionism, a network can itself be described as an effective procedure for some function, or for categorizing some patterns. Indeed, a tremendous amount of enthusiasm for connectionism has been fueled by specific demonstrations that PDP networks offer practical algorithms for a diverse range of problems. A wide range of connectionist systems have been proposed to model aspects of memory (e.g., Anderson, 1972; Anderson, Silverstein, Ritz & Jones, 1977; Eich, 1982; Grossberg, 1980; Knapp & Anderson, 1984; Murdock, 1982). Connectionist networks have a long history (e.g., Selfridge, 1956), have become benchmarks to which other methods are compared (e.g., Barnard & Casasent, 1989), and can outperform standard methods for such tasks as speech recognition (e.g., Bengio & de Mori, 1989). Connectionists have successfully used networks to solve problems related to locomotion (e.g., Brooks, 1989; Pomerleau, 1991), and have designed systems to mediate behaviours once thought to be exclusive to classical systems, such as performing logical inferences (Bechtel & Abrahamsen, 1991, pp. 163-174) and performing linguistic transformations or sentence parsing (e.g., Jain, 1991; Lucas & Damper, 1990; Rager & Berg, 1990).

Problems With PDP Algorithms

In spite of these successes, the contributions of connectionist algorithms to cognitive science are somewhat suspect. First, researchers are beginning to challenge the ability of such models to capture the right empirical regularities. Second, such models are often exceedingly difficult to interpret, which mitigates their explanatory usefulness as effective procedures. Third, in many cases the functional architecture of these networks is not completely specified. We consider each of these problems below.


PDP networks may fail to capture interesting empirical regularities. Pylyshyn (1980, 1984) has argued strongly that a fundamental goal of cognitive science's theories is to capture rich sets of empirical regularities. Indeed, the success of computer simulations of psychological phenomena is often measured by the program's ability not only to make the same correct judgements as humans, but also to make similar mistakes. Merely generating "intelligent" behaviour does not guarantee a successful niche in cognitive science for an implemented theory, which is why neither computerized chess boards nor pocket calculators are viewed as proposals for how humans play chess or perform mental arithmetic.

One important characteristic of connectionism has been the claim that PDP networks capture the right kinds of regularities for an empirical cognitive science. For example, one reason for the recognized importance of Rumelhart and McClelland's (1986) network that transforms verbs into the past tense was that during training it produced overgeneralization errors similar to those observed in children. Similarly, much of the interest in distributed connectionist memories is due to the kinds of errors these systems produce (see, for example, Eich, 1982).

The claim that connectionist systems are capable of capturing sufficiently rich empirical regularities has far-reaching consequences, because the behaviour of PDP networks is putatively mediated by mechanisms that bear little relationship to those proposed in classical models (for an example of strong claims of this sort, see Seidenberg & McClelland, 1989). Connectionists are now challenging the "realistic" status of classical theories — the view that such accounts reflect actual "theories in the head". PDP researchers are proposing that classical theories are not valid explanations, but are merely instrumentalist descriptions. The proper account of mentality, they argue, is reflected in explanations of the dynamic properties of connectionist models. "Subsymbolic models accurately describe the microstructure of cognition, whereas symbolic models provide an approximate description of the macrostructure" (Smolensky, 1988, p. 12).


This challenge to classical cognitive science required PDP models to generate the same behaviour as that observed in human subjects. Recently, however, this ability has been strongly contested. Several prominent connectionist models, which have spearheaded the assault on classical models of data, have been carefully examined, and have been found wanting (e.g., Pinker and Prince's [1988] critique of Rumelhart and McClelland's [1986] verb transformation network; Besner, Twilley, McCann & Seergobin's [1990] examination of the Seidenberg and McClelland [1989] grapheme-to-phoneme network). The general theme of these critiques is that PDP networks capture some, but not all, of the empirical regularities thought to be critical to understanding the psychological phenomena being modelled.

The connectionist response to such criticisms is to moderate their claims about the models. They argue that because of practical limitations, the networks that they create should not be expected to capture all of the relevant results (e.g., Seidenberg & McClelland, 1990). However, because these simple systems can account for some interesting data, it is argued that they warrant serious consideration. The suggestion is that as networks become larger and more sophisticated, they will be able to account for a broader range of empirical phenomena. There is certainly merit in this position, but it should be recognized for what it is: a promissory note. The enthusiastic predictions of connectionists about the future performance of larger networks should be tempered by the knowledge that the properties of small PDP networks often disappear when their size is scaled up (e.g., Minsky & Papert, 1969/1988, pp. 261-266).

PDP algorithms are extremely difficult to interpret. In many cases it is extremely difficult to determine how connectionist networks accomplish the tasks that they have been taught. "One thing that connectionist networks have in common with brains is that if you open them up and peer inside, all you can see is a big pile of goo" (Mozer & Smolensky, 1989, p. 3). There are a number of reasons that PDP networks are difficult to understand as algorithms.


First, they are rarely developed a priori — instead, a generic learning rule is used to develop useful (algorithmic) structures in a network that is initially random. Thus, one does not need a theoretical account of a to-be-learned task before a network is created to do it. Second, general learning procedures can train networks that are extremely large; their sheer size and complexity makes them difficult to interpret. For example, Seidenberg and McClelland's (1989) network for computing a mapping between graphemic and phonemic word representations uses 400 input units, up to 400 hidden units, and 460 output units. Determining how such a large network maps a particular function is an intimidating task. Third, most interesting PDP networks incorporate nonlinear activation functions. This nonlinearity makes these models more powerful than those that only incorporate linear activation functions (e.g., Jordan, 1986), but it also requires that descriptions of their behaviour be particularly complex. Fourth, connectionist architectures offer too many degrees of freedom for the generation of working systems. One learning rule can create many different networks — for instance, containing different numbers of hidden units — that each compute the same function. Each of these systems can therefore be described as a different algorithm for computing that function. One does not have any a priori knowledge of which of these possible algorithms might be the most plausible as a psychological theory of the phenomenon being studied.

Johnson-Laird (1983, p. 4) has noted that "to understand a phenomenon is to have a working model of it". Interestingly, PDP models appear to prove this statement false, because connectionists can easily replace one unknown (e.g., how the brain mediates some psychological phenomenon) with another — a functioning but unexplained network (see also Lewandowsky, 1993; McCloskey, 1991).

The generic connectionist architecture is incomplete. One of the common arguments for using computer simulation methodology in cognitive science is that such models force researchers to be extraordinarily explicit about their assumptions and their theoretical statements. Vague theories do not result in working computer programs.


Connectionist proponents imply that a PDP network — a program written in the functional architecture of generic connectionism — defines a particularly explicit theory. The components of generic connectionism are quite simple. As a result, one could imagine using a diagram of a trained network as a circuit diagram; each pictured processor and connection could be emulated by a simple electronic (or biological) component. The result would be a physical device capable of carrying out all of the computations attributed to the original network. Thus, connectionists claim that PDP networks comprise an autonomous representational system — one need not appeal to external rules or processes to explain how these networks function or learn. "Much of the allure of the connectionist approach is that many connectionist networks program themselves, that is, they have autonomous procedures for tuning their weights to eventually perform some specific computation" (Smolensky, 1988, p. 1, his italics).

However, Dawson and Schopflocher (1992a) have shown that in actuality PDP networks cannot be easily implemented in this sense. In short, if one were to build a diagrammed network by replacing its generic connectionist components with functionally equivalent electronic parts, then the electronic network would not be capable of all the behaviours attributed to the network — it would not be an autonomous system. This is because the components of the generic connectionist architecture are not by themselves sufficient for the intended task.

Dawson and Schopflocher (1992a) make their case by analysing in detail an extremely simple associative memory model. Figure 3a illustrates the PDP version of the model; diagrams of this system have a long history in the connectionist literature (e.g., Kohonen, 1977, Fig. 1.9; McClelland & Rumelhart, 1988, Chap. 4, Fig. 3; Rumelhart, McClelland & the PDP Group, 1986, Chap. 1, Fig. 12, Chap. 9, Fig. 18, Chap. 12, Fig. 1, Chap. 18, Fig. 3; Schneider, 1987, Fig. 1; Steinbuch, 1961, Fig. 2; Taylor, 1956, Figs. 9 & 10). The purpose of this model is to learn the association between pairs of activity patterns presented simultaneously to the two banks of processing units. Under the restriction that all activity patterns are mutually orthogonal, a model with N units in each input bank is capable of storing

information about N different pattern pairs in its connections. Because this network is a distributed memory system, and because its mathematical properties are quite easily described, it is often used to introduce the basic ideas of connectionism (e.g., Jordan, 1986).
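A minimal sketch of such a memory, using the standard Hebbian outer-product rule on mutually orthogonal, normalized patterns (our implementation, with illustrative patterns):

    import numpy as np

    # Two orthogonal, unit-length patterns for bank 1, and their partners.
    a1 = np.array([1, 1, 1, 1]) / 2.0
    a2 = np.array([1, -1, 1, -1]) / 2.0
    b1 = np.array([0.0, 1.0, 0.0, 1.0])
    b2 = np.array([1.0, 0.0, 1.0, 0.0])

    # Learning: each association is stored as an outer product, and the
    # connection matrix is simply the sum of the stored associations.
    W = np.outer(b1, a1) + np.outer(b2, a2)

    # Recall: presenting a stored pattern to bank 1 retrieves its partner.
    print(W @ a1)   # recovers b1
    print(W @ a2)   # recovers b2

Note that in this sketch the decision to store rather than recall is made by the surrounding program, not by the network itself.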

Nevertheless, Dawson and Schopflocher (1992a) argue that the network in Figure 3a is not capable of learning associations between activity patterns without the help of a controller that is external to the network. They point out that the Figure 3a network requires an external signal to tell whether it should be learning a new association, or whether it should be recalling an old one. Furthermore, it requires additional processing and controlling abilities to modify connection weights — connections cannot merely be single, numerical values. Thus Dawson and Schopflocher conclude that PDP networks by themselves are not sufficiently powerful to simultaneously represent and manipulate contents. As a result, it would be a mistake to assume that diagrams created from the generic connectionist architecture provide better, more explicit, or more easily realizable programs than would be available from classical AI.

Algorithmic Connectionism and Cognitive Science

We believe that if connectionism is to provide algorithmic contributions to cognitive science, then researchers should attempt to develop networks that are truly autonomous. However, if this is to be achieved, then the kinds of networks typically proposed require substantial elaborations of the PDP architecture. This elaboration must be guided by an explicit statement of a functional architecture capable of solving the various control problems faced by an autonomous system. For example, Dawson and Schopflocher (1992a) propose a slightly elaborated connectionist architecture, and demonstrate how an autonomous pattern associator could be created from it (see Figure 3b). Without such a functional architecture, it is doubtful that connectionism can serve as a viable bridge between computational and physiological descriptions.



Fig. 3: (A) The standard pattern association network proposed by connectionists. Dawson and Schopflocher (1992a) have argued that this network cannot function autonomously. In its place, they propose a network constructed from an elaborated functional architecture. (B) A small version of the Dawson and Schopflocher network. Units marked with "I" are input units, with "O" are output units, and with "M" are memory units. Operators marked with "+" compute the sum of their inputs; operators marked with "X" compute the product of their inputs. Connections in this architecture have fixed weights equal to 1.


We also believe that important and lasting contributions of connectionism to cognitive science will require that connectionist algorithms be interpreted. Unfortunately, this is likely to be an extremely difficult task to accomplish: "There is a growing suspicion that discovering [how a network does its job] may require an intellectual revolution in information processing as profound as that in physics brought about by the Copenhagen interpretation of quantum mechanics" (Hecht-Nielsen, 1990, p. 10). Nevertheless, many promising approaches to network interpretation have already been identified.

One strategy is to develop networks that are (hopefully) maximally interpretable by reducing the number of their processing units to a minimum (e.g., Hagiwara, 1990; Mozer & Smolensky, 1989; Sietsma & Dow, 1988). For example, Mozer and Smolensky propose a measure of the relevance of each processor to a network's overall performance. They advocate a research strategy in which one starts by training a large network to accomplish some task. Then, the relevance of each processor is computed. Processors with sufficiently small relevance values are removed from the network. This procedure is repeated until each network processor has a high relevance value. A second strategy is to perform statistical analyses of the connection weights from a trained network. For example, Hanson and Burr (1990) illustrate a number of techniques for probing network structure, including compiling frequency distributions of connection strengths, quantifying global patterns of connectivity with descriptive statistics, illustrating local patterns of connectivity with "star diagrams", and performing cluster analyses of hidden unit activations. A third strategy is to map out the response characteristics of each processor in the network. For instance, Moorhead, Haig and Clement (1989) used the generalized delta rule to train a PDP network to identify the orientation of line segments presented to an array of input units. Their primary research goal was to determine whether the hidden units in this system developed centre-surround receptive fields analogous to those found in the primate lateral geniculate nucleus. They chose to answer this question by

stimulating each input element individually, and plotting the resulting activation in each hidden unit. In related work, Dawson, Kremer and Gannon (1993) have developed a simple rule for identifying the input pattern that produces the maximum activity in each hidden unit of a network. They were able to use this rule to demonstrate that in a network whose output units were trained to behave like complex cells, hidden units developed receptive fields analogous to simple cells.
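The receptive-field mapping strategy is straightforward to implement. The sketch below probes a small network by stimulating each input element individually and recording each hidden unit's response; the random weight matrix stands in for a trained network and is purely illustrative.

    import numpy as np

    rng = np.random.default_rng(1)
    n_inputs, n_hidden = 9, 4
    W = rng.normal(size=(n_hidden, n_inputs))   # trained weights would go here

    def hidden_activity(pattern):
        # Logistic hidden units driven by an input pattern.
        return 1.0 / (1.0 + np.exp(-(W @ pattern)))

    # Stimulate one input element at a time; each column of the result
    # is a crude "receptive field" profile for the probed input.
    probes = np.eye(n_inputs)
    fields = np.stack([hidden_activity(p) for p in probes], axis=1)
    print(fields.round(2))   # rows: hidden units; columns: probed inputs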

IMPLEMENTATIONAL DESCRIPTIONS OF PDP NETWORKS

An implementational description of an information processing system attempts to relate its representational and formal properties to the causal laws governing its mechanical structure. Classical cognitive science, because of its functionalist nature, has typically placed little emphasis on this type of description. Connectionism, however, is motivated by quite different considerations.

PDP Networks, Biological Plausibility, and Computational Relevance

Why has connectionism been so enthusiastically adopted by some cognitive scientists? One reason is that PDP models are claimed to be biologically plausible algorithms. In other words, when examining a diagram of a connectionist system, one could imagine that it illustrates a sufficient neural circuitry for accomplishing some task. This, it is argued, is not true of classical models. "No serious study of mind (including philosophical ones) can, I believe, be conducted in the kind of biological vacuum to which cognitive scientists have become accustomed" (Clark, 1989, p. 61).

In what way are PDP models intended to fill this "biological vacuum"? Generally speaking, these systems are "neuronally inspired" — processing units are roughly equivalent to neurons, and connections between processors are roughly equivalent to synapses (see, for example, the visual analogy rendered in Rumelhart, Hinton & McClelland, 1986, Fig. 1).


Neuronal inspiration also colours general assumptions about processing in PDP networks. As in the brain, all processors are assumed to work in parallel and to send signals that are a nonlinear function of their net input. The "knowledge" of the system is encoded in patterns of connectivity, because synaptic modification appears to be a general description of how the brain remembers information (e.g., Dudai, 1989).

In spite of connectionism's implementational intentions, neuroscientists are quite skeptical about the biological plausibility of the PDP architecture. A number of reasons are often cited for this skepticism. First, one can generate long lists of PDP architectural properties that are clearly not true of the brain (e.g., Crick & Asanuma, 1986; Smolensky, 1988, Table 1). As a result, PDP models are often vilified as oversimplifications by neuroscientists; Douglas and Martin (1991, p. 292) refer to them as "stick and ball models". Second, researchers find it extremely unlikely that rules like error backpropagation could be physiologically instantiated. This is because it is highly unlikely that the environment could specify a "training pattern" as accurately as is required by such rules (e.g., Barto, Sutton & Anderson, 1983), and because there is no evidence at all for neural connections capable of feeding an error signal backwards to modify existing connections (e.g., Bechtel & Abrahamsen, 1991, p. 57; Kruschke, 1990). In short, while biological networks are capable of autonomous learning, artificial networks are not (see also Dawson & Schopflocher, 1992a). Reeke and Edelman (1988, p. 144) offer this blunt assessment of the neurophysiological relevance of PDP connectionism: "These new approaches, the misleading label 'neural network computing' notwithstanding, draw their inspiration from statistical physics and engineering, not from biology".

However, these criticisms miss the mark. PDP networks are designed to be extreme simplifications that ignore many of the complex details true of neural systems (for an example, see Braham & Hamblen, 1990). This is because the PDP architecture is itself functionalist in nature. It attempts to capture just those properties of biological networks that are

computationally relevant. The intent of this enterprise is to describe neural networks in a vocabulary that permits one to make rigorous claims about what they can do, or about why the brain might have the particular structure that it does. For example, claims about the competence of neural networks only arise when one abstracts over neurophysiological details, and describes important aspects of neuronal function either mathematically or logically (e.g., McCulloch & Pitts, 1943/1988).

The functionalist philosophy that guides cognitivism has always argued that cognitive phenomena cannot be reduced to a single level of neurophysiological explanation because such an account cannot capture all of the important empirical generalizations (e.g., Pylyshyn, 1980, 1984). The message to neuroscience has been that in cognitivism, accounts at the three levels of computation, algorithm, and implementation are all equally necessary. Largely in response to theories of the "New Connectionism", the egalitarian emphasis of functionalism is apparently ebbing away. Some critics of connectionism have argued that if connectionism is primarily concerned with implementational issues, then it bears no relationship to cognitive science at all, because the fundamental properties of information processing systems must be captured at more abstract or functional levels of description (e.g., Broadbent, 1985; Fodor & Pylyshyn, 1988, pp. 64-69).

This type of argument is too strong, because it ignores the possibility that PDP research has the potential to build a strong bridge between neuroscience and functionalist theories. Classical theories in cognitive science require this bridge: First, cognitive science's commitment to the physical symbol system hypothesis (e.g., Newell, 1980) necessitates a physical account of information processing as well as a formal account. Second, current views in the philosophy of science (e.g., Cummins, 1983) note that functionalist explanations require that the dispositional properties of the functional architecture — the primitive "building blocks" of an information processing system — be subsumed under natural laws. Thus, if one proffers a functionalist explanation, it is not enough to merely state that a function has been subsumed; one must also provide an account of the mechanisms that instantiate the function.


However, there are many factors working against realizing connectionism's potential to subsume cognitive theory. This is because connectionists often make design decisions about their architecture without justifying them as computationally relevant properties of neural circuits. It is perfectly reasonable to propose an architecture that ignores complex properties of neural substrate with the goal of making computationally relevant properties explicit. It is quite another to create an architecture that incorporates properties that make it work, independent of whether these properties bear any relation to neural substrates whatsoever. Below, we review several instances of this latter practice.

Problems Arise Because of Biologically Undefended Design Decisions

Connectionists adopt monotonic activation functions. To a large extent, changes in conceptualizations of activation functions for processing units have been responsible for the evolution from less powerful, single layer networks of the "Old Connectionism" to the more powerful multiple layer networks of the "New Connectionism". For example, in a perceptron (e.g., Rosenblatt, 1962), the activation function for the output unit is a linear threshold function: If the net input to the unit exceeds a threshold, then it assumes an activation of 1, otherwise it assumes an activation of 0. This kind of activation function is roughly analogous to the "all-or-none law" governing the generation of action potentials in neurons (e.g., Levitan & Kaczmarek, 1991, pp. 37-44). However, linear threshold functions are characterized by mathematical properties that make them difficult to work with, and limit their power.

The learning procedures developed within the New Connectionism were made possible by reconceptualizing the linear threshold function with more tractable mathematical equations. For example, it has been quite common to adopt sigmoid-shaped "squashing" functions, like the logistic depicted in Figure 2b. The mathematical limits of this nonlinear equation are functionally equivalent to the two discrete states of the linear threshold function. However, the function itself is continuous, and therefore has a derivative. Because of this property, one can use calculus to determine rules

that will manipulate weights in such a way as to perform a gradient descent in an error space (e.g., Rumelhart et al., 1986b, pp. 322-327). In short, continuous activation functions permit one to derive powerful learning rules.
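The role of the derivative can be shown in a few lines. The sketch below implements a textbook gradient-descent update for a single logistic output unit under a squared-error measure; the training pattern and learning rate are our illustrative choices.

    import math

    def logistic(net):
        return 1.0 / (1.0 + math.exp(-net))

    def delta_step(w, bias, pattern, target, lr=0.5):
        # Because the logistic is continuous and differentiable, the
        # error gradient has a closed form: (a - target) * a * (1 - a).
        net = sum(wi * xi for wi, xi in zip(w, pattern)) + bias
        a = logistic(net)
        delta = (a - target) * a * (1.0 - a)
        new_w = [wi - lr * delta * xi for wi, xi in zip(w, pattern)]
        return new_w, bias - lr * delta

    w, b = [0.1, -0.3], 0.0
    for _ in range(1000):
        w, b = delta_step(w, b, [1.0, 1.0], target=1.0)
    print(logistic(w[0] + w[1] + b))   # has descended toward the target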

However, "squashing" functions have another mathematical property that makes these learning rules practical to apply. Such activation functions are monotonic — they are nondecreasing in relation to increases in net input. The derivation of the generalized delta rule (Rumelhart et al., 1986b, p. 325), and the derivation of learning rules for stochastic autoassociative networks like Boltzman machines (e.g., Müller & Reinhardt, 1990, p. 37) stipulate that activation functions be monotonic. If this condition is violated, then in practice the learning rule will almost always fail to work. For example, Dawson and Schopflocher (1992b) found that if processing units that had a particular nonmonotonic activation function (the Gaussian illustrated in Figure 2c) were inserted into a network trained with the standard version of the generalized delta rule, then quite frequently the network settled into a local minimum in which it did not respond correctly to all inputs.

The assumption that activation functions are monotonic appears to be a practical requirement for learning procedures. However, adopting this assumption for this reason alone is dangerous practice, because monotonicity does not appear to be universally true of neural mechanisms. For instance, Ballard (1986) uses a relatively coarse behavioural criterion (i.e., the response of a cell as a function of a range of net inputs) to distinguish between integration devices — neurons whose behaviour is well described with activation functions like the logistic in Figure 2b — and value units — neurons whose behaviour is well described with activation functions like the Gaussian in Figure 2c.
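The behavioural contrast between the two unit types is easy to see numerically. A minimal sketch, using a logistic for the integration device and a Gaussian of the general form used for value units (with illustrative parameters):

    import math

    def integration_device(net):
        # Monotonic: activation never decreases as net input grows.
        return 1.0 / (1.0 + math.exp(-net))

    def value_unit(net, mu=0.0):
        # Nonmonotonic: activation peaks at a preferred net input (mu)
        # and falls off on either side of it.
        return math.exp(-((net - mu) ** 2))

    for net in (-2.0, 0.0, 2.0):
        print(net, integration_device(net), value_unit(net))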

Nonmonotonicity becomes even more apparent at a more detailed level of analysis. It should be apparent that the activation function for a connectionist processor is strongly related to the mechanisms that produce action potentials in neurons. Major components of these mechanisms are


voltage-gated ion channels in nerve membranes (for an introduction, see Levitan & Kaczmarek, 1991, pp. 51-124). These channels allow ionic currents to pass through them, and as a result affect a neuron's resting membrane potential. In turn, changes in membrane voltage affect the likelihood that these channels are open or closed. In some cases, such as the potassium channel considered by Levitan and Kaczmarek (Figure 3-9), the relationship between this likelihood and membrane voltage is monotonic. In other important cases, it is decidedly nonmonotonic. For instance, as membrane voltage becomes positive, voltage-gated sodium channels begin to open. As the voltage continues to increase, the channel becomes inactive. The nonmonotonicity of the sodium channel played an important role in Hodgkin and Huxley's (e.g., 1952) quantitative modelling of the action potential.

PDP learning rules are not limited in principle to monotonic activation functions. For instance, Dawson and Schopflocher (1992b) derived a modified version of the generalized delta rule that is capable of training networks of value units (i.e., processors with Gaussian activation functions). Radial basis function (RBF) networks (e.g., Moody & Darken, 1989) also have units with nonmonotonic activation functions, although they use a net input function that in effect makes them behave monotonically (see Dawson & Schopflocher, 1992b, Figure 2c). Nevertheless, nonmonotonic activation functions appear to be more the exception than the rule in PDP modelling. Unfortunately, this appears to be because monotonic activation functions lead more easily to practical training methods, and not because such properties are characteristic of neural substrates.
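The distinction between value units and RBF units comes down to the net input function. In the sketch below (ours; the weights, centre, and the Gaussian's parameterization are illustrative assumptions), the same Gaussian is applied once to an inner-product net input, where activation rises and then falls, and once to a distance net input, where activation can only fall as distance from the centre grows:

```python
import numpy as np

def gaussian(z):
    """Value-unit style Gaussian, peaked at z = 0."""
    return np.exp(-np.pi * z ** 2)

x = np.array([0.5, 0.25])

# Value unit: Gaussian over an inner-product net input. As net input sweeps
# past mu, activation rises and then falls, so the unit is nonmonotonic.
w, mu = np.array([1.0, -1.0]), 0.0
value_unit = gaussian(np.dot(w, x) - mu)

# RBF unit: Gaussian over a *distance* net input. Distance is nonnegative, so
# activation only decreases with distance; in effect the unit behaves monotonically.
center = np.array([1.0, -1.0])
rbf_unit = gaussian(np.linalg.norm(x - center))

print(value_unit, rbf_unit)
```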

Connectionists assume modifiable biases. A tenet of the PDP approach is that connection "weights are usually regarded as encoding the system's knowledge. In this sense, the connection strengths play the role of the program in a conventional computer" (Smolensky, 1988, p. 1). Thus the basic goal of a connectionist learning rule is to manipulate the pattern of connectivity in a network. This is in accordance with current understanding of actual neural circuits that are capable of learning. For example, experimental studies of the gill withdrawal reflex in Aplysia californica have indicated that learning alters the efficacy of synapses between neurons (for a review, see Dudai, 1989, Chap. 4).

However, when networks are trained by supervised learning procedures like the generalized delta rule (e.g., Rumelhart, Hinton & Williams, 1986a, 1986b), the pattern of connectivity in the network is not all that is changed. It is quite typical to modify the bias values of the activation function as well. For a sigmoid "squashing" function like the logistic, bias is a parameter that positions the activation function in net input space; changing bias is equivalent to translating the activation function along an axis representing net input. Biases can be modified by construing them as connection weights emanating from a "bias processor" that is always on (e.g., Rumelhart et al., 1986b, footnote 1). A processing unit's bias is manipulated by modifying the strength of the connection between the unit and its respective "bias processor". However, it is important to note that this is merely a description of how biases can be learned. "Bias processors" are not presumed to exist in a network. Instead, modifying the bias of a unit's activation function is analogous to directly modifying a neuron's threshold for generating an action potential.
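The "bias processor" description is easy to make concrete. In the sketch below (ours, for illustration), the bias is rewritten as a connection weight from a unit whose activation is always 1, so that any rule for learning weights learns the bias too:

```python
import numpy as np

def logistic(net):
    return 1.0 / (1.0 + np.exp(-net))

x = np.array([0.3, 0.7])              # ordinary inputs
w = np.array([0.5, -0.2])             # ordinary connection weights
bias = 0.1

# Explicit bias term: translates the logistic along the net-input axis.
a1 = logistic(np.dot(w, x) + bias)

# Equivalent formulation: a "bias processor" that is always on (activation 1),
# whose connection weight *is* the bias. A weight-learning rule applied to the
# augmented weight vector therefore learns the bias as a side effect.
x_aug = np.append(x, 1.0)
w_aug = np.append(w, bias)
a2 = logistic(np.dot(w_aug, x_aug))

assert np.isclose(a1, a2)             # the two descriptions are identical
```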

The assumption that bias can be modified violates the connectionist tenet that all that matters are patterns of connectivity. This in itself is not problematic. The problem arises in justifying this design decision — in defending the existence of modifiable biases in the connectionist architecture. In point of fact, there is little evidence that threshold membrane potentials in real neural networks are modifiable. For example, Kupfermann, Castellucci, Pinsker and Kandel (1970) demonstrated that in the neural circuits that mediate the gill withdrawal reflex in Aplysia, the thresholds of motor neurons to constant external current did not change as a function of learning. It was concluded that learning only modified synaptic properties in this circuit. Similarly, neuroscientists concerned with learning in the mammalian brain have focused on a particular memory mechanism, the long term potentiation of synapses (for reviews, see Cotman, Monaghan & Ganong, 1988; Massicotte & Baudry, 1991). To our knowledge, neuroscientists do not believe that neuron thresholds are themselves plastic.

Why does PDP connectionism use modifiable bias terms, when this manoeuvre does not appear to be supported by extant neurophysiological evidence? The answer appears to be that without modifiable biases, some PDP networks are extremely difficult to train. For example, Dawson, Schopflocher, Kidd and Shamanski (1992) trained standard networks on the encoder problem. In a control condition, typical learning procedures were used, and processor biases were modified. In an experimental condition, the networks were identical, with the exception that after being assigned initial random values, all biases were fixed during the training session. While the control networks had little difficulty in learning solutions to the encoder problem, none of the experimental networks succeeded — even under a variety of training conditions (i.e., different learning rates, momentum values, starting states), and even with an extremely relaxed definition of learning to criterion.
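The logic of that experiment can be sketched in a few lines. The following toy program (ours; the 4-2-4 network size, learning rate, epoch count, and error measure are illustrative assumptions, not Dawson et al.'s settings) trains an encoder network with standard backpropagation, with a flag that freezes the biases at their initial random values. It merely reproduces the manipulation; any particular run's outcome will depend on the seed and the training parameters.

```python
import numpy as np

def logistic(net):
    return 1.0 / (1.0 + np.exp(-net))

def train_encoder(freeze_biases, epochs=20000, lr=0.5, seed=0):
    rng = np.random.default_rng(seed)
    X = np.eye(4)                                  # 4-2-4 encoder: targets equal inputs
    W1, b1 = rng.uniform(-1, 1, (4, 2)), rng.uniform(-1, 1, 2)
    W2, b2 = rng.uniform(-1, 1, (2, 4)), rng.uniform(-1, 1, 4)
    for _ in range(epochs):
        H = logistic(X @ W1 + b1)                  # hidden activations
        O = logistic(H @ W2 + b2)                  # output activations
        dO = (O - X) * O * (1 - O)                 # backpropagated error terms
        dH = (dO @ W2.T) * H * (1 - H)
        W2 -= lr * H.T @ dO                        # connection weights always change...
        W1 -= lr * X.T @ dH
        if not freeze_biases:                      # ...biases only in the control condition
            b2 -= lr * dO.sum(axis=0)
            b1 -= lr * dH.sum(axis=0)
    return np.max(np.abs(O - X))                   # worst output error after training

print("modifiable biases, worst error:", train_encoder(freeze_biases=False))
print("frozen biases,     worst error:", train_encoder(freeze_biases=True))
```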

This is not to say that connectionist research in general is fundamentally flawed because it requires modifiable biases. Many architectures can use fixed thresholds in their processing units, including Hopfield nets (e.g., Hopfield, 1982), Boltzmann machines (e.g., Ackley, Hinton & Sejnowski, 1985), and value unit networks (e.g., Dawson et al., 1992). The critical issue is that modifiable biases are adopted in some architectures without being neurophysiologically justified. If it is indeed the case that plastic neural circuits do not have directly modifiable thresholds, then such justification is important, because without it certain architectures may be deemed uninteresting to cognitive science.

Connectionists adopt massively parallel patterns of connectivity. The history of connectionism can be presented in capsule form as follows: In the beginning, connectionist networks had no hidden units. Minsky and Papert (1969/1988) then proved that such networks had limited competence, and were thus not worthy of further study. The New Connectionism was born when learning rules for networks with hidden units were discovered (e.g., Ackley, Hinton & Sejnowski, 1985; Rumelhart et al., 1986a). These rules gave researchers the ability to teach networks that were powerful enough to overcome the Minsky/Papert limitations. (Detailed versions of this history are provided in Hecht-Nielsen, 1990, pp. 14-19; Papert, 1988.)

What is interesting about this history is that it pins the blame for the limitations of the networks created by Old Connectionism on the number of processing layers. It neglects the fact that Minsky and Papert (1969/1988) were extremely concerned with a different type of limitation, the limited-order constraint. Under this constraint, the neural network is restricted to being local — there is no single processing unit that can directly examine every input unit's activity. For example, Minsky and Papert (pp. 56-59) prove that to compute the parity predicate (i.e., to assert "true" if an odd number of input units has been activated), a network requires at least one processor to be directly connected to every input unit. Such proofs still hold for modern multilayer perceptrons (see Minsky & Papert, 1969/1988, pp. 251-252).
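The intuition behind this proof can be demonstrated in a few lines: flipping any single input unit reverses the parity predicate, so a processor that ignores even one input unit cannot decide it. A minimal demonstration (ours):

```python
from itertools import product

def parity(bits):
    """The parity predicate: true iff an odd number of input units is active."""
    return sum(bits) % 2 == 1

# Flipping any single bit of any input pattern reverses the parity predicate.
# A deciding unit restricted to a proper subset of the inputs (a limited-order
# unit) therefore cannot compute it: the inputs it ignores still matter.
for bits in product([0, 1], repeat=4):
    for i in range(4):
        flipped = list(bits)
        flipped[i] ^= 1
        assert parity(flipped) != parity(bits)
print("parity depends on every one of the inputs")
```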

There is no doubt that the New Connectionist models are more powerful than their antecedents. However, this increased power is not only due to the addition of layers of hidden units, but is also due to the violation of the limited-order constraint: in these new models, hidden units typically have direct connections to every input unit. The fact that PDP models permit such massively parallel patterns of connectivity between input units and hidden units is unfortunate. While it is true that this design decision will increase the network's competence, it does this at the expense of both biological and empirical plausibility. With respect to the former issue, there is no evidence to indicate that, in human sensory systems, massively parallel connections exist between receptor cells and the next layer of neurons. Indeed, computational modellers of visual processing attempt to increase the biological plausibility of their models by enforcing spatially local connections among processing units (e.g., Ullman, 1979). With respect to the latter issue, humans may indeed be subject to computational limits due to the limited-order constraint. For example, Minsky and Papert (1969/1988, p. 13) used a small set of very simple figures to prove that a perceptron of limited order cannot determine whether all the parts of any geometric figure are connected to one another. Psychophysical experiments have shown that preattentive visual processes involved in texture perception (e.g., Julesz, 1981) and motion perception (Dawson, 1990) are insensitive to this property as well. It is quite plausible to suppose that this insensitivity is related to the fact that the neural circuitry responsible for registering these figures is not massively parallel.

Connectionists assume homogeneous processing units. One of the interesting properties of PDP models is their homogeneity (for an exception, see the hybrid networks described by Dawson & Schopflocher, 1992b). It is typically the case that all of the units in PDP networks are of the same type, and that all of the changes that occur during learning in the network are governed by a single procedure. "The study of Connectionist machines has led to a number of striking and unanticipated findings; it's surprising how much computing can be done with a uniform network of simple interconnected elements" (Fodor & Pylyshyn, 1988, p. 6).

In some sense, the homogeneous structure of PDP networks can be construed as "neuronally inspired". At a macroscopic level, neurons themselves appear to be relatively homogeneous; Kuffler, Nicholls and Martin (1984, Chap. 1) note that the nervous system uses only two basic types of signals, which are virtually identical in all neurons. Furthermore, these signals appear to be common to an enormous range of animal species; much of our molecular understanding of neuronal mechanisms comes from the study of invertebrate systems. "The brain, then, is an instrument, made of 10¹⁰ to 10¹² components of rather uniform materials, that uses a few stereotyped signals. What seems so puzzling is how the proper assembly of the parts endows the instrument with the extraordinary properties that reside in the brain" (Kuffler, Nicholls & Martin, p. 7). The answer to this puzzle, according to both neurophysiologists and connectionists, lies in understanding the complex and specific patterns of connectivity between these homogeneous components. Getting (1989, p. 186) has noted that in the neuroscience of the late 1960's "the challenge of uncovering the secrets to brain function lay in the unravelling of neural connectivity".

However, a more microscopic analysis of (apparently) relatively simple plastic neural circuits has revealed that neural networks have properties that are far more diverse and complicated than was anticipated. "No longer can neural networks be viewed as the interconnection of many like elements by simple excitatory or inhibitory synapses" (Getting, 1989, p. 187). For example, Getting notes that there is an enormous variety of properties of neurons, synapses, and patterns of connectivity. These serve as the building blocks of neural circuits, and importantly can change as a function of both intracellular and extracellular contexts. As a result, a detailed mapping of the connectivity pattern in a neural network is not sufficient to understand its function. The functional connectivity in the network — the actual effects of one cell on another — can change as the properties of the network's "building blocks" are modulated, even though the anatomical connections in the network are fixed (see Getting, 1989, Figure 2 for a striking example).

Getting (1989, p. 199) has painted quite a different picture of neural networks than would appear to be reflected in the PDP architecture: "The comparative study of neural networks has led to a picture of neural networks as dynamic entities, constrained by their anatomical connectivity but, within these limits, able to be organized and configured into several operational modes". The dynamic changes in biological networks would appear to be computationally relevant. Thus, if connectionism is to make good its promise to provide a more biologically feasible architecture than is found in classical systems, it would appear that the generic architecture must be elaborated extensively. The assumption of homogeneous processors must be abandoned, and in its place should be processing units and connections that have diverse and dynamic properties.


Implementational Connectionism and Cognitive Science

McCloskey (1991) has argued that connectionist networks cannot be construed as either theories of cognitive functions or as simulations of such theories. McCloskey adopts two general strategies to support his view. First, he points out that unlike classical models in cognitive science, one does not require an a priori theory of a phenomenon to model it with a connectionist network. Second, he notes that the general inability to interpret the structure of trained networks prevents them from being useful as theories or explanations. "Attempts to tie theoretical claims to network implementations face a serious obstacle in the form of limited understanding of complex connectionist networks" (p. 389).

Recently, Seidenberg (1993) has responded to McCloskey's (1991) critique. Seidenberg suggests that within cognitive science there is a fundamental disagreement about what cognitive theories should do. He argues that McCloskey's criticisms are true of descriptive connectionism, which would have as its goal the clarification of existing theoretical constructs and empirical observations. Seidenberg claims, however, that connectionism has far more to contribute than this descriptive goal. He champions explanatory connectionism, which begins by appealing "to a small set of concepts that are independently motivated rather than task- or phenomenon-specific" (p. 230). The goal of explanatory connectionism is to demonstrate how a small set of general principles (i.e., the design decisions underlying the generic connectionist architecture) can account for a diverse range of cognitive phenomena. If this range is large enough, and interesting enough, then this indicates that the founding principles are candidates for necessary (and perhaps sufficient) properties of cognition.

In general, we are extremely sympathetic to Seidenberg's (1993) view of connectionist models, because he is proposing that researchers should be more concerned with understanding the basic properties of the model's architecture than with simulating specific behaviours in a piecemeal fashion (see also Dawson, 1991, p. 581). However, there does appear to be one major flaw in Seidenberg's position — the independent justification of connectionism's general principles. He points out that "given the present state of our understanding, these principles are largely concerned with properties of artificial neural networks" (p. 230). Unfortunately, this seems to put the cart before the horse. One would presume that the properties of artificial neural networks are precisely the general principles that connectionists should begin with, and that these principles are independently motivated in some other way. Furthermore, one would also presume that the independent motivation of these principles rests with their biological relevance or plausibility. Unfortunately, the preceding sections have shown that connectionists have largely ignored computationally relevant properties of neural systems when proposing the generic connectionist architecture. Seidenberg (1993, p. 231) argues that "a much bigger win would result, of course, if the general principles were themselves grounded in facts about neurobiology". In our view, this grounding is a prerequisite for explanatory connectionism, and is very far from being achieved. Without it, explanatory connectionism is not possible, and cognitive science does not win at all.

FROM CONNECTIONISM TO COGNITIVE SCIENCE

In the radical behaviourism proposed by Skinner, psychological theory amounted to accounts of environmental stimuli and of the observable behaviours they produced. Internal processes were not referred to; they could play no role in a psychological science. In many respects, the current approach to neural networks represents the rebirth of this endeavour. In our view, the primary concern of connectionists is generating appropriate stimulus/response relationships in PDP networks. Little attention is paid to the relationship between the structure of these networks and the nature of the processes underlying human cognition. We echo Hillis' (1988, p. 176) concern that connectionist networks allow "for the possibility of constructing intelligence without first understanding it". This is perfectly legitimate if a primary goal is merely to build artifacts that generate useful behaviour. Unfortunately, the theories of cognitive science must meet additional criteria.


Consider Pylyshyn's (e.g., 1980, 1984) position on comparing two information processing systems. If the two systems are weakly equivalent, they compute the same mapping from input to output, but do so using quite different procedures. If the two systems are strongly equivalent, not only do they compute the same input/output mapping, but they do so because they use identical procedures — specifically, the same program running on functionally equivalent architectures. Within this framework, cognitive science focuses on internal states and regularities as it strives for strongly equivalent models of psychological processes to account for human mentality. We are concerned that connectionists do not strive in this same direction, to the extent that their computational, algorithmic and implementational descriptions fail to go beyond input/output mappings. Connectionists appear to be more interested in developing systems that are at best weakly equivalent to human processes.

What is required of researchers if connectionist models are to be developed that at least have the potential to be strongly equivalent to human information processing? The same as is required of any modeller in cognitive science: they must provide evidence that the functional architecture of their networks is functionally equivalent to that of the systems they model. The preceding sections of this paper show that current connectionist research does not provide this type of evidence. We envision two directions in which a future research programme could develop to remedy this.

In one direction, connectionists would continue to distance themselves from Classical theorists by focusing on the biological plausibility of their networks. However, as we have shown above, such a focus requires substantial elaboration of proposals for PDP architectures. Many more computationally relevant properties must be considered, including nonmonotonic activation functions, fixed biases, and limited patterns of connectivity. These considerations have motivated our own research on the value unit architecture (Dawson & Schopflocher, 1992b; Dawson, Schopflocher, Kidd & Shamanski, 1992; Dawson, Shamanski & Medler, 1993). Specific proposals are required for incorporating learning rules directly into the architecture (see also Dawson & Schopflocher, 1992a). Different levels of processing may also need to be explored. For instance, the implicit assumption underlying much of connectionism is that processing units are analogous to neurons, and connections are analogous to synapses. However, the recent development of the silicon neuron resulted from describing processing at the level of ion gates in nerve membranes (Mahowald & Douglas, 1991). We are currently exploring connectionist systems in which processing unit activation indicates whether ion gates are open or closed, and in which connections represent ion currents.

In the other direction, connectionists could move towards an integration with the classical approach, by abandoning the notion that they offer a paradigm shift (cf. Schneider, 1987), and by treating their networks as active data structures or dynamic symbols capable of being manipulated serially by rules (cf. Bechtel, 1988; Hawthorne, 1989). Such an integration requires substantial development of principles governing interactions between networks, and a willingness to reject the uniformity hypothesis that all cognition is explicable in terms of "generic" connectionism's architecture (see Clark, 1989, p. 128). From this perspective, PDP networks would fulfil the same theoretical role in cognitive science as have other proposals for representational primitives, such as schemas and images. There is emerging consensus among vision researchers that hybrid models are required to account for existing data on human perception (e.g., Hurlbert & Poggio, 1985; Pylyshyn, 1989; Treisman, 1986). Perhaps the dynamic properties of networks-as-symbols could be used to provide a rigorous framework for such models.

ACKNOWLEDGEMENTS

This paper was supported by Natural Sciences and Engineering Research Council of Canada operating grant 2038 and equipment grant 46584, both awarded to the first author. We would like to thank the following members of the Biological Computation Project for their helpful comments: Istvan Berkeley, Matthew Duncan, Tim Gannon, David Hall, James Kidd, and Don Schopflocher. Thanks as well to Nancy Digdon and Dallas Treit. Address reprint requests to Dr. Michael Dawson, Biological Computation Project, Department of Psychology, University of Alberta, Edmonton, AB, CANADA T6G 2E9. Electronic mail: [email protected].

REFERENCES

Abu-Mostafa, Y.S. and Psaltis, D., 1987, Optical neural computers, Scientific American, 256(3), 88-95.
Ackley, D.H., Hinton, G.E. and Sejnowski, T.J., 1985, A learning algorithm for Boltzmann machines, Cognitive Science, 9, 147-169.
Anderson, J.A., 1972, A simple neural network generating an interactive memory, Mathematical Biosciences, 14, 197-220.
Anderson, J.A. and Rosenfeld, E., 1988, Neurocomputing: Foundations of research, Cambridge, MA, MIT Press.
Anderson, J.A., Silverstein, J.W., Ritz, S.R. and Jones, R.S., 1977, Distinctive features, categorical perception, and probability learning: Some applications of a neural model, Psychological Review, 84, 413-451.
Antrobus, J., 1991, Dreaming: cognitive processes during cortical activation and high afferent thresholds, Psychological Review, 98, 96-121.
Ballard, D.H., 1986, Cortical connections and parallel processing: Structure and function, Behavioural and Brain Sciences, 9, 67-120.
Barnard, E. and Casasent, D., 1989, A comparison between criterion functions for linear classifiers, with an application to neural nets, IEEE Transactions on Systems, Man, and Cybernetics, 19, 834-846.
Barto, A.G., Sutton, R.S. and Anderson, C.W., 1983, Neuronlike adaptive elements that can solve difficult learning control problems, IEEE Transactions on Systems, Man, and Cybernetics, 13, 835-846.
Bechtel, W., 1985, Contemporary connectionism: Are the new parallel distributed processing models cognitive or associationist? Behaviourism, 13, 53-61.
Bechtel, W., 1988, Connectionism and rules and representation systems: are they compatible? Philosophical Psychology, 1, 5-16.


Bechtel, W. and Abrahamsen, A., 1991, Connectionism and the mind, Cambridge, MA, Basil Blackwell.
Bengio, Y. and de Mori, R., 1989, Use of multilayer networks for the recognition of phonetic features and phonemes, Computational Intelligence, 5, 134-141.
Besner, D., Twilley, L., McCann, R.S. and Seergobin, K., 1990, On the association between connectionism and data: Are a few words necessary? Psychological Review, 97, 432-446.
Bever, T.G., Fodor, J.A. and Garrett, M., 1968, A formal limitation of associationism, in: T.R. Dixon and D.L. Horton, eds., Verbal behaviour and general behaviour theory, Englewood Cliffs, NJ, Prentice-Hall.
Braham, R. and Hamblen, J.O., 1990, The design of a neural network with a biologically motivated architecture, IEEE Transactions on Neural Networks, 1, 251-262.
Braitenberg, V., 1984, Vehicles, Cambridge, MA, MIT Press.
Broadbent, D., 1985, A question of levels: Comment on McClelland and Rumelhart, Journal of Experimental Psychology: General, 114, 189-192.
Brooks, R.A., 1989, A robot that walks; emergent behaviours from a carefully evolved network, Neural Computation, 1, 253-262.
Carpenter, G.A., 1989, Neural network models for pattern recognition and associative memory, Neural Networks, 2, 243-257.
Churchland, P.S. and Sejnowski, T., 1989, Neural representation and neural computation, in: L. Nadel, L.A. Cooper, P. Culicover and R.M. Harnish, eds, Neural connections, mental computation, Cambridge, MA, MIT Press.
Clark, A., 1989, Microcognition, Cambridge, MA, MIT Press.
Cohen, J.D., Dunbar, K. and McClelland, J.L., 1991, On the control of automatic processes: A parallel distributed processing account of the Stroop effect, Psychological Review, 97, 332-361.
Cotman, C.W., Monaghan, D.T. and Ganong, A.H., 1988, Excitatory amino acid neurotransmission: NMDA receptors and Hebb-type synaptic plasticity, Annual Review of Neuroscience, 11, 61-80.


Cotter, N.E., 1990, The Stone-Weierstrass theorem and its application to neural networks, IEEE Transactions on Neural Networks, 1, 290-295.
Cowan, J.D. and Sharp, D.H., 1988, Neural nets and artificial intelligence, in: S. Graubard, ed, The artificial intelligence debate, Cambridge, MA, MIT Press.
Crick, F. and Asanuma, C., 1986, Certain aspects of the anatomy and physiology of the cerebral cortex, in: J. McClelland, D. Rumelhart and the PDP Group, eds, Parallel Distributed Processing, V.2, Cambridge, MA, MIT Press.
Cummins, R., 1983, The nature of psychological explanation, Cambridge, MA, MIT Press.
Cybenko, G., 1989, Approximation by superpositions of a sigmoidal function, Mathematics of Control, Signals and Systems, 2, 303-314.
Dawson, M.R.W., 1990, Apparent motion and element connectedness, Spatial Vision, 4, 241-251.
Dawson, M.R.W., 1991, The how and why of what went where in apparent motion: Modelling solutions to the motion correspondence problem, Psychological Review, 98, 569-603.
Dawson, M.R.W., Kremer, S. and Gannon, T., 1993, Identifying the trigger features for hidden units in a PDP model of the early visual pathway, manuscript in preparation.
Dawson, M.R.W. and Schopflocher, D.P., 1992a, Autonomous processing in PDP networks, Philosophical Psychology, 5, 199-219.
Dawson, M.R.W. and Schopflocher, D.P., 1992b, Modifying the generalized delta rule to train networks of nonmonotonic processors for pattern classification, Connection Science, 4, 19-31.
Dawson, M.R.W., Schopflocher, D.P., Kidd, J. and Shamanski, K.S., 1992, Training networks of value units, Proceedings of the Ninth Canadian Artificial Intelligence Conference, 244-250.
Dawson, M.R.W., Shamanski, K.S. and Medler, D.A., 1993, From connectionism to cognitive science, Proceedings of the Fifth University of New Brunswick Artificial Intelligence Symposium, in press.


Dell, G.S., 1986, A spreading-activation theory of retrieval in sentence production, Psychological Review, 93, 283-321.
Douglas, R.J. and Martin, K.A.C., 1991, Opening the grey box, Trends in Neuroscience, 14, 286-293.
Dudai, Y., 1989, The neurobiology of memory, New York, Oxford University Press.
Eich, J.M., 1982, A composite holographic associative recall model, Psychological Review, 89, 627-661.
Fodor, J.A. and Pylyshyn, Z.W., 1988, Connectionism and cognitive architecture: A critical analysis, Cognition, 28, 3-71.
Funahashi, K., 1989, On the approximate realization of continuous mappings by neural networks, Neural Networks, 2, 183-192.
Getting, P.A., 1989, Emerging principles governing the operation of neural networks, Annual Review of Neuroscience, 12, 185-204.
Girosi, F. and Poggio, T., 1990, Networks and the best approximation property, Biological Cybernetics, 63, 169-176.
Granger, R., Ambros-Ingerson, J. and Lynch, G., 1989, Derivation of encoding characteristics of layer II cerebral cortex, Journal of Cognitive Neuroscience, 1, 61-87.
Grossberg, S., 1980, How does the brain build a cognitive code? Psychological Review, 87, 1-51.
Grossberg, S., 1991, Why do parallel cortical systems exist for the perception of static form and moving form? Perception & Psychophysics, 49, 117-141.
Grossberg, S. and Rudd, M., 1989, A neural architecture for visual motion perception: Group and element apparent motion, Neural Networks, 2, 421-450.
Grossberg, S. and Rudd, M., 1992, Cortical dynamics of visual motion perception: Short-range and long-range apparent motion, Psychological Review, 99, 78-121.
Hagiwara, M., 1990, Novel backpropagation algorithm for reduction of hidden units and acceleration of convergence using artificial search, Proceedings of the IEEE Joint Conference on Neural Networks, Vol. I, 625-630.


Hanson, S.J. and Burr, D.J., 1990, What connectionist models learn: Learning and representation in connectionist networks, Behavioural and Brain Sciences, 13, 471-518.
Hanson, S.J. and Olson, C.R., 1991, Neural networks and natural intelligence: Notes from Mudville, Connection Science, 3, 332-335.
Hartman, E., Keeler, J.D. and Kowalski, J.M., 1989, Layered neural networks with Gaussian hidden units as universal approximators, Neural Computation, 2, 210-215.
Haugeland, J., 1985, Artificial intelligence: The very idea, Cambridge, MA, MIT Press.
Hawthorne, J., 1989, On the compatibility of connectionist and classical models, Philosophical Psychology, 2, 5-15.
Hecht-Nielsen, R., 1990, Neurocomputing, Reading, MA, Addison-Wesley.
Hillis, W.D., 1985, The connection machine, Cambridge, MA, MIT Press.
Hillis, W.D., 1988, Intelligence as emergent behavior, or, the songs of Eden, in: S.R. Graubard, ed, The artificial intelligence debate, Cambridge, MA, MIT Press.
Hinton, G.E. and Shallice, T., 1991, Lesioning an attractor network: Investigations of acquired dyslexia, Psychological Review, 98, 74-95.
Hodgkin, A.L. and Huxley, A.F., 1952, A quantitative description of membrane current and its application to conduction and excitation in nerve, Journal of Physiology, 117, 500-544.
Hopcroft, J.E. and Ullman, J.D., 1979, Introduction to automata theory, languages, and computation, Reading, MA, Addison-Wesley.
Hopfield, J.J., 1982, Neural networks and physical systems with emergent collective computational abilities, Proceedings of the National Academy of Sciences, USA, 79, 2554-2558.
Hornik, K., Stinchcombe, M. and White, H., 1989, Multilayer feedforward networks are universal approximators, Neural Networks, 2, 359-366.
Hurlbert, A. and Poggio, T., 1985, Spotlight on attention, Trends in Neurosciences, 8, 309-311.
Jabri, M. and Flower, B., 1991, Weight perturbation: An optimal architecture and learning technique for analog VLSI feedforward and recurrent multilayer networks, Neural Computation, 3, 546-565.
Jain, A.N., 1991, Parsing complex sentences with structured connectionist networks, Neural Computation, 3, 110-120.
Johnson-Laird, P.N., 1983, Mental models, Cambridge, MA, Harvard University Press.
Jordan, M.I., 1986, An introduction to linear algebra in parallel distributed processing, in: D. Rumelhart, J. McClelland and the PDP Group, eds, Parallel Distributed Processing, V.1, Cambridge, MA, MIT Press.
Julesz, B., 1981, Textons, the elements of texture perception, and their interactions, Nature, 290, 91-97.
Kehoe, E.J., 1988, A layered network model of associative learning: Learning to learn and configuration, Psychological Review, 95, 411-433.
Knapp, A. and Anderson, J.A., 1984, A signal averaging model for concept formation, Journal of Experimental Psychology: Learning, Memory and Cognition, 10, 616-637.
Kohonen, T., 1977, Associative memory: A system-theoretical approach, New York, Springer-Verlag.
Kruschke, J.K., 1990, How connectionist models learn: The course of learning in connectionist networks, Behavioural and Brain Sciences, 13, 498-499.
Kuffler, S.W., Nicholls, J.G. and Martin, A.R., 1984, From neuron to brain, 2nd edition, Sunderland, MA, Sinauer Associates.
Kupfermann, I., Castellucci, V., Pinsker, H. and Kandel, E.R., 1970, Neuronal correlates of habituation and dishabituation of the gill-withdrawal reflex in Aplysia, Science, 167, 1743-1745.
Lachter, J. and Bever, T.G., 1988, The relation between linguistic structure and associative theories of language learning — A constructive critique of some connectionist learning models, Cognition, 28, 195-247.
Levelt, W.J.M., 1990, Are multilayer feedforward networks effectively Turing machines? Psychological Research, 52, 153-157.
Levitan, I.B. and Kaczmarek, L.K., 1991, The neuron: Cell and molecular biology, New York, Oxford University Press.


Lewandowsky, S., 1993, The rewards and hazards of computer simulations, Psychological Science, 4, 236-243.
Lippmann, R.P., 1987, An introduction to computing with neural nets, IEEE ASSP Magazine, April, 4-22.
Lippmann, R.P., 1989, Pattern classification using neural networks, IEEE Communications Magazine, November, 47-64.
Lucas, S.M. and Damper, R.I., 1990, Syntactic neural networks, Connection Science, 2, 195-221.
Lynch, G., Granger, R., Larson, J. and Baudry, M., 1989, Cortical encoding of memory: Hypotheses derived from analysis and simulation of physiological learning rules in anatomical structures, in: L. Nadel, L.A. Cooper, P. Culicover and R.M. Harnish, eds, Neural connections, mental computation, Cambridge, MA, MIT Press.
Mahowald, M. and Douglas, R., 1991, A silicon neuron, Nature, 354, 515-518.
Marr, D., 1982, Vision, San Francisco, W.H. Freeman.
Massaro, D.W., 1988, Some criticisms of connectionist models of human performance, Journal of Memory and Language, 27, 213-234.
Massicotte, G. and Baudry, M., 1991, Triggers and substrates of hippocampal synaptic plasticity, Neuroscience and Biobehavioural Reviews, 15, 415-423.
McClelland, J.L. and Rumelhart, D.E., 1988, Explorations in parallel distributed processing, Cambridge, MA, MIT Press.
McClelland, J.L., Rumelhart, D.E. and Hinton, G.E., 1986, The appeal of parallel distributed processing, in: D. Rumelhart, J. McClelland and the PDP Group, eds, Parallel Distributed Processing, V.1, Cambridge, MA, MIT Press.
McCloskey, M., 1991, Networks and theories: The place of connectionism in cognitive science, Psychological Science, 2, 387-395.
McCulloch, W.S. and Pitts, W., 1988, A logical calculus of the ideas immanent in nervous activity, in: J. Anderson and E. Rosenfeld, eds, Neurocomputing: Foundations of research, Cambridge, MA, MIT Press. (Originally published in 1943.)


Minsky, M., 1972, Computation: Finite and infinite machines, London, Prentice-Hall.
Minsky, M. and Papert, S., 1988, Perceptrons, Cambridge, MA, MIT Press. (Originally published in 1969.)
Moody, J. and Darken, C.J., 1989, Fast learning in networks of locally-tuned processing units, Neural Computation, 1, 281-294.
Moorhead, I.R., Haig, N.D. and Clement, R.A., 1989, An investigation of trained neural networks from a neurophysiological perspective, Perception, 18, 793-803.
Mozer, M.C. and Smolensky, P., 1989, Using relevance to reduce network size automatically, Connection Science, 1, 3-16.
Müller, B. and Reinhardt, J., 1990, Neural networks, Berlin, Springer-Verlag.
Murdock, B.B., 1982, A theory for the storage and retrieval of item and associative information, Psychological Review, 89, 609-626.
Newell, A., 1980, Physical symbol systems, Cognitive Science, 4, 135-183.
Papert, S., 1988, One AI or many? in: S.R. Graubard, ed, The artificial intelligence debate, Cambridge, MA, MIT Press.
Pinker, S. and Prince, A., 1988, On language and connectionism: Analysis of a parallel distributed processing model of language acquisition, Cognition, 28, 73-193.
Poggio, T. and Girosi, F., 1989, A theory of networks for approximation and learning, MIT AI Lab Memo No. 1140.
Poggio, T. and Girosi, F., 1990, Regularization algorithms for learning that are equivalent to multilayer networks, Science, 247, 978-982.
Pomerleau, D.A., 1991, Efficient training of artificial neural networks for autonomous navigation, Neural Computation, 3, 88-97.
Pylyshyn, Z.W., 1980, Computation and cognition: Issues in the foundations of cognitive science, Behavioural and Brain Sciences, 3, 111-169.
Pylyshyn, Z.W., 1984, Computation and cognition, Cambridge, MA, MIT Press.
Pylyshyn, Z.W., 1989, The role of location indexes in spatial perception: A sketch of the FINST spatial-index model, Cognition, 32, 65-97.


Rager, J. and Berg, G., 1990, A connectionist model of motion and government in Chomsky's government-binding theory, Connection Science, 2, 35-52.
Reeke, G.N. and Edelman, G.M., 1988, Real brains and artificial intelligence, in: S.R. Graubard, ed, The artificial intelligence debate, Cambridge, MA, MIT Press.
Renals, S., 1989, Radial basis function network for speech pattern classification, Electronics Letters, 25, 437-439.
Rosenblatt, F., 1962, Principles of neurodynamics, Washington, Spartan Books.
Rumelhart, D.E., McClelland, J.L. and the PDP Group, 1986, Parallel Distributed Processing, V.1, Cambridge, MA, MIT Press.
Rumelhart, D.E., Hinton, G.E. and McClelland, J.L., 1986, A general framework for parallel distributed processing, in: D. Rumelhart, J. McClelland and the PDP Group, eds, Parallel Distributed Processing, V.1, Cambridge, MA, MIT Press.
Rumelhart, D.E., Hinton, G.E. and Williams, R.J., 1986a, Learning representations by back-propagating errors, Nature, 323, 533-536.
Rumelhart, D.E., Hinton, G.E. and Williams, R.J., 1986b, Learning internal representations by error backpropagation, in: D. Rumelhart, J. McClelland and the PDP Group, eds, Parallel Distributed Processing, V.1, Cambridge, MA, MIT Press.
Rumelhart, D.E. and McClelland, J.L., 1986, On learning the past tenses of English verbs, in: J. McClelland, D. Rumelhart and the PDP Group, eds, Parallel Distributed Processing, V.2, Cambridge, MA, MIT Press.
Rumelhart, D.E. and McClelland, J.L., 1985, Levels indeed! A response to Broadbent, Journal of Experimental Psychology: General, 114, 193-197.
Rumelhart, D.E., Smolensky, P., McClelland, J.L. and Hinton, G.E., 1986, Schemata and sequential thought processes in PDP models, in: J. McClelland, D. Rumelhart and the PDP Group, eds, Parallel Distributed Processing, V.2, Cambridge, MA, MIT Press.
Schneider, W., 1987, Connectionism: Is it a paradigm shift for psychology? Behaviour Research Methods, Instruments and Computers, 19, 73-83.
Seidenberg, M., 1993, Connectionist models and cognitive theory, Psychological Science, 4, 228-235.
Seidenberg, M.S. and McClelland, J.L., 1989, A distributed, developmental model of word recognition and naming, Psychological Review, 96, 523-568.
Seidenberg, M.S. and McClelland, J.L., 1990, More words but still no lexicon: Reply to Besner et al., 1990, Psychological Review, 97, 447-452.
Selfridge, O.G., 1956, Pattern recognition and learning, in: C. Cherry, ed, Information theory, London, Butterworths Scientific Publications.
Servan-Schreiber, D., Printz, H. and Cohen, J.D., 1990, A network model of catecholamine effects: Gain, signal-to-noise ratio and behavior, Science, 249, 892-895.
Sietsma, J. and Dow, R.J.F., 1988, Neural net pruning — why and how, Proceedings of the IEEE Joint International Conference on Neural Networks, Vol. I, 325-333.
Smolensky, P., 1988, On the proper treatment of connectionism, Behavioural and Brain Sciences, 11, 1-74.
Steinbuch, K., 1961, Die Lernmatrix, Kybernetik, 1, 36-45.
Taylor, W.K., 1956, Electrical simulation of some nervous system functional activities, in: C. Cherry, ed, Information theory, London, Butterworths Scientific Publications.
Treisman, A., 1986, Features and objects in visual processing, Scientific American, 255(5), 114B-125.
Ullman, S., 1979, The interpretation of visual motion, Cambridge, MA, MIT Press.
