On the Proper Treatment of Connectionism
BEHAVIORAL AND BRAIN SCIENCES (1988) 11, 1-74
Printed in the United States of America
© 1988 Cambridge University Press 0140-525X/88 $5.00+.00

On the proper treatment of connectionism

Paul Smolensky
Department of Computer Science and Institute of Cognitive Science, University of Colorado, Boulder, Colo. 80309-0430

Abstract: A set of hypotheses is formulated for a connectionist approach to cognitive modeling. These hypotheses are shown to be incompatible with the hypotheses underlying traditional cognitive models. The connectionist models considered are massively parallel numerical computational systems that are a kind of continuous dynamical system. The numerical variables in the system correspond semantically to fine-grained features below the level of the concepts consciously used to describe the task domain. The level of analysis is intermediate between those of symbolic cognitive models and neural models. The explanations of behavior provided are like those traditional in the physical sciences, unlike the explanations provided by symbolic models. Higher-level analyses of these connectionist models reveal subtle relations to symbolic models. Parallel connectionist memory and linguistic processes are hypothesized to give rise to processes that are describable at a higher level as sequential rule application. At the lower level, computation has the character of massively parallel satisfaction of soft numerical constraints; at the higher level, this can lead to competence characterizable by hard rules. Performance will typically deviate from this competence since behavior is achieved not by interpreting hard rules but by satisfying soft constraints. The result is a picture in which traditional and connectionist theoretical constructs collaborate intimately to provide an understanding of cognition.

Keywords: cognition; computation; connectionism; dynamical systems; networks; neural models; parallel distributed processing; symbolic models

1. Introduction

In the past half-decade the connectionist approach to cognitive modeling has grown from an obscure cult claiming a few true believers to a movement so vigorous that recent meetings of the Cognitive Science Society have begun to look like connectionist pep rallies. With the rise of the connectionist movement come a number of fundamental questions which are the subject of this target article. I begin with a brief description of connectionist models.

1.1. Connectionist models. Connectionist models are large networks of simple parallel computing elements, each of which carries a numerical activation value which it computes from the values of neighboring elements in the network, using some simple numerical formula. The network elements, or units, influence each other's values through connections that carry a numerical strength, or weight. The influence of unit i on unit j is the activation value of unit i times the strength of the connection from i to j. Thus, if a unit has a positive activation value, its influence on a neighbor's value is positive if its weight to that neighbor is positive, and negative if the weight is negative. In an obvious neural allusion, connections carrying positive weights are called excitatory and those carrying negative weights are inhibitory.

In a typical connectionist model, input to the system is provided by imposing activation values on the input units of the network; these numerical values represent some encoding, or representation, of the input. The activation on the input units propagates along the connections until some set of activation values emerges on the output units; these activation values encode the output the system has computed from the input. In between the input and output units there may be other units, often called hidden units, that participate in representing neither the input nor the output.

The computation performed by the network in transforming the input pattern of activity to the output pattern depends on the set of connection strengths; these weights are usually regarded as encoding the system's knowledge. In this sense, the connection strengths play the role of the program in a conventional computer. Much of the allure of the connectionist approach is that many connectionist networks program themselves, that is, they have autonomous procedures for tuning their weights to eventually perform some specific computation. Such learning procedures often depend on training in which the network is presented with sample input/output pairs from the function it is supposed to compute. In learning networks with hidden units, the network itself "decides" what computations the hidden units will perform; because these units represent neither inputs nor outputs, they are never "told" what their values should be, even during training.

In recent years connectionist models have been developed for many tasks, encompassing the areas of vision, language processing, inference, and motor control. Numerous examples can be found in recent proceedings of the meetings of the Cognitive Science Society; Cognitive Science (1985); Feldman et al. (1985); Hinton and Anderson (1981); McClelland, Rumelhart, and the PDP Research Group (1986); Rumelhart, McClelland, and the PDP Research Group (1986). [See also Ballard "Cortical Connections and Parallel Processing" BBS 9(1) 1986.]

1.2. Goal of this target article. Given the rapid development in recent years of the connectionist approach to cognitive modeling, it is not yet an appropriate time for definitive assessments of the power and validity of the approach. The time seems right, however, for an attempt to articulate the goals of the approach, the fundamental hypotheses it is testing, and the relations presumed to link it with the other theoretical frameworks of cognitive science. A coherent and plausible articulation of these fundamentals is the goal of this target article. Such an articulation is a nontrivial task, because the term "connectionist" encompasses a number of rather disparate theoretical frameworks, all of them quite undeveloped. The connectionist framework I will articulate departs sufficiently radically from traditional approaches in that its relations to other parts of cognitive science are not simple.

For the moment, let me call the formulation of the connectionist approach that I will offer PTC. I will not argue the scientific merit of PTC; that some version of connectionism along the lines of PTC constitutes a "proper description of processing" is argued elsewhere (e.g., in Rumelhart, McClelland & the PDP Research Group 1986; McClelland, Rumelhart & the PDP Research Group 1986). Leaving aside the scientific merit of connectionist models, I want to argue here that PTC offers a "Proper Treatment of Connectionism": a coherent formulation of the connectionist approach that puts it in contact with other theory in cognitive science in a particularly constructive way. PTC is intended as a formulation of connectionism that is at once strong enough to constitute a major cognitive hypothesis, comprehensive enough to face a number of difficult challenges, and sound enough to resist a number of objections in principle. If PTC succeeds in these goals, it will facilitate the real business at hand: Assessing the scientific adequacy of the connectionist approach, that is, determining whether the approach offers computational power adequate for human

chines are, after all, "universal"), or because it cannot offer the kinds of explanations that cognitive science requires. Some dismiss connectionist models on the grounds that they are too neurally unfaithful. PTC has been designed to withstand these attacks.

On the opposite side, most existing connectionist models fail to come to grips with the traditional approach partly through a neglect intended as benign. It is easy to read into the connectionist literature the claim that there is no role in cognitive science for traditional theoretical constructs such as rules, sequential processing, logic, rationality, and conceptual schemata or frames. PTC undertakes to assign these constructs their proper role in a connectionist paradigm for cognitive modeling. PTC also addresses certain foundational issues concerning mental states.

I see no way of achieving the goals of PTC without adopting certain positions that will be regarded by a number of connectionists as premature or mistaken. These are inevitable consequences of the fact that the connectionist approach is still quite underdeveloped, and that the term "connectionist" has come to label a number of approaches that embody significantly conflicting assumptions. PTC is not intended to represent a consensus view of what the connectionist approach is or should be.

It will perhaps enhance the clarity of the article if I attempt at the outset to make my position clear on the present value of connectionist models and their future potential. This article is not intended as a defense of all these views, though I will argue for a number of them, and the remainder have undoubtedly influenced the presentation. On the one hand, I believe that:

(1) a. It is far from clear whether connectionist models have adequate computational power to perform high-level cognitive tasks: There are serious obstacles that must be overcome before connectionist computation can offer modelers power comparable to that of symbolic computation.
b. It is far from clear that connectionist models offer a sound basis for modeling human cognitive