

Connectionism

THEMIS N. KARAMINIS, MICHAEL S.C. THOMAS
Department of Psychological Sciences, Birkbeck College, University of London, London, UK

Synonyms
(Artificial) Neural network modeling; Connectionist modeling; Neural nets; Parallel Distributed Processing (PDP)

Definition
Connectionism is an interdisciplinary approach to the study of cognition that integrates elements from the fields of artificial intelligence, neuroscience, cognitive psychology, and philosophy of mind. As a theoretical movement in cognitive science, connectionism suggests that cognitive phenomena can be explained with respect to a set of general information-processing principles, known as parallel distributed processing (Rumelhart et al. 1986a). From a methodological point of view, connectionism is a framework for studying cognitive phenomena using architectures of simple processing units interconnected via weighted connections.

These architectures present analogies to biological neural systems and are referred to as (Artificial) Neural Networks. Connectionist studies typically propose and implement neural network models to explain various aspects of cognition. The term connectionism stems from the proposal that cognition emerges in neural network models as the product of a learning process which shapes the values of the weighted connections.

Connectionism supports the idea that knowledge is represented in the weights of the connections between the processing units in a distributed fashion. This means that knowledge is encoded in the structure of the processing system, in contrast to the symbolic approach, where knowledge is readily shifted between different memory registers.

Theoretical Background
Artificial Neural Networks are abstract models of biological neural systems. They consist of a set of identical units, referred to as artificial neurons or processing units, which are interconnected via weighted connections.

A great deal of biological complexity is omitted in artificial neural network models. For example, artificial neurons perform the simple function of discriminating between different levels of input activation. The detector model of the neuron (Fig. 1) is a crude approximation of the role of dendrites and synaptic channels in biological neurons. According to this model, each neuron receives a number of inputs from other neurons and integrates them by computing a weighted sum of the activations of the sending neurons. Based on the value of this total input activation, an activation function (e.g., a threshold function) determines the level of the neuron's output activation, which is then propagated to succeeding neurons.
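The operation of such a unit can be written in a few lines of code. The sketch below is purely illustrative and not from the original chapter; the function names and the choice of a simple step-threshold activation are assumptions:

# Detector model of the artificial neuron: a weighted sum of the
# input activations is passed through a threshold activation function.

def threshold_activation(net_input, threshold=0.0):
    # The unit 'fires' (outputs 1.0) only if its total input
    # activation exceeds the threshold.
    return 1.0 if net_input > threshold else 0.0

def artificial_neuron(inputs, weights, threshold=0.0):
    # Integrate the activations of the sending neurons as a weighted sum,
    # then let the activation function determine the output activation.
    net_input = sum(x * w for x, w in zip(inputs, weights))
    return threshold_activation(net_input, threshold)

# Three sending neurons with activities x1..x3 and weights w1..w3, as in Fig. 1.
print(artificial_neuron([1.0, 0.0, 1.0], [0.5, -0.3, 0.8]))  # 1.0, since 0.5 + 0.8 > 0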
The pattern of connectivity between the processing units defines the architecture of the neural network and the input–output functions it can perform. The processing units are usually arranged in layers; notably, a layered structure has also been observed in neural tissue. Many different neural network architectures have been implemented in the connectionist literature. One that has been particularly common is the three-layer feed-forward neural network (Fig. 2). In this network, the units are arranged in three layers: input, hidden, and output. The connectivity is feed-forward, which means that the connections are unidirectional, connecting the input to the hidden layer and the hidden to the output layer. The connectivity is also full: every neuron of a given layer is connected to every neuron of the next layer.
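A forward pass through such an architecture can be sketched as follows. This is a hypothetical fragment rather than code from the chapter; the sigmoid activation function and the particular weight values are assumptions:

import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(activations, weights):
    # Full, feed-forward connectivity: every receiving unit j computes a
    # weighted sum over all sending units i (weights[j][i]) and applies
    # the activation function.
    return [sigmoid(sum(w * a for w, a in zip(row, activations)))
            for row in weights]

# A 3-4-2 network as in Fig. 2, with arbitrary illustrative weights.
w_input_hidden = [[0.2, -0.1, 0.4],
                  [0.7, 0.3, -0.5],
                  [-0.2, 0.8, 0.1],
                  [0.5, -0.6, 0.9]]
w_hidden_output = [[0.3, -0.4, 0.8, 0.1],
                   [-0.7, 0.2, 0.5, -0.3]]

hidden = layer_forward([1.0, 0.5, 0.0], w_input_hidden)   # four hidden activations
output = layer_forward(hidden, w_hidden_output)           # two output activations
print(output)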

A key property of neural networks is their ability to learn. Learning in neural networks is based on altering the extent to which a given neuron's activity alters the activity of the neurons to which it is connected. Learning is performed by a learning algorithm, which determines the changes in the weight values required to perform a set of input–output mappings. For example, the Back-propagation of Error algorithm (Rumelhart et al. 1986b) can be used to train a feed-forward multilayered network (Fig. 2) using supervised learning. For this type of learning, the learning algorithm presents the network with pairs of input patterns and desired output patterns (or targets). The algorithm computes the output error, i.e., the difference between the actual output of the network and the targets. Next, the algorithm propagates appropriate error signals back down through each layer of the network. These error signals are used to determine the weight changes necessary to minimize the output error. For a more detailed discussion of learning in neural networks, see ▶Connectionist Theories of Learning.
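A compact, self-contained sketch of this training procedure is given below. It is an illustration under assumptions (sigmoid activations, added bias units, and the XOR mapping as a toy training set), not the implementation used in the models discussed here:

import math, random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w_ih, w_ho):
    xb = x + [1.0]        # constant 'bias' unit appended to the input layer
    hidden = [sigmoid(sum(w * a for w, a in zip(row, xb))) for row in w_ih]
    hb = hidden + [1.0]   # bias unit for the output layer
    output = [sigmoid(sum(w * h for w, h in zip(row, hb))) for row in w_ho]
    return hidden, output

def train_pattern(x, target, w_ih, w_ho, lr=0.5):
    # One supervised learning step: compute the output error and propagate
    # error signals back through the network to determine weight changes.
    hidden, output = forward(x, w_ih, w_ho)
    xb, hb = x + [1.0], hidden + [1.0]
    # Output error signals: (target - actual) scaled by the sigmoid slope.
    d_out = [(t - o) * o * (1.0 - o) for t, o in zip(target, output)]
    # Hidden error signals: each hidden unit's share of the output error.
    d_hid = [h * (1.0 - h) * sum(d * w_ho[k][j] for k, d in enumerate(d_out))
             for j, h in enumerate(hidden)]
    for k, d in enumerate(d_out):
        for j, h in enumerate(hb):
            w_ho[k][j] += lr * d * h
    for j, d in enumerate(d_hid):
        for i, a in enumerate(xb):
            w_ih[j][i] += lr * d * a

# Train a 2-4-1 network on XOR, a mapping that a network without
# a hidden layer cannot learn.
random.seed(1)
w_ih = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(4)]  # 2 inputs + bias
w_ho = [[random.uniform(-1, 1) for _ in range(5)] for _ in range(1)]  # 4 hidden + bias
patterns = [([0.0, 0.0], [0.0]), ([0.0, 1.0], [1.0]),
            ([1.0, 0.0], [1.0]), ([1.0, 1.0], [0.0])]
for _ in range(5000):
    for x, t in patterns:
        train_pattern(x, t, w_ih, w_ho)
for x, t in patterns:
    print(x, round(forward(x, w_ih, w_ho)[1][0], 2))  # should approach 0, 1, 1, 0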

Other issues considered in neural network modeling concern the representation of the learning environment. For example, a localist or a distributed scheme can be used to represent different entities. In the former, a single unit is used to encode an entity, while in the latter an entity is encoded by an activation pattern across multiple units. Furthermore, the different input–output patterns which compose the learning environment can be presented in different ways (e.g., sequentially, randomly with replacement, incrementally, or based on a frequency structure).
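The contrast between the two schemes can be made concrete with a small invented example; the feature set below is hypothetical and purely illustrative:

# Localist scheme: one dedicated unit per entity (one-hot patterns),
# so different entities share no structure.
localist = {
    "cat": [1, 0, 0, 0],
    "dog": [0, 1, 0, 0],
}

# Distributed scheme: each entity is an activation pattern across many
# units, so similar entities overlap (hypothetical features: animate,
# four-legged, barks, meows).
distributed = {
    "cat": [1, 1, 0, 1],
    "dog": [1, 1, 1, 0],
}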

Important Scientific Research and Open Questions
The concept of neural network computation was initially proposed in the 1940s. However, the foundations for the systematic application of neural networks to the exploration of cognition were laid several decades later by the influential volumes of Rumelhart, McClelland, and colleagues. Following this seminal work, a large number of studies proposed neural network models to address various cognitive phenomena.

Although connectionist models are inspired by computation in biological neural systems, they operate at a high level of abstraction and therefore cannot claim biological plausibility. Connectionist models are usually seen as cognitive models, which explain cognition on the basis of general information-processing principles. One of the main strengths of connectionism is that its neural network models are not merely verbally specified but implemented. In this way, they are able to offer elaborate mechanistic explanations for the structure of cognition and cognitive development. They also allow the detailed study of developmental disorders, by considering training under atypical initial computational constraints, and of acquired deficits, by introducing 'damage' to trained models.
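One simple form of such 'damage' is to sever a proportion of a trained network's connections. The fragment below is a hypothetical illustration of this idea, not a method specified in the chapter:

import random

def lesion(weights, proportion=0.3, rng=None):
    # Simulate an acquired deficit by severing (zeroing) a random
    # proportion of the weighted connections of a trained network.
    rng = rng or random.Random(42)
    return [[0.0 if rng.random() < proportion else w for w in row]
            for row in weights]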
One of the most influential connectionist models is that of Rumelhart and McClelland (1986) for the acquisition of the English past tense (Fig. 3). The domain of the English past tense is of theoretical interest to psycholinguists because it presents a predominant regularity, with the great majority of verbs forming their past tenses through a stem-suffixation rule (e.g., walk/walked). However, a significant group of verbs form their past tenses irregularly (e.g., swim/swam, hit/hit, is/was).

Rumelhart and McClelland trained a two-layered feed-forward network (a pattern associator) on mappings between phonological representations of the stems of English verbs and the corresponding past tense forms. They showed that both regular and irregular inflections could be learned by this network. Furthermore, they argued that their model reproduced a series of well-established phenomena in empirical studies of language acquisition. For example, the past tense rule was generalized to novel stems, while the learning of irregular verbs followed a U-shaped pattern (an initial period of error-free performance succeeded by a period of increased occurrence of overgeneralization errors, e.g., think/thinked instead of thought).

The success of this model in simulating the acquisition of the English past tense demonstrated that an explicit representation of rules is not necessary for the acquisition of morphology. Instead, rule-like behavior emerged as the product of the statistical properties of the input–output mappings. The Rumelhart and McClelland (1986) model thus posed a serious challenge to existing 'symbolic' views, which maintained that the acquisition of morphology was supported by two separate mechanisms, an account also referred to as the dual-route model. According to the dual-route model, a rule-based system is involved in the learning of regular mappings, while a rote memory is involved in the learning of irregular mappings. A vigorous debate, also known as the 'past tense debate,' ensued in the field of language acquisition (cf. Pinker and Prince 1988). By the time this debate subsided, connectionist studies had moved on to addressing many aspects of the acquisition of the past tense and of inflectional morphology in greater detail. For example, Thomas and Karmiloff-Smith (2003) incorporated phonological and lexical-semantic information in the input of a three-layered feed-forward network and studied the conditions under which an atypical developmental profile could be reproduced, as a way of investigating the potential causes of developmental language impairments.

Another important connectionist model is the simple recurrent network (SRN) proposed by Elman (1990). The significance of this network lies in its ability to represent time and to address problems which involve the processing of sequences. As shown in Fig. 4, the SRN uses a three-layered feed-forward architecture in which an additional layer of 'context units' is connected to the hidden layer with recurrent connections. Time is separated into discrete slices. On each time slice, the activation of the hidden layer from the previous time slice is given as input to the network via the context layer. In this way, the SRN is able to process a new input in the context of the full history of previous inputs. This allows the network to learn statistical relationships across sequences in the input.
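The core mechanism, copying the hidden activations into a context layer that feeds back into the hidden layer on the next time slice, can be sketched as follows. This is a minimal hypothetical fragment; the sigmoid activation and the random weight values are assumptions:

import math, random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def srn_step(x, context, w_in, w_ctx, w_out):
    # One time slice: the hidden layer integrates the current input
    # together with the context layer (the previous hidden activations).
    hidden = [sigmoid(sum(wi * a for wi, a in zip(r_in, x)) +
                      sum(wc * c for wc, c in zip(r_ctx, context)))
              for r_in, r_ctx in zip(w_in, w_ctx)]
    output = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_out]
    return hidden, output

def process_sequence(sequence, w_in, w_ctx, w_out, n_hidden):
    context = [0.5] * n_hidden          # neutral initial context activations
    outputs = []
    for x in sequence:
        hidden, out = srn_step(x, context, w_in, w_ctx, w_out)
        context = hidden[:]             # the 'copy' connections of Fig. 4
        outputs.append(out)
    return outputs

# A 2-3-1 SRN with random illustrative weights, processing a short sequence.
random.seed(0)
w_in = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(3)]
w_ctx = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
w_out = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(1)]
print(process_sequence([[1, 0], [0, 1], [1, 1]], w_in, w_ctx, w_out, 3))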


Acknowledgments
The studies of the first author are funded by the Greek State Scholarship Foundation (IKY). The work of the second author is supported by UK MRC Grant G0300188.

Cross-References
▶ Computational Models of Human Learning
▶ Connectionist Theories of Learning
▶ Developmental Cognitive Neuroscience and Learning
▶ Human Cognition and Learning
▶ Learning in Artificial Neural Networks

References
Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14, 179–211.
Pinker, S., & Prince, A. (1988). On language and connectionism: Analysis of a parallel distributed processing model of language acquisition. Cognition, 28, 73–193.
Rumelhart, D. E., & McClelland, J. L. (1986). On learning the past tense of English verbs. In J. L. McClelland, D. E. Rumelhart, & The PDP Research Group (Eds.), Parallel distributed processing: Explorations in the microstructure of cognition. Volume 2: Psychological and biological models (pp. 216–271). Cambridge, MA: MIT Press.
Rumelhart, D. E., Hinton, G. E., & McClelland, J. L. (1986a). A general framework for parallel distributed processing. In D. E. Rumelhart, J. L. McClelland, & The PDP Research Group (Eds.), Parallel distributed processing: Explorations in the microstructure of cognition. Volume 1: Foundations (pp. 45–76). Cambridge, MA: MIT Press.
Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986b). Learning internal representations by error propagation. In D. E. Rumelhart, J. L. McClelland, & The PDP Research Group (Eds.), Parallel distributed processing: Explorations in the microstructure of cognition. Volume 1: Foundations (pp. 318–362). Cambridge, MA: MIT Press.
Thomas, M. S. C., & Karmiloff-Smith, A. (2003). Modeling language acquisition in atypical phenotypes. Psychological Review, 110(4), 647–682.


Connectionism. Fig. 1 The detector model of the real neuron. Input activities x1, x2, x3 arrive from other neurons over weighted connections w1, w2, w3; the unit computes the weighted sum of its inputs (Σ xiwi) and passes it through a threshold activation function to produce the output activity y, which is sent on to other neurons

Connectionism. Fig. 2 A three-layered feed-forward neural network with three units in the input layer, four units in the hidden layer, and two units in the output layer


Connectionism. Fig. 3 The Rumelhart and McClelland (1986) model for the learning of the English past tense. The core of the model is a two-layered feed-forward network (pattern associator) which learns mappings between coarse-coded distributed representations ('Wickelfeature' representations) of verb roots and past tense forms; an encoding network maps the phonological representation of the verb root onto its Wickelfeature representation, and a decoding network maps the Wickelfeature representation of the past tense onto its phonological representation

Connectionism. Fig. 4 The Simple Recurrent Network (Elman 1990). The input and context layers both feed into the hidden layer, which feeds the output layer; on each time slice, the hidden activations are copied back into the context layer