Computational Principles in Quasi-Regular Domains
Total Page:16
File Type:pdf, Size:1020Kb
To appear in Psychological Review, 1996. Understanding Normal and Impaired Word Reading: Computational Principles in Quasi-Regular Domains David C. Plaut James L. McClelland Carnegie Mellon University and the Carnegie Mellon University and the Center for the Neural Basis of Cognition Center for the Neural Basis of Cognition Mark S. Seidenberg Karalyn Patterson Neuroscience Program MRC Applied Psychology Unit University of Southern California Abstract CROWN vs. KNOWN, SHOWN, GROWN, THROWN, or OUGH in We develop a connectionist approach to processing in quasi-regular domains, COUGH ROUGH BOUGH THOUGH THROUGH as exemplified by English word reading. A consideration of the shortcomings , , , , ). Nonetheless, in of a previous implementation (Seidenberg & McClelland, 1989, Psych. Rev.) the face of this complexity, skilled readers pronounce written in reading nonwords leads to the development of orthographic and phonologi- words quickly and accurately, and can also use their knowl- cal representations that capture better the relevant structure among the written edge of spelling-sound correspondences to read pronounceable and spoken forms of words. In a number of simulation experiments, networks using the new representations learn to read both regular and exception words, nonwords (e.g., MAVE, RINT). including low-frequency exception words, and yet are still able to read pro- An important debate within cognitive psychology is how nounceable nonwords as well as skilled readers. A mathematical analysis of best to characterize knowledge and processing in quasi-regular the effects of word frequency and spelling-sound consistency in a related but domains in order to account for human language performance. simpler system serves to clarify the close relationship of these factors in influ- encing naming latencies. These insights are verifiedin subsequent simulations, One view (e.g., Pinker, 1984, 1991) is that the systematic as- including an attractor network that reproduces the naming latency data directly pects of language are represented and processed in the form of in its time to settle on a response. Further analyses of the network’s ability to an explicit set of rules. A rule-based approach has considerable reproduce data on impaired reading in surface dyslexia support a view of the intuitive appeal because much of human language behavior can reading system that incorporates a graded division-of-labor between semantic and phonological processes. Such a view is consistent with the more general be characterized at a broad scale in terms of rules. It also pro- Seidenberg and McClelland framework and has some similarities with—but vides a straightforward account of how language knowledge also important differences from—the standard dual-route account. can be applied productively to novel items (Fodor & Pylyshyn, 1988). However, as illustrated above, most domains are only partially systematic; accordingly, a separate mechanism is re- Many aspects of language can be characterized as quasi- quired to handle the exceptions. This distinction between a regular—the relationship between inputs and outputs is sys- rule-based mechanism and an exception mechanism, each op- tematic but admits many exceptions. One such task is the erating according to fundamentally different principles, forms mapping between the written and spoken forms of English the central tenet of so-called “dual-route” theories of language. words. Most words are regular (e.g., GAVE, MINT) in that An alternative view comes out of research on connectionist their pronunciations adhere to standard spelling-sound corre- or parallel distributed processing networks, in which computa- spondences. There are, however, many irregular or excep- tion takes the form of cooperative and competitive interactions tion words (e.g., HAVE, PINT) whose pronunciations violate among large numbers of simple, neuron-like processing units the standard correspondences. To make matters worse, some (McClelland, Rumelhart, & the PDP research group, 1986; spelling patterns have a range of pronunciations with none Rumelhart, McClelland, & the PDP research group, 1986). clearly predominating (e.g., OWN in DOWN, TOWN, BROWN, Such systems learn by adjusting weights on connections be- This research was supported financially by the National Institute of Men- tween units in a way that is sensitive to how the statistical tal Health (Grants MH47566 and MH00385), the National Institute on Aging structure of the environment influences the behavior of the net- (Grant Ag10109), the National Science Foundation (Grant ASC–9109215), work. As a result, there is no sharp dichotomy between the and the McDonnell-Pew Program in Cognitive Neuroscience (Grant T89– items that obey the rules and the items that do not. Rather, 01245–016). We thank Marlene Behrmann, Derek Besner, Max Coltheart, Joe Devlin, all items coexist within a single system whose representations Geoff Hinton, and Eamon Strain for helpful discussions and comments. We and processing reflect the relative degree of consistency in the also acknowledge Derek Besner, Max Coltheart, and Michael McCloskey for mappings for different items. The connectionist approach is directing attention to many of the issues addressed in this paper. particularly appropriate for capturing the rapid, online nature Correspondence concerning this paper should be sent to Dr. David C. Plaut, Department of Psychology, Carnegie Mellon University, Pittsburgh, of language use, as well as for specifying how such processes PA 15213–3890, [email protected]. might be learned and implemented in the brain (although still at Understanding Normal and Impaired Word Reading 2 a somewhat abstractlevel; seeSejnowski, Koch,& Churchland, and sublexical procedures is motivated primarily by evidence 1989, for discussion). Perhaps more fundamentally, connec- that they can be independently impaired, either by abnormal tionist modeling provides a rich set of general computational reading acquisition (developmental dyslexia) or by brain dam- principles that can lead to new and useful ways of thinking age in a previously literate adult (acquired dyslexia). Thus, about human performance in quasi-regular domains. phonological dyslexics, who can read words but not nonwords, Much of the initial debate between these two views of the appear to have a selective impairment of the sublexical proce- language system focused on the relatively constrained domain dure, whereas surface dyslexics, who can read nonwords but of English inflectional morphology—specifically, forming the who “regularize” exception words (e.g., SEW ¡ “sue”), appear past-tense of verbs. Past-tense formation is a rather simple to have a selective impairment of the lexical procedure. quasi-regular task: there is a single regular “rule” (add – Seidenberg and McClelland (1989, hereafter SM89) chal- ed; e.g., WALK ¡ “walked”) and only about 100 exceptions, lenged the central claim of dual-route theories by developing grouped into several clusters of similar items that undergo a a connectionist simulation that learned to map representations ¡ similar change (e.g., SING ¡ “sang”, DRINK “drank”) along of the written forms of words (orthography) to representations with a very small number of very high-frequency, arbitrary of their spoken forms (phonology). The network success- forms (e.g., GO ¡ “went”; Bybee & Slobin, 1982). Rumelhart fully pronounces both regular and exception words and yet and McClelland (1986) attempted to reformulate the issue away is not an implementation of two separate mechanisms (see from a sharp dichotomy between explicit rules and exceptions, Seidenberg & McClelland, 1992, for a demonstration of this and toward a view that emphasizes the graded structure relat- last point). The simulation was put forward in support of a ing verbs and their inflections. They developed a connectionist more general framework for lexical processing in which or- model that learned a direct association between the phonology thographic, phonological, and semantic information interact of all types of verb stems and the phonology of their past-tense in gradually settling on the best representations for a given forms. Pinker and Prince (1988) and Lachter and Bever (1988), input (see Stone & Van Orden, 1989, 1994; Van Orden & however, pointed out numerous deficiencies in the model’s ac- Goldinger, 1994; Van Orden, Pennington, & Stone, 1990, for tual performance and in some of its specific assumptions, and a similar perspective on word reading). A major strength of argued more generally that the applicability of connectionist the approach is that it provides a natural account of the graded mechanisms in language is fundamentally limited (also see effects of spelling-sound consistency among words (Glushko, Fodor & Pylyshyn, 1988). However, many of the specific lim- 1979; Jared, McRae, & Seidenberg, 1990) and how this consis- itations of the Rumelhart and McClelland model have been tency interacts with word frequency (Andrews, 1982; Seiden- addressed in subsequent simulation work (Cottrell & Plun- berg, 1985; Seidenberg, Waters, Barnes, & Tanenhaus, 1984; kett, 1991; Daugherty & Seidenberg, 1992; Hoeffner, 1992; Taraban & McClelland, 1987; Waters & Seidenberg, 1985).1 MacWhinney & Leinbach, 1991; Marchman, 1993; Plunkett & Furthermore, SM89 demonstrated that undertrained versions Marchman, 1991, 1993). Thus, the possibility remains strong of the model exhibit some aspects of developmental surface that a connectionist model could provide a full account of past- dyslexia, and Patterson