Description of the module
Subject name: Linguistics
Paper name: Psycho-
Module name/Title: Module#20-Modularity vs. Connectionism
Pre-requisites: Modules 1-20 psycholinguistics
Objectives: To explain the concepts of modularity and connectionism in relation to speech and processing
Key words: Bottom-up vs. top-down processing, connectionism, constructivism, mentalism, nativism

1. Introduction

According to the classical view, cognition is computational: (1) the mind/brain deals with real and intentional mental states that are symbolic representations; (2) symbols and rules interact with each other; and (3) the mind not only 'holds' these symbolic representations, but its processes are explainable as rule-governed interactions between representations. Considerable debate exists today on whether language acquisition and use are best explained by appeal to mental representations of linguistic rules and concepts as entities independent of linguistic experience. Nativists like Chomsky have held that at least some mental representations are endogenously specified (innate), and therefore present before any learning or experience has taken place. Thus, concepts such as mentalism, nativism, and modularity are interconnected.

The issue of modularity has dominated much of the research on how language is represented in the brain, how children acquire and use it, and how brain damage interferes with language knowledge and use. It has therefore had considerable influence on theories, models and applications in disciplines as varied as psycholinguistics, neurolinguistics, developmental psychology, neuropsychology, and artificial intelligence. As discussed in the module on innateness, the arguments offered to defend the modularity of the language faculty include the synchrony of language learning milestones, the speed with which we perform the linguistic computations needed to parse and comprehend complex sentences, and the fact that brain damage can selectively spare or impair language processes.

Linguistic theories of generative persuasion propounded by Chomsky presuppose modularity. Chomsky viewed language as an autonomous system insulated from other cognitive systems. According to him, speech is a manifestation of the human mind, the roots of which lie in innate knowledge in the form of modules containing mental representations. Within such a symbolic or modular perspective, the mind is like a digital computational device separable from the environment, and it manipulates internal amodal symbols according to logical rules. Modular theorists like Jerry Fodor (1983) argued that knowledge about words never influences the perception of phonemes, which involves bottom-up processing. The initial stages of word or sentence comprehension, for instance, are not influenced by knowledge representations from higher levels (top-down processing). It is only after initial linguistic processing in specialized modules that information about context or other factors from the real world might play a role in the interpretation of meanings.

Connectionist models, on the other hand, are based primarily on learning theories in psychology and on concepts such as neural networks in computer science. These models (some are referred to as Parallel Distributed Processing or PDP models) challenge the idea that mental processes are basically algorithmic symbol manipulations. Connectionist models such as those proposed by Rumelhart and McClelland posit processing units akin to neurons, with trainable connections (like synapses) that carry sub-symbolic numeric representations. The analysis doesn't stop at the computational and algorithmic levels (as envisaged by modular theories) but goes on to include the implementation level as well. Feedback from the environment can influence learning. The process of visual word recognition within a connectionist framework, for instance, involves a network with at least three hierarchically organized types of units: feature detectors, letter detectors and word detectors. The network is trained by adjusting the weights of the connections between the units. The most excited units send inhibitory signals to dampen activity in competing units. Within connectionist frameworks, the frequency or probability of occurrence of words, linguistic context and world knowledge (top-down processing) all play a role in the comprehension of word meanings.

An attempt is made in this unit to synthesize current thinking about modularity and connectionism, with a view to providing frameworks for characterizing adult language disorders in the subsequent units. Since there is no consensus among cognitive scientists about the terminology relating to the notions of modularity or module, Seok (2006) discussed different uses of the term 'modularity' and listed some of the defining properties and features of modular cognitive architectures.

2. Types of Modularity

• Anatomical modularity: A module is conceived as a specific anatomical structure (group of cells) in the brain, a typical neuroscience perspective.

• Design modularity: Modularity is seen as a specific design principle, the way components in a system are organized. Individual assemblies with distinct functions come together to form a system, as in vision research.

• Neuropsychological modularity: Modules are functionally independent units in the brain that can dissociate; they have independent cognitive functions with dedicated groups of brain cells. Aphasiological research makes use of this view of modularity.

• Chomskian modularity: Modularity according to Chomsky is a special property of an organized body of symbolic knowledge, what he called competence.

• Computational modularity or Fodorian modularity: The mind is treated as a computational system that processes information by manipulating signals that represent things in the external world. Modularity here refers to a specific mode of information processing.

• Developmental modularity: A module develops along a specific pre-determined path, following a domain-specific developmental trajectory. Modularity within this view is the endpoint of active interaction between the mind and its environment, what Karmiloff-Smith calls the process of modularization, a view endorsed by approaches such as neuroconstructivism and emergentism.

• Darwinian or evolutionary modularity: Modularity is seen as a property of an innate and domain-specific cognitive mechanism that comes to exist by natural selection, a popular view in evolutionary psychology.

According to Seok, there are at least five defining properties of modular cognitive systems, viz., physical structure, cognitive function performed, type of computation, type of information handled, and development. These properties interact with the characteristic features of modularity listed below, which are endorsed by many theorists including Fodor (1983):

• Domain specificity (modules are dedicated to processing a single type of information)

• Mandatoriness (modules operate reflexively)

• Limited central access (modules have limited access to mental representations at higher levels)

• Speed (modules process data very fast)

• Information encapsulation (it is difficult to interfere with the inner workings of a module)

• Fixed neural architecture (modules are innately specified and hard-wired)

• Specific breakdown patterns (modules break down in a characteristic and predictable way)

• Specific ontogeny (modules develop in a specific sequence)

According to Fodor, every cognitive system has three tiers. The first is the transducer level, which transforms environmental signals into a form that can be used by the organism. At the second level, the input systems perform basic recognition and description functions. At the third and final level, higher cognitive functions (e.g. thinking, reasoning, problem solving) are performed by central systems. In Fodor's view, only the input (vertical) systems in the second tier are modular; the higher-level cognitive processes (the third level) are not. The latter were termed horizontal or domain-general systems, which are not content-specific like the vertical modules.

Dimensions | What do they specify? | Features
Physical structure | How a cognitive system is physically assembled | Fixed neural architecture, specific breakdown pattern
Cognitive function | What a cognitive system does | Functional specialization
Computation | The way a cognitive system processes information | Speed, mandatoriness, limited central access, information encapsulation
Information | The type of information employed in carrying out its cognitive function | Domain specificity
Development | How a cognitive system comes to exist | Specific development pattern, specific ontogeny

Table 1: Dimensions and features of modularity

Thus, object perception might be modular in that it doesn't need to connect with the language module, the music module and so on. The higher-level processes, on the other hand, can have access to all kinds of information contained in the entire cognitive system when performing a given operation.

One of the best pieces of evidence for modularity, in particular for the features of information encapsulation and mandatoriness, cited by many researchers in the field of vision is the Müller-Lyer illusion (see figure-1 below). Even when we know that the two vertical lines are the same length, we continue to perceive the first (left-hand) line as longer. The visual system is making an inference that goes beyond the stimulus in interpreting the input. Such powerful inferences are made within the visual system using rules that we can neither access nor override.

Figure-1: The Müller-Lyer illusion

What is a Module? A module is a cognitive or perceptual subsystem whose workings are relatively independent from the rest of the cognitive architecture, and whose functioning can be analyzed and understood relatively independently of the overall system in which it is embedded.

Seok (2006) lists different types of modules (see table-2 below):

The concept of innate cerebral modules was extended to several cognitive domains such as the lexicon, number processing, face processing, spatial cognition and so on. Arguments originally belonging to adult neuropsychology were applied even to developmental cases, suggesting that the infant brain comes pre-specified to process and represent information in various cognitive modules. It was claimed that people with autism exhibit a selective impairment in the so-called 'theory-of-mind module' (ToM) aspect of cognition. Specifically, it was demonstrated that individuals with autism have particular difficulty attributing intentional, mental states to others, while they seem relatively unimpaired on tasks of equivalent complexity that involve physical objects rather than human beings. Likewise, subgroups of children with so-called specific language impairment were thought to be impaired in aspects of language with other cognitive skills spared. A genetic disorder called Williams syndrome was also upheld by many modular theorists as demonstrating dissociation in the functioning of different modules: independently functioning language and face processing, alongside seriously impaired spatial cognition, as evident in severely impaired drawing ability. In each of these examples, the claim was that the neonate brain is composed of some intact and some deficient or missing cognitive modules, and that the brain develops as a series of isolated components, with the development of any one of them having no effect on the development of the others. Researchers belonging to the neuroconstructivist and emergentist approaches to the study of cognition have challenged such strictly modularist views. They argue that, rather than generalizing the static, adult neuropsychological view to developmental cases, it is vital to consider the possibility that modules are the end product of an ontogenetic process, not its starting point (see Karmiloff-Smith at http://www.answers.com/topic/cognitive-modularity#ixzz2USVV12dq ).

3. Is there a language module?

According to modular models of cognition, the answer is YES. The following are some of the arguments in support of Modularity with a capital M.

• Human linguistic capacity is considered domain specific in that it has little to do with general reasoning capacity or intelligence.

• The human linguistic faculty requires little, if any, stimulation because it has its own domain-specific information processing representations and rules.

• Language processing is mandatory in that we cannot help hearing the speech sounds or receiving the meaningful messages generated by others (a phonology module).

• The human language system is functionally independent of other cognitive capacities such as memory, problem solving etc.

• Damage to certain areas in the brain does result in language-specific impairments that are to some extent predictable, as in Broca's aphasia, with relatively intact speech comprehension and severely affected speech production, and the reverse pattern associated with Wernicke's aphasia.

Since the 1990s, there has been a renewed interest in (1) moving beyond the architecture of the mind to look into the nature of processing, (2) developing computational models of language learning, (3) testing the learnability of linguistic properties hitherto assumed to be innate, and (4) studying the statistical properties of the input and the role of these properties in language learning and processing. Before turning to the topic of Connectionism, which embodies these four ideas, the basic tenets of modularity are summarized (see box below):

Modularity is a theory of mind which refers to the vertical, autonomous or non-interactive, encapsulated psychological faculties or the input systems. Fodor's view of modularity fits into the symbolic paradigm of cognitive science, which is concerned with representations or the transformation of symbols according to certain rules…

Vivian Maria Heberle (1998), Fragmentos 7:2, P. 116

4. Connectionism

Connectionism first emerged in the 1940s through synergy among disciplines such as neuroscience and computer science (artificial intelligence). By the 1960s, Artificial Neural Network (ANN) models had been developed, but they were soon overshadowed by symbolic models. The 1980s saw a rise of connectionist models (e.g. Parallel Distributed Processing or PDP), and later, in the 1990s, concepts such as neurocomputing, emergentism and dynamic systems theory gained prominence. Connectionists attempted to design systems that can exhibit intelligent behavior without storing, retrieving or otherwise operating on locally stored structured symbolic expressions (for a brief history of the development of these ideas and their relation to language processing, see Christiansen and Chater 1999).

In contrast to modular theories, connectionist frameworks propose horizontal, interactive, non-modular cognitive systems working in parallel and in a non-linear and dynamic fashion. Derived primarily from the fields of psychology, neurology and artificial intelligence, connectionist approaches have emerged as alternative proposals to symbolic theories of cognition that rely primarily on localist representations of memory. Within these frameworks, memory doesn't reside in any single neural element. Instead, it is represented by a pattern of activation over several elements or nodes (hence the representations are termed 'distributed'). The influence of one node on another may be either excitatory or inhibitory, depending on the weight of the connection between them. The weight, which is either positive or negative, varies between -1 and +1, and it specifies the strength or importance of the connection between any two units or nodes. Excitatory (positive) connections increase the activity of the receiving unit, while inhibitory (negative) connections decrease it. The computations within connectionist systems deal with causal processes; they do not transform symbols according to rules as happens in modular systems. In place of innately specified modules, connectionists advocate computational cognitive systems with some degree of learning embedded in them (see box below):

For the connectionists, learning takes place through the strengthening and weakening of interconnections in response to examples encountered in the input. There is no need for learners to appreciate the significance of specific syntactic forms or to make an inductive leap to abstract rules. Instead, learning consists of a network of units that enable the learner to produce rule-like behavior, but the rules themselves exist only as association strengths distributed across the entire network. -- McLaughlin (1990: 624)

There has been a sustained critique of the notion of 'representations' assumed by the proponents of modularity. It has been argued (see e.g. McClelland et al. 2010) that in actual spoken language, representations such as 'phonemes' or 'syllables' are matters of degree. For instance, these authors pointed out that in the English word "softly" there is almost no [t] sound compared to a word such as "swiftly", and that words such as "memory" have more than two but fewer than three syllables. Even at the level of morphology, a word like "prefabricate" is clearly compositional compared to "predict" or "prefer". At the level of syntax, sentences are classed as grammatical or ungrammatical, while the borderline cases are ignored. They pointed out that in connectionist models, the internal representations are graded patterns with varying degrees of distinctness, compositionality, and context sensitivity.

5. Basic components of a connectionist system

1. A set of processing units
2. A set of modifiable connections between units
3. A learning procedure (optional)

5.1 Processing Units

There are three types of processing units: input units (those that receive inputs from sources outside the network), output units (those that can send signals outside the network) and hidden units (those processing units which can only communicate directly within the network). A unit in a connectionist network typically sends a signal to other units in the network or even to structures outside the network. The signal that a unit sends out is determined by the output function. The output function depends upon the state of activation of the unit. The output function of a particular unit is such that it sends out a signal equivalent to its activation value.

5.2 Modifiable Connections

In order for a particular connectionist network to process information, the units within the network need to be connected together. It is via these connections that the units communicate with one another. The connections within a network are usually 'weighted'. The weight of a connection determines how much of the signal fed into the connection is passed between units. Connection weights (sometimes also called 'connection strengths') are positive or negative real numerical values. The amount of input a particular connection supplies to the unit it feeds is the output of the sending unit multiplied by the weight of the connection. There is no limit to the number or pattern of connections which a particular unit may have. Units can have weighted connections with themselves and there can even be loops or cycles of connections.
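The arithmetic described in 5.1 and 5.2 can be illustrated with a minimal Python sketch. It assumes the identity output function mentioned above (a unit sends out its activation value) and simply sums the weighted signals arriving at one receiving unit. The function names and the example activations and weights are hypothetical and chosen only for illustration.

```python
def unit_output(activation):
    """Identity output function: the unit sends out its own activation value."""
    return activation


def net_input(sender_activations, weights):
    """Total input to a receiving unit.

    Each connection contributes (output of the sending unit) x (connection weight).
    """
    return sum(unit_output(a) * w for a, w in zip(sender_activations, weights))


# Hypothetical example: three sending units feeding one receiving unit.
senders = [1.0, 0.5, 1.0]    # activation values of the sending units
weights = [0.8, -0.4, 0.1]   # positive = excitatory, negative = inhibitory
total = net_input(senders, weights)
print(round(total, 2))       # 0.8 - 0.2 + 0.1
```

In this sketch the second connection is inhibitory, so it reduces the total input that the receiving unit gets from the two excitatory connections.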

5.3 Learning Rules

A learning rule is an algorithm which can be used to make changes to strengths of the weights of the connections between processing units. Whereas all connectionist systems have processing units and patterns of connections between the units, not all systems have learning rules. A learning rule is used to modify the connection weights of a network in order to make the network better able to produce the appropriate response for a given set of inputs. Networks which use learning rules have to undergo training, to be able to set the connection weights. Training usually consists of the network being presented with patterns which represent the input stimuli at their input layer. It is common for connection weights to be set randomly prior to training.
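As an illustration of what a learning rule does, the sketch below uses the simple perceptron rule, which is only one of many possible learning rules and is picked here for brevity. The weights start at random values, the network is shown input patterns repeatedly, and each weight is nudged whenever the response is wrong. The task (learning the logical OR of two inputs), the learning rate, and all function names are assumptions made for this example and are not taken from any particular published model.

```python
import random


def step(net):
    """Threshold output: respond 1 if the net input is positive, else 0."""
    return 1 if net > 0 else 0


def train_perceptron(patterns, epochs=200, rate=0.1, seed=0):
    """Adjust connection weights until every training pattern is answered correctly."""
    rng = random.Random(seed)
    n_inputs = len(patterns[0][0])
    # Connection weights are set randomly prior to training.
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
    bias = rng.uniform(-0.5, 0.5)
    for _ in range(epochs):
        errors = 0
        for inputs, target in patterns:
            net = bias + sum(w * x for w, x in zip(weights, inputs))
            error = target - step(net)
            if error != 0:
                errors += 1
                # Strengthen or weaken each connection in proportion to its input.
                weights = [w + rate * error * x for w, x in zip(weights, inputs)]
                bias += rate * error
        if errors == 0:
            break  # every pattern is now answered correctly
    return weights, bias


def respond(weights, bias, inputs):
    """Response of the trained unit to one input pattern."""
    return step(bias + sum(w * x for w, x in zip(weights, inputs)))


# Hypothetical training set: the logical OR of two input units.
OR_PATTERNS = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias = train_perceptron(OR_PATTERNS)
responses = [respond(weights, bias, inputs) for inputs, _ in OR_PATTERNS]
print(responses)
```

Nothing in the trained network stores an explicit rule for OR. The behavior is carried entirely by the final values of the weights and the bias.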

In connectionist models computation is general purpose, parallel and not specialized for any task or function as against modular models which assume stable genetically endowed content-sensitive units that are devised for a very narrow function and work with a particular type of input. Connectionist models place emphasis on early and continuous interaction and feedback to ensure efficient processing especially under difficult conditions (e.g. noisy environment) whereas proponents of autonomous (modular) models posit that processing is primarily bottom-up and that top-down effects operate at a post-lexical level of analysis.

Figure- : A three layered connectionist system (From Berkeley, 1997)

6. Language processing: Connectionist vs. modularist perspectives

6.1 Reading words

In the early 1980s, McClelland and Rumelhart proposed a connectionist model of letter recognition that works for the processing of printed English words. The model operates at three levels: a feature level, where the nodes represent parts of letters (the vertical and horizontal strokes constituting letters such as T and E); a letter level, representing parts of words (entire letters); and a word level. Nodes at the feature level can excite or inhibit nodes at the letter level, and these can in turn excite or inhibit nodes at the word level and be excited or inhibited by them. In addition to the units that encode orthographic, phonological and semantic information, most reading models also have hidden units that mediate the computations between these codes by allowing more complex mappings, thereby increasing the computational power of the network. Hidden units also promote generalization. The authors were able to demonstrate that the behavior of this model conformed to many results from word recognition experiments, in addition to accounting for contextual influences.
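The interaction between the letter and word levels can be sketched in a few lines of Python. This is a deliberately reduced toy and not the published model: the three-word lexicon, the excitation and inhibition strengths, and the function names are all hypothetical, and the letter evidence is supplied directly instead of being computed from a feature level. The input is a four-letter string whose last letter is ambiguous between K and R.

```python
LEXICON = ["WORK", "WORD", "FORK"]  # hypothetical three-word lexicon


def word_activations(letter_evidence):
    """Bottom-up pass: each word is excited by the evidence for its own letters."""
    acts = {}
    for word in LEXICON:
        support = sum(letter_evidence[pos].get(ch, 0.0) for pos, ch in enumerate(word))
        acts[word] = support / len(word)
    return acts


def feedback_to_letters(letter_evidence, word_acts, excite=0.5, inhibit=0.25):
    """Top-down pass: a word excites the letters it contains and inhibits the rest."""
    updated = [dict(candidates) for candidates in letter_evidence]
    for word, act in word_acts.items():
        for pos, candidates in enumerate(updated):
            for letter in candidates:
                if word[pos] == letter:
                    candidates[letter] += excite * act
                else:
                    candidates[letter] -= inhibit * act
    return updated


# Letter-level evidence per position; the last position is ambiguous between K and R.
evidence = [{"W": 1.0}, {"O": 1.0}, {"R": 1.0}, {"K": 0.5, "R": 0.5}]
words = word_activations(evidence)
letters = feedback_to_letters(evidence, words)
best_word = max(words, key=words.get)
best_last_letter = max(letters[3], key=letters[3].get)
print(best_word, best_last_letter)  # prints: WORK K
```

Although the bottom-up evidence for K and R is equal, feedback from the word level favors K, because K completes known words and R does not. This is the kind of contextual influence that the interactive account is meant to capture.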

Traditional modular models of reading, such as the Dual Route Cascaded (DRC) models, posit two separate types of knowledge representation as driving the process of reading: a set of rules for pronouncing regular words (e.g. gave, save), as opposed to memorized lexical representations for reading exception words (e.g. have) that defy grapheme-phoneme rules. The rule and memory systems are said to be governed by different principles, acquired by different mechanisms, relevant to different types of input and located in different parts of the brain. Connectionists argue that assigning rule-governed forms and exceptions to different modules creates paradoxes, because exceptions are not totally arbitrary; they overlap with regular forms at least at the level of structure (see Seidenberg 2005).

By the late 1980s, researchers working with connectionist models were able to use the perceptron learning algorithm to train a single-layer network to map English verb roots to their past tense forms, much as children do, without recourse to positing linguistic rules. Connectionist models have also been shown to account for patterns of impairment in verb morphology after brain damage (see Joanisse and Seidenberg 1999). However, the debate on whether modular or connectionist models provide the best explanation for reading disorders is far from resolved.

6.2 Speech perception

There is considerable on-going debate on the effectiveness of modular versus connectionist models in accounting for the process of speech perception. The following citations capture the unresolved status of this debate:

Speech perception must be counted as a distinct module of the auditory system because, as experiments show, it doesn't depend on the outputs of the other modules: under appropriate conditions, a coherent speech percept is evoked by information in two signals that are perceived as separate sources; moreover, one of the signal is also simultaneously perceived as non-speech…speech perception is thoroughly pre-cognitive and is part of the larger specialization for language (Mattingly and Liberman 1988)

Mattingly and Liberman were of the opinion that two different types of modules are involved in processing speech: open (auditory) modules, involved in the perception of a wide variety of environmental events and of the auditory qualities of pitch, loudness and timbre; and a closed (phonetic) module that is heteromorphic, meaning that the dimensions of the percept do not correspond directly to the dimensions of the signal as they do in the case of open modules.

Kuhl (2006) contested the above thesis by stating thus:

“I will argue that, early in infancy, the mechanisms underlying speech perception are domain general, and rely on processing that is not exclusive to speech signals. Language has evolved to capitalize on general mechanisms of perception and action, and this ensures that all infants, regardless of the culture in which they are born, initially respond to, and learn from exposure to language. As experience with language ensues, neural commitment to the properties of the language to which the infant is exposed results in dedicated neural networks and speech perception becomes more specialized…” (p. 3166)

More recent researchers are of the opinion that the neural circuits underlying language must be highly flexible in order to accommodate both the specificity and the generalization of perceptually learned category boundaries. This observation is particularly relevant for understanding perceptual learning in bilingual and multilingual individuals.

7. Summary

The first part of this unit deals with the notion of modularity in relation to cognitive systems, including language. Different dimensions, types and characteristics of modular systems are described. Some of the limitations of modular models of cognition in accounting for learning, as noted by neuroconstructivist and emergentist frameworks, are pointed out. In the latter part of the module, the concept of connectionism is explained, especially in relation to the Parallel Distributed Processing models developed and used in research on reading and auditory speech perception. It was pointed out that modularists posit independent, memory-based, localist symbolic representations and rules and emphasize bottom-up processing, whereas connectionists emphasize interactive (both top-down and bottom-up) processing in networks of neuron-like units that learn and generalize by making use of feedback.
