Chapter 10 Phonology Versus Phonetics in Speech Sound Disorders

Wolfram Ziegler, Institute of Phonetics and Speech Processing, Ludwig Maximilian University of Munich, Germany. Correspondence to Wolfram Ziegler: [email protected]

Abstract: Historically, aphasiologists have created a deep chasm between phonological and phonetic impairments of sound production. Today, there is still a fundamental separation of aphasic–phonological impairment from phonetic planning impairment (apraxia of speech), both in theoretical models and in clinical taxonomy, although several new developments in phonetics and phonology emphasize that phonetic substance interacts with phonological structure. These developments, however, have not yet found their way into neurolinguistic theories of sound production impairment. This chapter focuses on the question of whether modern theories of aphasic phonological impairment provide appropriate frameworks to study the potential links or divides between phonetic and phonological sound production impairment. Three different accounts are discussed: (1) the phonological mind theory proposed by Berent (2013), (2) connectionist theories in the tradition of Dell (1986), and (3) optimality theory and harmonic grammar approaches developed by Prince and Smolensky (2004) as well as Smolensky and Legendre (2006). A common property of these frameworks is that they lack any theoretical handle by which a potential role of phonetic substance in the origin of phonological errors can be grasped. As a consequence, these theories are ill-suited to illuminate the pathomechanisms underlying phonological impairment in aphasia. In a final section, several approaches are introduced that appear more promising in this regard.

Speaking is one of the most complex and, at the same time, one of the easiest of all human motor activities. On the one hand, it is tremendously complex, because hundreds of muscle commands need to be issued by the brain every second when people speak (Lenneberg, 1967), and these activities need to be synchronized with the cognitive and linguistic processes associated with spoken language production (Levelt, 1989). Speaking is easy, on the other hand, because every healthy child can learn it effortlessly and—unlike writing—without any specific education, and because people do not devote even the slightest attention to the movements of their tongues and lips when they talk. Speaking may accompany almost all other cognitive or motor activities of daily life, without any effort and without noticeable numbers of errors.

Patterns of Sound Production Impairment After Left Hemisphere Stroke

Patients with a left hemisphere stroke may lose this capacity. When they speak, their words sound distorted, sometimes beyond being recognizable. One particular subpopulation of these patients, usually those with a lesion in the anterior portion of the language area, suffers from a disorder termed apraxia of speech (AOS). These patients appear to have lost the ease of speaking and seem to be helplessly confronted with the plethora of muscle contractions that need to be controlled for the production of a simple word or phrase, although they still seem to “know how the words should sound” (Ziegler, Aichert, & Staiger, 2012). Their speech is halting and effortful, often with problems initiating an utterance, with groping movements, multiple false attempts, and self-corrections. They make many speech errors1 that often lead to phonetically distorted sound productions. Auditory repetition of the German word Wald (English = forest) may, for instance, sound like

/valt/ → [ffːː.. ɰalːtˢ] (1)

Often these patients also commit phonemic errors of a well-articulated quality, such as

/bluːmə/ → [bruːmə] (English = flower) (2)

Because they know from life-long experience how easy it has always been to talk, they are desperate when facing this awful struggling and groping for every single syllable after their stroke. Patients with AOS are generally assumed to suffer from phonetic planning impairment (e.g., Ziegler, 2002). A different group of stroke patients, usually with lesions in more posterior perisylvian (i.e., temporo-parietal) regions of the left hemisphere, may experience striking sound production problems as well, yet of a different kind: Their speech is much more fluent most of the time, and their output sounds largely well-articulated, despite the frequent occurrence of phonemic errors of the sort mentioned earlier (see Example 2). Quite complex errors may occur as well. As an example, a patient with a left inferior parietal stroke produced, in an auditory repetition task, the following errors (excerpted from a longer list):

/frɔʃ/ → [ʃɔʃ] (English = frog) (3)
/riŋ/ → [nil] (English = ring)
/dax/ → [ʃtɔtʃ] (English = roof)
/print͡sɛsin/ → [mɛlˈtikin] (English = princess)

This condition is called (aphasic) phonological impairment. It is one of the most frequent symptoms found in patients with aphasia and may occur in virtually every aphasia syndrome. Patients with this impairment usually lack the conspicuous groping movements of the articulators seen in apraxic speakers—that is, errors of the sort seen in Example 1 would be very atypical. Moreover, they often seem to be much less concerned about their failures or, in some cases, may even be unaware of them. Patients who have predominantly this symptom—with otherwise good comprehension, preserved word retrieval, and preserved syntax—are clinically classified as having conduction aphasia; however, this constellation is rare. More often, phonological impairment is accompanied by impaired comprehension, and in extreme cases, it may present as a largely incomprehensible “phonemic jargon” (Butterworth, 1992). In the cognitive neurolinguistic tradition, phonological impairment is subdivided into a lexical and a postlexical subtype, and it is ascribed to a processing level where phonological form information is retrieved (Goldrick & Rapp, 2007; Schwartz, 2014).

1 All speech errors mentioned in this chapter are authentic examples from German-speaking patients with AOS or with postlexical phonological impairment, respectively.

The Motor–Linguistic Dualism of Sound Production Impairment

Of note, aphasiologists have construed a deep chasm between phonological impairment and AOS by classifying the former as an aphasic (i.e., nonmotor) symptom and the latter as a motor (i.e., nonaphasic) symptom. Being a part of aphasia, phonological impairment is allocated to the language domain and is therefore, by virtue of standard theories of aphasia (e.g., Caramazza & Zurif, 1976), considered an amodal dysfunction of purely abstract processes and representations—that is, a disorder affecting the core of the language module. AOS, by contrast, is said to interfere with one of the tools through which language can be implemented in the physical world—that is, the vocal tract motor system. As is taught in linguistics courses, the implements of linguistic structure—script, gesture, or speech—are immaterial to linguistic competence; hence, the mechanisms deemed to underlie aphasic phonological impairment are entirely divorced from those underlying AOS. In this understanding, phonological impairment goes more with syntactic or semantic impairment than with AOS, because—like the former (and unlike the latter)—it is part of the language module. The concept of “pure AOS,” which has always played an important role in clinical research (e.g., Graff-Radford et al., 2014), is revealing in this regard, because it views the absence of agrammatism or writing impairment as a crucial criterion for the diagnosis of AOS and as an argument against a linguistic speech problem—not considering that phonological impairment can be dissociated from other aphasic symptoms as well (for a discussion, cf. Ziegler, 2002). The virtually insurmountable divide between a phonological and a phonetic impairment level has created models in which the processing of linguistic units, including phonological representations, is encapsulated from the physical implement of spoken language. In most cognitive neurolinguistic models, speech is an unspecified appendix to lexical and response buffer modules whose properties are uninfluenced by the properties of the speech motor system (e.g., Patterson & Shewell, 1987). Likewise, Dell’s (1986) spreading activation model of word production ends where the motor processes of speaking begin. Phonetic implementation in these models is not only strictly feedforward, but the two parts of the implementation mechanism are even entirely alien to each other.

In clinical practice, this taxonomic situation creates high pressure on clinicians in their diagnostic decision about whether a patient’s speech sound impairment is apraxic or phonological, because in making this decision, they need to balance between two fundamentally distinct worlds—movement and language. The situation can become rather challenging, because AOS and phonological impairment (especially of the conduction aphasia type) may appear very similar on the surface, and because the two impairments are influenced by similar critical variables—for example, word length, lexical frequency, or phonological complexity. Therefore, the requirement to deliver a judgment of such primal taxonomic importance and theoretical momentousness is often at variance with the clinical intuition that neither the symptoms nor the therapeutic needs are exceedingly different in these two groups. Notably, the mere presence of other aphasic symptoms—for example, agrammatism or dyslexia—is often considered circumstantial evidence that the observed speech sound impairment is phonological.

A Clinical Dualism With a Vanishing Theoretical Basis

Although the idea of a fundamental cleft between phonological and phonetic processing steps is still maintained very strictly in clinical diagnostics (e.g., Wambaugh, Duffy, McNeil, Robin, & Rogers, 2006), its theoretical underpinnings have long undergone progressive dissolution. Textbook definitions of aphasia nowadays dispense with the term “amodal” in their characterization of the disorder and emphasize the distinction between modalities of input and output processing of words (e.g., Caplan, 1992, Table 1.1), thereby dismissing the idea of aphasia as a destruction of a modality-independent language module. In the same vein, neurolinguistic processing models have shattered the framework of amodal phonology by introducing separate input and output pathways for spoken and written language: The phonological code is a code for speaking and auditory comprehension (Hickok, 2014). Remarkably, in an influential (and heavily discussed) conceptual article, Hauser, Chomsky, and Fitch (2002) completely abandoned the idea that phonology is part of the “faculty of language in the narrow sense,” which they defined as “the abstract linguistic computational system alone, independent of the other systems with which it interacts” and which deals with syntax, but not with sound patterns (p. 1571). This view has been elaborated on more recently by Berwick, Friederici, Chomsky, and Bolhuis (2013), for whom both phonology and phonetics pertain to an “externalization system” that is distinct from the core language system. A similar reunion of phonology with phonetics has also been advocated from the opposite perspective, according to which the scope of the faculty of language should be expanded to include, in addition to phonology, the phonetic aspects of speech processing (Liberman & Whalen, 2000; Okanoya, 2007). In a similar vein, Collier, Bickel, van Schaik, Manser, and Townsend (2014) proposed that phonetics and phonology, rather than syntax, constitute the core achievement of language evolution. Irrespective of which viewpoint is adopted, the big divide—if there is one at all—is no longer between phonology and phonetics but rather between phonology/phonetics, on the one hand, and other components of linguistic processing, such as syntax, on the other hand. This thinking is also reflected in the development of psycholinguistic models integrating high-level lexical with lower level phonetic properties or considering the existence of a cascading flow of information across these domains (Goldrick, Baker, Murphy, & Baese-Berk, 2011; Hickok, 2014). Finally, phonological theory itself has split into a multitude of approaches, only some of which still maintain the idea of phonology as an encapsulated symbol processing system (e.g., Berent, 2013), whereas others take nonabstractionist—that is, substance- and usage-based—approaches (e.g., Blevins, 2004; Bybee, 2003; Goldstein, Byrd, & Saltzman, 2006; Pierrehumbert, 2002).

Organization of the Chapter

This chapter is intended to boost the debate on the phonetics–phonology distinction in aphasiology by surveying different theoretical frameworks of speech production with regard to their consequences for the relationship between phonological and apraxic sound production impairment. The chapter starts with a discussion of the phonemic principle, which is basic to most theoretical and clinical approaches. It then expands on three modern theories: (1) a radically abstractionist perspective of spoken language (the “phonological mind”; Berent, 2013), (2) connectionist approaches, and (3) the “harmonic mind” framework (Smolensky & Legendre, 2006). The chapter ends with an outlook emphasizing the role of substance-based approaches and the importance of a neuroanatomical perspective.

The Phonemic Principle in Sound Production Impairment

Phonetic Well-Formedness of Phoneme Errors

Clinically, a major criterion to demarcate phonological from motor speech impairment is linked with the notion of phonemic errors. A patient who, in naming the picture of a snail (German = /ʃnɛkə/), produces a well-articulated but phonemically distorted response, for example,

/ʃnɛkə/ → [ʃtɛkə] (4)

is considered to have substituted one phoneme for another—that is, /t/ for /n/. Because of the phonetically well-formed shape of the utterance [ʃtɛkə], it is deemed unlikely that this error reflects a motor impairment, especially when the patient’s overall output is consistently (or at least predominantly) fluent and well-articulated. A motor planning deficit—so the argument goes—should necessarily create phonetic distortion and dysfluency and, hence, cannot generate a well-articulated [ʃtɛkə]. The standard interpretation is therefore that the patient who produced [ʃtɛkə] has incorrectly selected and activated an “abstract segmental code” for /t/ at a position where the abstract code for /n/ should actually be filled in (Laganaro, 2014). This characterization of the paraphasic pathomechanism postulates abstractness of the underlying processes and representations in the sense that it makes no particular reference to phonetic substance (Béland, Caplan, & Nespoulous, 1990) and thereby enforces the motor–linguistic dualism described earlier. In this approach, the observation that patients with AOS may also often produce instances of phonetically well-formed phoneme substitutions, as explained in the introduction section, is explained as a sign of comorbidity—that is, a coexistence of linguistic and motor symptoms in the same patient (Wambaugh et al., 2006)—or as an indication of an interaction between two (still strictly separate) processes of phonological and phonetic encoding (Laganaro, 2012). At the base of this thinking is the phonemic principle, which, in one way or another, is acknowledged by most phonological theories.

Discreteness

According to this principle, the sound structure of words and phrases is composed of a small number of categories—that is, phonemes. An antagonism between the phonemic building blocks of words, on the one hand, and the motor patterns associated with their pronunciation, on the other hand, has been construed from the fact that the former are discrete and the latter continuous: Whereas discreteness is seen as a characteristic property of the units of symbolic cognition, gradedness is a property of physical events, and—by virtue of the ancient principle that natura non facit saltum (nature does not make jumps)—the two are deemed irreconcilable.2

Yet, coming back to the enigmatic occurrence of phoneme errors also in apraxic speakers (see Example 2), there are several technical explanations of how observations of discrete phoneme substitutions can actually be reconciled with the deficient phonetic planning mechanism underlying AOS. One explanation refers to a general principle of motor cognition—that is, that a person’s brain entertains a model of human anatomy and of the body parts implied in motor activities (Berlucchi & Aglioti, 1997; Goldenberg, 2013). Because speaking is based on a small set of discrete body parts functioning as primary articulators, any mis-selection of a moving organ in the motor planning for a word such as /tomaːtə/ (English = tomato) is likely to result in a discrete error concerning place of articulation (e.g., [pʰomaːtə]). Likewise, any omission of a gesture of one of these body parts (e.g., the velum) may alter the segmental content in a discrete manner (e.g., [ʃtɛkə] for /ʃnɛkə/). In milder forms of apraxic impairment, motor planning errors may not even be likely to transgress the inventory of motor “coalitions” that a patient has acquired and practiced over decades—for example, the language-specific patterning of labial–lingual–velar–laryngeal gestures with their characteristic phase relationships (Kelso & Tuller, 1981)—and therefore apraxic motor systems may still continue generating well-formed sound patterns repeatedly. Other explanations of such behavior refer to the “quantal theory” of speech (Stevens, 1989), which provides an explanatory framework for a number of discontinuity phenomena in motor speech. As an example, discontinuities of aerodynamic conditions at the level of the vocal folds may lead to a categorical switch from a state of harmonic vocal fold oscillations to voiceless phonation, even when there is only little inaccuracy in the control of glottis width or an alteration of transglottal airflow because of slight articulatory aberrations. In phonological theory, this argument has played an important role in formulating phonetically based markedness relationships between voiced and voiceless plosives and geminates at different places of articulation (e.g., Hayes & Steriade, 2004). The quantal nature of phonetic functions has also been established for articulatory-to-acoustic transformations (certain regions in the vocal tract are insensitive to small articulatory aberrations, whereas in other regions, small changes may provoke substantial acoustic changes) and for acoustic-to-auditory transformations (“categorical perception”). The perceptual bias argument (i.e., that articulation errors may be perceived as well-formed) plays a fundamental role in discussions about speech errors in the account proposed by proponents of articulatory phonology (AP; e.g., Pouplier, 2007a), as discussed later. These arguments imply that the occurrence of categorical errors in AOS is the rule rather than the exception.

Yet, the assumption that phonetically well-formed errors in a given patient originate from an apraxic pathomechanism is reasonable only if these errors are part of a broader clinical pattern supporting this assumption—that is, when there is also a substantial number of phonetically distorted sounds and sound transitions, a groping for articulations, and slowed and dysfluent speech—in short, the overall clinical pattern of AOS. In turn, the technical arguments mentioned earlier are presumably not appropriate to explain the clinical pattern of fluent phonological impairment—for example, in conduction or Wernicke’s aphasia—because these syndromes lack any other obvious characteristics of a disordered speech motor system.

Nonetheless, the conclusion that phonological errors in these cases necessarily reflect mechanisms pertaining to a symbolic domain that is autonomous from the perceptual and motor mechanisms of speaking—simply by virtue of their discrete nature—is premature. First, discrete representations of continuous processes are commonplace in the modeling of complex dynamical (biological or physical) systems—for example, by differential equations—suggesting that discreteness of a representational system for dynamic processes does not automatically entail autonomy of the representations from their substance. This thinking has been influential in resolving the antagonism between the continuous and the symbolic in dynamic systems phonetics (e.g., Gafos & Benus, 2006). Second, beyond these modern approaches, there is a long tradition in phonetics of explaining the discreteness of phonological form as emergent from the continuous properties of phonetic function (e.g., Liljencrants & Lindblom, 1972; Lindblom, MacNeilage, & Studdert-Kennedy, 1983). Third, the phonemes involved in a phonological error—for example, the /t/ that spuriously appeared for /n/ when the lexical form /ʃnɛkə/ was pronounced as [ʃtɛkə]—are by no means abstract-symbolic in the radical understanding of the phoneme as an arbitrary signifiant containing no intrinsic relationship at all with its signifié—that is, articulation. On the contrary, the /n/ of /ʃnɛkə/, for instance, contains at least the information that the velum should be lowered and that there should be harmonic oscillation of the vocal folds in the production of this segment.3 From this perspective, the discrete “symbols” of /ʃnɛkə/ serve, even at the level of underlying lexical forms, as an interface to the sensorimotor system that specifies salient elements (though not the full details and not always faithfully) of the motor implementation and the auditory quality of the sound patterns of words. This point is taken up again in the section The Faithfulness Principle Constrains the Abstractness of Lexical Representations.

2 Despite its wide acceptance, the notion of the phoneme and its role in phonological theory is nonetheless controversial (e.g., Port, 2010). In particular, articulatory phonology dispenses with the phoneme concept and is instead built on the concept of (still discrete) articulatory gestures (e.g., Goldstein et al., 2006).

3 There may be motor equivalence principles that weaken or even dissolve this one-to-one correspondence, but such principles still pertain to the perceptuo-motor domain and do not require a symbolic account.
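The quantal argument sketched earlier in this section can be made concrete with a small illustrative computation. The following sketch models an articulatory-to-acoustic mapping as a steep sigmoid; all numerical values (the location of the quantal boundary, the steepness, the size of the articulatory error) are assumptions chosen purely for demonstration and are not parameters of Stevens’s (1989) account. The point is merely that the same small articulatory inaccuracy is acoustically negligible in a stable region but quasi-categorical near the boundary:

```python
# Illustrative sketch of a "quantal" articulatory-to-acoustic mapping:
# a continuous articulatory parameter maps onto an acoustic outcome through
# a steep sigmoid, so the same small articulatory error is inaudible in the
# flat regions but categorical near the knee. All numbers are arbitrary.
import math

def acoustic_outcome(articulation: float, knee: float = 0.5, steepness: float = 40.0) -> float:
    """Map a continuous articulatory parameter (0..1) onto an acoustic value (0..1)."""
    return 1.0 / (1.0 + math.exp(-steepness * (articulation - knee)))

for target, label in [(0.15, "stable region"), (0.48, "near the quantal boundary")]:
    produced = target + 0.05  # the same small articulatory inaccuracy in both cases
    delta = acoustic_outcome(produced) - acoustic_outcome(target)
    print(f"{label}: articulatory error 0.05 -> acoustic change {delta:.3f}")
```

Running this sketch prints an acoustic change of roughly 0.000 in the stable region but of roughly 0.46 near the boundary, which is the discontinuity that allows a graded motor planning deficit to surface as a discrete, phoneme-like error.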

The Case of Metathetic Errors

In psycholinguistics, a particularly strong case for the abstractness of phonemic errors has often been made by referring to contextual errors in spontaneous slips of typical speakers and in paraphasic speech (e.g., Dell, 2014; Laganaro, 2014). When an aphasic patient, for instance, produces

/meloːnə/ → [menoːlə] (English = melon) (5)

two phonemes, /l/ and /n/, exchange their places (metathesis), which points at their discreteness and at the discreteness of the slots into which they go (i.e., the syllable onsets). That they do so without any apparent phonetic motivation is taken as proof of the autonomy of the symbolic level. Yet, precluding a phonetic grounding of metathetic sound errors is premature. The process of sound metathesis is well known from studies of diachronic sound change, and quite successful attempts have been made to provide phonetic (i.e., mostly auditory-based) explanations for the occurrence of such phoneme exchanges (e.g., Blevins, 2004). In the example mentioned earlier, for instance, the auditory reference frame of the word /meloːnə/ may have been blurred by the presence of two features spreading acoustically over a larger than segmental range—that is, nasality and laterality. Extensive feature spreading may have caused uncertainty in the speaker about where in the utterance the lateral and the nasal quality should be located. Such an explanation would be based on theories emphasizing the role of auditory information in the guidance of articulation in speech production (e.g., Guenther, Hampson, & Johnson, 1998; Hickok, 2014). Other, more genuinely motor-based mechanisms have been proposed to cause contextually motivated slips in error elicitation experiments (e.g., Goldstein, Pouplier, Chen, Saltzman, & Byrd, 2007; Pouplier & Hardcastle, 2005).

Contextual Accommodation

Another important aspect of the phonemic principle, which plays a prominent role in discussions about spontaneous speech errors and phonological sound production impairment, is context independence. Consider again the metathesis problem. A patient with AOS produced, in an auditory repetition task, after several false starts, the metathetic error

/kyrbis/ → [ˈbyrkis] (English = pumpkin) (6)

Remarkably, the misplaced /k/ and /b/ in this utterance adopted the context information from their new environment—that is, the misplaced /b/ inherited the anticipatory lip rounding for /y/, whereas /k/ lacked any lip-rounding quality and sounded less aspirated in its new unstressed context than it would have in the original stressed position of the word onset. This accommodation property of context errors has frequently been reported in studies of spontaneous slips in healthy speakers and occasionally also in neurolinguistic studies (for a discussion, see Goldrick & Rapp, 2007). It is considered as proof that such errors affect context-independent (and in this sense abstract) units.

A more recent discussion has centered on case reports of patients with sound production impairment in whom allophonic accommodation has been studied systematically. Buchwald and Miozzo (2011, 2012) and Miozzo and Buchwald (2013) reported on two patients who demonstrated a clear dissociation in this regard: patient DLE, a left-hander with a vast lesion covering almost the whole perisylvian region of the left hemisphere and also including portions of the left thalamus, and patient HFL, a right-hander with an infarction of the left middle cerebral artery whose lesion was not reported in greater detail. Both patients were aphasic, with moderately (DLE) and severely (HFL) impaired naming, and both had AOS. The two patients were remarkable for their frequent deletion errors in initial consonant clusters—for example, deletion of /s/ in s-nasal clusters (small → mall) or in s-plosive clusters (spill → pill). Yet, they differed in how the remaining nasal or plosive consonant of the cluster was produced in cases of deletion: DLE accommodated the context—that is, produced nasal consonants after s-deletion with durations typical of simplex initial nasals, and plosives after s-deletion with voice-onset-time values typical of initial (aspirated) voiceless simplex plosives (i.e., [pʰil])—whereas HFL maintained the phonetic properties of nasals and plosives typical of the s-cluster context even when the /s/ was omitted—for example, /spil/ → [bil]. DLE’s behavior was interpreted to reflect consonant deletion at a stage of abstract phoneme representations—that is, without specification of the context—whereas HFL’s error pattern was considered to reflect a pathomechanism occurring at a later stage—that is, after phonemes were accommodated to their context. In short, in DLE, the error was phonological, whereas in HFL, it was apraxic.

These single case results definitely require replication, given the exceptional clinical constellation in the two patients and the fact that both probably had lexical, postlexical, and apraxic impairment at the same time. From experience, one would say that DLE’s accommodation behavior is much more representative than the nonadaptive behavior shown by HFL, irrespective of whether a patient is speech apraxic. A challenging theoretical problem here is that no clear assumptions can be made about the origin and the level of the accommodation mechanisms after a consonantal segment has been omitted. In the case of plosive aspiration, for instance, rather basic dynamic and aerodynamic mechanisms can be held responsible for the fact that a plosive is aspirated or unaspirated after alterations in supralaryngeal timing through a loss of a proportion of the onset (i.e., /spil/ → [pʰil]). Moreover, other factors may also contribute to allophonic accommodation, such as coordinative adaptation mechanisms emerging as a result of extensive motor learning. More generally, the idea that motor planning (or speech apraxic) mechanisms come into play only after all phonetic details of a word have been specified at some abstract-symbolic level is extremely restrictive and implies an inappropriate flattening of the hierarchy of action control mechanisms.

In summary, the lack of a straightforward phonetic explanation for some classes of segmental errors should not lead researchers to draw premature inferences about the absence of such explanations and to shift the burden of explanatory labor from the substantial to the symbolic.

The Phonological Mind: A Framework for Aphasic Phonological Impairment?

The Algebra of Speaking

According to a theory proposed recently by Berent (2013), and in the tradition of Chomsky and Halle’s (1968) The Sound Pattern of English, humans are genetically endowed with a phonological mind that embraces abstract grammatical knowledge—that is, “core knowledge” about how abstract meaningless elements are combined “to weave linguistic messages” (Berent, 2013, p. xiii). The phonological mind is an algebraic framework in people’s brains that deals with abstract symbols and connects them according to grammatical constraints. As such, it does not care about the particular communication channels that it potentially governs (e.g., the vocal tract motor system) but is more generally concerned with “linguistic elements” of any kind. Hence, speech is only one arbitrary incarnation, whereas sign language (Berent, 2013, p. 12) or Amazonian whistle speech (Berent, 2013, p. 133), for instance, are other potential implements of the same mental organ. The core architecture of the phonological mind is, as a matter of principle, “autonomous from the perception/production channels” of linguistic communication (Berent, 2013, p. 21) and is uninfluenced by phonetic knowledge (Berent, 2013, p. 187). In this sense, the theory of the phonological mind serves as a perfect blueprint of the classical motor–linguistic dichotomy of neurogenic sound production impairment sketched in the introduction section. It predicts that a neurologic condition should exist that specifically destroys the phonological mind proper—that is, the universal grammar that constrains linguistic (sound or gestural) patterns. This phonological impairment should present as clearly amodal and as interfering with purely abstract, symbolic processing mechanisms, and it should therefore be considered fundamentally distinct from impairments located at levels on which phonetic contents are traded. Conversely, any impairment that directly afflicts the phonetic values that people enter into the algebraic calculus of the phonological mind when they produce or understand speech is by definition a phonetic syndrome and is caused by mechanisms that are external to the phonological machinery. Hence, the phonological mind theory, although it is a theory about typical phonological processing, would imply a clear-cut distinction and profound chasm between phonological and phonetic speech sound disorders and would exclude from consideration any disturbances relating to the phonetic substance of speech processing.

Phonological Impairment—A Disorder of Algebraic Brain Functions?

Which symptoms could be predicted from an impairment of the phonological mind? Because the core knowledge of the phonological mind is grammatical, neurologic conditions destroying this module would necessarily destroy grammatical knowledge. Berent (2013) addressed several elements constituting this knowledge. One of these elements is a speaker’s/listener’s capacity to represent phonological equivalence classes—that is, to make a distinction between consonants and vowels or to recognize syllabicity (Berent, 2013, Chapter 4). Destruction of this knowledge would therefore be likely to create errors by which vowels are exchanged with consonants, or errors destroying syllabicity. Yet, these error types do not occur in aphasic patients with phonological impairment.

Another important element of the phonological mind theory is that typological universals of phonological grammar are implemented in people’s brains. An example treated at some length in Berent (2013) is the singleton–geminate contingency—that is, the observation that if a language lacks a particular consonant (e.g., the plosive /p/), it will also lack the corresponding geminate (i.e., /pp/, as is the case in Egyptian Arabic; Berent, 2013, Chapter 3). Implementation of such contingencies in language users’ brains would imply, for instance, that speakers of Egyptian Arabic avoid /pp/ because they also avoid /p/. This should be understood in the most abstract sense—that is, that they conform to a syllogistic rule of the form “if x is banned, xx must also be banned,” with x representing some abstract variable rather than a speech sound (Berent, 2013, section 3.1.1). Hence, if such contingencies were destroyed because of a brain lesion, researchers would, for instance, observe /pp/ geminates in patients speaking Egyptian Arabic. Although, to my knowledge, clinical data on specifically this issue do not exist, one can safely rule out that such error types occur in aphasic phonological impairment, simply because patients with aphasia rarely—if ever—produce phonemes not pertaining to their native phoneme inventory, irrespective of any typological contingencies that these phonemes might be subjected to.

More generally, patients with phonological impairment as understood here do not produce any illicit forms whatsoever. For instance, English-speaking patients would not violate the rule that assigns the voicing of the plural -s and say, for instance, *[dɔgs] (dogs) or *[kʌpz] (cups); they would also not violate the assimilation rule by saying inpossible instead of impossible or systematically violate stress shift principles and say [ˈmætəlik] instead of [məˈtælik]. Even the opposite is true: If a German or Dutch patient would eventually produce a word-final voiced obstruent—that is, *[kind] for /kint/—or an English patient would produce *[kʌpz] for cups, one would rather suspect him or her to have an apraxic problem of planning laryngeal movements than a phonological impairment.

Sonority

A paradigm that has been exploited most extensively by Berent and colleagues is the sonority restriction on syllable structure—that is, the fact that there is a hierarchy of phonological well-formedness depending on the sonority gradient in syllable onsets and codas (Berent, 2013; Berent, Lennertz, Jun, Moreno, & Smolensky, 2008; Berent, Steriade, Lennertz, & Vaknin, 2007). A frequently mentioned example is the series

blif > bnif > bdif > lbif (7)

with blif being the most preferred, and lbif the least preferred, syllable. This ordering holds universally—that is, across languages—irrespective of whether any one of the four syllables is attested in a language. The tenet of the phonological mind theory is that the preference relationship between these syllables represents innate knowledge that is active in the brains of all humans and cannot be explained by phonetic differences between the implied consonant clusters (Berent et al., 2007, 2008). Without dealing with the problems associated with the experimental paradigms applied by Berent and colleagues in their evaluation of this postulate (see Peperkamp, 2007, for a thoughtful discussion), one recent study deserves mentioning because it relates the sonority restriction hypothesis to a brain region implied in sound production impairment—that is, Broca’s area (Berent et al., 2014). In this study, healthy participants were presented with the blif–bnif–bdif–lbif stimuli mentioned in Example 7, interspersed with stimuli in which a schwa was inserted between the two onset consonants—that is, belif–benif–bedif–lebif. Participants were required to count the syllables of these stimuli. It was known from earlier studies, and was replicated in this investigation, that participants tend to perceive bnif, bdif, and lbif as disyllabic, with increasing error rates from bnif to lbif, which the authors interpreted as a sign of the listeners’ knowledge of universal grammatical constraints (cf. Berent et al., 2007).4 Remarkably, the hemodynamic response during syllable counting was modulated by the preference hierarchy of the four monosyllabic stimuli, and this activation peaked in one of the traditional language areas—that is, Broca’s area (Berent et al., 2014). Without considering the problems arising from the experimental paradigm and the interpretation of the complex bilateral activation patterns, the question of particular interest here is whether a destruction of the activated areas because of a stroke would actually lead to a loss of the purported knowledge of language universals. An obvious prediction would be that patients with such a lesion start ignoring, for instance, sonority restrictions on syllable structure and generate illegal forms such as *lbue for blue. However, as stated earlier, this is not the case in aphasic patients with phonological impairment who, as a rule, produce perfectly well-formed sound structures. Hence, the nature of the phonological sound production impairment in speakers with aphasia is by no means algebraic in the sense that it interferes with the core of phonological competence constituting the phonological mind described by Berent (2013). It is remarkable that virtually none of the core features of the phonological mind have an equivalent in neurological sound production impairment, which casts doubt on the belief that such a grammar is implemented in the human brain.

4 However, see Peperkamp (2007) for a critical discussion.
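The preference series in Example 7 can be restated as a simple sonority computation. The following toy sketch assumes an arbitrary numeric sonority scale; only the ordering of the categories (stop < fricative < nasal < liquid) reflects standard phonological practice, and the scale values themselves are illustrative assumptions, not Berent’s:

```python
# Toy scores for the onsets in Example 7. The numeric sonority values are
# illustrative assumptions; only their ordering follows phonological practice.
SONORITY = {"b": 1, "d": 1, "f": 2, "n": 3, "l": 4}

def onset_rise(onset: str) -> int:
    """Sonority rise across a two-consonant onset: sonority(C2) - sonority(C1)."""
    first, second = onset
    return SONORITY[second] - SONORITY[first]

# Larger rises are preferred; the computed scores (3, 2, 0, -3) reproduce
# the universal ordering blif > bnif > bdif > lbif.
for syllable in ["blif", "bnif", "bdif", "lbif"]:
    print(syllable, "onset sonority rise:", onset_rise(syllable[:2]))
```

The fact that such a trivial phonetically grounded gradient already reproduces the preference hierarchy illustrates why the innateness interpretation of the series is contested.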

Connectionist Frameworks of Sound Errors in Word Production

Connectionist accounts of phonological processing in typical and aphasic speech production are in an important way antithetic to universal grammar accounts, because they dispense with the assumption that regularity in the sound patterns of speech reflects grammatical rules or constraints in the speakers’ brains. Instead, the regular phonological patterns of words are shown to arise from statistical learning in a network of nodes with massive interconnections. Most generally, such systems can account for remarkably complex linguistic patterns and have a large potential for generalization—that is, for the processing of new input (Elman, 1993). In phonology, connectionist models dissolve the “list–rule principle” of generativist accounts (i.e., the principle that a set of rules operates on a separate list of lexical units) by implementing grammatical rules in the lexicon through principles of statistical learning (Rumelhart & McClelland, 1986). When trained on a set of arbitrary semantic representations of real words, such networks learn to generate their phonological patterns. In a much-cited article, Dell (1986) presented a spreading activation model of word retrieval constrained by syllable structure information to show that the patterns of spontaneous errors that occur after a training of the model resemble the patterns of slips made by typical speakers. Even when no structural information is included, as in Dell, Juliano, and Gowindjee (1993), the “slips” committed by such models still resemble those made by typical speakers, to the extent that only few phonotactic violations arise.

The connectionist approach has also been used to model language processing in aphasia. In these accounts, the aphasic condition is simulated by a “lesioning” of the networks after training—for example, by altering connection weights, lowering decay rates, reducing input connections to the nodes of a particular layer, or applying noise to the output of nodes (e.g., Dell, 1986; Foygel & Dell, 2000; Nozari, Kittredge, Dell, & Schwartz, 2010; Ueno, Saito, Rogers, & Lambon Ralph, 2011). Such manipulations cause disturbances in the behavior of the network, with the type of errors depending on the network layer on which the “lesions” are applied. In two-step computational models of word processing, three layers are distinguished—semantic, lexical, and phonological—hence, the errors that may arise can be semantic, lexical (“formal”), sublexical (“phonological”), or mixed (e.g., Schwartz, Dell, Martin, Gahl, & Sobel, 2006). Typically, these studies examine the relative proportions of such error types in simulations of naming, repetition, or comprehension, and they often test for influences of variables such as word frequency or lexicality on error rates. The results of these simulations are then compared with actual corpora obtained from patients with aphasia (e.g., Nozari et al., 2010; Rapp & Goldrick, 2000; Schwartz, Wilshire, Gagnon, & Polansky, 2004). For a discussion of this approach, readers are referred to Rapp and Goldrick (2000, 2006) or to Ruml, Caramazza, Capasso, and Miceli (2005).

In the context of this chapter, it is important to note that the existing models are usually designed to describe the architecture of lexical access mechanisms on a macroscopic level (the two-stage model) rather than to simulate the details of aphasic phoneme error patterns. Because their emphasis is on the interactions between stages and on differences between input and output modalities, their input is often not sufficiently representative, phonologically, to cover a substantial proportion of aphasic phoneme error types. For instance, the Lichtheim 2 model designed by Ueno et al. (2011) is confined to three-mora Japanese words; hence, the model cannot explain errors that would transgress this small inventory of phonological forms, as would often be the case in patients with speech sound impairment. Another characteristic property of most of these models is that they eschew incorporation of the continuous phonetic level—that is, they use discrete instead of continuous time and discrete phonetic features instead of graded acoustic or motor information.5 In Dell’s (1986) seminal model, for instance, all connections between processing nodes are construed as bidirectional, whereas the final implementation step by which the motor system is activated remains unidirectional. In the Lichtheim 2 model (Ueno et al., 2011), auditory input and articulatory output nodes are defined as discrete feature vectors to keep the complex computations manageable. Because the architecture and functioning of these networks is thereby sealed off from motor and auditory influences, word processing and phonological acquisition cannot be influenced by what happens at the front ends—that is, within the motor/auditory apparatus of speaking/understanding. As a consequence, these models (at the current stage of their development) cannot be used to address the interface problem of phonetic versus phonological impairment in patients with AOS and aphasia. Yet, extensions of such accounts into the auditory and motor domains are available (Kello & Plaut, 2004; Plaut & Kello, 1999) and may offer new options to study the phonetics–phonology interface in patients with speech sound disorders.

5 An exception is the model proposed by Plaut and Kello (1999), which uses graded feature information but discrete time steps. The model presented by Kello and Plaut (2004) deals only with articulatory-acoustic mapping, using real electropalatography data of articulation and power spectra of speech acoustics, with no lexical representations. Neither of these two models was used to simulate aphasic or apraxic speech.
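The “lesioning” logic described in this section can be illustrated with a deliberately minimal sketch. This is not Dell’s (1986) model or any of its published successors; the connection weights are toy assumptions, and the sketch shows only how adding noise to the connections of an intact network turns a reliable phoneme selection process into one that occasionally produces well-formed substitutions:

```python
# Minimal sketch of network "lesioning" (not Dell's actual model): assumed
# toy connection weights map a lexical node onto candidate onset phonemes
# for /ʃnɛkə/; Gaussian noise on the weights simulates the lesion.
import random

random.seed(1)  # for reproducible counts

# Assumed toy weights: the correct phoneme /n/ has the strongest connection.
weights = {"n": 1.0, "t": 0.6, "k": 0.5, "m": 0.4}

def select_phoneme(noise_sd: float) -> str:
    """Select the most activated phoneme after noise is added to each weight."""
    activation = {p: w + random.gauss(0.0, noise_sd) for p, w in weights.items()}
    return max(activation, key=activation.get)

for noise_sd, condition in [(0.05, "intact"), (0.50, "lesioned")]:
    outcomes = [select_phoneme(noise_sd) for _ in range(1000)]
    error_rate = 1.0 - outcomes.count("n") / len(outcomes)
    # The lesioned run yields occasional substitutions such as /n/ -> [t],
    # which remain well-formed because every candidate node is a licit phoneme.
    print(condition, "error rate:", round(error_rate, 2))
```

Note that, as in the published models, the errors produced by such a network are always phonemically well-formed by construction, which is precisely why this architecture has no handle on phonetic distortion.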

The Harmonic Mind

Several new approaches to phonology strive to reconcile universal phonological grammar accounts with connectionism, with the aim of integrating symbolic grammatical theory with microlevel computational accounts. One of these developments is harmonic grammar (HG; e.g., Smolensky, Goldrick, & Mathis, 2014; Smolensky & Legendre, 2006), which has become particularly influential in its symbolic grammar variant—that is, optimality theory (OT; Prince & Smolensky, 2004).

Markedness Versus Faithfulness

The major idea in these approaches is that grammar is organized in such a way as to generate the most harmonic (or the optimal) surface structure for a given underlying (i.e., lexical) representation of a word. This is achieved by a large inventory of universal constraints. An important property that distinguishes these accounts from earlier rule-based grammars is that constraints can be violated. In fact, constraint violation occurs regularly, because constraints are mutually conflicting. There are two major classes of constraints: markedness constraints, which contribute to the well-formedness of surface forms, and faithfulness constraints, which promote correspondence of surface forms with their underlying representations. At the core of the grammar is a mechanism that generates, for each underlying lexical form to be produced, a large variety of candidate surface forms. Among these, an optimal (i.e., most harmonic) candidate is selected by trading the mutually conflicting constraints against each other.

As an example, the underlying representation of the German word Hund (English = dog) is /hʊnd/, whereas the surface form generated by the grammar is [hʊnt]. According to HG or OT, the underlying form /hʊnd/ (the “input”) generates a large variety of possible candidate phonetic forms (the “output”), including all kinds of potential syllabifications and segmental variants, among them, for example, [hʊnd] and [hʊnt]. Why is [hʊnt] finally preferred on the surface? Among the myriads of constraints of universal grammar, two are important here: The first is a faithfulness constraint, termed Ident(Lar) because it promotes faithfulness to the underlying laryngeal specification of consonants. The second is a markedness constraint, termed *Lar because it prohibits laryngeal features—hence, especially, voiced obstruents in the syllable coda (Lombardi, 1999). These two constraints are conflicting, because the first promotes retaining the /d/ of /hʊnd/ in the output, whereas the second strives to reduce markedness by specifying the “better” variant—that is, unvoiced /t/. In German phonology, the latter constraint “wins” over the former, because the phonological grammar of German puts a higher weight on *Lar than on Ident(Lar). In other languages, such as English, *Lar has a lower weight; therefore, voiced obstruents may appear in the output of the grammar (e.g., [dɔɡ]). That is, the phonological grammars of different languages differ in how they weight (or rank) the constraints that are otherwise universal to all languages, as the sketch below makes concrete.
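The competition between Ident(Lar) and *Lar in the Hund example can be spelled out as a small worked computation. The violation counts follow the text; the numeric constraint weights are assumptions chosen solely to reproduce the described German and English rankings:

```python
# Worked sketch of the Hund example. Violation counts follow the text; the
# weights are illustrative assumptions (German: *Lar outweighs Ident(Lar);
# English: the reverse).
candidates = {
    "[hʊnd]": {"*Lar": 1, "Ident(Lar)": 0},  # keeps the voiced coda obstruent
    "[hʊnt]": {"*Lar": 0, "Ident(Lar)": 1},  # devoices, unfaithful to underlying /d/
}

def most_harmonic(weights: dict) -> str:
    """The optimal candidate minimizes the weighted sum of constraint violations."""
    def penalty(candidate: str) -> float:
        return sum(weights[c] * v for c, v in candidates[candidate].items())
    return min(candidates, key=penalty)

print("German :", most_harmonic({"*Lar": 2.0, "Ident(Lar)": 1.0}))  # -> [hʊnt]
print("English:", most_harmonic({"*Lar": 1.0, "Ident(Lar)": 2.0}))  # -> [hʊnd]
```

In OT proper, a strict ranking of constraints replaces the weighted sum, but for this two-constraint competition the winning candidates come out the same.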

The Faithfulness Principle Constrains the Abstractness of Lexical Representations

Before discussing applications to sound production impairments, two comments are in order. The first is that HG theory and OT make no commitments about the content and the origin of their constraints. Consider, as an example, two important syllable structure constraints—that is, Onset (i.e., syllables should contain onset consonants) and NoCoda (i.e., syllables should not contain coda consonants). There is no logical reason why these two could not be replaced by, for instance, NoOnset and Coda, but grammars would not work with these alternatives, because they would not create the sound patterns that occur in natural languages. The constraint Onset, for instance, is built on the evidence that no known language prohibits CV syllables—without asking whether this typological rule might be grounded in some external circumstance. Likewise, the constraints *Lar or *Complex are preferred over the—logically equally possible—constraints Lar or Complex, because they make more successful grammars and not because they reflect more stable conditions in motor control or auditory perception. Hence, HG creates harmony because it is based on constraints that are known—from phonological research—to create harmony.6 To escape this circularity, it would be necessary to explicitly motivate markedness constraints by some grammar-external substantial content—for example, by phonetically grounded scales (Prince & Smolensky, 2004, p. 81). Phonological theory is not necessarily unhinged by this problem, because the crux of HG and OT lies in how the conflicting constraints can be arranged to generate the sound patterns of a language, not in where the constraints come from.

As a second comment, the importance of faithfulness constraints in HG and OT undermines the purported abstractness of underlying representations. Consider, as an example, the faithfulness constraint Ident(Lar) mentioned earlier, according to which consonants should be faithful to their underlying laryngeal specification. Constraints of this sort require that underlying representations are mapped unequivocally onto phonetic representations at the surface level—irrespective of reasons for which the actual surface form should eventually be harmonized with other phonetic constraints. Hence, Ident(Lar) implies that the motor behavior of the larynx during word production, as specified in the surface form, should already be represented at the level of underlying lexical forms—how else could the faithfulness constraint be fulfilled?7 More generally, faithfulness of the output relative to the underlying input representations implies, by inversion, that the latter should already have a phonetic signature.

6 For such reasons, Blevins (2004, p. 74) considered the markedness concept as teleological and claimed that phonological theory should do without this concept. Haspelmath (2006) has also raised fundamental criticism against the markedness concept and has proposed to substitute it with less ambiguous and more substantive factors.

7 Again, motor equivalence principles may modify the strength of this correspondence by allowing for different motor patterns to generate the same sound. Yet, this equivocality concerns the relationship between output forms and motor patterns rather than the grammar itself. Motor equivalence is a general property of all motor skills.

Markedness Asymmetry in Phonological Impairment

Markedness, the key concept of HG and OT, has played a major role in research on spontaneous speech errors in healthy speakers and in neurolinguistic studies of sound production impairment. The major thread of the discussion in aphasiology focuses on the hypothesis that marked structures are more vulnerable to phonological impairment than unmarked structures, and that phoneme errors, when they occur, tend to reduce markedness relative to the target lexical form (markedness asymmetry). As a German illustration of this hypothesis, the word /juveːl/ (jewel) should be more vulnerable to phonological error mechanisms than the word /juːdo/ (judo), because the former has an iambic and the latter a trochaic metrical pattern, and iambs are more marked than trochees. Likewise, /frɔʃ/ (frog) should be more vulnerable than /fiʃ/ (fish), because /frɔʃ/ violates the constraint *Complex penalizing complex onsets and, hence, is more marked than /fiʃ/. Moreover, when an error occurs on /frɔʃ/, it is likely to create a less marked outcome—for example, [ʃɔʃ] (see Example 3). There is evidence that spontaneous slips of healthy speakers conform to the markedness asymmetry hypothesis (for a discussion, see Meyer, 1992), and a similar statistical tendency has been reported for the sound production errors of patients with left hemisphere lesions (for a discussion, cf. Rapp & Goldrick, 2006). Several single case studies of patients with postlexical phonological impairment were published in which markedness effects were studied from a broader perspective—that is, at the segmental and the feature level, at the level of syllable constituents and of sonority constraints, and at the phonotactic level (Béland et al., 1990; Buchwald, 2009; Goldrick & Rapp, 2007; Miozzo & Buchwald, 2013; Stenneken, Bastiaanse, Huber, & Jacobs, 2005). These and other studies have confirmed the hypothesis that marked structures are more prone to phoneme errors and that such errors are likely to create structures that are less marked than their respective target structures.

Markedness Asymmetry Is Not Pathognomonic of Phonological Impairment Coming back to the topic of this chapter, it is remarkable that patients with AOS have the same tendency as patients with (postlexical) phonological impairment to make more errors on marked than on unmarked structures and to reduce markedness when they make errors. For instance, they are more vulnerable to iambic than to trochaic stress (Aichert,

7 Again, motor equivalence principles may modify the strength of this correspondence by allowing for different motor patterns to generate the same sound. Yet, this equivocality concerns the relationship between output forms and motor patterns rather than the grammar itself. Motor equivalence is a general property of all motor skills. PHONOLOGY VS. PHONETICS IN SPEECH SOUND DISORDERS 17

Büchner, & Ziegler, 2011; Ziegler & Aichert, 2015), they make more errors on consonant clusters than on simplex consonants, they tend to reduce clusters by omitting consonants or inserting vocoids, and they avoid omitting simplex consonants in the onset of a syllable (Miozzo & Buchwald, 2013, case HFL; Romani & Calabrese, 1998, case DB). Miozzo and Buchwald (2013) observed clear sonority effects in the two patients mentioned earlier in this chapter, one whom they considered to have postlexical phonological impairment (DLE) and another who was deemed to have AOS (HFL). Both made, for example, more errors on initial consonant clusters with a flat sonority rise (e.g., flow) compared with clusters with a steep rise (e.g., blow). The authors concluded, in brief, that sonority plays a role at both levels, phonological and phonetic. In a further case study, Buchwald, Rapp, and Stone (2007) made a claim that the two levels should nonetheless be distinguished because the way patients with phonological impairment resolve their problem with marked phonological structures points at an involvement of discrete phonological rather than graded phonetic representations. Tying in with work done by Davidson (2005, 2006a), they examined consonant cluster production in a patient with postlexical phonological impairment, VBR, who resolved her obvious problem with onset clusters by inserting a schwa between the two consonants (e.g., clone → [kəloːn]). Two competing explanations for this phenomenon are offered: First, the schwa may result from a mistiming of the consonant gestures involved in the cluster, resulting in a short period between the two consonants during which the vocal tract is open, and a short schwa-like vocoid is heard. Second, it may reflect insertion of a full lexical schwa, as in the word cologne (schwa-epenthesis). Using ultrasound and acoustic analysis techniques, Buchwald et al. demonstrated that in VBR the inserted vocoids were indistinguishable from corresponding lexical schwa vowels, evidencing that the insertions were not due to articulatory mistiming but rather reflected a repair process at a level of discrete symbolic representations. Hence, VBR was characterized as an exemplary case demonstrating that repair processes resulting in markedness reduction may occur on an abstract processing level (see also Buchwald, 2009). However, this result does not provide convincing evidence regarding a presumed prearticulatory, symbolic locus of the problem in patient VBR: If she had difficulties at some symbolic processing stage, would she then be able to choose an efficient repair strategy at exactly the same symbolic level? If so, why did she not choose consonant deletion (clone → cone) as a more elegant solution of her problem, considering that through her schwa epenthesis strategy she created disyllabic forms with an iambic meter, which is infrequent in English? After all, given her schwa insertion strategy, would she not deploy all the preserved agility of her articulatory system to conceal her repair by keeping the inserted schwa as short as possible, as in casual, hypo-articulated speech (e.g., [kəloːn])? Consider, in turn, that VBR may have had a speech motor planning problem concomitant with preserved phonology. Relying on her knowledge about the sound structure of her language, VBR could then easily have used a phonemic strategy to overcome her phonetic impairment. In this case, her motor speech problem would have prevented her to mask the repair by a schwa reduction process (cf. 
Staiger, Rüttenauer, & Ziegler, 2010, for evidence from a patient with AOS). According to this logic, the full-schwa repair would be ascribable to the apraxic condition and the reduced-schwa repair to phonological impairment—exactly opposite to the prediction offered by Buchwald et al. (2007).

The Repair Paradox

As in the case of VBR discussed earlier, the observation that phonemic errors in patients with aphasia conform to the markedness asymmetry hypothesis has repeatedly been explained as a “repair” strategy (e.g., Béland et al., 1990, p. 142). There is an obvious paradox in this explanation: How should a patient whose abstract phonological representations are impaired be able to repair the problem at that same abstract level, within milliseconds and without any loss of fluency? The assumption that the repair even improves the target structure (i.e., makes it more harmonic) implies a considerable degree of competence in a system that is actually considered to be destroyed by the brain lesion. Explanations within the HG or OT framework avoid this conflict by proposing that phonological errors are probabilistic variants of the target form that are best accounted for by appropriate adaptations of the grammar (Goldrick & Daland, 2009). In this framework, brain lesions create new phonological grammars that differ from the patients’ premorbid grammars much as the grammars of Italian and English differ from each other—that is, by the relative weighting of their constraints. In patients who avoid consonant clusters, for instance, the grammar is characterized by a particularly high weighting of a constraint termed *Complex that universally penalizes onset clusters. Hence, brain lesions are thought to create new grammars rather than destroy grammaticality per se.
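To make this weighting logic concrete, the following minimal Python sketch evaluates three output candidates for the word clone under a "premorbid" and a "lesioned" constraint weighting. The candidate set, constraint names, and all numerical weights are illustrative assumptions, not values taken from any published grammar.

```python
# Harmonic-grammar-style evaluation for the input /kloːn/ ("clone").
# *COMPLEX penalizes onset clusters, MAX penalizes deletion, and DEP
# penalizes insertion; weights and candidates are purely illustrative.

CANDIDATES = {
    "kloːn":  {"*COMPLEX": 1, "MAX": 0, "DEP": 0},  # faithful output
    "kəloːn": {"*COMPLEX": 0, "MAX": 0, "DEP": 1},  # schwa epenthesis (VBR's repair)
    "koːn":   {"*COMPLEX": 0, "MAX": 1, "DEP": 0},  # consonant deletion
}

def harmony(violations, weights):
    """Harmony is the negative weighted sum of constraint violations."""
    return -sum(weights[c] * v for c, v in violations.items())

premorbid = {"*COMPLEX": 1.0, "MAX": 3.0, "DEP": 3.0}  # faithfulness dominates
lesioned = {"*COMPLEX": 4.0, "MAX": 3.0, "DEP": 2.0}   # *COMPLEX promoted

for label, weights in (("premorbid", premorbid), ("lesioned", lesioned)):
    winner = max(CANDIDATES, key=lambda cand: harmony(CANDIDATES[cand], weights))
    print(f"{label} grammar: /kloːn/ -> [{winner}]")
```

Under the re-weighted grammar, the faithful cluster loses to the epenthetic candidate, so a VBR-like pattern falls out of a change in weights rather than a loss of grammaticality.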

Capturing Variability in HG

Such a theory obviously leads to a proliferation of phonological grammars, because each patient may acquire his or her own individual constraint ranking, depending on whether he or she prefers schwa epenthesis, consonant deletion, improvement of sonority profiles, and so forth. Yet, a still greater obstacle to this approach is that the error patterns of aphasic individuals are highly variable—that is, patients who always do the same thing, such as VBR, are the exception rather than the rule (for a discussion, cf. Olson, Romani, & Halloran, 2007). Consider, for instance, the errors presented in Example 3, made by a patient with a conduction-like aphasia in a word repetition task: Although the errors /frɔʃ/ → [ʃɔʃ] and /print͡sɛsin/ → [mɛlˈtikin] are compatible with a high ranking of *Complex, the error /dax/ → [ʃtɔtʃ] is inconsistent with this explanation, and any OT explanation of /riŋ/ → [nil] requires still another re-ranking of constraints. Hence, constraint rankings may differ not only between patients but even within a patient from one instance to the next. This renders such theories not only proliferative but also unfalsifiable. A particular problem for OT explanations arises from the fact that markedness asymmetry holds only in a statistical sense—that is, errors may also lead to a loss of harmony, as in the example /dax/ → [ʃtɔtʃ] mentioned earlier or in other cases reported in the literature (e.g., Romani & Galluzzi, 2005). The OT framework would not allow for a grammar in which faithfulness is sacrificed for an increase instead of a reduction of markedness. At this point, HG provides options that do not exist in OT to explain such variation. Unlike

OT, HG selects optimal candidates not by ranking constraints but rather by assigning numerical weights to them. Stochastic and gradient effects, which are abundant in the sound patterns of natural languages, can then be modeled by adding random noise to the constraint weights, thereby assigning probabilities to output candidates rather than selecting them deterministically (e.g., Hayes & Wilson, 2008). Following this logic, Goldrick and Daland (2009) proposed a connectionist account in which phonological impairment is simulated by adding random noise to the constraints of a phonological grammar (represented by the connections of a network). This noise may distort the relative harmony of surface candidates to the extent that even a competitor with lower harmony is selected. The larger the disruption, the higher the likelihood that errors lead to a markedness increase.8 (A minimal sketch of this noise account follows below.)

A fundamental weakness of these approaches is that they describe how error patterns are constrained, but not how and why the errors emerge. Moreover, constraints provide only necessary, not sufficient, conditions for errors: If, in the example /riŋ/ → [nil], the [n] was perhaps selected because of its lower sonority value relative to /r/, why did the patient not produce [t] for /r/ to generate a still less marked output? Hence, the most conspicuous property of phonemic errors is, in HG and OT terms, that they violate faithfulness. Therefore, the first question to be asked before markedness constraints are brought into play should concern the process that generates the output candidates in the first place: Is the candidate set generated by aphasic patients the same as in healthy participants, or are speakers with aphasia more imaginative in creating candidates? For example, considering the surface forms mentioned in Example 3, should it be assumed that [nil] and [mɛlˈtikin] are among the candidates generated in all German speakers’ brains when they are going to produce ring or prinzessin, respectively, and that it is only due to the efficiency of their grammar that these awkward forms never reach the surface? This can hardly be imagined, given the nearly unconstrained wealth of neologistic forms that patients with aphasia can potentially generate. As a consequence, the clue to phonological impairment should be sought in a dysfunction of the generator rather than in the weighting of the constraints: Why does /riŋ/ generate [nil] after a stroke in the left hemisphere? OT and HG do not provide an answer to this question.

In summary, without any commitment regarding the content of markedness constraints and without any theoretical assumptions constraining the generation mechanism that provides the surface candidates, the harmonic mind theory has little to say about the pathomechanisms causing phonological errors and about how they are related to the pathology underlying AOS. Yet, with their emphasis on the markedness concept, the OT and HG frameworks do illuminate that the two levels, phonological and phonetic, should be interrelated very closely, given that the phonetically most transparent markedness constraints—that is, those relating to the complexity of syllable constituents or the sonority hierarchy—are relevant on both processing levels.
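As a concrete (and deliberately toy) illustration of the noise account, the following Python sketch extends the weighted-constraint example above: Gaussian noise on the constraint weights makes selection probabilistic, and with growing noise (the analogue of growing severity), harmony-reducing outputs become more frequent, which is precisely the prediction questioned in Footnote 8. Candidates, weights, and noise levels are my own illustrative assumptions, not the parameters of Goldrick and Daland's (2009) model.

```python
import random

# Noisy harmonic grammar: perturb the weights once per production,
# then select the candidate with maximal harmony under the perturbed
# grammar. All numbers are illustrative assumptions.

CANDIDATES = {
    "kloːn":  {"*COMPLEX": 1, "MAX": 0, "DEP": 0},  # faithful
    "kəloːn": {"*COMPLEX": 0, "MAX": 0, "DEP": 1},  # epenthesis (harmony-increasing)
    "koːn":   {"*COMPLEX": 0, "MAX": 1, "DEP": 0},  # deletion (harmony-increasing)
    "ʃkloːn": {"*COMPLEX": 2, "MAX": 0, "DEP": 1},  # intrusion (harmony-REDUCING)
}
WEIGHTS = {"*COMPLEX": 1.0, "MAX": 3.0, "DEP": 3.0}

def sample_output(noise_sd):
    """Perturb each weight, then pick the candidate with maximal harmony."""
    noisy = {c: w + random.gauss(0.0, noise_sd) for c, w in WEIGHTS.items()}
    return max(
        CANDIDATES,
        key=lambda cand: -sum(noisy[c] * v for c, v in CANDIDATES[cand].items()),
    )

for sd in (0.5, 2.0, 4.0):  # increasing 'lesion severity'
    outputs = [sample_output(sd) for _ in range(10_000)]
    shares = {cand: outputs.count(cand) / len(outputs) for cand in CANDIDATES}
    print(f"noise sd {sd}: {shares}")
```

At low noise the faithful form dominates; as the noise grows, the share of the harmony-reducing intrusion candidate rises alongside the repairs.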

8 This feature of Goldrick and Daland’s (2009) model would imply that with increasing severity of the phonological impairment, the proportion of harmony-reducing errors increases. Yet, this prediction contrasts with the observation that patients with a very severe phonemic jargon (e.g., Stenneken et al., 2005) or patients whose output is reduced to only a few recurring utterances (Blanken, 1991; Code & Ball, 1994) have a particularly strong tendency to produce unmarked structures.

Prospect: Substance-Based Phonology and the Role of Neuroanatomic Evidence

Limitations of Theories That Exclude the Phonetic Domain

For obvious epistemological reasons, any attempt to elucidate the relationship between motor and phonological impairments of sound production should be based on theories incorporating both levels of analysis, together with a specification of their interface. As shown earlier, the three theoretical frameworks discussed here fail to meet this requirement for different reasons: (1) In the phonological mind theory, the core of phonological cognition is deemed algebraic and entirely autonomous from its physical implementation. By definition, then, phonological impairment interferes with abstract algebraic relations rather than with “the flesh and blood” of speech sounds (Jakobson, 1937). Impairments affecting the phonetic substance of speech production are therefore necessarily beyond the scope of the theory. (2) At present, most connectionist models of word production focus on questions regarding the relationship among semantics, lexicon, and phonology, as well as on the disorders affecting these processing levels. They have therefore deliberately neglected the phonology–phonetics interface—discouraged also by the enormous computational effort that would be required to integrate auditory and motor representations in their networks (cf. Ueno et al., 2011, p. 393). Yet, there is no reason why existing expansions of these models should not in principle be exploited to simulate sound production impairments across the phonetics–phonology interface. (3) OT and HG would probably not deny that the constraints they are dealing with are—partly or entirely—grounded in phonetic principles. For instance, the phonetic grounding of sonority constraints or of constraints such as *Complex is more than obvious. In the field of phonology, these approaches have proven successful without necessarily resorting to the substantive content of their constraints. Yet, statements characterizing phonological impairment as a promotion of markedness constraints over faithfulness constraints have no explanatory value as long as the mechanisms governing markedness and the nature of the representations that are deemed mutually faithful or unfaithful are disregarded.

Historically, phonological theory has made impressive advancements without considering the phonetic basis on which it is built. Yet, when researchers attempt to exploit phonological frameworks for a better understanding of neurogenic speech sound disorders, the situation changes fundamentally: The mere fact that it is a brain lesion that impairs an individual’s sound production raises questions regarding the pathomechanisms underlying the observed pattern, the relationships between mechanisms causing related clinical patterns, and the role of the brain networks whose lesions create these patterns. Hence, neurolinguistic inquiries into sound production impairment cannot afford to postulate an autonomous symbolic processing level from the outset.

The Role of Lesion Data

Neuroanatomic evidence from functional imaging and lesion studies generates clear hypotheses about where the substantive content of phonological constraints and representations should be sought. As a rule, aphasic and apraxic sound production impairments result from lesions of the auditory ventral–dorsal stream network in the left hemisphere. This network connects auditory regions of the superior temporal lobe with inferior frontal cortex along an anterior-ventral stream, and with inferior parietal, ventral premotor, and posterior inferior frontal cortex along a posterior-dorsal stream (Dick, Bernal, & Tremblay, 2014). The dorsal stream network, which connects temporal auditory with frontal motor areas, is especially considered to be associated with phonological processing and sound production impairment (Hickok, 2014; Hickok & Poeppel, 2007). Hence, phonological processing and its impairments should be based on mechanisms that ultimately derive from an interaction of auditory with oral motor processes and representations. According to some theories, the ventral–dorsal stream network is even considered the major biological basis of language evolution (Rauschecker, 2012).

In humans, the left auditory dorsal stream network offers processing mechanisms that are known to convey a degree of abstractness to the perceptuo-motor system subserving speech. First, by virtue of its reciprocal connectivity, it integrates the motor network based in the left posterior inferior frontal gyrus and the ventral premotor region with auditory areas providing a sensory reference frame for the motor processes implied in speaking. The involvement of inferior parietal lobe structures in this system may serve to enrich this reference frame with somatosensory information (Tremblay, Shiller, & Ostry, 2003). From this perspective, the dorsal stream constitutes a platform for supramodal processing. Second, the temporal–parietal transition zone and the supramarginal gyrus of the left hemisphere are considered to be particularly specialized for sensory-motor integration processes (Hickok, Okada, & Serences, 2009; Peschke, Ziegler, Eisenberger, & Baumgaertner, 2012) and for the temporary storage of such information (Buchsbaum & D’Esposito, 2008). Third, there is evidence from direct cortical surface recordings that left superior temporal cortex encodes fairly abstract auditory feature information, suggesting that there are cortical sites in which discrete representations abstracted from auditory information are traded (Mesgarani, Cheung, Johnson, & Chang, 2014).

Although the system is still not completely understood, lesion studies have shown that lesions in the ventral premotor and the posterior inferior frontal target regions of the left dorsal stream may cause AOS (Richardson, Fillmore, Rorden, Lapointe, & Fridriksson, 2012; Trupe et al., 2013). This region is apparently specialized for the planning of vocal tract movements for speech, probably as a result of long-term speech motor learning (Ackermann, Hage, & Ziegler, 2014). The motor representations traded at this stage are necessarily abstract, in some sense, given that they interact with highly integrated auditory and somatosensory information and that the motor implementation mechanisms downstream of this cortical system are still of considerable complexity.
Phonological impairments, in contrast, are associated with lesion sites along the inferior parietal to superior temporal course of the dorsal stream network (Schwartz, Faseyitan, Kim, & Coslett, 2012; Vigneau et al., 2006). There is some evidence that, statistically, the lesions causing these impairments are distributed more toward the anterior (motor) than toward the posterior (auditory) limb of this system (Schwartz et al., 2012), but a contribution of superior temporal and temporo-parietal areas is uncontroversial (Cappa, Cavallotti, & Vignolo, 1981; Kappes, Baumgaertner, Peschke, Goldenberg, & Ziegler, 2010). Hence, apraxic and phonological sound production impairments pertain to different subcomponents of the same network. The relationship between their underlying mechanisms should therefore be sought in the relative contributions of these components to the generation of sensory reference frames for movement and to the access to motor information, both of which should be sufficiently abstract to allow for their mutual mapping.

This neuroanatomic framework is not compatible with the understanding of the phonological mind as an algebraic rule system dealing with syllogistic forms of reasoning, as proposed by Berent (2013). Lesion studies of impaired thinking or logical reasoning have predominantly pointed to prefrontal cortical regions in the left and/or right hemisphere, depending on which type of reasoning process is concerned (Shallice & Cooper, 2011, Chapter 12). Contributions to logical deduction were found in left frontopolar cortex by Monti, Parsons, and Osherson (2009). Neither of these sites is known to be implicated in phonology, and patients with impaired logical reasoning are not known to present with phonemic paraphasia in their speech.

Another neuroanatomic hypothesis that could apply to abstractionist approaches such as Berent’s (2013) phonological mind theory is based on studies focusing on the distinction between a temporal-parietal/medial-temporal declarative memory system housing the mental lexicon and a frontal/basal ganglia procedural system housing the system of grammatical rules (cf. Ullman et al., 1997). Although the term grammatical rules refers to syntax rather than to phonology in this literature, generative accounts of phonology would consequently expand the declarative–procedural paradigm to the neural processing of sound structure. Hence, a straightforward hypothesis is that lesions interfering with a suspected universal phonological grammar are located in the frontal/basal ganglia system. However, according to functional imaging studies, the basal ganglia are not typically involved in phonological processing, and clinical populations suffering from basal ganglia disorders are not those who typically present with phonological paraphasia (Teichmann et al., 2005). The fronto-striatal network plays a complex role in spoken language production, but phonological processing is definitely not part of it (Ackermann et al., 2014).

In summary, lesion data support theories proposing that phonological processing emanates from an interaction of higher sensory with higher motor processes of the vocal tract auditory-motor system, such as, for instance, the models developed by Tourville, Peeva, and Guenther (2014) or by Hickok (2014).
Neurocomputational frameworks such as the Lichtheim 2 model, which was developed to simulate the ventral-dorsal stream system of language processing, may in the future prove powerful enough to integrate auditory and motor processing levels in their networks (Ueno et al., 2011).

Substance-Based Phonology and Sound Production Impairment

During the last two decades, experimental phonology and phonetics have made substantial advances in explaining how phonological grammar is grounded in phonetic substance.

In these accounts, phonological structure emerges from language use and the interaction of phonetic mechanisms—that is, articulation and perception (e.g., Bybee, 2003, p. 201). Only some of the most important developments will be mentioned here. It is suggested that such approaches hold a largely unexplored potential as frameworks for understanding phonetic and phonological sound production impairment and their mutual relationship.

Usage-based approaches. One such development comprises usage-based and exemplar-theoretical approaches to phonology (Bybee, 2003; Pierrehumbert, 2002, 2006). These theories are committed to explaining gradient, statistical variation in natural sound patterns that cannot be captured by deterministic phonological rule systems. Gradient variation in phonology is the rule rather than the exception, as exemplified by the fact that phonemes differ in their fine phonetic detail across languages, dialects, or sociolects and, hence, are not universal entities. Likewise, the range of applicability of many transformation rules established in conventional generative phonology is modulated by gradient variables such as word frequency. In this sense, sound structures constitute language- or dialect-particular probability distributions over gradient phonetic, social, and usage-dependent parameters, which emerge through experience, social interaction, and covert imitation processes—that is, through perception–production loops (Harrington, Palethorpe, & Watson, 2000; Pierrehumbert, 2003, 2006).

Although this approach has, to the best of my knowledge, not yet been applied explicitly to aphasic sound production impairment, such applications seem viable. If phonological patterns reflect an equilibrium state over a parametrically warped phonetic space, as suggested by exemplar theory, phonemically distorted word productions may be understood to arise from inaccurate representations of sensory or articulatory targets, resulting from a distorted resolution of the underlying phonetic parameters and their temporal extensions. As an example, the neologism [nil] produced for /riŋ/ may constitute an attempt to match an auditory target that is only diffusely specified by widely overlapping auditory features of the target word, such as nasal, coronal, anterior, and so forth—that is, a “bad” exemplar. The underlying pathology would be a destruction of neural networks in the temporal lobe representing the word ring as an auditory object (e.g., Leaver & Rauschecker, 2010). Given a normally functioning motor planning and coordination apparatus, some well-articulated output will result, grossly resembling the target word in its featural decomposition, syllable number, stress pattern, or vowel quality. This idea may be considered to resemble the concept of gradient symbol systems developed by Smolensky and colleagues (e.g., Smolensky et al., 2014) and the explanation of nonharmonic phoneme errors proposed by Goldrick and Daland (2009) and Goldrick (2011), as mentioned earlier. Yet, unlike the gradient symbol systems theory, this approach ultimately aims to specify the phonetic dimensions along which the faithfulness of paraphasic output relative to its target can be measured.
The notion of perceptual distinctiveness, as discussed by Flemming (2013), might play an important role in such an account, because it may provide a metric to parameterize potential inaccuracies of the auditory reference frames guiding the (otherwise unimpaired) articulatory planning mechanisms of aphasic patients.
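A toy sketch of this hypothesized mechanism is given below. It is my own illustration, not a published model: a word's auditory target is treated as a point in a phonetic parameter space, temporal-lobe damage is modeled as Gaussian blur on that target, and an intact articulatory system faithfully realizes whichever well-formed candidate lies closest to the blurred target. The feature dimensions, coordinates, and candidate forms are invented for the example.

```python
import random

# Auditory target matching under a 'blurred' lexical representation.
# Toy coordinates: (onset sonority, onset nasality, vowel frontness,
# coda nasality) -- all values are illustrative assumptions.
TARGET_RING = (0.9, 0.2, 0.8, 1.0)  # /riŋ/

CANDIDATES = {
    "riŋ": (0.9, 0.2, 0.8, 1.0),  # faithful output
    "niŋ": (0.8, 0.9, 0.8, 1.0),  # onset replaced by a nasal
    "nil": (0.8, 0.9, 0.8, 0.1),  # the attested neologism
    "rik": (0.9, 0.2, 0.8, 0.0),  # coda replaced by a stop
}

def realize(target, blur_sd):
    """Blur the auditory target, then output the nearest candidate."""
    blurred = [x + random.gauss(0.0, blur_sd) for x in target]
    def dist(coords):
        return sum((a - b) ** 2 for a, b in zip(coords, blurred))
    return min(CANDIDATES, key=lambda form: dist(CANDIDATES[form]))

for blur_sd in (0.05, 0.5, 1.0):  # increasing loss of auditory resolution
    outputs = [realize(TARGET_RING, blur_sd) for _ in range(1000)]
    shares = {form: outputs.count(form) / 1000 for form in CANDIDATES}
    print(f"blur {blur_sd}: {shares}")
```

With minimal blur the faithful form wins almost always; as the resolution of the stored auditory object degrades, well-articulated but phonemically deviant outputs such as [nil] become increasingly likely; that is, paraphasia without any motor deficit.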

Evolutionary phonology. Another line of advancement in phonological theory that might be worth pursuing in neurophonetic and neurolinguistic modeling is evolutionary phonology (Blevins, 2004). Unlike OT and HG, and similar to usage-based theories, this account entirely dispenses with teleological principles postulated to optimize well-formedness (Blevins, 2004, p. 74). In evolutionary phonology, markedness emerges from language use and reflects (rather than explains) the distributions of sound patterns and their relative frequencies within and across languages. Thus, the typological characteristics of languages observed at a particular historical point in time constitute a snapshot of an evolutionary process driven by phonetic mechanisms of sound change. Language learners are confronted with the patterns they find in their social environment and acquire them through statistical learning mechanisms.

What can be learned from evolutionary phonology for sound production impairment? The data collected in this research field and the methods applied in reconstructing historical sound change may prove useful in detecting the phonetic mechanisms that lead to paraphasic sound change after brain lesions. In her typology of sound change mechanisms, Blevins (2004) identified inaccuracies and ambiguities in speaker–listener interactions as responsible for diachronic sound change. For instance, word forms that are consistently or frequently misperceived, or forms that allow for different articulatory realizations because of acoustic ambiguities, will gradually undergo change (see the sketch below). In translating such mechanisms into mechanisms underlying phonemic paraphasia, one would replace historical listeners’ inaccurate perceptions or ambiguous response selections with corresponding failures based on inaccurate or ambiguous sensory representations of words, resulting from damage to left superior temporal and inferior parietal areas. As an example, a metathetic error such as /film/ → [flim] may result from an aphasic patient’s faint and therefore ambiguous auditory representation of the word film, caused by a decreased resolution of temporal patterns in cortical areas storing words as auditory objects—similar to the mechanism described in the exemplar-based explanation above.
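The diachronic mechanism can be made concrete with a toy iterated-transmission simulation, again my own construction with invented probabilities: if listeners occasionally misparse /film/ as [flim], and learners acquire what they perceive, the metathesized variant spreads over generations.

```python
import random

# Iterated transmission with asymmetric misperception, in the spirit
# of the misperception-driven sound change described above: learners
# store the variants they hear. Probabilities are illustrative.
P_FILM_TO_FLIM = 0.05  # /film/ occasionally heard as [flim]
P_FLIM_TO_FILM = 0.01  # the reverse misparse is rarer

def transmit(lexicon, n_tokens=1000):
    """One generation: speakers produce tokens; learners store percepts."""
    percepts = []
    for _ in range(n_tokens):
        form = random.choice(lexicon)
        if form == "film" and random.random() < P_FILM_TO_FLIM:
            form = "flim"
        elif form == "flim" and random.random() < P_FLIM_TO_FILM:
            form = "film"
        percepts.append(form)
    return percepts

lexicon = ["film"] * 1000  # generation 0: only the original form
for generation in range(51):
    if generation % 10 == 0:
        share = lexicon.count("flim") / len(lexicon)
        print(f"generation {generation:2d}: [flim] share = {share:.2f}")
    lexicon = transmit(lexicon)
```

In the aphasic analogue sketched in the text, this gradual population-level drift is compressed into a single damaged perception–production loop: the "misheard" variant is the patient's own degraded auditory representation.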
Articulatory phonology (AP). As a third approach, AP should be mentioned as a theory that has dealt extensively with speech error data, although predominantly with spontaneous or experimentally elicited slips of neurologically healthy speakers rather than with phonemic paraphasia (Goldstein et al., 2007; Pouplier, 2007a). AP postulates articulatory gestures rather than phonemes as the primitives of speech sound patterns. Gestures of the lips, the tongue tip, and the tongue back, as well as aperture gestures of the glottis and the velum, are conceived as discrete vocal tract events with a certain temporal extension. In the closely related task dynamics model (Saltzman & Kelso, 1987), gestures are modeled mathematically as a dynamical system, and the temporal patterning of vocal tract gestures is in large part organized by principles of oscillator coupling. As an example, gestures in prevocalic position are considered to have an in-phase relationship with the vocalic gesture (phase angle of 0°), whereas postvocalic gestures have an antiphase coupling (phase angle of 180°). For more details, readers are referred to Goldstein et al. (2006) or Goldstein and Pouplier (2014).
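The coupling idea can be illustrated with a minimal relative-phase sketch. Assuming a standard double-well potential V(φ) = −a·cos(φ) − b·cos(2φ) with attractors at 0° and 180° (the parameter values and the simple Euler integration are my own illustrative choices, not the published task dynamics model), any initial phase relation relaxes to one of the two stable coordination modes:

```python
import math
import random

# Relative-phase dynamics between two gestures: gradient descent on
# V(phi) = -a*cos(phi) - b*cos(2*phi), whose minima lie at phi = 0
# (in-phase, prevocalic coupling) and phi = pi (antiphase, postvocalic
# coupling). Parameters are illustrative assumptions.

def settle(phi0, a=1.0, b=1.0, dt=0.01, steps=5000):
    """Integrate d(phi)/dt = -dV/dphi = -(a*sin(phi) + 2*b*sin(2*phi))."""
    phi = phi0
    for _ in range(steps):
        phi -= dt * (a * math.sin(phi) + 2.0 * b * math.sin(2.0 * phi))
    # normalize the final phase to (-180°, 180°]
    return math.degrees((phi + math.pi) % (2.0 * math.pi) - math.pi)

random.seed(1)
for _ in range(6):
    phi0 = random.uniform(0.0, 2.0 * math.pi)
    print(f"initial phase {math.degrees(phi0):6.1f}° -> settles at {settle(phi0):6.1f}°")
```

Because every trajectory ends near 0° or ±180°, an intruding gesture will, on this view, surface in one of the stable coordination modes and therefore sound phonotactically well-formed.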

AP provides articulatory explanations for a number of observations that more traditional approaches describe phonemically. For instance, assimilation or phrase-level reduction processes are explained by increased gestural overlap or a tuning of gestural amplitudes, leading to a perceived “loss” or “substitution” of phonemes, although the gestural composition of a word remains unaltered (Davidson, 2006b). AP accounts of speech errors follow the same logic: Such errors are considered to result from gestural slips, especially gestural intrusions, which are perceived as phonetically well-formed (Pouplier, 2007a). This approach therefore contradicts theories in which well-formed speech errors (as documented predominantly by transcription methods) are taken as proof that phonemes are the primitives of spoken language (e.g., Dell et al., 1993). The observation that perceived phonemes may move into different discrete “slots,” as in /meloːnə/ → [menoːlə] (cf. Example 5), is ascribed to the coupling of gestures according to dynamically stable coordination modes, such as the 0° and 180° coupling of pre- and postvocalic gestures, respectively. Hence, intruding gestures will, with high probability, conform to a stable mode and therefore sound undistorted.

The AP account of speech errors is based on a still limited amount of acoustic and articulatory data collected in slip elicitation experiments with healthy speakers (e.g., Pouplier, 2007b). Articulatory events associated with perceived phoneme errors in patients with AOS have been recorded using electropalatography, demonstrating that these patients produced misdirected articulatory gestures or double or triple articulations that were, on the surface, perceived as either phonetically well-formed or phonetically distorted, depending on accidental details of the movement patterns rather than on categorical differences in gestural organization (Pouplier & Hardcastle, 2005). These data point to a common error mechanism underlying phonetic distortions and phonemic errors in apraxic speakers, as discussed in an earlier section of this chapter. Furthermore, the methodological framework of AP has been applied successfully to model error data from large samples of patients with AOS by a hierarchical arrangement of gestures into syllables and of syllables into metrical feet and phonological words (Ziegler, 2011; Ziegler & Aichert, 2015).

Because the AP account makes no distinction between a phonological and a phonetic level of speech production, it predicts that phonological errors in patients with fluent aphasia are explainable by the same mechanisms of gestural intrusion, double articulation, reduced or increased overlap, and so forth. Although only a few instrumental investigations of nonapraxic aphasic speakers have been performed so far, there are indeed indications that these patients demonstrate aberrations in the kinematic details of their speech gestures (e.g., Bose, van Lieshout, & Square, 2003). In an electropalatography study by Wood (cited in Pouplier, 2007a; Pouplier & Hardcastle, 2005), double articulations could be observed in a large proportion of errors classified as phonemic substitutions in aphasic patients with and without AOS. Hence, in this account, the distinction between AOS and phonemic paraphasia would be interpreted as a diagnostic artefact ascribable to clinical surface factors unrelated to the underlying pathomechanism.
A problem with this explanation is whether it can be reconciled with the striking differences between the disfluent, groping, trial-and-error speech of patients with

AOS, on the one hand, and the fluent paraphasic output produced by patients classified as phonologically impaired, on the other. It also remains to be shown how the supposed articulatory mechanisms could create complex errors such as /riŋ/ → [nil], /dax/ → [ʃtɔtʃ], or /print͡sɛsin/ → [mɛlˈtikin], which are the rule rather than the exception in the output of patients with moderate or severe phonological impairment, let alone in the output of patients with phonemic jargon. Finally, the AP explanation of phonological impairment as a disorder of the gestural organization of speech cannot account for neuroanatomic findings relating disordered output to impaired sensory reference frames of articulatory planning, or to other mechanisms relying on a shared code for action and perception (cf. Galantucci, Fowler, & Turvey, 2006, for a review), rather than primarily to the gestural organization of speech itself.

Conclusion

Theoretical frameworks of sound production impairment must be based on models that cross the phonology–phonetics boundary and address the interface between the two domains. Otherwise, they cannot account for the close clinical similarities between phonological and phonetic syndromes in patients with left hemisphere stroke, nor for the connectivity of the network whose lesions may cause AOS or phonemic paraphasia. Theories postulating complete autonomy of the phonological domain, such as the phonological mind theory (Berent, 2013), cannot account for a potential relationship between phonetic and phonological speech impairment, irrespective of whether such a relationship exists, because they deny the relevance of phonetic substance to phonological processing mechanisms from the outset. Likewise (although for other reasons), connectionist theories that exclude the auditory and motor domains of speaking from their scope of interest deliberately segregate the continuous from the symbolic and, hence, disallow any potential interactions between phonetic and phonological mechanisms. The relationship between apraxic and phonological sound production impairment is therefore presently not a part of these theories, even though appropriate extensions may be made in the future. Finally, OT and HG are based on a markedness concept whose phonetic content is not a relevant part of their theoretical framework. To this extent, these accounts, too, fail to provide an integrative explanation of the pathomechanisms underlying the different clinical syndromes of sound production impairment. As a consequence, the abstractionist theories discussed in this chapter contribute to a further, unfortunate consolidation of the segregation of these syndromes in conventional clinical taxonomies.

Yet, more recent developments in experimental phonology and phonetics may provide promising theoretical frameworks for understanding the pathomechanism of phonological sound production impairment and its relationship with AOS. Two approaches mentioned here—usage-based phonology and evolutionary phonology—can promote the understanding of the role of auditory and perhaps also somatosensory information as a reference frame guiding motor planning in aphasic speech production. They thereby emphasize an aspect of aphasic sound production that has been largely neglected so far—that is, the role of sensory information in motor planning. The relevance of this line of research has also been emphasized by Hickok (2014). Furthermore, AP has already made and will continue to make an important contribution to the understanding of how the continuous is related to the symbolic in sound production impairment. Together, these approaches provide a structurally richer alternative to conventional symbolic accounts of speech production, with their flat hierarchy of sensorimotor cognition in speech processing.

Acknowledgments

The considerations presented here originate from work supported by the German Research Council (Zi 469/9-1, 12-1, 14-1, 14-2). I thank Ingrid Aichert, Georg Goldenberg, and Anja Staiger for numerous discussions on these and related topics.

References

Ackermann, H., Hage, S. R., & Ziegler, W. (2014). Brain mechanisms of acoustic communication in humans and nonhuman primates: An evolutionary perspective. Behavioral and Brain Sciences, 37, 529–604.
Aichert, I., Büchner, M., & Ziegler, W. (2011). Why is [ˈjuːdo] easier than [juˈveːl]? Perceptual and acoustic analyses of word stress in patients with apraxia of speech. Stem-, Spraak- en Taalpathologie, 17, 15.
Béland, R., Caplan, D., & Nespoulous, J.-L. (1990). The role of abstract phonological representations in word production: Evidence from phonemic paraphasias. Journal of Neurolinguistics, 5, 125–164.
Berent, I. (2013). The phonological mind. Cambridge, England: Cambridge University Press.
Berent, I., Lennertz, T., Jun, J., Moreno, M. A., & Smolensky, P. (2008). Language universals in human brains. Proceedings of the National Academy of Sciences, USA, 105, 5321–5325.
Berent, I., Pan, H., Zhao, X., Epstein, J., Bennett, M. L., Deshpande, V., . . . Stern, E. (2014). Language universals engage Broca’s area. PLoS ONE, 9, e95155.
Berent, I., Steriade, D., Lennertz, T., & Vaknin, V. (2007). What we know about what we have never heard: Evidence from perceptual illusions. Cognition, 104, 591–630.
Berlucchi, G., & Aglioti, S. (1997). The body in the brain: Neural bases of corporeal awareness. Trends in Neurosciences, 20, 560–564.
Berwick, R. C., Friederici, A. D., Chomsky, N., & Bolhuis, J. J. (2013). Evolution, brain, and the nature of language. Trends in Cognitive Sciences, 17, 89–98.
Blanken, G. (1991). The functional basis of speech automatisms (recurring utterances). Aphasiology, 5, 103–127.
Blevins, J. (2004). Evolutionary phonology: The emergence of sound patterns. Cambridge, England: Cambridge University Press.

Bose, A., van Lieshout, P., & Square, P. A. (2003). Speech coordination in individuals with aphasia and normal speakers. Brain and Language, 87, 158–159.
Buchsbaum, B. R., & D’Esposito, M. (2008). The search for the phonological store: From loop to convolution. Journal of Cognitive Neuroscience, 20, 762–778.
Buchwald, A. (2009). Minimizing and optimizing structure in phonology: Evidence from aphasia. Lingua, 119, 1380–1395.
Buchwald, A., & Miozzo, M. (2011). Finding levels of abstraction in speech production: Evidence from sound-production impairment. Psychological Science, 22, 1113–1119.
Buchwald, A., & Miozzo, M. (2012). Phonological and motor errors in individuals with acquired sound production impairment. Journal of Speech, Language, and Hearing Research, 55, S1573–S1586.
Buchwald, A., Rapp, B., & Stone, M. (2007). Insertion of discrete phonological units: An articulatory and acoustic investigation of aphasic speech. Language and Cognitive Processes, 22, 910–948.
Butterworth, B. (1992). Disorders of phonological encoding. Cognition, 42, 261–286.
Bybee, J. (2003). Phonology and language use (Vol. 94). Cambridge, England: Cambridge University Press.
Caplan, D. (1992). Language: Structure, processing, and disorders. Cambridge, MA: MIT Press.
Cappa, S. F., Cavallotti, G., & Vignolo, L. A. (1981). Phonemic and lexical errors in fluent aphasia: Correlation with lesion site. Neuropsychologia, 19, 171–177.
Caramazza, A., & Zurif, E. B. (1976). Dissociation of algorithmic and heuristic processes in language comprehension: Evidence from aphasia. Brain and Language, 3, 572–582.
Chomsky, N., & Halle, M. (1968). The sound pattern of English. New York, NY: Harper and Row.
Code, C., & Ball, M. J. (1994). Syllabification in aphasic recurring utterances: Contributions of sonority theory. Journal of Neurolinguistics, 8, 257–265.
Collier, K., Bickel, B., van Schaik, C. P., Manser, M. B., & Townsend, S. W. (2014). Language evolution: Syntax before phonology? Proceedings of the Royal Society of London B: Biological Sciences, 281, 20140263.
Davidson, L. (2005). Addressing phonological questions with ultrasound. Clinical Linguistics & Phonetics, 19, 619–633.
Davidson, L. (2006a). Phonotactics and articulatory coordination interact in phonology: Evidence from nonnative production. Cognitive Science, 30, 837–862.
Davidson, L. (2006b). Schwa elision in fast speech: Segmental deletion or gestural overlap? Phonetica, 63, 79–112.
Dell, G. S. (1986). A spreading activation theory of retrieval in sentence production. Psychological Review, 93, 283–321.
Dell, G. S. (2014). Phonemes and production. Language, Cognition and Neuroscience, 29, 30–32.
Dell, G. S., Juliano, C., & Govindjee, A. (1993). Structure and content in language production: A theory of frame constraints in phonological speech errors. Cognitive Science, 17, 149–195.

Dick, A. S., Bernal, B., & Tremblay, P. (2014). The language connectome: New pathways, new concepts. The Neuroscientist, 20, 453–467.
Elman, J. L. (1993). Learning and development in neural networks: The importance of starting small. Cognition, 48, 71–99.
Flemming, E. (2013). Auditory representations in phonology. London, England: Routledge.
Foygel, D., & Dell, G. S. (2000). Models of impaired lexical access in speech production. Journal of Memory and Language, 43, 182–216.
Gafos, A., & Benus, S. (2006). Dynamics of phonological cognition. Cognitive Science, 30, 905–943.
Galantucci, B., Fowler, C. A., & Turvey, M. T. (2006). The motor theory of speech perception reviewed. Psychonomic Bulletin & Review, 13, 361–377.
Goldenberg, G. (2013). Apraxia: The cognitive side of motor control. Oxford, England: Oxford University Press.
Goldrick, M. (2011). Linking speech errors and generative phonological theory. Language and Linguistics Compass, 5, 397–412.
Goldrick, M., Baker, H. R., Murphy, A., & Baese-Berk, M. (2011). Interaction and representational integration: Evidence from speech errors. Cognition, 121, 58–72.
Goldrick, M., & Daland, R. (2009). Linking speech errors and phonological grammars: Insights from harmonic grammar networks. Phonology, 26, 147–185. doi:10.1017/S0952675709001742
Goldrick, M., & Rapp, B. (2007). Lexical and post-lexical phonological representations in spoken production. Cognition, 102, 219–260.
Goldstein, L., Byrd, D., & Saltzman, E. (2006). The role of vocal tract gestural action units in understanding the evolution of phonology. In M. A. Arbib (Ed.), Action to language via the mirror neuron system (pp. 215–249). Cambridge, England: Cambridge University Press.
Goldstein, L., & Pouplier, M. (2014). The temporal organization of speech. In M. Goldrick, V. S. Ferreira, & M. Miozzo (Eds.), The Oxford handbook of language production (pp. 210–227). Oxford, England: Oxford University Press.
Goldstein, L., Pouplier, M., Chen, L., Saltzman, E., & Byrd, D. (2007). Dynamic action units slip in speech production errors. Cognition, 103, 386–412.
Graff-Radford, J., Jones, D. T., Strand, E. A., Rabinstein, A. A., Duffy, J. R., & Josephs, K. A. (2014). The neuroanatomy of pure apraxia of speech in stroke. Brain and Language, 129, 43–46.
Guenther, F. H., Hampson, M., & Johnson, D. (1998). A theoretical investigation of reference frames for the planning of speech movements. Psychological Review, 105, 611–633.
Harrington, J., Palethorpe, S., & Watson, C. I. (2000, December 21). Does the Queen speak the Queen’s English? Nature, 408, 927–928.
Haspelmath, M. (2006). Against markedness (and what to replace it with). Journal of Linguistics, 42, 25–70.
Hauser, M. D., Chomsky, N., & Fitch, W. T. (2002, November 22). The faculty of language: What is it, who has it, and how did it evolve? Science, 298, 1569–1579.

Hayes, B., & Steriade, D. (2004). The phonetic bases of phonological markedness. In B. Hayes, R. Kirchner, & D. Steriade (Eds.), Phonetically based phonology (pp. 1–33). Cambridge, England: Cambridge University Press.
Hayes, B., & Wilson, C. (2008). A maximum entropy model of phonotactics and phonotactic learning. Linguistic Inquiry, 39, 379–440.
Hickok, G. (2014). Towards an integrated psycholinguistic, neurolinguistic, sensorimotor framework for speech production. Language, Cognition and Neuroscience, 29, 52–59.
Hickok, G., Okada, K., & Serences, J. T. (2009). Area Spt in the human planum temporale supports sensory-motor integration for speech processing. Journal of Neurophysiology, 101, 2725–2732.
Hickok, G., & Poeppel, D. (2007). The cortical organization of speech processing. Nature Reviews Neuroscience, 8, 393–402.
Jakobson, R. (1937). Lectures on sound and meaning. Cambridge, MA: MIT Press.
Kappes, J., Baumgaertner, A., Peschke, C., Goldenberg, G., & Ziegler, W. (2010). Imitation of paraphonological detail following left hemisphere lesions. Neuropsychologia, 48, 1115–1124.
Kello, C. T., & Plaut, D. C. (2004). A neural network model of the articulatory-acoustic forward mapping trained on recordings of articulatory parameters. The Journal of the Acoustical Society of America, 116, 2354–2364.
Kelso, J. A. S., & Tuller, B. (1981). Toward a theory of apractic syndromes. Brain and Language, 12, 224–245.
Laganaro, M. (2012). Patterns of impairments in AOS and mechanisms of interaction between phonological and phonetic encoding. Journal of Speech, Language, and Hearing Research, 55, S1535–S1543.
Laganaro, M. (2014). Phonological errors in conduction aphasia and the HSFC model: A comment to Hickok 2013. Language, Cognition and Neuroscience, 29, 28–29.
Leaver, A. M., & Rauschecker, J. P. (2010). Cortical representation of natural complex sounds: Effects of acoustic features and auditory object category. The Journal of Neuroscience, 30, 7604–7612.
Lenneberg, E. H. (1967). Biological foundations of language (Vol. 68). New York, NY: Wiley.
Levelt, W. J. M. (1989). Speaking: From intention to articulation. Cambridge, MA: MIT Press.
Liberman, A. M., & Whalen, D. H. (2000). On the relation of speech to language. Trends in Cognitive Sciences, 4, 187–196.
Liljencrants, J., & Lindblom, B. (1972). Numerical simulation of vowel quality systems: The role of perceptual contrast. Language, 48, 839–862.
Lindblom, B., MacNeilage, P. F., & Studdert-Kennedy, M. (1983). Self-organizing processes and the explanation of phonological universals. Linguistics, 21, 181–203.
Lombardi, L. (1999). Positional faithfulness and voicing assimilation in optimality theory. Natural Language & Linguistic Theory, 17, 267–302.
Mesgarani, N., Cheung, C., Johnson, K., & Chang, E. F. (2014, February 28). Phonetic feature encoding in human superior temporal gyrus. Science, 343, 1006–1010.

Meyer, A. S. (1992). Investigation of phonological encoding through speech error analyses: Achievements, limitations, and alternatives. Cognition, 42, 181–211.
Miozzo, M., & Buchwald, A. (2013). On the nature of sonority in spoken word production: Evidence from neuropsychology. Cognition, 128, 287–301.
Monti, M. M., Parsons, L. M., & Osherson, D. N. (2009). The boundaries of language and thought in deductive inference. Proceedings of the National Academy of Sciences, USA, 106, 12554–12559.
Nozari, N., Kittredge, A. K., Dell, G. S., & Schwartz, M. F. (2010). Naming and repetition in aphasia: Steps, routes, and frequency effects. Journal of Memory and Language, 63, 541–559.
Okanoya, K. (2007). Language evolution and an emergent property. Current Opinion in Neurobiology, 17, 271–276.
Olson, A., Romani, C., & Halloran, L. (2007). Localizing the deficit in a case of jargonaphasia. Cognitive Neuropsychology, 24, 211–238.
Patterson, K. E., & Shewell, C. (1987). Speak and spell: Dissociations and word class effects. In M. Coltheart, G. Sartori, & R. Job (Eds.), The cognitive neuropsychology of language (pp. 273–294). London, England: Erlbaum.
Peperkamp, S. (2007). Do we have innate knowledge about phonological markedness? Comments on Berent, Steriade, Lennertz, and Vaknin. Cognition, 104, 631–637.
Peschke, C., Ziegler, W., Eisenberger, J., & Baumgaertner, A. (2012). Phonological manipulation between speech perception and production activates a parieto-frontal circuit. NeuroImage, 59, 788–799.
Pierrehumbert, J. (2002). Word-specific phonetics. Laboratory Phonology, 7, 101–139.
Pierrehumbert, J. (2003). Phonetic diversity, statistical learning, and acquisition of phonology. Language and Speech, 46, 115–154.
Pierrehumbert, J. (2006). The next toolkit. Journal of Phonetics, 34, 516–530.
Plaut, D. C., & Kello, C. T. (1999). The emergence of phonology from the interplay of speech comprehension and production: A distributed connectionist approach. In B. MacWhinney (Ed.), The emergence of language (pp. 381–416). Mahwah, NJ: Erlbaum.
Port, R. F. (2010). Rich memory and distributed phonology. Language Sciences, 32, 43–55.
Pouplier, M. (2007a). Articulatory perspectives on errors. MIT Working Papers in Linguistics, 53, 115–132.
Pouplier, M. (2007b). Tongue kinematics during utterances elicited with the SLIP technique. Language and Speech, 50, 311–341.
Pouplier, M., & Hardcastle, W. (2005). A re-evaluation of the nature of speech errors in normal and disordered speakers. Phonetica, 62, 227–243.
Prince, A., & Smolensky, P. (2004). Optimality theory: Constraint interaction in generative grammar. Malden, MA: Blackwell.
Rapp, B., & Goldrick, M. (2000). Discreteness and interactivity in spoken word production. Psychological Review, 107, 460–499.
Rapp, B., & Goldrick, M. (2006). Speaking words: Contributions of cognitive neuropsychological research. Cognitive Neuropsychology, 23, 39–73.

Rauschecker, J. P. (2012). Ventral and dorsal streams in the evolution of speech and language. Frontiers in Evolutionary Neuroscience, 4, 7.
Richardson, J. D., Fillmore, P., Rorden, C., Lapointe, L. L., & Fridriksson, J. (2012). Re-establishing Broca’s initial findings. Brain and Language, 123, 125–130.
Romani, C., & Calabrese, A. (1998). Syllabic constraints in the phonological errors of an aphasic patient. Brain and Language, 64, 83–121.
Romani, C., & Galluzzi, C. (2005). Effects of syllabic complexity in predicting accuracy of repetition and direction of errors in patients with articulatory and phonological difficulties. Cognitive Neuropsychology, 22, 817–850.
Rumelhart, D. E., & McClelland, J. L. (1986). On learning the past tenses of English verbs. In J. L. McClelland & D. E. Rumelhart (Eds.), Parallel distributed processing: Explorations in the microstructure of human cognition (Vol. 1, pp. 216–271). Cambridge, MA: Bradford Books.
Ruml, W., Caramazza, A., Capasso, R., & Miceli, G. (2005). Interactivity and continuity in normal and aphasic language production. Cognitive Neuropsychology, 22, 131–168.
Saltzman, E., & Kelso, J. A. S. (1987). Skilled actions: A task-dynamic approach. Psychological Review, 94, 84–106.
Schwartz, M. F. (2014). Theoretical analysis of word production deficits in adult aphasia. Philosophical Transactions of the Royal Society B: Biological Sciences, 369, 20120390.
Schwartz, M. F., Dell, G. S., Martin, N., Gahl, S., & Sobel, P. (2006). A case-series test of the interactive two-step model of lexical access: Evidence from picture naming. Journal of Memory and Language, 54, 228–264.
Schwartz, M. F., Faseyitan, O., Kim, J., & Coslett, H. B. (2012). The dorsal stream contribution to phonological retrieval in object naming. Brain, 135, 3799–3814.
Schwartz, M. F., Wilshire, C. E., Gagnon, D. A., & Polansky, M. (2004). Origins of nonword phonological errors in aphasic picture naming. Cognitive Neuropsychology, 21, 159–186.
Shallice, T., & Cooper, R. (2011). The organisation of mind. Oxford, England: Oxford University Press.
Smolensky, P., Goldrick, M., & Mathis, D. (2014). Optimization and quantization in gradient symbol systems: A framework for integrating the continuous and the discrete in cognition. Cognitive Science, 38, 1102–1138.
Smolensky, P., & Legendre, G. (2006). The harmonic mind: From neural computation to optimality-theoretic grammar (Vol. 1). Cambridge, MA: MIT Press.
Staiger, A., Rüttenauer, A., & Ziegler, W. (2010). The economy of fluent speaking: Phrase-level reduction in a patient with pure apraxia of speech. Language and Cognitive Processes, 24, 483–507.
Stenneken, P., Bastiaanse, R., Huber, W., & Jacobs, A. M. (2005). Syllable structure and sonority in language inventory and aphasic neologisms. Brain and Language, 95, 280–292.
Stevens, K. N. (1989). On the quantal nature of speech. Journal of Phonetics, 17, 3–45.

Teichmann, M., Dupoux, E., Kouider, S., Brugières, P., Boissé, M.-F., Baudic, S., . . . Bachoud-Lévi, A.-C. (2005). The role of the striatum in rule application: The model of Huntington’s disease at early stage. Brain, 128, 1155–1167.
Tourville, J. A., Peeva, M. G., & Guenther, F. H. (2014). Perception–production interactions and their neural bases. In M. Goldrick, V. S. Ferreira, & M. Miozzo (Eds.), The Oxford handbook of language production (pp. 460–478). Oxford, England: Oxford University Press.
Tremblay, S., Shiller, D. M., & Ostry, D. J. (2003, June 19). Somatosensory basis of speech production. Nature, 423, 866–869.
Trupe, L. A., Varma, D. D., Gomez, Y., Race, D., Leigh, R., Hillis, A. E., & Gottesman, R. F. (2013). Chronic apraxia of speech and Broca’s area. Stroke, 44, 740–744.
Ueno, T., Saito, S., Rogers, T. T., & Lambon Ralph, M. A. (2011). Lichtheim 2: Synthesizing aphasia and the neural basis of language in a neurocomputational model of the dual dorsal-ventral language pathways. Neuron, 72, 385–396.
Ullman, M. T., Corkin, S., Coppola, M., Hickok, G., Growdon, J. H., Koroshetz, W. J., & Pinker, S. (1997). A neural dissociation within language: Evidence that the mental dictionary is part of declarative memory, and that grammatical rules are processed by the procedural system. Journal of Cognitive Neuroscience, 9, 266–276.
Vigneau, M., Beaucousin, V., Herve, P. Y., Duffau, H., Crivello, F., Houde, O., . . . Tzourio-Mazoyer, N. (2006). Meta-analyzing left hemisphere language areas: Phonology, semantics, and sentence processing. NeuroImage, 30, 1414–1432.
Wambaugh, J. L., Duffy, J. R., McNeil, M. R., Robin, D. A., & Rogers, M. A. (2006). Treatment guidelines for acquired apraxia of speech: A synthesis and evaluation of the evidence. Journal of Medical Speech-Language Pathology, 14, xv–xxxiii.
Ziegler, W. (2002). Psycholinguistic and motor theories of apraxia of speech. Seminars in Speech and Language, 23, 231–243.
Ziegler, W. (2011). Apraxic failure and the hierarchical structure of speech motor plans: A non-linear probabilistic model. In A. Lowit & R. D. Kent (Eds.), Assessment of motor speech disorders (pp. 305–323). San Diego, CA: Plural Publishing.
Ziegler, W., & Aichert, I. (2015). How much is a word? Predicting ease of articulation planning from apraxic speech error patterns. Cortex, 69, 24–39.
Ziegler, W., Aichert, I., & Staiger, A. (2012). Apraxia of speech: Concepts and controversies. Journal of Speech, Language, and Hearing Research, 55, S1485–S1501.