
Visual Processing Affects the Neural Basis of Auditory Discrimination

Daniel S. Kislyuk1,2,3, Riikka Möttönen1, and Mikko Sams1

1Helsinki University of Technology, Finland, 2Helsinki University Central Hospital, Finland, 3Saint Petersburg State University, Saint Petersburg, Russia

© 2008 Massachusetts Institute of Technology. Journal of Cognitive Neuroscience 20:12, pp. 2175–2184

Downloaded from http://mitprc.silverchair.com/jocn/article-pdf/20/12/2175/1759333/jocn.2008.20152.pdf by guest on 18 May 2021

Abstract

The interaction between auditory and visual speech streams is a seamless and surprisingly effective process. An intriguing example is the “McGurk effect”: The acoustic syllable /ba/ presented simultaneously with a mouth articulating /ga/ is typically heard as /da/ [McGurk, H., & MacDonald, J. Hearing lips and seeing voices. Nature, 264, 746–748, 1976]. Previous studies have demonstrated the interaction of auditory and visual streams at the auditory cortex level, but the importance of these interactions for the qualitative perception change remained unclear because the change could result from interactions at higher processing levels as well. In our electroencephalogram experiment, we combined the McGurk effect with mismatch negativity (MMN), a response that is elicited in the auditory cortex at a latency of 100–250 msec by any above-threshold change in a sequence of repetitive sounds. An “odd-ball” sequence of acoustic stimuli consisting of frequent /va/ syllables (standards) and infrequent /ba/ syllables (deviants) was presented to 11 participants. Deviant stimuli in the unisensory acoustic stimulus sequence elicited a typical MMN, reflecting discrimination of acoustic features in the auditory cortex. When the acoustic stimuli were dubbed onto a video of a mouth constantly articulating /va/, the deviant acoustic /ba/ was heard as /va/ due to the McGurk effect and was indistinguishable from the standards. Importantly, such deviants did not elicit MMN, indicating that the auditory cortex failed to discriminate between the acoustic stimuli. Our findings show that visual stream can qualitatively change the auditory percept at the auditory cortex level, profoundly influencing the auditory cortex mechanisms underlying early sound discrimination.

INTRODUCTION

Viewing a speaker’s articulatory movements can considerably improve the accuracy of speech perception in face-to-face communication. This has been shown for speech embedded in noise (Dodd, 1977; Sumby & Pollack, 1954), and for foreign and complex native speech (Reisberg, McLean, & Goldfield, 1987). The effect of viewing speech on auditory perception can be so strong that if auditory and visual speech stimuli are in conflict, the auditory percept is determined completely or partially by the visual input. A classical demonstration of this “McGurk” illusion is that an auditory /ba/ dubbed onto a visual /ga/ is heard as /da/ by most observers (McGurk & MacDonald, 1976).

Numerous studies have suggested that interactions in audiovisual speech perception may start as early as the auditory cortex (Pekkola et al., 2005; Calvert et al., 1997). A visual influence on auditory processing has been demonstrated in auditory electroencephalogram (EEG) and magnetoencephalogram (MEG) event-related responses (van Wassenhove, Grant, & Poeppel, 2005; Klucharev, Möttönen, & Sams, 2003; Möttönen, Krause, Tiippana, & Sams, 2002) and the auditory cortex blood oxygenation level-dependent response (Calvert, Brammer, Bullmore, Iversen, & David, 1999). Although these findings indicate that visual processing influences processing in the auditory cortex, it is not possible to conclude that these changes in the brain responses are due to a qualitative change in the auditory–cortical neural basis of the speech percept. They might equally well reflect a specific tuning of the auditory cortex in response to the presentation of the visual stimuli strongly associated with auditory object, while the behaviorally noticeable qualitative changes in perception result from the higher level interactions. Indeed, the influence of the visual speech stream on the auditory one can be seen already in the brainstem responses (Musacchia, Sams, Nicol, & Kraus, 2006). However, because the influences are similar for congruent and incongruent visual speech, it is most unlikely that they reflect a qualitative change in the speech percept. Speech sounds are probably not categorized at such a low level (Scott & Wise, 2004).

So far, the aforementioned studies leave open the question of the level of qualitative audiovisual speech integration. To answer this question, one needs to access qualitative changes in the auditory–cortical neural basis of the percept. What is needed is a brain response generated in the auditory cortex and whose measurable characteristics depend on the degree of change in the complex features of the percept. These requirements are very well fulfilled by the properties of mismatch negativity (MMN).

MMN is a robust and accurate indicator of preattentive detection of occasional changes in the features of acoustic stimuli. It is a negative deflection in the event-related potentials (ERPs) to the distinct “deviant” stimuli that infrequently replace acoustically regular “standard” stimuli, forming together an “odd-ball” sequence (Näätänen, Gaillard, & Mäntysalo, 1978; see also Näätänen, Tervaniemi, Sussman, Paavilainen, & Winkler, 2001 for a review). Although there is an ongoing discussion whether MMN originates from specialized generators (Näätänen, Jacobsen, & Winkler, 2005) or can rather be explained by adaptation properties of the auditory cortex neurons (Jääskeläinen, Ahveninen, et al., 2004), it is generally agreed that MMN is generated in the primary and secondary auditory cortices. This is supported by MEG studies in humans (Sams et al., 1985; Hari et al., 1984) as well as animal studies (Pincze, Lakatos, Rajkai, Ulbert, & Karmos, 2001). Being a preattentive component, MMN can be recorded even when the acoustic stimuli are quite irrelevant to a participant who is performing some other task. The size of MMN typically depends strongly upon the participant’s ability to discriminate between the sounds. For instance, speech sounds that belong to different phonological categories in the perceiver’s native language elicit usually larger MMNs than speech sounds which have the same acoustical distance but which are harder to distinguish due to their phonological identity (Sharma & Dorman, 2000; Winkler et al., 1999; Näätänen et al., 1997).

MMN recordings and the McGurk effect form an efficient combination to study the neural basis of audiovisual speech interactions. Sams et al. (1991) coupled the acoustic stimulus /pa/ with a video of a person frequently articulating /pa/ (standard) interspersed with infrequently articulated /ka/ (deviant). Due to the McGurk effect, the combination of acoustic /pa/ and visual /ka/ was heard as /ta/ or /ka/. These audiovisually incongruent deviants (acoustic /pa/ + visual /ka/) presented among congruent standards (acoustic /pa/ + visual /pa/) elicited a mismatch field (MMF) in the auditory cortex. Because this occurred in the absence of any acoustic changes in the stimuli, the results indicated that processing of visual speech can influence activity in the auditory cortex. These early MEG results were later confirmed (Möttönen et al., 2002) and extended to EEG and to a wider set of stimuli (Saint-Amour, De Sanctis, Molholm, Ritter, & Foxe, 2007; Colin, Radeau, Soquet, & Deltenre, 2004; Colin, Radeau, Soquet, Demolin, et al., 2002). Neuroimaging studies have shown that even silent lip-reading activates the auditory cortex […] 2002). Thus, the previous studies of the McGurk effect that employed MMN left the following alternatives: (i) visual speech qualitatively modifies the neural representations of speech sounds in the auditory cortex being processed by the same neural population as the auditory stream, or (ii) visual speech is processed by separate visual-responsive neurons in the auditory cortex and does not qualitatively alter the representations of acoustic speech.

The existence of MMN to the McGurk type of audiovisual deviant stimuli can always be explained by the activity in neurons that are not necessarily activated by the auditory speech stimuli. Therefore, we chose to use a paradigm where the McGurk effect is used to render deviant auditory stimuli identical to the standard stimuli. We prepared McGurk stimuli (acoustic /ba/ + visual /va/, perceived as /va/), which were phonologically indistinguishable from the congruent audiovisual stimuli (acoustic and visual /va/). In the McGurk condition, the standards were audiovisually congruent /va/ syllables and the deviants were acoustically different but perceptually similar McGurk stimuli (Figure 1). In the auditory condition, the auditory stimuli were delivered in the very same odd-ball sequence (/va/ standards and /ba/ deviants) as in the McGurk condition, but the participants were watching a silent movie instead of an articulating face. We expected a clear MMN to be elicited by the deviants in the auditory condition, indicating discrimination of speech sounds in the auditory cortex.