Article

Auditory lexical decision, categorical perception, and FM direction discrimination differentially engage left and right auditory cortex

POEPPEL, David, et al.


Reference

POEPPEL, David, et al. Auditory lexical decision, categorical perception, and FM direction discrimination differentially engage left and right auditory cortex. Neuropsychologia, 2004, vol. 42, no. 2, p. 183-200

DOI: 10.1016/j.neuropsychologia.2003.07.010; PMID: 14644105

Available at: http://archive-ouverte.unige.ch/unige:103285

Disclaimer: layout of this document may differ from the published version.


Auditory lexical decision, categorical perception, and FM direction discrimination differentially engage left and right auditory cortex

David Poeppel a,∗, Andre Guillemin b, Jennifer Thompson b, Jonathan Fritz c, Daphne Bavelier d, Allen R. Braun b

a Cognitive Neuroscience of Language Laboratory, Departments of Linguistics and Biology, University of Maryland, 1401 Marie Mount Hall, College Park, MD 20742, USA
b Language Section, Voice, Speech, and Language Branch, National Institute of Deafness and other Communication Disorders, Bethesda, MD 20892, USA
c Institute for Systems Research, University of Maryland, College Park, MD 20742, USA
d Department of Brain and Cognitive Sciences, University of Rochester, Rochester, NY 14627, USA

Received 1 June 2002; received in revised form 22 November 2002; accepted 22 July 2003

Abstract

Recent neuroimaging and neuropsychological data suggest that speech perception is supported bilaterally in auditory areas. We evaluate this issue building on well-known behavioral effects. While undergoing positron emission tomography (PET), subjects performed standard auditory tasks: direction discrimination of frequency-modulated (FM) tones, categorical perception (CP) of consonant–vowel (CV) syllables, and word/non-word judgments (lexical decision, LD). Compared to rest, the three conditions led to bilateral activation of the auditory cortices. However, lateralization patterns differed as a function of stimulus type: the LD task generated stronger responses in the left, the FM task a stronger response in the right hemisphere. Contrasts between either words or syllables versus FM were associated with significantly greater activity bilaterally in superior temporal gyrus (STG) ventro-lateral to Heschl's gyrus. These activations extended into the superior temporal sulcus (STS) and the middle temporal gyrus (MTG) and were greater in the left. The same areas were more active in the LD than the CP task. In contrast, the FM task was associated with significantly greater activity in the right lateral–posterior STG and lateral MTG. The findings argue for a view in which speech perception is mediated bilaterally in the auditory cortices and that the well-documented lateralization is likely associated with processes subsequent to the auditory analysis of speech. © 2003 Elsevier Ltd. All rights reserved.

Keywords: Hemispheric asymmetry; Speech perception; Word recognition; Spectral; Temporal

1. Introduction

Despite much recent work on the functional architecture of speech perception, some basic issues remain unresolved, including coarse functional anatomic considerations on hemispheric lateralization. One point of debate concerns to what degree there is a significant bilateral contribution to speech perception (construed as the process of analyzing and transforming continuous waveform input into representations suitable to interface with the mental lexicon), notwithstanding the fact that language processing beyond the input interface of speech perception is highly lateralized (Binder et al., 1997, 2000; Giraud & Price, 2001; Hickok & Poeppel, 2000; Mummery, Ashburner, Scott, & Wise, 1999; Norris & Wise, 2000; Scott et al., 2000).

Previous work investigating the neural basis of speech has used behavioral tasks such as phoneme monitoring (Démonet et al., 1992) or listening to rotated speech (Scott et al., 2000). To complement these studies, we investigate the cortical architecture of speech building on canonical psychophysical phenomena, contrasting three standard auditory paradigms: (i) discrimination (up/down) of frequency-modulated (FM) signals; (ii) categorical perception (CP) (ba/pa) of consonant–vowel (CV) syllables varying along an acoustic voice-onset time (VOT) continuum; and (iii) lexical decision of phonologically permissible targets (word/non-word). FMs (adjusted to have the same frequency-range as speech without eliciting speech-like percepts) are used to evaluate elementary auditory processing of dynamic signals (e.g. Gordon & Poeppel, 2002). CP of CV syllables varying in one acoustic parameter is a phenomenon that has been used extensively to probe mechanisms of speech perception.

∗ Corresponding author. Tel.: +1-301-405-1016; fax: +1-301-405-7104. E-mail address: [email protected] (D. Poeppel).


The successful execution of a CP task requires precise analysis of the speech sound but does not entail any (obvious) lexical–semantic processing (Liberman, Harris, Hoffman, & Griffith, 1957). In an auditory lexical decision (LD) task, subjects must judge whether or not an auditory target (e.g. "blicket") is a word. Execution of this task requires lexical access or lexical search in addition to the analysis of the speech signal (for review, see e.g. Goldinger, 1996).

We attempt to minimize task effects: in all three paradigms subjects execute a single-trial two-alternative-forced-choice on signals presented at the same rate. By hypothesis, in all cases there is an initial processing stage that constructs spectro-temporal representations of the signals. Subsequently, these representations interface with different systems. Words and non-words elicit lexical access (requiring speech analysis and lexical access), syllables elicit processing of the speech signal but not (the same degree of) lexical analysis, and FM processing requires spectral analysis but neither speech nor lexical processing. These differences should be reflected in distinct functional anatomic substrates.

Three questions are investigated. First, we evaluate whether the processing of speech is mediated bilaterally in the superior temporal gyri. Second, we test whether words additionally activate extra-temporal areas and reflect the lateralization that is typical of language processing beyond the analysis of the input signal. Third, the hypothesis that left and right non-primary auditory areas differentially contribute to the analysis of speech and other complex signals is explored by comparing which signals and tasks drive the (most likely non-primary) areas more effectively.

2. Materials and methods

2.1. Subjects

Participants were five males and five females (mean age, 26 years; range, 19–34 years). All participants graduated from or were attending a 4-year college. All participants were right handed according to the Oldfield handedness inventory (Oldfield, 1971), were native English speakers, and had normal physical, audiometric, and neurological examinations. All participants gave written consent after the nature and possible consequences of the study were explained. Each subject was paid for participating.

2.2. Stimulus materials and apparatus

All materials were recorded on an Apple Power Macintosh computer using Macromedia SoundEdit 16 (Macromedia, Inc., 1990–1996). SoundEdit files were converted to Macintosh System 7 sound resource files using SoundApp (Franke, 1999). All materials were auditorily presented to participants via the program RSVP (courtesy of Michael Tarr, Brown University) using an Apple Power Macintosh computer (Apple Computer, Inc., Cupertino, CA) playing through an AIWA LCX-800M stereo system (AIWA America, Inc., Mahwah, NJ). The output from the stereo was presented through two audio speakers situated on the sides of the positron emission tomography (PET) scanner gantry. These speakers were equidistant, dorsal and inferior to either side of the subjects' ears. Subjects indicated their responses by pressing a response button held in each hand. Responses in the two-alternative forced choice experiments were recorded by the computer and program used to present stimuli.

Materials for the FM sweeps condition (FM) consisted of eight frequency-modulated signals of 380 ms duration (sinusoidal carrier): four linearly rising FM sweeps (200–3200, 200–1600, 200–800 and 200–400) and four linearly falling FM sweeps (3200–200, 1600–200, 800–200 and 400–200 Hz). The mean presentation amplitude of the sweeps was 76 dB.

The stimuli for the CP condition consisted of seven synthesized CV syllables of 386 ms total duration with VOTs of 5, 10, 15, 25, 30, 35 and 45 ms. The syllables were synthesized using the Sensyn implementation of the Klatt synthesizer (Sensimetrics, Cambridge, MA) and the synthesis parameters were those previously reported (Poeppel et al., 1996). The presentation amplitude of the CV syllables was 85 dB (range, 84–86).

Materials for the LD condition consisted of 200 single syllable words (e.g. lease, fruit, herb, lead) and 200 single-syllable phonologically permissible non-words (e.g. tice, treek, jide, zumb). The materials were spoken by an adult male speaker. Mean word duration was 537 ms (range, 528–550). The presentation amplitude of the words was 85 dB (range, 77–93). The difference in duration between words and the other two stimulus types is problematic. However, to maintain natural word stimuli, we were forced to compromise on this issue.
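For concreteness, the sketch below shows one way to synthesize linear FM sweeps of the kind described in Section 2.2 (380 ms, sinusoidal carrier, trajectories such as 200–3200 Hz). It is an illustrative reconstruction, not the authors' stimulus-generation code; the 44.1 kHz sample rate and the omission of level calibration (the 76 dB presentation amplitude) are assumptions.

# Illustrative sketch (not the authors' code): a linear FM sweep with a
# sinusoidal carrier, as in the FM condition. Sample rate is assumed.
import numpy as np

def linear_fm_sweep(f_start, f_end, dur=0.380, fs=44100):
    """Instantaneous frequency moves linearly from f_start to f_end (Hz);
    the phase is the running integral of that frequency trajectory."""
    t = np.arange(int(dur * fs)) / fs
    inst_freq = f_start + (f_end - f_start) * (t / dur)   # Hz at each sample
    phase = 2.0 * np.pi * np.cumsum(inst_freq) / fs       # integrate frequency
    return np.sin(phase)

# The four rising and four falling ranges used in the experiment:
ranges = [(200, 3200), (200, 1600), (200, 800), (200, 400)]
sweeps = {f"rising_{hi}": linear_fm_sweep(lo, hi) for lo, hi in ranges}
sweeps.update({f"falling_{hi}": linear_fm_sweep(hi, lo) for lo, hi in ranges})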
2.3. Scanning methods

PET scans were performed on a GE Advance tomograph (Milwaukee, WI). The scanner has an axial field of view of 15.3 cm and an axial and in-plane resolution of 5.5 mm FWHM. Thirty-five contiguous axial planes, offset by 4.25 mm (center to center), were acquired simultaneously. Subjects' eyes were patched, and head motion was restricted during the scans by the use of a thermoplastic mask. 10 mCi of H2(15)O were injected as an intravenous bolus for each scan in 6–8 cm3 of normal saline. Auditory tasks were begun 30 s prior to the injection of radiotracer and continued throughout the scanning period. Scans were initiated automatically when the count rate in the brain reached a threshold value of 8000 s−1, approximately 20 s after injection. Data acquisition continued for 1 min. Studies were separated by 5 min intervals, with background scans acquired for count correction beginning one minute prior to each H2(15)O injection. A transmission scan using a rotating Ge-68/Ga-68 pin source was performed for attenuation correction before the rCBF scans.

2.4. Procedure

Each participant underwent five scans in each of the three experimental conditions, as well as five scans in a resting control condition. Participants were given response instructions for each condition before being placed in the scanner. For the FM sweeps condition, participants were instructed to indicate whether a stimulus was rising or falling by pressing the right or left response button, respectively. For the categorical perception condition, participants were instructed to indicate whether a stimulus sounded like a /ba/ or a /pa/ by pressing the right or left response button, respectively. For the lexical decision condition, participants were told to indicate whether a stimulus was a word or non-word by pressing the right or left response button, respectively. For the resting control condition, participants pressed alternately either the left or right response button. No sounds were played in this condition. All participants wore opaque eye patches to block out ambient light. All trials started with the stimulus being played, followed by a response period during which the participant could indicate his or her response. Participants' reaction time was measured from the start of stimulus playback. All trials in all conditions were 1500 ms in length.

The presentation order of all stimuli in all conditions was determined by a pseudo-random block design. In each scan in the FM sweeps condition, participants were played 10 blocks of stimuli, each block containing four rising sweeps and four falling sweeps. All sweeps appeared once in each block. In each scan in the categorical perception condition, participants were played 11 blocks of stimuli, each block containing seven phonemes. All phonemes appeared once in each block. In each scan in the lexical decision condition, participants were played 10 blocks of stimuli, each block containing four words and four non-words. No stimulus was repeated during the experiment.

2.5. Analysis

Calculations and image processing were carried out on a SUN Ultra 60 workstation using Matlab (MathWorks, Natick, MA) and SPM96 software (Wellcome Department of Cognitive Neurology, London, UK). To correct for head movement between scans, images were aligned on a voxel-by-voxel basis using a 3-D automated image registration algorithm (Woods, Cherry, & Mazziotta, 1992). Images were stereotaxically normalized into a canonical space (Talairach & Tournoux, 1988) and smoothed using a Gaussian filter of 15 mm × 15 mm × 9 mm in the x, y and z axes. The SPM analysis, which used the "multi-subjects: with replications" model, is an implementation of the General Linear Model (Friston, 1995), equivalent to an ANOVA applied on a voxel-by-voxel basis in which the task effect is the parameter of interest and global activity and inter- and intrasubject variability are confounding effects. Images are scaled to a global mean rCBF of 50 ml/100 g/min.

Two sets of within-group contrasts were performed: (1) each auditory task was compared to rest (R): LD–R, CP–R and FM–R; (2) contrasts between the auditory tasks were then carried out: LD–FM and LD–CP (to evaluate lexical processing); FM–LD and FM–CP (to evaluate FM processing); CP–LD and CP–FM (to evaluate syllable processing). The resulting set of voxel values for each contrast constitutes a statistical parametric map of the t-statistic (SPM{t}), which is then transformed to standard normal (SPM{z}) scores. Tests of significance based on the size of the activated region (Friston, Worsley, Frackowiak, Mazziotta, & Evans, 1994) were performed for each contrast. For the auditory task versus rest contrasts, mean differences in normalized rCBF at selected voxels of interest were extracted from the SPM output for purposes of illustration.
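As a rough illustration of the voxel-wise analysis just described (a general linear model with a contrast yielding an SPM{t} that is converted to SPM{z}), a minimal sketch follows. It is not the SPM96 implementation; the array shapes, the design-matrix coding of task, subject and global-activity effects, and the use of ordinary least squares are all simplifying assumptions.

# Minimal GLM contrast sketch, assuming `scans` holds normalized, smoothed
# rCBF images with shape (n_scans, n_voxels) and `design` codes task,
# subject and global-activity regressors (one row per scan).
import numpy as np
from scipy import stats

def contrast_z_map(scans, design, contrast):
    """Fit Y = X @ beta + e at every voxel; return a Z map for `contrast`."""
    X = np.asarray(design, dtype=float)
    beta, _, _, _ = np.linalg.lstsq(X, scans, rcond=None)
    resid = scans - X @ beta
    dof = X.shape[0] - np.linalg.matrix_rank(X)
    sigma2 = (resid ** 2).sum(axis=0) / dof              # residual variance per voxel
    c = np.asarray(contrast, dtype=float)                # e.g. a task-minus-rest contrast vector
    var_c = c @ np.linalg.pinv(X.T @ X) @ c              # contrast variance factor
    t_map = (c @ beta) / np.sqrt(sigma2 * var_c)         # SPM{t}
    return stats.norm.ppf(stats.t.cdf(t_map, dof))       # transformed to SPM{z}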
3. Results

3.1. Behavioral results

3.1.1. Lexical decision

Fig. 1a and b summarizes the LD behavioral data. Participants correctly made word/non-word judgments on 87% of trials. The mean time to make these judgments was 980 ms. Participants timed out on 1% of trials. There was no significant difference in accuracy between their judgments of words and non-words (89% versus 85%, respectively; F(1, 9) = 1.14, P > 0.10). However, Fig. 1b shows the typical finding for lexical decision studies contrasting words and pronounceable non-words, namely that participants were faster to judge words than non-words (951 ms versus 1009 ms, respectively; F(1, 9) = 16.42, P < 0.01).

3.1.2. Categorical perception

The response profile generated by subjects in the syllable categorization task is shown in Fig. 1c. The judgments matched judgments made by participants in previous studies of CV categorical perception (Liberman et al., 1957). Specifically, the continuously varying variable VOT is treated discontinuously in perception, i.e. equal acoustic steps (10 ms VOT) are classified into discrete bins, with syllables with VOTs of 5, 10, or 15 ms all being categorized as voiced /ba/ whereas syllables with VOTs longer than 25 ms are classified as the voiceless stop /pa/. Participants' mean time to make category judgments is shown in Fig. 1d. Overall, participants took 716 ms to make these judgments. Participants timed out on less than 1% of trials. The performance of participants was analyzed in a one-way within-subjects ANOVA. Replicating the well known categorical perception reaction time response profile, there was an effect of VOT that was due to the increased time in classifying the 25 and 30 ms (boundary) voice-onset stimuli [F(6, 54) = 20.11, P < 0.001].
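A common way to summarize an identification profile like the one in Fig. 1c is to fit a logistic function to the proportion of /pa/ responses at each VOT step and read off the 50% crossover as the category boundary. The sketch below illustrates that approach only; it is not part of the authors' analysis, and the response proportions are made-up placeholders rather than the study's data.

# Hypothetical illustration: estimate a /ba/-/pa/ category boundary by
# fitting a logistic function to identification proportions. The values in
# p_pa are placeholders, not data from this study.
import numpy as np
from scipy.optimize import curve_fit

vot_ms = np.array([5, 10, 15, 25, 30, 35, 45], dtype=float)
p_pa = np.array([0.02, 0.03, 0.10, 0.80, 0.95, 0.97, 0.99])   # placeholder values

def logistic(v, boundary, slope):
    return 1.0 / (1.0 + np.exp(-slope * (v - boundary)))

(boundary, slope), _ = curve_fit(logistic, vot_ms, p_pa, p0=[20.0, 0.5])
print(f"Estimated category boundary: {boundary:.1f} ms VOT")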

Fig. 1. Behavioral data for three tasks. (a and b) The lexical decision data. Note that although there is no performance difference, the longer reaction time for the pronounceable non-words (b) reflects that these items require additional time to make an accurate lexical decision. (c and d) The response profile for the CV-syllable categorical perception task. Note the sharp (categorical) drop-off in judgment between 20 and 30 ms VOT, a typical judgment profile for native English speakers. The reaction time data shown in (d) also reflect the increased processing time at the point of uncertainty between 20 and 30 ms VOT. (e) The proportion correct and (f) the reaction time data for the up/down FM discrimination task for each FM rate. FM rate increases from left to right.

3.1.3. FM sweeps

Fig. 1e and f summarizes the behavioral performance for the FM direction discrimination task. Participants' accuracy in judging the frequency direction of FM sweep stimuli is shown in Fig. 1e. Overall, participants made accurate judgments on 92% of trials. The performance of participants was analyzed in a 2 (direction: rising, falling) × 4 (range: 200–400, 200–800, 200–1600, 200–3200 Hz) within-subjects ANOVA. There was no difference between participants' accuracy in judging rising sweeps from falling sweeps [92% versus 93%, respectively; F(1, 9) = 2.61, P > 0.10]. However, there was an interaction between range and direction [F(3, 27) = 13.38, P < 0.001]. As shown in Fig. 1e, as the range of the sweep (and therefore the FM rate) increased, participants were more accurate with rising sweeps [F(3, 27) = 14.35, P < 0.001] and less accurate with falling sweeps [F(3, 27) = 8.65, P < 0.001]. Further, there was a main effect of range [F(3, 27) = 9.90, P < 0.001]. This effect appeared to be due to the fall-off in accuracy of participants judging the frequency direction of the rising 200–400 Hz tone and the falling 3200–200 Hz tone. Participants' reaction time in judging the frequency direction of sweep stimuli is shown in Fig. 1f. Excluding time-outs (1% of trials), the mean time to make judgments was 821 ms. Participants' reaction times were analyzed in the same design used to analyze the accuracy of their judgments. Participants were faster to judge falling FM sweeps than rising FM sweeps (979 ms versus 845 ms, respectively; F(1, 9) = 39.46, P < 0.001). A direction × range interaction for reaction time was found that was similar to that for accuracy [F(3, 27) = 21.49, P < 0.001]. As shown in Fig. 1f, as the frequency-range of the sweep increased, participants were faster with rising sweeps [F(3, 27) = 31.97, P < 0.001] and slower with falling sweeps [F(3, 27) = 7.04, P = 0.001]. Finally, there was a main effect of range of frequency [F(3, 27) = 8.46, P < 0.001] on reaction times. This effect is partially due to the relatively long time participants took to classify rising 200–400 Hz tones.

3.2. PET results

3.2.1. Tasks versus rest comparison

Table 1 summarizes the analysis for the comparison of each of the three tasks against rest, listing the local activation maxima for each contrast. Fig. 2 shows activations versus resting baseline as standard normal (SPM{z}) scores in canonical planar Talairach views.

The comparison LD versus rest yielded eight significant clusters, encompassing activations in both left and right hemispheres, with 9744 voxels above threshold, lateralized to the left (L:R ratio 1.40). CP versus rest yielded eight significant clusters, with 5386 voxels above threshold, evenly distributed in the left and right hemispheres (L:R ratio 1.05). FM versus rest showed 11 significant clusters, with 7734 voxels above threshold, lateralized to the right (L:R ratio 0.89). In addition to the evident trend—leftward lateralization for words, bilateral activation for syllables, and slight rightward lateralization for sweeps—activations for sweeps also extended more posteriorly in the right hemisphere. Note that these lateralization patterns are qualitative; we are not making quantitative claims about absolute lateralization outcomes, but merely reporting the overall patterns as reflected by clusters and maximal Z-scores.
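The L:R ratios quoted above are qualitative indices of lateralization. A plausible reading (an assumption, since the paper does not spell out the computation) is a simple count of suprathreshold voxels on each side of the midline, as sketched below; the threshold value is likewise an assumption.

# Hedged sketch of an L:R lateralization index: count suprathreshold voxels
# with negative vs. positive Talairach x. The threshold (Z = 3.09) is an
# assumption, not a value reported in the paper.
import numpy as np

def lr_ratio(z_map, x_coords, z_thresh=3.09):
    above = z_map >= z_thresh
    left = np.count_nonzero(above & (x_coords < 0))
    right = np.count_nonzero(above & (x_coords > 0))
    return left / right if right else float("inf")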
3.2.1.1. Activations common to all tasks.

Temporal areas. For all tasks, activation includes the entire antero-posterior extent of the lateral–temporal cortex in both hemispheres: anterior, middle and posterior superior temporal gyrus (STG), anterior, middle and posterior middle temporal gyrus (MTG), and all portions of the intervening superior temporal sulcus (STS). The areas of activation extended medially to include the areas in and around the transverse temporal gyrus. The activation should include AI and contiguous portions of AII, or, using a more contemporary nomenclature, core and belt auditory areas (Kaas & Hackett, 2000; Rademacher & Caviness, 1993; Rademacher & Morosan, 2001; Rauschecker, 1998). The resolution of our measurement technique, however, does not permit us to localize the activation in a way that would allow us to use this available nomenclature for the auditory cortex in an accurate way. Importantly, the local maxima—indexing the greatest activations—are located in the STG and STS. For all tasks, the largest Z-scores, indexing the greatest local maxima overall, are found in the middle portion of the STS, at the ventral bank of the STG, in both hemispheres (mid-STG/STS, Table 1).

As the data in Table 1 show, throughout the STG/STS, activations in the LH exceeded those in the RH for LD and CP (left hemisphere lateralization is most pronounced in the STS). In the central part of the STG, encompassing the most anterior portion of Heschl's gyrus (putative primary auditory cortex; Rademacher & Morosan, 2001), activation is lateralized (qualitatively) to the left for all tasks. On the other hand, for FM, STS activations in the RH always exceed those in the LH, and the degree of rightward lateralization is most pronounced in the posterior portions of the STS.

Extra-temporal areas. All tasks were associated with increased activity in premotor and motor structures, including pre-central gyrus (primary motor cortex), SMA, putamen and ventral thalamus and cerebellum (elements of the corticostriatal thalamocortical motor circuit). All tasks were also associated with increased activity compared to rest in the anterior cingulate cortex, parahippocampal gyri, and midbrain.

Fig. 2. Map of three tasks compared to rest. Maps of brain areas activated during the lexical decision (a), categorical perception (b) and FM sweeps discrimination (c) tasks. Statistical parametric maps in three projections display pixels in which normalized regional cerebral blood flow differed between task and rest conditions. Values are Z-scores representing the significance level of changes in normalized rCBF in each voxel; the range of scores for each contrast are coded in the accompanying color tables. The grid is the standard stereotaxic (Talairach) grid into which subjects' scans were normalized. The anterior commissural–posterior commissural line is set at zero on the sagittal and coronal projections. Vertical projections of the anterior commissure (VAC) and posterior commissure (VPC) are depicted on the transverse and sagittal projections. See Table 1 for detail.
Table 1. Three tasks vs. rest. Local activation maxima for the lexical decision, categorical perception and FM sweeps contrasts, organized by region (superior/middle temporal; inferior/basal temporal; parieto-occipital; prefrontal; frontal motor; insula/cingulate/PHPC; subcortical), with Brodmann area, Z-score and Talairach x, y, z coordinates for the left and right hemispheres. [The tabulated values are not reliably recoverable from this layout; see the published version.]

To compare activation across conditions, Figs. 3 and 4 illustrate increases in normalized rCBF versus rest at coordinates of interest representing maxima in the temporal lobe (Table 1, STG and STS coordinates) derived from the LD minus rest contrast (Fig. 3) and the FM sweeps minus rest contrast (Fig. 4). Fig. 3 shows that the response to words was significantly greater than the other conditions in all regions except the right posterior STS. The magnitude of the differences went up to 76% (LD versus FM, STS-mid) in the left hemisphere and up to 68% (LD versus CP, STG) in the right hemisphere. Fig. 4 shows that rCBF responses to FM sweeps exceeded those to other stimuli only in the right hemisphere. Particularly robust are the FM-elicited responses observed in the right posterior regions, STG/PT and posterior MTG.

Fig. 3. Bar graph illustrating changes in normalized rCBF for each of the auditory tasks vs. rest. For each contrast, the magnitude of rCBF increases at specified voxels of interest representing maxima in the temporal lobe derived from the LD minus rest contrast (Table 1) were extracted from SPM output matrices. Values represent mean differences in normalized rCBF (ml/100 g/min ± S.D.) at regions in left (A) and right (B) hemispheres, between resting baseline rCBF values and lexical decision (solid), categorical perception (heavily stippled) and FM sweeps (lightly stippled) tasks: (A) a: greater than (P < 0.01) CP and FM; b: greater than (P < 0.0001) CP and FM; c: greater than (P < 0.01) FM; (B) a: greater than (P < 0.0001) CP and (P < 0.01) FM; b: greater than (P < 0.05) CP; c: greater than (P < 0.0001) CP and FM; d: greater than (P < 0.001) CP and (P < 0.05) FM.

Fig. 4. Bar graph illustrating changes in normalized rCBF for each of the auditory tasks vs. rest. For each contrast, the magnitude of rCBF increases at specified voxels of interest representing maxima in the temporal lobe (Table 1, STG and STS coordinates) derived from the FM sweeps minus rest contrast (Table 1) were extracted from SPM output matrices. Values represent mean differences in normalized rCBF (ml/100 g/min ± S.D.) at regions in left (A) and right (B) hemispheres, between resting baseline rCBF values and lexical decision (solid), categorical perception (heavily stippled) and FM sweeps (lightly stippled) tasks: (A) a: significantly greater (P < 0.0001) than CP and FM; (B) a: significantly greater (P < 0.05) than CP and (P < 0.0001) FM; b: significantly greater (P < 0.0001) than CP; c: significantly greater (P < 0.0001) than LD and CP; d: significantly greater (P < 0.0001) than LD and (P < 0.001) CP; e: significantly greater than LD (P < 0.05).

Fig. 5. Maps illustrating contrasts between responses to auditory stimuli in three conditions. The statistical parametric (SPM{z}) map illustrating changes in rCBF is displayed on a standardized MRI scan. The MR image was transformed linearly into the same stereotaxic (Talairach) space as the SPM{z} data. Using Voxel View Ultra (Vital Images, Fairfield, Iowa), SPM and MR data were volume-rendered into a single three-dimensional image for each contrast. The volume sets are resliced and displayed at selected planes of interest relative to the anterior commissural–posterior commissural line as indicated. Values are Z-scores representing the significance level of differences in normalized rCBF in each voxel; the range of scores is coded in the accompanying color table, scaled to the maximal value for each contrast. The images in (a) illustrate the contrasts between lexical decision and FM sweeps as baseline (upper row) and syllable categorization as baseline (lower row). The images in (b) illustrate the contrasts between FM sweeps and LD as baseline (upper row) and syllable categorization as baseline (lower row). Images in (c) illustrate the contrasts between syllable categorization and FM sweeps as baseline (upper row) and LD as baseline (lower row).

3.2.1.2. Task specific activations. The LD task was associated with left-lateralized activation within extra-temporal regions, including the inferior and mid portions of the frontal operculum bilaterally (left greater than right, qualitatively, e.g. by assessing Z-scores), the left anterior and posterior fusiform gyrus, the pulvinar (left greater than right), and the left anterior insula. CP and FM, but not LD, were associated with significant activation of the right dorso-lateral prefrontal cortex; CP alone was associated with activity in the right inferior temporal gyrus, and FM sweeps alone showed activation of the left superior parietal lobule and supramarginal gyrus (SMG).

3.2.2. Task versus task comparisons

Fig. 5 and Tables 2–4 summarize the analysis for the comparison of each of the tasks (LD, CP, FM) against the two other tasks (i.e. LD–CP and LD–FM; CP–LD and CP–FM; FM–LD and FM–CP). Tables 2–4 list the peak activation maxima for each of those comparisons. Fig. 5 shows the PET activations overlaid on axial MR images at five different levels. Insofar as we consider commonalities of activations across subtraction conditions, these are assessed by visual inspection and by comparing scores, not by using masking.

3.2.2.1. Lexical decision versus FM and CP (Fig. 5a and Table 2). The comparison between LD and FM revealed four significant clusters, very strongly lateralized to the left (8598 voxels above threshold; L:R ratio 3.58). The comparison between lexical decision and CP revealed four significant clusters, also strongly lateralized to the left (7681 voxels above threshold; L:R ratio 2.96). Again, lateralization is qualitatively assessed as a ratio of the activation patterns, but not further quantified.

Greater activity for LD versus both CP and FM. Significant increases in activity for the LD task versus both CP and FM are seen throughout the entire antero-posterior extent of the STS bilaterally, as well as in the left posterior MTG. Bilateral increases in activity of the ITG are also seen. Outside of the temporal cortices, significantly greater activation for words versus either syllables or FM sweeps was detected in the left frontal operculum, the left anterior insula, the left DLPFC, the left anterior and posterior fusiform gyri, as well as the left anterior parahippocampal gyrus. In summary, greater activity for the lexical decision task than either FM or CP is found bilaterally in the auditory cortices; the differences in basal temporal and extra-temporal regions are, on the other hand, lateralized to the left hemisphere.

Other contrast specific differences. Compared to CP only, lexical decision showed greater activation of the central portion of the STG (BA 42/22), in both right and left hemispheres. Greater activity was also observed in the right MTG. Compared to FM only, LD was associated with greater activation of the left temporal pole, the right cerebellum, and the left ACC.

3.2.2.2. Categorical perception versus LD and FM (Fig. 5c and Table 3). The CP versus FM comparison showed three significant clusters, strongly lateralized to the left (1286 voxels above threshold; L:R ratio 3.85). CP versus LD showed four significant clusters, lateralized to the right (4188 voxels above threshold; L:R ratio 0.49).

In these analyses, the contrasts were markedly different, i.e. there were essentially no differences common to both contrasts (no regions in which activations during the CP task were significantly greater than both LD and FM). When compared with FM, the following patterns were observed: greater activity for CP in the lateral–temporal cortices, bilateral in the mid portion of the STS; in other temporal regions, significant increases in activation for CP versus FM—including the posterior STS and the anterior and posterior MTG—are lateralized to the left hemisphere. Extra-temporal foci were similarly lateralized; greater activation for CP versus FM was seen in the operculum, orbital and dorsolateral prefrontal cortices, the inferior parietal lobule, the cingulate and the parahippocampal gyrus. Broadly speaking, the pattern is similar to that seen in the LD versus FM contrast (which ostensibly reflects the speech–non-speech difference): greater activity bilaterally for CP in the mid-portion of the STS; other differences in lateral–temporal and extra-temporal areas were left lateralized.

When compared with LD, no activations associated with CP were significantly greater in the STG, MTG or STS in either hemisphere. Relative elevations in activity in extra-temporal regions were found principally in the right hemisphere: in dorsolateral prefrontal, insular, parietal and occipital cortices. Bilateral elevations in activity are seen in midline cortices: medial prefrontal, anterior and posterior cingulate gyri. In summary, in all temporal areas generally associated with auditory processing, activation associated with lexical decision exceeded activation in the categorical perception task.

3.2.2.3. FM sweeps versus LD and CP (Fig. 5b and Table 4). The FM sweeps versus LD contrast showed 11 significant clusters, lateralized to the right (4799 voxels above threshold; L:R ratio 0.71). The FM versus CP comparison showed six significant clusters, lateralized to the left (2747 voxels above threshold; L:R ratio 1.24).

Table 2. LD vs. CP and LD vs. FM. Local maxima (region, Brodmann area, Z-score and Talairach x, y, z coordinates; left and right hemispheres) for the lexical decision–categorical perception and lexical decision–FM sweeps contrasts. [The tabulated values are not reliably recoverable from this layout; see the published version.]

Table 3. FM vs. LD and FM vs. CP. Local maxima for the FM sweeps–lexical decision and FM sweeps–categorical perception contrasts, in the same format as Table 2. [The tabulated values are not reliably recoverable from this layout; see the published version.]
Table 4. CP vs. FM and CP vs. LD. Local maxima for the categorical perception–FM sweeps and categorical perception–lexical decision contrasts, in the same format as Table 2. [The tabulated values are not reliably recoverable from this layout; see the published version.]

Greater activity for FM versus both LD and CP. In the lateral–temporal cortices, local maxima representing greater activation for FM versus either CP or LD are located entirely within the right hemisphere. All differences common to both contrasts (FM versus both LD and CP) are found in the posterior temporal regions (greater than 45 mm posterior to the anterior commissure). Activity for FM exceeds both CP and LD in the most caudal portions of the right MTG and in the right posterior STG in the region of the planum temporale. Greater activity for FM versus both tasks was also seen in right anterior cingulate cortex and left parietal cortex (superior parietal lobule and SMG) (Fig. 5b). Significantly larger activations for FM were also found in motor areas: precentral gyri, premotor cortices, and cerebellum. In summary, in the temporal lobe, the greater activations for the FM discrimination are confined to the right posterior areas. Portions of the parietal cortex appear to be more active for the FM task as well.

Other contrast specific differences. Differences specific to the FM versus CP or the FM versus LD contrasts are also principally located within the right hemisphere. Compared to CP only, the right anterior STG and the right insula (as well as scattered motor areas) show foci of increased activation for FM sweeps. Compared to LD, the FM task was also associated with greater activity in the right SMG; more widespread activations seen for FM versus LD alone include right SMG, DLPFC, cerebellum, and occipital areas bilaterally.

4. Discussion

PET was used to measure the response patterns associated with auditory lexical decision, categorical perception of syllables, and FM direction identification. Because the acoustic and psycholinguistic characteristics of these signals are well understood and the behavioral profiles generated by the tasks are stable, we suggest that the neuronal activation data connect readily to the psycholinguistic literature. Crucially, the behavioral data we collected replicated the typical response profiles (Fig. 1).

To insure that the activation patterns observed were robust, and to more explicitly characterize the features of the response patterns, task-related activations were compared against three baselines: rest and the two other tasks. By using three different subtractions, we aimed to isolate those activations that survive comparisons with different types of controls. It is the sites that are activated across three comparisons that appear to us likely to merit interpretation in the context of our experiment. We used the same experimental procedure for each stimulus type (single-trial two-alternative-forced-choice, same rate of stimulus presentation) to minimize the effects due to the execution of the experimental task itself (Poeppel et al., 1996; Norris & Wise, 2000). The main findings were: (1) all tasks activated auditory cortex bilaterally (Fig. 2). When the non-speech condition (FM) was used as a baseline for both speech tasks (acoustic control), the bilateral nature of the response remained. Given that the activation for the sweeps was highly significant bilaterally, it is compelling that further bilateral (most likely non-primary) activation in STG and STS—perhaps related to the complex acoustic aspects of the speech signal that we did not control for—persisted. (2) The left and right areas were, however, differentially modulated (Fig. 5). The words engaged left temporal areas more strongly and also activated left-lateralized extra-temporal areas, including the frontal operculum and fusiform gyrus. The FM activation strongly lateralized to the right, particularly in non-primary temporal areas (STS and MTG). The categorical perception task was associated with bilateral activation of auditory fields and a leftward bias. For the CP versus FM the pattern was similar to that seen for LD versus FM.

Two methodological issues require comment: (i) the nature of the acoustic contrasts we used and their inherent limitations and (ii) the utility of PET for mapping the functional anatomy of the superior temporal cortex. The acoustic matching of our stimuli across the three experimental tasks is a difficult issue. For this experiment, we opted to use acoustic stimuli that are well understood in psycholinguistics, particularly research on speech processing. We chose the lexical, syllabic, and non-speech (FM) stimuli because the choice allowed us to connect with the literature exploring these materials in behavioral research. In other words, given how popular 'lexical decision' and 'categorical perception' tasks are in cognitive science research, we wanted to explore the neural basis associated with these tasks as they are executed with typical stimuli, and we focused more on matching across behavioral requirements (i.e. same response type, stimulation rate, etc.). However, we have as a consequence had to compromise with regard to the acoustic matching. Specifically, there are acoustic complexities that differ across the three conditions, with speech being the most acoustically complex relative to FM sweeps. The interpretation of these data requires caution insofar as we want to discuss the activations as forming the basis for speech processing.

The second issue concerns whether or not using PET one can localize activation with a high resolution and therefore differentiate between primary and non-primary auditory areas. PET is quiet (and therefore well suited for auditory studies that require psychoacoustic judgments) and has terrific sensitivity, but a limited spatial resolution when compared to fMRI. Recent treatments, e.g. by Johnsrude, Giraud, & Frackowiak (2002) and Hall, Hart & Johnsrude (2003), point out that a convincing analysis of the functional anatomy of human auditory cortex is very difficult to establish with a method that has a spatial resolution on the order of 10–15 mm. Fine-grained anatomic differences (and possible subtle lateralization patterns in responses) may be masked given the limited resolution of the technique. Therefore, we stick to gross morphological landmarks and qualitative patterns of lateralization in the data.
Bilaterality/symmetry (acoustic control), the bilateral nature of the response re- mained. Given that the activation for the sweeps was highly The robust bilateral activation to words and syllables (even significant bilaterally, it is compelling that further bilateral when FM sweeps were used as baseline) suggests that speech (most likely non-primary) activation in STG and STS— perception—construed as the set of procedures that take perhaps related to the complex acoustic aspects of the speech acoustic input and derive representations that make contact signal that we did not control for persisted; (2) the left and with the mental lexicon—is mediated in left and right audi- D. Poeppel et al. / Neuropsychologia 42 (2004) 183–200 197 tory areas. Precisely which computation in the speech per- 4.2. Laterality/asymmetry ception process is being subserved by left and right areas cannot be resolved because in the present study the acous- 4.2.1. Words tics were not controlled in a way to permit that analysis. Temporal areas were differentially more active for words There are suggestions in the literature that processes opti- than for syllables. The extent and magnitude of activation mized for spectral analysis are more rightward lateralized in left areas exceeded right areas, although the response and the processes optimized for temporal analysis more left- was still bilateral. While lateralization was observed in ward lateralized (Zatorre, Belin, & Penhune, 2002; Poeppel, the response magnitude of temporal areas, it was even 2003). In the context of speech per se it has been suggested more apparent in the recruitment of extra-temporal areas. that a longer temporal window of analysis associated with Words compared to either CP or FM showed left-lateralized right non-primary auditory cortex ( 200–300 ms) can con- extra-temporal activations including the operculum (BA 44, fer a slight rightward advantage for∼ syllabic processing— 45 and 47), the anterior insula, and the fusiform gyrus (BA and syllable-sized acoustic units, in turn, form an effec- 37). In general, single words have been used in a number tive temporal unit for spectral analysis tasks. In contrast, of studies, and a marked leftward lateralization has been the shorter temporal integration window associated with left observed by many groups (Wise et al., 1991; Howard et al., non-primary areas ( 20–50 ms) forms the basis for process- 1992; Fiez, Raichle, Balota, Tallal, & Petersen, 1996; Price ing at shorter time scales,∼ which in speech would be advanta- et al., 1996; Binder et al., 1997). Our results suggest that geous for segmental and subsegmental processing and the es- post-perceptual computations, perhaps aspects of lexical tablishment of intra-syllabic temporal order (Poeppel, 2003). semantics, reflect the laterality that is characteristic of lan- In summary, in the context of the present results, the bilateral guage processing. Recent convergent fMRI evidence also nature of the response need not be speech-specific; but very argues that it is the lexical–semantic level of processing that probably the processes mediated by left and right auditory is lateralized (Zahn et al., 2000). areas play a core role in the analysis of the speech signal. We selected durations of words, CVs and FMs that are A common assumption deriving primarily from neu- used in standard psychophysical paradigms. 
As a conse- ropsychological research is that speech perception lateral- quence, the words were about 40% greater in duration izes to the dominant hemisphere, presumably motivated by than the syllables and sweeps, which were matched to the fact that speech perception is closely linked to language each other. On this basis, differences in the duration of our processing—which is highly lateralized (Geschwind, 1970; stimuli may have accounted for a portion of the variation Binder et al., 1997). In view of recent data, including the in the auditory cortical responses. However, the major ef- data from this study, this model has to be reconsidered. fects we observe—differences exceeding 80% (versus FM The data observed here converge with imaging results by sweeps) in the left anterior STS (Fig. 4A) and 73% (versus Wise et al. (1991), Mummery et al. (1999), Belin, Zatorre, CV syllables) in the right anterior STS (Fig. 4B)—were Lafaille, Ahad, & Pike (2000), Binder et al. (2000), as well in non-primary auditory areas, and the previous literature as neuropsychological arguments by Poeppel (2001). Re- indicates that left non-primary areas are less susceptible cent reviews of imaging, lesion, and electrophysiological to rate changes (Price et al., 1992). Moreover, the finding data by Hickok & Poeppel (2000) and Norris & Wise (2000) that posterior right STG and MTG respond much more emphasize the emerging consensus that speech perception strongly to FM sweeps than to words argues against a con- (or, better, the numerous processes underlying speech per- ception that signal duration entirely drives the response ception, including the temporal and spectral analysis of pattern. the signal, the analysis of periodic and aperiodic compo- What is the role of the left frontal opercular activa- nents, and other necessary signal processing subroutines) is tion? One hypothesis is that the lexical decision task, in mediated bilaterally. addition to the known exhaustive lexical search, requires By and large, the data we observe here are consistent with different underlying phonetic/phonological computations the hierarchical model of auditory word processing artic- (cf. Bokde, Tagamets, Friedman, & Horwitz, 2001). In ulated by Binder et al. (2000). Their model suggests that particular, processing non-words might engage speech seg- spectro-temporally complex sounds, in general, including mentation operations that have been shown to drive left speech sounds, are processed bilaterally in the dorsal tem- frontal opercular cortex (Zatorre, Evans, Meyer, & Gjedde, poral plane, including STG. Speech sounds per se appear 1992; Burton, Small, & Blumstein, 2000; Burton, 2001). to be processed bilaterally in areas more ventro-lateral than If this interpretation is on the right track, then the activa- non-speech signals (e.g. along STS). Finally, the process- tion may be primarily due to processing of the non-words, ing of words beyond the analysis of the input signal (i.e. which more strongly engages sound segmentation. The lexical search, word recognition) appears to be handled by fusiform activation may reflect an interface with word or additional areas outside of the superior temporal lobe (e.g. conceptual storage. Fusiform activation to words (visual MTG) as well as extra-temporal areas. Using a different and auditory) has been observed in other studies (e.g. 
experimental design and a different imaging technique, the Zatorre, Meyer, & Gjedde, 1996; Wagner et al., 1998; Chee data we report are consistent with the above model. et al., 1999). Based on neuropsychological and imaging 198 D. Poeppel et al. / Neuropsychologia 42 (2004) 183–200 data we argue that inferior temporal and fusiform areas play processing is driven by differing temporal and spectral sen- a core role in processing lexical and conceptual informa- sitivities, as mentioned above. Zatorre & Belin (2001), in tion, possibly in a manner independent of the input modalit particular, argue that left cortical areas are specialized for (Büchel, Price, & Friston, 1998). temporal processing and right areas for spectral processing. Signals that require an analysis that emphasizes either type 4.2.2. Syllables of processing to execute a task will therefore differentially The analysis of syllables with the type of consonantal engage left versus right areas. An alternative possibility, onset we used is modulated by formant transitions, which also discussed above, is that the temporal integration win- are typically on the order of 20–50 ms duration. Based on dows over which sounds are analyzed in non-primary audi- arguments that the left hemisphere is optimized to analyze tory areas differ between the left and right areas. Left areas ‘rapid temporal transitions’ (Fitch et al., 1993; Nicholls, favor temporal information because shorter temporal inte- 1996; Poeppel, 2001, 2003) it is assumed that the percep- gration windows are analyzed (favoring temporal features tion of CV syllables should be associated with left audi- of a sound), right areas favor spectral information because tory cortex. Studies that have used auditory CV syllables longer integration windows are considered (Poeppel, 2001, have indeed typically observed activation in the left (Zatorre 2003).There are two novel aspects to this work. First, tasks et al., 1992; Fiez et al., 1995; Celsis et al., 1999; Burton such as categorical perception or lexical decision derive et al., 2000). The present results do not clearly support this from a rich experimental literature. Because these stimuli view. The activation in the syllable task, although much and tasks are well understood, it seems reasonable to use more diffuse, does not appear to be markedly lateralized. them to aid the interpretation of the anatomic findings. For One reason for the absence of the lateralization in CP might example, because we know that lexical decision requires be that subjects execute a CP task by attending to the en- lexical access, we can argue that the identified areas must tire spectrum of a syllable (i.e. envelope and temporal fine play a role in subparts of lexical access. The network of structure) rather than just the spectro-temporal fine struc- areas identified can subsequently be investigated in a more ture necessary for the segmentation tasks used in previous parametric manner. Second, the observations we report work. confirm and extend a new perspective on the neural basis of speech. The data are consistent with a model in which 4.2.3. FM sweeps all sound-based representations are constructed bilaterally FM stimuli, while eliciting a bilateral response, more in auditory cortex (Binder et al., 2000; Hickok & Poeppel, strongly activated right temporal areas, where responses 2000). These representations interface with computational were also less variable than on the left. What drives this re- systems in different ways. 
There are two novel aspects to this work. First, tasks such as categorical perception or lexical decision derive from a rich experimental literature. Because these stimuli and tasks are well understood, it seems reasonable to use them to aid the interpretation of the anatomic findings. For example, because we know that lexical decision requires lexical access, we can argue that the identified areas must play a role in subparts of lexical access. The network of areas identified can subsequently be investigated in a more parametric manner. Second, the observations we report confirm and extend a new perspective on the neural basis of speech. The data are consistent with a model in which all sound-based representations are constructed bilaterally in auditory cortex (Binder et al., 2000; Hickok & Poeppel, 2000). These representations interface with computational systems in different ways. Auditory representations that are subject to linguistic interpretation lateralize primarily to the left; other representations presumably lateralize based on their functional role. One parameter that conditions the lateralization is the apparent differential temporal sensitivity of left and right auditory areas (Zatorre & Belin, 2001; Zatorre et al., 2002; Poeppel, 2001, 2003). On this view, the analysis of the speech signal is mediated bilaterally because speech contains ‘fast’ components (e.g. formant transitions) and ‘slow’ components (envelope of the syllable, intonation contour). In contrast, post-perceptual linguistic computation is lateralized.

Acknowledgements

This work was supported by the James S. McDonnell Foundation Program in Cognitive Neuroscience, NIH DC04638 and NIH DC05660 (DP), and the National Institute of Deafness and other Communication Disorders Intramural Research Program (AB). We thank Barry Horwitz for comments on the manuscript, Anna Salajegheh for help with figure preparation, and Charles Wharton and Lucila San Jose for help with stimulus creation. Correspondence to David Poeppel, Cognitive Neuroscience of Language Laboratory, University of Maryland, 1401 Marie Mount Hall, College Park, MD 20742, USA ([email protected]).

References

Baumgart, F., Gaschler-Markefski, B., Woldorff, M. G., Heinze, H., & Scheich, H. (1999). A movement-sensitive area in auditory cortex. Nature, 400, 724–725.
Belin, P., Zatorre, R. J., Lafaille, P., Ahad, P., & Pike, B. (2000). Voice-selective areas in human auditory cortex. Nature, 403, 309–312.
Belin, P., Zilbovicius, M., Crozier, S., Thivard, L., Fontaine, A., Masure, M. C., & Samson, Y. (1998). Lateralization of speech and auditory temporal processing. Journal of Cognitive Neuroscience, 10, 536–540.
Binder, J. R., Frost, J. A., Hammeke, T. A., Bellgowan, P. S., Springer, J. A., Kaufman, J. N., & Possing, E. T. (2000). Human temporal lobe activation by speech and nonspeech sounds. Cerebral Cortex, 10, 512–528.
Binder, J. R., Frost, J. A., Hammeke, T. A., Cox, R. W., Rao, S. M., & Prieto, T. (1997). Human brain language areas identified by functional magnetic resonance imaging. Journal of Neuroscience, 17, 353–362.
Bokde, A., Tagamets, M., Friedman, R., & Horwitz, B. (2001). Functional interactions of the inferior frontal cortex during the processing of words and word-like stimuli. Neuron, 30, 609–617.
Büchel, C., Price, C., & Friston, K. (1998). A multimodal language region in the ventral visual pathway. Nature, 394, 274–277.
Burton, M. W. (2001). The role of inferior frontal cortex in phonological processing. Cognitive Science, 25, 695–709.
Burton, M. W., Small, S., & Blumstein, S. E. (2000). The role of segmentation in phonological processing: An fMRI investigation. Journal of Cognitive Neuroscience, 12, 679–690.
Celsis, P., Boulanouar, K., Doyon, B., Ranjeva, J. P., Berry, I., Nespoulous, J. L., & Chollet, F. (1999). Differential fMRI responses in the left posterior superior temporal gyrus and left supramarginal gyrus to habituation and change detection in syllables and tones. NeuroImage, 9, 135–144.
Chee, M. W., O'Craven, K. M., Bergida, R., Rosen, B. R., & Savoy, R. L. (1999). Auditory and visual word processing studied with fMRI. Human Brain Mapping, 7, 15–28.
Démonet, J.-F., Chollet, F., Ramsay, S., Cardebat, D., Nespoulous, J.-L., Wise, R., Rascol, A., & Frackowiak, R. (1992). The anatomy of phonological and semantic processing in normal subjects. Brain, 115, 1753–1768.
Fiez, J. A., Raichle, M. E., Balota, D. A., Tallal, P., & Petersen, S. E. (1996). PET activation of posterior temporal regions during auditory word presentation and verb generation. Cerebral Cortex, 6, 1–10.
Fiez, J., Raichle, M. E., Miezin, F. M., Petersen, S. E., Tallal, P., & Katz, W. F. (1995). Studies of auditory and phonological processing: Effects of stimulus characteristics and task demands. Journal of Cognitive Neuroscience, 7, 357–375.
Fitch, R. H., Brown, C. P., & Tallal, P. (1993). Left hemisphere specialization for auditory temporal processing in rats. Annals of the New York Academy of Sciences, 682, 346–347.
Franke, N. (1999). SoundApp (2.5.1 ed.).
Friston, K. J. (1995). Commentary and opinion. II. Statistical parametric mapping: Ontology and current issues. Journal of Cerebral Blood Flow and Metabolism, 15, 361–370.
Friston, K., Worsley, K., Frackowiak, R., Mazziotta, J., & Evans, A. (1994). Assessing the significance of focal activations using their spatial extent. Human Brain Mapping, 1, 210–220.
Geschwind, N. (1970). The organization of language and the brain. Science, 170, 940–944.
Giraud, A. L., & Price, C. J. (2001). The constraints functional neuroimaging places on classical models of auditory word processing. Journal of Cognitive Neuroscience, 13, 754–765.
Goldinger, S. D. (1996). Auditory lexical decision. Language and Cognitive Processes, 11, 559–567.
Gordon, M., & Poeppel, D. (2002). Inequality in identification of direction of frequency change (up versus down) for rapid frequency-modulated sweeps. ARLO/Journal of the Acoustical Society of America, 3(1).
Hall, D. A., Hart, H. C., & Johnsrude, I. S. (2003). Relationships between human auditory cortical structure and function. Audiology & Neurootology, 8, 1–18.
Hall, D. A., Johnsrude, I. S., Haggard, M. P., Palmer, A. R., Akeroyd, M. A., & Summerfield, A. Q. (2002). Spectral and temporal processing in human auditory cortex. Cerebral Cortex, 12, 140–149.
Hickok, G., & Poeppel, D. (2000). Towards a functional neuroanatomy of speech perception. Trends in Cognitive Sciences, 4, 131–138.
Howard, D., Patterson, K., Wise, R., Brown, W., Friston, K., Weiller, C., & Frackowiak, R. (1992). The cortical localization of the lexicons: Positron emission tomography evidence. Brain, 115, 1769–1782.
Johnsrude, I. S., Giraud, A. L., & Frackowiak, R. (2002). Functional imaging of the auditory system: The use of positron emission tomography. Audiology & Neurootology, 7, 251–276.
Johnsrude, I. S., Zatorre, R. J., Milner, B. A., & Evans, A. C. (1997). Left-hemisphere specialization for the processing of acoustic transients. Cognitive Neuroscience and Neuropsychology, 8, 1761–1765.
Kaas, J., & Hackett, T. (2000). Subdivisions of auditory cortex and processing streams in primates. Proceedings of the National Academy of Sciences of the United States of America, 97, 11793–11799.
Liberman, A., Harris, K. S., Hoffman, H. S., & Griffith, B. C. (1957). The discrimination of speech sounds within and across phoneme boundaries. Journal of Experimental Psychology, 54, 358–368.
Mummery, C. J., Ashburner, J., Scott, S. K., & Wise, R. J. (1999). Functional neuroimaging of speech perception in six normal and two aphasic subjects. Journal of the Acoustical Society of America, 106, 449–457.
Nicholls, M. (1996). Temporal processing asymmetries between the cerebral hemispheres: Evidence and implications. Laterality, 1, 97–137.
Norris, D., & Wise, R. (2000). The study of prelexical and lexical processes in comprehension: Psycholinguistics and functional neuroimaging. In Gazzaniga, M. (Ed.), The new cognitive neurosciences. Cambridge, MA: MIT Press.
Oldfield, R. C. (1971). The assessment and analysis of handedness: The Edinburgh inventory. Neuropsychologia, 9, 97–113.
Poeppel, D. (2001). Pure word deafness and the bilateral processing of the speech code. Cognitive Science, 25, 679–693.
Poeppel, D. (2003). The analysis of speech in different temporal integration windows: Cerebral lateralization as ‘asymmetric sampling in time’. Speech Communication, 41, 245–255.
Poeppel, D., Yellin, E., Phillips, C., Roberts, T. P. L., Rowley, H. A., Wexler, K., & Marantz, A. (1996). Task-induced asymmetry of the auditory evoked M100 neuromagnetic field elicited by speech sounds. Cognitive Brain Research, 4, 231–242.
Price, C. J., Wise, R. J. S., Ramsay, S., Friston, K. J., Howard, D., Patterson, K., & Frackowiak, R. S. J. (1992). Regional response differences within the human auditory cortex when listening to words. Neuroscience Letters, 146, 179–182.
Price, C. J., Wise, R. J. S., Warburton, E. A., Moore, C. J., Howard, D., Patterson, K., Frackowiak, R. S. J., & Friston, K. J. (1996). Hearing and saying: The functional neuro-anatomy of auditory word processing. Brain, 119, 919–931.
Rademacher, J., & Caviness, V. S. (1993). Topographical variation of the human primary cortices: Implications for neuroimaging, brain mapping, and neurobiology. Cerebral Cortex, 3, 313–329.
Rademacher, J., & Morosan, P. (2001). Probabilistic mapping and volume measurement of human primary auditory cortex. NeuroImage, 13, 669–683.
Rauschecker, J. (1998). Parallel processing in auditory cortex of primates. Audiology and Neurootology, 3, 86–103.
Scheich, H., Baumgart, F., Gaschler-Markefski, B., Tegeler, C., Tempelmann, C., Heinze, H. J., Schindler, F., & Stiller, D. (1998). Functional magnetic resonance imaging of a human auditory cortex area involved in foreground–background decomposition. European Journal of Neuroscience, 10, 803–809.
Schlosser, M. J., Aoyagi, N., Fulbright, R. K., Gore, J. C., & McCarthy, G. (1998). Functional MRI studies of auditory comprehension. Human Brain Mapping, 6, 1–13.
Scott, S. K., Blank, S. C., Rosen, S., & Wise, R. J. S. (2000). Identification of a pathway for intelligible speech in the left temporal lobe. Brain, 123, 2400–2406.
Talairach, J., & Tournoux, P. (1988). Co-planar stereotactic atlas of the human brain (2nd ed.). Stuttgart: Thieme Verlag.
Thivard, L., Belin, P., Zilbovicius, M., Poline, J. B., & Samson, Y. (2000). A cortical region sensitive to auditory spectral motion. NeuroReport, 11, 2969–2972.
Wagner, A. D., Schacter, D. L., Rotte, M., Koutstaal, W., Maril, A., Dale, A. M., Rosen, B. R., & Buckner, R. L. (1998). Building memories: Remembering and forgetting of verbal experiences as predicted by brain activity. Science, 281, 1188–1191.
Wise, R., Chollet, F., Hadar, U., Friston, K., Hoffner, E., & Frackowiak, R. (1991). Distribution of cortical neural networks involved in word comprehension and word retrieval. Brain, 114, 1803–1817.
Woods, R. P., Cherry, S. R., & Mazziotta, J. C. (1992). Rapid automated algorithm for aligning and reslicing PET images. Journal of Computer Assisted Tomography, 16, 620–633.
Zahn, R., Huber, W., Drews, E., Erberich, S., Krings, T., Willmes, K., & Schwarz, M. (2000). Hemispheric lateralization at different levels of human auditory word processing: A functional magnetic resonance imaging study. Neuroscience Letters, 287, 195–198.
Zatorre, R. J., & Belin, P. (2001). Spectral and temporal processing in human auditory cortex. Cerebral Cortex, 11, 946–953.
Zatorre, R. J., Belin, P., & Penhune, V. B. (2002). Structure and function of auditory cortex: Music and speech. Trends in Cognitive Sciences, 6, 37–46.
Zatorre, R., Evans, A., Meyer, E., & Gjedde, A. (1992). Lateralization of phonetic and pitch discrimination in speech processing. Science, 256, 846–849.
Zatorre, R. J., Meyer, E., Gjedde, A., & Evans, A. C. (1996). PET studies of phonetic processing of speech: Review, replication, and reanalysis. Cerebral Cortex, 6, 21–30.