Neural Encoding of Attended Continuous Speech Under Different Types of Interference
Andrea Olguin, Tristan A. Bekinschtein, and Mirjana Bozic
University of Cambridge

© 2018 Massachusetts Institute of Technology. Journal of Cognitive Neuroscience 30:11, pp. 1606–1619. doi:10.1162/jocn_a_01303

Abstract

We examined how attention modulates the neural encoding of continuous speech under different types of interference. In an EEG experiment, participants attended to a narrative in English while ignoring a competing stream in the other ear. Four different types of interference were presented to the unattended ear: a different English narrative, a narrative in a language unknown to the listener (Spanish), a well-matched nonlinguistic acoustic interference (Musical Rain), and no interference. Neural encoding of attended and unattended signals was assessed by calculating cross-correlations between their respective envelopes and the EEG recordings. Findings revealed more robust neural encoding for the attended envelopes compared with the ignored ones. Critically, however, the type of the interfering stream significantly modulated this process, with the fully intelligible distractor (English) causing the strongest encoding of both attended and unattended streams and the latest dissociation between them, and nonintelligible distractors causing weaker encoding and early dissociation between attended and unattended streams. The results were consistent over the time course of the spoken narrative. These findings suggest that attended and unattended information can be differentiated at different depths of processing analysis, with the locus of selective attention determined by the nature of the competing stream. They provide strong support for flexible accounts of auditory selective attention.

INTRODUCTION

Directing attention to a single speaker in a multitalker environment is an everyday occurrence that we manage with relative ease. This phenomenon is commonly termed the “cocktail party” effect (Cherry, 1953). A large body of research has sought to assess how the underlying attentional mechanisms operate and how much of the nonattended signal is perceived in such situations, producing mixed results. Here, we aim to assess these questions by investigating the neural encoding of continuous attended speech under different types of linguistic and nonlinguistic interference.

Selective Attention

Selective attention is the ability to sustain focus on task-relevant stimuli in the presence of distractors. This has long been recognized as an essential cognitive capacity (e.g., James, 1890) because our brains are continuously flooded with information but limited in what they can process. Nevertheless, listeners are also often distracted by irrelevant stimuli, prompting questions about the locus and mechanisms of attentional allocation in the presence of competing streams of information. Historically, two major views guiding research on auditory selective attention were the “early selection” and the “late selection” approaches. The early selection theory (Broadbent, 1958) argued that, because of our limited processing capacity, attended and unattended information is differentiated early in perceptual processing. More specifically, sensory features can guide attentional selection early on, thus determining what will be subsequently processed for meaning. The late selection approach (Duncan, 1980; Deutsch & Deutsch, 1963) proposed that selective attention cannot affect the perceptual analysis of the stimuli and that both attended and unattended inputs are processed equivalently by the perceptual system. In this view, selective attention only acts later in the process, after the input has undergone semantic encoding and analysis. Subsequent theories argued that unattended information might be attenuated rather than completely filtered out, allowing unattended information with low identification thresholds (as determined by their semantic features) to reach awareness (Treisman, 1969).

Johnston and Heinz (1978) suggested that selective attention is a multimode flexible system, where attended and unattended information can be differentiated at different depths of processing analysis. They also argued that selective attention itself requires processing capacity (cf. Kahneman, 1973), with later selection requiring more processing capacity and effort. On this account, efficient selection can be achieved early based on sensory differences between attended and unattended streams; however, in the absence of effective sensory cues, semantic features will be driving the differentiation later in the process, using more capacity. A more recent account of attention allocation in speech comprehension (Bronkhorst, 2015) also argues that attentional selection can be triggered at different processing depths. Attention triggered early on is based on basic signal properties (sound level, fundamental frequency) and enables fast selection, whereas attention at later processing stages is based on complex information, such as syntactic and semantic information, and is used for slow selection.

Experimental Evidence

A substantial body of research used dichotic listening to assess whether auditory attention selects information early on, based on the physical characteristics of the stimulus, or after the input has been processed up to a semantic level. The results are mixed, with some studies showing that both attended and unattended information can be processed up to the semantic level (Bentin, Kutas, & Hillyard, 1995; Wood & Cowan, 1995; Eich, 1984) and others finding no evidence for semantic processing of the unattended stream (Wood, Stadler, & Cowan, 1997; Newstead & Dennis, 1979). The inconsistency has been attributed to inadequate control of attentional shifts to the unattended ear (Dupoux, Kouider, & Mehler, 2003; Holender, 1986), prompting the claim that listeners cannot semantically process information that is genuinely unattended. Yet, further studies demonstrated that unattended information can be processed in the absence of attention shifts to the irrelevant channel (Rivenez, Guillaume, Bourgeon, & Darwin, 2008) and that it can, under certain conditions, be processed up to the semantic and syntactic processing levels (Aydelott, Jamaluddin, & Nixon Pearce, 2015; Pulvermüller, Shtyrov, Hasting, & Carlyon, 2008). This conclusion is also consistent with the argument that the auditory system, although able to selectively focus processing on the relevant stream, has surplus capacity to process auditory information from other streams, regardless of the perceptual load in the attended stream (Murphy, Fraenkel, & Dalton, 2013). However, it has also been argued that the nature of the measurement can determine whether the processing of the unattended message is observed or not (Rivenez et al., 2008), with studies using explicit measures (e.g., word recall) more likely to find that the unattended message was not processed.

Neural Encoding of Attended and Unattended Streams

The temporal envelope of speech is strongly represented in the brain, with several studies showing a significant correlation between speech envelopes and cortical activity (Lalor & Foxe, 2010; Abrams, Nicol, Zecker, & Kraus, 2008; Aiken & Picton, 2008). These correlations appear to be a result of phase locking or synchronization between neural activity and the slow amplitude modulations of the speech envelope, which are mainly present in the theta frequency band (3–7 Hz) and correspond to the syllabic rate of speech (Doelling, Arnal, Ghitza, & Poeppel, 2014; Giraud & Poeppel, 2012; Drullman, Festen, & Plomp, 1994). Phase locking has also been observed for noise-vocoded speech (i.e., stimuli in which the slow amplitude fluctuations are preserved but spectral details are reduced) but is stronger for intelligible stimuli (Ding, Chatterjee, & Simon, 2013; Peelle, Gross, & Davis, 2013).

Selective attention has been shown to have a robust influence on these synchronizations. In “cocktail party” paradigms, the auditory system preferentially tracks the temporal envelope of the attended talker and appears to be out of phase with the ignored speech stream (Rimmele, Zion Golumbic, Schröger, & Poeppel, 2015; Hambrook & Tata, 2014; Horton, Srinivasan, & D’Zmura, 2014; Horton, D’Zmura, & Srinivasan, 2013; Ding & Simon, 2012a; Zion Golumbic, Poeppel, & Schroeder, 2012; Kerlin, Shahin, & Miller, 2010). This phenomenon has been referred to as the “selective entrainment hypothesis” (Zion Golumbic et al., 2013; Giraud & Poeppel, 2012; Schroeder & Lakatos, 2010; Lakatos, Karmos, Mehta, Ulbert, & Schroeder, 2008), which suggests that attention causes low-frequency neural oscillations to entrain to the temporal envelope of the attended speech stream. For instance, Rimmele et al. (2015) used magnetoencephalography and a cocktail paradigm to reveal stronger attentional encoding for natural speech compared with noise-vocoded speech. They suggested that attentional enhancement of speech tracking depends on the presence of fine structure in the stimulus. In another study, Hambrook and Tata (2014) presented two simultaneous audiobook clips while EEG was being recorded. Attentional selection increased the EEG signals that were synchronized with the attended stream, but not the ignored one. Similarly, Horton et al. (2013) asked participants to attend to one of two competing speech streams. The
Thus,