Frequency modulation entrains slow neural oscillations and optimizes human listening behavior

Molly J. Henry1 and Jonas Obleser1

Max Planck Research Group Auditory Cognition, Max Planck Institute for Human Cognitive and Sciences, 04103 Leipzig, Germany

Edited by Nancy J. Kopell, Boston University, Boston, MA, and approved October 16, 2012 (received for review August 2, 2012) The human ability to continuously track dynamic environmental stimulus onsets (3, 15). Behaviorally, decreased response times stimuli, in particular speech, is proposed to profit from “entrainment” (RTs) to suprathreshold targets are associated with optimal delta of endogenous neural oscillations, which involves phase reorganiza- phase (3, 14). Moreover, empirical results from vision suggest that tion such that “optimal” phase comes into line with temporally ex- a number of perceptual benefits are afforded to near-threshold pected critical events, resulting in improved processing. The current stimuli coinciding with optimal (mostly theta and alpha) phase, experiment goes beyond previous work in this domain by address- including improved detection (7, 16), and faster RTs (17) (see ref. ing two thus far unanswered questions. First, how general is neu- 18 for a review). ral entrainment to environmental rhythms: Can neural oscillations Improved perceptual processing (i.e., increased hit rate) of near- be entrained by temporal dynamics of ongoing rhythmic stimuli with- threshold stimuli occurring in the optimal phase on an entrained out abrupt onsets? Second, does neural entrainment optimize perfor- (that is, during rhythmic stimulation) has thus far mance of the perceptual system: Does human auditory perception not been demonstrated in the auditory domain (but see ref. 19 for benefit from neural phase reorganization? In a human electroen- evidence on phasic modulation of miss rates). This lack of evidence cephalography study, listeners detected short gaps distributed uni- is unfortunate, because the rhythmic structure of speech provides formly with respect to the phase angle of a 3-Hz frequency-modulated a pacing signal for the reorganization of ongoing neural oscillations stimulus. Listeners’ ability to detect gaps in the frequency-modulated over a range of frequency bands (for a review, see ref. 20). For ex- sound was not uniformly distributed in time, but clustered in cer- ample, important information about the content of the signal is tain preferred phases of the modulation. Moreover, the optimal communicated by quasi-periodicity in the gamma range corre- stimulus phase was individually determined by the neural delta sponding to fine structure and by low-frequency fluctuations (theta NEUROSCIENCE oscillation entrained by the stimulus. Finally, delta phase predicted and delta range) corresponding to the syllable envelope and prosodic behavior better than stimulus phase or the event-related potential variations, respectively (21). Accordingly, evidence for neural en- after the gap. This study demonstrates behavioral benefits of phase trainment to the slow amplitude fluctuations in tone sequences (3, 4), realignment in response to frequency-modulated auditory stimuli, speech (22–24), and natural sounds (25, 26) has been demonstrated. overall suggesting that frequency fluctuations in natural environ- However, a crucial missing link—a perceptual benefit for near- mental input provide a pacing signal for endogenous neural oscil- threshold events stemming from optimal phase alignment of lations, thereby influencing perceptual processing. an entrained oscillation in the auditory domain—has not yet been provided. pre-stimulus phase | auditory processing | EEG | FM Thus far, all evidence for neural entrainment and its electro- physiological consequences has been gathered during stimulation fi eural oscillations are associated with rhythmic fluctuations with periodic sound onsets, by de nition coupled to amplitude Nin the excitation–inhibition cycle of local neuronal pop- changes that result in an evoked response. As such, the current ulations (1–3). That is, a single neuron is not equally likely to study made use of simple nonspeech auditory stimuli, where pe- discharge in response to stimulation at all points in time. Instead, riodicity was communicated by frequency modulation (FM) rather fl its likelihood of responding is influenced by local extracellular than amplitude uctuations (i.e., onsets). Although neural oscil- – and membrane potentials that, in turn, are reflected in neural lations have been shown to be entrained by FM (27 29), the pos- sibility that slow FM provides a pacing signal for phase alignment oscillations. Critically, these neural oscillations can be entrained fi by external rhythmic sensory stimulation (3, 4) or, less naturally, has not received much scienti c (29). by rhythmic neural stimulation using transcranial magnetic Listeners detected brief auditory targets (silent gaps) embed- ded in the ongoing stimulus (Fig. 1). Thus, targets were not “ex- stimulation (TMS) or transcranial alternating current stimulation ” (TACS; refs. 5 and 6). As a result, neurons are more likely to fire traneous events, but were wholly contained within the rhythmic at temporally expected points in time. Restated, the time of peak carrier stimulus, in much the same way as a phonemic cue in neural sensitivity predicts the point in time at which an upcoming speech is not an additional, independent event, but is contained within the rhythmic speech signal. Gap locations were distributed stimulus will occur within a framework of continued temporal uniformly around the 3-Hz FM cycle of the stimulus. We reasoned regularity. Such a rhythmic neural processing mode is adaptive that if ongoing delta-band oscillations were entrained by the 3-Hz given the abundance of behaviorally relevant environmental au- modulation, these gaps would also be uniformly distributed with ditory stimuli that are inherently rhythmic in nature and, thus, respect to the phase angle of the brain oscillation. We were thus could provide a pacing signal for neural oscillations across a range able to examine modulation profiles of behavior and evoked of frequency bands. Entrainment of low-frequency oscillations involves a re- organization of phase so that the optimal, in this case most excit- Author contributions: M.J.H. and J.O. designed research; M.J.H. performed research; M.J.H. able, phase comes into line with temporally expected critical events and J.O. contributed new reagents/analytic tools; M.J.H. and J.O. analyzed data; and M.J.H. in the ongoing stimulus, for example tone onsets or syllable onsets and J.O. wrote the paper. (4, 7, 8). Recordings from macaque auditory cortex (3, 9, 10), The authors declare no conflict of interest. human auditory cortex (11), and human This article is a PNAS Direct Submission. – fi (EEG) recordings (12 14) con rm that delta phase reorganizes in 1To whom correspondence may be addressed. E-mail: [email protected] or obleser@cbs. response to rhythmic auditory stimulation and that optimal phase mpg.de. fi aligns to expected times of event onsets. The results are ampli ed This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10. event-related potentials (ERPs) and increased firing in response to 1073/pnas.1213390109/-/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.1213390109 PNAS Early Edition | 1of6 Downloaded by guest on September 28, 2021 A A Characteristics of driving stimulus Frequency Modulation Entrained Ongoing Delta Oscillations. Fig. 2 B stimulus onset shows amplitude spectra estimated from Fourier analyses for individual listeners (gray) and averaged over listeners (red), av- eraged over all electrodes. If ongoing neural oscillations were Stimulus Amplitude phase reset entrained by the 3-Hz spectral modulation, increased amplitude should be observed for the stimulation frequency, 3 Hz (and the

Brain harmonic, 6 Hz, as has been reported in response to FM stimuli; Frequency Time Time ref. 30). Therefore, two paired-samples t tests were conducted to Gap locations Predicted data π/2 Excitation test the amplitude at the two target frequencies against the av- erage amplitude of the adjacent frequency bins (eight bins on π 0 either side of the target frequency, as recommended in ref. 30). fi 3π/2 Inhibition Gap detection This analysis indicated signi cant amplitude peaks at both 3 Hz, 3-Hz stimulus phase t(11) = 3.32, P = 0.008, and 6 Hz, t(11) = 4.30, P = 0.001. As a main indicator of neural entrainment, intertrial phase co- Fig. 1. (A) FM stimuli for the EEG experiment. Periodicity was conveyed without herence (ITPC; ref. 31) was calculated, assessing the consistency of fluctuations in amplitude (Top), but instead only by fluctuations in frequency (Middle). Listeners detected short silent gaps (Bottom Left) that were distributed the brain response over trials and, therefore, in response to the uniformly around the 3-Hz FM cycle (Bottom Right); 2, 3, or 4 gaps were present in stimulus (Fig. 2B, averaged over listeners and electrodes). The each 10-s stimulus. (B) Predicted neural and behavioral effects of entrainment significance of peristimulus ITPC was tested against baseline ITPC to the FM stimulus. At stimulus onset (Top), the phase of the ongoing delta (estimated for each channel in each frequency bin and averaged oscillation was predicted to reset to bring the neuronal oscillation into line with over the time period from –1sto–0.5 s before stimulus onset) by the driving stimulus (Middle), thereby modulating the excitation–inhibition cycle using a permutation t test (repeated measures) with a cluster-based of the delta oscillation (Bottom Left). For this reason, gap detection hit rates multiple-comparisons correction (32). A single significant cluster were expected to be modulated by stimulus phase (Bottom Right). was observed for ITPC in the 3-Hz frequency band (Fig. 2B). The topography for the significant 3-Hz cluster is in full accordance – responses by instantaneous oscillatory delta phase at the time of with auditory generators (33 35). gap occurrence. We predicted that listening behavior (hit rates, Critically, the FM phase of the stimulus at onset was randomized RTs) and ERPs would be influenced by the phase of the stimulus from trial to trial, and the reported evidence for neural entrainment in which the target occurred and, therefore, by the phase of the at the stimulus frequency was only observable when the neural signals were realigned such that per-trial stimulus phases were entrained delta oscillation. consistent (SI Results and Fig. S1). Thus, phase reorganization was Results not simply evoked by the stimulus onset but was maintained by exact Listeners detected silent gaps in 10-s complex tones that were stimulus phase. In the next section, the perceptual and neural consequences of this entrainment for target detection are reported. frequency modulated at 3 Hz; a target was considered to be a “hit” if a button press occurred within 1 s of gap onset. Gaps were dis- Neural Entrainment Modulated Behavioral and Electrophysiological tributed uniformly around the 3-Hz FM cycle (in 20 possible Responses to Gaps. Behavioral effects of stimulus phase. Fig. 3 shows positions); each 10-s stimulus contained two, three, or four gaps. smoothed proportions of detected gaps (hit rate) for each of the A Fig. 1 shows the frequency and amplitude envelope of an example 20 FM-phase bins for individual listeners (Fig. 3A) and averaged stimulus and the 20 gap locations around the FM cycle. over listeners (Fig. 3B), superimposed on a schematic of the FM We predicted that ongoing delta oscillations would be stimulus. See Fig. S2 for corroborating RT data. entrained by the 3-Hz frequency modulation. As a conse- Hit rates were significantly correlated with stimulus phase [av- quence, we expected that gap detection, as indexed by hit rate, erage (root mean square; rms) ρ = 0.79, t(11) = 8.92, P < 0.001; see would not be uniformly distributed as a function of stimulus SI Results for details of data analysis]. This finding is simple be- phase, but would instead be modulated by stimulus phase (Fig. havioral evidence that human listening performance was, at least 1B). Moreover, stimulus phase effects on performance were indirectly, modulated by neural oscillations, under the assumption expected to be explainable by way of the phase of the in- confirmed above that neural oscillations were entrained by spectral tervening neural delta (3-Hz) oscillation. We anticipated that, stimulus fluctuations. However, it was important to rule out across listeners, optimal stimulus phase would likely be in- a trivial acoustic explanation by assessing the consistency of the consistent because of individual stimulus–brain phase lags, but phase relationship of hit rate to the stimulus across listeners. An we expected to observe a consistent optimal delta brain phase. acoustic explanation for the current results would suggest that gaps Finally, we also predicted that stimulus phase, by way of delta would have been inherently easier to detect when they occurred in, brain phase, would modulate the ERP elicited by the gap. for example, the peak of the 3-Hz FM stimulus. We tested this

AB.05 ITPC .12 Fig. 2. (A) Amplitude spectrum from fast Fourier transform (FFT) of time-domain EEG signal. Ampli- 0.7 10 tude in the 3-Hz and 6-Hz frequency bins was sig- 0.6 9 nificantly larger than in the neighboring bins. Red 8 solid line indicates the group average spectrum, 0.5 7 gray lines show single participants’ spectra, aver- 0.4 6 aged over all electrodes. (B) ITPC shown over time (Left), and averaged over time (Right), again aver- 0.3 5 4 aged over all electrodes. The red bar indicates the 0.2 Amplitude (μ V) frequency region in which phase coherence was

Frequency (Hz) Frequency 3 significantly greater than baseline. Inset shows the 0.1 2 topography for the significant frequency region, 0 1 0 2 46 8 10 12 14 16 18 20 19243 5 68 7 .06 .10 averaged over time; the color scale is the same as Frequency (Hz) Time (s) ITPC for the time-frequency representation of ITPC.

2of6 | www.pnas.org/cgi/doi/10.1073/pnas.1213390109 Henry and Obleser Downloaded by guest on September 28, 2021 Single-subject hit rates Grand average hit rate Event-related potentials as a function of neural delta phase. Fig. 4D A B “ ” .46 .66 .28 .72 +2 shows an ERP image (i.e., all single trials from all listeners), sorted according to per-trial neural delta phase at gap onset. Similar .37 .58 .17 .64 to the results for hit rate, ERPs were systematically related to neural .37 .58 .81 .29 delta phase across listeners, but not to stimulus phase (Fig. S3). To assess phase effects on ERP components, amplitudes were extrac- .24 .51 .69 .23 Hit Rate Hit – z(Hit Rate) ted from the canonical time windows of the N1 (0.05 0.15 s after .49 .46 .32 .27 gap onset) and P2 (0.15–0.25 s). We were especially interested in

.32 .37 .23 .20 -2 N1 effects, because they most faithfully represent early perceptual 0 π2 0 2π processing of simple, to-be-detected auditory targets (36). 3-Hz stimulus phase (radians) 3-Hz stimulus phase (radians) Phase effects were assessed in the same way as for hit rate Fig. 3. (A) Hit rates (red) as a function of 3-Hz stimulus phase (black) for data. For the N1 (Fig. 4E), the mean correlation with brain phase each individual listener. Two cycles of both stimulus and data have been was extremely strong [rms ρ = 0.99, t(11) = 212.20, P < 0.001]. concatenated for illustration purposes. Circular-linear correlations between The correlation between N1 and stimulus phase (Fig. S2E) also fi < stimulus phase and hit rates were signi cant across listeners (P 0.001). (B) reached significance [rms ρ = 0.70, t = 4.57, P < 0.001], but N1 Grand average of the z-transformed individual data in A. Across listeners, (11) amplitude was more strongly correlated with delta phase than there was no consistent relation between gap detection (grand average hit t = P < rate, red) and the stimulus phase (black), ruling out acoustic explanations for stimulus phase [ (11) 6.53, 0.001]. Similar results were the observed effect. obtained for the P2: Amplitude was significantly correlated with both brain phase [rms ρ = 0.98, t(11) = 32.17, P < 0.001; Fig. 4I] and stimulus phase [rms ρ = 0.88, t(11) = 21.19, P < 0.001; Fig. hypothesis formally by fitting a single-cycle cosine function to each S2I], but more strongly with brain phase than with stimulus phase ’ SI Results listener s smoothed data ( ); from this function, we esti- [t(11) = 6.13, P < 0.001]. mated the stimulus phase angle corresponding to peak perfor- Optimal phase for N1 and P2 amplitude per listener was then mance. The distribution of individual “best” stimulus phases was estimated from the best-fit cosine function as a function of phase. not significantly different from a uniform distribution (Rayleigh z = Separate Rayleigh tests indicated that, for both N1 and P2 ampli- 1.31, P = 0.29). Thus, stimulus acoustics alone are insufficient to tudes, peak brain phase was consistent across listeners (N1: z = explain the current results. This observation suggests that listeners 11.96, P < 0.001; P2: z = 11.62, P < 0.001). However, peak stimulus differ in the phase lag with which their behavior related to the phase was not (N1: z = 1.83, P = 0.16; P2: z = 0.26, P = 0.78). NEUROSCIENCE entraining stimulus. Next, we demonstrate that the source of in- Not surprisingly, evoked responses also depended on whether terindividual variability in this lag is the intervening neural oscillation. the gap that elicited them would be detected. A full analysis of Behavioral effects of neural delta phase. Fig. 4 shows smoothed hit ERPs to hits and misses is reported in SI Results (and see Figs. S4 rates from all trials of all listeners (Fig. 4C), sorted by 3-Hz delta and S5). Overall, although ERPs were stronger to hits than to phase (Fig. 4A) at the time of gap occurrence (0 ms); phase misses, the delta phase in which the gap was presented was estimates were taken from Cz based on the central topography of a stronger predictor of the evoked signature than the behavioral fi the signi cant 3-Hz ITPC cluster. Neural delta phase was sig- response to the gap [N1: t(11) = 8.35, P2: t(11) = 10.08, P < 0.001]. nificantly correlated with hit rate [rms ρ = 0.76, t(11) = 8.25, P < 0.001; for full analysis details, see SI Results). Hit rates were Individual Differences in Optimal Stimulus Phase Are Predictable from related to stimulus phase and to neural delta phase with similar Intervening Brain Phase. The relation of individual behavior to the strength (t(11) = –0.49, P = 0.63). Thus, within listeners, both stimulus (i.e., the stimulus–behavior relation) was inconsistent stimulus phase and delta phase correlated significantly (and across listeners, but the relation of individual behavior to the brain similarly) with gap detection performance. (the brain–behavior relation) was consistent. We suggest that the One important conjecture was that neural delta phase, but not critical variable is the intervening neural oscillation, namely the stimulus phase, would predict hit rates across listeners. To test for phase lag of the brain with respect to the stimulus (the stimulus– consistency of phase effects across listeners, cosine functions were brain relation). Therefore, we expected that we could predict the fitted to the smoothed, binned data and used to calculate the phase stimulus–behavior relation if we knew something about individual angle corresponding to peak performance. Estimates of optimal stimulus–brain relations (Fig. 5). phase angle indicated significant clustering of optimal neural phase This analysis required combining and comparing three lag values across listeners (Rayleigh z = 3.72, P = 0.02; Fig. 4G). estimated for each listener. First, the lag of each individual’s delta

ABCDEF cos(δ-phase) δ-phase (rad) FM-phase (rad) Hit rate (PC) ERPs to gaps N1 (μV) P2 (μV) 5700 Fig. 4. Single trials were sorted according to the instantaneous phase of the neural delta oscillation at electrode Cz (A) at the time of gap occurrence (from –π to π). Sorting single-trial stimulus phase (B) by applying the same permutation vector revealed Trial that stimulus phase was not consistently related to neural delta phase across listeners. However, hit rate (C) and ERPs (D–F) were. Hit rate (C), N1 am- plitude (E), and P2 amplitude (F) were significantly 1 correlated with neural delta phase (P < 0.001). .3 .55 -.5 0.5-6 6 -4 4 Peri-stimulus time (s) Moreover, optimal neural delta phase for hit rate GHI(G), N1 amplitude (H), and P2 amplitude (I)was < fi -10 Amplitude (μV) 10 consistent across listeners (P 0.02). Note: this g- 000ure shows all trials for all listeners (fixed-effects), whereas all statistics reported took into account the between-subjects variance (random effects).

Henry and Obleser PNAS Early Edition | 3of6 Downloaded by guest on September 28, 2021 C C B vigorous neural response to tone onsets (3, 14) and modulation of A B RTs to suprathreshold stimuli (14). A recent study demonstrated A – Behavior modulation of miss rates by low frequency (2 6 Hz) oscillatory phase for near-threshold auditory targets embedded in an ongoing Stimulus aperiodic stimulus (19). However, the present data demonstrate that phase reorganization in response to a periodic auditory stimulus influences perceptual Delta processing of near-threshold target stimuli in a periodic fashion. – Moreover, although phase locking to FM stimuli has been observed Fig. 5. A schematic depicting the reconstruction of individual stimulus – behavior lags (C) from stimulus–brain lags (A) and brain–behavior lags (B). (27 29), this study shows optimal-phase effects due to phase locking For each listener, stimulus–brain lags were estimated from a cross-correla- to an auditory stimulus modulated only spectrally (i.e., without any tion analysis, whereas stimulus–behavior and brain–behavior relations were rhythmic amplitude fluctuations). Thus, the current results are sug- taken from analyses estimating optimal phase from hit rate data. Then, gestive that spectral fluctuations in natural auditory stimuli may act as brain–behavior lags and stimulus–brain lags in radians were summed, and a pacing signal for low-frequency oscillations and may, in turn, in- this sum was correlated with stimulus–behavior lags (P = 0.03). This re- fluence auditory perception. lationship confirmed that variability in stimulus–behavior relations is well explained by the intervening brain oscillation phase lag. Entrainment in the Absence of Envelope Information. Much em- phasis is placed on amplitude fluctuations as the time-varying oscillation with respect to the stimulus was estimated from a cross- signal by which ongoing neural oscillations might be entrained correlation calculated between the trial-average time-domain 3-Hz (22, 23). This strong emphasis is epitomized by a recently pro- brain oscillation (bandpass-filtered between 2 Hz and 4 Hz) and posed model in which speech comprehension is formalized in the stimulus (36); we note here that a cross-correlation was chosen terms of the goodness of entrainment of ongoing theta-band oscillations to the amplitude (syllable) envelope of speech (38). because here we aimed for an estimate of the lag of the brain with In this model, the active adjustment of the stimulus–brain re- respect to the stimulus, rather than the consistency of the neural lation is proposed to be supported by a phase-reset mechanism, response. The lag corresponding to the peak correlation (con- which critically relies on acoustic onsets to evoke the reset re- verted to radians) was taken as the stimulus–brain relation sponse (ref. 20 for further discussion of this point). For this (denoted “A” in Fig. 5). Second, the stimulus–behavior relation reason, envelope cues (at the syllabic rate) are proposed to be of was taken as the estimated lag parameter from the cosine fits to hit utmost importance for entrainment of ongoing brain oscillations rates as a function of stimulus phase (“C” in Fig. 5). Third, the – to an acoustic signal. brain behavior relation was similarly estimated as the lag of co- Here, we chose to make use of a FM stimulus for three reasons. sine-fitted hit rates as a function of delta phase (“B” in Fig. 5). – First, although it is commonly acknowledged that speech contains We summed the phase lags for the stimulus brain (A) and slow fluctuations in the frequency domain, the possibility that these brain–behavior (B) relations. We assumed that the combination fl – uctuations might provide a pacing signal for phase reorganization of these two phase lags should predict the stimulus behavior lag of low-frequency oscillations has not received much scientific at- (C). That is, we assumed we could account for individual vari- – tention. This study demonstrates behavioral consequences of delta ability in the stimulus behavior lag by taking into account the phase reorganization in response to periodic spectral changes. It is intervening brain signal. Accordingly, we correlated the summed noteworthy in this regard that we did not see evidence for phase – – + stimulus brain and brain behavior lags (A B) with the stimu- reorganization when we did not realign trials with respect to – – lus behavior lag (C) by using a circular circular correlation and stimulus phase. – found that the stimulus behavior lag (C) was predictable from Second, the constant amplitude over the cycle of the FM allowed the intervening brain relation between the stimulus and behavior us to uniformly distribute targets around the stimulus phase, and ρ = P = [ (22) 0.46, 0.033, one-tailed]. thereby construct a modulation profile for hit rates and ERPs as a function of entrained delta phase. Because the phase of the Discussion entrained neural delta oscillation was predictable from the phase The current study demonstrated that low-frequency (3-Hz) delta of the FM stimulus, we were capable of targeting specific neural fl oscillations are entrained by spectral uctuations in nonspeech delta phases to assess modulatory effects on perception. This auditory stimuli. In turn, instantaneous phase of both the stimulus technique is similar to, but less obtrusive than, paradigms that drive and the entrained neural oscillation determined optimal listening brain oscillations with TACS or TMS (5, 6) and present targets at behavior (here: gap detection performance). Moreover, entrained known phases of the entrained oscillation. delta oscillations shaped the auditory potential evoked by the gap Third, an advantage of using an entraining stimulus without (ERP), and delta phase was more strongly predictive of ERP amplitude fluctuations (i.e., onsets) is the avoidance of complica- amplitude than whether the gap was detected. We interpret the tions that come with attempting to measure “proper entrainment” current results as reflecting low-frequency oscillation of a neural separately from periodic evoked responses to rhythmically pre- excitation–inhibition cycle, which governs listening performance sented stimuli. It has been argued that “entrainment” may reflect in a difficult (near-threshold) auditory perception task. no more than rhythmic evoked responses (39). When the temporal Critically, optimal stimulus phase varied across listeners, ruling context is defined based on discrete events (e.g., tones), these two out a trivial acoustic explanation of the current results. Instead, possibilities are not separable. This confound arises because in- optimal neural delta phase was consistent across listeners. More- dividual ERPs occur in response to event onsets at precisely the over, we were able to explain individual differences in stimulus stimulation frequency, leading to increased spectral power and phase effects on behavior by taking into account variation across phase locking at the stimulation frequency in the (time–)frequency listeners in the relation of the entrained delta oscillation to the domain. In the present study, rhythmic information was commu- driving stimulus. nicated by frequency modulation. Thus, we argue that increased Perceptual benefits (enhanced detection; faster RTs) have been 3-Hz power and phase coherence did not result from periodic evoked shown for near-threshold visual stimuli occurring in the optimal responses but rather reflect entrainment of neural oscillations. phase of theta or alpha oscillations (16, 17, 37). Moreover, in the At the level of an individual frequency-tuned neuronal pop- auditory domain, delta phase reorganization has been demon- ulation, frequency modulation constitutes a local “onset” and, thus, strated in response to rhythmic tone sequences, resulting in a more an evoked response, which follows from the tonotopic organization

4of6 | www.pnas.org/cgi/doi/10.1073/pnas.1213390109 Henry and Obleser Downloaded by guest on September 28, 2021 of the auditory system, beginning with cochlear output. An auditory input coinciding with stable viewing is enhanced, whereas visual input stimulus with continuously varying frequency serially stimulates during a saccade is suppressed, consistent with rhythmic attending. neighboring cortical locations (40, 41); therefore, a single fre- Moreover, in nonhuman animals, respiration tied to sniffing (olfac- quency-tuned neuronal population would respond periodically to tion) and whisking behavior (somatosensation) are both rhythmic, an FM stimulus, with increased firing coinciding with the time with periods of excitation interleaved with periods of inhibition. point at which the frequency trajectory intersects the characteristic This proposal fits well with a theory of rhythmic attending de- frequency of that cortical location. From this perspective, it could veloped by Jones and colleagues (44–47), referred to as dynamic be argued that on a very local spatial scale, an FM stimulus is not so attending theory. In brief, dynamic attending theory proposes that different from a periodic tone sequence, in terms of recurrent attention waxes and wanes rhythmically and that events coinciding onsets and offsets. with attentional “peaks” are privy to more thorough processing. However, our behavioral results would be very difficult to explain Moreover, attentional rhythms are entrained by environmental based on local excitation by frequency modulation. Because the rhythms. Thus, better processing is afforded to stimuli occurring pattern of evoked excitations is continuous—that is, based on con- at points in time that are expected based on the rhythmic context tinuous stimulation—an optimal stimulus phase for gap detection in which they are presented. Indeed, perception of onset time would imply a privileged set of frequency-tuned neurons that respond (47), pitch perception (48), and gap detection (49) are better for best to near-threshold auditory targets. Moreover, because optimal temporally expected relative to unexpected stimuli. Although the stimulus phase varied across listeners, the privileged neuronal pop- proposed mechanism underlying these perceptual benefits is os- ulation would differ between listeners in terms of frequency tuning. cillation of an “attentional pulse” as opposed to a neural oscil- Overall, we suggest that a distinction can be made between local lation, these ideas lay important groundwork for the bourgeoning excitation evoked by transient stimulation of a frequency-tuned interest in perceptual consequences of neural entrainment by neuronal population (FM) and global excitation evoked by sound environmental rhythms. onsets and associated with a perceptual onset (AM). However, fu- ture research on FM-induced entrainment and phase reorganization Conclusions will profit by using intracranial recording techniques in human or In conclusion, this study demonstrates that successful detection animal subjects. of (and response times and ERPs to) difficult-to-detect auditory stimuli depends on the instantaneous phase of neural delta oscil- Interpreting Optimal Phase. In the current study, we found that lations that are entrained by a spectrally modulated stimulus. This

detection performance and N1 amplitudes peaked for gaps that observation constitutes an important generalization of the neural NEUROSCIENCE were presented in the rising phase, near the peak, of the 3-Hz delta entrainment hypothesis to rhythmic stimuli in the absence of salient oscillation (Fig. 4). These results are perhaps unexpected based on stimulus onsets and offsets. We suggest that low-frequency (i.e., the idea that local neuronal populations should be most excitable delta-range) fluctuations of natural auditory stimuli in the frequency when the slow oscillation indexing local potentials is at its most domain, for example prosodic or melodic contour information, negative point (i.e., in the trough of the oscillation), as has been act as a pacing signal by which slow neural delta oscillations are demonstrated recording directly from macaque auditory cortex entrained, thereby optimizing human listening behavior. (A1; 3, 4, 9). In contrast, the current study relies on EEG, in which each single electrode measures summed electric potential fluctu- Experimental Procedures ations through the skull. Thus, the EEG signal is comparably im- Participants. Twelve normal-hearing, native German speakers (age 21–32, pure. Moreover, the absolute phase value of an EEG signal 6 female) took part in the study. Data from another four participants were depends on the choice of reference (42). Thus, we exercise some collected but were discarded because of a high degree of noise in the EEG caution when interpreting absolute values of optimal phase. recording (n = 3) or inability to perform the gap detection task (n = 1). fi V With this cautionary note in mind, one of the most interesting Participants received nancial compensation of 15 . Written informed fi consent was obtained from all participants. The procedure was approved by ndings emerging from the current study was based on relative the ethics committee of the medical faculty of the University of Leipzig and phase. That is, the stimulus phase in which individual listeners in accordance with the declaration of Helsinki. performed best was variable (Fig. 3). However, we were able to – predict peak stimulus phase (i.e., the stimulus behavior lag) from Stimuli. Auditory stimuli were generated by MATLAB software at a sampling a simple combination of the stimulus–brain lag and the brain–be- rate of 60,000 Hz. Stimuli were 10-s complex tones frequency-modulated at havior lag. This result confirms our hypothesis that the entrained a rate of 3 Hz and a depth of 37.5% (Fig. 1). Complex carrier signals were delta oscillation was the intervening variable between the stimulus centered on one of three frequencies (800, 1,000, 1,200 Hz) and comprised and the pattern of behavioral results observed for each listener. of 30 components sampled from a uniform distribution with a 500-Hz range. The amplitude of each component was scaled linearly based on its inverse Neural Oscillations and Temporal Context. The role of the rhythmic distance from the center frequency; that is, the center frequency itself was context in which an event is situated recently has gained much the highest-amplitude component, and component amplitudes decreased with increasing distance from the center frequency. The onset phase of the attention, and a theoretical proposal regarding rhythmic attention stimulus was randomized from trial to trial, taking on one of eight values (0, captures well this zeitgeist (8). The critical idea behind this pro- π/4, π/2, 3π/4, π,5π/4, 3π/2, 7π/4). All stimuli were rms amplitude normalized. posal is that attention can behave in either a “rhythmic” or “ ” Two, three, or four silent gaps were inserted into each 10-s stimulus (gap a continuous mode, with the former being the default mode of onset and offset were gated with 3-ms half-cosine ramps) without changing the operation. Under the rhythmic processing scheme, neuronal sen- duration of the stimulus. Each gap was chosen to be centered in 1 of 20 equally sitivity fluctuates and is heightened at times when stimuli are most spaced phase bins into which each single cycle of the frequency modulation was likely to occur. The rhythmic mode can be contrasted with an divided. Gaps were placed in the final 9 s of the stimulus, with the constraint energetically more expensive, constant state of high neuronal that two gaps could not occur within 667 ms (i.e., 2 cycles) of each other. sensitivity that subserves continuous monitoring for a temporally unpredictable stimulus. Because of the imbalance in energy Procedure. Gap duration was first titrated for each individual listener; in- demands, it is suggested that the rhythmic mode is preferred. The dividual thresholds ranged between 10 ms and 18 ms. For the main exper- fi iment, EEG was recorded while listeners detected gaps embedded in 10-s- preference for a rhythmic processing mode ts well with the ob- long FM stimuli. Each stimulus contained 2, 3, or 4 gaps, and listeners servation that many natural auditory signals are, to varying extents, responded with a button-press when they detected a gap. Listeners were rhythmic. Moreover, it can be noted that many sensory behaviors instructed to respond as quickly as possible when they detected a gap, and actively push perception into a rhythmic mode (43). For example, a response was considered to be a “hit” when it occurred within 1 s of a gap. in vision, saccades occur rhythmically with a rate near 3 Hz; visual Overall, each listener heard 70 stimuli per carrier frequency, for a total of

Henry and Obleser PNAS Early Edition | 5of6 Downloaded by guest on September 28, 2021 210 trials. For each of the 20 FM-phase bins, 30 gaps were presented, 10 per artifact rejection, full stimulus epochs were analyzed in the frequency and carrier frequency, for a total of 600 targets. On average, the experiment time-frequency domains to examine oscillatory brain responses entrained by lasted between 90 and 120 min including preparation of the EEG. See SI the 3-Hz stimulation. Specifically, we examined amplitude spectra and in- Experimental Procedures for additional procedural information. tertrial phase coherence (ITPC). For target epochs, ERPs were examined in the time domain, and power, Data Acquisition and Analysis. Behavioral data. Behavioral data were recorded ITPC, and a bifurcation index (37) were analyzed in the time-frequency do- online by Presentation software (Neurobehavioral Systems). Hits were de- main. For each of these dependent measures, responses to detected (hits) fined as button-press responses that occurred no more than 1 s after the and undetected (misses) targets were compared directly. Moreover, single- occurrence of a gap. Hit rates and RTs were calculated separately for each of trial complex output from the Wavelet convolution was also used to esti- the 20 FM-phase bins. See SI Experimental Procedures for additional details. mate the phase angle in the 3-Hz band at the time of gap occurrence, then Electroencephalogram data. The EEG was recorded from 64 Ag–AgCl electrodes single-trial ERPs were sorted with respect to the delta phase angle to char- mounted on a custom-made cap (Electro-Cap International), according to acterize delta phase effects. For hit rates, RTs, N1 amplitudes, and P2 the modified and expanded 10–20 system. Signals were recorded continu- amplitudes, the optimal delta or stimulus phase angle was estimated based ously with a passband of DC to 200 Hz and digitized at a sampling rate of on a single-cycle cosine fit to the data. For additional data analysis in- 500 Hz. The reference electrode was the left mastoid. Bipolar horizontal and formation, please see SI Experimental Procedures. vertical EOGs were recorded for artifact rejection purposes. Electrode re- Ω sistance was kept under 5 k . ACKNOWLEDGMENTS. We thank Heike Boethel for help with the data All EEG data were analyzed offline by using Fieldtrip software (www.ru.nl/ acquisition. This work was supported by the Max Planck Society, Germany, fcdonders/fieldtrip; ref. 50), and custom Matlab (Mathworks) scripts. After through a Max Planck Research Group grant (to J.O.).

1. Bishop GH (1933) Cyclic changes in the excitability of the optic pathway of the rabbit. 27. Luo H, Wang Y, Poeppel D, Simon JZ (2006) Concurrent encoding of frequency and Am J Physiol 103:213–224. amplitude modulation in human auditory cortex: MEG evidence. J Neurophysiol 2. Buzsáki G, Draguhn A (2004) Neuronal oscillations in cortical networks. Science 304 96(5):2712–2723. (5679):1926–1929. 28. Millman RE, Prendergast G, Kitterick PT, Woods WP, Green GGR (2010) Spatiotem- 3. Lakatos P, et al. (2005) An oscillatory hierarchy controlling neuronal excitability and poral reconstruction of the auditory steady-state response to frequency modulation stimulus processing in the auditory cortex. J Neurophysiol 94(3):1904–1911. using . Neuroimage 49(1):745–758. 4. Lakatos P, Karmos G, Mehta AD, Ulbert I, Schroeder CE (2008) Entrainment of neu- 29. Obleser J, Herrmann B, Henry MJ (2012) Neural oscillations in speech: Don’t be en- ronal oscillations as a mechanism of attentional selection. Science 320(5872):110–113. slaved by the envelope. Front Human Neurosci 6:250. 5. Thut G, et al. (2011) Rhythmic TMS causes local entrainment of natural oscillatory 30. Picton TW, John MS, Dimitrijevic A, Purcell D (2003) Human auditory steady-state signatures. Curr Biol 21(14):1176–1185. responses. Int J Audiol 42(4):177–219. 6. Zaehle T, Rach S, Herrmann CS (2010) Transcranial alternating current stimulation 31. Lachaux J-P, Rodriguez E, Martinerie J, Varela FJ (1999) Measuring phase synchrony in enhances individual alpha activity in human EEG. PLoS ONE 5(11):e13766. brain signals. Hum Brain Mapp 8(4):194–208. 7. Mathewson KE, Gratton G, Fabiani M, Beck DM, Ro T (2009) To see or not to see: 32. Maris E, Oostenveld R (2007) Nonparametric statistical testing of EEG- and MEG-data. – Prestimulus alpha phase predicts visual awareness. J Neurosci 29(9):2725–2732. J Neurosci Methods 164(1):177 190. 8. Schroeder CE, Lakatos P (2009) Low-frequency neuronal oscillations as instruments of 33. Budd TW, Barry RJ, Gordon E, Rennie C, Michie PT (1998) Decrement of the N1 au- sensory selection. Trends Neurosci 32(1):9–18. ditory event-related potential with stimulus repetition: Habituation vs. refractoriness. – 9. Lakatos P, Chen C-M, O’Connell MN, Mills A, Schroeder CE (2007) Neuronal oscillations Int J Psychophysiol 31(1):51 68. ’ and multisensory interactions in primary auditory cortex. Neuron 53(2):272–292. 34. Lenz D, Schadow J, Thaerig S, Busch NA, Herrmann CS (2007) What s that sound? 10. Lakatos P, et al. (2009) The leading sense: Supramodal control of neurophysiological Matches with auditory long-term induce gamma activity in human EEG. Int J – context by attention. Neuron 64(3):419–430. Psychophysiol 64(1):31 38. 11. Besle J, et al. (2011) Tuning of the human neocortex to the temporal dynamics of 35. Oades RD, Zerbin D, Dittmann-Balcar A (1995) The topography of event-related po- attended events. J Neurosci 31(9):3176–3185. tentials in passive and active conditions of a 3-tone auditory oddball test. Int J Neu- rosci 81(3-4):249–264. 12. Bas¸ar E, Bas¸ar-Eroglu C, Rosen B, Schütt A (1984) A new approach to endogenous 36. Skoe E, Kraus N (2010) Auditory brain stem response to complex sounds: A tutorial. event-related potentials in man: Relation between EEG and P300-wave. Int J Neurosci Ear Hear 31(3):302–324. 24(1):1–21. 37. Busch NA, Dubois J, VanRullen R (2009) The phase of ongoing EEG oscillations predicts 13. Bas¸ar E, Stampfer HG (1985) Important associations among EEG-dynamics, event-re- visual perception. J Neurosci 29(24):7869–7876. lated potentials, short-term memory and learning. Int J Neurosci 26(3-4):161–180. 38. Giraud A-L, Poeppel D (2012) Cortical oscillations and speech processing: Emerging 14. Stefanics G, et al. (2010) Phase entrainment of human delta oscillations can mediate computational principles and operations. Nat Neurosci 15(4):511–517. the effects of expectation on reaction speed. J Neurosci 30(41):13578–13585. 39. Capilla A, Pazo-Alvarez P, Darriba A, Campo P, Gross J (2011) Steady-state visual 15. Haig AR, Gordon E (1998) EEG alpha phase at stimulus onset significantly affects the evoked potentials can be explained by temporal superposition of transient event- amplitude of the P3 ERP component. Int J Neurosci 93(1-2):101–115. related responses. PLoS ONE 6(1):e14543. 16. Busch NA, VanRullen R (2010) Spontaneous EEG oscillations reveal periodic sampling 40. Horikawa J, Nasu M, Taniguchi I (1998) Optical recording of responses to frequency- of visual attention. Proc Natl Acad Sci USA 107(37):16048–16053. modulated sounds in the auditory cortex. Neuroreport 9(5):799–802. 17. Drewes J, VanRullen R (2011) This is the rhythm of your eyes: The phase of ongoing 41. Weisz N, Wienbruch C, Hoffmeister S, Elbert T (2004) Tonotopic organization of the electroencephalogram oscillations modulates saccadic reaction time. J Neurosci human auditory cortex probed with frequency-modulated tones. Hear Res 191(1-2): – 31(12):4698 4708. 49–58. 18. VanRullen R, Busch NA, Drewes J, Dubno JR (2011) Ongoing EEG phase as a trial-by- 42. Nuñez PL (2005) Electric Fields in the Brain: The Neurophysics of EEG (Oxford Univ – trial predictor of perceptual and attentional variability. Frontiers in Psychology 2:1 9. Press, Oxford). 19. Ng BSW, Schroeder T, Kayser C (2012) A precluding but not ensuring role of entrained 43. Schroeder CE, Wilson DA, Radman T, Scharfman H, Lakatos P (2010) Dynamics of – low-frequency oscillations for auditory perception. J Neurosci 32(35):12268 12276. active sensing and perceptual selection. Curr Opin Neurobiol 20(2):172–176. 20. Schroeder CE, Lakatos P, Kajikawa Y, Partan S, Puce A (2008) Neuronal oscillations and 44. Jones MR (1976) Time, our lost dimension: Toward a new theory of perception, at- fi – visual ampli cation of speech. Trends Cogn Sci 12(3):106 113. tention, and memory. Psychol Rev 83(5):323–355. 21. Rosen S (1992) Temporal information in speech: Acoustic, auditory and linguistic as- 45. Jones MR (2004) Attention and timing. Ecological Psychoacoustics, ed Neuhoff JG – pects. Philos Trans R Soc Lond B Biol Sci 336(1278):367 373. (Elsevier, San Diego), pp 49–85. 22. Ahissar E, et al. (2001) Speech comprehension is correlated with temporal response 46. Large EW, Jones MR (1999) The dynamics of attending: How people track time- patterns recorded from auditory cortex. Proc Natl Acad Sci USA 98(23):13367–13372. varying events. Psychol Rev 106(1):119–159. 23. Luo H, Poeppel D (2007) Phase patterns of neuronal responses reliably discriminate 47. McAuley JD, Jones MR (2003) Modeling effects of rhythmic context on perceived speech in human auditory cortex. Neuron 54(6):1001–1010. duration: A comparison of interval and entrainment approaches to short-interval 24. Peelle JE, Gross J, Davis MH (May 17, 2012) Phase-locked responses to speech in hu- timing. J Exp Psychol Hum Percept Perform 29(6):1102–1125. man auditory cortex are enhanced during comprehension. Cereb Cortex, 10.1093/ 48. Jones MR, Moynihan H, MacKenzie N, Puente J (2002) Temporal aspects of stimulus- cercor/bhs118. driven attending in dynamic arrays. Psychol Sci 13(4):313–319. 25. Kayser C, Montemurro MA, Logothetis NK, Panzeri S (2009) Spike-phase coding 49. Lange K (2009) Brain correlates of early auditory processing are attenuated by ex- boosts and stabilizes information carried by spatial and temporal spike patterns. pectations for time and pitch. Brain Cogn 69(1):127–137. Neuron 61(4):597–608. 50. Oostenveld R, Fries P, Maris E, Schoffelen J-M (2011) FieldTrip: Open source software 26. Ng BSW, Logothetis NK, Kayser C (February 17, 2012) EEG phase patterns reflect the for advanced analysis of MEG, EEG, and invasive electrophysiological data. Comput selectivity of neural firing. Cereb Cortex, 10.1093/cercor/bhs031. Intell Neurosci 2011:1–9.

6of6 | www.pnas.org/cgi/doi/10.1073/pnas.1213390109 Henry and Obleser Downloaded by guest on September 28, 2021