Linköping University | Department of Computer and Information Science Bachelor Thesis, 18 ECTS | Spring term 2017 | LIU-IDA/KOGVET-G--17/019--SE

Neural Entrainment to Speech Analyzed with EEG: A Review of Contemporary Theories about the Underlying Mechanisms of Speech Processing

Author: Richard Larsson

Supervisor: Carine Signoret, senior lecturer, IBL at Linköping University
Examiner: Fredrik Stjernberg, professor, IKK at Linköping University

Copyright

The publishers will keep this document online on the Internet – or its possible replacement – for a period of 25 years starting from the date of publication barring exceptional circumstances. The online availability of the document implies permanent permission for anyone to read, to download, or to print out single copies for his/her own use and to use it unchanged for non-commercial research and educational purposes. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional upon the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility. According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement. For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its www home page: http://www.ep.liu.se/.

© Richard Larsson


Abstract

Neural entrainment has quite recently come to be considered an important mechanism used by the brain to process stimuli with periodic qualities, such as the frequency and duration of signals reaching the sensory organs. An increasing amount of data strongly implies that the brain might be using neural entrainment as a mechanism to process speech directly and/or to facilitate speech interpretation. Neural entrainment is therefore a promising marker to use in research on speech perception. This literature review aims to summarize the most recent findings within this area, with the end goal of serving as a basis for designing an EEG experiment intended to analyze speech perception as a means to distinguish human voices. For this reason, data was collected from the scientific databases Europe PMC, Academic Search Premier, PsycINFO, PubMed, Scopus and Web of Science, where the keyword “EEG” combined with one of the phrases “neural entrainment”, “neural oscillation” or “cortical oscillation” was used to gather articles. Inclusion and exclusion criteria were then applied, and the data was analyzed with the intention of answering the following research questions: “is it possible to observe neural entrainment to human voice/speech using EEG?”, “if so, what are the possibilities to use such neural entrainment as a marker for differentiating human voices from each other?” and “what is the nature of the mechanisms used by the brain to attain this entrainment?”.
The resulting data from the articles indicated the following conditions for yielding reliable results when investigating neural entrainment to speech: brain activity can be analyzed with EEG; a sample of 15–30 participants is sufficient; the spectral bands of interest are delta (<3 Hz), theta (4–8 Hz), beta (15–35 Hz) and gamma (>40 Hz); the analysis can consider both frequency and amplitude in the speech envelope; and the anatomical areas of interest for investigating the brain’s ability to distinguish human voices using speech entrainment are either areas within the auditory cortex or prefrontal areas involved in behavioral responses to speech processing.


Acknowledgement

I have attempted as best I could to complete this literature review, which is intended to elucidate the present state of the research field concerned, to a satisfactory standard. This work would have been a lot harder, not to say impossible, to accomplish without the continuous aid of, and patience with my many questions from, the following people, whom I would like to thank for their kind assistance in this endeavor. My supervisor Carine Signoret, senior lecturer at IBL, for her invaluable guidance and support during this entire project, for her extensive proofreading of my drafts, and perhaps most of all, for giving me the idea for this thesis. My examiner Fredrik Stjernberg, professor at IKK, for his proofreading of my work, as well as for his most helpful comments and suggestions on it. And last but not least, my esteemed classmates and group members of the thesis course, who all kindly offered their much appreciated and helpful suggestions: Sini Alhola, Ellinor Ihs Håkansson, Pontus Ohlsson, Elin Sjöström and Anna Tågmark, all third-year students in the bachelor program in Cognitive Science, 2017. I thank you all for your invaluable assistance!

Linköping, June 2017

Richard Larsson


Contents

Copyright
Abstract
Acknowledgement
1. Introduction
2. Theory and Background
2.1 The Sound Signal and the Auditory Pathway
2.2 Speech Perception
2.3 Neural Entrainment
2.4 The Importance of Neural Entrainment
2.5 Neural Oscillations
2.6 Current Theories of Speech Perception
2.7 Four Hypotheses for Neural Entrainment to Continuous Speech
2.7.1 The Onset Tracking Hypothesis
2.7.2 Collective Feature Tracking Hypothesis
2.7.3 Syllabic Parsing Hypothesis
2.7.4 Sensory Selection Hypothesis
2.7.5 Combining the Hypotheses
2.8 Prospects for Further Research
3. Method
3.1 Database Choice
3.2 Motivations for Keyword Choices and Final Keywords
3.3 Inclusion and Exclusion Criteria
3.3.1 Inclusion Criteria
3.3.2 Exclusion Criteria
3.4 Key Experimental Factors
3.4.1 SJR and IF Scales
3.4.2 Subsequent Factor Selection
3.4.3 Final Factor Analysis
4. Results
4.1 Article Selection Results
4.2 Research Factors
4.2.1 Participants
4.2.2 Methodology
4.2.3 Theory
5. Discussion
5.1 Findings
5.2 Suggestions for Further Research
5.3 General Discussion
5.4 Method Discussion
5.5 Theory Discussion
References
Appendix
A. Included Articles
B. Excluded Articles


1. Introduction

Neurology is one of several scientific fields in which the subject of study is so overwhelmingly complex that researchers have to resort to indirect means of analysis. In such fields, statistical models are used to help direct inquiry (Yang, 2014). In this particular case, however, the subject (the brain) has a known structure, and it is therefore possible to replicate its functions in cognitive models, artificial implants or non-invasive techniques. Such methods are at present very crude relative to the full potential of the human neocortex, and merely mimic or aid in small tasks carried out by this vastly integrative machinery. The manipulation of sensory organs is one such use, where these techniques can do much both to help us understand how the larger mechanisms of the brain work and to aid persons with deficiencies related to brain functions. With this context in mind, the purpose of this literature review is to analyze contemporary theories regarding the phenomenon known as neural entrainment. This phenomenon is interesting because it has been shown to be a key feature in how the brain makes sense of its environment, by translating waveforms into neural signals so as to form a coherent image of the surrounding world. The end goal of this investigation is to determine which method and which neural marker will be the most promising to use when designing a study intended to investigate whether neural entrainment could be used as a marker to formally/artificially distinguish different human voices. For this reason, the focus will be on analyzing the exact mechanisms used for this operation, because if these mechanisms are known, they can potentially be replicated in another medium, such as an implant or a non-invasive technique. Technology making use of such knowledge could, for example, help deaf persons to pick out a given speaker from a crowd of speakers.

The research questions of this study are the following three: 1) Is it possible to observe neural entrainment to human voice/speech using EEG? 2) If so, what are the possibilities to use such neural entrainment as a marker for differentiating human voices from each other? 3) What is the nature of the mechanisms used by the brain to attain this entrainment?

Answering these research questions requires identifying, in the gathered data, which methodologies are most promising for analyzing neurofeedback in response to neural entrainment to speech, as well as which theories are most likely to yield relevant and trustworthy results. To answer these questions, a search was conducted in scientific databases using keywords relevant to the research questions. Keywords or phrases such as “EEG” and “neural entrainment” were used to collect relevant articles, and a preliminary reading of the articles’ abstracts was conducted to apply inclusion and exclusion criteria. The present literature review was then based on the included articles.


2. Theory and Background

Many experiments in recent years have implied that neural entrainment to speech is not only a major mechanism used by the brain to process speech (as well as other stimuli), but also one that lends itself readily to analysis, due to the many periodic qualities intrinsic to this mechanism. What follows is a thorough explanation of why this is so, and the motivation for conducting a study that uses this mechanism to analyze how the brain distinguishes human voices in speech signals.

2.1 The Sound Signal and the Auditory Pathway

Sound, in physics terms, consists of longitudinal, sinusoidal waves composed of displacements of pressure. The varied displacement between high and low pressure is what gives rise to the waveform of sound, and this in turn produces its periodic qualities, such as the frequency of the sound (Young & Freedman, 2008). The sound envelope, which is often in focus in EEG studies related to soundwaves, is the “silhouette” of a continuous, oscillating signal, or in other words the outline of the signal’s extremes (Johnson, Sethares & Klein, 2011). The envelope is for this reason highly important in neurology studies, since it basically is the container of all the information within the signals emitted by the brain, which is how brain activity is represented in studies using techniques with high temporal resolution. The amplitude of the signal, which corresponds to its loudness, is the vertical extension of an oscillating curve, and the frequency is the number of repetitions, or cycles, per unit of time of the sinusoid comprising the soundwave. Furthermore, a sound envelope usually contains distinctive features, called spectral bands, consisting of ranges of similar frequencies. These are examples of what is known as the temporal fine structure (TFS) of soundwaves, which is the term for the rapid oscillations close to the center frequency and temporal envelope (Moon & Hong, 2014). In the brain, such frequency bands almost exclusively correspond to specific cortical functions, such as processing of sensory information (Zhou, Melloni, Poeppel & Ding, 2016). These bands are examples of spectral information, i.e. information found in the frequency composition of soundwaves. A periodic wave can be decomposed into its frequency constituents, the lowest of which is called its fundamental frequency.
Knowing this frequency is necessary in EEG analysis, since waveforms can consist of many superimposed layers of summed frequencies, called harmonics, which can be mistaken for the fundamental frequency. To avoid such mistakes, it is possible to represent the features of a sinusoidal wave indirectly (Zhou et al., 2016). The techniques for doing so, which are useful in EEG studies where the interpretation or manipulation of such signals is centrally important, are called amplitude modulation (AM) and frequency modulation (FM). In AM, the features of a soundwave are varied in amplitude in accordance with the original signal, and in FM, this mimicking modulation is instead varied in frequency (Henry, Herrmann & Obleser, 2014).
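
As a purely illustrative sketch of the two modulation schemes just described (it is not taken from any of the cited studies), the following Python fragment synthesizes an AM and an FM signal from the same 4 Hz modulator. The carrier frequency, modulation depth, frequency deviation and sample rate are arbitrary assumed values, and the function names are hypothetical.

```python
import math

def am_signal(t, carrier_hz, mod_hz, depth=0.5):
    """Amplitude modulation: the carrier's amplitude follows a slow modulator."""
    envelope = 1.0 + depth * math.sin(2 * math.pi * mod_hz * t)
    return envelope * math.sin(2 * math.pi * carrier_hz * t)

def fm_signal(t, carrier_hz, mod_hz, deviation_hz=50.0):
    """Frequency modulation: the carrier's instantaneous frequency follows the modulator."""
    # Integrating f(t) = carrier_hz + deviation_hz * cos(2*pi*mod_hz*t) over time
    # gives the phase used below.
    phase = (2 * math.pi * carrier_hz * t
             + (deviation_hz / mod_hz) * math.sin(2 * math.pi * mod_hz * t))
    return math.sin(phase)

sample_rate = 8000  # samples per second (assumed value)
times = [n / sample_rate for n in range(sample_rate)]  # one second of signal

am = [am_signal(t, carrier_hz=440.0, mod_hz=4.0) for t in times]
fm = [fm_signal(t, carrier_hz=440.0, mod_hz=4.0) for t in times]

# The AM signal's peak magnitude rises and falls with the modulator (up to 1 + depth),
# whereas the FM signal's magnitude never exceeds 1; only its frequency varies.
peak_am = max(abs(x) for x in am)
peak_fm = max(abs(x) for x in fm)
print(peak_am, peak_fm)
```

In both cases the slow 4 Hz rhythm is the information of interest; AM carries it in the outline of the signal, while FM carries it in how quickly the signal oscillates.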

Soundwave processing starts in the cochlea of the inner ear, where hair cells vibrating at the corresponding frequency and amplitude of the incoming soundwave trigger activity in adjacent neurons, which translate the vibratory energy into electrochemical signals and then propagate these signals into the auditory cortex (Rubel & Fritzsch, 2002). In the auditory cortex (which is located bilaterally in the superior regions of the temporal lobes), the sound signal is first discretized into low-level segments of signal information, such as frequency and pitch. With regard to speech sound, these segments are later also synthesized into higher orders of information, such as phonemes (the smallest building blocks of speech that contain linguistic meaning) or prosody (the tone of a voice, often relaying the emotional intents of the speaker) (Giraud & Poeppel, 2012).

2.2 Speech Perception

The information contained in a sound signal includes periodicity (sometimes referred to as the frequency composition of sound), which means that distinguishable features in the sound signal have certain duration periods which repeat in predictable intervals, and thus contain cues, such as the respective onsets of these features, which can be observed by neurological analysis. In the sound signals of speech, such periodic features are speech details such as phonemes or syllables. It is unclear at present how the process of sound interpretation from the information contained within sound signals is carried out by the brain. There is no one-to-one mapping between phoneme categories and the corresponding speech sound categories (phones) (Kazanina, Phillips & Idsardi, 2006), which means that speech interpretation is not as easy as simply observing the periodicity of speech syllables and then mapping them against the corresponding neural activity. Instead, recent work in speech perception has focused on more complex theories, according to which the brain uses several different methods at different stages of processing. One such mechanism is neural entrainment.

2.3 Neural Entrainment

One branch of research into speech perception, and perhaps the most promising, has for the better part of the last decade been focusing on neural entrainment (Giraud & Poeppel, 2012). Neurons may change the frequency at which they fire action potentials (their spike rate) and entrain (i.e. synchronize) their spike rate to the frequency of the input (Giraud & Poeppel, 2012). The same holds for the amplitude and periodicity of the neuronal activity relative to the incoming stimuli. The concept of ‘entrainment’ itself originates from the observations of the Dutch scientist Christiaan Huygens in his experiments on pendulum clocks, which synchronized each other’s swinging rates (Pantaleone, 2002), but its use in neurology emerged from the much more modern complex systems theory, a field which investigates how systems consisting of many parts interact to give rise to synchronizing phenomena (Wolfram, 1985), one of which is entrainment. In the early days of research on the role of neural entrainment in neurology (which is still not much more than a decade ago), the phenomenon was, as with many new concepts before they come into their own, given several names, such as “stimulus-synchronized temporal discharge patterns of cortical neurons” (Wang, Lu & Liang, 2003) and “asymmetric sampling in time (AST)” (Poeppel, 2002). When it became clear enough what role entrainment played in the neural mechanisms of speech processing, the phenomenon became commonly known as neural or cortical entrainment to speech. For, as the name suggests, neural entrainment is the neuronal ability to process incoming stimuli by synchronizing the stimuli’s periodic and oscillatory qualities in their own spike rates and spike amplitudes (Power, Mead, Barnes & Goswami, 2012). This neural activity can take place either within a single neuron in response to external stimuli, or across a population, where groups of neurons synchronize their activity with each other. One example of neural entrainment is phase entrainment, which is when one or several neurons entrain to the phase of a stimulus, that is to say, the part of a sinusoidal wave at a given point in time (Giraud & Poeppel, 2012). According to a growing body of evidence (e.g. Diamond & Zhang, 2016; Henry & Obleser, 2013; Lehongre, Morillon, Giraud & Ramus, 2013), this process of synchronizing is one of the main components in the brain’s ability to comprehend auditory stimuli, and since human speech is very rich in these periodic and oscillatory qualities, neural entrainment has, not surprisingly, become the dominating methodology of recent analyses of speech processing (e.g. Cross, Butler & Lalor, 2015; Horton, D’Zmura & Srinivasan, 2013; Kong, Somarowthu & Ding, 2015; Zoefel & VanRullen, 2015a).
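
To make the notion of phase entrainment more concrete, the sketch below computes a phase-locking value, a commonly used summary of how consistently one phase series follows another. This is an illustration under stated assumptions rather than the method of any study cited here: the phases are simulated instead of being extracted from EEG, and the 4 Hz rhythm, 250 Hz sampling rate, lag and jitter are arbitrary values.

```python
import cmath
import math
import random

def phase_locking_value(stimulus_phases, response_phases):
    """Length of the mean unit vector of the phase differences.

    Values near 1 mean the response keeps a constant phase lag to the
    stimulus; values near 0 mean there is no consistent phase relationship.
    """
    if not stimulus_phases or len(stimulus_phases) != len(response_phases):
        raise ValueError("phase series must be non-empty and of equal length")
    vectors = [cmath.exp(1j * (r - s))
               for s, r in zip(stimulus_phases, response_phases)]
    return abs(sum(vectors) / len(vectors))

random.seed(0)  # fixed seed so the simulation is repeatable
n_samples = 2000

# Phase of a 4 Hz stimulus rhythm sampled at 250 Hz (assumed values).
stimulus = [2 * math.pi * 4.0 * k / 250.0 for k in range(n_samples)]

# Simulated "entrained" response: constant lag of 0.8 rad plus small jitter.
entrained = [p + 0.8 + random.gauss(0.0, 0.3) for p in stimulus]

# Simulated unrelated response: phases drawn uniformly at random.
unrelated = [random.uniform(0.0, 2 * math.pi) for _ in range(n_samples)]

plv_entrained = phase_locking_value(stimulus, entrained)
plv_unrelated = phase_locking_value(stimulus, unrelated)
print(plv_entrained, plv_unrelated)
```

Note that the constant lag does not lower the value; only the consistency of the phase difference matters, which is why such a measure suits a response that follows the stimulus with some delay.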

2.4 The Importance of Neural Entrainment

The phenomenon of neural entrainment to speech brings to neurology a much higher precision of analysis than was available in studies preceding the realization of the importance of spectral information in speech encoding. ERPs (event-related potentials) are phenomena that occur in response to the onset of a cued stimulus and result in a specific signature in the neurofeedback signal. Prior to neural entrainment, time-locked information of discrete stimuli (i.e. a response that is isolated to the onset of a given, discrete signal), such as the P3 and other ERP components, was used as a marker for the neuronal activity in speech perception (Zhou et al., 2016), but this excluded phenomena such as entrainment to continuous speech, which differs from time-locked analysis in several important ways. For example, in a recent meta-analysis of contemporary neural entrainment research, Zhou et al. (2016) show that neural entrainment to musical auditory streams occurs along harmonics of beats as well as the beats themselves. Such an observation would be impossible to make using merely time-locked analysis such as ERPs. Another, and perhaps more important, example is the increasing amount of data (e.g. Diamond & Zhang, 2016; Gao et al., 2017; Henry & Obleser, 2013) strongly indicating that very specific linguistic information processing, such as syllabic and prosodic information, can be observed directly from neural entrainment and, conveniently enough, also be delineated into separate spectral bands (i.e. frequency sections of the sound spectrum). For the sake of convenience, spectral bands in neurological terms have been given names related to the neural activities usually associated with them, such as the brainwaves that differ between an alert and a sleeping brain (beta waves and delta waves respectively).
The spectral bands which have so far turned out to be most interesting with regard to the brain’s speech processing are the gamma, beta, theta and delta bands, since these are the frequency ranges within which most speech details lie: phonetic information is associated with gamma (>40 Hz) and beta (15-35 Hz) oscillations, syllabic information with theta (4-8 Hz) oscillations, and finally prosodic information embedded in sequences of syllables/words is associated with delta (<3 Hz) oscillations (Ghitza, 2012). In other words, the cycle lengths of these bands correspond to the duration of each distinctive unit of these features embedded in speech information.
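
The band limits listed above can be written down as a small lookup, shown here as a minimal sketch. The function names are hypothetical; the limits and associated speech features are those given in the text (Ghitza, 2012), and frequencies falling between the listed bands are deliberately left unclassified.

```python
def speech_band(freq_hz):
    """Return (band, associated speech information) for a frequency in Hz,
    or None if the frequency falls between the bands listed in the text."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    if freq_hz < 3:
        return ("delta", "prosodic")
    if 4 <= freq_hz <= 8:
        return ("theta", "syllabic")
    if 15 <= freq_hz <= 35:
        return ("beta", "phonetic")
    if freq_hz > 40:
        return ("gamma", "phonetic")
    return None

def cycle_duration_ms(freq_hz):
    """Duration of one oscillation cycle in milliseconds."""
    return 1000.0 / freq_hz

# The theta band (4-8 Hz) corresponds to cycles of 250 ms down to 125 ms.
print(speech_band(5.0), cycle_duration_ms(4.0), cycle_duration_ms(8.0))
```

The conversion from frequency to cycle duration is what links each band to the duration of a speech unit: a 4-8 Hz oscillation completes one cycle every 125-250 ms.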


Finally, other examples of observations excluded from time-locked analysis of speech are Kong et al. (2015), Horton et al. (2013) and Zoefel & VanRullen (2015a), all of which analyze continuous speech and find significant results strongly implying some form of neural entrainment as a necessary mechanism for aspects of speech perception such as speech intelligibility or speech localization. Thus, since oscillatory information is so central in speech, any study aiming to understand speech perception must take into account the information contained within the entire speech envelope (the amplitude of each speech segment’s onset and their respective refraction periods) rather than limiting its scope to temporal information alone.
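
Since the speech envelope is central to this argument, a crude way of estimating an envelope is sketched below: the signal is rectified and then smoothed with a moving average. This is a simplified stand-in chosen for illustration, not the procedure used in the cited studies, and the test tone, sample rate and window length are assumed values.

```python
import math

def rectify_and_smooth(signal, window):
    """Crude envelope estimate: full-wave rectify, then centred moving average."""
    if window < 1:
        raise ValueError("window must be at least 1 sample")
    rectified = [abs(x) for x in signal]
    half = window // 2
    smoothed = []
    for i in range(len(rectified)):
        lo = max(0, i - half)
        hi = min(len(rectified), i + half + 1)
        smoothed.append(sum(rectified[lo:hi]) / (hi - lo))
    return smoothed

sample_rate = 2000  # samples per second (assumed value)
times = [n / sample_rate for n in range(sample_rate)]  # one second

# A 200 Hz tone whose loudness rises and falls four times per second.
tone = [(1.0 + 0.5 * math.sin(2 * math.pi * 4.0 * t))
        * math.sin(2 * math.pi * 200.0 * t) for t in times]

estimate = rectify_and_smooth(tone, window=21)  # about two carrier cycles

peak_index = round(0.0625 * sample_rate)    # the 4 Hz modulator is at its maximum
trough_index = round(0.1875 * sample_rate)  # the 4 Hz modulator is at its minimum
print(estimate[peak_index], estimate[trough_index])
```

The estimate follows the slow 4 Hz outline while the fast 200 Hz oscillation is averaged out, which is the sense in which the envelope is the "silhouette" of the signal.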

2.5 Neural Oscillations

Neural oscillations are waveforms emanating from the synchronized firing of individual neurons. These exist everywhere in the brain and their roles in cognition seem to be plentiful (Giraud & Poeppel, 2012). One example that has piqued the curiosity of many a neurologist in the last decade is speech perception, since oscillations laminate information into discrete pieces, which is centrally important for speech processing, given the frequency composition of speech (Giraud & Poeppel, 2012). For this reason, one of the most efficient methods the brain could use for picking out speech information among many other auditory signals ought to be the ability to temporally segregate sound based on its oscillatory qualities, and this is indeed the mechanism that a growing amount of data supports (e.g. Henry, Herrmann & Obleser, 2014; Henry & Obleser, 2013; Zhou et al., 2016; Zoefel & VanRullen, 2015a). Among the many arguments supporting this view, Giraud and Poeppel (2012) argue that the articulatory motor functions of human speech most likely evolved in synchrony with the brain’s capacity for interpreting distinct sound information, wherefore the auditory processing system and the motor system, with its rhythmic movements of the jaw along with other rhythmic components eliciting the acoustic prerequisites of human speech, must be tuned to each other. This is the point at which theories about the mechanisms underlying speech perception diverge, with data implying several different hypotheses.

2.6 Current Theories of Speech Perception

Neural entrainment has come to be considered a mechanism used by the brain to process sensory stimuli of several modalities (Wang, Lu & Liang, 2003). What is unclear as of yet, however, is precisely how neural entrainment to speech signals works to elicit speech perception. Ding & Simon (2014) presented four main hypotheses in a recent meta-analysis; although it should be said that more has happened since it was written, given that this field has become such a hot topic these last few years (e.g. Celma-Miralles, de Menenez & Toro, 2016; Henry, Herrmann & Obleser, 2014; Zhou et al., 2016; Zoefel & VanRullen, 2015b), these hypotheses are still very useful in generally encapsulating the nature of neural entrainment. Generally speaking, these four hypotheses hold that the brain uses either a passive or an active mechanism for speech entrainment (Ding & Simon, 2014). Speech entrainment can be divided into two stages: an analysis stage and a synthesis stage. Speech analysis is the process during which the auditory signals of speech are deconstructed into auditory features, such as amplitude, frequency, pitch and so forth.


Speech synthesis is the process where the detected and discretely separated auditory signals are put together into higher levels of detail, such as distinct phonemes and prosodic detail. A more thorough description of these hypotheses will be given below, but it should be mentioned here that the field contains research gaps that beg to be investigated, because several of the results contradict each other such that they support all of the hypotheses presented in different ways.

2.7 Four Hypotheses for Neural Entrainment to Continuous Speech

Ding & Simon (2014) thoroughly explain four different hypotheses in their review of the contemporary literature on the subject of neural entrainment to continuous speech. These hypotheses can be divided into two classes: two of them argue for an active role of neural entrainment in speech synthesis, whereas the other two argue for a latent role. Since these theories are so all-inclusive of contemporary data within this topic, they are thoroughly described below.

2.7.1 The Onset Tracking Hypothesis

This theory states that, instead of being directly entrained by the speech envelope as a whole, speech entrainment is a superposition of neural responses to discrete onsets (Ding & Simon, 2014). The authors mention observations that the sharpness of acoustic edges influences cortical tracking of the sound envelope. An argument against this theory is that it is hard to know which part of a speech envelope really is an edge, onset or offset, since speech changes continuously. If this theory holds true, it could serve as an alternative to the ERP methodology for studying speech perception.
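
The idea of discrete onsets in an envelope can be illustrated with a toy detector that marks the moments where the envelope starts rising faster than a chosen threshold. This is a hypothetical sketch, not a procedure from Ding & Simon (2014); the function name, the synthetic envelope and the threshold are all assumptions made for illustration.

```python
def envelope_onsets(envelope, sample_rate, threshold):
    """Return the times (in seconds) at which the envelope's rate of increase
    first exceeds `threshold` (amplitude units per second)."""
    onsets = []
    rising = False
    for i in range(1, len(envelope)):
        slope = (envelope[i] - envelope[i - 1]) * sample_rate
        is_rising = slope > threshold
        if is_rising and not rising:
            onsets.append(i / sample_rate)  # start of a new steep rise
        rising = is_rising
    return onsets

# Toy envelope sampled at 100 Hz: silence, a burst, silence, a weaker burst.
sample_rate = 100
envelope = [0.0] * 20 + [1.0] * 30 + [0.0] * 20 + [0.8] * 30

print(envelope_onsets(envelope, sample_rate, threshold=10.0))  # → [0.2, 0.7]
```

With such clean steps the onsets are unambiguous. In real speech the envelope changes continuously, so the result depends heavily on the threshold, which mirrors the objection to this hypothesis raised above.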

2.7.2 Collective Feature Tracking Hypothesis

This hypothesis instead holds that neural entrainment tracks acoustic features, such as pitch, sound source location information etc. The speech envelope is the summation of the power of all speech features at each moment in time, in the same way that large-scale neural entrainment to speech is the summation of neural activity tracking different acoustic features of speech. “It is therefore plausible,” the authors argue, “to hypothesize that macroscopic speech entrainment is a passive summation of microscopic neural tracking of acoustic features across neurons/networks” (Ding & Simon, 2014). The onset tracking hypothesis is a special case of this hypothesis, focused only on discrete edges of the speech envelope. More generally, entrained activity is a mixture of all speech features such as temporal envelopes in different frequency bands and their response functions.

2.7.3 Syllabic Parsing Hypothesis

In this hypothesis, speech envelope entrainment creates a discrete syllabic level of speech representation (Ding & Simon, 2014). An example of this mechanism is that theta band oscillations are aligned to lie within the interval between two vowels in speech, which corresponds to two adjacent peaks in the speech envelope. In this way, the information contained in each single theta oscillation can be used to decode the phonetic information in speech. Importantly, under this hypothesis entrainment does not just track syllables, but also creates a neural representation of them.


2.7.4 Sensory Selection Hypothesis

This hypothesis focuses on picking out sounds from the environment that contain speech and ignoring those that do not. The brain does this by listening for rhythmicity and temporal coherence between acoustic features. These are all critical features contained in the speech envelope, so they would have to be crucial for the brain to comprehend speech. The mechanism by which the brain tracks these features has two underlying hypotheses. One theorizes that the brain uses temporal coherence in order to connect acoustic features which are included in the same speech stream. Thus, envelope entrainment is here theorized to reflect the computations belonging to this “coherence analysis” (Ding & Simon, 2014). The other hypothesis states that the moments at which more speech is being heard are tracked by the envelope and used to guide the brain to process them.

2.7.5 Combining the Hypotheses

As mentioned above, speech processing is divided into an analysis stage and a synthesis stage. The analysis stage works to decompose speech sounds into auditory features such as pitch, amplitude etc. This starts out in the cochlea, no matter which types of sound signals there are. The synthesis stage combines these auditory features into chunks that are used to perceive 'linguistic entities' such as syllables and speech streams. In hypotheses 1 and 2, speech entrainment is considered as a passive process in which only the analysis stage is used, whereas in hypotheses 3 and 4, the synthesis stage is used. But synthesis could have both an active and a passive mechanism. In a passive mechanism, syllabic parsing would be conducted by neural computations that generate spatially coherent signals. These would be measurable by macroscopic recording tools. An active mechanism would entrain "cortical activity, as a large-scale voltage fluctuation, directly regulating syllabic parsing or sensory selection" (Ding & Simon, 2014).

2.8 Prospects for Further Research

However the case may be, since the mechanism of neural entrainment involves both the frequency and the amplitude of any periodic signals reaching the brain, neural entrainment is a highly relevant marker to use for research on auditory stimuli, and more specifically on speech perception. Not only do auditory signals in general consist of both frequency and amplitude (Giraud & Poeppel, 2012), but speech can also easily be deconstructed into temporally dependent segments, such as phonemes or prosodic information (Giraud & Poeppel, 2012). These speech components would be distinguishable using this temporally based phenomenon as a marker. The methodologies used to observe neural entrainment are plentiful, but among the most promising for the purpose of the present study are functional techniques such as magnetoencephalography (MEG) or electroencephalography (EEG) (Schultz, 2012), since the focus should be on the highest temporal rather than spatial resolution, with which the richest possible spectral analysis of the oscillatory nature of speech can be conducted. Lastly, Rogala et al. (2016) warn in their review of EEG-neurofeedback (NFB) studies that 39.3% of the EEG entrainment experiments and 50% of the behavioral effects (motoric responses) experiments they analyzed were unsuccessful, which strongly implies that caution is needed when carrying out such studies, and it is therefore an excellent motivation for this literature review as a preliminary basis for practical experiments.


3. Method

In order to answer the research questions for this review, information regarding both the methodologies and theoretical bases necessary to design an EEG experiment to analyze neural entrainment to speech needed to be identified in the reviewed articles. Specifically, what needed to be answered in order to conduct such an experiment was a) which parts of the brain to look at, b) how to analyze the gathered data, and c) which methods, besides EEG, are most reliable to carry this work out. For this reason both inclusion and exclusion criteria were created to filter out irrelevant data and retain the necessary information for further analysis. This section explains how this process was conducted and finally presents which factors were chosen for analysis in the included articles. Furthermore, this section is divisible into two main parts aside from the named sections. The first part concerns the article selection process, and the second part explains the criteria selection for article analysis once the articles themselves had been selected.

3.1 Database Choice

A set of relevant keywords was decided upon through preliminary searches in several databases. The databases tried, with the number of hits for each, were: Academic Search Premier (23 hits), PsycINFO (50 hits), PubMed (19 hits), Scopus (16 hits), Web of Science (19 hits) and Europe PMC (67 hits). The Europe PMC database (europepmc.org) yielded the highest number of results for the relevant keywords. Some of the hits differed between the databases, but a read-through of the hit titles indicated that the hits in the other databases were either duplicates of hits in the Europe PMC database or not relevant to this study. For this reason, only Europe PMC was used as a source for articles.

3.2 Motivations for Keyword Choices and Final Keywords

The study that this literature review intends to build a case for will be conducted using EEG. EEG is sufficient to provide high-resolution data from specific cortical areas related to the subject (Schultz, 2012). Techniques with high spatial resolution, such as fMRI, would be far less likely to yield meaningful data, since the subject of study, neural oscillations, is temporal in nature, and what little a spatially focused analysis might add is unlikely to justify the extra cost of such equipment. Neural entrainment is the neural mechanism in focus of the proposed study that this review aims to lay the foundation for. This is because speech is rich in periodic qualities such as frequency, and it is therefore endowed with easily detectable markers such as stimulus response onsets, duration times of oscillations etc. (Giraud & Poeppel, 2012), which are furthermore well suited for EEG as the method of neural analysis. The term ‘neural entrainment’ is similar in meaning to the term ‘cortical entrainment’ (Ding & Simon, 2014), but ‘neural entrainment’ included more articles, so ‘neural’ seems to be slightly more common than ‘cortical’. For these two “key phrases” the search results showed that both were very similar in occurrence (98 and 111 hits, respectively). The keywords decided upon were ‘EEG’, ‘neural entrainment’, ‘neural oscillation’ and ‘cortical oscillation’, and the searches were conducted using the following combinations:


• “EEG” + “neural entrainment”
• “EEG” + “neural oscillation”
• “EEG” + “cortical oscillation”

3.3 Inclusion and Exclusion Criteria

Following article gathering based on keywords, a further filtering based on the inclusion and exclusion criteria listed below was conducted and articles not measuring up to the criteria were excluded from further analysis in this review. See Table 1 for an overview of the included articles that passed through each step of this filtering process.

3.3.1 Inclusion Criteria The following criteria were chosen as qualifiers for inclusion in further review: 1) Studies using EEG as a method to analyze neural entrainment to speech. After further analysis of the articles included in the above criteria, it became clear that a broader inclusion criterion was also needed so as not to exclude relevant data from this review: 2) Studies analyzing or reviewing EEG methodology to analyze neural entrainment to speech. To explain this second criterion, the first criterion excluded articles which were not necessarily looking at entrainment to speech exclusively – e.g. other literature reviews – but they still incorporated EEG analysis of speech entrainment and relevant reasoning regarding it, even if the focus was elsewhere. Other examples include one article reviewing entrainment to meter induction, i.e. the ability to keep pace with music (Celma-Miralles et al., 2016). This article looked at this phenomenon partly because it had been shown previously that speech entrainment seemed to use this mechanism, and therefore, the arguments brought up in it were also relevant to this review. Importantly, in cases where there were no arguments that in any way elucidated anything relating directly to speech entrainment, but still arguments pointing to this topic, such as “these findings could also prove to be relevant in speech entrainment, since similar mechanisms are used”, these articles were excluded, because they merely provided agreement/confirmation rather than any research of their own.

3.3.2 Exclusion Criteria Below are the criteria chosen for disqualifying articles from further review: 1) Studies not using EEG as the method for obtaining their reported data were excluded, since different methods yield different results when observing the same stimuli, due to different resolutions etc., and can therefore not be used as a basis for further research with the EEG technique. 2) Studies in which EEG did not measure any form of neural entrainment. Some studies did use several techniques, such as MEG, functional magnetic resonance imaging (fMRI) and EEG combined, but e.g. only MEG was ever applied to cortical entrainment specifically, whereas EEG was used as a complement looking at other phenomena. 3) Studies which did not look into entrainment to speech specifically. For example, some studies looked into motor entrainment or other forms of entrainment that were unrelated to the subject of this review.


Based on these criteria, 24 articles were chosen for review in accordance with Figure 1 below.

Figure 1. A flow chart of the successive exclusion of articles generated through the three listed searches on the Europe PMC database. There were altogether 7 duplicates between the three searches: 5 were among the excluded articles and were duplicates between the leftmost and middle searches, and the other 2 were among the included articles and were found with both the leftmost and rightmost searches. Thus, the last row of arrows in the diagram should not be interpreted as 5 + 3 + 18 included articles, but as either 3 + 3 + 18, 5 + 1 + 18 or 5 + 3 + 16. Criterion 1 excluded articles not using EEG, criterion 2 excluded non-neural entrainment articles, and criterion 3 excluded non-speech entrainment articles. No information regarding searches in the other databases is presented here, since those searches were either redundant or irrelevant.
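The duplicate correction described in the Figure 1 caption can be checked with a short calculation. The following is only a minimal sketch: the search labels and the helper name `unique_total` are hypothetical, and the subtraction assumes that each duplicated article appears in exactly two of the searches, as the caption states.

```python
# Included articles at the end of each search branch in Figure 1.
# The labels are illustrative; the caption only refers to the branches
# by their position in the diagram.
included_per_search = {"leftmost": 5, "middle": 3, "rightmost": 18}

# Included articles that were found by two searches (leftmost and rightmost).
duplicates_among_included = 2


def unique_total(counts, shared):
    """Sum the per-search counts, counting each shared article only once.

    Valid only if every shared article occurs in exactly two searches.
    """
    return sum(counts.values()) - shared


total = unique_total(included_per_search, duplicates_among_included)
print(total)
```

Under these assumptions the calculation arrives at the 24 articles reported as included, which is the same figure as each of the three alternative readings given in the caption.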

3.4 Key Experimental Factors

Once selected, these 24 articles were analyzed for factors relevant to the research questions. The factors were as follows: (i) participants, (ii) electrode setups, (iii) spectral bands, (iv) data analysis, (v) cortical loci and (vi) theoretical basis. The factors are presented in detail below. This selection analysis was almost exclusively qualitative, in that no comprehensive statistical tests were conducted to support it. Concerning the articles containing experiments, it was deemed sufficient for this review’s intended purpose to find qualitative similarities between the experiments conducted in them and then see which of these qualitatively determined factors were present in experiments whose results significantly confirmed their respective research hypotheses.

3.4.1 SJR and IF Scales Before this procedure was carried out, and because literature reviews were allowed by the selection criteria (again due to the qualitative nature of this review), short-term (publishing year 2015) SCImago Journal Rank (SJR) and impact factor (IF) scores for the journals in which those articles were published were used as objective criteria for judging the articles’ relevance. The reason for this was that literature reviews count as secondary sources and should therefore be regarded with more skepticism than primary sources, yet it is still useful to start out with literature reviews when searching for relevant factors, since the very purpose of a literature review is to give an overview of the research within a given field. So in order to ascertain that the reviewed articles held a high enough standard to be acceptable as sources for this preliminary search, the standing of the journals they were published in was tested with these two scales. Both the SJR and the IF scales are based on the fraction (number of annual citations)/(number of annually published articles), though they differ in that the SJR scale also considers the relevance of the journals citing them (Cantin, Muñoz & Roa, 2015). Generally, the scales differ such that the IF scale has higher hit rates within a single field, whereas the SJR scale has higher hit rates on general counts across fields, and it is therefore recommended that both scales be used to increase certainty of the given journal’s genuine relevance (Cantin et al., 2015). Although there has been considerable criticism, especially of the use of the IF scale, as a means to determine the relevance of a given journal (e.g. Diamandis, 2017; Mañana-Rodriguez, 2015), both scales are widely considered to be relevant for this purpose (e.g. Bornmann & Pudovkin, 2017; Seglen, 1997).
Regarding the SJR score, the qualifying score was determined based on the criteria used on the SJR official website (www.scimagojr.com), where the journals are divided into four quartiles (Q1, Q2, Q3 and Q4) based on how high their score values are. In accordance with recommendations on the SJR official website, it was decided that articles published in journals ranked in at least Q2 (the second-highest rank) satisfied the SJR criterion. As for the IF score, the qualifying score was based on the mean for the journals in a 2014 list of the highest-ranking journals reported in Cantin et al. (2015): any score at least as high as this mean satisfied the IF criterion. The analysis based on the factors found in the review articles was, however, not made exclusively on the basis of the articles with experiments (which might otherwise be suggested by the subsequent factor selection described here), but mixed both literature reviews and experiment studies for finding data. The point of this factor selection was not so much to separate direct from indirect data in the articles; rather, it was determined that literature reviews would give a better overview and thus be a better source for initially finding the factors.

3.4.2 Subsequent Factor Selection The review articles were then used as a source of further factors to look for – factors also judged to be relevant for the research questions – in the articles with experiments. The specific factors for the suggested theoretical and practical basis, as well as their respective motivations, were as follows:


(i) Participants. These qualities were specifically the number of participants, their ages, and if they were healthy individuals. Several of the reviews showed significant differences between healthy and disabled participants with regard to speech perception.

(ii) Electrode setups. The positioning of electrodes used should be aligned with known cortical areas for the neurological functions analyzed. This criterion was motivated by Rogala et al. (2016), in which the authors looked thoroughly at both the number and setups of electrodes used in analyzed experiments, and since they found significant differences in results between studies with what they deemed too few and enough electrodes in relevant areas, this was determined an important factor.

(iii) Spectral bands. The number of spectral bands involved in the analysis. This factor was also explicitly analyzed in Rogala et al. (2016). According to their conclusion, the fewer spectral bands analyzed at once, the better, which they based on significant differences in results.

(iv) Data analysis. The methods used for interpreting neural data. Ding & Simon (2014) listed differences in which results could be analyzed between different data analysis methods, such as Fast Fourier Transforms (FFT) and auditory steady-state response (aSSR).

(v) Cortical loci. The cortical areas analyzed. Different cortical areas are involved in different neurological functions, something that is reflected in the articles, and this is centrally important to the research questions, which makes it a factor.

(vi) Theoretical basis. Finally, which theories and suggestions for further research about neural entrainment to speech, e.g. similar or equal to the four listed by Ding & Simon (2014), motivated the experiments or were entertained in relation to such motivations, such as e.g. comprising partial motivations in combination with different theories. This factor also includes any eventual suggestions for further research contained in the articles.

3.4.3 Final Factor Analysis After factor selection based on the 7 review articles, the analysis of the 17 remaining experiment articles was conducted. The chosen factors worked here as guidelines for a qualitative analysis with the aim of answering the research questions. The results of this analysis were subsequently sorted as presented in the result section.


4. Results

After thoroughly going through the results of the article analysis, this section finishes with a conclusion which is intended to summarize all data relevant as suggestions for further research based on the results. Some data, namely the full table of the included articles (Table 4) as well as the tables of excluded articles (Tables 5, 6 and 7), is so extensive that it is instead reported separately in Appendix B.

4.1 Article Selection Results

The included articles are listed in Table 1. All the literature reviews passed the two selection criteria and were judged relevant as guides for general factor selection. Their respective IF and SJR scores are listed in Table 2.

Table 1. The articles included in this literature review. Articles with * were included based on inclusion criterion 2, and articles with ** are literature reviews.

| Author(s) | Published |
| --- | --- |
| Benitez-Burraco & Murphy** | 2016 |
| Celma-Miralles, de Menezes & Toro* | 2016 |
| Crosse, Butler & Lalor | 2015 |
| Diamond & Zhang | 2016 |
| Ding & Simon** | 2014 |
| Gao et al. | 2017 |
| Henry, Herrmann & Obleser | 2014 |
| Henry & Obleser | 2012 |
| Henry & Obleser | 2013 |
| Horton, D'Zmura & Srinivasan | 2013 |
| Kong, Somarowthu & Ding | 2015 |
| Lehongre, Morillon, Giraud & Ramus | 2013 |
| Obleser, Herrmann & Henry** | 2012 |
| Peelle & Davis** | 2012 |
| Power, Colling, Mead, Barnes & Goswami | 2016 |
| Power, Mead, Barnes & Goswami | 2012 |
| Power, Mead, Barnes & Goswami | 2013 |
| Rimmele, Sussman & Poeppel** | 2015 |
| Rogala et al.** | 2016 |
| Rufener, Oechslin, Wöstmann, Dellwo & Meyer | 2016 |
| Thwaites, Schlittenlacher, Nimmo-Smith, Marslen-Wilson & Moore | 2017 |
| Zhou, Melloni, Poeppel & Ding | 2016 |
| Zoefel & VanRullen (a) | 2015 |
| Zoefel & VanRullen (b)*/** | 2015 |

Table 2. The impact factor (IF) and SCImago Journal Rank (SJR) results for the listed years of the three journals in which the literature reviews were published.

| Article(s) | Journal | IF score | SJR score |
| --- | --- | --- | --- |
| Benitez-Burraco & Murphy (2016); Ding & Simon (2014); Obleser, Herrmann & Henry (2012); Rogala et al. (2016); Zoefel & VanRullen (2015b) | Frontiers in Human Neuroscience | 3.634 (2015-2016) | 49 (2015) |
| Peelle & Davis (2012) | Frontiers in Psychology | 2.463 (2015) | 43 (2015) |
| Rimmele, Sussman & Poeppel (2015) | International Journal of Psychophysiology | 2.596 (2015-2016) | 93 (2015) |


4.2 Research Factors

The six factors identified in the method section are here, for the sake of overview, divided into three more thematic sections, namely factors related to participants, methodology and theory, which in effect means that factor (i) is a single section, factors (ii), (iii) and (iv) are all included in the methodology section, and factors (v) and (vi) in the theory section.

4.2.1 Participants Some of the articles focused explicitly on health effects, such as the effects of aging on speech interpretation, and others on how the speech interpretation of dyslectics was affected (one article had participants in three different age categories, and three articles included participants diagnosed with dyslexia). These factors resulted in significant differences between healthy and disabled individuals with regard to speech perception. Therefore, participant health qualities are relevant concerns which significantly affect the results in studies analyzing neural entrainment to speech. Since these qualities are unrelated to the goal of answering the research questions, it was determined that healthy individuals will have to be a qualifying criterion for participants. As for the number of participants, the mean number of participants in the articles was 23 (m = 23.27, SD = 303.43, with outliers included), with a minimum of 8 participants. Significant results of the experiments in the articles did not seem to have any relationship to the number of participants, judging from the fact that in the two articles which presented non-significant results (of which only one reported the number of participants), the number of participants exceeded that in the articles with significant results only. Besides the fact that this is based on a single result, it could also be argued that this has more to do with differences between the experiments, but investigating that further lies outside the scope of this review, seeing as it would be difficult to find supporting evidence either for or against this hypothesis by comparing the experiments for statistical significance. The participant mean age in the articles with young adult participants (i.e. excluding those with children and elderly participants, since those populations are irrelevant with regard to the research questions) was 25.18 (SD = 4.70), spanning the ages 18-37 years.
Participant gender was always mixed in some approximation of a 50/50 setup. Finally, participants were always right-handed, which reportedly has to do with the fact that the speech center of a right-handed person is far more dominantly positioned in the left hemisphere than that of a left-handed person (e.g. Crosse et al., 2015).

Table 3. Columns from left to right showing participant data, frequency bands of interest used, number and positioning of EEG electrodes and significance of the respective experiments. N/A (not available) if not known or reported in the article. If significance is reported without further comment, it means that all experiments in the article were significant.

| Articles | Participants | Frequency band(s) of interest | Electrodes | Significance |
| --- | --- | --- | --- | --- |
| Crosse et al., 2015 | 21 healthy (19-37) | Delta-theta (2-6 Hz) | 128 electrodes, N/A | Significant results |
| Kong et al., 2015 | 8 healthy (20-28) | N/A | 16 electrodes, 10/20 system | Significant results |
| Horton et al., 2013 | 9 healthy (21-29) | Gamma (40-41 Hz) | 128 electrodes, 10/5 system | Significant results |
| Zoefel & VanRullen, 2015a | 12 healthy (m = 27.6) | Delta-theta (1-8 Hz) | 64 electrodes, 10/10 system | Significant results in all but one experiment |
| Henry et al., 2014 | 17 healthy (m = 25.2) | Delta-theta (3.1 Hz & 5.075 Hz) | 26 electrodes, 10/20 system | Significant results |
| Henry & Obleser, 2012 | 12 healthy (21-32) | Delta (3 Hz) | 64 electrodes, 10/20 system | Significant results |
| Zhou et al., 2016 | N/A | Delta | N/A | N/A |
| Celma-Miralles et al., 2016 | 16 healthy, professional at hearing task (18-35) | Delta-theta (0.8 Hz, 2.4 Hz & 4.8 Hz) | 60 electrodes, 10/10 system | Significant results |
| Power et al., 2016 | 46, 12 dyslectic, 23 healthy (age N/A) | Delta-theta (0-10 Hz) | 129 electrodes, N/A | Significant results |
| Power et al., 2013 | 32, 21 healthy (m = 165.57 months), 11 dyslectic (m = 166.73 months) | Delta (2 Hz) | N/A | Significant results |
| Power et al., 2012 | 23 healthy (163 months) | Delta (2 Hz) | N/A | Significant results |
| Thwaites et al., 2017 | 15 healthy (18-30) | N/A | 70 electrodes, 10/20 system | Significant results but only weak evidence in one experiment |
| Rufener et al., 2015 | 72 healthy, 21 young (20-25), 21 middle-aged (40-45), 30 elderly (60-67) | Theta (4-8 Hz), gamma (30-48 Hz) | 128 electrodes, N/A | Significant results |
| Diamond & Zhang, 2016 | 21 healthy (18-24) | 0.016-200 Hz, sampling at 512 Hz | 64 electrodes, 10/20 system | Significant results |
| Henry & Obleser, 2013 | 16 healthy (21-31) | Delta (3 Hz), theta (6 Hz) | 64 electrodes, 10/20 system | Significant results |
| Lehongre et al., 2013 | 32, normal, 17 dyslectic, 15 healthy (N/A) | Delta (1-3 Hz), theta (4-7 Hz), gamma (25-35 Hz) | 64 electrodes, positioned with FCz as reference | Significant results |
| Gao et al., 2017 | 12 healthy (19-25) | Alpha (8-12 Hz), beta (12-30 Hz), gamma (30-48 Hz) | 64 electrodes, reference on nose | Significant results |

4.2.2 Methodology The factors (ii), regarding electrode setups, (iii), regarding the number of spectral bands analyzed, and (iv), regarding different methodologies, such as markers, are related to this topic and are thus all included here.

4.2.2.1 Electrode Setups The number of electrodes used in the articles as well as their positioning can be viewed in Table 3 above. According to Rogala et al. (2016), the importance of electrode positioning lies in the positions corresponding to known functional regions and to the behavior observed when those regions are active. This was further confirmed by the two articles with non-significant results (see Table 3), since one distinct detail separating those experiments from the other articles was that they were testing theories which did not specify known cortical regions for the respective functions they were analyzing. Rogala et al. (2016) recommend using at least 2-3 electrodes per known cortical area to control for noise from irrelevant stimuli. Regarding the positioning schemes of electrodes, Rogala et al. (2016) concluded that the 10-20 system, or any system with the Fz location (frontal lobe, midline position) included, was most reliable in yielding NFB. They failed, however, to find any consistency in electrode setups. From the other articles it was difficult to either disprove or support these findings, since they too coincided with Rogala et al.’s (2016) findings and since almost all of them had significant results. Zoefel & VanRullen (2015a) looked at which potential areas were involved in high- or low-level features of speech. They were interested in finding out whether top-down or bottom-up processes were used by the brain. Participants were instructed to press a button when they detected a tone pip, and the authors failed to show significantly that the detection of these was due to phase entrainment (alignment between two rhythmic structures) of EEG oscillations. Since the regions responsible for the processes looked at are so diverse (only frontal lobe areas and auditory cortex were reported), it is reasonable to assume that this might have been at least a partial reason for this result. Thwaites et al. (2017) looked at loudness detection when participants were presented with several auditory stimuli at once. Participants were presented with speech stimuli, and the loudness perception at each second was analyzed using nine mathematical models to pinpoint the location of the cortical area showing most entrainment to a specific sound segment, among a total of 10,242 minuscule areas. They reported that the location of the expression of their nine channels was uncertain “due to the point-spread function” (Thwaites et al., 2017) (the tendency of a point-object to get “smeared out” and thus become hard to pinpoint). This non-significant result further suggests a correlation between the specificity of cortical locations and the certainty of EEG readings. Finally, as further evidence of this conclusion, no counter-examples were found among the other articles in which the experimenters looked at cortical regions with unspecific correlation to cognitive or behavioral functions.

4.2.2.2 Spectral Bands Rogala et al. (2016) found evidence suggesting a negative correlation between the number of spectral bands and the efficiency of NFB training. For example, they reported that in one experiment an increase in alpha band amplitude, which is reportedly common in EEG-NFB experiments, might have represented a “return to baseline level disturbed by training procedure” (Rogala et al., 2016) rather than a reading of neural entrainment. Based on this and several other examples, they recommend analyzing as few spectral bands at once as possible. It is difficult to determine from the experiment articles whether this recommendation is relevant, since all but one of them used at most 2 spectral bands. Most often these were delta- or theta-band protocols, which is understandable given that most speech information has been found within these bands, as well as the gamma- and beta-bands. On the other hand, one example contradicting the recommendation was found in Kong et al. (2015). In their experiment, participants listened to speech that was modulated via vocoding into four different channel conditions (8, 16, 32 and 64 channels) and two different target-to-masker ratios (TMR), 0 and 3 dB. TMR is a ratio between how much a distractor stimulus contributes to perception and how much a target stimulus does. In this case, the distractor stimulus was a delay of two voices speaking at once, with one of the voices speaking between 0 and 3 dB louder than the other. They looked at five different spectral ratios (the four vocoded and the unaltered voice recording), and yet they still reported significant results in all experiments.
It should also be noted in this section, that all articles confirmed the previously presented theories about which spectral bands are involved in which form of speech information (such as syllabic and prosodic), both in their hypotheses and in any of the significant results, where this was an issue.
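To make the target-to-masker ratio (TMR) mentioned in connection with Kong et al. (2015) more concrete, the sketch below computes a TMR in decibels from two sampled signals. It assumes the conventional definition of the level of the target relative to the masker, computed from root-mean-square (RMS) amplitudes. The signals, sampling rate and function names are hypothetical and are not taken from the reviewed article.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def tmr_db(target, masker):
    """Target-to-masker ratio in dB, assuming an RMS-based level definition."""
    return 20.0 * math.log10(rms(target) / rms(masker))


# Hypothetical one-second test tone standing in for a speech recording.
FS = 8000  # sampling rate in Hz (assumption)
masker = [math.sin(2 * math.pi * 200 * n / FS) for n in range(FS)]

# Same level as the masker, i.e. the 0 dB condition.
target_equal = list(masker)

# Scaled so that the target is 3 dB above the masker.
gain = 10 ** (3 / 20)
target_louder = [gain * s for s in masker]

print(round(tmr_db(target_equal, masker), 2))
print(round(tmr_db(target_louder, masker), 2))
```

A difference of 3 dB corresponds to an amplitude ratio of roughly 1.41, which illustrates how small the level difference between the two TMR conditions is.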


4.2.2.3 Data Analysis Partly based on different theories, but more commonly based on different areas of interest, the articles contained several different techniques for analyzing neural entrainment to speech. The ones taken up here can be divided into two categories: material types used to either modify EEG data or to induce neurofeedback, and analysis types of EEG data.

Material Types One of the materials included in the articles, which Ding & Simon (2014) claim to be the most common technique, is the auditory steady state response (aSSR), which was used in several of the articles. The aSSR is a subcategory of auditory evoked potentials (AEP), which in turn is a subcategory of event related potentials (ERP). All of these are neural responses to isolated signals, in this case auditory signals (ERPs can be used in other sensory modalities as well). aSSRs are neural responses which track the envelope of a complex auditory signal and are evoked by the signal’s onset, just as for any ERP. Ding & Simon recommend using slow aSSRs (<10 Hz) since this is where the speech envelope range lies. This contention was also supported by the experiment articles using aSSRs, since they were all looking at either the delta- or theta-bands. Somatosensory evoked potentials (SSEP or SEP) can be evoked by many different stimuli in several sensory modalities, and they are used in order to observe the pathways of signals in the neocortex, since they are evoked in early processing. In the analyzed articles they were used in behavioral experiments. Amplitude modulation (AM) and frequency modulation (FM) were used in Henry et al. (2014) to modulate speech sound to match specific amplitudes and frequencies. They were interested in analyzing syllabic and prosodic information in speech and they wanted a means to do this using several spectral bands simultaneously. By separating AM and FM in speech sounds, they managed to observe a lack of speech processing by the brain, which they concluded was due to the fact that the brain needs to entrain to both the AM and FM of speech in order to process it. Another speech entrainment analysis technique used was independent component analysis (ICA) (viz. by Horton et al., 2013). ICA is a computational method for pinpointing from where a signal originates. 
It is useful when the characteristics of the signal are known but the signal’s origin is unknown. Horton et al. (2013) used ICA to locate the cortices involved in attention vs. inattention to competing speech sources. ICA use was only reported in this article, which could indicate that it is not commonly used in EEG analysis of speech entrainment.
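The difference between the amplitude modulation (AM) and frequency modulation (FM) mentioned above in connection with Henry et al. (2014) can be illustrated with two synthetic tones. This is a minimal sketch of the general signal-processing definitions, not a reconstruction of the stimuli used in that article. The carrier frequency, modulation rate, modulation depth and frequency deviation are all hypothetical values chosen for illustration.

```python
import math


def am_tone(t, carrier_hz, mod_hz, depth):
    """Carrier whose amplitude (envelope) varies at mod_hz."""
    envelope = 1.0 + depth * math.cos(2 * math.pi * mod_hz * t)
    return envelope * math.sin(2 * math.pi * carrier_hz * t)


def fm_instantaneous_hz(t, carrier_hz, mod_hz, deviation_hz):
    """Instantaneous frequency of the FM tone at time t."""
    return carrier_hz + deviation_hz * math.cos(2 * math.pi * mod_hz * t)


def fm_tone(t, carrier_hz, mod_hz, deviation_hz):
    """Carrier whose frequency varies at mod_hz; the amplitude stays constant.

    The phase is the time integral of the instantaneous frequency above.
    """
    phase = (2 * math.pi * carrier_hz * t
             + (deviation_hz / mod_hz) * math.sin(2 * math.pi * mod_hz * t))
    return math.sin(phase)


FS = 1000                                 # sampling rate in Hz (assumption)
times = [n / FS for n in range(2 * FS)]   # two seconds of samples

# Hypothetical parameters: 100 Hz carrier modulated at a delta-band rate of 3 Hz.
am = [am_tone(t, 100.0, 3.0, 0.5) for t in times]
fm = [fm_tone(t, 100.0, 3.0, 20.0) for t in times]

print(max(abs(x) for x in am))  # envelope peaks above 1 but never above 1.5
print(max(abs(x) for x in fm))  # never above 1, since only the frequency varies
```

In the AM tone the slow rhythm is carried by the envelope, whereas in the FM tone the envelope is flat and the slow rhythm is carried by the changing pitch, which is why the two can be manipulated separately.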

Analysis Types Phase-locking value (PLV) is a statistic used to quantify task-induced variation in long-range neural synchronization (or any coordinated activation of different brain regions) in EEG data. Since the phase-locking phenomenon was mentioned in many of the articles, this is a potentially relevant technique, and it should be used when analyzing more than one cortical region, such as the interrelationship between motor functions and early auditory processing, which was observed in several articles. Fast Fourier transforms (FFT) are also commonly applied to EEG data, something that was looked into specifically by Zhou et al. (2016). Fourier analysis decomposes a signal into sinusoidal (trigonometric) components of different frequencies, which makes the spectral content of the signal easier to inspect. An FFT is an algorithm that efficiently computes the discrete Fourier transform. One example of its use in the field of neural entrainment is when neural responses appear at the harmonics rather than at the actual stimulus frequency, so that instead of an evoked response to a rhythmic stimulus at f Hz, the neural response is at 2f, 3f and so on. FFTs are used to detect such response signals. Characteristic frequency (CF) is a theory suggested by Thwaites et al. (2017), and it was only reported in that article. CFs are generalized combinations of several different frequencies. The authors based a computer model on tonotopic representation (i.e. a tone-based cortical mapping of neurons), under the assumption that cortical activity uses CFs in early processing stages to represent instantaneous loudness, before these are transformed to short-term loudness. In the study, they found that the brain uses channel-specific entrainment in early processing, and CFs in later stages (45-100 ms and 165-275 ms respectively).
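The use of an FFT to detect responses at the harmonics 2f and 3f of a stimulus rhythm, as described above, can be sketched as follows. The example uses a small pure-Python radix-2 FFT so that it is self-contained, whereas a real analysis would normally rely on an established numerical library. The sampling rate, stimulus rate and the simulated "response" are hypothetical and do not come from any of the reviewed articles.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


FS = 64        # sampling rate in Hz (assumption)
N = 256        # 4 s of data, so the frequency resolution is FS / N = 0.25 Hz
STIM_HZ = 2.0  # hypothetical stimulus rhythm f

# Simulated response with energy only at the harmonics 2f and 3f.
signal = [
    1.0 * math.sin(2 * math.pi * 2 * STIM_HZ * i / FS)
    + 0.5 * math.sin(2 * math.pi * 3 * STIM_HZ * i / FS)
    for i in range(N)
]
spectrum = fft(signal)


def amplitude_at(freq_hz):
    """Single-sided amplitude at the FFT bin closest to freq_hz."""
    k = round(freq_hz * N / FS)
    return 2.0 * abs(spectrum[k]) / N


for harmonic in (1, 2, 3):
    print(harmonic, round(amplitude_at(harmonic * STIM_HZ), 3))
```

In this simulated case the spectrum shows essentially no amplitude at the stimulus rate f itself but clear peaks at 2f and 3f, which is the pattern the text describes. The frequencies were chosen to fall exactly on FFT bins; with real EEG data, spectral leakage and noise would have to be handled as well.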

4.2.3 Theory Factors (v), regarding cortical areas, and (vi), regarding theories about neural entrainment to speech, are included in this section.

4.2.3.1 Cortical Loci Since this is a review exclusively of EEG studies, and since EEG is a technique with very low spatial resolution, it was not surprising that most articles did not report which cortical areas were or seemed to be involved in the observed neuronal activity. Instead, the EEG analyses in the articles focused on the characteristics of the signals under scrutiny. Some articles (e.g. Power, Colling, Mead, Barnes & Goswami, 2016; Zoefel & VanRullen, 2015a) did mention cortical areas in their discussion sections, but merely in relation to theories of speech processing regarding which cortical regions were most likely involved. The low frequency of reported regions throughout the articles indicates that cortical regions are generally not particularly important to studies of this nature. Furthermore, when cortical regions were mentioned at all, it was most often in general terms, such as “prefrontal areas” and “primary auditory cortex”. Thwaites et al. (2017) was an outlier in this regard. They reported bilateral activity in and around Heschl’s gyrus (HG) and the dorso-lateral sulcus (DLS), the latter of which according to the authors is involved in CF entrainment to auditory stimuli, and bilateral activity in the superior temporal sulcus (STS), which is involved in channel-specific auditory stimuli entrainment along with DLS activity. Another example was Power et al. (2016), who listed previously reported anatomical regions responsible for tracking low-frequency speech envelopes as HG, planum temporale (PT), STS, superior temporal gyrus (STG), the inferior parietal lobule (IPL), inferior frontal gyrus (IFG), and angular gyrus (AG). In both of these articles, however, the authors merely mentioned these regions by name and did not thoroughly discuss their exact roles in speech processing, and when they briefly discussed such roles, it was only to mention observations from previous studies.

4.2.3.2 Theoretical Basis A lengthy analysis of Ding & Simon’s (2014) four general theories of the underlying cortical entrainment to speech was brought up in the introduction section, but this factor will account in more detail for the theoretical motivations and suggestions for further analysis found in all the articles.


First and foremost, without exception, the articles all maintained that speech is inherently temporal, and that entrainment to most speech details is limited to rather slow frequencies, between the delta-band (1-4 Hz) and the gamma-band (>30 Hz). They were furthermore in agreement about the respective speech details contained within each of the four spectral bands (delta, theta, beta and gamma). As is also implied by factor (v), the cortical regions involved in speech processing were so sparingly described that any discrepancy between theories regarding these is difficult to assess in this analysis. The theoretical differences lay instead in the characteristics of auditory signal processing. Peelle & Davis (2012), notably in the early days of speech entrainment studies, discussed the roles of cortical oscillations that temporally coincide with speech rhythm. Their observations led them to conclude, besides that speech is a temporal phenomenon, that brain oscillations “show phase-locking to low-frequency information in speech” (Peelle & Davis, 2012), and that speech signals related to the intelligibility of speech are processed predominantly by the left hemisphere, whereas acoustic amplitude modulations are largely bilaterally processed. Regarding their second conclusion, they theorized that the observed phase-locking of neural oscillations might be involved in a hierarchy of speech processing mechanisms. Later studies have since observed more thorough details about speech processing, such as a review by Zoefel & VanRullen (2015b), focusing on phase entrainment to speech.
The development in research since Peelle & Davis’s review enabled Zoefel & VanRullen (2015b) to draw more detailed conclusions. First, phase entrainment is modulated by attention and predictions, which are in turn supported by top-down signals, implying that higher-level processes are involved in the brain’s adjustment to speech. Second, the fact that phase entrainment to speech is observable without systematic fluctuations in either sound amplitude or spectral content implies both “a passive steady-state “ringing” of the cochlea” (Zoefel & VanRullen, 2015b) and higher-level processes causing these cortical oscillations. Finally, there was, as of the date of the article, evidence potentially indicating that speech intelligibility works as a modulator of the behavioral responses to entrainment, instead of directly as a modifier of the strength of entrainment in auditory regions. However uncertain the mechanism behind speech entrainment may be, they did hold for certain that it reflects a complex mechanism consisting of several high-level processes which align neural oscillations in order to predict sound signals of high relevance, and that these processes work even when the speech is masked by distractor noise. Furthermore, Ding & Simon (2014) point out that speech intelligibility lies last in the chain of mechanisms in speech processing, and most likely utilizes high-level features of the speech signal, which would need to be processed earlier. However, they also mention that speech intelligibility and envelope entrainment are hard to distinguish: the two almost always covary, so designing experiments that separate them is difficult. They conclude from the studies mentioned that theta-band entrainment and speech intelligibility seem to follow each other, while delta-band entrainment is affected differently.
“It remains unclear”, they state, “about whether speech intelligibility causally modulates cortical entrainment or that auditory encoding, reflected by cortical entrainment, influences downstream language processing and therefore become indirectly related to intelligibility” (Ding & Simon, 2014). It should be pointed out that this relates to the discrepancies Ding & Simon (2014) describe between models of active and passive speech entrainment, because the observations of Zoefel & VanRullen (2015b) corroborate and further develop their contention into a more detailed picture of which parts of speech entrainment are active. Other articles also confirmed the active role of speech entrainment: Kong et al. (2015) showed that attention is necessary for distinguishing speech from two voices, and Zoefel & VanRullen (2015a) showed that low-level features of speech are entrained early, whereas high-level features are entrained late and outside of the auditory cortex. Another hypothesis that is highly relevant to the purpose of this review, even though it is more than a decade old, is known as asymmetric sampling in time (AST), originally proposed by Poeppel (2003). It states that “speech processing is a bilateral effort, but each hemisphere preferentially processes certain timescales” (Poeppel, 2003). The left hemisphere handles short timescales, which are needed for discriminating places of articulation etc., and the right hemisphere handles slower features such as the envelope. Evidence by Horton et al. (2013), however, contradicts the lateralization part of this hypothesis. They found hemispheric preferences in both hemispheres, and therefore concluded instead that their data showed distinctions between early and late cortical responses: the right hemisphere responses were stronger for early speech processing, whereas the left hemisphere had stronger responses for later processing stages. Importantly, they furthermore found evidence showing that attention networks interfered with the phase of slow oscillations to suppress competing rhythms.
Furthermore on this subject, Zhou et al. (2016) investigated the role of Fourier analysis in studies of neural entrainment. Importantly for the question of whether speech intelligibility plays an active or passive role, they concluded that either neural generators have intrinsic dynamics that elicit “learning” behavior by the brain, or a continuous tracking mechanism is responsible for the phenomenon: “When low-frequency neural entrainment emerges at the fundamental frequency of the stimulus rhythm, it indicates that the influences of previous stimuli do not die out when the next stimulus comes. In other words, previous stimuli set the (neural) context in which a new stimulus will be processed” (Zhou et al., 2016). They furthermore suggested that transient responses to discrete events are unlikely to be the cause of observed neural entrainment. As for behavioral data, Henry et al. (2014) looked at how behavior is influenced by neural oscillatory dynamics in response to auditory stimuli of naturalistic complexity, specifically at phase-phase effects (i.e. how several frequency bands are entrained to each other) on behavioral performance. To test their hypothesis, they constructed synthetic speech by mimicking the AM rate of syllabic information (at 5.075 Hz) and the FM rate of prosodic information (at 3.1 Hz). They then presented narrow-band noise stimuli carrying both modulations simultaneously, and included near-threshold targets which the participants had to detect, in order to test their attention levels. They failed to observe 5.075-Hz peaks when analyzing AM-aligned brain signals only, and hypothesized that this was because the human auditory cortex needs to process the AM and FM rhythms of a perceived signal in combination.
For further research, they suggested using rhythm as a means to organize neural oscillations, because this would simplify neural phase effects on behavior. Finally, several articles (e.g. Celma-Miralles et al., 2016; Crosse et al., 2015; Power et al., 2012) looked at multimodal speech processing, specifically audiovisual (AV) entrainment. Crosse et al. (2015) concluded that congruent AV stimuli increased the cortical processing of speech compared to either auditory or visual stimuli alone, and they surmised that the mechanism responsible for this enhancement worked best within the rate of syllabic information (2-6 Hz). Celma-Miralles et al. (2016) let skilled musicians listen to auditory stimuli which were isochronous (temporally synchronized with regard to rhythm) with visual stimuli. Both stimuli repeated at 2.4 Hz, and the study also considered this frequency’s first harmonic (4.8 Hz), its binary subharmonic (1.2 Hz) and its ternary subharmonic (0.8 Hz). The results showed an amplitude enhancement at 0.8 Hz in the ternary condition for both modalities, and they concluded that this indicated meter induction across modalities.
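The frequency relationships in the Celma-Miralles et al. (2016) design can be illustrated with a small sketch. The Python code below is not taken from any of the reviewed articles. It builds a toy signal (an assumption made purely for illustration) with one component at the 2.4 Hz beat and a weaker one at the ternary subharmonic, and reads off the amplitude at each frequency of interest with a single-bin Fourier projection. The sampling rate, duration, component amplitudes and all names are hypothetical.

```python
import cmath
import math


def amplitude_at(signal, freq_hz, fs_hz):
    """Amplitude of the sinusoidal component at freq_hz (single-bin DFT)."""
    n = len(signal)
    acc = sum(
        x * cmath.exp(-2j * math.pi * freq_hz * k / fs_hz)
        for k, x in enumerate(signal)
    )
    return 2.0 * abs(acc) / n


# Hypothetical recording parameters, chosen so that every target
# frequency completes a whole number of cycles in the window.
FS_HZ = 100.0
DURATION_S = 10.0
BEAT_HZ = 2.4

TARGETS_HZ = {
    "first harmonic": 2 * BEAT_HZ,       # 4.8 Hz
    "beat": BEAT_HZ,                     # 2.4 Hz
    "binary subharmonic": BEAT_HZ / 2,   # 1.2 Hz
    "ternary subharmonic": BEAT_HZ / 3,  # 0.8 Hz
}

# Toy "response": beat component plus a weaker ternary-meter component.
n_samples = int(FS_HZ * DURATION_S)
toy_signal = [
    1.0 * math.sin(2 * math.pi * TARGETS_HZ["beat"] * k / FS_HZ)
    + 0.5 * math.sin(2 * math.pi * TARGETS_HZ["ternary subharmonic"] * k / FS_HZ)
    for k in range(n_samples)
]

amplitudes = {
    name: amplitude_at(toy_signal, freq, FS_HZ)
    for name, freq in TARGETS_HZ.items()
}

for name, value in amplitudes.items():
    print(f"{name:>20s} ({TARGETS_HZ[name]:.1f} Hz): {value:.3f}")
```

Because every target frequency completes a whole number of cycles in the 10 s window, the projection recovers the component amplitudes without spectral leakage. Real EEG data would additionally require artifact handling and an estimate of the noise floor around each frequency.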


5. Discussion

In this section the findings relevant to the research questions of this review are discussed, followed by a detailed description of an EEG study based on these findings. The section then concludes with general comments on the review’s previous parts.

5.1 Findings

The purpose of this review was to find data relevant to the research questions, which were defined with the aim of designing a study that analyzes the mechanisms the brain uses to distinguish between human voices. The data analyzed in the results section are summarized here, including, for each relevant factor, a short motivating explanation. Some of the data support the observations with more certainty than others, so the observations are divided into two lists. The first list contains factors with a high degree of certainty in the evidence, and the second contains factors with a lower degree of certainty or where there is a research gap that needs to be filled.

Factors of high certainty:

1) Participants should be healthy with regard to speech perception (viz. no dyslexia, no old age symptoms, no underdeveloped hearing ability), of mixed gender, between 18 and 37 years old, preferably with a mean age of 25, and right-handed. Experiments should contain at least 8 participants, but preferably 23.
2) Several electrodes positioned at appropriate locations to record well-identified sources of given EEG frequencies should function as alignment criteria for electrode positioning in EEG analysis. At least 2-3 electrodes per known cortical area are recommended.
3) The most efficient methods for analysis of EEG data with the intention to detect neural entrainment to speech are FFTs and aSSRs.
4) The frequency bands most likely to yield relevant data of speech entrainment activity are delta, theta, beta and gamma.
5) Neural entrainment to speech consists of several components and not all components use the same mechanism. Some are passive and some are active in the speech processing mechanism.
6) Attention is necessary for distinguishing speech from other sound signals.
7) Attention is also necessary for differentiating voices.
8) Attention furthermore uses high-level details in the auditory signal to distinguish human voices, and the cortical areas involved in this process lie outside the auditory cortex, in the prefrontal cortex.

Factors of low certainty and research gaps:

a) Exactly how the multiple mechanisms of speech processing work, and how they cooperate to accomplish the complete processing, is partly unknown.
b) It is possible, but not clear, that both frequency and amplitude of the sound signal are necessary for speech detection, since only data from indirect analysis in the analyzed experiments led to this observation.
c) Exactly which cortical areas should be investigated is disputed, due to contradictory theoretical bases for which areas are involved in speech processing, in both early and late stages. It is not clear whether areas outside the auditory cortex ought to be analyzed alone in order to detect voice distinction, or whether such analysis needs to be combined with areas involved in early speech processing.
d) Speech intelligibility may influence entrainment in ways that affect distinction between voices, but this is not certain, since intelligibility lies late in the chain of speech processing and voice recognition does not necessarily need high-level features.

5.2 Suggestions for Further Research

The end-goal of this review necessitates that a methodology for analysis of neural activity and data interpretation, one that can measure the relevant activity in question, is isolated and clearly described. The conclusive factors in the result section can be used to outline just such a study, with the potential to empirically answer the research questions of this review. In accordance with the factors, the participants should number at least 8 but preferably 23, be healthy with regard to speech perception and speech production, be between 18 and 37 years old, of mixed gender, right-handed, and finally native speakers of the language used. The right-handedness deserves some explanation, since it is critically important that participants be right-handed in order to normalize the analyses, something that has been confirmed by many studies, including some of the articles included in this review. The reason is that left-handed and right-handed persons tend to have their speech centers in different hemispheres: right-handed persons’ speech centers are almost exclusively located in the left hemisphere, whereas left-handed persons’ are located either in the right hemisphere or in both. EEG should be used as the method of analysis, and the electrode positioning scheme should be the International 10-20 system, preferably with 2-3 electrodes over each relevant cortical center. The cortical areas analyzed should be the auditory cortex and areas in the prefrontal cortex responsible for behavioral speech processing, as well as centers for attention. The frequency bands of interest will in some experiments be the delta-theta bands, and in some experiments also include the beta-gamma bands. In all experiments, as few bands as are needed to detect the target entrainment should be used, in order to minimize reading errors due to interference from non-relevant stimuli.
The brain’s ability to detect human voices coincides with neural entrainment to speech details, which are easily isolated using these frequencies, but some features that will be analyzed require limiting the analysis to only some of the bands. The proposed study using this investigative framework will comprise two tests, described below. The first test will contain one experiment and the second two experiments. In the first test, the stimulus used to elicit NFB will consist in part of recordings of human speech with different voices, and in part of artificial sound whose envelope is normalized in both frequency and amplitude to the speech envelope of the speech recording, since both amplitude and frequency are necessary for speech processing. The artificial sound will be continuous and last long enough to detect the difference in neural activity between intelligible speech and unintelligible sound resembling speech in frequency and amplitude. In this experiment, only delta-theta band frequencies will be analyzed, since the experiment requires distinguishing speech intelligibility, and this is, according to the results listed in factor (vi), most likely observed within these bands. The artificial sound can consist of a modulated version of the speech recording itself, such that participants cannot understand the speech, which will be verified by means of both participant self-reports and observed neural activity. This procedure will be carried out with a noise vocoding script which will not be described in this review. The reason for using these two recordings is to be able to distinguish the neural activity of speech intelligibility from that of entrainment to the speech envelope. This is necessary since both of these tasks use attention and are closely interrelated in the speech processing chain. In order to detect the brain’s separation of two different voices, the analysis must contain information regarding which data belongs to which of these two mechanisms, since they are likely to contain overlapping information, which could skew the results with regard to the marker. After this has been achieved, the acquired data will be used as a marker from which to detect differences between two different human voices using speech. In other words, what is unclear and in need of investigation from this point on is which features in the NFB data contain information about how the brain distinguishes voices. Using the above framework, another test, consisting of two experiments, will be conducted to analyze this. In experiment 1, subjects will listen to one voice that is familiar to them and one that is unfamiliar.
Both voices will read two different texts in two separate recordings, and the order in which subjects hear the voices and recordings will be randomized into four groups to control for training effects. EEG analysis will be conducted to observe speech entrainment and specifically to look for differences in entrainment between the two voices. The hypothesis is that differences in entrainment will indicate differences in the processes used for recognition of a given voice. Experiment 2 is a control for potentially irrelevant data, such as emotional attachment to the familiar voice instead of mere distinction of voices. In this experiment, subjects will instead listen to an unfamiliar voice, followed by the same voice reading a different text. The texts will be the same as in experiment 1 and the order of the texts will be randomized in the same way as well. Half of the participants will start with experiment 1 and the other half with experiment 2. Finally, the data analysis used to detect these markers should be based on PLVs, and perhaps also FFTs, in order to pick out synchronicities between the sound envelopes and the corresponding neural activity. These techniques are the most efficient for pattern detection in oscillatory data, especially PLVs, which are mostly used in cases involving more than one cortical center.
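As an illustration of the PLV analysis proposed above, the following Python sketch computes a phase-locking value between two phase time series, one standing in for the speech envelope and one for an EEG channel. It is a minimal sketch under stated assumptions, not a method taken from the reviewed articles: the phases are generated synthetically, whereas real data would first have to be band-pass filtered and converted to instantaneous phase (for example with a Hilbert transform), a step omitted here. The function name, the 4 Hz rate, the 0.8 rad lag and the sample counts are all hypothetical.

```python
import cmath
import math
import random


def phase_locking_value(phases_a, phases_b):
    """PLV = |mean(exp(i * (phase_a - phase_b)))|, a value between 0 and 1."""
    if len(phases_a) != len(phases_b) or not phases_a:
        raise ValueError("phase series must be non-empty and of equal length")
    total = sum(cmath.exp(1j * (a - b)) for a, b in zip(phases_a, phases_b))
    return abs(total) / len(phases_a)


# Hypothetical parameters: a 4 Hz (theta-range) envelope sampled at 100 Hz.
FS_HZ = 100.0
RATE_HZ = 4.0
N_SAMPLES = 1000

# Instantaneous phase of the (synthetic) speech envelope.
envelope_phase = [2 * math.pi * RATE_HZ * k / FS_HZ for k in range(N_SAMPLES)]

# An "entrained" channel follows the envelope with a constant phase lag.
locked_phase = [p - 0.8 for p in envelope_phase]

# A "non-entrained" channel has phases unrelated to the envelope.
rng = random.Random(0)
unlocked_phase = [rng.uniform(-math.pi, math.pi) for _ in range(N_SAMPLES)]

plv_locked = phase_locking_value(envelope_phase, locked_phase)
plv_unlocked = phase_locking_value(envelope_phase, unlocked_phase)

print(f"PLV, constant lag:     {plv_locked:.3f}")
print(f"PLV, unrelated phases: {plv_unlocked:.3f}")
```

A constant phase lag gives a PLV close to 1 regardless of the size of the lag, while unrelated phases give a value near 0. In the proposed experiments, the quantity of interest would be the difference in PLV between the familiar and the unfamiliar voice condition.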

5.3 General Discussion

The choice to focus this review on neural/cortical entrainment to speech, in order to search for a method to analyze how the brain’s ability to distinguish human voices could be reverse engineered, is seemingly the only aim available given the present data on the subject. In order to artificially distinguish one voice from another, the specific qualities of voices as distinct from other sound signals must first be established, and the research gaps in the neurological data are far too great for alternative starting points to be relevant at present. Other potential approaches, such as cognitive modelling, require far more certainty in their neurological basis, as this review indicates. This alternative approach is nonetheless worth considering for similar research in the near future, given the quick pace at which this area has developed since the outset of neural entrainment research. Indeed, some of the experiments analyzed in this review, such as Henry et al. (2014), do make great use of computer models to guide their analysis of brain functions, though it is still instructive that the only articles with non-significant results in their experiments were those in which computer models were used as inspiration for the EEG analysis. It is quite easy to motivate why a literature review is needed as a basis for conducting any study of neural entrainment to speech, since the field is both new and quickly changing. It is therefore necessary to grasp “the lay of the land” before heading into a potential dead end. As reported by Rogala et al. (2016), almost 40% of the EEG experiments they analyzed failed to show results for neural entrainment, and half of the behavioral response experiments were unsuccessful. This review illustrates that these poor results are mainly due to a poorly understood mechanism for speech processing. The fact that so many different techniques are used, both in gathering and analyzing the data, also implies uncertainty, and experimental results are correspondingly unsuccessful to some extent. The implication to draw from these facts is that positive results are still not guaranteed, but that the next step must nevertheless be to conduct a practical experiment in order to refine the existing theories regarding this topic.

5.4 Method Discussion

Research focusing on neural entrainment mechanisms in general, rather than on speech entrainment specifically, would benefit from including the articles which were excluded by the third exclusion criterion in this review, since they provided valuable evidence for mechanisms of neural entrainment not discussed in the included articles. The aim of this review concerns such intricate details of speech entrainment that they had to be excluded here nonetheless. The choice of databases needs to be commented upon, since six databases were originally used, yet only one yielded relevant results other than those of the finally used database, Europe PMC. The other databases were originally used with the intention to ascertain that nothing potentially important from perspectives other than strictly neuroscientific evidence was excluded in this review. For this reason, most of the other databases contain articles in either the psychology or medicine fields, and this also explains their poor yields in the searches, which used very narrow terms with regard to research fields. Finally, the choice of distinguishing between literature reviews and experiment articles can perhaps be called into question as an unusual and possibly even seemingly unscientific approach. In answer to such critique, it was done in order to increase the precision of isolating the most important factors without closing any doors to other important information, especially since such a thorough analysis of all articles was carried out. Also, since great care was taken to ascertain the reliability of the journals the review articles were published in, it was decided that the benefits outweighed the drawbacks. It can, of course, still be argued that a complete bottom-up approach might have yielded the same results more reliably.


5.5 Theory Discussion

The theoretical underpinnings of the research in neural entrainment to speech are, as already mentioned in several places, as yet inconclusive and need to be further investigated in order to fully understand the mechanisms involved in speech processing. The theoretical suggestion offered by Ding & Simon (2014) is still valid in the light of the other articles analyzed here. The inconsistencies lie most prominently in which processes are active and which are passive, as well as in the order in which the series of actions is carried out: do high-level or low-level processes come first, and is there even a hierarchical sequence at all? Their conclusion regarding these inconsistencies is furthermore supported by the data gathered in this review, namely that there does not seem to be any complete hierarchical structure of these processes; instead, the system seems more flexible. This flexibility is most likely due to the fact that the brain needs to be adaptive in a complex and unpredictable environment, for the same reason that top-down and bottom-up processes need to cooperate depending on the nature of the stimuli in need of interpretation: a quickly oncoming car can be deadly if top-down processes are not used to localize where the sound signal is coming from, whereas the rigidity of top-down processes is far too clunky to efficiently pick out syllabic details of human speech. In other words, speech processing has many different uses depending on the context, and therefore the brain must have a systematic flexibility. That is not to say that there seems to be no system at all, but merely that such a semi-hierarchical network contains flexibility enough to handle these interchanges between top-down and bottom-up processing. This has important implications for the purpose of this review, since no data offers, with certainty, suggestions as to how to observe the communication of the involved neural networks in and between their corresponding functional centers.
If there is no known priority order, it will most likely be difficult to draw conclusions about any observed EEG activity. As reported in the results section, both Ding & Simon (2014) and Zoefel & VanRullen (2015a) observed that speech intelligibility might not interact in downstream order only: high-level intelligibility processes seem to facilitate the earlier stages of speech processing in the auditory cortex as well.


References

Bornmann L; Pudovkin A. I. (2016). The Journal Impact Factor Should Not Be Discarded. Journal of Korean Medical Science, 32(2):180-182.
Cantin M; Muñoz M; Roa I. (2015). Comparison between Impact Factor, Eigenfactor Score, and SCImago Journal Rank Indicator in Anatomy and Morphology Journals. International Journal of Morphology, 33(3), 1183-1188, 2015.
Celma-Miralles A; de Menezes R. F; Toro J. M. (2016). Look at the Beat, Feel the Meter: Top-Down Effects of Meter Induction on Auditory and Visual Modalities. Frontiers in Human Neuroscience, vol. 10, March 2016, article 108.
Crosse M. J; Butler J. S; Lalor E. C. (2015). Congruent Visual Speech Enhances Cortical Entrainment to Continuous Auditory Speech in Noise-Free Conditions. The Journal of Neuroscience, October 21, 35(42):14195-14204.
Diamandis E. P. (2017). The Journal Impact Factor is under attack – use the CAPCI factor instead. BMC Medicine, 2017, 15:9.
Diamond E; Zhang Y. (2016). Cortical processing of phonetic and emotional information in speech: A cross-modal priming study. Neuropsychologia, 82, (2016), 110-122.
Ding N; Simon J. Z. (2014). Cortical entrainment to continuous speech: functional roles and interpretations. Frontiers in Human Neuroscience, vol. 8, May 2014, article 311.
Gao Y; Wang Q; Ding Y; Wang C; Li H; Wu X; Qu T; Li L. (2017). Selective Attention Enhances Beta-Band Cortical Oscillation to Speech under “Cocktail-Party” Listening Conditions. Frontiers in Human Neuroscience, vol. 11, February 2017, article 34.
Ghitza O. (2012). On the role of theta-driven syllabic parsing in decoding speech: intelligibility of speech with a manipulated modulation spectrum. Frontiers in Psychology, vol. 3, July 2012, article 238.
Giraud A-L; Poeppel D. (2012). Cortical oscillations and speech processing: emerging computational principles and operations. Nature Neuroscience, 15(4), 511-517.
Henry M. J; Obleser J. (2013). Dissociable Neural Response Signatures for Slow Amplitude and Frequency Modulation in Human Auditory Cortex. PLOS ONE, vol. 8, issue 10, October 2013, e78758.
Henry M. J; Herrmann B; Obleser J. (2014). Entrained neural oscillations in multiple frequency bands comodulate behavior. PNAS, vol. 111, no. 41, October 14, 2014, 14935-14940.
Horton C; D’Zmura M; Srinivasan R. (2013). Suppression of competing speech through entrainment of cortical oscillations. Journal of Neurophysiology, 109; 3082-3093, 2013.
Johnson Jr. C. R; Sethares W. A; Klein A. G. (2011). Software Receiver Design: Build Your Own Digital Communication System in Five Easy Steps. Cambridge University Press.
Kazanina N; Phillips C; Idsardi W. (2006). The influence of meaning on the perception of speech sounds. PNAS, vol. 103, no. 30, July 25, 2006. The National Academy of Sciences of the USA.
Kong Y-Y; Somarowthu A; Ding N. (2015). Effects of Spectral Degradation on Attentional Modulation of Cortical Auditory Responses to Continuous Speech. Journal of the Association for Research in Otolaryngology, 16, 2015, 783-796.
Lehongre K; Morillon B; Giraud A-L; Ramus F. (2013). Impaired auditory sampling in dyslexia: further evidence from combined fMRI and EEG. Frontiers in Human Neuroscience, vol. 7, August 2013, article 454.
Mañana-Rodriquez J. (2015). A critical review of SCImago Journal & Country Rank. Research Evaluation, vol. 24, issue 4, 343-354.
Moon I. J; Hong S. H. (2014). What is Temporal Fine Structure and Why is it Important? Korean Journal of Audiology, April 2014, 18(1): 1-7.
Pantaleone J. (2002). Synchronization of Metronomes. American Journal of Physics, vol. 70, 2002, pp. 992–1000.
Parkkonen L; Andersson J; Hämäläinen M; Hari R. (2008). Early visual brain areas reflect the percept of an ambiguous scene. PNAS, vol. 105, no. 51, December 23, 2008, 20500-20504.
Peelle J. E; Davis M. H. (2012). Neural oscillations carry speech rhythm through to comprehension. Frontiers in Psychology, vol. 3, September 2012, article 320.
Poeppel D. (2003). The analysis of speech in different temporal integration windows: cerebral lateralization as ‘asymmetric sampling in time’. Speech Communication, vol. 41, issue 1, August 2003, pp. 245-255. Elsevier Science B.V.
Power A. J; Colling L. J; Mead N; Barnes L; Goswami U. (2016). Neural encoding of the speech envelope by children with developmental dyslexia. Brain & Language, 160 (2016), 1-10.
Power A. J; Mead N; Barnes L; Goswami U. (2012). Neural entrainment to rhythmically presented auditory, visual, and audio-visual speech in children. Frontiers in Psychology, vol. 3, July 2012, article 216.
Rimmele J. M; Sussman E; Poeppel D. (2015). The role of temporal structure in the investigation of sensory memory, auditory scene analysis, and speech perception: A healthy-aging perspective. International Journal of Psychophysiology, February 2015, 95(2), 175-183.
Rogala J; Jurewicz K; Paluch K; Kublik E; Cetnarski R; Wróbel A. (2016). The Do’s and Don’ts of Neurofeedback Training: A Review of the Controlled Studies Using Healthy Adults. Frontiers in Human Neuroscience, vol. 10, June 2016, article 301.
Rubel E. W; Fritzsch B. (2002). Auditory system development: primary auditory neurons and their targets. Annual Review of Neuroscience, 25: 51–101.
Sandberg K; Barnes G. R; Bahrami B; Kanai R; Overgaard M; Rees G. (2014). Distinct MEG correlates of conscious experience, perceptual reversals and stabilization during binocular rivalry. NeuroImage, 100 (2014), 161-175.
Schultz T. L. (2011). Technical Tips: MRI Compatible EEG Electrodes: Advantages, Disadvantages, and Financial Feasibility in a Clinical Setting. The Neurodiagnostic Journal, 52:69-81, 2012.
Scimago Journal & Country Rank. (2017-04-09). Help. Retrieved from: http://www.scimagojr.com/help.php#understand_countries
Seglen P. O. (1997). Why the impact factor of journals should not be used for evaluating research. BMJ, 1997;314:498-502.
Thwaites A; Schlittenlacher J; Nimmo-Smith I; Marslen-Wilson W. D; Moore B. C. J. (2017). Tonotopic representation of loudness in the human cortex. Hearing Research, 344, 2017, 244-254.
Wang X; Lu T; Liang L. (2003). Cortical processing of temporal modulations. Speech Communication, 41, 2003, 107–121.
Wolfram S. (1985). Complex Systems Theory. The Institute for Advanced Study. Princeton NJ 08540.
Yang Z. (2014). Molecular Evolution: A Statistical Approach. Oxford University Press.
Young H. D; Freedman R. A; Ford A. L. (2008). University Physics with Modern Physics, 12th edition. Pearson Education, Inc.
Zhou H; Melloni L; Poeppel D; Ding N. (2016). Interpretations of Frequency Domain Analyses of Neural Entrainment: Periodicity, Fundamental Frequency, and Harmonics. Frontiers in Human Neuroscience, vol. 10, June 2016, article 274.
Zoefel B; VanRullen R. (2015a). EEG oscillations entrain their phase to high-level features of speech sound. NeuroImage, 124 (2016), 16-23.
Zoefel B; VanRullen R. (2015b). The Role of High-Level Processes for Oscillatory Phase Entrainment to Speech Sound. Frontiers in Human Neuroscience, vol. 9, December 2015, article 651.


Appendix

Below are the tables listing included and excluded articles. The excluded articles are, as in Figure 1, separated by the three different searches on the Europe PMC database.

A. Included Articles

Table 4. The total included articles for review, showing author names, publication year, article title and journal of publication.

Author(s) | Published | Title | Journal
Rufener, Oechslin, Wöstmann, Dellwo & Meyer | 2016 | Age-Related Neural Oscillation Patterns During the Processing of Temporally Manipulated Speech | Springer Science+Business Media
Crosse, Butler & Lalor | 2015 | Congruent Visual Speech Enhances Cortical Entrainment to Continuous Auditory Speech in Noise-Free Conditions | The Journal of Neuroscience
Ding & Simon | 2014 | Cortical entrainment to continuous speech: functional roles and interpretations | Frontiers in Human Neuroscience
Diamond & Zhang | 2016 | Cortical processing of phonetic and emotional information in speech: A cross-modal priming study | Neuropsychologia
Henry & Obleser | 2013 | Dissociable Neural Response Signatures for Slow Amplitude and Frequency Modulation in Human Auditory Cortex | PLOS ONE
Zoefel & VanRullen (a) | 2015 | EEG oscillations entrain their phase to high-level features of speech sound | NeuroImage
Kong, Somarowthu & Ding | 2015 | Effects of Spectral Degradation on Attentional Modulation of Cortical Auditory Responses to Continuous Speech | Journal of the Association for Research in Otolaryngology
Henry, Herrmann & Obleser | 2014 | Entrained neural oscillations in multiple frequency bands comodulate behavior | PNAS
Henry & Obleser | 2012 | Frequency modulation entrains slow neural oscillations and optimizes human listening behavior | PNAS
Lehongre, Morillon, Giraud & Ramus | 2013 | Impaired auditory sampling in dyslexia: further evidence from combined fMRI and EEG | Frontiers in Human Neuroscience
Zhou, Melloni, Poeppel & Ding | 2016 | Interpretations of Frequency Domain Analyses of Neural Entrainment: Periodicity, Fundamental Frequency, and Harmonics | Frontiers in Human Neuroscience
Celma-Miralles, de Menezes & Toro | 2016 | Look at the Beat, Feel the Meter: Top-Down Effects of Meter Induction on Auditory and Visual Modalities | Frontiers in Human Neuroscience
Power, Colling, Mead, Barnes & Goswami | 2016 | Neural encoding of the speech envelope by children with developmental dyslexia | Brain & Language
Power, Mead, Barnes & Goswami | 2013 | Neural entrainment to rhythmic speech in children with developmental dyslexia | Frontiers in Human Neuroscience
Power, Mead, Barnes & Goswami | 2012 | Neural entrainment to rhythmically presented auditory, visual, and audio-visual speech in children | Frontiers in Human Neuroscience
Peelle & Davis | 2012 | Neural oscillations carry speech rhythm through to comprehension | Frontiers in Psychology
Obleser, Herrmann & Henry | 2012 | Neural oscillations in speech: don't be enslaved by the envelope | Frontiers in Human Neuroscience
Gao et al. | 2017 | Selective Attention Enhances Beta-Band Cortical Oscillations to Speech under "Cocktail-Party" Listening Conditions | Frontiers in Human Neuroscience
Horton, D'Zmura & Srinivasan | 2013 | Suppression of competing speech through entrainment of cortical oscillations | Journal of Neurophysiology
Rogala et al. | 2016 | The Do's and Don'ts of Neurofeedback Training: A Review of the Controlled Studies Using Healthy Adults | Frontiers in Human Neuroscience
Benitez-Burraco & Murphy | 2016 | The Oscillopathic Nature of Language Deficits in Autism: From Genes to Language Evolution | Frontiers in Human Neuroscience
Zoefel & VanRullen (b) | 2015 | The Role of High-Level Processes for Oscillatory Phase Entrainment to Speech Sound | Frontiers in Human Neuroscience
Rimmele, Sussman & Poeppel | 2015 | The role of temporal structure in the investigation of sensory memory, auditory scene analysis, and speech perception: A healthy-aging perspective | International Journal of Psychophysiology
Thwaites, Schlittenlacher, Nimmo-Smith, Marslen-Wilson & Moore | 2017 | Tonotopic representation of loudness in the human cortex | Hearing Research


B. Excluded Articles

Table 5. The excluded articles from the search with keywords “EEG” + “neural oscillation”.
Authors | Published | Title | Journal
Zhang, Zhang, Yu, Liu & Luo | 2017 | Women Overestimate Temporal Duration: Evidence from Chinese Emotional Words. | Frontiers in Psychology
Brinkman et al. | 2016 | Independent Causal Contributions of Alpha- and Beta-Band Oscillations during Movement Selection. | Journal of Neuroscience
Yeom, Kim & Chung | 2016 | Macroscopic Neural Oscillation during Skilled Reaching Movements in Humans. | Computational Intelligence and Neuroscience Journal
Meng et al. | 2016 | EEG Oscillation Evidences of Enhanced Susceptibility to Emotional Stimuli during Adolescence. | Frontiers in Psychology
Lim, Kim, Kim & Chung | 2016 | Increased Low- and High-Frequency Oscillatory Activity in the Prefrontal Cortex of Fibromyalgia Patients. | Frontiers in Human Neuroscience
Kononowicz & van Wassenhove | 2016 | In Search of Oscillatory Traces of the Internal Clock. | Frontiers in Psychology
Lowet, Roberts, Bonizzi, Karel & De Weerd | 2016 | Quantifying Neural Oscillatory Synchronization: A Comparison between Spectral Coherence and Phase-Locking Value Approaches. | PLoS One
Hickok, Farahbod & Saberi | 2015 | The Rhythm of Perception: Entrainment to Acoustic Rhythms Induces Subsequent Perceptual Oscillation. | Psychological Science
Venkatasubramanian | 2015 | Understanding as a disorder of consciousness: biological correlates and translational implications from quantum theory perspectives. | Clinical Psychopharmacological and Neuroscience
Bettinardi, Tort-Colet, Ruiz-Mejias, Sanchez-Vives & Deco | 2015 | Gradual emergence of spontaneous correlated brain activity during fading of general anesthesia in rats: Evidences from fMRI and local field potentials. | NeuroImage
Song et al. | 2015 | Brain amyloid-β burden is associated with disruption of intrinsic functional connectivity within the medial temporal lobe in cognitively normal elderly. | Journal of Neuroscience
Wang et al. | 2014 | Steady-state BOLD response modulates low frequency neural oscillations. | Scientific Reports
Darvas & Hebb | 2014 | Task specific inter-hemispheric coupling in human subthalamic nuclei. | Frontiers in Human Neuroscience
Jones | 2014 | Independent effects of bottom-up temporal expectancy and top-down spatial attention. An audiovisual study using rhythmic cueing. | Frontiers in Integrative Neuroscience
Nakao & Nakazawa | 2014 | Brain state-dependent abnormal LFP activity in the auditory cortex of a schizophrenia mouse model. | Frontiers in Neuroscience
Sasaki et al. | 2013 | Disturbed resting functional inter-hemispherical connectivity of the ventral attentional network in alpha band is associated with unilateral spatial neglect. | PLoS One
Yuan | 2013 | Combining independent component analysis and Granger causality to investigate brain network dynamics with fNIRS measurements. | Biomedical Optics Express
Gardiner | 2012 | Insights into plant consciousness from neuroscience, physics and mathematics: a role for quasicrystals? | Plant Signaling & Behavior
Barthó et al. | 2014 | Ongoing network state controls the length of spindles via inhibitory activity. | Neuron
Christison-Lagay, Gifford & Cohen | 2014 | Neural correlates of auditory scene analysis and perception. | International Journal of Psychophysiology
Akam & Kullmann | 2014 | Oscillatory multiplexing of population codes for selective communication in the mammalian brain. | Nature Reviews Neuroscience
Sato & Yamaguchi | 2009 | Spatial-area selective retrieval of multiple object-place associations in a hierarchical cognitive map formed by theta phase coding. | Cognitive Neurodynamics
Xu, An, Mi & Zhang | 2013 | Impairment of cognitive function and synaptic plasticity associated with alteration of information flow in theta and gamma oscillations in melamine-treated rats. | PLoS One
Karalunas, Huang-Pollock & Nigg | 2013 | Is reaction time variability in ADHD mainly at low frequencies? | Journal of Child Psychology and Psychiatry
Spencer, Barich, Goldberg & Perone | 2012 | Behavioral dynamics and neural grounding of a dynamic field theory of multi-object tracking. | Journal of Integrative Neuroscience
Day, Kinnischtzke, Adam & Nick | 2009 | Daily and developmental modulation of "premotor" activity in the birdsong system. | Developmental Neurobiology
Abbs et al. (conference) | 2012 | The 3rd Schizophrenia International Research Society Conference, 14-18 April 2012, Florence, Italy: summaries of oral sessions. | Schizophrenia Research
Baharnoori et al. (conference) | 2010 | The 2nd Schizophrenia International Research Society Conference, 10-14 April 2010, Florence, Italy: summaries of oral sessions. | Schizophrenia Research

Wang, Wang, Fan, Huang & Zhang | 2017 | Speech-specific categorical perception deficit in autism: An Event-Related Potential study of lexical tone processing in Mandarin-speaking children. | Scientific Reports
Garcia-Garcia et al. | 2017 | COMT and DRD2/ANKK-1 gene-gene interaction account for resetting of gamma neural oscillations to auditory stimulus-driven attention. | PLoS One
Dipoppa, Szwed & Gutkin | 2016 | Controlling Working Memory Operations by Selective Gating: The Roles of Oscillations and Synchrony. | Advances in Cognitive Psychology
Ma et al. | 2016 | Brain Connectivity Variation Topography Associated with Working Memory. | PLoS One
San-Juan et al. | 2016 | Transcranial Alternating Current Stimulation: A Potential Risk for Genetic Generalized Patients (Study Case). | Frontiers in Neurology
Jensen, Spaak & Park | 2016 | Discriminating Valid from Spurious Indices of Phase-Amplitude Coupling. | eNeuro
Lewis, Setsompop, Rosen & Polimeni | 2016 | Fast fMRI can detect oscillatory neural activity in humans. | PNAS
Su et al. | 2016 | A Comparison of Multiscale Permutation Entropy Measures in On-Line Depth of Anesthesia Monitoring. | PLoS One
Akeju et al. | 2016 | Spatiotemporal Dynamics of Dexmedetomidine-Induced Electroencephalogram Oscillations. | PLoS One
de la Salle et al. | 2016 | Effects of Ketamine on Resting-State EEG Activity and Their Relationship to Perceptual/Dissociative Symptoms in Healthy Humans. | Frontiers in Pharmacology
Kanayama, Morandi, Hiraki & Pavani | 2017 | Causal Dynamics of Scalp Oscillation During the Rubber Hand Illusion. | Brain Topology
Padilla-Buritica, Martinez-Vargas & Castellanos-Dominguez | 2016 | Emotion Discrimination Using Spatially Compact Regions of Interest Extracted from Imaging EEG Activity. | Frontiers in Computational Neuroscience
Chang, Liang, Lai, Hung & Juan | 2016 | Theta Oscillation Reveals the Temporal Involvement of Different Attentional Networks in Contingent Reorienting. | Frontiers in Human Neuroscience
Kumagai & Mizuhara | 2016 | Top-down and bottom-up attention cause the ventriloquism effect with distinct electroencephalography modulations. | NeuroReport
Weiner & Dang-Vu | 2016 | Spindle Oscillations in Sleep Disorders: A Systematic Review. | Neural Plasticity
Peng & Tang | 2016 | Pain Related Cortical Oscillations: Methodological Advances and Potential Applications. | Frontiers in Computational Neuroscience
Soh et al. | 2015 | Joint Coupling of Awake EEG Frequency Activity and MRI Gray Matter Volumes in the Psychosis Dimension: A BSNIP Study. | Frontiers in Psychiatry
Duffy, D'Angelo, Rotenberg & Gonzalez-Heydrich | 2015 | Neurophysiological differences between patients clinically at high risk for schizophrenia and neurotypical controls--first steps in development of a biomarker. | BMC Medicine
Sakurai et al. | 2015 | Converging models of schizophrenia--Network alterations of prefrontal cortex underlying cognitive impairments. | Progress in Neurobiology
Kang et al. | 2015 | Brain Networks Responsible for Sense of Agency: An EEG Study. | PLoS One
Tang, Hu, Lei, Li & Chen | 2015 | Frontal and occipital-parietal alpha oscillations distinguish between stimulus conflict and response conflict. | Frontiers in Human Neuroscience
Elliott, Kelly, Friedel, Brodsky & Mulcahy | 2015 | The Golden Section as Optical Limitation. | PLoS One
Wen, Zhou & Li | 2015 | A critical review: coupling and synchronization analysis methods of EEG signal with mild cognitive impairment. | Frontiers in Aging Neuroscience
Behroozmand, Ibrahim, Korzyukov, Robin & Larson | 2015 | Functional role of delta and theta band oscillations for auditory processing during vocal pitch motor control. | Frontiers in Neuroscience
Sullivan, Timi, Hong & O'Donnell | 2015 | Effects of NMDA and GABA-A Receptor Antagonism on Auditory Steady-State Synchronization in Awake Behaving Rats. | The International Journal of Neuropsychopharmacology
Roset, Gant, Prasad & Sanchez | 2014 | An adaptive brain actuated system for augmenting rehabilitation. | Frontiers in Neuroscience
Kuhn et al. | 2011 | Deep brain stimulation in schizophrenia. | Fortschritte der Neurologie - Psychiatrie
Tanaka, Hayashida, Igasaki & Murayama | 2011 | A new method for localizing the sources of correlated cross-frequency oscillations in human . | Conference proceedings - IEEE engineering in medicine and biology society
Hong & Rebec | 2012 | A new perspective on behavioral inconsistency and neural noise in aging: compensatory speeding of neural communication. | Frontiers in Aging Neuroscience

Nunez, Srinivasan & Fields | 2015 | EEG functional connectivity, axon delays and white matter disease. | Clinical Neurophysiology
Diederich, Schomburg & Colonius | 2012 | Saccadic reaction times to audiovisual stimuli show effects of oscillatory phase reset. | PLoS One
Basar | 2013 | Brain oscillations in neuropsychiatric disease. | Dialogues in Clinical Neuroscience
Lee et al. | 2013 | Dipole source localization of mouse electroencephalogram using the Fieldtrip toolbox. | PLoS One
Burke, Merkow, Jacobs, Kahana & Zaghoul | 2015 | Brain computer interface to enhance in human participants. | Frontiers in Human Neuroscience
Wang et al. | 2013 | Modulation of brain electroencephalography oscillations by electroacupuncture in a rat model of postincisional pain. | ECAM
Yuval-Greenberg, Tomer, Keren, Nelken & Deouell | 2008 | Transient induced gamma-band response in EEG as a manifestation of miniature saccades. | Neuron
Kosilo et al. | 2013 | Low-level and high-level modulations of fixational saccades and high frequency oscillatory brain activity in a visual object classification task. | Frontiers in Psychology
Hong, Summerfelt, Mitchell, O'Donnell & Thaker | 2012 | A shared low-frequency oscillatory rhythm abnormality in resting and sensory gating in schizophrenia. | Clinical Neurophysiology
Edgar et al. | 2014 | Cortical thickness as a contributor to abnormal oscillations in schizophrenia? | NeuroImage. Clinical
Qian & Di | 2011 | Phase or amplitude? The relationship between ongoing and evoked neural activity. | The Journal of Neuroscience
Schmitt, Hasan, Gruber & Falkai | 2011 | Schizophrenia as a disorder of disconnectivity. | European Archives of Psychiatry and Clinical Neuroscience
Peters et al. | 2013 | On the feasibility of concurrent human TMS-EEG-fMRI measurements. | Journal of Neurophysiology
Fogel, Martin, Lafortune & Carrier | 2012 | NREM Sleep Oscillations and Brain Plasticity in Aging. | Frontiers in Neurology
Botcharova, Farmer & Berthouze | 2014 | Markers of criticality in phase synchronization. | Frontiers in Systems Neuroscience
Bódizs et al. | 2005 | Prediction of general mental ability based on neural oscillation measures of sleep. | Journal of Sleep Research
Dmochowski, Sajda, Dias & Parra | 2012 | Correlated components of ongoing EEG point to emotionally laden attention - a possible marker of engagement? | Frontiers in Human Neuroscience
McClintock et al. | 2014 | Multifactorial determinants of the neurocognitive effects of electroconvulsive therapy. | The Journal of ECT
Qin, Perdoni & He | 2011 | Dissociation of subjectively reported and behaviorally indexed mind wandering by EEG rhythmic activity. | PLoS One
Guggisberg & Mottaz | 2013 | Timing and of movement decisions: does consciousness really come too late? | Frontiers in Human Neuroscience
Kaser et al. | 2013 | Oscillatory underpinnings of and their relationship with cognitive function in patients with schizophrenia. | PLoS One
Moran & Hong | 2011 | High vs low frequency neural oscillations in schizophrenia. | Schizophrenia Bulletin
Kleen, Wu, Holmes, Scott & Lenck-Santini | 2011 | Enhanced oscillatory activity in the hippocampal-prefrontal network is related to short-term memory function after early-life . | The Journal of Neuroscience
Emadi, Rajimehr & Esteky | 2014 | High baseline activity in inferior temporal cortex improves neural and behavioral discriminability during visual categorization. | Frontiers in Human Neuroscience
Jadi & Sejnowski | 2014 | Regulating Cortical Oscillations in an Inhibition-Stabilized Network. | Proceedings of the IEEE
Rana & Vaina | 2014 | Functional roles of 10 Hz alpha-band power modulating engagement and disengagement of cortical networks in a complex visual motion task. | PLoS One
Kwak et al. | 2010 | Altered resting state cortico-striatal connectivity in mild to moderate stage Parkinson's disease. | Frontiers in Systems Neuroscience
Ding & Simon | 2009 | Neural representations of complex temporal modulations in the human auditory cortex. | Journal of Neurophysiology
Cohen, van Gaal, Ridderinkhof & Lamme | 2009 | Unconscious errors enhance prefrontal-occipital oscillatory synchrony. | Frontiers in Human Neuroscience
Hong et al. | 2010 | Gamma and delta neural oscillations and association with clinical symptoms under subanesthetic ketamine. | Frontiers in Pharmacology
Cottrell et al. | 2013 | Working memory impairment in calcineurin knock-out mice is associated with alterations in synaptic vesicle cycling and disruption of high-frequency synaptic and network activity in prefrontal cortex. | Journal of Neuroscience
Wang, Tseng, Liu & Tsai | 2017 | Neural Oscillation Reveals Deficits in Visuospatial Working Memory in Children With Developmental Coordination Disorder. | Child Development
Tseng, Chang, Chang, Liang & Juan | 2016 | The critical role of phase difference in gamma oscillation within the temporoparietal network for binding visual working memory. | Scientific Reports
Antal & Herrmann | 2016 | Transcranial Alternating Current and Random Noise Stimulation: Possible Mechanisms. | Neural Plasticity
Chang, Bosnyak & Trainor | 2016 | Unpredicted Pitch Modulates Beta Oscillatory Power during Rhythmic Entrainment to a Tone Sequence. | Frontiers in Psychology
Large, Herrera & Velasco | 2015 | Neural Networks for Beat Perception in Musical Rhythm. | Frontiers in
Woods et al. | 2016 | A technical guide to tDCS, and related non-invasive brain stimulation tools. | Clinical Neurophysiology
Thut, Schyns & Gross | 2011 | Entrainment of perceptually relevant brain oscillations by non-invasive rhythmic stimulation of the . | Frontiers in Psychology
Ding & Simon | 2013 | Power and phase properties of oscillatory neural responses in the presence of background activity. | Journal of Computational Neuroscience
Uhlhaas & Singer | 2013 | High-frequency oscillations and the neurobiology of schizophrenia. | Dialogues in Clinical Neuroscience
Skosnik et al. | 2012 | The effect of chronic cannabinoids on broadband EEG neural oscillations in humans. | Neuropsychopharmacology
Capilla, Paze-Alvarez, Darriba, Campo & Gross | 2011 | Steady-state visual evoked potentials can be explained by temporal superposition of transient event-related responses. | PLoS One
Zhang et al. | 2016 | Perceptual Temporal Asymmetry Associated with Distinct ON and OFF Responses to Time-Varying Sounds with Rising versus Falling Intensity: A Magnetoencephalography Study. | Brain Sciences
David et al. | 2016 | Variability of cortical oscillation patterns: A possible endophenotype in autism spectrum disorders? | Neuroscience & Behavioral Reviews
Pammer | 2014 | Temporal sampling in vision and the implications for dyslexia. | Frontiers in Human Neuroscience
Hamid, Gall, Speck, Antal & Sabel | 2015 | Effects of alternating current stimulation on the healthy and diseased brain. | Frontiers in Neuroscience
N/A (Neuropsychopharmacology publication) | 2014 | Poster Session III Wednesday, December 10, 2014. | Neuropsychopharmacology
BMC Neuroscience (conference) | 2016 | 25th Annual Computational Neuroscience Meeting: CNS-2016. | BMC Neuroscience
N/A (ICOSR) | 2013 | Abstracts of the 14th International Congress on Schizophrenia Research (ICOSR). April 21-15, 2013. Orlando Grande Lakes, Florida, USA. | Schizophrenia Bulletin

Table 6. The excluded articles from the search with keywords “EEG” + “cortical oscillation”.
Authors | Published | Title | Journal
Maezawa et al. | 2016 | Modulation of stimulus-induced 20-Hz activity for the tongue and hard palate during tongue movement in humans. | Clinical Neurophysiology
Hawasli et al. | 2016 | Influence of White and Gray Matter Connections on Endogenous Human Cortical Oscillations. | Frontiers in Human Neuroscience
Schmidt, Iyengar, Foulser, Boyle & Fröhlich | 2014 | Endogenous cortical oscillations constrain by weak electric fields. | Brain Stimulation
Subramaniyam, Hyttinen, Hatsopoulos, Ross & Takahashi | 2015 | Recurrence network analysis of multiple bands from the orofacial portion of primary motor cortex. | Conference proceedings - IEEE engineering in medicine and biology society
Hao, He, Xiao, Alstermark & Lan | 2013 | Corticomuscular transmission of signals by propriospinal neurons in Parkinson's disease. | PLoS One
Rogasch, Daskalakis & Fitzgerald | 2013 | Cortical inhibition, excitation, and connectivity in schizophrenia: a review of insights from transcranial magnetic stimulation. | Schizophrenia Bulletin
Datko, Gougelet, Huang & Pineda | 2016 | Resting State Functional Connectivity MRI among Spectral MEG Current Sources in Children on the Autism Spectrum. | Frontiers in Neuroscience
Colonnese | 2014 | Rapid developmental emergence of stable depolarization during by inhibitory balancing of cortical network excitability. | Journal of Neuroscience
Formaggio et al. | 2013 | Frequency and time-frequency analysis of intraoperative ECoG during awake brain stimulation. |


Yousif et al. | 2017 | A Network Model of Local Field Potential Activity in and the Impact of Deep Brain Stimulation. | PLoS Computational Biology
Noda, Kanzaki & Takahashi | 2013 | Stimulus phase locking of cortical oscillation for auditory stream segregation in rats. | PLoS One
Sellers, Bennett & Fröhlich | 2015 | Frequency-band signatures of visual responses to naturalistic input in ferret primary visual cortex during free viewing. | Brain Research
Brumberg & Guenther | 2014 | Development of speech prostheses: current status and recent advances. | Expert Review of Medical Devices
Berchicci et al. | 2011 | Development of mu rhythm in infants and preschool children. | Developmental Neuroscience
Saalmann | 2014 | Intralaminar and medial thalamic influence on cortical synchrony, information transmission and cognition. | Frontiers in Systems Neuroscience
Doesburg et al. | 2011 | Magnetoencephalography reveals slowing of resting peak oscillatory frequency in children born very preterm. | Pediatric Research
Gardner, Hughes & Jones | 2013 | Differential spike timing and phase dynamics of reticular thalamic and prefrontal cortical neuronal populations during sleep spindles. | Journal of Neuroscience
Raver, Haughwout & Keller | 2013 | Adolescent cannabinoid exposure permanently suppresses cortical oscillations in adult mice. | Neuropsychopharmacology
Amzica & Steriade | 1995 | Short- and long-range neuronal synchronization of the slow (< 1 Hz) cortical oscillation. | Journal of Neurophysiology
Chang & Shyu | 2013 | Anterior Cingulate epilepsy: mechanisms and modulation. | Frontiers in Integrative Neuroscience
Partos, Cropper & Rawlings | 2016 | You Don't See What I See: Individual Differences in the Perception of Meaning from Visual Stimuli. | PLoS One
Larson-Prior et al. | 2013 | Adding dynamics to the Human Project with MEG. | NeuroImage
Kargieman, Riga, Artigas & Celada | 2012 | Clozapine Reverses Phencyclidine-Induced Desynchronization of Prefrontal Cortex through a 5-HT(1A) Receptor-Dependent Mechanism. | Neuropsychopharmacology
Brown, Henny, Bolam & Magill | 2009 | Activity of neurochemically heterogeneous dopaminergic neurons in the substantia nigra during spontaneous and driven changes in brain state. | The Journal of Neuroscience
Krause & Cohen Kadosh | 2013 | Can transcranial electrical stimulation improve learning difficulties in atypical brain development? A future possibility for cognitive training. | Developmental Cognitive Neuroscience
Heitmann, Boonstra & Breakspear | 2013 | A dendritic mechanism for decoding traveling waves: principles and applications to motor cortex. | PLoS Computational Biology
Ahissar & Arieli | 2012 | Seeing via Miniature Eye Movements: A Dynamic Hypothesis for Vision. | Frontiers in Computational Neuroscience
Barthó et al. | 2007 | Cortical control of zona incerta. | Journal of Neuroscience
Wright & Bourke | 2013 | On the dynamics of cortical development: synchrony and synaptic self-organization. | Frontiers in Computational Neuroscience
Lee, Sen & Kopell | 2009 | Cortical gamma rhythms modulate NMDAR-mediated spike timing dependent plasticity in a biophysical model. | PLoS Computational Biology
Neuenschwander, Castelo-Branco, Baron & Singer | 2002 | Feed-forward synchronization: propagation of temporal patterns along the retinothalamocortical pathway. | Philosophical Transactions of the Royal Society
Llinás & Ribary | 1993 | Coherent 40-Hz oscillation characterizes state in humans. | PNAS
Barthó, Payne, Freund & Acsády | 2004 | Differential distribution of the KCl cotransporter KCC2 in thalamic relay and reticular nuclei. | European Journal of Neuroscience
Terman, Bose & Kopell | 1996 | Functional reorganization in thalamocortical networks: transition between spindling and delta sleep rhythms. | PNAS
Song, Vanneste & De Ridder | 2015 | Dysfunctional noise cancelling of the rostral anterior cingulate cortex in tinnitus patients. | PLoS One
Melloni et al. | 2015 | Cortical dynamics and subcortical signatures of motor-language coupling in Parkinson's disease. | Scientific Reports
Pennucci et al. | 2016 | Loss of Either Rac1 or Rac3 GTPase Differentially Affects the Behavior of Mutant Mice and the Development of Functional GABAergic Networks. | Cerebral Cortex
Lustenberger, Boyle, Foulser, Mellin & Fröhlich | 2015 | Functional role of frontal alpha oscillations in creativity. | Cortex
Schalk | 2015 | A general framework for dynamic cortical function: the function-through-biased-oscillations (FBO) hypothesis. | Frontiers in Human Neuroscience
Tallus, Lioumis, Hämäläinen, Kähkönen & Tenovuo | 2013 | Transcranial magnetic stimulation-electroencephalography responses in recovered and symptomatic mild traumatic brain injury. | Journal of Neurotrauma
Li, Long & Yang | 2015 | Hippocampal-prefrontal circuit and disrupted functional connectivity in psychiatric and neurodegenerative disorders. | BioMed Research International
Goncalves, Anstey, Golshani & Portera-Cailliau | 2013 | Circuit level defects in the developing neocortex of Fragile X mice. | Nature Neuroscience
Broccard et al. | 2014 | Closed-loop brain-machine-body interfaces for noninvasive rehabilitation of movement disorders. | Annals of Biomedical Engineering
Woltering, Jung, Liu & Tannock | 2012 | Resting state EEG oscillatory power differences in ADHD college students and their peers. | Behavioral and Brain Functions
Ferrarelli et al. | 2012 | Reduced natural oscillatory frequency of frontal thalamocortical circuits in schizophrenia. | Archives of General Psychiatry
Lee, Yu, Wu & Chen | 2011 | Do resting brain dynamics predict oddball evoked-potential? | BMC Neuroscience
Wang, Norton, Hutchinson, Ives & Mirsattari | 2012 | Spontaneous EEG-Functional MRI in Mesial Temporal Lobe Epilepsy: Implications for the Neural Correlates of Consciousness. | Epilepsy Research and Treatment
Koupparis, Kokkinos & Kostopoulos | 2013 | Spindle power is not affected after spontaneous K-complexes during human NREM sleep. | PLoS One
Shiogai, Dhamala, Oshima & Hasler | 2012 | Cortico-cardio-respiratory network interactions during anesthesia. | PLoS One
Lee et al. | 2011 | The influence of transporter polymorphisms on cortical activity: a resting EEG study. | BMC Neuroscience
Bruce, Bruce, Ramanand & Hayes | 2011 | Progressive changes in cortical state before and after spontaneous from sleep in elderly and middle-aged women. | Neuroscience
Ugawa, Hanajima, Terao & Kanazawa | 2003 | Exaggerated 16-20 Hz motor cortical oscillation in patients with positive or negative myoclonus. | Clinical Neurophysiology
Rovó et al. | 2014 | Phasic, nonsynaptic GABA-A receptor-mediated inhibition entrains thalamocortical oscillations. | Journal of Neuroscience
Kameyama et al. | 2003 | Effect of phospholipase Cbeta4 lacking in thalamic neurons on electroencephalogram. | Biomedical and Biophysical Research Communications
Beenhakker & Huguenard | 2009 | Neurons that fire together also conspire together: is normal sleep circuitry hijacked to generate epilepsy? | Neuron
Zanto et al. | 2011 | Age-related changes in orienting attention in time. | Journal of Neuroscience

Saletin & Walker | 2012 | Nocturnal mnemonics: sleep and hippocampal memory processing. | Frontiers in Neurology
Steriade & Timofeev | 2004 | Neuronal plasticity and thalamocortical sleep and waking oscillations. | Progress in Brain Research
Amzica & Steriade | 1998 | Electrophysiological correlates of sleep delta waves. | Electroencephalography and Clinical Neurophysiology
Mima, Steger, Schulman, Gerloff & Hallett | 2000 | Electroencephalographic measurement of motor cortex control of muscle activity in humans. | Clinical Neurophysiology
Huang et al. | 2010 | Spiral wave dynamics in neocortex. | Neuron
Steriade & Amzica | 1998 | Slow sleep oscillation, rhythmic K-complexes, and their paroxysmal developments. | Journal of Sleep Research
Contreras, Destexhe & Steriade | 1997 | Spindle oscillations during cortical spreading depression in naturally sleeping cats. | Neuroscience
Vanhatalo et al. | 2004 | Infraslow oscillations modulate excitability and interictal epileptic activity in the human cortex during sleep. | PNAS
Radhakrishnan, Wilkinson & D'Souza | 2014 | Gone to Pot - A Review of the Association between Cannabis and Psychosis. | Frontiers in Psychiatry
Steriade, Nunez & Amzica | 1993 | Intracellular analysis of relations between the slow (< 1 Hz) neocortical oscillation and other sleep rhythms of the electroencephalogram. | The Journal of Neuroscience
Furth, Mastwal, Wang, Buonanno & Vullhorst | 2013 | Dopamine, cognitive function, and gamma oscillations: role of D4 receptors. | Frontiers in Cellular Neuroscience
Steriade, Contreras, Curro & Nunez | 1993 | The slow (< 1 Hz) oscillation in reticular thalamic and thalamocortical neurons: scenario of sleep rhythm generation in interacting thalamic and neocortical networks. | The Journal of Neuroscience
Inan, Petros & Anderson | 2013 | Losing your inhibition: linking cortical GABAergic to schizophrenia. | Neurobiology of Disease
Kiss et al. | 2011 | Role of Thalamic Projection in NMDA Receptor-Induced Disruption of Cortical Slow Oscillation and Short-Term Plasticity. | Frontiers in Psychology
Steriade | 1994 | Sleep oscillations and their blockage by activating systems. | Journal of Psychiatry & Neuroscience
Fa-Hsuan | 2004 | Spectral spatiotemporal imaging of cortical oscillations and interactions in the human brain. | NeuroImage
Yang, Solis-Escalante, van de Ruit, van der Helm & Schouten | 2016 | Nonlinear Coupling between Cortical Oscillations and Muscle Activity during Isotonic Wrist Flexion. | Frontiers in Computational Neuroscience
Wasilczuk, Proekt, Kelz & McKinstry-Wu | 2016 | High-density Electroencephalographic Acquisition in a Rodent Model Using Low-cost and Open-source Resources. | Journal of Visualized Experiments
Quandt et al. | 2016 | Spectral Variability in the Aged Brain during Fine Motor Control. | Frontiers in Aging Neuroscience
Boonstra, Nikolin, Meisener, Martin & Loo | 2016 | Change in Mean Frequency of Resting-State Electroencephalography after Transcranial Direct Current Stimulation. | Frontiers in Human Neuroscience
Kim et al. | 2016 | Neural substrates predicting short-term improvement of tinnitus loudness and distress after modified tinnitus retraining therapy. | Scientific Reports
Bellesi, Riedner, Garcia-Molina, Cirelli & Tononi | 2014 | Enhancement of sleep slow waves: underlying mechanisms and practical consequences. | Frontiers in Systems Neuroscience
Chen, Madhavan, Rapoport & Anderson | 2011 | A method for real-time cortical oscillation detection and phase-locked stimulation. | Engineering in Medicine and Biology Society, EMBC
Kim et al. | 2015 | Cortically projecting basal forebrain parvalbumin neurons regulate cortical gamma band oscillations. | PNAS
Ambrus et al. | 2015 | Bi-frontal transcranial alternating current stimulation in the ripple range reduced overnight forgetting. | Frontiers in Cellular Neuroscience
Fujimoto et al. | 2012 | Changes in Event-Related Desynchronization and Synchronization during the Auditory Oddball Task in Schizophrenia Patients. | The Open Journal
Fröhlich, Sellers & Cordle | 2015 | Targeting the neurophysiology of cognitive systems with transcranial alternating current stimulation. | Expert Review of Neurotherapeutics
Steriade | 2004 | Slow-wave sleep: serotonin, neuronal plasticity, and seizures. | Archives Italiennes de Biologie
Cho et al. | 2016 | Cortical Responses and Shape Complexity of Stereoscopic Image - A Simultaneous EEG/MEG Study. | Neuro-Signals
Olcese & Faraguna | 2015 | Slow cortical rhythms: from single-neuron to whole-brain imaging in vivo. | Archives Italiennes de Biologie
Huang, Su & Hwang | 2014 | Rate control and quality assurance during rhythmic force tracking. | Behavioral Brain Research
Balconi & Finocchiaro | 2015 | Decisional impairments in cocaine addiction, reward bias, and cortical oscillation "unbalance". | Neuropsychiatric Disease and Treatment
Choi, Yu, Lee & Llinás | 2015 | Altered thalamocortical rhythmicity and connectivity in mice lacking CaV3.1 T-type Ca2+ channels in unconsciousness. | PNAS
Zhang et al. | 2016 | Perceptual Temporal Asymmetry Associated with Distinct ON and OFF Responses to Time-Varying Sounds with Rising versus Falling Intensity: A Magnetoencephalography Study. | Brain Sciences
David et al. | 2016 | Variability of cortical oscillation patterns: A possible endophenotype in autism spectrum disorders? | Neuroscience & Behavioral Reviews
Pammer | 2014 | Temporal sampling in vision and the implications for dyslexia. | Frontiers in Human Neuroscience
Hamid, Gall, Speck, Antal & Sabel | 2015 | Effects of alternating current stimulation on the healthy and diseased brain. | Frontiers in Neuroscience
N/A (Neuropsychopharmacology publication) | 2014 | Poster Session III Wednesday, December 10, 2014. | Neuropsychopharmacology
N/A (annual meeting) | 2015 | AES 2014 Annual Meeting - Online Abstract Supplement. | Epilepsy Currents

Table 7. The excluded articles from the search with keywords “EEG” + “neural entrainment”.
Authors | Published | Title | Journal
Teki | 2016 | A Citation-Based Analysis and Review of Significant Papers on Timing and | Frontiers in Human Neuroscience
Hamm, Gilmore, Picchetti, Sponheim & Clementz | 2011 | Abnormalities of Neuronal Oscillations and Temporal Integration to Low and High Frequency Auditory Stimulation in Schizophrenia | Biological Psychiatry
Ding & Simon | 2013 | Adaptive Temporal Encoding Leads to a Background Insensitive Cortical Representation of Speech | Journal of Neuroscience
Allen & Williams | 2011 | Consciousness, plasticity, and connectomics: the role of intersubjectivity in human cognition | Frontiers in Psychology
Doelling & Poeppel | 2015 | Cortical entrainment to music and its modulation by expertise | PNAS
Ding, Melloni, Zhang, Tian & Poeppel | 2016 | Cortical Tracking of Hierarchical Linguistic Structures in Connected Speech | Nature Neuroscience
Benichov, Globerson & Tchernichovski | 2016 | Finding the Beat: From Socially Coordinated Vocalizations in Songbirds to Rhythmic Entrainment in Humans | Frontiers in Human Neuroscience
Okajima & Yotsumoto | 2016 | Flickering task–irrelevant distractors induce dilation of target duration depending upon cortical distance | Scientific Reports


Barlow et al. | 2014 | Frequency-Modulated Orocutaneous Stimulation Promotes Non-nutritive Suck Development in Preterm Infants with Respiratory Distress Syndrome or Chronic Lung Disease | Journal of Perinatology
Peelle, Gross & Davis | 2013 | Phase-Locked Responses to Speech in Human Auditory Cortex are Enhanced During Comprehension | Cerebral Cortex
Marsh & Campbell | 2016 | Processing Complex Sounds Passing through the Rostral Brainstem: The New Early Filter Model | Frontiers in Neuroscience
Malcolm, Lavine, Kenyon, Massie & Thaut | 2008 | Repetitive transcranial magnetic stimulation interrupts phase synchronization during rhythmic motor entrainment | Neuroscience Letters
Sunderam et al. | 2009 | Seizure entrainment with polarizing low frequency electric fields in a chronic animal epilepsy model | Journal of Neural Engineering
Lawrence, Harper, Cooke & Schnupp | 2014 | Temporal predictability enhances auditory detection | The Journal of the Acoustical Society of America
Patel | 2014 | The Evolutionary Biology of Musical Rhythm: Was Darwin Wrong? | PLOS Biology
Reedijk, Bolders & Hommel | 2013 | The impact of binaural beats on creativity | Frontiers in Human Neuroscience
Herrmann, Rach, Neuling & Strüber | 2013 | Transcranial alternating current stimulation: a review of the underlying mechanisms and modulation of cognitive processes | Frontiers in Human Neuroscience
Safron | 2016 | What is orgasm? A model of sexual trance and climax via rhythmic entrainment | Socioaffective Neuroscience & Psychology
Opitz et al. | 2016 | Spatiotemporal structure of intracranial electric fields induced by transcranial electric stimulation in humans and nonhuman primates | Scientific Reports
Sunderam, Gluckman, Reato & Bikson | 2010 | Toward rational design of electrical stimulation strategies for epilepsy control | Frontiers in Human Neuroscience
Zanto, Chadick & Gazzaley | 2014 | Anticipatory alpha phase influences visual working memory performance | NeuroImage
Bestmann & Feredoes | 2014 | Combined neurostimulation and neuroimaging in cognitive neuroscience: past, present, and future | Annals of the New York Academy of Sciences
Mathias, Lidji, Honing, Palmer & Peretz | 2016 | Electrical Brain Responses to Beat Irregularities in Two Cases of Beat Deafness | Frontiers in Neuroscience
Swann et al. | 2015 | Elevated Synchrony in Parkinson's Disease Detected with Electroencephalography | Annals of Neurology
Garcia, Grossman & Srinivasan | 2011 | Evoked potentials in large-scale cortical networks elicited by TMS of the visual cortex | Journal of Neurophysiology
Noury, Hipp & Siegel | 2016 | Physiological processes non-linearly affect electrophysiological recordings during transcranial electric stimulation | NeuroImage
Schmidt et al. | 2012 | Progressive enhancement of alpha activity and visual function in patients with optic neuropathy: A two-week repeated session alternating current stimulation study | Brain Stimulation
Keller, Novembre & Hove | 2014 | Rhythm in joint action: psychological and neurophysiological mechanisms for real-time interpersonal coordination | Philosophical Transactions of the Royal Society
Ronconi, Pincham, Cristoforetti, Facoetti & Szücs | 2016 | Shaping prestimulus neural activity with auditory rhythmic stimulation improves the temporal allocation of attention | NeuroReport
Ludwig et al. | 2016 | Spectral EEG abnormalities during vibrotactile encoding and quantitative working memory processing in schizophrenia | NeuroImage: Clinical
Brenner et al. | 2009 | Steady State Responses: Electrophysiological Assessment of Sensory Function in Schizophrenia | Schizophrenia Bulletin
Nozaradan, Peretz, Missal & Mouraux | 2011 | Tagging the Neuronal Entrainment to Beat and Meter | The Journal of Neuroscience
Crosse, Liberto, Bednar & Lalor | 2016 | The Multivariate Temporal Response Function (mTRF) Toolbox: A MATLAB Toolbox for Relating Neural Signals to Continuous Stimuli | Frontiers in Human Neuroscience
Meltzer et al. | 2015 | The steady-state response of the cerebral cortex to the beat of music reflects both the comprehension of music and attention | Frontiers in Human Neuroscience
O'Sullivan, Crosse, Liberto & Lalor | 2017 | Visual Cortical Entrainment to Motion and Categorical Speech Features during Silent Lipreading | Frontiers in Human Neuroscience
Snijders, Milivojevic & Kemner | 2013 | Atypical excitation–inhibition balance in autism captured by the gamma response to contextual modulation | NeuroImage: Clinical
Hamm, Gilmore & Clementz | 2012 | Augmented gamma band auditory steady-state responses: Support for NMDA hypofunction in schizophrenia | Schizophrenia Research
Nozaradan, Zerouali, Peretz & Mouraux | 2013 | Capturing with EEG the Neural Entrainment and Coupling Underlying Sensorimotor Synchronization to the Beat | Cerebral Cortex
Cutini, Szücs, Mead, Huss & Goswami | 2016 | Atypical right hemisphere response to slow temporal modulations in children with developmental dyslexia | NeuroImage
Nozaradan | 2014 | Exploring how musical rhythm entrains brain activity with electroencephalogram frequency-tagging | Philosophical Transactions of the Royal Society

Sato | 2013 | Fast entrainment of human electroencephalogram to a theta-band photic flicker during successful memory encoding | Frontiers in Human Neuroscience
Nozaradan, Peretz & Keller | 2016 | Individual Differences in Rhythmic Cortical Entrainment Correlate with Predictive Behavior in Sensorimotor Synchronization | Scientific Reports
Cirelli, Spinelli, Nozaradan & Trainor | 2016 | Measuring Neural Entrainment to Beat and Meter in Infants: Effects of Music Background | Frontiers in Neuroscience
Tierney & Kraus | 2014 | Neural Entrainment to the Rhythmic Structure of Music | Journal of Cognitive Neuroscience
Norton, Beach & Gabrieli | 2015 | Neurobiology of Dyslexia | Current Opinion in Neurobiology
Libertus, Brannon & Woldorff | 2011 | Parallels in Stimulus-Driven Oscillatory Brain Responses to Numerosity Changes in Adults and Seven-Month-Old Infants | Developmental Neuropsychology
Horr, Wimber & Luca | 2016 | Perceived time and temporal structure: Neural entrainment to isochronous stimulation increases duration estimates | NeuroImage
Cagnan et al. | 2013 | Phase dependent modulation of tremor amplitude in essential tremor through thalamic stimulation | Brain - A Journal of Neurology
A collection of abstracts to studies | N/A | A lot | International Congress on Schizophrenia Research
