From the Epilepsy Center Erlangen
Head: Prof. Dr. med. Hermann Stefan
of the Department of Neurology with Outpatient Clinic of the Friedrich-Alexander-Universität Erlangen-Nürnberg
Director: Prof. Dr. med. Stefan Schwab

Conducted at the

Helen Wills Neuroscience Institute of the University of California, Berkeley, USA
Director: Robert T. Knight, M.D.
Supervisor: Aurélie Bidet-Caulet, Ph.D.

Auditory Selective Attention: an introduction and evidence for distinct facilitation and inhibition mechanisms

Inaugural dissertation for the award of the doctoral degree of the Faculty of Medicine of the Friedrich-Alexander-Universität Erlangen-Nürnberg

submitted by Constanze Elisabeth Anna Mikyska from Munich

Printed with the permission of the Faculty of Medicine of the Friedrich-Alexander-Universität Erlangen-Nürnberg

Dean: Prof. Dr. med. Dr. h.c. J. Schüttler

Referee: Prof. Dr. med. H. Stefan

Co-referee: Prof. Dr. med. Dipl.-Psych. Ch. Lang

Date of the oral examination: 29 February 2012

To my family

Table of contents

1 Summary
1.1 Summary
1.2 Zusammenfassung (German summary)

2 Introduction
2.1 Auditory system: anatomy and function
2.1.1 Ear
2.1.2 Sub-cortical auditory relays
2.1.3 Auditory cortex
2.2 Investigation of auditory perception and processing
2.2.1 Psychophysics – psychoacoustics
2.2.2 Brain activity – electroencephalogram (EEG)
2.2.2.1 Introduction and history
2.2.2.2 Physiological fundamentals
2.2.2.3 Recording
2.2.2.4 Classification of frequency
2.2.2.5 Artifacts
2.2.2.6 Data analysis: preprocessing and event-related potentials (ERP)
2.2.2.7 Main auditory electrophysiological components
2.3 Auditory attention
2.3.1 Psychological theories
2.3.1.1 Introduction to selective attention
2.3.1.2 Bottleneck theories: early- versus late-selection
2.3.1.3 Other capacity-limitation theories
2.3.2 Electrophysiological findings and theories
2.4 Aims of this dissertation

3 Material and methods
3.1 Subjects
3.2 Stimuli and task
3.3 Procedure
3.4 EEG recording
3.5 EEG data analysis
3.6 Statistical analysis
3.6.1 Selection of applied methods
3.6.1.1 Analysis of variance (ANOVA)
3.6.1.2 Statistical permutation test
3.6.2 Behavioral data
3.6.3 ERP standards
3.6.4 ERP deviants

4 Results
4.1 Behavioral data
4.2 ERP results of standards
4.2.1 Main attention effect (attended versus ignored)
4.2.2 Influence of the memory task difficulty on attention effects
4.2.3 Timing of attention facilitation and inhibition
4.2.4 Topographies of attention facilitation and inhibition
4.3 ERP results of deviants
4.3.1 Attention enhancement of deviant processing
4.3.2 Memory effect on the P3 component

5 Discussion

6 References

7 List of abbreviations

8 Publication

9 Acknowledgements

10 Curriculum vitae


1 Summary
1.1 Summary
Objective
Auditory selective attention is a complex brain function that is still not completely understood. The classic example is the so-called “cocktail party effect” (Cherry, 1953), which describes the impressive ability to focus one’s attention on a single voice among a multitude of voices. This means that particular stimuli in the environment are enhanced relative to others of lower priority, which are ignored. To understand how attention can influence the perception and processing of sound, background knowledge is essential. One aim of this dissertation is therefore to provide an overview of the existing literature: the auditory system and different methods to measure and evaluate auditory processes are introduced first, followed by a review of competing theories that try to explain how auditory attention operates. The second aim of the dissertation is to specify the mechanisms and to elucidate how they operate. It is generally accepted that distinct signals (top-down signals) are important for cognitive control, enabling selective attention and leading to enhanced processing of task-relevant information. It is unknown, however, whether facilitation and inhibition of stimulus processing are based upon one mechanism (a unitary gain control mechanism of facilitation) or two mechanisms (the net activity of distinct top-down facilitation and inhibition mechanisms). Results from a visual fMRI study (de Fockert, 2001) suggest that facilitation and inhibition rely on distinct mechanisms that would be differentially affected by the availability of cognitive resources (i.e. for performing a task). To show that facilitation and inhibition represent distinct mechanisms in auditory selective attention, we conducted a study in which subjects performed an auditory attention task while the amount of available cognitive resources was modulated (by varying the difficulty of a memory task).
Methods
Electrophysiological experiments were conducted in young healthy adults. Sixteen subjects performed an attention task and a memory task of varying difficulty (no, easy and difficult memory) at the same time (dual task protocol) while EEG was recorded. Facilitation and inhibition were measured by comparing electrophysiological responses to attended and ignored sounds with responses to the same sounds when attention was considered to be equally distributed across all sounds.
Results
Two ERP components were observed: a negative one in response to attended sounds and a positive one in response to ignored sounds. The two frontally distributed components had distinct timing and scalp topographies and were differentially affected by the difficulty of the memory task.
Conclusion
This dissertation provides an insight into the literature on auditory selective attention and enriches the existing knowledge with the results of a new study on its operating mechanisms. The study provides evidence that top-down attention control can operate via distinct facilitation and inhibition mechanisms.


1.2 Zusammenfassung (German summary)
Background and aims
Selective auditory attention is a complex mechanism that is not yet fully understood. The classic example is the so-called “cocktail party effect” (Cherry, 1953). It describes the impressive ability to concentrate attention on a single speaker and to tune out other conversations. This means that certain stimuli in our environment are perceived in an enhanced manner, whereas stimuli of lower priority are ignored. To understand how attention influences the perception and processing of stimuli, the first part of this dissertation gives an overview of the basic literature. The auditory system is presented first, and different methods for measuring and assessing auditory processing are introduced. This is followed by a brief overview of competing theories that attempt to explain how selective auditory attention works. The second part of this work deals in more detail with the question of the mechanisms and how they operate. It is generally accepted that certain signals (top-down signals) are important for cognitive control. They activate selective auditory attention and thus lead to enhanced processing of a relevant stimulus. It is still unclear, however, whether facilitation and inhibition of stimulus processing are governed by one mechanism (a unitary, linear gain mechanism of facilitation) or by two mechanisms (network activity of independent facilitation and inhibition). Results of a visual fMRI study show that the degree to which distracting stimuli are inhibited depends on the availability of cognitive resources (e.g. for solving problems) (de Fockert, 2001). These results indicate that facilitation and inhibition in the visual system are based on different mechanisms that are differentially affected by the availability of cognitive resources. To show that facilitation and inhibition act independently of each other, we conducted a study in which subjects performed an auditory attention test while the availability of cognitive resources was varied (different difficulty levels in a memory test).


Methods
Electrophysiological experiments were conducted with 16 young, healthy adults. The subjects simultaneously (dual task protocol) performed an attention test and a memory test with varying difficulty levels (no, easy and difficult memory) while electrophysiological signals (EEG) were recorded. Facilitation and inhibition were measured by comparing the responses to the attended and to the ignored stimuli, respectively, with the responses to the same stimuli in a control condition. In this control condition, attention was assumed to be directed evenly towards all stimuli.
Results and observations
Two ERP components were observed: a negative one in response to the attended stimuli and a positive one following the ignored stimuli. The two components showed different frontal scalp topographies and also differed in the temporal domain. In addition, they were differentially affected by the difficulty of the memory test.
Practical conclusions
This dissertation offers an insight into the literature on selective auditory attention and enriches the existing knowledge with the results of a new study on its operating mechanisms. The study provides evidence that top-down control reflects the activity of mutually independent facilitation and inhibition mechanisms.


2 Introduction
The auditory system processes acoustic waves, leading to auditory percepts. An important issue is to understand how attention can influence the processing, and thereby the perception, of sound; in other words, by which mechanisms and at which step of sound processing auditory attention operates. To address this question, several basic principles are introduced first: (1) the anatomy of the auditory system and the sequence of sound processing from the outer ear to the auditory cortices, (2) different methods to measure and evaluate auditory processes (especially electroencephalography), and (3) auditory attention and the attempts of psychological and physiological theories to elucidate its influence on sound processing. Finally, (4) the aims of the present study are introduced.

2.1 Auditory system: anatomy and function
The auditory system is a remarkable sensory system. Both ears are involved in detecting sound from all directions, regardless of the organism’s current orientation. Processing already takes place while information about the stimuli is transmitted along complex sub-cortical relays to the auditory cortex. The final processing and interpretation occur in the auditory cortex and in surrounding higher order areas.

2.1.1 Ear
Outer ear
Sound waves first reach the outer ear, which is composed of the pinna (auricle), the ear canal (external acoustic meatus) and the eardrum (tympanic membrane). The pinna, the visible part of the outer ear, collects and focuses sound waves, and directs them through the ear canal (approximately 30 to 35 mm long and 7 mm in diameter) to the eardrum, which transmits sound vibrations to the middle ear (see Figure 2.1).


Figure 2.1 – The anatomy of the ear (adapted from Netter, 2006).

Middle ear
The middle ear is an air-filled cavity containing different muscles and the three ossicles (malleus, incus and stapes). There are also two openings that link the middle ear to the inner ear via membranes: the oval (vestibular) window, adjoining the perilymph in the scala vestibuli, and the round (cochlear) window, connecting to the perilymph in the scala tympani (see Figure 2.1). The malleus is attached to the inner surface of the tympanic membrane and transmits the arriving vibration to the incus and the stapes, which is attached to the membrane of the oval window. This small bone is stabilized by the stapedius muscle, which controls the amplitude of sound waves by pulling the stapes away from the oval window and thereby protects the inner ear from high noise levels (Trepel, 2008). The tensor tympani muscle functions in a similar manner by pulling the malleus, thus tensing the tympanic membrane. From a physical point of view, two mechanisms increase the efficiency of sound transmission (Schmidt, 1993): (1) the smaller surface of the membrane of the oval window compared to the surface of the tympanic membrane causes an enhancement of pressure, and (2) the lever system of the ossicles matches the low impedance of the air in the middle ear to the high impedance of the fluid in the inner ear. The middle ear functions only as long as the tympanic cavity is ventilated and its pressure is matched to that of the atmosphere. This is ensured by the Eustachian tube, which links the middle ear to the nasopharynx. An upper airway infection can cause swelling and occlusion of the tube, which can result in an ear infection as well as in a rupture of the tympanic membrane caused by a pathological pressure difference (Schmidt, 1993).
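The two mechanisms of sound transmission in the middle ear can be illustrated with a short calculation. The following is only a minimal sketch of an idealized model: the membrane areas and the lever ratio are assumed, illustrative values that are not taken from this text, and the function names are hypothetical.

```python
import math

# Assumed illustrative values (NOT from this text); real ears vary.
AREA_TYMPANIC_MEMBRANE_MM2 = 55.0  # effective area of the tympanic membrane
AREA_OVAL_WINDOW_MM2 = 3.2         # area of the oval window membrane
OSSICULAR_LEVER_RATIO = 1.3        # force gain of the ossicular lever system


def middle_ear_pressure_gain(area_tympanic_mm2, area_oval_mm2, lever_ratio):
    """Idealized pressure gain: area ratio multiplied by the lever ratio."""
    if area_tympanic_mm2 <= 0 or area_oval_mm2 <= 0 or lever_ratio <= 0:
        raise ValueError("areas and lever ratio must be positive")
    return (area_tympanic_mm2 / area_oval_mm2) * lever_ratio


def gain_to_decibel(pressure_gain):
    """Express a pressure ratio as a level difference in dB."""
    return 20.0 * math.log10(pressure_gain)


gain = middle_ear_pressure_gain(
    AREA_TYMPANIC_MEMBRANE_MM2, AREA_OVAL_WINDOW_MM2, OSSICULAR_LEVER_RATIO
)
print(f"pressure gain: {gain:.1f}x, i.e. {gain_to_decibel(gain):.1f} dB")
```

Under these assumptions, most of the gain comes from the area ratio, while the lever system contributes a smaller factor.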

Inner ear
The inner ear contains the vestibular system, dedicated to balance and spatial orientation, and the cochlea, which is essential for hearing. The cochlea is part of the osseous labyrinth and turns like a snail two and a half times around a core of bone (modiolus), in which the cochlear nerve runs. This labyrinth is filled with perilymph, a derivative of the cerebrospinal fluid similar to extracellular fluid, and also contains a membranous labyrinth: the cochlear duct (scala media), filled with endolymph (a fluid with a high content of potassium, similar to intracellular fluid). The cochlear duct is bounded by Reissner's membrane above and the basilar membrane below and holds the organ of Corti (organum spirale). This is the sensory organ of hearing and comprises receptor cells (hair cells), different types of supporting cells (cells of Deiters, Hensen, Claudius and Boettcher) and the basilar membrane (see Figure 2.2 A and B). The hair cells are arranged in one row of inner and three rows of outer hair cells and make contact with neurons at their base (see Figure 2.2 B). Additionally, they have stereocilia (hair bundles) on their free surface, which are attached to each other by filamentous structures called tip links (Roberts, 1988). The stereocilia of the outer hair cells are joined to the tectorial membrane, a colloidal membrane that covers the organ of Corti. Furthermore, the cochlear duct separates two structures, the scala vestibuli (above) and the scala tympani (below), which merge at the apex of the cochlea (helicotrema), so that the perilymph can flow from one scala to the other. The organ of Corti sits on top of the basilar membrane along the entire length of the scala media.


Figure 2.2 – Models of the cochlea and the organ of Corti. (A) Cross section through a turn of the cochlea showing the scala vestibuli, the scala tympani and the cochlear duct with the organ of Corti. (B) The anatomical structures of the organ of Corti (adapted from Hawkins, 1997).


If a sound impacts the tympanic membrane and is transmitted from the malleus to the incus and stapes, the arriving vibrations cause the membrane of the oval window to produce pressure waves within the incompressible perilymph of the scala vestibuli. These pressure waves lead to vibrations of Reissner's membrane and, moreover, to deflections of the basilar membrane, a motion known as the travelling wave (Von Bekesy, 1960). As a result, the stereocilia of the inner hair cells are displaced, causing an action potential in the cochlear nerve. Because the basilar membrane is more rigid at the basal aspect of the cochlea than at the apical part, higher frequencies lead to a larger deflection of the basilar membrane at basal parts of the cochlea, whereas lower frequencies are mapped at the apical parts. This spatial arrangement of sound information is called tonotopy (Von Bekesy, 1960) and is more or less maintained throughout the auditory pathway, so that the frequency content of a sound remains decipherable. The hair cells in particular are essential for the generation of action potentials because of their characteristic features: the outer hair cells operate as a (cochlear) amplifier, ensuring the sensitivity and tuning of the cochlea. By active contractions that displace the basilar and tectorial membranes, they can enhance the endolymphatic flow and therefore also increase the travelling wave. They are innervated by efferent nerve fibers from the superior olivary complex (see Figure 2.3). Interestingly, a small part of the kinetic energy of the outer hair cells travels back through the middle ear to the ear canal, where it can be recorded as sound (Kemp, 1978). This is called otoacoustic emission (OAE); its measurement is used to examine the function of the outer hair cells and to screen newborn babies for hearing defects.
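The tonotopic mapping described above can be made concrete with a frequency-position function. The following is a minimal sketch that assumes the Greenwood function with its commonly cited human parameters (A = 165.4 Hz, a = 2.1, k = 0.88). Neither the function nor the constants are taken from this text; they are used here purely for illustration, and the names `greenwood_frequency` and `greenwood_position` are hypothetical.

```python
import math

# Commonly cited human parameters of the Greenwood function (an assumption
# for this sketch, not a statement from this text).
A_HZ = 165.4
ALPHA = 2.1
K = 0.88


def greenwood_frequency(x):
    """Characteristic frequency (Hz) at relative position x along the
    basilar membrane, where x = 0 is the apex and x = 1 is the base."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie between 0 (apex) and 1 (base)")
    return A_HZ * (10.0 ** (ALPHA * x) - K)


def greenwood_position(frequency_hz):
    """Inverse mapping: relative position (0 = apex, 1 = base) for a frequency."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return math.log10(frequency_hz / A_HZ + K) / ALPHA


for f in (125.0, 1000.0, 8000.0):
    print(f"{f:7.0f} Hz -> relative position {greenwood_position(f):.2f}")
```

In this model, low frequencies map close to the apex (positions near 0) and high frequencies close to the base (positions near 1), in line with the arrangement described in the text.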
The adequate stimulus for the inner hair cells is the hydrodynamic force of the endolymph, which moves the freestanding hair bundles (hydrodynamic coupling) (Hudspeth, 1983). The inner hair cell is a secondary receptor cell and cannot generate an action potential itself. Instead, the mechanical stimulus triggers a receptor potential (mechano-electrical transduction), which is converted from an electrical into a chemical signal and transmitted to an afferent neuron. In more detail: if the stereocilia are moved in one direction, tensing the tip links, mechanically gated ion channels at the top of the hair cells open, and positively charged ions (especially potassium) enter the cell and cause a depolarization. This leads to a receptor potential that can occur up to 5000 times per second. The receptor potential opens voltage-gated calcium channels; calcium ions enter the cell and trigger the release of a neurotransmitter (glutamate) at the basal end of the inner hair cell. If the hair bundles are deflected in the other direction, relaxing the tip links, further depolarization is inhibited (Schmidt, 2005). The released glutamate diffuses across the synaptic cleft and binds to the postsynaptic receptor (AMPA receptor), which triggers a postsynaptic potential that causes an action potential in the afferent neuron. This process is called transformation. The number of axons firing and the frequency of the action potentials encode the volume of a sound (amplitude), i.e. a higher volume results in a higher frequency of action potentials.

2.1.2 Sub-cortical auditory relays
The signal is processed along six or more neurons that consecutively form synapses (see Figure 2.3). Coming from the inner hair cells, the signal runs along the afferent nerve fiber to the spiral ganglion (first order neuron), located within the central aspects of the cochlea. Together with the vestibular nerve coming from receptor cells of the vestibular system, the cochlear nerve forms the vestibulocochlear nerve (the VIIIth cranial nerve), which runs through the internal acoustic meatus in the petrous part of the temporal bone and enters the cranium through the porus acusticus internus (Trepel, 2008). The nerves run to the cerebellopontine angle and from there to the brainstem, where the vestibulocochlear nerve splits again and each part runs to its cranial nerve nuclei. The cochlear nuclei consist of the nucleus cochlearis anterior (ventralis) and posterior (dorsalis), both located at the inferior cerebellar peduncle, a part of the medulla oblongata. In these nuclei, not only does the first relay take place (to second order neurons), but the first processing of the sensory input also occurs: an automatic decoding of the basic signal (duration, intensity and frequency). From the nucleus cochlearis anterior, a minor portion of the nerve fibers runs on the ipsilateral side, whereas the major part decussates in the corpus trapezoideum (located in the pons) to the contralateral side (Trepel, 2008). Moreover, the corpus trapezoideum contains the nuclei corporis trapezoidei and the nuclei olivares superiores (superior olivary complex), where the neurons also form synapses (third order neuron). The fibers from the nucleus cochlearis posterior form a minor part of the auditory system and decussate separately from the corpus trapezoideum to the contralateral side, without forming any synapse with other neurons. Additionally, the fibers starting from the nucleus cochlearis posterior comprise partly excitatory and partly inhibitory neurons, which can inhibit processing at subsequent levels of the pathway (Schmidt, 2005). On the contralateral side, all auditory nerve fibers form the lemniscus lateralis, where the neurons form synapses (nuclei lemnisci lateralis; fourth order neurons) and either decussate back to the originally ipsilateral side or run to the colliculus inferior (fifth order neuron), part of the corpora quadrigemina, located in the mesencephalon (Trepel, 2008). At this location, and also in the superior olivary complex, the direction of the sound is analyzed by specialized neurons that compare the timing of action potentials coming from both cochleae. From here, some nerve fibers decussate again to the contralateral side through the brachium colliculi inferioris, but most of them continue to the corpus geniculatum mediale of the thalamus, located in the diencephalon. The geniculate neurons (sixth order neurons) project their axons through the capsula interna to the primary auditory cortex (radiatio acustica). This complex and intensely interconnected pathway is crucial for connecting both cochleae to the left and right auditory cortices, which is important for bilateral processing and for comparing sounds from the right and left side (Schmidt, 2005).
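The comparison of arrival times between both ears, mentioned above for the superior olivary complex and the colliculus inferior, can be illustrated with a simple geometric estimate of the interaural time difference. This is only a sketch under strong assumptions: a distant source, sound travelling in a straight line, an ear distance of 0.18 m and a speed of sound of 343 m/s. None of these values or names come from this text, and diffraction around the head is ignored.

```python
import math

# Assumed illustrative constants (not from this text).
SPEED_OF_SOUND_M_S = 343.0  # speed of sound in air at about 20 degrees Celsius
EAR_DISTANCE_M = 0.18       # hypothetical distance between both ears


def interaural_time_difference(azimuth_deg,
                               ear_distance_m=EAR_DISTANCE_M,
                               speed_m_s=SPEED_OF_SOUND_M_S):
    """Arrival-time difference (seconds) between the two ears for a distant
    source at the given azimuth (0 = straight ahead, 90 = fully to one side).
    Positive values mean the sound reaches the ear on the source side first."""
    path_difference_m = ear_distance_m * math.sin(math.radians(azimuth_deg))
    return path_difference_m / speed_m_s


for azimuth in (0, 15, 45, 90):
    itd_us = interaural_time_difference(azimuth) * 1e6
    print(f"azimuth {azimuth:2d} deg -> time difference {itd_us:6.1f} microseconds")
```

Even for a source located fully to one side, the difference in this model stays well below one millisecond, which indicates how precisely the timing of action potentials from both cochleae has to be compared.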


Figure 2.3 – Simplified scheme of the central auditory pathway. CGM = Corpus geniculatum mediale, HG = Heschl’s gyrus (there can be two Heschl’s gyri, HG1 and HG2), PP = planum polare, PT = planum temporale, STG = superior temporal gyrus, MTG = middle temporal gyrus (adapted from Bidet-Caulet, 2006).


2.1.3 Auditory cortex
The auditory cortex consists of the primary auditory cortex and higher level surrounding areas. The primary auditory cortex (A1, Brodmann area 41) is located on the supratemporal plane of the temporal lobe (Pandya, 1995) and is only visible after removing the frontal and parietal operculum. The medial part of the gyri temporales transversi, also called Heschl’s gyrus after the Austrian anatomist Richard Heschl (1824–1881), forms the major part of the primary auditory cortex. As an expression of anatomical cerebral asymmetry, some individuals have two Heschl’s gyri, mostly on the right side, although morphological variations with the left side being larger have also been postulated (Geschwind, 1968). Most neurons in A1 are organized according to the frequency of the sounds they respond best to (Howard, 1996). Afferent fibers carrying information about low frequencies end more anterolaterally in Heschl’s gyrus, whereas high frequencies are mapped more posteromedially (Trepel, 2008). This frequency map corresponds to the tonotopic organization of the auditory pathway. In addition, there is an organization according to binaural properties. The neurons are arranged in different stripes. For instance, the neurons in one stripe are excited by both ears (EE cells), whereas the neurons in another stripe fire when receiving information from one ear and are inhibited by input from the other ear (EI cells). This organization is comparable to the ocular dominance columns in the primary visual cortex (V1) (Purves, 1997). The higher order auditory areas (Brodmann areas 42 and 22) laterally adjoin the primary auditory cortex and are located posteriorly in the planum temporale, anteriorly in the planum polare and laterally in the superior temporal gyrus (STG). These areas receive the majority of the afferent information from A1. They are less precise in their tonotopic organization and mainly operate by interpreting the detected sounds as words, melody, rhythm or noise (Trepel, 2008). In the dominant hemisphere, within an area of the secondary auditory cortex named Wernicke’s area, the information is processed and integrated into speech comprehension. This area is located in the posterior section of the superior temporal gyrus (Brodmann area 22). The dominant hemisphere is the one controlling and processing speech and its understanding; it is mostly located contralateral to the dominant hand, i.e. in the left hemisphere for about 95 % of the right-handed population, but also for 70 % of left-handed people (Rickheit, 2003). A lesion in this area leads to the so-called sensory aphasia, receptive aphasia or Wernicke’s aphasia, whose main symptom is the inability to understand speech. The patient is still able to speak fluently, but the speech makes little or no sense, because words are not linked to their proper meaning. The secondary auditory areas also receive afferent input from the angular gyrus, which gets its information from the secondary visual cortex. This circuit is important for linking visual and auditory input to its meaning, which is crucial for reading and writing. Furthermore, there may be two major streams of information processing comparable to the ‘what’ and ‘where’ streams in the visual system (Kaas, 1999). To simplify: the information about spatial location (‘where’) would run from A1 to posterior higher order areas and continue to the parietal lobe and the posterior parts of the dorsolateral prefrontal cortex. Object-related properties would be processed within a ‘what’ pathway composed of the primary auditory cortex, anterior higher order areas and ventral and medial prefrontal areas. Given this knowledge about the complex auditory system, the obvious questions follow: how does the brain interpret acoustic waves to produce a percept, and with what kind of methods and procedures is it possible to measure, evaluate and interpret auditory perception and processing?

2.2 Investigation of auditory perception and processing
There are two main approaches to investigating auditory perception and processing. (1) Psychophysics analyzes the interaction between physical stimuli that are quantitatively measurable and the subjective perception triggered by these stimuli. One branch of psychophysics is psychoacoustics, which describes the relationship between a subjective auditory impression and the corresponding physical stimulus. (2) Brain activity can be measured using four main techniques, each representing a different approach. Electrophysiological methods like the electroencephalogram (EEG), on which the following part mainly concentrates, reveal electrical activity generated by the brain. The recording electrodes can be placed in different locations: on the scalp (EEG), directly on the cortex (electrocorticogram, ECoG) or in structures deeper in the brain (stereotactic EEG, SEEG). ECoG and SEEG are both invasive intracranial recording techniques.


Magnetic fields produced by electrical currents can also be measured, using the magnetoencephalogram (MEG). Other techniques based on imaging technology provide a different view of brain activity. An important one is magnetic resonance imaging (MRI), which uses powerful magnets to excite hydrogen nuclei. These atomic nuclei emit a signal while returning to their initial state (relaxation). The signal can be measured and computed into structural images of the brain. It is also possible to visualize brain function with functional MRI (fMRI). Neuronal activity enhances metabolic processes, resulting in changes of blood flow. Hemoglobin exhibits different oxygenation levels, which are measurable as different MRI signals and thus reveal activated structures in the brain. This is called the blood oxygen level dependent effect (BOLD effect). Another imaging technique, positron emission tomography (PET), visualizes metabolic processes by showing the distribution of a radioactive tracer in the brain. The tracer is attached to a biologically active molecule and injected into the blood circulation; the most commonly used one is fluorodeoxyglucose (FDG). Some of these methods are compatible and can be used in a complementary way.

2.2.1 Psychophysics – psychoacoustics
The adequate stimulus for the ear is a sound wave, a pressure oscillation that generally comprises several frequencies (expressed in Hertz). The magnitude of a pressure wave is its amplitude, also called sound pressure (P, 1 Pa = 1 N/m²). The human ear can detect sounds over a wide range of amplitudes, and sound pressure is therefore often expressed as a level on a logarithmic decibel (dB) scale, also called sound pressure level (SPL, L):

L = 20 · log10(Px / P0) [dB] (Schmidt, 2005).

The term level means that the measured sound pressure (Px) is expressed as a logarithmic ratio to another sound pressure (P0), which is the absolute threshold of hearing (2·10^-5 Pa). This means that an increase of a few decibels corresponds to a multiplication of the sound pressure. The most important way to examine a person's hearing ability is an audiometry test. Different tones are presented through headphones at different levels, and the tested person has to press a button as soon as the tone is heard, in order to determine the individual threshold of audibility. Loudness indicates how loud a person perceives a sound and therefore cannot be measured objectively. It is nevertheless related to the sound pressure level and the duration of a sound. If the sound pressure increases, a sound is perceived as louder, and high frequencies are heard as a high tone (and the other way around). Furthermore, at a constant sound pressure, tones are perceived as louder at frequencies between 2000 and 5000 Hz (Schmidt, 1993). The sound pressure must therefore be adjusted to the frequency in order for all tones to be perceived at the same loudness (isophone). This yields a chart (see Figure 2.4) showing equal loudness curves, which are also called Fletcher-Munson curves (Fletcher, 1933). Values at 1000 Hz are also named phon; by definition, one phon equals one decibel at 1000 Hz. Human hearing is limited to frequencies between 20 Hz and 16,000 Hz and to loudness between 4 and 130 phon (Schmidt, 2005). A normally spoken word would be found at around 50 to 70 decibels and a painful tone at around 130 decibels (Schmidt, 2005).
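The relationship between sound pressure and sound pressure level can be checked with a few lines of code. The following minimal sketch assumes the conventional reference pressure of 2·10^-5 Pa for the threshold of hearing; the function names are hypothetical and chosen only for this illustration.

```python
import math

# Reference sound pressure P0 in pascal (threshold of hearing).
P0_PA = 2e-5


def sound_pressure_level_db(pressure_pa, reference_pa=P0_PA):
    """Sound pressure level L = 20 * log10(Px / P0) in dB."""
    if pressure_pa <= 0 or reference_pa <= 0:
        raise ValueError("pressures must be positive")
    return 20.0 * math.log10(pressure_pa / reference_pa)


def pressure_from_level_pa(level_db, reference_pa=P0_PA):
    """Inverse relation: sound pressure in pascal for a given level in dB."""
    return reference_pa * 10.0 ** (level_db / 20.0)


# Every step of 20 dB multiplies the sound pressure by a factor of 10.
for level in (0, 20, 60, 130):
    print(f"{level:3d} dB SPL -> {pressure_from_level_pa(level):.5f} Pa")
```

On this scale, conversational speech at about 60 dB corresponds to a sound pressure one thousand times the reference pressure, which illustrates why a logarithmic measure is convenient.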

Figure 2.4 – The Fletcher-Munson curves. Equal loudness curves. The intensity (vertical axis) is adapted to the frequencies (horizontal axis) to perceive all sounds at the same loudness. Values at 1000 Hz are called phon. Phon levels of 0, 10, 20, 30,…120 are depicted (adapted from Fletcher, 1933).


2.2.2 Brain activity – electroencephalogram (EEG)
2.2.2.1 Introduction and history
In the 1870s, electrical activity was recorded from a mammalian brain for the first time. Richard Caton, an English physician and physiologist, reported in 1875 spontaneous activity recorded directly from the exposed cortex of rabbits and monkeys (Millett, 2001). This laid the groundwork for Hans Berger (1873-1941), a German psychiatrist at the University of Jena, who first recorded the activity of a human brain in 1929. He applied electrodes to the heads of patients who had skull defects and recorded the first electroencephalogram (Millett, 2001). These skull defects are comparable to the result of today’s decompressive craniectomy (performed in patients with traumatic brain injuries to reduce elevated intracranial pressure by removing a part of the skull for a few months). This was the invention of a technique that revolutionized clinical and psychological work and research. Even though great new inventions like positron emission tomography (PET), magnetic resonance imaging (MRI) or the MEG provide new opportunities to investigate the human brain, the EEG remains the gold standard for diagnosing numerous diseases and is crucial for studying the dynamics of brain activity.

2.2.2.2 Physiological fundamentals

The human brain is mostly constituted of neurons, cells specialized in transmitting and processing information via electrical and chemical signals, and glial cells (for example astrocytes, oligodendrocytes, radial glia, microglia), which provide support and electrical insulation for the neurons and maintain the ion homeostasis of the brain. Glial cells also show electrical activity, which is probably too small to contribute to the EEG (Araque, 2004). Neurons show a negative intracellular membrane potential (–70 mV) compared to the extracellular space. During a depolarization, positively charged ions (especially sodium) enter the cell, generating an action potential. Subsequently, positively charged ions (potassium) diffuse out of the cell (repolarization) and the voltage returns to its initial value. An action potential lasts from less than a millisecond up to a few milliseconds and can recur up to several hundred times per second. Neurons communicate via synapses. A depolarized afferent neuron releases a neurotransmitter, which opens ion channels on the postsynaptic membrane of the subsequent neuron. There are different neurotransmitters with distinct properties: glutamate (generally excitatory), GABA and glycine (generally inhibitory) and acetylcholine (excitatory or inhibitory depending on the receptor). The induced postsynaptic potentials (excitatory or inhibitory, EPSP or IPSP) are relatively slow and followed by voltage fluctuations that can be measured with EEG electrodes. For instance, an excitatory neurotransmitter like glutamate depolarizes the dendrites through a sodium and calcium influx or a reduced potassium efflux. In this case, the outer surface of the dendrite carries a reduced electric charge compared to the positive charge inside the cell (see Figure 2.5). This potential difference generates a dipole. An inhibitory synapse, on the other hand, causes a hyperpolarization of the neuron and therefore a positivity of the extracellular space. The ion flow changes the distribution of ions in the extracellular space, which is balanced by adjacent extracellular compartments.

Figure 2.5 – Model of a neuron generating a field potential. The afferent fiber induces an EPSP at the dendrites of the neuron. The ion influx into the cell results in a negative field potential and a dipole (dashed lines) (adapted from Ebner, 2006).

Electrical changes of a single neuron cannot be recorded with an EEG electrode, because the amplitude is too small and there is a considerable distance between neurons and electrodes. The recorded electrical activity is rather a summation of voltage fluctuations caused by EPSPs and/or IPSPs of many neurons within a population. Given that a neuron is part of a population, the extracellular potentials depend on the orientation and polarity of the neurons. The field potentials summate only if the neurons are organized in parallel or serial networks and have similar morphological polarization. This situation is called an open field (see Figure 2.6 A). Pyramidal cells, special neurons in the human cortex that are mainly organized vertically, primarily generate the electrical potentials seen in the EEG. In a closed field, on the other hand (see Figure 2.6 B), the current flow cancels out within the population. This happens if the neurons have a stellate morphology with dendrites extending radially outward, or if the neurons are randomly oriented. Interneurons, for example, show closed field potentials (Ebner, 2006).

Figure 2.6 – Orientation of neurons. (A) A parallel orientation results in a measurable signal; it is called an open field. (B) A stellate organization results in a marginally measurable signal; it is called a closed field (adapted from Ebner, 2006).

The EEG signal recorded from the scalp is composed of frequencies between 0.5 and 80 Hz and amplitudes ranging from 1 to 100 µV (Schmidt, 2005). The spontaneous EEG signal mostly displays noise, but nevertheless the state of and the areas of higher activity can be displayed and mapped on a scalp model. For this, it is important to consider the orientation of the generator population. It can be oriented perpendicular to the surface of the cortex (see Figure 2.7 B). In this case, the dipole moment is a radial dipole and the scalp topography shows approximately the location of the source (Ebner, 2006) (see Figure 2.7 D). A tangential dipole results if the neurons are arranged tangentially to the cortical surface (see Figure 2.7 A); in this case, the maximal negativity or positivity of the scalp topography does not show the actual source (see Figure 2.7 C). The source is instead located between the two maxima of opposite sign on the scalp. Therefore, one must not conclude that the largest signal in the EEG marks the location of the strongest activity in the brain. Inferring the generators from the scalp distribution is known as the inverse problem. Consequently, interpreting EEG scalp results requires care, especially with respect to locating the generator. Moreover, the further away a dipole is from the scalp, the broader the distribution and the smaller the amplitude of the signal.

Figure 2.7 – Model of the localization of the neurons generating a dipole. (A) Neurons are orientated tangential to the folded cortical surface resulting in a tangential dipole moment (C). Neurons are orientated perpendicular to the folded cortical surface (B) resulting in a radial dipole moment (D) (adapted from Ebner, 2006).

2.2.2.3 Recording

The EEG corresponds to the potential difference between two electrodes, one electrode of interest positioned on the scalp and one reference electrode (see Figure 2.8).

Figure 2.8 – A subject set up with electrodes, ready to start the experiment (picture taken in the testing booth of the Helen Wills Neuroscience Institute at the University of California, Berkeley, USA).

Electrodes

Electrodes are small metal discs, which are mainly made of silver, but also of platinum, gold or tin. Silver / silver chloride (Ag / AgCl) electrodes are used most often, because this compound reduces the polarization effect, a counter voltage that arises while the voltage on the scalp is constant or changing slowly (Ebner, 2006). These electrodes record not only brain activity but also interfering activity, for instance from alternating current (AC) (see 2.2.2.5). The electrodes are placed on the head (mounted in a cap or glued), and a conductive paste rich in electrolytes lowers the impedance between electrode and skin, preferably to below 5 kOhm. To ensure standardized recording, the electrode positions on the scalp are identified using the International 10 / 20 system (see Figure 2.9). The number of recording electrodes can go up to 256. Each electrode is labeled with a letter and a number: the letter refers to a brain area (‘F’ = frontal lobe, ‘T’ = temporal lobe, ‘P’ = parietal lobe, ‘O’ = occipital lobe), and a ‘z’ in place of the number refers to the midline; even numbers refer to the right side of the head and odd numbers to the left side.
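The labeling rule can be illustrated with a short sketch. The function name `describe_electrode` and the lookup table are hypothetical and cover only the four letters listed in the text; labels with other letters (for example central sites) are reported as "unknown" rather than guessed.

```python
import re

# Area letters exactly as listed in the text; anything else stays "unknown".
AREAS = {"F": "frontal", "T": "temporal", "P": "parietal", "O": "occipital"}

def describe_electrode(label):
    """Return (area, side) for a 10/20 label such as 'F7', 'O2' or 'Pz'."""
    match = re.fullmatch(r"([A-Za-z]+?)([zZ]|\d+)", label)
    if match is None:
        return "unknown", "unknown"
    letters, suffix = match.groups()
    area = AREAS.get(letters[0].upper(), "unknown")
    if suffix in ("z", "Z"):
        side = "midline"
    else:
        # Even numbers lie on the right side of the head, odd numbers on the left.
        side = "right" if int(suffix) % 2 == 0 else "left"
    return area, side
```

For example, `describe_electrode("F7")` yields a frontal electrode on the left side, and `describe_electrode("Pz")` a parietal electrode on the midline.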

Figure 2.9 – The layout from the International 10 / 20 system with 64 recording electrodes (adapted from BioSemi, the Netherlands).

EEG instruments

The small amplitude of the EEG signal (1-100 µV) requires amplification. Therefore a differential amplifier is used: the difference between two signals is amplified by a constant factor (usually 10,000) (Ebner, 2006). The amplified signal is digitized or sampled, i.e. converted into a series of numeric values (analog-to-digital conversion, ADC). The samples, representing the actual value of the EEG amplitude, are measured at constant time intervals. The sampling rate (expressed in Hz) is the number of samples per second. For clinical applications a typical sampling rate is about 250 Hz, whereas in research studies the signal can be sampled at a much higher rate, for instance above 1000 Hz. Furthermore, the sampled signal is filtered to reduce superimposed signals or to isolate EEG frequency bands of interest. The bandwidth of the EEG signal extends from below 1 Hz to above 50 Hz, with varying relative amplitude. Different filters can be used depending on the purpose of a study. A notch filter, or band-stop filter, can be used to exclude contaminating frequencies (50 Hz or 60 Hz) caused by mains power. A low-pass filter attenuates signal components above a specified cutoff frequency (e.g. 35 Hz), such as high-frequency artifacts from muscular activity. In contrast, a high-pass filter passes high frequencies and reduces the amplitude of low frequencies (e.g. below 1 Hz) to remove slow artifacts (Ebner, 2006). A band-pass filter leaves a chosen range of frequencies unattenuated, whereas frequencies outside that range are attenuated.
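Sampling and filtering can be illustrated with a minimal sketch in plain Python. It is not the filtering used in this study: a simple moving average stands in for the low-pass and notch filters described above, and the signal (a 10 Hz sine of 20 µV contaminated by 50 Hz interference of 10 µV) and all function names are assumptions chosen for illustration.

```python
import math

FS = 250  # sampling rate in Hz (the typical clinical value named in the text)

def sample_signal(n_samples, fs=FS):
    """Sample a 10 Hz sine (20 µV) contaminated by 50 Hz interference (10 µV)."""
    return [20.0 * math.sin(2 * math.pi * 10 * k / fs)
            + 10.0 * math.sin(2 * math.pi * 50 * k / fs)
            for k in range(n_samples)]

def moving_average(x, width):
    """Crude low-pass filter: mean over `width` consecutive samples."""
    return [sum(x[i:i + width]) / width for i in range(len(x) - width + 1)]

def amplitude_at(x, freq, fs=FS):
    """Amplitude of one frequency component (single-bin Fourier projection)."""
    re = sum(v * math.cos(2 * math.pi * freq * k / fs) for k, v in enumerate(x))
    im = sum(v * math.sin(2 * math.pi * freq * k / fs) for k, v in enumerate(x))
    return 2.0 * math.hypot(re, im) / len(x)

raw = sample_signal(500)  # two seconds of data
# Five samples at 250 Hz span exactly one 50 Hz period (20 ms), so the
# interference averages out; 475 samples keep whole periods of both components.
filtered = moving_average(raw, width=5)[:475]
```

Comparing `amplitude_at(raw, 50)` with `amplitude_at(filtered, 50)` shows that the 50 Hz component is removed, while the 10 Hz component is only slightly attenuated. Dedicated notch or band-pass filters achieve the same with far better control over the passband.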

Montage

The way a pair of electrodes is connected to the differential amplifier is called the montage. It is crucial for a study to choose the reference carefully, because subtraction can alter or remove data. There are different montages: referential montage, bipolar montage and (common) average reference. In the referential montage, one electrode is used as a reference, and its signal is subtracted from the signal of all other electrodes. Therefore, the reference electrode should not record brain activity or artifacts, because subtraction could otherwise cause loss or modification of information. For instance, electrodes placed on both earlobes, the nose or the mastoids serve as references with little activity of their own. In the bipolar montage, electrodes are linked in sequence and the potential differences between adjacent electrodes are measured. In general, both montages are equally effective, but they are used for different purposes according to the location and dimension of the potential field. A special montage is the average reference (common average reference). The signals from all electrodes are summed, averaged and subtracted from every electrode. However, since the potentials are statistically irregularly distributed, large deflections in the EEG of one region due to physiological or pathological activity can distort the EEG (Ebner, 2006). This montage is often used in electrocorticography (ECoG) recordings.
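The three montages amount to three different subtractions, which the following sketch makes explicit. The channel names, the two-sample signals (in µV) and the function names are hypothetical and serve only to illustrate the arithmetic.

```python
def referential(channels, ref):
    """Subtract the reference channel from every other channel."""
    return {name: [v - r for v, r in zip(signal, channels[ref])]
            for name, signal in channels.items() if name != ref}

def bipolar(channels, chain):
    """Potential differences between adjacent electrodes along a chain."""
    return {f"{a}-{b}": [x - y for x, y in zip(channels[a], channels[b])]
            for a, b in zip(chain, chain[1:])}

def average_reference(channels):
    """Subtract the mean over all electrodes from every electrode."""
    n = len(channels)
    mean = [sum(samples) / n for samples in zip(*channels.values())]
    return {name: [v - m for v, m in zip(signal, mean)]
            for name, signal in channels.items()}

# Hypothetical recording: four channels, two samples each (values in µV).
eeg = {"F3": [4.0, 6.0], "C3": [2.0, 2.0], "P3": [0.0, 1.0], "A1": [1.0, 1.0]}

ref_montage = referential(eeg, ref="A1")
bip_montage = bipolar(eeg, chain=["F3", "C3", "P3"])
avg_montage = average_reference(eeg)
```

In the average reference, a large deflection in a single channel shifts the mean and therefore appears, with opposite sign and reduced size, in all other channels, which is the distortion described above.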

2.2.2.4 Classification of frequency

The recorded spontaneous electrical activity appears chaotic, but after applying different filters, rhythmic activity emerges that can be classified into bands by frequency (typically, a negative deflection is depicted upward in a graph). The EEG of an awake, healthy adult is composed of several frequency bands: delta (< 4 Hz), theta (4 – 7 Hz), alpha (8 – 13 Hz), beta (14 – 30 Hz), and gamma (> 30 Hz) (see Figure 2.10). The amplitude is negatively correlated with the frequency, which means that the amplitude decreases with increasing frequency (Pfurtscheller, 1999). Furthermore, the amplitude is proportional to the number of synchronously active neuronal populations (Elul, 1971), i.e. slow fluctuations reflect a bigger active cell assembly than fast oscillations (Singer, 1993). The alpha rhythm occurs with an amplitude of about 50 µV in an awake, relaxed person with closed eyes and little environmental stimulation. This activity is distributed bilaterally, mainly over occipital electrodes but also over temporal and central electrodes. Every person appears to have an individual alpha frequency that can vary with the state of arousal. Exhaustion can decrease the alpha frequency to 8 Hz or below. If a tested person suddenly opens their eyes or focuses on a mental activity (for instance, a mathematical task), the alpha rhythm disappears and is replaced by beta activity. This is called alpha block or desynchronization. The beta rhythm is the fastest rhythm considered for the main purposes of EEG recordings, although the brain also shows activity at higher frequencies (gamma rhythm). Beta is low in amplitude and shows a maximal distribution over fronto-central sites of the scalp. It occurs during the waking state with open eyes, in particular when the tested person focuses their attention or receives a high input of environmental stimulation.
The theta rhythm also shows low amplitude (< 30 µV) and is mostly distributed over parieto-occipital areas of the scalp. It occurs to a minor extent in young adults. It mainly occurs in pathological states, for instance if the patient has a lesion (e.g. a tumor) or an encephalopathy, or is receiving antipsychotic therapy (Ebner, 2006).

Large populations of synchronously oscillating neurons generate the delta activity. It is mostly present during sleep, but is also considered normal over occipital sites of the scalp in young, awake adults. Temporal theta and delta activity also occurs in older adults (over 60 years) and is considered normal as long as there are only single waves or short sequences of theta and certain criteria are met: the proportion of delta should be less than 1 % and that of theta less than 10 % of the background activity (Ebner, 2006).
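The band limits given in this section can be written as a small lookup. This is only a sketch: the function name is hypothetical, and because the limits in the text leave gaps between integer boundaries (for example between 7 and 8 Hz), half-open intervals are assumed here.

```python
def classify_band(freq_hz):
    """Map a frequency in Hz to the EEG band names used in the text."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    if freq_hz < 4:
        return "delta"   # < 4 Hz
    if freq_hz < 8:
        return "theta"   # 4 - 7 Hz (assumed to extend up to 8 Hz)
    if freq_hz < 14:
        return "alpha"   # 8 - 13 Hz (assumed to extend up to 14 Hz)
    if freq_hz <= 30:
        return "beta"    # 14 - 30 Hz
    return "gamma"       # > 30 Hz
```

With these assumptions, `classify_band(10)` returns "alpha" and `classify_band(40)` returns "gamma".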

Figure 2.10 – EEG frequency bands. The profiles are obtained during various state of . The particular band is written in brackets (adapted from Kolb, 1996).

2.2.2.5 Artifacts

Artifacts are deflections in the EEG that do not represent activity from the brain. A distinction is drawn between biological and non-biological (technical) artifacts (Cacioppo, 2005). Artifacts with a typical shape and localization are easy to identify, but artifacts often modify the EEG in minor ways that are difficult to notice. Therefore, observation and video monitoring are indispensable for identifying artifacts. Sources of biological artifacts are the eyes, heart, arteries (pulse), tongue, skin (sweat) and muscle activity. Eye movements and blinks are a particular problem in experimental paradigms. The bulbus oculi (globe of the eye) forms an electrical dipole that causes measurable potentials when the eyes move (see Figure 2.11).

Figure 2.11 – Schema of EEG artifacts due to eye movements. The cornea is positively charged, whereas the retina is negatively charged. Accordingly, looking up leads to positive potentials on frontal electrodes and looking down to negative ones. If the person looks to the left, positive potentials are recorded at left fronto-temporal electrodes (F7-T3) and negative potentials at the opposite electrodes on the right side (F8-T4). Corresponding potentials are elicited when looking to the right (adapted from Ebner, 2006).

To avoid blinks and saccades, the tested person is instructed to blink as little as possible and, for instance, to fixate on a centrally presented cross on the testing screen. Additionally, an electrooculogram (EOG) is recorded from electrodes placed at both external canthi and below one eye. Thus, vertical and horizontal eye movements are recorded (see Figure 2.13 A) and can be removed later in the analysis. Other artifacts arising from muscle activity, e.g. chewing, frowning or tense facial muscles, can heavily contaminate the EEG (see Figure 2.13 B). Complex biological artifacts, e.g. prolonged movement of the subject, are especially difficult to deal with. The best way to reduce these artifacts is to prevent them by carefully instructing the person to stay as relaxed as possible during testing. A very common technical artifact is the contamination of the signal with AC from other devices near the tested person (for instance, a cell phone) (see Figure 2.12). Mains power is supplied at either 50 Hz (for example in Germany) or 60 Hz (in the United States). If the noise source cannot be located, special filters, e.g. a notch filter, can be used during data collection or offline to remove the superimposed activity (Cacioppo, 2005).

Figure 2.12 – Technical artifact: contamination of the EEG signal with 60 Hz signal (recorded in the Helen Wills Neuroscience Institute at the University of California, Berkeley, USA).

Figure 2.13 – Biological artifacts. (A) Blink artifacts propagated over the frontal scalp electrodes. The three last channels represent the EOG channels: rEOG = electrode placed next to the lateral canthus of the right eye, lEOG = electrode placed next to the lateral canthus of the left eye, vEOG = electrode under the left eye. Together rEOG and lEOG record horizontal eye movements, whereas vEOG records vertical eye movements. (B) Muscle artifact probably from chewing (recorded in the Helen Wills Neuroscience Institute at the University of California, Berkeley, USA).

2.2.2.6 Data analysis: preprocessing and event-related potentials (ERP)

Preprocessing

The collected raw data do not by themselves provide clear information about the source of a lesion or about differences between cognitive tasks. Therefore, they need to be analyzed. After the data are imported and visualized, an inspection of the raw data is useful to identify electrodes with no or poor signal and to classify artifacts and their amplitudes. Thresholds are then set to automatically exclude artifacts whose amplitude exceeds the cutoff. If the data are heavily contaminated with eye blinks that cannot be excluded without losing too much data, the blinks can be corrected using independent component analysis (ICA). This method separates a multivariate signal into independent components using linear decomposition (Hoffmann, 2008). The component whose scalp distribution corresponds to an eye blink can be identified and removed. The last step is filtering the data to reject contaminated signal or to restrict the data to the frequency bands of interest (see 2.2.2.3).
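The threshold step can be sketched as follows. The cutoff of 100 µV, the example epochs and the function name are assumptions for illustration and are not the values used in this study.

```python
def reject_artifacts(epochs, threshold_uv=100.0):
    """Split epochs into kept epochs and the indices of rejected epochs.

    An epoch is rejected if any sample exceeds the cutoff in absolute value.
    """
    kept, rejected = [], []
    for index, epoch in enumerate(epochs):
        if max(abs(value) for value in epoch) > threshold_uv:
            rejected.append(index)
        else:
            kept.append(epoch)
    return kept, rejected

# Hypothetical single-channel epochs in µV; the second contains a blink-like peak.
epochs = [[5.0, -12.0, 8.0], [20.0, 250.0, -30.0], [-3.0, 4.0, -1.0]]
kept, rejected = reject_artifacts(epochs)
```

Here only the second epoch is excluded. If too many epochs were lost in this way, a correction method such as the ICA described above would be preferable to rejection.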

Event-related potentials

Before, during and after a sensory, motor or cognitive event, specific electrical events arise in the cerebral cortex. These effects can be measured as evoked potentials (EPs) or event-related potentials (ERPs), which are very small signals embedded in the ongoing EEG signal. ERPs refer to time-locked perceptual, cognitive or response potentials, whereas EPs refer to early sensory responses such as the brainstem auditory evoked potentials (BAEP; see below). All ERPs feature a specific polarity, latency, localization and amplitude that characterize the different components. ERPs reflect brain responses time-locked to an event or stimulus in an experimental paradigm. They are obtained by averaging the EEG traces from a series of trials, aligned to the event, for instance the onset of a stimulus or a response. Given that the background EEG is assumed to be random with respect to the event, it averages toward zero across trials and the EP or ERP emerges from the EEG. The signal-to-noise ratio (SNR) indicates to what extent the signal is compromised by noise. The SNR is defined here as the ratio of signal amplitude to noise amplitude. A ratio higher than 1:1 indicates more signal than noise. To increase the SNR, the number of trials needs to increase as well, because the SNR is proportional to the square root of the number of averaged trials (Schmidt, 2005). For instance, 81 trials improve the SNR to 9:1 (given an initial SNR of 1:1). Moreover, the number of trials needed for ERPs or EPs also depends on the amplitude of the ERP or EP of interest. For instance, the P300 (see 2.2.2.7) can already be visible after averaging 10 trials, whereas the BAEP requires more than 100 trials. Because the electrode positions are standardized, the amplitude of the averaged signal at a chosen time point can be plotted on a topographic scalp map. However, if the source of brain activity produces a tangential dipole, the ERPs can be mapped paradoxically. Moreover, the activity can also reflect processes executed in parallel. Therefore, before interpreting activity as a function or process and attributing it to distinct brain areas, the orientation and possible source of the dipoles and the underlying cognitive processes that might be present during the experimental paradigm need to be considered.
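The effect of averaging on the SNR can be checked with a small simulation. This is a sketch under simplifying assumptions that are not taken from the study: the ERP is modeled as a Gaussian-shaped deflection with a peak of 1 (arbitrary units), the background EEG as independent Gaussian noise with a standard deviation of 1, and all function names are hypothetical.

```python
import math
import random
import statistics

def expected_snr(initial_snr, n_trials):
    """SNR after averaging, assuming it grows with the square root of the trial count."""
    return initial_snr * math.sqrt(n_trials)

def simulate_trials(n_trials, n_samples=200, noise_sd=1.0, seed=0):
    """Return the noise-free ERP and n_trials noisy single-trial copies of it."""
    rng = random.Random(seed)
    erp = [math.exp(-((t - 100) / 15.0) ** 2) for t in range(n_samples)]
    trials = [[s + rng.gauss(0.0, noise_sd) for s in erp] for _ in range(n_trials)]
    return erp, trials

def average(trials):
    """Average the trials sample by sample (all trials are aligned to the event)."""
    return [sum(column) / len(column) for column in zip(*trials)]

erp, trials = simulate_trials(81)
avg = average(trials)
# Noise remaining in the average; it should shrink by about sqrt(81) = 9.
residual_sd = statistics.pstdev([a - s for a, s in zip(avg, erp)])
```

`expected_snr(1.0, 81)` reproduces the 9:1 example from the text, and `residual_sd` lies close to one ninth of the single-trial noise level. Real background EEG is not independent from sample to sample, so the simulation only illustrates the square-root relationship.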

2.2.2.7 Main auditory electrophysiological components

ERPs have proven to be a powerful tool for clinicians and researchers. For instance, tumors in the auditory system can compress auditory processing areas and therefore compromise hearing. Even though MRI is the gold standard, such a tumor could also be localized using auditory evoked potentials (AEPs), which are generated in certain areas of the ascending auditory pathway. There is a series of responses that index the neural activity in the brainstem, midbrain, thalamus and cortex (Gazzaniga, 2002). Electrodes placed on the vertex and mastoid can measure these ERPs. The earliest AEPs are very small voltage potentials that arise within the first 10 ms. They are called the early latency ERPs or brainstem auditory evoked potentials (BAEP) and are used to test the auditory pathway up to the inferior colliculus (see Figure 2.14 A). The middle latency ERPs emerge after the brainstem response at about 10 to 40 ms (Picton, 1980). The thalamus (medial geniculate body) and the auditory cortex are the structures related to the middle and long latency ERPs (> 40 ms) (see Figure 2.14 B and C). These ERP waveforms are referred to as exogenous components, because they are driven by the physical features of the stimuli (Schmidt, 2005). That is, the amplitudes of exogenous components change with the intensity of the stimulus. Endogenous components, on the other hand, vary according to cognitive processes.

Figure 2.14 – The auditory event-related potentials. (A) The brainstem auditory evoked potentials. The waves are related to the following anatomical locations: wave I = cochlear nerve, wave II = cochlear nuclei, wave III = superior olivary complex, wave IV = lateral lemniscus and wave V = inferior colliculus; wave V and VI are not surely assigned to specific anatomical structures. (B) The middle-latency, and (C) the long-latency deflections of the auditory ERPs with the main components: P50, N1 and P2. Negativity is depicted up (adapted from Picton, 1980).

P50-, N1- and P2-components

These three auditory components are exogenous evoked potentials, which are mainly generated in the auditory cortices and result from the sensory analysis of stimuli, even in the absence of auditory attention. These potentials are assigned to the long latency (> 40 ms) deflections of the auditory ERPs (Picton, 1980) and are named for their characteristic polarity and latency. The P20-50 or P50 is a positive potential occurring around 20-50 ms after stimulus onset. It is thought to reflect neural activity in primary and associative auditory cortices (Liegeois-Chauvel, 1994) and shows a central distribution on scalp electrodes (see Figure 2.15).

The N100 or N1 is one of the major components of the auditory evoked potentials. It is a large negative component, which peaks around 80-110 ms after stimulus onset and shows a fronto-central scalp distribution (see Figure 2.15). The N1 was first recorded by Pauline A. Davis at Harvard University (Davis, 1939). The origin of the wave was unknown for a long time and was finally linked to the auditory cortex in the 1970s (Näätänen, 1987; Vaughan, 1970). Intracranial depth recordings in humans have shown that the N1 is generated in several primary and associative auditory areas (Liegeois-Chauvel, 1994; Yvert, 2005). Moreover, a dynamic dipole model analysis showed that neural generators are active not only in the auditory areas, but possibly also in the motor and supplementary motor areas and/or the cingulate gyrus (Giard, 1994b). This response is generated after an abrupt acoustic event and is thus involved in the detection and perception of acoustic transitions. Furthermore, the N1 amplitude depends on several physical features of the stimulus, e.g. the rise time of sound onset, the interstimulus interval (ISI), and the loudness and frequency of the stimulus and of preceding sounds (Näätänen, 1999). After reaching its peak, the N1 returns abruptly toward positivity and merges into the P200 or P2 component, a positive deflection in the ERP waveform with a latency of about 200 ms and a positive scalp distribution at central electrodes (see Figure 2.15). Together, N1 and P2 are often referred to as the N100-P200 or N1-P2 complex. In addition, there can also be an N2, a negative component that peaks around 200-350 ms after stimulus onset (Folstein, 2008) and is present during processes enabling cognitive control. Moreover, this component is also sensitive to novel stimuli in terms of mismatch negativity (see 2.3.2) (Schmitt, 2000).

Figure 2.15 – The long-latency deflections of the auditory ERPs at Fz electrode. P50- N1- and P2-components with the corresponding scalp distributions (top views, green represents positive amplitude and red negative amplitude) (curves and topographies created from data collected for the current study).

P300 Component

The P300 or P3 is a positive ERP component that reaches its maximum around 300 ms after stimulus onset. The strongest signal can be measured at parietal electrodes. The P3 was first reported by Sutton (Sutton, 1965) in response to unpredictable stimuli presented in an oddball paradigm. In this kind of paradigm, a rare target stimulus is presented among more frequent standard background stimuli, and the P3 arises when the target stimulus is detected. A larger P3 is elicited by events representing a low-probability category of stimuli (McCarthy, 1981). The P3 wave is composed of two subcomponents known as P3a and P3b, which reflect distinct information-processing events. The P3a is usually observed in response to unexpected meaningful stimuli, such as novel stimuli. Therefore, it has been proposed that the P3a originates from stimulus-driven frontal attention mechanisms. The P3b is elicited in response to detected targets and is considered target-related. It arises from temporal-parietal activity associated with top-down attention and appears to be related to subsequent memory processing (McCarthy, 1981). Both P3a and P3b depend on a number of variables, in particular the subject’s mental state, the task to be accomplished, the significance of the stimulus, and the degree of attention. Therefore, these responses are often used as indicators of higher-order cognitive functions such as decision-making or selective attention. Accordingly, various studies have suggested that several cortical generators of the P3 may coexist: the medial temporal cortex, the temporo-parietal junction, and the lateral prefrontal cortex (Soltani, 2000).
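The structure of an oddball paradigm, in which the P3 is typically elicited, can be sketched as a randomized stimulus sequence. The target probability of 0.2, the number of trials and the function name are illustrative assumptions and do not describe the paradigm of the present study.

```python
import random

def make_oddball_sequence(n_trials, p_target=0.2, seed=0):
    """Return a shuffled list of 'standard' and rare 'target' stimuli."""
    n_targets = round(n_trials * p_target)
    sequence = ["target"] * n_targets + ["standard"] * (n_trials - n_targets)
    # Seeded shuffle: unpredictable for the participant, reproducible for the analysis.
    random.Random(seed).shuffle(sequence)
    return sequence

sequence = make_oddball_sequence(100)
```

Lowering `p_target` makes the targets rarer, which, according to the findings cited above, would be expected to enlarge the P3 elicited by each detected target.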

2.3 Auditory attention

2.3.1 Psychological theories

2.3.1.1 Introduction to selective attention

Attention is an abstract concept, which is not easy to define; but already in the 19th century William James, a psychologist at Harvard University, proposed a definition of attention: “Everyone knows what attention is. It is the taking possession by the mind, in clear and vivid form, of one out of what seem several simultaneously possible objects or trains of thought. Focalization, concentration, of consciousness are of its essence. It implies withdrawal from some things in order to deal effectively with others, and is a condition which has a real opposite in the confused, dazed, scatterbrained state which in French is called distraction, and Zerstreutheit in German.” (James, 1890) W. James emphasized the main characteristics of attention: it is a cognitive brain mechanism that enables the processing of relevant inputs, thoughts, or actions while ignoring irrelevant or distracting stimulation (Gazzaniga, 2002). With his statement “it is the taking possession by the mind” he also stressed one of the two categories of attention: voluntary or endogenous attention, in which one chooses to focus attention on an event of interest. This process is driven by so-called top-down signals, which means that cognitive influences and decisions can alter the perception of stimuli. The other category is reflexive, automatic or exogenous attention, also called bottom-up, which occurs when an external object or a sensory event captures our attention. This attention process is based on the analysis of stimulus characteristics (e.g. color, brightness). For example, one red balloon in a bunch of blue balloons will attract attention involuntarily. The ideas of attention that W. James proposed over 100 years ago are still the subject of research today. The main goal in studying attention is to investigate how attention enables and influences the detection, perception and encoding of stimulus events (Gazzaniga, 2002). Most studies have been conducted in the visual and auditory modalities. During the last decades the number of studies on visual attention increased and displaced the emphasis on auditory attention, which had been predominant in the 1950s and 1960s (Broadbent, 1958; Cherry, 1953). Auditory attention seems to be the greater challenge because of crucial physiological differences in structure and function. Visual attention is linked to the position of the head and eyes, since the stimuli are already mostly fully processed in the fovea (Pashler, 1998). The human cochlea, on the other hand, is not an equivalent to the fovea. Auditory selective attention is characteristically mostly independent of the position of the head and the ears, which makes it a system that is ready to receive and process stimuli from all directions regardless of the organism’s current orientation (Pashler, 1998). On the other hand, this openness to all inputs from the environment means that efficient selection mechanisms are needed to distinguish relevant from irrelevant sounds. Colin Cherry, a British psychologist, described the classic auditory example of this phenomenon, the so-called cocktail party effect (Cherry, 1953): a person can focus on one particular speaker while tuning out several other simultaneous conversations. This can only be achieved by auditory selective attention: the perception of certain stimuli in the environment is enhanced relative to other stimuli of lower immediate priority.
In Cherry’s study, competing speech input was presented through earphones to the two ears of a subject. The subjects were asked to attend to and verbally shadow (immediately repeat each word of) a relevant input in one ear while ignoring irrelevant information presented to the other ear; this approach is called dichotic listening. He noticed that the subjects were only able to report the input from the attended ear and could not report any detail from the ignored channel. He also observed a significant decrease in performance when the subjects attempted to attend to both input channels simultaneously compared with selectively attending to one channel.

Cherry proposed that attention focused on one ear results in better encoding of inputs in this channel, whereas the input of unattended channels might be attenuated or rejected. These findings led to general models of attention, which fall into two categories: bottleneck theories (see 2.3.1.2) and other capacity-limitation theories (see 2.3.1.3). The bottleneck category is the most influential one. It is worth noting that all theories are based on the idea that humans have a limited information processing capacity, i.e. it is impossible to process and react to all exogenous and endogenous inputs that continuously excite our senses.

2.3.1.2 Bottleneck theories: early- versus late-selection

A few influential psychologists proposed different models of attention to explain results like those from Cherry’s experiment (Broadbent, 1958; Deutsch, 1963; Treisman, 1960). The theories all share the underlying idea that all processing in the brain, even that of sensory inputs, passes through a limited-capacity channel (bottleneck), so that only a certain amount of information can pass (Gazzaniga, 2002). Therefore, all sensory inputs need to be screened, sorted and filtered to let only the relevant stimuli pass for further processing, i.e. irrelevant inputs are rejected and relevant ones are admitted for higher order processing. The main differences between the following models of attention are the proposed location of the processing bottleneck, which is either early or late, and the extent to which ignored inputs are actually processed before they are rejected or admitted for further processing. Thus, two competing theories arise: early selection, represented by Broadbent’s and Treisman’s models, and late selection, represented by Deutsch’s model. In Broadbent’s theory (Broadbent, 1958), the incoming stimuli are temporarily held in a sensory register, which allows attending to unanalyzed information later on. Then, the stimuli are analyzed in parallel by a selective filter on the basis of their physical characteristics such as spatial location (attended versus ignored ear, the latter being rejected), spectral content, and temporal features. This filtering process happens unconsciously. The selected stimuli pass along a limited-capacity channel and, subsequently, semantic analysis takes place, which is essential for influencing a response or entering long-term memory. The stimuli that are not selected are not analyzed further and do not reach consciousness, which is an all-or-nothing view of perception.


Some features of Broadbent’s filter theory explained Cherry’s data well, but Neville Moray showed in 1959 that high-priority information in an unattended input channel was also processed to the extent that it could break through the attentional barrier. In his experiment, Moray found that a person’s own name in an ignored input channel could often direct attention to this channel (Moray, 1959). These findings led to the assumption that all information was in fact analyzed equivalently, regardless of whether it was attended or ignored during testing. Therefore, Treisman proposed a direct modification of Broadbent’s model, to which Broadbent agreed a year later. The two theories are quite similar; the main difference lies in the filter. Treisman’s filter likewise passes the attended input through the limited capacity channel, but it also lets unattended messages through in an attenuated form, i.e. their signal strength is lowered. Accordingly, certain unattended messages can be processed semantically and also reach consciousness if they meet certain criteria. The most important criteria are differing thresholds, which can be variable and also function as a filtering mechanism. For example, biologically important signals have permanently lowered thresholds, so that even strongly attenuated signals can be facilitated and semantically analyzed. This could explain why one’s own name in an unattended message can attract attention to it. This model is, therefore, both an early selection theory and an attenuation model of attention.

Taken together, the bottleneck in the early selection theories is located around the level of perceptual analysis (see Figure 2.16). Attended input is perceptually processed and continues to higher order processing (e.g. encoding as semantic or categorical information), whereas unattended input is either rejected categorically (Broadbent’s theory) or attenuated so that important messages in an unattended channel remain available for further processing (Treisman’s theory). It may therefore be possible that inputs are selected or rejected even before the perceptual analysis of the stimulus characteristics is fully completed (Gazzaniga, 2002).

The early selection models can be contrasted with the late selection model, which proposes that attended and unattended stimuli are processed equivalently by the perceptual system and that both inputs reach the stage of semantic encoding. Only after that does selection for further processing or for conscious awareness take place. Thus, selection occurs at higher stages of information processing and determines whether the stimuli should gain complete access to awareness, be encoded in memory, or initiate a response. J. A. Deutsch and D. Deutsch first proposed the most influential late selection theory (Deutsch, 1963). In this model, all incoming stimuli are stored in a sensory register and fully processed, even at a semantic level, without any attenuation. This perceptual analysis happens automatically and independently of whether attention was paid or not, and it is accomplished before any selection due to attention takes place. The information is then grouped by mechanisms that are activated by particular features of the incoming stimuli, i.e. by the level of importance of the stimulus. The highest level of importance represents a criterion against which all other levels are compared. This level serves as a reference point that enables the appropriate output, such as a motor response, and inhibits the output associated with other levels. Furthermore, the general state of arousal alters the access to an output system, i.e. at a low level of arousal (e.g. sleep), only very high-priority information will be able to alter storage or motor response.

Figure 2.16 – Diagram of early and late selection. This schema shows the location of the bottleneck in early and late selection theories, i.e. the extent to which a stimulus is processed before it is rejected or admitted for further processing. The limitation of early selection is located during or even prior to perceptual analysis. Late selection, on the other hand, occurs after complete semantic encoding of all stimuli (adapted from Gazzaniga, 2002).


In summary, the early selection theories allow little automatic processing and no semantic processing of the unattended input before selection takes place. Therefore, the bottleneck of these theories is located in the perceptual system. In the late selection theories, by contrast, the bottleneck appears to be in the response system: all inputs are processed rather automatically by the perceptual system, reach the stage of semantic encoding, and are therefore able to influence executive functions such as decision making, memory or simply making a response. However, the theories might not be so different. One argument is that the disagreement might be largely terminological, because in all theories the selection mechanisms operate under similar conditions: levels of importance (Deutsch and Deutsch) or different threshold levels (Treisman). More importantly, only the information with the highest level of importance (Deutsch and Deutsch) or with a triggered threshold (Treisman) can pass on to further processing such as making a response. Besides, both theories propose pattern recognition units and mechanisms that select highly salient stimuli depending on bottom-up (physical features of the stimulus) and top-down (contextual features) factors. The theories thus differ in major respects but also share important similarities.

2.3.1.3 Other capacity-limitation theories

As already mentioned, all attention theories are based on the idea that humans have a limited information processing capacity. During the inconclusive debate between early and late selection theories, other attention models based on assumptions from this debate were developed. Nilli Lavie, a researcher at University of London, developed two theories on attention and distraction: the perceptual load theory and the cognitive load theory.

The perceptual load theory is based on the hypothesis that perception has limited capacity (early selection) but processes all stimuli automatically (late selection) until it runs out of capacity (Lavie, 2005). The idea is that the perceptual load of relevant information determines selective processing of irrelevant information (Lavie, 1995). That means that a high perceptual load would engage all available capacity and would leave no spare capacity for processing irrelevant stimuli. On the other hand, under a low perceptual load (when the relevant stimuli do not demand all of the available capacity), irrelevant stimuli will unintentionally capture spare capacity and be processed at the perceptual level, which would lead to increased distraction (Lavie, 1995). Therefore, rejection of irrelevant stimuli results only from an overload of the perceptual system by relevant information, i.e. when the capacity limit is exceeded.

Lavie observed that the effect of load on distractor processing depends mainly on the type of mental process that is loaded, because load on executive functions such as working memory had the opposite result. She therefore proposed the cognitive load theory. This theory concerns the interaction between attention and working memory, which is the ability to hold and manipulate information in mind for a short time, or rather, to actively maintain stimulus-processing priorities throughout the task (Lavie, 2005). Lavie and colleagues suggested that loading working memory in a concurrent, yet unrelated task reduces the availability of working memory for a selective attention task. This in turn should result in reduced efficiency of focusing attention on the relevant stimuli, with greater interference by distractors. More precisely, a high cognitive load would increase the interference by an irrelevant low-priority distractor, and a low cognitive load would decrease distractor interference (Lavie, 2004). Therefore, load on an executive function such as working memory has the opposite effect to perceptual load. This idea was supported by results from de Fockert (de Fockert, 2001) showing a causal role for working memory in the control of selective attention (see 2.4).

The effects of different types of load on distractor processing provide a better understanding of how distractor processing is affected by capacity limits in different mental processes. These load models also provide a more complete view of the early- and late-selection debate: early selection depends on high perceptual load, whereas late selection depends on the cognitive control functions available for the selective attention task (Lavie, 2005). Thus, these capacity-limitation models also reconcile the competing bottleneck models by combining the assumption that perception is a limited process (early) with the view that perception is an automatic process (late) to the extent that there is spare capacity available. This suggests that there is no single bottleneck in the information processing system, but rather a series of filters, so that incoming stimuli can be selected at early or late stages depending on the situation. These are therefore also flexible theories of selective attention.


2.3.2 Electrophysiological findings and theories

Many electrophysiological studies have been conducted to investigate the electrical changes in the brain due to selective auditory attention. Although this is difficult enough, it is even more difficult to relate the findings to the corresponding attention theories. Studies using EEG are especially important, because ERPs can provide information in real time about perceptual and cognitive processes and how they change with the state of attention. An introduction to and review of the most important studies follows.

Support for early selection theories of attention

Steven Hillyard (Hillyard, 1973) developed a selective listening task, which became the classic auditory selective attention paradigm (see Figure 2.17), to investigate brain mechanisms of auditory attention with scalp EEG. Streams of sounds differing in pitch were delivered through headphones to the ears of the subjects. In one condition they were asked to attend to the sounds in one ear and make a response to a target sound in this ear (see Figure 2.17: the grey notes represent the targets) while ignoring all sounds in the other ear (for instance: attend left ear and ignore right ear). Then, in a second condition, they were asked to pay attention to the stimuli in the other ear (in this example: attend right and ignore left). This way, Hillyard separately obtained ERPs to the same stimuli when they were attended and when they were ignored.

Figure 2.17 – Scheme of the selective listening task. The sounds occur successively to the left and the right ear. The subjects are asked to detect stimuli of interest in one ear (for instance, grey notes in the left ear) and ignore all stimuli presented to the other ear (adapted from Bidet-Caulet, 2008).


Hillyard controlled the global state of arousal during testing by engaging the subjects in this difficult task. Thus, only the direction of attention varied (i.e., which ear the subjects directed their attention to). He discovered that the auditory ERP, more precisely the N1 component, was substantially larger in amplitude for attended stimuli than for ignored ones (Hillyard, 1973). Since the N1 is known to be generated in the auditory cortex (Vaughan, 1970), Hillyard and his colleagues interpreted their findings as an increased activity of the N1 generators, i.e. an enhanced activation of neurons involved in automatic sensory analysis of sounds in the auditory cortex. Consequently, they proposed that selective attention acts as a filtering or gain mechanism that can inhibit or gate unattended stimuli at an early stage of sensory analysis (about 100 ms). This represents a physiological version of the psychological attenuation model of early selection (Broadbent, 1958; Treisman, 1960).

A few years later, Näätänen (Näätänen, 1978) suggested that the increased negativity observed by Hillyard could be dissociated from the N100 component. Näätänen used a constant and longer inter-stimulus interval (ISI, 800 ms) than Hillyard (250-1250 ms) and did not observe an enlargement of the N1. However, when he subtracted the ERPs to ignored tones from those to the same tones when they were attended, he found a negative difference wave (Nd), also called processing negativity (PN). This deflection began to emerge at around 150 ms after stimulus onset and persisted for at least 500 ms. Näätänen proposed that this wave is an endogenous component representing attention-specific activity, which is different from the activity resulting from automatic sensory analysis (Näätänen, 1978). He concluded that the N1 effect Hillyard reported was the exogenous N1 overlapped by the endogenous Nd and thus not an increased activity of the N1 generators. Näätänen also observed that the Nd is composed of two subcomponents: an early one, which could be generated in the auditory association cortices and is independent of the ISI, and a later one of larger amplitude and longer duration at frontal sites, which is elicited with long ISIs (800-2000 ms) (Näätänen, 1981).

Based on these findings, he developed the attentional trace model of selective attention (Näätänen, 1982). This attentional trace would be an actively maintained cortical representation of the physical features (e.g. pitch, location) of stimuli. These features separate relevant from irrelevant stimuli. The model proposes that there is an early selection in terms of a comparison between the sensory input and the attentional trace in the auditory cortex. The earlier Nd component, which would explain the attention effect at the N1 latency, would be generated by the comparison of the stimulus features with the trace. Sounds that do not match the trace would be rejected from further analysis. Accordingly, the late Nd could reflect a frontal component controlling and maintaining the attentional trace.

Hillyard’s and Näätänen’s models led to a controversy and to numerous studies about the relationship between the processing negativity and the N1. The main difference between the two models is that Hillyard’s filtering mechanism would represent a modulation of the exogenous components of the ERPs in addition to an endogenous attention effect appearing as a frontally distributed negativity. In Näätänen’s attentional trace model, on the other hand, all ERP effects would be of endogenous origin.

Another important study provided further evidence for the early selection theory. Based on the idea that the auditory cortex can be activated as early as 20 to 25 ms after the onset of a sound, Marty Woldorff looked for attentional changes in the earliest components of the auditory ERP, the brainstem auditory evoked potentials (BAEP) and the middle-latency deflections (latency range 10-40 ms) (Woldorff, 1987). He also used the classic dichotic listening paradigm but modified it to facilitate early selection attention effects: the ISI was rather short and the subjects had to perform a difficult detection task. The targets were of low probability and of lower intensity than the other stimuli, which increased the attentional load and forced the participants to pay close attention to the sounds. He did not find any evidence for an attentional modulation of the BAEP. However, he found attentional changes of the ERPs even prior to the N1 and the P2, namely a modulation of the mid-latency ERPs around 20-50 ms (see Figure 2.18). More specifically, Woldorff concluded that the P50 was modulated as a function of selective attention, since its amplitude was larger for attended than for unattended sounds. In addition, he replicated the results in another study and finally showed that neural processing of attended versus unattended sounds can differ significantly even at 20 ms post stimulus (Woldorff, 1991). Thus, Woldorff provided support for the early selection theory that stimuli can be selected or gated before perceptual processing is fully completed.


Figure 2.18 – ERPs to attended and ignored stimuli in a dichotic listening task (grand average across all subjects). Apart from the N1 and P2 effects, the essential finding is the attentional modulation of the positive deflection at around 20-50 ms post stimulus (P50 effect), which provides support for early selection theories of attention (adapted from Woldorff, 1991).

Furthermore, Aurélie Bidet-Caulet showed attention effects in the auditory cortex at 30 ms, recorded from depth electrodes implanted in the temporal cortex of patients with pharmacologically resistant partial epilepsy (Bidet-Caulet, 2007). Attention has also been shown to alter the sensory analysis of acoustic inputs peripheral to the central auditory system. Findings from Lukas first suggested that attention alters the sensory analysis of acoustic inputs not only in the central auditory system but also in the brainstem. He observed that efferent axons within the olivocochlear bundle in the brainstem might even attenuate irrelevant acoustic stimuli at an early stage of processing during a visual attention task (Lukas, 1980; Lukas, 1981). A few years later, Marie-Hélène Giard provided evidence that evoked otoacoustic emissions (EOAEs) to tones in one ear had a larger amplitude when the tones were attended than when they were ignored (Giard, 1994a). These results indicate that modulations related to selective attention could already occur at the cochlear receptor, which points to the existence of top-down control mechanisms at a very early level. Giard concluded that selective attention could already operate as a peripheral band-pass filter at the cochlear receptor level, prior to the transduction of sound into a neural signal. Altogether, these studies support the early selection theories of attention, but they do not answer the question of which mechanisms underlie attentional selection.

Mechanisms of auditory attention

The studies described above addressed auditory selective attention by comparing brain responses to attended and ignored sounds. However, another important issue is whether selective attention operates via one or more mechanisms and how they interact. More precisely, do attentional mechanisms operate by facilitating the processing of relevant sounds and/or by inhibiting the processing of irrelevant sounds? To answer these questions, it is essential to use a baseline condition in which all sounds are processed in a fairly similar way, so that attentional modulations corresponding to facilitation or inhibition effects become visible. Merlin Donald used the classic auditory selective attention task and added such a baseline condition (Donald, 1987). This baseline (neutral condition) was defined as the situation in which the task was so difficult that ERPs to attended and ignored standard stimuli did not differ. To distinguish the effect of attention on the processing of relevant and irrelevant stimuli, he obtained difference waveforms by subtracting the ERPs of the baseline condition from those to attended stimuli or to ignored stimuli (see Figure 2.19: difference waves attended-neutral and ignored-neutral). The difference wave for attended sounds was a negative deflection with a very early onset (at about 25-30 ms), which Donald referred to as the facilitation effect. The difference wave for the ignored sounds was a positive waveform starting around 125 ms, which Donald considered to be the rejection effect. Donald proposed that, since the facilitation and rejection effects of attention show different timing, they should be independent of each other.
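The subtraction logic behind these difference waves can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `difference_waves` and `onset_ms`, the onset threshold, and the toy amplitudes are hypothetical and are not taken from Donald (1987). Only the two subtractions (attended minus neutral, ignored minus neutral) and the approximate onsets of 25 ms and 125 ms follow the description above.

```python
import numpy as np

def difference_waves(attended, ignored, neutral):
    """Return (facilitation, rejection) difference waves.

    facilitation = attended - neutral
    rejection    = ignored  - neutral
    All inputs are 1-D ERP amplitude arrays on the same time axis.
    """
    attended = np.asarray(attended, dtype=float)
    ignored = np.asarray(ignored, dtype=float)
    neutral = np.asarray(neutral, dtype=float)
    if not (attended.shape == ignored.shape == neutral.shape):
        raise ValueError("all ERPs must share the same time axis")
    return attended - neutral, ignored - neutral

def onset_ms(wave, times, threshold=0.25):
    """First time point at which |wave| reaches the (assumed) threshold."""
    idx = np.flatnonzero(np.abs(wave) >= threshold)
    return int(times[idx[0]]) if idx.size else None

# Hypothetical toy data: time axis in ms, made-up amplitudes in microvolts.
times_ms = np.arange(0, 400, 25)
neutral = np.zeros(times_ms.shape, dtype=float)
attended = np.where(times_ms >= 25, -1.0, 0.0)   # negative shift from 25 ms
ignored = np.where(times_ms >= 125, 0.5, 0.0)    # positive shift from 125 ms

facilitation, rejection = difference_waves(attended, ignored, neutral)
print("facilitation onset (ms):", onset_ms(facilitation, times_ms))
print("rejection onset (ms):", onset_ms(rejection, times_ms))
```

With real data, the three inputs would be condition-averaged ERPs to the same standard sounds, and the onset would be assessed statistically rather than with a fixed threshold.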


Figure 2.19 – ERPs to tones in an attended, ignored and neutral condition. ERPs to tones in the neutral condition are depicted as the dashed line, whereas attended (left) and ignored (right) are shown as a solid line. The difference wave between ERPs in the attended and neutral condition reflects the facilitation effect, whereas the difference between ERPs in the unattended and neutral condition represents the rejection effect. The two effects show different timing and polarity (adapted from Donald, 1987).

Aurélie Bidet-Caulet (Bidet-Caulet, 2007) also found facilitation and inhibition effects (the latter referred to by Donald as rejection) of auditory attention in direct recordings from the human auditory cortex. A distinctive feature of her study was that the subjects did not perform the classic dichotic listening task, because it does not represent a situation one is confronted with in everyday life. Instead, participants were presented with two binaural (in both ears) streams simultaneously, to examine the influence of active selection on the processing of overlapping binaural streams. Two concurrent streams at different pitches and amplitude modulation frequencies (21 and 29 Hz) were presented binaurally and changed in spatial direction at the end (left or right ear). Subjects were asked to focus their attention on one of the two streams and to indicate the final direction (left or right) of this attended stream. Bidet-Caulet also used a baseline condition in which the subjects had to detect a rare noise burst superimposed on the streams, which forced them to orient their attention away from the streams (control condition). This seems to be a better baseline condition than Donald’s, because he changed the difficulty of the target detection task to influence the ERPs but did not try to control the direction in which the subjects were paying attention.

The researchers recorded data from epileptic patients implanted with multicontact depth electrodes in the temporal cortex and analyzed the brain responses to the same streams in the three different attentional conditions (attended, control and ignored). Each stream elicited an evoked activity and a steady state response (SSR), which occurs at the same frequency as the amplitude modulation of the sound (21 or 29 Hz). The results showed that, in a situation of sound rivalry, selective auditory attention could affect the SSR in the primary auditory cortex and modulate the evoked responses in secondary auditory areas. These findings show that selective attention can modulate the sensory processing of sound within distinct auditory areas. They are therefore consistent with Hillyard’s gain theory (Hillyard, 1973), but this does not rule out the existence of an attentional trace in the selection process, as proposed by Näätänen (Näätänen, 1978). Moreover, the study provides insights into the neural mechanisms of selective attention, because it shows that there is not only an enhancement of the neural representation of the attended stream, but also a reduction of the neural representation of the ignored stream.

Altogether, these electrophysiological studies have assembled several pieces of the picture of auditory selective attention. It is now well accepted that auditory selective attention can modulate the sensory analysis of relevant and irrelevant stimuli not only in the auditory cortex (Bidet-Caulet, 2007) but also at the level of the brainstem (Lukas, 1980; Lukas, 1981) and even at the level of the cochlear receptor (Giard, 1994a). Therefore, attention can operate as the gain mechanism Hillyard proposed, which would act as a filter selecting relevant stimuli at early stages of perceptual analysis. However, there is also evidence for the attentional trace Näätänen proposed, observed as a sustained negative deflection (Nd). This attentional trace would be of endogenous origin and would be an actively maintained cortical representation of the relevant acoustic features to which incoming stimuli are compared. Finally, auditory selective attention seems to operate via facilitation and inhibition mechanisms, reflecting enhanced processing of relevant sounds and reduced processing of irrelevant sounds, respectively (Bidet-Caulet, 2007; Donald, 1987; Michie, 1990; Michie, 1993).


However, little is known about how these mechanisms interact, in particular whether facilitation and inhibition are functionally distinct.

2.4 Aims of this dissertation

Attention is the ability to focus on external activities or internal processes while avoiding distraction by irrelevant information. The ability to ignore irrelevant distracting stimuli and subsequently adapt the behavioral response is very important in everyday life, as distraction can have a broad range of consequences, ranging from reduced quality of life to dangerous incidents (e.g. during driving). The mechanisms of selective attention have been extensively investigated. It is generally accepted that selective attention is capable of modulating responses to relevant (Hillyard, 1973) and irrelevant stimuli (Donald, 1987) at different levels of auditory processing, including the auditory cortex (Bidet-Caulet, 2007; Jancke, 1999), the brainstem (Lukas, 1980; Lukas, 1981) and the cochlea (Giard, 1994a). Furthermore, it has been shown that particular signals, so-called top-down signals (see 2.3.1.1), are important for the cognitive control that enables selective attention. These top-down signals derive from knowledge about the current task and are able to modulate the neural activity in sensory cortices. Imaging and electrophysiological studies have found neural correlates of top-down and bottom-up signals in the frontal and sensory cortex (Buschman, 2007; Kastner, 2000), which seems to provide evidence for an involvement of the frontal lobe in attention and selection processes. However, it is still unknown by which mechanism the brain activity is regulated. Regarding the enhancement and reduction of responses to relevant and irrelevant stimuli, there are two competing theories. One proposes that attention is a unitary gain control mechanism that regulates activity either up or down along one continuum (see Figure 2.20 A). The other describes attention as the net activity of distinct top-down facilitation and inhibition mechanisms (see Figure 2.20 B).


Figure 2.20 – Competing theories about the regulation of brain activity in response to stimuli. (A) One mechanism representing a unitary gain mechanism that regulates the activity either up or down along one continuum. (B) Two distinct mechanisms: attentional facilitation and inhibition operating as a net activity (adapted from Bidet-Caulet, 2008).

An argument for two distinct attentional mechanisms also comes from an EEG study (Gazzaley, 2008) and an fMRI study (Gazzaley, 2005). Gazzaley found that, in the visual modality, older adults exhibit a selective deficit in suppressing task-irrelevant information during working memory encoding. He further showed that suppression mechanisms are delayed in time rather than lost with age. Attention and working memory are strongly related to each other. Lavie’s cognitive load theory suggests that the active maintenance of stimulus properties in working memory is required in order to direct attention and to specify which stimuli are currently relevant (Lavie, 2005). Therefore, a high load on working memory should lead to less differentiation between high- and low-priority stimuli (target versus distractor) and thus increase distractor processing and distraction.

An fMRI study in young adults observed that the extent to which distractors are inhibited can be determined by the availability of cognitive resources, suggesting a direct causal role for working memory in the control of selective attention (de Fockert et al., 2001). Cognitive resources were manipulated in a dual task protocol in which subjects performed two unrelated tasks at the same time: an attention task and a memory task. In the visual attention task, subjects had to classify famous written names as pop stars or politicians while ignoring distractor faces, which could be congruent or incongruent with the target name, or anonymous (see Figure 2.21 A). In the working memory task, subjects were asked to remember the order of 5 digits presented on each trial at the beginning of the attention task. In order to manipulate the memory load, subjects were asked to remember either a fixed order of digits (0 1 2 3 4) or a random order of digits (e.g. 0 3 1 4 2). After the attention task, a memory probe was presented to the subjects, who were asked to report the digit that followed this probe in the memory set (see Figure 2.21 A). Functional magnetic resonance imaging (fMRI) was used to measure brain activity while participants performed the two tasks. Distractor-related activity was obtained by comparing the activity during the attention condition with a neutral condition in which the distractor faces were absent (see Figure 2.21 B: face present versus face absent).

De Fockert observed that a high working memory load increased the distractor interference effect on the behavioural performance of the subjects. A high load also resulted in an increase of the activity elicited by the distractor faces in visual areas, especially in the extrastriate visual cortex and the fusiform gyrus (known to be selective for face processing) (see Figure 2.21 B). These findings indicate that distractor faces were more extensively processed under high than under low working memory load. De Fockert concluded that working memory serves to control visual selective attention and suggested that there might be two distinct attentional mechanisms regulating the responses to stimuli. However, this study did not assess to what extent the availability of cognitive resources affects the processing of relevant information.


Figure 2.21 – Interaction between working memory and selective attention. (A) Example for the dual task protocol (in order, from top to bottom): memory set, pop star with congruent face, politician with incongruent face, memory probe. (B) Distractor related activity in high and low memory load. Left side: views of the ventral surface of the template brain, on which superimposed loci indicate greater activity in the presence than in the absence of distractor faces under condition of low (top) and high working memory load (bottom). Right side: mean signal change of the distractor related activity (percent signal change for face presence minus face absence) for the maxima of the interaction in the right fusiform gyrus (adapted from de Fockert, 2001).

These results in the visual modality suggest that facilitation and inhibition rely on distinct mechanisms that would be differentially affected by the amount of available cognitive resources, and thus by the difficulty of the memory task in a dual task protocol. More precisely, facilitation would not be affected by the memory task difficulty, whereas inhibition is most likely to decrease with increasing memory task difficulty.


The aim of the current study was to decide between the two competing attentional theories (see Figure 2.20) and to test whether facilitation and inhibition can operate independently. To do so, a dual task protocol was also used. Subjects had to perform an auditory selective attention task and a memory task. Electrophysiological responses to the same sounds were compared across three conditions (attended, ignored and control) to measure facilitation and inhibition. The amount of available cognitive resources was manipulated by varying the difficulty of a concurrent sound memorization task. Based on the idea that facilitation and inhibition operate independently, the hypothesis was that they should not be correlated, but should rather show different electrophysiological properties and be differentially affected by the memory task difficulty.

The following is largely content of the already published article: Bidet-Caulet, A., Mikyska, C., and Knight, R. T., (2010), Load effects in auditory selective attention: evidence for distinct facilitation and inhibition mechanisms, Neuroimage, 50, (p. 277-84)

3 Material and methods

3.1 Subjects

Sixteen subjects (5 female, 1 left-handed, aged 18-30 years) participated in this experiment. All subjects were free from neurological or psychiatric disorders and had normal hearing. They all gave written informed consent in accordance with the study protocol approved by the University of California, Berkeley Committees on Human Research.

3.2 Stimuli and task

Subjects had to perform an attention task and a memory task at the same time (dual task protocol). In the attention task, subjects were randomly presented with 3 different kinds of stimuli (see Figure 3.1). Two of them, the standard (50-ms duration) and the deviant (100-ms duration), differed only in duration. Both stimuli were band-pass noises (5-semitone wide, 5 ms rise/fall times) and were successively delivered to each ear. The third stimulus was a binaural pure tone occurring in both ears at the same time (carrier frequency 988 Hz, 50-ms duration). In one ear, the standard and deviant sounds were low-pitch noises (554-740 Hz). In the other ear, the standard and deviant sounds were high-pitch noises (1319-1760 Hz). The loudness of these noises was matched by previous subjective matching in 11 subjects. The sound pitches presented in each ear were balanced across blocks. In each block (about 25 s), 49 sounds were played: 20 standards and 3 deviants in each ear (41% and 6% probability in each ear, respectively), and 3 pure tones (6% probability). The inter-stimulus interval (ISI) between 2 successive sound onsets varied between 300 and 500 ms. Subjects had to perform 3 different detection tasks. They either had to pay attention to the left (right) ear and press the right button of a joystick when they heard a duration deviant in the left (right) ear; or they had to press the right button when they heard a binaural sound (control condition). Thus, in the first two conditions, half of the standards were considered as attended (in the attended ear) and half were considered as ignored (in the unattended ear). In the control condition, all standards (in the right and left ear) were considered as “control” standards.
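
The composition of one attention block can be illustrated with a short Python sketch. It is not the stimulation script used in the experiment: the function name, the labels, the uniform distribution of the onset-to-onset interval and the unconstrained random order are assumptions made only for illustration.

```python
import random

def make_block(rng):
    """Return one attention block as a list of (sound, onset_ms) pairs."""
    sounds = (
        ["standard_left"] * 20 + ["deviant_left"] * 3
        + ["standard_right"] * 20 + ["deviant_right"] * 3
        + ["pure_tone"] * 3
    )
    rng.shuffle(sounds)  # assumption: fully random order within a block
    onset = 0.0
    block = []
    for sound in sounds:
        block.append((sound, onset))
        # assumption: interval between successive onsets drawn uniformly from 300-500 ms
        onset += rng.uniform(300.0, 500.0)
    return block

block = make_block(random.Random(0))
```

Each block built this way contains 49 sounds, so a standard in a given ear has a probability of 20/49 (about 41%) and each deviant type or the pure tone a probability of 3/49 (about 6%).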

The memory task consisted of the memorization of a sequence of four 5-harmonic sounds (100-ms duration, 5 ms rise/fall times). Subjects were presented with this sequence, then performed the attention task, and finally were presented with a second sequence they had to compare to the first one. Thus, they had to keep the short sequence in memory while performing the attention task. To construct the sequences, 4 different sounds were used with the following fundamental frequencies: 1724, 4023, 5747, or 8046 Hz. When the memory task was easy, the first sequence was the 4-time repetition of one of these sounds, and the second was either the same (left button press) or a sequence of the 4 different sounds (right button press). When the memory task was difficult, the first sequence was a sequence of the 4 different sounds, and the second was either the same (left button press) or a sequence of the 4 different sounds in a different order (right button press). Three memory conditions were considered: no, easy or difficult memory task (see Figure 3.1).

Figure 3.1 – Scheme of the dual task protocol: memory and attention task. Subjects were presented with a sequence of 4 notes. Afterwards, they had to perform 3 different attentional tasks (detection of: (1) duration deviants in the left ear and (2) in the right ear, (3) pure tones in both ears) while keeping the auditory sequence in memory. One attention block consisted of 20 standards and 3 duration deviants in each ear, and 3 pure tones in both ears. After the attention block was completed, subjects had to do an easy or difficult test of the acoustic memorization (easy or difficult memory task) (adapted from Bidet-Caulet, 2008).

3.3 Procedure

Participants were seated in a sound-attenuated EEG recording room. The sounds were delivered through earphones at an intensity level judged comfortable by the subjects, using ‘Presentation’ software (Neurobehavioral Systems, Albany, NY, USA). The experiment started with a familiarization with the sounds and tasks, and the participants were trained on the attention and memory tasks separately. EEG was then recorded while subjects performed 12 blocks of the attention task (4 in each attention condition) for each memory condition, resulting in a total of 160 attended standards, 160 ignored standards and 160 standards in the control condition. The blocks were run by memory condition (e.g. 12 attention blocks were run under the condition of easy memory and so forth). The order of memory conditions was balanced across subjects. The order of the 12 attention blocks was the same for each memory condition, and was balanced across participants using a Latin-square design. Throughout the experiment, subjects were instructed to perform as well and as fast as possible and to favor accuracy in the memory task if it was difficult to perform both tasks correctly. They were also asked to keep their eyes fixated on a centrally presented cross and to minimize any eye movements and blinks while performing the tasks.
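
The number of standards per attention condition follows directly from the block design. The short Python check below reproduces this arithmetic; the constant and function names are hypothetical and only serve the illustration.

```python
STANDARDS_PER_EAR = 20               # standards per ear in one block
BLOCKS_PER_ATTENTION_CONDITION = 4   # attend left, attend right, control

def standards_per_memory_condition():
    """Count standards per attention condition within one memory condition."""
    attend_blocks = 2 * BLOCKS_PER_ATTENTION_CONDITION  # attend-left + attend-right
    attended = attend_blocks * STANDARDS_PER_EAR         # standards in the attended ear
    ignored = attend_blocks * STANDARDS_PER_EAR          # standards in the other ear
    control = BLOCKS_PER_ATTENTION_CONDITION * 2 * STANDARDS_PER_EAR  # both ears
    return {"attended": attended, "ignored": ignored, "control": control}

counts = standards_per_memory_condition()
```

All three counts come out at 160, matching the totals given above for each memory condition.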

3.4 EEG recording

EEG data were recorded from 64 electrodes using the ‘ActiveTwo’ system (BioSemi, the Netherlands; see Figures 2.8 and 2.9). Vertical and horizontal eye movements were recorded from electrodes placed at both external canthi and below the left eye. Data were amplified (-3 dB at ~819 Hz low-pass, DC coupled), digitized (1024 Hz), and stored for offline analysis. Data were referenced offline to the average potential of two earlobe electrodes (referential montage, see 2.2.2.3).
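
Offline re-referencing to the average of the two earlobe electrodes amounts to a simple subtraction at every sample. The following NumPy sketch illustrates the operation; the function name and the array layout (channels by samples) are assumptions and do not reflect the software actually used.

```python
import numpy as np

def rereference_to_earlobes(data, left_ear, right_ear):
    """Subtract the mean earlobe potential from every channel.

    data: array of shape (n_channels, n_samples)
    left_ear, right_ear: arrays of shape (n_samples,)
    """
    reference = (left_ear + right_ear) / 2.0
    # Broadcasting subtracts the same reference trace from each channel.
    return data - reference
```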

3.5 EEG data analysis

Trials contaminated with eye movements, eye blinks or excessive muscular activity were excluded from further analysis (for examples, see Figures 2.12 and 2.13 A and B). Trials corresponding to standards after a target, or before or after a button press, were also excluded. In seven subjects, the flat or excessively noisy signals at one or two electrodes were replaced by values interpolated from the remaining adjacent electrodes. Averaging, locked to standard or deviant onset, respectively, was done separately for each attention condition (attended, ignored and control) in each memory condition (no, easy, difficult memory task). For the standard analysis, at least 108 trials were averaged for each participant, for each condition. For the deviant analysis, trials corresponding to missed targets were also excluded from further analysis. At least 21 to 24 trials were averaged for each participant and for each condition. With this procedure, the average acoustic content of the sounds was the same for all obtained event-related potentials (ERPs); only the attention orientation and the memory task difficulty varied. ERPs were corrected with a -100 to 0 ms baseline before standard or deviant onset, and were digitally filtered (low-pass 35 Hz). Since the shortest ISI was 300 ms, only the -100 to 300 ms time-window was retained for further analysis of standards. ERP scalp topographies were computed using spherical spline interpolation (Perrin, 1989; Perrin, 1987).
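
The averaging and baseline correction steps can be sketched as follows. This is a minimal NumPy illustration, not the ELAN-Pack procedure: the function name, the array layout and the half-open baseline interval are assumptions, and the 35 Hz low-pass filter is deliberately left out.

```python
import numpy as np

def average_erp(epochs, times_ms, baseline=(-100.0, 0.0)):
    """Average artifact-free trials and subtract the pre-stimulus baseline.

    epochs: array of shape (n_trials, n_channels, n_samples)
    times_ms: array of shape (n_samples,), time relative to sound onset
    """
    erp = epochs.mean(axis=0)  # average across trials
    # assumption: baseline samples satisfy start <= t < end
    in_baseline = (times_ms >= baseline[0]) & (times_ms < baseline[1])
    # Subtract each channel's mean baseline amplitude from that channel.
    return erp - erp[:, in_baseline].mean(axis=1, keepdims=True)
```

Because averaging and baseline subtraction are both linear, applying the baseline to the average gives the same result as applying it to each trial before averaging.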

3.6 Statistical analysis

3.6.1 Selection of applied methods

Given the small sample size (16 subjects), we preferred non-parametric tests. However, no such non-parametric test is available for assessing the interaction between the two factors (attention and memory). Therefore, we used analysis of variance (ANOVA) with appropriate corrections for non-sphericity.

3.6.1.1 Analysis of variance (ANOVA)

Analysis of variance is a set of statistical methods that assigns sample variance to different sources and thereby determines whether the variation arises within or among different population groups. It applies to classical linear models and has also been extended to generalized linear models and multilevel models (Gelman, 2005). We performed repeated-measure ANOVAs, which means that the different measures are conducted within the same individuals. In this experiment, one-way and two-way ANOVAs were used. In a two-way ANOVA, groups have two defining characteristics instead of one. Both ANOVAs are special cases of the linear model. The method assumes that the samples are normally distributed within different population groups, each featuring the same variance. The ANOVA thus tests whether the means of several groups are all equal in order to determine whether the groups are actually different or not. This is similar to a t-test. But since there are more than two groups, multiple t-tests would be necessary, and this in turn would increase the chance of committing a type I error. Therefore, ANOVAs are useful for comparing more than two means. The result of an ANOVA only states whether the tested groups are different or not, but it does not reveal which means differ. For this reason it is necessary to perform so-called post-hoc tests like the permutation test (see 3.6.1.2). Another important issue is the problem of multiple comparisons that arises from testing multiple hypotheses at the same time: the more tests are performed, the higher the probability of obtaining at least one false positive result. For this reason it is important to correct for the number of comparisons, for example with the Bonferroni correction.
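
The multiple comparison problem can be made concrete with a small numerical illustration. The Python sketch below is not part of the original analysis; it assumes independent tests for the family-wise error rate, and the function names are hypothetical.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive among n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests

def bonferroni(p_values):
    """Bonferroni-corrected P values: multiply by the number of tests, cap at 1."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]

# Three uncorrected tests at alpha = 0.05 give a family-wise rate of about 0.14.
rate = familywise_error_rate(0.05, 3)
```

With three pairwise comparisons, as in a factor with three levels, the chance of at least one false positive rises from 5% to roughly 14% if no correction is applied.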

3.6.1.2 Statistic permutation test

To limit assumptions on the data distribution, we used, as much as possible, a test based on randomizations (Edgington, 1995). Each randomization consisted of (1) the random permutation of the 16 pairs (corresponding to the 16 subjects) of values, (2) the sum of squared sums of values in the 2 obtained samples, and (3) the computation of the difference between these two statistic values. We did 10,000 such randomizations to obtain an estimate of the distribution of this difference under the null hypothesis. We then compared the actual difference between the ERPs in the 2 conditions of interest to this distribution. When this test was used over several time-windows and electrodes, we corrected for multiple tests. In the temporal dimension, we used a randomization procedure (Blair, 1993) to estimate the minimum number of consecutive 10-ms time-windows that must be significant for the effect to be globally significant on the entire time-window of interest (0-300 ms). For the spatial dimension, we considered the data to be independent and therefore set the statistical threshold to P < 0.0005 (see Figure 3.2).
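
The logic of a randomization test on paired data can be sketched in Python. The sketch is a simplified stand-in and not the exact procedure described above: it uses the absolute sum of the paired differences as test statistic instead of the statistic based on squared sums, and it adds 1 to numerator and denominator of the P value. The function name and these choices are assumptions.

```python
import numpy as np

def paired_permutation_test(a, b, n_randomizations=10_000, seed=0):
    """Two-tailed randomization test for paired samples (one pair per subject)."""
    rng = np.random.default_rng(seed)
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    observed = abs(diff.sum())
    # Swapping the two values within a pair flips the sign of that pair's difference.
    signs = rng.choice([-1.0, 1.0], size=(n_randomizations, diff.size))
    null = np.abs((signs * diff).sum(axis=1))
    # Proportion of randomizations at least as extreme as the observed statistic.
    return (np.count_nonzero(null >= observed) + 1) / (n_randomizations + 1)
```

With 16 subjects there are 2^16 = 65,536 possible within-pair swaps, so 10,000 random draws give an estimate of the null distribution rather than an exhaustive enumeration.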

Figure 3.2 – Significant differences between ERPs to attended and ignored sounds. Permutation tests were performed over all 64 electrodes and 10-ms time-windows between 0 and 300 ms, contrasting ERPs to attended and ignored standards, independent of memory conditions. P values are plotted in a time (horizontal axis) by electrodes (vertical axis) space. Only significant P values after correction for multiple tests in space and time are plotted (P < 0.0005).

3.6.2 Behavioral data

In the attention task, a button press within the interval of 200-1000 ms after target onset was considered a correct response, and a press at any other time was counted as a false alarm. Reaction times, percentage of correct responses and number of false alarms were averaged across attention conditions for each memory condition, separately. The effect of the memory task difficulty on these measures was assessed using a repeated-measure one-way analysis of variance (ANOVA) with memory difficulty (3 levels: no, easy, difficult) as within-subject factor. When necessary, ANOVA results were corrected with the Greenhouse-Geisser procedure (epsilon and corrected P are reported). Significant effects were explored using 2-tailed paired t-tests. The Bonferroni correction was used to correct the P-value for multiple comparisons.
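
The scoring rule for button presses can be written down as a short Python sketch. It is an illustration only: the function name, the inclusive window limits and the rule that a second press to an already detected target counts as a false alarm are assumptions not stated in the text.

```python
def score_presses(target_onsets_ms, press_times_ms, window=(200.0, 1000.0)):
    """Classify presses as correct responses or false alarms.

    Returns (percent_correct, reaction_times_ms, n_false_alarms).
    """
    detected = set()
    reaction_times = []
    false_alarms = 0
    for press in press_times_ms:
        # Find a not-yet-detected target whose response window contains this press.
        match = next(
            (t for t in target_onsets_ms
             if t not in detected and window[0] <= press - t <= window[1]),
            None,
        )
        if match is None:
            false_alarms += 1
        else:
            detected.add(match)
            reaction_times.append(press - match)
    percent_correct = 100.0 * len(detected) / len(target_onsets_ms)
    return percent_correct, reaction_times, false_alarms
```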

3.6.3 ERP standards

To compare ERPs to attended and ignored standards, we conducted a permutation test on the ERP mean amplitude in successive 10-ms time-windows at each electrode between all attended and ignored standards (collapsing memory conditions) with correction for multiple tests (see 3.6.1.2 and Figure 3.2). Furthermore, we performed a two-way repeated-measure ANOVA with memory difficulty (3 levels: no, easy, difficult) and attention condition (3 levels: attended, ignored, control) as within-subjects factors, on 1 fronto-central group of electrodes (Fz, F1, F2, FCz, FC1, FC2), in 3 successive 50-ms time-windows (150-200, 200-250 and 250-300 ms). The selection of electrodes and time-windows of interest was based on results in previous EEG studies on auditory selective attention and on the permutation test results in the present study. Significant effects were explored using post-hoc permutation tests. We assessed topography differences on the difference between ERPs to attended and control standards, and the difference between ERPs to control and ignored standards (collapsing memory conditions). To avoid any bias from amplitude effects, these difference values were first normalized for each subject using a division by the norm of the vector in electrode space (McCarthy, 1985). We then used two different methods to assess topographical differences. First, we performed a two-way repeated-measure ANOVA with attention effect (2 levels: “attended – control” and “control – ignored”) and electrode group (2 levels: anterior frontal and posterior frontal) as within-subjects factors, on the 250–300 ms time-window. The anterior frontal group included the Fz, F1 and F2 electrodes, and the posterior frontal group the FCz, FC1 and FC2 electrodes. The second method consisted of computing the center of mass of the components.
In physics, the center of mass of a system of particles is the point at which the system's whole mass can be considered to be concentrated, and is a function only of the positions and masses of the particles that compose the system. Applied to ERPs, ERP amplitudes at each electrode are considered as the masses, and the electrode coordinates as the positions of the particles (Manjarrez, 2007). We computed the center of mass for “attended-control” and “ignored-control” effects from the mean ERP value in the 250-300 ms time-window from 21 frontal electrodes (Fpz, AFz, Fz, FCz, Cz, F1, FC1, C1, Fp1, AF3, F3, FC3, C3, F2, FC2, C2, Fp2, AF4, F4, FC4, C4) in each subject. We used a repeated-measure ANOVA with attention (2 levels: “attended-control” and “ignored-control”) and coordinates (3 levels: X, Y and Z) as within-subjects factors to compare the coordinates of the centers of mass. Significant effects were explored using post-hoc permutation tests. All data analyses were performed with ELAN-Pack software developed at INSERM U821 (Lyon, France).
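
The two topographical measures described above, normalization by the vector norm and the center of mass, can be sketched with NumPy. This is an illustration under assumptions and not the ELAN-Pack implementation: the function names and array layout are hypothetical, and the sketch does not address how amplitudes of mixed sign or a zero amplitude sum were handled.

```python
import numpy as np

def normalize_topography(amplitudes):
    """Divide the amplitudes by their vector norm in electrode space."""
    amplitudes = np.asarray(amplitudes, dtype=float)
    return amplitudes / np.linalg.norm(amplitudes)

def center_of_mass(amplitudes, positions):
    """Amplitude-weighted mean of the electrode coordinates.

    amplitudes: array of shape (n_electrodes,), treated as the masses
    positions: array of shape (n_electrodes, 3), X, Y and Z coordinates
    """
    amplitudes = np.asarray(amplitudes, dtype=float)
    positions = np.asarray(positions, dtype=float)
    # assumption: the amplitudes do not sum to zero
    return (amplitudes[:, None] * positions).sum(axis=0) / amplitudes.sum()
```

After normalization every topography has unit length, so a remaining difference between two conditions reflects the shape of the scalp distribution rather than its overall amplitude.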

3.6.4 ERP deviants

To define latencies and electrodes of interest for further analysis, we computed the grand-average ERP across all conditions (see Figure 3.3). Three main responses were considered: N1 (time-window: 100-150 ms; electrode group: Fz, F1, F2, FCz, FC1, FC2), N2b (time-window: 205-255 ms; left electrode group: F5, F7, FC5, FT7, C5, T7; right electrode group: AF4, AF8, F4, F6, F8, FC6), and P3 (time-window: 350-450 ms; electrode group: Pz, POz, P1, P2). The time-windows span 50 ms around the maximum of the N1 and N2b, and 100 ms around the maximum of the P3. Electrode groups were chosen as the electrodes with maximum amplitude on the topographies in these time-windows. For each response, we performed a two-way repeated-measure ANOVA with memory difficulty (3 levels: no, easy, difficult) and attention condition (3 levels: attended, ignored, control) as within-subjects factors, on the corresponding electrode groups and time-windows. Significant effects were explored using post-hoc permutation tests.
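
The dependent variable entering these ANOVAs is a mean amplitude over an electrode group and a time-window. A minimal NumPy sketch of this measure is given below; the function name, the argument layout and the half-open time-window are assumptions made for illustration.

```python
import numpy as np

def mean_amplitude(erp, times_ms, channel_names, electrodes, window_ms):
    """Mean ERP amplitude over a group of electrodes and a time-window.

    erp: array of shape (n_channels, n_samples)
    times_ms: array of shape (n_samples,)
    channel_names: list of channel labels in the row order of erp
    """
    rows = [channel_names.index(name) for name in electrodes]
    # assumption: samples satisfy start <= t < end
    in_window = (times_ms >= window_ms[0]) & (times_ms < window_ms[1])
    return erp[np.ix_(rows, np.flatnonzero(in_window))].mean()
```

For the N1, for example, the call would use the electrodes Fz, F1, F2, FCz, FC1 and FC2 with a window of (100.0, 150.0).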

Figure 3.3 – Main ERP components to deviants. (A) Grand average ERPs across all conditions on 22 out of the 64 recorded electrodes. Three main waves can be observed: the N1 between 100 and 150 ms (orange shaded area), the N2b between 205 and 255 ms (red shaded area) and the P3 maximal around 350-450 ms (purple shaded area). (B) Left scalp topography of the N2b wave (205-255 ms). (C) Top topography of the N1 wave (100-150 ms). (D) Back topography of the P3 wave (350-450 ms). (E) Right topography of the N2b wave (205-255 ms). The black ovals surround the groups of electrodes used for further analysis of ERPs to deviants.

4 Results

We used a dual task protocol to orthogonally manipulate attention and cognitive resources. For the attention task, we adapted the classic auditory attention protocol by adding a third condition (control condition) in which attention was considered as equally distributed to all sounds. We measured with electroencephalography (EEG) the effects of three distinct levels of attention by comparing the event-related potentials (ERPs) to the same sounds when they were attended (in the attended ear), ignored (in the opposite, non-attended ear) or during the control condition. The availability of cognitive resources was modulated by varying the difficulty of a concurrent sound memorization task (3 difficulty levels: no, easy or difficult memory task). Our hypothesis was that if attention-mediated facilitation and inhibition are distinct mechanisms, they would be differentially affected by the difficulty of the memory task.

4.1 Behavioral data

Participants performed better in the easy (99.0% of correct responses) than in the difficult (82.1%) memory task (t16 = 4.65, P = 0.0003). These results indicate that manipulation of the memory load was effective. We observed a significant effect of the memory task difficulty on the performance of the attention task, both in terms of percentage of correct responses (F2,30 = 6.0, ε = 0.781, P = 0.012) and reaction times (F2,30 = 4.5, ε = 0.932, P = 0.023), but not in the number of false alarms (F2,30 = 2.7, ε = 0.771, P = 0.098) in the attention task (see Table 1). Post-hoc t-tests showed that the percentage of correct responses was lower during the difficult than during the easy memory task (t16 = 3.00, P = 0.027) or when there was no memory task (t16 = 2.88, P = 0.033). Subjects were also faster to detect the targets when there was no memory task than when the memory task was difficult (t16 = 3.06, P = 0.024). These results indicate that the higher the memory load, the worse the attention performance.

Table 1 – Effects of the memory task difficulty on the attention task performance. Mean percentage of correct responses, mean number of false alarms and mean reaction time (and their standard error of the mean, SEM) in the attention task are indicated as a function of the memory task difficulty.

4.2 ERP results of standards

4.2.1 Main attention effect (attended versus ignored)

Previous studies investigating auditory selective attention compared ERPs to attended and ignored (unattended) standard sounds and found a negative frontally distributed activity (called Nd) starting around 100-150 ms (reviewed in Giard, 2000). We confirmed these results by performing an analysis of our data using permutation tests over all 64 electrodes and 10-ms time-windows between 0 and 300 ms (with correction for multiple comparisons), comparing ERPs to attended and ignored standards, independently of the memory conditions. We found that ERPs to attended and ignored standards begin to differ around 150 ms (see Figures 3.2 and 4.1 A) and that this difference is reflected in a negative frontally distributed component maximal over fronto-central electrodes (see Figure 4.1 B). Following these and previous authors, we focused our analysis of ERPs to standard stimuli on a fronto-central group of electrodes (Fz, F1, F2, FCz, FC1 and FC2) and on 3 successive 50-ms time-windows between 150 and 300 ms.

Figure 4.1 – Main attention effect on ERPs to standards. (A) Mean ERPs at Fz electrode. ERPs to attended and ignored standards are depicted in green and red, respectively. The difference between ERPs to attended and ignored standards is represented by a dashed black line; the shaded area corresponds to the 150-300 ms period, used for further analysis, when this difference is significant (see Fig. 3.2). (B) Scalp topography (top view) of the mean difference between ERPs to attended and ignored standards (200-300 ms). The black dot indicates the position of the Fz electrode and the black oval surrounds the fronto-central group of electrodes used for further analysis of ERPs to standards.

4.2.2 Influence of the memory task difficulty on attention effects

We examined the ERPs to attended, control and ignored standards across three conditions of no, easy or difficult memory task (see Figure 4.2) and performed a two-way ANOVA on the ERP mean amplitude, with memory difficulty (no, easy and difficult) and attention condition (attended, control, ignored) as factors. On the three 50-ms time-windows, we found a significant main effect of attention, but not of the memory task difficulty (see Table 2).

Figure 4.2 – Mean ERPs by attention and memory conditions. Mean ERPs at the fronto-central electrode group to attended (green), control (grey) and ignored (red) standards in the no (A), easy (B) and difficult (C) memory tasks are depicted. Shaded areas correspond to the 3 successive 50-ms windows, in the 150-300 ms period, used for statistical analysis.

We also found a significant interaction between attention condition and the memory task difficulty between 200 and 250 ms, but not for the other time-windows. To assess whether these results are independent of the control condition, we performed the same statistical analysis excluding the control condition: a two-way ANOVA on ERP mean amplitude, with memory difficulty (no, easy and difficult) and attention condition (attended and ignored) as factors. We obtained similar results with and without factoring in the control condition (see Table 2).

Table 2 – Effect of the memory task difficulty and attention conditions on the ERP amplitude. Results of the two-way ANOVA on ERP mean amplitude, with memory difficulty (no, easy and difficult) and attention condition as factors, for the three tested time-windows. Statistical values (F, ε and P) of attention and memory difficulty main effects and of attention by memory interaction effect are indicated with the control condition included (attention condition factor with 3 levels: attended, control and ignored) and with the control condition excluded (attention condition factor with 2 levels: attended and ignored). Significant effects are highlighted in grey.

To further investigate the effect of the memory task on attention modulations between 200 and 250 ms, we assessed, for each memory difficulty, the amplitude of the facilitatory (ERP difference between attended and control standards) and inhibitory (ERP difference between ignored and control standards) attention effects (see Figure 4.3 A). We found that amplitudes of ERPs to attended and control standards were significantly different in all memory conditions (P < 0.004), whereas the amplitudes of ERPs to ignored and control standards significantly differed in the easy memory task only (P = 0.018). These results indicate that the memory task difficulty differentially affects facilitation and inhibition mechanisms.

Figure 4.3 – Effect of the memory task difficulty on attention effects. (A) Mean ERP amplitudes (fronto-central group, 200-250 ms) of attention-mediated facilitation (green) and inhibition (red) effects as a function of the memory task difficulty (no, easy, difficult). Facilitation and inhibition effects are represented as the mean difference between ERPs to attended and control, and to ignored and control standards, respectively. Error bars represent 1 SEM. Stars indicate significant differences assessed by permutation post-hoc tests of the interaction (attention by memory) effect (*: P < 0.05; **: P < 0.01; ***: P < 0.001). (B) Scalp topographies (top view) of the attention effects (200-250 ms): facilitation (mean difference between ERPs to attended and control standards) and inhibition (mean difference between ERPs to ignored and control standards). The black oval surrounds the fronto-central electrode group used for computation of mean amplitudes and statistical analysis represented in (A).

4.2.3 Timing of attention facilitation and inhibition

We assessed the timing and amplitude of the facilitatory and inhibitory attention effects, independently of the memory load. ERP amplitudes to attended and control standards were different between 150 and 200 ms (P = 0.001), between 200 and 250 ms (P = 0.0001), and between 250 and 300 ms (P = 0.0001). Amplitudes of ERPs to ignored and control standards were not different between 150 and 200 ms (P = 0.85), but were significantly different between 200 and 250 ms (P = 0.002) and between 250 and 300 ms (P = 0.0003). These results, in combination with the ones observed in the attention by memory interaction, suggest that facilitation and inhibition mechanisms have different timing: facilitation starts as early as 150 ms after stimulus onset in all memory conditions, whereas inhibition begins around 200 ms in the easy memory condition and not before 250 ms in the other memory conditions (see Figure 4.4 A).

Figure 4.4 – Timing and topography of attention-mediated facilitation and inhibition. (A) Mean ERPs at the fronto-central electrode group to attended, control and ignored standards are depicted in green, grey and red, respectively. The differences between ERPs to attended and control (facilitation), and to ignored and control (inhibition) standards are represented below by green and red lines, respectively. Yellow shaded areas correspond to the 50-ms windows, in the 150-300 ms period, used for statistical analysis. Stars indicate significant differences assessed by permutation post-hoc tests of the main attention effect (**: P < 0.01; ***: P < 0.001). (B) Scalp topographies (top view) of facilitation (mean difference between ERPs to attended and control standards) and inhibition (mean difference between ERPs to ignored and control standards) between 250 and 300 ms. The black oval surrounds the fronto-central electrode group used for ERP computation and statistical analysis represented in (A).

4.2.4 Topographies of attention facilitation and inhibition

We also observed that the inhibitory component had a more posterior scalp distribution than the facilitatory component (see Figures 4.3 B and 4.4 B). Since facilitatory and inhibitory components were both found to be active between 250 and 300 ms, we used this time-window to test if the topographies of these two components were different. We first performed a two-way ANOVA on normalized ERP mean amplitude (averaging across memory conditions), with electrode group (anterior or posterior frontal) and attention effect (“attended – control” and “control – ignored”) as factors (see Figure 4.5). We found a significant interaction between electrode groups and attention effects (F1,15 = 15.3, P = 0.001), suggesting that facilitation and inhibition mechanisms have distinct topographies.

Figure 4.5 – Comparison of facilitation and inhibition topographies. Scalp topographies (left, top and right views) of the normalized mean attention effects (250-300 ms). On the top: facilitation (mean difference between ERPs to attended and control standards). On the bottom: inhibition (mean difference between ERPs to control and ignored standards). The black ovals surround the anterior and posterior frontal groups of electrodes used for statistical analysis of the topography differences.

Second, we computed the center of mass for “attended – control” and “control – ignored” effects from the mean ERP value in the 250–300 ms time-window from 21 frontal electrodes. We obtained the mean coordinates X = 0.55, Y = 10.16, Z = 11.00 for the facilitatory component, and X = 0.24, Y = 2.70, Z = 3.51 for the inhibitory component. Using a two-way ANOVA with attention (2 levels: “attended – control” and “control – ignored”) and coordinates (3 levels: X, Y and Z) as factors, we found a significant effect of attention (P = 0.050), of coordinates (P = 0.008), and attention by coordinates interaction (P = 0.044). The centers of mass of the facilitatory and inhibitory components were found to be different in their Y coordinates (P < 0.05), but not in their X or Z coordinates (P > 0.26), suggesting that the topography of the facilitatory component is more anterior than the topography of the inhibitory one. It is noteworthy that these results are independent of the control condition since this condition is subtracted to extract both the facilitatory (attended – control) and the inhibitory (control – ignored) components, and the remaining difference can only be attributed to a difference between the ERPs to attended and ignored sounds. In this analysis, the control condition is only used to eliminate the overlapping P2 response.

4.3 ERP results of deviants

4.3.1 Attention enhancement of deviant processing

We investigated the effect of attention condition and memory difficulty on the mean amplitude of three classically investigated responses to deviant sounds: N1, N2b, and P3 (see Figure 3.3). We found a significant main effect of attention for the N1 (F2,30 = 7.3, ε = 0.912, P = 0.0037) and the N2b (left: F2,30 = 16.980, ε = 0.911, P = 0.00003; right: F2,30 = 8.370, ε = 0.952, P = 0.0016), but no memory task difficulty (P > 0.21), nor interaction (P > 0.39) effects (see Figure 4.6). N1 was of larger amplitude to attended than control (P = 0.016) and ignored (P = 0.0048) deviants and N2b was actually present only when deviants were attended, i.e. targets (P < 0.02). Interestingly, the N1 amplitude to control and ignored deviants was not significantly different (P > 0.19).

Furthermore, the P3 was also modulated by attention (F2,30 = 62.743, ε = 0.905, P < 0.00001). Interestingly, the P3 was only present in response to attended deviants, i.e. targets (see Figure 4.7).

Figure 4.6 – N1 and N2b ERPs to deviants. (A) Mean ERPs at frontal (left), left temporal (center) and right temporal (right) electrode groups. ERPs to attended, control and ignored deviants are depicted in green, grey and red, respectively. The shaded areas correspond to the 100-150 ms and 205-255 ms periods, used for N1 and N2b analysis, respectively. (B) Scalp topographies (top views) of the N1 and N2b waves to attended, control and ignored deviants (100-150 ms). Top views of N1 topographies between 100 and 150 ms (left). Left and right views of N2b topographies between 205 and 255 ms (center and right panels, respectively). The black or white ovals surround the electrode group used for N1 and N2b analysis and to compute the mean ERP in (A).

Figure 4.7 – P3 to targets. (A) Mean ERPs at parietal electrode group. ERPs to attended, control and ignored deviants are depicted in green, grey and red, respectively. The shaded area corresponds to the 350-450 ms period, used for P3 analysis. (B) Scalp topographies (back views) of the mean ERPs to attended, control and ignored deviants (350-450 ms). The black oval surrounds the parietal group of electrodes used for P3 analysis and to compute the mean ERP in A. The P3 is only present in response to targets (attended deviants).

4.3.2 Memory effect on the P3-Component

In addition, we found a significant interaction between the attention conditions and the memory task difficulty on the P3 (F4,60 = 2.822, ε = 0.863, P = 0.041). The amplitude of the P3 to target sounds was larger when there was no memory task than when the memory task was difficult (P = 0.0008; see Figure 4.8).

Figure 4.8 – Effect of the memory task difficulty on the P3 to targets. (A) Mean P3 amplitudes (parietal group, 350-450 ms) to attended, control and ignored deviants (depicted in green, grey and red, respectively) as a function of the memory task difficulty (no, easy, difficult). Error bars represent 1 SEM. Stars indicate significant differences assessed by permutation post-hoc tests of the interaction (attention by memory) effect (***: P < 0.001). (B) Scalp topographies (back views) of the mean ERPs to targets (attended deviants) as a function of the memory task difficulty, between 350 and 450 ms. The black oval surrounds the parietal group of electrodes used for P3 analysis. (C) Mean ERPs at the parietal electrode group. ERPs to targets in the no, easy and difficult memory conditions are depicted with thick, thin and dashed green lines, respectively. The shaded area corresponds to the 350-450 ms period, used for P3 analysis.

5 Discussion
Auditory selective attention is a complex mechanism that operates constantly in daily life: the perception of certain stimuli in the environment is enhanced relative to other stimuli of lesser immediate priority (the cocktail party effect). Numerous theories and studies have been put forward to explain this phenomenon and to elucidate the underlying mechanisms and associated anatomical structures. To date, it is still not known exactly how auditory selective attention operates, but these studies have brought together several pieces of information about its mechanisms. It is now well accepted that auditory attention can modulate the sensory analysis of sounds at multiple levels. First, selective attention can operate at an early stage of sensory processing, modulating the automatic/exogenous ERPs generated in the auditory cortices as early as 30 ms after sound onset. Second, attention can also modulate stimulus processing at late selection stages via an attentional trace observed as a sustained negative deflection of endogenous origin, called the ‘Negative difference’ (Nd) or Processing Negativity (PN). Additionally, auditory selective attention seems to operate via facilitation and inhibition mechanisms, reflecting enhanced processing of relevant sounds and reduced processing of irrelevant sounds, respectively.
In this study we investigated whether facilitation and inhibition are distinct mechanisms that can operate independently at a late selection stage. To do so, we modulated the amount of available cognitive resources in an auditory selective attention task: if facilitation and inhibition are distinct mechanisms, they should be affected differently by variation of the cognitive load (memory task). We therefore used a dual-task protocol in which subjects had to perform an auditory attention task and a memory task at the same time.
We compared the electrophysiological responses to the same sounds when they were attended, ignored or under a control condition, where attention was considered equally distributed towards all sounds. After analyzing the data, we found two frontally distributed components: a negative one in response to attended standard sounds (facilitatory component), and a positive one to ignored standard sounds (inhibitory component). These frontal electrophysiological responses have distinct timing and topographies, and are differentially modulated by the difficulty of the memory task. These results provide

evidence that auditory attention is enabled by distinct facilitation and inhibition mechanisms.

We first observed a negative, frontally distributed ERP component with onset at about 150 ms that differentiated attended and ignored standard sounds. This response probably corresponds to components of the Nd or PN described in several previous studies (reviewed by Giard, 2000). This component is thought to index late selective attention mechanisms involved in controlling and maintaining the representation of stimuli according to their behavioral relevance (Giard, 2000; Näätänen, 1992; Näätänen, 1982) and can be elicited without being preceded by N1 enhancement (Näätänen, 1978), as was observed in the present experiment. Indeed, we observed no difference between ERPs to attended and ignored sounds during the first 150 ms, probably because sufficient information must first be processed to decide whether the sound belongs to the task-relevant or the task-irrelevant ear, in agreement with the findings and theory of Näätänen and colleagues (Näätänen, 1992; Näätänen, 1982). To dissociate facilitatory and inhibitory components, we used a control condition in which the participants had to detect binaural pure tones. We acknowledge that the perfect control condition is elusive, but we considered the current choice better than a passive task (what is the subject actually doing?) or a visual task (inter-modal attention is involved). Furthermore, this control task has already proven valuable for understanding the mechanisms of auditory selective attention in intracranial recordings (Bidet-Caulet, 2007). We assume that, in this control condition, participants' auditory attention was equally distributed towards all monaural standard sounds. Indeed, to detect these binaural pure tones, they had to pay attention to the auditory modality, but they did not need to actively ignore the other sounds since the pure tones were quite salient and the task was easy.
The control condition did not require selective attention, but required broad auditory attention towards all sounds in order to be performed correctly. One could argue that the control task actually required inhibition of the standard monaural noises; in that case, we might be underestimating the inhibitory component. More importantly, to further address the issue of the control condition, we reanalyzed the data independently of the control condition. This reanalysis did not change the effect of the memory difficulty manipulation: processing of attended and ignored sounds is

differentially affected by the memory difficulty. Moreover, the topographical differences are independent of the control condition since the control condition is subtracted to extract both the facilitatory and inhibitory components. Using this control condition, we found that the Nd response can be dissociated into two distinct components: (1) a negative ERP component in response to attended standards, with onset at about 150 ms and an anterior frontal scalp distribution; and (2) a positive ERP component in response to ignored standards, with onset between 200 and 250 ms and a fronto-central scalp distribution. These findings are consistent with results from several previous scalp EEG studies dissociating the Nd component into facilitatory and inhibitory sub-components, using control conditions in the auditory modality (Donald, 1987; Melara, 2002; Schroger, 1997) or in the visual modality (Alho, 1987; Alho, 1994; Berman, 1989; Degerman, 2008; Michie, 1990; Michie, 1993). These studies found a positive response or “rejection positivity” to unattended sounds compared to the control condition, starting later than the negative response to attended sounds. Some of this previous work suggested that the topographies of the facilitatory and inhibitory components are different (Degerman, 2008; Donald, 1987; Melara, 2002). In the current paper, these two components are directly compared and dissociated. The distinct scalp topographies suggest that different brain sources underlie the facilitatory and inhibitory components. However, we cannot precisely infer the brain origin of these components from the present data. These components most likely reflect neural activity from the auditory cortices in the superior temporal lobes and/or from frontal areas, as has been suggested for the Nd components (Alcaini, 1994; Degerman, 2008; Giard, 1988; Woldorff, 1993).
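The subtraction logic described above can be sketched in a few lines: the facilitatory component is the ERP to attended standards minus the control ERP, and the inhibitory component is the ERP to ignored standards minus the control ERP. The sketch below is only an illustration under stated assumptions; the helper names `difference_wave` and `window_mean`, the sampling grid and all amplitude values are hypothetical and are not taken from the recorded data.

```python
from statistics import mean


def difference_wave(condition, control):
    """Point-by-point subtraction of the control ERP from a condition ERP."""
    if len(condition) != len(control):
        raise ValueError("ERPs must have the same number of samples")
    return [c - k for c, k in zip(condition, control)]


def window_mean(erp, times_ms, start_ms, end_ms):
    """Mean amplitude over samples with latency in [start_ms, end_ms]."""
    values = [v for v, t in zip(erp, times_ms) if start_ms <= t <= end_ms]
    if not values:
        raise ValueError("no samples in the requested window")
    return mean(values)


# Hypothetical toy ERPs (microvolts), one sample every 50 ms.
times_ms = [0, 50, 100, 150, 200, 250, 300]
attended = [0.0, 0.1, -1.0, -2.0, -2.5, -2.5, -2.0]
control = [0.0, 0.1, -1.0, -1.0, -1.0, -1.0, -1.0]
ignored = [0.0, 0.1, -1.0, -1.0, -0.5, 0.0, 0.0]

# Negative-going when the attended ERP is more negative than control.
facilitation = difference_wave(attended, control)
# Positive-going when the ignored ERP is more positive than control.
inhibition = difference_wave(ignored, control)

print(window_mean(facilitation, times_ms, 150, 300))
print(window_mean(inhibition, times_ms, 200, 300))
```

Because both components share the same subtrahend, any difference between their scalp distributions cannot be attributed to the control condition itself, which is the point made in the paragraph above.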
To test whether these facilitatory and inhibitory components correspond to two functionally distinct mechanisms or are generated by a single control mechanism, we manipulated the availability of cognitive resources. The hypothesis was that the control of facilitation and inhibition mechanisms requires cognitive resources and that, if these two mechanisms are independently controlled, they should not covary with the amount of available cognitive resources. It has previously been shown that increasing the load on executive functions, such as memory, decreases the availability of cognitive resources for other cognitive tasks, such as an attention task (Lavie, 2005). We manipulated the availability of cognitive resources by varying the difficulty (or load) of a concurrent sound

memorization task. We found that facilitation and inhibition mechanisms in auditory selective attention are differentially modulated by the memory difficulty, providing evidence for distinct functional roles, as reported in the visual modality (Gazzaley, 2008; Gazzaley, 2005). More precisely, we found that the availability of cognitive resources differentially influenced the timing of attention-mediated facilitation and inhibition mechanisms: facilitation starts at the same latency (150 ms) under all memory loads, whereas inhibition is activated at 200 ms for low memory load (easy memory task), and after 250 ms for no and high (difficult memory task) memory loads. In a previous visual study employing fMRI, brain activation by distracting stimuli was found to be larger under high than under low memory load, suggesting a reduction of inhibition mechanisms under high memory load (de Fockert, 2001). Accordingly, our findings show that the inhibition mechanism is delayed from low to high memory load conditions. Thus, the fewer cognitive resources are available, the later the inhibition mechanisms are activated and the more distractors are processed. We did not observe inhibition before 250 ms in the no memory condition, likely because of the ease of the attention task. These results extend the cognitive load theory (Lavie, 2005) to the auditory modality. Importantly, using the time resolution of electrophysiology, we have also shown that the availability of cognitive resources influences late selection processes (after the first steps of the sensory analysis), which control access to memory and response. When cognitive resources are available, distractor inhibition can be activated early (as early as 200 ms). Late attention-mediated inhibition mechanisms also seem to be influenced by the task difficulty: they are delayed when the task is easy even if cognitive resources are available.
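The onset latencies discussed above (150 ms for facilitation, 200 ms or later for inhibition) depend on a rule for deciding when a difference wave starts to deviate reliably from zero. One common rule, shown here purely as an assumed illustration and not as the criterion applied in this study, is to take the first latency at which a per-sample significance mask stays true for a minimum number of consecutive samples. The function name `onset_latency`, the sampling grid and the masks are hypothetical.

```python
def onset_latency(times_ms, significant, min_run=3):
    """Latency of the first run of `min_run` consecutive significant
    samples, or None if no such run exists."""
    run = 0
    for i, flag in enumerate(significant):
        run = run + 1 if flag else 0
        if run == min_run:
            return times_ms[i - min_run + 1]
    return None


# Hypothetical masks on a 25 ms grid (True = sample differs from control).
times_ms = [100, 125, 150, 175, 200, 225, 250, 275, 300]
facilitation_mask = [False, False, True, True, True, True, True, True, True]
# The isolated True at 125 ms is too short to count as an onset.
inhibition_mask = [False, True, False, False, True, True, True, True, True]

print(onset_latency(times_ms, facilitation_mask))
print(onset_latency(times_ms, inhibition_mask))
```

Requiring a run of consecutive samples guards against calling an onset on a single spurious sample, at the cost of some sensitivity to very brief effects.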
Analysis of ERPs to deviants reveals strong effects of attention on deviant processing, consistent with previous findings (Hansen, 1984; Näätänen, 1993). We observed an early attention-dependent enhancement of the sensory N1 response between 100 and 150 ms. Moreover, N2b and P3 components, known to be related to target processing (Muller-Gass, 2002; Näätänen, 1983), were only elicited in response to attended deviants. ERPs to ignored and control deviants did not differ, suggesting that only facilitation mechanisms modulate deviant processing. The present study provides new insights into the brain mechanisms of selective attention: late selection of the relevant stream of stimuli relies on the engagement of

distinct attention-mediated facilitation and inhibition mechanisms. Sustained facilitatory and inhibitory frontally distributed components represent distinct cognitive processing of the attended and ignored streams of sounds, enhancing the rapid and accurate detection of targets without interference by distracting stimuli. These findings provide evidence that, at a late selection stage, attention operates by employing distinct facilitation and inhibition mechanisms.

6 References
1. Alcaini, M., Giard, M. H., Echallier, J. F., and Pernier, J., (1994), Selective auditory attention effects in tonotopically organized cortical areas: A topographic ERP study, Human Brain Mapping, 2, (p. 159-169).
2. Alho, K., Tottola, K., Reinikainen, K., Sams, M., and Näätänen, R., (1987), Brain mechanism of selective listening reflected by event-related potentials, Electroencephalogr Clin Neurophysiol, 68, (p. 458-70).
3. Alho, K., Woods, D. L., and Algazi, A., (1994), Processing of auditory stimuli during auditory and visual attention as revealed by event-related potentials, Psychophysiology, 31, (p. 469-79).
4. Araque, A. and Perea, G., (2004), Glial modulation of synaptic transmission in culture, Glia, 47, (p. 241-8).
5. Berman, S. M., Heilweil, R., Ritter, W., and Rosen, J., (1989), Channel probability and Nd: an event-related potential sign of attention strategies, Biol Psychol, 29, (p. 107-24).
6. Bidet-Caulet, A., Mécanismes neurophysiologiques de la perception de flux sonores chez l'Homme: Effets des contextes acoustiques et attentionnels, Dissertation, Université Claude Bernard, Lyon, 2006.
7. Bidet-Caulet, A., Fischer, C., Besle, J., Aguera, P. E., Giard, M. H., and Bertrand, O., (2007), Effects of selective attention on the electrophysiological representation of concurrent sounds in the human auditory cortex, J Neurosci, 27, (p. 9252-61).
8. Bidet-Caulet, A. and Mikyska, C., Facilitation and inhibition mechanisms in auditory selective attention, in Society for Neuroscience, 2008, Washington DC.
9. Blair, R. C. and Karniski, W., (1993), An alternative method for significance testing of waveform difference potentials, Psychophysiology, 30, (p. 518-24).
10. Broadbent, D. E., (1958), Perception and communication, Pergamon Press, London.
11. Buschman, T. J. and Miller, E. K., (2007), Top-down versus bottom-up control of attention in the prefrontal and posterior parietal cortices, Science, 315, (p. 1860-2).
12. Cacioppo, J., Tassinary, L., and Berntson, G., (2005), Handbook of psychophysiology, 3 ed., Cambridge University Press, New York, (p. 908).
13. Cherry, E. C., (1953), Some experiments on the recognition of speech, with one and with two ears, Journal of the Acoustical Society of America, 25, (p. 975-979).
14. Davis, P. A., (1939), Effects of acoustic stimuli on the waking human brain, Journal of Neurophysiology, 2, (p. 494-499).
15. de Fockert, J. W., Rees, G., Frith, C. D., and Lavie, N., (2001), The role of working memory in visual selective attention, Science, 291, (p. 1803-6).
16. Degerman, A., Rinne, T., Sarkka, A. K., Salmi, J., and Alho, K., (2008), Selective attention to sound location or pitch studied with event-related brain potentials and magnetic fields, Eur J Neurosci, 27, (p. 3329-41).
17. Deutsch, J. A. and Deutsch, D., (1963), Attention: Some theoretical considerations, Psychol Rev, 70, (p. 80-90).
18. Donald, M. W., (1987), The timing and polarity of different attention-related ERP changes inside and outside of the attentional focus, Electroencephalogr Clin Neurophysiol Suppl, 40, (p. 81-6).
19. Ebner, A., (2006), EEG, 1 ed., Georg Thieme Verlag, Stuttgart, (p. 1-8).
20. Edgington, E. S., (1995), Randomization Tests, 3 ed., revised and expanded, Marcel Dekker, New York, USA.
21. Elul, R., (1971), The genesis of the EEG, Int Rev Neurobiol, 15, (p. 227-72).
22. Fletcher, H. and Munson, W. A., (1933), Loudness, Its Definition, Measurement and Calculation, J. Acoust. Soc. Am., 5, (p. 82-108).
23. Folstein, J. R. and Van Petten, C., (2008), Influence of cognitive control and mismatch on the N2 component of the ERP: a review, Psychophysiology, 45, (p. 152-70).
24. Gazzaley, A., Clapp, W., Kelley, J., McEvoy, K., Knight, R. T., and D'Esposito, M., (2008), Age-related top-down suppression deficit in the early stages of cortical visual memory processing, Proc Natl Acad Sci U S A, 105, (p. 13122-6).
25. Gazzaley, A., Cooney, J. W., Rissman, J., and D'Esposito, M., (2005), Top-down suppression deficit underlies working memory impairment in normal aging, Nat Neurosci, 8, (p. 1298-300).

26. Gazzaniga, M., Ivry, R., and Mangun, G., (2002), Cognitive Neuroscience, 2 ed., W. W. Norton & Company, New York City, (p. 244-251).
27. Gelman, A., (2005), Analysis of variance - Why it is more important than ever, Annals of Statistics, 33, (p. 1-31).
28. Geschwind, N. and Levitsky, W., (1968), Human brain: left-right asymmetries in temporal speech region, Science, 161, (p. 186-7).
29. Giard, M. H., Collet, L., Bouchet, P., and Pernier, J., (1994a), Auditory selective attention in the human cochlea, Brain Res, 633, (p. 353-6).
30. Giard, M. H., Fort, A., Mouchetant-Rostaing, Y., and Pernier, J., (2000), Neurophysiological mechanisms of auditory selective attention in humans, Front Biosci, 5, (p. D84-94).
31. Giard, M. H., Perrin, F., Echallier, J. F., Thevenet, M., Froment, J. C., and Pernier, J., (1994b), Dissociation of temporal and frontal components in the human auditory N1 wave: a scalp current density and dipole model analysis, Electroencephalogr Clin Neurophysiol, 92, (p. 238-52).
32. Giard, M. H., Perrin, F., Pernier, J., and Peronnet, F., (1988), Several attention-related wave forms in auditory areas: a topographic study, Electroencephalogr Clin Neurophysiol, 69, (p. 371-84).
33. Hansen, J. C. and Hillyard, S. A., (1984), Effects of stimulation rate and attribute cuing on event-related potentials during selective auditory attention, Psychophysiology, 21, (p. 394-405).
34. Hawkins, J. E., Human ear, in Britannica Encyclopaedia. 1997, Encyclopedia Britannica Inc.
35. Hillyard, S. A., Hink, R. F., Schwent, V. L., and Picton, T. W., (1973), Electrical signs of selective attention in the human brain, Science, 182, (p. 177-80).
36. Hoffmann, S. and Falkenstein, M., (2008), The correction of eye blink artefacts in the EEG: a comparison of two prominent methods, PLoS One, 3, (p. e3004).
37. Howard, M. A., 3rd, Volkov, I. O., Abbas, P. J., Damasio, H., Ollendieck, M. C., and Granner, M. A., (1996), A chronic microelectrode investigation of the tonotopic organization of human auditory cortex, Brain Res, 724, (p. 260-4).
38. Hudspeth, A. J., (1983), Mechanoelectrical transduction by hair cells in the acousticolateralis sensory system, Annu Rev Neurosci, 6, (p. 187-215).

39. James, W., (1890), The Principles of Psychology, New York: Henry Holt, 1, (p. 403-404).
40. Jancke, L., Mirzazade, S., and Shah, N. J., (1999), Attention modulates activity in the primary and the secondary auditory cortex: a functional magnetic resonance imaging study in human subjects, Neurosci Lett, 266, (p. 125-8).
41. Kaas, J. H. and Hackett, T. A., (1999), 'What' and 'where' processing in auditory cortex, Nat Neurosci, 2, (p. 1045-7).
42. Kastner, S. and Ungerleider, L. G., (2000), Mechanisms of visual attention in the human cortex, Annu Rev Neurosci, 23, (p. 315-41).
43. Kemp, D. T., (1978), Stimulated acoustic emissions from within the human auditory system, J Acoust Soc Am, 64, (p. 1386-91).
44. Kolb, B. and Whishaw, I. Q., (1996), Fundamentals of human neuropsychology, 4th ed., W.H. Freeman, New York, N.Y.
45. Lavie, N., (2005), Distracted and confused?: selective attention under load, Trends Cogn Sci, 9, (p. 75-82).
46. Lavie, N., (1995), Perceptual load as a necessary condition for selective attention, J Exp Psychol Hum Percept Perform, 21, (p. 451-68).
47. Lavie, N., Hirst, A., de Fockert, J. W., and Viding, E., (2004), Load theory of selective attention and cognitive control, J Exp Psychol Gen, 133, (p. 339-54).
48. Liegeois-Chauvel, C., Musolino, A., Badier, J. M., Marquis, P., and Chauvel, P., (1994), Evoked potentials recorded from the auditory cortex in man: evaluation and topography of the middle latency components, Electroencephalogr Clin Neurophysiol, 92, (p. 204-14).
49. Lukas, J. H., (1980), Human auditory attention: the olivocochlear bundle may function as a peripheral filter, Psychophysiology, 17, (p. 444-52).
50. Lukas, J. H., (1981), The role of efferent inhibition in human auditory attention: an examination of the auditory brainstem potentials, Int J Neurosci, 12, (p. 137-45).
51. Malmivuo, J., (2004), Comparison of the properties of EEG and MEG, International Journal of Bioelectromagnetism, 6, (p. 1-14).
52. Manjarrez, E., Vazquez, M., and Flores, A., (2007), Computing the center of mass for traveling alpha waves in the human brain, Brain Res, 1145, (p. 239-47).
53. McCarthy, G. and Donchin, E., (1981), A metric for thought: a comparison of P300 latency and reaction time, Science, 211, (p. 77-80).
54. McCarthy, G. and Wood, C. C., (1985), Scalp distributions of event-related potentials: an ambiguity associated with analysis of variance models, Electroencephalogr Clin Neurophysiol, 62, (p. 203-8).
55. Melara, R. D., Rao, A., and Tong, Y., (2002), The duality of selection: excitatory and inhibitory processes in auditory selective attention, J Exp Psychol Hum Percept Perform, 28, (p. 279-306).
56. Michie, P. T., Bearpark, H. M., Crawford, J. M., and Glue, L. C., (1990), The nature of selective attention effects on auditory event-related potentials, Biol Psychol, 30, (p. 219-50).
57. Michie, P. T., Solowij, N., Crawford, J. M., and Glue, L. C., (1993), The effects of between-source discriminability on attended and unattended auditory ERPs, Psychophysiology, 30, (p. 205-20).
58. Millett, D., (2001), Hans Berger: from psychic energy to the EEG, Perspect Biol Med, 44, (p. 522-42).
59. Moray, N., (1959), Attention in dichotic listening: Affective cues and the influence of instructions, Quarterly Journal of Experimental Psychology, (p. 56-60).
60. Muller-Gass, A. and Campbell, K., (2002), Event-related potential measures of the inhibition of information processing: I. Selective attention in the waking state, Int J Psychophysiol, 46, (p. 177-95).
61. Näätänen, R., (1992), Attention and Brain Function, Erlbaum, Hillsdale, NJ.
62. Näätänen, R., (1982), Processing negativity: an evoked-potential reflection of selective attention, Psychol Bull, 92, (p. 605-40).
63. Näätänen, R., Gaillard, A. W., and Mantysalo, S., (1978), Early selective-attention effect on evoked potential reinterpreted, Acta Psychol (Amst), 42, (p. 313-29).
64. Näätänen, R., Gaillard, A. W., and Varey, C. A., (1981), Attention effects on auditory EPs as a function of inter-stimulus interval, Biol Psychol, 13, (p. 173-87).

65. Näätänen, R., Gaillard, A. W. K., Anthony, W. K. G., and Walter, R., The Orienting Reflex and the N2 Deflection of the Event-Related Potential (ERP), in Advances in Psychology. 1983, North-Holland. (p. 119-141).
66. Näätänen, R., Paavilainen, P., Tiitinen, H., Jiang, D., and Alho, K., (1993), Attention and mismatch negativity, Psychophysiology, 30, (p. 436-50).
67. Näätänen, R. and Picton, T., (1987), The N1 wave of the human electric and magnetic response to sound: a review and an analysis of the component structure, Psychophysiology, 24, (p. 375-425).
68. Näätänen, R. and Winkler, I., (1999), The concept of auditory stimulus representation in cognitive neuroscience, Psychol Bull, 125, (p. 826-59).
69. Netter, F., (2006), Atlas of human Anatomy, 4 ed., Saunders Elsevier, Philadelphia, (p. 640).
70. Pandya, D. N., (1995), Anatomy of the auditory cortex, Rev Neurol (Paris), 151, (p. 486-94).
71. Pashler, H., (1998), The Psychology of Attention, MA: MIT Press, Cambridge, (p. 75-77).
72. Perrin, F., Pernier, J., Bertrand, O., and Echallier, J. F., (1989), Spherical splines for scalp potential and current density mapping, Electroencephalography and Clinical Neurophysiology, 72, (p. 184-7).
73. Perrin, F., Pernier, J., Bertrand, O., Giard, M. H., and Echallier, J. F., (1987), Mapping of scalp potentials by surface spline interpolation, Electroencephalography and Clinical Neurophysiology, 66, (p. 75-81).
74. Pfurtscheller, G. and Lopes da Silva, F. H., (1999), Event-related EEG/MEG synchronization and desynchronization: basic principles, Clin Neurophysiol, 110, (p. 1842-57).
75. Picton, T. W., Stuss, D. T., Kornhuber, H. H., and Deecke, L., The Component Structure of the Human Event-Related Potentials, in Progress in Brain Research. 1980, Elsevier. (p. 17-49).
76. Purves, D., Augustine, G. J., Katz, L. C., LaMantia, A. S., and McNamara, J. O., (1997), Neuroscience, MA: Sinauer Associates, Sunderland.
77. Rickheit, G., Herrmann, T., and Deutsch, W., (2003), Psycholinguistik: Ein internationales Handbuch, Walter de Gruyter, Berlin/New York, (p. 67).

78. Roberts, W. M., Howard, J., and Hudspeth, A. J., (1988), Hair cells: transduction, tuning, and transmission in the inner ear, Annu Rev Cell Biol, 4, (p. 63-92).
79. Schmidt, R. F. and Schaible, H., (1993), Neuro- und Sinnesphysiologie, 5 ed., Springer, Berlin, (p. 287-311).
80. Schmidt, R. F., Thews, G., and Lang, F., (2005), Physiologie des Menschen, 25 ed., Springer, Heidelberg, (p. 334-357).
81. Schmitt, B. M., Munte, T. F., and Kutas, M., (2000), Electrophysiological estimates of the time course of semantic and phonological encoding during implicit picture naming, Psychophysiology, 37, (p. 473-84).
82. Schroger, E. and Eimer, M., (1997), Endogenous covert spatial orienting in audition: "Cost-benefit" analyses of reaction times and event-related potentials, Quarterly Journal of Experimental Psychology - A, (p. 457-474).
83. Singer, W., (1993), Synchronization of cortical activity and its putative role in information processing and learning, Annu Rev Physiol, 55, (p. 349-74).
84. Soltani, M. and Knight, R. T., (2000), Neural origins of the P300, Crit Rev Neurobiol, 14, (p. 199-224).
85. Sutton, S., Braren, M., Zubin, J., and John, E. R., (1965), Evoked-potential correlates of stimulus uncertainty, Science, 150, (p. 1187-8).
86. Treisman, A. M., (1960), Contextual cues in selective listening, Quarterly Journal of Experimental Psychology, 12, (p. 242-248).
87. Trepel, M., (2008), Neuroanatomie: Struktur und Funktion, 4 ed., Urban & Fischer, München, (p. 358-370).
88. Vaughan, H. G., Jr. and Ritter, W., (1970), The sources of auditory evoked responses recorded from the human scalp, Electroencephalogr Clin Neurophysiol, 28, (p. 360-7).
89. Von Bekesy, G., (1960), Experiments in hearing, McGraw-Hill, New York.
90. Woldorff, M. G., Gallen, C. C., Hampson, S. A., Hillyard, S. A., Pantev, C., Sobel, D., and Bloom, F. E., (1993), Modulation of early sensory processing in human auditory cortex during auditory selective attention, Proc Natl Acad Sci U S A, 90, (p. 8722-6).
91. Woldorff, M. G., Hansen, J. C., and Hillyard, S. A., (1987), Evidence for effects of selective attention in the mid-latency range of the human auditory event-related potential, Electroencephalogr Clin Neurophysiol Suppl, 40, (p. 146-54).
92. Woldorff, M. G. and Hillyard, S. A., (1991), Modulation of early auditory processing during selective listening to rapidly presented tones, Electroencephalogr Clin Neurophysiol, 79, (p. 170-91).
93. Yvert, B., Fischer, C., Bertrand, O., and Pernier, J., (2005), Localization of human supratemporal auditory areas from intracerebral auditory evoked potentials using distributed source models, Neuroimage, 28, (p. 140-53).

7 List of abbreviations
A1 primary auditory cortex
AC alternating current
ADC analogue to digital conversion
AEP auditory evoked potential
Ag/Cl silver chloride
ANOVA analysis of variance
BAEP brainstem auditory evoked potential
BOLD blood oxygen level dependent
CGM corpus geniculatum mediale
dB decibel
ECoG electrocorticogram
EEG electroencephalogram
(r/l/v) EOG (right/left/vertical) electrooculogram
EP evoked potential
EPSP excitatory postsynaptic potential
ERP event-related potential
FDG fluorodeoxyglucose
fMRI functional magnetic resonance imaging
GABA gamma aminobutyric acid
HG Heschl’s gyrus
Hz Hertz
ICA independent component analysis
ISI inter stimulus interval
IPSP inhibitory postsynaptic potential
kOhm kilo Ohm
mm millimeter
mV millivolt
mmol/l millimole per liter
N1/N100 auditory ERP at around 100 ms post stimulus (negative deflection)
Nd negative difference wave
(E)OAE (evoked) otoacoustic emission
MEG magnetoencephalogram
MRI magnetic resonance imaging
MTG middle temporal gyrus
P20-50/P50 auditory ERP at around 20-50 ms post stimulus (positive deflection)
P2/P200 auditory ERP at around 200 ms post stimulus (positive deflection)
P3/P300 auditory ERP at around 300 ms post stimulus (positive deflection)
Pa Pascal (1 Newton/meter²)
P pressure
PET positron emission tomography
PN processing negativity
PP planum polare
PT planum temporale
SEEG stereotactic electroencephalogram
SEM standard error of the mean
SNR signal to noise ratio
SPL sound pressure level
SSR steady state response
STG superior temporal gyrus
V1 primary visual cortex
µV microvolt

8 Publication
Results from this study have already been published:

Bidet-Caulet, A., Mikyska, C., and Knight, R. T., (2010), Load effects in auditory selective attention: evidence for distinct facilitation and inhibition mechanisms, Neuroimage, 50, (p. 277-84)

9 Acknowledgements
I would like to express my gratitude to everyone who contributed to this thesis. I would especially like to thank Aurélie for introducing me to the world of science and showing me everything she knew. It was a pleasure working with her and learning from her. More importantly, during my time in Berkeley, she became a close friend. Thank you for the support (e.g. lab meeting, SfN, CNS), guidance, advice, corrections, good music in the pod, French food and wine.

Also I would like to thank Robert T. Knight, M.D. for giving me the opportunity to spend a year at Berkeley and do research at the Helen Wills Neuroscience Institute.

I am also grateful to Prof. Dr. H. Stefan for supporting the cooperation with UC Berkeley, for his corrections and advice, and for encouraging me to finish writing.

A very special appreciation goes to Prof. Dr. H-J Heinze, for the initial idea and the support and initiative to bring it to life.

My deepest thanks go to my family; without you I would not be where I am now. Thank you for all your love and support.

Nic, you always believe in me. Thank you for your love and understanding.

10 Curriculum vitae

Personal
Name: Mikyska
First name: Constanze Elisabeth Anna
Date and place of birth: December 6, 1984 in Munich, Germany
Parents: Dr. med. Veit Mikyska; Dr. med. Maria-Magdalena Mikyska, née Mittermair
Siblings: Christoph Maximilian Vitus Mikyska

Education
School: 2004 University-entrance diploma (Abitur), Heimschule Kloster Wald, Wald (secondary school and boarding school)
Professional education: 2000 – 2004 Apprenticeship as a tailor, Heimschule Kloster Wald, Wald
University: 2004 – 2011 Medical student at the Friedrich-Alexander University of Erlangen-Nuremberg
09/2006 Preliminary medical examination
12/2011 Final medical licensing examination

Research experience
05/2008 – 09/2009 Visiting researcher, Helen Wills Neuroscience Institute, University of California, Berkeley, USA

Employment
12/2009 – 10/2010 Student assistant, Epilepsiezentrum (ZEE), Department of Neurology, University hospital Erlangen-Nuremberg