Neural responses to the sensory effects of willed motor actions

Nathan G. Mifsud

Thesis submitted for the degree of Doctor of Philosophy

School of Psychology Faculty of Science University of New South Wales

2017

THE UNIVERSITY OF NEW SOUTH WALES

Thesis/Dissertation Sheet

Name: Nathan Gerard Mifsud
School: Psychology
Degree: PhD
Faculty: Science

Title: Neural responses to the sensory effects of willed motor actions

Abstract: Our sense of agency and dynamic motor control rely on sensory attenuation, a phenomenon that distinguishes the sensory effects of our actions from sensations caused by the external world. This distinction is realized by the brain using motor commands to anticipate and modulate sensory input resulting from self-initiated actions. A putative electrophysiological index of sensory attenuation is the reduction of the auditory N1 component in the electroencephalogram (EEG). However, the underlying causes of electrophysiological sensory attenuation are debated. Previous research is almost entirely limited to auditory stimuli, and to two action–sensation pairings in particular: vocalizing to produce speech sounds, and button pressing for tones. The aims of this thesis were to elucidate how sensory attenuation may be modulated by the strength of the existing causal relationship between an action (e.g., speaking) and its sensory effect (e.g., hearing one's own voice), and to determine if the phenomenon can occur in non-auditory modalities. To achieve these aims, this thesis reports a series of experiments that measured the event-related potential (ERP) response evoked by a range of action–sensation pairings new to the literature. The first study found that button press-initiated tones evoked reduced auditory ERPs, uncorrelated with the amplified visual ERPs evoked by button press-initiated pattern reversals. The next two studies debuted a saccade initiation paradigm, and found that ERP attenuation depended on the likelihood of the initiating action (i.e., saccades or button presses) being paired in everyday life with stimulation within the modality of the sensory effect (i.e., flashes or tones). The fourth study debuted a blow initiation paradigm to address stimulus confounds in the vocal production paradigm that is typically used in studies of supposed motor prediction dysfunction in schizophrenia, and found that blow-initiated tones evoked greater auditory ERP attenuation than that evoked by button presses and saccades. Collectively, the studies reported in this thesis support the notion that auditory ERP attenuation relates to the habitual causal relationship between the motor output region and the modality of subsequent sensory input. Furthermore, the equivocal outcomes of the visual studies complicate accounts of electrophysiological sensory attenuation based on motor prediction alone.

Declaration relating to disposition of thesis

I hereby grant to the University of New South Wales or its agents the right to archive and to make available my thesis in whole or in part in the University libraries in all forms of media, now or hereafter known, subject to the provisions of the Copyright Act 1968. I retain all property rights, such as patent rights. I retain the right to use in future works (such as articles or books) all or part of this thesis. I also authorise University Microfilms to use the 350-word abstract of my thesis in Dissertation Abstracts International.

Signature    Witness    Date: 13 November 2017

ORIGINALITY STATEMENT

I hereby declare that this submission is my own work and to the best of my knowledge it contains no materials previously published or written by another person, or substantial proportions of material which have been accepted for the award of any other degree or diploma at UNSW or any other educational institution, except where due acknowledgement is made in the thesis. Any contribution made to the research by others, with whom I have worked at UNSW or elsewhere, is explicitly acknowledged in the thesis. I also declare that the intellectual content of this thesis is the product of my own work, except to the extent that assistance from others in the project’s design and conception or in style, presentation and linguistic expression is acknowledged.

Signed

Date: 13 November 2017

COPYRIGHT STATEMENT

I hereby grant the University of New South Wales or its agents the right to archive and to make available my thesis in whole or in part in the University libraries in all forms of media, now or hereafter known, subject to the provisions of the Copyright Act 1968. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis. I also authorise University Microfilms to use the 350-word abstract of my thesis in Dissertation Abstracts International. I have obtained permission to use copyright material.

Signed

Date: 13 November 2017

AUTHENTICITY STATEMENT

I certify that the Library deposit digital copy is a direct equivalent of the final officially approved version of my thesis. No emendation of content has occurred and if there are any minor variations in formatting, they are the result of the conversion to digital format.

Signed

Date: 13 November 2017

Abstract

Our sense of agency and dynamic motor control rely on sensory attenuation, a phenomenon that distinguishes the sensory effects of our actions from sensations caused by the external world. This distinction is realized by the brain using motor commands to anticipate and modulate sensory input resulting from self-initiated actions. A putative electrophysiological index of sensory attenuation is the reduction of the auditory N1 component in the electroencephalogram (EEG). However, the underlying causes of electrophysiological sensory attenuation are debated. Previous research is almost entirely limited to auditory stimuli, and to two action–sensation pairings in particular: vocalizing to produce speech sounds, and button pressing for tones. The aims of this thesis were to elucidate how sensory attenuation may be modulated by the strength of the existing causal relationship between an action (e.g., speaking) and its sensory effect (e.g., hearing one's own voice), and to determine if the phenomenon can occur in non-auditory modalities. To achieve these aims, this thesis reports a series of experiments that measured the event-related potential (ERP) response evoked by a range of action–sensation pairings new to the literature. The first study found that button press-initiated tones evoked reduced auditory ERPs, uncorrelated with the amplified visual ERPs evoked by button press-initiated pattern reversals. The next two studies debuted a saccade initiation paradigm, and found that ERP attenuation depended on the likelihood of the initiating action (i.e., saccades or button presses) being paired in everyday life with stimulation within the modality of the sensory effect (i.e., flashes or tones). The fourth study debuted a blow initiation paradigm to address stimulus confounds in the vocal production paradigm that is typically used in studies of supposed motor prediction dysfunction in schizophrenia, and found that blow-initiated tones evoked greater auditory ERP attenuation than that evoked by button presses and saccades. Collectively, the studies reported in this thesis support the notion that auditory ERP attenuation relates to the habitual causal relationship between the motor output region and the modality of subsequent sensory input. Furthermore, the equivocal outcomes of the visual studies complicate accounts of electrophysiological sensory attenuation based on motor prediction alone.


Contents

List of figures
Acknowledgements
Scholarship

1 Introduction
   1.1 Background to the thesis
   1.2 Aim and scope
   1.3 Thesis overview

2 Background
   2.1 Motor control and corollary discharge
   2.2 Psychotic symptomology and etiology
   2.3 Neural measurement of sensory events
   2.4 Experimental self-initiation paradigms
   2.5 Research motivation

3 Press initiation of auditory and visual stimuli
   3.1 Introduction
   3.2 Method
   3.3 Results
   3.4 Discussion
   3.5 Supplementary material

4 Saccade initiation of auditory stimuli
   4.1 Introduction
   4.2 Method
   4.3 Results
   4.4 Discussion

5 Saccade initiation of visual stimuli
   5.1 Introduction
   5.2 Expt 1: Method
   5.3 Expt 1: Results and discussion
   5.4 Expt 2: Method
   5.5 Expt 2: Results and discussion
   5.6 General discussion

6 Blow initiation of auditory stimuli
   6.1 Introduction
   6.2 Method
   6.3 Results
   6.4 Discussion

7 General discussion
   7.1 Summary
   7.2 Implications
   7.3 Future research
   7.4 Conclusion

References


List of figures

2.1 Normal fish and fish with rotated eye
2.2 Mechanical tickling apparatus
2.3 Forward model of motor control
2.4 Abnormal forward modeling in schizophrenia
2.5 Auditory N1 suppression of self-initiated speech
2.6 Forward prediction in the talk–listen paradigm

3.1 Experimental protocol
3.2 Grand-averaged ERPs of the self and external conditions
3.3 Topographic maps of the auditory N1 and P2 components
3.4 Topographic maps of the self–passive difference waves
3.5 Topographic maps of the visual N145 component
3.6 Scatter plots of the amplitude differences across modalities

4.1 Experimental protocol
4.2 Grand-averaged ERPs of the self and external conditions
4.3 Topographic maps of the auditory N1 and P2 components
4.4 Grand-averaged ERPs of the motor subtraction procedure
4.5 Topographic maps of the motor subtraction procedure

5.1 Grand-averaged ERPs of the bright and dim conditions
5.2 Topographic maps of the visual N1 in the intensity task
5.3 Grand-averaged ERPs of the motor subtraction procedure
5.4 Grand-averaged ERPs of the self and external conditions
5.5 Topographic maps of the visual N1 in the self-initiation task

6.1 Grand-averaged ERPs of the motor subtraction procedure
6.2 Grand-averaged ERPs of the self and external conditions
6.3 Topographic maps of the auditory N1 and P2 components


Acknowledgements

I thank Thomas Whitford for his wonderful supervision, which shaped this truly enjoyable, rewarding process. His advice and enthusiasm were essential, and his pragmatism, humour, and goodwill helped to kickstart the engine when it stalled. I thank my friends and colleagues for their banter, counsel, and assistance, especially Bill Reda, Branka Spehar, Tom Beesley, Damien Mannion, Bradley Jack, Oren Griffiths, Ruth Elijah, Lena Oestreich, Tamara Watson, Emily White, Daniel Pearson, Mike Le Pelley, Miranda Chilver, and Sol Libesman. I thank Dad, Mum, Aaron, Jacob, and Michael for the tremendous privilege of their love, generosity, and confidence. I thank Catriona Bisset for her day-to-day support, encouragement, and remarkable wisdom. I gratefully acknowledge the financial assistance provided by an Australian Postgraduate Award from the Australian Government and a stipend extension from Thomas Whitford, as well as travel funding from the UNSW School of Psychology and the Tom Troscianko Award from the Applied Vision Association and European Conference on Visual Perception.


Scholarship

Works related to this thesis are listed below. Those marked with an asterisk form their own chapters, each of which is prefaced by an outline of author contributions using the CRediT role taxonomy (Brand, Allen, Altman, Hlava, & Scott, 2015).

*Mifsud, N. G., Beesley, T., Watson, T. L., Elijah, R. B., & Whitford, T. J. (2017). Attenuation of visual evoked responses to hand and saccade-initiated flashes. Manuscript under review.

*Mifsud, N. G., & Whitford, T. J. (2017). Sensory attenuation of self-initiated sounds maps onto habitual associations between motor action and sound. Neuropsychologia, 103, 38–43. https://doi.org/10.1016/j.neuropsychologia.2017.07.019

*Mifsud, N. G., Beesley, T., Watson, T. L., & Whitford, T. J. (2016). Attenuation of auditory evoked potentials for hand and eye-initiated sounds. Biological Psychology, 120, 61–68. https://doi.org/10.1016/j.biopsycho.2016.08.011

*Mifsud, N. G., Oestreich, L. K., Jack, B. N., Ford, J. M., Roach, B. J., Mathalon, D. H., & Whitford, T. J. (2016). Self-initiated actions result in suppressed auditory but amplified visual evoked components in healthy participants. Psychophysiology, 53, 723–732. https://doi.org/10.1111/psyp.12605

Oestreich, L. K., Mifsud, N. G., Ford, J. M., Roach, B. J., Mathalon, D. H., & Whitford, T. J. (2016). Cortical suppression to delayed self-initiated auditory stimuli in schizotypy: Neurophysiological evidence for a continuum of psychosis. Clinical EEG and Neuroscience, 47, 3–10. https://doi.org/10.1177/1550059415581708

Oestreich, L. K., Mifsud, N. G., Ford, J. M., Roach, B. J., Mathalon, D. H., & Whitford, T. J. (2015). Subnormal sensory attenuation in schizotypy: Electrophysiological evidence for a “continuum of psychosis”. International Journal of Psychophysiology, 97, 131–138. https://doi.org/10.1016/j.ijpsycho.2015.05.014


1 Introduction

1.1 Background to the thesis

Ambiguity about the origin of sensory input is a fundamental problem that all behaving organisms must resolve. Imagine setting out on an evening walk in the Australian bush, senses saturated by the lemon-scented gum trees, dappled pink light, and the drone of cicadas—when abruptly, you hear a loud snap. Is there cause for alarm? The answer to this, and hence your course of action, rests on the origin of the noise. While it may have been caused by an external agent, such as a disturbed snake or errant gust of wind, you equally may have startled yourself by stepping on an unseen branch beneath leaf litter on the path. But no matter where we are, we regulate a continuous stream of overlapping sensory information throughout our waking lives. Distinctions between self- and externally initiated sensations allow us to safely ignore the immediate consequences of our movements, which frees limited processing capacity to facilitate our identification of and response to potentially critical events in our environment.

A well-established phenomenon known as sensory attenuation is the dramatically reduced phenomenological and neurophysiological response that humans and other animals exhibit to sensations resulting from their own actions—consider, for instance, the difficulty of tickling oneself. This reduced sensitivity to self-initiated sensations helps resolve source ambiguity, and is essential in producing the sense of agency we hold over our own actions (“I did that”). Notably, sensory attenuation is disrupted in people with schizophrenia, a psychiatric disorder that is characterized by bizarre experiences of compromised agency. However, the neural mechanisms that contribute to sensory attenuation remain unclear, and the existing literature explores only a subset of the range of stimuli and initiating motor actions to which the phenomenon theoretically applies.

The empirical underpinning of this thesis is that information received by human sensory receptors is reliably transduced into measurable electrical activity that is processed by the sensory and higher order areas of the brain. Accordingly, the quantity and location of the electrophysiological response evoked by various stimuli that are initiated by various motor actions (compared with a baseline response evoked by externally initiated stimuli) can be informative about the underlying processes that result in sensory attenuation.

1.2 Aim and scope

The aims of this thesis were to elucidate how the strength of the existing causal relationship between an action (e.g., speaking) and its sensory effect (e.g., hearing one’s own voice) may modulate sensory attenuation, and to determine whether the phenomenon can occur in non-auditory modalities. In this thesis, levels of sensory attenuation will be determined by using electroencephalography (EEG) to measure event-related potentials (ERPs) evoked by auditory and visual stimuli that are either self- or externally initiated. Although the wider phenomenological and clinical implications of the outcomes will be discussed, the primary focus of this thesis is on the neurophysiology of sensory attenuation in healthy individuals.

1.3 Thesis overview

To achieve the above aims, I conducted a series of experiments that investigate the EEG response associated with various action–sensation pairings, including several that have not been previously studied in the context of sensory attenuation, systematically covering a wide range of causal relationships. The thesis begins by introducing key background concepts. In Chapter 2, I outline the forward model of motor control hypothesized to underlie sensory attenuation, describe psychotic symptoms and their possible association with abnormal brain processing of self-initiated sensations, discuss neurophysiological measurement of sensory events, and review the most common self-initiation paradigms in the ERP literature, which yields the motivation for the present research.

Chapters 3 to 6 constitute the novel research components of the thesis. In Chapter 3, we extend investigation of sensory attenuation to the visual modality by measuring participants’ ERP response to pattern reversals as well as tones. This represents the first within-subjects comparison of neurophysiological responses to self-initiated auditory and visual stimuli. We find that button press-initiated tones evoke significant attenuation of auditory evoked potentials (AEPs), whereas button press-initiated pattern reversals result in amplified visual evoked potentials (VEPs). However, we do not find a clear quantitative association between the levels of ERP modulation in the two sensory modalities.

In our subsequent experiments, we keep the dependent variable (i.e., sensory effect) constant in each design, and instead manipulate the causal likelihood of coincident action–sensation pairings by introducing previously unstudied motor actions. Crucially, as eye movements are not routinely associated with sounds—unlike the typical button-press motor action—Chapter 4 investigates saccade initiation of tones for the first time. We find that saccade-initiated tones evoke reduced AEP attenuation compared with button-press initiation. Then, in Chapter 5, we demonstrate the converse: that when the sensory effect is visual (i.e., a flash), saccade initiation results in greater VEP attenuation than button-press initiation. Chapter 6 introduces another new motor action: a soundless blow. This avoids bone conduction, which confounds the vocal production paradigm used in most of the studies that observe atypical sensory attenuation in schizophrenia. Importantly, in support of the notion that AEP attenuation may relate to the strength of the existing causal relationship between motor output and sensory input modality, we find that blow-initiated tones evoke significant AEP attenuation.

In Chapter 7, I summarize the main research findings, discuss their implications, provide suggestions for future research, and conclude the thesis.


2 Background

In this chapter, I introduce background concepts that provide context for a program of EEG investigations into willed motor actions and their sensory effects. I first outline the forward model of motor control as it relates to sensory attenuation (§2.1) and describe an etiological theory of psychotic symptoms based on abnormal processing of self-initiated sensations (§2.2). Next, I discuss neurophysiological methods and the chief measure used to study sensory attenuation (§2.3), present common self-initiation ERP paradigms (§2.4), and review outstanding questions that motivate the research component of the thesis (§2.5).

2.1 Motor control and corollary discharge

2.1.1 Problems that arise with movement

Movement is the prime objective of the brain. Consider that multicellular life does not necessarily require a nervous system: plants, which are sessile, have flourished without one (Llinás, 2001). Instead, the arrival of a nervous system coincides with the need to guide active movement. This evolutionary link is illustrated by the humble sea squirt (Ascidiacea), a creature that begins its aquatic life seeking a suitable rock to make a permanent home. Upon finding a rock, the larva cements itself there and proceeds to consume its own cerebral ganglion (Dennett, 1993). In their adult form, sea squirts filter feed—no longer mobile, their nervous system becomes superfluous. Conversely, humans who likewise root themselves to stable platforms (e.g., desk-bound doctoral candidates) avoid self-cannibalism because muscular contractions continue to facilitate their every interaction with the world—namely, the movements required to type, talk, and drink coffee. Taken to its furthest conclusion, this hypothesis suggests that our higher-order abilities, such as memory and language, emerged to better facilitate our future movements (Wolpert, 2011). With active movement comes the possibility of evading predators and accessing new resources—but to realize these profound advantages, an organism must enter communion with the outside environment. Our faculty of perception forms that essential conduit between the internal processes driving movement (the “mind”) and the external world. The sophisticated sensory organs we possess have developed to sample available sensory information, to segregate our surroundings into discrete objects, bodies, and events, and to identify suitable paths forward. Without perception, active movement can be worse than useless; it can be dangerous—consider the most probable outcome of walking down a city street with one’s eyes closed. Perception plays an important part in the successful execution of actions. For example, to drink a beverage, one must simultaneously plan, adjust, and control several muscular contractions. The complex computations involved in these motor processes rely on continually changing sensory information such as the current spatial location of our jaws and limbs. Clearly, action is inextricably connected to perception. Indeed, in some theoretical approaches, such as the sensorimotor account of visual consciousness (O’Regan & Noë, 2001), the notion that action and perception are discrete constructs substantially dissolves.

To achieve the present seamless functional harmony between action and perception, successful species had to overcome two major hurdles that arise during movement: source ambiguity and afferent desensitization. Before considering those two issues, I will define a few terms related to information flow. Consider the action of eating an apple. To open one’s mouth, the motor cortex issues commands to the jaw muscles. On taking a bite, chemicals released from the apple then interact with taste receptors on the tongue, which send information to the sensory cortex. In this example, a neural signal projected away from the brain (i.e., the motor command) constitutes an efferent, and its destined organ (i.e., a jaw muscle) is known as an effector. Contrariwise, the neural projection conveying taste information back to the brain is an afferent. In short, motor output is efference, and sensory input is afference. Lastly, to explore how the central nervous system (CNS) interprets afference, von Holst and Mittelstaedt (1950) introduced a useful semantic distinction between sensory input produced by events in the external world (exafference) and sensory input that reflects volitional movement (reafference).

The afference dichotomy is crucial to survival. As sessile animals only experience exafference, they need not reconcile disparate sources of sensory input—but moving animals risk confusing reafference with exafference, which would severely impair their ability to navigate and predate. For instance, echolocating bats must parse the reafferent echoes of their auditory probes from exafferent background noise and calls from their conspecific neighbors (Neuweiler, 2003). The crux of the problem is that sensory receptors cannot determine what caused their activation. When sensory receptors detect changes in the environment, they pass information downstream without bias. Hence, if one were to shake a tree branch, the reafference produced by receptors in the peripheral organs would be physically indistinguishable from the exafference produced by the same receptors being stimulated due to the wind shaking a branch that one held with a passive grip (von Holst, 1954).

Movement also complicates sensory processing because effectors are often located in the same region as relevant sensory receptors. In the absence of a suitably adapted nervous system, limb movements could desensitize the neural projection (afferent) that conveys sensory information from the receptors in the skin covering the facilitating muscles. Similarly, one’s own vocalizations could be deafening due to the proximity of one’s mouth and ears (Bendixen, SanMiguel, & Schröger, 2012). It would be highly disadvantageous if volitional actions were to suddenly render an animal unable to receive or otherwise process incoming information. Therefore, an essential function of the nervous system in all surviving animals is the resolution of the sensory problems that accompany active movement.

2.1.2 Keeping track of imminent movements

To overcome these problems, diverse species—from roundworms to primates—have evolved a common mechanism: they monitor their motor commands and advise the sensory system about these upcoming movements (Crapse & Sommer, 2008). With this notice, sensory processing can be modulated to maintain a stable percept of the external world and to allow the updating of subsequent actions even before muscular contractions occur. A neural basis for this mechanism was put forward by von Holst and Mittelstaedt (1950), who hypothesized that an efference copy (i.e., a copy of a motor command sent to an effector) was conveyed to the sensory system. Accordingly, exafference is salient because the reafference produced by effector activation is nullified by the efference copy. In the same year, Sperry (1950) introduced the similar, more general notion of corollary discharge, in which sensory processing is influenced by various neural signals originating from motor-related brain areas. These modulatory signals are “corollary” to, rather than an exact “copy” of, the low-level motor commands targeting the muscles. Therefore, corollary discharge can form a model for candidate neuronal mechanisms in higher-level brain areas, such as the cerebral cortex (Wurtz, 2008).

Both proposals were independently derived from observing altered behavior in animals after surgical manipulation of their visual apparatus. For example, Sperry (1950) recorded circling behavior in swellfish after the left eye had been rotated 180° (and the right eye blinded to force monocular input). He suggested that, in normal fish, the motor signals that typically result in retinal displacement may produce a corollary discharge to the visual system to compensate for the visual effect of the retinal displacement, thereby allowing them to accurately adjust their swimming speed and direction. With the inversion of the visual field in fish with rotated eyes, such motor adjustments would diametrically conflict with retinal input and accentuate, rather than cancel, retinal displacement, leading to an ill-fated cycle of erroneous movements that manifest as spinning in a circle (see Figure 2.1). The optokinetic response of flies with rotated heads led von Holst and Mittelstaedt (1950) through a similar line of reasoning.

These biological experiments did not arise in a vacuum; rather, they were an empirical contribution to a longstanding scientific discourse. An interplay between efferent and afferent signals as a basis for perception has been considered since pre-Socratic philosophers first suggested that an internal “fire” emitted by the eye may facilitate vision by afferent reflection from objects in the external world. This initial postulation was advanced to a more recognizably modern form by Bell, Purkinje, Mach, and von Helmholtz, among others (see Grüsser, 1994). Their inquiry had a shared concern: how does our visual world remain stable despite constant, violent displacement of the image on the retina by head and eye movements?


Figure 2.1. A normal fish (left) swimming backward is exposed to the same retinal stimulus as a fish with a rotated eye (right) swimming forward; the two situations produce identical visual input. As circling is only observed in the latter fish (which swims normally in the dark), its behavior cannot be attributed to the retinal stimulation alone, but rather to its combination with the direction of movement. N and T mark the nasal and temporal poles of the eye. Adapted from Sperry (1950). Public domain.

Sperry (1950) directly addressed this fundamental question by applying his newly coined term to a vivid perceptual effect noted by Descartes in the 17th century: when one moves their eyes normally, the external world remains stable, despite its image shifting across the retina; however, when the eye is moved without its muscles (e.g., close one eye, and gently tap your open eye through the eyelid), the world appears to “jump”. He suggested that eye movements (saccades) produce corollary discharge, which allows the brain to distinguish displacements in the retinal input caused by the saccade itself from those that are due to actual changes in the environment. As tapping the eyeball does not produce a saccadic corollary discharge, displacement of the retinal image is interpreted as a shift in the external world. This basic hypothesis has been corroborated by an increasing body of experimental work using primates, which has emphasized the critical role of a corollary discharge pathway for visual stability during saccades (Duhamel, Colby, & Goldberg, 1992; Wurtz, Joiner, & Berman, 2011).

Considering the multisensory problems introduced by active movement, it may be self-evident that anticipatory motor-to-sensory signals are valuable—indeed, necessary—in contexts beyond the stabilization of the visual field. This generalization was anticipated by von Helmholtz (1866), who expanded the role of eye movement signals from perception to sensorimotor coordination; even so, he still relegated them to an “effort of will”. The quantitative, neural approach adopted by von Holst and Mittelstaedt (1950) and Sperry (1950), which emerged from the development of control theory during World War II (Bridgeman, 2007), emphasized the relevance of anticipatory motor signaling to all movements (von Holst, 1954). Now, contemporary motor control theory—as defined by the formal model that I will discuss below—has become inseparable from corollary discharge.

2.1.3 Tickling and the forward model account

Following the landmark papers by von Holst and Mittelstaedt (1950) and Sperry (1950), substantial evidence continued to accrue using nonhuman animal subjects that implicated motor-to-sensory feedback in behavioral control (e.g., Delcomyn, 1977; Guthrie, Porter, & Sparks, 1983; Müller-Preuss & Ploog, 1981; Roberts & Russell, 1972; Sommer & Wurtz, 2002), including at a cellular level (e.g., Davis, Siegler, & Mpitsos, 1973; Poulet & Hedwig, 2006, 2007). Prior to the spread of appropriate neurophysiological methods (outlined in §2.3), evidence had also accrued using human subjects by way of designs such as self-report of tactile stimulation (Claxton, 1975; Weiskrantz, Elliott, & Darlington, 1971), force estimation (Flanagan & Wing, 1997; Johansson & Westling, 1984; Shergill, Bays, Frith, & Wolpert, 2003), and estimation of hand location (Synofzik, Thier, & Lindner, 2006; Wolpert, Ghahramani, & Jordan, 1995). Collectively, this work confirmed the phenomenological aspect of sensory attenuation, and modeling behavioral data with computational frameworks provided insight into the possible operation of the motor system. One of the earliest studies in humans was conducted by Weiskrantz et al. (1971), who explored why most people are unable to tickle themselves. The authors devised a mechanical apparatus (see Figure 2.2) to produce tactile stimulation that was consistent in pressure, duration, and rhythm. Participants rated the “ticklishness” of the stimulation they received in each of three conditions: self-initiation, whereby participants solely moved the handle to administer the stimulation; external initiation, whereby the experimenter moved the handle; and passive, whereby participants held the handle as the experimenter controlled its lateral movement.


Figure 2.2. A mechanical tickling apparatus. The participant places their bare foot on top of the box, so that when a handle (D) is moved along a horizontal slot (H) in synchrony with the beat of a metronome, the sole of the foot is consistently tickled by the tip of a plastic knitting needle (F), counterweighted (E) to maintain constant pressure. Adapted from Weiskrantz et al. (1971) with permission from Nature Publishing Group.

Weiskrantz et al. (1971) found that participants rated stimulation that was produced by the experimenter as more ticklish than the stimulation they produced themselves, and rated passively initiated stimulation as intermediate in ticklishness between the self- and externally initiated conditions. It appeared that some degree of sensation dampening resulted when sensory afference during the arm movement was available, even when the motor command itself was not available—which suggested that corollary discharge was a prediction, rather than a mere copy of the motor command. Crucially, this pattern of results supplied early empirical evidence that reafference in the human tactile system, not only the visual system, is modulated in a manner consistent with the existence of an internal feedback mechanism.

This mechanism has been elucidated in contemporary motor control theory, which holds that the CNS simulates dynamic behavior in the motor system to enable planning, control, and learning (Bays & Wolpert, 2007). Specifically, it is posited that an internal forward model performs this simulation (e.g., Francis & Wonham, 1976; Kawato, 1999; Miall & Wolpert, 1996; Wolpert et al., 1995). Forward models in the motor system predict the next state of a system given the current context and information about the motor command. Accordingly, corollary discharge can be interpreted as the physical instantiation of the predictions made by an internal forward model (see Figure 2.3). Note that the motor system also uses an inverse model, which estimates the motor command that would produce a desired state (e.g., to calculate how to move one’s arm to reach for an object). That is, the inverse model transforms a motor intention into the motor commands that will initiate action—and hence the efferent copy that will inform the forward model.

[Figure 2.3 diagram. Labels: desired state; motor command; efferent copy; forward model; predicted state; predicted sensory feedback (corollary discharge); motor action; actual sensory feedback (reafference); estimated actual state; external influences; optimization.]

Figure 2.3. For every motor action executed by the musculoskeletal system, a forward model is said to predict its sensory consequences based on a copy of the motor command. This predicted sensory feedback (corollary discharge) is compared with actual sensory feedback to anticipate and cancel reafference and optimize subsequent predictions. Figure created by the author based on the frameworks proposed in Blakemore, Frith, and Wolpert (1999) and Frith, Blakemore, and Wolpert (2000).

Forward modeling is vital for at least two aspects of action: state estimation and sensory attenuation. First, by calculating probable limb trajectory and comparing this with the desired outcome, errors can be rapidly corrected before actual sensory feedback becomes available (Miall & Wolpert, 1996). This is necessary owing to significant delays in sensory feedback due to receptor transduction, neural conduction, and central processing, as well as inherent noise in the sensory system that renders estimation based on sensory feedback alone prone to error (Wolpert & Flanagan, 2001). The second aspect, most relevant to this thesis, is that once a movement has been made, its sensory consequences can be predicted to anticipate and cancel reafference, enhancing more relevant sensations and further optimizing the model.

A large body of work has converged on this framework of motor control over the past two decades. Its relevance to sensory attenuation has been demonstrated across a broad range of contexts, including the diminishing of vestibular reafference during head movements (Cullen, Brooks, Jamali, Carriot, & Massot, 2011) and proprioceptive reafference accompanying limb movements (Proske & Gandevia, 2012), as well as reduced sensitivity to leg sensations during heel strike (Duysens et al., 1995) and to teeth stimulation during mastication (Sowman, Brinkworth, & Turker, 2010). There is also an increasing focus on isolating its physiological underpinnings. For example, the inferior parietal lobule (IPL) may contribute to conscious motor intention (see Desmurget & Sirigu, 2012) and the cerebrocerebellum has been proposed as the locus of forward modeling (see Ishikawa, Tomatsu, Izawa, & Kakei, 2016). These efforts hold promise not only for explaining the nature of motor control and perception, but for the basis of our “selves”.

Importantly, our motor control system only has access to the outside world through actions that are performed in it, and the sensory feedback that results from those actions. Hence, the fundamental distinction between sensations caused by our own actions and those with an external origin may play a central role in creating a sense of agency (Engbert, Wohlschlager, & Haggard, 2008). To crystallize this notion, consider the difficulty in tickling oneself: a copy of the motor command that is sent to the agitating finger allows the forward model to predict the sensory effects of the resulting movement, and thus attenuate the actual sensory feedback. Conversely, a tickle performed by someone else—so that sensory feedback is experienced without the forward model being privy to an antecedent efferent copy—will produce a large discrepancy between predicted and actual feedback. Hence, as the discrepancy between predicted and actual sensory feedback increases, so too does the likelihood that a sensation was externally produced.
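To make this comparator logic concrete, the following sketch (in Python) shows how an efference copy could be used to predict reafference and subtract it from incoming afference, leaving external events largely unattenuated. The scalar signals, function names, and the 0.8 attenuation weight are illustrative assumptions, not part of the formal models cited above.

```python
import numpy as np

rng = np.random.default_rng(0)

def forward_model(motor_command, gain=1.0):
    """Corollary discharge: predicted sensory feedback computed from an
    efference copy. The scalar gain stands in for a learned action-sensation mapping."""
    return gain * motor_command

def perceived_response(afference, predicted_feedback, attenuation=0.8):
    """Self-initiated reafference is cancelled to the extent it was predicted;
    the residual (prediction error) is what drives the perceptual/neural response."""
    return afference - attenuation * predicted_feedback

# Self-initiated sensation: an efference copy is available, so the prediction
# closely matches the incoming reafference and the residual response is small.
command = 1.0
reafference = forward_model(command) + rng.normal(0, 0.05)
print(perceived_response(reafference, forward_model(command)))

# Externally caused sensation of the same physical intensity: no efference copy,
# so nothing is predicted and the response is (nearly) unattenuated.
exafference = 1.0 + rng.normal(0, 0.05)
print(perceived_response(exafference, predicted_feedback=0.0))
```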


If our sense of agency depends on such comparisons, irregularities in the forward model mechanism could lead to unusual experiences. Specifically, if a self-initiated action occurred yet there was a large discrepancy between predicted and actual feedback, the sensation produced by the internally-mediated action may nonetheless be perceived as external in origin. This loss of agency resembles certain psychotic experiences in schizophrenia. The possibility that the psychotic symptoms associated with schizophrenia could be related to errors in forward prediction was one of the theoretical underpinnings of this thesis. The novel paradigms introduced in Chapters 3, 4, and 6 were intended as pilot studies for potential future investigation in this clinical population. To contextualize a corollary discharge-based etiological theory of psychotic symptoms, I will first introduce schizophrenia in terms of its impact and typical symptomology.

2.2 Psychotic symptomology and etiology

2.2.1 Schizophrenia and its first-rank symptoms

Schizophrenia is a mental disorder with a global lifetime prevalence estimated to be 4.0 per 1,000 persons (McGrath, Saha, Chant, & Welham, 2008) and a disproportionately large societal and economic impact due to its chronic nature (Knapp, Mangalore, & Simon, 2004)—in Australia, schizophrenia costs $1.44 billion per annum (Carr, Neil, Halpin, Holmes, & Lewin, 2003). Initial symptoms typically appear during adolescence and early adulthood, after which individual pathogenesis differs widely. Prognosis also varies, although good long-term outcomes occur in less than 50% of patients (van Os & Kapur, 2009), who die 12 to 15 years earlier than the general population due to increased physical health issues and the higher rate (4.9%) of suicide (Palmer, Pankratz, & Bostwick, 2005; Saha, Chant, & McGrath, 2007). Many patients suffer severe distress, emotional pain, and an impoverished quality of life, spurring over a century of research in which the neuropathology of the disorder has remained obscure. A prevailing view is that schizophrenia arises from abnormal perinatal and adolescent neurodevelopment (Harrison, 1991). Accordingly, magnetic resonance imaging (MRI) and diffusion tensor imaging (DTI) studies have identified structural brain abnormalities such as ventricular enlargement, reduced cortical volume, and reduced white matter integrity (Fitzsimmons, Kubicki, & Shenton, 2013; Shenton, Dickey, Frumin, & McCarley, 2001; Steen, Mull, McClure, Hamer, & Lieberman, 2006). These inherited brain vulnerabilities could cause subtle functional disturbances, such as slight alterations in cognition, some suspiciousness, or affective dysregulation. In a minority of individuals exposed to environmental factors such as prenatal infection (Brown & Derkits, 2010) and social disadvantage (van Os, Krabbendam, Myin-Germeys, & Delespaul, 2005), abnormal neurobiology could emerge and produce distinct symptoms (van Os & Kapur, 2009).

Schizophrenia is associated with heterogeneous clinical profiles. Symptoms have been found to fall into positive and negative dimensions (Andreasen, Arndt, Alliger, Miller, & Flaum, 1995) that seem independent in both structure and longitudinal course (Eaton, Thara, Federman, Melton, & Liang, 1995). Positive symptoms are those that most people do not normally experience, but are present in schizophrenia. These include psychotic experiences (psychosis) in the form of hallucinations (false perceptions) and delusions, as well as disorganized thoughts and speech. In contrast, negative symptoms mark an absence of normal functions, such as flat affect, a lack of motivation, and reduced social engagement. Some positive symptoms, including third-person auditory hallucinations and passivity experiences, were described by K. Schneider (1959) as “first rank” features of schizophrenia, as they appeared pathognomonic in most cases. Third-person auditory hallucinations (i.e., voices that speak about the patient) are very common, occurring in approximately 60%–80% of patients (Waters, Woodward, Allen, Aleman, & Sommer, 2010). These may be voices that argue (e.g., hearing a voice say “He should be locked up”, and another reply “He is not, he is a lovely man”) or provide commentary on one’s actions (e.g., “She is peeling potatoes, got hold of the peeler, she does not want that potato…”; see Mellor (1970) for further examples). Passivity experiences are debilitating situations in which a patient literally feels as if an external force initiates their actions, emotions, or thoughts. Such symptoms are labeled delusions of control (e.g., “It is my hand and arm that move, and my fingers pick up the pen, but I don’t control them. What they do is nothing to do with me”), made emotions (e.g., “They project upon me laughter for no reason”) and thought insertion (e.g., “Thoughts come into my mind from outer space”).


These experiences are often tied into delusions, which are persistent bizarre or irrational beliefs that are not easily understood in relation to an individual’s sociocultural background (Fletcher & Frith, 2009). For example, a patient may believe that the government has implanted a computer chip in their head to monitor and control their thoughts. Although not all patients exhibit first-rank symptoms such as auditory hallucinations and passivity experiences, such symptoms provide a stringent test of any etiological theory of schizophrenia, because they bear no obvious relation to normal experience or sensory input (Stephan, Friston, & Frith, 2009).

2.2.2 Self-monitoring theory and abnormal corollary discharge

There is a remarkable gap between the numerous physiological factors that accompany schizophrenia and the symptomatic manifestations on which a diagnosis often depends. For instance, we know that activation of psychotic symptomology is strongly associated with dopaminergic dysregulation (Laruelle & Abi-Dargham, 1999; Seeman, Lee, Chau-Wong, & Wong, 1976)—but the underlying causes remain challenging to understand. To explain strange phenomenological experiences such as delusions of control, a neuropsychological account of psychosis must be pursued. Feinberg (1978) first proposed that characteristic psychotic symptoms could arise due to abnormalities in the neural signals that allow us to distinguish self-initiated events from those produced by the environment. This fundamental idea was developed into self-monitoring theory (Frith, 1987; Frith et al., 2000; Frith & Done, 1988) to incorporate advances in our understanding of motor control (i.e., evidence for the use of a forward model), which opened new lines of investigation into psychotic experiences that had been difficult to study in an empirical manner. The essential argument of self-monitoring theory is that positive symptoms in schizophrenia, particularly first-rank symptoms, arise due to a breakdown in an internal monitoring system, which in turn leads patients to misattribute self-initiated events to others. The proximal cause of this breakdown can be located in abnormal corollary discharge (see Figure 2.4). This abnormality could thus produce a large discrepancy between actual and predicted sensory feedback, and be responsible for self-initiated movements that feel as if they are being initiated by a foreign agency (i.e., delusions of control) or internal voices and thoughts that instead seem to originate from an external source (i.e., auditory hallucinations or thought insertion).


[Figure 2.4 diagram. Labels: desired state; motor command; efferent copy; forward model; predicted state; predicted sensory feedback (corollary discharge); motor action; actual sensory feedback (reafference); estimated actual state; external influences; optimization.]

Figure 2.4. Patients with schizophrenia may have impaired self-monitoring due to abnormal corollary discharge, possibly due to delayed efferent copies (blue) or issues with the forward model itself (orange). This would lead to an atypically large discrepancy between actual and predicted sensory feedback. Consequently, patients may fail to identify some sensations as self-initiated, leading to a systematic misinterpretation that their actions arise from an external source. Figure created by the author.

Various suggestions have been made to specify neurobiological causes of abnormal corollary discharge. For instance, Whitford, Ford, Mathalon, Kubicki, and Shenton (2012) hypothesize that abnormalities in the conductive myelination of frontal white matter fasciculi might delay neural signals generated in the motor-initiation areas of the frontal lobe (such as efference copies). As illustrated in Figure 2.4, if the forward model received these signals late, the resulting corollary discharge would be problematically imprecise. Some preliminary evidence for the dysmyelination hypothesis will be described in §2.4.2. At a broader level, patients may have a problem with the Bayesian inferencing mechanisms that are posited to underlie brain function (e.g., A. Clark, 2013; Friston, 2010), affecting the operation of forward modeling in action (Fletcher & Frith, 2009; Frith, 2012). This would result in misattributed self-initiated sensations much as with delayed efference copies. The theoretical interest in a general predictive issue in schizophrenia stems from evidence that dopamine release acts as a signal for reward prediction error (Schultz & Dickinson, 2000), which may indicate that it has a similar role in motor-related predictions—and thus offer a parsimonious synthesis between action-based theories of schizophrenia and those theories focused on the phenomenological repercussions of aberrant signaling within a hyperdopaminergic state (e.g., Kapur, 2003).

2.2.3 Evidence using psychophysical paradigms

The self-monitoring theory has led to several experimental demonstrations that patients with schizophrenia lack awareness of predicted actions. In an extension to findings that healthy individuals cannot tickle themselves (Blakemore et al., 1999; Claxton, 1975; Weiskrantz et al., 1971), Blakemore, Smith, Steel, Johnstone, and Frith (2000) reported that people with psychotic symptoms could, in fact, tickle themselves. Healthy subjects and psychiatric patients with neither auditory hallucinations nor passivity experiences rated self-initiated tactile stimulation as less intense, ticklish, and pleasant than identical stimulation that was initiated by the experimenter. Conversely, patients with psychotic symptoms did not have reduced perceptual ratings on self-initiated trials. Shergill, Samson, Bays, Frith, and Wolpert (2005) similarly extended a paradigm that had been shown to elicit sensory attenuation in the healthy population (i.e., force-matching; Shergill et al., 2003). While healthy participants consistently overestimated the force required to reproduce the tactile stimulation produced by a torque motor applied to their finger, patients overshot the target force by a lesser amount than did controls. This superior accuracy indicated that patients had better sensitivity to the amount of force they applied, which can be attributed to reduced attenuation of self-initiated sensations. Taken together, the above studies, and many others (for a meta-analysis, see Waters et al., 2010), indicate that patients with schizophrenia have self-monitoring impairments—that is, a diminished ability to identify self-initiated actions and therefore attenuate the sensory consequences of those actions. Further strands of evidence for abnormal sensory attenuation in schizophrenia have emerged based on neurophysiological paradigms, which directly informed the basic design and motivational impetus for the subsequent research chapters of this thesis. Prior to detailing the relevant findings of those studies, I will explain the fundamental

methodological concept of the EEG-derived ERP—a helpful experimental tool in efforts to resolve the swirl of neural activity that underlies mental function.

2.3 Neural measurement of sensory events

2.3.1 Electro- and magneto-encephalography

EEG and magnetoencephalography (MEG) are methods that reveal human brain activity by recording scalp potentials and extracranial magnetic fields (Ahlfors et al., 2010) and achieve the highest temporal resolution (milliseconds) available in noninvasive brain imaging (Sharon, Hamalainen, Tootell, Halgren, & Belliveau, 2007). The primary biological basis of EEG/MEG signals is postsynaptic and active transmembrane currents generated by assemblies of cortical pyramidal cells that activate in synchrony during information processing (Woodman, 2010). Due to the continuous nature of current flow, EEG and MEG are valuable in investigations of sensory attenuation, as they capture processing that takes place even in the absence of overt participant responses (i.e., externally initiated stimuli). As the methods share their cerebral current generators, data from comparable EEG and MEG paradigms can sometimes be interpreted in tandem. Nonetheless, there are nontrivial differences. For example, voltage fields are “smeared” as they propagate through the brain, dura, skull, and skin, a problem largely avoided by their magnetic counterparts (Hämäläinen, Hari, Ilmoniemi, Knuutila, & Lounasmaa, 1993). Conversely, the decay of magnetic fields with distance is more rapid than that of electric fields (Cohen & Cuffin, 1979). However, these issues are most germane to spatial localization, and outside the scope of analysis employed in this thesis. An additional, practical matter of difference is the marked compactness and simplicity of EEG technology. By contrast, MEG requires bulky, expensive machinery and the immobilization of participants (as do all other functional neuroimaging methods) to detect the magnetic induction produced by neural currents—a signal that is orders of magnitude weaker than scalp voltages (Baillet, Mosher, & Leahy, 2001; Gevins, Leong, Smith, Le, & Du, 1995). Consequently, electroencephalograms are more accessible and tend to be easily tolerated by patients, leading them to become widespread clinical procedures (e.g., to monitor seizures in people with epilepsy). These advantages support the long-term potential of ERP measures of sensory attenuation to become biomarkers in schizophrenia research (Luck et al., 2011).

2.3.2 Event-related potentials

Though EEG emerged early in the 20th century (Berger, 1929), it was the advent of signal averaging across multiple trials in the 1960s that marked the proliferation of event-related potentials as a powerful tool in cognitive neuroscience (Woodman, 2010). ERPs are discrete voltage fluctuations extracted from the ongoing EEG by averaging together the activity from intervals that are time-locked to the same event, such as onset of a stimulus (e.g., an audible tone, or a flash of light). With a sufficiently large number of trials, specific event-related responses (i.e., brain activity consistently present in the time-locked intervals) will be preserved, and unrelated brain activity will theoretically average to zero. Averaging also mitigates individual differences in ERP morphology, caused by anatomical factors such as unique cortical folding (Nunez & Srinivasan, 2006) and skull thickness and conductivity (Hoekema et al., 2003), as well as normal variation in human capacities (e.g., Fukuda & Vogel, 2009). The resulting averaged ERP waveform comprises a sequence of positive and negative deflections that are called components, usually named according to their polarity and either their ordinal position or peak latency (e.g., “P1” for the first positive component, or “N140” for a negative component that peaks 140 ms after the event onset). These components reflect the series of overlapping neural routines that are triggered by the event, from early basic perceptual processes to later cognitive processes such as those related to decision-making. Therefore, evaluating changes in the amplitude and latency of ERP components in conjunction with differences in task conditions allows us to make inferences about the nature and putative involvement of sensory, cognitive, and motor processes in those ERP differences. In terms of sensory attenuation, the auditory N1 response has entered widespread use as a cortical index of the phenomenon due to its sensitivity to volume.
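As a minimal illustration of this time-locked averaging (a sketch only: the sampling rate, epoch window, and synthetic single-channel data are assumptions made for the example, and real pipelines add filtering and artifact rejection):

```python
import numpy as np

fs = 500                                   # assumed sampling rate (Hz)
epoch = (-0.1, 0.4)                        # epoch window around each event (s)
n_samples = int((epoch[1] - epoch[0]) * fs)
n_baseline = int(-epoch[0] * fs)           # pre-stimulus samples used as baseline

def extract_erp(eeg, event_onsets):
    """Average epochs time-locked to each event: activity consistently evoked
    by the event survives averaging; unrelated ongoing EEG tends toward zero."""
    epochs = []
    for onset in event_onsets:
        start = onset - n_baseline
        segment = eeg[start:start + n_samples]
        segment = segment - segment[:n_baseline].mean()   # baseline correction
        epochs.append(segment)
    return np.mean(epochs, axis=0)

# Synthetic single-channel example: an N1-like deflection buried in noise.
rng = np.random.default_rng(1)
eeg = rng.normal(0, 10, 60 * fs)                          # 60 s of noisy EEG (µV)
events = np.arange(2 * fs, 58 * fs, fs)                   # one event per second
t_post = np.arange(int(0.4 * fs)) / fs
n1_like = -5 * np.exp(-((t_post - 0.1) ** 2) / (2 * 0.02 ** 2))
for onset in events:
    eeg[onset:onset + n1_like.size] += n1_like            # add the evoked response

erp = extract_erp(eeg, events)   # averaged waveform recovers the ~100 ms negativity
```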

2.3.3 Auditory N1 response

The auditory N1 response is a negative deflection in the EEG that peaks around 100 ms after the onset of a transient auditory stimulus, such as a pure tone. Note that the polarity, amplitude, and latency of deflections depend on the position of the key electrode pairs measuring current flow. In the case of the auditory N1, the recording site is the vertex of the scalp—as that is where the auditory N1 response is typically maximal, as demonstrated on voltage maps—and the reference site is typically the average of the mastoid electrodes (just behind the ears). While the auditory N1 appears unitary, it consists of several distinct subcomponents (i.e., the latent contributors) as described by Näätänen and Picton (1987). Despite this fractionation, the dominant generators of the auditory N1—and its magnetic counterpart, the N1m or M1—reportedly reside in or near the primary auditory cortex (Virtanen, Ahveninen, Ilmoniemi, Näätänen, & Pekkonen, 1998).

Importantly, the auditory N1 is reliably volume-dependent. With increases in stimulus intensity, peak amplitude increases and latency decreases (e.g., Hegerl, Gallinat, & Mrowinski, 1994; Picton, Woods, Baribeau-Braun, & Healey, 1976; Rapin, Schimmel, Tourk, Krasnegor, & Pollak, 1966)—that is, loud sounds elicit larger N1 amplitudes than do soft sounds. This relationship makes the auditory N1 response useful, as an underlying premise of sensory attenuation measured in the ERP is that any physiological reductions reflect alterations in the perceived intensity of a stimulus. Hence, early ERP studies that investigated the effect of self-initiation on sensory response used the auditory N1 response in the context of vocalization—one of the clearest cases where sensory attenuation should be exhibited, since the proximity of one’s mouth and ears during vocalization could otherwise be deafening (Bendixen et al., 2012). The auditory P2 (a positive deflection around 200 ms post-onset) has often been reported as well, and it appears to be modulated by the temporal predictability of the stimulus (see §2.4.4). I will further discuss the P2 alongside the auditory N1 in Chapters 3, 4, and 6, and I will analyze differences that arise with self-initiation in the pattern-evoked visual N145 and flash-evoked visual N1 in Chapters 3 and 5 respectively.
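For concreteness, a common way to quantify the N1 is the most negative voltage within a window centred on 100 ms in the averaged waveform at Cz. The sketch below follows on from the previous example; the 80–120 ms search window is an assumption for illustration, and the studies in this thesis define their own measurement windows.

```python
import numpy as np

def n1_peak(erp, fs=500, epoch_start=-0.1, window=(0.08, 0.12)):
    """Peak (most negative) amplitude and its latency within the N1 search window
    of an averaged waveform (e.g., the output of extract_erp() above)."""
    times = np.arange(erp.size) / fs + epoch_start
    in_window = (times >= window[0]) & (times <= window[1])
    idx = np.argmin(np.where(in_window, erp, np.inf))  # minimum restricted to the window
    return erp[idx], times[idx]                        # amplitude (µV), latency (s)
```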


2.4 Experimental self-initiation paradigms

2.4.1 Vocal production

In a typical vocal production design, referred to here as talk–listen, participants first make a simple utterance such as “ah” and then listen to a recording of their own speech (for a detailed protocol, see Ford, Roach, & Mathalon, 2010). As illustrated in Figure 2.5, healthy individuals have consistently been shown to exhibit attenuated response of the auditory N1/N1m to self-initiated vocalizations compared with their replayed vocalizations (e.g., Behroozmand, Karvelis, Liu, & Larson, 2009; Curio, Neuloh, Numminen, Jousmaki, & Hari, 2000; Heinks-Maldonado, Mathalon, Gray, & Ford, 2005; Houde, Nagarajan, Sekihara, & Merzenich, 2002; Wang et al., 2014).

2.5 Auditory N1 suppression of self-initiated speech Figure 2.5 Grand-averaged ERPs recorded during talk (red) and listen (blue) conditions at site Cz. Voltage is on the y axis and time in milliseconds is on the x axis. The auditory N1 component is suppressed to speech onset during talking compared with during listening. Adapted from Ford et al. (2010) with permission from Nature Publishing Group.

Given that auditory N1 response increases with volume and is known to originate from the auditory cortex (for a review, see Näätänen & Picton, 1987), this pattern of results suggests that the auditory cortex adjusts its sensitivity to lower its response to self-initiated vocalizations. This interpretation is supported by invasive recordings in both nonhuman primates (Eliades & Wang, 2003, 2005) and human neurosurgery patients (Greenlee et al., 2011), who exhibited suppressed response in the primary auditory cortex just prior to vocalization.


The attenuation of auditory consequences following self-initiated speech neatly maps onto the action of corollary discharge. Forward prediction is strongly implicated in speech production (Hickok, 2012), as it involves a well-defined range of motor output (e.g., shape formed by lips) producing specific, habitual sensory consequences (i.e., uttered syllables). There is also a plausible neuroanatomical basis for forward prediction in speech production: efference copies of motor output could conceivably travel along candidate language fibre pathways (e.g., Dick & Tremblay, 2012) and specific acoustic properties of the upcoming sound could be encoded by a number of proposed distinct neural networks (Behroozmand et al., 2016; Chang, Niziolek, Knight, Nagarajan, & Houde, 2013; Greenlee et al., 2013; Parkinson et al., 2012), ultimately reducing its sensory effects (see Figure 2.6). Notably, in the talk–listen paradigm, patients with schizophrenia have been shown to exhibit a lesser degree of auditory N1 response attenuation during talking than healthy participants (Ford, Gray, Faustman, Roach, & Mathalon, 2007; Ford et al., 2001; Ford, Roach, Faustman, & Mathalon, 2007; Heinks-Maldonado et al., 2007; Perez et al., 2012). These findings provide neurophysiological support for the notion that psychosis involves dysfunctional corollary discharge. According to this view, auditory hallucinations could emerge when thoughts (Feinberg, 1978) or patients’ own sub-vocal speech (see Green & Kinsbourne, 1990) are misinterpreted as having an external origin. However, the talk–listen paradigm has significant limitations due to physical differences in the auditory signal between conditions. For example, self-initiated vocalizations will always be confounded by bone conduction (Stenfelt & Goode, 2005). That is, the heard sound in the self-initiated condition will be qualitatively different from the heard sound in the externally initiated condition. Even if an online recording is played simultaneously through headphones during vocalization, the eardrum receives a combination of air-conducted sound and bone-conducted sound. Moreover, speech production may lead to contraction of the middle ear stapedius muscle, which alters transmission of the sound signal (Horváth & Burgyán, 2013). These confounds make it difficult to determine the degree to which differences in the ERP amplitudes of each condition are associated with forward prediction as opposed to merely reflecting between-condition differences in the auditory signal.


2.6 Forward prediction in the talk–listen paradigm Figure 2.6 A forward-model schematic of vocalization in the talk–listen paradigm. An intent to say “ah” (blue thought bubble) occurs in the speech production areas of the frontal lobe and is conveyed to the premotor and motor cortical areas, shown with a dotted bidirectional line. The motor cortical areas issue a motor command to initiate speech, as well as an efference copy of that command that reaches the auditory cortex and generates a corollary discharge of the expected “ah” sound (red burst). Auditory reafference is produced by the vocalized speech. The neural representation of the sensation in the auditory cortex is shown as a green burst. When auditory reafference (what you hear) matches the corollary discharge (what you intended to say), auditory cortical responsiveness is reduced. Adapted from Ford (2016) with permission from John Wiley & Sons.

2.4.2 Button-press initiation

Such issues are mitigated in button-press paradigms (in which participants press a button to initiate sounds) because the mouth is not used in stimulus production, which entirely avoids both bone conduction and contraction of the middle ear muscles. The general experimental design in talk–listen and button-press paradigms is the same: a condition in which participants hear self-initiated auditory stimuli is compared with a condition where they listen to physically identical sounds that have been initiated externally (i.e., by a computer). Much as for willed vocalizations, a number of studies have indicated that auditory N1 response to sounds is dampened even when they are self-initiated by a button press (e.g., Aliu, Houde, & Nagarajan, 2009; Baess, Jacobsen, & Schröger, 2008; Martikainen, Kaneko, & Hari, 2005; McCarthy & Donchin, 1976; Schafer & Marcus, 1973; Sowman, Kuusik, & Johnson, 2012). In addition, patients with schizophrenia exhibit reduced N1 attenuation to button press-initiated sounds (Ford, Gray, et al., 2007; Ford, Palzes, Roach, & Mathalon, 2013; Whitford et al., 2011). Typically, these studies use artificially generated tone stimuli, though attenuation of an “ah” sample has also been reported (Ford, Gray, et al., 2007; Whitford et al., 2011). A further advantage of button-press studies, which are free from bone conduction, is that stimulus delay manipulations are possible. For example, Whitford et al. (2011) and Oestreich et al. (2016) interposed a 50-ms delay between the button press and tone, which led healthy participants to exhibit auditory N1 response indistinguishable from the response evoked by externally initiated tones. This supports the notion that a comparison between actual and predicted sensory feedback underlies AEP attenuation. Furthermore, to test the dysmyelination hypothesis in schizophrenia (see §2.2.2), Whitford et al. (2011) showed that a 50-ms delay caused patients to exhibit normalized auditory N1 attenuation (i.e., the same response as healthy participants when hearing undelayed tones). This might be because the artificially delayed reafference approximately matched the delay hypothetically experienced by corollary discharge due to possible damage to the arcuate fasciculus connecting frontal sites with the auditory cortex.

2.4.3 Generalizability of motor-to-sensory mapping

There is an important distinction to make between the talk–listen and button-press initiation paradigms that may hold implications for our understanding of sensory attenuation. Willed vocalizations, as in talk–listen, represent natural action–sensation pairings where auditory sensory input is unavoidable, predictable, and tightly coupled to specific motor output. By contrast, the button-press initiation paradigm involves arbitrary action–sensation pairings—typically hand movements to press a button and elicit a tone. That is, the sound bears no necessary relation to the motor output. On the surface, the fact that auditory N1 attenuation has been found in both paradigms supports the notion that an efference copy prepares sensory cortex for the consequences of self-initiated actions even when the causal chain is indirect and mediated by external devices (Ford et al., 2013). However, in view of the scarcity of work directly testing this notion, it is far from clear that motor-to-sensory mapping can be generalized from speech production to auditory input evoked by motor actions unrelated to speech. An alternative possibility, given that speech processing may have highly specialized, dedicated subsystems, is that different, perhaps independent, mechanisms drive sensory attenuation of speech and non-speech stimuli (Horváth, 2015).

2.4.4 Limits of the sensory attenuation literature

Several other outstanding factors are relevant to neurophysiological investigations of sensory attenuation. I will briefly discuss an imbalance in the sensory attenuation literature across sensory domains, an associative account that has been proposed to explain the phenomenon, and the temporal predictability of stimuli.

Firstly, self-initiation paradigms used to investigate sensory attenuation have primarily focused on the auditory domain, and comparatively little data are available with respect to the effects of self-initiation in the visual domain. At least three previous ERP studies have involved both self- and externally initiated visual stimuli, with equivocal findings regarding differences in VEP response (Gentsch & Schütz-Bosbach, 2011; Hughes & Waszak, 2011; Schafer & Marcus, 1973). All the studies used button presses to initiate stimuli, but in markedly different paradigms (described further in §3.1)—and in the case of Gentsch and Schütz-Bosbach (2011), the comparison between self- and externally initiated stimuli was peripheral to their aim, which was focused on conscious experience of agency. The scant, inconsistent set of results for electrophysiological suppression to self-initiated visual stimuli stands in contrast to the large and consistent body of evidence for auditory N1 attenuation to self-initiated auditory stimuli.

There are also alternative (or perhaps complementary) explanatory accounts to consider. It may not be appropriate to apply forward model-type predictions to the full range of experimental conditions under which sensory attenuation has been found. Self-initiation effects may involve distinct and overlapping contributory processes. For example, Roussel, Hughes, and Waszak (2013) have proposed an account that relates to learned associations between actions and sensations. In their operational terms, learned associations are proposed to “pre-activate” potential sensory input, which increases baseline neural activity at a greater rate than the signal increase that accompanies an incoming stimulus. This account differs from forward prediction in that sensory attenuation is attributed to poorer stimulus discrimination due to increased baseline neural activity (and consequent reduction of the corresponding sensory representations) rather than an efference copy directly reducing the internal response to a predicted stimulus. An associative explanation can be reasonably applied to the auditory button-press initiation paradigm, given that such a contingency (i.e., pressing a button for a tone) conceivably could draw upon pre-existing action–sensation associations that have been experienced and learnt over the course of a lifetime. As will be discussed in §4.1, there is some preliminary evidence in support of the idea that contingency strength alters neurophysiological response. Given that efference copy modulation may not be wholly responsible in all cases of sensory attenuation, an account based on learned associations may explain observed effects in contingencies that lack the highly specific motor-to-sensory mapping that exists during speech production.

Lastly, as discussed by Hughes, Desantis, and Waszak (2013), a key difference between self- and externally initiated conditions in many paradigms is that the former condition is temporally predictable (i.e., the participant chooses when to press the button, and can thus anticipate in time when the stimulus will occur), while the latter is not. Some studies have tested the effect of providing participants with timing foreknowledge of upcoming tones and found a degree of auditory ERP attenuation to these predictable but externally initiated stimuli (e.g., Oestreich et al., 2015; Schafer, Amochaev, & Russell, 1981; methodological details are compared in §3.4). Hence, there is the possibility that self-initiation effects are attributable, at least in part, to differences between stimulus conditions in terms of the temporal predictability of the stimuli.


2.5 Research motivation

The preceding sections of this chapter introduced the role of the forward model in the ubiquitous phenomenon of sensory attenuation and put forth its dysfunction as a viable explanation of the most characteristic features of schizophrenia. I outlined the use of EEG and ERP—particularly recording of the auditory N1 component—to measure the sensory effects of self-initiated actions, reviewed common experimental self-initiation paradigms, and highlighted several areas in the literature that deserve attention. On the basis of this background, I will now specify the motivation behind each research component of this thesis, which aimed to advance our understanding of how sensory prediction in the motor system can influence perception. To begin, Chapter 3 attempts to redress the imbalance that most studies of neurophysiological sensory attenuation are auditory, despite the universal nature of corollary discharge applying to multiple sensory domains. To achieve this, we used the established button-press paradigm to conduct the first within-subjects comparison of neurophysiological response following self-initiation of both auditory and visual stimuli. In response to some findings that have suggested that temporal predictability may contribute to ERP differences, we also measured response to externally initiated stimuli that were preceded by a countdown. Chapters 4 and 5 focused on this essential question: to what degree does the causal likelihood of coincident pairings between an efference (motor output) and its resultant reafference (sensory input) modulate sensory attenuation? To explore this, we measured healthy participants’ response to previously unstudied combinations of motor output and sensory input, and used the button-press paradigm as a template in a design where self-initiation of stimuli occurred via eye movements. Accordingly, we debuted a saccade initiation ERP paradigm to investigate sensory attenuation. Chapter 4 used auditory stimuli, whereas Chapter 5 used visual stimuli. Button presses are routinely accompanied by sounds (e.g., while typing), whereas eye movements are not. Conversely, button presses are seldom accompanied by visual sensations, but saccades always are. Hence, by contrasting ERP response evoked by these particular action–sensation pairings following self- and external initiation, we could infer whether sensory attenuation is influenced by the causal likelihood of the pairing in everyday life. Moreover, it formed a test of

whether the forward model account, which has been convincingly applied to saccadic suppression, can be extended to the non-retinal sensory consequences of eye movements. Following similar logic regarding the new pairings, Chapter 6 debuted blow initiation. In a novel paradigm, participants used either a button press or a soundless blow to initiate auditory tones. As an exhalation is closer in nature than a button press to the movements involved in vocalization (which is almost always accompanied by auditory sensations), its pairing with a tone acted as a strong causal relationship in the auditory domain, just as saccade-initiated visual input had served in the visual domain. A second, major impetus of the final study was to test AEP attenuation with a more “natural” motor action than button presses, while avoiding bone conduction and other confounds of the vocal production studies that provide the strongest evidence for atypical sensory attenuation in schizophrenia.


3 Press initiation of auditory and visual stimuli

Published as:

Mifsud, N. G., Oestreich, L. K., Jack, B. N., Ford, J. M., Roach, B. J., Mathalon, D. H., & Whitford, T. J. (2016). Self-initiated actions result in suppressed auditory but amplified visual evoked components in healthy participants. Psychophysiology, 53, 723–732. https://doi.org/10.1111/psyp.12605

Author contributions:

Conceptualization: NGM TJW. Methodology: NGM TJW. Software: NGM TJW. Investigation: NGM LKO. Data curation: NGM. Formal analysis: NGM. Visualization: NGM BNJ. Writing – original draft: NGM. Writing – review & editing: NGM LKO BNJ JMF BJR DHM TJW. Funding acquisition: TJW. Supervision: TJW.

Context in thesis:

The following study is presented largely as written for publication (as are Chapters 4, 5, and 6), so there is some repetition of the key concepts. As mentioned earlier, the common self-initiation ERP paradigms used to investigate sensory attenuation have primarily focused on effects in the auditory domain. Given the scarcity of research in the visual domain, the primary impetus of the following study was to investigate self-initiated visual stimuli—in this case, pattern reversals—and analyze within-subject correlations between the auditory and visual sensory modalities.


3.1 Introduction

Sensory attenuation refers to the reduced neurophysiological and phenomenological response to sensations produced by our own actions, relative to identical sensations produced by the external world—the archetypal example being the fact that it is difficult to tickle oneself (Weiskrantz et al., 1971). Sensory attenuation is believed to result from an internal forward model, in which sensory feedback is predicted on the basis of a copy of the motor command, known as an efference copy (Wolpert et al., 1995). In contrast to sensations caused by changes in the external world, self- initiated actions are typically associated with little to no discrepancy between actual and predicted sensory feedback. Consequently, the CNS tends to be less responsive to the sensory consequences of self-initiated actions than to identical sensations evoked by externally initiated events. This framework, together with its neurophysiological foundation, is well-supported across the animal kingdom (Crapse & Sommer, 2008). In humans, sensory attenuation has been demonstrated with various psychophysical approaches (Blakemore et al., 1999; Cardoso-Leite, Mamassian, Schütz-Bosbach, & Waszak, 2010; Sato, 2008). For instance, participants have been shown to judge self-initiated tones as softer than externally initiated tones (Weiss, Herwig, & Schütz-Bosbach, 2011). Researchers have also increasingly turned to ERP and event-related field (ERF) paradigms, measured with EEG and MEG respectively, to index the cortical attenuation of self-initiated sensations. Healthy participants have been found to exhibit a reduced amplitude of the N1 component of the AEP (i.e., the negative component observed approximately 100 ms after auditory stimulus onset) when the auditory stimulus is self- versus externally initiated. This finding has been found with both willed vocalizations (Curio et al., 2000; Heinks- Maldonado et al., 2005) and button press-initiated stimuli (Aliu et al., 2009; Baess et al., 2008; Martikainen et al., 2005; McCarthy & Donchin, 1976; Schafer & Marcus, 1973). Notably, patients with schizophrenia have been found to exhibit subnormal levels of auditory N1 suppression to self-initiated auditory sensations (Ford et al., 2001; Whitford et al., 2011), which has been taken as evidence for the self- monitoring abnormalities proposed to underlie the most characteristic clinical features of the disorder (Feinberg, 1978; Frith, 2005).


Research on sensory attenuation has primarily focused on the auditory domain, and comparatively little data are available with respect to sensory attenuation in the visual domain. To our knowledge, three previous studies have investigated specific electrophysiological responses to self-initiated visual stimuli, all of which employed button press-initiated stimuli, but in markedly different paradigms (Gentsch & Schütz-Bosbach, 2011; Hughes & Waszak, 2011; Schafer & Marcus, 1973). Additionally, Stenner, Bauer, Haggard, Heinze, and Dolan (2014) have examined prestimulus neural oscillations by using a visual contrast discrimination task. Gentsch and Schütz-Bosbach (2011) measured VEPs to arrow stimuli produced by either the participant or the computer, but this was embedded within a visual forced-choice response task in which the arrow stimuli were subliminally primed to address a separate question regarding conscious experience of agency. Schafer and Marcus (1973) employed a less complex design, comparing the VEPs of self-paced photic flashes that were either self- or machine-initiated. Both studies reported reduction of the visual N1 component evoked by self-initiated visual stimuli at anterior electrodes, but not at occipital sites—where visual ERP amplitudes are typically maximal (Luck, 2012; O’Shea, Roeber, & Bach, 2009; Odom et al., 2010). In contrast, Hughes and Waszak (2011), who investigated VEP responses to self-initiated visual checkerboards, reported increased visual P1 amplitude at an occipital site. The inconsistent pattern of results in studies on electrophysiological suppression to self-initiated visual stimuli stands in contrast with the large and consistent body of evidence in support of N1 suppression to self- initiated auditory stimuli. The primary aim of the study described in this chapter was to clarify whether electrophysiological suppression of self-initiated stimuli occurs across both auditory and visual modalities in healthy participants. To our knowledge, Schafer and Marcus (1973) is the only previous study to compare sensory attenuation in the auditory and visual sensory domains in the same participants—they identified unambiguous sensory attenuation of auditory stimuli, but, as mentioned above, only provided mixed support to attenuation of visual stimuli. Given the increasing prominence of sensory attenuation in the literature, and its possible role in psychotic symptomology, it is sensible to determine whether it is present in sensory modalities other than the auditory domain.


Additionally, this study aimed to quantify the contribution of temporal predictability in any observed effects of self-initiation. As discussed by Hughes et al. (2013), a key difference between self-initiated and externally initiated conditions in many sensory attenuation paradigms is that the first is temporally predictable (i.e., the participant chooses when to press the button, and can thus anticipate in time when the stimulus will occur), while the second is not. This raises the possibility that the sensory attenuation phenomenon is attributable, at least in part, to differences in the temporal predictability of the stimuli. To address these aims, this study measured ERP responses to both self- initiated auditory and visual stimuli in a single group of healthy participants, and compared them to physically identical, externally initiated stimuli, while also controlling temporal predictability between the conditions. Auditory stimuli consisted of pure tones, commonly used in AEP studies (e.g., Baess et al., 2008). Visual stimuli consisted of pattern reversals, which elicit robust VEPs—typically a small negativity (N75), a major positivity (P100), and then a negativity (N145) that appear to originate from the primary visual cortex (Shigeto, Tobimatsu, Yamamoto, Kobayashi, & Kato, 1998). These VEPs have less intersubject and intrasubject variability than other forms of luminance-change stimulation, such as flash and pattern-onset stimuli (Luck & Kappenman, 2012; Odom et al., 2010). Stimuli were initiated by either a participant’s willed button press (the active condition; i.e., self- initiated), at random intervals (the passive condition; i.e., externally initiated, unpredictable onset), or following a visual countdown (the cued condition, i.e., externally initiated, predictable onset, to match the intrinsic predictability of self- initiated stimuli). A motor condition, in which no stimuli occurred, was also included to control neural activity associated with the act of pressing a button. In accordance with previous studies, we predicted that participants would exhibit significantly reduced auditory N1 response to self-initiated tones relative to both predictable and unpredictable externally initiated tones. We also predicted suppression of early visual components of the VEP (namely, N75, P100, and N145) in response to self-initiated pattern reversals, based on the universal (i.e., multimodal) nature of the forward model framework on which sensory attenuation is based, notwithstanding inconsistent results in the small number of previous studies that have investigated self-initiated visual stimuli.


3.2 Method

3.2.1 Participants

Forty-five introductory psychology students enrolled at UNSW Sydney were recruited with an online experiment management system. Participants received course credit in exchange for their time and provided written, informed consent prior to participating. No participants reported having a psychotic mental disorder according to DSM-IV-TR criteria. Five participants were excluded from the analysis due to low signal-to-noise ratio. Of the remaining 40 participants, 28 were female, 37 were right-handed, and the mean age was 20 years (SD = 4). This study was approved by the UNSW Human Research Ethics Advisory Panel (Psychology).

3.2.2 Procedure

Participants sat in a quiet, artificially lit room, one meter in front of a computer monitor (BenQ XL2420T, 1024 × 768 pixels, 144 Hz, 24" width). Following the completion of a demographics questionnaire, EEG was continuously recorded while participants completed a computer task programmed in MATLAB (MathWorks, Natick, US) with Psychophysics Toolbox extensions (Brainard, 1997; Kleiner, Brainard, & Pelli, 2007; Pelli, 1997). As described below, the task consisted of four conditions (active, motor, passive, and cued) run in two sensory modalities (auditory and visual), for a total of eight homogenous, 60-trial blocks. The order of these blocks was randomized for each participant. Each block was preceded by five practice trials to ensure participants understood instructions and to emphasize the need for self-paced rather than speeded responses in the active conditions. The entire experimental procedure (see Figure 3.1 for an overview) took approximately 45 min to complete.


3.1 Experimental protocol Figure 3.1 Experimental protocol. There were four conditions for both auditory (tone) and visual (pattern reversal) stimuli: (a) Active condition (self-initiated stimuli). Following presentation of a fixation target, participants pressed a button at will. Immediately after the duration of this response (T1), a tone was bilaterally delivered or the pattern was reversed in auditory and visual conditions respectively. (b) Motor condition. This was identical to the active condition except that it resulted in no stimulus delivery. (c) Passive condition (externally initiated, unpredictable onset). This was identical to the active condition, except stimulus delivery occurred after a variable delay (T2) rather than involving participant input. (d) Cued condition (externally initiated, predictable onset). In this condition, stimulus delivery occurred following a series of countdown screens separated by one second each.


Auditory task. The active condition measured electrophysiological responses associated with self-initiated stimuli. A fixation cross was presented on the screen, and participants were instructed to respond at will by pressing a button on a low- latency response pad (Cedrus Corporation, Model RB-530) with their dominant hand. Pressing the button caused a tone (100 ms duration, 500 Hz frequency, 70 dB SPL) to be delivered immediately to their headphones (Philips, Model SHL4500GY). The motor condition was identical to the active condition except that pressing the button did not cause a tone to be delivered. The ensuing EEG activity was subsequently subtracted from the active condition to remove the EEG activity associated with pressing the button, as is standard practice in studies of this nature (Baess et al., 2008; Ford, Gray, et al., 2007; Martikainen et al., 2005; Whitford et al., 2011). Trials in the passive condition began with a fixation target, followed by a uniformly distributed random interval (0.5–2.5 s) before a tone was delivered without participant input, which aimed to assess participants’ electrophysiological response to an externally initiated, temporally unpredictable stimulus. Lastly, trials in the cued condition delivered tones following a countdown composed of fixation targets (i.e., +++, then ++, then +), each of which were separated by a 1,000-ms interval. The cued condition aimed to assess participants’ EEG response to an externally initiated, temporally predictable stimulus. All four conditions had random intervals (1–2 s) between stimulus onset (or in the motor condition, button pressing) and the start of the succeeding trial.
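
The four auditory conditions differ only in how and when the tone is triggered. The sketch below lays out that trial logic for illustration; the actual task was implemented in MATLAB with Psychophysics Toolbox, so the function name and event strings here are hypothetical.

```python
import random

def trial_schedule(condition):
    """Return (pre-stimulus events, whether a stimulus is delivered) for one trial.

    Illustrates the timing logic of the four conditions described above; the
    1-2 s random intertrial interval would follow in every condition.
    """
    if condition == "active":       # self-initiated: stimulus follows the button press
        return ["show fixation", "wait for button press"], True
    if condition == "motor":        # button press, but no stimulus is delivered
        return ["show fixation", "wait for button press"], False
    if condition == "passive":      # externally initiated, unpredictable onset
        delay = random.uniform(0.5, 2.5)  # uniformly distributed delay in seconds
        return ["show fixation", f"wait {delay:.2f} s"], True
    if condition == "cued":         # externally initiated, predictable onset
        return ["show '+++'", "wait 1 s", "show '++'", "wait 1 s",
                "show '+'", "wait 1 s"], True
    raise ValueError(condition)

for cond in ("active", "motor", "passive", "cued"):
    events, stimulus = trial_schedule(cond)
    print(cond, events, "stimulus delivered" if stimulus else "no stimulus")
```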

Visual task. Pattern-reversal stimuli were checkerboards (0.25° black and white checks, 100 cd/m² mean luminance, full-field). In the active condition, a fixation dot was superimposed on a static checkerboard, to which participants were instructed to respond at will by pressing a button on the response pad. This response caused an abrupt pattern reversal (i.e., black squares to white and white squares to black), and the reversed pattern persisted for 1,500 ms to prevent visual offset contamination. In the motor condition, the checkerboard remained static following the button press. In the passive condition, a reversal occurred without participant input after a uniformly distributed random interval (0.5–2.5 s). Lastly, in the cued condition, the reversal occurred after a countdown composed of fixation targets (i.e., • • •, then • •, then •), each of which was separated by a 1,000-ms interval. As in the auditory blocks, all four conditions had random intertrial intervals (1–2 s).

Data collection and reduction. EEG was continuously recorded with a BioSemi ActiveTwo system with sintered Ag-AgCl electrodes in an electrode cap at 64 scalp sites, positioned according to the extended 10–20 system, at a sampling rate of 2,048 Hz. During offline preprocessing, auditory blocks were rereferenced to the averaged mastoid electrodes, as is common protocol in AEP analysis, whereas visual blocks were rereferenced to the electrode Fz, as is common protocol in analysis of pattern-reversal-induced VEPs (Odom et al., 2010). Bipolar vertical electro- oculogram (EOG) was derived from electrode Fp1 and an electrode below the left eye, and horizontal EOG was derived from electrodes on the left and right outer canthi. Data were band-pass filtered from 0.05 to 30 Hz with a phase-shift free Butterworth filter and separated into 600-ms epochs, consisting of 100 ms pre-onset and 500 ms post-onset. Data were baseline corrected from -100 to 0 ms. Vertical and horizontal EOG data were used to correct eye-movement artifacts with a regression- based algorithm (Gratton, Coles, & Donchin, 1983). All individual trials with EEG or EOG activity exceeding ±50 µV were excluded from further analysis, resulting in a mean rejection rate of 7.3% ± 5.7% (SD) trials per block and participant. We then averaged individual trials in each condition to produce the ERPs of each participant. Lastly, for each participant, motor waveforms were subtracted from the active waveforms of both modalities to produce difference waveforms, as is typical in sensory attenuation studies of this nature (Baess et al., 2008; Ford, Gray, et al., 2007; Martikainen et al., 2005; Whitford et al., 2011). These difference waveforms are hereafter referred to as self waveforms.
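
As a rough illustration of the preprocessing steps just described (band-pass filtering, epoching, baseline correction, threshold-based artifact rejection, averaging, and the motor subtraction), the following sketch reproduces the core logic on simulated data. It is not the pipeline used in the study, which was run in BrainVision Analyzer 2; the function names are hypothetical and the parameters simply echo those reported above.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def bandpass(data, sfreq, lo=0.05, hi=30.0, order=4):
    """Zero-phase (phase-shift-free) Butterworth band-pass filter along the last axis."""
    sos = butter(order, [lo, hi], btype="band", fs=sfreq, output="sos")
    return sosfiltfilt(sos, data, axis=-1)

def epoch_and_average(data, onsets, sfreq, tmin=-0.1, tmax=0.5, reject_uv=50.0):
    """Cut epochs, baseline-correct to the pre-stimulus interval, reject epochs
    exceeding +/- reject_uv on any channel, and average the surviving trials.

    data   : channels x samples array in microvolts, already re-referenced
    onsets : stimulus-onset sample indices
    """
    start, stop = int(tmin * sfreq), int(tmax * sfreq)
    n_base = -start
    kept = []
    for onset in onsets:
        epoch = data[:, onset + start:onset + stop].copy()
        epoch -= epoch[:, :n_base].mean(axis=1, keepdims=True)   # -100 to 0 ms baseline
        if np.abs(epoch).max() <= reject_uv:                     # simple artifact rejection
            kept.append(epoch)
    return np.mean(kept, axis=0), len(kept)

# Toy usage with simulated data (64 channels, 2 minutes at 512 Hz)
rng = np.random.default_rng(1)
sfreq = 512
raw = bandpass(rng.normal(0, 5, size=(64, sfreq * 120)), sfreq)
onsets = np.arange(2, 118, 2) * sfreq            # one event every 2 s
active_erp, n_kept = epoch_and_average(raw, onsets, sfreq)
# In the present design, "self" waveforms would then be derived by subtracting the
# motor-condition ERP from the active-condition ERP: self_erp = active_erp - motor_erp
print(active_erp.shape, n_kept)
```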

Statistical analysis. In all statistical analyses, individual ERP component amplitudes were measured as the mean of the voltage in intervals centered on the group-average peak latencies of each condition. Following visual inspection of the group averages, these intervals were determined as 20 ms for N1, 40 ms for P2, 10 ms for N75 and P100, and 40 ms for N145. For auditory blocks, we identified our components of interest at electrode Cz, due to its characteristically large N1 amplitude after bilateral auditory stimulation (Ford, Gray, et al., 2007). Due to our inclusion of the cued condition, the auditory P2 component was also examined, as it may better reflect suppression associated with temporal certainty (Sowman et al., 2012). For visual blocks, we identified our components of interest at electrode Oz to conform with standard protocol in VEP analysis, as that is where amplitude is typically the largest (Odom et al., 2010). We conducted our analyses at electrodes Cz and Oz, as well as Fz for auditory blocks (which was precluded for visual blocks due to its use as the reference). To investigate the effect of condition on mean amplitude of the identified components, we conducted a separate one-way analysis of variance (ANOVA) for each sensory modality and for each electrode. In cases where the assumption of sphericity was violated, a Greenhouse-Geisser correction was used, and for significant main effects, underlying simple effects were explored with pairwise comparisons. Lastly, to compare the effect of self-initiation on mean amplitudes across sensory modalities, measures for each component were calculated based on absolute differences between self- and externally initiated conditions. This represented 10 measures in total: 5 components (N1, P2, N75, P100, N145) × 2 contrasts (self–passive and self–cued). Participants’ resultant measures were then compared with bivariate Pearson correlations and two-tailed tests of significance, and cross-modal comparisons were reported (i.e., N1/N75, N1/P100, N1/N145, P2/N75, P2/P100, P2/N145). A similar correlative analysis that used ratio measures (i.e., self divided by passive or cued) was also conducted. Data preprocessing was done in BrainVision Analyzer 2 (Brain Products GmbH, Munich, Germany), and statistical analyses were performed in SPSS version 22 (IBM Corp., Armonk, US).
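
The amplitude measure itself is simple: the mean voltage within a window centered on the group-average peak latency of each component. A minimal sketch of that calculation, with hypothetical names and a synthetic waveform, is given below; the actual analyses were conducted in SPSS.

```python
import numpy as np

def mean_amplitude(erp, times, peak_latency, window_ms):
    """Mean voltage in a window centered on a (group-average) peak latency.

    erp          : 1-D ERP waveform for one participant/condition (microvolts)
    times        : sample times in ms, aligned with erp (0 = stimulus onset)
    peak_latency : center of the measurement window in ms (e.g., group-average N1 peak)
    window_ms    : total window width in ms (e.g., 20 for N1, 40 for P2)
    """
    mask = np.abs(times - peak_latency) <= window_ms / 2
    return erp[mask].mean()

# Example: an N1-like negativity peaking at 100 ms, sampled every 2 ms
times = np.arange(-100, 500, 2.0)
erp = -8.0 * np.exp(-((times - 100) ** 2) / (2 * 15 ** 2))
print(round(mean_amplitude(erp, times, peak_latency=100, window_ms=20), 2))
# The cross-modal analysis then correlates, across participants, measures such as
# abs(self_amp - passive_amp) for the auditory N1 with the same measure for the visual N145.
```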

3.3 Results

3.3.1 Auditory task

For the N1 component, a one-way ANOVA showed a significant main effect of condition, F(2,78) = 25.07, p < .001, ηp² = .39. This indicated that mean N1 amplitude at electrode Cz differed between self (M = -5.95), passive (M = -10.59), and cued (M = -10.07) conditions (see Figures 3.2 and 3.3). Follow-up pairwise comparisons indicated that self significantly differed from both passive, F(1,39) = 16.95, p < .001 (see Figure 3.4), and cued, F(1,39) = 12.60, p < .001, but passive and cued did not differ from each other, F(1,39) = 0.16, p = .40.


There was also a significant main effect of condition for the P2 component, F(2,78) = 62.03, p < .001, ηp² = .61. This indicated that mean P2 amplitude at electrode Cz differed between self (M = 4.22), passive (M = 12.99), and cued (M = 11.37) conditions (see Figures 3.2 and 3.3). Follow-up pairwise comparisons indicated that self significantly differed from both passive, F(1,39) = 68.67, p < .001 (see Figure 3.4), and cued, F(1,39) = 48.23, p < .001. In addition, passive and cued significantly differed, F(1,39) = 1.68, p = .016.

3.3.2 Visual task

For the N75 component, a one-way ANOVA did not show a main effect of condition, F(2,78) = 0.52, p = .565, ηp² = .01. This indicated that mean N75 amplitude at electrode Oz did not differ between self (M = -4.26), passive (M = -4.78), and cued (M = -4.14). Similarly, a main effect of condition was not evident for the P100 component, F(2,78) = 0.60, p = .552, ηp² = .02. This indicated that mean P100 amplitude at electrode Oz did not differ between self (M = 1.17), passive (M = 1.74), and cued (M = 1.66).

However, for the N145 component, a one-way ANOVA revealed a main effect of condition, F(2,78) = 4.61, p = .013, ηp² = .11. This indicated that mean N145 amplitude at electrode Oz differed between self (M = -6.76), passive (M = -5.22), and cued (M = -5.43) conditions (see Figures 3.2 and 3.5). Follow-up pairwise comparisons indicated that self significantly differed from both passive, F(1,39) = 1.25, p = .006 (see Figure 3.4), and cued, F(1,39) = 1.16, p = .050, but passive and cued did not differ from each other, F(1,39) = 0.02, p = .636.


3.2 Grand-averaged ERPs of the self and external conditions Figure 3.2 The left column shows ERPs of the auditory task recorded at electrodes Fz, Cz, and Oz (referenced to mastoids) in the self, passive, and cued conditions. The right column shows ERPs of the visual task recorded at electrodes Cz and Oz (referenced to Fz) in the self, passive, and cued conditions. ERPs in the self conditions were calculated by subtracting the appropriate motor condition from the active condition. Voltage is on the y axes and time in milliseconds is on the x axes, where stimulus onset is at 0 ms.


3.3 Topographic maps of the auditory N1 and P2 components Figure 3.3 Topographic maps of the auditory N1 and P2 components, shown in the time windows of the self, passive, and cued conditions.


3.4 Topographic maps of the self–passive difference waves Figure 3.4 Topographic maps of the self–passive difference waves of the auditory N1, auditory P2, and visual N145. Scalp topographies are shown in the time windows of the self conditions, referenced to linked mastoids (auditory data) or Fz (visual data).


3.5 Topographic maps of the visual N145 component Figure 3.5 Topographic maps of the visual N145 component. Scalp topographies are shown in the N145 time windows for the self, passive, and cued conditions, referenced to either linked mastoids or Fz.


3.3.3 Correlations between sensory suppression measures

With the self–passive measures, no significant correlations were found across sensory modalities (i.e., N1/N75, N1/P100, N1/N145, P2/N75, P2/P100, P2/N145) with either absolute, r < ± .117, p ≥ .473, or ratio, r < ± .274, p ≥ .087, calculations. Similarly, no significant correlations were found across sensory modalities for self– cued measures with either absolute, r < ± .224, p ≥ .164, or ratio, r < ± .101, p ≥ .539, calculations. See Figure 3.6 for scatter plots comparing self- and externally initiated conditions with the N1 and N145 absolute measures.

3.6 Scatter plots of the amplitude differences across modalities Figure 3.6 Scatter plots of the absolute differences between self and passive conditions (left) and self and cued conditions (right). All axes represent mean amplitude in microvolts (µV) across the time windows of the respective components (auditory N1 and visual N145).

3.3.4 Reaction times

Mean reaction times in each self-initiated condition were compared to check for possible behavioral differences between the auditory and visual tasks. Reaction times in the auditory tasks—active (M = 811 ms, SD = 631 ms); motor (M = 830 ms, SD = 758 ms)—were similar to those in the visual tasks—active (M = 893 ms, SD = 486 ms); motor (M = 971 ms, SD = 531 ms). A one-way ANOVA did not reveal a main effect, F(2,78) = 0.80, p = .497, ηp² = .02, which indicates that participants’ reaction times did not differ as a function of condition (active or motor) or modality (auditory or visual).


3.3.5 Evaluating possible refractoriness contributions

Mean interstimulus intervals (ISIs) differed between conditions, although the size of this intercondition difference was similar across modalities (see §3.5). Given the connection between refractoriness and N1 amplitude (Budd, Barry, Gordon, Rennie, & Michie, 1998), we conducted post hoc analyses to quantify the possible contribution this may have had to our effects. To begin with, each participant’s average ISIs in the four conditions (active, passive, cued, motor) were entered as centered covariates in the ANOVAs described above. The outcomes were identical; that is, the main effects for each component remained, and all self–passive contrasts remained significant (see §3.5 for details). We also conducted independent samples t tests for each component on a median split of participants based on their ISI difference scores, calculated by subtracting ISI in the passive condition from ISI in the active condition for each modality. There were no significant between-group differences or trends (for all components, t(38) < ±1.528, p ≥ .135, see §3.5), indicating that ISIs did not have a consistent influence on component amplitude. Additionally, we conducted a bivariate correlation analysis of participants’ ISI difference scores with their self-initiation measures (i.e., self–passive, mean amplitudes) for all components. In the auditory modality, there was no significant correlation for either absolute (r < ± .235, p > .144) or ratio (r < ± .124, p > .446) measures. Likewise, in the visual modality, there was no significant correlation for either absolute (r < ± .204, p > .207) or ratio (r < ± .305, p > .056) measures. Lastly, we checked correlations between participants’ ISI difference scores and N1 mean amplitude for each condition, and found no significant correlations (self: r = -.114, p = .483; passive: r = -.263, p = .101; cued: r = -.138, p = .398).
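
For illustration, the logic of the ISI difference scores, the median-split t tests, and the ISI–amplitude correlations could be expressed as follows. This is a sketch with simulated values and hypothetical names rather than the SPSS analyses actually reported.

```python
import numpy as np
from scipy.stats import ttest_ind, pearsonr

def isi_difference_scores(isi_active, isi_passive):
    """Per-participant ISI difference score: active minus passive (ms)."""
    return np.asarray(isi_active) - np.asarray(isi_passive)

def median_split_ttest(diff_scores, amplitudes):
    """Compare component amplitude between small- and large-difference groups."""
    median = np.median(diff_scores)
    small = amplitudes[diff_scores <= median]   # "small ISI difference" group
    large = amplitudes[diff_scores > median]    # "large ISI difference" group
    return ttest_ind(small, large)

# Example with simulated values for 40 participants
rng = np.random.default_rng(2)
diffs = isi_difference_scores(rng.normal(3400, 400, 40), rng.normal(4100, 400, 40))
n1_amp = rng.normal(-6, 2, 40)
print(median_split_ttest(diffs, n1_amp))
print(pearsonr(diffs, n1_amp))  # correlation between ISI differences and N1 amplitude
```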

3.3.6 Supplementary post hoc analyses at extra sites

For a more complete view of activity across the scalp, post hoc one-way ANOVAs were conducted for mean amplitudes at frontal (Fz) and occipital (Oz) sites for auditory components, and at a central (Cz) site for visual components. To further test the visual task, we also rereferenced visual data to linked mastoids and analyzed activity at electrodes Fz, Cz, and Oz. The details of these post hoc analyses are described below; in summary, in the auditory task, the pattern of results at both Fz and Oz was identical to results at Cz reported above, except that passive and cued did not significantly differ in mean P2 amplitude. In the visual task, no main effects occurred at any electrode site.

Auditory task. For electrode Fz, there was a significant main effect of condition for the N1 component, F(2,78) = 12.92, p < .001, ηp² = .25. This indicated that mean N1 amplitude differed between self (M = -5.11), passive (M = -7.77), and cued (M = -7.53), where self significantly differed from both passive, F(1,39) = 4.76, p < .001, and cued, F(1,39) = 3.34, p < .001, but passive and cued did not differ from each other, F(1,39) = 0.03, p = .612. Similarly, there was a significant main effect for the P2 component, F(2,78) = 17.02, p < .001, ηp² = .30. This indicated that mean P2 amplitude differed between self (M = 2.15), passive (M = 5.94), and cued (M = 5.32), where self significantly differed from both passive, F(1,39) = 10.72, p < .001, and cued, F(1,39) = 7.75, p < .001, but passive and cued did not differ from each other, F(1,39) = 0.21, p = .265.

For electrode Oz, there was a significant main effect of condition for the N1 component, F(2,78) = 3.88, p = .025, ηp² = .09. This indicated that mean N1 amplitude differed between self (M = -0.81), passive (M = -1.86), and cued (M = -1.85), where self significantly differed from both passive, F(1,39) = 0.51, p = .019, and cued, F(1,39) = 0.50, p = .029, but passive and cued did not differ from each other, F(1,39) = 0.001, p = .931. Similarly, there was a significant main effect for the P2 component, F(2,78) = 6.22, p = .005, ηp² = .14. This indicated that mean P2 amplitude differed between self (M = 1.87), passive (M = 3.62), and cued (M = 3.42) conditions, where self significantly differed from both passive, F(1,39) = 1.89, p = .007, and cued, F(1,39) = 1.40, p = .011, but passive and cued did not differ from each other, F(1,39) = 0.02, p = .633.

Visual task. For Fz-referenced data at electrode Cz, there were no main effects of condition for any component. This indicated that for the N75 component, F(2,78) = 2.19, p = .128, ηp² = .05, mean amplitude at electrode Cz did not differ between self (M = -0.88), passive (M = -1.60), and cued (M = -1.04). Similarly, for the P100 component, F(2,78) = 1.36, p = .262, ηp² = .03, mean amplitude at electrode Cz did not differ between self (M = -0.49), passive (M = -0.02), and cued (M = -0.47). Lastly, for the N145 component, F(2,78) = 0.45, p = .642, ηp² = .01, mean amplitude at electrode Cz did not differ between self (M = -0.37), passive (M = -0.14), and cued (M = -0.50).

For mastoid-referenced data, there were no main effects. For all sites (Oz, Cz, and Fz), mean amplitude did not differ between self, passive, and cued conditions for the N75 component, F(2,78) ≤ 1.65, p ≥ .200, ηp² ≤ .05; P100 component, F(2,78) ≤ 0.93, p ≥ .379, ηp² ≤ .03; or N145 component, F(2,78) ≤ 1.99, p ≥ .154, ηp² ≤ .05.

3.4 Discussion

The primary aim of the study in this chapter was to investigate the effect of self- initiation on sensory processing in both auditory and visual domains in healthy participants. As predicted, we found electrophysiological suppression of the N1 component of the AEP to self-initiated auditory stimuli (button press-initiated tones). However, in contrast, we found an increase in the amplitude of the N145 component of the VEP to self-initiated visual stimuli (button press-initiated pattern reversals). Furthermore, there was no correlation between participants’ level of electrophysiological sensory suppression in the auditory and visual tasks. Taken together, these results suggest that electrophysiological suppression of self-initiated sensations may not occur across all sensory modalities, but may instead be specific to certain modalities such as audition. Indeed, the results of the study in this chapter suggest that self-initiated, button press-initiated visual stimuli may in fact cause an amplification—rather than suppression—of the visual N145 component. An additional aim was to quantify the contribution of temporal predictability to any observed differences in response between self- and externally initiated stimuli. There was a difference in auditory P2 but not N1 response between passive and cued conditions in the auditory tasks, and no difference at any component in the visual tasks. This suggests that temporal predictability was involved to a minor degree in basic sensory processing of the tones but not pattern reversals used in this study. In the auditory domain, self-initiated tones evoked significantly attenuated auditory N1 amplitude compared with both externally initiated conditions (i.e., passive and cued). This result corroborates the substantial body of literature showing sensory suppression of auditory stimuli (e.g., Baess et al., 2008; Ford et al., 2001; Heinks-Maldonado et al., 2005). In terms of temporal predictability, auditory

N1 response did not differ between the passive and cued conditions, but there was a difference in P2 response. This contrasts somewhat with earlier studies that have found attenuation of AEP responses to temporally predictable stimuli in both externally initiated conditions (Ford, Gray, et al., 2007; Lange, 2009; Oestreich et al., 2015; Schafer et al., 1981) and, by controlling sound onset delay, self-initiated conditions (Baess et al., 2008). Substantial methodological diversity exists: Lange (2009) relied on the temporal rhythm of the stimuli (i.e., the tones themselves became temporal cues), whereas the study reported in this chapter used a visual countdown similar to Ford, Gray, et al. (2007). These different methods of temporal orienting may have varying contributions to attention and prediction—see Lange (2013) for a review. Notably, the countdown method contains an element of temporal uncertainty, as it requires an estimation of the time elapsed between the final cue and the tone. This could be avoided by use of a spectrogram-style cue as employed in Oestreich et al. (2015). In short, we speculate that residual uncertainty explains the lack of an expected cued effect at N1, which consequently suggests that P2 is more sensitive to this form of manipulation. This also supports the emerging notion that differences at P2 may be instructive in the comparison of self- versus externally initiated auditory stimuli (SanMiguel, Todd, & Schröger, 2013). With regard to our auditory task design, a strength of the design in this study was that the order of the conditions was fully randomized; that is, self did not have to precede passive, as is the case in traditional yoked designs. However, a limitation of such a design is that it required us to estimate ISI in the passive and cued conditions. Our estimation was slightly (but significantly) too generous in that ISI in self-initiated blocks ended up longer than in externally initiated blocks by approximately 600 ms on average. This raises the possibility that refractoriness moderated N1 amplitude, though this is more pronounced with ISIs shorter than in the study described in this chapter (Budd et al., 1998). Several follow-up analyses we conducted suggest that refractoriness was not responsible for our effects (see §3.3.5, with additional details presented in §3.5). In particular, there was no correlation between ISI and N1 amplitude, and the results did not change when participants’ centered ISIs in the four conditions were included as nuisance covariates. In the visual domain, self-initiated pattern reversals evoked significantly augmented visual N145 amplitude at electrode Oz compared with both externally

initiated conditions (i.e., passive and cued), while no difference was found between self- or externally initiated conditions in terms of visual N75 or P100 response. This lack of modulation in early response contrasts with previous self-initiation studies involving visual stimuli, which have shown attenuation at vertex (but not occipital) electrode sites (Gentsch & Schütz-Bosbach, 2011; Schafer & Marcus, 1973) or posterior enhancement (Hughes & Waszak, 2011). However, the present finding of increased response at N145 to self-initiated pattern reversals is novel. Importantly, this was not an artificial effect produced by referencing, as suppression was not evident at anterior sites around the Fz reference, as can be seen in the mastoid-referenced data (Figure 3.5) and lack of main effects in post hoc analyses conducted at Cz and Fz with such data. This seems to differ from Schafer and Marcus (1973), who reported self-initiated suppression of visual components at the vertex, but did not report frontal or topographical data. Thus, the pattern of effects in this study is that self-initiation leads to reduced AEPs but increased VEPs. Interestingly, ongoing electrophysiological investigation of mice has also demonstrated heightened responsiveness to visual stimuli (Niell & Stryker, 2010)—but suppression to auditory stimuli (Zhou et al., 2014)—as the animals freely move (see also D. M. Schneider, Nelson, & Mooney, 2014). This relationship could represent an important prioritization of visual over auditory stimuli in our sensory-guided interaction with the environment. Alternatively, it may reflect processing based on the natural contingencies between actions and resultant sensations. Simple motor acts, such as finger movements, often lead to auditory sensations (e.g., tapping one’s finger on the table), but less often lead to abrupt, dramatic visual sensations (e.g., full-field visual stimulation as in the study in this chapter). Less frequently reinforced contingencies (i.e., motor–visual) are proportionally more likely to contain important information than stronger contingencies (i.e., motor–auditory), which may explain processing enhancement and attenuation of these respective tasks. This possibility further underlines the call by Horváth (2015) for systematic investigation into action–effect contingency. The amplified N145 response to self-initiated stimuli in this study may also be compared with the visual enhancement reported by Hughes and Waszak (2011) mentioned earlier, although they analyzed the P1 component of electrophysiological response evoked by pattern-onset stimuli. Shawkat and Kriss (2000) have directly

compared VEPs evoked by pattern-onset/offset and pattern-reversal stimuli—manipulating contrast of the checkerboards in a stepwise fashion to morph between the two types of stimulation—and indicated that the reversal N145 component may be analogous not to the P1, but to the N1 component evoked by pattern onset. Given that attention has been shown to enhance both N1 and P1 response to pattern-onset stimuli (V. P. Clark & Hillyard, 1996), it is possible that amplification of the N145 in this study could be caused by an increase in attention to self-initiated pattern reversals. Indeed, non-stimulus-related modulation unique to the N145 component of the pattern-reversal VEP is not surprising, as the N145 may be generated over a wider cortical area than the N75 and P100 components (Shigeto et al., 1998). When comparing pattern-onset and pattern-reversal stimuli, it is worth noting that both involve simple, low-level representations. Contrary to the visual enhancement found in response to these stimuli, Hughes and Waszak (2014) reported sensory attenuation evoked by high-level stimuli (faces and houses). A parametric investigation of the influence of visual stimulus complexity on neurophysiological sensory attenuation would be a worthwhile continuation of these findings. Regarding temporal prediction in the visual tasks, the lack of difference between passive and cued conditions suggests that there was no effect of temporal prediction on visual processing. This is consistent with the finding that temporal attention to visual stimuli seems to increase early neural response only when the task is perceptually demanding (Correa, Lupianez, Madrid, & Tudela, 2006), as opposed to the perceptually simple task used in the study reported in this chapter. In view of the finding that self-initiation amplified N145 response, the lack of difference between externally initiated conditions indicates that if the amplification is, in fact, due to increased attention, this attention would need to be directed toward nontemporal features of the stimulus. The final result of interest was the lack of any significant correlation between participants’ electrophysiological response modulation in the auditory and visual tasks; in particular, auditory N1 attenuation was not quantitatively related to visual N145 amplification. This suggests that the effects of self-initiation may be sensory specific, but more work is required on the underlying neurophysiological mechanisms that may be producing this differential processing. It is possible that a higher-level, non-sensory-specific process may also have contributed to the reported

effects of self-generation. Accordingly, designs that attempt to isolate sensory and nonsensory subcomponents of the N1 component are highly relevant (SanMiguel, Todd, et al., 2013), as are emerging lines of enquiry beyond stimulus-evoked responses, such as alpha oscillations (Stenner et al., 2014). The possibility of sensory specificity is interesting because of potential implications with regard to the prevalence of psychotic symptoms in disorders such as schizophrenia. Following the theoretical link between abnormal sensory suppression and psychotic symptoms (Feinberg, 1978; Frith, 2005), the apparent sensory specificity of electrophysiological response to self-initiated stimuli may lend insight into the prevalence and pathology underlying auditory and visual hallucinations. It may also shed light on why auditory hallucinations are so much more common than visual hallucinations in patients with schizophrenia. In this study of healthy participants, there was no apparent relationship between auditory and visual effects, and it would be worthwhile to examine if this were also true in a population with schizophrenia. In sum, the study reported in this chapter undertook an investigation of self-initiated sensations in both auditory and visual domains with the same participants, the first since Schafer and Marcus (1973) and the first to report within-subject comparisons. The results indicated that electrophysiological sensory attenuation occurs in response to self-initiated auditory, but not visual, sensations in healthy participants. These findings highlight the importance of several continuing areas of investigation, including the impact on sensory processing across sensory modalities of factors such as contingency, temporal predictability, and stimulus complexity.


3.5 Supplementary material

Additional analyses were conducted to investigate the possible contributions of refractoriness, given that mean stimulus onset-to-onset delays were found to differ in the auditory (active: 3,422 ms; motor: 3,396 ms; passive: 4,102 ms; cued: 5,583 ms) and visual tasks (active: 5,996 ms; motor: 6,086 ms; passive: 6,588 ms; cued: 7,566 ms). ANOVAs presented here are identical to the analyses in §3.3, but with participants’ centered ISIs in the four conditions as nuisance covariates. We also present independent samples t tests based on participants’ ISI difference scores.
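For concreteness, the covariate preparation can be sketched as follows. This is a minimal Python illustration, assuming `isi` holds each participant’s mean onset-to-onset delays (in ms) with one column per condition; the original covariate analyses were run in SPSS.

```python
import numpy as np

# isi: (n_participants, 4) array of mean onset-to-onset delays (ms) in the
# active, motor, passive, and cued conditions (assumed input).
# Centre each condition's ISI on its sample mean before entering it as a
# nuisance covariate alongside the repeated-measures ANOVAs of Section 3.3.
isi_centred = isi - isi.mean(axis=0, keepdims=True)
```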

3.5.1 Auditory task (ISI covariates)

For the N1 component, a one-way ANOVA showed a significant main effect of condition, F(2,78) = 26.50, p < .001, ηp² = .43. This indicated that mean N1 amplitude at electrode Cz differed between self, passive, and cued conditions, where self significantly differed from both passive, F(1,39) = 16.24, p < .001, and cued, F(1,39) = 12.67, p < .001, but passive and cued did not differ from each other, F(1,39) = 0.15, p = .381.

There was also a significant main effect for the P2 component, F(2,78) = 61.46, p < .001, ηp² = .64. This indicated that mean P2 amplitude at electrode Cz differed between self, passive, and cued conditions, where self significantly differed from both passive, F(1,39) = 64.90, p < .001, and cued, F(1,39) = 49.20, p < .001, and passive and cued also significantly differed, F(1,39) = 1.71, p = .018.

3.5.2 Visual task (ISI covariates)

For the N75 component, a one-way ANOVA did not show a main effect of condition, F(2,78) = 0.51, p = .603, ηp² = .01. This indicated that mean N75 amplitude at electrode Oz did not differ between self, passive, and cued conditions.

Similarly, a main effect of condition was not evident for the P100 component, F(2,78) = 0.51, p = .575, ηp² = .01. This indicated that mean P100 amplitude at electrode Oz did not differ between self, passive, and cued conditions.

However, for the N145 component, a one-way ANOVA revealed a main effect of condition, F(2,78) = 4.50, p = .014, ηp² = .11. This indicated that mean N145 amplitude at electrode Oz differed between self, passive, and cued conditions, where self significantly differed from passive, F(1,39) = 1.25, p = .006, but not cued, F(1,39) = 1.20, p = .060, and passive and cued did not differ from each other, F(1,39) = 0.02, p = .632.

3.5.3 Independent samples t tests

Difference scores were calculated for each participant by subtracting their ISI in passive from their ISI in self for each modality (for example, using the mean ISIs described above results in -680 ms for auditory and -592 ms for visual). Then, for each modality, participants were sorted into “small ISI difference” and “large ISI difference” groups based on a median split of their difference scores, and independent samples t tests were conducted to compare these groups in terms of their mean amplitudes. For the auditory N1 at site Cz (referenced to mastoids), there were no significant between-group differences for self, t(38) = 1.01, p = .319, passive, t(38) = 1.53, p = .135, or cued, t(38) = 0.05, p = .960. Similarly, for the P2, there were no significant between-group differences for self, t(38) = 0.78, p = .440, passive, t(38) = -0.24, p = .809, or cued, t(38) = 1.08, p = .286. For the visual N75 at site Oz (referenced to Fz), there were no significant between-group differences for self, t(38) = -0.35, p = .728, passive, t(38) = 0.55, p = .585, or cued, t(38) = 0.97, p = .339. Similarly, for the P100, there were no significant between-group differences for self, t(38) = -0.31, p = .759, passive, t(38) = -0.05, p = .964, or cued, t(38) = -0.01, p = .993. Lastly, for the N145, there were no significant between-group differences for self, t(38) = -1.05, p = .298, passive, t(38) = -0.12, p = .907, or cued, t(38) = -1.04, p = .304.
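The difference-score and median-split procedure can be sketched as follows. This is a minimal Python illustration, assuming `isi_self` and `isi_passive` are per-participant mean ISIs (in ms) and `amp` maps condition labels to arrays of per-participant mean amplitudes; the original tests were run in SPSS, and the direction of the median split is an assumption.

```python
import numpy as np
from scipy import stats

# Per-participant ISI difference scores (self minus passive); e.g., the mean
# ISIs reported above give roughly -680 ms (auditory) and -592 ms (visual).
diff = isi_self - isi_passive

# Median split into "small" and "large" ISI-difference groups (the direction
# of the split is an assumption; the text does not specify it).
small = diff <= np.median(diff)
large = ~small

# Independent samples t tests on mean amplitude in each condition.
for cond in ("self", "passive", "cued"):
    t, p = stats.ttest_ind(amp[cond][small], amp[cond][large])
    print(f"{cond}: t({small.sum() + large.sum() - 2:d}) = {t:.2f}, p = {p:.3f}")
```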


4 Saccade initiation of auditory stimuli

Published as:

Mifsud, N. G., Beesley, T., Watson, T. L., & Whitford, T. J. (2016). Attenuation of auditory evoked potentials for hand and eye-initiated sounds. Biological Psychology, 120, 61–68. https://doi.org/10.1016/j.biopsycho.2016.08.011

Author contributions:

Conceptualization: NGM TJW. Methodology: NGM TJW. Software: NGM TB. Investigation: NGM. Data curation: NGM. Formal analysis: NGM. Visualization: NGM. Writing – original draft: NGM. Writing – review & editing: NGM TB TLW TJW. Funding acquisition: TJW. Supervision: TJW.

Context in thesis:

The following study directly investigates whether the causal likelihood of coincident pairings between motor output and its resultant sensory input modulates sensory attenuation, by exploiting the fact that button presses are routinely accompanied by sounds (e.g., while typing), whereas eye movements are not. We introduced an eye-tracker design where participants could produce tones by moving their eyes. As the saccade–tone pairing represents an action–sensation contingency that no participant could have previously experienced, the size of its auditory ERP response would be informative about the processes underlying electrophysiological sensory attenuation.


4.1 Introduction

Our brains must regulate a continuous stream of sensory input to flexibly generate behavior and allow interaction with the world. A well-established example of such regulation is sensory attenuation, where the sensory input evoked by self-initiated actions is marked by reduced phenomenological (Blakemore et al., 1999; Cardoso-Leite et al., 2010; Sato, 2008) and neurophysiological representations (Baess et al., 2008; Houde et al., 2002; Schafer & Marcus, 1973) compared with identical, externally initiated sensory input—a phenomenon typified by the difficulty of tickling ourselves (Weiskrantz et al., 1971). Functionally, sensory attenuation serves to conserve attentional resources and to enable sensory processing in situations where volitional actions would otherwise desensitize sensory receptors, such as during speech production (Bendixen et al., 2012). It has also been proposed as fundamental to self-identity, such that dysfunctional attenuation could lead to psychotic symptomatology (Feinberg, 1978; Ford et al., 2001).

In the auditory domain, the N1 or N1m component (an evoked potential or magnetic field that appears approximately 100 ms after the onset of an auditory stimulus) is used as a cortical index of sensory attenuation, because its amplitude, compared with externally initiated stimuli, is consistently reduced for both self-initiated vocalizations (Curio et al., 2000; Heinks-Maldonado et al., 2005; Houde et al., 2002) and button press-initiated stimuli (Aliu et al., 2009; Baess et al., 2008; Martikainen et al., 2005; McCarthy & Donchin, 1976; Schafer & Marcus, 1973; Sowman et al., 2012). The predominant explanation of these findings invokes a theory of motor control in which a forward model predicts future behavioral states and their sensory consequences (Wolpert et al., 1995). According to this theory, the sensory consequences of volitional motor actions can be predicted based on an efference copy (i.e., a copy of the motor command), and neurophysiological sensory attenuation reflects the subtraction of this prediction from actual sensory input (Bays & Wolpert, 2007). Conversely, externally initiated sensory input, for which there is no motor information to form an accurate prediction, will remain unmodulated by the efference copy mechanism (see Timm, SanMiguel, Keil, Schröger, & Schönwiesner, 2014).


Forward prediction is strongly implicated in speech production (Hickok, 2012), which involves a well-defined range of motor output (e.g., shape formed by lips) producing specific, habitual sensory consequences (i.e., uttered syllables). As candidate language fibre pathways have been identified (Dick & Tremblay, 2012), upon which the efference copies of speech motor output conceivably travel, there is also a plausible neural basis of forward prediction in speech production. However, the N1 attenuation literature largely comprises experiments based on arbitrary action–sensation contingencies, usually hand movements to press a button and elicit a tone. This is problematic, because it is not clear that motor-to-sensory mapping can be generalized from speech production to auditory input evoked by motor actions unrelated to speech (Horváth, 2015). Unlike speech-induced auditory attenuation, for which there are distinct neural networks proposed to be involved (Behroozmand et al., 2016; Chang et al., 2013; Greenlee et al., 2013) that likely encode specific acoustic properties of the upcoming sound, internal predictions arising from non-speech motor actions may be comparatively crude. This presents the possibility that sensory attenuation of speech and non-speech stimuli is driven by different mechanisms.

An alternate (or perhaps complementary) explanation of sensory attenuation relates to learned associations between actions and sensations. Operationally, learned associations have been proposed to “pre-activate” potential sensory input, increasing baseline neural activity at a greater rate than the signal increase resulting from an incoming stimulus (Roussel et al., 2013). This account differs from forward prediction in that sensory attenuation is attributed to poorer stimulus discrimination (and consequent reduction of the corresponding sensory representations) rather than an efference copy directly reducing the internal response to a predicted stimulus. An associative explanation can be reasonably applied to the classic paradigm of pressing a button to hear an auditory stimulus, given that such a contingency conceivably draws upon a wealth of pre-existing action–sensation associations that have been experienced and learnt over the course of a lifetime. Indeed, there is evidence to suggest that prior experience can affect sound perception (Repp & Knoblich, 2007), and some suggestion that contingency strength alters neurophysiological response in the auditory (Baess et al., 2008; SanMiguel, Widmann, Bendixen, Trujillo-Barreto, & Schröger, 2013) and visual


(Roussel, Hughes, & Waszak, 2014) domains (see Horváth, 2015, for a discussion). Hence, efference copy modulation may not be wholly responsible for all cases of sensory attenuation, and an account based on learned associations may explain observed effects in contingencies that lack the highly specific motor-to-sensory mapping that exists during speech production. One possible test of the contribution of learned associations would be to employ a novel contingency (i.e., one never experienced before) between motor output and sensory input. For example, volitional eye movements cannot directly cause sounds in our natural environment, and therefore it seems implausible that an efference copy of the eye motor command would be sent to the auditory cortex. Hence, for a contingency between an eye movement and an auditory stimulus, it is almost certain that no learned association exists, which provides a strong test of whether learned associations are an essential component of sensory attenuation. Moreover, if neurophysiological auditory attenuation were still observed for this novel contingency, it would imply that motor-to-sensory mapping is indeed generalizable to indirect action–sensation contingencies. This result would support the forward model account of sensory attenuation. However, if sensory attenuation were not observed, a limit of generalizability would have been identified, which may suggest that learned associations are driving button press-initiated auditory N1 attenuation. Alternatively, sensory attenuation may be driven by a combination of these processes; for example, it seems possible that forward prediction could be established for novel contingencies, such that levels of N1 attenuation are mediated by associative strength.

Accordingly, the primary aim of this chapter’s study was to determine whether auditory ERP attenuation, particularly with respect to the N1 component, would occur following an action–sensation contingency for which no prior learning exists, and thus provide a valuable contribution to the discussion about the processes underlying sensory attenuation. To do so, our experimental design contained a new condition that associated eye movements (the motor output) with tones (the auditory sensory input), a pairing that cannot occur outside of artificial contexts, and for which it is difficult to imagine an analogous, naturally occurring pairing of events. Specifically, participants were presented with a pure tone (as is common in auditory ERP studies of this nature, e.g., Baess et al., 2008) following a singular, volitional saccadic

movement. To determine whether N1 attenuation occurred, their resultant electrophysiological response, following subtraction of a motor condition (i.e., the same eye movement without stimulus presentation), was compared with an externally initiated condition in which tones were presented without any participant input. Given the possibility raised above that a combination of efference copy and associative processes drive sensory attenuation, we expected to observe N1 attenuation for saccade-initiated stimuli compared with the externally initiated condition. Even so, it seemed unlikely that a novel contingency could produce the same level of attenuation as previously seen in button-press initiation experiments, as we have substantial prior experience with auditory sensory input following hand motor output. In view of this hypothesis, our secondary aim was to quantitatively compare self-initiation effects associated with different regions of motor output (i.e., hand and eye). To achieve this aim, our within-subjects design also included an established button press-initiated condition, for which convincing N1 and P2 attenuation has been demonstrated (Mifsud, Oestreich, et al., 2016; Oestreich et al., 2016; Whitford et al., 2011). We chose to additionally assess the P2 component (a large voltage positivity that peaks approximately 200 ms after stimulus onset), which reflects the processing of specific auditory features (Shahin, Roberts, Pantev, Trainor, & Ross, 2005), for two reasons. First, SanMiguel, Todd, et al. (2013) have suggested that it may provide a more direct measure of sensory-specific prediction effects than the N1, because, unlike N1 effects, P2 attenuation was uniform over different stimulus onset asynchronies. Hence, it seems prudent to report P2 effects to enable comparison between different paradigms. Second, the P2 component has previously been shown to discriminate between self-initiated conditions that differ by motor output region (i.e., hand and foot; van Elk, Salomon, Kannape, & Blanke, 2014), if not necessarily contingency strength (that is, we have substantial experience with both hands and feet producing auditory stimulation), and is therefore relevant given the disparate eye and hand regions targeted by this study’s design.


4.2 Method

4.2.1 Participants

Forty participants were recruited with an online experiment management system at UNSW Sydney. There were 18 females, 36 were right-handed, and mean age was 23 years (SD = 7). Participants provided written, informed consent and received either course credit (n = 24) or financial reimbursement (n = 16, A$30) in exchange for their time. This study was approved by the UNSW Human Research Ethics Advisory Panel (Psychology).

4.2.2 Procedure

Participants sat in a quiet, artificially lit room, 60 cm in front of a computer monitor with integrated eye tracking system (Tobii TX300: 300 Hz gaze sampling rate; 23", 60 Hz, 1920 × 1080 resolution TFT screen; accuracy of 0.4° visual angle; system latency under 10 ms). Following a demographics questionnaire, participants were fitted with the EEG cap and electrodes, and underwent a 5-point eye tracking calibration procedure. EEG was then continuously recorded while participants completed an experimental protocol (see Figure 4.1) that was controlled by MATLAB (MathWorks, Natick, US) with Psychophysics Toolbox extensions (Brainard, 1997; Kleiner et al., 2007; Pelli, 1997). The experiment consisted of five conditions: two types of self-initiation (i.e., button press and saccade), two corresponding motor control conditions (i.e., button presses and saccades without consequent tones), and an externally initiated condition (i.e., tones played automatically). These conditions are described in detail below. Each condition was presented in a homogenous 80-trial block, and block order was randomized between participants. Three practice trials preceded each block to ensure participants understood instructions, and, in the self-initiated conditions, to encourage self-paced responses. A uniformly distributed random interval (2–4 s) preceded the beginning of each trial. The EEG recording lasted approximately 50 min, including three additional blocks intermixed with those presented here. These additional blocks contained self- and externally initiated visual stimuli, and are presented in Chapter 5.


Figure 4.1 Experimental protocol. In the press initiation condition, participants depressed a button at will any time after a fixation dot appeared. Immediately following the button press, a tone was delivered. In the saccade initiation condition, participants focused on a distal dot and then shifted their gaze at will to the center fixation dot, which immediately delivered a tone. In the externally initiated condition, tone delivery followed a variable delay without motor input. Lastly, the control conditions were identical to their respective stimulus conditions, except that motor input did not result in tone delivery.

While Chapter 3 (Mifsud, Oestreich, et al., 2016) included a cued condition (i.e., externally initiated stimuli with predictable onset) as an additional form of control condition, it is not included in the study reported in this chapter or the subsequent studies in this thesis. The primary reason for this change in design was practical: by introducing new stimulus conditions (culminating in Chapter 6 with press, saccade, and blow in a single experiment), extra motor conditions are a necessity. By removing the cued condition, the number of trials per condition could be increased (from 60 to 80) while limiting the EEG recording duration to manage participant fatigue. These aspects of ERP experiments are favourable for maximizing signal-to-noise ratio (Luck, 2005). Moreover, as discussed in §3.4, cued conditions vary widely in their implementation across the literature. For example, temporal predictability has been provided with the temporal rhythm of stimuli (Lange, 2009), visual countdowns (Ford, Gray, et al., 2007), and spectrogram-style cues (Oestreich et al., 2015). Given this methodological diversity, precedence was given to questions related to new types of action–sensation contingencies.


Press condition. This self-initiated condition measured electrophysiological response to an auditory stimulus produced by a button press (i.e., hand motor output). Participants were instructed to respond at will any time after the appearance of a red fixation dot (0.7° diameter) presented in the center of a black screen, and did so by pressing the space bar on a low-latency keyboard with their dominant hand. Responses immediately delivered a tone (30 ms duration, 500 Hz frequency, 70 dB sound pressure level) to their headphones (AKG K77 Perception). The shorter duration of the tone used here and in Chapter 6, compared to the 100-ms tone used in Chapter 3, was to match the duration of the visual flash reported in Chapter 5 (see §7.2.2 for a detailed discussion). Both 30- and 100-ms durations fall within the range of tone durations for which auditory N1 attenuation has previously been reported, from 1 ms (Schafer & Marcus, 1973) to 400 ms (Sowman et al., 2012).
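The tone itself is straightforward to reconstruct; the Python sketch below generates a comparable 30-ms, 500-Hz pure tone. The audio sampling rate and the brief onset/offset ramps are assumptions: the actual stimulus was generated and delivered by the MATLAB/Psychtoolbox script, and its 70 dB SPL level was set at the hardware.

```python
import numpy as np

fs = 44100                       # audio sampling rate (assumed)
dur, freq = 0.030, 500.0         # 30-ms, 500-Hz pure tone as described above
t = np.arange(int(fs * dur)) / fs
tone = np.sin(2 * np.pi * freq * t)

# Short 2-ms linear ramps (an assumption) to avoid audible onset/offset clicks.
ramp = int(0.002 * fs)
envelope = np.ones_like(tone)
envelope[:ramp] = np.linspace(0.0, 1.0, ramp)
envelope[-ramp:] = np.linspace(1.0, 0.0, ramp)
tone *= envelope
```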

Saccade condition. This self-initiated condition measured electrophysiological response to an auditory stimulus produced by a volitional saccade (i.e., eye motor output). Each trial began with two dots appearing on screen: a solid red circle in the center of the screen (identical to the fixation in the press initiation condition) and a distal (17° left) hollow white circle. Participants were instructed to initially fixate on the white circle, which would turn solid once the script detected their gaze, based on a 20 ms sample of location recordings. If detection took longer than 5 seconds, trials were skipped with replacement (M = 0.8 skipped trials per participant across both stimulus and motor saccade-initiated blocks). Following fixation on the white circle, participants shifted their gaze at will to the red circle, which immediately delivered a tone to the headphones (identical to the press-initiated tone). More precisely, tone delivery followed detection of the gaze within the 200-px (5°) square area of interest surrounding the central red circle.
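The gaze-contingent trigger logic can be sketched as follows. This is a minimal Python illustration rather than the actual MATLAB/Tobii implementation: `get_gaze_sample()` is a hypothetical wrapper around the tracker's streaming output, and the polling loop is simplified.

```python
import numpy as np

SAMPLE_RATE = 300                    # Hz, Tobii TX300 gaze sampling rate
WINDOW = int(0.020 * SAMPLE_RATE)    # 20-ms sample of location recordings
AOI_HALF = 100                       # 200-px square area of interest -> +/-100 px

def gaze_in_aoi(samples, centre):
    """True if every (x, y) sample in the window lies inside the square AOI."""
    xs, ys = np.asarray(samples).T
    return bool(np.all(np.abs(xs - centre[0]) <= AOI_HALF) and
                np.all(np.abs(ys - centre[1]) <= AOI_HALF))

def wait_for_gaze(centre, timeout=5.0):
    """Poll gaze until a 20-ms run of samples falls inside the AOI, or time out
    (timed-out trials were skipped with replacement)."""
    buffer, elapsed = [], 0.0
    while elapsed < timeout:
        buffer.append(get_gaze_sample())     # hypothetical tracker call
        buffer = buffer[-WINDOW:]
        if len(buffer) == WINDOW and gaze_in_aoi(buffer, centre):
            return True                      # deliver the tone at this point
        elapsed += 1.0 / SAMPLE_RATE
    return False
```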

Motor conditions. The motor control conditions were identical to their respective self-initiated conditions, except that pressing the space bar or shifting gaze between circles did not result in the delivery of a stimulus. The ensuing EEG activity was subsequently subtracted from the appropriate self-initiated conditions to remove EEG activity associated with button pressing, as is standard practice (Baess et al., 2008; Martikainen et al., 2005; Whitford et al., 2011), or with the singular, volitional eye movement associated with the saccade-initiated procedure.


External condition. In this condition, auditory stimuli were delivered without participant input to assess electrophysiological response to externally initiated, temporally unpredictable stimuli. Trials began with a red fixation dot followed by a uniformly distributed random interval (0.5–2.5 s), after which a tone was delivered to the headphones (identical to that in the self-initiated conditions). Participants were instructed to keep their eyes open and maintain their gaze on the screen.

Data collection and reduction. EEG was recorded with a BioSemi ActiveTwo system with 64 Ag-AgCl active electrodes placed according to the extended 10–20 system. Analog signals were anti-aliased with a fixed first-order filter (-3 dB at 3,600 Hz) and continuously digitized at a sampling frequency of 2,048 Hz, with common mode sense (CMS) and driven right leg (DRL) used as reference and ground electrodes. During offline preprocessing, data were re-referenced to the averaged mastoid electrodes, band-pass filtered from 0.01 to 30 Hz (8th order zero-phase Butterworth IIR), and separated into 600-ms epochs (100 ms pre-onset and 500 ms post-onset). Data were baseline corrected with the average voltage between -100 and 0 ms. To address eye blinks and movement artifacts, we rejected individual epochs at any electrode site that contained EEG activity exceeding ±75 µV or min-max changes in excess of 75 µV between adjacent 100-ms intervals. At electrode Cz, this resulted in a mean rejection rate of 4.2% ± 5.4% (SD) of trials, with no significant differences between stimulus blocks. We then averaged individual trials in each condition to produce the ERPs of each participant. At this point, a headphone-induced artifact (a solitary spike around 30 ms after tone onset) was removed for 7 participants using independent component analysis (ICA) with the FastICA algorithm (Hyvärinen & Oja, 2000). Lastly, motor waveforms were subtracted from the appropriate self-initiated waveforms to produce difference waveforms, as is typical in other button press-initiated sensory attenuation studies (Baess et al., 2008; Martikainen et al., 2005; Whitford et al., 2011), a practice extended here to our saccadic initiation paradigm. Hereafter, unless noted otherwise, mentions of the self-initiated waveforms refer to these motor-corrected waveforms. Data preprocessing was done in BrainVision Analyzer 2 (Brain Products GmbH, Munich, Germany), and statistical analyses were performed in SPSS version 23 (IBM Corp., Armonk, US).
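A minimal sketch of this pipeline in Python is given below for illustration. The actual preprocessing was performed in BrainVision Analyzer 2; here `raw` (channels × samples, in µV), `onsets` (stimulus onset samples), and `mastoids` (indices of the two mastoid channels) are assumed inputs, and a forward-backward 4th-order Butterworth is assumed to approximate the 8th-order zero-phase filter described above.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 2048                                          # Hz, acquisition rate
SOS = butter(4, [0.01, 30], btype="bandpass", fs=FS, output="sos")

def preprocess(raw, onsets, mastoids):
    """Return the average ERP for one condition (a simplified sketch)."""
    ref = raw[mastoids].mean(axis=0)               # averaged-mastoid reference
    data = sosfiltfilt(SOS, raw - ref, axis=1)     # zero-phase band-pass
    pre, post = int(0.1 * FS), int(0.5 * FS)       # 600-ms epochs
    epochs = np.stack([data[:, o - pre:o + post] for o in onsets])
    epochs -= epochs[:, :, :pre].mean(axis=2, keepdims=True)   # baseline correct

    # Reject epochs exceeding +/-75 uV at any electrode.
    keep = np.abs(epochs).max(axis=(1, 2)) <= 75
    # Approximate the step criterion: mean voltage must not change by more
    # than 75 uV between adjacent 100-ms intervals.
    step = int(0.1 * FS)
    n_int = epochs.shape[-1] // step
    means = epochs[..., :n_int * step].reshape(*epochs.shape[:2], n_int, step).mean(-1)
    keep &= np.abs(np.diff(means, axis=-1)).max(axis=(1, 2)) <= 75
    return epochs[keep].mean(axis=0)

# Motor correction, applied per participant and self-initiated condition, e.g.:
# erp_press = preprocess(raw_press, press_onsets, mastoids) \
#           - preprocess(raw_press_motor, press_motor_onsets, mastoids)
```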


4.3 Results

Figure 4.2 shows the grand-averaged ERPs at electrode Cz and its neighboring electrodes (i.e., FCz, C1, C2, and CPz), which represent locations at which the N1 and P2 components are typically maximal, especially after bilateral auditory stimulation (Ford, Gray, et al., 2007; Luck, 2012; Näätänen & Picton, 1987). In addition, Figure 4.3 shows a complete view of scalp activity during the component windows in each stimulus condition. We conducted separate one-way repeated-measures ANOVAs to determine the effect of condition (three levels: press-initiated, saccade-initiated, and externally initiated) on the mean amplitudes of the N1 (100 to 110 ms) and P2 (180 to 200 ms) at electrode Cz.
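The component measurement and omnibus test can be sketched as follows. The original statistics were computed in SPSS; this Python illustration assumes `erp[cond]` is an (n_subjects × n_samples) array of motor-corrected Cz waveforms with a 100-ms baseline, and uses statsmodels' repeated-measures ANOVA as a stand-in.

```python
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

FS, BASELINE = 2048, 0.100       # sampling rate (Hz) and pre-stimulus interval (s)

def window_mean(data, lo, hi):
    """Mean amplitude between lo and hi seconds relative to stimulus onset."""
    i0, i1 = int((BASELINE + lo) * FS), int((BASELINE + hi) * FS)
    return data[:, i0:i1].mean(axis=1)

rows = []
for cond in ("press", "saccade", "external"):
    for component, (lo, hi) in {"N1": (0.100, 0.110), "P2": (0.180, 0.200)}.items():
        for subj, amp in enumerate(window_mean(erp[cond], lo, hi)):
            rows.append({"subject": subj, "condition": cond,
                         "component": component, "amplitude": amp})
df = pd.DataFrame(rows)

# Separate one-way repeated-measures ANOVA per component.
for component, sub in df.groupby("component"):
    res = AnovaRM(sub, depvar="amplitude", subject="subject",
                  within=["condition"]).fit()
    print(component, "\n", res.anova_table)
```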

4.3.1 Auditory N1 (100 to 110 ms)

There was a main effect of condition for the N1 component, F(2,78) = 13.85, p < .001, ηp² = .26, which indicated that mean N1 amplitude at electrode Cz differed between press (M = -2.81, SD = 7.43), saccade (M = -5.79, SD = 7.30), and external (M = -7.73, SD = 6.99) conditions. Follow-up pairwise comparisons indicated that external significantly differed from both press, F(1,39) = 29.89, p < .001, and saccade, F(1,39) = 4.78, p = .035. Additionally, press and saccade significantly differed from each other, F(1,39) = 8.35, p = .006. These outcomes were underscored by a highly significant linear trend for N1 mean amplitudes across the conditions (F = 29.88, p < .001, ηp² = .43).

4.3.2 Auditory P2 (180 to 200 ms)

There was also a main effect of condition for the P2 component, F(2,78) = 19.85, p < .001, ηp² = .34, which indicated that mean P2 amplitude at electrode Cz differed between press (M = 6.78, SD = 6.12), saccade (M = 12.76, SD = 7.79), and external (M = 14.09, SD = 7.14) conditions. Follow-up pairwise comparisons indicated that external significantly differed from press, F(1,39) = 36.86, p < .001, but did not differ from saccade, F(1,39) = 1.40, p = .244. However, press and saccade significantly differed from each other, F(1,39) = 19.20, p < .001. As for the N1, there was a highly significant linear trend for P2 mean amplitudes across the conditions (F = 36.88, p < .001, ηp² = .49).


Figure 4.2 Grand-averaged ERPs at electrodes FCz, C1, Cz, C2, and CPz of the press, saccade, and external conditions. Self-initiated conditions (i.e., press and saccade) are motor corrected. Voltage is on the y axis and time in milliseconds is on the x axis, where tone onset is at 0 ms.


Figure 4.3 Topographic maps of the auditory N1 (100 to 110 ms) and P2 (180 to 200 ms) components in each condition. Self-initiated conditions (i.e., press and saccade) are motor corrected.


4.3.3 Uncorrected waveforms

The effect of the motor subtraction procedure can be seen in Figure 4.4, which presents grand-averaged ERPs at electrode Cz and its neighbors (i.e., FCz, C1, C2, and CPz) in the uncorrected conditions and their corresponding motor controls. This illustrates that the motor subtraction in fact reduced the N1 difference between button press- and saccade-initiated conditions. Figure 4.5 shows scalp activity during the component windows in each uncorrected condition and its motor control.

For the N1 component, a one-way ANOVA with the uncorrected amplitudes revealed a main effect of condition, F(2,78) = 35.38, p < .001, ηp² = .48, which indicated that mean N1 amplitude at electrode Cz differed between press (M = -0.80, SD = 7.88), saccade (M = -7.22, SD = 6.38), and external (M = -7.73, SD = 6.99) conditions. Follow-up pairwise comparisons indicated that external significantly differed from press, F(1,39) = 51.69, p < .001, but not saccade, F(1,39) = 0.39, p = .535, and that press and saccade significantly differed from each other, F(1,39) = 44.84, p < .001.

There was also a main effect for the P2 component, F(2,78) = 5.68, p = .005, ηp² = .13, which indicated that mean P2 amplitude at Cz differed between press (M = 10.70, SD = 6.76), saccade (M = 11.62, SD = 6.27), and external (M = 14.09, SD = 7.14) conditions. Follow-up pairwise comparisons indicated that external significantly differed from both press, F(1,39) = 8.63, p = .006, and saccade, F(1,39) = 8.54, p = .006, but that press and saccade did not differ, F(1,39) = 0.71, p = .405.

4.3.4 Evaluating possible refractoriness contributions

Mean ISIs differed between conditions (M = 3,872–4,529 ms; SD = 484–895 ms). Specifically, paired samples t tests revealed significant differences in ISI between external and press, t(76) = 5.60, p < .001, and external and saccade, t(76) = 3.32, p = .025. However, there was no significant difference between press and saccade, nor between press/saccade and their corresponding motor conditions, ts ≤ 1.36 and ps ≥ .87. As refractoriness can influence N1 amplitude (Budd et al., 1998), we entered each participant’s average ISI in the external condition as a centered covariate in the ANOVA procedures described in §4.3.1 and §4.3.2. The outcomes were identical; that is, the main effects for each component remained, and all the contrast outcomes were unchanged. This suggests that ISI differences did not have a systematic effect.


Figure 4.4 Grand-averaged ERPs at electrodes FCz, C1, Cz, C2, and CPz in the uncorrected self-initiated conditions (solid traces) and their corresponding motor conditions (dotted traces). Voltage is on the y axes and time in milliseconds is on the x axes, where tone onset is at 0 ms.


Figure 4.5 Topographic maps of the auditory N1 (100 to 110 ms) and P2 (180 to 200 ms) component windows in the uncorrected stimulus conditions (press, saccade) and their motor controls.

4.4 Discussion

The study reported in this chapter investigated sensory attenuation of self-initiated stimuli in terms of the effect of motor output region (i.e., hand or eye actions) on neurophysiological response to identical contingent sensory input (i.e., auditory tones). We found that button press-initiated stimuli evoked significantly reduced N1 and P2 component amplitude compared with both saccade- and externally initiated stimuli. In the saccade-initiated condition, we found an intermediate level of N1 attenuation, that is, a significant reduction compared with the externally initiated condition, but significantly less reduction than in the button press-initiated condition. In contrast, there was no difference between the saccade- and externally initiated conditions in terms of their P2 component amplitude. Hence, in relation to our primary aim, which was to determine whether ERP attenuation would occur following a novel action–sensation contingency (i.e., saccade–tone), the data indicate that it does for the N1 component, but not the P2 component. We will discuss the possible implications of these outcomes in turn.

Our finding that button-press initiation was associated with significant N1 and P2 attenuation was expected, as it replicates previous iterations of a similar paradigm (Mifsud, Oestreich, et al., 2016; Oestreich et al., 2016; Whitford et al., 2011) and

aligns with data from several other research groups (e.g., Aliu et al., 2009; Knolle, Schröger, Baess, & Kotz, 2012; Sowman et al., 2012). More crucially in terms of our experimental rationale was the new finding that N1 attenuation also occurred in the saccade-initiated condition, despite participants’ lack of prior experience with the novel saccade–tone contingency. This suggests that sensory attenuation does not rely upon learned associations, and potentially that forward prediction can operate with any combination of motor and sensory events. The finding that saccade-initiated N1 attenuation was not as strong as button press-initiated N1 attenuation could suggest that learned associations contribute to the size of N1 attenuation in button-press paradigms, or it may be that optimal parameters of saccade initiation are still to be determined. Studies that introduce a wider range of action–sensation contingencies than are presently investigated could provide a clearer indication of the influence of existing learned associations on the sensory attenuation of self-initiated stimuli. Based on the results in this chapter, one might hypothesize a positive relationship between N1 attenuation and the existing strength of action–sensation contingencies. As such, a motor action that has consistently resulted in a given sound over the course of a person’s life would be expected to elicit a high degree of N1 attenuation.

For the P2 component, saccade-initiated stimuli did not differ from externally initiated stimuli, which diverges from the press-initiated outcome and indicates that the P2 is sensitive to a property of the motor output region. For example, P2 attenuation may more directly correlate with “contingent temporal certainty” (Sowman et al., 2012) or “sensory-specific predictions” (SanMiguel, Todd, et al., 2013) than N1 attenuation. Such factors are more strongly implicated by button-press initiation than saccade initiation, given the greater practice we have with auditory stimulation following button presses as opposed to saccades, which may explain the present pattern of results. This is similar to the speculation provided by van Elk et al. (2014) in explaining the lack of P2 modulation of feet-initiated compared with hand-initiated stimuli, who argued that associations between foot actions and subsequent sounds are less established than for hands. However, direct evidence for this line of reasoning remains scarce. Shahin et al. (2005) showed that P2 (but not N1) response to tones was enhanced for pianists, and when the tones became more spectrally complex, which indicates that the P2 reflects the specific

features of acoustic stimuli. To properly link this study to sensory attenuation, it would be useful to see if there is an inverse pattern for self-initiated tones (i.e., that attenuation increases with spectral complexity). Moreover, in any attempt to functionally dissociate N1 and P2, it is important to consider pertinent research in the speech domain. For instance, Houde and Chang (2015) recently accounted for differences between N1 and P2 responses by way of the state feedback control (SFC) model of speaking. As mentioned earlier, discordance in the pattern of ERP response following speech and non-speech motor actions seems related to the likelihood that the predictions generated by the responsible neural mechanisms differ in their degree of specificity.

As the saccade initiation paradigm presented in the present study is new, we acknowledge that the motor subtraction procedure, standard practice in contingent button-press studies investigating sensory attenuation (e.g., Baess et al., 2008; Martikainen et al., 2005; Whitford et al., 2011), has not been previously applied to eye movements. A direct investigation into possible differences in motor activity between action-only and action-response conditions for both button-press and saccade initiation would be worthwhile. Nonetheless, it is crucial to highlight that in the present study, button press- and saccade-initiated conditions significantly differed even when uncorrected for motor activity (see Figure 4.4). That is, motor subtraction reduced the N1 difference between button press- and saccade-initiated conditions, suggesting that there are true N1 attenuation differences related to motor output region.

An explicit assumption so far in this discussion has been that N1 and P2 differences between motor output conditions may only be ascribed to the presence (or lack) of either forward prediction or learned associations, but it should be noted that other contributory factors are feasible. For instance, exogenous orienting effects may play a role if differences in the strength of learned associations produce differences in the degree to which stimuli are expected (Santangelo, Olivetti Belardinelli, & Spence, 2007). It is also possible that using a saccadic movement to investigate sensory attenuation poses an issue due to the existence of saccadic suppression, a compensatory mechanism classically grounded in forward prediction (Sperry, 1950; von Holst, 1954), in which stability of the visual field is maintained by suppression of the retinal consequences of eye movements. However, exactly why

saccadic suppression would have a substantial effect on auditory perception seems unclear. Outside of the laboratory, there are no situations in which auditory consequences directly arise due to saccades. Even beyond saccades, eye movements produce only visual, and, rarely, social consequences (e.g., staring, eye rolling). If eye movements were to attenuate sensory perception in non-visual modalities, there would be a persistent, resource-intensive source of usually irrelevant suppression in the sensory system. In any case, visual suppression has been shown to disappear within ~50 ms post-saccade (Diamond, Ross, & Morrone, 2000), prior to the time window of the auditory components examined in the present study, further diminishing the likelihood of non-visual modulation. Moreover, in a multimodal localization task, Binda, Bruno, Burr, and Morrone (2007) found that due to blurring during saccades, auditory rather than visual signals were accorded heavier weight. This suggests that if saccades were to affect auditory perception, the auditory signal would be enhanced. That we found the opposite effect indicates that the saccade per se is not causing N1 attenuation.

The study reported in this chapter presented a new saccade initiation paradigm, which demonstrated that auditory N1 attenuation occurs even with an action–sensation contingency for which no prior learning exists. We suggest that sensory-specific forward prediction is an essential component of sensory attenuation, but, as saccade-initiated N1 attenuation was weaker than button press-initiated N1 attenuation, that learned associations are likely to also play a role. Moreover, we have provided support for the notion that the processes contributing to N1 and P2 effects may be independent, but the specific identity and functionality of these underlying processes, particularly P2, remains an open question.


5 Saccade initiation of visual stimuli

Submitted as:

Mifsud, N. G., Beesley, T., Watson, T. L., Elijah, R. B., & Whitford, T. J. (2017). Attenuation of visual evoked responses to hand and saccade-initiated flashes. Manuscript under review.

Author contributions:

Conceptualization: NGM TJW. Methodology: NGM TJW. Software: NGM TB. Investigation: NGM RBE. Data curation: NGM. Formal analysis: NGM. Visualization: NGM. Writing – original draft: NGM. Writing – review & editing: NGM TB TLW RBE TJW. Funding acquisition: TJW. Supervision: TJW.

Context in thesis:

The following study proceeds directly from the previous two studies, by contrasting coincident action–sensation pairings of differing causal likelihood and extending the visual sensory attenuation literature. We used the saccade initiation paradigm first introduced in Chapter 4. Here, participants used button presses or saccades to produce full-field flash stimuli (rather than tones). Note that because participants had to move their eyes across a screen during the saccadic condition, pattern reversals—used in Chapter 3—were unsuitable, as stimulation would occur before the reversal.


5.1 Introduction

Sensory attenuation refers to self-initiated stimuli evoking reduced neurophysiological (Baess et al., 2008; Houde et al., 2002; Schafer & Marcus, 1973) and phenomenological (Blakemore et al., 1999; Cardoso-Leite et al., 2010; Sato, 2008) sensory representations, compared with the sensory representations evoked by physically identical, externally initiated stimuli. The phenomenon is typically explained with a forward model that predicts the sensory consequences of intended actions based on internal motor commands, where these predictions are subtracted from actual sensory input (Bays & Wolpert, 2007; Wolpert et al., 1995). Conversely, externally initiated stimuli lack accompanying motor information, and are thus marked by a large disparity between predicted and actual sensory inputs. Notably, Feinberg (1978) first suggested that disruption of this distinction between self and the external world could account for some of the characteristic symptoms of schizophrenia (e.g., delusions of control), and evidence has emerged to support this theory (Ford et al., 2001; Pinheiro, Rezaii, Rauber, & Niznikiewicz, 2016; Whitford et al., 2011).

Studies of sensory attenuation have thus far largely been limited to the auditory domain (e.g., see Table 3 in Hughes et al., 2013), centered on a reliable ERP component that is used by multiple research groups as an index of sensory attenuation—that is, the N1 or N1m component, an evoked potential or magnetic field that is consistently reduced for self-initiated vocalizations and tones (Baess et al., 2008; Curio et al., 2000; Houde et al., 2002; Mifsud, Beesley, Watson, & Whitford, 2016; Sowman et al., 2012). Given the well-established positive relationship between the auditory N1 and stimulus intensity (Näätänen & Picton, 1987)—i.e., loud sounds evoke larger auditory N1 amplitudes than do soft sounds—the finding that self-initiated sounds have reduced auditory N1 response suggests that the brain processes them as being “softer”. This reduced perceived loudness of self-initiated sounds may reflect an ecological adaptation, in the sense that the strong auditory feedback associated with our own speech may require attenuation to preserve the sensitivity of receptors to incoming sounds (Bendixen et al., 2012).

In contrast to the auditory domain, studies of sensory attenuation in the visual domain are relatively scarce, and results less easily reconciled. Reported differences

in VEPs between self- and externally initiated visual stimuli are inconsistent in terms of both their direction and spatial location. This may be due to the diverse range of stimuli used and, relatedly, the choice of event-related components and reference electrode sites. Self-initiation has been shown to result in anterior (but not occipital) reduction of N1 for flashes (Schafer & Marcus, 1973, mastoid-referenced data) and arrow shapes (Gentsch & Schütz-Bosbach, 2011, average-referenced data), and occipital reduction of P2 for faces and houses (Hughes & Waszak, 2014, FCz-referenced data). Conversely, occipital amplification of P1 has been shown for pattern onsets (Hughes & Waszak, 2011, vertex-referenced data) and occipital amplification of N145 for pattern reversals (Chapter 3; Mifsud, Oestreich, et al., 2016, Fz-referenced data). However, sensory attenuation has also been found in behavioral tasks that use Gabor patches (Cardoso-Leite et al., 2010; Stenner et al., 2014). The clear differences in the reported results mean that further experimental work is required in the visual domain that builds on existing self-initiation paradigms.

Another limiting factor of investigating sensory attenuation in the visual domain is that nearly all such studies direct participants to initiate visual stimuli by pressing a button. Despite this highly specific experimental condition, there is an implicit assumption that we can generalize findings across a much wider range of action–sensation contingencies. In the auditory domain, the limits of this assumption have been tested by van Elk et al. (2014) and in Chapter 4 (Mifsud, Beesley, et al., 2016) with paradigms that use foot and saccade initiation, respectively, to demonstrate that differences in auditory-evoked potentials were dependent on the region of motor output used to produce the incoming stimulus. In Chapter 4, for example, more auditory N1 attenuation was found for button press-initiated tones than for saccade-initiated tones, consistent with the fact that although hand movements are strongly associated with auditory sensations (e.g., the sound of one’s fingers typing on a keyboard), eye movements are rarely, if ever, accompanied by auditory feedback.

The study reported in this chapter explored whether a similar ERP pattern exists for self-initiated visual stimuli, using the saccade initiation paradigm introduced in Chapter 4 (Mifsud, Beesley, et al., 2016). We aimed to determine whether VEP attenuation would occur for button press- and saccade-initiated flashes compared with externally initiated flashes. We expected that button press-initiated

flashes would evoke attenuated visual N1 based on a similar condition reported by Schafer and Marcus (1973). Notably, however, we hypothesized that there would be increased visual N1 attenuation for saccade-initiated flashes compared with button press-initiated flashes, given that eye movements, rather than hand movements, are more tightly coupled to visual sensations.

Experiment 1

An underlying premise of neurophysiological sensory attenuation is that reductions in the ERP reflect alterations in the perceived intensity of a stimulus. For example, decreased auditory N1 amplitude has been consistently found for self-initiated auditory tones (e.g., Sowman et al., 2012) and auditory N1 amplitude is known to decrease with decreasing stimulus intensity (Näätänen & Picton, 1987). As mentioned earlier, this suggests that self-initiated sounds are processed as “softer” (Weiss et al., 2011). If we are to measure sensory attenuation in the visual domain, we must use a VEP component that reflects the intensity of a visual stimulus, in the same manner that the auditory N1 component is sensitive to loudness. The visual N1 component is a likely candidate, as an early, sensory-evoked component that appears to be partially generated from occipital cortex (see V. P. Clark, Fan, & Hillyard, 1995; Gomez Gonzalez, Clark, Fan, Luck, & Hillyard, 1994; Yamazaki et al., 2000), but (to our knowledge) no previous studies explicitly demonstrate that the visual N1 (i.e., frontocentral maxima, mastoid-referenced) is sensitive to changes in intensity. Hence, the aim of Experiment 1 was to establish the luminance dependence of the visual N1 component.

5.2 Method

5.2.1 Participants

Eleven participants were recruited at UNSW Sydney. Six were female, 8 were right-handed, and mean age was 19 years (SD = 1). Participants gave written, informed consent, and received course credit in exchange for their time. This study was approved by the UNSW Human Research Ethics Advisory Panel (Psychology).


5.2.2 Procedure

Following provision of their demographic information, participants were fitted with an EEG cap and electrodes. EEG was continuously recorded while participants completed the experiment, seated 60 cm from a computer monitor with an integrated eye tracking system (Tobii TX300: 300 Hz gaze sampling rate; 23", 60 Hz, 1920 × 1080 resolution TFT screen; accuracy of 0.4° visual angle; system latency under 10 ms). The eye tracking function was not used in Experiment 1, but was required to achieve saccade detection in Experiment 2. The experiment comprised a series of stimulus presentations of four different types: an unstructured full-field white flash of 33.33 ms duration (i.e., two frames, verified with a photometer), that was either dim (10 cd/m² mean luminance) or bright (100 cd/m²); and two types of pure tones (the data for which are not presented). Participant input was not required at any time. Each trial type was presented 120 times in total, intermixed in a 480-trial sequence whose order was randomized between participants and split into 10 equal blocks separated by 30-s rest periods (a sketch of this trial sequence appears at the end of this section). Individual trials were separated by a uniformly distributed random interval (1–4 s). The EEG recording lasted approximately 30 min. Stimulus presentation was controlled by MATLAB (MathWorks, Natick, US) with the Psychophysics Toolbox extensions (Brainard, 1997; Kleiner et al., 2007; Pelli, 1997). EEG was recorded with a BioSemi ActiveTwo system with 64 Ag-AgCl active electrodes placed according to the extended 10–20 system. Analog signals were anti-aliased with a fixed first-order filter (-3 dB at 3,600 Hz) and continuously digitized at a sampling frequency of 2,048 Hz, with CMS and DRL used as reference and ground electrodes. During offline preprocessing, data were re-referenced to the averaged mastoid electrodes as is typical in visual N1 analysis (V. P. Clark et al., 1995; Vogel & Luck, 2000), band-pass filtered from 0.01 to 30 Hz (8th order zero-phase Butterworth IIR), and separated into 600-ms epochs (100 ms pre-onset and 500 ms post-onset). Data were baseline corrected with the average voltage between -100 and 0 ms. To address eye blinks and movement artifacts, we rejected individual epochs at any electrode site that contained EEG activity exceeding ±75 µV or min-max changes in excess of 75 µV between adjacent 100-ms intervals, a step function procedure recommended by Luck (2012). Individual trials were then averaged for

each condition to produce the ERPs of each participant. Data preprocessing was done in BrainVision Analyzer 2 (Brain Products GmbH, Munich, Germany), and statistical analyses were performed in SPSS version 23 (IBM Corp., Armonk, US).
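As noted above, the four trial types were intermixed into a single randomized sequence; a minimal sketch of that randomization in Python is shown below. The actual script was written in MATLAB with Psychtoolbox, and the tone labels are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng()

# Four trial types, 120 trials each, shuffled into one 480-trial sequence and
# split into 10 equal blocks; the tone labels are placeholders.
trial_types = ["dim_flash", "bright_flash", "tone_a", "tone_b"]
sequence = np.repeat(trial_types, 120)
rng.shuffle(sequence)
blocks = np.split(sequence, 10)                    # 10 blocks of 48 trials

# Uniformly distributed 1-4 s inter-trial intervals.
itis = rng.uniform(1.0, 4.0, size=sequence.size)
```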

5.3 Results and discussion

As the latency and amplitude of flash ERPs are sensitive to stimulus parameters, a collapsed localizer approach was used to guide our analysis (Luck & Kappenman, 2012). The waveforms of the dim and bright conditions were averaged, and this collapsed waveform was used to identify a measurement window centered on the peak and electrode site at which the visual N1 was maximal. Accordingly, to investigate the effect of flash intensity on the visual ERP, we conducted a repeated-measures ANOVA with factors of condition (dim, bright) and site on the mean amplitudes of the visual N1 (149 to 159 ms) at the electrode at which the N1 was maximal (FCz) and the adjacent electrodes (i.e., Fz, FC1, FC2, and Cz). In cases where the assumption of sphericity was violated, a Greenhouse-Geisser correction was used. Figure 5.1 presents the grand-averaged ERPs of the 5-site cluster. Figure 5.2 presents the scalp distributions of each stimulus condition.
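The collapsed-localizer step can be sketched as follows; this is a minimal Python illustration assuming `erp` maps condition labels to (channels × samples) grand averages, `onset` is the stimulus-onset sample, `channels` lists the electrode labels, and the 100-250 ms search range is an assumption.

```python
import numpy as np

FS = 2048
collapsed = 0.5 * (erp["dim"] + erp["bright"])            # condition-blind average

# Locate the most negative deflection within an assumed 100-250 ms search range.
lo, hi = onset + int(0.100 * FS), onset + int(0.250 * FS)
ch, t = np.unravel_index(np.argmin(collapsed[:, lo:hi]), collapsed[:, lo:hi].shape)
peak_site, peak_sample = channels[ch], lo + t             # e.g., FCz, ~154 ms here

# Measurement window: 10 ms centred on the collapsed peak (149-159 ms in the text).
half = int(0.005 * FS)
window = slice(peak_sample - half, peak_sample + half)
```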


Figure 5.1 Grand-averaged ERPs at electrodes Fz, FC1, FCz, FC2, and Cz of the bright and dim flash conditions. Voltage is on the y axes and time in milliseconds is on the x axes, where flash onset is at 0 ms. Gray areas indicate measurement windows.


Figure 5.2 Topographic maps of the visual N1 (149 to 159 ms) for each stimulus condition.

A main effect of condition, F(1,10) = 30.03, p < .001, ηp² = .75, indicated that mean visual N1 amplitude across electrodes Fz, FC1, FCz, FC2, and Cz differed between the dim (M = -2.60, SD = 0.51) and bright (M = -5.27, SD = 0.51) stimulus conditions. There was neither a main effect of site, F(4,40) = 0.18, p = .728, ηp² = .02, nor an interaction between condition and site, F(4,40) = 1.24, p = .312, ηp² = .11. A follow-up pairwise comparison confirmed that mean visual N1 amplitude between the dim (M = -2.70, SD = 1.79) and bright (M = -5.36, SD = 1.75) stimulus conditions at the maximal site (FCz) significantly differed, F(1,10) = 13.54, p < .001. This indicated that the mastoid-referenced visual N1 at frontocentral sites was sensitive to luminance. Specifically, a bright flash produced a larger mean visual N1 amplitude than a dim flash at FCz and its neighbors.

These findings suggested that the amplitude of the flash-evoked visual N1 could serve as a proxy measurement of sensory attenuation in the visual domain (like the well-established use of the auditory N1 in previous sensory attenuation studies in the auditory domain). In Experiment 2, we test the notion that the causal likelihood of pairings between action and visual sensation may influence the degree of sensory attenuation. To do so, the flash-evoked visual N1 was again measured, but stimulus intensity was held constant (i.e., all flashes had the same luminance as the bright flash in Experiment 1) while the initiating motor action was varied.


Experiment 2

5.4 Method

5.4.1 Participants

Forty participants were recruited at UNSW Sydney. Seven participants were excluded due to low signal-to-noise ratio in one or more of the stimulus blocks. Of the remaining 33 participants, 15 were female, 29 were right-handed, and mean age was 22 years (SD = 6). Participants gave written, informed consent, and received either course credit (n = 21) or financial reimbursement (n = 12, A$30) in exchange for their time. This study was approved by the UNSW Human Research Ethics Advisory Panel (Psychology).

5.4.2 Procedure

The prerecording routine, spatial arrangements, and hardware specifications were identical to Experiment 1, with an additional 5-point procedure to calibrate the eye tracking system for saccade detection. Experiment 2 comprised five conditions: two types of self-initiation conditions (i.e., button press- and saccade-initiated flashes) and their corresponding motor control conditions (i.e., button presses and saccades without consequent flashes), and an externally initiated condition (i.e., flashes initiated without participant input). Each condition was presented in a homogenous 80-trial block, where each trial consisted of a single stimulus presentation (or, in the control blocks, a single motor action), and block order was randomized between participants. Three practice trials preceded each block to ensure that participants understood the instructions displayed on screen, and, where appropriate, allowed the experimenter to verbally encourage self-paced rather than speeded responses. Individual trials in all conditions were separated by a uniformly distributed random interval (2–4 s). The EEG recording lasted approximately 50 min.

Press condition. In this condition, a visual stimulus was self-initiated by a button press (i.e., hand motor output). Participants were instructed to respond at will any time after the appearance of a red fixation dot (0.7° diameter) presented in the center of a black screen, and did so by pressing the space bar on a low-latency

keyboard (Ducky Shine 4: 1000 Hz report rate) with their dominant hand. Responses immediately delivered a full-field flash identical to the bright flash used in Experiment 1 (i.e., 100 cd/m² mean luminance, two-frame duration).

Saccade condition. In this condition, the visual stimulus was self-initiated by a volitional saccade (i.e., eye motor output). Each trial began with two dots appearing on screen: a solid red circle in the center of the screen (identical to the fixation in the press-initiated condition) and a distal (17° left) hollow white circle. Participants were instructed to initially fixate on the white circle, which would turn solid once the script detected their gaze, based on a 20 ms sample of location recordings. If detection took longer than 5 seconds, trials were skipped with replacement (M = 2.7 skipped trials per participant across both stimulus and motor saccade-initiated blocks). Following fixation on the white circle, participants shifted their gaze at will to the red circle, which immediately delivered a full-field flash identical to the press-initiated stimulus. More precisely, flashes followed detection of the gaze within the 200-px (5°) square area of interest surrounding the central red circle. We confirmed that system latency was identical in both self-initiation conditions by using a photometer to detect actual delivery of the flash. Specifically, mean latency between the end of the action (button press or the eye attaining fixation in the center) and stimulus delivery was 37 ms ± 7 ms (SD) in both conditions.

Motor conditions. The motor control conditions were identical to their respective self-initiated conditions, except that pressing the space bar or shifting gaze between circles did not result in the delivery of a stimulus. The ensuing EEG activity was subsequently subtracted from the appropriate self-initiated conditions to remove EEG activity associated with button pressing (for the press condition) or a singular, volitional eye movement (for the saccade condition), as is standard practice in button-press studies of this nature (Baess et al., 2008; Martikainen et al., 2005; Whitford et al., 2011), and as was previously used for saccade initiation in Chapter 4 (Mifsud, Beesley, et al., 2016).

External condition. In this condition, stimuli were delivered automatically (i.e., without participant input) to assess electrophysiological response to externally initiated, temporally unpredictable stimuli. Trials began with a red fixation dot followed by a uniformly distributed random interval (0.5–2.5 s) before a flash was

presented (identical to that in the self-initiated conditions). Participants were instructed to keep their eyes open and maintain their gaze on the screen at all times.

EEG data acquisition. EEG was recorded and preprocessed as described in Experiment 1, including data re-referencing to the averaged mastoid electrodes, and a step function artifact rejection procedure that resulted in a mean rejection rate at electrode FCz of 4.5% ± 5.5% (SD) of trials (press: 4.7% ± 6.0%, saccade: 3.9% ± 6.4%, external: 5.0% ± 5.5%), with no significant differences between stimulus blocks, F(2,64) = 0.99, p = .378, ηp² = .03. Once the individual trials for each condition were averaged to produce ERPs, motor waveforms were subtracted from the appropriate self-initiated waveforms to produce participants’ difference waveforms. Figure 5.3 demonstrates the effect of the motor subtraction procedure by presenting grand-averaged ERPs at electrode FCz of the uncorrected waveforms in the self-initiated conditions and their corresponding motor waveforms. Hereafter, unless noted otherwise, mentions of self-initiated waveforms refer to motor-corrected waveforms.
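In array terms, the motor subtraction procedure amounts to a pointwise subtraction of the motor-only ERP from the uncorrected self-initiated ERP for each participant before averaging; a minimal sketch (assuming NumPy arrays of shape participants × channels × timepoints, with illustrative names) is:

import numpy as np

def motor_correct(self_initiated_erp, motor_erp):
    # Difference waveform: uncorrected self-initiated ERP minus motor-only ERP.
    return self_initiated_erp - motor_erp

def grand_average(erps):
    # Average across participants (axis 0) to obtain the grand-averaged ERP.
    return np.mean(erps, axis=0)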

Figure 5.3 Grand-averaged ERPs at electrode FCz of the uncorrected self-initiated conditions (solid traces, left), their corresponding motor conditions (dotted traces, left), and the motor-corrected self-initiation conditions (right). The externally initiated condition (black line, both panels) is included for comparison. Voltage is on the y axes and time in milliseconds is on the x axes, where flash onset is at 0 ms.


5.5 Results and discussion

To investigate the effect of self-initiation on the visual ERP, we conducted separate ANOVAs with condition (press, saccade, external) as a factor on the mean amplitudes of the visual N1 (150 to 160 ms), first at electrode FCz and then for a 5-site cluster that added the adjacent electrodes (i.e., Fz, FC1, FC2, and Cz). In cases where the assumption of sphericity was violated, a Greenhouse-Geisser correction was used. This analysis mirrored Experiment 1, with a trivially different measurement window (149 to 159 ms in Experiment 1 vs. 150 to 160 ms here) owing to the latency of the peak identified in the collapsed waveform formed from the average of the press, saccade, and external conditions. Figure 5.4 presents the grand-averaged ERPs at electrode FCz and adjacent sites. Figure 5.5 presents the scalp distributions of each stimulus condition.
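A sketch of this analysis pipeline is given below (assuming motor-corrected single-electrode ERPs stored as participants × timepoints arrays, an illustrative 512 Hz sampling rate and 100 ms baseline, and statsmodels' AnovaRM, which does not itself apply the Greenhouse-Geisser correction):

import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

FS, BASELINE_MS = 512, 100   # illustrative sampling rate and baseline length

def window_mean(erp, start_ms, end_ms):
    # Mean amplitude per participant in a post-onset window (e.g., 150-160 ms).
    i0 = int((BASELINE_MS + start_ms) / 1000 * FS)
    i1 = int((BASELINE_MS + end_ms) / 1000 * FS)
    return erp[:, i0:i1].mean(axis=1)

def condition_anova(erps_by_condition):
    # One-way repeated-measures ANOVA on mean N1 amplitude with condition as the factor.
    rows = []
    for condition, erp in erps_by_condition.items():
        for subject, amplitude in enumerate(window_mean(erp, 150, 160)):
            rows.append({"subject": subject, "condition": condition, "amplitude": amplitude})
    return AnovaRM(pd.DataFrame(rows), depvar="amplitude", subject="subject",
                   within=["condition"]).fit()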

At electrode FCz, a main effect of condition, F(2,64) = 9.19, p < .001, ηp² = .22, indicated that mean visual N1 amplitude differed between the press (M = -3.88, SD = 4.34), saccade (M = -1.61, SD = 5.10), and external (M = -5.59, SD = 3.45) conditions. Follow-up pairwise comparisons indicated that external significantly differed from press, F(1,32) = 4.89, p = .034, as well as saccade, F(1,32) = 16.54, p < .001. Likewise, press and saccade significantly differed from each other, F(1,32) = 4.92, p = .034. Crucially, this demonstrated that saccade-initiated flashes evoked a significant reduction in mean visual N1 amplitude at FCz compared with both button press- and externally initiated flashes. Similarly, for the 5-site cluster, a main effect of condition, F(2,64) = 10.41, p < .001, ηp² = .25, indicated that mean visual N1 amplitude across electrodes Fz, FC1, FCz, FC2, and Cz differed between the press (M = -4.04, SD = 0.67), saccade (M = -1.54, SD = 0.83), and external (M = -5.37, SD = 0.59) conditions. Follow-up pairwise comparisons indicated that external significantly differed from saccade, F(1,32) = 17.31, p < .001, but not press, F(1,32) = 3.52, p = .070. However, press and saccade significantly differed from each other, F(1,32) = 7.52, p = .010. There was neither a main effect of site, F(4,128) = 1.51, p = .223, ηp² = .05, nor an interaction between condition and site, F(8,256) = 1.68, p = .104, ηp² = .05. Taken together, the FCz and 5-site cluster analyses demonstrated that saccade initiation led to significantly reduced mean visual N1 amplitude compared with external initiation. As predicted, button-press initiation led to a smaller, but still significant, reduction in mean visual N1 amplitude, representing an intermediary point between the saccade and external conditions.

Figure 5.4 Grand-averaged ERPs at electrodes Fz, FC1, FCz, FC2, and Cz of the press, saccade, and external conditions. Self-initiated conditions (i.e., press and saccade) are motor-corrected. Voltage is on the y axes and time in milliseconds is on the x axes, where flash onset is at 0 ms. Gray areas indicate measurement windows.


Figure 5.5 Topographic maps of the visual N1 (150 to 160 ms) in each condition. Self-initiated conditions (i.e., press and saccade) are motor-corrected.

5.6 General discussion

The study reported in this chapter used a saccadic paradigm to investigate the effect of self-initiation on subsequent visual response for the first time, extending similar work conducted in the auditory domain (Chapter 4; Mifsud, Beesley, et al., 2016). Though a corpus of research has focused on saccadic production and control around the time of saccadic onset (e.g., Reingold & Stampe, 2002) and physiological effects during reading (Ibbotson & Krekelberg, 2011; Yang & McConkie, 2001), the present study focused on the sensory consequences cued by the eye movement, rather than the movement itself. The key novel finding, demonstrated in Experiment 2, was that saccade-initiated flashes evoked attenuated visual N1 amplitude compared with both button press- and externally initiated flashes. Importantly, as Experiment 1 showed that visual N1 amplitude was luminance-dependent, and given the clear similarities between both the procedures and results presented here and those in the auditory domain (e.g., Baess et al., 2008; Sowman et al., 2012; Whitford et al., 2011), we suggest that visual N1 amplitude may provide an analogous measure of visual sensory attenuation and provide a starting reference for future inquiry. With respect to the finding that button-press initiation evoked attenuated visual N1 response at frontocentral sites, previous studies that measured the anterior N1 component have reported similar findings (indeed, they formed part of the rationale for analysis parameters in the present study). Schafer and Marcus (1973) and Gentsch and Schütz-Bosbach (2011) both reported reduction of the visual N1 at electrode Cz under similar circumstances, especially Schafer and Marcus (1973), who used a flash stimulus—albeit briefer, as a flash of 10-µsec duration. Importantly,

by including a button-press condition in the current design, we can make a comparison of key interest: the relative degree of visual N1 attenuation observed across different types of initiating motor output. Given that the N1 attenuation associated with eye movements was greater than the N1 attenuation associated with finger movements (i.e., button-press initiation), even when controlling for between-condition differences in motor-evoked potentials, it is conceivable that the observed effects relate to the strength of the habitual association between the type of motor action (eye or hand movement) and the resultant perceptual sensations (flashes). That is, the strength of pre-existing associations gained throughout a lifetime of experience appears to positively correlate with the degree of neurophysiological attenuation. Eye movements are very strongly associated with changes in visual sensation, whereas hand movements are only weakly related to changes in visual sensation. This hypothesis accords with the pattern of effects found in Chapter 4 (Mifsud, Beesley, et al., 2016), where saccade-initiated tones produced less auditory N1 attenuation than button press-initiated tones, perhaps because eye movements are less likely than hand movements to be associated with auditory feedback. Interestingly, Gentsch and Schütz-Bosbach (2011) noted that the quantitative effect of button-press initiation on visual N1 amplitude was less than effects reported in the literature measuring the auditory N1 (e.g., Heinks-Maldonado et al., 2005). However, the results in the present study, taken together with those in Chapter 4, suggest a rough correspondence in the size of ERP attenuation effects for the initiating actions that have a stronger causal likelihood of being paired with the sensory modality under investigation (i.e., saccade–visual and press–auditory). In terms of the possible drivers of sensory attenuation, the popular forward model account (in which predicted sensory consequences of an action, in the form of physical corollary signals, modulate actual sensory feedback) seems likely to be most applicable where there are direct neural connections between the relevant areas of the brain. Such sensorimotor connections conceivably exist between the motor area of the brain involved in eye movements and the visual cortex (e.g., the frontal eye field in prefrontal cortex, see Schall, 2002), whereas it may be less likely that such established connections exist between the visual cortex and the parts of the motor cortex involved in hand movements.


After demonstrating the luminance-dependence of the visual N1, we employed a new saccade initiation paradigm to investigate the sensory attenuation of visual stimuli that were initiated by either a saccade or button press, introducing a novel procedure to complement the large body of evidence showing sensory attenuation for auditory stimuli that are initiated by button press. We observed attenuation of visual N1 amplitude following saccade initiation, over and above that which occurred following button press-initiation, and suggested that this may relate to the strength of the association between eye movements and visual events. The experiments in the study reported in this chapter represent a valuable contribution toward understanding the underlying mechanisms that produce the ubiquitous sensory attenuation phenomenon.


6 Blow initiation of auditory stimuli

Published as:

Mifsud, N. G. & Whitford, T. J. (2017). Sensory attenuation of self-initiated sounds maps onto habitual associations between motor action and sound. Neuropsychologia, 103, 38–43. https://doi.org/10.1016/j.neuropsychologia.2017.07.019

Author contributions:

Conceptualization: NGM TJW. Methodology: NGM TJW. Software: NGM. Investigation: NGM. Data curation: NGM. Formal analysis: NGM. Visualization: NGM. Writing – original draft: NGM. Writing – review & editing: NGM TJW. Funding acquisition: TJW. Supervision: TJW.

Context in thesis:

The following study is the culmination of the thesis, returning to auditory tone stimuli while introducing a further new motor action, a soundless blow. Notably, testing whether a mouth-initiated artificial stimulus evokes AEP reduction interrogates the assumption of motor-to-sensory mapping generalizability that underlies forward prediction accounts of button-press initiation paradigms, and therefore is essential to accepting the evidence for dysfunctional sensory attenuation as an explanation of characteristic psychotic symptoms. We also included a saccade condition for within-subject comparisons with the blow and button-press conditions to provide a spectrum of causal likelihood in terms of action–sensation pairings.


6.1 Introduction

A central aspect of everyday experience is that sensations caused by our own actions are easily distinguishable from those with an external origin. This fundamental distinction is believed to play a central role in establishing our sense of agency (Engbert et al., 2008). It is accounted for by a motor control framework in which the central nervous system uses a forward model to anticipate the sensory effects of movement based on internal motor commands (Wolpert et al., 1995). Modulatory signals, referred to as corollary discharges (Sperry, 1950) or efference copies (von Holst, 1954), contain the predictions and attenuate incoming sensory input accordingly, such that residual sensory activity represents deviation from what was expected on the basis of the outgoing motor command. Sensory attenuation due to forward prediction may explain why we cannot tickle ourselves (Blakemore, Wolpert, & Frith, 1998). The phenomenon also comports with the inhibition of auditory neurons during vocalization to prevent self-induced desensitization, widely observed in crickets (Poulet & Hedwig, 2002), bats (Suga & Shimozawa, 1974), primates (Eliades & Wang, 2003; Müller-Preuss & Ploog, 1981), and humans (Curio et al., 2000; Ford et al., 2001; Houde et al., 2002; Paus, Perry, Zatorre, Worsley, & Evans, 1996). The ERP literature on sensory attenuation has dramatically expanded over the past two decades. Ford et al. (2010) detailed a typical paradigm involving vocal production, where participants first utter a series of basic syllables (self-initiated condition) and have this recording played back to them (externally initiated condition). Healthy individuals have consistently been shown to exhibit attenuated response of the N1/M1 for self-initiated vocalizations (e.g., Behroozmand et al., 2009; Curio et al., 2000; Ford, Gray, et al., 2007; Houde et al., 2002). As the N1 (i.e., the negative peak in the EEG that appears around 100 ms after transient onset of an auditory stimulus) is sensitive to volume and known to originate from the auditory cortex (Näätänen & Picton, 1987), this pattern of results suggests that the auditory cortex is adjusting its sensitivity and lowering its response to self-initiated vocalizations. There are, however, significant limitations to these willed vocalization studies. Most problematic is that there are physical differences in the auditory signal

between conditions. For instance, self-initiated vocalizations will always have the confound of bone conduction (Stenfelt & Goode, 2005). Even if an online recording is played simultaneously through headphones during vocalization, the eardrum will receive a combination of air-conducted sound and bone-conducted sound, such that the heard sound in the self-initiated condition will be qualitatively different from the heard sound in the externally initiated condition. Moreover, speech production may lead to contraction of the middle ear stapedius muscle, which alters transmission of the sound signal (Horváth & Burgyán, 2013). Such confounds make it difficult to determine to what degree differences in the ERP amplitudes of each condition are associated with forward prediction as opposed to between-condition differences in the auditory signal. These issues are mitigated in button-press paradigms, in which participants press a button to initiate artificial stimuli such as tones, avoiding both bone conduction and muscle contraction (as the mouth is not used in stimulus production). In line with the findings of vocalization paradigms, several research groups have reported auditory N1 suppression to button press-initiated tones (e.g., Baess et al., 2008; Martikainen et al., 2005; Mifsud, Beesley, et al., 2016; Schafer & Marcus, 1973; Sowman et al., 2012), as well as abnormal N1 suppression of tones in patients with schizophrenia (Ford et al., 2013; Whitford et al., 2011). Nevertheless, the sensory attenuation studies that used tones as auditory stimuli have almost always used button presses as the initiating motor action (typically via fingers, but see van Elk et al., 2014), and to our knowledge, no studies have used the muscle effectors involved in speech as the initiating motor action. It is possible that differences in the habitual associations between motor actions and corresponding sensory feedback may have an influence on the degree of sensory attenuation. That is, willed mouth movements are very often associated with sounds (e.g., you can hear yourself speak and chew), whereas willed finger movements are only sometimes associated with sounds (e.g., the tap of keys while typing on a keyboard), and quite often are not (e.g., wiggling one’s fingers in the air). Furthermore, the degree of N1 suppression to speech recordings initiated by a button press has been found to be less strong than the degree of N1 suppression when talking (Ford, Gray, et al., 2007).


The study reported in this chapter tested the notion that the degree of sensory attenuation, as indexed by auditory N1 amplitude, may map onto the strength of the habitual association between motor action and sensory response. We measured sensory attenuation to physically identical sounds (tones) evoked by three markedly different motor actions, namely mouth movements, finger movements, and eye movements. This study introduces a highly novel self-initiated condition in which participants trigger auditory tones with a soundless blow, which is a movement free of bone conduction in an effector region habitually associated with sounds. This represents the first investigation of an action–sensation contingency that uses a mouth action while comparing truly identical stimuli in the self- and externally initiated conditions. We also compared the blow condition to saccade and button-press conditions to provide a clearer account of quantitative differences in sensory attenuation based on type of eliciting motor action. We hypothesized that blow initiation would produce the largest degree of auditory N1 attenuation because mouth actions are strongly associated with auditory feedback. Conversely, we hypothesized that saccade initiation would result in the least N1 attenuation, given that eye movements never produce auditory feedback in the natural environment, and consistent with previous results (Chapter 4; Mifsud, Beesley, et al., 2016). Furthermore, we expected that button-press initiation would result in intermediary N1 response between that from blow and saccade initiation, given that the habitual association between finger movements and sounds lies in between those of mouth movements (strongest association) and eye movements (weakest association).

6.2 Method

6.2.1 Participants

Twenty-eight participants were recruited at UNSW Sydney, of whom 18 were female, 24 were right-handed, and mean age was 22 years (SD = 7). Participants provided written, informed consent and received either course credit (n = 16) or financial reimbursement (n = 12, A$30) in exchange for their time. This study was approved by the UNSW Human Research Ethics Advisory Panel (Psychology).


6.2.2 Procedure

Participants sat 60 cm in front of a computer monitor with integrated eye tracking system (Tobii TX300: 300 Hz gaze sampling rate; 23", 60 Hz, 1920 × 1080 resolution TFT screen; accuracy of 0.4° visual angle; system latency under 10 ms). Participants completed a demographics questionnaire before being fitted with an EEG cap and electrodes. After a 5-point eye tracking calibration procedure, participants had their EEG continuously recorded as they completed the experimental protocol, which was controlled by MATLAB (MathWorks, Natick, US) and lasted approximately 50 min. The experiment consisted of seven conditions: three self-initiation conditions (i.e., blow-, press- and saccade-initiated tones), three corresponding motor controls (i.e., motor actions without consequent tones) and an externally initiated condition (i.e., tones initiated without participant input). Each condition was presented as a separate block that consisted of 80 homogenous trials. The order in which blocks were presented was randomized between participants. Each block began with 4 practice trials to ensure that participants understood instructions and, in the self- initiated conditions, were self-pacing their responses. A uniformly distributed random interval (1.5–3 s) preceded the beginning of each trial.

Blow condition. In this novel condition, participants received a tone (30 ms duration, 500 Hz frequency, 70 dB sound pressure level) to their headphones (AKG K77 Perception) immediately following a blow into a microphone (Shure SM58). To assist within-subjects consistency of pace between conditions, participants were instructed to respond at will any time after a red fixation dot (0.7° diameter) appeared in the center of a black screen, which remained on screen for the duration of the tone. The blowing action, demonstrated by the experimenter for participants to mimic, consisted of a brief, soundless push of the air already in the mouth cavity, which produced a movement of air sufficient for detection by the microphone diaphragm. This soundless blow served to minimize muscle recruitment and eliminate the bone conduction that occurs during speech. To prevent system latency that would result from the master audio device switching between audio capture (detecting the blow) and playback (delivering the tone) in succession, a virtual slave audio device was initialized for each role, and both virtual devices ran in parallel.
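The capture-then-playback logic can be approximated in a few lines (a sketch using the Python sounddevice package rather than the actual MATLAB audio routing; the RMS threshold and polling loop are illustrative and do not reproduce the virtual-device arrangement or its latency characteristics):

import numpy as np
import sounddevice as sd

FS = 44100
TONE = 0.5 * np.sin(2 * np.pi * 500 * np.arange(int(0.030 * FS)) / FS)  # 30 ms, 500 Hz
blow_detected = False

def on_audio(indata, frames, time_info, status):
    # Flag a blow once microphone RMS exceeds an (illustrative) threshold.
    global blow_detected
    if np.sqrt(np.mean(indata ** 2)) > 0.05:
        blow_detected = True

with sd.InputStream(samplerate=FS, channels=1, callback=on_audio):
    while not blow_detected:
        sd.sleep(1)          # poll the flag every millisecond
    sd.play(TONE, FS)        # deliver the tone on the output device
    sd.wait()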


Press condition. In this self-initiated condition, participants responded at will any time after a fixation dot appeared (identical to the blow condition) by pressing the space bar on a low-latency keyboard (Ducky Shine 4: 1,000 Hz report rate) with their dominant hand, which immediately delivered a tone identical to that in the blow condition. This condition represented a replication of several previous studies (e.g., Mifsud, Oestreich, et al., 2016; Sowman et al., 2012).

Saccade condition. This condition is similar to the saccade initiation condition reported in Chapter 4 (Mifsud, Beesley, et al., 2016) and Chapter 5. This self-initiated condition began with two dots simultaneously appearing on screen: a solid red circle in the center of screen (identical to the fixation dot in previous conditions) and a hollow white circle of equal diameter (17° left or right of center, pseudo-randomly determined to ensure counterbalanced presentation). Participants were instructed to initially fixate on the white circle, which would turn solid to indicate that their gaze had been detected within a 20 ms sample of location recordings. If not detected after 5 seconds, trials were skipped with replacement (M = 1.5 skipped trials per participant across both stimulus and motor saccade-initiated blocks). Following gaze detection, participants were instructed to shift their gaze any time at will to the center dot, which delivered a tone identical to that in all the other stimulus conditions immediately upon detection of the gaze inside a 200-px (5°) square area of interest around the center dot.

Motor conditions. The motor control conditions were identical to their respective self-initiated conditions, except that the motor actions did not produce a tone. The ensuing EEG activity from these conditions was subsequently subtracted from the appropriate self-initiated conditions to remove EEG activity associated with each motor action (i.e., blowing, pressing, or making a saccade). This procedure is standard practice in button-press studies of this nature (Baess et al., 2008; Martikainen et al., 2005; Whitford et al., 2011) and has here been applied to mouth and eye actions of similar or even lesser muscular complexity than moving a finger or uttering syllables.

External condition. This externally initiated condition involved tones delivered automatically (i.e., without participant input) at unpredictable intervals. Trials began with a fixation dot, as in the other conditions, and following a uniformly distributed random interval (0.5–2 s), a tone identical to that in the self-initiated

conditions was delivered. Participants were instructed to keep their eyes open and maintain their gaze on the screen at all times.

EEG data acquisition. EEG was recorded with a BioSemi ActiveTwo system with 64 Ag-AgCl active electrodes placed per the extended 10–20 system, sampling at a rate of 2,048 Hz. During offline preprocessing, data were re-referenced from sites CMS and DRL to an average of the mastoid electrodes. Data were band-pass filtered from 0.01 to 30 Hz, separated into 600-ms epochs (100 ms prior to tone onset, 500 ms post-onset), and baseline corrected with the average voltage prior to onset. We mitigated eye blinks and movement artifacts by rejecting individual epochs that contained voltage activity (or min-max changes between adjacent 100-ms intervals) in excess of ±75 µV. At electrode Cz, this resulted in a mean rejection rate of 4.6% ± 4.4% (SD) of trials, with no significant differences between blow, press, or saccade initiation blocks, F(2,54) = 1.17, p = .319, ηp² = .04. We then averaged individual trials of each condition to produce the ERPs of each participant. Lastly, motor waveforms were subtracted from the appropriate self-initiated waveforms to produce difference waveforms, as is typical in button-press initiation studies (Baess et al., 2008; Martikainen et al., 2005; Whitford et al., 2011), and as has been done previously in saccade initiation studies (Chapter 4; Mifsud, Beesley, et al., 2016). To illustrate sensorimotor activity prior to the subtraction procedure, Figure 6.1 presents grand-averaged ERPs at electrode Cz and its neighboring electrodes (i.e., FCz, C1, C2, and CPz) of the uncorrected self-initiated waveforms (solid lines) and their corresponding motor waveforms (dotted lines). Hereafter, mentions of the self-initiated waveforms refer to the motor-corrected waveforms. Data preprocessing was done in BrainVision Analyzer 2 (Brain Products GmbH, Munich, Germany), and statistical analyses were performed in SPSS version 23 (IBM Corp., Armonk, US).
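The rejection criterion can be sketched as follows (a NumPy approximation of the thresholds described above; the exact step-function implementation in BrainVision Analyzer is not reproduced here, and the pairing of adjacent windows is an assumption):

import numpy as np

FS = 2048
SAMPLES_PER_100MS = FS // 10

def reject_epoch(epoch_uv, limit=75.0):
    # Reject if any sample exceeds +/-75 uV, or if the min-max range across any
    # pair of adjacent 100-ms intervals exceeds the same limit.
    if np.any(np.abs(epoch_uv) > limit):
        return True
    n = (len(epoch_uv) // SAMPLES_PER_100MS) * SAMPLES_PER_100MS
    segments = epoch_uv[:n].reshape(-1, SAMPLES_PER_100MS)
    for a, b in zip(segments[:-1], segments[1:]):
        window = np.concatenate([a, b])
        if window.max() - window.min() > limit:
            return True
    return False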


Figure 6.1 Grand-averaged ERPs of the uncorrected self-initiated conditions (solid traces) and their corresponding motor conditions (dotted traces) at electrodes FCz, C1, Cz, C2, and CPz. Voltage is on the y axes and time in milliseconds is on the x axes, where tone onset is at 0 ms.


6.3 Results

Individual ERP component amplitudes were calculated based on the mean of the voltage in intervals centered on the group-average peak latencies of each condition (10-ms intervals for the N1, 20-ms intervals for the P2). Our electrode of interest was Cz, which typically has large N1 amplitude after bilateral auditory stimulation (Ford et al., 2007), and was the electrode at which N1 was maximal in this study. To investigate the effect of condition on the components, we conducted separate one-way ANOVAs on the mean amplitudes of the N1 and P2 at electrode Cz of each participant. In cases where the assumption of sphericity was violated, a Greenhouse-Geisser correction was used. Figure 6.2 presents the grand-averaged ERPs of all conditions at electrode Cz, and Figure 6.3 presents component scalp distributions.
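A sketch of this amplitude measure is given below (illustrative sampling parameters and search range; polarity = -1 selects a negative-going peak such as the N1, polarity = 1 a positive-going peak such as the P2):

import numpy as np

FS, BASELINE_MS = 2048, 100   # illustrative sampling rate and baseline length

def ms_to_sample(ms):
    return int(round((BASELINE_MS + ms) / 1000 * FS))

def component_amplitude(erps, search_ms=(70, 130), window_ms=10, polarity=-1):
    # erps: participants x timepoints. Find the group-average peak latency within
    # the search range, then return each participant's mean voltage in a window
    # (10 ms for N1, 20 ms for P2) centered on that latency.
    i0, i1 = ms_to_sample(search_ms[0]), ms_to_sample(search_ms[1])
    grand = erps.mean(axis=0)
    peak = i0 + np.argmax(polarity * grand[i0:i1])
    half = int(round(window_ms / 2 / 1000 * FS))
    return erps[:, peak - half:peak + half + 1].mean(axis=1)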

Figure 6.2 Grand-averaged ERPs of the blow, press, saccade, and external conditions at electrode Cz. Self-initiated conditions (i.e., blow, press, and saccade) are motor-corrected. Voltage is on the y axis and time in milliseconds is on the x axis, where tone onset is at 0 ms.


Figure 6.3 Topographic maps of the auditory N1 and P2 components for each condition. Self-initiated conditions (i.e., blow, press, and saccade) are motor-corrected.

For the N1 component, there were effectively two pairs of conditions: blow (M = -8.42, SD = 4.50) and press (M = -8.67, SD = 3.92) both had reduced N1 amplitude compared with both saccade (M = -11.01, SD = 5.64) and external (M = -10.74, SD = 3.86). This was supported in the ANOVA by a main effect of condition,

F(3,81) = 3.80, p = .025, ηp² = .12, indicating that mean N1 amplitude at electrode Cz differed between the conditions. Follow-up pairwise comparisons indicated that blow significantly differed from external, F(1,27) = 7.10, p = .013, as did press, F(1,27) = 10.16, p = .004. However, blow did not differ from press, F(1,27) = 0.05, p = .825. Meanwhile, saccade did not differ from external, F(1,27) = 0.09, p = .766, but it did significantly differ from press, F(1,27) = 6.46, p = .017, with a trend toward a difference from blow, F(1,27) = 3.92, p = .058. There was a wider spread of means for the P2 component. A main effect of condition, F(3,81) = 9.94, p < .001, ηp² = .27, indicated that mean P2 amplitude at electrode Cz differed between blow (M = 5.30, SD = 8.26), press (M = 7.20, SD = 5.25), saccade (M = 11.38, SD = 6.35), and external (M = 12.55, SD = 5.12). Follow-up pairwise comparisons indicated that blow significantly differed from saccade, F(1,27) = 10.21, p = .004, and external, F(1,27) = 20.22, p < .001, though not from press, F(1,27) = 0.89, p = .354. Similarly, press significantly differed from saccade, F(1,27) = 12.37, p = .002, and external, F(1,27) = 24.88, p < .001. However, saccade did not differ from external, F(1,27) = 1.03, p = .319.


A 5-site cluster analysis spatially centered around electrode Cz (i.e., FCz, C1, Cz, C2, and CPz) produced identical outcomes, in terms of significant pairwise comparisons, to those based on Cz alone. For the N1 component, a main effect of

condition, F(3,81) = 3.34, p = .037, ηp² = .11, indicated that mean N1 amplitude across the 5-site cluster differed between the blow (M = -7.85, SD = 4.32), press (M = -8.21, SD = 3.90), saccade (M = -10.17, SD = 5.26), and external (M = -9.90, SD = 3.79) conditions. Follow-up pairwise comparisons indicated that blow significantly differed from external, F(1,27) = 5.08, p = .020, as did press, F(1,27) = 4.69, p = .010. However, blow did not differ from press, F(1,27) = 0.12, p = .731. Meanwhile, saccade did not differ from external, F(1,27) = 0.09, p = .748, but it did significantly differ from press, F(1,27) = 4.50, p = .029, with a trend toward a difference from blow, F(1,27) = 4.53, p = .061. For the P2 component, a main effect of condition, F(3,81) = 8.92, p = .001, ηp² = .25, indicated that mean P2 amplitude across the 5-site cluster differed between the blow (M = 4.67, SD = 7.53), press (M = 6.49, SD = 4.80), saccade (M = 10.15, SD = 6.22), and external (M = 11.06, SD = 4.86) conditions. Follow-up pairwise comparisons indicated that blow significantly differed from saccade, F(1,27) = 16.68, p = .005, and external, F(1,27) = 27.09, p < .001, though not from press, F(1,27) = 1.77, p = .339. Similarly, press significantly differed from saccade, F(1,27) = 12.98, p = .001, and external, F(1,27) = 21.62, p < .001. However, saccade did not differ from external, F(1,27) = 0.74, p = .421.

Lastly, we addressed the possibility of systematic differences between sensory-motor activity (i.e., the uncorrected blow, saccade, and press conditions) and the corresponding motor controls in the pre-onset interval, given that action preparation-related activity has been shown to be strong and to originate from both prefrontal and premotor cortices (e.g., Berchicci, Spinelli, & Russo, 2016; Sulpizio et al., 2017). We averaged activity in epochs that began 1,100 ms prior to the onset triggers, and baseline corrected these with mean voltage during the initial 200 ms (i.e., -1,100 ms to -900 ms), as was done in Perri, Berchicci, Lucci, Spinelli, and Di Russo (2015). We then split the post-baseline activity at Cz into a series of 150-ms windows of analysis, and compared sensory-motor and motor activity by using paired samples t tests. This produced 18 paired samples (3 actions [blow, saccade, press] × 6 time windows [-900 to -750 ms, -750 to -600 ms, -600 to -450 ms, -450 to -300 ms, -300 to -150 ms, and -150 to 0 ms]). For all paired samples, no significant differences in pre-onset activity were found between the sensory-motor (-2.66 ≤ M ≤ 0.40, 1.09 ≤ SD ≤ 8.33) conditions and their corresponding motor (-2.13 ≤ M ≤ -0.27, 1.35 ≤ SD ≤ 7.51) controls, ts ≤ 1.14 and ps ≥ .26. This suggests that activity associated with action preparation was not a confounding factor in our motor subtraction procedure.
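This pre-onset check can be sketched as follows (assuming baseline-corrected participants × timepoints arrays for an uncorrected sensory-motor condition and its motor control, with an illustrative sampling rate; scipy's ttest_rel performs the paired comparison):

import numpy as np
from scipy.stats import ttest_rel

FS, EPOCH_START_MS = 2048, -1100
WINDOWS_MS = [(-900 + 150 * k, -750 + 150 * k) for k in range(6)]

def to_sample(ms):
    return int(round((ms - EPOCH_START_MS) / 1000 * FS))

def compare_preonset(sensorimotor, motor):
    # Paired t tests on mean activity in each 150-ms pre-onset window.
    results = []
    for start, end in WINDOWS_MS:
        i0, i1 = to_sample(start), to_sample(end)
        t, p = ttest_rel(sensorimotor[:, i0:i1].mean(axis=1),
                         motor[:, i0:i1].mean(axis=1))
        results.append((start, end, t, p))
    return results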

6.4 Discussion

We set out to investigate the level of sensory attenuation associated with self-initiation by blow, press, and saccade motor actions, comparing the auditory N1 and P2 amplitudes evoked by self-initiated tone stimuli with those evoked by physically identical externally initiated stimuli. We introduced a soundless blow to trigger stimuli to avoid the bone conduction and muscle contraction confounds associated with previous vocalization studies. Hence, an important contribution of the study reported in this chapter is that it represented the first sensory attenuation paradigm to use mouth region effectors while holding the stimulus physically constant across both self- and externally initiated conditions. Further, we investigated a range of motor actions that varied in terms of their habitual association with auditory sensations. This included a saccade condition similar to that introduced in Chapter 4 (Mifsud, Beesley, et al., 2016), as well as a press condition typical of the existing literature (e.g., Baess et al., 2008; Martikainen et al., 2005; Mifsud, Oestreich, et al., 2016; Schafer & Marcus, 1973; Whitford et al., 2011). Crucially, blow-initiated tones were associated with significant auditory ERP attenuation, eliciting reduced N1 and P2 responses compared with unpredictable externally initiated tones. This is the first study to show that a motor action using effectors in the mouth region produces significant attenuation of auditory sensations even when holding the stimulus truly constant. Importantly, this finding supports the notion that the effects reported in existing vocalization paradigms (e.g., Behroozmand et al., 2009; Curio et al., 2000; Ford, Gray, et al., 2007; Houde et al., 2002) are not simply attributable to confounds related to speech movements or physical differences between the self-produced and replayed vocalization stimuli, but may instead be related to forward prediction.


Press-initiated tones were associated with significant N1 and P2 attenuation in the same pattern as blow initiation, which concords with the large body of literature showing ERP attenuation to button press-initiated tones (Baess et al., 2008; Martikainen et al., 2005; Oestreich et al., 2016; Schafer & Marcus, 1973; Sowman et al., 2012). In addition, both blow and press initiation had significantly reduced N1 and P2 responses compared with saccade initiation, which aligns with a previous finding that press-initiated tones resulted in reduced auditory N1 and P2 responses compared with saccade-initiated tones (Chapter 4; Mifsud, Beesley, et al., 2016). The pattern of results is consistent with the notion that the degree of auditory ERP attenuation reflects the degree of habitual associations between an eliciting motor action and the sound. Unlike eye movements, which are typically not associated with auditory feedback, mouth actions are habitually concomitant with sounds in everyday life (speaking, chewing, etc.), whereas finger movements also often initiate auditory feedback (e.g., tapping the keyboard). The results of the study reported in this chapter suggest that motor actions that typically produce sounds in everyday life are associated with a higher degree of sensory attenuation than motor actions that are rarely associated with sounds in everyday life. It is possible that the underlying neural architecture associated with such habitual associations facilitates forward prediction. The strong causal link between mouth actions and auditory consequences is reflected in a corresponding candidate fibre pathway between the auditory cortex and vocal production area of the motor cortex (Dick & Tremblay, 2012). While no such auditory pathway has been identified for sounds resulting from button presses, candidate fasciculi between the auditory and motor cortices have been proposed (e.g., Catani et al., 2013). Although there were not significant differences in ERP attenuation between the blow and press conditions, Figure 6.2 illustrates that blow initiation produced the largest amplitude reductions, particularly for the P2 component. This was expected due to the tight coupling of mouth actions with sounds, but the lack of significant difference may be because a soundless blow is unlikely to result in specific predictions of an artificial tone, and that the generalized expectation of auditory feedback is indistinguishable from the same prediction following finger movements. There is some evidence to suggest that the contribution of forward prediction to sensory attenuation depends on the specificity of the predictions being made. For

example, N1 attenuation has been shown to be reduced by pitch-shifting the sensory feedback received by participants during vocalization (e.g., Behroozmand & Larson, 2011; Heinks-Maldonado et al., 2005). Given this possibility, a future iteration of the study reported in this chapter could use auditory stimuli more akin to a blowing sound, rather than pure tones. The level of sensory attenuation to button press-initiated stimuli would presumably remain the same (i.e., neither a tone nor a natural-sounding blow is specifically predicted by button presses), but blow initiation may possibly produce significantly more sensory attenuation for blow-like sounds. In any case, it seems prudent to explore more complex, ecologically valid stimuli in sensory attenuation paradigms, as Hughes and Waszak (2014) started in the visual domain by investigating face and house stimuli.

An improved understanding of the mechanisms that underlie sensory attenuation may hold implications for the sense of agency, and by extension, clinical disorders that involve an abnormal sense of agency. Disrupted sensory attenuation may be involved in the etiology of psychotic symptoms that are characterized by an abnormal sense of agency, including auditory hallucinations and delusions of control (Feinberg, 1978; Frith, 1987). Indeed, patients with schizophrenia have been found to have similar auditory N1 response for both self- and externally initiated speech (Ford, Gray, et al., 2007; Ford et al., 2001; Heinks-Maldonado et al., 2007), suggesting that they do not make a distinction between self- and externally generated actions, which may underlie the feelings of abnormal agency. If, in a replication of the blow initiation paradigm introduced here, patients with schizophrenia exhibited equal N1 response across all conditions, this would suggest that an underlying reason for their deficits in sensory attenuation is an abnormality in their ability to form habitual associations. Hence, a worthwhile direction of future research is to conduct sensory attenuation studies that use a wider range of eliciting motor actions in a clinical population.

The study reported in this chapter demonstrated that neurophysiological attenuation of N1 and P2 responses to blow-initiated tones occurred, representing the first study to investigate mouth-initiated auditory stimuli without stimulus confounds such as bone conduction. This provides a strong case that key effects found with existing vocalization paradigms are not due to stimulus or motor action confounds, and that forward prediction is involved in auditory N1 attenuation of

self-initiated stimuli. More broadly, we also found that saccade initiation did not produce N1 or P2 attenuation, and that button-press initiation produced reductions in N1 and P2 responses comparable to blow initiation. These results suggest that motor actions that are habitually associated with auditory feedback are those associated with the highest degree of sensory attenuation in the EEG. This pattern of effects points toward a broader predictive framework that incorporates learned associations rather than relying on cancellation through physical predictive signals.


7 General discussion

The overarching aim of this thesis was to elucidate how sensory attenuation may be modulated by the strength of the existing causal relationship between an action and its sensory effect. This aim was achieved with a programmatic series of ERP studies that measured the electrophysiological response of healthy participants who were exposed to a range of action–sensation pairings. In this chapter, I summarize the main findings of the experimental program (§7.1) and discuss implications for our understanding of sensory attenuation (§7.2), provide directions for future research (§7.3) and conclude the thesis (§7.4).

7.1 Summary

Chapter 3 investigated electrophysiological sensory attenuation in the auditory and visual domains using a within-subjects design. This was the first study to investigate sensory attenuation in both sensory modalities concurrently. It was important to do so because previously observed attenuation of ERPs evoked by self-initiated auditory stimuli had been explained with reference to forward prediction, a mechanism that is not confined to the auditory modality and thus should result in ERP attenuation in other sensory modalities. We replicated the results of past studies by showing that button press-initiated tones evoked significant attenuation of auditory N1 and P2 responses. However, we found that button press-initiated pattern reversals resulted in amplified visual N145 response, and that there was not a clear quantitative association between the level of ERP modulation in each sensory modality. In addition, temporally predictable externally initiated stimuli resulted in attenuated auditory P2 (but not N1) response. Together, these results suggested differences across sensory modalities in the underlying generators of ERP modulation of self-initiated stimuli, such that pattern-reversal VEPs may be dominated by factors orthogonal to forward prediction, and suggested that temporal predictability had a minor role in tone AEPs.

Chapter 4 examined whether electrophysiological sensory attenuation of auditory stimuli could occur when initiated by a habitually unrelated motor action, specifically saccades, which are not routinely associated with sounds in everyday

life. We showed that button press-initiated tones evoked attenuated auditory N1 and P2 responses compared with both saccade and external initiation. In the saccade-initiated condition, we found an intermediary level of N1 attenuation; that is, a significant reduction compared with the externally initiated condition, but significantly less reduction than in the button press-initiated condition. In contrast, there was no difference between the saccade- and externally initiated conditions in terms of their auditory P2 response. This indicated that the motor-to-sensory mapping involved in speech production may be partly generalized to other action–sensation contingencies, and that learned associations may also contribute to the attenuation of auditory N1 response.

Chapter 5 extended the saccade initiation paradigm to the visual modality. We first showed that the flash-evoked visual N1 was luminance-dependent, and second, we showed that saccade initiation resulted in greater visual N1 attenuation than button-press initiation. This finding was the converse of Chapter 4, whereby saccade initiation of auditory stimuli produced weaker attenuation than button press-initiation. In this way, both studies supported the notion that the degree of electrophysiological sensory attenuation, in both the auditory and visual sensory modalities, is related to the strength of the causal likelihood that a motor action will be paired with a particular sensory effect. It is worth noting that the flash-evoked findings diverged from the pattern reversal-evoked findings of Chapter 3, and possible reasons will be suggested in §7.2.2.

Chapter 6 introduced the blow initiation paradigm. We showed for the first time that a motor action using effectors in the mouth region evoked significant attenuation of auditory N1 response even when holding the accompanying stimulus truly constant. Crucially, this finding corroborated vocal production studies, including those showing that patients with schizophrenia exhibit atypical sensory attenuation. That is, the results of previous studies that found differences between self-initiated and replayed vocalization stimuli may have been confounded by speech movements or physical differences in the auditory stimuli; our demonstration of auditory N1 attenuation following motor output from the mouth region, with physically identical stimuli, strongly suggests that the ERP differences could instead be due to forward prediction. In addition, both blow- and press-initiated tones had significantly reduced auditory N1 and P2 responses compared with saccade-initiated tones, a finding that extended the pattern

seen in Chapters 4 and 5, indicating that electrophysiological sensory attenuation is related to the existing strength of habitual associations between willed motor actions and their resulting sensations.

7.2 Implications

The studies reported in this thesis collectively advance our understanding as to the conditions under which electrophysiological sensory attenuation occurs. I will now consider the broader implications of this knowledge, with a focus on the degree of involvement of the causal likelihood of action–sensation pairings in the magnitude of ERP attenuation. I will also consider whether sensory attenuation can be reliably measured in non-auditory sensory modalities.

7.2.1 Habitual associations between actions and sensations

The main implication of this thesis is that the strength of electrophysiological sensory attenuation in healthy participants may relate to the habitual associations that already exist between willed motor actions and their sensory effects. This relationship would explain why button pressing evoked greater attenuation of auditory consequences than saccades (Chapter 4), saccades evoked greater attenuation of visual consequences than button pressing (Chapter 5), and a blow evoked greater attenuation of auditory consequences than both button pressing and saccades (Chapter 6). Notably, this interpretation suggests that the auditory N1 attenuation observed in the vocal production and button-press initiation paradigms has the same underlying cause. Specifically, apart from any differences attributable to physical properties of the stimuli (e.g., volume) and methodology (e.g., ISI refractoriness), the size of the self-initiation effect in both paradigms appears to be mediated by the forward model account introduced in §2.1.2. Consequently, button-press initiation studies, and perhaps other designs involving tone stimuli (such as saccade and blow initiation), offer a potentially useful platform for testing clinical populations that involve a putative breakdown in forward modeling. Hence, these new designs may be able to complement and extend the clinical application of the vocal production paradigm by

research groups such as Curio et al. (2000), Heinks-Maldonado et al. (2005), and Behroozmand et al. (2009). There is a reasonable basis for supposing the existence of latent associations between actions and their habitual sensory effects; that is, an “expectation” that our actions result in sensory events could be either innate or built up by long-term training over a lifetime (Horváth, Maess, Baess, & Tóth, 2012). This scenario seems unavoidable for the repetitive, non-arbitrary coupling of speech sounds with the motor actions involved in speech production, and it seems possible that our lifetime experience could also result in a contingency between arbitrary pairs of actions and sensations, such as button pressing and tones. Specifically, due to the varied pairings that take place over a lifetime, a general association between a class of motor action and the modality of its sensory effect may arise; for example, one may come to expect that tactile interaction with a physical surface will be accompanied by auditory sensations (such as when tapping one’s fingers on a desk). Hence, the forward model may have a range of pre-existing biases about the frequency of occurrence and modality of sensations initiated by various classes of motor action, perhaps organized by effector region (e.g., hands, eyes, etc.). Furthermore, consider that the forward model makes predictions about the expected physical properties of self-initiated sensations. Hence, it is conceivable that exact matches between expected and actual physical characteristics would produce the greatest sensory attenuation, and as predictions become more inaccurate (e.g., as the actual physical outcome of a motor action deviates from the expected outcome), a corresponding decrease in sensory attenuation would be observed. In support of this notion, there is evidence to suggest that the degree of auditory N1 attenuation relates to the specificity of the predictions being made (e.g., Behroozmand & Larson, 2011; Heinks-Maldonado et al., 2005). For example, Behroozmand and Larson (2011) tested whether pitch was an acoustical dimension that could account for the identification and suppression of our own voice. Using a vocal production paradigm in which both active and passive stimuli were pitch-shifted to various degrees, they showed that auditory N1 attenuation was largest when participants received unaltered voice feedback during vocalization, and that auditory N1 attenuation decreased as the pitch shifted upward; indeed, 400-cent shifted stimuli were not attenuated at all.


This distribution of attenuation strength as a function of the accuracy of stimulus predictions aligns with an increasing tendency in cognitive science to characterize brain function as fundamentally predictive in nature (e.g., A. Clark, 2013; Friston, 2010). In such “predictive coding” hypotheses, predictability is a feature used to aid processing of the constant influx of sensory stimuli (Fletcher & Frith, 2009). For instance, reafference is attenuated because it would be redundant, and therefore an inefficient use of limited sensory processing resources. This demonstrates that our experience of something can be directly affected by our knowledge of it, that is, by its predictability. Conversely, our knowledge can be affected by our experience, in that the likelihood of an action–sensation association is continually updated with new occurrences of the stimulus. This reciprocal relationship between knowledge (beliefs) and experience (perception) was, of course, first captured by Bayes’ theorem (Bayes, 1763), and indeed, there is evidence to suggest that sensorimotor learning is essentially Bayesian in operation (Körding & Wolpert, 2004). At a neuroanatomical level, the structure of a predictive brain recapitulates the causal structure of its environment (see A. Clark, 2013). Accordingly, action–sensation pairings that are more predictable (e.g., due to their frequent occurrence) might be marked by stronger neural pathways to facilitate forward prediction. For example, as noted in §6.4, the strong relationship between mouth actions and auditory consequences is reflected in the existence of fibre pathways, such as the arcuate fasciculus, that connect the auditory cortex and vocal production areas in the premotor and motor cortices (see Dick & Tremblay, 2012). Corresponding fasciculi between the auditory and motor cortices also exist (e.g., the frontal aslant tract; Catani et al., 2013) that could facilitate the attenuation of sounds that result from motor actions such as button presses. There may also be a corresponding physical link between the brain areas that process visual sensation and control eye movements, though neuronal modulation based on the sensory consequences of a saccade would be difficult to disentangle from trans-saccadic visual processing, which is speculated to include a pathway that ascends from the superior colliculus (SC) to the frontal eye field (FEF) through the thalamus (Crapse & Sommer, 2008). In any case, a clear area of emerging interest is exploring how cortical projections, local circuit mechanisms (e.g., within the auditory cortex), and interneuron

subpopulations respond to self-initiated stimuli. This is important for clarificatory purposes and promising for our understanding of psychotic symptoms, given that neural circuit dysfunction in animal models of schizophrenia could be better elucidated with experimental paradigms based around neurophysiological responses to self-initiated stimuli (Rummell, Klee, & Sigurdsson, 2016; Sigurdsson, 2016).
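Returning to the Bayesian characterization invoked above (Körding & Wolpert, 2004): as a toy numerical illustration only (not an analysis from this thesis), under Gaussian assumptions the posterior estimate of a sensory variable is a precision-weighted average of prediction and observation, so a strong prior, such as that afforded by a highly practiced action–sensation pairing, pulls the estimate toward the prediction:

def posterior(prior_mean, prior_var, obs_mean, obs_var):
    # Precision-weighted combination of a predicted and an observed value.
    w = (1 / prior_var) / (1 / prior_var + 1 / obs_var)
    mean = w * prior_mean + (1 - w) * obs_mean
    var = 1 / (1 / prior_var + 1 / obs_var)
    return mean, var

print(posterior(0.0, 1.0, 10.0, 4.0))    # strong prior: estimate pulled toward the prediction
print(posterior(0.0, 100.0, 10.0, 4.0))  # weak prior: estimate dominated by the observation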

7.2.2 Non-auditory sensory attenuation

Chapters 3 and 5 presented different outcomes in terms of the attenuation of visual consequences of button presses: in Chapter 3, pattern-reversal N145 response was amplified in the self-initiated condition compared with the externally initiated condition, whereas in Chapter 5, flash-evoked visual N1 response was attenuated (and even more so for saccade-initiated flashes). It is worth speculating on the methodological basis of these discrepant results. In Chapter 3, our priority was the reliability of the sensory component we chose to employ, given that no prior within-subject comparisons of auditory and visual stimuli existed in the context of sensory attenuation. Hence, we opted for a pattern reversal owing to its robust VEPs, which show less intersubject and intrasubject variability than other forms of luminance-change stimulation, including flash and pattern-onset stimuli (Luck & Kappenman, 2012; Odom et al., 2010). In Chapter 5, we were interested in the possible attenuation of the sensory consequences cued by a saccade (rather than changes in visual processing that occur during a saccade), so we chose to use a sudden-onset full-field flash stimulus that occurred at the termination of the saccade. If we had used a pattern reversal (where a checkerboard pattern is shown for a significant duration, the luminance value of each of its pixels is simultaneously inverted, and the inverted pattern persists for an equal duration), eye movement over the pattern during the saccade would have confounded any sensory response to the reversal itself. The pertinent point about this difference in methodology is that the visual N145 and visual N1 are not the same ERP component. Hence, I will first consider the difference in the outcomes of Chapters 3 and 5 in terms of the possible underlying VEP sources, and then explore the implications for sensory attenuation given the phenomenological differences between pattern reversal and flash stimuli.


(Shigeto et al., 1998). This leads to the complicating possibility of non-stimulus-related factors modulating the visual N145. For example, attention has been shown to enhance both visual N1 and P1 response to pattern-onset stimuli (V. P. Clark & Hillyard, 1996), which may share sources with the N145 component of the pattern-reversal VEP (Shawkat & Kriss, 2000). It may be the case that attention effects simply exaggerated a genuine self-initiation-related enhancement of the visual N145 in Chapter 3. Alternatively, an attentional enhancement may have masked the occurrence of a neutral effect (i.e., in which the self-initiated condition produced a visual N145 that was indistinguishable from the externally initiated condition), which would also be notable given that auditory stimuli consistently evoke clear ERP attenuation. However, such outcomes are indistinguishable in the present design, so a clear interpretation regarding the self-initiation-related modulation of visual N145 is impossible. Conversely, the visual N1 component used in Chapter 5 is demonstrably an early, sensory-evoked component, partially generated from occipital cortex (V. P. Clark, Fan, & Hillyard, 1995). As a result, I am inclined to consider the design used in Chapter 5 a more suitable test of sensory attenuation in the visual domain, and thus infer that self-initiation is indeed associated with a reduction in the electrophysiological consequences of visual stimuli.

What does visual sensory attenuation look like, in the phenomenological, literal sense of “look like”? In the auditory domain, sensory attenuation as measured by reduced auditory N1 amplitude is thought to reflect a decrease in perceptual intensity; that is, self-initiated sounds are perceived as physically softer in volume. This argument is supported by findings that auditory N1 amplitude increases as stimulus volume increases (e.g., Hegerl et al., 1994; Picton et al., 1976; Rapin et al., 1966). The corresponding argument for vision is thus that self-initiated visual sensations will be perceived as lower in luminance, given that a stimulus of greater luminance produced a corresponding increase in VEP amplitude (Experiment 1, Chapter 5). There is also psychophysical evidence of this parallel between auditory and visual sensory attenuation. For example, Cardoso-Leite et al. (2010) used a design where left-key and right-key presses were differentially associated with certain patch tilts. They found that participants’ sensitivity to low-contrast Gabor patches was reduced when the visual stimuli were self-initiated with a learned button-press contingency. However, can a sensory modality as dominant

However, can a sensory modality as dominant and “present” as vision be modified in the unilateral fashion implicated by auditory sensory attenuation? That is, because sight is intrinsic to our experience of and interaction with the external world, dampening a property of the entire visual stream seems like a flawed adaptive mechanism. Instead, perhaps sensory attenuation could be applied on an object-specific basis, akin to a gain modulator in functional opposition to attention. A key question here is whether sensory attenuation discriminates between self-initiated sensations and simultaneous external stimulation. In the auditory domain, invasive studies have suggested that only certain sections of the auditory cortex have their cortical firing dampened during self-vocalization (Eliades & Wang, 2003; Greenlee et al., 2011), but it seems implausible that the topography of the auditory cortex is spatially defined in a manner akin to the retinotopic organization of the primary visual cortex. Rather, it might be that electrophysiological sensory attenuation in the auditory modality is made possible by discrete sounds (e.g., tones or utterances) being auditory “events” that can be foregrounded (and thus cancelled) relative to other, competing events in the sensory environment. Hence, it may be that a sudden flash is more of an event than a checkerboard reversal, as the latter necessarily provides textural stimulation before and after the reversal event—in other words, luminance remains constant throughout, so there is no luminance “event”. A flash, in contrast, is a brief burst of luminance, and so constitutes a discrete event that can be modulated. This might explain why Hughes and Waszak (2011) found attenuation to their static pattern stimuli, which have a more figural (object) status than a sudden flash and are thus discrete objects to be attenuated in the case of self-initiation. An analogous investigation of this phenomenological difference in the auditory domain would be a design in which participants listen to a constant tone and their button presses initiate a sudden shift in pitch (maintaining loudness) for a brief period (e.g., 30 ms). If this manipulation did not produce AEP attenuation, or even produced enhancement, the contrast with the discrete-tone outcomes in Chapters 3, 4, and 6 would be informative as to the properties of non-auditory stimuli that should be used in future studies of electrophysiological sensory attenuation.
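A hypothetical stimulus of this kind is straightforward to generate. The following minimal sketch (the frequencies, durations, and sample rate are illustrative assumptions of my own) constructs a constant tone whose pitch shifts briefly while its amplitude, and hence its loudness, is held constant.

```python
import numpy as np

def tone_with_pitch_shift(fs=44100, total_s=2.0, shift_onset_s=1.0,
                          shift_dur_s=0.03, f_base=440.0, f_shift=494.0):
    """Constant-amplitude tone with a brief (e.g., 30 ms) pitch shift."""
    t = np.arange(int(fs * total_s)) / fs
    freq = np.full(t.size, f_base)
    in_shift = (t >= shift_onset_s) & (t < shift_onset_s + shift_dur_s)
    freq[in_shift] = f_shift
    # Integrate the instantaneous frequency so phase is continuous at the
    # shift boundaries (avoids audible clicks that would confound the ERP).
    phase = 2 * np.pi * np.cumsum(freq) / fs
    return 0.5 * np.sin(phase)

stimulus = tone_with_pitch_shift()
print(stimulus.shape)  # waveform samples, ready for an audio presentation library
```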


7.3 Future research

The preceding discussion showed how this thesis contributed to our understanding of electrophysiological sensory attenuation, both in terms of its association with the causal likelihood of action–sensation pairings and its existence in non-auditory sensory modalities (i.e., vision). Many avenues of investigation and contending hypotheses remain to be explored. I will now suggest some future directions that offer promising opportunities to further clarify our understanding of the underlying causes of electrophysiological sensory attenuation.

7.3.1 Parametric investigation of causal likelihood

Given the findings in Chapters 4 and 6 that suggest that the degree of auditory N1 attenuation is related to the existing causal likelihood of the action–sensation pairing, it seems appropriate to conduct a parametric study that explores this possibility. However, a fine-grained parametric approach may not be tractable, because one’s lifetime of experience with any one pairing cannot be easily quantified in a continuous fashion. Instead, a categorical approach may be required. An example of such an approach is to recruit participants who are experts in a field that involves frequent, repetitive action–sensation pairings. For example, Repp and Knoblich (2007) conducted a psychophysical study in which the expert group comprised pianists. Their design involved participants pressing keyboard buttons that were mapped to pairs of tones, pre-selected by the experimenters, that were bistable in terms of the direction of pitch change between the two tones. The pianists, who all had at least 11 years of training, were found to differ significantly from nonpianist participants in their interpretation of the pitch direction. The authors argued that the pianists’ perception was biased by auditory predictions derived from a forward model, shaped by their extensive active experience of producing tones of specific pitches, experience that the nonpianists lacked. Nevertheless, there are some downsides to this kind of expert-based design. For example, because participants’ prior experience is the manipulated variable, we must necessarily perform between-subjects rather than within-subjects analyses. Moreover, few fields involve the specialization of stereotyped motor actions toward predictable sensory outcomes, so experiments would be confined to a rather limited set of action–sensation pairings.

A possible alternative class of self-initiated designs could involve the use of “natural” stimuli. That is, rather than manipulate the modality of sensory input (as in Chapters 4 and 5) or the motor output (as in Chapter 6) to modify the causal relationship between an action and its sensation, the specific physical properties of stimuli within the same sensory modality could be manipulated so that one stimulus is more “naturally” linked to the initiating motor output than the other. Here, I use “natural” to refer to action–sensation pairings in which the physical properties of the motor output constrain the physical properties of the stimuli, such as the sounds linked to vocalization or tapping a table, whereas a computer-modulated (button press-initiated) stimulus can take on any physical form according to the parameters of the intervening software. There is low-hanging fruit to be gathered using only those actions and stimuli already presented. For example, as far as I am aware, although numerous studies investigate tones and a minority examine sample “ah” sounds (Ford, Gray, et al., 2007; Whitford et al., 2011), no previous studies have examined both natural and computer-modulated stimuli in the same subjects to compare their respective levels of auditory N1 attenuation. A simple 2 (action) × 3 (sensory effect) within-subjects design that could potentially be useful would be to ask participants to either press a button or soundlessly exhale (i.e., the action factor) to initiate a tone, a sample “ah”, or no sound (i.e., the sensory effect factor), where the soundless condition acts as a motor control (a possible trial structure is sketched below). In line with the hypothesis that stronger sensory attenuation is associated with more natural pairings of actions and their sensory effects—or, put more broadly, with a stronger habitual link between the effector region and the modality of the reafference—I would expect blow–“ah” pairings to produce a greater degree of auditory N1 attenuation than blow–tone pairings. A similar hypothetical paradigm that avoids speech-related sounds would involve participants tapping a drum to deliver either a tone or a stereotypical drum sound through headphones. In this case, the drum sounds would be expected to be associated with a greater degree of AEP attenuation than the tones. There is also promise in testing a musician sample akin to Repp and Knoblich (2007), but in the context of an ERP button-press initiation paradigm. For example, participants’ key pressing on a musical keyboard could trigger either normal piano tones or pitch-shifted tones.
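As a concrete illustration of the trial structure referred to above, the following sketch builds a randomized trial list for the hypothetical 2 (action: button press vs. soundless exhalation) × 3 (sensory effect: tone, sample “ah”, silence) within-subjects design; the factor labels and trial counts are assumptions for illustration only.

```python
import itertools
import random

actions = ["button_press", "silent_exhale"]
effects = ["tone", "ah_sample", "silence"]   # "silence" serves as the motor control
trials_per_cell = 60                          # assumed count, not a recommendation

# Fully crossed design: every action is paired with every sensory effect
trial_list = [
    {"action": action, "effect": effect}
    for action, effect in itertools.product(actions, effects)
    for _ in range(trials_per_cell)
]

random.seed(1)
random.shuffle(trial_list)                    # randomize presentation order
print(len(trial_list), trial_list[0])         # 360 trials in total
```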

Presumably, pianists would have developed a stronger association with the normal tones over the course of their lifelong training. Hence, I hypothesize that they would exhibit the strongest degree of AEP attenuation in the normal condition, whereas the control group would exhibit negligible differences in their AEP response between the normal and pitch-shifted conditions. A simple, interesting addition to such an experiment would be to correlate the approximate hours of piano training undergone by the musicians with their absolute ERP amplitudes, to test whether the greatest degree of AEP attenuation is associated with the pianists who had undergone the most training (a sketch of this analysis appears at the end of this subsection). Finally, an important aspect of the blow initiation experiment (Chapter 6; Mifsud & Whitford, 2017) was to confirm that auditory N1 attenuation occurred for motor output to the mouth effector region even in the absence of bone conduction. It may be worth conducting a follow-up experiment to better assess the possible contribution that bone conduction of sound makes to the size of the auditory N1 attenuation effect. For example, participants could initiate a sample “ah” in two different ways: once with a normal “ah” vocalization, and once with a silent exhalation involving the same articulatory muscle movements. The findings of such an experiment may also provide some insight into whether an appreciable difference exists between two motor actions that are nearly identical in their physical manifestations but differ in whether there was an intent to vocalize a sound.
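Returning to the pianist proposal above, the suggested correlation is simple to compute. The following is a minimal sketch only; the training-hour and attenuation values are placeholders labelled as such, not data.

```python
import numpy as np
from scipy import stats

# Placeholder values for illustration only; one entry per (hypothetical) pianist
training_hours = np.array([2000.0, 5500.0, 8000.0, 12000.0, 15000.0])
n1_attenuation_uv = np.array([0.8, 1.1, 1.6, 2.0, 2.3])  # self minus external N1 (uV)

r, p = stats.pearsonr(training_hours, n1_attenuation_uv)
print(f"Pearson r = {r:.2f}, p = {p:.3f}")
```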

7.3.2 Interrelated approaches in sensory attenuation research

Despite the focus on forward prediction in the preceding discussion, it has also been acknowledged that it is unlikely that a single mechanism underlies all reported instances of electrophysiological sensory attenuation. There are almost certainly distinct and overlapping processes, such as differences in attentional and cognitive processing, that can account for the divergent outcomes apparent across different experimental designs (Waszak, Cardoso-Leite, & Hughes, 2012). I suggested earlier that sensory attenuation even in “artificial” setups (i.e., sensory effects that are arbitrarily connected to the initiating motor output) is partly driven by forward prediction, and speculated that a lifetime of experience with a certain action–sensation pairing will make those motor-to-sensory pathways stronger and facilitate efference copy-based corollary discharge. This contrasts with some alternative proposals, including the associative “pre-activation” account (Roussel et al., 2013; see §2.4.4). Intuitively, sensory attenuation of non-speech stimuli may also have more in common with learning paradigms than with forward-model prediction.

So where to from here? Several emerging experimental approaches promise to further improve our understanding of the underlying processes that mediate electrophysiological sensory attenuation. Foremost, it is important to popularize instrumental learning paradigms that have methodological overlap with common self-initiation ERP designs, so as to marry sensory attenuation with the traditional field of learning. For instance, Elijah, Le Pelley, and Whitford (2016) used button-press initiation with a large number of exposures to either immediate or delayed (by 100 ms) tones, and found that auditory N1 attenuation to the delayed but not the immediate tones decreased over the course of the experiment. This finding suggested that anticipated timing is one of the stimulus properties that may be amenable to behavioral training. More recently, Baum, Wolfensteller, and Ruge (2017) also used a training paradigm (albeit within a much more complex design) and attempted to isolate electrical brain responses to auditory stimuli that they argued captured the increasing (within-experiment) associative strength of the action–sensation contingencies. While it is still an open question whether and how neural adaptation relates to prediction (Schröger, Kotz, & SanMiguel, 2015), experiments such as the two just described are on the right track. Indeed, a training paradigm that specifically investigated action–sensation pairings based on their existing causal strength would be of interest. As reported in Chapter 6 (Mifsud & Whitford, 2017), the auditory ERP amplitude reduction associated with saccade initiation was smaller than the auditory ERP amplitude reduction associated with blow and press initiation. If this were due to the absence of a learned saccade–tone contingency, it is reasonable to expect a stronger reduction in the saccade initiation condition after some learning of the underlying contingency. In an experiment with a greater number of trials than the 60- and 80-trial blocks used in the studies reported in this thesis—as a benchmark, Elijah et al. (2016) used 420 trials to investigate training of self-initiated sensations—it may be possible to observe changes in ERP attenuation dependent on the existing causal strength of action–sensation pairings over the course of an experiment.
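A minimal sketch of the block-wise analysis implied here, assuming per-trial N1 amplitudes are already available for self- and externally initiated tones (the values below are placeholders): attenuation is computed per block, and a linear trend across blocks would index learning of the contingency.

```python
import numpy as np

def blockwise_attenuation(self_amps, ext_amps, n_blocks):
    """Attenuation (self minus external mean N1 amplitude) for each block."""
    self_blocks = np.array_split(np.asarray(self_amps), n_blocks)
    ext_blocks = np.array_split(np.asarray(ext_amps), n_blocks)
    # Positive values = less negative (attenuated) self-initiated N1
    return np.array([s.mean() - e.mean() for s, e in zip(self_blocks, ext_blocks)])

rng = np.random.default_rng(2)
self_amps = rng.normal(-3.0, 1.0, 420)   # placeholder per-trial amplitudes (uV)
ext_amps = rng.normal(-5.0, 1.0, 420)    # placeholder per-trial amplitudes (uV)

att = blockwise_attenuation(self_amps, ext_amps, n_blocks=7)
slope, _ = np.polyfit(np.arange(att.size), att, 1)
print(att.round(2), f"linear trend: {slope:.3f} uV per block")
```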


Another potentially important design, apart from training paradigms, is the coincidence paradigm introduced by Horváth et al. (2012). In this paradigm, a sequence of tones is presented independently of, but concurrently with, the participant’s button presses. Tones that happen to coincide with a keypress typically elicit attenuated auditory N1m and P2m responses. Horváth et al. (2012) argued that this shows that a contingency (that is, a causal relationship) between a motor action and its sensory consequence (which is what the forward model uses to cancel reafference in contingent paradigms such as those used throughout this thesis) is not actually a necessary condition for auditory N1 attenuation. However, Horváth (2015) also acknowledged that the mechanism causing the coincidence effect is not clear. Hence, it remains unclear whether contingent paradigms cause ERP attenuation orthogonal to that reported in the coincidence paradigm, and continued replication and experimentation with coincident action–sensation pairings is a worthwhile area of future research (the tone-labelling step of this paradigm is sketched at the end of this subsection). For instance, the coincidence approach could be translated into the saccade initiation paradigm introduced in Chapter 4 (Mifsud, Beesley, et al., 2016) by having participants make saccades between two predetermined fixation points while an overlapping series of tones is played, to see if the ERP attenuation outcomes converge across sensory modalities. Lastly, two other exciting areas worthy of further consideration involve tool-mediated stimulus initiation and multisensory predictions. Given that auditory N1 attenuation is consistently observed following button-press initiation, there is evidence to support the notion that an efference copy prepares sensory cortex for the consequences of self-initiated actions, even when the causal chain is indirect and mediated by external devices (Ford et al., 2013). Kilteni and Ehrsson (2017) took this hypothesis to the next logical level by using a psychophysical force-matching paradigm to show that sensory attenuation of touch could be observed even for touches applied using a hand-held tool. They suggested that the brain uses effector-independent forward models, given that the touch attenuation relied on the anticipated position of the tool-tip rather than the body part itself. In another study, van Kemenade, Arikan, Kircher, and Straube (2016) found that bimodal (i.e., both auditory and visual) consequences of participants’ actions led to enhanced detection of delays between action and feedback, providing support for the notion that forward models create multimodal predictions. These innovative approaches to testing the theoretical boundaries of forward prediction hold great promise, and the number of very recent studies suggests that the sensory attenuation phenomenon is under closer interrogation than ever before.
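Returning to the coincidence paradigm discussed above, the classification step at its core, labelling each tone as coincident or not with the nearest button press, can be sketched as follows; the ±50 ms window and the example timestamps are assumptions for illustration rather than the criterion used by Horváth et al. (2012).

```python
import numpy as np

def label_coincident(tone_onsets, press_times, window_s=0.05):
    """Mark each tone whose nearest button press falls within +/- window_s."""
    tone_onsets = np.asarray(tone_onsets, dtype=float)
    press_times = np.sort(np.asarray(press_times, dtype=float))
    idx = np.searchsorted(press_times, tone_onsets)
    idx_prev = np.clip(idx - 1, 0, press_times.size - 1)
    idx_next = np.clip(idx, 0, press_times.size - 1)
    # Distance from each tone onset to its nearest press, in seconds
    nearest = np.minimum(np.abs(tone_onsets - press_times[idx_prev]),
                         np.abs(tone_onsets - press_times[idx_next]))
    return nearest <= window_s

tone_onsets = [0.8, 2.1, 3.96, 5.3]   # hypothetical timestamps (s)
press_times = [1.5, 4.0, 6.2]         # hypothetical timestamps (s)
print(label_coincident(tone_onsets, press_times))  # [False False  True False]
```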


7.3.3 Clinical trials of the novel self-initiation paradigms

As noted previously, an improved understanding of the mechanisms that underlie sensory attenuation may hold implications for the sense of agency and, by extension, clinical disorders that involve an abnormal sense of agency. Specifically, disrupted sensory attenuation may be involved in the etiology of psychotic symptoms that are characterized by an abnormal sense of agency, including auditory hallucinations and delusions of control (Feinberg, 1978; Frith, 1987). In support of this hypothesis, patients with schizophrenia have been found to have similar auditory N1 responses for both self- and externally initiated speech (Ford, Gray, et al., 2007; Ford et al., 2001; Heinks-Maldonado et al., 2007), suggesting that they do not distinguish between self- and externally generated actions, which may underlie their feelings of abnormal agency. Given that this possibility was one of the theoretical underpinnings of this thesis, I will end this general discussion by considering what insight may be provided by taking the directions of the preceding empirical chapters into clinical populations. Most importantly, given that patients with schizophrenia exhibit reduced levels of electrophysiological attenuation to sounds, if they were to exhibit responses to self-initiated visual stimuli that aligned with the responses of healthy participants (e.g., reduced visual N1 to flashes, as shown in Chapter 5), this might help to explain why auditory hallucinations have a much higher prevalence in schizophrenia than hallucinations in any other sensory modality, including vision. To increase the impact of a visual self-initiation ERP clinical study, it would be sensible to include patients with conditions that are primarily associated with visual hallucinations, such as Charles Bonnet syndrome (Santhouse, Howard, & ffytche, 2000) or Parkinson’s disease (Fénelon, Mahieux, Huon, & Ziégler, 2000). If a dissociation in VEP attenuation could be demonstrated between these clinical populations and schizophrenia patients with auditory hallucinations, this would provide a significant new line of evidence that the first-rank symptoms considered most characteristic of schizophrenia, such as auditory hallucinations, are tied to dysfunction of motor prediction in the auditory domain.


Similarly, it would also be worth testing whether patients with schizophrenia show abnormal electrophysiological attenuation to saccade- and blow-initiated stimuli. For example, it would be instructive if patients exhibited equivalent auditory N1 responses across all action–sensation pairings, given that, compared with button-press initiation, healthy participants exhibit reduced N1 attenuation for saccade–tone pairings (Chapter 4; Mifsud, Beesley, et al., 2016) and increased N1 attenuation for blow–tone pairings (Chapter 6; Mifsud & Whitford, 2017). This would suggest that an underlying reason for patients’ deficits in electrophysiological sensory attenuation is an abnormality in their ability to form habitual associations (see Heinks-Maldonado et al., 2007). Hence, a worthwhile avenue for future work is to conduct self-initiation studies that use a wider range of eliciting motor actions in clinical populations that experience psychotic symptoms.

7.4 Conclusion

This thesis explored the phenomenon of sensory attenuation in a series of ERP experiments that compared pairings of motor actions and their consequent sensations, introducing several novel self-initiation paradigms to the literature. The results suggested that neurophysiological representations of self-initiated auditory and visual sensations are attenuated according to the likelihood that their initiating motor actions typically produce stimulation in that sensory modality. In attempting to elucidate sensory attenuation, a phenomenon at the nexus of prediction, action, and perception, this work has contributed to advancing our understanding of how we interact with the world. Further, it is hoped that this work forms a step on the path to ameliorating the distressing repercussions of atypical sensorimotor function, as appears to occur in people with schizophrenia.


References

Ahlfors, S. P., Han, J., Lin, F. H., Witzel, T., Belliveau, J. W., Hämäläinen, M. S., & Halgren, E. (2010). Cancellation of EEG and MEG signals generated by extended and distributed sources. Human Brain Mapping, 31, 140–149. https://doi.org/10.1002/hbm.20851 Aliu, S. O., Houde, J. F., & Nagarajan, S. S. (2009). Motor-induced suppression of the auditory cortex. Journal of Cognitive Neuroscience, 21, 791–802. https://doi.org/10.1162/jocn.2009.21055 Andreasen, N. C., Arndt, S., Alliger, R., Miller, D., & Flaum, M. (1995). Symptoms of schizophrenia: Methods, meanings and mechanisms. Archives of General Psychiatry, 52, 341–351. https://doi.org/10.1001/archpsyc.1995.03950170015003 Baess, P., Jacobsen, T., & Schröger, E. (2008). Suppression of the auditory N1 event-related potential component with unpredictable self-initiated tones: Evidence for internal forward models with dynamic stimulation. International Journal of Psychophysiology, 70, 137–143. https://doi.org/10.1016/j.ijpsycho.2008.06.005 Baillet, S., Mosher, J. C., & Leahy, R. M. (2001). Electromagnetic brain mapping. IEEE Signal Processing Magazine, 18, 14–30. https://doi.org/10.1109/79.962275 Baum, F., Wolfensteller, U., & Ruge, H. (2017). Learning-related brain-electrical activity dynamics associated with the subsequent impact of learnt action- outcome associations. Frontiers in Human Neuroscience, 11, 252. https://doi.org/10.3389/fnhum.2017.00252 Bayes, T. R. (1763). An essay towards solving a problem in the doctrine of chances. Philosophical Transactions of the Royal Society of London, 53, 370–418. https://doi.org/10.1098/rstl.1763.0053 Bays, P. M., & Wolpert, D. M. (2007). Computational principles of sensorimotor control that minimize uncertainty and variability. Journal of Physiology, 578, 387–396. https://doi.org/10.1113/jphysiol.2006.120121 Behroozmand, R., Karvelis, L., Liu, H., & Larson, C. R. (2009). Vocalization- induced enhancement of the auditory cortex responsiveness during voice F0


feedback perturbation. Clinical Neurophysiology, 120, 1303–1312. https://doi.org/10.1016/j.clinph.2009.04.022 Behroozmand, R., & Larson, C. R. (2011). Error-dependent modulation of speech- induced auditory suppression for pitch-shifted voice feedback. BMC Neuroscience, 12, 54. https://doi.org/10.1186/1471-2202-12-54 Behroozmand, R., Oya, H., Nourski, K. V., Kawasaki, H., Larson, C. R., Brugge, J. F., … Greenlee, J. D. (2016). Neural correlates of vocal production and motor control in human Heschl’s gyrus. Journal of Neuroscience, 36, 2302–2315. https://doi.org/10.1523/JNEUROSCI.3305-14.2016 Bendixen, A., SanMiguel, I., & Schröger, E. (2012). Early electrophysiological indicators for predictive processing in audition: A review. International Journal of Psychophysiology, 83, 120–131. https://doi.org/10.1016/j.ijpsycho.2011.08.003 Berchicci, M., Spinelli, D., & Russo, F. D. (2016). New insights into old waves. Matching stimulus- and response-locked ERPs on the same time-window. Biological Psychology, 117, 202–215. https://doi.org/10.1016/j.biopsycho.2016.04.007 Berger, H. (1929). Über das Elektrenkephalogramm des Menschen. Archiv für Psychiatrie und Nervenkrankheiten, 87, 527–570. https://doi.org/10.1007/BF01797193 Binda, P., Bruno, A., Burr, D. C., & Morrone, M. C. (2007). Fusion of visual and auditory stimuli during saccades: A Bayesian explanation for perisaccadic distortions. Journal of Neuroscience, 27, 8525–8532. https://doi.org/10.1523/JNEUROSCI.0737-07.2007 Blakemore, S. J., Frith, C. D., & Wolpert, D. M. (1999). Spatio-temporal prediction modulates the perception of self-produced stimuli. Journal of Cognitive Neuroscience, 11, 551–559. https://doi.org/10.1162/089892999563607 Blakemore, S. J., Smith, J., Steel, R., Johnstone, E. C., & Frith, C. D. (2000). The perception of self-produced sensory stimuli in patients with auditory hallucinations and passivity experiences: Evidence for a breakdown in self- monitoring. Psychological Medicine, 30, 1131–1139. https://doi.org/10.1017/S0033291799002676


Blakemore, S. J., Wolpert, D. M., & Frith, C. D. (1998). Central cancellation of self- produced tickle sensation. Nature Neuroscience, 1, 635–640. https://doi.org/10.1038/2870 Brainard, D. H. (1997). The Psychophysics Toolbox. Spatial Vision, 10, 433–436. https://doi.org/10.1163/156856897X00357 Brand, A., Allen, L., Altman, M., Hlava, M., & Scott, J. (2015). Beyond authorship: Attribution, contribution, collaboration, and credit. Learned Publishing, 28, 151–155. https://doi.org/10.1087/20150211 Bridgeman, B. (2007). Efference copy and its limitations. Computers in Biology and Medicine, 37, 924–929. https://doi.org/10.1016/j.compbiomed.2006.07.001 Brown, A. S., & Derkits, E. J. (2010). Prenatal infection and schizophrenia: A review of epidemiologic and translational studies. American Journal of Psychiatry, 167, 261–280. https://doi.org/10.1176/appi.ajp.2009.09030361 Budd, T. W., Barry, R. J., Gordon, E., Rennie, C., & Michie, P. T. (1998). Decrement of the N1 auditory event-related potential with stimulus repetition: Habituation vs. refractoriness. International Journal of Psychophysiology, 31, 51–68. https://doi.org/10.1016/S0167-8760(98)00040-3 Cardoso-Leite, P., Mamassian, P., Schütz-Bosbach, S., & Waszak, F. (2010). A new look at sensory attenuation: Action-effect anticipation affects sensitivity, not response bias. Psychological Science, 21, 1740–1745. https://doi.org/10.1177/0956797610389187 Carr, V. J., Neil, A. L., Halpin, S. A., Holmes, S., & Lewin, T. J. (2003). Costs of schizophrenia and other psychoses in urban Australia: Findings from the Low Prevalence (Psychotic) Disorders Study. Australian and New Zealand Journal of Psychiatry, 37, 31–40. https://doi.org/10.1046/j.1440-1614.2003.01092.x Catani, M., Mesulam, M. M., Jakobsen, E., Malik, F., Martersteck, A., Wieneke, C., … Rogalski, E. (2013). A novel frontal pathway underlies verbal fluency in primary progressive aphasia. Brain, 136, 2619–2628. https://doi.org/10.1093/brain/awt163 Chang, E. F., Niziolek, C. A., Knight, R. T., Nagarajan, S. S., & Houde, J. F. (2013). Human cortical sensorimotor network underlying feedback control of vocal pitch. Proceedings of the National Academy of Sciences of the United States of America, 110, 2653–2658. https://doi.org/10.1073/pnas.1216827110


Clark, A. (2013). Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behavioral and Brain Sciences, 36, 181–204. https://doi.org/10.1017/S0140525X12000477 Clark, V. P., Fan, S., & Hillyard, S. A. (1995). Identification of early visual evoked potential generators by retinotopic and topographic analyses. Human Brain Mapping, 2, 170–187. https://doi.org/10.1002/hbm.460020306 Clark, V. P., & Hillyard, S. A. (1996). Spatial selective attention affects early extrastriate but not striate components of the visual evoked potential. Journal of Cognitive Neuroscience, 8, 387–402. https://doi.org/10.1162/jocn.1996.8.5.387 Claxton, G. (1975). Why can’t we tickle ourselves? Perceptual and Motor Skills, 41, 335–338. https://doi.org/10.2466/pms.1975.41.1.335 Cohen, D., & Cuffin, B. N. (1979). Comparison of the magnetoencephalogram and electroencephalogram. Electroencephalography and Clinical Neurophysiology, 47, 132–146. https://doi.org/10.1016/0013-4694(79)90215-3 Correa, A., Lupianez, J., Madrid, E., & Tudela, P. (2006). Temporal attention enhances early visual processing: A review and new evidence from event- related potentials. Brain Research, 1076, 116–128. https://doi.org/10.1016/j.brainres.2005.11.074 Crapse, T. B., & Sommer, M. A. (2008). Corollary discharge across the animal kingdom. Nature Reviews Neuroscience, 9, 587–600. https://doi.org/10.1038/nrn2457 Cullen, K. E., Brooks, J. X., Jamali, M., Carriot, J., & Massot, C. (2011). Internal models of self-motion: Computations that suppress vestibular reafference in early vestibular processing. Experimental Brain Research, 210, 377–388. https://doi.org/10.1007/s00221-011-2555-9 Curio, G., Neuloh, G., Numminen, J., Jousmaki, V., & Hari, R. (2000). Speaking modifies voice-evoked activity in the human auditory cortex. Human Brain Mapping, 9, 183–191. https://doi.org/10.1002/(SICI)1097- 0193(200004)9:4<183::AID-HBM1>3.0.CO;2-Z Davis, W. J., Siegler, M. V., & Mpitsos, G. J. (1973). Distributed neuronal oscillators and efference copy in the feeding system of Pleurobranchaea. Journal of Neurophysiology, 36, 258–274.


Delcomyn, F. (1977). Corollary discharge to cockroach giant interneurones. Nature, 269, 160–162. https://doi.org/10.1038/269160a0 Dennett, D. C. (1993). Consciousness explained. UK: Penguin. Desmurget, M., & Sirigu, A. (2012). Conscious motor intention emerges in the inferior parietal lobule. Current Opinion in Neurobiology, 22, 1004–1011. https://doi.org/10.1016/j.conb.2012.06.006 Diamond, M. R., Ross, J., & Morrone, M. C. (2000). Extraretinal control of saccadic suppression. Journal of Neuroscience, 20, 3449–3455. Dick, A. S., & Tremblay, P. (2012). Beyond the arcuate fasciculus: Consensus and controversy in the connectional anatomy of language. Brain, 135, 3529–3550. https://doi.org/10.1093/brain/aws222 Duhamel, J. R., Colby, C. L., & Goldberg, M. E. (1992). The updating of the representation of visual space in parietal cortex by intended eye movements. Science, 255, 90–92. https://doi.org/10.1126/science.1553535 Duysens, J., Tax, A. A., Nawijn, S., Berger, W., Prokop, T., & Altenmuller, E. (1995). Gating of sensation and evoked potentials following foot stimulation during human gait. Experimental Brain Research, 105, 423–431. https://doi.org/10.1007/bf00233042 Eaton, W. W., Thara, R., Federman, B., Melton, B., & Liang, K. (1995). Structure and course of positive and negative symptoms in schizophrenia. Archives of General Psychiatry, 52, 127–134. https://doi.org/10.1001/archpsyc.1995.03950140045005 Eliades, S. J., & Wang, X. (2003). Sensory-motor interaction in the primate auditory cortex during self-initiated vocalizations. Journal of Neurophysiology, 89, 2194– 2207. https://doi.org/10.1152/jn.00627.2002 Eliades, S. J., & Wang, X. (2005). Dynamics of auditory-vocal interaction in monkey auditory cortex. Cerebral Cortex, 15, 1510–1523. https://doi.org/10.1093/cercor/bhi030 Elijah, R. B., Le Pelley, M. E., & Whitford, T. J. (2016). Modifying temporal expectations: Changing cortical responsivity to delayed self-initiated sensations with training. Biological Psychology, 120, 88–95. https://doi.org/10.1016/j.biopsycho.2016.09.001


Engbert, K., Wohlschlager, A., & Haggard, P. (2008). Who is causing what? The sense of agency is relational and efferent-triggered. Cognition, 107, 693–704. https://doi.org/10.1016/j.cognition.2007.07.021 Feinberg, I. (1978). Efference copy and corollary discharge: Implications for thinking and its disorders. Schizophrenia Bulletin, 4, 636–640. https://doi.org/10.1093/schbul/4.4.636 Fénelon, G., Mahieux, F., Huon, R., & Ziégler, M. (2000). Hallucinations in Parkinson’s disease: Prevalence, phenomenology and risk factors. Brain, 123, 733–745. https://doi.org/10.1093/brain/123.4.733 Fitzsimmons, J., Kubicki, M., & Shenton, M. E. (2013). Review of functional and anatomical brain connectivity findings in schizophrenia. Current Opinion in Psychiatry, 26, 172–187. https://doi.org/10.1097/YCO.0b013e32835d9e6a Flanagan, J. R., & Wing, A. M. (1997). The role of internal models in motion planning and control: Evidence from grip force adjustments during movements of hand-held loads. Journal of Neuroscience, 17, 1519–1528. Fletcher, P. C., & Frith, C. D. (2009). Perceiving is believing: A Bayesian approach to explaining the positive symptoms of schizophrenia. Nature Neuroscience, 10, 48–58. https://doi.org/10.1038/nrn2536 Ford, J. M. (2016). Studying auditory verbal hallucinations using the RDoC framework. Psychophysiology, 53, 298–304. https://doi.org/10.1111/psyp.12457 Ford, J. M., Gray, M., Faustman, W. O., Roach, B. J., & Mathalon, D. H. (2007). Dissecting corollary discharge dysfunction in schizophrenia. Psychophysiology, 44, 522–529. https://doi.org/10.1111/j.1469-8986.2007.00533.x Ford, J. M., Mathalon, D. H., Heinks, T., Kalba, S., Faustman, W. O., & Roth, W. T. (2001). Neurophysiological evidence of corollary discharge dysfunction in schizophrenia. American Journal of Psychiatry, 158, 2069–2071. https://doi.org/10.1176/appi.ajp.158.12.2069 Ford, J. M., Palzes, V. A., Roach, B. J., & Mathalon, D. H. (2013). Did I do that? Abnormal predictive processes in schizophrenia when button pressing to deliver a tone. Schizophrenia Bulletin, 40, 804–812. https://doi.org/10.1093/schbul/sbt072


Ford, J. M., Roach, B. J., Faustman, W. O., & Mathalon, D. H. (2007). Synch before you speak: Auditory hallucinations in schizophrenia. American Journal of Psychiatry, 164, 458–466. https://doi.org/10.1176/appi.ajp.164.3.458 Ford, J. M., Roach, B. J., & Mathalon, D. H. (2010). Assessing corollary discharge in humans using noninvasive neurophysiological methods. Nature Protocols, 5, 1160–1168. https://doi.org/10.1038/nprot.2010.67 Francis, B. A., & Wonham, W. M. (1976). The internal model principle of control theory. Automatica, 12, 457–465. https://doi.org/10.1016/0005- 1098(76)90006-6 Friston, K. (2010). The free-energy principle: A unified brain theory? Nature Reviews Neuroscience, 11, 127–138. https://doi.org/10.1038/nrn2787 Frith, C. D. (1987). The positive and negative symptoms of schizophrenia reflect impairments in the perception and initiation of action. Psychological Medicine, 17, 631–648. https://doi.org/10.1017/S0033291700025873 Frith, C. D. (2005). The neural basis of hallucinations and delusions. Comptes Rendus Biologies, 328, 169–175. https://doi.org/10.1016/j.crvi.2004.10.012 Frith, C. D. (2012). Explaining delusions of control: The comparator model 20 years on. Consciousness and Cognition, 21, 52–54. https://doi.org/10.1016/j.concog.2011.06.010 Frith, C. D., Blakemore, S. J., & Wolpert, D. M. (2000). Explaining the symptoms of schizophrenia: Abnormalities in the awareness of action. Brain Research Reviews, 31, 357–363. https://doi.org/10.1016/S0165-0173(99)00052-1 Frith, C. D., & Done, D. J. (1988). Towards a neuropsychology of schizophrenia. British Journal of Psychiatry, 153, 437–443. https://doi.org/10.1192/bjp.153.4.437 Fukuda, K., & Vogel, E. K. (2009). Human variation in overriding attentional capture. Journal of Neuroscience, 29, 8726–8733. https://doi.org/10.1523/JNEUROSCI.2145-09.2009 Gentsch, A., & Schütz-Bosbach, S. (2011). I did it: Unconscious expectation of sensory consequences modulates the experience of self-agency and its functional signature. Journal of Cognitive Neuroscience, 23, 3817–3828. https://doi.org/10.1162/jocn_a_00012


Gevins, A., Leong, H., Smith, M. E., Le, J., & Du, R. (1995). Mapping cognitive brain function with modern high-resolution electroencephalography. Trends in Neuroscience, 18, 429–436. https://doi.org/10.1016/0166-2236(95)94489-R Gomez Gonzalez, C. M., Clark, V. P., Fan, S., Luck, S. J., & Hillyard, S. A. (1994). Sources of attention-sensitive visual event-related potentials. Brain Topography, 7, 41–51. https://doi.org/10.1007/BF01184836 Gratton, G., Coles, M. G., & Donchin, E. (1983). A new method for off-line removal of ocular artifact. Electroencephalography and Clinical Neurophysiology, 55, 468–484. https://doi.org/10.1016/0013-4694(83)90135-9 Green, M. F., & Kinsbourne, M. (1990). Subvocal activity and auditory hallucinations: Clues for behavioral treatments? Schizophrenia Bulletin, 16, 617–625. https://doi.org/10.1093/schbul/16.4.617 Greenlee, J. D., Behroozmand, R., Larson, C. R., Jackson, A. W., Chen, F., Hansen, D. R., … Howard, M. A., 3rd. (2013). Sensory-motor interactions for vocal pitch monitoring in non-primary human auditory cortex. PLoS One, 8, e60783. https://doi.org/10.1371/journal.pone.0060783 Greenlee, J. D., Jackson, A. W., Chen, F., Larson, C. R., Oya, H., Kawasaki, H., … Howard, M. A., 3rd. (2011). Human auditory cortical activation during self-vocalization. PLoS One, 6, e14744. https://doi.org/10.1371/journal.pone.0014744 Grüsser, O.-J. (1994). On the history of the ideas of efference copy and reafference. In C. Debru (Ed.), Essays in the history of physiological sciences (pp. 35–36). Atlanta, GA: Rodopi. Guthrie, B. L., Porter, J. D., & Sparks, D. L. (1983). Corollary discharge provides accurate eye position information to the oculomotor system. Science, 221, 1193–1195. https://doi.org/10.1126/science.6612334 Hämäläinen, M. S., Hari, R., Ilmoniemi, R. J., Knuutila, J., & Lounasmaa, O. V. (1993). Magnetoencephalography—theory, instrumentation, and applications to noninvasive studies of the working human brain. Reviews of Modern Physics, 65, 413–497. https://doi.org/10.1103/RevModPhys.65.413 Harrison, P. J. (1991). The neuropathology of schizophrenia: A critical review of the data and their interpretation. Brain, 122, 593–624. https://doi.org/10.1093/brain/122.4.593


Hegerl, U., Gallinat, J., & Mrowinski, D. (1994). Intensity dependence of auditory evoked dipole source activity. International Journal of Psychophysiology, 17, 1– 13. https://doi.org/10.1016/0167-8760(94)90050-7 Heinks-Maldonado, T. H., Mathalon, D. H., Gray, M., & Ford, J. M. (2005). Fine- tuning of auditory cortex during speech production. Psychophysiology, 42, 180– 190. https://doi.org/10.1111/j.1469-8986.2005.00272.x Heinks-Maldonado, T. H., Mathalon, D. H., Houde, J. F., Gray, M., Faustman, W. O., & Ford, J. M. (2007). Relationship of imprecise corollary discharge in schizophrenia to auditory hallucinations. Archives of General Psychiatry, 64, 286–296. https://doi.org/10.1001/archpsyc.64.3.286 Hickok, G. (2012). Computational neuroanatomy of speech production. Nature Reviews Neuroscience, 13, 135–145. https://doi.org/10.1038/nrn3158 Hoekema, R., Wieneke, G. H., Leijten, F. S., van Veelen, C. W., van Rijen, P. C., Huiskamp, G. J., … van Huffelen, A. C. (2003). Measurement of the conductivity of skull, temporarily removed during epilepsy surgery. Brain Topography, 16, 29–38. https://doi.org/10.1023/A:1025606415858 Horváth, J. (2015). Action-related auditory ERP attenuation: Paradigms and hypotheses. Brain Research, 1626, 54–65. https://doi.org/10.1016/j.brainres.2015.03.038 Horváth, J., & Burgyán, A. (2013). No evidence for peripheral mechanism attenuating auditory ERPs to self-induced tones. Psychophysiology, 50, 563– 569. https://doi.org/10.1111/psyp.12041 Horváth, J., Maess, B., Baess, P., & Tóth, A. (2012). Action-sound coincidences suppress evoked responses of the human auditory cortex in EEG and MEG. Journal of Cognitive Neuroscience, 24, 1919–1931. https://doi.org/10.1162/jocn_a_00215 Houde, J. F., & Chang, E. F. (2015). The cortical computations underlying feedback control in vocal production. Current Opinion in Neurobiology, 33, 174–181. https://doi.org/10.1016/j.conb.2015.04.006 Houde, J. F., Nagarajan, S. S., Sekihara, K., & Merzenich, M. M. (2002). Modulation of the auditory cortex during speech: An MEG study. Journal of Cognitive Neuroscience, 14, 1125–1138. https://doi.org/10.1162/089892902760807140


Hughes, G., Desantis, A., & Waszak, F. (2013). Mechanisms of intentional binding and sensory attenuation: The role of temporal prediction, temporal control, identity prediction, and motor prediction. Psychological Bulletin, 139, 133–151. https://doi.org/10.1037/a0028566 Hughes, G., & Waszak, F. (2011). ERP correlates of action effect prediction and visual sensory attenuation in voluntary action. NeuroImage, 56, 1632–1640. https://doi.org/10.1016/j.neuroimage.2011.02.057 Hughes, G., & Waszak, F. (2014). Predicting faces and houses: Category-specific visual action-effect prediction modulates late stages of sensory processing. Neuropsychologia, 61, 11–18. https://doi.org/10.1016/j.neuropsychologia.2014.06.002 Hyvärinen, A., & Oja, E. (2000). Independent component analysis: Algorithms and applications. Neural Networks, 13, 411–430. https://doi.org/10.1016/S0893- 6080(00)00026-5 Ibbotson, M., & Krekelberg, B. (2011). Visual perception and saccadic eye movements. Current Opinion in Neurobiology, 21, 553–558. https://doi.org/10.1016/j.conb.2011.05.012 Ishikawa, T., Tomatsu, S., Izawa, J., & Kakei, S. (2016). The cerebro-cerebellum: Could it be loci of forward models? Neuroscience Research, 104, 72–79. https://doi.org/10.1016/j.neures.2015.12.003 Johansson, R. S., & Westling, G. (1984). Roles of glabrous skin receptors and sensorimotor memory in automatic control of precision grip when lifting rougher or more slippery objects. Experimental Brain Research, 56, 550–564. https://doi.org/10.1007/BF00237997 Kapur, S. (2003). Psychosis as a state of aberrant salience: A framework linking biology, phenomenology, and pharmacology in schizophrenia. American Journal of Psychiatry, 160, 13–23. https://doi.org/10.1176/appi.ajp.160.1.13 Kawato, M. (1999). Internal models for motor control and trajectory planning. Current Opinion in Neurobiology, 9, 718–727. Kilteni, K., & Ehrsson, H. H. (2017). Sensorimotor predictions and tool use: Hand- held tools attenuate self-touch. Cognition, 165, 1–9. https://doi.org/10.1016/j.cognition.2017.04.005


Kleiner, M., Brainard, D. H., & Pelli, D. G. (2007). What’s new in Psychtoolbox-3? Perception, 36, ECVP Abstract Supplement. Knapp, M., Mangalore, R., & Simon, J. (2004). The global costs of schizophrenia. Schizophrenia Bulletin, 30, 279–293. https://doi.org/10.1093/oxfordjournals.schbul.a007078 Knolle, F., Schröger, E., Baess, P., & Kotz, S. A. (2012). The cerebellum generates motor-to-auditory predictions: ERP lesion evidence. Journal of Cognitive Neuroscience, 24, 698–706. https://doi.org/10.1162/jocn_a_00167 Körding, K. P., & Wolpert, D. M. (2004). Bayesian integration in sensorimotor learning. Nature, 427, 244–247. https://doi.org/10.1038/nature02169 Lange, K. (2009). Brain correlates of early auditory processing are attenuated by expectations for time and pitch. Brain and Cognition, 69, 127–137. https://doi.org/10.1016/j.bandc.2008.06.004 Lange, K. (2013). The ups and downs of temporal orienting: A review of auditory temporal orienting studies and a model associating the heterogeneous findings on the auditory N1 with opposite effects of attention and prediction. Frontiers in Human Neuroscience, 7, 263. https://doi.org/10.3389/fnhum.2013.00263 Laruelle, M., & Abi-Dargham, A. (1999). Dopamine as the wind of the psychotic fire: New evidence from brain imaging studies. Journal of Psychopharmacology, 13, 358–371. https://doi.org/10.1177/026988119901300405 Llinás, R. (2001). I of the vortex: From neurons to self. Cambridge, MA: The MIT Press. Luck, S. J. (2005). Ten simple rules for designing ERP experiments. In T. C. Handy (Ed.), Event-related potentials: A methods handbook (pp. 17-32). Cambridge, MA: The MIT Press. Luck, S. J. (2012). An introduction to the event-related potential technique (2nd ed.). Cambridge, MA: The MIT Press. Luck, S. J., & Kappenman, E. S. (2012). The Oxford handbook of event-related potential components: Oxford University Press, USA. Luck, S. J., Mathalon, D. H., O'Donnell, B. F., Hamalainen, M. S., Spencer, K. M., Javitt, D. C., & Uhlhaas, P. J. (2011). A roadmap for the development and validation of event-related potential biomarkers in schizophrenia research. Biological Psychiatry, 70, 28–34. https://doi.org/10.1016/j.biopsych.2010.09.021


Martikainen, M. H., Kaneko, K., & Hari, R. (2005). Suppressed responses to self- triggered sounds in the human auditory cortex. Cerebral Cortex, 15, 299–302. https://doi.org/10.1093/cercor/bhh131 McCarthy, G., & Donchin, E. (1976). The effects of temporal and event uncertainty in determining the waveforms of the auditory event related potential (ERP). Psychophysiology, 13, 581–590. https://doi.org/10.1111/j.1469- 8986.1976.tb00885.x McGrath, J., Saha, S., Chant, D., & Welham, J. (2008). Schizophrenia: A concise overview of incidence, prevalence, and mortality. Epidemiologic Reviews, 30, 67–76. https://doi.org/10.1093/epirev/mxn001 Mellor, C. (1970). First rank symptoms of schizophrenia: I. The frequency in schizophrenics on admission to hospital. II. Differences between individual first rank symptoms. British Journal of Psychiatry, 117, 15–23. Miall, R. C., & Wolpert, D. M. (1996). Forward models for physiological motor control. Neural Networks, 9, 1265–1279. https://doi.org/10.1016/S0893- 6080(96)00035-4 Mifsud, N. G., Beesley, T., Watson, T. L., & Whitford, T. J. (2016). Attenuation of auditory evoked potentials for hand and eye-initiated sounds. Biological Psychology, 120, 61–68. https://doi.org/10.1016/j.biopsycho.2016.08.011 Mifsud, N. G., Oestreich, L. K., Jack, B. N., Ford, J. M., Roach, B. J., Mathalon, D. H., & Whitford, T. J. (2016). Self-initiated actions result in suppressed auditory but amplified visual evoked components in healthy participants. Psychophysiology, 53, 723–732. https://doi.org/10.1111/psyp.12605 Mifsud, N. G., & Whitford, T. J. (2017). Sensory attenuation of self-initiated sounds maps onto habitual associations between motor action and sound. Neuropsychologia, 103, 38–43. https://doi.org/10.1016/j.neuropsychologia.2017.07.019 Müller-Preuss, P., & Ploog, D. (1981). Inhibition of auditory cortical neurons during phonation. Brain Research, 215, 61–76. https://doi.org/10.1016/0006- 8993(81)90491-1 Näätänen, R., & Picton, T. (1987). The N1 wave of the human electric and magnetic response to sound: A review and an analysis of the component structure.


Psychophysiology, 24, 375–425. https://doi.org/10.1111/j.1469- 8986.1987.tb00311.x Neuweiler, G. (2003). Evolutionary aspects of bat echolocation. Journal of Comparative Physiology A, 189, 245–256. https://doi.org/10.1007/s00359-003- 0406-2 Niell, C. M., & Stryker, M. P. (2010). Modulation of visual responses by behavioral state in mouse visual cortex. Neuron, 65, 472–479. https://doi.org/10.1016/j.neuron.2010.01.033 Nunez, P. L., & Srinivasan, R. (2006). Electric fields of the brain: The neurophysics of EEG: Oxford University Press, USA. O’Regan, J. K., & Noë, A. (2001). A sensorimotor account of vision and visual consciousness. Behavioral and Brain Sciences, 24, 939–973. https://doi.org/10.1017/S0140525X01000115 O’Shea, R. P., Roeber, U., & Bach, M. (2009). Evoked potential: Vision. In E. B. Goldstein (Ed.), Encyclopedia of Perception (Vol. xli, pp. 399–400). Los Angeles: Sage Publications. Odom, J. V., Bach, M., Brigell, M., Holder, G. E., McCulloch, D. L., Tormene, A. P., & Vaegan. (2010). ISCEV standard for clinical visual evoked potentials (2009 update). Documenta Ophthalmologica, 120, 111–119. https://doi.org/10.1007/s10633-009-9195-4 Oestreich, L. K., Mifsud, N. G., Ford, J. M., Roach, B. J., Mathalon, D. H., & Whitford, T. J. (2015). Subnormal sensory attenuation to self-generated speech in schizotypy: Electrophysiological evidence for a “continuum of psychosis”. International Journal of Psychophysiology, 97, 131–138. https://doi.org/10.1016/j.ijpsycho.2015.05.014 Oestreich, L. K., Mifsud, N. G., Ford, J. M., Roach, B. J., Mathalon, D. H., & Whitford, T. J. (2016). Cortical suppression to delayed self-initiated auditory stimuli in schizotypy: Neurophysiological evidence for a continuum of psychosis. Clinical EEG and Neuroscience, 47, 3–10. https://doi.org/10.1177/1550059415581708 Palmer, B. A., Pankratz, V. S., & Bostwick, J. M. (2005). The lifetime risk of suicide in schizophrenia: A reexamination. Archives of General Psychiatry, 62, 247– 253. https://doi.org/10.1001/archpsyc.62.3.247


Parkinson, A. L., Flagmeier, S. G., Manes, J. L., Larson, C. R., Rogers, B., & Robin, D. A. (2012). Understanding the neural mechanisms involved in sensory control of voice production. NeuroImage, 61, 314–322. https://doi.org/10.1016/j.neuroimage.2012.02.068 Paus, T., Perry, D. W., Zatorre, R. J., Worsley, K. J., & Evans, A. C. (1996). Modulation of cerebral blood flow in the human auditory cortex during speech: Role of motor-to-sensory discharges. European Journal of Neuroscience, 8, 2236–2246. https://doi.org/10.1111/j.1460- 9568.1996.tb01187.x Pelli, D. G. (1997). The VideoToolbox software for visual psychophysics: Transforming numbers into movies. Spatial Vision, 10, 437–442. https://doi.org/10.1163/156856897X00366 Perez, V. B., Ford, J. M., Roach, B. J., Loewy, R. L., Stuart, B. K., Vinogradov, S., & Mathalon, D. H. (2012). Auditory cortex responsiveness during talking and listening: Early illness schizophrenia and patients at clinical high-risk for psychosis. Schizophrenia Bulletin, 38, 1216–1224. https://doi.org/10.1093/schbul/sbr124 Perri, R. L., Berchicci, M., Lucci, G., Spinelli, D., & Di Russo, F. (2015). The premotor role of the prefrontal cortex in response consistency. Neuropsychology, 29, 767–775. https://doi.org/10.1037/neu0000168 Picton, T. W., Woods, D. L., Baribeau-Braun, J., & Healey, T. M. (1976). Evoked potential audiometry. Journal of Otolaryngology, 6, 90–119. Pinheiro, A. P., Rezaii, N., Rauber, A., & Niznikiewicz, M. (2016). Is this my voice or yours? The role of emotion and acoustic quality in self-other voice discrimination in schizophrenia. Cognitive Neuropsychiatry, 21, 335–353. https://doi.org/10.1080/13546805.2016.1208611 Poulet, J. F., & Hedwig, B. (2002). A corollary discharge maintains auditory sensitivity during sound production. Nature, 418, 872–876. https://doi.org/10.1038/nature00919 Poulet, J. F., & Hedwig, B. (2006). The cellular basis of a corollary discharge. Science, 311, 518–522. https://doi.org/10.1126/science.1120847


Poulet, J. F., & Hedwig, B. (2007). New insights into corollary discharges mediated by identified neural pathways. Trends in Neuroscience, 30, 14–21. https://doi.org/10.1016/j.tins.2006.11.005 Proske, U., & Gandevia, S. C. (2012). The proprioceptive senses: Their roles in signaling body shape, body position and movement, and muscle force. Physiological Reviews, 92, 1651–1697. https://doi.org/10.1152/physrev.00048.2011 Rapin, I., Schimmel, H., Tourk, L. M., Krasnegor, N. A., & Pollak, C. (1966). Evoked responses to clicks and tones of varying intensity in waking adults. Electroencephalography and Clinical Neurophysiology, 21, 335–344. https://doi.org/10.1016/0013-4694(66)90039-3 Reingold, E. M., & Stampe, D. M. (2002). Saccadic inhibition in voluntary and reflexive saccades. Journal of Cognitive Neuroscience, 14, 371–388. https://doi.org/10.1162/089892902317361903 Repp, B. H., & Knoblich, G. (2007). Action can affect auditory perception. Psychological Science, 18, 6–7. https://doi.org/10.1111/j.1467- 9280.2007.01839.x Roberts, B. L., & Russell, I. J. (1972). The activity of lateral-line efferent neurones in stationary and swimming dogfish. Journal of Experimental Biology, 57, 435– 448. Roussel, C., Hughes, G., & Waszak, F. (2013). A preactivation account of sensory attenuation. Neuropsychologia, 51, 922–929. https://doi.org/10.1016/j.neuropsychologia.2013.02.005 Roussel, C., Hughes, G., & Waszak, F. (2014). Action prediction modulates both neurophysiological and psychophysical indices of sensory attenuation. Frontiers in Human Neuroscience, 8. https://doi.org/10.3389/fnhum.2014.00115 Rummell, B. P., Klee, J. L., & Sigurdsson, T. (2016). Attenuation of responses to self-generated sounds in auditory cortical neurons. Journal of Neuroscience, 36, 12010–12026. https://doi.org/10.1523/JNEUROSCI.1564-16.2016 Saha, S., Chant, D., & McGrath, J. (2007). A systematic review of mortality in schizophrenia: Is the differential mortality gap worsening over time? Archives


of General Psychiatry, 64, 1123–1131. https://doi.org/10.1001/archpsyc.64.10.1123 SanMiguel, I., Todd, J., & Schröger, E. (2013). Sensory suppression effects to self- initiated sounds reflect the attenuation of the unspecific N1 component of the auditory ERP. Psychophysiology, 50, 334–343. https://doi.org/10.1111/psyp.12024 SanMiguel, I., Widmann, A., Bendixen, A., Trujillo-Barreto, N., & Schröger, E. (2013). Hearing silences: Human auditory processing relies on preactivation of sound-specific brain activity patterns. Journal of Neuroscience, 33, 8633– 8639. https://doi.org/10.1523/JNEUROSCI.5821-12.2013 Santangelo, V., Olivetti Belardinelli, M., & Spence, C. (2007). The suppression of reflexive visual and auditory orienting when attention is otherwise engaged. Journal of Experimental Psychology: Human Perception and Performance, 33, 137–148. https://doi.org/10.1037/0096-1523.33.1.137 Santhouse, A. M., Howard, R. J., & ffytche, D. H. (2000). Visual hallucinatory syndromes and the anatomy of the visual brain. Brain, 123, 2055–2064. Sato, A. (2008). Action observation modulates auditory perception of the consequence of others’ actions. Consciousness and Cognition, 17, 1219–1227. https://doi.org/10.1016/j.concog.2008.01.003 Schafer, E. W., Amochaev, A., & Russell, M. J. (1981). Knowledge of stimulus timing attenuates human evoked cortical potentials. Electroencephalography and Clinical Neurophysiology, 52, 9–17. https://doi.org/10.1016/0013- 4694(81)90183-8 Schafer, E. W., & Marcus, M. M. (1973). Self-stimulation alters human sensory brain responses. Science, 181, 175–177. https://doi.org/10.1126/science.181.4095.175 Schall, J. D. (2002). The neural selection and control of saccades by the frontal eye field. Philosophical Transactions of the Royal Society B: Biological Sciences, 357, 1073–1082. https://doi.org/10.1098/rstb.2002.1098 Schneider, D. M., Nelson, A., & Mooney, R. (2014). A synaptic and circuit basis for corollary discharge in the auditory cortex. Nature, 513, 189–194. https://doi.org/10.1038/nature13724 Schneider, K. (1959). Clinical psychopathology. New York: Grune & Stratton.


Schröger, E., Kotz, S. A., & SanMiguel, I. (2015). Bridging prediction and attention in current research on perception and action. Brain Research, 1626, 1–13. https://doi.org/10.1016/j.brainres.2015.08.037 Schultz, W., & Dickinson, A. (2000). Neuronal coding of prediction errors. Annual Review of Neuroscience, 23, 473–500. https://doi.org/10.1146/annurev.neuro.23.1.473 Seeman, P., Lee, T., Chau-Wong, M., & Wong, K. (1976). Antipsychotic drug doses and neuroleptic/dopamine receptors. Nature, 261, 717–719. https://doi.org/10.1038/261717a0 Shahin, A., Roberts, L. E., Pantev, C., Trainor, L. J., & Ross, B. (2005). Modulation of P2 auditory-evoked responses by the spectral complexity of musical sounds. NeuroReport, 16, 1781–1785. https://doi.org/10.1097/01.wnr.0000185017.29316.63 Sharon, D., Hamalainen, M. S., Tootell, R. B., Halgren, E., & Belliveau, J. W. (2007). The advantage of combining MEG and EEG: Comparison to fMRI in focally stimulated visual cortex. NeuroImage, 36, 1225–1235. https://doi.org/10.1016/j.neuroimage.2007.03.066 Shawkat, F. S., & Kriss, A. (2000). A study of the effects of contrast change on pattern VEPS, and the transition between onset, reversal and offset modes of stimulation. Documenta Ophthalmologica, 101, 73–89. https://doi.org/10.1023/A:1002750719094 Shenton, M. E., Dickey, C. C., Frumin, M., & McCarley, M. W. (2001). A review of MRI findings in schizophrenia. Schizophrenia Research, 49, 1–52. https://doi.org/10.1016/S0920-9964(01)00163-3 Shergill, S. S., Bays, P. M., Frith, C. D., & Wolpert, D. M. (2003). Two eyes for an eye: The neuroscience of force escalation. Science, 301, 187. https://doi.org/10.1126/science.1085327 Shergill, S. S., Samson, G., Bays, P. M., Frith, C. D., & Wolpert, D. M. (2005). Evidence for sensory prediction deficits in schizophrenia. American Journal of Psychiatry, 162, 2384–2386. https://doi.org/10.1176/appi.ajp.162.12.2384 Shigeto, H., Tobimatsu, S., Yamamoto, T., Kobayashi, T., & Kato, M. (1998). Visual evoked cortical magnetic responses to checkerboard pattern reversal stimulation: A study on the neural generators of N75, P100 and N145. Journal


of Neurological Sciences, 156, 186–194. https://doi.org/10.1016/S0022- 510X(98)00026-4 Sigurdsson, T. (2016). Neural circuit dysfunction in schizophrenia: Insights from animal models. Neuroscience, 321, 42–65. https://doi.org/10.1016/j.neuroscience.2015.06.059 Sommer, M. A., & Wurtz, R. H. (2002). A pathway in primate brain for internal monitoring of movements. Science, 296, 1480–1482. https://doi.org/10.1126/science.1069590 Sowman, P. F., Brinkworth, R. S., & Turker, K. S. (2010). Threshold for detection of incisal forces is increased by jaw movement. Journal of Dental Research, 89, 395–399. https://doi.org/10.1177/0022034510363101 Sowman, P. F., Kuusik, A., & Johnson, B. W. (2012). Self-initiation and temporal cueing of monaural tones reduce the auditory N1 and P2. Experimental Brain Research, 222, 149–157. https://doi.org/10.1007/s00221-012-3204-7 Sperry, R. W. (1950). Neural basis of the spontaneous optokinetic response produced by visual inversion. Journal of Comparative and Physiological Psychology, 43, 482–489. https://doi.org/10.1037/h0055479 Steen, R. G., Mull, C., McClure, R., Hamer, R. M., & Lieberman, J. A. (2006). Brain volume in first-episode schizophrenia: Systematic review and meta- analysis of magnetic resonance imaging studies. British Journal of Psychiatry, 188, 510–518. https://doi.org/10.1192/bjp.188.6.510 Stenfelt, S., & Goode, R. L. (2005). Bone-conducted sound: Physiological and clinical aspects. Otology & Neurotology, 26, 1245–1261. Stenner, M. P., Bauer, M., Haggard, P., Heinze, H. J., & Dolan, R. (2014). Enhanced alpha-oscillations in visual cortex during anticipation of self- generated visual stimulation. Journal of Cognitive Neuroscience, 26, 2540–2551. https://doi.org/10.1162/jocn_a_00658 Stephan, K. E., Friston, K. J., & Frith, C. D. (2009). Dysconnection in schizophrenia: From abnormal synaptic plasticity to failures of self- monitoring. Schizophrenia Bulletin, 35, 509–527. https://doi.org/10.1093/schbul/sbn176

134

Suga, N., & Shimozawa, T. (1974). Site of neural attenuation of responses to self-vocalized sounds in echolocating bats. Science, 183, 1211–1213. https://doi.org/10.1126/science.183.4130.1211
Sulpizio, V., Lucci, G., Berchicci, M., Galati, G., Pitzalis, S., & Di Russo, F. (2017). Hemispheric asymmetries in the transition from action preparation to execution. NeuroImage, 148, 390–402. https://doi.org/10.1016/j.neuroimage.2017.01.009
Synofzik, M., Thier, P., & Lindner, A. (2006). Internalizing agency of self-action: Perception of one’s own hand movements depends on an adaptable prediction about the sensory action outcome. Journal of Neurophysiology, 96, 1592–1601. https://doi.org/10.1152/jn.00104.2006
Timm, J., SanMiguel, I., Keil, J., Schröger, E., & Schönwiesner, M. (2014). Motor intention determines sensory attenuation of brain responses to self-initiated sounds. Journal of Cognitive Neuroscience, 26, 1481–1489. https://doi.org/10.1162/jocn_a_00552
van Elk, M., Salomon, R., Kannape, O., & Blanke, O. (2014). Suppression of the N1 auditory evoked potential for sounds generated by the upper and lower limbs. Biological Psychology, 102, 108–117. https://doi.org/10.1016/j.biopsycho.2014.06.007
van Kemenade, B. M., Arikan, B. E., Kircher, T., & Straube, B. (2016). Predicting the sensory consequences of one’s own action: First evidence for multisensory facilitation. Attention, Perception, & Psychophysics, 78, 2515–2526. https://doi.org/10.3758/s13414-016-1189-1
van Os, J., & Kapur, S. (2009). Schizophrenia. Lancet, 374, 635–645. https://doi.org/10.1016/S0140-6736(09)60995-8
van Os, J., Krabbendam, L., Myin-Germeys, I., & Delespaul, P. (2005). The schizophrenia envirome. Current Opinion in Psychiatry, 18, 141–145. https://doi.org/10.1097/00001504-200503000-00006
Virtanen, J., Ahveninen, J., Ilmoniemi, R. J., Näätänen, R., & Pekkonen, E. (1998). Replicability of MEG and EEG measures of the auditory N1/N1m-response. Electroencephalography and Clinical Neurophysiology, 108, 291–298. https://doi.org/10.1016/S0168-5597(98)00006-9
Vogel, E. K., & Luck, S. J. (2000). The visual N1 component as an index of a discrimination process. Psychophysiology, 37, 190–203. https://doi.org/10.1111/1469-8986.3720190
von Helmholtz, H. (1866). Handbuch der Physiologischen Optik [Handbook of physiological optics]. Leipzig: Voss.
von Holst, E. (1954). Relations between the central nervous system and the peripheral organs. British Journal of Animal Behaviour, 2, 89–94. https://doi.org/10.1016/s0950-5601(54)80044-x
von Holst, E., & Mittelstaedt, H. (1950). Das Reafferenzprinzip [The principle of reafference]. Naturwissenschaften, 37, 464–476. https://doi.org/10.1007/BF00622503
Wang, J., Mathalon, D. H., Roach, B. J., Reilly, J., Keedy, S. K., Sweeney, J. A., & Ford, J. M. (2014). Action planning and predictive coding when speaking. NeuroImage, 91, 91–98. https://doi.org/10.1016/j.neuroimage.2014.01.003
Waszak, F., Cardoso-Leite, P., & Hughes, G. (2012). Action effect anticipation: Neurophysiological basis and functional consequences. Neuroscience and Biobehavioral Reviews, 36, 943–959. https://doi.org/10.1016/j.neubiorev.2011.11.004
Waters, F., Woodward, T., Allen, P., Aleman, A., & Sommer, I. (2010). Self-recognition deficits in schizophrenia patients with auditory hallucinations: A meta-analysis of the literature. Schizophrenia Bulletin, 38, 741–750. https://doi.org/10.1093/schbul/sbq144
Weiskrantz, L., Elliott, J., & Darlington, C. (1971). Preliminary observations on tickling oneself. Nature, 230, 598–599. https://doi.org/10.1038/230598a0
Weiss, C., Herwig, A., & Schütz-Bosbach, S. (2011). The self in action effects: Selective attenuation of self-generated sounds. Cognition, 121, 207–218. https://doi.org/10.1016/j.cognition.2011.06.011
Whitford, T. J., Ford, J. M., Mathalon, D. H., Kubicki, M., & Shenton, M. E. (2012). Schizophrenia, myelination, and delayed corollary discharges: A hypothesis. Schizophrenia Bulletin, 38, 486–494. https://doi.org/10.1093/schbul/sbq105
Whitford, T. J., Mathalon, D. H., Shenton, M. E., Roach, B. J., Bammer, R., Adcock, R. A., … Ford, J. M. (2011). Electrophysiological and diffusion tensor imaging evidence of delayed corollary discharges in patients with schizophrenia. Psychological Medicine, 41, 959–969. https://doi.org/10.1017/S0033291710001376
Wolpert, D. M. (2011). The real reason for brains [Video file]. Retrieved from https://www.ted.com/talks/daniel_wolpert_the_real_reason_for_brains
Wolpert, D. M., & Flanagan, J. R. (2001). Motor prediction. Current Biology, 11, R729–R732. https://doi.org/10.1016/S0960-9822(01)00432-8
Wolpert, D. M., Ghahramani, Z., & Jordan, M. I. (1995). An internal model for sensorimotor integration. Science, 269, 1880–1882. https://doi.org/10.1126/science.7569931
Woodman, G. F. (2010). A brief introduction to the use of event-related potentials in studies of perception and attention. Attention, Perception, & Psychophysics, 72, 2031–2046. https://doi.org/10.3758/APP.72.8.2031
Wurtz, R. H. (2008). Neuronal mechanisms of visual stability. Vision Research, 48, 2070–2089. https://doi.org/10.1016/j.visres.2008.03.021
Wurtz, R. H., Joiner, W. M., & Berman, R. A. (2011). Neuronal mechanisms for visual stability: Progress and problems. Philosophical Transactions of the Royal Society B: Biological Sciences, 366, 492–503. https://doi.org/10.1098/rstb.2010.0186
Yamazaki, T., Kamijo, K., Kenmochi, A., Fukuzumi, S., Kiyuna, T., Takaki, Y., & Kuroiwa, Y. (2000). Multiple equivalent current dipole source localization of visual event-related potentials during oddball paradigm with motor response. Brain Topography, 12, 159–175. https://doi.org/10.1023/A:1023467806268
Yang, S. N., & McConkie, G. W. (2001). Eye movements during reading: A theory of saccade initiation times. Vision Research, 41, 3567–3585. https://doi.org/10.1016/S0042-6989(01)00025-6
Zhou, M., Liang, F., Xiong, X. R., Li, L., Li, H., Xiao, Z., … Zhang, L. I. (2014). Scaling down of balanced excitation and inhibition by active behavioral states in auditory cortex. Nature Neuroscience, 17, 841–850. https://doi.org/10.1038/nn.3701
