Attention Capture by Sudden and Unexpected Changes: a Multisensory Perspective

Attention Capture by Sudden and Unexpected Changes: A Multisensory Perspective

Erik Marsja

Department of Psychology Umeå 2017 This work is protected by the Swedish Copyright Legislation (Act 1960:729) Dissertation for PhD ISBN: 978-91-7601-803-3 Cover art by Olof Marsja (front) and Erik Marsja (back) Electronic version available at: http://umu.diva-portal.org/ Printed by: UmU Print Service, Umeå University Umeå, Sweden 2017

To Angelica and Elliot, you make me a better person

Table of Contents

Abstract ...... iii Abbreviations ...... iv List of studies ...... v Enkel sammanfattning på svenska ...... vi Introduction ...... 1 Attention ...... 1 Distraction/orienting ...... 2 Electrophysiological responses to deviating stimuli ...... 3 Behavioral responses to deviating stimuli ...... 5 Short-term memory ...... 6 Multicomponent ...... 7 Embedded-processes model ...... 7 Duplex-mechanism account ...... 8 Distraction and short-term memory ...... 8 Violation of predictions and attention capture ...... 9 Multisensory accounts on attention, distraction, and short-term memory ...... 11 Central or sensory-specific mechanisms? ...... 12 Multisensory neural models? ...... 14 Aims ...... 15 Specific aims: ...... 15 Study I:...... 15 Study II: ...... 15 Study III: ...... 15 Materials and methods ...... 16 Subjects ...... 16 Instruments/procedure ...... 16 Materials ...... 16 Crossmodal oddball task (Study I & II) ...... 18 Verbal & spatial serial recall task (Study III) ...... 20 Data analysis ...... 21 Results ...... 23 Study I: ...... 23 Aim ...... 23 Results ...... 23 Study II: ...... 25 Aim ...... 25 Results ...... 25 Conclusion ...... 26 Study III: ...... 27 Aim ...... 27 Results ...... 27

Conclusion ...... 28 Discussion ...... 29 Central or specific mechanisms? ...... 29 Implications for theories on attention/distraction/prediction ...... 30 Implications for theories on short-term memory ...... 31 Practical implications ...... 32 Limitations ...... 32 Conclusions ...... 34 Acknowledgement ...... 35 References ...... 37 Appendix ...... 46 Study 1 ...... 46 Study 2 ...... 46 Study 3 ...... 46

Abstract

The main focus for this thesis was cross-modal attention capture by sudden and unexpected sounds and vibrations, known as deviants, presented in a stream the same to-be-ignored stimulus. More specifically, the thesis takes a multisensory perspective and examines the possible similarities and differences in how deviant vibrations and sounds affect visual task performance (Study I), and whether the deviant and standard stimuli have to be presented within the same modality to capture attention away from visual tasks (Study II). Furthermore, by presenting spatial deviants (changing the source of the stimuli from one side of the body to the other) in audiotactile (bimodal), tactile, and auditory to-be-ignored, it explores whether bimodal stimuli are more salient compared to unimodal (Study III). In addition, Study III tested the claims that short-term memory is domain- specific.

In line with previous research, Study I found that both auditory and tactile deviants captured attention away from the visual task. However, the temporal dynamics between the two modalities seem to differ. That is, it seems like practice causes the effect of vibratory deviants to reduce, whereas this is not the case for auditory deviants. This suggests that there are central mechanisms (detection of the change) and sensory-specific mechanisms.

Study II found that the deviant and standard stimuli must be presented within the same modality. If attention capture by deviants is produced by a mismatch within a neural model predicting upcoming stimuli, the neural model is likely built on stimuli within each modality separately.

The results of Study III revealed that spatial and verbal short-term memory are negatively affected by a spatial change in to-be-ignored sequences, but only when the change is within a bimodal sequence. These results can be taken as evidence for a unitary account of short-term memory (verbal and spatial information stored in the same storage) and that bimodal stimuli may be integrated into a unitary percept that make any change in the stream more salient.

iii

Abbreviations

ERP (Event-Related Potentials) fMRI (functional Magnetic Resonance Imaging)

MMN (Mismatch Negativity)

OOR (Object-Oriented Records)

OR (Orienting Response)

RON (Reorienting Negativity)

SOA (Stimulus-Onset-Asynchrony)

STM (Short-term memory)

ISI (Interstimulus Interval)

ITI (Inter-Trial Interval)

TBI (To-Be-Ignored)

TBR (To-Be-Recalled)

List of studies

I Marsja, E., Neely, G., & Ljungberg, J. K. (Under Review). The Effects of Unexpected and Sudden Vibrations and Sounds Does Not Disrupt Performanice Similarly Over Time. Manuscript

II Marsja, E., Neely, G., & Ljungberg, J. K. (In Press). Investigating Deviance Distraction and the Impact of The Modality of the To-Be-Ignored Stimuli. Experimental Psychology

III Marsja, E., Marsh, J. E., Hansson, P., & Neely, G (Submitted). Examining the Role of Spatial Changes in Bimodal and Uni-Modal To-Be-Ignored Stimuli and How They Affect Short-Term Memory Processes. Manuscript

Enkel sammanfattning på svenska

Denna avhandling fokuserade på hur uppmärksamhet fångas mellan sensoriska modaliteter. Fokus låg på hur plötsliga och oväntade ljud samt vibrationer presenterade i en ström av upprepande stimuli påverkar prestation (dvs., irrelevanta sekvenser). Mer specifikt tar uppsatsen ett multisensoriskt perspektiv och undersöker möjliga likheter och skillnader i hur plötsliga och oväntade vibrationer samt ljud påverkar prestation i en visuell uppgift (Studie I), och huruvida avvikande och upprepande stimuli måste presenteras inom samma sensoriska modalitet för att det avvikande ljudet ska fånga uppmärksamhet (Studie II). Vidare undersöktes hur en oväntad spatial förändring (förändring av presentationskälla från kroppens ena sida till den andra) i en audiotaktil (bimodal), taktil och auditiv stimuli (unimodal) irrelevant sekvens, påverkar spatialt och verbalt korttidsminne. Är oväntade förändringar i bimodala stimuli mer utstickande jämfört med unimodala (Studie III)? Dessutom testade studie III påståenden att korttidsminnet är domänspecifik (lagras spatial och verbal information separat).

I linje med tidigare forskning fann denna avhandling att både plötsliga och oväntade förändringar i auditiva och taktila irrelevanta sekvenser fångade uppmärksamhet bort från den visuella uppgiften. Men den temporala dynamiken mellan de två modaliteterna verkar skilja sig. Det verkar som att upprepad utsättning för plötsliga och oväntade vibrationer gör att den negativa effekten minskar, medan detta inte är fallet i den auditiva modaliteten. Detta tyder på att det finns centrala mekanismer (upptäckt av förändringen) och sensoriskt specifika mekanismer.

Studie II fann att den plötsliga och väntade förändringen måste presenteras i samma modalitet som det upprepande stimulit. Om det som gör att uppmärksamheten fångas är en neural modell som förutsäger inkommande sensorisk information, så byggs den neurala modellen sannolikt av stimuli inom varje modalitet separat.

Resultaten av Studie III visade att spatialt och verbalt korttidsminne påverkas negativt av en oväntad spatial förändring, men endast när förändringen sker inom en bimodal irrelevant sekvens. Dessa resultat är bevis för en enhetlig syn av korttidsminnet (verbal och spatial information lagras enhetligt) och att bimodala stimuli kan integreras i en enhetlig percept som gör den plötsliga spatiala förändringen i den irrelevanta strömmen mer utstickande.

Introduction

Imagine that you are sitting at your favorite café at the train station reading a newspaper. The café is probably an environment packed full of information coming through the different sensory modalities, such as sight, sound, touch, and smell. Although the café is filled with a lot of noises in the background (e.g., people talking), visual information (e.g., flashing lights from the coffee machines), vibrations (e.g., someone moving a chair), and more, you can focus on the task at hand – reading your newspaper. However, when the barista suddenly and unexpectedly drops a big box of coffee on the floor, your focus is diverted due to the sound and the vibrations through the floor. This example illustrates the interplay between the ability to focus attention on the task at hand and, a crucial aspect for survival, attention capture. On one hand, you can focus on the newspaper but on the other hand the sounds and vibrations could be a signal that something dangerous is happening. In the example above, the sudden and unexpected event was not of importance, but it nonetheless distracted you from reading your newspaper. This thesis aims to examine the interplay between focusing of attention and distractibility using a multisensory approach.

Specifically, it examines sudden and unexpected changes, known as deviant or novel events, in expected to-be-ignored (TBI) streams using a multisensory perspective. It starts by examining whether deviant sounds and vibrations capture attention in a functionally similar manner. It then examines whether the expected events in the TBI stream need to be presented within the same modality as the deviant event. Finally, it examines whether multisensory models of the expected sensory stimuli are formed.

Attention This part defines one of the central cognitive constructs for this thesis, namely attention. It is a well-used construct in the field of cognitive psychology and there are many different definitions. One popular definition is that of William James:

“Everyone knows what attention is. It is taking possession of the mind, in clear and vivid form, of one out of what seems several simultaneously possible objects or trains of thought. Focalization, concentration of consciousness is of its essence. It implies a withdrawal from some things to deal effectively with others.” - William James (1890, pp 403-404)

To navigate through the environment that is full of information from different sensory modalities, the human brain must create mental representations (i.e., neural models; e.g., Sokolov, 1963). To do so, the human brain encodes and filters the sensory input by through extraction of statistical regularities and attentional

filtering. The ability to filter certain stimuli out while obtaining information from another stimulus is known as selective attention (e.g., Broadbent, 1958; Driver, 2001; Treisman, 1969). Attention can further be divided into exogenous and endogenous attention. Exogenous attention, also known as stimulus-driven, is when attentional resources are allocated to something external, such as a salient event or a sudden change. Endogenous attention, also known as goal-driven, is when the attentional resources are directed internally such as a goal or directed behavior (Corbetta & Shulman, 2002). This thesis exmained exogenous attention by sudden and unexpected changes. Corbetta and Schulman (2002) put forward two different systems for attention, namely the dorsal and ventral frontoparietal networks. The dorsal system reflects the goal-directed part of attention (i.e., endogenous attention). It thus reflects how cognitive processes such as prior experiences and current goals can control attention. The ventral attention system, on the other hand, reflects the stimulus-driven (i.e., exogenous attention) part of attention. This is the system that is recruited during detection of behaviorally- relevant sensory information– in particular, when the sensory information is of relevance, salient and unattended (Corbetta & Schulman, 2002). This thesis examines exogenous attention by sudden and unexpected changes.

Distraction/orienting The interplay between attention/focusing of attention and distractibility have been of interest for a long time. This section will briefly describe some of the early work that showed that TBI stimuli can affect behavior.

Early studies examining sudden and unexpected changes focused mainly on audition and used the dichotic listening task developed by Cherry (1953). In this task, subjects listen to a passage of speech through headphones. They are then asked to shadow (repeat) the speech presented to one ear and to ignore anything from the other ear. While participants are shadowing the speech, a message is presented in the unattended channel. Cherry (1953) found that the unexpected message presented to the unattended ear disrupted shadowing performance (i.e., more errors were produced). A follow-up study by Moray (1959) found that most of the message presented to the unattended ear is successfully filtered out. Certain properties of the message in the unattended ear, however, captured the subject’s attention (i.e., the subjects own name). The effect of presenting the subject’s own name has been followed up and the results, in line with above mention studies, showed that presenting the subject’s own name serves as a distraction from the shadowing task. When the subject’s own name was presented in the irrelevant channel, one-third of the participants showed increased errors and response lags to the two words subsequent to the name (Wood & Cowan, 1995).

More recent studies have also shown that performance in irrelevant sound experiments using dichotic measurements may be mediated by individual differences. For example, Conway, Cowan, & Bunting (2001) followed up the above-mentioned study with an experiment in which the participant's working memory capacity was measured. The results revealed that subjects with a low working memory capacity were more prone to distraction by the irrelevant message compared to subjects with high working memory capacity. More recently, it was found that presenting the subject’s own name while doing a visual task does not affect performance more compared to another name (Ljungberg, Parmentier, Jones, Marsja, & Neely, 2014). It was suggested that the effect of an unexpected presentation of the subjects’ own name is only a surprise effect.

Another strand of research of interest to this thesis is relates to how sudden and unexpected events in an otherwise repetitive stream of stimuli elicits the orient response (OR). Sokolov (1963) put forward the idea that a repetitive presentation of the same stimulus built up a mental representation, or neural model, based on the properties of the stimulus. When a novel, or unexpected, event is presented, it is compared to the neural model. If this event is salient or of biological relevance, the OR is elicited (Sokolov, 1963). The OR was defined as a number of different physiological responses (e.g., slowing of heart rate and increase in skin conductance; Sokolov, 1963). These responses can, in turn, be accompanied by attention capture.

Following the ideas of Sokolov (1963), more recent research examined the underpinnings of the neural model and attention capture using the oddball paradigm. This strand of work focused for many years on the electrophysiological responses (ERP) to stimulus deviating from an otherwise repetitive stream of sounds. The deviating stimulus is known as a deviant or novel stimulus depending on whether it is the same throughout a task or new at each presentation. The following paragraphs briefly describe the findings from studies using auditory-auditory and auditory-visual oddball tasks. Since the research started off by examining the electrophyisiological markers for the OR, I will start by briefly describing these markers. After that, I will describe the behavioral responses that are relevant to the studies. Furthermore, there are other studies using a unimodal (i.e., using stimuli from only one sensory modality) and a crossmodal (e.g., to-be-ignored stimuli in one modality and task in another) method with visual and somatosensory stimuli. These studies will be presented later in the part examining multisensory aspects.

Electrophysiological responses to deviating stimuli Early research examining the neural underpinnings of attention capture focused mainly on deviant stimuli using an auditory-auditory oddball paradigm in which

subjects are typically exposed to a stream of repetitive sounds. For instance, in a seminal study Näätänen, Gaillard, and Mäntysalo (1978) subjects were asked to perform a dichotic listening task. In this task, subjects were presented with either a frequent, standard stimulus or an infrequent, deviating, stimulus. Both types of sounds were presented to either the left or the right ear. The study found that the deviating stimulus elicited a late negativity in the ERP waveform (i.e., the mismatch negativity; MMN).

Most oddball studies use a method similar to the study by Näätänen, Gaillard, and Mäntysalo (1978) whereby the repetitive stream typically consists of two or more sounds. In most of the trials, the same, standard, stimulus is used. With sudden and unexpected trials, another, novel or deviant, stimulus is presented. However, it is worth noting that more recent studies have used methods in which the deviant stimulus is a deviation in learned perceptual rules (e.g., Schröger, Bendixen, Trujillo-Barreto, & Roeber, 2007).

In the auditory-visual variant, subjects are typically engaged in a task in the visual modality (e.g., categorizing visually presented digits as odd or even) while being exposed to the stream of to-be-ignored (TBI) stimuli. The difference between most unimodal and crossmodal oddball tasks is that the targets and TBI stimuli are perceptually decoupled in the crossmodal settings while they are typically part of the same object in the unimodal settings (see Parmentier, 2014). In most unimodal studies, the task is to categorize the sound (e.g., as long or short; e.g., Berti, 2008). Here, a deviant can be a change of spatial location, or a change in pitch or frequency of the sound. In the crossmodal variant, however, the TBI stimuli are typically presented prior to the visual targets (e.g., Schröger & Wolff, 1998).

Research using both unimodal and crossmodal task settings has found that novel and deviant stimuli elicit a triumvirate of ERPs; MMN/N1, the P3a, and the reorienting negativity. MMN has been extensively researched and is elicited whether the subjects are engaged in a task or not (while reading a book; e.g., Saarinen, Paavilainen, Schöger, Tervaniemi, & Näätänen, 1992; Schröger, Giard, & Wolff, 2000). Furthermore, it has been suggested that the MMN is either a marker for a change-detection mechanism (e.g., the deviant stimulus; Näätänen, Paavilainen, Rinne, & Alho, 2007) or a marker that the change violated the predictions of the neural model (e.g., Bendixen, SanMiguel, & Schröger, 2012; Winkler, 2007). Secondly, it has been suggested that the P3a is a marker for when attention has been oriented to the change (e.g., Berti, 2008; Friedman, Cycowicz, & Gaeta, 2001). Finally, if the subjects are engaged in a primary task, the reorienting negativity (RON) is elicited. This is suggested to be a marker that attention has been reoriented back to the focal task (e.g., Berti, 2008; Schröger & Wolff, 1998).

Behavioral responses to deviating stimuli Behaviorally, the introduction of a novel or deviant stimulus has been shown to prolong response latencies to the focal task. For instance, when subjects are to categorize sounds by their duration (long or short), responses have been shown to be prolonged in deviant trials (e.g. Berti, 2008). In the crossmodal settings, the task is typically categorizing digits as odd or even. Here as well, a prolongation is typically seen after the introduction of a novel or deviant stimulus (see Parmentier, 2014 for an extensive review). The TBI stimuli are presented prior to each target stimuli and a standard stimulus is frequently used (e.g., 80% of trials) whereas a novel or deviant stimulus is used infrequently (e.g., Andrés, Parmentier, & Escera, 2006; Escera, Alho, Winkler, & Näätänen, 1998). The prolongation of response times will henceforth be called deviance distraction. It has been found that deviance distraction is because the processing of the onset of the target is delayed (Parmentier, Elford, Escera, Andrés, & SanMiguel, 2008). In other words, rather than being due to that the processing of the target itself, it seems that attention has been oriented to the change. Parmentier, Elford, Escera, Andrés, & SanMiguel (2008) showed this in a study in which they manipulated the perceptual load in the visual task, increased the number of response choices, and used a visual cue prior to the visual targets. It was found that the visual cue recaptured attention to the visual target. The effects of deviant sound have been reported to be reduced due to practice (Parmentier, 2008; Sörqvist, Nöstl, & Halin, 2012). For instance, Parmentier (2008) found that the effect of deviant sounds was smaller at the end of the task compared to the beginning of the task.

Why are sudden changes distracting? How this attentional shift works has not been fully examined, but Parmentier et al. (2008) put forward three alternative explanations, namely that deviance distraction can be due to a spatial shift, modality shift, and a task set shift.

To elaborate, in the crossmodal oddball task the TBI stimuli, whether standard or deviant, are presented from a different spatial location than the visual targets. For example, in the audio-visual oddball task, the visual targets are presented in front of the subjects on the computer screen whereas the TBI stimuli are typically presented through headphones. When the deviant sound captures attention, focus shifts from the visual targets to the source of the sounds (i.e., the headphones). The deviant sound may also shift attention from the task modality (visual) to the TBI modality (auditory).

However, given that the same mechanisms are underlying deviance distraction in unimodal and crossmodal settings, the spatial and modality shift cannot fully explain what is going on. In unimodal oddball tasks, there is only one modality used (e.g., auditory) and the location of the task and the TBI stimuli are the same. Thus, the spatial and modality shift can only explain crossmodal task settings.

Parmentier et al. (2008) therefore proposed a third possible attentional shift – task set shifting. According to this account, the time penalty is caused by the cognitive system trying to find the right response to the deviant or novel stimulus. This account is based on the assumption that not producing a response is the right response to a deviant or novel sound. Furthermore, this makes the stimulus- response mapping distinguishable from cases where the primary tasks lead to deviant or novel sounds making a person switch task sets (see Parmentier, 2014 for a discussion).

Response latencies are sped up Depending on the task setting, it has also been found that response latencies can be sped up by the presentation of deviating stimuli. SanMiguel, Linden, & Escera (2010) conducted a series of experiments. They found that novel sounds facilitated responses in terms of both decreased response latencies and increased accuracy. SanMiguel and colleagues (2010) suggested that this facilitation was due to a nonspecific alerting signal thought to be part of the orienting response.

Short-term memory In a closely-related paradigm, attention capture was also studied using a similar to the oddball paradigm but using a short-term memory task (e.g., the serial recall task) instead. The main idea is similar to that of the oddball tasks, i.e. subjects are presented the same auditory TBI stimuli in most of the trials. However, the task is more mentally taxing and fewer trials are used in the experiments. In the serial recall task, subjects are typically asked to remember verbal targets (e.g., digits), called to-be-remembered items (TBR), that are presented one at a time on a computer screen. The task is to remember the order of the TBR items. For example, if the digits “7 4 2 1 8 0” are presented in a trial, they are to be remembered in that precise order.

Before defining and describing short-term memory, it may be worth noting that the term short-term memory (STM) is often used in reference to the construct of working memory (WM). A typical STM task is the serial recall task described previously. In WM tasks, the subjects may be exposed to a set of TBR items. However, they are also engaged in another task, unrelated to the serial recall task (e.g., reading sentences). Thus, STM is typically defined as maintenance of TBR items over a brief time period and WM as maintenance plus manipulation. Note, however, that a lot of research has focused solely on the maintenance part of WM (e.g., see a review by D’Esposito, 2007). Other researchers also thoroughly reviewed and meta-analyzed literature using STM and WM tasks in different contexts (Unsworth & Engle, 2007). Unsworth and Engle (2007) concluded that STM and WM reflect the same cognitive process and, therefore, rejected the notion that they were two different constructs.

In this thesis, most literature reviewed use the term STM and simple span tasks (i.e., the serial recall). Thus, the term STM will be used to describe the maintenance of information over a brief time period. Note, however, that some of the models and theories cited use the term WM.

One outstanding issue in relation to short-term memory/working memory relates to whether different kinds of information re held in different stores or in the same store. In other words, it has been questioned whether the STM is domain-general or domain-specific (e.g., Kane et al., 2004; Li, Christ, & Cowan, 2014). Some researchers argue that information held in the STM is held in different domains (e.g., visual, auditory, verbal, spatial; Cocchini, Logie, Della Sala, MacPherson, & Baddeley, 2002), while others acknowledge the existence of domain-specific stores but argue that information also can be stored in a central, domain-general manner (e.g., Li et al., 2014; Saults & Cowan, 2007).

Multicomponent One of the most prominent models of working memory is the multicomponent view of the working memory that posits that there are several modules responsible for different aspects of information kept, and manipulated, over brief periods of time (e.g., see Baddeley, 2012). The components are as follows: the phonological loop, the visuospatial sketchpad, episodic buffer, and the central executive (CE). Verbal (linguistic, speech-like) material is thought to be maintained in the phonological loop, whereas non-verbal material is thought to be maintained in the visuospatial sketchpad. The episodic buffer, on the other hand, has been suggested to be a buffer store for processing multidimensional code or integrated episodes. In addition to these three components, the central executive guides the behavior related to the maintenance of the two components. Importantly, the multicomponent view of short-term memory can be viewed as a domain-specific model of short-term memory.

Embedded-processes model Another prominent model is the embedded-processes account of short-term memory (Cowan, 1988, 1995). In this model, the short-term memory is divided into three hierarchical storages. First, there are subsets of inactivated long-term memory representations. Second, information can be brought into short-term memory by a subset of information from the long-term system being temporarily activated. This activation is limited by time and will decay. Third, a limited focus of attention can cause further subsets of information to receive even more activation and be made particularly salient when they fall under the focus of attention. In this model, a central controller is assumed to provide domain- general processing capacity. Thus, the embedded processing model predicts that

different types of information (verbal, visuospatial) are maintained in a common storage.

Duplex-mechanism account The third account, the object-oriented record (OOR), is also a domain-general model. According to this view, short-term memory representations of verbal, non-verbal sounds as well as verbal and spatial items are stored within the same medium (e.g., Jones, Macken, & Murray, 1993). The short-term memory representations are called objects and are associated with pointers. These pointers are created through rehearsal and store serial order. More recently, the OOR has been superseded by the duplex-mechanism account (Hughes, 2014; Hughes, Vachon, & Jones, 2005).

Distraction and short-term memory As this thesis deals with distraction by sudden and unexpected changes, the next section will describe research related to this and then return to the different short-term memory accounts mentioned in previous sections.

It is well-documented that irrelevant auditory stimuli can affect the ability to reproduce visually presented sequences of verbal items (e.g., see Hughes, 2014 for a review). This is typically referred to the irrelevant speech effect (Colle & Welsh, 1976) or the irrelevant sound effect because non-speech sounds can also produce distraction (Beaman & Jones, 1997) The vast majority of studies (e.g., Hughes, Tremblay, & Jones, 2005; Jones et al., 1993; Röer, Bell, Marsh, & Buchner, 2015; Tremblay, Nicholls, Alford, & Jones, 2000; see Hughes, 2014) have examined how a changing sound sequence (e.g., "ABABABAB") disrupts serial recall compared to a repetitive sequence (e.g., "AAAAAAAA"). There are usually three conditions: the changing-state, the steady-state, and a silence control condition. The changing-state effect is prominent during both encoding and retention of the TBR items. Noteworthy, the steady-state sequence does not typically affect serial recall compared to the silent control condition.

As previously mentioned, there is also an increasing interest in how deviant sounds affects short-term memory. A similar method is used; the subjects are exposed to a sequence of TBI sounds while performing the serial-recall task. In the majority of trials, the TBI sound is the same. Crucially, on some trials there is a deviant sound presented in the TBI sequence (e.g., at the fourth item). For instance, in a study by Hughes, Vachon, and Jones, (2005) a deviation in the inter-stimulus interval between two sounds was shown to have a negative impact on serial-recall (e.g., fewer items were recalled correctly in deviant trials). It has further been found that deviants (e.g., changes from car horn sounds to piano sounds) disrupted performance in verbal but not spatial short-term memory

(Lange, 2005). Furthermore, it has also been found that an unexpected and sudden change from male to female voice have the same negative impact (Hughes, Vachon, & Jones, 2007). More recently, it has also been found that a deviation in spatial source of to TBI stream (i.e., from one ear to the other) disrupt both spatial and verbal short-term memory (Vachon, Labonté, & Marsh, 2017).

Returning to the different models of short-term memory they explain the changing-state and deviation effects a bit differently. Furthermore, broadly they can be categorized into either the unitary (e.g., Cowan, 1988, 1995) or the duplex- mechanism accounts (e.g., Hughes, 2014; Hughes, Hurlstone, Marsh, Vachon, & Jones, 2013; Hughes, Vachon, & Jones, 2007). According to the unitary account both the deviation and the changing-state effects are due to exogenous attentional orienting or attention capture (e.g., Cowan, 1995; Lange, 2005). Changing-state sequences are viewed as a succession of deviants that over, and over, capture attention from TBI items. The duplex-mechanism account, on the other hand, posit that the deviation effect is due to attentional capture and the changing-state effect is due to interference- by-process. According to the duplex-mechanism account changing-state sounds do not capture attention but that "the preattentive and obligatory processing of the order of the changing stimuli conflicts with the deliberate serial rehearsal of the to-be-remembered stimuli" (pp. 1050, Hughes, Vachon, & Jones, 2005).

Violation of predictions and attention capture In this section I will describe a very recent account of the effect of deviant stimuli. This account brings together both the studies focusing on response latencies and short-term memory disruption. Namely, the violation of predictions account of attention capture.

Recently, researchers have focused on investigating the mechanisms of attention capture and asked whether deviants capture attention due to that they are novel or due to that they violate the subjects’ expectations concerning the TBI sequences. Deviant or novel events are most often presented at a relatively lower probability compared to standard events and, therefore, it was assumed that deviants were distracting due to their low base-rate probability. There are, however, at least two other hypothesis that have been put forward; perceptual local change and violation of predictions hypothesis (e.g., Nöstl, Marsh, & Sörqvist, 2012; Parmentier, Elsley, Andrés, & Barceló, 2011). According to perceptual local change, deviant and novel stimuli capture attention due to that there is a physical change from the standard to the deviant (e.g., in pitch). The unexpectedness account, on the other hand, posits that deviant or novel stimuli is capturing attention due to that they are unexpected. That is, our cognitive

system creates feedforward models and is expecting the standard but gets a deviating stimulus.

Other evidence suggesting that a neural model based on the cognitive systems prediction on upcoming events comes from research that have found that simply omitting the standard stimulus elicits the MMN (Horváth, Müller, Weise, & Schröger, 2010; Yabe et al., 2001; Yabe, Tervaniemi, Reinikainen, & Näätänen, 1997). Yabe and colleagues found that MMN (Yabe et al., 1997) and MMNm (the MEG equivalence of MMN; Yabe et al., 2001) was elicited when the standard sound was omitted (e.g., "S-S-S-S- -S-S"). Horváth et al. (2010) reported MMN for an omission of the sound in a continuous stream of sounds (e.g., "SSSSSSS SSSS").

Returning to the short-term memory literature and disruption by deviants, Hughes, Vachon, and Jones (2007) argued that attention is captured because "foregoing predictions do not hold" (pp. 738, Hughes, Vachon, & Jones, 2005). The authors put forward an algorithmic model. According to their model, attention capture is the result of a violation of a pattern or a rule. This account provides an answer to the fact that a violation of a canonical order (e.g., 1 2 3 4 5 7 8) and deviations from locally constructed rules (e.g., 5 4 3 8 7 7 4 6) have been shown to capture attention (e.g., Unger, 1964). More recent evidence was provided by Vachon, Hughes, and Jones (2012). Their study found that a deviant that was a change from one gender to another only captured attention when the subjects had to time form a neural model based on expectations of the voice. In contrast, if the deviant was presented in the beginning of a trial, where the sound would have been novel, the change could not have been expected. However, when the subjects started to expect the voice change (across the experiment) the effect declined. Further, evidence comes from a study by Hughes, Hurlstone, Marsh, Vachon, and Jones (2013) in which the subjects were warned prior to each trial that a deviant would be presented.

Violations of prediction may be what underlies deviance distraction, but is this effect limited to the oddball paradigm? There is a more general theory/framework that fits quite well with this idea, i.e. the predictive coding framework (e.g., Friston, 2010; Quak, London, & Talsma, 2015; Talsma, 2015).

According to the predictive coding framework, there are stochastic models in the brain that are constantly updated based on processed sensory information. Higher-order brain areas provide the lower areas with predictions. These predictions, in turn, have an impact on the processing of the current sensory information. If the incoming sensory information is a mismatch between what is predicted and the actual input, the model needs to be updated. For instance, when unexpected sensory input (like a deviant stimulus) is present, the internal model

of the environment may need to be updated to deal with this change in representations. This continuously updating model has been described as a multisensory working memory representation that is made up of information from different modalities (Quak et al., 2015). To clarify, the representations held in the stochastic models are thought to be amodal (e.g., Talsma, 2015; et al., 2015).

Interestingly, both the auditory (e.g., Schröger et al., 2014; Winkler & Czigler, 2012) and visual MMN (e.g., Stefanics, Kremláček, & Czigler, 2014; Winkler & Czigler, 2012) were interpreted within the context of the predictive coding framework and the MMN is thought to reflect the mismatch between what was predicted (i.e., the standard stimulus) and the actual stimulus (i.e., the deviant).

Multisensory accounts on attention, distraction, and short- term memory Prediction may be something that enables the cognitive system to deal with the vast amount of sensory processing from the environment. It is also worth questioning whether the neural model is created based on input from each sensory modality separately. In this section, I will briefly return to the OR, and discuss evidence from neuroscience and the oddball paradigm that indicate that both multisensory aspects (e.g., central mechanisms) and unisensory aspects (e.g., sensory specific mechanisms) are involved in the detection of unexpected and sudden changes.

According to Sokolov (1963), the OR is non-specific as regards stimulus quality because it can be elicited by sound, light, electrical or thermal stimulation of the skin. Thus, Sokolov (1963) claimed that the OR is modality independent. Sokolov described the OR as a functional system that becomes inactivated by repeated exposure to a stimulus and is reactivated by any change in that stimulus. Furthermore, it has been reported to be elicited by stimuli in one modality presented after a repetitive stream of stimuli in another modality. Houck & Mefferd Jr. (1969), and Smith, Dickel, & Deutsch (1978) exposed subjects to a stream of either standard auditory or standard visual stimuli. After the stream of standard stimuli, a novel stimulus in the opposing modality was presented. Thus, these two studies imply that unexpected stimulus in one modality can be detected even though it is presented in the context of another modality.

More recently, it has been found that certain neural networks are activated by unexpected changes. Importantly, these networks are activated whether it is a salient and/or an unexpected change and share underlying neural networks regardless of which sensory modality the change was presented in. Most relevant is one network, the ventral attention network, which is suggested to be an alerting

system (e.g., Corbetta, Patel, & Shulman, 2008). Furthermore, it is suggested that the ventral attention network is independent of modality (e.g., Corbetta et al., 2008; Macaluso, 2010). In a seminal study, Downar, Crawley, Mikulis, & Davis (2000) used functional magnetic resonance imaging (fMRI) to study which brain regions respond to deviants in the visual, auditory and tactile modalities. It was found that visual, tactile, and auditory deviants activated multimodal networks. More recent evidence comes from a meta-analysis of oddball studies in the auditory and visual modalities (Kim, 2014). The results showed that regardless whether the deviant stimuli were visual or auditory, relevant or irrelevant, they shared common neural networks (i.e., the ventral system). However, there are three reasons be cautious when generalizing to behavioral deviance distraction in crossmodal settings using tactile stimuli. First, all results in the aforementioned studies used unimodal designs (i.e., deviants, standards, and tasks were all presented within the same modality). Second, no study in the meta-analysis by Kim (2014) used tactile stimuli. Third, the methodologies of oddball studies employing fMRI, ERP, as well as behavioral elements differ somewhat. For instance, the temporal resolution of fMRI is lower than with ERP and behavioral studies, which leads to slower presentation rates of the stimuli for fMRI (i.e., longer interstimulus intervals).

Central or sensory-specific mechanisms? Of key interest for this thesis is whether there are similar mechanisms underlying distraction of sudden and unexpected changes. At first glance, if distraction by deviants both in categorizing and cognitive tasks (e.g., short-term memory) is due to violations of expectations, one could propose that this distraction should be independent of modality. Furthermore, and as previously described, research using the crossmodal oddball paradigm as well as serial recall task clearly indicate that the TBI stimuli and the task do not have to be in the same modality. This section will describe research using tactile and visual TBI stimuli.

Deviant or novel presented stimuli in the visual and somatosensory modalities have been reported to elicit responses similar to those in the auditory modality. For instance, visual deviants have been found to elicit the MMN and the P3a (e.g., Berti, Roeber, & Schröger, 2004; Boll & Berti, 2009), and tactile deviants have been shown to elicit the P3a (Knight, 1996). Importantly in relation to this thesis, vibrotactile deviants have also been shown to disrupt performance in visual tasks (Ljungberg & Parmentier, 2012a; Parmentier, Ljungberg, Elsley, & Lindkvist, 2011). This has led some authors to suggest that there are functional similarities between the tactile and auditory modalities (e.g., Ljungberg & Parmentier, 2012). In a study by Ljungberg and Parmentier (2012), subjects were exposed to vibratory and auditory TBI stimuli while performing a visual categorization task. In both modalities, deviants prolonged responses in the visual task. Furthermore,

Ljungberg & Parmentier (2012) found that the temporal dynamics of deviance distraction were similar for auditory and vibratory deviants. That is, although the effect was reduced across the tasks, it did not disappear in either modality.

It is worth noting, however, that there are also some differences between modalities in terms of deviance distraction. Ljungberg and Parmentier (2012) reported a significant correlation between auditory and vibratory deviance distraction concerning accuracy. However, the correlation between response latencies when comparing the distractive effect of sound and vibration deviants failed to reach a significant level. Furthermore, Leiva, Parmentier, and Andrés, (2015) reported a difference between deviance distraction by auditory and visual deviants. In their study, they used visual-visual, auditory-auditory, auditory- visual, and visual-auditory oddball tasks. Crucially, the standard and deviant stimuli were always presented prior to the target in the focal task, whereas in previous research of visual deviance distraction the standard and deviant stimuli were part of the same object (e.g., the target). Auditory deviants significantly affected performance, but the visual deviants did not.

Although the research on how deviations in tactile TBI sequences is sparse, there is a growing interest in how tactile stimuli, for instance, orient attention to visual targets or certain spatial locations. In these research paradigms, the focus is on the facilitation of responses (e.g., operationally defined as faster response latencies) due to tactile cues. For instance, it has been found that subjects respond faster when changes in a search display coincide with a tactile cue (e.g., Ngo & Spence, 2010; Van der Burg, Olivers, Bronkhorst, & Theeuwes, 2009). Furthermore, from an applied perspective, it has been shown that tactile warning systems can capture attention and warn for potential upcoming car collisions (e.g., Ho, Reed, & Spence, 2006; Mohebbi, Gray, & Tan, 2009). Ho, Reed and Spence (2006) examined whether vibrotactile warnings signals could cue subjects of potential front-to-rear-end collision. It was found that vibrotactile cues led to faster breaking responses compared to when there was no cue. Importantly, vibrotactile distractors have also been found to affect performance in visual tasks using other experimental paradigms. Using a crossmodal incongruence task, Spence and Walton, (2005) found that when the vibrotactile elevation was incongruent with the visual elevation, the subjects responded slower. In other words, when the visual target (i.e., flash of a LED light) was presented at a higher location but the vibrotactile target was presented at a lower location (i.e., on the thumb and not the index finger) the responses were slowed. Tactile information can, apparently, affect performance in visual tasks.

Altogether, both auditory and tactile stimuli seem to capture attention in a similar manner. This is true whether the task is disrupted or facilitated.

Multisensory neural models? The fact that there are brain areas activated for changes in the visual, auditory, and tactile modality (e.g., Corbetta, Patel, & Shulman, 2008; Downar, Crawley, Mikulis, & Davis, 2000; Macaluso, 2010) as well as the functional similarities highlighted above (e.g., Ljungberg & Parmentier, 2012) fit well with an idea of a predictive brain that uses feedforward networks for sensory events (e.g., Parmentier et al., 2011; Talsma, 2015; Vachon, Hughes, & Jones, 2012; Winkler & Czigler, 2012). When a sensory event is a mismatch, it violates the prediction and attention may be captured (the model has to be updated; e.g., Talsma, 2015). Such a view transcends sensory distinctions because the computation of expectation does not have to be specific to any sensory modality. Furthermore, there are some limitations of the two studies on vibrotactile deviance distraction (Ljungberg & Parmentier, 2012b; Parmentier, Ljungberg, et al., 2011) that might make conclusions concerning similarities in the tactile and auditory modality premature. In the first study, the exposure to the vibrating TBI stimuli was limited (i.e., for a short period; 3 minutes). Furthermore, in the study by Ljungberg and Parmentier, (2012), the method used may not have been suitable for examining the reduction of deviance distraction over time. In their study, the authors exposed the subjects to auditory and tactile stimuli in blocks of trials. These blocks were alternated in such a way that, although the total exposure time for each modality was 12 minutes, any given block lasted only 3 minutes and then the modality was switched (i.e., three auditory-visual blocks alternated with three vibrotactile-visual blocks).

Although there has been a lot of work invested in examining the cognitive underpinnings of attention capture by deviant events, there are still some critical questions that need to be answered. For one, the evidence of similarities or differences between the effects in different modalities are few and have some methodological limitations. Secondly, investigation is required to determine whether the neural model is built based on sensory information within modalities or between them.

Aims

The overall aim of the empirical work included in this thesis was to investigate attention capture by sudden and unexpected changes (i.e., deviants) in TBI stimuli from a multisensory perspective. More specifically, the studies aimed to explore similarities and differences in how deviant stimuli affect performance crossmodally, whether the neural model makes predictions based on within- modality experience or if it is comprised in a multisensory manner (mismatches in one modality requires the model to be updated).

Specific aims:

Study I: To examine the possible functional similarities and differences of deviance distraction by vibrating deviants to the effects in the auditory modality. Are the temporal dynamics also similar when the subjects are exposed to the TBI vibrations for a longer and uninterrupted period of time?

Study II: To test whether deviance distraction can be elicited even when the standard and deviants are presented in different modalities. Is deviance distraction contingent on the standard and deviant stimuli being presented within the same modality?

Study III: Based on multisensory accounts of STM, examine whether a change in spatial deviants in bimodal TBI sequences is more salient compared to unimodal (i.e., vibrating or auditory). Further, to address the claim of domain-specificity of the STM is addressed; i.e. can a spatial deviant affect both verbal and spatial STM?

Materials and methods

This section describes the methodologies used in all three studies included in the thesis. All experiments included in this thesis used some variation of the crossmodal oddball paradigm using vibratory and auditory TBI stimuli. All sounds used in the experiments were presented through headphones (Study I- III). Vibrations were presented using vibrating handles (Study I-II) or vibrating motors (Study III).

Subjects All subjects in the three studies were persons from the city of Umeå. They were recruited via flyers at Umeå University, an online webpage (Studentkaninen.se), and through Facebook groups. All subjects reported that they had normal or corrected-to-normal sight and hearing (or any known somatosensory deficits).

The first study included 20 people (8 females and 12 males) with a mean and SD age of 25.4. The second study included 30 subjects across 3 experiments. The mean age was 24.63, 23.82, and 24.61 for experiments 1-3, respectively.

The third study included 50 subjects (mean age 27.2) in Experiment 1, 50 subjects (mean age 25.9) in Experiment 2, and 50 subjects (mean age 27.1) in Experiment 3.

Instruments/procedure

Materials Informed consent forms were used to provide brief information on the overarching aim and procedures, potential benefits and risks of participation, specification that participation was voluntary, how the results were to be handle, and contact information for the research. All subjects used Vic Firth sound attenuating headphones and the experiments were run on computers.

Auditory stimuli A sinewave tone with a frequency of 600Hz was used in all three studies. In Study I and III, the sinewave tone was used as a standard TBI stimulus, whereas in Study II it was used as a deviant TBI stimulus. The duration of the sinewave tone was 200 ms in Study I and II and 200 ms in Study III. All sounds were generated with the open-source software Audacity®.

Vibrating handles (Study I & II) A specially-built vibrating device consisting of two handles was used in the first two studies (see Figure 1). Each handle was made of a transparent plastic tube with response buttons on top. The vibrations were caused by a motor inside the tube that spun an eccentric mass on its rotor (See Figure 1).

Figure 1. The specially-built vibrating handles used in Study I & II.

The standard TBI stimulus consisted of a 200 ms vibration with an amplitude and frequency of 2.3 ms2 (R.M.S), and 33 Hz, respectively. In Study I, the deviant and standard stimuli consisted of a 200 ms vibration with an amplitude and frequency of 2.3 m/s2 (R.M.S), and 33 Hz, respectively.

Vibration motors (Study III) The TBI vibrations were delivered through two brushless coin vibration motors (Precision Microdrive’s shaftless and brushless vibration motor, Dura VibeTM, model 910-101, 10 mm, 3 V, 65 mA, 12,500 rpm, 1 g). The vibration motors were driven by a sinusoidal signal at a frequency and amplitude of 240 Hz and 1.8 g (peak-to-peak), respectively. Each motor had a surface area (point of contact) of 74.5 mm and reached maximal rotational speed after approximately 52 ms. Each vibration motor pair was controlled by Arduino UnoTM microcontrollers. The

Arduino Uno microcontrollers were programmed using the Arduino IDETM version 1.66 (www.arduino.cc). See Figure 2 for a photo of one of the vibrating motors.

Figure 2. The coin-like vibration motors used in Study III.

Crossmodal oddball task (Study I & II) In this task, subjects had to categorize digits as odd or even while being exposed to the TBI stimuli. Each trial started with the presentation of the TBI stimuli shortly followed of the visual target (digits one to eight). In the majority of trials, the same standard stimulus was used. On sudden and unexpected occasions, the standard was replaced with a deviant event. The TBI stimuli were presented 100 ms prior to each target stimulus. In Study I, audio-visual and vibrotactile-visual oddball tasks were blocked, i.e. the subjects were first exposed to either 4 blocks of auditory-visual or 4 vibratory-visual oddball tasks (counterbalanced across subjects). The order of the audio-visual and vibrotactile-visual blocks was counterbalanced across participants (see Figure 3 for a schematic overview of a typical trial in Study I).

Figure 3. A schematic overview of a typical trial in Study I. Depending on the block modality, each trial started with the presentation of either a vibration or a sound. In most of the trials, the TBI stimulus (standard) was the same, while on sudden and unexpected occasions the stimulus was replaced with another stimulus (deviant).

In Study 2, the proportion of standard and deviant stimuli were the same as in Study 1, as was the visual task. However, across three experiments the deviant was either a sinewave tone presented at the same time without the standard vibration (Experiment 1), a sinewave tone presented at the same time as the standard vibration (Experiment 2), or an omission of the standard vibration (Experiment 3). See Figure 4 for a schematic overview of a typical trial for each experiment.

Figure 4. A schematic overview of a typical trial in the crossmodal oddball task (Study II). Each trial started with the presentation of a fixation cross and a TBI stimuli. In Experiment 1, it could be either a standard vibration or a deviant sound. In Experiment 2, it could be either a standard vibration or a standard vibration and deviant sound. In Experiment 3, it could be either a standard vibration or the omission of a standard vibration.

Verbal & spatial serial recall task (Study III) In this task, subjects had to remember the order of items presented visually. Each trial started with the presentation of the TBI sequence and the first TBR item. In all three experiments, the TBI sequence consisted of 10 stimuli. Each stimulus had a duration of 400 ms and the ISI was 450 ms. In Experiment 1, the TBI stimuli were made up of vibrations and sinewave tones presented simultaneously. In Experiment 2 and 3, the TBI sequence consisted of only vibrations and sounds, respectively. A roving standard paradigm was employed (e.g., Lange, 2005), i.e. the TBI sequence were presented at one side of the body in the majority of trials. In infrequent trials, the TBI sequence changed side of the body. After this spatial change, the sequence was presented on this side of the body until the next change.

Each TBR item had a duration of 250 ms and was presented on the computer screen one at a time. The interstimulus interval was 350 ms and subjects were asked to respond immediately after the last TBI item was presented. In the verbal task, subjects were presented 7 digits randomly taken from the set 1-9 without replacement. In the spatial task, the TBR items consisted of dots presented at locations randomly taken from a 5x5 matrix without replacement. Both tasks required the subjects to click on the TBR items in the order they were presented. The verbal TBI items were presented in canonical order and the dots were presented in the locations in the 5x5 matrix. After a TBR item was clicked, it changed color from black to green. See Figure 5 for a schematic overview of a typical trial in both tasks.

Figure 5. A schematic overview of a typical trial in both the spatial and verbal serial recall task. Each trial started with the presentation of the TBR items and TBI stimulus. In both tasks, 7 TBR and 10 TBI items were presented. After the last TBI item was presented, the subjects were to give their answers.

Data analysis

Study I and Study II Response latencies and the proportion of correct responses (accuracy) were analyzed. Response latencies were aggregated for correct responses only and response latencies greater than 200 ms were excluded. Response latencies and

the proportion of correct responses were analyzed using repeated measures ANOVAs. In Study I, a 2 (modality: auditory, tactile) x 2 (trial type: standard, deviant) x 4 (block: 1, 2, 3, 4) ANOVA was used to explore any differences or similarities in terms of habituation between the modalities. In Study II, the response latencies and the proportion of correct responses were analyzed separately for each experiment using a one-way ANOVA with the levels of trial type (standard, deviant). In a follow-up analysis using mixed, within and between, ANOVAs were used to explore any interactions across deviant types across the experiments.

Study III The proportion of correctly-recalled items was scored according to a strict criterion whereby an item it had to be recalled in the correct position (i.e., in the exact order the items, whether digits or dots, where presented) in order to be scored as correctly recalled. Each subject’s score was aggregated across trials in each condition and task. The data was then analyzed by the means of a 2 (task: verbal, spatial) x 2 (trial type: standard, deviant) ANOVA.

Data presentation In all experiments presented in this thesis, the mean data, whether response latencies or proportions, was presented in tables or figures. The confidence intervals in both types of data presentation techniques were calculated for within- subject comparisons according to Cousineau (2005) and Morey (2008).

Results

Study I:

Aim The aim was to expand and replicate previous research examining possible similarities and differences (Ljungberg & Parmentier, 2012b; Parmentier, Ljungberg, et al., 2011) between deviance distraction in the tactile and auditory modality. More specifically, the aim was to examine the temporal dynamics of deviance distraction in the two modalities after uninterrupted, repeated exposure within each modality separately.

Based on previous research, it was hypothesized that a) both auditory and vibratory deviants would prolong response latencies, and b) that the temporal dynamics of deviance distraction would be similar for both auditory and vibratory deviants.

Results The subjects performed quite well (M = .87, SD = .34). Subjects responded slower in the visual task following the presentation of a deviant stimuli. Importantly, this was true regardless of the modality of the TBI stimuli. Thus, both auditory and vibratory deviants elicited deviance distraction. Interestingly, the effect of the auditory deviant was more pronounced compared to that of the vibratory deviant.

As can be seen in Figure 6, there was a difference between the modalities in terms of the temporal characteristics of deviance distraction. Whereas the effect of the vibratory deviant was reduced at the end of the task, the same pattern was not found for the auditory deviant.

Figure 6. Results from Study I: Deviance distraction (i.e., the response latencies in the deviant trials subtracted from the response latencies in the standard trials). Error bars represent 95% confidence intervals.

To summarize, the results replicated previous research findings (Ljungberg & Parmentier, 2012b; Parmentier, Ljungberg, et al., 2011). That is, a deviation in vibrating TBI stimuli disrupts performance in visual tasks – attention has been captured. However, it also implies that there are more differences between deviance distraction by auditory and tactile deviants. That is, the effect of the vibrating deviations reduced over time whereas the effect of auditory deviations remained relatively stable. The results do not provide compelling evidence for solely a central mechanism.

Study II:

Aim The aim was to test whether deviance distraction can be elicited even when the standard and deviants are presented in different modalities. Is deviance distraction contingent on the standard and deviant stimuli being presented within the same modality?

Results Performance across all three experiments was good (see Table 1 for mean and 95% confidence interval).

Table 1. Mean and 95% CI (within brackets) accuracy (proportion of correct responses) for all three experiments.

Standard Deviant Experiment 1 0.891 [0.883, 0.899] 0.907 [0.899, 0.915] Experiment 2 0.9 [0.893, 0.907] 0.901 [0.894, 0.908] Experiment 3 0.868 [0.857, 0.879] 0.878 [0.867, 0.889]

A neural model scanning the environment for sensory information and updating sudden and unexpected changes should detect an auditory change even when the predicted input is a vibration, at least if the model is based on sensory input from all sensory modalities stimulated. Overall, the results from the experiments of Study II are not in line with a solely multisensory view of the violation of prediction. They are further not in line with the predictive coding framework. As can be seen in Figure 7, the response latencies were slowed in Experiment 1. Although this may have indicated that the deviant sound captured attention away from the visual task, the results from Experiment 2 and 3 can be used for another interpretation.

In Experiment 1, the standard vibration was omitted when the deviant sound was presented. When the deviant was presented at the same time as the standard vibration, a different pattern emerged. As can be seen in Figure 7 (middle panel), the response latencies for deviant and standard trials were very similar in Experiment 2. The results from Experiment 1 could be explained by the omission of the vibration, not the deviant sound per se, being what captured attention. Experiment 3 showed that a simple omission of the standard vibration could also affect behavior – the response latencies are prolonged.

Figure 7. Response latencies in the standard and deviant trial types. Error bars represent 95% confidence intervals. *p < .00.

Conclusion The results of Study II are not in line with a view that representations and predictions of incoming sensory stimulation are constructed in a modality general manner (e.g., a neural model encompassing information from all modalities). Deviance distraction may be contingent on the deviant and standard stimuli being presented within the same modality. Finally, attention can be captured by a mere omission of a standard stimulus.

Study III:

Aim The aim was to further explore the effects if deviant vibrations in a more complex task. Specifically, the aim was to examine whether spatial deviants (changes from one side of the body to the other), bimodal (vibration and sounds), vibrating, and auditory TBI stimuli can affect short-term memory performance.

Results As can be seen in Figure 8, the results from the first experiment showed that performance dropped when the bimodal stream changed from one side of the body to the other (i.e., the spatial deviant). This implies that a deviation that is a change in the location of the TBI sequence affects both verbal and spatial short- term memory performance. Research examining attention capture in relation to short-term memory is sparse. The first study showed that auditory deviations (e.g., change from one sound to another) only affected verbal short-term memory (Lange, 2005). More recently, it has been shown that both verbal and spatial short-term memory can be disrupted by deviations in the spatial location (Vachon, Labonté, & Marsh, 2017). Experiment 1 and the methodology from Vachon, Labonté, and Marsh (2017) are more similar compared to Lange (2005).

Figure 8. Results from Experiment 1: mean proportion of items correctly recalled in trials with and without a spatial deviant. The left panel depicts the proportion of correctly recalled items in the spatial task and the right panel depicts the proportion of correctly recalled items in the verbal task. Error bars represent 95% confidence intervals.

As can be seen in Figure 9 (left panel), the results from Experiment 2 failed to find an effect when the TBI sequence consisted of only vibrations. This means it is possible that it was spatial change the auditory TBI sequence that had an impact on performance, as already shown by Vachon, Labonté, and Marsh (2017). However, the results from Experiment 3 (see Figure 9, right panel), show that using only spatial deviations in the auditory TBI sequence was also ineffective in terms of disrupting STM performance.

Figure 9. Results from Experiment 2 (left panel) and Experiment 3 (right panel): mean proportion of correctly recalled items with and without a spatial deviant in the spatial and verbal STM tasks. Error bars represent 95% confidence intervals.

Conclusion The results from Study III provide evidence against a domain-general view of short-term memory. The spatial deviant captured attention and thereby recall were disrupted in both the verbal and the spatial task. Bimodal TBI sequences make the spatial change more salient and thus are more effective.

Discussion

The aim of this thesis was to examine the interplay between focusing of attention and distractibility using a multisensory approach. Study I found that there are more differences than previously reported in relation to how deviant vibrations and deviant sounds affect visual task processing. The results indicated that although deviance distraction is found in both modalities, the effect of vibration deviants may be reduced due to practice time whereas sounds are more resilient. Study II implicated deviance distraction may be contingent on the changes in the TBI stream being presented within the same modalities. It was found that a mere omission of a standard vibration captured attention, while the simultaneous presentation of a deviant sound and a standard vibration did not affect performance. Finally, in the third study, a spatial change in a bimodal TBI sequence had a negative impact on both verbal and spatial serial recall. However, the spatial change in only auditory or vibratory TBI sequences did not to affect serial recall.

Central or specific mechanisms? The assumption that there is a central mechanism underlying the detection of sudden and unexpected changes (e.g., deviants) is not fully supported by the results of this thesis – at least in a strict sense. The fact that both auditory and vibratory deviants prolong response latencies suggests that the detection mechanism may be functionally similar. However, the results from Study I suggest that the temporal dynamics may be different between the auditory and tactile modality. This may, in turn, suggest that there are even more sensory- specific mechanisms than previously suggested (Ljungberg & Parmentier, 2012). It may further support the hybrid model proposed by Ljungberg and Parmentier (2012). According to this view, there are both modality-specific and multimodal mechanisms. These two mechanisms influence behavior differently.

The central mechanism may be that the cognitive system does in fact detect irregularities in the environment, i.e. there is an amodal interpretation of the violation of expectation account (e.g., Nöstl et al., 2012; Parmentier, Elsley, Andrés, & Barceló, 2011). Both tactile and auditory deviants capture attention because they violate predictions of the cognitive system. Further evidence for this proposal comes from studies reporting deviance distraction in auditory (e.g., Berti, 2008), visual (when target and deviants are presented simultaneously and are part of the same objects; e.g., Bendixen et al., 2010; Berti & Schröger, 2004; Boll & Berti, 2009; but see Leiva, Parmentier, & Andrés, 2015 for absence of distraction by visual deviants) and tactile (Ljungberg & Parmentier, 2012; Parmentier et al., 2011) modalities. In line with previous research comparing the effects of auditory and vibratory deviants (Ljungberg & Parmentier, 2012), Study

I showed larger effects of deviants in auditory modality compared to tactile modality (39.69 ms vs. 20.7 ms). This may also suggest that the auditory system is more fine-tuned to detect sudden changes (see also Leiva et al., 2015 that suggested that auditory deviants are more potent distractors compared to visual deviants).

Study I found no evidence of deviance distraction reduction over time with auditory deviants, while deviance distraction by vibrating deviants was reduced. If auditory deviants are, in fact, more potent distractors compared to vibrotactile deviants (e.g., Leiva et al., 2015), it may mean that more time is needed to show effects of practice for auditory deviants compared to vibratory deviants. Previous research using auditory TBI stimuli used slightly different methods (Parmentier, 2008; Sörqvist et al., 2012). In both previous studies, the subjects were exposed to the TBI stimuli for a longer period of time. However, Ljungberg and Parmentier (2012) found a significant reduction in deviance distraction using the same total number of trials as in Study I. One explanation could, thus, be that significant reductions in deviance distraction may start appearing around 500 trials. It might be worth acknowledging that the results in the tactile modality could be at the sensory receptor level rather than the cognitive level (i.e., due to practice). The experiment of Study I was not designed to test whether it is a matter of sensory adaptation or practice effect. However, whether it is due to adaptation or practice, it suggests that there are sensory-specific mechanisms underlying deviance distraction. Whether the adaptation occurs on a more local (i.e., receptors in the hands) or central (e.g., at the level of the brain) level might have implications for an amodal violation of expectation account. Adaptation on the level of local receptors would mean that the signals are not sent to the cognitive system, and deviant vibrations might stop being unexpected due to that. Evidence contradicting a full central mechanism has further been reported by Leiva et al. (2015). In their study, they compared behavioral deviance and post- deviance distraction in the auditory and visual modalities. Visual deviance distraction was only found when the subjects directed their attention voluntarily to the TBI images. Auditory deviance distraction, on the other hand, was found regardless of whether the subjects directed their attention to or ignored the TBI sounds.

Implications for theories on attention/distraction/prediction The results from Study II are not in line with an amodal view like that of the violation of predictions account (e.g., a neural model encompassing information from all modalities; Nöstl et al., 2012; Parmentier, Elsley, et al., 2011) or the predictive coding framework (e.g., Friston, 2010; Quak, London, & Talsma, 2015; Talsma, 2015).

Concerning attention and exogenous attention in particular: Do the results mean that the auditory system is unable to detect sounds presented in silence? If there is some danger in the surroundings, may it be an oncoming car signaling with the horn in the silent night, and it were to go undetected, it could have horrible consequences. There is research, however, that suggests that sounds presented in silence are detected. In a study by Berti (2013), the crossmodal oddball paradigm was used with two conditions. In the first condition, he presented a transient distractor sound that was not embedded by the traditional repetitive standard sounds. Berti (2013) reported no evidence of behavioral deviance distraction, but rather that the transient sound elicited the ERP N1 followed by the P3a and the RON. In a follow-up experiment, Berti (2013) used novel sounds as the transient distractor. In this experiment, the transient novels prolonged response latencies and elicited the MMN, the P3a, and the RON. These findings lead Berti (2013) to propose that there is a mechanism for detecting sounds in an otherwise silent context. He further argued that the sound probably needs to be a salient and strong sound (e.g., an environmental sound) to affect behavior.

Implications for theories on short-term memory The results of Study III put forward some problems for the multicomponent view of short-term memory (e.g., Baddeley, 2012; Lange, 2005). If spatial and verbal information are maintained in different storages, a spatial change should only affect spatial serial recall. Study III found the contrary, i.e. when the TBI sequence consisted of both auditory and tactile stimuli a sudden change from one side of the body to the other affected performance negatively. These results could be taken as evidence of both the OOR/duplex-mechanism and embedded-processes accounts (e.g., Cowan, 1995; Hughes, 2014). Both of these unitary views predict that a deviant stimulus, regardless of domain or sensory modality, will interfere and have negative impact on short-term memory performance due to attention capture.

It must be pointed out that the results from Study III could, of course, be explained by the fact that information in the episodic buffer is linked together (e.g., spatial and verbal). It may, however, be more problematic that Study III further found that unisensory streams (tactile or auditory) did not capture attention. These results are somewhat problematic for both the multicomponent view (if the interference takes place in the episodic buffer) and the unitary view of short-term memory. In both views, the spatial deviant should capture attention, regardless of whether it happens within an auditory or a tactile TBI sequence. Returning to the results from Study III, it may be that the TBI sequences, when unimodal, were not salient enough and that the spatial deviant,

thus, was ineffective. In fact, research in other areas suggests that stimuli presented in two or more sensory modalities at the same time may be integrated into one percept (e.g., see Talsma, Senkowski, Soto-Faraco, & Woldorff, 2010). This percept, in turn, may be more salient compared to either of the two stimuli presented alone (unisensory). In fact, there is research indicating that the simultaneous presentation of short-term memoranda is advantageous compared to unimodal presentation. A comparatively recent study (Botta et al., 2011) examined the effects of visual, auditory, and audiovisual cues on short-term memory performance. The auditory cues consisted of a pure tone, the visual cue was a black outlined square, and the audiovisual was both previously described cues presented simultaneously. Furthermore, the cues could either be congruent or incongruent. In the congruent condition, attention was captured towards the spatial location that contained the TBR items, whereas in the incongruent condition attention was captured towards the spatial location opposite to the TBR items. Botta et al. (2011) found that response accuracy increased when the cue was congruent and decreased when the audiovisual cue was incongruent. Crucially, both congruent and incongruent cues had a larger effect when they were bimodal compared to visual cues only.

Practical implications Although this thesis focused on research questions related to basic research, it may be worth briefly discussing practical implications of the results. As mentioned in the introduction, the main focus in relation to tactile attention capture has been on how it facilitates performance in more applied settings like collisions (e.g., warning signals for potential collisions; Ho, Reed, & Spence, 2006; Mohebbi, Gray, & Tan, 2009). Even though the focus here is on the negative impact of unexpected changes in auditory, vibratory, and bimodal stimuli, the context of driving a car may also be used. To elaborate, when driving a car one endogenous attention may be occupied with focusing on the road ahead. An in-car-warning system could warn the driver another car is in the blind spot, for instance, by vibrating on the left shoulder. Instead of changing to the left lane, attention may be captured towards the blind spot and the driver may continue driving in the right lane. Here, sudden vibration was negative for the task (changing lanes) but effective in terms of alerting for the other car. In situations like this, reactions in milliseconds can be of importance.

Limitations The empirical work included in this thesis has several limitations that should be addressed in future research. All studies reported suffer from some general methodological issues. The most important one is that of the TBI stimuli chosen from the different modalities. In relation to the auditory deviant chosen in Study I (white noise), it has been thoroughly used in many auditory-visual oddball

studies (e.g., Ljungberg & Parmentier, 2012; Parmentier, 2016). It can, however, be discussed whether the white noise used elicits a startle response or if it is matter of a deviation in the prediction made. It can be both, of course, but it must be acknowledged that white noise has been reported to elicit the startle response (e.g., Dawson, Hazlett, Filion, Nuechterlein, & Schell, 1993). The startle response is an unconscious response and is associated with negative effect. This, in turn, makes it difficult to draw firm conclusions concerning any similarities or differences, particularly, if the deviant vibration did not induce a startle response. Another aspect (also related to Study I and the stimuli used) is that the change between a sinewave tone and a white noise may be experienced as phenomenologically “larger” compared to a change between vibrations differing in frequency and amplitude (e.g., many subjects described the vibrations used in this thesis as a “slower” and a “faster” vibration).

There are similar concerns in Study II. Here, the change may not have been salient enough to be deemed, by the cognitive system, as of biological relevance. This is especially true when the standard vibration and deviant sound (i.e., the sinewave tone) was presented at the same time. It has been found that if stimuli from two sensory modalities are presented at the same time they will be integrated into one percept. In the case of Study II, this may have been enough to be a mismatch of the neural model. However, if one of the stimuli is more salient than the other, that stimuli will capture attention. Given that the vibration was tightly coupled with the visual target, it may have been more biologically relevant for the subjects and, thus, the effect was overridden. Using another stimulus that is more salient may have had a different effect. It must be pointed out here that a pilot study found no significant difference between how subjects perceived the vibrations and sounds used in Study I in terms of how attention capturing they were.

Another limitation of the first two studies in this thesis is statistical power. The first study used as sample of 20 subjects, while the second used a sample of 30 subjects across the three experiments. This, of course, reduced the chance of detecting a true effect and increased the risk of overestimating the magnitude of the effect.

An attempt was made to minimize these concerns in Study III. Here, the sudden and unexpected change was a matter of location (spatial deviant) rather than a change in the sound’s character. In addition, it used sounds that contained no or minimal semantic properties (i.e., sinewave tones). Moreover, the deviation was spatial in both modalities. Although larger sample sizes were used, and the deviants were more similar in Study III, some limitations must be acknowledged. The performance in the verbal task was better compared to the spatial task. In the oddball-like paradigms, where the deviant is presented in far fewer trials

compared to the standard, there are fewer trials in the spatial task and the measure is less reliable. Furthermore, it is hard to know whether the spatial deviant in both modalities is perceived, or functionally similar, even though it is a matter of a change in spatial location. Does the subject perceive a spatial change in an auditory TBI sequence as distracting as the same change in a vibratory TBI sequence?

Future research should, thus, attempt to equate the crossmodal deviant stimuli of important psychophysical characteristics to help isolate the mechanisms behind any distraction effects. Before examining similarities and differences between deviations in two or more modalities, future research should carefully test both the standard and the deviants used. For instance, psychophysical methods such as crossmodal matching could be used to match the stimuli in terms of intensity or duration. It may be specifically important to match the deviant stimuli used in the different modalities (e.g., so that the perceived change from the standard to the deviant is approximately the same).

Conclusions • One central mechanism underlying attention capture may be to detect irregularities in the environment. • In terms of temporal dynamics, detection of sudden and unexpected changes in auditory and tactile TBI sequences may not be similar. • The deviant and standard stimuli in the TBI sequence need to be presented within the same modality. That is, the results from this thesis do not support the idea that the cognitive system builds a neural model predicting regular sensory events from many sensory modalities at the same time. • A mere omission of an expected vibration can capture attention. • Sudden and unexpected changes in terms of spatial location of TBI sequences affect both verbal and spatial serial recall when the sequence consists of both vibrations and sounds. • Unisensory TBI sequences, either vibratory or auditory, are not salient enough to disrupt short-term memory performance.

Acknowledgement

Being a Ph.D. student is one of the most rewarding and important projects I've experienced in my life ... I have had the good fortune to have a good team of supervisors who made me move on. These tutors clearly deserve recognition for this.

First and foremost, I would like to take this opportunity to thank my main supervisor, Jessica K-Ljungberg, whom I met as a Bachelor student. I'm forever grateful that you saw and still see, the potential in me. Without you, I would never have gotten the opportunity to become a Ph.d. When I got stupid criticism, you've reminded me not to take it personally, but you've also involved me in many projects besides the dissertation project, where I got to my top skills and interests. I have learned a lot from you throughout these years.

Second, I would like to give my assistant supervisor Greg Neely, who has also taught me a lot during my time as a Ph.D. student. Sitting on a meeting with you, without really having any questions, often meant learn from Greg's great knowledge of how the university world works. Apart from a lot of knowledge about the scientific part, this has also been invaluable.

In the last years, I also came to include an additional supervisor; Patrik Hansson. Often knocking on my door, and then sitting in my foothold, he came up with bold and challenging questions about my studies, but also methodological questions regarding his own studies. Towards the end when I wanted to give up, Patrik often knocked on my door and pushed me further. It was needed.

All three supervisors have complemented each other with different types of knowledge and interests. Method, career, and general knowledge of how the university world works is probably something that few Ph.D. students get, all in one package, from their supervisors.

There are clearly a lot of other people who have supported and stimulated me through these years. The first is my father. You have raised me to a skeptic who thinks critically but also stimulated my curiosity. The discussions we have had about the bad pages of the peer review process have, during the periods it felt, made it much easier. It also fell out well!

I also want to thank those who, at an early stage, took the time to read and review a draft of this thesis. Peter Bengtsson, Gustaf Wadenholt, and Linnea Karlsson Wirebring, of your comments and questions, many of them have applied to my dissertation. Thank you for taking the time to read my work.

I was also part of a group of doctoral students in the department of psychology. I would like to thank everyone that is, or has been, part of this group, but there are also some worth mentioning by name. Erik and Andreas, a lot of methods and stats discussion over a beer or two. Gustaf, who came in later, also discussing stats, methods, and cognitive science in general. All three always posed tricky questions stimulating my mind. Markus, we shared room almost the 2 first years. My desk has never been cleaner, and we always had fun! Robert and Hanna, the roomies that always listened to me when needed to chat (not necessarily both at the same time but…)

There are of course some other people that I need to acknowledge. Lars, always positive and never late to join for some good food and beverages.

Last, but not least, I want to thank the rest of my family. Angelica, my girlfriend, for being such an understanding and supporting person. I could not have done this without you! My mom, for coming down taking care of us when we’ve been swamped with work. Invaluable. My brother, Olof, for sharing your art with me. One of your works ended up at the front page!

Erik Marsja, 2017-11-08

References

Andrés, P., Parmentier, F. B. R., & Escera, C. (2006). The effect of age on involuntary capture of attention by irrelevant sounds: A test of the frontal hypothesis of aging. Neuropsychologia, 44(12), 2564–2568. http://doi.org/10.1016/j.neuropsychologia.2006.05.005

Baddeley, A. D. (2012). Working Memory: Theories, Models, and Controversies. Annual Review of Psychology, 63(1), 1–29. http://doi.org/10.1146/annurev-psych-120710-100422

Beaman, C. P., & Jones, D. M. (1997). Role of serial order in the irrelevant speech effect: Tests of the changing-state hypothesis. Journal of Experimental Psychology: Learning, Memory, and Cognition, 23(2), 459–471. http://doi.org/10.1037/0278-7393.23.2.459

Bendixen, A., Grimm, S., Deouell, L. Y., Wetzel, N., Mädebach, A., & Schröger, E. (2010). The time-course of auditory and visual distraction effects in a new crossmodal paradigm. Neuropsychologia, 48(7), 2130–2139. http://doi.org/10.1016/j.neuropsychologia.2010.04.004

Bendixen, A., SanMiguel, I., & Schröger, E. (2012). Early electrophysiological indicators for predictive processing in audition: A review. International Journal of Psychophysiology, 83(2), 120–131. http://doi.org/10.1016/j.ijpsycho.2011.08.003

Berti, S. (2008). Cognitive control after distraction: Event-related brain potentials (ERPs) dissociate between different processes of attentional allocation. Psychophysiology, 45(4), 608–620. http://doi.org/10.1111/j.1469-8986.2008.00660.x

Berti, S. (2013). The role of auditory transient and deviance processing in distraction of task performance: a combined behavioral and event-related brain potential study. Frontiers in Human Neuroscience, 7(July), 352. http://doi.org/10.3389/fnhum.2013.00352

Berti, S., Roeber, U., & Schröger, E. (2004). Bottom-up influences on working memory: Behavioral and electrophysiological distraction varies with distractor strength. Experimental Psychology, 51(4), 249–257. http://doi.org/10.1027/1618-3169.51.4.249

Berti, S., & Schröger, E. (2004). Distraction effects in vision: behavioral and event-related potential indices. Neuroreport, 15(4), 665–669. http://doi.org/10.1097/00001756-200403220-00018

Boll, S., & Berti, S. (2009). Distraction of task-relevant information processing by irrelevant changes in auditory, visual, and bimodal stimulus features: A

behavioral and event-related potential study. Psychophysiology, 46(3), 645–654. http://doi.org/10.1111/j.1469-8986.2009.00803.x

Botta, F., Santangelo, V., Raffone, A., Sanabria, D., Lupiáñez, J., & Belardinelli, M. O. (2011). Multisensory integration affects visuo-spatial working memory. Journal of Experimental Psychology. Human Perception and Performance, 37(4), 1099–1109. http://doi.org/10.1037/a0023513

Broadbent, D. E. (1958). Perception and communication. Elmsford, NY, US: Pergamon Press. http://doi.org/10.1037/10037-000

Cherry, E. C. (1953). Some experiments on The recognition of Speach, with one and with Two Ears. The Journal of the Acoustical Society of America, 25(5), 975–979.

Cocchini, G., Logie, R. H., Della Sala, S., MacPherson, S. E., & Baddeley, A. D. (2002). Concurrent performance of two memory tasks: evidence for domain-specific working memory systems. Memory & Cognition, 30(7), 1086–1095. http://doi.org/10.3758/BF03194326

Colle, H. A., & Welsh, A. (1976). Acoustic masking in primary memory. Journal of Verbal Learning and Verbal Behavior, 15(1), 17–31.

Conway, A. R. a., Cowan, N., & Bunting, M. F. (2001). The cocktail party phenomenon revisited: the importance of working memory capacity. Psychonomic Bulletin & Review, 8(2), 331–5. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/11495122

Corbetta, M., Patel, G., & Shulman, G. L. (2008). The reorienting system of the human brain: from environment to theory of mind. Neuron, 58(3), 306–24. http://doi.org/10.1016/j.neuron.2008.04.017

Corbetta, M., & Shulman, G. L. (2002). Control of goal-directed and stimulus- driven attention in the brain. Nature Reviews. Neuroscience, 3(3), 201–15. http://doi.org/10.1038/nrn755

Cousineau, D. (2005). Confidence intervals in within-subject designs: A simpler solution to Loftus and Masson’s method. Tutorials in Quantitative Methods for Psychology, 1(1), 42–45. http://doi.org/no DOI found

Cowan, N. (1988). Evolving conceptions of memory storage, selective attention, and their mutual constraints within the human information-processing system. Psychological Bulletin, 104(2), 163–191. http://doi.org/10.1037/0033-2909.104.2.163

Cowan, N. (1995). Attention and memory: an integrated framework. Oxford Psychology.

D’Esposito, M. (2007). From cognitive to neural models of working memory. Philosophical Transactions of the Royal Society B: Biological Sciences, 362(1481), 761–772. http://doi.org/10.1098/rstb.2007.2086

Dawson, M. E., Hazlett, E. a, Filion, D. L., Nuechterlein, K. H., & Schell, A. M. (1993). Attention and schizophrenia: impaired modulation of the startle reflex. Journal of Abnormal Psychology, 102(4), 633–641. http://doi.org/10.1037/0021-843X.102.4.633

Downar, J., Crawley, a P., Mikulis, D. J., & Davis, K. D. (2000). A multimodal cortical network for the detection of changes in the sensory environment. Nature Neuroscience, 3(3), 277–83. http://doi.org/10.1038/72991

Driver, J. (2001). A selective review of selective attention research from the past century. British Journal of Psychology (London, England : 1953), 92(Pt 1), 53–78. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/11256770

Escera, C., Alho, K., Winkler, I., & Näätänen, R. (1998). Neural mechanisms of involuntary attention to acoustic novelty and change. Journal of Cognitive Neuroscience, 10(5), 590–604. http://doi.org/10.1162/089892998562997

Friedman, D., Cycowicz, Y. M., & Gaeta, H. (2001). The novelty P3: An event- related brain potential (ERP) sign of the brain’s evaluation of novelty. Neuroscience and Biobehavioral Reviews, 25(4), 355–373. http://doi.org/10.1016/S0149-7634(01)00019-7

Friston, K. (2010). The free-energy principle: a unified brain theory? Nature Reviews. Neuroscience, 11(2), 127–138. http://doi.org/10.1038/nrn2787

Ho, C., Reed, N., & Spence, C. J. (2006). Assessing the effectiveness of “intuitive” vibrotactile warning signals in preventing front-to-rear-end collisions in a driving simulator. Accident; Analysis and Prevention, 38(5), 988–96. http://doi.org/10.1016/j.aap.2006.04.002

Horváth, J., Müller, D., Weise, A., & Schröger, E. (2010). Omission mismatch negativity builds up late. Neuroreport, 21(7), 537–541. http://doi.org/10.1097/WNR.0b013e3283398094

Houck, R. L., & Mefferd Jr., R. B. (1969). Generalization of GSR habituation to mild intramodal stimuli. Psychophysiology, 6, 202–206.

Hughes, R. W. (2014). Auditory distraction: A duplex-mechanism account. PsyCh Journal, 3(1), 30–41. http://doi.org/10.1002/pchj.44

Hughes, R. W., Hurlstone, M. J., Marsh, J. E., Vachon, F., & Jones, D. M. (2013). Cognitive control of auditory distraction: impact of task difficulty,

foreknowledge, and working memory capacity supports duplex-mechanism account. Journal of Experimental Psychology. Human Perception and Performance, 39(2), 539–553. http://doi.org/10.1037/a0029064

Hughes, R. W., Tremblay, S., & Jones, D. M. (2005). Disruption by speech of serial short-term memory: the role of changing-state vowels. Psychonomic Bulletin & Review, 12(5), 886–90. http://doi.org/10.3758/BF03196781

Hughes, R. W., Vachon, F., & Jones, D. M. (2005). Auditory Attentional Capture During Serial Recall: Violations at Encoding of an Algorithm-Based Neural Model? Journal of Experimental Psychology: Learning, Memory, and Cognition, 31(4), 736–749. http://doi.org/10.1037/0278-7393.31.4.736

Hughes, R. W., Vachon, F., & Jones, D. M. (2007). Disruption of short-term memory by changing and deviant sounds: Support for a duplex-mechanism account of auditory distraction. Journal of Experimental Psychology: Learning, Memory, and Cognition, 33(6), 1050–1061. http://doi.org/10.1037/0278-7393.33.6.1050

James, W. (1890). The Principles of Psychology. H. Holt. Retrieved from https://books.google.se/books?id=JLcAAAAAMAAJ

Jones, D. M., Macken, W. J., & Murray, a C. (1993). Disruption of visual short- term memory by changing-state auditory stimuli: the role of segmentation. Memory & Cognition, 21(3), 318–328. http://doi.org/10.3758/BF03208264

Kane, M. J., Hambrick, D. Z., Tuholski, S. W., Wilhelm, O., Payne, T. W., & Engle, R. W. (2004). The generality of working memory capacity: a latent-variable approach to verbal and visuospatial memory span and reasoning. Journal of Experimental Psychology. General, 133(2), 189–217. http://doi.org/10.1037/0096-3445.133.2.189

Kim, H. (2014). Involvement of the dorsal and ventral attention networks in oddball stimulus processing: A meta-analysis. Human Brain Mapping, 35(5), 2265–2284. http://doi.org/10.1002/hbm.22326

Knight, R. T. (1996). Contribution of human hippocampal region to novelty detection. Nature, 383(6597), 256–259. http://doi.org/10.1038/383256a0

Lange, E. B. (2005). Disruption of attention by irrelevant stimuli in serial recall. Journal of Memory and Language, 53(4), 513–531. http://doi.org/10.1016/j.jml.2005.07.002

Leiva, A., Parmentier, F. B. R., & Andrés, P. (2015). Distraction by deviance comparing the effects of auditory and visual deviant stimuli on auditory and visual target processing. Experimental Psychology, 62(1), 54–65.

http://doi.org/10.1027/1618-3169/a000273

Li, D., Christ, S. E., & Cowan, N. (2014). Domain-general and domain-specific functional networks in working memory. NeuroImage, 102, 646–656. http://doi.org/10.1016/j.neuroimage.2014.08.028

Ljungberg, J. K., & Parmentier, F. B. R. (2012a). Cross-modal distraction by deviance: Functional similarities between the auditory and tactile modalities. Experimental Psychology, 59(6), 355–363. http://doi.org/10.1027/1618-3169/a000164

Ljungberg, J. K., & Parmentier, F. B. R. (2012b). Cross-modal distraction by deviance: Functional similarities between the auditory and tactile modalities. Experimental Psychology, 59(6), 355–363. http://doi.org/10.1027/1618-3169/a000164

Ljungberg, J. K., Parmentier, F. B. R., Jones, D. M., Marsja, E., & Neely, G. (2014). “What”s in a name?’ “No more than when it”s mine own’. Evidence from auditory oddball distraction. Acta Psychologica, 150, 161–166. http://doi.org/10.1016/j.actpsy.2014.05.009

Macaluso, E. (2010). Orienting of spatial attention and the interplay between the senses. Cortex, 46(3), 282–297. http://doi.org/10.1016/j.cortex.2009.05.010

Mohebbi, R., Gray, R., & Tan, H. Z. (2009). Driver reaction time to tactile and auditory rear-end collision warnings while talking on a cell phone. Human Factors, 51(1), 102–110. http://doi.org/10.1177/0018720809333517

Moray, N. (1959). Attention in dichotic listening: Affective cues and the influence of instructions. The Quarterly Journal of Experimental Psychology, 11(1), 56–60. http://doi.org/10.1080/17470215908416289

Morey, R. D. (2008). Confidence Intervals from Normalized Data: A correction to Cousineau (2005). Tutorials in Quantitative Methods for Psychology, 4(2), 61–64. http://doi.org/10.20982/tqmp.04.2.p061

Ngo, M. K., & Spence, C. (2010). Auditory, tactile, and multisensory cues facilitate search for dynamic visual stimuli. Attention, Perception, & Psychophysics, 72(6), 1654–1665. http://doi.org/10.3758/APP.72.6.1654

Näätänen, R., Gaillard, A. W. K., & Mäntysalo, S. (1978). Early selective-attention effect on evoked potential reinterpreted. Acta Psychologica, 42(4), 313– 329. http://doi.org/10.1016/0001-6918(78)90006-9

Nöstl, A., Marsh, J. E., & Sörqvist, P. (2012). Expectations Modulate the Magnitude of Attentional Capture by Auditory Events. PLoS ONE, 7(11),

e48569. http://doi.org/10.1371/journal.pone.0048569

Parmentier, F. B. R. (2008). Towards a cognitive model of distraction by auditory novelty: The role of involuntary attention capture and semantic processing. Cognition, 109(3), 345–362. http://doi.org/10.1016/j.cognition.2008.09.005

Parmentier, F. B. R. (2014). The cognitive determinants of behavioral distraction by deviant auditory stimuli: A review. Psychological Research, 78(3), 321– 338. http://doi.org/10.1007/s00426-013-0534-4

Parmentier, F. B. R. (2016). Deviant sounds yield distraction irrespective of the sounds’ informational value. Journal of Experimental Psychology: Human Perception and Performance, 42(6), 837–846. http://doi.org/10.1037/xhp0000195

Parmentier, F. B. R., Elford, G., Escera, C., Andrés, P., & SanMiguel, I. (2008). The cognitive locus of distraction by acoustic novelty in the cross-modal oddball task. Cognition, 106(1), 408–432. http://doi.org/10.1016/j.cognition.2007.03.008

Parmentier, F. B. R., Elsley, J. V., Andrés, P., & Barceló, F. (2011). Why are auditory novels distracting? Contrasting the roles of novelty, violation of expectation and stimulus change. Cognition, 119(3), 374–380. http://doi.org/10.1016/j.cognition.2011.02.001

Parmentier, F. B. R., Ljungberg, J. K., Elsley, J. V., & Lindkvist, M. (2011). A behavioral study of distraction by vibrotactile novelty. Journal of Experimental Psychology. Human Perception and Performance, 37(4), 1134–1139. http://doi.org/10.1037/a0021931

Quak, M., London, R. E., & Talsma, D. (2015). A multisensory perspective of working memory. Frontiers in Human Neuroscience, 9(April), 1–11. http://doi.org/10.3389/fnhum.2015.00197

Röer, J. P., Bell, R., Marsh, J. E., & Buchner, A. (2015). Age equivalence in auditory distraction by changing and deviant speech sounds. Psychology and Aging, 30(4), 849–855. http://doi.org/10.1037/pag0000055

Saarinen, J., Paavilainen, P., Schöger, E., Tervaniemi, M., & Näätänen, R. (1992). Representation of abstract attributes of auditory stimuli in the human brain. Neuroreport. http://doi.org/10.1097/00001756-199212000-00030

SanMiguel, I., Linden, D., & Escera, C. (2010). Attention capture by novel sounds: Distraction versus facilitation. European Journal of Cognitive Psychology, 22(4), 481–515. http://doi.org/10.1080/09541440902930994

Saults, J. S., & Cowan, N. (2007). A central capacity limit to the simultaneous storage of visual and auditory arrays in working memory. Journal of Experimental Psychology. General, 136(4), 663–684. http://doi.org/10.1037/0096-3445.136.4.663

Schröger, E., Bendixen, A., Denham, S. L., Mill, R. W., Bohm, T. M., & Winkler, I. (2014). Predictive regularity representations in violation detection and auditory stream segregation: From conceptual to computational models. Brain Topography, 27(4), 565–577. http://doi.org/10.1007/s10548-013- 0334-6

Schröger, E., Bendixen, A., Trujillo-Barreto, N. J., & Roeber, U. (2007). Processing of abstract rule violations in audition. PloS One, 2(11), e1131. http://doi.org/10.1371/journal.pone.0001131

Schröger, E., Giard, M. H., & Wolff, C. (2000). Auditory distraction: Event- related potential and behavioral indices. Clinical Neurophysiology, 111(8), 1450–1460. http://doi.org/10.1016/S1388-2457(00)00337-0

Schröger, E., & Wolff, C. (1998). Behavioral and electrophysiological effects of task-irrelevant sound change: a new distraction paradigm. Brain Research. Cognitive Brain Research, 7(1), 71–87. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/9714745

Smith, B. D., Dickel, M., & Deutsch, S. (1978). Overextinction and Test Stimulus Modality Determinants of Dishabituation. Psychophysiology, 15(4), 324– 329. http://doi.org/10.1111/j.1469-8986.1978.tb01388.x

Sokolov, E. (1963). Higher nervous functions; the orienting reflex. Annual Review of Physiology, 25(September), 545–580. http://doi.org/10.1146/annurev.ph.25.030163.002553

Spence, C., & Walton, M. (2005). On the inability to ignore touch when responding to vision in the crossmodal congruency task. Acta Psychologica, 118(1–2 SPEC. ISS.), 47–70. http://doi.org/10.1016/j.actpsy.2004.10.003

Stefanics, G., Kremláček, J., & Czigler, I. (2014). Visual mismatch negativity: a predictive coding view. Frontiers in Human Neuroscience, 8(September), 666. http://doi.org/10.3389/fnhum.2014.00666

Sörqvist, P., Nöstl, A., & Halin, N. (2012). Working memory capacity modulates habituation rate: Evidence from a cross-modal auditory distraction paradigm. Psychonomic Bulletin & Review, 19(2), 245–250. http://doi.org/10.3758/s13423-011-0203-9

Talsma, D. (2015). Predictive coding and multisensory integration: an attentional account of the multisensory mind. Frontiers in Integrative Neuroscience,

9(March), 19. http://doi.org/10.3389/fnint.2015.00019

Talsma, D., Senkowski, D., Soto-Faraco, S., & Woldorff, M. G. (2010). The multifaceted interplay between attention and multisensory integration. Trends in Cognitive Sciences, 14(9), 400–410. http://doi.org/10.1016/j.tics.2010.06.008

Treisman, a M. (1969). Strategies and models of selective attention. Psychological Review, 76(3), 282–299. http://doi.org/10.1037/h0027242

Tremblay, S., Nicholls, A. P., Alford, D., & Jones, D. M. (2000). The irrelevant sound effect: Does speech play a special role? Journal of Experimental Psychology: Learning, Memory, and Cognition, 26(6), 1750–1754. http://doi.org/10.1037/0278-7393.26.6.1750

Unger, S. M. (1964). Habituation of the vasoconstrictive orienting reaction. Journal of Experimental Psychology, 67(1), 11–18. http://doi.org/10.1037/h0044510

Unsworth, N., & Engle, R. W. (2007). On the division of short-term and working memory: an examination of simple and complex span and their relation to higher order abilities. Psychological Bulletin, 133(6), 1038–66. http://doi.org/10.1037/0033-2909.133.6.1038

Vachon, F., Hughes, R. W., & Jones, D. M. (2012). Broken expectations: Violation of expectancies, not novelty, captures auditory attention. Journal of Experimental Psychology: Learning, Memory, and Cognition, 38(1), 164– 177. http://doi.org/10.1037/a0025054

Vachon, F., Labonté, K., & Marsh, J. E. (2017). Attentional capture by deviant sounds: A noncontingent form of auditory distraction? Journal of Experimental Psychology: Learning, Memory, and Cognition, 43(4), 622– 634. http://doi.org/10.1037/xlm0000330

Van der Burg, E., Olivers, C. N. L., Bronkhorst, A. W., & Theeuwes, J. (2009). Poke and pop: Tactile-visual synchrony increases visual saliency. Neuroscience Letters, 450(1), 60–64. http://doi.org/10.1016/j.neulet.2008.11.002

Winkler, I. (2007). Interpreting the mismatch negativity. Journal of Psychophysiology, 21(3–4), 147–163. http://doi.org/10.1027/0269- 8803.21.34.147

Winkler, I., & Czigler, I. (2012). Evidence from auditory and visual event-related potential (ERP) studies of deviance detection (MMN and vMMN) linking predictive coding theories and perceptual object representations. International Journal of Psychophysiology, 83(2), 132–143.

http://doi.org/10.1016/j.ijpsycho.2011.10.001

Wood, N. L., & Cowan, N. (1995). The cocktail party phenomenon revisited: attention and memory in the classic selective listening procedure of Cherry (1953). Journal of Experimental Psychology. General, 124(3), 243–62. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/7673862

Yabe, H., Tervaniemi, M., Reinikainen, K., & Näätänen, R. (1997). Temporal window of integration revealed by MMN to sound omission. NeuroReport, 8(8), 1971–1974. http://doi.org/10.1097/00001756-199705260-00035

Yabe, H., Winkler, I., Czigler, I., Koyama, S., Kakigi, R., Sutoh, T., … Kaneko, S. (2001). Organizing sound sequences in the human brain: The interplay of auditory streaming and temporal integration. Brain Research, 897(1–2), 222–227. http://doi.org/10.1016/S0006-8993(01)02224-7

Appendix

Study 1

Study 2

Study 3