The Pennsylvania State University

The Graduate School

College of the Liberal Arts

ORIENTING VISUAL ATTENTION IN SPACE:

IS CAPTURE OF ATTENTION PURELY STIMULUS-DRIVEN OR CONTINGENT UPON GOAL-DRIVEN SETTINGS?

A Thesis in

Psychology

by

Peggy Chen

© 2007 Peggy Chen

Submitted in Partial Fulfillment of the Requirements for the Degree of

Doctor of Philosophy

May 2007

The thesis of Peggy Chen was reviewed and approved* by the following:

J. Toby Mordkoff Associate Professor of Psychology Thesis Advisor Chair of Committee

Cathleen Moore Associate Professor of Psychology

Rick Gilmore Associate Professor of Psychology

Octavia Camps Associate Professor of Electrical Engineering

Melvin Mark Professor of Psychology Head of the Department of Psychology

*Signatures are on file in the Graduate School

ABSTRACT

Visuospatial attention refers to the selection of stimuli that appear in the location to which attention is oriented. One way to allocate attention to a certain location (in the absence of eye-movements) is driven by stimuli. This occurs when a salient, external stimulus captures attention, such as when attention is drawn to a bright, flashing light.

The present study focuses on the question: Is attention captured by any and all salient stimuli, or only by stimuli that contain the attribute that defines the target? Two hypotheses have been presented to answer this question. Contingent Capture argues that a stimulus can capture attention only when it shares some attribute with the target or with what is currently relevant to the task. In opposition to Contingent Capture, Rapid Disengagement posits that stimuli that do not contain the target-defining attribute can capture attention, but only briefly. Two independent experiments were conducted. The behavioral experiment provided findings consistent with Contingent Capture. However, the psychophysiological experiment did not provide an answer to the addressed question. Therefore, evidence up to this point favors Contingent Capture over Rapid Disengagement. That is, attentional capture is modulated by goal-driven control settings. At the same time, however, the present work also raises questions concerning the approaches that have been employed in the study of visual attention. Future lines of research are suggested.

TABLE OF CONTENTS

LIST OF FIGURES ...... vi
LIST OF TABLES ...... ix
ACKNOWLEDGEMENTS ...... x

Chapter 1 Introduction ...... 1
    Attention ...... 1
    Visual Attention ...... 2
    Visuospatial Attention ...... 2
    Attentional Capture ...... 7
    Folk, Johnston, & Remington, 1992 ...... 13
    An alternative to Contingent Capture: Rapid Disengagement ...... 16
    To Test Rapid Disengagement ...... 17

Chapter 2 Contingent Cuing with a Short SOA ...... 19
    Short-SOA Experiments ...... 20
    Short but not Zero SOA ...... 23
    Modifications of Folk et al. (1992) Parameters ...... 27
    Experiment I ...... 29
        Method ...... 29
        Results ...... 33
        Discussion ...... 35

Chapter 3 Background: the Choice of ...... 38
    Neurophysiological Methods ...... 39
    Event-Related Potentials ...... 41
    N2pc ...... 47

Chapter 4 Contingent Cuing with ERP ...... 52
    Prior ERP Experiments ...... 53
    N2pc: Index of Attentional Capture ...... 56
    Experiment II ...... 58
        Method ...... 58
        Behavioral Results and Discussion ...... 60
        Electrophysiological Results and Discussion ...... 64

Chapter 5 General Discussion ...... 69
    Summary ...... 69
    Why is the N2pc Absent? ...... 72
    Conclusion ...... 75

Bibliography ...... 77

LIST OF FIGURES

Figure 1-1: The exogenous cuing paradigm used in Folk, Remington, and Johnston (1992). The actual display was presented on a black background. All black lines in this figure were actually white in the real presentation. There were two types of Target: Onset (Figure 1A) and Color (Figure 1B), and two types of Cue: Onset and Color. Cue Attributes and Target Attributes were combined to yield four cue-target situations: Onset Target preceded by a congruent (onset) Cue or incongruent (color) Cue; Color Target following a congruent (color) cue or incongruent (onset) cue...... 15

Figure 1-2: Mean response times as a function of Target Attribute, Cue Attribute, and Cue Validity in Folk et al. (1992). (Adapted from Figure 2 in Folk et al., 1992)...... 16

Figure 2-1: With the current experiment setting of contingent cuing, what displays would look like if the cue and the target occurred simultaneously. Divided by the dash line, the left panel shows the congruent condition as the cue and the target share the same attribute (onset or color). The right panel shows the incongruent condition as the cue does not contain the target- defining attribute. In each condition, examples of valid and invalid cuing are given...... 24

Figure 2-2: The actual display was presented on a black background. All black lines in this figure were actually white or grey in the real presentation. There were two types of Target: Onset (Figure 1A) and Color (Figure 1B), and two types of Cue: Onset and Color. Cue Attributes and Target Attributes were combined to yield four cue-target situations: Onset Target preceded by a congruent (onset) Cue or incongruent (color) Cue; Color Target following a congruent (color) cue or incongruent (onset) cue...... 31

Figure 2-3: Mean response times as a function of Target Attribute, Cue Attribute, and Cue Validity...... 34

Figure 3-1: The international 10-20 electrode system. 10-20 refers to the placing of electrodes at sites 10% and 20% from certain anatomical landmarks as shown on the left. The right side of the figure depicts the electrode position from viewing the top of the head. The letters refer to areas of the , F for frontal, C for central, T for temporal, P for parietal, and O for occipital. Numerical subscripts indicate the brain hemisphere (odd numbers for left and even numbers for right). Subscript “z” indicates “zenith,” referring to the midline placement...... 42

Figure 3-2: Extraction of the ERP waveform from the ongoing EEG. Time zero is locked to the stimulus onset. (a) Stimuli (1… N) are presented while the EEG is being recorded, but the specific response to each stimulus is too small to be seen in the noisy EEG. (b) To isolate the ERP from the ongoing EEG, the EEG epochs or segments are extracted and averaged to create the ERP waveform. Note that negative voltage is plotted up by convention. The waveforms are hypothetical. (Adapted from Luck et al., 2000)...... 44

Figure 3-3: Amplitude modulation of early ERP components in visuospatial attention. Time zero is locked to the flash onset. (a) Subject fixated the center plus sign while a stimulus flashed in a random order. In each trial block, subjects attended to stimuli in only one quadrant. That is, subject endogenously attended to one location and made a response if the target (the flashing stimulus) was in the attended location. The scalp current distribution map, shown on the schematic head, was generated based on the P1 component recorded from each electrode. (b) Averaged from EEG epochs recorded over the right posterior visual cortex, ERP waveforms to lower left flashes were shown when these flashes were attended (solid line) and ignored (dotted line). (Adapted from Mangun et al., 1993) ...... 47

Figure 3-4: Example of an N2pc experiment. (a) Stimulus arrays with a target (the white bar) on the right or left. (b) Hypothetical waveforms recorded from posterior electrode sites over the left and right hemispheres. By convention, negative is plotted upward. The waveforms from the left hemisphere are more negative-going for right visual field (RVF) targets than for left visual field (LVF) targets, and the waveforms from the right hemisphere are more negative-going for LVF targets than for RVF targets. (c) Waveforms that have been collapsed across hemispheres into ipsilateral and contralateral waveforms. This difference in amplitude between the ipsilateral and contralateral waveforms defines the N2pc component and is indicated by the shaded region. (Adapted from Woodman & Luck, 2003)...... 48

Figure 3-5: ERP waveforms for each combination of target condition and singleton type in Girelli & Luck (1997). The N2pc component is defined by a difference in voltage between trials on which the target is contralateral versus ipsilateral to the electrode site (OL or OR), and this difference is shaded. Negative is plotted upward in this and all subsequent figures. (Adapted from Girelli & Luck, 1997)...... 51

Figure 4-1: Hypothetical temporal sequence of stimuli and the elicited ERP components. Each tick marks 50 ms, with time running from left to right. The upper time line depicts the likely overlap between the P1 and the N2pc if the cue-to-target SOA is 150 ms (as used in Folk et al., 1992). The lower time line shows that the N2pc can be separated from the P1, and thus observed much more clearly, when the SOA is 250 ms. The present experiment uses 250 ms as the SOA...... 57

Figure 4-2: The 128-channel Hydrocel Geodesic Sensor Net (marked with electrode numbers in green circles) mapped with approximately equivalent positions of the international 10-20 system (light blue; see Figure 3-1) and two additional standard electrodes (yellow)...... 59

Figure 4-3: Mean response times as a function of Target Attribute, Cue Attribute, and Cue Validity...... 62

Figure 4-4: Grand average event-related brain potentials for all four cue-target combinations. Each tick on the horizontal axis marks 100 ms, and each tick on the vertical axis marks 1 µV. Time zero is locked to the cue onset. Negative voltage is plotted up by convention. The green boxes are the time windows for statistical analyses...... 66

LIST OF TABLES

Table 2-1: Mean response times (in ms) and error rates (in %) in each cue-target situation...... 34

Table 3-1: Comparisons among the most commonly used neurophysiological methods in cognitive psychology. (Adapted from Table 1 in Volkow, Rosen, & Farde, 1997)...... 39

Table 4-1: Mean response times (in ms) and error rates (in %) at each cue-target situation...... 62

ACKNOWLEDGEMENTS

This project has been supported by Grants-in-Aid of Research for Doctoral Dissertations awarded by Sigma Xi, and Dissertation Support awarded by the College of Liberal Arts, the Pennsylvania State University. I thank Charles Folk, Elisabeth Hein, Rose Halterman, Lyndsey Lanagan-Leitzel, Justin Eroh, and the undergraduate members of the Vision & Information Processing Lab for their comments and assistance on this work. Many thanks are also given to my mother, Nikki Chiang, for her unconditional support and encouragement.

Chapter 1

Introduction

Attention

One constantly receives stimuli from the environment via various sensory receptors. However, one is not able to process all available information coming from the world. Nor can one process all available information that is already in the cognitive system. In short, human information processing is limited. For example, no more than approximately seven items can be held in working memory (e.g., memorizing a novel string of numbers), and multiple tasks cannot be performed as well as a single task (e.g., talking on the phone while driving). Contemplating two trains of thought at the same time is hardly possible (e.g., converting Fahrenheit to Celsius while looking for directions in an unfamiliar neighborhood). To prevent the capacity-limited central system from being overloaded, a certain mechanism somewhere along the processing stream gives priority to some stimuli and discards other stimuli. This process of selection is called attention.

Visual Attention

In the domain of vision, processing capacity is limited as well. One sees everything in a scene, which is often crowded with a great variety of objects and events, but one does not scrutinize every single object or study every stimulus in detail. “… [M]ore is seen than can be remembered …” (Sperling, 1960, p. 1) expresses the limitation of iconic memory and visual short-term memory. Visual attention selects important or interesting visual stimuli, which consequently are processed further down the stream and stored in memory, briefly or for a lengthy time depending on how important or interesting the stimuli are. One moves attention from face to face, rather than from chair to chair, when looking for a friend in a café, for example. But one is also distracted by an ant that crosses the computer screen when typing. Which aspect of the visual stimuli is selected can vary, depending on the environment, the task, and the mental state of the observer.

Visuospatial Attention

Visual attention shifts in space. Visuospatial attention refers to selection of stimuli that appear in the location to which attention is oriented. Note that it is not the location that is selected; rather, it is stimuli that are selected because the limited capacity of processing is allocated to the location. For example, suppose one walks down a hallway just as an elevator opens. If one pays attention to the elevator, the person who steps out of the elevator is recognized, maybe as a coworker. If one does not pay attention to the elevator, then a person is seen coming out of the elevator, but the identity of the person might not be encoded. Another example involving elevators: one is waiting for a friend, who will arrive from one of eight elevators in the hall (of a hotel or an office building). Every time an elevator opens, one looks to the elevator, and attention stays there to see whether this is the friend. If the friend happens to show up unexpectedly (e.g., walking down the stairs instead of riding in an elevator), one may not see the friend immediately.

The above examples demonstrate that there are many ways to decide where to allocate attention. One way is to pay attention to a specific stimulus, and other stimuli in the same location are selected. Another way is to deploy attention to a specific location, and the processing of stimuli in this location is facilitated.

How people decide where to allocate attention or make attentional shifts has interested researchers for many years. Attentional shifts can be directed voluntarily or involuntarily. Orienting is endogenous when one voluntarily directs attention to a certain location: the source of control is internal, generating a voluntary shift of attention in a top-down fashion. Returning to the waiting-for-a-friend example, if it is known which elevator will be taken, one can deploy attention to that elevator before it opens. In contrast, exogenous orienting is triggered by external stimuli in a bottom-up fashion. That is, an involuntary shift occurs when a certain stimulus captures attention, even though sometimes one does not have the intention to allow this to happen. When one pays attention to any elevator that opens, peripheral vision might sense that the door of the stairway has opened. If attention is captured by the door, the friend is identified, even though he or she has unexpectedly chosen to walk down the stairs. In daily life, attentional shifts are rarely dependent on just top-down factors or bottom-up information. Again, back to the example, the chance of one’s friend walking out of an open elevator is very high (i.e., a top-down factor) but which elevator is going to open is uncertain (i.e., bottom-up information). Attention is shifted by integrating orienting parameters from these two types of components.

Before moving on to how the control of visuospatial attention is studied, one thing has to be clarified. There is a close connection between eye movements and shifts of attention, and the two usually occur together. But daily life experience supports the idea that people are able to use “peripheral vision” to attend to a location other than current eye fixation. That is, knowing eye position does not necessarily reveal where attention is located; furthermore, knowing how the eyes are moved does not necessarily reveal how attention is moved. Attentional shifts and eye movements are confounded if there is no way to study them separately. Fortunately, in the 19th century, scientists started to demonstrate empirically that attentional shifts and eye movements can be separated. Hermann von Helmholtz ([1856-1867] 1924-1925), for example, conducted an experiment showing that he (his only participant in the experiment) was able to identify stimuli in a location to which attention was voluntarily shifted, while his eyes remained fixated and not aligned with attention. William James (1890) also noted that one is able to attend to a visual stimulus, voluntarily or involuntarily, without moving the eyes. Not until much later, however, did intense empirical research on visuospatial attention begin. Michael Posner and colleagues were among the pioneers who devoted themselves to the study of the so-called “mind’s eye” (e.g., Posner, Nissen, & Ogden, 1978; Posner & Snyder, 1975; Posner, Snyder, & Davidson, 1980). To be specific, it is an overt shift of visuospatial attention when attention and the eyes are aligned, and a covert shift when the eyes do not move (Posner, 1980). For the sake of convenience, all displacements of visuospatial attention mentioned in the rest of this dissertation are covert, and the word “attention” will be used interchangeably with the phrase “visuospatial attention” unless otherwise specified. Recall that there are two ways to shift attention, endogenous (voluntary) and exogenous (involuntary) orienting; both types of attentional shift can be accomplished without eye movements.

Purely endogenous orienting and purely exogenous orienting are at the two extremes of the spectrum of how attention is shifted from location to location. In order to investigate the control of visual attention, researchers have tried to design experiments that exclude one type of orienting in order to focus on the other type. That is, either purely endogenous orienting or purely exogenous orienting is studied. Purely endogenous orienting means that the initiation of an attentional shift is free from external stimuli, and is under control rather than occurring automatically. Purely exogenous orienting, on the contrary, is a reflex to a stimulus, and is not contingent upon internal conditions. Empirically, if endogenous orienting is under study, exogenous information that may be involved in attentional shifts is minimized; if exogenous orienting is under study, endogenous factors are to be avoided.

It has been demonstrated that attention can be deployed to a blank space without the presence of any physical object (e.g., Posner et al., 1978), which indicates the existence of a pure form of endogenous orienting. However, does a pure form of exogenous orienting exist? Orienting is exogenous when an attentional shift is triggered by a stimulus: attention is captured to a location that is no more likely than chance to contain a target, so the shift is not necessary, and it is not even helpful to attend to that location. This definition does not say exogenous orienting is free from endogenous factors; it simply says that stimulus-driven-ness and spatial uncertainty of an attentional shift categorize this attentional shift as being exogenous. Orienting is purely exogenous when an attentional shift is determined by stimulus-driven-ness and spatial uncertainty and nothing else, that is, when attentional capture by stimuli has nothing to do with endogenous factors (e.g., strategies, knowledge of what to look for, being absorbed in a train of thought, etc.). There have been arguments back and forth between psychologists as to whether attentional capture can be purely exogenous or always has to involve endogenous settings.

Attentional Capture

To make a strong case for purely exogenous orienting, stimulus-driven-ness and spatial uncertainty are stressed, especially the salience of the stimulus. If a stimulus can capture attention regardless of task relevance, then it is evidence of purely exogenous orienting. This stimulus must be salient, physically standing out from others, such as the moon surrounded by numerous stars in the sky or a police car siren in the middle of rush-hour traffic. There is evidence showing that attention can be oriented by a stimulus that does not provide any helpful information to perform a task (e.g., Jonides, 1981; Posner & Cohen, 1984). In the now-classic paradigm, Posner and Cohen (1984) flashed one of the possible target locations shortly before presenting the target (i.e., the stimulus-onset asynchrony was less than 200 ms). The location of the flash was uncorrelated with the target location. However, it was found that the response to the target was faster when the target location had been flashed (valid-cue trial) than when it had not (invalid-cue trial). This difference in response time (RT) has been taken as an index of attentional capture by the flash. Previously, Jonides (1981) used arrowheads that did not reliably predict where the target would be. An arrowhead could be presented right beside one of the possible target locations or in the center. Using the same measure (i.e., RT difference) developed by Posner’s research group (Posner et al., 1978), Jonides found that observers were able to ignore the central arrowhead but were highly influenced by arrowheads presented in the periphery where the target might appear by coincidence. It has been interpreted that the focus of attention was moved from fixation to the uninformative stimulus in both studies, Posner and Cohen (1984) and Jonides (1981), even though neither type of stimulus (i.e., a flash or an arrowhead) actually predicted the target’s location. That is, attention was automatically and reflexively captured by these cues. Since then, this has been a dominant theme in the study of attentional control: attentional capture by peripheral stimuli is automatic and reflexive. Later on, attention research moved on to the question of what kinds of stimuli can capture attention automatically.
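The RT difference described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic behind the measure: the function name `cuing_effect` and all response times are hypothetical, invented for this example, and are not data from Posner and Cohen (1984) or Jonides (1981).

```python
from statistics import mean


def cuing_effect(valid_rts, invalid_rts):
    """Return mean invalid-cue RT minus mean valid-cue RT, in ms.

    A positive value means responses were faster after valid cues,
    which is the pattern taken as an index of attentional capture.
    """
    return mean(invalid_rts) - mean(valid_rts)


# Hypothetical response times (ms), invented for illustration only.
valid_rts = [480, 500, 520]
invalid_rts = [530, 550, 570]

effect = cuing_effect(valid_rts, invalid_rts)
print(f"Cuing effect: {effect} ms")
```

With these invented values the valid-cue mean is 500 ms and the invalid-cue mean is 550 ms, so the sketch reports a positive cuing effect of 50 ms.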

Theeuwes and colleagues suggest that the stimulus with the highest saliency in a display always captures attention in a purely stimulus-driven or bottom-up fashion (Theeuwes, 1991b, 1992, 1994, 1996, 2004; Theeuwes, Atchley, & Kramer, 2000; Theeuwes & Godijn, 2002). The argument is based on a series of experiments in which observers performed a so-called additional singleton task. The target was embedded among many non-targets. The target was defined as a certain type of singleton, for example, a shape singleton such that the target was a diamond shape and non-targets were circles. Half of the time, an additional singleton from another dimension appeared within the stimulus array; for example, the critical distractor was a color singleton (e.g., red) while other stimuli including the target were all the same color (e.g., green). Responses were faster when the critical distractor was absent than when it was present. As a result, Theeuwes et al. argued that attentional focus was automatically shifted to the distractor because of its high saliency compared to other stimuli. These authors proposed the following hypothesis: Preattentively, stimulus dimensions (e.g., color, shape, luminance, and onset property) are all processed in parallel. In preattentive processing, there are temporal differences among stimulus attributes, which correspond to different degrees of saliency. Attention is captured by the feature that first completes preattentive processing (i.e., the most salient feature). This attentional capture by the most salient feature is unaffected by task nature or by knowledge of what defines the actual target. If this feature happens to be the feature that defines the target, the stimulus at that location is processed further until a response is given to the reported attribute (e.g., the orientation of the line segment in the target location). If the most salient feature is not contained by the target, attention is automatically switched to the next most salient feature.

Being physically salient or noticeable is not necessary to capture attention. Yantis and colleagues believe that certain kinds of stimuli definitely capture attention. Particularly, a dynamic discontinuity in a display (e.g., an abrupt onset or appearance of a new object) automatically captures attention while a static discontinuity (e.g., a color singleton) is easily detectable but does not capture attentional resources as a dynamic discontinuity does (Jonides & Yantis, 1988; Yantis & Hillstrom, 1994; Yantis & Jonides, 1984; Yantis & Johnson, 1990). Yantis et al.’s studies employed a visual search paradigm with an irrelevant feature. The logic of the visual search paradigm is that when the target is not a pop-out or an obvious singleton, the search time for the target increases linearly with the number of stimuli in a display (i.e., with display size). When the target is a pop-out, the search time is independent of the display size (Treisman & Gelade, 1980). That is, unlike Theeuwes’s additional singleton task, the target-defining attribute was not a singleton in Yantis’s studies. Participants serially (behaviorally speaking) searched for the target letter when the target was like non-target letters that appeared gradually from the camouflage of a placeholder (with the evidence that RT increased as the display size increased). Sometimes the target was a static discontinuity (e.g., a unique color) irrelevant to the task. Sometimes the target appeared in a previously blank space so that the target was an abrupt-onset stimulus (i.e., a dynamic discontinuity) compared with non-targets. It was found that, in the condition of static discontinuity, the function relating RT to display size was not as steep as in standard visual search where the target and non-targets are not easy to tell apart. But the search time did increase as the display size increased. In the condition of dynamic discontinuity, the search time was independent of the display size. These results were taken to show that attention is captured automatically by an abrupt onset or a new object that creates dynamic discontinuity, but not by other types of singletons generating static discontinuity. Although abrupt onsets have been emphasized in Yantis et al.’s research, abrupt offsets are able to capture attention almost as well as abrupt onsets (Miller, 1989; Theeuwes, 1991a).
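The logic of the visual search paradigm rests on the slope of the function relating RT to display size. As a minimal sketch, assuming an ordinary least-squares fit of mean RT against display size, the slope can be computed as follows. The function name `search_slope`, the display sizes, and the RT values are all hypothetical and chosen only to contrast a serial-looking search with a pop-out; they are not values from Yantis et al.’s studies.

```python
def search_slope(display_sizes, mean_rts):
    """Least-squares slope of mean RT (ms) against display size (ms per item)."""
    n = len(display_sizes)
    mean_x = sum(display_sizes) / n
    mean_y = sum(mean_rts) / n
    # Covariance and variance terms of the simple linear regression.
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(display_sizes, mean_rts))
    sxx = sum((x - mean_x) ** 2 for x in display_sizes)
    return sxy / sxx


# Hypothetical values, invented for illustration only.
display_sizes = [3, 5, 7]
serial_rts = [560, 620, 680]  # RT grows with display size
popout_rts = [500, 500, 500]  # RT is flat across display sizes

print(search_slope(display_sizes, serial_rts))  # slope for the serial-looking search
print(search_slope(display_sizes, popout_rts))  # slope for the pop-out search
```

In this sketch the first slope is 30 ms per item and the second is zero. A slope near zero is the signature of search time being independent of display size, which is the pattern reported for the dynamic-discontinuity condition.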

To examine whether a static discontinuity (especially the most salient singleton) can capture attention as automatically as Theeuwes et al. demonstrated, Bacon and Egeth (1994) adapted the additional singleton task and found that high saliency was not necessary to capture attention. Originally, on each trial of the additional singleton task, one singleton was the target (e.g., a shape singleton such as a circle), and the additional singleton was the distractor (e.g., a color singleton such as red) while other stimuli were homogeneous (e.g., green diamonds). Keeping the critical trials in which a distractor (e.g., a red diamond) appeared, Bacon and Egeth increased the number of targets in some trials (e.g., one to three circles in an array with diamond non-targets), which resulted in no evidence of attentional capture as compared with trials with one target and one distractor. In the second manipulation, Bacon and Egeth increased the number of singletons that shared the same dimension as the target-defining singleton (e.g., the target was a circle while most of the non-targets were diamonds, with a square and/or a triangle also present), mixed with the one-target/one-distractor condition. Again, attention was not captured by the most salient feature (i.e., the distractor). The authors concluded that knowledge of a specific feature was not used by participants in Theeuwes et al.’s studies because participants found that the easiest approach, called singleton-detection mode, was sufficient to do the task. This is why attention appeared to be captured by the distractor. In Bacon and Egeth’s experiments, however, knowledge of the target feature definitely was needed to optimize task performance, and thus another approach, called feature-search mode, was employed by the participants. This is why the distractor now did not capture attention. Note that the distractors used in both laboratories were physically identical. This indicates that as task demands change, participants change their approach to the task. That is, the most salient feature does not always capture attention. Thus, if an experiment is not designed properly (even though the instructions are clearly stated to participants) to induce a cognitively demanding state such as the feature-search mode, participants might use an easier approach, such as the singleton-detection mode, simply because they can. The feature-search mode, as well as the singleton-detection mode, is considered to be an endogenous factor that influences the control of attentional shifts. In other words, attentional capture by stimuli is not purely exogenous. However, controversy remains (Theeuwes, 2005; Theeuwes & Godijn, 2002).

Folk and colleagues lead another line of research that challenges the hypothesis of purely exogenous orienting (Folk & Annett, 1994; Folk, Johnston, & Remington, 1992; Folk & Remington, 1998, 1999, 2006; Folk, Remington, & Wright, 1994; Remington, Folk, & McLean, 2001). Contingent Capture, also known as the contingent involuntary orienting hypothesis, does not deny the definition of exogenous orienting (attentional capture is triggered by a stimulus), but stresses that attentional capture cannot be a purely bottom-up process. Endogenous factors, such as strategies, goals, and recent experience, build up a control setting for the purpose of the current task. This setting does not indicate where to move attention (if one knows where to attend, then it is endogenous rather than exogenous orienting), but modulates whether exogenous orienting is triggered by a stimulus. With a control setting established, stimuli or events that contain the critical attributes (i.e., target-defining features) will summon attention without one’s volition, whether or not the stimuli or events are actually relevant to task performance. Attention is set to be captured by any information that helps locate the target and thus helps fulfill the task demand, and is not captured absolutely by specific stimulus properties, such as the abruptness of onset. Since Folk et al.’s experimental paradigm is to be used in the present work, the methodology will be reviewed in detail.

Folk, Johnston, & Remington, 1992

Folk and colleagues (1992) used an exogenous cuing paradigm (i.e., an uninformative peripheral cue precedes the presentation of the target) in a task that involved the identification of a target stimulus and a speeded response. The target was defined by a salient feature that distinguished it from other stimuli in the same display (e.g., the target was a red item while other items in the same display were uncolored). Participants were required to make a choice response to the target (if one character, press one key; if the other, press the other key). Shortly before the target display (150 ms), a display containing a feature-salient stimulus, the cue, was presented. The cue provided no information about the target location or the correct response. The condition was congruent when the cue contained the attribute that defined the target, and incongruent when it did not (e.g., when the target was colored, a colored cue would be congruent and an onset cue would be incongruent). The sequence of the displays is illustrated in Figure 1-1.

When the locations of the cue and the target were signaled by the same feature (i.e., the congruent condition), a positive cuing effect was found in RT. When the cue location and the target location were signaled by different features (incongruent), there was no reliable cuing effect. That is, participants responded more quickly to the validly cued target than to the invalidly cued target in the congruent condition, but not in the incongruent condition (Figure 1-2). A positive cuing effect is an index of attentional capture in an exogenous cuing paradigm (recall the review of Posner’s work in the preceding part of this chapter). The pattern of RT, namely the opposite two-way interactions (between cue attribute and cue validity as a function of target attribute), suggests that attentional capture depended on congruency between cue attribute and target attribute; that is, a congruent cue captured attention, leading to the cuing effect, while an incongruent cue did not capture attention, and thus there was no cuing effect on RT.

Consequently, the Contingent Capture hypothesis was proposed: endogenous factors establish a control setting that modulates the exogenous orienting that is triggered by stimuli. Under conditions of spatial uncertainty about the target location (i.e., the critical criterion for exogenous orienting), endogenous knowledge builds up control settings for the purpose of the current task. Thus, attention is “set” to be captured selectively by an attribute that is relevant to optimal task performance. The results also suggest that the most salient stimulus in a display does not always capture attention (cf. Theeuwes, 1991b, 1992), and that dynamic discontinuity is not a unique property that definitely captures attention (cf. Jonides & Yantis, 1988).


A. Onset Target

B. Color Target

Figure 1-1: The exogenous cuing paradigm used in Folk, Remington, and Johnston (1992). The actual display was presented on a black background. All black lines in this figure were actually white in the real presentation. There were two types of Target: Onset (Figure 1-1A) and Color (Figure 1-1B), and two types of Cue: Onset and Color. Cue Attributes and Target Attributes were combined to yield four cue-target situations: Onset Target preceded by a congruent (onset) Cue or incongruent (color) Cue; Color Target following a congruent (color) cue or incongruent (onset) cue.


Figure 1-2: Mean response times as a function of Target Attribute, Cue Attribute, and Cue Validity in Folk et al. (1992). (Adapted from Figure 2 in Folk et al., 1992)

An alternative to Contingent Capture: Rapid Disengagement

Theeuwes and colleagues (Theeuwes, 1994; Theeuwes, Atchley, & Kramer, 2000) have provided an alternative interpretation of the findings of Folk et al. (1992). They agree that knowledge of the target-defining attribute does represent an endogenous influence. However, they do not agree on how this endogenous knowledge affects performance. Theeuwes et al. argue that the knowledge of the target-defining attribute does not operate on the orienting of attention, but on certain processes that occur after attention is allocated to the stimulus (i.e., those processes that discriminate between targets and non-targets). In addition, as mentioned previously, Theeuwes et al. believe that the most salient stimulus always captures attention. Therefore, attention should have been captured by both congruent and incongruent cues in Folk et al. (1992) because the cue was always the most salient stimulus in the display. On this account, there was no evidence of attentional capture in the incongruent condition because an incongruent cue captured attention only very briefly. Once top-down knowledge had been used to determine that the cue did not contain the target-defining attribute, attention was immediately disengaged, and the system returned to a neutral state. That is, attentional dwell time (i.e., the time attention must remain focused on an object before shifting to another location) for stimuli that do not contain similar features to the target is much shorter than for those that do.

Consequently, attention is briefly captured by and then rapidly disengaged from a salient but totally irrelevant stimulus. By the time the target appeared, attention was ready to be captured by the target, and thus no cuing effect was reflected in RT.

To Test Rapid Disengagement

The broad goal of this research is to investigate the control of visual attention by studying exogenous orienting of visual attention. Under the condition of spatial uncertainty, attention is captured by a stimulus. It seems as though the presence of the stimulus controls the attentional shift. However, it is unclear whether this shift is solely determined by exogenous information (i.e., the stimulus) or by integrating factors from exogenous and endogenous sources. That is, is attentional capture purely stimulus-driven or contingent upon goal-driven settings? Making use of the contingent cuing task developed by Folk et al. (1992), the question is adapted to the task: Is attention captured by a cue that does not contain the attribute defining the target? Rapid Disengagement argues that a cue which is the most salient stimulus in a display will capture attention, independent of the attribute congruency between the cue and the target. To answer this question by testing Rapid Disengagement, two converging experiments (reported in Chapters 2 and 4) were conducted. Experiment I is a behavioral experiment very similar to Folk et al. (1992), but one critical parameter – the cue-to-target SOA – is changed to be very short, so short that there should not be enough time for rapid disengagement to occur if attention is actually captured by an incongruent cue. In Experiment II, a psychophysiological method is used to measure “online” responses before a physical response is made. Rapid disengagement may be so rapid as to be invisible in behavior, but evidence of attentional capture may be observable in psychophysiological measures. If neither of the experiments provides evidence for Rapid Disengagement, Contingent Capture will be preferred as a closer description of one’s mental experience of exogenous orienting of visual attention. If one of the experiments fails to demonstrate Contingent Capture, it is possible that a pure form of exogenous orienting does exist.

Chapter 2

Contingent Cuing with a Short SOA

A cuing effect on response time (RT) is used to infer attentional capture. As reported by Folk and colleagues (1992), whether there is a cuing effect is contingent on whether the cue contains the target-defining attribute, which generates the two-way interactions between cue attribute and cue validity as a function of target attribute (see Figure 1-2). According to Contingent Capture, an incongruent cue (i.e., a cue that does not contain the target-defining attribute) does not capture attention and thus no cuing effect is found. The RT pattern of the two-way interactions becomes the signature of Contingent Capture.

Alternatively, an incongruent cue might capture attention so briefly (because the mismatch between the cue attribute and the target attribute causes a rapid disengagement of attention) that no cuing effect is reflected in RT (Theeuwes, 1994; Theeuwes et al., 2000). In other words, attention was shifted twice – once to the incongruent cue and then back to fixation – during the 150 ms of cue-to-target stimulus onset asynchrony (SOA) employed in Folk et al. (1992). By the time the target appeared, attention was ready to be captured by the target, and thus no evidence of capture was found in the incongruent condition, even though capture occurred.

Shortening the SOA is one approach to testing Rapid Disengagement. A short SOA may prevent attentional disengagement before the target is presented (if the incongruent cue does capture attention). Folk and Remington (1998) and Lamy (2005) have used this approach. However, there are shortcomings in all of their experiments, which are pointed out in the next section.

Short-SOA Experiments

Folk & Remington (1998, Experiment 4). The cue-to-target SOA was shortened from 150 to 100 and 50 ms. Participants performed in one incongruent condition where the cue did not contain the target-defining attribute (i.e., the target was green while the cue was red). The three SOA values were randomly mixed within each trial block. Consistently, no cuing effect was found when the SOA was 150 or 100 ms. There was a small, nonsignificant effect when the SOA was 50 ms. Overall, they did not find evidence of attentional capture by incongruent cues even with the shortest SOA in the experiment (i.e., 50 ms). Therefore, the authors concluded that Rapid Disengagement was unlikely to account for the null effect in the incongruent condition.

Lamy (2005, Experiment 1). Lamy used a short SOA (50 ms) along with two longer SOAs (175 and 300 ms). As in Folk and Remington (1998), participants in Lamy’s study performed in one incongruent condition, where the target was red and the cue was an onset stimulus (i.e., a set of grey dots surrounding a placeholder). The three SOA values were either randomly mixed within a trial block, or one SOA was assigned to all trials in the block. When the short SOA (50 ms) was mixed with the other SOA values, a cuing effect was found; when it was blocked (i.e., a constant SOA of 50 ms throughout the whole trial block), cue validity showed no effect. Lamy suggested that when the contingent cuing task includes not only spatial uncertainty (i.e., unpredictable target location) but also temporal uncertainty (i.e., unpredictable timing for target appearance), attention can be captured by a cue that does not contain the target-defining attribute.

However, we found the above-mentioned experiments were not well-designed to examine the Rapid Disengagement account for the following reasons.

First, only the incongruent condition was included. That is, participants only performed in the incongruent condition where the cue and the target did not share the same attribute. It is possible that these participants (Folk & Remington, 1998, Experiment 4; Lamy, 2005) were more experienced in ignoring the irrelevant cue than participants in other experiments that consisted of both congruent and incongruent conditions (e.g., Folk et al., 1992; Folk & Remington, 1998, Experiments 1, 2, & 3). Thus, no reliable evidence for attentional capture was shown in those short-SOA experiments while in other relatively-long-SOA experiments there were cuing effects at least in the congruent condition, resulting in the signature of contingent capture (i.e., the two-way interactions).

Second, the shortest SOA was equal to the cue duration (i.e., 50 ms). This means that the cue went off at the time when the target appeared. If the cue captures attention simply because of its salience, the compulsion to capture attention may be interrupted by the offset of the cue. In addition, a number of studies have shown that exogenous cuing is not effective when the cue and the target do not overlap in time (e.g., Collie et al., 2000; Maruff et al., 1999; cf. Berger et al., 1999). With an SOA that is equal to the cue duration, the lack of a cuing effect in the incongruent condition may result from the discontinuation of the cue.

Third, mixing various SOAs added uncertainty to the task. When the SOA was 50 ms and mixed with other SOAs, there was a trend toward a cuing effect in Folk and Remington’s Experiment 4 (1998) and a significant cuing effect in Lamy’s Experiment 1 (2005). This does not necessarily mean that Contingent Capture failed. It may result from the fact that various SOAs were used in these particular experiments. Previously, Folk and colleagues reported evidence of Contingent Capture under the circumstance in which the SOA was a constant rather than a variable so that the only uncertainty in the task was the target location. When there was only spatial uncertainty, spatial orienting by exogenous stimuli was found to be modulated by cue-target congruency. However, when temporal uncertainty was added to the task (i.e., various SOAs), attentional control settings may have been changed, which made attention less reluctant to be captured by incongruent cues and initiated the most convenient strategy (e.g., “the singleton detection mode” of Bacon & Egeth, 1994).

Like previous research, the present experiment tests Rapid Disengagement with a short SOA, while avoiding the issues noted above and optimizing the experimental design to serve the purpose of the study. Thus, participants would be exposed to both congruent and incongruent conditions. A fixed time window would be given between the cue and the target, and the SOA would not be the same as the cue duration. The choice of this SOA requires some further consideration.


Short but not Zero SOA

Presenting the cue and the target simultaneously is the extreme case of a short SOA (i.e., 0 SOA). Figure 2-1 illustrates the displays that would result if the cue and the target appeared at the same time in the cuing paradigm of the current study. There are, however, concerns about simultaneous occurrence of the cue and the target. When more than one visual event (defined as changes, appearances, or disappearances of visual objects) occurs at the same instant, the events will compete for the capacity of parallel preattentive processing. Consequently, the RT to the target event will be slowed down. This is called filtering cost (Kahneman, Treisman, & Burkell, 1983; also see Lupiáñez & Weaver, 1998). In the contingent cuing paradigm, if the cue and the target appear simultaneously in the same location (i.e., valid condition), the cue and the target are likely to be perceived as a single event, while two events will be perceived in an invalid condition. Therefore, even though the invalid cue may not draw attention, it will compete for parallel preattentive processing, and the allocation of attention to the target will suffer interference, which ultimately will slow down the response. As a result, filtering cost will be reflected in an RT pattern where the RT for the invalidly cued target is greater than that for the validly cued target regardless of the congruency between the cue and the target. This RT pattern is also predicted by Rapid Disengagement. Specifically, simultaneously presenting the cue and the target (i.e., 0 SOA) may impair the two-way interactions that are the signature of Contingent Capture, and this impairment may result from filtering cost rather than attentional capture.

Figure 2-1: What the displays would look like in the current contingent cuing setup if the cue and the target occurred simultaneously. The dashed line divides the two panels: the left panel shows the congruent condition, in which the cue and the target share the same attribute (onset or color), and the right panel shows the incongruent condition, in which the cue does not contain the target-defining attribute. In each condition, examples of valid and invalid cuing are given.

To be the sole account of a cuing effect in the current paradigm when the cue and the target are presented simultaneously, the filtering cost hypothesis has to be extended so that the cost results from all preattentive processes. The reasoning of filtering cost is more obvious for the onset-cue-onset-target condition, as shown in the upper left panel of Figure 2-1, than for the other conditions involving color stimuli. In those conditions with color stimuli, there will always be at least four events occurring at the same time, and they will compete for processing. Thus, there should not be any cost or RT differences between the valid and the invalid conditions. One could argue that the odd-colored stimulus (i.e., the purple dots or the purple character) is a unique event against its background, and so is the onset stimulus; that is, the purple stimulus and the onset stimulus are both discontinuities of certain sorts. Therefore, the cost will be generated when the two discontinuities happen in different locations (invalid) rather than in the same location (valid). This argument places more weight on preattentive processes such as figure-ground segregation and complicates the filtering cost hypothesis. According to the filtering cost hypothesis, various events (which are figures but not the background) are processed simultaneously up to the representation stage. Before attention is allocated to any event, these figures are grouped into homogeneous sets (e.g., all green stimuli) and rare ones (e.g., purple stimuli), and the homogeneous stimuli become part of the background, with only the outstanding stimuli being the “figure” or “event.” Eventually, the outcome of all preattentive processes is that there is only one event in the valid condition and two events in the invalid condition. The fewer the events, the less time is required for processing the target. Thus, the cuing effect is not caused by attentional allocation but by filtering cost or the outcome of preattentive processing.

The possibility of masking should also be a concern if the cue and the target appear at the same time. It has been found that a mask that does not match the target configuration or spatially overlap with the target is still able to retard target processing (Enns & Di Lollo, 1997; Lleras & Moore, 2003). Usually this configurationally unmatched or spatially non-overlapping mask elicits the greatest effects when the mask offsets later than the target (such as object substitution masking, see Di Lollo, Enns, & Rensink, 2000), but there is evidence of masking in the condition where the mask and the target appear and disappear simultaneously (Enns & Di Lollo, 1997). If the cue and the target appear and disappear simultaneously in the same location, the cue may function as a mask and contaminate the processing of the target, which produces a cost for the validly cued target compared to the invalidly cued target. In the trials with a color cue (the lower four displays in Figure 2-1), all four placeholders are surrounded by four dots; thus, in both valid and invalid trials the target stimulus may be degraded because of masking.

Furthermore, if masking happens, the effect will be greater in the congruent than in the incongruent condition because similar attributes are more likely than different attributes to be integrated into the same object representation (Moore & Lleras, 2005). Recall that the cue in the contingent cuing paradigm is supposed to capture attention but not act as a mask for the target. However, if the cue captures attention, the masking effect will not occur. This is supported by the finding that no masking occurred when attention was directed to the mask location (Enns & Di Lollo, 1997). Thus, one of two outcomes will occur: either the cue captures attention and facilitates target processing, or the cue masks the target and inhibits target processing.

Nonetheless, the odds of masking effects interfering with the effects of attentional capture in the current cuing paradigm are very small. The task of a typical masking experiment is very different from that of the current experiment. In the former, the target is difficult to discriminate from the non-targets, while in the latter, the characters are very distinguishable from each other. This difference in the nature of the task makes masking unlikely in the current study. It was found that stimuli above the visibility threshold were not easily masked by a 4-dot mask even when the mask was presented very close in time to the target (Enns, 2004).

Although there is no hard evidence that filtering cost or masking interferes with attentional capture in this particular paradigm developed by Folk et al. (1992), careful forethought was given before using a very short SOA. We replicated the experimental design of Folk et al. (1992) and used a very short but non-zero SOA such that the cue and the target were perceived as separate events. Also, as suggested by previous research, temporal overlap between the cue and the target increases the likelihood of obtaining a cuing effect (e.g., Maruff et al., 1999). In one pilot experiment, any SOA value less than 35 ms gave the impression of simultaneous onset. Thus, with 50 ms of cue duration and target duration, the SOA was fixed to 35 ms, so that there was a 15 ms overlap between the cue and the target (see Figure 2-2).

Modifications of Folk et al. (1992) Parameters

One issue in the study of attentional capture is whether a dynamic discontinuity is superior to a static discontinuity in terms of attracting attentional capacity (e.g., Jonides & Yantis, 1988). Folk et al. (1992) showed that the criterion for attentional capture is not the stimulus property but whether this property matches the control setting or has task relevance. It was shown that a dynamic discontinuity (i.e., an onset cue) captured attention only when the target was defined as an onset item, but no evidence of capture was revealed when the target was a color item. However, in a static discontinuity display, the color item was not only statically discontinuous from the other items, but also dynamically discontinuous because its luminance was unique. As shown in Figure 1-1, when the target was defined as the red item, the other characters in boxes were achromatic. To control for this in the current study, the unique color item in a static discontinuity display (i.e., the purple stimulus) is accompanied by other colored items with a different color (e.g., green) that are perceived as having the same luminance as the target.

Finally, in order to keep the experimental display constant throughout the study, including the behavioral experiment (Experiment I) and the psychophysiological ERP experiment (Experiment II), peripheral squares (placeholders) are arranged in the four quadrants centered at the fixation, instead of in a diamond configuration as in Folk et al. (1992). The fixation is a dot instead of another square (compare Figures 1-1 and 2-2).

Experiment I

Method

Subjects. Sixty-four individuals (with an average age of 22.1; 42 women; 8 left-handed) were recruited at the Pennsylvania State University. All of the participants reported normal or corrected-to-normal visual acuity and color vision, and they were unaware of the purpose of the experiment before being tested. Half of the participants received US $6 as compensation, and the other half received partial course credit.

Apparatus. The experiment took place in a dim room. Stimuli were presented on a 15” color monitor in SVGA mode. Trial events and data collection were controlled by an IBM-compatible, Pentium-based personal computer. The refresh rate was 85 Hz. The experiment was run by customized MatLab software. A keyboard was used for the input of subjects’ responses.

Stimuli. Each trial consisted of four sequential displays: fixation display, cue display, cue-and-target display, and target display (see Figure 2-2). The fixation display consisted of a central fixation dot, 0.34 dg of visual angle in size at a viewing distance of approximately 50 cm, flanked by four squares (1.15 dg in width) located about 5 dg from the fixation dot. The fixation dot and the squares were gray (RGB code: [129, 129, 129]) against a black background.
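As a rough aid for reproducing these stimulus sizes, the following sketch converts the reported visual angles into physical sizes on the screen at the stated viewing distance of approximately 50 cm. It relies on the standard geometric relations (extent = 2 d tan(θ/2) for an object centered on the line of sight, offset = d tan(θ) for eccentricity). The function names and dictionary labels are illustrative assumptions and are not taken from the original experiment software.

```python
import math

VIEWING_DISTANCE_CM = 50.0  # approximate viewing distance reported in the text


def extent_cm(angle_deg, distance_cm=VIEWING_DISTANCE_CM):
    """Physical extent (cm) of an object subtending angle_deg, centered on the line of sight."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


def eccentricity_cm(angle_deg, distance_cm=VIEWING_DISTANCE_CM):
    """On-screen offset (cm) from fixation for a point at angle_deg of eccentricity."""
    return distance_cm * math.tan(math.radians(angle_deg))


# Angles as reported in the Stimuli section; the keys are illustrative labels only.
sizes_cm = {
    "fixation dot": extent_cm(0.34),
    "square width": extent_cm(1.15),
    "square eccentricity": eccentricity_cm(5.0),
}

if __name__ == "__main__":
    for name, size in sizes_cm.items():
        print(f"{name}: {size:.2f} cm")
```

Converting further to pixels would require the monitor's physical dimensions and resolution, which are not reported here, so the sketch stops at centimeters.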

In the Onset Cue condition, one of the squares in the cue display was surrounded by four white dots (0.23 dg in diameter) in a diamond configuration. The distance from each dot to the edge of the square was 0.30 dg. In the Color Cue condition, four purple dots surrounded one of the squares and the other squares were surrounded by green dots. The cue-and-target display consisted of the cue display with the addition of an “X” or “=” (0.57 dg) in the squares. In the Onset Target condition, only one of the squares contained an “X” or “=” in white, while the other squares were empty. In the Color Target condition, two of the squares contained an “X”, while the other two contained an “=”, and only one of the characters was purple, while the others were green. Green and purple were equal in brightness or perceived luminance (green RGB code: [0, 110, 0]; purple RGB code: [190, 0, 190]). The two line segments of the “=” presented in the paid-subject group were 0.19 dg apart (i.e., the narrow equal-sign), and they were 0.43 dg apart in the other subject group (i.e., the wide equal-sign).

Design. A 2 x 2 x 2 x 2 mixed-factor design was used in this experiment. The three within-subjects factors were Cue Attribute, Target Attribute, and Cue Validity. The between-subjects factor was the (spacing of the) Equal-Sign. Each cue-target situation (onset cue & onset target; onset cue & color target; color cue & color target; color cue & onset target) was experienced in 4 contiguous trial blocks with the first block discarded as practice. The order of conditions was counterbalanced across participants. In each block, there were 32 data-collecting trials, in which 4 cue locations and 4 target locations were counterbalanced twice. In addition, 2 warm-up trials, and 2 recovery trials after each error trial, were included. The maximum number of trials in a block was 50.
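The counterbalancing described above can be sketched as a full crossing of cue location and target location, repeated twice. This is only a minimal illustration under assumed names (`LOCATIONS`, `build_block`, and the dictionary keys are hypothetical); warm-up and recovery trials, and the assignment of the target character, are left out because the text does not specify how they were generated beyond being drawn randomly.

```python
import itertools
import random

LOCATIONS = (0, 1, 2, 3)  # the four placeholder squares; index labels are arbitrary
REPETITIONS = 2           # each cue-target location pairing occurs twice per block


def build_block(rng):
    """Return the 32 data-collecting trials of one block in random order."""
    pairs = list(itertools.product(LOCATIONS, LOCATIONS)) * REPETITIONS
    trials = [
        {"cue_loc": cue, "target_loc": target, "valid": cue == target}
        for cue, target in pairs
    ]
    rng.shuffle(trials)
    return trials


block = build_block(random.Random(0))  # fixed seed only for reproducibility of the sketch
n_valid = sum(t["valid"] for t in block)
n_invalid = len(block) - n_valid

if __name__ == "__main__":
    print(len(block), n_valid, n_invalid)
```

Under this crossing, the cue and the target share a location on 8 of the 32 trials (25%), which is chance level for four locations, so the cue carries no information about where the target will appear.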


A. Onset Target

B. Color Target

Figure 2-2: The actual display was presented on a black background. All black lines in this figure were actually white or grey in the real presentation. There were two types of Target: Onset (Figure 2-2A) and Color (Figure 2-2B), and two types of Cue: Onset and Color. Cue Attributes and Target Attributes were combined to yield four cue-target situations: Onset Target preceded by a congruent (onset) Cue or incongruent (color) Cue; Color Target following a congruent (color) cue or incongruent (onset) cue.


Procedure. The session lasted approximately one hour, beginning with oral and written instructions that required participants to respond as quickly and accurately as possible and to maintain fixation throughout every trial. The uninformative nature of the cues was included in the instructions.

Each trial began with a fixation display, lasting for 1000 ms, followed by the cue display for 35 ms, the cue-and-target display for 15 ms, and the target display for 35 ms. That is, both the cue and target display lasted for 50 ms, and the cue-to-target SOA was 35 ms. The target display was followed by the fixation display until the participant made a response. The next trial sequence was initiated 1000 ms after the response. At the end of each block, subjects would get feedback on their performance in terms of mean RT and accuracy. If the accuracy was equal to or greater than 95%, an “Excellent!” remark would also be presented. If the error rate was greater than 15%, “Please slow down a little and make fewer errors” would be shown.
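The relation between the nominal cue duration, target duration, and SOA and the three successive displays can be checked with a short calculation. The sketch below is illustrative only: the constant and function names are assumptions, and it works with the nominal millisecond values given in the text rather than with any frame-level timing of the original software.

```python
CUE_DURATION_MS = 50     # nominal cue duration
TARGET_DURATION_MS = 50  # nominal target duration
SOA_MS = 35              # cue onset to target onset


def display_timeline(cue_ms=CUE_DURATION_MS, target_ms=TARGET_DURATION_MS, soa_ms=SOA_MS):
    """Return (label, onset, offset) for each display, in ms relative to cue onset."""
    cue_offset = cue_ms
    target_offset = soa_ms + target_ms
    return [
        ("cue display", 0, soa_ms),                      # cue alone
        ("cue-and-target display", soa_ms, cue_offset),  # cue and target overlap
        ("target display", cue_offset, target_offset),   # target alone
    ]


timeline = display_timeline()
durations = {label: offset - onset for label, onset, offset in timeline}
overlap_ms = durations["cue-and-target display"]

if __name__ == "__main__":
    for label, onset, offset in timeline:
        print(f"{label}: {onset}-{offset} ms ({offset - onset} ms)")
```

With the values above, the three displays last 35, 15, and 35 ms, reproducing the 15 ms of cue-target overlap described in the text.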

Participants made responses by pressing a key on the keyboard. The target character “X” was assigned to the key “Z”, and the character “=” to the key “?”. RT was measured from the onset of the cue-and-target display, in which the target first appeared.

If a response was not made within 1500 ms, an error message, “no response detected,” was presented. Incorrect responses elicited the error message “incorrect response.” If a response was made within 50 ms after the target appearance, the participant was presented with an error message “please do not press the key before the target appears.” If a key other than “Z” or “?” was pressed, the participant would be presented with “please press ‘Z’ or ‘?’ to make a response.” Each error message was presented for 1500 ms, after which the next trial was initiated. An error trial was followed by 2 recovery trials. Parameters in recovery trials and warm-up trials at the beginning of each block were drawn randomly. RTs for warm-up, error, and recovery trials, and for trials in practice blocks, were not included in the data analysis.
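The response rules in this section can be summarized as a small classification function. This is a hedged sketch, not the original MatLab code: the function name, the order in which the checks are applied, and the treatment of the exact boundaries (e.g., whether a response at exactly 50 ms counts as anticipatory) are assumptions, since the text does not specify them.

```python
KEY_TO_CHARACTER = {"Z": "X", "?": "="}  # response key -> target character it reports
ANTICIPATION_MS = 50                     # responses faster than this count as anticipations
DEADLINE_MS = 1500                       # responses slower than this count as missing


def classify_response(key, rt_ms, target_char):
    """Return 'correct' or the error message described in the Procedure.

    key is None when no key was pressed; rt_ms is measured from target onset.
    The order of the checks is an assumption made for this sketch.
    """
    if key is None or rt_ms > DEADLINE_MS:
        return "no response detected"
    if rt_ms < ANTICIPATION_MS:
        return "please do not press the key before the target appears"
    if key not in KEY_TO_CHARACTER:
        return "please press 'Z' or '?' to make a response"
    if KEY_TO_CHARACTER[key] != target_char:
        return "incorrect response"
    return "correct"


if __name__ == "__main__":
    print(classify_response("Z", 430, "X"))
    print(classify_response("?", 430, "X"))
```

Any outcome other than "correct" would mark the trial as an error trial, to be followed by 2 recovery trials and excluded from the RT analysis.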

Results

A 2 x 2 x 2 x 2 mixed-factor ANOVA was performed on the mean RTs. The analysis showed that participants who were shown widely-spaced equal-signs made faster responses than those who used the narrow equal-signs [F(1, 62) = 11.86, p = .001]. But Equal-Sign did not interact with any other factors, thus the data were collapsed across this factor and plotted in Figure 2-3 with mean RTs as a function of Cue Attribute (onset or color), Target Attribute (onset or color), and Cue Validity (valid or invalid). A complete summary of the data is given in Table 2-1. The analysis confirmed the apparent three-way interaction [F(1, 62) = 110.32, p < .001]. The initial analysis also showed main effects of Target Attribute [F(1, 62) = 85.88, p < .001] and Cue Validity [F(1, 62) = 108.65, p < .001], as well as all three of the two-way interactions [all F(1, 62) ≥ 15.84, all p < .001]. Consistent with previous work, as well as the figure, the three-way interaction was first explored by conducting separate two-way analyses, divided by Target Attribute.

For the onset targets, the interaction between Cue Attribute and Cue Validity was significant [F(1, 62) = 14.79, p < .001]. The same was true for color targets [F(1, 62) = 77.02, p < .001], although the specific pattern was reversed (see Figure 2-3). In short: congruent cues captured attention; incongruent cues did not.

Figure 2-3: Mean response times as a function of Target Attribute, Cue Attribute, and Cue Validity.

Table 2-1: Mean response times (in ms) and error rates (in %) in each cue-target situation.

                 Onset Target                                  Color Target
Cue    Validity  Response  Error  Validity     Validity     Response  Error  Validity     Validity
                 Time      Rate   Effect (RT)  Effect (ER)  Time      Rate   Effect (RT)  Effect (ER)
Onset  Valid     425.13    6.98   22.90**      -2.82*       473.25    6.82   7.77         -2.13
       Invalid   448.04    4.16                             481.02    4.69
Color  Valid     420.59    5.23   -.72         .19          463.48    6.87   91.27**      9.63**
       Invalid   419.87    5.04                             554.75    16.50

* p < .050   ** p < .001
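The validity effects in Table 2-1 are differences between invalid and valid cell means, and the signature of Contingent Capture is that this difference is large for congruent cues and near zero for incongruent cues. The following sketch recomputes the RT effects from the tabled means; the dictionary layout and function names are illustrative assumptions. Because the inputs are already rounded to two decimals, a recomputed value can differ from the tabled one in the last digit.

```python
# Mean RTs (ms) from Table 2-1, keyed by (cue attribute, target attribute, validity).
MEAN_RT = {
    ("onset", "onset", "valid"): 425.13,
    ("onset", "onset", "invalid"): 448.04,
    ("onset", "color", "valid"): 473.25,
    ("onset", "color", "invalid"): 481.02,
    ("color", "onset", "valid"): 420.59,
    ("color", "onset", "invalid"): 419.87,
    ("color", "color", "valid"): 463.48,
    ("color", "color", "invalid"): 554.75,
}


def validity_effect(cue, target, table=MEAN_RT):
    """Invalid minus valid mean RT (ms); positive values indicate a cuing effect."""
    return table[(cue, target, "invalid")] - table[(cue, target, "valid")]


def congruency_contrast(target, table=MEAN_RT):
    """Validity effect for the congruent cue minus that for the incongruent cue."""
    incongruent_cue = "color" if target == "onset" else "onset"
    return validity_effect(target, target, table) - validity_effect(incongruent_cue, target, table)


effects = {
    (cue, target): round(validity_effect(cue, target), 2)
    for cue in ("onset", "color")
    for target in ("onset", "color")
}

if __name__ == "__main__":
    for (cue, target), effect in effects.items():
        print(f"{cue} cue, {target} target: {effect:+.2f} ms")
    for target in ("onset", "color"):
        print(f"{target} target, congruent minus incongruent: {congruency_contrast(target):+.2f} ms")
```

These descriptive differences only restate the pattern in the table; the inferential tests remain the ANOVAs reported in the text.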

The overall mean error rate was 7.03%. There was no evidence for a speed-accuracy trade-off: averaged across participants, error rates by condition were positively correlated with mean RT (mean r = .32; t(63) = 5.78; p < .001). Thus, there is no reason for the error data to alter the conclusion that was reached using mean RT.
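One plausible reading of this check is that a correlation between condition-wise error rate and mean RT was computed for each participant, and the resulting correlations were then tested against zero with a one-sample t test. The sketch below illustrates that procedure under this assumption. The three participants' values are invented placeholders for illustration only, not data from the experiment, and the function names are hypothetical.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def one_sample_t(values, mu=0.0):
    """Return (t, df) for a one-sample t test of the mean of values against mu."""
    n = len(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return (statistics.fmean(values) - mu) / standard_error, n - 1


# HYPOTHETICAL per-participant data: mean RT (ms) and error rate (%) in each of
# the eight Cue Attribute x Target Attribute x Cue Validity cells.
participants = [
    {"rt": [410, 440, 470, 480, 415, 418, 460, 550], "er": [5, 4, 6, 5, 4, 5, 6, 15]},
    {"rt": [450, 465, 500, 505, 445, 450, 480, 570], "er": [8, 6, 7, 6, 5, 7, 8, 18]},
    {"rt": [400, 420, 455, 470, 405, 400, 450, 530], "er": [6, 7, 5, 9, 3, 4, 7, 12]},
]

correlations = [pearson_r(p["rt"], p["er"]) for p in participants]
t_value, df = one_sample_t(correlations)

if __name__ == "__main__":
    print(f"mean r = {statistics.fmean(correlations):.2f}, t({df}) = {t_value:.2f}")
```

A reliably positive mean correlation means that slower conditions also tended to produce more errors, which is the opposite of what a speed-accuracy trade-off would produce.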

Discussion

The present results are inconsistent with Rapid Disengagement and provide additional support for Contingent Capture. Rapid Disengagement explains the lack of a spatial validity effect with incongruent cues in previous research (e.g., Folk et al., 1992) by claiming that a 150-ms delay between the onset of the cue and the onset of the target is sufficient for both pure bottom-up capture and withdrawal of attention from the location of the incongruent cue. But the present experiment has also failed to find a spatial validity effect with incongruent cues at the very short SOA of 35 ms. At the same time, and going beyond previous work using the cuing paradigm and short SOAs (e.g., Folk & Remington, 1998; Lamy, 2005), the present experiment succeeded in producing a spatial validity effect with congruent cues, adding weight to the null finding with incongruent cues. This entire pattern is consistent with Contingent Capture.

One might be tempted to argue that the incongruent cues in the present experiment did capture attention and then release it so rapidly that no spatial validity effect was observed, even at a very short SOA. However, such an idea would run counter to a large body of work, using a variety of methods, all of which suggests that 35 ms is insufficient for both a shift of attention and then disengagement (e.g., Moore et al., 1996; Logan, 2005; Theeuwes, Godijn, & Pratt, 2004). Furthermore, in a recent variant of the cuing paradigm, Folk and Remington (2006) have shown that flanker-like distractors (i.e., task-irrelevant stimuli that can be compatible or incompatible with the actual target), appearing at the same time and location as the cue, only cause a response-compatibility effect when the cue is congruent. Thus, after modifying the paradigm in order to avoid the issues that arise when cues and targets appear at the same moment, this other work has shown no evidence of capture by incongruent cues at an SOA of zero (between the cue and critical distractor), leaving no time at all for Rapid Disengagement.

Alternatively, even if incongruent cues do capture attention in a way that the current experiment can hardly demonstrate, so that Rapid Disengagement remains theoretically possible, Contingent Capture need not be rejected either. When attention is captured by a salient irrelevant stimulus that does not share features with the target, this is probably a consequence of the observer's mental setting, namely the singleton-detection mode. When attention is captured only by a stimulus that contains the attribute defining the target, another mental setting, namely the feature-search mode, is probably the cause. This has been demonstrated by Pratt and McAuliffe (2002), in whose study the target was preceded by either only congruent cues, only incongruent cues, or possibly both. It was found that attentional capture is evident in the congruent-only condition and the combination condition, but not in the incongruent-only condition. That is, whether or not incongruent cues capture attention, the task context plays an important role in determining top-down control settings. As long as top-down control settings are involved in attentional capture, there is no violation of Contingent Capture.

Chapter 3

Background: the Choice of N2pc

Response time (RT) measurements are useful for studying a broad range of cognitive processes, including visual attention. As reviewed in Chapter 1, RT has been used as a tool to infer whether or not a stimulus is selected from a visual display and whether or not a stimulus captures attention. The hypothesis that is under examination in the present study is Rapid Disengagement. This hypothesis claims that a stimulus that does not contain the attribute that defines the target still captures attention, but does so very briefly. To test this idea, the contingent cuing paradigm is used. In Experiment I (reported in Chapter 2), RT was measured while a very short interval between the onsets of the cue and the target was used. In Experiment II (reported in Chapter 4), an ERP component that indicates attentional capture will be examined. ERPs are one of the techniques that allow “online” observation of cognitive processing (before a physical response is made). This chapter provides an (optional) introduction as to how this method was selected and why.

Neurophysiological Methods

Using current neurophysiological techniques, much has been learned about the anatomy and physiology of the brain systems that mediate cognitive processes. Positron emission tomography (PET) and functional magnetic resonance imaging (fMRI) register changes in the metabolic activity of nerve cell populations or changes in regional blood flow associated with neural activity. Event-related potentials (ERPs) from electroencephalography (EEG) and event-related magnetic fields (ERFs) from magnetoencephalography (MEG) record, respectively, the electric and magnetic fields that are generated by the ionic currents associated with synchronous synaptic activity and cell firing patterns in neural populations. Yet another physiological method, practiced directly on brain cells, is single-cell or single-unit recording, which is generally performed on a non-human subject and records the target neuron’s activity (i.e., its firing rate) during a given task.

Table 3-1: Comparisons among the most commonly used neurophysiological methods in cognitive psychology. (Adapted from Table 1 in Volkow, Rosen, & Farde, 1997)

Neurophysiological Method    Temporal Resolution    Spatial Resolution
fMRI                         3-5 sec                1.0-1.5 mm
PET                          45 sec                 4 mm
EEG                          1 msec                 10-15 mm
MEG                          1 msec                 5 mm


For the current research on visuospatial attention, the ERP method was chosen because of its high temporal resolution (higher than fMRI and PET), its relatively low cost (as compared to fMRI, PET, MEG, and single-cell recording), its noninvasiveness (as opposed to PET and single-cell recording), and its availability on the main campus of the Pennsylvania State University. Of these reasons, the high temporal resolution is of greatest importance and qualifies ERPs as an online psychophysiological technique (see Table 3-1). In addition, the high temporal resolution of ERPs allows investigation of internal processes that occur too fast to be observed from behavior or from the hemodynamic responses recorded by PET or fMRI (which have poor temporal resolution). The subject of interest in the current study, i.e., attentional capture, is a process that occurs very quickly after stimulus presentation, making PET or fMRI unlikely to detect its occurrence. The ERP method, on the other hand, is much more likely to reflect an event as fleeting as attentional capture.

Due to the nature of the recording, the spatial resolution of ERPs does not reach the millimeter scale of brain-imaging techniques such as PET and fMRI. The spatial resolution of the ERP technique is fundamentally undefined because there are infinitely many internal ERP generator configurations that can explain a given pattern of ERP data. Unlike with PET and fMRI, it is currently not possible to specify a margin of error for an ERP localization claim (for a fuller discussion in lay terms, see Luck, 2005). However, the distribution of voltage over the scalp can be used to estimate the loci of the ERP component generators, providing a coarse understanding of functional mapping on the surface of the brain. In Chapter 4, the contingent cuing experiment will be paired with a high-density (128-channel/electrode) EEG measure. With that many electrodes, mapping functional anatomy on the scalp is more feasible than with low-density systems (usually containing fewer than 56 electrodes). However, one shortcoming of the high-density EEG system is its relatively high impedance, which decreases the signal-to-noise ratio at any single electrode. The advantage of the high-density EEG system, on the other hand, is that a given scalp region contains more electrodes than in low-density systems, which means that more data can be used to increase the signal-to-noise ratio within that region. Therefore, the spatial resolution of the high-density system may not be very different from, or much more informative than, that of the low-density system.

Event-Related Potentials

Event-related potentials (ERPs) refer to differences between electrical voltages evoked at the occurrence of a given event. ERPs begin as postsynaptic potentials generated at approximately the same time by a great number of neurons during neurotransmission. Current flowing into each individual neuron creates a dipole, that is, a pair of positive and negative electrical charges separated by distance. If all the neurons have a similar orientation (i.e., are spatially aligned) and receive the same type of input corresponding to the same event, their dipoles will summate and may be measured at the scalp. However, electricity does not just travel directly between the two poles of a dipole in a conductive medium such as the brain, but also passively spreads out as it travels through the brain. In addition, because electricity follows the path of least resistance, an ERP tends to spread laterally when encountering the high resistance of the skull. The extent to which electrical signals spread depends on various cortical activities and area structures in the brain. This is one of the causes of the poor spatial resolution of the ERP recording. Eventually, weak but detectable potentials (only a few microvolts; μV) reach the scalp, where electrodes placed on the scalp instantly pick up the electrical activity (see Figure 3-1 for the electrode positions in the international 10-20 system).

Figure 3-1: The international 10-20 electrode system. 10-20 refers to the placing of electrodes at sites 10% and 20% from certain anatomical landmarks as shown on the left. The right side of the figure depicts the electrode positions as viewed from the top of the head. The letters refer to areas of the brain, F for frontal, C for central, T for temporal, P for parietal, and O for occipital. Numerical subscripts indicate the brain hemisphere (odd numbers for left and even numbers for right). Subscript “z” stands for “zero” and refers to the midline placement.
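The naming rule described in the caption can be written out as a short sketch. The function name `describe_site` is hypothetical, and the sketch handles only the single-letter region codes listed in the caption; it is an illustration of the convention rather than a complete treatment of the 10-20 nomenclature.

```python
import re

# Region letters exactly as listed in the Figure 3-1 caption.
REGIONS = {"F": "frontal", "C": "central", "T": "temporal",
           "P": "parietal", "O": "occipital"}

def describe_site(label):
    """Decode a basic 10-20 label such as 'O1', 'T6', or 'Cz'.

    Hypothetical helper: returns (region, side), where side is
    'left' for odd numbers, 'right' for even numbers, and
    'midline' for the 'z' suffix.
    """
    match = re.fullmatch(r"([FCTPO])(z|\d+)", label)
    if match is None:
        raise ValueError(f"not a basic 10-20 label: {label!r}")
    letter, suffix = match.groups()
    if suffix == "z":
        side = "midline"
    elif int(suffix) % 2 == 1:
        side = "left"
    else:
        side = "right"
    return REGIONS[letter], side
```

For instance, `describe_site("O1")` yields the occipital region on the left side, and `describe_site("T6")` the temporal region on the right side.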


An electroencephalogram of raw EEG waves is difficult to decompose directly because the brain is constantly active and generates electrical potentials corresponding to internal or external events. Therefore, in an EEG laboratory, it is very helpful to shield out unwanted stimuli and to present only stimuli of interest to the study in question. Time-locked to the stimuli of interest (e.g., the stimulus onset), a recording epoch is defined as a range of time relative to the time-locked zero (e.g., from 100 ms before the stimulus onset until 1000 ms after the stimulus onset). Some potentials evoked in the epoch are related to external events (i.e., stimulus presentation) or internal events (i.e., mental processes), and others are baseline activities. Thus, in addition to an isolated recording environment, offline analyses are needed to separate ERPs from baseline EEG potentials. A recording session contains a large number of task trials, resulting in a large number of epochs. To extract relatively clean ERPs, all epochs are lined up in time relative to stimulus (or response) onset and averaged. Ideally, any brain activity that is not related to the stimulus (or response) will average to zero voltage, and any brain activity that is consistently time-locked will remain in the averaged waveform (Figure 3-2).
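The averaging procedure just described can be illustrated with a minimal sketch on simulated data. Everything here is an assumption made for illustration: the function name `average_erp`, the single-channel layout, the 250 Hz sampling rate, and the simulated 2 μV deflection buried in 5 μV noise. It is not the analysis software used in this study.

```python
import numpy as np

def average_erp(eeg, onsets, srate, tmin=-0.1, tmax=1.0):
    """Average stimulus-locked epochs cut from one channel of continuous EEG.

    eeg    : 1-D array of voltages (microvolts)
    onsets : sample indices at which the stimulus appeared (time zero)
    srate  : sampling rate in Hz
    tmin, tmax : epoch limits in seconds relative to time zero
    """
    start = int(round(tmin * srate))
    stop = int(round(tmax * srate))
    # Keep only epochs that lie entirely inside the recording.
    epochs = np.array([eeg[o + start:o + stop] for o in onsets
                       if o + start >= 0 and o + stop <= len(eeg)])
    # Activity that is not time-locked tends toward zero in the mean.
    return epochs.mean(axis=0)

# Simulated data (all values hypothetical): a 2-microvolt deflection
# peaking 300 ms after each onset, buried in 5-microvolt noise.
rng = np.random.default_rng(0)
srate = 250
times = np.arange(-25, 250) / srate          # seconds, from -0.1 up to 1.0
template = 2.0 * np.exp(-0.5 * ((times - 0.3) / 0.05) ** 2)
onsets = 100 + 400 * np.arange(400)          # 400 non-overlapping trials
eeg = rng.normal(0.0, 5.0, size=160_000)
for o in onsets:
    eeg[o - 25:o + 250] += template

erp = average_erp(eeg, onsets, srate)
peak_time = times[np.argmax(erp)]            # expected to fall near 0.3 s
```

In any single simulated epoch the deflection is invisible, whereas in the average of 400 epochs the residual noise shrinks by roughly the square root of the number of trials and the time-locked deflection remains.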

The resulting averaged ERP waveform consists of several negative- and positive-going waves in voltage relative to the zero potential. Some of the waves are referred to as components, but not all ERP components are evoked waves in a waveform. Some components are defined as relative negativity or positivity compared to a given control condition. Some components are products of a series of mathematical algorithms, depending on the theory or model used to generate a functional index. ERP components are usually labeled P or N to indicate positive or negative waves (or differences), with numbers indicating the timing of the peak or the sequential order of the peak. For example, the P300 indicates the positive peak evoked around 300 ms after the time-locked zero, and the P1 refers to the first major positive peak after the time-locked zero. Sometimes, the number denoting the sequential order of the peak is followed by one or two lowercase letters, indicating in which scalp region the component is most likely to be recorded. For example, the N2pc refers to a negative difference that is observed over the posterior scalp region that is contralateral to the critical stimulus and is observed during the latency range of the N2 wave (200 – 300 ms).

Figure 3-2: Extraction of the ERP waveform from the ongoing EEG. Time zero is locked to the stimulus onset. (a) Stimuli (1… N) are presented while the EEG is being recorded, but the specific response to each stimulus is too small to be seen in the noisy EEG. (b) To isolate the ERP from the ongoing EEG, the EEG epochs or segments are extracted and averaged to create the ERP waveform. Note that negative voltage is plotted up by convention. The waveforms are hypothetical. (Adapted from Luck et al., 2000)

The sequence of components following a mental (e.g., being prepared) or physical (e.g., presentation of a stimulus) event reflects the sequence of neural processes triggered by the event. Some ERPs are evoked simply by sensory input, reflecting some very front-end processes that effortful processes have no access to. For example, the C1 (not labeled with a P or an N because its polarity can vary) onsets as early as 40-60 ms after the presentation of a visual stimulus and is sensitive to stimulus properties but not to attention. A few sensory-evoked ERPs are sensitive to mental processes and change in amplitude or latency, such as the visual N1, whose amplitude is enhanced if attention is voluntarily deployed to the stimulus. Other ERPs are elicited only under certain circumstances, such as the N2pc, which will be described in detail below.

Because of the limited spatial resolution of the technique, as mentioned earlier, most of the ERP components recorded at a given electrode site are affected by various psychological and physiological events whose timing or period may vary substantially. Such variance is primarily an issue for higher-order processes or processes occurring relatively late in time. For example, consider an ERP wave that is recorded more than half a second after the task begins and that shows its greatest magnitude at an electrode on the posterior side of the scalp. The peak may be originally linked to the firing of a group of neurons in the occipital area of the brain. This neuronal activity may or may not be influenced by other neurons from other areas. The ERP components of interest may be hidden in the waveform or distorted by proximate components. Therefore, early ERP components are less likely to be missed or distorted in a waveform than late ERP components, meaning that early ERP components are more reliable in indexing functions than are late ones. Effects of visual attention have been shown to occur relatively early in processing, and thus the ERP technique is frequently used in visual attention studies.

A number of ERP components such as the P1, the N1, and the N2pc have been used as indices of visuospatial attention (see Figure 3-3 for P1 & N1 effects, and Figure 3-4 for a hypothetical N2pc). These components are observed relatively early in processing, and their generators are located approximately over the extrastriate cortex (i.e., the visual areas in the brain other than the primary visual cortex). The visual world is represented in the opposite side of the brain; for example, stimuli in the right visual field are processed in the left hemisphere of the cortex. When a property of a sensory-evoked component such as the P1 or the N1 is affected by a stimulus on the contralateral side, this indicates that some spatial factor must be involved in the effect. Typically, amplitudes of the P1 and the N1 change when a contralateral location is voluntarily attended (e.g., Hillyard, Luck, & Mangun, 1994; Luck et al., 1994). In tasks requiring involuntary shifts of attention, P1 effects but no reliable N1 effects have been reported (e.g., Doallo et al., 2004, 2005; Fu et al., 2005; Hopfinger & Mangun, 1998, 2001; Hopfinger & West, 2006). Thus, with regard to the current study, which examines an essential property of involuntary shifts of attention (i.e., attentional capture), the N1 is not of use. However, the P1 is not an ideal index of attentional capture either, because the P1 is also magnified when the same location is stimulated repeatedly, as is done in an exogenous cuing task (e.g., Johannes et al., 1995; Hopfinger & West, 2006; Wijers et al., 1997). As for the N2pc, the elicitation of this ERP component is associated with an attentional shift to a stimulus whose location is not known beforehand; that is, attention is captured by the stimulus (e.g., Eimer, 1996; Girelli & Luck, 1997; Luck & Hillyard, 1994a, 1994b; Wolber & Wascher, 2003, 2005; Woodman & Luck, 1999, 2003). Therefore, the N2pc is the most suitable component for the current study. Selected literature on the N2pc is reviewed in the next section.

Figure 3-3: Amplitude modulation of early ERP components in visuospatial attention. Time zero is locked to the flash onset. (a) Subjects fixated the central plus sign while stimuli flashed in a random order. In each trial block, subjects attended to stimuli in only one quadrant. That is, subjects endogenously attended to one location and made a response if the target (the flashing stimulus) was in the attended location. The scalp current distribution map, shown on the schematic head, was generated based on the P1 component recorded from each electrode. (b) ERP waveforms to lower left flashes, averaged from EEG epochs recorded over the right posterior visual cortex, are shown for flashes that were attended (solid line) and ignored (dotted line). (Adapted from Mangun et al., 1993)

N2pc

The N2pc reflects attentional capture to a stimulus location. This capture is usually caused by attentional selection of the feature that defines the target, but can also be caused by features that are highly salient but irrelevant to the task. The nature of the task and the stimulus both determine whether attention is oriented to a location that is uncertain until the stimulus appears. The N2pc indexes the implementation rather than the control of attentional capture. This conclusion has come from seminal work by Luck and Hillyard (1994a, 1994b), Eimer (1996), and Girelli and Luck (1997).

Figure 3-4: Example of an N2pc experiment. (a) Stimulus arrays with a target (the white bar) on the right or left. (b) Hypothetical waveforms recorded from posterior electrode sites over the left and right hemispheres. By convention, negative is plotted upward. The waveforms from the left hemisphere are more negative-going for right visual field (RVF) targets than for left visual field (LVF) targets, and the waveforms from the right hemisphere are more negative-going for LVF targets than for RVF targets. (c) Waveforms that have been collapsed across hemispheres into ipsilateral and contralateral waveforms. This difference in amplitude between the ipsilateral and contralateral waveforms defines the N2pc component and is indicated by the shaded region. (Adapted from Woodman & Luck, 2003)


Luck and Hillyard published two influential papers on the N2pc in 1994. The experiments were very similar to what is illustrated in Figure 3-4. In an array of multiple stimuli, one stimulus stood out from the other stimuli because of an obvious physical feature, such as its color, shape, or size. If this feature defined the target (i.e., an infrequent response was required when the target feature was found in a display), the N2pc was elicited. If this feature did not define the target but simply gave a distractor an odd appearance (i.e., a non-target pop-out singleton), the N2pc was not evident. Apparently, the N2pc reflects processing of task-relevant information, which occurs in the location of the task-relevant stimulus. This suggests that attentional capture triggered by a salient feature is contingent on top-down knowledge. In the second article, the authors continued to investigate what the N2pc conveys by manipulating the similarity between target and non-target pop-outs, and by testing whether a single-item display can also elicit the N2pc. For similarity, the target was defined by two features, one more obvious (color) than the other (size), and thus a non-target singleton could appear similar to the target if it contained the more obvious target-defining feature. The results were very promising in that the N2pc was present when the non-target pop-out was similar to the target, and that the N2pc was absent when the target was the single item in a display. This set of data suggests that the N2pc reflects a spatial filtering process that suppresses information surrounding the target, and consequently target-defining features receive facilitation of processing.

Rather than filtering out non-targets, Eimer (1996) argues that the N2pc reflects active selection of task-relevant information. He used search arrays containing a small number of stimuli (i.e., four or two), and found the presence of the N2pc. In such an uncrowded display (e.g., one item on each side of the visual field) it is not necessary to suppress or filter out a distractor but simply to select the target. That is, the target processing is enhanced after localizing task-relevant information. Whether the N2pc reflects spatial filtering of irrelevant information or spatial selection of relevant information, the N2pc has been serving as an index of the ability to localize a stimulus under top-down control. Put differently, the destination of the attentional shift is not predetermined because the target location is unknown, which is the definition of an exogenous shift or attentional capture; however, the target attribute is known, and attention is captured by the target features but not by a distractor, which is the influence of top-down control.

Girelli and Luck (1997), however, demonstrated that the N2pc also reflects attentional capture that seems free from top-down control. They employed the same paradigm as in Luck and Hillyard (1994a, 1994b, see Figure 3-4), and chose color, orientation, and motion as features that made a singleton pop out. Like previous research, the singleton could be the target or not. When the singleton was the target, the N2pc was evident. But inconsistent with previous research, when the singleton was defined by motion and was not the target, the N2pc was elicited too (Figure 3-5), which indicates that motion singletons produce an automatic capture of attention. Hence, the N2pc reflects attentional capture by a stimulus, but whether the attentional capture is purely stimulus-driven or partly goal-driven cannot be inferred simply with the presence of the N2pc. The task nature has to be taken into account as well.

Figure 3-5: ERP waveforms for each combination of target condition and singleton type in Girelli & Luck (1997). The N2pc component is defined by a difference in voltage between trials on which the target is contralateral versus ipsilateral to the electrode site (OL or OR), and this difference is shaded. Negative is plotted upward in this and all subsequent figures. (Adapted from Girelli & Luck, 1997)

Chapter 4

Contingent Cuing with ERP

According to Contingent Capture, an incongruent cue (which does not contain the target-defining attribute) does not capture attention (Folk et al., 1992, 2006). Alternatively, Rapid Disengagement argues that an incongruent cue captures attention but very briefly (Theeuwes, 1994; Theeuwes et al., 2000). In Chapter 2, these two accounts were examined via a behavioral experiment; that is, attentional capture was inferred from differences in response time. In this chapter, the question “Is attentional capture purely stimulus-driven or contingent upon goal-driven settings?” is addressed by observing physiological responses prior to the behavioral response. In Experiment II, event-related potentials (ERPs) were recorded in order to observe one ERP component, the N2pc, as an index of attentional capture. Previously, two studies have tried to find psychophysiological evidence of Contingent Capture; they are Arnott et al. (2001) and Hopfinger and Ries (2005). However, the ERP components that those studies used as indexes of attentional capture by the cue are problematic. Details of the problems are discussed below, followed by an introduction to Experiment II.

Prior ERP Experiments

Hopfinger & Ries (2005). With epochs time-locked to the target, these authors took magnification of the P1 wave to the target (i.e., the first major positive peak post-stimulus) as the index for attentional capture by the cue. The underlying hypothesis of this kind of ERP experiment is that, if attention is paid to the cue location prior to the target and if the target is validly cued, then the amplitude of the P1 evoked by the target will increase. In the context of a contingent cuing paradigm, Hopfinger and Ries found that, in the congruent condition, the validly cued target generated a more positive waveform (including the P1, the N1, etc.) than the invalidly cued target. In the incongruent condition, the P1 was found to be greater in amplitude when the target was validly cued than when it was not. However, when they equalized the luminance of stimuli in a cue display (so that the cue and other non-cues were equiluminant), no reliable effect was found on the P1 in either congruency condition. Therefore, it was argued that a unique brightness transient automatically captures attention at the level of sensory processing, where the influence of top-down control settings cannot reach.

Observing changes in P1 amplitude in a peripheral cuing paradigm is not convincing. The P1 is evoked once a stimulus is presented to the visual system. When the target is validly cued, meaning that the same location is signaled by two successive stimuli, the P1 at the target onset is thus enhanced. This may indicate an attentional process in which target processing is facilitated because the cue has captured attention to the location, but it may also simply reflect a location repetition effect or accumulation of visual information. Recently, Hopfinger and West (2006) provided a much clearer view of the P1 and the N1, especially of the P1, the earlier portion of which was found to be more sensitive to physical attributes, while the later portion may be associated with Contingent Capture (as an account for Hopfinger & Ries’s findings). Thus, it is not surprising that, regardless of cue-target congruency, a unique brightness transient cue elicited greater P1s than the equiluminant cue used by Hopfinger and Ries (2005). Therefore, one way to solve the problem is to look for an ERP effect that indexes attentional capture and is elicited directly by the cue, rather than observing ERP effects after the onset of the target.

Another issue in this particular study was the technique used to separate waveforms generated by the cue and the target, in order to see a clean waveform elicited by the target only (Woldorff, 1996). This technique has proved very useful in many ERP studies. One requirement of this technique is to vary the temporal interval between the two stimuli, and thus various cue-to-target SOAs (100 – 300 ms) were used in Hopfinger and Ries’s experiments. Recall from Chapter 2 that, as temporal uncertainty is added to the task (i.e., various SOAs), attentional control settings may be changed, which makes attention less reluctant to be captured by incongruent cues and initiates the most convenient strategy (e.g., “the singleton detection mode” of Bacon & Egeth, 1994). This could be a reason why Hopfinger and Ries found attentional effects on physiological responses.

Arnott et al. (2001). With epochs time-locked to the cue, the critical ERP component was the N1 (i.e., the first major negative peak post-stimulus). The authors took magnification of the N1 as the index of stimulus discrimination, which was hypothesized as also being evidence of attentional capture. This logic is the most problematic and will be discussed in the next paragraph. With two SOA values mixed (150 and 250 ms), participants performed a cuing task almost identical to the task used in Folk et al. (1992). To avoid waveform contamination from the target’s visual evoked potentials, only the data from the condition with an SOA of 250 ms were analyzed. The major finding was that the N1 of a congruent cue was greater than the N1 of an incongruent cue. Based on this finding, it was claimed that congruent cues capture attention while incongruent cues do not.

The magnification of the N1 has been shown to be a consequence of voluntary shifts of attention (e.g., Doallo et al., 2004, 2005; Fu et al., 2005; Hopfinger & Mangun, 1998, 2001; Hopfinger & West, 2006) and an index of attentional enhancement of signals (e.g., Luck et al., 1994). In this respect, the N1 is not an index for attentional capture or involuntary shifts of attention (for an N1 difference is not evident in tasks with exogenous cuing). N1 magnification indicates that attention is endogenously deployed. N1 magnification also reflects stimulus discrimination (Vogel & Luck, 2000). As shown in many studies, stimulus discrimination is not a process directly related to or enhanced by attentional capture (e.g., Prinzmetal, McCool, & Park, 2005). However, Arnott et al. took the latter index function (i.e., stimulus discrimination) as the very indication of attentional capture. According to these authors’ interpretation of Contingent Capture, top-down control settings modulate attentional capture in a way that stimuli are discriminated or identified before attention is allocated to the stimulus that contains the target-defining attribute. Despite this interpretation, their findings indicate only that the congruent cues were discriminated or processed more deeply than the incongruent cues. That is, the congruent cues attracted more processing capacity than the incongruent cues. However, this does not necessarily imply that spatial attention is captured by the congruent cues. Besides, the consensus is that N1 magnification does not index attentional capture. In conclusion, the evidence provided by Arnott et al. (2001) does not answer the essential question of whether attention is captured in a purely stimulus-driven way.

N2pc: Index of Attentional Capture

The present experiment tests Rapid Disengagement using an ERP component, while avoiding the issues raised above. The epochs will be time-locked to the cue onset, because the addressed question is concerned with the attentional-capture ability of the cue; that is, a component elicited by the cue should be the focus. Instead of relying on the visually evoked P1 or N1, the index of attentional capture is another ERP component, the N2pc: the negative difference that is observed over the posterior scalp region contralateral to the critical stimulus during the latency range of the N2 wave (200 – 300 ms; see Chapter 3 for more detail on the N2pc). Instead of varying SOAs, a fixed interval is used between the cue and the target. The choice of this SOA, which also determines how cleanly the N2pc waveform can be observed, is explained next.


Figure 4-1: Hypothetical temporal sequence of stimuli and the elicited ERP components. Each tick mark represents 50 ms, with time running from left to right. The upper time line depicts the likely overlap between the P1 and the N2pc if the cue-to-target SOA is 150 ms (as used in Folk et al., 1992). The lower time line shows that the N2pc can be separated from the P1, and thus observed much more clearly, when the SOA is 250 ms. The present experiment uses 250 ms as the SOA.

Usually, the latency of the N2pc falls within the window of 200 to 300 ms post-stimulus (e.g., Luck & Hillyard, 1990, 1994a, 1994b; Woodman & Luck, 1999, 2003). When a stimulus captures attention, the contralateral potentials are more negative than the ipsilateral potentials. In the original paradigm of Folk et al. (1992; see the review in Chapter 1), the SOA was 150 ms. Ideally, we should have used the same parameter in the present experiment. However, the expected N2pc component, if observed, would appear at around 200 ms after the onset of the cue and last for about 100 ms, and thus the N2pc elicited by the cue would overlap with the P1 evoked by the target (see the upper time line in Figure 4-1). If the target onset is moved to 250 ms after the cue onset, there will not be any overlap between the N2pc (to the cue) and the P1 (to the target). Thus, 250 ms was chosen as the constant SOA value in the ERP experiment.
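The timing argument can be checked with simple interval arithmetic. In the sketch below, the N2pc window (200 to 300 ms post-cue) comes from the text, whereas the P1 window of 80 to 130 ms post-target is an assumed, typical range that is not given in this document; the function names are likewise hypothetical.

```python
N2PC_WINDOW = (200, 300)   # ms after cue onset, as stated in the text
P1_WINDOW = (80, 130)      # ms after target onset; ASSUMED typical range

def windows_overlap(a, b):
    """True if two (start, end) intervals in ms share any time."""
    return a[0] < b[1] and b[0] < a[1]

def target_p1_overlaps_cue_n2pc(soa_ms):
    """Shift the target's P1 window onto the cue-locked time axis
    and test it against the cue's N2pc window."""
    p1_post_cue = (P1_WINDOW[0] + soa_ms, P1_WINDOW[1] + soa_ms)
    return windows_overlap(N2PC_WINDOW, p1_post_cue)

overlap_at_150 = target_p1_overlaps_cue_n2pc(150)   # P1 at 230-280 ms post-cue
overlap_at_250 = target_p1_overlaps_cue_n2pc(250)   # P1 at 330-380 ms post-cue
```

Under the assumed P1 window, an SOA of 150 ms places the target's P1 inside the N2pc window, while an SOA of 250 ms places it entirely after that window.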


Experiment II

Method

Subjects. Eleven individuals (average age 24.5 years; 3 women; 3 left-handed) were recruited. All of the participants reported normal or corrected-to-normal visual acuity and color vision, and they were unaware of the purpose of the experiment before being tested. Participants received US$24 as compensation.

Apparatus. Stimuli were presented on a 15” color monitor in SVGA mode. Trial events and data collection were controlled by an IBM-compatible, Pentium-based personal computer. The refresh rate was 60 Hz. The experiment was run by customized MATLAB software. A keyboard was used for the input of participants’ responses. Participants sat in an electrically shielded, sound-attenuated, dimly lit chamber. The electroencephalogram (EEG) was recorded from 128 scalp electrode sites (channels) and digitized at a sampling rate of 250 Hz. The EEG system was run by an Apple Macintosh computer with the commercial software Net Station (Electrical Geodesics, Inc.). For the electrode layout, see Figure 4-2.

Figure 4-2: The 128-channel HydroCel Geodesic Sensor Net (electrode numbers in green circles), mapped onto the approximately equivalent positions of the international 10-20 system (light blue; see Figure 3-1) and two additional standard electrodes (yellow).

Stimuli, Design, and Procedure. Everything was the same as in Experiment I, except for the following. The cue-to-target SOA was now 250 ms. Note that this SOA was longer than the cue duration (50 ms), so no display contained both the cue and the target. In Experiment I, there were four blocks for each cue-target combination. In the present experiment, to increase the number of trials and thereby the signal-to-noise ratio of the ERP waveforms, there were six blocks for each combination, with the first block serving as practice. In addition, participants were encouraged to suppress blinks until they had made a response and to blink after each trial. Because of the additional blocks and the roughly 30 minutes needed to apply the electrode net, the session lasted about 2 hours.

Event-related Potential Protocol. During off-line analysis, the digitized EEG data were re-referenced to mathematically linked mastoids. Trials contaminated by ocular or motion artifacts were excluded. Each epoch was time-locked to cue onset (i.e., time zero was the moment the cue appeared), extended from -200 ms to 500 ms, and was baseline corrected. The ERPs were averaged separately for each of the eight conditions (i.e., 2 cue attributes x 2 target attributes x 2 hemispherical locations of the cue). The window for observing the N2pc was 200 to 300 ms post-cue. Although all 128 channels were recorded, only ten were chosen for analysis. On the left side, the five channels were 58 (T5), 59, 65 (OL), 66, and 70 (O1); on the right side, they were channels 83 (O2), 84, 90 (OR), 91, and 96 (T6).
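
The measurement described above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code used for the thesis: the sampling rate, epoch limits, and N2pc window come from the text, but the baseline interval (the 200 ms preceding the cue) is an assumption, and the function names are hypothetical. Each channel is represented as a plain list of voltages for one averaged epoch.

```python
from statistics import fmean

FS_HZ = 250                 # sampling rate (from the text)
EPOCH_MS = (-200, 500)      # epoch limits relative to cue onset (from the text)
N2PC_MS = (200, 300)        # N2pc analysis window (from the text)
BASELINE_MS = (-200, 0)     # ASSUMPTION: pre-cue interval used as baseline


def to_index(t_ms):
    """Convert a cue-locked time in ms to a sample index within the epoch."""
    return round((t_ms - EPOCH_MS[0]) * FS_HZ / 1000)


def window_mean(samples, window_ms):
    """Mean voltage of one channel over the half-open window [start, end)."""
    start, end = to_index(window_ms[0]), to_index(window_ms[1])
    return fmean(samples[start:end])


def mean_amplitude(channels):
    """Baseline-corrected N2pc-window amplitude, averaged over channels."""
    return fmean([
        window_mean(ch, N2PC_MS) - window_mean(ch, BASELINE_MS)
        for ch in channels
    ])


def contra_minus_ipsi(left_channels, right_channels, cue_side):
    """Lateralized difference; a negative value is N2pc-like."""
    left = mean_amplitude(left_channels)
    right = mean_amplitude(right_channels)
    contra, ipsi = (right, left) if cue_side == "left" else (left, right)
    return contra - ipsi
```

With a 250-Hz sampling rate, each sample spans 4 ms, so the epoch contains 175 samples and the N2pc window covers samples 100 through 124.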

Behavioral Results and Discussion

Mean RTs as a function of Cue Attribute (onset or color), Target Attribute (onset or color), and Cue Validity (valid or invalid) are shown in Figure 4-3. A complete summary of the behavioral data is given in Table 4-1. A 2 x 2 x 2 repeated-measures ANOVA with the factors Cue Attribute, Target Attribute, and Cue Validity was performed on RT. The three-way interaction was significant [F(1, 10) = 16.32, p = .002]. The analysis also showed main effects of Target Attribute [F(1, 10) = 39.92, p < .001] and Cue Validity [F(1, 10) = 11.67, p = .007], as well as an interaction between Cue Attribute and Cue Validity [F(1, 10) = 29.74, p < .001]. Follow-up analyses were then conducted separately for each Target Attribute. For the onset targets, the interaction between Cue Attribute and Cue Validity was not significant, but it was significant for the color targets [F(1, 10) = 46.76, p < .001]. That is, the signature of Contingent Capture was only half present. When the target was defined by color, congruent cues but not incongruent cues produced a positive cuing effect. When the target was defined by onset, however, both types of cue led to positive cuing effects; although the cuing effect in the onset-cue-onset-target trials did not reach statistical significance, its size was close to that in the color-cue-onset-target trials. In short, when the target-defining attribute was onset, attention was captured by both types of cue. When the target-defining attribute was color, attention was captured only by the congruent cue.
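
Because every factor in this design has two levels, each effect in the ANOVA has one degree of freedom and can be computed from a single contrast score per participant: the F ratio equals the squared one-sample t statistic on those scores, with (1, n - 1) degrees of freedom (hence F(1, 10) for 11 participants). The Python sketch below illustrates this equivalence. It is not the software used for the reported analyses, and the function and variable names are hypothetical.

```python
from itertools import product
from statistics import fmean, variance

# Factor levels; the second level of each factor receives weight +1.
LEVELS = {
    "cue": ("onset", "color"),
    "target": ("onset", "color"),
    "validity": ("valid", "invalid"),
}


def contrast_score(cell_means, factors):
    """One participant's contrast for the main effect or interaction of
    `factors`, given mean RTs keyed by (cue, target, validity)."""
    score = 0.0
    for cue, target, validity in product(*LEVELS.values()):
        cell = {"cue": cue, "target": target, "validity": validity}
        weight = 1
        for factor in factors:
            weight *= 1 if cell[factor] == LEVELS[factor][1] else -1
        score += weight * cell_means[(cue, target, validity)]
    return score


def f_statistic(scores):
    """F(1, n - 1) for the hypothesis that the mean contrast is zero."""
    n = len(scores)
    return n * fmean(scores) ** 2 / variance(scores)
```

For example, passing `("cue", "target", "validity")` yields the three-way interaction contrast, and passing `("validity",)` yields the main effect of Cue Validity.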

Overall, the mean error rate was 3.14%. There was no evidence for a speed-accuracy trade-off: averaged across participants, error rates by condition were positively (but not significantly) correlated with mean RT (mean r = .06; t(10) = .89; p = .393). Consistent with Experiment I, there is no reason for the error data to alter the conclusion that was reached using mean RT.
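
The trade-off check can be outlined as follows: a Pearson correlation between error rate and mean RT is computed across conditions for each participant, and the resulting coefficients are tested against zero with a one-sample t-test. The Python sketch below is a hedged illustration with hypothetical function names; whether the coefficients were transformed (e.g., Fisher z) before testing is not stated, so they are used untransformed here.

```python
from math import sqrt
from statistics import fmean, stdev


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def one_sample_t(values, mu=0.0):
    """t statistic (df = n - 1) for the mean of `values` against `mu`."""
    n = len(values)
    return (fmean(values) - mu) / (stdev(values) / sqrt(n))
```

A negative mean correlation (faster conditions being less accurate) would signal a trade-off; a mean near zero, as reported here, does not.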

Figure 4-3: Mean response times as a function of Target Attribute, Cue Attribute, and Cue Validity.

Table 4-1: Mean response times (in ms) and error rates (in %) at each cue-target situation.

                     Onset Target                         Color Target
Cue     Validity     RT       ER     VE (RT)   VE (ER)    RT       ER     VE (RT)   VE (ER)
Onset   Valid        386.92   2.60   13.39     .58        459.92   3.25   -12.31    .55
        Invalid      400.31   3.18                        447.61   3.81
Color   Valid        383.50   2.36   14.07*    1.20       442.90   2.20   51.90**   2.01*
        Invalid      397.57   3.55                        494.79   4.21

Note. RT = response time (ms); ER = error rate (%); VE = validity effect (invalid minus valid). * p < .050; ** p < .001
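
The validity effects in Table 4-1 are the differences between the invalid and valid cell means. The short Python sketch below recomputes them from the tabled mean RTs; the variable names are hypothetical. Because the tabled means are already rounded to two decimals, a recomputed value can differ from the tabled effect by 0.01 ms (51.89 vs. 51.90 for the color-cue-color-target condition).

```python
# Mean RTs (ms) from Table 4-1, keyed by (cue, target) -> (valid, invalid).
MEAN_RT = {
    ("onset", "onset"): (386.92, 400.31),
    ("onset", "color"): (459.92, 447.61),
    ("color", "onset"): (383.50, 397.57),
    ("color", "color"): (442.90, 494.79),
}


def validity_effect(valid, invalid):
    """Cuing effect in ms: positive when valid trials are faster."""
    return round(invalid - valid, 2)


effects = {cond: validity_effect(*rts) for cond, rts in MEAN_RT.items()}
for (cue, target), effect in effects.items():
    print(f"{cue} cue, {target} target: {effect:+.2f} ms")
```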

This experiment was not designed to test Rapid Disengagement using the behavioral data, especially since the SOA was longer than in the original Folk et al. (1992) study (250 ms vs. 150 ms). To account for the current results in terms of Contingent Capture, the idea of distinct search modes (Bacon & Egeth, 1994) is incorporated. In the onset target condition, participants for some reason (e.g., the blink suppression procedure might have contributed to a control setting for onset-ness) fell into the singleton-detection mode, which enabled any singleton or physically salient stimulus to capture attention; both incongruent and congruent cues therefore produced validity effects. In the color target condition, the feature-search mode was adopted, and thus only the cue with the target-defining attribute captured attention. Hence, the behavioral results were consistent with Contingent Capture.

To predict the ERP results that would be consistent with the RT results, both hypotheses are taken into consideration. If Rapid Disengagement is correct (all cues capture attention), then the N2pc will be elicited by the cue in every condition regardless of the RT results. If Contingent Capture is correct (both types of cue capture attention in the onset target condition, under the singleton-detection mode, whereas color cues but not onset cues capture attention in the color target condition, under the feature-search mode), then the N2pc will be present in all of the conditions except the onset-cue-color-target condition.

Electrophysiological Results and Discussion

Two measures were extracted from each participant’s ERPs: Mean Amplitude and Negative Peak within the 200-300 ms post-cue window. An initial analysis showed no differences among channels on the same side of the scalp. Therefore, the data were averaged across the channels within each hemisphere, and Average of Mean Amplitude and Average of Negative Peak became the dependent variables in the following analyses: a 2 x 2 x 2 repeated-measures ANOVA (with the factors Cue Attribute, Target Attribute, and Hemispheric Side, i.e., contralateral or ipsilateral to the cue) and paired t-tests were performed on each dependent variable. Grand-averaged ERP waveforms are plotted in Figure 4-4. All waveforms were averaged over the 11 participants, over left and right hemisphere channel pairs, and over the four target locations.

For Average of Mean Amplitude, the ANOVA showed main effects of Cue Attribute [F(1, 10) = 8.19, p = .017] and Hemispheric Side [F(1, 10) = 7.31, p = .022]. Hemispheric Side also interacted with Cue Attribute [F(1, 10) = 6.09, p = .033] and with Target Attribute [F(1, 10) = 7.11, p = .024]. Paired t-tests on each cue-target combination confirmed that Cue Attribute rather than Target Attribute differentiated the physiological responses contralateral and ipsilateral to the cue, with greater responses on the contralateral side than on the ipsilateral side. For Average of Negative Peak, the ANOVA revealed main effects of Cue Attribute [F(1, 10) = 23.12, p = .001] and Hemispheric Side [F(1, 10) = 5.61, p = .039]; Hemispheric Side interacted with Cue Attribute [F(1, 10) = 21.91, p = .001] but not with Target Attribute. The t-tests on Average of Negative Peak showed the same pattern as those on Average of Mean Amplitude. In each condition, the lateralized effect occurred within the window of 200-300 ms post-cue when the cue stood out as an onset item but not when it was a color singleton. In other words, the N2pc was elicited by an onset cue, not by a color cue, regardless of the cue’s congruency with the target.

The ERP results are inconsistent with both Contingent Capture and Rapid Disengagement. In particular, both hypotheses predicted that congruent cues would elicit the N2pc. However, the N2pc was observed in the onset-cue-onset-target condition but not in the color-cue-color-target condition, in spite of the very large behavioral effect in the latter condition. In addition, Contingent Capture predicted no N2pc in only one of the four cue-target conditions. That condition was the onset-cue-color-target condition, and the results went against this prediction. But Rapid Disengagement is not supported either, because the other incongruent condition, with color cues, did not produce an N2pc. The pattern of N2pc results is impossible to explain using either hypothesis, but, as will be shown next, it is questionable whether the observed lateralized effects in the onset cue conditions are really examples of the N2pc.
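
The mismatch between predictions and data can be laid out condition by condition. The Python sketch below merely encodes the predictions and the observed pattern as stated in the text (with Contingent Capture read through the search-mode account adopted for the behavioral data); the function names are hypothetical.

```python
# All four cue-target combinations, as (cue attribute, target attribute).
CONDITIONS = [(cue, target)
              for cue in ("onset", "color")
              for target in ("onset", "color")]


def contingent_capture(cue, target):
    """N2pc predicted everywhere except onset cue with color target."""
    return target == "onset" or cue == "color"


def rapid_disengagement(cue, target):
    """N2pc predicted in every condition."""
    return True


def observed(cue, target):
    """Lateralized effect was found for onset cues only."""
    return cue == "onset"


def mismatches(prediction):
    """Conditions in which a prediction disagrees with the observed pattern."""
    return [c for c in CONDITIONS if prediction(*c) != observed(*c)]


print("Contingent Capture:", mismatches(contingent_capture))
print("Rapid Disengagement:", mismatches(rapid_disengagement))
```

Neither list of mismatches is empty: both hypotheses fail in the two color cue conditions, and Contingent Capture additionally fails in the onset-cue-color-target condition.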

Figure 4-4: Grand average event-related brain potentials for all four cue-target combinations. Each tick on the horizontal axis marks 100 ms, and each tick on the vertical axis marks 1 µV. Time zero is locked to the cue onset. Negative voltage is plotted up by convention. The green boxes are the time windows for statistical analyses.

The fact that the onset cue conditions, regardless of the cue’s congruency with the target, led to the lateralization makes one wonder whether it is a sensory effect instead of an attentional effect. In early N2pc work (e.g., Eimer, 1996; Girelli & Luck, 1997; Luck & Hillyard, 1994a, 1994b), the stimulus display typically consisted of multiple items presented on both sides of the visual field. With this balanced stimulation of both hemispheres, lateralized effects were found in target-present but not in target-absent conditions, indicating that the effects were not sensory-based and that spatial attention may be the cause. However, when only one item was presented (Luck & Hillyard, 1994b), thus making the display asymmetrical, lateralized effects were found in both target and non-target conditions, suggesting that these effects resulted from unbalanced sensory input instead of other processes such as attentional shifts. Therefore, N2pc experiments are usually conducted with symmetrical displays to avoid confounding sensory and attentional effects.

Unfortunately, asymmetrical displays were used for the onset cue conditions in the present experiment; this could not be avoided when testing Rapid Disengagement with the original version of the contingent cuing paradigm. In an onset cue display, one side of the visual field was stimulated more than the other. Hence, the lateralized difference in the conditions with onset cues may simply reflect an imbalance of sensory input between the two hemispheres. That is, the ERP differences elicited in conditions involving an onset cue are not convincing evidence of attentional effects. This means that the observed “N2pc” in the onset cue conditions cannot be taken as evidence of attentional capture.

Since the onset cue conditions cannot offer a convincing answer to the question that motivates the present study, the focus must now shift to the color cue conditions, which employed symmetrical displays. Contingent Capture predicts that the N2pc would be produced by the congruent cue (i.e., preceding a color target), but not by the incongruent cue (i.e., preceding an onset target). Rapid Disengagement predicts that an N2pc would be elicited by both types of cue (i.e., preceding either a color or an onset target). Unexpectedly, there is no evidence of an N2pc in either condition. Thus, neither of the hypotheses is supported. The question of whether attentional capture is purely stimulus-driven or contingent upon goal-driven settings is not answered here by examining online electrophysiological responses, since there are no such responses in this experiment.

Chapter 5

General Discussion

Summary

Exogenous orienting of attention refers to attentional shifts that are driven by stimuli. Orienting is purely exogenous if only bottom-up information is necessary to capture attention. It is not purely exogenous if top-down factors also play a role in attentional capture. This leads to the more specific question that has motivated the current work: Is attentional capture purely stimulus-driven or contingent upon goal-driven settings?

Previous research does not provide a clear answer, but it has yielded two specific hypotheses and an experimental paradigm. Taking the cuing effect – i.e., faster responses when the target location has been signaled by an uninformative cue than when it has not (Posner & Cohen, 1984) – to indicate attentional capture, Folk et al. (1992) found that an uninformative cue captured attention only if the cue contained the attribute that defined the target (i.e., congruent cues produced a cuing effect while incongruent cues did not).

The Contingent Capture hypothesis argues that goal-driven control settings (e.g., the knowledge of the target-defining attribute) modulate stimulus-driven orienting of attention. In contrast, the Rapid Disengagement hypothesis argues that attentional capture is a purely stimulus-driven process that cannot be influenced by goal-driven factors. According to Rapid Disengagement, the absence of a cuing effect in the incongruent condition is caused by the brevity of capture when the cue does not have the target’s defining attribute; this produced the null effects in Folk et al.’s (1992) experiments because those experiments used an SOA that was longer than the duration of the brief capture.

The empirical purpose of the present study was to discover which of the hypotheses accounts for the phenomenon found in Folk et al. (1992). Two parallel approaches were taken, with converging evidence expected. One was to use a very short cue-to-target stimulus onset asynchrony (SOA) and to observe whether a cuing effect (indicating attentional capture) would be produced in the incongruent condition. It was assumed that if attention was captured at all, it could not be disengaged within a time as brief as 35 ms. In the other approach, brain waves were recorded while the cuing task was performed. An event-related potential (ERP) component – the N2pc – was taken as the index of attentional capture. It has been theorized that if a stimulus captures attention, the ERP recorded over the posterior scalp contralateral to the stimulus is more negative than that recorded ipsilateral to it, within the latency range of 200 to 300 ms post-stimulus. Contingent Capture predicts that only congruent cues capture attention, so that cuing effects and an N2pc would be found in the congruent condition but not in the incongruent condition. Rapid Disengagement predicts that all salient cues, regardless of congruency, capture attention, and thus cuing effects and an N2pc would be observed in both congruent and incongruent conditions.

In Experiment I, using the short-SOA approach, cuing effects were shown only in the congruent condition. In Experiment II, using the ERP approach, the N2pc was absent from all conditions. By the logic discussed above, this indicates that attention is captured by congruent cues but not by incongruent cues (based on RT), and also that attention is not captured by either type of cue (based on ERPs). These two conclusions cannot coexist. Because both Contingent Capture and Rapid Disengagement agree that congruent cues capture attention (which was verified by the RT data), the ERP results cannot be accepted as providing an answer to the addressed question. Therefore, based only on the short-SOA behavioral experiment, Contingent Capture is favored over Rapid Disengagement as the account for the null effect in Folk et al.’s (1992) incongruent conditions.

As to why the ERP experiment could not provide evidence of attentional capture by any type of cue, several possibilities are considered. Since, as mentioned, Contingent Capture and Rapid Disengagement both predict attentional capture by congruent cues, the discussion concerns the congruent condition. In addition, the discussion focuses on the color-cue-color-target condition, because the onset-cue-onset-target condition cannot provide convincing evidence of purely attentional effects using lateralized ERPs, given the sensory imbalance produced by onset cue displays.

Why is the N2pc Absent?

Assumption 1: The cuing effect in RT does not reflect attentional capture.

From the time of seeing the target to that of pressing a key, several stages of information processing are involved. First, stimuli are encoded and processed in parallel through the stages of perception. Second, encoded stimuli are transferred to visual short-term memory (VSTM), where they await further processing. Some or all of the information in VSTM is then transferred, one item at a time, to the central stage, where working memory plays a role in deciding what to do about the input. For instance, perceptual inputs are transformed into abstract or semantic information that is understandable for the motor system (e.g., press the right-hand key if the target is an “=”). After the process of response selection, motor programming is initiated to execute the response. Visuospatial attention is shifted (if shifted at all) during the processes of perception. Stimuli in the attended location are facilitated and transferred more rapidly to VSTM. However, pieces of information in VSTM must also be selected before being transferred to the central stage. This second form of selection is called attentional prioritization (Yantis & Johnson, 1990). That is, a stimulus that is responded to more quickly than another may gain its advantage from attentional capture (perceptual facilitation) and/or attentional prioritization (post-perceptual facilitation).

In the present study, the cuing effects in the color-cue-color-target condition are rather large (91.27 ms in Experiment I and 51.90 ms in Experiment II), but the N2pc, which is taken to index attentional capture, is absent. According to this view, it is now suggested that the cuing effect results from attentional prioritization rather than attentional capture. Assuming that the odd item – the cue – in a cue display does not capture attention, the cue activates a prioritization mechanism, which operates on VSTM after the following target display is encoded. That is, the stimuli in the target display are all encoded without any one of them being facilitated by visuospatial attention (i.e., attention is at fixation when the target display is presented), and information from the entire target display is transferred to VSTM. Instead, the attentional prioritization that has been activated by the cue is channeled to the cued location, so that any information in VSTM from the cued location is given priority for central processing. Therefore, based on this assumption, the cuing effect in RT is the outcome of a process beyond perception, and attentional capture is not the cause. If attentional capture is not the cause, then the N2pc, of course, will not be observed. In this way, the assumption that the cuing effects in RT are not due to attentional capture can be used to reconcile the data from the present two experiments. Moreover, the data from the ERP experiment could be used to support the idea that the cuing effects observed in the contingent capture paradigm are not actually caused by attentional capture.

Assumption 2: The N2pc does not reflect attentional capture.

The cuing effect in RT, according to many hypotheses, including Contingent Capture and Rapid Disengagement, has been taken as evidence of attentional capture. Assuming that this is correct, the finding of a robust cuing effect in RT together with the absence of an N2pc suggests that the N2pc may not reflect attentional capture without additional processes being involved. Going even further, the N2pc may not be an index of attentional capture at all but of a process that very often follows attentional capture. In fact, the actual indexing function of the N2pc has begun to be redefined (e.g., van Velzen & Eimer, 2003). Recall from Chapter 3 the research that was reviewed to motivate the choice of the N2pc. Those studies employed a search array in which the target captured attention. Because of the task demand, the target in a search array not only captured attention but also very likely initiated processes that subserve discrimination or identification of the target. That is, processing resources or capacity would be allocated to the target after the target was localized. Consequently, the target would be fully processed and, thus, a proper response could be selected and executed. Resource allocation, therefore, always follows attentional capture in typical N2pc experiments. If we now assume that the N2pc does not reflect attentional capture, it is plausible that the N2pc actually reflects the subsequent resource allocation instead.

In the present study, the N2pc is missing in the color-cue-color-target condition, where the cuing effect in RT, which indicates attentional capture, is highly significant. However, the cuing effect does not necessarily imply that processing resources were allocated to the cue. Because the cues are task-irrelevant and not indicative of target location, the cue need not be discriminated, and resource allocation to the cue is not essential for the participant to perform the task. If the presence of the N2pc reflects resource allocation rather than attentional capture, and if the cue does not need resource allocation, then the N2pc will not be observed. Thus, the assumption that the N2pc is an index of resource allocation instead of attentional capture can also reconcile the data from the two experiments. In this case it is argued that attentional capture occurs in the congruent conditions, but resources are not allocated to the location of the cue because the cues are known to be unimportant.

Conclusion

The attempt to answer the question “Is the capture of attention purely stimulus-driven or contingent upon goal-driven settings?” was made here by testing Rapid Disengagement in the context of a contingent cuing task. Rapid Disengagement did not pass the test, which consisted of two parallel experiments. The failure of the electrophysiological experiment to yield interpretable results does not damage the credibility of the behavioral experiment. That is, no evidence was obtained that attention is captured by a stimulus simply because of the stimulus’s physical properties, such as salience. Until further studies on the indexing function of the N2pc are done and/or new ERP components for attentional capture are discovered, an ERP approach might be of little use in answering the addressed question. Alternatively, if the N2pc does reflect attentional capture, a different approach to the behavioral experiments might be needed to address the issue of whether the cuing effects in RT are due to attentional prioritization instead of attentional capture. Until then, goal-driven settings cannot be excluded from the factors that influence stimulus-driven shifts of attention. Contingent Capture – attentional capture contingent on top-down control settings – is more convincing than, and thus favored over, Rapid Disengagement as the account for the original finding in Folk et al. (1992): the null effects in the incongruent cue conditions.

Bibliography

Arnott, S. R., Pratt, J., Shore, D. I., & Alain, C. (2001). Attentional set modulates visual areas: An event-related potential study of attentional capture. Cognitive Brain Research, 12(3), 383-395.

Bacon, W. F., & Egeth, H. E. (1994). Overriding stimulus-driven attentional capture. Perception & Psychophysics, 55(5), 485-496.

Berger, A., Dori, H., & Henik, A. (1999). Peripheral non-informative cues do induce early facilitation of target detection. European Journal of Cognitive Psychology, 11(1), 119-137.

Clark, V. P., & Hillyard, S. A. (1996). Spatial selective attention affects early extrastriate but not striate components of the visual evoked potential. Journal of Cognitive Neuroscience, 8, 387-402.

Collie, A., Maruff, P., Yucel, M., Danckert, J., & Currie, J. (2000). Spatiotemporal distribution of facilitation and inhibition of return arising from the reflexive orienting of covert attention. Journal of Experimental Psychology: Human Perception and Performance, 26(6), 1733-1745.

Corbetta, M., & Shulman, G. L. (2002). Control of goal-directed and stimulus-driven attention in the brain. Nature Reviews Neuroscience, 3(3), 201-215.

Di Lollo, V., Enns, J. T., & Rensink, R. A. (2000). Competition for consciousness among visual events: The psychophysics of reentrant visual processes. Journal of Experimental Psychology: General, 129(4), 481-507.

Doallo, S., Lorenzo-Lopez, L., Vizoso, C., Holguin, S. R., Amenedo, E., Bara, S., & Cadaveira, F. (2004). The time course of the effects of central and peripheral cues on visual processing: An event-related potentials study. Clinical Neurophysiology, 115, 199-210.

Doallo, S., Lorenzo-Lopez, L., Vizoso, C., Holguin, S. R., Amenedo, E., Bara, S., & Cadaveira, F. (2005). Modulations of the visual N1 component of event-related potentials by central and peripheral cueing. Clinical Neurophysiology, 116, 807-820.

Eimer, M. (1996). The N2pc component as an indicator of attentional selectivity. Electroencephalography and Clinical Neurophysiology, 99, 225-234.

Enns, J. T. (2004). Object substitution and its relation to other forms of visual masking. Vision Research, 44(12), 1321-1331.

Enns, J. T., & Di Lollo, V. (1997). Object substitution: A new form of masking in unattended visual locations. Psychological Science, 8(2), 135-139.

Folk, C. L., & Annett, S. (1994). Do locally defined feature discontinuities capture attention? Perception & Psychophysics, 56, 277-287.

Folk, C. L., & Remington, R. W. (1998). Selectivity in distraction by irrelevant featural singletons: Evidence from two forms of attentional capture. Journal of Experimental Psychology: Human Perception and Performance, 24, 847-858.

Folk, C. L., & Remington, R. W. (1999). Can new objects override attentional control settings? Perception & Psychophysics, 61, 727-739.

Folk, C. L., & Remington, R. W. (2006). Top-down modulation of preattentive processing: Testing the recovery account of contingent capture. Visual Cognition, 14, 445-465.

Folk, C. L., Remington, R. W., & Johnston, J. C. (1992). Involuntary covert orienting is contingent on attentional control settings. Journal of Experimental Psychology: Human Perception and Performance, 18, 1030-1044.

Folk, C. L., Remington, R. W., & Wright, J. H. (1994). The structure of attentional control: Contingent attentional capture by apparent motion, abrupt onset, and color. Journal of Experimental Psychology: Human Perception and Performance, 20, 317-329.

Fu, S., Greenwood, P. M., & Parasuraman, R. (2005). Brain mechanisms of involuntary visuospatial attention: An event-related potential study. Human Brain Mapping, 25, 378-390.

Girelli, M., & Luck, S. J. (1997). Are the same attentional mechanisms used to detect visual search targets defined by color, orientation, and motion? Journal of Cognitive Neuroscience, 9, 238-253.

Helmholtz, H. (1924-1925). Helmholtz’s Treatise on Physiological Optics, translated from the 3rd German edition. Edited by James P. C. Southall. Rochester, NY: The Optical Society of America.

Hillyard, S. A., Luck, S. J., & Mangun, G. R. (1994). The cuing of attention to visual field locations: Analysis with ERP recordings. In H. J. Heinze, T. F. Munte, & G. R. Mangun (Eds.), Cognitive Electrophysiology. Boston: Birkhauser.

Hopfinger, J. B., & Mangun, G. R. (1998). Reflexive attention modulates processing of visual stimuli in human extrastriate cortex. Psychological Science, 9, 441-447.

Hopfinger, J. B., & Mangun, G. R. (2001). Tracking the influence of reflexive attention on sensory and cognitive processing. Cognitive, Affective, & Behavioral Neuroscience, 1, 56-65.

Hopfinger, J. B., & Ries, A. J. (2005). Automatic versus contingent mechanisms of sensory-driven neural biasing and reflexive attention. Journal of Cognitive Neuroscience, 17, 1341-1352.

Hopfinger, J. B., & West, V. M. (2006). Interactions between endogenous and exogenous attention on cortical visual processing. Neuroimage, 31(2), 774-789.

James, W. (1890). The Principles of Psychology, Vol. 1. New York: Dover.

Johannes, S., Munte, T. F., Heinze, H. J., & Mangun, G. R. (1995). Luminance and spatial attention effects on early visual processing. Cognitive Brain Research, 2, 189-205.

Jonides, J., & Yantis, S. (1988). Uniqueness of abrupt visual onset in capturing attention. Perception & Psychophysics, 43, 346-354.

Jonides, J. (1981). Voluntary versus automatic control over the mind’s eye’s movement. In J. B. Long & A. D. Baddeley (Eds.), Attention & Performance IX (pp. 187-203). Hillsdale, NJ: Erlbaum.

Kahneman, D., Treisman, A., & Burkell, J. (1983). The cost of visual filtering. Journal of Experimental Psychology: Human Perception and Performance, 9, 510-522.

Lamy, D. (2005). Temporal expectations modulate attentional capture. Psychonomic Bulletin & Review, 12(6), 1112-1119.

Lleras, A., & Moore, C. M. (2003). When the target becomes the mask: Using apparent motion to isolate the object-level component of object substitution masking. Journal of Experimental Psychology: Human Perception and Performance, 29(1), 106-120.

Logan, G. D. (2005). The time it takes to switch attention. Psychonomic Bulletin & Review, 12(4), 647-653.

Luck, S. J., & Hillyard, S. A. (1994a). Electrophysiological correlates of feature analysis during visual search. Psychophysiology, 31, 291-308.

Luck, S. J., & Hillyard, S. A. (1994b). Spatial filtering during visual search: Evidence from human electrophysiology. Journal of Experimental Psychology: Human Perception and Performance, 20, 1000-1014.

Luck, S. J. (2005). An introduction to the event-related potential technique. Cambridge, MA: The MIT Press.

Luck, S. J., Hillyard, S. A., Mouloua, M., Woldorff, M. G., Clark, V. P., & Hawkins, H.

L. (1994). Effects of spatial cuing on luminance detectability: Psychophysical and

electrophysiological evidence for early selection. Journal of Experimental

Psychology: Human Perception and Performance, 20, 887-904.

Luck, S. J., Woodman, G. F., & Vogel, E. K. (2000). Event-related potential studies of

attention. Trends in Cognitive Sciences, 4, 432-440.

Lupiáñez, J., & Weaver, B. (1998). On the time course of exogenous cueing effects: A

commentary on Tassinari et al (1994). Vision Research, 38(11), 1621-1623.

Mangun, G. R., Hillyard, S. A., & Luck, S. J. (1993). Electrocortical substrates of visual

selective attention. In D. Meyer & S. Kornblum (Eds.). Attention & Performance

XIV. (pp.219-243). Cambridge, MA: MIT Press.

Maruff, P., Yucel, M., Danckert, J., Stuart, G., & Currie, J. (1999). Facilitation and

inhibition arising from the exogenous orienting of covert attention depends on the

temporal properties of spatial cues and targets. Neuropsychologia, 37(6), 731-744.

Miller, J. (1989). The control of attention by abrupt visual onsets and offsets. Perception

& Psychophysics, 45(6), 567-571.

Moore, C. M., & Lleras, A. (2005). On the role of object representations in substitution

masking. Journal of Experimental Psychology: Human Perception and

Performance, 31(6), 1171-1180.

Moore, C. M., Egeth, H., Berglan, L. R., & Luck, S. J. (1996). Are attentional dwell

times inconsistent with serial visual search? Psychonomic Bulletin & Review, 3(3),

360-365.

Posner, M. I. & Cohen, Y. A. (1984). Components of visual orienting. In H. Bouma &

D. G. Bouwhuis (Eds.), Attention & Performance X (pp. 531-556). Hillsdale, NJ:

Erlbaum.

Posner, M. I. (1980). Orienting of attention. Quarterly Journal of Experimental

Psychology, 32, 3-25.

Posner, M. I., & Snyder, C. R. R. (1975). Facilitation and inhibition in the processing of

signals. In P. M. A. Rabbitt & S. Dornic (Eds.), Attention & Performance V.

Hillsdale, NJ: Erlbaum.

Posner, M. I., Nissen, M. J., & Ogden, W. C. (1978). Attended and unattended processing

modes: The role of set for spatial location. In H. L. Pick & E. Saltzman (Eds.),

Modes of perceiving and processing information. New Jersey: Lawrence Erlbaum

Associates, Inc.

Posner, M. I., Rafal, R. D., Choate, L. S., & Vaughan, J. (1985). Inhibition of return:

Neural basis and function. Cognitive Neuropsychology, 2(3), 211-228.

Posner, M. I., Sheese, B. E., Odludas, Y., & Tang, Y. Y. (2006). Analyzing and shaping

human attentional networks. Neural Networks, 19, 1422-1429.

Posner, M. I., Snyder, C. R. R., & Davidson, B. J. (1980). Attention and the detection of

signals. Journal of Experimental Psychology: General, 109, 160-174.

Prinzmetal, W., McCool, C., & Park, S. (2005). Attention: Reaction time and accuracy

reveal different mechanisms. Journal of Experimental Psychology: General, 134(1),

73-92.

Remington, R. W., Folk, C. L., & McLean, J. P. (2001). Contingent attentional capture or

delayed allocation of attention? Perception & Psychophysics, 63, 298-307.

Sperling, G. (1960). The information available in brief visual presentations.

Psychological Monographs, 74(11), 1-29.

Theeuwes, J. (1991a). Exogenous and endogenous control of attention: The effect of

visual onsets and offsets. Perception & Psychophysics, 49, 83-90.

Theeuwes, J. (1991b). Cross-dimensional perceptual selectivity. Perception &

Psychophysics, 50, 184-193.

Theeuwes, J. (1992). Perceptual selectivity for color and form. Perception &

Psychophysics, 51, 599-606.

Theeuwes, J. (1994). Stimulus-driven capture and attentional set: Selective search for

color and visual abrupt onsets. Journal of Experimental Psychology: Human

Perception and Performance, 20, 799-806.

Theeuwes, J. (1996). Perceptual selectivity for color and form: On the nature of the

interference effect. In A.F. Kramer, M.G.H. Coles, & G.D. Logan (Eds.).

Converging Operations in the Study of Visual Attention. Washington, DC: American

Psychological Association.

Theeuwes, J. (2004). Top-down search strategies cannot override attentional capture.

Psychonomic Bulletin & Review, 11(1), 65-70.

Theeuwes, J., & Godijn, R. (2002). Irrelevant singletons capture attention: Evidence from

inhibition of return. Perception & Psychophysics, 64(5), 764-770.

Theeuwes, J., Atchley, P., & Kramer, A. F. (2000). On the time course of top-down and

bottom-up control of visual attention. In S. Monsell & J. Driver (Eds.). Attention &

Performance XVIII. Cambridge: MIT Press.

Theeuwes, J., Godijn, R., & Pratt, J. (2004). A new estimation of the duration of

attentional dwell time. Psychonomic Bulletin & Review, 11(1), 60-64.

Treisman, A. M., & Gelade, G. (1980). Feature-integration theory of attention. Cognitive

Psychology, 12(1), 97-136.

Vogel, E. K., & Luck, S. J. (2000). The visual N1 component as an index of a

discrimination process. Psychophysiology, 37(2), 190-203.

Volkow, N. D., Rosen, B., & Farde, L. (1997). Imaging the living human brain: Magnetic

resonance imaging and positron emission tomography. Proceedings of the National

Academy of Sciences of the United States of America, 94(7), 2787-2788.

Wijers, A. A., Lange, J. J., Mulder, G., & Mulder, L. J. M. (1997). An ERP study of

visual spatial attention and letter target detection for isoluminant and nonisoluminant

stimuli. Psychophysiology, 34, 553-565.

Wolber, M. & Wascher, E. (2003). Visual search strategies are indexed by event-related

lateralizations of the EEG. Biological Psychology, 63, 79-100.

Wolber, M. & Wascher, E. (2005). The posterior contralateral negativity as a temporal

indicator of visuo-spatial processing. Journal of Psychophysiology, 19, 182-194.

Woldorff, M. G. (1993). Distortion of ERP averages due to overlap from temporally

adjacent ERPs: Analysis and correction. Psychophysiology, 30, 98-119.

Wolfe, J. M. (1994). Guided Search 2.0: A revised model of visual search. Psychonomic

Bulletin & Review, 1(2), 202-238.

Woodman, G. F., & Luck, S. J. (1999). Electrophysiological measurement of rapid shifts of

attention during visual search. Nature, 400, 867-869.

Woodman, G. F., & Luck, S. J. (2003). Serial deployment of attention during visual search.

Journal of Experimental Psychology: Human Perception and Performance, 29, 121-138.

Yantis, S. & Jonides, J. (1984). Abrupt visual onsets and selective attention: Evidence

from visual search. Journal of Experimental Psychology: Human Perception and

Performance, 10, 601-621.

Yantis, S., & Hillstrom, A. P. (1994). Stimulus-driven attentional capture: Evidence from

equiluminant visual objects. Journal of Experimental Psychology: Human

Perception and Performance, 20(1), 95-107.

Yantis, S., & Johnson, D. N. (1990). Mechanisms of attentional priority. Journal of

Experimental Psychology: Human Perception and Performance, 16(4), 812-825.

VITA

Peggy Chen

Education

• Ph.D. in Psychology, Pennsylvania State University, University Park, PA, May 2007.
• Specialization in cognitive and affective neuroscience (SCAN), Department of Psychology, Pennsylvania State University, University Park, PA, September 2005.
• M.S. in Psychology, Pennsylvania State University, University Park, PA, May 2004.
• B.S. in Medical Radiation Technology, National Yang-Ming University, Taipei, Taiwan, June 2001.

Awards

• Grants-in-Aid of Research for Doctoral Dissertation, awarded by Sigma Xi, April 2006.
• Dissertation Support, awarded by the College of the Liberal Arts, Pennsylvania State University, April 2006.

Theses

• Orienting Visual Attention in Space: Is capture of attention purely stimulus-driven or contingent upon goal-driven settings? Department of Psychology Doctoral Dissertation, Pennsylvania State University, to be completed in May 2007.
• Another Distinction between Exogenous and Endogenous Shifts of Attention. Department of Psychology Master's Thesis, Pennsylvania State University, May 2004.

Submitted Articles

• Chen, P., & Mordkoff, J. T. (in press, Visual Cognition). Contingent Capture at a very short SOA: Evidence against Rapid Disengagement.
• Mordkoff, J. T., Halterman, R., & Chen, P. (submitted to Psychonomic Bulletin & Review). Why does the effect of exogenous cuing depend on the number of display locations?
• Chen, P., Moore, C. M., & Mordkoff, J. T. (in press, American Journal of Psychology). On the spatial metric of short-SOA costs of exogenous cuing.
• Moore, C. M., Lanagan, L., Chen, P., Halterman, R., & Fine, E. (in press, Perception & Psychophysics). Non-spatial attributes of stimuli can influence spatial limitations of attentional control.