
INVESTIGATION OF HUMAN VISUAL SPATIAL ATTENTION WITH fMRI AND

GRANGER CAUSALITY ANALYSIS

by

Wei Tang

A Dissertation Submitted to the Faculty of

The Charles E. Schmidt College of Science

in Partial Fulfillment of the Requirements for the Degree of

Doctor of Philosophy

Florida Atlantic University

Boca Raton, FL

December 2011

Copyright by Wei Tang 2011

ACKNOWLEDGEMENTS

I would like to thank all those who have helped this thesis reach its final completion. My utmost gratitude goes to Dr. Steven Bressler, who has been advising me for five years. Without his knowledge and insights, and especially his enormous patience, I would have lost the battle against all those complicated problems. My appreciation goes to our collaborators at Washington University in St. Louis: Drs. Maurizio Corbetta, Gordon Shulman and Chad Sylvester, who designed and conducted the behavioral experiments and generously provided us with the fMRI data, together with discussions on the results. I thank my friends from the laboratory, Tracy Romano, Marmaduke Woodman and Timothy Meehan, for giving valuable suggestions on the analysis methods. I also thank the Scientific Squirrels Club of China for letting me present my work to a general audience in my home country.

Finally, I would like to thank my parents and my sister, who have always been there with deep love to encourage and support me in chasing my dream.

ABSTRACT

Author: Wei Tang

Title: Investigation of Human Visual Spatial Attention with fMRI and Granger Causality Analysis

Institution: Florida Atlantic University

Dissertation Advisor: Dr. Steven L. Bressler

Degree: Doctor of Philosophy

Year: 2011

Contemporary understanding of human visual spatial attention rests on the hypothesis of top-down control signals sent from cortical regions carrying out higher-level functions to sensory regions. Evidence has been gathered through functional Magnetic Resonance Imaging (fMRI) experiments. The Frontal Eye Field (FEF) and IntraParietal Sulcus (IPS) are candidates proposed to form the frontoparietal attention network for top-down control. In this work we examined the influence patterns between the frontoparietal network and the Visual Occipital Cortex (VOC) using a statistical measure, Granger Causality (GC), with fMRI data acquired from subjects who participated in a covert attention task. We found a directional asymmetry in GC between FEF/IPS and VOC, and further identified retinotopically specific control patterns in top-down GC. This work may lead to a deeper understanding of goal-directed attention, as well as to the application of GC to analyzing higher-level cognitive functions in the healthy, functioning human brain.

DEDICATION

To Xuxu, Yark and The Mad Mountain, who keep reminding me how important and joyful it is to keep creative.

INVESTIGATION OF HUMAN VISUAL SPATIAL ATTENTION WITH fMRI AND

GRANGER CAUSALITY ANALYSIS

List of Tables
List of Figures
Introduction
  A Brief Historical Account
  The Role of FEF and IPS in Attentional Control
  The Large-Scale Cortical Network Approach
  Scenario of Current Study
Top-Down Control of Visual Cortex by Frontal and Parietal Cortex in Anticipatory Visual Spatial Attention
  Introduction
  Materials and Methods
  Results
  Discussion
Measuring Granger Causality Between Cortical Regions from Voxelwise fMRI BOLD Signals with LASSO
  Introduction
  Materials and Methods
  Results
  Discussion
Retinotopically Oriented Top-Down Modulation of the Visual Cortex during Visual Spatial Attention
  Introduction
  Materials and Methods
  Results
  Discussion
Discussion and Conclusions
  Relate Findings to Theory: A Biased-Competition Model
  Spatiotopic Maps: A Control Mechanism without Agency
  Technical Issues Concerning GC Application to fMRI
  Summary and Conclusions
Bibliography

LIST OF TABLES

3.1 The Fraction of Non-Zero Coefficients in Each of the 4 Submatrices for Each of the 56 Simulation Models
4.1 Paired-Sample t Tests for the Difference Between Groups of Summary GC Scores
4.2 Repeated-Measures ANOVA for Retinotopy Effect (Pre-Target Condition)
4.3 Repeated-Measures ANOVA for Retinotopy Effect (Control Condition 1)
4.4 Repeated-Measures ANOVA for Retinotopy Effect (Control Condition 2)
4.5 Repeated-Measures ANOVA for Retinotopy Effect (Post-Target Test Condition)
4.6 Repeated-Measures ANOVA for Cue Effect (Pre-Target Test Condition)

LIST OF FIGURES

1.1 Schematic Depiction of Four Influential Accounts of Selective Attention
1.2 Lateral View of the Left Hemisphere of Human and Monkey Brain Showing the Location of FEF
1.3 Lateral View of Macaque Monkey Brain and Human Brain Showing the Location of IPS
2.1 Visual Spatial Attention Behavioral Paradigm
2.2 Top-Down and Bottom-Up Granger Causality F - for a Representative ROI Pair in One Subject
2.3 Top-Down Versus Bottom-Up Granger Causality
2.4 Top-Down Granger Causality Before Correct Versus Incorrect Performance
3.1 Simple Driving Patterns That Can Lead to Spurious Identification of Significant Granger Causality
3.2 Schematic Illustration of the Computation of Summary f and W for Hypothetical Submatrix Byx
3.3 Granger Causality Patterns Between Simulated ROIs
3.4 Comparison of Model Estimation by LASSO-GC and Pairwise-GC Methods for One Simulation Model
3.5 Comparison of LASSO-GC and Pairwise-GC Methods in Recovering the f Summary Statistic
3.6 Comparison of LASSO-GC and Pairwise-GC Methods in Recovering the W Summary Statistic
3.7 Comparison of Connectivity Patterns with LASSO-GC and Cross-Correlation Measures
3.8 Functional Connectivity Analysis of Dorsal Attention Network and Visual Occipital Cortex in Visual Spatial Attention
4.1 Illustration of the Locations of Randomly Sampled ROIs Outside FEF, IPS and VOC
4.2 of the Summary Scores Over Subjects
4.3 Time Plots Showing the Retinotopy × Direction Interaction in the Pre-Target Test Condition
4.4 Time Plots Showing the Retinotopy × Direction Interaction in Control Condition 1
4.5 Time Plots Showing the Retinotopy × Direction Interaction in Control Condition 2
4.6 Time Plots Showing the Retinotopy × Direction Interaction in the Post-Target Test Condition
4.7 Summary Time Plot for the Test Condition
4.8 Invariant Time and Cue Effect in the Preparatory Period

CHAPTER 1. INTRODUCTION

Everyone knows what attention is. It is the taking possession by the mind in clear and vivid form, of one out of what seem several simultaneously possible objects or trains of thought...It implies withdrawal from some things in order to deal effectively with others, and is a condition which has a real opposite in the confused, dazed, scatterbrained state.

William James (1890)

Everyone knows what attention is, yet no one knows exactly how it works. Introspectively, as William James put it, it is easy to recognize the subjective feeling of attention as a process of reallocating mental resources to bypass the capacity limit of information processing. Such a feeling commonly arises in two different situations: when a prominent event in the outside world draws our attention toward it, or when an endogenous goal directs us to events of our own interest. What leads to this phenomenon cognitively remains a puzzle far from resolved.

A Brief Historical Account

There has been a long quest in psychology to pin down the mechanism of attention by measuring behavior (Pashler, 1999). A predominant stream of experimental design utilized competing stimuli, where attentional effects were measured through interference from unwanted information sources. Participants were usually presented with more information than they could handle, so that physical and/or semantic properties of the stimuli could be manipulated in such a way that interference might cause inaccurate or delayed responses. The interference was a probe to detect the limit of information processing, quantitatively analyzed with performance accuracy and/or response time (RT). Cognitive models based on such analyses treat the inability to simultaneously process all sensory inputs as a bottleneck, and attention as a filter (Broadbent, 1958; Treisman, 1969; Deutsch and Deutsch, 1963) that decides which materials pass through the bottleneck and how.

The filter model is an intuitive explanation for the subjective feeling of attention as a resolution for limited processing capacity. However, although they differ in details, the filter models have all been caught up in controversies over the extent to which information from competing sources is processed and the stage at which selection takes place. In fact, these controversies are hard to settle solely with this type of behavioral test. It was later suggested that, on top of a passive filtering paradigm, executive control should be brought into scope to solve the competition problem (Posner and DiGirolamo, 1998; Norman and Shallice, 1986).

The study of subjective aspects of consciousness such as executive control used to be disputed by experimentalists, due to a lack of objective methods for investigating the control functions. Contemporary psychologists have started to bring back the analysis of subjective mental processes, but taking an impersonal stance rather than relying on introspection. Among the pioneers, Michael Posner (1980) proposed a paradigm to link behavioral measures with endogenous attentional control. Subjects in the paradigm make use of a preparatory cue to attend to targets located away from the fixation point without making eye movements (covertly). This covert attention shift was demonstrated to be time-locked, i.e. the amount of time it takes to orient to the target location depends on the distance between the target and the fixation point. Thus, by way of the Posner task, attention becomes a measurable phenomenon that can be manipulated by tuning experimental parameters.

Figure 1.1: Schematic depiction of four influential accounts of selective attention. (A) the early-selection filter theory; (B) a rival late-selection account; (C) Treisman’s (1960) “attenuation” version of Broadbent’s theory; (D) Treisman and Gelade’s (1980) feature integration theory. Adapted with permission from Driver, 2001.

Parallel to the exploration with behavioral approaches, there have also been discoveries concerning the neural basis of attention. Converging early evidence has pointed to candidates distributed across the brain: the prefrontal cortex, known for carrying out executive functions (Roberts et al., 1998; Fuster, 2008), the posterior parietal cortex, whose impairment leads to neglect syndrome (Mesulam, 1981; Driver and Mattingley, 1998; Corbetta et al., 2005), and the anterior cingulate cortex, which carries out a wide range of functions including selective attention (Posner and DiGirolamo, 1998; Bush et al., 2000; Crottaz-Herbette and Menon, 2006). Contemporary recording techniques have made possible a more precise definition of functional areas in vivo. It is now possible to record brain activity from subjects participating in behavioral tasks and to identify specific loci inside the brain that account for the task events. With these techniques we are now able to look more closely into the following question: if spatial attention is to solve the competition problem among neuronal processes by way of top-down control, then where does the control signal originate, where is it sent, and in what manner does the competition get resolved? Neurophysiological changes that correlate with goal-directed behavior have been found consistently in the Frontal Eye Field (FEF) and the IntraParietal Sulcus (IPS) (Mesulam, 1981; Corbetta, 1998; Corbetta and Shulman, 2002), two regions of particular interest in this dissertation work. If the tentative control regions are FEF and IPS, and the target of modulation is the visual occipital cortex, then the remaining question is how their interactions account for attentional control.

A candidate mechanism might be the biased-competition model proposed by Desimone and Duncan (1995a), in which the saliency map from bottom-up processing in the visual cortex is amended by top-down control, with computational facilitation biased toward voluntarily attended locations. If we take this model as our hypothesis, it presumes modulation from the frontoparietal control regions onto the visual processing regions. The key to verifying this hypothesis is then to demonstrate that such modulation exists and that its pattern is closely related to the attentional control required by the task. We identified FEF and IPS with functional Magnetic Resonance Imaging (fMRI) in healthy human subjects performing a Posner-like task. Our exploration takes a large-scale cortical network approach. We adopted statistical methods for analyzing directional influence between the network nodes, and then tested the top-down modulation hypothesis based on the influence patterns.

This manuscript is organized into five chapters. The rest of this introduction presents a review of the research background, followed by two stand-alone research articles and one technical report as the main chapters for methods and results. We conclude in the fifth chapter with a summary and discussion of our findings.

The Role of FEF and IPS in Attentional Control

The frontal lobe has long been known for carrying out executive functions (Miller and Cohen, ; Fuster, 2008). Attention, as a component of executive control, is evidently associated with subregions of the PreFrontal Cortex (PFC) (Luria, ; Knight, 1997; Knight et al., 1995). The earliest study was reported by Ferrier in the late 19th century, in which lesions in monkey PFC led to an inability to make eye movements, accompanied by impairment in localizing and recognizing visual objects. He then speculated that the function of PFC is to “excite attention” (Crowne, 1983). These early findings prompted attempts to associate spatial attention with saccade preparation, among which “the premotor theory of attention” proposed by Rizzolatti and colleagues (Rizzolatti et al., 1987; Rizzolatti and Craighero, 1998) prevailed for decades.

Further lesion studies refined the regions responsible for saccades. The Frontal Eye Field (FEF), located in the middle frontal gyrus where Brodmann’s areas 8, 9 and 6 join (Fig. 1.2) (Penfield and Rasmussen, 1950), has been found to have a dual role in both eye movement and attention shift. An interesting finding from more recent studies, using different experimental approaches and across species, is that the attentional signal in FEF is dissociable from that of saccade execution. It has been suggested that neurons signaling the location of visual targets and those controlling the shift of gaze are distinct groups (Schall, 2002). Cohen et al. (2008b) also found that FEF has different types of neurons: visual neurons receiving input from visual cortex, visuomotor neurons carrying out local computation, and movement neurons sending output to the motor cortex. In a single-neuron recording study in monkeys, Thompson et al. (2005) found a spatially selective signal in FEF that could correspond to attended locations while the activity of movement neurons was suppressed. Visual analysis for attentional selection has also been found to be dissociable from saccade preparation in humans with Transcranial Magnetic Stimulation (TMS) (Juan et al., 2008). Although more evidence is needed to settle the relation between spatial-attention and eye-movement control in FEF, it is quite clear that FEF can influence covert attention shifts without initiating eye movements (Armstrong et al., 2009).

Figure 1.2: Lateral view of the left hemisphere of human (A) and monkey brain (B) showing the parcellation of the cortex as described by Brodmann. Outlined in yellow are the locations of the FEF within the precentral sulcus (PCS) in humans. Adapted with permission from Rosano et al., 2003.

Besides FEF, another region of interest in this study is the IntraParietal Sulcus (IPS), a part of the Posterior Parietal Cortex (PPC). PPC is known for its typical association with an attentional disorder named neglect syndrome, in which patients become unaware of the side of space contralateral to the lesioned hemisphere, with right-side lesions producing more prevalent symptoms (Mesulam, 1981; Corbetta, 1998; Corbetta and Shulman, 2002). Studies of neglect have documented how attentional behavior can be affected by impairment of PPC (Gillebert et al., 2011). Meanwhile, psychophysical experiments have uncovered a variety of functions carried out by PPC, including spatial perception, visuomotor control and goal-directed movements (for reviews see, for example, Kravitz et al., 2011). Given the anatomical location of PPC, which is strategically situated between the somatosensory cortex and the visual occipital cortex, a link might be established between attention-related behavior and the special ability of PPC to integrate spatial information for motor planning (for reviews see, for example, Halligan et al., 2003).

Spatial and sensorimotor functions are performed by highly specialized neuron groups inside PPC (Mountcastle et al., 1975), among which the oculomotor control function of IPS has been found to be tightly coupled with visual spatial orienting (Posner and Snyder, 2004). Although its exact anatomical position in humans is yet to be clarified, the IPS has been identified near the border of Brodmann’s areas 39 and 40 (Choi et al., 2006), a location above the occipital cortex in the visual hierarchy, suitable for further representation of visual information. The IPS shows a topography for both visual inputs (Swisher et al., 2007) and sensorimotor outputs (Johnson-Frey, 2004), indicating its role in integrating spatial information to translate sensory inputs into motor plans. Such integration is thought to be achieved by way of abstract representation of the intrinsic spatial properties of the perceptual and motor sets. Single-neuron recordings in monkeys (Colby and Goldberg, 1999) and functional imaging in humans (Sereno, 2001) have suggested that IPS has egocentric reference frames that map the coordinates of receptor surfaces, such as the retina or the cochlea, to the coordinates of effectors, such as the eye or hand. Spatial coordinates have also been found by directly mapping visual attention. Silver et al. (2005) mapped the activation in human IPS when subjects attended to different locations and found topographic egocentric representations of those locations.

Figure 1.3: Lateral view of (a) macaque monkey brain and (b) human brain showing the location of IPS. Bold text indicates major sulci, italicized text indicates lobules, and plain text indicates functional or anatomical areas. Parietal boundaries are based on anatomical criteria rather than on functional attributes. Adapted with permission from Culham et al., 2001.

Similar to the findings for FEF, where the attentional signal is dissociable from saccade execution, it has been shown that voluntary orienting signals in IPS are dissociable from the visual presentation (Corbetta et al., 2000).

Taken together, FEF and IPS share similar features in their association with attentional control and motor planning. In fact, converging results from functional brain imaging have shown the co-activation of these two regions in a variety of attention tasks (for review see Corbetta and Shulman, 2002). Rather than breaking the problem down into the individual functions of each region in a reductionist way, it is more insightful to consider the relationship between the two from a network viewpoint, and to look into the way they interact with other regions, for a better understanding of the top-down control of attention.

The Large-Scale Cortical Network Approach

The discussion so far has centered on particular cognitive functions involving FEF and IPS. This is not to suggest that such functions are performed by these two regions per se. Contemporary research has come to a consensus that the brain cannot simply be segregated into individually operating processors that serially pass information to each other, but should instead be considered as a network that supports cognitive functions through parallel distributed processing (Mesulam, 1990). From a large-scale cortical network point of view (Bressler and Menon, 2010), a single function may involve many specialized areas whose union is mediated by the functional integration among them.

The anatomical properties of FEF and IPS and the structural connections between them support a network approach to investigating the mechanism of attention. For example, fiber tract tracing in rhesus monkeys has demonstrated direct efferent projections from IPS to FEF. Both FEF and IPS have reciprocal connections with the cingulate cortex and the superior colliculus, and share common inputs from the same parts of associate sensory cortices and common projection targets in the temporal, prefrontal and medial parietal cortices (Selemon and Goldman-Rakic, 1988; Cavada and Goldman-Rakic, 1989). Mesulam (1990) thus pointed out that the inferior parietal lobule (Brodmann’s area 39/PG, overlapping with IPS) together with FEF serve a dual purpose: they provide a local network for regional neural computations and also provide a nodal point for the convergence and reentrant accessing of distributed information. He speculated that area PG sculpts the subjective attentional landscape, while the FEF plans the strategy for navigating it. They are probably engaged simultaneously and interactively by attentional tasks, and hierarchical processing among them is unlikely.

Further evidence for the frontoparietal network’s engagement in visual spatial attention is that the FEF and IPS both have reciprocal connections with the extrastriate visual cortex (Ungerleider et al., 1989; Webster et al., 1994; Schall et al., 1995). These connections might serve to transfer top-down bias signals, in the context of a biased-competition account of attention (Kastner and Ungerleider, 2000; Desimone and Duncan, 1995b), from the frontoparietal network to the visual areas. The visual cortex is organized in a hierarchy such that specialized receptive fields of the lower-level areas encode fundamental features, e.g. shape, color and texture of objects. The lower-level neural representations are projected through feed-forward connections to higher-level areas for further integration to generate visual perception. In such a bottom-up information-processing hierarchy, neural representations compete with each other for limited resources, with more salient representations having a better chance to win. Meanwhile, the saliency map in the visual cortex could be amended by biasing signals sent from other brain areas, resulting in adjusted perception that better fits the endogenous goal. The biasing signal is likely to be sent from cortices serving executive control functions. Given the structure of the frontoparietal network and its heavy interconnection with the extrastriate cortex, there is a well-supported anatomical basis for such top-down control.

Nevertheless, anatomical structure alone does not reveal how FEF and IPS are dynamically associated and how they cooperatively modulate the visual cortex. In fact, the function of a cortical network is supported but not solely determined by the anatomical structure (Bressler and Menon, 2010). Corbetta and colleagues (2005) have shown that in some cases of acute-phase neglect, the structurally intact frontoparietal network was functionally impaired due to lesions in the temporoparietal junction. Their findings suggest that attentional control might rely on dependency between the physiological changes of distant but functionally related areas of the attention network and sensory cortex. Direct measures of the coupling between FEF, IPS and visual areas support this idea. For example, high-frequency, long-range coupling was found between monkey FEF and V4 neurons during a covert attention task (Gregoriou et al., 2009); TMS experiments in human subjects showed that both FEF and IPS exert direct causal influence on the visual cortex, but with distinct patterns (Ruff et al., 2008). However, less is known about such influence between the undisturbed frontoparietal network and the visual cortex in healthy, functioning human brains.

The main focus of this dissertation work is thus on the functional interdependency between the frontoparietal network and the visual occipital cortex, using fMRI data and statistical analysis to explore the mechanism by which top-down attentional control modulates our visual spatial perception.

Scenario of Current Study

Our approach uses a Posner-like paradigm to dissociate attentional control from other cognitive processes in human subjects. The paradigm presents the subjects with an auditory preparatory cue (“left” or “right”) that directs their attention to one of two target locations mirrored across the vertical meridian in the upper visual hemifield, while they keep their eyes fixated on the center point of the visual field. After a delay period, two stimuli appear briefly at the two target locations together with an auditory report cue (“left” or “right”) requesting the subjects to discriminate the orientation of the grating in the cued stimulus. The report cue can be consistent (in valid trials) or inconsistent (in invalid trials) with the preparatory cue. The subjects report with a button press whether the grating is tilted to the left, tilted to the right, or not tilted. By manipulating the ratio of valid to invalid trials, we can measure performance accuracy to verify whether the attention shift is in effect. If the percentage of correct responses is greater in valid trials than in invalid trials, this indicates that the subjects have successfully engaged their attention at the pre-cued location, which results in better performance when the location of the stimulus to be reported matches their expectation.
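The validity comparison described above can be illustrated with a minimal sketch. The trial tuples, the function name `accuracy_by_validity` and the six toy trials are hypothetical and serve only to show the computation; they are not the data or the analysis code used in this study.

```python
from collections import defaultdict


def accuracy_by_validity(trials):
    """Return the proportion of correct responses for valid and invalid trials.

    Each trial is a (preparatory_cue, report_cue, correct) tuple. A trial is
    valid when the report cue matches the preparatory cue.
    """
    counts = defaultdict(lambda: [0, 0])  # validity -> [n_correct, n_total]
    for preparatory_cue, report_cue, correct in trials:
        key = "valid" if preparatory_cue == report_cue else "invalid"
        counts[key][0] += int(correct)
        counts[key][1] += 1
    return {key: n_correct / n_total for key, (n_correct, n_total) in counts.items()}


# Hypothetical toy trials, for illustration only.
trials = [
    ("left", "left", True),
    ("left", "left", True),
    ("right", "right", True),
    ("right", "right", False),
    ("left", "right", True),
    ("right", "left", False),
]

acc = accuracy_by_validity(trials)
# A positive difference is the behavioral signature of an attention shift.
validity_effect = acc["valid"] - acc["invalid"]
```

In this toy example the accuracy is 0.75 for valid trials and 0.5 for invalid trials, so the validity effect is positive, which is the pattern taken to indicate that attention was engaged at the pre-cued location.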

With this paradigm, we verified that attention shifts did occur during the delay period of the task. We then recorded the subjects’ brain activity with fMRI and examined how FEF and IPS modulate the Visual Occipital Cortex (VOC) and how this modulation is associated with the attention shift. The modulation is measured with a statistical tool named Granger causality (GC), which quantifies the interdependency between two time series using temporal precedence information. If the prediction of a time series’ current state from its previous state(s) can be improved by including in the regression the previous state(s) of another time series, then the latter is said to “Granger cause” the former. Thus, by definition, the GC measure of interdependency is directional. With this tool we were able to examine the influences between the frontoparietal regions and the VOC in both directions. We asked whether GC actually measures attentional control and, if so, how such control modulated VOC during the delay period. Chapter 2 aims to answer the first question and Chapter 4 the second. Between the two studies, we took an additional step to develop a new GC-analysis method in order to exclude possible spurious results from the method used in Chapter 2, and carried the new method over to the analyses in Chapter 4. The report on the method development constitutes Chapter 3.
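The regression-based definition of Granger causality given above can be made concrete with a small numerical sketch. It compares a restricted autoregression (the target's own past only) with a full regression (the target's past plus the source's past) and forms an F statistic from the two residual sums of squares. The function names, the model order of 1 and the simulated series are assumptions made for illustration; this is not the estimation procedure used in Chapters 2 to 4.

```python
import numpy as np


def lagged(series, order):
    """Return columns holding lags 1..order of `series`, aligned with series[order:]."""
    n = len(series)
    return np.column_stack([series[order - k:n - k] for k in range(1, order + 1)])


def residual_ss(design, target):
    """Residual sum of squares of an ordinary least-squares fit."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return float(np.sum((target - design @ coef) ** 2))


def granger_f(source, target, order=1):
    """F statistic testing whether the past of `source` improves prediction of `target`."""
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    now = target[order:]
    # Restricted model: intercept plus the target's own past.
    restricted = np.column_stack([np.ones(len(now)), lagged(target, order)])
    # Full model: restricted model plus the source's past.
    full = np.column_stack([restricted, lagged(source, order)])
    rss_restricted = residual_ss(restricted, now)
    rss_full = residual_ss(full, now)
    dof = len(now) - full.shape[1]
    return ((rss_restricted - rss_full) / order) / (rss_full / dof)


# Simulated toy data (hypothetical): x drives y with a one-sample delay.
rng = np.random.default_rng(0)
x = rng.standard_normal(500)
y = np.zeros(500)
y[1:] = 0.8 * x[:-1] + 0.1 * rng.standard_normal(499)

f_x_to_y = granger_f(x, y)  # past x helps predict y, so this should be large
f_y_to_x = granger_f(y, x)  # past y carries no information about x
```

Because the statistic is computed separately for each direction, the two values can differ, which is what allows a directional asymmetry between regions to be assessed.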

CHAPTER 2. TOP-DOWN CONTROL OF VISUAL CORTEX BY FRONTAL AND

PARIETAL CORTEX IN ANTICIPATORY VISUAL SPATIAL ATTENTION

Advance information about an impending stimulus facilitates its subsequent identification and ensuing behavioral responses. This facilitation is thought to be mediated by top-down control signals from frontal and parietal cortex that modulate sensory cortical activity. Here we show, using Granger causality measures on blood oxygen level-dependent time series, that frontal eye field (FEF) and intraparietal sulcus (IPS) activity predicts visual occipital activity before an expected visual stimulus. Top-down levels of Granger causality from FEF and IPS to visual occipital cortex were significantly greater than both bottom-up and mean cortex-wide levels in all individual subjects and the group. In the group and most individual subjects, Granger causality was significantly greater from FEF to IPS than from IPS to FEF, and significantly greater from both FEF and IPS to intermediate-tier than lower-tier ventral visual areas. Moreover, top-down Granger causality from right IPS to intermediate-tier areas was predictive of correct behavioral performance. These results suggest that FEF and IPS modulate visual occipital cortex, and FEF modulates IPS, in relation to visual attention. The current approach may prove advantageous for the investigation of interregional directed influences in other human brain functions.

Introduction

A strong theoretical foundation supports the concept of top-down modulation of sensory processes by control regions of the cerebral cortex (Desimone and Duncan, 1995b; Kastner and Ungerleider, 2000; Corbetta and Shulman, 2002). Anticipatory visual spatial attention, whereby an observer voluntarily attends the visual field location of an impending target, improves target detection and behavioral performance (Eriksen and Hoffman, 1972; Bashinski and Bacharach, 1980; Posner, 1980; Tong, 2003). Human functional magnetic resonance imaging (fMRI) studies have shown that anticipatory visual spatial attention involves stimulus-independent changes in blood oxygen level-dependent (BOLD) signals from frontal, parietal, and visual cortical regions (Corbetta and Shulman, 2002; Serences and Yantis, 2006), but have not shown directed influences between these regions. Evidence that transcranial magnetic stimulation (TMS) of human frontal (Ruff et al., 2006) and parietal (Ruff et al., 2008) regions affects visual cortical BOLD activity, and that electrical microstimulation of monkey frontal cortex affects visual cortical electrical activity (Moore and Armstrong, 2003), indicates that pathways exist for top-down modulation of visual cortex and that stimulation of these pathways can produce attention-like behavioral effects, but does not show a relation between top-down modulation and visual attention. Here we present evidence suggesting that visual cortex receives top-down modulation from frontal and parietal areas in relation to visual attention.

Directed influences between BOLD signals were measured by Granger causality (Granger, 1969; Roebroeck et al., 2005), which quantifies the improvement in predicting one brain region’s signal that results from inclusion of another region’s signal in that prediction. Although Granger causality cannot prove that neuronal communication occurs between regions, it can nonetheless provide supporting evidence for it. Several properties of Granger causality make it suitable for the study of interregional directed influences in human cognition. First, it is measured from ongoing activity in subjects performing a cognitive task and, thus, does not depend on nonphysiological intervention as in TMS or microstimulation. Second, it is an asymmetric measure that allows quantification of influences in both directions between regions, unlike symmetric measures such as cross-correlation or mutual information. Third, being based on predictability, it provides stronger evidence for neuronal communication than do measures simply showing temporal precedence of one time series over another. Finally, it can measure directed influences at a specific task-related time.

Granger causality was evaluated on BOLD time series in frontal eye field (FEF), intraparietal sulcus (IPS), and visual occipital regions immediately preceding visual target presentation, when attentional spatial selectivity was maximal (Sylvester et al., 2007). Top-down Granger causality was significantly greater than both bottom-up and mean cortex-wide levels. Granger causality was also significantly greater from FEF to IPS than from

IPS to FEF, and from FEF and IPS to intermediate-tier ventral visual areas than to low-tier areas. Furthermore, top-down Granger causality from the right intraparietal sulcal area to intermediate-tier areas predicted correct behavioral performance. These results imply that top-down control was in effect before the target in this task.

Materials and Methods

The experimental design, data acquisition and first-step general linear modeling on the

fMRI BOLD signal were carried out by Dr. Chad Sylvester at Washington University at St.

Louis.

Participants and task. Six right-handed subjects (three male, three female), aged 26–30, performed a demanding visual-spatial attention task (Sylvester et al., 2007) (Fig. 2.1). Subjects had no history of neurological illness, and had normal or corrected-to-normal vision.

Informed consent was obtained as per human studies committee guidelines at Washington

University School of Medicine. Each subject performed 1400–1600 trials over 16–24 scanning sessions. Eye position was monitored to ensure fixation on a central crosshair. Trials began with a 500 ms preparatory cue (the spoken word “left” or “right”), which directed attention to one of two locations at 5° eccentricity in the upper hemifield, 45° clockwise or counterclockwise from the vertical meridian (Fig. 2.1). Right- and left-cued trials were randomly intermixed with equal probability. After a stimulus-onset asynchrony (SOA) of

6.192 (25%), 8.256 (25%), or 10.32 s (50%), visual target stimuli appeared for 100 ms

centered at both locations, concurrent with an auditory report cue (“left” or “right”). Targets were 3.5 cycles per degree Gabor patches (0.3° Gaussian envelope SD). On valid trials

(75%), the report cue matched the preparatory cue. Subjects indicated the orientation (left

tilt, vertical, right tilt) of the report-cued patch by pressing one of three buttons. High (50%)

and low (5–12%) contrast targets were presented in separate scans. Stimulus parameters

were adjusted based on in-scanner practice sessions to yield approximately 70% correct valid trial performance. Valid trials having the 10.32 s SOA, with either target contrast level and cue direction, were included in the analysis. The selective analysis of trials with 10.32 s SOA was aimed at maximizing anticipatory effects, which were shown by Sylvester et al. (2007) to increase with time after the cue. Mean task performance was 70.6% correct for valid and 61.5% correct for invalid trials (chance = 33.3%), indicating that subjects used the preparatory cue to discriminate the target.

Figure 2.1: Visual spatial attention behavioral paradigm. An auditory preparatory cue (“Left” or “Right”) began each trial, instructing subjects to attend a location left or right of the vertical meridian. After a stimulus-onset asynchrony (10.32 s), visual targets appeared for 100 ms centered at both locations, concurrent with an auditory report cue (“Left” or “Right”). BOLD data were acquired with TR = 2.064 s. The sample times used for analysis are marked by triangles. The figure is not drawn to scale.

Data acquisition. BOLD data were acquired with a Siemens Allegra 3T scanner using an asymmetric spin-echo echoplanar sequence [repetition time (TR) = 2.064 s, echo time = 25 ms, flip angle = 90°, 32 contiguous 4 mm axial slices, 4×4 mm in-plane resolution].

BOLD images were motion-corrected within and between runs, corrected for across-slice timing differences, resampled into 3 mm isotropic voxels, and warped into a standardized atlas space.

Region of interest creation. For each subject, BOLD data at each voxel were subjected to a general linear model using in-house software. Constant and linear terms over BOLD runs modeled baseline and linear drift. Sine waves modeled low-frequency noise (<0.009

Hz). Separate δ function regressors coded each time point after the preparatory cue. To generate individual-trial time series, modeled responses of the appropriate event type were summed with the residuals from the linear model at each time point.

Regions of interest (ROIs) outside retinotopic cortex were created by voxel-wise ANOVA in each subject over the first six trial time points using the residuals dataset. An in-house clustering algorithm defined ROIs from the map of the main effect of cue direction. ROIs were 8 mm spheres centered on map peaks with z-scores > 3; spheres within 12 mm of each other were consolidated into a single ROI. ROIs were retained for subsequent analysis if they were present in at least 6 of the 12 subject hemispheres so as to reduce intersubject variability, which could come from variability in BOLD signal strength, task strategy (causing the relative involvement of different functional areas to vary), or localization of functional areas.

If a subject lacked a particular ROI, the z threshold was lowered to 2; if the subject still lacked the ROI, it was not included in subsequent analyses. This procedure yielded ROIs in FEF (at the junction of the superior frontal sulcus and the precentral sulcus), anterior IPS (aIPS), and posterior IPS (pIPS).

To create ROIs inside retinotopic cortical areas, subjects first passively viewed contrast-

reversing checkerboard stimuli extending along the horizontal and vertical meridians. From

a contrast of responses to the horizontal and vertical meridians, early visual region borders

were hand-drawn on flattened representations of each subject's anatomy using the Caret software suite (Van Essen et al., 2001). The V3A ROI consisted of V3A voxels with responses that varied with the direction of the preparatory cue. In separate localizer scans, subjects passively viewed high-contrast (∼50%) Gabor patches flickering at 4 Hz. In each 12 s block, a patch randomly appeared at one of five locations (two target, two mirrored across the horizontal meridian, and one central). Voxels representing each of these locations were localized by t tests: voxel responses to each non-central stimulus were significantly larger than those to its mirror stimulus across the vertical meridian; responses to the central location (1° width) were significantly larger than the summed responses to all other locations.

Subdivisions of early visual cortex (V1v, V2v, VP, V4) were made from the conjunction of voxels with a stimulus preference in the localizer scans and the retinotopic regions for upper hemifield locations.

BOLD time series preprocessing. Outlier voxels were rejected, based on their BOLD amplitude variability over trials, using Tukey's boxplot technique. BOLD time series were z-normalized to have zero mean and unit variance.
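The two preprocessing steps can be illustrated with a minimal sketch. It is not the in-house code used for the study: the fence multiplier `k = 1.5` (the conventional boxplot choice), the inclusive quartile rule, and the toy variability values are all assumptions made for illustration.

```python
import statistics

def tukey_inliers(values, k=1.5):
    """Indices of values inside Tukey's fences [Q1 - k*IQR, Q3 + k*IQR].

    k = 1.5 is the conventional boxplot multiplier (an assumption here).
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if low <= v <= high]

def z_normalize(series):
    """Rescale a series to zero mean and unit (population) variance."""
    mean = statistics.fmean(series)
    sd = statistics.pstdev(series)
    return [(v - mean) / sd for v in series]

# Hypothetical per-voxel variability of BOLD amplitude over trials;
# the last voxel is an obvious outlier.
voxel_variability = [1.0, 1.1, 0.9, 1.2, 1.05, 0.95, 6.0]
kept = tukey_inliers(voxel_variability)

# Hypothetical BOLD samples for one retained voxel.
z = z_normalize([2.0, 4.0, 6.0, 8.0])
```

Here `kept` holds the indices of the voxels that survive the boxplot rule, and `z` has zero mean and unit variance by construction.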

Granger causality method. Granger causality testing followed the method of Greene (2002). For any two voxels, X and Y, Granger causality is tested from X to Y, and from Y to X. For X to Y, two different models are considered. The restricted model is: $Y(t) = \alpha_1 \sum_{m=1}^{p} Y(t-m) + \epsilon_1(t)$, where $Y(t)$ is the Y time series at time t, $Y(t-m)$ is the m-lagged Y time series, $\alpha_1$ is the regression coefficient, and $\epsilon_1(t)$ is the restricted model residual at time t.

The unrestricted model is: $Y(t) = \alpha_2 \sum_{m=1}^{p} Y(t-m) + \beta \sum_{m=1}^{p} X(t-m) + \epsilon_2(t)$, where $Y(t)$ is the Y time series at time t, $Y(t-m)$ and $X(t-m)$ are the m-lagged time series of Y and X, $\alpha_2$ and $\beta$ are regression coefficients, and $\epsilon_2(t)$ is the unrestricted model residual at time t.

Granger causality significance testing. If the variability of the residual of the unrestricted model is significantly reduced compared with that of the restricted model, then there is an improvement in the prediction of Y due to X. The amount of reduction is measured as an F statistic: $F = \frac{(RSS_r - RSS_{ur})/p}{RSS_{ur}/(T - 2p - 1)}$, where $RSS_r$ is the restricted residual sum of squares, $RSS_{ur}$ is the unrestricted residual sum of squares, p is the model order, and T is the total number of observations used to estimate the unrestricted model. The F statistic approximately follows an F distribution with degrees of freedom p and $(T - 2p - 1)$. If the F statistic from X to Y is significant (i.e., greater than the critical value at p < 0.05 in the standard $F_{p,\,T-2p-1}$ distribution), then the unrestricted model yields a better explanation of Y(t) than does the restricted model, and X is said to Granger cause Y.
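The restricted/unrestricted comparison can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the code used in the study: each model is fitted by ordinary least squares with an intercept and one coefficient per lag (which matches the $T - 2p - 1$ denominator degrees of freedom), and the two simulated series are hypothetical, with X driving Y at a lag of one sample.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def rss(rows, targets):
    """Residual sum of squares of an OLS fit via the normal equations."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, t in zip(rows, targets))

def granger_f(x, y, p=1):
    """F statistic for 'x Granger-causes y' with model order p."""
    times = range(p, len(y))
    targets = [y[t] for t in times]
    # Restricted model: intercept plus p lags of y.
    restricted = [[1.0] + [y[t - m] for m in range(1, p + 1)] for t in times]
    # Unrestricted model: the same regressors plus p lags of x.
    unrestricted = [r + [x[t - m] for m in range(1, p + 1)]
                    for r, t in zip(restricted, times)]
    n_obs = len(targets)  # T in the formula above
    rss_r, rss_ur = rss(restricted, targets), rss(unrestricted, targets)
    return ((rss_r - rss_ur) / p) / (rss_ur / (n_obs - 2 * p - 1))

# Hypothetical data: x is white noise and drives y with a one-sample lag.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(300)]
y = [0.0] + [0.8 * x[t - 1] + rng.gauss(0, 1) for t in range(1, 300)]

f_xy = granger_f(x, y)  # large: past x improves the prediction of y
f_yx = granger_f(y, x)  # expected to be small: past y does not help predict x
```

Each F value would then be compared with the critical value of the $F_{p,\,T-2p-1}$ distribution at the chosen significance level.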

Granger causality BOLD data analysis. For a given ROI pair, F statistics were computed in both directions between BOLD time series of every voxel pair. The portion of voxel pairs carrying Granger causality was quantified for each ROI pair in both directions as the fraction of F statistics that were significant (p < 0.05). To compare significant fractions of F distributions having equal numbers of voxel pairs, the McNemar (1947) statistic was calculated. Significance of this statistic came from comparison with the standard $\chi^2$ distribution with one degree of freedom. F-statistic distributions were compared by the Mann-Whitney U test (Mann and Whitney, 1947) when the distributions had unequal numbers. Correction for multiple comparisons was by Dunn's procedure (Kirk, 1982).
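As an illustration of the significant-fraction measure and the McNemar comparison, the sketch below pairs the top-down and bottom-up significance flags of the same voxel pair. That pairing, the omission of a continuity correction, and the F values themselves are assumptions made for the example; the critical value 3.87 is the one quoted for the representative ROI pair in Fig. 2.2.

```python
import math

def significant_fraction(f_values, f_crit):
    """Fraction of voxel pairs whose F statistic exceeds the critical value."""
    return sum(f > f_crit for f in f_values) / len(f_values)

def mcnemar(sig_a, sig_b):
    """McNemar chi-square (1 df, no continuity correction) and its p-value.

    Only discordant pairs contribute; at least one discordant pair is required.
    """
    b = sum(a and not bb for a, bb in zip(sig_a, sig_b))
    c = sum(bb and not a for a, bb in zip(sig_a, sig_b))
    chi2 = (b - c) ** 2 / (b + c)
    # Survival function of the chi-square distribution with one degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

F_CRIT = 3.87  # critical value quoted in Fig. 2.2 for one ROI pair
top_down = [5.1, 4.2, 0.7, 6.3, 1.1, 4.9, 0.4, 8.0]   # hypothetical F values
bottom_up = [1.2, 0.9, 0.3, 4.5, 0.8, 2.0, 0.6, 1.7]  # hypothetical F values

frac_td = significant_fraction(top_down, F_CRIT)   # 5 of 8 voxel pairs
frac_bu = significant_fraction(bottom_up, F_CRIT)  # 1 of 8 voxel pairs
chi2, p = mcnemar([f > F_CRIT for f in top_down],
                  [f > F_CRIT for f in bottom_up])
```

With these toy values four voxel pairs are significant only in the top-down direction and none only in the bottom-up direction, so the statistic is 4.0.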

Granger causality was tested on F statistics derived from linear regression models using the last two BOLD measurements of the anticipatory period (marked by triangles in Fig.

2.1). Balanced numbers of correct and incorrect validly cued trials were used to maximize the number of trials while avoiding biases that might otherwise result. Frontal and parietal

ROIs consisted of spatially selective voxels in right and left FEF, aIPS, and pIPS. Visual occipital ROIs consisted of voxels in right and left V1v, V2v, V3A, VP, and V4. With the exception of two subjects, who lacked significantly identifiable ROIs in left FEF and aIPS,

60 pairwise frontoparietal-occipital relationships (six frontal and parietal with 10 visual occipital ROIs) were examined, as were 4 pairwise frontal-parietal relationships (two FEF with two IPS, anterior and posterior combined). Each ROI pair was tested independently of all others.

F-statistic distributions were derived for individual subjects. These distributions were either analyzed separately for each subject or were combined across subjects for group analysis. Analyses of the veridical data were compared with analyses of randomized data sets created in two ways. First, trial-randomized data consisted of ROIs having the same voxels as the veridical data, and with F-statistic distributions computed with randomized trial order for each voxel. Trial-randomized data provided an estimate of expected Granger causality with physiological influences removed. Second, voxel-randomized data consisted of ROIs having voxels randomly selected from the entire cortex, and with F-statistic distributions computed with the same trial order for each voxel as the veridical data. Voxel-randomized data provided an estimate of Granger causality expected for any two randomly selected cortical locations.
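The two surrogate constructions might be sketched as follows. The data layout (a dictionary from voxel id to a list of per-trial samples), the function names, and the toy values are hypothetical; the sketch only shows how an independent shuffle per voxel breaks the trial-by-trial correspondence between voxels while leaving each voxel's own samples intact.

```python
import random

def trial_randomize(voxel_trials, rng):
    """Shuffle the trial order independently for every voxel.

    voxel_trials maps a voxel id to a list of per-trial samples, each sample
    being a tuple of the BOLD values at the analysed time points.
    """
    shuffled = {}
    for voxel, trials in voxel_trials.items():
        order = list(range(len(trials)))
        rng.shuffle(order)
        shuffled[voxel] = [trials[i] for i in order]
    return shuffled

def voxel_randomize(all_cortex_voxels, n_voxels, rng):
    """Draw a surrogate ROI of n_voxels distinct voxel ids from the whole cortex."""
    return rng.sample(all_cortex_voxels, n_voxels)

rng = random.Random(1)
# Hypothetical data: two voxels, three trials, two time points per trial.
data = {"v1": [(0.1, 0.2), (0.3, 0.4), (0.5, 0.6)],
        "v2": [(1.1, 1.2), (1.3, 1.4), (1.5, 1.6)]}
surrogate = trial_randomize(data, rng)
surrogate_roi = voxel_randomize(list(range(1000)), 5, rng)
```

The same Granger causality analysis would then be run on the surrogate data to obtain the reference distributions.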

Results

Granger causality was tested for every voxel pair in each of the 60 frontoparietal-occipital ROI pairs in top-down (from FEF or IPS to visual occipital) and bottom-up (from visual occipital to FEF or IPS) directions. F-statistic distributions (Fig. 2.2) contained significant fractions, indicating Granger causality, in both directions for every ROI pair in the veridical, as well as in the voxel-randomized and trial-randomized data. The fact that a variable fraction of the voxel pairs comprising an ROI pair had significant F-statistics indicated that the ROIs were not spatially homogeneous in their influences on one another.

Top-down Granger causality in the veridical data was significantly greater (p < 0.00001) in all six subjects than in the voxel-randomized or trial-randomized data. In contrast, bottom-up Granger causality in the veridical data was lower than in the voxel-randomized data in all subjects [significantly lower (p < 0.005) in three subjects], but was significantly greater (p < 0.05) than in the trial-randomized data in all subjects. Veridical top-down Granger causality was significantly greater than bottom-up Granger causality ($p < 1.0 \times 10^{-30}$) in all subjects (Fig. 2.3A) and was significantly greater (p < 0.05) for 59 of 60 ROI pairs in a group analysis (Fig. 2.3C). These results suggest top-down modulation of visual occipital areas by FEF and IPS.

Figure 2.2: Top-down (left) and bottom-up (middle) Granger causality F-statistic histograms for a representative ROI pair, right aIPS and right V3A, in one subject. The critical value of F is 3.87 for significance (p < 0.05) in both directions. A larger fraction of the total number of voxel pairs (1064) has significant F statistics in the top-down (16.9%) than in the bottom-up (8.7%) direction. The schematic diagram (right) shows Granger causality represented as arrows in the two directions between these ROIs on a standard brain image. Arrow thickness corresponds to the significant fraction, representing Granger causality strength.

Granger causality was also tested in both directions for every voxel pair in each of the four frontal-parietal ROI pairs. Veridical Granger causality from FEF to IPS was significantly greater ($p < 1.0 \times 10^{-30}$) than that from IPS to FEF in the group and in five subjects (Fig. 2.3B). Granger causality from FEF to IPS in the veridical data was significantly greater ($p < 1.0 \times 10^{-30}$) than that in the voxel-randomized data in the group and in five subjects, and significantly greater ($p < 1.0 \times 10^{-30}$) than that in the trial-randomized data in all subjects. Veridical Granger causality from IPS to FEF was significantly less ($p < 1.0 \times 10^{-30}$) than that in the voxel-randomized data in the group and in four subjects, and was significantly greater ($p < 1.0 \times 10^{-30}$) than that in the trial-randomized data in all subjects. Equivalent comparisons in both the voxel-randomized (Fig. 2.3D) and trial-randomized (Fig. 2.3E) data did not show any significant directional difference. Thus, FEF modulated IPS, and both regions modulated visual occipital cortex.

Figure 2.3: A–E, Top-down versus bottom-up Granger causality. The mean fraction of significant F is significantly greater for top-down (blue) than for bottom-up (red) Granger causality when measured separately for each subject (A) and for 59 of 60 ROI pairs with all subjects combined (C), but not in the voxel-randomized (D) or trial-randomized (E) data. Error bars indicate variability across ROI pairs (A, B, D, E) or across subjects (C). B) The mean fraction of significant F is significantly greater from FEF to IPS than from IPS to FEF for five of six subjects. C) ROI pairs follow left-to-right, top-to-bottom ordering in Fig. 2.4A.

We next sought to determine whether top-down modulation was greater to some visual occipital regions than to others. In a group analysis, top-down Granger causality from frontal and parietal (left and right FEF, aIPS, and pIPS) regions to intermediate-tier ventral visual occipital regions (VP and V4) was significantly greater ($p < 1.0 \times 10^{-40}$) than that to low-tier regions (V1 and V2) of both the left and right hemispheres. In individual analyses, the effect was significant (p < 0.05) in five subjects for low-tier regions of the left hemisphere, and in four subjects for those of the right hemisphere. Equivalent comparisons on the voxel-randomized and trial-randomized data were not significant.

We then tested whether top-down modulation of visual cortex predicted correct responses to subsequent targets by comparing top-down Granger causality before correct and incorrect performance from each frontal and parietal ROI to all visual occipital ROIs combined. In a group analysis, significant (p < 0.05) performance differences were observed for each frontal and parietal ROI except the left aIPS. In individual subject analyses, a consistent performance difference in five subjects was only seen for right aIPS, with Granger causality significantly greater (p < 0.05) before correct than before incorrect performance.

No consistent performance difference was observed for FEF-to-IPS Granger causality.

Top-down Granger causality for each of the frontoparietal-occipital and frontal-parietal ROI pairs was also separately compared for performance differences. In a group analysis, top-down Granger causality was significantly different (p < 0.05) before correct and incorrect performance for a number of ROI pairs (Fig. 2.4A). However, top-down Granger causality was significantly different (p < 0.05) consistently in five of six subjects only from right aIPS to left VP (Fig. 2.4B), being significantly greater (p < 0.05) before correct than before incorrect performance. In addition, top-down Granger causality from right aIPS was significantly greater (p < 0.05) prior to correct than incorrect performance to right and left V4 in four of six subjects, and to right VP in three of six subjects. These performance-related results could not be attributed to the greater magnitude of top-down Granger causality to intermediate-tier regions. No consistent across-subject performance differences were observed for the other ROI pairs in the top-down direction, for any pair in the bottom-up direction, or for any pair in either direction in either the voxel-randomized or trial-randomized data. The variability in ROI pairs showing significant differences in top-down Granger causality before correct and incorrect performance suggests that some top-down influences promote correct behavioral performance, whereas others may actually impair performance.

Figure 2.4: Top-down Granger causality before correct versus incorrect performance. A) Grid specifying the significance of correct versus incorrect performance difference in a group analysis of top-down Granger causality: * p < 0.05, ** $p < 1.0 \times 10^{-10}$, *** $p < 1.0 \times 10^{-30}$; white, not significant. Each cell represents Granger causality from the row-labeled ROI to the column-labeled ROI. B) The fraction of significant top-down Granger causality from right aIPS to left VP was significantly greater before correct (blue) than incorrect (red) performance in five of six subjects.

Discussion

The results of this study support the view that FEF and IPS exert top-down modulatory influences on visual cortex in relation to visual attention (Desimone and Duncan, 1995b;

S. and Ungerleider, 2000; Corbetta and Shulman, 2002; Moore et al., 2003; Serences and

Yantis, 2006). Although studies using TMS of human FEF (Ruff et al., 2006) and IPS

(Ruff et al., 2008), and electrical microstimulation of monkey FEF (Moore and Armstrong,

2003), demonstrate that pathways exist to carry influences from FEF and IPS to visual cortex, and stimulation of these pathways can produce behavioral effects similar to those produced by attention, they do not indicate whether top-down influences are exerted when subjects attend to a location under physiological conditions. Moreover, because stimulation is only applied to frontal or parietal cortex, these studies do not show any asymmetry between top-down and bottom-up influences. Our study reveals such an asymmetry by demonstrating that BOLD activity in FEF and IPS predicts BOLD activity in the visual cortex of subjects engaged in a visual attention task at levels significantly higher than bottom-up levels. Furthermore, top-down, but not bottom-up, predictability was significantly higher than the mean cortex-wide level. Such top-down predictability between

BOLD signals may depend on the entrainment of slow neuroelectric excitability waves observed in visual attention (Lakatos et al., 2008).

The analysis of directed influences between FEF and IPS suggests that FEF modulates

IPS far more than IPS modulates FEF during visual attention. To our knowledge, directed influences between frontal and parietal regions have not previously been investigated, although neural synchrony between them has been observed during attention (Buschman and Miller, 2007). Future analysis will consider the question of whether FEF-to-occipital

Granger causality is direct or can be attributed to an influence through IPS, and likewise, whether IPS-to-occipital Granger causality can be attributed to FEF.

The finding of stronger top-down modulation of intermediate-tier areas (VP, V4) than of low-tier areas (V1, V2) is consistent with the known distribution of attention-related modulation in extrastriate cortex (Motter, 1993; Leonardo et al., 1995; Luck et al., 1997;

Mehta et al., 2000; Schroeder et al., 2001; Stefan and Treue, 2001; Kastner and Pinsk,

2004). Similarly, our finding that top-down Granger causality from the right anterior IPS region to bilateral VP and V4 predicts subsequent task performance suggests that these areas may be critical nodes in attentional regulation of visual cortex. This finding is consistent with the theory that right parietal cortex influences spatial processing of both visual hemifields (Heilman et al., 1984; Mesulam, 1999; Siman-Tov et al., 2007), results showing that TMS of right IPS produces stimulus-independent BOLD changes in both left and right visual cortex (Ruff et al., 2008), and findings that task-related decreases in the BOLD signal of right anterior IPS and bilateral visual cortical regions correlate with behavioral improvement in visual perceptual learning (Mukai et al., 2007).

Our approach is based on a unique combination of experimental and analytic methods allowing measurement of event-related Granger causality in human BOLD data on a relatively short time scale. It bears a superficial resemblance to techniques measuring relative delay time between ROIs in event-related BOLD data, such as the mutual information-based method of Fuhrmann Alpert et al. (2007). The pairwise measure in that study is symmetric and is derived from long-duration averaged BOLD time series. In contrast, our method provides asymmetric directed measures between ROIs, and is derived from short-duration single-trial BOLD time series. Although our study examined Granger causality at a fixed time point and time lag between ROIs, our method can also yield latency and relative delay information, as did Fuhrmann Alpert et al. (2007).

This study examined Granger causality at a single sample time during the anticipatory period. More comprehensive future studies of the same data set will involve analysis of other time samples, and will examine whether Granger causality can distinguish cortical representations of cued from uncued visual quadrants.

Granger causality is a measure of statistical relation and, thus, cannot identify the pathways carrying the effects observed in this study. It is possible that influences from a region other than FEF or IPS were responsible for those effects. However, as a whole, the set of comparisons comprising this study establishes that significant unidirectional predictive relations exist from FEF and IPS to visual occipital cortex, and supports the view that FEF and IPS exert top-down control of visual occipital processing in visual attention. Granger causality analysis of event-related BOLD data may prove to be a generally useful tool to noninvasively measure the causal interplay between different regions of the human brain in relation to cognition and emotion.

CHAPTER 3. MEASURING GRANGER CAUSALITY BETWEEN CORTICAL REGIONS FROM VOXELWISE FMRI BOLD SIGNALS WITH LASSO

Functional brain network studies are becoming more and more prevalent in research on the neural basis of human cognition. Yet, the analytic tools used to investigate functional brain networks are often still relatively unsophisticated. For example, the averaged Blood

Oxygen-Level Dependent (BOLD) signal from a brain region has commonly been used to represent nodal activity in functional connectivity analysis, even when the validity of doing so has not been well established. Here we show that the averaging of regional BOLD activity to create a nodal signal may lead to biased Granger Causality (GC) estimation of interregional functional connectivity. We propose an alternative approach to the measurement of functional connectivity with GC that is based on the activity of individual voxels within brain regions, and we demonstrate its effectiveness on both simulated and empirical functional Magnetic Resonance Imaging data. Our method first uses the Least Absolute Shrinkage and Selection Operator (LASSO) to overcome estimation problems that would otherwise preclude voxel-based analysis, and then computes summary statistics to represent interregional GC. This approach makes feasible GC analysis of functional connectivity between brain regions containing large numbers of voxels without the need for averaging. Our results suggest that the analysis of functional brain networks must give careful consideration to the way that network nodes and edges are defined because those definitions may have important implications for the validity of the analysis.

Introduction

The modern understanding of human cognition relies heavily on the concept of large-scale functional brain networks, and large-scale functional network analysis of Blood-Oxygenation-Level-Dependent (BOLD) signals from functional Magnetic Resonance Imaging (fMRI) is playing an increasingly important role in cognitive neuroscience (Bressler and Menon, 2010). From this perspective, to gain knowledge of cognition through fMRI analysis requires identification of the nodes and edges of large-scale functional brain networks.

An important unresolved question remaining in the field, however, is how best to define the nodes and edges of large-scale functional brain networks.

A node is typically represented in brain network studies of fMRI BOLD activity as a lumped Region Of Interest (ROI), formed by averaging the BOLD signals of all the ROI's voxels (Greicius et al., 2003; He et al., 2007; Hagmann et al., 2008; Anderson et al., 2010; Zhang et al., 2010). This collapse of the ROI by averaging has the benefit of reducing the dimensionality of analysis, but rests on the twin assumptions that the BOLD activity of an ROI is homogeneous over all its voxels and that the functional interdependence (connectivity) between voxels within an ROI and in different ROIs is also homogeneous. If the homogeneity assumptions are not true, edge measurements computed from ROI-averaged BOLD signals may be erroneous since averaging may distort the time series information.
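A toy simulation can make the risk concrete. In the sketch below, which is a hypothetical construction and not data from this work, two voxels of a source ROI drive a target voxel with opposite signs at a lag of one sample. A simple lagged correlation stands in for the edge measure: each voxel shows a strong lagged relation with the target, whereas the ROI average shows almost none, because the opposing contributions cancel.

```python
import random

def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5

rng = random.Random(0)
n = 2000
# Two hypothetical voxels of the source ROI (white noise).
a1 = [rng.gauss(0, 1) for _ in range(n)]
a2 = [rng.gauss(0, 1) for _ in range(n)]
# Target voxel driven with opposite signs by the two source voxels at lag 1.
b = [0.0] + [a1[t - 1] - a2[t - 1] + 0.5 * rng.gauss(0, 1) for t in range(1, n)]
# ROI-averaged source signal.
roi_mean = [(u + v) / 2 for u, v in zip(a1, a2)]

r_a1 = pearson(a1[:-1], b[1:])          # strongly positive
r_a2 = pearson(a2[:-1], b[1:])          # strongly negative
r_mean = pearson(roi_mean[:-1], b[1:])  # close to zero
```

The example is deliberately extreme; with real data the heterogeneity would usually be partial, and the averaged signal would understate rather than abolish the voxel-level dependence.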

Here we present a new procedure for brain network analysis that is based on the BOLD activity of the individual voxels of ROIs and the Granger Causality (GC) measure of interdependence between voxels. GC tests whether the prediction of the present value of one time series by its own past values can be significantly improved by including past values of another time series in the prediction. If so, the second time series is said to Granger cause the first, and the degree of significance may be taken as the strength of GC (Wiener, 1956). The GC measure is typically implemented by AutoRegressive (AR) modeling (Granger, 1969) and has been shown to be a powerful and flexible tool for measuring the predictability of one neural time series from another (Bernasconi and Konig, 1999; Ding et al., 2000; Kaminski et al., 2001; Hesse et al., 2003; Ding et al., 2006; Bressler and Seth, 2010). The GC measure has advantages as an edge measure over the typically utilized cross-correlation measure: first, it provides the strength of interdependence between voxels in both directions, as opposed to a single non-directional strength; second, its grounding in prediction allows stronger statements to be made about functional interdependence than does simple correlation.

Previous evidence from GC BOLD analysis argues against the assumption of homogeneous GC interdependence, and thus suggests that averaging BOLD signals prior to edge measurement may not be appropriate. Bressler et al. (2008) found that GC between ROIs varies considerably across voxel pairs, with the distribution of GC values being highly skewed and only a small fraction of values in the tail of the distribution being significantly different from zero. These results indicate that GC is heterogeneous across voxel pairs, suggesting that the investigation of functional interdependence between ROIs should take into account the interdependence of BOLD signals from all the voxels within the ROIs.

An approach to the heterogeneous interdependence problem is to compute the distribution of GC values using a bivariate AR model for each pairwise combination of voxels in two ROIs. This pairwise-GC approach, followed by Bressler et al. (2008), not only avoids the possible pitfalls of averaging, but also makes feasible the separate measurement of GC density and strength between ROIs, two factors that are conflated by averaging. Thus, deriving a summary GC statistic between ROIs from the distribution of GC values across all voxel-voxel pairs may be statistically more informative than simply setting it to the GC between across-voxel averages. There is a further problem, however, with the pairwise-GC measure: some GC values may be identified as being significant when actually they are not. This problem arises, for example, if one voxel (x) drives a second voxel (y), while voxel y “drives” a third voxel (z), without there being a “drive” from voxel x to voxel z (Fig. 3.1A). In this case, the GC from voxel x to voxel z may be spuriously identified as being significant. As another example, the problem also occurs if voxel x “drives” both voxels y and z with different delays, without there being a “drive” from y to z (Fig. 3.1B). In this case, the GC from y to z may be spuriously identified as being significant. Since x, y, and z may be in the same or different ROIs, these examples make it clear that the GC within ROIs should be taken into account in order to reduce the possibility of spuriously identifying GCs as being significant.

Figure 3.1: Simple driving patterns that can lead to spurious identification of significant Granger Causality. A) Sequential driving pattern, where voxel x drives voxel y, which in turn drives voxel z. GC from x to z may be spuriously identified as being significant. B) Differentially delayed driving, where voxel x drives voxel y with shorter delay and z with longer delay. GC from y to z may be spuriously identified as being significant. Modified from (Chen et al., 2006).

Our approach to the problem of spurious GC significance rests on the concept of conditional GC (Geweke, 1984; Chen et al., 2006; Seth and Edelman, 2007). Conditional GC analysis tests for a significant GC from one time series to a second with the effect of a third time series removed. By this procedure, it is possible to determine whether a significant GC measured between two time series is attributable to the third time series. In this paper, we utilize the conditional GC concept for ROI-level GC analysis in an approach that essentially measures the GC between any pair of voxels in two ROIs conditional on all the other voxels in the ROIs. This is accomplished by constructing a single Multivariate Vector AutoRegressive (MVAR) model from the time series of all voxels, as opposed to the pairwise-GC method, in which a separate bivariate AR model is constructed for each voxel pair. Use of the MVAR model offers the promise of reducing or eliminating the problem of spurious significant GC identification in the assessment of network functional interdependence from fMRI BOLD signals.
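The sequential driving pattern of Fig. 3.1A can be reproduced in a small simulation. The sketch below is illustrative only: the three series, the coupling strength of 0.8, the model order of 2, and the least-squares F test are assumptions, and a single conditioning series stands in for "all the other voxels". The pairwise test from x to z is expected to be strongly significant even though x never drives z directly, while the same test conditioned on y should not be.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def rss(rows, targets):
    """Residual sum of squares of an OLS fit via the normal equations."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, t in zip(rows, targets))

def lag_rows(series_list, p, n):
    """Design rows [1, s(t-1), ..., s(t-p) for each series s] for t = p..n-1."""
    return [[1.0] + [s[t - m] for s in series_list for m in range(1, p + 1)]
            for t in range(p, n)]

def granger_f(source, target, conditioning=(), p=2):
    """F statistic for source -> target, optionally conditional on other series."""
    n = len(target)
    y = target[p:]
    base = [target, *conditioning]
    rows_r = lag_rows(base, p, n)              # without the source lags
    rows_ur = lag_rows(base + [source], p, n)  # with the source lags
    rss_r, rss_ur = rss(rows_r, y), rss(rows_ur, y)
    dof = len(y) - len(rows_ur[0])
    return ((rss_r - rss_ur) / p) / (rss_ur / dof)

# Hypothetical chain x -> y -> z with one-sample delays.
rng = random.Random(0)
n = 1000
x = [rng.gauss(0, 1) for _ in range(n)]
y = [0.0] * n
z = [0.0] * n
for t in range(1, n):
    y[t] = 0.8 * x[t - 1] + rng.gauss(0, 1)
    z[t] = 0.8 * y[t - 1] + rng.gauss(0, 1)

f_pairwise = granger_f(x, z)                        # spuriously large
f_conditional = granger_f(x, z, conditioning=[y])   # small once y is included
```

A full MVAR model generalizes this by including the lags of every voxel at once, which is what makes the number of coefficients grow so quickly.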

Using the MVAR model for ROI-level GC analysis requires overcoming one further problem that often occurs in model estimation: too few observations (data points) may be available to accurately estimate the parameters (model coefficients). This problem commonly arises in neurobehavioral studies because the number of data points that can be acquired is restricted, so that the largest MVAR model that can be reliably estimated is too small for assessing ROI-ROI interdependence. This limitation can be mitigated, however, if it is assumed that the voxel-voxel functional interdependence between ROIs is sparse (i.e., has a low density of connectivity) (Valdes-Sosa et al., 2005). The Least Absolute Shrinkage and Selection Operator (LASSO) algorithm (Tibshirani, 1996) uses the assumption of sparseness (low connectivity density) to deal with the problem of having too few observations for the number of model coefficients that must be estimated. This algorithm has previously been tested in numerical experiments (Arnold et al., 2007), on gene-network data (Shojaie and Michailidis, 2010), and on simulated and experimental fMRI BOLD data (Valdes-Sosa et al., 2005; Sanchez-Bornot et al., 2008). Here we demonstrate that the LASSO algorithm provides an effective way to estimate the coefficients of a voxel-based MVAR model of two predefined ROIs, and thus to measure the distributions of voxel-to-voxel GCs between the two ROIs.

In the Materials and Methods section, we describe: (1) the MVAR model for fMRI voxel-level BOLD time series from two ROIs; (2) the LASSO algorithm to estimate the MVAR model; (3) a criterion for determining optimal predictors in the MVAR model with the LASSO algorithm; and (4) two types of summary statistics at the ROI level that separately measure the density and the strength of GC between ROIs. In the Results section, we report on MVAR model simulations demonstrating that voxel-based approaches can better capture the GC between two ROIs than the averaging approach. When LASSO is used to estimate the MVAR model, voxel-based GC summary statistics are sensitive to coefficient changes in the model, whereas GC values computed from averaged signals are not. LASSO estimation fits the simulation model accurately as long as the GC functional connectivity density is relatively low, i.e., GC functional connectivity is sparse. We also report that construction of the voxel-based GC distribution by pairwise bivariate AR model estimation, instead of by MVAR model estimation by LASSO, may yield spuriously significant GC values.

Finally, we present an application of MVAR model estimation by LASSO to the problem of determining GC between ROIs in an fMRI BOLD dataset obtained during a visuospatial attention task (Sylvester et al., 2007). The results indicate that the assumption of sparse GC functional connectivity is realistic, and that LASSO MVAR model estimation is thus effective, for empirical fMRI BOLD data. Also, the low GC connectivity density observed for this dataset suggests that interregional GC interdependence is heterogeneous and that averaging the voxels of an ROI prior to GC connectivity analysis is inappropriate. Furthermore, the observed directional asymmetry, as measured by the average GC strength summary statistic, is consistent with current theory on top-down modulation in visuospatial attention. We conclude that LASSO is a useful tool that can aid in the measurement of Granger Causality between cortical regions from voxelwise fMRI BOLD signals. With this tool, it is beneficial to analyze all the voxels in an ROI, instead of taking an average over the ROI, with the result that functional interdependence is captured with less distortion of the information carried in the BOLD time series.

Materials and Methods

The MultiVariate AutoRegressive (MVAR) model. We first consider an fMRI BOLD dataset from m voxels in ROI X and n voxels in ROI Y. The dataset consists of time series of t points recorded from every voxel in X and Y, and can be written in matrix form as:

$$X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1t} \\ x_{21} & x_{22} & \cdots & x_{2t} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m1} & x_{m2} & \cdots & x_{mt} \end{pmatrix} \tag{3.1}$$

$$Y = \begin{pmatrix} y_{11} & y_{12} & \cdots & y_{1t} \\ y_{21} & y_{22} & \cdots & y_{2t} \\ \vdots & \vdots & \ddots & \vdots \\ y_{n1} & y_{n2} & \cdots & y_{nt} \end{pmatrix} \tag{3.2}$$

The relationship between X and Y can be expressed in the form of a Multivariate Vector AutoRegressive (MVAR) model. A general matrix representation of the model is:

$$Z_t = \sum_{k=1}^{p} B_k Z_{t-k} + E_t \tag{3.3}$$

where $Z_t$ is the dependent variable in vector form, representing the BOLD data values at arbitrary time t of all voxels in X and Y; $Z_{t-k}$ represents the values of the Z vector at arbitrary earlier time point t − k; lag k ranges from 1 to p, the model order; $B_k$ is the corresponding coefficient matrix at lag k; and $E_t$ is the residual vector.

When expanded, the product term in Eq. 3.3 becomes:

$$B_k Z_{t-k} = \begin{pmatrix} b^k_{11} & \cdots & b^k_{1m} & b^k_{1(m+1)} & \cdots & b^k_{1(m+n)} \\ \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ b^k_{m1} & \cdots & b^k_{mm} & b^k_{m(m+1)} & \cdots & b^k_{m(m+n)} \\ b^k_{(m+1)1} & \cdots & b^k_{(m+1)m} & b^k_{(m+1)(m+1)} & \cdots & b^k_{(m+1)(m+n)} \\ \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ b^k_{(m+n)1} & \cdots & b^k_{(m+n)m} & b^k_{(m+n)(m+1)} & \cdots & b^k_{(m+n)(m+n)} \end{pmatrix} \begin{pmatrix} x_{1(t-k)} \\ \vdots \\ x_{m(t-k)} \\ y_{1(t-k)} \\ \vdots \\ y_{n(t-k)} \end{pmatrix} \tag{3.4}$$

Each element of the $Z_{t-k}$ vector is a predictor, and each element ($b^k_{ij}$) of the $B_k$ matrix is a coefficient representing the degree of prediction of the ith element of $Z_t$ by the jth predictor. If a value of $b^k_{ij}$ significantly differs from zero, then a significant GC is said to exist from voxel j to voxel i. The magnitude (strength) of that GC may be assessed by the magnitude of the statistic (e.g., t-statistic) used to measure the difference of the b value from zero. The sum of product terms over all lags is the total prediction of $Z_t$ by the model.

The model order (p) was set to one in this paper, based on our prior experience with the analysis of fMRI BOLD data (Bressler et al., 2008). The MVAR model in Eq. 3.3, with model order one, was used here for both simulation and GC analysis. For simulation, the residual vector represented an innovation process that generated random values, the B matrix was known, and the X and Y time series data were simulated. For GC analysis, the X and Y time series data were known, the B matrix was estimated in order to determine GC, and the residual vector represented prediction errors.

We also employed pairwise-GC analysis for comparison with MVAR analysis. In the pairwise-GC approach, coefficients are estimated (and the significance of GC determined) by constructing a separate bivariate model for each pair of voxels, one in X and one in Y:

$$\begin{cases} x_{it} = \sum_{k=1}^{p} b^k_{ii} x_{i(t-k)} + \sum_{k=1}^{p} b^k_{ij} y_{j(t-k)} + \varepsilon_{it} \\ y_{jt} = \sum_{k=1}^{p} b^k_{ji} x_{i(t-k)} + \sum_{k=1}^{p} b^k_{jj} y_{j(t-k)} + \varepsilon_{jt} \end{cases} \tag{3.5}$$

In pairwise-GC analysis, the assumption is made that the predictors are independent of one another. Under this assumption, the GC between X and Y can be assessed solely from the bivariate models in Eq. 3.5, and it is not necessary to estimate the coefficients representing GC within X or Y. In fact, however, the predictors may be correlated for BOLD time series, making the pairwise-GC approach problematic. If the predictors are correlated, estimation by separate bivariate (or partial) models may be biased, and all of the coefficients in the B matrix should be estimated simultaneously (Greene, 2003). Nonetheless, simultaneous estimation may be impossible in the analysis of data from neurobehavioral studies, in which the number of observations is often limited.

The Least Absolute Shrinkage and Selection Operator (LASSO). The Least Absolute Shrinkage and Selection Operator (LASSO) technique is a method that makes model estimation feasible when only a limited number of observations is available. Under the assumption that the B matrix is sparse (i.e., many coefficients are zero), the LASSO algorithm effectively determines which b values are actually zero. Our goal in using LASSO is to identify non-zero coefficients and then estimate them simultaneously, thus avoiding bias due to partial regression with correlated predictors. The pre-selection process in LASSO involves determining an optimal set of predictors.

In the MVAR model, pre-selection is carried out in a row-wise manner. LASSO adds a constraint on each row equation of Eq. 3.3 that restricts the sum of the absolute values of the coefficients. The constraint is expressed as:

$$\sum_{k} \sum_{j} |b^k_{ij}| \le c \tag{3.6}$$

where c is a tuning parameter.

Regression of the ith row of Eq. 3.3 under the constraint provided by Eq. 3.6 is equivalent to the regression of:

$$x_{it} = \sum_{k=1}^{p} \sum_{j=1}^{m} b^k_{ij} x_{j(t-k)} + \sum_{k=1}^{p} \sum_{l=m+1}^{m+n} b^k_{il} y_{(l-m)(t-k)} + \lambda_i \sum_{k=1}^{p} \sum_{j=1}^{m+n} |b^k_{ij}| + \varepsilon_{it} \tag{3.7}$$

Finding a least-squares solution of Eq. 3.7 requires a subset of the b values to be set to zero. To achieve this goal we use the Least Angle RegreSsion (LARS) algorithm developed by Efron et al. (2004), which starts with all bs equal to zero and then iteratively adjusts their values to fit the model. Some of the bs remain zero after the adjustment, resulting in the identification of an optimal set of non-zero b values for a particular c value (corresponding to λ in Eq. 3.7).
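To make the shrinkage step concrete, the following is a minimal sketch of LASSO estimation for a single row equation. It deliberately swaps in a different solver: cyclic coordinate descent on the penalized form (with penalty `lam` playing the role of λ), not the LARS algorithm used in this work. The function names, the toy design matrix, and the penalty value are hypothetical, and Python with NumPy is used purely for illustration.

```python
import numpy as np

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; values within the threshold become 0."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)

def lasso_row(design, response, lam, n_iter=200):
    """Minimise 0.5*||response - design @ b||^2 + lam*||b||_1 by coordinate descent.
    Stand-in for LARS; both target a LASSO solution, but the algorithms differ."""
    n_pred = design.shape[1]
    b = np.zeros(n_pred)
    col_sq = (design ** 2).sum(axis=0)
    resid = response.astype(float).copy()
    for _ in range(n_iter):
        for j in range(n_pred):
            if col_sq[j] == 0.0:
                continue
            resid += design[:, j] * b[j]       # remove predictor j's current contribution
            rho = design[:, j] @ resid         # correlation with the partial residual
            b[j] = soft_threshold(rho, lam) / col_sq[j]
            resid -= design[:, j] * b[j]       # restore with the updated coefficient
    return b

rng = np.random.default_rng(1)
n_obs, n_pred = 30, 40                          # fewer observations than predictors
design = rng.normal(size=(n_obs, n_pred))
true_b = np.zeros(n_pred)
true_b[[3, 17, 25]] = [1.5, -2.0, 1.0]          # sparse ground truth (hypothetical)
response = design @ true_b + 0.1 * rng.normal(size=n_obs)

lam_max = np.max(np.abs(design.T @ response))   # penalties above this leave every b at zero
b_hat = lasso_row(design, response, 0.2 * lam_max)
print("indices of non-zero coefficients:", np.flatnonzero(b_hat))
```

The point of the example is that a solution exists even though there are more coefficients than observations, because the penalty forces most coefficients to exactly zero.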

The Generalized Cross-Validation (GCV) criterion for determining optimal predictors. The next step in model estimation is to tune the parameter c to achieve a best fit of Eq. 3.7. The minimum value that c can take is zero, corresponding to the extreme case where all bs are zero. The upper boundary is reached when LARS does not shrink any b to zero, making c equal to the sum of the absolute values of all bs. Within this interval, a number (approximately 100 in our case) of c values are chosen to compute the subsets of bs. For each c value, after an optimal subset has been found, the corresponding Residual Sum of Squares (RSS) is used to calculate a Generalized Cross-Validation (GCV) statistic (Valdes-Sosa et al., 2005):

$$\mathrm{GCV} = \frac{\mathrm{RSS}}{(n - df)^2} \tag{3.8}$$

where n is the number of independent observations (not the number of voxels in ROI Y) and df is the estimated degrees of freedom from the LARS algorithm. From all the solutions, a GCV curve is plotted. The minimum GCV value determines the optimal set of predictors over all c values.
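As a small worked illustration of Eq. 3.8, the sketch below scores a handful of candidate solutions and picks the one with the minimum GCV value. The RSS and df numbers are invented for the example, and the function names are hypothetical; in practice each candidate would come from one value of the tuning parameter c.

```python
import numpy as np

def gcv(rss, n_obs, df):
    """GCV statistic in the form of Eq. 3.8: RSS / (n - df)^2."""
    return rss / (n_obs - df) ** 2

def select_by_gcv(rss_values, df_values, n_obs):
    """Return the index of the candidate with minimum GCV, plus all GCV scores."""
    scores = np.array([gcv(r, n_obs, d) for r, d in zip(rss_values, df_values)])
    return int(np.argmin(scores)), scores

# Hypothetical path of five candidates: RSS falls as more predictors enter,
# while the degrees of freedom (df) rise.
rss_path = [100.0, 60.0, 40.0, 38.0, 37.0]
df_path = [0, 1, 2, 5, 10]
best, scores = select_by_gcv(rss_path, df_path, n_obs=50)
print("GCV scores:", np.round(scores, 5))
print("selected candidate index:", best)
```

Here the third candidate is selected: the later candidates reduce RSS only slightly, and that gain is outweighed by the shrinking denominator.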

A subsequent Ordinary Least Squares (OLS) procedure is then applied to the new row equation with the selected predictors. If the model order is one, as in our application, there is only one coefficient for each predictor. Either an F-test or a t-test is performed for each coefficient to determine whether its value is significantly different from zero. The resulting F-score or t-score characterizes the prediction of the dependent variable on the left-hand side of Eq. 3.3 by a predictor on the right-hand side, and corresponds to the GC strength from that predictor to the dependent variable. Here we used the t-score to measure GC because it has a signed value, and thus indicates whether the GC is enhancing or reducing, in addition to indicating GC strength.

Summary statistics of GC between two ROIs. The full B matrix may be estimated by following the above procedures for every row equation in Eq. 3.3. It consists of four submatrices ($B_{xy}$, $B_{yx}$, $B_{xx}$, and $B_{yy}$), where the first subscripted index represents the predictor and the second represents the dependent variable. Thus, $B_{xy}$ represents connectivity from X to Y, $B_{yx}$ represents connectivity from Y to X, and $B_{xx}$ and $B_{yy}$ represent connectivity within X and within Y, respectively. In order to measure GC from one ROI to another (i.e., X → Y or Y → X), one or more statistics are needed to summarize the voxel-to-voxel GCs represented by significant coefficients in $B_{xy}$ or $B_{yx}$.
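Under the subscript convention just stated, and for model order one, Eq. 3.3 can be written in block form. This is only a restatement of Eqs. 3.3 and 3.4; the symbols $X_t$ and $Y_t$ (the columns of X and Y at time t) and the residual blocks $E^x_t$ and $E^y_t$ are notation introduced here for illustration:

$$\begin{pmatrix} X_t \\ Y_t \end{pmatrix} = \begin{pmatrix} B_{xx} & B_{yx} \\ B_{xy} & B_{yy} \end{pmatrix} \begin{pmatrix} X_{t-1} \\ Y_{t-1} \end{pmatrix} + \begin{pmatrix} E^x_t \\ E^y_t \end{pmatrix}$$

Because the first subscript names the predictor ROI, $B_{yx}$ (m × n) sits in the upper-right block and multiplies $Y_{t-1}$ in the equation for $X_t$, while $B_{xy}$ (n × m) sits in the lower-left block and multiplies $X_{t-1}$ in the equation for $Y_t$.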

The first summary statistic that we used was the fraction (f) of significant b values in the B matrix or one of its submatrices (representing the fraction of significant GCs) (Fig. 3.2). The b values were tested for a significant difference from zero at p < 0.05, corrected for multiple comparisons by the False Discovery Rate (FDR). This summary statistic is a measure of density of the ROI-level connectivity. Because each b value represents a potential functional “connection”, the f summary statistic summarizes the fraction of all possible voxel-to-voxel connections from one ROI to another by which the two ROIs are actually connected.
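The f statistic can be sketched in a few lines. The text does not say which FDR procedure was applied, so the example below assumes the Benjamini-Hochberg step-up procedure; the function names and the small p-value matrix are hypothetical.

```python
import numpy as np

def bh_significant(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure (an assumed choice of FDR method).
    Returns a boolean mask with the same shape as p_values."""
    p = np.asarray(p_values, dtype=float)
    flat = p.ravel()
    n_tests = flat.size
    order = np.argsort(flat)
    ranked = flat[order]
    below = ranked <= q * np.arange(1, n_tests + 1) / n_tests
    mask = np.zeros(n_tests, dtype=bool)
    if below.any():
        cutoff = np.max(np.nonzero(below)[0])   # largest rank that meets its bound
        mask[order[:cutoff + 1]] = True
    return mask.reshape(p.shape)

def fraction_significant(p_values, q=0.05):
    """f summary statistic: fraction of coefficients that survive FDR correction."""
    return float(bh_significant(p_values, q).mean())

# Hypothetical p-values for a 2 x 3 submatrix (rows: receiving voxels,
# columns: sending voxels).
p_matrix = np.array([[0.001, 0.20, 0.03],
                     [0.50, 0.004, 0.80]])
mask = bh_significant(p_matrix)
f_value = fraction_significant(p_matrix)
print(mask)
print("f =", round(f_value, 4))
```

Note that the entry with p = 0.03 passes an uncorrected 0.05 threshold but not the corrected one, so only two of the six possible connections count toward f in this example.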

The second summary statistic used was the average strength of significant GC from voxels in one ROI to voxels in another (Fig. 3.2). Consider, for example, the GC from (“sending”) ROI Y to (“receiving”) ROI X. Significant voxel-to-voxel GCs are represented by significant coefficients in $B_{yx}$. For any given voxel x in ROI X having at least one significant (p < 0.05) t-score (indicating a significant GC) from ROI Y, we first summed the t-scores of all the GCs to x. This sum represents the total significant “input” to the “receiving” voxel x from all “sending” voxels in Y. Because the t-scores can be positive or negative, signifying that changes of activity in the “sending” voxel contribute to a change of activity in the “receiving” voxel either in the same or opposite direction, the sum of t-scores takes into account the balancing effect of positive and negative inputs to the same receiving voxel. We then computed the average strength of significant input over all receiving voxels in ROI X as the W summary statistic. The same procedure was also followed to assess the average strength of significant GC in the other direction, i.e., from ROI X to ROI Y using $B_{xy}$. Although not the focus of this paper, W could also be computed to assess the average strength of significant GC within ROI X using $B_{xx}$, or within ROI Y using $B_{yy}$.

W measures the average strength of GC from ROI Y to ROI X, but is not simply a weighted version of the f summary statistic. A high W value from Y to X depends on a combination of the following: 1) many voxels in Y have high GC values to voxels in X; and 2) single voxels in X have significant GC values from multiple voxels in Y.
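A minimal sketch of the W computation follows. Two readings of the description are possible and the code makes the choice explicit: by default the average is taken over receiving voxels that have at least one significant input, which is an assumption; setting the hypothetical flag `include_silent=True` averages over every receiving voxel instead. Only significant t-scores are summed, and the array layout (rows as receiving voxels, columns as sending voxels) follows Eq. 3.4.

```python
import numpy as np

def w_statistic(t_scores, significant, include_silent=False):
    """Average summed significant t-score per receiving voxel.
    t_scores[i, j]: t-score for GC from sending voxel j to receiving voxel i.
    significant[i, j]: True where that t-score passed the significance test."""
    t_scores = np.asarray(t_scores, dtype=float)
    significant = np.asarray(significant, dtype=bool)
    # Signed sum of significant inputs to each receiving voxel.
    summed = np.where(significant, t_scores, 0.0).sum(axis=1)
    if include_silent:
        return float(summed.mean())
    has_input = significant.any(axis=1)
    if not has_input.any():
        return 0.0
    return float(summed[has_input].mean())

# Hypothetical 3 x 3 example: receiving voxel 1 has no significant input.
t_yx = np.array([[2.5, -3.0, 0.4],
                 [0.1, 0.2, 0.3],
                 [4.0, 0.0, 2.2]])
sig = np.array([[True, True, False],
                [False, False, False],
                [True, False, True]])
print("W over voxels with input:", w_statistic(t_yx, sig))
print("W over all receiving voxels:", w_statistic(t_yx, sig, include_silent=True))
```

In this toy case the first receiving voxel has inputs of opposite sign that nearly cancel (sum of about -0.5), while the third has two positive inputs (sum of about 6.2), giving W of roughly 2.85 under the default reading.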

Simulation models were constructed using the R statistical computing package. For the purpose of comparing the GC strength of a simulation model with its estimated values, we computed the simulation W statistic directly from the b values of the simulation model.

To make the estimated and simulation W measures comparable, we normalized the t and b values to z-scores (i.e., subtracted the mean and then divided by the standard deviation).

Figure 3.2: Schematic illustration of the computation of summary statistics f and W for hypothetical submatrix $B_{yx}$. Red dots represent the voxels of ROI Y, green dots the voxels of ROI X, and arrows the significant t-values between inter-region voxel pairs. Positive values are colored orange and negative values are colored blue.

Results

1. Application to simulated data

Simulation MVAR models were created based on Eq. 3.3 (see Materials and Methods), and iterated to generate simulated fMRI BOLD time series data for pseudo-voxels in two pseudo-ROIs having fixed sizes (30 pseudo-voxels in X and 50 pseudo-voxels in Y). The innovation process for the simulation model was created by iterative random sampling of a zero-mean normal distribution with a standard deviation of 0.1. The predictors were initialized with random values taken from a zero-mean normal distribution with a standard deviation of 0.1. The four coefficient submatrices ($B_{xx}$, $B_{yx}$, $B_{xy}$ and $B_{yy}$) were constructed separately. For each submatrix, some coefficients ($b_{ij}$) were randomly set to zero and the rest were randomly drawn from a zero-mean normal distribution with a specific standard deviation (0.08 for $B_{xx}$ and $B_{yy}$, 0.2 for $B_{yx}$, and 0.1 for $B_{xy}$). For each simulation, a 200-point-long time series was created for each pseudo-voxel by model iteration. A total of 56 simulation models were created. The density of model connectivity was systematically increased with increasing model identification number by augmenting the number of voxel pairs connected by non-zero b values (Table 3.1).
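The construction described above can be sketched as follows. The ROI sizes, standard deviations, series length, and noise level are taken from the text; everything else is an assumption made for illustration, including the use of Python rather than R, the function names, the single shared density of 0.05, and the use of independent random draws to decide which coefficients are zero (which fixes the density only approximately).

```python
import numpy as np

def sparse_block(n_rows, n_cols, density, sd, rng):
    """Coefficient block with roughly a fraction `density` of entries drawn
    from N(0, sd^2) and all remaining entries set to zero."""
    block = rng.normal(0.0, sd, size=(n_rows, n_cols))
    block[rng.random((n_rows, n_cols)) >= density] = 0.0
    return block

def build_b_matrix(m, n, density, rng):
    """Assemble the (m+n) x (m+n) order-one coefficient matrix. Rows index the
    dependent (receiving) voxel and columns the predictor (sending) voxel."""
    within_x = sparse_block(m, m, density, 0.08, rng)   # Bxx
    y_to_x = sparse_block(m, n, density, 0.20, rng)     # Byx
    x_to_y = sparse_block(n, m, density, 0.10, rng)     # Bxy
    within_y = sparse_block(n, n, density, 0.08, rng)   # Byy
    return np.block([[within_x, y_to_x], [x_to_y, within_y]])

def simulate_mvar1(b, n_points, noise_sd, rng):
    """Iterate Z_t = B Z_{t-1} + E_t from a random initial state."""
    n_vox = b.shape[0]
    z = np.zeros((n_vox, n_points))
    z[:, 0] = rng.normal(0.0, noise_sd, size=n_vox)     # initial predictor values
    for t in range(1, n_points):
        z[:, t] = b @ z[:, t - 1] + rng.normal(0.0, noise_sd, size=n_vox)
    return z

rng = np.random.default_rng(0)
b = build_b_matrix(m=30, n=50, density=0.05, rng=rng)
z = simulate_mvar1(b, n_points=200, noise_sd=0.1, rng=rng)
print("coefficient matrix:", b.shape, "simulated series:", z.shape)
print("fraction of non-zero coefficients:", np.count_nonzero(b) / b.size)
```

The first 30 rows of `z` correspond to pseudo-ROI X and the remaining 50 to pseudo-ROI Y. To mimic Table 3.1 more closely, each block could be given its own density instead of the shared value used here.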

We first considered the effect of averaging the BOLD activity of all voxels in an ROI on the measurement of interregional GC. The GC between two ROIs, each of which is represented by an averaged time series, was measured by a single t-score in each direction. For it to properly portray the connectivity between ROIs, the t-score was expected to follow the change of parameters across the simulation models shown in Table 3.1. We tested this prediction by measuring the correlation of the t-scores from the averaged time series with two summary statistics (the fraction of significant connections, f, and the average connectivity strength, W) (see Materials and Methods and Fig. 3.2). These summary statistics were computed directly from the simulation models, and thus followed the change of parameters across the simulation models. The t-scores were not significantly correlated (at the p < 0.05 level) with either summary statistic (Fig. 3.3). That the t-scores did not follow the change of parameters across simulation models indicates that computing GC from averaged voxel time series does not accurately capture inter-ROI connectivity patterns.

Sim.  Bxx     Byy     Byx     Bxy       Sim.  Bxx     Byy     Byx     Bxy
 1    0.0656  0.0592  0.0493  0.0420    29    0.1944  0.1768  0.1593  0.1580
 2    0.0656  0.0592  0.0473  0.0373    30    0.1944  0.1768  0.1413  0.1473
 3    0.0656  0.0592  0.0453  0.0427    31    0.1944  0.1768  0.1373  0.1627
 4    0.0656  0.0592  0.0400  0.0433    32    0.1944  0.1768  0.1240  0.1647
 5    0.0656  0.0592  0.0427  0.0453    33    0.1944  0.1964  0.1853  0.1500
 6    0.0656  0.0592  0.0413  0.0520    34    0.1944  0.1964  0.1707  0.1460
 7    0.0656  0.0592  0.0320  0.0580    35    0.2267  0.2160  0.1807  0.1620
 8    0.0656  0.0592  0.0340  0.0507    36    0.2267  0.2160  0.2000  0.1920
 9    0.0978  0.0984  0.0973  0.0553    37    0.2267  0.2160  0.1800  0.1767
10    0.0978  0.0984  0.0860  0.0727    38    0.2267  0.2160  0.1727  0.2193
11    0.0978  0.0984  0.0827  0.0760    39    0.1944  0.1964  0.1367  0.1860
12    0.0978  0.0984  0.0860  0.0607    40    0.2267  0.2160  0.1687  0.2160
13    0.0978  0.0984  0.0907  0.0680    41    0.2589  0.2552  0.2380  0.2200
14    0.0978  0.0984  0.0713  0.0847    42    0.2267  0.2356  0.2113  0.1947
15    0.0978  0.0984  0.0787  0.0800    43    0.2267  0.2356  0.2213  0.1927
16    0.0978  0.0984  0.0713  0.0907    44    0.2267  0.2356  0.2113  0.2180
17    0.1300  0.1376  0.1233  0.0933    45    0.2589  0.2552  0.2220  0.2360
18    0.1300  0.1376  0.1220  0.1073    46    0.2267  0.2356  0.2027  0.2113
19    0.1300  0.1376  0.1187  0.1107    47    0.2267  0.2356  0.1740  0.2013
20    0.1300  0.1376  0.1300  0.0993    48    0.2589  0.2552  0.1720  0.2613
21    0.1300  0.1376  0.1193  0.1147    49    0.2911  0.2944  0.2847  0.2467
22    0.1300  0.1376  0.0993  0.1213    50    0.2589  0.2552  0.2387  0.1773
23    0.1300  0.1376  0.0980  0.1313    51    0.2911  0.2944  0.2540  0.2680
24    0.1300  0.1376  0.0880  0.1327    52    0.2589  0.2552  0.2373  0.2120
25    0.1944  0.1768  0.1800  0.0920    53    0.2589  0.2552  0.1880  0.2280
26    0.1944  0.1768  0.1727  0.0933    54    0.2911  0.2944  0.2367  0.2947
27    0.1944  0.1768  0.1587  0.1473    55    0.2911  0.2944  0.2053  0.2680
28    0.1944  0.1768  0.1560  0.1500    56    0.2911  0.2944  0.1827  0.2907

Table 3.1: The fraction of non-zero coefficients in each of the 4 submatrices for each of the 56 simulation models. The fraction changed over the models from values of approximately 0.05 to values of approximately 0.29, increasing by approximately 0.04 every 8 models.

Figure 3.3: Granger Causality patterns between simulated ROIs. GC was computed as $t_{yx}$ and $t_{xy}$ using averaged voxel time series and then normalized to z-scores, and was compared across simulation models against the corresponding voxel-based f and W summary statistics computed directly from the model parameters (b values were normalized to z-scores before computing W). The horizontal axis labels the 56 simulation models in the order of Table 3.1, representing different connectivity parameter settings. The t-values do not significantly correlate with either W or f across simulation models, demonstrating that GC computed from averaged voxel time series is not sensitive to true connectivity.

We next examined how well voxel-based methods recovered the actual GC patterns of the four submatrices across the simulation models shown in Table 3.1. Analysis for each simulation model consisted of estimating the full B matrix from the simulated data generated by that model by both the pairwise-GC and LASSO-GC methods, and comparing the results with the actual values in the model. Each of these two methods is voxel-based. The pairwise-GC method constructs the B matrix by estimating a separate bivariate AR model for each voxel pair, whereas the LASSO-GC method computes the B matrix by estimating an MVAR model that incorporates all voxel pairs. Unlike the approach of averaging across voxels, both methods compute a t-score for each b coefficient in the B matrix, testing whether the value of that coefficient significantly deviates from zero. A significant non-zero b value is equivalent to a significant GC value when the model order is one. Fig. 3.4 illustrates the results from a simulation in which the LASSO-GC method (Fig. 3.4B) closely estimated the pattern of b values of the model (Fig. 3.4A), whereas the pairwise-GC method (Fig. 3.4C) yielded a large number of spurious non-zero values.

To determine how typical the results seen in Fig. 3.4 were across all simulation models, summary statistics from pairwise-GC and LASSO-GC estimations were compared with those computed directly from the models. First to be used was the f summary statistic, which reflects the fraction of significant b values. Fig. 3.5 compares how well the pairwise-GC and LASSO-GC methods recovered the actual f summary statistic computed directly from the simulation models. It reveals that in most simulations the f summary statistic from pairwise-GC estimation was greater than the actual simulation model value, whereas that from LASSO-GC estimation closely matched the actual simulation model value. We defined the distance between estimated and model f values by their absolute difference, and compared the distances resulting from the LASSO-GC method with those from the pairwise-GC method. Paired t-tests showed highly significantly (p < 0.01) smaller distances with the LASSO-GC method for all four submatrices.

Figure 3.4: Comparison of model estimation by LASSO-GC and pairwise-GC methods for one simulation model. The X voxels in A-C are represented by green dots and Y voxels by red dots. All the t and b values are z-normalized. A) Simulated connectivity pattern of the model for the four B submatrices, with orange arrows representing positive b values and blue arrows negative b values. B) Estimated connectivity pattern with the LASSO-GC method. Significant t-values are shown as arrows, with the thickness representing the absolute magnitude of the t-values, and the color representing the sign of the t-value (orange for positive, blue for negative). The pattern is similar to that in the model. C) Estimated connectivity pattern with the pairwise-GC method, shown in the same manner as for the LASSO-GC result. The connectivity is much denser than the model pattern. D) Summary statistics for the patterns shown in the previous three panels. LASSO-GC matches the modeled values more closely than pairwise-GC.

Figure 3.5: Comparison of LASSO-GC and pairwise-GC methods in recovering the f summary statistic. The fraction of significant b coefficients (f summary statistic) in each submatrix, computed directly from the simulation model, was compared with the f statistic estimated by the LASSO-GC and pairwise-GC methods. The estimated LASSO-GC f statistic more closely matches the f statistic of the model across simulation models than does the estimated pairwise-GC f statistic. The horizontal axis is arranged the same way as in Fig. 3.3. The example shown in Fig. 3.4 is from the 28th model.

The W summary statistic, which reflects the average strength of significant GC from voxels in one ROI to voxels in another, was used next to compare the pairwise-GC and LASSO-GC methods. As with the f statistic, the W statistic from the LASSO-GC method matched the actual W statistic computed from the simulation model more closely than that from the pairwise-GC method (Fig. 3.6). Also as with the f statistic, the distances between estimated and model W values for the two methods were compared. Paired t-tests showed highly significantly (p < 0.01) smaller distances with the LASSO-GC method for all four submatrices.

Figure 3.6: Comparison of LASSO-GC and pairwise-GC methods in recovering the W summary statistic. The average GC strength (W summary statistic) in each submatrix, computed directly from the simulation model, was compared with the W statistic estimated by the LASSO-GC and pairwise-GC methods. The estimated LASSO-GC W statistic more closely matches the W statistic of the model across simulation models than does the estimated pairwise-GC W statistic. Since the estimated W statistic is based on t-values and the W statistic computed directly from the simulation model is based on b coefficient values, both b and t-values were normalized to standard z-scores before calculating W. The horizontal axis is arranged the same way as in Fig. 3.3.

To summarize the results up to this point, the LASSO-GC method was found to outperform the pairwise-GC method and the average-signal based method in recovering simulation model connectivity. We next applied the LASSO-GC method to explore functional connectivity in an empirical fMRI BOLD dataset.

2. Application to fMRI BOLD data from a visuospatial attention task

An fMRI BOLD dataset from a slow event-related visuospatial attention task paradigm (Sylvester et al., 2007; Bressler et al., 2008) was analyzed with the LASSO-GC method. Within each of 6 subjects, the ROIs were bilateral areas V1v, V2v, VP, V3A and V4 in the Visual Occipital Cortex (VOC), and bilateral Frontal Eye Field (FEF) and anterior and posterior IntraParietal Sulcus (aIPS and pIPS) in the Dorsal Attention Network (DAN). MVAR models of order one were estimated from the time series of all voxels from each pair of VOC and DAN ROIs by the LASSO-GC method. Repeated trials (average number 70) at each time point were used as observations. For each ROI pair, a full B matrix was first estimated, and the f and W statistics were then computed for each of the four submatrices.

The results of functional connectivity analysis between the VOC and DAN are presented in Fig. 3.7 for a representative ROI pair in one subject. GC connectivity diagrams are shown between the right VP region (having 25 voxels) in VOC and the right FEF region (having 56 voxels) in the DAN (Fig. 3.7A). The four diagrams represent GC connectivity within right VP (VP → VP), from right FEF to right VP (FEF → VP), from right VP to right FEF (VP → FEF), and within right FEF (FEF → FEF). GC connectivity is sparse both within and between ROIs, meaning that a low fraction of t-scores is significant at p < 0.05 ($f_{VP \to VP} = 0.13$, $f_{FEF \to VP} = 0.09$, $f_{VP \to FEF} = 0.09$, $f_{FEF \to FEF} = 0.07$). Both significantly positive (orange arrows) and significantly negative (blue arrows) GCs are present both within and between ROIs. A positive GC indicates that increased activity of the “sending” voxel predicts increased activity of the “receiving” voxel, whereas a negative GC signifies that increased activity of the “sending” voxel predicts decreased activity of the “receiving” voxel. We also computed the same statistics using a cross-correlation measure. A relatively larger fraction of connections is significant at p < 0.05 ($f_{VP-VP} = 0.74$, $f_{VP-FEF} = 0.36$, $f_{FEF-FEF} = 0.52$) for the same ROI pair and subject (Fig. 3.7B), suggesting that a large portion of the voxels are correlated. The average connectivity strength is greater than 2, both within and between ROIs, indicating that on average each voxel receives connections from more than 2 other voxels. This observation of relatively high correlation density suggests that LASSO-GC is needed to reduce correlation-induced spurious GC estimates.

Figure 3.7: Comparison of connectivity patterns with LASSO-GC and cross-correlation measures. The patterns were computed for one exemplary ROI pair from one subject. The t-scores from LASSO-GC analysis were z-normalized. Green dots represent voxels from right VP and red dots represent voxels from right FEF. A) Estimated connectivity patterns with the LASSO-GC measure. Significant t-values are shown as arrows, with the thickness representing the absolute magnitude of the t-values, and the color representing the sign of the t-value (orange for positive, blue for negative). B) Estimated connectivity patterns with the cross-correlation measure. Significant cross-correlation coefficients are shown as lines, with the thickness representing the absolute magnitude and the color representing the sign (orange for positive, blue for negative). C) Summary statistics for the patterns shown in the previous two panels. For the cross-correlation measure, FEF→VP and VP→FEF have the same summary scores since the measure is non-directional.

To extend the functional connectivity analysis to the full fMRI BOLD dataset, we applied the LASSO-GC method to all 60 VOC-DAN ROI pairs in each of the 6 subjects. The f and W summary statistics were then averaged across ROI pairs and subjects, yielding mean f and W summary statistics for VOC-to-VOC connectivity, DAN-to-DAN connectivity, DAN-to-VOC connectivity, and VOC-to-DAN connectivity (Fig. 3.8). These four connectivity types correspond to the four coefficient submatrices of the estimated B matrix in LASSO-GC analysis. Note that VOC-to-VOC and DAN-to-DAN connectivity refer to connectivity within a single region of VOC or DAN, not to connectivity between different VOC or DAN regions. The mean f summary statistic is below 0.1 for all submatrices, indicating overall sparse within- and between-ROI GC connectivity. Paired-sample t-tests with subjects as repeated measures (df = 5 for all comparisons) were performed on both f and W to compare: (1) top-down (DAN-to-VOC) with bottom-up (VOC-to-DAN) connectivity; (2) within-VOC with within-DAN connectivity; (3) top-down with within-VOC connectivity; and (4) bottom-up with within-DAN connectivity. The comparison of top-down with within-DAN connectivity and the comparison of bottom-up with within-VOC connectivity were not performed because these comparisons are ambiguous. The reason is that these comparisons are based on GC to voxels in a sending region, whereas the W summary statistic is based on voxels in a receiving region (see Materials and Methods and Fig. 3.2).

The functional connectivity analysis results show that within-VOC (VOC→VOC) connectivity was significantly greater than top-down (DAN→VOC) connectivity for both the f (t = 6.34, p < 0.01) and W (t = 4.76, p < 0.05) summary statistics, indicating that the local GC between voxels within VOC is both more dense and stronger than the long-range, top-down GC from the DAN. Connectivity within DAN (DAN→DAN) was also significantly greater than that in the bottom-up direction (VOC→DAN) for the W summary statistic (t = 6.47, p < 0.01) but not for the f summary statistic, indicating that the local GC between voxels within DAN is stronger, but not more dense, than the long-range GC from VOC. Finally, connectivity in the top-down direction (DAN→VOC) was significantly greater than that in the bottom-up direction (VOC→DAN) for the W summary statistic (t = 4.85, p < 0.05) but not for the f summary statistic, indicating a long-range directional strength asymmetry between DAN and VOC, with stronger top-down connectivity.

Figure 3.8: Functional connectivity analysis of Dorsal Attention Network and Visual Occipital Cortex in visual spatial attention. The f and W summary statistics were computed from LASSO-GC for each of 60 ROI pairs and 6 subjects, and then averaged over pairs and subjects. For each ROI pair, one ROI was in the Dorsal Attention Network (DAN) and the other was in Visual Occipital Cortex (VOC). The bars represent mean f and W summary statistics for VOC-to-VOC connectivity, DAN-to-DAN connectivity, DAN-to-VOC connectivity, and VOC-to-DAN connectivity. Error bars represent the standard error of the mean. Significant differences from paired-sample t-tests are marked (*: p < 0.05, **: p < 0.01).

Discussion

We have shown that Granger Causality (GC) computed from voxel-level BOLD signals better reflects the interdependence pattern between ROIs than that computed from voxel-averaged signals. We conclude that brain regions are not unitary elements, that network structure exists at the voxel level, and that ROI-level GC connectivity is best measured by summary scores computed over voxel-level connectivity patterns.

We emphasize that our results apply specifically to GC between pre-defined ROIs, and do not necessarily extend to the computation of maps showing GC between a “seed” signal, averaged over the voxels in one cortical region, and voxels throughout the rest of the cortex (Roebroeck et al., 2005). In fact, the inter-regional methods that we have investigated may prove to be complementary to the mapping method. To explore the relationship of a particular region to the remainder of the cortex, the mapping method would appear to be more appropriate since it yields a global interdependence pattern. Of course, since the mapping method is based on the pairwise-GC approach, it entails the risk of producing spuriously significant GC values. Nonetheless, mapping may be useful as a first step to establish global interdependency patterns, which then can be explored in greater detail by examining the interdependence between regions with voxel-based inter-regional analysis.

In addition to mapping, another analytic method common in the literature examines region-to-region cross-correlations based on averaged signals and identifies topological properties from large-scale networks that involve hundreds of ROIs (Cohen et al., 2008a). Our GC findings do not necessarily negate this approach: since GC and cross-correlation are different measures, inhomogeneity in GC does not imply inhomogeneity in cross-correlation. It is possible that correlation-based connectivity with averaged signals may be effective even though GC analysis requires a voxel-based approach.

We have shown that the LASSO-GC method can better identify GC connectivity between ROIs in simulated fMRI BOLD data than the pairwise-GC method by more accurately estimating the connectivity density and strength. The pairwise-GC method can yield spuriously significant coefficients if correlated predictors are present in the MVAR model. The close fit of the LASSO-GC results to the actual results from the simulation models demonstrates that the LASSO-GC method is able to avoid false positives. The close fit of the LASSO-GC results also shows the sensitivity of this method in detecting model changes. By contrast, GC values computed from averaged data do not systematically follow changes in simulated ROI models, suggesting that summary statistics computed from voxel-to-voxel GCs are better able to represent ROI-level connectivity than single region-to-region GCs computed after averaging over ROI voxels.

The estimated f summary statistics from the LASSO-GC method matched the actual f statistics from the simulation models better when the B matrices were more sparse. Although the LASSO algorithm could potentially fail for high connectivity densities, we were not able to observe such a failure because the simulated voxel activity at high connectivity density becomes unstable. Nonetheless, it is unlikely that the low f values observed for the empirical BOLD data are artifactual because if the B matrices were ill-estimated, then the directional asymmetry found with the W statistic would not display the high degree of consistency across subjects that was observed. Thus, the fact that the range of f found for the empirical BOLD data fell within the range of f in the simulations suggests the suitability of the LASSO-GC technique for application to BOLD data. The low values of the f summary statistic from the empirical BOLD data further suggest that GC connectivity between cortical ROIs is sparse. Given the evidence from anatomical studies that the axonal connectivity between neuronal populations is generally sparse (He et al., 2006; Gong et al., 2008), it is more likely that the sparse GC connectivity reflects actual functional interaction patterns between neurons than that it is a mere statistical byproduct.

Directional asymmetry in GC connectivity between the Dorsal Attention Network (DAN) and Visual Occipital Cortex (VOC) was reported in our previous work (Bressler et al., 2008) using the pairwise-GC method for computing GC and f as the summary statistic. Using the LASSO-GC method, we report here that the directional asymmetry is found in the W, but not the f, summary statistic (Fig. 3.5). The difference in results from the pairwise-GC and LASSO-GC methods may be understood by examining the properties of the W summary statistic. The finding that W values in the top-down DAN-to-VOC direction are significantly greater than in the bottom-up VOC-to-DAN direction suggests that VOC voxels are modulated more strongly by DAN voxels than DAN voxels are by VOC voxels, despite similar fractions of voxels being modulated in both directions. The greater top-down modulation strength may have introduced a bias in the pairwise-GC results from our previous work, yielding an apparently greater fraction of significant top-down GC values. Relatively high cross-correlation density (Fig. 3.6) may have contributed to such a bias.

The problem of bias actually has multiple facets. It is known from theory that the LASSO method may be biased if predictors are highly correlated. There are two main problems caused by correlated predictors. First, some predictors in a system may not be included in the model of the system. This is the case when estimation of multiple bivariate AR models is employed in place of MVAR model estimation: the model estimation may be biased by undetected influences from the excluded predictors. The use of LASSO helps to mitigate this problem by allowing estimation of the full MVAR model. Second, even when all the predictors are taken into account, correlation among predictors may still bias model estimation, a situation often referred to as the collinearity problem for multiple regressions.

Although this collinearity problem is deeper, and a solution is not currently available from theory, it nonetheless does not invalidate our results. We found that the f and W summary statistics effectively recovered the modeled connectivity values from simulated data, even though those data had significant cross-correlations between most voxel pairs. Furthermore, in empirical BOLD data analysis, it is often desirable to compare summary statistics across different conditions rather than to precisely identify their values. For such comparison, any possible bias introduced by voxel-voxel cross-correlations would exist in each condition and thus not alter the comparison.

Although the MVAR models used in this paper were implemented with order one, models having higher order (p > 1 in Eq. 3.3) can be implemented within the same framework, provided that the number of coefficients in the higher-order MVAR model does not exceed that allowed by the number of observations. For model orders greater than one, multiple b coefficients at different time lags (t−k) contribute to the GC from one voxel to another, and it is not sufficient simply to test the significance of a single b coefficient. In that case, testing for significant between-voxel GC would be performed differently, and the summary statistics would accordingly be defined differently. For example, a criterion for between-voxel GC to be significant might be that at least one of the b coefficients from different lags must be significant. A summary statistic equivalent to f might then be defined as the fraction of significant between-voxel GCs rather than the fraction of significant b values. Similarly, a summary statistic equivalent to W could base the average strength of significant GC on all significant b values for a voxel pair instead of a single b value. A straightforward way to do this would be to sum the significant b values from different lags over all inputs to receiving voxels. In this way the W statistic would be sensitive to three factors: the magnitude of all significant b values, their corresponding time lags, and the total number of converging significant inputs to receiving voxels. However, to compare the W statistic between models of different order, the time-lag factor would need to be removed, possibly by averaging b values over time lags, to avoid bias due to the total number of b values.
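The "at least one significant lag" criterion for a higher-order equivalent of f can be sketched as follows. This is only an illustration under stated assumptions: the function name `fraction_significant` and the index layout `sig[k][i][j]` (lag, receiving voxel, sending voxel) are hypothetical, and the significance flags are taken as given rather than computed.

```python
def fraction_significant(sig):
    """Fraction of (receiving, sending) voxel pairs with significant GC.

    sig[k][i][j] is True when the coefficient at lag k+1 from sending
    voxel j to receiving voxel i is significant (assumed layout). A pair
    counts as significant if at least one of its lags is significant.
    """
    n_recv = len(sig[0])
    n_send = len(sig[0][0])
    hits = sum(
        any(lag[i][j] for lag in sig)
        for i in range(n_recv)
        for j in range(n_send)
    )
    return hits / (n_recv * n_send)

# Toy example: 2 lags, 2 receiving voxels, 3 sending voxels.
sig = [
    [[True, False, False], [False, False, False]],   # lag 1
    [[False, False, True], [False, False, False]],   # lag 2
]
f_order2 = fraction_significant(sig)        # 2 of 6 pairs
f_order1 = fraction_significant([sig[0]])   # reduces to the fraction of significant b values
```

With a single lag the function reduces to the fraction of significant b values, which matches the order-one definition of f described in the text.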

In conclusion, our work suggests that LASSO-GC is an effective method for measuring connectivity between fMRI BOLD ROIs in a voxel-based manner. It indicates that the f and the W summary statistics reveal different aspects of directed influence between ROIs. Used in tandem, these statistics may provide consistent information about influences between different brain regions that is richer than either one alone. Additional summary statistics will likely be found in the future that will further our understanding of directed influences between brain regions.

CHAPTER 4. RETINOTOPICALLY ORIENTED TOP-DOWN MODULATION OF THE VISUAL CORTEX DURING VISUAL SPATIAL ATTENTION

Introduction

A well-known effect of visual spatial attention is a change in neural activity in the Visual Occipital Cortex (VOC) before the target appears in the visual field (Kastner et al., 1999; Reynolds and Chelazzi, 2004). The change is an enhancement or a suppression, depending on the retinotopic representation of the location in space to which attention was directed (Golomb et al., 2008). In the absence of target stimuli, such a change is proposed to result from top-down modulation by areas responsible for attentional control, as a means of preparing the VOC to process incoming information (Buschman and Miller, 2007).

Two candidate regions for generating top-down attentional control have been intensively studied. The Frontal Eye Field (FEF) and the IntraParietal Sulcus (IPS) are consistently found to be active during goal-directed behaviors, and damage to them causes neglect of portions of the contralateral visual hemifield (Mesulam, 1981; Corbetta et al., 2005; He et al., 2007). To look for more direct evidence of top-down modulation from FEF and IPS to VOC, our previous work (Bressler et al., 2008) used an analytical tool named Granger Causality (GC) to measure the directional interdependency between these areas in a functional Magnetic Resonance Imaging (fMRI) study in which subjects performed a Posner-like covert attention task. We found greater GC influence from FEF and IPS to VOC than in the opposite direction at the time of stimulus onset, which supports the attentional modulation theory. However, the specificity of such modulation remains unknown: to what extent does it distinguish visual-field locations in their retinotopically represented areas, and, if such specificity exists, during what time period of the task is it present? Investigating this question would help settle whether the GC influence really measures attentional control. The answer would be yes if the influence shows retinotopic specificity and is time-locked to the maintenance of attention. Alternatively, if there is no retinotopic specificity, GC might instead be measuring modulation that produces a general baseline shift in VOC activity, and if the modulation is not locked to a specific time period, it might be a tonic effect of the task.

In a follow-up study, we extended the GC analysis to all of the time points before stimulus onset and examined in more detail how top-down GC strength was distributed among different visual regions. In the test condition, i.e. with GC computed from FEF and IPS to VOC, we observed retinotopic specificity in the modulation pattern such that top-down GC to the ventral portion of VOC, representing the target locations in the upper visual hemifield, was significantly greater than that to the dorsal portion, representing mirror locations of the targets in the lower visual hemifield. The test results were compared with two control conditions: (1) GC computed during the preparatory period in the same manner as in the test condition, but with region I (the frontoparietal network) replaced by region III, defined as randomly sampled ROIs in the cortex outside regions I and II; and (2) GC computed from region I to region II with disturbed trial order. The retinotopic specificity observed in the test condition was not found in either control condition. A test for finer-grained retinotopic specificity yielded a null result: the GC pattern does not distinguish the cued target location from the uncued target location across the vertical meridian.

To test whether the GC patterns were time-locked, we also extended the analysis to all of the time points after stimulus onset (post-target period), and compared the modulation patterns from the pre- and post-target periods. The retinotopic specificity observed in the pre-target period was not significant in the post-target period. In addition, we observed temporal invariance of the directional asymmetry such that the top-down-greater-than-bottom-up GC pattern was maintained for the entire trial.

Our analysis suggests that GC-measured top-down modulation during the preparatory period effectively reflects attentional control. It is specifically directed to the neural representations of targets in the attended visual hemifield, and such retinotopic specificity is locked to the time period in which attention is shifted and maintained. The attentional control might be carried solely by the top-down GC rather than by the bottom-up GC or the difference between top-down and bottom-up GC. That top-down GC does not distinguish the specific cued target location across the vertical meridian suggests that the modulation acts on the neural representations of both targets. Whether it facilitates processing at one site and suppresses it at the other is yet to be tested. These findings may help elucidate in detail the role of the dorsal attention network in visual spatial attention.

Materials and Methods

Task and Data Acquisition. This is a follow-up study of the data acquired with the task from Chapter 2. The entire trial period, comprising 13 time points, was included in the analysis.

Region of interest creation. Regions Of Interest (ROIs) outside retinotopic cortical areas and ROIs inside the ventral portion of the retinotopic areas were created in the same way as described in Chapter 2. In addition, we included in the analysis ROIs inside the dorsal portion of the retinotopic areas, created by taking conjunctions of voxels representing the mirrored locations of the targets across the horizontal meridian (in the lower visual hemifield) with voxels in the retinotopic areas. The location-representing voxels were identified using the same localizer scans as described in Chapter 2. Retinotopic areas were the same as those in Chapter 2. Thus a total of four locations in the visual field were represented by the ROIs created from ventral and dorsal portions of the retinotopic areas: two target locations (represented by ROIs from V1v, V2v, VP and V3A) and two mirrored locations across the horizontal meridian (represented by ROIs from V1d, V2d, V3 and V3Ad).

BOLD time series preprocessing. Outlier voxels were rejected, based on their BOLD amplitude variability over trials, using Tukey's boxplot technique. BOLD time series were z-normalized to have zero mean and unit variance.
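The two preprocessing steps can be sketched as below. This is a minimal illustration, not the exact procedure used in the study: the fence multiplier k = 1.5 is the conventional boxplot value and is assumed here, as are the quartile interpolation method, the use of the population standard deviation, and the function names `tukey_keep` and `zscore`. The variability values are hypothetical.

```python
from statistics import mean, pstdev, quantiles

def tukey_keep(values, k=1.5):
    """Return indices of values inside the boxplot fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if lo <= v <= hi]

def zscore(series):
    """Rescale a series to zero mean and unit (population) variance."""
    m, s = mean(series), pstdev(series)
    return [(x - m) / s for x in series]

# Hypothetical per-voxel variability over trials; the last voxel is an outlier.
variability = [1.0, 1.1, 0.9, 1.2, 1.05, 5.0]
kept = tukey_keep(variability)      # indices of voxels retained

# Hypothetical BOLD time series for one retained voxel.
z = zscore([1.0, 2.0, 3.0, 4.0])
```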

Granger Causality Analysis. Definition and testing of Granger Causality (GC) followed the LASSO-GC method developed in Chapter 3. For each ROI pair, a Multivariate Vector AutoRegressive (MVAR) model was fitted to the preprocessed BOLD data with all the voxels within the two ROIs as model variables, after which the W summary statistic was calculated from the model coefficients as the measure of GC strength. The model parameter estimation used a trial-based method such that measures of a voxel’s BOLD activity at the same time point from each trial were treated as repeated observations for that voxel at that time point.

Random Selection of Regions outside FEF, IPS, and VOC. To do the sampling, we first divided the 48 × 64 × 48 voxel space which stores the gray-matter volume into 4 × 4 × 4 subspaces. We then “circled out” FEF, IPS and VOC manually on a flattened cortical surface in Caret and projected the resulting surface into volumetric space, where it served as a mask on the original voxel space. A subspace was then randomly selected to create a random ROI. If fewer than 50 gray-matter voxels were in that subspace, we increased one randomly chosen side of the cube, i.e. let the subspace grow into the size of 5 × 4 × 4, 4 × 5 × 4 or 4 × 4 × 5 voxels. By iteratively adjusting the size, we could make the enclosed voxel number in the subspace greater than 50 and lower than 100, in order to match the 50–100 range seen for FEF/IPS voxel numbers. Fig. 4.1 shows the resulting 12 randomly selected ROIs in each hemisphere for a representative subject.

Figure 4.1: An example from one subject as illustration of the locations of randomly sampled ROIs outside FEF, IPS and VOC. The ROIs are shown on the flattened cortical surface created in Caret.
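The sampling procedure described above can be sketched roughly as follows. This is a simplified illustration under several assumptions: the gray-matter volume and the FEF/IPS/VOC mask are folded into a single hypothetical predicate `is_eligible`, the toy mask used in the example is synthetic, and the handling of overshoot and of cubes clipped at the volume boundary (return `None` and let the caller draw again) is a choice made here, not one stated in the text.

```python
import random

def grow_random_roi(is_eligible, shape, rng, lo=50, hi=100, start=4, max_steps=50):
    """Pick a random start-sized cube on the grid and lengthen one randomly
    chosen side at a time until it holds more than lo and fewer than hi
    eligible voxels. Returns the voxel list, or None if that range is missed."""
    origin = [rng.randrange(dim // start) * start for dim in shape]
    size = [start, start, start]
    for _ in range(max_steps):
        voxels = [
            (x, y, z)
            for x in range(origin[0], min(origin[0] + size[0], shape[0]))
            for y in range(origin[1], min(origin[1] + size[1], shape[1]))
            for z in range(origin[2], min(origin[2] + size[2], shape[2]))
            if is_eligible(x, y, z)
        ]
        if lo < len(voxels) < hi:
            return voxels
        if len(voxels) >= hi:
            return None                      # overshoot: caller draws another subspace
        size[rng.randrange(3)] += 1          # grow one randomly chosen side by one voxel
    return None

# Synthetic stand-in for "gray matter outside the FEF/IPS/VOC mask".
def toy_mask(x, y, z):
    return (x + y + z) % 2 == 0

rng = random.Random(0)
roi = None
for _ in range(100):                         # redraw until a valid ROI is found
    roi = grow_random_roi(toy_mask, (48, 64, 48), rng)
    if roi is not None:
        break
```

In practice the predicate would look up a gray-matter volume and the projected surface mask, and the draw would be repeated until the desired number of ROIs per hemisphere is reached.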

Trial-Order for Control Condition 2. Trial order was randomly disturbed separately for each time point, such that observations from one trial might be regressed on observations at a previous time point from another trial.
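A minimal sketch of this disturbance is given below, assuming the data for one voxel are stored as `data[trial][time_point]`; that layout and the function name `shuffle_trials_per_timepoint` are hypothetical. The trial axis is permuted independently at each time point, so the values at consecutive time points in a given row generally come from different trials.

```python
import random

def shuffle_trials_per_timepoint(data, rng):
    """Return a copy of data[trial][t] with trials permuted independently at each t."""
    n_trials, n_time = len(data), len(data[0])
    out = [[None] * n_time for _ in range(n_trials)]
    for t in range(n_time):
        order = list(range(n_trials))
        rng.shuffle(order)                   # a fresh permutation for this time point
        for new_row, old_row in enumerate(order):
            out[new_row][t] = data[old_row][t]
    return out

# Toy data: each entry records (trial, time point) so the mixing is visible.
rng = random.Random(1)
data = [[(trial, t) for t in range(5)] for trial in range(4)]
shuffled = shuffle_trials_per_timepoint(data, rng)
```

Each time point keeps the same set of observations, so marginal distributions are preserved while the within-trial temporal pairing used by the autoregressive model is broken.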

Results

We performed a group analysis on the mean values of LASSO-GC W summary statistics from each subject. For all conditions, in each of 6 subjects there was one W summary statistic for each combination of the data dimensions, i.e. 5 (time points) × 2 (cue conditions) × 2 (types of performance) × 3 (higher-level ROIs) × 4 (lower-level ROIs) × 2 (hemispheres) × 2 (portions of VOC) × 2 (directions) = 1120 scores in total for the pre-target test condition and control conditions 1 and 2 (fewer scores in 2 of the 6 subjects who had missing values for left FEF and left aIPS), and 7 (time points) × 2 (cue conditions) × 2 (types of performance) × 3 (higher-level ROIs) × 4 (lower-level ROIs) × 2 (hemispheres) × 2 (portions of VOC) × 2 (directions) = 1568 scores in total for the post-target test condition. For all conditions, we computed a mean measure for each subject, separately in the top-down and bottom-up direction and for the ventral and dorsal portions of VOC, by averaging the W summary statistic over all the other dimensions. Then we performed paired-sample t-tests to compare the mean scores, using the Bonferroni method for multiple comparisons correction.
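The Bonferroni adjustment multiplies each uncorrected p-value by the number of comparisons in the family and caps the result at 1. The sketch below applies it to the four uncorrected p-values reported for the pre-target test condition in Table 4.1; treating those four comparisons as one family is an assumption suggested by the ratio of corrected to uncorrected values in that table, and the function name `bonferroni` is hypothetical.

```python
def bonferroni(p_values):
    """Bonferroni-adjust a family of p-values (multiply by family size, cap at 1)."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Uncorrected p-values: TDv–TDd, BUv–BUd, TDv–BUv, TDd–BUd (pre-target, Table 4.1).
adjusted = bonferroni([0.00232, 0.288, 0.000855, 0.00676])
```

The adjusted values agree with the p (Bonf.) column of Table 4.1 up to rounding of the reported uncorrected p-values.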

The results are summarized in Fig. 4.2 and Table 4.1. In the pre-target test condition, top-down GC from FEF and IPS to VOC is highly significantly (p < 0.01) greater to the ventral portion than to the dorsal portion of VOC. No significant ventral/dorsal difference was found for GC from region III to VOC (control condition 1), for GC from FEF and IPS to VOC with disturbed trial order (control condition 2), or for the post-target test condition. For both the pre- and post-target test conditions, top-down GC from FEF and IPS to VOC is significantly greater (p < 0.01 for the ventral portion and p < 0.05 for the dorsal portion) than bottom-up GC from VOC to FEF and IPS. Such directional asymmetry was not found to be significant in the two control conditions.

Figure 4.2: Mean over 6 subjects of the summary scores with standard error plotted, separately for top-down and bottom-up GC with each portion of VOC. (*: p < 0.05; **: p < 0.01).

We examined the within-subject effects of time, retinotopy and direction in the pre-target test condition and control conditions 1 and 2 with the higher-level ROIs collapsed to keep the number of measurements balanced for all the subjects. We then used the 5 (time points) × 2 (cue conditions) × 2 (types of performance) × 4 (lower-level ROIs) × 2 (hemispheres) = 160 scores as observations for each subject, and performed a repeated-measures 3-way (time × retinotopy × direction) ANOVA for each condition. In the pre-target test condition (Fig. 4.3 and Table 4.2), we found significant main effects of retinotopy (p < 0.05) and direction (p < 0.005), and a significant interaction between retinotopy and direction (p < 0.05). For either control condition 1 or 2, no significant main effect or interaction was found. The results are summarized in Figures 4.4 and 4.5, and Tables 4.3 and 4.4.

Pre-Target Test Condition
Comparison   t        df   p-val.     p (Bonf.)
TDv–TDd      5.70     5    0.00232    0.00927    **
BUv–BUd      -1.19    5    0.288      1.00
TDv–BUv      7.11     5    0.000855   0.00342    **
TDd–BUd      4.44     5    0.00676    0.0270     *

Control Condition 1
Comparison   t        df   p-val.     p (Bonf.)
TDv–TDd      1.60     5    0.0392     0.157
BUv–BUd      -1.47    5    0.942      1.00
TDv–BUv      2.45     5    0.106      0.424
TDd–BUd      0.814    5    0.179      0.716

Control Condition 2
Comparison   t        df   p-val.     p (Bonf.)
TDv–TDd      -0.298   5    0.777      1.00
BUv–BUd      0.487    5    0.647      1.00
TDv–BUv      -1.36    5    0.232      0.928
TDd–BUd      -0.242   5    0.818      1.00

Post-Target Test Condition
Comparison   t        df   p-val.     p (Bonf.)
TDv–TDd      1.85     5    0.124      0.496
BUv–BUd      -0.355   5    0.737      1.00
TDv–BUv      6.53     5    0.00126    0.00504    **
TDd–BUd      5.42     5    0.00290    0.0116     *

Table 4.1: Paired-sample t tests for the difference between groups of summary GC scores (TDv/TDd: top-down GC to the ventral/dorsal portion of VOC; BUv/BUd: bottom-up GC from the ventral/dorsal portion of VOC; p (Bonf.): Bonferroni corrected for multiple comparisons). Asterisks mark the significance level, *: p < 0.05; **: p < 0.01.

Effect             Df   Sum Sq   Mean Sq   F value   Pr(>F)
Time               4    24.22    6.056     1.725     0.1839
  Residuals        20   70.20    3.510
Retino             1    12.35    12.35     6.950     0.0462   *
  Residuals        5    1.179    0.2358
Dir                1    344.7    344.7     28.96     0.0030   **
  Residuals        5    59.51    11.90
Time:Retino        4    5.804    1.451     1.706     0.1882
  Residuals        20   17.01    0.8506
Time:Dir           4    9.346    2.336     1.117     0.3763
  Residuals        20   41.86    2.092
Retino:Dir         1    15.031   15.031    6.732     0.0486   *
  Residuals        5    11.16    2.233
Time:Retino:Dir    4    2.713    0.6783    0.6293    0.6472
  Residuals        20   21.56    1.078

Table 4.2: Repeated-Measures ANOVA for retinotopy effect (pre-target test condition) (*: p < 0.05; **: p < 0.005).

We also examined the within-subject effects of time, retinotopy and direction in the post-target test condition with the higher-level ROIs collapsed. We then used the 7 (time points) × 2 (cue conditions) × 2 (types of performance) × 4 (lower-level ROIs) × 2 (hemispheres) = 224 scores as observations for each subject, and performed a repeated-measures 3-way (time × retinotopy × direction) ANOVA. In the post-target test condition (Fig. 4.6 and Table 4.5), we found a significant main effect of direction (p < 0.005), and a significant interaction between time and direction (p < 0.005). Fig. 4.7 summarizes the results with time plots for the whole trial, separated by retinotopy and direction and averaged across 6 subjects.

Figure 4.3: Time plots showing the retinotopy × direction interaction in the pre-target test condition. (A) For GC involving the ventral portion of VOC, top-down GC is greater than bottom-up GC at all time points except for one time point in a single subject. (B) For GC involving the dorsal portion of VOC, top-down GC is greater than bottom-up GC at most of the time points in each subject, but the separation between the two time courses is smaller than that in A. (C) For the top-down direction, GC is positive, and is greater in magnitude to the ventral portion of VOC than to the dorsal portion of VOC, at most of the time points for each subject. At the last time point, top-down GC to ventral VOC is consistently greater in magnitude than to dorsal VOC for all subjects. (D) No consistent difference was found across subjects in the bottom-up direction. GC is both positive and negative, and is more negative from the ventral portion of VOC than from the dorsal portion at the last time point in each subject.

Figure 4.4: Time plots showing the retinotopy × direction interaction in control condition 1. (A) For GC with the ventral portion of VOC, top-down GC is greater than bottom-up GC at all time points for 3 out of 6 subjects. (B) For GC with the dorsal portion of VOC, top-down GC is greater than bottom-up GC at most of the time points in 4 out of 6 subjects. The separation of the two time courses for both top-down and bottom-up GC is smaller than that in the test condition. (C) For the top-down direction, GC to the ventral portion of VOC is greater than to the dorsal portion of VOC at most of the time points in 3 out of 6 subjects. (D) For the bottom-up direction, GC from the dorsal portion of VOC is more positive than GC from the ventral portion of VOC at all time points for 1 out of 6 subjects. In comparison to the test condition, the control condition 1 result shows less separation between the time courses being compared, and the patterns of such separation are less consistent across subjects.

Effect             Df   Sum Sq   Mean Sq   F value   Pr(>F)
Time               4    5.903    1.476     1.761     0.1764
  Residuals        20   16.76    0.8380
Retino             1    1.364    1.364     0.8080    0.4099
  Residuals        5    8.444    1.689
Dir                1    20.16    20.16     5.387     0.0680
  Residuals        5    18.71    3.742
Time:Retino        4    1.821    0.4553    0.7917    0.5443
  Residuals        20   11.50    0.5751
Time:Dir           4    2.614    0.6536    0.4788    0.7510
  Residuals        20   27.30    1.365
Retino:Dir         1    0.0579   0.0579    0.0584    0.8186
  Residuals        5    4.955    0.9909
Time:Retino:Dir    4    4.012    1.003     1.562     0.2229
  Residuals        20   12.84    0.6420

Table 4.3: Repeated measures ANOVA for retinotopy effect (control condition 1).

Effect             Df   Sum Sq   Mean Sq   F value   Pr(>F)
Time               4    16.33    4.082     0.9669    0.4473
  Residuals        20   84.44    4.222
Retino             1    0.0185   0.0185    0.0027    0.9602
  Residuals        5    33.70    6.741
Dir                1    5.370    5.370     1.188     0.3254
  Residuals        5    22.59    4.518
Time:Retino        4    16.18    4.044     2.196     0.1062
  Residuals        20   36.83    1.841
Time:Dir           4    9.358    2.340     0.7437    0.5735
  Residuals        20   62.91    3.146
Retino:Dir         1    2.733    2.733     0.9090    0.3842
  Residuals        5    15.03    3.007
Time:Retino:Dir    4    11.40    2.850     0.9860    0.4376
  Residuals        20   57.80    2.890

Table 4.4: Repeated measures ANOVA for retinotopy effect (control condition 2).

Figure 4.5: Time plots showing the retinotopy × direction interaction in control condition 2. (A) With the ventral portion of VOC, top-down GC has a random relation to bottom-up GC. (B) With the dorsal portion of VOC, top-down GC also has a random relation to bottom-up GC. (C) For the top-down direction, GC to the ventral portion of VOC has a random relation with GC to the dorsal portion of VOC. (D) For the bottom-up direction, GC from the ventral portion of VOC also has a random relation with GC from the dorsal portion of VOC. In comparison to the test condition, control condition 2 shows no particular pattern in the time courses being compared.

Figure 4.6: Time plots showing the retinotopy × direction interaction in the post-target test condition. For directional asymmetry, top-down GC is greater than bottom-up GC at all time points except for one time point in a single subject, for results involving both the ventral and the dorsal portion of VOC (A and B). The magnitudes of GC scores involving the ventral and dorsal portions of VOC are at similar levels, for both the top-down and bottom-up directions (C and D).

Figure 4.7: Summary time plot for the test condition, separated by retinotopy and direction and averaged across 6 subjects with standard error on top.

Effect             Df   Sum Sq    Mean Sq   F value   Pr(>F)
Time               6    25.65     4.275     0.7727    0.5975
  Residuals        30   166.0     5.533
Retino             1    2.036     2.036     0.3356    0.5875
  Residuals        5    30.3345   6.067
Dir                1    906.9     906.9     41.96     0.0013   **
  Residuals        5    108.1     21.61
Time:Retino        6    6.107     1.018     1.464     0.2242
  Residuals        30   20.86     0.6954
Time:Dir           6    73.69     12.28     4.123     0.0039   **
  Residuals        30   89.37     2.979
Retino:Dir         1    7.733     7.733     2.037     0.2128
  Residuals        5    18.98     3.796
Time:Retino:Dir    6    1.512     0.2521    0.1790    0.9805
  Residuals        30   42.24     1.408

Table 4.5: Repeated-Measures ANOVA for retinotopy effect (post-target test condition) (*: p < 0.05; **: p < 0.005).

Finally, we tested for a hemisphere-specific effect of the cue in the pre-target period. We examined the within-subject effects of time, cue, hemisphere and direction on the LASSO-GC W summary statistics from the pre-target test condition with the higher-level ROIs collapsed to keep the number of measurements balanced for all the subjects. We then used the 5 (time points) × 2 (cue conditions) × 2 (types of performance) × 4 (lower-level ROIs) × 2 (hemispheres) = 160 scores as observations for each subject, and performed a repeated-measures 4-way (time × cue × hemisphere × direction) ANOVA. No significance was found for the main effect of time or its interaction with other factors, nor for the interaction between cue and hemisphere.

Effect               Df   Sum Sq   Mean Sq   F value   Pr(>F)
Time                 4    24.22    6.056     1.725     0.1839
  Residuals          20   70.20    3.510
Cue                  1    0.1236   0.1236    0.5241    0.5015
  Residuals          5    1.179    0.2358
Hemi                 1    11.02    11.02     1.385     0.2923
  Residuals          5    39.78    7.956
Dir                  1    344.7    344.7     28.96     0.0030   **
  Residuals          5    59.51    11.90
Time:Cue             4    23.92    5.980     3.471     0.0262   *
  Residuals          20   34.46    1.723
Time:Hemi            4    8.430    2.107     1.300     0.3038
  Residuals          20   32.42    1.621
Cue:Hemi             1    0.2477   0.2477    0.5481    0.4924
  Residuals          5    2.260    0.4519
Time:Dir             4    9.346    2.336     1.117     0.3763
  Residuals          20   41.86    2.092
Cue:Dir              1    0.6336   0.6336    0.2043    0.6702
  Residuals          5    15.51    3.101
Hemi:Dir             1    0.3210   0.3206    0.0202    0.8926
  Residuals          5    79.48    15.90
Time:Cue:Hemi        4    15.95    3.987     1.703     0.1887
  Residuals          20   46.81    2.340
Time:Cue:Dir         4    3.544    0.8861    0.2604    0.8998
  Residuals          20   68.06    3.403
Time:Hemi:Dir        4    3.372    0.8430    0.3443    0.8448
  Residuals          20   48.96    2.448
Cue:Hemi:Dir         1    0.0602   0.0602    0.0275    0.8747
  Residuals          5    10.94    2.188
Time:Cue:Hemi:Dir    4    10.47    2.618     1.182     0.3488
  Residuals          20   44.29    2.214

Table 4.6: Repeated-Measures ANOVA for cue effect (pre-target test condition).

Figure 4.8: Mean of the summary scores over subjects, showing non-significant cue × hemisphere interaction in the pre-target test condition, separately at each time point.

Discussion

The results presented in this report suggest that the dorsal frontoparietal attention network anchored in FEF and IPS exerts retinotopically specific top-down influence on VOC during the pre-target period of the visual spatial attention task, in contrast to cortex outside the frontoparietal network, and to the post-target period following stimulus onset. Direction was a significant main effect for both pre- and post-target conditions but not for either of the control conditions, and had a significant interaction with time during the post-target period of the test condition. The preparatory-cue direction did not significantly interact with hemisphere for the pre-target test condition. These findings may point to important properties of the attentional control carried by the frontoparietal network.

The first important result here is the significant direction effect found in the pre-target test condition. It verifies the top-down-bottom-up asymmetry observed in Chapter 2 with our new method developed in Chapter 3. The non-significant result for the direction effect with ROIs randomly selected outside FEF, IPS and VOC also matched the randomization result in Chapter 2, Fig. 2.3 (D). This consistency in results justified our use of the new methodology in this follow-up study.

However, there is an inconsistency between the two sets of results. In Chapter 2 GC was analyzed only at the stimulus-onset time, under the assumption that the attentional control is the strongest at that point. In this report we extended the GC analysis to all 13 time points in the trial (stimulus-onset time being the 5th point), and Fig. 4.7 shows that the directional asymmetry was maintained throughout the whole trial. Interestingly, the separation of top-down and bottom-up GC became larger in the post-target period, suggesting that other cognitive processes might become involved after stimulus onset and that the directional asymmetry per se may not be enough to explain attentional events. More details about the GC patterns are needed to capture the attentional control signal.

Our finding of a significant retinotopy × direction interaction is more closely related to the control signal, for the following reasons. First, there was a retinotopic asymmetry in the top-down direction such that GC to the ventral portion of VOC, representing target locations in the upper visual hemifield, was greater than GC to the dorsal portion of VOC, representing non-target locations mirrored across the horizontal meridian in the lower visual hemifield. This asymmetry was consistent with the task set-up, in which subjects always shifted attention upward. It makes sense that top-down modulation was stronger toward the neural representations of the attended visual space. Second, this retinotopic asymmetry was found only in the top-down GC. By definition, GC measures the directional influence from one temporally varying signal to another, in some sense a type of “driving”. It is then reasonable to think of the top-down GC as reflecting a mechanism by which the frontoparietal network drives VOC. That the retinotopic asymmetry occurred only in this direction, meaning that FEF and IPS as drivers determine where the modulation is sent, fits the assumption that FEF and IPS are the control regions. Third, the result with randomly sampled regions outside FEF and IPS provided another piece of evidence that the frontoparietal network originates attentional control signals. It is worth noting that the direction effect in control condition 1 is close to significance, indicating that there might be other regions driving VOC. However, the large non-significant p-value for the retinotopy × direction interaction dissociates it from attention. Last, the retinotopy × direction interaction was significant only for the pre-target period, suggesting that the effect was time-locked to the attention shift. This rules out the possibility that the top-down GC reflects a tonic task-related effect that boosts the level of arousal rather than a sensitive reflection of attentional events. As stated above, this doubt cannot be removed by merely looking at the directional asymmetry.

Given that we have found an appropriate measure of top-down control, we then asked what properties of such control can be uncovered from it. The most difficult phenomenon to explain might be the lack of a cue × hemisphere interaction in the pre-target period. The rationale is this: since visual information about objects distributed across the vertical meridian of the visual field is processed by VOC in the contralateral hemisphere, if only the neural representations of objects in the attentional “spotlight” were modulated, then one would expect top-down GC to be stronger to the hemisphere contralateral to the cued target location than to the other hemisphere representing the mirrored target location, i.e. a significant cue × hemisphere interaction should be found. However, this was not the case. The fact that top-down GC was evenly directed to both hemispheres suggests that the modulation acted on both attended and unattended representations. A possible mechanism could be that the modulation facilitates the attended representation while suppressing the unattended representation. Further analysis is needed to test this hypothesis, probably by relating the GC strength to BOLD activity in the visual areas representing the two target locations, to see whether GC of similar strength can result in different BOLD changes.

Another observed property of the top-down control was that the retinotopic specificity was maintained throughout the pre-target period, indicating that the modulation might be triggered by the cue and persist until the stimuli occur. However, the earliest time point available in a trial coincides with the preparatory cue, so this argument is not convincing without supporting results from data prior to cue onset. Because the trials used in this study were not recorded end to end, the last time points of a trial cannot be treated as the pre-cue period of the next trial. A follow-up study might look into the original scans for the pre-cue BOLD data and apply the same analysis to see whether the top-down GC exhibits a different pattern there.

The GC time series in Fig. 4.7 have a subtle characteristic: the separation between top-down GC to the ventral and dorsal portions of VOC is largest at time points 2 and 4. Interestingly, GC in Fig. 4.8 exhibits patterns that look like a cue × hemisphere interaction, also at time points 2 and 4. Whether this is a mere coincidence or in fact reveals some information about “spotlight”-like attentional control is worth looking into. It might be the case that the cue × hemisphere interaction occurred only at certain time points and was diluted by the null effect at other time points. To address this question, a possible analysis would be to take the difference between top-down GC to ventral VOC and that to dorsal VOC, and then test the correlation between this difference score and the F-value of the cue × hemisphere interaction across time points. A significant correlation might lead to further study of an acute control signal that shifts attention onto the exact cued location.
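The proposed follow-up analysis could be sketched as below. This is only an illustration under stated assumptions: the function names are hypothetical, Pearson correlation is one possible choice of correlation measure, and the numbers are made-up placeholders for five pre-target time points, not results from this study.

```python
import math


def difference_scores(gc_ventral, gc_dorsal):
    """Per-time-point difference: top-down GC to ventral VOC minus dorsal VOC."""
    if len(gc_ventral) != len(gc_dorsal):
        raise ValueError("both series must cover the same time points")
    return [v - d for v, d in zip(gc_ventral, gc_dorsal)]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Placeholder values only (hypothetical, NOT data from this study).
gc_ventral = [1.0, 1.8, 1.2, 1.9, 1.1]   # top-down GC to ventral VOC
gc_dorsal = [0.9, 1.0, 1.0, 1.1, 1.0]    # top-down GC to dorsal VOC
f_values = [0.2, 1.5, 0.4, 1.7, 0.3]     # cue x hemisphere F-value per time point

diff = difference_scores(gc_ventral, gc_dorsal)
r = pearson_r(diff, f_values)
print(f"correlation across time points: r = {r:.3f}")
```

With only a handful of time points such a correlation has little statistical power, so it would serve as a pointer toward further study rather than as a test in its own right.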

Finally, we would like to share a speculation based on this study about the possibly different roles that top-down and bottom-up GC play. In the post-target test condition, we found a significant time × direction interaction that was absent in the pre-target test condition. According to Fig. 4.7, this interaction might result from the “U”-shaped bottom-up GC, which makes the difference between top-down and bottom-up GC vary across time. Thus it seems that top-down GC is more sensitive to pre-target events and bottom-up GC more reactive after stimulus presentation. We speculate that top-down GC reflects endogenous goal-directed mental activity, while bottom-up GC is more related to lower-level processing of exogenous information from the outer world.

CHAPTER 5. DISCUSSION AND CONCLUSIONS

In this work we applied Granger causality analysis to fMRI BOLD data collected from a Posner-like task to explore visual spatial attention. In Chapter 2 we described our first step in seeking a GC measure for attentional control, where we found a significant directional effect such that top-down GC from FEF and IPS to VOC is greater than bottom-up GC from VOC to FEF and IPS at the time point of stimulus onset. After the methodology development documented in Chapter 3, we looked for more details of the control mechanism. Several important findings were reported in Chapter 4. First, the directional asymmetry found in Chapter 2 was maintained for all time points in the trial, suggesting that a more acute measure was needed to capture the attention-shift event in the task. We then found a retinotopy effect in the top-down direction such that GC from FEF and IPS to the ventral portion of VOC, representing target locations in the upper visual hemifield, was greater than to the dorsal portion of VOC, representing non-target locations mirrored across the horizontal meridian in the lower visual hemifield. This retinotopic specificity was restricted to the frontoparietal cortex and time-locked to the pre-target period. Together with prior knowledge of the attention-shift events in the task and the ties between FEF and IPS function and attention, these results suggest that top-down GC measured with the W summary statistic was an acute reflection of attentional control.

We have shown the link between the GC measure and the control mechanism of visual spatial attention in terms of the correspondence between GC results and behavioral events. We have not yet provided a causal explanation for attention. These results per se are not sufficient for that purpose, since they were based on hemodynamics, a metabolic measure rather than direct neuronal activity. We now turn to neuronal-level studies carried out by others, guided by cognitive theories, to attempt a deeper understanding of the implications of our findings.

Relate Findings to Theory: A Biased-Competition Model

An attention theory that has been well supported by neuronal-level evidence is the biased-competition model. As introduced at the beginning of this manuscript, it treats top-down attention as one of the possible mechanisms for resolving competition among multiple neural representations. It is hypothesized that a top-down biasing signal can facilitate the processing of objects at the attended location in two different ways: filtering unwanted information or increasing baseline activity (Beck and Kastner, 2009). In our scenario, because attention was demanded before stimulus onset, the first possibility is ruled out. Instead, our findings fit better with the baseline-increase mechanism.

A first related observation was that an activity increase did occur in the part of VOC representing target locations (Sylvester et al., 2007). The fact that it occurred before stimulus onset indicates that the bias came from an endogenous signal. Unlike studies that focus only within VOC to see how the competition is resolved, our analysis with the GC measure further tells where the biasing signal comes from. The directional asymmetry between top-down and bottom-up GC confirms that FEF and IPS were the sources that drive the activity in VOC. Most importantly, given the retinotopic specificity of top-down GC, the activity pattern within VOC matches the influence pattern exerted by the frontoparietal regions. On the other hand, we have demonstrated that the retinotopic specificity was associated with attention shifting in the task, relating FEF and IPS to goal-directed control functions. Thus the big picture becomes clear: FEF and IPS direct control signals to the targets’ neural representation in the visual cortex and tune the neuronal activity so that it is ready for processing the incoming visual signals. Doing so facilitates perception at the attended spot, and the facilitation constitutes our conscious feeling of attention.

This mechanism is straightforward and intuitive. However, it is not the end of the story. The real problem has not been solved, namely how FEF and IPS “know” where to send the signal. It is not unusual for personifying verbs to be used in describing a brain region’s role in certain cognitive functions, as we just did above. However, this is unhelpful for understanding those functions, because one cannot get rid of the cognitive agency by putting another agent into its own components. A mechanism should explain the subjective feeling of a cognitive function by describing objectively observable phenomena. Thus to say that FEF and IPS “direct” control signals to the designated location in VOC does not explain what determines the way those signals are sent out. We therefore look for a mechanism of retinotopic control that does not presume agency.

Spatiotopic Maps: A Control Mechanism without Agency

Interestingly, our results seem to fit well with a less well-known part of biased competition theory. Near the end of their review article, Beck and Kastner (2009) mentioned a less popular tenet of the theory that might be “least supported by empirical evidence”. Duncan (1996) proposed that the winner of the competition in one modality (e.g. in the visual cortex) might gain similar dominance in other modalities (e.g. in the frontal and parietal areas). For example, the target location favored by visual processing might also evoke enhanced activity in higher-level neural representations relative to non-favored locations. A rare piece of evidence comes from Everling et al. (2002; 2006). With recordings from monkey prefrontal cortex, they showed enhanced activity in neurons processing the attended objects but not in those processing unattended objects. This globally-biased-competition theory suggests that the same bias pattern would occur throughout the cortex.

Parallel studies of common spatiotopic maps in different modalities provide a possible mechanism for the globally-biased competition. Consistent coordinate systems for representing spatial relationships have been found in a wide range of cortical areas. Neurally coded spatial information in early visual areas is known to be arranged into retinotopic maps for visual-field representation (Woldorff et al., 2004; Worden et al., 2000). In the posterior parietal cortex, egocentric coordinates were found to intermediately represent retinotopic (Silver, 2005; Silver and Kastner, 2009; Saygin and Sereno, 2007) and somatotopic information (Binkofski et al., 1999; Buccino et al., 2001). In the prefrontal cortex, mototopic maps were found to represent spatial relationships in the effector system. These spatiotopic maps are proposed to serve as templates for translating neural codes between functionally differentiated regions and thus conveying information across modalities, for example, to translate spatial perceptual information into motor plans for grasping. If the brain tends to match differently arranged maps in order to maintain a unified global representation of space, this might explain why information at the location favored by one spatiotopic area may also gain preference in other spatiotopic areas.

The retinotopic control that we found has an important feature that supports the integration principle stated above. What we mean by “control” is the top-down GC influence, which is a measure of interdependency between two regions rather than a measure of activity within a single region. Thus the retinotopically specific control is indeed a retinotopically specific interdependency. This might be better explained by referring to the definition of our GC measure, the W summary statistic.

The W summary statistic is defined as the sum of all input b coefficients to each receiving voxel, divided by the number of receiving voxels. For top-down GC, because each b coefficient represents how well previous states of an FEF/IPS voxel help predict current states of a VOC voxel beyond the latter’s own past states, either of the following conditions will result in large W values: 1) a large number of FEF/IPS voxels help predict activity in VOC; 2) the coupling strength, i.e. the b values between FEF/IPS and VOC voxels, is very large. Therefore a strong top-down GC from a frontoparietal ROI to a visual ROI means that voxels in the visual region are coupled to a large percentage of the frontoparietal voxels (condition 1), or that the visual region is strongly coupled to the frontoparietal region (condition 2), or both. When saying that top-down modulation is retinotopically specific, we refer to the fact that top-down GC is greater to the ventral than to the dorsal portion of VOC, meaning that ventral VOC areas are coupled with a larger portion of FEF/IPS and/or with greater coupling strength. Either way it reflects a better matching of voxels between FEF/IPS and VOC.
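The definition of W given above can be sketched in a few lines. This is a minimal illustration, not the implementation used in this work: the function name `w_summary`, the matrix layout `b[i][j]` (row = receiving voxel, column = sending voxel), and the option to sum absolute values are all assumptions introduced here.

```python
def w_summary(b, use_abs=True):
    """Sum of all input coefficients divided by the number of receiving voxels.

    b[i][j] is the coefficient from sending voxel j to receiving voxel i.
    use_abs=True sums magnitudes; whether magnitudes or signed values are
    summed is an assumption of this sketch, not stated in the text.
    """
    n_receiving = len(b)
    if n_receiving == 0:
        raise ValueError("need at least one receiving voxel")
    total = sum((abs(c) if use_abs else c) for row in b for c in row)
    return total / n_receiving


# Two receiving (VOC) voxels, four sending (FEF/IPS) voxels; toy numbers only.
few_weak = [[0.1, 0.0, 0.0, 0.0],
            [0.0, 0.0, 0.0, 0.0]]
many_weak = [[0.1, 0.1, 0.1, 0.1],      # condition 1: many senders contribute
             [0.1, 0.1, 0.1, 0.1]]
few_strong = [[0.8, 0.0, 0.0, 0.0],     # condition 2: few but strong couplings
              [0.0, 0.0, 0.0, 0.0]]

for name, coeffs in [("few weak", few_weak),
                     ("many weak", many_weak),
                     ("few strong", few_strong)]:
    print(name, round(w_summary(coeffs), 3))
```

In this toy case the "many weak" and "few strong" matrices yield the same W, which illustrates why a large W alone cannot distinguish condition 1 from condition 2.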

It all makes sense when one remembers that the ROIs in FEF and IPS were defined by the activation difference in the preparatory-cue contrast. Because attention was always shifted to the upper visual hemifield during the task, only the voxels responsible for shifting upward would show a cue difference. Voxels corresponding to downward shifts would not be engaged by this task and thus would not exhibit a cue difference. As a result, the voxels selected for the FEF and IPS ROIs carried spatial representations only of the upper hemifield. According to the spatial integration theory, these voxels would be better coupled with voxels from ventral VOC, which represent the same hemifield, than with those from dorsal VOC, which represent a different hemifield.

While the retinotopic specificity of top-down GC supports the spatial integration theory, the latter serves as a good agency-free mechanism for attentional control. According to the spatial integration theory, spatiotopic information was maintained in all of the FEF, IPS and VOC areas. The attended-location information originally came from the auditory cue, after which it was represented in the higher-level areas and then matched onto the visual areas. In this way the biasing signal was also matched across modalities and eventually reached the visual regions to resolve the competition. In this mechanism, there is no need for an agent to decide where to send the signal, since the directing is accomplished through a matching process.

It is at first glance surprising that our findings link together two seemingly distant cognitive theories. However, after careful examination of the meaning of GC and how it fits the research context, such a link becomes a natural result.

Technical issues concerning GC application to fMRI

There have been debates about applying GC analysis directly to fMRI BOLD signals, centered on the concern that latency differences in the Hemodynamic Response Function (HRF) across brain regions might introduce spurious directionality into GC results (Friston, 2009). Within the scope of this work, HRF differences are not a concern for the validity of the conclusions drawn from the results. For the experimental BOLD data, we take advantage of the slow event-related design, which allows averaging over repeated trials to estimate the hemodynamic response. The hemodynamic response is analyzed to verify the dissociation between latency and the LASSO-GC results. For the simulated data, time series were generated directly from the MVAR model without HRF convolution, in order to represent BOLD residuals with the hemodynamic response removed in addition to conventional preprocessing (i.e. removal of machine-based and physiological noise, etc.). Regardless of the HRF latency issue, removing the hemodynamic response is essential for GC analysis because autoregression requires the time series to be stationary. This point is ignored by many current GC studies on fMRI, for both simulated and actual BOLD analysis. In the simulation, it would be pointless to first simulate realistic BOLD data by HRF convolution and then deconvolve to get back to the residuals. It might be interesting to remove the simulated hemodynamic response without HRF deconvolution and compare with the deconvolved result, but that would only complicate the problem and deviate from the main purpose of the current work. Therefore, we base our simulation on the MVAR model and consider it a sufficient representation of BOLD residuals for testing GC.
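To make the idea of simulating directly from an MVAR model concrete, the following sketch generates a two-variable, first-order process with no HRF convolution and then recovers the directed coupling from lagged regression. It is an assumption-laden toy: the coefficient matrix, the model order, the function names, and the use of plain least squares (standing in for the LASSO estimation used in this work) are all choices made here for illustration.

```python
import random


def simulate_var1(coef, n_steps, seed=0, burn_in=200):
    """Simulate x[t] = coef @ x[t-1] + e[t] with unit-variance Gaussian noise."""
    rng = random.Random(seed)
    k = len(coef)
    state = [0.0] * k
    samples = []
    for step in range(burn_in + n_steps):
        state = [
            sum(coef[i][j] * state[j] for j in range(k)) + rng.gauss(0.0, 1.0)
            for i in range(k)
        ]
        if step >= burn_in:          # discard the initial transient
            samples.append(state)
    return samples


def fit_lag1(samples, target):
    """Least-squares coefficients of samples[t][target] on both variables at t-1.

    Solves the 2x2 normal equations directly (two-variable case only).
    """
    s00 = s01 = s11 = r0 = r1 = 0.0
    for prev, cur in zip(samples, samples[1:]):
        p0, p1 = prev
        y = cur[target]
        s00 += p0 * p0
        s01 += p0 * p1
        s11 += p1 * p1
        r0 += p0 * y
        r1 += p1 * y
    det = s00 * s11 - s01 * s01
    return (r0 * s11 - r1 * s01) / det, (r1 * s00 - r0 * s01) / det


# Hypothetical coefficients: variable 0 drives variable 1, not the reverse.
coef = [[0.5, 0.0],
        [0.4, 0.5]]
series = simulate_var1(coef, n_steps=5000, seed=1)
print("inputs to variable 0:", fit_lag1(series, target=0))
print("inputs to variable 1:", fit_lag1(series, target=1))
```

Both eigenvalues of this coefficient matrix are 0.5, so the simulated process is stationary, which is the property the autoregression relies on.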

In actual data analysis, the stationary BOLD residuals may be obtained in different ways depending on the experimental design, with or without deconvolving the HRF from the data. The resulting BOLD residuals share the property of being autocorrelated stationary stochastic processes that can be modeled by autoregression. For our experimental datasets, the hemodynamic response for each voxel is measured by the mean signal over trials and subtracted from each trial to yield residual time series. Latency is defined as the time to peak, i.e. the time from the first sampled point to the point where the response reaches its maximum. Because the results have shown that there is no specific association between latency and GC patterns in our dataset, we consider it unnecessary and inappropriate to deconvolve the residuals with a presumed HRF.
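The residual and latency computations described above can be sketched for a single voxel as follows. The function names and the sampling interval `tr_seconds` are hypothetical placeholders, and the toy trials are not data from this study.

```python
def mean_response(trials):
    """Per-time-point mean over trials: the estimated hemodynamic response."""
    if not trials:
        raise ValueError("need at least one trial")
    length = len(trials[0])
    if any(len(trial) != length for trial in trials):
        raise ValueError("all trials must have the same number of time points")
    return [sum(trial[t] for trial in trials) / len(trials) for t in range(length)]


def residual_trials(trials):
    """Subtract the mean response from every trial to obtain residual series."""
    response = mean_response(trials)
    return [[x - m for x, m in zip(trial, response)] for trial in trials]


def time_to_peak(response, tr_seconds):
    """Latency: time from the first sampled point to the response maximum."""
    peak_index = max(range(len(response)), key=lambda t: response[t])
    return peak_index * tr_seconds


# Toy example: two trials, four time points, one voxel.
trials = [[0.0, 1.0, 3.0, 2.0],
          [0.0, 3.0, 5.0, 2.0]]
response = mean_response(trials)                 # [0.0, 2.0, 4.0, 2.0]
residuals = residual_trials(trials)
latency = time_to_peak(response, tr_seconds=2.0)  # tr_seconds is a placeholder
print(response, residuals, latency)
```

By construction the residuals average to zero across trials at every time point, so any remaining structure reflects trial-to-trial fluctuation rather than the evoked response.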

Along with the HRF debate, there is another often-raised issue concerning the effectiveness of Vector AutoRegressive (VAR) modeling of fMRI BOLD data, the so-called “downsampling problem” (Smith et al., 2011). As formally summarized in the review article by Valdes-Sosa et al. (2011), the issue is rooted in treating the fMRI BOLD signal as a subsample of the neuronal data. Under this assumption, one can obtain a VAR model for hemodynamics by integrating the linear approximation of a neuronal-level dynamic system over the time interval between BOLD samples. It has been shown that such integration changes the connectivity matrix into its exponential and thus distorts the connectivity pattern at the hemodynamic level. We did not specifically test this effect in our simulation. However, our empirical findings suggest that this reasoning is flawed. The examples that have been used to support the downsampling argument all had a small number of variables, resulting in a connectivity matrix of small dimension. In that case, if the connectivity matrix contains a certain number of zeros, they have a good chance of remaining zeros after the exponential transformation. But in the realistic situation in the brain, the connectivity pattern involves a much larger number of neuronal populations, meaning that the dimension of the connectivity matrix is in the hundreds or even thousands. By the definition of the matrix exponential, a matrix of that size would always change all of its zero elements into non-zeros after the exponential transformation, regardless of how sparse it is to begin with. This would indicate that any VAR model approximating the dynamics of a large group of neuronal populations would always yield full connectivity. However, converging evidence has been found that the brain is sparsely connected. Our experimental BOLD results, as one additional piece of evidence, have also shown that an MVAR model with around a hundred voxels yields sparse rather than full connectivity.
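The fill-in effect of the matrix exponential discussed above can be illustrated with a small numerical sketch. Everything here is an assumption made for illustration: the six-node toy matrices, the coupling value 0.5, the sampling interval being folded into the matrix, and the truncated Taylor series used to approximate the exponential. A ring in which every node is reachable from every other loses all of its zeros, whereas a strictly feed-forward chain, included for contrast, keeps the zeros in its upper triangle.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(a, terms=30):
    """Truncated Taylor series: exp(A) ~ sum over k < terms of A^k / k!."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        product = mat_mul(term, a)
        term = [[value / k for value in row] for row in product]   # A^k / k!
        result = [[r + t for r, t in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    return result


def count_zeros(m, tol=1e-12):
    """Number of entries whose magnitude is below tol."""
    return sum(1 for row in m for value in row if abs(value) < tol)


n = 6
# Ring: node i drives node (i + 1) mod n, so every node reaches every other.
ring = [[0.5 if i == (j + 1) % n else 0.0 for j in range(n)] for i in range(n)]
# Chain: same couplings without the wrap-around link (strictly feed-forward).
chain = [[0.5 if i == j + 1 else 0.0 for j in range(n)] for i in range(n)]

print("ring:  zeros before", count_zeros(ring), "after", count_zeros(mat_exp(ring)))
print("chain: zeros before", count_zeros(chain), "after", count_zeros(mat_exp(chain)))
```

The toy case suggests that how many zeros survive depends on the reachability structure of the connectivity graph as well as on its size.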

Summary and Conclusions

This work has confirmed the possibility of dissociating and analyzing the physiological correlates of attentional control. The control is carried out in large-scale cortical networks involving distantly connected frontal, parietal and visual occipital areas. In particular, FEF and IPS exert top-down modulation on the visual areas during attention shifts. The modulation signal is well captured by the GC measure in terms of a directional asymmetry such that top-down GC from FEF/IPS to VOC is greater than that in the opposite direction. The top-down rather than the bottom-up GC represents the retinotopic control, which is specifically directed to modulate the ventral portion of VOC for processing the upper visual field where stimuli occur. The retinotopic control is carried out only by the frontoparietal regions and not elsewhere in the brain, and it is maintained in the preparatory period but not after stimulus onset. The control signal does not distinguish left- from right-cued locations.

These findings support a biased-competition explanation of what the control signal is designed for. Its retinotopic specificity provides a direct connection between the biasing signal from higher-level control regions and the modulated activity within lower-level sensory regions. This explanation does not necessarily require an agent to decide where the signal should be sent. Rather, recent evidence of global spatiotopic maps found in functionally differentiated regions may shed light on a global-biased-competition theory in which the retinotopic control results from the matching between spatiotopic maps. This speculation is based on the definition of our GC measure, which may correspondingly serve as a mechanism for how the matching could be achieved. These observations suggest that LASSO-GC is a sophisticated measure that is more than a mere indication of temporal precedence, and that the way it was applied in our work is not subject to current criticism of GC-based connectivity analysis with fMRI.

Findings in this work may lead to further understanding of the frontoparietal network and its role in visual spatial attention, as well as a broader application of GC analysis to discover cognitive mechanisms inside the brain.

BIBLIOGRAPHY

Alpert G, Sun F, Handwerker D, D’Esposito M, Knight R (2007) Spatio-temporal information analysis of event-related BOLD responses. NeuroImage 34:1545–1561.

Anderson JS, Ferguson MA, Lopez-Larson M, Yurgelun-Todd D (2010) Topographic maps of multisensory attention. Proceedings of the National Academy of Sciences of the United States of America 107:20110–20114.

Armstrong KM, Chang MH, Moore T (2009) Selection and maintenance of spatial information by frontal eye field neurons. Journal of Neuroscience 29:15621–15629.

Arnold A, Liu Y, Abe N (2007) Temporal causal modeling with graphical Granger methods. In Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD ’07, p. 66, San Jose, California, USA.

Bashinski H, Bacharach V (1980) Enhancement of perceptual sensitivity as the result of selectively attending to spatial locations. Attention, Perception, & Psychophysics 28:241–248.

Beck DM, Kastner S (2009) Top-down and bottom-up mechanisms in biasing competition in the human brain. Vision Research 49:1154–1165.

Bernasconi C, König P (1999) On the directionality of cortical interactions studied by structural analysis of electrophysiological recordings. Biological Cybernetics 81:199–210.

Binkofski F, Buccino G, Posse S, Seitz RJ, Rizzolatti G, Freund HJ (1999) A fronto-parietal circuit for object manipulation in man: evidence from an fMRI-study. European Journal of Neuroscience 11:3276–3286.

Bressler SL, Menon V (2010) Large-scale brain networks in cognition: emerging methods and principles. Trends in Cognitive Sciences 14:277–290.

Bressler SL, Seth AK (2010) Wiener-Granger causality: A well established methodology. NeuroImage.

Bressler SL, Tang W, Sylvester CM, Shulman GL, Corbetta M (2008) Top-down control of human visual cortex by frontal and parietal cortex in anticipatory visual spatial attention. The Journal of Neuroscience: The Official Journal of the Society for Neuroscience 28:10056–10061.

Broadbent DE (1958) Perception and communication. Oxford University Press.

Buccino G, Binkofski F, Fink GR, Fadiga L, Fogassi L, Gallese V, Seitz RJ, Zilles K, Rizzolatti G, Freund H (2001) Action observation activates premotor and parietal areas in a somatotopic manner: an fMRI study. European Journal of Neuroscience 13:400–404.

Buschman TJ, Miller EK (2007) Top-down versus bottom-up control of attention in the prefrontal and posterior parietal cortices. Science 315:1860–1862.

Bush G, Luu P, Posner MI (2000) Cognitive and emotional influences in anterior cingulate cortex. Trends in Cognitive Sciences 4:215–222.

Cavada C, Goldman-Rakic PS (1989) Posterior parietal cortex in rhesus monkey: II. evidence for segregated corticocortical networks linking sensory and limbic areas with the frontal lobe. The Journal of Comparative Neurology 287:422–445.

Chen Y, Bressler S, Ding M (2006) Frequency decomposition of conditional Granger causality and application to multivariate neural field potential data. Journal of Neuroscience Methods 150:228–237.

Choi H, Zilles K, Mohlberg H, Schleicher A, Fink GR, Armstrong E, Amunts K (2006) Cytoarchitectonic identification and probabilistic mapping of two distinct areas within the anterior ventral bank of the human intraparietal sulcus. The Journal of Comparative Neurology 495:53–69.

Cohen A, Fair D, Dosenbach N, Miezin F, Dierker D, Van Essen D, Schlaggar B, Petersen S (2008a) Defining functional areas in individual human brains using resting functional connectivity MRI. NeuroImage 41:45–57.

Cohen JY, Pouget P, Heitz RP, Woodman GF, Schall JD (2008b) Biophysical support for functionally distinct cell types in the frontal eye field. Journal of Neurophysiology 101:912–916.

Colby CL, Goldberg ME (1999) Space and attention in parietal cortex. Annual Review of Neuroscience 22:319–349.

Corbetta M (1998) Frontoparietal cortical networks for directing attention and the eye to visual locations: Identical, independent, or overlapping neural systems? Proceedings of the National Academy of Sciences 95:831–838.

Corbetta M, Kincade JM, Ollinger JM, McAvoy MP, Shulman GL (2000) Voluntary orienting is dissociated from target detection in human posterior parietal cortex. Nature Neuroscience 3:292–297.

Corbetta M, Kincade MJ, Lewis C, Snyder AZ, Sapir A (2005) Neural basis and recovery of spatial attention deficits in spatial neglect. Nature Neuroscience 8:1603–1610.

Corbetta M, Shulman GL (2002) Control of goal-directed and stimulus-driven attention in the brain. Nat Rev Neurosci 3:201–215.

Crottaz-Herbette S, Menon V (2006) Where and when the anterior cingulate cortex modulates attentional response: Combined fMRI and ERP evidence. Journal of Cognitive Neuroscience 18:766–780.

Crowne DP (1983) The frontal eye field and attention. Psychological Bulletin 93:232–260.

Culham JC, Kanwisher NG (2001) Neuroimaging of cognitive functions in human parietal cortex. Current Opinion in Neurobiology 11:157–163.

Desimone R, Duncan J (1995a) Neural mechanisms of selective visual attention. Annual Review of Neuroscience 18:193–222.

Desimone R, Duncan J (1995b) Neural mechanisms of selective visual attention. Annual Review of Neuroscience 18:193–222.

Deutsch JA, Deutsch D (1963) Attention: Some theoretical considerations. Psychological Review 70:80–90.

Ding M, Bressler SL, Yang W, Liang H (2000) Short-window spectral analysis of cortical event-related potentials by adaptive multivariate autoregressive modeling: data preprocessing, model validation, and variability assessment. Biological Cybernetics 83:35–45.

Ding M, Chen Y, Bressler SL (2006) Granger causality: Basic theory and application to neuroscience. In Handbook of Time Series Analysis, pp. 437–460. Wiley-VCH Verlag GmbH & Co. KGaA.

Driver J, Mattingley JB (1998) Parietal neglect and visual awareness. Nature Neuroscience 1:17–22.

Driver J (2001) A selective review of selective attention research from the past century. British Journal of Psychology 92 Part 1:53–78.

Duncan J (1996) Cooperating brain systems in selective perception and action. In Attention and performance 16: Information integration in perception and communication, pp. 549–578. MIT Press.

Efron B, Hastie T, Johnstone I, Tibshirani R (2004) Least angle regression. The Annals of Statistics 32:407–499.

Eriksen C, Hoffman J (1972) Temporal and spatial characteristics of selective encoding from visual displays. Attention, Perception, & Psychophysics pp. 201–204.

Everling S, Tinsley CJ, Gaffan D, Duncan J (2002) Filtering of neural signals by focused attention in the monkey prefrontal cortex. Nature Neuroscience 5:671–676.

Everling S, Tinsley CJ, Gaffan D, Duncan J (2006) Selective representation of task-relevant objects and locations in the monkey prefrontal cortex. European Journal of Neuroscience 23:2197–2214.

Friston K (2009) Causal modelling and brain connectivity in functional magnetic resonance imaging. PLoS Biology 7:e33.

Fuster J (2008) The prefrontal cortex: anatomy, physiology, and neuropsychology of the frontal lobe, 4th edition. Elsevier, Amsterdam.

Geweke JF (1984) Measures of conditional linear dependence and feedback between time series. Journal of the American Statistical Association 79:907–915.

Gillebert CR, Mantini D, Thijs V, Sunaert S, Dupont P, Vandenberghe R (2011)

Lesion evidence for the critical role of the intraparietal sulcus in spatial attention.

Brain 134:1694–1709.

Golomb JD, Chun MM, Mazer JA (2008) The native coordinate system of spatial atten- tion is retinotopic. The Journal of Neuroscience: The Official Journal of the Society for

Neuroscience 28:10654–10662.

100 Gong G, He Y, Concha L, Lebel C, Gross DW, Evans AC, Beaulieu C (2008) Mapping anatomical connectivity patterns of human cerebral cortex using in vivo diffusion tensor imaging tractography. Cerebral Cortex 19:524–536.

Granger CWJ (1969) Investigating causal relations by econometric models and cross-spectral methods. Econometrica 37:424–438.

Greene W (2003) Least squares. In Econometric analysis, pp. 19–40. Prentice Hall.

Gregoriou GG, Gotts SJ, Zhou H, Desimone R (2009) High-frequency, long-range coupling between prefrontal and visual cortex during attention. Science 324:1207–1210.

Greicius MD, Krasnow B, Reiss AL, Menon V (2003) Functional connectivity in the resting brain: a network analysis of the default mode hypothesis. Proceedings of the National Academy of Sciences of the United States of America 100:253–258.

Hagmann P, Cammoun L, Gigandet X, Meuli R, Honey CJ, Wedeen VJ, Sporns O (2008) Mapping the structural core of human cerebral cortex. PLoS Biology 6:e159.

Halligan PW, Fink GR, Marshall JC, Vallar G (2003) Spatial cognition: evidence from visual neglect. Trends in Cognitive Sciences 7:125–133.

He BJ, Snyder AZ, Vincent JL, Epstein A, Shulman GL, Corbetta M (2007) Breakdown of functional connectivity in frontoparietal networks underlies behavioral deficits in spatial neglect. Neuron 53:905–918.

He Y, Chen ZJ, Evans AC (2006) Small-World anatomical networks in the human brain revealed by cortical thickness from MRI. Cerebral Cortex 17:2407–2419.

Heilman KM, Valenstein E, Watson RT (1984) Neglect and related disorders. Semin Neurol 4:209–219.

Hesse W, Moller E, Arnold M, Schack B (2003) The use of time-variant EEG Granger causality for inspecting directed interdependencies of neural assemblies. Journal of Neuroscience Methods 124:27–44.

James W (1890) The principles of psychology, Vol I. Henry Holt and Co.

Johnson-Frey SH (2004) The neural bases of complex tool use in humans. Trends in Cognitive Sciences 8:71–78.

Juan C, Muggleton NG, Tzeng OJL, Hung DL, Cowey A, Walsh V (2008) Segregation of visual selection and saccades in human frontal eye fields. Cerebral Cortex 18:2410–2415.

Kaminski M, Ding M, Truccolo WA, Bressler SL (2001) Evaluating causal relations in neural systems: Granger causality, directed transfer function and statistical assessment of significance. Biological Cybernetics 85:145–157.

Kastner S, Pinsk M (2004) Visual attention as a multilevel selection process. Cognitive, Affective, & Behavioral Neuroscience 4:483–500.

Kastner S, Pinsk MA, De Weerd P, Desimone R, Ungerleider LG (1999) Increased activity in human visual cortex during directed attention in the absence of visual stimulation. Neuron 22:751–761.

Kirk RE (1982) Experimental design: procedures for the behavioral sciences. Psychology Series. Brooks/Cole.

Knight RT, Grabowecky MF, Scabini D (1995) Role of human prefrontal cortex in attention control. Advances in Neurology 66:21–34.

Knight RT (1997) Distributed cortical network for visual attention. Journal of Cognitive Neuroscience 9:75–91.

Kravitz DJ, Saleem KS, Baker CI, Mishkin M (2011) A new neural framework for visuospatial processing. Nature Reviews Neuroscience 12:217–230.

Lakatos P, Karmos G, Mehta AD, Ulbert I, Schroeder CE (2008) Entrainment of neuronal oscillations as a mechanism of attentional selection. Science 320:110–113.

Leonardo C, Monica B, Maurizio C, Andrea P, Giancarlo T, Giovanni B (1995) Oculomotor activity and visual spatial attention. Behavioural Brain Research 71:81–88.

Luck SJ, Chelazzi L, Hillyard SA, Desimone R (1997) Neural mechanisms of spatial selective attention in areas V1, V2, and V4 of macaque visual cortex. Journal of Neurophysiology 77:24–42.

Luria AR The frontal lobes and the regulation of behavior. In Psychophysiology of the frontal lobes. Academic Press.

Mann HB, Whitney DR (1947) On a test of whether one of two random variables is stochastically larger than the other. The Annals of Mathematical Statistics 18:50–60.

McNemar Q (1947) Note on the sampling error of the difference between correlated proportions or percentages. Psychometrika 12:153–157.

Mehta AD, Ulbert I, Schroeder CE (2000) Intermodal selective attention in monkeys. I: Distribution and timing of effects across visual areas. Cerebral Cortex 10:343–358.

Mesulam M (1990) Large-scale neurocognitive networks and distributed processing for attention, language, and memory. Annals of Neurology 28:597–613.

Mesulam M (1981) A cortical network for directed attention and unilateral neglect. Annals of Neurology 10:309–325.

Mesulam MM (1999) Spatial attention and neglect: parietal, frontal and cingulate contributions to the mental representation and attentional targeting of salient extrapersonal events. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences 354:1325–1346.

Miller EK, Cohen JD An integrative theory of prefrontal cortex function. Annual Review of Neuroscience 24:167–202.

Moore T, Armstrong M, Falla M (2003) Visuomotor origins of covert spatial attention. Neuron 40:671–683.

Moore T, Armstrong KM (2003) Selective gating of visual signals by microstimulation of frontal cortex. Nature 421:370–373.

Motter BC (1993) Focal attention produces spatially selective processing in visual cortical areas V1, V2, and V4 in the presence of competing stimuli. Journal of Neurophysiology 70:909–919.

Mountcastle VB, Lynch JC, Georgopoulos A, Sakata H, Acuna C (1975) Posterior parietal association cortex of the monkey: command functions for operations within extrapersonal space. Journal of Neurophysiology 38:871–908.

Mukai I, Kim D, Fukunaga M, Japee S, Marrett S, Ungerleider LG (2007) Activations in visual and attention-related areas predict and correlate with the degree of perceptual learning. The Journal of Neuroscience 27:11401–11411.

Norman D, Shallice T (1986) Attention to action: Willed and automatic control of behavior. In Consciousness and Self-Regulation: Advances in Research and Theory IV, pp. 1–18. Plenum Press.

Pashler H (1999) The Psychology of Attention. Bradford Books. MIT Press.

Penfield W, Rasmussen T (1950) The cerebral cortex of man; a clinical study of localization of function. Macmillan.

Posner M, Snyder C (2004) Attention and cognitive control. In Cognitive psychology: key readings, Key readings in cognition. Psychology Press.

Posner MI, Snyder CR, Davidson BJ (1980) Attention and the detection of signals. Journal of Experimental Psychology 109:160–174.

Posner MI (1980) Orienting of attention. Quarterly Journal of Experimental Psychology 32:3–25.

Posner MI, DiGirolamo GJ (1998) Executive attention: Conflict, target detection, and cognitive control. In The attentive brain, pp. 401–423. MIT Press.

Reynolds JH, Chelazzi L (2004) Attentional modulation of visual processing. Annual Review of Neuroscience 27:611–647.

Rizzolatti G, Riggio L, Dascola I, Umilta C (1987) Reorienting attention across the horizontal and vertical meridians: evidence in favor of a premotor theory of attention. Neuropsychologia 25:31–40.

Rizzolatti G, Craighero L (1998) Spatial attention: Mechanisms and theories. In Advances in psychological science, Vol. 2: Biological and cognitive aspects, pp. 171–198. Psychology Press/Erlbaum (UK) Taylor & Francis.

Roberts AC, Robbins TW, Weiskrantz L (1998) The prefrontal cortex: Executive and cognitive functions. Oxford University Press.

Roebroeck A, Formisano E, Goebel R (2005) Mapping directed influence over the brain using Granger causality and fMRI. NeuroImage 25:230–242.

Ruff CC, Bestmann S, Blankenburg F, Bjoertomt O, Josephs O, Weiskopf N, Deichmann R, Driver J (2008) Distinct causal influences of parietal versus frontal areas on human visual cortex: Evidence from concurrent TMS-fMRI. Cerebral Cortex 18:817–827.

Ruff CC, Blankenburg F, Bjoertomt O, Bestmann S, Freeman E, Haynes JD, Rees G, Josephs O, Deichmann R, Driver J (2006) Concurrent TMS-fMRI and psychophysics reveal frontal influences on human retinotopic visual cortex. Current Biology 16:1479–1488.

S. K, Ungerleider L (2000) Mechanisms of visual attention in the human cortex. Annual Review of Neuroscience 23:315–341.

Sanchez-Bornot JM, Martinez-Montes E, Lage-Castellanos A, Vega-Hernandez M, Valdes-Sosa PA (2008) Uncovering sparse brain effective connectivity: a voxel-based approach using penalized regression. Statistica Sinica 18:1501–1518.

Saygin AP, Sereno MI (2007) Retinotopy and attention in human occipital, temporal, parietal, and frontal cortex. Cerebral Cortex 18:2158–2168.

Schall JD, Morel A, King DJ, Bullier J (1995) Topography of visual cortex connections with frontal eye field in macaque: convergence and segregation of processing streams. The Journal of Neuroscience: The Official Journal of the Society for Neuroscience 15:4464–4487.

Schall JD (2002) The neural selection and control of saccades by the frontal eye field. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences 357:1073–1082.

Schroeder CE, Mehta AD, Foxe JJ (2001) Determinants and mechanisms of attentional modulation of neural processing. Frontiers in Bioscience: A Journal and Virtual Library 6:672–684.

Selemon LD, Goldman-Rakic PS (1988) Common cortical and subcortical targets of the dorsolateral prefrontal and posterior parietal cortices in the rhesus monkey: evidence for a distributed neural network subserving spatially guided behavior. The Journal of Neuroscience: The Official Journal of the Society for Neuroscience 8:4049–4068.

Serences JT, Yantis S (2006) Selective visual attention and perceptual coherence. Trends in Cognitive Sciences 10:38–45.

Sereno MI (2001) Mapping of contralateral space in retinotopic coordinates by a parietal cortical area in humans. Science 294:1350–1354.

Seth AK, Edelman GM (2007) Distinguishing causal interactions in neural populations. Neural Computation 19:910–933.

Shojaie A, Michailidis G (2010) Discovering graphical Granger causality using the truncating lasso penalty. Bioinformatics 26:517–523.

Silver MA (2005) Topographic maps of visual spatial attention in human parietal cortex. Journal of Neurophysiology 94:1358–1371.

Silver MA, Kastner S (2009) Topographic maps in human frontal and parietal cortex. Trends in Cognitive Sciences 13:488–495.

Siman-Tov T, Mendelsohn A, Schonberg T, Avidan G, Podlipsky I, Pessoa L, Gadoth N, Ungerleider LG, Hendler T (2007) Bihemispheric leftward bias in a visuospatial attention-related network. The Journal of Neuroscience 27:11271–11278.

Smith SM, Miller KL, Salimi-Khorshidi G, Webster M, Beckmann CF, Nichols TE, Ramsey JD, Woolrich MW (2011) Network modelling methods for FMRI. NeuroImage 54:875–891.

Stefan, Treue (2001) Neural correlates of attention in primate visual cortex. Trends in Neurosciences 24:295–300.

Swisher JD, Halko MA, Merabet LB, McMains SA, Somers DC (2007) Visual topography of human intraparietal sulcus. Journal of Neuroscience 27:5326–5337.

Sylvester CM, Shulman GL, Jack AI, Corbetta M (2007) Asymmetry of anticipatory activity in visual cortex predicts the locus of attention and perception. The Journal of Neuroscience 27:14424–14433.

Thompson KG (2005) Neuronal basis of covert spatial attention in the frontal eye field. Journal of Neuroscience 25:9479–9487.

Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological) 58:267–288.

Tong F (2003) Primary visual cortex and visual awareness. Nat Rev Neurosci 4:219–229.

Treisman AM (1960) Contextual cues in selective listening. The Quarterly Journal of Experimental Psychology 12:242–248.

Treisman AM (1969) Strategies and models of selective attention. Psychological Review 76:282–299.

Treisman AM, Gelade G (1980) A feature-integration theory of attention. Cognitive Psychology 12:97–136.

Ungerleider LG, Gaffan D, Pelak VS (1989) Projections from inferior temporal cortex to prefrontal cortex via the uncinate fascicle in rhesus monkeys. Experimental Brain Research 76:473–484.

Valdes-Sosa PA, Sanchez-Bornot JM, Lage-Castellanos A, Vega-Hernandez M, Bosch-Bayard J, Melie-Garcia L, Canales-Rodriguez E (2005) Estimating brain functional connectivity with sparse multivariate autoregression. Philosophical Transactions of the Royal Society B: Biological Sciences 360:969–981.

Valdes-Sosa PA, Roebroeck A, Daunizeau J, Friston K (2011) Effective connectivity: Influence, causality and biophysical modeling. NeuroImage 58:339–361.

Van Essen DC, Drury HA, Dickson J, Harwell J, Hanlon D, Anderson CH (2001) An integrated software suite for surface-based analyses of cerebral cortex. Journal of the American Medical Informatics Association 8:443–459.

Webster MJ, Bachevalier J, Ungerleider LG (1994) Connections of inferior temporal areas TEO and TE with parietal and frontal cortex in macaque monkeys. Cerebral Cortex 4:470–483.

Wiener N (1956) The theory of prediction. In Modern mathematics for the engineer, pp. 165–190. McGraw-Hill, New York.

Woldorff MG, Hazlett CJ, Fichtenholtz HM, Weissman DH, Dale AM, Song AW (2004) Functional parcellation of attentional control regions of the brain. Journal of Cognitive Neuroscience 16:149–165.

Worden MS, Foxe JJ, Wang N, Simpson GV (2000) Anticipatory biasing of visuospatial attention indexed by retinotopically specific alpha-band electroencephalography increases over occipital cortex. The Journal of Neuroscience 20:RC63.

Zhang L, Zhong G, Wu Y, Vangel MG, Jiang B, Kong J (2010) Using Granger-Geweke causality model to evaluate the effective connectivity of primary motor cortex (M1), supplementary motor area (SMA) and cerebellum. Journal of Biomedical Science and Engineering 3:848–860.
