Outline Approaches to EEG-MRI integration

• fMRI/EEG data through:

• Three Approaches to integration/fusion (i) Prediction • Prediction some features of EEG to predict fMRI responses. • Constraints • 2nd Level Fusion (ii) Constraints Spatial Information from • Conclusions fMRI as Priors for Source Reconstruction

(iii) Fusion Common forward or generative model to explain EEG and fMRI data

1 2

Data integration/fusion: Previous Work Data Fusion vs. Data Integration

• Definitions [Savopol ISPRS 2004] • Data Integration: The use of another data-type to improve the geometry or other information reconstructed from the first data-type (e.g. fMRI constrained EEG). • Data Fusion: Images analyzed jointly, hence allowing them to influence one another, to reveal inter- relationships between data-types. higher EEG

Temporal Resolution fMRI lower lower Spatial Resolution higher

3 4

(iii) Fusion: common generative model that explains EEG and fMRI Possible approaches for joint analyses • Point-based • Correlation [Worseley 2002] • Bayesian Hierarchical Models • Straightforward, but difficult to visualize Reverend Thomas Bayes (1702-1761) • Region-based Karl Friston et al. (now) • Interregional correlation [Horovitz, et al, 2002, Mathalon, et al, 2003] • Structural equation modeling or dynamic causal modeling [McIntosh and Gonzalez-Lima 1994; Friston et al., 2003] • Useful for model testing, does not take into account all brain •Multiway Partial Least Squares regions/times • Transformation-based • A natural set of tools for this problem include those that transform data matrices into a smaller set of modes or components Pedro Valdes-Sosa • Singular value decomposition [Friston et al., 1993; Friston et al., 1996] • Partial Least Squares [McIntosh, Bookstein, et al, 1996] •ICA or jICA • Canonical Variates Analysis [Strother et al, 1995] • Independent Component Analysis [McKeown et al, 1998, Calhoun et al, 2001, Beckmann, et al., 2002] S = Ax

5 6

1 Separate vs. Joint Estimation Joint ICA Approach

(EE) ( ) ( FF) ( ) CC Mixing Model: XASXAS and Linear mixtures with ()EEFF () () () == xi,,,, v==as ic c v and x i v as ic c v shared mixing parameter ∑∑ T cc==11 For two sources: Xxx()mmm= ⎡⎤ () () , where m=E , F ⎣⎦12 {} ⎡⎤aa In a non-joint analysis, we maximize Shared mixing matrix: A = 11 12 the likelihood functions for each px()E ; w p xw()F ; ⎢⎥ the likelihood functions for each ( ) ( ) wa=1/ ⎣⎦aa21 22 modality separately… The mixing equations are: wpxw* = arg max log()E ; Resulting in two unmixing 11() F FF (E) (EE) ( ) w1 ( ) ( )() parameters, that then have to xss1111122=+aa xss1111122=+aa somehow be fused together * ()F wpxw22= arg max log ; w ( ) 2 xss(F ) =+aa(FF)() ()E ()EE( ) 2211222 xss2211222=+aa In contrast: for a joint analysis we maximize the joint likelihood * ()EF () TT w= arg max log pxxw( , ; ) ()EE () () FF () function, resulting in a single fused w Update equation: Δ=WIη − 2y ()u − 2y ()uW unmixing parameter { }

where yu()mm==+ggx( ()) and () 1/(1 e−x )

7 8

Overview Auditory Oddball Task Data Brain 1 Brain 2 Brain N ERP …

fMRI … tone, hi sweep, intensity 0.5 kHz 1kHz whistle a low Joint ICA Independent Component #1 Component #2 Components t1 t2

Ttt= []12 Standard Standard Target Standard Standard Standard Novel Standard

Sss= []12 s1 s2

b recombine

Spatiotemporal “Snapshots”

c 0 ms 200 ms 300 ms 9 10

Group Averaged fMRI/ERP Features fMRI EEG

Preprocess Preprocess Peak of “Target” Position Cz of “Target” -locked EEG signal -locked fMRI signal Feature Feature Extraction Extraction

Normalize Normalize

Subject 1 Subject 1 Subject 2 Subject 2 Subject N jICA Subject N

Noise Removal

Visualization

11 12

2 Joint ERP/fMRI Components Visualization Snapshots…

ERP (temporal) Components: Tt= [ 1 … tN ]

FMRI (spatial) Components: Ss= [ 1 … sN ] T FMRI Image Snapshot: MTSF ()tt=×()

T ERP Snapshot: MTSE ()vv=× ()

13 14

FMRI Snapshots (movie) FMRI Snapshots (stills)

15 16

ERP Snapshots Parallel ICA • Impose additional inter-modality correlation to improve ability to identify connections between EEG and fMRI

17 18

3 jica_simulation jica_method

A s

x

y W

neuronal sources s = (s1, s2, …, sn)T transformed by an unknown mixing matrix A to the signal mixture x = As; features of the sources and their mixture observed as scalp EEG/ERP waveforms E at timepoints t and channels c as AEs = xE= [xE1(tc), xE2(tc), …, xEn(tc)]T; hemodynamic correlates of the detected in fMRI volumes F at the respective locations v as AFs = xF = [xF1(v), xF2(v), …, xFn(v)]T;the mixture is entered into a common 2D space from which fused signals yEF = [xEF1(tcv), yEF2(tcv), …, yEFn(tcv)]T are extracted by adjusting the unmixing matrix W (the inverse of A), such that y = Wx optimally represents s. The update equation for the algorithm to compute the common unmixing matrix W and the Eichele, Calhoun, fused EEG/ERP and fMRI sources, uEF is ΔW = η{I-2 yEF (uEF) T }W, where yEF = g(uEF), and g(x) = 1/(1+e-x) is the nonlinearity in the neural Moosmann et al., in progress network…

19

Fusion ICA Toolbox (FIT) jica_results

136 ms

Eichele, Calhoun, Moosmann et al., in progress 21 22

Outline • Motivation • Data integration/fusion • ICA for data fusion • Study 1: Multi-task fMRI in Sz • Study 2: fMRI and GM in Sz • Other Projects • Summary

23 24

4 Motivation: we collect many data types… Multi-modal data collected

Brain Function Task 1-N Task 1-N (spatiotemporal, EEG EEGEEG fMRIfMRIfMRI task-related)

Brain Structure DTI (spatial) T1/T2

Covariates (Age, etc.) Other*

25 26

Benefit to a coupled/joint analysis? Data integration/fusion: Previous Work

27 28

Data Fusion vs. Data Integration No Inter-relationship

• Definitions [Savopol ISPRS 2004] • Data Integration: The use of another data-type to

improve the geometry or other information high reconstructed from the first data-type (e.g. fMRI EEG constrained EEG/DTI). • Data Fusion: Images analyzed jointly, hence allowing them to influence one another, to reveal inter- relationships between data-types. fMRI Temporal Resolution DTI Other* T1/T2 low low Spatial Resolution high

29 30

5 One-Way Inter-Relationship Two-Way Inter-Relationship high high EEG EEG

fMRI fMRI Temporal Resolution Temporal Resolution Other* DTI T1/T2 Other* DTI T1/T2 low low low Spatial Resolution high low Spatial Resolution high

31 32

Previous approaches Possible approaches for joint analyses • Data integration models • Voxel-based • Correlation [Worsely 1998] • fMRI constrained EEG [Dale, 1997; George, 1995] • Straightforward, but difficult to visualize • fMRI constrained DTI [Kim, 2003] • Region-based • Interregional correlation [Horwitz, et al, 1984] • Data fusion models • Structural equation modeling or dynamic causal modeling [McIntosh and Gonzalez-Lima 1994; Friston et al., 2003] • Multi-task PET using SVD [Friston, 1994] • Structural equation modeling [McIntosh & Gonzalez-Lima, 1991, Buchel & Friston, 1997] • PET and sMRI using PLS [Chen 2004] • Multiple regression and extensions [e.g., Kalman filters, Buchel & Friston, 1998] • Bayes networks [Dynamic Causal Modeling, Friston, Penny, et al, 2003] • EEG and fMRI using multiway-PLS [Martinez-Montes 2004] • Useful for model testing, does not take into account all brain regions • Transformation-based • A natural set of tools for this problem include those that transform data matrices into a smaller set of modes or components • Singular value decomposition [Friston et al., 1993; Friston et al., 1996] • Partial Least Squares [McIntosh, Bookstein, et al, 1996] • Canonical Variates Analysis [Strother et al, 1995] • Independent Component Analysis [McKeown et al, 1998, Calhoun et al, 2001, Beckmann, et al., 2002]

33 34

ICA for data fusion Feature-based • In contrast to a first-level ICA approach, we • What is a feature? instead introduce the idea of a second level, • Lower dimensional data containing information of feature-based analysis of the fMRI activation interest maps (the features) generated from a first-level • Examples: An image of activation amplitudes, A gray analysis matter segmentation image, fractional anisotropy image • We propose a method, called joint ICA, that • Advantages enables the decomposition of activation maps • Less-computationally complex/easier to model generated from two tasks into joint, maximally • Takes advantages of existing analytic approaches spatially independent components • Examines covariation of multiple data types at the subject level

Modality Raw data Feature Compression size size ratio fMRI 260 MB 1.3 MB 1:200 DTI 200 MB 10 MB 1:20 sMRI 46 MB 28 MB 1:1.6 EEG 25MB .45MB 1:56

35 36

6 Framework jICA Overview: basic framework Raw Data Features Coherent Patterns EEG Preprocess EEG1

Preprocess1 fMRI fMRI11

Preprocess2 fMRI12 DTI Preprocess f() DTI1 T1/T2 Preprocess T1/T21

* Other Preprocess * 6 Other1 xi ∈

37 38

(slightly) more formally… Algorithm for jICA analysis • Feature selection (fMRI activation image for task 1, and fMRI CC activation image for task 2) Linear mixtures with x(1)==as (1) and x (2) as (2) shared mixing parameter iv,,,,∑∑ ic cv iv ic cv • Feature normalization (In order to ensure similar consideration for cc==11 both tasks, possibilities include 1) resampling, 2) sign-flip to preserve positivity, and 3) length (compute average sum-of-squares across all Alternatively, we can write (1) 1 (2) 2 subjects and all voxels and normalize with this) s = wx() s = wx() wa=1/ the unmixing equations • Feature matrix composition (in-brain voxels extracted and feature datasets organized into a joint data matrix) In a non-joint analysis, we maximize • Dimensionality estimation (using e.g. the Minimum Description ()1 2 the likelihood functions for each px( ; w) p( xw(); ) Length) [Rissanen 1983] modality separately… • Dimensionality reduction (PCA used to reduce the dimensionality of the data) * ()1 Resulting in two unmixing wpxw11= arg max log() ; • Spatial ICA decomposition of the feature matrix using the infomax w1 parameters, that then have to algorithm [Bell and Sejnowski 1995] to generate component images. somehow be fused together * ()2 • Component selection (e.g. loading parameters examined for a wpxw22= arg max log( ; ) • Component selection (e.g. loading parameters examined for a w2 significant difference between two groups) In contrast: for a joint analysis we • Component display (joint components are re-converted into 3D images maximize the joint likelihood function, and the initial sign-flipping process [if used] is reversed) wpxxw* = arg max log()12 , ( ) ; resulting in a single fused unmixing w ( ) parameter

39 40

Data-1 Data-2 Simulation: Hybrid-Experiment

Preprocessing Preprocessing Sub_1 … Sub_N e.g. spatial normalization e.g. smoothing

AOD + 0.3× + 0.1× Feature Extraction Feature Extraction … e.G GLM Analysis e.g. Segmentation

e.g. Length/ Normalize Normalize # Samples GM + 0.3× … + 0.1× Data-1 Feature Data-2 Feature Group 1 { Joint Feature Matrix Group 2 {

jICA / ICA

Component Selection e.g. Test loading parameters (group1 vs group2) 41 42

7 jICA of multiple-tasks Study 1: Multi-task fMRI

43 44

Study 1: Multi-task fMRI Fusion in Sz Schizophrenia and Disconnection • Subject Information • Schizophrenia is a complex condition with a diverse and • Fifteen healthy controls and • All participants had normal fifteen chronic schizophrenia hearing (assessed by self heterogeneous clinical presentation, which makes it unlikely that outpatients in complete or report) and were able to the neural mechanism underlying the disorder is limited to a partial remission perform both tasks circumscribed brain dysfunction • Patients met criteria for successfully during practice schizophrenia in the DSM-IV prior to the scanning session. • Schizophrenia has been associated with both structural and on the basis of a structured • Scan Parameters functional abnormalities in neocortical networks including clinical interview and review • 3.0 Tesla Siemens Allegra of the case file (mean duration frontal, parietal, and temporal regions. of illness=11.2±6.3 years) • TR=1.50s [AOD]/1.86s[SB] • All but one patient with • TE=27ms • Schizophrenia is thought to involve a disturbance of coupling • field of view=24cm schizophrenia were stabilized • field of view=24cm between large-scale cortical systems [Breakspear, 2003; Friston, 1998; on atypical antipsychotic • acquisition matrix=64x64 Pearlson, 1999]. medications. • flip angle=70 deg . • Equal numbers of males • slice • There have been a number of models proposed with many studies (N=12/12) and females thickness=4[AOD]/3[SB]mm (N=3/3) and all but two implicating regions in temporal lobe, cerebellum, thalamus, basal participants in each group were • gap=1mm, 29[AOD]/36[SB] slices ganglia, and lateral frontal regions [Weinberger 1987; McCarley et al., 1991; right handed. Andreasen et al., 1998; Braver et al., 1999; Friston 1999] • No significant between-group • ascending acquisition) differences in age (patients • Six “dummy” scans • Some suggested models 37±11 years; controls 38±11 • Dis-coordination models [Andreasen et al., 1998] years) • Fronto-temporal disconnection models [Liddle et al., 1992] 45 46

Auditory Oddball & Sternberg Tasks

tone, sweep, 1 kHz .5 kHz whistle

Standard Standard Target Standard Standard Standard Novel Standard

Encode Maintain Recognition Probes 10 sec 9 sec 12 sec

d f g j p f n d …

47 48

8 AOD SB-WM Preprocessing Timing/motion correction Spatial normalization Spatial smoothing Feature Extraction GLM Analysis

Normalize Normalize

AOD “targets” SB-WM “recognition” 15 Patients { Joint Feature Matrix 15 Controls {

jICA 10 components

Component Selection Test loading parameters (patients differ from controls) 49 50

jICA Results Joint Histograms One component Significant at p<0.01

51 52

Inter-task correlation Brief Summary of Study 1 • Finding 1: • Schizophrenia patients demonstrate “decreased” connectivity in the joint network identified using the jICA approach • This network includes regions in temporal lobe, cerebellum, thalamus, basal ganglia, visual cortex, and lateral frontal regions, and these findings are consistent with both the cognitive dysmetria and fronto-temporal disconnection • Our findings thus argue against new, less efficient, areas being recruited by patients due to impaired connectivity between regions which are normally utilized • Finding 2: • A second finding is that for the voxels identified by the jICA analysis, the correlation between the two tasks was significantly higher in patients than controls • This finding suggests that schizophrenia patients activate “more similarly” for both tasks than controls. The degree to which a brain activation map is different from that of another task may reflect the degree to which performance on a task is “specialized” to a certain set of regions • A possible synthesis of both findings is that patients are activating less, but also activating with a less unique set of regions for these very different tasks. This suggests both a global attenuation of activity as well as a breakdown of specialized wiring between cognitive domains. Alternative explanations?

53 54

9 Study 2: Function and Structure Study 2: fMRI and GM Fusion in Sz • Subject Information • Fifteen healthy controls and • All participants had normal hearing (assessed by self report) fifteen chronic schizophrenia and were able to perform both outpatients in complete or tasks successfully during practice partial remission prior to the scanning session. • Patients met criteria for • Scan Parameters (3.0 Tesla schizophrenia in the DSM-IV Allegra) on the basis of a structured • fMRI EPI Scan clinical interview and review • TR=1.50s, TE=27ms of the case file (mean duration • field of view=24cm, 64x64 of illness=11.2±6.3 years) • flip angle=70 deg • All but one patient with • st=4mm, gap=1mm, 29 slices schizophrenia were stabilized • ascending acquisition on atypical antipsychotic • Six “dummy” scans medications. • sMRI T1 MP-RAGE • Equal numbers of males • TR/TE/TI = 2300/2.74/900 ms (N=12/12) and females • flip angle = 8° (N=3/3) and all but two • FOV = 176x256 mm participants in each group were • Slab thickness = 176 mm right handed. • Matrix = 176x256x176 • Voxel size =1x1x1 mm • No significant between-group • Number of averages = 2 differences in age (patients • Pixel bandwidth =190 Hz 37±11 years; controls 38±11 • Total scan time =10:09 min years)

55 56

AOD fMRI Data T1 Image jICA Results One component Preprocessing Preprocessing/ Significant at p>0.01 Timing/motion correction Spatial normalization Feature Extraction Spatial smoothing Spatial normalization Segmentation/Bias correction Spatial smoothing Feature Extraction Spatial smoothing GLM Analysis Normalize Normalize

AOD “targets” GM Segmentation 15 Patients { Joint Feature Matrix 15 Controls {

jICA 10 components

Component Selection Test coupled loading parameters (patients differ from controls) 57 58

Joint Histograms Masked fMRI/VBM

59 60

10 Group averaged Images Summary Study 2 • We have used a joint-ICA model to examine linearly related fMRI auditory oddball target activation data and gray matter segmentation data • A single component which was significantly different between patients with schizophrenia and healthy controls were examined • Finding 1: • Gray matter regions in bilateral parietal lobe and frontal lobe as well as right temporal lobe were found to be associated with auditory oddball activations in bilateral temporal lobe • Previous findings of diminished activity to AOD and decreased gray matter in temporal lobe regions were replicated • Finding 2: • in the regions showing the largest group differences, gray matter concentrations were larger in patients versus controls, suggesting that more gray matter may be related to less functional connectivity in the auditory oddball fMRI task • As a whole, these findings suggest a possible morphological substrate for auditory oddball connectivity changes in schizophrenia (although this cannot be proven in the current study) • In general, we suggest studying interactions between gray matter data and fMRI data provides a useful way to examine structure and function.

61 62

Other examples Summary • Much more data fusion is need to make sense of the vast amounts of data we are collecting! • Caution: causality cannot be established with correlational Gray Matter or related methods! & • Data fusion approaches enable us to ask novel questions DTI about fMRI data and provides a way to potentially help us gain a more complete understanding of brain structure and function • Our choice of modeling the shared dependence between the tasks with the mixing parameters should be examined in more studies before the true utility of this assumption will be known • Future work includes advancing the statistical models fMRI used, especially wrt incorporating prior information about & each of the data types ERP

63 64

65

11