
The General Linear Model
A talk for dummies, by dummies
Meghan Morley and Anne Urai

Where are we headed?

• A delicious analogy
• The General Linear Model equation
• What do the variables mean?
• How does this relate to fMRI?
• Minimizing error

Analogy: Reverse Cookery

Start with the finished product and try to explain how it is made...
– You specify which ingredients to add (X)
– For each ingredient, the GLM finds the quantities (β) that produce the best reproduction
– Then, if you tried to make the cake with what you know about X and β, the error would be the difference between the original cake and yours!

y = x1β1 + x2β2 + e

The General Linear Model: y = Xβ + e

Describes a response (y), such as the BOLD response in a voxel, in terms of all its contributing factors (Xβ) in a linear combination, whilst also accounting for the contribution of error (e).

y: the dependent variable. Describes a response, such as the BOLD response in a single voxel, taken from an fMRI scan.

X: the independent variables, aka predictors, e.g. the experimental conditions. The design matrix embodies all available knowledge about experimentally controlled factors and potential confounds.

β: the parameters, aka regression coefficients or beta weights. Each quantifies how much its predictor (x) independently influences the dependent variable (y): the slope of the regression line.

e: the error, i.e. the part of the data (y) which is not explained by the linear combination of predictors (X).

Therefore:

y = Xβ + e
DV = (IV × Parameters) + Error

As we take samples of a response (y) many times, this equation actually represents a matrix equation: the GLM in matrix form.

One response, from a single voxel...

...measured over time: the BOLD signal at each time point.

Each predictor (x) has an expected signal time course, which contributes to y:

fMRI signal = β1 × X1 + β2 × X2 + β3 × X3 + residuals

(Y: observed data; X1, X2, X3: predictors; β: parameters; plus residuals.)
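A minimal numpy sketch of this composition (the boxcar shapes, β values and noise level are illustrative assumptions, not values from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
n_scans = 100                                   # number of time points (illustrative)

# Three hypothetical predictor time courses (the columns of X):
x1 = np.tile([1.0] * 10 + [0.0] * 10, 5)        # boxcar for condition 1
x2 = np.tile([0.0] * 10 + [1.0] * 10, 5)        # boxcar for condition 2
x3 = np.ones(n_scans)                           # constant baseline
X = np.column_stack([x1, x2, x3])               # design matrix, shape (100, 3)

beta = np.array([1.0, 2.0, 3.0])                # "true" parameters (illustrative)
e = rng.normal(scale=0.5, size=n_scans)         # residuals / noise

y = X @ beta + e                                # the GLM: y = X*beta + e
print(y.shape)                                  # (100,)
```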

• Beta is the slope of the regression line.
• It quantifies a specific predictor's (x) contribution to y.
• The parameters (β) chosen for a model should minimise the error, reducing the amount of variance in y which is left unexplained.

The model does not account for all of y:

• If we plot our observations (n) on a graph, they will not fall in a straight line.

• This is a result of uncontrolled influences (other than x) on y.

• This contribution to y is called the error (or residual).

• Minimising the difference between the response predicted by the model (ŷ) and the actual response (y) minimises the error of the model (sketched below).
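A one-predictor sketch of this idea (all numbers made up): the observations do not fall on the fitted line, and the residual is whatever the line leaves unexplained:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0, 10, 20)                        # a single predictor (illustrative)
y = 0.8 * x + rng.normal(scale=1.0, size=x.size)  # uncontrolled influences perturb y

slope, intercept = np.polyfit(x, y, deg=1)        # beta: slope of the regression line
y_hat = slope * x + intercept                     # response predicted by the model
residual = y - y_hat                              # error: what the line misses
print(np.sum(residual ** 2))                      # the quantity the fit minimises
```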

We have our measured data (y), and our set of hypothetical time courses: the "known" predictors x1, x2, x3, … (here: Generation, Shadowing and Baseline), so that y ≈ Xβ.

We find the best parameter values by modelling...

y ≈ β1 × (Generation) + β2 × (Shadowing) + β3 × (Baseline)

with the "unknown" parameters β1, β2, β3.

...the best parameters are those which minimise the error in the model.

With a first guess at β1, β2, β3: not brilliant; there is a lot of unexplained variance (residual error) in the model.

...and the same goes here with another guess: still not great.

Here we have a good fit, with minimal error: β1 = 0.83, β2 = 0.16, β3 = 2.98.

In other words: the same model can fit different data, just using different parameters, e.g. β1 = 0.68, β2 = 0.82, β3 = 2.17 (compare the fits in the sketch below).
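This "guessing" amounts to comparing residual sums of squares. A hedged sketch: the data are simulated under the slides' best-fit values (0.83, 0.16, 2.98), while the other candidate β vectors are illustrative guesses:

```python
import numpy as np

rng = np.random.default_rng(0)
# Illustrative design: two boxcars plus a baseline, as in the earlier sketch.
X = np.column_stack([np.tile([1.0] * 10 + [0.0] * 10, 5),
                     np.tile([0.0] * 10 + [1.0] * 10, 5),
                     np.ones(100)])
y = X @ np.array([0.83, 0.16, 2.98]) + rng.normal(scale=0.3, size=100)

def sse(beta):
    """Sum of squared errors: the variance in y left unexplained by X @ beta."""
    residual = y - X @ np.asarray(beta)
    return float(residual @ residual)

for name, beta in [("first guess", [1.0, 1.0, 3.0]),
                   ("another guess", [0.5, 1.5, 2.0]),
                   ("good fit", [0.83, 0.16, 2.98])]:
    print(f"{name}: SSE = {sse(beta):.2f}")       # lower SSE = better fit
```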

...and the same goes here: the same predictors (x) with different parameters (β) can fit yet another data set (y), e.g. β1 = 0.03, β2 = 0.06, β3 = 2.04.

The GLM doesn't care which data (y) it is given: the same model can be used across all voxels of the brain, just using different parameters.
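A sketch of this mass-univariate idea on simulated data (sizes and regressors are illustrative; a real analysis would use SPM or a similar package): one design matrix, one solve, a separate β vector per voxel.

```python
import numpy as np

rng = np.random.default_rng(0)
n_scans, n_voxels = 100, 5000                   # illustrative sizes

X = np.column_stack([np.tile([1.0] * 10 + [0.0] * 10, 5),   # Generation (boxcar)
                     np.tile([0.0] * 10 + [1.0] * 10, 5),   # Shadowing (boxcar)
                     np.ones(n_scans)])                     # Baseline
Y = rng.normal(size=(n_scans, n_voxels))        # one column of data per voxel

# One least-squares solve fits every voxel's column of Y at once.
B, *_ = np.linalg.lstsq(X, Y, rcond=None)       # B has shape (3, n_voxels)
print(B.shape)                                  # one (beta1, beta2, beta3) per voxel
```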

beta_0001.img: β = (0.83, 0.16, 2.98)ᵀ
beta_0002.img: β = (0.03, 0.06, 2.04)ᵀ
beta_0003.img: β = (0.68, 0.82, 2.17)ᵀ

Finding the optimal parameters can also be visualised in geometric space.

• The y and x in our model are all vectors of the same dimensionality, and so lie within the same large space.

• The design matrix (x1, x2, x3, …) defines a subspace: the design space (the green panel in the figure).

Design space defined by X (y = Xβ + e): the parameters determine the co-ordinates of the predicted response (ŷ) within this space.

• The actual data (y), however, lie outside this space.

• So there is always a difference between the predicted ŷ and the actual y.

We need to minimise the difference between the predicted ŷ and the actual y.

• So the GLM aims to find the projection of y onto the design space which minimises the error of the model (i.e. minimises the difference between the predicted ŷ and the actual y).

The smallest error vector is orthogonal to the design space…

...so the best parameters will position ŷ such that the error vector e between y and ŷ is orthogonal to the design space (minimising the error).

How do we find the parameters which produce minimal error? The optimal parameters can be calculated using Ordinary Least Squares (OLS):

β̂ = (XᵀX)⁻¹ Xᵀ y

A statistical method for estimating unknown parameters from sampled data, by minimizing the difference between the predicted data and the observed data:

minimise Σ eₜ² over t = 1, …, N
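A sketch of the OLS estimator on simulated data (illustrative design and noise), which also checks the geometric picture above: the error vector of the least-squares fit is orthogonal to the design space.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.column_stack([np.tile([1.0] * 10 + [0.0] * 10, 5),
                     np.tile([0.0] * 10 + [1.0] * 10, 5),
                     np.ones(100)])
y = X @ np.array([0.83, 0.16, 2.98]) + rng.normal(scale=0.3, size=100)

# Ordinary Least Squares: beta_hat = (X'X)^(-1) X'y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

y_hat = X @ beta_hat                            # projection of y onto the design space
e = y - y_hat                                   # the error vector

print(beta_hat)                                 # close to (0.83, 0.16, 2.98)
print(np.allclose(X.T @ e, 0.0))                # True: e is orthogonal to the design space
```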

Overview of SPM

[Pipeline: image time-series → realignment → smoothing (kernel) → general linear model (design matrix) → statistical inference (Gaussian field theory, p < 0.05) → parameter estimates; normalisation to a template.]

fMRI data: a single voxel is analyzed across many different time points; Y = the BOLD signal at each time point from that voxel.

An fMRI experiment: is there a change in the BOLD response between the two conditions?

Applying the GLM

y = x1β1 + x2β2 + e

• BOLD response at each time point at the chosen voxel (y)
• Predictors that explain the data (X)
• How much each predictor explains the data: the coefficient (β)
• Variance in the data that cannot be explained by the predictors: noise (e)

For the chosen voxel, the fit might give e.g. β = 0.44.

Statistical Parametric Mapping

• Null hypothesis: your effect of interest explains none of your data.
• Is the effect of the task significantly different from zero? e.g. p < 0.05 (a sketch of the test follows).
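The per-voxel test behind an SPM is a t-test on a contrast of the parameter estimates, t = cᵀβ̂ / sqrt(σ̂²·cᵀ(XᵀX)⁻¹c). A minimal sketch on simulated data (design, contrast and noise are all illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
X = np.column_stack([np.tile([1.0] * 10 + [0.0] * 10, 5),
                     np.tile([0.0] * 10 + [1.0] * 10, 5),
                     np.ones(100)])
y = X @ np.array([0.44, 0.10, 1.00]) + rng.normal(scale=0.3, size=100)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ beta_hat
dof = X.shape[0] - np.linalg.matrix_rank(X)     # residual degrees of freedom
sigma2 = (e @ e) / dof                          # estimated noise variance

c = np.array([1.0, -1.0, 0.0])                  # contrast: condition 1 > condition 2
t = (c @ beta_hat) / np.sqrt(sigma2 * (c @ np.linalg.inv(X.T @ X) @ c))
p = stats.t.sf(t, dof)                          # one-sided p-value
print(t, p)                                     # "significant" if e.g. p < 0.05
```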

Problems

1. The BOLD signal is not a simple on/off response
2. Low-frequency noise
3. Assumptions about the nature of the error
4. Physiological confounds

Convolution model

Impulses × HRF = expected BOLD:

expected BOLD response = input function × hemodynamic response function (HRF)

Convolve the stimulus function with a canonical HRF to obtain the convolved regressor. [Figure: original function, HRF, convolved.]
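A hedged sketch of the convolution step. The double-gamma expression below is a common approximation to the canonical HRF, with illustrative parameters; SPM's own spm_hrf differs in its details:

```python
import numpy as np
from scipy.stats import gamma

tr = 1.0                                        # repetition time in seconds (illustrative)
t = np.arange(0, 30, tr)                        # support the HRF over 30 s

# Double-gamma approximation to the canonical HRF: peak minus a later undershoot.
hrf = gamma.pdf(t, 6) - gamma.pdf(t, 16) / 6.0
hrf /= hrf.sum()

stimulus = np.zeros(100)                        # input function: impulses at onsets
stimulus[::20] = 1.0                            # a stimulus every 20 scans (illustrative)

# expected BOLD = input function convolved with the HRF
expected_bold = np.convolve(stimulus, hrf)[:stimulus.size]
print(expected_bold.shape)                      # (100,): one regressor column for X
```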

Adjusting for low frequencies

Low-frequency drifts are modelled with a discrete cosine transform (DCT) basis set, sketched below.
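A sketch of such a basis set (SPM builds it with spm_dctmtx and a default 128 s high-pass cutoff; the sizes below are illustrative):

```python
import numpy as np

n_scans, tr = 200, 2.0                          # illustrative scan count and TR (s)
cutoff = 128.0                                  # high-pass cutoff in seconds

# Number of cosine functions needed to capture drifts slower than the cutoff.
order = int(2 * n_scans * tr / cutoff) + 1
n = np.arange(n_scans)

# DCT-II-style basis: one column per low frequency (k = 0 is the constant term).
dct_basis = np.column_stack(
    [np.cos(np.pi * k * (2 * n + 1) / (2 * n_scans)) for k in range(order)]
)
print(dct_basis.shape)                          # (200, 7): append to X as confounds
```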

[Figure legend:] Blue = data; black = mean + low-frequency drift; green = predicted response, taking low-frequency drift into account; red = predicted response, NOT taking low-frequency drift into account.

GLM Assumptions

1. Errors are normally distributed (addressed by smoothing)

2. The error is the same at each and every measurement point.
3. There is no correlation between errors at different time points / data points.

Error is time-correlated

• The error at each time point is correlated with the error at the previous time point: e at time t is correlated with e at time t−1.

[Figure: error plotted over time: how it should be (uncorrelated) vs. how it is (correlated).]

Autoregressive Model

• Temporal autocorrelation: in y = Xβ + e, the error e is correlated over time:

eₜ = a·eₜ₋₁ + ε

• 'Whitening' is applied to compensate for the otherwise inflated t-values (a sketch follows).
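A sketch of the AR(1) error model and whitening on simulated noise (the coefficient is illustrative; SPM estimates the autocorrelation by restricted maximum likelihood rather than assuming it):

```python
import numpy as np

rng = np.random.default_rng(0)
n, a = 200, 0.4                                 # length and AR(1) coefficient (illustrative)

# Simulate autocorrelated error: e_t = a * e_{t-1} + eps_t
eps = rng.normal(size=n)
e = np.zeros(n)
for t in range(1, n):
    e[t] = a * e[t - 1] + eps[t]

# Whitening matrix W: premultiplying by W removes the lag-1 dependence,
# so (W e)_t = e_t - a * e_{t-1} = eps_t is white again.
W = np.eye(n) - a * np.eye(n, k=-1)
white = W @ e

print(np.corrcoef(e[1:], e[:-1])[0, 1])          # clearly non-zero (about a)
print(np.corrcoef(white[1:], white[:-1])[0, 1])  # near zero after whitening
```

In a real analysis the same W would be applied to both the data y and the design X before refitting.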

Physiological Confounds

• head movements

• arterial pulsations

• breathing

• eye blinks (visual cortex)

• adaptation effects, fatigue, changes in attention to the task

To recap...

y = x1β1 + x2β2 + e

Response = (Predictors × Parameters) + Error

Thanks to...

• Previous years' MfD slides (2009-2010)

• Dr. Guillaume Flandin