The General Linear Model 20/21

Univ.-Prof. Dr. Dirk Ostwald

Contents

1 Introduction
  1.1 Probabilistic modelling
  1.2 Experimental design
  1.3 A verbose introduction to the general linear model
  1.4 Bibliographic remarks
  1.5 Study questions

2 Sets, sums, and functions
  2.1 Sets
  2.2 Sums, products, and exponentiation
  2.3 Functions
  2.4 Bibliographic remarks
  2.5 Study questions

3 Calculus
  3.1 Derivatives of univariate real-valued functions
  3.2 Analytical optimization of univariate real-valued functions
  3.3 Derivatives of multivariate real-valued functions
  3.4 Derivatives of multivariate vector-valued functions
  3.5 Basic integrals
  3.6 Bibliographic remarks
  3.7 Study questions

4 Matrices
  4.1 Matrix definition
  4.2 Matrix operations
  4.3 Determinants
  4.4 Symmetry and positive-definiteness
  4.5 Bibliographic remarks
  4.6 Study questions

5 Probability spaces and random variables
  5.1 Probability spaces
  5.2 Elementary probabilities
  5.3 Random variables and distributions
  5.4 Random vectors and multivariate probability distributions
  5.5 Bibliographic remarks
  5.6 Study questions

6 Expectation, variance, and transformations
  6.1 Expectation
  6.2 Variance
  6.3 Sample mean, sample variance, and sample standard deviation
  6.4 Covariance and correlation of random variables
  6.5 Sample covariance and sample correlation
  6.6 Probability density transformations
  6.7 Combining random variables
  6.8 Bibliographic remarks
  6.9 Study questions

7 Probability distributions
  7.1 The multivariate Gaussian distribution
  7.2 The General Linear Model
  7.3 The Gamma distribution
  7.4 The χ2 distribution
  7.5 The t distribution
  7.6 The f distribution
  7.7 Bibliographic remarks
  7.8 Study questions

8 Maximum likelihood estimation
  8.1 Likelihood functions and maximum likelihood estimators
  8.2 Maximum likelihood estimation for univariate Gaussian distributions
  8.3 ML estimation of GLM parameters
  8.4 Example (Independent and identically distributed Gaussian samples)
  8.5 Bibliographic remarks
  8.6 Study questions

9 Frequentist distribution theory
  9.1 Introduction
  9.2 Beta parameter estimates
  9.3 Variance parameter estimates
  9.4 The T-statistic
  9.5 The F-statistic
  9.6 Bibliographic remarks
  9.7 Study questions

10 Statistical testing
  10.1 Statistical tests
  10.2 A single-observation z-test
  10.3 Bibliographic remarks
  10.4 Study questions

11 T-tests and simple linear regression
  11.1 Introduction
  11.2 One-sample t-test
  11.3 Independent two-sample t-test
  11.4 Simple linear regression
  11.5 Bibliographic remarks
  11.6 Study questions

12 Multiple linear regression
  12.1 An exemplary multiple linear regression design
  12.2 Linearly independent, orthogonal, and uncorrelated regressors
  12.3 Statistical efficiency of multiple linear regression designs
  12.4 Multiple linear regression in functional neuroimaging
  12.5 Bibliographic remarks
  12.6 Study questions

13 One-way ANOVA
  13.1 The GLM perspective
  13.2 The F-test perspective
  13.3 Bibliographic remarks
  13.4 Study questions

14 Two-way ANOVA
  14.1 An additive two-way ANOVA design
  14.2 A two-way ANOVA design with interaction
  14.3 Bibliographic remarks
  14.4 Study questions

1 | Introduction

The general linear model (GLM) is a unifying perspective on many analytical techniques in statistics, machine learning, and artificial intelligence. For example, many statistical methods, such as T-tests, F-tests, simple linear regression, multiple linear regression, the analysis of variance, and the analysis of covariance, are special cases of the GLM. Furthermore, the mathematical machinery of the GLM forms the basis for many more advanced data analytical techniques ranging from mixed linear models to neural networks to Bayesian hierarchical models. In cognitive neuroimaging, the GLM is popular as a standard technique in the analysis of fMRI data. The aim of this introductory Section is to preview the scope of contemporary data analytical approaches, which is most sensibly summarized by the term probabilistic modelling (Section 1.1). After touching upon some basic aspects of experimental design (Section 1.2), we then provide a verbose introduction to the GLM and its mathematical form (Section 1.3). The mathematical language that is needed to discuss the GLM (e.g., matrix calculus and multivariate Gaussian distributions) will be expanded upon in subsequent Sections. It is introduced here primarily to motivate the engagement with these mathematically more basic concepts in subsequent Sections.

1.1 Probabilistic modelling

Science is the dyad of formulating quantitative theories about natural phenomena and validating these theories in light of quantitative data. Because quantitative data is finite, theories will only ever be quantified up to a certain level of uncertainty. Probabilistic modelling provides the glue between formalized scientific theories and empirical data and offers a mechanistic framework for quantifying the remaining uncertainty about a theory's validation. Probabilistic modelling has many synonyms, such as statistics, data assimilation, advanced machine learning, or simply data analysis. Cognitive neuroscience aims for a scientific approach to understanding brain function. When designing any experiment in cognitive neuroscience, it is thus essential to have at least a vague idea about the data analytical procedures that are going to be used on the collected data, irrespective of whether the data is behavioural or derives from neuroimaging techniques such as functional magnetic resonance imaging (fMRI) or magneto- or electroencephalography (M/EEG). In the current Section, we provide a brief overview of common data analytical strategies employed in cognitive neuroimaging or, more generally, in probabilistic quantitative data analysis. To this end, it is first helpful to appreciate that any form of data analysis embodies data reduction and that any sensible form of data reduction is based on a model of the data generating process.

Data analysis is data reduction. Any cognitive neuroscience experiment generates a wealth of quantitative data (numbers). For example, when conducting a typical behavioural experiment, one presents stimuli of different experimental conditions multiple times to participants and records, for example, the correctness of the response and the associated reaction time on each experimental trial. For reaction times only and with a hundred trials for each of four experimental conditions, this amounts to four hundred numbers per participant. Usually, one does not only acquire data from a single participant and thus deals with four hundred data points times the number of participants. If one concomitantly acquires neurophysiological data, for example fMRI data across many voxels or EEG data from multiple electrodes, the number of data points grows into the hundreds of thousands or even millions very quickly. Nevertheless, one would like to understand and visualize in which way the experimental manipulation has affected the recorded data. Any data analysis method must hence project large sets of numbers onto smaller sets of numbers that allow for the experimental effects to be more readily appreciated by humans. These smaller sets of numbers are commonly referred to as statistics. While many data analysis techniques appear to be very different at least on the surface, a reduction of the data dimensionality is a common characteristic of all forms of data analysis (Figure 1.1).

Figure 1.1. Data analysis is data reduction (left panel: Raw Data; right panel: Reduced Data). Raw data usually takes the form of large data matrices, here represented by a 100 × 100 array of different colours encoding real number values. Usually, the raw data are not reported in scientific reports, but rather a smaller set of numbers, such as T- or p-values in frequentist statistics. This smaller set of numbers is represented by the 2 × 2 array of different colours on the right. The process of transforming a large data set into a smaller data set that can be more readily appreciated by humans is called data analysis (glm 1.m).

Data analysis is model-based. A second characteristic of any data analysis method is that it embodies assumptions about how the data were generated and which data aspects are important. The key step of any data analysis method is to evaluate how well a given set of quantitative assumptions, i.e., a model, can explain a set of observed data. When studying any data analysis approach, it is helpful to identify the following three components of the scientific method: model formulation, model estimation, and model evaluation (Figure 1.2). Model formulation refers to the mathematical formalization of informal ideas about the generation of empirical data. Typically, models aim to mechanistically and quantitatively capture data generating processes and comprise both deterministic and probabilistic aspects. Some components of a model may take predefined values and are referred to as fixed parameters, while other components of a model can be informed by the data and are referred to as free parameters. Model estimation is the adaptation of model parameters in light of observed data. Often the adaptation of free model parameters in light of observed data is a non-trivial task and requires sophisticated mathematical and statistical techniques. Finally, model evaluation refers to the evaluation of adapted parameter values in some meaningful sense and drawing conclusions about experimental hypotheses. Note that upon model evaluation, the scientific method proceeds by going back to the model formulation step. At least two aims may be addressed during model reformulation: either to conceive a model formulation that may capture observed data in a more meaningful way or to relax the assumptions of the model to derive a more general theory.

Model classes

It is sometimes helpful to classify a particular model. While ultimately every model and its associated estimation and evaluation scheme is unique, some rough categorization can help to obtain an overview about the plethora of data analysis approaches in functional neuroimaging. Below we discuss a non-exhaustive list of dichotomies.

Static vs. dynamic models. In most simple terms, static models describe the current state of a phenomenon, while dynamic models describe how the phenomenon of interest currently changes. Usually, static models have no inherent representation of time, while dynamic models typically treat time as an explicit model variable. Static models often have a relatively simple algebraic form, whereas dynamic models are usually formulated with the help of differential equations. While not originally conceived as models of time-series data, many static models are also applied to data, often inducing the need for sophisticated model modifications. Dynamic models can further be classified into deterministic dynamic and stochastic dynamic models. Deterministic dynamic models describe the change of the state of a phenomenon without additive stochastic error, commonly using systems of ordinary or partial differential equations. Stochastic dynamic models additionally assume probabilistic influences on the change of the phenomenon of interest and are formulated using stochastic differential equations.

Univariate vs. multivariate models. Another way to classify models is according to the dimensionality of the measurement data they describe. If this dimension is one, i.e., for each measurement a single number is observed and modelled, the model is referred to as univariate. On the other hand, if each measurement constitutes two or more numbers which are modelled, the model is referred to as multivariate.


Figure 1.2. Data analysis is model-based. (Diagram elements: Reality, Data, Science, Model formulation, Model estimation, Model evaluation.) The figure depicts the relationship between the scientific method (big box) and reality. Data forms part of the scientific method, because it is registered in data recording instruments that aim to capture specific aspects of reality. The scientific method is based on the formulation of models (also known as theories or hypotheses), the estimation of these models based on data (also known as parameter estimation or model fitting), and the evaluation of the models in light of the data upon their estimation. Typically, multiple models are compared with respect to each other. Upon evaluation of a model, the model may be refined or a new model may be formulated. Note that this is a highly idealistic description of the scientific process, which omits all sociological factors involved in actual academic practice (glm 1.m).

Encoding vs. decoding models. Another popular model classification scheme uses the notions of encoding vs. decoding models. According to this scheme, encoding models rest on an explicit formulation of the experimental circumstances that generate measurements, while decoding approaches decode the experimental circumstances from the observed measurements. However, the distinction between encoding and decoding models is meaningless because every "decoding model" is also based on a model of the measurement data - most typically a very simple one with little explanatory appeal. As will become evident in subsequent Sections, the GLM is a static, univariate model that can be used both in an encoding and a decoding manner. Due to its relative simplicity, the GLM forms an ideal starting point for studying modern data analysis.

Model estimation and evaluation techniques

Probabilistic models comprise both deterministic aspects and stochastic aspects. The stochastic aspects commonly model that part of the data variability that is not explained by the deterministic aspects. The frameworks of Frequentist and Bayesian statistics differ in the way that stochastic aspects are interpreted.

Frequentist statistics. In Frequentist statistics, probabilities are interpreted as large-sample limits of relative frequencies of random phenomena. Most of classical Frequentist statistics as encountered in undergraduate statistics combines variants of the GLM with null hypothesis significance testing (NHST). NHST is based on the following logic: one assumes that if there is no experimental effect, a statistic of interest has a certain probability distribution. This is referred to as the null distribution. Upon observing data, one can compute the probability of obtaining the observed or more extreme data under the null distribution. If this probability (known as the p-value) is small, one concludes that the data does not support the null hypothesis and declares the experimental effect to be "statistically significant".
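As a concrete illustration of this logic, the following minimal sketch computes a two-sided p-value for a hypothetical observed z-statistic under a standard normal null distribution. The statistic value and the use of SciPy are illustrative assumptions added here, not part of the original text; the single-observation z-test itself is treated in Section 10.

```python
# Minimal sketch: two-sided p-value of an observed z-statistic under a
# standard normal null distribution (the value 2.1 is chosen for illustration only).
from scipy import stats

z_observed = 2.1                              # hypothetical observed statistic
p_value = 2 * stats.norm.sf(abs(z_observed))  # P(|Z| >= |z_observed|) under N(0, 1)
print(f"two-sided p-value: {p_value:.4f}")    # approx. 0.0357
```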

Bayesian statistics. In Bayesian statistics, probabilities are interpreted as measures of subjective uncertainty. Here, in the absence of any experimental data, one quantifies one's uncertainty about model parameters using so-called prior distributions. Using Bayes' theorem, one then computes the posterior distribution of model parameters given the data, resulting in an updated belief. At the same time, one often aims to quantify the probability of the data under the model assumptions employed. It should be noted that the dichotomy between Frequentist statistics and Bayesian statistics is not a strict one, and that mixed forms, such as parametric empirical Bayes or the study of Frequentist quality criteria of Bayesian point estimators, exist. In this introduction to the GLM, we will focus on Frequentist statistics, which remains the dominant statistical paradigm in the empirical sciences.

1.2 Experimental design

In this Section, we briefly review a few terms from the theory of experimental design that will be needed for introducing the GLM.

Experiment and experimental design. An experiment is the controlled test of a scientific hypothesis or theory. Experiments manipulate some aspect of the world and then measure the outcome of that manipulation. In functional neuroimaging experiments, researchers often manipulate some aspects of a stimulus (for example, presenting a picture of a face or a house, or manipulating whether a word is easy or difficult to remember) and measure the participant's behaviour and brain activity using fMRI or EEG. Here, experimental design refers to the organization of an experiment to allow for the effective investigation of the research hypothesis. All well-designed experiments share several characteristics: they test specific hypotheses, rule out alternative explanations for the data, and minimize the costs involved in the experiment.

Independent and dependent experimental variables. An experimental variable can be defined as a manipulated or measured quantity that varies within an experiment. Two classes of experimental variables are central: independent and dependent variables. Independent experimental variables are aspects of the experimental design that are intentionally manipulated by the experimenter and that are hypothesized to cause changes in the dependent variables. Independent variables in functional neuroimaging experiments include, for example, different forms of sensory stimulation, different cognitive contexts, or different motor tasks. The different values of an independent variable are often referred to as conditions or levels. Usually, independent variables are explicitly controlled. From a modelling perspective, they are thus usually represented by constants rather than by random variables. Dependent experimental variables are quantities that are measured by the experimenter in order to evaluate the effect of the independent variables. Examples for dependent variables in functional neuroimaging experiments are the response accuracy and reaction time in behavioural tasks, the BOLD signal at a given voxel in an fMRI experiment, or the composition of a recording channel in an EEG experiment. From a data analytical perspective, dependent experimental variables are usually modelled by random variables.

Categorical and continuous experimental variables. In principle, both independent and dependent variables can either be categorical or continuous. A categorical experimental variable is an experimental variable that can take on one of several discrete values, for example, encoding sensory stimulation (1) vs. no sensory stimulation (0). Categorical experimental variables are commonly referred to as factors taking on different levels. Mathematically, categorical variables are usually represented by elements of the natural numbers or signed integers. A continuous experimental variable is an experimental variable that can take on any value within a specified range. Examples for continuous variables are different contrast levels of a visual stimulus as well as most observed signals in functional neuroimaging, such as the BOLD signal or electrical potentials in EEG. Mathematically, continuous experimental variables are usually represented by real numbers.

Between- and within-participant designs. Experimental designs can be classified according to whether the levels of an independent variable are applied to the same group of participants or to different groups of participants. In a between-participant design, different participant groups are associated with different values of an independent experimental variable. A more common design type in basic functional neuroimaging research is the within-participant design, in which each participant is exposed to all levels of the independent experimental variables. These designs are also commonly referred to as repeated-measures designs.


1.3 A verbose introduction to the general linear model

The GLM can be neatly summarized in the expression

y = Xβ + ε, (1.1)

which we will refer to as the GLM equation. In the GLM equation, y represents the data, X denotes a design matrix, β denotes a parameter vector, and ε denotes an error vector. The aim of the current Section is to gain an initial understanding of eq. (1.1). To this end, we will first consider the structural aspects of eq. (1.1), comprising the data y, the design matrix X, and the parameter vector β, from the perspective of independent and dependent experimental variables. In a second step, we then consider the stochastic aspect represented by the error vector ε. We exemplify both the structural and stochastic aspects by means of the simple linear regression model. Throughout, it is important to note that the data y is modelled by the GLM as a stochastic entity of which a single realization is available in a practical context.

Structural aspects

To obtain an initial understanding of the GLM equation (1.1), we consider an independent experimental variable, denoted by x for the moment, and a dependent experimental variable, denoted by y for the moment. As reviewed above, the independent experimental variable x is under the control of the experimenter, while the dependent experimental variable y models measurements of a phenomenon of interest. y is not under the direct control of the researcher, but it is assumed that it is in some way related to x. For reasons of simplicity and flexibility, and because smooth functional relationships are locally approximately linear, researchers choose to model many of these relationships by means of affine-linear functions, or linear models for short. In verbose terms, a noise-free linear model states that "an observed value of the dependent variable y is equal to a weighted sum of values associated with one or more independent variables x".

To render the last statement more precise, we introduce some additional notation: let yi denote one observation of the dependent variable y, where i = 1, ..., n, such that there are n observations in total. Likewise, let xij, i = 1, ..., n, j = 1, ..., p denote the values of a number of independent experimental variables that are supposed to be associated with the observation yi. Here, p is the number of independent experimental variables. The statement that the value yi equals the weighted sum of the values of the independent variables xij associated with this observation can then be written as

yi = xi1β1 + xi2β2 + xi3β3 + ... + xipβp. (1.2)

In eq. (1.2), the βj parameters are multiplicative coefficients that quantify the contribution of the independent experimental variable xij to the value of the dependent experimental variable yi. Each βj parameter may thus be conceived as the size of the effect that the independent experimental variable xij has on the value of the dependent experimental variable yi. All variables in eq. (1.2) should be thought of as scalar numbers. As a concrete example of eq. (1.2), we consider the 7th dependent variable y7 of a set of dependent variables associated with p = 4 independent experimental variables and, correspondingly, four βj parameters:

y7 = x71β1 + x72β2 + x73β3 + x74β4. (1.3) A numerical example of eq. (1.3) is 10 = 16 · 0.25 + 1 · 2 + 3 · 0.5 + 2.5 · 1. (1.4)

Here, the value of the dependent experimental variable is y7 = 10, the values of the independent experimental variables are x71 = 16, x72 = 1, x73 = 3, x74 = 2.5, and the βj values are β1 = 0.25, β2 = 2, β3 = 0.5, β4 = 1. For understanding the GLM, it is important to be clear about which variables are known at which point of a research project: the values of the independent experimental variables xij, i = 1, ..., n, j = 1, ..., p are specified by the researcher and are hence known as soon as the researcher has decided on the design of a given experiment. The dependent experimental variable values yi, i = 1, ..., n are known as soon as the researcher has collected data in response to the independent experimental variable values xi1, ..., xip. However, how much each of the independent experimental variables xi1, ..., xip contributes to the sum on the right-hand side of (1.2), and thus to the observed data on the left-hand side of (1.2), is unknown at this point. In other words, the weighting coefficients β1, ..., βp are not known to the researcher in advance and have to be estimated. As discussed in Section 1.1, the process of identifying these parameter values is referred to as model estimation and will be discussed in detail in subsequent Sections.
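The weighted sum in the numerical example (1.4) can be written as a dot product. The following minimal sketch, added here for illustration only and using NumPy as an assumed tool, reproduces the value y7 = 10:

```python
# Sketch: the numerical example of eq. (1.4) as a weighted sum (dot product).
import numpy as np

x_7 = np.array([16.0, 1.0, 3.0, 2.5])   # values of the independent variables x_71, ..., x_74
beta = np.array([0.25, 2.0, 0.5, 1.0])  # beta parameters beta_1, ..., beta_4
y_7 = x_7 @ beta                        # x_71*beta_1 + x_72*beta_2 + x_73*beta_3 + x_74*beta_4
print(y_7)                              # 10.0
```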


Structural aspects of simple linear regression

Expressions such as eq. (1.2) are often referred to as multiple linear regression models and are usually introduced as generalizations of simple linear regression models to scenarios of more than one independent experimental variable. In the current Section, we consider the structural aspects of simple linear regression models to introduce the GLM matrix notation. In undergraduate statistics, simple linear regression models are often written as

y = a + bx, (1.5)

where y is referred to as the dependent experimental variable, a is referred to as the offset, b is referred to as the slope, and x is referred to as the independent experimental variable. Crucially, eq. (1.5) encodes the idea that if we know the values of x, b, and a, we can compute the value of y. Let us hence assume that we would like to compute the value of y for five different values of x, namely,

x12 = 0.2, x22 = 1.4, x32 = 2.3, x42 = 0.7, and x52 = 0.5. (1.6)

Notably, the values of x and y are allowed to vary, whereas the values of a and b are fixed. Let us hence assume that a = 0.8 and that b = 1.3. We may thus write the five values of y corresponding to the five values of x as

y1 = a + bx12 = 1 · 0.8 + 1.3 · 0.2

y2 = a + bx22 = 1 · 0.8 + 1.3 · 1.4

y3 = a + bx32 = 1 · 0.8 + 1.3 · 2.3 (1.7)

y4 = a + bx42 = 1 · 0.8 + 1.3 · 0.7

y5 = a + bx52 = 1 · 0.8 + 1.3 · 0.5.

Using matrix notation as formally introduced in Section 4 | Matrix algebra, we can equivalently express (1.7) as

\begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \end{pmatrix} = \begin{pmatrix} x_{11} & x_{12} \\ x_{21} & x_{22} \\ x_{31} & x_{32} \\ x_{41} & x_{42} \\ x_{51} & x_{52} \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} 1 & 0.2 \\ 1 & 1.4 \\ 1 & 2.3 \\ 1 & 0.7 \\ 1 & 0.5 \end{pmatrix} \begin{pmatrix} 0.8 \\ 1.3 \end{pmatrix} = \begin{pmatrix} 1 \cdot 0.8 + 1.3 \cdot 0.2 \\ 1 \cdot 0.8 + 1.3 \cdot 1.4 \\ 1 \cdot 0.8 + 1.3 \cdot 2.3 \\ 1 \cdot 0.8 + 1.3 \cdot 0.7 \\ 1 \cdot 0.8 + 1.3 \cdot 0.5 \end{pmatrix}. (1.8)

Note that in (1.8) we have introduced another variable xi1, i = 1, ..., n, which takes on the value 1 for all values of yi for i = 1, ..., n and serves the purpose of including the offset 0.8 on the right-hand side. Independent variables that take on only the values 0, 1, or −1 are sometimes referred to as dummy variables. What is the benefit of rewriting eq. (1.7) in the form of eq. (1.8)? Conceptually nothing has changed, but notationally, we can now express the relatively large expression (1.7) much more compactly. To do so, we define

y := \begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \end{pmatrix}, X := \begin{pmatrix} x_{11} & x_{12} \\ x_{21} & x_{22} \\ x_{31} & x_{32} \\ x_{41} & x_{42} \\ x_{51} & x_{52} \end{pmatrix}, and β := \begin{pmatrix} a \\ b \end{pmatrix}. (1.9)

Moreover, the definition of β in (1.9) can be simplified and aligned to the notation used in the previous Section by setting β1 := a and β2 := b, i.e., by defining

β := \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix}. (1.10)

Take note of the dimensions of y, X, and β: y is a 5 × 1 vector, X is a 5 × 2 matrix, and β is a 2 × 1 vector. In matrix form, we can thus write (1.7) very compactly as

y = Xβ, where y ∈ R5, X ∈ R5×2, and β ∈ R2. (1.11)

Matrix notation thus allows for neatly summarizing sets of linear equations as made explicit in (1.7). Moreover, as will become clear in subsequent Sections, matrix algebra also allows for writing other aspects of the GLM, such as parameter estimation and the evaluation of statistics, in very compact forms that can readily be implemented in computer code.
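To make this compactness concrete, the following sketch, which is not part of the original text and uses NumPy as an illustrative choice, builds the design matrix of eq. (1.8) and evaluates y = Xβ for the offset a = 0.8 and slope b = 1.3:

```python
# Sketch: the simple linear regression design of eq. (1.8) in matrix form.
import numpy as np

x = np.array([0.2, 1.4, 2.3, 0.7, 0.5])  # independent variable values x_12, ..., x_52
X = np.column_stack([np.ones(5), x])     # 5 x 2 design matrix with a constant first column
beta = np.array([0.8, 1.3])              # offset a and slope b
y = X @ beta                             # y = X beta, i.e., eq. (1.11) without an error term
print(y)                                 # approximately [1.06 2.62 3.79 1.71 1.45]
```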


To conclude this Section, consider again the GLM equation (1.1). In comparison to eq. (1.11), it is apparent that we have not yet considered the error term ε. In fact, the right-hand side of eq. (1.11) merely describes the structural or deterministic aspect of the GLM. In the following Section, we shall thus consider the error term, which reflects the probabilistic aspect of the GLM and provides an essential contribution to the data y and its conception as a random variable.

Probabilistic aspects

Before delving into the meaning of the error term ε in the GLM equation (1.1), we shall summarize in more general form what we have learned so far. To this end, we first note that a fundamental aspect of the GLM equation (1.1) is that it generalizes many experimental design cases, such as the simple linear regression design discussed above. In all generality and using matrix notation as introduced in Section 4 | Matrix algebra, the structural elements of the GLM equation (1.1) take the forms y ∈ Rn, X ∈ Rn×p, and β ∈ Rp. Note that the design matrix always has as many rows as there are data values (n) and as many columns as there are parameter values (p). Explicitly, we may thus write the GLM equation (1.1) as

\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{pmatrix} \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_p \end{pmatrix} + ε. (1.12)

We now consider ε in (1.12) in more detail. We first note that because X ∈ Rn×p and y ∈ Rn, ε must also be an n-dimensional real vector, i.e., ε ∈ Rn, and we thus have

\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{pmatrix} \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_p \end{pmatrix} + \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix}. (1.13)

We next consider the ith row of (1.13), which reads

yi = xi1β1 + xi2β2 + ... + xipβp + εi. (1.14)

The right-hand side of (1.14) now corresponds to the full GLM assumption about the ith data value yi and comprises two categorically different entities. The first part xi1β1 + xi2β2 + ... + xipβp is the structural, deterministic part already discussed above. The value εi, on the other hand, is conceived as the realization of a random variable. This means that the values εi for i = 1, ..., n are governed by random variables and their associated probability distributions. We might know some parameters of these probability distributions, but the exact values of the εi’s do not follow deterministically from this knowledge. Eq. (1.14) thus implies that the value yi is given by the sum of a deterministic and a probabilistic term. Next, consider obtaining sample values εi and adding them to the deterministic value

µi := xi1β1 + xi2β2 + ... + xipβp, (1.15) such that yi = µi + εi. (1.16)

In eq. (1.16), µi is a deterministic value and εi is a random variable realization. We now make the central assumption that the values εi are drawn from independent univariate Gaussian distributions with specified expectation parameter 0 and variance parameter σ2 > 0, which will formally be introduced in Section 5 | Probability theory and Section 6 | Probability distributions. For small values of σ2, the sampled values εi will be close to zero, but on occasion they may be a little bit positive or a little bit negative. Consider drawing the sample values ε1 = 0.200, ε2 = −0.001, ε3 = 0.050 for µ1 = µ2 = µ3 = 1. If we evaluate (1.16) for these values, we obtain

y1 = µ1 + ε1 = 1 + 0.200 = 1.200

y2 = µ2 + ε2 = 1 − 0.001 = 0.999 (1.17)

y3 = µ3 + ε3 = 1 + 0.050 = 1.050.

The most important thing to realize about (1.17) is that despite the fact that each yi has the same deterministic aspect µi = 1 for i = 1, 2, 3, the values yi, i = 1, 2, 3, still vary, because realizations of random variables are added to the µi for i = 1, 2, 3. Crucially, this renders the yi themselves realizations of random variables. We can also infer how the random variables they result from are distributed: because the random variables governing the εi's have an expectation of zero, the expectation of the random variables governing the yi's will correspond to the deterministic aspects µi. The variance of the random variables governing the yi's, on the other hand, corresponds to the variance of the random variables governing the εi's. There are two ways to express this more formally. We can either state that

yi = µi + εi, (1.18)

where εi is a realization of a random variable distributed according to a univariate Gaussian distribution with expectation parameter 0 and variance parameter σ2, such that in distribution form (Section 7 | Probability distributions) we may write

εi ∼ N(0, σ2). (1.19)

Equivalently, we may state that yi is a realization of a random variable distributed according to a univariate Gaussian distribution with expectation parameter µi and variance parameter σ2, such that in distribution form we may write

yi ∼ N(µi, σ2). (1.20)

Formally, (1.20) follows directly from application of the linear-affine transformation theorem for Gaussian distributions to εi under addition of µi (as introduced in Section 7 | Probability distributions). Next, recall that

µi = xi1β1 + xi2β2 + ... + xipβp, (1.21)

which may be re-expressed using matrix multiplication as

µi = xiβ, (1.22)

where we defined xi ∈ R1×p as the row vector

xi := (xi1  xi2  ...  xip) ∈ R1×p, (1.23)

and β ∈ Rp as the column vector

β := \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_p \end{pmatrix}. (1.24)

Note that xi ∈ R1×p in (1.22) corresponds to the ith row of the design matrix X ∈ Rn×p of the GLM equation (1.1) and that β ∈ Rp in (1.22) corresponds to the beta parameter of the GLM equation (1.1). We may thus rewrite (1.20) as

yi ∼ N(xiβ, σ2). (1.25)

Notably, we consider this probability distribution as parameterized by β and σ2, but not by xi, as the values of the independent experimental variable usually do not correspond to a model parameter. Eq. (1.25) summarizes an essential aspect of the GLM: the ith data point yi is assumed to be a realization of a univariate Gaussian probability distribution with expectation parameter governed by the ith row of the design matrix and the beta parameter β, as well as a variance parameter that is determined by the variance parameter of the error term.

Finally, we can consider a more macroscopic perspective on the error terms εi, i = 1, ..., n. In Section 7 | Probability distributions, we will see that the probability density function of n independent univariate Gaussian random variables with expectation parameters µi, i = 1, ..., n and common variance parameter σ2 is identical to the probability density function of the random vector comprising these random variables with expectation parameter vector µ := (µ1, ..., µn)T and spherical covariance matrix σ2In. Applied to the current scenario, we may write this identity as

∏_{i=1}^{n} N(εi; 0, σ2) = N(ε; 0n, σ2In), (1.26)

where ε := (ε1, ..., εn)T ∈ Rn denotes the vector of error terms as introduced in (1.13) and 0n ∈ Rn denotes a vector of zeros. By noting that the product of the design matrix X ∈ Rn×p and the parameter vector β ∈ Rp results in a vector µ := Xβ ∈ Rn, the left-hand side of (1.13) hence corresponds to a special case of the linear-affine transformation theorem for Gaussian distributions, for which the transformation matrix conforms to the identity matrix In. To make this explicit, let ε denote a random vector distributed according to an n-variate Gaussian distribution with expectation parameter 0n and covariance matrix parameter σ2In, let In denote a transformation matrix, and let Xβ ∈ Rn. Then the random vector

y = Inε + Xβ = Xβ + ε with ε ∼ N(0n, σ2In) (1.27)

is distributed according to the n-variate Gaussian distribution

y ∼ N(Xβ, σ2In). (1.28)

Eq. (1.28) is central for the theory of the GLM as discussed in the following. With respect to the GLM equation (1.1), it puts a stronger emphasis on the GLM as a probabilistic model and will form the basis for GLM parameter estimation and GLM model evaluation in all subsequent Sections. Note that throughout, we have made the explicit assumption that the error term ε is distributed according to a Gaussian distribution. While accounts of the GLM exist that only postulate that ε is random (but not necessarily Gaussian distributed), these accounts do not align well with the classical Frequentist theory of GLM inference and testing, as will become evident in Section 8 | Maximum likelihood estimation and Section 9 | Frequentist distribution theory. Moreover, standard Bayesian conjugate inference using Gaussian prior distributions for the beta parameter also necessitates Gaussian error terms.

Figure 1.3. A sample of a simple linear regression model, obtained by sampling an 11-dimensional Gaussian distribution. Here, the design matrix comprises a column of 11 ones and a column of the values 0, 1, 2, ..., 10, corresponding to the matrix entries x12, x22, ..., x11,2. The black dots depict the result of the multiplication of this design matrix with the beta parameter vector β := (1, 1)T, resulting in the Gaussian expectation vector µ := Xβ ∈ R11. The blue dots depict the result of sampling the 11-dimensional Gaussian distribution with expectation parameter µ := Xβ ∈ R11 and spherical covariance matrix σ2I11, where σ2 := 3. The abscissa shows the independent variable (xi2) from 0 to 10 and the ordinate the dependent variable (glm 1.m).

Probabilistic aspects of simple linear regression

To conclude the current Section, we briefly discuss how eq. (1.28) can be used for probabilistically sampling a simple linear regression model. To this end, we first note that most programming environments provide the functionality to sample n-dimensional Gaussian distributions with specified expectation and covariance matrix parameters (e.g., the routine scipy.stats.multivariate_normal of Python's SciPy package or the mvnrnd.m function in Matlab). To sample a simple linear regression model with n data points according to eq. (1.28), we thus define (1) the expectation parameter by the matrix product of the simple linear regression design matrix X ∈ Rn×2 and a chosen parameter vector β ∈ R2 and (2) the covariance matrix parameter by multiplying a chosen variance parameter σ2 > 0 with the n × n-dimensional identity matrix In. Figure 1.3 visualizes this process.
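As a sketch of this procedure, and assuming Python with SciPy as the programming environment (the scipy.stats.multivariate_normal routine mentioned above), the design of Figure 1.3 may be sampled as follows; the random seed is an illustrative assumption:

```python
# Sketch: sampling the simple linear regression model of Figure 1.3
# according to eq. (1.28), y ~ N(X beta, sigma^2 I_n).
import numpy as np
from scipy import stats

n = 11
X = np.column_stack([np.ones(n), np.arange(n)])  # design matrix: a constant column and 0, 1, ..., 10
beta = np.array([1.0, 1.0])                      # beta parameter vector
sigma_sq = 3.0                                   # variance parameter sigma^2

mu = X @ beta                                    # expectation parameter mu = X beta
Sigma = sigma_sq * np.eye(n)                     # spherical covariance matrix sigma^2 I_n
y = stats.multivariate_normal.rvs(mean=mu, cov=Sigma, random_state=0)
print(y)                                         # one realization of the data vector y
```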

1.4 Bibliographic remarks

Efron and Hastie (2016) provide an excellent overview of the historical development of statistical methodology from the beginning of the twentieth century until today. Wasserman (2004) provides a concise overview of Frequentist and Bayesian methodology. Bishop (2006), Barber (2012), and Murphy (2012) cover a wide range of statistical techniques from the machine learning perspective. Friston (2007) provides a comprehensive, but technically advanced, overview of most of the techniques discussed in the following and a wide variety of additional methods implemented in the SPM Matlab toolbox (https://www.fil.ion.ucl.ac.uk/spm/). The current Section is modelled on the verbose discussion of the GLM in the later chapters of applied statistics textbooks, such as Hays (1994).

1.5 Study questions

1. Give definitions of the terms model formulation, model estimation, and model evaluation.
2. What is the difference between static and dynamic models?
3. Provide a brief overview of differences and commonalities between Frequentist and Bayesian statistics.
4. Define the terms independent experimental variable, dependent experimental variable, categorical variable, and continuous variable.
5. Explain the difference between within- and between-participant experimental designs.
6. Consider the GLM equation y = Xβ + ε (1.29). Which of the symbols y, X, β, ε represents independent experimental variables, and which of the symbols represents dependent experimental variables?
7. Consider the GLM equation y = Xβ + ε (1.30). In an experimental context, which of the components of this equation are known before performing an experiment, and which of the components are known after the experiment but before estimating the model?
8. The design matrix X is of dimensionality n × p. What do n ∈ N and p ∈ N represent, respectively?
9. Express the GLM (matrix) equation y = Xβ + ε (1.31) as a set of (simultaneous non-matrix) equations for n := 4 and p := 3.
10. Name and explain the components and their properties of the GLM equation y = Xβ + ε (1.32).

2 | Sets, sums, and functions

In this Section, we review very fundamental and very important mathematical concepts. Essentially, one of the main aims of modern mathematics is to express mathematical content as sets and functions between sets. For example, a data-analytical model can be understood as a function that maps experimental conditions onto observable data under additive stochastic noise. Stochastic noise can be modelled as a random variable, which is formally defined as a function from a probability space to an outcome space. Model estimation can be understood as a function that maps data onto statistics, and model evaluation can be understood as a function that maps models and data onto model evaluation criteria. Being able to mathematically formulate sets and functions is an important precondition for implementing data analysis methods in computer code: standard imperative programming rests on defining functions on predefined sets of data structures, while object-oriented programming allows for representing close associations between sets of data and functions defined on that data. To provide the necessary mathematical language for these considerations, we first review the notion of a set and important sets of numbers that can be used to represent quantitative data (Section 2.1). We next consider important mathematical notation for representing basic mathematical entities, such as sums, products, and exponentiation operations (Section 2.2). Finally, we review the abstract notion of a function (or mapping) and some essential functions that will be encountered repeatedly in the context of the GLM (Section 2.3).

2.1 Sets

A set may be defined according to Cantor (1895) as follows: “A set is a gathering together into a whole of definite, distinct objects - which are called elements of the set.” We primarily use sets as a means to identify the mathematical objects we are dealing with. Sets are usually denoted using curly brackets. For example, the set A comprising the first five lower-case letters of the Roman alphabet is denoted as

A := {a, b, c, d, e} . (2.1)

Note that the symbol “:=” is used to denote a definition and the symbols “=” and “≠” are used to denote equalities and inequalities. Definitions are justified by their utility in a given context and cannot be false, whereas equalities and inequalities follow from definitions by logical inference. There are three ways to define sets:

• A set may be defined by listing the elements of the set as in expression (2.1).

• A set may be defined by specifying the properties of the elements of the set, for example as in

B := {x|x is one of the first five lowercase letters of the alphabet}. (2.2)

Here, the variable x in front of the vertical bar denotes the elements of the set in a generic way and the statement after the vertical bar expresses the defining properties of the set.

• Finally, a set may be defined by defining it to be equal to another set, for example,

C := N, (2.3)

where N denotes the set of natural numbers to be introduced below.

Elements, cardinality, subsets, and supersets

To indicate that b is an element of a set A, we write

b ∈ A, (2.4)

which should be read as “b is in A” or “b is an element of A”. To indicate that, for example, 2 is not an element of a set A, we write

2 ∉ A, (2.5)

which may be read as “2 is not in A” or “2 is not an element of A”. The number of elements comprising a set is called the cardinality of the set and is denoted by vertical bars. For example, for the sets defined in (2.1) and (2.2), it holds that

|A| = |B| = 5, (2.6)

because both sets contain five elements. A set that does not contain any elements has a cardinality of zero and is called the empty set. Empty sets are denoted by ∅. If a set B contains all elements of another set A and A contains some additional elements, i.e., the two sets are not equal, then B is said to be a subset of A, denoted as B ⊂ A, and A is said to be a superset of B, denoted as A ⊃ B. For example, if A := {1, 2, a, b} and B := {1, a}, then B ⊂ A and A ⊃ B, because all elements of B are also in A. If a set B may either be a subset of or equal to another set A, the notations B ⊆ A and A ⊇ B are used.
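For readers who want to experiment with these notions, the following minimal Python sketch, which is not part of the original text, expresses membership, cardinality, and the subset and superset relations for the example sets A := {1, 2, a, b} and B := {1, a}:

```python
# Sketch: set membership, cardinality, and subset relations in Python.
A = {1, 2, "a", "b"}
B = {1, "a"}

print("a" in A)        # True:  "a" is an element of A
print(2 in B)          # False: 2 is not an element of B
print(len(A), len(B))  # cardinalities |A| = 4 and |B| = 2
print(B < A)           # True:  B is a subset of A (and not equal to it)
print(A >= B)          # True:  A is a superset of B or equal to it
```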

Unions, intersections, differences, and complements

Let M and N be two arbitrary sets. Then

M ∪ N := {x|x ∈ M or x ∈ N} (2.7) defines the union of the two sets M and N. The union of two sets is the set which comprises all elements that are either in M (only), in N (only), or in both M and N. The “or” in the definition of M ∪ N is thus understood in an inclusive logical “and/or” manner, rather than in the exclusive logical “or” way. As an example, for M := {1, 2, 3} and N := {2, 3, 5, 7}, we have

M ∪ N = {1, 2, 3, 5, 7}. (2.8)

The intersection of two sets M and N is defined as

M ∩ N := {x|x ∈ M and x ∈ N}. (2.9)

The intersection M ∩N is thus a set that only comprises elements that are both in M and N. For example, for M := {1, 2, 3} and N := {2, 3, 5, 7}, we have

M ∩ N = {2, 3}, (2.10) because 2 and 3 are the only numbers that are both in M and N. If the intersection of two sets is the empty set, the two sets are said to be disjoint. The difference of two sets is defined as

M \ N := {x|x ∈ M and x∈ / N}. (2.11)

The set M \N thus comprises all elements which are in M, but are not in N. For example, for M := {1, 2, 3} and N := {2, 3, 5, 7}, we have M \ N = {1}, (2.12) because the elements 2, 3 ∈ M are also in N. The remaining elements of N do not play a role in the evaluation of the difference set. Notably, the difference of two sets is not symmetric. For example, for M := {1, 2, 3} and N := {2, 3, 5, 7}, we have

N \ M = {5, 7} ≠ {1} = M \ N. (2.13)

Finally, assume that N ⊂ M. Then the complement of N with respect to M is the set of all elements of M which are not in N. We denote and define this complement as

N^c := {x ∈ M | x ∉ N}. (2.14)

Alternatively, we may state that

N^c = M \ N. (2.15)

The complement as defined here is sometimes also referred to as absolute complement.
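The set operations of this subsection map directly onto Python's built-in set type. The following sketch, added here for illustration with a hypothetical superset O for forming complements, reproduces the example results (2.8), (2.10), (2.12), and (2.13):

```python
# Sketch: union, intersection, difference, and complement for the example sets
# M = {1, 2, 3} and N = {2, 3, 5, 7}; O is a hypothetical superset for complements.
M = {1, 2, 3}
N = {2, 3, 5, 7}
O = {1, 2, 3, 4, 5, 6, 7}

print(M | N)   # union, cf. eq. (2.8)
print(M & N)   # intersection, cf. eq. (2.10)
print(M - N)   # difference M \ N, cf. eq. (2.12)
print(N - M)   # difference N \ M: not symmetric, cf. eq. (2.13)
print(O - N)   # complement of N with respect to O, cf. eq. (2.15)
```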


Figure 2.1. Venn diagram visualization of the identity M ∪ N = (M ∩ N^c) ∪ (M ∩ N) ∪ (M^c ∩ N) for two sets M and N that are subsets of a superset O.

Example. As an example for the notions of unions, intersections, and complements, we highlight that the union of two sets M and N, which are both subsets of a set O can be expressed as

M ∪ N = (M ∩ N^c) ∪ (M ∩ N) ∪ (M^c ∩ N). (2.16)

Note that it is implied that the formation of complements is with respect to the superset O. In lieu of a formal proof, we provide a Venn diagram visualization of the statement in Figure 2.1. Finally, some notational remarks. Both unions and intersections may be applied to multiple, indexed sets. For example, if A1, A2, and A3 are sets, then the union of these three sets is

A = A1 ∪ A2 ∪ A3. (2.17)

Note that this set comprises all elements which are in A1 and/or in A2 and/or A3. To simplify the notation of the union of many indexed sets, the index and its maximal value can be sub- and superscripted at the union symbol, such that the above can be written as

A = ∪_{i=1}^{3} Ai = A1 ∪ A2 ∪ A3. (2.18)

Analogously, the union of an infinite number of sets A1,A2, ... is denoted as

A = ∪_{i=1}^{∞} Ai, (2.19)

B = ∩_{i=1}^{∞} Bi. (2.20)

Power sets

Let M denote a set. Then the power set P(M) of M is the set of all subsets of M. The power set always includes the empty set and the original set M. Without proof, we note that the cardinality of the power set of a set with cardinality n is 2^n, i.e., if |M| = n, then |P(M)| = 2^n.

Example. For example, let M := {1, 2, 3}. (2.21) Then the power set of M is

P(M) = {∅, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}}. (2.22)

Note that |M| = 3 and |P(M)| = 2^3 = 8.
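A power set can be enumerated programmatically by collecting all combinations of all sizes. The following sketch, not part of the original text, reproduces (2.22) for M = {1, 2, 3} using Python's itertools:

```python
# Sketch: enumerating the power set of M = {1, 2, 3} via itertools.
from itertools import chain, combinations

M = {1, 2, 3}
elements = sorted(M)  # fix an element order for reproducible output
subsets = list(chain.from_iterable(
    combinations(elements, r) for r in range(len(elements) + 1)
))
print(subsets)        # [(), (1,), (2,), (3,), (1, 2), (1, 3), (2, 3), (1, 2, 3)]
print(len(subsets))   # 8 = 2**len(M), consistent with |P(M)| = 2^|M|
```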

Selected sets of numbers

We next introduce a selection of sets of numbers that are essential for modelling quantitative data.

The set of natural numbers, also referred to as positive integers, is denoted by N and is defined as

N = {1, 2, 3,...}, (2.23) where the dots “...” denote “to infinity”. Subsets of the set of natural numbers are the sets of natural numbers of order n ∈ N, which are defined as

Nn := {1, 2, . . . , n}. (2.24)


The union of the set of natural numbers and zero is denoted by N0, i.e., N0 := N ∪ {0}. If the “negative natural numbers” are added to the set N, the set of integers, which is defined as

Z := {..., −3, −2, −1, 0, 1, 2, 3, ...}, (2.25)

results. In turn, adding ratios of integers to Z yields the set of rational numbers, which is defined as

Q := {p/q | p, q ∈ Z, q ≠ 0}. (2.26)

The most important set of numbers for the purposes of quantitative data analysis is the set of real numbers, denoted by R. The real numbers R are a superset of the rational numbers, i.e., Q ⊂ R. In addition to the rational numbers, the real numbers include the solutions of some algebraic equations, for example √2, the solution of the equation x^2 = 2, which is not an element of Q. These numbers are called irrational numbers. Additionally, the real numbers include the limits of sequences of irrational numbers, such as π ≈ 3.14. Intuitively, the set of real numbers is the set one thinks of when referring to “continuous experimental variables”. Notably, there exist infinitely many real numbers between any two distinct x ∈ R and y ∈ R, while R also extends to negative and positive infinity (but neither includes negative nor positive infinity). More formally, it can be shown that there are more real numbers than natural numbers (Cantor, 1892), i.e., that the real numbers are an uncountable infinite set. Finally, on occasion, one restricts attention to the non-negative or positive real numbers, which are denoted by

R≥0 := {x ∈ R|x ≥ 0} and R>0 := {x ∈ R|x > 0}, (2.27) respectively.

Intervals

Contiguous subsets of the real numbers are referred to as intervals. We will primarily deal with closed intervals, i.e., subsets of R which are defined in terms of the upper and lower boundary values a, b ∈ R by

[a, b] := {x ∈ R|a ≤ x ≤ b}. (2.28) Note that both a and b are elements of [a, b] and that [a, b] is defined as the empty set if b < a. In addition to closed intervals, one may define three further types of intervals

]a, b] := {x ∈ R | a < x ≤ b},
[a, b[ := {x ∈ R | a ≤ x < b}, (2.29)
]a, b[ := {x ∈ R | a < x < b},

referred to as left- or right-semi-open and open intervals. Note, for example, that the open interval ]a, b[ neither contains a nor b.

Cartesian products

Let M and N denote two sets. Then the Cartesian product of M and N is the set of all ordered tuples (m, n) for which m ∈ M and n ∈ N. The Cartesian product of M and N thus comprises the dyads of all elements of M with all elements of N. Formally, the Cartesian product of two sets M and N is denoted and defined as

M × N := {(m, n) | m ∈ M, n ∈ N}. (2.30)

The cardinality of the Cartesian product of two sets M and N with finite cardinalities is given by

|M × N| = |M| · |N|. (2.31)

The Cartesian product of a set with itself is denoted and defined as

M^2 := M × M := {(m, m′) | m ∈ M, m′ ∈ M}. (2.32)


Examples. Consider the sets M := {1, 2, 3} and N := {2, 4}. Then the Cartesian product of M and N is the set M × N := {(1, 2), (1, 4), (2, 2), (2, 4), (3, 2), (3, 4)}. (2.33)

Similarly, consider the sets I1 := [1, 2] and I2 := [3, 5]. Then the Cartesian product of I1 and I2 is the “rectangle”

I1 × I2 = [1, 2] × [3, 5] = {(x1, x2) | 1 ≤ x1 ≤ 2, 3 ≤ x2 ≤ 5}. (2.34)

The Cartesian product of two sets can be generalized to the Cartesian product of a finite set of n sets. To this end, let M1, ..., Mn denote a set of n sets. Then the Cartesian product of M1, ..., Mn is denoted and defined as

The Cartesian product of M1, ..., Mn is thus the set of ordered n-tuples (m1, ..., mn) for which mi ∈ Mi, i = 1, ..., n. As a special case, the n-fold Cartesian product of a set X with itself is denoted and defined as n n Y X := Xi := {(x1, ..., xn)|x1 ∈ X, ..., xn ∈ X}. (2.36) i=1

The set Rn. The n-fold Cartesian product of the set of real numbers with itself is denoted and defined as n n Y R := R = {x := (x1, ..., xn)|xi ∈ R}. (2.37) i=1 n n R is thus the set of ordered n-tuples (x1, ..., xn), for which xi ∈ R, i = 1, ..., n. The elements of R are typically denoted as column lists   x1  .  x :=  .  (2.38) xn and referred to as n-dimensional real vectors. We thus often write “x ∈ Rn” to express that x is a column vector of n real numbers. An example for a four-dimensional real vector x ∈ R4 is 0.16 1.76 x =   . (2.39) 0.23 7.01

From the perspective of the set Rn, the special case R1 = R is the set of real numbers, the elements of which are often referred to as scalars.

2.2 Sums, products, and exponentiation

We next introduce three often encountered concepts that relate to basic mathematical operations: the sum symbol, the product symbol, and exponentiation.

Sums and products

In mathematics, one often has to add numbers. A concise way to represent sums is afforded by the sum symbol

∑ (2.40)

The sum symbol is reminiscent of the Greek letter Sigma (Σ), corresponding to the Roman capital S and thus mnemonic for sum. The terms summed over are denoted to the right of the sum symbol, usually with the help of indices. For example, for x1, x2, x3 ∈ R, we can write the equation

x1 + x2 + x3 = y (2.41)

in shorthand notation as

∑_{i=1}^{3} xi = y. (2.42)


The subscript on the sum symbol indicates the running index and its initial value, here given by i and 1, respectively. The superscript on the sum symbol denotes the final value of the running index, here given by i = 3. To become familiar with the sum symbol, we consider the following examples

a = 1 + 4 + 9 + 16 + 25 + 36 + 49 + 64 + 81 + 100 (2.43)

b = 1 · x1 + 2 · x2 + 3 · x3 + ... + n · xn (2.44) c = 2 + 2 + 2 + 2 + 2. (2.45)

Using the sum symbol, the sums a, b, and c may be written as follows: for a, all squares of the natural numbers from 1 to 10 are summed up. We thus write

a = ∑_{i=1}^{10} i^2 = 1 + 4 + 9 + 16 + 25 + 36 + 49 + 64 + 81 + 100. (2.46)

Note that the denotation of the index variable is irrelevant, as we have for example

a = ∑_{i=1}^{10} i^2 = ∑_{j=1}^{10} j^2. (2.47)

For b, we are given the numbers x1, . . . , xn ∈ R and have to multiply each one with its index and then add them all up. We thus write

b = ∑_{i=1}^{n} i · xi = 1 · x1 + 2 · x2 + 3 · x3 + ... + n · xn. (2.48)

Finally, for c we have to add the number 2 five times. For this, we write

c = ∑_{i=1}^{5} 2 = 2 + 2 + 2 + 2 + 2. (2.49)

Constant multiplicative factors in sums, i.e., multiplicative factors that do not depend on the sum index, may either be written to the right or to the left of the sum symbol. To see this, consider the arithmetic mean x̄ of n real numbers: the arithmetic mean of n real numbers x1, x2, ..., xn is defined as the sum of the n numbers divided by n,

x̄ := (x1 + x2 + ... + xn)/n. (2.50)

Using the sum symbol notation, we have

x̄ = (x1 + x2 + ... + xn)/n = x1/n + x2/n + ... + xn/n = ∑_{i=1}^{n} xi/n = ∑_{i=1}^{n} (1/n) xi, (2.51)

or, equivalently,

x̄ = (x1 + x2 + ... + xn)/n = (∑_{i=1}^{n} xi)/n = (1/n) ∑_{i=1}^{n} xi. (2.52)

We have thus shown that

∑_{i=1}^{n} (1/n) xi = (1/n) ∑_{i=1}^{n} xi. (2.53)

Constant factors under the sum symbol may thus be taken out of the summation operation. Finally, another common mathematical operation is the multiplication of numbers. To this end, the product sign Π (the Greek capital Pi, for product) allows for writing products of multiple factors in a concise manner. In complete analogy to the sum symbol, the product symbol has the following semantics

$$\prod_{i=1}^{n} a_i = a_1 \cdot a_2 \cdot \ldots \cdot a_n. \tag{2.54}$$
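For illustration, the sum and product symbols and the constant-factor property (2.53) can be evaluated numerically. The following is a minimal Python sketch using NumPy; it is an illustration only and not part of the accompanying MATLAB scripts, and the example numbers are chosen arbitrarily.

```python
import numpy as np

# Example a: the sum of the squares of the natural numbers from 1 to 10, cf. (2.46)
a = sum(i**2 for i in range(1, 11))            # 385

# Example c: adding the number 2 five times, cf. (2.49)
c = sum(2 for _ in range(5))                   # 10

# Product symbol, cf. (2.54): 1*2*3*4 = 24
p = np.prod([1, 2, 3, 4])

# Constant-factor property (2.53): sum_i (1/n) x_i equals (1/n) sum_i x_i,
# and both equal the arithmetic mean (2.50)
x = np.array([3.0, 2.0, 5.0, -2.0])            # arbitrary example numbers
n = len(x)
lhs = sum((1.0 / n) * xi for xi in x)
rhs = (1.0 / n) * sum(xi for xi in x)
print(a, c, p, np.isclose(lhs, rhs), np.isclose(lhs, x.mean()))
```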


Exponentiation

Finally, we review some basic aspects of the exponentiation operation. For $a \in \mathbb{R}$ and $n \in \mathbb{N}_0$, "$a$ to the power of $n$" is defined recursively as

$$a^0 := 1 \quad\text{and}\quad a^{n+1} := a^n \cdot a. \tag{2.55}$$

Further, for $a \in \mathbb{R} \setminus \{0\}$ and $n \in \mathbb{N}$, "$a$ to the power of minus $n$" is defined by
$$a^{-n} := (a^n)^{-1} := \frac{1}{a^n}. \tag{2.56}$$
In $a^n$, $a$ is referred to as the base and $n$ is referred to as the exponent or power. Based on the definitions in (2.55) and (2.56), the following familiar laws of exponentiation can be derived, which hold for all $a, b \in \mathbb{R}$ and $n, m \in \mathbb{Z}$ (and, in the case of negative powers, $a \neq 0$):
$$a^n a^m = a^{n+m}, \tag{2.57}$$

$$(a^n)^m = a^{nm}, \tag{2.58}$$
$$(ab)^n = a^n b^n. \tag{2.59}$$

The $n$th root of a number $a \in \mathbb{R}$ is defined as a number $r \in \mathbb{R}$ such that its $n$th power equals $a$:
$$r^n := a. \tag{2.60}$$

From this definition it follows that the $n$th root may equivalently be written using a rational exponent

$$r = a^{\frac{1}{n}}, \tag{2.61}$$
because, with (2.58) and (2.57), it then follows that

$$r^n = \left(a^{\frac{1}{n}}\right)^n = a^{\frac{1}{n}} \cdot a^{\frac{1}{n}} \cdot \ldots \cdot a^{\frac{1}{n}} = a^{\sum_{i=1}^{n} \frac{1}{n}} = a^1 = a. \tag{2.62}$$

The familiar square root of a number a ∈ R, a ≥ 0 may thus equivalently be written as

$$\sqrt{a} = a^{\frac{1}{2}}, \tag{2.63}$$
which, together with the laws of exponentiation (2.57)-(2.59), often simplifies the handling of square roots in mathematical expressions.

2.3 Functions

Functions are operations between sets and constitute rules that relate elements of one set to elements of another set. In the following, we review how to mathematically formulate functions, discuss some basic properties of functions, and finally review a collection of essential functions.

Formulation A function f is generally specified as

Formulation. A function $f$ is generally specified as
$$f : D \to R, \; x \mapsto f(x), \tag{2.64}$$
where the set $D$ is called the domain of the function $f$ and the set $R$ is called the range or co-domain of the function $f$. In (2.64), the statement $f : D \to R$ should be read as "the function $f$ maps all elements of the set $D$ onto elements of the set $R$". The statement $x \mapsto f(x)$ in (2.64) denotes the mapping of the domain element $x \in D$ onto the range element $f(x) \in R$ and should be read as "$x$, which is an element of $D$, is mapped by $f$ onto $f(x)$, which is an element of $R$". Note that the arrow $\to$ is used to denote the mapping between the two sets $D$ and $R$, and the arrow $\mapsto$ is used to denote the mapping between an element in the domain of $f$ and an element in the range of $f$. The most important aspect of (2.64) is that specifying functions in this way differentiates between the function $f$ proper, which corresponds to an abstract rule, and elements $f(x)$ of its range, which often correspond to numerical values. A function can thus be understood as a rule that relates two sets of quantities, the inputs and the outputs. Importantly,

each input $x$ to a function is related to an output of the function $f(x)$ in a deterministic fashion. While statements of the form (2.64) may appear cumbersome at first sight, they are of immense practical value in a computational context, because they directly lend themselves to programming implementations and the handling of different data types. In the mathematical definition of a function, the specification of its denomination as well as its domain and range is usually followed by a definition of its functional form. The functional form of a function specifies how the elements of a function's range are evaluated based on the elements of the function's domain. Consider the example of a familiar function, the square of a real number. Using the notation introduced above, this function is written as

$$f : \mathbb{R} \to \mathbb{R}, \; x \mapsto f(x) := x^2, \tag{2.65}$$
and we refer to $x^2$ as the functional form of $f$.

Function concatenation. If $f : D \to R$ and $g : R \to S$ are two functions for which the domain of $g$ corresponds to the range of $f$, then the two functions can be applied to an element in the domain of $f$ in succession. Formally, this concatenation of two functions is written as
$$g \circ f : D \to S, \; x \mapsto (g \circ f)(x) := g(f(x)). \tag{2.66}$$
Note that the function $g \circ f$ maps an element $x \in D$ onto an element $g(f(x)) \in S$. Intuitively, $g \circ f$ thus carries out a transformation of the form $D \to R \to S$. As an example for a concatenated function, consider the function

$$h : \mathbb{R} \to \mathbb{R}_{>0}, \; x \mapsto h(x) := \exp(-x^2). \tag{2.67}$$
From the perspective of concatenated functions, $h$ can be viewed as the concatenation of the functions

$$f : \mathbb{R} \to \mathbb{R}_{\leq 0}, \; x \mapsto f(x) := -x^2 \quad\text{and}\quad g : \mathbb{R}_{\leq 0} \to \mathbb{R}_{>0}, \; x \mapsto g(x) := \exp(x), \tag{2.68}$$
because
$$g \circ f : \mathbb{R} \to \mathbb{R}_{>0}, \; x \mapsto (g \circ f)(x) := g(f(x)) = \exp(-x^2). \tag{2.69}$$
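As a computational illustration of function concatenation, the following minimal Python sketch (added here for illustration, not part of the original materials) implements $f$, $g$, and $h = g \circ f$ from (2.67)-(2.69) and checks that applying them in sequence agrees with the direct definition of $h$.

```python
import numpy as np

def f(x):
    # f : R -> R_{<=0}, x |-> -x^2, cf. (2.68)
    return -x**2

def g(x):
    # g : R_{<=0} -> R_{>0}, x |-> exp(x), cf. (2.68)
    return np.exp(x)

def h(x):
    # h : R -> R_{>0}, x |-> exp(-x^2), cf. (2.67)
    return np.exp(-x**2)

x = np.linspace(-3, 3, 7)
print(np.allclose(g(f(x)), h(x)))   # True: (g o f)(x) = h(x)
```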

Basic properties of functions Functions can be categorized according to a basic set of properties that specify how a function relates elements of its domain to elements of its range. To define these properties, it is helpful to introduce the notions of a function’s image and preimage first. To this end, let f : D → R be a function and let x ∈ D. Then f(x), i.e., the element of R that x is mapped onto, is called the image of x under f. The entire subset of the range R for which images under f exist, i.e., the set

$$f(D) := \{y \in R \,|\, \text{there exists an } x \in D \text{ with } f(x) = y\} \subseteq R \tag{2.70}$$
is called the image of $D$ under $f$. Note that the image $f(D)$ and the range $R$ are not necessarily identical and that the image can be a subset of the range. If $y$ is an element of $f(D)$, then an $x \in D$ for which $f(x) = y$ holds is called a preimage of $y$ under $f$. The following relationships between images and preimages are important:
• Every element in the domain of a function is allocated exactly one image in the range under $f$.
• Not every element in the range of a function has to be a member of the image of $f$.
• If $y$ is an element of $f(D)$, then there may exist multiple preimages of $y$.
The standard example to understand these properties is the square function (2.65). While every real number $x \in \mathbb{R}$ may be multiplied by itself, and thus $y := x^2$ always exists, many $y \in \mathbb{R}$ do not have preimages under $f$, namely all negative numbers. This simply follows as the square of 0 is 0, the square of a positive number is a positive number, and the square of a negative number is a positive number. Finally, this function also has the property that for one $y \in f(\mathbb{R})$ there exist multiple preimages: for example, under $f$, 4 has the preimages 2 and $-2$. The relations between images and preimages of functions as just sketched are formalized by the concepts of injective, surjective, and bijective functions. These are defined as follows:



Figure 2.2. Non-surjective, non-injective, and bijective functions. A function that is non-surjective leaves some elements in its range without corresponding elements in its domain. A function that is non-injective maps multiple elements in its domain onto a single element in its range. A bijective mapping is a one-to-one mapping.

• Let $f : D \to R$ be a function. $f$ is called surjective if every element $y \in R$ is a member of the image of $f$, or in other words, if $f(D) = R$. If this is not the case, $f$ is said to be not surjective.
• $f$ is called injective if every element in the image of $f$ has exactly one preimage under $f$. If this is not the case, $f$ is said to be not injective.
• A function $f$ that is surjective and injective is called bijective or a one-to-one mapping.
Figure 2.2 illustrates a non-surjective function, a non-injective function, and a bijective function.

Linear and nonlinear functions. A function $f : D \to R$ is called linear if and only if for all $a, b \in D$ and a scalar $c$ the following properties hold:
$$f(a + b) = f(a) + f(b) \quad\text{and}\quad f(ca) = cf(a). \tag{2.71}$$
A function that is not linear is called a non-linear function. An example for a linear function is

f : R → R, x 7→ f(x) := ax (a ∈ R). (2.72)

That f is a linear function can be seen as follows: first, for all x, y ∈ R, we have f(x + y) = a(x + y) = ax + ay = f(x) + f(y). (2.73)

Second, for all c ∈ R, we also have f(cx) = acx = cax = cf(x). (2.74)

The function thus satisfies the definition of a linear function (2.71). On the other hand, the function

$$g : \mathbb{R} \to \mathbb{R}, \; x \mapsto g(x) := ax + b \quad (a, b \in \mathbb{R}) \tag{2.75}$$
is not a linear function. For example, with $a := 1$ and $b := 1$ we have for $x, y \in \mathbb{R}$
$$g(x + y) = 1(x + y) + 1 = x + y + 1 \neq x + 1 + y + 1 = g(x) + g(y). \tag{2.76}$$

Functions of the form g in (2.75) are referred to as linear-affine functions. Linear functions, unlike linear-affine functions, always map the zero element 0 onto the zero element 0.
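A brief numerical check of the linearity conditions (2.71) for the two example functions is given below; this Python sketch is an added illustration with arbitrarily chosen coefficients and test values.

```python
import numpy as np

a, b = 2.0, 1.0                      # arbitrary coefficients
f = lambda x: a * x                  # linear function, cf. (2.72)
g = lambda x: a * x + b              # linear-affine function, cf. (2.75)

x, y, c = 1.5, -0.7, 3.0             # arbitrary test values

# f satisfies both conditions of (2.71)
print(np.isclose(f(x + y), f(x) + f(y)), np.isclose(f(c * x), c * f(x)))   # True True

# g violates additivity and maps 0 onto b rather than onto 0
print(np.isclose(g(x + y), g(x) + g(y)), g(0.0))                           # False 1.0
```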

Inverse functions. Assume that $f : D \to R$ is a bijective function. Then there exists a function $f^{-1}$ that undoes the transformation of $x$ onto $f(x)$, i.e., that maps $f(x)$ onto $x$. Such a function $f^{-1}$ is called the inverse function of $f$. Notably, first applying $f$ to $x$ and then $f^{-1}$ to $f(x)$ yields $x$, formally

$$f^{-1} \circ f : D \to D, \; x \mapsto (f^{-1} \circ f)(x) := f^{-1}(f(x)) = x. \tag{2.77}$$
$f^{-1} \circ f$ thus corresponds to the identity function that maps $x$ onto itself.


Example 1. As an example for an inverse function, consider the function

$$f : \mathbb{R} \to \mathbb{R}, \; x \mapsto f(x) := ax + b \quad\text{with } a, b \in \mathbb{R}, a \neq 0. \tag{2.78}$$
Then the inverse function of $f$ is
$$f^{-1} : \mathbb{R} \to \mathbb{R}, \; y \mapsto f^{-1}(y) := \frac{y - b}{a}, \tag{2.79}$$
because
$$f^{-1}(f(x)) = f^{-1}(ax + b) = \frac{ax + b - b}{a} = \frac{ax}{a} = x, \tag{2.80}$$
and $f^{-1} \circ f$ thus corresponds to the identity function.

Example 2. As a second example, consider the square function. While
$$f : \mathbb{R} \to \mathbb{R}_{\geq 0}, \; x \mapsto f(x) := x^2 \tag{2.81}$$
is not injective and hence no inverse function of $f$ exists, its restrictions to the positive and negative real numbers can be inverted. Specifically, for positive $x \in \mathbb{R}_{>0}$, let
$$f_1 : \mathbb{R}_{>0} \to \mathbb{R}_{>0}, \; x \mapsto f_1(x) := x^2 \tag{2.82}$$
and
$$f_2 : \mathbb{R}_{<0} \to \mathbb{R}_{>0}, \; -x \mapsto f_2(-x) := (-x)^2. \tag{2.83}$$

Then the inverse function of $f_1$ is
$$f_1^{-1} : \mathbb{R}_{>0} \to \mathbb{R}_{>0}, \; y \mapsto f_1^{-1}(y) := \sqrt{y}, \tag{2.84}$$
because
$$f_1^{-1}(f_1(x)) = \sqrt{x^2} = x. \tag{2.85}$$

On the other hand, the inverse function of $f_2$ is
$$f_2^{-1} : \mathbb{R}_{>0} \to \mathbb{R}_{<0}, \; y \mapsto f_2^{-1}(y) := -\sqrt{y}, \tag{2.86}$$
because
$$f_2^{-1}(f_2(-x)) = -\sqrt{(-x)^2} = -\sqrt{x^2} = -x. \tag{2.87}$$
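The inverse-function relationships of Examples 1 and 2 can be verified numerically; the following Python sketch is an added illustration with arbitrarily chosen parameter values.

```python
import numpy as np

# Example 1: f(x) = a x + b with a != 0 and its inverse, cf. (2.78)-(2.80)
a, b = 2.0, 3.0
f     = lambda x: a * x + b
f_inv = lambda y: (y - b) / a
x = np.array([-2.0, 0.0, 1.5])
print(np.allclose(f_inv(f(x)), x))           # True: f^{-1}(f(x)) = x

# Example 2: restrictions of the square function, cf. (2.82)-(2.87)
f1     = lambda x: x**2                      # on R_{>0}
f1_inv = lambda y: np.sqrt(y)
f2     = lambda x: x**2                      # on R_{<0}
f2_inv = lambda y: -np.sqrt(y)
xp = np.array([0.5, 1.0, 2.0])               # positive values
print(np.allclose(f1_inv(f1(xp)), xp))       # True
print(np.allclose(f2_inv(f2(-xp)), -xp))     # True
```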

Essential functions This section catalogues a number of essential functions and their properties for later reference. Although not formally introduced at this point, we include the function’s derivatives for completeness.

The identity function. The identity function is defined as

$$\mathrm{id} : \mathbb{R} \to \mathbb{R}, \; x \mapsto \mathrm{id}(x) := x. \tag{2.88}$$
The identity function thus maps values $x \in \mathbb{R}$ onto themselves. The derivative of the identity function is 1:
$$\frac{d}{dx}\mathrm{id}(x) = \frac{d}{dx}x = 1. \tag{2.89}$$
The graph of the identity function is depicted in Figure 2.3A.

The exponential function. The exponential function is defined as
$$\exp : \mathbb{R} \to \mathbb{R}, \; x \mapsto \exp(x) := e^x := \sum_{n=0}^{\infty} \frac{x^n}{n!} = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \ldots \tag{2.90}$$
The mathematical object $\sum_{n=0}^{\infty} \frac{x^n}{n!}$, an infinite sum, is called a series. For the purposes of the GLM, it suffices to recall that $e^1 \approx 2.71$ is called Euler's number and that $\exp(x)$ thus corresponds to Euler's number to the power of $x$. The graph of the exponential function is depicted in Figure 2.3A. A defining property of the exponential function is that it is equal to its own derivative:
$$\frac{d}{dx}\exp(x) = \exp'(x) = \exp(x). \tag{2.91}$$
Without proofs, we note the following properties of the exponential function, which are often helpful in algebraic manipulations:


• Special values:
$$\exp(0) = 1 \quad\text{and}\quad \exp(1) = e. \tag{2.92}$$

• Value range:
$$x \in\, ]-\infty, 0[ \;\Rightarrow\; 0 < \exp(x) < 1 \quad\text{and}\quad x \in\, ]0, \infty[ \;\Rightarrow\; 1 < \exp(x) < \infty. \tag{2.93}$$
The exponential function thus assumes only strictly positive values,

$$\exp(\mathbb{R}) = \,]0, \infty[. \tag{2.94}$$

• Monotonicity:
$$x < y \;\Rightarrow\; \exp(x) < \exp(y). \tag{2.95}$$
The exponential function is thus said to be strictly monotonically increasing.
• Exponentiation identity (product property):
$$\exp(a + b) = \exp(a)\exp(b). \tag{2.96}$$
From the exponentiation property, it directly follows that
$$\exp(a - b) = \frac{\exp(a)}{\exp(b)} \quad\text{and}\quad \exp(a)\exp(-a) = \frac{\exp(a)}{\exp(a)} = 1. \tag{2.97}$$
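To make the series definition (2.90) concrete, the following added Python sketch approximates $\exp(x)$ by a truncated series and compares it with NumPy's implementation; it also checks the product property (2.96). The truncation length is an arbitrary choice for this illustration.

```python
import numpy as np
from math import factorial

def exp_series(x, n_terms=30):
    # Truncated series sum_{n=0}^{n_terms-1} x^n / n!, cf. (2.90)
    return sum(x**n / factorial(n) for n in range(n_terms))

for x in [-2.0, 0.0, 1.0, 3.0]:
    print(x, exp_series(x), np.exp(x))        # truncated series is close to exp(x)

# Product property (2.96): exp(a + b) = exp(a) exp(b)
a_val, b_val = 0.7, -1.3
print(np.isclose(np.exp(a_val + b_val), np.exp(a_val) * np.exp(b_val)))   # True
```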

The natural logarithm. The natural logarithm can be defined as the inverse function of the exponential function:
$$\ln : \,]0, \infty[ \,\to \mathbb{R}, \; x \mapsto \ln(x), \tag{2.98}$$
where by definition

$$\ln(\exp(x)) = x \;\text{ for all } x \in \mathbb{R} \quad\text{and}\quad \exp(\ln(x)) = x \;\text{ for all } x \in\, ]0, \infty[. \tag{2.99}$$
Note that the natural logarithm is only defined for positive values $x \in\, ]0, \infty[$. The graph of the natural logarithm is depicted in Figure 2.3A. The derivative of the natural logarithm is
$$\frac{d}{dx}\ln(x) = \ln'(x) = \frac{1}{x}. \tag{2.100}$$
Without proofs, we note the following properties of the natural logarithm:
• Special values:
$$\ln(1) = 0 \quad\text{and}\quad \ln(e) = 1. \tag{2.101}$$

• Value ranges:
$$x \in\, ]0, 1[ \;\Rightarrow\; \ln(x) < 0 \quad\text{and}\quad x \in\, ]1, \infty[ \;\Rightarrow\; \ln(x) > 0. \tag{2.102}$$
The natural logarithm thus assumes values in the entire range of the real numbers, but is only defined on the set of positive real numbers, i.e.,

$$\ln(]0, \infty[) = \mathbb{R}. \tag{2.103}$$

• Monotonicity:
$$x < y \;\Rightarrow\; \ln(x) < \ln(y). \tag{2.104}$$
The natural logarithm is thus strictly monotonically increasing.
• Inverse property:
$$\ln\left(\frac{1}{x}\right) = -\ln(x) \;\text{ for all } x \in\, ]0, \infty[. \tag{2.105}$$

• Product property:
$$\ln(xy) = \ln(x) + \ln(y) \;\text{ for all } x, y \in\, ]0, \infty[. \tag{2.106}$$
The natural logarithm thus "turns multiplication into addition".
• Power property:
$$\ln(x^k) = k\ln(x) \;\text{ for all } x \in\, ]0, \infty[ \text{ and } k \in \mathbb{Q}. \tag{2.107}$$
The natural logarithm thus "turns exponentiation into multiplication".


[Figure 2.3, panels A and B: panel A shows the graphs of $\mathrm{id}(x)$, $\exp(x)$, and $\ln(x)$; panel B shows the graphs of $p(x) := 2$, $p(x) := x$, $p(x) := 0.5x$, $p(x) := 1 + 0.5x$, and $p(x) := x^2$.]

Figure 2.3. Essential functions. A Graphs of the identity, exponential, and logarithm functions. B Graphs of selected polynomial functions (glm 2.m).

Polynomial functions. A function of the form

$$p : \mathbb{R} \to \mathbb{R}, \; x \mapsto p(x) := \sum_{i=0}^{k} a_i x^i = a_0 + a_1 x + a_2 x^2 + \cdots + a_k x^k \tag{2.108}$$
is called a polynomial function of degree $k \in \mathbb{N}$ with coefficients $a_0, a_1, ..., a_k \in \mathbb{R}$. Depending on the degree and the value of the coefficients, typical examples for polynomial functions of degrees $k = 0$, $k = 1$, and $k = 2$ are listed below.

Name                      Functional form          Special coefficient values
Constant function         $p(x) = a_0$
Identity function         $p(x) = x$               $a_0 := 0, a_1 := 1$
Linear functions          $p(x) = a_1 x$           $a_0 := 0$
Linear-affine functions   $p(x) = a_0 + a_1 x$
Square function           $p(x) = x^2$             $a_0 := 0, a_1 := 0, a_2 := 1$

Graphs of the polynomial functions listed above are depicted in Figure 2.3B. Note that the identity function is a special polynomial function.
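As an added illustration, the polynomial form (2.108) can be evaluated directly from its coefficients. The helper function below is hypothetical (Python, using NumPy) and simply sums the coefficient-weighted powers; it evaluates the special cases listed above at a few points.

```python
import numpy as np

def polynomial(coeffs, x):
    # p(x) = sum_i a_i x^i with coeffs = [a_0, a_1, ..., a_k], cf. (2.108)
    return sum(a * x**i for i, a in enumerate(coeffs))

x = np.linspace(-2, 2, 5)
print(polynomial([2.0], x))               # constant function p(x) = 2
print(polynomial([0.0, 1.0], x))          # identity function p(x) = x
print(polynomial([0.0, 0.5], x))          # linear function p(x) = 0.5 x
print(polynomial([1.0, 0.5], x))          # linear-affine function p(x) = 1 + 0.5 x
print(polynomial([0.0, 0.0, 1.0], x))     # square function p(x) = x^2
```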

Gamma function. The Gamma function is defined as
$$\Gamma : \mathbb{R} \to \mathbb{R}, \; x \mapsto \Gamma(x) := \int_0^\infty \xi^{x-1} \exp(-\xi)\, d\xi. \tag{2.109}$$
Without proofs, we note the following properties of the Gamma function:
• Recursive property: For $x > 0$, it holds that
$$\Gamma(x + 1) = x\Gamma(x). \tag{2.110}$$

• Special values:
$$\Gamma(1) = 1, \quad \Gamma\!\left(\tfrac{1}{2}\right) = \sqrt{\pi}, \quad\text{and}\quad \Gamma(n) = (n-1)! \;\text{ for } n \in \mathbb{N}. \tag{2.111}$$

• Integral formula: For $0 < x < \infty$ and $\alpha, \beta > 0$, it holds that
$$\int_0^\infty x^{\alpha - 1} \exp\left(-\frac{x}{\beta}\right) dx = \Gamma(\alpha)\beta^\alpha. \tag{2.112}$$
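The properties (2.110)-(2.112) can be checked numerically, for example with SciPy's gamma function and numerical quadrature. The sketch below is an added illustration; the values of $x$, $\alpha$, and $\beta$ are chosen arbitrarily.

```python
import numpy as np
from scipy.special import gamma
from scipy.integrate import quad

# Recursive property (2.110) and special values (2.111)
x = 3.7
print(np.isclose(gamma(x + 1), x * gamma(x)))        # True
print(np.isclose(gamma(0.5), np.sqrt(np.pi)))        # True
print([gamma(n) for n in range(1, 6)])               # 1, 1, 2, 6, 24 = (n-1)!

# Integral formula (2.112) for arbitrary alpha, beta > 0
alpha, beta = 2.5, 1.8
integral, _ = quad(lambda t: t**(alpha - 1) * np.exp(-t / beta), 0, np.inf)
print(np.isclose(integral, gamma(alpha) * beta**alpha))   # True
```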


2.4 Bibliographic remarks

The material presented in this chapter is standard and can be found in any undergraduate mathematics textbook. Spivak (2008) is a good starting point.

2.5 Study questions

1. Give brief explanations of the symbols $\mathbb{N}$, $\mathbb{N}_n$, $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{R}^n$.
2. Provide a numerical example for $x \in \mathbb{R}^5$.
3. Consider the sets $A := \{1, 2, 3\}$ and $B := \{3, 4, 5\}$. Write down the sets $C := A \cup B$ and $D := A \cap B$.

4. Write down the definition of the interval $[0, 1] \subset \mathbb{R}$. Is 0 an element of this interval?
5. Evaluate the sum
$$y := \sum_{i=1}^{4} a_i x_i \tag{2.113}$$
for
$$a_1 = -1, \; a_2 = 0, \; a_3 = 2, \; a_4 = -2 \tag{2.114}$$
and
$$x_1 = 3, \; x_2 = 2, \; x_3 = 5, \; x_4 = -2. \tag{2.115}$$

6. Explain the meaning of
$$f : D \to R, \; x \mapsto f(x) \tag{2.116}$$
and its components $f$, $D$, $R$, $x$, and $f(x)$ using an example of your choice.
7. Is the function
$$f : \mathbb{R} \to \mathbb{R}, \; x \mapsto f(x) := 2x + 2 \tag{2.117}$$
a linear function? Justify your answer.
8. Sketch the identity function.
9. Sketch the exponential function.
10. Sketch the natural logarithm.

3 | Calculus

The relative quality of parameter values during model estimation is commonly evaluated by means of a function that measures the relative goodness of a given parameter value with respect to the model and observed data. The ability to study how the value of such a function changes in response to changes in the parameter value is an essential requirement for adapting parameter values in a sensible fashion. The evaluation of function changes is the core topic of differential calculus. In this Section, we first review some essential aspects of differential calculus from the viewpoint of developing the theory of the GLM, including the notions of derivatives of univariate real-valued functions (Section 3.1), the analytical optimization of univariate real-valued functions (Section 3.2), and derivatives of multivariate real-valued functions (Section 3.3). In a final Section (Section 3.5), we review some essential aspects of integral calculus. In the context of the theory of the GLM, integrals primarily occur as expectations, variances, and covariances of random variables and random vectors.

3.1 Derivatives of univariate real-valued functions

We first consider derivatives of univariate real-valued functions, by which we understand functions that map real numbers onto real numbers. In other words, we consider functions f of the type

f : R → R, x 7→ f(x). (3.1)

The derivative $f'(x_0) \in \mathbb{R}$ of such a function at the location $x_0 \in \mathbb{R}$ conveys two basic and familiar intuitions:
1. $f'(x_0)$ is a measure of the rate of change of $f$ at location $x_0$,
2. $f'(x_0)$ is the slope of the tangent line of $f$ at the point $(x_0, f(x_0)) \in \mathbb{R}^2$.
Formally, this may be expressed by the differential quotient of $f$. The differential quotient, also referred to as Newton's difference quotient, expresses the difference between two values of the function $f(x + h)$ and $f(x)$ with respect to the difference between the two locations $x$ and $x + h$ for $h$ approaching zero:

$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}. \tag{3.2}$$
The differential quotient (3.2) represents the formal definition of the derivative of $f$ at $x$ and the basis for mathematical proofs of the rules of differentiation to be discussed in the following. However, its practical importance for the development of the theory of the GLM is by and large negligible. It is, however, important to distinguish two common usages of the term derivative: first, the derivative of a function $f$ can be considered at a specific value $x_0$ in the domain of $f$, denoted by

$$f'(x)|_{x = x_0} \in \mathbb{R}, \tag{3.3}$$
and represented by a number. Second, if the derivative (3.3) is evaluated for all possible values in the domain of $f$, the derivative of $f$ can be conceived as a function

$$f' : \mathbb{R} \to \mathbb{R}, \; x_0 \mapsto f'(x_0) := f'(x)|_{x = x_0}. \tag{3.4}$$
Intuitively, (3.4) means that the derivative of a (differentiable) univariate real-valued function may be evaluated at any point of the real line.
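Although the differential quotient (3.2) is rarely needed for analytical work here, it can be evaluated numerically. The following added Python sketch approximates $f'(x_0)$ for the example function $f(x) = x^2$ by shrinking $h$; the function and evaluation point are arbitrary choices for this illustration.

```python
import numpy as np

f = lambda x: x**2                 # example function with known derivative f'(x) = 2x
x0 = 1.5

for h in [1e-1, 1e-3, 1e-5]:
    diff_quotient = (f(x0 + h) - f(x0)) / h     # Newton's difference quotient, cf. (3.2)
    print(h, diff_quotient)                     # approaches f'(x0) = 3 as h -> 0
```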

Higher-order derivatives. The derivative $f'$ of a function $f$ is also referred to as the first-order derivative of a function (the zeroth-order derivative of a function corresponds to the function itself). Higher-order derivatives (i.e., second-order, third-order, and so on) can be evaluated by recursively forming the derivative of the respective lower-order derivative. For example, the second-order derivative of a function corresponds to

the (first-order) derivative of the first-order derivative of a function. To this end, the $\frac{d}{dx}$ operator notation for derivatives is useful. Intuitively, the symbol
$$\frac{d}{dx} f \tag{3.5}$$
can be understood as the imperative to evaluate the derivative of $f$, or simply as an alternative notation for $f'$. By itself, $\frac{d}{dx}$ carries no meaning. We thus have for the first-order derivative
$$\frac{d}{dx} f(x) = f'(x) \tag{3.6}$$
and for the second-order derivative
$$\frac{d^2}{dx^2} f(x) = \frac{d}{dx}\left(\frac{d}{dx} f(x)\right) = f''(x). \tag{3.7}$$
Intuitively, the second-order derivative measures the rate of change of the first-order derivative in the vicinity of $x$. If these first-order derivatives, which may be visualized as tangent lines, change relatively quickly in the vicinity of $x$, the second-order derivative is large, and the function is said to have a high curvature.

Derivatives of important functions We next collect the derivatives of essential functions without proofs.

Constant function. The derivative of any constant function

$$f : \mathbb{R} \to \mathbb{R}, \; x \mapsto f(x) := a, \quad\text{where } a \in \mathbb{R}, \tag{3.8}$$
is zero:
$$f' : \mathbb{R} \to \mathbb{R}, \; x \mapsto f'(x) = 0. \tag{3.9}$$
For example, the derivative of $f(x) := 2$ is $f'(x) = 0$.

Single-term polynomial functions. Let f be a single-term polynomial function of the form

$$f : \mathbb{R} \to \mathbb{R}, \; x \mapsto f(x) := ax^b, \quad\text{where } a \in \mathbb{R}, b \in \mathbb{R} \setminus \{0\}. \tag{3.10}$$
Then the derivative of $f$ is given by
$$f' : \mathbb{R} \to \mathbb{R}, \; x \mapsto f'(x) = bax^{b-1}. \tag{3.11}$$

For example, the derivative of $f(x) := 2x^3$ is $f'(x) = 6x^2$ and the derivative of $g(x) := \sqrt{x} = x^{\frac{1}{2}}$ is $g'(x) = \frac{1}{2}x^{-\frac{1}{2}} = \frac{1}{2\sqrt{x}}$.

Exponential function. Let f be the exponential function

$$f : \mathbb{R} \to \mathbb{R}_{>0}, \; x \mapsto f(x) := \exp(x). \tag{3.12}$$
Then the derivative of $f$ is given by
$$f' : \mathbb{R} \to \mathbb{R}_{>0}, \; x \mapsto f'(x) = \exp(x). \tag{3.13}$$

Natural logarithm. Let f be the natural logarithm

$$f : \mathbb{R}_{>0} \to \mathbb{R}, \; x \mapsto f(x) := \ln(x). \tag{3.14}$$
Then the derivative of $f$ is given by
$$f' : \mathbb{R}_{>0} \to \mathbb{R}, \; x \mapsto f'(x) = \frac{1}{x}. \tag{3.15}$$

Rules of differentiation We next state important rules for evaluating the derivatives of univariate functions without proof.


Summation rule. Let
$$f : \mathbb{R} \to \mathbb{R}, \; x \mapsto f(x) := \sum_{i=1}^{n} g_i(x) \tag{3.16}$$
be the sum of $n$ arbitrary functions $g_i : \mathbb{R} \to \mathbb{R}$ $(i = 1, 2, ..., n)$. Then the derivative of $f$ is given by the sum of the derivatives of the functions $g_i$:

$$f' : \mathbb{R} \to \mathbb{R}, \; x \mapsto f'(x) := \sum_{i=1}^{n} g_i'(x). \tag{3.17}$$
For example, the derivative of

$$f(x) = x^2 + 2x, \quad\text{where } g_1(x) := x^2 \text{ and } g_2(x) := 2x, \tag{3.18}$$
is
$$f'(x) = g_1'(x) + g_2'(x) = 2x + 2. \tag{3.19}$$

Chain rule. Let h be the composition of two functions f : R → R and g : R → R, i.e.,

$$h : \mathbb{R} \to \mathbb{R}, \; x \mapsto h(x) := (g \circ f)(x) = g(f(x)). \tag{3.20}$$
Then the derivative of $h$ is given by

$$h' : \mathbb{R} \to \mathbb{R}, \; x \mapsto h'(x) := g'(f(x))f'(x). \tag{3.21}$$
In words, the derivative of a function that can be written as the composition of a first function $f$ with a second function $g$ is given by the derivative of the second function $g$ "at the location of the function $f$" multiplied by the derivative of the first function $f$. For example, the derivative of

$$h(x) := \exp\left(-\frac{1}{2}x^2\right), \tag{3.22}$$
which can be written as the composition of a function $g(x) := \exp(x)$ with derivative $g'(x) = \exp(x)$ and a function $f(x) := -\frac{1}{2}x^2$ with derivative $f'(x) = -x$, is given by
$$h'(x) = -\exp\left(-\frac{1}{2}x^2\right)x. \tag{3.23}$$

Product rule. Let f be the product of two functions gi : R → R with i = 1, 2, i.e.,

$$f : \mathbb{R} \to \mathbb{R}, \; x \mapsto f(x) := g_1(x)g_2(x). \tag{3.24}$$
Then the derivative of $f$ is given by

$$f' : \mathbb{R} \to \mathbb{R}, \; x \mapsto f'(x) := g_1'(x)g_2(x) + g_1(x)g_2'(x), \tag{3.25}$$

where $g_1'$ and $g_2'$ denote the derivatives of $g_1$ and $g_2$, respectively. In words, if a function can be written as the product of a first and a second function, its derivative corresponds to the product of the derivative of the first function with the second function plus the product of the first function with the derivative of the second function. For example, the derivative of

$$f(x) := x^2 \exp(x) \tag{3.26}$$

can be found by writing $f$ as $g_1 \cdot g_2$ with $g_1(x) := x^2$ and $g_2(x) := \exp(x)$ with derivatives $g_1'(x) = 2x$ and $g_2'(x) = \exp(x)$, respectively. This then yields

$$f'(x) = 2x\exp(x) + x^2\exp(x). \tag{3.27}$$


Quotient rule. Let f be the quotient of two functions gi : R → R with i = 1, 2, i.e.

$$f : \mathbb{R} \to \mathbb{R}, \; x \mapsto f(x) := \frac{g_1(x)}{g_2(x)}. \tag{3.28}$$
Then the derivative of $f$ is given by

$$f' : \mathbb{R} \to \mathbb{R}, \; x \mapsto f'(x) := \frac{g_1'(x)g_2(x) - g_1(x)g_2'(x)}{g_2(x)^2}. \tag{3.29}$$
In words, the derivative of a function that can be written as the quotient of a first function in the numerator and a second function in the denominator is given by the difference of the product of the derivative of the first function with the second function and the product of the first function with the derivative of the second function, divided by the square of the second function, i.e., the function in the denominator of the original function. For example, the derivative of

$$f(x) := \frac{\exp(x)}{x^2 + 1} \tag{3.30}$$

can be evaluated by considering $g_1(x) := \exp(x)$ with derivative $g_1'(x) = \exp(x)$ and $g_2(x) := x^2 + 1$ with derivative $g_2'(x) = 2x$, yielding

$$f'(x) = \frac{\exp(x)(x^2 + 1) - \exp(x) \cdot 2x}{(x^2 + 1)^2}. \tag{3.31}$$
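The worked examples for the summation, chain, product, and quotient rules can be reproduced with symbolic differentiation. The following SymPy sketch is an added illustration confirming (3.19), (3.23), (3.27), and (3.31).

```python
import sympy as sp

x = sp.symbols('x')

# Summation rule example, cf. (3.18)-(3.19)
print(sp.diff(x**2 + 2*x, x))                                  # 2*x + 2

# Chain rule example, cf. (3.22)-(3.23)
print(sp.diff(sp.exp(-x**2 / 2), x))                           # -x*exp(-x**2/2)

# Product rule example, cf. (3.26)-(3.27)
print(sp.diff(x**2 * sp.exp(x), x))                            # x**2*exp(x) + 2*x*exp(x)

# Quotient rule example, cf. (3.30)-(3.31)
quotient = sp.diff(sp.exp(x) / (x**2 + 1), x)
expected = (sp.exp(x)*(x**2 + 1) - sp.exp(x)*2*x) / (x**2 + 1)**2
print(sp.simplify(quotient - expected) == 0)                   # True
```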

3.2 Analytical optimization of univariate real-valued functions

First- and second-order derivatives can be used to find local maxima and minima of functions. Finding maxima and minima of functions is a fundamental aspect of applied mathematics and is, in general, referred to as optimization. As discussed in the introductory statements of this Section, optimization is central for model estimation, as will become evident in Section 8 | Maximum likelihood estimation.

Extrema and extremal points It is helpful to clearly differentiate between two aspects of optimization: on the one hand, when finding a maximum or a minimum, one finds a value of a function f : D → R for which the conditions f(x) ≥ f(x0) or f(x) ≤ f(x0) hold for at least all x0 in a vicinity of x ∈ D. These values, which are elements of the range of f, are called maxima or minima and are abbreviated by

$$\max_{x \in D} f(x) \quad\text{and}\quad \min_{x \in D} f(x), \tag{3.32}$$
respectively. On the other hand, one simultaneously finds those points $x$ in the domain of $f$ for which $f(x)$ assumes a maximum or minimum. These points, which are often more interesting than the corresponding values $f(x)$ themselves, are referred to as extremal points and are abbreviated by

$$\arg\max_{x \in D} f(x) \quad\text{and}\quad \arg\min_{x \in D} f(x), \tag{3.33}$$
for extremal points that correspond to maxima and minima of $f$, respectively. Note the difference between $\max_{x \in D} f(x)$ and $\arg\max_{x \in D} f(x)$: the former refers to a point or a set of points in the range of $f$, the latter to a point or a set of points in the domain of $f$.

Necessary condition for an extremum. When using first- and second-order derivatives to find extremal points and their corresponding maxima or minima, it is helpful to distinguish necessary and sufficient conditions for extrema. The necessary condition for an extremum (i.e., a maximum or a minimum) of a function $f : \mathbb{R} \to \mathbb{R}$ at a point $x \in \mathbb{R}$ is that the first derivative is equal to zero: $f'(x) = 0$. Intuitively, this can be made transparent by considering a maximum of $f$ at a point $x_{\max}$: for all values of $x$ which are smaller than $x_{\max}$, the derivative is positive, because the function is increasing, leading up to the maximum. For all values of $x$ which are larger than $x_{\max}$, the derivative is negative, because the function is decreasing. At the location of the maximum, the

function is neither increasing nor decreasing, and thus $f'(x) = 0$. The reverse is true for a minimum of $f$ at a point $x_{\min}$: for all values of $x$ which are smaller than $x_{\min}$, the derivative is negative, because the function is decreasing towards the minimum. For all values of $x$ which are larger than $x_{\min}$, the derivative is positive, because the function is increasing again and recovering from the minimum. Again, at the location of the minimum, the function is neither increasing nor decreasing, and thus $f'(x) = 0$. Based on finding a point $x^*$ with $f'(x^*) = 0$ one cannot decide whether $x^*$ corresponds to a maximum or a minimum, because in both cases $f'(x) = 0$. On the other hand, if a minimum or maximum exists at a point $x^*$, it necessarily follows that $f'(x^*) = 0$; hence the nomenclature necessary condition. In fact, there are two more possibilities for points at which $f'(x^*) = 0$: the function may be increasing for $x < x^*$ and for $x > x^*$, or the function may be decreasing for $x < x^*$ and for $x > x^*$. In both cases, there is neither a maximum nor a minimum at $x^*$, but what is referred to as a saddle point.

Sufficient conditions for an extremum. The second-order derivative $f''(x)$ allows for testing whether a critical point $x^*$ for an extremum, i.e., a point for which $f'(x^*) = 0$, is a maximum, a minimum, or a saddle point. In brief, if $f''(x^*) < 0$, there is a maximum at $x^*$, if $f''(x^*) > 0$, there is a minimum at $x^*$, and if $f''(x^*) = 0$, there is a saddle point at $x^*$. Together with the condition $f'(x^*) = 0$, these conditions are referred to as sufficient conditions for an extremum or a saddle point. The role of the second derivative can be made intuitive by considering a maximum at $x^*$. For points $x < x^*$, the slope of the tangent line at $f(x)$ must be positive, because $f$ is increasing towards $x^*$. Likewise, for points $x > x^*$, the slope of the tangent line at $f(x)$ must be negative, because $f$ is decreasing after assuming its maximum at $x^*$. In other words, $f'(x) > 0$ (positive) for $x < x^*$, $f'(x) < 0$ (negative) for $x > x^*$, and $f'(x^*) = 0$. We next consider the change of $f'$, i.e., $f''$: in the region around the maximum, $f'$ decreases from a positive value to zero to a negative value, as just stated. Because $f'(x)$ is positive just before (to the left of) $x^*$ and negative just after (to the right of) $x^*$, it obviously decreases from just before to just after $x^*$. But this means that its own rate of change, $f''$, is negative at $x^*$. The reverse holds for a minimum of $f$ at $x^*$. To recapitulate, we have established the following conditions that use the derivatives of a function $f : \mathbb{R} \to \mathbb{R}$ to determine its extrema:
• If there is a maximum or minimum of $f$ at $x^*$, then $f'(x^*) = 0$.
• If $f'(x^*) = 0$ and $f''(x^*) > 0$, then there is a minimum at $x^*$.
• If $f'(x^*) = 0$ and $f''(x^*) < 0$, then there is a maximum at $x^*$.
The first of these conditions is referred to as the necessary condition for extrema, the latter two conditions as the sufficient conditions for extrema. We next discuss three examples in which we use these conditions to determine the location of extrema (Figure 3.1).

Example 1. Consider the function

$$f : [0, \pi] \to [-1, 1], \; x \mapsto \sin(x) \tag{3.34}$$
depicted in Figure 3.1A by a blue curve. The first derivative of $f$ is given by
$$f' : [0, \pi] \to [-1, 1], \; x \mapsto \frac{d}{dx}\sin(x) = \cos(x) \tag{3.35}$$
and depicted in Figure 3.1A by a red curve. Notably, in the interval $[0, \pi]$, the cosine function assumes a zero point at $\frac{\pi}{2}$. We thus have the critical point $x^* = \pi/2$ for an extremum. The second derivative of $f$ is given by
$$f'' : [0, \pi] \to [-1, 1], \; x \mapsto \frac{d}{dx}\cos(x) = -\sin(x) \tag{3.36}$$
and is depicted in Figure 3.1A by a dashed red curve. Because $f''(\pi/2) = -\sin(\pi/2) = -1 < 0$, we can conclude that there is a maximum of $f$ at $x = \pi/2$. Of course, this is also obvious from the graph of $f$.


[Figure 3.1, panels A to C: A $f(x) := \sin(x)$, B $f(x) := (x - 1)^2$, C $f(x) := -x^2$; each panel shows $f(x)$, $f'(x)$, $f''(x)$, and the critical point $x^*$.]

Figure 3.1. Analytical optimization of basic functions. For a detailed discussion, please refer to the main text (glm 3.m).

Example 2. Consider the function
$$f : \mathbb{R} \to \mathbb{R}, \; x \mapsto (x - 1)^2 \tag{3.37}$$
depicted in Figure 3.1B by a blue curve. The first derivative of $f$ is given by
$$f' : \mathbb{R} \to \mathbb{R}, \; x \mapsto \frac{d}{dx}(x - 1)^2 = 2x - 2 \tag{3.38}$$
and depicted in Figure 3.1B by a red curve. Setting this derivative to zero and solving for $x$ yields

2x − 2 = 0 ⇔ 2x = 2 ⇔ x = 1. (3.39)

We thus have the critical point $x^* = 1$ for an extremum. The second derivative of $f$ is given by
$$f'' : \mathbb{R} \to \mathbb{R}, \; x \mapsto \frac{d}{dx}(2x - 2) = 2 \tag{3.40}$$
and is depicted in Figure 3.1B by a dashed red curve. The second derivative is thus a constant function, and $f''(x^*) = 2 > 0$. We thus conclude that there is a minimum of $f$ at $x = 1$. Again, this is also obvious from the graph of $f$.

Example 3. Finally, we consider the function

$$f : \mathbb{R} \to \mathbb{R}, \; x \mapsto -x^2 \tag{3.41}$$
depicted in Figure 3.1C by a blue curve. The first derivative of $f$ is given by
$$f' : \mathbb{R} \to \mathbb{R}, \; x \mapsto \frac{d}{dx}(-x^2) = -2x \tag{3.42}$$
and depicted in Figure 3.1C by a red curve. Setting this derivative to zero and solving for $x$ yields

− 2x = 0 ⇔ x = 0. (3.43)

We thus have the critical point $x^* = 0$ for an extremum. The second derivative of $f$ is given by
$$f'' : \mathbb{R} \to \mathbb{R}, \; x \mapsto \frac{d}{dx}(-2x) = -2 \tag{3.44}$$
and is depicted in Figure 3.1C by a dashed red curve. The second derivative is thus a constant function and $f''(x^*) = -2 < 0$. We thus conclude that there is a maximum of $f$ at $x = 0$.
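The necessary and sufficient conditions of Examples 2 and 3 can also be applied symbolically. The added SymPy sketch below finds the critical points by solving $f'(x) = 0$ and classifies them by the sign of $f''$.

```python
import sympy as sp

x = sp.symbols('x')

for f in [(x - 1)**2, -x**2]:                  # Examples 2 and 3, cf. (3.37) and (3.41)
    f1 = sp.diff(f, x)                         # first derivative
    f2 = sp.diff(f, x, 2)                      # second derivative
    for xstar in sp.solve(sp.Eq(f1, 0), x):    # critical points with f'(x*) = 0
        curvature = f2.subs(x, xstar)          # nonzero for both examples
        kind = 'minimum' if curvature > 0 else 'maximum'
        print(f, xstar, kind)                  # minimum at x = 1, maximum at x = 0
```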


Figure 3.2. Visualization of multivariate (here bivariate) real-valued functions. Real-valued functions of multiple variables are often visualized in a three-dimensional way as in the left panels of the Figure. Note that although this is a 3D plot, the function is bivariate, i.e., it is a function of two variables. The same information can be conveyed by using isocontour plots, which visualize the isocontours of functions in 2D. Isocontours are the lines assuming equal values in the range of the function. Usually, isocontour plots suffice to convey all relevant information about a bivariate function (glm 03.m).

3.3 Derivatives of multivariate real-valued functions

Thus far, we have considered functions of the form $f : \mathbb{R} \to \mathbb{R}$, which map numbers $x \in \mathbb{R}$ onto numbers $f(x) \in \mathbb{R}$. Another function type that is encountered in the development of the GLM is that of functions of the form
$$f : \mathbb{R}^n \to \mathbb{R}, \; x \mapsto f(x), \tag{3.45}$$
where
$$x := \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} \in \mathbb{R}^n \tag{3.46}$$
is an $n$-dimensional vector. Because the input argument $x$ to such a function can vary along $n \geq 1$ dimensions and its output argument $f(x)$ is a scalar real number, such functions are also called multivariate real-valued functions. In physics, such functions are referred to as scalar fields, because they allocate scalars $f(x) \in \mathbb{R}$ to points $x$ in the $n$-dimensional space $\mathbb{R}^n$. An example for a function of the type (3.45) is
$$f : \mathbb{R}^2 \to \mathbb{R}, \; x \mapsto f(x) = f\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} := x_1^2 + x_2^2, \tag{3.47}$$
which is visualized in Figure 3.2A. Another example is the function
$$g : \mathbb{R}^2 \to \mathbb{R}, \; x \mapsto g(x) = g\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} := \exp\left(-\frac{1}{2}\left((x_1 - 1)^2 + (x_2 - 1)^2\right)\right) \tag{3.48}$$
which is visualized in Figure 3.2B. Note that functions defined on spaces $\mathbb{R}^n$ with $n > 2$ cannot be visualized easily.


As for univariate real-valued functions, one can ask how much a change in the input argument at a specific point in $\mathbb{R}^n$ of a multivariate real-valued function affects the value of the function. If one asks this question for each of the subcomponents $x_i$, $i = 1, ..., n$ of $x \in \mathbb{R}^n$ independently of the remaining $n - 1$ subcomponents, one is led to the concept of a partial derivative: the partial derivative of a multivariate real-valued function $f : \mathbb{R}^n \to \mathbb{R}$ with respect to a variable $x_i$, $i = 1, ..., n$ captures how much the function value changes "in the direction" of $x_i$, i.e., in the cross-section through the space $\mathbb{R}^n$ defined by the variable of interest. Stated differently, the partial derivative of a function $f : \mathbb{R}^n \to \mathbb{R}$ in a point $x \in \mathbb{R}^n$ with respect to a variable $x_i$ is the derivative of the function $f$ with respect to $x_i$ while all other variables $x_j$, $j = 1, 2, ..., i - 1, i + 1, ..., n$ are held constant. The partial derivative of a function $f : \mathbb{R}^n \to \mathbb{R}$ in a point $x \in \mathbb{R}^n$ with respect to a variable $x_i$ is denoted by
$$\frac{\partial}{\partial x_i} f(x), \tag{3.49}$$
where the $\partial$ symbol is used to distinguish the notion of a partial derivative from a standard derivative. This notation is somewhat redundant, because the subscript $i$ on the $x$ in $\frac{\partial}{\partial x_i}$ already makes it clear that the derivative is with respect to $x_i$ only. The notation is, however, commonly used, and if the subcomponents of $x$ are not denoted by $x_1, ..., x_n$, but by, say, $a := x_1, b := x_2, ..., e := x_5$, it is, in fact, helpful. Like the derivative of a univariate real-valued function, one may evaluate the partial derivative for all $x \in \mathbb{R}^n$ and hence also view the partial derivative of a multivariate real-valued function as a function

$$\frac{\partial}{\partial x_i} f : \mathbb{R}^n \to \mathbb{R}, \; x \mapsto \frac{\partial}{\partial x_i} f(x). \tag{3.50}$$
We next discuss two examples.

Example 1. We first consider the function

$$f : \mathbb{R}^2 \to \mathbb{R}, \; x \mapsto f(x) := x_1^2 + x_2^2. \tag{3.51}$$
Because this function has a two-dimensional domain, one can evaluate two different partial derivatives,

$$\frac{\partial}{\partial x_1} f : \mathbb{R}^2 \to \mathbb{R}, \; x \mapsto \frac{\partial}{\partial x_1} f(x) \tag{3.52}$$
and
$$\frac{\partial}{\partial x_2} f : \mathbb{R}^2 \to \mathbb{R}, \; x \mapsto \frac{\partial}{\partial x_2} f(x). \tag{3.53}$$
To evaluate the partial derivative (3.52), one considers the function

$$f_{x_2} : \mathbb{R} \to \mathbb{R}, \; x_1 \mapsto f_{x_2}(x_1) := x_1^2 + x_2^2, \tag{3.54}$$
where $x_2$ assumes the role of a constant. To indicate that $x_2$ is no longer an input argument of the function, but the function is still dependent on the constant $x_2$, we have used the subscript notation $f_{x_2}(x_1)$. To evaluate the partial derivative, we evaluate the standard univariate derivative of $f_{x_2}$,
$$f_{x_2}'(x_1) = 2x_1. \tag{3.55}$$
We thus have
$$\frac{\partial}{\partial x_1} f : \mathbb{R}^2 \to \mathbb{R}, \; x \mapsto \frac{\partial}{\partial x_1} f(x) = \frac{\partial}{\partial x_1}(x_1^2 + x_2^2) = f_{x_2}'(x_1) = 2x_1. \tag{3.56}$$

Accordingly, with the corresponding definition of $f_{x_1}$, we have
$$\frac{\partial}{\partial x_2} f : \mathbb{R}^2 \to \mathbb{R}, \; x \mapsto \frac{\partial}{\partial x_2} f(x) = \frac{\partial}{\partial x_2}(x_1^2 + x_2^2) = f_{x_1}'(x_2) = 2x_2. \tag{3.57}$$

Example 2. We next consider the example
$$g : \mathbb{R}^2 \to \mathbb{R}, \; x \mapsto g(x) := \exp\left(-\frac{1}{2}\left((x_1 - 1)^2 + (x_2 - 1)^2\right)\right). \tag{3.58}$$
Again, there are two partial derivatives to consider. Using the chain rule of differentiation and the logic of treating the variable with respect to which the derivative is not performed as a constant, we obtain


$$\begin{aligned}
\frac{\partial}{\partial x_1} g(x) &= \frac{\partial}{\partial x_1} \exp\left(-\frac{1}{2}\left((x_1 - 1)^2 + (x_2 - 1)^2\right)\right) \\
&= \exp\left(-\frac{1}{2}\left((x_1 - 1)^2 + (x_2 - 1)^2\right)\right) \frac{\partial}{\partial x_1}\left(-\frac{1}{2}\left((x_1 - 1)^2 + (x_2 - 1)^2\right)\right) \\
&= -\exp\left(-\frac{1}{2}\left((x_1 - 1)^2 + (x_2 - 1)^2\right)\right)(x_1 - 1),
\end{aligned} \tag{3.59}$$
and
$$\begin{aligned}
\frac{\partial}{\partial x_2} g(x) &= \frac{\partial}{\partial x_2} \exp\left(-\frac{1}{2}\left((x_1 - 1)^2 + (x_2 - 1)^2\right)\right) \\
&= \exp\left(-\frac{1}{2}\left((x_1 - 1)^2 + (x_2 - 1)^2\right)\right) \frac{\partial}{\partial x_2}\left(-\frac{1}{2}\left((x_1 - 1)^2 + (x_2 - 1)^2\right)\right) \\
&= -\exp\left(-\frac{1}{2}\left((x_1 - 1)^2 + (x_2 - 1)^2\right)\right)(x_2 - 1),
\end{aligned} \tag{3.60}$$
for the values of the respective partial derivative functions.
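The partial derivatives of Examples 1 and 2 can be verified symbolically; the following SymPy sketch is an added illustration.

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')

# Example 1, cf. (3.51) and (3.56)-(3.57)
f = x1**2 + x2**2
print(sp.diff(f, x1), sp.diff(f, x2))            # 2*x1, 2*x2

# Example 2, cf. (3.58)-(3.60)
g = sp.exp(-((x1 - 1)**2 + (x2 - 1)**2) / 2)
dg_dx1 = sp.diff(g, x1)
print(sp.simplify(dg_dx1 + (x1 - 1) * g) == 0)   # True: matches (3.59)
```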

Higher-order partial derivatives

As for the standard derivative of a univariate real-valued function $f : \mathbb{R} \to \mathbb{R}$, higher-order partial derivatives can be formulated and evaluated by taking partial derivatives of partial derivatives. Because multivariate real-valued functions are functions of multiple input arguments, more possibilities exist for higher-order derivatives compared to the univariate case. For example, given the partial derivative $\frac{\partial}{\partial x_1} f$ of a function $f : \mathbb{R}^3 \to \mathbb{R}$, one may next form the partial derivative again with respect to $x_1$, yielding the second-order partial derivative equivalent to the second-order derivative of a univariate function and denoted by $\frac{\partial^2}{\partial x_1^2} f$. However, one may also form the partial derivative with respect to $x_2$, $\frac{\partial^2}{\partial x_1 \partial x_2} f$, or with respect to $x_3$, $\frac{\partial^2}{\partial x_1 \partial x_3} f$. Note that the numerator of the partial derivative sign increases its power with the order of the derivative and the denominator denotes the variables with respect to which the derivative is taken. If the derivative is taken multiple times with respect to the same variable, the variable in the denominator is notated with the corresponding power. Again, note that these are mere conventions to signal the form of the partial derivative, but the symbols themselves do not carry any meaning beyond the implicit imperative to consider or evaluate the corresponding partial derivative.

Example. To exemplify the notation introduced above, we evaluate the first and second-order partial derivatives of the function

$$f : \mathbb{R}^3 \to \mathbb{R}, \; x \mapsto f(x) := x_1^2 + x_1 x_2 + x_2\sqrt{x_3}. \tag{3.61}$$
For the first-order derivatives, we have

$$\begin{aligned}
\frac{\partial}{\partial x_1} f(x) &= \frac{\partial}{\partial x_1}\left(x_1^2 + x_1 x_2 + x_2\sqrt{x_3}\right) = 2x_1 + x_2, \\
\frac{\partial}{\partial x_2} f(x) &= \frac{\partial}{\partial x_2}\left(x_1^2 + x_1 x_2 + x_2\sqrt{x_3}\right) = x_1 + \sqrt{x_3}, \\
\frac{\partial}{\partial x_3} f(x) &= \frac{\partial}{\partial x_3}\left(x_1^2 + x_1 x_2 + x_2\sqrt{x_3}\right) = \frac{x_2}{2\sqrt{x_3}}.
\end{aligned} \tag{3.62}$$

For the second-order derivatives with respect to x1, we then have

$$\begin{aligned}
\frac{\partial^2}{\partial x_1 \partial x_1} f(x) &= \frac{\partial}{\partial x_1}\left(\frac{\partial}{\partial x_1} f(x)\right) = \frac{\partial}{\partial x_1}(2x_1 + x_2) = 2, \\
\frac{\partial^2}{\partial x_2 \partial x_1} f(x) &= \frac{\partial}{\partial x_2}\left(\frac{\partial}{\partial x_1} f(x)\right) = \frac{\partial}{\partial x_2}(2x_1 + x_2) = 1, \\
\frac{\partial^2}{\partial x_3 \partial x_1} f(x) &= \frac{\partial}{\partial x_3}\left(\frac{\partial}{\partial x_1} f(x)\right) = \frac{\partial}{\partial x_3}(2x_1 + x_2) = 0.
\end{aligned} \tag{3.63}$$


For the second-order derivatives with respect to $x_2$, we have
$$\begin{aligned}
\frac{\partial^2}{\partial x_1 \partial x_2} f(x) &= \frac{\partial}{\partial x_1}\left(\frac{\partial}{\partial x_2} f(x)\right) = \frac{\partial}{\partial x_1}(x_1 + \sqrt{x_3}) = 1, \\
\frac{\partial^2}{\partial x_2 \partial x_2} f(x) &= \frac{\partial}{\partial x_2}\left(\frac{\partial}{\partial x_2} f(x)\right) = \frac{\partial}{\partial x_2}(x_1 + \sqrt{x_3}) = 0, \\
\frac{\partial^2}{\partial x_3 \partial x_2} f(x) &= \frac{\partial}{\partial x_3}\left(\frac{\partial}{\partial x_2} f(x)\right) = \frac{\partial}{\partial x_3}(x_1 + \sqrt{x_3}) = \frac{1}{2\sqrt{x_3}}.
\end{aligned} \tag{3.64}$$

Finally, for the second-order derivatives with respect to x3, we have 2   ∂ ∂ ∂ ∂  x2 √  f(x) = f(x) = x3 = 0, ∂x1∂x3 ∂x1 ∂x3 ∂x1 2 ∂2 ∂  ∂  ∂  x  1 f(x) = f(x) = √2 = √ , (3.65) ∂x2∂x3 ∂x2 ∂x3 ∂x2 2 x3 2 x3

2    1  3 ∂ ∂ ∂ ∂ 1 − 2 1 − 2 f(x) = f(x) = x2 x3 = − x2x3 . ∂x3∂x3 ∂x3 ∂x3 ∂x3 2 4 Note from the above that it does not matter in which order the second derivatives are taken, as ∂2 ∂2 f(x) = f(x) = 1, ∂x1∂x2 ∂x2∂x1 ∂2 ∂2 f(x) = f(x) = 0, (3.66) ∂x1∂x3 ∂x3∂x1 ∂2 ∂2 1 f(x) = f(x) = √ . ∂x2∂x3 ∂x3∂x2 2 x3

This is a general property of partial derivatives known as Schwarz’ Theorem, which we state without proof. Theorem 3.3.1 (Schwarz’ Theorem). For a multivariate real-valued function

$$f : \mathbb{R}^n \to \mathbb{R}, \; x \mapsto f(x), \tag{3.67}$$
it holds that
$$\frac{\partial^2}{\partial x_i \partial x_j} f(x) = \frac{\partial^2}{\partial x_j \partial x_i} f(x) \quad\text{for all } 1 \leq i, j \leq n. \tag{3.68}$$

Schwarz’ Theorem is helpful when evaluating partial derivatives: on the one hand, one can save some work by relying on it, on the other hand, it can help to validate one’s analytical results, because if one finds that it does not hold for certain second-order partial derivatives, there must be an error.

Gradient and Hessian. The first- and second-order partial derivatives of a multivariate real-valued function $f$ can be summarized in two entities known as the gradient (or gradient vector) and the Hessian (or Hessian matrix).

Gradient. The gradient of a function

$$f : \mathbb{R}^n \to \mathbb{R}, \; x \mapsto f(x) \tag{3.69}$$
at a location $x \in \mathbb{R}^n$ is defined as the $n$-dimensional vector of the function's partial derivatives evaluated at this location and is denoted by the $\nabla$ (nabla) symbol:

 ∂ f(x) ∂x1 ∂  ∂x f(x) ∇f : n → n, x 7→ ∇f(x) :=  2  . (3.70) R R  .   .  ∂ f(x) ∂xn Note that the gradient is a vector-valued function: it takes a vector x ∈ Rn as input and returns a vector ∇f (x) ∈ Rn. We note without proof that the gradient evaluated at x ∈ Rn is a vector that points in the direction of the greatest rate of increase (steepest ascent) of the function.


Hessian. The second-order partial derivatives of a multivariate real-valued function $f$ can be summarized in the Hessian of the function, which hereinafter is denoted by $H^f$. It is defined as

$$H^f : \mathbb{R}^n \to \mathbb{R}^{n \times n}, \; x \mapsto H^f(x), \tag{3.71}$$
where

$$H^f(x) := \begin{pmatrix}
\frac{\partial^2}{\partial x_1 \partial x_1} f(x) & \frac{\partial^2}{\partial x_1 \partial x_2} f(x) & \cdots & \frac{\partial^2}{\partial x_1 \partial x_n} f(x) \\
\frac{\partial^2}{\partial x_2 \partial x_1} f(x) & \frac{\partial^2}{\partial x_2 \partial x_2} f(x) & \cdots & \frac{\partial^2}{\partial x_2 \partial x_n} f(x) \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial^2}{\partial x_n \partial x_1} f(x) & \frac{\partial^2}{\partial x_n \partial x_2} f(x) & \cdots & \frac{\partial^2}{\partial x_n \partial x_n} f(x)
\end{pmatrix}. \tag{3.72}$$
Note that in each row of the Hessian, the second of the two partial derivatives is constant (in the order of differentiation, not in the order of notation), while the first partial derivative varies from 1 to $n$ over columns, and the reverse is true for each column. Notably, the Hessian matrix is a matrix-valued function: it takes a vector $x \in \mathbb{R}^n$ as input and returns an $n \times n$ matrix $H^f(x) \in \mathbb{R}^{n \times n}$. Finally, note that due to Schwarz' Theorem, the Hessian matrix is symmetric, i.e.,

$$H^f(x) = \left(H^f(x)\right)^T. \tag{3.73}$$
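Gradient and Hessian can be obtained symbolically, for example for the function (3.61) of the preceding example. The following SymPy sketch is an added illustration; declaring the symbols as positive is an assumption made so that square-root expressions simplify cleanly.

```python
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3', positive=True)
f = x1**2 + x1*x2 + x2*sp.sqrt(x3)            # cf. (3.61)
variables = [x1, x2, x3]

grad = sp.Matrix([sp.diff(f, v) for v in variables])   # gradient, cf. (3.70)
hess = sp.hessian(f, variables)                        # Hessian matrix, cf. (3.72)

print(grad)                # Matrix([[2*x1 + x2], [x1 + sqrt(x3)], [x2/(2*sqrt(x3))]])
print(sp.simplify(hess - hess.T) == sp.zeros(3, 3))    # True: symmetry, cf. (3.73)
```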

3.4 Derivatives of multivariate vector-valued functions

Multivariate vector-valued functions. Thus far, we have discussed univariate real-valued and multivariate real-valued functions. A further type of function that is commonly encountered is a function that maps vectors onto vectors. A principled account and theoretical development of derivatives for such multivariate vector-valued functions is provided by Magnus and Neudecker (1989). Multivariate vector-valued functions are functions of the form
$$f : \mathbb{R}^n \to \mathbb{R}^m, \; x \mapsto f(x) := \begin{pmatrix} f_1(x_1, ..., x_n) \\ f_2(x_1, ..., x_n) \\ \vdots \\ f_m(x_1, ..., x_n) \end{pmatrix}. \tag{3.74}$$
In physics, such functions are referred to as vector fields. The multivariate real-valued functions

$$f_i : \mathbb{R}^n \to \mathbb{R}, \quad i = 1, ..., m \tag{3.75}$$
are referred to as the component functions of $f$.

Example. A first example for a multivariate vector-valued function is
$$f : \mathbb{R}^3 \to \mathbb{R}^2, \; x \mapsto f(x) := \begin{pmatrix} x_1 + x_2 \\ x_2 x_3 \end{pmatrix}, \tag{3.76}$$
for which the component functions are given by

$$f_1 : \mathbb{R}^3 \to \mathbb{R}, \; x \mapsto f_1(x) := x_1 + x_2, \tag{3.77}$$
$$f_2 : \mathbb{R}^3 \to \mathbb{R}, \; x \mapsto f_2(x) := x_2 x_3. \tag{3.78}$$

The Jacobian matrix

The first derivative of multivariate vector-valued functions evaluated at x ∈ Rn is given by the Jacobian matrix. The Jacobian matrix is denoted and defined by

 ∂ ∂  ∂x f1(x) ··· ∂x f1(x)  ∂  1 n J f : n → m×n, x 7→ J f (x) := f (x) =  . .. .  . (3.79) R R ∂x i  . . .  j i=1,...,m,j=1,...,n ∂ ∂ fm(x) ··· fm(x) ∂x1 ∂xn


In words, the Jacobian matrix of a multivariate vector-valued function f : Rn → Rm with component functions fi, i = 1, ..., m is the m × n matrix of n partial derivatives of the m component functions with respect to the n input vector components xj, j = 1, ..., n. Note that the gradient of a multivariate real-valued function corresponds to the transpose of the Jacobian matrix of the function: for

$$f : \mathbb{R}^n \to \mathbb{R}^m \tag{3.80}$$
with $m = 1$, we have
$$\nabla f(x) = \left(J^f(x)\right)^T. \tag{3.81}$$
This can be readily seen by inspecting the entries of the first row of (3.82). Finally, note that the determinant of the Jacobian matrix is often referred to as the Jacobian.

Example. As an example, consider the Jacobian matrix of the function defined in (3.76). By evaluation of the respective partial derivatives, we have

$$J^f : \mathbb{R}^3 \to \mathbb{R}^{2 \times 3}, \; x \mapsto J^f(x) := \begin{pmatrix} \frac{\partial}{\partial x_1} f_1(x) & \frac{\partial}{\partial x_2} f_1(x) & \frac{\partial}{\partial x_3} f_1(x) \\ \frac{\partial}{\partial x_1} f_2(x) & \frac{\partial}{\partial x_2} f_2(x) & \frac{\partial}{\partial x_3} f_2(x) \end{pmatrix} = \begin{pmatrix} 1 & 1 & 0 \\ 0 & x_3 & x_2 \end{pmatrix}. \tag{3.82}$$
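The Jacobian matrix (3.82) can likewise be obtained symbolically; the following SymPy sketch is an added illustration.

```python
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')
f = sp.Matrix([x1 + x2, x2*x3])               # component functions, cf. (3.76)-(3.78)

J = f.jacobian([x1, x2, x3])                  # Jacobian matrix, cf. (3.79) and (3.82)
print(J)                                      # Matrix([[1, 1, 0], [0, x3, x2]])
```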

3.5 Basic integrals

In this Section, we review the intuition of the definite integral as the signed area under a function’s graph and the notion of indefinite integration as the inverse of differentiation.

Definite integrals of univariate real-valued functions

We denote the definite integral of a univariate real-valued function $f$ on an interval $[a, b] \subset \mathbb{R}$ by the real number
$$I := \int_a^b f(x)\,dx \in \mathbb{R}. \tag{3.83}$$
It is important to realize two aspects of (3.83): first, the definite integral is a real number and second, the right-hand side of (3.83) is merely notational and to be understood as the imperative for integrating the function $f$ on the interval $[a, b]$. In other words, there is no mathematical meaning associated with the $dx$ or the $\int_a^b$ that goes beyond the definition of the integral boundaries $a$ and $b$. The term definite is used here to distinguish this integral from the indefinite integral discussed below. Put simply, definite integrals are those integrals for which the integral boundaries appear at the integral sign, although they may sometimes be omitted, e.g., if the interval of integration is the entire real line. Intuitively, the definite integral $\int_a^b f(x)\,dx$ is best understood as the continuous generalization of the discrete sum

$$\sum_{i=1}^{n} f(x_i)\Delta x, \tag{3.84}$$
where
$$a =: x_1, \; x_2 := x_1 + \Delta x, \; x_3 := x_2 + \Delta x, \; ..., \; x_{n+1} := b \tag{3.85}$$
corresponds to an equipartition of the interval $[a, b]$, i.e., a partition of the interval $[a, b]$ into $n$ bins of equal size $\Delta x$. The term $f(x_i)\Delta x$ for $i = 1, ..., n$ in (3.84) corresponds to the area of the rectangle formed by the value of the function $f$ at $x_i$ (i.e., the upper left corner of the rectangle) as height and the bin width $\Delta x$ as width. Summing over all rectangles then yields an approximation of the area under the graph of the function $f$, where terms with negative values of $f(x_i)$ enter the sum with a negative sign. Intuitively, letting the bin width $\Delta x$ in the sum (3.84) approach zero then approximates the integral of $f$ on the interval $[a, b]$,
$$\int_a^b f(x)\,dx \approx \sum_{i=1}^{n} f(x_i)\Delta x \quad\text{for } \Delta x \to 0. \tag{3.86}$$
This approximation approach to the definite integral is visualized in Figure 3.3.
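The approximation (3.86) is easy to explore numerically. The following added Python sketch approximates $\int_0^1 x^2\,dx = 1/3$ with rectangles of shrinking bin width; the example function and interval are arbitrary choices for this illustration.

```python
import numpy as np

def riemann_sum(f, a, b, n):
    # Left-endpoint approximation sum_i f(x_i) * dx with n equally sized bins, cf. (3.84)-(3.86)
    dx = (b - a) / n
    x = a + dx * np.arange(n)                 # x_1, ..., x_n (left endpoints)
    return np.sum(f(x) * dx)

f = lambda x: x**2
for n in [10, 100, 10000]:
    print(n, riemann_sum(f, 0.0, 1.0, n))     # approaches 1/3 as the bin width shrinks
```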


Figure 3.3. Evaluation of a definite integral by means of the approximation approach described in eq. (3.86).

Definite integrals have a linearity property, which is often useful when evaluating integrals analytically. Based on the intuition that for a function $f : \mathbb{R} \to \mathbb{R}$, the definite integral corresponds to
$$\int_a^b f(x)\,dx \approx \sum_{i=1}^{n} f(x_i)\Delta x, \tag{3.87}$$
and the fact that for a second function $g : \mathbb{R} \to \mathbb{R}$, we have
$$\sum_{i=1}^{n} (f(x_i) + g(x_i))\Delta x = \sum_{i=1}^{n} (f(x_i)\Delta x + g(x_i)\Delta x) = \sum_{i=1}^{n} f(x_i)\Delta x + \sum_{i=1}^{n} g(x_i)\Delta x, \tag{3.88}$$
and for a constant $c \in \mathbb{R}$ we have
$$\sum_{i=1}^{n} c f(x_i)\Delta x = c \sum_{i=1}^{n} f(x_i)\Delta x, \tag{3.89}$$
we can infer the following two properties of the integral:

$$\int_a^b (f(x) + g(x))\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx \tag{3.90}$$
and
$$\int_a^b c f(x)\,dx = c \int_a^b f(x)\,dx. \tag{3.91}$$
In words, first, the integral of the sum of two functions $f + g$ over an interval $[a, b]$ corresponds to the sum of the integrals of the individual functions $f$ and $g$ on $[a, b]$. Second, the integral of a function $f$ multiplied by a constant $c$ on an interval $[a, b]$ corresponds to the integral of the function $f$ on the interval $[a, b]$ multiplied by the constant. Both properties are very useful when evaluating integrals analytically: the first allows for decomposing integrals of composite functions into sums of integrals of less complex functions, while the second allows for removing constants from integration.

Indefinite integrals Consider a univariate real-valued function

$$f : \mathbb{R} \to \mathbb{R}, \; x \mapsto f(x). \tag{3.92}$$
Next, consider a second function that is defined in terms of definite integrals of $f$ by making the upper integration boundary of these definite integrals its argument:
$$F : \mathbb{R}_{\geq 0} \to \mathbb{R}, \; x \mapsto F(x) := \int_0^x f(s)\,ds. \tag{3.93}$$


From the discussion above, we have that the value of $F$ at $x$ corresponds to the signed area under the graph of the function $f$ on the interval from 0 to $x$. Notably, the derivative $F'(x)$ of $F$ at $x$ corresponds to the value of the function $f$, i.e.,
$$F'(x) = \frac{d}{dx}\left(\int_0^x f(s)\,ds\right) = f(x). \tag{3.94}$$
Intuitively, eq. (3.94) states that integration is the inverse of differentiation, in the sense that first integrating $f$ from 0 to $x$ and then computing the derivative with respect to $x$ yields $f$. Any function $F$ with the property $F'(x) = f(x)$ for a function $f$ is called an anti-derivative or indefinite integral of $f$. An indefinite integral is denoted by
$$F : \mathbb{R} \to \mathbb{R}, \; x \mapsto F(x) = \int f(s)\,ds. \tag{3.95}$$

Note that the definite integral defined above corresponds to a real scalar number, while the indefinite integral is a function.

Proof of (3.94). While the statement of equation (3.94) is familiar and intuitive, it is not necessarily formally easy to grasp. Here, we provide a proof of this equation based on Leithold (1976). The proof makes use of limiting processes and the mean value theorem (Spivak, 2008). Let $f : \mathbb{R} \to \mathbb{R}, \; s \mapsto f(s)$ be a univariate real-valued function, and define another function
$$F : \mathbb{R} \to \mathbb{R}, \; x \mapsto F(x) := \int_a^x f(s)\,ds. \tag{3.96}$$
For any two numbers $x_1$ and $x_1 + \Delta x$ in the (closed) interval $[a, b] \subset \mathbb{R}$, we then have

$$F(x_1) = \int_a^{x_1} f(s)\,ds \quad\text{and}\quad F(x_1 + \Delta x) = \int_a^{x_1 + \Delta x} f(s)\,ds. \tag{3.97}$$
Subtraction of these two equalities yields

$$F(x_1 + \Delta x) - F(x_1) = \int_a^{x_1 + \Delta x} f(s)\,ds - \int_a^{x_1} f(s)\,ds. \tag{3.98}$$
From the intuition of the integral as the area between the function $f$ and the $x$-axis, it follows naturally that the sum of the areas of two adjacent regions is equal to the area of both regions combined, i.e.,

$$\int_a^{x_1} f(s)\,ds + \int_{x_1}^{x_1 + \Delta x} f(s)\,ds = \int_a^{x_1 + \Delta x} f(s)\,ds. \tag{3.99}$$
From this it follows that the difference above evaluates to

$$F(x_1 + \Delta x) - F(x_1) = \int_{x_1}^{x_1 + \Delta x} f(s)\,ds. \tag{3.100}$$

According to the mean value theorem for integration, there exists a real number $c_{\Delta x} \in [x_1, x_1 + \Delta x]$ (the dependence on $\Delta x$ of which we have denoted by the subscript) with

$$\int_{x_1}^{x_1 + \Delta x} f(s)\,ds = f(c_{\Delta x})\Delta x, \tag{3.101}$$
and we hence obtain
$$F(x_1 + \Delta x) - F(x_1) = f(c_{\Delta x})\Delta x. \tag{3.102}$$
Division by $\Delta x$ then yields
$$\frac{F(x_1 + \Delta x) - F(x_1)}{\Delta x} = f(c_{\Delta x}), \tag{3.103}$$
where the left-hand side corresponds to Newton's difference quotient. Taking the limit $\Delta x \to 0$ on both sides then yields
$$\lim_{\Delta x \to 0} \frac{F(x_1 + \Delta x) - F(x_1)}{\Delta x} = \lim_{\Delta x \to 0} f(c_{\Delta x}) \;\Leftrightarrow\; F'(x_1) = \lim_{\Delta x \to 0} f(c_{\Delta x}) \tag{3.104}$$
by definition of the derivative as the limit of Newton's difference quotient. The limit on the right-hand side of (3.104) remains to be evaluated. To this end, we recall that $c_{\Delta x} \in [x_1, x_1 + \Delta x]$ or, in other words, that $x_1 \leq c_{\Delta x} \leq x_1 + \Delta x$. Notably, $\lim_{\Delta x \to 0} x_1 = x_1$ and $\lim_{\Delta x \to 0} (x_1 + \Delta x) = x_1$. Therefore, we can conclude that $\lim_{\Delta x \to 0} c_{\Delta x} = x_1$, as $c_{\Delta x}$ is squeezed between $x_1$ and $x_1 + \Delta x$, both of which converge to $x_1$. We thus find

$$F'(x_1) = \lim_{\Delta x \to 0} f(c_{\Delta x}) = f(x_1), \tag{3.105}$$

which concludes the proof. □

Indefinite integrals allow for the evaluation of definite integrals $\int_a^b f(s)\,ds$ by means of the fundamental theorem of calculus
$$\int_a^b f(s)\,ds = F(b) - F(a). \tag{3.106}$$
In words, to evaluate the integral of a univariate real-valued function $f$ on the interval $[a, b]$, one has to first compute the anti-derivative of $f$, and then compute the difference between the anti-derivative evaluated at the upper integral interval boundary $b$ and the anti-derivative evaluated at the lower integral interval boundary $a$. Equation (3.106) is very familiar. We first consider some of its properties and then provide a formal justification.

Properties of indefinite integrals. We first note without proof that the linearity properties of the definite integral also hold for the indefinite integral: for functions $f, g : \mathbb{R} \to \mathbb{R}$ and a constant $c \in \mathbb{R}$ we have
$$\int (f(x) + g(x))\,dx = \int f(x)\,dx + \int g(x)\,dx \tag{3.107}$$
and
$$\int c f(x)\,dx = c \int f(x)\,dx. \tag{3.108}$$

As for differentiation, it is useful to know the anti-derivatives of a handful of univariate real-valued functions that are commonly encountered. A selection of anti-derivatives is presented below. These can readily be verified by evaluating the derivatives of the respective anti-derivatives to recover the original functions. Note that the derivative of the constant function f(x) := c, c ∈ R is zero. We have

$f(x) := a$        ⇒  $F(x) = ax + c$
$f(x) := x^a$      ⇒  $F(x) = \frac{1}{a+1} x^{a+1} + c$  $(a \neq -1)$
$f(x) := x^{-1}$   ⇒  $F(x) = \ln x + c$
$f(x) := \exp(x)$  ⇒  $F(x) = \exp(x) + c$
$f(x) := \sin(x)$  ⇒  $F(x) = -\cos(x) + c$
$f(x) := \cos(x)$  ⇒  $F(x) = \sin(x) + c$

Proof of (3.106). As for the statement that the derivative of an anti-derivative is the original function, the fundamental theorem of calculus is very familiar but a formal derivation is somewhat more involved. The proof provided here, which again follows Leithold (1976), makes use of limiting processes and the mean value theorem of differentiation. We first consider the quantity $F(b) - F(a)$. To this end, we select numbers $x_0, ..., x_n$ such that

a := x0 < x1 < x2 < . . . < xn−1 < xn =: b. (3.109)

It then follows that F (b) − F (a) = F (xn) − F (x0) . (3.110)

Next, each F (xi) , i = 1, . . . , n − 1 is added to the quantity F (b) − F (a) together with its additive inverse

$$\begin{aligned}
F(b) - F(a) &= F(x_n) + (-F(x_{n-1}) + F(x_{n-1})) + \ldots + (-F(x_1) + F(x_1)) - F(x_0) \\
&= (F(x_n) - F(x_{n-1})) + (F(x_{n-1}) - F(x_{n-2})) + \ldots + (F(x_1) - F(x_0)) \\
&= \sum_{i=1}^{n} (F(x_i) - F(x_{i-1})).
\end{aligned} \tag{3.111}$$

For a function F :[a, b] → R, the mean value theorem of differentiation states that under certain constraints on F , which we assume to be fulfilled, there exists a number c ∈]a, b[ such that

$$F'(c) = \frac{F(b) - F(a)}{b - a}. \tag{3.112}$$
From the mean value theorem of differentiation, it thus follows that for the terms of the sum above, we have, with appropriately chosen $c_i \in\, ]a, b[$, $i = 1, \ldots, n$,

$$F(x_i) - F(x_{i-1}) = F'(c_i)(x_i - x_{i-1}), \tag{3.113}$$

and substitution then yields
$$F(b) - F(a) = \sum_{i=1}^{n} F'(c_i)(x_i - x_{i-1}). \tag{3.114}$$
By definition, it follows that
$$F'(c_i) = f(c_i), \tag{3.115}$$
and setting $\Delta x_{i-1} := x_i - x_{i-1}$ yields

$$F(b) - F(a) = \sum_{i=1}^{n} f(c_i)\Delta x_{i-1}. \tag{3.116}$$

Now, $F(b)$ and $F(a)$ are independent of the $x_i$ and the left-hand side of the above thus evaluates to $F(b) - F(a)$. For the right-hand side, we note that $x_{i-1} \leq c_i \leq x_{i-1} + \Delta x_{i-1}$ and thus $\lim_{\Delta x_{i-1} \to 0} x_{i-1} = x_{i-1}$ and $\lim_{\Delta x_{i-1} \to 0} (x_{i-1} + \Delta x_{i-1}) = x_{i-1}$, from which it follows that $\lim_{\Delta x_{i-1} \to 0} c_i = x_{i-1}$. We thus have

$$F(b) - F(a) = \sum_{i=1}^{n} f(x_{i-1})\Delta x_{i-1} = \sum_{i=0}^{n-1} f(x_i)\Delta x_i \approx \int_a^b f(s)\,ds \tag{3.117}$$
with the definition of the definite integral under the generalization that the $\Delta x_i$ may not be equally spaced. This concludes the proof.

□

Example. To illustrate the theory and interplay of indefinite and definite integrals, we evaluate the definite integral of the function
$$f : \mathbb{R} \to \mathbb{R}, \; x \mapsto f(x) := 2x^2 + x + 1 \tag{3.118}$$
on the interval $[1, 2]$. To this end, we first use the linearity property of the indefinite integral, which yields
$$F : \mathbb{R} \to \mathbb{R}, \; x \mapsto F(x) := \int f(x)\,dx = \int (2x^2 + x + 1)\,dx = 2\int x^2\,dx + \int x\,dx + \int 1\,dx. \tag{3.119}$$

We then make use of the table of commonly encountered anti-derivatives to evaluate the remaining integral terms, yielding

F(x) = (2/3)x³ + (1/2)x² + x + c,    (3.120)

where the constant c ∈ R comprises all constant terms. Importantly, this constant term vanishes once we evaluate a definite integral by means of the fundamental theorem of calculus:

∫_1^2 f(x) dx = F(2) − F(1)
             = ((2/3)·2³ + (1/2)·2² + 2 + c) − ((2/3)·1³ + (1/2)·1² + 1 + c)
             = 16/3 + 4/2 + 2 + c − 2/3 − 1/2 − 1 − c    (3.121)
             = 32/6 + 12/6 + 12/6 + c − 4/6 − 3/6 − 6/6 − c
             = 43/6.

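The worked example can be cross-checked symbolically. The sketch below (an addition to the text, assuming sympy is available) reproduces the value 43/6 ≈ 7.17 obtained in (3.121).

```python
import sympy as sp

x = sp.symbols('x')
f = 2*x**2 + x + 1

# Indefinite integral (sympy omits the constant of integration c)
F = sp.integrate(f, x)                          # 2*x**3/3 + x**2/2 + x

# Definite integral on [1, 2] via the fundamental theorem of calculus
value = F.subs(x, 2) - F.subs(x, 1)
assert value == sp.Rational(43, 6)

# The same value obtained by direct definite integration
assert sp.integrate(f, (x, 1, 2)) == sp.Rational(43, 6)
```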
Integration by parts Integration by parts can be considered an analogue of the product rule of differentiation for integrals. For two functions f : [a, b] → R and g : [a, b] → R, the integration by parts rule states that

∫_a^b f′(x)g(x) dx = f(b)g(b) − f(a)g(a) − ∫_a^b f(x)g′(x) dx.    (3.122)

The integration by parts rule can be useful if the anti-derivative f of f′ and the integral on the right-hand side are readily available.


Proof. With the product rule of differentiation, we have

(f(x)g(x))′ = f′(x)g(x) + f(x)g′(x)
⇔ f′(x)g(x) = (f(x)g(x))′ − f(x)g′(x)    (3.123)
⇔ ∫_a^b f′(x)g(x) dx = ∫_a^b (f(x)g(x))′ dx − ∫_a^b f(x)g′(x) dx.

Because f(x)g(x) is an anti-derivative of (f(x)g(x))′, it follows immediately with the fundamental theorem of calculus that

∫_a^b f′(x)g(x) dx = f(b)g(b) − f(a)g(a) − ∫_a^b f(x)g′(x) dx.    (3.124)

Integration by substitution The fundamental theorem of calculus allows for the evaluation of certain integrals by means of an integration rule that is known as integration by substitution and sometimes referred to as “integration by a change of variables”. Specifically, for two functions f : I → R and g : [a, b] → R with g([a, b]) ⊆ I, it holds that

∫_a^b f(g(x))g′(x) dx = ∫_{g(a)}^{g(b)} f(x) dx.    (3.125)

Proof. We first note that an anti-derivative of f(g(x))g′(x) is given by (F ◦ g)(x), where F denotes an anti-derivative of f, because

(F ◦ g)′(x) = F′(g(x))g′(x) = f(g(x))g′(x).    (3.126)

With the fundamental theorem of calculus, we then have

∫_a^b f(g(x))g′(x) dx = (F ◦ g)(b) − (F ◦ g)(a) = F(g(b)) − F(g(a)) = ∫_{g(a)}^{g(b)} f(x) dx.    (3.127)

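To illustrate the substitution rule (3.125) with a concrete case (an example added here, not taken from the text), one may choose f(x) = x² and g(x) = sin(x) on [0, π/2]; both sides of (3.125) then evaluate to 1/3. The sketch assumes numpy and scipy are available.

```python
import numpy as np
from scipy.integrate import quad

# Left-hand side of (3.125): integral of f(g(x)) g'(x) over [a, b] = [0, pi/2]
lhs, _ = quad(lambda x: np.sin(x)**2 * np.cos(x), 0.0, np.pi / 2)

# Right-hand side: integral of f over [g(a), g(b)] = [sin(0), sin(pi/2)] = [0, 1]
rhs, _ = quad(lambda x: x**2, 0.0, 1.0)

assert abs(lhs - rhs) < 1e-10      # both equal 1/3
```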
Definite integrals of multivariate real-valued functions on rectangles The notion of the definite integral of a univariate real-valued function can be generalized to the definite integral of multivariate real-valued functions. Specifically, let

R := [a_1, b_1] × ··· × [a_n, b_n] ⊆ R^n    (3.128)

denote a rectangle, where the a_i, b_i, i = 1, ..., n may be finite or infinite. Further, let

f : R → R, x ↦ f(x)    (3.129)

denote a multivariate real-valued function. Then, under certain regularity conditions which are omitted here, Fubini’s theorem states that

∫_R f(x) dx = ∫_{a_1}^{b_1} ··· ∫_{a_n}^{b_n} f(x_1, ..., x_n) dx_n ··· dx_1.    (3.130)

In words, the definite integral ∫_R f(x) dx of the multivariate real-valued function f on the rectangle R can be evaluated as the iterated integral ∫_{a_1}^{b_1} ··· ∫_{a_n}^{b_n} f(x_1, ..., x_n) dx_n ··· dx_1, which corresponds to a sequence of definite integrals of univariate real-valued functions. Crucially, Fubini’s theorem implies that the order of integration in iterated integrals does not matter.

Example. As an example for the definite integral of a multivariate real-valued function, consider the integral of the function

f : R² → R, x ↦ f(x) := 2x_1 + x_2    (3.131)

on R := [0, 3] × [0, 2], i.e., for x_1 ∈ [0, 3] and x_2 ∈ [0, 2]. We have

∫_R f(x) dx = ∫_{[0,2]} ( ∫_{[0,3]} (2x_1 + x_2) dx_1 ) dx_2
            = ∫_{[0,2]} ( ∫_{[0,3]} 2x_1 dx_1 + ∫_{[0,3]} x_2 dx_1 ) dx_2    (3.132)
            = ∫_{[0,2]} ( ∫_{[0,3]} 2x_1 dx_1 + x_2 ∫_{[0,3]} 1 dx_1 ) dx_2.


For the integrals with respect to x_1, we then have with the fundamental theorem of calculus and the fact that the anti-derivatives of g(x_1) = 2x_1 and h(x_1) = 1 evaluate to G(x_1) = x_1² + c and H(x_1) = x_1 + c, respectively,

∫_R f(x) dx = ∫_{[0,2]} (3² − 0² + x_2(3 − 0)) dx_2 = ∫_{[0,2]} (3x_2 + 9) dx_2.    (3.133)

With the fact that the anti-derivative of g(x_2) = 3x_2 + 9 is given by G(x_2) = (3/2)x_2² + 9x_2, we then have

∫_R f(x) dx = (3/2)·2² + 9·2 − ((3/2)·0² + 9·0) = 24.    (3.134)

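A numerical cross-check of this iterated integral is straightforward (a sketch added here, assuming a recent scipy is available): integrating 2x_1 + x_2 over x_1 ∈ [0, 3] and x_2 ∈ [0, 2] with scipy.integrate.dblquad should reproduce the value 24.

```python
from scipy.integrate import dblquad

# dblquad integrates func(y, x) for x in [a, b] and y in [gfun(x), hfun(x)];
# here the outer variable is x1 in [0, 3] and the inner variable is x2 in [0, 2]
value, abserr = dblquad(lambda x2, x1: 2*x1 + x2, 0.0, 3.0, 0.0, 2.0)

assert abs(value - 24.0) < 1e-10
```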
3.6 Bibliographic remarks

The material presented in this chapter is standard and can be found in any undergraduate textbook on calculus. A good starting point is Spivak (2008), which also provides many justifications for the results presented in the current chapter. Abbott (2015) provides a very readable treatment of the more subtle aspects of real analysis. A theoretically grounded introduction to multivariate calculus is provided by Magnus and Neudecker (1989). As previously mentioned, for justifying the central results on definite and indefinite integration, we consulted Leithold (1976).

3.7 Study questions

1. Give a brief explanation of the notion of a derivative of a univariate function f in a point x.

2. Provide brief explanations of the symbols d/dx, d²/dx², ∂/∂x, and ∂²/∂x².
3. Compute the first derivatives of the following functions:

f : R → R, x ↦ f(x) := 3 exp(−x²)    (3.135)
g : R → R, x ↦ g(x) := (x² + 2 ln(x) − a)³.    (3.136)

4. Determine the minimum of the function

f : R → R, x ↦ f(x) := x² + 3x − 2.    (3.137)

5. Compute the partial derivatives of the function

f : R² → R, (x, y) ↦ f(x, y) := ln(x) + Σ_{i=1}^{n} (y − 3)².    (3.138)

6. Write down the definition of the gradient of a multivariate real-valued function. 7. Write down the definition of the Hessian of a multivariate real-valued function. 8. Evaluate the gradient and the Hessian of

f : R² → R, (x, y) ↦ f(x, y) := 2 exp(x² − 3y).    (3.139)

9. State the intuitions for the definite integral ∫_a^b f(x) dx and the indefinite integral ∫ f(x) dx of a univariate real-valued function f.
10. Evaluate the definite integral

I := ∫_1^3 (5x² + 2x) dx.    (3.140)

4 | Matrices

Matrices and matrix computations are essential in the development of the theory of the GLM. In brief, matrices provide a general language to subsume many variants of linear statistical models and represent their associated theory in an integrated fashion, while they also allow for representing a given modelling scenario in a precise mathematical form that can readily be implemented in a computational environment. In this Section, we first define the notion of a matrix (Section 4.1) and then discuss a number of essential matrix operations (Section 4.2). We also introduce two matrix concepts that are important for introducing multivariate Gaussian distributions, determinants (Section 4.3) and positive-definite matrices (Section 4.4). In essence, we here review the very minimum of concepts from linear algebra that still allows for obtaining a first understanding of the GLM. For a more comprehensive view, it is strongly recommended to consult standard undergraduate textbooks on linear algebra, such as Strang(2009).

4.1 Matrix definition

A matrix is a rectangular collection of numbers. Matrices are usually denoted by capital letters as follows:

A := [ a_11  a_12  ···  a_1m
       a_21  a_22  ···  a_2m
        ⋮     ⋮    ⋱    ⋮
       a_n1  a_n2  ···  a_nm ] = (a_ij)_{1≤i≤n, 1≤j≤m}.    (4.1)

An entry a_ij in a matrix A is indexed by its row index i and its column index j. For example, the entry a_32 in the matrix

A := [ 2 7 5 2
       8 2 5 6
       6 4 0 9
       9 2 1 2 ]    (4.2)

is 4. The size of a matrix is determined by its number of rows n ∈ N and its number of columns m ∈ N. If a matrix has the same number of rows and columns, i.e., if n = m, the matrix is called a square matrix. When referring to a matrix, it is very useful to mention its size and the properties of its entries. In the theory of the GLM, the entries of a matrix are usually elements of the set of real numbers R. We thus write

A ∈ R^{n×m}    (4.3)

to denote that a given matrix A has entries from the set of real numbers and that it comprises n rows and m columns. In words, (4.3) is to be read as “the matrix A consists of n rows and m columns and the entries in A are real numbers”. For example, for the matrix A in (4.2), we write

A ∈ R^{4×4},    (4.4)

because it has four rows and four columns. Note that this matrix is a square matrix. Matrices comprising a single column require special attention: in the context of matrix algebra, we can identify the n-dimensional vectors introduced in Section 2 | Sets, sums, and functions with the set of n × 1 matrices. In other words, we treat n-dimensional vectors and matrices with n rows and a single column as equivalent. This means that we usually set R^n := R^{n×1} for n ∈ N.

4.2 Matrix operations

Just as one can compute with real numbers, one can do algebra with matrices. More specifically, one can

• add and subtract two matrices of the same size, referred to as matrix addition and subtraction,
• multiply a matrix with a scalar, referred to as matrix scalar multiplication,
• multiply two matrices under certain conditions, referred to as matrix multiplication,

• divide by a matrix or, more precisely, multiply by a matrix inverse, which is found by a process referred to as matrix inversion,
• change the ordering of elements of a matrix in a prescribed manner, referred to as matrix transposition.

In the following, we discuss these operations in some detail and provide examples.

Matrix addition and subtraction Two matrices of the same size can be added or subtracted by adding or subtracting their entries in an element-wise fashion. Formally, in case of addition, this can be expressed as

A + B = C    (4.5)

with A, B, C ∈ R^{n×m}, and where the matrix C is given by

A + B = [ a_11 + b_11   a_12 + b_12   ···   a_1m + b_1m
          a_21 + b_21   a_22 + b_22   ···   a_2m + b_2m
               ⋮             ⋮         ⋱         ⋮
          a_n1 + b_n1   a_n2 + b_n2   ···   a_nm + b_nm ].    (4.6)

The analogue element-wise operation is defined for subtraction:

A − B = D. (4.7)

Example. As an example, consider the 2 × 3 matrices A, B ∈ R^{2×3} defined as

A := [ 2 −3 0      and     B := [  4 1 0
       1  6 5 ]                   −4 2 0 ].    (4.8)

Since both matrices have the same size, we can compute their sum

C = A + B
  = [ 2 −3 0     [  4 1 0
      1  6 5 ] +   −4 2 0 ]
  = [ 2 + 4   −3 + 1   0 + 0
      1 − 4    6 + 2   5 + 0 ]    (4.9)
  = [  6 −2 0
      −3  8 5 ]

and their difference

D = A − B
  = [ 2 −3 0     [  4 1 0
      1  6 5 ] −   −4 2 0 ]
  = [ 2 − 4   −3 − 1   0 − 0
      1 + 4    6 − 2   5 − 0 ]    (4.10)
  = [ −2 −4 0
       5  4 5 ].

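The element-wise nature of matrix addition and subtraction is directly mirrored by numpy arrays. The sketch below (an addition to the text, assuming numpy is available) reproduces the matrices C and D from (4.9) and (4.10).

```python
import numpy as np

A = np.array([[2, -3, 0],
              [1,  6, 5]])
B = np.array([[ 4, 1, 0],
              [-4, 2, 0]])

C = A + B        # element-wise sum, cf. (4.9)
D = A - B        # element-wise difference, cf. (4.10)

assert np.array_equal(C, np.array([[ 6, -2, 0], [-3, 8, 5]]))
assert np.array_equal(D, np.array([[-2, -4, 0], [ 5, 4, 5]]))
```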
Matrix scalar multiplication

One can also multiply a matrix A ∈ Rn×m by a scalar c ∈ R. In contrast to a matrix or vector, a scalar is a single number. The operation of multiplying a matrix by a scalar is called scalar multiplication and,

like matrix addition and subtraction, is performed in an element-wise fashion. Formally, we have

cA = [ ca_11  ca_12  ···  ca_1m
       ca_21  ca_22  ···  ca_2m
         ⋮      ⋮     ⋱     ⋮
       ca_n1  ca_n2  ···  ca_nm ] =: B.    (4.11)

Example. As an example, consider the matrix A ∈ R^{4×3} defined as

A := [ 3 1 1
       5 2 5
       2 7 1
       3 4 2 ]    (4.12)

and let c := −3. Then B = cA evaluates to

3 1 1 −3 · 3 −3 · 1 −3 · 1  −9 −3 −3  5 2 5 −3 · 5 −3 · 2 −3 · 5 −15 −6 −15 B = −3   =   =   . (4.13) 2 7 1 −3 · 2 −3 · 7 −3 · 1  −6 −21 −3  3 4 2 −3 · 3 −3 · 4 −3 · 2 −9 −12 −6

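Scalar multiplication is likewise element-wise for numpy arrays; the sketch below (added here, assuming numpy) reproduces B = −3A from (4.13).

```python
import numpy as np

A = np.array([[3, 1, 1],
              [5, 2, 5],
              [2, 7, 1],
              [3, 4, 2]])

B = -3 * A       # element-wise scalar multiplication, cf. (4.13)

assert np.array_equal(B, np.array([[ -9,  -3,  -3],
                                   [-15,  -6, -15],
                                   [ -6, -21,  -3],
                                   [ -9, -12,  -6]]))
```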
Matrix multiplication In addition to adding and subtracting matrices of the same size, as well as multiplying a matrix by a scalar, one can also multiply two matrices. However, matrix multiplication is not an element-wise operation, but has a special definition. Importantly, two matrices A and B can only be multiplied in the order

AB, (4.14) if the first matrix A has as many columns as the second matrix B has rows, or in other words, if

A ∈ R^{n×m} and B ∈ R^{m×p}    (4.15)

with n, m, p ∈ N. This condition is sometimes referred to as the equality of the inner dimensions of the product AB. If this equality does not hold, the two matrices cannot be multiplied. If the inner dimensions of the matrices A and B are equal, the matrix product AB is defined as

AB = C, (4.16)

where the resulting matrix C ∈ R^{n×p} is computed as

AB = [ Σ_{i=1}^{m} a_1i b_i1   Σ_{i=1}^{m} a_1i b_i2   ···   Σ_{i=1}^{m} a_1i b_ip
       Σ_{i=1}^{m} a_2i b_i1   Σ_{i=1}^{m} a_2i b_i2   ···   Σ_{i=1}^{m} a_2i b_ip
                 ⋮                       ⋮              ⋱              ⋮
       Σ_{i=1}^{m} a_ni b_i1   Σ_{i=1}^{m} a_ni b_i2   ···   Σ_{i=1}^{m} a_ni b_ip ]    (4.17)

   =: [ c_11 c_12 ··· c_1p
        c_21 c_22 ··· c_2p
         ⋮    ⋮   ⋱   ⋮
        c_n1 c_n2 ··· c_np ]

    = C.

The expression for the matrix product may appear a bit unwieldy, but it is very important. The expression states that the entry in the ith row and jth column of the matrix C ∈ R^{n×p} is given by overlaying the entries in the ith row of matrix A with the jth column entries of matrix B, multiplying the overlaid entries, and then adding them up.

Example. Let A ∈ R^{2×3} and B ∈ R^{3×2} be defined as

A := [ 2 −3 0      and     B := [  4 2
       1  6 5 ]                   −1 0
                                   1 3 ].    (4.18)

We first consider the size of the matrix C := AB. We know that A has two rows and three columns, while B has three rows and two columns. Because B has the same number of rows as A has columns, the matrix product AB = C is defined. Expression (4.17) tells us that the resulting matrix C has two rows and two columns, because the number of rows of the resulting matrix C is determined by the number of rows of the first matrix A, and the number of columns of the resulting matrix C is determined by the number of columns of the second matrix B. We thus know that C is a 2 × 2 matrix, i.e., C ∈ R2×2. Overlaying the rows of A on the columns of B, multiplying the entries, and adding up the results yields

C = AB
  = [ 2 −3 0     [  4 2
      1  6 5 ] ·   −1 0
                    1 3 ]
  = [ 2·4 + (−3)·(−1) + 0·1    2·2 + (−3)·0 + 0·3
      1·4 + 6·(−1) + 5·1       1·2 + 6·0 + 5·3    ]    (4.19)
  = [ 8 + 3 + 0    4 + 0 + 0
      4 − 6 + 5    2 + 0 + 15 ]
  = [ 11  4
       3 17 ].

It is essential to always keep track of the sizes of the matrices that are involved in matrix multiplications. Specifically, if matrix A is of size n × m and matrix B is of size m × p, then the product AB will always be of size n × p. This can be visualized as

(n × m) · (m × p) = (n × p), (4.20) i.e., the inner numbers m disappear.

Note that if one is calculating with scalars, multiplication is commutative, i.e., for a, b ∈ R, it holds that ab = ba. In contrast, matrix multiplication is generally not commutative, i.e., the order of A and B in the

matrix product matters. As an example, consider the matrices A and B above. We have just seen that

 4 2 2 −3 0 11 4  C := AB = −1 0 = . (4.21) 1 6 5   3 17 1 3

On the other hand, we have

 4 2  10 0 10 2 −3 0 D := BA = −1 0 = −2 3 0 , (4.22)   1 6 5   1 3 5 15 15 and thus, for the current example, AB 6= BA. Depending on the matrix sizes, the commuted matrix product may not even be defined: if A ∈ R2×3 and B = R3×4, the matrix product AB exists, but the matrix product BA does not.

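Matrix multiplication and its non-commutativity can be checked with numpy's @ operator (a sketch added to the text, assuming numpy; it reproduces (4.19)–(4.22)).

```python
import numpy as np

A = np.array([[2, -3, 0],
              [1,  6, 5]])          # 2 x 3
B = np.array([[ 4, 2],
              [-1, 0],
              [ 1, 3]])             # 3 x 2

C = A @ B                           # 2 x 2, cf. (4.19) and (4.21)
D = B @ A                           # 3 x 3, cf. (4.22)

assert np.array_equal(C, np.array([[11, 4], [3, 17]]))
assert np.array_equal(D, np.array([[10, 0, 10], [-2, 3, 0], [5, 15, 15]]))
assert C.shape != D.shape           # AB and BA do not even have the same size here
```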
Matrix inversion To motivate the idea of matrix inversion consider the equation

Ax = b, (4.23) where A ∈ Rn×n, x ∈ Rn, and b ∈ Rn. We next simplify eq. (4.23) by assuming that n = 1. To remind us of this assumption, we rewrite (4.23) using lower-case letters

ax = b, (4.24) where now a, x, b ∈ R. We further assume that we know that a = 2 and b = 6, i.e., we have the equation 2x = 6. (4.25)

From high school mathematics we know how to solve equations such as (4.25) for the unknown variable x. The general strategy is to isolate x on the left-hand side by the appropriate algebraic operations, such as additions and/or multiplications by the known variables, and observe the outcome on the right-hand side of the equation. For the current case, such a strategy takes the following form:

2x = 6                  | divide both sides by 2
(2/2)x = 6/2            | evaluate the ratios    (4.26)
x = 3,

and we have found that the value of the unknown variable x that solves eq. (4.25) is x = 3. From high school mathematics, we also recall that division by a scalar number a corresponds to multiplication by its multiplicative inverse. The multiplicative inverse, also referred to as the reciprocal, of a scalar a is the number which yields 1 if it is multiplied with a, i.e., 1/a. We may thus also write the solution strategy (4.26) in the form

2x = 6                  | multiply both sides by 1/2
(1/2) · 2x = (1/2) · 6  | evaluate the ratios    (4.27)
x = 3.

Of course both strategies, (4.26) and (4.27), yield the same result. From high school mathematics, one may also remember that the multiplicative inverse 1/a of a scalar a can be denoted by a−1. In general mathematics, a−1 is called the multiplicative inverse of a if, and only if,

aa−1 = a−1a = 1, (4.28) where 1 denotes the neutral element with respect to multiplication, i.e.,

a · 1 = 1 · a = a, (4.29)

for all a. Recapitulating (4.26) in more abstract terms, we have thus carried out the following operations

ax = b            | multiply both sides by a⁻¹
a⁻¹ax = a⁻¹b      | evaluate the multiplications    (4.30)
x = a⁻¹b.

We now return to the case that A ∈ Rn×n, x ∈ Rn, and b ∈ Rn are in fact matrices and vectors, i.e., n > 1. In matrix algebra, one often encounters statements of the form

Ax = b, (4.31) which are referred to as systems of linear equations. Here, the values of A ∈ Rn×n and b ∈ Rn are known and one would like to find an x ∈ Rn which solves (4.31). In complete analogy to the solution strategy described in (4.30), one can multiply both sides of (4.31) with the inverse A−1 ∈ Rn×n of A (if it exists) to obtain A−1Ax = A−1b. (4.32) In analogy to the scalar case a−1a = 1, the matrix product A−1A yields, by definition, a specific matrix, n×n known as the identity matrix In ∈ R . The identity matrix comprises ones on its main diagonal and zeros everywhere else. That is, by definition, we have

1 0 ··· 0 0 1 ··· 0 A−1A = I :=   . (4.33) n . . .. . . . . . 0 0 ··· 1

In analogy to a · 1 = 1 · a = a for scalars, the product of a matrix A with the identity matrix always yields the matrix A again, AI = IA = A, (4.34) i.e., the identity matrix is the neutral element with respect to matrix multiplication. We thus see that we can evaluate the left-hand side of (4.31) as follows

Ax = b            | multiply both sides by A⁻¹
A⁻¹Ax = A⁻¹b      | evaluate the multiplications    (4.35)
x = A⁻¹b.

In other words, if we have a way to evaluate the inverse A⁻¹ of the square matrix A, we can solve for the unknown x ∈ R^n in the same way that we can solve for the unknown x ∈ R in the scalar case. Matrix inversion is fundamental for solving systems of linear equations and for scientific computing in general. However, a comprehensive discussion of algorithms for matrix inversion is beyond the scope of an introduction to the GLM. In the following, we thus content ourselves with providing two examples for the evaluation of matrix inverses in special cases, namely diagonal matrices and small invertible matrices.

The inverse of a diagonal matrix. A diagonal matrix D is a square matrix with non-zero elements d_i ≠ 0, i = 1, ..., n along its main diagonal and zeros everywhere else,

D = [ d_1   0   ···   0
       0   d_2  ···   0
       ⋮    ⋮    ⋱    ⋮
       0    0   ···  d_n ] ∈ R^{n×n}.    (4.36)

The inverse of a diagonal matrix is given by a diagonal matrix D⁻¹ with the scalar inverses 1/d_i of the d_i on its main diagonal and zeros everywhere else,

 1 0 ··· 0  d1 1  0 d ··· 0  D−1 =  2  ∈ n×n, (4.37)  . . .. .  R  . . . .  0 0 ··· 1 dn

because

D⁻¹D = [ (1/d_1)·d_1       0         ···        0
              0       (1/d_2)·d_2    ···        0
              ⋮             ⋮          ⋱         ⋮          (4.38)
              0             0         ···   (1/d_n)·d_n ]
      = [ 1 0 ··· 0
          0 1 ··· 0
          ⋮ ⋮  ⋱  ⋮
          0 0 ··· 1 ]
      = I_n,

and likewise for DD⁻¹.

Gaussian elimination. Small matrices can be inverted using Gaussian elimination. We will demonstrate this approach in the context of solving the system of linear equations

Ax = b    (4.39)

with

A := [ 1  0  2
       2 −1  3
       4  1  8 ] ∈ R^{3×3},    x ∈ R³,    and    b := (2, 1, 2)ᵀ ∈ R³.

Based on the discussion above, we know that we can solve the system of linear equations for x ∈ R3, if we are able to determine the inverse A−1 of A, because then

x = A−1b. (4.40)

We also know that for the inverse of A, we have

A⁻¹A = I₃.    (4.41)

The inverse of A can be found using the following procedure:

1. Write down the matrix A and next to it the identity matrix I₃:

[ 1  0  2 | 1 0 0
  2 −1  3 | 0 1 0
  4  1  8 | 0 0 1 ].    (4.42)

2. Use three kinds of operations on the rows of A to transform A in the array above into the identity matrix. Apply the same operations to the identity matrix in parallel. The admissible operations are: (a) exchanging two rows of A, (b) multiplying a row of A by a number, and (c) adding or subtracting a multiple of another row of A from any row of A.

Adding −2 times the first row to the second row, and −4 times the first row to the third row yields

[ 1  0  2 |  1 0 0
  0 −1 −1 | −2 1 0
  0  1  0 | −4 0 1 ].    (4.43)

Exchanging the second and third row yields

[ 1  0  2 |  1 0 0
  0  1  0 | −4 0 1
  0 −1 −1 | −2 1 0 ].    (4.44)

Adding the second row to the third row yields

[ 1  0  2 |  1 0 0
  0  1  0 | −4 0 1
  0  0 −1 | −6 1 1 ].    (4.45)

The General Linear Model 20/21 | © 2020 Dirk Ostwald CC BY-NC-SA 4.0 Matrix operations 51

Adding 2 times the third row to the first row yields

1 0 0 | −11 2 2 0 1 0 | −4 0 1 . (4.46) 0 0 −1 | −6 1 1

Multiplying the third row by -1 yields

1 0 0 | −11 2 2  0 1 0 | −4 0 1  . (4.47) 0 0 1 | 6 −1 −1

Having transformed the matrix on the left into the identity matrix, the matrix that is left on the right is now the inverse A−1 of A, as can be verified by computing

1 0 2 −11 2 2  1 0 0 2 −1 3  −4 0 1  = 0 1 0 . (4.48) 4 1 8 6 −1 −1 0 0 1

Hence, with −11 2 2  −1 A =  −4 0 1  , (4.49) 6 −1 −1 the solution of (4.39) can be computed using (4.40) as

−11 2 2  2 −16 −1 x = A b =  −4 0 1  1 =  −6  . (4.50) 6 −1 −1 2 9

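The result of the Gaussian elimination can be cross-checked numerically (a sketch added here, assuming numpy): np.linalg.inv reproduces A⁻¹ from (4.49) and np.linalg.solve the solution x from (4.50).

```python
import numpy as np

A = np.array([[1,  0, 2],
              [2, -1, 3],
              [4,  1, 8]], dtype=float)
b = np.array([2, 1, 2], dtype=float)

A_inv = np.linalg.inv(A)                          # cf. (4.49)
x = np.linalg.solve(A, b)                         # cf. (4.50)

assert np.allclose(A_inv, [[-11, 2, 2], [-4, 0, 1], [6, -1, -1]])
assert np.allclose(A @ A_inv, np.eye(3))          # A A^{-1} = I_3
assert np.allclose(x, [-16, -6, 9])
```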
Matrix transposition The transposition of a matrix is the exchange of its row and column elements. Transposition of a matrix A is denoted by Aᵀ and implies that if A ∈ R^{n×m}, then Aᵀ ∈ R^{m×n}. Formally, we have for

A := [ a_11 a_12 ··· a_1m
       a_21 a_22 ··· a_2m
        ⋮    ⋮   ⋱   ⋮
       a_n1 a_n2 ··· a_nm ] ∈ R^{n×m}    (4.51)

and

B := Aᵀ ∈ R^{m×n},    (4.52)

that

B = [ b_11 b_12 ··· b_1n
      b_21 b_22 ··· b_2n
       ⋮    ⋮   ⋱   ⋮
      b_m1 b_m2 ··· b_mn ]
  := [ a_11 a_21 ··· a_n1
       a_12 a_22 ··· a_n2
        ⋮    ⋮   ⋱   ⋮
       a_1m a_2m ··· a_nm ] ∈ R^{m×n}.    (4.53)

Example. For example, if

A := [ 2 3 0
       1 6 5 ],    (4.54)

then

Aᵀ = [ 2 1
       3 6
       0 5 ].    (4.55)

Note that the transpose of a 1 × 1 matrix, i.e., a scalar, is just the same scalar again.


4.3 Determinants

Historically, determinants were used to characterize the solution space of systems of linear equations of the form (4.23): if the determinant of the coefficient matrix A is non-zero, the system has a unique solution; if the determinant of A is zero, the system either has infinitely many solutions or no solution at all. There exists a rich general theory of determinants in linear algebra. However, from the perspective of an introduction to the GLM, it suffices to review the determinants of a few basic matrices that will reappear in the context of Gaussian distributions.

Determinants of 2 × 2 and 3 × 3 matrices. In general, determinants can be understood as functions on the set of square matrices A ∈ R^{n×n} which take on scalar values. The determinant of a matrix A ∈ R^{n×n} is denoted by |A|. For a matrix A ∈ R^{2×2}, the determinant is defined as

| · | : R^{2×2} → R, A ↦ |A| := a_11 a_22 − a_12 a_21.    (4.56)

For a matrix A ∈ R^{3×3}, the determinant is defined as

| · | : R^{3×3} → R, A ↦ |A| := a_11 a_22 a_33 + a_12 a_23 a_31 + a_13 a_21 a_32
                                − a_12 a_21 a_33 − a_11 a_23 a_32 − a_13 a_22 a_31.    (4.57)

Determinants of diagonal matrices. Let D ∈ R^{n×n} denote a diagonal matrix, i.e., D = (d_ij)_{1≤i,j≤n} with d_ij ≠ 0 for the diagonal elements (i.e., i = j) and d_ij = 0 for the off-diagonal elements (i.e., i ≠ j). Then the determinant of D is the product of its diagonal elements:

| · | : R^{n×n} → R, D ↦ |D| = Π_{i=1}^{n} d_ii.    (4.58)

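The determinant formulas above can be evaluated with np.linalg.det. The sketch below (an addition to the text, assuming numpy; the concrete matrices are chosen for illustration only) checks the 2 × 2 formula (4.56) and the diagonal rule (4.58).

```python
import numpy as np

# 2 x 2 example: |A| = a11*a22 - a12*a21
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
assert np.isclose(np.linalg.det(A), 1*4 - 2*3)        # -2

# Diagonal example: the determinant is the product of the diagonal elements
D = np.diag([2.0, 3.0, 5.0])
assert np.isclose(np.linalg.det(D), 2*3*5)            # 30
```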
4.4 Symmetry and positive-definiteness

A square matrix A ∈ R^{n×n} is called symmetric, if it equals its transpose, i.e., if A = Aᵀ. For example, the matrix

A = [ 2 1 3
      1 4 5
      3 5 0 ]    (4.59)

is symmetric, which is easily seen by writing down its transpose. Note that non-square matrices cannot be symmetric, because their transposition results in a new matrix which differs from the original matrix in the number of its rows and columns, and thus cannot be identical to the original matrix. Symmetric matrices can have an additional property known as positive-definiteness. The concept of positive-definiteness can be approached from multiple perspectives and it is usually not trivial to check whether a given matrix is in fact positive-definite. A very fundamental notion of positive-definiteness is the following definition: a symmetric matrix A ∈ R^{n×n} is called positive-definite, if

xᵀAx > 0 for all x ∈ R^n, x ≠ 0.    (4.60)

Note that based on this definition, one would have to evaluate the product xT Ax for all x ∈ Rn to check whether A is in fact positive-definite. This is not realistic. Therefore a variety of additional and equivalent criteria exist for the positive-definiteness of a matrix, which are, however, beyond the scope of this introduction to the GLM. In the following example, we consider a simple scenario where the positive-definiteness of an exemplary matrix can actually be determined directly.


Example. We consider the symmetric matrix

A = [ 2 1
      1 2 ] ∈ R^{2×2}.    (4.61)

If we consider an arbitrary, but non-zero, vector

x := (x1, x2)ᵀ ∈ R²,    (4.62)

we find that

xᵀAx = (x1, x2) [ 2 1
                  1 2 ] (x1, x2)ᵀ
     = (2x1 + x2, x1 + 2x2)(x1, x2)ᵀ    (4.63)
     = (2x1 + x2)x1 + (x1 + 2x2)x2.

From the right-hand side of eq. (4.63), we then have

(2x1 + x2)x1 + (x1 + 2x2)x2 = 2x1² + x2x1 + x1x2 + 2x2²
                            = 2x1² + 2x1x2 + 2x2²
                            = x1² + 2x1x2 + x2² + x1² + x2²    (4.64)
                            = (x1 + x2)² + x1² + x2².

For an arbitrary non-zero x = (x1, x2)ᵀ, we have thus obtained a sum of squares of the components of x, which is always larger than zero. We have thus shown that A in (4.61) is positive-definite.

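The quadratic-form argument can also be probed numerically: evaluating xᵀAx for many randomly drawn non-zero x should always yield a positive value. The sketch below (added here, assuming numpy; it is merely a sanity check consistent with (4.60) and (4.64), not a proof) does exactly this.

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

for _ in range(1_000):
    x = rng.standard_normal(2)
    if np.allclose(x, 0.0):
        continue                      # the definition (4.60) excludes x = 0
    assert x @ A @ x > 0.0            # quadratic form x^T A x
```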
4.5 Bibliographic remarks

In this Section, we reviewed some fundamental aspects of matrices and matrix computations as required for the remainder of this introduction to the GLM. There exists a large variety of introductory texts on linear algebra. A good starting point is the classical textbook by Strang (2009). A more advanced treatment is offered for example by Horn and Johnson (2012). Comprehensive introductions to the solution theory of systems of linear equations and the algorithmic evaluation of matrix inverses can be found for example in Press (2007). Searle (1982) provides an excellent comprehensive overview of matrix algebra as relevant to statistics.

4.6 Study Questions

1. Write down the definition of matrix addition, subtraction, and scalar multiplication.
2. Write down the definition of the matrix product.
3. If X is an n × p matrix and β is a p × 1 vector, what is the size of Xβ?
4. Let

A := [ 1 2      and     B := [ 1 1
       3 4 ]                   0 2 ].    (4.65)

Evaluate the following matrices:

C := A + Bᵀ,  D := A − B,  E := AB,  and  F := BA.    (4.66)

5. Let 2 4 3 A := 1 3 and b = . (4.67)   2 0 2 Evaluate x = Ab, B = bbT AT , and C = bT AT A. (4.68)

The General Linear Model 20/21 | © 2020 Dirk Ostwald CC BY-NC-SA 4.0 Study Questions 54

6. Let X ∈ R^{10×3} and y ∈ R^{10}. What are the sizes of XᵀX, (XᵀX)⁻¹, Xᵀy, and of (XᵀX)⁻¹Xᵀy?
7. Explain the concept of a matrix inverse.
8. Evaluate the inverse A⁻¹ for

A := [ 1 0 0
       0 2 0
       0 0 4 ].    (4.69)

9. Evaluate the determinants of the matrices

M := [ 1 2      and     N := [ 2 1
       0 1 ]                   2 1 ].    (4.70)

10. Write down the definition of a positive-definite matrix.

5 | Probability spaces and random variables

In this Chapter, we review some essentials of probability theory as required for the theory of the GLM. We focus on the particularities and inner logic of the probability theory model rather than its practical application and primarily aim to establish important concepts and notation that will be used in subsequent sections. In Section 5.1, we first introduce the basic notion of a probability space as a model for experiments that involve some degree of . We then discuss some elementary aspects of probability in Section 5.2 which mainly serve to ground the subsequently discussed theory of random variables and random vectors. The fundamental mathematical construct to model univariate data endowed with uncertainty is the concept of a random variable. We focus on different ways of specifying probability distributions of random variables, notably probability mass and density functions for discrete and continuous random variables, respectively, in Section 5.3. The concise mathematical representation of more than one data point requires the concept of a random vector. In Section 5.4, we first discuss the extension of random variable concepts to the multivariate case of random vectors and then focus on three concepts that arise only in the multivariate scenario and are of immense importance for statistical data analysis: marginal distributions, conditional distributions, and independent random variables.

5.1 Probability spaces

Probability spaces Probability spaces are very general and abstract models of random experiments. We use the following definition.

Definition 5.1.1 (Probability space). A probability space is a triple (Ω, A, P), where

• Ω is a set of elementary outcomes ω,
• A is a σ-algebra, i.e., A is a set with the following properties:
  ◦ Ω ∈ A,
  ◦ A is closed under the formation of complements, i.e., if A ∈ A, then also Aᶜ := Ω \ A ∈ A,
  ◦ A is closed under countable unions, i.e., if A1, A2, A3, ... ∈ A, then ∪_{i=1}^{∞} A_i ∈ A.
• P is a probability measure, i.e., P is a mapping P : A → [0, 1] with the following properties:
  ◦ P is normalized, i.e., P(∅) = 0 and P(Ω) = 1, and
  ◦ P is σ-additive, i.e., if A1, A2, ... is a pairwise disjoint sequence in A (i.e., A_i ∈ A for i = 1, 2, ... and A_i ∩ A_j = ∅ for i ≠ j), then P(∪_{i=1}^{∞} A_i) = Σ_{i=1}^{∞} P(A_i).

•

Example A basic example is a probability space that models the throw of a die. In this case the elementary outcomes ω ∈ Ω model the six faces of the die, i.e., one may define Ω := {1, 2, 3, 4, 5, 6}. If the die is thrown, it will roll, and once it comes to rest, its upper surface will show one of the elementary outcomes. The typical σ-algebra used in the case of discrete and finite outcome sets (such as the current Ω) is the power set P(Ω) of Ω. It is a basic exercise in probability theory to show that the power set indeed fulfils the properties of a σ-algebra as defined above. Because P(Ω) contains all subsets of Ω, it also contains the elementary outcome sets {1}, {2}, ..., {6}, which thus get allocated a probability P({ω}) ∈ [0, 1], ω ∈ Ω by the probability measure P. Probabilities of sets containing a single elementary outcome are also often written simply as P(ω) (:= P({ω})). The typical value ascribed to P(ω), ω ∈ Ω, if used to model a fair die, is P(ω) = 1/6.

The σ-algebra P(Ω) contains many more sets than the sets of elementary outcomes. The purpose of these additional elements is to model all sorts of events to which an observer of the random experiment may want to ascribe probabilities. For example, the observer may ask “What is the probability that the upper surface shows a number larger than three?”. This event corresponds to the set {4, 5, 6}, which, because the σ-algebra P(Ω) contains all possible subsets of Ω, is contained in P(Ω). Likewise, the observer may ask “What is the probability that the upper surface shows an even number?”, which corresponds to the subset {2, 4, 6} of Ω. The probability measure P is defined in such a manner that the answers to the following questions are predetermined: “What is the probability that the upper surface shows nothing?” and “What is the probability that the upper surface shows any number in Ω?”. The element of P(Ω) that corresponds to the first question is the empty set, and by definition of P, P(∅) = 0. This models the idea that one of the elementary outcomes, i.e., one surface with pips, will show up on every instance of the random experiment. If this is not the case, for example because the pips have worn off at one of the surfaces, the probability space model as sketched thus far is not a good model of the die experiment. The element of P(Ω) that corresponds to the second question is Ω itself. Here, the definition of the probability measure assigns P(Ω) = 1, i.e., the probability that something unspecific will happen is one. Again, if the die falls off the table and cannot be recovered, the probability space model and the experiment are not in good alignment.

Finally, the definition of the probability space as provided above allows one to evaluate probabilities for certain events based on the probabilities of other events by means of the σ-additivity of P. Assume for example that the probability space models the throw of a fair die, such that P({ω}) = 1/6 by definition. Based on this assumption, the σ-additivity property allows one to evaluate the probabilities of many other events. Consider for example an observer who is interested in the probability of the event that the surface of the die shows a number smaller than or equal to three.
Because the elementary events {1}, {2}, {3} are pairwise disjoint, and because the event of interest can be written as the countable union {1, 2, 3} = {1} ∪ {2} ∪ {3} of these events, one may evaluate the probability of the event of interest by

P(∪_{i=1}^{3} {i}) = Σ_{i=1}^{3} P({i}) = 1/6 + 1/6 + 1/6 = 1/2.

The die example is concerned with the case that a probability space is used to model a random experiment with a finite number of elementary outcomes. In the modelling of scientific experiments, the elementary outcomes are often modelled by the set of real numbers or real-valued vectors. Much of the theoretical development of modern probability theory in the early twentieth century was concerned with the question of how ideas from basic probability with finite elementary outcome spaces can be generalized to the continuous outcome space case of real numbers and vectors. In fact, it is perhaps the most important contribution of the probability space model as defined above and originally developed by Kolmogorov (1956) to be applicable in both the discrete-finite and the continuous-infinite elementary outcome set scenarios. The study of probability spaces for Ω := R or Ω := R^n, n > 1 is a central topic in probability theory which we by and large omit here. We do however note that the σ-algebras employed when Ω := R^n, n ≥ 1 are the so-called Borel σ-algebras, commonly denoted by B for n = 1 and B^n for n > 1. The mathematical construction of these σ-algebras is beyond our scope, but for the theory of the GLM, it is not unhelpful to think of Borel σ-algebras as power sets of R or R^n, n > 1. This is factually wrong, as it can be shown that there are in fact more subsets of R or R^n, n > 1 than there are elements in the corresponding Borel σ-algebras. Nevertheless, many events of interest, such as the probability for the elementary outcome of a random experiment with outcome space R to fall into a real interval [a, b], are in B.
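The probability 1/2 obtained by σ-additivity can also be approximated by simulating a large number of throws of a fair die (a sketch added here, assuming numpy; the long-run relative frequency is only an approximation of the model probability).

```python
import numpy as np

rng = np.random.default_rng(1)
n_throws = 1_000_000

# Simulate a fair die: each elementary outcome in {1, ..., 6} has probability 1/6
throws = rng.integers(1, 7, size=n_throws)

# Relative frequency of the event {1, 2, 3}
freq = np.mean(throws <= 3)
assert abs(freq - 0.5) < 0.01         # close to P({1, 2, 3}) = 1/2
```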

5.2 Elementary probabilities

We next discuss a few elementary aspects of probabilities defined on probability spaces. Throughout, let (Ω, A, P) denote a probability space, such that P : A → [0, 1] is a probability measure.

Interpretation

We first note that the probability P(A) of an event A is associated with at least two interpretations. From a Frequentist perspective, the probability of an event corresponds to the idealized long run frequency of observing the event A. From a Bayesian perspective, the probability of an event corresponds to the degree of belief that the event is true. Notably, both interpretations are subjective in the sense that the Frequentist perspective envisions an idealized long run frequency which can never be realized in practice, while the Bayesian belief interpretation is explicitly subjective and specific to a given observer. However, irrespective of the specific interpretation of the probability of an event, the logical rules for probabilistic inference, also known as probability calculus, are identical under both interpretations.


Basic properties We next note the following basic properties of probabilities, which follow directly from the probability space definition.

Theorem 5.2.1 (Properties of probabilities). Let (Ω, A, P) denote a probability space. Then the following properties holds.

(1) If A ⊂ B, then P(A) ≤ P(B).
(2) P(Aᶜ) = 1 − P(A).
(3) If A ∩ B = ∅, then P(A ∪ B) = P(A) + P(B).
(4) P(A ∪ B) = P(A) + P(B) − P(A ∩ B).

As an example, we prove property (4) of Theorem 5.2.1 below.

Proof. With the fact that any union of two sets A, B ⊂ Ω can be written as the union of the disjoint sets A ∩ Bᶜ, A ∩ B, and Aᶜ ∩ B (cf. Section 2 | Sets, sums, and functions) and with the additivity of P for disjoint events, we have:

P(A ∪ B) = P(A ∩ Bᶜ) + P(A ∩ B) + P(Aᶜ ∩ B)
         = P(A ∩ Bᶜ) + P(A ∩ B) + P(Aᶜ ∩ B) + P(A ∩ B) − P(A ∩ B)
         = P((A ∩ Bᶜ) ∪ (A ∩ B)) + P((Aᶜ ∩ B) ∪ (A ∩ B)) − P(A ∩ B)    (5.1)
         = P(A) + P(B) − P(A ∩ B).

Independence An important feature of many probabilistic models is the independence of events. Intuitively, independence models the absence of deterministic and stochastic influences between events. Notably, independence can either be assumed, and thus built into a probabilistic model by design, or independence can follow from the design of the model. Regardless of the origin of the independence of events, we use the following definitions.

Definition 5.2.1 (Independent events). Let (Ω, A, P) denote a probability space. Two events A ∈ A and B ∈ A are independent, if P(A ∩ B) = P(A)P(B). (5.2)

A set of events {A_i | i ∈ I} ⊂ A with index set I is independent, if for every finite subset J ⊂ I

P(∩_{j∈J} A_j) = Π_{j∈J} P(A_j).    (5.3)

•

Notably, disjoint events with positive probability, such as observing an even or odd number of pips in the die experiment, are not independent: if P(A) > 0 and P(B) > 0, then P(A)P(B) > 0, but P(A ∩ B) = P(∅) = 0, and thus P(A ∩ B) ≠ P(A)P(B).

Conditional probability The basis for many forms of probabilistic inference is the conditional probability of an event given that another event occurs. We use the following definition.

Definition 5.2.2 (Conditional probability). Let (Ω, A, P) denote a probability space and let A, B ∈ A with P(B) > 0. Then the conditional probability of A given B is defined as

P(A|B) = P(A ∩ B) / P(B).    (5.4)

•


Without proof, we note that for any fixed B ∈ A, P(·|B) is a probability measure, i.e., P(·|B) ≥ 0, P(Ω|B) = 1, and for disjoint A1, A2, ... ∈ A, P(∪_{i=1}^{∞} A_i | B) = Σ_{i=1}^{∞} P(A_i|B). Note that the rules of probability apply to the events on the left of the vertical bar. Intuitively, P(A|B) is the fraction of times the event A occurs among those times in which the event B occurs. This fraction is already determined up to proportionality by P(A ∩ B), the idealized relative frequency or the belief that the events A and B occur together. Division of P(A ∩ B) by P(B) yields a normalized measure. Furthermore, in most probabilistic models P(A|B) ≠ P(B|A). For example, the probability of exhibiting respiratory symptoms after contracting corona virus does not necessarily equal the probability of contracting corona virus when exhibiting respiratory symptoms. Finally, a mathematical extension of conditional probability to the case of P(B) = 0 is possible, but technically beyond our scope. Rearranging the definition of conditional probability allows for expressing the probability of two events to occur jointly by the product of the conditional probability of one event given the other and the probability of the conditioning event. This fact is routinely used in the construction of probabilistic models. Formally, we have the following theorem, which follows directly from the definition of conditional probability.

Theorem 5.2.2 (Joint and conditional probabilities). Let (Ω, A, P) denote a probability space and let A, B ∈ A with P(A), P(B) > 0. Then

P(A ∩ B) = P(A|B)P(B) = P(B|A)P(A).    (5.5)

For independent events, knowledge of the occurrence of one of the events does not affect the probability of the other event occurring:

Theorem 5.2.3 (Conditional probability for independent events). Let (Ω, A, P) denote a probability space and let A, B ∈ A with P(A), P(B) > 0 denote two independent events. Then

P(A|B) = P(A) and P(B|A) = P(B). (5.6)

□

Proof. With the definitions of conditional probability and independent events, we have

P(A|B) = P(A ∩ B) / P(B) = P(A)P(B) / P(B) = P(A),    (5.7)

and analogously for P(B|A).
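Theorem 5.2.3 can be illustrated with the die model (a simulation sketch added here, assuming numpy): for A = {2, 4, 6} and B = {1, 2, 3, 4} one has P(A) = 1/2, P(B) = 2/3, and P(A ∩ B) = P({2, 4}) = 1/3 = P(A)P(B), so A and B are independent and the conditional relative frequency of A given B should approximate P(A).

```python
import numpy as np

rng = np.random.default_rng(2)
throws = rng.integers(1, 7, size=1_000_000)   # fair die

A = (throws % 2 == 0)                         # event A = {2, 4, 6}
B = (throws <= 4)                             # event B = {1, 2, 3, 4}

p_A = A.mean()
p_A_given_B = A[B].mean()                     # relative frequency of A among throws in B

assert abs(p_A - 0.5) < 0.01
assert abs(p_A_given_B - p_A) < 0.01          # P(A|B) = P(A) for independent events
```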

5.3 Random variables and distributions

The fundamental construct for the mathematical representation of numerical data endowed with uncertainty are random variables. From a mathematical perspective, random variables are neither random nor variables. Instead, random variables are functions that map elements of a probability outcome space Ω into another outcome space Γ. Γ is either a countable set X , in which case the functions are referred to as discrete random variables, or Γ is the real line R, in which case the functions are referred to as continuous random variables. If Γ is a multidimensional space, the respective functions are referred to as random vectors. In the current section, we are concerned with some fundamental aspects of random variables. In Section 5.4, we consider their multivariate generalization as random vectors.

Measurable functions and random variables First, not all functions that map elements of a probability outcome space Ω onto elements of another outcome space Γ are random variables. A fundamental feature of random variables is that they are measurable. In the mathematical literature, the terms measurable function and random variable are hence used interchangeably. To make the concept of a measurable function precise, let (Ω, A, P) denote a probability space, and let ξ :Ω → Γ, ω 7→ ξ(ω) (5.8) denote a function. Assume further that there exists a σ-algebra S on Γ. The tuple of a set Γ and a σ-algebra S is referred to as measurable space (for every probability space (Ω, A, P), (Ω, A) thus forms

a measurable space). Finally, for every set S ∈ S let ξ⁻¹(S) denote the preimage of S under ξ. The preimage of S ∈ S under ξ is the set of all ω ∈ Ω that are mapped onto elements of S by ξ, i.e.,

ξ−1(S) := {ω ∈ Ω|ξ(ω) ∈ S}. (5.9)

Now, if the preimages of all S ∈ S are elements of the σ-algebra A on Ω, then ξ is called a measurable function. Formally, we have the following definition.

Definition 5.3.1 (Measurable function). Let (Ω, A, P) be a probability space, let (Γ, S) denote a measur- able space, and let ξ :Ω → Γ, ω 7→ ξ(ω) (5.10) be a function. If ξ−1(S) ∈ A for all S ∈ S, (5.11) then ξ is called a measurable function. • A measurable function ξ :Ω → Γ is called a random variable:

Definition 5.3.2 (Random variable). Let (Ω, A, P) denote a probability space and let ξ :Ω → Γ denote a function. If ξ is a measurable function, then ξ is called a random variable. •

Probability distributions The condition of measurability of the function ξ has a fundamental consequence for the sets in S: because the probability measure P allocates a probability P(A) to all sets in A, and because, by definition of the measurability of ξ, all preimages ξ⁻¹(S) of all sets S ∈ S are sets in A, the construction of a random variable allows for allocating a probability to all sets S ∈ S, namely the probability of the preimage ξ⁻¹(S) ∈ A under P. This entails the induction of a probability measure on the measurable space (Γ, S). This induced probability measure is called the probability distribution of the random variable ξ and is denoted by Pξ. We use the following definition.

Definition 5.3.3 (Probability distribution). Let (Ω, A, P) denote a probability space, let (Γ, S) denote a measurable space, and let

ξ : Ω → Γ, ω ↦ ξ(ω)    (5.12)

denote a random variable. Then the probability measure Pξ defined by

Pξ : S → [0, 1], S ↦ Pξ(S) := P(ξ⁻¹(S))    (5.13)

is called the probability distribution of the random variable ξ.

•

Intuitively, the notion of randomness in the values ξ(ω) of ξ is captured by this construction as follows: in a first step, an element ω ∈ Ω is selected according to the probability P({ω}) that is allocated to ω by the probability measure P on (Ω, A). In a second step, this ω is mapped onto an element ξ(ω) in Γ, which is also referred to as a realization of the random variable ξ. Across realizations, the values of ξ exhibit a probability distribution that depends both on the properties of P and ξ and is denoted by Pξ. Figure 5.1 visualizes the situation.

Clearly, if Γ = Ω, S = A and ξ := id, then P and Pξ are identical. Importantly, the union of the measurable space (Γ, S) and the probability measure Pξ forms the probability space (Γ, S, Pξ). In most probabilistic models, it is the latter probability space that takes center stage. Most commonly, the random variable outcome set is given by the real line Γ := R and the σ-algebra corresponds to the Borel σ-algebra S := B. Moreover, the probability measure Pξ is usually directly defined by means of a probability density function (see below). Notably, given the probability space (R, B, Pξ), an underlying probability space (Ω, A, P) can always be constructed post-hoc by setting ξ := id.



Figure 5.1. Random variables and probability distributions. For a detailed discussion, please refer to the main text.

Notation In the following, we discuss a number of notational conventions with regards to probability distributions. We first note that random variables of the form ξ :Ω → Γ are often written as

ξ : (Ω, A) → (Γ, S) or ξ : (Ω, A, P) → (Γ, S).    (5.14)

Neither notation is inherently meaningful, as the random variable ξ only maps elements of Ω onto elements of Γ. Presumably, the notations of (5.14) evolved to stress the fact that the concept of a random variable entails the theoretical overhead of probability distributions that relate to S, A and P as described above. Second, the following notational conventions for events in A are commonly employed:

{ξ ∈ S} := {ω ∈ Ω | ξ(ω) ∈ S}
{ξ = x} := {ω ∈ Ω | ξ(ω) = x}
{ξ < x} := {ω ∈ Ω | ξ(ω) < x}    (5.15)
{ξ ≤ x} := {ω ∈ Ω | ξ(ω) ≤ x}
{ξ > x} := {ω ∈ Ω | ξ(ω) > x}
{ξ ≥ x} := {ω ∈ Ω | ξ(ω) ≥ x}

for S ∈ S and x ∈ Γ and

{x1 < ξ < x2} := {ω ∈ Ω | x1 < ξ(ω) < x2}
{x1 ≤ ξ < x2} := {ω ∈ Ω | x1 ≤ ξ(ω) < x2}    (5.16)
{x1 < ξ ≤ x2} := {ω ∈ Ω | x1 < ξ(ω) ≤ x2}
{x1 ≤ ξ ≤ x2} := {ω ∈ Ω | x1 ≤ ξ(ω) ≤ x2}

for x1, x2 ∈ Γ, x1 ≤ x2 and similarly for larger than relationships. These conventions entail the following conventions for expressing the probabilistic behaviour of random variables, here demonstrated for a selection of the events listed above:

Pξ(ξ ∈ S) := P({ξ ∈ S}) = P({ω ∈ Ω | ξ(ω) ∈ S})    (5.17)
Pξ(ξ = x) := P({ξ = x}) = P({ω ∈ Ω | ξ(ω) = x})    (5.18)
Pξ(ξ ≤ x) := P({ξ ≤ x}) = P({ω ∈ Ω | ξ(ω) ≤ x})    (5.19)
Pξ(x1 ≤ ξ ≤ x2) := P({x1 ≤ ξ ≤ x2}) = P({ω ∈ Ω | x1 ≤ ξ(ω) ≤ x2}).    (5.20)

Because of the redundancy in the reference to ξ in symbols of the form Pξ(ξ ≥ s), the subscript is often omitted, i.e., the expression is written as P(ξ ≥ s). Note that this notation entails the danger of confusing

the underlying probability measure P of the probability space (Ω, A, P) with the induced probability measure Pξ on (Γ, S). However, as remarked above, (Ω, A, P) plays no explicit role in most applied cases and hence this danger is usually negligible. We next consider the direct specification of probability distributions by means of cumulative distribution functions, probability mass functions, and probability density functions.

Cumulative distribution functions

One way to specify the probability distribution Pξ of a random variable is to define its cumulative distribution function. We denote the cumulative distribution function of a random variable ξ by Pξ and use the following definition. Definition 5.3.4 (Cumulative distribution function). Let ξ be a real-valued random variable. Then a cumulative distribution function of ξ is a function defined as

Pξ :Γ → [0, 1], x 7→ Pξ(x) := Pξ(ξ ≤ x). (5.21) •

Intuitively, Pξ(x) represents the probability that the random variable ξ takes on a value smaller than or equal to x. It thus follows that 1 − Pξ(x) represents the probability that the random variable ξ takes on a value larger than x. Importantly, by specifying the functional form of a cumulative distribution function Pξ, the probability of all events {ξ ≤ x} for x ∈ Γ is defined. An alternative and much more common approach to define the probability distributions of random variables is by means of probability mass and probability density functions.

Probability mass functions Probability mass functions are used to define the distributions of discrete random variables ξ : Ω → X with discrete and finite (or at least countable) outcome set X. We use the following definitions.

Definition 5.3.5 (Discrete random variable, probability mass function). Let (Ω, A, P) denote a probability space. A random variable ξ :Ω → X is called discrete, if its outcome space X contains only a finite number of or countably many elements xi, i = 1, 2, .... The probability mass function (PMF) of a discrete random variable ξ is denoted by pξ and is defined as

pξ : X → [0, 1], xi 7→ pξ(xi) := Pξ(ξ = xi). (5.22)

•

Note that by definition, PMFs are non-negative and normalized, i.e.,

pξ(xi) ≥ 0 for all xi ∈ X and Σ_{xi ∈ X} pξ(xi) = 1,    (5.23)

respectively. Both properties follow directly from the definition of a probability distribution as a probability measure.

The cumulative distribution function of a discrete random variable ξ with PMF pξ evaluates to

Pξ : X → [0, 1], x ↦ Pξ(x) := Pξ(ξ ≤ x) = Σ_{xi ≤ x} pξ(xi)    (5.24)

and is also referred to as cumulative mass function (CMF).

Probability density functions

Probability density functions are used to define the distributions of continuous random variables ξ :Ω → R. We use the following definitions.

Definition 5.3.6 (Continuous random variable, probability density function). Let (Ω, A, P) denote a probability space. A random variable ξ :Ω → R is called a continuous random variable or a real-valued random variable. The probability density function (PDF) of a continuous random variable is defined as a function pξ : R → R≥0, x 7→ pξ(x) (5.25) with the properties


(1) ∫_{−∞}^{∞} pξ(x) dx = 1, and
(2) Pξ(x1 ≤ ξ ≤ x2) = ∫_{x1}^{x2} pξ(x) dx for all x1, x2 ∈ R with x1 ≤ x2.

•

Property (2) of Definition 5.3.6 is central to the understanding of PDFs: the probability of a continuous random variable ξ to take on values in an interval [x1, x2] ⊂ R is obtained by integrating its associated PDF on the interval [x1, x2]. Notably, the probability for a continuous random variable ξ to take on any specific value x ∈ R is zero, because by property (2) in Definition 5.3.6, we have

Pξ(ξ = x) = Pξ(x ≤ ξ ≤ x) = ∫_x^x pξ(s) ds = 0.    (5.26)

Also note that the motivation of the term probability density relates closely to the physical relations between mass, density, and volume,

Mass = Density × Volume. (5.27)

Physical density is a measure of the physical mass of a material per unit volume. To obtain the physical mass of an object of a given material with arbitrary volume, the physical density of the material has to be multiplied with the volume of the object. In analogy and with the intuition of definite integrals (cf. Section 3 | Calculus), to obtain the probability mass that is associated with a given interval of the real numbers, the size of the interval has to be multiplied with the associated values of the probability density. The cumulative distribution function of a continuous real-valued random variable ξ with PDF pξ evaluates to

Pξ : R → [0, 1], x ↦ Pξ(x) = ∫_{−∞}^{x} pξ(s) ds    (5.28)

and is also referred to as cumulative density function (CDF). With the intuition of indefinite integrals (cf. Section 3 | Calculus), we thus see that PDFs can be regarded as derivatives of CDFs, or vice versa, CDFs can be regarded as anti-derivatives of PDFs, in symbols

pξ(x) = d/dx Pξ(x).    (5.29)

Finally, with the properties of basic integrals, we have the following possibility to evaluate the probability that a continuous random variable takes on values in an interval [x1, x2] by means of its CDF (and likewise for semi-open and open intervals):

Pξ(x1 ≤ ξ ≤ x2) = Pξ(x2) − Pξ(x1). (5.30)
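The relations (5.28)–(5.30) can be illustrated with a concrete continuous distribution (a sketch added here, assuming numpy and scipy; it uses the standard Gaussian from scipy.stats, a distribution that is only introduced later in the text): the probability of an interval equals both the integral of the PDF over the interval and the difference of the CDF values.

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

x1, x2 = -1.0, 2.0

# P(x1 <= xi <= x2) via the integral of the PDF, cf. property (2) of Definition 5.3.6
p_int, _ = quad(norm.pdf, x1, x2)

# The same probability via the CDF difference, cf. (5.30)
p_cdf = norm.cdf(x2) - norm.cdf(x1)

assert np.isclose(p_int, p_cdf)

# P(xi = x) = 0 for any single point x, cf. (5.26)
assert np.isclose(quad(norm.pdf, 1.0, 1.0)[0], 0.0)
```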

5.4 Random vectors and multivariate probability distributions

Random vectors Random vectors are the multivariate extension of random variables. We use the following definition.

Definition 5.4.1 (Random vector). Let (Ω, A, P) denote a probability space and let (Γn, Sn) denote the n-dimensional measurable space. Then a function

ξ :Ω → Γn, ω 7→ ξ(ω) (5.31) is called an n-dimensional random vector, if it is a measurable function, i.e., if

ξ−1(S) ∈ A for all S ∈ Sn. (5.32)

Without proof, we note that a multivariate function ξ = (ξ1, ..., ξn)ᵀ is a measurable function, if its component functions ξ1, ..., ξn are measurable functions. This implies that the component functions of a random vector are random variables. n-dimensional random vectors may thus be conceived as the concatenation of n random variables, while random variables are one-dimensional random vectors.


Multivariate probability distributions Multivariate probability distributions are the probability distributions of random vectors. In complete analogy to the random variable scenario, we use the following definition.

Definition 5.4.2 (Multivariate probability distribution). Let (Ω, A, P) denote a probability space, let (Γn, Sn) denote the n-dimensional measurable space, and let

ξ : Ω → Γⁿ, ω ↦ ξ(ω)    (5.33)

denote a random vector. Then the probability measure Pξ defined by

Pξ : Sⁿ → [0, 1], S ↦ Pξ(S) := P(ξ⁻¹(S)) = P({ω ∈ Ω | ξ(ω) ∈ S})    (5.34)

is called the multivariate probability distribution of the random vector ξ.

•

For simplicity, the multivariate nature of the probability distribution of a random vector is often left implicit, such that one simply speaks of the probability distribution of a random vector.

Notation The notational conventions for events discussed in Section 5.3 extend to the multivariate case. For example, for S ∈ Sn and x ∈ Γn, we have

Pξ(ξ ∈ S) := P({ξ ∈ S}) = P({ω ∈ Ω | ξ(ω) ∈ S})
Pξ(ξ = x) := P({ξ = x}) = P({ω ∈ Ω | ξ(ω) = x})    (5.35)
Pξ(ξ ≤ x) := P({ξ ≤ x}) = P({ω ∈ Ω | ξ(ω) ≤ x})
Pξ(x1 ≤ ξ ≤ x2) := P({x1 ≤ ξ ≤ x2}) = P({ω ∈ Ω | x1 ≤ ξ(ω) ≤ x2}).

Note that relational operators such as ≤ are understood to hold component-wise for multivariate entities, e.g., x ≤ y for x, y ∈ Γⁿ is understood as xi ≤ yi for all i = 1, ..., n.

Multivariate cumulative distribution functions

One way to specify the probability distribution Pξ of a random vector is to define its multivariate cumulative distribution function. In analogy to the random variable scenario, we use the following definition.

Definition 5.4.3 (Multivariate cumulative distribution function). Let ξ be a random vector. Then a multivariate cumulative distribution function of ξ is a function

Pξ : Γ^n → [0, 1], x ↦ Pξ(x) := Pξ(ξ ≤ x). (5.36)

•

More commonly employed alternatives for specifying the probability distributions of random vectors are multivariate probability mass and density functions. The intuitions for probability mass and density functions established for random variables extend to random vectors.

Multivariate probability mass functions Multivariate probability mass functions are used to define the distributions of discrete random vectors. We use the following definitions.

Definition 5.4.4 (Discrete random vector, multivariate probability mass function). Let (Ω, A, P) denote a probability space. A random vector ξ : Ω → X is called discrete, if its outcome space X contains only a finite or countably infinite number of elements xi, i = 1, 2, .... The multivariate probability mass function of a discrete random vector ξ is denoted by pξ and is defined as

pξ : X → [0, 1], xi ↦ pξ(xi) := Pξ(ξ = xi). (5.37)

•

Like their univariate counterparts, multivariate PMFs are non-negative and normalized.


Example. To exemplify the concept of a multivariate PMF, we consider a discrete two-dimensional random vector ξ = (ξ1, ξ2) taking values in X = X1 × X2 with X1 := {1, 2, 3} and X2 := {1, 2, 3, 4}. An exemplary two-dimensional PMF of the form

pξ : {1, 2, 3} × {1, 2, 3, 4} → [0, 1], (x1, x2) ↦ pξ(x1, x2) (5.38)

is specified in Table 5.1. Note that Σ_{x1=1}^{3} Σ_{x2=1}^{4} pξ(x1, x2) = 1.

pξ(x1, x2)   x2 = 1   x2 = 2   x2 = 3   x2 = 4
x1 = 1       0.1      0.0      0.2      0.1
x1 = 2       0.1      0.2      0.0      0.0
x1 = 3       0.0      0.1      0.1      0.1

Table 5.1. An exemplary bivariate PMF.

Multivariate probability density functions Multivariate probability density functions are used to define the distributions of continuous random vectors. We use the following definitions.

Definition 5.4.5 (Continuous random vector, multivariate probability density function). Let (Ω, A, P) denote a probability space. A random vector ξ :Ω → Rn is called a continuous random vector. The multivariate probability density function of a continuous random vector is defined as a function

pξ : R^n → R≥0, x ↦ pξ(x), (5.39)

such that

(1) ∫_{R^n} pξ(x) dx = 1, and
(2) Pξ(x1 ≤ ξ ≤ x2) = ∫_{x11}^{x21} ··· ∫_{x1n}^{x2n} pξ(s1, ..., sn) ds1 ··· dsn for all x1, x2 ∈ R^n with x1 ≤ x2.

•

Like in the random variable scenario, we have

Pξ(ξ = x) = Pξ(x ≤ ξ ≤ x) = ∫_{x1}^{x1} ··· ∫_{xn}^{xn} pξ(s1, ..., sn) ds1 ··· dsn = 0. (5.40)

As for the probability distributions of random vectors, we often omit the qualifying adjective multivariate when discussing the PMFs and PDFs of random vectors.

Marginal distributions Marginal distributions are the probability distributions of the components of random vectors. In the following, we first define marginal distributions and discuss how univariate marginal distributions can be evaluated based on multivariate PMFs and PDFs. We then discuss an example for the marginal distributions of a two-dimensional discrete random vector. Examples for marginal distributions of multivariate continuous random vectors are discussed in the context of Gaussian distributions in Section 7 | Probability distributions.

Definition 5.4.6 (Marginal random variables and vectors, marginal probability distributions). Let (Ω, A, P) denote a probability space, let ξ : Ω → Γ^n denote a random vector, let Pξ denote the probability distribution of ξ, and let Γ^(i) denote the outcome space of the ith component of ξ, such that Γ^n = ×_{i=1}^{n} Γ^(i). Then the probability distribution defined by

Pξi : S → [0, 1], S ↦ Pξi(S) := Pξ(Γ^(1) × ··· × Γ^(i−1) × S × Γ^(i+1) × ··· × Γ^(n)) for S ⊆ Γ^(i) (5.41)

is called the ith univariate marginal distribution of ξ. •


Without proof, we note that marginal distributions can be evaluated from multivariate PMFs and PDFs by means of summation and integration, respectively.

Theorem 5.4.1 (Marginal probability mass functions, marginal probability density functions). Let ξ denote a discrete random vector with probability mass function pξ. Then the probability mass function of the ith component ξi of ξ evaluates to

pξi : R → [0, 1], xi ↦ pξi(xi) := Σ_{x1} ··· Σ_{xi−1} Σ_{xi+1} ··· Σ_{xn} pξ(x). (5.42)

Similarly, let ξ denote a continuous random vector with probability density function pξ. Then the probability density function of the ith component ξi of ξ evaluates to

pξi : R → R≥0, xi ↦ pξi(xi) := ∫_{x1} ··· ∫_{xi−1} ∫_{xi+1} ··· ∫_{xn} pξ(x) dx1 ··· dxi−1 dxi+1 ··· dxn. (5.43)



Example To exemplify the concept of a marginal PMF, we reconsider the discrete two-dimensional random vector ξ = (ξ1, ξ2) taking values in X = X1 × X2 with X1 := {1, 2, 3} and X2 := {1, 2, 3, 4} and

PMF specified in Table 5.1. Based on Theorem 5.4.1, the marginal PMFs pξ1 and pξ2 of ξ evaluate as specified in Table 5.2 below. Note that Σ_{x1=1}^{3} pξ1(x1) = 1 and Σ_{x2=1}^{4} pξ2(x2) = 1.

pξ(x1, x2)   x2 = 1   x2 = 2   x2 = 3   x2 = 4   pξ1(x1)
x1 = 1       0.1      0.0      0.2      0.1      0.4
x1 = 2       0.1      0.2      0.0      0.0      0.3
x1 = 3       0.0      0.1      0.1      0.1      0.3
pξ2(x2)      0.2      0.3      0.3      0.2

Table 5.2. Exemplary marginal PMFs.
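As a brief computational aside, the marginal PMFs of Table 5.2 can be reproduced by summing the joint PMF of Table 5.1 over rows and columns. The following minimal Python sketch assumes that NumPy is available; all variable names are illustrative.

```python
# Minimal numerical check of the marginal PMFs in Table 5.2 (assumes NumPy).
import numpy as np

# Joint PMF of (xi_1, xi_2) from Table 5.1; rows index x1 = 1, 2, 3, columns x2 = 1, 2, 3, 4.
p_joint = np.array([[0.1, 0.0, 0.2, 0.1],
                    [0.1, 0.2, 0.0, 0.0],
                    [0.0, 0.1, 0.1, 0.1]])

p_x1 = p_joint.sum(axis=1)  # marginal PMF of xi_1: sum over x2
p_x2 = p_joint.sum(axis=0)  # marginal PMF of xi_2: sum over x1

print(p_x1)                     # [0.4 0.3 0.3]
print(p_x2)                     # [0.2 0.3 0.3 0.2]
print(p_x1.sum(), p_x2.sum())   # both 1.0 (up to floating point error)
```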

Conditional distributions

Recall that for a probability space (Ω, A, P) and two events A, B ∈ A with P(B) > 0, the conditional probability of event A given event B is defined as

P(A|B) = P(A ∩ B)/P(B). (5.44)

Analogously, for the distribution of two random variables ξ1 and ξ2, the conditional probability distribution of ξ1 given ξ2 is defined in terms of events A = {ξ1 ∈ X1} and B = {ξ2 ∈ X2}. To introduce conditional distributions, we first consider the case of two-dimensional (bivariate) discrete and continuous random vectors.

Definition 5.4.7 (Conditional PMF, discrete conditional distribution). Let ξ = (ξ1, ξ2)^T denote a discrete random vector with PMF pξ = pξ1,ξ2 and marginal PMFs pξ1 and pξ2. Then the conditional PMF of ξ1 given ξ2 = x2 is defined as

pξ1|ξ2 : R → [0, 1], x1 ↦ pξ1|ξ2(x1|x2) := pξ1,ξ2(x1, x2)/pξ2(x2) for pξ2(x2) > 0 (5.45)

and the conditional PMF of ξ2 given ξ1 = x1 is defined as

pξ2|ξ1 : R → [0, 1], x2 ↦ pξ2|ξ1(x2|x1) := pξ1,ξ2(x1, x2)/pξ1(x1) for pξ1(x1) > 0. (5.46)

The discrete distributions with PMFs pξ1|ξ2 (·|ξ2 = x2) and pξ2|ξ1(·|ξ1 = x1) are called the conditional distributions of ξ1 given ξ2 = x2 and ξ2 given ξ1 = x1, respectively. •


In complete analogy to the conditional probabilities of events, we have

pξ1|ξ2(x1|x2) = pξ1,ξ2(x1, x2)/pξ2(x2) = P({ξ1 = x1} ∩ {ξ2 = x2})/P(ξ2 = x2) (5.47)

and likewise for pξ2|ξ1. Like conditional probabilities, conditional PMFs behave like proper probability measures in their first argument.

Example. Consider the earlier example of the two-dimensional PMF pξ1,ξ2 and its marginal PMFs pξ1 and pξ2 documented in Table 5.1 and Table 5.2. For this example, the conditional PMFs of ξ2 given

ξ1 = 1, ξ1 = 2, and ξ1 = 3 are evaluated in Table 5.3 below. Note the qualitative similarity of pξ1,ξ2 (x1, x2) and pξ2|ξ1 (x2|x1).

pξ2|ξ1(x2|x1)        x2 = 1          x2 = 2          x2 = 3          x2 = 4
pξ2|ξ1(x2|x1 = 1)    0.1/0.4 = 1/4   0.0/0.4 = 0     0.2/0.4 = 1/2   0.1/0.4 = 1/4   Σ_{x2=1}^{4} pξ2|ξ1(x2|x1) = 1
pξ2|ξ1(x2|x1 = 2)    0.1/0.3 = 1/3   0.2/0.3 = 2/3   0.0/0.3 = 0     0.0/0.3 = 0     Σ_{x2=1}^{4} pξ2|ξ1(x2|x1) = 1
pξ2|ξ1(x2|x1 = 3)    0.0/0.3 = 0     0.1/0.3 = 1/3   0.1/0.3 = 1/3   0.1/0.3 = 1/3   Σ_{x2=1}^{4} pξ2|ξ1(x2|x1) = 1

Table 5.3. Exemplary conditional PMF.
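The conditional PMFs of Table 5.3 can likewise be obtained by dividing each row of the joint PMF by the corresponding marginal probability. A minimal Python sketch, again assuming NumPy, is given below; the variable names are illustrative.

```python
# Minimal sketch: conditional PMFs of xi_2 given xi_1 from the joint PMF (assumes NumPy).
import numpy as np

p_joint = np.array([[0.1, 0.0, 0.2, 0.1],
                    [0.1, 0.2, 0.0, 0.0],
                    [0.0, 0.1, 0.1, 0.1]])          # Table 5.1

p_x1 = p_joint.sum(axis=1, keepdims=True)           # marginal of xi_1 as a column vector
p_x2_given_x1 = p_joint / p_x1                       # row-wise p(x2|x1) = p(x1, x2)/p(x1)

print(p_x2_given_x1)                                 # rows reproduce Table 5.3
print(p_x2_given_x1.sum(axis=1))                     # each row sums to 1
```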

Similarly, we have the following definition for conditional distributions of continuous random variables.

Definition 5.4.8 (Conditional PDF, continuous conditional distribution). Let ξ = (ξ1, ξ2)^T denote a continuous random vector with PDF pξ = pξ1,ξ2 and marginal PDFs pξ1 and pξ2. Then the conditional PDF of ξ1 given ξ2 = x2 is defined as

pξ1|ξ2 : R → R≥0, x1 ↦ pξ1|ξ2(x1|x2) := pξ1,ξ2(x1, x2)/pξ2(x2) for pξ2(x2) > 0, (5.48)

and the conditional PDF of ξ2 given ξ1 = x1 is defined as

pξ2|ξ1 : R → R≥0, x2 ↦ pξ2|ξ1(x2|x1) := pξ1,ξ2(x1, x2)/pξ1(x1) for pξ1(x1) > 0. (5.49)

The continuous distributions with PDFs pξ1|ξ2(·|ξ2 = x2) and pξ2|ξ1(·|ξ1 = x1) are called the conditional distributions of ξ1 given ξ2 = x2 and of ξ2 given ξ1 = x1, respectively. •

Finally, the two-dimensional scenario discussed thus far can be generalized to the multivariate scenario in terms of the following definition, which covers both the discrete and continuous settings.

Definition 5.4.9 (Multivariate conditional PMF and PDF). Let ξ = (ξ1, ξ2) denote an n-dimensional random vector, where ξ1 and ξ2 denote k- and (n − k)-dimensional random vectors, respectively. Let pξ1,ξ2 denote the PMF or PDF of ξ and let pξ2 denote the (n − k)-dimensional marginal PMF or PDF of ξ2. Then, for ξ2 = x2, the conditional k-dimensional PMF or PDF of ξ1 given ξ2 is defined as

pξ1|ξ2 : R^k → R≥0, x1 ↦ pξ1|ξ2(x1|x2) := pξ1,ξ2(x1, x2)/pξ2(x2) for pξ2(x2) > 0. (5.50)

•

Independence

In analogy to the definition of independent events (cf. 5.2.1), two random variables ξ1 and ξ2 are called independent, if {ξ1 ∈ S1} and {ξ2 ∈ S2} are independent events for all S1 and S2. We use the following definition.

Definition 5.4.10 (Independent random variables). Two random variables ξ1 : Ω → Γ^(1) and ξ2 : Ω → Γ^(2) are independent, if for every S1 ⊆ Γ^(1) and S2 ⊆ Γ^(2) it holds that

P(ξ1 ∈ S1, ξ2 ∈ S2) = P(ξ1 ∈ S1)P(ξ2 ∈ S2). (5.51) •


As in the elementary probability scenario, independence of random variables implies that

P({ξ1 ∈ S1}|{ξ2 ∈ S2}) = P({ξ1 ∈ S1}) (5.52) or, intuitively, that knowledge of the fact that ξ2 ∈ S2 does not affect the probability of the event ξ1 ∈ S1. Without proof, we note the following theorem that transfers the definition of independent random variables to their respective PMF or PDF.

Theorem 5.4.2 (Independence and PMF/PDF factorization). Let ξ1 :Ω → X1 and ξ2 :Ω → X2 denote discrete random variables with PMF pξ1,ξ2 and marginal PMFs pξ1 and pξ2 , respectively. Then ξ1 and ξ2 are independent, if and only if

pξ1,ξ2 (x1, x2) = pξ1 (x1)pξ2 (x2) for all (x1, x2) ∈ X1 × X2. (5.53)

Similarly, let ξ1 and ξ2 denote continuous random variables with PDF pξ1,ξ2 and marginal PDFs pξ1 and pξ2 , respectively. Then ξ1 and ξ2 are independent, if and only if

pξ1,ξ2(x1, x2) = pξ1(x1)pξ2(x2) for all (x1, x2) ∈ R². (5.54)

Notably, the PMF or PDF property

pξ1,ξ2 (x1, x2) = pξ1 (x1)pξ2 (x2) (5.55) is referred to as factorization of the PMF or PDF. The independence of two random variables is thus equivalent to the factorization of their bivariate PMF or PDF.

Example Consider the earlier example of a bivariate PMF and its associated marginal PMFs (cf. Table 5.2). Because

pξ1,ξ2(1, 1) = 0.1 ≠ 0.08 = pξ1(1)pξ2(1), (5.56)

pξ1,ξ2(x1, x2)   x2 = 1   x2 = 2   x2 = 3   x2 = 4   pξ1(x1)
x1 = 1           0.08     0.12     0.12     0.08     0.40
x1 = 2           0.06     0.09     0.09     0.06     0.30
x1 = 3           0.06     0.09     0.09     0.06     0.30
pξ2(x2)          0.20     0.30     0.30     0.20

Table 5.4. A factorized PMF.
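The factorized PMF of Table 5.4 and the failure of the joint PMF of Table 5.1 to factorize can be checked numerically. The following Python sketch assumes NumPy; the outer product of the marginal PMFs yields the PMF of the independent case.

```python
# Minimal sketch: testing PMF factorization (assumes NumPy).
import numpy as np

p_joint = np.array([[0.1, 0.0, 0.2, 0.1],
                    [0.1, 0.2, 0.0, 0.0],
                    [0.0, 0.1, 0.1, 0.1]])          # Table 5.1

p_x1 = p_joint.sum(axis=1)
p_x2 = p_joint.sum(axis=0)
p_factorized = np.outer(p_x1, p_x2)                  # PMF under independence (Table 5.4)

print(p_factorized)                                  # 0.08, 0.12, ... as in Table 5.4
print(np.allclose(p_joint, p_factorized))            # False: xi_1 and xi_2 are not independent
```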

The bivariate case of two independent random variables is generalized to the case of n independent random variables in the following definition.

Definition 5.4.11 (n independent random variables). n random variables ξ1, ..., ξn are independent, if (1) (n) for every S1 ⊆ Γ , ..., Sn ⊆ Γ ,

n Y P(ξ1 ∈ S1, ..., ξn ∈ Sn) = P(ξi ∈ Si). (5.57) i=1

If the random variables have a multivariate PMF or PDF pξ1,...,ξn (x1, ..., xn) with marginal PMFs or

PDFs pξi , i = 1, ..., n, then independence holds if

pξ1,...,ξn(x1, ..., xn) = ∏_{i=1}^{n} pξi(xi). (5.58)

•

The special case of n independent random variables with identical marginal distributions serves as a fundamental assumption in many statistical settings. We use the following definition.


Definition 5.4.12 (Independent and identically distributed random variables). n random variables ξ1, ..., ξn are called independent and identically distributed (iid), if and only if

(1) ξ1, ..., ξn are independent random variables, and

(2) each ξi has the same marginal distribution for i = 1, ..., n. •

In Section 7 | Probability distributions, we consider the case of n iid Gaussian random variables and how their joint distribution can be represented by a multivariate Gaussian distribution.

5.5 Bibliographic remarks

The presented material is standard and can be found in any introductory textbook on probability and statistics. DeGroot and Schervish (2012) and Wasserman (2004) are the main sources for the presentation provided here. Excellent introductions to modern probability theory include Billingsley (1995), Fristedt et al. (1998), Rosenthal (2006), and, from a statistical perspective, Shao (2003).

5.6 Study questions

1. Write down the definition of a probability space.
2. Write down the definition of the independence of two events A and B.
3. Write down the definition of a random variable.
4. Write down the definition of the cumulative distribution function of a random variable.
5. Write down the definitions of a PMF and a PDF.
6. Write down the definition of a random vector.
7. Write down the definition of the cumulative distribution function of a random vector.
8. Write down the definition of a multivariate PMF and a multivariate PDF.

9. Write down the definition of the independence of n random variables ξi, i = 1, ..., n.

10. What does it mean for n random variables ξ1, ..., ξn to be iid?

6 | Expectation, covariance, and transformations

6.1 Expectation

Definition and examples The expectation is a one-number summary of the distribution of a random variable. Intuitively, the expectation is the value that one expects to observe on average over many realizations of the random variable. We use the following definition.

Definition 6.1.1 (Expectation of a random variable). Let (Ω, A, P) denote a probability space and let ξ denote a random variable. The expectation (or expected value) of ξ is defined as

- E(ξ) := Σ_{x∈X} x pξ(x), if ξ : Ω → X is a discrete random variable with PMF pξ, and as
- E(ξ) := ∫_{−∞}^{∞} x pξ(x) dx, if ξ : Ω → R is a continuous random variable with PDF pξ.

The expectation of a random variable is said to exist, if it is finite. •

Example 1 As a first example, we consider the expectation of a discrete random variable ξ with PMF

pξ : N6 → [0, 1], x ↦ pξ(x) := 1/6, (6.1)

e.g., a random variable modelling the numerical outcome of a fair die. From Definition 6.1.1 and with X = N6, we then have

E(ξ) = Σ_{x∈N6} x pξ(x) = 1 · 1/6 + 2 · 1/6 + 3 · 1/6 + 4 · 1/6 + 5 · 1/6 + 6 · 1/6 = 21/6 = 3.5. (6.2)
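This evaluation can be verified numerically, both by direct summation over the PMF and by averaging a large number of simulated die rolls. The following Python sketch assumes NumPy; the seed and sample size are arbitrary.

```python
# Minimal check of E(xi) = 3.5 for a fair die, analytically and by simulation (assumes NumPy).
import numpy as np

x = np.arange(1, 7)          # outcome space {1, ..., 6}
p = np.full(6, 1 / 6)        # uniform PMF

print(np.sum(x * p))         # 3.5, the expectation of eq. (6.2)

rng = np.random.default_rng(0)
samples = rng.integers(1, 7, size=100_000)
print(samples.mean())        # close to 3.5 for a large number of realizations
```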

Example 2 As a second example, we consider the expectation of a Gaussian random variable ξ, i.e., a continuous random variable with PDF

pξ : R → R>0, x ↦ pξ(x) := 1/√(2πσ²) exp(−(x − µ)²/(2σ²)). (6.3)

Then

E(ξ) = µ. (6.4)

Proof. We first note without proof that

∫_{−∞}^{∞} exp(−x²) dx = √π. (6.5)

From Definition 6.1.1, we have

E(ξ) = ∫_{−∞}^{∞} x 1/√(2πσ²) exp(−(x − µ)²/(2σ²)) dx. (6.6)

With the integration by substitution rule (cf. Section 3 | Calculus)

∫_{g(a)}^{g(b)} f(x) dx = ∫_{a}^{b} f(g(x)) g′(x) dx (6.7)

and the definition of

g : R → R, x ↦ g(x) := √(2σ²) x + µ with g′(x) = √(2σ²), (6.8)

we then have

E(ξ) = 1/√(2πσ²) ∫_{−∞}^{∞} x exp(−(x − µ)²/(2σ²)) dx
= 1/√(2πσ²) ∫_{−∞}^{∞} (√(2σ²) x + µ) exp(−((√(2σ²) x + µ) − µ)²/(2σ²)) √(2σ²) dx
= √(2σ²)/√(2πσ²) ∫_{−∞}^{∞} (√(2σ²) x + µ) exp(−x²) dx (6.9)
= 1/√π (√(2σ²) ∫_{−∞}^{∞} x exp(−x²) dx + µ ∫_{−∞}^{∞} exp(−x²) dx)
= 1/√π (√(2σ²) ∫_{−∞}^{∞} x exp(−x²) dx + µ √π).

An anti-derivative of x exp(−x²) is given by −(1/2) exp(−x²), because

d/dx (−(1/2) exp(−x²)) = −(1/2) d/dx exp(−x²) = −(1/2) exp(−x²)(−2x) = x exp(−x²). (6.10)

With lim_{x→−∞} −(1/2) exp(−x²) = 0 and lim_{x→∞} −(1/2) exp(−x²) = 0, the remaining integral term thus vanishes and we obtain

E(ξ) = 1/√π (µ √π) = µ. (6.11)

Properties of expectations We next discuss some properties of expectations that are often useful when evaluating the expectation of a random variable. Intuitively, the expectation of a sum of (scaled) random variables corresponds to the sum of the (scaled) expectations of the individual random variables and, for independent random variables, the expectation of a product of random variables corresponds to the product of the expectations of the individual random variables. These properties follow directly from the definition of the expectation as a sum or integral, both of which exhibit linearity properties (cf. Section 2 | Sets, sums, and functions and Section 3 | Calculus). For conciseness, we only consider the case of continuous random variables in the given proofs. We start with the following theorem.

Theorem 6.1.1 (Expectations of linear-affine transformations of random variables). Let ξ denote a random variable, let a, b ∈ R, and let ζ := aξ + b. Then

E(ζ) = aE(ξ) + b. (6.12)

Proof. The theorem follows directly with the linearity properties of sums and integrals (cf. Section 2 | Sets, sums, and functions and Section 3 | Calculus). We consider the case of a continuous random variable ξ with PDF pξ in more detail. In this case, we have

E(ζ) = E(aξ + b) = ∫ (ax + b) pξ(x) dx = ∫ a pξ(x) x + b pξ(x) dx = a ∫ x pξ(x) dx + b ∫ pξ(x) dx = aE(ξ) + b. (6.13)

The following theorem formalizes that the expectation of the scaled sum of random variables corresponds to the sum of scaled random variable expectations.

Theorem 6.1.2 (Expectations of linear combinations of random variables). Let ξ1, ..., ξn denote random variables and let a1, ..., an ∈ R. Then

E(Σ_{i=1}^{n} ai ξi) = Σ_{i=1}^{n} ai E(ξi). (6.14)

Proof. The theorem follows directly with the linearity properties of sums and integrals (cf. Section 2 | Sets, sums, and functions and Section 3 | Calculus). We consider the case of two continuous random variables ξ1 and ξ2 with bivariate PDF pξ1,ξ2 in more detail. In this case, we have

E(Σ_{i=1}^{2} ai ξi) = E(a1ξ1 + a2ξ2)
= ∫∫ (a1x1 + a2x2) pξ1,ξ2(x1, x2) dx1 dx2
= ∫∫ a1x1 pξ1,ξ2(x1, x2) + a2x2 pξ1,ξ2(x1, x2) dx1 dx2
= a1 ∫∫ x1 pξ1,ξ2(x1, x2) dx1 dx2 + a2 ∫∫ x2 pξ1,ξ2(x1, x2) dx1 dx2 (6.15)
= a1 ∫ x1 (∫ pξ1,ξ2(x1, x2) dx2) dx1 + a2 ∫ x2 (∫ pξ1,ξ2(x1, x2) dx1) dx2
= a1 ∫ x1 pξ1(x1) dx1 + a2 ∫ x2 pξ2(x2) dx2
= a1 E(ξ1) + a2 E(ξ2)
= Σ_{i=1}^{2} ai E(ξi).

Finally, an induction argument can be used to generalize the bivariate to the n-variate case.

Finally, the expectation of the product of random variables corresponds to the product of the expectations of the individual random variables, but in general only if the random variables are independent.

Theorem 6.1.3 (Expectation of products of independent random variables). Let ξ1, ..., ξn denote independent random variables. Then

E(∏_{i=1}^{n} ξi) = ∏_{i=1}^{n} E(ξi). (6.16)

Proof. We consider the case of n continuous random variables with joint PDF pξ1,...,ξn. Because ξ1, ..., ξn are independent, it holds that

pξ1,...,ξn(x1, ..., xn) = ∏_{i=1}^{n} pξi(xi). (6.17)

We thus have

E(∏_{i=1}^{n} ξi) = ∫ ··· ∫ (∏_{i=1}^{n} xi) pξ1,...,ξn(x1, ..., xn) dx1 ··· dxn
= ∫ ··· ∫ (∏_{i=1}^{n} xi) (∏_{i=1}^{n} pξi(xi)) dx1 ··· dxn
= ∫ ··· ∫ ∏_{i=1}^{n} xi pξi(xi) dx1 ··· dxn (6.18)
= ∏_{i=1}^{n} ∫ xi pξi(xi) dxi
= ∏_{i=1}^{n} E(ξi).

6.2 Variance

Definition and examples Variance and standard deviation are further one-number summaries of distributions. Intuitively, both the variance and the standard deviation capture the spread of the realizations of the random variable. The standard deviation of a random variable is the square root of its variance. Because the variance of a random variable involves a squaring operation, it is expressed in squared units of the random variable; the standard deviation, as its square root, is expressed in the same units as the random variable.

Definition 6.2.1 (Variance and standard deviation of a random variable). Let ξ denote a random variable with expectation E(ξ). The variance of ξ is defined as

V(ξ) := E((ξ − E(ξ))²), (6.19)

assuming that this expectation exists. The standard deviation of a random variable is defined as the square root of its variance,

S(ξ) := √V(ξ). (6.20)

•

Example 1 As a first example, we consider the variance of the discrete random variable ξ with PMF

pξ : N6 → [0, 1], x ↦ pξ(x) := 1/6. (6.21)

Above, we have evaluated the expectation of this random variable to E(ξ) = 3.5. With Definition 6.2.1 and the definition of the expectation of a discrete random variable, we then have

V(ξ) = E((ξ − E(ξ))²)
= Σ_{x∈N6} (x − E(ξ))² pξ(x)
= Σ_{x∈N6} (x − 3.5)² · 1/6 (6.22)
= ((1 − 3.5)² + (2 − 3.5)² + (3 − 3.5)² + (4 − 3.5)² + (5 − 3.5)² + (6 − 3.5)²) · 1/6
= 17.5/6.

The variance of ξ is thus V(ξ) = 17.5/6 ≈ 2.92 and the standard deviation of ξ is S(ξ) ≈ √2.92 ≈ 1.71. An alternative representation of the variance of a random variable that is often useful for the analytical evaluation of variances is given in the following theorem.

Theorem 6.2.1 (Variance translation theorem). Let ξ be a random variable. Then

V(ξ) = E(ξ²) − E(ξ)². (6.23)



Proof. With the definition of the variance of a random variable and the linearity of expectations, we have

V(ξ) = E((ξ − E(ξ))²)
= E(ξ² − 2ξE(ξ) + E(ξ)²)
= E(ξ²) − 2E(ξ)E(ξ) + E(ξ)² (6.24)
= E(ξ²) − 2E(ξ)² + E(ξ)²
= E(ξ²) − E(ξ)².

Example 2 In the following, we use the variance translation theorem to show that the variance of a Gaussian random variable ξ with PDF N(ξ; µ, σ²) is given by

V(ξ) = σ². (6.25)

Proof. We first note that with the variance translation theorem

V(ξ) = E(ξ²) − E(ξ)² = 1/√(2πσ²) ∫_{−∞}^{∞} x² exp(−(x − µ)²/(2σ²)) dx − µ². (6.26)

With the integration by substitution rule (cf. Section 3 | Calculus)

∫_{a}^{b} f(g(x)) g′(x) dx = ∫_{g(a)}^{g(b)} f(x) dx (6.27)

and the definition of

g : R → R, x ↦ √(2σ²) x + µ, g(−∞) := −∞, g(∞) := ∞, with g′(x) = √(2σ²), (6.28)

the integral term on the right-hand side of eq. (6.26) can be rewritten as

∫_{−∞}^{∞} x² exp(−(x − µ)²/(2σ²)) dx = ∫_{−∞}^{∞} (√(2σ²) x + µ)² exp(−((√(2σ²) x + µ) − µ)²/(2σ²)) √(2σ²) dx
= √(2σ²) ∫_{−∞}^{∞} (√(2σ²) x + µ)² exp(−2σ²x²/(2σ²)) dx (6.29)
= √(2σ²) ∫_{−∞}^{∞} (√(2σ²) x + µ)² exp(−x²) dx.

We thus have

V(ξ) = √(2σ²)/√(2πσ²) ∫_{−∞}^{∞} (√(2σ²) x + µ)² exp(−x²) dx − µ²
= 1/√π ∫_{−∞}^{∞} (2σ²x² + 2√(2σ²) xµ + µ²) exp(−x²) dx − µ² (6.30)
= 1/√π (2σ² ∫_{−∞}^{∞} x² exp(−x²) dx + 2√(2σ²) µ ∫_{−∞}^{∞} x exp(−x²) dx + µ² ∫_{−∞}^{∞} exp(−x²) dx) − µ².

Taking

∫_{−∞}^{∞} x exp(−x²) dx = 0 and ∫_{−∞}^{∞} exp(−x²) dx = √π (6.31)

as given, we then obtain

V(ξ) = 1/√π (2σ² ∫_{−∞}^{∞} x² exp(−x²) dx + µ² √π) − µ²
= 2σ²/√π ∫_{−∞}^{∞} x² exp(−x²) dx + µ² − µ² (6.32)
= 2σ²/√π ∫_{−∞}^{∞} x² exp(−x²) dx.

With the integration by parts rule (cf. Section 3 | Calculus)

∫_{a}^{b} f′(x) g(x) dx = f(x)g(x)|_{a}^{b} − ∫_{a}^{b} f(x) g′(x) dx (6.33)

and the definitions of

f : R → R, x ↦ f(x) := exp(−x²) with f′(x) = −2x exp(−x²) (6.34)

and

g : R → R, x ↦ g(x) := −x/2 with g′(x) = −1/2, (6.35)

such that

f′(x)g(x) = −2x exp(−x²)(−x/2) = x² exp(−x²), (6.36)

we then have

V(ξ) = 2σ²/√π ∫_{−∞}^{∞} x² exp(−x²) dx
= 2σ²/√π (−(x/2) exp(−x²)|_{−∞}^{∞} − ∫_{−∞}^{∞} exp(−x²)(−1/2) dx) (6.37)
= 2σ²/√π (−(x/2) exp(−x²)|_{−∞}^{∞} + (1/2) ∫_{−∞}^{∞} exp(−x²) dx).

From lim_{x→±∞} x exp(−x²) = 0, we infer that the first term in the bracketed expression on the right-hand side evaluates to 0, such that we obtain

V(ξ) = 2σ²/√π (1/2) ∫_{−∞}^{∞} exp(−x²) dx = σ²/√π · √π = σ². (6.38)

Properties of variances We next discuss some properties of variances that are often useful when evaluating the variance and/or standard deviation of a random variable. In brief, the variance of a scaled random variable corresponds to the variance of the original random variable multiplied by the square of the scaling factor, and the variance of the sum of independent random variables corresponds to the sum of the variances of the individual random variables. We commence with the following theorem.

Theorem 6.2.2 (Variances and standard deviations of linear-affine transformations of random variables). Let ξ denote a random variable, let a, b ∈ R, and let ζ := aξ + b. Then

V(ζ) = a² V(ξ) (6.39)

and

S(ζ) = |a| S(ξ). (6.40)


Proof. We first note that from Theorem 6.1.1, E(ζ) = aE(ξ) + b. For the variance of ζ, we thus have

V(ζ) = E((ζ − E(ζ))²)
= E((aξ + b − aE(ξ) − b)²)
= E((aξ − aE(ξ))²)
= E((a(ξ − E(ξ)))²) (6.41)
= E(a²(ξ − E(ξ))²)
= a² E((ξ − E(ξ))²)
= a² V(ξ).

Taking the square root then yields the result for the standard deviation.

The next theorem shows that for independent random variables, the variance of the sum of random variables corresponds to the sum of the individual random variables’ variances.

Theorem 6.2.3 (Variances of linear combinations of independent random variables). Let ξ1, ..., ξn denote independent random variables and let a1, ..., an ∈ R. Then

V(Σ_{i=1}^{n} ai ξi) = Σ_{i=1}^{n} ai² V(ξi). (6.42)

Proof. We consider the case of two independent random variables ξ1 and ξ2 in more detail. We first note that in this case, we have

E(a1ξ1 + a2ξ2) = a1E(ξ1) + a2E(ξ2). (6.43)

We thus have

V(Σ_{i=1}^{2} ai ξi) = V(a1ξ1 + a2ξ2)
= E((a1ξ1 + a2ξ2 − E(a1ξ1 + a2ξ2))²)
= E((a1ξ1 + a2ξ2 − a1E(ξ1) − a2E(ξ2))²)
= E((a1ξ1 − a1E(ξ1) + a2ξ2 − a2E(ξ2))²)
= E((a1(ξ1 − E(ξ1)) + a2(ξ2 − E(ξ2)))²) (6.44)
= E(a1²(ξ1 − E(ξ1))² + 2a1a2(ξ1 − E(ξ1))(ξ2 − E(ξ2)) + a2²(ξ2 − E(ξ2))²)
= a1² E((ξ1 − E(ξ1))²) + 2a1a2 E((ξ1 − E(ξ1))(ξ2 − E(ξ2))) + a2² E((ξ2 − E(ξ2))²)
= a1² V(ξ1) + 2a1a2 E((ξ1 − E(ξ1))(ξ2 − E(ξ2))) + a2² V(ξ2)
= Σ_{i=1}^{2} ai² V(ξi) + 2a1a2 E((ξ1 − E(ξ1))(ξ2 − E(ξ2))).

Because ξ1 and ξ2 are independent, we have with Theorem 6.1.3

E((ξ1 − E(ξ1))(ξ2 − E(ξ2))) = E(ξ1 − E(ξ1)) E(ξ2 − E(ξ2)) = (E(ξ1) − E(ξ1))(E(ξ2) − E(ξ2)) = 0, (6.45)

and thus

V(Σ_{i=1}^{2} ai ξi) = Σ_{i=1}^{2} ai² V(ξi). (6.46)

Finally, an induction argument can be used to generalize the bivariate to the n-variate case.

6.3 Sample mean, sample variance, and sample standard deviation

The theoretical constructs of expectations, variances, and standard deviations should not be confused with the concepts of sample means, sample variances, and sample standard deviations. The former entities are of theoretical nature and can be evaluated once the distributions of random variables have been specified. The latter entities are of practical nature, can be evaluated numerically based on observed data, and serve as estimators of the former theoretical quantities. We use the following definitions.

Definition 6.3.1 (Sample mean, sample variance, sample standard deviation). Let ξ1, ..., ξn denote random variables. Then

- the sample mean of ξ1, ..., ξn is defined as the arithmetic average of ξ1, ..., ξn,

ξ̄n := (1/n) Σ_{i=1}^{n} ξi, (6.47)

- the sample variance of ξ1, ..., ξn is defined as

Sn² := 1/(n − 1) Σ_{i=1}^{n} (ξi − ξ̄n)², (6.48)

and

- the sample standard deviation is defined as

Sn := √(Sn²). (6.49)

•

Example As an example, we consider the case of 10 independent and identically distributed Gaussian random variables with expectation parameter µ = 1 and variance parameter σ² = 2, i.e., ξ1, ..., ξ10 ∼ N(1, 2). A set of realizations x1, ..., x10 is provided in Table 6.1 below.

x1     x2     x3     x4     x5     x6     x7     x8     x9     x10
0.54   1.01   -3.28  0.35   2.75   -0.51  2.32   1.49   0.96   1.25

Table 6.1. Realizations of 10 independent and identically distributed Gaussian random variables.

For ξ1 = x1, ..., ξ10 = x10, application of the formula for the sample mean then yields

ξ̄10 = (1/10) Σ_{i=1}^{10} ξi = (1/10) Σ_{i=1}^{10} xi = 6.88/10 = 0.68, (6.50)

application of the formula for the sample variance yields

S10² = 1/(10 − 1) Σ_{i=1}^{10} (ξi − ξ̄10)² = (1/9) Σ_{i=1}^{10} (xi − x̄10)² = (1/9) Σ_{i=1}^{10} (xi − 0.68)² = 25.37/9 = 2.82, (6.51)

and application of the formula for the sample standard deviation yields

S10 = √(S10²) = √2.82 = 1.68. (6.52)
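The sample statistics reported in eqs. (6.50) to (6.52) can be reproduced with a few lines of Python, assuming NumPy is available. Note that ddof=1 selects the 1/(n − 1) normalization of Definition 6.3.1.

```python
# Minimal reproduction of the sample statistics for the realizations in Table 6.1 (assumes NumPy).
import numpy as np

x = np.array([0.54, 1.01, -3.28, 0.35, 2.75, -0.51, 2.32, 1.49, 0.96, 1.25])  # Table 6.1

x_bar = x.mean()             # 0.688 = 6.88/10, cf. eq. (6.50)
s2 = x.var(ddof=1)           # approx. 2.82, cf. eq. (6.51)
s = np.sqrt(s2)              # approx. 1.68, cf. eq. (6.52)

print(x_bar, s2, s)
```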

6.4 Covariance and correlation of random variables

Definition 6.4.1 (Covariance and correlation of random variables). The covariance of two random variables ξ1 and ξ2 with finite expectations is defined as

C(ξ1, ξ2) = E((ξ1 − E(ξ1))(ξ2 − E(ξ2))), (6.53)

if this expectation exists. The correlation of two random variables ξ1 and ξ2 with finite expectations is defined as

ρ(ξ1, ξ2) = C(ξ1, ξ2)/(√V(ξ1) √V(ξ2)) = C(ξ1, ξ2)/(S(ξ1)S(ξ2)). (6.54)

•

Note that the covariance of a random variable with itself corresponds to its variance:

C(ξ, ξ) = E((ξ − E(ξ))(ξ − E(ξ))) = E((ξ − E(ξ))²) = V(ξ). (6.55)


pξ(x1, x2)   x2 = 1   x2 = 2   x2 = 3   pξ1(x1)
x1 = 1       0.10     0.05     0.15     0.30
x1 = 2       0.60     0.05     0.05     0.70
pξ2(x2)      0.70     0.10     0.20

Table 6.2. An exemplary joint PMF.

Example 1. As a first example, we consider the covariance of two discrete random variables ξ1 and ξ2 with the bivariate PMF depicted in Table 6.2 above. To this end, we first note that

E(ξ1) = Σ_{x1=1}^{2} x1 pξ1(x1) = 1 · 0.3 + 2 · 0.7 = 1.7 (6.56)

and

E(ξ2) = Σ_{x2=1}^{3} x2 pξ2(x2) = 1 · 0.7 + 2 · 0.1 + 3 · 0.2 = 1.5. (6.57)

With the definition of the covariance of ξ1 and ξ2, we then have

C(ξ1, ξ2) = E((ξ1 − E(ξ1))(ξ2 − E(ξ2)))
= Σ_{x1=1}^{2} Σ_{x2=1}^{3} (x1 − E(ξ1))(x2 − E(ξ2)) pξ1,ξ2(x1, x2)
= Σ_{x1=1}^{2} Σ_{x2=1}^{3} (x1 − 1.7)(x2 − 1.5) pξ1,ξ2(x1, x2)
= Σ_{x1=1}^{2} ((x1 − 1.7)(1 − 1.5) pξ1,ξ2(x1, 1) + (x1 − 1.7)(2 − 1.5) pξ1,ξ2(x1, 2) + (x1 − 1.7)(3 − 1.5) pξ1,ξ2(x1, 3))
= (1 − 1.7)(1 − 1.5) pξ1,ξ2(1, 1) + (1 − 1.7)(2 − 1.5) pξ1,ξ2(1, 2) + (1 − 1.7)(3 − 1.5) pξ1,ξ2(1, 3)
+ (2 − 1.7)(1 − 1.5) pξ1,ξ2(2, 1) + (2 − 1.7)(2 − 1.5) pξ1,ξ2(2, 2) + (2 − 1.7)(3 − 1.5) pξ1,ξ2(2, 3)
= (−0.7) · (−0.5) · 0.10 + (−0.7) · 0.5 · 0.05 + (−0.7) · 1.5 · 0.15 + 0.3 · (−0.5) · 0.60 + 0.3 · 0.5 · 0.05 + 0.3 · 1.5 · 0.05
= 0.035 − 0.0175 − 0.1575 − 0.09 + 0.0075 + 0.0225
= −0.2. (6.58)
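The expectations and the covariance of this example can be reproduced numerically. The following Python sketch assumes NumPy and evaluates the double sum of eq. (6.58) directly; all variable names are illustrative.

```python
# Minimal numerical check of the covariance computed in eq. (6.58) (assumes NumPy).
import numpy as np

x1 = np.array([1.0, 2.0])
x2 = np.array([1.0, 2.0, 3.0])
p = np.array([[0.10, 0.05, 0.15],
              [0.60, 0.05, 0.05]])              # joint PMF of Table 6.2 (rows x1, columns x2)

E1 = np.sum(x1 * p.sum(axis=1))                 # 1.7, cf. eq. (6.56)
E2 = np.sum(x2 * p.sum(axis=0))                 # 1.5, cf. eq. (6.57)
cov = np.sum(np.outer(x1 - E1, x2 - E2) * p)    # definition (6.53) evaluated as a double sum

print(E1, E2, cov)                              # 1.7 1.5 -0.2 (up to floating point error)
```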

Example 2. As a second example, we note that the covariance and correlation of two random variables ξ1 and ξ2 with bivariate Gaussian joint distribution N(µ, Σ), where

µ = (µ1, µ2)^T and Σ = ((σ11², σ12²), (σ21², σ22²)), (6.59)

are given by

C(ξ1, ξ2) = σ12² = σ21² and ρ(ξ1, ξ2) = σ12²/(σ11σ22), (6.60)

respectively. For a proof, see Casella and Berger (2012, p. 175). An alternative representation of the covariance of two random variables that is often useful for the analytical evaluation of covariances is given in the following theorem.

Theorem 6.4.1 (Covariance translation theorem). Let ξ1 and ξ2 denote two random variables. Then

C(ξ1, ξ2) = E(ξ1ξ2) − E(ξ1)E(ξ2). (6.61)




Proof. With the definition of the covariance of ξ1 and ξ2, we have

C(ξ1, ξ2) = E((ξ1 − E(ξ1))(ξ2 − E(ξ2)))
= E(ξ1ξ2 − ξ1E(ξ2) − E(ξ1)ξ2 + E(ξ1)E(ξ2)) (6.62)
= E(ξ1ξ2) − E(ξ1)E(ξ2) − E(ξ1)E(ξ2) + E(ξ1)E(ξ2)
= E(ξ1ξ2) − E(ξ1)E(ξ2).

Note that for independent ξ1 and ξ2, we have E(ξ1ξ2) = E(ξ1)E(ξ2) and thus C(ξ1, ξ2) = 0. Covariance and dependence are two different concepts. As the following theorem shows, two random variables can have zero covariance, but be dependent. On the other hand, independent random variables always have zero covariance.

Theorem 6.4.2 (Covariance, correlation and independence). Let ξ1 and ξ2 denote two random variables. If ξ1 and ξ2 are independent random variables, then C(ξ1, ξ2) = 0 and ξ1 and ξ2 are uncorrelated. Conversely, if C(ξ1, ξ2) = 0 and hence ξ1 and ξ2 are uncorrelated, then ξ1 and ξ2 are not necessarily independent.

Proof. (1) We first show that the independence of ξ1 and ξ2 implies that their covariance is zero. To this end, we note that for independent random variables, we have E(ξ1ξ2) = E(ξ1)E(ξ2). (6.63) With the covariance translation theorem, it then follows that

C(ξ1, ξ2) = E(ξ1ξ2) − E(ξ1)E(ξ2) = E(ξ1)E(ξ2) − E(ξ1)E(ξ2) = 0. (6.64)

With the definition of the correlation coefficient, it follows immediately that ρ(ξ1, ξ2) = 0 and thus that ξ1 and ξ2 are uncorrelated.

(2) We next show by example that the covariance of non-independent random variables ξ1 and ξ2 can be zero. To this end, we consider the case of two discrete random variables ξ1 and ξ2 with outcome spaces X = {−1, 0, 1} and Y = {0, 1}, marginal PMF of ξ1 given by pξ1(ξ1 = x) = 1/3 for x ∈ X, and the definition ξ2 := ξ1². We first note that

E(ξ1) = Σ_{x∈X} x pξ1(ξ1 = x) = −1 · 1/3 + 0 · 1/3 + 1 · 1/3 = 0 (6.65)

and

E(ξ1ξ2) = E(ξ1ξ1²) = E(ξ1³) = Σ_{x∈X} x³ pξ1(ξ1 = x) = −1 · 1/3 + 0 · 1/3 + 1 · 1/3 = 0. (6.66)

With the covariance translation theorem, we thus have

C(ξ1, ξ2) = E(ξ1ξ2) − E(ξ1)E(ξ2) = E(ξ1³) − E(ξ1)E(ξ2) = 0 − 0 · E(ξ2) = 0. (6.67)

The covariance of ξ1 and ξ2 is thus zero. However, as shown below, the joint PMF of ξ1 and ξ2 does not factorize, and thus ξ1 and ξ2 are not independent. The definition ξ2 := ξ1² entails the following conditional PMF pξ2|ξ1:

pξ2|ξ1(x2|x1)   x1 = −1   x1 = 0   x1 = 1
x2 = 0          0         1        0
x2 = 1          1         0        1

The marginal PMF pξ1 and the conditional PMF pξ2|ξ1 in turn entail the following joint PMF pξ1,ξ2:

pξ1,ξ2(x1, x2)   x1 = −1   x1 = 0   x1 = 1   pξ2(x2)
x2 = 0           0         1/3      0        1/3
x2 = 1           1/3       0        1/3      2/3
pξ1(x1)          1/3       1/3      1/3

But, for example,

pξ1,ξ2(x1 = −1, x2 = 0) = 0 ≠ 1/9 = 1/3 · 1/3 = pξ1(x1 = −1) pξ2(x2 = 0) (6.68)

and hence ξ1 and ξ2 are not independent.
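The counterexample can also be illustrated by simulation. The following Python sketch, assuming NumPy, draws realizations of ξ1 and ξ2 := ξ1² and shows that the sample covariance is approximately zero, while the joint frequencies do not factorize.

```python
# Minimal simulation sketch of the counterexample in Theorem 6.4.2 (assumes NumPy).
import numpy as np

rng = np.random.default_rng(1)
x1 = rng.choice([-1, 0, 1], size=100_000)   # marginal PMF 1/3, 1/3, 1/3
x2 = x1 ** 2                                # deterministic function of x1, hence dependent

print(np.cov(x1, x2)[0, 1])                 # sample covariance close to 0
print(np.mean((x1 == -1) & (x2 == 0)))      # exactly 0 here, while p(x1=-1)*p(x2=0) = 1/9
```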

The correlation of two random variables can be understood as a measure of their linear dependence in the sense of the following theorem.


Theorem 6.4.3 (Correlation and linear-affine transformations). Let ξ1 and ξ2 denote two random variables with V(ξ1) > 0 and V(ξ2) > 0. Then

ξ2 = aξ1 + b ⇔ ρ(ξ1, ξ2) = 1 or ρ(ξ1, ξ2) = −1. (6.69)

Proof. We content ourselves with showing the ⇒ direction. We assume that ξ2 = aξ1 + b with V(ξ1) > 0 and V(ξ2) > 0 holds and first show that this implies

S(ξ2) = ±a S(ξ1) and C(ξ1, ξ2) = a V(ξ1). (6.70)

To see this, we first note that, as seen above,

E(ξ2) = aE(ξ1) + b and V(ξ2) = a² V(ξ1). (6.71)

Thus a ≠ 0, and if a > 0, then S(ξ2) = a S(ξ1), while if a < 0, then S(ξ2) = −a S(ξ1) > 0. Next, with respect to

C(ξ1, ξ2) = E((ξ1 − E(ξ1))(ξ2 − E(ξ2))), (6.72)

we note that

ξ2 − E(ξ2) = aξ1 + b − E(ξ2) = aξ1 + b − aE(ξ1) − b = a(ξ1 − E(ξ1)). (6.73)

We thus obtain

C(ξ1, ξ2) = E(a(ξ1 − E(ξ1))²) = a E((ξ1 − E(ξ1))²) = a V(ξ1). (6.74)

With (6.70), it then follows that

ρ(ξ1, ξ2) = C(ξ1, ξ2)/(S(ξ1)S(ξ2)) = a V(ξ1)/(S(ξ1)(±a S(ξ1))) = ± a V(ξ1)/(a V(ξ1)) = ±1. (6.75)

Finally, the notion of covariance allows for establishing a formula for the variance of linear affine combinations of arbitrary random variables. We have the following theorem.

Theorem 6.4.4 (Variances of sums and differences of random variables). Let ξ1 and ξ2 denote two random variables and let a, b, c ∈ R. Then

V(aξ1 + bξ2 + c) = a² V(ξ1) + b² V(ξ2) + 2ab C(ξ1, ξ2). (6.76)

In particular,

V(ξ1 + ξ2) = V(ξ1) + V(ξ2) + 2C(ξ1, ξ2) (6.77)

and

V(ξ1 − ξ2) = V(ξ1) + V(ξ2) − 2C(ξ1, ξ2). (6.78)



Proof. We first note that

E(aξ1 + bξ2 + c) = aE(ξ1) + bE(ξ2) + c. (6.79)

We thus have

V(aξ1 + bξ2 + c) = E((aξ1 + bξ2 + c − aE(ξ1) − bE(ξ2) − c)²)
= E((a(ξ1 − E(ξ1)) + b(ξ2 − E(ξ2)))²)
= E(a²(ξ1 − E(ξ1))² + b²(ξ2 − E(ξ2))² + 2ab(ξ1 − E(ξ1))(ξ2 − E(ξ2))) (6.80)
= a² E((ξ1 − E(ξ1))²) + b² E((ξ2 − E(ξ2))²) + 2ab E((ξ1 − E(ξ1))(ξ2 − E(ξ2)))
= a² V(ξ1) + b² V(ξ2) + 2ab C(ξ1, ξ2).

The special cases then follow directly with a = b = 1 and with a = 1, b = −1, respectively.

6.5 Sample covariance and sample correlation

Like expectations and variances, covariances and correlations should not be confused with their empirical counterparts, the sample covariance and the sample correlation. We have the following definitions.

Definition 6.5.1 (Sample covariance and sample correlation). Let (ξ1^1, ξ2^1), ..., (ξ1^n, ξ2^n) denote n two-dimensional random vectors. Then

- the sample mean of (ξ1^1, ξ2^1), ..., (ξ1^n, ξ2^n) is defined as

(ξ̄1, ξ̄2) := ((1/n) Σ_{i=1}^{n} ξ1^i, (1/n) Σ_{i=1}^{n} ξ2^i), (6.81)

- the sample covariance of (ξ1^1, ξ2^1), ..., (ξ1^n, ξ2^n) is defined as

Cn := 1/(n − 1) Σ_{i=1}^{n} (ξ1^i − ξ̄1)(ξ2^i − ξ̄2), (6.82)

- and the sample correlation coefficient of (ξ1^1, ξ2^1), ..., (ξ1^n, ξ2^n) is defined as

Rn := Cn/(Sξ1 Sξ2), (6.83)

where Sξ1 and Sξ2 denote the sample standard deviations of ξ1^1, ..., ξ1^n and ξ2^1, ..., ξ2^n, respectively.

Example 1. Assume the following realizations.

(x1^1, x2^1)  (x1^2, x2^2)  (x1^3, x2^3)  (x1^4, x2^4)  (x1^5, x2^5)  (x1^6, x2^6)  (x1^7, x2^7)  (x1^8, x2^8)  (x1^9, x2^9)  (x1^10, x2^10)
(0.8, -0.7)   (1.1, 1.6)    (-0.8, 1.1)   (-0.2, 0.1)   (1.1, 0.4)    (0.5, 1.5)    (1.3, -1.2)   (1.8, 0.6)    (0.4, 0.2)    (1.5, -1.0)

Then the sample mean realization is given by

(x̄1, x̄2) = ((1/10) Σ_{i=1}^{10} x1^i, (1/10) Σ_{i=1}^{10} x2^i) = (0.75, 0.26), (6.84)

the sample standard deviation realizations are given by

sx1 = √((1/9) Σ_{i=1}^{10} (x1^i − x̄1)²) = 0.79 and sx2 = √((1/9) Σ_{i=1}^{10} (x2^i − x̄2)²) = 0.99, (6.85)

and the sample covariance and sample correlation realizations are given by

cn = 1/(n − 1) Σ_{i=1}^{n} (x1^i − x̄1)(x2^i − x̄2) = −0.26 (6.86)

and

rn = cn/(sx1 sx2) = −0.33, (6.87)

respectively.
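The realizations of this example can be analysed with NumPy's covariance and correlation routines, which implement eqs. (6.82) and (6.83). The following Python sketch reproduces the values reported above up to rounding.

```python
# Minimal reproduction of the sample covariance and correlation in Example 1 (assumes NumPy).
import numpy as np

x1 = np.array([0.8, 1.1, -0.8, -0.2, 1.1, 0.5, 1.3, 1.8, 0.4, 1.5])
x2 = np.array([-0.7, 1.6, 1.1, 0.1, 0.4, 1.5, -1.2, 0.6, 0.2, -1.0])

print(x1.mean(), x2.mean())          # 0.75, 0.26, cf. eq. (6.84)
print(np.cov(x1, x2)[0, 1])          # sample covariance, approx. -0.26, cf. eq. (6.86)
print(np.corrcoef(x1, x2)[0, 1])     # sample correlation, approx. -0.33, cf. eq. (6.87)
```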

6.6 Probability density transformations

A central theme in the theory of the GLM is the transformation of PDFs. The fundamental probabilistic assumption in the classical Frequentist treatment of the GLM is that observation errors are independently and identically Gaussian distributed. The theory of the GLM captures how this Gaussian error assumption results in Gaussian distributed data, which in turn result in Gaussian and χ2 distributed parameter estimates, which in turn result in t and f distributed statistics. In this Section, we introduce basic theorems that allow for the analytical evaluation of PDFs of random variables that result from other random variables by means of the application of a function. We here focus on general scenarios and will see specific applications in the context of the GLM in subsequent sections.

Univariate probability density transformations We first consider the PDF of a univariate random variable υ that results from the transformation of a univariate random variable ξ with PDF pξ by means of a function f. The following theorem states how the resulting PDF pυ of υ can be evaluated (DeGroot and Schervish, 2012, Section 3.8). Theorem 6.6.1 (Univariate PDF transformations for bijective functions). Let ξ be a random variable with outcome set X and PDF pξ for which P(]a, b[) = 1, where a and/or b are either finite or infinite. Let υ = f(ξ), where f is differentiable and bijective for ]a, b[. Let f(]a, b[) be the image of ]a, b[ under f.


Finally, let f^{-1}(y) denote the inverse of f for y ∈ f(]a, b[) and let f′(x) denote the first derivative of f at x. Then the PDF of the random variable υ with outcome set Y is given by

pυ : Y → R≥0, y ↦ pυ(y) := (1/|f′(f^{-1}(y))|) pξ(f^{-1}(y)) for y ∈ f(]a, b[), and pυ(y) := 0 for y ∈ R \ f(]a, b[). (6.88)

For a proof of Theorem 6.6.1, see DeGroot and Schervish (2012, Section 3.8). Note that Theorem 6.6.1 implies an analytical procedure for deriving the PDF pυ: based on the definitions of pξ and f, the inverse and the derivative of f have to be evaluated and substituted in eq. (6.88) appropriately. We demonstrate this procedure in the proof of the following theorem, which specializes Theorem 6.6.1 to linear-affine functions.

Theorem 6.6.2 (Univariate PDF transformations for linear-affine functions). Let ξ be a random variable with PDF pξ and let υ := f(ξ) with

f(ξ) := aξ + b with a, b ∈ R, a ≠ 0. (6.89)

Then the PDF of υ is given by

pυ : R → R≥0, y ↦ pυ(y) := (1/|a|) pξ((y − b)/a). (6.90)



Proof. We first note that the inverse of f is given by

f^{-1} : R → R, y ↦ f^{-1}(y) := (y − b)/a, (6.91)

because

f(f^{-1}(x)) = a((x − b)/a) + b = x − b + b = x for all x ∈ R, (6.92)

and thus

f ∘ f^{-1} = id_R. (6.93)

We next note that

f′ : R → R, x ↦ f′(x) = d/dx (ax + b) = a. (6.94)

Substitution in (6.88) then yields

pυ : R → R≥0, y ↦ pυ(y) := (1/|f′(f^{-1}(y))|) pξ(f^{-1}(y)) = (1/|a|) pξ((y − b)/a). (6.95)
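As a numerical illustration of Theorem 6.6.2, the following Python sketch evaluates the right-hand side of eq. (6.90) for a standard normal ξ and compares it to the known density of υ = aξ + b as well as to an empirical estimate from simulated realizations. It assumes NumPy and SciPy; the values of a, b, and the evaluation point are arbitrary.

```python
# Minimal sketch of the linear-affine PDF transformation of eq. (6.90) (assumes NumPy and SciPy).
import numpy as np
from scipy import stats

a, b = 2.0, 1.0                                        # illustrative constants, a != 0
y = 3.0                                                # an arbitrary evaluation point

p_transformed = stats.norm.pdf((y - b) / a) / abs(a)   # (1/|a|) * p_xi((y - b)/a)
p_reference = stats.norm.pdf(y, loc=b, scale=abs(a))   # known density of upsilon = a*xi + b

rng = np.random.default_rng(2)
upsilon = a * rng.standard_normal(1_000_000) + b
p_empirical = np.mean(np.abs(upsilon - y) < 0.05) / 0.1  # histogram-type density estimate at y

print(p_transformed, p_reference, p_empirical)         # all approx. 0.121
```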

We next note a generalization of Theorem 6.6.1 to the case of only piecewise bijective functions.

Theorem 6.6.3 (Univariate PDF transformations for piecewise bijective functions). Let ξ be a random variable with outcome set X and PDF pξ. Assume further that υ := f(ξ), where f is such that the outcome set of ξ can be partitioned into a finite number of sets X1, ..., Xk with a corresponding number of sets f(X1), ..., f(Xk) in the outcome set Y of υ (which may not be mutually exclusive), such that f is bijective on each of X1, ..., Xk. Let further fi^{-1} denote the inverse function of f on Xi and assume that its derivative exists and is continuous for all i = 1, ..., k. Then the PDF of υ is given by

pυ : Y → R≥0, y ↦ pυ(y) := Σ_{i=1}^{k} 1_{f(Xi)}(y) (1/|f′(fi^{-1}(y))|) pξ(fi^{-1}(y)). (6.96)

Theorem 6.6.3 is especially important in the derivation of the χ2 distribution (cf. Section 7 | Probability distributions).


Multivariate probability density transformations Finally, we note that Theorem 6.6.1 has a straightforward generalization to the multivariate scenario of transforming random vectors. It is this generalization, and its application in the case of linear-affine transformations, that is central to the formulation of the GLM (cf. Section 7 | Probability distributions) and to the Frequentist distributions of GLM parameter estimates (cf. Section 9 | Frequentist distribution theory). Again, we state the general theorem without proof, but exemplify the procedure for evaluating the PDF of the resulting random vector in the proof of its linear-affine special case.

Theorem 6.6.4 (Multivariate PDF transformations). Let ξ be an n-dimensional random vector with PDF pξ and let υ = f(ξ) be an m-dimensional random vector with differentiable f : R^n → R^m. Let f^{-1} : R^m → R^n denote the inverse of f. Let further

J^f(x) = (∂fi(x)/∂xj)_{1≤i≤m, 1≤j≤n} ∈ R^{m×n} (6.97)

denote the Jacobian matrix of f at x ∈ R^n, let |J^f(x)| denote its determinant, and assume that |J^f(x)| ≠ 0 for all x ∈ R^n. Then the PDF of υ is given by

pυ : R^m → R≥0, y ↦ pυ(y) := (1/|J^f(f^{-1}(y))|) pξ(f^{-1}(y)) for y ∈ f(R^n), and pυ(y) := 0 for y ∈ R^m \ f(R^n). (6.98)

An application of Theorem 6.6.4 is given in the proof of the following theorem.

Theorem 6.6.5 (Multivariate nonsingular PDF transformations). Let ξ be a random vector with PDF pξ and let υ = Aξ for invertible A ∈ R^{n×n}. Then the PDF of υ is given by

pυ : R^n → R≥0, y ↦ pυ(y) = (1/|A|) pξ(A^{-1}y), (6.99)

where |A| and A^{-1} denote the determinant and the inverse of A, respectively.



Proof. We first show that

f^{-1} : R^n → R^n, y ↦ f^{-1}(y) := A^{-1}y. (6.100)

To this end, we note that

f^{-1}(f(x)) = A^{-1}Ax = x for all x ∈ R^n, (6.101)

and thus

f^{-1} ∘ f = id_{R^n}. (6.102)

We next show that

J^f(f^{-1}(y)) = A. (6.103)

To this end, we first note that

fi(x) = Σ_{j=1}^{n} aij xj. (6.104)

Thus

J^f(x) = (∂fi(x)/∂xj)_{1≤i,j≤n} = (∂/∂xj Σ_{k=1}^{n} aik xk)_{1≤i,j≤n} = (aij)_{1≤i,j≤n} = A ∈ R^{n×n}. (6.105)

Substitution in (6.98) then yields

pυ(y) = (1/|J^f(f^{-1}(y))|) pξ(f^{-1}(y)) = (1/|A|) pξ(A^{-1}y). (6.106)

6.7 Combining random variables

In this section, we consider the PDFs of random variables that result from various combinations of two continuous random variables.


Linear combinations We have the following theorem.

Theorem 6.7.1 (Linear combination of two continuous random variables). Let ξ1 and ξ2 be two continuous random variables with joint PDF pξ1,ξ2(x1, x2), and let

υ = a1ξ1 + a2ξ2 + b with a1 ≠ 0. (6.107)

Then υ has a continuous distribution with PDF

pυ(y) = ∫_{−∞}^{∞} pξ1,ξ2((y − b − a2x2)/a1, x2) (1/a1) dx2. (6.108)

Proof. We first note that for any joint PDF pξ of a random vector ξ and any multivariate real-valued function f such that υ := f(ξ), the CDF of υ takes on the values

Pυ(y) = ∫_{Ay} pξ(x) dx, where Ay := {x | f(x) ≤ y}, (6.109)

because

Pυ(y) = P(υ ≤ y) = P(f(ξ) ≤ y) = P(ξ ∈ {x | f(x) ≤ y}) = P(ξ ∈ Ay) = ∫_{Ay} pξ(x) dx. (6.110)

We next evaluate the CDF Pυ of υ of the linear combination theorem in the form

Pυ(y) = ∫_{−∞}^{y} pυ(s) ds, (6.111)

from which the form of pυ then follows directly. To this end, we define

Ay := {(x1, x2) | a1x1 + a2x2 + b ≤ y} for all y ∈ R. (6.112)

Then, from the above, we have

Pυ(y) = ∫∫_{Ay} pξ1,ξ2(x1, x2) dx1 dx2. (6.113)

To evaluate this integral, visualized below, we consider −∞ < x2 < ∞ and for each x2 integrate x1 from −∞ to

x1 = (y − a2x2 − b)/a1 ⇔ a1x1 + a2x2 + b = y. (6.114)

We thus consider the integral

Pυ(y) = ∫∫_{Ay} pξ1,ξ2(x1, x2) dx1 dx2 = ∫_{−∞}^{∞} ∫_{−∞}^{(y−a2x2−b)/a1} pξ1,ξ2(x1, x2) dx1 dx2. (6.115)

See Figure 6.1 for a visualization. The inner integral on the right-hand side of the above can then be rewritten by means of the integration by substitution rule as

∫_{−∞}^{(y−a2x2−b)/a1} pξ1,ξ2(x1, x2) dx1 = ∫_{−∞}^{y} pξ1,ξ2((ξ − b − a2x2)/a1, x2) (1/a1) dξ (6.116)

(see below for a detailed derivation). Substitution in the above then yields

Pυ(y) = ∫_{−∞}^{∞} ∫_{−∞}^{y} pξ1,ξ2((ξ − b − a2x2)/a1, x2) (1/a1) dξ dx2
= ∫_{−∞}^{y} ∫_{−∞}^{∞} pξ1,ξ2((ξ − b − a2x2)/a1, x2) (1/a1) dx2 dξ. (6.117)


Figure 6.1. Integration area Ay of interest. For each x2 ∈ ]−∞, ∞[, the inner integral is evaluated along the x1 dimension from −∞ to (y − a2x2 − b)/a1.

But then it follows from basic calculus that

pυ(y) = d/dy Pυ(y)
= d/dy ∫_{−∞}^{y} ∫_{−∞}^{∞} pξ1,ξ2((ξ − b − a2x2)/a1, x2) (1/a1) dx2 dξ (6.118)
= ∫_{−∞}^{∞} pξ1,ξ2((y − b − a2x2)/a1, x2) (1/a1) dx2.

Finally, we show that

∫_{−∞}^{(y−a2x2−b)/a1} pξ1,ξ2(x1, x2) dx1 = ∫_{−∞}^{y} pξ1,ξ2((ξ − b − a2x2)/a1, x2) (1/a1) dξ (6.119)

by means of the integration by substitution rule. To this end, we first recall that the integration by substitution rule states that for univariate real-valued functions g and h it holds that

∫_{h(a)}^{h(b)} g(x) dx = ∫_{a}^{b} g(h(x)) h′(x) dx. (6.120)

For constant x2 ∈ R, we next define

g : R → R, x ↦ g(x) := pξ1,ξ2(x, x2) (6.121)

and

h : R → R, x ↦ h(x) := (x − a2x2 − b)/a1. (6.122)

We note that the derivative of h at x evaluates to

h′(x) = 1/a1. (6.123)

Finally, we set b := y and a := −∞. Substitution in (6.120) then yields

∫_{h(a)}^{h(b)} g(x) dx = ∫_{a}^{b} g(h(x)) h′(x) dx
⇔ ∫_{h(−∞)}^{h(y)} pξ1,ξ2(x, x2) dx = ∫_{−∞}^{y} pξ1,ξ2(h(x), x2) (1/a1) dx
⇔ ∫_{−∞}^{(y−a2x2−b)/a1} pξ1,ξ2(x, x2) dx = ∫_{−∞}^{y} pξ1,ξ2((x − a2x2 − b)/a1, x2) (1/a1) dx (6.124)
⇔ ∫_{−∞}^{(y−a2x2−b)/a1} pξ1,ξ2(x1, x2) dx1 = ∫_{−∞}^{y} pξ1,ξ2((ξ − a2x2 − b)/a1, x2) (1/a1) dξ.


The following special case is of particular interest.

Theorem 6.7.2 (Convolution of random variables). Let ξ1 and ξ2 be two independent continuous random variables with marginal PDFs pξ1 and pξ2, respectively, and let υ := ξ1 + ξ2. Then a PDF of the distribution of υ is given by the convolution of pξ1 and pξ2, i.e.,

pυ(y) = ∫_{−∞}^{∞} pξ1(y − x2) pξ2(x2) dx2 = ∫_{−∞}^{∞} pξ1(x1) pξ2(y − x1) dx1. (6.125)

Proof. We first note that for independent ξ1, ξ2, pξ1,ξ2 factorizes. Setting a1 = a2 = 1 and b = 0 in the theorem on linear combinations of two continuous random variables then yields

pυ(y) = ∫_{−∞}^{∞} pξ1((y − 0 − 1 · x2)/1) pξ2(x2) dx2 = ∫_{−∞}^{∞} pξ1(y − x2) pξ2(x2) dx2. (6.126)

Finally, by exchanging the roles of ξ1 and ξ2, we obtain

pυ(y) = ∫_{−∞}^{∞} pξ2(y − x1) pξ1(x1) dx1 = ∫_{−∞}^{∞} pξ1(x1) pξ2(y − x1) dx1. (6.127)

A direct proof of the convolution formula can be given by considering the transformation (ξ1, ξ2) ↦ (ξ1 + ξ2, ξ1) and marginalization (e.g., Casella and Berger (2012, Theorem 5.2.9)).
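As a numerical illustration of Theorem 6.7.2, the following Python sketch evaluates the convolution integral of eq. (6.125) on a grid for two independent standard normal random variables and compares the result to the N(0, 2) density of their sum. It assumes NumPy and SciPy; grid and evaluation point are arbitrary.

```python
# Minimal numerical sketch of the convolution theorem for two standard normal variables
# (assumes NumPy and SciPy).
import numpy as np
from scipy import stats

y = 1.3                                               # an arbitrary evaluation point
x, dx = np.linspace(-10, 10, 4001, retstep=True)      # integration grid

# Convolution integral of eq. (6.125), evaluated by a Riemann sum
p_conv = np.sum(stats.norm.pdf(y - x) * stats.norm.pdf(x)) * dx

print(p_conv)                                         # approx. 0.185
print(stats.norm.pdf(y, loc=0, scale=np.sqrt(2)))     # reference: the N(0, 2) density at y
```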

Ratios We next consider the scenario of a random variable ζ defined by the ratio of two positive random variables ξ1 and ξ2. We have the following theorem.

Theorem 6.7.3 (Ratio distributions). Let ξ1 and ξ2 be two positive random variables with joint PDF pξ1,ξ2 and let ζ := ξ1/ξ2. Then a PDF of ζ is given by

pζ : R>0 → R>0, z ↦ pζ(z) := ∫_{0}^{∞} x2 pξ1,ξ2(zx2, x2) dx2. (6.128)

Proof. We first note that for any joint PDF pξ and any multivariate real-valued function f such that ζ := f(ξ) is a random variable, the CDF of ζ takes on the values

Pζ(z) := ∫_{Az} pξ(x) dx, where Az := {x | f(x) ≤ z}, (6.129)

because

Pζ(z) = P(ζ ≤ z) = P(f(ξ) ≤ z) = P(ξ ∈ {x | f(x) ≤ z}) = P(ξ ∈ Az) = ∫_{Az} pξ(x) dx. (6.130)

To prove the current theorem, we are thus interested in evaluating

Pζ(z) = ∫∫_{Az} pξ1,ξ2(x1, x2) dx1 dx2, (6.131)

where

Az := {(x1, x2) | x1/x2 ≤ z} for all z ∈ R. (6.132)

To evaluate this integral, we consider 0 < x2 < ∞ and for each x2 integrate x1 from 0 to zx2, because

x1/x2 = z ⇔ x1 = zx2. (6.133)

We are thus interested in evaluating

Pζ(z) = ∫_{0}^{∞} ∫_{0}^{zx2} pξ1,ξ2(s, x2) ds dx2. (6.134)

We next rewrite the inner integral using the integration by substitution rule

∫_{g(a)}^{g(b)} f(s) ds = ∫_{a}^{b} f(g(s)) g′(s) ds. (6.135)

Specifically, we define f := pξ1,ξ2(·, x2) for fixed x2 and

g : R → R, s ↦ g(s) := sx2, (6.136)

such that g(0) = 0, g(z) = zx2, and g′(s) = x2. With a := 0 and b := z, we thus have from (6.135)

∫_{0}^{zx2} pξ1,ξ2(s, x2) ds = ∫_{0}^{z} pξ1,ξ2(sx2, x2) x2 ds. (6.137)

Substitution in (6.134) then yields

Pζ(z) = ∫_{0}^{∞} ∫_{0}^{z} x2 pξ1,ξ2(sx2, x2) ds dx2 = ∫_{0}^{z} ∫_{0}^{∞} x2 pξ1,ξ2(sx2, x2) dx2 ds. (6.138)


With

d/dz ∫_{0}^{z} f(s) ds = f(z) (6.139)

(cf. Section 3 | Calculus), we then have

pζ(z) = d/dz Pζ(z) = d/dz ∫_{0}^{z} (∫_{0}^{∞} x2 pξ1,ξ2(sx2, x2) dx2) ds = ∫_{0}^{∞} x2 pξ1,ξ2(zx2, x2) dx2. (6.140)

6.8 Bibliographic remarks

The material discussed in this section is standard. We followed Wasserman (2004, Sections 3.1 - 3.3) and DeGroot and Schervish (2012, Sections 4.1 - 4.3, 4.6).

6.9 Study questions

1. Write down the definition of the expectation of a random variable and discuss its intuition.
2. What does it mean for the expectation of a random variable to exist?
3. State the linearity and multiplication properties of expectations.
4. Write down the definition of the variance of a random variable and discuss its intuition.
5. Write down the definition of the standard deviation of a random variable and discuss its intuition.
6. Write down the expectation of the square of a random variable in terms of its variance and expectation.

7. For a random variable ξ and a constant a, what is V(aξ)?

8. Write down the definition of the covariance and correlation of two random variables ξ1 and ξ2.

9. Express the covariance of two random variables ξ1 and ξ2 in terms of expectations.

10. What is the variance of the sum of two random variables ξ1 and ξ2, if ξ1 and ξ2 are independent and in general?

7 | Probability distributions

In this Section, we review the essential probability distributions for the GLM. All distributions of relevance for the theory of the GLM derive from the zero-centred univariate Gaussian distribution that governs the error term of a single data point. Because all these distributions govern the behaviour of continuous random variables, all distributions can be specified in terms of PDFs. We first establish the univariate and multivariate Gaussian distributions, which will allow us to cast the GLM in probabilistic form in Section 7.2. Subsequently, we consider the χ2, T, and F distributions, which derive from nonlinear transformations of Gaussian distributions.

Definition 7.0.1 (Univariate Gaussian distribution, standard normal distribution). Let ξ be a random variable with outcome set R and PDF

pξ : R → R>0, x ↦ pξ(x) := 1/√(2πσ²) exp(−(x − µ)²/(2σ²)). (7.1)

Then ξ is called a Gaussian random variable and said to be distributed according to a Gaussian distribution with parameters µ ∈ R and σ² > 0, for which we write ξ ∼ N(µ, σ²). We abbreviate the PDF of a Gaussian random variable by

N(x; µ, σ²) := 1/√(2πσ²) exp(−(x − µ)²/(2σ²)). (7.2)

A Gaussian random variable with expectation parameter µ = 0 and variance parameter σ² = 1 is called a standard normal (or z) random variable, the distribution of a standard normal (or z) random variable is called the standard normal (or z) distribution, and its PDF is given by

N(z; 0, 1) = 1/√(2π) exp(−z²/2). (7.3)

•

The parameter µ of a univariate Gaussian distribution specifies the location of highest probability density, while the parameter σ² specifies the width of the PDF (Figure 7.1).

Linear-affine transformations An important aspect of Gaussian distributions is that they reproduce under linear-affine transformations. Intuitively, applying a linear-affine transformation to a Gaussian random variable yields a new Gaussian random variable whose expectation and variance parameters result from the transformation of the original random variable's parameters. Formally, we have the following theorem.

Theorem 7.0.1 (Linear-affine transformations of Gaussian random variables). Let ξ ∼ N(µ, σ²) denote a Gaussian random variable with expectation parameter µ and variance parameter σ². Let further υ := f(ξ), where

f : R → R, x ↦ f(x) := ax + b, a, b ∈ R, a ≠ 0. (7.4)

Then

υ ∼ N(aµ + b, a²σ²). (7.5)



Proof. We first note that the inverse function of f is given by

f^{-1} : R → R, y ↦ f^{-1}(y) = (y − b)/a (7.6)

and that f′(x) = a. With the univariate PDF transformation theorem for linear-affine functions (cf. Section 6 | Expectation, covariance, and transformations), we then have for the PDF of υ:

pυ(y) = (1/|a|) N((y − b)/a; µ, σ²)
= (1/|a|) 1/√(2πσ²) exp(−((y − b)/a − µ)²/(2σ²))
= 1/√(2πa²σ²) exp(−(((y − b)/a)² − 2((y − b)/a)µ + µ²)/(2σ²)) (7.7)
= 1/√(2πa²σ²) exp(−((y − b)² − 2(y − b)aµ + a²µ²)/(2a²σ²))
= 1/√(2πa²σ²) exp(−(y − b − aµ)²/(2a²σ²))
= 1/√(2πa²σ²) exp(−(y − (aµ + b))²/(2a²σ²))
= N(y; aµ + b, a²σ²).
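Theorem 7.0.1 can be illustrated by simulation: the sample moments of υ = aξ + b should approximate aµ + b and a²σ². The following Python sketch assumes NumPy; all parameter values are illustrative.

```python
# Minimal simulation sketch of Theorem 7.0.1: upsilon = a*xi + b for xi ~ N(mu, sigma^2)
# (assumes NumPy).
import numpy as np

mu, sigma2 = 1.0, 2.0
a, b = -3.0, 0.5

rng = np.random.default_rng(3)
xi = rng.normal(mu, np.sqrt(sigma2), size=200_000)
upsilon = a * xi + b

print(upsilon.mean(), a * mu + b)           # both approx. -2.5
print(upsilon.var(ddof=1), a**2 * sigma2)   # both approx. 18.0
```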

The z-transformation An often encountered transformation of univariate Gaussian distributions is the z-transformation. The z-transformation is a procedure to transform an arbitrary Gaussian random variable to a standard normal random variable. We have the following theorem, which follows directly from Theorem 6.6.1 on the transformations of univariate random variables by bijective functions.

Theorem 7.0.2 (z-transformation). Let ξ ∼ N(µ, σ²) and let υ := f(ξ) with f(x) := (x − µ)/σ. Then υ ∼ N(0, 1).

Proof. We first note that f^{-1}(y) = σy + µ and f′(x) = 1/σ. With the theorem on the transformations of univariate random variables by bijective functions (Theorem 6.6.1), we then have for the PDF of υ

pυ(y) = (1/|1/σ|) N(σy + µ; µ, σ²)
= √(σ²) 1/√(2πσ²) exp(−(σy + µ − µ)²/(2σ²))
= 1/√(2π) exp(−σ²y²/(2σ²)) (7.8)
= 1/√(2π) exp(−y²/2)
= N(y; 0, 1).

7.1 The multivariate Gaussian distribution

The most important distribution for the theory of the GLM is the multivariate Gaussian distribution. We use the following definition.

Definition 7.1.1 (Multivariate Gaussian distribution). Let ξ be an n-dimensional random vector with outcome set R^n and PDF

pξ : R^n → R>0, x ↦ pξ(x) := (2π)^{−n/2} |Σ|^{−1/2} exp(−(1/2)(x − µ)^T Σ^{−1} (x − µ)). (7.9)

Then ξ is said to be distributed according to a multivariate (or n-dimensional) Gaussian distribution with expectation parameter µ ∈ R^n and positive-definite covariance matrix parameter Σ ∈ R^{n×n}, for which we write ξ ∼ N(µ, Σ). We abbreviate the PDF of a Gaussian random vector by

N(x; µ, Σ) := (2π)^{−n/2} |Σ|^{−1/2} exp(−(1/2)(x − µ)^T Σ^{−1} (x − µ)). (7.10)

•
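The functional form of eq. (7.9) can be checked against a standard library implementation. The following Python sketch assumes NumPy and SciPy; the parameter values and evaluation point are illustrative.

```python
# Minimal check of the multivariate Gaussian PDF of eq. (7.9) against SciPy (assumes NumPy, SciPy).
import numpy as np
from scipy import stats

mu = np.array([1.0, -1.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
x = np.array([0.3, 0.7])

n = len(mu)
d = x - mu
p_manual = (2 * np.pi) ** (-n / 2) * np.linalg.det(Sigma) ** (-0.5) \
           * np.exp(-0.5 * d @ np.linalg.inv(Sigma) @ d)

print(p_manual)
print(stats.multivariate_normal.pdf(x, mean=mu, cov=Sigma))  # should agree with p_manual
```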


Figure 7.1. Univariate and bivariate Gaussian distributions with varying parameter settings.

In the functional form of the multivariate Gaussian distribution, the parameter µ ∈ R^n specifies the location of highest probability density in R^n. The diagonal elements of Σ specify the width of the PDF with respect to the x1, ..., xn components of x. The (i, j)th off-diagonal element of Σ specifies the degree of covariation of the xi and xj components of x. A first understanding of multivariate Gaussian distributions can be achieved by considering the bivariate case n = 2 in some detail (cf. Figure 7.1).

Bivariate Gaussian distributions

Definition 7.1.2 (Bivariate Gaussian distribution). A two-dimensional random vector $\xi := (\xi_1, \xi_2)$ with outcome set $\mathbb{R}^2$ is said to have a bivariate Gaussian distribution, if it has the PDF
\[
p_\xi(x_1, x_2) = N\!\left(\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}; \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \end{pmatrix}\right). \tag{7.11}
\]

•

We first rewrite the PDF of a two-dimensional Gaussian random vector in a manner that eases its analytical characterization.

Theorem 7.1.1 (Bivariate Gaussian PDF). Let $\xi$ denote a two-dimensional random vector with bivariate Gaussian distribution and let
\[
\rho := \frac{\sigma_{12}}{\sqrt{\sigma_{11}\sigma_{22}}} = \frac{\sigma_{21}}{\sqrt{\sigma_{11}\sigma_{22}}}. \tag{7.12}
\]
Then the PDF of $\xi$ can be written as
\[
p_\xi(x_1, x_2) = \left(2\pi\sqrt{\sigma_{11}}\sqrt{\sigma_{22}}\sqrt{1 - \rho^2}\right)^{-1} \exp\left(-\frac{1}{2(1 - \rho^2)}\left(\left(\frac{x_1 - \mu_1}{\sqrt{\sigma_{11}}}\right)^2 - 2\rho\left(\frac{x_1 - \mu_1}{\sqrt{\sigma_{11}}}\right)\left(\frac{x_2 - \mu_2}{\sqrt{\sigma_{22}}}\right) + \left(\frac{x_2 - \mu_2}{\sqrt{\sigma_{22}}}\right)^2\right)\right).
\]

Proof. With Definition 7.1.1, the notation of eq. (7.11), the fact that the inverse of a non-singular $A = (a_{ij})_{1 \le i,j \le 2} \in \mathbb{R}^{2 \times 2}$ is given by
\[
A^{-1} = \frac{1}{a_{11}a_{22} - a_{12}a_{21}} \begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix}, \tag{7.13}
\]
as well as
\[
\sigma_{12}\sigma_{21} = \sigma_{12}^2 = \rho^2 \sigma_{11}\sigma_{22}, \tag{7.14}
\]
we have
\[
\begin{aligned}
& N\!\left(\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}; \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \end{pmatrix}\right) \\
&= (2\pi)^{-1} (\sigma_{11}\sigma_{22} - \sigma_{12}\sigma_{21})^{-\frac{1}{2}} \exp\left(-\frac{1}{2(\sigma_{11}\sigma_{22} - \sigma_{12}\sigma_{21})} \begin{pmatrix} x_1 - \mu_1 & x_2 - \mu_2 \end{pmatrix} \begin{pmatrix} \sigma_{22} & -\sigma_{12} \\ -\sigma_{21} & \sigma_{11} \end{pmatrix} \begin{pmatrix} x_1 - \mu_1 \\ x_2 - \mu_2 \end{pmatrix}\right) \\
&= (2\pi)^{-1} \left(\sigma_{11}\sigma_{22}(1 - \rho^2)\right)^{-\frac{1}{2}} \exp\left(-\frac{1}{2(1 - \rho^2)} \cdot \frac{(x_1 - \mu_1)^2\sigma_{22} - 2(x_1 - \mu_1)(x_2 - \mu_2)\sigma_{12} + (x_2 - \mu_2)^2\sigma_{11}}{\sigma_{11}\sigma_{22}}\right) \\
&= \left(2\pi\sqrt{\sigma_{11}}\sqrt{\sigma_{22}}\sqrt{1 - \rho^2}\right)^{-1} \exp\left(-\frac{1}{2(1 - \rho^2)}\left(\frac{(x_1 - \mu_1)^2}{\sigma_{11}} - 2\frac{(x_1 - \mu_1)(x_2 - \mu_2)\rho\sqrt{\sigma_{11}\sigma_{22}}}{\sigma_{11}\sigma_{22}} + \frac{(x_2 - \mu_2)^2}{\sigma_{22}}\right)\right) \\
&= \left(2\pi\sqrt{\sigma_{11}}\sqrt{\sigma_{22}}\sqrt{1 - \rho^2}\right)^{-1} \exp\left(-\frac{1}{2(1 - \rho^2)}\left(\left(\frac{x_1 - \mu_1}{\sqrt{\sigma_{11}}}\right)^2 - 2\rho\left(\frac{x_1 - \mu_1}{\sqrt{\sigma_{11}}}\right)\left(\frac{x_2 - \mu_2}{\sqrt{\sigma_{22}}}\right) + \left(\frac{x_2 - \mu_2}{\sqrt{\sigma_{22}}}\right)^2\right)\right).
\end{aligned} \tag{7.15}
\]

A bivariate Gaussian random vector can be constructed from linear combinations of two independent standard normal random variables. Specifically, we have the following theorem.

Theorem 7.1.2 (Construction of bivariate Gaussian distributions). Let $\zeta_1$ and $\zeta_2$ denote two independent standard normal random variables, let $\mu_1, \mu_2 \in \mathbb{R}$, let $\sigma_{11}, \sigma_{22} > 0$, let $\rho \in (-1, 1)$, and define
\[
\xi_1 := \sqrt{\sigma_{11}}\,\zeta_1 + \mu_1 \quad \text{and} \quad \xi_2 := \sqrt{\sigma_{22}}\left(\rho\zeta_1 + (1 - \rho^2)^{\frac{1}{2}}\zeta_2\right) + \mu_2. \tag{7.16}
\]
Then the joint distribution of $\xi_1$ and $\xi_2$ is a bivariate Gaussian distribution with expectation and covariance matrix parameters
\[
\mu = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix} \quad \text{and} \quad \Sigma = \begin{pmatrix} \sigma_{11} & \rho\sqrt{\sigma_{11}\sigma_{22}} \\ \rho\sqrt{\sigma_{11}\sigma_{22}} & \sigma_{22} \end{pmatrix}, \tag{7.17}
\]
respectively.

Proof. See DeGroot and Schervish(2012, pp. 338-339).
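The construction of Theorem 7.1.2 is easy to check by simulation. The following minimal sketch (using NumPy; the chosen values of $\mu_1, \mu_2, \sigma_{11}, \sigma_{22}, \rho$, the sample size, and the seed are arbitrary illustrative assumptions) compares the sample covariance matrix of the constructed variables with the analytical $\Sigma$ of eq. (7.17):

```python
import numpy as np

rng = np.random.default_rng(1)
mu1, mu2 = 1.0, -2.0            # illustrative expectation parameters
s11, s22, rho = 4.0, 1.0, 0.6   # illustrative variance and correlation parameters

z1 = rng.standard_normal(200_000)
z2 = rng.standard_normal(200_000)

# Construction of eq. (7.16)
xi1 = np.sqrt(s11) * z1 + mu1
xi2 = np.sqrt(s22) * (rho * z1 + np.sqrt(1 - rho**2) * z2) + mu2

# Sample covariance matrix should approximate Sigma of eq. (7.17)
sigma = np.array([[s11, rho * np.sqrt(s11 * s22)],
                  [rho * np.sqrt(s11 * s22), s22]])
print(np.cov(xi1, xi2))
print(sigma)
```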

Theorem 7.1.3 (Correlation of Gaussian random variables). Let $\xi_1$ and $\xi_2$ be two random variables with joint bivariate Gaussian distribution, let $\sigma_{11}$ and $\sigma_{22}$ denote the diagonal elements of the covariance matrix parameter governing their PDF, and let $\sigma_{12} = \sigma_{21}$ denote the off-diagonal elements of the covariance matrix parameter governing their PDF. Finally, let
\[
\rho := \frac{\sigma_{12}}{\sqrt{\sigma_{11}\sigma_{22}}}. \tag{7.18}
\]
Then the correlation of $\xi_1$ and $\xi_2$ is $\rho$.

Proof. See Casella and Berger(2012, p. 176).


Independent Gaussian random variables A central element in the theory of the GLM is the fact that $n$ independent univariate Gaussian random variables can be modelled by an $n$-dimensional Gaussian distribution with spherical covariance matrix parameter. More specifically, consider $n$ univariate Gaussian random variables with possibly different expectation parameters but a common variance parameter. These random variables can equivalently be described by an $n$-dimensional Gaussian random vector whose expectation parameter is the concatenation of the individual univariate expectation parameters and whose covariance matrix parameter is the $n \times n$ identity matrix multiplied by the common variance parameter. Formally, we have the following theorem.

Theorem 7.1.4 (Independent Gaussian distributions). For $i = 1, ..., n$, let $N(x_i; \mu_i, \sigma^2)$ denote the PDFs of $n$ independent univariate Gaussian random variables $\xi_1, ..., \xi_n$ with $\mu_1, ..., \mu_n \in \mathbb{R}$ and $\sigma^2 > 0$. Further, let $N(x; \mu, \sigma^2 I_n)$ denote the PDF of an $n$-variate Gaussian random vector $\xi$ with expectation parameter $\mu := (\mu_1, ..., \mu_n)^T$. Then
\[
p_\xi(x) = p_{\xi_1, ..., \xi_n}(x_1, ..., x_n) \tag{7.19}
\]
and in particular
\[
N\!\left(x; \mu, \sigma^2 I_n\right) = \prod_{i=1}^n N(x_i; \mu_i, \sigma^2). \tag{7.20}
\]



Proof. We show the identity of the multivariate Gaussian PDF $N(x; \mu, \sigma^2 I_n)$ with the product of $n$ univariate Gaussian PDFs $N(x_i; \mu_i, \sigma^2)$, where $\mu_i$ denotes the $i$th entry of $\mu \in \mathbb{R}^n$. With definition (7.9), we have
\[
\begin{aligned}
N\!\left(x; \mu, \sigma^2 I_n\right) &= (2\pi)^{-\frac{n}{2}} \left|\sigma^2 I_n\right|^{-\frac{1}{2}} \exp\left(-\frac{1}{2}(x - \mu)^T \left(\sigma^2 I_n\right)^{-1}(x - \mu)\right) \\
&= \left(\prod_{i=1}^n (2\pi)^{-\frac{1}{2}}\right) \left(\sigma^2\right)^{-\frac{n}{2}} \exp\left(-\frac{1}{2\sigma^2}(x - \mu)^T (x - \mu)\right) \\
&= \left(\prod_{i=1}^n \left(2\pi\sigma^2\right)^{-\frac{1}{2}}\right) \exp\left(-\frac{1}{2\sigma^2}\sum_{i=1}^n (x_i - \mu_i)^2\right) \\
&= \prod_{i=1}^n \frac{1}{\sqrt{2\pi\sigma^2}} \prod_{i=1}^n \exp\left(-\frac{1}{2\sigma^2}(x_i - \mu_i)^2\right) \\
&= \prod_{i=1}^n \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{1}{2\sigma^2}(x_i - \mu_i)^2\right) \\
&= \prod_{i=1}^n N(x_i; \mu_i, \sigma^2),
\end{aligned} \tag{7.21}
\]
where the last equality follows with definition (7.1).

From a sampling perspective, Theorem 7.1.4 can be conceived as follows: the sequential sampling of values from $n$ independent univariate Gaussian random variables with expectation parameters $\mu_i$, $i = 1, ..., n$ and common variance parameter $\sigma^2$ is equivalent to the simultaneous sampling of $n$ univariate marginal Gaussian random variables in the form of an $n$-dimensional random vector distributed according to a multivariate Gaussian distribution with expectation parameter vector $\mu = (\mu_1, ..., \mu_n)^T$ and spherical covariance matrix parameter $\sigma^2 I_n$.
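Eq. (7.20) can also be verified numerically at any evaluation point. The following minimal sketch (using SciPy's `scipy.stats.norm` and `scipy.stats.multivariate_normal`; the expectation vector, variance, and evaluation point are arbitrary illustrative assumptions) compares the spherical multivariate Gaussian PDF with the product of the corresponding univariate PDFs:

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

# Illustrative (hypothetical) parameter choices
mu = np.array([0.5, -1.0, 2.0])
sigma2 = 1.5
x = np.array([0.2, -0.7, 1.9])      # arbitrary evaluation point

lhs = multivariate_normal.pdf(x, mean=mu, cov=sigma2 * np.eye(3))
rhs = np.prod(norm.pdf(x, loc=mu, scale=np.sqrt(sigma2)))

print(lhs, rhs)   # both values should coincide up to floating point error
```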

Linear-affine transformations Like univariate Gaussian distributions, multivariate Gaussian distributions reproduce under linear-affine transformations. As in the univariate case, applying a linear-affine transformation to a Gaussian random vector yields a new Gaussian random vector with expectation and covariance matrix parameters that result from the corresponding transformation of the original random vector's parameters. We first note the following result without proof, which applies to the transformation of Gaussian random vectors by invertible matrices.

Theorem 7.1.5 (Nonsingular transformations of Gaussian random vectors). Let $\xi \sim N(\mu, \Sigma)$ denote an $n$-dimensional Gaussian random vector and let $\upsilon := A\xi$ with invertible $A \in \mathbb{R}^{n \times n}$. Then
\[
\upsilon \sim N\!\left(A\mu, A\Sigma A^T\right). \tag{7.22}
\]


We also note without proof that Theorem 7.1.5 can be generalized to the case of arbitrary linear-affine transformations, as stated in the following theorem. For a (non-trivial) proof of the theorem in the linear transformation case, see Anderson (2003, Section 2.4).

Theorem 7.1.6 (Linear-affine transformations of Gaussian random vectors). Let $\xi \sim N(\mu, \Sigma)$ denote an $n$-dimensional Gaussian random vector and let $\upsilon := A\xi + b$ with $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^m$. Then
\[
\upsilon \sim N\!\left(A\mu + b, A\Sigma A^T\right). \tag{7.23}
\]



7.2 The General Linear Model

The results on multivariate Gaussian distributions discussed in the previous section allow us to reconsider the GLM as introduced in Section 1.3 in a more rigorous fashion: the GLM with spherical covariance matrix is a multivariate Gaussian distribution of an $n$-dimensional random vector $y$ with expectation parameter $\mu := X\beta \in \mathbb{R}^n$ and covariance matrix parameter $\Sigma := \sigma^2 I_n$.

To see this, first recall that the GLM models $n$ random variables $y_i$, $i = 1, ..., n$ by a linear combination of $p$ (non-random) predictor variables $x_{i1}, ..., x_{ip}$ under independent additive Gaussian noise contributions with expectation 0 and variance $\sigma^2$, i.e.,
\[
y_i = x_{i1}\beta_1 + x_{i2}\beta_2 + \cdots + x_{ip}\beta_p + \varepsilon_i, \quad \varepsilon_i \sim N(0, \sigma^2) \text{ for } i = 1, ..., n \text{ and mutually independent } \varepsilon_i. \tag{7.24}
\]
In vector-matrix notation and with Theorem 7.1.4, eq. (7.24) can be restated as
\[
y = X\beta + \varepsilon, \quad \varepsilon \sim N(0_n, \sigma^2 I_n) \text{ with } y \in \mathbb{R}^n, \; X \in \mathbb{R}^{n \times p}, \text{ and } \beta \in \mathbb{R}^p. \tag{7.25}
\]
The data vector $y$ thus corresponds to a random vector that results from the linear-affine transformation of a Gaussian random vector $\varepsilon$ with expectation parameter $0_n$ and covariance matrix parameter $\sigma^2 I_n$. Formally, we thus have that $y = f(\varepsilon)$ with $\varepsilon \sim N(0_n, \sigma^2 I_n)$ and linear-affine transformation
\[
f : \mathbb{R}^n \to \mathbb{R}^n, \quad \varepsilon \mapsto f(\varepsilon) := I_n\varepsilon + X\beta. \tag{7.26}
\]
Theorem 7.1.6 then yields
\[
y \sim N\!\left(I_n 0_n + X\beta, I_n\sigma^2 I_n I_n^T\right) \tag{7.27}
\]
and thus
\[
y \sim N\!\left(X\beta, \sigma^2 I_n\right). \tag{7.28}
\]
The GLM with mutually independent noise contributions thus models an $n$-dimensional data set of scalar data points as the realization of an $n$-dimensional random vector $y$ with expectation parameter $X\beta$ and covariance matrix parameter $\sigma^2 I_n$. Specific instantiations of the GLM, such as T-tests, simple or multiple linear regression, ANOVAs, or ANCOVAs, then correspond to specific assumptions about the design matrix and beta parameters (cf. Sections 11 - 15). Below, we briefly review two GLM designs that will serve as working examples throughout Chapters 7 - 10.

Example 1 (Independent and identically distributed Gaussian samples). Consider the scenario of $n$ independent samples from a univariate Gaussian distribution with expectation parameter $\mu$ and variance parameter $\sigma^2$,
\[
y_i \sim N(\mu, \sigma^2) \text{ for } i = 1, ..., n. \tag{7.29}
\]
From the above, eq. (7.29) is equivalent to
\[
y_i = \mu + \varepsilon_i, \quad \varepsilon_i \sim N(0, \sigma^2) \text{ for } i = 1, ..., n \text{ and mutually independent } \varepsilon_i. \tag{7.30}
\]
Based on the above, eq. (7.30) can then equivalently be expressed as
\[
y \sim N(X\beta, \sigma^2 I_n) \text{ with } X := 1_n \in \mathbb{R}^{n \times 1}, \; \beta := \mu \in \mathbb{R}^1, \; \sigma^2 > 0. \tag{7.31}
\]
In other words, the scenario of $n$ independent samples from a univariate Gaussian distribution with expectation parameter $\mu$ and variance parameter $\sigma^2$ corresponds to sampling from a GLM with a design matrix given by a vector of all 1's, a single beta parameter that corresponds to the expectation parameter $\mu$ of the univariate Gaussian distribution, and a spherical covariance matrix parameter $\sigma^2 I_n$, where $\sigma^2$ corresponds to the variance parameter of the univariate Gaussian distribution.
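The equivalence of eqs. (7.29) and (7.31) can be illustrated with a short simulation sketch (NumPy; the values of $n$, $\mu$, $\sigma^2$, and the seed are arbitrary assumptions for this example), which generates data once as $n$ univariate draws and once from the GLM formulation with $X = 1_n$:

```python
import numpy as np

rng = np.random.default_rng(2)
n, mu, sigma2 = 10, 3.0, 2.0      # illustrative values

# Sampling formulation (7.29): n independent univariate draws
y_univariate = rng.normal(mu, np.sqrt(sigma2), size=n)

# GLM formulation (7.31): y = X beta + noise with X = 1_n, beta = mu
X = np.ones((n, 1))
beta = np.array([mu])
y_glm = X @ beta + rng.normal(0.0, np.sqrt(sigma2), size=n)

print(y_univariate.mean(), y_glm.mean())   # both samples scatter around mu
```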


Example 2 (Simple linear regression). Consider again the simple linear regression scenario as discussed in Section 1.3. Recall that the simple linear regression model conceives each data point as the realization of a random variable constructed by the summation of an offset $a$, a scaled predictor $x_i$, where the scaling is represented by a slope parameter $b$, and additive zero-centred Gaussian noise with variance $\sigma^2$,
\[
y_i = a + bx_i + \varepsilon_i, \quad \varepsilon_i \sim N(0, \sigma^2) \text{ for } i = 1, ..., n \text{ and mutually independent } \varepsilon_i. \tag{7.32}
\]
From the above, eq. (7.32) is equivalent to
\[
y_i \sim N(\mu_i, \sigma^2) \text{ with } \mu_i := a + bx_i \text{ and with mutually independent } y_1, ..., y_n. \tag{7.33}
\]
Based on the above, eq. (7.33) can then equivalently be expressed as
\[
y \sim N(X\beta, \sigma^2 I_n) \text{ with } X := \begin{pmatrix} 1 & x_1 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix} \in \mathbb{R}^{n \times 2}, \; \beta := \begin{pmatrix} a \\ b \end{pmatrix} \in \mathbb{R}^2, \; \sigma^2 > 0. \tag{7.34}
\]
In other words, the simple linear regression scenario with offset parameter $a$, slope parameter $b$, and Gaussian noise variance parameter $\sigma^2 > 0$ corresponds to a GLM with a design matrix with two columns, the first being a vector of all 1's and the second being the vector of predictor variables, a two-dimensional beta parameter vector comprising the offset and slope parameters, and a spherical covariance matrix parameter $\sigma^2 I_n$, where $\sigma^2$ corresponds to the variance of the additive Gaussian noise on each data point.
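A minimal sketch of sampling from the simple linear regression GLM of eq. (7.34) follows (NumPy; the offset, slope, noise variance, predictor values, and seed are arbitrary illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
n, a, b, sigma2 = 10, 1.0, 0.5, 1.0      # illustrative values
x = np.arange(n, dtype=float)            # predictor values x_1, ..., x_n

# Design matrix of eq. (7.34): a column of ones and the predictor column
X = np.column_stack([np.ones(n), x])
beta = np.array([a, b])

# Sampling from the GLM y ~ N(X beta, sigma^2 I_n)
y = X @ beta + rng.normal(0.0, np.sqrt(sigma2), size=n)
print(y)
```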

7.3 The Gamma distribution

Definition 7.3.1. Let $\xi$ be a random variable with outcome set $\mathbb{R}_{>0}$ and PDF
\[
p_\xi : \mathbb{R}_{>0} \to \mathbb{R}_{>0}, \quad x \mapsto p_\xi(x) := \frac{1}{\Gamma(\alpha)\beta^\alpha} x^{\alpha - 1} \exp\left(-\frac{x}{\beta}\right). \tag{7.35}
\]
Then $\xi$ is called a Gamma random variable and is said to be distributed according to a Gamma distribution with $\alpha > 0$ and $\beta > 0$, for which we write $\xi \sim G(\alpha, \beta)$. We abbreviate the PDF of a Gamma random variable by
\[
G(x; \alpha, \beta) := \frac{1}{\Gamma(\alpha)\beta^\alpha} x^{\alpha - 1} \exp\left(-\frac{x}{\beta}\right). \tag{7.36}
\]
•

The Gamma distribution family comprises two important special cases: for an integer n, the Gamma distribution with α = n/2 and β = 2 corresponds to the χ2 distribution (see below), while the Gamma distribution with α = 1 and β > 0 is known as the exponential distribution (Figure 7.2).

7.4 The χ2 distribution

The χ² distribution is the distribution of the sum of $n$ squared independent standard normal random variables. More specifically, we have the following definition and theorems (cf. Figure 7.2).

Definition 7.4.1 (Chi-squared random variable). A random variable $\xi$ is called a chi-squared random variable with $n$ degrees of freedom, if its PDF is given by
\[
p_\xi : \mathbb{R}_{>0} \to \mathbb{R}_{>0}, \quad x \mapsto p_\xi(x) := \frac{1}{\Gamma\!\left(\frac{n}{2}\right)2^{\frac{n}{2}}} x^{\frac{n}{2} - 1} \exp\left(-\frac{1}{2}x\right). \tag{7.37}
\]
We abbreviate the PDF of a chi-squared random variable by
\[
\chi^2(x; n) := \frac{1}{\Gamma\!\left(\frac{n}{2}\right)2^{\frac{n}{2}}} x^{\frac{n}{2} - 1} \exp\left(-\frac{1}{2}x\right). \tag{7.38}
\]
•

Theorem 7.4.1 (Squared standard normal random variable). Let $\xi \sim N(0, 1)$ be a standard normal random variable and let $\zeta := \xi^2$. Then $\zeta$ is a chi-squared random variable with one degree of freedom.



Figure 7.2. Gamma, χ2, t, and f distributions with varying parameter settings.

Proof. We first note that with the univariate PDF theorem for piecewise bijective functions (cf. Section 6 | Expectation, covariance, and transformations), the PDF of a random variable $\zeta = f(\xi)$ resulting from the transformation of a random variable $\xi$ with PDF $p_\xi$ by a piecewise differentiable and invertible function is given by
\[
p_\zeta(z) = \sum_{i=1}^k 1_{\mathcal{X}'_i}(z)\, \frac{1}{|f_i'(f_i^{-1}(z))|}\, p_\xi\!\left(f_i^{-1}(z)\right). \tag{7.39}
\]
We next define
\[
\mathcal{X}_1 := \,]-\infty, 0[\,, \quad \mathcal{X}_2 := \,]0, \infty[\,, \quad \text{and} \quad \mathcal{X}'_i := \mathbb{R}_{>0}, \tag{7.40}
\]
as well as
\[
f_i : \mathcal{X}_i \to \mathbb{R}_{>0}, \quad x \mapsto f_i(x) := x^2 =: z \quad \text{for } i = 1, 2 \tag{7.41}
\]
with derivatives
\[
f_i' : \mathcal{X}_i \to \mathbb{R}, \quad x \mapsto f_i'(x) = 2x \quad \text{for } i = 1, 2 \tag{7.42}
\]
and with inverse functions
\[
f_1^{-1} : \mathbb{R}_{>0} \to \mathcal{X}_1, \; z \mapsto f_1^{-1}(z) := -\sqrt{z} \quad \text{and} \quad f_2^{-1} : \mathbb{R}_{>0} \to \mathcal{X}_2, \; z \mapsto f_2^{-1}(z) := \sqrt{z}. \tag{7.43}
\]
From eq. (7.39), we then have
\[
\begin{aligned}
p_\zeta(z) &= 1_{\mathbb{R}_{>0}}(z)\frac{1}{|f_1'(f_1^{-1}(z))|} p_\xi\!\left(f_1^{-1}(z)\right) + 1_{\mathbb{R}_{>0}}(z)\frac{1}{|f_2'(f_2^{-1}(z))|} p_\xi\!\left(f_2^{-1}(z)\right) \\
&= \frac{1}{|2(-\sqrt{z})|}\frac{1}{\sqrt{2\pi}}\exp\left(-\frac{1}{2}(-\sqrt{z})^2\right) + \frac{1}{|2\sqrt{z}|}\frac{1}{\sqrt{2\pi}}\exp\left(-\frac{1}{2}(\sqrt{z})^2\right) \\
&= \frac{1}{2\sqrt{z}}\frac{1}{\sqrt{2\pi}}\exp\left(-\frac{1}{2}z\right) + \frac{1}{2\sqrt{z}}\frac{1}{\sqrt{2\pi}}\exp\left(-\frac{1}{2}z\right) \\
&= \frac{1}{\sqrt{2\pi}}\frac{1}{\sqrt{z}}\exp\left(-\frac{1}{2}z\right).
\end{aligned} \tag{7.44}
\]
On the other hand, we have for the PDF of a chi-squared random variable $\zeta$ with one degree of freedom
\[
p_\zeta(z) = \frac{1}{\Gamma\!\left(\frac{1}{2}\right)2^{\frac{1}{2}}} z^{\frac{1}{2} - 1}\exp\left(-\frac{1}{2}z\right) = \frac{1}{\sqrt{2\pi}}\frac{1}{\sqrt{z}}\exp\left(-\frac{1}{2}z\right). \tag{7.45}
\]

Without proof, we state the following theorem.

Theorem 7.4.2 (Sum of $n$ independent squared standard normal random variables). For $i = 1, ..., n$, let $\xi_i \sim N(0, 1)$ denote independent standard normal random variables and let $\zeta := \sum_{i=1}^n \xi_i^2$. Then $\zeta$ is a chi-squared random variable with $n$ degrees of freedom.
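Theorem 7.4.2 is easy to probe by simulation. The following minimal sketch (NumPy and SciPy; the degrees of freedom, sample size, and seed are arbitrary illustrative assumptions) compares the sample mean and variance of sums of squared standard normal draws with the analytical moments of the $\chi^2(n)$ distribution:

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(4)
n = 5                                          # degrees of freedom, illustrative

# zeta = sum of n squared independent standard normal variables
zeta = (rng.standard_normal((100_000, n)) ** 2).sum(axis=1)

# Compare empirical moments with the chi^2(n) moments (n and 2n)
print(zeta.mean(), zeta.var(ddof=1))
print(chi2.mean(n), chi2.var(n))
```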

7.5 The t distribution

Definition 7.5.1. Let $T$ be a random variable with outcome set $\mathbb{R}$ and PDF
\[
p_T : \mathbb{R} \to \mathbb{R}_{>0}, \quad t \mapsto p_T(t) := \frac{1}{\sqrt{\pi n}}\,\frac{1}{\Gamma\!\left(\frac{n}{2}\right)}\,\Gamma\!\left(\frac{n + 1}{2}\right)\left(\frac{1}{1 + \frac{t^2}{n}}\right)^{\frac{n + 1}{2}}. \tag{7.46}
\]
Then $T$ is called a t random variable and is said to be distributed according to a t distribution with $n$ degrees of freedom, for which we write $T \sim t(n)$. We abbreviate the PDF of a t random variable by
\[
T(t; n) := \frac{1}{\sqrt{\pi n}}\,\frac{1}{\Gamma\!\left(\frac{n}{2}\right)}\,\Gamma\!\left(\frac{n + 1}{2}\right)\left(\frac{1}{1 + \frac{t^2}{n}}\right)^{\frac{n + 1}{2}}. \tag{7.47}
\]


We have the following theorem (Figure 7.2).

Theorem 7.5.1 (t distribution). Let $Z \sim N(0, 1)$ be a standard normal random variable, let $V \sim \chi^2(n)$ be a chi-squared random variable with $n$ degrees of freedom, and assume that $Z$ and $V$ are independent random variables. Then the random variable
\[
T := \frac{Z}{\sqrt{V/n}} \tag{7.48}
\]
is a t random variable with $n$ degrees of freedom.

Proof. We first note that the joint distribution of $Z$ and $V$ has the PDF
\[
p_{Z,V}(z, v) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{1}{2}z^2\right)\frac{1}{\Gamma\!\left(\frac{n}{2}\right)2^{\frac{n}{2}}} v^{\frac{n}{2} - 1}\exp\left(-\frac{1}{2}v\right). \tag{7.49}
\]
We next consider the transformation
\[
f : \mathbb{R}^2 \to \mathbb{R}^2, \quad (z, v) \mapsto f(z, v) := \left(\frac{z}{\sqrt{v/n}}, v\right) =: (t, w) \tag{7.50}
\]
and use the multivariate PDF transform theorem to derive the PDF of $(t, w)$. To this end, we first recall that if $\xi$ is an $n$-dimensional random vector with PDF $p_\xi$ and $\upsilon := f(\xi)$ for differentiable and bijective $f : \mathbb{R}^n \to \mathbb{R}^n$, then the PDF of $\upsilon$ is given by
\[
p_\upsilon : \mathbb{R}^n \to \mathbb{R}_{\ge 0}, \quad y \mapsto p_\upsilon(y) := \frac{1}{|J^f(f^{-1}(y))|}\, p_\xi\!\left(f^{-1}(y)\right). \tag{7.51}
\]
For the current transformation $f$, we first note that
\[
f^{-1} : \mathbb{R}^2 \to \mathbb{R}^2, \quad (t, w) \mapsto f^{-1}(t, w) := \left(\sqrt{w/n}\, t, w\right), \tag{7.52}
\]
because
\[
f^{-1}(f(z, v)) = f^{-1}\!\left(\frac{z}{\sqrt{v/n}}, v\right) = \left(\frac{\sqrt{v/n}\, z}{\sqrt{v/n}}, v\right) = (z, v) \quad \text{for all } (z, v) \in \mathbb{R}^2. \tag{7.53}
\]
We next note that the determinant of the Jacobian of $f$ evaluates to
\[
|J^f(z, v)| = \begin{vmatrix} \frac{\partial}{\partial z}\frac{z}{\sqrt{v/n}} & \frac{\partial}{\partial v}\frac{z}{\sqrt{v/n}} \\ \frac{\partial}{\partial z}v & \frac{\partial}{\partial v}v \end{vmatrix} = \left(\frac{v}{n}\right)^{-1/2}, \tag{7.54}
\]
such that
\[
\frac{1}{|J^f(f^{-1}(t, w))|} = \left(\frac{w}{n}\right)^{1/2}. \tag{7.55}
\]
Substitution in (7.51) then yields
\[
p_{T,W}(t, w) = \left(\frac{w}{n}\right)^{1/2} p_{Z,V}\!\left(\sqrt{w/n}\, t, w\right), \tag{7.56}
\]
and thus
\[
\begin{aligned}
p_T(t) &= \int_0^\infty p_{T,W}(t, w)\, dw \\
&= \int_0^\infty \left(\frac{w}{n}\right)^{1/2} p_{Z,V}\!\left(\sqrt{w/n}\, t, w\right) dw \\
&= \int_0^\infty \left(\frac{w}{n}\right)^{1/2}\frac{1}{\sqrt{2\pi}}\exp\left(-\frac{1}{2}\left(\sqrt{w/n}\, t\right)^2\right)\frac{1}{\Gamma\!\left(\frac{n}{2}\right)2^{\frac{n}{2}}} w^{\frac{n}{2} - 1}\exp\left(-\frac{1}{2}w\right) dw \\
&= \frac{1}{\sqrt{2\pi}}\frac{1}{\Gamma\!\left(\frac{n}{2}\right)2^{\frac{n}{2}} n^{\frac{1}{2}}}\int_0^\infty \exp\left(-\frac{1}{2}\frac{w}{n}t^2\right)\exp\left(-\frac{1}{2}w\right) w^{\frac{n}{2} - 1} w^{\frac{1}{2}}\, dw \\
&= \frac{1}{\sqrt{2\pi}}\frac{1}{\Gamma\!\left(\frac{n}{2}\right)2^{\frac{n}{2}} n^{\frac{1}{2}}}\int_0^\infty \exp\left(-\frac{1}{2}\left(1 + \frac{t^2}{n}\right)w\right) w^{\frac{n+1}{2} - 1}\, dw.
\end{aligned} \tag{7.57}
\]
We next note that the integrand in eq. (7.57) corresponds to the kernel of a Gamma PDF with parameters $\alpha = \frac{n+1}{2}$ and $\beta = \frac{2}{1 + \frac{t^2}{n}}$. Explicitly, with
\[
G(w; \alpha, \beta) = \frac{1}{\Gamma(\alpha)\beta^\alpha} w^{\alpha - 1}\exp\left(-\frac{w}{\beta}\right), \tag{7.58}
\]
we have
\[
G\!\left(w; \frac{n + 1}{2}, \frac{2}{1 + \frac{t^2}{n}}\right) = \frac{1}{\Gamma\!\left(\frac{n+1}{2}\right)\left(\frac{2}{1 + \frac{t^2}{n}}\right)^{\frac{n+1}{2}}}\exp\left(-\frac{1}{2}\left(1 + \frac{t^2}{n}\right)w\right) w^{\frac{n+1}{2} - 1}. \tag{7.59}
\]
We thus have
\[
p_T(t) = \frac{1}{\sqrt{2\pi}}\frac{1}{\Gamma\!\left(\frac{n}{2}\right)2^{\frac{n}{2}} n^{\frac{1}{2}}}\,\Gamma\!\left(\frac{n + 1}{2}\right)\left(\frac{2}{1 + \frac{t^2}{n}}\right)^{\frac{n+1}{2}}\int_0^\infty G\!\left(w; \frac{n + 1}{2}, \frac{2}{1 + \frac{t^2}{n}}\right) dw. \tag{7.61}
\]
Finally, we note that the integral term of eq. (7.61) corresponds to the normalization of the Gamma PDF and thus evaluates to 1 (cf. Section 2 | Sets, sums, and functions). We thus have
\[
\begin{aligned}
p_T(t) &= \frac{1}{\sqrt{2\pi}}\frac{1}{\Gamma\!\left(\frac{n}{2}\right)2^{\frac{n}{2}} n^{\frac{1}{2}}}\,\Gamma\!\left(\frac{n + 1}{2}\right)\left(\frac{2}{1 + \frac{t^2}{n}}\right)^{\frac{n+1}{2}} \\
&= (2\pi)^{-\frac{1}{2}}\, n^{-\frac{1}{2}}\, 2^{-\frac{n}{2}}\, 2^{\frac{n+1}{2}}\,\frac{1}{\Gamma\!\left(\frac{n}{2}\right)}\,\Gamma\!\left(\frac{n + 1}{2}\right)\left(\frac{1}{1 + \frac{t^2}{n}}\right)^{\frac{n+1}{2}} \\
&= \frac{1}{\sqrt{\pi n}}\,\frac{1}{\Gamma\!\left(\frac{n}{2}\right)}\,\Gamma\!\left(\frac{n + 1}{2}\right)\left(\frac{1}{1 + \frac{t^2}{n}}\right)^{\frac{n+1}{2}},
\end{aligned} \tag{7.62}
\]
which corresponds to the PDF of a t random variable with $n$ degrees of freedom.
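The construction of eq. (7.48) can also be probed numerically. The sketch below (NumPy and SciPy; the degrees of freedom, sample size, and seed are arbitrary illustrative assumptions) forms $T = Z/\sqrt{V/n}$ from simulated $Z$ and $V$ and compares an empirical quantile to the analytical $t(n)$ quantile:

```python
import numpy as np
from scipy.stats import t as t_dist, chi2

rng = np.random.default_rng(5)
n = 8                                     # degrees of freedom, illustrative

Z = rng.standard_normal(200_000)
V = chi2.rvs(n, size=200_000, random_state=rng)
T = Z / np.sqrt(V / n)                    # construction of eq. (7.48)

# Compare empirical and analytical 97.5% quantiles
print(np.quantile(T, 0.975), t_dist.ppf(0.975, n))
```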

7.6 The f distribution

Definition 7.6.1. Let $F$ be a random variable with outcome set $\mathbb{R}_{>0}$ and PDF
\[
p_F : \mathbb{R}_{>0} \to \mathbb{R}_{>0}, \quad f \mapsto p_F(f) := \frac{\Gamma\!\left(\frac{n + m}{2}\right)}{\Gamma\!\left(\frac{n}{2}\right)\Gamma\!\left(\frac{m}{2}\right)}\left(\frac{n}{m}\right)^{\frac{n}{2}}\frac{f^{\frac{n}{2} - 1}}{\left(1 + \frac{n}{m}f\right)^{\frac{n + m}{2}}}. \tag{7.63}
\]
Then $F$ is called an f random variable and is said to be distributed according to an f distribution with $n$ and $m$ degrees of freedom, for which we write $F \sim f(n, m)$. We abbreviate the PDF of an f random variable by
\[
F(f; n, m) := \frac{\Gamma\!\left(\frac{n + m}{2}\right)}{\Gamma\!\left(\frac{n}{2}\right)\Gamma\!\left(\frac{m}{2}\right)}\left(\frac{n}{m}\right)^{\frac{n}{2}}\frac{f^{\frac{n}{2} - 1}}{\left(1 + \frac{n}{m}f\right)^{\frac{n + m}{2}}}. \tag{7.64}
\]

We note the following theorem without proof.

Theorem 7.6.1 (f distribution). Let $V \sim \chi^2(n)$ and $W \sim \chi^2(m)$ be two independent chi-squared random variables with $n$ and $m$ degrees of freedom, respectively. Then the random variable
\[
F := \frac{V/n}{W/m} \tag{7.65}
\]
is an f random variable with $n$ and $m$ degrees of freedom.


7.7 Bibliographic remarks

The material presented in this Section is standard. DeGroot and Schervish (2012) and Casella and Berger (2012) provide comprehensive overviews of the distribution theory for classical frequentist statistics. The basic GLM theory is covered in many books, for example Seber (2015), Christensen (2011), Rao (2002), and Hocking (2003), to name just a few.

7.8 Study questions

1. Write down the PDF of a univariate Gaussian distribution.
2. Write down the PDF of a standard normal distribution.
3. Write down the PDF of a multivariate Gaussian distribution and comment on its components.
4. State the theorem on linear-affine transformations of Gaussian random vectors.
5. State the theorem on independent Gaussian distributions.
6. State the theorem on squared standard normal random variables.
7. State the theorem on t distributions.
8. Write down the GLM in multivariate Gaussian form and comment on its components.
9. Write down the GLM formulation of independent and identically distributed Gaussian samples.
10. Write down the GLM formulation of a simple linear regression model.

8 | Maximum likelihood estimation

Maximum likelihood (ML) estimation is a general principle to derive point estimators in probabilistic models. ML estimation was popularized by Fisher at the beginning of the 20th century, but already found application in the works of Laplace (1749-1827) and Gauss (1777-1855) (Aldrich, 1997). ML estimation is based on the following intuition: the most likely parameter value of a probabilistic model that generated an observed data set should be that parameter value for which the probability of the data under the model is maximal. In this Section, we first make this intuition more precise and introduce the notions of (log) likelihood functions and ML estimators (Section 8.1). We then exemplify the ML approach by discussing ML parameter estimation for univariate Gaussian samples (Section 8.2). Finally, we consider ML parameter estimation for the GLM, relate it to ordinary least squares estimation, and introduce the restricted ML estimator for the variance parameter of the GLM (Section 8.3).

8.1 Likelihood functions and maximum likelihood estimators

Likelihood functions The fundamental idea of ML estimation is to select as a point estimate of the true, but unknown, parameter value that gave rise to the data the parameter value which maximizes the probability of the data under the model of interest. To implement this intuition, the notion of the likelihood function and its maximization is invoked. To introduce the likelihood function, consider a parametric probabilistic model $p_\theta(y)$ which specifies the probability distribution of a random entity $y$. Here, $y$ models data and $\theta$ denotes the model's parameter with parameter space $\Theta$. Given a parametric probabilistic model $p_\theta(y)$, the function
\[
L_y : \Theta \to \mathbb{R}_{\ge 0}, \quad \theta \mapsto L_y(\theta) := p_\theta(y) \tag{8.1}
\]
is called the likelihood function of the parameter $\theta$ for the data $y$. Note that the specific nature of $\theta$ and $y$ is left unspecified, i.e., $\theta$ and $y$ may be scalars, vectors, or matrices. Notably, the likelihood function is a function of the parameter $\theta$, while it also depends on $y$. Because $y$ is a random entity, different data samples from the probabilistic model $p_\theta(y)$ result in different likelihood functions. In this sense, there is a distribution of likelihood functions for each probabilistic model, but once a data realization has been obtained, the likelihood function is a (deterministic) function of the parameter value only. This is in stark contrast with PDFs and PMFs, which are functions of the random variable's outcome values (Section 5 | Probability spaces and random variables). Stated differently, the input argument of a PDF or PMF is the value of a random variable, and its output is the probability density or mass of this value for a fixed value of the model's parameter. In contrast, the input argument of a likelihood function is a parameter value, and its output is the probability density or mass of a fixed value of the random variable modelling data for this parameter value under the probability model of interest. If the random variable value and parameter value submitted to a PDF or PMF of a model and to its corresponding likelihood function are identical, so are the outputs of both functions. It is their functional dependencies that distinguish likelihood functions from PDFs and PMFs, not their functional form.

Maximum likelihood estimators

The ML estimator of a given probabilistic model $p_\theta(y)$ is that parameter value which maximizes the likelihood function. Formally, this can be expressed as
\[
\hat{\theta}_{\mathrm{ML}} := \arg\max_{\theta \in \Theta} L_y(\theta). \tag{8.2}
\]

Eq. (8.2) should be read as follows: $\hat{\theta}_{\mathrm{ML}}$ is defined as that argument of the likelihood function $L_y$ for which $L_y(\theta)$ assumes its maximal value over all possible parameter values $\theta$ in the parameter space $\Theta$. Note that from a mathematical viewpoint, the above definition is not overly general, because it is tacitly assumed that $L_y$ in fact has a maximizing argument and that this argument is unique. Also note that

instead of values for $\hat{\theta}_{\mathrm{ML}}$, one is often interested in functional forms that express $\hat{\theta}_{\mathrm{ML}}$ as a function of the data $y$. Concrete numerical values of $\hat{\theta}_{\mathrm{ML}}$ are referred to as ML estimates, while functional forms of $\hat{\theta}_{\mathrm{ML}}$ are referred to as ML estimators. There are essentially two approaches to ML estimation. The first approach aims to obtain functional forms of ML estimators (sometimes referred to as closed-form solutions) by analytically maximizing the likelihood function with respect to $\theta$. The second approach, often encountered in applied computing, builds on the former and systematically varies $\theta$ given an observation of $y$ while monitoring the numeric value of the likelihood function. Once this value appears to be maximal, the variation of $\theta$ stops, and the resulting value is used as an ML estimate. In the following, we consider the first approach, which is of immediate relevance for basic parameter estimation in the GLM, in more detail. From Section 3 | Calculus, we know that candidate values for the ML estimator $\hat{\theta}_{\mathrm{ML}}$ fulfil the requirement
\[
\frac{d}{d\theta} L_y(\theta)\Big|_{\theta = \hat{\theta}_{\mathrm{ML}}} = 0. \tag{8.3}
\]
Eq. (8.3) is known as the likelihood equation and should be read as follows: at the location of $\hat{\theta}_{\mathrm{ML}}$, the derivative $\frac{d}{d\theta} L_y$ of the likelihood function with respect to $\theta$ is equal to zero. If $\theta \in \mathbb{R}^p$, $p > 1$, the statement implies that at the location of $\hat{\theta}_{\mathrm{ML}}$, the gradient $\nabla L_y$ with respect to $\theta$ is equal to the zero vector $0_p$. Clearly, eq. (8.3) corresponds to the necessary condition for extrema of functions. By evaluating the necessary derivatives of the likelihood function and setting them to zero, one may thus obtain a set of equations which can hopefully be solved for an ML estimator.

The log likelihood function To simplify the analytical approach for finding ML estimators as sketched above, one usually considers the logarithm of the likelihood function, the so-called log likelihood function. The log likelihood function is defined as (cf. eq. (8.1))
\[
\ell_y : \Theta \to \mathbb{R}, \quad \theta \mapsto \ell_y(\theta) := \ln L_y(\theta) = \ln p_\theta(y). \tag{8.4}
\]
Because the logarithm is a monotonically increasing function, the location in parameter space at which the likelihood function assumes its maximal value corresponds to the location in parameter space at which the log likelihood function assumes its maximal value. Using either the likelihood function or the log likelihood function to find a maximum likelihood estimator is thus equivalent, as both will identify the same maximizing value (if it exists). The use of log likelihood functions instead of likelihood functions in ML estimation is primarily of pragmatic nature: first, probabilistic models often involve PDFs with exponential terms that are dissolved by the log transform. Second, independence assumptions often give rise to factorized probability distributions which are simplified to sums by the log transform. Finally, from a numerical perspective, one often deals with PDF or PMF values that are rather close to zero and that are stretched to a broader range by the log transform. In analogy to (8.3), the log likelihood equation for the maximum likelihood estimator is given by
\[
\frac{d}{d\theta} \ell_y(\theta)\Big|_{\theta = \hat{\theta}_{\mathrm{ML}}} = 0. \tag{8.5}
\]
Like eq. (8.3), the log likelihood equation can be extended to multivariate $\theta$ in terms of the gradient of $\ell_y$, and like eq. (8.3), it can be solved for $\hat{\theta}_{\mathrm{ML}}$. We next aim to exemplify the idea of ML estimation in a first example (Section 8.2). To do so, we first discuss two additional assumptions that simplify the application of the ML approach considerably: the assumption of a concave log likelihood function and the assumption of independent data random variables with associated PDFs. Finally, we summarize the ML method in a recipe-like manner.

Concave log likelihood functions If the log likelihood function is concave, then the necessary condition for a maximum of the log likelihood function is also sufficient. Recall that a multivariate real-valued function $f : \mathbb{R}^n \to \mathbb{R}$ is called concave, if for all input arguments $a, b \in \mathbb{R}^n$ the straight line connecting $f(a)$ and $f(b)$ lies below the function's graph. Formally,
\[
f(ta + (1 - t)b) \ge tf(a) + (1 - t)f(b) \quad \text{for } a, b \in \mathbb{R}^n \text{ and } t \in [0, 1]. \tag{8.6}
\]


Here, $ta + (1 - t)b$ for $t \in [0, 1]$ describes a straight line in the domain of the function, while $tf(a) + (1 - t)f(b)$ for $t \in [0, 1]$ describes a straight line in the range of the function. Leaving mathematical subtleties aside, it is roughly correct that concave functions have a single maximum, or in other words, that a critical point at which the gradient vanishes is guaranteed to be a maximum of the function. Thus, if the log likelihood function is concave, finding a parameter value for which the log likelihood equation holds is sufficient to identify a maximum at this location. In principle, whenever applying the ML method based on the log likelihood equation, it is thus necessary to show that the log likelihood function is concave and that the necessary condition for a maximum is hence also sufficient. However, such an approach is beyond the level of rigour herein, and we content ourselves with stating without proof that the log likelihood functions of interest in the following are concave.

Independent data random variables with probability density functions A second assumption that simplifies the application of the ML method is the assumption of independent data random variables with associated PDFs. To this end, we first note that in the case of more than one data point, the data random entity $y$ corresponds to a random vector comprising data random variables $y_1, y_2, ..., y_n$, i.e., $y := (y_1, ..., y_n)^T$. If in addition one assumes that these data variables are independent and each variable is governed by a PDF that is parameterized by the same parameter vector, then the joint PDF of $y$ can be written as the product of the individual PDFs $p_\theta(y_i)$, $i = 1, ..., n$. Formally, we write
\[
p_\theta(y) = p_\theta(y_1, ..., y_n) = \prod_{i=1}^n p_\theta(y_i). \tag{8.7}
\]

Eq. (8.7) may be conceived from two angles: on the one hand, one may think of the random variables yi to be governed by one and the same underlying probability distribution from which samples are obtained with replacement. Alternatively, one may think of each yi to be governed by its individual probability distribution defined in terms of its PDF, all of which are however identical. For our purposes, these two angles are equivalent, while the latter conception seems somewhat closer to the formal developments below.

Crucially, in the case of independent data random variables $y_1, ..., y_n$, the log likelihood function is given by
\[
\ell_y(\theta) = \ln p_\theta(y) = \ln \prod_{i=1}^n p_\theta(y_i). \tag{8.8}
\]
Repeated application of the product property of the logarithm then allows for expressing the log likelihood function as
\[
\ell_y(\theta) = \ln p_\theta(y) = \sum_{i=1}^n \ln p_\theta(y_i). \tag{8.9}
\]

The evaluation of the logarithm of a product of PDFs pθ(yi) is thus simplified to the summation of logarithms of individual PDFs pθ(yi).

Analytical derivation of maximum likelihood estimators In summary, the developments above suggest the following three-step procedure for the analytical derivation of ML estimators in probabilistic models:

(1) Formulation of the log likelihood function. This step corresponds to writing down the log probability density of a set of data random variables under the model of interest. Special attention has to be paid to the number of observable variables considered and their independence properties.

(2) Evaluation of the log likelihood function's gradient. Often, probabilistic models of interest have more than one parameter and ML estimators for each parameter are required, i.e., the partial derivatives of the log likelihood function with respect to the parameters have to be evaluated. This step is usually eased by the use of PDFs that involve exponential terms and the assumption of independent data random variables.

(3) Solution of the log likelihood equations. Under the assumption of concave log likelihood functions, solving the log likelihood equations yields the location of the maximum of the log likelihood function in parameter space. The parameter values thus obtained then correspond to ML estimators.

We next consider an exemplary application of the maximum likelihood method.


8.2 Maximum likelihood estimation for univariate Gaussian distributions

As a first example of the ML method, we consider the case of $n$ independent and identically distributed random variables $y_1, ..., y_n$ with univariate Gaussian distribution and parameter vector $(\mu, \sigma^2) \in \mathbb{R} \times \mathbb{R}_{>0}$. Our aim is to derive ML estimators $\hat{\mu}_{\mathrm{ML}}$ and $\hat{\sigma}^2_{\mathrm{ML}}$ for $\mu$ and $\sigma^2$, respectively. In terms of the general principle discussed above, we thus have $\theta := (\mu, \sigma^2)$ and $\Theta := \mathbb{R} \times \mathbb{R}_{>0}$.

Formulation of the log likelihood function The first step in the application of the ML approach is the formulation of the log likelihood function. For the current example, the distribution of the $i$th random variable $y_i$, $i = 1, ..., n$ is governed by a univariate Gaussian distribution with PDF $N(y_i; \mu, \sigma^2)$ and the random variables $y_1, ..., y_n$ are assumed to be independent. The PDF of the joint outcome value $y = (y_1, ..., y_n)^T \in \mathbb{R}^n$ thus corresponds to the product of the PDFs of each individual $y_i$, $i = 1, ..., n$. This can be written as
\[
p_{\mu, \sigma^2}(y) = \prod_{i=1}^n p_{\mu, \sigma^2}(y_i) = \prod_{i=1}^n N(y_i; \mu, \sigma^2). \tag{8.10}
\]

Because the PDFs of $y_i$, $i = 1, ..., n$ are of the form
\[
N(y_i; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left(-\frac{1}{2\sigma^2}(y_i - \mu)^2\right), \tag{8.11}
\]
we may rewrite (8.10) as
\[
p_{\mu, \sigma^2}(y) = \left(2\pi\sigma^2\right)^{-\frac{n}{2}}\exp\left(-\frac{1}{2\sigma^2}\sum_{i=1}^n (y_i - \mu)^2\right), \tag{8.12}
\]
as shown below.

Proof. With the laws of exponentiation and the exponentiation property of the exponential function, we have
\[
\begin{aligned}
\prod_{i=1}^n N(y_i; \mu, \sigma^2) &= \prod_{i=1}^n \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left(-\frac{1}{2\sigma^2}(y_i - \mu)^2\right) \\
&= \prod_{i=1}^n \left(2\pi\sigma^2\right)^{-\frac{1}{2}} \prod_{i=1}^n \exp\left(-\frac{1}{2\sigma^2}(y_i - \mu)^2\right) \\
&= \left(2\pi\sigma^2\right)^{-\frac{n}{2}}\exp\left(-\sum_{i=1}^n \frac{1}{2\sigma^2}(y_i - \mu)^2\right) \\
&= \left(2\pi\sigma^2\right)^{-\frac{n}{2}}\exp\left(-\frac{1}{2\sigma^2}\sum_{i=1}^n (y_i - \mu)^2\right).
\end{aligned} \tag{8.13}
\]

Based on eq. (8.12), we can write down the likelihood function as
\[
L_y : \mathbb{R} \times \mathbb{R}_{>0} \to \mathbb{R}_{>0}, \quad (\mu, \sigma^2) \mapsto L_y(\mu, \sigma^2) := \left(2\pi\sigma^2\right)^{-\frac{n}{2}}\exp\left(-\frac{1}{2\sigma^2}\sum_{i=1}^n (y_i - \mu)^2\right) \tag{8.14}
\]
and, as shown below, the corresponding log likelihood function evaluates to
\[
\ell_y : \mathbb{R} \times \mathbb{R}_{>0} \to \mathbb{R}, \quad (\mu, \sigma^2) \mapsto \ell_y(\mu, \sigma^2) := -\frac{n}{2}\ln 2\pi - \frac{n}{2}\ln \sigma^2 - \frac{1}{2\sigma^2}\sum_{i=1}^n (y_i - \mu)^2. \tag{8.15}
\]


Proof. With the properties of the logarithm, we have
\[
\begin{aligned}
\ln L_y(\mu, \sigma^2) &= \ln\left(\left(2\pi\sigma^2\right)^{-\frac{n}{2}}\exp\left(-\frac{1}{2\sigma^2}\sum_{i=1}^n (y_i - \mu)^2\right)\right) \\
&= \ln\left(\left(2\pi\sigma^2\right)^{-\frac{n}{2}}\right) + \ln\left(\exp\left(-\frac{1}{2\sigma^2}\sum_{i=1}^n (y_i - \mu)^2\right)\right) \\
&= -\frac{n}{2}\ln\left(2\pi\sigma^2\right) - \frac{1}{2\sigma^2}\sum_{i=1}^n (y_i - \mu)^2 \\
&= -\frac{n}{2}\ln 2\pi - \frac{n}{2}\ln \sigma^2 - \frac{1}{2\sigma^2}\sum_{i=1}^n (y_i - \mu)^2.
\end{aligned} \tag{8.16}
\]

Evaluation of the log likelihood function's gradient The second step in the analytical derivation of ML estimators is the evaluation of the gradient
\[
\nabla \ell_y(\mu, \sigma^2) = \begin{pmatrix} \frac{\partial}{\partial \mu}\ell_y(\mu, \sigma^2) \\ \frac{\partial}{\partial \sigma^2}\ell_y(\mu, \sigma^2) \end{pmatrix}. \tag{8.17}
\]
As shown below, for the partial derivative with respect to $\mu$, we have
\[
\frac{\partial}{\partial \mu}\ell_y(\mu, \sigma^2) = \frac{1}{\sigma^2}\sum_{i=1}^n (y_i - \mu), \tag{8.18}
\]
and for the partial derivative with respect to $\sigma^2$, we have
\[
\frac{\partial}{\partial \sigma^2}\ell_y(\mu, \sigma^2) = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^n (y_i - \mu)^2. \tag{8.19}
\]

Proof. With the summation and chain rules of differential calculus, we have
\[
\begin{aligned}
\frac{\partial}{\partial \mu}\ell_y(\mu, \sigma^2) &= \frac{\partial}{\partial \mu}\left(-\frac{n}{2}\ln 2\pi - \frac{n}{2}\ln \sigma^2 - \frac{1}{2\sigma^2}\sum_{i=1}^n (y_i - \mu)^2\right) \\
&= -\frac{\partial}{\partial \mu}\left(\frac{1}{2\sigma^2}\sum_{i=1}^n (y_i - \mu)^2\right) \\
&= -\frac{1}{2\sigma^2}\sum_{i=1}^n \frac{\partial}{\partial \mu}(y_i - \mu)^2 \\
&= -\frac{1}{2\sigma^2}\sum_{i=1}^n 2(y_i - \mu)\frac{\partial}{\partial \mu}(-\mu) \\
&= \frac{1}{\sigma^2}\sum_{i=1}^n (y_i - \mu),
\end{aligned} \tag{8.20}
\]
and with the form of the derivative of the logarithm, we have
\[
\begin{aligned}
\frac{\partial}{\partial \sigma^2}\ell_y(\mu, \sigma^2) &= \frac{\partial}{\partial \sigma^2}\left(-\frac{n}{2}\ln 2\pi - \frac{n}{2}\ln \sigma^2 - \frac{1}{2\sigma^2}\sum_{i=1}^n (y_i - \mu)^2\right) \\
&= -\frac{n}{2}\frac{\partial}{\partial \sigma^2}\ln \sigma^2 - \frac{1}{2}\frac{\partial}{\partial \sigma^2}\left(\sigma^2\right)^{-1}\sum_{i=1}^n (y_i - \mu)^2 \\
&= -\frac{n}{2}\frac{1}{\sigma^2} - \frac{1}{2}\left(-\left(\sigma^2\right)^{-2}\right)\sum_{i=1}^n (y_i - \mu)^2 \\
&= -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^n (y_i - \mu)^2.
\end{aligned} \tag{8.21}
\]


Solution of the log likelihood equations

With the above, the log likelihood equations corresponding to $\nabla \ell_y(\mu, \sigma^2) = 0$ are given by
\[
\begin{aligned}
\frac{1}{\sigma^2}\sum_{i=1}^n (y_i - \mu) &= 0 \\
-\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^n (y_i - \mu)^2 &= 0.
\end{aligned} \tag{8.22}
\]
Notably, these log likelihood equations exhibit a dependence between the ML estimator for $\mu$ and the ML estimator for $\sigma^2$, because both parameters appear in both equations. To solve the log likelihood equations for $\hat{\mu}_{\mathrm{ML}}$ and $\hat{\sigma}^2_{\mathrm{ML}}$, a standard approach is to first solve the first log likelihood equation for $\hat{\mu}_{\mathrm{ML}}$ and then use the solution to solve the second log likelihood equation for $\hat{\sigma}^2_{\mathrm{ML}}$. As shown below, this yields
\[
\hat{\mu}_{\mathrm{ML}} = \frac{1}{n}\sum_{i=1}^n y_i \tag{8.23}
\]
and
\[
\hat{\sigma}^2_{\mathrm{ML}} = \frac{1}{n}\sum_{i=1}^n (y_i - \hat{\mu}_{\mathrm{ML}})^2. \tag{8.24}
\]

Proof. The first log likelihood equation implies that $\sigma^{-2}$ or $\sum_{i=1}^n (y_i - \hat{\mu}_{\mathrm{ML}})$ is equal to zero. Because by definition $\sigma^2 > 0$ and thus $\sigma^{-2} > 0$, the equation can only hold if $\sum_{i=1}^n (y_i - \hat{\mu}_{\mathrm{ML}})$ equals zero. We thus have
\[
\begin{aligned}
\sum_{i=1}^n (y_i - \hat{\mu}_{\mathrm{ML}}) = 0
&\Leftrightarrow \sum_{i=1}^n y_i - \sum_{i=1}^n \hat{\mu}_{\mathrm{ML}} = 0 \\
&\Leftrightarrow \sum_{i=1}^n y_i - n\hat{\mu}_{\mathrm{ML}} = 0 \\
&\Leftrightarrow \hat{\mu}_{\mathrm{ML}} = \frac{1}{n}\sum_{i=1}^n y_i.
\end{aligned} \tag{8.25}
\]
To find the maximum likelihood estimator for $\sigma^2$, we substitute this result in the second log likelihood equation and solve for $\hat{\sigma}^2_{\mathrm{ML}}$:
\[
\begin{aligned}
-\frac{n}{2\hat{\sigma}^2_{\mathrm{ML}}} + \frac{1}{2\hat{\sigma}^4_{\mathrm{ML}}}\sum_{i=1}^n (y_i - \hat{\mu}_{\mathrm{ML}})^2 = 0
&\Leftrightarrow \frac{1}{2\hat{\sigma}^4_{\mathrm{ML}}}\sum_{i=1}^n (y_i - \hat{\mu}_{\mathrm{ML}})^2 = \frac{n}{2\hat{\sigma}^2_{\mathrm{ML}}} \\
&\Leftrightarrow \sum_{i=1}^n (y_i - \hat{\mu}_{\mathrm{ML}})^2 = \frac{2n\hat{\sigma}^4_{\mathrm{ML}}}{2\hat{\sigma}^2_{\mathrm{ML}}} = n\hat{\sigma}^2_{\mathrm{ML}} \\
&\Leftrightarrow \hat{\sigma}^2_{\mathrm{ML}} = \frac{1}{n}\sum_{i=1}^n (y_i - \hat{\mu}_{\mathrm{ML}})^2.
\end{aligned} \tag{8.26}
\]

Notably, the ML estimator $\hat{\mu}_{\mathrm{ML}}$ corresponds to the sample mean
\[
\bar{y} := \frac{1}{n}\sum_{i=1}^n y_i. \tag{8.27}
\]
On the other hand, the ML estimator $\hat{\sigma}^2_{\mathrm{ML}}$ does not correspond to the sample variance
\[
s^2 := \frac{1}{n - 1}\sum_{i=1}^n (y_i - \bar{y})^2. \tag{8.28}
\]
While the sample variance is a bias-free estimator of $\sigma^2$, the ML estimator is not.
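The estimators (8.23), (8.24), and the sample variance (8.28) are straightforward to evaluate numerically. The sketch below (NumPy; the true parameter values, sample size, and seed are arbitrary illustrative assumptions) highlights the difference between the $1/n$ scaling of the ML variance estimator and the $1/(n-1)$ scaling of the sample variance:

```python
import numpy as np

rng = np.random.default_rng(6)
y = rng.normal(3.0, 2.0, size=100)        # illustrative sample, mu = 3, sigma = 2

mu_ml = y.mean()                          # ML estimator, eq. (8.23)
sigma2_ml = np.mean((y - mu_ml) ** 2)     # ML estimator, eq. (8.24): divides by n
s2 = y.var(ddof=1)                        # sample variance, eq. (8.28): divides by n - 1

print(mu_ml, sigma2_ml, s2)
```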


8.3 ML estimation of GLM parameters

We have previously seen that the GLM can be expressed in PDF form as
\[
p_{\beta, \sigma^2}(y) = N\!\left(y; X\beta, \sigma^2 I_n\right). \tag{8.29}
\]
We now turn to the problem of finding ML estimates $\hat{\beta}$ and $\hat{\sigma}^2$ for $\beta$ and $\sigma^2$, respectively.

Beta parameter estimation We first note that, assuming a known value $\sigma^2 > 0$, the likelihood function for the beta parameter is
\[
L_y : \mathbb{R}^p \to \mathbb{R}_{>0}, \quad \beta \mapsto L_y(\beta) := \left(2\pi\sigma^2\right)^{-\frac{n}{2}}\exp\left(-\frac{1}{2\sigma^2}(y - X\beta)^T(y - X\beta)\right). \tag{8.30}
\]
Logarithmic transformation yields the corresponding log likelihood function
\[
\ell_y : \mathbb{R}^p \to \mathbb{R}, \quad \beta \mapsto \ell_y(\beta) := -\frac{n}{2}\ln 2\pi - \frac{n}{2}\ln \sigma^2 - \frac{1}{2\sigma^2}(y - X\beta)^T(y - X\beta). \tag{8.31}
\]

As shown below, the necessary condition for a maximum of $\ell_y$ is equivalent to
\[
X^T X\beta = X^T y. \tag{8.32}
\]
Eq. (8.32) is a set of $p$ linear equations known as the normal equations. If it is assumed that the matrix $X^T X$ is invertible, then the normal equations can readily be solved for the ML beta parameter estimate
\[
\hat{\beta} = \left(X^T X\right)^{-1} X^T y. \tag{8.33}
\]
It is a basic exercise in linear algebra to prove that the invertibility of $X^T X$ is given, if the design matrix $X \in \mathbb{R}^{n \times p}$ is of full column-rank $p$. Experimental designs yielding full column-rank design matrices hence allow for the unique identification of the GLM beta parameter maximum likelihood estimate. We will refer to eq. (8.33) as the beta parameter estimator.
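Eq. (8.33) translates directly into code. The sketch below (NumPy; the design, true parameter values, and seed are arbitrary illustrative assumptions) evaluates the beta parameter estimator for a simple linear regression design and, as a cross-check, also solves the same least-squares problem with `np.linalg.lstsq`, which is generally the numerically preferable route compared with explicitly inverting $X^T X$:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 10
X = np.column_stack([np.ones(n), np.arange(n, dtype=float)])   # simple linear regression design
beta_true = np.array([1.0, 1.0])                               # illustrative true values
y = X @ beta_true + rng.standard_normal(n)

# Beta parameter estimator of eq. (8.33)
beta_hat = np.linalg.inv(X.T @ X) @ X.T @ y

# The same normal equations solved by a standard least-squares routine
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat, beta_lstsq)
```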

Proof. The equivalence of the necessary condition for a maximum of $\ell_y$ and eq. (8.32) derives from the following considerations: the necessary condition for a maximum of $\ell_y$ is
\[
\nabla \ell_y(\beta) = 0_p, \tag{8.34}
\]
which implies that
\[
\frac{\partial}{\partial \beta_j}\ell_y(\beta) = 0 \quad \text{for all } j = 1, ..., p. \tag{8.35}
\]
Given the functional form of $\ell_y$, eq. (8.35) is equivalent to
\[
-\frac{1}{2\sigma^2}\frac{\partial}{\partial \beta_j}(y - X\beta)^T(y - X\beta) = 0 \quad \text{for all } j = 1, ..., p. \tag{8.36}
\]
Because $\sigma^2 > 0$, eq. (8.36) in turn is equivalent to
\[
\frac{1}{2}\frac{\partial}{\partial \beta_j}(y - X\beta)^T(y - X\beta) = 0 \quad \text{for all } j = 1, ..., p. \tag{8.37}
\]


We next consider the partial derivative in eq. (8.37) for a selected $j \in \{1, ..., p\}$ in more detail. We have
\[
\begin{aligned}
\frac{1}{2}\frac{\partial}{\partial \beta_j}(y - X\beta)^T(y - X\beta)
&= \frac{1}{2}\frac{\partial}{\partial \beta_j}\sum_{i=1}^n \left(y_i - (X\beta)_i\right)^2 \\
&= \sum_{i=1}^n \left(y_i - (X\beta)_i\right)\frac{\partial}{\partial \beta_j}\left(y_i - (X\beta)_i\right) \\
&= -\sum_{i=1}^n \left(y_i - (X\beta)_i\right)\frac{\partial}{\partial \beta_j}(X\beta)_i \\
&= -\sum_{i=1}^n \left(y_i - (X\beta)_i\right)\frac{\partial}{\partial \beta_j}\left(x_{i1}\beta_1 + \cdots + x_{ij}\beta_j + \cdots + x_{ip}\beta_p\right) \\
&= -\sum_{i=1}^n \left(y_i - x_{i1}\beta_1 - \cdots - x_{ij}\beta_j - \cdots - x_{ip}\beta_p\right)x_{ij} \\
&= -\sum_{i=1}^n \left(x_{ij}y_i - x_{ij}x_{i1}\beta_1 - \cdots - x_{ij}x_{ij}\beta_j - \cdots - x_{ij}x_{ip}\beta_p\right) \\
&= -\sum_{i=1}^n x_{ij}y_i + \sum_{i=1}^n x_{ij}x_{i1}\beta_1 + \cdots + \sum_{i=1}^n x_{ij}x_{ij}\beta_j + \cdots + \sum_{i=1}^n x_{ij}x_{ip}\beta_p.
\end{aligned} \tag{8.38}
\]
From eq. (8.37), we thus have
\[
\sum_{i=1}^n x_{ij}x_{i1}\beta_1 + \cdots + \sum_{i=1}^n x_{ij}x_{ij}\beta_j + \cdots + \sum_{i=1}^n x_{ij}x_{ip}\beta_p = \sum_{i=1}^n x_{ij}y_i \quad \text{for all } j = 1, ..., p. \tag{8.39}
\]
Summarizing these $p$ equations in vector format then results in
\[
\begin{pmatrix}
\sum_{i=1}^n x_{i1}x_{i1}\beta_1 + \sum_{i=1}^n x_{i1}x_{i2}\beta_2 + \cdots + \sum_{i=1}^n x_{i1}x_{ip}\beta_p \\
\sum_{i=1}^n x_{i2}x_{i1}\beta_1 + \sum_{i=1}^n x_{i2}x_{i2}\beta_2 + \cdots + \sum_{i=1}^n x_{i2}x_{ip}\beta_p \\
\vdots \\
\sum_{i=1}^n x_{ip}x_{i1}\beta_1 + \sum_{i=1}^n x_{ip}x_{i2}\beta_2 + \cdots + \sum_{i=1}^n x_{ip}x_{ip}\beta_p
\end{pmatrix}
=
\begin{pmatrix}
\sum_{i=1}^n x_{i1}y_i \\
\sum_{i=1}^n x_{i2}y_i \\
\vdots \\
\sum_{i=1}^n x_{ip}y_i
\end{pmatrix}. \tag{8.40}
\]
Furthermore, we may rewrite the left-hand side of eq. (8.40) as
\[
\begin{pmatrix}
x_{11} & x_{21} & \cdots & x_{n1} \\
x_{12} & x_{22} & \cdots & x_{n2} \\
\vdots & \vdots & \ddots & \vdots \\
x_{1p} & x_{2p} & \cdots & x_{np}
\end{pmatrix}
\begin{pmatrix}
x_{11} & x_{12} & \cdots & x_{1p} \\
x_{21} & x_{22} & \cdots & x_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{np}
\end{pmatrix}
\begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_p \end{pmatrix}
= X^T X\beta. \tag{8.41}
\]
Similarly, we may rewrite the right-hand side of eq. (8.40) as
\[
\begin{pmatrix}
\sum_{i=1}^n x_{i1}y_i \\
\sum_{i=1}^n x_{i2}y_i \\
\vdots \\
\sum_{i=1}^n x_{ip}y_i
\end{pmatrix}
=
\begin{pmatrix}
x_{11} & x_{21} & \cdots & x_{n1} \\
x_{12} & x_{22} & \cdots & x_{n2} \\
\vdots & \vdots & \ddots & \vdots \\
x_{1p} & x_{2p} & \cdots & x_{np}
\end{pmatrix}
\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}
= X^T y. \tag{8.42}
\]
The normal equations then follow directly from eqs. (8.40), (8.41), and (8.42).

Ordinary least squares beta parameter estimation A popular alternative approach for estimating the parameters of GLMs is the ordinary least squares (OLS) method. The idea of OLS estimation is to minimize the squared distance between observed data points

and the data predicted by the GLM. In contrast to the ML approach, OLS estimation does not depend on any specific parametric, nor even any probabilistic, assumptions about the data. Below, we discuss the equivalence of ML and OLS estimators for GLM beta parameters. While both approaches result in the same beta parameter estimator, OLS estimation does not lend itself to probabilistic inference, because its ensuing estimates are endowed neither with a Frequentist nor a Bayesian distributional theory. To show the equivalence of ML and OLS beta parameter estimation, we first consider OLS estimation. In OLS estimation, the aim is to minimize the sum of error squares (SES), defined as
\[
\mathrm{SES} := \sum_{i=1}^n \left(y_i - (X\beta)_i\right)^2, \tag{8.43}
\]
where $(X\beta)_i$ denotes the $i$th row of $X\beta$. $(X\beta)_i$ is the GLM prediction of $y_i$ and depends on the value of $\beta$. The sum of all squared prediction errors $y_i - (X\beta)_i$ forms the SES. Clearly, due to the quadratic terms, it holds that $\mathrm{SES} \ge 0$. We next reconsider the likelihood function of the GLM beta parameter for known $\sigma^2 > 0$ (cf. eq. (8.30)),
\[
L_y(\beta) = \left(2\pi\sigma^2\right)^{-\frac{n}{2}}\exp\left(-\frac{1}{2\sigma^2}\sum_{i=1}^n \left(y_i - (X\beta)_i\right)^2\right). \tag{8.44}
\]

As is readily apparent, the likelihood function $L_y$ comprises the SES in its exponential term. Because the SES is non-negative and enters the functional form of $L_y$ with a minus sign, the exponential term of $L_y$ becomes maximal if the squared deviations between model prediction and data values become minimal. In other words, the GLM likelihood function for the beta parameter is maximized if the SES is minimized. In effect, irrespective of whether the OLS or ML method is employed to derive the GLM beta parameter estimator, the resulting beta parameter estimators are identical.

Maximum likelihood variance parameter estimation Finally, to derive the ML estimator for the GLM variance parameter $\sigma^2$, we proceed as follows: we substitute $\hat{\beta}$ in the GLM log likelihood function and then maximize the resulting log likelihood function with respect to $\sigma^2$. Substitution of $\hat{\beta}$ in eq. (8.31) renders the GLM log likelihood function a function of $\sigma^2 > 0$ only,
\[
\ell_{y, \hat{\beta}} : \mathbb{R}_{>0} \to \mathbb{R}, \quad \sigma^2 \mapsto \ell_{y, \hat{\beta}}(\sigma^2) := -\frac{n}{2}\ln 2\pi - \frac{n}{2}\ln \sigma^2 - \frac{1}{2\sigma^2}\left(y - X\hat{\beta}\right)^T\left(y - X\hat{\beta}\right), \tag{8.45}
\]
and, as shown below, the derivative of $\ell_{y, \hat{\beta}}$ with respect to $\sigma^2$ evaluates to
\[
\frac{d}{d\sigma^2}\ell_{y, \hat{\beta}}(\sigma^2) = -\frac{1}{2}\frac{n}{\sigma^2} + \frac{1}{2}\frac{1}{(\sigma^2)^2}\left(y - X\hat{\beta}\right)^T\left(y - X\hat{\beta}\right). \tag{8.46}
\]

Proof. We have
\[
\begin{aligned}
\frac{d}{d\sigma^2}\ell_{y, \hat{\beta}}(\sigma^2) &= \frac{d}{d\sigma^2}\left(-\frac{n}{2}\ln 2\pi - \frac{n}{2}\ln \sigma^2 - \frac{1}{2\sigma^2}\left(y - X\hat{\beta}\right)^T\left(y - X\hat{\beta}\right)\right) \\
&= -\frac{n}{2}\frac{1}{\sigma^2} - \frac{1}{2}\left(y - X\hat{\beta}\right)^T\left(y - X\hat{\beta}\right)\frac{d}{d\sigma^2}\left(\sigma^2\right)^{-1} \\
&= -\frac{n}{2}\frac{1}{\sigma^2} - \frac{1}{2}\left(y - X\hat{\beta}\right)^T\left(y - X\hat{\beta}\right)(-1)\left(\sigma^2\right)^{-2} \\
&= -\frac{1}{2}\frac{n}{\sigma^2} + \frac{1}{2}\frac{1}{(\sigma^2)^2}\left(y - X\hat{\beta}\right)^T\left(y - X\hat{\beta}\right).
\end{aligned} \tag{8.47}
\]

Finally, as shown below, setting $\frac{d}{d\sigma^2}\ell_{y, \hat{\beta}}$ to zero and solving for $\hat{\sigma}^2$ yields
\[
\hat{\sigma}^2 = \frac{1}{n}\left(y - X\hat{\beta}\right)^T\left(y - X\hat{\beta}\right). \tag{8.48}
\]


Proof. We have
\[
\begin{aligned}
\frac{d}{d\sigma^2}\ell_{y, \hat{\beta}}(\hat{\sigma}^2) = 0
&\Leftrightarrow -\frac{1}{2}\frac{n}{\hat{\sigma}^2} + \frac{1}{2}\frac{1}{(\hat{\sigma}^2)^2}\left(y - X\hat{\beta}\right)^T\left(y - X\hat{\beta}\right) = 0 \\
&\Leftrightarrow \frac{1}{2}\frac{1}{(\hat{\sigma}^2)^2}\left(y - X\hat{\beta}\right)^T\left(y - X\hat{\beta}\right) = \frac{1}{2}\frac{n}{\hat{\sigma}^2} \\
&\Leftrightarrow \left(y - X\hat{\beta}\right)^T\left(y - X\hat{\beta}\right) = \frac{(\hat{\sigma}^2)^2 n}{\hat{\sigma}^2} = n\hat{\sigma}^2 \\
&\Leftrightarrow \hat{\sigma}^2 = \frac{1}{n}\left(y - X\hat{\beta}\right)^T\left(y - X\hat{\beta}\right).
\end{aligned} \tag{8.49}
\]

A couple of aspects of eq. (8.48) are noteworthy. First, the term $y - X\hat{\beta}$ corresponds to the difference between the data $y$ and the GLM data prediction $X\hat{\beta}$. The variance parameter estimate $\hat{\sigma}^2$ thus corresponds to a scaled version of the residual sum of squares (RSS), defined as
\[
\mathrm{RSS} := \sum_{i=1}^n \left(y_i - (X\hat{\beta})_i\right)^2. \tag{8.50}
\]
Second, the ML estimator of $\sigma^2$ is biased: it can be shown that the expected value of $\frac{1}{n}(y - X\hat{\beta})^T(y - X\hat{\beta})$ is smaller than $\sigma^2$. In other words, the ML variance parameter estimator underestimates the true, but unknown, GLM variance parameter. This can, however, readily be rectified by dividing the RSS not by $n$, but by $n - p$. This yields the so-called restricted maximum likelihood (ReML) estimator of the GLM variance parameter, defined as
\[
\hat{\sigma}^2 := \frac{(y - X\hat{\beta})^T(y - X\hat{\beta})}{n - p}. \tag{8.51}
\]
In the following, we will hence consider the ReML estimator for estimating $\sigma^2$. For simplicity, we shall refer to (8.51) as the variance parameter estimator. An introduction to the origin of the estimation bias of the ML variance parameter estimator and to the concept of ReML is beyond the scope of a basic introduction to the GLM.
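The beta parameter estimator (8.33) and the ReML variance parameter estimator (8.51) can be collected in a small helper function. The following minimal sketch (NumPy; the function name `glm_estimators`, the design, the true parameter values, and the seed are illustrative assumptions, not notation from the text) shows one possible implementation:

```python
import numpy as np

def glm_estimators(y, X):
    """Beta parameter estimator (8.33) and ReML variance estimator (8.51)."""
    n, p = X.shape
    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)   # solve the normal equations
    residuals = y - X @ beta_hat
    sigma2_hat = (residuals @ residuals) / (n - p)
    return beta_hat, sigma2_hat

# Illustrative usage with a simple linear regression design
rng = np.random.default_rng(8)
n = 20
X = np.column_stack([np.ones(n), np.arange(n, dtype=float)])
y = X @ np.array([1.0, 0.5]) + rng.standard_normal(n)
print(glm_estimators(y, X))
```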

8.4 Example (Independent and identically distributed Gaussian samples)

As a first application of the beta and variance parameter estimators introduced above, we consider the GLM scenario of $n$ independent and identically distributed Gaussian samples,
\[
y_i \sim N(\mu, \sigma^2) \text{ for } i = 1, ..., n, \tag{8.52}
\]
which, as discussed in Section 7 | Probability distributions, corresponds to the GLM
\[
y \sim N(X\beta, \sigma^2 I_n), \quad \text{where } X := 1_n \in \mathbb{R}^{n \times 1}, \; \beta := \mu, \text{ and } \sigma^2 > 0. \tag{8.53}
\]
As shown below, for the model specified in eq. (8.53), the beta and variance parameter estimators evaluate to
\[
\hat{\beta} = \frac{1}{n}\sum_{i=1}^n y_i = \bar{y} \quad \text{and} \quad \hat{\sigma}^2 = \frac{1}{n - 1}\sum_{i=1}^n (y_i - \bar{y})^2 = s^2, \tag{8.54}
\]
respectively. In the GLM scenario of independent and identically distributed Gaussian samples, the beta and variance parameter estimators are thus identical to the sample mean and sample variance of the random sample $y_1, ..., y_n \sim N(\mu, \sigma^2)$.

Proof. The beta parameter estimator evaluates to
\[
\hat{\beta} = (X^T X)^{-1}X^T y = \left(\begin{pmatrix} 1 & \cdots & 1 \end{pmatrix}\begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix}\right)^{-1}\begin{pmatrix} 1 & \cdots & 1 \end{pmatrix}\begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix} = n^{-1}\sum_{i=1}^n y_i = \frac{1}{n}\sum_{i=1}^n y_i = \bar{y}. \tag{8.55}
\]


The variance parameter estimator is given by
\[
\begin{aligned}
\hat{\sigma}^2 &= \frac{1}{n - 1}\left(y - X\hat{\beta}\right)^T\left(y - X\hat{\beta}\right) \\
&= \frac{1}{n - 1}\left(\begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix} - \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix}\frac{1}{n}\sum_{i=1}^n y_i\right)^T\left(\begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix} - \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix}\frac{1}{n}\sum_{i=1}^n y_i\right) \\
&= \frac{1}{n - 1}\begin{pmatrix} y_1 - \frac{1}{n}\sum_{i=1}^n y_i \\ \vdots \\ y_n - \frac{1}{n}\sum_{i=1}^n y_i \end{pmatrix}^T\begin{pmatrix} y_1 - \frac{1}{n}\sum_{i=1}^n y_i \\ \vdots \\ y_n - \frac{1}{n}\sum_{i=1}^n y_i \end{pmatrix} \\
&= \frac{1}{n - 1}\sum_{i=1}^n \left(y_i - \frac{1}{n}\sum_{j=1}^n y_j\right)^2 \\
&= \frac{1}{n - 1}\sum_{i=1}^n (y_i - \bar{y})^2 \\
&= s^2.
\end{aligned} \tag{8.56}
\]

8.5 Bibliographic remarks

Treatments of ML and GLM estimation theory can be found in virtually all statistical textbooks. Seber (2015) and Christensen (2011) provide comprehensive introductions for the GLM.

8.6 Study questions

1. Write down the general form of a likelihood function and name its components.
2. Write down the general form of a log likelihood function and name its components.
3. Write down the general form of an ML estimator and explain it.
4. Discuss commonalities and differences between OLS and ML beta parameter estimators.
5. Write down the formula of the GLM ML beta parameter estimator and name its components.
6. Write down the formula of the GLM ML variance parameter estimator and name its components.
7. Write down the formula of the GLM ReML variance parameter estimator and name its components.
8. Define the sum of error squares (SES) and the residual sum of squares (RSS) and discuss their commonalities and differences.
9. Write down the GLM incarnation of independent and identically distributed sampling from a univariate Gaussian distribution as well as the ensuing expectation and variance parameter estimators.

9 | Frequentist distribution theory

9.1 Introduction

Let $y \sim N(X\beta, \sigma^2 I_n)$ denote the GLM. The Frequentist distribution theory of the GLM unfolds against the background of the following intuition: it is assumed that there exist true, but unknown, parameter values $\beta$ and $\sigma^2$. An observed $n$-dimensional data set $y$ is conceptualized as a realization of the multivariate Gaussian distribution $N(X\beta, \sigma^2 I_n)$ governed by these true, but unknown, parameter values. Hypothetically, however, $N(X\beta, \sigma^2 I_n)$ can be sampled many times, generating a set of independent $n$-dimensional data set realizations, say, $y^{(1)}, y^{(2)}, ...$ For each of these hypothetical data sets, beta and variance parameter estimators can be evaluated, resulting in a set of independent GLM parameter estimate realizations, say, $(\hat{\beta}, \hat{\sigma}^2)^{(1)}, (\hat{\beta}, \hat{\sigma}^2)^{(2)}, ...$ Clearly, due to the variable nature of the hypothetical data set realizations, these hypothetical parameter estimate realizations also vary and exhibit probability distributions themselves. At the same time, the distributions of these random variables exhibit structure that depends on the true, but unknown, parameters of the data distribution. For example, if the true, but unknown, variance parameter $\sigma^2$ of the GLM is large and data realizations vary a lot, it is likely that beta parameter estimates also vary a lot. Similarly, statistics such as the $T$- and $F$-statistics, which, as introduced below, are functions of the GLM parameter estimates, will exhibit probability distributions. In more mundane terms, because $y$ is a random vector, any functions of $y$, such as $\hat{\beta}$ and $\hat{\sigma}^2$, or $T$- and $F$-statistics, are random variables. Due to the fact that $y$ is Gaussian distributed, as well as the relatively simple nature of the functional forms of these parameter estimators and statistics, it is possible to derive parametric forms for the distributions governing $\hat{\beta}$, $\hat{\sigma}^2$, $T$, and $F$. In this Section, we state the parametric forms of these distributions and show how they result from the combination of the Gaussian distribution of $y$ and the functional form of the respective parameter estimator or statistic. From a data-analytical perspective, it is important to realize that these distributions are "hypothetical", in the sense that in a concrete data analysis scenario, only a single $n$-dimensional data set is observed, and thus only single beta and variance parameter estimates and $T$- or $F$-statistics can be evaluated. As will be discussed in subsequent sections, frequentist parameter and model inference techniques, such as $T$- and $F$-tests, evaluate these actual parameter estimate or statistic observations in the light of the distributions introduced in the current Section.

9.2 Beta parameter estimates

We have the following theorem:

Theorem 9.2.1 (Beta parameter estimate distribution). For $X \in \mathbb{R}^{n \times p}$, $\beta \in \mathbb{R}^p$, and $\sigma^2 > 0$, let $y \sim N(X\beta, \sigma^2 I_n)$ denote the GLM and let $\hat{\beta} = (X^T X)^{-1}X^T y$ denote the beta parameter estimator. Then
\[
\hat{\beta} \sim N\!\left(\beta, \sigma^2\left(X^T X\right)^{-1}\right). \tag{9.1}
\]



Proof. The theorem follows with the linear-affine transformation theorem for Gaussian random vectors (Section 7 | Probability distributions). Specifically, for the current scenario, the linear-affine transformation theorem for Gaussian distributions states that
\[
\hat{\beta} \sim N\!\left((X^T X)^{-1}X^T X\beta, \; (X^T X)^{-1}X^T (\sigma^2 I_n)\left((X^T X)^{-1}X^T\right)^T\right). \tag{9.2}
\]

The expectation parameter of this distribution can be simplified to

\[
(X^T X)^{-1}X^T X\beta = \beta, \tag{9.3}
\]
and the covariance matrix parameter can be simplified according to
\[
\begin{aligned}
(X^T X)^{-1}X^T (\sigma^2 I_n)\left((X^T X)^{-1}X^T\right)^T &= (X^T X)^{-1}X^T (\sigma^2 I_n)X(X^T X)^{-1} \\
&= \sigma^2 (X^T X)^{-1}X^T X(X^T X)^{-1} \\
&= \sigma^2 (X^T X)^{-1}.
\end{aligned} \tag{9.4}
\]
Here, the first equality follows from the fact that both $X^T X$ and its inverse $(X^T X)^{-1}$ are symmetric matrices.

Figure 9.1. Beta parameter estimates distribution for a simple linear regression model. (A) Ten data samples of a simple linear regression model with independent variable values $x_{i2} = i - 1$, $i = 1, ..., 10$, and true, but unknown, parameter values $\beta := (1, 1)^T$ and $\sigma^2 := 1$. (B) Beta parameter estimates corresponding to the data samples in (A). (C) Probability density function of $\hat{\beta}$. (D) Probability density functions of $\hat{\beta}$ for different simple linear regression design matrix specifications.

Theorem 9.2.1 states that the beta parameter estimate $\hat{\beta}$ is distributed according to a $p$-dimensional Gaussian distribution with expectation parameter $\beta$ and covariance matrix parameter $\sigma^2(X^T X)^{-1}$. The expectation of the beta parameter estimator thus corresponds to the true, but unknown, beta parameter, while the covariance of the beta parameter estimator is proportional to the true, but unknown, variance parameter. Moreover, the covariance of the beta parameter estimator also depends on the inverse of the design matrix product $X^T X$. In an applied context, this allows for minimizing the beta parameter estimate covariance by adapting the design matrix in such a manner that $(X^T X)^{-1}$ is minimized.

Example 1 (Independent and identically distributed Gaussian samples). Consider the GLM scenario of $n$ independent and identically distributed Gaussian samples,
\[
y \sim N(X\beta, \sigma^2 I_n) \text{ with } X := 1_n \in \mathbb{R}^{n \times 1}, \; \beta := \mu \in \mathbb{R}, \; \sigma^2 > 0. \tag{9.5}
\]
We have seen in Section 8 | Maximum likelihood estimation that $\hat{\beta} = \bar{y} = \frac{1}{n}\sum_{i=1}^n y_i$. Theorem 9.2.1 implies that
\[
\bar{y} \sim N\!\left(\mu, \frac{\sigma^2}{n}\right). \tag{9.6}
\]
The sample mean of $n$ independent and identically distributed Gaussian random variables with expectation parameter $\mu$ and variance parameter $\sigma^2$ is thus distributed according to a Gaussian distribution with expectation parameter $\mu$ and variance parameter $\sigma^2/n$.
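The hypothetical repeated-sampling perspective behind eq. (9.6) is easy to emulate numerically. The sketch below (NumPy; the number of hypothetical data sets, the parameter values, and the seed are arbitrary illustrative assumptions) draws many data set realizations, computes one sample mean per realization, and compares the empirical variance of these means with $\sigma^2/n$:

```python
import numpy as np

rng = np.random.default_rng(9)
n, mu, sigma2 = 10, 1.0, 4.0                     # illustrative values

# Many hypothetical data set realizations, one sample mean per realization
y = rng.normal(mu, np.sqrt(sigma2), size=(50_000, n))
beta_hat = y.mean(axis=1)

# Empirical mean and variance of beta_hat versus mu and sigma^2 / n
print(beta_hat.mean(), beta_hat.var(ddof=1), sigma2 / n)
```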

Example 2 (Simple linear regression.) Consider the simple linear regression GLM scenario   1 x1 2 . .  n×2 2 2 y ∼ N(Xβ, σ In) with . .  ∈ R , β ∈ R , σ > 0. (9.7) 1 xn In Section 8 | Maximum likelihood estimation, we have seen that

σ^2 (X^T X)^{-1} = (σ^2/sxx) [ sxx/n + x̄^2   −x̄
                               −x̄              1 ],   with sxx := Σ_{i=1}^n (xi − x̄)^2.    (9.8)

The variance of the offset parameter estimate thus depends on both the sum of squared differences sxx and the mean x̄ of the independent variable values x1, ..., xn, whereas the slope parameter estimate variance depends only on the sum of squared differences of the x1, ..., xn. The covariance of the offset and slope parameter estimates is determined by the mean of the x1, ..., xn. We visualize the distribution of beta parameter estimates as well as the effect of adjusting the independent variable values x1, ..., xn in different ways in Figure 9.1.

Figure 9.2. Variance parameter estimate distribution for a simple linear regression model. (A) 1,000 data samples of a simple linear regression model with independent variable values x_{i2} = i − 1, i = 1, ..., 10, and true, but unknown, parameter values β := (1, 1)^T and σ^2 := 1. (B) Histogram of the variance parameter estimates σ̂^2 based on the data samples in (A). (C) Histogram-based estimate of the probability density function of the scaled variance parameter estimates ((n − p)/σ^2) σ̂^2 and its analytical counterpart χ^2(n − p).
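As a minimal sketch of these dependencies (the concrete values n := 10, σ^2 := 1, β := (1, 1)^T, and xi := i − 1 are assumptions mirroring Figure 9.1, not prescriptions), the analytical covariance σ^2 (X^T X)^{-1} of eq. (9.8) can be compared to the sample covariance of beta estimates obtained from repeatedly simulated data sets:

import numpy as np

rng = np.random.default_rng(1)
n, sigma2, beta_true = 10, 1.0, np.array([1.0, 1.0])
x = np.arange(n, dtype=float)                      # x_i = i - 1, i = 1, ..., 10
X = np.column_stack([np.ones(n), x])               # simple linear regression design

cov_analytical = sigma2 * np.linalg.inv(X.T @ X)   # eq. (9.8)

# re-estimate beta for many simulated data sets
beta_hats = np.array([
    np.linalg.solve(X.T @ X, X.T @ (X @ beta_true + rng.normal(0, np.sqrt(sigma2), n)))
    for _ in range(10_000)
])

print(cov_analytical)
print(np.cov(beta_hats.T))   # should be close to the analytical covariance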

9.3 Variance parameter estimates

We have the following theorem.

Theorem 9.3.1 (Variance parameter estimate distribution). For X ∈ R^{n×p}, β ∈ R^p, and σ^2 > 0, let y ∼ N(Xβ, σ^2 In) denote the GLM and let σ̂^2 denote the frequentist variance parameter estimator. Then

((n − p)/σ^2) σ̂^2 ∼ χ^2(n − p).    (9.9)

Sketch of proof. A full proof of Theorem 9.3.1 is beyond the mathematical scope of this introduction. We thus content ourselves with the sketch of a proof given by Seber and Lee (2003, Section 3.4). The proof proceeds in three steps. First, it is established that the scaled sum of error squares is distributed according to a chi-squared distribution with n degrees of freedom, i.e.,

(1/σ^2) (y − Xβ)^T (y − Xβ) ∼ χ^2(n),    (9.10)

which follows from the fact that the εi ∼ N(0, σ^2), i = 1, ..., n are i.i.d. random variables, and that, similarly,

(1/σ^2) (β̂ − β)^T X^T X (β̂ − β) ∼ χ^2(p).    (9.11)

In a second step, it is then shown that the residual sum of squares can be written as

RSS = SES − Q,    (9.12)

where SES := (y − Xβ)^T (y − Xβ) denotes the sum of error squares, Q := (β̂ − β)^T X^T X (β̂ − β), and RSS and Q are independent random variables. Finally, from the properties of chi-squared random variables, it then follows that

((n − p)/σ^2) σ̂^2 = (1/σ^2) (y − Xβ̂)^T (y − Xβ̂) = (1/σ^2) RSS ∼ χ^2(n − p).    (9.13)

Theorem 9.3.1 states that the scaled variance parameter estimates ((n − p)/σ^2) σ̂^2, but not the variance parameter estimates themselves, are distributed according to a chi-squared distribution with n − p degrees of freedom. We visualize this result in Figure 9.2.
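The following minimal simulation sketch (with assumed values n := 10, p := 2, σ^2 := 1, and the design of Figure 9.2) illustrates Theorem 9.3.1 by comparing the empirical mean and variance of ((n − p)/σ^2) σ̂^2 over many simulated data sets to the analytical mean n − p and variance 2(n − p) of a χ^2(n − p) random variable.

import numpy as np

rng = np.random.default_rng(2)
n, p, sigma2 = 10, 2, 1.0
X = np.column_stack([np.ones(n), np.arange(n, dtype=float)])
beta = np.array([1.0, 1.0])

scaled = np.empty(10_000)
for k in range(scaled.size):
    y = X @ beta + rng.normal(0.0, np.sqrt(sigma2), n)
    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
    rss = np.sum((y - X @ beta_hat) ** 2)
    scaled[k] = rss / sigma2                  # = ((n - p)/sigma^2) * sigma_hat^2

print("empirical mean :", scaled.mean(), "  analytical:", n - p)
print("empirical var  :", scaled.var(),  "  analytical:", 2 * (n - p))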


9.4 The T -statistic

We start with the definition of the T -statistic.

Definition 9.4.1 (T-statistic). For X ∈ R^{n×p}, β ∈ R^p, and σ^2 > 0, let y ∼ N(Xβ, σ^2 In) denote the GLM and let β̂ and σ̂^2 denote the beta and variance parameter estimators, respectively. In addition, let c ∈ R^p denote a contrast weight vector. Then the T-statistic is defined as

Tc : R^p × R_{≥0} → R, (β̂, σ̂^2) ↦ Tc(β̂, σ̂^2) := c^T β̂ / √(σ̂^2 c^T (X^T X)^{-1} c).    (9.14)

•

To obtain an intuition about this definition, we consider a familiar example.

Example 1. (Independent and identically distributed Gaussian samples) Consider the GLM scenario of n independent and identically distributed Gaussian samples,

y ∼ N(Xβ, σ^2 In) with X := 1n ∈ R^{n×1}, β = µ ∈ R, σ^2 > 0.    (9.15)

We have previously seen that

β̂ = ȳ and σ̂^2 = s^2,    (9.16)

i.e., that for the model of eq. (9.15) the beta and variance parameter estimators correspond to the sample mean and sample variance of the random sample y1, ..., yn ∼ N(µ, σ^2). For the evaluation of the T-statistic, we consider the contrast weight vector c := 1. In this case, the T-statistic evaluates to

Tc(β̂, σ̂^2) = 1^T β̂ / √(σ̂^2 · 1^T (1n^T 1n)^{-1} · 1) = β̂ / √(σ̂^2/n) = √n · β̂/√(σ̂^2) = √n · ȳ/s.    (9.17)

The T-statistic for the independent and identically distributed Gaussian samples model of eq. (9.15) thus corresponds to the ratio of the sample mean and the sample standard deviation of the random sample y1, ..., yn ∼ N(µ, σ^2), multiplied by the square root of the sample size. Informally, we thus have

T-value = √(Sample size) · (Sample mean / Sample standard deviation).    (9.18)

In this scenario, large positive or negative values of the T-statistic thus have the following interpretation: in comparison to the sample standard deviation, the sample mean is large, either in the positive or negative direction. Conversely, low absolute values of the T-statistic, i.e., T values close to zero, reflect a small effect size in comparison to the data variability. In other words, the T-statistic measures the effect size with respect to the yardstick of data variability.

The distribution of the centred T -statistic For frequentist hypothesis testing, the centred T -statistic is crucial. We define it as follows.

Definition 9.4.2 (Centred T-statistic). For X ∈ R^{n×p}, β ∈ R^p, and σ^2 > 0, let y ∼ N(Xβ, σ^2 In) denote the GLM and let β̂ and σ̂^2 denote the frequentist beta and variance parameter estimators, respectively. In addition, let c ∈ R^p denote a contrast weight vector. Then the centred T-statistic is defined as

T_{β,c} : R^p × R_{≥0} → R, (β̂, σ̂^2) ↦ T_{β,c}(β̂, σ̂^2) := (c^T β̂ − c^T β) / √(σ̂^2 c^T (X^T X)^{-1} c).    (9.19)

The centred T -statistic is distributed according to a t distribution with n − p degrees of freedom. We have the following theorem.

Theorem 9.4.1 (Distribution of the centred T-statistic). For X ∈ R^{n×p}, β ∈ R^p, and σ^2 > 0, let y ∼ N(Xβ, σ^2 In) denote the GLM and let β̂ and σ̂^2 denote the frequentist beta and variance parameter estimators, respectively. In addition, let c ∈ R^p denote a contrast weight vector and let T_{β,c} denote the centred T-statistic. Then T_{β,c} ∼ t(n − p).


Figure 9.3. (A) 1,000 data samples of a simple linear regression model with independent variable values x_{i2} = i − 1, i = 1, ..., 10, and true, but unknown, parameter values β := (1, 1)^T and σ^2 := 1. (B) True, but unknown, beta parameter values and beta parameter estimates corresponding to the data samples in (A). (C) Analytical and empirical distribution of the product c^T β̂ for c = (0, 1)^T. Note that while the distribution of β̂ can be high-dimensional, the distribution of c^T β̂ is always one-dimensional. (D) Empirical and analytical distributions of the ensuing scaled variance parameter estimates ((n − p)/σ^2) σ̂^2. (E) Empirical and analytical distributions of the centred T-statistic.

Proof. We first rewrite Tβ,c as follows:

T_{β,c} = (c^T β̂ − c^T β) / √(σ̂^2 c^T (X^T X)^{-1} c)

        = [√(n − p) (c^T β̂ − c^T β)] / [√(n − p) √(σ^2/σ^2) √(σ̂^2 c^T (X^T X)^{-1} c)]

        = [√(n − p) (c^T β̂ − c^T β)] / [√(((n − p)/σ^2) σ̂^2) √(σ^2 c^T (X^T X)^{-1} c)]    (9.20)

        = [(c^T β̂ − c^T β) / √(σ^2 c^T (X^T X)^{-1} c)] / √((((n − p)/σ^2) σ̂^2) / (n − p))

        =: ζ / √(υ/(n − p)).

We next consider the numerator of the right-hand side of (9.20) and find that ζ ∼ N(0, 1). To see this, we apply the linear transformation theorem for Gaussian random vectors (Theorem 7.2.6). With β̂ ∼ N(β, σ^2 (X^T X)^{-1}), transformation matrix c^T and offset term −c^T β, we thus obtain

c^T β̂ − c^T β ∼ N(0, σ^2 c^T (X^T X)^{-1} c).    (9.21)

With the z-transformation theorem (Theorem 7.1.6), ζ ∼ N(0, 1) then follows immediately. Furthermore, we have already seen above that υ := ((n − p)/σ^2) σ̂^2 ∼ χ^2(n − p). T_{β,c} thus corresponds to the ratio of a standard normal random variable and the square root of a chi-squared random variable divided by its degrees of freedom. With Theorem 7.6.1, it then follows directly that T_{β,c} is distributed according to t(n − p).

The distribution of the centred T-statistic is visualized in Figure 9.3. The distribution of the centred T-statistic is commonly put to use in statistical testing scenarios (cf. Section 10 | Statistical testing). Specifically, given a hypothesized true, but unknown, parameter value β ∈ R^p and an observed value t_{β,c} of T_{β,c}, the evaluation of P(T_{β,c} ≥ t_{β,c}) can serve as an indicator of the compatibility of the observed data with the hypothesized parameter value.
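The following minimal simulation sketch (with assumed settings n := 10, p := 2, β := (1, 1)^T, σ^2 := 1, and c := (0, 1)^T, mirroring Figure 9.3) illustrates Theorem 9.4.1 by comparing empirical quantiles of the centred T-statistic over many simulated data sets to the corresponding quantiles of t(n − p).

import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n, p, sigma2 = 10, 2, 1.0
X = np.column_stack([np.ones(n), np.arange(n, dtype=float)])
beta, c = np.array([1.0, 1.0]), np.array([0.0, 1.0])
XtX_inv = np.linalg.inv(X.T @ X)

t_vals = np.empty(10_000)
for k in range(t_vals.size):
    y = X @ beta + rng.normal(0.0, np.sqrt(sigma2), n)
    beta_hat = XtX_inv @ X.T @ y
    sigma2_hat = np.sum((y - X @ beta_hat) ** 2) / (n - p)
    t_vals[k] = (c @ beta_hat - c @ beta) / np.sqrt(sigma2_hat * (c @ XtX_inv @ c))

q = [0.05, 0.5, 0.95]
print("empirical quantiles :", np.quantile(t_vals, q))
print("t(n - p) quantiles  :", stats.t.ppf(q, df=n - p))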

9.5 The F -statistic

Model partitioning and likelihood ratios The F-statistic can be motivated from the perspective of comparing two GLMs using a likelihood ratio. To this end, we consider a GLM with p design matrix columns, which we will refer to as a full model. The aim is to compare this model to a model comprising the first p1 < p regressors of the full model. The latter model is referred to as a reduced model. Because a reduced model is always part of a full model, a

reduced model is also said to be nested in the full model. Formally, let a design matrix X ∈ R^{n×p} with p > 1 be partitioned according to

X = [X1  X2], where X1 ∈ R^{n×p1}, X2 ∈ R^{n×p2} and p1 + p2 = p,    (9.22)

and let a parameter vector β ∈ R^p be partitioned according to

β := (β1^T, β2^T)^T, where β1 ∈ R^{p1} and β2 ∈ R^{p2}.    (9.23)

Then a full model is given by

y = Xβ + ε, p(ε) = N(ε; 0n, σ^2 In),    (9.24)

and a reduced model is given by

y = X1 β1 + ε1, p(ε1) = N(ε1; 0n, σ^2 In).    (9.25)

The general idea of using likelihood function ratios for model comparison is to fit two models m1 and m2, such as (9.24) and (9.25), to a single data set using maximum likelihood parameter estimation and then use the ratio of the two maximized likelihood function values to assess which probability is larger. In other words, the aim is to compare the probabilities p_{β̂}(y) and p_{β̂1}(y) of both models to account for the same data y under the optimal parameter settings of each model. If the probability of observing the data y under, say, the optimized model m1 is higher than under the optimized model m2, then one concludes that m1 is a better model for the data. As in the context of maximum likelihood estimation, it is often helpful to work with log likelihood function differences rather than likelihood function ratios (recall that the log function turns ratios into differences).

To apply the idea of log likelihood differences in the GLM context, suppose that we estimate the beta parameters of the reduced model (9.25) using the maximum likelihood approach for a known value σ^2 > 0. As discussed previously, the maximized log likelihood function is then given by

max_{β1 ∈ R^{p1}} ℓ(β1) = −(n/2) ln 2π − (n/2) ln σ^2 − (1/(2σ^2)) e1^T e1,    (9.26)

where

e1 := y − X1 β̂1,    (9.27)

such that e1^T e1 is the residual sum of squares of the reduced model. Next, consider maximizing the log likelihood function of the full model (9.24). In this case, the maximized log likelihood function is given by

max_{β ∈ R^p} ℓ(β) = −(n/2) ln 2π − (n/2) ln σ^2 − (1/(2σ^2)) e^T e,    (9.28)

where

e := y − [X1  X2] (β̂1^T, β̂2^T)^T,    (9.29)

such that e^T e is the residual sum of squares of the full model. The difference of the two maximized log likelihood functions (9.28) and (9.26) then yields the following basic model comparison criterion:

∆ := max_{β ∈ R^p} ℓ(β) − max_{β1 ∈ R^{p1}} ℓ(β1)

   = (−(n/2) ln 2π − (n/2) ln σ^2 − (1/(2σ^2)) e^T e) − (−(n/2) ln 2π − (n/2) ln σ^2 − (1/(2σ^2)) e1^T e1)    (9.30)

   = (1/(2σ^2)) (e1^T e1 − e^T e).

Note that in eq. (9.30), e1^T e1 corresponds to the residual sum of squares resulting from fitting the p1 regressors of the reduced model to the data, while e^T e corresponds to the residual sum of squares resulting from fitting all p regressors of the full model to the data. If the additional p2 regressors present in the full model capture data variance in addition to that captured by the p1 regressors of the reduced model, then e^T e becomes small compared to e1^T e1 and, for constant σ^2, ∆ becomes large.


Definition of the F -statistic The log likelihood function difference ∆ is the major building block of the F -statistic defined next.

Definition 9.5.1 (F-statistic). For X ∈ R^{n×p}, β ∈ R^p, and σ^2 > 0, let y ∼ N(Xβ, σ^2 In) denote the GLM and assume that this full model can be partitioned according to

X = [X1  X2], X1 ∈ R^{n×p1}, X2 ∈ R^{n×p2}, β := (β1^T, β2^T)^T, β1 ∈ R^{p1}, β2 ∈ R^{p2}, and p1 + p2 = p.    (9.31)

Further, let

β̂ := (X^T X)^{-1} X^T y and β̂1 := (X1^T X1)^{-1} X1^T y,    (9.32)

and

e := y − Xβ̂ and e1 := y − X1 β̂1,    (9.33)

denote the beta parameter estimates and residuals of the full and partitioned models, respectively. Then the F-statistic is defined as

F : R^n × R^n → R_{≥0}, (e1, e) ↦ F(e1, e) := [(e1^T e1 − e^T e)/p2] / [e^T e/(n − p)].    (9.34)

•

To obtain an intuition about the F-statistic definition, we first consider its numerator. Here, it is helpful to consider two scenarios: first, a scenario in which the data is generated by the reduced model, and second, a scenario in which the data is generated by the full model. If the data is generated by the reduced model, the beta parameter estimates for the additional p2 regressors in the full model will approximately equal zero and the residual sums of squares e1^T e1 and e^T e will be of similar size. If the data is instead generated by the full model, the residual sum of squares e1^T e1 resulting from fitting the reduced model will be larger than the residual sum of squares e^T e resulting from fitting the full model. The numerator of the F-statistic thus represents the reduction in the residual sum of squares resulting from including the additional p2 regressors, measured in terms of the number of additional regressors included (i.e., divided by p2). If this reduction is small, i.e., the additional p2 regressors of the full model do not account for much data variance, then the value of the F-statistic numerator will be small. If this reduction is large, i.e., the full model accounts for considerably more data variance than the reduced model, then the F-statistic numerator will be large.

Regardless of the generating model, the denominator of the F-statistic constitutes an estimate of the variance parameter σ^2: in the case that the data is in fact generated from the reduced model, the beta parameter estimates for the p2 additional regressors of the full model will tend to zero and the residual sum of squares is evaluated as for the reduced model. In sum, the F-statistic thus measures the reduction of the residual sum of squares attributable to the inclusion of the p2 additional regressors with respect to the reduced model, per regressor and normalized by the estimated data variance.
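As a minimal sketch (the design, parameter values, and partition below are assumptions chosen for illustration), the F-statistic of eq. (9.34) can be computed by fitting a full and a nested reduced model to the same simulated data set and comparing their residual sums of squares:

import numpy as np

rng = np.random.default_rng(4)
n, p1, p2, sigma2 = 30, 1, 1, 1.0
X1 = np.ones((n, 1))                               # reduced model: offset only
X2 = np.arange(n, dtype=float).reshape(n, 1)       # additional regressor
X = np.hstack([X1, X2])                            # full model
p = p1 + p2

beta = np.array([1.0, 0.5])                        # assumed true parameter values
y = X @ beta + rng.normal(0.0, np.sqrt(sigma2), n)

def rss(design, y):
    # residual sum of squares after least-squares fitting of the given design
    beta_hat = np.linalg.lstsq(design, y, rcond=None)[0]
    return np.sum((y - design @ beta_hat) ** 2)

e1_e1, e_e = rss(X1, y), rss(X, y)
F = ((e1_e1 - e_e) / p2) / (e_e / (n - p))         # eq. (9.34)
print("F-statistic:", F)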

The distribution of the F -statistic As for the T -statistic, the distribution of the F -statistic for true, but unknown, parameter values can be evaluated analytically.

Theorem 9.5.1 (Distribution of the F-statistic for β2 = 0_{p2}). For X ∈ R^{n×p}, β ∈ R^p, and σ^2 > 0, let y ∼ N(Xβ, σ^2 In) denote a GLM that is partitioned according to Definition 9.5.1 with full model parameter vector β := (β1^T, β2^T)^T. Then, if β2 := 0_{p2}, F ∼ f(p2, n − p).

Proof. We content ourselves with the sketch of a proof. First, recall that the f-distribution is the distribution of the ratio of two independent chi-squared random variables, each of which is divided by its respective degrees of freedom. Second, we note without proof that for β2 = 0_{p2}, we have

p((e1^T e1 − e^T e)/σ^2) = χ^2((e1^T e1 − e^T e)/σ^2; p2) and p(e^T e/σ^2) = χ^2(e^T e/σ^2; n − p),    (9.35)

and that (e1^T e1 − e^T e)/σ^2 and e^T e/σ^2 are independent random variables. But then

F = [(e1^T e1 − e^T e)/p2] / [e^T e/(n − p)] = [((e1^T e1 − e^T e)/σ^2)/p2] / [(e^T e/σ^2)/(n − p)]    (9.36)


Figure 9.4. (A) 1,000 data samples of a simple linear regression model with independent variable values x_{i2} = i − 1, i = 1, ..., 10, and true, but unknown, parameter values β := (1, 0)^T and σ^2 := 1. (B) Histograms of the residual sums of squares corresponding to the data samples in (A). (C) Analytical and empirical distribution of the F-statistic.

is the ratio of two independent chi-squared random variables, each of which is divided by its respective degrees of freedom. Thus, F is distributed according to an f-distribution with p2 and n − p degrees of freedom.

Like the distribution of the centred T-statistic, the distribution of the F-statistic for β2 = 0_{p2} is commonly used in statistical testing scenarios. Specifically, based on an observed value f̃ of the F-statistic, assessment of the probability P(F ≥ f̃) can serve as an indicator of the compatibility of the observed data with the hypothesized parameter value β2 = 0_{p2}. We visualize the F-statistic distribution for the case of a simple linear regression GLM in Figure 9.4.

9.6 Bibliographic remarks

Treatments of the estimation and frequentist distribution theory of the GLM can be found in most statistical textbooks. Seber and Lee (2003) provide a comprehensive introduction.

9.7 Study questions

1. Discuss the intuitive background of the frequentist distribution theory.
2. State the Beta parameter estimate distribution theorem.
3. State the Variance parameter estimate distribution theorem.
4. Write down the definition of the T-statistic.
5. Discuss the intuitive meaning of the T-statistic.
6. Write down the definition of the centred T-statistic.
7. State the distribution of the centred T-statistic.
8. Write down the definition of the F-statistic.
9. Discuss the intuitive meaning of the F-statistic.

10. State the distribution of the F-statistic for β2 = 0_{p2}.

10 | Statistical testing

In this Section, we first introduce the abstract notion of a statistical test in Section 10.1. Building on the test scenario, we discuss the single-observation z-test as a first example of a statistical test in Section 10.2.

10.1 Statistical tests

The statistical test scenario

To introduce the notion of a statistical test, we consider a parametric probabilistic model pθ(y) that describes the probability distribution of a random entity y and that is governed by a parameter θ ∈ Θ. The random entity y models data and is assumed to take on values in R^n, n ≥ 1. In test scenarios, the parameter space Θ is partitioned into two disjoint subsets, denoted by Θ0 and Θ1, such that Θ = Θ0 ∪ Θ1 and Θ0 ∩ Θ1 = ∅. A test hypothesis is a statement about the parameter governing pθ(y) in relation to these parameter space subsets. Specifically, the statement

θ ∈ Θ0 ⇔ H = 0 (10.1) is referred to as the null hypothesis and the statement

θ ∈ Θ1 ⇔ H = 1    (10.2)

is referred to as the alternative hypothesis. Note that we are concerned with the Neyman-Pearson hypothesis testing framework (Neyman and Pearson, 1933) and thus assume that null and alternative hypotheses exist in an explicitly defined manner. A number of things are noteworthy. First, a statistical hypothesis is a statement about the parameter of a probabilistic model. In the following, we will use the subscript notations pΘ0 and pΘ1 to indicate that the parameter θ of the probabilistic model pθ is an element of Θ0 or Θ1, respectively. Second, the term null hypothesis does not necessarily denote the statement that some parameter assumes the value zero, even if this is often the case in practice. Rather, the null hypothesis in a statistical testing problem is the statement about the parameter one is willing to nullify, i.e., reject. Finally, the expressions H = 0 and H = 1 are not conceived as realizations of a random variable and hence hypothesis-conditional probability statements are not meaningful. The statements H = 0 and H = 1 are merely equivalent expressions for θ ∈ Θ0 and θ ∈ Θ1, respectively: H = 0 refers to the true, but unknown, state of the world in which the null hypothesis is true and the alternative hypothesis is false (θ ∈ Θ0), and H = 1 refers to the true, but unknown, state of the world in which the alternative hypothesis is true and the null hypothesis is false (θ ∈ Θ1). In general, hypotheses can be classified as simple or composite. A simple hypothesis refers to a subset of parameter space which contains a single element, for example Θ0 := {θ0}. A composite hypothesis refers to a subset of parameter space which contains more than one element, for example Θ0 := R≤0. The commonly encountered null hypothesis Θ0 = {0}, also referred to as the nil hypothesis, is an example of a simple hypothesis.

Tests Given the test hypotheses scenario introduced above, a test is defined as a mapping from the data outcome space to the set {0, 1}, formally

φ : R^n → {0, 1}, y ↦ φ(y).    (10.3)

Here, the test value φ(y) = 0 represents the act of not rejecting the null hypothesis, while the test value φ(y) = 1 represents the act of rejecting the null hypothesis. Rejecting the null hypothesis is equivalent to accepting the alternative hypothesis, and accepting the null hypothesis is equivalent to rejecting the alternative hypothesis. Because y is a random entity, the expression φ(y) is also a random entity. All tests φ considered herein involve the composition of a test statistic

γ : R^n → R,    (10.4)

where R models the test statistic’s outcome space, and a subsequent decision rule

δ : R → {0, 1},    (10.5)

such that the test can be written as

φ = δ ◦ γ : R^n → {0, 1}.    (10.6)

Note that given their dependence on the random entity y, both γ and δ should be understood as random entities. The subset of the test statistic’s outcome space for which the test assumes the value 1 is referred to as the rejection region of the test. Formally, the rejection region is defined as

R := {γ(y) ∈ R | φ(y) = 1} ⊂ R.    (10.7)

The random events φ(y) = 1 and γ(y) ∈ R are thus equivalent and associated with the same probability under pθ. In a concrete test scenario, it is hence usually the probability distribution of the test statistic that is of principal concern for assessing the test’s outcome behaviour. Finally, all test decision rules considered herein are based on the test statistic exceeding a critical value u ∈ R. By means of the indicator function, the tests considered here can thus be written as

φ : R^n → {0, 1}, y ↦ φ(y) := 1_{γ(y) ≥ u} := { 1, γ(y) ≥ u
                                                0, γ(y) < u.    (10.8)

Note that (10.8) describes the situation of one-sided tests. The one-sided one-sample t-test is a familiar example of the general test structure described by expression (10.8): using the sample mean and sample standard deviation, a realization of the random entity y is first transformed into the value of the t-statistic, whose size is then compared to a critical value in order to decide whether or not to reject the null hypothesis.
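The following minimal sketch (the data, the critical value, and the Gaussian sampling model are illustrative assumptions) expresses a one-sided one-sample t-test explicitly as the composition φ = δ ◦ γ of a test statistic and a decision rule in the sense of eq. (10.6):

import numpy as np

def gamma(y):
    # test statistic: one-sample t-statistic sqrt(n) * mean / standard deviation
    n = y.size
    return np.sqrt(n) * y.mean() / y.std(ddof=1)

def delta(t, u):
    # decision rule: reject (1) if the statistic equals or exceeds the critical value u
    return int(t >= u)

def phi(y, u):
    # the test as the composition delta(gamma(y))
    return delta(gamma(y), u)

rng = np.random.default_rng(5)
y = rng.normal(0.5, 1.0, size=20)           # assumed data
print("test statistic:", gamma(y))
print("test outcome  :", phi(y, u=1.729))   # u: assumed critical value for illustration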

Test error probabilities When conducting a test as just described, two kinds of errors can occur. First, the null hypothesis can be rejected (φ(y) = 1), when it is in fact true (θ ∈ Θ0). This error is referred to as the Type I error. Second, the null hypothesis may not be rejected (φ(y) = 0), when it is in fact false (θ ∈ Θ1). The latter error is known as the Type II error. The probabilities of Type I and Type II errors under a given probabilistic model are central to the quality of a test: the probability of a Type I error is called the size of the test and is denoted by α ∈ [0, 1]. It is defined as

α := pΘ0(φ(y) = 1),    (10.9)

and is also routinely referred to as the Type I error rate of the test. Its complementary probability,

pΘ0 (φ(y) = 0) = 1 − α, (10.10) is known as the specificity of a test. The probability of a Type II error

pΘ1 (φ(y) = 0) (10.11) lacks a universally accepted denomination. Its complementary probability

β := pΘ1(φ(y) = 1)    (10.12)

is referred to as the power of the test. In words, the power of a test is thus the probability of accepting the alternative hypothesis (rejecting the null hypothesis) if θ ∈ Θ1, i.e., if the alternative hypothesis is true. Note that basic introductions to test error probabilities often denote the probability of a Type II error by β ∈ [0, 1] and thus define power by 1 − β. For our current purposes, we prefer the definition of eq. (10.12), because it keeps the notation concise and is more coherent with established notations of test quality functions.

Significance levels It is important to distinguish between the size and the significance level of a test: a test is said to be of significance level α0 ∈ [0, 1], if its size α is smaller than or equal to α0, i.e., if

α ≤ α0.    (10.13)

If for a test of significance level α0 it holds that α < α0, the test is referred to as a conservative test. If for a test of significance level α0 it holds that α = α0, the test is referred to as an exact test. Tests with an associated significance level α0 for which α > α0 are sometimes referred to as liberal tests. Note, however, that such tests are, strictly speaking, not of significance level α0.


The test quality function The size and the power of a test are summarized in the test’s quality function. For a test φ, the test quality function is defined as

q : Θ → [0, 1], θ ↦ q(θ) := E_{pθ(y)}(φ(y)).    (10.14)

In words, the test quality function is a function of the probabilistic model parameter θ and assigns to each value of this parameter a value in the interval [0, 1]. This value is given by the expectation of the test φ under the probabilistic model pθ(y). The definition of the test quality function is motivated by the values it assumes for θ ∈ Θ0 and θ ∈ Θ1: because the random variable φ(y) only takes on values in {0, 1}, the expected value E_{pθ(y)}(φ(y)) is identical to the probability of the event φ(y) = 1 under pθ(y). Thus, for θ ∈ Θ0, the test quality function returns the size of the test (eq. (10.9)), and for θ ∈ Θ1, the test quality function returns the power of the test (eq. (10.12)). For θ ∈ Θ1, the test quality function is also referred to as the test’s power function and is denoted by

β :Θ1 → [0, 1], θ 7→ β(θ) := pΘ1 (φ(y) = 1). (10.15)

Test construction In both applications and the theoretical development of statistical tests, the probability of a Type I error, i.e., the test size, is usually considered to be more important than the Type II error rate, i.e., the complement of the test’s power. In effect, when designing a test, the test’s size is usually fixed first, for example by deciding on a significance level such as α0 = 0.05 and its associated critical value uα0 of the test statistic (cf. eq. (10.8)). In a second step, different tests or different probabilistic models are then compared in their ability to minimize the probability of the test’s Type II error, i.e., to maximize the test’s power. For example, the celebrated Neyman-Pearson lemma states that for tests of simple hypotheses, the likelihood ratio test achieves the highest power for a given significance level over all conceivable statistical tests (Neyman and Pearson, 1933).

10.2 A single-observation z-test

We next illustrate the theoretical concepts introduced in Section 10.1 with an example. To this end, we consider a probabilistic model pθ(y) that governs the distribution of a data random variable y taking values in R. For µ ∈ R and σ^2 > 0, the model is assumed to be defined in terms of the probability density function

pθ(y) := N(y; µ, σ^2).    (10.16)

Intuitively, a single data point is thus assumed to have been sampled from a univariate Gaussian distribution of unknown expectation and known variance. For this model, we assume that the parameter space of interest is of the form Θ := R≥0. A single test scenario is then induced by defining the null and alternative hypotheses

µ ∈ Θ0 := {0} and µ ∈ Θ1 := R>0.    (10.17)

Furthermore, a test of the form (10.8) can be constructed by defining the identity test statistic

Z : R → R, y ↦ Z(y) := y,    (10.18)

and the test

φ : R → {0, 1}, y ↦ φ(y) := 1_{Z(y) ≥ u}.    (10.19)

In words, the null hypothesis µ ∈ Θ0 is rejected, if the data realization is equal to or exceeds a given critical value u ∈ R, otherwise it is not rejected.

Type I error rate control As discussed above, to afford Type I error rate control and to evaluate the power of a thus controlled test, the distributions of the test statistic under the null and alternative hypotheses are central. The former distribution allows for identifying a critical value such that the size of the test maximally assumes a certain probability. The latter distribution allows for evaluating the probability of rejecting the null hypothesis under the scenario of the alternative hypothesis being true. In the current test scenario, the


distribution of the test statistic under the null hypothesis θ ∈ Θ0, and hence also the probabilities for the equivalent events Z(y) ∈]u, ∞[ and φ(y) = 1, can be readily inferred: because the test statistic conforms to the identity mapping, its distribution for µ ∈ Θ0 is given by the probability density function

pΘ0(z) = N(z; 0, σ^2).    (10.20)

Likewise, the test statistic distribution for θ ∈ Θ1 and its associated events Z(y) ∈]u, ∞[ and φ(y) = 1 is given by the probability density function

pΘ1(z) = N(z; µ, σ^2) with µ ∈ R>0.    (10.21)

Given the form (10.19) of the current test, φ(y) can be rendered an exact test of significance level α0 by choosing a critical value uα0 such that

pΘ0(φ(y) = 1) = pΘ0(Z(y) ≥ uα0) = 1 − ∫_{−∞}^{uα0} N(x; 0, σ^2) dx = α0.    (10.22)

Note that the required integral corresponds to the CDF of the univariate Gaussian distribution, for which well-known and widely implemented approximations exist.
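The following minimal sketch (assuming σ^2 := 1 and α0 := 0.05 for illustration) determines the critical value uα0 of eq. (10.22) from the inverse CDF of the Gaussian distribution and verifies the resulting test size numerically:

import numpy as np
from scipy import stats

sigma, alpha0 = 1.0, 0.05                                    # assumed values
u_alpha0 = stats.norm.ppf(1 - alpha0, loc=0, scale=sigma)    # critical value

# check: P(Z(y) >= u_alpha0) under the null hypothesis equals alpha0
print("critical value :", u_alpha0)
print("test size      :", 1 - stats.norm.cdf(u_alpha0, loc=0, scale=sigma))

# Monte Carlo check of the test size
rng = np.random.default_rng(6)
z = rng.normal(0.0, sigma, size=100_000)                     # data under the null hypothesis
print("empirical size :", np.mean(z >= u_alpha0))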

Power and positive predictive value function

Given a critical value uα0 and the distribution of the test statistic under the alternative hypothesis scenario as specified by (10.21), the probability of the event φ(y) = 1 evaluates to

pΘ1(φ(y) = 1) = pΘ1(Z(y) ≥ uα0) = 1 − ∫_{−∞}^{uα0} N(x; µ, σ^2) dx for µ ∈ R>0.    (10.23)

The power function of the test thus takes the form

β : R>0 → [0, 1], µ ↦ β(µ) := 1 − ∫_{−∞}^{uα0} N(x; µ, σ^2) dx.    (10.24)

In applied settings, the parameterization of power functions in terms of the effect size measure Cohen’s d is often preferred. For a univariate Gaussian distribution with expectation parameter µ ∈ R and variance parameter σ^2 > 0, Cohen’s d is defined as

d := µ/σ.    (10.25)

For the power function (10.24), re-parameterization in terms of d results in

β : R>0 → [0, 1], d ↦ β(d) := 1 − ∫_{−∞}^{uα0} N(x; σd, σ^2) dx.    (10.26)
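As a minimal sketch (σ^2 := 1, α0 := 0.05, and the list of effect sizes are assumed for illustration), the power function of eq. (10.26) can be evaluated for a range of effect sizes d:

import numpy as np
from scipy import stats

sigma, alpha0 = 1.0, 0.05
u_alpha0 = stats.norm.ppf(1 - alpha0, loc=0, scale=sigma)

def power(d):
    # eq. (10.26): probability of exceeding the critical value when mu = sigma * d
    return 1 - stats.norm.cdf(u_alpha0, loc=sigma * d, scale=sigma)

for d in (0.2, 0.5, 0.8, 1.0, 2.0):
    print(f"d = {d:3.1f}  power = {power(d):.3f}")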

10.3 Bibliographic remarks

The first part of this chapter reviews the classical frequentist theory of hypothesis testing as developed by Pearson (1900) (p-value), Student (1908) (T test statistic distribution), Fisher (1925) (significance testing), and Neyman and Pearson (1933) (hypothesis testing). For a disentanglement of their respective contributions, see, e.g., Gigerenzer (2004). In our presentation, we have mainly followed Czado and Schmidt (2011). Similar treatments can be found, for example, in Casella and Berger (2012), Wasserman (2004), and Lehmann and Romano (2005). The development of the positive predictive value function is based on Wacholder et al. (2004).

10.4 Study questions

1. Define the notion of a test hypothesis.
2. Discuss the notions of simple and composite test hypotheses.
3. Define the notion of a statistical test.
4. Define the notions of a test statistic and a test decision rule.


5. Define the notion of a test rejection region.
6. Define Type I and Type II errors.
7. Define the size and the power of a test.
8. Explain the notion of a test's significance level.
9. Define the notions of conservative, exact, and liberal tests.
10. Discuss the standard approach to test construction.

11 | T-tests and simple linear regression

11.1 Introduction

In this and the subsequent chapters, we consider the following statistical methods from the perspective of the GLM: t-tests, simple linear regression, multiple linear regression, the one- and multi-factorial analysis of variance (ANOVA), F-tests, and the analysis of covariance (ANCOVA). Each method is characterized in terms of its design matrix and beta parameter vector, but the statistical machinery for parameter estimation and inference is identical throughout. To exemplify the different methods, we shall apply them to one common artificial data set shown in Table 11.1. In this data set, the data vector y ∈ R^n represents the volume of the dorsolateral prefrontal cortex (DLPFC) of each of n = 32 participants and the experimental design concerns the question of whether the experimental factors age, measured in years, and alcohol consumption, measured in units of 7.9 grams of pure alcohol, influence the dependent variable DLPFC volume. Note that we are dealing with a between-subject design, which justifies the assumption that the error terms εi, i = 1, ..., n are independent.

Participant  Age [years]  Alcohol [units]  DLPFC volume
1   15  3  178.7708
2   16  6  168.4660
3   17  5  169.9513
4   18  7  162.0778
5   19  4  170.1884
6   20  8  156.9287
7   21  1  175.4092
8   22  2  173.3972
9   23  7  154.4907
10  24  5  158.3642
11  25  1  172.1033
12  26  3  162.6648
13  27  2  165.4449
14  28  8  142.2121
15  29  4  154.3557
16  30  6  145.6544
17  31  3  155.5286
18  32  4  150.5144
19  33  7  137.8262
20  34  1  160.1183
21  35  2  155.4419
22  36  8  127.1715
23  37  5  138.0237
24  38  6  133.4589
25  39  4  139.3813
26  40  3  145.1997
27  41  7  123.7259
28  42  5  130.7300
29  43  8  114.1148
30  44  1  151.1943
31  45  6  121.7235
32  46  2  140.9424

Table 11.1. An exemplary data set with dependent variable DLPFC volume and independent variables age and alcohol consumption.

In general, GLM designs lie within a spectrum between two extremes: on one side of the spectrum, one may assume that there is no systematic variation in the dependent variable at all. This corresponds to the case that each data point yi, i = 1, ..., n is a realization of a univariate Gaussian random variable, all of which have identical expectation. In this case, all observed data variability over experimental units conforms to “Gaussian noise”. Formally, this scenario can be written as

yi ∼ N(µ, σ^2) ⇔ yi = µ + εi, εi ∼ N(0, σ^2) for i = 1, ..., n.    (11.1)

In terms of the GLM formulation, this null model corresponds to the case of a design matrix consisting of a single column of ones and a single beta parameter. The ith entry in the matrix product of these terms, (Xβ)i, represents the experimental unit-independent expectation parameter µ ∈ R of the univariate Gaussian data random variables. Formally,

1 . n×1 X := . ∈ R and β := µ. (11.2) 1

On the other side of the spectrum, one may assume that there is complete and unsystematic variability over all experimental units, in the sense that each experimental unit’s data corresponds to a sample from a unit-specific univariate Gaussian random variable. In other words, each experimental unit is modelled by an experimental unit-specific expectation parameter µi,

yi ∼ N(µi, σ^2) ⇔ yi = µi + εi, εi ∼ N(0, σ^2) for i = 1, ..., n.    (11.3)

In its GLM implementation, this scenario corresponds to the square identity design matrix and a beta parameter vector comprising as many parameters as there are data points. This renders the ith entry in the matrix product Xβ the experimental unit-specific expectation parameter µi,

X := In ∈ R^{n×n} and β := (µ1, ..., µn)^T ∈ R^n.    (11.4)

Note that in neither case can any interesting statements be made with respect to the columns of the design matrix and the data variable, because (11.1) assumes that all experimental units are “the same”, while (11.3) assumes that all experimental units are “mutually different”. All the designs we will encounter in the following lie somewhere between (11.2) and (11.4) and thus constrain the differences between experimental units in some meaningful way which lends itself to an interpretation in terms of the independent experimental variables.

Continuous and categorical designs In general, design matrices can be classified according to whether their columns are formed by continuously varying real numbers or by so-called indicator or, more colloquially, dummy variables. In the first case, the designs are referred to as continuous or regression designs. Here, the independent variables represent continuous experimental factors and the design matrix columns are commonly referred to as regressors, predictors, or covariates. In the second case, the entries of the design matrix are typically 1’s and 0’s, and the designs are referred to as categorical or ANOVA-type designs. In categorical designs, the independent experimental variables are usually referred to as experimental factors and the values that they can take on as levels. The key difference between the continuous and categorical approaches lies in the expectation about changes in the dependent variable as a result of changes in the independent variable. If one treats an independent experimental variable as a continuous variate, one assumes a linear effect of this predictor on the dependent variable. If one treats an independent experimental variable as a discrete variate, one does not need to assume that for every unit change in the independent variable one expects a scaled unit change in the value of the dependent variable. In other words, one allows for arbitrary changes in the response from one category to another. This approach has the advantage of a simple interpretation and may be viewed as a prediction of qualitative differences. On the other hand, by grouping independent variable values into discrete categories, one discards information contained in the continuous covariation of independent and dependent variables.

In the following, we will discuss two forms of continuous GLM designs, simple and multiple linear regression, three forms of categorical GLM designs, t-tests, ANOVA designs, and F-tests, and one mixed form, the ANCOVA design. For each design, we will select the relevant aspects of the example data set in Table 11.1, write down the corresponding GLM in structural and design matrix form, show a typical visualization, and discuss its estimation and interpretation from a frequentist viewpoint.


11.2 One-sample t-test

The one-sample t-test is commonly portrayed as a procedure to test whether all data points were generated from univariate Gaussian distributions with identical expectation parameters. From a GLM viewpoint, the one-sample t-test corresponds to a categorical design with a single experimental factor taking on a single level. Each data point is modelled by a univariate Gaussian variable with identical expectations over data points and thus corresponds to the independent and identically distributed Gaussian samples scenario encountered already in Chapter 8 | Maximum likelihood estimation:

yi ∼ N(µ, σ^2) ⇔ yi = µ + εi, εi ∼ N(0, σ^2) for i = 1, ..., n.    (11.5)

As seen previously, in design matrix formulation (11.5) corresponds to

y ∼ N(Xβ, σ^2 In), where X := 1n ∈ R^{n×1}, β := µ, and σ^2 > 0.    (11.6)

Notably, the single-entry parameter vector β assumes the role of the data expectation µ. If applied to the example data set of Table 11.1, the one-sample t-test allows for evaluating the null hypothesis that the expectation parameter shared by all observed data points is zero. Table 11.2 shows the example data set viewed from the perspective of a one-sample t-test. One-sample t-test data are usually not visualized. However, in line with the visualization of other categorical designs, it is most appropriate to visualize them by means of their sample mean and sample standard deviation or standard error of the mean, as shown in Figure 11.1A.

Participant (i)  DLPFC volume (yi)
1   178.7708
2   168.4660
3   169.9513
4   162.0778
5   170.1884
6   156.9287
7   175.4092
8   173.3972
9   154.4907
10  158.3642
11  172.1033
12  162.6648
13  165.4449
14  142.2121
15  154.3557
16  145.6544
17  155.5286
18  150.5144
19  137.8262
20  160.1183
21  155.4419
22  127.1715
23  138.0237
24  133.4589
25  139.3813
26  145.1997
27  123.7259
28  130.7300
29  114.1148
30  151.1943
31  121.7235
32  140.9424

Table 11.2. The example data set considered as a one-sample t-test design.



Figure 11.1. (A) Visualization of a one-sample t-test design. (B) Visualization of an independent two-sample t-test design. The error bars depict the pooled standard deviation s12.

Estimation and evaluation of the one-sample t-test design In Chapter 8 | Maximum likelihood estimation, we have seen that the beta and variance parameter estimators for the one-sample t-test model evaluate to

β̂ = (1/n) Σ_{i=1}^n yi = ȳ and σ̂^2 = (1/(n − 1)) Σ_{i=1}^n (yi − ȳ)^2,    (11.7)

i.e., the sample mean and the sample variance. Moreover, in Chapter 9 | Frequentist distribution theory, we have seen that the T-statistic for the one-sample t-test model with contrast vector c := 1 evaluates to

Tc = √n · ȳ/s.    (11.8)

Under the null hypothesis µ ∈ {0}, the centred T-statistic

Tβ,c = √n · (ȳ − µ)/s = √n · (ȳ − 0)/s = √n · ȳ/s    (11.9)

is distributed according to a t-distribution with n − 1 degrees of freedom. If the probability of the T-statistic Tc to exceed an observed value tc is low, we may thus consider rejecting the null hypothesis. For the data depicted in Table 11.2, the beta parameter and variance parameter estimates evaluate to

β̂ = ȳ = 151.1 and σ̂^2 = s^2 = 290.8,    (11.10)

such that the T-statistic evaluates to

Tc = √32 · (151.1/17.1) ≈ 50.1.    (11.11)

The fact that under the null hypothesis P(Tβ,c ≥ 50.1) < 0.001 would hence prompt a rejection of the null hypothesis µ ∈ {0}, if, for example, a test significance level of α0 = 0.05 is desired.
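The following minimal sketch (using the 32 DLPFC volume values of Table 11.2; the use of the t-distribution CDF from scipy is an implementation choice, not part of the text) computes the one-sample t-test quantities of eqs. (11.10) and (11.11):

import numpy as np
from scipy import stats

y = np.array([178.7708, 168.4660, 169.9513, 162.0778, 170.1884, 156.9287,
              175.4092, 173.3972, 154.4907, 158.3642, 172.1033, 162.6648,
              165.4449, 142.2121, 154.3557, 145.6544, 155.5286, 150.5144,
              137.8262, 160.1183, 155.4419, 127.1715, 138.0237, 133.4589,
              139.3813, 145.1997, 123.7259, 130.7300, 114.1148, 151.1943,
              121.7235, 140.9424])                       # DLPFC volumes, Table 11.2

n = y.size
beta_hat = y.mean()                                      # sample mean
sigma2_hat = y.var(ddof=1)                               # sample variance
t_c = np.sqrt(n) * beta_hat / np.sqrt(sigma2_hat)        # eq. (11.8)
p_value = 1 - stats.t.cdf(t_c, df=n - 1)                 # P(T >= t_c) under mu = 0

print(beta_hat, sigma2_hat, t_c, p_value)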

11.3 Independent two-sample t-test

The independent two-sample t-test is commonly portrayed as a procedure to evaluate whether two groups of data points y1 ∈ R^{n1} and y2 ∈ R^{n2} were generated from the same underlying univariate Gaussian distribution. From the GLM perspective, the two-sample t-test corresponds to a categorical design with a single experimental factor taking on two levels. The distributions of the n1 random variables modelling data points of the first level take the form

y1i ∼ N(µ1, σ^2) ⇔ y1i = µ1 + εi, εi ∼ N(0, σ^2) for i = 1, ..., n1,    (11.12)

while the distributions of the n2 random variables modelling data points of the second level take the form

y2i ∼ N(µ2, σ^2) ⇔ y2i = µ2 + εi, εi ∼ N(0, σ^2) for i = 1, ..., n2.    (11.13)


In design matrix formulation, the two-sample t-test model for n := n1 + n2 can be written as

y ∼ N(Xβ, σ^2 In),    (11.14)

where

y := (y11, ..., y1n1, y21, ..., y2n2)^T ∈ R^n,   X := [ 1n1  0n1
                                                        0n2  1n2 ] ∈ R^{n×2},   β := (µ1, µ2)^T ∈ R^2,   and σ^2 > 0.    (11.15)

y2n2 0 1 Note that data and non-zero entries in the design matrix have to be arranged in such a manner that they identify the respective data variable group membership. As an example, consider regrouping the example data set of Table 11.1 into two groups of data points corresponding to participants younger than 31 years and participants older or equal to 31 years as shown in Table 11.3. Data observations for independent two-sample t-test designs are usually visualized by portraying their group sample means and associated standard deviations or standard errors of mean as shown in Figure 11.1B.

Group 1                                                    Group 2
Participant  Variable Index (ij)  DLPFC volume (y1i)       Participant  Variable Index (ij)  DLPFC volume (y2i)
1   1,1   178.7708                                         17  2,1   155.5286
2   1,2   168.4660                                         18  2,2   150.5144
3   1,3   169.9513                                         19  2,3   137.8262
4   1,4   162.0778                                         20  2,4   160.1183
5   1,5   170.1884                                         21  2,5   155.4419
6   1,6   156.9287                                         22  2,6   127.1715
7   1,7   175.4092                                         23  2,7   138.0237
8   1,8   173.3972                                         24  2,8   133.4589
9   1,9   154.4907                                         25  2,9   139.3813
10  1,10  158.3642                                         26  2,10  145.1997
11  1,11  172.1033                                         27  2,11  123.7259
12  1,12  162.6648                                         28  2,12  130.7300
13  1,13  165.4449                                         29  2,13  114.1148
14  1,14  142.2121                                         30  2,14  151.1943
15  1,15  154.3557                                         31  2,15  121.7235
16  1,16  145.6544                                         32  2,16  140.9424

Table 11.3. An independent two-sample t-test data set. The Participant column comprises the original data labels as in Table 11.1, while the Variable Index columns ij comprise the indices of the data points y1i and y2i after relabelling for the independent two-sample t-test design.

Estimation and evaluation of independent two-sample t-test designs As shown below, the independent two-sample t-test model’s beta and variance parameter estimators evaluate to

β̂ = ( (1/n1) Σ_{i=1}^{n1} y1i ,  (1/n2) Σ_{i=1}^{n2} y2i )^T  and  σ̂^2 = [Σ_{i=1}^{n1} (y1i − ȳ1)^2 + Σ_{i=1}^{n2} (y2i − ȳ2)^2] / [(n1 − 1) + (n2 − 1)],    (11.16)

respectively. The two entries in the beta parameter estimator of the independent two-sample t-test design thus correspond to the two sample averages

ȳ1 := (1/n1) Σ_{i=1}^{n1} y1i  and  ȳ2 := (1/n2) Σ_{i=1}^{n2} y2i.    (11.17)

The independent two-sample t-test model’s variance parameter estimator is also known as the pooled sample variance and commonly denoted by

s12^2 := [Σ_{i=1}^{n1} (y1i − ȳ1)^2 + Σ_{i=1}^{n2} (y2i − ȳ2)^2] / [(n1 − 1) + (n2 − 1)].    (11.18)

The square root of the pooled sample variance, s12 := √(s12^2), is known as the pooled sample standard deviation.


Proof. For the beta parameter estimator, we have

β̂ = (X^T X)^{-1} X^T y
   = [ n1, 0 ; 0, n2 ]^{-1} ( Σ_{i=1}^{n1} y1i ,  Σ_{i=1}^{n2} y2i )^T
   = [ n1^{-1}, 0 ; 0, n2^{-1} ] ( Σ_{i=1}^{n1} y1i ,  Σ_{i=1}^{n2} y2i )^T    (11.19)
   = ( (1/n1) Σ_{i=1}^{n1} y1i ,  (1/n2) Σ_{i=1}^{n2} y2i )^T.

For the variance estimator, we have with n = n1 + n2 and p = 2

σ̂^2 = (y − Xβ̂)^T (y − Xβ̂) / (n − p)
    = (y11 − ȳ1, ..., y1n1 − ȳ1, y21 − ȳ2, ..., y2n2 − ȳ2) (y11 − ȳ1, ..., y1n1 − ȳ1, y21 − ȳ2, ..., y2n2 − ȳ2)^T / (n1 + n2 − 2)    (11.20)
    = [Σ_{i=1}^{n1} (y1i − ȳ1)^2 + Σ_{i=1}^{n2} (y2i − ȳ2)^2] / (n1 + n2 − 2),

which corresponds to the pooled sample variance s12^2.

Finally, as shown below, the centred T-statistic for evaluating the simple null hypothesis µ1 − µ2 ∈ {0} is given by defining the contrast vector c := (1, −1)^T and setting β := (0, 0)^T. This yields the familiar formula

Tβ,c = (ȳ1 − ȳ2) / ( √(n1^{-1} + n2^{-1}) · s12 )    (11.21)

for the T-statistic of the independent two-sample t-test.


Proof. We have

Tβ,c = (c^T β̂ − c^T β) / √(σ̂^2 c^T (X^T X)^{-1} c)

     = (ȳ1 − ȳ2) / √( s12^2 (1, −1) [ n1^{-1}, 0 ; 0, n2^{-1} ] (1, −1)^T )

     = (ȳ1 − ȳ2) / √( (n1^{-1} + n2^{-1}) s12^2 )    (11.22)

     = (ȳ1 − ȳ2) / ( √(n1^{-1} + n2^{-1}) · s12 ).

As discussed in Chapter 9 | Frequentist distribution theory, this centred T -statistic is distributed according to a t-distribution with n − p = n1 + n2 − 2 degrees of freedom. If the probability of the corresponding T -statistic Tc to exceed an observed value tc is low, the null hypothesis µ1 − µ2 ∈ {0} may thus be rejected. For the data depicted in Table 11.3, the beta parameter and variance parameter estimates evaluate to

β̂ = (ȳ1, ȳ2)^T = (163.2, 139.1)^T and σ̂^2 = 145.8,    (11.23)

such that the T-statistic evaluates to

Tc = (163.2 − 139.1) / √((16^{-1} + 16^{-1}) · 145.8) = 24.1 / (0.35 · 12.1) ≈ 5.7.    (11.24)

The fact that under the null hypothesis P(Tβ,c ≥ 5.7) < 0.001 would hence prompt rejection of the null hypothesis, if, for example, a test significance level of α0 = 0.05 is desired.
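The following minimal sketch (using the DLPFC volume values of Table 11.1, split into the two groups of Table 11.3) computes the independent two-sample t-test quantities of eqs. (11.23) and (11.24):

import numpy as np

y = np.array([178.7708, 168.4660, 169.9513, 162.0778, 170.1884, 156.9287,
              175.4092, 173.3972, 154.4907, 158.3642, 172.1033, 162.6648,
              165.4449, 142.2121, 154.3557, 145.6544, 155.5286, 150.5144,
              137.8262, 160.1183, 155.4419, 127.1715, 138.0237, 133.4589,
              139.3813, 145.1997, 123.7259, 130.7300, 114.1148, 151.1943,
              121.7235, 140.9424])
y1, y2 = y[:16], y[16:]                       # participants 1-16 and 17-32
n1, n2 = y1.size, y2.size

y1_bar, y2_bar = y1.mean(), y2.mean()
s12_sq = (np.sum((y1 - y1_bar) ** 2) + np.sum((y2 - y2_bar) ** 2)) / (n1 + n2 - 2)
t = (y1_bar - y2_bar) / (np.sqrt(1 / n1 + 1 / n2) * np.sqrt(s12_sq))   # eq. (11.21)

print(y1_bar, y2_bar, s12_sq, t)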

11.4 Simple linear regression

The central idea of simple linear regression is that when the continuous regressor takes on the value 0, the expectation of the corresponding data variable is given by a ∈ R, and that the expectations of the data variables change by a fixed amount b ∈ R for a unit change in the continuous regressor. The distributions of the n random variables modelling the dependent variable take the form

yi ∼ N(a + b xi, σ^2) ⇔ yi = a + b xi + εi, εi ∼ N(0, σ^2) for i = 1, ..., n,    (11.25)

where xi, i = 1, ..., n denote the values of the continuous regressor variable. In design matrix formulation, the simple linear regression model can be written as

y ∼ N(Xβ, σ^2 In), where y ∈ R^n, X := [ 1  x1
                                         ⋮   ⋮
                                         1  xn ] ∈ R^{n×2}, β := (a, b)^T ∈ R^2, and σ^2 > 0.    (11.26)

The entries of β := (a, b)^T are commonly known as the offset parameter and the slope parameter, respectively. As an example of a simple linear regression design, we reconsider the example data of Table 11.1, for which we disregard the alcohol consumption variable. This results in the data set shown in Table 11.4. Here, the values x1, ..., xn of the continuous regressor variable are listed in the Age column. Simple linear regression designs are commonly visualized by plotting the values of the regressor on the x-axis and the values of the dependent experimental variable on the y-axis, as shown in Figure 11.2A. Often, the estimated data expectation vector Xβ̂ ∈ R^n is included and visualized as a function of the x1, ..., xn.


Participant (i)  Age (xi)  DLPFC volume (yi)
1   15  178.7708
2   16  168.4660
3   17  169.9513
4   18  162.0778
5   19  170.1884
6   20  156.9287
7   21  175.4092
8   22  173.3972
9   23  154.4907
10  24  158.3642
11  25  172.1033
12  26  162.6648
13  27  165.4449
14  28  142.2121
15  29  154.3557
16  30  145.6544
17  31  155.5286
18  32  150.5144
19  33  137.8262
20  34  160.1183
21  35  155.4419
22  36  127.1715
23  37  138.0237
24  38  133.4589
25  39  139.3813
26  40  145.1997
27  41  123.7259
28  42  130.7300
29  43  114.1148
30  44  151.1943
31  45  121.7235
32  46  140.9424

Table 11.4. The example data set arranged for a simple linear .

Estimation and evaluation of simple linear regression designs As shown below, the beta parameter estimator for the simple linear regression can be written as

β̂ = (β̂1, β̂2)^T = (â, b̂)^T = ( ȳ − (sxy/sxx) x̄ ,  sxy/sxx )^T,    (11.27)

where

x̄ := (1/n) Σ_{i=1}^n xi,  ȳ := (1/n) Σ_{i=1}^n yi,  sxx := Σ_{i=1}^n (xi − x̄)^2,  and  sxy := Σ_{i=1}^n (xi − x̄)(yi − ȳ).    (11.28)

The representation of the beta parameter estimates in the form of (11.27) is useful, because it allows one to readily appreciate the similarities and differences between simple linear regression and correlation models.

Proof. We first note that

sxy = Σ_{i=1}^n xi yi − n x̄ ȳ    (11.29)

and

sxx = Σ_{i=1}^n xi^2 − n x̄^2,    (11.30)



Figure 11.2. Visualization of a simple linear regression design.

because

sxy := Σ_{i=1}^n (xi − x̄)(yi − ȳ)
     = Σ_{i=1}^n (xi yi − xi ȳ − x̄ yi + x̄ ȳ)
     = Σ_{i=1}^n xi yi − ȳ Σ_{i=1}^n xi − x̄ Σ_{i=1}^n yi + n x̄ ȳ
     = Σ_{i=1}^n xi yi − n x̄ ȳ − n x̄ ȳ + n x̄ ȳ
     = Σ_{i=1}^n xi yi − n x̄ ȳ,    (11.31)

and

sxx = Σ_{i=1}^n (xi − x̄)^2
    = Σ_{i=1}^n (xi^2 − 2 xi x̄ + x̄^2)
    = Σ_{i=1}^n xi^2 − 2 x̄ Σ_{i=1}^n xi + n x̄^2
    = Σ_{i=1}^n xi^2 − 2 n x̄^2 + n x̄^2
    = Σ_{i=1}^n xi^2 − n x̄^2.    (11.32)


With eqs. (11.29) and (11.30), we then have

βˆ = (XT X)−1XT y −1  1 x1  y1   1 ··· 1   1 ··· 1  =  . .   .   x1 ··· xn . .  x1 ··· xn  .  1 xn yn (11.33)  Pn −1  Pn  n i=1 xi i=1 yi = Pn Pn 2 Pn i=1 xi i=1 xi i=1 xiyi  n nx¯ −1  ny¯  = Pn 2 Pn . nx¯ i=1 xi i=1 xiyi The inverse of XT X is given by 1  sxx +x ¯2 −x¯ n , (11.34) sxx −x¯ 1 because 1  sxx 2    1  nsxx 2 2 sxxnx¯ 2 Pn 2 n +x ¯ −x¯ n nx¯ n + nx¯ − nx¯ n + nx¯ x¯ − x¯ i=1 xi Pn 2 = 2 Pn 2 sxx −x¯ 1 nx¯ i=1 xi sxx −xn¯ + nx¯ −nx¯ + i=1 xi  Pn 2 2  1 sxx sxxx¯ − x¯( i=1 xi − nx¯ ) = Pn 2 2 sxx 0 i=1 xi − nx¯ 1 s s x¯ − xs¯  = xx xx xx (11.35) sxx 0 sxx 1 s 0  = xx sxx 0 sxx 1 0 = . 0 1 Hence,

 1 x¯2 x¯    n + s − s ny¯ βˆ =  xx xx    x¯ 1 Pn − xiyi sxx sxx i=1  2  Pn  1 + x¯ ny¯ − x¯ i=1 xiyi n sxx sxx =  Pn  i=1 xiyi − nx¯y¯ sxx sxx  2 Pn  ny¯ + x¯ ny¯ − x¯ i=1 xiyi n sxx sxx =  Pn  i=1 xiyi−nx¯y¯ s xx (11.36)  Pn  y¯ + xn¯ x¯y¯−x¯ i=1 xiyi sxx =  Pn  i=1 xiyi−nx¯y¯ sxx  Pn  y¯ − i=1 xiyi−nx¯y¯ x¯ sxx =  Pn  i=1 xiyi−nx¯y¯ sxx

 sxy  y¯ − s x¯ =  xx  . sxy sxx

The variance parameter estimator of the simple linear regression model is best represented in its native form

σ̂^2 = (y − Xβ̂)^T (y − Xβ̂) / (n − 2).    (11.37)

Statistical tests in the context of simple linear regression commonly concern the null hypotheses a ∈ {0} and b ∈ {0} by means of the contrast vectors c := (1, 0)^T and c := (0, 1)^T, respectively. For the data depicted in Table 11.4, the beta parameter and variance parameter estimates evaluate to

β̂ = (â, b̂)^T = (196.9, −1.5)^T and σ̂^2 = 95.6.    (11.38)

The T -statistic for assessing the null hypothesis a ∈ {0} evaluates to Tc = 33.01. The fact that under the null hypothesis P(Tβ,c ≥ 33.01) < 0.001 would hence prompt rejection of the null hypothesis, if, for example, a test significance level of α0 = 0.05 is desired. Similarly, the T -statistic for assessing the null hypothesis


b ∈ {0} evaluates to Tc = −8.02. The fact that under the null hypothesis P(Tβ,c ≥ | − 8.02|) < 0.001 would hence prompt rejection of the null hypothesis, if, for example, a test significance level of α0 = 0.05 is desired.
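The following minimal sketch (using the age and DLPFC volume values of Table 11.4; the contrast-based T-statistic follows eq. (9.14)) computes the simple linear regression estimates and the two T-statistics reported above:

import numpy as np

x = np.arange(15, 47, dtype=float)                 # ages 15, ..., 46 (Table 11.4)
y = np.array([178.7708, 168.4660, 169.9513, 162.0778, 170.1884, 156.9287,
              175.4092, 173.3972, 154.4907, 158.3642, 172.1033, 162.6648,
              165.4449, 142.2121, 154.3557, 145.6544, 155.5286, 150.5144,
              137.8262, 160.1183, 155.4419, 127.1715, 138.0237, 133.4589,
              139.3813, 145.1997, 123.7259, 130.7300, 114.1148, 151.1943,
              121.7235, 140.9424])

n = y.size
X = np.column_stack([np.ones(n), x])
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y                       # (a hat, b hat)
sigma2_hat = np.sum((y - X @ beta_hat) ** 2) / (n - 2)

for c in (np.array([1.0, 0.0]), np.array([0.0, 1.0])):
    t = (c @ beta_hat) / np.sqrt(sigma2_hat * (c @ XtX_inv @ c))
    print("contrast", c, "  T-statistic:", t)

print(beta_hat, sigma2_hat)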

11.5 Bibliographic remarks

The organization of this chapter owes much to Rodríguez (2007).

11.6 Study questions

1. Discuss the extremes of the spectrum of GLM designs for the analysis of any data set.
2. Discuss commonalities and differences between continuous and categorical GLM designs.
3. Write down the GLM formulation of the one-sample t-test.
4. Write down the one-sample t-test beta estimator and variance parameter estimator.
5. Write down the one-sample t-test T-statistic for the contrast vector c := 1.
6. Write down the GLM formulation of the independent two-sample t-test.
7. Write down the independent two-sample t-test beta estimator and variance parameter estimator.
8. Write down the independent two-sample t-test T-statistic for the contrast vector c := (1, −1)^T.
9. Write down the GLM formulation of simple linear regression.
10. Write down the simple linear regression beta estimator.

12 | Multiple linear regression

Multiple linear regression can be viewed as the most general application of the GLM in the sense that all columns of the design matrix are allowed to take on arbitrary values and there may be arbitrarily many of them. As introduced in Chapter 1, for i = 1, ..., n data variables yi, multiple linear regression designs take the general form

yi = xi1 β1 + xi2 β2 + xi3 β3 + ... + xip βp + εi,  εi ∼ N(0, σ^2).    (12.1)

The values xij, i = 1, ..., n that constitute the jth design matrix column are variously referred to as regressors, predictors, covariates, or independent variables. The values βj, j = 1, ..., p are variously referred to as beta parameters, regression weights, or (fixed) effects. In this Chapter, we first consider an exemplary multiple linear regression design for the data set introduced in Chapter 11 | T-tests and simple linear regression. We next consider the notions of linearly independent, orthogonal, and uncorrelated regressors and introduce a measure for the statistical efficiency of a multiple linear regression design. Finally, we consider the application of multiple linear regression designs in functional magnetic resonance imaging.

12.1 An exemplary multiple linear regression design

Here, we explicitly state the multiple linear regression design applicable to the data in Table 12.1, which comprises the two predictor variables Age and Alcohol for the dependent variable DLPFC volume.

Participant (i)  Age (xi1)  Alcohol (xi2)  DLPFC Volume (yi)
1   15  3  178.7708
2   16  6  168.4660
3   17  5  169.9513
4   18  7  162.0778
5   19  4  170.1884
6   20  8  156.9287
7   21  1  175.4092
8   22  2  173.3972
9   23  7  154.4907
10  24  5  158.3642
11  25  1  172.1033
12  26  3  162.6648
13  27  2  165.4449
14  28  8  142.2121
15  29  4  154.3557
16  30  6  145.6544
17  31  3  155.5286
18  32  4  150.5144
19  33  7  137.8262
20  34  1  160.1183
21  35  2  155.4419
22  36  8  127.1715
23  37  5  138.0237
24  38  6  133.4589
25  39  4  139.3813
26  40  3  145.1997
27  41  7  123.7259
28  42  5  130.7300
29  43  8  114.1148
30  44  1  151.1943
31  45  6  121.7235
32  46  2  140.9424

Table 12.1. The example data set introduced in Chapter 11, considered here from the perspective of a multiple linear regression design with two predictor variables.

Figure 12.1. (A) Simple linear regression. (B) A multiple linear regression example.

A multiple linear regression design with one offset variable and two predictor variables takes the form

yi ∼ N(a + b1 xi1 + b2 xi2, σ^2) ⇔ yi = a + b1 xi1 + b2 xi2 + εi, εi ∼ N(0, σ^2) for i = 1, ..., n,    (12.2)

where xij, i = 1, ..., n, j = 1, 2 denotes the value of the jth predictor variable for the ith observation. In its design matrix formulation, eq. (12.2) corresponds to

y ∼ N(Xβ, σ^2 In), where y ∈ R^n, X := [ 1  x11  x12
                                         1  x21  x22
                                         ⋮   ⋮    ⋮
                                         1  xn1  xn2 ] ∈ R^{n×3}, β := (a, b1, b2)^T ∈ R^3, and σ^2 > 0.    (12.3)

Here, the first entry in the beta parameter vector assumes the role of the offset parameter, the second entry assumes the role of the slope with respect to the first predictor variable, and the third entry assumes the role of the slope with respect to the second predictor.

Exemplary estimation and evaluation of a multiple linear regression design The most straightforward way to evaluate the beta and variance parameter estimates in a given multiple linear regression design is by direct implementation of their respective formulas,

$$\hat{\beta} = (X^TX)^{-1}X^Ty \quad \text{and} \quad \hat{\sigma}^2 = \frac{(y - X\hat{\beta})^T(y - X\hat{\beta})}{n - p}. \tag{12.4}$$

For the data depicted in Table 12.1, the beta and variance parameter estimates evaluate to

$$\hat{\beta} = \begin{pmatrix} \hat{a} \\ \hat{b}_1 \\ \hat{b}_2 \end{pmatrix} = \begin{pmatrix} 215.1 \\ -1.5 \\ -4.0 \end{pmatrix} \quad \text{and} \quad \hat{\sigma}^2 = 4.7. \tag{12.5}$$

T-statistics for assessing the null hypotheses $a \in \{0\}$, $b_1 \in \{0\}$, $b_2 \in \{0\}$, and $b_1 - b_2 \in \{0\}$ can then be evaluated based on the four contrast vectors $c_1 := (1, 0, 0)^T$, $c_2 := (0, 1, 0)^T$, $c_3 := (0, 0, 1)^T$, and $c_4 := (0, 1, -1)^T$, respectively. Based on the corresponding observed T-statistics of $t_1 = 140.8$, $t_2 = -36.1$, $t_3 = -24.0$, and $t_4 = 14.61$, and the associated probabilities of $P(T_{c,\beta} \ge 140.8) \le 0.001$, $P(T_{c,\beta} \ge |-36.1|) \le 0.001$, $P(T_{c,\beta} \ge |-24.0|) \le 0.001$, and $P(T_{c,\beta} \ge 14.61) \le 0.001$, one is hence prompted to reject the null hypotheses of zero effects and of a zero difference between the effects of age and alcohol. Multiple linear regression designs are not easily visualized, especially if the number of independent experimental variables is larger than 2. In Figure 12.1, we visualize the multiple linear regression design and the estimated regression plane for the current example. Note, however, that these kinds of graphs are rarely seen in the literature.
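A direct implementation of eq. (12.4) and of the contrast-based T-statistics, for example in Python with numpy and scipy, might look as follows. This is a minimal sketch: the array names `age`, `alcohol`, and `volume` are hypothetical placeholders for the columns of Table 12.1, and the code is meant to illustrate the formulas rather than to stand in for the computations reported in eqs. (12.5) and above.

```python
import numpy as np
from scipy import stats

def fit_glm(y, X):
    """Beta parameter and variance parameter estimates as in eq. (12.4)."""
    n, p = X.shape
    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
    resid = y - X @ beta_hat
    sigma2_hat = resid @ resid / (n - p)
    return beta_hat, sigma2_hat

def contrast_t(c, y, X):
    """T-statistic for a contrast weight vector c and the probability P(T >= |t|)."""
    n, p = X.shape
    beta_hat, sigma2_hat = fit_glm(y, X)
    t = c @ beta_hat / np.sqrt(sigma2_hat * c @ np.linalg.inv(X.T @ X) @ c)
    return t, stats.t.sf(np.abs(t), df=n - p)

# Assuming age, alcohol, volume are arrays holding the columns of Table 12.1:
# X = np.column_stack([np.ones(len(age)), age, alcohol])      # design matrix of eq. (12.3)
# beta_hat, sigma2_hat = fit_glm(volume, X)                    # cf. eq. (12.5)
# t4, p4 = contrast_t(np.array([0.0, 1.0, -1.0]), volume, X)   # null hypothesis b1 - b2 = 0
```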


12.2 Linearly independent, orthogonal, and uncorrelated regressors

If the columns of a design matrix show a high degree of similarity, estimation of their corresponding beta parameters can become problematic. Intuitively, the beta parameter estimates allocate parts of the data variability onto the design matrix regressors. However, if these regressors are highly similar, the estimation theory of the GLM has no principled answer as to how to best distribute the data variability among the regressors. In general, the GLM is not suited for assessing the differential effects of regressors with almost identical profiles. It is thus good practice to minimize the co-occurrence of events of interest that will form different regressors as much as possible while designing an experiment. There exist a great number of ways to optimize experimental designs. In this section, we introduce some of the nomenclature that is used in the discussion of optimal experimental designs. Specifically, we consider linearly independent, orthogonal, and uncorrelated regressors.

Although often used interchangeably, the concepts of linearly independent, orthogonal, and uncorrelated regressors are not identical. In principle, the problem of regressor similarity can be solved by designing regressors that are linearly independent. To understand what these terms mean, we give their definitions and then consider a number of examples. Consider a design matrix $X \in \mathbb{R}^{n \times p}$ and let $x_1 \in \mathbb{R}^n$ and $x_2 \in \mathbb{R}^n$ denote two columns of the design matrix. $x_1$ and $x_2$ are best conceived as vectors lying in an $n$-dimensional vector space. From this perspective, their correlation is not a meaningful concept, because, in a formal sense, correlation is a concept that applies to random variables. Nevertheless, one can of course compute a correlation coefficient between the entries of $x_1$ and $x_2$ and, because the correlation between the regressors of a design matrix is often mentioned in the literature, we consider it here as well. The notions of linear independence, orthogonality, and uncorrelatedness for $x_1$ and $x_2$ are defined as follows.

Definition 12.2.1 (Regressor linear independence, orthogonality, and uncorrelatedness). Let $x_1, x_2 \in \mathbb{R}^n$. Then

(1) $x_1$ and $x_2$ are called linearly independent, if and only if there exists no $a \in \mathbb{R}$ such that $ax_1 = x_2$ for $x_1, x_2 \neq 0$,
(2) $x_1$ and $x_2$ are called orthogonal, if and only if $x_1^T x_2 = 0$, and
(3) $x_1$ and $x_2$ are called uncorrelated, if and only if $(x_1 - \bar{x}_1 1_n)^T (x_2 - \bar{x}_2 1_n) = 0$, where $\bar{x}_1 := \frac{1}{n}\sum_{i=1}^n x_{1i}$, $\bar{x}_2 := \frac{1}{n}\sum_{i=1}^n x_{2i}$, and $1_n$ is a vector of all ones.

•

If one considers $x_1$ and $x_2$ as arrows in an $n$-dimensional space, definition (1) means that $x_1$ and $x_2$ do not fall along the same direction, and definition (2) means that $x_1$ and $x_2$ are perpendicular, i.e., the angle between them is 90°. Orthogonality is thus a special case of linear independence. With respect to the definition of uncorrelated regressors, it is noteworthy that it applies to centred regressors, i.e., the regressors from which their average value has been subtracted. Note also that if $x_1$ and $x_2$ are centred from the outset, orthogonality and uncorrelatedness are identical. Because orthogonality is a special case of linear independence, in this case $x_1$ and $x_2$ are also linearly independent. To illustrate these concepts, we consider three examples.

Example 1 Let

$$x_1 := \begin{pmatrix} 1 \\ 1 \\ 2 \\ 3 \end{pmatrix} \quad \text{and} \quad x_2 := \begin{pmatrix} 2 \\ 3 \\ 4 \\ 5 \end{pmatrix}. \tag{12.6}$$

Because there exists no real number $a$ that by scalar multiplication transforms $x_1$ into $x_2$ (the first two entries alone would require $a = 2$ and $a = 3$, respectively), the vectors are linearly independent. However, because $x_1^T x_2 = 2 + 3 + 8 + 15 = 28$, the vectors are not orthogonal, and because $(x_1 - \bar{x}_1 1_n)^T(x_2 - \bar{x}_2 1_n) = 3.5$, the vectors are not uncorrelated.

Example 2 Let

$$x_1 := \begin{pmatrix} 0 \\ 0 \\ 1 \\ 1 \end{pmatrix} \quad \text{and} \quad x_2 := \begin{pmatrix} 1 \\ 0 \\ 1 \\ 0 \end{pmatrix}. \tag{12.7}$$


Because there exists no real number $a$ that by scalar multiplication transforms $x_1$ into $x_2$, the vectors are linearly independent. Because $x_1^T x_2 = 1$, the vectors are not orthogonal. They are, however, uncorrelated: we have $\bar{x}_1 = \bar{x}_2 = 1/2$, and the centred vectors $x_1^c = (-1/2, -1/2, 1/2, 1/2)^T$ and $x_2^c = (1/2, -1/2, 1/2, -1/2)^T$ are orthogonal because $x_1^{cT} x_2^c = -1/4 + 1/4 + 1/4 - 1/4 = 0$.

Example 3 Let

$$x_1 := \begin{pmatrix} 1 \\ -5 \\ 3 \\ -1 \end{pmatrix} \quad \text{and} \quad x_2 := \begin{pmatrix} 5 \\ 1 \\ 1 \\ 3 \end{pmatrix}. \tag{12.8}$$

Because $x_1$ cannot be transformed into $x_2$ by scalar multiplication, the vectors are linearly independent. Because $x_1^T x_2 = 5 - 5 + 3 - 3 = 0$, the vectors are also orthogonal. However, they are not uncorrelated, because $(x_1 - \bar{x}_1 1_n)^T(x_2 - \bar{x}_2 1_n) = 5$.

In summary, uncorrelated regressors are not necessarily orthogonal, and orthogonal regressors are not necessarily uncorrelated. Because the concept of linear independence subsumes both orthogonality and uncorrelatedness, it is best to speak of regressor collinearity rather than correlation.
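The three examples can be checked numerically, for instance with the following short sketch, which evaluates linear independence via the matrix rank, orthogonality via the inner product, and uncorrelatedness via the inner product of the centred vectors. The function name is chosen here purely for illustration.

```python
import numpy as np

def regressor_relations(x1, x2):
    """Return (linearly independent, orthogonal, uncorrelated) for two regressors."""
    lin_indep = np.linalg.matrix_rank(np.column_stack([x1, x2])) == 2
    orthogonal = np.isclose(x1 @ x2, 0.0)
    x1c, x2c = x1 - x1.mean(), x2 - x2.mean()          # centred regressors
    uncorrelated = np.isclose(x1c @ x2c, 0.0)
    return lin_indep, orthogonal, uncorrelated

examples = {
    "Example 1": (np.array([1.0, 1, 2, 3]),  np.array([2.0, 3, 4, 5])),
    "Example 2": (np.array([0.0, 0, 1, 1]),  np.array([1.0, 0, 1, 0])),
    "Example 3": (np.array([1.0, -5, 3, -1]), np.array([5.0, 1, 1, 3])),
}
for name, (x1, x2) in examples.items():
    print(name, regressor_relations(x1, x2))
# Expected: Example 1 -> (True, False, False), Example 2 -> (True, False, True),
#           Example 3 -> (True, True, False)
```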

12.3 Statistical efficiency of multiple linear regression designs

The quality of a multiple linear regression design can be measured according to a variety of criteria. In the current section, we focus on a simple statistical criterion that has been proposed as a measure of the statistical efficiency of multiple linear regression in the context of event-related fMRI designs. The proposed criterion relates to the covariance of the beta parameter estimate distribution. Recall that the frequentist distribution of the beta parameter estimates is given by

$$\hat{\beta} \sim N(\beta, \sigma^2(X^TX)^{-1}). \tag{12.9}$$

The covariance matrix of beta parameter estimates is thus given by

$$C(\hat{\beta}) = \sigma^2(X^TX)^{-1} \in \mathbb{R}^{p \times p}. \tag{12.10}$$

Intuitively, the diagonal elements of this covariance matrix encode how much the regressor-specific effect estimates $\hat{\beta}_j, j = 1, ..., p$ vary over repeated frequentist sampling. According to eq. (12.10), this variability is a function of the variance parameter $\sigma^2 > 0$ and the inverse of the design matrix product $X^TX$. For constant $\sigma^2 > 0$, the variance of the effect size estimates $\hat{\beta}_j, j = 1, ..., p$ is thus a function of the diagonal entries $\left((X^TX)^{-1}\right)_{jj}, j = 1, ..., p$. For a contrast weight vector $c \in \mathbb{R}^p$, this insight motivates the following criterion for the statistical efficiency of multiple linear regression designs:

Definition 12.3.1 (A multiple linear regression design efficiency criterion). Let $y \sim N(X\beta, \sigma^2 I_n)$ denote a multiple linear regression GLM and let $c \in \mathbb{R}^p$ denote a contrast vector. Then

$$\xi : \mathbb{R}^p \times \mathbb{R}^{n \times p} \to \mathbb{R}_{\ge 0}, \; (c, X) \mapsto \xi(c, X) := \left(c^T(X^TX)^{-1}c\right)^{-1} \tag{12.11}$$

serves as a measure of the statistical efficiency of the design matrix $X$ for contrast $c$.

•

For contrast vectors $c$ comprising a one at the $j$th entry and zeros for all other entries, $\xi(c, X)$ returns the reciprocal of the $j$th diagonal element of the inverse design product matrix $(X^TX)^{-1}$. As, for constant $\sigma^2 > 0$, this diagonal element encodes the variance of the corresponding beta parameter estimate $\hat{\beta}_j$, the reciprocal of this value encodes its inverse variance or statistical efficiency. An alternative interpretation of the criterion (12.11) is afforded by considering the T-statistic definition

$$T_c := \frac{c^T\hat{\beta}}{\sqrt{\sigma^2 c^T(X^TX)^{-1}c}}. \tag{12.12}$$

Under the assumption of identical true, but unknown, values of $\beta$ and $\sigma^2$, the criterion (12.11) thus favours larger values of the T-statistic, because for larger values of $\xi(c, X)$ the denominator in (12.12)

becomes smaller. Importantly, the criterion $\xi$ depends on both the design matrix $X$ and the contrast weight vector of interest $c \in \mathbb{R}^p$. In other words, according to the criterion (12.11), the same multiple linear regression design can, in principle, be statistically efficient with respect to one contrast of interest and inefficient with respect to another.
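Definition 12.3.1 translates directly into code. The following is a minimal sketch; the synthetic design matrix is used purely for illustration and is not part of the running example.

```python
import numpy as np

def design_efficiency(c, X):
    """xi(c, X) := (c^T (X^T X)^{-1} c)^{-1}, cf. eq. (12.11)."""
    return 1.0 / (c @ np.linalg.inv(X.T @ X) @ c)

# Illustration with a synthetic design matrix (offset plus two random regressors)
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(20), rng.normal(size=(20, 2))])
print(design_efficiency(np.array([0.0, 1.0, -1.0]), X))   # efficiency for the contrast b1 - b2
```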

12.4 Multiple linear regression in functional neuroimaging

The fundamental aim of applying the GLM in the analysis of fMRI data is to map cognitive processes onto brain areas. To this end, two fortuitous facts are exploited: first, local neural activity results in a local alteration of the ratio of deoxygenated and oxygenated haemoglobin via its metabolic demands. Second, the local displacement of deoxygenated haemoglobin alters the local magnetic susceptibility of brain tissue and can be detected as an increase of the local MR signal by an MRI scanner. The MR signal change induced by local neural activity is referred to as the blood oxygen level-dependent signal (BOLD signal). The idea of task-based fMRI is to induce specific cognitive processes in participants, which are presumably reflected in specific neural and metabolic activity states, which in turn can be detected by means of fMRI. If two different cognitive processes are represented by anatomically different brain structures, the statistical evaluation of the location of task-induced MR signal differences thus allows for mapping cognitive processes onto the anatomy of the brain. In the following, we briefly review the process of fMRI data acquisition and fMRI data preprocessing before formulating the GLM for fMRI data analysis.

fMRI data acquisition

MRI scanners allow for taking images, i.e., three-dimensional arrays of numbers, of the brain. Depending on the specific parameters that are used to take these images, different image types result: T1-weighted images have high spatial resolution and reveal fine anatomical detail. T2*-weighted images, on the other hand, are typically less anatomically precise, but are sensitive to BOLD signal changes and take only about two seconds to acquire. fMRI studies based on the BOLD signal hence typically use T2*-weighted images. T2*-weighted images are usually acquired using an MR imaging process known as echo-planar imaging (EPI), hence the images used for fMRI are also often referred to as EPI images. In essence, the GLM-based analysis of these images converts EPI image time-series into maps of statistics that indicate local changes of the BOLD signal. These maps are called statistical parametric maps (SPMs). Here, parametric refers to the fact that the statistics involved are evaluated using parametric assumptions about their underlying distributions. fMRI data is typically organized as follows. A single participant is usually scanned in a single session, which comprises multiple runs of continuous MRI data acquisition. A run usually lasts about 10 to 15 minutes. During a run, the participant carries out a cognitive task, e.g., responding to visually presented stimuli via button presses, while a series of EPI images is acquired. fMRI data of one run thus comprises a temporal sequence of EPI images. The time it takes to acquire a single EPI image corresponds to the sampling interval of fMRI and is called time-to-repetition (TR). For typical two-dimensional imaging acquisition schemes, each image comprises a number of slices. Each slice in turn comprises a spatially arranged set of data containers known as volume elements, or voxels for short. Voxels are the three-dimensional analogues of two-dimensional pixels and make up the entire EPI image. Typical voxel sizes are 2 mm × 2 mm × 2 mm. It is very helpful to simply think of EPI images as three-dimensional arrays of numbers that represent the MR signal values of the image's voxels.

fMRI data preprocessing

Before the application of the GLM, fMRI data typically undergo some amount of data preprocessing to increase their quality. fMRI data preprocessing commonly comprises spatial distortion correction, spatial realignment, slice-time correction, spatial normalization, and spatial smoothing. We briefly review each of these steps in turn.

• Due to inhomogeneities of the magnetic field, certain parts of EPI images may be distorted with respect to the object the image is taken of. Correcting these distortions based on knowledge of magnetic field inhomogeneities is referred to as spatial distortion correction. The aim of distortion correction is thus to render the image a more veridical representation of the imaged object.


• In order to allocate an observed BOLD signal change to a specific brain region, one has to be sure that the time-series data of a voxel actually refers to the same brain region throughout the course of an experiment. The MR scanner's voxel grid is overlaid on the participant's brain in a fixed position. Thus, if the participant moves during a run, voxels will represent different brain regions over time. For this reason, participants are usually spatially fixated as much as possible and encouraged not to move their head during scanning. However, some residual motion, e.g., by the pulsation of the blood in the brain's vasculature, cannot be avoided and is corrected during the spatial realignment step. Typically, the first image of an EPI image time-series is used as the reference image to which all subsequent images are aligned.

• The slices comprising an EPI image are acquired in temporal succession. Because of this, and because during data analysis EPI images are typically considered as data samples from a single time-point, temporal interpolation can be used to resample each slice with respect to the EPI image acquisition's onset time. This process is known as slice-time correction.

• Spatial normalization refers to the transformation of the participant-specific three-dimensional voxel time-series into a standard group anatomical space. This transformation is performed by translating, rotating, stretching, and squeezing the EPI data in three dimensions. Because brain anatomy exhibits some degree of inter-individual variability, spatial normalization can never achieve full alignment between the brains of different participants, but it is a useful and commonly used approach for fMRI group studies.

• Spatial smoothing refers to the weighted spatial averaging of voxel data with data from nearby voxels. Intuitively, spatial smoothing may be conceived as removing random signal fluctuations over space and rendering the data spatially more coherent.

fMRI data organization

Upon data acquisition and data preprocessing, fMRI data of a single participant is organized in a spatiotemporal format of voxel-specific MR signal time-series. These spatially arranged MR signal time- series may be envisioned as in Table 12.2, where rows represent time-points (EPI images) and columns represent spatial locations (voxels). The basic idea of the mass-univariate viewpoint of GLM-based fMRI data analysis is to consider each voxel’s MR signal time-series in isolation. This is referred to as a mass-univariate approach, because the dependent variable, i.e., the voxel-specific MR signal time-series, is univariate - but there are many voxels.

                         Voxel 1   Voxel 2   Voxel 3   Voxel 4   ...   Voxel m
Time-point (image) 1      97.3      90.2      86.1      89.9     ...    85.3
Time-point (image) 2      98.2      91.1      87.0      89.5     ...    86.2
...                       ...       ...       ...       ...      ...    ...
Time-point (image) n      92.3      95.6      82.0      87.4     ...    83.1

Table 12.2. Tabular representation of voxel-specific MR signal time-series of a single participant and a single fMRI experimental run. Rows represent EPI images, i.e., data acquisition times, and columns represent voxels, i.e., three-dimensional locations. $n \in \mathbb{N}$ and $m \in \mathbb{N}$ denote the total number of images and the total number of voxels, respectively.

We next formulate the mass-univariate GLM for the analysis of fMRI data and explore how it can be used for cognitive process brain mapping. To this end, we consider the time-series data of a single voxel (i.e., a column of Table 12.2) and denote it by $y = (y_1, y_2, ..., y_n)^T$. The time-series data of a single voxel is thus expressed as a column vector $y \in \mathbb{R}^n$, the $i$th component of which corresponds to the $i$th time-point of the fMRI data acquisition. The GLM design for such a voxel data time-series then takes the form of a multiple linear regression over time. Specifically, the $i$th data point $y_i$ is modelled as a weighted sum of the values of a set of $p$ independent variables and an additive noise term,

$$y_i = x_{i1}\beta_1 + x_{i2}\beta_2 + x_{i3}\beta_3 + \cdots + x_{ip}\beta_p + \varepsilon_i \quad \text{for } i = 1, ..., n. \tag{12.13}$$

In eq. (12.13), $y_i \in \mathbb{R}$ denotes the value of the MR signal of the voxel for time-point (EPI image) $i$, $x_{ij}$ denotes the value of the $j$th independent variable for time-point $i$, $\beta_j, j = 1, ..., p$ denotes the time-independent weighting parameter for the $j$th independent variable, and $\varepsilon_i$ denotes a time-point-specific error contribution. Note that it is assumed that there are $p$ independent variables. The independent variables are also routinely referred to as regressors or predictors. The time point-specific noise term $\varepsilon_i$ is assumed to be distributed according to a univariate Gaussian distribution with expectation parameter 0 and variance parameter $\sigma^2$. Until further notice, we will assume that the $\varepsilon_i$ are distributed identically and independently for $i = 1, ..., n$. In probability density function form, eq. (12.13) thus takes the standard GLM form

$$y \sim N(X\beta, \sigma^2 I_n), \text{ where } y \in \mathbb{R}^n, \; X \in \mathbb{R}^{n \times p}, \; \beta \in \mathbb{R}^p, \text{ and } \sigma^2 > 0. \tag{12.14}$$

Figure 12.2. (A) The haemodynamic response function reflects the idealized MR signal response to a brief neural event. It serves as an important basis for GLM modelling of fMRI data in a temporal convolution framework. (B) Example condition onsets in a hypothetical experiment with two conditions. (C) Exemplary MR signal time-series from two different voxels. (D) Predicted MR signal time-series (design matrix columns) formed by convolving the stimulus onset functions of Panel B with the canonical haemodynamic response function of Panel A. (E) Predicted and observed voxel time-series data based on the parameter choices $\hat{\beta}_A$ and $\hat{\beta}_B$.
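To make the mass-univariate idea concrete, the following minimal sketch fits the GLM of eq. (12.14) separately to every voxel time-series of a data array organized as in Table 12.2. The array names `Y` and `X` are hypothetical placeholders for the preprocessed voxel data and the design matrix; the sketch illustrates the approach under those assumptions and is not a reference implementation.

```python
import numpy as np

def mass_univariate_glm(Y, X):
    """Fit y ~ N(X beta, sigma^2 I_n) separately to every voxel time-series.

    Y: (n, m) array of voxel time-series, organized as in Table 12.2.
    X: (n, p) design matrix. Returns (p, m) beta estimates and (m,) variance estimates."""
    n, p = X.shape
    B = np.linalg.solve(X.T @ X, X.T @ Y)        # beta estimates, one column per voxel
    R = Y - X @ B                                # residuals
    sigma2 = (R * R).sum(axis=0) / (n - p)       # voxel-wise variance estimates
    return B, sigma2

def t_map(c, Y, X):
    """Voxel-wise T-statistics for a contrast weight vector c (one value per voxel)."""
    B, sigma2 = mass_univariate_glm(Y, X)
    denom = np.sqrt(sigma2 * (c @ np.linalg.inv(X.T @ X) @ c))
    return (c @ B) / denom

# Usage (assuming Y and X exist): t_values = t_map(np.array([1.0, -1.0]), Y, X)
```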

We next explore how the mass-univariate GLM of eq. (12.14) and the intuition of the specificity of a brain region's response to a cognitive process of interest are related. To this end, we consider a simple fMRI experiment with one experimental factor comprising two levels. An example for such an experiment is the repeated presentation of two different visual stimuli, for example faces and houses, inducing the cognitive processes of visual face and location processing, respectively. For simplicity, we will refer to the two experimental levels as Condition 1 and Condition 2, respectively. As described above, for each fMRI data acquisition run, fMRI data are acquired continuously for approximately 10 to 15 minutes, while participants are exposed to the experimental manipulation. As also described above, the fundamental idea of GLM-based fMRI is that in response to a cognitive process, neurons in brain regions that are specialized for the respective cognitive process become active, causing a metabolic cascade which leads to a local increase in the level of oxygenated haemoglobin and hence the local MR signal. Crucially,


such haemodynamic responses to a variety of cognitive processes, mainly of the visual processing type, have been measured. Based on these measurements, the concept of a haemodynamic response function has been formulated. A haemodynamic response function is a mathematical model that describes the idealized change of the MR signal in response to a neurocognitive event. The particularities of haemodynamic response functions that are employed in GLM-based fMRI data analysis are discussed in a later section. For the moment, it suffices to note that the MR signal time-series in response to a neurocognitive event at time point $t = 0$ approximately takes the shape of the function shown in Figure 12.2A.

For the purpose of the example, we next assume that a participant was presented with stimuli representing Condition 1 and Condition 2 in random order over the course of an experimental run of 260 seconds at the times shown by the stick functions in Figure 12.2B. Whenever the stick function of a condition takes on the value 1, the respective condition was presented. For example, at times $t = 0$ and $t = 16$ Condition 1 was presented, at time $t = 32$ Condition 2 was presented, and so on. We further assume that while these conditions were presented, fMRI data were collected every 1.44 seconds from two voxels, which we refer to as Voxel A and Voxel B. We assume that these voxels are located in different brain regions and that their recorded data take the form of Figure 12.2C. Visually comparing the event time-points of Figure 12.2B to the voxel time-series data in Figure 12.2C suggests that Voxel A shows MR signal increases whenever Condition 1 is presented and no MR signal increase when Condition 2 is presented. On the other hand, Voxel B shows MR signal increases for both Condition 1 and Condition 2.

In the GLM analysis of fMRI data, the voxel-specific responsiveness to a specific condition that is suggested by Figures 12.2B and 12.2C is encoded by the beta parameter values of the independent variable representing the respective condition. To see this, consider the predicted signal time-series for voxels that are only and ideally responsive to either Condition 1 or Condition 2 as shown in Figure 12.2D. These predicted time-series are obtained by "replacing" the stick functions of Figure 12.2B with the assumed haemodynamic response function of Figure 12.2A. Technically, this "replacement" is achieved by the convolution of the stimulus stick functions with the haemodynamic response function and implements the linear time-invariant system perspective of GLM-based fMRI. This will be detailed in a later section. Crucially, in the GLM analysis of fMRI data, the predicted time-series for each condition are identified with the columns of the design matrix in the multiple linear regression design for a single voxel's observed data time-series. That is, the number that is given by the predicted MR signal for Condition 1 and the number that is given by the predicted MR signal for Condition 2 at time-point $i$ are concatenated into a row vector with two entries and entered in the $i$th row of the design matrix. Note that for the current example, data were acquired for $n = 260/1.44 \approx 180$ data points, such that $X \in \mathbb{R}^{180 \times 2}$. Alternatively, one can imagine transposing the predicted MR signals of Condition 1 into a column vector and entering this as the first column of the design matrix, and likewise transposing the predicted MR signals of Condition 2 into a column vector and entering this as the second column of the design matrix. The multiple linear regression design matrices for GLM-based fMRI data analyses are commonly represented as grey-scale images of their entries, as in the leftmost panels of Figures 12.3A and B.

Figure 12.3. (A) Graphical representation of the GLM matrix product for $\beta_A = (1, 0)^T$. (B) Graphical representation of the GLM matrix product for $\beta_B = (1, 1)^T$.

Now consider again the MR signal time-series of Voxel A in Figure 12.2C. Transposing this row vector into a column vector $y_A \in \mathbb{R}^{180}$, we see that we can write down the GLM equation for this voxel's time-series data quite well, if we choose the true, but unknown, beta parameter vector to be approximately $\beta_A := (1, 0)^T$. Intuitively, we can represent the corresponding GLM matrix multiplication graphically as in Figure 12.3A. Likewise, consider the MR signal time-series of Voxel B in Figure 12.2C. Here we see that we can recreate the voxel's data time-series using the same design matrix as for Voxel A but setting the true, but unknown, parameter vector to $\beta_B := (1, 1)^T$, as shown in Figure 12.3B. Note that the observed signal results from the outcome of the design matrix and parameter multiplication plus a noise vector, here denoted by $\varepsilon_A$ and $\varepsilon_B$, respectively. Equivalently, we may evaluate the respective beta parameter estimates $\hat{\beta}_A$ and $\hat{\beta}_B$ based on the design matrix $X$ and the data $y_A$ and $y_B$, respectively. These parameter estimates result in $\hat{\beta}_A = (1.0027, -0.0471)^T$ and $\hat{\beta}_B = (1.0033, 1.0025)^T$. Overlaying the estimated (or fitted) time-series $X\hat{\beta}_A$ and $X\hat{\beta}_B$ and the originally observed time-series confirms a good correspondence, as shown in Figure 12.2E.

In essence, the value of the true, but unknown, beta parameter which belongs to a specific onset regressor, or its estimated counterpart, tells us something about the voxel's preference with respect to an experimental condition or neurocognitive process: for Voxel A, the first entry in $\beta_A$ and $\hat{\beta}_A$ is large and the second entry in $\beta_A$ and $\hat{\beta}_A$ is small. From the discussion above, we see that this means that Voxel A is responsive to Condition 1, but not to Condition 2. Likewise, for Voxel B, the two entries in $\beta_B$ and $\hat{\beta}_B$ are very similar and, as discussed above, Voxel B responds equally well to both conditions.

Finally, statistical parametric maps are created by evaluating not only beta parameter estimates, but also variance parameter estimates and the ensuing T- or F-statistics (Figure 12.4). For example, for Voxel A, the T-statistic for the null hypothesis that the true, but unknown, beta parameters are identical, using a contrast weight vector of the form $c = (1, -1)^T$, will yield a value deviating from 0 quite a bit. For Voxel B, the difference between the entries in $\hat{\beta}_B$ is around zero, and the T-statistic using this contrast weight vector is hence also close to zero (given that the estimated variance parameters are approximately identical and not themselves close to zero). Concatenating all statistics values for a given contrast weight vector over voxels and arranging them in their anatomical relationships then yields an SPM.

Figure 12.4. Typical visualization of a statistical parametric map. Note that the SPM itself is overlaid on an anatomical image and voxels with statistics smaller than the threshold value u = 7.22 are masked out, i.e., translucent. Further note that only three slices of the SPM oriented along the principal anatomical axes are shown, and not the entire set of suprathreshold SPM values.
Typically, SPMs are visualized by overlaying the respective statistics values on anatomical images, representing the value of a statistic by means of a colormap, and thresholding the SPM at an appropriate value such that statistics smaller than this value are not shown at all.
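As an illustration of the convolution step described above, the following sketch builds a two-column fMRI design matrix by convolving condition onset stick functions with a haemodynamic response function and resampling at the TR of 1.44 seconds. The double-gamma HRF parameters and the onset times are assumptions chosen purely for illustration; they are not the functions or event sequences used to generate Figure 12.2.

```python
import numpy as np
from scipy import stats

def double_gamma_hrf(t, peak=6.0, undershoot=16.0, ratio=1/6.0):
    """An assumed double-gamma haemodynamic response function, normalized to a peak of 1."""
    h = stats.gamma.pdf(t, peak) - ratio * stats.gamma.pdf(t, undershoot)
    return h / h.max()

def fmri_design_matrix(onsets_per_condition, run_length=260.0, tr=1.44, dt=0.1):
    """Convolve condition onset stick functions with the HRF and sample at the TR."""
    t_hi = np.arange(0.0, run_length, dt)                 # high-resolution time grid (s)
    kernel = double_gamma_hrf(np.arange(0.0, 32.0, dt))   # 32 s HRF kernel
    scan_times = np.arange(0.0, run_length, tr)           # acquisition times of the EPI images
    columns = []
    for onsets in onsets_per_condition:
        sticks = np.zeros_like(t_hi)
        sticks[np.round(np.asarray(onsets) / dt).astype(int)] = 1.0   # stick function
        predicted = np.convolve(sticks, kernel)[: t_hi.size]          # predicted BOLD response
        columns.append(np.interp(scan_times, t_hi, predicted))        # resample at the TR
    return np.column_stack(columns)

# Hypothetical onset times (in seconds) for Condition 1 and Condition 2
X = fmri_design_matrix([[0, 16, 64, 128, 192], [32, 96, 160, 224]])
print(X.shape)  # (181, 2): one column per condition, one row per EPI image
```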

Optimizing fMRI designs

As an example of assessing the statistical efficiency of an fMRI design, we consider measuring the statistical efficiency of a first-level fMRI design with two experimental conditions and varying inter-event intervals (IEIs), i.e., temporal durations between the onsets of two successive events. To this end, we assume that for each of the two experimental conditions, 20 event repeats are presented in random order


and the IEI is varied between 2 and 22 seconds. In Figure 12.5A, we visualize two exemplary designs with IEI = 4 seconds and IEI = 12 seconds, respectively. The left panel of Figure 12.5B depicts the angle in degrees and the correlation of the two design matrix regressors as a function of the IEI, averaged over 100 randomly allocated event sequences. Note that the geometrical relationship of the regressors and their correlation do not show a one-to-one correspondence. For example, for an IEI of approximately 5 seconds, the regressors are nearly orthogonal, but also strongly negatively correlated. The right panel of Figure 12.5B depicts the design efficiency criterion $\xi$ as a function of the IEI for two different contrast weight vectors. The first contrast weight vector $c_1 := (1, 1)^T$ allows for detecting activation over both experimental conditions, while the second contrast weight vector $c_2 := (1, -1)^T$ allows for detecting differential activation between both experimental conditions. Notably, short IEIs are more efficient for detecting activation across both conditions, while intermediate IEIs of around 10 seconds are most efficient for detecting differential activations between the conditions. In summary, criteria such as (12.11) can help to optimize first-level fMRI GLM designs. Often, however, there exist additional constraints, such as the cognitive effects to be elicited by the paradigm, and general-purpose criteria for optimizing first-level fMRI GLM designs are difficult to establish. Simulating potential first-level fMRI GLM designs and assessing their regressor collinearity is in general an advisable approach.

Figure 12.5. Measuring the statistical efficiency of first-level fMRI GLM designs. (A) Two-condition fMRI GLM designs with varying inter-event intervals (IEIs). The left panel depicts an exemplary design with an IEI of 4 seconds, the right panel an exemplary design with an IEI of 12 seconds. Note that for the short IEI, the individual predicted BOLD responses for experimental events overlap, which is not the case for the long IEI. (B) The left panel depicts the average angle and correlation between the two regressors as a function of the IEI over 100 randomly allocated event sequences. The right panel depicts the average design efficiency as measured by $\xi$ for two contrast weight vectors. Notably, the IEI and the parameter contrast of interest interact in their determination of the first-level fMRI GLM design efficiency as measured by $\xi(c, X)$.

12.5 Bibliographic remarks

Comprehensive introductions to multiple linear regression are provided, for example, by Draper and Smith (1998), Hocking (2003), and Seber and Lee (2003). A comprehensive introduction to the physical and biological basis of fMRI is given by Huettel et al. (2014). More advanced and in-depth accounts are provided by Jezzard et al. (2001) and Uludag et al. (2015). Finally, comprehensive introductions to fMRI data preprocessing are provided in Poldrack et al. (2011) and Friston (2007).

12.6 Study questions

1. Write down the GLM formulation of a multiple linear regression with one offset variable and two predictor variables.
2. Define the notions of collinear, orthogonal, and correlated design matrix regressors.

3. Discuss why, for a design matrix $X \in \mathbb{R}^{n \times p}$ and a contrast weight vector $c \in \mathbb{R}^p$, the function

$$\xi : \mathbb{R}^p \times \mathbb{R}^{n \times p} \to \mathbb{R}_{>0}, \; (c, X) \mapsto \xi(c, X) := \left(c^T(X^TX)^{-1}c\right)^{-1} \tag{12.15}$$


has some merit as a measure of the efficiency of the experimental design encoded in the design matrix.
4. What does it mean for the GLM to be used for fMRI data analysis in a mass-univariate fashion?
5. Describe the fMRI data organization after fMRI data preprocessing.
6. What is the difference between the haemodynamic response and a haemodynamic response function?
7. Which GLM design category is used for the analysis of fMRI time-series data?
8. What do the beta parameter estimates obtained in a GLM analysis of the fMRI time-series data of a single voxel reflect?
9. What is a statistical parametric map?
10. How can fMRI designs be optimized?

13 | One-way ANOVA

13.1 The GLM perspective

A convenient way to think about the one-way analysis of variance (ANOVA) is to consider it the extension of an independent two-sample t-test to more than two levels of a single experimental factor. Let $m \in \mathbb{N}$ denote the number of levels of the experimental factor, let $n_i \in \mathbb{N}$ denote the number of observations at level $i$ of the factor, such that the total number of observations is $n := \sum_{i=1}^m n_i$, and for $j = 1, ..., n_i$, let $y_{ij} \in \mathbb{R}$ denote the random variable modelling the $j$th experimental unit on the $i$th level of the experimental factor. The distributions of the random variables modelling data points in a one-way ANOVA design then take the form

$$y_{ij} \sim N(\mu_i, \sigma^2) \;\Leftrightarrow\; y_{ij} = \mu_i + \varepsilon_{ij}, \; \varepsilon_{ij} \sim N(0, \sigma^2), \; \sigma^2 > 0 \quad \text{for } i = 1, ..., m \text{ and } j = 1, ..., n_i. \tag{13.1}$$

Note that the ni random variables on level i of the experimental factor have the same expectation parameter µi. On first approximation, the underlying assumption of the one-way ANOVA model about this level-specific expectation parameter is

µi := µ0 + αi for i = 1, ..., m. (13.2)

In eq. (13.2), µ0 ∈ R models a common offset for all levels of the experimental factor and αi ∈ R models the additional effect of the ith level of the experimental factor. In design matrix formulation, the one-way ANOVA as just specified can be written as

$$y \sim N(X\beta, \sigma^2 I_n), \tag{13.3}$$

where

$$y := \begin{pmatrix} y_{11} \\ \vdots \\ y_{1n_1} \\ y_{21} \\ \vdots \\ y_{2n_2} \\ \vdots \\ y_{m1} \\ \vdots \\ y_{mn_m} \end{pmatrix} \in \mathbb{R}^n, \quad X := \begin{pmatrix} 1 & 1 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & 1 & 0 & \cdots & 0 \\ 1 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & 0 & 0 & \cdots & 1 \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & 0 & 0 & \cdots & 1 \end{pmatrix} \in \mathbb{R}^{n \times (m+1)}, \quad \beta := \begin{pmatrix} \mu_0 \\ \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_m \end{pmatrix} \in \mathbb{R}^{m+1}, \quad \text{and } \sigma^2 > 0. \tag{13.4}$$

Notably, this one-way ANOVA GLM formulation is based on a design matrix that comprises $m + 1$ columns: one column of all 1's corresponding to the constant offset parameter $\mu_0$ and $m$ columns of indicator variables corresponding to the $m$ level-specific effect parameters $\alpha_i, i = 1, ..., m$. These indicator variables take on the value 1 in rows of data variables corresponding to the $i$th level of the experimental factor and take on the value 0 otherwise.

Over-parameterization of the one-way ANOVA formulation (13.1) - (13.4)

Unfortunately, the one-way ANOVA GLM formulation of eqs. (13.1) - (13.4) requires a reformulation in order to enable the estimation of its parameters. As it stands, the model is over-parameterized. This problem may be viewed from at least three perspectives.

• From a data-analytical perspective, there are more unknowns than knowns: one can obtain averages from the $m$ levels of the experimental factor to estimate the $m$ expectations $\mu_i, i = 1, ..., m$, but the $m$ expectations $\mu_i$ are parameterized with the $m + 1$ parameters $\mu_0$ and $\alpha_1, ..., \alpha_m$.

• From the perspective of systems of linear equations, over-parameterization implies that we have more unknown parameters than equations from which to determine them. A simple example is the system of linear equations

$$a + b + c = 0, \quad 2a + b + c = 1 \;\Leftrightarrow\; \begin{pmatrix} 1 & 1 & 1 \\ 2 & 1 & 1 \end{pmatrix} \begin{pmatrix} a \\ b \\ c \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{13.5}$$

In eq. (13.5), we have two equations and thus two "data points" associated with a specific parameter combination, as well as three "parameters" $a$, $b$, and $c$. The problem is that different combinations of values of $a$, $b$, and $c$ can solve the system (13.5). We thus cannot uniquely infer the parameter values from the data points. For example, both the combination $a = 1, b = 1, c = -2$ and the combination $a = 1, b = -1, c = 0$ solve the system of linear equations (13.5).

• Finally, from the perspective of linear algebra, the design matrix $X$ in eq. (13.4) is not of full column rank, because the first column of $X$ is the sum of the last $m$ columns of $X$. In other words, the columns of $X$ are not linearly independent. It can be shown that the rank-deficiency of $X$ results in the rank-deficiency of $X^TX$. This in turn corresponds to $X^TX$ being non-invertible, which implies that the beta estimator is not defined.

To nevertheless obtain a useful one-way ANOVA model, the model represented by eqs. (13.1), (13.2), (13.3), and (13.4) needs to be reformulated. There are several ways in which this can be done. One approach is to set $\mu_0 = 0$ or to simply omit the constant offset $\mu_0$. If this approach is chosen, the $\alpha_i$ become the factor level expectations and $\alpha_i$ represents the expected response at factor level $i$. This approach, however, does not generalize well to models with more than one factor. Instead, the so-called reference cell method is generally preferred.

Reference cell reformulation

The reference cell method corresponds to constraining one of the level effects $\alpha_i, i = 1, ..., m$ to be zero. While in principle any level effect can be chosen as the reference cell, conventionally the effect of the first factor level $\alpha_1$ is constrained to be zero. That is, by definition $\alpha_1 := 0$. The level-specific expectation parameters originally formulated in eq. (13.2) thus take on the forms listed in Table 13.1.

Original formulation        Reformulation with α1 := 0
µ1 := µ0 + α1               µ1 := µ0
µ2 := µ0 + α2               µ2 := µ0 + α2
...                         ...
µm := µ0 + αm               µm := µ0 + αm

Table 13.1. The reference cell reformulation of the level-specific expectation parameters of a one-way ANOVA model with m levels.

Applying the reference cell method to the one-way ANOVA model thus entails that $\mu_0$ becomes the expectation parameter of the first level of the experimental factor, while the parameters $\alpha_i, i = 2, ..., m$ become the additional effects of the $i$th factor level with respect to the first-level expectation parameter $\mu_0$. Crucially, $\alpha_2, ..., \alpha_m$ thus model the expected differences between the expectations of the respective experimental factor levels $i = 2, ..., m$ and the expectation of the first level.

In its matrix formulation, the reference cell method with α1 := 0 is equivalent to removing α1 from the beta parameter vector and deleting its corresponding indicator variable column from the design matrix. The thus reformulated model of eqs. (13.3) and (13.4) takes the form

$$y \sim N(X\beta, \sigma^2 I_n), \tag{13.6}$$

where

$$y := \begin{pmatrix} y_{11} \\ \vdots \\ y_{1n_1} \\ y_{21} \\ \vdots \\ y_{2n_2} \\ \vdots \\ y_{m1} \\ \vdots \\ y_{mn_m} \end{pmatrix} \in \mathbb{R}^n, \quad X := \begin{pmatrix} 1 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 1 & 0 & \cdots & 0 \\ 1 & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 1 & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 1 & 0 & \cdots & 1 \\ \vdots & \vdots & & \vdots \\ 1 & 0 & \cdots & 1 \end{pmatrix} \in \mathbb{R}^{n \times m}, \quad \beta := \begin{pmatrix} \mu_0 \\ \alpha_2 \\ \vdots \\ \alpha_m \end{pmatrix} \in \mathbb{R}^m, \quad \text{and } \sigma^2 > 0. \tag{13.7}$$

Notably, this one-way ANOVA GLM formulation is based on a design matrix that comprises $m$ columns: one column of all 1's corresponding to the constant offset parameter $\mu_0$ and $m - 1$ columns of indicator variables corresponding to the $m - 1$ level-specific effect parameters $\alpha_2, ..., \alpha_m$. These indicator variables take on the value 1 in rows of data variables corresponding to the respective level of the experimental factor and take on the value 0 otherwise. In contrast to the rank-deficient design matrix in eq. (13.4), the last $m - 1$ columns of the design matrix defined in eq. (13.7) do not add up to its first column.
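The rank argument can be checked numerically. The following sketch builds both the over-parameterized design matrix of eq. (13.4) and the reference cell design matrix of eq. (13.7) for a balanced design and compares their column ranks; the helper function name is hypothetical and chosen for illustration.

```python
import numpy as np

def anova_design(n_per_level, reference_cell=True):
    """One-way ANOVA design matrix, cf. eqs. (13.4) and (13.7)."""
    m = len(n_per_level)
    blocks = []
    for i, n_i in enumerate(n_per_level):
        indicators = np.zeros((n_i, m))
        indicators[:, i] = 1.0
        blocks.append(np.column_stack([np.ones(n_i), indicators]))
    X = np.vstack(blocks)
    if reference_cell:
        X = np.delete(X, 1, axis=1)   # drop the alpha_1 indicator column
    return X

X_over = anova_design([8, 8, 8, 8], reference_cell=False)   # eq. (13.4), m + 1 = 5 columns
X_ref = anova_design([8, 8, 8, 8], reference_cell=True)     # eq. (13.7), m = 4 columns
print(np.linalg.matrix_rank(X_over), X_over.shape)   # rank 4 < 5 columns: rank-deficient
print(np.linalg.matrix_rank(X_ref), X_ref.shape)     # rank 4 = 4 columns: full column rank
```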

Estimation and evaluation of one-way ANOVA designs

As shown below, the one-way ANOVA model's beta parameter estimator for the GLM design of eqs. (13.6) and (13.7) evaluates to

$$\hat{\beta} = \begin{pmatrix} \hat{\mu}_0 \\ \hat{\alpha}_2 \\ \vdots \\ \hat{\alpha}_m \end{pmatrix} = \begin{pmatrix} \frac{1}{n_1}\sum_{j=1}^{n_1} y_{1j} \\ \frac{1}{n_2}\sum_{j=1}^{n_2} y_{2j} - \frac{1}{n_1}\sum_{j=1}^{n_1} y_{1j} \\ \vdots \\ \frac{1}{n_m}\sum_{j=1}^{n_m} y_{mj} - \frac{1}{n_1}\sum_{j=1}^{n_1} y_{1j} \end{pmatrix} = \begin{pmatrix} \bar{y}_1 \\ \bar{y}_2 - \bar{y}_1 \\ \vdots \\ \bar{y}_m - \bar{y}_1 \end{pmatrix}, \tag{13.8}$$

where

$$\bar{y}_i := \frac{1}{n_i}\sum_{j=1}^{n_i} y_{ij} \quad \text{for } i = 1, ..., m. \tag{13.9}$$

In other words, the expectation parameter $\mu_0$ of the first level of the experimental factor is estimated by the sample mean of the first-level data variables $y_{1j}, j = 1, ..., n_1$. Furthermore, for $i = 2, ..., m$, the expected differences $\alpha_i$ between the expectation of the $i$th level of the experimental factor and the expectation of the first level of the experimental factor are estimated by the difference between the sample mean of the $i$th-level data variables $y_{ij}, j = 1, ..., n_i$ and the sample mean of the first-level data variables $y_{1j}, j = 1, ..., n_1$.

Proof. We first note that for the design matrix defined in eq. (13.7), we have

$$X^TX = \begin{pmatrix} n & n_2 & n_3 & \cdots & n_m \\ n_2 & n_2 & 0 & \cdots & 0 \\ n_3 & 0 & n_3 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ n_m & 0 & 0 & \cdots & n_m \end{pmatrix}. \tag{13.10}$$


The inverse of $X^TX$ is given by

$$(X^TX)^{-1} = \begin{pmatrix} \frac{1}{n_1} & -\frac{1}{n_1} & \cdots & -\frac{1}{n_1} \\ -\frac{1}{n_1} & \frac{n_1+n_2}{n_1 n_2} & \cdots & \frac{1}{n_1} \\ \vdots & \vdots & \ddots & \vdots \\ -\frac{1}{n_1} & \frac{1}{n_1} & \cdots & \frac{n_1+n_m}{n_1 n_m} \end{pmatrix}. \tag{13.11}$$

For example, for $m = 4$, we have

$$X^TX = \begin{pmatrix} n & n_2 & n_3 & n_4 \\ n_2 & n_2 & 0 & 0 \\ n_3 & 0 & n_3 & 0 \\ n_4 & 0 & 0 & n_4 \end{pmatrix} \quad \text{and} \quad (X^TX)^{-1} = \begin{pmatrix} \frac{1}{n_1} & -\frac{1}{n_1} & -\frac{1}{n_1} & -\frac{1}{n_1} \\ -\frac{1}{n_1} & \frac{n_1+n_2}{n_1 n_2} & \frac{1}{n_1} & \frac{1}{n_1} \\ -\frac{1}{n_1} & \frac{1}{n_1} & \frac{n_1+n_3}{n_1 n_3} & \frac{1}{n_1} \\ -\frac{1}{n_1} & \frac{1}{n_1} & \frac{1}{n_1} & \frac{n_1+n_4}{n_1 n_4} \end{pmatrix}. \tag{13.12}$$

We next note that

$$X^Ty = \begin{pmatrix} \sum_{i=1}^m \sum_{j=1}^{n_i} y_{ij} \\ \sum_{j=1}^{n_2} y_{2j} \\ \vdots \\ \sum_{j=1}^{n_m} y_{mj} \end{pmatrix}. \tag{13.13}$$

We thus obtain

$$\hat{\beta} = (X^TX)^{-1}X^Ty = \begin{pmatrix} \frac{1}{n_1} & -\frac{1}{n_1} & \cdots & -\frac{1}{n_1} \\ -\frac{1}{n_1} & \frac{n_1+n_2}{n_1 n_2} & \cdots & \frac{1}{n_1} \\ \vdots & \vdots & \ddots & \vdots \\ -\frac{1}{n_1} & \frac{1}{n_1} & \cdots & \frac{n_1+n_m}{n_1 n_m} \end{pmatrix} \begin{pmatrix} \sum_{i=1}^m \sum_{j=1}^{n_i} y_{ij} \\ \sum_{j=1}^{n_2} y_{2j} \\ \vdots \\ \sum_{j=1}^{n_m} y_{mj} \end{pmatrix}. \tag{13.14}$$

For the first entry in $\hat{\beta}$, we have

$$\begin{aligned}
\hat{\beta}_1 &= \frac{1}{n_1}\sum_{i=1}^m\sum_{j=1}^{n_i} y_{ij} - \frac{1}{n_1}\sum_{j=1}^{n_2} y_{2j} - \cdots - \frac{1}{n_1}\sum_{j=1}^{n_m} y_{mj} \\
&= \frac{1}{n_1}\left(\left(\sum_{j=1}^{n_1} y_{1j} + \sum_{j=1}^{n_2} y_{2j} + \cdots + \sum_{j=1}^{n_m} y_{mj}\right) - \sum_{j=1}^{n_2} y_{2j} - \cdots - \sum_{j=1}^{n_m} y_{mj}\right) \\
&= \frac{1}{n_1}\sum_{j=1}^{n_1} y_{1j} \\
&= \bar{y}_1.
\end{aligned} \tag{13.15}$$



Figure 13.1. (A) Visualization of a one-sample t-test design. (B) Visualization of an independent two-sample t-test design. The errorbars depict the pooled standard deviation s12. (C) Visualization of a one-way ANOVA design.

For the second entry in $\hat{\beta}$, we have

$$\begin{aligned}
\hat{\beta}_2 &= -\frac{1}{n_1}\sum_{i=1}^m\sum_{j=1}^{n_i} y_{ij} + \frac{n_1+n_2}{n_1 n_2}\sum_{j=1}^{n_2} y_{2j} + \frac{1}{n_1}\sum_{j=1}^{n_3} y_{3j} + \cdots + \frac{1}{n_1}\sum_{j=1}^{n_m} y_{mj} \\
&= \frac{n_1+n_2}{n_1 n_2}\sum_{j=1}^{n_2} y_{2j} - \frac{1}{n_1}\left(\sum_{j=1}^{n_1} y_{1j} + \sum_{j=1}^{n_2} y_{2j} + \cdots + \sum_{j=1}^{n_m} y_{mj}\right) + \frac{1}{n_1}\sum_{j=1}^{n_3} y_{3j} + \cdots + \frac{1}{n_1}\sum_{j=1}^{n_m} y_{mj} \\
&= \frac{n_1+n_2}{n_1 n_2}\sum_{j=1}^{n_2} y_{2j} - \frac{1}{n_1}\sum_{j=1}^{n_1} y_{1j} - \frac{1}{n_1}\sum_{j=1}^{n_2} y_{2j} \\
&= \frac{n_1+n_2}{n_1 n_2}\sum_{j=1}^{n_2} y_{2j} - \frac{n_2}{n_1 n_2}\sum_{j=1}^{n_2} y_{2j} - \frac{1}{n_1}\sum_{j=1}^{n_1} y_{1j} \\
&= \frac{n_1}{n_1 n_2}\sum_{j=1}^{n_2} y_{2j} - \frac{1}{n_1}\sum_{j=1}^{n_1} y_{1j} \\
&= \frac{1}{n_2}\sum_{j=1}^{n_2} y_{2j} - \frac{1}{n_1}\sum_{j=1}^{n_1} y_{1j} \\
&= \bar{y}_2 - \bar{y}_1,
\end{aligned} \tag{13.16}$$

and analogously for the remaining entries $\hat{\beta}_3, ..., \hat{\beta}_m$.

As an example, we consider again the data introduced in Chapter 11. To illustrate the use of a one-way ANOVA model in this scenario, we ignore the alcohol factor and only consider the age factor. Moreover, we define m = 4 levels of the age factor: for level i = 1 we group participants aged 15 - 22 years, for level i = 2 we group participants aged 23 - 30 years, for level i = 3 we group participants aged 31-38 years, and for level i = 4 we group participants aged 39 - 46 years. This regrouping results in the data layout documented in Table 13.2. Here, yij models DLPFC volume of the jth participant at the ith level of the age factor and n1 = n2 = n3 = n4 = 8. One-way ANOVA designs are usually visualized by depicting the group sample means and the associated standard deviations as shown in Figure 13.1.

          i = 1                  i = 2                  i = 3                  i = 4
P    ij    y1j          P    ij    y2j          P    ij    y3j          P    ij    y4j
1    11    178.8        9    21    154.5        17   31    155.5        25   41    139.4
2    12    168.5        10   22    158.4        18   32    150.5        26   42    145.2
3    13    169.9        11   23    172.1        19   33    137.8        27   43    123.7
4    14    162.1        12   24    162.7        20   34    160.1        28   44    130.7
5    15    170.2        13   25    165.4        21   35    155.4        29   45    114.1
6    16    156.9        14   26    142.2        22   36    127.2        30   46    151.2
7    17    175.4        15   27    154.4        23   37    138.0        31   47    121.7
8    18    173.4        16   28    145.7        24   38    133.4        32   48    140.9

Table 13.2. The example data set introduced in Chapter 11 in a one-way ANOVA layout. The participant labels in the P column correspond to the labels in the original data table of Chapter 11, while the ij-indices correspond to the one-way ANOVA-style relabelled variables yij for i = 1, ..., m and j = 1, ..., ni.


The one-way ANOVA model for m = 4 and ni = 8, i = 1, 2, 3, 4 in reference cell formulation takes the form

$$\begin{aligned}
y_{1j} &\sim N(\mu_1, \sigma^2) \;\Leftrightarrow\; y_{1j} = \mu_1 + \varepsilon_{1j}, \; \varepsilon_{1j} \sim N(0, \sigma^2) \quad \text{for } j = 1, ..., 8 \text{ with } \mu_1 := \mu_0 \\
y_{2j} &\sim N(\mu_2, \sigma^2) \;\Leftrightarrow\; y_{2j} = \mu_2 + \varepsilon_{2j}, \; \varepsilon_{2j} \sim N(0, \sigma^2) \quad \text{for } j = 1, ..., 8 \text{ with } \mu_2 := \mu_0 + \alpha_2 \\
y_{3j} &\sim N(\mu_3, \sigma^2) \;\Leftrightarrow\; y_{3j} = \mu_3 + \varepsilon_{3j}, \; \varepsilon_{3j} \sim N(0, \sigma^2) \quad \text{for } j = 1, ..., 8 \text{ with } \mu_3 := \mu_0 + \alpha_3 \\
y_{4j} &\sim N(\mu_4, \sigma^2) \;\Leftrightarrow\; y_{4j} = \mu_4 + \varepsilon_{4j}, \; \varepsilon_{4j} \sim N(0, \sigma^2) \quad \text{for } j = 1, ..., 8 \text{ with } \mu_4 := \mu_0 + \alpha_4
\end{aligned} \tag{13.17}$$

In design matrix form, eq. (13.17) can be written as

$$y \sim N(X\beta, \sigma^2 I_{32}), \tag{13.18}$$

where

$$y = (y_{11}, ..., y_{18}, y_{21}, ..., y_{28}, y_{31}, ..., y_{38}, y_{41}, ..., y_{48})^T \in \mathbb{R}^{32}, \quad X = \begin{pmatrix} 1_8 & 0_8 & 0_8 & 0_8 \\ 1_8 & 1_8 & 0_8 & 0_8 \\ 1_8 & 0_8 & 1_8 & 0_8 \\ 1_8 & 0_8 & 0_8 & 1_8 \end{pmatrix} \in \mathbb{R}^{32 \times 4}, \quad \beta = \begin{pmatrix} \mu_0 \\ \alpha_2 \\ \alpha_3 \\ \alpha_4 \end{pmatrix}, \quad \text{and } \sigma^2 > 0, \tag{13.19}$$

where $1_8$ and $0_8$ denote eight-dimensional column vectors of ones and zeros, respectively.

For the data depicted in Table 13.2, the ith level sample means evaluate to

$$\bar{y}_1 = 169.40, \quad \bar{y}_2 = 156.91, \quad \bar{y}_3 = 144.76, \quad \text{and} \quad \bar{y}_4 = 133.38.$$

In accordance, the beta parameter estimates evaluate to

$$\hat{\beta} = \begin{pmatrix} \hat{\mu}_0 \\ \hat{\alpha}_2 \\ \hat{\alpha}_3 \\ \hat{\alpha}_4 \end{pmatrix} = \begin{pmatrix} 169.40 \\ -12.49 \\ -24.64 \\ -36.02 \end{pmatrix}. \tag{13.20}$$

From eq. (13.8), it follows that the $i$th level sample means can be reconstructed from the beta parameter estimates as

$$\begin{aligned}
\bar{y}_1 &= \hat{\mu}_0 = 169.40 \\
\bar{y}_2 &= \hat{\mu}_0 + \hat{\alpha}_2 = 169.40 - 12.49 = 156.91 \\
\bar{y}_3 &= \hat{\mu}_0 + \hat{\alpha}_3 = 169.40 - 24.64 = 144.76 \\
\bar{y}_4 &= \hat{\mu}_0 + \hat{\alpha}_4 = 169.40 - 36.02 = 133.38.
\end{aligned} \tag{13.21}$$

The variance parameter estimate for the data depicted in Table 13.2 evaluates to $\hat{\sigma}^2 = 115.43$. Based on the respective unit vector contrasts, the null hypotheses $\mu_0 \in \{0\}$, $\alpha_2 \in \{0\}$, $\alpha_3 \in \{0\}$, and $\alpha_4 \in \{0\}$


may be evaluated. Exemplarily, we note that $\alpha_2 \in \{0\}$ can be evaluated using $c = (0, 1, 0, 0)^T$, resulting in a T-statistic of $T_c = -2.32$. Because under the null hypothesis $P(T_{\beta,c} \ge |-2.32|) = 0.01$, one would hence be prompted to reject the null hypothesis, if a test significance level of $\alpha_0 = 0.05$ is desired. Other potential null hypotheses pertain to differences in the experimental level effects, i.e., $\alpha_2 - \alpha_3 \in \{0\}$, $\alpha_3 - \alpha_4 \in \{0\}$, and $\alpha_2 - \alpha_4 \in \{0\}$. Evaluation of the corresponding T-statistics based on the contrast vectors $c_1 := (0, 1, -1, 0)^T$, $c_2 := (0, 0, 1, -1)^T$, and $c_3 := (0, 1, 0, -1)^T$ yields the T-statistic values $T_{c_1} = 2.26$, $T_{c_2} = 2.12$, and $T_{c_3} = 4.38$ with associated probabilities $P(T_{\beta,c} \ge 2.26) = 0.02$, $P(T_{\beta,c} \ge 2.12) = 0.02$, and $P(T_{\beta,c} \ge 4.38) < 0.001$ under the respective null hypotheses. These results would hence prompt the rejection of the null hypothesis $\alpha_2 - \alpha_3 \in \{0\}$, the rejection of the null hypothesis $\alpha_3 - \alpha_4 \in \{0\}$, and the rejection of the null hypothesis $\alpha_2 - \alpha_4 \in \{0\}$, if a test significance level of $\alpha_0 = 0.05$ is desired.
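The estimates and T-statistics of this example can be computed along the following lines. The sketch assumes that the Table 13.2 values have been vectorized into an array `y` ordered as in eq. (13.19); the array and function names are hypothetical placeholders, and the numerical values reported above are not re-derived here.

```python
import numpy as np
from scipy import stats

# y: vector of the 32 DLPFC volumes of Table 13.2, ordered as in eq. (13.19) (assumed loaded)
m, n_i = 4, 8
n = m * n_i
# Reference cell design matrix of eq. (13.19): offset column plus indicators for levels 2-4
X = np.column_stack([np.ones(n), np.kron(np.eye(m)[:, 1:], np.ones((n_i, 1)))])

def contrast_t(c, y, X):
    """T-statistic and the probability P(T >= |t|) for a contrast vector c under the GLM."""
    n, p = X.shape
    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
    resid = y - X @ beta_hat
    sigma2_hat = resid @ resid / (n - p)
    t = c @ beta_hat / np.sqrt(sigma2_hat * c @ np.linalg.inv(X.T @ X) @ c)
    return t, stats.t.sf(np.abs(t), df=n - p)

# beta_hat would reproduce (mu_0, alpha_2, alpha_3, alpha_4) of eq. (13.20); for example
# t, p = contrast_t(np.array([0.0, 1.0, -1.0, 0.0]), y, X)   # null hypothesis alpha_2 - alpha_3 = 0
```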

13.2 The F -test perspective

F-tests are commonly introduced in the context of single-factor ANOVA designs. In this setting, the F-statistic refers to the ratio of a between-group variance, also referred to as a treatment variance, and a within-group variance, also referred to as an error variance. The aim of the current section is to link this classical F-test perspective to the model comparison perspective introduced in Chapter 9 | Frequentist distribution theory. To this end, we first review the one-way ANOVA design and the associated variance partitioning approach of the classical F-tests. We then relate the ensuing variance partitioning scheme to the structural form of the corresponding full and reduced GLM as introduced in Chapter 9 | Frequentist distribution theory and demonstrate the equivalence of both perspectives.

Classical variance partitioning one-way ANOVA

From a classical perspective, the one-way ANOVA F-test provides a single test procedure that allows one to assess the null hypothesis that the "population means of three or more experimental groups are equal". As in the GLM view of one-way ANOVA, the categorical independent variable of classical one-way ANOVA designs is referred to as a factor and the different values that it may assume are referred to as levels. Data in a one-way ANOVA design are then typically organized as shown in Table 13.3 below:

Level 1    Level 2    ···    Level m
y11        y21        ···    ym1
y12        y22        ···    ym2
...        ...        ···    ...
y1n1       y2n2       ···    ymnm

Table 13.3. Data layout of a one-factorial classical ANOVA design.

In Table 13.3, the entry $y_{ij} \in \mathbb{R}$ refers to the data obtained from the $j$th experimental unit on the $i$th level of the experimental factor, where $j = 1, ..., n_i$ and $i = 1, ..., m$. $n_i$ is the number of experimental units for level $i = 1, ..., m$, and the assumption of a balanced design corresponds to $n_1 = n_2 = ... = n_m$. $m$ is the number of levels of the experimental factor and has to be larger than 1. The total number of data points/experimental units is $n = \sum_{i=1}^m n_i$. The fundamental idea of classical one-way ANOVA is to assess whether the variability of the data is primarily due to the variability of the independent variable, i.e., differences between the levels of the experimental factor, or whether it is primarily due to inherent noise in the dependent variables. This is achieved by assessing the relative contributions of treatment-related variance and noise-related variance in a partitioning of the overall data variance. This partitioning takes the intuitive form

Data variance = Treatment-related variance + Noise Variance. (13.22)

If the ratio of treatment-related variance and noise variance, informally given by the F-statistic

$$\text{F-statistic} \approx \frac{\text{Treatment-related variance}}{\text{Noise variance}}, \tag{13.23}$$

is large, then one may infer that the experimental factor had some effect on the observed data values, and the null hypothesis that the experimental factor has no effect on the dependent variable may be rejected.


To formalize these notions, we first define a set of average and variance measures that can be computed based on the data variables listed in Table 13.3.

Definition 13.2.1 (Grand mean, group means, and sums of error squares). For $i = 1, ..., m$ and $j = 1, ..., n_i$, let $y_{ij}$ denote the $j$th data variable on the $i$th level of a one-way ANOVA design. Then

• the grand mean of the data is defined as

$$\bar{y} := \frac{1}{n}\sum_{i=1}^m\sum_{j=1}^{n_i} y_{ij}, \tag{13.24}$$

• the $i$th-level mean is defined as

$$\bar{y}_i := \frac{1}{n_i}\sum_{j=1}^{n_i} y_{ij}, \tag{13.25}$$

• the total sum of error squares is defined as

$$SES_{Total} := \sum_{i=1}^m\sum_{j=1}^{n_i} (y_{ij} - \bar{y})^2, \tag{13.26}$$

• the between-level sum of error squares (or treatment sum of squares) is defined as

$$SES_{Between} := \sum_{i=1}^m n_i(\bar{y}_i - \bar{y})^2, \tag{13.27}$$

• the within-level sum of error squares (or error sum of squares) is defined as

$$SES_{Within} := \sum_{i=1}^m\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2. \tag{13.28}$$

Note that SESTotal quantifies the total variability of all data points about the grand mean, SESBetween quantifies the variability of the ith level means about the grand mean, and SESWithin quantifies the variability of the ith level data points about the ith level mean for i = 1, ..., m. With these definitions, the intuitive variance partitioning of eq. (13.22) can be formalized as follows.

Theorem 13.2.1 (One-way ANOVA variance partitioning). For i = 1, ..., m and j = 1, ..., ni, let yij denote the jth data variable on the ith level of a one-way ANOVA design and let SESTotal, SESBetween, and SESWithin denote the total sum of error squares, the between-level sum of error squares, and the within-level sum of error squares, respectively, as defined in Definition 13.2.1. Then

SESTotal = SESWithin + SESBetween. (13.29)

The General Linear Model 20/21 | © 2020 Dirk Ostwald CC BY-NC-SA 4.0 The F -test perspective 151

Proof. With Definition 13.2.1, we have

$$\begin{aligned}
SES_{Total} &= \sum_{i=1}^m\sum_{j=1}^{n_i} (y_{ij} - \bar{y})^2 \\
&= \sum_{i=1}^m\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i + \bar{y}_i - \bar{y})^2 \\
&= \sum_{i=1}^m\sum_{j=1}^{n_i} \left((y_{ij} - \bar{y}_i)^2 + 2(y_{ij} - \bar{y}_i)(\bar{y}_i - \bar{y}) + (\bar{y}_i - \bar{y})^2\right) \\
&= \sum_{i=1}^m\left(\sum_{j=1}^{n_i}(y_{ij} - \bar{y}_i)^2 + 2(\bar{y}_i - \bar{y})\sum_{j=1}^{n_i}(y_{ij} - \bar{y}_i) + n_i(\bar{y}_i - \bar{y})^2\right) \\
&= \sum_{i=1}^m\left(\sum_{j=1}^{n_i}(y_{ij} - \bar{y}_i)^2 + 2(\bar{y}_i - \bar{y})\left(\sum_{j=1}^{n_i} y_{ij} - n_i\,\frac{1}{n_i}\sum_{j=1}^{n_i} y_{ij}\right) + n_i(\bar{y}_i - \bar{y})^2\right) \\
&= \sum_{i=1}^m\left(\sum_{j=1}^{n_i}(y_{ij} - \bar{y}_i)^2 + n_i(\bar{y}_i - \bar{y})^2\right) \\
&= \sum_{i=1}^m\sum_{j=1}^{n_i}(y_{ij} - \bar{y}_i)^2 + \sum_{i=1}^m n_i(\bar{y}_i - \bar{y})^2 \\
&= SES_{Within} + SES_{Between}.
\end{aligned} \tag{13.30}$$

To define the F-statistic in the classical perspective of one-way ANOVA, the additional concepts of degrees of freedom and mean squares are required. In the current context, the notion of degrees of freedom refers to the number of independent data points that result from computing a mean over a group of data points. Specifically, for the case of the grand mean and the associated total sum of error squares, there are $n$ values and, if the grand mean is known, $n - 1$ choices for different data points. The number of degrees of freedom of $SES_{Total}$ is thus said to be $DF_{Total} = n - 1$. Likewise, for the case of the grand mean and the level means and their associated between-level sum of error squares, if the grand mean is known, $m - 1$ of the group means may be chosen freely. The number of degrees of freedom of $SES_{Between}$ is thus said to be $DF_{Between} = m - 1$. Finally, for the case of the within-level sum of error squares, the $m$ level means and the individual data points are considered. For each level-specific data set, there are $n_i - 1$ degrees of freedom if the group mean is known. Summing over the $m$ levels, the total degrees of freedom of the within-level sum of error squares are $DF_{Within} = \sum_{i=1}^m (n_i - 1) = n - m$. In summary, the degrees of freedom of the three sums of error squares of interest are

$$
\mathrm{DF}_{\mathrm{Total}} = \mathrm{DF}_{\mathrm{Between}} + \mathrm{DF}_{\mathrm{Within}} \;\Leftrightarrow\; n - 1 = (m - 1) + (n - m). \tag{13.31}
$$

Division of the sums of error squares by their respective degrees of freedom yields estimators for the total, between-level, and within-level variances. In the context of one-way ANOVA, these variance estimators are referred to as total mean squares, between-level mean squares, and within-level mean squares, and they are defined as

$$
\begin{aligned}
\mathrm{MS}_{\mathrm{Total}} &:= \frac{\mathrm{SES}_{\mathrm{Total}}}{\mathrm{DF}_{\mathrm{Total}}} = \frac{1}{n-1}\sum_{i=1}^{m}\sum_{j=1}^{n_i}(y_{ij}-\bar{y})^2, \\
\mathrm{MS}_{\mathrm{Between}} &:= \frac{\mathrm{SES}_{\mathrm{Between}}}{\mathrm{DF}_{\mathrm{Between}}} = \frac{1}{m-1}\sum_{i=1}^{m} n_i(\bar{y}_i-\bar{y})^2, \\
\mathrm{MS}_{\mathrm{Within}} &:= \frac{\mathrm{SES}_{\mathrm{Within}}}{\mathrm{DF}_{\mathrm{Within}}} = \frac{1}{n-m}\sum_{i=1}^{m}\sum_{j=1}^{n_i}(y_{ij}-\bar{y}_i)^2,
\end{aligned}
\tag{13.32}
$$

respectively. Finally, the F-statistic is introduced as the ratio of the between-level mean squares to the within-level mean squares,

$$
F := \frac{\mathrm{MS}_{\mathrm{Between}}}{\mathrm{MS}_{\mathrm{Within}}}. \tag{13.33}
$$

As will be shown in the next section, the F-statistic defined in eq. (13.33) is identical to an F-statistic as defined in Chapter 9 under the one-way ANOVA model in reference cell formulation. Thus, under the assumption of independent and identically distributed Gaussian errors, the F-statistic defined in eq. (13.33) is distributed according to an f-distribution.
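The mean squares and the F-statistic of eqs. (13.32) and (13.33) translate directly into code. The sketch below implements them for simulated grouped data (again, the group structure is an arbitrary illustrative choice) and, as an optional cross-check, compares the result to SciPy's one-way ANOVA routine.

```python
# Classical one-way ANOVA F-statistic via mean squares (eqs. (13.31)-(13.33));
# a sketch on simulated data, not a reproduction of any data set from the text.
import numpy as np
from scipy.stats import f_oneway

def one_way_anova_f(groups):
    """groups: list of 1-D arrays, one array of data points per factor level."""
    y_all = np.concatenate(groups)
    n, m = y_all.size, len(groups)
    y_bar = y_all.mean()
    ses_between = sum(g.size * (g.mean() - y_bar) ** 2 for g in groups)
    ses_within = sum(np.sum((g - g.mean()) ** 2) for g in groups)
    ms_between = ses_between / (m - 1)        # DF_Between = m - 1
    ms_within = ses_within / (n - m)          # DF_Within  = n - m
    return ms_between / ms_within, m - 1, n - m

rng = np.random.default_rng(1)
groups = [rng.normal(10.0, 1.0, 6), rng.normal(12.0, 1.0, 8), rng.normal(9.0, 1.0, 5)]
f_stat, df_between, df_within = one_way_anova_f(groups)
print(f"F({df_between}, {df_within}) = {f_stat:.3f}")

# Cross-check against SciPy's implementation of the same classical F-statistic
assert np.isclose(f_stat, f_oneway(*groups).statistic)
```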

The GLM equivalence To show that the F -statistic defined in eq. (13.33) is identical to an F -statistic as defined in Chapter 9 under the one-way ANOVA model in reference cell formulation, we first make the one-way ANOVA model assumption that the distribution of the vectorized data of Table 13.3 adheres to the one-way ANOVA GLM in reference cell formulation, i.e., for

$$
y = (y_{11}, \ldots, y_{1n_1}, y_{21}, \ldots, y_{2n_2}, \ldots, y_{m1}, \ldots, y_{mn_m})^T \in \mathbb{R}^n, \tag{13.34}
$$

we assume

$$
y \sim N(X\beta, \sigma^2 I_n)
\quad\text{with}\quad
X := \begin{pmatrix}
1 & 0 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
1 & 0 & \cdots & 0 \\
1 & 1 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
1 & 1 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
1 & 0 & \cdots & 1 \\
\vdots & \vdots & & \vdots \\
1 & 0 & \cdots & 1
\end{pmatrix} \in \mathbb{R}^{n \times m},
\quad
\beta := \begin{pmatrix} \mu_0 \\ \alpha_2 \\ \vdots \\ \alpha_m \end{pmatrix} \in \mathbb{R}^m,
\quad\text{and}\quad \sigma^2 > 0. \tag{13.35}
$$

In line with the discussion in Chapter 9, we next consider the partitioning of the GLM specified in eq. (13.35) according to

$$
X = \begin{pmatrix} X_1 & X_2 \end{pmatrix} \in \mathbb{R}^{n \times m}
\quad\text{and}\quad
\beta := \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix} \in \mathbb{R}^m, \tag{13.36}
$$

where

$$
X_1 := \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix} \in \mathbb{R}^{n \times 1},
\quad \beta_1 := \mu_0 \in \mathbb{R},
\quad
X_2 := \begin{pmatrix}
0 & \cdots & 0 \\
\vdots & & \vdots \\
0 & \cdots & 0 \\
1 & \cdots & 0 \\
\vdots & & \vdots \\
1 & \cdots & 0 \\
\vdots & & \vdots \\
0 & \cdots & 1 \\
\vdots & & \vdots \\
0 & \cdots & 1
\end{pmatrix} \in \mathbb{R}^{n \times (m-1)},
\quad\text{and}\quad
\beta_2 := \begin{pmatrix} \alpha_2 \\ \vdots \\ \alpha_m \end{pmatrix} \in \mathbb{R}^{m-1},
\tag{13.37}
$$

such that with the definitions of eqs. (13.34) to (13.37) the reduced and full models are given by

$$
y \sim N(X_1\beta_1, \sigma^2 I_n) \quad\text{and}\quad y \sim N(X\beta, \sigma^2 I_n), \tag{13.38}
$$

respectively. We then have the following result:

Theorem 13.2.2 (Classical and GLM F-statistic equivalence). For i = 1, ..., m and j = 1, ..., ni, let yij denote the jth data variable on the ith level of a one-way ANOVA GLM design $y \sim N(X\beta, \sigma^2 I_n)$ in reference cell formulation (cf. eqs. (13.34) and (13.35)). Let this model correspond to the full model and let the reduced model be given in terms of the first column of X and the first entry of β of the full model (cf. eqs. (13.36) and (13.37)). Let further

$$
e := y - X\hat{\beta} \quad\text{and}\quad e_1 := y - X_1\hat{\beta}_1 \tag{13.39}
$$

denote the residuals of the full and reduced models, respectively. Then, with the definitions of the mean squares in the classical variance partitioning one-way ANOVA perspective (cf. (13.32)), it holds that

$$
F = \frac{\mathrm{MS}_{\mathrm{Between}}}{\mathrm{MS}_{\mathrm{Within}}} = \frac{(e_1^T e_1 - e^T e)/(m-1)}{e^T e/(n-m)}. \tag{13.40}
$$

That is, the F-statistic defined in the context of classical variance partitioning (cf. (13.33)) is identical to the F-statistic as introduced in the GLM context (cf. Chapter 9) for p2 := m − 1 and p := m. Thus, under the assumption of independent and identically distributed Gaussian error terms, the F-statistic defined in eq. (13.33) is distributed according to an f-distribution with m − 1 and n − m degrees of freedom. ◦

Proof. We first note that the beta parameter estimator of the reduced model $y \sim N(X_1\beta_1, \sigma^2 I_n)$ evaluates to

$$
\hat{\beta}_1 = (X_1^T X_1)^{-1} X_1^T y = (1_n^T 1_n)^{-1} 1_n^T y = \frac{1}{n}\sum_{i=1}^{m}\sum_{j=1}^{n_i} y_{ij} = \bar{y}. \tag{13.41}
$$

The beta parameter estimator of the reduced model thus corresponds to the grand mean defined in (13.24). Further, the residual sum of squares of the reduced model is given by

$$
e_1^T e_1 = (y - X_1\hat{\beta}_1)^T (y - X_1\hat{\beta}_1) = (y - 1_n\bar{y})^T (y - 1_n\bar{y}) = \sum_{i=1}^{m}\sum_{j=1}^{n_i}(y_{ij} - \bar{y})^2 = \mathrm{SES}_{\mathrm{Total}}. \tag{13.42}
$$

The residual sum of squares of the reduced model thus corresponds to the total sum of error squares as defined in (13.26). We next note that the beta parameter estimator of the full model $y \sim N(X\beta, \sigma^2 I_n)$ is given by

$$
\hat{\beta}
= \begin{pmatrix} \hat{\mu}_0 \\ \hat{\alpha}_2 \\ \vdots \\ \hat{\alpha}_m \end{pmatrix}
= \begin{pmatrix}
\frac{1}{n_1}\sum_{j=1}^{n_1} y_{1j} \\
\frac{1}{n_2}\sum_{j=1}^{n_2} y_{2j} - \frac{1}{n_1}\sum_{j=1}^{n_1} y_{1j} \\
\vdots \\
\frac{1}{n_m}\sum_{j=1}^{n_m} y_{mj} - \frac{1}{n_1}\sum_{j=1}^{n_1} y_{1j}
\end{pmatrix}
= \begin{pmatrix} \bar{y}_1 \\ \bar{y}_2 - \bar{y}_1 \\ \vdots \\ \bar{y}_m - \bar{y}_1 \end{pmatrix}.
\tag{13.43}
$$


The beta parameter estimator of the full model thus corresponds to the vector comprising the first level mean $\bar{y}_1$ and, for i = 2, ..., m, the differences of the ith level mean and the first level mean $\bar{y}_i - \bar{y}_1$ as defined in (13.25). The residual sum of squares of the full model is hence given by

$$
e^T e = (y - X\hat{\beta})^T (y - X\hat{\beta}),
$$

where, by the structure of X and $\hat{\beta}$, the entry of $y - X\hat{\beta}$ corresponding to $y_{1j}$ evaluates to $y_{1j} - \bar{y}_1$ and the entry corresponding to $y_{ij}$ with $i = 2, ..., m$ evaluates to $y_{ij} - \bar{y}_1 - (\bar{y}_i - \bar{y}_1) = y_{ij} - \bar{y}_i$, such that

$$
e^T e = \sum_{i=1}^{m}\sum_{j=1}^{n_i}(y_{ij} - \bar{y}_i)^2 = \mathrm{SES}_{\mathrm{Within}}. \tag{13.44}
$$

The residual sum of squares of the full model thus corresponds to the within-level sum of error squares as defined in (13.28). From Theorem 13.2.1, it then follows immediately that

$$
\mathrm{SES}_{\mathrm{Between}} = \mathrm{SES}_{\mathrm{Total}} - \mathrm{SES}_{\mathrm{Within}} = e_1^T e_1 - e^T e. \tag{13.45}
$$

But then it follows that

$$
F = \frac{\mathrm{MS}_{\mathrm{Between}}}{\mathrm{MS}_{\mathrm{Within}}}
= \frac{\mathrm{SES}_{\mathrm{Between}}/\mathrm{DF}_{\mathrm{Between}}}{\mathrm{SES}_{\mathrm{Within}}/\mathrm{DF}_{\mathrm{Within}}}
= \frac{(e_1^T e_1 - e^T e)/(m-1)}{e^T e/(n-m)}. \tag{13.46}
$$
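The equivalence established in Theorem 13.2.2 can also be illustrated numerically: fitting the reduced and full models by ordinary least squares and forming the residual-sum-of-squares F-statistic of eq. (13.40) recovers the classical mean-squares ratio of eq. (13.33). The following sketch does so for simulated data with an arbitrary group structure.

```python
# Numerical illustration of Theorem 13.2.2: the GLM F-statistic based on the
# residual sums of squares of the reduced and full models equals the classical
# MS_Between / MS_Within ratio. Simulated data; group structure is arbitrary.
import numpy as np

rng = np.random.default_rng(2)
groups = [rng.normal(10.0, 1.0, 6), rng.normal(12.0, 1.0, 8), rng.normal(9.0, 1.0, 5)]
y = np.concatenate(groups)
n, m = y.size, len(groups)

# Full model design matrix in reference cell formulation (cf. eq. (13.35)):
# a constant column plus indicator columns for levels 2, ..., m.
X = np.zeros((n, m))
X[:, 0] = 1.0
offset = groups[0].size
for i, g in enumerate(groups[1:], start=1):
    X[offset:offset + g.size, i] = 1.0
    offset += g.size
X1 = X[:, :1]                                   # reduced model: constant only

def rss(design, y):
    beta_hat = np.linalg.lstsq(design, y, rcond=None)[0]
    e = y - design @ beta_hat
    return e @ e

ete, e1te1 = rss(X, y), rss(X1, y)
f_glm = ((e1te1 - ete) / (m - 1)) / (ete / (n - m))        # eq. (13.40)

# Classical form (eq. (13.33)) for comparison
y_bar = y.mean()
ms_between = sum(g.size * (g.mean() - y_bar) ** 2 for g in groups) / (m - 1)
ms_within = sum(np.sum((g - g.mean()) ** 2) for g in groups) / (n - m)
assert np.isclose(f_glm, ms_between / ms_within)
```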

13.3 Bibliographic remarks

Treatments of one-way ANOVA designs can be found in most introductory statistical textbooks, e.g., DeGroot and Schervish (2012, Chapter 11.6), Casella and Berger (2012, Chapters 11.1-11.2), and Georgii (2009, Chapter 23.5). Comprehensive accounts of one-way ANOVA designs from the perspective of the GLM are provided by Seber and Lee (2003), Hocking (2003), and Rutherford (2011), amongst many others.

13.4 Study questions

1. Provide a verbose account of the reference cell method reformulation for one-way ANOVA designs.
2. Write down the one-way ANOVA GLM in its reference cell, i.e., non-over-parameterized, formulation.
3. Write down the beta parameter estimator of a one-way ANOVA GLM in its reference cell formulation.
4. Define the grand mean and the ith level means of a one-way ANOVA.
5. Define the total, between-level, and within-level sums of error squares of a one-way ANOVA.
6. Define the total, between-level, and within-level degrees of freedom of a one-way ANOVA.
7. Define the total, between-level, and within-level mean squares of a one-way ANOVA.
8. Define the F-statistic in terms of mean squares and discuss its intuition.
9. Write down the reduced and full model one-way ANOVA GLMs, such that the total sum of error squares corresponds to the residual sum of squares of the reduced model and the within-level sum of error squares corresponds to the residual sum of squares of the full model.
10. Write down the F-statistic in its mean squares form and in its residual sum of squares form and explain their equivalence in verbose terms.

14 | Two-way analysis of variance

In multifactorial designs, two or more independent experimental factors are manipulated and all possible combinations of their levels are assessed. Multifactorial designs are usually referred to simply as factorial designs. For example, a typical 2 × 2-factorial design used in functional neuroimaging may involve a stimulus manipulation (e.g., weakly and highly degraded visual stimuli) and a cognitive manipulation (e.g., attended and unattended visual stimuli). Any two-dimensional n × m or higher-dimensional n × m × p × q × ... factorial design is conceivable. Due to experimental constraints and the aim to measure each factorial combination with the same number of experimental trials, 2 × 2-factorial designs are probably the most prevalent designs in functional neuroimaging. Factorial designs allow for measuring (1) the main effects of each factor, i.e., the differential variability in the dependent experimental variable induced by the levels of the respective factor, averaged over the other factors, and (2) the interactions between factors. In intuitive terms, an interaction in a 2 × 2-factorial design refers to a difference in a difference.

Before considering GLM formulations of 2 × 2-factorial designs, we first illuminate the concept of a 2 × 2-factorial design using the example data set of Chapter 11. To this end, we define the experimental factors age and alcohol consumption and allow each of these factors to take on only two levels: at most 31 years of age (factor age, level 1) and older than 31 years (factor age, level 2), as well as at most 5 units of alcohol (factor alcohol consumption, level 1) and more than 5 units (factor alcohol consumption, level 2). Each combination of a specific level of one factor with a specific level of the other factor is referred to as a cell of the design. 2 × 2-factorial designs are commonly depicted using a square lattice as shown in Figure 14.1. According to their position in the square lattice, the factors may also be referred to as the row and column factors, respectively. Average data from the different cells of a 2 × 2-factorial design are commonly depicted as bar graphs (Figure 14.2).

Figure 14.1. Conceptual visualization of an exemplary 2 x 2 factorial design.

For this 2 × 2-ANOVA setting, the following questions may be investigated:

1. Does DLPFC volume change with the age of the participant, irrespective of (i.e., averaged over) whether the participant consumes a lot of or little alcohol? The answer to this question is referred to as the main effect of age.
2. Does DLPFC volume change with the alcohol consumption of the participant, irrespective of (i.e., averaged over) whether the participant is young or old? The answer to this question is referred to as the main effect of alcohol.
3. Does the difference in DLPFC volume observed for the different levels of the age factor change with the different levels of the alcohol factor? Or, vice versa, does the difference in DLPFC volume observed for the different levels of the alcohol factor change between old and young age? This difference in the differences is referred to as the interaction between the age and alcohol factors.

In the following, we first discuss the GLM formulation of a two-way ANOVA design that applies if only the first two questions are of interest. We then extend this formulation to the case that all three questions are of interest.

Figure 14.2. Visualization of a two-way ANOVA design: bar graph of DLPFC volume for the cells Low Alcohol and High Alcohol within the Young Age and Old Age groups (cf. Table 14.1).

                Alcohol Low                              Alcohol High
                P    ijk   DLPFC volume y11k            P    ijk   DLPFC volume y12k
  Age Young     1    111   178.7708                     2    121   168.4660
                5    112   170.1884                     3    122   169.9513
                7    113   175.4092                     4    123   162.0778
                8    114   173.3972                     6    124   156.9287
                11   115   172.1033                     9    125   154.4907
                12   116   162.6648                     10   126   158.3642
                13   117   165.4449                     14   127   142.2121
                15   118   154.3557                     16   128   145.6544

                P    ijk   DLPFC volume y21k            P    ijk   DLPFC volume y22k
  Age Old       17   211   155.5286                     19   221   137.8262
                18   212   150.5144                     22   222   127.1715
                20   213   160.1183                     23   223   138.0237
                21   214   155.4419                     24   224   133.4589
                25   215   139.3813                     27   225   123.7259
                26   216   145.1997                     28   226   130.7300
                30   217   151.1943                     29   227   114.1148
                32   218   140.9424                     31   228   121.7235

Table 14.1. The example data set of Chapter 11 in a 2 × 2 ANOVA layout with the row factor Age taking on the levels Young and Old and the column factor Alcohol taking on the levels Low and High. The column P denotes the original participant label, while the column ijk denotes the reformulated index.

To formulate the two-way ANOVA GLM, it is helpful to adopt a notation that is in accordance with the 2 × 2-factorial design. For the data variables, we use

$$
y_{ijk} \in \mathbb{R}, \quad\text{where } i = 1, ..., r,\; j = 1, ..., c,\;\text{and } k = 1, ..., n_{ij}, \tag{14.1}
$$

to denote the kth data point in the cell corresponding to the combination of the ith level of the row factor with the jth level of the column factor. Each cell comprises $n_{ij} \in \mathbb{N}$ data points. For the special case of a 2 × 2 ANOVA design, we have r = c = 2. Table 14.1 depicts the exemplary data set of Chapter 11 in a 2 × 2 ANOVA layout.
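To make the $y_{ijk}$ indexing concrete, the following sketch arranges the data of Table 14.1 by design cell and computes the four cell means, i.e., the quantities that a bar graph such as Figure 14.2 displays; the dictionary layout is merely one convenient choice.

```python
# Data of Table 14.1 arranged by design cell (i, j): row factor Age
# (1 = Young, 2 = Old), column factor Alcohol (1 = Low, 2 = High).
import numpy as np

cells = {
    (1, 1): [178.7708, 170.1884, 175.4092, 173.3972,
             172.1033, 162.6648, 165.4449, 154.3557],   # y_11k: Young, Low
    (1, 2): [168.4660, 169.9513, 162.0778, 156.9287,
             154.4907, 158.3642, 142.2121, 145.6544],   # y_12k: Young, High
    (2, 1): [155.5286, 150.5144, 160.1183, 155.4419,
             139.3813, 145.1997, 151.1943, 140.9424],   # y_21k: Old, Low
    (2, 2): [137.8262, 127.1715, 138.0237, 133.4589,
             123.7259, 130.7300, 114.1148, 121.7235],   # y_22k: Old, High
}

for (i, j), y_ij in cells.items():
    print(f"cell ({i},{j}): n_ij = {len(y_ij)}, cell mean = {np.mean(y_ij):.2f}")
```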

14.1 An additive two-way ANOVA design

We first consider a 2 × 2 ANOVA design without interaction, i.e., a purely additive setting. To this end, we conceive of the data point $y_{ijk}$ as the realization of a univariate Gaussian random variable distributed according to

$$
y_{ijk} \sim N(\mu_{ij}, \sigma^2) \;\Leftrightarrow\; y_{ijk} = \mu_{ij} + \varepsilon_{ijk}, \quad \varepsilon_{ijk} \sim N(0, \sigma^2) \quad\text{for } k = 1, ..., n_{ij}, \tag{14.2}
$$

where

$$
\mu_{ij} := \mu_0 + \alpha_i + \beta_j. \tag{14.3}
$$


In this formulation, $\mu_0$ represents a constant offset common to all cells, $\alpha_i$, $i = 1, ..., r$, represents the effect of the ith level of the row factor, and $\beta_j$, $j = 1, ..., c$, represents the effect of the jth level of the column factor. The design matrix $X \in \mathbb{R}^{n \times (1+r+c)}$ implementing the model specified in eqs. (14.2) and (14.3) comprises a column of 1's (representing the constant offset) and two sets of indicator variables (representing the $r \in \mathbb{N}$ levels of the row factor and the $c \in \mathbb{N}$ levels of the column factor). The corresponding beta parameter vector then encodes the effect of each level of each factor. We thus have

$$
y \sim N(X\beta, \sigma^2 I_n), \tag{14.4}
$$

where

$$
y := \begin{pmatrix}
y_{111} \\ \vdots \\ y_{11n_{11}} \\ y_{121} \\ \vdots \\ y_{12n_{12}} \\ y_{211} \\ \vdots \\ y_{21n_{21}} \\ y_{221} \\ \vdots \\ y_{22n_{22}}
\end{pmatrix} \in \mathbb{R}^n,
\quad
X = \begin{pmatrix}
1 & 1 & 0 & 1 & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots \\
1 & 1 & 0 & 1 & 0 \\
1 & 1 & 0 & 0 & 1 \\
\vdots & \vdots & \vdots & \vdots & \vdots \\
1 & 1 & 0 & 0 & 1 \\
1 & 0 & 1 & 1 & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots \\
1 & 0 & 1 & 1 & 0 \\
1 & 0 & 1 & 0 & 1 \\
\vdots & \vdots & \vdots & \vdots & \vdots \\
1 & 0 & 1 & 0 & 1
\end{pmatrix} \in \mathbb{R}^{n \times 5},
\quad
\beta := \begin{pmatrix} \mu_0 \\ \alpha_1 \\ \alpha_2 \\ \beta_1 \\ \beta_2 \end{pmatrix} \in \mathbb{R}^5,
\quad\text{and}\quad \sigma^2 > 0. \tag{14.5}
$$

As for the one-way ANOVA, the model thus defined is overparameterized. Effectively, we have five parameters to estimate and four equations for the respective group expectations. Viewed differently, in eq. (14.3) we could add a constant either to each of the αi’s or to each of the βj’s and subtract it from µ0 without altering any of the expected responses µij. We thus require two constraints to obtain an identifiable model. Commonly, these correspond to setting

$$
\alpha_1 := \beta_1 := 0 \tag{14.6}
$$

and thus identifying the combination of the first level of the row factor and the first level of the column factor as the reference cell. The meaning of the remaining parameters is then as provided in Table 14.2 and Table 14.3. The entries in these tables document the expected dependent variable responses for each combination of levels of the row and column factors, in terms of the initial formulation of the additive 2 × 2 ANOVA and in terms of the reference cell method reformulation of the additive 2 × 2 ANOVA, respectively. In the reference cell formulation of Table 14.3, µ0 represents the expected response in the reference cell, α2 represents the effect of level 2 of the row factor compared to its first level for any fixed level of the column factor, and β2 represents the effect of level 2 of the column factor compared to its first level for any fixed level of the row factor. As for the case of the one-way ANOVA, the parameters α2 and β2 thus encode differences in expected values between the design cells. Finally, note that the model is additive in the sense that the effect of each factor is the same at all levels of the other factor. To see this point, consider moving from the first to the second row of Table 14.3: the expected response increases by α2 in the first column as well as in the second column. Likewise, the expected response increases by β2 if one moves from the first to the second column, for both the first and the second row of the table.

        1                     2
 1      µ0 + α1 + β1          µ0 + α1 + β2
 2      µ0 + α2 + β1          µ0 + α2 + β2

Table 14.2. Formulation of an overparameterized two-way additive ANOVA model.

Equivalently, the design matrix defined in eq. (14.5) is not of full column rank, because the row factor indicator variables (columns 2 and 3) as well as the column factor indicator variables (columns 4 and 5) each add up to the constant offset indicator (column 1). The two required constraints correspond to omitting the indicator variables corresponding to the first row factor level and to the first column factor level. This results in the following reformulation of the GLM for the 2 × 2 ANOVA layout:

$$
y \sim N(X\beta, \sigma^2 I_n), \tag{14.7}
$$


        1                2
 1      µ0               µ0 + β2
 2      µ0 + α2          µ0 + α2 + β2

Table 14.3. Reference cell reformulation of the two-way additive ANOVA model.

where

$$
y := \begin{pmatrix}
y_{111} \\ \vdots \\ y_{11n_{11}} \\ y_{121} \\ \vdots \\ y_{12n_{12}} \\ y_{211} \\ \vdots \\ y_{21n_{21}} \\ y_{221} \\ \vdots \\ y_{22n_{22}}
\end{pmatrix} \in \mathbb{R}^n,
\quad
X = \begin{pmatrix}
1 & 0 & 0 \\
\vdots & \vdots & \vdots \\
1 & 0 & 0 \\
1 & 0 & 1 \\
\vdots & \vdots & \vdots \\
1 & 0 & 1 \\
1 & 1 & 0 \\
\vdots & \vdots & \vdots \\
1 & 1 & 0 \\
1 & 1 & 1 \\
\vdots & \vdots & \vdots \\
1 & 1 & 1
\end{pmatrix} \in \mathbb{R}^{n \times 3},
\quad
\beta := \begin{pmatrix} \mu_0 \\ \alpha_2 \\ \beta_2 \end{pmatrix} \in \mathbb{R}^3,
\quad\text{and}\quad \sigma^2 > 0. \tag{14.8}
$$

Based on the formulation of the two-way ANOVA design in (14.8), one may use the contrast vectors $c_\alpha = (0, 1, 0)^T$ and $c_\beta = (0, 0, 1)^T$ to statistically assess the main effects of Age and Alcohol for the data of Table 14.1 using t-tests. The corresponding T-statistics evaluate to $T_{\beta,c_\alpha} = -7.88$ and $T_{\beta,c_\beta} = -5.43$, respectively. Because under the null hypotheses $\alpha_2 \in \{0\}$ and $\beta_2 \in \{0\}$ the associated probabilities are given by $P(T_{\beta,\alpha_2} \ge |-7.88|) < 0.001$ and $P(T_{\beta,\beta_2} \ge |-5.43|) < 0.001$, both main effects would be declared significant if a test significance level of $\alpha_0 = 0.05$ was desired in a two-sided test scenario.
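The reported T-statistics can be recomputed from first principles. The following sketch assembles the reference cell design matrix of eq. (14.8) for the data of Table 14.1 and evaluates the contrast-based T-statistics; it assumes the standard OLS-based form of the GLM T-statistic of Chapter 9, i.e., $T = c^T\hat{\beta}/\sqrt{\hat{\sigma}^2 c^T(X^TX)^{-1}c}$ with $\hat{\sigma}^2 = e^Te/(n-p)$, and under this assumption should yield values close to those reported above.

```python
# Additive 2 x 2 ANOVA GLM in reference cell formulation (eq. (14.8)) for the
# data of Table 14.1, with contrast-based T-statistics for the two main effects.
# Assumes the standard OLS-based T-statistic (cf. Chapter 9).
import numpy as np
from scipy import stats

y11 = [178.7708, 170.1884, 175.4092, 173.3972, 172.1033, 162.6648, 165.4449, 154.3557]
y12 = [168.4660, 169.9513, 162.0778, 156.9287, 154.4907, 158.3642, 142.2121, 145.6544]
y21 = [155.5286, 150.5144, 160.1183, 155.4419, 139.3813, 145.1997, 151.1943, 140.9424]
y22 = [137.8262, 127.1715, 138.0237, 133.4589, 123.7259, 130.7300, 114.1148, 121.7235]
y = np.array(y11 + y12 + y21 + y22)
n = y.size

# Design matrix columns: constant, row indicator (Age = Old), column indicator
# (Alcohol = High); cf. eq. (14.8).
age = np.repeat([0, 0, 1, 1], [len(y11), len(y12), len(y21), len(y22)])
alc = np.repeat([0, 1, 0, 1], [len(y11), len(y12), len(y21), len(y22)])
X = np.column_stack([np.ones(n), age, alc])
p = X.shape[1]

beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ beta_hat
sigma2_hat = (e @ e) / (n - p)
XtX_inv = np.linalg.inv(X.T @ X)

for name, c in [("alpha_2 (Age)", np.array([0.0, 1.0, 0.0])),
                ("beta_2 (Alcohol)", np.array([0.0, 0.0, 1.0]))]:
    t = (c @ beta_hat) / np.sqrt(sigma2_hat * c @ XtX_inv @ c)
    p_two = 2 * stats.t.sf(np.abs(t), df=n - p)
    print(f"{name}: T = {t:.2f}, two-sided p = {p_two:.4f}")
```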

14.2 A two-way ANOVA design with interaction

In order to allow for the modelling of interaction effects in 2 × 2-factorial designs, the GLM of the previous section is modified as follows:

$$
y_{ijk} \sim N(\mu_{ij}, \sigma^2) \;\Leftrightarrow\; y_{ijk} = \mu_{ij} + \varepsilon_{ijk}, \quad \varepsilon_{ijk} \sim N(0, \sigma^2) \quad\text{for } k = 1, ..., n_{ij}, \tag{14.9}
$$

where we now define

$$
\mu_{ij} := \mu_0 + \alpha_i + \beta_j + (\alpha\beta)_{ij}. \tag{14.10}
$$

In this formulation, the first three terms are familiar: $\mu_0$ is a constant, and $\alpha_i$ and $\beta_j$ are the main effects of the levels $i = 1, ..., r$ of the row factor and $j = 1, ..., c$ of the column factor, respectively. The new term $(\alpha\beta)_{ij}$ is an interaction effect. It represents the effect of the combination of levels $i$ and $j$ of the row and column factors. The notation $(\alpha\beta)$ should be understood as a single symbol, not as a product. One could have chosen $\gamma_{ij}$ to denote this interaction effect, but the notation $(\alpha\beta)_{ij}$ is more suggestive and reminds us that the term corresponds to an effect due to the combination of levels $i$ and $j$ of each factor. Table 14.4 displays the parameters of the two-way ANOVA with interaction.

        1                              2
 1      µ0 + α1 + β1 + (αβ)11          µ0 + α1 + β2 + (αβ)12
 2      µ0 + α2 + β1 + (αβ)21          µ0 + α2 + β2 + (αβ)22

Table 14.4. Formulation of an overparameterized two-way ANOVA GLM model with interaction.

In design matrix form, eqs. (14.9) and (14.10) correspond to

$$
y \sim N(X\beta, \sigma^2 I_n), \tag{14.11}
$$

where

$$
y := \begin{pmatrix}
y_{111} \\ \vdots \\ y_{11n_{11}} \\ y_{121} \\ \vdots \\ y_{12n_{12}} \\ y_{211} \\ \vdots \\ y_{21n_{21}} \\ y_{221} \\ \vdots \\ y_{22n_{22}}
\end{pmatrix} \in \mathbb{R}^n,
\quad
X = \begin{pmatrix}
1 & 1 & 0 & 1 & 0 & 1 & 0 & 0 & 0 \\
\vdots & & & & & & & & \vdots \\
1 & 1 & 0 & 1 & 0 & 1 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 \\
\vdots & & & & & & & & \vdots \\
1 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 \\
1 & 0 & 1 & 1 & 0 & 0 & 0 & 1 & 0 \\
\vdots & & & & & & & & \vdots \\
1 & 0 & 1 & 1 & 0 & 0 & 0 & 1 & 0 \\
1 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 \\
\vdots & & & & & & & & \vdots \\
1 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1
\end{pmatrix} \in \mathbb{R}^{n \times 9},
\quad
\beta := \begin{pmatrix} \mu_0 \\ \alpha_1 \\ \alpha_2 \\ \beta_1 \\ \beta_2 \\ (\alpha\beta)_{11} \\ (\alpha\beta)_{12} \\ (\alpha\beta)_{21} \\ (\alpha\beta)_{22} \end{pmatrix} \in \mathbb{R}^9,
\quad \sigma^2 > 0. \tag{14.12}
$$

Notably, the second and third columns, the fourth and fifth columns, and the sixth to ninth columns of the design matrix each add up to the first column, rendering the design matrix rank-deficient with multiple linear dependencies among its columns. We thus re-express (14.9) and (14.10) in terms of an extended reference cell method, which sets all parameters involving the first row or the first column of the two-way layout to zero, i.e.,

$$
\alpha_1 := \beta_1 := (\alpha\beta)_{i1} := (\alpha\beta)_{1j} := 0 \quad\text{for } i = 1, ..., r,\; j = 1, ..., c. \tag{14.13}
$$

The meaning of the remaining parameters can then be read off Table 14.5. Again, µ0 represents the expected response in the reference cell. The main effects now assume a more specific meaning: α2 is the expected difference due to level 2 of the row factor, compared to level 1, when the column factor is at level 1. β2 is the expected difference due to level 2 of the column factor, compared to level 1, when the row factor is at level 1. The interaction term (αβ)22 is the additional effect of level 2 of the row factor, compared to level 1, when the column factor is at level 2 rather than 1. This term can also be interpreted as the additional effect of level 2 of the column factor, compared to level 1, when the row factor is at level 2 rather than 1. The key feature of this model is that the effect of one factor now depends on the level of the other factor. For example, the effect of level 2 of the row factor, compared to level 1, is α2 in the first column and α2 + (αβ)22 in the second column.

        1                2
 1      µ0               µ0 + β2
 2      µ0 + α2          µ0 + α2 + β2 + (αβ)22

Table 14.5. Reference cell method reformulation of the two-way ANOVA GLM model with interaction.

The reformulated design matrix representation of the 2 × 2-ANOVA GLM with interaction is of size

$$
n \times \left(1 + (r-1) + (c-1) + (r-1)(c-1)\right). \tag{14.14}
$$

Specifically, it comprises a column of ones representing the constant offset µ0, a set of (r − 1) indicator variables representing the row effects, a set of (c − 1) indicator variables representing the column effects, and a set of (r − 1)(c − 1) indicator variables representing the interactions. The easiest way to compute the values of the interaction indicator variables is as products of the row and column indicator variable values. In other words, if ri takes the value 1 for observations in row i and 0 otherwise, and cj takes the value 1 for observations in column j and 0 otherwise, then the product ricj takes the value 1 for observations that are in row i and column j, and 0 for all others. We thus have

$$
y \sim N(X\beta, \sigma^2 I_n), \tag{14.15}
$$

where

$$
y := \begin{pmatrix}
y_{111} \\ \vdots \\ y_{11n_{11}} \\ y_{121} \\ \vdots \\ y_{12n_{12}} \\ y_{211} \\ \vdots \\ y_{21n_{21}} \\ y_{221} \\ \vdots \\ y_{22n_{22}}
\end{pmatrix} \in \mathbb{R}^n,
\quad
X = \begin{pmatrix}
1 & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots \\
1 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 \\
\vdots & \vdots & \vdots & \vdots \\
1 & 0 & 1 & 0 \\
1 & 1 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots \\
1 & 1 & 0 & 0 \\
1 & 1 & 1 & 1 \\
\vdots & \vdots & \vdots & \vdots \\
1 & 1 & 1 & 1
\end{pmatrix} \in \mathbb{R}^{n \times 4},
\quad
\beta := \begin{pmatrix} \mu_0 \\ \alpha_2 \\ \beta_2 \\ (\alpha\beta)_{22} \end{pmatrix} \in \mathbb{R}^4,
\quad\text{and}\quad \sigma^2 > 0. \tag{14.16}
$$

This design is identifiable, and parameter estimation and inference can proceed in the standard fashion. Based on the formulation of the two-way ANOVA design in (14.16), one may use the contrast vectors $c_\alpha = (0, 1, 0, 0)^T$, $c_\beta = (0, 0, 1, 0)^T$, and $c_{\alpha\beta} = (0, 0, 0, 1)^T$ to statistically assess the main effects and their interaction for the data of Table 14.1 using t-tests. The corresponding T-statistics evaluate to $T_{\beta,c_\alpha} = -4.58$, $T_{\beta,c_\beta} = -2.80$, and $T_{\beta,c_{\alpha\beta}} = -1.63$, respectively. Because under the null hypotheses $\alpha_2 \in \{0\}$, $\beta_2 \in \{0\}$, and $(\alpha\beta)_{22} \in \{0\}$ the associated probabilities are given by $P(T_{\beta,\alpha_2} \ge |-4.58|) < 0.001$, $P(T_{\beta,\beta_2} \ge |-2.80|) < 0.005$, and $P(T_{\beta,(\alpha\beta)_{22}} \ge |-1.626|) = 0.06$, the main effects would be declared significant, whereas their interaction would not, if a test significance level of $\alpha_0 = 0.05$ was desired in a two-sided test scenario.
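A corresponding sketch for the interaction model of eq. (14.16) differs from the additive one only in the additional design matrix column, computed as the elementwise product of the row and column indicator variables; again, the standard OLS-based T-statistic of Chapter 9 is assumed.

```python
# 2 x 2 ANOVA GLM with interaction in reference cell formulation (eq. (14.16))
# for the data of Table 14.1; the interaction column is the elementwise product
# of the row and column indicator variables. Standard OLS-based T-statistics.
import numpy as np
from scipy import stats

y11 = [178.7708, 170.1884, 175.4092, 173.3972, 172.1033, 162.6648, 165.4449, 154.3557]
y12 = [168.4660, 169.9513, 162.0778, 156.9287, 154.4907, 158.3642, 142.2121, 145.6544]
y21 = [155.5286, 150.5144, 160.1183, 155.4419, 139.3813, 145.1997, 151.1943, 140.9424]
y22 = [137.8262, 127.1715, 138.0237, 133.4589, 123.7259, 130.7300, 114.1148, 121.7235]
y = np.array(y11 + y12 + y21 + y22)
n = y.size

age = np.repeat([0, 0, 1, 1], [len(y11), len(y12), len(y21), len(y22)])  # r_2 indicator
alc = np.repeat([0, 1, 0, 1], [len(y11), len(y12), len(y21), len(y22)])  # c_2 indicator
X = np.column_stack([np.ones(n), age, alc, age * alc])                   # cf. eq. (14.16)
p = X.shape[1]

beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ beta_hat
sigma2_hat = (e @ e) / (n - p)
XtX_inv = np.linalg.inv(X.T @ X)

contrasts = {"alpha_2": [0, 1, 0, 0], "beta_2": [0, 0, 1, 0], "(alpha beta)_22": [0, 0, 0, 1]}
for name, c in contrasts.items():
    c = np.asarray(c, dtype=float)
    t = (c @ beta_hat) / np.sqrt(sigma2_hat * c @ XtX_inv @ c)
    p_two = 2 * stats.t.sf(np.abs(t), df=n - p)
    print(f"{name}: T = {t:.2f}, two-sided p = {p_two:.3f}")
```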

14.3 Bibliographic remarks

Treatments of two-way ANOVA designs can be found in most introductory statistical textbooks, e.g., DeGroot and Schervish (2012, Chapters 11.7 and 11.8). Comprehensive accounts of two-way ANOVA designs from the perspective of the GLM are provided by Seber and Lee (2003), Hocking (2003), and Rutherford (2011), amongst many others.

14.4 Study questions

1. Explain the terms factorial design and 2 × 2 factorial design.
2. Explain the terms main effect and interaction in factorial designs.
3. Write down the structural form of a purely additive two-way ANOVA design upon reference cell formulation.
4. Write down the GLM form of a purely additive 2 × 2 ANOVA design upon reference cell formulation.
5. Write down the structural form of a two-way ANOVA design with interaction upon reference cell formulation.
6. Write down the GLM form of a 2 × 2 ANOVA design with interaction upon reference cell formulation.


References

Abbott, S. (2015). Understanding Analysis. Undergraduate Texts in Mathematics. Springer New York, New York, NY.
Aldrich, J. (1997). R.A. Fisher and the making of maximum likelihood 1912-1922. Statistical Science, 12(3):162–176.
Anderson, T. W. (2003). An Introduction to Multivariate Statistical Analysis. Wiley Series in Probability and Statistics. Wiley-Interscience, Hoboken, NJ, 3rd edition.
Barber, D. (2012). Bayesian Reasoning and Machine Learning. Cambridge University Press.
Billingsley, P. (1995). Probability and Measure. Wiley Series in Probability and Mathematical Statistics. Wiley, New York, 3rd edition.
Bishop, C. M. (2006). Pattern Recognition and Machine Learning. Information Science and Statistics. Springer, New York.
Cantor, G. (1892). Über eine Eigenschaft des Inbegriffes aller reellen algebraischen Zahlen. Jahresbericht der Deutschen Mathematiker-Vereinigung, 1.
Cantor, G. (1895). Beiträge zur Begründung der transfiniten Mengenlehre. Mathematische Annalen, 46(4):481–512.
Casella, G. and Berger, R. (2012). Statistical Inference. Duxbury.
Christensen, R. (2011). Plane Answers to Complex Questions. Springer Texts in Statistics. Springer New York, New York, NY.
Czado, C. and Schmidt, T. (2011). Mathematische Statistik. Statistik und ihre Anwendungen. Springer, Berlin.
DeGroot, M. H. and Schervish, M. J. (2012). Probability and Statistics. Addison-Wesley, Boston, 4th edition.
Draper, N. and Smith, H. (1998). Applied Regression Analysis. Wiley-Interscience.
Efron, B. and Hastie, T. (2016). Computer Age Statistical Inference: Algorithms, Evidence, and Data Science. Cambridge University Press, first edition.
Fisher, R. A. (1925). Statistical Methods for Research Workers. Oliver & Boyd.
Fristedt, B. E. and Gray, L. F. (1998). A Modern Approach to Probability Theory. Birkhäuser Publishing Ltd.
Friston, K. J., editor (2007). Statistical Parametric Mapping: The Analysis of Functional Brain Images. Elsevier/Academic Press, Amsterdam; Boston, 1st edition.
Georgii, H.-O. (2009). Stochastik: Einführung in die Wahrscheinlichkeitstheorie und Statistik. De-Gruyter-Lehrbuch. de Gruyter, Berlin, 4th, revised and extended edition.
Gigerenzer, G. (2004). Mindless statistics. The Journal of Socio-Economics, 33(5):587–606.
Hays, W. (1994). Statistics. Harcourt Brace College Publishers, fifth edition.
Hocking, R. (2003). Methods and Applications of Linear Models - Regression and the Analysis of Variance. Wiley.
Horn, R. A. and Johnson, C. R. (2012). Matrix Analysis. Cambridge University Press, Cambridge; New York, 2nd edition.
Huettel, S. A., Song, A. W., and McCarthy, G. (2014). Functional Magnetic Resonance Imaging. Sinauer Associates, Sunderland, MA, 3rd edition.
Jezzard, P., Matthews, P. M., and Smith, S. M., editors (2001). Functional MRI: An Introduction to Methods. Oxford University Press, Oxford; New York.
Kolmogorov, A. N. (1956). Foundations of the Theory of Probability. Chelsea Pub Co.


Lehmann, E. L. and Romano, J. P. (2005). Testing Statistical Hypotheses. Springer Texts in Statistics. Springer, New York, 3rd edition.
Leithold, L. (1976). The Calculus, with Analytic Geometry. Harper & Row, New York, 3rd edition.
Magnus, J. R. and Neudecker, H. (1989). Matrix Differential Calculus with Applications in Statistics and Econometrics. Journal of the American Statistical Association, 84(408):1103.
Murphy, K. P. (2012). Machine Learning: A Probabilistic Perspective. Adaptive Computation and Machine Learning Series. MIT Press, Cambridge, MA.
Neyman, J. and Pearson, E. S. (1933). On the problem of the most efficient tests of statistical hypotheses. Phil. Trans. R. Soc. Lond. A, 231(694-706):289–337.
Pearson, K. (1900). On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, 50(302):157–175.
Poldrack, R. A., Nichols, T., and Mumford, J. (2011). Handbook of Functional MRI Data Analysis. Cambridge University Press, Cambridge.
Press, W. (2007). Numerical Recipes: The Art of Scientific Computing. Cambridge University Press.
Rao, C. R. (2002). Linear Statistical Inference and Its Applications. Wiley Series in Probability and Statistics. Wiley, New York, 2nd edition (paperback).
Rodríguez, G. (2007). Lecture Notes on Generalized Linear Models.
Rosenthal, J. S. (2006). A First Look at Rigorous Probability Theory. World Scientific, Singapore; Hackensack, NJ, 2nd edition.
Rutherford, A. (2011). ANOVA and ANCOVA: A GLM Approach. Wiley.
Searle, S. (1982). Matrix Algebra Useful for Statistics. Wiley-Interscience.
Seber, G. (2015). The Linear Model and Hypothesis. Springer Series in Statistics. Springer International Publishing, Cham.
Seber, G. A. F. and Lee, A. J. (2003). Linear Regression Analysis. Wiley Series in Probability and Statistics. Wiley-Interscience, Hoboken, NJ, 2nd edition.
Shao, J. (2003). Mathematical Statistics. Springer Texts in Statistics. Springer, New York, 2nd edition.
Spivak, M. (2008). Calculus. Publish or Perish, Inc., fourth edition.
Strang, G. (2009). Introduction to Linear Algebra.
Student (1908). The Probable Error of a Mean. Biometrika, 6(1):1–25.
Uludag, K., Ugurbil, K., and Berliner, L., editors (2015). fMRI: From Nuclear Spins to Brain Functions. Number 30 in Biological Magnetic Resonance. Springer, New York, NY.
Wacholder, S., Chanock, S., Garcia-Closas, M., El ghormli, L., and Rothman, N. (2004). Assessing the Probability That a Positive Report is False: An Approach for Molecular Epidemiology Studies. JNCI Journal of the National Cancer Institute, 96(6):434–442.
Wasserman, L. (2004). All of Statistics.
