Matrix Approach to Linear Regression

Matrix Approach to Linear Regression Dr. Frank Wood Frank Wood, [email protected] Linear Regression Models Lecture 11, Slide 1 Random Vectors and Matrices • Let’s say we have a vector consisting of three random variables The expectation of a random vector is defined Frank Wood, [email protected] Linear Regression Models Lecture 11, Slide 2 Expectation of a Random Matrix • The expectation of a random matrix is defined similarly Frank Wood, [email protected] Linear Regression Models Lecture 11, Slide 3 Covariance Matrix of a Random Vector • The collection of variances and covariances of and between the elements of a random vector can be collection into a matrix called the covariance matrix remember so the covariance matrix is symmetric Frank Wood, [email protected] Linear Regression Models Lecture 11, Slide 4 Derivation of Covariance Matrix • In vector terms the covariance matrix is defined by because verify first entry Frank Wood, [email protected] Linear Regression Models Lecture 11, Slide 5 Regression Example • Take a regression example with n=3 with constant error terms σ {ǫi} = σ and are ≠ uncorrelated so that σ {ǫi, ǫj} = 0 for all i j • The covariance matrix for the random vector ǫǫǫ is which can be written as Frank Wood, [email protected] Linear Regression Models Lecture 11, Slide 6 Basic Results • If A is a constant matrix and Y is a random matrix then is a random matrix Frank Wood, [email protected] Linear Regression Models Lecture 11, Slide 7 Multivariate Normal Density • Let Y be a vector of p observations • Let µ be a vector of p means for each of the p observations Frank Wood, [email protected] Linear Regression Models Lecture 11, Slide 8 Multivariate Normal Density • Let § be the covariance matrix of Y • Then the multivariate normal density is given by Frank Wood, [email protected] Linear Regression Models Lecture 11, Slide 9 Example 2d Multivariate Normal Distribution 0.04 0.02 0 10 mvnpdf([0 0], [10 2;2 2]) 8 6 4 2 10 0 8 6 -2 4 -4 2 0 -6 -2 -4 y -8 -6 Run multivariate_normal_plots.m -8 -10 x -10 Frank Wood, [email protected] Linear Regression Models Lecture 11, Slide 10 Matrix Simple Linear Regression • Nothing new – only matrix formalism for previous results • Remember the normal error regression model • This implies Frank Wood, [email protected] Linear Regression Models Lecture 11, Slide 11 Regression Matrices • If we identify the following matrices • We can write the linear regression equations in a compact form Frank Wood, [email protected] Linear Regression Models Lecture 11, Slide 12 Regression Matrices • Of course, in the normal regression model the expected value of each of the ǫi’s is zero, we can write • This is because Frank Wood, [email protected] Linear Regression Models Lecture 11, Slide 13 Error Covariance • Because the error terms are independent and have constant variance σ Frank Wood, [email protected] Linear Regression Models Lecture 11, Slide 14 Matrix Normal Regression Model • In matrix terms the normal regression model can be written as where and Frank Wood, [email protected] Linear Regression Models Lecture 11, Slide 15 Least Squares Estimation • Starting from the normal equations you have derived we can see that these equations are equivalent to the following matrix operations with demonstrate this on board Frank Wood, [email protected] Linear Regression Models Lecture 11, Slide 16 Estimation • We can solve this equation (if the inverse of X’X exists) by the following and since we have Frank Wood, [email protected] Linear Regression Models Lecture 11, Slide 17 Least Squares Solution • The matrix normal equations can be derived directly from the minimization of w.r.t. to βββ Do this on board. Frank Wood, [email protected] Linear Regression Models Lecture 11, Slide 18 Fitted Values and Residuals • Let the vector of the fitted values be in matrix notation we then have Frank Wood, [email protected] Linear Regression Models Lecture 11, Slide 19 Hat Matrix – Puts hat on Y • We can also directly express the fitted values in terms of only the X and Y matrices and we can further define H, the “hat matrix” • The hat matrix plans an important role in diagnostics for regression analysis. write H on board Frank Wood, [email protected] Linear Regression Models Lecture 11, Slide 20 Hat Matrix Properties • The hat matrix is symmetric • The hat matrix is idempotent, i.e. demonstrate on board Frank Wood, [email protected] Linear Regression Models Lecture 11, Slide 21 Residuals • The residuals, like the fitted values of \hat{Y_i} can be expressed as linear combinations of the response variable observations Y i Frank Wood, [email protected] Linear Regression Models Lecture 11, Slide 22 Covariance of Residuals • Starting with we see that but which means that and since I-H is idempotent (check) we have we can plug in MSE for σ as an estimate Frank Wood, [email protected] Linear Regression Models Lecture 11, Slide 23 ANOVA • We can express the ANOVA results in matrix form as well, starting with where leaving J is matrix of all ones, do 3x3 example Frank Wood, [email protected] Linear Regression Models Lecture 11, Slide 24 SSE • Remember • We have derive this on board and this b • Simplified Frank Wood, [email protected] Linear Regression Models Lecture 11, Slide 25 SSR • It can be shown that – for instance, remember SSR = SSTO-SSE write these on board Frank Wood, [email protected] Linear Regression Models Lecture 11, Slide 26 Tests and Inference • The ANOVA tests and inferences we can perform are the same as before • Only the algebraic method of getting the quantities changes • Matrix notation is a writing short-cut, not a computational shortcut Frank Wood, [email protected] Linear Regression Models Lecture 11, Slide 27 Quadratic Forms • The ANOVA sums of squares can be shown to be quadratic forms. An example of a quadratic form is given by • Note that this can be expressed in matrix notation as (where A is a symmetric matrix) do on board Frank Wood, [email protected] Linear Regression Models Lecture 11, Slide 28 Quadratic Forms • In general, a quadratic form is defined by A is the matrix of the quadratic form. • The ANOVA sums SSTO, SSE, and SSR are all quadratic forms. Frank Wood, [email protected] Linear Regression Models Lecture 11, Slide 29 ANOVA quadratic forms • Consider the following rexpression of b’X’ • With this it is easy to see that Frank Wood, [email protected] Linear Regression Models Lecture 11, Slide 30 Inference • We can derive the sampling variance of the β vector estimator by remembering that where A is a constant matrix which yields Frank Wood, [email protected] Linear Regression Models Lecture 11, Slide 31 Variance of b • Since is symmetric we can write and thus Frank Wood, [email protected] Linear Regression Models Lecture 11, Slide 32 Variance of b • Of course this assumes that we know σ. If we don’t, we, as usual, replace it with the MSE. Frank Wood, [email protected] Linear Regression Models Lecture 11, Slide 33 Mean Response • To estimate the mean response we can create the following matrix • The fit (or prediction) is then since Frank Wood, [email protected] Linear Regression Models Lecture 11, Slide 34 Variance of Mean Response • Is given by and is arrived at in the same way as for the variance of \beta • Similarly the estimated variance in matrix notation is given by Frank Wood, [email protected] Linear Regression Models Lecture 11, Slide 35 Wrap-Up • Expectation and variance of random vector and matrices • Simple linear regression in matrix form • Next: multiple regression Frank Wood, [email protected] Linear Regression Models Lecture 11, Slide 36.

Matrix Approach to Linear Regression

LECTURES 2 - 3 : Stochastic Processes, Autocorrelation Function

Spatial Autocorrelation: Covariance and Semivariance Semivariance

Fast Estimation of the Median Covariation Matrix with Application to Online Robust Principal Components Analysis

Characteristics and Statistics of Digital Remote Sensing Imagery (1)

Covariance of Cross-Correlations: Towards Efﬁcient Measures for Large-Scale Structure

On the Use of the Autocorrelation and Covariance Methods for Feedforward Control of Transverse Angle and Position Jitter in Linear Particle Beam Accelerators*

Lecture 4 Multivariate Normal Distribution and Multivariate CLT

Lecture 2 Covariance Is the Statistical Measure That Indicates The

Week 7: Multiple Regression

Stat 111 Spring 2011 Week 3 Central Limit Theorem, Covariance, T Distribution Presentations 1. Prove the Central Limit Theorem I

Linear Regression in Matrix Form

The Truth About Linear Regression