PubH 7405:

REGRESSION IN MATRIX TERMS

A "matrix" is a display of numbers or numerical quantities laid out in a rectangular array of rows and columns. The array, or two-way table of numbers, could be rectangular or square; it could be just one row (a row matrix or row vector) or one column (a column matrix or column vector). When it is square, it could be symmetric or a "diagonal matrix" (the only non-zero entries are on the main diagonal). The numbers of rows and of columns form the "dimension" of a matrix; for example, a "3x2" matrix has three rows and two columns. An "entry" or "element" of a matrix needs two subscripts for identification: the first for the row number and the second for the column number:

A = [a_{ij}]

For example, in the following matrix we have a_{11} = 16,000 and a_{32} = 35:

A = \begin{bmatrix} 16,000 & 23 \\ 33,000 & 47 \\ 21,000 & 35 \end{bmatrix}

BASIC SL REGRESSION MATRICES

Y = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix} \qquad X = \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}

X is special; it is called the "Design Matrix". X, the Design Matrix for MLR:

X_{n \times (k+1)} = \begin{bmatrix} 1 & x_{11} & x_{21} & \cdots & x_{k1} \\ 1 & x_{12} & x_{22} & \cdots & x_{k2} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_{1n} & x_{2n} & \cdots & x_{kn} \end{bmatrix}

First subscript: variable (one column for each predictor); second subscript: subject (one row for each subject). The dimension of the Design Matrix X is changed to handle more predictors: one column for each predictor (the number of rows is still the sample size). The first column (filled with "1") is still "optional"; it is not included when doing "Regression through the origin" (i.e. no intercept). X is called the Design Matrix. There are two reasons for the name: (1) by the model, the values of the X's (columns) are under the control of the investigators: entries are fixed/designed; (2) the design/choice is consequential: the larger the variation in the x's in each column, the more precise the estimate of the slope.

TRANSPOSE

The transpose of a matrix A is another matrix, denoted by A' (or A^T), that is obtained by interchanging the columns and the rows of the matrix A; that is:

If A = [a_{ij}], then A' = [a_{ji}]. For example:

A_{3 \times 2} = \begin{bmatrix} 2 & 5 \\ 7 & 10 \\ 3 & 4 \end{bmatrix} \qquad A'_{2 \times 3} = \begin{bmatrix} 2 & 7 & 3 \\ 5 & 10 & 4 \end{bmatrix}

The roles of rows and columns are swapped; if A is a symmetric matrix (a_{ij} = a_{ji}), we have A = A'. The transpose of a column vector is a row vector:

C = \begin{bmatrix} 3 \\ -2 \\ 0 \\ 12 \end{bmatrix} \qquad C' = \begin{bmatrix} 3 & -2 & 0 & 12 \end{bmatrix}

MULTIPLICATION by a Scalar

• Again, a "scalar" is an (ordinary) number.
• In multiplying a matrix by a scalar, every element of the matrix is multiplied by that scalar.
• The result is a new matrix with the same dimension:

A = [a_{ij}]

kA = [k\,a_{ij}]

MULTIPLICATION of Matrices

• Multiplication of a matrix by another matrix is much more complicated.
• First, the "order" is important; "AxB" is said to be "A is post-multiplied by B" or "B is pre-multiplied by A". In general, AxB ≠ BxA.
• There is a strong requirement on the dimensions: AxB is defined only if the number of columns of A is equal to the number of rows of B.
• The product AxB has, as its dimension, the number of rows of A and the number of columns of B.

MULTIPLICATION FORMULA

A_{r \times c} B_{c \times s} = (AB)_{r \times s}

[a_{ij}]_{r \times c} \, [b_{ij}]_{c \times s} = \left[ \sum_{k=1}^{c} a_{ik} b_{kj} \right]_{r \times s}
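As a quick numerical check of this formula, here is a minimal numpy sketch; the matrices A and B are made up purely for illustration:

```python
import numpy as np

# Entry (i,j) of AB is the sum over k of a[i,k]*b[k,j].
# A is 2x3, B is 3x2, so the product AB is 2x2.
A = np.array([[1., 2., 3.],
              [4., 5., 6.]])
B = np.array([[7., 8.],
              [9., 10.],
              [11., 12.]])

r, c = A.shape           # r x c
c2, s = B.shape          # c x s; columns of A must equal rows of B
assert c == c2

AB = np.zeros((r, s))
for i in range(r):
    for j in range(s):
        AB[i, j] = sum(A[i, k] * B[k, j] for k in range(c))

print(AB)                # element-by-element formula
print(A @ B)             # numpy's built-in product gives the same matrix
```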

For entry (i,j) of AxB, we multiply row “i” of A by column “j” of B; That’s why the number of columns of A should be the same as the number of rows of B. EXAMPLES

\begin{bmatrix} 4 & 2 \\ 5 & 8 \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} 4a_1 + 2a_2 \\ 5a_1 + 8a_2 \end{bmatrix}

\begin{bmatrix} 2 & 3 & 5 \end{bmatrix} \begin{bmatrix} 2 \\ 3 \\ 5 \end{bmatrix} = [2^2 + 3^2 + 5^2] = [38]

\begin{bmatrix} 2 & 5 \\ 4 & 1 \end{bmatrix} \begin{bmatrix} 4 & 6 \\ 5 & 8 \end{bmatrix} = \begin{bmatrix} 33 & 52 \\ 21 & 32 \end{bmatrix}

OPERATION ON SLR BASIC MATRICES

Y'Y = \begin{bmatrix} y_1 & y_2 & \cdots & y_n \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} = \left[ \sum y_i^2 \right]

X'Y = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_n \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} \sum y_i \\ \sum x_i y_i \end{bmatrix}

"Order" is important; we cannot form YX'.

OPERATION ON MLR BASIC DATA MATRICES

Y'Y = \begin{bmatrix} y_1 & y_2 & \cdots & y_n \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} = \left[ \sum y_i^2 \right]

X'Y = \begin{bmatrix} 1 & 1 & 1 & \cdots & 1 \\ x_{11} & x_{12} & x_{13} & \cdots & x_{1n} \\ \vdots & \vdots & \vdots & & \vdots \\ x_{k1} & x_{k2} & x_{k3} & \cdots & x_{kn} \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} \sum y_i \\ \sum x_{1i} y_i \\ \vdots \\ \sum x_{ki} y_i \end{bmatrix}

MORE REGRESSION EXAMPLE

X'X = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_n \end{bmatrix} \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix} = \begin{bmatrix} n & \sum x_i \\ \sum x_i & \sum x_i^2 \end{bmatrix}
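These products are easy to verify numerically. A small sketch, using hypothetical x and y values, checks X'X, X'Y, and Y'Y against the summation formulas:

```python
import numpy as np

# Hypothetical SLR data (any small x, y vectors will do)
x = np.array([1., 2., 3., 4., 5.])
y = np.array([2., 4., 5., 4., 6.])
n = len(x)

X = np.column_stack([np.ones(n), x])   # design matrix: column of 1's, then x

print(X.T @ X)                          # should equal [[n, sum x], [sum x, sum x^2]]
print(np.array([[n, x.sum()], [x.sum(), (x**2).sum()]]))

print(X.T @ y)                          # should equal [sum y, sum xy]
print(np.array([y.sum(), (x*y).sum()]))

print(y @ y, (y**2).sum())              # Y'Y = sum y^2
```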

Again, X is referred to as the "Design Matrix".

X'X = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ x_{11} & x_{12} & \cdots & x_{1n} \\ \vdots & \vdots & & \vdots \\ x_{k1} & x_{k2} & \cdots & x_{kn} \end{bmatrix} \begin{bmatrix} 1 & x_{11} & \cdots & x_{k1} \\ 1 & x_{12} & \cdots & x_{k2} \\ \vdots & \vdots & & \vdots \\ 1 & x_{1n} & \cdots & x_{kn} \end{bmatrix} = \begin{bmatrix} n & \sum x_{1i} & \cdots & \sum x_{ki} \\ \sum x_{1i} & \sum x_{1i}^2 & \cdots & \sum x_{1i} x_{ki} \\ \vdots & \vdots & & \vdots \\ \sum x_{ki} & \sum x_{ki} x_{1i} & \cdots & \sum x_{ki}^2 \end{bmatrix}

X'X is a square matrix filled with sums of squares and sums of products; we can form XX', but it is a different n-by-n matrix which we do not need.

SIMPLE MODEL

Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i; \quad i = 1, 2, \ldots, n

\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix} = \begin{bmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix} + \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}
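To make the matrix form concrete, one can simulate data from this model. The parameter values in the sketch below are hypothetical, chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameter values for illustration
n, beta0, beta1, sigma = 20, 10.0, 2.0, 1.5

x = rng.uniform(0, 10, size=n)
X = np.column_stack([np.ones(n), x])      # n x 2 design matrix
beta = np.array([beta0, beta1])           # 2 x 1 coefficient vector
eps = rng.normal(0.0, sigma, size=n)      # n x 1 error vector

# The model Y = X*beta + epsilon, row by row: Y_i = beta0 + beta1*x_i + eps_i
Y = X @ beta + eps
print(Y[:5])
```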

Y_{n \times 1} = X_{n \times 2}\,\beta_{2 \times 1} + \varepsilon_{n \times 1}; or Y = X\beta + \varepsilon

MULTIPLE LINEAR REGRESSION MODEL

Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \ldots + \beta_k X_{ik} + \varepsilon_i; \quad i = 1, 2, \ldots, n

\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix} = \begin{bmatrix} 1 & x_{11} & x_{21} & \cdots & x_{k1} \\ 1 & x_{12} & x_{22} & \cdots & x_{k2} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_{1n} & x_{2n} & \cdots & x_{kn} \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{bmatrix} + \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}

MLR MODEL IN MATRIX TERMS

Y_{n \times 1} = X_{n \times (k+1)}\,\beta_{(k+1) \times 1} + \varepsilon_{n \times 1}
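In practice the design matrix is assembled by stacking a column of 1's next to the predictor columns. A minimal sketch (the predictor values below are made up):

```python
import numpy as np

# Hypothetical predictors for n = 6 subjects and k = 2 variables
x1 = np.array([42., 46., 42., 71., 80., 74.])
x2 = np.array([ 1.,  0.,  1.,  0.,  1.,  0.])

# Design matrix: first a column of 1's (for the intercept), then one column per predictor
X = np.column_stack([np.ones(len(x1)), x1, x2])
print(X.shape)    # (6, 3) = n x (k+1)
print(X)

# For regression through the origin, simply omit the column of 1's:
X0 = np.column_stack([x1, x2])
```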

E(Y)_{n \times 1} = X\beta
\sigma^2(Y)_{n \times n} = \sigma^2 I

OBSERVATIONS & ERRORS

Y_{n \times 1} = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix} \qquad \varepsilon_{n \times 1} = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}

β: Regression Coefficients (a column vector of parameters)

\beta_{(k+1) \times 1} = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{bmatrix}

LINEAR DEPENDENCE

Consider a set of c column vectors C_1, C_2, ..., C_c in an r x c matrix. If we can find c scalars k_1, k_2, ..., k_c, not all zero, so that k_1 C_1 + k_2 C_2 + ... + k_c C_c = 0, the c column vectors are said to be "linearly dependent". If the only set of scalars for which the above equation holds is all zeros (k_1 = ... = k_c = 0), the c column vectors are linearly independent.

EXAMPLE

1 2 5 1   A = 2 2 10 6 3 4 15 1 Since : 1 2  5  1 0           (5)2 + (0)2 + (−1)10 + (0)6 = 0 3 4 15 1 0 The four column vec tors are linearly dependent RANK OF A MATRIX • In the previous example, the rank of A<4 • Since the first, second and fourth columns are linearly independent (no scalars can be found so that k1C1 + k2C2 + k4C4 = 0); the rank of that matrix A is 3. • The rank of a matrix is defined as “the maximum number of linearly independent columns in the matrix. SINGULAR/NON-SINGULAR MATRICES

• If the rank of a square r x r matrix A is r, then matrix A is said to be nonsingular or of "full rank".
• An r x r matrix with rank less than r is said to be singular, or not of full rank.

INVERSE OF A MATRIX

• The inverse of a matrix A is another matrix A^{-1} such that A^{-1}A = AA^{-1} = I, where I is the identity or unit matrix.
• An inverse of a matrix is defined only for square matrices.
• Many matrices do not have an inverse; a singular matrix does not have an inverse.
• If a square matrix has an inverse, the inverse is unique; the inverse of a nonsingular or full-rank matrix is also nonsingular and has the same rank.

INVERSE OF A 2x2 MATRIX

a b A =   c d  d − b   A−1 = D D − c a     D D  D = (ad − bc) is the "determinan t" of A; or | A | A singular matrix does not have an inverse because its “determinants” is zero: 2 6  A =   7 21 We have : 2  6  0 (3)  + (−1)  =   7 21 0 and : D = (2)(21) - (6)(7) = 0 A SYSTEM OF EQUATIONS

It is extremely simple to write a system of equations in matrix form, especially with many equations:

3x + 4y - 10z = 0
-2x - 5y + 21z = 14
x + 29y - 2z = -3

\begin{bmatrix} 3 & 4 & -10 \\ -2 & -5 & 21 \\ 1 & 29 & -2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 14 \\ -3 \end{bmatrix}

A SIMPLE APPLICATION

Consider a system of two equations :

2y_1 + 4y_2 = 20
3y_1 + y_2 = 10

which is written in matrix notation:

\begin{bmatrix} 2 & 4 \\ 3 & 1 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} 20 \\ 10 \end{bmatrix}

Solution:

\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} 2 & 4 \\ 3 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 20 \\ 10 \end{bmatrix} = \begin{bmatrix} -.1 & .4 \\ .3 & -.2 \end{bmatrix} \begin{bmatrix} 20 \\ 10 \end{bmatrix} = \begin{bmatrix} 2 \\ 4 \end{bmatrix}

y_1 = 2 and y_2 = 4

ANOTHER EXAMPLE

 3 4 −10 x  0       − 2 − 5 21y = 14   1 29 − 2 z − 3 −1 x  3 4 −10   0        y = − 2 − 5 21 14  z  1 29 − 2  − 3 REGRESSION EXAMPLE

X'X = \begin{bmatrix} n & \sum x \\ \sum x & \sum x^2 \end{bmatrix}

D = n\sum x^2 - (\sum x)(\sum x) = n\sum (x - \bar{x})^2

We can easily see that D ≠ 0 (unless all the x values are equal). This property would not be true if X had more columns (Multiple Regression) and the columns were not linearly independent. If the "columns" (i.e. predictors/factors) are highly related, the Design Matrix approaches singularity and the regression fails!

REGRESSION EXAMPLE

X'X = \begin{bmatrix} n & \sum x \\ \sum x & \sum x^2 \end{bmatrix} \qquad D = n\sum (x - \bar{x})^2

(X'X)^{-1} = \begin{bmatrix} \dfrac{\sum x^2}{n\sum (x - \bar{x})^2} & \dfrac{-\sum x}{n\sum (x - \bar{x})^2} \\ \dfrac{-\sum x}{n\sum (x - \bar{x})^2} & \dfrac{n}{n\sum (x - \bar{x})^2} \end{bmatrix} = \begin{bmatrix} \dfrac{1}{n} + \dfrac{\bar{x}^2}{\sum (x - \bar{x})^2} & \dfrac{-\bar{x}}{\sum (x - \bar{x})^2} \\ \dfrac{-\bar{x}}{\sum (x - \bar{x})^2} & \dfrac{1}{\sum (x - \bar{x})^2} \end{bmatrix}

FOUR IMPORTANT MATRICES IN REGRESSION ANALYSIS

Y'Y = \left[ \sum y_i^2 \right]

X'Y = \begin{bmatrix} \sum y_i \\ \sum x_i y_i \end{bmatrix}

X'X = \begin{bmatrix} n & \sum x_i \\ \sum x_i & \sum x_i^2 \end{bmatrix}

(X'X)^{-1} = \begin{bmatrix} \dfrac{1}{n} + \dfrac{\bar{x}^2}{\sum (x - \bar{x})^2} & \dfrac{-\bar{x}}{\sum (x - \bar{x})^2} \\ \dfrac{-\bar{x}}{\sum (x - \bar{x})^2} & \dfrac{1}{\sum (x - \bar{x})^2} \end{bmatrix}

LEAST SQUARE METHOD SLR

Data: \{(x_i, y_i)\}_{i=1}^{n}

Q = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2

\frac{\partial Q}{\partial \beta_0} = -2\sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i) = 0

\frac{\partial Q}{\partial \beta_1} = -2\sum_{i=1}^{n} x_i (y_i - \beta_0 - \beta_1 x_i) = 0

LEAST SQUARE METHOD MLR

Data: \{(x_{i1}, x_{i2}, \ldots, x_{ik}, y_i)\}_{i=1}^{n}

Q = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_{1i} - \beta_2 x_{2i} - \ldots - \beta_k x_{ki})^2

We solve the equations:

\frac{\partial Q}{\partial \beta_0} = -2\sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_{1i} - \beta_2 x_{2i} - \ldots - \beta_k x_{ki}) = 0

\frac{\partial Q}{\partial \beta_1} = -2\sum_{i=1}^{n} x_{1i} (y_i - \beta_0 - \beta_1 x_{1i} - \beta_2 x_{2i} - \ldots - \beta_k x_{ki}) = 0

\ldots

\frac{\partial Q}{\partial \beta_k} = -2\sum_{i=1}^{n} x_{ki} (y_i - \beta_0 - \beta_1 x_{1i} - \beta_2 x_{2i} - \ldots - \beta_k x_{ki}) = 0

NORMAL EQUATIONS

\frac{\partial Q}{\partial \beta_0} = \frac{\partial Q}{\partial \beta_1} = 0

Normal Equations:

\sum y_i = n b_0 + b_1 \sum x_i

\sum x_i y_i = b_0 \sum x_i + b_1 \sum x_i^2

In matrix notation:

\begin{bmatrix} n & \sum x_i \\ \sum x_i & \sum x_i^2 \end{bmatrix} \begin{bmatrix} b_0 \\ b_1 \end{bmatrix} = \begin{bmatrix} \sum y_i \\ \sum x_i y_i \end{bmatrix}
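A quick numerical check that the fitted coefficients satisfy these two normal equations (hypothetical data; np.linalg.solve is used to solve (X'X)b = X'Y):

```python
import numpy as np

x = np.array([1., 2., 3., 4., 5.])
y = np.array([2., 4., 5., 4., 6.])
n = len(x)
X = np.column_stack([np.ones(n), x])

b = np.linalg.solve(X.T @ X, X.T @ y)   # solve the normal equations (X'X) b = X'Y
b0, b1 = b

# The two normal equations, written out; both pairs should match
print(y.sum(),     n*b0 + b1*x.sum())                 # sum y  = n*b0 + b1*sum x
print((x*y).sum(), b0*x.sum() + b1*(x**2).sum())      # sum xy = b0*sum x + b1*sum x^2
```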

X'X_{2 \times 2}\, b_{2 \times 1} = X'Y_{2 \times 1}

NORMAL EQUATIONS IN MLR

\frac{\partial Q}{\partial \beta_0} = -2\sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_{1i} - \beta_2 x_{2i} - \ldots - \beta_k x_{ki}) = 0 \;\Rightarrow\; \sum_{i=1}^{n} e_i = 0

\frac{\partial Q}{\partial \beta_1} = -2\sum_{i=1}^{n} x_{1i} (y_i - \beta_0 - \beta_1 x_{1i} - \beta_2 x_{2i} - \ldots - \beta_k x_{ki}) = 0 \;\Rightarrow\; \sum_{i=1}^{n} x_{1i} e_i = 0

\ldots

\frac{\partial Q}{\partial \beta_k} = -2\sum_{i=1}^{n} x_{ki} (y_i - \beta_0 - \beta_1 x_{1i} - \beta_2 x_{2i} - \ldots - \beta_k x_{ki}) = 0 \;\Rightarrow\; \sum_{i=1}^{n} x_{ki} e_i = 0

In MLR, these normal equations look more complicated, but they lead to the same result in matrix terms:

(X'X)b = X'Y

LEAST SQUARE ESTIMATES

Normal Equations :

\sum y_i = n b_0 + b_1 \sum x_i

\sum x_i y_i = b_0 \sum x_i + b_1 \sum x_i^2

In matrix notation:

\begin{bmatrix} n & \sum x_i \\ \sum x_i & \sum x_i^2 \end{bmatrix} \begin{bmatrix} b_0 \\ b_1 \end{bmatrix} = \begin{bmatrix} \sum y_i \\ \sum x_i y_i \end{bmatrix}

(X'X)b = X'Y
b = (X'X)^{-1}(X'Y)

SUMMARY REGRESSION RESULTS

Normal Equations: (X'X)b = X'Y
Least Square Estimates: b = (X'X)^{-1}(X'Y)

WE PROVE THE SAME RESULTS

X'Y = \begin{bmatrix} \sum y_i \\ \sum x_i y_i \end{bmatrix}

(X'X)^{-1} = \begin{bmatrix} \dfrac{1}{n} + \dfrac{\bar{x}^2}{\sum (x - \bar{x})^2} & \dfrac{-\bar{x}}{\sum (x - \bar{x})^2} \\ \dfrac{-\bar{x}}{\sum (x - \bar{x})^2} & \dfrac{1}{\sum (x - \bar{x})^2} \end{bmatrix}

b = (X'X)^{-1} X'Y = \begin{bmatrix} \dfrac{1}{n} + \dfrac{\bar{x}^2}{\sum (x - \bar{x})^2} & \dfrac{-\bar{x}}{\sum (x - \bar{x})^2} \\ \dfrac{-\bar{x}}{\sum (x - \bar{x})^2} & \dfrac{1}{\sum (x - \bar{x})^2} \end{bmatrix} \begin{bmatrix} \sum y_i \\ \sum x_i y_i \end{bmatrix}
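The closed-form (X'X)^{-1} above can be checked against a numerical inverse; a short sketch with made-up x values:

```python
import numpy as np

x = np.array([1., 2., 3., 4., 5.])
n = len(x)
X = np.column_stack([np.ones(n), x])

xbar = x.mean()
Sxx = ((x - xbar)**2).sum()            # sum of (x - xbar)^2

closed_form = np.array([[1/n + xbar**2/Sxx, -xbar/Sxx],
                        [-xbar/Sxx,          1/Sxx   ]])

print(np.linalg.inv(X.T @ X))          # numerical inverse of X'X
print(closed_form)                     # should agree with the expression above
```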

b_1 = \frac{\sum x_i y_i - (\sum x_i)(\sum y_i)/n}{\sum (x - \bar{x})^2}

MORE DIRECT APPROACH

Instead of the normal equations, we could start earlier, with the Sum of Squared Errors (SSE):

Q = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_{1i} - \beta_2 x_{2i} - \ldots - \beta_k x_{ki})^2

In matrix notation:

Q = (Y - X\beta)'(Y - X\beta) = (Y' - \beta'X')(Y - X\beta) = Y'Y - 2\beta'X'Y + \beta'X'X\beta

We can prove the "normal equations" by taking the derivative of this SSE with respect to β. To do that, we need two facts: (1) the derivative of \beta'(X'Y) with respect to β is X'Y, and (2) the derivative of \beta'(X'X)\beta is 2(X'X)\beta. Putting it together:

\frac{\partial}{\partial \beta}(Q) = \begin{bmatrix} \partial Q/\partial \beta_0 \\ \partial Q/\partial \beta_1 \\ \vdots \\ \partial Q/\partial \beta_k \end{bmatrix} = -2X'Y + 2X'X\beta

Setting this equal to 0 gives (X'X)b = X'Y.

LEAST SQUARE ESTIMATE

(X'X)b = X'Y
b = (X'X)^{-1}(X'Y)
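A sketch showing that the matrix formula, numpy's built-in least-squares routine, and the familiar scalar formula for the slope all give the same fit (the data are hypothetical):

```python
import numpy as np

x = np.array([1., 2., 3., 4., 5.])
y = np.array([2., 4., 5., 4., 6.])
n = len(x)
X = np.column_stack([np.ones(n), x])

# Matrix solution of the normal equations: b = (X'X)^{-1} X'Y
b = np.linalg.inv(X.T @ X) @ (X.T @ y)

# The same fit from numpy's least-squares routine
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

# Scalar formula for the slope: b1 = (sum xy - (sum x)(sum y)/n) / sum (x - xbar)^2
b1 = ((x*y).sum() - x.sum()*y.sum()/n) / ((x - x.mean())**2).sum()

print(b, b_lstsq, b1)   # all three agree on the slope
```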

Note the error in equation (6.25) of the text.

THE HAT MATRIX

\hat{Y} = Xb = X[(X'X)^{-1}X'Y] = [X(X'X)^{-1}X']Y = HY

H = X(X'X)^{-1}X' is called the "Hat Matrix".

IDEMPOTENCY

The Hat Matrix H = X(X'X)^{-1}X' is "idempotent": HH = H.

RANDOM VECTORS & MATRICES

A random vector or a random matrix contains elements which are random variables.

RANDOM VECTORS IN SLR

Y = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix} \quad \text{and} \quad \varepsilon = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}

EXPECTED VALUES

Y = E(Y) + ε

E(Y1) 0     E(Y ) 0 E(Y) =  2 ;E(ε) =    M  M     E(Yn ) 0 VARIANCE-

The variances of the elements of a random vector, and the covariances between any two of its elements, are assembled in the variance-covariance matrix, denoted by Var(Y), σ²(Y), or Σ.

EXAMPLE: A (BIVARIATE) RANDOM VECTOR

Variable: Y = \begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix}

Mean: \mu = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}

Variance-Covariance Matrix: \Sigma = \begin{bmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{bmatrix}

We have, for random variables:

E(aY) = aE(Y)
Var(aY) = a^2\,Var(Y)

What about random vectors? Say:

A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}

Y = \begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix} \qquad AY = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix}

= \begin{bmatrix} a_{11}Y_1 + a_{12}Y_2 \\ a_{21}Y_1 + a_{22}Y_2 \end{bmatrix}

E(AY) = \begin{bmatrix} E(a_{11}Y_1 + a_{12}Y_2) \\ E(a_{21}Y_1 + a_{22}Y_2) \end{bmatrix}

= \begin{bmatrix} a_{11}E(Y_1) + a_{12}E(Y_2) \\ a_{21}E(Y_1) + a_{22}E(Y_2) \end{bmatrix} = A\,E(Y)

AY = \begin{bmatrix} a_{11}Y_1 + a_{12}Y_2 \\ a_{21}Y_1 + a_{22}Y_2 \end{bmatrix}

\sigma^2(AY) = \begin{bmatrix} \sigma^2(a_{11}Y_1 + a_{12}Y_2) & \sigma(a_{11}Y_1 + a_{12}Y_2,\; a_{21}Y_1 + a_{22}Y_2) \\ \sigma(a_{11}Y_1 + a_{12}Y_2,\; a_{21}Y_1 + a_{22}Y_2) & \sigma^2(a_{21}Y_1 + a_{22}Y_2) \end{bmatrix}

= \cdots = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} \sigma^2(Y_1) & \sigma(Y_1, Y_2) \\ \sigma(Y_1, Y_2) & \sigma^2(Y_2) \end{bmatrix} \begin{bmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \end{bmatrix}

= A\,\sigma^2(Y)\,A'
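These two rules, E(AY) = AE(Y) and σ²(AY) = Aσ²(Y)A', can also be checked by simulation. A sketch using a hypothetical A, mean vector, and covariance matrix; with a large number of draws the sample moments should be close to the theoretical ones:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 2x2 transformation and a bivariate Y with known mean and covariance
A = np.array([[1., 2.],
              [3., -1.]])
mu = np.array([5., -2.])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])

Y = rng.multivariate_normal(mu, Sigma, size=200000)   # each row is one draw of Y
AY = Y @ A.T                                          # apply A to every draw

print(AY.mean(axis=0))           # approximately A @ mu
print(A @ mu)

print(np.cov(AY, rowvar=False))  # approximately A @ Sigma @ A'
print(A @ Sigma @ A.T)
```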

We can verify this backward too, starting with A\sigma^2(Y)A'.

REGRESSION EXAMPLES

Y = E(Y) + \varepsilon, \qquad Var(\varepsilon) = \sigma^2 I

Var(Y) = \begin{bmatrix} \sigma^2(Y_1) & \sigma(Y_1, Y_2) & \cdots & \sigma(Y_1, Y_n) \\ \sigma(Y_2, Y_1) & \sigma^2(Y_2) & \cdots & \sigma(Y_2, Y_n) \\ \vdots & \vdots & & \vdots \\ \sigma(Y_n, Y_1) & \sigma(Y_n, Y_2) & \cdots & \sigma^2(Y_n) \end{bmatrix} = \sigma^2 I

FITTED VALUES

\hat{Y} = \begin{bmatrix} b_0 + b_1 x_1 \\ b_0 + b_1 x_2 \\ \vdots \\ b_0 + b_1 x_n \end{bmatrix}

= \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix} \begin{bmatrix} b_0 \\ b_1 \end{bmatrix} = Xb

RESIDUALS

Model :

Y_{n \times 1} = X_{n \times 2}\,\beta_{2 \times 1} + \varepsilon_{n \times 1}

Fitted values: \hat{Y} = Xb

Residuals: e = Y - \hat{Y} = Y - Xb = Y - HY = (I - H)Y
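A short numerical sketch (hypothetical data) computing the hat matrix, checking the idempotency stated earlier, and confirming that (I - H)Y reproduces the residuals Y - Xb:

```python
import numpy as np

x = np.array([1., 2., 3., 4., 5.])
y = np.array([2., 4., 5., 4., 6.])
X = np.column_stack([np.ones(len(x)), x])

H = X @ np.linalg.inv(X.T @ X) @ X.T     # the hat matrix
b = np.linalg.inv(X.T @ X) @ (X.T @ y)   # least squares estimates

print(np.allclose(H @ H, H))             # H is idempotent: HH = H
print(np.allclose(H, H.T))               # and symmetric

e = (np.eye(len(x)) - H) @ y             # residuals as (I - H) Y
print(np.allclose(e, y - X @ b))         # same as Y - Xb
```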

Like the Hat Matrix H, (I - H) is symmetric and idempotent.

VARIANCE OF RESIDUALS

e = (I - H)Y

\sigma^2(e) = (I - H)\,\sigma^2(Y)\,(I - H)' = (I - H)(\sigma^2 I)(I - H)' = \sigma^2 (I - H)(I - H)' = \sigma^2 (I - H)(I - H) = \sigma^2 (I - H)

s^2(e) = MSE\,(I - H)

REGRESSION COEFFICIENTS

\frac{\partial}{\partial \beta}(Q) = -2X'Y + 2X'X\beta = 0 \;\Leftrightarrow\; X'Y = X'Xb

b = (X'X)^{-1}X'Y = [(X'X)^{-1}X']Y = AY, \quad \text{where } A = (X'X)^{-1}X' \text{ and } A' = X(X'X)^{-1}

VARIANCE OF REGRESSION COEFFICIENTS

b = [(X'X)^{-1}X']Y = AY

\sigma^2(b) = A\,\sigma^2(Y)\,A' = A(\sigma^2 I)A' = (X'X)^{-1}X'\,\sigma^2 I\,X(X'X)^{-1} = \sigma^2 (X'X)^{-1}

s^2(b) = MSE\,(X'X)^{-1}

b = [(X'X)^{-1}X']Y = (X'X)^{-1}(X'Y) = \begin{bmatrix} \dfrac{1}{n} + \dfrac{\bar{x}^2}{\sum (x - \bar{x})^2} & \dfrac{-\bar{x}}{\sum (x - \bar{x})^2} \\ \dfrac{-\bar{x}}{\sum (x - \bar{x})^2} & \dfrac{1}{\sum (x - \bar{x})^2} \end{bmatrix} \begin{bmatrix} \sum y_i \\ \sum x_i y_i \end{bmatrix}
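A sketch checking s²(b) = MSE (X'X)^{-1} numerically and comparing its slope entry with the scalar formula derived on the next slides (hypothetical data; MSE uses n - 2 degrees of freedom for SLR):

```python
import numpy as np

x = np.array([1., 2., 3., 4., 5.])
y = np.array([2., 4., 5., 4., 6.])
n = len(x)
X = np.column_stack([np.ones(n), x])

b = np.linalg.inv(X.T @ X) @ (X.T @ y)
e = y - X @ b
MSE = (e @ e) / (n - 2)                  # SSE / (n - 2) for simple linear regression

s2_b = MSE * np.linalg.inv(X.T @ X)      # estimated variance-covariance matrix of b
print(s2_b)

# The (2,2) entry should equal MSE / sum (x - xbar)^2, the scalar formula for s^2(b1)
Sxx = ((x - x.mean())**2).sum()
print(MSE / Sxx, s2_b[1, 1])
```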

We can verify that this gives the same results, e.g.

b = \begin{bmatrix} b_0 \\ b_1 \end{bmatrix}, \qquad b_1 = \frac{-\bar{x}\sum y + \sum xy}{\sum (x - \bar{x})^2} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}

EXAMPLE: SL REGRESSION

s^2(b) = MSE\,(X'X)^{-1}

(X'X)^{-1} = \begin{bmatrix} \dfrac{1}{n} + \dfrac{\bar{x}^2}{\sum (x - \bar{x})^2} & \dfrac{-\bar{x}}{\sum (x - \bar{x})^2} \\ \dfrac{-\bar{x}}{\sum (x - \bar{x})^2} & \dfrac{1}{\sum (x - \bar{x})^2} \end{bmatrix}

s^2(b) = MSE \begin{bmatrix} \dfrac{1}{n} + \dfrac{\bar{x}^2}{\sum (x - \bar{x})^2} & \dfrac{-\bar{x}}{\sum (x - \bar{x})^2} \\ \dfrac{-\bar{x}}{\sum (x - \bar{x})^2} & \dfrac{1}{\sum (x - \bar{x})^2} \end{bmatrix}

s^2(b_0) = MSE\left[ \frac{1}{n} + \frac{\bar{x}^2}{\sum (x - \bar{x})^2} \right]

s^2(b_1) = \frac{MSE}{\sum (x - \bar{x})^2}

s(b_0, b_1) = \frac{-MSE\,\bar{x}}{\sum (x - \bar{x})^2}

Same results for the variances; the covariance is new. Only the mean and variance of X are needed.

THE MEAN RESPONSE

Let X = x_h denote the level of X for which we wish to estimate the mean response, i.e. E(Y|X = x_h). The only thing new is that X and x_h are vectors; x_h = (x_{1h}, x_{2h}, ..., x_{kh}). The point estimate of the mean response is:

\hat{E}(Y | X = x_h) = \hat{Y}_h

= b_0 + b_1 x_{1h} + b_2 x_{2h} + \ldots + b_k x_{kh}

In matrix terms:

\hat{Y}_h = X_h' b

\sigma^2(\hat{Y}_h) = X_h'\,\sigma^2(b)\,X_h = \sigma^2\, X_h'(X'X)^{-1}X_h

s^2(\hat{Y}_h) = MSE\,[X_h'(X'X)^{-1}X_h]

PREDICTION OF NEW OBSERVATION

Let X = x_h denote the level of X under investigation, at which the mean response is E(Y|X = x_h). Let Y_{h(new)} be the value of the new individual response of interest. The point estimate is still the same E(Y|X = x_h):

\hat{Y}_{h(new)} = b_0 + b_1 x_{1h} + b_2 x_{2h} + \ldots + b_k x_{kh} = X_h' b

Var(\hat{Y}_{h(new)}) = \sigma^2\{1 + X_h'(X'X)^{-1}X_h\}

s^2(\hat{Y}_{h(new)}) = MSE\{1 + X_h'(X'X)^{-1}X_h\}

The matrix notation and derivations may be deceiving because they hide enormous computational complexities. To find the inverse of a 10x10 matrix, for example X'X with k = 9, requires a tremendous amount of computation. However, the actual computations will be done by computer; hence it does not matter to us whether X'X represents a 2x2 or a 6x6 matrix.

Example: SBP versus AGE

Age (x)   SBP (y)   x-sq     y-sq      xy
42        130       1764     16900     5460
46        115       2116     13225     5290
42        148       1764     21904     6216
71        100       5041     10000     7100
80        156       6400     24336     12480
74        162       5476     26244     11988
70        151       4900     22801     10570
80        156       6400     24336     12480
85        162       7225     26244     13770
72        158       5184     24964     11376
64        155       4096     24025     9920
81        160       6561     25600     12960
41        125       1681     15625     5125
61        150       3721     22500     9150
75        165       5625     27225     12375
Totals: 984   2193   67954   325929   146260

X'X = \begin{bmatrix} n & \sum x \\ \sum x & \sum x^2 \end{bmatrix} = \begin{bmatrix} 15 & 984 \\ 984 & 67{,}954 \end{bmatrix}

D = (15)(67{,}954) - (984)^2 = 51{,}054

(X'X)^{-1} = \begin{bmatrix} \dfrac{67{,}954}{51{,}054} & \dfrac{-984}{51{,}054} \\ \dfrac{-984}{51{,}054} & \dfrac{15}{51{,}054} \end{bmatrix} = \begin{bmatrix} 1.3310 & -.0193 \\ -.0193 & .0003 \end{bmatrix}
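As a cross-check of this hand computation, and of the inverse and estimates on the next slides, a short numpy sketch using the Age/SBP data from the table:

```python
import numpy as np

age = np.array([42, 46, 42, 71, 80, 74, 70, 80, 85, 72, 64, 81, 41, 61, 75], float)
sbp = np.array([130, 115, 148, 100, 156, 162, 151, 156, 162, 158, 155, 160,
                125, 150, 165], float)
n = len(age)
X = np.column_stack([np.ones(n), age])

XtX = X.T @ X
print(XtX)                        # [[15, 984], [984, 67954]] as in the table
print(np.linalg.det(XtX))         # D = 51,054 (up to rounding)

XtX_inv = np.linalg.inv(XtX)
print(XtX_inv)                    # approximately [[1.3310, -.0193], [-.0193, .0003]]

XtY = X.T @ sbp
print(XtY)                        # [2193, 146260]

b = XtX_inv @ XtY                 # least squares estimates b0, b1
print(b)                          # roughly [100.0, 0.70]: SBP = b0 + b1 * Age
```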

−1 1.3310 −.0193 (X' X) =   −.0193 .0003   2,193  X' Y =   146,260 b = (X' X)−1(X' Y) 1.3310 −.0193 2,193  =    −.0193 .0003 146,260 VARIANCE OF REGRESSION COEFFICIENTS

b = (X'X)^{-1}(X'Y)

s^2(b) = MSE\,(X'X)^{-1} = MSE \begin{bmatrix} 1.3310 & -.0193 \\ -.0193 & .0003 \end{bmatrix}

SUMMARIES

• All results can be put in matrix form.
• If we can invert a matrix and can multiply two matrices, we can get all numerical results, even without a packaged computer program.
• In their matrix forms, the results are easily generalized; the only change needed is in the Design Matrix (and its dimension), so as to handle more than one predictor.

Readings & Exercises

• Readings: A thorough reading of the text's sections 5.1-5.13 (pp. 176-209) and sections 6.2-6.9 (pp. 222-247) is highly recommended.
• Exercises: The following exercises are good for practice, all from chapter 5 of the text: 5.1, 5.2, 5.7-5.11, and 5.24-5.26; plus these from chapter 6 of the text: 6.5(b-d), 6.7, 6.10(a-d), and 6.15(a-f).

Due As Homework

#12.1 The following data were collected during an experiment in which 10 laboratory animals were inoculated with a pathogen. The variables are Time after inoculation (X, in minutes) and Temperature (Y, in Celsius degrees). For the regression of Y (as dependent variable) on X (as sole predictor), form these matrices: Y'Y, X'Y, and X'X.

X     Y
24    38.8
28    39.5
32    40.3
36    40.7
40    41.0
44    41.1
48    41.4
52    41.6
56    41.8
60    41.9

#12.2 Solve the following system of equations:

7x - 6y = 12
3x + 9y = 25