
Gov 2000: 9. Multiple Regression in Matrix Form

Matthew Blackwell
November 19, 2015

1. Matrix algebra review

2. Matrix Operations

3. Writing the linear model more compactly

4. A bit more about matrices

5. OLS in matrix form

6. OLS inference in matrix form

Where are we? Where are we going?

• Last few weeks: regression estimation and inference with one and two independent variables, varying effects
• This week: the general regression model with arbitrary covariates
• Next week: what happens when the assumptions are wrong

Nunn & Wantchekon

• Are there long-term, persistent effects of the slave trade on Africans today?
• Basic idea: compare levels of interpersonal trust ($y_i$) across different levels of historical slave exports for a respondent's ethnic group
• Problem: ethnic groups and respondents might differ in their interpersonal trust in ways that correlate with the severity of slave exports
• One solution: try to control for relevant differences between groups via multiple regression

Nunn & Wantchekon

[Image: a screenshot of the matrix-notation regression equation from Nunn & Wantchekon (2011)]

• Whaaaaa? Bold letters, quotation marks, what is this?
• Today's goal is to decipher this type of writing

Multiple Regression in R

nunn <- foreign::read.dta("Nunn_Wantchekon_AER_2011.dta")
mod <- lm(trust_neighbors ~ exports + age + male + urban_dum +
            malaria_ecology, data = nunn)
summary(mod)

##
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)
## (Intercept)      1.5030370  0.0218325   68.84   <2e-16 ***
## exports         -0.0010208  0.0000409  -24.94   <2e-16 ***
## age              0.0050447  0.0004724   10.68   <2e-16 ***
## male             0.0278369  0.0138163    2.01    0.044 *
## urban_dum       -0.2738719  0.0143549  -19.08   <2e-16 ***
## malaria_ecology  0.0194106  0.0008712   22.28   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.978 on 20319 degrees of freedom
##   (1497 observations deleted due to missingness)
## Multiple R-squared: 0.0604, Adjusted R-squared: 0.0602
## F-statistic: 261 on 5 and 20319 DF, p-value: <2e-16

Why matrices and vectors?


• Here’s one way to write the full multiple regression model:

$$y_i = \beta_0 + x_{i1}\beta_1 + x_{i2}\beta_2 + \cdots + x_{ik}\beta_k + u_i$$

• Notation is going to get needlessly messy as we add variables
• Matrices are clean, but they are like a foreign language
• You need to build intuitions over a long period of time

Quick note about interpretation

$$y_i = \beta_0 + x_{i1}\beta_1 + x_{i2}\beta_2 + \cdots + x_{ik}\beta_k + u_i$$

• In this model, $\beta_1$ is the effect of a one-unit change in $x_{i1}$ conditional on all of the other $x_{ij}$.
• Jargon: "partial effect," "ceteris paribus," "all else equal," "conditional on the covariates," etc.

1/ Matrix algebra review

Matrices and vectors

• A matrix is just a rectangular array of numbers.
• We say that a matrix is $n \times k$ ("$n$ by $k$") if it has $n$ rows and $k$ columns.
• Uppercase bold denotes a matrix:

$$\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1k} \\ a_{21} & a_{22} & \cdots & a_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nk} \end{bmatrix}$$

• Generic entry: $a_{ij}$, the entry in row $i$ and column $j$ (see the R sketch below)
• If you've used Excel, you've seen matrices.
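• A quick sketch of this in R (the values here are just illustrative):

A <- matrix(c(4, 7, 2, 9, 1, 6), nrow = 3, ncol = 2)
A
## the entry in row 2, column 1, i.e. a_21
A[2, 1]
## dimensions: n rows, k columns
dim(A)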

Examples of matrices

• One example of a matrix that we'll use a lot is the design matrix, which has a column of ones followed by one column for each independent variable in the regression.

$$\mathbf{X} = \begin{bmatrix} 1 & \text{exports}_1 & \text{age}_1 & \text{male}_1 \\ 1 & \text{exports}_2 & \text{age}_2 & \text{male}_2 \\ \vdots & \vdots & \vdots & \vdots \\ 1 & \text{exports}_n & \text{age}_n & \text{male}_n \end{bmatrix}$$

Design matrix in R

head(model.matrix(mod), 8)

##   (Intercept) exports age male urban_dum malaria_ecology
## 1           1     855  40    0         0           28.15
## 2           1     855  25    1         0           28.15
## 3           1     855  38    1         1           28.15
## 4           1     855  37    0         1           28.15
## 5           1     855  31    1         0           28.15
## 6           1     855  45    0         0           28.15
## 7           1     855  20    1         0           28.15
## 8           1     855  31    0         0           28.15

dim(model.matrix(mod))

## [1] 20325     6

Vectors

• A vector is just a matrix with only one row or one column.
• A row vector is a vector with only one row, sometimes called a $1 \times k$ vector:

$$\boldsymbol{\alpha} = \begin{bmatrix} \alpha_1 & \alpha_2 & \alpha_3 & \cdots & \alpha_k \end{bmatrix}$$

• A column vector is a vector with one column and more than one row. Here is an $n \times 1$ vector:

$$\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}$$

• Convention: we'll assume that a vector is a column vector, and vectors will be written with lowercase bold lettering ($\mathbf{b}$)

Vector examples

• One really common kind of vector that we will work with holds an individual variable, such as the dependent variable, which we will represent as $\mathbf{y}$:

$$\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}$$

Vectors in R

• Vectors can come from subsets of matrices:

model.matrix(mod)[1, ]

##     (Intercept)         exports             age            male
##            1.00          854.96           40.00            0.00
##       urban_dum malaria_ecology
##            0.00           28.15

• Note, though, that R always prints a vector in row form, even if it is a column in the original data:

head(nunn$trust_neighbors)

## [1] 3 3 0 0 1 1

2/ Matrix Operations

Transpose

• The transpose of a matrix $\mathbf{A}$ is the matrix created by switching the rows and columns of the data and is denoted $\mathbf{A}'$.
• The $k$th column of $\mathbf{A}$ becomes the $k$th row of $\mathbf{A}'$:

$$\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} \qquad \mathbf{A}' = \begin{bmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \end{bmatrix}$$

• If $\mathbf{A}$ is $n \times k$, then $\mathbf{A}'$ will be $k \times n$.
• Also written $\mathbf{A}^{T}$

Transposing vectors

• Transposing will turn a $k \times 1$ column vector into a $1 \times k$ row vector and vice versa:

$$\boldsymbol{\omega} = \begin{bmatrix} 1 \\ 3 \\ 2 \\ -5 \end{bmatrix} \qquad \boldsymbol{\omega}' = \begin{bmatrix} 1 & 3 & 2 & -5 \end{bmatrix}$$

Transposing in R

a <- matrix(1:6, ncol = 3, nrow = 2)
a

##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6

t(a)

##      [,1] [,2]
## [1,]    1    2
## [2,]    3    4
## [3,]    5    6

Write matrices as vectors

• A matrix is just a collection of vectors (row or column)
• As row vectors:

$$\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} = \begin{bmatrix} \mathbf{a}_1' \\ \mathbf{a}_2' \end{bmatrix}$$

with row vectors

$$\mathbf{a}_1' = \begin{bmatrix} a_{11} & a_{12} & a_{13} \end{bmatrix} \qquad \mathbf{a}_2' = \begin{bmatrix} a_{21} & a_{22} & a_{23} \end{bmatrix}$$

• Or we can define it in terms of column vectors:

$$\mathbf{B} = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \end{bmatrix} = \begin{bmatrix} \mathbf{b}_1 & \mathbf{b}_2 \end{bmatrix}$$

where $\mathbf{b}_1$ and $\mathbf{b}_2$ represent the columns of $\mathbf{B}$.

• $j$ subscripts the columns of a matrix: $\mathbf{x}_j$
• $i$ and $t$ will be used for the rows of a matrix: $\mathbf{x}_i'$ (see the R sketch below)
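• A quick sketch of this in R, pulling columns and rows out of the design matrix of mod from earlier:

## the "age" column of the design matrix as a vector (x_j)
x_age <- model.matrix(mod)[, "age"]
head(x_age)
## the first row of the design matrix as a row vector (x_1')
model.matrix(mod)[1, ]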

Addition and subtraction

• How do we add or subtract matrices and vectors?
• First, the matrices/vectors need to be conformable, meaning that the dimensions have to be the same
• Let $\mathbf{A}$ and $\mathbf{B}$ both be $2 \times 2$ matrices. Then let $\mathbf{C} = \mathbf{A} + \mathbf{B}$, where we add each cell together:

$$\mathbf{A} + \mathbf{B} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} + \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} = \begin{bmatrix} a_{11} + b_{11} & a_{12} + b_{12} \\ a_{21} + b_{21} & a_{22} + b_{22} \end{bmatrix} = \begin{bmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{bmatrix} = \mathbf{C}$$
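• A small sketch in R (the entries are made up); addition happens cell by cell and the dimensions must match:

A <- matrix(1:4, nrow = 2, ncol = 2)
B <- matrix(5:8, nrow = 2, ncol = 2)
A + B  ## each cell of the result is a_ij + b_ij
A - B  ## subtraction is also cell by cell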

Scalar multiplication

• A scalar is just a single number: you can think of it sort of like a $1 \times 1$ matrix.
• When we multiply a scalar by a matrix, we just multiply each element/cell by that scalar:

$$\alpha\mathbf{A} = \alpha \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} = \begin{bmatrix} \alpha \times a_{11} & \alpha \times a_{12} \\ \alpha \times a_{21} & \alpha \times a_{22} \end{bmatrix}$$
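• In R, scalar multiplication is just * (a minimal sketch with made-up values):

A <- matrix(1:4, nrow = 2, ncol = 2)
3 * A  ## every cell of A multiplied by the scalar 3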

3/ Writing the linear model more compactly

The linear model with new notation

• Remember that we wrote the linear model as the following for all $i \in [1, \ldots, n]$:

$$y_i = \beta_0 + x_i\beta_1 + z_i\beta_2 + u_i$$

• Imagine we had an $n$ of 4. We could write out each formula:

$$y_1 = \beta_0 + x_1\beta_1 + z_1\beta_2 + u_1 \quad \text{(unit 1)}$$

$$y_2 = \beta_0 + x_2\beta_1 + z_2\beta_2 + u_2 \quad \text{(unit 2)}$$

$$y_3 = \beta_0 + x_3\beta_1 + z_3\beta_2 + u_3 \quad \text{(unit 3)}$$

$$y_4 = \beta_0 + x_4\beta_1 + z_4\beta_2 + u_4 \quad \text{(unit 4)}$$

The linear model with new notation

$$y_1 = \beta_0 + x_1\beta_1 + z_1\beta_2 + u_1 \quad \text{(unit 1)}$$

$$y_2 = \beta_0 + x_2\beta_1 + z_2\beta_2 + u_2 \quad \text{(unit 2)}$$

$$y_3 = \beta_0 + x_3\beta_1 + z_3\beta_2 + u_3 \quad \text{(unit 3)}$$

$$y_4 = \beta_0 + x_4\beta_1 + z_4\beta_2 + u_4 \quad \text{(unit 4)}$$

• We can write this as:

$$\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} \beta_0 + \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} \beta_1 + \begin{bmatrix} z_1 \\ z_2 \\ z_3 \\ z_4 \end{bmatrix} \beta_2 + \begin{bmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{bmatrix}$$

• The outcome is a linear combination of the $\mathbf{x}$, $\mathbf{z}$, and $\mathbf{u}$ vectors

Grouping things into matrices

• Can we write this in a more compact form? Yes! Let $\mathbf{X}$ and $\boldsymbol{\beta}$ be the following:

$$\underset{(4 \times 3)}{\mathbf{X}} = \begin{bmatrix} 1 & x_1 & z_1 \\ 1 & x_2 & z_2 \\ 1 & x_3 & z_3 \\ 1 & x_4 & z_4 \end{bmatrix} \qquad \underset{(3 \times 1)}{\boldsymbol{\beta}} = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{bmatrix}$$

Multiplying a matrix by a vector

• We can write this more compactly as a matrix (post-)multiplied by a vector:

$$\begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} \beta_0 + \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} \beta_1 + \begin{bmatrix} z_1 \\ z_2 \\ z_3 \\ z_4 \end{bmatrix} \beta_2 = \mathbf{X}\boldsymbol{\beta}$$

• Multiplication of a matrix by a vector is just the linear combination of the columns of the matrix, with the vector elements as the weights/coefficients (verified in the R sketch below).
• And the left-hand side here only uses scalars times vectors, which is easy!
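• A small sketch in R (the values are made up): the matrix product and the explicit linear combination of columns agree:

Xs <- cbind(1, c(2, 4, 6, 8), c(1, 0, 1, 0))  ## columns: 1, x, z
b <- c(0.5, 2, -1)                            ## beta0, beta1, beta2
Xs %*% b
## the same linear combination of columns, written out:
b[1] * Xs[, 1] + b[2] * Xs[, 2] + b[3] * Xs[, 3]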

General matrix by vector multiplication

• $\mathbf{A}$ is an $n \times k$ matrix
• $\mathbf{b}$ is a $k \times 1$ column vector
• Columns of $\mathbf{A}$ have to match rows of $\mathbf{b}$

• Let $\mathbf{a}_j$ be the $j$th column of $\mathbf{A}$. Then we can write:

$$\underset{(n \times 1)}{\mathbf{c}} = \mathbf{A}\mathbf{b} = b_1\mathbf{a}_1 + b_2\mathbf{a}_2 + \cdots + b_k\mathbf{a}_k$$

• $\mathbf{c}$ is a linear combination of the columns of $\mathbf{A}$

Back to regression

• $\mathbf{X}$ is the $n \times (k+1)$ design matrix of independent variables
• $\boldsymbol{\beta}$ is the $(k+1) \times 1$ column vector of coefficients
• $\mathbf{X}\boldsymbol{\beta}$ will be $n \times 1$:

$$\mathbf{X}\boldsymbol{\beta} = \beta_0 \mathbf{1} + \beta_1\mathbf{x}_1 + \beta_2\mathbf{x}_2 + \cdots + \beta_k\mathbf{x}_k$$

• Thus, we can compactly write the linear model as the following:

$$\underset{(n \times 1)}{\mathbf{y}} = \underset{(n \times 1)}{\mathbf{X}\boldsymbol{\beta}} + \underset{(n \times 1)}{\mathbf{u}}$$

• We can also write this at the individual level, where $\mathbf{x}_i'$ is the $i$th row of $\mathbf{X}$:

$$y_i = \mathbf{x}_i'\boldsymbol{\beta} + u_i$$

4/ A bit more about matrices

Matrix multiplication

• What if, instead of a column vector $\mathbf{b}$, we have a matrix $\mathbf{B}$ with dimensions $k \times m$?
• How do we do multiplication like $\mathbf{C} = \mathbf{A}\mathbf{B}$?
• Each column of the new matrix is just matrix-by-vector multiplication:

$$\mathbf{C} = \begin{bmatrix} \mathbf{c}_1 & \mathbf{c}_2 & \cdots & \mathbf{c}_m \end{bmatrix} \qquad \mathbf{c}_j = \mathbf{A}\mathbf{b}_j$$

• Thus, each column of $\mathbf{C}$ is a linear combination of the columns of $\mathbf{A}$.
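• A sketch in R (illustrative values), checking the column-by-column construction of the product:

A <- matrix(1:6, nrow = 2, ncol = 3)
B <- matrix(1:6, nrow = 3, ncol = 2)
A %*% B       ## (2 x 3) times (3 x 2) gives a (2 x 2) matrix
## column j of the product is A times the jth column of B:
A %*% B[, 1]
A %*% B[, 2]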

Special multiplications

• The inner product of two column vectors $\mathbf{a}$ and $\mathbf{b}$ (of equal dimension, $k \times 1$):

$$\mathbf{a}'\mathbf{b} = a_1b_1 + a_2b_2 + \cdots + a_kb_k$$

• Special case of matrix multiplication: $\mathbf{a}'$ is a matrix with $k$ columns and just 1 row, so the "columns" of $\mathbf{a}'$ are just scalars.
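• In R (a minimal sketch with made-up vectors):

a <- c(1, 2, 3)
b <- c(4, 5, 6)
t(a) %*% b  ## inner product, returned as a 1 x 1 matrix
sum(a * b)  ## the same number as a plain scalar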

Sum of the squared residuals

• Example: let's say that we have a vector of residuals, $\hat{\mathbf{u}}$. Then the inner product of the residuals is:

$$\hat{\mathbf{u}}'\hat{\mathbf{u}} = \begin{bmatrix} \hat{u}_1 & \hat{u}_2 & \cdots & \hat{u}_n \end{bmatrix} \begin{bmatrix} \hat{u}_1 \\ \hat{u}_2 \\ \vdots \\ \hat{u}_n \end{bmatrix}$$

$$\hat{\mathbf{u}}'\hat{\mathbf{u}} = \hat{u}_1\hat{u}_1 + \hat{u}_2\hat{u}_2 + \cdots + \hat{u}_n\hat{u}_n = \sum_{i=1}^{n} \hat{u}_i^2$$

• It’s just the sum of the squared residuals!
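• We can check this in R with the residuals from mod:

uhat <- residuals(mod)
t(uhat) %*% uhat  ## inner product of the residuals (a 1 x 1 matrix)
sum(uhat^2)       ## the same: the sum of the squared residuals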

Square matrices and the diagonal

• A square matrix has equal numbers of rows and columns.

• The diagonal of a square matrix consists of the entries $a_{jj}$:

$$\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$$

• The identity matrix, $\mathbf{I}$, is a square matrix with 1s along the diagonal and 0s everywhere else:

$$\mathbf{I} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

• The identity matrix multiplied by any matrix returns that matrix: $\mathbf{A}\mathbf{I} = \mathbf{A}$.

Identity matrix

• To get the diagonal of a matrix in R, use the diag() function:

b <- matrix(1:4, nrow = 2, ncol = 2)
b

##      [,1] [,2]
## [1,]    1    3
## [2,]    2    4

diag(b)

## [1] 1 4

• diag() also creates identity matrices in R:

diag(3)

##      [,1] [,2] [,3]
## [1,]    1    0    0
## [2,]    0    1    0
## [3,]    0    0    1

5/ OLS in matrix form

Multiple regression in matrix form

• Let $\hat{\boldsymbol{\beta}}$ be the vector of estimated regression coefficients and $\hat{\mathbf{y}}$ be the vector of fitted values:

$$\hat{\boldsymbol{\beta}} = \begin{bmatrix} \hat{\beta}_0 \\ \hat{\beta}_1 \\ \vdots \\ \hat{\beta}_k \end{bmatrix} \qquad \hat{\mathbf{y}} = \mathbf{X}\hat{\boldsymbol{\beta}}$$

• It might be helpful to see this written out more fully:

$$\hat{\mathbf{y}} = \begin{bmatrix} \hat{y}_1 \\ \hat{y}_2 \\ \vdots \\ \hat{y}_n \end{bmatrix} = \mathbf{X}\hat{\boldsymbol{\beta}} = \begin{bmatrix} 1\hat{\beta}_0 + x_{11}\hat{\beta}_1 + x_{12}\hat{\beta}_2 + \cdots + x_{1k}\hat{\beta}_k \\ 1\hat{\beta}_0 + x_{21}\hat{\beta}_1 + x_{22}\hat{\beta}_2 + \cdots + x_{2k}\hat{\beta}_k \\ \vdots \\ 1\hat{\beta}_0 + x_{n1}\hat{\beta}_1 + x_{n2}\hat{\beta}_2 + \cdots + x_{nk}\hat{\beta}_k \end{bmatrix}$$
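• A sketch in R: computing the fitted values by hand matches what lm() reports:

yhat <- model.matrix(mod) %*% coef(mod)
head(yhat)
head(fitted(mod))  ## the same values from lm()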

Residuals

• We can easily write the residuals in matrix form:

$$\hat{\mathbf{u}} = \mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}}$$

• Our goal as usual is to minimize the sum of the squared residuals, which we saw earlier we can write:

$$\hat{\mathbf{u}}'\hat{\mathbf{u}} = (\mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}})'(\mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}})$$

OLS estimator in matrix form

• Goal: minimize the sum of the squared residuals
• Take (matrix) derivatives, set equal to 0 (see Wooldridge for details)
• Resulting first-order conditions:

$$\mathbf{X}'(\mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}}) = \mathbf{0}$$

• Rearranging: $\mathbf{X}'\mathbf{X}\hat{\boldsymbol{\beta}} = \mathbf{X}'\mathbf{y}$
• In order to isolate $\hat{\boldsymbol{\beta}}$, we need to move the $\mathbf{X}'\mathbf{X}$ term to the other side of the equals sign.
• We've learned about matrix multiplication, but what about matrix "division"?

Scalar inverses

• What is division in its simplest form? $\frac{1}{a}$ is the value such that $\frac{1}{a} a = 1$.
• For some algebraic expression $au = b$, let's solve for $u$:

$$\frac{1}{a} a u = \frac{1}{a} b \qquad u = \frac{b}{a}$$

• We need a matrix version of $\frac{1}{a}$.

Matrix inverses

• Definition: If it exists, the inverse of a square matrix $\mathbf{A}$, denoted $\mathbf{A}^{-1}$, is the matrix such that $\mathbf{A}^{-1}\mathbf{A} = \mathbf{I}$.
• We can use the inverse to solve (systems of) equations:

$$\begin{aligned} \mathbf{A}\mathbf{u} &= \mathbf{b} \\ \mathbf{A}^{-1}\mathbf{A}\mathbf{u} &= \mathbf{A}^{-1}\mathbf{b} \\ \mathbf{I}\mathbf{u} &= \mathbf{A}^{-1}\mathbf{b} \\ \mathbf{u} &= \mathbf{A}^{-1}\mathbf{b} \end{aligned}$$

• If the inverse exists, we say that $\mathbf{A}$ is invertible or nonsingular.
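• In R, solve() computes inverses and solves systems of equations (a small made-up system):

A <- matrix(c(2, 1, 1, 3), nrow = 2, ncol = 2)
b <- c(5, 10)
solve(A) %*% b  ## u = A^{-1} b
solve(A, b)     ## the same answer, without forming the inverse explicitly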

Back to OLS

• Let's assume, for now, that the inverse of $\mathbf{X}'\mathbf{X}$ exists (we'll come back to this)
• Then we can write the OLS estimator as the following:

$$\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}$$

• Memorize this: "ex prime ex inverse, ex prime y." Sear it into your soul.

OLS by hand in R

$$\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}$$

• First we need to get the design matrix and the response:

X <- model.matrix(trust_neighbors ~ exports + age + male +
                    urban_dum + malaria_ecology, data = nunn)
dim(X)

## [1] 20325     6

## model.frame always puts the response in the first column
y <- model.frame(trust_neighbors ~ exports + age + male +
                   urban_dum + malaria_ecology, data = nunn)[, 1]
length(y)

## [1] 20325

OLS by hand in R

$$\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}$$

• Use solve() for inverses and %*% for matrix multiplication:

solve(t(X) %*% X) %*% t(X) %*% y

##      (Intercept)   exports      age    male urban_dum
## [1,]       1.503 -0.001021 0.005045 0.02784   -0.2739
##      malaria_ecology
## [1,]         0.01941

coef(mod)

##     (Intercept)         exports             age            male
##        1.503037       -0.001021        0.005045        0.027837
##       urban_dum malaria_ecology
##       -0.273872        0.019411

Intuition for the OLS in matrix form

$$\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}$$

• What's the intuition here?
• "Numerator" $\mathbf{X}'\mathbf{y}$: roughly composed of the covariances between the columns of $\mathbf{X}$ and $\mathbf{y}$
• "Denominator" $\mathbf{X}'\mathbf{X}$: roughly composed of the sample variances and covariances of the variables within $\mathbf{X}$
• Thus, we have something like:

$$\hat{\boldsymbol{\beta}} \approx (\text{variance of } \mathbf{X})^{-1}(\text{covariance of } \mathbf{X} \text{ and } \mathbf{y})$$

• This is a rough sketch and isn’t strictly true, but it can provide intuition.

6/ OLS inference in matrix form

Random vectors

• A random vector is a vector of random variables:

$$\mathbf{x}_i = \begin{bmatrix} x_{i1} \\ x_{i2} \end{bmatrix}$$

• Here, $\mathbf{x}_i$ is a random vector and $x_{i1}$ and $x_{i2}$ are random variables.

• When we talk about the distribution of $\mathbf{x}_i$, we are talking about the joint distribution of $x_{i1}$ and $x_{i2}$.

Distribution of random vectors

$$\mathbf{x}_i \sim \left( \mathbb{E}[\mathbf{x}_i], \mathbb{V}[\mathbf{x}_i] \right)$$

• Expectation of random vectors:

$$\mathbb{E}[\mathbf{x}_i] = \begin{bmatrix} \mathbb{E}[x_{i1}] \\ \mathbb{E}[x_{i2}] \end{bmatrix}$$

• Variance of random vectors:

$$\mathbb{V}[\mathbf{x}_i] = \begin{bmatrix} \mathbb{V}[x_{i1}] & \text{Cov}[x_{i1}, x_{i2}] \\ \text{Cov}[x_{i1}, x_{i2}] & \mathbb{V}[x_{i2}] \end{bmatrix}$$

• The variance of a random vector is also called the variance-covariance matrix.

• These describe the joint distribution of $\mathbf{x}_i$
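• A sample-analogue sketch in R, using two variables from the Nunn data (the variable choice is just for illustration):

x <- cbind(age = nunn$age, exports = nunn$exports)
colMeans(x, na.rm = TRUE)     ## estimate of E[x_i]
cov(x, use = "complete.obs")  ## estimate of V[x_i]: a 2 x 2 variance-covariance matrix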

Most general OLS assumptions

1. Linearity: $\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{u}$
2. Random/iid sample: $(y_i, \mathbf{x}_i')$ are an iid sample from the population
3. No perfect collinearity: $\mathbf{X}$ is an $n \times (k+1)$ matrix with rank $k+1$
4. Zero conditional mean: $\mathbb{E}[\mathbf{u}|\mathbf{X}] = \mathbf{0}$
5. Homoskedasticity: $\text{var}(\mathbf{u}|\mathbf{X}) = \sigma_u^2 \mathbf{I}_n$
6. Normality: $\mathbf{u}|\mathbf{X} \sim N(\mathbf{0}, \sigma_u^2 \mathbf{I}_n)$

No perfect collinearity

• In matrix form: $\mathbf{X}$ is an $n \times (k+1)$ matrix with rank $k+1$
• Definition: The rank of a matrix is the maximum number of linearly independent columns.
• If $\mathbf{X}$ has rank $k+1$, then all of its columns are linearly independent…
• …and none of its columns are linearly dependent $\implies$ no perfect collinearity
• $\mathbf{X}$ has rank $k+1$ $\rightsquigarrow$ $(\mathbf{X}'\mathbf{X})$ is invertible
• Just like how variation in $X_i$ let us divide by the variance in simple OLS

Zero conditional mean error

• Using the zero conditional mean error assumption:

$$\mathbb{E}[\mathbf{u}|\mathbf{X}] = \begin{bmatrix} \mathbb{E}[u_1|\mathbf{X}] \\ \mathbb{E}[u_2|\mathbf{X}] \\ \vdots \\ \mathbb{E}[u_n|\mathbf{X}] \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} = \mathbf{0}$$

OLS is unbiased

• Under matrix assumptions 1-4, OLS is unbiased for $\boldsymbol{\beta}$:

$$\mathbb{E}[\hat{\boldsymbol{\beta}}] = \boldsymbol{\beta}$$

Variance-covariance matrix

• The homoskedasticity assumption is different:

$$\text{var}(\mathbf{u}|\mathbf{X}) = \sigma_u^2 \mathbf{I}_n$$

• In order to investigate this, we need to know what the variance of a vector is.
• The variance of a vector is actually a matrix:

$$\text{var}[\mathbf{u}] = \Sigma_u = \begin{bmatrix} \text{var}(u_1) & \text{cov}(u_1, u_2) & \cdots & \text{cov}(u_1, u_n) \\ \text{cov}(u_2, u_1) & \text{var}(u_2) & \cdots & \text{cov}(u_2, u_n) \\ \vdots & & \ddots & \\ \text{cov}(u_n, u_1) & \text{cov}(u_n, u_2) & \cdots & \text{var}(u_n) \end{bmatrix}$$

• This matrix is symmetric since $\text{cov}(u_i, u_j) = \text{cov}(u_j, u_i)$

Matrix version of homoskedasticity

• Once again: $\text{var}(\mathbf{u}|\mathbf{X}) = \sigma_u^2 \mathbf{I}_n$

• $\mathbf{I}_n$ is the $n \times n$ identity matrix
• Visually:

$$\text{var}[\mathbf{u}] = \sigma_u^2 \mathbf{I}_n = \begin{bmatrix} \sigma_u^2 & 0 & 0 & \cdots & 0 \\ 0 & \sigma_u^2 & 0 & \cdots & 0 \\ & & \vdots & & \\ 0 & 0 & 0 & \cdots & \sigma_u^2 \end{bmatrix}$$

• In less matrix notation:
  ▶ $\text{var}(u_i) = \sigma_u^2$ for all $i$ (constant variance)
  ▶ $\text{cov}(u_i, u_j) = 0$ for all $i \neq j$ (implied by iid)

Sampling variance for OLS estimates

• Under assumptions 1-5, the sampling variance of the OLS estimator can be written in matrix form as the following:

$$\text{var}[\hat{\boldsymbol{\beta}}] = \sigma_u^2 (\mathbf{X}'\mathbf{X})^{-1}$$

• Written out, this matrix looks like the following:

$$\text{var}[\hat{\boldsymbol{\beta}}] = \begin{bmatrix} \text{var}[\hat{\beta}_0] & \text{cov}[\hat{\beta}_0, \hat{\beta}_1] & \text{cov}[\hat{\beta}_0, \hat{\beta}_2] & \cdots & \text{cov}[\hat{\beta}_0, \hat{\beta}_k] \\ \text{cov}[\hat{\beta}_1, \hat{\beta}_0] & \text{var}[\hat{\beta}_1] & \text{cov}[\hat{\beta}_1, \hat{\beta}_2] & \cdots & \text{cov}[\hat{\beta}_1, \hat{\beta}_k] \\ \text{cov}[\hat{\beta}_2, \hat{\beta}_0] & \text{cov}[\hat{\beta}_2, \hat{\beta}_1] & \text{var}[\hat{\beta}_2] & \cdots & \text{cov}[\hat{\beta}_2, \hat{\beta}_k] \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \text{cov}[\hat{\beta}_k, \hat{\beta}_0] & \text{cov}[\hat{\beta}_k, \hat{\beta}_1] & \text{cov}[\hat{\beta}_k, \hat{\beta}_2] & \cdots & \text{var}[\hat{\beta}_k] \end{bmatrix}$$

where row and column $j$ correspond to the coefficient $\hat{\beta}_j$.

Inference in the general setting

• Under assumptions 1-5, in large samples:

$$\frac{\hat{\beta}_j - \beta_j}{\widehat{SE}[\hat{\beta}_j]} \sim N(0, 1)$$

• In small samples, under assumptions 1-6:

$$\frac{\hat{\beta}_j - \beta_j}{\widehat{SE}[\hat{\beta}_j]} \sim t_{n-(k+1)}$$

• Thus, under the null of $H_0: \beta_j = 0$, we know that

$$\frac{\hat{\beta}_j}{\widehat{SE}[\hat{\beta}_j]} \sim t_{n-(k+1)}$$

• Here, the estimated SEs come from:

$$\widehat{\text{var}}[\hat{\boldsymbol{\beta}}] = \hat{\sigma}_u^2 (\mathbf{X}'\mathbf{X})^{-1} \qquad \hat{\sigma}_u^2 = \frac{\hat{\mathbf{u}}'\hat{\mathbf{u}}}{n - (k+1)}$$
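• A by-hand sketch in R, using the X and y built earlier; the result should match the standard errors from lm():

bhat <- solve(t(X) %*% X) %*% t(X) %*% y
uhat <- y - X %*% bhat
sigma2hat <- sum(uhat^2) / (nrow(X) - ncol(X))  ## n - (k + 1)
vcov_hand <- sigma2hat * solve(t(X) %*% X)
sqrt(diag(vcov_hand))  ## compare with coef(summary(mod))[, "Std. Error"]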

Covariance matrix in R

• We can access this estimated covariance matrix, $\hat{\sigma}_u^2(\mathbf{X}'\mathbf{X})^{-1}$, in R:

vcov(mod)

##                   (Intercept)    exports        age       male
## (Intercept)      0.0004766593  1.164e-07 -7.956e-06 -6.676e-05
## exports          0.0000001164  1.676e-09 -3.659e-10  7.283e-09
## age             -0.0000079562 -3.659e-10  2.231e-07 -7.765e-07
## male            -0.0000667572  7.283e-09 -7.765e-07  1.909e-04
## urban_dum       -0.0000965843 -4.861e-08  7.108e-07 -1.711e-06
## malaria_ecology -0.0000069094 -2.124e-08  2.324e-10 -1.017e-07
##                  urban_dum malaria_ecology
## (Intercept)     -9.658e-05      -6.909e-06
## exports         -4.861e-08      -2.124e-08
## age              7.108e-07       2.324e-10
## male            -1.711e-06      -1.017e-07
## urban_dum        2.061e-04       2.724e-09
## malaria_ecology  2.724e-09       7.590e-07

Standard errors from the covariance matrix

• Note that the diagonal entries are the variances, so the square roots of the diagonal are the standard errors:

sqrt(diag(vcov(mod)))

##     (Intercept)         exports             age            male
##      0.02183253      0.00004094      0.00047237      0.01381627
##       urban_dum malaria_ecology
##      0.01435491      0.00087123

coef(summary(mod))[, "Std. Error"]

##     (Intercept)         exports             age            male
##      0.02183253      0.00004094      0.00047237      0.01381627
##       urban_dum malaria_ecology
##      0.01435491      0.00087123

Nunn & Wantchekon

[Figure from the original slides omitted]

Wrapping up

• You now have the full power of matrices at your disposal.
• This is key to writing the OLS estimator and discussing higher-level concepts in regression and beyond.
• Next week: diagnosing and fixing problems with the linear model.
