
Data, Covariance, and Correlation Matrix

Nathaniel E. Helwig

Assistant Professor of Psychology and Statistics, University of Minnesota (Twin Cities)

Updated 16-Jan-2017

Nathaniel E. Helwig (U of Minnesota), Data, Covariance, and Correlation Matrix, Updated 16-Jan-2017

Copyright

Copyright © 2017 by Nathaniel E. Helwig

Outline of Notes

1) The Data Matrix
   Definition
   Properties
   R code

2) The Covariance Matrix
   Definition
   Properties
   R code

3) The Correlation Matrix
   Definition
   Properties
   R code

4) Miscellaneous Topics
   Crossproduct calculations
   Vec and Kronecker
   Visualizing data


The Data Matrix

The Organization of Data

The data matrix refers to the array of numbers
\[
X = \begin{bmatrix}
x_{11} & x_{12} & \cdots & x_{1p} \\
x_{21} & x_{22} & \cdots & x_{2p} \\
x_{31} & x_{32} & \cdots & x_{3p} \\
\vdots & \vdots & \ddots & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{np}
\end{bmatrix}
\]

where $x_{ij}$ is the j-th variable collected from the i-th item (e.g., subject).
  items/subjects are rows
  variables are columns

X is a data matrix of order n × p (# items by # variables).

Collection of Column Vectors

We can view a data matrix as a collection of column vectors:
\[
X = \begin{bmatrix} x_1 & x_2 & \cdots & x_p \end{bmatrix}
\]

where xj is the j-th column of X for j ∈ {1,..., p}.

The n × 1 vector xj gives the j-th variable’s scores for the n items.

Collection of Row Vectors

We can view a data matrix as a collection of row vectors:

 0  x1  x0  X =  2   .   .  0 xn

where $x_i'$ is the i-th row of X for i ∈ {1, …, n}.

The 1 × p vector $x_i'$ gives the i-th item’s scores for the p variables.

Calculating Variable (Column) Means

The sample mean of the j-th variable is given by

\[
\bar{x}_j = \frac{1}{n}\sum_{i=1}^{n} x_{ij} = n^{-1} 1_n' x_j
\]
where

$1_n$ denotes an n × 1 vector of ones

$x_j$ denotes the j-th column of X

Calculating Item (Row) Means

The sample mean of the i-th item is given by

\[
\bar{x}_i = \frac{1}{p}\sum_{j=1}^{p} x_{ij} = p^{-1} x_i' 1_p
\]
where

$1_p$ denotes a p × 1 vector of ones

$x_i'$ denotes the i-th row of X
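The two matrix expressions above are easy to confirm numerically. This sketch assumes the mtcars data matrix X that is used in the R examples later in these notes:

```r
# Verify the matrix forms of the column and row means on mtcars
X <- as.matrix(mtcars)
n <- nrow(X)
p <- ncol(X)
ones_n <- rep(1, n)
ones_p <- rep(1, p)
# column means: n^{-1} 1_n' x_j, done for every column at once
all.equal(c(t(ones_n) %*% X / n), unname(colMeans(X)))  # TRUE
# row means: p^{-1} x_i' 1_p, done for every row at once
all.equal(c(X %*% ones_p / p), unname(rowMeans(X)))     # TRUE
```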

Data Frame and Matrix Classes in R

> data(mtcars)
> class(mtcars)
[1] "data.frame"
> dim(mtcars)
[1] 32 11
> head(mtcars)
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
> X <- as.matrix(mtcars)
> class(X)
[1] "matrix"

Row and Column Means

> # get row means (3 ways)
> rowMeans(X)[1:3]
    Mazda RX4 Mazda RX4 Wag    Datsun 710
     29.90727      29.98136      23.59818
> c(mean(X[1,]), mean(X[2,]), mean(X[3,]))
[1] 29.90727 29.98136 23.59818
> apply(X, 1, mean)[1:3]
    Mazda RX4 Mazda RX4 Wag    Datsun 710
     29.90727      29.98136      23.59818

> # get column means (3 ways)
> colMeans(X)[1:3]
      mpg       cyl      disp
 20.09062   6.18750 230.72188
> c(mean(X[,1]), mean(X[,2]), mean(X[,3]))
[1] 20.09062 6.18750 230.72188
> apply(X, 2, mean)[1:3]
      mpg       cyl      disp
 20.09062   6.18750 230.72188

Other Row and Column Functions

> # get column medians
> apply(X, 2, median)[1:3]
 mpg  cyl disp
19.2  6.0 196.3
> c(median(X[,1]), median(X[,2]), median(X[,3]))
[1] 19.2 6.0 196.3

> # get column ranges
> apply(X, 2, range)[,1:3]
      mpg cyl  disp
[1,] 10.4   4  71.1
[2,] 33.9   8 472.0
> cbind(range(X[,1]), range(X[,2]), range(X[,3]))
     [,1] [,2]  [,3]
[1,] 10.4    4  71.1
[2,] 33.9    8 472.0


The Covariance Matrix

The Covariation of Data

The covariance matrix refers to the symmetric array of numbers

 2  s1 s12 s13 ··· s1p 2 s21 s s23 ··· s2p  2  s s s2 ··· s  S =  31 32 3 3p  ......   . . . . .  2 sp1 sp2 sp3 ··· sp

where
$s_j^2 = (1/n)\sum_{i=1}^{n}(x_{ij} - \bar{x}_j)^2$ is the variance of the j-th variable
$s_{jk} = (1/n)\sum_{i=1}^{n}(x_{ij} - \bar{x}_j)(x_{ik} - \bar{x}_k)$ is the covariance between the j-th and k-th variables
$\bar{x}_j = (1/n)\sum_{i=1}^{n} x_{ij}$ is the mean of the j-th variable

Covariance Matrix from Data Matrix

We can calculate the covariance matrix as
\[
S = \frac{1}{n} X_c' X_c
\]
where $X_c = X - 1_n \bar{x}' = C X$ with
$\bar{x} = (\bar{x}_1, \ldots, \bar{x}_p)'$ denoting the vector of variable means
$C = I_n - n^{-1} 1_n 1_n'$ denoting a centering matrix

Note that the centered matrix $X_c$ has the form
\[
X_c = \begin{bmatrix}
x_{11} - \bar{x}_1 & x_{12} - \bar{x}_2 & \cdots & x_{1p} - \bar{x}_p \\
x_{21} - \bar{x}_1 & x_{22} - \bar{x}_2 & \cdots & x_{2p} - \bar{x}_p \\
x_{31} - \bar{x}_1 & x_{32} - \bar{x}_2 & \cdots & x_{3p} - \bar{x}_p \\
\vdots & \vdots & \ddots & \vdots \\
x_{n1} - \bar{x}_1 & x_{n2} - \bar{x}_2 & \cdots & x_{np} - \bar{x}_p
\end{bmatrix}
\]

Variances are Nonnegative

Variances are sums-of-squares, which implies that $s_j^2 \ge 0$ for all j.
$s_j^2 > 0$ as long as there does not exist an $\alpha$ such that $x_j = \alpha 1_n$.

This implies that. . .
$\mathrm{tr}(S) \ge 0$ where tr(·) denotes the matrix trace function
$\sum_{j=1}^{p} \lambda_j \ge 0$ where $(\lambda_1, \ldots, \lambda_p)$ are the eigenvalues of S

If n < p, then λj = 0 for at least one j ∈ {1,..., p}. If n ≥ p and the p columns of X are linearly independent, then λj > 0 for all j ∈ {1,..., p}.
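These trace and eigenvalue facts are easy to confirm numerically; a small sketch using the mtcars covariance matrix from the R examples in these notes:

```r
# tr(S) equals the sum of the eigenvalues of S, and every eigenvalue is
# positive here because mtcars has n = 32 > p = 11 linearly independent columns
S <- cov(as.matrix(mtcars))
evals <- eigen(S, symmetric = TRUE)$values
all.equal(sum(diag(S)), sum(evals))  # TRUE
min(evals) > 0                       # TRUE
```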

The Cauchy-Schwarz Inequality

From the Cauchy-Schwarz inequality we have that

\[
s_{jk}^2 \le s_j^2 s_k^2
\]

with the equality holding if and only if xj and xk are linearly dependent.

We could also write the Cauchy-Schwarz inequality as

\[
|s_{jk}| \le s_j s_k
\]

where sj and sk denote the standard deviations of the variables.
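A quick numeric illustration of this bound, sketched with two of the mtcars variables:

```r
# |s_jk| <= s_j s_k for the covariance of mpg and hp
X <- as.matrix(mtcars)
sjk <- cov(X[, "mpg"], X[, "hp"])
abs(sjk) <= sd(X[, "mpg"]) * sd(X[, "hp"])  # TRUE
```

Equality would hold only if mpg were an exact linear function of hp, which it is not.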

Covariance Matrix by Hand (hard way)

> n <- nrow(X)
> C <- diag(n) - matrix(1/n, n, n)
> Xc <- C %*% X
> S <- t(Xc) %*% Xc / (n-1)
> S[1:3,1:6]
             mpg        cyl       disp        hp        drat         wt
mpg    36.324103  -9.172379  -633.0972 -320.7321   2.1950635  -5.116685
cyl    -9.172379   3.189516   199.6603  101.9315  -0.6683669   1.367371
disp -633.097208 199.660282 15360.7998 6721.1587 -47.0640192 107.684204

# or #

> Xc <- scale(X, center=TRUE, scale=FALSE)
> S <- t(Xc) %*% Xc / (n-1)
> S[1:3,1:6]
             mpg        cyl       disp        hp        drat         wt
mpg    36.324103  -9.172379  -633.0972 -320.7321   2.1950635  -5.116685
cyl    -9.172379   3.189516   199.6603  101.9315  -0.6683669   1.367371
disp -633.097208 199.660282 15360.7998 6721.1587 -47.0640192 107.684204

Covariance Matrix using cov Function (easy way)

> # calculate covariance matrix
> S <- cov(X)
> dim(S)
[1] 11 11

> # check variance
> S[1,1]
[1] 36.3241
> var(X[,1])
[1] 36.3241
> sum((X[,1]-mean(X[,1]))^2) / (n-1)
[1] 36.3241

> # check covariance
> S[1:3,1:6]
             mpg        cyl       disp        hp        drat         wt
mpg    36.324103  -9.172379  -633.0972 -320.7321   2.1950635  -5.116685
cyl    -9.172379   3.189516   199.6603  101.9315  -0.6683669   1.367371
disp -633.097208 199.660282 15360.7998 6721.1587 -47.0640192 107.684204


The Correlation Matrix

The Correlation of Data

The correlation matrix refers to the symmetric array of numbers
\[
R = \begin{bmatrix}
1 & r_{12} & r_{13} & \cdots & r_{1p} \\
r_{21} & 1 & r_{23} & \cdots & r_{2p} \\
r_{31} & r_{32} & 1 & \cdots & r_{3p} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
r_{p1} & r_{p2} & r_{p3} & \cdots & 1
\end{bmatrix}
\]

where
\[
r_{jk} = \frac{s_{jk}}{s_j s_k} = \frac{\sum_{i=1}^{n}(x_{ij} - \bar{x}_j)(x_{ik} - \bar{x}_k)}{\sqrt{\sum_{i=1}^{n}(x_{ij} - \bar{x}_j)^2}\,\sqrt{\sum_{i=1}^{n}(x_{ik} - \bar{x}_k)^2}}
\]

is the Pearson correlation coefficient between variables xj and xk .

Correlation Matrix from Data Matrix

We can calculate the correlation matrix as
\[
R = \frac{1}{n} X_s' X_s
\]
where $X_s = C X D^{-1}$ with
$C = I_n - n^{-1} 1_n 1_n'$ denoting a centering matrix
$D = \mathrm{diag}(s_1, \ldots, s_p)$ denoting a diagonal scaling matrix

Note that the standardized matrix $X_s$ has the form
\[
X_s = \begin{bmatrix}
(x_{11} - \bar{x}_1)/s_1 & (x_{12} - \bar{x}_2)/s_2 & \cdots & (x_{1p} - \bar{x}_p)/s_p \\
(x_{21} - \bar{x}_1)/s_1 & (x_{22} - \bar{x}_2)/s_2 & \cdots & (x_{2p} - \bar{x}_p)/s_p \\
(x_{31} - \bar{x}_1)/s_1 & (x_{32} - \bar{x}_2)/s_2 & \cdots & (x_{3p} - \bar{x}_p)/s_p \\
\vdots & \vdots & \ddots & \vdots \\
(x_{n1} - \bar{x}_1)/s_1 & (x_{n2} - \bar{x}_2)/s_2 & \cdots & (x_{np} - \bar{x}_p)/s_p
\end{bmatrix}
\]

Correlation of a Variable with Itself is One

Assuming that $s_j^2 > 0$ for all j ∈ {1, …, p}, we have that
\[
\mathrm{Cor}(x_j, x_k) = \frac{\sum_{i=1}^{n}(x_{ij} - \bar{x}_j)(x_{ik} - \bar{x}_k)}{\sqrt{\sum_{i=1}^{n}(x_{ij} - \bar{x}_j)^2}\,\sqrt{\sum_{i=1}^{n}(x_{ik} - \bar{x}_k)^2}} = \begin{cases} 1 & \text{if } j = k \\ r_{jk} & \text{if } j \neq k \end{cases}
\]

Because $r_{jk} = 1$ whenever j = k, we know that
$\mathrm{tr}(R) = p$ where tr(·) denotes the matrix trace function
$\sum_{j=1}^{p} \lambda_j = p$ where $(\lambda_1, \ldots, \lambda_p)$ are the eigenvalues of R

We also know that the eigenvalues satisfy

λj = 0 for at least one j ∈ {1,..., p} if n < p

λj > 0 ∀j if columns of X are linearly independent
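These facts can also be checked numerically; a sketch using the mtcars correlation matrix from the R examples:

```r
# tr(R) = p and the eigenvalues of R sum to p (here p = 11)
R <- cor(as.matrix(mtcars))
p <- ncol(R)
all.equal(sum(diag(R)), as.numeric(p))                            # TRUE
all.equal(sum(eigen(R, symmetric = TRUE)$values), as.numeric(p))  # TRUE
```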

The Cauchy-Schwarz Inequality (revisited)

Reminder: the Cauchy-Schwarz inequality implies that

\[
s_{jk}^2 \le s_j^2 s_k^2
\]

with the equality holding if and only if xj and xk are linearly dependent.

Rearranging the terms, we have that

\[
\frac{s_{jk}^2}{s_j^2 s_k^2} \le 1 \quad \longleftrightarrow \quad r_{jk}^2 \le 1
\]

which implies that |rjk | ≤ 1 with equality holding if and only if xj = α1n + βxk for some scalars α ∈ R and β ∈ R.
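The equality condition is easy to demonstrate: if one variable is an exact linear function of another, the correlation is exactly ±1. A minimal sketch with made-up numbers:

```r
# r = 1 (or -1) when y = alpha*1_n + beta*x exactly
x <- c(1, 2, 3, 4, 5)
y <- 2 + 3 * x       # beta > 0, so r = 1
z <- 2 - 3 * x       # beta < 0, so r = -1
all.equal(cor(x, y), 1)   # TRUE
all.equal(cor(x, z), -1)  # TRUE
```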

Correlation Matrix by Hand (hard way)

> n <- nrow(X)
> C <- diag(n) - matrix(1/n, n, n)
> D <- diag(apply(X, 2, sd))
> Xs <- C %*% X %*% solve(D)
> R <- t(Xs) %*% Xs / (n-1)
> R[1:3,1:6]
           [,1]       [,2]       [,3]       [,4]       [,5]       [,6]
[1,]  1.0000000 -0.8521620 -0.8475514 -0.7761684  0.6811719 -0.8676594
[2,] -0.8521620  1.0000000  0.9020329  0.8324475 -0.6999381  0.7824958
[3,] -0.8475514  0.9020329  1.0000000  0.7909486 -0.7102139  0.8879799

# or #

> Xs <- scale(X, center=TRUE, scale=TRUE)
> R <- t(Xs) %*% Xs / (n-1)
> R[1:3,1:6]
            mpg        cyl       disp         hp       drat         wt
mpg   1.0000000 -0.8521620 -0.8475514 -0.7761684  0.6811719 -0.8676594
cyl  -0.8521620  1.0000000  0.9020329  0.8324475 -0.6999381  0.7824958
disp -0.8475514  0.9020329  1.0000000  0.7909486 -0.7102139  0.8879799

Correlation Matrix using cor Function (easy way)

> # calculate correlation matrix
> R <- cor(X)
> dim(R)
[1] 11 11

> # check correlation of mpg and cyl
> R[1,2]
[1] -0.852162
> cor(X[,1], X[,2])
[1] -0.852162
> cov(X[,1], X[,2]) / (sd(X[,1]) * sd(X[,2]))
[1] -0.852162

> # check correlations
> R[1:3,1:6]
            mpg        cyl       disp         hp       drat         wt
mpg   1.0000000 -0.8521620 -0.8475514 -0.7761684  0.6811719 -0.8676594
cyl  -0.8521620  1.0000000  0.9020329  0.8324475 -0.6999381  0.7824958
disp -0.8475514  0.9020329  1.0000000  0.7909486 -0.7102139  0.8879799


Miscellaneous Topics

Two Types of Matrix Crossproducts

We often need to calculate one of two different types of crossproducts:
X'Y = “regular” crossproduct of X and Y
XY' = “transpose” crossproduct of X and Y

The regular crossproduct is X' post-multiplied by Y.

The transpose crossproduct is X post-multiplied by Y'.

Simple and Efficient Crossproducts in R

> X <- matrix(rnorm(2*3), 2, 3)
> Y <- matrix(rnorm(2*3), 2, 3)
> t(X) %*% Y
          [,1]       [,2]      [,3]
[1,] 0.1342302 -1.8181837 -1.107821
[2,] 1.1014703 -0.6619466 -1.356606
[3,] 0.8760823 -1.0077151 -1.340044
> crossprod(X, Y)
          [,1]       [,2]      [,3]
[1,] 0.1342302 -1.8181837 -1.107821
[2,] 1.1014703 -0.6619466 -1.356606
[3,] 0.8760823 -1.0077151 -1.340044
> X %*% t(Y)
           [,1]      [,2]
[1,]  0.8364239  3.227566
[2,] -1.3899946 -2.704184
> tcrossprod(X, Y)
           [,1]      [,2]
[1,]  0.8364239  3.227566
[2,] -1.3899946 -2.704184

Turning a Matrix into a Vector

The vectorization (vec) operator turns a matrix into a vector:
\[
\mathrm{vec}(X) = (x_{11}, x_{21}, \ldots, x_{n1}, x_{12}, \ldots, x_{n2}, \ldots, x_{1p}, \ldots, x_{np})'
\]

where the vectorization is done column-by-column.

In R, we just use the combine function c to vectorize a matrix:
> X <- matrix(1:6, 2, 3)
> X
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
> c(X)
[1] 1 2 3 4 5 6
> c(t(X))
[1] 1 3 5 2 4 6

vec Operator Properties

Some useful properties of the vec(·) operator include:
$\mathrm{vec}(a') = \mathrm{vec}(a) = a$ for any vector $a \in \mathbb{R}^m$
$\mathrm{vec}(ab') = b \otimes a$ for any vectors $a \in \mathbb{R}^m$ and $b \in \mathbb{R}^n$
$\mathrm{vec}(A)'\mathrm{vec}(B) = \mathrm{tr}(A'B)$ for any matrices $A, B \in \mathbb{R}^{m \times n}$
$\mathrm{vec}(ABC) = (C' \otimes A)\mathrm{vec}(B)$ if the product ABC exists

Note: ⊗ is the Kronecker product, which is defined on the next slide.
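These identities are straightforward to verify numerically. This sketch checks vec(ab') = b ⊗ a and vec(ABC) = (C' ⊗ A)vec(B) with small arbitrary matrices; c() plays the role of vec, and "Cm" is used as a name only to avoid masking base::C:

```r
# vec(ab') = b (x) a
a <- 1:3; b <- 4:5
all.equal(c(a %*% t(b)), c(kronecker(b, a)))  # TRUE
# vec(ABC) = (C' (x) A) vec(B)
A <- matrix(1:6, 2, 3)
B <- matrix(7:12, 3, 2)
Cm <- matrix(1:4, 2, 2)
all.equal(c(A %*% B %*% Cm), c(kronecker(t(Cm), A) %*% c(B)))  # TRUE
```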

Kronecker Product of Two Matrices

Given $X = \{x_{ij}\}_{n \times p}$ and $Y = \{y_{ij}\}_{m \times q}$, the Kronecker product is
\[
X \otimes Y = \begin{bmatrix}
x_{11}Y & x_{12}Y & \cdots & x_{1p}Y \\
x_{21}Y & x_{22}Y & \cdots & x_{2p}Y \\
\vdots & \vdots & \ddots & \vdots \\
x_{n1}Y & x_{n2}Y & \cdots & x_{np}Y
\end{bmatrix}
\]
which is a matrix of order mn × pq.

In R, the kronecker function calculates Kronecker products:
> X <- matrix(1:4, 2, 2)
> Y <- matrix(5:10, 2, 3)
> kronecker(X, Y)
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    5    7    9   15   21   27
[2,]    6    8   10   18   24   30
[3,]   10   14   18   20   28   36
[4,]   12   16   20   24   32   40

Kronecker Product Properties

Some useful properties of the Kronecker product include:
1. $A \otimes a = Aa = aA = a \otimes A$ for any $a \in \mathbb{R}$ and $A \in \mathbb{R}^{m \times n}$
2. $(A \otimes B)' = A' \otimes B'$ for any matrices $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{p \times q}$
3. $a' \otimes b = ba' = b \otimes a'$ for any vectors $a \in \mathbb{R}^m$ and $b \in \mathbb{R}^n$
4. $\mathrm{tr}(A \otimes B) = \mathrm{tr}(A)\,\mathrm{tr}(B)$ for any matrices $A \in \mathbb{R}^{m \times m}$ and $B \in \mathbb{R}^{p \times p}$
5. $(A \otimes B)^{-1} = A^{-1} \otimes B^{-1}$ for any invertible matrices A and B
6. $(A \otimes B)^{\dagger} = A^{\dagger} \otimes B^{\dagger}$ where $(\cdot)^{\dagger}$ is the Moore-Penrose pseudoinverse
7. $|A \otimes B| = |A|^p |B|^m$ for any matrices $A \in \mathbb{R}^{m \times m}$ and $B \in \mathbb{R}^{p \times p}$
8. $\mathrm{rank}(A \otimes B) = \mathrm{rank}(A)\,\mathrm{rank}(B)$ for any matrices A and B
9. $A \otimes (B + C) = A \otimes B + A \otimes C$ for any matrices A, B, and C
10. $(A + B) \otimes C = A \otimes C + B \otimes C$ for any matrices A, B, and C
11. $(A \otimes B) \otimes C = A \otimes (B \otimes C)$ for any matrices A, B, and C
12. $(A \otimes B)(C \otimes D) = (AC) \otimes (BD)$ for any matrices A, B, C, and D
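Two of these properties (2 and 12) checked numerically; a sketch with arbitrary random matrices, again naming the third matrix "Cm" to avoid masking base::C:

```r
set.seed(1)
A <- matrix(rnorm(4), 2, 2);  B <- matrix(rnorm(9), 3, 3)
Cm <- matrix(rnorm(4), 2, 2); D <- matrix(rnorm(9), 3, 3)
# (A (x) B)' = A' (x) B'
all.equal(t(kronecker(A, B)), kronecker(t(A), t(B)))  # TRUE
# (A (x) B)(C (x) D) = (AC) (x) (BD)
all.equal(kronecker(A, B) %*% kronecker(Cm, D),
          kronecker(A %*% Cm, B %*% D))               # TRUE
```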

Common Application of Vec and Kronecker

Suppose the rows of X are iid samples from some multivariate distribution with mean $\mu = (\mu_1, \ldots, \mu_p)'$ and covariance matrix $\Sigma$:
\[
x_i \overset{iid}{\sim} (\mu, \Sigma)
\]
where $x_i$ is the i-th row of X.

If we let $y = \mathrm{vec}(X')$, then the expectation and covariance are
$E(y) = 1_n \otimes \mu$ is the mean vector
$V(y) = I_n \otimes \Sigma$ is the covariance matrix

Note that the covariance matrix is block diagonal
\[
I_n \otimes \Sigma = \begin{bmatrix}
\Sigma & 0 & \cdots & 0 \\
0 & \Sigma & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \Sigma
\end{bmatrix}
\]
given that data from different subjects are assumed to be independent.
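In R this block-diagonal structure can be built directly with kronecker; a sketch with n = 3 items and a hypothetical 2 × 2 covariance matrix Sigma:

```r
# I_n (x) Sigma: Sigma repeats down the diagonal, zeros elsewhere
Sigma <- matrix(c(2, 1, 1, 3), 2, 2)  # hypothetical covariance matrix
V <- kronecker(diag(3), Sigma)        # 6 x 6 block-diagonal matrix
V[1:2, 1:2]                           # first diagonal block equals Sigma
all(V[1:2, 3:6] == 0)                 # TRUE: off-diagonal blocks are zero
```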

Two Versions of a Scatterplot in R

[Figure: side-by-side scatterplots of MPG versus HP; left from plot(), right from car::scatterplot()]

plot(mtcars$hp, mtcars$mpg, xlab="HP", ylab="MPG")
library(car)
scatterplot(mtcars$hp, mtcars$mpg, xlab="HP", ylab="MPG")

Two Versions of a Scatterplot Matrix in R

[Figure: scatterplot matrices of mpg, disp, hp, and wt colored by cyl; left from pairs(), right from car::scatterplotMatrix()]

cylint <- as.integer(factor(mtcars$cyl))
pairs(~mpg+disp+hp+wt, data=mtcars, col=cylint, pch=cylint)
library(car)
scatterplotMatrix(~mpg+disp+hp+wt|cyl, data=mtcars)

Three-Dimensional Scatterplot in R

[Figure: three-dimensional scatterplots of MPG against HP and WT with a fitted regression plane, drawn with scatterplot3d]

library(scatterplot3d)
sp3d <- scatterplot3d(mtcars$hp, mtcars$wt, mtcars$mpg, type="h", color=cylint,
                      pch=cylint, xlab="HP", ylab="WT", zlab="MPG")
fitmod <- lm(mpg ~ hp + wt, data=mtcars)
sp3d$plane3d(fitmod)

Color Image (Heat Map) Plots in R

[Figure: heat maps of predicted MPG over a grid of HP and WT values; left from image(), right from bigsplines::imagebar() with a color legend]

fitmod <- lm(mpg ~ hp + wt, data=mtcars)
hpseq <- seq(50, 330, by=20)
wtseq <- seq(1.5, 5.4, length=15)
newdata <- expand.grid(hp=hpseq, wt=wtseq)
fit <- predict(fitmod, newdata)
fitmat <- matrix(fit, 15, 15)
image(hpseq, wtseq, fitmat, xlab="HP", ylab="WT")
library(bigsplines)
imagebar(hpseq, wtseq, fitmat, xlab="HP", ylab="WT", zlab="MPG",
         col=heat.colors(12), ncolor=12)

Correlation Matrix Plot in R

[Figure: mtcars correlation matrix plots; left circles from corrplot(), right numbers and ellipses from corrplot.mixed()]

cmat <- cor(mtcars)
library(corrplot)
corrplot(cmat, method="circle")
corrplot.mixed(cmat, lower="number", upper="ellipse")

Correlation Matrix Color Image (Heat Map) in R

[Figure: color image of the mtcars correlation matrix with correlation values printed in the upper triangle]

cmat <- cor(mtcars)
p <- nrow(cmat)
library(RColorBrewer)
imagebar(1:p, 1:p, cmat[,p:1], axes=F, zlim=c(-1,1), xlab="", ylab="",
         col=brewer.pal(7, "RdBu"))
axis(1, 1:p, labels=rownames(cmat))
axis(2, p:1, labels=colnames(cmat))
for(k in 1:p) {
  for(j in 1:k) {
    if(j < k) text(j, p+1-k, labels=round(cmat[j,k],2), cex=0.75)
  }
}
