
Unit 6: Matrix decomposition

Juan Luis Melero and Eduardo Eyras

October 2018

Contents

1 Matrix decomposition
  1.1 Definitions
2 Singular Value Decomposition
3 Exercises
4 R practical
  4.1 SVD

1 Matrix decomposition

1.1 Definitions

Matrix decomposition consists of transforming a matrix into a product of other matrices. Matrix diagonalization is a special case of decomposition and is also called the diagonal (eigen) decomposition of a matrix.

Definition: there is a diagonal decomposition of a square matrix A if we can write A = UDU^{-1}, where:

• D is a diagonal matrix

• The diagonal elements of D are the eigenvalues of A

• The column vectors of U are the eigenvectors of A

For symmetric matrices there is a special decomposition:

Definition: given a symmetric matrix A (i.e. A^T = A), there is a unique decomposition of the form A = UDU^T, where:

• U is an orthonormal matrix

• The column vectors of U are the unit-norm orthogonal eigenvectors of A

• D is a diagonal matrix formed by the eigenvalues of A

This special decomposition is known as spectral decomposition.

Definition: an orthonormal matrix is a square matrix whose column and row vectors are orthogonal unit vectors (orthonormal vectors).

Orthonormal matrices have the property that their transpose is their inverse.

Proposition: if a matrix Q is orthonormal, then Q^T = Q^{-1}.

Proof: consider the row vectors u_1, ..., u_n of the matrix Q:

Q = \begin{pmatrix} u_1 \\ \vdots \\ u_n \end{pmatrix} \quad \text{and} \quad Q^T = \begin{pmatrix} u_1 \,|\, \dots \,|\, u_n \end{pmatrix}

QQ^T = \begin{pmatrix} u_1 \\ \vdots \\ u_n \end{pmatrix} \begin{pmatrix} u_1 \,|\, \dots \,|\, u_n \end{pmatrix} = \begin{pmatrix} \langle u_1, u_1 \rangle & \dots & \langle u_1, u_n \rangle \\ \vdots & \ddots & \vdots \\ \langle u_n, u_1 \rangle & \dots & \langle u_n, u_n \rangle \end{pmatrix} = \begin{pmatrix} 1 & \dots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \dots & 1 \end{pmatrix} = I_n

Remember that:

Orthogonal vectors: \langle u_i, u_j \rangle = 0, \; \forall i \neq j

Unit vectors: \langle u_i, u_i \rangle = 1, \; \forall i

In addition, the determinant of Q is ±1.

Theorem: if Q is an orthonormal matrix, then \det(Q) = \pm 1.

Proof:

\det(Q)^2 = \det(Q)\det(Q^T) = \det(QQ^T) = \det(I_n) = 1 \;\rightarrow\; \det(Q) = \pm\sqrt{1} = \pm 1
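These properties are easy to check numerically. Below is a minimal R sketch, using a 2 × 2 rotation matrix as an example of an orthonormal matrix (the angle pi/6 is an arbitrary choice):

# A rotation matrix is orthonormal: Q^T = Q^{-1} and det(Q) = +/-1
theta <- pi / 6
Q <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), 2, 2)
round(t(Q) %*% Q, 10)        # the 2x2 identity matrix
round(t(Q) - solve(Q), 10)   # the zero matrix, so Q^T = Q^{-1}
det(Q)                       # 1 (a reflection would give -1)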

Proposition: the inner product of two vectors is invariant under transformation by an orthonormal matrix:

\langle Qu, Qv \rangle = \langle u, v \rangle

Proof:

\langle Qu, Qv \rangle = (Qu)^T Qv = u^T Q^T Q v = u^T v = \langle u, v \rangle

Corollary: the column (or row) vectors of an orthonormal matrix form an (orthonormal) basis of \mathbb{R}^n.

Example: perform the spectral decomposition of the matrix A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}.

We calculate the eigenvalues and the eigenvectors:

\det(A - \lambda I_2) = \det\begin{pmatrix} 2-\lambda & 1 \\ 1 & 2-\lambda \end{pmatrix} = \lambda^2 - 4\lambda + 3

\det(A - \lambda I_2) = 0 \;\rightarrow\; \lambda^2 - 4\lambda + 3 = (\lambda - 1)(\lambda - 3) = 0 \;\rightarrow\; \lambda = 1, 3

\lambda = 3: \quad \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = 3\begin{pmatrix} x \\ y \end{pmatrix} \;\rightarrow\; \begin{cases} 2x + y = 3x \\ x + 2y = 3y \end{cases} \;\rightarrow\; x = y \;\rightarrow\; \text{eigenvectors of the form } \begin{pmatrix} x \\ x \end{pmatrix}

\lambda = 1: \quad \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = 1\begin{pmatrix} x \\ y \end{pmatrix} \;\rightarrow\; \begin{cases} 2x + y = x \\ x + 2y = y \end{cases} \;\rightarrow\; y = -x \;\rightarrow\; \text{eigenvectors of the form } \begin{pmatrix} x \\ -x \end{pmatrix}

We have the following eigenspaces:

E(3) = \ker(A - 3I_2) = \left\{ \begin{pmatrix} a \\ a \end{pmatrix}, a \in \mathbb{R} \right\} \subset \mathbb{R}^2

E(1) = \ker(A - I_2) = \left\{ \begin{pmatrix} b \\ -b \end{pmatrix}, b \in \mathbb{R} \right\} \subset \mathbb{R}^2

We have to choose two orthogonal unit vectors. The eigenvectors are already orthogonal to each other:

\left\langle \begin{pmatrix} a \\ a \end{pmatrix}, \begin{pmatrix} b \\ -b \end{pmatrix} \right\rangle = ab - ab = 0

We normalize both vectors to 1:

\left\| \begin{pmatrix} a \\ a \end{pmatrix} \right\| = 1 \;\rightarrow\; a = \frac{1}{\sqrt{2}} \qquad \left\| \begin{pmatrix} b \\ -b \end{pmatrix} \right\| = 1 \;\rightarrow\; b = \frac{1}{\sqrt{2}}

So the basis B = \left\{ \begin{pmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix}, \begin{pmatrix} 1/\sqrt{2} \\ -1/\sqrt{2} \end{pmatrix} \right\} is an orthonormal basis of \mathbb{R}^2.

With these orthonormal vectors we build the orthonormal matrix that diagonalizes A:

U = \begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ -1/\sqrt{2} & 1/\sqrt{2} \end{pmatrix} \qquad U^{-1} = U^T = \begin{pmatrix} 1/\sqrt{2} & -1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{pmatrix}

The matrix U provides a unique diagonal spectral decomposition:

A = UDU^T = \begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ -1/\sqrt{2} & 1/\sqrt{2} \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 3 \end{pmatrix} \begin{pmatrix} 1/\sqrt{2} & -1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{pmatrix}

Notice that the order of the eigenvectors in the orthonormal matrix determines the order of the eigenvalues in the diagonal matrix.
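As a quick check, the same spectral decomposition can be obtained in R with eigen(). This is a sketch: eigen() sorts the eigenvalues in decreasing order and may flip the sign of an eigenvector, but the product UDU^T is unchanged:

# Spectral decomposition of A with eigen()
A <- matrix(c(2, 1, 1, 2), 2, 2)
e <- eigen(A)
U <- e$vectors            # orthonormal eigenvectors as columns
D <- diag(e$values)       # diag(3, 1): decreasing order
U %*% D %*% t(U)          # reproduces A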

If the matrix A \in M_{n \times n}(\mathbb{R}) is non-singular (\det(A) \neq 0), we can define the inverse of A using the spectral decomposition A = UDU^T as:

A^{-1} = (UDU^T)^{-1} = (U^T)^{-1} D^{-1} U^{-1} = U D^{-1} U^T

where we have used that U is an orthonormal matrix (built with the eigenvectors of A), for which the inverse is the transpose. Here D is the diagonal matrix with the eigenvalues of A, hence D^{-1} is a diagonal matrix containing the reciprocals of the eigenvalues. We confirm that this is the inverse of A:

AA^{-1} = UDU^T \, U D^{-1} U^T = U D D^{-1} U^T = U U^T = I_n

Example: consider the same matrix as before,

A = UDU^T = \begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ -1/\sqrt{2} & 1/\sqrt{2} \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 3 \end{pmatrix} \begin{pmatrix} 1/\sqrt{2} & -1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{pmatrix}

A^{-1} = \begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ -1/\sqrt{2} & 1/\sqrt{2} \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 1/3 \end{pmatrix} \begin{pmatrix} 1/\sqrt{2} & -1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{pmatrix}

You can easily check that AA^{-1} = I_2.
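The same computation can be sketched in R, reusing eigen() as above:

# Inverse via the spectral decomposition: A^{-1} = U D^{-1} U^T
A <- matrix(c(2, 1, 1, 2), 2, 2)
e <- eigen(A)
U <- e$vectors
Ainv <- U %*% diag(1 / e$values) %*% t(U)   # reciprocal eigenvalues
round(A %*% Ainv, 10)                       # the 2x2 identity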

2 Singular Value Decomposition

Singular Value Decomposition (SVD) is a form of dimensionality reduction. SVD allows us to identify the dimensions that are most relevant. Once these dimensions are identified, we can find the best approximation to the original data using fewer dimensions.

SVD generalizes the orthonormal (spectral) decomposition of symmetric matrices to non-symmetric matrices and to rectangular (non-square) matrices.

In the decomposition A = UΣV^T (defined below), A is the input matrix of dimensions m × n, U is an orthonormal matrix of dimensions m × m, Σ contains a diagonal submatrix of size min(m, n) with the remaining elements equal to 0, and V is an orthonormal matrix of dimensions n × n.

Definition of SVD: given a rectangular matrix A \in M_{m \times n}(\mathbb{R}), we can find a decomposition A = U\Sigma V^T, where:

• U and V are orthonormal matrices

• The columns of U are orthonormal eigenvectors of AA^T \in M_{m \times m}(\mathbb{R}). U is the orthonormal matrix that diagonalizes AA^T. These are called the left singular vectors of A

• The columns of V are orthonormal eigenvectors of A^T A \in M_{n \times n}(\mathbb{R}). V is the orthonormal matrix that diagonalizes A^T A. These are called the right singular vectors of A

• \Sigma is a rectangular block-diagonal matrix containing the square roots of the eigenvalues of A^T A (which are the same as those of AA^T). These are called the singular values

Let's try to understand where A = U\Sigma V^T comes from and how to do the decomposition.

The idea is to build symmetric matrices, because they have the property that their eigenvector matrix is orthonormal. To build symmetric matrices we multiply the matrix by its transpose:

AA^T \in M_{m \times m}(\mathbb{R}) \qquad A^T A \in M_{n \times n}(\mathbb{R})

These matrices are square and symmetric:

(AA^T)^T = (A^T)^T A^T = AA^T \qquad (A^T A)^T = A^T (A^T)^T = A^T A

A^T A is a square symmetric matrix, so it has a set of orthonormal eigenvectors.

V is the matrix built from these eigenvectors that diagonalizes A^T A:

A^T A = (U\Sigma V^T)^T U\Sigma V^T = V \Sigma^T U^T U \Sigma V^T = V \Sigma^T \Sigma V^T = V \Sigma^2 V^T

So \Sigma contains the square roots of the eigenvalues of A^T A:

(A^T A)V = V \Sigma^2

Similarly for U:

AA^T is a square symmetric matrix, so it also has a set of orthogonal eigenvectors.

AA^T = U\Sigma V^T (U\Sigma V^T)^T = U\Sigma V^T V \Sigma^T U^T = U \Sigma^2 U^T

(AA^T)U = U\Sigma^2

U is the matrix that diagonalizes AA^T, and \Sigma contains the square roots of the eigenvalues of AA^T.
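In R we can verify numerically that the eigenvalues appearing in these two derivations are the squared singular values (a quick sketch, using the matrix of the worked example further below):

A <- matrix(c(4, -3, 4, 3), 2, 2)   # A = [[4, 4], [-3, 3]]
eigen(t(A) %*% A)$values            # 32 18
eigen(A %*% t(A))$values            # 32 18
svd(A)$d^2                          # 32 18: squared singular values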

AA^T and A^T A have non-negative eigenvalues (the \Sigma^2 matrix contains squared numbers), and these eigenvalues coincide:

Proposition: A^T A and AA^T share min(m, n) eigenvalues.

Proof: consider an eigenvector v of A^T A:

A^T A = V \Sigma^2 V^T

(A^T A)v = \lambda v \;\rightarrow\; A(A^T A)v = \lambda A v \;\rightarrow\; (AA^T)(Av) = \lambda (Av)

So, if v is an eigenvector of A^T A (a column of V) with eigenvalue \lambda, then Av is an eigenvector of AA^T with the same eigenvalue \lambda. Since v is a column vector of V, we have \langle v, v \rangle = 1. However, for Av:

\langle Av, Av \rangle = (Av)^T Av = v^T (A^T A) v = v^T \lambda v = \lambda \langle v, v \rangle = \lambda

The eigenvectors of AA^T in U corresponding to the same eigenvalues can thus be built as:

u = \frac{1}{\sqrt{\lambda}} Av

This definition ensures that \langle u, u \rangle = \frac{1}{\lambda}\langle Av, Av \rangle = 1. In addition, it establishes a relationship between the two sets of eigenvectors.
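A short R sketch of this construction, using the eigenvector v = (1/√2, 1/√2) of A^T A with λ = 32 from the worked example below:

A <- matrix(c(4, -3, 4, 3), 2, 2)
v <- c(1, 1) / sqrt(2)       # eigenvector of A^T A for lambda = 32
u <- (A %*% v) / sqrt(32)    # u = Av / sqrt(lambda)
u                            # (1, 0): a unit eigenvector of A A^T
sum(u^2)                     # 1, so u has unit norm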

As with the spectral decomposition, we can use SVD to define an inverse matrix. Given a non-singular, non-symmetric matrix A \in M_{n \times n}(\mathbb{R}) (\det(A) \neq 0), we can write A = UDV^T, where U and V are the matrices built as before:

• U is the orthonormal matrix that diagonalizes AA^T

• V is the orthonormal matrix that diagonalizes A^T A

• D^{-1} is a diagonal matrix containing the reciprocals (1/\lambda) of the eigenvalues of A

We can define the inverse of the matrix as:

A^{-1} = V D^{-1} U^T

AA^{-1} = U D V^T V D^{-1} U^T = U D D^{-1} U^T = U U^T = I_n
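A sketch of this inverse in R, using svd() (the singular values are returned in s$d):

A <- matrix(c(4, -3, 4, 3), 2, 2)
s <- svd(A)
Ainv <- s$v %*% diag(1 / s$d) %*% t(s$u)   # A^{-1} = V D^{-1} U^T
round(A %*% Ainv, 10)                      # the 2x2 identity
round(Ainv - solve(A), 10)                 # matches R's solve()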

Example of SVD: calculate the SVD (A = U\Sigma V^T) of the matrix

A = \begin{pmatrix} 4 & 4 \\ -3 & 3 \end{pmatrix}

We calculate V, the orthonormal matrix that diagonalizes A^T A:

A^T A = \begin{pmatrix} 4 & -3 \\ 4 & 3 \end{pmatrix} \begin{pmatrix} 4 & 4 \\ -3 & 3 \end{pmatrix} = \begin{pmatrix} 25 & 7 \\ 7 & 25 \end{pmatrix}

We can see numerically here that A^T A is symmetric. We calculate the eigenvalues and the matrix V:

\det(A^T A - \lambda I_2) = \det\begin{pmatrix} 25-\lambda & 7 \\ 7 & 25-\lambda \end{pmatrix} = (25-\lambda)^2 - 7^2 = 0 \;\rightarrow\; \lambda = 32, 18

E(32) = \ker(A^T A - 32 I_2) = \ker\begin{pmatrix} -7 & 7 \\ 7 & -7 \end{pmatrix} = \left\{ \begin{pmatrix} a \\ a \end{pmatrix}, a \in \mathbb{R} \right\}

E(18) = \ker(A^T A - 18 I_2) = \ker\begin{pmatrix} 7 & 7 \\ 7 & 7 \end{pmatrix} = \left\{ \begin{pmatrix} a \\ -a \end{pmatrix}, a \in \mathbb{R} \right\}

Orthonormal vectors from these subspaces (an orthonormal basis):

v_1 = \begin{pmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix}, \quad v_2 = \begin{pmatrix} 1/\sqrt{2} \\ -1/\sqrt{2} \end{pmatrix} \qquad V = \begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{pmatrix}

Now we can decompose the matrix:

A^T A = \begin{pmatrix} 25 & 7 \\ 7 & 25 \end{pmatrix} = V \begin{pmatrix} 32 & 0 \\ 0 & 18 \end{pmatrix} V^T = \begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{pmatrix} \begin{pmatrix} 32 & 0 \\ 0 & 18 \end{pmatrix} \begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{pmatrix}

We can also already build the \Sigma matrix, since it is composed of the square roots of the eigenvalues, which are shared between AA^T and A^T A:

\Sigma = \begin{pmatrix} \sqrt{32} & 0 \\ 0 & \sqrt{18} \end{pmatrix} = \begin{pmatrix} 4\sqrt{2} & 0 \\ 0 & 3\sqrt{2} \end{pmatrix}

We do the same to build the matrix U. U is the orthonormal matrix that diagonalizes AA^T:

AA^T = \begin{pmatrix} 4 & 4 \\ -3 & 3 \end{pmatrix} \begin{pmatrix} 4 & -3 \\ 4 & 3 \end{pmatrix} = \begin{pmatrix} 32 & 0 \\ 0 & 18 \end{pmatrix}

We diagonalize AA^T:

\det(AA^T - \lambda I_2) = \det\begin{pmatrix} 32-\lambda & 0 \\ 0 & 18-\lambda \end{pmatrix} = (32-\lambda)(18-\lambda) = 0 \;\rightarrow\; \lambda = 32, 18

We can see that both A^T A and AA^T have the same eigenvalues. We calculate the eigenspaces:

E(32) = \ker(AA^T - 32 I_2) = \ker\begin{pmatrix} 0 & 0 \\ 0 & -14 \end{pmatrix} = \left\{ \begin{pmatrix} a \\ 0 \end{pmatrix}, a \in \mathbb{R} \right\}

E(18) = \ker(AA^T - 18 I_2) = \ker\begin{pmatrix} 14 & 0 \\ 0 & 0 \end{pmatrix} = \left\{ \begin{pmatrix} 0 \\ a \end{pmatrix}, a \in \mathbb{R} \right\}

An orthonormal matrix built from these eigenvectors is of the form:

U = \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}, \quad \text{with } a^2 = 1, \; b^2 = 1

In this case, we can choose several orthonormal vectors. Eigenvectors from different eigenvalues are orthogonal to each other, but there is more than one possibility to define the unit vectors:

\begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} -1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \begin{pmatrix} 0 \\ -1 \end{pmatrix} \quad \text{all have unit length}

To determine the signs of a and b, recall that if v is an eigenvector of A^T A with eigenvalue \lambda, then Av is an eigenvector of AA^T; to keep the same eigenvalue pairing we must define this eigenvector as:

u = \frac{1}{\sqrt{\lambda}} Av

Since the eigenvalues of A^T A are \lambda = 32, 18, we then choose the two vectors u as:

v_1 = \begin{pmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix} \;\rightarrow\; Av_1 = \begin{pmatrix} 4 & 4 \\ -3 & 3 \end{pmatrix}\begin{pmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix} = \begin{pmatrix} 4\sqrt{2} \\ 0 \end{pmatrix} \;\rightarrow\; u_1 = \frac{1}{4\sqrt{2}}\begin{pmatrix} 4\sqrt{2} \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}

(with \sqrt{32} = 4\sqrt{2})

v_2 = \begin{pmatrix} 1/\sqrt{2} \\ -1/\sqrt{2} \end{pmatrix} \;\rightarrow\; Av_2 = \begin{pmatrix} 4 & 4 \\ -3 & 3 \end{pmatrix}\begin{pmatrix} 1/\sqrt{2} \\ -1/\sqrt{2} \end{pmatrix} = \begin{pmatrix} 0 \\ -3\sqrt{2} \end{pmatrix} \;\rightarrow\; u_2 = \frac{1}{3\sqrt{2}}\begin{pmatrix} 0 \\ -3\sqrt{2} \end{pmatrix} = \begin{pmatrix} 0 \\ -1 \end{pmatrix}

(with \sqrt{18} = 3\sqrt{2})

Thus, we choose U = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. As AA^T is already a diagonal matrix, its diagonalization is trivial:

AA^T = UDU^T \;\rightarrow\; \begin{pmatrix} 32 & 0 \\ 0 & 18 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} 32 & 0 \\ 0 & 18 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}

We finally have:

A = U\Sigma V^T \;\rightarrow\; \begin{pmatrix} 4 & 4 \\ -3 & 3 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} 4\sqrt{2} & 0 \\ 0 & 3\sqrt{2} \end{pmatrix} \begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{pmatrix}
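We can verify the hand-computed factors in R (a quick numerical check):

U <- matrix(c(1, 0, 0, -1), 2, 2)
Sigma <- diag(c(4 * sqrt(2), 3 * sqrt(2)))
V <- matrix(c(1, 1, 1, -1), 2, 2) / sqrt(2)
U %*% Sigma %*% t(V)    # reproduces A = [[4, 4], [-3, 3]]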

3 Exercises

Ex. 1 — Find a singular value decomposition (SVD) for the matrix

A = \begin{pmatrix} -2 & 0 \\ 0 & -1 \end{pmatrix}

Ex. 2 — Find a singular value decomposition (SVD) for the matrix

A = \begin{pmatrix} 3 & 1 & 1 \\ -1 & 3 & 1 \end{pmatrix}

Ex. 3 — Find a singular value decomposition (SVD) for the matrix

A = \begin{pmatrix} 0 & 2 \\ 0 & 0 \\ 0 & 0 \end{pmatrix}

Ex. 4 — The Fibonacci sequence is an infinite sequence of integer numbers of the form:

F = {0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, ...}

Any element of the sequence F_n can be defined by the recurrence F_n = F_{n-1} + F_{n-2}, with initialization F_1 = 1, F_0 = 0. This recurrence can be expressed in matrix form as follows:

\begin{pmatrix} F_{n+2} \\ F_{n+1} \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} F_{n+1} \\ F_n \end{pmatrix}

So the matrix A = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} is usually called the Fibonacci matrix. Show that in the singular value decomposition (SVD) for A, A = UDV^T, the matrix of singular values is

S = \begin{pmatrix} \varphi & 0 \\ 0 & \varphi^{-1} \end{pmatrix}

where \varphi = \frac{1+\sqrt{5}}{2} is the golden ratio.

Ex. 5 — Consider the Fibonacci sequence in matrix form described above. The general form for any term in the series can be written as:

\begin{pmatrix} F_{n+1} \\ F_n \end{pmatrix} = A^n \begin{pmatrix} F_1 \\ F_0 \end{pmatrix}

where A is the Fibonacci matrix described above. Use the singular value decomposition (SVD), A = UDV^T, derived before to calculate A^n. Use this decomposition to obtain the general form of an element of the sequence in terms of the golden ratio (\varphi):

F_n = \frac{\varphi^n - (-\varphi)^{-n}}{\sqrt{5}}

where \varphi = \frac{1+\sqrt{5}}{2}

Note: a matrix raised to an integer power n is the product of n copies of the matrix, e.g. A^3 = A \cdot A \cdot A.
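For working through this exercise numerically, a matrix power can be computed in R by repeated multiplication. The helper matpow() below is not a base R function; it is a minimal sketch defined here for illustration:

# Raise a square matrix to a non-negative integer power
matpow <- function(M, n) {
  P <- diag(nrow(M))                 # start from the identity
  for (i in seq_len(n)) P <- P %*% M
  P
}
A <- matrix(c(1, 1, 1, 0), 2, 2)     # the Fibonacci matrix
matpow(A, 3)                         # equals A %*% A %*% A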

4 R practical

4.1 SVD

Singular value decomposition can be done automatically in R with the function svd().

# Define the matrix
> m <- matrix(c(4, -3, 4, 3), 2, 2)

# Compute the SVD
> svd <- svd(m)

# svd() returns three objects:

# svd$d returns the vector of singular values:

> svd$d
[1] 5.656854 4.242641

# svd$u returns the matrix of left singular vectors:

> svd$u
     [,1] [,2]
[1,]   -1    0
[2,]    0    1

# svd$v returns the matrix of right singular vectors:
> svd$v
           [,1]       [,2]
[1,] -0.7071068 -0.7071068
[2,] -0.7071068  0.7071068

# To get the right singular vector matrix already transposed,
# we use the function La.svd():

> svdt <- La.svd(m)
> svdt$vt
           [,1]       [,2]
[1,] -0.7071068 -0.7071068
[2,] -0.7071068  0.7071068

# In this particular case it is the same as before, since this
# matrix of right singular vectors is symmetric
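# Finally, as a sanity check, the original matrix can be
# recovered from the three components returned by svd():
> svd$u %*% diag(svd$d) %*% t(svd$v)   # returns m (up to rounding)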
