Chapter 5

Linear Algebra

It can be argued that all of linear algebra can be understood using the four fundamental subspaces associated with a matrix. Because they form the foundation on which we later work, we want an explicit method for analyzing these subspaces. That method will be the Singular Value Decomposition (SVD). It is unfortunate that most first courses in linear algebra do not cover this material, so we do it here. Again, we cannot stress the importance of this decomposition enough; we will apply this technique throughout the rest of this text.

5.1 Representation and Dimension

Let us quickly review some notation and basic ideas from linear algebra. First, a basis is a linearly independent spanning set, and the dimension of a subspace is the number of basis vectors it needs. Suppose we have a subspace H ⊂ IR^n, and a basis for H in the columns of a matrix V. By the definition of a basis, every vector in the subspace H can be written as a linear combination of the basis elements. In particular, if x ∈ H, then we can write:

x = c_1 v_1 + ... + c_k v_k = V c = V [x]_V                (5.1)

where c is sometimes denoted as [x]_V, and is referred to as the coordinates of x with respect to the basis in V. Therefore, every vector in our subspace of IR^n can be identified with a point in IR^k, which gives us a function that we'll call the coordinate mapping:

x = [x_1, x_2, ..., x_n]^T ∈ IR^n   ←→   c = [c_1, ..., c_k]^T ∈ IR^k

If k is small (with respect to n), we think of c as the low dimensional representation of the vector x, and we say that H is isomorphic to IR^k (isomorphic meaning that there is a one-to-one and onto linear map between the two spaces; that map is the isomorphism).

Example 5.1.1. If v_i, v_j are two linearly independent vectors, then the subspace created by their span is isomorphic to IR^2, but it is not equal to that plane. The isomorphism is the coordinate mapping.

Generally, finding the coordinates of x with respect to an arbitrary basis (as columns of a matrix V) means that we have to solve Equation 5.1. However, if the columns of V are orthogonal, then it is very simple to compute the coordinates. Start with Equation 5.1:

x = c_1 v_1 + ... + c_k v_k

Now take the inner product of both sides with v_j:

x · v_j = c_1 (v_1 · v_j) + ... + c_j (v_j · v_j) + ... + c_k (v_k · v_j)

All the dot products are 0 (due to orthogonality) except for the one with v_j itself, leading to the formula:

c_j = (x · v_j) / (v_j · v_j)

And we see that c_j is exactly the coefficient in the projection of x onto v_j. Recall the formula from Calc III:

Proj_v(u) = ((u · v) / (v · v)) v

Therefore, we can think of the linear combination as the following, which simplifies if we use orthonormal basis vectors:

x = Proj_{v_1}(x) + Proj_{v_2}(x) + ... + Proj_{v_k}(x)                (5.2)
  = (x · v_1) v_1 + (x · v_2) v_2 + ... + (x · v_k) v_k

IMPORTANT NOTE: In the event that x is NOT in H, then Equation 5.2 gives the (orthogonal) projection of x into H.
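To make this concrete, here is a small Matlab sketch (the basis and the vector are made up for illustration) that computes coordinates both ways: by solving Equation 5.1 directly for a general basis, and by using dot products once the basis has been orthonormalized.

% Coordinates of x with respect to a basis stored in the columns of V.
V = [1 0; 1 1; 0 2];        % made-up basis for a 2-d subspace H of IR^3
x = [2; 5; 6];              % a vector that lies in H (x = 2*v1 + 3*v2)

% General basis: solve Equation 5.1, V*c = x.
c = V \ x                   % the coordinates [x]_V, here [2; 3]

% Orthonormal basis: orthonormalize first, then use dot products (Eq. 5.2).
[Q, R] = qr(V, 0);          % columns of Q are an orthonormal basis for H
cQ = Q' * x;                % coordinates of x with respect to Q
norm(Q * cQ - x)            % zero (up to round-off), since x is in H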

Projections

Consider the following example. If a matrix U = [u_1, ..., u_k] has orthonormal columns (so if U is n × k, then that requires k ≤ n), then

U^T U = [ u_1^T ; u_2^T ; ... ; u_k^T ] [u_1, ..., u_k] = [ u_i^T u_j ] = I_k

since u_i^T u_j = 1 when i = j and 0 otherwise. But U U^T (which is n × n) is NOT the identity if k < n (if k = n, then the previous computation proves that the inverse is the transpose). Here is a computation one might make for U U^T (these are OUTER products):

U U^T = [u_1, ..., u_k] [ u_1^T ; ... ; u_k^T ] = u_1 u_1^T + u_2 u_2^T + ... + u_k u_k^T

Example 5.1.2. Consider the following computations:

U = [ 1 ; 0 ]        U^T U = 1        U U^T = [ 1  0 ;  0  0 ]

If U U^T is not the identity, what is it? Consider the following computation:

U U^T x = u_1 u_1^T x + u_2 u_2^T x + ... + u_k u_k^T x
        = u_1 (u_1^T x) + u_2 (u_2^T x) + ... + u_k (u_k^T x)

which we recognize as the projection of x into the space spanned by the orthonormal columns of U. In summary, we can think of the following matrix form for the coordinates:

[x]_U = c = U^T x,

and the projection matrix P that takes a vector x and produces the projection of x into the space spanned by the orthonormal columns of U is

P = U U^T
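A small Matlab sketch of these two formulas, using a made-up orthonormal basis for a plane in IR^3:

% Projection onto the plane spanned by two orthonormal vectors in IR^3.
u1 = [1; 0; 1] / sqrt(2);
u2 = [0; 1; 0];
U  = [u1, u2];              % 3 x 2 matrix with orthonormal columns

P    = U * U';              % projection matrix onto span{u1, u2}
x    = [2; 3; 4];
c    = U' * x;              % coordinates [x]_U of the projection
xhat = P * x;               % orthogonal projection of x into the plane

% Sanity checks: U'*U is the 2x2 identity, P is idempotent,
% and the residual x - xhat is orthogonal to the plane.
disp(U' * U)
disp(norm(P * P - P))
disp(U' * (x - xhat))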

Exercises

1. Let the subspace H be formed by the span of the vectors v_1, v_2 given below. Given the points x_1, x_2 below, find which one belongs to H, and if it does, give its coordinates. (NOTE: The basis vectors are NOT orthonormal.)

v_1 = [1, 2, 1]^T     v_2 = [2, 1, −1]^T     x_1 = [7, 4, 0]^T     x_2 = [4, 3, −1]^T

2. Show that the plane H defined by:

H = { α_1 [1, 1, −1]^T + α_2 [1, 1, 0]^T  such that α_1, α_2 ∈ IR }

is isomorphic to IR^2.

3. Let the subspace G be the plane defined below, and consider the vector x, where:

G = { α_1 [1, 3, −2]^T + α_2 [3, 1, 0]^T  such that α_1, α_2 ∈ IR }        x = [1, 0, −2]^T

(a) Find the projector P that takes an arbitrary vector and projects it (orthogonally) to the plane G.

(b) Find the orthogonal projection of the given x onto the plane G.

(c) Find the distance from the plane G to the vector x.

4. If the low dimensional representation of a vector x is [9, −1]^T and the basis vectors are [1, 0, 1]^T and [3, 1, 1]^T, then what was the original vector x? (HINT: it is easy to compute it directly)

5. If the vector x = [10, 4, 2]^T and the basis vectors are [1, 0, 1]^T and [3, 1, 1]^T, then what is the low dimensional representation of x?

6. Let a = [−1, 3]^T. Find a square matrix P so that P x is the orthogonal projection of x onto the span of a.

7. To prove that we have an orthogonal projection, the vector Proj_u(x) − x should be orthogonal to u. Use this definition to show that our earlier formula was correct; that is, show that

Proj_u(x) = ((x · u) / (u · u)) u

is the orthogonal projection of x onto u.

8. Continuing with the last exercise, show that U U^T x is the orthogonal projection of x into the space spanned by the columns of U by showing that (U U^T x − x) is orthogonal to u_i for any i = 1, 2, ..., k.

5.2 The Four Fundamental Subspaces

Given any m × n matrix A, we consider the mapping A : IR^n → IR^m defined by:

x → Ax = y

The four subspaces allow us to completely understand the domain and range of the mapping. We will first define them, then look at some examples.

Definition 5.2.1. The Four Fundamental Subspaces

• The row space of A is a subspace of IR^n formed by taking all possible linear combinations of the rows of A. Formally,

Row(A) = { x ∈ IR^n | x = A^T y for some y ∈ IR^m }

• The null space of A is a subspace of IR^n formed by

Null(A) = { x ∈ IR^n | Ax = 0 }

• The column space of A is a subspace of IR^m formed by taking all possible linear combinations of the columns of A:

Col(A) = { y ∈ IR^m | y = Ax for some x ∈ IR^n }

The column space is also the image of the mapping. Notice that Ax is simply a linear combination of the columns of A:

Ax = x_1 a_1 + x_2 a_2 + ... + x_n a_n

• Finally, the null space of A^T can be defined in the obvious way (see the Exercises).

The fundamental subspaces subdivide the domain and range of the mapping in a particularly nice way:

Theorem 5.2.1. Let A be an m × n matrix. Then

• The nullspace of A is orthogonal to the row space of A.

• The nullspace of A^T is orthogonal to the columnspace of A.

Proof: We'll prove the first statement; the proof of the second is almost identical. To prove the first statement, we have to show that if we take any vector x from the nullspace of A and any vector y from the row space of A, then x · y = 0. Alternatively, if we can show that x is orthogonal to each and every row of A, then we're done as well (since y is a linear combination of the rows of A). In fact, now we see a strategy: write out what it means for x to be in the nullspace using the rows of A. For ease of notation, let a_j denote the j-th row of A, which will have size 1 × n. Then:

Ax = 0   ⇒   [ a_1 ; a_2 ; ... ; a_m ] x = [ a_1 x ; a_2 x ; ... ; a_m x ] = 0

Therefore, the dot product between any row of A and x is zero, so that x is orthogonal to every row of A. Therefore, x must be orthogonal to any linear combination of the rows of A, so that x is orthogonal to the row space of A. □

Before going further, let us recall how to construct a basis for the column space, row space and nullspace of a matrix A. We'll do it with a particular matrix:

Example 5.2.1. Construct a basis for the column space, row space and nullspace of the matrix A below, which is row equivalent to the matrix beside it, RREF(A):

A = [  2   0   2  −2          RREF(A) = [ 1  0  1  −1
       2   5   7   3                      0  1  1   1
      −3  −5  −8  −2 ]                    0  0  0   0 ]

The first two columns of the original matrix form a basis for the columnspace (which is a subspace of IR^3):

Col(A) = span{ [2, 2, −3]^T , [0, 5, −5]^T }

A basis for the row space is found by using the row reduced rows corresponding to the pivots (and is a subspace of IR^4). You should also verify that you can find a basis for the null space of A, given below (also a subspace of IR^4). If you're having any difficulties here, be sure to look it up in a linear algebra text:

Row(A) = span{ [1, 0, 1, −1]^T , [0, 1, 1, 1]^T }        Null(A) = span{ [−1, −1, 1, 0]^T , [1, −1, 0, 1]^T }
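These bases are easy to check in Matlab; here is a quick sketch using the matrix from the example:

% Verify the bases in Example 5.2.1.
A = [ 2  0  2 -2 ;
      2  5  7  3 ;
     -3 -5 -8 -2 ];

R = rref(A)          % the pivot rows give a basis for Row(A)
N = null(A, 'r')     % a rational basis for Null(A), one vector per free variable
rank(A)              % 2, the dimension of both Col(A) and Row(A)

% The pivots are in columns 1 and 2, so those columns of A span Col(A):
ColBasis = A(:, 1:2)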

We will often refer to the dimensions of the four subspaces. Recall that there is a term for the dimension of the column space: the rank.

Definition 5.2.2. The rank of a matrix A is the number of linearly independent columns of A.

In our previous example, the rank of A is 2. Also from our example, we see that the rank is the dimension of the column space, and that this is the same as the dimension of the row space (all three numbers correspond to the number of pivots in the row reduced form of A). Finally, a handy theorem for counting is the following.

The Rank Theorem. Let the m × n matrix A have rank r. Then

r + dim(Null(A)) = n

This theorem says that the number of pivot columns plus the number of remaining columns (which correspond to free variables) is equal to the total number of columns.

Example 5.2.2. The Dimensions of the Subspaces. Given a matrix A that is m × n with rank k, the dimensions of the four subspaces are shown below.

• dim(Row(A)) = k
• dim(Col(A)) = k
• dim(Null(A)) = n − k
• dim(Null(A^T)) = m − k

There are some interesting implications of these theorems for matrices of data. For example, suppose A is m × n. With no other information, we do not know whether we should consider this matrix as n points in IR^m, or m points in IR^n. In one sense, it doesn't matter! The theorems we've discussed show that the dimension of the columnspace is equal to the dimension of the rowspace. Later on, we'll find out that if we can find a basis for the columnspace, it is easy to find a basis for the rowspace. We'll need some more machinery first.
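For instance, a quick numerical check of these counts (the matrix below is random, with its size and rank chosen arbitrarily):

% Dimension counts for a random 5 x 7 matrix of rank 3.
m = 5; n = 7;
A = randn(m, 3) * randn(3, n);     % rank 3 (with probability 1)
k = rank(A)                        % 3
size(null(A),  2)                  % n - k = 4, the dimension of Null(A)
size(null(A'), 2)                  % m - k = 2, the dimension of Null(A')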

5.3 Exercises

1. Show that Null(A^T) ⊥ Col(A). Hint: You may use what we already proved.

2. If A is m × n, how big can the rank of A possibly be?

3. Show that multiplication by an orthogonal matrix preserves lengths: ‖Qx‖_2 = ‖x‖_2 (Hint: Use properties of inner products). Conclude that multiplication by Q represents a rigid rotation.

4. Prove the Pythagorean Theorem by induction: Given a set of n orthogonal vectors {x_i},

‖ x_1 + x_2 + ... + x_n ‖_2^2 = ‖x_1‖_2^2 + ‖x_2‖_2^2 + ... + ‖x_n‖_2^2

The case where n = 1 is trivial, so you might look at n = 2 first. Try starting with

‖x + y‖^2 = (x + y)^T (x + y) = ...

and then simplify to get ‖x‖^2 + ‖y‖^2. Now try the induction step on your own.

5. Let A be an m × n matrix where m > n, and let A have rank n. Let y, ŷ ∈ IR^m, such that ŷ is the orthogonal projection of y onto the column space of A. We want a formula for the matrix P : IR^m → IR^m so that P y = ŷ. The following image shows the relevant subspaces:

(a) Why is the projector not P = A A^T?

(b) Since ŷ − y is orthogonal to the column space of A, show that

A^T (ŷ − y) = 0                (5.3)

(c) Show that there exists x ∈ IR^n so that Equation (5.3) can be written as:

A^T (Ax − y) = 0                (5.4)

(d) Argue that A^T A (which is n × n) is invertible, so that Equation (5.4) implies that

x = (A^T A)^{−1} A^T y

(e) Finally, show that this implies that

P = A (A^T A)^{−1} A^T

Note: If A has rank k < n, then we will need something different, since A^T A will not be full rank. The missing piece is the singular value decomposition, to be discussed later.

6. The Orthogonal Decomposition Theorem: if x ∈ IR^n and W is a (non-zero) subspace of IR^n, then x can be written uniquely as

x = w + z

where w ∈ W and z ∈ W^⊥. To prove this, let {u_i}_{i=1}^p be an orthonormal basis for W, define w = (x · u_1)u_1 + ... + (x · u_p)u_p, and define z = x − w. Then:

(a) Show that z ∈ W^⊥ by showing that it is orthogonal to every u_i.

(b) To show that the decomposition is unique, suppose it is not. That is, there are two decompositions:

x = w_1 + z_1,        x = w_2 + z_2

Show this implies that w_1 − w_2 = z_2 − z_1, and that this vector is in both W and W^⊥. What can we conclude from this?

7. Use the previous exercises to prove the Best Approximation Theorem: If W is a subspace of IR^n and x ∈ IR^n, then the point closest to x in W is the orthogonal projection of x into W.

5.4 The Decomposition Theorems

The matrix factorization that arises from an eigenvector/eigenvalue decomposition is useful in many applications, so we'll briefly review it here and build from it until we get to our main goal, the Singular Value Decomposition.

5.4.1 The Eigenvector/Eigenvalue Decomposition

First we have a basic definition: Let A be an n × n matrix. If there exists a scalar λ and a non-zero vector v so that

Av = λv

then we say that λ is an eigenvalue and v is an associated eigenvector. An equivalent formulation of the problem is to solve Av − λv = 0, or, factoring v out,

(A − λI)v = 0

This equation always has the zero solution (letting v = 0); however, we need non-trivial solutions, and the only way that will happen is if A − λI is non-invertible, or equivalently,

det(A − λI) = 0

which, when multiplied out, is a polynomial equation in λ that is called the characteristic equation. Therefore, we find the eigenvalues first; then for each λ, there is an associated subspace, the null space of A − λI, which is the eigenspace associated with λ, denoted by E_λ.

The way to finish the problem is to give a “nice” basis for the eigenspace. If you're working by hand, try one with integer values. If you're on the computer, it is often convenient to make them unit vectors.

Some vocabulary associated with eigenvalues: Solving the characteristic equation can give repeated solutions. The number of repetitions is the algebraic multiplicity of λ. On the other hand, for each λ, we find the eigenspace, which will have a certain dimension; the dimension of the eigenspace is the geometric multiplicity of λ.
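As a quick illustration (the matrix is made up), the characteristic polynomial route and Matlab's eig give the same answer:

% Eigenvalues two ways, for a small example matrix.
A = [2 1; 1 2];

% Via the characteristic equation det(A - lambda*I) = 0:
p = poly(A);            % coefficients of the characteristic polynomial
lambda = roots(p)       % its roots are the eigenvalues (here 3 and 1)

% Via eig, which also returns the eigenvectors:
[V, D] = eig(A);        % columns of V are eigenvectors, D is diagonal
norm(A*V - V*D)         % zero (up to round-off), since A*v = lambda*v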

Examples:

1. Compute the eigenvalues and eigenvectors for the 2 × 2 identity matrix.

SOLUTION: The eigenvalue is 1 (a double root), so the algebraic multiplicity of 1 is 2. On the other hand, if we take A − λI, we simply get the zero matrix, which implies that every vector in IR^2 is an eigenvector. Therefore, we can take any basis of IR^2 as a basis for E_1, and the geometric multiplicity is 2.

2. Consider the matrix

[ 1  2 ;  0  1 ]

Again, the eigenvalue 1 is a double eigenvalue (so the algebraic multiplicity is 2), but solving (A − λI)v = 0 gives us:

2 v_2 = 0   ⇒   v_2 = 0

That means v_1 is free, and a basis for E_1 is [1, 0]^T. Therefore, the geometric multiplicity is 1.

Definition: A matrix for which the algebraic and geometric multiplicities are not equal is called defective.

There is a nice theorem relating the eigenvalues of similar matrices:

Theorem: If X is square and invertible, then A and X^{−1} A X have the same eigenvalues.

It is also sometimes useful to characterize the eigenvalues in terms of the determinant and trace of a matrix:

det(A) = Π_{i=1}^n λ_i                trace(A) = Σ_{i=1}^n λ_i
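These identities are easy to sanity-check numerically; here is a small sketch using a randomly generated symmetric matrix (the size is arbitrary):

% Check det(A) = prod(eigenvalues), trace(A) = sum(eigenvalues),
% and that a similarity transform X^{-1} A X preserves the eigenvalues.
A = randn(4);  A = A + A';           % symmetric, so the eigenvalues are real
lambda = eig(A);

abs(det(A)   - prod(lambda))         % tiny
abs(trace(A) - sum(lambda))          % tiny

X = randn(4);                        % almost surely invertible
mu = sort(real(eig(X \ A * X)));     % real parts (imaginary parts are round-off)
norm(mu - sort(lambda))              % tiny: similar matrices share eigenvalues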

Symmetric Matrices and the Spectral Theorem

There are some difficulties in working with eigenvalues and eigenvectors of a general matrix. For one thing, they are only defined for square matrices, and even when they are defined, we may get real or complex eigenvalues. If a matrix is symmetric, beautiful things happen with the eigenvalues and eigenvectors, and this is summarized below in the Spectral Theorem.

The Spectral Theorem: If A is an n × n symmetric matrix, then:

1. A has n real eigenvalues (counting multiplicity).

2. For each distinct λ, the algebraic and geometric multiplicities are the same.

3. The eigenspaces are mutually orthogonal; eigenvectors corresponding to distinct eigenvalues are orthogonal, and we'll take each E_λ to have an orthonormal basis.

4. A is orthogonally diagonalizable, with D = diag(λ_1, λ_2, ..., λ_n). That is, if V is the matrix whose columns are the (orthonormal) eigenvectors of A, then

A = V D V^T

Some remarks about the Spectral Theorem:

• If a matrix is real and symmetric, the Spectral Theorem says that its eigenvectors form an orthonormal basis for IR^n.

• The first part is somewhat difficult to prove in that we would have to bring in more machinery than we would like. If you would like to see a proof, it comes from the Schur Decomposition, which is given, for example, in “Matrix Computations” by Golub and Van Loan.

• The following is a proof of the third part. Supply justification for each step: Let v_1, v_2 be eigenvectors from distinct eigenvalues λ_1, λ_2. We show that v_1 · v_2 = 0:

λ_1 v_1 · v_2 = (A v_1)^T v_2 = v_1^T A^T v_2 = v_1^T A v_2 = λ_2 v_1 · v_2

Now, (λ_1 − λ_2) v_1 · v_2 = 0.

The Spectral Decomposition: Since A is orthogonally diagonalizable, we can write

A = [q_1  q_2  ...  q_n] diag(λ_1, λ_2, ..., λ_n) [q_1  q_2  ...  q_n]^T

so that:

A = [λ_1 q_1   λ_2 q_2   ...   λ_n q_n] [q_1  q_2  ...  q_n]^T

so finally:

A = λ_1 q_1 q_1^T + λ_2 q_2 q_2^T + ... + λ_n q_n q_n^T

That is, A is a sum of n rank one matrices, each of which is a projection matrix.

Exercises

1. Prove that if X is invertible and matrix A is square, then A and X^{−1} A X have the same eigenvalues.

Matlab Exercise: Verify the spectral decomposition for a symmetric matrix. Type the following into Matlab (the lines that begin with a % denote comments that do not have to be typed in).

%Construct a random, symmetric, 6 x 6 matrix:
for i=1:6
  for j=1:i
    A(i,j)=rand;
    A(j,i)=A(i,j);
  end
end

%Compute the eigenvalues of A:
[Q,L]=eig(A);    %NOTE: A = Q L Q'
                 %L is a diagonal matrix

%Now form the spectral sum
S=zeros(6,6);
for i=1:6
  S=S+L(i,i)*Q(:,i)*Q(:,i)';
end

max(max(S-A))

Note that the maximum of S − A should be a very small number! (By the spectral decomposition theorem.)

5.4.2 The Singular Value Decomposition

There is a special matrix factorization that is extremely useful, both in applications and in proving theorems. This is mainly due to two facts, which we shall investigate in this section: (1) we can use this factorization on any matrix, and (2) the factorization defines explicitly the rank of the matrix, and gives orthonormal bases for all four matrix subspaces. In what follows, assume that A is an m × n matrix (so A maps IR^n to IR^m, and need not be square).

1. Although A itself is not symmetric, A^T A is n × n and symmetric, so the Spectral Theorem applies. In particular, A^T A is orthogonally diagonalizable. Let {λ_i}_{i=1}^n and V = [v_1, v_2, ..., v_n] be the eigenvalues and orthonormal eigenvectors, so that

A^T A = V D V^T

where D is the diagonal matrix with λ_i along the diagonal.

2. Exercise: Show that λ_i ≥ 0 for i = 1..n by showing that ‖A v_i‖_2^2 = λ_i. As a starting point, you might rewrite

‖A v_i‖_2^2 = (A v_i)^T (A v_i)

3. Definition: We define the singular values of A by:

σ_i = √(λ_i)

where λ_i is an eigenvalue of A^T A.

4. In the rest of the section, we will assume that any list (or diagonal matrix) of eigenvalues of A^T A (or singular values of A) is ordered from highest to lowest: λ_1 ≥ λ_2 ≥ ... ≥ λ_n.

5. Exercise: Prove that, if v_i and v_j are distinct eigenvectors of A^T A, then their corresponding images, A v_i and A v_j, are orthogonal.

6. Exercise: Prove that, if x = α_1 v_1 + ... + α_n v_n, then

‖Ax‖_2^2 = α_1^2 λ_1 + ... + α_n^2 λ_n

7. Exercise: Let W be the subspace generated by the basis {v_j}_{j=k+1}^n, where the v_j are the eigenvectors associated with the zero eigenvalues of A^T A (therefore, we are assuming that the first k eigenvalues are NOT zero). Show that W = Null(A). Hint: To show this, take an arbitrary vector x from W. Rather than showing directly that Ax = 0, instead show that the magnitude of Ax is zero. We also need to show that if we take any vector from the nullspace of A, then it is also in W.

8. Exercise: Prove that if the rank of A^T A is r, then so is the rank of A. Hint: How does the previous exercise help?

9. Define the columns of a matrix U as:

u_i = (1/σ_i) A v_i = A v_i / ‖A v_i‖_2

and let U be the matrix whose i-th column is u_i.

We note that this definition only makes sense for the first r vectors v_i (otherwise, A v_i = 0). Thus, we'll have to extend the basis to span all of IR^m, which can be done using Gram-Schmidt.

10. Exercise: Show that u_i is an eigenvector of A A^T whose eigenvalue is also λ_i.

11. Exercise: Show that A^T u_i = σ_i v_i.

12. So far, we have shown how to construct two matrices, U and V, given a matrix A. That is, the matrix V is constructed from the eigenvectors of A^T A, and the matrix U can be constructed using the v's or by finding the eigenvectors of A A^T.



Figure 5.1: The geometric meaning of the right and left singular vectors of the SVD decomposition. Note that A v_i = σ_i u_i. The mapping x → Ax will map the unit circle on the left to the ellipse on the right.

13. Exercise: Let A be m × n. Define the m × n matrix

Σ = diag(σ_1, ..., σ_n)

where σ_i is the i-th singular value of the matrix A. Show that

A V = U Σ

14. Theorem: The Singular Value Decomposition (SVD). Let A be any m × n matrix of rank r. Then we can factor the matrix A as the following product:

A = U Σ V^T

where U, Σ, V are the matrices defined in the previous exercises. That is, U is an orthogonal m × m matrix, Σ is a diagonal m × n matrix, and V is an orthogonal n × n matrix. The u's are called the left singular vectors and the v's are called the right singular vectors.

15. Keep in mind the following relationship between the right and left singular vectors:

A v_i = σ_i u_i
A^T u_i = σ_i v_i

16. Computing the Four Subspaces of a matrix A. Let A = U Σ V^T be the SVD of A, which has rank r. Be sure that the singular values are ordered from highest to lowest. Then (see also the Matlab sketch at the end of this list):

(a) A basis for the columnspace of A, Col(A), is {u_i}_{i=1}^r

(b) A basis for the nullspace of A, Null(A), is {v_i}_{i=r+1}^n

(c) A basis for the rowspace of A, Row(A), is {v_i}_{i=1}^r

(d) A basis for the nullspace of A^T, Null(A^T), is {u_i}_{i=r+1}^m

17. We can also visualize the right and left singular vectors as in Figure 5.1. We think of the v_i as a special orthogonal basis in IR^n (the domain) that maps to the ellipse whose axes are defined by σ_i u_i.

18. The SVD is one of the premier tools of linear algebra, because it allows us to completely reveal everything we need to know about a matrix mapping: the rank, a basis for the nullspace, a basis for the column space, a basis for the nullspace of A^T, and a basis for the row space. This is depicted in Figure 5.2.



Figure 5.2: The SVD of A ([U,S,V]=svd(A)) completely and explicitly describes the 4 fundamental subspaces associated with the matrix, as shown. We have a one-to-one correspondence between the rowspace and columnspace of A; the remaining v's map to zero (under A), and the remaining u's map to zero (under A^T).

19. Lastly, the SVD provides a decomposition of any linear mapping into two “rotations” and a scaling. This will become important later when we try to deduce a mapping matrix from data (See the section on signal separation).

20. Exercise: Compute the SVD by hand of the following matrices:

[ 1  1 ;  0  0 ]          [ 0  2 ;  0  0 ;  0  0 ]

21. Remark: If m or n is very large, it might not make sense to keep the full matrices U and V.

22. The Reduced SVD. Let A be m × n with rank r. Then we can write:

A = Ũ Σ̃ Ṽ^T

where Ũ is an m × r matrix with orthogonal columns, Σ̃ is an r × r square matrix, and Ṽ is an n × r matrix.

23. Theorem: (Actually, this is just another way to express the SVD.) Let A = U Σ V^T be the SVD of A, which has rank r. Then:

A = σ_1 u_1 v_1^T + σ_2 u_2 v_2^T + ... + σ_r u_r v_r^T

Therefore, we can approximate A by the sum of rank one matrices.

24. Matlab and the SVD. Matlab has the SVD built in. The function specifications are: [U,S,V]=svd(A) and [U,S,V]=svd(A,0), where the first function call returns the full SVD, and the second call returns a reduced SVD; see Matlab's help file for the details on the second call.

25. Matlab Exercise: Image Processing and the SVD. First, in Matlab, load the clown picture:

load clown

This loads a matrix X and a colormap, map, into the workspace. To see the clown, type:

image(X); colormap(map)

We now perform a Singular Value Decomposition on the clown. Type in:

[U,S,V]=svd(X);

How many vectors are needed to retain a good picture of the clown? Try performing a k-dimensional reconstruction of the image by typing:

H=U(:,1:k)*S(1:k,1:k)*V(:,1:k)';
image(H)

for k = 5, 10, 20 and 30. What do you see?
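Here is the Matlab sketch referred to in item 16: it reads the four subspace bases off of svd and forms the rank-one expansion from item 23. The test matrix is made up and has rank two.

% Read the four fundamental subspaces and the rank-one expansion off the SVD.
A = [1 2 3; 2 4 6; 1 0 1];          % an arbitrary 3 x 3 test matrix of rank 2
[U, S, V] = svd(A);
s = diag(S);
r = sum(s > 1e-10);                  % numerical rank

ColBasis  = U(:, 1:r);               % basis for Col(A)
RowBasis  = V(:, 1:r);               % basis for Row(A)
NullBasis = V(:, r+1:end);           % basis for Null(A)
LeftNull  = U(:, r+1:end);           % basis for Null(A')

% Rank-one expansion: A = s(1)*u1*v1' + ... + s(r)*ur*vr'
A2 = zeros(size(A));
for i = 1:r
  A2 = A2 + s(i) * U(:,i) * V(:,i)';
end
norm(A - A2)                         % zero, up to round-off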

Generalized Inverses

Let a matrix A be m × n with rank r. In the general case, A does not have an inverse. Is there a way of restricting the domain and range of the mapping y = Ax so that the map is invertible?

We know that the columnspace and rowspace of A have the same dimension. Therefore, there exists a 1-1 and onto map between these spaces, and this is our restriction. To “solve” y = Ax, we replace y by its orthogonal projection to the columnspace of A, ŷ. This gives the least squares solution, which makes the problem solvable. To get a unique solution, we replace x by its projection to the rowspace of A, x̂. The problem ŷ = A x̂ now has a solution, and that solution is unique. We can rewrite this problem now in terms of the reduced SVD of A:

x̂ = V V^T x,        ŷ = U U^T y

Now we can write:

U U^T y = U Σ V^T (V V^T x)

so that

V Σ^{−1} U^T y = V V^T x

(Exercise: Verify that these computations are correct!) Given an m × n matrix A, define its pseudoinverse, A†, by:

A† = V Σ^{−1} U^T

We have shown that the least squares solution to y = Ax is given by:

x̂ = A† y

where x̂ is in the rowspace of A, and its image, A x̂, is the projection of y into the columnspace of A. Geometrically, we can understand these computations in terms of the four fundamental subspaces.

[Diagram: IR^n on the left is split into Row(A) and Null(A); IR^m on the right is split into Col(A) and Null(A^T). The point y lies off of Col(A), so it is not the image of any x under A.]

In this case, there is no value of x ∈ IR^n which will map onto y, since y is outside the columnspace of A. To get a solution, we project y onto the columnspace of A as shown below:

[Diagram: the same picture, now with ŷ = U U^T y marking the orthogonal projection of y onto Col(A).]

Now it is possible to find an x that will map onto ŷ, but if the nullspace of A is nontrivial, then all of the points on the dotted line will also map to ŷ:

[Diagram: the same picture, now with x̂ = V V^T x in Row(A); the dotted line through x parallel to Null(A) shows all of the vectors that map to ŷ.]

Finally, we must choose a unique value of x for the mapping; we choose the x inside the rowspace of A. This is a very useful idea, and it is one we will explore in more detail later. For now, notice that to get this solution, we analyzed our four fundamental subspaces in terms of the basis vectors given by the SVD.
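Here is a short Matlab sketch of this construction, using an arbitrary rank-one 3 × 2 matrix; the result agrees with Matlab's built-in pinv:

% Pseudoinverse via the reduced SVD: Adag = V * inv(Sigma) * U'
A = [1 2; 2 4; 3 6];                 % a rank-one example matrix
y = [1; 1; 2];

[U, S, V] = svd(A);
s = diag(S);
r = sum(s > 1e-10);                  % numerical rank

Ur = U(:, 1:r);  Vr = V(:, 1:r);  Sr = diag(s(1:r));
Adag = Vr / Sr * Ur';                % the pseudoinverse A-dagger

xhat = Adag * y;                     % least squares solution, lies in Row(A)
yhat = A * xhat;                     % the projection of y onto Col(A)

norm(Adag - pinv(A))                 % zero, up to round-off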

Exercises

1. Consider the system A x = y, where

A = [ 2  1  1 ;  1  −2  2 ],        y = [ 5 ;  3 ]

(a) Before solving this problem, what are the dimensions of the four fundamental subspaces?

(b) Use Matlab to compute the SVD of the matrix A, and solve the problem by computing the pseudoinverse of A directly.

(c) Check your answer explicitly and verify that x̂ and ŷ are in the rowspace and columnspace. (Hint: If a vector x is already in the rowspace, what is V V^T x?)

2. Consider the system A x = y, where

A = [  2   1   1   3
       1   0   1   2
      −7   2  −5  12
       3  −2   0   4
      −4   1  −3   7 ],        y = [ −5 ;  1 ;  0 ;  2 ;  −6 ]

(a) Find the dimensions of the four fundamental subspaces by using the SVD of A (in Matlab).

(b) Solve the problem.

(c) Check your answer explicitly and verify that x̂ and ŷ are in the rowspace and columnspace.

3. Write the following in Matlab to reproduce Figure 5.1:

theta=linspace(0,2*pi,30);
z=exp(i*theta);
X=[real(z);imag(z)];          %The domain points
m=1/sqrt(2);
A=(m*[1,1;1,-1])*[1,0;0,3];
Y=A*X;                        %The image of the circle

t=linspace(0,1);
vec1=[0;0]*(1-t)+[0;1]*t;     %The basis vectors v

vec2=[0;0]*(1-t)+[1;0]*t;

Avec1=A*vec1; Avec2=A*vec2; %Image of the basis vectors

figure(1)    %The domain
plot(X(1,:),X(2,:),'k',vec1(1,:),vec1(2,:),'k',vec2(1,:),vec2(2,:),'k');
axis equal
figure(2)    %The image
plot(Y(1,:),Y(2,:),'k',Avec1(1,:),Avec1(2,:),'k',Avec2(1,:),Avec2(2,:),'k');
axis equal

4. In the previous example, what was the matrix A? The vectors v? The vectors u? The singular values σ_1, σ_2? Once you've written these down from the program, perform the SVD of A in Matlab. Are the vectors the same that you wrote down? NOTE: These show that the singular vectors are not unique; they can vary by a sign (±v or ±u).
