Jim Lambers
MAT 610 Summer Session 2009-10
Lecture 3 Notes

These notes correspond to Sections 2.5-2.7 in the text.

Orthogonality

A set of vectors $\{x_1, x_2, \ldots, x_k\}$ in $\mathbb{R}^n$ is said to be orthogonal if $x_i^T x_j = 0$ whenever $i \neq j$. If, in addition, $x_i^T x_i = 1$ for $i = 1, 2, \ldots, k$, then we say that the set is orthonormal. In this case, we can also write $x_i^T x_j = \delta_{ij}$, where
\[
\delta_{ij} = \begin{cases} 1 & i = j, \\ 0 & i \neq j, \end{cases}
\]
is called the Kronecker delta.

Example The vectors $x = \begin{bmatrix} 1 & -3 & 2 \end{bmatrix}^T$ and $y = \begin{bmatrix} 4 & 2 & 1 \end{bmatrix}^T$ are orthogonal, as $x^T y = 1(4) - 3(2) + 2(1) = 4 - 6 + 2 = 0$. Using the fact that $\|x\|_2 = \sqrt{x^T x}$, we can normalize these vectors to obtain orthonormal vectors
\[
\tilde{x} = \frac{x}{\|x\|_2} = \frac{1}{\sqrt{14}} \begin{bmatrix} 1 \\ -3 \\ 2 \end{bmatrix}, \qquad
\tilde{y} = \frac{y}{\|y\|_2} = \frac{1}{\sqrt{21}} \begin{bmatrix} 4 \\ 2 \\ 1 \end{bmatrix},
\]
which satisfy $\tilde{x}^T \tilde{x} = \tilde{y}^T \tilde{y} = 1$ and $\tilde{x}^T \tilde{y} = 0$.

The concept of orthogonality extends to subspaces. If $S_1, S_2, \ldots, S_k$ is a set of subspaces of $\mathbb{R}^m$, we say that these subspaces are mutually orthogonal if $x^T y = 0$ whenever $x \in S_i$ and $y \in S_j$ for $i \neq j$. Furthermore, we define the orthogonal complement of a subspace $S \subseteq \mathbb{R}^m$ to be the set $S^\perp$ defined by
\[
S^\perp = \{ y \in \mathbb{R}^m \mid x^T y = 0 \ \text{for all}\ x \in S \}.
\]

Example Let $S$ be the subspace of $\mathbb{R}^3$ defined by $S = \operatorname{span}\{v_1, v_2\}$, where
\[
v_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \qquad v_2 = \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}.
\]
Then $S$ has the orthogonal complement $S^\perp = \operatorname{span}\left\{ \begin{bmatrix} 1 & 0 & -1 \end{bmatrix}^T \right\}$. It can be verified directly, by computing inner products, that the single basis vector of $S^\perp$, $v_3 = \begin{bmatrix} 1 & 0 & -1 \end{bmatrix}^T$, is orthogonal to either basis vector of $S$, $v_1$ or $v_2$. It follows that any multiple of this vector, which describes any vector in the span of $v_3$, is orthogonal to any linear combination of these basis vectors, which accounts for any vector in $S$, due to the linearity of the inner product:
\[
(c_3 v_3)^T (c_1 v_1 + c_2 v_2) = c_3 c_1 v_3^T v_1 + c_3 c_2 v_3^T v_2 = c_3 c_1 (0) + c_3 c_2 (0) = 0.
\]
Therefore, any vector in $S^\perp$ is in fact orthogonal to any vector in $S$.

It is helpful to note that given a basis $\{v_1, v_2\}$ for any two-dimensional subspace $S \subseteq \mathbb{R}^3$, a basis $\{v_3\}$ for $S^\perp$ can be obtained by computing the cross product of the basis vectors of $S$,
\[
v_3 = v_1 \times v_2 = \begin{bmatrix} (v_1)_2 (v_2)_3 - (v_1)_3 (v_2)_2 \\ (v_1)_3 (v_2)_1 - (v_1)_1 (v_2)_3 \\ (v_1)_1 (v_2)_2 - (v_1)_2 (v_2)_1 \end{bmatrix}.
\]
A similar process can be used to compute the orthogonal complement of any $(n-1)$-dimensional subspace of $\mathbb{R}^n$. This orthogonal complement will always have dimension 1, because the sum of the dimension of any subspace of a finite-dimensional vector space and the dimension of its orthogonal complement is equal to the dimension of the vector space. That is, if $\dim(V) = n$ and $S \subseteq V$ is a subspace of $V$, then
\[
\dim(S) + \dim(S^\perp) = n.
\]

Let $S = \operatorname{range}(A)$ for an $m \times n$ matrix $A$. By the definition of the range of a matrix, if $y \in S$, then $y = Ax$ for some vector $x \in \mathbb{R}^n$. If $z \in \mathbb{R}^m$ is such that $z^T y = 0$, then we have
\[
z^T A x = z^T (A^T)^T x = (A^T z)^T x = 0.
\]
If this is true for all $y \in S$, that is, if $z \in S^\perp$, then we must have $A^T z = 0$. Therefore $z \in \operatorname{null}(A^T)$, and we conclude that $S^\perp = \operatorname{null}(A^T)$. That is, the orthogonal complement of the range is the null space of the transpose.

Example Let
\[
A = \begin{bmatrix} 1 & 1 \\ 2 & -1 \\ 3 & 2 \end{bmatrix}.
\]
If $S = \operatorname{range}(A) = \operatorname{span}\{a_1, a_2\}$, where $a_1$ and $a_2$ are the columns of $A$, then $S^\perp = \operatorname{span}\{b\}$, where $b = a_1 \times a_2 = \begin{bmatrix} 7 & 1 & -3 \end{bmatrix}^T$. We then note that
\[
A^T b = \begin{bmatrix} 1 & 2 & 3 \\ 1 & -1 & 2 \end{bmatrix} \begin{bmatrix} 7 \\ 1 \\ -3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.
\]
That is, $b$ is in the null space of $A^T$. In fact, because of the relationships
\[
\operatorname{rank}(A^T) + \dim(\operatorname{null}(A^T)) = 3, \qquad \operatorname{rank}(A) = \operatorname{rank}(A^T) = 2,
\]
we have $\dim(\operatorname{null}(A^T)) = 1$, so $\operatorname{null}(A^T) = \operatorname{span}\{b\}$.

A basis for a subspace $S \subseteq \mathbb{R}^m$ is an orthonormal basis if the vectors in the basis form an orthonormal set.
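The orthogonal-complement example above is easy to check numerically. The following is a minimal sketch using NumPy (the library and variable names are illustrative and not part of the original notes): it forms $b = a_1 \times a_2$ and confirms that $A^T b = 0$, i.e. that $b$ spans $\operatorname{null}(A^T) = \operatorname{range}(A)^\perp$.

import numpy as np

# Columns of A span a two-dimensional subspace S = range(A) of R^3.
A = np.array([[1.0,  1.0],
              [2.0, -1.0],
              [3.0,  2.0]])
a1, a2 = A[:, 0], A[:, 1]

# The cross product of the two basis vectors is orthogonal to both,
# so it spans the orthogonal complement S^perp.
b = np.cross(a1, a2)
print(b)                              # [ 7.  1. -3.]

# b lies in null(A^T): A^T b should be the zero vector.
print(A.T @ b)                        # [0. 0.]

# dim(S) + dim(S^perp) = 3.
print(np.linalg.matrix_rank(A) + 1)   # 3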
For example, let $Q_1$ be an $m \times r$ matrix with orthonormal columns. Then these columns form an orthonormal basis for $\operatorname{range}(Q_1)$. If $r < m$, then this basis can be extended to a basis of all of $\mathbb{R}^m$ by obtaining an orthonormal basis of $\operatorname{range}(Q_1)^\perp = \operatorname{null}(Q_1^T)$.

If we define $Q_2$ to be a matrix whose columns form an orthonormal basis of $\operatorname{range}(Q_1)^\perp$, then the columns of the $m \times m$ matrix $Q = \begin{bmatrix} Q_1 & Q_2 \end{bmatrix}$ are also orthonormal. It follows that $Q^T Q = I$. That is, $Q^T = Q^{-1}$. Furthermore, because the inverse of a matrix is unique, $Q Q^T = I$ as well. We say that $Q$ is an orthogonal matrix.

One interesting property of orthogonal matrices is that they preserve certain norms. For example, if $Q$ is an orthogonal $n \times n$ matrix, then $\|Qx\|_2^2 = x^T Q^T Q x = x^T x = \|x\|_2^2$. That is, orthogonal matrices preserve $\ell_2$-norms. Geometrically, multiplication by an $n \times n$ orthogonal matrix preserves the length of a vector, and performs a rotation in $n$-dimensional space.

Example The matrix
\[
Q = \begin{bmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{bmatrix}
\]
is an orthogonal matrix, for any angle $\theta$. For any vector $x \in \mathbb{R}^3$, the vector $Qx$ can be obtained by rotating $x$ clockwise by the angle $\theta$ in the $xz$-plane, while $Q^T = Q^{-1}$ performs a counterclockwise rotation by $\theta$ in the $xz$-plane.

In addition, orthogonal matrices preserve the matrix $\ell_2$ and Frobenius norms. That is, if $A$ is an $m \times n$ matrix, and $Q$ and $Z$ are $m \times m$ and $n \times n$ orthogonal matrices, respectively, then
\[
\|QAZ\|_F = \|A\|_F, \qquad \|QAZ\|_2 = \|A\|_2.
\]

The Singular Value Decomposition

The matrix $\ell_2$-norm can be used to obtain a highly useful decomposition of any $m \times n$ matrix $A$. Let $x$ be a unit $\ell_2$-norm vector (that is, $\|x\|_2 = 1$) such that $\|Ax\|_2 = \|A\|_2$. If $z = Ax$, and we define $y = z/\|z\|_2$, then we have $Ax = \sigma y$, with $\sigma = \|A\|_2$, and both $x$ and $y$ are unit vectors.

We can extend $x$ and $y$, separately, to orthonormal bases of $\mathbb{R}^n$ and $\mathbb{R}^m$, respectively, to obtain orthogonal matrices $V_1 = \begin{bmatrix} x & V_2 \end{bmatrix} \in \mathbb{R}^{n \times n}$ and $U_1 = \begin{bmatrix} y & U_2 \end{bmatrix} \in \mathbb{R}^{m \times m}$. We then have
\[
U_1^T A V_1 = \begin{bmatrix} y^T \\ U_2^T \end{bmatrix} A \begin{bmatrix} x & V_2 \end{bmatrix}
= \begin{bmatrix} y^T A x & y^T A V_2 \\ U_2^T A x & U_2^T A V_2 \end{bmatrix}
= \begin{bmatrix} \sigma y^T y & y^T A V_2 \\ \sigma U_2^T y & U_2^T A V_2 \end{bmatrix}
= \begin{bmatrix} \sigma & w^T \\ 0 & B \end{bmatrix} \equiv A_1,
\]
where $B = U_2^T A V_2$ and $w = V_2^T A^T y$. Now, let
\[
s = \begin{bmatrix} \sigma \\ w \end{bmatrix}.
\]
Then $\|s\|_2^2 = \sigma^2 + \|w\|_2^2$. Furthermore, the first component of $A_1 s$ is $\sigma^2 + \|w\|_2^2$. It follows from the invariance of the matrix $\ell_2$-norm under orthogonal transformations that
\[
\sigma^2 = \|A\|_2^2 = \|A_1\|_2^2 \geq \frac{\|A_1 s\|_2^2}{\|s\|_2^2} \geq \frac{(\sigma^2 + \|w\|_2^2)^2}{\sigma^2 + \|w\|_2^2} = \sigma^2 + \|w\|_2^2,
\]
and therefore we must have $w = 0$. We then have
\[
U_1^T A V_1 = A_1 = \begin{bmatrix} \sigma & 0 \\ 0 & B \end{bmatrix}.
\]
Continuing this process on $B$, and keeping in mind that $\|B\|_2 \leq \|A\|_2$, we obtain the decomposition
\[
U^T A V = \Sigma,
\]
where
\[
U = \begin{bmatrix} u_1 & \cdots & u_m \end{bmatrix} \in \mathbb{R}^{m \times m}, \qquad
V = \begin{bmatrix} v_1 & \cdots & v_n \end{bmatrix} \in \mathbb{R}^{n \times n}
\]
are both orthogonal matrices, and
\[
\Sigma = \operatorname{diag}(\sigma_1, \ldots, \sigma_p) \in \mathbb{R}^{m \times n}, \qquad p = \min\{m, n\},
\]
is a diagonal matrix, in which $\Sigma_{ii} = \sigma_i$ for $i = 1, 2, \ldots, p$, and $\Sigma_{ij} = 0$ for $i \neq j$. Furthermore, we have
\[
\|A\|_2 = \sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_p \geq 0.
\]

This decomposition of $A$ is called the singular value decomposition, or SVD. It is more commonly written as a factorization of $A$,
\[
A = U \Sigma V^T.
\]
The diagonal entries of $\Sigma$ are the singular values of $A$. The columns of $U$ are the left singular vectors, and the columns of $V$ are the right singular vectors. It follows from the SVD itself that the singular values and vectors satisfy the relations
\[
A v_i = \sigma_i u_i, \qquad A^T u_i = \sigma_i v_i, \qquad i = 1, 2, \ldots, \min\{m, n\}.
\]
For convenience, we denote the $i$th largest singular value of $A$ by $\sigma_i(A)$, and the largest and smallest singular values are commonly denoted by $\sigma_{\max}(A)$ and $\sigma_{\min}(A)$, respectively.
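The norm-preservation properties above can be illustrated numerically. The sketch below (NumPy; the angle, random seed, and variable names are illustrative, not from the notes) builds the rotation matrix $Q(\theta)$ from the example, checks that $Q^T Q = I$, that $\|Qx\|_2 = \|x\|_2$ for a sample vector $x$, and that multiplying a matrix by orthogonal factors leaves its $\ell_2$ and Frobenius norms unchanged.

import numpy as np

theta = 0.7                      # any angle theta works
Q = np.array([[ np.cos(theta), 0.0, np.sin(theta)],
              [ 0.0,           1.0, 0.0          ],
              [-np.sin(theta), 0.0, np.cos(theta)]])

# Q is orthogonal: Q^T Q = I, so Q^T = Q^{-1}.
print(np.allclose(Q.T @ Q, np.eye(3)))                               # True

# Orthogonal matrices preserve vector 2-norms.
rng = np.random.default_rng(0)
x = rng.standard_normal(3)
print(np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x)))          # True

# They also preserve the matrix 2-norm and the Frobenius norm.
A = rng.standard_normal((3, 3))
Z = Q.T                          # any orthogonal matrix will do
print(np.isclose(np.linalg.norm(Q @ A @ Z, 2),
                 np.linalg.norm(A, 2)))                              # True
print(np.isclose(np.linalg.norm(Q @ A @ Z, 'fro'),
                 np.linalg.norm(A, 'fro')))                          # True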
The SVD conveys much useful information about the structure of a matrix, particularly with regard to systems of linear equations involving the matrix. Let $r$ be the number of nonzero singular values. Then $r$ is the rank of $A$, and
\[
\operatorname{range}(A) = \operatorname{span}\{u_1, \ldots, u_r\}, \qquad
\operatorname{null}(A) = \operatorname{span}\{v_{r+1}, \ldots, v_n\}.
\]
That is, the SVD yields orthonormal bases of the range and null space of $A$. It follows that we can write
\[
A = \sum_{i=1}^{r} \sigma_i u_i v_i^T.
\]
This is called the SVD expansion of $A$. If $m \geq n$, then this expansion yields the "economy-size" SVD
\[
A = U_1 \Sigma_1 V^T,
\]
where
\[
U_1 = \begin{bmatrix} u_1 & \cdots & u_n \end{bmatrix} \in \mathbb{R}^{m \times n}, \qquad
\Sigma_1 = \operatorname{diag}(\sigma_1, \ldots, \sigma_n) \in \mathbb{R}^{n \times n}.
\]

Example The matrix
\[
A = \begin{bmatrix} 11 & 19 & 11 \\ 9 & 21 & 9 \\ 10 & 20 & 10 \end{bmatrix}
\]
has the SVD $A = U \Sigma V^T$, where
\[
U = \begin{bmatrix} 1/\sqrt{3} & -1/\sqrt{2} & 1/\sqrt{6} \\ 1/\sqrt{3} & 1/\sqrt{2} & 1/\sqrt{6} \\ 1/\sqrt{3} & 0 & -\sqrt{2/3} \end{bmatrix}, \qquad
V = \begin{bmatrix} 1/\sqrt{6} & -1/\sqrt{3} & 1/\sqrt{2} \\ \sqrt{2/3} & 1/\sqrt{3} & 0 \\ 1/\sqrt{6} & -1/\sqrt{3} & -1/\sqrt{2} \end{bmatrix},
\]
and
\[
\Sigma = \begin{bmatrix} 42.4264 & 0 & 0 \\ 0 & 2.4495 & 0 \\ 0 & 0 & 0 \end{bmatrix}.
\]
Let $U = \begin{bmatrix} u_1 & u_2 & u_3 \end{bmatrix}$ and $V = \begin{bmatrix} v_1 & v_2 & v_3 \end{bmatrix}$ be column partitions of $U$ and $V$, respectively.
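The SVD of the example matrix, and the rank, range, and null-space statements above, can be reproduced with a library SVD routine. Below is a minimal NumPy sketch (illustrative only, not from the notes); note that numpy.linalg.svd returns $V^T$ rather than $V$, and the signs of individual singular-vector pairs may differ from the factors printed above.

import numpy as np

A = np.array([[11.0, 19.0, 11.0],
              [ 9.0, 21.0,  9.0],
              [10.0, 20.0, 10.0]])

# numpy returns A = U @ diag(s) @ Vt, with singular values in decreasing order.
U, s, Vt = np.linalg.svd(A)
print(s)                                   # approx [42.4264  2.4495  0.    ]

# The number of nonzero singular values is the rank of A.
r = int(np.sum(s > 1e-10 * s[0]))
print(r)                                   # 2

# SVD expansion: A equals the sum of the r rank-one terms sigma_i u_i v_i^T.
A_expansion = sum(s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(r))
print(np.allclose(A_expansion, A))         # True

# The remaining right singular vectors span null(A): A v_3 = 0.
print(np.allclose(A @ Vt[r:].T, 0.0))      # True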