The Gram-Schmidt Procedure, Orthogonal Complements, and Orthogonal Projections
1 Orthogonal Vectors and Gram-Schmidt

In this section we will develop the standard algorithm for producing orthonormal sets of vectors and explore some related matters. We present the results in a general real inner product space $V$ rather than just in $\mathbb{R}^n$. We will make use of this level of generality later on when we discuss the topic of conjugate direction methods and the related conjugate gradient methods for optimization. There, once again, we will meet the Gram-Schmidt process.

We begin by recalling that a set of non-zero vectors $\{v_1, \ldots, v_k\}$ is called an orthogonal set provided that, for any indices $i, j$ with $i \neq j$, the inner products $\langle v_i, v_j \rangle = 0$. It is called an orthonormal set provided that, in addition, $\langle v_i, v_i \rangle = \|v_i\|^2 = 1$.

It should be clear that any orthogonal set of vectors must be a linearly independent set since, if $\alpha_1 v_1 + \cdots + \alpha_k v_k = 0$, then, for any $i = 1, \ldots, k$, taking the inner product of the sum with $v_i$, and using linearity of the inner product and the orthogonality of the vectors,
\[
\langle v_i,\, \alpha_1 v_1 + \cdots + \alpha_k v_k \rangle = \alpha_i \langle v_i, v_i \rangle = 0\,.
\]
But since $\langle v_i, v_i \rangle \neq 0$ we must have $\alpha_i = 0$. This means, in particular, that in any $n$-dimensional space any set of $n$ orthogonal vectors forms a basis.

The Gram-Schmidt Orthogonalization Process is a constructive method, valid in any finite-dimensional inner product space, which replaces any basis $U = \{u_1, u_2, \ldots, u_n\}$ with an orthonormal basis $V = \{v_1, v_2, \ldots, v_n\}$. Moreover, the replacement is made in such a way that, for all $k = 1, 2, \ldots, n$, the subspace spanned by the first $k$ vectors $\{u_1, \ldots, u_k\}$ and that spanned by the new vectors $\{v_1, \ldots, v_k\}$ are the same.

To do this we proceed inductively. First observe that $u_1 \neq 0$ since $U$ is a linearly independent set. We take $v_1 = u_1/\|u_1\|$. Suppose now that $v_1, \ldots, v_k$ have been chosen so that they form an orthonormal set and so that each $v_j$, $j = 1, \ldots, k$, is a linear combination of the vectors $u_1, \ldots, u_k$. We write
\[
w = u_{k+1} - (\alpha_1 v_1 + \cdots + \alpha_k v_k)\,,
\]
where the values of the scalars $\alpha_1, \ldots, \alpha_k$ are still to be determined. Since
\[
\langle w, v_j \rangle = \langle u_{k+1} - (\alpha_1 v_1 + \cdots + \alpha_k v_k),\, v_j \rangle = \langle u_{k+1}, v_j \rangle - \alpha_j \quad \text{for } j = 1, \ldots, k\,,
\]
it follows that if we choose $\alpha_j = \langle u_{k+1}, v_j \rangle$ then $\langle w, v_j \rangle = 0$ for $j = 1, \ldots, k$. Since, moreover, $w$ is a linear combination of $u_{k+1}$ and $v_1, \ldots, v_k$, it is also a linear combination of $u_{k+1}$ and $u_1, \ldots, u_k$. Finally, the vector $w \neq 0$ since $u_1, \ldots, u_k, u_{k+1}$ are linearly independent and the coefficient of $u_{k+1}$ in the expression for $w$ is not zero. We may now define $v_{k+1} = w/\|w\|$. The set $\{v_1, \ldots, v_k, v_{k+1}\}$ is certainly an orthonormal set with the required properties, and the proof by induction is complete.

We can summarize the procedure by listing a series of steps. It is really irrelevant whether we normalize at each step; we do not do it here, preferring to do so, if necessary, at the end of the procedure.

The Gram-Schmidt Procedure

Step 1: $v_1 = u_1$. Compute $\|v_1\|^2$.

Step 2: $v_2 = u_2 - \dfrac{\langle u_2, v_1 \rangle}{\|v_1\|^2}\, v_1$. Compute $\|v_2\|^2$.

Step 3: $v_3 = u_3 - \dfrac{\langle u_3, v_1 \rangle}{\|v_1\|^2}\, v_1 - \dfrac{\langle u_3, v_2 \rangle}{\|v_2\|^2}\, v_2$. Compute $\|v_3\|^2$.

$\vdots$

Step k: $v_k = u_k - \displaystyle\sum_{i=1}^{k-1} \frac{\langle u_k, v_i \rangle}{\|v_i\|^2}\, v_i$. Compute $\|v_k\|^2$.

$\vdots$
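The procedure translates directly into code. Below is a minimal NumPy sketch; the names `gram_schmidt` and `normalize`, and the `inner` parameter (defaulting to the Euclidean inner product), are illustrative choices rather than anything fixed by the text. As in the listing above, normalization is deferred to the end.

```python
import numpy as np

def gram_schmidt(vectors, inner=np.dot):
    """Orthogonalize a linearly independent family, following the steps above.

    Each v_k is u_k minus its components along the previously constructed
    v_1, ..., v_{k-1}.  Normalization is deferred to normalize() below.
    """
    vs = []
    for u in vectors:
        u = np.asarray(u, dtype=float)
        w = u.copy()
        for v in vs:
            # subtract the component of u along v: (<u, v>/||v||^2) v
            w = w - (inner(u, v) / inner(v, v)) * v
        if np.allclose(w, 0.0):
            raise ValueError("the input vectors are linearly dependent")
        vs.append(w)
    return vs

def normalize(vs, inner=np.dot):
    """Divide each vector by its norm to produce an orthonormal set."""
    return [v / np.sqrt(inner(v, v)) for v in vs]
```

Keeping the inner product as a parameter costs nothing and lets the same sketch run in other inner product spaces, which we exploit in the polynomial example below.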
2 Examples

Let us give some examples.

Example 2.1 Let
\[
U = \left\{ \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix},\ \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix},\ \begin{pmatrix} 1 \\ 1 \\ 2 \end{pmatrix} \right\} = \{u_1, u_2, u_3\}\,.
\]
Then $v_1 = (1, -1, 1)^\top$ and $\|v_1\|^2 = 3$. Next we compute $v_2$:
\[
v_2 = u_2 - \frac{\langle u_2, v_1 \rangle}{\|v_1\|^2}\, v_1
    = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} - \frac{2}{3} \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix}
    = \begin{pmatrix} 1/3 \\ 2/3 \\ 1/3 \end{pmatrix}
  \quad\text{and}\quad \|v_2\|^2 = \frac{2}{3}\,.
\]
Finally, since $\langle u_3, v_1 \rangle = 2$ and $\langle u_3, v_2 \rangle / \|v_2\|^2 = (5/3)/(2/3) = 5/2$,
\[
v_3 = u_3 - \frac{\langle u_3, v_1 \rangle}{\|v_1\|^2}\, v_1 - \frac{\langle u_3, v_2 \rangle}{\|v_2\|^2}\, v_2
    = \begin{pmatrix} 1 \\ 1 \\ 2 \end{pmatrix} - \frac{2}{3} \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix} - \frac{5}{2} \begin{pmatrix} 1/3 \\ 2/3 \\ 1/3 \end{pmatrix}
    = \begin{pmatrix} -1/2 \\ 0 \\ 1/2 \end{pmatrix}
  \quad\text{and}\quad \|v_3\|^2 = \frac{1}{2}\,.
\]
The normalized set is
\[
\hat{v}_1 = \begin{pmatrix} 1/\sqrt{3} \\ -1/\sqrt{3} \\ 1/\sqrt{3} \end{pmatrix},\quad
\hat{v}_2 = \begin{pmatrix} 1/\sqrt{6} \\ 2/\sqrt{6} \\ 1/\sqrt{6} \end{pmatrix},\quad
\hat{v}_3 = \begin{pmatrix} -1/\sqrt{2} \\ 0 \\ 1/\sqrt{2} \end{pmatrix}.
\]

In a more geometric vein, we consider the next example.

Example 2.2 Let $H$ be the plane in $\mathbb{R}^3$ spanned by the vectors $u_1 = (1, 2, 2)^\top$ and $u_2 = (-1, 0, 2)^\top$. These vectors are clearly linearly independent and so form a basis for the plane. We wish to find an orthonormal basis for the plane and extend it to an orthonormal basis for all of $\mathbb{R}^3$. We enlarge the original set of two vectors to a basis for all of $\mathbb{R}^3$ by adding the vector $u_3 = (0, 0, 1)^\top$. Then the set $\{u_1, u_2, u_3\}$ is a linearly independent set in $\mathbb{R}^3$ and so forms a basis for the entire space. If one has any doubt about the linear independence of this set, just compute $\det(\mathrm{col}\,[u_1, u_2, u_3]) = 2 \neq 0$.

Now, we could have orthogonalized the set consisting of the two given vectors and then added a third, but since the Gram-Schmidt procedure preserves the span at each stage, it is simpler to add the additional linearly independent vector now. The process then proceeds as usual:
\[
v_1 = u_1 \quad\text{and}\quad \|v_1\|^2 = 1^2 + 2^2 + 2^2 = 9\,,
\]
\[
v_2 = u_2 - \frac{\langle u_2, v_1 \rangle}{\|v_1\|^2}\, v_1
    = \begin{pmatrix} -1 \\ 0 \\ 2 \end{pmatrix} - \frac{3}{9} \begin{pmatrix} 1 \\ 2 \\ 2 \end{pmatrix}
    = \begin{pmatrix} -4/3 \\ -2/3 \\ 4/3 \end{pmatrix}.
\]
Note that $\|v_2\|^2 = 36/9 = 4$. Finally, since $\langle u_3, v_1 \rangle = 2$ and $\langle u_3, v_2 \rangle = 4/3$,
\[
v_3 = u_3 - \frac{\langle u_3, v_1 \rangle}{\|v_1\|^2}\, v_1 - \frac{\langle u_3, v_2 \rangle}{\|v_2\|^2}\, v_2
    = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} - \frac{2}{9} \begin{pmatrix} 1 \\ 2 \\ 2 \end{pmatrix} - \frac{1}{3} \begin{pmatrix} -4/3 \\ -2/3 \\ 4/3 \end{pmatrix}
    = \begin{pmatrix} 2/9 \\ -2/9 \\ 1/9 \end{pmatrix}.
\]
Now $v_1$ and $v_2$ are an orthogonal basis for the plane $H$ and, together with $v_3$, form an orthogonal basis for all of $\mathbb{R}^3$. In order to get the orthonormal basis, we merely divide each vector by its norm. Since, as we have seen, $\|v_1\| = 3$ and $\|v_2\| = 2$, we need only compute $\|v_3\| = \sqrt{4/81 + 4/81 + 1/81} = \sqrt{1/9} = 1/3$. Hence the vectors of the required orthonormal basis are
\[
\hat{v}_1 = \begin{pmatrix} 1/3 \\ 2/3 \\ 2/3 \end{pmatrix},\quad
\hat{v}_2 = \begin{pmatrix} -2/3 \\ -1/3 \\ 2/3 \end{pmatrix},\quad\text{and}\quad
\hat{v}_3 = \begin{pmatrix} 2/3 \\ -2/3 \\ 1/3 \end{pmatrix}.
\]
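Both computations are easy to check numerically. Assuming the `gram_schmidt` and `normalize` sketches from Section 1, the following reproduces the hand calculations:

```python
import numpy as np

# Example 2.1: orthogonalize {u1, u2, u3}.
v1, v2, v3 = gram_schmidt([[1, -1, 1], [1, 0, 1], [1, 1, 2]])
print(v2, v3)    # v2 ~ (1/3, 2/3, 1/3),  v3 ~ (-1/2, 0, 1/2)
print(np.dot(v1, v2), np.dot(v1, v3), np.dot(v2, v3))  # all zero up to rounding

# Example 2.2: the basis of H extended by u3 = (0, 0, 1)^T.
for v in normalize(gram_schmidt([[1, 2, 2], [-1, 0, 2], [0, 0, 1]])):
    print(v)     # (1/3, 2/3, 2/3), (-2/3, -1/3, 2/3), (2/3, -2/3, 1/3)
```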
As another example, we leave the vector space $\mathbb{R}^n$.

Example 2.3 Here we look at the space of polynomials of degree at most 3, defined on the interval $[-1, 1]$ and having real coefficients. This is the vector space we denote by $P_3([-1, 1])$. We take, as a basis, the monomials $\{1, t, t^2, t^3\}$. These polynomials clearly span the vector space, and they are linearly independent since, if $\alpha_0 \cdot 1 + \alpha_1 t + \alpha_2 t^2 + \alpha_3 t^3 = 0$ for all $t \in [-1, 1]$, then all the $\alpha_i = 0$, because such a polynomial, if not the zero polynomial, can have at most three real roots according to the Fundamental Theorem of Algebra.

In this vector space we introduce the form
\[
\langle p_1, p_2 \rangle = \int_{-1}^{1} p_1(t)\, p_2(t)\, dt\,.
\]
We claim that this form is an inner product on $P_3([-1, 1])$. To verify that the claim is true, we must show that the form is a positive definite, symmetric, bilinear form.

First, since $p_1(t)\, p_2(t) = p_2(t)\, p_1(t)$, the form is clearly symmetric. Moreover, since $p^2(t) \geq 0$ for any $p \in P_3([-1, 1])$, we certainly know that
\[
\int_{-1}^{1} p^2(t)\, dt \geq 0\,,
\]
and the integral is equal to $0$ if and only if $p(t) \equiv 0$ on $[-1, 1]$. So the form is positive definite. Since we already know that the form is symmetric, it suffices to show that the form is linear in the first argument.
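The Gram-Schmidt procedure works verbatim in this inner product space. As an illustration, here is a sketch that reuses the `gram_schmidt` function from Section 1, representing a polynomial by its coefficient array and evaluating the integral exactly from the formula $\int_{-1}^{1} t^n\, dt = 2/(n+1)$ for even $n$ (and $0$ for odd $n$). The name `poly_inner` is an illustrative choice; the outputs turn out to be scalar multiples of the classical Legendre polynomials, a standard fact not stated in the text above.

```python
import numpy as np

def poly_inner(p, q):
    """<p, q> = integral of p(t) q(t) over [-1, 1], where p[i] and q[j]
    are the coefficients of t^i and t^j respectively."""
    total = 0.0
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:   # odd powers integrate to zero over [-1, 1]
                total += a * b * 2.0 / (i + j + 1)
    return total

# Orthogonalize the monomial basis {1, t, t^2, t^3} of P3([-1, 1]).
monomials = np.eye(4)              # row i holds the coefficients of t^i
for v in gram_schmidt(monomials, inner=poly_inner):
    print(v)
# Up to rounding: 1,  t,  t^2 - 1/3,  t^3 - (3/5) t
```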