Chapter 5 Orthogonality
Orthogonality is the mathematical formalization of the geometrical property of perpendicularity, as adapted to general inner product spaces. In finite-dimensional spaces, bases that consist of mutually orthogonal elements play an essential role in theoretical development, in applications, and in practical numerical algorithms. Many computations become dramatically simpler and less prone to numerical instabilities when performed in orthogonal systems. In infinite-dimensional function space, orthogonality unlocks the secrets of Fourier analysis and its extensions, and underlies basic series solution methods for partial differential equations. Indeed, many large scale modern applications, including signal processing and computer vision, would be impractical, if not completely infeasible, were it not for the dramatic simplifying power of orthogonality. As we will later discover, orthogonal systems naturally arise as eigenvector and eigenfunction bases for symmetric matrices and self-adjoint boundary value problems for both ordinary and partial differential equations, and so play a major role in both finite-dimensional and infinite-dimensional analysis.

Orthogonality is motivated by geometry, and the resulting constructions have significant geometrical consequences. Orthogonal matrices play an essential role in the geometry of Euclidean space, in computer graphics and animation, and in three-dimensional image analysis; details will appear in Chapter 7. The orthogonal projection of a point onto a subspace turns out to be the closest point, that is, the least squares minimizer. Moreover, when written in terms of an orthogonal basis for the subspace, the normal equations underlying data estimation have an elegant explicit solution formula that offers numerical advantages over the direct approach to least squares minimization. Yet another important fact is that the four fundamental subspaces of a matrix form mutually orthogonal pairs. This observation leads directly to a new characterization of the compatibility conditions for linear systems, known as the Fredholm alternative, which is of further importance in the analysis of boundary value problems and differential equations.

The duly famous Gram–Schmidt process will convert an arbitrary basis of an inner product space into an orthogonal basis. As such, it forms one of the premier algorithms of linear analysis, in both finite-dimensional vector spaces and also in function space, where it is used to construct orthogonal polynomials and other systems of orthogonal functions. In Euclidean space, the Gram–Schmidt process can be re-interpreted as a new kind of matrix factorization, in which a nonsingular matrix $A = Q R$ is written as the product of an orthogonal matrix $Q$ and an upper triangular matrix $R$. This turns out to have significant applications in the numerical computation of eigenvalues, cf. Section 10.6.

Figure 5.1. Orthogonal Bases in $\mathbb{R}^2$ and $\mathbb{R}^3$.

5.1. Orthogonal Bases.

Let $V$ be a fixed real† inner product space. Recall that two elements $v, w \in V$ are called orthogonal if their inner product vanishes: $\langle v , w \rangle = 0$. In the case of vectors in Euclidean space, orthogonality under the dot product means that they meet at a right angle. A particularly important configuration is when $V$ admits a basis consisting of mutually orthogonal elements.

Definition 5.1. A basis $u_1, \ldots, u_n$ of $V$ is called orthogonal if $\langle u_i , u_j \rangle = 0$ for all $i \neq j$. The basis is called orthonormal if, in addition, each vector has unit length: $\| u_i \| = 1$ for all $i = 1, \ldots, n$.
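The two conditions in Definition 5.1 are easy to test numerically. Here is a minimal Python/NumPy sketch of such a check; the helper names is_orthogonal and is_orthonormal and the roundoff tolerance are illustrative choices for this example, not notation used elsewhere in the text.

\begin{verbatim}
import numpy as np

def is_orthogonal(vectors, tol=1e-12):
    # <v_i, v_j> = 0 for every distinct pair i != j
    return all(abs(np.dot(vectors[i], vectors[j])) < tol
               for i in range(len(vectors))
               for j in range(i + 1, len(vectors)))

def is_orthonormal(vectors, tol=1e-12):
    # orthogonal, and in addition every vector has unit length ||v_i|| = 1
    return (is_orthogonal(vectors, tol) and
            all(abs(np.linalg.norm(v) - 1.0) < tol for v in vectors))

# Two unit vectors in R^2 meeting at a right angle form an orthonormal basis.
u1 = np.array([1.0, 1.0]) / np.sqrt(2.0)
u2 = np.array([1.0, -1.0]) / np.sqrt(2.0)
print(is_orthogonal([u1, u2]), is_orthonormal([u1, u2]))   # True True
\end{verbatim}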
For the Euclidean space $\mathbb{R}^n$ equipped with the standard dot product, the simplest example of an orthonormal basis is the standard basis
\[
e_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \\ \vdots \\ 0 \\ 0 \end{pmatrix}, \qquad
e_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \\ \vdots \\ 0 \\ 0 \end{pmatrix}, \qquad \ldots \qquad
e_n = \begin{pmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{pmatrix}.
\]
Orthogonality follows because $e_i \cdot e_j = 0$ for $i \neq j$, while $\| e_i \| = 1$ implies normality.

Since a basis cannot contain the zero vector, there is an easy way to convert an orthogonal basis to an orthonormal basis. Namely, one replaces each basis vector by a unit vector pointing in the same direction, as in Lemma 3.16.

† The methods can be adapted more or less straightforwardly to complex inner product spaces. The main complication, as noted in Section 3.6, is that we need to be careful with the order of the vectors appearing in the non-symmetric complex inner products. In this chapter, we will write all inner product formulas in the proper order, so that they retain their validity in complex vector spaces.

Lemma 5.2. If $v_1, \ldots, v_n$ is an orthogonal basis of a vector space $V$, then the normalized vectors $u_i = v_i / \| v_i \|$ form an orthonormal basis.

Example 5.3. The vectors
\[
v_1 = \begin{pmatrix} 1 \\ 2 \\ -1 \end{pmatrix}, \qquad
v_2 = \begin{pmatrix} 0 \\ 1 \\ 2 \end{pmatrix}, \qquad
v_3 = \begin{pmatrix} 5 \\ -2 \\ 1 \end{pmatrix},
\]
are easily seen to form a basis of $\mathbb{R}^3$. Moreover, they are mutually perpendicular, $v_1 \cdot v_2 = v_1 \cdot v_3 = v_2 \cdot v_3 = 0$, and so form an orthogonal basis with respect to the standard dot product on $\mathbb{R}^3$. When we divide each orthogonal basis vector by its length, the result is the orthonormal basis
\[
u_1 = \frac{1}{\sqrt{6}} \begin{pmatrix} 1 \\ 2 \\ -1 \end{pmatrix}
    = \begin{pmatrix} \tfrac{1}{\sqrt{6}} \\[2pt] \tfrac{2}{\sqrt{6}} \\[2pt] -\tfrac{1}{\sqrt{6}} \end{pmatrix}, \qquad
u_2 = \frac{1}{\sqrt{5}} \begin{pmatrix} 0 \\ 1 \\ 2 \end{pmatrix}
    = \begin{pmatrix} 0 \\[2pt] \tfrac{1}{\sqrt{5}} \\[2pt] \tfrac{2}{\sqrt{5}} \end{pmatrix}, \qquad
u_3 = \frac{1}{\sqrt{30}} \begin{pmatrix} 5 \\ -2 \\ 1 \end{pmatrix}
    = \begin{pmatrix} \tfrac{5}{\sqrt{30}} \\[2pt] -\tfrac{2}{\sqrt{30}} \\[2pt] \tfrac{1}{\sqrt{30}} \end{pmatrix},
\]
satisfying $u_1 \cdot u_2 = u_1 \cdot u_3 = u_2 \cdot u_3 = 0$ and $\| u_1 \| = \| u_2 \| = \| u_3 \| = 1$. The appearance of square roots in the elements of an orthonormal basis is fairly typical.

A useful observation is that any orthogonal collection of nonzero vectors is automatically linearly independent.

Proposition 5.4. If $v_1, \ldots, v_k \in V$ are nonzero and mutually orthogonal, so $\langle v_i , v_j \rangle = 0$ for all $i \neq j$, then they are linearly independent.

Proof: Suppose
\[
c_1 v_1 + \cdots + c_k v_k = 0.
\]
Let us take the inner product of this equation with any $v_i$. Using linearity of the inner product and orthogonality, we compute
\[
0 = \langle c_1 v_1 + \cdots + c_k v_k , v_i \rangle
  = c_1 \langle v_1 , v_i \rangle + \cdots + c_k \langle v_k , v_i \rangle
  = c_i \langle v_i , v_i \rangle
  = c_i \| v_i \|^2.
\]
Therefore, provided $v_i \neq 0$, we conclude that the coefficient $c_i = 0$. Since this holds for all $i = 1, \ldots, k$, the linear independence of $v_1, \ldots, v_k$ follows. Q.E.D.

As a direct corollary, we infer that any collection of nonzero orthogonal vectors forms a basis for its span.

Proposition 5.5. Suppose $v_1, \ldots, v_n \in V$ are nonzero, mutually orthogonal elements of an inner product space $V$. Then $v_1, \ldots, v_n$ form an orthogonal basis for their span $W = \mathrm{span}\,\{ v_1, \ldots, v_n \} \subset V$, which is therefore a subspace of dimension $n = \dim W$. In particular, if $\dim V = n$, then $v_1, \ldots, v_n$ form an orthogonal basis for $V$.
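Lemma 5.2 and the computations in Example 5.3 are easily confirmed numerically. The following minimal NumPy sketch normalizes $v_1, v_2, v_3$ and checks that the resulting vectors are orthonormal; the variable names and the use of np.allclose to absorb roundoff are illustrative choices.

\begin{verbatim}
import numpy as np

# The orthogonal basis of Example 5.3.
v1 = np.array([1.0, 2.0, -1.0])
v2 = np.array([0.0, 1.0, 2.0])
v3 = np.array([5.0, -2.0, 1.0])

# Mutually perpendicular: every pairwise dot product vanishes.
print(v1 @ v2, v1 @ v3, v2 @ v3)        # 0.0 0.0 0.0

# Lemma 5.2: divide each vector by its length to obtain an orthonormal basis,
# assembled here as the columns of the matrix U.
U = np.column_stack([v / np.linalg.norm(v) for v in (v1, v2, v3)])

# The columns of U are orthonormal precisely when U^T U is the identity matrix.
print(np.allclose(U.T @ U, np.eye(3)))  # True
\end{verbatim}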
Orthogonality is also of profound significance for function spaces. Here is a relatively simple example.

Example 5.6. Consider the vector space $\mathcal{P}^{(2)}$ consisting of all quadratic polynomials $p(x) = \alpha + \beta x + \gamma x^2$, equipped with the $L^2$ inner product and norm
\[
\langle p , q \rangle = \int_0^1 p(x)\, q(x)\, dx, \qquad
\| p \| = \sqrt{\langle p , p \rangle} = \sqrt{ \int_0^1 p(x)^2 \, dx }.
\]
The standard monomials $1, x, x^2$ do not form an orthogonal basis. Indeed,
\[
\langle 1 , x \rangle = \tfrac{1}{2}, \qquad
\langle 1 , x^2 \rangle = \tfrac{1}{3}, \qquad
\langle x , x^2 \rangle = \tfrac{1}{4}.
\]
One orthogonal basis of $\mathcal{P}^{(2)}$ is provided by the following polynomials:
\[
p_1(x) = 1, \qquad p_2(x) = x - \tfrac{1}{2}, \qquad p_3(x) = x^2 - x + \tfrac{1}{6}. \tag{5.1}
\]
Indeed, one easily verifies that $\langle p_1 , p_2 \rangle = \langle p_1 , p_3 \rangle = \langle p_2 , p_3 \rangle = 0$, while
\[
\| p_1 \| = 1, \qquad
\| p_2 \| = \frac{1}{\sqrt{12}} = \frac{1}{2 \sqrt{3}}, \qquad
\| p_3 \| = \frac{1}{\sqrt{180}} = \frac{1}{6 \sqrt{5}}.
\]
The corresponding orthonormal basis is found by dividing each orthogonal basis element by its norm:
\[
u_1(x) = 1, \qquad
u_2(x) = \sqrt{3}\, ( 2 x - 1 ), \qquad
u_3(x) = \sqrt{5}\, ( 6 x^2 - 6 x + 1 ).
\]
In Section 5.4 below, we will learn how to systematically construct such orthogonal systems of polynomials.

Computations in Orthogonal Bases

What are the advantages of orthogonal and orthonormal bases? Once one has a basis of a vector space, a key issue is how to express other elements as linear combinations of the basis elements, that is, to find their coordinates in the prescribed basis. In general, this is not so easy, since it requires solving a system of linear equations, (2.22). In high dimensional situations arising in applications, computing the solution may require a considerable, if not infeasible, amount of time and effort. However, if the basis is orthogonal, or, even better, orthonormal, then the change of basis computation requires almost no work. This is the crucial insight underlying the efficacy of both discrete and continuous Fourier analysis, least squares approximations and statistical analysis of large data sets, signal, image and video processing, and a multitude of other essential applications.
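To make this claim concrete, here is a small NumPy sketch contrasting the two routes in $\mathbb{R}^3$, using the orthonormal basis $u_1, u_2, u_3$ of Example 5.3: for a general basis one must solve a linear system for the coordinates, whereas for an orthonormal basis each coordinate is a single inner product, $c_i = \langle b , u_i \rangle$, a standard consequence of orthonormality. The test vector $b$ and the variable names are illustrative choices.

\begin{verbatim}
import numpy as np

# Orthonormal basis of R^3 from Example 5.3, stored as the columns of U.
v1 = np.array([1.0, 2.0, -1.0])
v2 = np.array([0.0, 1.0, 2.0])
v3 = np.array([5.0, -2.0, 1.0])
U = np.column_stack([v / np.linalg.norm(v) for v in (v1, v2, v3)])

b = np.array([1.0, 1.0, 1.0])      # an element whose coordinates we want

# General approach: solve the linear system U c = b for the coordinates c.
c_solve = np.linalg.solve(U, b)

# Orthonormal basis: each coordinate is a single dot product, c_i = b . u_i,
# that is, c = U^T b, and no linear system needs to be solved.
c_dot = U.T @ b

print(np.allclose(c_solve, c_dot))  # True
\end{verbatim}

The dot-product route also scales gracefully: computing all $n$ coordinates costs one inner product per basis vector, roughly $n^2$ operations, whereas Gaussian elimination on a general basis costs on the order of $n^3$.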