
Lecture 25: 6.3 Orthonormal Bases


Wei-Ta Chu

Theorem 6.3.2

 Theorem 6.3.2  If S is an orthonormal basis for an n-dimensional inner product space V, and if (u)S = (u1, u2, …, un) and (v)S = (v1, v2, …, vn), then:

||u|| = √(u1² + u2² + … + un²)

d(u, v) = √((u1 − v1)² + (u2 − v2)² + … + (un − vn)²)

<u, v> = u1v1 + u2v2 + … + unvn

 Remark  By working with orthonormal bases, the computation of general norms and inner products can be reduced to the computation of Euclidean norms and inner products of the coordinate vectors.
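Not from the slides: a minimal NumPy sketch of this remark. It builds an arbitrary orthonormal basis of R3 (via a QR factorization), takes coordinates relative to that basis, and checks the three identities of Theorem 6.3.2.

```python
import numpy as np

# Columns of Q form some orthonormal basis of R^3; any such basis works.
Q, _ = np.linalg.qr(np.array([[1., 2., 0.],
                              [0., 1., 3.],
                              [1., 0., 1.]]))

u = np.array([1., 1., 1.])
v = np.array([2., -1., 3.])

# Coordinates relative to the orthonormal basis: (u)_S = (<u,q1>, <u,q2>, <u,q3>)
u_S, v_S = Q.T @ u, Q.T @ v

# Theorem 6.3.2: norm, distance, and inner product can be computed
# from the coordinate vectors exactly as in Euclidean space.
print(np.isclose(np.linalg.norm(u), np.linalg.norm(u_S)))            # True
print(np.isclose(np.linalg.norm(u - v), np.linalg.norm(u_S - v_S)))  # True
print(np.isclose(np.dot(u, v), np.dot(u_S, v_S)))                    # True
```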

Example

 If R3 has the Euclidean inner product, then the norm of the vector u = (1, 1, 1) is ||u|| = √(1² + 1² + 1²) = √3.

 However, if we let R3 have the orthonormal basis S of the last example, then the coordinate vector of u relative to S is
(u)S = (1, −1/5, 7/5)
 The norm of u computed from these coordinates (Theorem 6.3.2) is ||u|| = √(1² + (−1/5)² + (7/5)²) = √(75/25) = √3, the same value.

Coordinates Relative to Orthogonal Bases

 If S = {v1, v2, …, vn} is an orthogonal basis for an inner product space V, then normalizing each of these vectors yields the orthonormal basis
S' = { v1/||v1||, v2/||v2||, …, vn/||vn|| }
 Thus, if u is any vector in V, it follows from Theorem 6.3.1 that
u = <u, v1/||v1||> (v1/||v1||) + <u, v2/||v2||> (v2/||v2||) + … + <u, vn/||vn||> (vn/||vn||)
or

u = (<u, v1>/||v1||²) v1 + (<u, v2>/||v2||²) v2 + … + (<u, vn>/||vn||²) vn
 The above equation expresses u as a linear combination of the vectors in the orthogonal basis S.
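Not from the slides: a minimal NumPy sketch of this expansion, reusing the orthogonal (not yet normalized) basis that the Gram-Schmidt example later in this lecture produces; any orthogonal basis of R3 would do.

```python
import numpy as np

# An orthogonal (but not orthonormal) basis of R^3.
v1 = np.array([1., 1., 1.])
v2 = np.array([-2/3, 1/3, 1/3])
v3 = np.array([0., -1/2, 1/2])

u = np.array([2., 0., 5.])

# Coefficient of each basis vector is <u, vi> / ||vi||^2.
recon = sum((u @ vi) / (vi @ vi) * vi for vi in (v1, v2, v3))
print(np.allclose(recon, u))   # True: the expansion recovers u exactly
```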

Theorem 6.3.3

 Theorem 6.3.3

 If S = {v1, v2, …, vn} is an orthogonal set of nonzero vectors in an inner product space, then S is linearly independent.

Proof of Theorem 6.3.3

 Assume that k1v1 + k2v2 + … + knvn = 0. To demonstrate that S is linearly independent, we must prove that k1 = k2 = … = kn = 0.

 For each vi in S, <k1v1 + k2v2 + … + knvn, vi> = <0, vi> = 0 or, equivalently, k1<v1, vi> + k2<v2, vi> + … + kn<vn, vi> = 0

 From the orthogonality of S it follows that <vj, vi> = 0 when j ≠ i, so the equation reduces to ki<vi, vi> = 0

 Since the vectors in S are assumed to be nonzero, <vi, vi> ≠ 0. Therefore, ki = 0. Since the subscript i is arbitrary, we have k1 = k2 = … = kn = 0.

Theorem 6.3.4

 Theorem 6.3.4 (Projection Theorem)  If W is a finite-dimensional subspace of an inner product space V, then every vector u in V can be expressed in exactly one way as

u = w1 + w2  where w1 is in W and w2 is in W⊥.


Projection

 The vector w1 is called the orthogonal projection of u on W and is denoted projWu.

 The vector w2 is called the component of u orthogonal to W and is denoted by projW⊥u.

 u = projWu + projW⊥u

 Since w2 = u − w1, it follows that projW⊥u = u − projWu

 So we can write u = projWu + (u − projWu)

Theorem 6.3.5

 Theorem 6.3.5  Let W be a finite-dimensional subspace of an inner product space V.

 If {v1, …, vr} is an orthonormal basis for W, and u is any vector in V, then

projWu = <u, v1>v1 + <u, v2>v2 + … + <u, vr>vr

 If {v1, …, vr} is an orthogonal basis for W, and u is any vector in V, then

projWu = (<u, v1>/||v1||²) v1 + (<u, v2>/||v2||²) v2 + … + (<u, vr>/||vr||²) vr   (the normalization by ||vi||² is needed here)

Example

 Let R3 have the Euclidean inner product, and let W be the subspace spanned by the orthonormal vectors v1 = (0, 1, 0) and v2 = (−4/5, 0, 3/5).
 From the above theorem, the orthogonal projection of u = (1, 1, 1) on W is
projWu = <u, v1>v1 + <u, v2>v2 = (1)(0, 1, 0) + (−1/5)(−4/5, 0, 3/5) = (4/25, 1, −3/25)
 The component of u orthogonal to W is
projW⊥u = u − projWu = (1, 1, 1) − (4/25, 1, −3/25) = (21/25, 0, 28/25)

 Observe that projW⊥u is orthogonal to both v1 and v2.
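Not from the slides: a short NumPy check of this example, applying Theorem 6.3.5 with the orthonormal basis {v1, v2} of W.

```python
import numpy as np

u  = np.array([1., 1., 1.])
v1 = np.array([0., 1., 0.])
v2 = np.array([-4/5, 0., 3/5])

# Theorem 6.3.5 with an orthonormal basis: proj_W u = <u,v1>v1 + <u,v2>v2
proj_W = (u @ v1) * v1 + (u @ v2) * v2   # -> [ 4/25, 1, -3/25]
perp = u - proj_W                        # -> [21/25, 0, 28/25]

print(proj_W, perp)
# The orthogonal component is orthogonal to both basis vectors of W.
print(np.isclose(perp @ v1, 0), np.isclose(perp @ v2, 0))   # True True
```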

Finding Orthogonal/Orthonormal Bases

 Theorem 6.3.6

 Every nonzero finite-dimensional inner product space has an orthonormal basis.

 Remark

 The step-by-step construction for converting an arbitrary basis into an orthogonal basis is called the Gram-Schmidt process.

Proof of Theorem 6.3.6

 Let V be a nonzero finite-dimensional inner product

space, and suppose that {u1, u2, …, un} is any basis for V. It suffices to show that V has an orthogonal basis, since the vectors in the orthogonal basis can be normalized to produce an orthonormal basis for V.  The following sequence of steps will produce an

orthogonal basis {v1, v2, …, vn} for V.

 Step 1: Let v1 = u1.

Proof of Theorem 6.3.6

 Step 2: We can obtain a vector v2 that is orthogonal to v1 by computing the component of u2 that is orthogonal to the space W1 spanned by v1: v2 = u2 − projW1u2 = u2 − (<u2, v1>/||v1||²) v1

 Of course, if v2 = 0, then v2 cannot be a basis vector. But this cannot happen: if v2 = 0, then u2 = (<u2, v1>/||v1||²) v1 = (<u2, u1>/||u1||²) u1,

which says that u2 is a multiple of u1, contradicting the linear independence of the basis S = {u1, u2, …, un}.

Proof of Theorem 6.3.6

 Step 3: To construct a vector v3 that is orthogonal to both v1 and v2, we compute the component of u3 orthogonal to the space W2 spanned by v1 and v2. From Theorem 6.3.5(b): v3 = u3 − projW2u3 = u3 − (<u3, v1>/||v1||²) v1 − (<u3, v2>/||v2||²) v2

 As in Step 2, the linear independence of {u1, u2, …, un} ensures that v3 ≠ 0.

Proof of Theorem 6.3.6

 Step 4: To determine a vector v4 that is orthogonal to v1, v2, and v3, we compute the component of u4 orthogonal to the space W3 spanned by v1, v2, and v3: v4 = u4 − projW3u4

 Continuing in this way, we will obtain, after n steps, an orthogonal set of nonzero vectors {v1, v2, …, vn}. Since V is n-dimensional and every orthogonal set of nonzero vectors is linearly independent (Theorem 6.3.3), the set {v1, v2, …, vn} is an orthogonal basis for V.

Example (Gram-Schmidt Process)

 Consider the vector space R3 with the Euclidean inner product. Apply the Gram-Schmidt process to transform the basis vectors

u1 = (1, 1, 1), u2 = (0, 1, 1), u3 = (0, 0, 1)

into an orthogonal basis {v1, v2, v3}; then normalize the orthogonal basis vectors to obtain an orthonormal basis {q1, q2, q3}.

 Solution:

 Step 1: Let v1 = u1. That is, v1 = u1 = (1, 1, 1)
 Step 2: Let v2 = u2 − projW1u2. That is,
v2 = u2 − projW1u2 = u2 − (<u2, v1>/||v1||²) v1 = (0, 1, 1) − (2/3)(1, 1, 1) = (−2/3, 1/3, 1/3)

Example (Gram-Schmidt Process)

We have two vectors in W2 now!

 Step 3: Let v3 = u3 − projW2u3. That is,
v3 = u3 − projW2u3 = u3 − (<u3, v1>/||v1||²) v1 − (<u3, v2>/||v2||²) v2
   = (0, 0, 1) − (1/3)(1, 1, 1) − ((1/3)/(2/3))(−2/3, 1/3, 1/3) = (0, −1/2, 1/2)

 Thus, v1 = (1, 1, 1), v2 = (−2/3, 1/3, 1/3), v3 = (0, −1/2, 1/2) form an orthogonal basis for R3. The norms of these vectors are
||v1|| = √3,  ||v2|| = √6/3,  ||v3|| = 1/√2
so an orthonormal basis for R3 is
q1 = v1/||v1|| = (1/√3, 1/√3, 1/√3),  q2 = v2/||v2|| = (−2/√6, 1/√6, 1/√6),  q3 = v3/||v3|| = (0, −1/√2, 1/√2)
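The steps above translate directly into code. Not from the slides: a minimal Gram-Schmidt sketch (the function name gram_schmidt is mine) that reproduces v1, v2, v3 and q1, q2, q3 for this example.

```python
import numpy as np

def gram_schmidt(vectors):
    """Classical Gram-Schmidt: return an orthogonal basis and its normalized version."""
    ortho = []
    for u in vectors:
        v = u.astype(float).copy()
        # Subtract the projection of u onto each previously constructed vector.
        for w in ortho:
            v -= (u @ w) / (w @ w) * w
        ortho.append(v)
    orthonormal = [v / np.linalg.norm(v) for v in ortho]
    return ortho, orthonormal

u1, u2, u3 = np.array([1, 1, 1]), np.array([0, 1, 1]), np.array([0, 0, 1])
vs, qs = gram_schmidt([u1, u2, u3])
print(vs)  # [1, 1, 1], [-2/3, 1/3, 1/3], [0, -1/2, 1/2]
print(qs)  # (1/√3, 1/√3, 1/√3), (-2/√6, 1/√6, 1/√6), (0, -1/√2, 1/√2)
```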

Theorem 6.3.7

 Theorem 6.3.7 (QR-Decomposition)  If A is an m×n matrix with linearly independent column vectors, then A can be factored as A = QR, where Q is an m×n matrix with orthonormal column vectors, and R is an n×n invertible upper triangular matrix.

 Remark  In recent years the QR-decomposition has assumed growing importance as the mathematical foundation for a wide variety of practical algorithms, including a widely used algorithm for computing eigenvalues of large matrices.

QR-Decomposition

 Suppose that the column vectors of A are u1, u2, …, un and the orthonormal column vectors of Q are q1, q2, …, qn; thus A = [u1 | u2 | … | un] and Q = [q1 | q2 | … | qn]

 From Theorem 6.3.1, the vectors u1, u2, …, un are expressible in terms of the vectors q1, q2, …, qn as
u1 = <u1, q1>q1 + <u1, q2>q2 + … + <u1, qn>qn
u2 = <u2, q1>q1 + <u2, q2>q2 + … + <u2, qn>qn
…
un = <un, q1>q1 + <un, q2>q2 + … + <un, qn>qn

QR-Decomposition

 Recalling from Section 1.3 that the jth column vector of a matrix product is a linear combination of the column vectors of the first factor with coefficients coming from the jth column of the second factor, these relations can be written in matrix form as

A = Q R

QR-Decomposition

 It is a property of the Gram-Schmidt process that for j ≥ 2, the vector qj is orthogonal to u1, u2, …, uj−1; thus all entries below the main diagonal of R are zero.

 The diagonal entries of R are nonzero, so R is invertible.

QR-Decomposition of a 3×3 Matrix

1 0 0    Find the QR-decomposition of A 1 1 0 1 1 1  Solution:

 The column vectors A are 1  0  0    u1 , u 1 , u   1/ 2 1 2 3       1  1 1/ 2   Applying the Gram-Schmidt process with subsequent normalization to these column vectors yields the orthonormal vectors 1/ 3  2 / 6  0        q11/3, q 2 1/6, q 3  1/2 Q       1/ 3  1/ 6  1/ 2 

QR-Decomposition of a 3×3 Matrix

 The matrix R is
R = [ <u1, q1>  <u2, q1>  <u3, q1> ]   [ 3/√3  2/√3  1/√3 ]
    [    0      <u2, q2>  <u3, q2> ] = [  0    2/√6  1/√6 ]
    [    0         0      <u3, q3> ]   [  0     0    1/√2 ]

 Thus, the QR-decomposition of A is

1 0 0 1/3 2/6 0 3/32/31/3       110 1/3 1/6 1/2 0 2/61/6  1 1 1      1/31/6 1/3 0 0 1/2  A Q R

Lecture 25: 6.4 Best Approximation & Least Squares

Wei-Ta Chu

Orthogonal Projections Viewed as Approximations

 If P is a point in 3-space and W is a plane through the origin, then the point Q in W closest to P is obtained by dropping a perpendicular from P to W.  Therefore, if we let u = OP, the distance between P and W is

given by || u –projWu ||.  In other words, among all vectors w in W the vector

w = projWu minimizes the distance || u − w ||.

Orthogonal Projections Viewed as Approximations

 View u as a fixed vector that we would like to approximate by a vector in W. Any such approximation w will result in an “error vector” u − w.  Unless u is in W, this error vector cannot be made equal to 0. However,

by choosing w = projWu, we can make the length of the error vector ||u –w|| = ||u –projWu|| as small as possible.

 Thus we can describe projWu as the “best approximation” to u by vectors in W.

Theorem 6.4.1

 Theorem 6.4.1 (Best Approximation Theorem)

 If W is a finite-dimensional subspace of an inner product

space V, and if u is a vector in V, then projWu is the best approximation to u from W in the sense that

|| u –projWu || < || u –w ||

for every vector w in W that is different from projWu.

Proof of Theorem 6.4.1

 For every vector w in W, we can write

u − w = (u − projWu) + (projWu − w)

 But projWu − w, being a difference of vectors in W, is in W; and u − projWu is orthogonal to W, so the two terms are orthogonal.  Thus, by the Theorem of Pythagoras,
||u − w||² = ||u − projWu||² + ||projWu − w||²

 If w ≠ projWu, then the second term in this sum will be positive, so ||u − w||² > ||u − projWu||², or, equivalently, ||u − w|| > ||u − projWu||

Least Squares

 Some physical problems lead to a linear system Ax = b that should be consistent on theoretical grounds but fails to be so, because “measurement errors” in the entries of A and b perturb the system enough to cause inconsistency.  We look for a value of x that comes “as close as possible” to being a solution, in the sense that it minimizes the value of ||Ax − b|| with respect to the Euclidean inner product.  The quantity ||Ax − b|| can be viewed as a measure of the error that results from regarding x as an approximate solution of the linear system Ax = b. The larger this value is, the more poorly x serves as an approximate solution of the system.

Least Squares Problem

 Least Squares Problem

 Given a linear system Ax = b of m equations in n unknowns, find a vector x, if possible, that minimizes || Ax − b || with respect to the Euclidean inner product on Rm. Such a vector is called a least squares solution of Ax = b.

Least Squares Problem

 To solve the least squares problem, let W be the column space of A. For each n × 1 matrix x, the product Ax is a linear combination of the column vectors of A.  Thus, as x varies over Rn, the vector Ax varies over all possible linear combinations of the column vectors of A; that is, Ax varies over the entire column space W.  Geometrically, solving the least squares problem amounts to finding a vector x in Rn such that Ax is the closest vector in W to b.

Least Squares Problem

 The closest vector in W to b is the orthogonal projection of b on W. Thus, for a vector x to be a least squares solution of Ax = b, this vector must satisfy

Ax = projWb  One can attempt to find least squares solutions by first

calculating the vector projWb and then solving the equation; however, there is a better approach.

 It follows that b − Ax = b − projWb is orthogonal to W. But W is the column space of A, so it follows from Theorem 6.2.6 that b − Ax lies in the nullspace of Aᵀ.

The nullspace of Aᵀ and the column space of A are orthogonal complements in Rm with respect to the Euclidean inner product.

Least Squares Problem

 Therefore, a least squares solution of Ax = b must satisfy Aᵀ(b − Ax) = 0, or, equivalently, AᵀAx = Aᵀb  This is called the normal system associated with Ax = b, and the individual equations are called the normal equations associated with Ax = b.  Thus the problem of finding a least squares solution of Ax = b has been reduced to the problem of finding an exact solution of the associated normal system.

Least Squares Problem

Aᵀ(b − Ax) = 0, or, equivalently, AᵀAx = Aᵀb  The normal system involves n equations in n unknowns.  The normal system is consistent, since it is satisfied by a least squares solution of Ax = b.  The normal system may have infinitely many solutions, in which case all of its solutions are least squares solutions of Ax = b.

Theorem 6.4.2

 Theorem 6.4.2

 For any linear system Ax = b, the associated normal system AᵀAx = Aᵀb is consistent, and all solutions of the normal system are least squares solutions of Ax = b. Moreover, if W is the column space of A, and x is any least squares solution of Ax = b, then the orthogonal projection of b on W is

projWb = Ax

(or you can treat it as Ax –projWb = 0 )

Theorems

 Theorem 6.4.3  If A is an m×n matrix, then the following are equivalent.  A has linearly independent column vectors.  AᵀA is invertible.

 Theorem 6.4.4  If A is an m×n matrix with linearly independent column vectors, then for every m×1 matrix b, the linear system Ax = b has a unique least squares solution. This solution is given by
x = (AᵀA)⁻¹Aᵀb
Moreover, if W is the column space of A, then the orthogonal projection of b on W is
projWb = Ax = A(AᵀA)⁻¹Aᵀb

Example (Least Squares Solution)

 Find the least squares solution of the linear system Ax = b given by

x1 – x2 = 4

3x1 + 2x2 = 1

−2x1 + 4x2 = 3
and find the orthogonal projection of b on the column space of A.
 Solution:
A = [  1  −1 ]        [ 4 ]
    [  3   2 ]    b = [ 1 ]
    [ −2   4 ]        [ 3 ]

 Observe that A has linearly independent column vectors, so we know in advance that there is a unique least squares solution.

Example (Least Squares Solution)

 We have
AᵀA = [  1  3  −2 ] [  1  −1 ]   [ 14  −3 ]
      [ −1  2   4 ] [  3   2 ] = [ −3  21 ]
                    [ −2   4 ]

Aᵀb = [  1  3  −2 ] [ 4 ]   [  1 ]
      [ −1  2   4 ] [ 1 ] = [ 10 ]
                    [ 3 ]

so the normal system AᵀAx = Aᵀb in this case is
[ 14  −3 ] [ x1 ]   [  1 ]
[ −3  21 ] [ x2 ] = [ 10 ]

 Solving this system yields the least squares solution

x1 = 17/95, x2 = 143/285
 The orthogonal projection of b on the column space of A is
Ax = [  1  −1 ] [  17/95  ]   [ −92/285 ]
     [  3   2 ] [ 143/285 ] = [ 439/285 ]
     [ −2   4 ]               [  94/57  ]
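Not from the slides: a quick NumPy check of this example, solving the normal system directly and comparing with the library's least squares routine.

```python
import numpy as np

A = np.array([[ 1., -1.],
              [ 3.,  2.],
              [-2.,  4.]])
b = np.array([4., 1., 3.])

# Least squares solution from the normal system  A^T A x = A^T b
x_normal = np.linalg.solve(A.T @ A, A.T @ b)       # -> [17/95, 143/285]

# Same answer from the built-in least squares solver
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x_normal, x_lstsq))              # True

# Orthogonal projection of b on the column space of A
print(A @ x_normal)                                # -> [-92/285, 439/285, 94/57]
```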

Example (Orthogonal Projection on a Subspace)

 Find the orthogonal projection of the vector u = (-3,-3,8,9) on the subspace of R4 spanned by the vectors

u1 = (3, 1, 0, 1), u2 = (1, 2, 1, 1), u3 = (−1, 0, 2, −1)
 Solution:

 The subspace spanned by u1, u2, and u3 is the column space of
A = [ 3  1  −1 ]
    [ 1  2   0 ]
    [ 0  1   2 ]
    [ 1  1  −1 ]
 If u is expressed as a column vector, we can find the orthogonal projection of u on W by finding a least squares solution of the system Ax = u and then calculating projWu = Ax from the least squares solution.

Example

 From Theorem 6.4.4, the least squares solution is given by x = (AᵀA)⁻¹Aᵀu

 That is,

1  3 1 1 3 3 1 0 1   3 1 0 1   1  1 2 0  3 x  1 2 1 1   1 2 1 1  2   0 1 2   8      1 0 2 1  1 0 2 1  1   1 1 1 9  T  Thus, projWu = Ax = [-2 3 4 0]  Second method: using Gram-Schmidt process
