Lecture 25: 6.3 Orthonormal Bases
Wei-Ta Chu
2008/12/24

Theorem 6.3.2
If S is an orthonormal basis for an n-dimensional inner product space, and if (u)_S = (u1, u2, …, un) and (v)_S = (v1, v2, …, vn), then:
$\|u\| = \sqrt{u_1^2 + u_2^2 + \cdots + u_n^2}$
$d(u, v) = \sqrt{(u_1 - v_1)^2 + (u_2 - v_2)^2 + \cdots + (u_n - v_n)^2}$
$\langle u, v \rangle = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n$
Remark By working with orthonormal bases, the computation of general norms and inner products can be reduced to the computation of Euclidean norms and inner products of the coordinate vectors.
Example
If R3 has the Euclidean inner product, then the norm of the vector u = (1, 1, 1) is
$\|u\| = \sqrt{1^2 + 1^2 + 1^2} = \sqrt{3}$
However, if we let R3 have the orthonormal basis S of the last example, then we know that the coordinate vector of u relative to S is (u)_S = (1, -1/5, 7/5). By Theorem 6.3.2, the norm of u computed from this coordinate vector is
$\|u\| = \sqrt{1^2 + (-1/5)^2 + (7/5)^2} = \sqrt{75/25} = \sqrt{3}$
the same value.
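A quick numerical check of this computation (a sketch in Python/NumPy; the orthonormal basis below is one possible basis consistent with the stated coordinate vector, assumed for illustration rather than taken from the earlier example):

import numpy as np

# An assumed orthonormal basis of R^3 (Euclidean inner product) whose
# coordinate vector for u = (1, 1, 1) is (1, -1/5, 7/5); rows are the basis vectors.
S = np.array([[ 0.0,  0.0, 1.0],
              [-0.8,  0.6, 0.0],
              [ 0.6,  0.8, 0.0]])
u = np.array([1.0, 1.0, 1.0])

coords = S @ u                        # (<u, v1>, <u, v2>, <u, v3>) = (u)_S
print(coords)                         # [ 1.  -0.2  1.4]
print(np.linalg.norm(u))              # 1.7320... = sqrt(3), from the Euclidean components
print(np.linalg.norm(coords))         # 1.7320... = sqrt(3), from (u)_S (Theorem 6.3.2)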
Coordinates Relative to Orthogonal Bases
If S = {v1, v2, …, vn} is an orthogonal basis for a vector space V, then normalizing each of these vectors yields the orthonormal basis
$S' = \left\{ \frac{v_1}{\|v_1\|}, \frac{v_2}{\|v_2\|}, \ldots, \frac{v_n}{\|v_n\|} \right\}$
Thus, if u is any vector in V, it follows from Theorem 6.3.1 that
$u = \left\langle u, \frac{v_1}{\|v_1\|} \right\rangle \frac{v_1}{\|v_1\|} + \left\langle u, \frac{v_2}{\|v_2\|} \right\rangle \frac{v_2}{\|v_2\|} + \cdots + \left\langle u, \frac{v_n}{\|v_n\|} \right\rangle \frac{v_n}{\|v_n\|}$
or
$u = \frac{\langle u, v_1 \rangle}{\|v_1\|^2} v_1 + \frac{\langle u, v_2 \rangle}{\|v_2\|^2} v_2 + \cdots + \frac{\langle u, v_n \rangle}{\|v_n\|^2} v_n$
The above equation expresses u as a linear combination of the vectors in the orthogonal basis S.
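As a small illustration, this expansion can be checked numerically. A minimal sketch with an assumed orthogonal (but not orthonormal) basis of R3:

import numpy as np

# Assumed orthogonal basis of R^3 and an arbitrary vector u.
v1, v2, v3 = np.array([1.0, 1.0, 0.0]), np.array([1.0, -1.0, 0.0]), np.array([0.0, 0.0, 2.0])
u = np.array([3.0, 1.0, 4.0])

# u = sum_i (<u, v_i> / ||v_i||^2) v_i
reconstructed = sum((u @ v) / (v @ v) * v for v in (v1, v2, v3))
print(reconstructed)   # [3. 1. 4.] -- recovers u exactly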
Theorem 6.3.3
If S = {v1, v2, …, vn} is an orthogonal set of nonzero vectors in an inner product space, then S is linearly independent.
Proof of Theorem 6.3.3
Assume that k1v1 + k2v2 + … + knvn = 0. To demonstrate that S is linearly independent, we must prove that k1 = k2 = … = kn = 0.
For each vi in S, $\langle k_1 v_1 + k_2 v_2 + \cdots + k_n v_n, v_i \rangle = \langle 0, v_i \rangle = 0$, or, equivalently, $k_1 \langle v_1, v_i \rangle + k_2 \langle v_2, v_i \rangle + \cdots + k_n \langle v_n, v_i \rangle = 0$
From the orthogonality of S it follows that $\langle v_j, v_i \rangle = 0$ when j is not equal to i, so the equation reduces to $k_i \langle v_i, v_i \rangle = 0$
Since the vectors in S are assumed to be nonzero, $\langle v_i, v_i \rangle \neq 0$. Therefore, ki = 0. Since the subscript i is arbitrary, we have k1 = k2 = … = kn = 0.
Theorem 6.3.4 (Projection Theorem)
If W is a finite-dimensional subspace of an inner product space V, then every vector u in V can be expressed in exactly one way as
u = w1 + w2
where w1 is in W and w2 is in W⊥.
Projection
The vector w1 is called the orthogonal projection of u on W and is denoted by projWu.
The vector w2 is called the component of u orthogonal to W and is denoted by projW⊥u.
u = projWu + projW⊥u
Since w2 = u − w1, it follows that projW⊥u = u − projWu
So we can write u = projWu + (u − projWu)
Theorem 6.3.5 Let W be a finite-dimensional subspace of an inner product space V.
(a) If {v1, …, vr} is an orthonormal basis for W, and u is any vector in V, then
projWu = ⟨u, v1⟩v1 + ⟨u, v2⟩v2 + … + ⟨u, vr⟩vr
(b) If {v1, …, vr} is an orthogonal basis for W, and u is any vector in V, then
$\mathrm{proj}_W u = \frac{\langle u, v_1 \rangle}{\|v_1\|^2} v_1 + \frac{\langle u, v_2 \rangle}{\|v_2\|^2} v_2 + \cdots + \frac{\langle u, v_r \rangle}{\|v_r\|^2} v_r$
(normalization by the ‖vi‖² factors is needed here because the basis is only orthogonal, not orthonormal)
Example
Let R3 have the Euclidean inner product, and let W be the subspace spanned by the orthonormal vectors v1 = (0, 1, 0) and v2 = (−4/5, 0, 3/5). From the above theorem, the orthogonal projection of u = (1, 1, 1) on W is
projWu = ⟨u, v1⟩v1 + ⟨u, v2⟩v2 = (1)(0, 1, 0) + (−1/5)(−4/5, 0, 3/5) = (4/25, 1, −3/25)
The component of u orthogonal to W is
projW⊥u = u − projWu = (1, 1, 1) − (4/25, 1, −3/25) = (21/25, 0, 28/25)
Observe that u − projWu is orthogonal to both v1 and v2, as it should be.
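A short numerical check of this example (a sketch in NumPy, using Theorem 6.3.5(a)):

import numpy as np

v1 = np.array([0.0, 1.0, 0.0])          # orthonormal basis of W
v2 = np.array([-0.8, 0.0, 0.6])
u = np.array([1.0, 1.0, 1.0])

proj_W_u = (u @ v1) * v1 + (u @ v2) * v2   # <u,v1>v1 + <u,v2>v2
comp = u - proj_W_u                        # component of u orthogonal to W
print(proj_W_u)              # [ 0.16  1.   -0.12] = (4/25, 1, -3/25)
print(comp)                  # [ 0.84  0.    1.12] = (21/25, 0, 28/25)
print(comp @ v1, comp @ v2)  # both zero (up to roundoff)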
Finding Orthogonal/Orthonormal Bases
Theorem 6.3.6
Every nonzero finite-dimensional inner product space has an orthonormal basis.
Remark
The step-by-step construction for converting an arbitrary basis into an orthogonal basis is called the Gram-Schmidt process.
Proof of Theorem 6.3.6
Let V be a nonzero finite-dimensional inner product
space, and suppose that {u1, u2, …, un} is any basis for V. It suffices to show that V has an orthogonal basis, since the vectors in the orthogonal basis can be normalized to produce an orthonormal basis for V. The following sequence of steps will produce an
orthogonal basis {v1, v2, …, vn} for V.
Step 1: Let v1 = u1.
Proof of Theorem 6.3.6
Step 2: We can obtain a vector v2 that is orthogonal to v1 by computing the component of u2 that is orthogonal to the space W1 spanned by v1:
$v_2 = u_2 - \mathrm{proj}_{W_1} u_2 = u_2 - \frac{\langle u_2, v_1 \rangle}{\|v_1\|^2} v_1$
Of course, if v2 = 0, then v2 could not serve as a basis vector. But this cannot happen: if v2 = 0, then $u_2 = \frac{\langle u_2, v_1 \rangle}{\|v_1\|^2} v_1 = \frac{\langle u_2, u_1 \rangle}{\|u_1\|^2} u_1$,
which says that u2 is a scalar multiple of u1, contradicting the linear independence of the basis S = {u1, u2, …, un}.
Proof of Theorem 6.3.6
Step 3: To construct a vector v3 that is orthogonal to both v1 and v2, we compute the component of u3 orthogonal to the space W2 spanned by v1 and v2. From Theorem 6.3.5(b):
$v_3 = u_3 - \mathrm{proj}_{W_2} u_3 = u_3 - \frac{\langle u_3, v_1 \rangle}{\|v_1\|^2} v_1 - \frac{\langle u_3, v_2 \rangle}{\|v_2\|^2} v_2$
As in Step 2, the linear independence of {u1, u2, …, un} ensures that v3 ≠ 0.
Proof of Theorem 6.3.6
Step 4: To construct a vector v4 that is orthogonal to v1, v2, and v3, we compute the component of u4 orthogonal to the space W3 spanned by v1, v2, and v3:
$v_4 = u_4 - \mathrm{proj}_{W_3} u_4 = u_4 - \frac{\langle u_4, v_1 \rangle}{\|v_1\|^2} v_1 - \frac{\langle u_4, v_2 \rangle}{\|v_2\|^2} v_2 - \frac{\langle u_4, v_3 \rangle}{\|v_3\|^2} v_3$
Continuing in this way, we will obtain, after n steps, an orthogonal set of nonzero vectors {v1, v2, …, vn}. Since V is n-dimensional and every orthogonal set of nonzero vectors is linearly independent (Theorem 6.3.3), the set {v1, v2, …, vn} is an orthogonal basis for V.
Example (Gram-Schmidt Process)
Consider the vector space R3 with the Euclidean inner product. Apply the Gram-Schmidt process to transform the basis vectors
u1 = (1, 1, 1), u2 = (0, 1, 1), u3 = (0, 0, 1)
into an orthogonal basis {v1, v2, v3}; then normalize the orthogonal basis vectors to obtain an orthonormal basis {q1, q2, q3}.
Solution:
Step 1: Let v1 = u1. That is, v1 = u1 = (1, 1, 1)
Step 2: Let v2 = u2 − projW1u2. That is,
$v_2 = u_2 - \mathrm{proj}_{W_1} u_2 = u_2 - \frac{\langle u_2, v_1 \rangle}{\|v_1\|^2} v_1 = (0, 1, 1) - \frac{2}{3}(1, 1, 1) = \left(-\frac{2}{3}, \frac{1}{3}, \frac{1}{3}\right)$
We have two vectors in W2 now!
Step 3: Let v3 = u3 − projW2u3. That is,
$v_3 = u_3 - \mathrm{proj}_{W_2} u_3 = u_3 - \frac{\langle u_3, v_1 \rangle}{\|v_1\|^2} v_1 - \frac{\langle u_3, v_2 \rangle}{\|v_2\|^2} v_2 = (0, 0, 1) - \frac{1}{3}(1, 1, 1) - \frac{1/3}{2/3}\left(-\frac{2}{3}, \frac{1}{3}, \frac{1}{3}\right) = \left(0, -\frac{1}{2}, \frac{1}{2}\right)$
Thus, v1 = (1, 1, 1), v2 = (−2/3, 1/3, 1/3), v3 = (0, −1/2, 1/2) form an orthogonal basis for R3. The norms of these vectors are
$\|v_1\| = \sqrt{3}, \quad \|v_2\| = \frac{\sqrt{6}}{3}, \quad \|v_3\| = \frac{1}{\sqrt{2}}$
so an orthonormal basis for R3 is
$q_1 = \frac{v_1}{\|v_1\|} = \left(\frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}\right), \quad q_2 = \frac{v_2}{\|v_2\|} = \left(-\frac{2}{\sqrt{6}}, \frac{1}{\sqrt{6}}, \frac{1}{\sqrt{6}}\right), \quad q_3 = \frac{v_3}{\|v_3\|} = \left(0, -\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right)$
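The steps above translate directly into code. A minimal sketch of the (classical) Gram-Schmidt process, applied to the example's vectors:

import numpy as np

def gram_schmidt(vectors):
    # Orthogonalize a list of linearly independent vectors, as in the proof of Theorem 6.3.6.
    orthogonal = []
    for u in vectors:
        v = np.array(u, dtype=float)
        for w in orthogonal:
            v = v - (np.dot(u, w) / np.dot(w, w)) * w   # subtract the component of u along w
        orthogonal.append(v)
    return orthogonal

u1, u2, u3 = [1, 1, 1], [0, 1, 1], [0, 0, 1]
v1, v2, v3 = gram_schmidt([u1, u2, u3])
print(v1, v2, v3)   # [1. 1. 1.], [-0.667 0.333 0.333], [0. -0.5 0.5] (approximately)
q1, q2, q3 = (v / np.linalg.norm(v) for v in (v1, v2, v3))
print(q1, q2, q3)   # the orthonormal vectors q1, q2, q3 of the example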
Theorem 6.3.7 (QR-Decomposition)
If A is an m×n matrix with linearly independent column vectors, then A can be factored as
A = QR
where Q is an m×n matrix with orthonormal column vectors, and R is an n×n invertible upper triangular matrix.
Remark In recent years the QR-decomposition has assumed growing importance as the mathematical foundation for a wide variety of practical algorithms, including a widely used algorithm for computing eigenvalues of large matrices.
QR-Decomposition
Suppose that the column vectors of A are u1, u2, …, un and the orthonormal column vectors of Q are q1, q2, …, qn; thus A = [u1 | u2 | … | un] and Q = [q1 | q2 | … | qn]
From Theorem 6.3.1, the vectors u1, u2, …, un are expressible in terms of the vectors q1, q2, …, qn as
$u_j = \langle u_j, q_1 \rangle q_1 + \langle u_j, q_2 \rangle q_2 + \cdots + \langle u_j, q_n \rangle q_n, \qquad j = 1, 2, \ldots, n$
QR-Decomposition
Recall from Section 1.3 that the jth column vector of a matrix product is a linear combination of the column vectors of the first factor with coefficients coming from the jth column of the second factor. These relations can therefore be written in matrix form as A = QR, where
$R = \begin{bmatrix} \langle u_1, q_1 \rangle & \langle u_2, q_1 \rangle & \cdots & \langle u_n, q_1 \rangle \\ \langle u_1, q_2 \rangle & \langle u_2, q_2 \rangle & \cdots & \langle u_n, q_2 \rangle \\ \vdots & \vdots & & \vdots \\ \langle u_1, q_n \rangle & \langle u_2, q_n \rangle & \cdots & \langle u_n, q_n \rangle \end{bmatrix}$
QR-Decomposition
It is a property of the Gram-Schmidt process that for j ≥ 2, the vector qj is orthogonal to u1, u2, …, uj−1; thus all entries below the main diagonal of R are zero.
The diagonal entries of R are nonzero, so R is invertible.
QR-Decomposition of a 3×3 Matrix
Find the QR-decomposition of
$A = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix}$
Solution:
The column vectors of A are
$u_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \quad u_2 = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}, \quad u_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$
Applying the Gram-Schmidt process with subsequent normalization to these column vectors yields the orthonormal vectors (see the earlier example)
$q_1 = \begin{bmatrix} 1/\sqrt{3} \\ 1/\sqrt{3} \\ 1/\sqrt{3} \end{bmatrix}, \quad q_2 = \begin{bmatrix} -2/\sqrt{6} \\ 1/\sqrt{6} \\ 1/\sqrt{6} \end{bmatrix}, \quad q_3 = \begin{bmatrix} 0 \\ -1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix}, \qquad Q = \begin{bmatrix} 1/\sqrt{3} & -2/\sqrt{6} & 0 \\ 1/\sqrt{3} & 1/\sqrt{6} & -1/\sqrt{2} \\ 1/\sqrt{3} & 1/\sqrt{6} & 1/\sqrt{2} \end{bmatrix}$
QR-Decomposition of a 3×3 Matrix
The matrix R is
$R = \begin{bmatrix} \langle u_1, q_1 \rangle & \langle u_2, q_1 \rangle & \langle u_3, q_1 \rangle \\ 0 & \langle u_2, q_2 \rangle & \langle u_3, q_2 \rangle \\ 0 & 0 & \langle u_3, q_3 \rangle \end{bmatrix} = \begin{bmatrix} 3/\sqrt{3} & 2/\sqrt{3} & 1/\sqrt{3} \\ 0 & 2/\sqrt{6} & 1/\sqrt{6} \\ 0 & 0 & 1/\sqrt{2} \end{bmatrix}$
Thus, the QR-decomposition of A is
$\underbrace{\begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix}}_{A} = \underbrace{\begin{bmatrix} 1/\sqrt{3} & -2/\sqrt{6} & 0 \\ 1/\sqrt{3} & 1/\sqrt{6} & -1/\sqrt{2} \\ 1/\sqrt{3} & 1/\sqrt{6} & 1/\sqrt{2} \end{bmatrix}}_{Q} \underbrace{\begin{bmatrix} 3/\sqrt{3} & 2/\sqrt{3} & 1/\sqrt{3} \\ 0 & 2/\sqrt{6} & 1/\sqrt{6} \\ 0 & 0 & 1/\sqrt{2} \end{bmatrix}}_{R}$
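The factorization can also be checked numerically. A minimal sketch; note that np.linalg.qr may return Q and R with some columns of Q and rows of R negated, which is still a valid QR-decomposition, and QR equals A either way:

import numpy as np

A = np.array([[1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0]])
Q, R = np.linalg.qr(A)
print(Q)
print(R)                                  # upper triangular
print(np.allclose(Q @ R, A))              # True
print(np.allclose(Q.T @ Q, np.eye(3)))    # True: the columns of Q are orthonormal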
Lecture 25: 6.4 Best Approximation & Least Squares
Wei-Ta Chu
2008/12/24

Orthogonal Projections Viewed as Approximations
If P is a point in 3-space and W is a plane through the origin, then the point Q in W closest to P is obtained by dropping a perpendicular from P to W. Therefore, if we let u = OP (the vector from the origin O to P), the distance between P and W is given by || u − projWu ||. In other words, among all vectors w in W, the vector w = projWu minimizes the distance || u − w ||.
Orthogonal Projections Viewed as Approximations
View u as a fixed vector that we would like to approximate by a vector in W. Any such approximation w will result in an “error vector,” u –w. Unless u is in W, it cannot be made equal to 0. However,
by choosing w = projWu, we can make the length of the error vector ||u –w|| = ||u –projWu|| as small as possible.
Thus we can describe projWu as the “best approximation” to u by vectors in W.
Theorem 6.4.1 (Best Approximation Theorem)
If W is a finite-dimensional subspace of an inner product
space V, and if u is a vector in V, then projWu is the best approximation to u from W in the sense that
|| u –projWu || < || u –w ||
for every vector w in W that is different from projWu.
Proof of Theorem 6.4.1
For every vector w in W, we can write
u –w = (u - projWu) + (projWu - w)
But projWu − w, being a difference of vectors in W, is in W; and u − projWu is orthogonal to W, so the two terms are orthogonal. Thus, by the Theorem of Pythagoras,
$\|u - w\|^2 = \|u - \mathrm{proj}_W u\|^2 + \|\mathrm{proj}_W u - w\|^2$
If w ≠ projWu, then the second term in this sum will be positive, so
$\|u - w\|^2 > \|u - \mathrm{proj}_W u\|^2$, or, equivalently, $\|u - w\| > \|u - \mathrm{proj}_W u\|$
Least Squares
Some physical problems lead to a linear system Ax = b that should be consistent on theoretical grounds but fails to be so because "measurement errors" in the entries of A and b perturb the system enough to cause inconsistency. In such cases we look for a value of x that comes "as close as possible" to being a solution, in the sense that it minimizes the value of ||Ax − b|| with respect to the Euclidean inner product. The quantity ||Ax − b|| can be viewed as a measure of the error that results from regarding x as an approximate solution of the linear system Ax = b. The larger this value is, the more poorly x serves as an approximate solution of the system.
Least Squares Problem
Given a linear system Ax = b of m equations in n unknowns, find a vector x, if possible, that minimizes || Ax − b || with respect to the Euclidean inner product on Rm. Such a vector is called a least squares solution of Ax = b.
Least Squares Problem
To solve the least squares problem, let W be the column space of A. For each n × 1 matrix x, the product Ax is a linear combination of the column vectors of A. Thus, as x varies over Rn, the vector Ax varies over all possible linear combinations of the column vectors of A; that is, Ax varies over the entire column space W. Geometrically, solving the least squares problem amounts to finding a vector x in Rn such that Ax is the closest vector in W to b.
Least Squares Problem
The closest vector in W to b is the orthogonal projection of b on W. Thus, for a vector x to be a least squares solution of Ax = b, this vector must satisfy
Ax = projWb
One can attempt to find least squares solutions by first calculating the vector projWb and then solving this equation; however, there is a better approach.
It follows that b − Ax = b − projWb is orthogonal to W. But W is the column space of A, so it follows from Theorem 6.2.6 that b − Ax lies in the nullspace of AT.
(The nullspace of AT and the column space of A are orthogonal complements in Rm with respect to the Euclidean inner product.)
Least Squares Problem
Therefore, a least squares solution of Ax = b must satisfy AT(b − Ax) = 0, or, equivalently, ATAx = ATb. This is called the normal system associated with Ax = b, and the individual equations are called the normal equations associated with Ax = b. Thus the problem of finding a least squares solution of Ax = b has been reduced to the problem of finding an exact solution of the associated normal system.
Least Squares Problem
The normal system ATAx = ATb involves n equations in n unknowns. The normal system is consistent, since it is satisfied by a least squares solution of Ax = b. The normal system may have infinitely many solutions, in which case all of its solutions are least squares solutions of Ax = b.
Theorem 6.4.2
For any linear system Ax = b, the associated normal system ATAx = ATb is consistent, and all solutions of the normal system are least squares solutions of Ax = b. Moreover, if W is the column space of A, and x is any least squares solution of Ax = b, then the orthogonal projection of b on W is
projWb = Ax
(or you can treat it as Ax –projWb = 0 )
Theorems
Theorem 6.4.3
If A is an m×n matrix, then the following are equivalent.
(a) A has linearly independent column vectors.
(b) ATA is invertible.
Theorem 6.4.4
If A is an m×n matrix with linearly independent column vectors, then for every m×1 matrix b, the linear system Ax = b has a unique least squares solution. This solution is given by
x = (ATA)-1ATb
Moreover, if W is the column space of A, then the orthogonal projection of b on W is
projWb = Ax = A(ATA)-1ATb
Example (Least Squares Solution)
Find the least squares solution of the linear system Ax = b given by
x1 – x2 = 4
3x1 + 2x2 = 1
-2x1 + 4x2 = 3
and find the orthogonal projection of b on the column space of A.
Solution:
$A = \begin{bmatrix} 1 & -1 \\ 3 & 2 \\ -2 & 4 \end{bmatrix}, \qquad b = \begin{bmatrix} 4 \\ 1 \\ 3 \end{bmatrix}$
Observe that A has linearly independent column vectors, so we know in advance that there is a unique least squares solution.
Example (Least Squares Solution)
We have
$A^T A = \begin{bmatrix} 1 & 3 & -2 \\ -1 & 2 & 4 \end{bmatrix} \begin{bmatrix} 1 & -1 \\ 3 & 2 \\ -2 & 4 \end{bmatrix} = \begin{bmatrix} 14 & -3 \\ -3 & 21 \end{bmatrix}, \qquad A^T b = \begin{bmatrix} 1 & 3 & -2 \\ -1 & 2 & 4 \end{bmatrix} \begin{bmatrix} 4 \\ 1 \\ 3 \end{bmatrix} = \begin{bmatrix} 1 \\ 10 \end{bmatrix}$
so the normal system ATAx = ATb in this case is
$\begin{bmatrix} 14 & -3 \\ -3 & 21 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 10 \end{bmatrix}$
Solving this system yields the least squares solution
x1 = 17/95, x2 = 143/285
The orthogonal projection of b on the column space of A is
$Ax = \begin{bmatrix} 1 & -1 \\ 3 & 2 \\ -2 & 4 \end{bmatrix} \begin{bmatrix} 17/95 \\ 143/285 \end{bmatrix} = \begin{bmatrix} -92/285 \\ 439/285 \\ 94/57 \end{bmatrix}$
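The same computation in NumPy (a quick check of the values above):

import numpy as np

A = np.array([[ 1.0, -1.0],
              [ 3.0,  2.0],
              [-2.0,  4.0]])
b = np.array([4.0, 1.0, 3.0])

x = np.linalg.solve(A.T @ A, A.T @ b)         # solve the normal system A^T A x = A^T b
print(x)                                      # [0.17894... 0.50175...] = (17/95, 143/285)
print(A @ x)                                  # (-92/285, 439/285, 94/57), the projection of b on col(A)
print(np.linalg.lstsq(A, b, rcond=None)[0])   # same least squares solution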
Example (Orthogonal Projection on a Subspace)
Find the orthogonal projection of the vector u = (-3,-3,8,9) on the subspace of R4 spanned by the vectors
u1 = (3,1,0,1), u2 = (1,2,1,1), u3 = (-1,0,2,-1) Solution:
The subspace W spanned by u1, u2, and u3 is the column space of
$A = \begin{bmatrix} 3 & 1 & -1 \\ 1 & 2 & 0 \\ 0 & 1 & 2 \\ 1 & 1 & -1 \end{bmatrix}$
If u is expressed as a column vector, we can find the orthogonal projection of u on W by finding a least squares solution of the system Ax = u and then calculating projWu = Ax from the least squares solution.
Example
From Theorem 6.4.4, the least squares solution is given by x = (ATA)-1ATu
That is,
$x = (A^T A)^{-1} A^T u = \left( \begin{bmatrix} 3 & 1 & 0 & 1 \\ 1 & 2 & 1 & 1 \\ -1 & 0 & 2 & -1 \end{bmatrix} \begin{bmatrix} 3 & 1 & -1 \\ 1 & 2 & 0 \\ 0 & 1 & 2 \\ 1 & 1 & -1 \end{bmatrix} \right)^{-1} \begin{bmatrix} 3 & 1 & 0 & 1 \\ 1 & 2 & 1 & 1 \\ -1 & 0 & 2 & -1 \end{bmatrix} \begin{bmatrix} -3 \\ -3 \\ 8 \\ 9 \end{bmatrix}$
Thus, projWu = Ax = [-2 3 4 0]T
A second method: apply the Gram-Schmidt process to {u1, u2, u3} to obtain an orthogonal basis for W, and then use Theorem 6.3.5.
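A minimal numerical check of this example:

import numpy as np

# Columns of A span W; the projection of u on W is A x, where x solves the normal system.
A = np.array([[ 3.0, 1.0, -1.0],
              [ 1.0, 2.0,  0.0],
              [ 0.0, 1.0,  2.0],
              [ 1.0, 1.0, -1.0]])
u = np.array([-3.0, -3.0, 8.0, 9.0])

x = np.linalg.solve(A.T @ A, A.T @ u)
print(x)        # [-1.  2.  1.]
print(A @ x)    # [-2.  3.  4.  0.], the orthogonal projection of u on W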