Notes for Chapter 5 – Orthogonality

5.1 The Scalar Product in $\mathbb{R}^n$

The vector space $\mathbb{R}^n = \{x = (x_1, x_2, \dots, x_n) : x_j \in \mathbb{R} \text{ for } j = 1, \dots, n\}$. Here we define vector addition and scalar multiplication as follows.

1. Given vectors $x = (x_1, x_2, \dots, x_n)$ and $y = (y_1, y_2, \dots, y_n)$ we define $x + y$ by
$$x + y = (x_1 + y_1, \, x_2 + y_2, \, \dots, \, x_n + y_n).$$

2. Given a scalar $\alpha$ and a vector $x = (x_1, x_2, \dots, x_n)$ we define $\alpha x$ by
$$\alpha x = (\alpha x_1, \, \alpha x_2, \, \dots, \, \alpha x_n).$$

The scalar product (or dot product) of $x, y \in \mathbb{R}^n$ is defined by
$$x \cdot y = x^T y = \sum_{j=1}^n x_j y_j. \qquad (1)$$

We use the scalar product to define a distance function called the norm.

Definition 1.

1. The length (or norm) of a vector $x \in \mathbb{R}^n$ is defined by
$$\|x\| = (x^T x)^{1/2} = \left( \sum_{j=1}^n x_j^2 \right)^{1/2}. \qquad (2)$$

2. The distance between two vectors $x, y \in \mathbb{R}^n$ is
$$\|x - y\| = \left( \sum_{j=1}^n (x_j - y_j)^2 \right)^{1/2}. \qquad (3)$$

3. If $\theta$ is the angle between two vectors $x, y \in \mathbb{R}^n$, then
$$\cos(\theta) = \frac{x^T y}{\|x\| \, \|y\|}. \qquad (4)$$

4. A vector $u$ is called a unit vector if $\|u\| = 1$. For any nonzero vector $x \in \mathbb{R}^n$, the direction vector pointing in the direction of $x$ is the unit vector $u = x / \|x\|$.

5. Two vectors $x, y \in \mathbb{R}^n$ are said to be orthogonal if $x^T y = 0$, i.e., if the angle between them is $90^\circ$.

6. The scalar projection of a vector $x$ onto a vector $y$ is $\dfrac{x^T y}{\|y\|}$.

7. The vector projection of a vector $x$ onto a vector $y$ is $P_y(x) = \dfrac{x^T y}{\|y\|^2} \, y$.

8. If the vectors $x$ and $y$ are orthogonal, then $\|x + y\|^2 = \|x\|^2 + \|y\|^2$.

9. In general, for all vectors $x$ and $y$ we have $\|x + y\| \le \|x\| + \|y\|$, which is called the triangle inequality.

10. (Cauchy-Schwarz inequality) For all vectors $x$ and $y$ we have
$$|x^T y| \le \|x\| \, \|y\|. \qquad (5)$$

If $P_1 = (x_1, x_2, \dots, x_n)$ and $P_2 = (y_1, y_2, \dots, y_n)$ are points in $\mathbb{R}^n$, then the vector from $P_1$ to $P_2$ is denoted by $\overrightarrow{P_1 P_2}$. This vector can be written as
$$\overrightarrow{P_1 P_2} = \langle y_1 - x_1, \, y_2 - x_2, \, \dots, \, y_n - x_n \rangle.$$

If $N$ is a nonzero vector and $P_0$ is a point in $\mathbb{R}^n$, then the set of points $P = (x_1, x_2, \dots, x_n) \in \mathbb{R}^n$ satisfying $N^T \overrightarrow{P_0 P} = 0$ forms a hyperplane in $\mathbb{R}^n$ that passes through the point $P_0$ with normal vector $N$. For example, if $N = \langle a_1, a_2, \dots, a_n \rangle$ and $P_0 = (x_1^0, x_2^0, \dots, x_n^0)$, then the equation of the hyperplane is
$$N^T \overrightarrow{P_0 P} = a_1(x_1 - x_1^0) + a_2(x_2 - x_2^0) + \cdots + a_n(x_n - x_n^0) = 0.$$

In $\mathbb{R}^3$ a hyperplane is called a plane, and the formula above is the one you learned in Calculus III for the equation of a plane. In $\mathbb{R}^2$ a hyperplane is called a line, and the formula gives the point-normal equation of a line.

We can use this information to answer some simple basic problems. For example, to find the distance $d$ from a point $P_1 = (x_1, y_1, z_1)$ in $\mathbb{R}^3$ to the plane $N^T \overrightarrow{P_0 P} = 0$, we can use the scalar projection of $\overrightarrow{P_0 P_1}$ onto $N$:
$$d = \frac{|N^T \overrightarrow{P_0 P_1}|}{\|N\|} = \frac{|a_1(x_1 - x_0) + a_2(y_1 - y_0) + a_3(z_1 - z_0)|}{\sqrt{a_1^2 + a_2^2 + a_3^2}}.$$

The analog of this in $\mathbb{R}^2$ is to find the distance $d$ from a point to a line. In this case the line may be written as follows: the line passing through a point $P_0 = (x_0, y_0)$ and orthogonal to a vector $N = \langle a_1, a_2 \rangle$ is given by
$$N^T \overrightarrow{P_0 P} = a_1(x - x_0) + a_2(y - y_0) = 0.$$
The distance from a point $P_1 = (x_1, y_1)$ in $\mathbb{R}^2$ to the line $N^T \overrightarrow{P_0 P} = 0$ is
$$d = \frac{|N^T \overrightarrow{P_0 P_1}|}{\|N\|} = \frac{|a_1(x_1 - x_0) + a_2(y_1 - y_0)|}{\sqrt{a_1^2 + a_2^2}}.$$
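To make the projection and distance formulas above concrete, here is a minimal NumPy sketch (not part of the original notes; the function names are my own). It implements the scalar projection, the vector projection, and the point-to-hyperplane distance $d = |N^T \overrightarrow{P_0 P_1}| / \|N\|$.

```python
import numpy as np

def scalar_projection(x, y):
    """Scalar projection of x onto y: x^T y / ||y||."""
    return x @ y / np.linalg.norm(y)

def vector_projection(x, y):
    """Vector projection of x onto y: (x^T y / ||y||^2) y."""
    return (x @ y) / (y @ y) * y

def distance_to_hyperplane(p1, p0, normal):
    """Distance from point p1 to the hyperplane through p0 with
    normal vector N: |N^T (p1 - p0)| / ||N||."""
    return abs(normal @ (p1 - p0)) / np.linalg.norm(normal)

# Example: distance from P1 = (1, 2, 3) to the plane x + y + z = 0,
# i.e. the plane through P0 = (0, 0, 0) with normal N = <1, 1, 1>.
P0 = np.array([0.0, 0.0, 0.0])
P1 = np.array([1.0, 2.0, 3.0])
N  = np.array([1.0, 1.0, 1.0])
print(distance_to_hyperplane(P1, P0, N))   # 6/sqrt(3) = 2*sqrt(3) ~ 3.4641

x = np.array([3.0, 4.0])
y = np.array([1.0, 0.0])
print(scalar_projection(x, y))   # 3.0
print(vector_projection(x, y))   # [3. 0.]
```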
5.2 Orthogonal Subspaces

Definition 2.

1. Two subspaces $X$ and $Y$ of $\mathbb{R}^n$ are called orthogonal if $x^T y = 0$ for every $x \in X$ and every $y \in Y$. In this case we write $X \perp Y$.

2. If $Y$ is a subspace of $\mathbb{R}^n$, then the orthogonal complement of $Y$, denoted $Y^\perp$, is defined by
$$Y^\perp = \{x \in \mathbb{R}^n : x^T y = 0 \ \forall\, y \in Y\}. \qquad (6)$$

Example 1. Let $A$ be an $m \times n$ matrix. Then $A$ generates a linear mapping from $\mathbb{R}^n$ to $\mathbb{R}^m$ by $T(x) = Ax$. The column space of $A$ is the same as the image of $T$. We denote this range space by $R(A)$:
$$R(A) = \{b \in \mathbb{R}^m : b = Ax \text{ for some } x \in \mathbb{R}^n\}. \qquad (7)$$
Similarly, the column space of $A^T$, i.e., $R(A^T) \subset \mathbb{R}^n$, is
$$R(A^T) = \{y \in \mathbb{R}^n : y = A^T x \text{ for some } x \in \mathbb{R}^m\}. \qquad (8)$$

If $y \in R(A^T) \subset \mathbb{R}^n$ and $x \in N(A)$, then we claim $x$ and $y$ are orthogonal, i.e., $R(A^T) \perp N(A)$. Recall that $y \in R(A^T)$ means there is a vector $x_0 \in \mathbb{R}^m$ such that $y = A^T x_0$. So we have
$$x^T y = x^T A^T x_0 = (Ax)^T x_0 = 0$$
since $x \in N(A)$. This gives us the so-called Fundamental Subspace Theorem:

Theorem 1. Let $A$ be an $m \times n$ matrix. Then we have
$$N(A) = R(A^T)^\perp \quad \text{and} \quad N(A^T) = R(A)^\perp. \qquad (9)$$

Definition 3. Suppose $U$ and $V$ are subspaces of a vector space $W$ and each $w \in W$ can be written uniquely in the form $w = u + v$ with $u \in U$ and $v \in V$. Notice that this is really two statements: (1) every $w$ can be written in this form, and (2) the representation is unique. When this is the case we say that $W$ is the direct sum of $U$ and $V$, and we write $W = U \oplus V$.

We also have the following general results for subspaces of the vector space $\mathbb{R}^n$:

Theorem 2.

1. Let $S$ be a subspace of $\mathbb{R}^n$. Then we have $\dim(S) + \dim(S^\perp) = n$. Further, if $\{x_j\}_{j=1}^{r}$ is a basis for $S$ and $\{x_j\}_{j=r+1}^{n}$ is a basis for $S^\perp$, then $\{x_j\}_{j=1}^{n}$ is a basis for $\mathbb{R}^n$.

2. If $S$ is a subspace of $\mathbb{R}^n$, then $S \oplus S^\perp = \mathbb{R}^n$.

3. If $S$ is a subspace of $\mathbb{R}^n$, then $(S^\perp)^\perp = S$.

4. If $A$ is an $m \times n$ matrix and $b \in \mathbb{R}^m$, then exactly one of the following alternatives holds:
   (a) there is an $x \in \mathbb{R}^n$ such that $Ax = b$; or
   (b) there is a vector $y \in \mathbb{R}^m$ such that $A^T y = 0$ (i.e., $y \in N(A^T)$) and $y^T b \ne 0$.

5. In other words, $b \in R(A) \iff b \perp N(A^T)$. N.B. This is a very important result.
   (a) When we want to solve $Ax = b$ it can be very important to have a test to decide whether the problem is solvable, that is, whether $b \in R(A)$. This result tells us that if we find a basis for $N(A^T)$, we can check whether $b \in R(A)$ by simply checking whether $b \perp y$ for every $y$ in that basis. (See the numerical sketch at the end of this section.)
   (b) You will also find that this is the basis for the method of least squares in the next section.

6. We can write
$$\mathbb{R}^n = R(A^T) \oplus N(A).$$

Remark 1.

1. To find a basis for $R(A)$, we find the row echelon form of $A$ to determine the pivot columns, then take exactly those columns of $A$.

2. We can also proceed as in the following example. Given
$$A = \begin{pmatrix} 1 & 2 & 3 & 1 \\ 1 & 3 & 5 & -2 \\ 3 & 8 & 13 & -3 \end{pmatrix},$$
we note that a basis for $R(A)$ is the same thing as a basis for the column space of $A$. Now we also know that the column space of $A$ is the same as the row space of $A^T$, so we find the row echelon form $U$ of $A^T$ and take the transposes of the pivot rows. Here
$$A^T = \begin{pmatrix} 1 & 1 & 3 \\ 2 & 3 & 8 \\ 3 & 5 & 13 \\ 1 & -2 & -3 \end{pmatrix} \implies U = \begin{pmatrix} 1 & 1 & 3 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$
Therefore $\langle 1, 1, 3 \rangle$ and $\langle 0, 1, 2 \rangle$ form a basis for $R(A)$.

5.3 Least Squares

When we have an overdetermined linear system, it is unlikely that there will be a solution. Nevertheless, this is exactly the type of problem very often considered in applications. In these applications one seeks a so-called least squares solution: something that is close to being a solution in a certain sense. In particular, we have an overdetermined system $Ax = b$ where $A$ is $m \times n$ with $m > n$ (usually much larger).

Definition 4 (Least Squares Problem). Given a system $Ax = b$, where $A$ is $m \times n$ with $m > n$ and $b \in \mathbb{R}^m$, for each $x \in \mathbb{R}^n$ we can form the residual
$$r(x) = b - Ax.$$
The distance from $b$ to $Ax$ is
$$\|b - Ax\| = \|r(x)\|.$$

1. The least squares problem is to find a vector $x \in \mathbb{R}^n$ for which $\|r(x)\|$ is minimized.
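The solvability test in Theorem 2, part 5, is easy to carry out numerically. The following sketch is my own illustration (not part of the original notes), using the matrix from Remark 1; computing a basis for $N(A^T)$ from the SVD is one standard approach among several.

```python
import numpy as np

# The 3 x 4 matrix from Remark 1.
A = np.array([[1.0, 2.0,  3.0,  1.0],
              [1.0, 3.0,  5.0, -2.0],
              [3.0, 8.0, 13.0, -3.0]])

# Basis for N(A^T) from the SVD of A^T: the rows of Vt whose singular
# values are (numerically) zero span the null space.
U, s, Vt = np.linalg.svd(A.T)
tol = max(A.shape) * np.finfo(float).eps * s[0]
rank = int(np.sum(s > tol))
null_AT = Vt[rank:]          # here: one row, proportional to (1, 2, -1)

# Test: b in R(A)  <=>  b is orthogonal to N(A^T).
b_good = A @ np.array([1.0, 1.0, 1.0, 1.0])   # in R(A) by construction
b_bad  = np.array([1.0, 0.0, 0.0])
print(null_AT @ b_good)   # ~ 0   => Ax = b_good is solvable
print(null_AT @ b_bad)    # != 0  => Ax = b_bad has no solution
```

Similarly, here is a short hedged illustration of the least squares problem of Definition 4 (again my own example, not the notes'): NumPy's `np.linalg.lstsq` returns the $x$ minimizing $\|r(x)\| = \|b - Ax\|$ for an overdetermined system.

```python
import numpy as np

# Fit a line c0 + c1*t to four data points. M is 4 x 2 (m > n),
# so the system M x = y is overdetermined and has no exact solution.
M = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([1.0, 2.0, 2.0, 3.0])

x_ls, *_ = np.linalg.lstsq(M, y, rcond=None)   # minimizes ||y - M x||
print(x_ls)                           # [1.1, 0.6]
print(np.linalg.norm(y - M @ x_ls))   # residual norm ||r(x)|| ~ 0.447
```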