Math 520 - Linear Algebra, Spring Semester 2012-2013
Dan Abramovich

Orthogonality

Inner or dot product in $\mathbb{R}^n$:
$$u^T v = u \cdot v = u_1 v_1 + \cdots + u_n v_n$$
(examples)

Properties:
• $u \cdot v = v \cdot u$
• $(u + v) \cdot w = u \cdot w + v \cdot w$
• $(cv) \cdot w = c(v \cdot w)$
• $u \cdot u \ge 0$, and $u \cdot u > 0$ unless $u = 0$.

Norm = length: $\|v\| = \sqrt{v \cdot v}$.
$u$ is a unit vector if $\|u\| = 1$. If $v \ne 0$ then $u = \frac{1}{\|v\|} v$ is a unit vector positively proportional to $v$. (examples!)

We argue that $u \perp v$ (perpendicular or orthogonal) if and only if $\|u - v\| = \|u - (-v)\|$. This means $(u-v)\cdot(u-v) = (u+v)\cdot(u+v)$. Expand and get $\|u\|^2 + \|v\|^2 - 2\,u\cdot v = \|u\|^2 + \|v\|^2 + 2\,u\cdot v$. So
$$u \perp v \iff u \cdot v = 0.$$

Exercise: $u \perp v \iff \|u\|^2 + \|v\|^2 = \|u + v\|^2$.

Angles: we define the cosine of the angle between nonzero $u$ and $v$ to be
$$\cos\theta = \frac{u \cdot v}{\|u\|\,\|v\|}.$$
(examples!) This fits with the law of cosines!

Let $W \subset \mathbb{R}^n$ be a subspace. We say $u \perp W$ if for all $v \in W$ we have $u \perp v$.
$$W^\perp = \{u \in \mathbb{R}^n \mid u \perp W\}.$$

Theorem: if $\{v_1, \ldots, v_k\}$ spans $W$ then $u \perp W$ if and only if $u \perp v_i$ for all $i$.
Theorem: $W^\perp \subset \mathbb{R}^n$ is a subspace.
Theorem: $\dim W + \dim W^\perp = n$.
Theorem: $(W^\perp)^\perp = W$.

Key examples: $\operatorname{Row}(A)^\perp = \operatorname{Null}(A)$ and $\operatorname{Col}(A)^\perp = \operatorname{Null}(A^T)$.

$\{u_1, \ldots, u_k\}$ is an orthogonal set if $u_i \perp u_j$ whenever $i \ne j$. (examples - fill in $\mathbb{R}^3$!)

Theorem. If $\{u_1, \ldots, u_k\}$ is an orthogonal set and the $u_i$ are non-zero, then they are linearly independent. (prove it!!)

So an orthogonal set of non-zero vectors is a basis for its span.

Definition: An orthogonal basis of $W$ is a basis which is an orthogonal set. (just a change of perspective)

Theorem. If $\{u_1, \ldots, u_k\}$ is an orthogonal basis for $W$ and we want to decompose a vector $y \in W$ as $y = c_1 u_1 + \cdots + c_k u_k$, then
$$c_j = \frac{y \cdot u_j}{u_j \cdot u_j}.$$
(examples!!)

Things get better if we normalize: $\{u_1, \ldots, u_k\}$ is an orthonormal set if it is an orthogonal set of unit vectors. (examples!)

Theorem. A matrix $U$ has orthonormal columns if and only if $U^T U = I$.
Theorem. If $U$ has orthonormal columns then $(Ux) \cdot (Uy) = x \cdot y$.
Corollary. If $U$ has orthonormal columns then
• $\|Ux\| = \|x\|$
• $x \perp y \iff Ux \perp Uy$

If $U$ is a square matrix with orthonormal columns then it is called an orthogonal matrix. Note: $U^T U = I$ and $U$ a square matrix means $U U^T = I$, meaning the rows are also orthonormal!

Suppose $\{u_1, \ldots, u_k\}$ is an orthogonal basis for $W$ but $y \notin W$. What can we do?

Theorem. There is a unique expression $y = y_W + y_{W^\perp}$ where $y_W \in W$ and $y_{W^\perp} \in W^\perp$. In fact $y_W = c_1 u_1 + \cdots + c_k u_k$ where
$$c_j = \frac{y \cdot u_j}{u_j \cdot u_j}.$$
(examples!!)

If $\{u_1, \ldots, u_k\}$ is orthonormal, the formula gets simpler:
$$y_W = (y \cdot u_1) u_1 + \cdots + (y \cdot u_k) u_k.$$
Writing $U = [u_1 \; \cdots \; u_k]$ this gives $y_W = U U^T y$.

We call $y_W$ the perpendicular projection of $y$ on $W$. Notation: $y_W = \operatorname{proj}_W y$. So $y = \operatorname{proj}_W y + \operatorname{proj}_{W^\perp} y$. Then $y_W$ is the vector in $W$ closest to $y$.

Gram-Schmidt Algorithm.

Say $\{x_1, \ldots, x_k\}$ is a basis for $W$. Think about $W_i = \operatorname{Span}\{x_1, \ldots, x_i\}$. We construct the $v_i$ inductively as follows: (examples!)
$$v_1 = x_1$$
$$v_2 = \operatorname{proj}_{W_1^\perp} x_2 = x_2 - \operatorname{proj}_{W_1} x_2 = x_2 - \frac{x_2 \cdot v_1}{v_1 \cdot v_1} v_1$$
$$\vdots$$
$$v_k = \operatorname{proj}_{W_{k-1}^\perp} x_k = x_k - \left( \frac{x_k \cdot v_1}{v_1 \cdot v_1} v_1 + \cdots + \frac{x_k \cdot v_{k-1}}{v_{k-1} \cdot v_{k-1}} v_{k-1} \right)$$
We can normalize: $u_i = \frac{1}{\|v_i\|} v_i$.

QR factorization.

Matrix interpretation of Gram-Schmidt: say $A$ has linearly independent columns, $A = [x_1 \cdots x_n]$, and say $Q = [u_1 \cdots u_n]$. Since $\operatorname{Span}(u_1, \ldots, u_i) = \operatorname{Span}(x_1, \ldots, x_i)$ we have $x_i = r_{1i} u_1 + \cdots + r_{ii} u_i$, meaning
$$x_i = Q \begin{bmatrix} r_{1i} \\ \vdots \\ r_{ii} \\ 0 \\ \vdots \\ 0 \end{bmatrix}.$$
So $A = [x_1 \cdots x_n] = Q\,[r_1 \cdots r_n]$.

In other words $A = QR$, where $Q$ has orthonormal columns and $R$ is invertible upper triangular. Note: $R = Q^T A$.
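As a quick sanity check of the Gram-Schmidt recursion and of $A = QR$ with $R = Q^T A$, here is a minimal NumPy sketch (NumPy, the helper name `gram_schmidt_qr`, and the test matrix are my own additions for illustration, not part of the notes; the columns of the test matrix are assumed linearly independent):

```python
import numpy as np

def gram_schmidt_qr(A):
    """QR via the Gram-Schmidt recursion above (assumes linearly independent columns)."""
    n = A.shape[1]
    Q = np.zeros_like(A, dtype=float)
    for i in range(n):
        v = A[:, i].astype(float)
        # subtract the projection of x_i onto each previously built (already normalized) u_j
        for j in range(i):
            v = v - (A[:, i] @ Q[:, j]) * Q[:, j]
        Q[:, i] = v / np.linalg.norm(v)   # u_i = v_i / ||v_i||
    R = Q.T @ A                           # R = Q^T A, upper triangular
    return Q, R

A = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
Q, R = gram_schmidt_qr(A)
print(np.round(Q.T @ Q, 10))   # identity: Q has orthonormal columns
print(np.round(R, 10))         # upper triangular
print(np.allclose(Q @ R, A))   # A = QR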
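In the same spirit, a small numerical illustration of the projection formulas above, $c_j = (y \cdot u_j)/(u_j \cdot u_j)$ and $y_W = U U^T y$ (the vectors are made-up examples; checking that $y - y_W \perp W$ is the point of the exercise):

```python
import numpy as np

# Orthogonal basis for a plane W in R^3 (made-up example vectors)
u1 = np.array([1.0, 1.0, 0.0])
u2 = np.array([1.0, -1.0, 2.0])
assert abs(u1 @ u2) < 1e-12                # u1 ⊥ u2, so {u1, u2} is an orthogonal basis of W

y = np.array([3.0, 1.0, 4.0])

# Projection using the coefficients c_j = (y·u_j)/(u_j·u_j)
yW = (y @ u1) / (u1 @ u1) * u1 + (y @ u2) / (u2 @ u2) * u2

# Same projection via an orthonormal basis: yW = U U^T y
U = np.column_stack([u1 / np.linalg.norm(u1), u2 / np.linalg.norm(u2)])
assert np.allclose(U.T @ U, np.eye(2))     # orthonormal columns
assert np.allclose(U @ U.T @ y, yW)

# The residual y - yW lies in W^perp: orthogonal to both basis vectors
r = y - yW
print(yW, r @ u1, r @ u2)
```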
Least squares approximation.

Say $b$ is not in $\operatorname{Col} A$. You still want to approximate $Ax \approx b$. We'll calculate $\hat{x}$ such that $\|A\hat{x} - b\|$ is minimized. Then we'll apply this.

The closest an element $A\hat{x}$ of the column space gets to $b$ is
$$\hat{b} = \operatorname{proj}_{\operatorname{Col} A} b.$$
(Gram-Schmidt? ouch!!) Now $\hat{b} \in \operatorname{Col} A$, so we can solve $A\hat{x} = \hat{b}$. This seems hard. But:
$$A\hat{x} = \hat{b} = \operatorname{proj}_{\operatorname{Col} A} b$$
$$\iff (b - \hat{b}) \perp \operatorname{Col} A = \operatorname{Row} A^T$$
$$\iff A^T(b - A\hat{x}) = 0$$
$$\iff A^T A\, \hat{x} = A^T b$$

Linear regression.

Theoretically a system behaves like $y = \beta_0 + \beta_1 x$, and you want to find the $\beta_i$. You run experiments which give data points $(x_i, y_i)$. They will not actually lie on a line. You get the approximate equations
$$\beta_0 + x_1\beta_1 \approx y_1$$
$$\beta_0 + x_2\beta_1 \approx y_2$$
$$\vdots$$
$$\beta_0 + x_n\beta_1 \approx y_n$$

In matrix notation:
$$\begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix} \approx \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix}$$
or $X\beta \approx y$.

To solve it we take
$$X^T X \beta = X^T y.$$
We can solve directly, or expand:
$$\begin{bmatrix} n & \sum x_i \\ \sum x_i & \sum x_i^2 \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix} = \begin{bmatrix} \sum y_i \\ \sum x_i y_i \end{bmatrix}.$$
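A short sketch of the normal-equation computation $X^T X \beta = X^T y$ for the line-fitting setup above, assuming NumPy; the data points are invented for illustration, and the comparison with `np.linalg.lstsq` is only a sanity check, not part of the notes:

```python
import numpy as np

# Made-up data points (x_i, y_i) that do not lie exactly on a line
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 1.9, 3.2, 3.8, 5.1])

# Design matrix X: a column of ones and a column of the x_i
X = np.column_stack([np.ones_like(x), x])

# Normal equations: X^T X beta = X^T y
beta = np.linalg.solve(X.T @ X, X.T @ y)
print(beta)                                  # [beta_0, beta_1]

# Sanity check against NumPy's built-in least-squares solver
beta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(beta, beta_ls))
```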