
Review Concepts

Email: [email protected]; there may be typos.

Orthogonality

• Let W be a subspace of R^n. For any v ∈ R^n, the projection of v onto W, denoted v̂, is the vector in W closest to v. Let v ∈ R^n and let {w1, w2, ..., wk} be an orthogonal basis of W. Then the projection of v onto W is

  v̂ = ((w1 · v)/(w1 · w1)) w1 + ((w2 · v)/(w2 · w2)) w2 + ... + ((wk · v)/(wk · wk)) wk.
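For concreteness, here is a minimal numpy sketch of this formula (not from the original notes; the vectors are made-up examples):

    import numpy as np

    def project(v, basis):
        # Project v onto span(basis); assumes basis is an ORTHOGONAL set.
        vhat = np.zeros_like(v, dtype=float)
        for w in basis:
            vhat += (np.dot(w, v) / np.dot(w, w)) * w
        return vhat

    # Example: project v onto the xy-plane, W = span{e1, e2}.
    v = np.array([3.0, 4.0, 5.0])
    print(project(v, [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]))
    # -> [3. 4. 0.]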

• v̂ is the point in W closest to v.
• If v ∈ W then v̂ = v. If v ⊥ W then v̂ = 0. For any v, we can write v = v̂ + v⊥ where v⊥ = v − v̂ ∈ W⊥.
• The projection matrix for projection onto span{w} is P = (1/(w · w)) ww^T.
• Gram-Schmidt: Given a subspace W ⊂ R^n and a basis {a1, ..., ak} of W, we can find an orthogonal basis {b1, ..., bk} of W and an orthonormal basis {u1, ..., uk} of W. The algorithm is as follows (a numpy sketch appears after the QR bullets below):

1. b1 = a1. Then u1 = b1/‖b1‖.

2. b2 = a2 − (u1 · a2)u1. Then u2 = b2/‖b2‖.
   ⋮

j. bj = aj − (u1 · aj)u1 − ... − (uj−1 · aj)uj−1. Then uj = bj/‖bj‖.

• Least-Squares: Let A be an m × n matrix and b ∈ R^m. The least-squares solution of Ax = b, denoted x̂, is the vector that minimizes the error ‖Ax̂ − b‖. Note that if Ax = b has a solution, then x̂ = x.
• Least-squares problems can be solved in two ways:
  1. Ax̂ = b̂. Compute b̂, the orthogonal projection of b onto col(A), then solve for x̂.
  2. A^T Ax̂ = A^T b. Compute A^T A and A^T b, then solve for x̂. In general, this method is easier! Furthermore, after you have found x̂, it is easy to compute b̂ = Ax̂. (A numpy sketch follows the projection-matrix formula below.)
• For a matrix A with linearly independent columns, the matrix that represents projection onto col(A) is

  P = A(A^T A)^{-1} A^T.
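The following minimal sketch (assuming numpy; A and b are made-up examples) carries out method 2 and checks it against the projection matrix:

    import numpy as np

    A = np.array([[1.0, 0.0],
                  [1.0, 1.0],
                  [1.0, 2.0]])          # 3 x 2, linearly independent columns
    b = np.array([6.0, 0.0, 0.0])

    # Method 2: solve the normal equations A^T A xhat = A^T b.
    xhat = np.linalg.solve(A.T @ A, A.T @ b)
    bhat = A @ xhat                      # projection of b onto col(A)

    # The projection matrix P = A (A^T A)^{-1} A^T gives the same bhat.
    P = A @ np.linalg.inv(A.T @ A) @ A.T
    print(np.allclose(P @ b, bhat))      # True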

• If Q is an m × n matrix with orthonormal columns, then Q^T Q = I. If in addition Q is n × n (we call such a Q an orthogonal matrix), then Q^{-1} = Q^T.
• If Q has orthonormal columns, then the matrix that represents projection onto col(Q) is P = QQ^T. Note: if Q is n × n, then because Q^{-1} = Q^T, P = QQ^T = I. I.e., the projection matrix onto col(Q) is the identity. Why does this make sense? What is col(Q)?
• QR Decomposition: Let A be an m × n matrix with linearly independent columns. Then we can write A = QR, where Q is an m × n matrix with orthonormal columns and R is n × n, upper triangular, and invertible.
• To compute Q, write A = [a1 a2 ... an] (as columns) and apply the Gram-Schmidt process to {a1, ..., an} to get an orthonormal basis {q1, ..., qn} of col(A). Then Q = [q1 q2 ... qn]. Note that Q^T A = Q^T QR = R, since Q^T Q = I. Therefore, to find R, we simply compute R = Q^T A.
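Here is a minimal Gram-Schmidt/QR sketch (assuming numpy; A is a made-up example — in practice you would call np.linalg.qr):

    import numpy as np

    def gram_schmidt_qr(A):
        m, n = A.shape
        Q = np.zeros((m, n))
        for j in range(n):
            b = A[:, j].astype(float)
            for i in range(j):
                b -= (Q[:, i] @ A[:, j]) * Q[:, i]   # subtract projections
            Q[:, j] = b / np.linalg.norm(b)          # normalize
        R = Q.T @ A                                  # R = Q^T A, upper triangular
        return Q, R

    A = np.array([[1.0, 1.0],
                  [1.0, 0.0],
                  [0.0, 1.0]])
    Q, R = gram_schmidt_qr(A)
    print(np.allclose(Q @ R, A))                     # True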


Determinants

• Let A be an n × n matrix. The determinant of A, det(A), is a scalar that satisfies the following:
  – If A is upper or lower triangular, then det(A) is the product of the diagonal entries.
  – We can reduce any matrix to upper triangular form (echelon form) using row operations. The row operations affect the determinant in the following way:

  – A → B by sending Ri → Ri + αRj. Then det(A) = det(B).

  – A → B by swapping Ri ↔ Rj. Then −det(A) = det(B).

  – A → B by scaling Ri → αRi (α ≠ 0). Then α · det(A) = det(B).
  We can use the above method to find the determinant of a matrix of any size (a sketch follows below).
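As an illustration, here is a minimal sketch of the row-reduction method (assuming numpy; not from the original notes):

    import numpy as np

    def det_by_row_reduction(A):
        # Reduce A to upper triangular form, tracking how each row
        # operation affects the determinant.
        U = A.astype(float)
        n = U.shape[0]
        sign = 1.0
        for j in range(n):
            p = next((i for i in range(j, n) if U[i, j] != 0), None)
            if p is None:
                return 0.0                # no pivot in this column: det = 0
            if p != j:
                U[[j, p]] = U[[p, j]]     # row swap flips the sign
                sign = -sign
            for i in range(j + 1, n):
                U[i] -= (U[i, j] / U[j, j]) * U[j]  # replacement: det unchanged
        return sign * np.prod(np.diag(U))  # product of the diagonal entries

    A = np.array([[2, 1], [4, 5]])
    print(det_by_row_reduction(A), np.linalg.det(A))  # both ≈ 6.0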

• A is invertible if and only if det(A) ≠ 0.
• For 2 × 2 matrices ONLY: if
    A = [ a  b ]
        [ c  d ],
  then det(A) = ad − bc.

• For 3 × 3 matrices ONLY: if
    A = [ a  b  c ]
        [ d  e  f ]
        [ g  h  i ],
  then det(A) = [aei + bfg + cdh] − [ceg + bdi + afh]. (Add the products of the left-to-right diagonals, subtract the products of the right-to-left diagonals. This is similar to computing cross products.)

• Cofactor Expansion (for square matrices of any size!): Let
    A = [ a_{1,1} ··· a_{1,n} ]
        [ a_{2,1} ··· a_{2,n} ]
        [        ···          ]
        [ a_{n,1} ··· a_{n,n} ].
  Pick any row or column of A to expand along. Define A_{i,j} to be the determinant of the (n − 1) × (n − 1) matrix obtained by removing the ith row and jth column of A. If we choose to expand along the ith row, then

  det(A) = (−1)^{i+1} a_{i,1}A_{i,1} + (−1)^{i+2} a_{i,2}A_{i,2} + ... + (−1)^{i+n} a_{i,n}A_{i,n},

if instead the jth column is chosen, then

  det(A) = (−1)^{1+j} a_{1,j}A_{1,j} + (−1)^{2+j} a_{2,j}A_{2,j} + ... + (−1)^{n+j} a_{n,j}A_{n,j}.

 + − + ···   − + − · · ·  You may nd the following checkerboard matrix helpful for determining the sign of each term:  . n×n  + − + ···   . . .  . . .  1 2 3  For example, let A =  4 5 6 . If I choose to expand along the second row then: 7 8 9

  det(A) = −4 · det([ 2 3 ; 8 9 ]) + 5 · det([ 1 3 ; 7 9 ]) − 6 · det([ 1 2 ; 7 8 ]).
  Note: the cofactor expansion method generally requires more computation than other methods. However, if we have a row or column with many zero entries, then we can expand along that row or column and reduce the amount of computation required.
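A minimal recursive sketch of cofactor expansion (assuming numpy; it always expands along the first row, and its cost grows factorially, so it is for hand-sized matrices only):

    import numpy as np

    def det_by_cofactors(A):
        n = A.shape[0]
        if n == 1:
            return A[0, 0]
        total = 0.0
        for j in range(n):
            # Delete row 0 and column j to form the minor; the sign of the
            # term alternates as (-1)^j along the first row.
            minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)
            total += (-1) ** j * A[0, j] * det_by_cofactors(minor)
        return total

    A = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0],
                  [7.0, 8.0, 9.0]])
    print(det_by_cofactors(A))   # 0.0 -- this example matrix is singular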

• If A, B are n × n matrices, then det(AB) = det(A) · det(B). In particular, if A is invertible (therefore det(A) ≠ 0), then det(A^{-1}) = 1/det(A).
• det(A^T) = det(A).
• If A is n × n, then for any scalar α, det(αA) = α^n det(A).
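These properties are easy to check numerically (a quick sketch with made-up random matrices, assuming numpy):

    import numpy as np

    rng = np.random.default_rng(0)
    A, B = rng.random((3, 3)), rng.random((3, 3))
    det = np.linalg.det
    print(np.isclose(det(A @ B), det(A) * det(B)))   # True
    print(np.isclose(det(A.T), det(A)))              # True
    print(np.isclose(det(2 * A), 2**3 * det(A)))     # True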

Eigenvalues and Eigenvectors

• Let A be an n × n matrix. A nonzero vector x is an eigenvector of A if

Ax = λx

for some scalar λ. We call λ an eigenvalue of A, and x an eigenvector corresponding to λ. The eigenspace corresponding to λ is the subspace consisting of all eigenvectors corresponding to λ, together with the zero vector.
• The characteristic polynomial of A is p(λ) = det(A − λI).
• To find the eigenvalues of A, find the roots of p(λ), i.e., solve for λ such that det(A − λI) = 0. (A numpy check appears at the end of this list.)

• Given λ, to find the eigenvectors (and eigenspace) corresponding to λ, find nul(A − λI).
• Any nonzero vector in the null space of A is an eigenvector of A corresponding to λ = 0. Thus A is noninvertible if and only if 0 is an eigenvalue of A.

• If {x1, ..., xk} are eigenvectors of A corresponding to the same eigenvalue λ, then any nonzero linear combination of {x1, ..., xk} is also an eigenvector of A corresponding to λ.
• If A is an n × n matrix, then A has n eigenvalues (counted with multiplicity). The eigenvalues may be real or complex.

• The determinant of A is equal to the product of eigenvalues of A.

• The trace of A (the sum of the diagonal entries) is equal to the sum of the eigenvalues of A.
• A has at most n linearly independent eigenvectors.
• If λ is an eigenvalue of A with multiplicity k (i.e., λ appears as a root of p(λ) k times), then A has at most k linearly independent eigenvectors corresponding to λ.
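A minimal numpy check of several of these facts at once (the 2 × 2 matrix is a made-up example):

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])
    vals, vecs = np.linalg.eig(A)        # eigenvalues are 3 and 1

    # Ax = lambda x for each eigenpair (columns of vecs are eigenvectors).
    for lam, x in zip(vals, vecs.T):
        print(np.allclose(A @ x, lam * x))               # True, True

    print(np.isclose(np.linalg.det(A), np.prod(vals)))   # det = product
    print(np.isclose(np.trace(A), np.sum(vals)))         # trace = sum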

Discrete Dynamical Systems

• Let xt ∈ R^n be a vector that changes with time t, where xt+1 = Axt for some n × n matrix A (we call A the transition matrix). Then x1 = Ax0, x2 = Ax1 = A(Ax0) = A^2 x0, and in general,

  xk = A^k x0

where x0 is the initial condition.
• A Markov process is a discrete dynamical system with two properties: the total population remains fixed, and the population of a given state can never be negative.

• Within the population, we have n different classes; call them class 1, 2, ..., n. At every time step, there is a probability that entities from one class move to another class (or stay in the same class). We wish to find the equilibrium state of the classes as t → ∞.

• As in discrete dynamical systems, we can model a Markov process by xt+1 = Axt, where A is an n × n matrix and xt = (x1,t, ..., xn,t)^T reports the population at time t: xi,t is the percentage of class i in the total population at time t.
• We call A a Markov matrix. A has the following properties: the columns of A add up to 1, no entry of A is negative, 1 is always an eigenvalue of A, and all other eigenvalues are less than 1 in absolute value.

• We wish to find the equilibrium state as t → ∞, call it x∞. Observe

  Ax∞ = lim_{n→∞} Axn = lim_{n→∞} xn+1 = x∞,

therefore to find the equilibrium state x∞, we should look at the eigenvector corresponding to λ = 1. (A numpy sketch appears at the end of this section.)

• The sum of the entries of the equilibrium vector should be 1, since we are talking about the sum of percentages that add up to a whole. Note: this is not a coincidence!

• Let {v1, v2, ..., vn} be an eigenbasis of the Markov matrix A (with corresponding eigenvalues λ1(= 1), λ2, ..., λn respectively). For any vector x, we can write x = c1v1 + ... + cnvn for some scalars c1, ..., cn. Then A^k x = c1λ1^k v1 + ... + cnλn^k vn → c1v1 as k → ∞, because |λi| < 1 for all i > 1.
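A minimal numpy sketch of both points (the 2 × 2 Markov matrix is a made-up example):

    import numpy as np

    # Part 1: read off the equilibrium from the eigenvector for lambda = 1.
    A = np.array([[0.9, 0.2],
                  [0.1, 0.8]])              # columns sum to 1, entries >= 0
    vals, vecs = np.linalg.eig(A)
    k = np.argmin(np.abs(vals - 1.0))       # locate the eigenvalue 1
    x_inf = vecs[:, k] / vecs[:, k].sum()   # rescale so the entries sum to 1
    print(x_inf)                            # [0.6667, 0.3333]

    # Part 2: iterating xt+1 = A xt drives any starting distribution toward
    # the same equilibrium, since the other eigenvalue is 0.7 < 1.
    x = np.array([1.0, 0.0])                # start with everyone in class 1
    for _ in range(100):
        x = A @ x
    print(x)                                # approximately [0.6667, 0.3333]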

Invertible Matrix Theorem

Let A be an n × n matrix. The following are equivalent:
• A is invertible.
• There exists a matrix A^{-1} such that AA^{-1} = A^{-1}A = I.

• A has n pivots in echelon form (every row and every column has a pivot).
• The reduced echelon form of A is I.
• A is a product of elementary matrices.

• There is a unique solution to Ax = 0 (x = 0).

• For any b ∈ R^n, Ax = b has a unique solution.
• Nul(A) = {0}.

• Col(A) = R^n.
• The columns of A are linearly independent.
• The rows of A are linearly independent.

• The columns of A span R^n.
• The rows of A span R^n.

• The columns of A form a basis of R^n.
• The rows of A form a basis of R^n.
• rank(A) = n.

• dim(Nul(A)) = 0.
• det(A) ≠ 0.
• 0 is not an eigenvalue of A.
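Several of these conditions are easy to test numerically at once (a quick sketch with a made-up matrix, assuming numpy; with floating point, the rank and determinant tests rely on tolerances):

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 1.0]])
    n = A.shape[0]
    print(np.linalg.matrix_rank(A) == n)                       # full rank
    print(not np.isclose(np.linalg.det(A), 0.0))               # det(A) != 0
    print(not np.any(np.isclose(np.linalg.eigvals(A), 0.0)))   # 0 not an eigenvalue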
