
Review Concepts

Email: [email protected]; there may be typos.

Orthogonality

• Let W be a subspace of R^n. For any v ∈ R^n, the projection of v onto W, denoted v̂, is the vector in W closest to v. Let v ∈ R^n and let {w1, w2, ..., wk} be an orthogonal basis of W. Then the projection of v onto W is

  v̂ = ((w1 · v)/(w1 · w1)) w1 + ((w2 · v)/(w2 · w2)) w2 + ... + ((wk · v)/(wk · wk)) wk.
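For concreteness, here is a minimal numpy sketch of this formula (not from the original notes; the vectors are made-up examples):

    import numpy as np

    def project(v, basis):
        # Project v onto span(basis); assumes basis is an ORTHOGONAL set.
        vhat = np.zeros_like(v, dtype=float)
        for w in basis:
            vhat += (np.dot(w, v) / np.dot(w, w)) * w
        return vhat

    # Example: project v onto the xy-plane, W = span{e1, e2}.
    v = np.array([3.0, 4.0, 5.0])
    print(project(v, [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]))
    # -> [3. 4. 0.]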

• v̂ is the point in W closest to v.
• If v ∈ W then v̂ = v. If v ⊥ W then v̂ = 0. For any v, we can write v = v̂ + v⊥ where v⊥ = v − v̂ ∈ W⊥.
• The projection matrix for projection onto span{w} is P = (1/(w · w)) ww^T.
• Gram-Schmidt: Given a subspace W ⊂ R^n and a basis {a1, ..., ak} of W, we can find an orthogonal basis {b1, ..., bk} of W and an orthonormal basis {u1, ..., uk} of W. The algorithm is as follows (a numpy sketch appears after the QR bullets below):

1. b1 = a1. Then u1 = b1/‖b1‖.

2. b2 = a2 − (u1 · a2)u1. Then u2 = b2/‖b2‖.
   ⋮

j. bj = aj − (u1 · aj)u1 − ... − (uj−1 · aj)uj−1. Then uj = bj/‖bj‖.

• Least-Squares: Let A be an m × n matrix and b ∈ R^m. The least-squares solution of Ax = b, denoted x̂, is the vector that minimizes the error ‖Ax̂ − b‖. Note that if Ax = b has a solution, then x̂ = x.
• Least-squares problems can be solved in two ways:
  1. Ax̂ = b̂. Compute b̂, the orthogonal projection of b onto col(A), then solve for x̂.
  2. A^T Ax̂ = A^T b. Compute A^T A and A^T b, then solve for x̂. In general, this method is easier! Furthermore, after you have found x̂, it is easy to compute b̂ = Ax̂. (A numpy sketch follows the projection-matrix formula below.)
• For a matrix A with linearly independent columns, the matrix that represents projection onto col(A) is

  P = A(A^T A)^{-1} A^T.
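The following minimal sketch (assuming numpy; A and b are made-up examples) carries out method 2 and checks it against the projection matrix:

    import numpy as np

    A = np.array([[1.0, 0.0],
                  [1.0, 1.0],
                  [1.0, 2.0]])          # 3 x 2, linearly independent columns
    b = np.array([6.0, 0.0, 0.0])

    # Method 2: solve the normal equations A^T A xhat = A^T b.
    xhat = np.linalg.solve(A.T @ A, A.T @ b)
    bhat = A @ xhat                      # projection of b onto col(A)

    # The projection matrix P = A (A^T A)^{-1} A^T gives the same bhat.
    P = A @ np.linalg.inv(A.T @ A) @ A.T
    print(np.allclose(P @ b, bhat))      # True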

• If Q is an m × n matrix with orthonormal columns, then Q^T Q = I. If in addition Q is n × n (we call such a Q an orthogonal matrix), then Q^{-1} = Q^T.
• If Q has orthonormal columns, then the matrix that represents projection onto col(Q) is P = QQ^T. Note: if Q is n × n, then because Q^{-1} = Q^T, P = QQ^T = I. I.e., the projection matrix onto col(Q) is the identity. Why does this make sense? What is col(Q)?
• QR Decomposition: Let A be an m × n matrix with linearly independent columns. Then we can write A = QR, where Q is an m × n matrix with orthonormal columns and R is n × n, upper triangular, and invertible.
• To compute Q, write A = [a1 a2 ... an] (as columns) and apply the Gram-Schmidt process to {a1, ..., an} to get an orthonormal basis {q1, ..., qn} of col(A). Then Q = [q1 q2 ... qn]. Note that Q^T A = Q^T QR = R, since Q^T Q = I. Therefore, to find R, we simply compute R = Q^T A.
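Here is a minimal Gram-Schmidt/QR sketch (assuming numpy; A is a made-up example — in practice you would call np.linalg.qr):

    import numpy as np

    def gram_schmidt_qr(A):
        m, n = A.shape
        Q = np.zeros((m, n))
        for j in range(n):
            b = A[:, j].astype(float)
            for i in range(j):
                b -= (Q[:, i] @ A[:, j]) * Q[:, i]   # subtract projections
            Q[:, j] = b / np.linalg.norm(b)          # normalize
        R = Q.T @ A                                  # R = Q^T A, upper triangular
        return Q, R

    A = np.array([[1.0, 1.0],
                  [1.0, 0.0],
                  [0.0, 1.0]])
    Q, R = gram_schmidt_qr(A)
    print(np.allclose(Q @ R, A))                     # True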


Determinants

• Let A be an n × n matrix. The determinant of A, det(A), is a scalar that satisfies the following:
  – If A is upper or lower triangular, then det(A) is the product of the diagonal entries.
  – We can reduce any matrix to upper triangular form (echelon form) using row operations. The row operations affect the determinant in the following way:

  – A → B by sending Ri → Ri + αRj. Then det(A) = det(B).

  – A → B by swapping Ri ↔ Rj. Then −det(A) = det(B).

  – A → B by scaling Ri → αRi (α ≠ 0). Then α · det(A) = det(B).
  We can use the above method to find the determinant of a matrix of any size (a sketch follows below).
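As an illustration, here is a minimal sketch of the row-reduction method (assuming numpy; not from the original notes):

    import numpy as np

    def det_by_row_reduction(A):
        # Reduce A to upper triangular form, tracking how each row
        # operation affects the determinant.
        U = A.astype(float)
        n = U.shape[0]
        sign = 1.0
        for j in range(n):
            p = next((i for i in range(j, n) if U[i, j] != 0), None)
            if p is None:
                return 0.0                # no pivot in this column: det = 0
            if p != j:
                U[[j, p]] = U[[p, j]]     # row swap flips the sign
                sign = -sign
            for i in range(j + 1, n):
                U[i] -= (U[i, j] / U[j, j]) * U[j]  # replacement: det unchanged
        return sign * np.prod(np.diag(U))  # product of the diagonal entries

    A = np.array([[2, 1], [4, 5]])
    print(det_by_row_reduction(A), np.linalg.det(A))  # both ≈ 6.0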

• A is invertible if and only if det(A) ≠ 0.
• For 2 × 2 matrices ONLY: if
    A = [ a  b ]
        [ c  d ],
  then det(A) = ad − bc.

• For 3 × 3 matrices ONLY: if
    A = [ a  b  c ]
        [ d  e  f ]
        [ g  h  i ],
  then det(A) = [aei + bfg + cdh] − [ceg + bdi + afh]. (Add the products of the left-to-right diagonals, subtract the products of the right-to-left diagonals. This is similar to computing cross products.)

• Cofactor Expansion (for square matrices of any size!): Let
    A = [ a_{1,1} ··· a_{1,n} ]
        [ a_{2,1} ··· a_{2,n} ]
        [        ···          ]
        [ a_{n,1} ··· a_{n,n} ].
  Pick any row or column of A to expand along. Define A_{i,j} to be the determinant of the (n − 1) × (n − 1) matrix obtained by removing the ith row and jth column of A. If we choose to expand along the ith row, then

  det(A) = (−1)^{i+1} a_{i,1}A_{i,1} + (−1)^{i+2} a_{i,2}A_{i,2} + ... + (−1)^{i+n} a_{i,n}A_{i,n},

if instead the jth column is chosen, then

  det(A) = (−1)^{1+j} a_{1,j}A_{1,j} + (−1)^{2+j} a_{2,j}A_{2,j} + ... + (−1)^{n+j} a_{n,j}A_{n,j}.

 + − + ···   − + − · · ·  You may nd the following checkerboard matrix helpful for determining the sign of each term:  . n×n  + − + ···   . . .  . . .  1 2 3  For example, let A =  4 5 6 . If I choose to expand along the second row then: 7 8 9

  det(A) = −4 · det([ 2 3 ; 8 9 ]) + 5 · det([ 1 3 ; 7 9 ]) − 6 · det([ 1 2 ; 7 8 ]).
  Note: the cofactor expansion method generally requires more computation than other methods. However, if we have a row or column with many zero entries, then we can expand along that row or column and reduce the amount of computation required.
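A minimal recursive sketch of cofactor expansion (assuming numpy; it always expands along the first row, and its cost grows factorially, so it is for hand-sized matrices only):

    import numpy as np

    def det_by_cofactors(A):
        n = A.shape[0]
        if n == 1:
            return A[0, 0]
        total = 0.0
        for j in range(n):
            # Delete row 0 and column j to form the minor; the sign of the
            # term alternates as (-1)^j along the first row.
            minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)
            total += (-1) ** j * A[0, j] * det_by_cofactors(minor)
        return total

    A = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0],
                  [7.0, 8.0, 9.0]])
    print(det_by_cofactors(A))   # 0.0 -- this example matrix is singular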

• If A, B are n × n matrices, then det(AB) = det(A) · det(B). In particular, if A is invertible (therefore det(A) ≠ 0), then det(A^{-1}) = 1/det(A).
• det(A^T) = det(A).
• If A is n × n, then for any scalar α, det(αA) = α^n det(A).
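These properties are easy to check numerically (a quick sketch with made-up random matrices, assuming numpy):

    import numpy as np

    rng = np.random.default_rng(0)
    A, B = rng.random((3, 3)), rng.random((3, 3))
    det = np.linalg.det
    print(np.isclose(det(A @ B), det(A) * det(B)))   # True
    print(np.isclose(det(A.T), det(A)))              # True
    print(np.isclose(det(2 * A), 2**3 * det(A)))     # True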

Eigenvalues and Eigenvectors

• Let A be an n × n matrix. A nonzero vector x is an eigenvector of A if

Ax = λx

for some scalar λ. We call λ an eigenvalue of A, and x an eigenvector corresponding to λ. The eigenspace corresponding to λ is the subspace consisting of all eigenvectors corresponding to λ, together with the zero vector.
• The characteristic polynomial of A is p(λ) = det(A − λI).
• To find the eigenvalues of A, find the roots of p(λ), i.e., solve for λ such that det(A − λI) = 0. (A numpy check appears at the end of this list.)

• Given λ, to find the eigenvectors (and eigenspace) corresponding to λ, find nul(A − λI).
• Any nonzero vector in the null space of A is an eigenvector of A corresponding to λ = 0. Thus A is noninvertible if and only if 0 is an eigenvalue of A.

• If {x1, ..., xk} are eigenvectors of A corresponding to the same eigenvalue λ, then any nonzero linear combination of {x1, ..., xk} is also an eigenvector of A corresponding to λ.
• If A is an n × n matrix, then A has n eigenvalues (counted with multiplicity). The eigenvalues may be real or complex.

• The determinant of A is equal to the product of eigenvalues of A.

• The trace of A (the sum of the diagonal entries) is equal to the sum of the eigenvalues of A.
• A has at most n linearly independent eigenvectors.
• If λ is an eigenvalue of A with multiplicity k (i.e., λ appears as a root of p(λ) k times), then A has at most k linearly independent eigenvectors corresponding to λ.
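A minimal numpy check of several of these facts at once (the 2 × 2 matrix is a made-up example):

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])
    vals, vecs = np.linalg.eig(A)        # eigenvalues are 3 and 1

    # Ax = lambda x for each eigenpair (columns of vecs are eigenvectors).
    for lam, x in zip(vals, vecs.T):
        print(np.allclose(A @ x, lam * x))               # True, True

    print(np.isclose(np.linalg.det(A), np.prod(vals)))   # det = product
    print(np.isclose(np.trace(A), np.sum(vals)))         # trace = sum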

Discrete Dynamical Systems

• Let xt ∈ R^n be a vector that changes with time t, where xt+1 = Axt for some n × n matrix A (we call A the transition matrix). Then x1 = Ax0, x2 = Ax1 = A(Ax0) = A^2 x0, and in general,

  xk = A^k x0

where x0 is the initial condition.
• A Markov process is a discrete dynamical system with two properties: the total population remains fixed, and the population of a given state can never be negative.

• Within the population, we have n different classes; call them class 1, 2, ..., n. At every time step, there is a probability that entities from one class move to another class (or stay in the same class). We wish to find the equilibrium state of the classes as t → ∞.

• As in discrete dynamical systems, we can model a Markov process by xt+1 = Axt, where A is an n × n matrix and xt = (x1,t, ..., xn,t)^T reports the population at time t: xi,t is the percentage of class i in the total population at time t.
• We call A a Markov matrix. A has the following properties: the columns of A add up to 1, no entry of A is negative, 1 is always an eigenvalue of A, and all other eigenvalues are less than 1 in absolute value.

• We wish to find the equilibrium state as t → ∞, call it x∞. Observe

  Ax∞ = lim_{n→∞} Axn = lim_{n→∞} xn+1 = x∞,

therefore to find the equilibrium state x∞, we should look at the eigenvector corresponding to λ = 1. (A numpy sketch appears at the end of this section.)

• The sum of the entries of the equilibrium vector should be 1, since we are talking about the sum of percentages that add up to a whole. Note: this is not a coincidence!

• Let {v1, v2, ..., vn} be an eigenbasis of the Markov matrix A (with corresponding eigenvalues λ1(= 1), λ2, ..., λn respectively). For any vector x, we can write x = c1v1 + ... + cnvn for some scalars c1, ..., cn. Then A^k x = c1λ1^k v1 + ... + cnλn^k vn → c1v1 as k → ∞, because |λi| < 1 for all i > 1.
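A minimal numpy sketch of both points (the 2 × 2 Markov matrix is a made-up example):

    import numpy as np

    # Part 1: read off the equilibrium from the eigenvector for lambda = 1.
    A = np.array([[0.9, 0.2],
                  [0.1, 0.8]])              # columns sum to 1, entries >= 0
    vals, vecs = np.linalg.eig(A)
    k = np.argmin(np.abs(vals - 1.0))       # locate the eigenvalue 1
    x_inf = vecs[:, k] / vecs[:, k].sum()   # rescale so the entries sum to 1
    print(x_inf)                            # [0.6667, 0.3333]

    # Part 2: iterating xt+1 = A xt drives any starting distribution toward
    # the same equilibrium, since the other eigenvalue is 0.7 < 1.
    x = np.array([1.0, 0.0])                # start with everyone in class 1
    for _ in range(100):
        x = A @ x
    print(x)                                # approximately [0.6667, 0.3333]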

Invertible Matrix Theorem

Let A be an n × n matrix. The following are equivalent:
• A is invertible.
• There exists a matrix A^{-1} such that AA^{-1} = A^{-1}A = I.

• A has n pivots in echelon form (every row and every column has a pivot).
• The reduced echelon form of A is I.
• A is a product of elementary matrices.

• There is a unique solution to Ax = 0 (x = 0).

• For any b ∈ R^n, Ax = b has a unique solution.
• Nul(A) = {0}.

• Col(A) = R^n.
• The columns of A are linearly independent.
• The rows of A are linearly independent.

• The columns of A span R^n.
• The rows of A span R^n.

• The columns of A form a basis of R^n.
• The rows of A form a basis of R^n.
• rank(A) = n.

• dim(Nul(A)) = 0.
• det(A) ≠ 0.
• 0 is not an eigenvalue of A.
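Several of these conditions are easy to test numerically at once (a quick sketch with a made-up matrix, assuming numpy; with floating point, the rank and determinant tests rely on tolerances):

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 1.0]])
    n = A.shape[0]
    print(np.linalg.matrix_rank(A) == n)                       # full rank
    print(not np.isclose(np.linalg.det(A), 0.0))               # det(A) != 0
    print(not np.any(np.isclose(np.linalg.eigvals(A), 0.0)))   # 0 not an eigenvalue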
