
1 Useful Background Information

In this section of the notes, various definitions and results from calculus, linear/matrix algebra, and least-squares regression will be summarized. I will refer to these items at various times during the semester.

1.1 Taylor Series

1. Let $\eta^{(k)}(x)$ denote the $k$th derivative of the function $\eta(x)$. For a function $\eta$ and a point $x_0$ in some interval $I$, define
\[
P_n(x; x_0) = \eta(x_0) + \eta^{(1)}(x_0)(x - x_0) + \eta^{(2)}(x_0)\frac{(x - x_0)^2}{2!} + \cdots + \eta^{(n)}(x_0)\frac{(x - x_0)^n}{n!},
\qquad
R_n(x; c) = \eta^{(n+1)}(c)\,\frac{(x - c)^{n+1}}{(n+1)!}.
\]
Then there exists some number $z$ between $x$ and $x_0$ such that $\eta(x) = P_n(x; x_0) + R_n(x; z)$.

2. Taylor series for functions of one variable: If $\eta$ has derivatives of all orders throughout an interval $I$ containing $x_0$, and if $\lim_{n\to\infty} R_n(x; x_0) = 0$ for every $x_0$ in $I$, then $\eta(x)$ can be represented by the Taylor series about $x_0$ for any $x_0$ in $I$. That is,
\[
\eta(x) = \eta(x_0) + \sum_{k=1}^{\infty} \eta^{(k)}(x_0)\,\frac{(x - x_0)^k}{k!}.
\]

3. Note that $P_n(x; x_0)$ is a polynomial of degree $n$. Thus, $P_n(x; x_0)$ is an $n$th-order Taylor series approximation of $\eta(x)$ because $R_n(x; x_0)$ vanishes as $n$ increases.

4. Practically, this means that even if the true form of $\eta(x)$ is unknown, we can use a polynomial $f(x) = P_n(x; x_0)$ to approximate it, with the approximation improving as $n$ increases.

5. In statistics, we may fit a linear model
\[
f(x) = \beta_0 + \beta_1 x + \beta_2 x^2 + \cdots + \beta_n x^n.
\]
What we are actually doing is fitting
\[
f(x) = P_n(x; 0) = \eta(0) + \eta^{(1)}(0)\,x + \eta^{(2)}(0)\,\frac{x^2}{2!} + \cdots + \eta^{(n)}(0)\,\frac{x^n}{n!},
\]
where $\beta_0 = \eta(0)$ and $\beta_i = \eta^{(i)}(0)/i!$ for $i = 1, 2, \ldots, n$, and we assume the remainder $R_n(x; 0)$ is negligible. (A short numerical sketch of items 3-5 appears at the end of this subsection.)

6. Taylor series can be generalized to higher dimensions. I will only review the 2-dimensional case.

7. For a function $\eta(x, y)$, let $\dfrac{\partial^n \eta}{\partial x^k\,\partial y^{\,n-k}}$ denote the $n$th-order partial derivative with differentiation taken $k$ times with respect to $x$ and $(n - k)$ times with respect to $y$.

8. If $\eta$ is a function of $(x, y)$ that has partial derivatives of all orders inside a ball $B$ containing $p_0$, and if $\lim_{n\to\infty} R_n(p; p_0) = 0$ for every $p_0$ in $B$, then $\eta(p)$ can be represented by the 2-variable Taylor series about $p_0$ for any $p_0$ in $B$.

9. For a function $\eta(x, y)$ and a point $p_0 = (x_0, y_0)$ in some open ball $B$, define $p = (x, y)$ and
\[
\begin{aligned}
P_n(p; p_0) = {}& \eta(p_0)
 + \frac{(x - x_0)}{1!}\left.\frac{\partial \eta}{\partial x}\right|_{p_0}
 + \frac{(y - y_0)}{1!}\left.\frac{\partial \eta}{\partial y}\right|_{p_0} \\
 &+ \frac{(x - x_0)^2}{2!}\left.\frac{\partial^2 \eta}{\partial x^2}\right|_{p_0}
 + \frac{(x - x_0)(y - y_0)}{1!\,1!}\left.\frac{\partial^2 \eta}{\partial x\,\partial y}\right|_{p_0}
 + \frac{(y - y_0)^2}{2!}\left.\frac{\partial^2 \eta}{\partial y^2}\right|_{p_0} \\
 &+ \cdots
 + \sum_{k=0}^{n-1} \frac{(x - x_0)^k (y - y_0)^{n-1-k}}{k!\,(n-1-k)!}\left.\frac{\partial^{\,n-1} \eta}{\partial x^k\,\partial y^{\,n-1-k}}\right|_{p_0}
 + \sum_{k=0}^{n} \frac{(x - x_0)^k (y - y_0)^{n-k}}{k!\,(n-k)!}\left.\frac{\partial^{\,n} \eta}{\partial x^k\,\partial y^{\,n-k}}\right|_{p_0} \\
R_n(p; p^*) = {}& \sum_{k=0}^{n+1} \frac{(x - x_0)^k (y - y_0)^{n+1-k}}{k!\,(n+1-k)!}\left.\frac{\partial^{\,n+1} \eta}{\partial x^k\,\partial y^{\,n+1-k}}\right|_{p^*}
\end{aligned}
\]
where $p^*$ is a point on the line segment joining $p$ and $p_0$.

10. Taylor series for functions of two variables: There exists some point $p_z$ on the line segment joining $p$ and $p_0$ such that $\eta(p) = P_n(p; p_0) + R_n(p; p_z)$.

11. Note that $P_n(p; p_0)$ is a polynomial of degree $n$ in the variables $x$ and $y$. Thus, $P_n(p; p_0)$ is an $n$th-order Taylor series approximation of $\eta(p)$ because $R_n(p; p_0)$ vanishes as $n$ increases.

12. Practically, this means that even if the true form of $\eta(p)$ is unknown, we can use a polynomial $f(p) = P_n(p; p_0)$ to approximate it, with the approximation improving as $n$ increases.

13. In statistics, we may fit a linear model
\[
f(x, y) = \sum_{i=0}^{n} \sum_{j=0}^{n-i} \beta_{i,j}\, x^i y^j.
\]
What we are actually doing is fitting $f(p) = P_n(p; (0, 0))$, where $\beta_{0,0} = \eta(0, 0)$ and
\[
\beta_{i,j} = \frac{1}{i!\,j!}\left.\frac{\partial^{\,i+j} \eta}{\partial x^i\,\partial y^j}\right|_{(0,0)} \quad \text{for } i + j = 1, 2, \ldots, n,
\]
and we assume the remainder $R_n(p; (0, 0))$ is negligible.

14. On the following page, the shorthand
\[
f_{12} = \frac{\partial^2 f}{\partial x\,\partial y}, \qquad
f_{11} = \frac{\partial^2 f}{\partial x^2}, \qquad
f_{22} = \frac{\partial^2 f}{\partial y^2}
\]
is used. Thus,
\[
\Delta = \left(\frac{\partial^2 f}{\partial x\,\partial y}\right)^2 - \frac{\partial^2 f}{\partial x^2}\,\frac{\partial^2 f}{\partial y^2}.
\]
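The following is a minimal numerical sketch of items 3-5. The choice of $\eta(x) = e^x$ as the "unknown" smooth function, the interval $[-1, 1]$, and the grid size are illustrative assumptions, not part of the notes. It shows the truncated Taylor polynomial $P_n(x; 0)$ improving as $n$ increases, and that a least-squares polynomial fit recovers coefficients close to $\eta^{(i)}(0)/i!$ when the remainder is small.

```python
import math
import numpy as np

def taylor_poly(x, n):
    """P_n(x; 0) for eta(x) = exp(x): every derivative of exp at 0 equals 1."""
    return sum((x ** k) / math.factorial(k) for k in range(n + 1))

x = np.linspace(-1.0, 1.0, 201)          # illustrative grid on [-1, 1]
eta = np.exp(x)                          # the "unknown" smooth function

# Items 3-4: the remainder shrinks, so the approximation improves with n.
for n in (1, 2, 4, 8):
    err = np.max(np.abs(eta - taylor_poly(x, n)))
    print(f"n = {n}: max |eta(x) - P_n(x; 0)| = {err:.2e}")

# Item 5: least-squares fit of a degree-4 polynomial; the fitted betas should be
# close to eta^(i)(0)/i! = 1, 1, 1/2, 1/6, 1/24 because the remainder is small.
beta = np.polyfit(x, eta, deg=4)[::-1]   # reverse so beta[i] multiplies x**i
print("fitted betas:", np.round(beta, 4))
```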
1.2 Matrix Theory Terminology and Useful Results

15. If
\[
X = \begin{bmatrix}
x_{11} & x_{12} & x_{13} & \cdots & x_{1k} \\
x_{21} & x_{22} & x_{23} & \cdots & x_{2k} \\
x_{31} & x_{32} & x_{33} & \cdots & x_{3k} \\
\vdots & \vdots & \vdots &        & \vdots \\
x_{n1} & x_{n2} & x_{n3} & \cdots & x_{nk}
\end{bmatrix},
\]
then the symmetric matrix $X'X$ can be written as
\[
X'X = \begin{bmatrix}
\sum_{p=1}^{n} x_{p1}^2 & \sum_{p=1}^{n} x_{p1}x_{p2} & \sum_{p=1}^{n} x_{p1}x_{p3} & \cdots & \sum_{p=1}^{n} x_{p1}x_{pk} \\
 & \sum_{p=1}^{n} x_{p2}^2 & \sum_{p=1}^{n} x_{p2}x_{p3} & \cdots & \sum_{p=1}^{n} x_{p2}x_{pk} \\
 & & \sum_{p=1}^{n} x_{p3}^2 & \cdots & \sum_{p=1}^{n} x_{p3}x_{pk} \\
 & \text{symmetric} & & \ddots & \vdots \\
 & & & & \sum_{p=1}^{n} x_{pk}^2
\end{bmatrix}.
\]

16. Transpose of a product of two matrices: $(AB)' = B'A'$.

17. Transpose of a product of $k$ matrices: If $B = A_1 A_2 \cdots A_{k-1} A_k$, then $B' = A_k' A_{k-1}' \cdots A_2' A_1'$.

18. The trace of a square matrix $A$, denoted $\mathrm{tr}(A)$, is the sum of the diagonal elements of $A$.

19. For two $k$-square matrices $A$ and $B$, $\mathrm{tr}(A \pm B) = \mathrm{tr}(A) \pm \mathrm{tr}(B)$.

20. Given an $m \times n$ matrix $A$ and an $n \times m$ matrix $B$, $\mathrm{tr}(AB) = \mathrm{tr}(BA)$.

21. The rank of a matrix $A$, denoted $\mathrm{rank}(A)$, is the number of linearly independent rows (or columns) of $A$.

22. If the determinant is nonzero for at least one matrix formed from $r$ rows and $r$ columns of matrix $A$, but no matrix formed from $r+1$ rows and $r+1$ columns of $A$ has a nonzero determinant, then the rank of $A$ is $r$.

23. Consider a $k$-square matrix $A$ with $\mathrm{rank}(A) = k$. The $k$-square matrix $A^{-1}$ satisfying $AA^{-1} = A^{-1}A = I_k$ is called the inverse matrix of $A$.

24. A $k$-square matrix $A$ is singular if $A$ is not invertible. This is equivalent to saying $|A| = 0$ or $\mathrm{rank}(A) < k$.

25. Any nonsingular square matrix (i.e., one whose determinant $\neq 0$) has a unique inverse.

26. In the use of least squares as an estimation procedure, it is often required to invert matrices which are symmetric. The inverse matrix is also important as a means of solving sets of simultaneous independent linear equations. If the set of equations is not independent, there is no unique solution.

27. The set of $k$ linearly independent equations
\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1k}x_k &= g_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2k}x_k &= g_2 \\
&\;\;\vdots \\
a_{k1}x_1 + a_{k2}x_2 + \cdots + a_{kk}x_k &= g_k
\end{aligned}
\]
can be written in matrix form as $Ax = g$. Thus, the solution is $x = A^{-1}g$.

28. If $A = \mathrm{diag}(a_1, a_2, \ldots, a_k)$ is a diagonal matrix with nonzero diagonal elements $a_1, a_2, \ldots, a_k$, then $A^{-1} = \mathrm{diag}(1/a_1, 1/a_2, \ldots, 1/a_k)$ is a diagonal matrix with diagonal elements $1/a_1, 1/a_2, \ldots, 1/a_k$.

29. If $S$ is a nonsingular symmetric matrix, then $(S^{-1})' = S^{-1}$. Thus, the inverse of a nonsingular symmetric matrix is itself symmetric.

30. A square matrix $A$ is idempotent if $A^2 = A$.

31. A nonsingular $k$-square matrix $P$ is orthogonal if $P' = P^{-1}$, or equivalently, $PP' = I_k$.

32. Suppose $P$ is a $k$-square orthogonal matrix, $x$ is a $k \times 1$ vector, and $y = Px$ is a $k \times 1$ vector. The transformation $y = Px$ is called an orthogonal transformation.

33. If $y = Px$ is an orthogonal transformation, then $y'y = x'P'Px = x'x$. (A short numerical sketch of several results from this subsection follows.)
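Below is a minimal numerical sketch of items 15, 27, 29, and 31-33. The specific matrices are illustrative assumptions, not part of the notes; the checks confirm that $X'X$ is symmetric, that $x = A^{-1}g$ solves $Ax = g$, that the inverse of a nonsingular symmetric matrix is symmetric, and that an orthogonal transformation preserves $x'x$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Item 15: X'X is symmetric with (i, j) entry sum_p x_pi * x_pj.
X = rng.normal(size=(6, 3))                   # illustrative n = 6, k = 3
XtX = X.T @ X
print(np.allclose(XtX, XtX.T))                # True: X'X is symmetric

# Item 27: a set of k independent equations Ax = g has solution x = A^{-1} g.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])               # nonsingular symmetric example
g = np.array([1.0, 2.0, 3.0])
x = np.linalg.solve(A, g)                     # numerically preferable to inv(A) @ g
print(np.allclose(A @ x, g))                  # True

# Item 29: the inverse of a nonsingular symmetric matrix is itself symmetric.
Ainv = np.linalg.inv(A)
print(np.allclose(Ainv, Ainv.T))              # True

# Items 31-33: an orthogonal transformation y = Px preserves x'x.
P, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # QR factorization yields orthogonal P
y = P @ x
print(np.allclose(y @ y, x @ x))              # True: y'y = x'P'Px = x'x
```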
1.3 Eigenvalues, Eigenvectors, and Quadratic Forms

34. If $A$ is a $k$-square matrix and $\lambda$ is a scalar variable, then $A - \lambda I_k$ is called the characteristic matrix of $A$.

35. The determinant $|A - \lambda I_k| = h(\lambda)$ is called the characteristic function of $A$.

36. The roots of the equation $h(\lambda) = 0$ are called the characteristic roots or eigenvalues of $A$.

37. Suppose $\lambda^*$ is an eigenvalue of a $k$-square matrix $A$. Then an eigenvector associated with $\lambda^*$ is defined as a column vector $x$ which is a solution to $Ax = \lambda^* x$, or $(A - \lambda^* I_k)x = 0$.

38. An important use of eigenvalues and eigenvectors in response surface methodology is in the application to problems of finding optimum experimental conditions.

39. The quadratic form in $k$ variables $x_1, x_2, \ldots, x_k$ is
\[
Q = \sum_{i=1}^{k} b_{ii} x_i^2 + 2 \sum_{i<j} b_{ij} x_i x_j \tag{1}
\]
where we assume the elements $b_{ij}$ ($i = 1, \ldots, k$; $j = 1, \ldots, k$) are real-valued.

40. In matrix notation, $Q = x'Bx$ where
\[
x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_k \end{bmatrix},
\qquad
B = \begin{bmatrix}
b_{11} & b_{12} & \cdots & b_{1k} \\
 & b_{22} & \cdots & b_{2k} \\
 & \text{symmetric} & \ddots & \vdots \\
 & & & b_{kk}
\end{bmatrix}.
\]

41. $B$ and $|B|$ are, respectively, called the matrix and determinant of the quadratic form $Q$. (A short numerical sketch of items 37 and 39-40 follows.)
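The following is a minimal numerical sketch of items 37 and 39-40, using an illustrative $2 \times 2$ symmetric matrix $B$ whose entries are assumptions, not from the notes. It verifies that the sum formula (1) agrees with the matrix form $x'Bx$, and that each eigenvalue/eigenvector pair of $B$ satisfies $Bx = \lambda x$.

```python
import numpy as np

B = np.array([[4.0, 1.0],
              [1.0, 3.0]])                 # matrix of the quadratic form; b12 = b21

# Items 39-40: Q from the sum formula (1) equals the matrix form x'Bx.
x = np.array([2.0, -1.0])
Q_sum = B[0, 0] * x[0]**2 + B[1, 1] * x[1]**2 + 2 * B[0, 1] * x[0] * x[1]
Q_mat = x @ B @ x
print(np.isclose(Q_sum, Q_mat))            # True

# Items 36-37: eigenvalues are roots of |B - lambda I| = 0, and each
# eigenvector v satisfies B v = lambda v.
lam, V = np.linalg.eigh(B)                 # eigh is appropriate for symmetric B
for j in range(2):
    print(np.allclose(B @ V[:, j], lam[j] * V[:, j]))   # True for each pair
```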