Matrix Calculus

Sourya Dey

1 Notation

• Scalars are written as lower case letters.
• Vectors are written as lower case bold letters, such as $x$, and can be either row (dimensions $1 \times n$) or column (dimensions $n \times 1$). Column vectors are the default choice, unless otherwise mentioned. Individual elements are indexed by subscripts, such as $x_i$ ($i \in \{1, \cdots, n\}$).
• Matrices are written as upper case bold letters, such as $X$, and have dimensions $m \times n$ corresponding to $m$ rows and $n$ columns. Individual elements are indexed by double subscripts for row and column, such as $X_{ij}$ ($i \in \{1, \cdots, m\}$, $j \in \{1, \cdots, n\}$).
• Occasionally higher order tensors occur, such as 3rd order with dimensions $m \times n \times p$, etc.

Note that a matrix is a 2nd order tensor. A row vector is a matrix with 1 row, and a column vector is a matrix with 1 column. A scalar is a matrix with 1 row and 1 column. Essentially, scalars and vectors are special cases of matrices.

The derivative of $f$ with respect to $x$ is $\frac{\partial f}{\partial x}$. Both $x$ and $f$ can be a scalar, vector, or matrix, leading to 9 types of derivatives. The gradient of $f$ w.r.t. $x$ is $\nabla_x f = \left(\frac{\partial f}{\partial x}\right)^T$, i.e. the gradient is the transpose of the derivative. The gradient at any point $x_0$ in the domain has a physical interpretation: its direction is the direction of maximum increase of the function $f$ at the point $x_0$, and its magnitude is the rate of increase in that direction. We do not generally deal with the gradient when $x$ is a scalar.

2 Basic Rules

This document follows the numerator layout convention. There is an alternative denominator layout convention, where several results are transposed. Do not mix different layout conventions.

We'll first state the most general matrix-by-matrix derivative type. All other types are simplifications, since scalars and vectors are special cases of matrices. Consider a function $F(\cdot)$ which maps $m \times n$ matrices to $p \times q$ matrices, i.e. domain $\subset \mathbb{R}^{m \times n}$ and range $\subset \mathbb{R}^{p \times q}$. So, $F(\cdot) : \underset{m \times n}{X} \mapsto \underset{p \times q}{F(X)}$. Its derivative $\frac{\partial F}{\partial X}$ is a 4th order tensor of dimensions $p \times q \times n \times m$. This is an outer matrix of dimensions $n \times m$ (transposed dimensions of the denominator $X$), with each element being a $p \times q$ inner matrix (same dimensions as the numerator $F$). It is given as:

$$\frac{\partial F}{\partial X} = \begin{bmatrix} \dfrac{\partial F}{\partial X_{1,1}} & \cdots & \dfrac{\partial F}{\partial X_{m,1}} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial F}{\partial X_{1,n}} & \cdots & \dfrac{\partial F}{\partial X_{m,n}} \end{bmatrix} \tag{1a}$$

which has $n$ rows and $m$ columns, and whose $(i,j)$th element is given as:

$$\frac{\partial F}{\partial X_{i,j}} = \begin{bmatrix} \dfrac{\partial F_{1,1}}{\partial X_{i,j}} & \cdots & \dfrac{\partial F_{1,q}}{\partial X_{i,j}} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial F_{p,1}}{\partial X_{i,j}} & \cdots & \dfrac{\partial F_{p,q}}{\partial X_{i,j}} \end{bmatrix} \tag{1b}$$

which has $p$ rows and $q$ columns.

Whew! Now that that's out of the way, let's get to some general rules (in the following, $x$ and $y$ can each represent a scalar, vector or matrix):

• The derivative $\frac{\partial y}{\partial x}$ always has outer matrix dimensions = transposed dimensions of the denominator $x$, and each individual element (inner matrix) has dimensions = same dimensions as the numerator $y$. If you do a calculation and the dimensions don't come out right, the answer is not correct.
• Derivatives usually obey the chain rule, i.e. $\dfrac{\partial f(g(x))}{\partial x} = \dfrac{\partial f(g(x))}{\partial g(x)} \dfrac{\partial g(x)}{\partial x}$.
• Derivatives usually obey the product rule, i.e. $\dfrac{\partial f(x)g(x)}{\partial x} = f(x)\dfrac{\partial g(x)}{\partial x} + \dfrac{\partial f(x)}{\partial x}g(x)$.

3 Types of derivatives

3.1 Scalar by scalar

Nothing special here. The derivative is a scalar, and can also be written as $f'(x)$. For example, if $f(x) = \sin x$, then $f'(x) = \cos x$.

3.2 Scalar by vector

$f(\cdot) : \underset{m \times 1}{x} \mapsto \underset{1 \times 1}{f(x)}$. For this, the derivative is a $1 \times m$ row vector:

$$\frac{\partial f}{\partial x} = \begin{bmatrix} \dfrac{\partial f}{\partial x_1} & \dfrac{\partial f}{\partial x_2} & \cdots & \dfrac{\partial f}{\partial x_m} \end{bmatrix} \tag{2}$$

The gradient $\nabla_x f$ is its transposed column vector.

3.3 Vector by scalar

$f(\cdot) : \underset{1 \times 1}{x} \mapsto \underset{n \times 1}{f(x)}$. For this, the derivative is an $n \times 1$ column vector:

$$\frac{\partial f}{\partial x} = \begin{bmatrix} \dfrac{\partial f_1}{\partial x} \\ \dfrac{\partial f_2}{\partial x} \\ \vdots \\ \dfrac{\partial f_n}{\partial x} \end{bmatrix} \tag{3}$$

3.4 Vector by vector

$f(\cdot) : \underset{m \times 1}{x} \mapsto \underset{n \times 1}{f(x)}$. The derivative, also known as the Jacobian, is a matrix of dimensions $n \times m$. Its $(i,j)$th element is the scalar derivative of the $i$th output component w.r.t. the $j$th input component, i.e.:

$$\frac{\partial f}{\partial x} = \begin{bmatrix} \dfrac{\partial f_1}{\partial x_1} & \cdots & \dfrac{\partial f_1}{\partial x_m} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial f_n}{\partial x_1} & \cdots & \dfrac{\partial f_n}{\partial x_m} \end{bmatrix} \tag{4}$$
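As a quick sanity check of the numerator layout convention, the Jacobian in (4) can be approximated by central finite differences and compared against a hand-computed answer. The sketch below assumes NumPy; the helper `numerical_jacobian`, the test function, and the step size are illustrative choices, not part of the notes.

```python
import numpy as np

def numerical_jacobian(f, x, h=1e-6):
    """Finite-difference Jacobian in numerator layout: rows index
    the n outputs, columns index the m inputs, matching Eq. (4)."""
    n, m = f(x).shape[0], x.shape[0]
    J = np.zeros((n, m))
    for j in range(m):
        e = np.zeros(m)
        e[j] = h
        J[:, j] = (f(x + e) - f(x - e)) / (2 * h)  # central difference
    return J

# Example: f maps R^2 -> R^3, so the Jacobian must be 3 x 2.
f = lambda x: np.array([x[0] * x[1], np.sin(x[0]), x[1] ** 2])
x0 = np.array([1.0, 2.0])
analytic = np.array([[x0[1],         x0[0]],
                     [np.cos(x0[0]), 0.0  ],
                     [0.0,           2 * x0[1]]])
print(np.allclose(numerical_jacobian(f, x0), analytic, atol=1e-5))  # True
```

A mismatch in shape here (e.g. getting $m \times n$ instead of $n \times m$) is usually a sign that the two layout conventions have been mixed.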
3.4.1 Special case – Vectorized scalar function

This is a scalar-scalar function applied element-wise to a vector, and is denoted by $f(\cdot) : \underset{m \times 1}{x} \mapsto \underset{m \times 1}{f(x)}$. For example:

$$f\left(\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix}\right) = \begin{bmatrix} f(x_1) \\ f(x_2) \\ \vdots \\ f(x_m) \end{bmatrix} \tag{5}$$

In this case, both the derivative and the gradient are the same $m \times m$ diagonal matrix, given as:

$$\nabla_x f = \frac{\partial f}{\partial x} = \begin{bmatrix} f'(x_1) & & & 0 \\ & f'(x_2) & & \\ & & \ddots & \\ 0 & & & f'(x_m) \end{bmatrix} \tag{6}$$

where $f'(x_i) = \dfrac{\partial f(x_i)}{\partial x_i}$.

Note: Some texts take the derivative of a vectorized scalar function by taking element-wise derivatives to get an $m \times 1$ vector. To avoid confusion with (6), we will refer to this as $f'(x)$:

$$f'(x) = \begin{bmatrix} f'(x_1) \\ f'(x_2) \\ \vdots \\ f'(x_m) \end{bmatrix} \tag{7}$$

To see the effect of this, let's say we want to multiply the gradient from (6) with some $m$-dimensional vector $a$. This would result in:

$$\nabla_x f \, a = \begin{bmatrix} f'(x_1)\,a_1 \\ f'(x_2)\,a_2 \\ \vdots \\ f'(x_m)\,a_m \end{bmatrix} \tag{8}$$

Achieving the same result with $f'(x)$ from (7) would require the Hadamard product $\circ$, defined as element-wise multiplication of 2 vectors:

$$f'(x) \circ a = \begin{bmatrix} f'(x_1)\,a_1 \\ f'(x_2)\,a_2 \\ \vdots \\ f'(x_m)\,a_m \end{bmatrix} \tag{9}$$

3.4.2 Special Case – Hessian

Consider the type of function in Sec. 3.2, i.e. $f(\cdot) : \underset{m \times 1}{x} \mapsto \underset{1 \times 1}{f(x)}$. Its gradient is a vector-to-vector function given as $\nabla_x f(\cdot) : \underset{m \times 1}{x} \mapsto \underset{m \times 1}{\nabla_x f(x)}$. The transpose of its derivative is the Hessian:

$$H = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x_1^2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_m} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial^2 f}{\partial x_m \partial x_1} & \cdots & \dfrac{\partial^2 f}{\partial x_m^2} \end{bmatrix} \tag{10}$$

i.e. $H = \left(\dfrac{\partial \nabla_x f}{\partial x}\right)^T$. If the derivatives are continuous, then $\dfrac{\partial^2 f}{\partial x_i \partial x_j} = \dfrac{\partial^2 f}{\partial x_j \partial x_i}$, so the Hessian is symmetric.

3.5 Scalar by matrix

$f(\cdot) : \underset{m \times n}{X} \mapsto \underset{1 \times 1}{f(X)}$. In this case, the derivative is an $n \times m$ matrix:

$$\frac{\partial f}{\partial X} = \begin{bmatrix} \dfrac{\partial f}{\partial X_{1,1}} & \cdots & \dfrac{\partial f}{\partial X_{m,1}} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial f}{\partial X_{1,n}} & \cdots & \dfrac{\partial f}{\partial X_{m,n}} \end{bmatrix} \tag{11}$$

The gradient has the same dimensions as the input matrix, i.e. $m \times n$.

3.6 Matrix by scalar

$F(\cdot) : \underset{1 \times 1}{x} \mapsto \underset{p \times q}{F(x)}$. In this case, the derivative is a $p \times q$ matrix:

$$\frac{\partial F}{\partial x} = \begin{bmatrix} \dfrac{\partial F_{1,1}}{\partial x} & \cdots & \dfrac{\partial F_{1,q}}{\partial x} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial F_{p,1}}{\partial x} & \cdots & \dfrac{\partial F_{p,q}}{\partial x} \end{bmatrix} \tag{12}$$

3.7 Vector by matrix

$f(\cdot) : \underset{m \times n}{X} \mapsto \underset{p \times 1}{f(X)}$. In this case, the derivative is a 3rd-order tensor with dimensions $p \times n \times m$. This is the same $n \times m$ matrix as in (11), but with the scalar $f$ replaced by the $p$-dimensional vector $f$, i.e.:

$$\frac{\partial f}{\partial X} = \begin{bmatrix} \dfrac{\partial f}{\partial X_{1,1}} & \cdots & \dfrac{\partial f}{\partial X_{m,1}} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial f}{\partial X_{1,n}} & \cdots & \dfrac{\partial f}{\partial X_{m,n}} \end{bmatrix} \tag{13}$$

3.8 Matrix by vector

$F(\cdot) : \underset{m \times 1}{x} \mapsto \underset{p \times q}{F(x)}$. In this case, the derivative is a 3rd-order tensor with dimensions $p \times q \times m$. This is the same $1 \times m$ row vector as in (2), but with the scalar $f$ replaced by the $p \times q$ matrix $F$, i.e.:

$$\frac{\partial F}{\partial x} = \begin{bmatrix} \dfrac{\partial F}{\partial x_1} & \dfrac{\partial F}{\partial x_2} & \cdots & \dfrac{\partial F}{\partial x_m} \end{bmatrix} \tag{14}$$
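The scalar-by-matrix convention of Sec. 3.5 can be checked the same way. The sketch below (again assuming NumPy) uses $f(X) = a^T X b$, a standard example that is not from the text: its gradient is the $m \times n$ matrix $ab^T$, and the numerator-layout derivative (11) is the transpose $ba^T$.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 4
a, b = rng.standard_normal(m), rng.standard_normal(n)
X = rng.standard_normal((m, n))
f = lambda X: a @ X @ b  # scalar-valued function of an m x n matrix

# Finite-difference gradient, same m x n shape as X (Sec. 3.5).
h = 1e-6
grad = np.zeros((m, n))
for i in range(m):
    for j in range(n):
        E = np.zeros((m, n))
        E[i, j] = h
        grad[i, j] = (f(X + E) - f(X - E)) / (2 * h)

print(np.allclose(grad, np.outer(a, b), atol=1e-6))   # gradient = a b^T, m x n
print(np.allclose(grad.T, np.outer(b, a), atol=1e-6)) # derivative (11) = b a^T, n x m
```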
4 Operations and Examples

4.1 Commutation

If things normally don't commute (such as for matrices, $AB \neq BA$), then their order should be maintained when taking derivatives. If things normally commute (such as for the vector inner product, $a \cdot b = b \cdot a$), their order can be switched when taking derivatives. Output dimensions must always come out right.

For example, let $f(x) = (a^T x)\,b$, where $a^T$ is $1 \times m$, $x$ is $m \times 1$, and $b$ is $n \times 1$, so that $f(x)$ is $n \times 1$. The derivative $\frac{\partial f}{\partial x}$ should be an $n \times m$ matrix. Keeping the order fixed, we get $\frac{\partial f}{\partial x} = a^T \frac{\partial x}{\partial x}\, b = a^T I b = a^T b$, whose dimensions do not come out right. But since the inner product $a^T x$ is a scalar, it commutes with $b$: $f(x) = b\,(a^T x) = (b\,a^T)\,x$, giving $\frac{\partial f}{\partial x} = b\,a^T$, which has the correct dimensions $n \times m$.
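A numerical check of this commutation example, under the same NumPy assumptions as the earlier sketches: the finite-difference Jacobian of $f(x) = (a^T x)\,b$ should match $ba^T$.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 4, 3
a, x = rng.standard_normal(m), rng.standard_normal(m)
b = rng.standard_normal(n)
f = lambda x: (a @ x) * b  # maps R^m -> R^n

# Central-difference Jacobian, n x m in numerator layout (Eq. 4).
h = 1e-6
J = np.zeros((n, m))
for j in range(m):
    e = np.zeros(m)
    e[j] = h
    J[:, j] = (f(x + e) - f(x - e)) / (2 * h)

print(np.allclose(J, np.outer(b, a), atol=1e-6))  # True: derivative is b a^T
```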