Matrix Differential Calculus with Applications to Simple, Hadamard, and Kronecker Products
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Mathematics for Dynamicists - Lecture Notes Thomas Michelitsch
Mathematics for Dynamicists - Lecture Notes Thomas Michelitsch To cite this version: Thomas Michelitsch. Mathematics for Dynamicists - Lecture Notes. Doctoral. Mathematics for Dynamicists, France. 2006. cel-01128826 HAL Id: cel-01128826 https://hal.archives-ouvertes.fr/cel-01128826 Submitted on 10 Mar 2015 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Mathematics for Dynamicists Lecture Notes c 2004-2006 Thomas Michael Michelitsch1 Department of Civil and Structural Engineering The University of Sheffield Sheffield, UK 1Present address : http://www.dalembert.upmc.fr/home/michelitsch/ . Dedicated to the memory of my parents Michael and Hildegard Michelitsch Handout Manuscript with Fractal Cover Picture by Michael Michelitsch https://www.flickr.com/photos/michelitsch/sets/72157601230117490 1 Contents Preface 1 Part I 1.1 Binomial Theorem 1.2 Derivatives, Integrals and Taylor Series 1.3 Elementary Functions 1.4 Complex Numbers: Cartesian and Polar Representation 1.5 Sequences, Series, Integrals and Power-Series 1.6 Polynomials and Partial Fractions -
Block Kronecker Products and the Vecb Operator* CORE View
View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by Elsevier - Publisher Connector Block Kronecker Products and the vecb Operator* Ruud H. Koning Department of Economics University of Groningen P.O. Box 800 9700 AV, Groningen, The Netherlands Heinz Neudecker+ Department of Actuarial Sciences and Econometrics University of Amsterdam Jodenbreestraat 23 1011 NH, Amsterdam, The Netherlands and Tom Wansbeek Department of Economics University of Groningen P.O. Box 800 9700 AV, Groningen, The Netherlands Submitted hv Richard Rrualdi ABSTRACT This paper is concerned with two generalizations of the Kronecker product and two related generalizations of the vet operator. It is demonstrated that they pairwise match two different kinds of matrix partition, viz. the balanced and unbalanced ones. Relevant properties are supplied and proved. A related concept, the so-called tilde transform of a balanced block matrix, is also studied. The results are illustrated with various statistical applications of the five concepts studied. *Comments of an anonymous referee are gratefully acknowledged. ‘This work was started while the second author was at the Indian Statistiral Institute, New Delhi. LINEAR ALGEBRA AND ITS APPLICATIONS 149:165-184 (1991) 165 0 Elsevier Science Publishing Co., Inc., 1991 655 Avenue of the Americas, New York, NY 10010 0024-3795/91/$3.50 166 R. H. KONING, H. NEUDECKER, AND T. WANSBEEK INTRODUCTION Almost twenty years ago Singh [7] and Tracy and Singh [9] introduced a generalization of the Kronecker product A@B. They used varying notation for this new product, viz. A @ B and A &3B. Recently, Hyland and Collins [l] studied the-same product under rather restrictive order conditions. -
The Kronecker Product a Product of the Times
The Kronecker Product A Product of the Times Charles Van Loan Department of Computer Science Cornell University Presented at the SIAM Conference on Applied Linear Algebra, Monterey, Califirnia, October 26, 2009 The Kronecker Product B C is a block matrix whose ij-th block is b C. ⊗ ij E.g., b b b11C b12C 11 12 C = b b ⊗ 21 22 b21C b22C Also called the “Direct Product” or the “Tensor Product” Every bijckl Shows Up c11 c12 c13 b11 b12 c21 c22 c23 b21 b22 ⊗ c31 c32 c33 = b11c11 b11c12 b11c13 b12c11 b12c12 b12c13 b11c21 b11c22 b11c23 b12c21 b12c22 b12c23 b c b c b c b c b c b c 11 31 11 32 11 33 12 31 12 32 12 33 b c b c b c b c b c b c 21 11 21 12 21 13 22 11 22 12 22 13 b21c21 b21c22 b21c23 b22c21 b22c22 b22c23 b21c31 b21c32 b21c33 b22c31 b22c32 b22c33 Basic Algebraic Properties (B C)T = BT CT ⊗ ⊗ (B C) 1 = B 1 C 1 ⊗ − − ⊗ − (B C)(D F ) = BD CF ⊗ ⊗ ⊗ B (C D) =(B C) D ⊗ ⊗ ⊗ ⊗ C B = (Perfect Shuffle)T (B C)(Perfect Shuffle) ⊗ ⊗ R.J. Horn and C.R. Johnson(1991). Topics in Matrix Analysis, Cambridge University Press, NY. Reshaping KP Computations 2 Suppose B, C IRn n and x IRn . ∈ × ∈ The operation y =(B C)x is O(n4): ⊗ y = kron(B,C)*x The equivalent, reshaped operation Y = CXBT is O(n3): y = reshape(C*reshape(x,n,n)*B’,n,n) H.V. -
Matrix Calculus
Appendix D Matrix Calculus From too much study, and from extreme passion, cometh madnesse. Isaac Newton [205, §5] − D.1 Gradient, Directional derivative, Taylor series D.1.1 Gradients Gradient of a differentiable real function f(x) : RK R with respect to its vector argument is defined uniquely in terms of partial derivatives→ ∂f(x) ∂x1 ∂f(x) , ∂x2 RK f(x) . (2053) ∇ . ∈ . ∂f(x) ∂xK while the second-order gradient of the twice differentiable real function with respect to its vector argument is traditionally called the Hessian; 2 2 2 ∂ f(x) ∂ f(x) ∂ f(x) 2 ∂x1 ∂x1∂x2 ··· ∂x1∂xK 2 2 2 ∂ f(x) ∂ f(x) ∂ f(x) 2 2 K f(x) , ∂x2∂x1 ∂x2 ··· ∂x2∂xK S (2054) ∇ . ∈ . .. 2 2 2 ∂ f(x) ∂ f(x) ∂ f(x) 2 ∂xK ∂x1 ∂xK ∂x2 ∂x ··· K interpreted ∂f(x) ∂f(x) 2 ∂ ∂ 2 ∂ f(x) ∂x1 ∂x2 ∂ f(x) = = = (2055) ∂x1∂x2 ³∂x2 ´ ³∂x1 ´ ∂x2∂x1 Dattorro, Convex Optimization Euclidean Distance Geometry, Mεβoo, 2005, v2020.02.29. 599 600 APPENDIX D. MATRIX CALCULUS The gradient of vector-valued function v(x) : R RN on real domain is a row vector → v(x) , ∂v1(x) ∂v2(x) ∂vN (x) RN (2056) ∇ ∂x ∂x ··· ∂x ∈ h i while the second-order gradient is 2 2 2 2 , ∂ v1(x) ∂ v2(x) ∂ vN (x) RN v(x) 2 2 2 (2057) ∇ ∂x ∂x ··· ∂x ∈ h i Gradient of vector-valued function h(x) : RK RN on vector domain is → ∂h1(x) ∂h2(x) ∂hN (x) ∂x1 ∂x1 ··· ∂x1 ∂h1(x) ∂h2(x) ∂hN (x) h(x) , ∂x2 ∂x2 ··· ∂x2 ∇ . -
The Partition Algebra and the Kronecker Product (Extended Abstract)
FPSAC 2013, Paris, France DMTCS proc. (subm.), by the authors, 1–12 The partition algebra and the Kronecker product (Extended abstract) C. Bowman1yand M. De Visscher2 and R. Orellana3z 1Institut de Mathematiques´ de Jussieu, 175 rue du chevaleret, 75013, Paris 2Centre for Mathematical Science, City University London, Northampton Square, London, EC1V 0HB, England. 3 Department of Mathematics, Dartmouth College, 6188 Kemeny Hall, Hanover, NH 03755, USA Abstract. We propose a new approach to study the Kronecker coefficients by using the Schur–Weyl duality between the symmetric group and the partition algebra. Resum´ e.´ Nous proposons une nouvelle approche pour l’etude´ des coefficients´ de Kronecker via la dualite´ entre le groupe symetrique´ et l’algebre` des partitions. Keywords: Kronecker coefficients, tensor product, partition algebra, representations of the symmetric group 1 Introduction A fundamental problem in the representation theory of the symmetric group is to describe the coeffi- cients in the decomposition of the tensor product of two Specht modules. These coefficients are known in the literature as the Kronecker coefficients. Finding a formula or combinatorial interpretation for these coefficients has been described by Richard Stanley as ‘one of the main problems in the combinatorial rep- resentation theory of the symmetric group’. This question has received the attention of Littlewood [Lit58], James [JK81, Chapter 2.9], Lascoux [Las80], Thibon [Thi91], Garsia and Remmel [GR85], Kleshchev and Bessenrodt [BK99] amongst others and yet a combinatorial solution has remained beyond reach for over a hundred years. Murnaghan discovered an amazing limiting phenomenon satisfied by the Kronecker coefficients; as we increase the length of the first row of the indexing partitions the sequence of Kronecker coefficients obtained stabilises. -
Matrix Calculus
D Matrix Calculus D–1 Appendix D: MATRIX CALCULUS D–2 In this Appendix we collect some useful formulas of matrix calculus that often appear in finite element derivations. §D.1 THE DERIVATIVES OF VECTOR FUNCTIONS Let x and y be vectors of orders n and m respectively: x1 y1 x2 y2 x = . , y = . ,(D.1) . xn ym where each component yi may be a function of all the xj , a fact represented by saying that y is a function of x,or y = y(x). (D.2) If n = 1, x reduces to a scalar, which we call x.Ifm = 1, y reduces to a scalar, which we call y. Various applications are studied in the following subsections. §D.1.1 Derivative of Vector with Respect to Vector The derivative of the vector y with respect to vector x is the n × m matrix ∂y1 ∂y2 ∂ym ∂ ∂ ··· ∂ x1 x1 x1 ∂ ∂ ∂ ∂ y1 y2 ··· ym y def ∂x ∂x ∂x = 2 2 2 (D.3) ∂x . . .. ∂ ∂ ∂ y1 y2 ··· ym ∂xn ∂xn ∂xn §D.1.2 Derivative of a Scalar with Respect to Vector If y is a scalar, ∂y ∂x1 ∂ ∂ y y def ∂ = x2 .(D.4) ∂x . . ∂y ∂xn §D.1.3 Derivative of Vector with Respect to Scalar If x is a scalar, ∂y def ∂ ∂ ∂ = y1 y2 ... ym (D.5) ∂x ∂x ∂x ∂x D–2 D–3 §D.1 THE DERIVATIVES OF VECTOR FUNCTIONS REMARK D.1 Many authors, notably in statistics and economics, define the derivatives as the transposes of those given above.1 This has the advantage of better agreement of matrix products with composition schemes such as the chain rule. -
Kronecker Products
Copyright ©2005 by the Society for Industrial and Applied Mathematics This electronic version is for personal use and may not be duplicated or distributed. Chapter 13 Kronecker Products 13.1 Definition and Examples Definition 13.1. Let A ∈ Rm×n, B ∈ Rp×q . Then the Kronecker product (or tensor product) of A and B is defined as the matrix a11B ··· a1nB ⊗ = . ∈ Rmp×nq A B . .. . (13.1) am1B ··· amnB Obviously, the same definition holds if A and B are complex-valued matrices. We restrict our attention in this chapter primarily to real-valued matrices, pointing out the extension to the complex case only where it is not obvious. Example 13.2. = 123 = 21 1. Let A 321and B 23. Then 214263 B 2B 3B 234669 A ⊗ B = = . 3B 2BB 634221 694623 Note that B ⊗ A = A ⊗ B. ∈ Rp×q ⊗ = B 0 2. For any B , I2 B 0 B . Replacing I2 by In yields a block diagonal matrix with n copies of B along the diagonal. 3. Let B be an arbitrary 2 × 2 matrix. Then b11 0 b12 0 0 b11 0 b12 B ⊗ I2 = . b21 0 b22 0 0 b21 0 b22 139 “ajlbook” — 2004/11/9 — 13:36 — page 139 — #147 From "Matrix Analysis for Scientists and Engineers" Alan J. Laub. Buy this book from SIAM at www.ec-securehost.com/SIAM/ot91.html Copyright ©2005 by the Society for Industrial and Applied Mathematics This electronic version is for personal use and may not be duplicated or distributed. 140 Chapter 13. Kronecker Products The extension to arbitrary B and In is obvious. -
A Constructive Arbitrary-Degree Kronecker Product Decomposition of Tensors
A CONSTRUCTIVE ARBITRARY-DEGREE KRONECKER PRODUCT DECOMPOSITION OF TENSORS KIM BATSELIER AND NGAI WONG∗ Abstract. We propose the tensor Kronecker product singular value decomposition (TKPSVD) that decomposes a real k-way tensor A into a linear combination of tensor Kronecker products PR (d) (1) with an arbitrary number of d factors A = j=1 σj Aj ⊗ · · · ⊗ Aj . We generalize the matrix (i) Kronecker product to tensors such that each factor Aj in the TKPSVD is a k-way tensor. The algorithm relies on reshaping and permuting the original tensor into a d-way tensor, after which a polyadic decomposition with orthogonal rank-1 terms is computed. We prove that for many (1) (d) different structured tensors, the Kronecker product factors Aj ;:::; Aj are guaranteed to inherit this structure. In addition, we introduce the new notion of general symmetric tensors, which includes many different structures such as symmetric, persymmetric, centrosymmetric, Toeplitz and Hankel tensors. Key words. Kronecker product, structured tensors, tensor decomposition, TTr1SVD, general- ized symmetric tensors, Toeplitz tensor, Hankel tensor AMS subject classifications. 15A69, 15B05, 15A18, 15A23, 15B57 1. Introduction. Consider the singular value decomposition (SVD) of the fol- lowing 16 × 9 matrix A~ 0 1:108 −0:267 −1:192 −0:267 −1:192 −1:281 −1:192 −1:281 1:102 1 B 0:417 −1:487 −0:004 −1:487 −0:004 −1:418 −0:004 −1:418 −0:228C B C B−0:127 1:100 −1:461 1:100 −1:461 0:729 −1:461 0:729 0:940 C B C B−0:748 −0:243 0:387 −0:243 0:387 −1:241 0:387 −1:241 −1:853C B C B 0:417 -
Introduction to Kronecker Products
Math 515 Fall, 2010 Introduction to Kronecker Products If A is an m × n matrix and B is a p × q matrix, then the Kronecker product of A and B is the mp × nq matrix 2 3 a11B a12B ··· a1nB 6 7 6 a21B a22B ··· a2nB 7 A ⊗ B = 6 . 7 6 . 7 4 . 5 am1B am2B ··· amnB Note that if A and B are large matrices, then the Kronecker product A⊗B will be huge. MATLAB has a built-in function kron that can be used as K = kron(A, B); However, you will quickly run out of memory if you try this for matrices that are 50 × 50 or larger. Fortunately we can exploit the block structure of Kronecker products to do many compu- tations involving A ⊗ B without actually forming the Kronecker product. Instead we need only do computations with A and B individually. To begin, we state some very simple properties of Kronecker products, which should not be difficult to verify: • (A ⊗ B)T = AT ⊗ BT • If A and B are square and nonsingular, then (A ⊗ B)−1 = A−1 ⊗ B−1 • (A ⊗ B)(C ⊗ D) = AC ⊗ BD The next property we want to consider involves the matrix-vector multiplication y = (A ⊗ B)x; where A 2 Rm×n and B 2 Rp×q. Thus A ⊗ B 2 Rmp×nq, x 2 Rnq, and y 2 Rmp. Our goal is to exploit the block structure of the Kronecker product matrix to compute y without explicitly forming (A ⊗ B). To see how this can be done, first partition the vectors x and y as 2 3 2 3 x1 y1 6 x 7 6 y 7 6 2 7 q 6 2 7 p x = 6 . -
Higher Order Derivatives 1 Chapter 3 Higher Order Derivatives
Higher order derivatives 1 Chapter 3 Higher order derivatives You certainly realize from single-variable calculus how very important it is to use derivatives of orders greater than one. The same is of course true for multivariable calculus. In particular, we shall de¯nitely want a \second derivative test" for critical points. A. Partial derivatives First we need to clarify just what sort of domains we wish to consider for our functions. So we give a couple of de¯nitions. n DEFINITION. Suppose x0 2 A ½ R . We say that x0 is an interior point of A if there exists r > 0 such that B(x0; r) ½ A. A DEFINITION. A set A ½ Rn is said to be open if every point of A is an interior point of A. EXAMPLE. An open ball is an open set. (So we are fortunate in our terminology!) To see this suppose A = B(x; r). Then if y 2 A, kx ¡ yk < r and we can let ² = r ¡ kx ¡ yk: Then B(y; ²) ½ A. For if z 2 B(y; ²), then the triangle inequality implies kz ¡ xk · kz ¡ yk + ky ¡ xk < ² + ky ¡ xk = r; so that z 2 A. Thus B(y; ²) ½ B(x; r) (cf. Problem 1{34). This proves that y is an interior point of A. As y is an arbitrary point of A, we conclude that A is open. PROBLEM 3{1. Explain why the empty set is an open set and why this is actually a problem about the logic of language rather than a problem about mathematics. -
Kronecker Products and Matrix Calculus in System Theory
772 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS, VOL. CAS-25, NO. 9, ISEFTEMBER 1978 Kronecker Products and Matrix Calculus in System Theory JOHN W. BREWER I Absfrucr-The paper begins with a review of the algebras related to and are based on the developments of Section IV. Also, a Kronecker products. These algebras have several applications in system novel and interesting operator notation will ble introduced theory inclluding the analysis of stochastic steady state. The calculus of matrk valued functions of matrices is reviewed in the second part of the in Section VI. paper. This cakuhs is then used to develop an interesting new method for Concluding comments are placed in Section VII. the identification of parameters of linear time-invariant system models. II. NOTATION I. INTRODUCTION Matrices will be denoted by upper case boldface (e.g., HE ART of differentiation has many practical ap- A) and column matrices (vectors) will be denoted by T plications in system analysis. Differentiation is used lower case boldface (e.g., x). The kth row of a matrix such. to obtain necessaryand sufficient conditions for optimiza- as A will be denoted A,. and the kth colmnn will be tion in a.nalytical studies or it is used to obtain gradients denoted A.,. The ik element of A will be denoted ujk. and Hessians in numerical optimization studies. Sensitiv- The n x n unit matrix is denoted I,,. The q-dimensional ity analysis is yet another application area for. differentia- vector which is “I” in the kth and zero elsewhereis called tion formulas. In turn, sensitivity analysis is applied to the the unit vector and is denoted design of feedback and adaptive feedback control sys- tems. -
Appendix D Matrix Calculus
Appendix D Matrix calculus From too much study, and from extreme passion, cometh madnesse. Isaac Newton [86, §5] − D.1 Directional derivative, Taylor series D.1.1 Gradients Gradient of a differentiable real function f(x) : RK R with respect to its vector domain is defined → ∂f(x) ∂x1 ∂f(x) ∆ K f(x) = ∂x2 R (1354) ∇ . ∈ . ∂f(x) ∂xK while the second-order gradient of the twice differentiable real function with respect to its vector domain is traditionally called the Hessian ; ∂2f(x) ∂2f(x) ∂2f(x) 2 ∂ x1 ∂x1∂x2 ··· ∂x1∂xK ∂2f(x) ∂2f(x) ∂2f(x) ∆ 2 K 2 ∂x2∂x1 ∂ x2 ∂x2∂x S f(x) = . ··· . K (1355) ∇ . ... ∈ . ∂2f(x) ∂2f(x) ∂2f(x) 2 1 2 ∂xK ∂x ∂xK ∂x ··· ∂ xK © 2001 Jon Dattorro. CO&EDG version 04.18.2006. All rights reserved. 501 Citation: Jon Dattorro, Convex Optimization & Euclidean Distance Geometry, Meboo Publishing USA, 2005. 502 APPENDIX D. MATRIX CALCULUS The gradient of vector-valued function v(x) : R RN on real domain is a row-vector → ∆ v(x) = ∂v1(x) ∂v2(x) ∂vN (x) RN (1356) ∇ ∂x ∂x ··· ∂x ∈ h i while the second-order gradient is 2 2 2 2 ∆ ∂ v1(x) ∂ v2(x) ∂ vN (x) N v(x) = 2 2 2 R (1357) ∇ ∂x ∂x ··· ∂x ∈ h i Gradient of vector-valued function h(x) : RK RN on vector domain is → ∂h1(x) ∂h2(x) ∂hN (x) ∂x1 ∂x1 ··· ∂x1 ∂h1(x) ∂h2(x) ∂hN (x) ∆ ∂x2 ∂x2 ∂x2 h(x) = . ··· . ∇ . . (1358) ∂h1(x) ∂h2(x) ∂hN (x) ∂x ∂x ∂x K K ··· K K×N = [ h (x) h (x) hN (x) ] R ∇ 1 ∇ 2 · · · ∇ ∈ while the second-order gradient has a three-dimensional representation dubbed cubix ;D.1 ∂h1(x) ∂h2(x) ∂hN (x) ∇ ∂x1 ∇ ∂x1 · · · ∇ ∂x1 ∂h1(x) ∂h2(x) ∂hN (x) ∆ 2 ∂x2 ∂x2 ∂x2 h(x) = ∇ .