The Matrix Cookbook
[ http://matrixcookbook.com ]

Kaare Brandt Petersen
Michael Syskind Pedersen

Version: November 15, 2012

Introduction

What is this? These pages are a collection of facts (identities, approximations, inequalities, relations, ...) about matrices and matters relating to them. It is collected in this form for the convenience of anyone who wants a quick desktop reference.

Disclaimer: The identities, approximations and relations presented here were obviously not invented but collected, borrowed and copied from a large number of sources. These sources include similar but shorter notes found on the internet and appendices in books - see the references for a full list.

Errors: Very likely there are errors, typos, and mistakes, for which we apologize. We would be grateful to receive corrections at [email protected].

It's ongoing: The project of keeping a large repository of relations involving matrices is naturally ongoing, and the version will be apparent from the date in the header.

Suggestions: Your suggestions for additional content or elaboration of some topics are most welcome at [email protected].

Keywords: Matrix algebra, matrix relations, matrix identities, derivative of determinant, derivative of inverse matrix, differentiate a matrix.

Acknowledgements: We would like to thank the following for contributions and suggestions: Bill Baxter, Brian Templeton, Christian Rishøj, Christian Schröppel, Dan Boley, Douglas L. Theobald, Esben Hoegh-Rasmussen, Evripidis Karseras, Georg Martius, Glynne Casteel, Jan Larsen, Jun Bin Gao, Jürgen Struckmeier, Kamil Dedecius, Karim T. Abou-Moustafa, Korbinian Strimmer, Lars Christiansen, Lars Kai Hansen, Leland Wilkinson, Liguo He, Loic Thibaut, Markus Froeb, Michael Hubatka, Miguel Barão, Ole Winther, Pavel Sakov, Stephan Hattinger, Troels Pedersen, Vasile Sima, Vincent Rabaud, Zhaoshui He. We would also like to thank The Oticon Foundation for funding our PhD studies.
Petersen & Pedersen, The Matrix Cookbook, Version: November 15, 2012

Contents

1 Basics
    1.1 Trace
    1.2 Determinant
    1.3 The Special Case 2x2
2 Derivatives
    2.1 Derivatives of a Determinant
    2.2 Derivatives of an Inverse
    2.3 Derivatives of Eigenvalues
    2.4 Derivatives of Matrices, Vectors and Scalar Forms
    2.5 Derivatives of Traces
    2.6 Derivatives of vector norms
    2.7 Derivatives of matrix norms
    2.8 Derivatives of Structured Matrices
3 Inverses
    3.1 Basic
    3.2 Exact Relations
    3.3 Implication on Inverses
    3.4 Approximations
    3.5 Generalized Inverse
    3.6 Pseudo Inverse
4 Complex Matrices
    4.1 Complex Derivatives
    4.2 Higher order and non-linear derivatives
    4.3 Inverse of complex sum
5 Solutions and Decompositions
    5.1 Solutions to linear equations
    5.2 Eigenvalues and Eigenvectors
    5.3 Singular Value Decomposition
    5.4 Triangular Decomposition
    5.5 LU decomposition
    5.6 LDM decomposition
    5.7 LDL decompositions
6 Statistics and Probability
    6.1 Definition of Moments
    6.2 Expectation of Linear Combinations
    6.3 Weighted Scalar Variable
7 Multivariate Distributions
    7.1 Cauchy
    7.2 Dirichlet
    7.3 Normal
    7.4 Normal-Inverse Gamma
    7.5 Gaussian
    7.6 Multinomial
    7.7 Student's t
    7.8 Wishart
    7.9 Wishart, Inverse
8 Gaussians
    8.1 Basics
    8.2 Moments
    8.3 Miscellaneous
    8.4 Mixture of Gaussians
9 Special Matrices
    9.1 Block matrices
    9.2 Discrete Fourier Transform Matrix, The
    9.3 Hermitian Matrices and skew-Hermitian
    9.4 Idempotent Matrices
    9.5 Orthogonal matrices
    9.6 Positive Definite and Semi-definite Matrices
    9.7 Singleentry Matrix, The
    9.8 Symmetric, Skew-symmetric/Antisymmetric
    9.9 Toeplitz Matrices
    9.10 Transition matrices
    9.11 Units, Permutation and Shift
    9.12 Vandermonde Matrices
10 Functions and Operators
    10.1 Functions and Series
    10.2 Kronecker and Vec Operator
    10.3 Vector Norms
    10.4 Matrix Norms
    10.5 Rank
    10.6 Integral Involving Dirac Delta Functions
    10.7 Miscellaneous
A One-dimensional Results
    A.1 Gaussian
    A.2 One Dimensional Mixture of Gaussians
B Proofs and Details
    B.1 Misc Proofs

Notation and Nomenclature

$\mathbf{A}$                  Matrix
$\mathbf{A}_{ij}$             Matrix indexed for some purpose
$\mathbf{A}_i$                Matrix indexed for some purpose
$\mathbf{A}^{ij}$             Matrix indexed for some purpose
$\mathbf{A}^n$                Matrix indexed for some purpose, or the n.th power of a square matrix
$\mathbf{A}^{-1}$             The inverse matrix of the matrix $\mathbf{A}$
$\mathbf{A}^+$                The pseudo inverse matrix of the matrix $\mathbf{A}$ (see Sec. 3.6)
$\mathbf{A}^{1/2}$            The square root of a matrix (if unique), not elementwise
$(\mathbf{A})_{ij}$           The $(i,j)$.th entry of the matrix $\mathbf{A}$
$A_{ij}$                      The $(i,j)$.th entry of the matrix $\mathbf{A}$
$[\mathbf{A}]_{ij}$           The $ij$-submatrix, i.e. $\mathbf{A}$ with i.th row and j.th column deleted
$\mathbf{a}$                  Vector (column-vector)
$\mathbf{a}_i$                Vector indexed for some purpose
$a_i$                         The i.th element of the vector $\mathbf{a}$
$a$                           Scalar
$\Re z$                       Real part of a scalar
$\Re \mathbf{z}$              Real part of a vector
$\Re \mathbf{Z}$              Real part of a matrix
$\Im z$                       Imaginary part of a scalar
$\Im \mathbf{z}$              Imaginary part of a vector
$\Im \mathbf{Z}$              Imaginary part of a matrix
$\det(\mathbf{A})$            Determinant of $\mathbf{A}$
$\mathrm{Tr}(\mathbf{A})$     Trace of the matrix $\mathbf{A}$
$\mathrm{diag}(\mathbf{A})$   Diagonal matrix of the matrix $\mathbf{A}$, i.e. $(\mathrm{diag}(\mathbf{A}))_{ij} = \delta_{ij} A_{ij}$
$\mathrm{eig}(\mathbf{A})$    Eigenvalues of the matrix $\mathbf{A}$
$\mathrm{vec}(\mathbf{A})$    The vector-version of the matrix $\mathbf{A}$ (see Sec. 10.2.2)
$\sup$                        Supremum of a set
$\|\mathbf{A}\|$              Matrix norm (subscript, if any, denotes what norm)
$\mathbf{A}^T$                Transposed matrix
$\mathbf{A}^{-T}$             The inverse of the transposed and vice versa, $\mathbf{A}^{-T} = (\mathbf{A}^{-1})^T = (\mathbf{A}^T)^{-1}$
$\mathbf{A}^*$                Complex conjugated matrix
$\mathbf{A}^H$                Transposed and complex conjugated matrix (Hermitian)
$\mathbf{A} \circ \mathbf{B}$     Hadamard (elementwise) product
$\mathbf{A} \otimes \mathbf{B}$   Kronecker product
$\mathbf{0}$                  The null matrix. Zero in all entries.
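$\mathbf{I}$                  The identity matrix
$\mathbf{J}^{ij}$             The single-entry matrix, 1 at $(i,j)$ and zero elsewhere
$\boldsymbol{\Sigma}$         A positive definite matrix
$\boldsymbol{\Lambda}$        A diagonal matrix

1 Basics

$$ (\mathbf{A}\mathbf{B})^{-1} = \mathbf{B}^{-1}\mathbf{A}^{-1} \tag{1} $$
$$ (\mathbf{A}\mathbf{B}\mathbf{C}\ldots)^{-1} = \ldots\mathbf{C}^{-1}\mathbf{B}^{-1}\mathbf{A}^{-1} \tag{2} $$
$$ (\mathbf{A}^T)^{-1} = (\mathbf{A}^{-1})^T \tag{3} $$
$$ (\mathbf{A}+\mathbf{B})^T = \mathbf{A}^T + \mathbf{B}^T \tag{4} $$
$$ (\mathbf{A}\mathbf{B})^T = \mathbf{B}^T\mathbf{A}^T \tag{5} $$
$$ (\mathbf{A}\mathbf{B}\mathbf{C}\ldots)^T = \ldots\mathbf{C}^T\mathbf{B}^T\mathbf{A}^T \tag{6} $$
$$ (\mathbf{A}^H)^{-1} = (\mathbf{A}^{-1})^H \tag{7} $$
$$ (\mathbf{A}+\mathbf{B})^H = \mathbf{A}^H + \mathbf{B}^H \tag{8} $$
$$ (\mathbf{A}\mathbf{B})^H = \mathbf{B}^H\mathbf{A}^H \tag{9} $$
$$ (\mathbf{A}\mathbf{B}\mathbf{C}\ldots)^H = \ldots\mathbf{C}^H\mathbf{B}^H\mathbf{A}^H \tag{10} $$

1.1 Trace

$$ \mathrm{Tr}(\mathbf{A}) = \sum_i A_{ii} \tag{11} $$
$$ \mathrm{Tr}(\mathbf{A}) = \sum_i \lambda_i, \quad \lambda_i = \mathrm{eig}(\mathbf{A}) \tag{12} $$
$$ \mathrm{Tr}(\mathbf{A}) = \mathrm{Tr}(\mathbf{A}^T) \tag{13} $$
$$ \mathrm{Tr}(\mathbf{A}\mathbf{B}) = \mathrm{Tr}(\mathbf{B}\mathbf{A}) \tag{14} $$
$$ \mathrm{Tr}(\mathbf{A}+\mathbf{B}) = \mathrm{Tr}(\mathbf{A}) + \mathrm{Tr}(\mathbf{B}) \tag{15} $$
$$ \mathrm{Tr}(\mathbf{A}\mathbf{B}\mathbf{C}) = \mathrm{Tr}(\mathbf{B}\mathbf{C}\mathbf{A}) = \mathrm{Tr}(\mathbf{C}\mathbf{A}\mathbf{B}) \tag{16} $$
$$ \mathbf{a}^T\mathbf{a} = \mathrm{Tr}(\mathbf{a}\mathbf{a}^T) \tag{17} $$

1.2 Determinant

Let $\mathbf{A}$ be an $n \times n$ matrix.

$$ \det(\mathbf{A}) = \prod_i \lambda_i, \quad \lambda_i = \mathrm{eig}(\mathbf{A}) \tag{18} $$
$$ \det(c\mathbf{A}) = c^n \det(\mathbf{A}), \quad \text{if } \mathbf{A} \in \mathbb{R}^{n \times n} \tag{19} $$
$$ \det(\mathbf{A}^T) = \det(\mathbf{A}) \tag{20} $$
$$ \det(\mathbf{A}\mathbf{B}) = \det(\mathbf{A})\det(\mathbf{B}) \tag{21} $$
$$ \det(\mathbf{A}^{-1}) = 1/\det(\mathbf{A}) \tag{22} $$
$$ \det(\mathbf{A}^n) = \det(\mathbf{A})^n \tag{23} $$
$$ \det(\mathbf{I} + \mathbf{u}\mathbf{v}^T) = 1 + \mathbf{u}^T\mathbf{v} \tag{24} $$

For $n = 2$:

$$ \det(\mathbf{I}+\mathbf{A}) = 1 + \det(\mathbf{A}) + \mathrm{Tr}(\mathbf{A}) \tag{25} $$

For $n = 3$:

$$ \det(\mathbf{I}+\mathbf{A}) = 1 + \det(\mathbf{A}) + \mathrm{Tr}(\mathbf{A}) + \frac{1}{2}\mathrm{Tr}(\mathbf{A})^2 - \frac{1}{2}\mathrm{Tr}(\mathbf{A}^2) \tag{26} $$

For $n = 4$:

$$ \det(\mathbf{I}+\mathbf{A}) = 1 + \det(\mathbf{A}) + \mathrm{Tr}(\mathbf{A}) + \frac{1}{2}\mathrm{Tr}(\mathbf{A})^2 - \frac{1}{2}\mathrm{Tr}(\mathbf{A}^2) + \frac{1}{6}\mathrm{Tr}(\mathbf{A})^3 - \frac{1}{2}\mathrm{Tr}(\mathbf{A})\mathrm{Tr}(\mathbf{A}^2) + \frac{1}{3}\mathrm{Tr}(\mathbf{A}^3) \tag{27} $$

For small $\varepsilon$, the following approximation holds to second order in $\varepsilon$:

$$ \det(\mathbf{I}+\varepsilon\mathbf{A}) \cong 1 + \varepsilon\,\mathrm{Tr}(\mathbf{A}) + \frac{1}{2}\varepsilon^2\,\mathrm{Tr}(\mathbf{A})^2 - \frac{1}{2}\varepsilon^2\,\mathrm{Tr}(\mathbf{A}^2) \tag{28} $$

1.3 The Special Case 2x2

Consider the matrix $\mathbf{A}$

$$ \mathbf{A} = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} $$

Determinant and trace

$$ \det(\mathbf{A}) = A_{11}A_{22} - A_{12}A_{21} \tag{29} $$
$$ \mathrm{Tr}(\mathbf{A}) = A_{11} + A_{22} \tag{30} $$

Eigenvalues

$$ \lambda^2 - \lambda \cdot \mathrm{Tr}(\mathbf{A}) + \det(\mathbf{A}) = 0 $$
$$ \lambda_1 = \frac{\mathrm{Tr}(\mathbf{A}) + \sqrt{\mathrm{Tr}(\mathbf{A})^2 - 4\det(\mathbf{A})}}{2}, \qquad \lambda_2 = \frac{\mathrm{Tr}(\mathbf{A}) - \sqrt{\mathrm{Tr}(\mathbf{A})^2 - 4\det(\mathbf{A})}}{2} $$
$$ \lambda_1 + \lambda_2 = \mathrm{Tr}(\mathbf{A}), \qquad \lambda_1\lambda_2 = \det(\mathbf{A}) $$

Eigenvectors

$$ \mathbf{v}_1 \propto \begin{bmatrix} A_{12} \\ \lambda_1 - A_{11} \end{bmatrix}, \qquad \mathbf{v}_2 \propto \begin{bmatrix} A_{12} \\ \lambda_2 - A_{11} \end{bmatrix} $$

Inverse

$$ \mathbf{A}^{-1} = \frac{1}{\det(\mathbf{A})} \begin{bmatrix} A_{22} & -A_{12} \\ -A_{21} & A_{11} \end{bmatrix} \tag{31} $$

2 Derivatives

This section covers differentiation of a number of expressions with respect to a matrix $\mathbf{X}$.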
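Note that it is always assumed that $\mathbf{X}$ has no special structure, i.e. that the elements of $\mathbf{X}$ are independent (e.g. not symmetric, Toeplitz, positive definite). See section 2.8 for differentiation of structured matrices. The basic assumptions can be written in a formula as

$$ \frac{\partial X_{kl}}{\partial X_{ij}} = \delta_{ik}\delta_{lj} \tag{32} $$

that is, for e.g. vector forms,

$$ \left[\frac{\partial \mathbf{x}}{\partial y}\right]_i = \frac{\partial x_i}{\partial y}, \qquad \left[\frac{\partial x}{\partial \mathbf{y}}\right]_i = \frac{\partial x}{\partial y_i}, \qquad \left[\frac{\partial \mathbf{x}}{\partial \mathbf{y}}\right]_{ij} = \frac{\partial x_i}{\partial y_j} $$

The following rules are general and very useful when deriving the differential of an expression ([19]):

$$ \partial \mathbf{A} = 0 \quad (\mathbf{A} \text{ is a constant}) \tag{33} $$
$$ \partial(\alpha\mathbf{X}) = \alpha\,\partial\mathbf{X} \tag{34} $$
$$ \partial(\mathbf{X}+\mathbf{Y}) = \partial\mathbf{X} + \partial\mathbf{Y} \tag{35} $$
$$ \partial(\mathrm{Tr}(\mathbf{X})) = \mathrm{Tr}(\partial\mathbf{X}) \tag{36} $$
$$ \partial(\mathbf{X}\mathbf{Y}) = (\partial\mathbf{X})\mathbf{Y} + \mathbf{X}(\partial\mathbf{Y}) \tag{37} $$
$$ \partial(\mathbf{X} \circ \mathbf{Y}) = (\partial\mathbf{X}) \circ \mathbf{Y} + \mathbf{X} \circ (\partial\mathbf{Y}) \tag{38} $$
$$ \partial(\mathbf{X} \otimes \mathbf{Y}) = (\partial\mathbf{X}) \otimes \mathbf{Y} + \mathbf{X} \otimes (\partial\mathbf{Y}) \tag{39} $$
$$ \partial(\mathbf{X}^{-1}) = -\mathbf{X}^{-1}(\partial\mathbf{X})\mathbf{X}^{-1} \tag{40} $$
$$ \partial(\det(\mathbf{X})) = \mathrm{Tr}(\mathrm{adj}(\mathbf{X})\,\partial\mathbf{X}) \tag{41} $$
$$ \partial(\det(\mathbf{X})) = \det(\mathbf{X})\,\mathrm{Tr}(\mathbf{X}^{-1}\partial\mathbf{X}) \tag{42} $$
$$ \partial(\ln(\det(\mathbf{X}))) = \mathrm{Tr}(\mathbf{X}^{-1}\partial\mathbf{X}) \tag{43} $$
$$ \partial\mathbf{X}^T = (\partial\mathbf{X})^T \tag{44} $$
$$ \partial\mathbf{X}^H = (\partial\mathbf{X})^H \tag{45} $$

2.1 Derivatives of a Determinant

2.1.1 General form

$$ \frac{\partial \det(\mathbf{Y})}{\partial x} = \det(\mathbf{Y})\,\mathrm{Tr}\left[\mathbf{Y}^{-1}\frac{\partial \mathbf{Y}}{\partial x}\right] \tag{46} $$
$$ \sum_k \frac{\partial \det(\mathbf{X})}{\partial X_{ik}} X_{jk} = \delta_{ij}\det(\mathbf{X}) \tag{47} $$
$$ \frac{\partial^2 \det(\mathbf{Y})}{\partial x^2} = \det(\mathbf{Y}) \left[ \mathrm{Tr}\left[\mathbf{Y}^{-1}\frac{\partial \frac{\partial \mathbf{Y}}{\partial x}}{\partial x}\right] + \mathrm{Tr}\left[\mathbf{Y}^{-1}\frac{\partial \mathbf{Y}}{\partial x}\right] \mathrm{Tr}\left[\mathbf{Y}^{-1}\frac{\partial \mathbf{Y}}{\partial x}\right] - \mathrm{Tr}\left[\left(\mathbf{Y}^{-1}\frac{\partial \mathbf{Y}}{\partial x}\right)\left(\mathbf{Y}^{-1}\frac{\partial \mathbf{Y}}{\partial x}\right)\right] \right] \tag{48} $$

2.1.2 Linear forms

$$ \frac{\partial \det(\mathbf{X})}{\partial \mathbf{X}} = \det(\mathbf{X})(\mathbf{X}^{-1})^T \tag{49} $$
$$ \sum_k \frac{\partial \det(\mathbf{X})}{\partial X_{ik}} X_{jk} = \delta_{ij}\det(\mathbf{X}) \tag{50} $$
$$ \frac{\partial \det(\mathbf{A}\mathbf{X}\mathbf{B})}{\partial \mathbf{X}} = \det(\mathbf{A}\mathbf{X}\mathbf{B})(\mathbf{X}^{-1})^T = \det(\mathbf{A}\mathbf{X}\mathbf{B})(\mathbf{X}^T)^{-1} \tag{51} $$

2.1.3 Square forms

If $\mathbf{X}$ is square and invertible, then

$$ \frac{\partial \det(\mathbf{X}^T\mathbf{A}\mathbf{X})}{\partial \mathbf{X}} = 2\det(\mathbf{X}^T\mathbf{A}\mathbf{X})\,\mathbf{X}^{-T} \tag{52} $$

If $\mathbf{X}$ is not square but $\mathbf{A}$ is symmetric, then

$$ \frac{\partial \det(\mathbf{X}^T\mathbf{A}\mathbf{X})}{\partial \mathbf{X}} = 2\det(\mathbf{X}^T\mathbf{A}\mathbf{X})\,\mathbf{A}\mathbf{X}(\mathbf{X}^T\mathbf{A}\mathbf{X})^{-1} \tag{53} $$

If $\mathbf{X}$ is not square and $\mathbf{A}$ is not symmetric, then

$$ \frac{\partial \det(\mathbf{X}^T\mathbf{A}\mathbf{X})}{\partial \mathbf{X}} = \det(\mathbf{X}^T\mathbf{A}\mathbf{X})\left(\mathbf{A}\mathbf{X}(\mathbf{X}^T\mathbf{A}\mathbf{X})^{-1} + \mathbf{A}^T\mathbf{X}(\mathbf{X}^T\mathbf{A}^T\mathbf{X})^{-1}\right) \tag{54} $$

2.1.4 Other nonlinear forms

Some special cases are (see [9, 7])

$$ \frac{\partial \ln \det(\mathbf{X}^T\mathbf{X})}{\partial \mathbf{X}} = 2(\mathbf{X}^+)^T \tag{55} $$
$$ \frac{\partial \ln \det(\mathbf{X}^T\mathbf{X})}{\partial \mathbf{X}^+} = -2\mathbf{X}^T \tag{56} $$
$$ \frac{\partial \ln |\det(\mathbf{X})|}{\partial \mathbf{X}} = (\mathbf{X}^{-1})^T $$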
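
As an illustrative sanity check (not part of the original Cookbook), the sketch below compares the right-hand side of eq. (49) against central finite differences of the determinant for a 2x2 matrix, using the closed-form determinant (29) and inverse (31). The helper names `det2`, `inv2`, `grad_det_analytic` and `grad_det_numeric`, the test matrix, and the step size `h` are all assumptions chosen for this example.

```python
# Finite-difference check of eq. (49) for a 2x2 matrix, built on eqs. (29) and (31).
# All helper names below are illustrative; they do not come from the Cookbook.

def det2(X):
    """Determinant of a 2x2 matrix, eq. (29)."""
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]


def inv2(X):
    """Inverse of a 2x2 matrix, eq. (31)."""
    d = det2(X)
    return [[X[1][1] / d, -X[0][1] / d],
            [-X[1][0] / d, X[0][0] / d]]


def grad_det_analytic(X):
    """Right-hand side of eq. (49): det(X) * (X^{-1})^T."""
    d = det2(X)
    Xinv = inv2(X)
    # Entry (i, j) of the transpose is entry (j, i) of the inverse.
    return [[d * Xinv[j][i] for j in range(2)] for i in range(2)]


def grad_det_numeric(X, h=1e-6):
    """Central differences of det(X) with respect to each entry X_ij."""
    G = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        for j in range(2):
            Xp = [row[:] for row in X]  # copy with X_ij + h
            Xm = [row[:] for row in X]  # copy with X_ij - h
            Xp[i][j] += h
            Xm[i][j] -= h
            G[i][j] = (det2(Xp) - det2(Xm)) / (2 * h)
    return G


# An arbitrary invertible test matrix (an assumption of this sketch).
X = [[2.0, 1.0], [0.5, 3.0]]
analytic = grad_det_analytic(X)
numeric = grad_det_numeric(X)
max_err = max(abs(analytic[i][j] - numeric[i][j])
              for i in range(2) for j in range(2))
print(max_err)
```

The elements of `X` are perturbed one at a time, which matches the assumption in eq. (32) that the entries are independent. For a symmetric or otherwise structured matrix this entrywise check would not apply directly.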
