The Matrix Cookbook [ ]
Total Page:16
File Type:pdf, Size:1020Kb
The Matrix Cookbook [ http://matrixcookbook.com ] Kaare Brandt Petersen Michael Syskind Pedersen Version: November 14, 2008 What is this? These pages are a collection of facts (identities, approxima- tions, inequalities, relations, ...) about matrices and matters relating to them. It is collected in this form for the convenience of anyone who wants a quick desktop reference . Disclaimer: The identities, approximations and relations presented here were obviously not invented but collected, borrowed and copied from a large amount of sources. These sources include similar but shorter notes found on the internet and appendices in books - see the references for a full list. Errors: Very likely there are errors, typos, and mistakes for which we apolo- gize and would be grateful to receive corrections at [email protected]. Its ongoing: The project of keeping a large repository of relations involving matrices is naturally ongoing and the version will be apparent from the date in the header. Suggestions: Your suggestion for additional content or elaboration of some topics is most welcome at [email protected]. Keywords: Matrix algebra, matrix relations, matrix identities, derivative of determinant, derivative of inverse matrix, differentiate a matrix. Acknowledgements: We would like to thank the following for contributions and suggestions: Bill Baxter, Brian Templeton, Christian Rishøj, Christian Schr¨oppel Douglas L. Theobald, Esben Hoegh-Rasmussen, Glynne Casteel, Jan Larsen, Jun Bin Gao, J¨urgen Struckmeier, Kamil Dedecius, Korbinian Strim- mer, Lars Christiansen, Lars Kai Hansen, Leland Wilkinson, Liguo He, Loic Thibaut, Miguel Bar˜ao,Ole Winther, Pavel Sakov, Stephan Hattinger, Vasile Sima, Vincent Rabaud, Zhaoshui He. We would also like thank The Oticon Foundation for funding our PhD studies. 1 CONTENTS CONTENTS Contents 1 Basics 5 1.1 Trace and Determinants . 5 1.2 The Special Case 2x2 . 5 2 Derivatives 7 2.1 Derivatives of a Determinant . 7 2.2 Derivatives of an Inverse . 8 2.3 Derivatives of Eigenvalues . 9 2.4 Derivatives of Matrices, Vectors and Scalar Forms . 9 2.5 Derivatives of Traces . 11 2.6 Derivatives of vector norms . 13 2.7 Derivatives of matrix norms . 13 2.8 Derivatives of Structured Matrices . 14 3 Inverses 16 3.1 Basic . 16 3.2 Exact Relations . 17 3.3 Implication on Inverses . 19 3.4 Approximations . 19 3.5 Generalized Inverse . 20 3.6 Pseudo Inverse . 20 4 Complex Matrices 23 4.1 Complex Derivatives . 23 4.2 Higher order and non-linear derivatives . 26 4.3 Inverse of complex sum . 26 5 Solutions and Decompositions 27 5.1 Solutions to linear equations . 27 5.2 Eigenvalues and Eigenvectors . 29 5.3 Singular Value Decomposition . 30 5.4 Triangular Decomposition . 32 5.5 LU decomposition . 32 5.6 LDM decomposition . 32 5.7 LDL decompositions . 32 6 Statistics and Probability 33 6.1 Definition of Moments . 33 6.2 Expectation of Linear Combinations . 34 6.3 Weighted Scalar Variable . 35 Petersen & Pedersen, The Matrix Cookbook, Version: November 14, 2008, Page 2 CONTENTS CONTENTS 7 Multivariate Distributions 36 7.1 Cauchy . 36 7.2 Dirichlet . 36 7.3 Normal . 36 7.4 Normal-Inverse Gamma . 36 7.5 Gaussian . 36 7.6 Multinomial . 36 7.7 Student’s t . 36 7.8 Wishart . 37 7.9 Wishart, Inverse . 38 8 Gaussians 39 8.1 Basics . 39 8.2 Moments . 41 8.3 Miscellaneous . 43 8.4 Mixture of Gaussians . 44 9 Special Matrices 45 9.1 Block matrices . 45 9.2 Discrete Fourier Transform Matrix, The . 46 9.3 Hermitian Matrices and skew-Hermitian . 47 9.4 Idempotent Matrices . 48 9.5 Orthogonal matrices . 48 9.6 Positive Definite and Semi-definite Matrices . 50 9.7 Singleentry Matrix, The . 51 9.8 Symmetric, Skew-symmetric/Antisymmetric . 53 9.9 Toeplitz Matrices . 54 9.10 Transition matrices . 55 9.11 Units, Permutation and Shift . 56 9.12 Vandermonde Matrices . 57 10 Functions and Operators 58 10.1 Functions and Series . 58 10.2 Kronecker and Vec Operator . 59 10.3 Vector Norms . 61 10.4 Matrix Norms . 61 10.5 Rank . 62 10.6 Integral Involving Dirac Delta Functions . 62 10.7 Miscellaneous . 63 A One-dimensional Results 64 A.1 Gaussian . 64 A.2 One Dimensional Mixture of Gaussians . 65 B Proofs and Details 67 B.1 Misc Proofs . 67 Petersen & Pedersen, The Matrix Cookbook, Version: November 14, 2008, Page 3 CONTENTS CONTENTS Notation and Nomenclature A Matrix Aij Matrix indexed for some purpose Ai Matrix indexed for some purpose Aij Matrix indexed for some purpose An Matrix indexed for some purpose or The n.th power of a square matrix A−1 The inverse matrix of the matrix A A+ The pseudo inverse matrix of the matrix A (see Sec. 3.6) A1/2 The square root of a matrix (if unique), not elementwise (A)ij The (i, j).th entry of the matrix A Aij The (i, j).th entry of the matrix A [A]ij The ij-submatrix, i.e. A with i.th row and j.th column deleted a Vector ai Vector indexed for some purpose ai The i.th element of the vector a a Scalar <z Real part of a scalar <z Real part of a vector <Z Real part of a matrix =z Imaginary part of a scalar =z Imaginary part of a vector =Z Imaginary part of a matrix det(A) Determinant of A Tr(A) Trace of the matrix A diag(A) Diagonal matrix of the matrix A, i.e. (diag(A))ij = δijAij eig(A) Eigenvalues of the matrix A vec(A) The vector-version of the matrix A (see Sec. 10.2.2) sup Supremum of a set ||A|| Matrix norm (subscript if any denotes what norm) AT Transposed matrix A−T The inverse of the transposed and vice versa, A−T = (A−1)T = (AT )−1. A∗ Complex conjugated matrix AH Transposed and complex conjugated matrix (Hermitian) A ◦ B Hadamard (elementwise) product A ⊗ B Kronecker product 0 The null matrix. Zero in all entries. I The identity matrix Jij The single-entry matrix, 1 at (i, j) and zero elsewhere Σ A positive definite matrix Λ A diagonal matrix Petersen & Pedersen, The Matrix Cookbook, Version: November 14, 2008, Page 4 1 BASICS 1 Basics (AB)−1 = B−1A−1 (1) (ABC...)−1 = ...C−1B−1A−1 (2) (AT )−1 = (A−1)T (3) (A + B)T = AT + BT (4) (AB)T = BT AT (5) (ABC...)T = ...CT BT AT (6) (AH )−1 = (A−1)H (7) (A + B)H = AH + BH (8) (AB)H = BH AH (9) (ABC...)H = ...CH BH AH (10) 1.1 Trace and Determinants P Tr(A) = iAii (11) P Tr(A) = iλi, λi = eig(A) (12) Tr(A) = Tr(AT ) (13) Tr(AB) = Tr(BA) (14) Tr(A + B) = Tr(A) + Tr(B) (15) Tr(ABC) = Tr(BCA) = Tr(CAB) (16) Q det(A) = iλi λi = eig(A) (17) n n×n det(cA) = c det(A), if A ∈ R (18) det(AT ) = det(A) (19) det(AB) = det(A) det(B) (20) det(A−1) = 1/ det(A) (21) det(An) = det(A)n (22) det(I + uvT ) = 1 + uT v (23) det(I + εA) =∼ 1 + εTr(A), ε small (24) 1.2 The Special Case 2x2 Consider the matrix A A A A = 11 12 A21 A22 Determinant and trace det(A) = A11A22 − A12A21 (25) Tr(A) = A11 + A22 (26) Petersen & Pedersen, The Matrix Cookbook, Version: November 14, 2008, Page 5 1.2 The Special Case 2x2 1 BASICS Eigenvalues λ2 − λ · Tr(A) + det(A) = 0 Tr(A) + pTr(A)2 − 4 det(A) Tr(A) − pTr(A)2 − 4 det(A) λ = λ = 1 2 2 2 λ1 + λ2 = Tr(A) λ1λ2 = det(A) Eigenvectors A12 A12 v1 ∝ v2 ∝ λ1 − A11 λ2 − A11 Inverse 1 A −A A−1 = 22 12 (27) det(A) −A21 A11 Petersen & Pedersen, The Matrix Cookbook, Version: November 14, 2008, Page 6 2 DERIVATIVES 2 Derivatives This section is covering differentiation of a number of expressions with respect to a matrix X. Note that it is always assumed that X has no special structure, i.e. that the elements of X are independent (e.g. not symmetric, Toeplitz, positive definite). See section 2.8 for differentiation of structured matrices. The basic assumptions can be written in a formula as ∂Xkl = δikδlj (28) ∂Xij that is for e.g. vector forms, ∂x ∂x ∂x ∂x ∂x ∂x = i = = i ∂y i ∂y ∂y i ∂yi ∂y ij ∂yj The following rules are general and very useful when deriving the differential of an expression ([19]): ∂A = 0 (A is a constant) (29) ∂(αX) = α∂X (30) ∂(X + Y) = ∂X + ∂Y (31) ∂(Tr(X)) = Tr(∂X) (32) ∂(XY) = (∂X)Y + X(∂Y) (33) ∂(X ◦ Y) = (∂X) ◦ Y + X ◦ (∂Y) (34) ∂(X ⊗ Y) = (∂X) ⊗ Y + X ⊗ (∂Y) (35) ∂(X−1) = −X−1(∂X)X−1 (36) ∂(det(X)) = det(X)Tr(X−1∂X) (37) ∂(ln(det(X))) = Tr(X−1∂X) (38) ∂XT = (∂X)T (39) ∂XH = (∂X)H (40) 2.1 Derivatives of a Determinant 2.1.1 General form ∂ det(Y) ∂Y = det(Y)Tr Y−1 (41) ∂x ∂x " " # ∂ ∂ det(Y) ∂ ∂Y ∂x = det(Y) Tr Y−1 ∂x ∂x ∂x ∂Y ∂Y +Tr Y−1 Tr Y−1 ∂x ∂x # ∂Y ∂Y −Tr Y−1 Y−1 (42) ∂x ∂x Petersen & Pedersen, The Matrix Cookbook, Version: November 14, 2008, Page 7 2.2 Derivatives of an Inverse 2 DERIVATIVES 2.1.2 Linear forms ∂ det(X) = det(X)(X−1)T (43) ∂X X ∂ det(X) Xjk = δij det(X) (44) ∂Xik k ∂ det(AXB) = det(AXB)(X−1)T = det(AXB)(XT )−1 (45) ∂X 2.1.3 Square forms If X is square and invertible, then ∂ det(XT AX) = 2 det(XT AX)X−T (46) ∂X If X is not square but A is symmetric, then ∂ det(XT AX) = 2 det(XT AX)AX(XT AX)−1 (47) ∂X If X is not square and A is not symmetric, then ∂ det(XT AX) = det(XT AX)(AX(XT AX)−1 + AT X(XT AT X)−1) (48) ∂X 2.1.4 Other nonlinear forms Some special cases are (See [9, 7]) ∂ ln det(XT X)| = 2(X+)T (49) ∂X ∂ ln det(XT X) = −2XT (50) ∂X+ ∂ ln | det(X)| = (X−1)T = (XT )−1 (51) ∂X ∂ det(Xk) = k det(Xk)X−T (52) ∂X 2.2 Derivatives of an Inverse From [27] we have the basic identity ∂Y−1 ∂Y = −Y−1 Y−1 (53) ∂x ∂x Petersen & Pedersen, The Matrix Cookbook, Version: November 14, 2008, Page 8 2.3 Derivatives of Eigenvalues 2 DERIVATIVES from which it follows −1 ∂(X )kl −1 −1 = −(X )ki(X )jl (54) ∂Xij ∂aT X−1b = −X−T abT X−T (55) ∂X ∂ det(X−1) = − det(X−1)(X−1)T (56) ∂X ∂Tr(AX−1B) = −(X−1BAX−1)T (57) ∂X ∂Tr((X + A)−1) = −((X + A)−1(X + A)−1)T (58) ∂X From [32] we have the following result: Let A be an n × n invertible square matrix, W be the inverse of A, and J(A) is an n × n -variate and differentiable function with