Summary of key matrix and vector algebra definitions and results
Peter Sollich Nov 2014
1 Matrices, vectors, transpose
• Matrix A of size M × N: M rows, N columns.
• Matrix elements A_ij with i = 1,...,M, j = 1,...,N; elements can also be written as (A)_ij
• N-dimensional column vector v ≡ matrix of size N × 1, elements written as v_i; we generally assume all vectors are column vectors
• Matrix transpose: A^T has elements (A^T)_ij = A_ji, size N × M
• For a vector: v^T is a row vector ≡ matrix of size 1 × N
2 Products
• For A of size L × M, B of size M × N, the matrix product AB is defined to have elements (AB)_ik = Σ_j A_ij B_jk
• Note the “inner” matrix size (M) has to match, and the order of factors matters, i.e. AB ≠ BA generally
• Special case (i): scalar product u^T v = Σ_i u_i v_i
• Special case (ii): for N-dimensional vectors u, v, the outer product u v^T is an N × N matrix with elements u_i v_j
• Note that e.g. uv is not defined (matrix sizes don’t match)
• A (square) matrix A with A = A^T is called symmetric
• Transpose of product: (AB)^T = B^T A^T, i.e. transpose the factors and reverse their order
• Transpose applied to a number ≡ 1 × 1 matrix gives the same number, hence u^T v = v^T u, u^T A v = v^T A^T u etc
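The product and transpose rules above can be verified numerically; the following quick sketch uses NumPy and is not part of the original notes:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))   # L x M with L = 3, M = 4
B = rng.standard_normal((4, 2))   # M x N with N = 2

# (AB)_ik = sum_j A_ij B_jk; result has the "outer" sizes L x N
C = A @ B
assert C.shape == (3, 2)

# Transpose of a product: (AB)^T = B^T A^T
assert np.allclose(C.T, B.T @ A.T)

# Scalar (inner) product u^T v vs outer product u v^T
u, v = rng.standard_normal(4), rng.standard_normal(4)
assert np.isclose(u @ v, np.sum(u * v))   # scalar
outer = np.outer(u, v)                    # (u v^T)_ij = u_i v_j
assert outer.shape == (4, 4)
```

Note that `A @ B` raises an error if the inner sizes do not match, mirroring the fact that e.g. uv is undefined.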
3 Quadratic forms
• Quadratic form: v^T A v = Σ_ij v_i A_ij v_j for square matrix A
• This equals v^T A^T v, hence also v^T A_s v, where A_s = (A + A^T)/2 is the symmetric part of A
• Often need derivatives of this: (∂/∂v_i) v^T A v = ((A + A^T)v)_i = 2(A_s v)_i; for symmetric A this equals 2(Av)_i
• Completing the square: for symmetric A,
  v^T A v + 2 b^T v = (v + A^{-1} b)^T A (v + A^{-1} b) − b^T A^{-1} b
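Both the completing-the-square identity and the gradient formula can be spot-checked numerically; a minimal NumPy sketch (not part of the original notes, with an arbitrary symmetric test matrix):

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((3, 3))
A = M + M.T + 6 * np.eye(3)       # symmetric and comfortably invertible
b = rng.standard_normal(3)
v = rng.standard_normal(3)

Ainv_b = np.linalg.solve(A, b)    # A^{-1} b without forming the inverse

# Completing the square for symmetric A
lhs = v @ A @ v + 2 * b @ v
rhs = (v + Ainv_b) @ A @ (v + Ainv_b) - b @ Ainv_b
assert np.isclose(lhs, rhs)

# Gradient of the quadratic form: (d/dv_i) v^T A v = 2 (A v)_i for symmetric A,
# checked by a forward finite difference in component i = 0
eps, i = 1e-6, 0
vp = v.copy(); vp[i] += eps
num = (vp @ A @ vp - v @ A @ v) / eps
assert np.isclose(num, (2 * A @ v)[i], atol=1e-4)
```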
4 Identity, inverse
• Identity matrix I has elements δ_ij; the Kronecker delta is defined as δ_ij = 1 if i = j, δ_ij = 0 otherwise; one also sometimes sees the notation I_ij
• When necessary, the size of the identity matrix is written, e.g. I_N for the N × N identity matrix
• Inverse matrix: for a square matrix A, the inverse A^{-1} is defined by A A^{-1} = I or (equivalently) A^{-1} A = I
• Can write elements of the inverse explicitly in terms of determinants of submatrices
• Inverse of product: (AB)^{-1} = B^{-1} A^{-1}, i.e. invert the factors and reverse their order
• A matrix A whose inverse is its transpose, A^{-1} = A^T, is called orthogonal
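The product-inverse rule and the definition of orthogonality can be illustrated numerically; a short NumPy sketch (an addition to the notes, using the Q factor of a QR factorization as a convenient example of an orthogonal matrix):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3)) + 3 * np.eye(3)   # shifted to be well-conditioned
B = rng.standard_normal((3, 3)) + 3 * np.eye(3)

# Inverse of a product: (AB)^{-1} = B^{-1} A^{-1}
assert np.allclose(np.linalg.inv(A @ B),
                   np.linalg.inv(B) @ np.linalg.inv(A))

# Orthogonal matrix: Q^{-1} = Q^T, i.e. Q^T Q = I
Q, _ = np.linalg.qr(A)
assert np.allclose(Q.T @ Q, np.eye(3))
```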
5 Determinant
• Determinant of square matrix A is defined as
  |A| = Σ_σ (−1)^σ A_{1σ(1)} ··· A_{Nσ(N)}
where σ runs over all permutations of {1, 2,...,N} and (−1)^σ is the sign of the permutation (= +1 if the permutation can be obtained by an even number of swaps of two elements, = −1 if the number of swaps needed is odd).
E.g. for N = 2 there are only two permutations, so |A| = A_11 A_22 − A_12 A_21
• If the matrix is diagonal (A_ij = 0 for i ≠ j), then simply |A| = A_11 ··· A_NN; in particular |I| = 1
• Determinant of multiple: |cA| = c^N |A|
• Determinant of transpose: |A^T| = |A|
• Determinant of product: |AB| = |A| |B| (provided A and B are both square, else the rhs is undefined)
• Determinant of inverse: |A^{-1}| = |A|^{-1}
• Sylvester’s determinant theorem: for A and B of size M × N and N × M respectively, |I_M + AB| = |I_N + BA|
• Special case for outer products: |I + u v^T| = 1 + v^T u, hence also |A + u v^T| = |A(I + A^{-1} u v^T)| = |A| (1 + v^T A^{-1} u)
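Sylvester’s theorem and its outer-product special case are easy to confirm numerically; a NumPy sketch (not in the original notes) with M = 2, N = 3:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((2, 3))   # M x N
B = rng.standard_normal((3, 2))   # N x M

# Sylvester: |I_M + AB| = |I_N + BA|
assert np.isclose(np.linalg.det(np.eye(2) + A @ B),
                  np.linalg.det(np.eye(3) + B @ A))

# Outer-product special case: |C + u v^T| = |C| (1 + v^T C^{-1} u)
C = rng.standard_normal((3, 3)) + 4 * np.eye(3)   # invertible square matrix
u, v = rng.standard_normal(3), rng.standard_normal(3)
assert np.isclose(np.linalg.det(C + np.outer(u, v)),
                  np.linalg.det(C) * (1 + v @ np.linalg.solve(C, u)))
```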
6 Trace
• Trace of a square matrix: Tr A = Σ_i A_ii, sum of diagonal elements
• Trace of multiple: Tr cA = c Tr A
• Trace of transpose: Tr A^T = Tr A
• Trace of product: Tr(AB) = Tr(BA), Tr(ABC) = Tr(CAB) etc (“cyclic invariance”)
• Special case: trace of outer product = inner product, Tr u v^T = v^T u = u^T v
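Cyclic invariance holds even when the individual factors are not square, as long as the products are; a brief NumPy check (an addition to the notes):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((2, 3))
B = rng.standard_normal((3, 2))

# Cyclic invariance: Tr(AB) = Tr(BA); AB is 2x2, BA is 3x3
assert np.isclose(np.trace(A @ B), np.trace(B @ A))

# Trace of an outer product equals the inner product
u, v = rng.standard_normal(3), rng.standard_normal(3)
assert np.isclose(np.trace(np.outer(u, v)), u @ v)
```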
7 Derivatives of matrices
• A matrix A can depend on some parameter, say x
• The derivative ∂A/∂x is applied elementwise
• Familiar from dynamics: the time derivative of the position vector, ∂r/∂t, is the velocity
• Derivative of determinant: (∂/∂x) ln |A| = Tr(A^{-1} ∂A/∂x)
• Derivative of inverse: (∂/∂x) A^{-1} = −A^{-1} (∂A/∂x) A^{-1}
• If we choose x as one of the matrix elements A_kl, then the derivative matrix has only one nonzero entry, (∂A/∂A_kl)_ij = δ_ik δ_jl
• Hence (∂/∂A_kl) ln |A| = Σ_ji (A^{-1})_ji (∂A/∂A_kl)_ij = (A^{-1})_lk
• And (∂/∂A_kl)(A^{-1})_mn = −Σ_ij (A^{-1})_mi (∂A/∂A_kl)_ij (A^{-1})_jn = −(A^{-1})_mk (A^{-1})_ln
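The two element-wise derivative formulas can be checked against finite differences; a NumPy sketch (not part of the original notes, perturbing the single element A_kl with k = 1, l = 2):

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((3, 3)) + 4 * np.eye(3)
k, l, eps = 1, 2, 1e-6

Ap = A.copy()
Ap[k, l] += eps          # perturb the single element A_kl

# (d/dA_kl) ln|A| = (A^{-1})_lk
num = (np.log(abs(np.linalg.det(Ap))) - np.log(abs(np.linalg.det(A)))) / eps
Ainv = np.linalg.inv(A)
assert np.isclose(num, Ainv[l, k], atol=1e-4)

# (d/dA_kl)(A^{-1})_mn = -(A^{-1})_mk (A^{-1})_ln for all m, n
num_inv = (np.linalg.inv(Ap) - Ainv) / eps
assert np.allclose(num_inv, -np.outer(Ainv[:, k], Ainv[l, :]), atol=1e-4)
```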
8 Eigenvalues and eigenvectors
• If a nonzero vector v and a square matrix A obey Av = λv for some multiplier λ, then v is an eigenvector of A with eigenvalue λ
• If A is symmetric, one can find N eigenvectors v^α that are orthogonal and normalized, i.e. (v^α)^T v^β = δ_αβ, with associated eigenvalues λ_α
• A then has an eigenvector decomposition, A = Σ_α λ_α v^α (v^α)^T
• If we define V = (v^1,..., v^N) (matrix whose columns are the eigenvectors) and Λ with elements Λ_αβ = λ_α δ_αβ (diagonal matrix containing the eigenvalues), then A = V Λ V^{-1}. One says A is diagonalized by the transformation with V. V is orthogonal because the eigenvectors are, so one can also write A = V Λ V^T, and the transformation has the interpretation of a rotation.
• Determinant: for symmetric A, |A| = Π_{α=1}^N λ_α, determinant is product of eigenvalues; follows from |A| = |V Λ V^{-1}| = |V| |Λ| |V|^{-1} = |Λ|
• Trace: for symmetric A, Tr A = Σ_α λ_α, trace is sum of eigenvalues; follows from Tr A = Tr(V Λ V^{-1}) = Tr(V^{-1} V Λ) = Tr Λ
• Powers of symmetric A: A^n = V Λ^n V^{-1} = Σ_α λ_α^n v^α (v^α)^T
• Polynomial of matrix: if f(x) = Σ_{n=0}^m c_n x^n, define f(A) as Σ_{n=0}^m c_n A^n; the previous result then shows f(A) = Σ_α f(λ_α) v^α (v^α)^T
• By letting the order of the polynomial, m, go to infinity, extend this definition to general matrix functions that can be represented as power series: f(A) = Σ_α f(λ_α) v^α (v^α)^T
• Then in particular e.g. ln |A| = Tr ln(A)
• Important in other contexts (e.g. dynamics) is the matrix exponential exp(A) = Σ_{n=0}^∞ A^n/n!, which from the above equals Σ_α exp(λ_α) v^α (v^α)^T