Short Guides to Microeconometrics Kurt Schmidheiny/Klaus Neusser Fall 2020 Universit¨atBasel
Contents
1 Definitions 2
2 Matrix Operations 3
3 Rank of a Matrix 6
4 Special Functions of Square Matrices 7
5 Systems of Equations 10
6 Eigenvalue, -vector and Decomposition 11
7 Quadratic Forms 13
8 Partitioned Matrices 15
9 Derivatives with Matrix Algebra 16
10 Kronecker Product 18
References 19
Formula Sources and Proofs 20
Version: 30-9-2020, 18:06 Elements of Matrix Algebra 2
Foreword
These lecture notes are supposed to summarize the main results concern- ing matrix algebra as they are used in econometrics and economics. For a deeper discussion of the material, the interested reader should consult the references listed at the end.
1 Definitions
A matrix is a rectangular array of numbers. Here we consider only real numbers. If the matrix has n rows and m columns, we say that the matrix is of dimension (n × m). We denote matrices by capital bold letters:
a11 a12 . . . a1m a21 a22 . . . a2m A = (A)ij = (aij) = . . . . . .. . . . . . an1 an2 . . . anm
The numbers aij are called the elements of the matrix. An (n × 1) matrix is a column vector with n elements. Similarly, a (1 × m) matrix is a row vector with m elements. We denote vectors by bold letters. a1 a2 a = . b = b1 b2 . . . bm . . . an
A (1 × 1) matrix is a scalar which is denoted by an italic letter. The null matrix (O) is a matrix whose elements are all equal to zero, i.e. aij = 0 for all i = 1, . . . , n and j = 1, . . . , m. A square matrix is a matrix with the same number of columns and rows, i.e. n = m. 3 Short Guides to Microeconometrics
A symmetric matrix is a square matrix such that aij = aji for all i = 1, . . . , n and j = 1, . . . , m. A diagonal matrix is a square matrix such that the off-diagonal ele- ments are all equal to zero, i.e. aij = 0 for i 6= j. The identity matrix is a diagonal matrix with all diagonal elements equal to one. The identity matrix is denoted by I or In.
A square matrix is said to be upper triangular whenever aij = 0 for i > j and lower triangular whenever aij = 0 for i < j. Two vectors a and b are said to be linearly dependent if there exist scalars α and β both not equal to zero such that αa + βb = 0. Otherwise they are said to be linearly independent.
2 Matrix Operations
2.1 Equality
Two matrices or two vectors are equal if they have the same dimension and if their respective elements are all equal:
A = B ⇐⇒ aij = bij for all i and j
2.2 Transpose
Definition 1. The matrix B is called the transpose of matrix A if and only if
bij = aji for all i and j. The matrix B is denoted by A0 or AT . Taking the transpose of a matrix is equivalent to interchanging rows and columns. If A has dimension (n×m) then A0 has dimension (m×n). The transpose of a column vector is a row vector and vice versa. Note:
• (A0)0 = A for any matrix A (2.1)
• A0 = A for a symmetric matrix A (2.2) Elements of Matrix Algebra 4
2.3 Addition and Subtraction
The addition and subtraction of matrices is only defined for matrices with the same dimension. Definition 2. The sum of two matrices A and B of the same dimensions is given by the sum of their elements, i.e.
C = A + B ⇐⇒ cij = aij + bij for all i and j
The sum of a matrix A and a scalar b is a matrix C = A + b with cij = aij + b. Note that A + b = b + A. We have the following calculation rules if matrix dimensions agree:
• A + O = A (2.3)
• A − B = A + (−B) (2.4)
• A + B = B + A (2.5)
• (A + B) + C = A + (B + C) (2.6)
• (A + B)0 = A0 + B0 (2.7)
2.4 Product
Definition 3. The inner product (dot product, scalar product) of two vectors a and b of the same dimension (n × 1) is a scalar (real number) defined as:
n 0 0 X a b = b a = a1b1 + a2b2 + ··· + anbn = aibi. i=1 The product of a scalar c and a matrix A is a matrix B = cA with bij = caij. Note that cA = Ac when c is a scalar. Definition 4. The product of two matrices A and B with dimensions (n×k) and (k ×m), respectively, is given by the matrix C with dimension 5 Short Guides to Microeconometrics
(n × m) such that
k X C = AB ⇐⇒ cij = aisbsj for all i and j s=1 Remark 1. The matrix product is only defined if the number of columns of the first matrix is equal to the number of rows of the second matrix. Thus, although AB may be defined, BA is only defined if n = m. Thus for square matrices both AB and BA are defined. Remark 2. The product of two matrices is in general not commutative, i.e. AB 6= BA. Remark 3. The product AB may also be defined as
0 cij = (C)ij = ai•b•j
0 where ai• denotes the i-th row of A and b•j the j-th column of B. We have the following calculation rules if matrix dimensions agree:
• AI = A, IA = A (2.8)
• AO = O, OA = O (2.9)
• (AB)C = A(BC) = ABC (2.10)
• A(B + C) = AB + AC (2.11)
• (B + C)A = BA + CA (2.12)
• c(A + B) = cA + cB (2.13)
• (AB)0 = B0A0 (order!) (2.14)
• (ABC)0 = C0B0A0 (order!) (2.15) Elements of Matrix Algebra 6
3 Rank of a Matrix
Pn A set of vectors x1, x2,..., xn is linearly independent if i=1 cixi = 0 implies ci = 0 for all i = 1, . . . , n. The column rank of a matrix is the maximal number of linearly in- dependent columns. The row rank of a matrix is the maximal number of linearly independent rows. A matrix is said to have full column (row) rank if the column rank (row rank) equals the number of columns (rows). The column rank of an n × k matrix A is equal to its row rank. We can therefore just speak of the rank of a matrix denoted by rank(A). For an (n × k) matrix A, a (k × m) matrix B and an (n × n) square matrix C, we have
• rank(A) ≤ min(n, k) (3.1)
• rank(A0) = rank(A) (3.2)
• rank(A0A) = rank(AA0) = rank(A) (3.3)
• rank(AB) ≤ min(rank(A), rank(B)) (3.4)
• rank(AB) = rank(B) if A has full column rank (3.5)
• rank(AB) = rank(A) if B has full row rank (3.6)
• rank(A0CA) = rank(CA) if C is nonnegative definite (3.7)
• rank(A0CA) = rank(A) if C is positive definite (3.8) 7 Short Guides to Microeconometrics
4 Special Functions of Square Matrices
In this section only square (n × n) matrices are considered.
4.1 Trace of a Matrix
Definition 5. The trace of a matrix A, denoted by tr(A), is the sum of its diagonal elements: n X tr(A) = aii i=1 The following calculation rules hold if matrix dimensions agree:
• tr(cA) = c tr(A) (4.1)
• tr(A0) = tr(A) (4.2)
• tr(A + B) = tr(A) + tr(B) (4.3)
• tr(AB) = tr(BA) (4.4)
• tr(ABC) = tr(BCA) = tr(CAB) (4.5)
4.2 Determinant
The determinant of a(n × n) matrix A with n > 1 can be computed according to the following formula:
n X i+j |A| = aij(−1) |Aij| for some arbitrary j i=1 The determinant, computed as above, is said to be developed according i+j to the j-th column. The term (−1) |Aij| is called the cofactor of the element aij. Thereby Aij is a matrix of dimension ((n − 1) × (n − 1)) which is obtained by deleting the i-th row and the j-th column. Elements of Matrix Algebra 8
C a a a S D 11 1 j 1n T D T D T A = ij D ai1 aij ain T D T D T D T E an1 anj ann U
For n = 1, i.e. if A is a scalar, the determinant |A| is defined as the absolute value. For n = 2 , the determinant is given by:
|A| = a11a22 − a12a21. If at least two columns (rows) are linearly dependent, the determi- nant is equal to zero and the inverse of A does not exist. The matrix is called singular in this case. If the matrix is nonsingular then all columns (rows) are linearly independent. If a column or a row has just zeros as its elements, the determinant is equal to zero. If two columns (rows) are interchanged, the determinant changes its sign. Calculation rules for the determinant are: • |A| = |A0| (4.6)
• |AB| = |A|·|B| (4.7)
• |cA| = cn|A| (4.8)
4.3 Inverse of a Matrix
If A is a square matrix, there may exist a matrix B with property AB = BA = I. If such a matrix exists, it is called the inverse of A and is denoted by A−1, hence AA−1 = A−1A = I. The inverse of a matrix can be computed as follows
1+1 2+1 n+1 (−1) |A11| (−1) |A21| ... (−1) |An1| 1+2 2+2 n+2 (−1) |A12| (−1) |A22| ... (−1) |An2| −1 1 A = . . |A| . .. . . . . 1+n 2+n n+n (−1) |A1n| (−1) |A2n| ... (−1) |Ann| 9 Short Guides to Microeconometrics
where Aij is the matrix of dimension (n − 1) × (n − 1) obtained from A by deleting the i-th row and the j-th column.
C a a a S D 11 1 j 1n T D T D T A = ij D ai1 aij ain T D T D T D T E an1 anj ann U
i+j The term (−1) |Aij| is called the cofactor of aij. For n = 2, the inverse is given by ! 1 a −a A−1 = 22 12 . a11a22 − a12a21 −a21 a11
We have the following calculation rules if both A−1 and B−1 exist and matrix dimensions agree:
−1 • A−1 = A (4.9)
• (AB)−1 = B−1A−1 (order!) (4.10) 0 • (A0)−1 = A−1 (4.11) 1 • |A−1| = |A|−1 = (4.12) |A| 4.4 Nonsingular Square Matrices
The following statements about a square (n × n) matrix A are equivalent:
• A is nonsingular (4.13)
• |A|= 6 0 (4.14)
• A−1 exists (4.15)
• rank(A) = n (full rank) (4.16)
• λi 6= 0 for all i = 1, ..., n (4.17) Elements of Matrix Algebra 10
5 Systems of Equations
Consider the following system of m equations in n unknowns x1, . . . , xn:
a11x1 + a12x2 + ··· + a1nxn = b1
a21x1 + a22x2 + ··· + a2nxn = b2 ...
am1x1 + am2x2 + ··· + amnxn = bm
0 If we collect the unknowns into a vector x = (x1, . . . , xn) , the coefficients b1, . . . , bn in to a vector b, and the coefficients (aij) into a matrix A, we can rewrite the equation system compactly in matrix form as follows: a11 a12 . . . a1n x1 b1 a21 a22 . . . a2n x2 b2 . . . . = . . . .. . . . . . . . . . am1 am2 . . . amn xn bn | {z } | {z } | {z } A x b
A x = b
This equation system has a unique solution if m = n, i.e. if A is a square matrix, and A is nonsingular, i.e A−1 exits. The solution is then given by x = A−1b
Remark 4. To achieve numerical accuracy it is preferable not to compute the inverse explicitly. There are efficient numerical algorithms which can solve the equation system without computing the inverse. 11 Short Guides to Microeconometrics
6 Eigenvalue, -vector and Decomposition
6.1 Eigenvalue and Eigenvector
A scalar λ is said to be an eigenvalue of the square matrix A if there exists a vector x 6= 0 such that A x = λx
The vector x is called an eigenvector corresponding to λ. If x is an eigenvector then α x, α 6= 0, is also an eigenvector. Eigenvectors are therefore not unique. It is sometimes useful to normalize the length of the eigenvectors to one, i.e. to choose the eigenvector such that x0x = 1.
6.2 Characteristic Equation
In order to find the eigenvalues and eigenvectors of a square matrix, one has to solve the equation system
A x = λx = λI x ⇐⇒ (A − λ I)x = 0.
This equation system has a nontrivial solution, x 6= 0, if and only if the matrix (A − λ I) is singular, or equivalently if and only if the determinant of (A − λ I) is equal to zero. This leads to an equation in the unknown parameter λ: |A − λ I| = 0. This equation is called the characteristic equation of the matrix A and corresponds to a polynomial equation of order n. The n solutions of this equation (roots) are the eigenvalues of the matrix. The solutions may be complex numbers. Some solutions may appear several times. Eigenvectors corresponding to some eigenvalue λ can be obtained from the equation (A − λ I)x = 0. We have the following relations for an (n × n) matrix A: Pn • tr(A) = i=1 λi (6.1) Qn • |A| = i=1 λi (6.2) Elements of Matrix Algebra 12
6.3 Decomposition of Symmetric Matrices
If A is a symmetric (n×n) matrix, all n eigenvalues λ1, ..., λn are real and there exist n linearly independent eigenvectors x1,..., xn with the prop- 0 0 erties xixj = 0 for i 6= j and xixi = 1, i.e the eigenvectors are orthogonal to each other and of length one. The eigenvector xi corresponds to the eigenvalue λi. A symmetric (n × n) matrix A can be diagonalized as
H0AH = Λ, (6.3) where the diagonal matrix Λ collects the eigenvalues of A λ1 0 ... 0 0 λ2 ... 0 Λ = . . . , . . .. . . . . . 0 0 . . . λn and the (n × n) matrix H = (x1,..., xn) collecting the corresponding eigenvectors x1,..., xn is orthogonal
H0H = I, hence H−1 = H0 and HH0 = I. We can therefore decompose A into the sum of n matrices: n 0 X 0 A = HΛH = λixixi i=1 0 where the matrices hihi have all rank one. This decomposition is called the spectral decomposition or eigendecomposition of A. The inverse of a nonsingular symmetric matrix A can be calculated as
n X 1 A−1 = HΛ−1H0 = x x0 . λ i i i=1 i Remark 5. Beside symmetric matrices, many other matrices, but not all matrices, are also diagonalizable. 13 Short Guides to Microeconometrics
7 Quadratic Forms
For a vector x ∈ Rn and a symmetric matrix A of dimension (n × n) the scalar function n n 0 X X f(x) = x Ax = xixjaij j=1 i=1 is called a quadratic form. The quadratic form x0Ax and therefore the matrix A is called positive (negative) definite, if and only if
x0Ax > 0(< 0) for all x 6= 0.
The property that A is positive definite implies that
• λi > 0 for all i = 1, ..., n (7.1)
• |A| > 0 (7.2)
• A−1 exists and is positive definite (7.3)
• tr(A) > 0 (7.4)
The first property is an alternative definition for a positive definite matrix. The quadratic form x0Ax and therefore the matrix A is called non- negative definite or positive semi-definite, if and only if
x0Ax ≥ 0 for all x.
For nonnegative definite matrices we have:
• λi ≥ 0 for all i = 1, ..., n (7.5)
• |A| ≥ 0 (7.6)
• tr(A) ≥ 0 (7.7)
The first property is an alternative definition for nonnegative definiteness. Elements of Matrix Algebra 14
For an (n × m) matrix B,
• B0B is nonnegative definite (7.8)
• B0B is positive definite if B has full column rank (7.9)
• BB0 is nonnegative definite (7.10)
If the (n×m) matrix B has rank m (full column rank) and the (n×n) matrix A is positive definite then
• B0AB is positive definite (7.11)
The inverse of a symmetric positive definite (n × n) matrix A can be decomposed into
A−1 = C0C where CAC0 = I. where C is a (n × n) matrix. 15 Short Guides to Microeconometrics
8 Partitioned Matrices
Consider a square matrix P of dimensions ((p + q) × (r + s)) which is partitioned into the (p×r) matrix P11, the (p×s) matrix P12, the (q ×r) matrix P21 and the (q × s) matrix P22: ! P P P = 11 12 P21 P22
Assuming that dimensions in the involved multiplications agree, two partitioned matrices are multiplied as ! ! ! P P Q Q P Q + P Q P Q + P Q 11 12 11 12 = 11 11 12 21 11 12 12 22 P21 P22 Q21 Q22 P21Q11 + P22Q21 P21Q12 + P22Q22
−1 Assuming that P11 exists, the determinant of a partitioned matrix is
P P 11 12 −1 = |P11| · |P22 − P21P11 P12| (8.1) P21 P22 and the inverse is !−1 ! P P P−1 + P−1P F−1P P−1 −P−1P F−1 11 12 = 11 11 12 21 11 11 12 (8.2) −1 −1 −1 P21 P22 −F P21P11 F
−1 where F = P22 − P21P11 P12 is assumed nonsingular. The determinant of a block diagonal matrix is
P O 11 = |P11| · |P22| OP22
−1 −1 and its inverse is, assuming that P11 and P22 exist,
−1 ! −1 ! P11 O P11 O = −1 . OP22 OP22 Elements of Matrix Algebra 16
9 Derivatives with Matrix Algebra
A linear function f from the n-dimensional vector space of real numbers, Rn, to the real numbers, R, f : Rn −→ R is determined by the coefficient 0 vector a = (a1, . . . , an) :
n 0 X y = f(x) = a x = aixi = a1x1 + a2x2 + ··· + anxn i=1 where x is a column vector of dimension n and y a scalar. The derivative of y = f(x) with respect to the column vector x is defined as follows: ∂y/∂x1 a1 0 0 ∂y ∂a x ∂x a ∂y/∂x2 a2 = = = . = . = a ∂x ∂x ∂x . . . . ∂y/∂xn an and with respect to the row vector x0 as follows:
0 0 ∂y ∂a x ∂x a h ∂y ∂y ∂y i h i 0 = = = ... = a1 a2 . . . an = a ∂x0 ∂x0 ∂x0 ∂x1 ∂x2 ∂xn The simultaneous equation system y = Ax can be viewed as m linear 0 0 functions yi = aix where ai denotes the i-th row of the (m×n) dimensional matrix A. Thus the derivative of yi with respect to x is given by ∂y ∂a0 x i = i = a ∂x ∂x i Consequently the derivative of y = Ax with respect to row vector x0 can be defined as
0 0 ∂y1/∂x a1 0 0 ∂y ∂Ax ∂y2/∂x a2 = = . = . = A. ∂x0 ∂x0 . . . . 0 0 ∂ym/∂x am 17 Short Guides to Microeconometrics
The derivative of y = Ax with respect to column vector x is therefore ∂y ∂Ax = = A0. ∂x ∂x For a square matrix A of dimension (n × n) and the quadratic form 0 Pn Pn x Ax = j=1 i=1 xixjaij the derivative with respect to the column vector x is defined as ∂x0Ax = (A + A0)x. ∂x If A is a symmetric matrix this reduces to
∂x0Ax = 2Ax. ∂x The derivative of the quadratic form x0Ax with respect to the matrix elements aij is given by ∂x0Ax = xixj. ∂aij Therefore the derivative with respect to the matrix A is given by
∂x0Ax = xx0. ∂A Elements of Matrix Algebra 18
10 Kronecker Product
The Kronecker Product of a m × n Matrix A with a p × q Matrix B is a mp × nq Matrix A ⊗ B defined as follows: a11B a12B . . . a1nB a21B a22B . . . a21B A ⊗ B = . . . . . .. . . . . . am1B am2B . . . amnB
The following calculation rules hold if matrix dimensions agree:
• (A ⊗ B) + (C ⊗ B) = (A + C) ⊗ B (10.1)
• (A ⊗ B) + (A ⊗ C) = A ⊗ (B + C) (10.2)
• (A ⊗ B)(C ⊗ D) = (AC) ⊗ (BD) (10.3)
• (A ⊗ B)−1 = A−1 ⊗ B−1 (10.4)
• tr(A ⊗ B) = tr(A)tr(B) (10.5) 19 Short Guides to Microeconometrics
References
[1] Abadir, K.M. and J.R. Magnus, Matrix Algebra, Cambridge: Cam- bridge University Press, 2005.
[2] Amemiya, T., Introduction to Statistics and Econometrics, Cam- bridge, Massachusetts: Harvard University Press, 1994.
[3] Dhrymes, P.J., Introductory Econometrics, New York : Springer- Verlag, 1978.
[4] Meyer, C.D., Matrix Analysis and Applied Linear Algebra, Philadel- phia: SIAM, 2000.
[5] Strang, G., Linear Algebra and its Applications, 3rd Edition, San Diego: Harcourt Brace Jovanovich, 1986.
[6] Magnus, J.R., and H. Neudecker, Matrix Differential Calculus with Applications in Statistics and Econometrics, Chichester: John Wiley, 1988. Elements of Matrix Algebra 20
Formula Sources and Proofs
(2.8) Abadir and Magnus (2005), p. 28, ex. 2.18 (b). (2.10) Abadir and Magnus (2005), p. 25, ex. 2.14 (a). (2.11) Abadir and Magnus (2005), p. 25, ex. 2.14 (b). (2.14) Abadir and Magnus (2005), p. 26, ex. 2.15 (a). (2.15) Abadir and Magnus (2005), p. 26, ex. 2.15 (b). (3.1) Abadir and Magnus (2005), p. 78 - 79, ex. 4.7 (a). (3.2) Abadir and Magnus (2005), p. 77 - 78, ex. 4.5. (3.3) Abadir and Magnus (2005), p. 81, ex. 4.13 (d). (3.4) Abadir and Magnus (2005), p. 81, ex. 4.15 (b). (3.5) Abadir and Magnus (2005), p. 85, ex. 4.25 (c). (3.6) Abadir and Magnus (2005), p. 85, ex. 4.25 (d). (3.7) Abadir and Magnus (2005), p. 221, ex. 8.27 (a). (3.8) Abadir and Magnus (2005), p. 221, ex. 8.26 (a). (4.1) Abadir and Magnus (2005), p. 30, ex. 2.24 (b). (4.2) Abadir and Magnus (2005), p. 30, ex. 2.24 (c). (4.3) Abadir and Magnus (2005), p. 30, ex. 2.24 (a). (4.4) Abadir and Magnus (2005), p. 30, ex. 2.26 (a). (4.5) Abadir and Magnus (2005), p. 31, ex. 2.26 (c). (4.6) Abadir and Magnus (2005), p. 88, ex. 4.30. (4.7) Abadir and Magnus (2005), p. 94, ex. 4.42. (4.8) Abadir and Magnus (2005), p. 90, ex. 4.35 (a). (4.9) Abadir and Magnus (2005), p. 84, ex. 4.22 (b). (4.10) Abadir and Magnus (2005), p. 84, ex. 4.22 (d). (4.11) Abadir and Magnus (2005), p. 84, ex. 4.22 (c). (4.12) Abadir and Magnus (2005), p. 95, ex. 4.44 (a). 21 Short Guides to Microeconometrics
(4.13) Abadir and Magnus (2005), p. 83-84, ex. 4.21. (4.14) Abadir and Magnus (2005), p. 94, ex. 4.43. (4.15) Abadir and Magnus (2005), p. 83-84, ex. 4.21. (4.16) Abadir and Magnus (2005), p. 83-84, ex. 4.21 (4.17) Abadir and Magnus (2005), p. 164, ex. 7.16. (6.1) Abadir and Magnus (2005), p. 168, ex. 7.27. (6.2) Abadir and Magnus (2005), p. 167, ex. 7.26. (6.3) Abadir and Magnus (2005), p. 177, ex. 7.46. (7.1) Abadir and Magnus (2005), p. 215, ex. 8.11 (a). (7.2) Abadir and Magnus (2005), p. 215, ex. 8.12 (a). (7.3) Abadir and Magnus (2005), p. 216, ex. 8.14 (c). (7.4) Abadir and Magnus (2005), p. 215, ex. 8.12 (b). (7.5) Abadir and Magnus (2005), p. 215, ex. 8.11 (b). (7.6) Abadir and Magnus (2005), p. 216, ex. 8.13 (a) (7.7) Abadir and Magnus (2005), p. 216, ex. 8.13 (b). (7.8) Abadir and Magnus (2005), p. 214, ex. 8.9 (a). (7.10) Abadir and Magnus (2005), p. 214, ex. 8.9 (a). (7.11) Abadir and Magnus (2005), p. 221, ex. 8.26 (b). (8.1) Abadir and Magnus (2005), p. 114, ex. 5.30 (a). (8.2) Abadir and Magnus (2005), p. 106, ex. 5.16 (a). (10.1) Abadir and Magnus (2005), p. 275, ex. 10.3 (a). (10.2) Abadir and Magnus (2005), p. 275, ex. 10.3 (b). (10.3) Abadir and Magnus (2005), p. 275, ex. 10.3 (d). (10.4) Abadir and Magnus (2005), p. 278, ex. 10.8. (10.5) Abadir and Magnus (2005), p. 277, ex. 10.7 (b).