<<

Short Guides to Microeconometrics Kurt Schmidheiny/Klaus Neusser Fall 2020 Universit¨atBasel

Elements of

Contents

1 Definitions 2

2 Matrix Operations 3

3 of a Matrix 6

4 Special Functions of Square Matrices 7

5 Systems of Equations 10

6 Eigenvalue, -vector and Decomposition 11

7 Quadratic Forms 13

8 Partitioned Matrices 15

9 Derivatives with Matrix Algebra 16

10 18

References 19

Formula Sources and Proofs 20

Version: 30-9-2020, 18:06 Elements of Matrix Algebra 2

Foreword

These lecture notes are supposed to summarize the main results concern- ing matrix algebra as they are used in econometrics and economics. For a deeper discussion of the material, the interested reader should consult the references listed at the end.

1 Definitions

A matrix is a rectangular array of . Here we consider only real numbers. If the matrix has n rows and m columns, we say that the matrix is of (n × m). We denote matrices by capital bold letters:

  a11 a12 . . . a1m   a21 a22 . . . a2m  A = (A)ij = (aij) =  . . .   . . .. .   . . . .  an1 an2 . . . anm

The numbers aij are called the elements of the matrix. An (n × 1) matrix is a column vector with n elements. Similarly, a (1 × m) matrix is a row vector with m elements. We denote vectors by bold letters.   a1   a2    a =  .  b = b1 b2 . . . bm .  .   .  an

A (1 × 1) matrix is a which is denoted by an italic letter. The null matrix (O) is a matrix whose elements are all equal to zero, i.e. aij = 0 for all i = 1, . . . , n and j = 1, . . . , m. A is a matrix with the same of columns and rows, i.e. n = m. 3 Short Guides to Microeconometrics

A is a square matrix such that aij = aji for all i = 1, . . . , n and j = 1, . . . , m. A matrix is a square matrix such that the off-diagonal ele- ments are all equal to zero, i.e. aij = 0 for i 6= j. The is a with all diagonal elements equal to one. The identity matrix is denoted by I or In.

A square matrix is said to be upper triangular whenever aij = 0 for i > j and lower triangular whenever aij = 0 for i < j. Two vectors a and b are said to be linearly dependent if there exist scalars α and β both not equal to zero such that αa + βb = 0. Otherwise they are said to be linearly independent.

2 Matrix Operations

2.1 Equality

Two matrices or two vectors are equal if they have the same dimension and if their respective elements are all equal:

A = B ⇐⇒ aij = bij for all i and j

2.2

Definition 1. The matrix B is called the transpose of matrix A if and only if

bij = aji for all i and j. The matrix B is denoted by A0 or AT . Taking the transpose of a matrix is equivalent to interchanging rows and columns. If A has dimension (n×m) then A0 has dimension (m×n). The transpose of a column vector is a row vector and vice versa. Note:

• (A0)0 = A for any matrix A (2.1)

• A0 = A for a symmetric matrix A (2.2) Elements of Matrix Algebra 4

2.3 Addition and Subtraction

The addition and subtraction of matrices is only defined for matrices with the same dimension. Definition 2. The sum of two matrices A and B of the same dimensions is given by the sum of their elements, i.e.

C = A + B ⇐⇒ cij = aij + bij for all i and j

The sum of a matrix A and a scalar b is a matrix C = A + b with cij = aij + b. Note that A + b = b + A. We have the following calculation rules if matrix dimensions agree:

• A + O = A (2.3)

• A − B = A + (−B) (2.4)

• A + B = B + A (2.5)

• (A + B) + C = A + (B + C) (2.6)

• (A + B)0 = A0 + B0 (2.7)

2.4 Product

Definition 3. The inner product (, scalar product) of two vectors a and b of the same dimension (n × 1) is a scalar () defined as:

n 0 0 X a b = b a = a1b1 + a2b2 + ··· + anbn = aibi. i=1 The product of a scalar c and a matrix A is a matrix B = cA with bij = caij. Note that cA = Ac when c is a scalar. Definition 4. The product of two matrices A and B with dimensions (n×k) and (k ×m), respectively, is given by the matrix C with dimension 5 Short Guides to Microeconometrics

(n × m) such that

k X C = AB ⇐⇒ cij = aisbsj for all i and j s=1 Remark 1. The matrix product is only defined if the number of columns of the first matrix is equal to the number of rows of the second matrix. Thus, although AB may be defined, BA is only defined if n = m. Thus for square matrices both AB and BA are defined. Remark 2. The product of two matrices is in general not commutative, i.e. AB 6= BA. Remark 3. The product AB may also be defined as

0 cij = (C)ij = ai•b•j

0 where ai• denotes the i-th row of A and b•j the j-th column of B. We have the following calculation rules if matrix dimensions agree:

• AI = A, IA = A (2.8)

• AO = O, OA = O (2.9)

• (AB)C = A(BC) = ABC (2.10)

• A(B + C) = AB + AC (2.11)

• (B + C)A = BA + CA (2.12)

• c(A + B) = cA + cB (2.13)

• (AB)0 = B0A0 (order!) (2.14)

• (ABC)0 = C0B0A0 (order!) (2.15) Elements of Matrix Algebra 6

3 Rank of a Matrix

Pn A set of vectors x1, x2,..., xn is linearly independent if i=1 cixi = 0 implies ci = 0 for all i = 1, . . . , n. The column rank of a matrix is the maximal number of linearly in- dependent columns. The row rank of a matrix is the maximal number of linearly independent rows. A matrix is said to have full column (row) rank if the column rank (row rank) equals the number of columns (rows). The column rank of an n × k matrix A is equal to its row rank. We can therefore just speak of the rank of a matrix denoted by rank(A). For an (n × k) matrix A, a (k × m) matrix B and an (n × n) square matrix C, we have

• rank(A) ≤ min(n, k) (3.1)

• rank(A0) = rank(A) (3.2)

• rank(A0A) = rank(AA0) = rank(A) (3.3)

• rank(AB) ≤ min(rank(A), rank(B)) (3.4)

• rank(AB) = rank(B) if A has full column rank (3.5)

• rank(AB) = rank(A) if B has full row rank (3.6)

• rank(A0CA) = rank(CA) if C is nonnegative definite (3.7)

• rank(A0CA) = rank(A) if C is positive definite (3.8) 7 Short Guides to Microeconometrics

4 Special Functions of Square Matrices

In this section only square (n × n) matrices are considered.

4.1 of a Matrix

Definition 5. The trace of a matrix A, denoted by tr(A), is the sum of its diagonal elements: n X tr(A) = aii i=1 The following calculation rules hold if matrix dimensions agree:

• tr(cA) = c tr(A) (4.1)

• tr(A0) = tr(A) (4.2)

• tr(A + B) = tr(A) + tr(B) (4.3)

• tr(AB) = tr(BA) (4.4)

• tr(ABC) = tr(BCA) = tr(CAB) (4.5)

4.2

The determinant of a(n × n) matrix A with n > 1 can be computed according to the following formula:

n X i+j |A| = aij(−1) |Aij| for some arbitrary j i=1 The determinant, computed as above, is said to be developed according i+j to the j-th column. The term (−1) |Aij| is called the cofactor of the element aij. Thereby Aij is a matrix of dimension ((n − 1) × (n − 1)) which is obtained by deleting the i-th row and the j-th column. Elements of Matrix Algebra 8

C a  a  a S D 11 1 j 1n T D    T D T A =   ij D ai1 aij ain T D    T D T D   T E an1 anj ann U

For n = 1, i.e. if A is a scalar, the determinant |A| is defined as the . For n = 2 , the determinant is given by:

|A| = a11a22 − a12a21. If at least two columns (rows) are linearly dependent, the determi- nant is equal to zero and the inverse of A does not exist. The matrix is called singular in this case. If the matrix is nonsingular then all columns (rows) are linearly independent. If a column or a row has just zeros as its elements, the determinant is equal to zero. If two columns (rows) are interchanged, the determinant changes its sign. Calculation rules for the determinant are: • |A| = |A0| (4.6)

• |AB| = |A|·|B| (4.7)

• |cA| = cn|A| (4.8)

4.3 Inverse of a Matrix

If A is a square matrix, there may exist a matrix B with property AB = BA = I. If such a matrix exists, it is called the inverse of A and is denoted by A−1, hence AA−1 = A−1A = I. The inverse of a matrix can be computed as follows

 1+1 2+1 n+1  (−1) |A11| (−1) |A21| ... (−1) |An1| 1+2 2+2 n+2 (−1) |A12| (−1) |A22| ... (−1) |An2| −1 1   A =  . .  |A|  . .. .   . . .  1+n 2+n n+n (−1) |A1n| (−1) |A2n| ... (−1) |Ann| 9 Short Guides to Microeconometrics

where Aij is the matrix of dimension (n − 1) × (n − 1) obtained from A by deleting the i-th row and the j-th column.

C a  a  a S D 11 1 j 1n T D    T D T A =   ij D ai1 aij ain T D    T D T D   T E an1 anj ann U

i+j The term (−1) |Aij| is called the cofactor of aij. For n = 2, the inverse is given by ! 1 a −a A−1 = 22 12 . a11a22 − a12a21 −a21 a11

We have the following calculation rules if both A−1 and B−1 exist and matrix dimensions agree:

−1 • A−1 = A (4.9)

• (AB)−1 = B−1A−1 (order!) (4.10) 0 • (A0)−1 = A−1 (4.11) 1 • |A−1| = |A|−1 = (4.12) |A| 4.4 Nonsingular Square Matrices

The following statements about a square (n × n) matrix A are equivalent:

• A is nonsingular (4.13)

• |A|= 6 0 (4.14)

• A−1 exists (4.15)

• rank(A) = n (full rank) (4.16)

• λi 6= 0 for all i = 1, ..., n (4.17) Elements of Matrix Algebra 10

5 Systems of Equations

Consider the following system of m equations in n unknowns x1, . . . , xn:

a11x1 + a12x2 + ··· + a1nxn = b1

a21x1 + a22x2 + ··· + a2nxn = b2 ...

am1x1 + am2x2 + ··· + amnxn = bm

0 If we collect the unknowns into a vector x = (x1, . . . , xn) , the coefficients b1, . . . , bn in to a vector b, and the coefficients (aij) into a matrix A, we can rewrite the equation system compactly in matrix form as follows:       a11 a12 . . . a1n x1 b1        a21 a22 . . . a2n  x2  b2   . . .   .  =  .   . . .. .   .   .   . . . .   .   .  am1 am2 . . . amn xn bn | {z } | {z } | {z } A x b

A x = b

This equation system has a unique solution if m = n, i.e. if A is a square matrix, and A is nonsingular, i.e A−1 exits. The solution is then given by x = A−1b

Remark 4. To achieve numerical accuracy it is preferable not to compute the inverse explicitly. There are efficient numerical algorithms which can solve the equation system without computing the inverse. 11 Short Guides to Microeconometrics

6 Eigenvalue, -vector and Decomposition

6.1 Eigenvalue and Eigenvector

A scalar λ is said to be an eigenvalue of the square matrix A if there exists a vector x 6= 0 such that A x = λx

The vector x is called an eigenvector corresponding to λ. If x is an eigenvector then α x, α 6= 0, is also an eigenvector. Eigenvectors are therefore not unique. It is sometimes useful to normalize the length of the eigenvectors to one, i.e. to choose the eigenvector such that x0x = 1.

6.2 Characteristic Equation

In order to find the eigenvalues and eigenvectors of a square matrix, one has to solve the equation system

A x = λx = λI x ⇐⇒ (A − λ I)x = 0.

This equation system has a nontrivial solution, x 6= 0, if and only if the matrix (A − λ I) is singular, or equivalently if and only if the determinant of (A − λ I) is equal to zero. This leads to an equation in the unknown parameter λ: |A − λ I| = 0. This equation is called the characteristic equation of the matrix A and corresponds to a equation of order n. The n solutions of this equation (roots) are the eigenvalues of the matrix. The solutions may be complex numbers. Some solutions may appear several times. Eigenvectors corresponding to some eigenvalue λ can be obtained from the equation (A − λ I)x = 0. We have the following relations for an (n × n) matrix A: Pn • tr(A) = i=1 λi (6.1) Qn • |A| = i=1 λi (6.2) Elements of Matrix Algebra 12

6.3 Decomposition of Symmetric Matrices

If A is a symmetric (n×n) matrix, all n eigenvalues λ1, ..., λn are real and there exist n linearly independent eigenvectors x1,..., xn with the prop- 0 0 erties xixj = 0 for i 6= j and xixi = 1, i.e the eigenvectors are orthogonal to each other and of length one. The eigenvector xi corresponds to the eigenvalue λi. A symmetric (n × n) matrix A can be diagonalized as

H0AH = Λ, (6.3) where the diagonal matrix Λ collects the eigenvalues of A   λ1 0 ... 0    0 λ2 ... 0  Λ =  . . .  ,  . . .. .   . . . .  0 0 . . . λn and the (n × n) matrix H = (x1,..., xn) collecting the corresponding eigenvectors x1,..., xn is orthogonal

H0H = I, hence H−1 = H0 and HH0 = I. We can therefore decompose A into the sum of n matrices: n 0 X 0 A = HΛH = λixixi i=1 0 where the matrices hihi have all rank one. This decomposition is called the spectral decomposition or eigendecomposition of A. The inverse of a nonsingular symmetric matrix A can be calculated as

n X 1 A−1 = HΛ−1H0 = x x0 . λ i i i=1 i Remark 5. Beside symmetric matrices, many other matrices, but not all matrices, are also diagonalizable. 13 Short Guides to Microeconometrics

7 Quadratic Forms

For a vector x ∈ Rn and a symmetric matrix A of dimension (n × n) the scalar function n n 0 X X f(x) = x Ax = xixjaij j=1 i=1 is called a . The quadratic form x0Ax and therefore the matrix A is called positive (negative) definite, if and only if

x0Ax > 0(< 0) for all x 6= 0.

The property that A is positive definite implies that

• λi > 0 for all i = 1, ..., n (7.1)

• |A| > 0 (7.2)

• A−1 exists and is positive definite (7.3)

• tr(A) > 0 (7.4)

The first property is an alternative definition for a positive definite matrix. The quadratic form x0Ax and therefore the matrix A is called non- negative definite or positive semi-definite, if and only if

x0Ax ≥ 0 for all x.

For nonnegative definite matrices we have:

• λi ≥ 0 for all i = 1, ..., n (7.5)

• |A| ≥ 0 (7.6)

• tr(A) ≥ 0 (7.7)

The first property is an alternative definition for nonnegative definiteness. Elements of Matrix Algebra 14

For an (n × m) matrix B,

• B0B is nonnegative definite (7.8)

• B0B is positive definite if B has full column rank (7.9)

• BB0 is nonnegative definite (7.10)

If the (n×m) matrix B has rank m (full column rank) and the (n×n) matrix A is positive definite then

• B0AB is positive definite (7.11)

The inverse of a symmetric positive definite (n × n) matrix A can be decomposed into

A−1 = C0C where CAC0 = I. where C is a (n × n) matrix. 15 Short Guides to Microeconometrics

8 Partitioned Matrices

Consider a square matrix P of dimensions ((p + q) × (r + s)) which is partitioned into the (p×r) matrix P11, the (p×s) matrix P12, the (q ×r) matrix P21 and the (q × s) matrix P22: ! P P P = 11 12 P21 P22

Assuming that dimensions in the involved agree, two partitioned matrices are multiplied as ! ! ! P P Q Q P Q + P Q P Q + P Q 11 12 11 12 = 11 11 12 21 11 12 12 22 P21 P22 Q21 Q22 P21Q11 + P22Q21 P21Q12 + P22Q22

−1 Assuming that P11 exists, the determinant of a partitioned matrix is

P P 11 12 −1 = |P11| · |P22 − P21P11 P12| (8.1) P21 P22 and the inverse is !−1 ! P P P−1 + P−1P F−1P P−1 −P−1P F−1 11 12 = 11 11 12 21 11 11 12 (8.2) −1 −1 −1 P21 P22 −F P21P11 F

−1 where F = P22 − P21P11 P12 is assumed nonsingular. The determinant of a block diagonal matrix is

P O 11 = |P11| · |P22| OP22

−1 −1 and its inverse is, assuming that P11 and P22 exist,

−1 ! −1 ! P11 O P11 O = −1 . OP22 OP22 Elements of Matrix Algebra 16

9 Derivatives with Matrix Algebra

A f from the n-dimensional of real numbers, Rn, to the real numbers, R, f : Rn −→ R is determined by the coefficient 0 vector a = (a1, . . . , an) :

n 0 X y = f(x) = a x = aixi = a1x1 + a2x2 + ··· + anxn i=1 where x is a column vector of dimension n and y a scalar. The derivative of y = f(x) with respect to the column vector x is defined as follows:     ∂y/∂x1 a1 0 0     ∂y ∂a x ∂x a ∂y/∂x2  a2  = = =  .  =  .  = a ∂x ∂x ∂x  .   .   .   .  ∂y/∂xn an and with respect to the row vector x0 as follows:

0 0 ∂y ∂a x ∂x a h ∂y ∂y ∂y i h i 0 = = = ... = a1 a2 . . . an = a ∂x0 ∂x0 ∂x0 ∂x1 ∂x2 ∂xn The simultaneous equation system y = Ax can be viewed as m linear 0 0 functions yi = aix where ai denotes the i-th row of the (m×n) dimensional matrix A. Thus the derivative of yi with respect to x is given by ∂y ∂a0 x i = i = a ∂x ∂x i Consequently the derivative of y = Ax with respect to row vector x0 can be defined as

 0   0  ∂y1/∂x a1  0   0  ∂y ∂Ax  ∂y2/∂x   a2  = =  .  =  .  = A. ∂x0 ∂x0  .   .   .   .  0 0 ∂ym/∂x am 17 Short Guides to Microeconometrics

The derivative of y = Ax with respect to column vector x is therefore ∂y ∂Ax = = A0. ∂x ∂x For a square matrix A of dimension (n × n) and the quadratic form 0 Pn Pn x Ax = j=1 i=1 xixjaij the derivative with respect to the column vector x is defined as ∂x0Ax = (A + A0)x. ∂x If A is a symmetric matrix this reduces to

∂x0Ax = 2Ax. ∂x The derivative of the quadratic form x0Ax with respect to the matrix elements aij is given by ∂x0Ax = xixj. ∂aij Therefore the derivative with respect to the matrix A is given by

∂x0Ax = xx0. ∂A Elements of Matrix Algebra 18

10 Kronecker Product

The Kronecker Product of a m × n Matrix A with a p × q Matrix B is a mp × nq Matrix A ⊗ B defined as follows:   a11B a12B . . . a1nB    a21B a22B . . . a21B  A ⊗ B =  . . .   . . .. .   . . . .  am1B am2B . . . amnB

The following calculation rules hold if matrix dimensions agree:

• (A ⊗ B) + (C ⊗ B) = (A + C) ⊗ B (10.1)

• (A ⊗ B) + (A ⊗ C) = A ⊗ (B + C) (10.2)

• (A ⊗ B)(C ⊗ D) = (AC) ⊗ (BD) (10.3)

• (A ⊗ B)−1 = A−1 ⊗ B−1 (10.4)

• tr(A ⊗ B) = tr(A)tr(B) (10.5) 19 Short Guides to Microeconometrics

References

[1] Abadir, K.M. and J.R. Magnus, Matrix Algebra, Cambridge: Cam- bridge University Press, 2005.

[2] Amemiya, T., Introduction to Statistics and Econometrics, Cam- bridge, Massachusetts: Harvard University Press, 1994.

[3] Dhrymes, P.J., Introductory Econometrics, New York : Springer- Verlag, 1978.

[4] Meyer, C.D., and Applied , Philadel- phia: SIAM, 2000.

[5] Strang, G., Linear Algebra and its Applications, 3rd Edition, San Diego: Harcourt Brace Jovanovich, 1986.

[6] Magnus, J.R., and H. Neudecker, Matrix Differential Calculus with Applications in Statistics and Econometrics, Chichester: John Wiley, 1988. Elements of Matrix Algebra 20

Formula Sources and Proofs

(2.8) Abadir and Magnus (2005), p. 28, ex. 2.18 (b). (2.10) Abadir and Magnus (2005), p. 25, ex. 2.14 (a). (2.11) Abadir and Magnus (2005), p. 25, ex. 2.14 (b). (2.14) Abadir and Magnus (2005), p. 26, ex. 2.15 (a). (2.15) Abadir and Magnus (2005), p. 26, ex. 2.15 (b). (3.1) Abadir and Magnus (2005), p. 78 - 79, ex. 4.7 (a). (3.2) Abadir and Magnus (2005), p. 77 - 78, ex. 4.5. (3.3) Abadir and Magnus (2005), p. 81, ex. 4.13 (d). (3.4) Abadir and Magnus (2005), p. 81, ex. 4.15 (b). (3.5) Abadir and Magnus (2005), p. 85, ex. 4.25 (c). (3.6) Abadir and Magnus (2005), p. 85, ex. 4.25 (d). (3.7) Abadir and Magnus (2005), p. 221, ex. 8.27 (a). (3.8) Abadir and Magnus (2005), p. 221, ex. 8.26 (a). (4.1) Abadir and Magnus (2005), p. 30, ex. 2.24 (b). (4.2) Abadir and Magnus (2005), p. 30, ex. 2.24 (c). (4.3) Abadir and Magnus (2005), p. 30, ex. 2.24 (a). (4.4) Abadir and Magnus (2005), p. 30, ex. 2.26 (a). (4.5) Abadir and Magnus (2005), p. 31, ex. 2.26 (c). (4.6) Abadir and Magnus (2005), p. 88, ex. 4.30. (4.7) Abadir and Magnus (2005), p. 94, ex. 4.42. (4.8) Abadir and Magnus (2005), p. 90, ex. 4.35 (a). (4.9) Abadir and Magnus (2005), p. 84, ex. 4.22 (b). (4.10) Abadir and Magnus (2005), p. 84, ex. 4.22 (d). (4.11) Abadir and Magnus (2005), p. 84, ex. 4.22 (c). (4.12) Abadir and Magnus (2005), p. 95, ex. 4.44 (a). 21 Short Guides to Microeconometrics

(4.13) Abadir and Magnus (2005), p. 83-84, ex. 4.21. (4.14) Abadir and Magnus (2005), p. 94, ex. 4.43. (4.15) Abadir and Magnus (2005), p. 83-84, ex. 4.21. (4.16) Abadir and Magnus (2005), p. 83-84, ex. 4.21 (4.17) Abadir and Magnus (2005), p. 164, ex. 7.16. (6.1) Abadir and Magnus (2005), p. 168, ex. 7.27. (6.2) Abadir and Magnus (2005), p. 167, ex. 7.26. (6.3) Abadir and Magnus (2005), p. 177, ex. 7.46. (7.1) Abadir and Magnus (2005), p. 215, ex. 8.11 (a). (7.2) Abadir and Magnus (2005), p. 215, ex. 8.12 (a). (7.3) Abadir and Magnus (2005), p. 216, ex. 8.14 (c). (7.4) Abadir and Magnus (2005), p. 215, ex. 8.12 (b). (7.5) Abadir and Magnus (2005), p. 215, ex. 8.11 (b). (7.6) Abadir and Magnus (2005), p. 216, ex. 8.13 (a) (7.7) Abadir and Magnus (2005), p. 216, ex. 8.13 (b). (7.8) Abadir and Magnus (2005), p. 214, ex. 8.9 (a). (7.10) Abadir and Magnus (2005), p. 214, ex. 8.9 (a). (7.11) Abadir and Magnus (2005), p. 221, ex. 8.26 (b). (8.1) Abadir and Magnus (2005), p. 114, ex. 5.30 (a). (8.2) Abadir and Magnus (2005), p. 106, ex. 5.16 (a). (10.1) Abadir and Magnus (2005), p. 275, ex. 10.3 (a). (10.2) Abadir and Magnus (2005), p. 275, ex. 10.3 (b). (10.3) Abadir and Magnus (2005), p. 275, ex. 10.3 (d). (10.4) Abadir and Magnus (2005), p. 278, ex. 10.8. (10.5) Abadir and Magnus (2005), p. 277, ex. 10.7 (b).