APPENDIX A
Additional Details and Fortification for Chapter 1
A.1 Matrix Classes and Special Matrices

Matrices can be grouped into several classes based on their operational properties. A short list of various classes of matrices is given in Tables A.1 and A.2. Some of these have already been described earlier, for example, elementary, symmetric, Hermitian, orthogonal, unitary, positive definite/semidefinite, negative definite/semidefinite, real, imaginary, and reducible/irreducible.

Some of the matrix classes are defined based on the existence of associated matrices. For instance, $A$ is a diagonalizable matrix if there exists a nonsingular matrix $T$ such that $TAT^{-1} = D$ is a diagonal matrix. Connected with diagonalizable matrices are normal matrices. A matrix $B$ is a normal matrix if $BB^* = B^*B$. Normal matrices are guaranteed to be diagonalizable; defective matrices, however, are not diagonalizable. Once a matrix has been identified to be diagonalizable, the following fact can be used for easier computation of integral powers of the matrix:

$$A = T^{-1}DT \quad\rightarrow\quad A^k = \left(T^{-1}DT\right)\left(T^{-1}DT\right)\cdots\left(T^{-1}DT\right) = T^{-1}D^kT$$

which takes advantage of the fact that

$$D^k = \begin{pmatrix} d_1^k & & 0 \\ & \ddots & \\ 0 & & d_N^k \end{pmatrix}$$

Another set of related classes of matrices are the idempotent, projection, involutory, nilpotent, and convergent matrices. These classes are based on the results of integral powers. Matrix $A$ is idempotent if $A^2 = A$, and if, in addition, $A$ is Hermitian, then $A$ is known as a projection matrix. Projection matrices are used to partition an $N$-dimensional space into two subspaces that are orthogonal to each other. A matrix $B$ is involutory if it is its own inverse, that is, if $B^2 = I$. For example, a reflection matrix such as the Householder matrix, given by

$$H = I - \frac{2}{v^*v}\,vv^*$$

where $v$ is a nonzero vector, satisfies $H = H^{-1}$. A convergent matrix (also known as a stable matrix) $C$ is a matrix for which $\lim_{k\to\infty} C^k = 0$.
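As a quick illustration, here is a minimal MATLAB sketch (not from the original text; the matrices are hypothetical examples) of the diagonalization shortcut and the Householder reflection property. Note that MATLAB's eig returns $V$ and $D$ with $A = VDV^{-1}$, so $T = V^{-1}$ in the notation above.

    A = [2 1; 1 2];               % hypothetical symmetric (hence diagonalizable) matrix
    [V, D] = eig(A);              % A = V*D*inv(V)
    k = 5;
    Ak = V * D^k / V;             % A^k = V*D^k*inv(V); D^k needs only N scalar powers
    norm(Ak - A^k)                % zero up to roundoff

    v = [1; 2; 2];                % any nonzero vector
    H = eye(3) - 2*(v*v')/(v'*v); % Householder reflection
    norm(H*H - eye(3))            % involutory: H is its own inverse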
Table A.1. Matrix classes (based on operational properties). Each entry gives the class, its defining property, and remarks.
Convergent (Stable): $\lim_{k\to\infty} A^k = 0$.

Defective (Deficient): $\sum_{k=0}^{r} \alpha_k A^k = 0$ with $\alpha_r \neq 0$ and $r < N$.

Diagonalizable: $T^{-1}AT$ is diagonal for some nonsingular $T$.

Elementary: any matrix that scales, interchanges, or adds multiples of rows or columns of another matrix $B$. • Used in Gaussian elimination.

Gram: $A = B^*B$ for some $B$. • Gram matrices are Hermitian.

Hermitian: $A^* = A$. • $(B + B^*)/2$ is the Hermitian part of any matrix $B$. • $B^*B$ and $BB^*$ are Hermitian.

Idempotent: $A^2 = A$. • $\det(A) = 1$ or $\det(A) = 0$.

Involutory: $A^2 = I$, i.e., $A = A^{-1}$. • Examples: the identity matrix, reverse unit matrices, and symmetric orthogonal matrices.

Negative definite: $x^*Ax < 0$ for all $x \neq 0$.

Negative semidefinite: $x^*Ax \leq 0$ for all $x \neq 0$.

Nilpotent (of degree $k$): $A^k = 0$ for some $k > 0$. • $\det(A) = 0$.

Normal: $AA^* = A^*A$. • Normal matrices are diagonalizable.

Nonsingular (Invertible): $|A| \neq 0$.
Convergent matrices are important for procedures that implement iterative computations. If, in addition, $C^k = 0$ for some finite $k$, then the stable matrix belongs to the subclass of nilpotent matrices.

Aside from the classifications given in Tables A.1 and A.2, we also list some special matrices based on their structure and composition. These are given in Table A.3. Some of the items in this table serve as a glossary of terms for the special matrices already described in this chapter. Some of the matrices refer to matrix structures based on the positions of zero and nonzero elements, such as banded, sparse, triangular, tridiagonal, diagonal, bidiagonal, anti-diagonal, and Hessenberg. Others involve additional specifications on the elements themselves. These include identity, reverse identity, shift, real, complex, polynomial, rational, positive/negative, and nonpositive/nonnegative matrices.
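As an illustration of the convergent and nilpotent classes discussed above, the following MATLAB sketch (ours, not the book's; the matrices are hypothetical) checks both properties numerically. It assumes the standard fact, not stated above, that $C$ is convergent exactly when every eigenvalue of $C$ has magnitude less than one.

    C = [0.5 0.3; 0.1 0.4];        % hypothetical example
    max(abs(eig(C)))               % < 1, so C is convergent
    norm(C^50)                     % already negligible

    % A nilpotent matrix is the special case where C^k hits 0 exactly:
    Nmat = [0 1 0; 0 0 1; 0 0 0];  % nilpotent of degree 3
    Nmat^3                         % the zero matrix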
Table A.2. Matrix classes (based on operations). Each entry gives the class, its defining property, and remarks.
Orthogonal: $A^T = A^{-1}$.

Positive definite: $x^*Ax > 0$ for all $x \neq 0$.

Positive semidefinite: $x^*Ax \geq 0$ for all $x \neq 0$.

Projection: idempotent and Hermitian.

Reducible: there exists a permutation matrix $P$ such that $PAP^T$ is block triangular.

Skew-symmetric: $A^T = -A$. • $\det(A) = 0$ if $N$ is odd. • $a_{ii} = 0$, thus $\operatorname{trace}(A) = 0$. • $(B - B^T)/2$ is the skew-symmetric part of any matrix $B$.

Skew-Hermitian: $A^* = -A$. • $a_{ii} = 0$ or pure imaginary. • $(B - B^*)/2$ is the skew-Hermitian part of any matrix $B$.

Symmetric: $A = A^T$. • $B^TB$ and $BB^T$ are both symmetric but generally not equal. • $(B + B^T)/2$ is the symmetric part of any matrix $B$.
Unitary: $A^* = A^{-1}$.

For instance, positive (or nonnegative) matrices are matrices having only positive (or nonnegative) elements.¹ Some special matrices depend on specifications on the pattern of the nonzero elements. For instance, we have Jordan, Toeplitz, shift, Hankel, and circulant matrices, as well as their block-matrix versions, that is, block-Jordan, block-Toeplitz, and so forth. There are also special matrices that depend on collective properties of the rows or columns. For instance, stochastic matrices are nonnegative matrices in which the elements within each row sum to unity. Another example is the diagonally dominant matrices, where, for each fixed row, the sum of the magnitudes of the off-diagonal elements must be less than the magnitude of the diagonal element in that row. Finally, there are matrices whose entries depend on their row and column indices, such as Fourier, Hadamard, Hilbert, and Cauchy matrices. Fourier and Hadamard matrices are used in signal-processing applications.

As can be expected, these tables are not exhaustive. Instead, the collection shows that there are several classes of matrices and special matrices found in the literature. They often possess interesting patterns and properties, such as analytical formulas for determinants, traces, inverses, and so forth, that can be taken advantage of during analysis and computations.
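The row-based definitions above translate directly into numerical tests. The following MATLAB sketch (illustrative only; both matrices are hypothetical examples) checks row-stochasticity and strict diagonal dominance from the definitions as stated.

    S = [0.2 0.8; 0.6 0.4];                            % candidate stochastic matrix
    all(S(:) >= 0) && all(abs(sum(S,2) - 1) < 1e-12)   % nonnegative, row sums equal 1

    A = [4 -1 1; 2 5 -2; 0 1 3];
    d = abs(diag(A));
    offd = sum(abs(A), 2) - d;     % row sums of off-diagonal magnitudes
    all(d > offd)                  % strictly diagonally dominant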
¹ Note that positive matrices are not the same as positive definite matrices. For instance, with

$$A = \begin{pmatrix} 1 & 5 \\ 5 & 1 \end{pmatrix} \qquad B = \begin{pmatrix} 1 & -2 \\ 0 & 2 \end{pmatrix}$$

$A$ is positive but not positive definite, whereas $B$ is positive definite but not positive.
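The footnote's example can be verified numerically. In the sketch below (ours), eig exposes the relevant eigenvalues; for the nonsymmetric matrix $B$, the real quadratic form $x^TBx$ is governed by the symmetric part $(B + B^T)/2$.

    A = [1 5; 5 1];
    eig(A)              % 6 and -4: A is positive but not positive definite

    B = [1 -2; 0 2];
    eig(B + B')         % both eigenvalues positive, so x'*B*x > 0 for all
                        % real x ~= 0, although B has nonpositive entries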
Table A.3. Matrix classes (based on structure and composition). Each entry gives the name, its definition, and remarks.
Antidiagonal: $a_{i,N+1-i} = \alpha_i$, with all other entries zero. • $AB$ (or $BA$) will reverse the sequence of the rows (or columns) of $B$, scaled by the $\alpha_i$. • $\det(A) = (-1)^{\lfloor N/2\rfloor}\prod_{i=1}^{N}\alpha_i$. • MATLAB: A=flipud(diag(alpha)) where alpha=(α1,...,αN).

Band (banded): $a_{ij} = 0$ if $j > i + p$ or $i > j + q$. • $p$ is the right-bandwidth; $q$ is the left-bandwidth.

Bidiagonal (Stieltjes): $A = \begin{pmatrix} \alpha_1 & & & 0 \\ \beta_1 & \alpha_2 & & \\ & \ddots & \ddots & \\ 0 & & \beta_{N-1} & \alpha_N \end{pmatrix}$. • $\det(A) = \prod_{i=1}^{N}\alpha_i$. • Let $B = A^{-1}$; then $b_{ij} = 0$ if $j > i$, $b_{ii} = 1/\alpha_i$, and $b_{ij} = (-1)^{i-j}\prod_{k=j}^{i-1}\beta_k \Big/ \prod_{k=j}^{i}\alpha_k$ if $i > j$. • MATLAB: A=diag(v)+diag(w,-1) where v=(α1,...,αN) and w=(β1,...,βN−1).
Binary: $a_{ij} = 0$ or $1$. • Often used to indicate an incidence relationship between $i$ and $j$.
Cauchy: $a_{ij} = \dfrac{1}{x_i + y_j}$ for given $x$ and $y$ with $x_i + y_j \neq 0$ and the elements of $x$ and of $y$ distinct. • Nonsingular (but often ill-conditioned for large $N$). • $\det(A) = \prod_{i=2}^{N}\prod_{j=1}^{i-1}(x_i - x_j)(y_i - y_j) \Big/ \prod_{i=1}^{N}\prod_{j=1}^{N}(x_i + y_j)$. • MATLAB: A=gallery('cauchy',x,y).
Circulant: $A = \begin{pmatrix} \alpha_1 & \alpha_2 & \cdots & \alpha_N \\ \alpha_N & \alpha_1 & \cdots & \alpha_{N-1} \\ \vdots & & \ddots & \vdots \\ \alpha_2 & \alpha_3 & \cdots & \alpha_1 \end{pmatrix}$. • Circulant matrices are normal. • A special case of Toeplitz matrices. • MATLAB: A=gallery('circul',alpha) where alpha=(α1,...,αN).

Companion: $A = \begin{pmatrix} -p_{N-1} & \cdots & -p_1 & -p_0 \\ 1 & & & 0 \\ & \ddots & & \vdots \\ 0 & & 1 & 0 \end{pmatrix}$. • The $p_k$ are the coefficients of a polynomial: $s^N + p_{N-1}s^{N-1} + \cdots + p_1 s + p_0$. • MATLAB: A=compan(p) where p=(1, pN−1,...,p1, p0).
Complex: the $a_{ij}$ are complex-valued.
Diagonal: $A = \begin{pmatrix} \alpha_1 & & 0 \\ & \ddots & \\ 0 & & \alpha_N \end{pmatrix}$. • $\det(A) = \prod_i \alpha_i$. • MATLAB: A=diag(alpha) where alpha=(α1,...,αN).

Diagonally dominant: $|a_{ii}| > \sum_{j \neq i} |a_{ij}|$ for $i = 1, 2, \ldots, N$. • Nonsingular (based on Gershgorin's theorem).
Fourier: $a_{ij} = \dfrac{1}{\sqrt{N}}\,W^{(i-1)(j-1)}$ where $W = \exp\!\left(-2\pi\sqrt{-1}/N\right)$. • Fourier matrices are unitary. • Used in Fourier transforms. • MATLAB: h=ones(N,1)*[0:N-1]; W=exp(-2*pi/N*1i); A=W.^(h.*h')/sqrt(N).
Givens (Rotation): the identity matrix with four elements replaced, based on given $p$ and $q$: $a_{pp} = a_{qq} = \cos(\theta)$ and $a_{pq} = -a_{qp} = \sin(\theta)$. • Used to rotate points in a hyperplane. • Useful in matrix reduction to Hessenberg form. • Givens matrices are orthogonal.
Hadamard: $H \,[=]\, 2^k \times 2^k$ with $H_k = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \otimes H_{k-1}$ and $H_0 = 1$. • Elements are either $1$ or $-1$. • Hadamard matrices are orthogonal. • MATLAB: A=hadamard(2^k).

Hankel: $A = \begin{pmatrix} \cdot & \cdot & \beta & \alpha \\ \cdot & \beta & \alpha & \gamma \\ \beta & \alpha & \gamma & \cdot \\ \alpha & \gamma & \cdot & \cdot \end{pmatrix}$. • Each anti-diagonal has the same value. • MATLAB: A=hankel(v,w) where v=(...,β,α) is the first column and w=(α,γ,...) is the last row.
Hessenberg: $a_{j+k,j} = 0$ for $2 \leq k \leq N - j$ (upper Hessenberg). • Useful in finding eigenvalues. • For square $B$, there is a unitary $Q$ such that $A = Q^*BQ$ is upper Hessenberg. • MATLAB: [Q,A]=hess(B), where A=(Q')(B)(Q).
Hilbert: $a_{ij} = \dfrac{1}{i + j - 1}$. • Symmetric and positive definite. • MATLAB: h=[1:N]; A=gallery('cauchy',h,h-1).

Identity: $A = \begin{pmatrix} 1 & & 0 \\ & \ddots & \\ 0 & & 1 \end{pmatrix}$. • Often denoted by $I_N$. • $\det(A) = 1$. • $AB = BA = B$. • MATLAB: A=eye(N).
Imaginary: $A = iB$, where $B$ is real and $i = \sqrt{-1}$.

Jordan block: $A = \begin{pmatrix} s & 1 & & 0 \\ & s & \ddots & \\ & & \ddots & 1 \\ 0 & & & s \end{pmatrix}$. • Jordan blocks are bidiagonal. • $\det(A) = s^N$. • MATLAB: A=gallery('jordbloc',N,s).

Lower triangular: $a_{ij} = 0$ for $j > i$. • $\det(A) = \prod_{i=1}^{N} a_{ii}$. • Let $D = \operatorname{diag}(A)$ and $K = D - A$; then $A^{-1} = D^{-1}\left[ I + \sum_{\ell=1}^{N-1} \left( K D^{-1} \right)^{\ell} \right]$. • MATLAB: A=tril(B) extracts the lower triangle of B.
Negative: $a_{ij} < 0$.

Non-negative: $a_{ij} \geq 0$.

Non-positive: $a_{ij} \leq 0$.
Permutation: $P = \begin{pmatrix} e_{k_1} & \cdots & e_{k_N} \end{pmatrix}^T$, where $e_j$ is the $j$th unit vector. • $PA$ (or $AP^T$) rearranges the rows (or columns) of $A$ based on the sequence $K = k_1, \ldots, k_N$. • MATLAB: B=eye(N); P=B(K,:).
Persymmetric: $A \,[=]\, N \times N$ with $a_{i,j} = a_{(N+1-j),(N+1-i)}$. • $A = RH$ for reverse identity $R$ and symmetric $H$.
Positive: $a_{ij} > 0$.

Polynomial: the $a_{ij}$ are polynomial functions.

Real: the $a_{ij}$ are real-valued.
Rational: the $a_{ij}$ are rational functions.

Rectangular (non-square): $A \,[=]\, N \times M$ with $N \neq M$. • If $N > M$, then $A$ is tall. • If $N < M$, then $A$ is wide.

Reverse identity: $a_{i,N+1-i} = 1$, with all other entries zero. • $AB$ (or $BA$) will reverse the order of the rows (or columns) of $B$. • $\det(A) = (-1)^{N/2}$ if $N$ is even; $\det(A) = (-1)^{(N-1)/2}$ if $N$ is odd. • MATLAB: A=flipud(eye(N)).

Sparse: a significant number of the elements are zero (see Section 1.6).
Stochastic (Probability, transition): $A$ is real, nonnegative, and $\sum_{j=1}^{N} a_{ij} = 1$ for $i = 1, 2, \ldots, N$. • Also known as right-stochastic. • $A$ is left-stochastic if $\sum_{i=1}^{N} a_{ij} = 1$ for all $j$, and doubly stochastic if it is both right- and left-stochastic.

Shift: $A = \begin{pmatrix} 0 & 1 & & 0 \\ & \ddots & \ddots & \\ 0 & & 0 & 1 \\ 1 & 0 & \cdots & 0 \end{pmatrix}$. • Shift matrices are circulant, permutation, and Toeplitz. • $A^N = I_N$. • MATLAB: A=circshift(eye(N),-1).
Toeplitz: $A = \begin{pmatrix} \alpha & \beta & \cdot & \cdot \\ \gamma & \alpha & \beta & \cdot \\ \cdot & \gamma & \alpha & \beta \\ \cdot & \cdot & \gamma & \alpha \end{pmatrix}$. • Each diagonal has the same value. • $A = BH$ with reverse identity $B$ and Hankel $H$. • MATLAB: A=toeplitz(v,w) where v=(α,γ,...) is the first column and w=(α,β,...) is the first row.
Tridiagonal: $A = \begin{pmatrix} \alpha_1 & \beta_1 & & 0 \\ \gamma_1 & \alpha_2 & \ddots & \\ & \ddots & \ddots & \beta_{N-1} \\ 0 & & \gamma_{N-1} & \alpha_N \end{pmatrix}$. • Tridiagonal matrices are Hessenberg matrices. • The solution of $Ax = b$ can be obtained using the Thomas algorithm. • MATLAB: A=diag(v)+diag(w,1)+diag(z,-1) where v=(α1,...,αN), w=(β1,...,βN−1), and z=(γ1,...,γN−1).
Unit: $a_{ij} = 1$. • MATLAB: A=ones(N,M).
Unitriangular: $A$ is (lower or upper) triangular with $a_{ii} = 1$. • $\det(A) = 1$.
Upper triangular: $a_{ij} = 0$ for $j < i$. • $\det(A) = \prod_{i=1}^{N} a_{ii}$. • Let $D = \operatorname{diag}(A)$ and $K = D - A$; then $A^{-1} = D^{-1}\left[ I + \sum_{\ell=1}^{N-1} \left( K D^{-1} \right)^{\ell} \right]$. • MATLAB: A=triu(B) extracts the upper triangle of B.

Vandermonde: $a_{ij} = \alpha_i^{\,j-1}$, i.e., each row is $(1, \alpha_i, \ldots, \alpha_i^{M-1})$. • If square, $\det(A) = \prod_{i>j} (\alpha_i - \alpha_j)$.

Zero: $a_{ij} = 0$. • MATLAB: A=zeros(N,M).

A.2 Motivation for Matrix Operations from Solution of Equations

Instead of simply taking the various matrix operations at face value as fixed rules, it is instructive to motivate their development through the matrix representation of equations originally written in terms of indexed variables. The aim of this exposition is to illustrate how the various operations, such as matrix products, determinants, adjugates, and inverses, appear as natural consequences of the manipulations involved in solving linear equations.

A.2.1 Matrix Sums, Scalar Products, and Matrix Products

We facilitate the definition of matrix operations by framing them in terms of equations that contain indexed variables. We start with the representation of a set of $N$ linear equations relating $M$ variables $x_1, x_2, \ldots, x_M$ to $N$ variables $y_1, y_2, \ldots, y_N$, given by

$$\begin{aligned} y_1 &= a_{11}x_1 + \cdots + a_{1M}x_M \\ &\;\;\vdots \\ y_N &= a_{N1}x_1 + \cdots + a_{NM}x_M \end{aligned}$$

The indexed notation for these equations is

$$y_i = \sum_{j=1}^{M} a_{ij}x_j \qquad i = 1, 2, \ldots, N \tag{A.1}$$

By collecting the variables to form matrices,

$$y = \begin{pmatrix} y_1 \\ \vdots \\ y_N \end{pmatrix} \qquad x = \begin{pmatrix} x_1 \\ \vdots \\ x_M \end{pmatrix} \qquad A = \begin{pmatrix} a_{11} & \cdots & a_{1M} \\ \vdots & \ddots & \vdots \\ a_{N1} & \cdots & a_{NM} \end{pmatrix}$$

we postulate the matrix representation of (A.1) as

$$y = Ax \tag{A.2}$$

For instance, consider the set of equations

$$y_1 = x_1 + 3x_2 \qquad\qquad y_2 = -x_1 - 2x_2$$

then $y = Ax$, where

$$A = \begin{pmatrix} 1 & 3 \\ -1 & -2 \end{pmatrix} \qquad y = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} \qquad x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$$

As we proceed, we show that the postulated form (A.2) for representing (A.1) ultimately results in a definition of the matrix product $C = AB$, which generalizes (A.2), that is, with $y$ replaced by $C$ and $x$ replaced by $B$.

Now let $y_1, \ldots, y_N$ and $z_1, \ldots, z_N$ be related to $x_1, \ldots, x_M$ as follows:

$$y_i = \sum_{j=1}^{M} a_{ij}x_j \quad\text{and}\quad z_i = \sum_{j=1}^{M} b_{ij}x_j \qquad i = 1, \ldots, N$$

where $a_{ij}$ and $b_{ij}$ are elements of matrices $A$ and $B$, respectively.

Let $u_i = y_i + z_i$, $i = 1, \ldots, N$; then

$$u_i = \sum_{j=1}^{M} a_{ij}x_j + \sum_{j=1}^{M} b_{ij}x_j = \sum_{j=1}^{M} \left( a_{ij} + b_{ij} \right) x_j = \sum_{j=1}^{M} g_{ij}x_j$$

where the $g_{ij}$ are the elements of a matrix $G$. From the rightmost equality, we can then define the matrix sum by the following operation:

$$G = A + B \quad\longleftrightarrow\quad g_{ij} = a_{ij} + b_{ij} \qquad i = 1, \ldots, N; \quad j = 1, \ldots, M \tag{A.3}$$

Next, let $v_i = \alpha y_i$, $i = 1, \ldots, N$, with $y_i = \sum_{j=1}^{M} a_{ij}x_j$, where $\alpha$ is a scalar multiplier; then

$$v_i = \alpha \sum_{j=1}^{M} a_{ij}x_j = \sum_{j=1}^{M} \left( \alpha a_{ij} \right) x_j = \sum_{j=1}^{M} h_{ij}x_j$$

where the $h_{ij}$ are the elements of a matrix $H$. From the rightmost equality, we can then define the scalar product by the following operation:

$$H = \alpha A \quad\longleftrightarrow\quad h_{ij} = \alpha a_{ij} \qquad i = 1, \ldots, N; \quad j = 1, \ldots, M \tag{A.4}$$

Next, let $w_k = \sum_{i=1}^{N} c_{ki}y_i$, $k = 1, \ldots, K$, and $y_i = \sum_{j=1}^{M} a_{ij}x_j$, $i = 1, \ldots, N$, where $c_{ki}$ and $a_{ij}$ are elements of matrices $C$ and $A$, respectively; then

$$w_k = \sum_{i=1}^{N} c_{ki} \left( \sum_{j=1}^{M} a_{ij}x_j \right) = \sum_{j=1}^{M} \left( \sum_{i=1}^{N} c_{ki}a_{ij} \right) x_j = \sum_{j=1}^{M} f_{kj}x_j$$

where the $f_{kj}$ are the elements of a matrix $F$. From the rightmost equality, we can then define the matrix product by the following operation:

$$F = CA \quad\longleftrightarrow\quad f_{kj} = \sum_{i=1}^{N} c_{ki}a_{ij} \qquad k = 1, \ldots, K; \quad j = 1, \ldots, M \tag{A.5}$$
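Before developing determinants, it is worth confirming numerically that the entrywise definition (A.5) coincides with the built-in matrix product. The following MATLAB sketch (illustrative; the dimensions and random entries are arbitrary choices) carries out the triple-indexed sum directly.

    K = 2; N = 3; M = 4;
    C = randn(K, N);  A = randn(N, M);
    F = zeros(K, M);
    for k = 1:K
        for j = 1:M
            for i = 1:N
                F(k,j) = F(k,j) + C(k,i) * A(i,j);   % f_kj = sum_i c_ki * a_ij
            end
        end
    end
    norm(F - C*A)      % zero up to roundoff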
A.2.2 Determinants, Cofactors, and Adjugates

Let us begin with the case involving two linear equations with two unknowns,

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 &= b_1 \\ a_{21}x_1 + a_{22}x_2 &= b_2 \end{aligned} \tag{A.6}$$

One of the unknowns (e.g., $x_2$) can be eliminated by multiplying the first equation by $a_{22}$ and the second equation by $-a_{12}$, and then adding both results. Doing so, we obtain

$$\left( a_{11}a_{22} - a_{12}a_{21} \right) x_1 = a_{22}b_1 - a_{12}b_2 \tag{A.7}$$

We could also eliminate $x_1$ using a similar procedure. Alternatively, we could simply exchange indices 1 and 2 in (A.7) to obtain

$$\left( a_{22}a_{11} - a_{21}a_{12} \right) x_2 = a_{11}b_2 - a_{21}b_1 \tag{A.8}$$

The coefficients of $x_1$ and $x_2$ in (A.7) and (A.8) are essentially the same, and we now define them as the determinant function of a $2 \times 2$ matrix,

$$\det(M) = \det \begin{pmatrix} m_{11} & m_{12} \\ m_{21} & m_{22} \end{pmatrix} = m_{11}m_{22} - m_{12}m_{21} \tag{A.9}$$

Equations (A.7) and (A.8) can then be combined to yield a matrix equation,

$$\det(A) \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix} \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \tag{A.10}$$

If $\det(A) \neq 0$, then we have

$$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \frac{1}{\det(A)} \begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix} \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}$$

and find that the inverse of a $2 \times 2$ matrix is given by

$$A^{-1} = \frac{1}{\det(A)} \begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix}$$

Next, we look at the case of three equations with three unknowns,

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + a_{13}x_3 &= b_1 \\ a_{21}x_1 + a_{22}x_2 + a_{23}x_3 &= b_2 \\ a_{31}x_1 + a_{32}x_2 + a_{33}x_3 &= b_3 \end{aligned} \tag{A.11}$$

We can rearrange the first two equations in (A.11), moving the terms with $x_3$ to the other side to mimic (A.6), that is,

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 &= b_1 - a_{13}x_3 \\ a_{21}x_1 + a_{22}x_2 &= b_2 - a_{23}x_3 \end{aligned}$$

then, using (A.10), we obtain

$$\alpha_3 \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix} \begin{pmatrix} b_1 - a_{13}x_3 \\ b_2 - a_{23}x_3 \end{pmatrix} \tag{A.12}$$

where

$$\alpha_3 = \det \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$$

Returning to the third equation in (A.11), we could multiply it by the scalar $\alpha_3$ to obtain