

Definition 1.13 A subspace $S$ in $\mathbb{C}^n$ is called invariant with respect to a square matrix A if $AS \subset S$, where $AS$ is the image of $S$ through A.

1.8 Similarity Transformations

Definition 1.14 Let C be a square nonsingular matrix having the same order as the matrix A. We say that the matrices A and $C^{-1}AC$ are similar, and the transformation from A to $C^{-1}AC$ is called a similarity transformation. Moreover, we say that the two matrices are unitarily similar if C is unitary.

Two similar matrices share the same spectrum and the same characteristic polynomial. Indeed, it is easy to check that if $(\lambda, x)$ is an eigenvalue-eigenvector pair of A, then $(\lambda, C^{-1}x)$ is the same for the matrix $C^{-1}AC$, since

$$(C^{-1}AC)\,C^{-1}x = C^{-1}Ax = \lambda\, C^{-1}x.$$

We notice in particular that the product matrices AB and BA, with $A \in \mathbb{C}^{n\times m}$ and $B \in \mathbb{C}^{m\times n}$, are not similar but satisfy the following property (see [Hac94], p. 18, Theorem 2.4.6)
$$\sigma(AB)\setminus\{0\} = \sigma(BA)\setminus\{0\},$$
that is, AB and BA share the same spectrum apart from null eigenvalues, so that $\rho(AB) = \rho(BA)$.
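As a quick numerical illustration of this fact (a sketch in Python with NumPy; the rectangular matrices below are arbitrary, assumed only to have compatible sizes), the eigenvalues of AB and BA differ only by null eigenvalues, so in particular $\rho(AB) = \rho(BA)$.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 5))     # A in R^{3x5}
B = rng.standard_normal((5, 3))     # B in R^{5x3}

eig_AB = np.linalg.eigvals(A @ B)   # 3 eigenvalues
eig_BA = np.linalg.eigvals(B @ A)   # 5 eigenvalues, at least 2 of them (numerically) zero

# the nonzero eigenvalues coincide up to round-off, hence rho(AB) = rho(BA)
print(np.allclose(np.sort(np.abs(eig_AB))[::-1],
                  np.sort(np.abs(eig_BA))[::-1][:3]))
print(np.isclose(np.max(np.abs(eig_AB)), np.max(np.abs(eig_BA))))
```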

The use of similarity transformations aims at reducing the complexity of the problem of evaluating the eigenvalues of a matrix. Indeed, if a given matrix could be transformed into a similar matrix in diagonal or triangular form, the computation of the eigenvalues would be immediate. The main result in this direction is the following theorem (for the proof, see [Dem97], Theorem 4.2).

Property 1.5 (Schur decomposition) Given $A \in \mathbb{C}^{n\times n}$, there exists U unitary such that

$$U^{-1}AU = U^H A U = \begin{pmatrix}
\lambda_1 & b_{12} & \dots & b_{1n} \\
0 & \lambda_2 & & b_{2n} \\
\vdots & & \ddots & \vdots \\
0 & \dots & 0 & \lambda_n
\end{pmatrix} = T,$$
where $\lambda_i$ are the eigenvalues of A.

It thus turns out that every matrix A is unitarily similar to an upper triangular matrix. The matrices T and U are not necessarily unique [Hac94].

The Schur decomposition theorem gives rise to several important results; among them, we recall:

1. every hermitian matrix is unitarily similar to a diagonal real matrix, that is, when A is hermitian every Schur decomposition of A is diagonal. In such an event, since

$$U^{-1}AU = \Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n),$$

it turns out that $AU = U\Lambda$, that is, $Au_i = \lambda_i u_i$ for $i = 1, \dots, n$, so that the column vectors of U are the eigenvectors of A. Moreover, since the eigenvectors are orthogonal two by two, it turns out that a hermitian matrix has a system of orthonormal eigenvectors that generates the whole space $\mathbb{C}^n$. Finally, it can be shown that a matrix A of order n is similar to a diagonal matrix D iff the eigenvectors of A form a basis for $\mathbb{C}^n$ [Axe94];
2. a matrix $A \in \mathbb{C}^{n\times n}$ is normal iff it is unitarily similar to a diagonal matrix. As a consequence, a normal matrix $A \in \mathbb{C}^{n\times n}$ admits the following spectral decomposition: $A = U\Lambda U^H = \sum_{i=1}^{n} \lambda_i u_i u_i^H$, U being unitary and $\Lambda$ diagonal [SS90];
3. let A and B be two normal and commutative matrices; then, the generic eigenvalue $\mu_i$ of A + B is given by the sum $\lambda_i + \xi_i$, where $\lambda_i$ and $\xi_i$ are the eigenvalues of A and B associated with the same eigenvector.

There are, of course, nonsymmetric matrices that are similar to diagonal matrices, but these are not unitarily similar (see, e.g., Exercise 7).
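As a numerical counterpart of items 1 and 2 above, the following sketch (Python with NumPy and SciPy, both assumed available; the hermitian matrix is randomly generated) computes a complex Schur form with scipy.linalg.schur and checks that, A being hermitian, the triangular factor is in fact diagonal and that $A = U\Lambda U^H$.

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = (B + B.conj().T) / 2                       # a hermitian matrix

T, U = schur(A, output='complex')              # A = U T U^H, T upper triangular

Lam = np.diag(np.diag(T))                      # for hermitian A, T is diagonal...
print(np.allclose(T, Lam))                     # ...up to round-off
print(np.allclose(U.conj().T @ U, np.eye(4)))  # the columns of U are orthonormal
print(np.allclose(A, U @ Lam @ U.conj().T))    # spectral decomposition A = U Lambda U^H
```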

The Schur decomposition can be improved as follows (for the proof see, e.g., [Str80], [God66]).

Property 1.6 (Canonical Jordan Form) Let A be any square matrix. Then, there exists a nonsingular matrix X which transforms A into a block diagonal matrix J such that

$$X^{-1}AX = J = \mathrm{diag}\left(J_{k_1}(\lambda_1), J_{k_2}(\lambda_2), \dots, J_{k_l}(\lambda_l)\right),$$
which is called the canonical Jordan form, $\lambda_j$ being the eigenvalues of A and $J_k(\lambda) \in \mathbb{C}^{k\times k}$ a Jordan block of the form $J_1(\lambda) = \lambda$ if $k = 1$ and, for $k > 1$,
$$J_k(\lambda) = \begin{pmatrix}
\lambda & 1 & 0 & \dots & 0 \\
0 & \lambda & 1 & \ddots & \vdots \\
\vdots & & \ddots & \ddots & 0 \\
\vdots & & & \lambda & 1 \\
0 & \dots & \dots & 0 & \lambda
\end{pmatrix}.$$
If an eigenvalue is defective, the size of the corresponding Jordan block is greater than one. Therefore, the canonical Jordan form tells us that a matrix can be diagonalized by a similarity transformation iff it is nondefective. For this reason, the nondefective matrices are called diagonalizable. In particular, normal matrices are diagonalizable.

Partitioning X by columns, $X = (x_1, \dots, x_n)$, it can be seen that the $k_i$ vectors associated with the Jordan block $J_{k_i}(\lambda_i)$ satisfy the following recursive relation
$$\begin{aligned}
Ax_l &= \lambda_i x_l, & l &= \sum_{j=1}^{i-1} k_j + 1,\\
Ax_j &= \lambda_i x_j + x_{j-1}, & j &= l+1, \dots, l-1+k_i, \quad \text{if } k_i \neq 1.
\end{aligned} \qquad (1.8)$$
The vectors $x_i$ are called principal vectors or generalized eigenvectors of A.

Example 1.6 Let us consider the following matrix
$$A = \begin{pmatrix}
7/4 & 3/4 & -1/4 & -1/4 & -1/4 & 1/4 \\
0 & 2 & 0 & 0 & 0 & 0 \\
-1/2 & -1/2 & 5/2 & 1/2 & -1/2 & 1/2 \\
-1/2 & -1/2 & -1/2 & 5/2 & 1/2 & 1/2 \\
-1/4 & -1/4 & -1/4 & -1/4 & 11/4 & 1/4 \\
-3/2 & -1/2 & -1/2 & 1/2 & 1/2 & 7/2
\end{pmatrix}.$$
The Jordan canonical form of A and its associated matrix X are given by
$$J = \begin{pmatrix}
2 & 1 & 0 & 0 & 0 & 0 \\
0 & 2 & 0 & 0 & 0 & 0 \\
0 & 0 & 3 & 1 & 0 & 0 \\
0 & 0 & 0 & 3 & 1 & 0 \\
0 & 0 & 0 & 0 & 3 & 0 \\
0 & 0 & 0 & 0 & 0 & 2
\end{pmatrix}, \qquad
X = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 1 \\
0 & 1 & 0 & 0 & 0 & 1 \\
0 & 0 & 1 & 0 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 & 1 \\
0 & 0 & 0 & 0 & 1 & 1 \\
1 & 1 & 1 & 1 & 1 & 1
\end{pmatrix}.$$
Notice that two different Jordan blocks are related to the same eigenvalue ($\lambda = 2$). It is easy to check property (1.8). Consider, for example, the Jordan block associated with the eigenvalue $\lambda_2 = 3$; we have

$$\begin{aligned}
Ax_3 &= [0\ 0\ 3\ 0\ 0\ 3]^T = 3\,[0\ 0\ 1\ 0\ 0\ 1]^T = \lambda_2 x_3,\\
Ax_4 &= [0\ 0\ 1\ 3\ 0\ 4]^T = 3\,[0\ 0\ 0\ 1\ 0\ 1]^T + [0\ 0\ 1\ 0\ 0\ 1]^T = \lambda_2 x_4 + x_3,\\
Ax_5 &= [0\ 0\ 0\ 1\ 3\ 4]^T = 3\,[0\ 0\ 0\ 0\ 1\ 1]^T + [0\ 0\ 0\ 1\ 0\ 1]^T = \lambda_2 x_5 + x_4.
\end{aligned}$$
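These identities can also be verified at once in floating point arithmetic; here is a small sketch (Python with NumPy) that re-enters A, J and X from Example 1.6 and checks $AX = XJ$, which is equivalent to $X^{-1}AX = J$ and encodes the relations (1.8).

```python
import numpy as np

A = np.array([[ 7/4,  3/4, -1/4, -1/4, -1/4,  1/4],
              [   0,    2,    0,    0,    0,    0],
              [-1/2, -1/2,  5/2,  1/2, -1/2,  1/2],
              [-1/2, -1/2, -1/2,  5/2,  1/2,  1/2],
              [-1/4, -1/4, -1/4, -1/4, 11/4,  1/4],
              [-3/2, -1/2, -1/2,  1/2,  1/2,  7/2]])

J = np.diag([2., 2., 3., 3., 3., 2.]) + np.diag([1., 0., 1., 1., 0.], k=1)

X = np.array([[1., 0., 0., 0., 0., 1.],
              [0., 1., 0., 0., 0., 1.],
              [0., 0., 1., 0., 0., 1.],
              [0., 0., 0., 1., 0., 1.],
              [0., 0., 0., 0., 1., 1.],
              [1., 1., 1., 1., 1., 1.]])

print(np.allclose(A @ X, X @ J))                        # X^{-1} A X = J
print(np.allclose(A @ X[:, 3], 3 * X[:, 3] + X[:, 2]))  # A x4 = 3 x4 + x3, relation (1.8)
```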

1.9 The Singular Value Decomposition (SVD)

Any matrix can be reduced to diagonal form by suitable pre- and post-multiplication by unitary matrices. Precisely, the following result holds.

Property 1.7 Let $A \in \mathbb{C}^{m\times n}$. There exist two unitary matrices $U \in \mathbb{C}^{m\times m}$ and $V \in \mathbb{C}^{n\times n}$ such that
$$U^H A V = \Sigma = \mathrm{diag}(\sigma_1, \dots, \sigma_p) \in \mathbb{R}^{m\times n}, \quad \text{with } p = \min(m, n), \qquad (1.9)$$
and $\sigma_1 \ge \dots \ge \sigma_p \ge 0$. Formula (1.9) is called the Singular Value Decomposition (or SVD) of A and the numbers $\sigma_i$ (or $\sigma_i(A)$) are called the singular values of A.

If A is a real-valued matrix, U and V will also be real-valued, and in (1.9) $U^T$ must be written instead of $U^H$. The following characterization of the singular values holds:

$$\sigma_i(A) = \sqrt{\lambda_i(A^H A)}, \quad i = 1, \dots, p. \qquad (1.10)$$
Indeed, from (1.9) it follows that $A = U\Sigma V^H$, $A^H = V\Sigma^H U^H$, so that, U and V being unitary, $A^H A = V\Sigma^H \Sigma V^H$, that is, $\lambda_i(A^H A) = \lambda_i(\Sigma^H \Sigma) = (\sigma_i(A))^2$. Since $AA^H$ and $A^H A$ are hermitian matrices, the columns of U, called the left singular vectors of A, turn out to be the eigenvectors of $AA^H$ (see Section 1.8) and, therefore, they are not uniquely defined. The same holds for the columns of V, which are the right singular vectors of A.

Relation (1.10) implies that if $A \in \mathbb{C}^{n\times n}$ is hermitian with eigenvalues given by $\lambda_1, \lambda_2, \dots, \lambda_n$, then the singular values of A coincide with the moduli of its eigenvalues. Indeed, because $AA^H = A^2$, $\sigma_i = \sqrt{\lambda_i^2} = |\lambda_i|$ for $i = 1, \dots, n$.

As far as the rank is concerned, if
$$\sigma_1 \ge \dots \ge \sigma_r > \sigma_{r+1} = \dots = \sigma_p = 0,$$
then the rank of A is r, the kernel of A is the span of the column vectors of V, $\{v_{r+1}, \dots, v_n\}$, and the range of A is the span of the column vectors of U, $\{u_1, \dots, u_r\}$.

Definition 1.15 Suppose that $A \in \mathbb{C}^{m\times n}$ has rank equal to r and that it admits an SVD of the type $U^H A V = \Sigma$. The matrix $A^{\dagger} = V\Sigma^{\dagger}U^H$ is called the Moore-Penrose pseudo-inverse matrix, being
$$\Sigma^{\dagger} = \mathrm{diag}\left(\frac{1}{\sigma_1}, \dots, \frac{1}{\sigma_r}, 0, \dots, 0\right). \qquad (1.11)$$

The matrix $A^{\dagger}$ is also called the generalized inverse of A (see Exercise 13). Indeed, if rank(A) = n < m, then $A^{\dagger} = (A^T A)^{-1}A^T$, while if n = m = rank(A), then $A^{\dagger} = A^{-1}$.
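A minimal sketch of these objects in Python with NumPy (the 4x3 matrix is an arbitrary full-rank example): the singular values, relation (1.10), and the Moore-Penrose pseudo-inverse, compared with the formula $(A^T A)^{-1}A^T$ valid when rank(A) = n < m.

```python
import numpy as np

A = np.array([[1., 2., 0.],
              [0., 1., 1.],
              [1., 0., 1.],
              [2., 1., 1.]])                  # m = 4, n = 3, rank(A) = 3

U, s, Vh = np.linalg.svd(A)                   # A = U Sigma V^H, s sorted decreasingly

# relation (1.10): sigma_i(A) = sqrt(lambda_i(A^H A))
lam = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]
print(np.allclose(s, np.sqrt(lam)))

# Moore-Penrose pseudo-inverse A^dagger = V Sigma^dagger U^H
A_dag = np.linalg.pinv(A)
print(np.allclose(A_dag, np.linalg.inv(A.T @ A) @ A.T))   # full column rank case
print(np.allclose(A_dag @ A, np.eye(3)))                  # left inverse of A
```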

1.10 Scalar Product and Norms in Vector Spaces

Very often, to quantify errors or measure distances one needs to compute the magnitude of a vector or of a matrix. For that purpose we introduce in this section the concept of a vector norm and, in the following one, of a matrix norm. We refer the reader to [Ste73], [SS90] and [Axe94] for the proofs of the properties that are reported hereafter.

Definition 1.16 A scalar product on a vector space V defined over K is any map $(\cdot,\cdot)$ acting from $V \times V$ into K which enjoys the following properties:

1. it is linear with respect to the vectors of V, that is

$$(\gamma x + \lambda z, y) = \gamma(x, y) + \lambda(z, y) \quad \forall x, y, z \in V, \ \forall \gamma, \lambda \in K;$$
2. it is hermitian, that is, $(y, x) = \overline{(x, y)}$ $\forall x, y \in V$;
3. it is positive definite, that is, $(x, x) > 0$ $\forall x \neq 0$ (in other words, $(x, x) \ge 0$, and $(x, x) = 0$ if and only if $x = 0$).

In the case $V = \mathbb{C}^n$ (or $\mathbb{R}^n$), an example is provided by the classical Euclidean scalar product given by

$$(x, y) = y^H x = \sum_{i=1}^{n} x_i \bar{y}_i,$$
where $\bar{z}$ denotes the complex conjugate of z.

Moreover, for any given square matrix A of order n and for any $x, y \in \mathbb{C}^n$ the following relation holds

$$(Ax, y) = (x, A^H y). \qquad (1.12)$$

In particular, since for any matrix $Q \in \mathbb{C}^{n\times n}$, $(Qx, Qy) = (x, Q^H Q y)$, one gets

Property 1.8 Unitary matrices preserve the Euclidean scalar product, that is, $(Qx, Qy) = (x, y)$ for any unitary matrix Q and for any pair of vectors x and y.

Definition 1.17 Let V be a vector space over K. We say that the map $\|\cdot\|$ from V into $\mathbb{R}$ is a norm on V if the following axioms are satisfied:
1. (i) $\|v\| \ge 0$ $\forall v \in V$ and (ii) $\|v\| = 0$ if and only if $v = 0$;
2. $\|\alpha v\| = |\alpha|\,\|v\|$ $\forall \alpha \in K$, $\forall v \in V$ (homogeneity property);
3. $\|v + w\| \le \|v\| + \|w\|$ $\forall v, w \in V$ (triangular inequality),
where $|\alpha|$ denotes the absolute value of $\alpha$ if $K = \mathbb{R}$, the module of $\alpha$ if $K = \mathbb{C}$.

The pair $(V, \|\cdot\|)$ is called a normed space. We shall distinguish among norms by a suitable subscript at the margin of the double bar symbol. In the case the map $|\cdot|$ from V into $\mathbb{R}$ enjoys only the properties 1(i), 2 and 3, we shall call such a map a seminorm. Finally, we shall call a unit vector any vector of V having unit norm.

An example of a normed space is $\mathbb{R}^n$, equipped for instance with the p-norm (or Hölder norm); this latter is defined for a vector x of components $\{x_i\}$ as
$$\|x\|_p = \left(\sum_{i=1}^{n} |x_i|^p\right)^{1/p}, \quad \text{for } 1 \le p < \infty. \qquad (1.13)$$

Notice that the limit as p goes to infinity of $\|x\|_p$ exists, is finite, and equals the maximum module of the components of x. Such a limit defines in turn a norm, called the infinity norm (or maximum norm), given by

$$\|x\|_\infty = \max_{1 \le i \le n} |x_i|.$$
When p = 2, from (1.13) the standard definition of the Euclidean norm is recovered,

$$\|x\|_2 = (x, x)^{1/2} = \left(\sum_{i=1}^{n} |x_i|^2\right)^{1/2} = \left(x^T x\right)^{1/2},$$
for which the following property holds.

Property 1.9 (Cauchy-Schwarz inequality) For any pair $x, y \in \mathbb{R}^n$,
$$|(x, y)| = |x^T y| \le \|x\|_2\,\|y\|_2, \qquad (1.14)$$
where equality holds iff $y = \alpha x$ for some $\alpha \in \mathbb{R}$.

We recall that the scalar product in $\mathbb{R}^n$ can be related to the p-norms introduced over $\mathbb{R}^n$ in (1.13) by the Hölder inequality
$$|(x, y)| \le \|x\|_p\,\|y\|_q, \quad \text{with } \frac{1}{p} + \frac{1}{q} = 1.$$
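The p-norms and the two inequalities above are easily experimented with; a sketch in Python with NumPy (the vectors are arbitrary) follows.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(5)
y = rng.standard_normal(5)

norm1   = np.linalg.norm(x, 1)         # ||x||_1, sum of |x_i|
norm2   = np.linalg.norm(x, 2)         # ||x||_2, Euclidean norm
norminf = np.linalg.norm(x, np.inf)    # ||x||_inf, max |x_i|

# Cauchy-Schwarz (1.14): |(x, y)| <= ||x||_2 ||y||_2
print(abs(x @ y) <= norm2 * np.linalg.norm(y, 2))

# Hoelder inequality with the conjugate pair p = 1, q = inf
print(abs(x @ y) <= norm1 * np.linalg.norm(y, np.inf))
```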

In the case where V is a finite-dimensional space the following property holds (for a sketch of the proof, see Exercise 14).

Property 1.10 Any vector norm $\|\cdot\|$ defined on V is a continuous function of its argument, namely, $\forall \varepsilon > 0$, $\exists C > 0$ such that if $\|x - \hat{x}\| \le \varepsilon$ then $|\,\|x\| - \|\hat{x}\|\,| \le C\varepsilon$, for any $x, \hat{x} \in V$.

New norms can be easily built using the following result.

Property 1.11 Let $\|\cdot\|$ be a norm of $\mathbb{R}^n$ and $A \in \mathbb{R}^{n\times n}$ be a matrix with n linearly independent columns. Then, the function $\|\cdot\|_{A^2}$ acting from $\mathbb{R}^n$ into $\mathbb{R}$ defined as

$$\|x\|_{A^2} = \|Ax\| \quad \forall x \in \mathbb{R}^n$$
is a norm of $\mathbb{R}^n$.

Two vectors x, y in V are said to be orthogonal if $(x, y) = 0$. This statement has an immediate geometric interpretation when $V = \mathbb{R}^2$ since in such a case

$$(x, y) = \|x\|_2\,\|y\|_2 \cos(\vartheta),$$

Table 1.1. Equivalence constants for the main norms of $\mathbb{R}^n$

c_pq    q=1        q=2        q=∞        C_pq    q=1    q=2        q=∞
p=1     1          1          1          p=1     1      n^{1/2}    n
p=2     n^{-1/2}   1          1          p=2     1      1          n^{1/2}
p=∞     n^{-1}     n^{-1/2}   1          p=∞     1      1          1

where $\vartheta$ is the angle between the vectors x and y. As a consequence, if $(x, y) = 0$ then $\vartheta$ is a right angle and the two vectors are orthogonal in the geometric sense.

Definition 1.18 Two norms $\|\cdot\|_p$ and $\|\cdot\|_q$ on V are equivalent if there exist two positive constants $c_{pq}$ and $C_{pq}$ such that
$$c_{pq}\,\|x\|_q \le \|x\|_p \le C_{pq}\,\|x\|_q \quad \forall x \in V.$$

In a finite-dimensional normed space all norms are equivalent. In particular, if $V = \mathbb{R}^n$ it can be shown that for the p-norms, with p = 1, 2, and ∞, the constants $c_{pq}$ and $C_{pq}$ take the values reported in Table 1.1.

In this book we shall often deal with sequences of vectors and with their convergence. For this purpose, we recall that a sequence of vectors $\{x^{(k)}\}$ in a vector space V having finite dimension n converges to a vector x, and we write $\lim_{k\to\infty} x^{(k)} = x$, if
$$\lim_{k\to\infty} x_i^{(k)} = x_i, \quad i = 1, \dots, n, \qquad (1.15)$$
where $x_i^{(k)}$ and $x_i$ are the components of the corresponding vectors with respect to a basis of V. If $V = \mathbb{R}^n$, due to the uniqueness of the limit of a sequence of real numbers, (1.15) implies also the uniqueness of the limit, if existing, of a sequence of vectors.

We further notice that in a finite-dimensional space all the norms are topologically equivalent in the sense of convergence, namely, given a sequence of vectors $x^{(k)}$, we have that
$$|||x^{(k)}||| \to 0 \iff \|x^{(k)}\| \to 0 \quad \text{as } k \to \infty,$$
where $|||\cdot|||$ and $\|\cdot\|$ are any two vector norms. As a consequence, we can establish the following link between norms and limits.

Property 1.12 Let $\|\cdot\|$ be a norm in a finite-dimensional space V. Then
$$\lim_{k\to\infty} x^{(k)} = x \iff \lim_{k\to\infty} \|x - x^{(k)}\| = 0,$$
where $x \in V$ and $\{x^{(k)}\}$ is a sequence of elements of V.

1.11 Matrix Norms

Definition 1.19 A matrix norm is a mapping $\|\cdot\| : \mathbb{R}^{m\times n} \to \mathbb{R}$ such that:
1. $\|A\| \ge 0$ $\forall A \in \mathbb{R}^{m\times n}$ and $\|A\| = 0$ if and only if A = 0;
2. $\|\alpha A\| = |\alpha|\,\|A\|$ $\forall \alpha \in \mathbb{R}$, $\forall A \in \mathbb{R}^{m\times n}$ (homogeneity);
3. $\|A + B\| \le \|A\| + \|B\|$ $\forall A, B \in \mathbb{R}^{m\times n}$ (triangular inequality).

Unless otherwise specified we shall employ the same symbol $\|\cdot\|$ to denote matrix norms and vector norms.

We can better characterize the matrix norms by introducing the concepts of compatible norm and norm induced by a vector norm.

Definition 1.20 We say that a matrix norm $\|\cdot\|$ is compatible or consistent with a vector norm $\|\cdot\|$ if
$$\|Ax\| \le \|A\|\,\|x\| \quad \forall x \in \mathbb{R}^n. \qquad (1.16)$$
More generally, given three norms, all denoted by $\|\cdot\|$, albeit defined on $\mathbb{R}^m$, $\mathbb{R}^n$ and $\mathbb{R}^{m\times n}$, respectively, we say that they are consistent if $\forall x \in \mathbb{R}^n$, $Ax = y \in \mathbb{R}^m$, $A \in \mathbb{R}^{m\times n}$, we have that $\|y\| \le \|A\|\,\|x\|$.

In order to single out matrix norms of practical interest, the following property is in general required.

Definition 1.21 We say that a matrix norm $\|\cdot\|$ is sub-multiplicative if $\forall A \in \mathbb{R}^{n\times m}$, $\forall B \in \mathbb{R}^{m\times q}$
$$\|AB\| \le \|A\|\,\|B\|. \qquad (1.17)$$

This property is not satisfied by every matrix norm. For example (taken from [GL89]), the norm $\|A\|_\Delta = \max|a_{ij}|$ for $i = 1, \dots, n$, $j = 1, \dots, m$ does not satisfy (1.17) if applied to the matrices
$$A = B = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix},$$
since $2 = \|AB\|_\Delta > \|A\|_\Delta\,\|B\|_\Delta = 1$.

Notice that, given a certain sub-multiplicative matrix norm $\|\cdot\|_\alpha$, there always exists a consistent vector norm. For instance, given any fixed vector $y \neq 0$ in $\mathbb{C}^n$, it suffices to define the consistent vector norm as
$$\|x\| = \|x y^H\|_\alpha \quad x \in \mathbb{C}^n.$$
As a consequence, in the case of sub-multiplicative matrix norms it is no longer necessary to explicitly specify the vector norm with respect to which the matrix norm is consistent.
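The counterexample with the norm $\|\cdot\|_\Delta$ can be reproduced directly; a short sketch in Python with NumPy follows (norm_delta is simply the maximum modulus of the entries).

```python
import numpy as np

def norm_delta(M):
    """The entrywise 'max' norm ||M||_Delta = max_ij |m_ij| (not sub-multiplicative)."""
    return np.max(np.abs(M))

A = np.ones((2, 2))
B = np.ones((2, 2))

print(norm_delta(A @ B))               # 2.0
print(norm_delta(A) * norm_delta(B))   # 1.0, so ||AB||_Delta > ||A||_Delta ||B||_Delta
```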

Example 1.7 The norm

$$\|A\|_F = \left(\sum_{i,j=1}^{n} |a_{ij}|^2\right)^{1/2} = \sqrt{\mathrm{tr}(AA^H)} \qquad (1.18)$$
is a matrix norm called the Frobenius norm (or Euclidean norm in $\mathbb{C}^{n^2}$) and is compatible with the norm $\|\cdot\|_2$. Indeed,
$$\|Ax\|_2^2 = \sum_{i=1}^{n}\Big|\sum_{j=1}^{n} a_{ij}x_j\Big|^2 \le \sum_{i=1}^{n}\Big(\sum_{j=1}^{n}|a_{ij}|^2\Big)\Big(\sum_{j=1}^{n}|x_j|^2\Big) = \|A\|_F^2\,\|x\|_2^2.$$
Notice that for such a norm $\|I_n\|_F = \sqrt{n}$. •

In view of the definition of a natural norm, we recall the following theorem.

Theorem 1.1 Let $\|\cdot\|$ be a vector norm. The function
$$\|A\| = \sup_{x \neq 0} \frac{\|Ax\|}{\|x\|} \qquad (1.19)$$
is a matrix norm called induced matrix norm or natural matrix norm.

Proof. We start by noticing that (1.19) is equivalent to
$$\|A\| = \sup_{\|x\|=1} \|Ax\|. \qquad (1.20)$$
Indeed, one can define for any $x \neq 0$ the unit vector $u = x/\|x\|$, so that (1.19) becomes
$$\|A\| = \sup_{\|u\|=1} \|Au\| = \|Aw\| \quad \text{with } \|w\| = 1.$$
This being taken as given, let us check that (1.19) (or, equivalently, (1.20)) is actually a norm, making direct use of Definition 1.19.
1. Since $\|Ax\| \ge 0$, it follows that $\|A\| = \sup_{\|x\|=1}\|Ax\| \ge 0$. Moreover
$$\|A\| = \sup_{x \neq 0}\frac{\|Ax\|}{\|x\|} = 0 \iff \|Ax\| = 0 \quad \forall x \neq 0,$$
and $\|Ax\| = 0$ $\forall x \neq 0$ if and only if A = 0; therefore $\|A\| = 0 \iff A = 0$.
2. Given a scalar $\alpha$,
$$\|\alpha A\| = \sup_{\|x\|=1}\|\alpha Ax\| = |\alpha| \sup_{\|x\|=1}\|Ax\| = |\alpha|\,\|A\|.$$
3. Finally, the triangular inequality holds. Indeed, by definition of supremum, if $x \neq 0$ then
$$\frac{\|Ax\|}{\|x\|} \le \|A\| \;\Rightarrow\; \|Ax\| \le \|A\|\,\|x\|,$$
so that, taking x with unit norm, one gets
$$\|(A+B)x\| \le \|Ax\| + \|Bx\| \le \|A\| + \|B\|,$$
from which it follows that $\|A+B\| = \sup_{\|x\|=1}\|(A+B)x\| \le \|A\| + \|B\|$. ∎

Relevant instances of induced matrix norms are the so-called p-norms, defined as

$$\|A\|_p = \sup_{x \neq 0} \frac{\|Ax\|_p}{\|x\|_p}.$$
The 1-norm and the infinity norm are easily computable since

$$\|A\|_1 = \max_{j=1,\dots,n} \sum_{i=1}^{m} |a_{ij}|, \qquad \|A\|_\infty = \max_{i=1,\dots,m} \sum_{j=1}^{n} |a_{ij}|,$$
and they are called the column sum norm and the row sum norm, respectively. Moreover, we have $\|A\|_1 = \|A^T\|_\infty$ and, if A is self-adjoint or real symmetric, $\|A\|_1 = \|A\|_\infty$.

A special discussion is deserved by the 2-norm or spectral norm, for which the following theorem holds.

Theorem 1.2 Let $\sigma_1(A)$ be the largest singular value of A. Then

$$\|A\|_2 = \sqrt{\rho(A^H A)} = \sqrt{\rho(AA^H)} = \sigma_1(A). \qquad (1.21)$$
In particular, if A is hermitian (or real and symmetric), then

$$\|A\|_2 = \rho(A), \qquad (1.22)$$
while, if A is unitary, $\|A\|_2 = 1$.

Proof. Since $A^H A$ is hermitian, there exists a unitary matrix U such that

$$U^H A^H A\, U = \mathrm{diag}(\mu_1, \dots, \mu_n),$$

where $\mu_i$ are the (positive) eigenvalues of $A^H A$. Let $y = U^H x$; then

$$\|A\|_2 = \sup_{x \neq 0}\sqrt{\frac{(A^H Ax, x)}{(x, x)}} = \sup_{y \neq 0}\sqrt{\frac{(U^H A^H A U y, y)}{(y, y)}} = \sup_{y \neq 0}\left(\sum_{i=1}^{n}\mu_i |y_i|^2 \Big/ \sum_{i=1}^{n}|y_i|^2\right)^{1/2} = \max_{i=1,\dots,n}\sqrt{|\mu_i|},$$
from which (1.21) follows, thanks to (1.10).

If A is hermitian, the same considerations as above apply directly to A. Finally, if A is unitary, we have

$$\|Ax\|_2^2 = (Ax, Ax) = (x, A^H Ax) = \|x\|_2^2,$$

so that $\|A\|_2 = 1$. ∎

As a consequence, the computation of $\|A\|_2$ is much more expensive than that of $\|A\|_\infty$ or $\|A\|_1$. However, if only an estimate of $\|A\|_2$ is required, the following relations can be profitably employed in the case of square matrices:

$$\max_{i,j}|a_{ij}| \le \|A\|_2 \le n \max_{i,j}|a_{ij}|,$$
$$\frac{1}{\sqrt{n}}\,\|A\|_\infty \le \|A\|_2 \le \sqrt{n}\,\|A\|_\infty,$$
$$\frac{1}{\sqrt{n}}\,\|A\|_1 \le \|A\|_2 \le \sqrt{n}\,\|A\|_1,$$

$$\|A\|_2 \le \sqrt{\|A\|_1\,\|A\|_\infty}.$$
For other estimates of similar type we refer to Exercise 17. Moreover, if A is normal then $\|A\|_2 \le \|A\|_p$ for any n and all $p \ge 2$.

Theorem 1.3 Let $|||\cdot|||$ be a matrix norm induced by a vector norm $\|\cdot\|$. Then, the following relations hold:
1. $\|Ax\| \le |||A|||\,\|x\|$, that is, $|||\cdot|||$ is a norm compatible with $\|\cdot\|$;
2. $|||I||| = 1$;
3. $|||AB||| \le |||A|||\,|||B|||$, that is, $|||\cdot|||$ is sub-multiplicative.

Proof. Part 1 of the theorem is already contained in the proof of Theorem 1.1, while part 2 follows from the fact that $|||I||| = \sup_{x \neq 0}\|Ix\|/\|x\| = 1$. Part 3 is simple to check. ∎

Notice that the p-norms are sub-multiplicative. Moreover, we remark that the sub-multiplicativity property by itself would only allow us to conclude that $|||I||| \ge 1$. Indeed, $|||I||| = |||I \cdot I||| \le |||I|||^2$.
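The norms and estimates discussed in this section can be computed and checked as in the following sketch (Python with NumPy; the square matrix is arbitrary).

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
n = A.shape[0]

n1   = np.linalg.norm(A, 1)          # column sum norm
ninf = np.linalg.norm(A, np.inf)     # row sum norm
n2   = np.linalg.norm(A, 2)          # spectral norm
nF   = np.linalg.norm(A, 'fro')      # Frobenius norm

print(np.isclose(n2, np.linalg.svd(A, compute_uv=False)[0]))   # (1.21): ||A||_2 = sigma_1(A)
print(np.isclose(n1, np.linalg.norm(A.T, np.inf)))             # ||A||_1 = ||A^T||_inf
print(bool(n2 <= np.sqrt(n1 * ninf)))                          # ||A||_2 <= sqrt(||A||_1 ||A||_inf)
print(bool(ninf / np.sqrt(n) <= n2 <= np.sqrt(n) * ninf))      # equivalence with the infinity norm
print(bool(n2 <= nF))                                          # ||A||_2 <= ||A||_F
```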

1.11.1 Relation between Norms and the Spectral Radius of a Matrix

We next recall some results that relate the spectral radius of a matrix to matrix norms and that will be widely employed in Chapter 4.

Theorem 1.4 Let $\|\cdot\|$ be a consistent matrix norm; then
$$\rho(A) \le \|A\| \quad \forall A \in \mathbb{C}^{n\times n}.$$
Proof. Let $\lambda$ be an eigenvalue of A and $v \neq 0$ an associated eigenvector. As a consequence, since $\|\cdot\|$ is consistent, we have
$$|\lambda|\,\|v\| = \|\lambda v\| = \|Av\| \le \|A\|\,\|v\|,$$
so that $|\lambda| \le \|A\|$. ∎

More precisely, the following property holds (see for the proof [IK66], p. 12, Theorem 3).

Property 1.13 Let $A \in \mathbb{C}^{n\times n}$ and $\varepsilon > 0$. Then, there exists an induced matrix norm $\|\cdot\|_{A,\varepsilon}$ (depending on $\varepsilon$) such that
$$\|A\|_{A,\varepsilon} \le \rho(A) + \varepsilon.$$

As a result, having fixed an arbitrarily small tolerance, there always exists a matrix norm which is arbitrarily close to the spectral radius of A, namely
$$\rho(A) = \inf_{\|\cdot\|} \|A\|, \qquad (1.23)$$
the infimum being taken on the set of all the consistent norms.

For the sake of clarity, we notice that the spectral radius is a sub-multiplicative seminorm, since it is not true that $\rho(A) = 0$ iff A = 0. As an example, any triangular matrix with null diagonal entries clearly has spectral radius equal to zero. Moreover, we have the following result.

Property 1.14 Let A be a square matrix and let $\|\cdot\|$ be a consistent norm. Then
$$\lim_{m\to\infty} \|A^m\|^{1/m} = \rho(A).$$

1.11.2 Sequences and Series of Matrices

A sequence of matrices $\{A^{(k)}\}$, $A^{(k)} \in \mathbb{R}^{n\times n}$, is said to converge to a matrix $A \in \mathbb{R}^{n\times n}$ if
$$\lim_{k\to\infty} \|A^{(k)} - A\| = 0.$$
The choice of the norm does not influence the result since in $\mathbb{R}^{n\times n}$ all norms are equivalent. In particular, when studying the convergence of iterative methods for solving linear systems (see Chapter 4), one is interested in the so-called convergent matrices, for which
$$\lim_{k\to\infty} A^k = 0,$$
0 being the null matrix. The following theorem holds.

Theorem 1.5 Let A be a square matrix; then
$$\lim_{k\to\infty} A^k = 0 \iff \rho(A) < 1. \qquad (1.24)$$
Moreover, the geometric series $\sum_{k=0}^{\infty} A^k$ is convergent iff $\rho(A) < 1$. In such a case

$$\sum_{k=0}^{\infty} A^k = (I - A)^{-1}. \qquad (1.25)$$
As a result, if $\rho(A) < 1$ the matrix I − A is invertible and the following inequalities hold:

$$\frac{1}{1 + \|A\|} \le \|(I - A)^{-1}\| \le \frac{1}{1 - \|A\|}, \qquad (1.26)$$
where $\|\cdot\|$ is an induced matrix norm such that $\|A\| < 1$.

Proof. Let us prove (1.24). Let $\rho(A) < 1$; then $\exists \varepsilon > 0$ such that $\rho(A) < 1 - \varepsilon$ and thus, thanks to Property 1.13, there exists an induced matrix norm $\|\cdot\|$ such that $\|A\| \le \rho(A) + \varepsilon < 1$. From the fact that $\|A^k\| \le \|A\|^k < 1$ and from the definition of convergence it turns out that as $k \to \infty$ the sequence $A^k$ tends to zero. Conversely, assume that $\lim_{k\to\infty} A^k = 0$ and let $\lambda$ denote an eigenvalue of A. Then, $A^k x = \lambda^k x$, x ($\neq 0$) being an eigenvector associated with $\lambda$, so that $\lim_{k\to\infty} \lambda^k = 0$. As a consequence, $|\lambda| < 1$ and, because this is true for a generic eigenvalue, one gets $\rho(A) < 1$ as desired.

Relation (1.25) can be obtained noting first that the eigenvalues of I − A are given by $1 - \lambda(A)$, $\lambda(A)$ being the generic eigenvalue of A. On the other hand, since $\rho(A) < 1$, we deduce that I − A is nonsingular. Then, from the identity
$$(I - A)(I + A + \dots + A^n) = I - A^{n+1}$$
and taking the limit for n tending to infinity, the thesis follows since

$$(I - A)\sum_{k=0}^{\infty} A^k = I.$$
Finally, thanks to Theorem 1.3, the equality $\|I\| = 1$ holds, so that
$$1 = \|I\| \le \|I - A\|\,\|(I - A)^{-1}\| \le (1 + \|A\|)\,\|(I - A)^{-1}\|,$$
giving the first inequality in (1.26). As for the second part, noting that I = I − A + A and multiplying both sides on the right by $(I - A)^{-1}$, one gets $(I - A)^{-1} = I + A(I - A)^{-1}$. Passing to the norms, we obtain
$$\|(I - A)^{-1}\| \le 1 + \|A\|\,\|(I - A)^{-1}\|,$$
and thus the second inequality, since $\|A\| < 1$. ∎

Remark 1.1 The assumption that there exists an induced matrix norm such that $\|A\| < 1$ is justified by Property 1.13, recalling that A is convergent and, therefore, $\rho(A) < 1$.

Notice that (1.25) suggests an algorithm to approximate the inverse of a matrix by a truncated series expansion.
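A possible realization of this idea is sketched below (Python with NumPy, assuming $\rho(A) < 1$ for the randomly generated A): the truncated series $I + A + \dots + A^k$ is compared with $(I - A)^{-1}$.

```python
import numpy as np

def neumann_inverse(A, k):
    """Approximate (I - A)^(-1) by the truncated geometric series sum_{j=0}^{k} A^j;
    by (1.25) the series converges provided rho(A) < 1."""
    S = np.eye(A.shape[0])     # partial sum, starts at A^0 = I
    P = np.eye(A.shape[0])     # current power A^j
    for _ in range(k):
        P = P @ A
        S = S + P
    return S

rng = np.random.default_rng(3)
A = 0.1 * rng.standard_normal((5, 5))               # small entries, so rho(A) < 1 here
print(np.max(np.abs(np.linalg.eigvals(A))) < 1)     # check rho(A) < 1

exact = np.linalg.inv(np.eye(5) - A)
print(np.linalg.norm(neumann_inverse(A, 30) - exact))   # truncation error, tiny
```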

1.12 Positive Definite, Diagonally Dominant and M-matrices

Definition 1.22 A matrix $A \in \mathbb{C}^{n\times n}$ is positive definite in $\mathbb{C}^n$ if the number $(Ax, x)$ is real and positive $\forall x \in \mathbb{C}^n$, $x \neq 0$. A matrix $A \in \mathbb{R}^{n\times n}$ is positive definite in $\mathbb{R}^n$ if $(Ax, x) > 0$ $\forall x \in \mathbb{R}^n$, $x \neq 0$. If the strict inequality is substituted by the weak one (≥), the matrix is called positive semidefinite.

Example 1.8 Matrices that are positive definite in $\mathbb{R}^n$ are not necessarily symmetric. An instance is provided by matrices of the form

$$A = \begin{pmatrix} 2 & \alpha \\ -2-\alpha & 2 \end{pmatrix} \qquad (1.27)$$
for $\alpha \neq -1$. Indeed, for any nonnull vector $x = (x_1, x_2)^T$ in $\mathbb{R}^2$,
$$(Ax, x) = 2(x_1^2 + x_2^2 - x_1 x_2) > 0.$$
Notice that A is not positive definite in $\mathbb{C}^2$. Indeed, if we take a complex vector x we find out that the number $(Ax, x)$ is not real-valued in general. •

Definition 1.23 Let $A \in \mathbb{R}^{n\times n}$. The matrices
$$A_S = \frac{1}{2}(A + A^T), \qquad A_{SS} = \frac{1}{2}(A - A^T)$$
are respectively called the symmetric part and the skew-symmetric part of A. Obviously, $A = A_S + A_{SS}$. If $A \in \mathbb{C}^{n\times n}$, the definitions modify as follows: $A_S = \frac{1}{2}(A + A^H)$ and $A_{SS} = \frac{1}{2}(A - A^H)$.

The following property holds.

Property 1.15 A real matrix A of order n is positive definite iff its symmetric part $A_S$ is positive definite.

Indeed, it suffices to notice that, due to (1.12) and the definition of $A_{SS}$, $x^T A_{SS} x = 0$ $\forall x \in \mathbb{R}^n$. For instance, the matrix in (1.27) has a positive definite symmetric part, since

$$A_S = \frac{1}{2}(A + A^T) = \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}.$$
This holds more generally (for the proof see [Axe94]).

Property 1.16 Let $A \in \mathbb{C}^{n\times n}$ (respectively, $A \in \mathbb{R}^{n\times n}$); if $(Ax, x)$ is real-valued $\forall x \in \mathbb{C}^n$, then A is hermitian (respectively, symmetric).

An immediate consequence of the above results is that matrices that are positive definite in $\mathbb{C}^n$ do satisfy the following characterizing property.

Property 1.17 A square matrix A of order n is positive definite in $\mathbb{C}^n$ iff it is hermitian and has positive eigenvalues. Thus, a positive definite matrix is nonsingular.

In the case of positive definite real matrices in $\mathbb{R}^n$, results more specific than those presented so far hold only if the matrix is also symmetric (this is the reason why many textbooks deal only with symmetric positive definite matrices). In particular:

Property 1.18 Let $A \in \mathbb{R}^{n\times n}$ be symmetric. Then, A is positive definite iff one of the following properties is satisfied:
1. $(Ax, x) > 0$ $\forall x \neq 0$ with $x \in \mathbb{R}^n$;
2. the eigenvalues of the principal submatrices of A are all positive;
3. the dominant principal minors of A are all positive (Sylvester criterion);
4. there exists a nonsingular matrix H such that $A = H^T H$.
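These characterizations translate directly into numerical tests. The sketch below (Python with NumPy) applies them to the symmetric part $A_S$ of the matrix (1.27) computed above: positive eigenvalues, the Sylvester criterion, and a factorization $A = H^T H$ obtained here via a Cholesky factorization.

```python
import numpy as np

AS = np.array([[ 2., -1.],
               [-1.,  2.]])

# 1. positive eigenvalues (AS is symmetric, so eigvalsh applies)
print(np.all(np.linalg.eigvalsh(AS) > 0))

# 3. Sylvester criterion: all dominant principal minors are positive
minors = [np.linalg.det(AS[:k, :k]) for k in range(1, AS.shape[0] + 1)]
print(all(m > 0 for m in minors))

# 4. A = H^T H with H nonsingular: take H = L^T, L the Cholesky factor (A = L L^T)
L = np.linalg.cholesky(AS)        # raises LinAlgError if AS is not positive definite
print(np.allclose(L @ L.T, AS))
```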

All the diagonal entries of a positive definite matrix are positive. Indeed, if $e_i \in \mathbb{R}^n$ is the i-th vector of the canonical basis of $\mathbb{R}^n$, then $e_i^T A e_i = a_{ii} > 0$. Moreover, it can be shown that if A is symmetric positive definite, the entry with the largest module must be a diagonal entry (these last two properties are therefore necessary conditions for a matrix to be positive definite).

We finally notice that if A is symmetric positive definite and $A^{1/2}$ is the only positive definite matrix that is a solution of the matrix equation $X^2 = A$, the norm

$$\|x\|_A = \|A^{1/2}x\|_2 = (Ax, x)^{1/2} \qquad (1.28)$$
defines a vector norm, called the energy norm of the vector x. Related to the energy norm is the energy scalar product given by $(x, y)_A = (Ax, y)$.

Definition 1.24 A matrix $A \in \mathbb{R}^{n\times n}$ is called diagonally dominant by rows if
$$|a_{ii}| \ge \sum_{j=1,\, j\neq i}^{n} |a_{ij}|, \quad i = 1, \dots, n,$$
while it is called diagonally dominant by columns if

$$|a_{ii}| \ge \sum_{j=1,\, j\neq i}^{n} |a_{ji}|, \quad i = 1, \dots, n.$$
If the inequalities above hold in a strict sense, A is called strictly diagonally dominant (by rows or by columns, respectively).

A strictly diagonally dominant matrix that is symmetric with positive diagonal entries is also positive definite.

Definition 1.25 A nonsingular matrix $A \in \mathbb{R}^{n\times n}$ is an M-matrix if $a_{ij} \le 0$ for $i \neq j$ and if all the entries of its inverse are nonnegative.

M-matrices enjoy the so-called discrete maximum principle, that is, if A is an M-matrix and $Ax \le 0$, then $x \le 0$ (where the inequalities are meant componentwise). In this connection, the following result can be useful.

Property 1.19 (M-criterion) Let a matrix A satisfy $a_{ij} \le 0$ for $i \neq j$. Then A is an M-matrix if and only if there exists a vector $w > 0$ such that $Aw > 0$.
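As an illustration (a sketch in Python with NumPy), the M-criterion can be applied to the tridiagonal matrix tridiag(−1, 2, −1), whose off-diagonal entries are nonpositive: the positive vector with components $w_i = i(n+1-i)$ gives $Aw > 0$, and indeed the inverse of A turns out to be entrywise nonnegative.

```python
import numpy as np

n = 6
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # tridiag(-1, 2, -1): a_ij <= 0 for i != j

i = np.arange(1, n + 1)
w = i * (n + 1 - i)                    # w > 0 componentwise

print(np.all(A @ w > 0))               # M-criterion: Aw > 0 (each component equals 2)
print(np.all(np.linalg.inv(A) >= 0))   # consistently, the inverse has nonnegative entries
```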