Basic Perron Frobenius Theory and Inverse Spectral Problems
Total Page:16
File Type:pdf, Size:1020Kb
I. BASIC PERRON FROBENIUS THEORY AND INVERSE SPECTRAL PROBLEMS MIKE BOYLE Contents 1. Introduction 1 2. The primitive case 1 3. Why the Perron Theorem is useful 2 4. A proof of the Perron Theorem 3 5. The irreducible case 5 6. General nonnegative matrices 8 7. Perron numbers and Mahler measures 9 8. The Spectral Conjecture 10 1. Introduction By a nonnegative matrix we mean a matrix whose entries are nonnegative real numbers. By positive matrix we mean a matrix all of whose entries are strictly positive real numbers. These notes (with appendices) give the core elements of the Perron-Frobenius theory of nonnegative matrices. This splits into three parts: (1) the primitive case (due to Perron) (2) the irreducible case (due to Frobenius) (3) the general case (due to?) 2. The primitive case Definition 2.1. A primitive matrix is a square nonnegative matrix some power of which is positive. The primitive case is the heart of the Perron-Frobenius theory and its applica- tions. More definitions: • The spectral radius of a square matrix is the maximum of the moduli of the roots of its characteristic polynomial. • A number λ is a simple root of a polynomial p(x) if it is a root of multiplicity one (i.e., p(λ) = 0 and p0(λ) 6= 0). • For a matrix A or vector v, we define the norm (jjAjj or jjvjj) to be the sum of the absolute values of its entries. 1 2 MIKE BOYLE 0 n If jj·jj and jj·jj are two norms on R , then there are positive constants C1;C2 > 0 such that for all v in Rn 0 0 jjvjj ≤ C1jjvjj and jjvjj ≤ C2jjvjj : So, our particular choice of norm isn't important. In the sequel, inequalities of matrices or vectors are defined to hold entrywise. Theorem 2.2 (Perron Theorem). Suppose A is a primitive matrix, with spectral radius λ. Then λ is a simple root of the characteristic polynomial which is strictly greater than the modulus of any other root, and λ has strictly positive eigenvectors. For example, 0 2 • is primitive (eigenvalues are 2; −1) 1 1 0 4 • is not primitive (eigenvalues are 2; −2) 1 0 1 0 • is not primitive (1 is a repeated root of char.polynomial) 1 1 3. Why the Perron Theorem is useful The Perron theorem provides a very clear picture of the way large powers of a primitive matrix behave, with exponentially good estimates. Theorem 3.1. Suppose A is primitive. Let u be a positive left eigenvector and let v be a positive right eigenvector for the spectral radius λ, chosen such that uv = (1). Then ((1/λ)A)n converges to the positive matrix vu, exponentially fast. The theorem says that for large n, An − λnvu has entries exponentially smaller than An; the dominant behavior of An is described by the positive rank one matrix λnvu. For example, if A is a stochastic matrix defining a stationary Markov chain, then λ = 1 and An(i; j) is the probability of being in state j after n steps from state i. Here the Perron Theorem makes a statement that for each j the probability of being in state j after n steps is positive and rapidly approaches independence of the initial state i. 1 3 Example 3.2. Let A = . Then A has spectral radius λ = 4, with left and 2 2 1 right eigenvectors (2; 3) and . Normalizing to achieve uv = (1), we define 1 1 u = 2 3 and v = (1=5) : 1 Then 1 2 3 2=5 3=5 vu = (1=5) 2 3 = (1=5) = : 1 2 3 2=5 3=5 One can check that indeed 2=5 3=5 3=5 −3=5 An = 4n + (−1)n : 2=5 3=5 −2=5 2=5 BASIC PERRON FROBENIUS THEORY 3 Proof of Theorem. The matrix (1/λ)A multiplies the eigenvectors u and v by 1 (i.e. leaves them unchanged). Let W be the A-invariant codimension 1 subspace of column vectors complemen- tary to Rv. Let β be the spectral radius of the restriction of A to V . Then there are k > 0 and C > 0 such that for all w in W , and for all positive integers n, jjAnwjj ≤ Cnkβnjjwjj : By the Perron Theorem, β < λ, so 1 n β n jj A wjj ≤ Cnk jjwjj λ λ which goes to zero exponentially fast. Therefore ((1/λ)A)n converges to the (unique) rank one matrix M which anni- hilates W and satisfies Mv = v. For any w in W , 1 n uw = u A w λ 1 n = u A w −! 0 as n ! 1 λ and therefore uw = (0). Now to check vu = M, we check that (vu) is a rank one matrix fixing v and annihilating W : (vu)w = v(uw) = 0 for all w 2 W (vu)v = v(uv) = v(1) = v : 4. A proof of the Perron Theorem We'll give a proof of the Perron Theorem. There are others. Theorem 4.1 (Perron Theorem). Suppose A is a primitive matrix, with spectral radius λ. Then λ is a simple root of the characteristic polynomial which is strictly greater than the modulus of any other root, and λ has strictly positive eigenvectors. Note that the \simple root" condition is stronger than the condition that λ have a one dimensional eigenspace, because a one-dimensional eigenspace may be part of a larger-dimensional generalized eigenspace. For example, consider 4 1 and 4 : 0 4 We begin with a geometrically compelling lemma. Lemma 4.2. Suppose T is a linear transformation of a finite dimensional real vector space, S0 is a polyhedron containing the origin in its interior, and a positive power of T maps S0 into its interior. Then the spectral radius of T is less than 1. Proof of the lemma. Without loss of generality, we may suppose T maps S0 into its interior. Clearly, there is no root of the characteristic polynomial of modulus greater than 1. The image of S0 is a closed set which does not intersect the boundary of S0. Because T n(S0) ⊂ T (S0) if n ≥ 1, no point on the boundary of S0 can be an image of a power of T , or an accumulation point of points which are images of powers of T . But this is contradicted if T has an eigenvalue of modulus 1, as follows: 4 MIKE BOYLE CASE I: a root of unity is an eigenvalue of a T. In this case, 1 is an eigenvalue of a power of T , and a power of T has a fixed point on the boundary of S0. Thus the image of S0 under a power of T intersects the boundary of S0, a contradiction. CASE II: there is an eigenvalue of modulus 1 which is not a root of unity. In this case, let V be a 2-dimensional subspace on which T acts as an irrational rotation. Let p be a point on the boundary of S0 which is in V . Then p is a limit point of fT n(p): n > 1g, so p is in the image of T , a contradiction. This completes the proof. Proof of the Perron Theorem. There are three steps. STEP 1: get the positive eigenvector. P The unit simplex S is the set of nonnegative vectors v such that jjvjj := i vi equals 1. The map S ! S 1 v 7! vA jjvAjj is well defined (vA 6= 0 because no row of A is zero) and continuous. By Brouwer's Fixed Point Theorem, this map has a fixed point, which must be a nonnegative eigenvector of A for some positive eigenvalue, λ. Because a power of A is positive, the eigenvector must be positive. STEP 2: stochasticize A. Let r be a positive right eigenvector. Let R be the diagonal matrix whose −1 diagonal entries come from r, i.e. R(i; i) = ri. Define the matrix P = (1/λ)R AR. P is still primitive. The column vector with every entry equal to 1 is an eigenvector of P with eigenvalue 1. Therefore every row sum of P is 1, and P is stochastic. It now suffices to do Step 3. STEP 3: show 1 is a simple root of the characteristic polynomial of P dominating the modulus of any other root. Consider the action of P on row vectors: P maps the unit simplex S into itself and a power of P maps S into its interior. From Step 1, we know there is a positive row vector v in S which is fixed by P . Therefore S0 = −v + S is a polyhedron, whose interior contains the origin. By the lemma the restriction of P to the subspace V spanned by S0 has spectral radius less than 1. But V is P -invariant with codimension 1. Remarks 4.3 (Remarks on the proof above.). (1) Any number of people have noticed that applicability of Brouwer's Theorem (Ky Fan in the 1950's.) It's a matter of taste as to whether to use it to get the eigenvector. There are other significant arguments for getting the existence of the positive eigenvector. (2) The proof above, using the easy reduction to the geometrically clear and simple lemma, was found by Michael Brin in 1993. It is dangerous in this area to claim a proof is new. I haven't seen an earlier explicit use of this reduction. (3) The utility of the stochasticization trick is by no means confined to this theorem.