
International Journal of Bifurcation and Chaos, Vol. 10, No. 1 (2000) 103–122
© World Scientific Publishing Company

RIGOROUS NUMERICAL ESTIMATION OF LYAPUNOV EXPONENTS AND INVARIANT MEASURES OF ITERATED SYSTEMS AND RANDOM MATRIX PRODUCTS

GARY FROYLAND∗
Department of Mathematics and Computer Science, University of Paderborn,
Warburger Str. 100, Paderborn 33098, Germany

KAZUYUKI AIHARA
Department of Mathematical Engineering and Information Physics, The University of Tokyo,
7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan

Received January 5, 1999; Revised April 13, 1999

We present a fast, simple matrix method of computing the unique invariant measure and associated Lyapunov exponents of a nonlinear iterated function system. Analytic bounds for the error in our approximate invariant measure (in terms of the Hutchinson metric) are provided, while convergence of the Lyapunov exponent estimates to the true value is assured. As a special case, we are able to rigorously estimate the Lyapunov exponents of an iid random matrix product. Computation of the Lyapunov exponents is carried out by evaluating an integral with respect to the unique invariant measure, rather than following a random orbit. For low-dimensional systems, our method is considerably quicker and more accurate than conventional methods of exponent computation. An application to Markovian random matrix products is also described.

1. Outline and Motivation

This paper is divided into three parts. Firstly, we consider approximating the invariant measure of a contractive iterated function system (the density of dots that one sees in computer plots). Secondly, we describe a method of estimating the Lyapunov exponents of an affine IFS, or equivalently, the Lyapunov exponents of an iid random matrix product. While invariant measures of iterated function systems and Lyapunov exponents of random matrix products may seem rather like different objects, they are mathematically very similar, and our method of approximation is almost identical for each. In both cases, we discretise appropriate spaces, as an alternative to random iteration. Finally, we bring the invariant measure and Lyapunov exponent approximations together to produce a method of estimating the Lyapunov exponents of a nonlinear IFS.

1.1. Invariant measures of iterated function systems

Contractive iterated function systems are relatively simple from the point of view of ergodic theory, as they have a unique invariant measure. We call a (Borel) probability measure $\mu$ invariant if $\mu = \sum_{k=1}^r w_k\, \mu \circ T_k^{-1}$, where the weights $w_k$ and maps $T_k$, $k = 1, \ldots, r$ define our IFS (formal definitions appear later).

∗E-mail: [email protected]; URL: http://www-math.uni-paderborn.de/∼froyland


This invariant measure is the distribution that you would “see” if you iterated an initial “blob” of mass forward for an infinitely long time. It is also the distribution that appears when random orbits of iterated function systems are plotted on a computer. As there is only one invariant measure, it describes the distribution of almost all [Elton, 1987] random trajectories of the IFS, and so plays a commanding role in the behavior of the dynamics. However, despite its ubiquity, methods for obtaining its numerical approximation have been rather rudimentary to date.

The most common way of obtaining a numerical approximation is to form a histogram from a very long random orbit. The IFS is defined by a finite collection of maps $T_1, \ldots, T_r$. At each time step, a map $T_k$ is chosen in an iid fashion and applied to the current point. In this way, long random orbits are produced. There is a theorem [Elton, 1987] that says that this method works, but it is often slow and there are no bounds on the error for finite length orbits. The authors know of two other constructions in the literature. Firstly, Boyarsky and Lou [1992] introduced a matrix method that is applicable only when each map $T_k$ is Jablonski (in two dimensions, this means that each $T_k$ has the form $T_k(x_1, x_2) = (T_{k,1}(x_1), T_{k,2}(x_2))$, where $T_{k,1}$ and $T_{k,2}$ are one-dimensional maps). Their result is very restrictive in the class of mappings it may be applied to, and their results do not include any error bounds for the approximation. Secondly, the book by Peruggia [1993] introduces a general method of discretisation, as a way of approximating the attracting invariant sets and the invariant measures of an IFS. While the constructions in [Peruggia, 1993] are similar to those of the present paper, our approach is entirely different, as the results of [Peruggia, 1993] rely on random iteration, which we wish to avoid. Because of this reliance on random iteration, [Peruggia, 1993] contains no quantitative bounds on the accuracy of the approximation.

In this paper, we present a simple computational method of rigorously approximating the unique invariant measure of an IFS (linear or nonlinear) based on a discretisation of the dynamics. As we are producing a computer estimate of the invariant measure, it must be in the form of a histogram, on a partition of the user’s choice. The number of calculations required is $O(n)$, where $n$ is the cardinality of the histogram partition, while the accuracy¹ is $O(n^{-1/d})$, where the invariant attracting set lies in $\mathbb{R}^d$. We present analytic bounds for the accuracy of our approximation in terms of the size of the histogram partition sets.

Remark 1.1. After completion of this work, the paper by Stark [1991] was brought to our attention, in which results similar to those of Theorem 2.2 were obtained with a view to implementation on a neural network.

1.2. Lyapunov exponents of random matrix products and nonlinear iterated function systems

We also describe a new rigorous numerical method of computing the Lyapunov exponents of an IFS. In analogy to the deterministic case [Benettin et al., 1980], the traditional method of exponent computation is to choose a random starting point in phase space and run out a long random orbit

$$x_N = x_N(k_{N-1}, \ldots, k_0, x_0) = T_{k_{N-1}} \circ \cdots \circ T_{k_0} x_0, \qquad (1)$$

where $k_i$, $i = 0, \ldots, N-1$ are iid random variables that determine which map from a finite collection is to be applied. The local rate of contraction in a randomly chosen direction is averaged along this orbit by sequentially applying the Jacobian matrices of the maps. Symbolically, the Lyapunov exponents $\lambda$ are computed as a time average by:

$$\lambda := \lim_{N\to\infty} \frac{1}{N} \log\left( \| DT_{k_{N-1}}(x_{N-1}) \circ \cdots \circ DT_{k_0}(x_0)\, v \| \right), \qquad (2)$$

where $DT_{k_i}(x_i)$ denotes the Jacobian matrix of the map applied at the $i$th time step, evaluated at $x_i$ (the $i$th element of the random time series), and $v$ denotes a starting vector. It is numerically observed, and in some cases may be proven, that the same value of $\lambda$ is obtained for all starting vectors $v$.

Even in simple situations, such as an IFS comprised of affine mappings, the standard time averaging technique may suffer from instabilities and random fluctuations. This is so especially if some of the Jacobian matrices are near to singular, or some of the mappings of the IFS are chosen with very small probability.

¹Using adaptive methods of subdividing the state space, the accuracy may be improved if a “thin” set is living in a higher-dimensional Euclidean space; see [Dellnitz & Junge, 1998] for example.

Using a simple affine IFS, we demonstrate that Lyapunov exponent calculations via a time average fluctuate significantly along the orbit, and even with long orbits the instabilities persist. Perhaps more unsettling is the observation that for very long orbits, the variation of exponent estimates between individual orbits is much greater than the variation along a particular orbit. Thus, apparent convergence to some value along a particular orbit need not imply that the value is a good estimate of a Lyapunov exponent.

Rather than be subjected to the random nature of the time average, we instead perform a space average, which makes direct use of our approximation of the invariant measure. The Lyapunov exponents are then calculated as an expectation or integral that automatically averages out the random fluctuations and is unaffected by near singular Jacobian matrices or infrequently applied maps. We show how to calculate all of the Lyapunov exponents of an IFS, be it linear or nonlinear.

We begin with the case where each mapping is affine, as we are then simply dealing with an iid random matrix product, where at each step a matrix is selected from a finite collection of matrices. We give a detailed example for a well-known affine IFS and compare our results with the standard time average. Later we demonstrate an extension of our method to a random matrix product where the matrices are selected according to a Markov process. In the case of nonaffine IFSs, our method is a generalization of the space averaging method for deterministic systems [Froyland et al., 1995].

While we acknowledge that our methods do not provide analytic solutions for the Lyapunov exponents, they are mathematically rigorous, and in low-dimensional systems we have found them to provide estimates that are much more stable and accurate than those obtained via random iteration.

2. Invariant Measures of Iterated Function Systems

2.1. Background

The purpose of this paper is to numerically demonstrate the application of our methods and to compare and contrast them with standard techniques. Proofs of most results and proof sketches of the remainder are given in the appendix. The theory for the more technical results is developed in [Froyland, 1998]. We begin by formalizing the mathematical objects that we will work with, so that we are able to state precise results. In the next section we give detailed examples of the use of our method.

Typically our IFS will act on some compact subset $M$ of $\mathbb{R}^d$. One has a finite collection of Lipschitz mappings $T_1 : M \to M, \ldots, T_r : M \to M$, and an associated collection of probabilities $w_1, \ldots, w_r$ such that $\sum_{k=1}^r w_k = 1$ and $w_k \ge 0$, $k = 1, \ldots, r$. At each iteration step, a map $T_k$ is selected with probability $w_k$ and applied to the current point in our space $M$. In this way, random orbits $\{x_i\}_{i=0}^{N-1}$ are generated as in (1).

We wish to describe the asymptotic distribution of such random orbits. Often, one obtains the same asymptotic distribution for every initial point $x \in M$ and almost every random orbit. A sufficient² condition for our IFS to possess only one such asymptotic distribution is that the IFS is a contraction on average; that is,

$$s := \sum_{k=1}^r w_k s_k < 1\,, \qquad (3)$$

where

$$s_k := \sup_{x, y \in M} \frac{\|T_k x - T_k y\|}{\|x - y\|} \qquad (4)$$

is the Lipschitz constant for $T_k$.

At this point, we make precise what is meant by an “asymptotic distribution”. Denote by $\mathcal{M}(M)$ the space of Borel probability measures on $M$; elements of this space describe “mass” distributions on $M$. We seek a probability measure $\mu \in \mathcal{M}(M)$ satisfying

$$\mu = \sum_{k=1}^r w_k\, \mu \circ T_k^{-1}\,, \qquad (5)$$

and will call such a probability measure $\mu$ an invariant probability measure. Later we shall use the operator $\mathcal{P} : \mathcal{M}(M) \to \mathcal{M}(M)$ defined by

$$\mathcal{P}\nu = \sum_{k=1}^r w_k\, \nu \circ T_k^{-1}\,, \qquad \nu \in \mathcal{M}(M)\,. \qquad (6)$$

The invariance of a measure $\mu$ is equivalent to $\mu$ being a fixed point of $\mathcal{P}$.

²There are weaker conditions ([Barnsley & Elton, 1988] for example) that imply the existence of a unique invariant probability measure, but as we desire a bound for our approximation, we use the simple condition (3).

Remark 2.1. There are several reasons why we choose to define a measure $\mu$ satisfying (5) as “invariant” under our IFS. Firstly, recall that the measure $\mu \circ T_k^{-1}$ describes the image of the mass distribution $\mu$ under $T_k$. Thus, the RHS of (5) is a weighted average of the images of $\mu$ under the various mappings $T_k$. As such, $\mu$ has the interpretation of being invariant on average under the action of the IFS. Secondly, a theorem by Elton [1987] states that the measure $\mu$ is the unique probability measure that is “exhibited” by almost all orbits of the IFS. That is, if one defines a probability measure by

$$\tilde\mu = \tilde\mu(x, k_0, k_1, \ldots) := \lim_{N\to\infty} \frac{1}{N} \sum_{i=0}^{N-1} \delta_{x_i}\,, \qquad (7)$$

where $x_i$ is as in (1), and a weak limit (see below) is meant, then for every $x \in M$ and almost all sequences of map choices $(k_0, k_1, \ldots)$, one has $\tilde\mu = \mu$. Thus the distribution of points in $M$ that one sees from a finite random orbit of the IFS will almost always be close to the unique measure $\mu$ satisfying (5). Thirdly, the measure $\mu$ appears as a projection of an invariant measure of a deterministic representation of the IFS, called a skew product. This deterministic representation is used to formalize the mathematics from an ergodic theoretic point of view; see [Froyland, 1998] for a discussion.

In the sequel, we will be trying to approximate various measures, so we need a metric on $\mathcal{M}(M)$ in order to tell how good our approximations are, or if convergence occurs at all. The measures we will be attempting to approximate live on fractal sets and have a description which is infinite in nature, while our numerical approximations are in the form of histograms and have a finite description. For this reason, we cannot expect our computer approximations to be accurate in the sense of strong convergence; the natural choice for fractal measures is approximation in the sense of weak convergence (see [Lasota & Mackey, 1994, Sec. 12.2] for further details on strong and weak convergence of measures). The effect (and sometimes definition) of weak convergence is that:

$$\mu_n \to \mu \text{ weakly} \iff \int_M g\, d\mu_n \to \int_M g\, d\mu \quad \text{for all } g \in C(M, \mathbb{R})\,. \qquad (8)$$

Thus if we are concerned with how our measures integrate continuous functions, then weak convergence is the natural choice for convergence. A very useful property of weak convergence is that given a sequence of probability measures $\{\mu_n\}$ we may always find a weakly convergent subsequence. That is, the space $\mathcal{M}(M)$ is compact with the topology induced by weak convergence.

In order to quantify how good our approximations are, we require a metric. A useful metric which generates weak convergence as defined in (8) is the “Hutchinson metric”, defined by

$$d_H(\mu, \nu) = \sup_{h \in \mathrm{Lip}(1)} \left| \int_M h\, d\mu - \int_M h\, d\nu \right|\,, \qquad (9)$$

where $\mathrm{Lip}(s)$ is the space of Lipschitz functions with Lipschitz constant $s$; that is, $\mathrm{Lip}(s) = \{h \in C(M, \mathbb{R}) : |h(x) - h(y)| \le s\|x - y\|\}$. For further details on weak convergence, see [Falconer, 1997].

2.2. Discretization and statement of result

To find a fixed point of the infinite-dimensional operator $\mathcal{P}$ is difficult, so we replace $\mathcal{P}$ with a simpler finite-dimensional operator $\mathcal{P}_n$ whose fixed points are close to that of $\mathcal{P}$.

Construct a partition of $M$ into $n$ connected sets $\{A_1, \ldots, A_n\}$. From each set, choose a single point $a_i$, $i = 1, \ldots, n$, and for each mapping $T_k$ define an $n \times n$ matrix $P_n(k)$ by setting

$$P_{n,ij}(k) = \begin{cases} 1, & \text{if } T_k a_i \in A_j\,, \\ 0, & \text{otherwise}\,. \end{cases} \qquad (10)$$

This matrix requires $O(n)$ calculations to construct and is very sparse. Combine these $r$ matrices to form the $n \times n$ matrix $P_n$:

$$P_n := \sum_{k=1}^r w_k P_n(k) \qquad (11)$$

and denote by $p_n$ a normalized fixed left eigenvector of $P_n$. We are now in a position to define our approximation of $\mu$. We simply place a weight of $p_{n,i}$ at the position $a_i$, forming a probability measure that is a convex combination of $n$ $\delta$-measures. That is,

$$\mu_n := \sum_{i=1}^n p_{n,i}\, \delta_{a_i}\,. \qquad (12)$$

Theoretical results are available on how good our approximation is.
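To make the construction of (10)–(12) concrete, the following minimal Python sketch assembles the sparse matrix $P_n$ for a user-supplied IFS and finds its fixed left eigenvector by the power method. This is our own illustration, not code from the paper; the helper names (`build_Pn`, `fixed_left_eigenvector`) and the `locate` callback are hypothetical.

```python
import numpy as np
from scipy import sparse

def build_Pn(maps, weights, centers, locate):
    """Assemble P_n = sum_k w_k P_n(k) of Eqs. (10) and (11).

    maps    -- list of functions T_k acting on points of M
    weights -- list of probabilities w_k (non-negative, summing to 1)
    centers -- length-n list of test points a_i, one per partition set A_i
    locate  -- function returning the index j of the set A_j containing a point
    """
    n = len(centers)
    P = sparse.lil_matrix((n, n))
    for T, w in zip(maps, weights):
        for i, a in enumerate(centers):
            P[i, locate(T(a))] += w      # row i gets weight w_k in column j
    return P.tocsr()                     # sparse storage: O(n) nonzero entries

def fixed_left_eigenvector(P, sweeps=500):
    """Power method for p_n with p_n P_n = p_n (eigenvalue 1 is the largest)."""
    p = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(sweeps):
        p = P.T @ p                      # left eigenvector of P = right of P^T
        p /= p.sum()                     # keep p a probability vector
    return p
```

The approximate invariant measure (12) is then the measure placing mass `p[i]` at the point `centers[i]`.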

Theorem 2.2.

(i) For a general partition of connected sets,

$$d_H(\mu, \mu_n) \le \frac{1+s}{1-s} \max_{1 \le i \le n} \operatorname{diam}(A_i)\,.$$

(ii) If the partition is a regular cubic lattice, and the $a_i$ are chosen as center points of the partition sets,

$$d_H(\mu, \mu_n) \le \frac{1+s}{2(1-s)} \max_{1 \le i \le n} \operatorname{diam}(A_i)\,.$$

Thus we have a rigorous upper bound for the distance between our approximation and the true invariant measure in terms of the natural metric for fractal measures.

Remark 2.3. In terms of computing time, one must calculate the images of $rn$ points and for each of these points, a search must be performed over the $n$ partition sets to find which set contains the image. At first glance, the computing time required by our method seems to be $O(n^2)$. However, by using the continuity of the maps $T_k$, the search may be restricted to only a few partition sets, since the images of neighboring points are close. Thus the construction of $P_n$ takes $O(n)$ computing time. Since unity is the largest eigenvalue of $P_n$, the power method may be used to find a fixed left eigenvector, and the computing time is negligible. The memory requirements for finding the left eigenvector are also $O(n)$, as the power method requires only the matrix $P_n$ (which may be represented in sparse form, storing only the positions and values of nonzero entries) and the current test vector to be held in memory.

At this point we note that once we have calculated the matrices $P_n(k)$ for a given set of mappings $T_1, \ldots, T_r$, it is a very simple matter to alter the probabilities $w_k$ and recalculate an approximate invariant measure for this new system. This is because almost all of the computing effort goes into the construction of the $P_n(k)$. For new probabilities $w_k$, we merely construct a new matrix $P_n$ from (11) and find a fixed left eigenvector. Thus once we have an approximation for one set of probabilities, new approximations may be computed very quickly for a whole range of probabilities. The same idea applies to the estimation of Lyapunov exponents in the following sections and is briefly illustrated in the final example.

2.3. Example: The fern

We illustrate our results for a well-known IFS. Four affine mappings are used to produce a picture of a fern [Gutiérrez et al., 1996]. In a later section, we will find the Lyapunov exponents of this IFS.

$$T_1 x = \begin{pmatrix} 0.81 & 0.07 \\ -0.04 & 0.84 \end{pmatrix} x + \begin{pmatrix} 0.12 \\ 0.195 \end{pmatrix}, \qquad T_2 x = \begin{pmatrix} 0.18 & -0.25 \\ 0.277 & 0.28 \end{pmatrix} x + \begin{pmatrix} 0.12 \\ 0.02 \end{pmatrix},$$

$$T_3 x = \begin{pmatrix} 0.19 & 0.275 \\ 0.238 & -0.14 \end{pmatrix} x + \begin{pmatrix} 0.16 \\ 0.12 \end{pmatrix}, \qquad T_4 x = \begin{pmatrix} 0.0235 & 0.087 \\ 0.045 & 0.1666 \end{pmatrix} x + \begin{pmatrix} 0.11 \\ 0 \end{pmatrix}.$$

The respective probabilities are $w_1 = 0.753$, $w_2 = 0.123$, $w_3 = 0.104$, $w_4 = 0.02$, while the contraction factors are $s_1 = 0.8480$, $s_2 = 0.3580$, $s_3 = 0.3361$, $s_4 = 0.1947$, yielding $s = 0.7215$. A representation of an approximation of the unique invariant measure using a $200 \times 200$ grid $[0.005(g-1), 0.005g] \times [0.005(h-1), 0.005h]$, $g = 1, \ldots, 200$, $h = 1, \ldots, 200$ is shown in Fig. 1. As we choose grid center points $(0.0025(2g-1), 0.0025(2h-1))$, $g = 1, \ldots, 200$, $h = 1, \ldots, 200$, by Theorem 2.2 we know that

$$d_H(\mu_{40000}, \mu) \le \frac{1 + 0.7215}{2(1 - 0.7215)} \cdot \frac{\sqrt{2}}{200} \approx 0.0219\,.$$

To put this in perspective, if $\mu'_{40000}$ is the probability measure produced by translating the “picture” in Fig. 1 a distance of 0.0219 units, then $d_H(\mu_{40000}, \mu'_{40000}) \le 0.0219$.

A direct comparison of these estimates with estimates of $\mu$ obtained via random iteration is difficult. One way to quantify the error is to run out a random orbit of length $N$ and define $\mu_N$ as in (7). One may then calculate $d_H(\mu, \mu_N)$ and average this error over all possible random orbits. This would give an “expected error”, quantified in terms of the Hutchinson metric. For simple systems, such a theoretical analysis is possible, and in [Froyland, 1998] the one-dimensional Cantor system is considered, with our method clearly outperforming random iteration or the “chaos game”. For more complicated systems, our method has the advantage of possessing a rigorous upper bound on the error of the approximation; something which is lacking from approximations obtained from random iteration.
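Applied to the fern, the helpers sketched in Sec. 2.2 could be driven roughly as follows; this is again our own illustration, under the assumption that the attractor lies in the unit square partitioned into the $200 \times 200$ grid of boxes above.

```python
import numpy as np

A = [np.array([[0.81, 0.07], [-0.04, 0.84]]),
     np.array([[0.18, -0.25], [0.277, 0.28]]),
     np.array([[0.19, 0.275], [0.238, -0.14]]),
     np.array([[0.0235, 0.087], [0.045, 0.1666]])]
b = [np.array([0.12, 0.195]), np.array([0.12, 0.02]),
     np.array([0.16, 0.12]), np.array([0.11, 0.0])]
maps = [lambda x, A=Ak, b=bk: A @ x + b for Ak, bk in zip(A, b)]
weights = [0.753, 0.123, 0.104, 0.02]

g = 200                                  # 200 x 200 grid on [0, 1]^2
cells = (np.arange(g) + 0.5) / g         # box-center coordinates per axis
centers = [np.array([x, y]) for x in cells for y in cells]

def locate(pt):
    """Index of the grid box containing pt (clipped to the unit square)."""
    ij = np.clip((pt * g).astype(int), 0, g - 1)
    return int(ij[0] * g + ij[1])

P = build_Pn(maps, weights, centers, locate)
mu = fixed_left_eigenvector(P)           # the histogram weights p_{n,i} of (12)
```

Plotting `mu` as a height over the grid produces a picture like Fig. 1.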

Fig. 1. A picture of a fern using a 200 × 200 grid. The density of $\mu_{40000}$ is plotted along the third coordinate axis (height).

On an iteration for iteration basis in low-dimensional systems, we expect our discrete method to produce superior approximations of $\mu$, with the rigorous bound of Theorem 2.2 often being very conservative. Intelligent ways of refining the partition (see [Dellnitz & Junge, 1998], for example) may also be used to increase the efficiency of the method.

3. Lyapunov Exponents of Iid Random Matrix Products

3.1. Background

We begin by discussing the Lyapunov exponents of an iid random matrix product (or affine IFS), as they are simpler to describe than the Lyapunov exponents of a nonlinear IFS. The latter will be dealt with in a later section.

An iid random matrix product is mathematically very similar to an IFS. Our IFS was defined by randomly selecting a map from some finite collection at each time step, and applying this selected map to the current point. By simply replacing the collection of maps $T_1 : M \to M, \ldots, T_r : M \to M$ with a collection of $d \times d$ matrices $M_1 : \mathbb{R}^d \to \mathbb{R}^d, \ldots, M_r : \mathbb{R}^d \to \mathbb{R}^d$ (treating them as maps on $\mathbb{R}^d$ via matrix multiplication), we may now study the (random) evolution of an initial (nonzero) vector $v_0 \in \mathbb{R}^d$ under repeated multiplication by the $M_k$, just as we studied the evolution of an initial point $x_0 \in M$ under the action of the $T_k$. That is, we can define $v_N = v_N(k_{N-1}, \ldots, k_0, v_0) := M_{k_{N-1}} \circ \cdots \circ M_{k_0} v_0$ in analogy with (1), where composition now means matrix multiplication. Because this system is simply linear (and not affine as in some IFSs), the dynamics is often rather boring, with $v_i$ heading off towards infinity or contracting towards the origin.

What is of more interest is the exponential rate at which $v_i$ diverges to infinity or converges to zero, and this rate is quantified by the Lyapunov exponents of the random matrix product. Precisely, one defines the limit:

$$\lambda^{(0)} = \lim_{N\to\infty} \frac{1}{N} \log\left( \| M_{k_{N-1}} \circ \cdots \circ M_{k_0} v_0 \| \right)\,. \qquad (13)$$

Provided that

Hypotheses 3.1.

(i) both $\|M_k\|$ and $\|M_k^{-1}\|$ are finite for $k = 1, \ldots, r$, and
(ii) there is no nontrivial subspace of $\mathbb{R}^d$ that is left invariant by all $M_k$, $k = 1, \ldots, r$,

the limit (13) exists for almost all random orbits. Moreover, when this limit exists, the same value of $\lambda^{(0)}$ is obtained for all such orbits and for all starting vectors $v_0$; see [Furstenberg & Kifer, 1983] for details. In the sequel, we will assume that Hypotheses 3.1 hold.³ Our matrices will be selected from $GL(d, \mathbb{R})$, the multiplicative group of invertible $d \times d$ real matrices. The invertibility condition ensures that our product does not suddenly collapse to zero if the current vector happens to be in the null space of the next matrix.

Before describing our method of calculation, we introduce a space that plays a central role in our construction.

Real projective space. It is clear from (13) that only the direction of $v$ is important and not its length. We thus eliminate the unnecessary information and from now on consider our vectors to live in $(d-1)$-dimensional real projective space $\mathbb{RP}^{d-1}$ (the “space of directions” in $\mathbb{R}^d$). This “space of directions” is the factor space $\mathbb{RP}^{d-1} = (\mathbb{R}^d \setminus \{0\})/\!\sim$, where the equivalence relation $\sim$ is defined by $u \sim v \iff u = a \cdot v$ for some $a \in \mathbb{R}$; $u, v \in \mathbb{R}^d \setminus \{0\}$. One can visualize $\mathbb{RP}^{d-1}$ as one hemisphere of $S^{d-1}$, the $(d-1)$-dimensional unit sphere. For example, $\mathbb{RP}^1$ may be considered to be the “right” half of $S^1$ in $\mathbb{R}^2$ (comprised of those points with non-negative $x$-coordinate) with the endpoints identified. That is, $\mathbb{RP}^1 = \{(\cos\theta, \sin\theta) : -\pi/2 \le \theta < \pi/2\}$. A natural mapping $\Pi : \mathbb{R}^d \setminus \{0\} \to \mathbb{RP}^{d-1}$ may be defined by extending a given vector $v \in \mathbb{R}^d$ in both directions to form a line passing through the origin in the direction of $v$. This line intersects the sphere $S^{d-1}$ in two antipodal points $v^+$ and $v^-$, with the former lying in the right hemisphere. We define $\Pi(v) = v^+$. The simplest way to describe a point $v^+ \in \mathbb{RP}^{d-1}$ is through angular coordinates, and in numerical calculations we typically subdivide $S^{d-1}$ using an equiangular grid. Invertible $d \times d$ matrices have a natural action on $\mathbb{RP}^{d-1}$ under matrix multiplication. To compute the image of a point in $\mathbb{RP}^{d-1}$ under a $d \times d$ matrix, one may convert the vector into Cartesian coordinates, multiply by the matrix, and then convert back to angular coordinates. From now on, we assume that this conversion process is done automatically, and we denote by $v$ both a vector in $\mathbb{R}^d \setminus \{0\}$ and a point in $\mathbb{RP}^{d-1}$.
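In the case $d = 2$ this conversion is only a few lines. The sketch below (our own illustration, not code from the paper) realizes the action of a matrix on a point of $\mathbb{RP}^1$ represented by its angle $\theta \in [-\pi/2, \pi/2)$:

```python
import numpy as np

def rp1_action(M, theta):
    """Image in RP^1 of the direction with angle theta under the matrix M:
    convert to Cartesian coordinates, multiply, convert back, and fold the
    result into the fundamental interval [-pi/2, pi/2)."""
    v = np.array([np.cos(theta), np.sin(theta)])
    w = M @ v
    phi = np.arctan2(w[1], w[0])         # in (-pi, pi]
    if phi >= np.pi / 2:                 # identify antipodal directions
        phi -= np.pi
    elif phi < -np.pi / 2:
        phi += np.pi
    return phi
```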

³Conditions (i) and (ii) are satisfied for “almost all” choices of matrices $M_k$, $k = 1, \ldots, r$; see [Froyland, 1998] for a formal result.

3.2. The calculation of $\lambda^{(0)}$

We may rewrite (13) as

$$\lambda^{(0)} = \lim_{N\to\infty} \frac{1}{N} \sum_{\ell=0}^{N-1} \log \| M_{k_\ell} \hat v_\ell \|\,, \qquad (14)$$

where $\hat v_\ell$ denotes a unit vector in the direction of $v_\ell$. From this point onwards, vectors $v_i$ and $v$ will be assumed to have unit length. The sum (14) is a (random) time average, and may be converted into a space average [Furstenberg & Kifer, 1983],

$$\lambda^{(0)} = \sum_{k=1}^r w_k \int_{\mathbb{RP}^{d-1}} \log\|M_k v\| \, d\xi(v) \qquad (15)$$

for some suitable probability measures $\xi$ on $\mathbb{RP}^{d-1}$. In complete analogy to the IFS case, suitable probability measures satisfy

$$\xi = \sum_{k=1}^r w_k\, \xi \circ M_k^{-1}\,. \qquad (16)$$

Following Sec. 2, we define an operator $\mathcal{D} : \mathcal{M}(\mathbb{RP}^{d-1}) \to \mathcal{M}(\mathbb{RP}^{d-1})$ by $\mathcal{D}\zeta = \sum_{k=1}^r w_k\, \zeta \circ M_k^{-1}$, and seek its fixed points. Unlike the operator $\mathcal{P}$ in the IFS case, the operator $\mathcal{D}$ is not a contraction with respect to the Hutchinson metric. This is because it is impossible for an invertible matrix $M_k$ to be a local contraction at all $v \in \mathbb{RP}^{d-1}$, based on a simple conservation of mass argument (the image of $\mathbb{RP}^{d-1}$ under some $M_k \in GL(d, \mathbb{R})$ is all of $\mathbb{RP}^{d-1}$, while if $M_k$ were a strict contraction everywhere, the area/volume of $M_k(\mathbb{RP}^{d-1})$ would have to be strictly less than that of $\mathbb{RP}^{d-1}$; a contradiction). Thus we are faced with the possibility of there existing more than one fixed point of $\mathcal{D}$. This does not overly concern us, as in most cases (and in particular, under Hypotheses 3.1), all fixed points of $\mathcal{D}$ produce the same value $\lambda^{(0)}$.

To approximate a fixed point of $\mathcal{D}$, we use exactly the same method as in Sec. 2. Namely, partition $\mathbb{RP}^{d-1}$ into $m$ connected sets $V_1, \ldots, V_m$ of small diameter, choose a single point $v_i \in V_i$, $i = 1, \ldots, m$, and for each matrix $M_k$ produce a matrix

$$D_{m,ij}(k) = \begin{cases} 1, & \text{if } M_k v_i \in V_j\,, \\ 0, & \text{otherwise}\,, \end{cases} \qquad (17)$$

and combine them to form

$$D_m := \sum_{k=1}^r w_k D_m(k)\,. \qquad (18)$$

Let $d_m$ denote a fixed left eigenvector of $D_m$ (which necessarily has all elements non-negative, and has been normalized so that the sum of its elements is unity) and define an approximate fixed point of $\mathcal{D}$ by

$$\xi_m = \sum_{i=1}^m d_{m,i}\, \delta_{v_i}\,. \qquad (19)$$

We now have the following result regarding our estimate of $\lambda^{(0)}$.

Theorem 3.2. Assume that Hypotheses 3.1 hold, and let $\{d_m\}_{m=m_0}^\infty$ be a sequence of eigenvectors as constructed above. Then

$$\lambda_m := \sum_{k=1}^r w_k \sum_{i=1}^m d_{m,i} \log\|M_k v_i\| \to \lambda^{(0)} \quad \text{as } m \to \infty\,. \qquad (20)$$

Remark 3.3.

(i) We could of course try to estimate the value of $\lambda^{(0)}$ directly from a random simulation of (13) for some large finite $N$. However, we present numerical evidence that this method is often inefficient and inaccurate, in particular in the cases where there are matrices that are near to singular, or matrices that are chosen very infrequently. For a theoretical analysis in a simple model, see [Froyland, 1998].
(ii) As in Remark 2.3, the required computing time is $O(n)$.

3.3. Example: The fern

We return to the fern IFS, and proceed to compute its Lyapunov exponents. Since the Jacobian matrices of the four mappings $T_1, \ldots, T_4$ are constant, the orbit of the IFS in $M$ is unimportant; all that matters is the sequence in which the mappings are applied. Thus, the Lyapunov exponents of the fern IFS are the same as the Lyapunov exponents of an iid random product of the matrices

$$M_1 = \begin{pmatrix} 0.81 & 0.07 \\ -0.04 & 0.84 \end{pmatrix}, \quad M_2 = \begin{pmatrix} 0.18 & -0.25 \\ 0.277 & 0.28 \end{pmatrix}, \quad M_3 = \begin{pmatrix} 0.19 & 0.275 \\ 0.238 & -0.14 \end{pmatrix}, \quad M_4 = \begin{pmatrix} 0.0235 & 0.087 \\ 0.045 & 0.1666 \end{pmatrix},$$

chosen with probabilities $w_1 = 0.753$, $w_2 = 0.123$, $w_3 = 0.104$ and $w_4 = 0.02$. Since the matrices $M_1, \ldots, M_4$ satisfy Hypotheses 3.1, there is only one exponent that is almost always observed, and this is the exponent that we wish to estimate. Before we describe our results, we attempt to use the standard method of random iteration to estimate $\lambda^{(0)}$. We use (14) directly, by randomly selecting an initial vector $v_0$, and then producing an iid sequence of matrices, selected from $M_1, \ldots, M_4$. At each iteration, (14) is applied to track the current estimate of $\lambda^{(0)}$; we truncate this process after $N = 10^6$ iterations. The results are shown in Fig. 2 for ten separate random orbits for $9 \times 10^5 \le N \le 10^6$. After $10^6$ iterations there are still some minor fluctuations along individual orbits, but significantly different estimates are obtained from the ten sample orbits. Using the ten final estimates at $N = 10^6$, we calculate the mean and standard deviation as $-0.443672$ and $0.000654$, respectively, leaving uncertainty in the third decimal place, or a standard error of around 0.15%. In the sequel, we will use our method to produce an estimate that appears to be accurate up to five decimal places, with far fewer iterations required.
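For comparison, here is a hedged sketch (our own code, not the authors') of the time average (14) for the fern matrices; each run yields one of the fluctuating sample estimates shown in Fig. 2.

```python
import numpy as np

M = [np.array([[0.81, 0.07], [-0.04, 0.84]]),
     np.array([[0.18, -0.25], [0.277, 0.28]]),
     np.array([[0.19, 0.275], [0.238, -0.14]]),
     np.array([[0.0235, 0.087], [0.045, 0.1666]])]
w = [0.753, 0.123, 0.104, 0.02]

rng = np.random.default_rng()
N = 10**6
v = np.array([1.0, 0.0])                 # unit starting vector
log_sum = 0.0
for k in rng.choice(4, size=N, p=w):     # iid map choices
    v = M[k] @ v
    norm = np.linalg.norm(v)
    log_sum += np.log(norm)              # accumulate log ||M_k v_l|| of (14)
    v /= norm                            # renormalize to avoid underflow
print(log_sum / N)                       # one sample estimate of lambda^(0)
```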

Fig. 2. Top Lyapunov exponent estimates calculated using (14) for ten random orbits of length $10^6$ (shown between $9 \times 10^5$ and $10^6$ iterations).

Fig. 3. Density of the approximate $\mathcal{D}$-invariant measure $\xi_{10\,000}$. This plot is a step function that takes the value $10\,000\, d_{10\,000,i}$ on the set $[-\pi/2 + (i-1)\pi/10\,000,\, -\pi/2 + i\pi/10\,000)$, $i = 1, \ldots, 10\,000$.

We now begin a description of our method. We seek to approximate a probability measure $\xi$ on $\mathbb{RP}^1$ that is invariant under $\mathcal{D}$. For numerical calculations, we treat $\mathbb{RP}^1$ as the subinterval $[-\pi/2, \pi/2)$. We construct equipartitions of $[-\pi/2, \pi/2)$ of cardinality $m = 100$, $200$, $1000$, $2000$, $10\,000$, $20\,000$, $40\,000$; these partitions consist of the sets $[-\pi/2 + (i-1)\pi/m,\, -\pi/2 + i\pi/m)$, $i = 1, \ldots, m$. The points $v_i$ are chosen to be the center points of the partition sets, namely $v_i = -\pi/2 + (2i-1)\pi/2m$, $i = 1, \ldots, m$. For a fixed partition (fixed value of $m$), we construct the matrix $D_m$ from the four matrices $D_m(k)$, $k = 1, \ldots, 4$. Producing each matrix $D_m(k)$ is a simple matter of applying the matrix $M_k$ to each of the points $v_i$ and noting which partition set contains the image. The matrices $D_m(k)$, $k = 1, \ldots, 4$ are then added together to produce $D_m$, and the fixed left eigenvector $d_m$ may be computed by a simple application of the power method (as the eigenvalue 1 is the largest eigenvalue of $D_m$). The density of the approximate $\mathcal{D}$-invariant probability measure $\xi_{10\,000}$ (from (19)) is shown in Fig. 3. By inserting the values of $d_{m,i}$ into (20), we produce an estimate $\lambda_m$ of the top Lyapunov exponent $\lambda^{(0)}$; see Table 1.

Remark 3.4. An alternative definition of the matrix $D_m(k)$ is $D_{m,ij}(k) = \ell(V_i \cap M_k^{-1}(V_j))/\ell(V_i)$, where $\ell$ is the normalized length on $[-\pi/2, \pi/2)$. This construction was proposed by Ulam [1964] for a related purpose. The length measure $\ell$ may be replaced by area and volume when working in $\mathbb{RP}^{d-1}$, $d = 2$ and 3, respectively; however, the calculations become more difficult and Monte Carlo sampling methods or the like may be necessary to approximate the areas and volumes. In $\mathbb{RP}^1$, the construction of $D'_m$ takes the same number of iterations (endpoints of the subintervals $[-\pi/2 + (i-1)\pi/m,\, -\pi/2 + i\pi/m)$ are iterated, rather than the center points), and for small $m$, better results are obtained. We denote the space average estimates produced by this alternative definition as $\lambda'_m$.

Table 1. Top Lyapunov exponent estimates for the fern IFS using a space average.

| Number of partition sets $m$ | Number of iterations $N = 4m$ | Estimate $\lambda_m$ | "Ulam" estimate $\lambda'_m$ |
|---|---|---|---|
| 100 | 400 | −0.443078 | −0.443535 |
| 200 | 800 | −0.443508 | −0.443457 |
| 1000 | 4000 | −0.443556 | −0.443494 |
| 2000 | 8000 | −0.443481 | −0.443505 |
| 10 000 | 40 000 | −0.443506 | −0.443510 |
| 20 000 | 80 000 | −0.443513 | −0.443511 |
| 40 000 | 160 000 | −0.443509 | −0.443509 |

The results of Table 1 for low iteration numbers are plotted in Fig. 4. We compare our estimates with the time average estimate on an iteration for iteration basis. The construction of each $D_m(k)$ requires $m$ iterations of $M_k$, so $D_m$ requires $4m$ iterations. We compare our estimate with the time average estimates after $N = 4m$ iterations; see Fig. 4(a). Figure 4(b) is a zoom of the area of Fig. 4(a) that contains the space average estimates. At the scale in (a), the two estimates $\lambda_m$ and $\lambda'_m$ coincide. Under the reasonable assumption that the estimates $\lambda_{40\,000}$ and $\lambda'_{40\,000}$ are accurate to five decimal places, we see that even with only 800 iterations, the estimates $\lambda_{200}$ and $\lambda'_{200}$ are at least as good as the estimate $-0.443672$, which was obtained as a mean of ten time average estimates from trajectories of length $10^6$. Thus our method can produce more accurate estimates with far fewer iterations.

Remark 3.5. As is standard practice, the remaining exponents may be estimated by considering the action of the matrices on successively larger exterior powers of $\mathbb{RP}^{d-1}$ or $\mathbb{R}^d \setminus \{0\}$. Numerically, this means considering the action of the matrices on parallelepipeds rather than vectors. By considering the $k$th exterior powers (of $k$-dimensional parallelepipeds), the sum of the $k$ largest exponents will be observed almost always, and our method is guaranteed to converge to this sum.

In particular, the sum of all of the exponents is easily calculated as

$$\text{Sum of all Lyapunov exponents} = \sum_{k=1}^r w_k \log|\det M_k|\,. \qquad (21)$$

In our example, this value is $-1.13005$, so based on Table 1, an estimate of the second exponent is $-1.13005 + 0.44351 = -0.68654$. The second (and smallest) exponent $\lambda^{(1)}$ may be calculated directly by substituting the inverses of $M_1, \ldots, M_4$ in Eqs. (17) and (20). That is,

$$\lambda_m^- := -\sum_{k=1}^r \sum_{i=1}^m w_k\, d_{m,i}^- \log\|M_k^{-1} v_i\| \to \lambda^{(1)} \quad \text{as } m \to \infty\,, \qquad (22)$$

where $d_m^-$ is a fixed left eigenvector of the matrix $D_m^-$ produced from (17) using $M_k^{-1}$ in place of $M_k$. For $m = 10\,000$, we indeed obtain the value $\lambda_m^- = -0.68654$, establishing the consistency of our method.
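The construction just described fits in a few dozen lines. The sketch below (our own illustrative code, not the authors') builds $D_m$ for the fern matrices on an equipartition of $[-\pi/2, \pi/2)$, extracts $d_m$ by the power method, and evaluates (20) together with the exponent sum (21):

```python
import numpy as np
from scipy import sparse

M = [np.array([[0.81, 0.07], [-0.04, 0.84]]),
     np.array([[0.18, -0.25], [0.277, 0.28]]),
     np.array([[0.19, 0.275], [0.238, -0.14]]),
     np.array([[0.0235, 0.087], [0.045, 0.1666]])]
w = [0.753, 0.123, 0.104, 0.02]

m = 10_000
theta = -np.pi / 2 + (2 * np.arange(m) + 1) * np.pi / (2 * m)   # centers v_i
V = np.vstack([np.cos(theta), np.sin(theta)])                   # shape (2, m)

D = sparse.csr_matrix((m, m))
lognorms = []                            # log ||M_k v_i||, reused in (20)
for Mk, wk in zip(M, w):
    W = Mk @ V                           # images of all center directions
    lognorms.append(np.log(np.linalg.norm(W, axis=0)))
    phi = np.arctan2(W[1], W[0])
    phi = np.where(phi >= np.pi / 2, phi - np.pi, phi)          # fold into
    phi = np.where(phi < -np.pi / 2, phi + np.pi, phi)          # [-pi/2, pi/2)
    j = np.clip(((phi + np.pi / 2) * m / np.pi).astype(int), 0, m - 1)
    D = D + wk * sparse.csr_matrix((np.ones(m), (np.arange(m), j)),
                                   shape=(m, m))                # Eqs. (17), (18)

d = np.full(m, 1.0 / m)                  # power method for d_m of Eq. (19)
for _ in range(2000):
    d = D.T @ d
    d /= d.sum()

lam = sum(wk * (d @ ln) for wk, ln in zip(w, lognorms))         # Eq. (20)
print(lam)                               # close to -0.44351 (cf. Table 1)

# consistency check, Eq. (21): sum of both exponents
print(sum(wk * np.log(abs(np.linalg.det(Mk))) for wk, Mk in zip(w, M)))
```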

Fig. 4. (a) Plot of Lyapunov exponent estimates from ten random orbits versus number of iterations, $N = 400$, $800$, $4000$, $8000$, $40\,000$ (stars), and Lyapunov exponent estimates $\lambda_m$ from the space average versus $N = 4m$ (circles). (b) Space average estimates $\lambda_m$ (solid line) and $\lambda'_m$ (dotted line) versus number of iterations $N = 4m$ (zoom of (a)).

4. Lyapunov Exponents of Iterated Function Systems

In this final section, we combine the methods of Secs. 2 and 3 to produce an approximation for the Lyapunov exponents of a nonlinear contractive IFS. The same technique may be applied to estimate the Lyapunov exponents of any iid random composition of maps; however, in the noncontractive case, rigorous results are still lacking.

4.1. Background

We wish to estimate the values of $\lambda$ that arise from the limit (2). Denote by $\mu$ the unique $\mathcal{P}$-invariant probability measure for our IFS.

Hypotheses 4.1.

(i) $\int_M \left( \log^+\|DT_k(x)\| + \log^+\|DT_k(x)^{-1}\| \right) d\mu(x) < \infty$ for each $k = 1, \ldots, r$.
(ii) A “nondegeneracy” condition holds (see Theorem 4.10 in [Froyland, 1998] for a precise statement).

Under Hypotheses 4.1, only the top exponent $\lambda^{(0)}$ is observed for all vectors $v$ and almost all random orbits of the IFS. In almost all cases, the nondegeneracy condition (ii) is satisfied (see discussion in [Froyland, 1998]).

4.2. Statement of results

As in Secs. 2 and 3, partition $M$ into $n$ connected sets $A_1, \ldots, A_n$, and similarly, partition $\mathbb{RP}^{d-1}$ into $m$ connected sets $V_1, \ldots, V_m$. Choose a single point $a_i \in A_i$ for $i = 1, \ldots, n$, and a single vector (angle coordinate) $v_g \in V_g$, for $g = 1, \ldots, m$.

As in (11), we construct the matrices $P_n(k)$, $k = 1, \ldots, r$, and find a normalized fixed left eigenvector $p_n$, to approximate the unique invariant probability measure $\mu$.

For each point $a_i \in M$ and map $T_k$, there is a Jacobian matrix $DT_k(a_i)$. We approximate the action of each $DT_k(a_i)$, $k = 1, \ldots, r$, $i = 1, \ldots, n$, using the technique described in Sec. 3. Define $rn$ $m \times m$ matrices

$$D_{m,gh}(k, i) = \begin{cases} 1, & \text{if } DT_k(a_i)\, v_g \in V_h\,, \\ 0, & \text{otherwise}\,, \end{cases} \qquad (23)$$

$k = 1, \ldots, r$, $i = 1, \ldots, n$. Denoting $P^*_{n,ij}(k) = P_{n,ij}(k)\, p_{n,i}/p_{n,j}$, we now put the two approximations together to form the $nm \times nm$ matrix:

$$D_{n,m} = \sum_{k=1}^r w_k \begin{pmatrix} P^*_{n,11}(k) D_m(k,1) & P^*_{n,21}(k) D_m(k,1) & \cdots & P^*_{n,n1}(k) D_m(k,1) \\ P^*_{n,12}(k) D_m(k,2) & P^*_{n,22}(k) D_m(k,2) & \cdots & P^*_{n,n2}(k) D_m(k,2) \\ \vdots & \vdots & \ddots & \vdots \\ P^*_{n,1n}(k) D_m(k,n) & P^*_{n,2n}(k) D_m(k,n) & \cdots & P^*_{n,nn}(k) D_m(k,n) \end{pmatrix} \qquad (24)$$

Denote by $(s_m^{(1)} \,|\, s_m^{(2)} \,|\, \cdots \,|\, s_m^{(n)})$ a fixed left eigenvector of $D_{n,m}$ that has been divided into segments of length $m$. Each segment is normalized so that $\sum_{g=1}^m s^{(i)}_{m,g} = 1$ for each $i = 1, \ldots, n$.

Theorem 4.2. Assume that Hypotheses 4.1 hold. Let $\{p_n\}_{n=n_0}^\infty$ be a sequence of normalized fixed left eigenvectors of $P_n$, and $\{(s_m^{(1)} \,|\, s_m^{(2)} \,|\, \cdots \,|\, s_m^{(n)})\}_{n=n_0, m=m_0}^\infty$ be a sequence of fixed left eigenvectors of $D_{n,m}$, normalized as described above. Then

$$\lambda_{n,m} := \sum_{k=1}^r w_k \sum_{i=1}^n p_{n,i} \sum_{g=1}^m s^{(i)}_{m,g} \log\|DT_k(a_i)\, v_g\| \to \lambda^{(0)} \quad \text{as } n, m \to \infty\,. \qquad (25)$$

If our system is one-dimensional, our job becomes particularly easy, as we need only worry about approximating $\mu$, and do not need to construct the matrix $D_{n,m}$. Additionally, we can obtain error bounds in this situation.

Corollary 4.3 (One-Dimensional Version of Theorem 4.2). Suppose that $M$ is one-dimensional, and that $\int_M \left( \log^+|T_k'(x)| + \log^+|T_k'(x)^{-1}| \right) d\mu(x) < \infty$ for each $k = 1, \ldots, r$. Let $\varepsilon_n = \max_{1 \le i \le n} \operatorname{diam}(A_i)$. In the notation of Theorem 4.2, define

$$\lambda_n := \sum_{k=1}^r \sum_{i=1}^n w_k\, p_{n,i} \log|T_k'(a_i)|\,. \qquad (26)$$

Then

$$|\lambda^{(0)} - \lambda_n| \le \frac{2\varepsilon_n \max_{1 \le k \le r} \mathrm{Lip}(\log|T_k'|)}{1 - s}\,.$$

Remarks 4.4.

(i) The constructions of Theorem 4.2 and Corollary 4.3 may also be used to estimate the Lyapunov exponents of (noncontractive or chaotic) deterministic systems. In this case, there is only one mapping $T_1$, which is applied with probability $w_1 = 1$. In such a situation, one is usually not guaranteed to have a unique invariant measure; however, numerical results are encouraging; see [Froyland et al., 1995] for an application of this technique using only time series data. It is often numerically observed that the Ulam construction of Remark 3.4 applied to the matrices $P_n$ gives better results, especially for relatively small partitions.
(ii) In (i) above, we are really approximating the action of a single mapping $T_1$ by the Markov chain governed by the transition matrix $P_n$. Thus, the same construction may be used to estimate the Lyapunov exponents of a Markovian product of matrices. That is, suppose there are $n$ matrices $M_1, \ldots, M_n$, and that the selection process is no longer iid, but Markovian, according to $\mathrm{Prob}\{M_j \text{ is multiplied next} \mid M_i \text{ was just used}\} = P_{ij}$. A Markovian random matrix product can thus be defined, and one often wishes to calculate the Lyapunov exponents of this product.

In fact, $\lambda^{(0)}$ is given by (25), putting $r = 1$, $w_1 = 1$, and

$$D_{m,gh}(i) = \begin{cases} 1, & \text{if } M_i v_g \in V_h\,, \\ 0, & \text{otherwise}\,. \end{cases} \qquad (27)$$

This is illustrated in the final example.

4.3. Example: A random dimer model

This example demonstrates how to use (25) to compute the Lyapunov exponents of a Markovian random matrix product. The matrix product arises from a model known as the random dimer model [de Oliveira & Petri, 1996], which we now briefly describe. Consider a discrete one-dimensional Schrödinger equation on an infinite lattice $\{\ldots, -n, \ldots, n, \ldots\}$. At the $n$th lattice site, $\psi_n$ denotes the amplitude of the wavefunction, and $\varepsilon_n$ denotes the site energy. In the random dimer model, the site energy takes on only two values $\varepsilon_a$ and $\varepsilon_b$. Furthermore, sites with energy $\varepsilon_b$ can only occur in adjacent pairs; that is, a single site of energy $\varepsilon_b$ may not be flanked on either side by sites of energy $\varepsilon_a$. The arrangement of the site energies on the lattice can be considered as a sample trajectory of a finite state Markov chain (with the three states $\varepsilon_a$, $\varepsilon_b$ as the first member of a pair, and $\varepsilon_b$ as the second member of a pair) with transition matrix

$$P = \begin{pmatrix} p & 1-p & 0 \\ 0 & 0 & 1 \\ p & 1-p & 0 \end{pmatrix}.$$

This means, for example, that to the right of a site of energy $\varepsilon_a$, a site of energy $\varepsilon_b$ occurs with probability $1 - p$, where $0 < p < 1$. The discrete Schrödinger equation with site energies $\varepsilon_n$ may be written as

$$\begin{pmatrix} \psi_{n+1} \\ \psi_n \end{pmatrix} = \begin{pmatrix} E - \varepsilon_n & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} \psi_n \\ \psi_{n-1} \end{pmatrix}, \qquad (28)$$

where $E$ denotes the system energy. This recursive formula is nothing other than a Markovian product of matrices, drawn in this case from a set of two matrices (as $\varepsilon_n$ takes only the two values $\varepsilon_a$ and $\varepsilon_b$). An object of concern is the localization length $L$, given by

$$\frac{1}{L} = -\lim_{|n| \to \infty} \frac{1}{|n|} \langle \log|\psi_n| \rangle\,,$$

which measures how the wavefunction amplitudes vary with $n$. This length is simply the inverse of the top Lyapunov exponent of this random matrix product, which we now set out to estimate.

For the moment we shall fix $p = 1/2$. Setting $\varepsilon_a = -\varepsilon_b = -0.5$ and $E = 0.7$, the three matrices which we will multiply together are

$$M_1 = \begin{pmatrix} 1.2 & -1 \\ 1 & 0 \end{pmatrix}, \qquad M_2 = M_3 = \begin{pmatrix} 0.2 & -1 \\ 1 & 0 \end{pmatrix}.$$

Referring to Remark 4.4(ii), the transition matrix for this Markovian matrix product is given by

$$P = \begin{pmatrix} 0.5 & 0.5 & 0 \\ 0 & 0 & 1 \\ 0.5 & 0.5 & 0 \end{pmatrix}.$$

To begin, one forms the $n = 3$ matrices $D_{m,gh}(i)$, $i = 1, \ldots, 3$ as in (27) (with $D_m(2) = D_m(3)$ since $M_2 = M_3$), and then constructs the $3m \times 3m$ matrix displayed in (24) (recalling that here $r = 1$ and $w_1 = 1$). The fixed left eigenvector of this matrix is found, and formula (25) employed to produce an estimate $\lambda_{3,m}$ of the top Lyapunov exponent of this Markovian random matrix product.
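A hedged end-to-end sketch of this computation follows. It is our own illustrative code under our reading of (24): the block structure is applied fibre-by-fibre with reversed-chain weights $q_{ij} = p_i P_{ij}/p_j$, which for this $P$ (whose stationary vector is uniform) reduce to $P_{ij}$; the site energies $\varepsilon_a = -0.5$, $\varepsilon_b = 0.5$ are inferred from the printed matrices $M_1$, $M_2$.

```python
import numpy as np

E = 0.7
M = [np.array([[E + 0.5, -1.0], [1.0, 0.0]]),    # M_1: site energy -0.5
     np.array([[E - 0.5, -1.0], [1.0, 0.0]]),    # M_2: site energy  0.5
     np.array([[E - 0.5, -1.0], [1.0, 0.0]])]    # M_3 = M_2
P = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.0, 1.0],
              [0.5, 0.5, 0.0]])
pi = np.full(3, 1.0 / 3)                 # stationary vector of P

m = 1600
theta = -np.pi / 2 + (2 * np.arange(m) + 1) * np.pi / (2 * m)
V = np.vstack([np.cos(theta), np.sin(theta)])

img, lognorm = [], []                    # D_m(i) of (27) as index maps
for Mi in M:
    W = Mi @ V
    phi = np.arctan2(W[1], W[0])
    phi = np.where(phi >= np.pi / 2, phi - np.pi, phi)
    phi = np.where(phi < -np.pi / 2, phi + np.pi, phi)
    img.append(np.clip(((phi + np.pi / 2) * m / np.pi).astype(int), 0, m - 1))
    lognorm.append(np.log(np.linalg.norm(W, axis=0)))

q = pi[:, None] * P / pi[None, :]        # reversed-chain weights q_ij

s = np.full((3, m), 1.0 / m)             # fibre measures s^(1), s^(2), s^(3)
for _ in range(500):
    new = np.zeros_like(s)
    for i in range(3):
        pushed = np.zeros(m)
        np.add.at(pushed, img[i], s[i])  # push s^(i) through D_m(i)
        new += q[i][:, None] * pushed[None, :]   # distribute to segments j
    s = new / new.sum(axis=1, keepdims=True)     # renormalize each segment

lam = sum(pi[i] * (s[i] @ lognorm[i]) for i in range(3))   # Eq. (25), r = 1
print(lam)                               # about 5.99e-3, cf. Table 2
```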

Table 2. Lyapunov exponent estimates for the random dimer model using a space average.

| Number of partition sets $m$ | Number of iterations $N = 2m$ | Estimate $\lambda_{3,m}$ (CPU time) | "Ulam" estimate $\lambda'_{3,m}$ (CPU time) |
|---|---|---|---|
| 400 | 800 | 5.93373 × 10⁻³ (1.0 s) | 5.99273 × 10⁻³ (1.7 s) |
| 800 | 1600 | 5.96439 × 10⁻³ (3.0 s) | 5.99160 × 10⁻³ (5.3 s) |
| 1200 | 2400 | 5.98256 × 10⁻³ (6.1 s) | 5.99137 × 10⁻³ (10.9 s) |
| 1600 | 3200 | 5.99157 × 10⁻³ (10.3 s) | 5.99129 × 10⁻³ (18.6 s) |
| 2800 | 5600 | 5.98995 × 10⁻³ (57.5 s) | 5.99123 × 10⁻³ (89.7 s) |
| 4000 | 8000 | 5.99036 × 10⁻³ (116.7 s) | 5.99121 × 10⁻³ (184.7 s) |

This procedure was carried out for $m = 100$, $200$, $400$, $600, \ldots, 2000$, the results of which are displayed in Fig. 7 and in part in Table 2. The method in Remark 3.4 was also tested, and as before was found to give more stable results. Based on these data, it appears that the true value of the top Lyapunov exponent may be expressed to five significant digits as $5.9912 \times 10^{-3}$. Both the one-point and Ulam methods have produced answers very close to this value using only around 3200 iterations, or a CPU time of 10–20 seconds, depending on which of the two methods is used.

In analogy to Fig. 3, we plot the densities on $\mathbb{RP}^1$ defined by the vectors $s_m^{(1)}$, $s_m^{(2)}$, and $s_m^{(3)}$ for $m = 1000$ in Fig. 5. These densities have the interpretation that if a uniform mass on $\mathbb{RP}^1$ is pushed forward along a Markov random trajectory of matrices (with the trajectory beginning in the infinite past), terminating at state $i$, then almost surely, the resulting distribution is approximated by $s_m^{(i)}$. The $i$th inner sum in (25) represents an integration with respect to the $i$th density shown in Fig. 5.

We now compare these results with those obtained from (2) by simulating a long random orbit. In analogy to Fig. 2, Fig. 6 shows the computed estimates of the top Lyapunov exponent along the tail sections of ten orbits of length $10^6$. The mean and standard deviation of the values obtained at $N = 10^6$ are $5.95488 \times 10^{-3}$ and $5.76197 \times 10^{-5}$, respectively, leaving an uncertainty in the second significant digit of the mean, or a standard error of around 0.97%. This mean and standard deviation are shown as a vertical bar on the right-hand side of Fig. 7. From the strong indications that the space average has converged to a value very close to the true value of $\lambda^{(0)}$, it appears that from ten random orbits of length $10^6$ (taking a total CPU time of over 1900 seconds), the random iteration (or time average) method still produces comparatively inaccurate results.

Finally, we consider the variation of $\lambda^{(0)}$ as the transition probability $p$ varies between 0 and 1.

Fig. 5. Density of the vectors $s_{1000}^{(1)}$, $s_{1000}^{(2)}$, and $s_{1000}^{(3)}$ (from top to bottom, respectively). The interval $[-\pi/2, \pi/2)$ has been normalized to have length 1.

Fig. 6. Top Lyapunov exponent estimates for the random dimer model versus the number of iterations, calculated using (2) for ten simulated random orbits of length $10^6$ (shown between $9 \times 10^5$ and $10^6$ iterations).

Fig. 7. Plot of space average estimates for the top Lyapunov exponent versus the number of iterations used. The solid line uses the one point method (27) of constructing the matrices $D_m(i)$ and the dotted line uses the alternative "Ulam" method outlined in Remark 3.4. The axes have been scaled to allow comparison with Fig. 6; the cross and vertical line to the right of the plot show the mean and one standard deviation (to each side of the mean) of the time average simulation in Fig. 6.

Fig. 8. Graph of $\lambda'_{3,m}(p)$ versus $p$, for $p$ between 0 and 1 at 0.01 intervals, using $m = 1600$. Note that at $p = 0$ and $p = 1$, $\lambda^{(0)} = 0$, as $p = 0$ corresponds to only the matrices $M_2 = M_3$ occurring and $p = 1$ corresponds to only the matrix $M_1$ occurring (and each of the matrices has both eigenvalues on the unit circle). Note also that at $p = 0.5$ (indicated by the asterisk) we recover our earlier value for $\lambda^{(0)}$.

By only altering the transition matrix $P$, we do not need to recalculate the matrices $D_m(i)$, the task which requires most of the computing effort. Thus, for each new $P$, we need only form the large matrix $D_{3,m}$ in (24) (reusing the same $D_m(i)$), compute the fixed left eigenvector, and substitute the result into (25). The results are shown in Fig. 8. If one were to use the random iteration method, an entirely new simulation would be required for each value of $p$, making the task much more expensive.

Acknowledgments

The research by Gary Froyland was supported by the Japan Society for the Promotion of Science postdoctoral fellowship (research grant from the Ministry of Education, Science, and Culture No. 09-97004).

References

Barnsley, M. F. & Elton, J. H. [1988] "A new class of Markov processes for image encoding," Adv. Appl. Probab. 20(1), 14–32.

Benettin, G., Galgani, L., Giorgilli, A. & Strelcyn, J.-M. [1980] "Lyapunov characteristic exponents for smooth dynamical systems and for Hamiltonian systems; a method for computing all of them. Part 1: Theory," Meccanica 15, 9–20.

Boyarsky, A. & Lou, Y. S. [1992] "A matrix method for approximating fractal measures," Int. J. Bifurcation and Chaos 2(1), 167–175.

Crisanti, A., Vulpiani, A. & Paladin, G. [1993] Products of Random Matrices in Statistical Physics, Springer Series in Solid-State Sciences, Vol. 104 (Springer-Verlag, Berlin).

Dellnitz, M. & Junge, O. [1998] "An adaptive subdivision technique for the approximation of attractors and invariant measures," Comput. Vis. Sci. 1, 63–68.

de Oliveira, M. J. & Petri, A. [1996] "Generalized Lyapunov exponents for products of correlated random matrices," Phys. Rev. E 53(3), 2960–2963.

Elton, J. [1987] "An ergodic theorem for iterated maps," Erg. Th. Dyn. Syst. 7, 481–488.

Falconer, K. [1997] Techniques in Fractal Geometry (Wiley, Chichester).

Froyland, G. [1998] "On estimating invariant measures and Lyapunov exponents for iid compositions of maps," Unpublished notes. Available at http://www.upb.de/math/∼froyland.

Froyland, G., Judd, K. & Mees, A. I. [1995] "Estimation of Lyapunov exponents of dynamical systems using a spatial average," Phys. Rev. E 51(4), 2844–2855.

Furstenberg, H. & Kifer, Y. [1983] "Random matrix products and measures on projective spaces," Israel J. Math. 46(1&2), 12–32.

Gutiérrez, J. M., Iglesias, A. & Rodríguez, M. A. [1996] "A multifractal analysis of IFSP invariant measures with application to fractal image generation," Fractals 4(1), 17–27.

Hutchinson, J. E. [1981] "Fractals and self similarity," Indiana Univ. Math. J. 30, 713–747.

Lasota, A. & Mackey, M. C. [1994] Chaos, Fractals, and Noise. Stochastic Aspects of Dynamics, Applied Mathematical Sciences, Vol. 97, 2nd edition (Springer-Verlag, NY).

Peruggia, M. [1993] Discrete Iterated Function Systems (A K Peters, Wellesley).

Stark, J. [1991] "Iterated function systems as neural networks," Neural Networks 4, 679–690.

Ulam, S. M. [1964] Problems in Modern Mathematics (Interscience).

Appendix A: Proofs

We give in detail the proof of Theorem 2.2. The reader is guided through the proof of Theorem 3.2 and a sketch of the proof of Theorem 4.2 is given.

A.1. Invariant measures: Proof of Theorem 2.2

There are three main steps in the proof. Firstly we show that $\mathcal{P}$ is a contraction on the metric space $(\mathcal{M}(M), d_H)$ (Lemma A.1). We then use the matrix $P_n$ to define an operator $\mathcal{P}_n$ on $\mathcal{M}(M)$, and show that $\mathcal{P}\nu$ and $\mathcal{P}_n\nu$ are always close in the $d_H$ metric (Lemma A.5). Finally, we combine these two properties in Lemma A.6 to provide a bound for $d_H(\mu, \mu_n)$, where $\mu_n$ is our approximate $\mathcal{P}$-invariant measure satisfying $\mu_n = \mathcal{P}_n \mu_n$.

Lemma A.1 (Contractivity). Denote by $s_k$ the contraction constant of $T_k$, $k = 1, \ldots, r$, and put $s = \sum_{k=1}^r w_k s_k$. Then

$$d_H(\mathcal{P}\nu_1, \mathcal{P}\nu_2) \le s \cdot d_H(\nu_1, \nu_2)\,. \qquad (A.1)$$

Proof.

$$
\begin{aligned}
d_H(\mathcal{P}\nu_1, \mathcal{P}\nu_2) &= \sup_{h \in \mathrm{Lip}(1)} \left| \int_M \sum_{k=1}^r w_k h(T_k x)\, d\nu_1(x) - \int_M \sum_{k=1}^r w_k h(T_k x)\, d\nu_2(x) \right| \\
&\le \sum_{k=1}^r w_k \sup_{h \in \mathrm{Lip}(1)} \left| \int_M h \circ T_k\, d\nu_1 - \int_M h \circ T_k\, d\nu_2 \right| \\
&= \sum_{k=1}^r w_k \sup_{h \in \mathrm{Lip}(s_k)} \left| \int_M h\, d\nu_1 - \int_M h\, d\nu_2 \right| \\
&= \sum_{k=1}^r w_k \sup_{h \in \mathrm{Lip}(1)} \left| \int_M s_k h\, d\nu_1 - \int_M s_k h\, d\nu_2 \right| \\
&= \left( \sum_{k=1}^r w_k s_k \right) d_H(\nu_1, \nu_2)\,. \qquad \square
\end{aligned}
$$

Remark A.2. The result of Lemma A.1 was proven in [Hutchinson, 1981, Theorem 4.4.1], with $s = \max_{1 \le k \le r} s_k$.

We now use the matrices $P_n(k)$ to define a family of Markov operators that will approximate $\mathcal{P}$ in the appropriate sense.

Definition A.3. Let $\nu \in \mathcal{M}(M)$ and define

$$\mathcal{P}_n \nu = \sum_{j=1}^n \left( \sum_{k=1}^r w_k \sum_{i=1}^n \nu(A_i)\, P_{n,ij}(k) \right) \delta_{a_j}\,. \qquad (A.2)$$

In words, the action of this operator is: measure the set $A_i$, find which partition set $A_j$ the point $a_i$ is mapped into by $T_k$, and place a contribution of weight $w_k \nu(A_i)$ concentrated on the point $a_j$ (repeat for each $k = 1, \ldots, r$).

The utility of such a family of Markov operators is that (i) they have easily computable fixed points $\mu_n$, and (ii) these $\mathcal{P}_n$-invariant measures $\mu_n$ are close to the true invariant measure $\mu$.

Lemma A.4 (Invariance of $\mu_n$). The probability measure $\mu_n$ as defined in (12) is a fixed point of the Markov operator $\mathcal{P}_n$.

Proof. Straightforward. $\square$

Lemma A.5 (Uniform Approximation). Define $\varepsilon_n = \max_{1 \le i \le n} \operatorname{diam}(A_i)$, and let $s$ be defined as in Lemma A.1. Then

$$\sup_{\nu \in \mathcal{M}(M)} d_H(\mathcal{P}\nu, \mathcal{P}_n\nu) \le (1 + s)\, \varepsilon_n\,. \qquad (A.3)$$

Proof. Choose any $\nu \in \mathcal{M}(M)$, and define $a_x = a_i$, where $x \in A_i$; write $a_{j(i)}$ for the chosen point of the partition set containing $T_k a_i$.

$$
\begin{aligned}
d_H(\mathcal{P}\nu, \mathcal{P}_n\nu) &= \sup_{h \in \mathrm{Lip}(1)} \left| \int_M h\, d(\mathcal{P}\nu) - \int_M h\, d(\mathcal{P}_n\nu) \right| \\
&= \sup_{h \in \mathrm{Lip}(1)} \left| \int_M \sum_{k=1}^r w_k h(T_k x)\, d\nu(x) - \sum_{k=1}^r w_k \sum_{j=1}^n \left( \sum_{i=1}^n \nu(A_i) P_{n,ij}(k) \right) h(a_j) \right| \\
&\le \sum_{k=1}^r w_k \sup_{h \in \mathrm{Lip}(1)} \left( \left| \int_M h(T_k x)\, d\nu(x) - \sum_{i=1}^n h(T_k a_i)\, \nu(A_i) \right| + \left| \sum_{i=1}^n \left( h(T_k a_i) - \sum_{j=1}^n P_{n,ij}(k)\, h(a_j) \right) \nu(A_i) \right| \right) \\
&\le \sum_{k=1}^r w_k \sup_{h \in \mathrm{Lip}(1)} \left( \left| \sum_{i=1}^n \int_{A_i} (h(T_k x) - h(T_k a_i))\, d\nu(x) \right| + \left| \sum_{i=1}^n (h(T_k a_i) - h(a_{j(i)}))\, \nu(A_i) \right| \right) \\
&\le \sum_{k=1}^r w_k \sup_{h \in \mathrm{Lip}(1)} \left( \sup_{x \in M} d(h(T_k x), h(T_k a_x)) + \max_{1 \le i \le n} d(h(T_k a_i), h(a_{j(i)})) \right) \\
&\le \sum_{k=1}^r w_k \left( \sup_{x \in M} d(T_k x, T_k a_x) + \max_{1 \le i \le n} d(T_k a_i, a_{j(i)}) \right) \\
&\le \sum_{k=1}^r w_k \left( s_k \sup_{x \in M} d(x, a_x) + \varepsilon_n \right) \\
&\le \sum_{k=1}^r w_k (s_k + 1)\, \varepsilon_n\,. \qquad \square
\end{aligned}
$$

Lemma A.6 (Error bound).

$$d_H(\mu, \mu_n) \le \frac{(1+s)\, \varepsilon_n}{1-s}\,. \qquad (A.4)$$

Proof.

$$
\begin{aligned}
d_H(\mu, \mu_n) &= d_H(\mathcal{P}\mu, \mathcal{P}_n\mu_n) \\
&\le d_H(\mathcal{P}\mu, \mathcal{P}\mu_n) + d_H(\mathcal{P}\mu_n, \mathcal{P}_n\mu_n) \\
&\le s \cdot d_H(\mu, \mu_n) + \sup_{\nu \in \mathcal{M}(M)} d_H(\mathcal{P}\nu, \mathcal{P}_n\nu)\,.
\end{aligned}
$$

A rearrangement provides the result. $\square$

A.2. Lyapunov exponents of iid random matrix products: Proof of Theorem 3.2

We now replace $M$ with $\mathbb{RP}^{d-1}$ and $\mathcal{P} : \mathcal{M}(M) \to \mathcal{M}(M)$ with $\mathcal{D} : \mathcal{M}(\mathbb{RP}^{d-1}) \to \mathcal{M}(\mathbb{RP}^{d-1})$. The main difference is that $\mathcal{D}$ is not a contraction on $\mathcal{M}(\mathbb{RP}^{d-1})$, so we have no complete analogue of Lemma A.1. Instead, we replace contractivity of $\mathcal{P}$ with continuity of $\mathcal{D}$. The proofs of the next three lemmas follow exactly as those for Lemmas A.1, A.4 and A.5, respectively.

Lemma A.7 (Continuity). The operator $\mathcal{D}$ is continuous; in particular, one has

$$d_H(\mathcal{D}\zeta_1, \mathcal{D}\zeta_2) \le \left( \sum_{k=1}^r w_k \ell_k \right) d_H(\zeta_1, \zeta_2)\,,$$

where $\ell_k$ denotes the Lipschitz constant of $v \mapsto M_k v$ as a map on $\mathbb{RP}^{d-1}$.

In analogy to Definition A.3,

Definition A.8. Define an approximate operator $\mathcal{D}_m : \mathcal{M}(\mathbb{RP}^{d-1}) \to \mathcal{M}(\mathbb{RP}^{d-1})$ by

$$\mathcal{D}_m \zeta = \sum_{j=1}^m \left( \sum_{k=1}^r w_k \sum_{i=1}^m \zeta(V_i)\, D_{m,ij}(k) \right) \delta_{v_j}\,. \qquad (A.5)$$

As before, we have

Lemma A.9 (Invariance of $\xi_m$). The probability measure

$$\xi_m := \sum_{j=1}^m d_{m,j}\, \delta_{v_j}$$

is a fixed point of the Markov operator $\mathcal{D}_m$.

Lemma A.10 (Uniform Approximation). Let $\varepsilon_m = \max_{1 \le j \le m} \operatorname{diam}(V_j)$. Then

$$\sup_{\zeta \in \mathcal{M}(\mathbb{RP}^{d-1})} d_H(\mathcal{D}_m \zeta, \mathcal{D}\zeta) \le \left( 1 + \sum_{k=1}^r w_k \ell_k \right) \varepsilon_m\,.$$

We now use the following general result.

Lemma A.11 (Invariance of limit). Let $(X, \varrho)$ be a compact metric space. Let $T : X \to X$ be continuous, and $T_n : X \to X$, $n = 1, 2, \ldots$ be a family of maps such that

$$\varrho(T_n x, T x) \to 0 \quad \text{uniformly as } n \to \infty\,. \qquad (A.6)$$

Suppose that each $T_n$ has at least one fixed point $x_n$, and denote by $\tilde x$ a limit of the sequence $\{x_n\}_{n=1}^\infty$. Then $\tilde x$ is a fixed point of $T$.

Proof. Note that

$$\varrho(T\tilde x, \tilde x) \le \varrho(T\tilde x, T x_n) + \varrho(T x_n, T_n x_n) + \varrho(x_n, \tilde x)\,.$$

We may make the RHS as small as we like by choosing a suitably high value for $n$, since the first term goes to zero by continuity of $T$, the second term goes to zero uniformly by (A.6), and the third term goes to zero by hypothesis. $\square$

To apply this Lemma we set $X = \mathcal{M}(\mathbb{RP}^{d-1})$ and $\varrho = d_H$. Recall that the metric $d_H$ generates the weak topology on $\mathcal{M}(\mathbb{RP}^{d-1})$, and that with respect to this topology, $\mathcal{M}(\mathbb{RP}^{d-1})$ is compact. We let the map $T$ in Lemma A.11 be the operator $\mathcal{D} : \mathcal{M}(\mathbb{RP}^{d-1}) \to \mathcal{M}(\mathbb{RP}^{d-1})$, and the maps $T_n$ the approximate operators $\mathcal{D}_m$ defined in (A.5). By Lemma A.7, we have continuity of $\mathcal{D}$, and by Lemma A.10 we see that $d_H(\mathcal{D}\zeta, \mathcal{D}_m\zeta) \to 0$ uniformly as $m \to \infty$. The measure $\xi_m$ is fixed under $\mathcal{D}_m$ by Lemma A.9, and applying Lemma A.11 we have that if $\tilde\xi$ is a limit of the sequence $\{\xi_m\}$, then $\tilde\xi$ is $\mathcal{D}$-invariant. We now simply insert $\xi_m$ into Eq. (15) to obtain the result of Theorem 3.2. Hypothesis 3.1(ii) is needed so that any $\mathcal{D}$-invariant measure $\tilde\xi$ will produce the top Lyapunov exponent when inserted into Eq. (15); see [Furstenberg & Kifer, 1983].

A.3. Lyapunov exponents of iterated function systems: Proof of Theorem 4.2 and Corollary 4.3

In the case of nonlinear IFSs, we must consider probability measures on $M \times \mathbb{RP}^{d-1}$. The analogue of the operator $\mathcal{D}$ is an operator $\mathcal{D}' : \mathcal{M}(M \times \mathbb{RP}^{d-1}) \to \mathcal{M}(M \times \mathbb{RP}^{d-1})$ given by

$$\mathcal{D}'\zeta = \sum_{k=1}^r w_k\, \zeta \circ \mathcal{T}_k^{-1}\,, \qquad (A.7)$$

where $\mathcal{T}_k(x, v) = (T_k x, D_x T_k(v))$ for $(x, v) \in M \times \mathbb{RP}^{d-1}$. We wish to approximate fixed points $\xi'$ of $\mathcal{D}'$ and do so by finding fixed points of an operator $\mathcal{D}'_{n,m}$ constructed using the matrix $D_{n,m}$ in (24). The construction of $\mathcal{D}'_{n,m}$ and the arguments showing that fixed points $\xi'_{n,m}$ converge to a fixed point of $\mathcal{D}'$ follow exactly as in Sec. A.2.

The nondegeneracy condition of Hypothesis 4.1(ii) is required so that any $\mathcal{D}'$-invariant measure will provide us with the top Lyapunov exponent when inserted into an appropriate version of the space average formula. This is just a sketch of the proof of Theorem 4.2; further technical details are given in [Froyland, 1998].

Proof of Corollary 4.3. Note that $\lambda^{(0)} = \sum_{k=1}^r w_k \int_M \log|T_k'|\, d\mu$, and that $\lambda_n = \sum_{k=1}^r w_k \int_M \log|\tilde T_k'|\, d\mu_n$, where $\tilde T_k'(x) = T_k'(a_i)$ for $x \in A_i$. Now,

$$
\begin{aligned}
\left| \int_M \log|T_k'|\, d\mu - \int_M \log|\tilde T_k'|\, d\mu_n \right| &\le \left| \int_M \log|T_k'|\, d\mu - \int_M \log|T_k'|\, d\mu_n \right| + \int_M \left| \log|T_k'| - \log|\tilde T_k'| \right| d\mu_n \\
&\le d_H(\mu, \mu_n)\, \mathrm{Lip}(\log|T_k'|) + \mathrm{Lip}(\log|T_k'|) \max_{1 \le i \le n} \operatorname{diam}(A_i) \\
&\le 2\varepsilon_n\, \mathrm{Lip}(\log|T_k'|)/(1 - s)\,,
\end{aligned}
$$

using the bound of Theorem 2.2(i), from which the result follows. $\square$