THEORY AND APPLICATIONS OF FRACTAL TOPS

MICHAEL BARNSLEY

Abstract. We consider an iterated function system (IFS) of one-to-one contractive maps on a compact metric space. We define the top of an IFS; define an associated symbolic dynamical system; present and explain a fast algorithm for computing the top; describe an example in one dimension with a rich history going back to work of A. Rényi [Representations for Real Numbers and Their Ergodic Properties, Acta Math. Acad. Sci. Hung., 8 (1957), pp. 477-493]; and we show how tops may be used to help to model and render synthetic pictures in applications in computer graphics.

Date: March 9, 2005.

1. Introduction

It is well-known that an iterated function system (IFS) of one-to-one contractive maps, mapping a compact metric space into itself, possesses an attractor and various invariant measures. But it also possesses another type of invariant object, which we call a top. One application of tops is to modelling and rendering new families of synthetic pictures in computer graphics. Another application is to information theory and data compression. Tops are mathematically fascinating because they have a rich symbolic dynamics structure, they support intricate Markov chains, and they provide examples of IFS with place-dependent probabilities in a regime where not much research has taken place. In these notes we (i) define the top of an IFS; (ii) define an associated symbolic dynamical system; (iii) present and explain a new fast algorithm for computing the top, based on the dynamical structure; (iv) describe an example in one dimension with a rich history; (v) explain and demonstrate that tops can be used to define and render beautiful pictures.

This work was supported by the Australian Research Council.

2. The Top of an IFS

Let an iterated function system (IFS) be denoted

(2.1) $\mathcal{W} := \{X; w_0, \dots, w_{N-1}\}.$

This consists of a finite sequence of one-to-one contraction mappings

(2.2) $w_n : X \to X$, $n = 0, 1, \dots, N-1$,

acting on the compact metric space

(2.3) $(X, d)$

with metric $d$, so that for some

(2.4) $0 \le l < 1$

we have

(2.5) $d(w_n(x), w_n(y)) \le l \cdot d(x, y)$

for all $x, y \in X$. Let $A$ denote the attractor of the IFS; that is, $A \subset X$ is the unique non-empty compact set such that

$$A = \bigcup_n w_n(A).$$

Let the associated code space be denoted by $\Omega = \Omega_{\{0,1,\dots,N-1\}}$. This is the space of infinite sequences of symbols $\{\sigma_i\}_{i=1}^{\infty}$ belonging to the alphabet $\{0, 1, \dots, N-1\}$, with the discrete product topology. We will also write $\sigma = \sigma_1\sigma_2\sigma_3\ldots \in \Omega$ to denote a typical element of $\Omega$, and we will write $\omega_k$ to denote the $k$th element of the sequence $\omega \in \Omega$. We order the elements of $\Omega$ according to

$$\sigma < \omega \text{ iff } \sigma_k < \omega_k,$$

where $k$ is the least index for which $\sigma_k \ne \omega_k$. Let

$$\phi : \Omega \to A$$

denote the associated continuous addressing map from code space onto the attractor of the IFS. We note that the set of addresses of a point $x \in A$, defined to be $\phi^{-1}(x)$, is compact and so possesses a unique largest element. We denote this value by $\tau(x)$. That is, $\tau : A \to \Omega$ is defined by

$$\tau(x) = \max\{\sigma \in \Omega : \phi(\sigma) = x\}.$$

We call $\tau$ the tops function of the IFS. We also call $G_\tau := \{(x, \tau(x)) : x \in A\}$ the graph of the top of the IFS, or simply the top of the IFS.

The top of an IFS may be described as follows: consider the lifted IFS

$$\{X \times \Omega : W_0, W_1, \dots, W_{N-1}\}$$

where $W_n(x, \sigma) = (w_n(x), n\sigma)$ and where, for the avoidance of any doubt, $n\sigma := \omega$ with $\omega_1 = n$ and $\omega_{k+1} = \sigma_k$ for $k \ge 1$, for each symbol $n = 0, 1, \dots, N-1$. Let the unique attractor of this IFS be denoted by $\widehat{A}$. Then the projection of $\widehat{A}$ in the $X$-direction is $A$, and in the $\Omega$-direction it is $\Omega$. The top of the original IFS is related to $\widehat{A}$ according to:

$$G_\tau = \{(x, \sigma) \in \widehat{A} : (x, \omega) \in \widehat{A} \Rightarrow \omega \le \sigma\}.$$

This latter formulation is useful because we can use the chaos game algorithm (also called a Markov Chain Monte Carlo (MCMC) algorithm, or a random iteration algorithm) to compute approximations to $\widehat{A}$, and hence to $G_\tau$. According to this method we select a sequence of symbols

$$\sigma_1\sigma_2\sigma_3\ldots \in \{0, 1, \dots, N-1\}^{\infty}$$

with probability $p_n > 0$ for the choice $\sigma_k = n$, independent of all of the other choices. We also select $X_0 \in X$ and let

$$X_{n+1} = W_{\sigma_{n+1}}(X_n)$$

for $n = 0, 1, 2, \ldots$. Then, almost always,

$$\lim_{k \to \infty} \overline{\{X_n : n = k, k+1, \ldots\}} = \widehat{A},$$

where $\overline{S}$ denotes the closure of the set $S$. This algorithm provides, in many cases, a simple, efficient, fast method to compute approximations to the attractor of an IFS, for example when $X = \square$, a compact subset of $\mathbb{R}^2$. By keeping track of points which, for each approximate value of $x \in X$, have the greatest code space value, we can compute approximations to $G_\tau$. We illustrate this approach in the following example, which we continue in Section 6.
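As a concrete illustration, the following is a minimal Python sketch of this tops-tracking chaos game for the one-dimensional IFS of Example 1 below, with the code-space coordinate embedded in $[0, 1]$ by binary expansion, as is done for Figure 2. The parameter values, resolution and iteration budget are our own illustrative choices, not values from the text.

import random

# Minimal sketch of the tops-tracking chaos game for the IFS of Example 1
# (w0(x) = a*x, w1(x) = a*x + 1 - a).  The code-space coordinate is embedded
# in [0, 1] by binary expansion, as in Figure 2: prepending the symbol n to
# sigma corresponds to y -> (y + n)/2.

ALPHA = 2.0 / 3.0
N_PIXELS = 512
N_STEPS = 200_000

def W(n, x, y):
    """The lifted map W_n(x, sigma) = (w_n(x), n sigma)."""
    return ALPHA * x + n * (1.0 - ALPHA), (y + n) / 2.0

top = [None] * N_PIXELS      # per pixel: the largest address seen so far
x, y = random.random(), random.random()
for _ in range(N_STEPS):
    x, y = W(random.randint(0, 1), x, y)     # p0 = p1 = 1/2
    k = min(int(x * N_PIXELS), N_PIXELS - 1)
    if top[k] is None or y > top[k]:
        top[k] = y           # keep the highest code-space value: the top

# The pairs (k / N_PIXELS, top[k]) approximate the graph of the top.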

Example 1. Consider the IFS

(2.6) $\{[0, 1] \subset \mathbb{R};\ w_0(x) = \alpha x,\ w_1(x) = \alpha x + (1 - \alpha)\}.$

We are interested in the case where $\frac{1}{2} < \alpha < 1$, which we refer to as "overlapping" because $w_0([0, 1]) \cap w_1([0, 1])$ contains a non-empty open set. The two maps of the IFS are illustrated in Figure 1.

Figure 1. Graphs of the two transformations of the overlapping IFS in Example 1. See also Figure 2. **tifs1.gif

In Figure 2 we show the attractor $\widehat{A}$ of the associated lifted IFS, and upon this attractor we have indicated the top with some red squiggles. Figure 2 was computed using random iteration: we have represented points in code space by their binary expansions, which are interpreted as points in $[0, 1]$.

Figure 2. The attractor of the IFS in Equation 2.7. This represents the attractor $\widehat{A}$ of the lifted IFS corresponding to Equation 2.6. The top of the IFS is indicated in red. The visible part of the "x-axis" represents the real interval $[0, 1]$, and the visible part of the "y-axis" represents the code space $\Omega$ between the points $\overline{0} = 000000000\ldots$ and $\overline{1} = 111111111\ldots$ **graph2.gif

Since the invariant measures of both the IFS and the lifted IFS contain no atoms, the information lost by this representation is irrelevant to pictures. Accordingly, the actual IFS used to compute Figure 2 is

(2.7) $\{[0, 1] \times [0, 1] \subset \mathbb{R}^2;\ W_0(x, y) = (\alpha x, \tfrac{1}{2}y),\ W_1(x, y) = (\alpha x + (1 - \alpha), \tfrac{1}{2}y + \tfrac{1}{2})\}$

with $\alpha = \frac{2}{3}$.

3. Application of Tops to Computer Graphics

Here we introduce the application of tops to computer graphics. There is a great deal more to say about this but, to serve as motivation as well as to provide an excellent method for graphing fractal tops, we explain the basic idea here.

A picture function is a mapping $P : D_P \subset \mathbb{R}^2 \to C$, where $C$ is a colour space, for example $C = [0, 255]^3 \subset \mathbb{R}^3$. The domain $D_P$ is typically a rectangular subset of $\mathbb{R}^2$: we often take

$$D_P = \square := \{(x, y) \in \mathbb{R}^2 : 0 \le x, y \le 1\}.$$

The domain of a picture function is an important part of its definition; for example, a segment of a picture may be used to define a picture function. A picture in the usual sense may then be thought of as the graph of a picture function. But we will use the concepts of picture, picture function, and graph of a picture function interchangeably. We do not discuss here the important questions of how such functions arise in image science, for example, nor the relationship between such abstract objects and real world pictures. Here we assume that, given a picture function, we have some process by which we can render it to make pictures which may be printed, viewed on computer screens, etc. This is far from a simple matter in general.

Let two IFS's

$$\mathcal{W} := \{\square; w_0, \dots, w_{N-1}\} \quad \text{and} \quad \widetilde{\mathcal{W}} := \{\square; \widetilde{w}_0, \dots, \widetilde{w}_{N-1}\}$$

and a picture function

$$\widetilde{P} : \square \to C$$

be given. Let $A$ denote the attractor of the IFS $\mathcal{W}$, and let $\widetilde{A}$ denote the attractor of the IFS $\widetilde{\mathcal{W}}$. Let

$$\tau : A \to \Omega$$

denote the tops function for $\mathcal{W}$. Let

$$\widetilde{\phi} : \Omega \to \widetilde{A} \subset \square$$

denote the addressing function for the IFS $\widetilde{\mathcal{W}}$. Then we define a new picture function

$$P : A \to C$$

by

$$P = \widetilde{P} \circ \widetilde{\phi} \circ \tau.$$

This is the unique picture function defined by the IFS's $\mathcal{W}$, $\widetilde{\mathcal{W}}$, and the picture $\widetilde{P}$. We say that it has been produced by tops + colour stealing. We think in this way: colours are "stolen" from the picture $\widetilde{P}$ to "paint" code space; that is, we make a code space picture, namely the function $\widetilde{P} \circ \widetilde{\phi} : \Omega \to C$, which we then use together with the top of $\mathcal{W}$ to paint the attractor $A$.

Notice the following points. (i) Picture functions have properties that are determined by their source; digital pictures of natural scenes such as clouds and sky, fields of flowers and grasses, seascapes, thick foliage, etc., all have their own distinctive palettes, relationships between colour and position, "continuity" and "discontinuity" properties, and so on. (ii) Addressing functions are continuous. (iii) Tops functions have their own special properties; for example, they are continuous when the associated IFS is totally disconnected, they contain the geometry of the underlying IFS attractor $A$ plus much more, and so may have certain self-similarities and, assuming the IFS are built from low information content transformations such as similitudes, possess their own harmonies. Thus, the picture functions produced by tops plus colour stealing may define pictures which are interesting to look at, carrying a natural palette, possessing certain continuities and discontinuities, and also certain self-similarities. There is much more one could say here.

The stolen picture $\widetilde{P} \circ \widetilde{\phi} \circ \tau$ may be computed by random iteration, by coupling the lifted IFS associated with $\mathcal{W}$ to the IFS $\widetilde{\mathcal{W}}$. This is the method used until recently, and has been described in [3] and in [4]. Recently we have discovered a much faster algorithm for computing the entire stolen picture at a given resolution. It is based on a symbolic dynamical system associated with the top of $\mathcal{W}$. We describe this new method in Section 5.
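The following Python sketch indicates how the coupled random iteration just described can be organized. It is a schematic of the method of [3] and [4], not the new algorithm of Section 5, and all names, sizes and parameters here are our own illustrative assumptions.

import random

# Tops + colour stealing by coupled random iteration: the chaos game is run
# simultaneously on the IFS W (with its code-space address tracked) and on
# the IFS W-tilde; the colour of the W-tilde point is "stolen" whenever the
# current address beats the best address recorded so far at the current pixel.

def clamp(t, n):
    return min(max(int(t * n), 0), n - 1)

def steal(maps, maps_tilde, picture, n_pixels=256, n_steps=500_000):
    """maps, maps_tilde: equal-length lists of functions R^2 -> R^2.
    picture: 2-D array of colours sampled over the unit square (P-tilde)."""
    N = len(maps)
    best = {}                        # pixel -> (address, colour)
    x = y = u = v = 0.5              # (x, y) runs in W, (u, v) in W-tilde
    addr = 0.0                       # code-space address, base-N embedded
    h, w = len(picture), len(picture[0])
    for _ in range(n_steps):
        n = random.randrange(N)
        x, y = maps[n](x, y)
        u, v = maps_tilde[n](u, v)
        addr = (addr + n) / N        # prepend the symbol n to the address
        px = (clamp(x, n_pixels), clamp(y, n_pixels))
        if px not in best or addr > best[px][0]:
            # colour stolen from P-tilde at the coupled point
            best[px] = (addr, picture[clamp(v, h)][clamp(u, w)])
    return best

For the IFS of Equation 3.1, `maps` would hold the four projective transformations $w_n$, `maps_tilde` four contractive maps on the unit square, and `picture` an image such as that of Figure 3.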

In Figure 4 we illustrate the attractor, an invariant measure, and a picture defined by tops + colour stealing, all for the IFS of projective transformations

(3.1) $\mathcal{W} = \{\square;\ w_n(x, y) = \left(\dfrac{a_n x + b_n y + c_n}{g_n x + h_n y + j_n},\ \dfrac{d_n x + e_n y + f_n}{g_n x + h_n y + j_n}\right),\ n = 0, 1, 2, 3\}$

where the coefficients are given by

n     a_n      b_n       c_n     d_n      e_n      f_n      g_n      h_n      j_n
0     1.901    0.072     0.186   0.015    1.69     0.028    0.563    0.201    2.005
1     0.002   -0.044     0.075   0.003    0.044    0.104    0.002   -0.088    0.154
2     0.965   -0.352     0.058   1.314   -0.065    0.191    1.348   -0.307    0.075
3    -0.325   -0.0581    0.029  -1.229   -0.001   -0.199   -1.281   -0.243    0.058

The picture from which the colours were stolen is shown in Figure 3.

Figure 3. Colours were stolen from this picture to produce Figure 5 and the right-hand image in Figure 4. **starcolour2.bmp

A close-up on the tops picture is illustrated in Figure 5.

Figure 4. From left to right, the attractor, an invariant measure, and a picture of the top made by colour stealing, for the IFS in Equation 3.1. Figure 5 shows a zoom on the picture of the top. **physics.bmp

4. The Tops Dynamical System

We show how the IFS leads to a natural dynamical system $T : G_\tau \to G_\tau$. The notation is the same as above. We also need the shift mapping

$$S : \Omega \to \Omega$$

defined by

$$S\sigma_1\sigma_2\sigma_3\ldots = \sigma_2\sigma_3\ldots$$

Figure 5. Close-up on the fractal top in Figure 4. Details of the structure are revealed by colour-stealing. **physics6.gif

for all $\sigma = \sigma_1\sigma_2\sigma_3\ldots \in \Omega$.

Lemma 1. Let $(x, \sigma) \in G_\tau$. Then $(w_{\sigma_1}^{-1}(x), S\sigma) \in G_\tau$.

Proof. Notice that $(w_{\sigma_1}^{-1}(x), S\sigma)$ is uniquely defined and belongs to $X \times \Omega$, because $w_n : X \to X$ is one-to-one and $S : \Omega \to \Omega$. Suppose $(w_{\sigma_1}^{-1}(x), S\sigma) \notin G_\tau$. Then there exists a point $(w_{\sigma_1}^{-1}(x), \omega) \in \widehat{A}$ where $\omega > S\sigma$. But since $\widehat{A} = \bigcup_n W_n(\widehat{A})$ it follows that $W_{\sigma_1}(w_{\sigma_1}^{-1}(x), \omega) \in \widehat{A}$. But $W_{\sigma_1}(w_{\sigma_1}^{-1}(x), \omega) = (x, \sigma_1\omega)$ and $\sigma_1\omega > \sigma$, which is a contradiction. $\square$

Lemma 2. Let $(x, \sigma) \in G_\tau$. Then there is $(y, \omega) \in G_\tau$ such that $(w_{\omega_1}^{-1}(y), S\omega) = (x, \sigma)$.

Proof. Let $(y, \omega) = (w_{N-1}(x), (N-1)\sigma)$. Clearly $(w_{\omega_1}^{-1}(y), S((N-1)\sigma)) = (x, \sigma)$ and $(w_{N-1}(x), (N-1)\sigma) \in \widehat{A}$. We just have to show $(w_{N-1}(x), (N-1)\sigma) \in G_\tau$. Suppose not. Then there exists a point $(w_{N-1}(x), \upsilon) \in \widehat{A}$ with $\upsilon > (N-1)\sigma$. It follows that $\upsilon_1 = N-1$, and hence that $S\upsilon > \sigma$. Also, since $(w_{N-1}(x), \upsilon) \in \widehat{A}$, it follows that $W_{N-1}^{-1}(w_{N-1}(x), \upsilon) \in \widehat{A}$, and hence $(x, S\upsilon) \in \widehat{A}$. It follows that $(x, \sigma) \notin G_\tau$, which is a contradiction. $\square$

It follows that the mapping

$$T : G_\tau \to G_\tau \quad \text{defined by} \quad T(x, \sigma) = (w_{\sigma_1}^{-1}(x), S\sigma)$$

is well-defined, and onto. It can be treated as a dynamical system, which we refer to as $\{G_\tau, T\}$. As such we may explore its invariant sets, invariant measures, other types of invariants such as entropies and information dimensions, and its ergodic properties, using "standard" terminology and machinery.

We can project $\{G_\tau, T\}$ onto the $\Omega$-direction, as follows: let

$$\Omega_\gamma := \{\sigma \in \Omega : (x, \sigma) \in G_\tau \text{ for some } x \in X\}.$$

Then $\Omega_\gamma$ is a shift invariant subspace of $\Omega$, that is, $S : \Omega_\gamma \to \Omega_\gamma$ with

$$S(\Omega_\gamma) = \Omega_\gamma,$$

and we see that $\{\Omega_\gamma, S\}$ is a symbolic dynamical system; see for example [9]. Indeed, $\{\Omega_\gamma, S\}$ is the symbolic dynamical system corresponding to a partition of the domain of yet a third dynamical system $\{A, \widetilde{T}\}$, corresponding to a mapping $\widetilde{T} : A \to A$ which is obtained by projecting $\{G_\tau, T\}$ onto $A$. This system is defined by

(4.1) $$\widetilde{T}(x) = \begin{cases} w_{N-1}^{-1}(x) & \text{if } x \in D_{N-1} := w_{N-1}(A), \\ w_{N-2}^{-1}(x) & \text{if } x \in D_{N-2} := w_{N-2}(A) \setminus w_{N-1}(A), \\ \quad\vdots \\ w_0^{-1}(x) & \text{if } x \in D_0 := w_0(A) \setminus \bigcup_{n=1}^{N-1} w_n(A), \end{cases}$$

for all $x \in A$, and we have

$$\widetilde{T}(A) = A.$$

We call $\{A, \widetilde{T}\}$ the tops dynamical system (TDS) associated with the IFS. $\{\Omega_\gamma, S\}$ is the symbolic dynamical system obtained by starting from the tops dynamical system $\{A, \widetilde{T}\}$ and partitioning $A$ into the disjoint sets $D_0, D_1, \dots, D_{N-1}$ defined in Equation 4.1, where

$$A = \bigcup_{n=0}^{N-1} D_n \quad \text{and} \quad D_i \cap D_j = \emptyset \text{ for } i \ne j.$$

An example of such a partition is illustrated in Figure 6. We refer to $\{\Omega_\gamma, S\}$ as the symbolic tops dynamical system associated with the original dynamical system. A couple of key phrases that relate to this type of situation are Markov Partitions and the Specification Property. References include works by Ya. G. Sinai, Rufus Bowen, Brian Marcus, and many others.

Theorem 1. The tops dynamical system $\{A, \widetilde{T}\}$ and the symbolic dynamical system $\{\Omega_\gamma, S\}$ are conjugate. The identification between them is provided by the tops function $\tau : A \to \Omega_\gamma$. That is,

$$\widetilde{T}(x) = \phi \circ S \circ \tau(x)$$

for all $x \in A$, and $\mu$ is an invariant measure for $\{A, \widetilde{T}\}$ iff $\tau \circ \mu$, the push-forward of $\mu$ under $\tau$, is an invariant measure for $\{\Omega_\gamma, S\}$.

Proof. Follows directly from everything we have said above. $\square$

Figure 6. Illustrates the domains $D_0, D_1, D_2, D_3$ for the tops dynamical system associated with the IFS in Equation 3.1. This was the IFS used in Figures 4 and 5. Once this "picture" has been computed, it is easy to compute the tops function: just follow orbits of the tops dynamical system! **newleaftds.gif

5. Symbolic Dynamics Algorithm for Computing Tops

Corollary 1. The value of $\tau(x)$ may be computed by following the orbit of $x \in A$, as follows. Let $x_1 = x$ and $x_{n+1} = \widetilde{T}(x_n)$ for $n = 1, 2, \ldots$, so that the orbit of $x$ is $\{x_n\}_{n=1}^{\infty}$. Then $\tau(x) = \sigma$, where $\sigma_n \in \{0, 1, \dots, N-1\}$ is the unique index such that $x_n \in D_{\sigma_n}$, for all $n = 1, 2, \ldots$

Proof. This follows directly from Theorem 1. At the risk of being repetitive, notice the following points. (i) The lifted IFS $\widehat{\mathcal{W}}$ has totally disconnected attractor $\widehat{A}$. Hence there is a well-defined shift dynamical system $\widehat{T} : \widehat{A} \to \widehat{A}$ which is equivalent to $S : \Omega \to \Omega$. $\widehat{A}$ is the disjoint union of the sets $W_n(\widehat{A})$, and the code space address of each point of $\widehat{A}$ can be obtained by following its orbit under $\widehat{T}$ and concatenating the indices of the successive regions $W_n(\widehat{A})$ in which it lands. The corollary follows because $T = \widehat{T}|_{G_\tau}$. (ii) We may follow directly the orbit of $x$ under $\widetilde{T}$ and keep track of the indices as described. Use the strict contractivity of the maps to show that $\lim_{n \to \infty} w_{\sigma_1} \circ w_{\sigma_2} \circ \cdots \circ w_{\sigma_n}(y) = x$ for all $y \in A$. The key observation is that the orbit of a point on the top stays on the top, and so the code given by its orbit is the value of the tops function applied to the point. $\square$

Corollary 1 provides us with a delightful algorithm for computing approximations to the top of an IFS in cases in which we are particularly interested, namely when the IFS acts in $\mathbb{R}^2$, as in Equation 3.1. Here we describe briefly one of many possible variants of the algorithm, concentrating on the key idea.

(i) Fix a resolution $L \times M$. Set up a rectangular array of little rectangular boxes, "pixels", corresponding to a rectangular region in $\mathbb{R}^2$ which contains the attractor of the IFS. Each box "contains" either a null value or a floating point number $x_{l,m} \in \mathbb{R}^2$ and a finite string of indexes $\omega_{l,m} = \sigma_{l,m}^1 \sigma_{l,m}^2 \ldots \sigma_{l,m}^{L_{l,m}}$, where $L_{l,m}$ denotes the length of the string and each $\sigma_{l,m}^k \in \{0, 1, \dots, N-1\}$. The little boxes are initialized to null values.

(ii) Run the standard random iteration algorithm applied to the IFS, with appropriate choices of probabilities, for sufficiently many steps to ensure that it has "settled" on the attractor; then record, in all of those boxes which taken together correspond to a discretized version of the attractor, a representative floating point value of $x \in A$ and the highest value, so far encountered, of the map index corresponding to that $x$-value. This requires that one runs the algorithm for sufficiently many steps that each box is visited many times, and that, each time a map with a higher index value than the one recorded in a visited little box is applied, the index value at the box is replaced by the higher index. [The result will be a discretized "picture" of the attractor, defined by the boxes with non-null entries, partitioned into the domains $D_0, D_1, \dots, D_{N-1}$, with a high resolution value of $x \in A$ for each pixel. We will use these high resolution values to correct, at each step, the approximate orbits of $\widetilde{T} : A \to A$, which otherwise would rapidly lose precision and "leave" the attractor.]

(iii) Choose a little rectangular box, indexed by say $l_1, m_1$, which does not have a null value. (If the value of $\tau(x_{l_1,m_1})$, namely the string $\omega_{l_1,m_1}$, is already recorded in the little rectangular box to sufficient precision, that is $L_{l_1,m_1} = L$, say, go to another little rectangular box until one is found for which $L_{l_1,m_1} = 1$.) Keep track of $l_1, m_1, \sigma_{l_1,m_1}$. Compute $w_{\sigma_{l_1,m_1}}^{-1}(x_{l_1,m_1})$, then discretize and identify the little box $l_2, m_2$ to which it belongs. If $L_{l_2,m_2} = L$ then set

$$\omega_{l_1,m_1} = \sigma_{l_1,m_1}^1 \sigma_{l_2,m_2}^1 \sigma_{l_2,m_2}^2 \ldots \sigma_{l_2,m_2}^{L-1}$$

and go to (iv). If $l_2, m_2 = l_1, m_1$, set

$$\omega_{l_1,m_1} = \sigma_{l_1,m_1}^1 \sigma_{l_1,m_1}^1 \sigma_{l_1,m_1}^1 \ldots \sigma_{l_1,m_1}^1$$

and go to (iv). Otherwise, keep track of $l_1, m_1, \sigma_{l_1,m_1}; l_2, m_2, \sigma_{l_2,m_2}$, and repeat the iterative step, now starting at $l_2, m_2$ and computing $w_{\sigma_{l_2,m_2}}^{-1}(x_{l_2,m_2})$. Continue in this manner until either one lands in a box for which the string value is already of length $L$, in which case one back-tracks along the orbit one has been following, filling in all the string values up to length $L$, or until the sequence of visited boxes first includes the address of one box twice, i.e. a discretized periodic orbit is encountered. The strings of all of the points on the periodic orbit can now be filled out to length $L$, and then the strings of all of the points leading to the periodic cycle can be deduced and entered into their boxes.

(iv) Select systematically a new little rectangular box and repeat step (iii), and continue until all the strings $\omega_{l,m}$ have length $L$. Our final approximation is

$$\tau(x_{l,m}) = \omega_{l,m}.$$

This algorithm includes "pixel-chaining", as described in [8], and is very efficient because only one point lands in each pixel during stages (iii) and (iv). A simplified one-dimensional example is sketched below.
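The following Python sketch illustrates the idea for the IFS of Example 1, for which the domains are simply $D_1 = [1-\alpha, 1]$ and $D_0 = [0, 1-\alpha)$, in place of a two-dimensional IFS. Strings are shared along computed orbits in the spirit of pixel-chaining, but the back-tracking and periodic-orbit book-keeping of steps (iii) and (iv) are only crudely imitated; all names and parameter values are our own.

A = 0.75
N_PIX = 1000
L = 24                                  # symbols of tau(x) per pixel

def T(x):
    # The tops dynamical system of Equation 4.1 for Example 1.
    return (x - (1 - A)) / A if x >= 1 - A else x / A

def pixel(x):
    return min(int(x * N_PIX), N_PIX - 1)

strings = [None] * N_PIX
for p in range(N_PIX):
    if strings[p] is not None:
        continue
    x = (p + 0.5) / N_PIX
    orbit, digits = [], []
    while len(digits) < L:
        q = pixel(x)
        if strings[q] is not None:      # "pixel-chaining": reuse the
            digits.extend(strings[q])   # finished tail at pixel q
            break
        orbit.append(q)
        digits.append(1 if x >= 1 - A else 0)   # index of the D containing x
        x = T(x)
    for k, q in enumerate(orbit):       # back-fill the visited pixels
        if strings[q] is None and len(digits) >= k + L:
            strings[q] = digits[k:k + L]

# strings[p] approximates the first L symbols of tau at the centre of pixel p.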

6. Analysis of Example 1

6.1. Invariant Measures and Random Iteration on Fractal Tops. We are interested in developing algorithms which are able to compute directly the graph $G_\tau$ of the tops function by some form of random iteration in which, at every step, the new points remain on $G_\tau$. For while the just-described algorithm is delightful for computing approximations to the whole of $G_\tau$, it appears to be cumbersome, and to have large memory requirements, if very high resolution approximations to a part of $G_\tau$ are required, as when one "zooms in on" a fractal. But the standard chaos game algorithm has huge benefits in this regard, and we would like to be able to repeat the process here. (The variant of the random iteration algorithm mentioned earlier, where one works on the whole of $\widehat{A}$ and keeps track of "highest values", is clearly inefficient in overlapping cases where the measure of the overlapping region is not negligible. Orbits of points may spend time wandering around deep within $\widehat{A}$, rarely visiting the top!) In Section 7 we use the insights gained in this section to provide a random iteration algorithm which, for a class of cases, "stays on top".

But this is not the only motivation for studying stochastic processes and invariant measures on tops. Such processes have very interesting connections to information theory and data compression. In the end one would like to come back, full-circle, to obtain insights into image compression by understanding these processes. This topic in turn relates to IFS's with place-dependent probabilities, and to studies concerning when such IFS's possess unique invariant measures.

Here we extend our discussion of Example 1 in some detail, to show the sort of thing we mean. We show that this example is related to a dynamical system studied by A. Rényi [10], connected to information theory. The IFS in this example also shows up in the context of Bernoulli convolutions, where the absolute continuity or otherwise of its invariant measures is discussed. We prove a theorem concerning this example which is, hopefully, new. Much of what we say may be generalized extensively.

6.2. The TIFS and Markov Process for Example 1. We continue Example 1. The tops IFS (TIFS) associated with the IFS in Equation 2.6 is the "IFS" made from the following two functions, one of which has as its domain a set that is not all of $[0, 1]$:

(6.1) $$\widetilde{w}_0(x) = \alpha x \ \text{ for all } x \in [0, \tfrac{1-\alpha}{\alpha}); \qquad \widetilde{w}_1(x) = \alpha x + (1 - \alpha) \ \text{ for all } x \in [0, 1].$$

These two functions are sketched in Figure 7.

We are interested in invariant measures for the following type of Markov process. This relates to a "chaos game" with place-dependent probabilities. Define a Markov transition probability by

(6.2) $$P(x, B) = \widetilde{p}_0(x)\chi_B(\widetilde{w}_0(x)) + \widetilde{p}_1(x)\chi_B(\widetilde{w}_1(x)).$$


Figure 7. The TIFS defined in Equation 6.1. This is the TIFS associated with the IFS illustrated in Figure 1. **tifs2.wmf

Here

(6.3) $$\widetilde{p}_0(x) = \begin{cases} p_0 & \text{for } 0 \le x < \frac{1-\alpha}{\alpha}, \\ 0 & \text{for } \frac{1-\alpha}{\alpha} \le x \le 1, \end{cases}$$

(6.4) $$\widetilde{p}_1(x) = \begin{cases} p_1 & \text{for } 0 \le x < \frac{1-\alpha}{\alpha}, \\ 1 & \text{for } \frac{1-\alpha}{\alpha} \le x \le 1, \end{cases}$$

where $p_0, p_1 > 0$ and $p_0 + p_1 = 1$. $P(x, B)$ is the probability of transfer from $x \in [0, 1]$ into the Borel set $B$. Intuitively, pick a number $i \in \{0, 1\}$ according to the distribution $\{\widetilde{p}_i(x)\}$ and transfer from $x$ to $\widetilde{w}_i(x)$. What types of measures may be generated by such random orbits? When are such measures invariant for the Markov process? A Borel measure $\widetilde{\mu}$ on $[0, 1]$ is said to be invariant for the Markov process iff $\widetilde{\mu}(B) = \int P(x, B)\,d\widetilde{\mu}(x)$ for all Borel sets $B \subset [0, 1]$.

6.3. The TDS and Trapping Region for Example 1. The tops dynamical system (TDS) associated with this TIFS, and with the IFS in Equation 2.6, is obtained by "inverting" the TIFS, and is readily found to be defined as follows:

$$D_0 = [0, 1 - \alpha), \quad D_1 = [1 - \alpha, 1],$$

and

(6.5) $$\widetilde{T}(x) = \begin{cases} \frac{1}{\alpha}x & \text{for } x \in D_0, \\ \frac{1}{\alpha}x - (\frac{1}{\alpha} - 1) & \text{for } x \in D_1. \end{cases}$$

This is sketched in Figure 8, where a web-diagram for part of one orbit is plotted. Notice that orbits under $\widetilde{T}$ of points in $(\frac{1-\alpha}{\alpha}, 1)$ eventually arrive in the interval


Figure 8. The tops dynamical system defined in Equation 6.5. This is the dynamical system associated with the TIFS in Figure 7 and the IFS in Figure 1. Part of one orbit is plotted, and the "trapping region" is labelled: once an orbit gets into here, it never escapes. What sort of invariant measures are available? **tifs3.wmf

$R_{\mathrm{trapping}} = [0, \frac{1-\alpha}{\alpha}]$, and that once there, they never leave again. We refer to $R_{\mathrm{trapping}}$ as the trapping region. Notice too that $\widetilde{T}(1) = 1$ and that $x = 1$ is a repulsive fixed point of the dynamical system. That is, the tops dynamical system $\widetilde{T} : [0, 1] \to [0, 1]$ admits the invariant measure $\widetilde{\mu} = \delta_1(x)$, the unit point mass at $x = 1$. Note that this atomic measure is also invariant for the Markov process with transition probability $P(x, B)$. But the following theorem tells us that, for invariant probability measures for the process, there are no other possible atoms than one at $x = 1$.
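The trapping behaviour is easy to check numerically. The following Python fragment is a quick sanity check, under an illustrative choice of $\alpha$, and is not part of the formal development.

import random

# Orbits of the TDS of Equation 6.5 that enter R_trapping = [0, (1-a)/a]
# never leave it again, and generic orbits in ((1-a)/a, 1) fall into it.

a = 0.7
trap = (1 - a) / a

def T(x):                     # Equation 6.5
    return x / a - (1 / a - 1) if x >= 1 - a else x / a

for _ in range(100):          # random starting points in (0, 1)
    x, inside = random.random(), False
    for _ in range(1000):
        x = T(x)
        if x <= trap:
            inside = True
        elif inside and x > trap + 1e-9:
            raise AssertionError("orbit escaped the trapping region")
    assert inside             # generic orbits do arrive in the trap
print("all sampled orbits arrive in, and stay in, the trapping region")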

6.4. Non-atomic invariant measures for Example 1. The proof of the following result typifies an argument which may be used in many cases to establish the existence of non-atomic invariant measures for TIFS.

Theorem 2. Let $\widetilde{\mu}$ be an invariant probability measure for the Markov process described by Equation 6.2. Then $\widetilde{\mu}(\{a\}) = 0$ for all $a \in [0, 1)$.

Proof. Let $\widetilde{\mu}$ be an invariant probability measure and let $a \in [0, 1]$. Suppose that $\widetilde{\mu}(\{a\}) > 0$. Then

$$\widetilde{\mu}(\{a\}) = \widetilde{p}_0(\widetilde{w}_0^{-1}(\{a\}))\widetilde{\mu}(\widetilde{w}_0^{-1}(\{a\})) + \widetilde{p}_1(\widetilde{w}_1^{-1}(\{a\}))\widetilde{\mu}(\widetilde{w}_1^{-1}(\{a\})),$$

from which it follows that

$$\widetilde{\mu}(\{a\}) = \begin{cases} \text{either } p_0\,\widetilde{\mu}(\widetilde{w}_0^{-1}(\{a\})), \\ \text{or } p_1\,\widetilde{\mu}(\widetilde{w}_1^{-1}(\{a\})), \\ \text{or } \widetilde{\mu}(\widetilde{w}_1^{-1}(\{a\})), \end{cases}$$

where the option is determined by the location of $a$. Equivalently,

$$\widetilde{\mu}(\{\widetilde{T}(a)\}) = \left\{\tfrac{1}{p_0}, \text{ or } \tfrac{1}{p_1}, \text{ or } 1\right\} \cdot \widetilde{\mu}(\{a\}),$$

where the last option, $\widetilde{\mu}(\{\widetilde{T}(a)\}) = \widetilde{\mu}(\{a\})$, occurs iff $\widetilde{T}(a) \ge a$. It follows that the mass of each succeeding point on the orbit of the point $a$ under the TDS $\widetilde{T} : [0, 1] \to [0, 1]$ is greater than or equal to the mass of the point whose image under $\widetilde{T}$ it is, with equality only when the succeeding point lies to the right of the point whose image it is. It follows that the orbit of $a$ must be eventually periodic, for otherwise we have $\widetilde{\mu}([0, 1]) = \infty$. If the orbit of $a$ is eventually periodic, then without loss of generality we can take $a$ to be a periodic point, and moreover we can choose it to be the leftmost point on the orbit. Then, if the orbit contains more than one point, the preimage of $a$ on the orbit must lie to the right of $a$, and consequently, since masses do not decrease along the orbit from $a$ back to its preimage, the mass of $\{a\}$ must be strictly greater than the mass of $\{a\}$, which is impossible. This leaves only the possibility that $a = 1$. $\square$

6.5. The TIFS and TDS for Example 1 in the trapping region. From this point forward in this section we concentrate on the behaviour of the TDS and the TIFS in the trapping region. We study invariant probability measures and ergodic properties for the TIFS and TDS restricted to the trapping region. In view of Theorem 2, we restrict attention to invariant probability measures which do not contain atoms. This will allow us to modify the "trapped" TDS/TIFS on any countable set of points without altering potential invariant probability measures off a set of measure zero. Later, in Section 7, we will use the systems of functions and measures which we are now analyzing to construct new Markov processes on the top of the original IFS.

We make the change of variable $x' = \frac{\alpha}{1-\alpha}x = g(x)$ to rescale the behaviour of the TDS restricted to $R_{\mathrm{trapping}}$, producing an equivalent dynamical system acting on $[0, 1]$: that is, $T : [0, 1] \to [0, 1]$ is defined by $T = g \circ \widetilde{T} \circ g^{-1}$. The result is

e D0 =[0,α), D1 =[eα, 1] e

and

(6.6) $$T(x) = \begin{cases} \frac{1}{\alpha}x & \text{for } x \in D_0, \\ \frac{1}{\alpha}x - 1 & \text{for } x \in D_1. \end{cases}$$

This is sketched in Figure 9. Notice that

(6.7) $$T(x) = \langle \beta x \rangle,$$

where $\langle y \rangle$ denotes the fractional part of the real number $y$ and

$$\beta = \frac{1}{\alpha}.$$

By using the symbol $\beta$ we connect our problem to standard notation for a much studied dynamical system, Equation 6.7; see for example [10], [9]. The focus of all of the work with which we are familiar, connected to the dynamical system in Equation 6.6, concerns invariant measures which are absolutely continuous with respect to Lebesgue measure and which maximize topological entropy. This is not our focus. We are interested in the existence and computation of invariant measures for the associated IFS with place-dependent, typically piecewise constant, probabilities.
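For concreteness, the following Python fragment computes the itinerary of a point under the trapped TDS of Equations 6.6 and 6.7, i.e. the sequence of domain indices along the orbit; this is exactly the observation used in Section 6.7 below to compute $\widetilde{\tau}(x)$. The value of $\alpha$ and the starting point are illustrative.

# Itinerary under Renyi's beta-transformation T(x) = <beta x>: recording
# which of D0 = [0, alpha), D1 = [alpha, 1] the orbit visits gives the
# symbolic sequence of the trapped TDS.

alpha = 0.7
beta = 1 / alpha

def itinerary(x, n):
    digits = []
    for _ in range(n):
        digits.append(0 if x < alpha else 1)
        # <beta x>: since beta < 2, this subtracts 1 exactly when x >= alpha,
        # matching Equation 6.6
        x = (beta * x) % 1.0
    return digits

print(itinerary(0.5, 20))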

Figure 9. The TDS in Figure 8 restricted to the trapping region and rescaled to $[0, 1]$. This trapped TDS is given by Equation 6.6. **tifs4.wmf

Next we invert the trapped TDS to produce a corresponding restricted TIFS. This is defined with the aid of the two functions

(6.8) $$\widetilde{w}_0(x) = \alpha x \ \text{ for all } x \in [0, 1); \qquad \widetilde{w}_1(x) = \alpha x + \alpha \ \text{ for all } x \in [0, \tfrac{1-\alpha}{\alpha}].$$

These are sketched in Figure 10.

Figure 10. The trapped part of the rescaled TIFS, defined in Equation 6.8. This is the inverse of the dynamical system in Figure 9. **tifs5.wmf

The Markov process in the trapped region corresponds to the transition probability

$$P(x, B) = \widetilde{p}_0(x)\chi_B(\widetilde{w}_0(x)) + \widetilde{p}_1(x)\chi_B(\widetilde{w}_1(x)),$$

with the place-dependent probabilities

(6.9) $$\widetilde{p}_0(x) = \begin{cases} p_0 & \text{for } 0 \le x \le \frac{1-\alpha}{\alpha}, \\ 1 & \text{for } \frac{1-\alpha}{\alpha} < x \le 1, \end{cases}$$

(6.10) $$\widetilde{p}_1(x) = \begin{cases} p_1 & \text{for } 0 \le x \le \frac{1-\alpha}{\alpha}, \\ 0 & \text{for } \frac{1-\alpha}{\alpha} < x \le 1, \end{cases}$$

where $p_0, p_1 > 0$ and $p_0 + p_1 = 1$. The reader may find it interesting to compare the trapped TIFS in Equations 6.8, 6.9 and 6.10 (system II) to the original TIFS in

Equations 6.1, 6.3 and 6.4 (system I). The two systems look very similar. But the trapped system admits no invariant probability measure which contains an atom. Notice that if $\mu$ is an invariant probability measure for the trapped system (system II) then

$$\widetilde{\mu} = \mu \circ g$$

is an invariant measure for the original system (system I).

6.6. Existence and Unicity of Invariant Measures for the trapped IFS. The following theorem asserts that there are two basically different situations which can occur. In the following subsections we will sketch the proof, and in so doing give a much clearer picture of what is going on. In the course of the proof we will need some new bits of theory; these are developed, in forms appropriate to more general situations, relevant to tops more generally, as they are required.

Theorem 3. Let $\frac{1}{2} < \alpha < 1$. Let $\widetilde{w}_0 : [0, 1] \to [0, 1]$ be defined by $\widetilde{w}_0(x) = \alpha x$. Let $\widetilde{w}_1 : [0, \frac{1}{\alpha} - 1] \to [0, 1]$ be defined by $\widetilde{w}_1(x) = \alpha x + \alpha$. Let

$$\widetilde{p}_0(x) = \begin{cases} p_0 & \text{for } 0 \le x \le \frac{1}{\alpha} - 1, \\ 1 & \text{for } \frac{1}{\alpha} - 1 < x \le 1, \end{cases} \qquad \widetilde{p}_1(x) = 1 - \widetilde{p}_0(x),$$

where $p_0, p_1 > 0$ and $p_0 + p_1 = 1$. Define a Markov transition probability by

(6.11) $$P(x, B) = \widetilde{p}_0(x)\chi_B(\widetilde{w}_0(x)) + \widetilde{p}_1(x)\chi_B(\widetilde{w}_1(x)).$$

Then the Markov process possesses at least one, and at most two, linearly independent invariant probability measures. In particular, there exist at most two invariant probability measures which are also ergodic invariant measures for the tops dynamical system $T : [0, 1] \to [0, 1]$ defined by

$$T(x) = \langle \tfrac{1}{\alpha}x \rangle \text{ for all } x \in [0, 1],$$

where $\langle y \rangle$ denotes the fractional part of the real number $y$. If $\alpha$ possesses certain arithmetic properties, namely that $\beta = \frac{1}{\alpha}$ is a "β-number", then the Markov process possesses a unique invariant probability measure which is ergodic for $T$.

6.7. First Part of the Proof: Corresponding Code Space Systems. We can convert the system in Theorem 3 to an equivalent symbolic dynamical system with the aid of the maps

(6.12) $$\widehat{w}_0(x, \sigma) = (\alpha x, 0\sigma) \ \text{ for all } (x, \sigma) \in [0, 1) \times \Omega; \qquad \widehat{w}_1(x, \sigma) = (\alpha x + \alpha, 1\sigma) \ \text{ for all } (x, \sigma) \in [0, \tfrac{1-\alpha}{\alpha}] \times \Omega.$$

We call the system described by Equation 6.12 "the lifted trapped TIFS"; it is illustrated in Figure 11. This system possesses a maximal invariant set $G_{\widetilde{\tau}} \subset [0, 1) \times \Omega$ which obeys

$$G_{\widetilde{\tau}} = \widehat{w}_0(G_{\widetilde{\tau}}) \cup \widehat{w}_1(G_{\widetilde{\tau}}).$$

$G_{\widetilde{\tau}}$ is the graph of a one-to-one function $\widetilde{\tau} : [0, 1] \to \Omega$. The projection of this system onto the $[0, 1]$-direction gives us back the original system, the one in Theorem 3.

Figure 11. The lifted trapped TIFS in Equation 6.12. The maximal invariant set, the attractor, is the graph of a function $\widetilde{\tau} : [0, 1] \to \Omega$ which provides a symbolic dynamical system equivalent to the original trapped TIFS. **tifs6.wmf

Projection in the $\Omega$-direction provides us with a symbolic dynamical system $S : \Omega_\gamma \to \Omega_\gamma$, where $\widetilde{\tau}([0, 1]) = \Omega_\gamma$ and $S(\Omega_\gamma) = \Omega_\gamma$. $\{\Omega, S\}$ is the usual shift dynamical system on code space, and the symbolic dynamical system $\{\Omega_\gamma, S\}$ is obtained by restricting the domain of $S$ to $\Omega_\gamma$. We could choose to write $S : \Omega_\gamma \to \Omega_\gamma$ as $S|_{\Omega_\gamma} : \Omega_\gamma \to \Omega_\gamma$ and $\{\Omega_\gamma, S|_{\Omega_\gamma}\} = \{\Omega_\gamma, S\}$, but do not. Note that we can compute $\widetilde{\tau}(x)$ by following the orbit of $x \in [0, 1]$ under the trapped TDS (Equation 6.6). Thus we obtain the symbolic IFS with place-dependent probabilities

$$\{\Omega_\gamma;\ s_0|_{\Omega_\gamma}, s_1|_{\Omega_\gamma};\ p_0(\sigma), p_1(\sigma)\}$$

where

(6.13) $$s_0|_{\Omega_\gamma}(\sigma) = 0\sigma \ \text{ for all } \sigma \in \Omega_\gamma, \qquad s_1|_{\Omega_\gamma}(\sigma) = 1\sigma \ \text{ for all } \sigma \in \Omega_\gamma,$$

and the probabilities are given by $p_i(\sigma) = \widetilde{p}_i(\widetilde{\tau}^{-1}(\sigma))$, that is

(6.14) $$p_0(\sigma) = \begin{cases} p_0 & \text{for all } \sigma \in \Omega_\gamma \text{ with } \overline{0} \le \sigma \le \gamma, \\ 1 & \text{for all } \sigma \in \Omega_\gamma \text{ with } \gamma < \sigma \le \overline{1}, \end{cases}$$

and

(6.15) $$p_1(\sigma) = \begin{cases} p_1 & \text{for all } \sigma \in \Omega_\gamma \text{ with } \overline{0} \le \sigma \le \gamma, \\ 0 & \text{for all } \sigma \in \Omega_\gamma \text{ with } \gamma < \sigma \le \overline{1}, \end{cases}$$

where $\gamma = \widetilde{\tau}(\frac{1-\alpha}{\alpha})$. This system is equivalent to the one in Theorem 3. It is the restriction to $\Omega_\gamma$ of the IFS $\{\Omega; s_0, s_1; p_0(\sigma), p_1(\sigma)\}$, where

(6.16) $$s_0(\sigma) = 0\sigma \ \text{ for all } \sigma \in \Omega, \qquad s_1(\sigma) = 1\sigma \ \text{ for all } \sigma \in \Omega,$$

and the probabilities $p_i : \Omega \to [0, 1]$ are defined in the same way as the $p_i : \Omega_\gamma \to [0, 1]$ in Equations 6.14 and 6.15, with $\Omega_\gamma$ replaced by $\Omega$. It is helpful in some circumstances, such as when one needs to discuss the inverses of the maps in the IFS, to take the domain of $s_1$ here to be $\Omega \cap [0, \gamma]$ in place of $\Omega$. Then the IFS is, strictly speaking, a local IFS, namely an IFS in which the domains of the functions need not be the whole space on which the IFS is defined. A similar remark applies to other IFS's with place-dependent probabilities, when the probabilities are zero over parts of the underlying space.

The system described by Equation 6.16 is an extension of the system described by Equation 6.13. Notice the following relationship between these two systems: $\Omega_\gamma$ is the unique set attractor of the IFS $\{\Omega; s_0, s_1|_{[0,\gamma]}\}$; that is, $\Omega_\gamma$ is the unique nonempty compact subset of $\Omega$ such that

$$\Omega_\gamma = s_0(\Omega) \cup s_1|_{[0,\gamma]}(\Omega) = s_0(\Omega) \cup s_1(\Omega \cap [0, \gamma]).$$

By ignoring a countable set of points, we can embed the code space $\Omega$ in $[0, 1]$ to make pictures of, and simplify the visualization of, the symbolic IFS's in Equations 6.13 and 6.16. For example, the symbolic system in Equation 6.13 may be represented by the following pretty IFS with place-dependent probabilities

$$\{[0, 1];\ s_0 : [0, 1] \to [0, 1],\ s_1 : [0, \gamma] \to [0, 1];\ p_0(x), p_1(x)\}$$

where

(6.17) $$s_0(x) = \tfrac{1}{2}x \ \text{ for all } x \in [0, 1], \qquad s_1(x) = \tfrac{1}{2}x + \tfrac{1}{2} \ \text{ for all } x \in [0, \gamma],$$

and where

$$p_0(x) = \begin{cases} p_0 & \text{for all } 0 \le x \le \gamma, \\ 1 & \text{for all } \gamma < x \le 1, \end{cases} \qquad p_1(x) = \begin{cases} p_1 & \text{for all } 0 \le x \le \gamma, \\ 0 & \text{for all } \gamma < x \le 1. \end{cases}$$

This process is equivalent to the Markov process on $[0, 1]$ corresponding to the transition probability $P(x, B)$ in Equation 6.11 in Theorem 3.
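The process just described is easy to simulate. The following Python sketch runs the chaos game with place-dependent probabilities for the system in Equation 6.17, with illustrative values of $\gamma$ and $p_0$, and accumulates a histogram which approximates an invariant measure.

import random

# Chaos game with place-dependent probabilities for Equation 6.17: below
# gamma, s0 and s1 are chosen with probabilities p0 and p1; above gamma,
# s0 is applied with probability one.

gamma, p0 = 0.8125, 0.6          # e.g. gamma = 0.1101 in binary
bins = 64
counts = [0] * bins

x = random.random()
for step in range(200_000):
    if x <= gamma and random.random() > p0:
        x = x / 2 + 0.5          # s1, chosen with probability p1 = 1 - p0
    else:
        x = x / 2                # s0
    if step > 1000:              # discard the transient
        counts[min(int(x * bins), bins - 1)] += 1

# counts is proportional to a histogram approximating an invariant measure.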

Figure 12. The symbolic TIFS in Equation 6.17, which is (a) easyish to analyze and (b) holds the key to understanding all the earlier manifestations. The dotted line represents the portion of the graph of $s_1$ in the region where it is never applied, i.e. where $p_1(x) = 0$. **tifs7.wmf

But we can also consider the corresponding process on $\Omega$, which involves applying the maps $s_0, s_1$ with probabilities $p_0, p_1$ respectively when $\sigma \in \Omega$ with $\sigma \le \gamma$, and applying the map $s_0$ with probability one when $\sigma \in \Omega$ with $\sigma > \gamma$. This process extends, to all of $\Omega$, the domain of the original symbolic process. We will find it very useful to consider this latter process. This is because, when we change $\alpha$, the maps and the space upon which they act remain unaltered; only the set of values of $\sigma \in \Omega$ for which $p_1(\sigma) = 0$ changes. In the original system, in Theorem 3, the slopes of the maps, the location where a probability becomes zero, and the set of allowed codes all change with $\alpha$.

6.8. Transfer operator and Probability operator for the system in Theorem 3. We will find that the structure of the system in Theorem 3 depends fundamentally on whether or not $\gamma$, the address of the point $\frac{1-\alpha}{\alpha}$, which may be computed by following the orbit of $\frac{1-\alpha}{\alpha}$ under the trapped tops dynamical system $T : [0, 1] \to [0, 1]$, terminates in an endless string of zeros or not. That is, on whether or not

$$\frac{k}{2^r} - 1 = \frac{1-\alpha}{\alpha} = \beta - 1$$

for some pair of positive integers $k$ and $r$.

We work with the closure of the symbolic system. We are interested in those invariant measures which correspond to orbits of the following type of random iteration. When the current point lies in the region $\overline{0} \le \sigma \le \gamma$ there is a nonzero probability that $s_0$ may be applied to the current point, and there is a nonzero probability that $s_1$ may be applied to the current point. When the current point lies in the region $\gamma < \sigma < \overline{1}$ the map $s_0$ is applied with probability one. Theorem 2 implies that this Markov process possesses no invariant measures which include point masses. Notice that if $\gamma$ terminates in $\overline{0}$ or $\overline{1}$ then $S : \Omega_\gamma \to \Omega_\gamma$ is open, that is, it maps open sets to open sets. This implies that the inverse branches of $S$ are continuous in the product topology.

Let $X \in \{[0, 1], \Omega, \Omega_\gamma\}$. We need the transfer operator $U : \mathcal{L}^0 \to \mathcal{L}^0$, where $\mathcal{L}^0 = \mathcal{L}^0(X)$ is the space of bounded Borel measurable functions on $X$, defined by

$$(Uf)(\cdot) := \sum_{i=0}^{1} p_i(\cdot)f(t_i(\cdot)) \ \text{ for all } f \in \mathcal{L}^0,$$

where $t_i = \widetilde{w}_i$, or $s_i$, or $s_i|_{\Omega_\gamma}$, according as $X = [0, 1]$, or $\Omega$, or $\Omega_\gamma$, respectively. We also need the adjoint operator $U^* : \mathbb{P} \to \mathbb{P}$, where $\mathbb{P} = \mathbb{P}(X)$ is the space of all Borel probability measures on $X$. $U^*$ is defined by

(6.18) $$(U^*\upsilon)(f) = \int_X f\,d(U^*\upsilon) := \int_X Uf\,d\upsilon \ \text{ for all } f \in \mathcal{L}^0 \text{ and } \upsilon \in \mathbb{P}.$$

We refer to $U^*$ as the probability operator associated with the corresponding IFS. We interpret these operators $U$ and $U^*$ as acting with underlying space $\Omega$, $\Omega_\gamma$ or $[0, 1]$ according to context.

6.9. Example: The case $\gamma = 11\overline{0}$. At this point, to illustrate some key ideas, we look at a simple example of the case where $\gamma$ terminates in $\overline{0}$, namely $\gamma = 11\overline{0}$. We modify the Markov process using the symbolic IFS by replacing $\le \gamma$ by $< \gamma$, that is:

when $\sigma < 11\overline{0}$, apply the maps $s_0, s_1$ with probabilities $p_0, p_1$ respectively;

otherwise, apply the map $s_0$.

Look at what happens on the four cylinder sets $C_{00}, C_{01}, C_{10}, C_{11}$. What does the operator $U^*$ do to a measure $\nu \in \mathbb{P}(\Omega)$? Let us write

$$\nu = \nu_{00} + \nu_{01} + \nu_{10} + \nu_{11},$$

where $\nu_{ij}$ is zero off the cylinder $C_{ij}$, to denote the four obvious components of $\nu$. Then the new measure $\widetilde{\nu} = U^*\nu$ on $C_{00}$ is given by

$$\widetilde{\nu}_{00}(B) = p_0\nu_{00}(SB) + p_0\nu_{01}(SB)$$

for all Borel subsets $B \in \mathcal{B}(\Omega)$. It consists of two components, one supported on the cylinder $C_{000} \subset C_{00}$ and one supported on the cylinder $C_{001} \subset C_{00}$. We also find, using similar notation, that

$$\widetilde{\nu}_{01}(B) = p_0\nu_{10}(SB) + \nu_{11}(SB),$$

$$\widetilde{\nu}_{10}(B) = p_1\nu_{00}(SB) + p_1\nu_{01}(SB), \qquad \widetilde{\nu}_{11}(B) = p_1\nu_{10}(SB).$$

See Figure 13.

Figure 13. Illustrates the redistribution of mass between cylinder sets by the probability operator in Example 6.9. **bins1.gif

In particular, if the measure $\nu$ is invariant, then the masses of each of the cylinders must be preserved, which implies

(6.19) $$\begin{pmatrix} p_0 & p_0 & 0 & 0 \\ 0 & 0 & p_0 & 1 \\ p_1 & p_1 & 0 & 0 \\ 0 & 0 & p_1 & 0 \end{pmatrix} \begin{pmatrix} \nu(C_{00}) \\ \nu(C_{01}) \\ \nu(C_{10}) \\ \nu(C_{11}) \end{pmatrix} = \begin{pmatrix} \nu(C_{00}) \\ \nu(C_{01}) \\ \nu(C_{10}) \\ \nu(C_{11}) \end{pmatrix}.$$

The matrix here is the transpose of an irreducible stochastic matrix and, by well-known theorems [**], Equation 6.19 possesses a unique solution: an eigenvector with strictly positive entries, with eigenvalue one, and with total mass one. In fact, in this case, we readily find:

(6.20) $$\nu(C_{00}) = \frac{p_0}{c}, \quad \nu(C_{01}) = \frac{p_1}{c}, \quad \nu(C_{10}) = \frac{p_1}{c}, \quad \nu(C_{11}) = \frac{p_1^2}{c},$$

where $c = 1 + p_1 + p_1^2$. Now let

$$\widetilde{\mathbb{P}}(\Omega) = \{\nu \in \mathbb{P}(\Omega) : \nu \text{ obeys Equation 6.20}\}.$$

Then $(\widetilde{\mathbb{P}}(\Omega), d_{MK})$ is a complete metric space, where $d_{MK}$ denotes the Monge-Kantorovitch metric. By what we have just proved, $U^* : \widetilde{\mathbb{P}}(\Omega) \to \widetilde{\mathbb{P}}(\Omega)$. Furthermore, we claim that

$$d_{MK}(U^*\nu, U^*\mu) \le l \cdot d_{MK}(\nu, \mu).$$

To see that this is true, and much more as well, we set up some notation and a framework.

6.10. A general framework. Let

$$\{K_i : i = 0, 1, 2, \dots, N-1\}$$

$$K = K_0 \cup K_1 \cup \dots \cup K_{N-1}.$$

Define a metric $d_\cup$ on $K$ by

$$d_\cup(x, y) = \begin{cases} d(x, y) & \text{if } x, y \in K_i \text{ for some } i \in \{0, 1, \dots, N-1\}, \\ D & \text{if } x \in K_i \text{ and } y \in K_j \text{ for some } i \ne j, \text{ with } i, j \in \{0, 1, \dots, N-1\}, \end{cases}$$

where $D > 0$ is a constant. Then $(K, d_\cup)$ is a complete metric space. Let

$$w_{ij} : K_i \to K_j$$

be a one-to-one Lipschitz function with Lipschitz constant $l_{ij}$, that is,

$$d(w_{ij}(x), w_{ij}(y)) \le l_{ij}d(x, y) \ \text{ for all } x, y \in K_i, \text{ for all } i, j = 0, 1, \dots, N-1.$$

Use these functions to define new functions $W_j : K \to K$ according to

$$W_j(x) = w_{ij}(x) \ \text{ when } x \in K_i, \text{ for } j = 0, 1, \dots, N-1.$$

Notice that $W_j$ is continuous because

$$d_\cup(W_j(x), W_j(y)) \le \max_i\{l_{ij}\} \cdot d_\cup(x, y) \ \text{ whenever } d_\cup(x, y) < D.$$

Let $p_{ij} \ge 0$ denote the coefficients of an irreducible $N \times N$ stochastic matrix, so that

$$\sum_{j=0}^{N-1} p_{ij} = 1.$$

Let $(m_0, m_1, \dots, m_{N-1})$ denote an eigenvector such that $m_n > 0$, $m_0 + m_1 + \dots + m_{N-1} = 1$, and

$$\sum_{i=0}^{N-1} m_i p_{ij} = m_j \ \text{ for } j = 0, 1, \dots, N-1,$$

corresponding to eigenvalue one. At least one such eigenvector exists. This eigenvector represents a stationary probability distribution for a Markov process which is represented by the stochastic matrix. Let

$$\widetilde{\mathbb{P}}(K) = \{\nu \in \mathbb{P}(K) : \nu_i(K_i) = m_i\}.$$

Now notice that, for any set of numbers $m_0, m_1, \dots, m_{N-1}$ with $m_n > 0$ for all $n$, $(\widetilde{\mathbb{P}}(K), d_{MK})$ is a complete metric space, where $\nu_i \in \mathcal{M}(K_i)$ denotes the restriction of $\nu \in \mathbb{P}(K)$ to $K_i$. The Monge-Kantorovitch metric on $\mathbb{P}(K)$ and $\widetilde{\mathbb{P}}(K)$ is defined by

$$d_{MK}(\nu, \mu) = \max_{f \in \mathrm{Lip}_1(K)} \int_K f(x)(d\nu(x) - d\mu(x)) = \max_{f \in \mathrm{Lip}_1(K)} \sum_{i=0}^{N-1} \int_{K_i} f|_{K_i}(x_i)(d\nu_i(x_i) - d\mu_i(x_i)),$$

where

$$\mathrm{Lip}_1(K) = \{f : K \to \mathbb{R} : |f(x) - f(y)| \le d_\cup(x, y) \text{ for all } x, y \in K\}.$$

Let place-dependent, piecewise constant probabilities be defined on $K$ by

$$p_j(x) = p_{ij} \ \text{ when } x \in K_i,$$

so that

$$\sum_{j=0}^{N-1} p_j(x) = 1 \ \text{ for each } x \in K.$$

We consider the Markov process on $K$ with transition probability

$$P(x, B) = \sum_{j=0}^{N-1} p_j(x)\chi_B(W_j(x)).$$

The corresponding probability operator is $U^* : \widetilde{\mathbb{P}}(K) \to \widetilde{\mathbb{P}}(K)$, where

$$(U^*\nu)(B) = \int_K P(x, B)\,d\nu(x) = \sum_{i=0}^{N-1}\sum_{j=0}^{N-1} p_{ij}\,\nu(w_{ij}^{-1}(B)) \ \text{ for all } B \in \mathcal{B}(K).$$

Notice that the range of $U^*$ is indeed contained in $\widetilde{\mathbb{P}}(K)$, because

$$(U^*\nu)(K_k) = \sum_{i=0}^{N-1}\sum_{j=0}^{N-1} p_{ij}\,\nu(w_{ij}^{-1}(K_k)) = \sum_{i=0}^{N-1} p_{ik}\,\nu(w_{ik}^{-1}(K_k)) = \sum_{i=0}^{N-1} p_{ik}\,\nu(K_i) = \sum_{i=0}^{N-1} p_{ik}\,m_i = m_k.$$

Let $f : K \to \mathbb{R}$ be Borel measurable, and let us denote the restriction of $f$ to $K_i$ by $f_i$. Then the transfer operator is given by

$$(Uf)(x) = \sum_{j=0}^{N-1} p_j(x)f(W_j(x));$$

the restriction of $Uf$ to $K_i$ is

$$(Uf)|_{K_i}(x) = \sum_{j=0}^{N-1} p_{ij}f|_{K_j}(w_{ij}(x));$$

and we have

$$\int_K Uf(x)\,d\nu(x) = \int_K f(x)\,d(U^*\nu)(x).$$

Theorem 4. Let

$$D \ge \max_i \max_{x,y \in K_i} d_i(x, y)$$

and

$$L_i := \sum_{j=0}^{N-1} p_{ij}l_{ij} > 0 \ \text{ for } i = 0, 1, \dots, N-1.$$

When

$$L := \max_i \sum_{j=0}^{N-1} p_{ij}l_{ij} < 1,$$

the operator $U^* : \widetilde{\mathbb{P}}(K) \to \widetilde{\mathbb{P}}(K)$ is a contraction mapping with contractivity factor $L$, and possesses a unique fixed point. That is, the Markov process has exactly one invariant probability measure $\mu$ such that $\mu(K_i) = m_i$ for $i = 0, 1, \dots, N-1$.

Proof.

(6.21) $$\begin{aligned} d_{MK}(U^*\nu, U^*\mu) &= \max_{f \in \mathrm{Lip}_1(K)} \int_K Uf(x)(d\nu(x) - d\mu(x)) \\ &= \max_{f \in \mathrm{Lip}_1(K)} \sum_{i=0}^{N-1} \int_{K_i} (Uf)|_{K_i}(x_i)(d\nu_i(x_i) - d\mu_i(x_i)) \\ &= \max_{f \in \mathrm{Lip}_1(K)} \sum_{i=0}^{N-1} \int_{K_i} \sum_{j=0}^{N-1} p_{ij}f|_{K_j}(w_{ij}(x_i))(d\nu_i(x_i) - d\mu_i(x_i)) \\ &= \max_{f \in \mathrm{Lip}_1(K)} \sum_{i=0}^{N-1} L_i \int_{K_i} \widetilde{f}|_{K_i}(x_i)(d\nu_i(x_i) - d\mu_i(x_i)), \end{aligned}$$

where

$$\widetilde{f}|_{K_i}(x_i) = \frac{1}{L_i}\sum_{j=0}^{N-1} p_{ij}f|_{K_j}(w_{ij}(x_i))$$

has the property that

$$\left|\widetilde{f}|_{K_i}(x_i) - \widetilde{f}|_{K_i}(y_i)\right| \le d(x_i, y_i) \ \text{ for all } x_i, y_i \in K_i.$$

Now notice that we can alter the value of $\widetilde{f}|_{K_i}$ by a constant without altering the value of the last expression in Equation 6.21. So we define $\widetilde{\widetilde{f}} : K \to \mathbb{R}$ by

$$\widetilde{\widetilde{f}}(x) = \widetilde{f}(x) - \frac{\max_{x' \in K_i} \widetilde{f}(x') + \min_{x' \in K_i} \widetilde{f}(x')}{2} \ \text{ when } x \in K_i, \text{ for } i = 0, 1, \dots, N-1,$$

and we have chosen $D \ge \max_i \max_{x,y \in K_i} d_i(x, y)$. Then the distance between distinct $K_i$'s is greater than the variation of $\widetilde{\widetilde{f}}$, and it follows that

$$\left|\widetilde{\widetilde{f}}(x) - \widetilde{\widetilde{f}}(y)\right| \le d_\cup(x, y) \ \text{ for all } x, y \in K.$$

Hence,

$$\begin{aligned} d_{MK}(U^*\nu, U^*\mu) &\le L \max_{f \in \mathrm{Lip}_1(K)} \sum_{i=0}^{N-1} \int_{K_i} \widetilde{\widetilde{f}}|_{K_i}(x_i)(d\nu_i(x_i) - d\mu_i(x_i)) \\ &= L \max_{f \in \mathrm{Lip}_1(K)} \int_K \widetilde{\widetilde{f}}(x)(d\nu(x) - d\mu(x)) \\ &\le L \max_{g \in \mathrm{Lip}_1(K)} \int_K g(x)(d\nu(x) - d\mu(x)) = L\,d_{MK}(\nu, \mu). \end{aligned}$$

$\square$

6.11. Completion of the proof of Theorem 3 in the case $\gamma = 11\overline{0}$. We simply apply Theorem 4 to the example in Section 6.9. In Example 6.9 we have $N = 4$; the transition matrix $p_{ij}$ is the transpose of the one in Equation 6.19, namely

$$\begin{pmatrix} p_0 & 0 & p_1 & 0 \\ p_0 & 0 & p_1 & 0 \\ 0 & p_0 & 0 & p_1 \\ 0 & 1 & 0 & 0 \end{pmatrix},$$

which possesses a unique stationary probability distribution; and $L = 0.5$. It follows from Theorem 4 that the corresponding symbolic IFS with place-dependent probabilities possesses a unique invariant measure. This measure is non-atomic. It can be mapped to an invariant measure for the corresponding trapped TIFS.
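As a mechanical check of this computation, the following Python fragment verifies that the cylinder masses of Equation 6.20 are fixed by the matrix of Equation 6.19 (equivalently, that the stationary distribution of the transposed matrix above is as claimed); $p_0$ here is an arbitrary test value.

# Check of Equations 6.19 and 6.20 for gamma = 110...

p0 = 0.3
p1 = 1 - p0
c = 1 + p1 + p1 ** 2
nu = [p0 / c, p1 / c, p1 / c, p1 ** 2 / c]   # nu(C00), nu(C01), nu(C10), nu(C11)

M = [[p0, p0, 0, 0],      # nu'(C00) = p0 nu(C00) + p0 nu(C01)
     [0, 0, p0, 1],       # nu'(C01) = p0 nu(C10) +    nu(C11)
     [p1, p1, 0, 0],      # nu'(C10) = p1 nu(C00) + p1 nu(C01)
     [0, 0, p1, 0]]       # nu'(C11) = p1 nu(C10)

new = [sum(M[i][j] * nu[j] for j in range(4)) for i in range(4)]
assert all(abs(a - b) < 1e-12 for a, b in zip(new, nu))
assert abs(sum(nu) - 1) < 1e-12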

6.12. Remark. In Theorem 4, in place of the space K, we could have used the space

$$\mathbf{K} = K_0 \times K_1 \times \dots \times K_{N-1}.$$

Then we would have defined $\nu \in \mathbb{P}_{(m_0, m_1, \dots, m_{N-1})} := \mathbb{P}_{m_0}(K_0) \times \mathbb{P}_{m_1}(K_1) \times \dots \times \mathbb{P}_{m_{N-1}}(K_{N-1})$, where $\mathbb{P}_{m_i}(K_i)$ denotes the set of positive finite Borel measures $\upsilon_i$ on $K_i$ such that $\upsilon_i(K_i) = m_i$. In this case we would write $\nu = (\upsilon_0, \upsilon_1, \dots, \upsilon_{N-1})$. We would then have defined a metric on $\mathbb{P}_{(m_0, m_1, \dots, m_{N-1})}$ according to:

$$d_{MK2}(\nu, \mu) = \sum_{i=0}^{N-1} \max_{f_i \in \mathrm{Lip}_1(K_i)} \int_{K_i} f_i(x_i)(d\nu_i(x_i) - d\mu_i(x_i)).$$

This would have provided us with the complete metric space $(\mathbb{P}_{(m_0, m_1, \dots, m_{N-1})}, d_{MK2})$, and we could have developed our results in this space. Such an approach would have avoided reference to $d_\cup$, and has certain advantages and certain disadvantages. The conclusions regarding contractivity of the probability operator using this alternative framework would have been essentially the same as we obtained in Theorem 4.

6.13. Completion of the proof of Theorem 3 in the cases $\gamma = x{\ldots}xx1\overline{0}$ and $\gamma = x{\ldots}xx0\overline{1}$. We claim that all cases where $\gamma = \gamma_1{\ldots}\gamma_{n-1}1\overline{0}$ work somewhat similarly to the case $\gamma = 11\overline{0}$. The main points to establish are: (i) There exists a finite "core" collection of cylinders $\{C_i^\gamma \subset \Omega : i = 1, 2, \dots, m(\gamma)\}$ such that for each $i \in \{1, 2, \dots, m(\gamma)\}$ there are $j, k \in \{1, 2, \dots, m(\gamma)\}$ such that

$$s_0'(C_i^\gamma) = s_0(C_i^\gamma) \subset C_j^\gamma \quad \text{and} \quad s_1'(C_i^\gamma) := s_1|_{[0,\gamma]}(C_i^\gamma) = s_1([0, \gamma] \cap C_i^\gamma) \subset C_k^\gamma,$$

where $s_0'$ denotes the mapping $s_0 : \Omega \to \Omega$ and $s_1'$ denotes the mapping $s_1|_{[0,\gamma]}$, defined to act on subsets of $\Omega$ according to

$$s_1'(B) = s_1([0, \gamma] \cap B) \ \text{ for all } B \subset \Omega.$$

(ii) Let $\sigma_1\sigma_2\ldots\sigma_n \in \{0, 1\}^n$. Then

$$s_{\sigma_1}' \circ s_{\sigma_2}' \circ \cdots \circ s_{\sigma_n}'(\Omega) = \begin{cases} \text{either } \emptyset \\ \text{or } C_i^\gamma \end{cases}$$

for some $i \in \{1, 2, \dots, m(\gamma)\}$. (iii) The Markov process, restricted to the "core" cylinders, is recurrent, or regionally transitive; that is, one can go from cylinder to cylinder in the core with finite probability in finitely many steps.

Some detail is needed to show these points. This involves direct analysis of the transition matrix

$$(p_{i,j})_{i,j=0}^{2^n - 1},$$

where

$$p_{i,j} = \text{probability of transition from } C_{\sigma_1\sigma_2\ldots\sigma_n} \text{ to } C_{\widetilde{\sigma}_1\widetilde{\sigma}_2\ldots\widetilde{\sigma}_n},$$

where $i = \sigma_1 + 2 \cdot \sigma_2 + \dots + 2^{n-1} \cdot \sigma_n$ and $j = \widetilde{\sigma}_1 + 2 \cdot \widetilde{\sigma}_2 + \dots + 2^{n-1} \cdot \widetilde{\sigma}_n$. In this regard we note the following:

$$p_{l,[l/2]} = p_0 \ \text{ and } \ p_{l,2^{n-1}+[l/2]} = p_1 \ \text{ for all } l = 0, 1, \dots, l_\gamma,$$

$$p_{l,[l/2]} = 1 \ \text{ and } \ p_{l,2^{n-1}+[l/2]} = 0 \ \text{ for all } l = l_\gamma + 1, \dots, 2^n - 1,$$

where

$$l_\gamma := \gamma_1 + 2 \cdot \gamma_2 + \dots + 2^{n-2} \cdot \gamma_{n-1} + 2^{n-1},$$

and $p_{i,j} = 0$ for all values of $i, j$ other than those for which $p_{i,j}$ has already been defined. These values enable us to explicitly demonstrate our assertions regarding eventual regional transitivity and uniqueness of the stationary state of the stochastic matrix. For example, when $\gamma = 101\overline{0}$ we find

$$(p_{i,j}) = \begin{pmatrix} p_0 & 0 & 0 & 0 & p_1 & 0 & 0 & 0 \\ p_0 & 0 & 0 & 0 & p_1 & 0 & 0 & 0 \\ 0 & p_0 & 0 & 0 & 0 & p_1 & 0 & 0 \\ 0 & p_0 & 0 & 0 & 0 & p_1 & 0 & 0 \\ 0 & 0 & p_0 & 0 & 0 & 0 & p_1 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \end{pmatrix}.$$

It follows from (i), (ii) and (iii) that: (iv) If $\upsilon \in \mathbb{P}(\Omega)$ then $(U^*)^n\upsilon \in \mathbb{P}(\cup_{i=1}^{m(\gamma)} C_i^\gamma)$ for all sufficiently large $n$. (v) If $U^*\upsilon = \upsilon$ then $\upsilon \in \mathbb{P}(\cup_{i=1}^{m(\gamma)} C_i^\gamma)$ and $\upsilon(C_i^\gamma)$ is uniquely determined. Indeed, $\upsilon(C_i^\gamma) = \upsilon_i$ for $i = 1, 2, \dots, m(\gamma)$, where $(\upsilon_i)$ is the stationary probability distribution for the corresponding stochastic matrix. Now let $\widetilde{\mathbb{P}}(\cup_{i=1}^{m(\gamma)} C_i^\gamma)$ denote the space of probability measures $\mu$ on $\cup_{i=1}^{m(\gamma)} C_i^\gamma$ such that $\mu(C_i^\gamma) = \upsilon_i$. Then, as in Theorem 4, $U^* : \widetilde{\mathbb{P}}(\cup_i C_i^\gamma) \to \widetilde{\mathbb{P}}(\cup_i C_i^\gamma)$ is a strict contraction, which provides the rest of the proof of Theorem 3 in the case $\gamma = x{\ldots}xx1\overline{0}$.

The case $\gamma = x{\ldots}xx0\overline{1}$ is just like the case $\gamma = x{\ldots}xx1\overline{0}$ after some possible modification on a set of measure zero.

6.14. Another approach to the proof of Theorem 3 in the case $\gamma = x{\ldots}xx1\overline{0}$, using the formulation and a theorem of Parry. Another approach makes connection to the work of W. Parry [9]: in this approach we quote Parry's work to show that the underlying structure is that of an intrinsic Markov chain. However our framework, above, is broader, since we work on all of $\Omega$, which has a further implication: when we consider associated ergodic theorems, which we do not do in these notes at this time, in the "recurrent" case we obtain a stronger result than I. Werner [11] regarding convergence of orbits produced by random iteration in the case of certain graph-directed IFS. This also has implications which are entirely new, so far as we know, for the "not open" case, which corresponds for example to the case where $\gamma$ is irrational, which we discuss in later sections.

* Recurrence of the symbolic system $\{\Omega_\gamma, S|_{\Omega_\gamma}\}$ follows from the fact that the corresponding restricted tops dynamical system $T : [0, 1] \to [0, 1]$ is strictly expanding and, if $O \subset [0, 1]$ is nonempty and open, then $T^{\circ n}(O) = [0, 1]$ for all sufficiently large integers $n$.

* The symbolic system $\{\Omega_\gamma, S|_{\Omega_\gamma}\}$ here is exactly the one considered by W. Parry [9]. By combining Parry's set-up with piecewise-constant probabilities we obtain a framework to which Theorem 4 applies, with $L = 0.5$.

* $\Omega_\gamma$ has the compact, metric, totally disconnected product topology which arises from the product topology on $\Omega$.

* Cylinders in $\Omega_\gamma$ are cylinders in $\Omega$ intersected with $\Omega_\gamma$. Cylinders are both open and closed in the relative topology.

* $\{\Omega_\gamma, S\}$ is an intrinsic Markov chain of order $r$ iff $\Omega_\gamma \cap C_{\sigma_0\sigma_1\ldots\sigma_n} \ne \emptyset$ with $n \ge r$ and $\Omega_\gamma \cap C_{\sigma_{n-r+1}\ldots\sigma_n\sigma_{n+1}} \ne \emptyset$ imply $\Omega_\gamma \cap C_{\sigma_0\sigma_1\ldots\sigma_{n+1}} \ne \emptyset$.

* Regional transitivity is defined as follows: (i) given non-empty cylinders $C, D$ of $\Omega_\gamma$ there exists $n$ such that $S^{\circ n}C \cap D \ne \emptyset$, and (ii) $\cup_{n=0}^{\infty} S^{\circ n}C = \Omega_\gamma$. These two conditions are equivalent when $S : \Omega_\gamma \to \Omega_\gamma$ is open.

We have extracted the following theorem from the first few pages of [9]. The second sentence in our statement is contained in Parry's proof of the first sentence.

Theorem 5. [9] The symbolic dynamical system $\{\Omega_\gamma, S\}$ is an intrinsic Markov chain iff $S : \Omega_\gamma \to \Omega_\gamma$ is open. In this case $\Omega_\gamma$ can be partitioned into a set of non-empty cylinders $\Omega_\gamma \cap C_{\sigma_1\sigma_2\ldots\sigma_n}$, each of the same length $n$, such that

$$S(\Omega_\gamma \cap C_{\sigma_1\sigma_2\ldots\sigma_n}) = \Omega_\gamma \cap C_{\sigma_2\ldots\sigma_n}.$$

6.15. Continuation of the proof of Theorem 3 in the case where the expansion of $\gamma$ does not terminate in $\overline{0}$ or $\overline{1}$. Let $\gamma = \gamma_1\gamma_2\ldots$ not terminate in $\overline{0}$ or $\overline{1}$. Let $\mu^{(n)} \in \mathbb{P}(\Omega)$ denote the unique invariant measure corresponding to $\gamma^{(n)} = \gamma_1\ldots\gamma_n1\overline{0}$; we also write $\mu_{\gamma^{(n)}} = \mu^{(n)}$. Notice that $\gamma = \lim_{n\to\infty} \gamma^{(n)}$. Let $\widehat{\mu}$ denote a limit point of $\{\mu^{(n)} : n = 1, 2, \ldots\}$. Then we prove that $\widehat{\mu}$ is an invariant measure for the process corresponding to that value of $\gamma$.

Let $\widehat{\mu} = \lim_{m \to \infty} \mu^{(n_m)}$, where $\{\mu^{(n_m)}\}_{m=1}^{\infty}$ is an infinite subsequence of $\{\mu^{(n)}\}_{n=1}^{\infty}$. We begin by showing that $U_\gamma^*\widehat{\mu} = \widehat{\mu}$.

We need some more notation. For each positive integer $n$, $C^{(n)}_{\frac{\gamma}{2}}$ denotes the unique cylinder of length $n$ which contains the infinite string $\frac{\gamma}{2}$, and $C^{(n)}_{\frac{\gamma}{2}+\frac{1}{2}}$ denotes the unique

cylinder of length $n$ which contains the infinite string $\frac{\gamma}{2} + \frac{1}{2}$. Notice that neither of these strings terminates in $\overline{0}$ or $\overline{1}$. Let $C^{(n)}_m$, for $m = 0, 1, \dots, 2^n - 1$, denote the unique cylinder of length $n$ which "contains" the number $m$ expressed as a string of length $n$ in binary. For example,

$$C^{(2)}_0 = C_{00}; \quad C^{(2)}_1 = C_{01}; \quad C^{(2)}_2 = C_{10}; \quad C^{(2)}_3 = C_{11}.$$

Notice that $C_{\gamma_1\gamma_2\ldots\gamma_n}$ is the same as the interval

$$[\gamma_1\gamma_2\ldots\gamma_n\overline{0},\ \gamma_1\gamma_2\ldots\gamma_n\overline{1}] := \{\gamma \in \Omega : \gamma_1\gamma_2\ldots\gamma_n\overline{0} \le \gamma \le \gamma_1\gamma_2\ldots\gamma_n\overline{1}\}.$$

We need the following observations. By choosing $f : \Omega \to \mathbb{R}$ to be the characteristic function of a cylinder set $C \subset \Omega$ in Equation 6.18, that is, $f(\omega) = \chi_C(\omega)$, which we can do because $f$ is continuous, we obtain that for any $\mu \in \mathbb{P}(\Omega)$

$$(U^*\mu)(C) = \sum_{i=0}^{1} \int_{\Omega} p_i(\omega)\chi_C(s_i(\omega))\,d\mu(\omega).$$

We can evaluate this expression in the following four cases.

(1) Suppose that $C \le 0\overline{1}$ and $C < \frac{\gamma}{2} = 0\gamma_1\gamma_2\ldots$: then

$$(U^*\mu)(C) = \sum_{i=0}^{1} \int_{\Omega} p_i(\omega)\chi_C(s_i(\omega))\,d\mu(\omega) = p_0\mu(S(C)),$$

because

$$p_0(\omega)\chi_C(s_0(\omega)) = p_0(\omega)\chi_C(0\omega) = \begin{cases} p_0 & \text{when } 0\omega \in C, \text{ i.e. } \omega \in S(C) \text{ (note } 0\omega \in C \text{ implies } \omega < \gamma\text{)}, \\ 0 & \text{when } \omega \notin S(C), \end{cases}$$

and $p_1(\omega)\chi_C(s_1(\omega)) = p_1(\omega)\chi_C(1\omega) = 0$ for all $\omega \in \Omega$.

(2) Suppose that $C \le 0\overline{1}$ and $C > \frac{\gamma}{2} = 0\gamma_1\gamma_2\ldots$: then

$$(U^*\mu)(C) = \sum_{i=0}^{1} \int_{\Omega} p_i(\omega)\chi_C(s_i(\omega))\,d\mu(\omega) = \mu(S(C)),$$

because

$$p_0(\omega)\chi_C(s_0(\omega)) = p_0(\omega)\chi_C(0\omega) = \begin{cases} 1 & \text{when } 0\omega \in C, \text{ i.e. } \omega \in S(C) \text{ (note } 0\omega \in C \text{ implies } \omega > \gamma\text{)}, \\ 0 & \text{when } \omega \notin S(C), \end{cases}$$

and

$$p_1(\omega)\chi_C(s_1(\omega)) = p_1(\omega)\chi_C(1\omega) = 0.$$

(3) Suppose that $C \ge 1\overline{0}$ and $C < \frac{\gamma}{2} + \frac{1}{2} = 1\gamma_1\gamma_2\ldots$: then we find

$$(U^*\mu)(C) = \sum_{i=0}^{1} \int_{\Omega} p_i(\omega)\chi_C(s_i(\omega))\,d\mu(\omega) = p_1\mu(S(C)),$$

because

$$p_1(\omega)\chi_C(s_1(\omega)) = p_1(\omega)\chi_C(1\omega) = \begin{cases} p_1 & \text{when } 1\omega \in C, \text{ i.e. } \omega \in S(C) \text{ (note } 1\omega \in C \text{ implies } \omega < \gamma\text{)}, \\ 0 & \text{when } \omega \notin S(C), \end{cases}$$

and $p_0(\omega)\chi_C(s_0(\omega)) = p_0(\omega)\chi_C(0\omega) = 0$ for all $\omega \in \Omega$.

(4) Suppose that $C \ge 1\overline{0}$ and $C > \frac{\gamma}{2} + \frac{1}{2} = 1\gamma_1\gamma_2\ldots$: then

$$(U^*\mu)(C) = \sum_{i=0}^{1} \int_{\Omega} p_i(\omega)\chi_C(s_i(\omega))\,d\mu(\omega) = 0,$$

because $p_1(\omega)\chi_C(s_1(\omega)) = p_1(\omega)\chi_C(1\omega) = 0$ for all $\omega \in \Omega$ (when $1\omega \in C$ we have $\omega > \gamma$, so $p_1(\omega) = 0$), and $p_0(\omega)\chi_C(s_0(\omega)) = p_0(\omega)\chi_C(0\omega) = 0$ for all $\omega \in \Omega$.

Putting the four results together, we obtain:

(6.22) $$(U^*\mu)(C) = \begin{cases} p_0\mu(S(C)) & \text{if } C \le 0\overline{1} \text{ and } C < \frac{\gamma}{2} = 0\gamma_1\gamma_2\ldots, \\ \mu(S(C)) & \text{if } C \le 0\overline{1} \text{ and } C > \frac{\gamma}{2} = 0\gamma_1\gamma_2\ldots, \\ p_1\mu(S(C)) & \text{if } C \ge 1\overline{0} \text{ and } C < \frac{\gamma}{2} + \frac{1}{2} = 1\gamma_1\gamma_2\ldots, \\ 0 & \text{if } C \ge 1\overline{0} \text{ and } C > \frac{\gamma}{2} + \frac{1}{2} = 1\gamma_1\gamma_2\ldots. \end{cases}$$

Now let $\epsilon > 0$ be given, and let $C$ be any cylinder set that does not contain either of the points $\frac{\gamma}{2}$ or $\frac{\gamma}{2} + \frac{1}{2}$. Then Equation 6.22 implies that, for integers $n$ sufficiently large that the corresponding points for $\gamma^{(m_n)}$ also avoid $C$, we have

$$\left|(U_\gamma^*\widehat{\mu})(C) - (U_\gamma^*\mu_{\gamma^{(m_n)}})(C)\right| = \{\text{either } p_0 \text{ or } 1 \text{ or } p_1 \text{ or } 0\} \cdot \left|\widehat{\mu}(S(C)) - \mu_{\gamma^{(m_n)}}(S(C))\right|.$$

But for sufficiently large $n$ we have $\left|\widehat{\mu}(S(C)) - \mu_{\gamma^{(m_n)}}(S(C))\right| < \epsilon$. So for sufficiently large $n$ we have

$$\left|(U_\gamma^*\widehat{\mu})(C) - (U_\gamma^*\mu_{\gamma^{(m_n)}})(C)\right| < \epsilon.$$

It follows that for all sufficiently large $n$ we have

$$\left|(U_\gamma^*\widehat{\mu})(C) - \mu_{\gamma^{(m_n)}}(C)\right| < \epsilon.$$

It follows that if $C$ is any cylinder set that does not contain either of the points $\frac{\gamma}{2}$ or $\frac{\gamma}{2} + \frac{1}{2}$, then $(U_\gamma^*\widehat{\mu})(C) = \widehat{\mu}(C)$. It follows that

$$(U_\gamma^*\widehat{\mu})(B) = \widehat{\mu}(B) \ \text{ for all } B \in \mathcal{B}(\Omega) \text{ such that } \tfrac{\gamma}{2} \notin B \text{ and } \tfrac{\gamma}{2} + \tfrac{1}{2} \notin B.$$

We now demonstrate that $\widehat{\mu}(\{\frac{\gamma}{2}\}) = \widehat{\mu}(\{\frac{\gamma}{2} + \frac{1}{2}\}) = 0$. Suppose either $\widehat{\mu}(\{\frac{\gamma}{2}\}) \ne 0$ or $\widehat{\mu}(\{\frac{\gamma}{2} + \frac{1}{2}\}) \ne 0$. Then we claim that $\widehat{\mu}(\{\gamma\}) \ne 0$. For let $B = \Omega \setminus \{\frac{\gamma}{2}, \frac{\gamma}{2} + \frac{1}{2}\}$. Then, since $(U_\gamma^*\widehat{\mu})(B) = \widehat{\mu}(B)$ and $(U_\gamma^*\widehat{\mu})(\Omega) = \widehat{\mu}(\Omega) = 1$, it follows that $(U_\gamma^*\widehat{\mu})(\{\frac{\gamma}{2}, \frac{\gamma}{2} + \frac{1}{2}\}) = \widehat{\mu}(\{\frac{\gamma}{2}, \frac{\gamma}{2} + \frac{1}{2}\})$. But $(U_\gamma^*\widehat{\mu})(\{\frac{\gamma}{2}, \frac{\gamma}{2} + \frac{1}{2}\}) = (p_0 + p_1)\widehat{\mu}(\{\gamma\}) = \widehat{\mu}(\{\gamma\})$. So it follows that $\widehat{\mu}(\{\gamma\}) \ne 0$.

We now proceed much as in the proof of Theorem 2 to show that $\widehat{\mu}(\{\gamma\}) = 0$. Suppose that $\gamma$ is not eventually periodic. Then we find that $\widehat{\mu}(\{S^{\circ n}\gamma\}) \ge \widehat{\mu}(\{\gamma\})$ which, since $\{S^{\circ n}\gamma\}_{n=1}^{\infty}$ is an infinite sequence of distinct points, implies $\widehat{\mu}(\Omega) = \infty$, which is wrong. On the other hand, suppose that $\gamma$ is eventually periodic, with eventual period $k > 1$. Then we find that for some finite positive integer $n$, $\widehat{\mu}(\{S^{\circ n}\gamma\}) < \widehat{\mu}(\{S^{\circ(n+k)}\gamma\}) = \widehat{\mu}(\{S^{\circ n}\gamma\})$, which is impossible. So $\widehat{\mu}(\{\frac{\gamma}{2}\}) = \widehat{\mu}(\{\frac{\gamma}{2} + \frac{1}{2}\}) = 0$. It follows that $(U_\gamma^*\widehat{\mu})(B) = \widehat{\mu}(B)$ for all $B \in \mathcal{B}(\Omega)$, i.e. $\widehat{\mu}$ is indeed an invariant measure, as desired.

The interesting issue is in relation to uniqueness. First let $\mathbb{P}_{\widehat{\mu}}(\Omega)$ denote the set of all solutions $\rho \in \mathbb{P}(\Omega)$ of $U_\gamma^*\rho = \rho$ such that $\rho([0, \gamma]) = \widehat{\mu}([0, \gamma])$. Then (below) we demonstrate, much as in Theorem 4, that $U_\gamma^* : \mathbb{P}_{\widehat{\mu}}(\Omega) \to \mathbb{P}_{\widehat{\mu}}(\Omega)$ is a contraction with respect to $d_{MK}$. From this it follows that there is at most one solution with prescribed mass on $[0, \gamma] \subset \Omega$.

The contraction argument goes as follows. Let $\mu, \upsilon \in \mathbb{P}(\Omega)$ be such that

$$\mu(\{\omega \in \Omega : \omega < \gamma\}) = \upsilon(\{\omega \in \Omega : \omega < \gamma\}).$$

Then

$$d_{MK}(U^*\upsilon, U^*\mu) = \max_{f \in \mathrm{Lip}_1(\Omega)} \int_{\Omega} Uf(\omega)(d\upsilon(\omega) - d\mu(\omega))$$

$$= \max_{f \in \mathrm{Lip}_1(\Omega)} \int_{\Omega} \left(p_0(\omega)f(s_0(\omega)) + p_1(\omega)f(s_1(\omega))\right)(d\upsilon(\omega) - d\mu(\omega))$$

$$= \max_{f \in \mathrm{Lip}_1(\Omega)} \Big\{ \int_{[0,\gamma)} \left(p_0 f(s_0(\omega)) + p_1 f(s_1(\omega))\right)(d\upsilon(\omega) - d\mu(\omega))$$

$$+ \int_{[\gamma,1]} f(s_0(\omega))(d\upsilon(\omega) - d\mu(\omega)) \Big\}$$

$$= \max_{f \in \mathrm{Lip}_1(\Omega)} \Big\{ \int_{[0,\gamma)} \left(p_0 f(s_0(\omega)) + p_1 f(s_1(\omega))\right)(d\upsilon(\omega) - d\mu(\omega))$$

$$+ \int_{[\gamma,1]} \left(f(s_0(\omega)) + c\right)(d\upsilon(\omega) - d\mu(\omega)) \Big\},$$

where $c \in \mathbb{R}$; the constant $c$ may be introduced because $\upsilon$ and $\mu$ assign the same mass to $[\gamma, 1]$.

Now notice that we can choose $c$ so that the function $\widetilde{f} : \Omega \to \mathbb{R}$ defined by

$$\widetilde{f}(\omega) = \begin{cases} p_0f(s_0(\omega)) + p_1f(s_1(\omega)) & \omega \in [0, \gamma), \\ f(s_0(\omega)) + c & \omega \in [\gamma, 1], \end{cases}$$

is continuous. Then $\widetilde{f}$ is Lipschitz with constant $\frac{1}{2}$, that is, $2\widetilde{f} \in \mathrm{Lip}_1(\Omega)$. It follows that

$$d_{MK}(U^*\upsilon, U^*\mu) \le \frac{1}{2}d_{MK}(\upsilon, \mu).$$

It follows that there can exist at most one solution with any given mass between zero and one on the interval $[0, \gamma)$.

Now suppose that there are three linearly independent invariant probability measures, solutions to $U_\gamma^*\mu = \mu$, say $\mu_1, \mu_2, \mu_3$. Then each of these solutions assigns a different mass to $[0, \gamma]$, and so we can suppose that

$$\mu_1([0, \gamma]) < \mu_2([0, \gamma]) < \mu_3([0, \gamma]).$$

But then we could construct a linear combination of $\mu_1$ and $\mu_3$ which has the same mass as $\mu_2$ on $[0, \gamma]$, yielding a contradiction. Hence there exists at least one, and at most two, linearly independent solutions. Hence, via Choquet's theorem, there are at most two ergodic invariant probability measures.

7. A Random Iteration Algorithm which Stays On Top

Here we describe a simple method for randomly iterating on the top of an IFS.

Acknowledgement 1. The author thanks John Hutchinson for many useful discussions and much help with this work. The author thanks Louisa Barnsley for producing the graphics.

References

[1] M. F. Barnsley, Fractals Everywhere, Academic Press, New York, NY, 1988.
[2] M. F. Barnsley and S. Demko, Iterated Function Systems and the Global Construction of Fractals, R. Soc. Lond. Proc. Ser. A Math. Phys. Eng. Sci., 399 (1985), pp. 243-275.
[3] M. F. Barnsley and L. F. Barnsley, Fractal Transformations, in "The Colours of Infinity: The Beauty and Power of Fractals", by Ian Stewart et al., Clear Books, London, 2004, pp. 66-81.
[4] M. F. Barnsley, informal lecture notes entitled Fractal Tops and Colour Stealing, 2004.
[5] M. F. Barnsley, J. E. Hutchinson, and O. Stenflo, A Fractal Valued Random Iteration Algorithm and Fractal Hierarchy, 2003, to appear in Fractals.
[6] J. Elton, An Ergodic Theorem for Iterated Maps, Ergodic Theory Dynam. Systems, 7 (1987), pp. 481-488.
[7] J. E. Hutchinson, Fractals and Self-Similarity, Indiana Univ. Math. J., 30 (1981), pp. 713-749.
[8] N. Lu, Fractal Imaging, Academic Press, 1997.
[9] W. Parry, Symbolic Dynamics and Transformations of the Unit Interval, Trans. Amer. Math. Soc., 122 (1966), pp. 368-378.
[10] A. Rényi, Representations for Real Numbers and Their Ergodic Properties, Acta Math. Acad. Sci. Hung., 8 (1957), pp. 477-493.
[11] I. Werner, Ergodic Theorem for Contractive Markov Systems, Nonlinearity, 17 (2004), pp. 2303-2313.

Australian National University, Canberra, ACT, Australia
E-mail address: [email protected]