Ergod. Th. & Dynam. Sys. (0), 0, 1–29 Printed in the United Kingdom c 0 Cambridge University Press

On the Topological Entropy of Saturated Sets

C.-E. Pfister† and W.G. Sullivan‡ † EPF-L, Institut d’analyse et calcul scientifique CH-1015 Lausanne, Switzerland (e-mail: charles.pfister@epfl.ch) ‡ School of Mathematical Sciences, UCD Dublin, Belfield, Dublin 4, Ireland (e-mail: [email protected])

(Received 18 November 2005 )

Abstract.Let(X, d, T) be a dynamical system, where (X, d)isacompactmetric space and T : X → X a continuous map. We introduce two conditions for the set of orbits, called respectively g-almost product property and uniform separation property.Theg-almost product property holds for dynamical systems with the specification property, but also for many others. For example all β-shifts have the g-almost product property. The uniform separation property is true for expansive and more generally asymptotically h-expansive maps. Under these two conditions we compute the topological entropy of saturated sets. If the uniform separation condition does not hold, then we can compute the topological entropy of the set of generic points, and show that for any invariant probability measure µ, the (metric) entropy of µ is equal to the topological entropy of generic points of µ.Wegivean application of these results to multi-fractal analysis and compare our results with those of Takens and Verbitskiy (Ergod. Th. & Dynam. Sys. 23 (2003)).

1. Introduction In this paper a dynamical system (X, d, T) means always that (X, d)isacompact metric space and T : X → X is a continuous map. The set M(X)ofall (Borel) probability measures is a compact space for the weak*- of measures, and M(X, T) is the of T -invariant probability measures with the induced topology. We are interested in comparing the (metric) entropy of µ ∈ M(X, T) with the topological entropy, which is a measure of the complexity of the dynamical system. In order to do that we study time-averages along orbits and introduce the empirical measure (of order n)ofx ∈ X, which is the probability measure

n−1 1 En(x):= δT kx , n k=0

Prepared using etds.cls [Version: 1999/07/21 v1.0] 2 C-E. Pfister and W.G. Sullivan where δy is the Dirac mass at y ∈ X. A subset D ⊂ X is called saturated if x ∈ D and the sequences {En(x)} and {En(y)} have the same limit-point set, then y ∈ D. The limit-point set of {En(x)} is always a compact connected subset V (x) ⊂ M(X, T). Of particular interest are the points x ∈ X such that V (x)={µ}. For such points, the time-average along the orbit of x of any continuous function f is equal to the average of f with respect to µ (space-average)

n−1 1 lim f(T jx)= fdµ. n→∞ n j=0 X

These points are called generic points of µ.ByGµ we denote the saturated set of generic points of µ, and more generally, if K ⊂ M(X, T) is a compact connected subset, we denote by GK the saturated set of points x such that V (x)=K.In [Bo2] Bowen proved the remarkable result that

htop(T,Gµ)=h(T,µ)ifµ is ergodic. (1)

Here htop(T,Gµ) is the topological entropy of µ and h(T,µ) is the metric (or Kolmogorov-Sinai) entropy of µ. These concepts are quite different. The notion of metric entropy is affine while the Bowen entropy is a dimension-like characteristic. Generalization and extension of Bowen’s result are given in [PePi]. Essential in these papers is the fact that one deals with large sets in the measure theoretic sense. Indeed the Ergodic Theorem implies that µ is ergodic if and only if µ(Gµ) = 1, while for non-ergodic measures ν one always has ν(Gν ) = 0. For non-ergodic measures the situation is quite different. It is clear that (1) cannot hold for general µ ∈ M(X, T) since it is not difficult to provide examples with Gµ = ∅ and h(T,µ) > 0. However, the following inequalities hold (Theorem 4.1) for µ ∈ M(X, T),

htop(T,Gµ) ≤ h(T,µ) , (2) and for K ⊂ M(X, T), a compact connected set,

htop(T,GK ) ≤ inf{h(T,µ):µ ∈ K} . (3)

Formulas (2) and (3) are subcases of Bowen’s variational principle. To treat non-ergodic measures we introduce two conditions. The first one is a weaker form of the notion of specification, which is verified for a large class of interesting and important systems. The new condition (see Section 2), called g- almost product property, is inspired by our previous works [PfS1] and in particular [PfS2], where we introduced a similar condition for deriving large deviations estimates. When dealing with the topological entropy we had to strengthen a little bit the property introduced in [PfS2], but the principal feature is retained: while for the specification property one requires the existence of an orbit, which ε-shadows all specified orbit-segments, we only require the existence of an orbit, which partially ε-shadows the specified orbit-segments (see Section 2). In the case of β-shifts it is known that the specification property holds for a set of β of Lebesgue measure 0 (see [Sc]), whereas the g-almost product property always holds (see Section 2). To

Prepared using etds.cls Topological Entropy of Saturated Sets 3 deal with the g-almost product property we must consider the notion introduced in [PfS2]ofa(δ, n, ε)-separated subset. For δ>0andε>0, two points x and y are (δ, n, ε)-separated if card{j : d(T jx, T jy) >ε,0 ≤ j ≤ n − 1}≥δn . A subset E is (δ, n, ε)-separated if any pair of different points of E are (δ, n, ε)- separated. The reason for introducing this stronger separation property comes from the basic principle for error-correcting codes: in order to correct errors one introduces more information then a priori necessary. Here the stronger form of separation compensates the fact that we require only partial shadowing. We proved in [PfS2] the following proposition. Let F ⊂ M(X)beaneighborhood;weset

Xn,F := {x ∈ X : En(x) ∈ F } . Proposition 1.1. Let ν ∈ M(X, T) be ergodic and h∗ 0 and ε∗ > 0 so that for each neighborhood F of ν in M(X),thereexists ∗ ∗ ∗ ∗ nF,ν ∈ N such that for any n ≥ nF,ν,thereexistsa(δ ,n,ε)-separated set Γn, such that nh∗ Γn ⊂ Xn,F and |Γn|≥2 . The second condition is introduced in Section 3; it is called uniform separation property. It states that the affirmation of Proposition 1.1 holds uniformly, that is, δ∗ and ε∗ can be chosen independently of the ergodic measure ν. Expansive dynamical systems, and more generally asymptotic h-expansive dynamical systems ([Bo1]and [Mi1]), have the uniform separation property (Theorem 3.1). Let F ⊂ M(X)bea neighborhood of ν,andε>0. We define

N(F ; n, ε) := maximal cardinality of a (n, ε)-separated subset of Xn,F .

Following Kolmogorov, the quantity N(F ; n, ε)isthe(n, ε)-capacity of the set Xn,F (see [W] p.170). Let 1 1 s(ν; ε):=inf lim inf log2 N(F ; n, ε) , s(ν; ε):=inf lim sup log2 N(F ; n, ε) . F ν n→∞ n F ν n→∞ n In these definitions we can take the infimum over any base of neighborhoods of ν. For any ergodic measure† µ, and general dynamical system (Corollary 3.2) lim s(µ; ε) = lim s(µ; ε)=h(T,µ) . ε→0 ε→0 If the uniform separation condition is true and the ergodic measures are entropy- dense, that is, for each ν ∈ M(X, T), each neighborhood F ν,andh∗

Prepared using etds.cls 4 C-E. Pfister and W.G. Sullivan

Theorem 1.1. If the g-almost product property and the uniform separation property hold, then for any compact connected non- K ⊂ M(X, T)

inf{h(T,µ):µ ∈ K} = htop(T,GK ) .

The difficult part of the proof is to get a lower bound for htop(T,GK ). The method consists in constructing a subset of GK , which is simple enough, so that we can estimate its topological entropy. The construction, which generalizes the construction of normal numbers by Champernowne [Ch], is inspired by a similar construction in [PfS1]. In Section 6 we drop the condition of uniform separation and assume only the g-almost product property. We prove (Proposition 6.1) that the g-almost product property alone implies (4). We are able to treat the particular case (but the most important one) of saturated sets of generic points for any invariant probability measure. Adapting the method of proof of Section 5, the main result of Section 6 is Theorem 1.2. If the g-almost product property holds, then

htop(T,Gµ)=h(T,µ) ∀ µ ∈ M(X, T) .

In the last section we give an application of our results to multi-fractal analysis; we compute the entropy spectrum of ergodic averages. Let ϕ be a continuous function with values in a topological vector space Y , such that the map Φ:M(X, T) → Y,Φ(µ):= ϕdµ X is continuous. Let a ∈ Y and consider the level-set n−1 1 j Ka := x ∈ X : lim ϕ(T x)=a . n→∞ n j=0

If the g-almost property holds, then htop(T,Ka)=sup h(T,µ):µ ∈ M(X, T) , ϕdµ= a . (5) X Finally we compare (5) with the corresponding (weaker) result in [TV], and comment about their proof. We write Z+ for the set of nonnegative integers N ∪{0}.Ifr, s ∈ Z+, r ≤ s,we set [r, s]:={k ∈ Z+ : r ≤ k ≤ s},andΛn := [0,n− 1]. The cardinality of a finite set Λ is denoted by |Λ|.Weset f,µ := fdµ. X

There exists a countable and separating set of continuous functions {f1,f2,...} with 0 ≤ fk(x) ≤ 1, and such that −k d(µ, ν) ≡µ − ν := 2 | fk,µ− ν | (6) k≥1

Prepared using etds.cls Topological Entropy of Saturated Sets 5 defines a metric for the weak*-topology on M(X, T). Notice that µ − ν≤1. In the rest of the paper we find convenient to use an equivalent metric on X, still denoted by d, d(x, y):=d(δx,δy) . (7)

We refer to [W] for Ergodic Theory, and to [Pe] for Dimension Theory. The problem of the equality of the topological entropy of generic points of an invariant measure and its metric entropy has a long history. Let us mention the classical earlier results by Eggleston [Eg], Wegmann [We] and Colebrook [Co]. Two of the main impulses to this field are Cajar’s monograph [C] and Young’s paper [Y]. More recent references include [O2]and[Te]. Another important related work is Barreira and Schmeling’s paper [BaSc] about Hausdorff dimension and topological entropy of sets of “non-typical” points.

2. g-almost product property Let x ∈ X.Thedynamical ball Bn(x, ε)istheset j j Bn(x, ε):= y ∈ X :max{d(T x, T y) ≤ ε : j ∈ Λn} .

Apointx ∈ Xε-shadows a sequence {x0,x1,...,xk} if

j d(T x, xj) ≤ ε ∀ j =0,...,k.

We first recall the definition of specification.

Definition 2.1. The dynamical system (X, d, T) has the specification property if the following is true. For any ∆ > 0 there exists k(∆) such that for any + finite collection of p intervals Ij =[aj,bj] ⊂ Z , j =0,...,p − 1, such that aj − bj−1 ≥ k(∆), j =1,...,p− 1, and any collection of p points {x0,...,xp−1}, there exists x ∈ X such that

aj +m m d(T x, T xj) ≤ ∆ ∀ m =0,...,bj − aj and ∀ j =0,...,p− 1 .

Our main definition is Definition 2.3. The terminology of Definition 2.2 is taken from [S].

Definition 2.2. Let g : N → N be a given nondecreasing unbounded map with the properties g(n) g(n) 0.Theg-blowup of Bn(x, ε) is the closed set j j Bn(g; x, ε):= y ∈ X : ∃Λ ⊂ Λn , |Λn\Λ|≤g(n) and max{d(T x, T y):j ∈ Λ}≤ε .

Definition 2.3. The dynamical system (X, d, T) has the g-almost product property with blowup function g, if there exists a nonincreasing function m : R+ → N,such

Prepared using etds.cls 6 C-E. Pfister and W.G. Sullivan that for any k ∈ N,anyx1 ∈ X,...,xk ∈ X,anypositiveε1,...,εk,andany integers n1 ≥ m(ε1),...,nk ≥ m(εk), k −Mj−1  ∅ T Bnj (g; xj,εj) = , j=1 where M0 := 0, Mi := n1 + ···+ ni, i =1,...,k− 1. The main idea expressed in this definition is that one requires only partial shadowing of the specified orbit segments, contrary to Definition 2.1. If (Y,d,S)is afactorof(X, d, T), and (X, d, T)hastheg-almost product property, then (Y,d,S) has the g-almost product property. Indeed, let ψ : X → Y be a continuous surjection such that ψ ◦ T = S ◦ ψ.Givenε > 0, there exists ε>0 such that if d(x, y) ≤ ε,thend(ψ(x),ψ(y)) ≤ ε. Then define m(ε):=m(ε). In particular, the g-almost product property does not depend on the choice of the metric, as long as the topology on X remains the same. Example. All β-shifts have the g-almost product property. We use the notations of Section 5 in [PfS2]. Let β>1bearealnumber,andA := {0, 1,...,β}.The β-shift is a subshift Xβ ⊂ AN. The shift map is denoted by T ,andtheelementsof β X are denoted by ω = {ωi}. The language L of the β-shift is the set of all words k β ω1 ≡ (ω1,...,ωk)withω ∈ X and k ∈ N. The set of all words of L of length k is denoted by Lk. For the separating continuous functions {f1,f2,...} we choose the functions 1ifwisaprefixofω fw(ω):= with w ∈L. 0 otherwise

We enumerate these functions by enumerating first the functions with w ∈L1,then  β those with w ∈L2 and so on. If ω and ω are two points of X having a common largest prefix of length k, then by our definition of the distance (see (6)) k  −k d(ω, ω ) ≤ 2 with k := |Lj| . j=1

The main observation is that, given any w ∈Lwe can find a word w ∈L, which differs from w by at most one character, and such that for any w ∈L, the concatenated word ww  ∈L(see Proposition 5.1, in particular formula (5.7) in β n β n [PfS2]). Let ω ∈ X with w = ω1 ,andω ∈ X with the prefix w= ω1 . Suppose that w and w differ only at the character j,1≤ j ≤ n.Then 2−(j−1)−m if m =0,...,j− 1 d(T mω, T mω) ≤ (8) 2−n−m if m = j,...,n− 1.

From (8) it follows that we can take any blowup function g and determine the corresponding function m. Proposition 2.1. Let g be any blowup function and (X, d, T) have the specification property. Then it has the g-almost product property.

Prepared using etds.cls Topological Entropy of Saturated Sets 7

Proof: We may assume that the function k in Definition 3.1 is nonincreasing. Let {x1,...,xp} and {ε1,...,εp} be given. Let  −j ∆j := 2 j ∈ N .   Notice that ∆j = k>j ∆k. Define   min{m : g(m) ≥ 2k(∆j)} if ε =2∆j m(ε):=   m(2∆j)ifj is the maximum of i with 2∆i ≤ ε.  It is sufficient to prove the statement for εi of the form 2∆ji , i =1,...,p,where,  as above ji is the maximum of k such that 2∆k ≤ εi. Precisely, if εi is not of that  form, we change it into 2∆ji . From now on we assume that, for all i, εi is of the  form 2∆ji and ni > m(εi).  We prove the proposition by an iterative construction. Let ∆(xi):=εi/2=∆ji and n(xi):=ni. The sequence {x1,...,xp} is considered as an ordered sequence; its elements are called original points. The possible values of ∆(xi) are written ∆1 > ∆2 >...>∆q.Alevel-k point is by definition an original point xj such that ∆(xj)=∆k.Weset ki := k(∆i) i =1,...q. There are q steps; at each step we replace some of consecutive points by single points, called concatenated points, and obtain in this way a new ordered sequence, consisting of original and concatenated points. After the q steps have been completed, the sequence is reduced to a single element y such that p ∈ −Mj−1 y T Bnj (g; xj,εj) . j=1 At step 1 we consider the level-1 points labeled by

S1 := {k ∈ [1,p]:∆(xk)=∆1} .

If S1 =[1,p], then by the specification property there exists y such that m m d T x1,T y ≤ ∆1 ∀ m = k1,...,n(x1) − k1 − 1 m n2+m d T x2,T y ≤ ∆1 ∀ m = k1,...,n(x2) − k1 − 1 ···≤··· m n1+···+np−1+m d T xp,T y ≤ ∆1 ∀ m = k1,...,n(xp) − k1 − 1 , which proves this case. If S1 =[1 ,p], then we decompose it into maximal subsets of consecutive points, called components. (The components are defined with respect to the whole sequence.) Let J be a component, say [r, s]withr

Prepared using etds.cls 8 C-E. Pfister and W.G. Sullivan

Here nr + ···+ ns−1 stands for n(xr)+···+ n(xs−1). This implies that

∈ ∩···∩ −(nr +···+ns−1) y Bnr (g; xr,εr) T Bns (g; xs,εs) .

We replace the sequence {x1,...,xp} by the (ordered) sequence

{x1,...,xr−1,y,xs+1 ...,xp} , and set, for the concatenated point y,

∆(y):=∆1 n(y):=n(xr)+···+ n(xs) .

We do this operation for all components which are not singletons. After these { } ≤ operations we have a new (ordered) sequence z1,...,zp1 , p1 p,wherethepoint zi is either a point of the original sequence, or a concatenated point. This ends the construction at step 1. Let

S2 := {k ∈ [1,p1]:∆(zk) ≥ ∆2} .

We decompose this set into components. Let [r, s] be a component which is not a singleton (r

In that formula nr + ···+ ns−1 stands for n(zr)+···n(zs−1). Existence of such a y is a consequence of the specification property. We set

∆(y):=∆2 n(y):=n(zr)+···+ n(zs) .

The construction of y involves consecutive points of the original sequence (via the   concatenated points), say points xj, j ∈ [u, t]. Since ∆j = k>j ∆k,

∈ ∩···∩ −(nu+···+nt−1) y Bnu (g; xu,εu) T Bnt (g; xt,εt) .

We do these operations for all components of S2, which are not singletons. We get { } a new ordered sequence, still denoted by z1,...,zp2 . This ends the construction at level 2. The construction at level 3 is similar to the construction at level 2, using

S3 := {k ∈ [1,p2]:∆(zk) ≥ ∆3} .   Once step q is completed, since ∆j = k>j ∆k,wehaveasingleconcatenated point y such that for all i =1,...,p m n1+···+ni−1+m d T xi,T y ≤ εi ∀ m = ki,...,ni − ki − 1 .

Prepared using etds.cls Topological Entropy of Saturated Sets 9

Corollary 2.1. Assume that (X, d, T) has the specification property. There exists a nonincreasing function k : R+ → N with the following property. For   y1 ∈ X,...,yp ∈ X, ε1 > 0,...,εp > 0,andn1 ∈ N,...,np ∈ N,thereexists z ∈ X such that

Mk+m m  d(T z,T yk) ≤ εk ∀ m =0,...nk − 1 and k =1,...,p, with k−1  M1 := k(ε1) and Mk := 2k(εi)+ni + k(εk) k =2,...,p. i=1

k(ε ) Proof: Specification implies surjectivity, so we may choose xi so that T i xi = yi.  With ni := ni +2k(εi) we then proceed as in the proof of Proposition 2.1. We denote a ball in M(X)by

B(ν, ζ):={µ ∈ M(X):d(ν, µ) ≤ ζ} .

Lemma 2.1. Assume that (X, d, T) has the g-almost product property. Let x1 ∈ X,...,xk ∈ X, ε1 > 0,...,εk > 0,andn1 ≥ m(ε1),...,nk ≥ m(εk) be given. Assume that E ∈B nj (xj) (νj,ζj) j =1,...,k. k ∈ −Mj−1 Then for any y j=1 T Bnj (g; xj,εj) and any probability measure α

k E ≤ nj  d Mk (y),α ζj + d(νj,α) , j=1 Mk where Mi = n1 + ···+ ni and

 g(ni) ζi = ζi + εi + i =1,...,k. ni Proof:Wehave k nj E E Mj−1 Mk (y)= nj (T y) , j=1 Mk and, because of our choice (6) of the distance on X,

nj −1 1 g(nj) nj − g(nj) E E Mj−1 ≤ m Mj−1+m ≤ d nj (xj), nj (T y) d(T xj,T y) + εj . nj m=0 nj nj The result follows from the triangle inequality and the definition of the distance (6): k nj E ≤ E Mj−1 d Mk (y),α d nj (T y),α . j=1 Mk

Prepared using etds.cls 10 C-E. Pfister and W.G. Sullivan

3. Uniform separation Let F ⊂ M(X) be a neighborhood of ν ∈ M(X, T). Recall that

Xn,F := {x ∈ X : En(x) ∈ F } ,

N(F ; n, ε) := maximal cardinality of a (n, ε)-separated subset of Xn,F , (9) and define

N(F ; δ, n, ε) := maximal cardinality of a (δ, n, ε)-separated subset of Xn,F .

Let ξ = {Vi : i =1,...,k}, be a finite partition of measurable sets of X.The entropy of ν ∈ M(X) with respect to ξ is H(ν, ξ):=− ν(Vi)log2 ν(Vi).

Vi∈ξ ∨n −k ∈ We write T ξ := k∈Λn T ξ.Theentropy of ν M(X, T) with respect to ξ is 1 h(T,ν,ξ) := lim H(ν, T ∨nξ) , n n and the metric entropy of ν is h(T,ν):=supξ h(T,ν,ξ).

3.1. Uniform separation property Definition 3.1. The dynamical system (X, d, T) has uniform separation property if the following holds. For any η,thereexistδ∗ > 0 and ε∗ > 0 so that for µ ergodic ∗ ∗ and any neighborhood F ⊂ M(X) of µ,thereexistsnF,µ,η, such that for n ≥ nF,µ,η,

N(F ; δ∗,n,ε∗) ≥ 2n(h(T,µ)−η) .

Remark. Notice that uniform separation implies that htop(T,X) is finite. Indeed,

2n(h(T,µ)−η) ≤ N(F ; δ∗,n,ε∗) ≤ N(X; n, ε∗) , so that, for all µ ergodic

1 ∗ h(T,µ) ≤ lim sup log2 N(X; n, ε )+η<∞ . n→∞ n The result follows from Corollary 8.6.1 in [W]. Theorem 3.1. Suppose that T is expansive, or that T is asymptotically h- expansive. Then (X, d, T) has the uniform separation property. Proof:IfT is expansive, and ε>0 is an expansive constant for T ,thenfora partition α with diameter smaller than ε (see [W])

h(T,ρ)=h(T,ρ,α)ifρ ∈ M(X, T) .

If T is asymptotically h-expansive, then there exists h(ε) such that

lim h(ε)=0, ε→0

Prepared using etds.cls Topological Entropy of Saturated Sets 11 and if α is a finite partition with diameter smaller than ε,

h(T,ρ) ≤ h(T,ρ,α)+h(ε)ifρ ∈ M(X, T) . (10)

(See [N] Theorem 1 and [Mi1].) Let η>η > 0. The value of η is fixed in (12). We choose ε so that either ε is an expansive constant, or η h(ε) ≤ . (11) 4

Let ν ∈ M(X, T). We first construct a neighborhood of ν, Wν ⊂ M(X). It is sufficient to prove the result for an ergodic µ ∈ Wν ∩ M(X, T), since M(X, T)can ⊂ be covered by a finite number of neighborhoods Wνi M(X), i =1,...,n. Let ξ = {A1,...,Ak} be a fixed finite partition of X with 2 diamξ ≤ ε.We  ∗  ∗ 1 choose η and a constant δ > 0sothat2η + δ < 2 and ∗  ∗   φ(δ +2η )+(δ +2η )log2(2k − 1) <η− η , (12) where φ(δ):=−δ log2 δ − (1 − δ)log2(1 − δ) . (13)

Since ν is regular, let Vj ⊂ Aj, j =1,...,k, be compact subsets, with η ν(Aj\Vj) < . 4k log2(2k) ∗ ∗ Define ε > 0sothatd(x, y) > 2ε whenever x ∈ Vi, y ∈ Vj, i = j. There exists an integer n∗, such that for n ≥ n∗, η log 2 ≥ 2 . 4log2 k n log2(2k)

Let Ui ⊃ Vi, i =1,...,k,bek open neighborhoods with diamUi <ε,andsuch ∗  that d(x, y) >ε whenever x ∈ Ui, y ∈ Uj, i = j.Letξ be a partition of X  which contains Uj, j =1,...,k,andthesetsAi \∪jUj so that diamξ <ε.Let K := X \∪j Uj; since K is closed, IK , the indicator function of K, is upper semi- continuous. We define the neighborhood Wν by η Wν := ρ ∈ M(X): IK ,ρ≤ IK ,ν + . 4log2(2k) It is convenient to label the atoms of T ∨nξ by words w of length n over an alphabet  A of at most 2k letters. The letters 1,...,k label the atoms U1,...,Uk of ξ ,and the other letters the non-empty atoms among A1 \∪j Uj,...,Ak \∪j Uj. We define an application w : X → AΛn ,

j  w(x)j := i if T x is in the atom of ξ labeled by i.

We prove the result for an ergodic µ ∈ Wν ∩ M(X, T). Because of our choice of ε (see (10) and (11)), there exists n∗ so that η H(µ, T ∨nξ) >n h(T,µ) − if n ≥ n∗ . 2

Prepared using etds.cls 12 C-E. Pfister and W.G. Sullivan

By definition of Wν

k η η η µ(X \∪jUj) ≤ ν(X \∪j Uj)+ ≤ ν(Aj \Vj)+ < . 4log2(2k) j=1 4log2(2k) 2log2(2k)

Let  Yn := x : |{j ∈ Λn :w(x)j >k}| ≤ nη , ∗ so that limn µ(X \ Yn) = 0. Therefore, by the Ergodic Theorem and by taking n large enough, if F is a neighborhood of µ, then we may suppose that  η log2 2 ∗ µ(X \ (Xn,F ∩ Yn)) < − if n ≥ n . 2log2(2k) n log2(2k) Set n A0 := X \ (Xn,F ∩ Yn) . n n By conditioning with respect to {A0 ,X\A0 } we get

∨n  n n ∨n  H(µ, T ξ ) ≤ log2 2+µ(A0 )H(µ( ·|A0 ),T ξ )+ n n ∨n  (1 − µ(A0 ))H(µ( ·|X \ A0 ),T ξ ) .

 n ∨n  Since the number of atoms of ξ is at most 2k, H(µ( ·|A0 ),T ξ ) ≤ n log2(2k). Thus, for n ≥ n∗,

n ∨n   H(µ( ·|X \ A0 ),T ξ ) >n(h(T,µ) − η ) . (14)

Let Ξn denote the of Xn,F ∩ Yn by the map w. Inequality (14) implies that

n(h(T,µ)−η) |Ξn|≥2 .

H   n The Hamming distance dn (w, w ) of two different elements w and w of A is the   number of different letters of w and w .LetΞn ⊂ Ξn of maximum cardinality H   ∗   such that dn (w, w ) >n(2η + δ ) for any pair (w, w ) of different elements of Ξn. Let Γn be defined by selecting exactly one point of Xn,F ∩ Yn from each atom of ∨n   ∗ ∗ T ξn, which is labeled by a word of Ξn. By construction Γn ⊂ Xn,F is (δ ,n,ε )-  separated. The maximum cardinality of Ξn implies that for each w ∈ Ξn there   H   ∗ n exists w ∈ Ξn so that dn (w , w) ≤ n(2η + δ ). For a given w ∈ A (see e.g. Lemma 2.2 in [PfS2])

 n H   ∗ nφ(2η+δ∗) n(2η+δ∗) |{w ∈ A : dn (w , w) ≤ n(2η + δ )}| ≤ 2 (|A|−1) .

Therefore, by our choice (12) of η and δ∗

n(h(T,µ)−η)  2 n(h(T,µ)−η) |Γn| = |Ξn|≥  ∗  ∗ ≥ 2 . 2nφ(2η +δ )(2k − 1)n(2η +δ )

Corollary 3.1. Assume that (X, d, T) has the uniform separation property, and that the ergodic measures are entropy-dense. For any η,thereexistδ∗ > 0 and

Prepared using etds.cls Topological Entropy of Saturated Sets 13

ε∗ > 0 so that for µ ∈ M(X, T) and any neighborhood F ⊂ M(X) of µ,thereexists ∗ nF,µ,η, such that

∗ ∗ n(h(T,µ)−η) ∗ N(F ; δ ,n,ε ) ≥ 2 if n ≥ nF,µ,η . For any µ ∈ M(X, T), 1 h(T,µ) ≤ lim lim inf lim inf log2 N(F ; δ, n, ε) . ε→0 δ→0 F µ n→∞ n Proof:Letη>0andµ ∈ F .Forη/2 there exist δ∗ > 0andε∗ > 0 such that the statement is true for the ergodic measures with η/2 instead of η.Ifµ is not ergodic, and F µ, then we choose ν ∈ F ,ergodic,sothath(T,µ)−h(T,ν) ≤ η/2. ∗ ∗ The statement is then true with nF,µ,η = nF,ν,η/2. The second statement is an easy consequence of the first one.

3.2. The entropy map We study the entropy map when the uniform separation property holds. Proposition 3.2 gives another expression for the entropy of ν ∈ M(X, T), and Proposition 12 gives a sufficient condition implying that the entropy map is upper semi-continuous. Recall that for ε>0andν ∈ M(X, T)(see (9)) 1 s(ν; ε):=inf lim inf log2 N(F ; n, ε)ands(ν) := lim s(ν; ε) , F ν n→∞ n ε→0 with similar definitions for s(ν; ε)ands(ν). If s(ν)=s(ν), then the common value is denoted by s(ν). The next three results are valid for any dynamical systems. Lemma 3.1 is essentially one-half of the variational principle for the topological entropy, [G], [Mi2]and[W] section 8.2.

Lemma 3.1. Let (X, d, T) be a dynamical system. Let {En} be a sequence of (n, ε)- separated subsets and define n−1 1 νn := δT kx . n|En| x∈En k=0

Assume that limn νn = µ.Then 1 lim sup log2 |En|≤h(T,µ) . n→∞ n Proof: Up to minor modifications, the proof is identical with the second part of the proof of Theorem 8.6 in [W]. Proposition 3.1. Let (X, d, T) be a dynamical system and µ ∈ M(X, T).Then s(µ) ≤ h(T,µ) . Proof:Ifh(T,µ)=∞, then there is nothing to prove. Let h(T,µ) < ∞. Suppose that 1 lim inf lim sup log2 N(F ; n, ε) >h(T,µ) . ε→0 F µ n→∞ n

Prepared using etds.cls 14 C-E. Pfister and W.G. Sullivan

There exist ε∗ > 0andδ>0sothatforε ≤ ε∗, 1 inf lim sup log2 N(F ; n, ε) ≥ h(T,µ)+2δ. F µ n→∞ n Let 0 <ε≤ ε∗. There exists a decreasing sequence of convex closed neighborhoods {Cn} so that 1 Cn = {µ} and lim sup log2 N(Cn; n, ε) ≥ h(T,µ)+2δ. (15) n→∞ n n ⊂ Let En Xn,Cn be (n, ε)-separated with maximal cardinality, and define n−1 1 νn := δT kx ∈ Cn . n|En| x∈En k=0

By definition limn νn = µ. By Lemma 3.1 1 1 lim sup log2 |En| = lim sup log2 N(Cn; n, ε) ≤ h(T,µ) , n→∞ n n→∞ n which contradicts (15). Corollary 3.2. Let (X, d, T) be a dynamical system. Then s(µ) is well-defined, and s(µ)=h(T,µ), for all ergodic measures. Moreover, 1 1 s(µ) = lim lim inf lim inf log2 N(F ; δ, n, ε) = lim lim inf lim sup log2 N(F ; δ, n, ε) . ε→0 δ→0 F µ n→∞ n ε→0 δ→0 F µ n→∞ n Proof: From Proposition 3.1 and N(F ; δ, n, ε) ≤ N(F ; n, ε), 1 lim lim inf lim sup log2 N(F ; δ, n, ε) ≤ s(µ) ≤ h(T,µ) . ε→0 δ→0 F µ n→∞ n From Proposition 1.1 1 h(T,µ) ≤ lim lim inf lim inf log2 N(F ; δ, n, ε) ≤ s(µ) . ε→0 δ→0 F µ n→∞ n

Proposition 3.2. Let (X, d, T) be a dynamical system. If the uniform separation condition is true and the ergodic measures are entropy-dense, then s(µ) is well- defined, and s(µ)=h(T,µ), for all µ ∈ M(X, T). Proof: Same proof as the proof of Corollary 3.2, using Corollary 3.1 instead of Proposition 1.1. Proposition 3.3. Let (X, d, T) be a dynamical system. If the uniform separation condition is true and the ergodic measures are entropy-dense, then the entropy map is upper semi-continuous. Proof:LetF be a neighborhood of ν.Givenη>0, by Corollary 3.1 there exists ε∗ so that 1 ∗ sup h(T,µ) ≤ lim sup log2 N(F ; n, ε )+η. µ∈F n→∞ n

Prepared using etds.cls Topological Entropy of Saturated Sets 15

Hence 1 ∗ inf sup h(T,µ) ≤ inf lim sup log2 N(F ; n, ε )+η. F ν µ∈F F ν n→∞ n Since η is arbitrary, we get from Proposition 3.1 inf sup h(T,µ) ≤ h(T,ν) . F ν µ∈F

4. Upper bound for htop(T,GK ) Let E ⊂ X,andGn(E,ε) be the collection of all finite or countable covers of E by sets of the form Bm(x, ε)withm ≥ n.Weset −tm C(E; t, n, ε, T ):=inf 2 : C∈Gn(E,ε) ,

Bm(x,ε)∈C and C(E; t, ε, T ) := lim C(E; t, n, ε, T ) . n→∞ Then

htop(E,ε,T):=inf{t : C(E; t, ε, T )=0} =sup{t : C(E; t, ε, T )=∞} , and the topological entropy of E is

htop(E,T) := lim htop(E,ε,T) . ε→0 Theorem 4.1. Let (X, d, T) be a dynamical system and µ ∈ M(X, T). (1) Let K ⊂ M(X, T) be a closed subset, and let K G := x ∈ X : {En(x)}n has a limit-point in K .

Then K htop(T, G) ≤ sup{h(T,ρ):ρ ∈ K} . (2) If µ ∈ M(X, T),then htop(T,Gµ) ≤ h(T,µ) . (3) Let K ⊂ M(X, T) non-empty, connected and compact. Then

htop(T,GK ) ≤ inf{h(T,µ):µ ∈ K} .

Proof:Letµ ∈ M(X, T)ands := supµ∈K h(T,µ). If s = ∞, there is nothing to prove. Assume that s<∞;lets − s =2δ>0. Since N(F ; n, ε) is a nonincreasing function of ε, by Proposition 3.1 1 inf lim sup log2 N(F ; n, ε) ≤ h(T,µ) ∀ ε>0 . F µ n→∞ n For any ε>0thereexistaneighborhoodofµ, F (µ, ε), and M(F (µ, ε)) ∈ N,so that 1 log N(F (µ, ε); n, ε) ≤ h(T,µ)+δ ∀ n ≥ M(F (µ, ε)) . n 2

Prepared using etds.cls 16 C-E. Pfister and W.G. Sullivan

Since maximal (n, ε)-separated subsets of a set A are also (n, ε)-spanning subsets of A, for any n ≥ M(F (µ, ε)),

 −sn −δn C(Xn,F (µ,ε); s ,n,ε,T) ≤ N(F (µ, ε); n, ε)2 ≤ 2 .

Since K is compact, given a fixed ε>0, we can find a finite open covering of K by sets of the form F (µ, ε), say F (µj,ε), j =1,...mε,withµj ∈ K.If{En(x)} has a limit-point in K,thenx is an element of

AM := Xn,F (µj ,ε) , n≥M j=1 for arbitrarily large M.Thus,forM ≥ maxj M(F (µj,ε)), K  −δn C( G; s ,M,ε,T) ≤ mε 2 , n≥M which implies that

K htop( G, T ; ε) ≤ sup{h(T,µ):µ ∈ K} .

{µ} (2) is a consequence of (1). For the third statement notice that GK ⊂ G for all µ ∈ K,sothat htop(T,GK ) ≤ inf{h(T,µ):µ ∈ K} .

5. Lower bound for htop(T,GK ) Theorem 5.1. Let (X, d, T) be a dynamical system with the uniform separation and g-almost product properties. Let K be a connected non-empty compact subset of M(X, T).Then inf{h(T,µ):µ ∈ K}≤htop(T,GK ) .

Proof:Foreachε>0 there exists a finite sequence of measures α1,...,αn in K such that each point of K is within ε of some αj. Because K is connected, possibly repeating some αj, we can choose this sequence so that α1,...,αn is not more than ε away from any point of K and d(αj,αj+1) < 2ε for each j. Extending this argument we deduce that there exists a sequence α1,α2,...in K so that the closure of {αj : j ∈ N,j>n} for each n ∈ N equals K and

lim d(αj,αj+1)=0. j→∞

Let η>0, and h∗ := inf{h(T,µ):µ ∈ K}−η.

Given this sequence of measures†{αk}, we construct a subset G such that for each x ∈ G, {En(x)} has the same limit-point set as the sequence {αk},and

† If K = {α},thenαk = α for all k.

Prepared using etds.cls Topological Entropy of Saturated Sets 17

∗ htop(T,G) ≥ h . The construction of G is the core of the proof. It is also used in the proof of Theorem 6.1. By Corollary 3.1 we can find δ∗ > 0andε∗ > 0sothatifF µ, then there ∗ exists nF,µ,η with

∗ ∗ n(h(T,µ)−η) ∗ N(F ; δ ,n,ε ) ≥ 2 ∀ n ≥ nF,µ,η . (16)

Let {ζk} and {εk} be two strictly decreasing sequences†, limk ζk = 0 and limk εk = ∗ ∗ ∗ 0, with ε1 <ε . From (16) we deduce the existence of nk and a (δ ,nk,ε )-separated ⊂ subset Γk Xnk,B(αk,ζk) with ∗ nkh |Γk|≥2 . (17)

We may assume that nk satisfies

∗ g(nk) δ nk > 2g(nk)+1 and ≤ εk . (18) nk n −1 The orbit-segments {x,Tx,...,T k x}, x ∈ Γk, are the building-blocks for the construction of the points of G. By Lemma 2.1 and (18) ∈ ∈ ⇒E ∈B x Γk and y Bnk (g; x, εk)= nk (y) (αk,ζk +2εk) . (19)

We choose a strictly increasing sequence {Nk},withNk ∈ N, k nk+1 ≤ ζk njNj , (20) j=1 and k−1 k nj Nj ≤ ζk njNj . (21) j=1 j=1    Finally we define the (stretched) sequences {nj}, {εj} and {Γj}, by setting for

j = N1 + ···+ Nk−1 + q with 1 ≤ q ≤ Nk ,

   nj := nk εj := εk Γj := Γk . Let k j −Mj−1   k  j j G := T Bnj (g; x ,εj) with M := n . j=1  =1 xj ∈Γj

Gk is a non-empty closed set. We can label each set obtained by developing this formula by the branches of a labeled tree of height k. A branch is labeled by  (x1,...,xk)withxj ∈ Γj. Theorem 5.1 is proved by proving Lemma 5.1. Lemma 5.1. Let ε be such that 4ε = ε∗,andlet G := Gk . k≥1

† One can take ζk = εk; however the roles of ζk and εk are different. See last remark of Section 5.

Prepared using etds.cls 18 C-E. Pfister and W.G. Sullivan

   j j ∈ j  j ∈  j ∈  j (1) Let x ,y Γj with x = y .Ifx Bnj (g; x ,εj) and y Bnj (g; y ,εj),then m m max{d(T x, T y):m =0,...nj − 1} > 2ε.

(2) G is a closed set, which is the disjoint union of non-empty closed sets  G(x1,x2,...) labeled by (x1,x2,...) with xj ∈ Γj. Two different sequences label two different sets. (3) G ⊂ GK . ∗ (4) htop(T,G) ≥ h .   ∈  j ∈  j j j Proof:(1)Letx Bnj (g; x ,εj)andy Bnj (g; y ,εj). Since x and y are ∗  ∗ ∈  (δ ,nj,ε )-separated and (18) holds, there exists m Λnj such that

m m m m  m m  d(T xj,T yj) > 4ε, d(T xj,T x) ≤ εj ,d(T yj,T y) ≤ εj .

But

m m m m m m m m d(T x, T y) ≥ d(T xj,T yj) − d(T xj,T x) − d(T y, T yj) > 2ε.

 (2) G is the intersection of closed sets. Let (x1,x2,...) be a sequence with xj ∈ Γj. By the g-almost product property and compactness −Mj−1   j T Bnj (g; x ,εj) j≥1

   j  j is a non-empty closed set. By (1) the sets Bnj (g; x ,εj)andBnj (g; y ,εj)are disjoint when xj = yj. Thus two different sequences label two different sets.  (3) We define the stretched sequence {αm} by k−1 k  αm := αk if njNj +1≤ m ≤ njNj . j=1 j=1

 The sequence {αm} has the same limit-point set as the sequence {αk}.If

 lim d(En(y),αn)=0, n→∞

 then the two sequences {En(y)} and {αn} have the same limit-point set. Because  of (20) and the definition of {αm}, it is sufficient to show that

 lim d(EM (y),αM )=0. k→∞ k k j ≤ j+1  Suppose that =1 nN

Prepared using etds.cls Topological Entropy of Saturated Sets 19

Since limj ζj = 0, limj εj = 0 and limj d(αj,αj+1)=0thisproves(3).

(4) For this part of the proof the details of the construction are unimportant. What h∗(M −M ) ∗ is required is that limn Mn+1/Mn =1and|Γn+1|≥2 n+1 n .Lets

Thus if W contains a prefix of each word of Wm, m |W ∩ Wk|/|Wk|≥1. (22) k=1  Note that since C is a cover, each point of Wm has a prefix associated with some ∗ ∈C |W |≥ h Mk BMp (z,ε) . Also k 2 . Hence

∗ 2−h Mp ≥ 1.  BMp (x,ε)∈C ∗ Thus when p is large enough so that k ≥ p implies sMk+1 ≤ h Mk, n ≥ Mp and C∈Gn(G, ε), we have 2−sm ≥ 1 ⇒ C(G; s, n, ε, T ) ≥ 1.

Bm(x,ε)∈C

Remark. Inequality (22) is a complement to Kraft’s inequality of coding theory [S], which has the reversed inequality. For Kraft’s inequality one requires a mapping

Prepared using etds.cls 20 C-E. Pfister and W.G. Sullivan into Wm for which no word is a prefix of another. In the current context we have a mapping in which a prefix for each word of Wm appears. Remark. When the dynamical system is expansive with expansive constant ε∗,

∗ lim sup{diamBn(x, ε ):x ∈ X} =0. n→∞

Hence, in the proof of Theorem 5.1, we can work with a fixed ε = ε∗ and use

Lemma 5.2. Let (X, d, T) be an expansive dynamical system with expansive constant ε∗.Letf : X → R be a continuous function, f≤1. Then, for any δ>0 ∗ there exists N(δ) so that for any n ≥ N(δ),anyx ∈ X,andanyy ∈ Bn(g; x, ε ), f,En(x) − f,En(y)  ≤ δ.

Proof:Letδ>0. Since f is uniformly continuous, there exists η so that d(x, y) ≤ η implies |f(x) − f(y)|≤δ/2. We choose M ∈ N so that for any m ≥ M,

∗ sup{diamBm(g; x, ε ): x ∈ X}≤η.

Then we choose N>Mso that for any n ≥ N

M(g(n)+1) δ ≤ . n 2 ∗ We set N(δ):=N.Letn ≥ N(δ)andy ∈ Bn(g; x, ε ). By definition there exists ∗ Λ ⊂ Λn so that |Λn\Λ|≤g(n)anddΛ(x, y) ≤ ε .Wesaythatk ∈ Λn is bad if there exists j, k ≤ j ≤ k + M − 1sothatj ∈ Λ. The number of bad k is at most k k ∗ k k M(g(n)+1). Ifk is not bad, then T y ∈ BM (T x, ε ), so that d(T y, T x) ≤ η and consequently |f(T ky) − f(T kx)|≤δ/2. Hence n − M(g(n)+1) δ M(g(n)+1) f,En(x) − f,En(y)  ≤ + ≤ δ. n 2 n

6. Non-uniform case In this section we drop the condition of uniform separation. Let ε>0, δ>0and ν ∈ M(X, T). We set 1 s(ν; δ, ε):=inf lim inf log2 N(F ; δ, n, ε) . F ν n→∞ n Similarly we define s(ν; δ, ε). In this definition we can compute the infimum using the base of neighborhoods of ν given by the balls B(ν, ζ). Since N(F ; δ, n, ε) ≤ N(X; n, ε), Theorem 7.7 and remark (8), p.166, in [W]implythats(ν; δ, ε) ≤ s(ν; ε) < ∞ and s(ν; δ, ε) ≤ s(ν; ε) < ∞.

Lemma 6.1. The functions s( · ; δ, ε) and s( · ; δ, ε) are upper semi-continuous functions.

Prepared using etds.cls Topological Entropy of Saturated Sets 21

Proof: Let limk µk = ν.Leta>s(ν; δ, ε). Then there exists an F ν such that 1 a>lim inf log2 N(F ; δ, n, ε) . n→∞ n If k is large enough, then µk ∈ F and therefore 1 a>lim inf log2 N(F ; δ, n, ε) ≥ s(µk; δ, ε) , n→∞ n which implies that s(ν; δ, ε) ≥ lim sup s(µk; δ, ε) . k→∞ Similar proof for s( · ; δ, ε). Lemma 6.2. Let ε∗ > 0, δ∗ > 0.Letν ∈ M(X, T) and

ν = µt ρ(dt) be its ergodic decomposition. Then, for any ∆ > 0 we can find a finite convex p combination i=1 aiµi of ergodic measures such that p d ν, aiµi ≤ ∆ , i=1 and p ∗ ∗ ∗ ∗ s(µt,δ ,ε ) ρ(dt) ≤ ais(µi,δ ,ε ) . i=1 ∗ The {ai} can be chosen to be rational numbers. Moreover, if h h .

Proof:Let{A1,...,Ap} be a partition of M(X, T) with diameter smaller than ∆. For any Ai there exists an ergodic µi ∈ Ai such that 1 ∗ ∗ ∗ ∗ s(µt; δ ε ) ρ(dt) ≤ s(µi; δ ε ) , ρ(Ai) Ai and 1 d µt ρ(dt),µi ≤ ∆ . ρ(Ai) Ai Setting ai := ρ(Ai), we get the first result. To get rational coefficients, do the above for ∆/2 and note that if 2pn < ∆, each aj may be made rational by adjusting by not more than 1/n while maintaining the required properties. By Corollary 3.2 lim lim s(ν; δ, ε)=h(T,ν)ifν ergodic . ε→0 δ→0 From the Monotone-Convergence Theorem and the affine character of the entropy we get

lim lim s(µt; δ, ε) ρ(dt)= lim lim s(µt; δ, ε) ρ(dt)= h(T,µt) ρ(dt)=h(T,ν) . ε→0 δ→0 ε→0 δ→0

Since s(µt; δ, ε) is nonincreasing in δ and ε, this proves also the second result.

Prepared using etds.cls 22 C-E. Pfister and W.G. Sullivan

Theorem 6.1. Let (X, d, T) be a dynamical system verifying the g-almost product property. Then

htop(T,Gν )=h(T,ν) for any ν ∈ M(X, T) . Proof: The proof follows the same pattern as the proof of Theorem 5.1. Let h(T,ν) >h >h∗ . We prove ∗ htop(T,Gν ) ≥ h . By Lemma 6.2 we can find ε∗ > 0, δ∗ > 0, and for any k, a finite convex combination of ergodic measures with rational coefficients pk αk := ai,kµi,k , i=1 so that {αk} converges to ν,andsuchthat pk  ∗ ∗ h < ai,ks(µi,k,δ ,ε ) . (23) i=1

Let {ζk} and {εk} be two strictly decreasing sequences, limk ζk = 0 and limk εk =0, ∗ with ε1 <ε . We assume that

d(ν, αk) ≤ ζk .

For each αk there exists an integer nk so each element of {nkai,k} is an integer, ∗ ∗ ∗ ∗ ai,knk(s(µi,k,δ ,ε )−η) N(B(µi,k,ζk); δ ,ai,knk,ε ) ≥ 2 i =1,...,pk , (24) where η := h − h∗,and

∗ g(ai,knk) δ ai,knk > 2g(ai,knk)+1 and ≤ εk ∀ i =1,...,pk. (25) ai,knk ∗ ∗ Let Γi,k be a (δ ,ai,knk,ε )-separated subset of Xai,knk,B(µi,k,ζk). The cardinality of Γi,k is at least (see (24)) ∗ ∗ 2ai,knk(s(µi,k,δ ,ε )−η) . We define pk Γk := Γi,k, i=1 so that by (23) and (24) ∗ nkh |Γk|≥2 .

We denote the elements of Γk by E ∈B xk := (x1,k,...,xpk,k)with ai,knk (xi,k) (µi,k,ζk) . and set pk −(a1,k+···+ai−1,k)nk Bnk (g; xk,εk):= T Bai,knk (g; xi,k,εk)(witha0,k := 0) . i=1 We omit the rest of the proof, which from this point follows that of Theorem 5.1.

Prepared using etds.cls Topological Entropy of Saturated Sets 23

Lemma 6.3. Assume that (X, d, T) has the g-almost product property, and let δ>0 and ε>0.Leta1 > 0,...,ap > 0,witha1 + ···+ ap =1.Then,forνi ∈ M(X, T),   ε <εand δ < (mini ai) δ, p p   s aiνi; δ ,ε ≥ ai s(νi; δ, ε) . i=1 i=1   Proof:Letm be the function of Definition 2.3. Let ε <εand δ < (mini ai) · δ be given. We fix ζ > 0, and choose ∆ and ζ so small that

2∆ <ε− ε and ζ +∆<ζ .  We set Fi = B(νi,ζ), F = B(ν, ζ )withν = i aiνi.Letη>0andn be a (large) integer. We decompose n into n1 + ··· + np = n,withni ∈ N and ∗ ∗ ain≤ni ≤ain. There exists n such that for n ≥ n ,(mini ai) n ≥ m(∆) and ain(s(νi)−η) N(Fi; δ, ni,ε) ≥ 2 i =1,...,p. We also assume that for n ≥ n∗ and all i

g(ni)  ζ +∆+ <ζ and niδ>2g(ni)+1. ni

Suppose that En(xi) ∈ Fi, i =1,...,p. For each choice of x1,...,xp we select y in p −(n1+···+nj−1)  ∅ T Bnj (g; xj, ∆) = . j=1

In this way we define a subset S of cardinality at least 2(s(ν1;ε)−η)a1n ···2(s(νp;ε)−η)apn, such that  y ∈ S =⇒En(y) ∈B(ν, ζ ) . The subset S is (δ,n,ε)-separated, with  2g(ni) 1 δ = min ni δ − . i ni n

Since ai are fixed, by choosing n large enough,

   δ ≥ δ if δ < (min ai) δ. i Therefore p 1    lim inf log2 N(B(ν, ζ ); δ ,n,ε) ≥ s(νj; δ, ε) − η. n→∞ n j=1 We may take ∆ arbitrarily small, as well as ζ, so that we can take the limit ζ → 0. Thus p   s(ν; δ ,ε) ≥ s(νj; δ, ε) − η. j=1 Since η is also arbitrary, this proves the lemma.

Prepared using etds.cls 24 C-E. Pfister and W.G. Sullivan

Proposition 6.1. Assume that (X, d, T) has the g-almost product property. Then

s(ν)=h(T,ν) if ν ∈ M(X, T) .

∗ ∗ Proof: It is sufficient to prove limε s(ν; ε)=s(ν) ≥ h , for any h 0, δ > 0, and a sequence {νk} such that limk νk = ν, each νk is a finite convex combinations of ergodic measures,

nk nk ∗ ∗ ∗ νk = ai,kµi,k and ai,ks(µi,k; δ ,ε ) ≥ h . i=1 i=1 By Lemma 6.3, if ε<ε∗,then

nk nk nk ∗ ∗ ∗ h ≤ ai,ks(µi,k; δ ,ε ) ≤ lim s ai,kµi,k; δ, ε ≤ s ai,kµi,k; ε . δ→0 i=1 i=1 i=1 By Lemma 6.1 we get

 ∗ s(ν) = lim s(ν; ε ) ≥ s(ν; ε) ≥ lim sup s(νk; ε) ≥ h . ε→0 k→∞

7. Multi-fractal analysis of ergodic averages We present one application of the above results. For variants see Section 5 in [PfS2]. Proposition 7.1. Let (X, d, T) be a dynamical system such that for all µ ∈ M(X, T), htop(T,Gµ)=h(T,µ).LetY be a topological vector space and ϕ : X → Y . Suppose that the map φ : M(X) → Y ,

µ → φ(µ):= ϕ, µ  is continuous. For any a ∈ Y we set n−1 1 k Ka := x ∈ X : lim ϕ(T x)=a . n n k=0 Then htop(T,Ka)=sup{h(T,ρ):ρ ∈ M(X, T) and ϕ, ρ  = a} . Proof:LetF (a):={ρ ∈ M(X, T): ϕ, ρ  = a}. This is a closed set. Because µ → φ(µ):= ϕ, µ  is continuous, the statement

n−1 1 lim ϕ(T kx)=a n n k=0 is equivalent to the statement

{En(x)} has all its limit-points in F (a) .

Prepared using etds.cls Topological Entropy of Saturated Sets 25

Let F (a) G := {x ∈ X : {En(x)} has all its limit points in F (a)} . F (a) For any ρ ∈ F (a) one has Gρ ⊂ G , hence F (a) sup{h(T,ρ):ρ ∈ F (a)}≤htop(T,G ) . For the other inequality notice that GF (a) ⊂ F (a)G, which implies by Theorem 4.1 (1) F (a) sup{h(T,ρ):ρ ∈ F (a)}≥htop(T,G ) .

Proposition 7.1 is the main result in [TV]. For earlier results in the context of subshifts of finite type, see [O1]. See Theorem 6.2 in [TV] for the relationship between htop(T,Ka) and the topological pressure P (T,ϕ)=sup{h(T,ρ)+ ϕ, ρ  : ρ ∈ M(X, T)} , when the map ϕ is real-valued. Takens and Verbitskiy use, independently, a method similar to the method of proof of Section 5, under the assumptions of the specification property and htop(T,X) < ∞. Our proof is different and we prove that the hypothesis of Proposition 7.1 are verified for a dynamical system with the g-almost product property only. We emphasize the role of the empirical measures En. There is an analogy with Large Deviations Theory (see [LePf]). In the language of Large Deviations Theory, one could say that Takens and Verbitskiy proved a level-1 result and that we prove a level-3 result, and that Proposition 7.1 is a contraction principle. The two main properties which are used in this paper are the entropy density of ergodic measures and the uniform separation property. This later property implies that the topological entropy is finite and that the entropy-map is upper semi- continuous. We give an example of a dynamical system with finite topological entropy, for which entropy-density of ergodic measures is true (specification property is true), but the uniform separation property and the upper semi- continuity of the entropy map fail. Example. Consider the shift space Y := [−1, 1]Z+ , with the metric ∞ −k−2 d(x, y):= 2 |xk − yk| . k=0

Let A be the alphabet containing the letters a0 =0,am =1/m,anda−m = −1/m, m ∈ N. We consider A as a subset of [−1, 1] with the induced metric, so that it is acompactset.LetX ⊂ AZ+ be the subshift defined‡ by ⎧ ⎪ ⎨⎪aj,ora−j,oraj+1,oraj−1 if xk = aj, j ≥ 1 ≥ ∀ ∈ xk+1 = ⎪aj,ora−j,ora−j+1,ora−j−1 if xk = a−j, j 1 k Z+ . ⎩⎪ a0,ora1,ora−1 if xk = a0

‡ We were inspired by [Gu].

Prepared using etds.cls 26 C-E. Pfister and W.G. Sullivan

There is an injective map from X to A ×{−2, −1, 1, 2}N.Letx ∈ X and N y =(y0,y1,...) ∈ A ×{−2, −1, 1, 2} ;wesety0 := x0 and for k ≥ 1 ⎧ ⎪ 1ifxk−1 = xk ⎪ ⎪− − ⎪ 1ifxk−1 = xk ⎨⎪ 2 if, for j ∈ N, xk−1 = |aj| and xk = |aj−1| yk := ⎪ ⎪−2 if, for j ∈ N, xk−1 = |aj| and xk = |aj+1| ⎪ ⎪ ⎪ 2ifxk−1 = a0 and xk = a1 ⎩⎪ −2ifxk−1 = a0 and xk = a−1.

From this it follows that the topological entropy of X is bounded by log2 4. We show that the subshift X has the specification property. Let ε0 =1/m0, m0 ∈ N be fixed. Consider a word of length n,sayw=(x0,x1,...,xn−1). We define a new word of length n, denoted by w=( x0, x1,...,xn−1), ⎧ ⎪ ⎨⎪xk if xk = aj or xk = a−j,withj ≤ m0 xk := (26) ⎪am0 if xk = aj with j>m0 ⎩⎪ a−m0 if xk = a−j with j>m0.

We can also extend the transformation · to the points of X. Notice that the image of X under this transformation, X, is a subset of X,and

m m 1 d(T x, T x) ≤ ∀ m ∈ Z+ . 2m0

Let j0 be defined by −(j+1) 1 j0 := min j ∈ Z+ :2 < , (27) 2m0 so that ∞ −k−2 1 2 |xk − yk| < . 2m0 k>j0

The subshift X has the specification property with k(ε0):=j0 + m0 +1. Let n1 ∈ N,...,np ∈ N, x1 ∈ X,...,xp ∈ X and k1 ≥ k(ε0),...,kp−1 ≥ k(ε0)be given. Let wi be the prefix of xi of length ni + j0, i =1,...,p. Then we can find y ∈ X such that, for i =1,...,p i−1 (yi,yi+1 ...,yj)=wi i = (n + k)andj = i + ni + j0 − 1 . =1 For y we have

n1+k1+···+nj−1+kj−1+m m n1+k1+···+nj−1+kj−1+m m d(T y, T xj) ≤ d(T y, T xj) m m + d(T xj,T xj) 1 ≤ m =0,...,nj − 1andj =1,...,p. m0

Prepared using etds.cls Topological Entropy of Saturated Sets 27

Thus, (X, d, T) is a dynamical system with finite topological entropy and the specification property. It is neither expansive, nor asymptotically expansive; indeed, 0 0 for x ,withxk =0forallk ∈ Z+, 0 htop Bn(x ,ε) ≥ log2 2 . n

We show that the entropy-map is not upper semi-continuous at δx0 . Indeed, let µm be the Bernoulli measure concentrated on the configurations with xk = a±m for all k. These measures converge weakly to δx0 , but h(T,µm)=log2 2andh(T,δx0 )=0. Choose ϕ(x):=|x0|, µ := δx0 so that ϕ, µ  =0.LetFδ be the neighborhood

Fδ := {ν : | ϕ, ν − ϕ, µ | <δ} .

Then n−1 ∈ 1 | | Xn,Fδ = x X : xk <δ . (28) n 0 We shall prove that for each ε>0 1 lim lim sup log2 N(Fδ; n, ε)=0. (29) δ→0 n n

Each neighbourhood Fδ,δ > 0, contains ergodic measures of the form µm with h(T,µm)=log2 2. Hence (29) implies that X does not have the uniform separation property. It is convenient to work with (n, ε)-spanning sets instead of (n, ε)-separated sets. Let ε =1/m0, m0 ∈ N.Letδ>0sothatm0δ<1/2, and j0 be defined as in (27). S We select an (n, ε)-spanning set (δ)forXn,Fδ and bound its cardinality. Let

n−1 S ∈{ | |≤ }Z+ ∩ ≥ | | (δ):= x aj : j m0 X : xk = xn+j0−1,k n + j0; xk

−1 Since |xk|≥m0 when xk =0, x ∈S(δ) has at most nδm0 nonzero terms among its first n coordinates. The number of possible subsets with these nonzero coordinates is at most n ≤ 2nφ(m0δ) , k k≤(δm0)n where φ is the entropy-function (13). Therefore the cardinality of S(δ)isatmost

nφ(m0δ) m0δn j0 2 (2m0) (2m0 +1) .

S ∈ It remains to verify that (δ)is(n, ε)-spanning with ε =1/m0.Lety Xn,Fδ and let y be the element of S(δ) such that

yk = yk for k

Prepared using etds.cls 28 C-E. Pfister and W.G. Sullivan

−1 where yk is defined as in (26). We have |yk − yk|≤m0 for all k

j0 m m −k−2 −k−2 d(T y, T y) ≤ 2 |ym+k − ym+k| + 2 |ym+k − ym+k|

k=0 k>j0 j 1 0 1 1 ≤ 2−k−2 + ≤ . 0 0 0 m k=0 2m m From the above estimates (29) follows. Notice that we could use here Proposition 3.3 to conclude that the uniform separation property does not hold. Notice also that the proof of Theorem 5.1 in [TV] is incomplete. Without additional hypothesis one cannot satisfy points (1), (2) and (3) of p.328. As this example shows, it is not sufficient to assume finiteness of the topological entropy and the specification property.

References

[BaSc] L. Barreira, J. Schmeling, Sets of non-typical points have full topological entropy and full Hausdorff dimension, J. Isr. Math. 116, 29-70 (2000). [Bo1] R. Bowen, Entropy-expansive maps, Trans. A.M.S., 164, 323-331 (1972). [Bo2] R. Bowen, Topological entropy for noncompact sets, Trans. A.M.S., 184, 125-136 (1973). [BK] M. Brin, A. Katok, On local entropy, in Geometric Dynamic, 30-38, Ed. J. Palis Jr., LNM 1007, Springer, Berlin (1983). [C] H. Cajar, Billingsley Dimension in Probability Spaces, Springer LMN 892, Berlin (1980). [Ch] D.G. Champernowne, The construction of decimal normal in the scale of ten, J. London Math. Soc. 8, 254-260 (1933). [Co] C.M. Colebrook, The Hausdorff dimension of certain sets of nonnormal numbers, Michigan Math. J. 17 103–116 (1970). [Eg] H.G. Eggleston, The fractional dimension of a set defined by decimal properties, Quart. J. Math. Oxford Ser. 20, 31-36 (1949). [G] L.W. Goodwyn, Topological entropy bounds measure-theoretic entropy, Proc. A.M.S., 23, 679-688 (1969). [Gu] B.M. Gureviˇc, Topological entropy of enumerable Markov chains, Soviet Math. Dokl. 10, 911-915 (1969). [K] A. Katok, Lyapunov exponents, entropy and periodic orbits for diffeomorphisms, p.137-145 IHES Publications math´ematiques, 51, 137-173 (1980). [LePf] J.T. Lewis, C.-E. Pfister, Thermodynamic probability theory: some aspects of large deviations, Russian Mathematical Surveys 50, 279-317 (1995). [Mi1] M. Misiurewicz, Topological conditional entropy, Studia Mathematica, 55, 175-200 (1976). N [Mi2] M. Misiurewicz, A short proof of the variational principle for a Z+ -action on a compact metric space, Ast´erisque, 40, 147-157 (1976). [N] S. Newhouse, Continuity properties of entropy, Ann. Math 129, 215-235 (1989). [O1] E. Olivier, Analyse multifractale de fonctions continues, C. R. Acad. Sci. Paris 326, 1171-1174 (1998). [O2] E. Olivier, Dimension de Billingsley d’ensembles satur´es, C. R. Acad. Sci. Paris 328, 13-16 (1999).

Prepared using etds.cls Topological Entropy of Saturated Sets 29

[Pe] Ya.B. Pesin, Dimension Theory in Dynamical Systems, Contemporary Views and Applications, Chicago Lectures in Mathematics, University of Chicago Press, Chicago (1997). [PePi] Ya.B. Pesin, S.B. Pitskel´, Topological Pressure and the Variational Principle for Noncompact Sets, Funct. Anal. and Appl. 18, 307-318 (1985). [PfS1] C.-E. Pfister, W.G. Sullivan, Billingsley dimension on shift spaces, Nonlinearity 16, 661-682 (2003). [PfS2] C.-E. Pfister, W.G. Sullivan, Large Deviations Estimates for Dynamical Systems without the Specification Property. Application to the β-shifts, Nonlinearity 18, 237-261 (2005). [S] P.C. Shields, The Ergodic Theory of Discrete Sample Paths, Graduate Studies in Mathematics Volume 13, American Mathematical Society(1996). [Sc] J. Schmeling, Symbolic dynamics for β-shifts and self-normal numbers, Ergod. Th. & Dynam. Sys. 17, 675-694 (1997). [TV] F. Takens, E. Verbitskiy, On the variational principle for the topological entropy of certain non-compact sets Ergod. Th. & Dynam. Sys 23, 317-348 (2003). [Te] A.A. Tempelman, Dimension of Random Fractals in Metric Spaces, Teor. Veroyatnost. i Primenen. 44, 589-616 (1999); translation in Theory Probab. Appl. 44, 537-557 (2000). [W] P. Walters, An Introduction to Ergodic Theory, Graduate Texts in Mathematics 79, Springer, Berlin (2000). [We] H. Wegmann, Uber¨ den Dimensionsbegriff in Wahrsheinlichkeitsr¨aumen von P. Billingsley I and II, Z. Wahrscheinlichkeitstheorie verw. Geb. 9, 216-221 and 222- 231 (1968). [Y] L.-S. Young, Some Large Deviation Results for Dynamical Systems, Trans. Amer. Math. Soc. 318, 525-543 (1990).

Prepared using etds.cls