
Groups of Intermediate Growth and Grigorchuk’s Group

Eilidh McKemmie supervised by Panos Papazoglou

This was submitted as a 2CD Dissertation for FHS Mathematics and Computer Science, Oxford University, Hilary Term 2015

Abstract

In 1983, Grigorchuk [6] discovered the first known example of a group of intermediate growth. We construct Grigorchuk’s group and show that it is an infinite 2-group. We also show that it has intermediate growth, and discuss bounds on the growth.

Acknowledgements

I would like to thank my supervisor Panos Papazoglou for his guidance and support, and Rostislav Grigorchuk for sending me copies of his papers.

Contents

1 Motivation for studying group growth

2 Definitions and useful facts about group growth

3 Definition of Grigorchuk’s group

4 Some properties of Grigorchuk’s group

5 Grigorchuk’s group has intermediate growth
5.1 The growth is not polynomial
5.2 The growth is not exponential

6 Bounding the growth of Grigorchuk’s group
6.1 Lower bound
6.1.1 Discussion of a possible idea for improving the lower bound
6.2 Upper bound

7 Concluding Remarks

1 Motivation for studying group growth

Given a finitely generated group $G$ with finite generating set $S$, we can see $G$ as a set of words over the alphabet $S$. Defining a weight on the elements of $S$, we can assign a length to every group element $g$ which is the minimal possible sum of the weights of letters in a word which represents $g$. We are interested in the set of all elements in $G$ with length bounded above by some number $n$. This set is exactly the ball of radius $n$ around the identity in the Cayley graph of $G$ with respect to $S$, where the length of an edge labelled by $s$ is the weight of $s$. The growth of $G$ with respect to $S$ and our chosen weight is, roughly speaking, the growth rate of the volume of this ball as we let the radius increase. We define an equivalence relation on growth functions such that, no matter how we weight group generators, the growth of $G$ always falls into the same equivalence class. We can then classify the growth of any finitely generated group without having to specify the generating set or the weights we are working with.

We should note that using weights to define growth is not standard. Usually we would simply assign a value of 1 to every generator. However, we will show that different weights can give us more precise information about the growth of a group, so a definition of group growth using weights will be useful.

The historical motivation for studying the growth of finitely generated groups lies in differential geometry. In 1955, Švarc showed in [14] (as discussed in [7]) that the growth rate of the volume of a ball in the universal cover of a compact Riemannian manifold is equivalent to the growth of the fundamental group of that manifold. Milnor and Wolf [12, 15] also noted a relationship between the curvature of a compact Riemannian manifold and the growth of its fundamental group: Milnor showed that the fundamental group of a compact manifold of negative curvature has exponential growth, and (as discussed by Wolf in [15]) that bounds on the curvature of a Riemannian manifold result in bounds on the growth of the fundamental group of the manifold.

In 1968, Milnor posed a question in [13] asking whether there are any groups whose growth is intermediate, that is, not equivalent to any polynomial or exponential. The question was answered by Grigorchuk in [5, 6] in 1983 with the construction of uncountably many finitely generated groups of intermediate growth. We will construct a group of intermediate growth often called the first Grigorchuk group or simply Grigorchuk’s group. We will discuss some of the group’s properties and show it has intermediate growth. Grigorchuk originally conjectured that the growth of the group was equivalent to $e^{\sqrt{n}}$, a conjecture which was disproved in 2000 when Leonov and Bartholdi [10, 1] both found that Grigorchuk’s group grows strictly more quickly than $e^{\sqrt{n}}$. We will discuss various bounds on the growth of Grigorchuk’s group, and finally note that the exact nature of the growth of Grigorchuk’s group is unknown.

2 Definitions and useful facts about group growth

We begin by defining our terms and notation.

Notation 2.1. Throughout this work, we will be using $\log$ for the natural logarithm and $e$ for the identity element of a group. We also say that the set of natural numbers $\mathbb{N}$ does not contain 0.

Definition 2.2 (Alphabets and words). An alphabet is a non-empty set whose elements we sometimes call letters. A word over an alphabet $\Sigma$ is a finite sequence of letters $(s_1, \dots, s_k)$ where $s_i \in \Sigma$. We usually omit the parentheses and commas and write $s_1 \cdots s_k$ for the word $(s_1, \dots, s_k)$. If $w, w'$ are words over the alphabet $\Sigma$ then we denote their concatenation by $ww'$. For $n \in \mathbb{N}$ we let the word $w^n$ denote the concatenation of the word $w$ with itself $n$ times. We let $\Sigma^m$ denote the set of all words on $\Sigma$ with exactly $m$ letters, and $\Sigma^*$ denote the set of all words on $\Sigma$. The empty word is denoted $\varepsilon$.

For this section, let $G$ be a finitely generated group with finite generating set $S$, and define $S^{-1} = \{s^{-1} : s \in S\}$. Then all elements of $G$ can be represented by words over the alphabet $S \cup S^{-1}$ in the obvious way. There may be two words which are not equal as words but which represent the same group element. The identity element $e$ is represented by the empty word $\varepsilon$.

Definition 2.3 (Weights, word lengths and group element lengths). A weight $\delta$ on $S$ is a function $\delta : S \cup S^{-1} \cup \{\varepsilon\} \to \mathbb{R}_{\geq 0}$ such that $\delta(s) = \delta(s^{-1}) > 0$ for all $s \in S$ and $\delta(\varepsilon) = 0$. Define the length $|w|$ of the word $w = s_1 \cdots s_n$ on the alphabet $S \cup S^{-1}$ to be $|w| = n$, the number of letters in the word.

Define the weight $\delta(w)$ of the word $w = s_1 \cdots s_n$ to be $\delta(w) = \delta(s_1) + \cdots + \delta(s_n)$. For $g \in G$, the length of $g$ is $l_\delta(g) = \min\{\delta(w) : w \text{ is a word representing } g\}$. That is, $l_\delta(g)$ is the minimum weight of a word over the alphabet $S \cup S^{-1}$ representing $g$. We write $l(g)$ for $l_\delta(g)$ where it is obvious which weight function we are using.

Definition 2.4 (Group growth). Let $\delta$ be a weight on $S$. Define the ball of radius $n$ around the identity to be $B_\delta(n) := \{g \in G : l_\delta(g) \leq n\}$ for $n \in \mathbb{R}_{\geq 0}$. The growth function $\gamma_\delta : \mathbb{R}_{\geq 0} \to \mathbb{N}$ is defined by $\gamma_\delta(n) := |B_\delta(n)|$. We write $B = B_\delta$ and $\gamma = \gamma_\delta$ where there is no ambiguity. Note that $B(0) = \{e\}$ and so $\gamma(0) = 1$.
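As a small illustration of Definition 2.4 (not taken from the original text), the following Python sketch computes the growth function by breadth-first search in a familiar example. The choice of the group $\mathbb{Z}^2$ with generating set $\{(1,0), (0,1)\}$, the standard weight, and the function name `growth_function` are all assumptions made for this example.

```python
from collections import deque

def growth_function(generators, max_radius):
    """Return [gamma(0), ..., gamma(max_radius)] for the subgroup of Z^2
    generated by `generators`, using the standard weight (every generator
    and every inverse of a generator has weight 1)."""
    steps = list(generators) + [(-x, -y) for (x, y) in generators]
    origin = (0, 0)
    length = {origin: 0}            # l(g) for every element found so far
    queue = deque([origin])
    while queue:
        g = queue.popleft()
        if length[g] == max_radius:
            continue                # stay inside the ball of radius max_radius
        for (dx, dy) in steps:
            h = (g[0] + dx, g[1] + dy)
            if h not in length:     # the first visit gives the minimal length
                length[h] = length[g] + 1
                queue.append(h)
    return [sum(1 for l in length.values() if l <= n)
            for n in range(max_radius + 1)]

gammas = growth_function([(1, 0), (0, 1)], 5)
```

Here the ball of radius $n$ is the set of lattice points with $|x| + |y| \leq n$, so the computed values follow the quadratic $2n^2 + 2n + 1$.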

It is quite unusual to define growth using weights, and for the most part, we will be using the standard weight defined by $\delta(s) = 1$ for all $s \in S \cup S^{-1}$. However, different weights will be useful to us when trying to achieve tight bounds on the growth of Grigorchuk’s group.

We define an equivalence relation on functions so that we can classify the growth functions of groups.

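Definition 2.5. Given two functions $\gamma, \gamma' : \mathbb{R}_{\geq 0} \to \mathbb{N}$, say that $\gamma$ does not grow more quickly than $\gamma'$, written $\gamma \precsim \gamma'$, if there exist constants $\alpha, C > 0$ such that $\gamma(n) \leq C\gamma'(\alpha n)$ for all $n > 0$. We say $\gamma$ and $\gamma'$ are equivalent, written $\gamma \sim \gamma'$, if both $\gamma \precsim \gamma'$ and $\gamma' \precsim \gamma$.

Proposition 2.6. $\sim$ is an equivalence relation.

Proof. For reflexivity, we obviously have $\gamma \precsim \gamma$, so $\gamma \sim \gamma$. For symmetry, note that if $\gamma \sim \gamma'$ then $\gamma' \precsim \gamma$ and $\gamma \precsim \gamma'$, and so $\gamma' \sim \gamma$. Transitivity of $\sim$ relies on the transitivity of $\precsim$, which we will show first: if $\gamma \precsim \gamma'$ and $\gamma' \precsim \gamma''$ then there exist constants $C, D, \alpha, \beta > 0$ such that $\gamma(n) \leq C\gamma'(\alpha n)$ and $\gamma'(n) \leq D\gamma''(\beta n)$, so $\gamma(n) \leq C\gamma'(\alpha n) \leq CD\gamma''(\alpha\beta n)$. Therefore $\gamma \precsim \gamma''$. So if $\gamma \sim \gamma'$ and $\gamma' \sim \gamma''$ then $\gamma \precsim \gamma' \precsim \gamma''$ and $\gamma'' \precsim \gamma' \precsim \gamma$, which gives us that $\gamma \sim \gamma''$ and so $\sim$ is transitive.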

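Proposition 2.7 (The growth function is well-defined, [2, Lemma 2]). If $S_1$ and $S_2$ are two finite generating sets of the group $G$ with weights $\delta_1$ on $S_1$ and $\delta_2$ on $S_2$ then

$$\gamma_{\delta_1} \sim \gamma_{\delta_2}.$$

Proof. It is enough to show that $\gamma_{\delta_1} \precsim \gamma_{\delta_2}$. The other direction follows by symmetry. Let

$$M := \max\left\{ \frac{l_{\delta_2}(s)}{\delta_1(s)} : s \in S_1 \right\} > 0.$$

Let $w = s_1 \cdots s_k$ (where $s_i \in S_1$) be a word representing $g \in G$ such that $\delta_1(w) = l_{\delta_1}(g)$, that is, $w$ is a minimal-weight word in $S_1^*$ representing $g$. Then $l_{\delta_1}(g) = \sum_{i=1}^{k} \delta_1(s_i)$.

Note that, for $i = 1, \dots, k$, we have $M \geq \frac{l_{\delta_2}(s_i)}{\delta_1(s_i)}$ and so $M\delta_1(s_i) \geq l_{\delta_2}(s_i)$. Therefore

$$l_{\delta_2}(g) \leq \sum_{i=1}^{k} l_{\delta_2}(s_i) \leq \sum_{i=1}^{k} M\delta_1(s_i) = M\, l_{\delta_1}(g).$$

If $g \in B_{\delta_1}(n)$ then $l_{\delta_1}(g) \leq n$, so $l_{\delta_2}(g) \leq M\, l_{\delta_1}(g) \leq Mn$, giving us that $g \in B_{\delta_2}(Mn)$.

So $B_{\delta_1}(n) \subseteq B_{\delta_2}(Mn)$, therefore $\gamma_{\delta_1}(n) \leq \gamma_{\delta_2}(Mn)$ and so $\gamma_{\delta_1} \precsim \gamma_{\delta_2}$.

We get $\gamma_{\delta_2} \precsim \gamma_{\delta_1}$ by symmetry, and so $\gamma_{\delta_1} \sim \gamma_{\delta_2}$.

Because the growth function is a well-defined property of the group up to equivalence, we can classify groups by their growth.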
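The inclusion $B_{\delta_1}(n) \subseteq B_{\delta_2}(Mn)$ from the proof of Proposition 2.7 can be checked numerically in a toy case. The sketch below is an illustration under assumptions that do not appear in the source: the group is $\mathbb{Z}$, the generating sets are $S_1 = \{1\}$ and $S_2 = \{2, 3\}$, both weights are standard, and the helper names `word_lengths` and `ball` are hypothetical.

```python
from collections import deque

def word_lengths(generators, max_radius):
    """Map each integer g with l(g) <= max_radius to l(g), where Z is
    generated by `generators` and every generator has weight 1."""
    steps = list(generators) + [-s for s in generators]
    length = {0: 0}
    queue = deque([0])
    while queue:
        g = queue.popleft()
        if length[g] == max_radius:
            continue
        for s in steps:
            h = g + s
            if h not in length:      # first visit gives the minimal length
                length[h] = length[g] + 1
                queue.append(h)
    return length

def ball(lengths, n):
    """The ball of radius n for the given table of lengths."""
    return {g for g, l in lengths.items() if l <= n}

N = 6
len1 = word_lengths([1], N)           # lengths with respect to S1 = {1}
len2 = word_lengths([2, 3], 2 * N)    # lengths with respect to S2 = {2, 3}

# M is the maximum of l_2(s) / delta_1(s) over s in S1; here S1 = {1}
# and delta_1(1) = 1, so M is the S2-length of the element 1 (1 = 3 - 2).
M = len2[1]
inclusions = [ball(len1, n) <= ball(len2, M * n) for n in range(N + 1)]
```

In this example $M = 2$, and every entry of `inclusions` is `True`, as the proposition predicts.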

Definition 2.8. Let $G$ be an infinite finitely generated group with growth function $\gamma$. The group $G$ has polynomial growth if $\gamma(n) \sim n^\beta$ for some $\beta > 0$. The group $G$ has exponential growth if $\gamma(n) \sim e^n$. The group $G$ has intermediate growth if it has neither exponential nor polynomial growth.

We will now show that we could equivalently have defined a group to be of intermediate growth if neither $\gamma(n) \precsim n^\beta$ for any $\beta > 0$ nor $e^n \precsim \gamma(n)$.

Proposition 2.9. Let $G$ be an infinite finitely generated group with finite generating set $S$. Let $\delta$ be a weight defined on $S$ and $\gamma$ be the growth function of $G$ with respect to $\delta$. Then $G$ has intermediate growth if and only if neither $\gamma(n) \precsim n^\beta$ for any $\beta > 0$ nor $e^n \precsim \gamma(n)$.

We will use the following lemmas in our proof:

Lemma 2.10 ([8, Exercise 1.1]). Let $G$ be an infinite finitely generated group with finite generating set $S$, and use the standard weight defined by $\delta(s) = 1$ for all $s \in S$. Let $\gamma$ be the growth function of $G$ with respect to the weight $\delta$.

(i) If $n \in \mathbb{Z}_{\geq 0}$ then $\gamma(n + 1) \geq \gamma(n) + 1$.

(ii) If $m \in \mathbb{R}_{\geq 0}$ and $n \in \mathbb{Z}_{\geq 0}$ are such that $n + 1 > m \geq n$ then $\gamma(m) = \gamma(n)$.

Proof. (i) It is sufficient to show that $\gamma(n + 1) > \gamma(n)$ because $\gamma(n)$ is always an integer. Since $B(n) \subseteq B(n + 1)$, we have $\gamma(n + 1) \geq \gamma(n)$. If there is some $n$ such that $\gamma(n + 1) = \gamma(n)$ then there is no element $g \in G$ of length $l(g) > n$ (else a minimal-length word representing $g$ would have a subword of length $n + 1$ which is a minimal-length word representing an element of $G$). This tells us that $G$ is finite, which contradicts our assumption that $G$ is infinite. Therefore we must conclude that $\gamma(n + 1) > \gamma(n)$.

(ii) We are using the standard weight and so the lengths of group elements are integers. Therefore there are no group elements of length $k$ if $n + 1 > k > n$, and so $\gamma(m) = \gamma(n)$ for $n + 1 > m \geq n$.

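Lemma 2.11 (There are no groups which grow strictly more slowly than all polynomials). Let $G$ be an infinite finitely generated group with growth function $\gamma$. Then $m \precsim \gamma(m)$.

Proof. Fix a finite generating set $S$ of $G$ and let $\gamma_{\delta_1}$ be the growth function with respect to the standard weight defined by $\delta_1(s) = 1$ for all $s \in S$. It is enough to show that $m \precsim \gamma_{\delta_1}(m)$ since by Proposition 2.7, we know $m \precsim \gamma_{\delta_1}(m) \sim \gamma(m)$ and so $m \precsim \gamma(m)$.

First we will show that $\gamma_{\delta_1}(n) \geq n + 1$ for all $n \in \mathbb{Z}_{\geq 0}$. Use induction on $n$, noting for the base case that $\gamma_{\delta_1}(0) = 1$. For $n \geq 1$ we have $\gamma_{\delta_1}(n) \geq \gamma_{\delta_1}(n - 1) + 1$ by Lemma 2.10(i). By the inductive hypothesis, $\gamma_{\delta_1}(n - 1) \geq n$ and so $\gamma_{\delta_1}(n) \geq n + 1$.

Now let $m \in \mathbb{R}_{\geq 0}$. There exists $n \in \mathbb{Z}_{\geq 0}$ such that $n + 1 > m \geq n$, and so $\gamma_{\delta_1}(m) = \gamma_{\delta_1}(n)$ by Lemma 2.10(ii). Therefore $\gamma_{\delta_1}(m) = \gamma_{\delta_1}(n) \geq n + 1 > m$ for all $m \in \mathbb{R}_{\geq 0}$.

Therefore $m \precsim \gamma_{\delta_1}(m)$, and so $m \precsim \gamma(m)$ as required.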

Lemma 2.12. For all $a, b > 1$, we have $a^n \sim b^n$.

Proof. Let $\alpha := \log_b a$ and $\beta := \log_a b$. Then $a^n = b^{\alpha n}$ and $b^n = a^{\beta n}$, so $a^n \precsim b^n$ and $b^n \precsim a^n$. Therefore $a^n \sim b^n$.

Lemma 2.13 (There are no groups which grow strictly more quickly than an exponential). Let $G$ be an infinite finitely generated group with growth function $\gamma$. Then $\gamma(n) \precsim e^n$.

Proof. Let $S$ be a finite generating set of $G$ and let $k = |S|$. By Proposition 2.7, we can use any weight we want to define the growth function of $G$. We will use the standard weight $\delta$ defined by $\delta(s) = 1$ for all $s \in S$ and note that $\gamma \sim \gamma_\delta$ is the growth function of $G$. Then let $\Sigma := S \cup S^{-1} \cup \{\varepsilon\}$. Every element $g$ of length $l(g) \leq n$ can be written as a word of exactly $n$ letters on the alphabet $\Sigma$ (using the letter $\varepsilon$ if $l(g) < n$). There are at most $(2k+1)^n$ words in $\Sigma^n$ and so there are at most $(2k+1)^n$ elements $g$ of $G$ such that $l(g) \leq n$, so $\gamma_\delta(n) \leq (2k+1)^n$. Therefore $\gamma(n) \sim \gamma_\delta(n) \precsim (2k+1)^n$. By Lemma 2.12, $(2k+1)^n \precsim e^n$, so by transitivity of $\precsim$, we have $\gamma(n) \precsim e^n$.
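The counting bound $\gamma_\delta(n) \leq (2k+1)^n$ from this proof can be compared with an actual growth function. The sketch below uses the free group of rank $k = 2$ with its standard generators, an example chosen here for illustration and not discussed in the source; it relies on the standard fact that distinct freely reduced words represent distinct elements of a free group. The function name `free_group_ball_size` is hypothetical.

```python
def free_group_ball_size(rank, n):
    """Number of elements of length <= n in the free group of the given
    rank with its standard generators, counted as freely reduced words."""
    letters = [(i, sign) for i in range(rank) for sign in (1, -1)]
    sphere = [()]                  # reduced words of the current length
    total = 1                      # the empty word represents the identity
    for _ in range(n):
        next_sphere = []
        for w in sphere:
            for (i, sign) in letters:
                # appending the inverse of the last letter is not reduced
                if w and w[-1] == (i, -sign):
                    continue
                next_sphere.append(w + ((i, sign),))
        sphere = next_sphere
        total += len(sphere)
    return total

k = 2
sizes = [free_group_ball_size(k, n) for n in range(6)]
bounds = [(2 * k + 1) ** n for n in range(6)]
```

For rank 2 the ball sizes are $2 \cdot 3^n - 1$, which stay below $5^n$ while still growing exponentially.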

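Proof of Proposition 2.9. First, assume that $G$ has intermediate growth. Then $\gamma(n) \nsim e^n$ and so we cannot have both $\gamma(n) \precsim e^n$ and $e^n \precsim \gamma(n)$. By Lemma 2.13, $\gamma(n) \precsim e^n$ and so we cannot have $e^n \precsim \gamma(n)$. Similarly, $\gamma(n) \nsim n^\beta$ for any $\beta > 0$, and so we cannot have $\alpha, \beta > 0$ such that $\gamma(n) \precsim n^\beta$ and $n^\alpha \precsim \gamma(n)$. By Lemma 2.11, $n \precsim \gamma(n)$ and so we cannot have $\gamma(n) \precsim n^\beta$ for any $\beta > 0$. So if $G$ has intermediate growth then neither $\gamma(n) \precsim n^\beta$ for any $\beta > 0$ nor $e^n \precsim \gamma(n)$.

For the converse, assume neither $\gamma(n) \precsim n^\beta$ for any $\beta > 0$ nor $e^n \precsim \gamma(n)$. Then $\gamma(n) \nsim e^n$ and $\gamma(n) \nsim n^\beta$ for any $\beta > 0$, so $G$ has intermediate growth.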

3 Definition of Grigorchuk’s group

Here, we construct Grigorchuk’s group in two different and useful ways and show our definitions are equivalent.

Definition 3.1. Let $T$ be a directed graph with vertex set $V(T) = \{0, 1\}^*$ and edge set $E(T) = \{(v, v0), (v, v1) : v \in V(T)\}$. This is an infinite binary rooted tree with root $\varepsilon$, the empty word on $\{0, 1\}$. The vertex $v$ is the parent of $v0$ and $v1$, and $v0$, $v1$ are the children of $v$ and siblings to each other. We say $v0$ is the left child of $v$, and $v1$ is the right child, as shown in Figure 1. For any vertex $v \in V(T)$, we define the level $|v|$ of $v$ as the number of letters of the string $v$. We write $T_v$ for the subtree of $T$ rooted at $v \in V(T)$, that is, the subtree of $T$ induced by $V(T_v) = \{vw : w \in V(T)\}$.

We will be looking at the group $\mathrm{Aut}(T)$ of graph automorphisms of the tree $T$. Because graph automorphisms have to preserve edge relations, they act only by swapping two children of the same parent.

We will use right-multiplication of automorphisms: for $\pi_0, \pi_1 \in \mathrm{Aut}(T)$, we have $\pi_0\pi_1 = \pi_0 \cdot \pi_1 = \pi_0 \circ \pi_1$, that is, $\pi_0\pi_1(v) = \pi_0(\pi_1(v))$.

We construct Grigorchuk’s group by defining four automorphisms of $T$. Grigorchuk’s group is generated by these four automorphisms.

Figure 1: The first three levels of the tree T (root ε; level 1: 0, 1; level 2: 00, 01, 10, 11; level 3: 000, 001, 010, 011, 100, 101, 110, 111). This figure is a reproduction of [8, Figure 1].

For every $v \in V(T)$, we define the automorphism $a_v \in \mathrm{Aut}(T)$ which swaps the two subtrees $T_{v0}$ and $T_{v1}$ of $T$ by

$$a_v(w) = \begin{cases} v\bar{x}u & \text{if } w = vxu \text{ for } u \in \{0, 1\}^* \text{ and } x \in \{0, 1\} \\ w & \text{otherwise} \end{cases} \qquad \text{where} \qquad \bar{x} = \begin{cases} 1 & \text{if } x = 0 \\ 0 & \text{if } x = 1. \end{cases}$$

For simplicity of notation, we let $a = a_\varepsilon$ be the tree automorphism which swaps the subtrees $T_0$ and $T_1$. Note that $a_v$ and $a_w$ commute if neither $T_v$ nor $T_w$ is a subtree of the other, that is, there is no $u \in V(T)$ such that $v = wu$ or $w = vu$. To see this, note that $a_v$ acts as the identity on $T_w$ and $a_w$ acts as the identity on $T_v$ since $v$ is not of the form $wxz$ and $w$ is not of the form $vxz$ for any $x \in \{0, 1\}$ and $z \in \{0, 1\}^*$.

Definition 3.2. Grigorchuk’s group $G = \langle a, b, c, d \rangle$ is generated by four automorphisms of $T$ where $a = a_\varepsilon$ as above and $b$, $c$, $d$ are defined by

$$\begin{aligned}
b &:= \prod_{i=0}^{\infty} a_{1^{3i}0} \circ \prod_{i=0}^{\infty} a_{1^{3i+1}0} \\
c &:= \prod_{i=0}^{\infty} a_{1^{3i}0} \circ \prod_{i=0}^{\infty} a_{1^{3i+2}0} \\
d &:= \prod_{i=0}^{\infty} a_{1^{3i+1}0} \circ \prod_{i=0}^{\infty} a_{1^{3i+2}0}.
\end{aligned}$$

Figure 2: Illustration of the action of automorphisms b, c and d on the tree T. Each automorphism swaps pairs of white subtrees and acts as the identity on pairs of black subtrees in the figure. This figure is a reproduction of [8, Figure 4].

Although these automorphisms are infinite products of automorphisms, they are well-defined because they are well-defined on every vertex $v \in V(T)$: all but at most one of the subtree-swapping automorphisms $a_u$ will act as the identity on $v$ because $v$ cannot be in more than one of the subtrees $\{T_{1^n 0} : n = 0, 1, 2, \dots\}$, and the subtree-swapping automorphisms here commute with each other by the facts we noted above. Since $v$ is a finite string in $\{0, 1\}^*$, each of $b$, $c$ and $d$ acts as the identity or as $a_{1^n 0}$ on $v$ for some $n \geq 0$. We think about $b$, $c$ and $d$ as swapping subtrees of left branches from the main right branch $(1, 1, 1, \dots)$. This is illustrated in Figure 2.

We now provide an alternative way of defining Grigorchuk’s group, and show the definition is equivalent to our first definition. This will be useful for proving different statements about the group.
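To make the action of the generators concrete, here is a minimal Python sketch of Definition 3.2 acting on vertices written as binary strings. The encoding of vertices as Python strings and the helper names `swap_at`, `_branch` and `vertices` are assumptions of this example, not part of the source. The sketch uses the observation above that at most one factor $a_{1^n 0}$ moves a given vertex.

```python
from itertools import product

def flip(x):
    return "1" if x == "0" else "0"

def swap_at(v, w):
    """The automorphism a_v applied to the vertex w: flip the letter of w
    that follows the prefix v, if there is such a letter."""
    if w.startswith(v) and len(w) > len(v):
        k = len(v)
        return w[:k] + flip(w[k]) + w[k + 1:]
    return w

def a(w):
    return swap_at("", w)

def _branch(w, residues):
    """Apply the product of the a_{1^n 0} with n mod 3 in `residues`.
    Only the factor whose index 1^n 0 is a prefix of w can move w."""
    n = len(w) - len(w.lstrip("1"))        # number of leading 1s
    if n < len(w) and n % 3 in residues:   # w[n] is "0" whenever n < len(w)
        return swap_at("1" * n + "0", w)
    return w

def b(w): return _branch(w, {0, 1})
def c(w): return _branch(w, {0, 2})
def d(w): return _branch(w, {1, 2})

def vertices(depth):
    """All vertices of level at most `depth`."""
    return ["".join(p) for k in range(depth + 1) for p in product("01", repeat=k)]

# Each generator should be its own inverse on every vertex we test.
involutions = all(f(f(v)) == v for f in (a, b, c, d) for v in vertices(6))
```

For instance, `b("00")` returns `"01"` because $b$ swaps the children of the vertex 0, while `c("100")` returns `"100"` because $c$ contains no factor $a_{10}$.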

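There is a natural graph isomorphism $\iota_v : T \to T_v$ defined by $\iota_v(w) = vw$, so $T \cong T_v$ for any $v \in V(T)$. This isomorphism induces a group isomorphism which we can unambiguously denote $\iota_v : \mathrm{Aut}(T) \to \mathrm{Aut}(T_v)$, defined by $\iota_v(\pi) = \iota_v \circ \pi \circ \iota_v^{-1}$, so $\mathrm{Aut}(T) \cong \mathrm{Aut}(T_v)$. Now we may define the map

$$\varphi : \mathrm{Aut}(T) \times \mathrm{Aut}(T) \to \mathrm{Aut}(T), \qquad \varphi : (\pi_0, \pi_1) \mapsto \iota_0(\pi_0) \circ \iota_1(\pi_1).$$

Intuitively, $\varphi(\pi_0, \pi_1)$ applies $\pi_0$ to the left subtree $T_0$ of $T$ and $\pi_1$ to the right subtree $T_1$ of $T$. It is easy to see that $\varphi$ is a homomorphism on the direct product, that is, $\varphi(\pi_0, \pi_1)\varphi(\rho_0, \rho_1) = \varphi(\pi_0\rho_0, \pi_1\rho_1)$. Therefore, if $\mathrm{ord}(\pi_0), \mathrm{ord}(\pi_1) < \infty$ then $\mathrm{ord}(\varphi(\pi_0, \pi_1)) = \mathrm{lcm}(\mathrm{ord}(\pi_0), \mathrm{ord}(\pi_1))$.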

Definition 3.3 (Alternative definition of $b$, $c$ and $d$). We could have defined $b$, $c$, and $d$ recursively and simultaneously by

$$b = \varphi(a, c), \qquad c = \varphi(a, d), \qquad d = \varphi(e, b) \tag{$\star$}$$

where $e \in \mathrm{Aut}(T)$ is the identity automorphism on $T$.

Proposition 3.4 (Our two definitions are equivalent, [8, Exercise 5.2]). The original definition of $b$, $c$ and $d$ is the unique solution to the simultaneous equations $(\star)$.

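Proof. We note that $\iota_0(a_v) = a_{0v}$ and $\iota_1(a_v) = a_{1v}$ for all $v$, and also that the automorphisms $a_{1^m 0}$ commute with each other for $m \geq 0$ since if $n \neq m$ then $T_{1^m 0}$ cannot be a subtree of $T_{1^n 0}$ and vice versa. Now we substitute $b$, $c$ and $d$ into the equations $(\star)$ to show that we have a solution.

$$\begin{aligned}
\varphi(a, c) &= \varphi\Big(a, \prod_{i=0}^{\infty} a_{1^{3i}0} \circ \prod_{i=0}^{\infty} a_{1^{3i+2}0}\Big) \\
&= \iota_0(a) \circ \iota_1\Big(\prod_{i=0}^{\infty} a_{1^{3i}0} \circ \prod_{i=0}^{\infty} a_{1^{3i+2}0}\Big) \\
&= \iota_0(a) \circ \prod_{i=0}^{\infty} \iota_1(a_{1^{3i}0}) \circ \prod_{i=0}^{\infty} \iota_1(a_{1^{3i+2}0}) && \text{since } \iota_1 \text{ is a homomorphism} \\
&= a_0 \circ \prod_{i=0}^{\infty} a_{1^{3i+1}0} \circ \prod_{i=0}^{\infty} a_{1^{3i+3}0} && \text{definition of } \iota_1 \\
&= \Big(a_0 \circ \prod_{i=0}^{\infty} a_{1^{3i+3}0}\Big) \circ \prod_{i=0}^{\infty} a_{1^{3i+1}0} && \text{by commutativity of the } a_{1^m 0} \\
&= \prod_{i=0}^{\infty} a_{1^{3i}0} \circ \prod_{i=0}^{\infty} a_{1^{3i+1}0} \\
&= b.
\end{aligned}$$

Similarly, we can verify the solution for the other two equations.

For uniqueness, we show that if $b$, $c$ and $d$ are defined simultaneously and recursively by $(\star)$ then they are the automorphisms in our original definition of Grigorchuk’s group.

$$\begin{aligned}
b = \varphi(a, c) &= \varphi(a, \varphi(a, d)) && \text{definition of } c \\
&= \varphi(a, \varphi(a, \varphi(e, b))) && \text{definition of } d \\
&= \iota_0(a) \circ \iota_1\big(\iota_0(a) \circ \iota_1(\iota_0(e) \circ \iota_1(b))\big) && \text{definition of } \varphi \\
&= \iota_0(a) \circ \iota_1(\iota_0(a)) \circ \iota_1(\iota_1(\iota_0(e))) \circ \iota_1(\iota_1(\iota_1(b))) && \text{expand brackets using } \iota_v \\
&= a_0 \circ a_{10} \circ \iota_1(\iota_1(\iota_1(b))) \\
&= a_0 \circ a_{10} \circ a_{1^3 0} \circ a_{1^4 0} \circ \iota_1(\iota_1(\iota_1(\iota_1(\iota_1(\iota_1(b)))))) && \text{recursively evaluate } b \text{ on } T_{111} \\
&= \cdots && \text{evaluate } b \text{ on } T_{1^6} \text{ and repeat} \\
&= \prod_{i=0}^{\infty} a_{1^{3i}0} \circ \prod_{i=0}^{\infty} a_{1^{3i+1}0} && \text{rearranging since the } a_{1^m 0} \text{ commute.}
\end{aligned}$$

Similarly, the other two equations give us our original definitions of $c$ and $d$, and so we have uniqueness.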

We will use the two definitions of b, c, and d interchangeably.
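The equivalence of the two definitions can also be checked by machine on finitely many vertices. The following Python sketch implements the recursive equations of Definition 3.3 alongside the explicit products of Definition 3.2 and compares them on all vertices of level at most 8. The string encoding of vertices and the helper names `phi` and `explicit` are assumptions of this example; a finite check of this kind is a sanity test, not a substitute for the proof of Proposition 3.4.

```python
from itertools import product

def a(w):
    """Swap the subtrees T_0 and T_1: flip the first letter of w."""
    return w if w == "" else ("1" if w[0] == "0" else "0") + w[1:]

def e(w):
    return w

def phi(left, right):
    """phi(left, right) acts by `left` on T_0 and by `right` on T_1."""
    def act(w):
        if w == "":
            return w
        return w[0] + (left if w[0] == "0" else right)(w[1:])
    return act

# Recursive definitions b = phi(a, c), c = phi(a, d), d = phi(e, b).
# Every call strips one letter, so the recursion stops on finite vertices.
def b(w): return phi(a, c)(w)
def c(w): return phi(a, d)(w)
def d(w): return phi(e, b)(w)

def explicit(residues):
    """Product of the a_{1^n 0} with n mod 3 in `residues`."""
    def act(w):
        n = len(w) - len(w.lstrip("1"))            # number of leading 1s
        if n + 1 < len(w) and n % 3 in residues:   # a letter follows 1^n 0
            k = n + 1
            return w[:k] + ("1" if w[k] == "0" else "0") + w[k + 1:]
        return w
    return act

vertices = ["".join(p) for k in range(9) for p in product("01", repeat=k)]
agree = all(
    recursive(v) == explicit(residues)(v)
    for recursive, residues in ((b, {0, 1}), (c, {0, 2}), (d, {1, 2}))
    for v in vertices
)
```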

4 Some properties of Grigorchuk’s group

In this chapter we will show that $G$ is an infinite torsion group and lay some groundwork for our proof that $G$ has intermediate growth which appears in the next chapter. We are interested in infinite torsion groups because of the Generalised Burnside Problem which was posed by Burnside [3] in 1902 and asks whether finitely generated infinite torsion groups exist. Grigorchuk’s group is not the first known example of such a group, but it is a well-known one.

Lemma 4.1. $a$, $b$, $c$ and $d$ all have order 2.

Proof. Obviously none of these automorphisms is the identity.

Intuitively, $a_v$ swaps $T_{v0}$ and $T_{v1}$ and so obviously $a_v$ has order 2 for all $v \in V(T)$. More formally, $a_v^2(vxu) = a_v(v\bar{x}u) = v\bar{\bar{x}}u = vxu$ for all $v, u \in \{0, 1\}^*$ and $x \in \{0, 1\}$, and $a_v^2(w) = a_v(w) = w$ if $w$ is not of the form $vxu$. So $a_v$ is of order 2 for all $v \in V(T)$.

Note $a = a_\varepsilon$, so $a$ has order 2. Since $b$, $c$ and $d$ are all products of the $a_{1^m 0}$, and these automorphisms all commute with each other, we also have that $b$, $c$ and $d$ have order 2.

Lemma 4.2 ([8, Exercise 5.3]). $bcd = e$.

Proof. Since the $a_{1^m 0}$ commute with each other and all have order 2,

$$bcd = \prod_{i=0}^{\infty} a_{1^{3i}0}^2 \circ \prod_{i=0}^{\infty} a_{1^{3i+1}0}^2 \circ \prod_{i=0}^{\infty} a_{1^{3i+2}0}^2 = e.$$

Corollary 4.3 ([8, Exercise 5.3]). $G$ is 3-generated.

Proof. Since $c^{-1} = c$ we have $bd = c$ and so $G = \langle a, b, d \rangle$.

Lemma 4.4 ([8, Exercise 5.4]). $(ad)^4 = (ac)^8 = (ab)^{16} = e$.

Proof. $ad = a \circ \varphi(e, b)$ applies $b$ to the subtree $T_1$ of $T$ and then swaps the subtrees $T_0$ and $T_1$. Applying this twice, $(ad)^2 = adad = a \circ \varphi(e, b) \circ a \circ \varphi(e, b) = \varphi(b, b)$. So

$$(ad)^4 = \varphi(b, b)^2 = \varphi(b^2, b^2) = \varphi(e, e) = e.$$

Now $c = \varphi(a, d)$, so we get $(ac)^2 = a \circ \varphi(a, d) \circ a \circ \varphi(a, d) = \varphi(da, ad)$. We note that $(da)^2 = \varphi(e, b) \circ a \circ \varphi(e, b) \circ a = \varphi(b, b) = (ad)^2$ and so

$$(ac)^8 = \varphi(da, ad)^4 = \varphi((da)^4, (ad)^4) = \varphi(\varphi(b, b)^2, \varphi(b, b)^2) = \varphi(e, e) = e.$$

Finally, noting that $(ac)^4 = \varphi((da)^2, (ad)^2) = \varphi((ad)^2, (da)^2) = (ca)^4$ and so $(ca)^8 = (ac)^8 = e$, we get

$$(ab)^{16} = (a \circ \varphi(a, c) \circ a \circ \varphi(a, c))^8 = \varphi(ca, ac)^8 = \varphi((ca)^8, (ac)^8) = \varphi(e, e) = e.$$
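The relations of Lemmas 4.1, 4.2 and 4.4 can be sanity-checked on a finite part of the tree. The Python sketch below applies words in the generators to every vertex of level at most 8, using the recursive description of $b$, $c$, $d$. The string encoding and the helper names `apply_word` and `is_identity` are assumptions of this example, and a check on finitely many levels illustrates the lemmas without proving them.

```python
from itertools import product

def a(w):
    """Swap the subtrees T_0 and T_1: flip the first letter of w."""
    return w if w == "" else ("1" if w[0] == "0" else "0") + w[1:]

def e(w):
    return w

def phi(left, right):
    """Act by `left` on T_0 and by `right` on T_1."""
    def act(w):
        if w == "":
            return w
        return w[0] + (left if w[0] == "0" else right)(w[1:])
    return act

def b(w): return phi(a, c)(w)
def c(w): return phi(a, d)(w)
def d(w): return phi(e, b)(w)

GENERATORS = {"a": a, "b": b, "c": c, "d": d}

def apply_word(word, v):
    """Apply the group element represented by `word` to the vertex v.
    The rightmost letter acts first, matching (pi0 pi1)(v) = pi0(pi1(v))."""
    for letter in reversed(word):
        v = GENERATORS[letter](v)
    return v

def is_identity(word, depth=8):
    """Check that `word` fixes every vertex of level at most `depth`."""
    return all(
        apply_word(word, "".join(p)) == "".join(p)
        for k in range(depth + 1)
        for p in product("01", repeat=k)
    )

relations = ["aa", "bb", "cc", "dd", "bcd", "ad" * 4, "ac" * 8, "ab" * 16]
checks = {word: is_identity(word) for word in relations}
```

Lower powers such as `"ad" * 2` or `"ab" * 8` already move vertices within the first few levels, so the exponents in Lemma 4.4 cannot be halved.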

Definition 4.5. A weight $\delta$ on $G$ with generating set $S = \{a, b, c, d\}$ is called triangular if, for all $x, y, z \in \{b, c, d\}$, we have $\delta(x) < \delta(y) + \delta(z)$.

Lemma 4.6 ([2, Lemma 4]). Let $\delta$ be a triangular weight on $G$ with generating set $S = \{a, b, c, d\}$. Let $g \in G$. If $w$ is a word representing the group element $g$ and $\delta(w) = l(g)$ then $w$ is of the form $w = (a)u_1 a u_2 a \cdots a u_n(a)$ where $u_i \in \{b, c, d\}$. (Note that the parentheses mean that $w$ may or may not begin or end with an $a$.)

Proof. Let $w = x_1 x_2 \cdots x_k$ be a word representing the group element $g$ such that $\delta(w) = l(g)$. Assume for a contradiction that $w$ is not of the required form. We are in one or both of the following cases:

(1) there is some $i$ such that $x_i = x_{i+1}$;

(2) there is some $i$ such that $x_i, x_{i+1} \in \{b, c, d\}$.

In case (1), since $a$, $b$, $c$ and $d$ all have order 2, we have $x_i x_{i+1} = x_i^2 = e$. Defining the word $w' := x_1 \cdots x_{i-1} x_{i+2} \cdots x_k$, we see that it represents the group element $g$ and that $\delta(w') = \delta(w) - \delta(x_i) - \delta(x_{i+1}) < \delta(w)$. This contradicts the fact that $\delta(w) = l(g)$, that is, that $w$ is a minimal-weight word representing $g$.

In case (2), we may assume $x_i \neq x_{i+1}$, since otherwise we are in case (1). Noting that $bc = d$, $bd = c$, $cd = b$ we can rewrite $x_i x_{i+1} = t \in \{b, c, d\}$. Defining the word $w' := x_1 \cdots x_{i-1}\, t\, x_{i+2} \cdots x_k$, we see that it represents the group element $g$. Also, $\delta(w') = \delta(w) - (\delta(x_i) + \delta(x_{i+1}) - \delta(t)) < \delta(w)$ because we assumed in the statement of this lemma that $\delta$ is triangular and so $\delta(x_i) + \delta(x_{i+1}) - \delta(t) > 0$. This contradicts the fact that $\delta(w) = l(g)$.

Both cases lead to contradiction, and so $w$ must be of the required form.
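The two rewriting steps used in this proof can be turned into a small procedure that brings any word over $\{a, b, c, d\}$ into alternating form. The Python sketch below is an illustration only: the function name `reduce_word` and the stack-based strategy are choices made for this example, and the output is an alternating word representing the same group element, not necessarily one of minimal weight.

```python
# Products of two distinct letters from {b, c, d}: bc = d, bd = c, cd = b.
# The same holds with the factors swapped, because b, c and d commute.
THIRD = {frozenset("bc"): "d", frozenset("bd"): "c", frozenset("cd"): "b"}

def reduce_word(word):
    """Rewrite a word over {a, b, c, d} into alternating form using
    x x = e for every generator x and x y = z for distinct x, y in {b, c, d}."""
    stack = []
    for letter in word:
        stack.append(letter)
        # keep simplifying the last two letters while a rule applies
        while len(stack) >= 2:
            x, y = stack[-2], stack[-1]
            if x == y:                        # case (1): x x = e
                del stack[-2:]
            elif x != "a" and y != "a":       # case (2): x y = z
                stack[-2:] = [THIRD[frozenset((x, y))]]
            else:
                break
    return "".join(stack)

examples = {w: reduce_word(w) for w in ("bcd", "abbcda", "abacad", "aa")}
```

For example, `"abbcda"` reduces to `"aba"`: the pair `bb` cancels, `cd` becomes `b`, and no further rule applies.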

Definition 4.7. If $w$ is a word representing the group element $g$ and $w$ is of the form

$$(a)u_1 a u_2 a \cdots a u_n(a)$$

where $u_i \in \{b, c, d\}$, we call $w$ an alternating decomposition of $g$. If $\delta(w) = l(g)$, we call $w$ a shortest alternating decomposition of $g$. Note that there may be more than one (shortest) alternating decomposition of $g$.

Given a word $w$ over the alphabet $\{a, b, c, d\}$, define $|w|_a$ to be the number of copies of $a$ in the word $w$. Define $|w|_b$, $|w|_c$ and $|w|_d$ similarly.

Definition 4.8. For all $n \in \mathbb{N}$ define the subgroup $H_n$ of $G$ to be

$$H_n = \mathrm{Stab}_G(n) = \{\pi \in G \mid \pi(v) = v \text{ for all } |v| = n\},$$

the stabiliser of the $n$th level of the tree $T$. We will write $H$ for $H_1$.

Note that if an automorphism of the tree $T$ stabilises the $n$th level, then it must stabilise all of the first $n$ levels. This is because if we swap two subtrees in level $i$, we cannot swap the vertices in level $i + 1$ back again because automorphisms can only swap vertices with their siblings. Therefore $H_i \geq H_{i+1}$ for all $i \in \mathbb{N}$.

Proposition 4.9 ([8, Exercises 4.4 and 6.2]). $[G : H_n] \leq 2^{2^n - 1}$ and $H_n \trianglelefteq G$.

Proof. First we will construct a group $A_n \subseteq \mathrm{Aut}(T)$ such that $|A_n| = 2^{2^n - 1}$, and then we will show that $[G : H_n] \leq |A_n|$. We know that automorphisms of $T$ act by swapping subtrees which share the same parent, that is, for $\tau \in \mathrm{Aut}(T)$ and $v \in \{0, 1\}^*$ and $x \in \{0, 1\}$, either $\tau(vx) = \tau(v)x$ or $\tau(vx) = \tau(v)\bar{x}$. We define the sign of $\tau$ at the vertex $v$ by

$$\omega_v(\tau) = \begin{cases} 0 & \text{if } \tau(vx) = \tau(v)x \\ 1 & \text{if } \tau(vx) = \tau(v)\bar{x}. \end{cases}$$

An automorphism $\tau$ is uniquely determined just by specifying $\omega_v(\tau)$ for all $v \in \{0, 1\}^*$ because we can use this information to find out how $\tau$ acts on any given vertex of $T$. Therefore, we can identify $\tau$ with the tuple $(\omega_v(\tau))_{v \in \{0,1\}^*}$.

Let $A_n = \{\tau \in \mathrm{Aut}(T) : \omega_v(\tau) = 0 \text{ for all } |v| \geq n\}$, which is the set of automorphisms which do not swap any branches of $T$ after the first $n$ levels. This is obviously a subgroup of $\mathrm{Aut}(T)$.

We will use induction on $n$ to show that $|A_n| = 2^{2^n - 1}$ for all $n \in \mathbb{N}$. For the base case $n = 1$, note that $A_1 = \{e, a\}$ since we can only swap the vertices in level 1. Therefore $|A_1| = 2 = 2^{2^1 - 1}$.

For the inductive step, assume that $|A_{n-1}| = 2^{2^{n-1} - 1}$. Given any $\tau \in A_n$, we know that $\omega_v(\tau) = 0$ for all $|v| \geq n$. Construct the automorphism $\tau' \in A_{n-1}$ by setting

$$\omega_v(\tau') = \begin{cases} \omega_v(\tau) & \text{if } |v| < n - 1 \\ 0 & \text{if } |v| \geq n - 1. \end{cases}$$

In constructing $\tau'$, we copy the action of $\tau$ on the first $n - 1$ levels and essentially erase the action of $\tau$ on the $n$th level of $T$. There are $2^{2^{n-1}}$ ways an automorphism can act on the $n$th level of $T$ because there are $\frac{1}{2} 2^n = 2^{n-1}$ pairs of vertices which can be swapped or not. Therefore, there are $2^{2^{n-1}}$ automorphisms $\tau \in A_n$ which give rise to the same automorphism $\tau' \in A_{n-1}$ in the way specified above. So $|A_n| \leq 2^{2^{n-1}} |A_{n-1}|$. Similarly, for each $\rho \in A_{n-1}$, there are $2^{2^{n-1}}$ automorphisms $\rho' \in A_n$ which have $\omega_v(\rho') = \omega_v(\rho)$ for $|v| < n - 1$. Therefore, $|A_n| \geq 2^{2^{n-1}} |A_{n-1}|$ and so $|A_n| = 2^{2^{n-1}} |A_{n-1}|$. By the induction hypothesis, we get

$$|A_n| = 2^{2^{n-1}} \cdot 2^{2^{n-1} - 1} = 2^{2^n - 1}$$

and we are done by induction.

It remains to show that $[G : H_n] \leq |A_n|$. Intuitively, this makes sense because $A_n$ is the group of automorphisms which “only swap vertices in the first $n$ levels” and $H_n$ is a subgroup of the group of automorphisms which do not swap vertices in the first $n$ levels. For $n \in \mathbb{N}$, define the map

$$\theta_n : G \to A_n, \qquad \theta_n : g \mapsto \theta_n(g)$$

where, for $x_1, \dots, x_k \in \{0, 1\}$, letting $v = x_1 \cdots x_k$ we have

$$\theta_n(g)(v) = \begin{cases} g(v) & \text{if } k \leq n \\ g(x_1 \cdots x_n)\, x_{n+1} \cdots x_k & \text{if } k > n, \end{cases}$$

that is, $\theta_n(g)$ is the automorphism which acts as $g$ on the first $n$ levels of $T$ and does not swap any subtrees after the $n$th level. Note that $\ker \theta_n = \{g \in G : g(v) = v \text{ for all } |v| \leq n\} = H_n$, and that $\mathrm{Im}\, \theta_n \leq A_n$. In particular $H_n$ is the kernel of a homomorphism, so $H_n \trianglelefteq G$. By the First Isomorphism Theorem,

$$G / \ker \theta_n = G / H_n \cong \mathrm{Im}\, \theta_n \leq A_n$$

and so $[G : H_n] \leq |A_n| = 2^{2^n - 1}$ as required.

Lemma 4.10. If $g \in G$ then $g \in H$ if and only if there is a word $w$ representing $g$ with $|w|_a$ even.

Proof. First note that if $u \in \{b, c, d\}$ then $u(0) = 0$ and $u(1) = 1$, and that $a(0) = 1$ and $a(1) = 0$. That is, $b$, $c$ and $d$ act as the identity on the first level of the tree and $a$ swaps the branches. Therefore, if $w$ is a word representing $g$ and $|w|_a = m$ then

$$g(0) = w(0) = a^m(0) = \begin{cases} 0 & \text{if } m \text{ is even} \\ 1 & \text{if } m \text{ is odd} \end{cases} \qquad \text{and} \qquad g(1) = w(1) = a^m(1) = \begin{cases} 1 & \text{if } m \text{ is even} \\ 0 & \text{if } m \text{ is odd.} \end{cases}$$

We have $g \in H$ if and only if $g(0) = 0$ and $g(1) = 1$, that is, $g \in H$ if and only if there is a word $w$ representing $g$ with $|w|_a$ even.
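As a numerical companion to Proposition 4.9, the index $[G : H_n]$ can be computed for small $n$ as the order of the permutation group that $a, b, c, d$ induce on the $2^n$ vertices of level $n$, and then compared with the bound $2^{2^n - 1}$. The Python sketch below does this by a breadth-first closure under the generators. The string encoding of vertices and the function name `index_of_level_stabiliser` are assumptions of this example.

```python
from itertools import product

def a(w):
    """Swap the subtrees T_0 and T_1: flip the first letter of w."""
    return w if w == "" else ("1" if w[0] == "0" else "0") + w[1:]

def e(w):
    return w

def phi(left, right):
    """Act by `left` on T_0 and by `right` on T_1."""
    def act(w):
        if w == "":
            return w
        return w[0] + (left if w[0] == "0" else right)(w[1:])
    return act

def b(w): return phi(a, c)(w)
def c(w): return phi(a, d)(w)
def d(w): return phi(e, b)(w)

def index_of_level_stabiliser(n):
    """Order of the permutation group induced by G on level n of the tree,
    which equals the index [G : H_n]."""
    level = ["".join(p) for p in product("01", repeat=n)]
    position = {v: i for i, v in enumerate(level)}
    # each generator as a tuple: entry i is the image of the i-th vertex
    gens = [tuple(position[g(v)] for v in level) for g in (a, b, c, d)]
    identity = tuple(range(len(level)))
    seen = {identity}
    frontier = [identity]
    while frontier:
        new = []
        for perm in frontier:
            for g in gens:
                composed = tuple(g[i] for i in perm)   # perm first, then g
                if composed not in seen:
                    seen.add(composed)
                    new.append(composed)
        frontier = new
    return len(seen)

indices = {n: index_of_level_stabiliser(n) for n in (1, 2, 3)}
bounds = {n: 2 ** (2 ** n - 1) for n in (1, 2, 3)}
```

For $n = 1$ this gives index 2, in line with Lemma 4.10, since only the parity of the number of letters $a$ matters on the first level.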

We have seen that the action of $g$ on the first level of $T$ is determined by the parity of $|w|_a$ for any given word $w$ representing $g$. Therefore, given any two words $w, w'$ representing $g$, we know $|w|_a$ and $|w'|_a$ must have the same parity. This means we can define the following.

Definition 4.11. Let $g \in G$ and pick some word $w$ representing $g$. Define $\sigma_a : G \to \{0, 1\}$ by

$$\sigma_a(g) = \begin{cases} 0 & \text{if } |w|_a \text{ is even} \\ 1 & \text{if } |w|_a \text{ is odd.} \end{cases}$$

Corollary 4.12. If $g \in G$ then $g \in H$ if and only if every word $w$ representing $g$ has $|w|_a$ even.

Proof. Suppose $g \in H$ and let $w$ be any word representing $g$. We have by Lemma 4.10 that there is a word $w'$ representing $g$ with $|w'|_a$ even. Now as we just noted above, $|w|_a$ and $|w'|_a$ have the same parity, so $|w|_a$ is even. The converse is immediate from Lemma 4.10.

Proposition 4.13. $H = \langle b, c, d, aba, aca, ada \rangle$ and $[G : H] = 2$.

Proof. Let $g \in G$. We will show that $g \in H$ or $g \in aH$. If $\sigma_a(g) = 0$ then $g \in H$. Else, if $\sigma_a(g) = 1$, let $w = (a)u_1 a u_2 a \cdots a u_n(a)$ be some alternating decomposition of $g$. Let $h \in G$ be the group element represented by the word $aw$ and note that, as group elements, $g = ew = aaw = ah$. We will show that $\sigma_a(h) = 0$ and so $h \in H$. If the word $w$ begins with $a$ then $h = aw = aau_1 a u_2 a \cdots a u_n(a) = u_1 a u_2 a \cdots a u_n(a)$, so $h$ has an alternating decomposition $v = u_1 a u_2 a \cdots a u_n(a)$. Then $|v|_a = |w|_a - 1$. If the word $w$ does not begin with $a$ then $h = aw = a u_1 a u_2 a \cdots a u_n(a)$, so $h$ has an alternating decomposition $v = a u_1 a u_2 a \cdots a u_n(a)$, and $|v|_a = |w|_a + 1$. That is, $\sigma_a(h) = 1 - \sigma_a(g) = 0$. So $h \in H$ and $g = ah$, therefore $g \in aH$. So either $g \in H$ or $g \in aH$ for all $g \in G$. Therefore, $[G : H] = 2$. Subgroups of index two are always normal subgroups, so $H \trianglelefteq G$, which we already knew from Proposition 4.9.

Obviously $\langle b, c, d, aba, aca, ada \rangle \leq H$ since we can write all of the generators of the group on the left-hand side as words with an even number of letters $a$. To show $H \leq \langle b, c, d, aba, aca, ada \rangle$ we let $h \in H$. By Lemma 4.6 and Lemma 4.10, we know that $h$ can be written as $(a)u_1 a u_2 a \cdots a u_n(a)$ where $u_i \in \{b, c, d\}$ with an even number of copies of $a$, so $h \in \langle b, c, d, aba, aca, ada \rangle$.

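Definition 4.14. Define the group $A := \langle a \rangle = \{e, a\}$. We extend the map $\varphi : \mathrm{Aut}(T) \times \mathrm{Aut}(T) \to \mathrm{Aut}(T)$ to $\varphi : (\mathrm{Aut}(T) \times \mathrm{Aut}(T)) \rtimes A \to \mathrm{Aut}(T)$ by $\varphi(\pi_0, \pi_1; \sigma) = \sigma \circ \varphi(\pi_0, \pi_1)$. In the semi-direct product, $A$ acts by swapping first and second co-ordinates so that $(\mathrm{Aut}(T) \times \mathrm{Aut}(T)) \rtimes A = \mathrm{Aut}(T) \wr A$ is a wreath product.

Proposition 4.15. $\varphi : (\mathrm{Aut}(T) \times \mathrm{Aut}(T)) \rtimes A \to \mathrm{Aut}(T)$ is an isomorphism with inverse $\psi : \mathrm{Aut}(T) \to (\mathrm{Aut}(T) \times \mathrm{Aut}(T)) \rtimes A$.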

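Proof. First, we show that $\varphi$ is a homomorphism, that is, it respects multiplication in $(\mathrm{Aut}(T) \times \mathrm{Aut}(T)) \rtimes A$ which is defined by

$$(\pi_0, \pi_1; \sigma)(\rho_0, \rho_1; \tau) = \begin{cases} (\pi_0 \circ \rho_0, \pi_1 \circ \rho_1; \sigma \circ \tau) & \text{if } \tau = e \\ (\pi_1 \circ \rho_0, \pi_0 \circ \rho_1; \sigma \circ \tau) & \text{if } \tau = a. \end{cases}$$

We get

$$\begin{aligned}
\varphi(\pi_0, \pi_1; \sigma) \circ \varphi(\rho_0, \rho_1; \tau) &= \sigma \circ \iota_0(\pi_0) \circ \iota_1(\pi_1) \circ \tau \circ \iota_0(\rho_0) \circ \iota_1(\rho_1) \\
&= \begin{cases} \sigma \circ \tau \circ \iota_0(\pi_0 \circ \rho_0) \circ \iota_1(\pi_1 \circ \rho_1) & \text{if } \tau = e \\ \sigma \circ \tau \circ \iota_0(\pi_1 \circ \rho_0) \circ \iota_1(\pi_0 \circ \rho_1) & \text{if } \tau = a \end{cases} \\
&= \begin{cases} \varphi(\pi_0 \circ \rho_0, \pi_1 \circ \rho_1; \sigma \circ \tau) & \text{if } \tau = e \\ \varphi(\pi_1 \circ \rho_0, \pi_0 \circ \rho_1; \sigma \circ \tau) & \text{if } \tau = a \end{cases}
\end{aligned}$$

as required. It remains to show that $\varphi$ is an isomorphism. We provide an inverse $\psi$ defined by

$$\psi(\tau) = \begin{cases} (\tau|_{T_0}, \tau|_{T_1}; e) & \text{if } \tau(0) = 0 \\ (\tau|_{T_1}, \tau|_{T_0}; a) & \text{if } \tau(0) = 1 \end{cases}$$

where $\tau|_{T_0}$ and $\tau|_{T_1}$ represent the restriction of $\tau$ to the trees $T_0$ and $T_1$ respectively. It is obvious that $\psi$ is the inverse of $\varphi$ and so $\varphi$ is an isomorphism.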

Proposition 4.16. $\psi(G) \leq (G \times G) \rtimes A$, and we can see $\psi(H)$ as a subgroup of $G \times G$. Now let $P_1, P_2 : G \times G \to G$ be projections onto the first and second co-ordinates respectively. Then $P_1 \circ \psi$ and $P_2 \circ \psi$ are surjections mapping $H$ onto $G$.

Proof. We first write out the action of $\psi$ on the generators $\{b, c, d, aba, aca, ada\}$ of $H$:

$$\psi : \begin{cases} b \mapsto (a, c; e), & aba \mapsto (c, a; e) \\ c \mapsto (a, d; e), & aca \mapsto (d, a; e) \\ d \mapsto (e, b; e), & ada \mapsto (b, e; e). \end{cases}$$

We can see that $\psi(H) \leq (G \times G) \rtimes A$ because the left and right components of $\psi(\langle b, c, d, aba, aca, ada \rangle)$ can be generated by $a$, $b$, $c$ and $d$. Together with $\psi(a) = (e, e; a)$ and $G = H \cup aH$, this gives $\psi(G) \leq (G \times G) \rtimes A$. Since the element of $A$ is always $e$ on $H$, we can see $\psi(H)$ as a subgroup of $G \times G$. Also, $P_1 \circ \psi(H) = G$ since

$$P_1 \circ \psi(H) \geq \langle P_1 \circ \psi(b), P_1 \circ \psi(ada), P_1 \circ \psi(aba), P_1 \circ \psi(aca) \rangle = \langle a, b, c, d \rangle = G.$$

Similarly, $P_2 \circ \psi(H) = G$.

Theorem 4.17. $G$ is an infinite group.

Proof. For a contradiction, we will assume that $G$ is finite. By Proposition 4.16, $H$ can be mapped surjectively onto $G$. Since $G$ is finite, we get $|G| \leq |H|$. However, we also have that $H$ is a proper subgroup of $G$ since $[G : H] = 2$ by Proposition 4.13, so $|H| < |G|$. This is a contradiction since we have $|G| \leq |H| < |G|$, so $G$ must be infinite.

For the rest of this chapter, unless otherwise specified, take the weight $\delta$ on $G$ with generating set $S = \{a, b, c, d\}$ to be the standard weight defined by $\delta(s) = 1$ for all $s \in S$. We now seek to obtain some bounds on the lengths of various types of words on the alphabet $S$.

Definition 4.18. We define two rewriting rules $\lambda, \rho : S \to S \cup \{\varepsilon\}$, which rewrite a letter of $S$ to another letter of $S$ or the empty letter $\varepsilon$ which has weight $\delta(\varepsilon) = 0$ and so can be removed from words, in the following way:

$$\lambda : \begin{cases} a \mapsto \varepsilon, & b \mapsto a \\ c \mapsto a, & d \mapsto \varepsilon \end{cases} \qquad \rho : \begin{cases} a \mapsto \varepsilon, & b \mapsto c \\ c \mapsto d, & d \mapsto b. \end{cases}$$

Note that $\lambda(s)$ is the first (or left, hence the name $\lambda$) co-ordinate of $\psi(s)$, and $\rho(s)$ is the second (or right) co-ordinate.

Now we may define two rewriting rules $\Phi_0, \Phi_1 : S^* \to S^*$, which rewrite each letter of the input word in sequence. Let $w$ be the input word. $\Phi_0$ and $\Phi_1$ are defined by

$$\Phi_0 : \begin{cases} s \mapsto \lambda(s) & \text{if the number of } a\text{’s preceding the letter } s \text{ in } w \text{ is odd} \\ s \mapsto \rho(s) & \text{if the number of } a\text{’s preceding the letter } s \text{ in } w \text{ is even} \end{cases}$$

and

$$\Phi_1 : \begin{cases} s \mapsto \rho(s) & \text{if the number of } a\text{’s preceding the letter } s \text{ in } w \text{ is odd} \\ s \mapsto \lambda(s) & \text{if the number of } a\text{’s preceding the letter } s \text{ in } w \text{ is even.} \end{cases}$$

The following proposition tells us why these rewriting rules are useful to us.

Proposition 4.19 (Result stated in [8, Lemma 8.2]). Let g ∈ G and let w ∈ S∗ be a word representing g. Now let g0, g1 ∈ G be the group elements represented by the words Φ0(w) and Φ1(w) respectively. If g ∈ H then we have ψ(g) = (g1, g0; e), and if g ∉ H then ψ(g) = (g0, g1; a).

Proof. Let w = s1 ··· sk where si ∈ S for i = 1, …, k. We show the result by induction on k, the number of letters in w.

The base case is the case when k = 1. In this case, Φ0(w) = ρ(w) (which is the second coordinate of ψ(g)) and Φ1(w) = λ(w) (which is the first coordinate of ψ(g)) since the number of copies of the letter a appearing before the letter w is zero, which is even. If g ∉ H then w = a and so ψ(g) = (e, e; a). As required, Φ0(w) = Φ1(w) = ε, which represents the group element e. If g ∈ H, then w ∈ {b, c, d}. By the above remarks, Φ0(w) is the second coordinate of ψ(g) and Φ1(w) is the first, so ψ(g) = (g1, g0; e).

Now for the inductive step assume k > 1. Recall that sk is the last letter of w, let w′ := s1 ··· sk−1 be the word w with sk removed, and let g′ be the group element represented by w′. Let g0′ and g1′ be the group elements represented by the words Φ0(w′) and Φ1(w′) respectively. Since w′ = s1 ··· sk−1, it has fewer than k letters. Therefore we can apply the induction hypothesis to get that if g′ ∈ H, then ψ(g′) = (g1′, g0′; e), and if g′ ∉ H, then ψ(g′) = (g0′, g1′; a). We can use the homomorphic property of ψ to note that ψ(g) = ψ(g′sk) = ψ(g′)ψ(sk).

First look at the case where sk = a. Then |w|a = |w′|a + 1 and so σa(g′) ≠ σa(g). Since Φ0, Φ1 always rewrite a to ε, we have that Φi(w) = Φi(w′a) and Φi(w′) are words representing the same group element gi for i = 0, 1. Therefore g0 = g0′ and g1 = g1′. If g ∈ H then g′ ∉ H and so ψ(g′) = (g0′, g1′; a). In this case, we get

    ψ(g) = ψ(g′)ψ(a)
         = (g0′, g1′; a)(e, e; a)    by induction hypothesis
         = (g1′, g0′; e)
         = (g1, g0; e)               since g0′ = g0 and g1′ = g1

as required. If g ∉ H then g′ ∈ H and so ψ(g′) = (g1′, g0′; e). In this case, we get

    ψ(g) = ψ(g′)ψ(a)
         = (g1′, g0′; e)(e, e; a)    by induction hypothesis
         = (g0′, g1′; a)
         = (g0, g1; a)               since g0′ = g0 and g1′ = g1

as required.

Now treat the case where sk ∈ {b, c, d}. If we let u0 and u1 be, respectively, the group elements represented by the words ρ(sk) and λ(sk), then ψ(sk) = (u1, u0; e) by the base case of this induction. If g ∈ H then g′ ∈ H and so there are an even number of copies of a in the word w′. Therefore we have Φ0(w′sk) = Φ0(w′)ρ(sk), which represents both of the group elements g0′u0 and g0. Therefore g0′u0 = g0. Similarly, we have Φ1(w′sk) = Φ1(w′)λ(sk), which represents both of the group elements g1′u1 and g1, so that g1′u1 = g1. Since ψ(g′) = (g1′, g0′; e), we get

    ψ(g) = ψ(g′)ψ(sk)
         = (g1′, g0′; e)(u1, u0; e)    by induction hypothesis
         = (g1′u1, g0′u0; e)
         = (g1, g0; e)                 since g0′u0 = g0 and g1′u1 = g1

as required.

For the final case, assume sk ∈ {b, c, d} and g ∉ H. Therefore g′ ∉ H and so there are an odd number of copies of a in the word w′. Therefore we have Φ0(w′sk) = Φ0(w′)λ(sk), which represents both of the group elements g0′u1 and g0, and so g0′u1 = g0. Similarly, we have Φ1(w′sk) = Φ1(w′)ρ(sk), which represents both of the group elements g1′u0 and g1. Therefore g1′u0 = g1. Since ψ(g′) = (g0′, g1′; a), we get

    ψ(g) = ψ(g′)ψ(sk)
         = (g0′, g1′; a)(u1, u0; e)    by induction hypothesis
         = (g0′u1, g1′u0; a)
         = (g0, g1; a)                 since g0′u1 = g0 and g1′u0 = g1

as required. We have treated all cases and so we are done.

The following proposition gives us bounds on δ(Φ0(w)) and δ(Φ1(w)) in terms of δ(w) which we will use to show that G is a torsion group. Proposition 4.20. Let w ∈ {a, b, c, d}∗ , and work with any given weight δ. Then

δ(Φ0(w)) + δ(Φ1(w)) = (δ(a) + δ(c))|w|b + (δ(a) + δ(d))|w|c + δ(b)|w|d .

Proof. Each letter a in w contributes a letter ε to each of Φ0(w), Φ1(w), so it contributes 0 to the sum δ(Φ0(w)) + δ(Φ1(w)). Each letter b in w contributes an a and a c, one to each of the words Φ0(w), Φ1(w), so it contributes δ(a) + δ(c) to the sum. Each letter c in w contributes an a and a d, one to each of the words Φ0(w), Φ1(w), so it contributes δ(a) + δ(d) to the sum. Each letter d in w contributes an ε and a b, one to each of the words Φ0(w), Φ1(w), so it contributes δ(b) to the sum. There are no other contributions to δ(Φ0(w)) + δ(Φ1(w)), and so δ(Φ0(w)) + δ(Φ1(w)) = (δ(a) + δ(c))|w|b + (δ(a) + δ(d))|w|c + δ(b)|w|d.

Corollary 4.21. Let w ∈ {a, b, c, d}∗. Then

(i) |Φ0(w)| + |Φ1(w)| = 2|w|b + 2|w|c + |w|d,
(ii) |Φ0(Φ0(w))| + ··· + |Φ1(Φ1(w))| = 2|w|d + 2|w|b + |w|c,
(iii) |Φ0(Φ0(Φ0(w)))| + ··· + |Φ1(Φ1(Φ1(w)))| = 2|w|c + 2|w|d + |w|b.

Proof. Let δ be the standard weight δ(s) = 1 for s = a, b, c, d. Note that δ(w) = |w| for all words w ∈ {a, b, c, d}∗. (i) is a direct application of Proposition 4.20:

|Φ0(w)| + |Φ1(w)| = δ(Φ0(w)) + δ(Φ1(w))

= (δ(a) + δ(c))|w|b + (δ(a) + δ(d))|w|c + δ(b)|w|d

= 2|w|b + 2|w|c + |w|d as required for part (i).

Now note that, by definition of Φ0 and Φ1 , we have

|Φ0(w)|b + |Φ1(w)|b = |w|d,

|Φ0(w)|c + |Φ1(w)|c = |w|b, (∗)

|Φ0(w)|d + |Φ1(w)|d = |w|c.

Replacing w with Φ0(w) in the result (i), we get |Φ0(Φ0(w))| + |Φ1(Φ0(w))| = 2|Φ0(w)|b + 2|Φ0(w)|c + |Φ0(w)|d. Similarly, replacing w with Φ1(w), we get |Φ0(Φ1(w))| + |Φ1(Φ1(w))| = 2|Φ1(w)|b + 2|Φ1(w)|c + |Φ1(w)|d. Therefore,

|Φ0(Φ0(w))| + |Φ1(Φ0(w))| + |Φ0(Φ1(w))| + |Φ1(Φ1(w))|

= 2(|Φ0(w)|b + |Φ1(w)|b) + 2(|Φ0(w)|c + |Φ1(w)|c) + (|Φ0(w)|d + |Φ1(w)|d) by above

= 2|w|d + 2|w|b + |w|c by (∗) so we have shown (ii).

Similarly, we can use (ii) to prove (iii). Substituting Φ0(w) for w in (ii), we get |Φ0(Φ0(Φ0(w)))| + ··· + |Φ1(Φ1(Φ0(w)))| = 2|Φ0(w)|d + 2|Φ0(w)|b + |Φ0(w)|c. Similarly, replace w with Φ1(w) in (ii) to get |Φ0(Φ0(Φ1(w)))|+···+|Φ1(Φ1(Φ1(w)))| = 2|Φ1(w)|d + 2|Φ1(w)|b + |Φ1(w)|c. And now we can finish the proof of (iii).

|Φ0(Φ0(Φ0(w)))| + ··· + |Φ1(Φ1(Φ0(w)))| + |Φ0(Φ0(Φ1(w)))| + ··· + |Φ1(Φ1(Φ1(w)))|

= 2(|Φ0(w)|d + |Φ1(w)|d) + 2(|Φ0(w)|b + |Φ1(w)|b) + (|Φ0(w)|c + |Φ1(w)|c)

= 2|w|c + 2|w|d + |w|b by (∗) as required.
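Since Proposition 4.20 and Corollary 4.21 are purely combinatorial statements about words, they can be checked by brute force on short words. The sketch below is only a sanity check under stated assumptions, not a proof: the helper names and the integer test weights in DELTA are hypothetical choices made here, and the empty letter is modelled by the empty string.

```python
from itertools import product

LAMBDA = {"a": "", "b": "a", "c": "a", "d": ""}
RHO = {"a": "", "b": "c", "c": "d", "d": "b"}
DELTA = {"a": 3, "b": 5, "c": 7, "d": 11}  # arbitrary test weights (assumption)


def phi(i, w):
    """Phi_i(w): lambda/rho chosen by the parity of the a's seen so far."""
    out, a_seen = [], 0
    for s in w:
        odd = a_seen % 2 == 1
        use_lambda = odd if i == 0 else not odd
        out.append(LAMBDA[s] if use_lambda else RHO[s])
        a_seen += s == "a"
    return "".join(out)


def weight(w, delta):
    """Total weight of a word under the weight function delta."""
    return sum(delta[s] for s in w)


def check_prop_4_20(w, delta):
    lhs = weight(phi(0, w), delta) + weight(phi(1, w), delta)
    rhs = ((delta["a"] + delta["c"]) * w.count("b")
           + (delta["a"] + delta["d"]) * w.count("c")
           + delta["b"] * w.count("d"))
    return lhs == rhs


def iterated_length_sum(w, depth):
    """Sum of the lengths of all 2**depth words obtained by applying
    Phi_0 / Phi_1 to w a total of depth times."""
    words = [w]
    for _ in range(depth):
        words = [phi(i, v) for v in words for i in (0, 1)]
    return sum(len(v) for v in words)


def check_cor_4_21(w):
    b, c, d = w.count("b"), w.count("c"), w.count("d")
    return (iterated_length_sum(w, 1) == 2 * b + 2 * c + d
            and iterated_length_sum(w, 2) == 2 * d + 2 * b + c
            and iterated_length_sum(w, 3) == 2 * c + 2 * d + b)


if __name__ == "__main__":
    words = ["".join(t) for n in range(7) for t in product("abcd", repeat=n)]
    print(all(check_prop_4_20(w, DELTA) and check_cor_4_21(w) for w in words))
```

Integer weights are used so that the comparison in check_prop_4_20 is exact; the check runs over all words of length at most 6, not only alternating ones, because the identities do not need w to be reduced.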

We are able to classify the elements of G into three categories which we can use to bound their length:

Proposition 4.22 ([8]). Use the generating set S = {a, b, c, d} for G and the stan- dard weight δ(s) = 1 for all s ∈ S . Each g ∈ G falls into exactly one of the sets

AG,BG,CG ⊆ G defined by

AG := {g ∈ G : l(g) is even}

BG := {g ∈ G : g has a shortest alternating decomposition of the form au1 ··· auna}

CG := {g ∈ G : g has a shortest alternating decomposition of the form u1a ··· aun}.

Proof. Since we are using the standard weight, l(g) = |w| for all g ∈ G where w is a shortest alternating decomposition of g. Therefore l(g) is an integer, so it is either even or odd. If g ∈ BG or g ∈ CG then l(g) is odd, so g cannot be both in AG and in one of BG or CG. Now suppose l(g) = l(h) = 2n + 1 for some integer n. If g ∈ BG has a shortest decomposition w = au1au2a ··· auna then |w|a = n + 1, and if h ∈ CG has a shortest decomposition v = x1ax2a ··· axn+1 then |v|a = n, and so σa(g) ≠ σa(h). In particular g ≠ h, which tells us that no element of BG can have a shortest decomposition of the form u1au2a ··· aun+1. So each g ∈ G falls into exactly one of the sets.

Lemma 4.23 ([8, Lemma 8.2]). If g ∈ G and ψ(g) = (g0, g1; α) then, using the standard weight δ(s) = 1 for s = a, b, c, d,

(i) if g ∈ AG then l(g0), l(g1) ≤ (1/2) l(g),
(ii) if g ∈ BG then l(g0), l(g1) ≤ (1/2)(l(g) − 1),
(iii) if g ∈ CG then l(g0), l(g1) ≤ (1/2)(l(g) + 1).

Proof. We will construct words representing g0 and g1 which have the required lengths. First, fix a shortest alternating decomposition w of g. We know we can do this, and that |w| = l(g), by Lemma 4.6. By Proposition 4.19, we know that one of the words Φ0(w) and Φ1(w) represents the group element g0 and the other word represents g1. We note that every letter a in w contributes 0 to each |Φi(w)|, and every letter u ∈ {b, c, d} contributes at most 1 to each |Φi(w)|, where i = 0, 1. Therefore, |Φi(w)| ≤ |w| − |w|a for i = 0, 1. So, looking at the various cases, we get the result as follows.

Case (i): Say l(g) = |w| = 2n. Since w is an alternating decomposition, |w|a = (1/2)|w| = n. Therefore

    |Φi(w)| ≤ |w| − |w|a = n = (1/2) l(g)

for i = 0, 1, as required.

Case (ii): Say l(g) = |w| = 2n + 1. Since g ∈ BG and w is an alternating decomposition, by Proposition 4.22, we have w = au1au2a ··· auna. Therefore |w|a = (1/2)(|w| + 1) = n + 1. Therefore

    |Φi(w)| ≤ |w| − |w|a = n = (1/2)(l(g) − 1)

for i = 0, 1, as required.

Case (iii): Say l(g) = |w| = 2n + 1. Since g ∈ CG and w is an alternating decomposition, by Proposition 4.22, we have w = u1au2a ··· aun+1. Therefore |w|a = (1/2)(|w| − 1) = n. Therefore

    |Φi(w)| ≤ |w| − |w|a = n + 1 = (1/2)(l(g) + 1)

for i = 0, 1, as required.
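The word-level inequality |Φi(w)| ≤ |w| − |w|a behind the three cases can be checked exhaustively on short alternating words. The following sketch assumes the string model of words used for illustration here (empty letter as the empty string); the names alternating_words and bound are hypothetical. It tests only the lengths of the rewritten words, not the group lengths l(g0), l(g1) themselves.

```python
from itertools import product

LAMBDA = {"a": "", "b": "a", "c": "a", "d": ""}
RHO = {"a": "", "b": "c", "c": "d", "d": "b"}


def phi(i, w):
    """Phi_i(w): lambda/rho chosen by the parity of the a's seen so far."""
    out, a_seen = [], 0
    for s in w:
        odd = a_seen % 2 == 1
        use_lambda = odd if i == 0 else not odd
        out.append(LAMBDA[s] if use_lambda else RHO[s])
        a_seen += s == "a"
    return "".join(out)


def alternating_words(n):
    """All words of length n alternating between a and letters of {b, c, d}."""
    for starts_with_a in (True, False):
        slots = [("a",) if (i % 2 == 0) == starts_with_a else ("b", "c", "d")
                 for i in range(n)]
        for letters in product(*slots):
            yield "".join(letters)


def bound(w):
    """Right-hand side of Lemma 4.23 with l(g) replaced by |w|."""
    n = len(w)
    if n % 2 == 0:
        return n // 2            # case (i): even length
    if w[0] == "a":
        return (n - 1) // 2      # case (ii): form a u1 a ... a un a
    return (n + 1) // 2          # case (iii): form u1 a ... a un


def lemma_holds_for(w):
    return max(len(phi(0, w)), len(phi(1, w))) <= bound(w)


if __name__ == "__main__":
    print(all(lemma_holds_for(w)
              for n in range(10) for w in alternating_words(n)))
```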

The following lemma is useful for one of the cases in our proof that G is an infinite torsion group.

Lemma 4.24. Let w be an alternating decomposition of some element of G and let |w|d = 0. If w is of the form w = au1a ··· aun then Φ0(w), Φ1(w) and their concatenation Φ0(w) · Φ1(w) are alternating decompositions of elements of G.

Proof. Let w = (a)u1au2 ··· uk(a). By definition, Φ0 and Φ1 rewrite letters a to ε so we only need to look at how the ui are rewritten. Again by definition, Φ0 and Φ1 both rewrite every other ui as a, and the remaining letters ui as either c or d, so Φ0(w), Φ1(w) are words of the form (a)v1av2 ··· vm(a) with k letters where vi ∈ {c, d}. Since there are an odd number of copies of a preceding un, and un ∈ {b, c}, we know that the last letter of Φ0(w) is λ(un) = a. The first letter of Φ1(w) is ρ(u1) which is c or d since u1 ∈ {b, c}. Because of this fact and because Φ0(w) and Φ1(w) are alternating decompositions, we have that Φ0(w) · Φ1(w) is an alternating decomposition.

Theorem 4.25. G is a torsion group. In fact, it is a 2-group. That is, for all g ∈ G there exists a non-negative integer m such that ord(g) = 2^m.

Proof following [4, VIII.17]. We use the standard weight defined by δ(s) = 1 for all s ∈ {a, b, c, d} and show the result by induction on the length l(g). For the base case, we note that if l(g) = 0 then g = e, if l(g) = 1 then g^2 = e by Lemma 4.1 and, if l(g) = 2 then g^16 = e by Lemma 4.4. For the inductive step, assume that all elements h ∈ G of length l(h) ≤ l(g) − 1 have order a power of 2. By Lemma 4.6, we can take a shortest alternating decomposition w = (a)u1au2a ··· aun(a) of g such that |w| = l(g) where ui ∈ {b, c, d} for i = 1, …, n. We will look at several different cases for w.

Case (1): l(g) is odd. If w = au1au2a ··· auna then aga⁻¹ = a^2u1a ··· aunaa⁻¹ = u1a ··· aun has length at most l(g) − 2, so the inductive hypothesis tells us that there is an integer m ≥ 0 such that ord(aga⁻¹) = 2^m. Since aga⁻¹ is conjugate to g, they have the same order, so ord(g) = ord(aga⁻¹) = 2^m. If w = u1au2a ··· aun then we conjugate g by u1 so that we can write u1⁻¹gu1 = u1⁻¹u1au2a ··· aunu1 = au2a ··· aunu1. We have by Lemma 4.2 that unu1 ∈ {ε, b, c, d} and so u1⁻¹gu1 has length at most l(g) − 1. By the inductive hypothesis, there is an integer m ≥ 0 such that ord(g) = ord(u1⁻¹gu1) = 2^m.

Note that if l(g) is even then we can write g = au1a ··· aun with ui ∈ {b, c, d}, because if g = x1a ··· axna we just swap g for x1⁻¹gx1, which has the same order and is of length at most l(g).

Case (2): l(g) is even and g ∈ H. In this case, l(g) is divisible by 4. By Lemma 4.23, we can write ψ(g) = (g0, g1; e) where g0, g1 ∈ G and l(g0), l(g1) ≤ (1/2) l(g) < l(g) since l(g) ≥ 4. Therefore we can apply the inductive hypothesis to get that ord(g0) and ord(g1) are powers of 2. Since g = ϕ(g0, g1) and ϕ is a bihomomorphism, g^k = ϕ(g0, g1)^k = ϕ(g0^k, g1^k), and so ord(g) = lcm(ord(g0), ord(g1)), which is a power of 2.

Case (3): l(g) is even and g ∉ H. We can write g = ϕ(g0, g1; a) where g0, g1 ∈ G. Therefore g^2 = ϕ(g1g0, g0g1; e). We will show that l(g1g0), l(g0g1) < l(g) so that we can apply the induction hypothesis and get that ord(g0g1) and ord(g1g0) are powers of 2. Then, since ϕ is a homomorphism, we will have some m such that ord(g^2) = 2^m and so ord(g) = 2^(m+1).

By Proposition 4.19, the words Φ0(w) and Φ1(w) represent g0 and g1 respectively. Therefore l(g1g0), l(g0g1) ≤ |Φ0(w)| + |Φ1(w)|. Recall that w = au1a ··· aun and so |w| = 2(|w|b + |w|c + |w|d). By Corollary 4.21, we know

|Φ0(w)| + |Φ1(w)| = 2|w|b + 2|w|c + |w|d = |w| − |w|d ≤ |w| = l(g).

We must have at least one u ∈ {b, c, d} such that |w|u > 0. Case (3)(i): l(g) is even and g∈ / H and |w|d > 0: In this case we get

l(g1g0), l(g0g1) ≤ |Φ0(w)| + |Φ1(w)| = |w| − |w|d < |w| = l(g)

as required. Note that this works because w is an alternating decomposition such that |w| = l(g) and |w|d > 0.

Case (3)(ii): l(g) is even and g ∉ H and |w|d = 0 and |w|c > 0. We will show that ord(g0g1) is a power of 2. The proof for g1g0 is analogous. As above, note that

    l(g0g1) ≤ |Φ0(w)| + |Φ1(w)| = |w| − |w|d ≤ l(g).

Therefore if l(g0g1) is odd we may apply Case (1) to g0g1 to get that ord(g0g1) is a power of 2. If l(g0g1) is even and g0g1 ∈ H then apply Case (2) to g0g1 to get the result. So we can assume l(g0g1) is even and g0g1 ∉ H. Let w′ = Φ0(w) · Φ1(w), which is a word representing g0g1. Note that |w′|d = |Φ0(w)|d + |Φ1(w)|d = |w|c (see the proof of Corollary 4.21). Also, we may apply Lemma 4.24 to w to get that w′ is an alternating decomposition. Now |w′|d = |w|c > 0 and |w′| = |Φ0(w)| + |Φ1(w)| = |w| − |w|d = |w|. Therefore g0g1 is represented by an alternating decomposition w′ of length |w′| = l(g) with |w′|d > 0 and so we apply Case (3)(i) to g0g1 and w′ to get our result that ord(g0g1) and ord(g1g0) are powers of 2.

Case (3)(iii): l(g) is even and g ∉ H and |w|d = 0 = |w|c. In this case, w = (ab)^(l(g)/2), and by Lemma 4.4 we have that (ab)^16 = e. Therefore we have g^16 = e.

So ord(g) is a power of 2 for all g ∈ G, giving us that G is a 2-group and therefore a torsion group.
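Proposition 4.19 also gives a practical way to test whether a word represents the identity, and hence to compute orders of small elements. The sketch below is an illustration under stated assumptions rather than the method of the text: it assumes the relations s^2 = e for every generator and bc = d, cd = b, db = c among b, c, d, that ψ is injective, and that b, c, d are non-trivial. The function names are hypothetical.

```python
LAMBDA = {"a": "", "b": "a", "c": "a", "d": ""}
RHO = {"a": "", "b": "c", "c": "d", "d": "b"}


def phi(i, w):
    """Phi_i(w): lambda/rho chosen by the parity of the a's seen so far."""
    out, a_seen = [], 0
    for s in w:
        odd = a_seen % 2 == 1
        use_lambda = odd if i == 0 else not odd
        out.append(LAMBDA[s] if use_lambda else RHO[s])
        a_seen += s == "a"
    return "".join(out)


def reduce_word(w):
    """Rewrite w to an alternating word using the assumed relations."""
    stack = []
    for s in w:
        if stack and stack[-1] == s:
            stack.pop()                      # s^2 = e for every generator
        elif stack and stack[-1] != "a" and s != "a":
            t = stack.pop()                  # two different letters of {b, c, d}
            stack.append(({"b", "c", "d"} - {t, s}).pop())  # their product
        else:
            stack.append(s)
    return "".join(stack)


def is_trivial(w):
    """Decide whether the word w represents the identity element."""
    w = reduce_word(w)
    if w == "":
        return True
    if w.count("a") % 2 == 1 or len(w) == 1:
        return False                         # outside H, or a single generator
    # Inside H: trivial exactly when both coordinates of psi are trivial.
    return is_trivial(phi(0, w)) and is_trivial(phi(1, w))


def order(w, cap=64):
    """Smallest k <= cap with w^k trivial, or None if there is none."""
    power = ""
    for k in range(1, cap + 1):
        power += w
        if is_trivial(power):
            return k
    return None


if __name__ == "__main__":
    for w in ["a", "b", "ad", "ac", "ab"]:
        print(w, order(w))
```

The recursion terminates because a reduced word in H of length at least 2 is rewritten by Φ0 and Φ1 to strictly shorter words. With these assumptions the sketch reproduces the relations (ad)^4 = e and (ab)^16 = e used in the text.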

We have shown that Grigorchuk’s group is an infinite torsion group, and so gives us an answer to the Generalised Burnside Problem which asks if such groups exist. Now we move on to show that Grigorchuk’s group has intermediate growth.

5 Grigorchuk’s group has intermediate growth

We will show that Grigorchuk’s group G has intermediate growth and therefore answers Milnor’s question of whether finitely generated groups of intermediate growth exist.

Theorem 5.1. Grigorchuk’s group G has intermediate growth.

5.1 The growth is not polynomial

We will show first that G is not of polynomial growth. Proposition 5.2. Grigorchuk’s group G is not of polynomial growth. To prove this, we will define a multilateral group and show that these groups do not have polynomial growth before showing that G is multilateral.

Definition 5.3. Two groups G1 and G2 are commensurable, written G1 ≈ G2, if they contain isomorphic finite index subgroups. That is, if there are subgroups H1 ≤ G1 and H2 ≤ G2 such that H1 ≅ H2 and [G1 : H1], [G2 : H2] < ∞. A group G is multilateral if G is infinite and there is some m ≥ 2 such that G ≈ G^m.

Proposition 5.4 ([8, Lemma 7.4]). If G is a finitely generated multilateral group then G does not have polynomial growth.

We will use the following lemmas to show this.

Lemma 5.5 ([1, Lemma 4]). Let G be a group with growth function γ defined using some weight δ. Let H ≤ G be finitely generated groups and [G : H] = m < ∞. Now let γH (n) := |B(n) ∩ H| = |{g ∈ H : l(g) ≤ n}| be the restriction of γ to H . Then there is some constant K ≥ 0 such that

    γ(n − K) ≤ mγH(n) ≤ γ(n + K).

Proof. Fix a set U of coset representatives of G/H. Note that |U| = m. Let K := max{l(u) : u ∈ U}. Noting that every element g ∈ G can be written uniquely as g = uh for some u ∈ U and h ∈ H, define a bijection

    θ : G → U × H,   θ : uh ↦ (u, h)

with inverse

    θ⁻¹ : U × H → G,   θ⁻¹ : (u, h) ↦ uh.

To show that γ(n − K) ≤ mγH(n), let g ∈ B(n − K) so that l(g) ≤ n − K. Since g = uh for some u ∈ U and h ∈ H, we have that h = u⁻¹g and so l(h) ≤ K + (n − K) = n. Therefore h ∈ B(n) ∩ H, and so θ(B(n − K)) ⊆ U × (B(n) ∩ H). Since θ is a bijection we get

    γ(n − K) = |B(n − K)| = |θ(B(n − K))| ≤ |U||B(n) ∩ H| = mγH(n)

which gives us the first part of our lemma. To show that mγH(n) ≤ γ(n + K), let h ∈ B(n) ∩ H so that l(h) ≤ n and l(uh) ≤ n + K for all u ∈ U. Therefore θ⁻¹(U × (B(n) ∩ H)) ⊆ B(n + K). Again since θ⁻¹ is a bijection, we get

    mγH(n) = |U × (B(n) ∩ H)| = |θ⁻¹(U × (B(n) ∩ H))| ≤ |B(n + K)| = γ(n + K)

which completes our proof.
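To see the two inequalities of Lemma 5.5 in a concrete case, one can take a toy example that is not from the text: G = Z with generators +1 and −1 (standard weight), H = 2Z, so that m = 2, the coset representatives are U = {0, 1} and K = 1. A minimal Python sketch of this check, with hypothetical function names:

```python
def gamma(n):
    """Growth function of Z with generators +1, -1: size of {g : |g| <= n}."""
    return 2 * n + 1 if n >= 0 else 0


def gamma_H(n):
    """Number of elements of H = 2Z inside the ball of radius n of Z."""
    return sum(1 for g in range(-n, n + 1) if g % 2 == 0)


M_INDEX = 2   # m = [Z : 2Z]
K = 1         # longest coset representative in U = {0, 1}


def lemma_5_5_holds(n):
    """Check gamma(n - K) <= m * gamma_H(n) <= gamma(n + K) for one n."""
    return gamma(n - K) <= M_INDEX * gamma_H(n) <= gamma(n + K)


if __name__ == "__main__":
    print(all(lemma_5_5_holds(n) for n in range(200)))
```

In this example γH counts the even integers in [−n, n], so the middle term is 2n + 2 for even n and 2n for odd n, which lies between γ(n − 1) = 2n − 1 and γ(n + 1) = 2n + 3.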

Lemma 5.6 ([8, Exercise 1.5]). If H ≤ G are finitely generated groups with growth functions γH and γG , and [G : H] = m < ∞, then γG ∼ γH .

Proof. Let S be a finite generating set for G and X be a finite generating set for H. Then S ∪ X is also a finite generating set for G. Define standard weights δX(x) = 1 for all x ∈ X for H, and δS(s) = 1 for all s ∈ S and δS∪X(y) = 1 for all y ∈ S ∪ X for G. Then by Proposition 2.7, we have that γG ∼ γ_δS ∼ γ_δS∪X and γH ∼ γ_δX.

First we will show that γH ≼ γG. If h ∈ B_δX(n) then h can be written h = x1 ··· xk where k = l_δX(h) ≤ n and xi ∈ X for i = 1, …, k. This is also a decomposition of h into generators S ∪ X of G, so we have l_δS∪X(h) ≤ k ≤ n, which tells us that h ∈ B_δS∪X(n). Therefore B_δX(n) ⊆ B_δS∪X(n) and so, for any n,

    γ_δX(n) = |B_δX(n)| ≤ |B_δS∪X(n)| = γ_δS∪X(n).

Therefore γH ∼ γ_δX ≼ γ_δS∪X ∼ γG as required.

It remains to show that γG ≼ γH. As in Lemma 5.5, write γ^H_δS(n) := |B_δS(n) ∩ H| for the restriction of γ_δS to H. By Lemma 5.5, there is some K such that γ_δS(n) ≤ mγ^H_δS(n + K). Now let M = max{l_δX(s) : s ∈ S}. Then if h ∈ B_δS(n + K) ∩ H we know h = s1 ··· sl for some l = l_δS(h) ≤ n + K. Now

    l_δX(h) ≤ l_δX(s1) + ··· + l_δX(sl) ≤ lM ≤ (n + K)M,

so h ∈ B_δX((n + K)M). Therefore

    γ^H_δS(n + K) = |B_δS(n + K) ∩ H| ≤ |B_δX((n + K)M)| = γ_δX((n + K)M).

Since γ_δS(n) ≤ mγ^H_δS(n + K), we know that γ_δS(n) ≤ mγ_δX((n + K)M). Now for n ≥ 1, we have n(M + MK) = nM + nMK ≥ nM + MK = (n + K)M. Since γ_δX is an increasing function, γ_δX((n + K)M) ≤ γ_δX(n(M + MK)). For n < 1 we know that γ_δS(n) = 1 ≤ m ≤ mγ_δX(n(M + MK)).

Therefore we have γ_δS(n) ≤ mγ_δX(n(M + MK)) for all n ≥ 0. Therefore γG ∼ γ_δS ≼ γ_δX ∼ γH. So γG ∼ γH as required.

Lemma 5.7. If there exists β > 0 such that γ(n) ∼ n^β and we have m ≥ 2 then γ(n)^m ≁ γ(n).

Proof. It is obvious that γ(n)^m ∼ n^(βm). We will show that n^β ≁ n^(βm). In particular, we will show that it is not true that n^(βm) ≼ n^β, that is, for all constants C, α > 0 there exists some n ∈ N such that n^(βm) > C(αn)^β. For any C, α > 0, note that

    C(αn)^β / n^(βm) = Cα^β / n^(β(m−1)) → 0 < 1   as n → ∞,

and so there must exist some n ∈ N such that n^(βm) > C(αn)^β. We have shown that n^β ≁ n^(βm). If γ(n)^m ∼ γ(n) then n^β ∼ n^(βm), which is not true. So γ(n)^m ≁ γ(n).

Now we are ready to prove that finitely generated multilateral groups do not have polynomial growth.

Proof of Proposition 5.4. If S is a finite generating set for G then S^m is a finite generating set for G^m. If g = (g1, …, gm) ∈ G^m then the length of g in G^m is at most n if and only if l_S(gi) ≤ n for all i. Therefore

    γG(n)^m = |{g ∈ G : l_S(g) ≤ n}|^m
            = |{g ∈ G : l_S(g) ≤ n}^m|
            = |{g = (g1, …, gm) ∈ G^m : l_S(gi) ≤ n for all i}|
            = γ_{G^m}(n)

for all n ∈ N. Since G is a multilateral group, there exists m ≥ 2 such that G ≈ G^m. Therefore there are finite index subgroups H ≤ G and H′ ≤ G^m such that H ≅ H′. By Lemma 5.6, we have γG ∼ γH and γ_{G^m} ∼ γH′. Since H ≅ H′, we also have γH ∼ γH′, and so γG(n) ∼ γH(n) ∼ γH′(n) ∼ γ_{G^m}(n) = γG(n)^m.

We have some m ≥ 2 such that γG(n)^m ∼ γG(n). Therefore, by Lemma 5.7, G cannot have polynomial growth.

Now we can show that G does not have polynomial growth.

Proof of Proposition 5.2. It is sufficient to show that G is multilateral because then we can apply Proposition 5.4 to get that, since G is a finitely generated multilateral group, G does not have polynomial growth. We already know from Theorem 4.17 that G is infinite. We just need to find m ≥ 2 such that G ≈ G^m. We claim that G ≈ G × G.

By Corollary 4.13, [G : H] = 2. We know from Proposition 4.16 that ψ(H) ≤ G × G, and since ψ is an isomorphism onto its image, we have ψ(H) ≅ H. Let H′ := ψ(H). We will show that [G × G : H′] < ∞ and so G is multilateral. We will do this by defining the group B := ⟨g⁻¹bg : g ∈ G⟩ ≤ G and showing first that B × B ≤ H′ and then that [G : B] < ∞.

To show that B × B ≤ H′, we will show that B × {e} ≤ H′ and {e} × B ≤ H′. Given g ∈ G, since ψ maps H surjectively onto G under each projection, we can find h, k ∈ H such that ψ(h) = (x, g) and ψ(k) = (g, y) (we don’t care what x and y are). Note that d ∈ H and ada ∈ H, so we may write

    H′ ∋ ψ(h⁻¹dh) = ψ(h⁻¹)ψ(d)ψ(h) = (x⁻¹, g⁻¹)(e, b)(x, g) = (e, g⁻¹bg)

and similarly

    H′ ∋ ψ(k⁻¹(ada)k) = ψ(k⁻¹)ψ(ada)ψ(k) = (g⁻¹, y⁻¹)(b, e)(g, y) = (g⁻¹bg, e).

This works for every g ∈ G and so we have shown that B × {e}, {e} × B ≤ H′.

Now we want to show [G : B] < ∞. First note that ⟨a, d⟩ ≅ D8, the dihedral group of order 8. We can see this by noting that ⟨a, d⟩ = ⟨a, ad⟩ and (ad)^4 = e = a^2 and a(ad)a⁻¹ = (ad)⁻¹, so a is the reflection and ad is a rotation.

Consider the quotient map q : ⟨a, d⟩ → G/B. If g ∈ G then g can be written as a word w with letters a, b, d since G = ⟨a, b, d⟩ by Corollary 4.3. If w = vbu for words v, u on {a, b, d} then, since B is normal in G, we get gB = vbuB = vbBu = vBu = Bvu = vuB. In this way, we can write gB as Bh = hB where h ∈ ⟨a, d⟩. Therefore the map q is onto, and so, by the First Isomorphism Theorem, G/B ≅ ⟨a, d⟩/ker(q), and so [G : B] = [⟨a, d⟩ : ker(q)] ≤ 8. Now we see that

    [G × G : H′] = [G × G : B × B] / [H′ : B × B] ≤ [G × G : B × B] = [G : B]^2 ≤ 64.

So we have shown that G ≈ G × G and since G is infinite, it is multilateral. Then by Proposition 5.4, we have that G does not have polynomial growth.

5.2 The growth is not exponential

It remains to show that G does not have exponential growth. We look at the stabiliser of the third level of the tree

H3 = StabG(3) = {π ∈ G : π(v) = v for |v| = 3}.

Proposition 5.8. If h ∈ H3 then there exist unique h000, . . . , h111 ∈ G such that

h = ϕ(ϕ(ϕ(h000, h001), ϕ(h010, h011)), ϕ(ϕ(h100, h101), ϕ(h110, h111))).

Proof. Given h ∈ H3 , we know by Proposition 4.16 that there are unique h0, h1 ∈ G such that ψ(h) = (h0, h1) ∈ G×G. In fact, since h ∈ H3 , we can show that h0, h1 ∈ H. Note that

h = ϕ(h0, h1) = ι0(h0) ◦ ι1(h1).

Assume for a contradiction that h0 ∉ H. Then h0(0) = 1 and so h(00) = 01. Therefore h ∉ H3 which is a contradiction, so we must have that h0 ∈ H, and similarly h1 ∈ H. Therefore we can use Proposition 4.16 again to get unique h00, …, h11 such that ψ(h0) = (h00, h01), ψ(h1) = (h10, h11) ∈ G × G so that

h = ϕ(ϕ(h00, h01), ϕ(h10, h11)) = ι00(h00) ◦ ι01(h01) ◦ ι10(h10) ◦ ι11(h11).

As before, assume for a contradiction that h00 ∉ H. Then h00(0) = 1 and so h(000) = 001. Therefore h ∉ H3 which is a contradiction, and so we must have h00 ∈ H. Similarly, h01, h10, h11 ∈ H. Therefore we can finally define h000, …, h111 ∈ G by writing ψ(hv) = (hv0, hv1) for all v ∈ {0, 1}^2. Again, uniqueness follows from the fact that ψ is well-defined, and we have

h = ϕ(ϕ(ϕ(h000, h001), ϕ(h010, h011)), ϕ(ϕ(h100, h101), ϕ(h110, h111))).

Proposition 5.8 justifies the following definition.

Definition 5.9. The map χ is defined by

    χ : H3 → G^8,   χ : h ↦ (h000, …, h111),

where h000, …, h111 ∈ G are such that

h = ϕ(ϕ(ϕ(h000, h001), ϕ(h010, h011)), ϕ(ϕ(h100, h101), ϕ(h110, h111))).

Proposition 5.10. χ is an injective homomorphism.

Proof. The fact that χ is a homomorphism follows directly from the fact that ϕ is a bihomomorphism, and χ also inherits injectivity from ϕ.

Proposition 5.11. Grigorchuk’s group G does not have exponential growth.

We will show, following Grigorchuk and Pak’s proof in [8], that γ(n) ≼ exp(n^β) for all β with log(8)/log(48/5) < β < 1, where log(8)/log(48/5) < 0.92. We will use the standard weight δ on S = {a, b, c, d} defined by δ(s) = 1 for all s ∈ S for the whole of this proof. We will need the following lemmas. The first lemma bounds the lengths of components of χ(h) in terms of the length of h for all h ∈ H3.

Lemma 5.12 (Cancellation Lemma, [8, Lemma 10.1]). Let h ∈ H3 and (h000, …, h111) = χ(h). Then l(h000) + ··· + l(h111) ≤ (5/6) l(h) + 8.

Proof. Take a shortest alternating decomposition w of h and let w0 = Φ0(w) and w1 = Φ1(w). Then let w00 = Φ0(Φ0(w)), w01 = Φ1(Φ0(w)), w10 = Φ0(Φ1(w)), and w11 = Φ1(Φ1(w)). Finally, define w000 = Φ0(Φ0(Φ0(w))), . . . , w111 = Φ1(Φ1(Φ1(w))) similarly. Now let

ψ(h) = (h0, h1; e),

ψ(h0) = (h00, h01; e), ψ(h1) = (h10, h11; e). Then, by the definition of χ, we know that

ψ(hv) = (hv0, hv1; e) for all v ∈ {0, 1}^2, the words of length two on the alphabet {0, 1}.

By Proposition 4.19, we have that the words w0 and w1 represent the group elements h0 and h1, although their subscripts do not necessarily match up: we could have w0 representing h1 and w1 representing h0. Therefore, applying Proposition 4.19 again, the words in {wv : |v| = 2} represent the group elements in {hv : |v| = 2}. We write it in this way to indicate that they may be differently labelled, as noted previously. Finally, we apply Proposition 4.19 once more to see that the words in {wv : |v| = 3} represent {hv : |v| = 3}. By Lemma 4.23, we have

l(h0) + l(h1) ≤ l(h) + 1,

l(h00) + ··· + l(h11) ≤ l(h0) + l(h1) + 2 ≤ l(h) + 3, and so

l(h000) + ··· + l(h111) ≤ l(h00) + ··· + l(h11) + 4

≤ l(h0) + l(h1) + 6 (†) ≤ l(h) + 7.

By Corollary 4.21, we have that

|w0| + |w1| = 2(|w|b + |w|c + |w|d) − |w|d

|w00| + ··· + |w11| = 2(|w|b + |w|c + |w|d) − |w|c

|w000| + ··· + |w111| = 2(|w|b + |w|c + |w|d) − |w|b.

It follows that, since 2(|w|b + |w|c + |w|d) ≤ |w| + 1, we have

|w0| + |w1| ≤ |w| + 1 − |w|d

|w00| + ··· + |w11| ≤ |w| + 1 − |w|c (††)

|w000| + ··· + |w111| ≤ |w| + 1 − |w|b.

From (†) and (††), we have that

    l(h000) + ··· + l(h111) ≤ |w000| + ··· + |w111| ≤ |w| + 1 − |w|b,

    l(h000) + ··· + l(h111) ≤ l(h00) + ··· + l(h11) + 4 ≤ |w00| + ··· + |w11| + 4 ≤ |w| + 5 − |w|c

and

l(h000) + ··· + l(h111) ≤ l(h0) + l(h1) + 6 ≤ |w0| + |w1| + 6 ≤ |w| + 7 − |w|d.

So we have

l(h000) + ··· + l(h111) ≤ min{|w| + 1 − |w|b, |w| + 5 − |w|c, |w| + 7 − |w|d}

≤ min{|w| + 7 − |w|b, |w| + 7 − |w|c, |w| + 7 − |w|d}

≤ |w| + 7 − max{|w|b, |w|c, |w|d}.

Now note that, if |w|u ≤ (1/6)|w| − 1 for all u ∈ {b, c, d}, then |w|b + |w|c + |w|d ≤ (1/2)|w| − 3, but we also have that (1/2)(|w| − 1) ≤ |w|b + |w|c + |w|d. This implies that (1/2)|w| − 1/2 ≤ (1/2)|w| − 3, which is a contradiction. Therefore we must have some u ∈ {b, c, d} such that |w|u > (1/6)|w| − 1, and so

    l(h000) + ··· + l(h111) ≤ |w| + 7 − max{|w|b, |w|c, |w|d}
                            ≤ |w| + 8 − (1/6)|w|
                            = (5/6)|w| + 8

as required, since |w| = l(h).

The following lemma bounds the length of coset representatives of G/H3.

Lemma 5.13. If U is a set of minimal-length coset representatives of G/H3 then |U| ≤ 128 and l(u) ≤ 127 for all u ∈ U.

Proof. By Lemma 4.9, we know [G : H3] ≤ 2^(2^3 − 1) = 2^7 = 128. Therefore |U| = [G : H3] ≤ 128. Now suppose for a contradiction that there is some u ∈ U such that l(u) = l > 127. Then there exist s1, …, sl ∈ S such that u = s1s2 ··· sl is a minimal-length decomposition of u into generators. Since l > 127, the list of prefixes

    e, s1, s1s2, s1s2s3, …, s1s2 ··· sl

has l + 1 > 128 entries and therefore, by the Pigeonhole Principle and the fact that there are at most 128 cosets, there are two integers j < k such that s1 ··· sj and s1 ··· sk are in the same coset as each other. Then H3s1 ··· sj ··· sk = H3s1 ··· sj and so H3u = H3s1 ··· sl = H3s1 ··· sjsk+1 ··· sl. Writing x := s1 ··· sjsk+1 ··· sl, we note that H3u = H3x but l(x) < l(u), a contradiction to the minimality of the length of u. So we have shown that minimal-length coset representatives have length at most 127.

Notation 5.14. For any function f : R≥0 → R≥0, denote by f^⋆k(x) the sum

    f^⋆k(x) := Σ f(n1)f(n2) ··· f(nk),

taken over all tuples (n1, …, nk) with n1 + ··· + nk ≤ x.

The following lemma uses analysis to find an upper bound for the growth of a group whose growth function has a certain property.
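Before the lemma is stated, the sum of Notation 5.14 can be made concrete with a small computation. The sketch below is a direct, unoptimised evaluation in Python; it assumes that the indices n1, …, nk range over non-negative integers (as the lengths l(hi) do later on), and the function name star is chosen here for illustration.

```python
from itertools import product
from math import floor


def star(f, k, x):
    """Evaluate the k-fold sum of Notation 5.14 at x.

    Sums f(n1) * ... * f(nk) over all k-tuples of non-negative integers
    (an assumption of this sketch) with n1 + ... + nk <= x.
    """
    top = floor(x)
    if top < 0:
        return 0
    total = 0
    for ns in product(range(top + 1), repeat=k):
        if sum(ns) <= x:
            term = 1
            for n in ns:
                term *= f(n)
            total += term
    return total


if __name__ == "__main__":
    # With f identically 1 the sum just counts the admissible tuples.
    print(star(lambda n: 1, 2, 2))
```

The brute-force enumeration visits (⌊x⌋ + 1)^k tuples, so it is only meant for small k and x.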

Lemma 5.15 ([8, Lemma 3.1]). Let G be an infinite group with finite generating set S and growth function γ with respect to the standard weight δ defined by δ(s) = 1 for all s ∈ S. Assume there are integers k ≥ 2 and M ≥ 0 and constants C > 0 and 0 < α < 1 such that γ(n) ≤ Cγ^⋆k(αn) for all n ≥ M. If we have β < 1 such that (k/k^β)·α^β < 1 then we have γ(n) ≼ e^(n^β).

Proof. It is sufficient to show that there exists some A > 0 such that log(γ(n)) ≤ An^β for all n ∈ R≥0. Then γ(n) ≤ e^(An^β) for all n ∈ R≥0 and so γ(n) ≼ (e^A)^(n^β) ∼ e^(n^β) by Lemma 2.12.

In fact, it is enough to show that there exists some A > 0 such that log(γ(n)) ≤ An^β for all n ∈ Z≥0. This is because if m ∈ R≥0 then there is some n ∈ Z≥0 such that n + 1 > m ≥ n and so γ(m) = γ(n) by Lemma 2.10. If m ≥ 1 then log(γ(m)) = log(γ(n)) ≤ An^β ≤ A(2m)^β, and if m < 1 then log(γ(m)) = 0 ≤ Am^β, so we can extend the result on the non-negative integers to the same result on the non-negative reals.

We will use induction on n to show that there exists some A > 0 such that log(γ(n)) ≤ An^β for all n ∈ Z≥0. For the base case, we will show that the result holds for n < M. We can pick A large enough such that log(γ(n)) ≤ A·n^β since there are only finitely many inequalities which A has to satisfy.

For the induction step, assume that n ≥ M and that we have A such that log(γ(x)) ≤ Ax^β for all x < n. Note that, since β < 1, the function f(y) = y^β is concave, and so we can apply Jensen’s Inequality to get

    (f(n1) + ··· + f(nk)) / k ≤ f((n1 + ··· + nk) / k).

Therefore we have, for any choice of n1, …, nk whose sum is at most αn, the bound

    log(γ(n1) ··· γ(nk)) = log(γ(n1)) + ··· + log(γ(nk))
                         ≤ A(n1^β + ··· + nk^β)            by the inductive hypothesis
                         ≤ A · k · ((n1 + ··· + nk)/k)^β   by Jensen’s Inequality as above
                         ≤ A · k · (αn/k)^β                since n1 + ··· + nk ≤ αn
                         = An^β · (k/k^β)·α^β
                         = An^β · (1 − ε)

where ε = 1 − (k/k^β)·α^β. Since (k/k^β)·α^β < 1 we have ε > 0. It is obvious that (k/k^β)·α^β > 0 and so 0 < ε < 1.

Now, since n ≥ M, we have γ(n) ≤ Cγ^⋆k(αn). There are at most (αn)^k tuples (n1, …, nk) such that n1 + ··· + nk ≤ αn since the ni are integers. The inequality log(γ(n1) ··· γ(nk)) ≤ An^β · (1 − ε) holds for all of these tuples, so, taking a tuple for which γ(n1) ··· γ(nk) is largest, we have

    log(γ(n)) ≤ log(C · (αn)^k · γ(n1) ··· γ(nk))
              = log(C) + k log(α) + k log(n) + log(γ(n1) ··· γ(nk))
              ≤ log(C) + k log(α) + k log(n) + An^β · (1 − ε)   by above.

Recall that 0 < ε < 1. Pick A such that

    A ≥ (log(C) + k log(α)) / (εn^β) + k log(n) / (εn^β).

Then log(C) + k log(α) + k log(n) − εAn^β ≤ 0 and so

    log(γ(n)) ≤ log(C) + k log(α) + k log(n) + An^β · (1 − ε)
              = (log(C) + k log(α) + k log(n) − εAn^β) + An^β
              ≤ An^β.

Since log(n)/n^β → 0 and 1/n^β → 0 as n → ∞, we see that (log(C) + k log(α))/(εn^β) + k log(n)/(εn^β) is bounded above as n → ∞. Therefore there exists a value of A such that log(γ(n)) ≤ An^β for all n ∈ Z≥0.

We have shown that, if β < 1 satisfies (k/k^β)·α^β < 1, then γ(n) ≼ e^(n^β).

Now we are ready to prove that G does not have exponential growth.

Proof of Proposition 5.11. We will bound the length of g ∈ G by looking at how it acts on the third level of the tree T. Let U be a set of minimal-length coset representatives of G/H3. By Lemma 5.13, we know |U| ≤ 128 and l(u) ≤ 127 for all u ∈ U. Given g ∈ G, there are unique h ∈ H3 and u ∈ U such that g = hu. Therefore h = gu⁻¹ and so l(h) ≤ l(g) + 127. Writing χ(h) = (h1, …, h8), we note that

8 X 5 l(h ) ≤ l(h) + 8 by the Cancellation Lemma (Lemma 5.12) i 6 i=1 5 ≤ (l(g) + 127) + 8 since l(h) ≤ l(g) + 127 6 5 635 = l(g) + + 8 6 6 5 636 < l(g) + + 8 6 6 5 = l(g) + 114. 6 P8 5 7 Now we will use the fact that i=1 l(hi) < 6 l(g) + 114, that [G : H3] = n ≤ 2 and that χ is an injection to show that 5  γ(n) ≤ 128 · γ?8 n + 114 . 6

Define the map θ : G → U × χ(H3) on g = hu ∈ G where h ∈ H3 and u ∈ U by θ(g) = (u, χ(h)), noting that θ is well-defined because g ∈ G uniquely determines h and u. This map is obviously an injection because if θ(g) = θ(g0) where g = hu and g0 = h0u0 then θ(g) = (u, χ(h)) = (u0, χ(h0)) = θ(g0), so u = u0 and, by injectivity of χ, we have h = h0 , so g = hu = h0u0 = g0 . To get a bound on γ(n), let g ∈ B(n) and P8 θ(g) = (u, (h1, . . . , h8)). Then, by above and since l(g) ≤ n, we have i=1 l(hi) < 5 5 6 l(g) + 114 ≤ 6 n + 114. Letting ni = l(hi) for i = 1,..., 8, we therefore have that 5 n1 + ··· + n8 ≤ 6 n + 114 and hi ∈ B(ni) for all i.

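Therefore, if we restrict $\theta$ to $B(n)$, we get an injective map

\[ \theta|_{B(n)} : B(n) \to U \times \Bigl( \bigcup_{(n_1,\dots,n_8)} B(n_1) \times \cdots \times B(n_8) \Bigr) \]

where the union is over tuples $(n_1, \dots, n_8)$ such that $n_1 + \cdots + n_8 < \frac{5}{6} n + 114$. Now

\[
\begin{aligned}
\gamma(n) &= |B(n)|\\
&\le \Bigl| U \times \bigcup_{(n_1,\dots,n_8)} B(n_1) \times \cdots \times B(n_8) \Bigr|\\
&\le \sum_{(n_1,\dots,n_8)} |U| \cdot |B(n_1)| \cdots |B(n_8)|\\
&\le 128 \cdot \sum_{(n_1,\dots,n_8)} \gamma(n_1) \cdots \gamma(n_8)\\
&= 128 \cdot \gamma^{\star 8}\left(\tfrac{5}{6} n + 114\right).
\end{aligned}
\]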

Setting $m = n + 137$, note that

\[ \tfrac{5}{6} m = \frac{5n + 685}{6} \ge \frac{5n + 684}{6} = \tfrac{5}{6} n + 114. \]

Also, we get $\gamma(m) = \gamma(n + 137) \le |S|^{137}\gamma(n)$ since each element in $B(n + 137)$ is an element in $B(n)$ concatenated with an element of length at most 137, of which there are at most $|S|^{137}$. Therefore,

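\[
\begin{aligned}
\gamma(m) &\le |S|^{137}\gamma(n)\\
&= 2^{2 \cdot 137}\gamma(n) && \text{since } |S| = 4 = 2^2\\
&= 2^{274}\gamma(n)\\
&\le 2^{274} \cdot 128 \cdot \gamma^{\star 8}\left(\tfrac{5}{6} n + 114\right) && \text{by our work above}\\
&\le 2^{281} \cdot \gamma^{\star 8}\left(\tfrac{5}{6} m\right) && \text{since } 128 = 2^7 \text{ and } \tfrac{5}{6} n + 114 \le \tfrac{5}{6} m.
\end{aligned}
\]

Now we can apply Lemma 5.15 with $k = 8$, $M = 137$, $C = 2^{281}$ and $\alpha = \frac{5}{6}$. Since $\gamma(n) \le C\gamma^{\star k}(\alpha n)$ for all $n \ge M$, if we have $\beta < 1$ such that $\frac{k}{k^{\beta}}\alpha^{\beta} < 1$ then we have $\gamma(n) \precsim e^{n^{\beta}}$.

We can do a short computation to find an explicit upper bound for $\gamma(n)$ by showing there exists a $\beta < 1$ such that $\frac{8}{8^{\beta}}\left(\frac{5}{6}\right)^{\beta} < 1$ and finding a lower bound for this $\beta$. Since

\[ \frac{8}{8^{\beta}}\left(\frac{5}{6}\right)^{\beta} < 1 \iff 8 < \left(\frac{48}{5}\right)^{\beta} \iff \log(8) < \beta \log\left(\frac{48}{5}\right) \iff \beta > \frac{\log(8)}{\log\left(\frac{48}{5}\right)}, \]

we can take any $\beta$ such that $1 > \beta > \frac{\log(8)}{\log(48/5)}$. Since $\frac{\log(8)}{\log(48/5)} < 0.92$, we have shown that for all $1 > \beta > \frac{\log(8)}{\log(48/5)}$, we have $\gamma(n) \precsim e^{n^{\beta}}$. Since $\beta < 1$ we know that $e^{n^{\beta}}$ grows strictly more slowly than $e^{n}$, and so $G$ does not have exponential growth.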
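The hypothesis of Lemma 5.15 for $k = 8$ and $\alpha = \frac{5}{6}$ can be evaluated numerically. The following is a minimal sketch, not part of the original argument: the function names are our own, and the bisection assumes $k > 1$ and $0 < \alpha < 1$, so that the quantity $\frac{k}{k^{\beta}}\alpha^{\beta}$ is strictly decreasing in $\beta$.

```python
import math


def lemma_quantity(k: float, alpha: float, beta: float) -> float:
    """Return (k / k**beta) * alpha**beta, which Lemma 5.15 requires to be < 1."""
    return (k / k**beta) * alpha**beta


def crossing_beta(k: float, alpha: float, tol: float = 1e-12) -> float:
    """Bisect for the beta in (0, 1) at which the quantity equals 1.

    Assumes k > 1 and 0 < alpha < 1, so the quantity is k > 1 at beta = 0,
    alpha < 1 at beta = 1, and strictly decreasing in between.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if lemma_quantity(k, alpha, mid) < 1:
            hi = mid
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    # Every beta strictly between this value and 1 satisfies the hypothesis.
    print(crossing_beta(8, 5 / 6))
```

Any $\beta$ strictly between the printed value and 1 satisfies the hypothesis, so the sketch gives a quick way to cross-check the threshold on $\beta$.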

Now we have that G is of intermediate growth and so we have answered Milnor’s question.

6 Bounding the growth of Grigorchuk’s group

The exact growth of Grigorchuk’s group is not currently known, but work has been done to give upper and lower bounds on it.

6.1 Lower bound

As before, let $\gamma$ be the growth function of $G$. We will give a lower bound of $e^{\sqrt{n}} \precsim \gamma(n)$ on $\gamma$ and then discuss a proof of Bartholdi [1] which gives an improvement on this bound.

Theorem 6.1 ([1]). $e^{\sqrt{n}} \precsim \gamma(n)$.

We will use the following lemmas to prove this theorem.

Lemma 6.2 ([1, Corollary 8]). Let $\delta$ be a weight on $S = \{a, b, c, d\}$, a generating set of $G$. Write $l$ for $l_{\delta}$. Assume there are constants $K \ge 0$ and $\omega > 0$ such that, for all $h \in H$ with $\psi(h) = (h_0, h_1; e)$, we have

\[ l(h) \le \omega \max(l(h_0), l(h_1)) + K. \]

Then $e^{n^{\alpha}} \precsim \gamma(n)$ where $\alpha = \frac{\log(2)}{\log(\omega)} < 1$.

Proof. We may write $\gamma$ for $\gamma_{\delta}$ since we know $\gamma \sim \gamma_{\delta}$. By Lemma 5.5, since $[G : H] = 2$, there is a constant $K_1$ such that

\[ 2\gamma^{H}(\omega n + K) \le \gamma(\omega n + K + K_1). \]

Now let $\psi(H)_0, \psi(H)_1$ be the groups such that $\psi(H) = \psi(H)_0 \times \psi(H)_1$. Since $[G \times G : \psi(H)] =: m \le 64$ is finite, as we noted in our proof of Proposition 5.2, we have that $a := [G : \psi(H)_0]$ and $b := [G : \psi(H)_1]$ are finite and that $ab = m \le 64$. By Lemma 5.5, there are constants $C, D$ such that

\[ \gamma(n - C) \le a\gamma^{\psi(H)_0}(n) \quad\text{and}\quad \gamma(n - D) \le b\gamma^{\psi(H)_1}(n). \]

Letting $K_2 = \max\{C, D\}$, we get

\[ \gamma(n - K_2)^2 \le ab\,\gamma^{\psi(H)_0}(n)\,\gamma^{\psi(H)_1}(n) \le 64\,\gamma^{\psi(H)_0}(n)\,\gamma^{\psi(H)_1}(n). \]

Because $l(h) \le \omega \max(l(h_0), l(h_1)) + K$, we have that if $l(h_0), l(h_1) \le n$ then $l(h) \le \omega n + K$, and so

\[ \gamma^{\psi(H)_0}(n)\,\gamma^{\psi(H)_1}(n) = \bigl|(B(n) \cap \psi(H)_0) \times (B(n) \cap \psi(H)_1)\bigr| \le \gamma^{H}(\omega n + K). \]

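Therefore we have

\[ \frac{1}{64}\gamma(n - K_2)^2 \le \gamma^{H}(\omega n + K) \le \frac{1}{2}\gamma(\omega n + K + K_1) \]

and so

\[ \gamma(n - K_2)^2 \le 32\gamma(\omega n + K + K_1) \]

for all $n > K_2$. Replacing $n$ by $n + K_2$, we get

\[ \gamma(n)^2 \le 32\gamma(\omega n + A) \]

for all $n > 0$ where $A := K + K_1 + \omega K_2$ (which is a constant). We claim that

\[ \gamma(n)^{2^k} \le 32^{2^k - 1}\,\gamma\left(\omega^k n + A\,\frac{\omega^k - 1}{\omega - 1}\right) \]

for $k \ge 1$, which we show by induction on $k$ (it is easy to see by iterating $\gamma(n)^2 \le 32\gamma(\omega n + A)$, but we include the proof for rigour). We already have the base case $k = 1$. For the inductive step, assume the result holds for $k \le r - 1$. Then

\[
\begin{aligned}
\gamma(n)^{2^r} = \bigl(\gamma(n)^2\bigr)^{2^{r-1}} &\le 32^{2^{r-1}} \cdot \gamma(\omega n + A)^{2^{r-1}} && \text{by the base case}\\
&\le 32^{2^{r-1}} \cdot 32^{2^{r-1} - 1}\,\gamma\left(\omega^{r-1}(\omega n + A) + A\,\frac{\omega^{r-1} - 1}{\omega - 1}\right) && \text{by the inductive hypothesis}\\
&= 32^{2^r - 1}\,\gamma\left(\omega^r n + A\,\frac{\omega^r - 1}{\omega - 1}\right)
\end{aligned}
\]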

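as we wanted. This inequality holds for all $n > 0$ since $\gamma(n)^2 \le 32\gamma(\omega n + A)$ holds for all $n > 0$.

Now let $L \in \mathbb{R}_{\ge 0}$ be such that $\gamma(L) > 32$ and $L \ge \frac{1}{\omega}$. Fixing $m \in \mathbb{R}_{\ge 0}$, let $r \in \mathbb{N}$ be the maximal integer such that

\[ n = n(r) := \omega^{-r}\left(m - A\,\frac{\omega^r - 1}{\omega - 1}\right) \ge L \]

which we know exists since $n$ is a decreasing function of $r$. Then $m = \omega^r n + A\,\frac{\omega^r - 1}{\omega - 1}$ and so

\[ \gamma(m) \ge \frac{\gamma(n)^{2^r}}{32^{2^r - 1}} \ge \left(\frac{\gamma(n)}{32}\right)^{2^r} \ge \left(\frac{\gamma(L)}{32}\right)^{2^r}. \]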

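Since $\frac{\gamma(L)}{32} > 1$ is a constant, by Lemma 2.12 we have that $e^{2^r} \precsim \gamma(m)$. Now we need to find $r$. Let $R \in \mathbb{R}_{\ge 0}$ be such that $n(R) = L$. Then $r = \lfloor R \rfloor$. We have

\[ \omega^{-R} m = L + A\,\frac{1 - \omega^{-R}}{\omega - 1} =: c \]

and so

\[ R = \frac{\log(m)}{\log(\omega)} - \frac{\log(c)}{\log(\omega)}. \]

Since $\omega > 1$ and $R \ge 0$, we have $0 \le 1 - \omega^{-R} < 1$, and so $c \le L + \frac{A}{\omega - 1} =: c_0$, a constant which does not depend on $m$. Now note that

\[ r \ge R - 1 \ge \frac{\log(m)}{\log(\omega)} - \frac{\log(c_0)}{\log(\omega)} - 1. \]

We set $\alpha = \frac{\log(2)}{\log(\omega)}$, so that $m^{\alpha} = 2^{\log(m)/\log(\omega)}$, and $D := 2^{-\log(c_0)/\log(\omega) - 1}$, so that $2^r \ge Dm^{\alpha}$. Then we have

\[ e^{Dm^{\alpha}} \le e^{2^r} \precsim \gamma(m). \]

Finally, $e^{Dm^{\alpha}} = e^{(D^{1/\alpha} m)^{\alpha}}$ is obtained from $e^{m^{\alpha}}$ by rescaling the argument by the constant $D^{1/\alpha}$, so $e^{m^{\alpha}} \sim e^{Dm^{\alpha}}$, which gives us our result that $e^{m^{\alpha}} \precsim \gamma(m)$.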

So we want to find a bound of the form $l(h) \le \omega \max(l(h_0), l(h_1)) + K$ for all $h \in H$.

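Definition 6.3. Define two maps $\sigma : G \to H$ and $\tau : G \to G$ on the generators by

\[ \sigma : \quad a \mapsto aca, \quad b \mapsto d, \quad c \mapsto b, \quad d \mapsto c \]

and

\[ \tau : \quad a \mapsto d, \quad b \mapsto e, \quad c \mapsto a, \quad d \mapsto a. \]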

Lemma 6.4. $\sigma$ and $\tau$ extend to homomorphisms.

We will omit the proof but point the reader towards Lysenok's widely-cited work in [11], which Bartholdi's proof that $e^{\sqrt{n}} \precsim \gamma(n)$ also cites in this context. Lysenok gives a set of defining relations for Grigorchuk's group, allowing him to show that $\sigma$ and $\tau$ extend to homomorphisms.

The following lemma is a generalisation of Bartholdi's work [1, Proposition 7]. We give a result in terms of general $\omega$ and $K$. Bartholdi showed a special case of this result for $\omega = 4$ and $K = 12$, which we will show next.

Lemma 6.5. Take the generating set $S = \{a, b, c, d\}$ of $G$ and let $\delta$ be a weight on $S$. Write $l$ for $l_{\delta}$. If $\omega$ is such that $l(\sigma(g)) \le \frac{\omega}{2} l(g) + L$ for some constant $L$ then, for all $h \in H$ with $\psi(h) = (h_0, h_1; e)$, we have

\[ l(h) \le \omega \max(l(h_0), l(h_1)) + K \]

where $K = 2L + 6\delta(a) + 4\delta(c)$.

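Proof. We can show that for all $g \in G$ we have $\sigma(g) = \phi(\tau(g), g)$ by checking for the elements in $S$ as follows, and then noting that $\sigma$, $\tau$ and $\phi$ are homomorphisms.

\[
\begin{aligned}
\sigma(a) &= aca = \phi(d, a) = \phi(\tau(a), a)\\
\sigma(b) &= d = \phi(e, b) = \phi(\tau(b), b)\\
\sigma(c) &= b = \phi(a, c) = \phi(\tau(c), c)\\
\sigma(d) &= c = \phi(a, d) = \phi(\tau(d), d)
\end{aligned}
\]

as required.

Now we will write $(h_0, h_1) = \psi(h)$ and claim that $h = a\sigma(h_0)a \cdot \sigma(\tau(h_0)^{-1}h_1)$. To show this, note that $\psi$ is injective, so it is sufficient to show that $\psi(a\sigma(h_0)a \cdot \sigma(\tau(h_0)^{-1}h_1)) = \psi(h)$. We get that

\[
\begin{aligned}
\psi(a\sigma(h_0)a \cdot \sigma(\tau(h_0)^{-1}h_1)) &= (h_0, \tau(h_0)) \cdot (\tau(\tau(h_0)^{-1}h_1), \tau(h_0)^{-1}h_1)\\
&= (h_0, \tau(h_0)\tau(h_0)^{-1}h_1) \cdot (\tau(\tau(h_0)^{-1}h_1), e)\\
&= (h_0, h_1) \cdot (\tau(\tau(h_0)^{-1}h_1), e).
\end{aligned}
\]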

Since $\sigma(h_0) \in H$ we get $a\sigma(h_0)a \in H$, and since $\sigma(\tau(h_0)^{-1}h_1) \in H$, we have that $\psi(a\sigma(h_0)a \cdot \sigma(\tau(h_0)^{-1}h_1)) \in \psi(H)$. We also have that $(h_0, h_1) \in \psi(H)$ and so we must have

\[ (\tau(\tau(h_0)^{-1}h_1), e) = (h_0, h_1)^{-1}\,\psi(a\sigma(h_0)a \cdot \sigma(\tau(h_0)^{-1}h_1)) \in \psi(H). \]

Therefore $(\tau(\tau(h_0)^{-1}h_1), e) \in (\tau(G) \times \{e\}) \cap \psi(H)$. Note that $\tau(G) = \langle a, d \rangle$. Looking at the definition of $\psi$, we can see that $(G \times \{e\}) \cap \psi(H) = \{(e, e), (b, e)\}$ and so $(\tau(G) \times \{e\}) \cap \psi(H) = \{(e, e)\}$. Therefore $(\tau(\tau(h_0)^{-1}h_1), e) = (e, e)$ which tells us that

\[
\begin{aligned}
\psi(a\sigma(h_0)a \cdot \sigma(\tau(h_0)^{-1}h_1)) &= (h_0, h_1) \cdot (\tau(\tau(h_0)^{-1}h_1), e)\\
&= (h_0, h_1) \cdot (e, e)\\
&= (h_0, h_1) = \psi(h).
\end{aligned}
\]

By injectivity of $\psi$, we get that $h = a\sigma(h_0)a \cdot \sigma(\tau(h_0)^{-1}h_1)$, so we have proved our claim.

Recall that $\tau(G) = \langle a, d \rangle = \{e, ad, adad, da, d, dad, ada, a\}$. Every element of $\tau(G)$ can be written as a subword of $adad$ and therefore, applying $\sigma$ as a rewriting rule, for every $g \in G$, we can see that $\sigma(\tau(g))$ can be written as a subword of $acacacac = \sigma(\tau(adad))$. Therefore $l(\sigma(\tau(g))) \le l(\sigma(\tau(adad))) \le 4\delta(a) + 4\delta(c)$.

By assumption, we have that $l(\sigma(g)) \le \frac{\omega}{2} l(g) + L$. Since $h = a\sigma(h_0)a \cdot \sigma(\tau(h_0)^{-1}h_1)$, we have

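\[
\begin{aligned}
l(h) &\le \delta(a) + l(\sigma(h_0)) + \delta(a) + l(\sigma(\tau(h_0)^{-1})) + l(\sigma(h_1))\\
&\le 2\delta(a) + \left(\tfrac{\omega}{2} l(h_0) + L\right) + \bigl(4\delta(a) + 4\delta(c)\bigr) + \left(\tfrac{\omega}{2} l(h_1) + L\right)\\
&= \tfrac{\omega}{2}\bigl(l(h_0) + l(h_1)\bigr) + 2L + 6\delta(a) + 4\delta(c)\\
&\le \omega \max\{l(h_0), l(h_1)\} + 2L + 6\delta(a) + 4\delta(c) && \text{since the maximum is no smaller than the mean}\\
&= \omega \max\{l(h_0), l(h_1)\} + K
\end{aligned}
\]

as required.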

Lemma 6.6 ([1, Proposition 7]). Take the generating set $S = \{a, b, c, d\}$ of $G$ and let $\delta$ be the standard weight $\delta(s) = 1$ for all $s \in S$. Write $l$ for $l_{\delta}$. Then, for all $h \in H$ with $\psi(h) = (h_0, h_1; e)$, we have

\[ l(h) \le 4 \max(l(h_0), l(h_1)) + 12. \]

Proof. We will show that $l(\sigma(g)) \le 2l(g) + 1$ and then apply Lemma 6.5. Looking at the definition of $\sigma$, we can see that if $g \in G$ and $v$ is a word representing $g$ over the alphabet $S$ such that $|v| = \delta(v) = l(g)$ then $l(\sigma(g)) \le 3|v|_a + |v|_b + |v|_c + |v|_d$. This is maximised when $|v|_a = \frac{|v| + 1}{2}$ and $|v|_b + |v|_c + |v|_d = \frac{|v| - 1}{2}$, and so we get

\[ l(\sigma(g)) \le \tfrac{1}{2}(4|v| + 2) = 2|v| + 1 = 2l(g) + 1. \]

Therefore we can apply Lemma 6.5 with $\omega = 4$ and $L = 1$ to get that, for all $h \in H$ with $\psi(h) = (h_0, h_1; e)$, we have $l(h) \le \omega \max(l(h_0), l(h_1)) + K$ where $K = 2L + 6\delta(a) + 4\delta(c) = 12$. Therefore we have

\[ l(h) \le 4 \max(l(h_0), l(h_1)) + 12 \]

as required.

Our proof that $e^{\sqrt{n}} \precsim \gamma(n)$ follows directly from the above lemmas.

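Proof of Theorem 6.1. By Lemma 6.6, we can apply Lemma 6.2 with $\omega = 4$, which tells us that $e^{n^{\alpha}} \precsim \gamma(n)$ for

\[ \alpha = \frac{\log(2)}{\log(\omega)} = \frac{\log(2)}{\log(4)} = \frac{\log(2)}{\log(2^2)} = \frac{\log(2)}{2\log(2)} = \frac{1}{2}. \]

So we have $e^{\sqrt{n}} \precsim \gamma(n)$ as required.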
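The word-length estimate behind Lemma 6.6 can be sanity-checked by brute force. The sketch below is illustrative only: it applies $\sigma$ letter by letter as a rewriting rule with the standard weight, and it models minimal-length words by alternating words (the letter $a$ alternating with letters from $\{b, c, d\}$), which is an assumption made for the enumeration rather than a characterisation of minimal words. The helper names are our own.

```python
from itertools import product

# Letter-by-letter rewriting rule for sigma, as in Definition 6.3.
SIGMA = {"a": "aca", "b": "d", "c": "b", "d": "c"}


def apply_sigma(word: str) -> str:
    """Rewrite each letter of `word` by sigma, with no reduction in the group."""
    return "".join(SIGMA[letter] for letter in word)


def alternating_words(length: int):
    """Yield every word of the given length in which a alternates with b, c or d."""
    for starts_with_a in (True, False):
        slots = [
            "a" if (i % 2 == 0) == starts_with_a else "bcd"
            for i in range(length)
        ]
        for letters in product(*slots):
            yield "".join(letters)


def bound_holds(max_length: int) -> bool:
    """Check |sigma(v)| <= 2|v| + 1 for all alternating words up to max_length."""
    return all(
        len(apply_sigma(word)) <= 2 * len(word) + 1
        for length in range(max_length + 1)
        for word in alternating_words(length)
    )


if __name__ == "__main__":
    print(bound_holds(10))
```

The bound is attained by words that start and end with $a$, such as $aba$, which is rewritten to a word of length $7 = 2 \cdot 3 + 1$.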

After Milnor's question concerning the existence of groups of intermediate growth was answered by Grigorchuk, there was a lot of interest in finding out whether groups of growth equivalent to $e^{\sqrt{n}}$ existed. Grigorchuk conjectured in [6] that the growth of the Grigorchuk group was equivalent to $e^{\sqrt{n}}$. However, this conjecture was disproved by Leonov in [10] in 2000. Leonov gave a tighter lower bound on the growth $\gamma$ of Grigorchuk's group when he proved that $\gamma(n) \succsim e^{n^{\alpha}}$ where $\alpha = \frac{\log(2)}{\log(87/22)} > 0.504$. Around the same time, Bartholdi gave what is currently the best known lower bound on the growth of Grigorchuk's group. In [1], he showed that $e^{n^{0.5157}} \precsim \gamma(n)$.

Theorem 6.7 ([1, Theorem 1]). $e^{n^{\alpha}} \precsim \gamma(n)$ where $\alpha = \frac{\log(2)}{\log(3.83414)}$.

Note that $0.5158 > \alpha > 0.5157$. This lower bound was achieved by constructing a finite state automaton to show that $l(h) \le \omega \max(l(h_0), l(h_1)) + K$ where $\omega = 3.83414$ for all $h = \phi(h_0, h_1)$. We describe how Bartholdi uses finite state automata for his lower bound proof, but direct the reader to the original paper [1] for the construction of the automaton used.
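The exponents quoted for the lower bounds all have the form $\alpha = \frac{\log(2)}{\log(\omega)}$ from Lemma 6.2, so they can be reproduced with a few lines of arithmetic. The function name below is our own; the three values of $\omega$ are the ones appearing in the text.

```python
import math


def growth_exponent(omega: float) -> float:
    """Return alpha = log(2) / log(omega), the exponent given by Lemma 6.2."""
    return math.log(2) / math.log(omega)


if __name__ == "__main__":
    for label, omega in [
        ("Theorem 6.1, omega = 4", 4.0),
        ("Leonov, omega = 87/22", 87 / 22),
        ("Theorem 6.7, omega = 3.83414", 3.83414),
    ]:
        print(f"{label}: alpha = {growth_exponent(omega):.5f}")
```

A smaller $\omega$ gives a larger exponent, which is why improving the constant in the inequality $l(h) \le \omega \max(l(h_0), l(h_1)) + K$ improves the lower bound.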

Definition 6.8. A finite state automaton consists of a finite set of states $V$, a finite alphabet $S$ of symbols containing a padding character $\div$, a transition function $t$, a start state $* \in V$ and an end state $\dagger \in V$. Each state is either an input state or an output state. Letting $I$ be the set of input states and $O$ be the set of output states, we get $V = I \sqcup O$. The transition function $t$ consists of two functions, an input transition $t_I : I \times (S^* \times S^*) \to V$ and an output transition $t_O : O \to V \times S^*$. We say that the input transition $t_I(x, (u, v))$ is labelled by the pair $(u, v)$ of input words in $S^*$ and the output transition $t_O(y) = (y', w)$ is labelled by the single word $w \in S^*$.

We will view a finite state automaton as a directed labelled graph $\Gamma$ with vertex set $V$ and edge set

\[ \{ x \xrightarrow{(u,v)} t_I(x, (u, v)) : x \in I,\ u, v \in S^* \} \cup \{ y \xrightarrow{w} y' : t_O(y) = (y', w),\ y \in O,\ w \in S^*,\ y' \in V \}. \]

Given a path $p = e_1 \cdots e_n$ in $\Gamma$, the input $i(p)$ of $p$ is the pair of words $i(p) = (i(p)_0, i(p)_1)$ obtained by concatenating labels of all input transitions along the path $p$ in sequence, and the output $o(p)$ of $p$ is the word obtained by concatenating labels of all output transitions along the path $p$ in sequence.

In order to run the automaton on an input $(v_0, v_1) \in S^* \times S^*$, first pad the input words to make them infinite so that our input pair is $(v_0\div^{*}, v_1\div^{*})$. Then we step through the states, inputting a new subword of $(v_0\div^{*}, v_1\div^{*})$ or outputting a word at every state.

Let $p$ be a path from $*$ to $\dagger$ in $\Gamma$. We say that $o(p)$ is an output of the automaton on input $(v_0, v_1)$ if $i(p) = (v_0\div^{*}, v_1\div^{*})$. We will consider only automata which have exactly one output for each path.

Our definition of finite state automata is non-standard because it allows states to output words and takes input in pairs of input words instead of allowing only one letter for each output and input state. However, our definition of finite state automata is equivalent to a special case of the general definition. The standard definition of finite state automata may be found in [9].

To prove Theorem 6.7, Bartholdi uses a weight $\delta$ and constructs a finite automaton to find, for any pair of minimal-length words $(v_0, v_1)$ representing a pair of group elements $(h_0, h_1) \in \psi(H)$, a word $v$ representing $\phi(h_0, h_1)$ such that $\delta(v) \le \omega \max(\delta(v_0), \delta(v_1)) + K$ where $\omega = 3.83414$. Then we have

\[ l_{\delta}(h) \le \delta(v) \le \omega \max(\delta(v_0), \delta(v_1)) + K = \omega \max(l_{\delta}(h_0), l_{\delta}(h_1)) + K, \]

and so, by Lemma 6.2, we have $e^{n^{\alpha}} \precsim \gamma(n)$ where $\alpha = \frac{\log(2)}{\log(\omega)}$ and $0.5158 > \alpha > 0.5157$. This is the same idea as for the proof that $\gamma(n) \succsim e^{\sqrt{n}}$, except we need better rewriting rules than the ones we have to hand, and so we use the automaton to rewrite words for us.
The following lemma gives us a bound on word lengths similar to that required by Lemma 6.2 to give a lower bound on the growth of γ(n), but for words in the alphabet of an automaton rather than group elements.

Lemma 6.9 ([1, Lemma 9]). Let $\Gamma$ be a finite state automaton. Let $\kappa : S^* \to \mathbb{R}_{\ge 0}$ be a function assigning a positive weight to every letter in $S$ other than the padding character, with $\kappa(\div) = 0$ and with the property that $\kappa(uu') = \kappa(u) + \kappa(u')$. Assume that $\kappa(o(p))$ is bounded by $K$ for all paths $p$ of length no greater than $|V|$. If $\omega$ is such that $\kappa(o(l)) < \frac{\omega}{2}\bigl(\kappa(i(l)_0) + \kappa(i(l)_1)\bigr)$ for all directed loops $l$ in $\Gamma$ then for every input pair $(v_0, v_1)$, we have an output word $v$ such that

\[ \kappa(v) \le \omega \max(\kappa(v_0), \kappa(v_1)) + K. \]

Proof. Let $p$ be a path in $\Gamma$ from $*$ to $\dagger$ such that $i(p) = (v_0\div^{*}, v_1\div^{*})$. Note that the graph has a finite number $|V|$ of vertices. Therefore, we can write $p$ as a sequence of paths and loops $p = p_0 l_1 p_1 \cdots l_m p_m$ where the $l_i$ are all the loops in $p$ and so the $p_i$ contain no loops. Then the path $p_0 \cdots p_m$ also contains no loops, and so is of length at most $|V|$. We get

\[
\begin{aligned}
\kappa(v) &= \sum_{j=1}^{m} \kappa(o(l_j)) + \sum_{j=0}^{m} \kappa(o(p_j))\\
&\le \sum_{j=1}^{m} \frac{\omega}{2}\bigl(\kappa(i(l_j)_0) + \kappa(i(l_j)_1)\bigr) + K && \text{by assumptions}\\
&\le \frac{\omega}{2}\bigl(\kappa(v_0) + \kappa(v_1)\bigr) + K\\
&\le \omega \max\{\kappa(v_0), \kappa(v_1)\} + K && \text{since the mean is no greater than the maximum}
\end{aligned}
\]

as required.

Sketch proof of Theorem 6.7. First, fix for each $g \in G$ a shortest alternating decomposition $g \in \{a, b, c, d\}^*$ and let $M := \{g : g \in G\}$. Let the finite state automaton $\Gamma$ have finite set of states $V = M \times M$ and alphabet $S = \{a, b, c, d, \div\}$. The rest of the description of this automaton can be found in [1], but is not relevant for our sketch proof. Define a weight $\delta$ on the group by

\[ \delta(a) = 1, \quad \delta(b) = 3.33, \quad \delta(c) = 2.8, \quad \delta(d) = 1.06 \]

and define the weight function for the automaton to be $\kappa(g) = \delta(g)$.

Bartholdi shows that, for any input pair $(h_0, h_1)$, the automaton $\Gamma$ outputs a word $v$ representing $\phi(h_0, h_1)$ where $(h_0, h_1) \in \psi(H)$. It also has the properties that $\kappa(o(p))$ is bounded by $K$ for all paths $p$ of length no greater than $|V|$, and that $\kappa(o(l)) < \frac{\omega}{2}\bigl(\kappa(i(l)_0) + \kappa(i(l)_1)\bigr)$ for all directed loops $l$ in $\Gamma$ where $\omega = 3.83414$. Applying Lemma 6.9 to this automaton, we get that, for every input pair $(h_0, h_1)$ with $\psi(h) = (h_0, h_1)$, we have an output word $v$ such that

\[ \kappa(v) \le \omega \max(\kappa(h_0), \kappa(h_1)) + K. \]

Noting that $l_{\delta}(h) \le \kappa(v)$ and $l_{\delta}(h_i) = \kappa(h_i)$ for $i = 0, 1$, we get

\[ l_{\delta}(h) \le \omega \max(l_{\delta}(h_0), l_{\delta}(h_1)) + K \]

for all $h$ such that $\psi(h) = (h_0, h_1)$. Therefore, by Lemma 6.2, we have $e^{n^{\alpha}} \precsim \gamma(n)$ where $\alpha = \frac{\log(2)}{\log(\omega)}$ and $\omega = 3.83414$ as required.

6.1.1 Discussion of a possible idea for improving the lower bound

The following proposition is stated but not proved in Bartholdi's paper [2]. Bartholdi suggests that it can lead to tighter upper bounds on the growth of $G$, and can be proved using Lysenok's set of defining relations for $G$ found in [11]. We will give our own elementary proof and discuss whether it may be exploited to improve the current bounds on the growth of Grigorchuk's group.

Proposition 6.10. Let $\delta$ be a triangular weight on $G$ with generating set $S = \{a, b, c, d\}$. Let $g \in G$ and let $w$ be a shortest alternating decomposition of $g$ with $|w| \ge 5$. Then $w$ does not contain $dada$ as a subword.

Proof. First note that $(da)^4 = e$ by Lemma 4.4. Therefore $dada = a^{-1}d^{-1}a^{-1}d^{-1} = adad$ since $a$ and $d$ have order 2 by Lemma 4.1. Assume for a contradiction that $v = dadau$ is a subword of $w$ for some $u \in \{b, c, d\}$. Then, since $w$ is a shortest alternating decomposition, we know that $\delta(dadau) = 2\delta(a) + 2\delta(d) + \delta(u)$. Since $dada = adad$, we can rewrite $v$ as $adadu = adau' =: v'$ where $u' = du \in \{b, c, e\}$, and $\delta(adau') = 2\delta(a) + \delta(d) + \delta(u')$ where $\delta(u') < \delta(d) + \delta(u)$. So $\delta(v') < \delta(v)$, and so replacing $v$ with $v'$ in $w$ gives us a shorter alternating decomposition, contradicting the assumption that $w$ was a shortest alternating decomposition. If $w$ contains the subword $v = adada$ then we can rewrite this as $aadad = dad =: v'$ and argue in the same way.
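Relations such as $(da)^4 = e$ can be sanity-checked by letting words act on a finite level of the binary tree. The sketch below assumes the usual recursive description of the generators, namely that $a$ swaps the two subtrees below the root and that $b = \phi(a, c)$, $c = \phi(a, d)$, $d = \phi(e, b)$; vertices are encoded as binary strings and all function names are our own. A trivial action on one finite level is necessary for a word to be trivial in $G$ but not sufficient, so this is a consistency check rather than a proof.

```python
# Assumed recursion: b = (a, c), c = (a, d), d = (e, b); "" stands for the identity.
SECTIONS = {"b": ("a", "c"), "c": ("a", "d"), "d": ("", "b")}


def act(letter: str, vertex: str) -> str:
    """Return the image of a vertex (a binary string) under one generator."""
    if not vertex:
        return vertex
    head, tail = vertex[0], vertex[1:]
    if letter == "a":
        # a swaps the two subtrees below the root.
        return ("1" if head == "0" else "0") + tail
    left, right = SECTIONS[letter]
    section = left if head == "0" else right
    return head + (act(section, tail) if section else tail)


def act_word(word: str, vertex: str) -> str:
    """Apply the letters of `word` to the vertex one after another."""
    for letter in word:
        vertex = act(letter, vertex)
    return vertex


def acts_trivially(word: str, depth: int) -> bool:
    """Check whether `word` fixes every vertex on the given level of the tree."""
    vertices = (format(i, f"0{depth}b") for i in range(2**depth))
    return all(act_word(word, v) == v for v in vertices)


if __name__ == "__main__":
    print(acts_trivially("da" * 4, 6), acts_trivially("dada", 6))
```

Under these assumptions the same helper can be used for the other relations from Lemma 4.4, for example `acts_trivially("ca" * 8, 6)` and `acts_trivially("ba" * 16, 6)`.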

We can also show, using a completely analogous proof, that a shortest alternating decomposition $w$ of an element of $G$ with $|w| \ge 9$ does not contain $(ca)^4$ as a subword, and that a shortest alternating decomposition $w$ of an element of $G$ with $|w| \ge 17$ does not contain $(ba)^8$ as a subword since, by Lemma 4.4, we have $(ba)^{16} = e = (ca)^8$ and so $(ba)^8 = (ab)^8$ and $(ca)^4 = (ac)^4$. The following corollary is our own, and an obvious consequence of Proposition 6.10.

Corollary 6.11. Let $\delta$ be a triangular weight on $G$ with generating set $S = \{a, b, c, d\}$. If $g \in G$ has a shortest alternating decomposition $w$ such that $|w| \ge 5$ then

\[ |w|_d \le |w|_b + |w|_c + 1. \]

Proof. Let $w = (a)u_1 a u_2 \cdots a u_k (a)$. By Proposition 6.10, at most every other $u_i$ can be $d$, and the others must be $b$ or $c$. If $u_1 \ne d$ then for all $i > 1$ such that $u_i = d$ we have $u_{i-1} = b$ or $u_{i-1} = c$, so we can have at most $|w|_b + |w|_c$ copies of $d$ in $w$. If $u_1 = d$ then for each $i < k$ such that $u_i = d$, we must have $u_{i+1} = b$ or $u_{i+1} = c$. We can have an extra $d$ if $u_k = d$. Therefore we can have at most $|w|_b + |w|_c + 1$ copies of $d$ in $w$.

So $|w|_d \le |w|_b + |w|_c + 1$. Intuitively, it seems that this fact could be used to find an improved lower bound on the growth of Grigorchuk's group, or at least a simpler proof of the fact that Grigorchuk's group grows strictly more quickly than $e^{\sqrt{n}}$, because it gives us a bound on some of the letters in shortest words representing group elements. However, we suggest that it might be shown that more is needed.

In order to prove that Grigorchuk's group grows strictly more quickly than $e^{\sqrt{n}}$, it is sufficient to show that, for any shortest alternating decomposition $v$ of $g$, there is some $x < 2$ such that

\[ \frac{\delta(v') - L}{\delta(v)} \le x \]

where $v'$ is the word we get by applying $\sigma$ to $v$ as a rewriting rule. This is because we then have $l_{\delta}(\sigma(g)) \le \delta(v') \le x\delta(v) + L = x l_{\delta}(g) + L$ since $l_{\delta}(g) = \delta(v)$, and so we can apply Lemma 6.5 with $\omega < 4$. By the definition of $\sigma$ we can write

\[ l_{\delta}(\sigma(g)) \le \delta(v') = (2\delta(a) + \delta(c))|v|_a + \delta(d)|v|_b + \delta(b)|v|_c + \delta(c)|v|_d. \]

Note that

\[ \delta(v) = \delta(a)|v|_a + \delta(b)|v|_b + \delta(c)|v|_c + \delta(d)|v|_d. \]

One method of exploiting Corollary 6.11 which we have tried is applying it to the above bounds on the lengths of $\sigma(g)$ and $g$ to get a bound of the form $\frac{\delta(v') - L}{\delta(v)} < 2$. It would be quite simple to write a computer program to test different triangular weights to try to achieve this bound. However, it looks as though Corollary 6.11 is not enough to give us a result. In fact, it could be true that $\frac{\delta(v') - L}{\delta(v)} \ge 2$ for all possible weights. It would still be a useful contribution to the area to show that $\frac{\delta(v') - L}{\delta(v)} \ge 2$ for all possible weights.
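A program of the kind mentioned above might look like the following sketch. It is a crude experiment under several simplifying assumptions of our own: it works only with letter counts, not with actual shortest decompositions; it takes $|v|_a = |v|_b + |v|_c + |v|_d = n$ as a stand-in for alternation; it imposes only the constraint $|v|_d \le |v|_b + |v|_c + 1$ from Corollary 6.11; and it does not check that the supplied weight is triangular. The function names and the default values of `n` and `L` are hypothetical.

```python
from itertools import product


def delta(counts: dict, weight: dict) -> float:
    """Weight of a word v, computed from its letter counts."""
    return sum(weight[x] * counts[x] for x in "abcd")


def delta_sigma(counts: dict, weight: dict) -> float:
    """Weight of v', the word obtained from v by a -> aca, b -> d, c -> b, d -> c."""
    return (
        (2 * weight["a"] + weight["c"]) * counts["a"]
        + weight["d"] * counts["b"]
        + weight["b"] * counts["c"]
        + weight["c"] * counts["d"]
    )


def worst_ratio(weight: dict, n: int = 40, L: float = 0.0) -> float:
    """Largest (delta(v') - L) / delta(v) over the admissible letter counts.

    Admissible here means |v|_a = n, |v|_b + |v|_c + |v|_d = n and
    |v|_d <= |v|_b + |v|_c + 1 (an assumed model, see the lead-in).
    """
    best = 0.0
    for nb, nc in product(range(n + 1), repeat=2):
        nd = n - nb - nc
        if nd < 0 or nd > nb + nc + 1:
            continue
        counts = {"a": n, "b": nb, "c": nc, "d": nd}
        ratio = (delta_sigma(counts, weight) - L) / delta(counts, weight)
        best = max(best, ratio)
    return best


if __name__ == "__main__":
    standard = {"a": 1, "b": 1, "c": 1, "d": 1}
    print(worst_ratio(standard))
```

With the standard weight every admissible count gives the ratio exactly 2, matching the constant in Lemma 6.6. A search over candidate weights would call `worst_ratio` for each one and look for a value below 2; within this model the weight used in the sketch proof of Theorem 6.7 does not achieve that.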

6.2 Upper bound The best known upper bound on the growth of Grigorchuk’s group is due to Bartholdi and can be found in [2].

Theorem 6.12 ([2, Theorem 1]). Let $\zeta$ be the real root of the polynomial $x^3 + x^2 + x - 2$ and let $\alpha = \frac{\log(2)}{\log(2/\zeta)}$. Then $\gamma(n) \precsim e^{n^{\alpha}}$.

Note that $0.768 > \alpha > 0.767$. Therefore the best known bounds on the growth of Grigorchuk's group are $e^{n^{0.5157}} \precsim \gamma(n) \precsim e^{n^{0.767}}$.

We omit the proof of this upper bound and direct the reader to Bartholdi's original paper [2]. We will say that the proof uses a cleverly constructed weight and a lot of calculations to give a bound on the growth of Grigorchuk's group in terms of Catalan numbers, which tell us how many ways there are to bracket sums of a certain number of elements, or, equivalently, how many labelled full binary rooted trees with a certain number of leaves there are. The elements of $B(n)$ are represented as bracketings (or, equivalently, trees) which are defined using $\psi$, so we already have all the machinery to prove this theorem. However, we found a few minor flaws in some of the calculations which should not affect the result but do not allow us to give a full proof because they cause us to miss a few edge cases.
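The constant $\alpha$ in Theorem 6.12 can be reproduced numerically. The sketch below finds the real root $\zeta$ of $x^3 + x^2 + x - 2$ by bisection on $[0, 1]$, which is valid because the polynomial is increasing (its derivative $3x^2 + 2x + 1$ has no real roots) and changes sign on that interval. The function names are our own.

```python
import math


def p(x: float) -> float:
    """The polynomial x^3 + x^2 + x - 2 from Theorem 6.12."""
    return x**3 + x**2 + x - 2


def real_root(lo: float = 0.0, hi: float = 1.0, tol: float = 1e-14) -> float:
    """Bisect for the unique real root, using p(0) < 0 < p(1) and p increasing."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if p(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def upper_bound_exponent() -> float:
    """Return alpha = log(2) / log(2 / zeta)."""
    zeta = real_root()
    return math.log(2) / math.log(2 / zeta)


if __name__ == "__main__":
    print(real_root(), upper_bound_exponent())
```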

7 Concluding Remarks

In this essay, we have shown that Grigorchuk’s group G is an infinite torsion group of intermediate growth, and we have proved that the growth γ is bounded by

\[ e^{\sqrt{n}} \precsim \gamma(n) \precsim e^{n^{0.92}}. \]

The exact growth of Grigorchuk’s group is unknown, but we have discussed the best current bounds which tell us that

\[ e^{n^{0.5157}} \precsim \gamma(n) \precsim e^{n^{0.767}}. \]

We do not know if these bounds are tight. It would be interesting to try to tighten these bounds, and it seems that new and more precise ways of bounding the lengths of words will be required for this. We do not even know if the growth of Grigorchuk's group has the form $\gamma(n) \sim e^{n^{\alpha}}$ for some $\alpha$. It may well be that $\gamma$ oscillates in the range between our two best bounds.

Another question in this area is whether there exists a group whose growth is equivalent to $e^{\sqrt{n}}$. Since we have shown that Grigorchuk's group is not one of these groups, the question remains open.

References

[1] Laurent Bartholdi: 'Lower Bounds on the Growth of Grigorchuk's Torsion Group', arXiv:math/9910068v1, 1999.

[2] Laurent Bartholdi: 'The Growth of Grigorchuk's Torsion Group', International Mathematics Research Notices, 20, 1049–1054, 1998.

[3] W. Burnside: 'On an unsettled question in the theory of discontinuous groups', Quarterly Journal of Mathematics, 33, 230–238, 1902.

[4] Pierre de la Harpe: Topics in Geometric Group Theory, University of Chicago Press, 2000.

[5] R. I. Grigorchuk: 'On the Milnor problem of group growth', Soviet Mathematics Doklady, 28, 23–26, 1983, as discussed in [8].

[6] R. I. Grigorchuk: 'Degrees of growth of finitely generated groups and the theory of invariant means', Mathematics of the USSR-Izvestiya, 25, 259–300, 1985. (English translation)

[7] Rostislav Grigorchuk: 'Milnor's Problem on the Growth of Groups and its Consequences', arXiv:1111.0512v4, 2013.

[8] Rostislav Grigorchuk and Igor Pak: 'Groups of Intermediate Growth: an Introduction for Beginners', arXiv:math/0607384v1, 2006.

[9] John E. Hopcroft and Jeffrey D. Ullman: Introduction to automata theory, languages, and computation, Addison-Wesley, 1979.

[10] Yu. G. Leonov: 'On a Lower Estimate of the Growth Function for the Grigorchuk Group', Mathematical Notes, 67, No. 3, 403–405, 2000. (English translation)

[11] I. G. Lysenok: 'A system of defining relations for a Grigorchuk group', Mathematical Notes, 38, No. 4, 784–792, 1985. (English translation)

[12] J. Milnor: 'A note on curvature and fundamental group', Journal of Differential Geometry, 2, 1–7, 1968.

[13] J. Milnor: Problem 5603, The American Mathematical Monthly, 75, No. 6, 685–687, 1968.

[14] A. S. Švarc: 'A volume invariant of coverings', Doklady Akademii Nauk SSSR, 105, 32–34, 1955, as discussed in [7].

[15] Joseph A. Wolf: 'Growth of finitely generated solvable groups and curvature of Riemannian manifolds', Journal of Differential Geometry, 2, 421–446, 1968.