Random walks on disordered media and their scaling limits

Takashi Kumagai*
Department of Mathematics, Kyoto University, Kyoto 606-8502, Japan
Email: [email protected]

(June 27, 2010) (Version for St. Flour Lectures)

Abstract

The main theme of these lectures is to analyze heat conduction on disordered media such as fractals and percolation clusters using both probabilistic and analytic methods, and to study the scaling limits of Markov chains on these media.

The problem of random walk on a percolation cluster, 'the ant in the labyrinth', has received much attention in both the physics and the mathematics literature. In 1986, H. Kesten showed the anomalous behavior of a random walk on a percolation cluster at criticality for trees and for $\mathbb{Z}^2$. (To be precise, the critical percolation cluster is finite, so the random walk is considered on an incipient infinite cluster (IIC), namely a critical percolation cluster conditioned to be infinite.) Partly motivated by this work, analysis and diffusion processes on fractals have been developed since the late eighties. As a result, various new methods have been produced to estimate heat kernels on disordered media, and these turn out to be useful for establishing quenched estimates on random media. Recently, it has been proved that random walks on IICs are sub-diffusive on $\mathbb{Z}^d$ when $d$ is high enough, on trees, and on spread-out oriented percolation for $d > 6$.

Throughout the lectures, I will survey the above mentioned developments in a compact way. In the first part of the lectures, I will summarize some classical and non-classical estimates for heat kernels, and discuss stability of the estimates under perturbations of operators and spaces. Here Nash inequalities and equivalent inequalities will play a central role. In the latter part of the lectures, I will give various examples of disordered media and obtain heat kernel estimates for Markov chains on them. In some models, I will also discuss scaling limits of the Markov chains. Examples of disordered media include fractals, percolation clusters, random conductance models and random graphs.

*Research partially supported by the Grant-in-Aid for Scientific Research (B) 22340017.

Contents

0 Plan of the lectures and remark 3

1 Weighted graphs and the associated Markov chains 4
1.1 Weighted graphs ...... 4
1.2 Harmonic functions and effective resistances ...... 9
1.3 Trace of weighted graphs ...... 16

2 Heat kernel upper bounds (The Nash inequality) 17
2.1 The Nash inequality ...... 17
2.2 The Faber-Krahn, Sobolev and isoperimetric inequalities ...... 20

3 Heat kernel estimates using effective resistance 25
3.1 Green density killed on a finite set ...... 27
3.2 Green density on a general set ...... 30
3.3 General heat kernel estimates ...... 32
3.4 Strongly recurrent case ...... 33
3.5 Applications to fractal graphs ...... 36

4 Heat kernel estimates for random weighted graphs 38
4.1 Random walk on a ...... 39
4.2 The IIC and the Alexander-Orbach conjecture ...... 42

5 The Alexander-Orbach conjecture holds when two-point functions behave nicely 43
5.1 The model and main theorems ...... 43
5.2 Proof of Proposition 5.4 ...... 46
5.3 Proof of Proposition 5.3 i) ...... 49
5.4 Proof of Proposition 5.3 ii) ...... 51

6 Further results for random walk on IIC 52
6.1 Random walk on IIC for critical percolation on trees ...... 52
6.2 Sketch of the Proof of Theorem 6.2 ...... 54
6.3 Random walk on IIC for the critical oriented percolation cluster in $\mathbb{Z}^d$ ($d > 6$) ...... 56
6.4 Below critical dimension ...... 58
6.5 Random walk on random walk traces and on the Erdős–Rényi random graphs ...... 59

7 Random conductance model 61
7.1 Overview ...... 62
7.2 Percolation estimates ...... 67
7.3 Proof of some heat kernel estimates ...... 67
7.4 Corrector and quenched invariance principle ...... 68
7.5 Construction of the corrector ...... 72

7.6 Proof of Theorem 7.15 ...... 75
7.7 Proof of Proposition 7.16 ...... 77
7.7.1 Case 1 ...... 77
7.7.2 Case 2 ...... 82
7.8 End of the proof of quenched invariance principle ...... 84

0 Plan of the lectures and remark

A rough plan of the lectures at St. Flour is as follows.

Lectures 1–3: In the first lecture, I will discuss general potential theory for symmetric Markov chains on weighted graphs (Section 1). Then I will show various equivalent conditions for heat kernel upper bounds (the Nash inequality, Section 2). In the third lecture, I will use effective resistance to estimate Green functions, exit times from balls, etc. On-diagonal heat kernel bounds are also obtained using the effective resistance (Section 3).

Lectures 4–6½: I will discuss random walk on an incipient infinite cluster (IIC) for critical percolation. I will give a sufficient condition for sharp on-diagonal heat kernel bounds for random walks on random graphs (Section 4). I then prove the Alexander-Orbach conjecture for IICs when two-point functions behave nicely, especially for IICs of high-dimensional critical bond percolation on $\mathbb{Z}^d$ (Section 5). I also discuss heat kernel bounds and scaling limits for related random models (Section 6).

Lectures 6½–8: The last two and a half lectures will be devoted to the quenched invariance principle for the random conductance model on $\mathbb{Z}^d$ (Section 7). Put i.i.d. conductances $\mu_e$ on the bonds of $\mathbb{Z}^d$ and consider the Markov chain associated with the (random) weighted graph. I consider two cases, namely $0 \le \mu_e \le c$ $\mathbb{P}$-a.e. and $c \le \mu_e < \infty$ $\mathbb{P}$-a.e. Although the behavior of the heat kernels is quite different in the two cases, in both cases the scaling limit is in general Brownian motion. I will discuss some technical details about correctors of the Markov chains, which play a key role in proving the invariance principle.

This is the version for the St. Flour Lectures. Several ingredients (which I was planning to include) are missing from this version. In particular, I could not explain enough about techniques for heat kernel estimates; for example, off-diagonal heat kernel estimates, isoperimetric profiles, relations to Harnack inequalities, etc. are either missing or mentioned only briefly.
(Because of that, I could not give proofs of most of the heat kernel estimates in Section 7.) This was because I was much busier than I had expected while preparing the lectures. However, even if I had included them, most likely there would not have been enough time to discuss them during the 8 lectures. Anyway, my plan is to revise these notes and include the missing ingredients in the version for publication by Springer. I referred to many papers and books during the preparation of these lecture notes. In particular, I owe a lot to the lecture notes by Barlow [10] and by Coulhon [40] for Sections 1–2. Sections 3–4 (and part of Section 6) are mainly from [18, 87]. Section 5 is mainly from the beautiful paper by Kozma and Nachmias [86] (with some simplifications from [101]). In Section 7, I follow the arguments of the papers by Barlow, Biskup, Deuschel, Mathieu and their co-authors [16, 30, 31, 32, 90, 91].

1 Weighted graphs and the associated Markov chains

In this section, we discuss general potential theory for symmetric (reversible) Markov chains on weighted graphs. Note that there are many nice books and lecture notes that treat potential theory and/or Markov chains on graphs, for example [6, 10, 51, 59, 88, 107, 109]. In writing this section, I was largely influenced by the lecture notes by Barlow [10].

1.1 Weighted graphs

Let $X$ be a finite or countably infinite set, and let $E$ be a subset of $\{\{x,y\} : x,y \in X, x \ne y\}$. A graph is a pair $(X,E)$. For $x,y \in X$, we write $x \sim y$ if $\{x,y\} \in E$. A sequence $x_0, x_1, \dots, x_n$ is called a path with length $n$ if $x_i \in X$ for $i = 0,1,\dots,n$ and $x_j \sim x_{j+1}$ for $j = 0,1,\dots,n-1$. For $x \ne y$, define $d(x,y)$ to be the length of the shortest path from $x$ to $y$; if there is no such path, we set $d(x,y) = \infty$, and we set $d(x,x) = 0$. Then $d(\cdot,\cdot)$ is a metric on $X$ and is called the graph distance. $(X,E)$ is connected if $d(x,y) < \infty$ for all $x,y \in X$, and it is locally finite if $\#\{y : \{x,y\} \in E\} < \infty$ for all $x \in X$. Throughout the lectures, we will consider connected locally finite graphs (except when we consider traces of them in Subsection 1.3).

Assume that the graph $(X,E)$ is endowed with a weight (conductance) $\mu_{xy}$, which is a symmetric nonnegative function on $X \times X$ such that $\mu_{xy} > 0$ if and only if $x \sim y$. We call the pair $(X,\mu)$ a weighted graph.

Let $\mu_x = \mu(\{x\}) = \sum_y \mu_{xy}$ and define a measure $\mu$ on $X$ by setting $\mu(A) = \sum_{x \in A}\mu_x$ for $A \subset X$. Also, we define $B(x,r) = \{y \in X : d(x,y) \le r\}$.
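To make these definitions concrete, here is a minimal Python sketch of a weighted graph (the dictionary-of-conductances representation and the small 4-cycle example are our own illustration, not from the text), computing the vertex measure $\mu_x$, the graph distance $d$ via breadth-first search, and a ball $B(x,r)$:

```python
from collections import deque

# A weighted graph (X, mu): X = {0,1,2,3} is a 4-cycle and
# mu[x][y] > 0 iff x ~ y (conductances, kept symmetric by hand).
mu = {
    0: {1: 1.0, 3: 2.0},
    1: {0: 1.0, 2: 1.0},
    2: {1: 1.0, 3: 1.0},
    3: {2: 1.0, 0: 2.0},
}

def mu_x(x):
    """Vertex measure mu_x = sum_y mu_{xy}."""
    return sum(mu[x].values())

def d(x, y):
    """Graph distance: length of a shortest path (BFS); None if no path exists."""
    dist = {x: 0}
    queue = deque([x])
    while queue:
        z = queue.popleft()
        if z == y:
            return dist[z]
        for w in mu[z]:
            if w not in dist:
                dist[w] = dist[z] + 1
                queue.append(w)
    return None

def ball(x, r):
    """B(x, r) = {y in X : d(x, y) <= r}."""
    return {y for y in mu if d(x, y) is not None and d(x, y) <= r}

print(mu_x(0), d(0, 2), sorted(ball(0, 1)))  # 3.0 2 [0, 1, 3]
```

Since the graph is connected and locally finite, BFS terminates and returns the metric $d$ of the text.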

Definition 1.1 We say that $(X,\mu)$ has controlled weights (or $(X,\mu)$ satisfies the $p_0$-condition) if there exists $p_0 > 0$ such that
$$\frac{\mu_{xy}}{\mu_x} \ge p_0 \qquad \forall x \sim y.$$

If $(X,\mu)$ has controlled weights, then clearly $\#\{y \in X : x \sim y\} \le p_0^{-1}$.

Once the weighted graph $(X,\mu)$ is given, we can define the corresponding quadratic form, Markov chain and discrete Laplace operator.

Quadratic form. We define a quadratic form on $(X,\mu)$ as follows.

$$H^2(X,\mu) = H^2 = \Big\{f : X \to \mathbb{R} : \mathcal{E}(f,f) = \frac{1}{2}\sum_{\substack{x,y\in X \\ x\sim y}}(f(x)-f(y))^2\mu_{xy} < \infty\Big\},$$
$$\mathcal{E}(f,g) = \frac{1}{2}\sum_{\substack{x,y\in X \\ x\sim y}}(f(x)-f(y))(g(x)-g(y))\mu_{xy} \qquad \forall f,g \in H^2.$$
Physically, $\mathcal{E}(f,f)$ is the energy of the electrical network for an (electric) potential $f$. Since the graph is connected, one can easily see that $\mathcal{E}(f,f) = 0$ if and only if $f$ is a constant function. We fix a base point $0 \in X$ and define
$$\|f\|_{H^2}^2 = \mathcal{E}(f,f) + f(0)^2 \qquad \forall f \in H^2.$$
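On a finite graph the energy $\mathcal{E}(f,f)$ is a finite sum and can be evaluated directly. A short Python sketch (the path graph and the potential $f$ are our own illustrative choices):

```python
# Dirichlet energy E(f,f) = (1/2) * sum over ordered pairs x ~ y of
# (f(x) - f(y))^2 * mu_{xy}, on the path 0 - 1 - 2 with unit conductances.
mu = {0: {1: 1.0}, 1: {0: 1.0, 2: 1.0}, 2: {1: 1.0}}

def energy(f):
    return 0.5 * sum((f[x] - f[y]) ** 2 * w
                     for x, nbrs in mu.items()
                     for y, w in nbrs.items())

# The linear potential with f(0) = 0, f(2) = 1 (harmonic at vertex 1):
f = {0: 0.0, 1: 0.5, 2: 1.0}
print(energy(f))  # 0.5
```

Each of the two bonds carries a potential drop of $1/2$, so the energy is $2 \cdot (1/2)^2 \cdot 1 = 1/2$; a constant potential gives energy $0$, as the connectedness remark above predicts.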

Note that
$$\mathcal{E}(f,f) = \frac{1}{2}\sum_{x\sim y}(f(x)-f(y))^2\mu_{xy} \le \sum_{x\sim y}\big(f(x)^2+f(y)^2\big)\mu_{xy} = 2\|f\|_2^2 \qquad \forall f \in L^2, \tag{1.1}$$
where $\|f\|_2$ is the $L^2$-norm of $f$. So $L^2 \subset H^2$. We give basic facts in the next lemma.

Lemma 1.2 (i) Convergence in $H^2$ implies pointwise convergence. (ii) $H^2$ is a Hilbert space.

Proof. (i) Suppose $f_n \to f$ in $H^2$ and let $g_n = f_n - f$. Then $\mathcal{E}(g_n,g_n) + g_n(0)^2 \to 0$, so $g_n(0) \to 0$. For any $x \in X$, there is a sequence $\{x_i\}_{i=0}^{l} \subset X$ such that $x_0 = 0$, $x_l = x$ and $x_i \sim x_{i+1}$ for $i = 0,1,\dots,l-1$. Then
$$|g_n(x) - g_n(0)|^2 \le l\sum_{i=0}^{l-1}|g_n(x_i)-g_n(x_{i+1})|^2 \le 2l\,\big(\min_i \mu_{x_i x_{i+1}}\big)^{-1}\mathcal{E}(g_n,g_n) \to 0 \tag{1.2}$$
as $n \to \infty$, so we have $g_n(x) \to 0$.
(ii) Assume that $\{f_n\} \subset H^2$ is a Cauchy sequence in $H^2$. Then $f_n(0)$ is a Cauchy sequence in $\mathbb{R}$, so it converges. Thus, similarly to (1.2), $f_n$ converges pointwise to some $f$, say. Now using Fatou's lemma, we have $\|f_n - f\|_{H^2}^2 \le \liminf_m \|f_n - f_m\|_{H^2}^2$, so that $\|f_n - f\|_{H^2}^2 \to 0$.

Markov chain. Let $Y = \{Y_n\}$ be a Markov chain on $X$ whose transition probabilities are given by
$$P(Y_{n+1} = y \mid Y_n = x) = \frac{\mu_{xy}}{\mu_x} =: P(x,y) \qquad \forall x,y \in X.$$

We write $P^x$ when the initial distribution of $Y$ is concentrated at $x$ (i.e. $Y_0 = x$, $P$-a.s.). $(P(x,y))_{x,y\in X}$ is the transition matrix of $Y$. $Y$ is called a simple random walk when $\mu_{xy} = 1$ whenever $x \sim y$. $Y$ is $\mu$-symmetric since for each $x,y \in X$,
$$\mu_x P(x,y) = \mu_{xy} = \mu_{yx} = \mu_y P(y,x).$$

We define the heat kernel of $Y$ by
$$p_n(x,y) = P^x(Y_n = y)/\mu_y \qquad \forall x,y \in X.$$
Using the Markov property, we can easily show the Chapman-Kolmogorov equation:
$$p_{n+m}(x,y) = \sum_{z\in X} p_n(x,z)\,p_m(z,y)\,\mu_z \qquad \forall x,y \in X. \tag{1.3}$$
Using this and the fact $p_1(x,y) = \mu_{xy}/(\mu_x\mu_y) = p_1(y,x)$, one can verify the following inductively:
$$p_n(x,y) = p_n(y,x) \qquad \forall x,y \in X.$$
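Both the symmetry $p_n(x,y) = p_n(y,x)$ and the Chapman-Kolmogorov equation (1.3) are easy to verify numerically on a small example: in matrix form $p_n = P^n D^{-1}$ with $D = \mathrm{diag}(\mu_x)$, and (1.3) reads $p_{n+m} = p_n D p_m$. A Python/numpy sketch (the weighted 4-cycle is our own example):

```python
import numpy as np

# Weighted 4-cycle: symmetric conductance matrix, zero diagonal.
W = np.array([[0, 1, 0, 2],
              [1, 0, 3, 0],
              [0, 3, 0, 1],
              [2, 0, 1, 0]], dtype=float)
mu = W.sum(axis=1)                 # mu_x = sum_y mu_{xy}
P = W / mu[:, None]                # P(x, y) = mu_{xy} / mu_x

def p(n):
    """Heat kernel p_n(x, y) = P^x(Y_n = y) / mu_y, as a matrix."""
    return np.linalg.matrix_power(P, n) / mu[None, :]

assert np.allclose(p(3), p(3).T)                     # p_n(x, y) = p_n(y, x)
assert np.allclose(p(5), p(2) @ np.diag(mu) @ p(3))  # Chapman-Kolmogorov (1.3)
print("symmetry and (1.3) verified on this example")
```

The symmetry check works because $p_1 = D^{-1} W D^{-1}$ is symmetric and symmetry propagates through (1.3), exactly as in the inductive argument above.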

For $n \ge 1$, let
$$P_n f(x) = \sum_y p_n(x,y)f(y)\mu_y = \sum_y P^x(Y_n = y)f(y) = E^x[f(Y_n)] \qquad \forall f : X \to \mathbb{R}.$$
We sometimes consider a continuous time Markov chain $\{Y_t\}_{t\ge0}$ w.r.t. $\mu$, which is defined as follows: each particle stays at a point, say $x$, for an (independent) exponential time with parameter 1, and then jumps to another point, say $y$, with probability $P(x,y)$. The heat kernel for the continuous time Markov chain can be expressed as follows:
$$p_t(x,y) = P^x(Y_t = y)/\mu_y = \sum_{n=0}^{\infty} e^{-t}\frac{t^n}{n!}\,p_n(x,y) \qquad \forall x,y \in X.$$

Discrete Laplace operator. For $f : X \to \mathbb{R}$, the discrete Laplace operator is defined by
$$\mathcal{L}f(x) = \sum_y P(x,y)f(y) - f(x) = \frac{1}{\mu_x}\sum_y (f(y)-f(x))\mu_{xy} = E^x[f(Y_1)] - f(x) = (P_1 - I)f(x). \tag{1.4}$$
Note that according to Ohm's law '$I = V/R$', $\sum_y (f(y)-f(x))\mu_{xy}$ is the total flux flowing into $x$, given the potential $f$.

Definition 1.3 Let $A \subset X$. A function $f : X \to \mathbb{R}$ is harmonic on $A$ if
$$\mathcal{L}f(x) = 0 \qquad \forall x \in A.$$
$f$ is sub-harmonic (resp. super-harmonic) on $A$ if $\mathcal{L}f(x) \ge 0$ (resp. $\mathcal{L}f(x) \le 0$) for $x \in A$.

$\mathcal{L}f(x) = 0$ means that the total flux flowing into $x$ is 0 for the given potential $f$. This is the behavior of currents in a network known as Kirchhoff's (first) law. For $A \subset X$, we define the (exterior) boundary of $A$ by
$$\partial A = \{x \in A^c : \exists z \in A \text{ such that } z \sim x\}.$$

Proposition 1.4 (Maximum principle) Let $A$ be a connected subset of $X$ and let $h : A \cup \partial A \to \mathbb{R}$ be sub-harmonic on $A$. If the maximum of $h$ over $A \cup \partial A$ is attained in $A$, then $h$ is constant on $A \cup \partial A$.

Proof. Let $x_0 \in A$ be a point where $h$ attains the maximum and let $H = \{z \in A \cup \partial A : h(z) = h(x_0)\}$. If $y \in H \cap A$, then since $h(x) \le h(y)$ for all $x \in A \cup \partial A$, we have
$$0 \le \mu_y \mathcal{L}h(y) = \sum_x (h(x)-h(y))\mu_{xy} \le 0.$$
Thus $h(x) = h(y)$ (i.e. $x \in H$) for all $x \sim y$. Since $A$ is connected, this implies $H = A \cup \partial A$.

We can prove the minimum principle for a super-harmonic function $h$ by applying the maximum principle to $-h$.

For $f,g \in L^2$, denote their $L^2$-inner product by $(f,g)$, namely $(f,g) = \sum_x f(x)g(x)\mu_x$.

Lemma 1.5 (i) $\mathcal{L} : H^2 \to L^2$ and $\|\mathcal{L}f\|_2^2 \le 2\|f\|_{H^2}^2$.
(ii) For $f \in H^2$ and $g \in L^2$, we have $(-\mathcal{L}f, g) = \mathcal{E}(f,g)$.
(iii) $\mathcal{L}$ is a self-adjoint operator on $L^2(X,\mu)$ and the following holds:
$$(\mathcal{L}f, g) = (f, \mathcal{L}g) = -\mathcal{E}(f,g) \qquad \forall f,g \in L^2. \tag{1.5}$$

Proof. (i) Using Schwarz’s inequality, we have 1 f 2 = ( (f(y) f(x))µ )2 kL k2 µ xy x x y X X 1 2 2 ( (f(y) f(x)) µ )( µ )=2 (f,f) 2 f 2 .  µ xy xy E  k kH x x y y X X X (ii) Using (i), both sides of the equality are well-defined. Further, using Schwarz’s inequality,

µ (f(y) f(x))g(x) ( µ (f(y) f(x))2)1/2( µ g(y)2)1/2 = (f,f)1/2 g < . | xy | xy xy E k k2 1 x,y x,y x,y X X X So we can use Fubini’s theorem, and we have 1 ( f,g)= ( µ (f(y) f(x)))g(x)= µ (f(y) f(x))(g(y) g(x)) = (f,g). L xy 2 xy E x y x y X X X X (iii) We can prove (f, g)= (f,g) similarly and obtain (1.5). L E (1.5) is the discrete Gauss-Green formula.

Lemma 1.6 Set $p_n^x(\cdot) = p_n(x,\cdot)$. Then the following hold for all $x,y \in X$:
$$p_{n+m}(x,y) = (p_n^x, p_m^y), \qquad P_1 p_n^x(y) = p_{n+1}(x,y), \tag{1.6}$$
$$\mathcal{L}p_n^x(y) = p_{n+1}(x,y) - p_n(x,y), \qquad \mathcal{E}(p_n^x, p_m^y) = p_{n+m}(x,y) - p_{n+m+1}(x,y), \tag{1.7}$$
$$p_{2n}(x,y) \le \sqrt{p_{2n}(x,x)\,p_{2n}(y,y)}. \tag{1.8}$$

Proof. The two equations in (1.6) are due to the Chapman-Kolmogorov equation (1.3). The first equation in (1.7) is then clear since $\mathcal{L} = P_1 - I$. The last equation can be obtained from these equations and (1.5). Using (1.6) and the Schwarz inequality, we have
$$p_{2n}(x,y)^2 = (p_n^x, p_n^y)^2 \le (p_n^x,p_n^x)(p_n^y,p_n^y) = p_{2n}(x,x)\,p_{2n}(y,y),$$
which gives (1.8).

It can be easily shown that $(\mathcal{E}, L^2)$ is a regular Dirichlet form on $L^2(X,\mu)$ (cf. [55]). Then the corresponding Hunt process is the continuous time Markov chain $\{Y_t\}_{t\ge0}$ w.r.t. $\mu$, and the corresponding self-adjoint operator on $L^2$ is $\mathcal{L}$ in (1.4).

Remark 1.7 Note that $\{Y_t\}_{t\ge0}$ has transition probability $P(x,y) = \mu_{xy}/\mu_x$ and it waits at $x$ for an exponential time with mean 1 for each $x \in X$. Since the 'speed' of $\{Y_t\}_{t\ge0}$ is independent of the location, it is sometimes called the constant speed random walk (CSRW for short). We can also consider a continuous time Markov chain with the same transition probability $P(x,y)$ that waits at $x$ for an exponential time with mean $\mu_x^{-1}$ for each $x \in X$. This Markov chain is called the variable speed random walk (VSRW for short). We will discuss the VSRW in Section 7. The corresponding discrete Laplace operator is
$$\mathcal{L}_V f(x) = \sum_y (f(y)-f(x))\mu_{xy}. \tag{1.9}$$
For each $f,g$ with finite support, we have
$$\mathcal{E}(f,g) = (-\mathcal{L}_V f, g)_\nu = (-\mathcal{L}_C f, g)_\mu,$$
where $\nu$ is the measure on $X$ such that $\nu(A) = |A|$ for all $A \subset X$. So the VSRW is the Markov process associated with the Dirichlet form $(\mathcal{E},\mathcal{F})$ on $L^2(X,\nu)$ and the CSRW is the Markov process associated with the Dirichlet form $(\mathcal{E},\mathcal{F})$ on $L^2(X,\mu)$. The VSRW is a time changed process of the CSRW and vice versa.

We now introduce the notion of rough isometry.

Definition 1.8 Let $(X_1,\mu_1)$, $(X_2,\mu_2)$ be weighted graphs that have controlled weights.
(i) A map $T : X_1 \to X_2$ is called a rough isometry if the following hold: there exist constants $c_1, c_2, c_3 > 0$ such that
$$c_1^{-1}d_1(x,y) - c_2 \le d_2(T(x),T(y)) \le c_1 d_1(x,y) + c_2 \qquad \forall x,y \in X_1, \tag{1.10}$$
$$d_2(T(X_1), y') \le c_2 \qquad \forall y' \in X_2, \tag{1.11}$$
$$c_3^{-1}\mu_1(x) \le \mu_2(T(x)) \le c_3\mu_1(x) \qquad \forall x \in X_1, \tag{1.12}$$
where $d_i(\cdot,\cdot)$ is the graph distance of $(X_i,\mu_i)$, for $i = 1,2$.
(ii) $(X_1,\mu_1)$, $(X_2,\mu_2)$ are said to be rough isometric if there is a rough isometry between them.

It is easy to see that rough isometry is an equivalence relation. One can easily prove that $\mathbb{Z}^2$, the triangular lattice and the hexagonal lattice are all roughly isometric. It can be proved that $\mathbb{Z}^1$ and $\mathbb{Z}^2$ are not roughly isometric. The notion of rough isometry was first introduced by M. Kanai ([73, 74]). As this work was mainly concerned with Riemannian manifolds, the definition of rough isometry there included only (1.10), (1.11). The definition equivalent to Definition 1.8 is given in [42] (see also [65]). Note that rough isometry corresponds to quasi-isometry in the field of geometric group theory. While discussing various properties of Markov chains/Laplace operators, it is important to think about their 'stability'. In the following, we introduce two types of stability.

Definition 1.9 (i) We say a property is stable under bounded perturbation if whenever $(X,\mu)$ satisfies the property and $(X,\mu')$ satisfies $c^{-1}\mu_{xy} \le \mu'_{xy} \le c\mu_{xy}$ for all $x,y \in X$, then $(X,\mu')$ satisfies the property.
(ii) We say a property is stable under rough isometry if whenever $(X,\mu)$ satisfies the property and $(X',\mu')$ is rough isometric to $(X,\mu)$, then $(X',\mu')$ satisfies the property.

If a property P is stable under rough isometry, then it is clearly stable under bounded perturbation. It is known that the following properties of weighted graphs are stable under rough isometry.

(i) Transience and recurrence

(ii) The Nash inequality, i.e. $p_n(x,y) \le c_1 n^{-\alpha}$ for all $n \ge 1$, $x,y \in X$ (for some $\alpha > 0$)

(iii) Parabolic Harnack inequality

We will see (i) later in this section and (ii) in Section 2. One of the important open problems is to determine whether the elliptic Harnack inequality is stable under these perturbations.

Definition 1.10 $(X,\mu)$ has the Liouville property if there are no bounded non-constant harmonic functions. $(X,\mu)$ has the strong Liouville property if there are no positive non-constant harmonic functions.

It is known that neither the Liouville property nor the strong Liouville property is stable under bounded perturbation (see [89]).

1.2 Harmonic functions and effective resistances

For $A \subset X$, define
$$\sigma_A = \inf\{n \ge 0 : Y_n \in A\}, \quad \sigma_A^+ = \inf\{n > 0 : Y_n \in A\}, \quad \tau_A = \inf\{n \ge 0 : Y_n \notin A\}.$$
For $A \subset X$ and $f : A \to \mathbb{R}$, consider the following Dirichlet problem:
$$\begin{cases} \mathcal{L}v(x) = 0 & \forall x \in A^c, \\ v|_A = f. \end{cases} \tag{1.13}$$

Proposition 1.11 Assume that $f : A \to \mathbb{R}$ is bounded and set
$$\varphi(x) = E^x[f(Y_{\sigma_A}) : \sigma_A < \infty].$$
(i) $\varphi$ is a solution of (1.13).
(ii) If $P^x(\sigma_A < \infty) = 1$ for all $x \in X$, then $\varphi$ is the unique solution of (1.13).

Proof. (i) $\varphi|_A = f$ is clear. For $x \in A^c$, using the Markov property of $Y$, we have
$$\varphi(x) = \sum_y P(x,y)\varphi(y),$$

so $\mathcal{L}\varphi(x) = 0$.
(ii) Let $\varphi'$ be another solution and let $H_n = \varphi(Y_n) - \varphi'(Y_n)$. Then $H_n$ is a bounded martingale up to $\sigma_A$, so using the optional stopping theorem, we have
$$\varphi(x) - \varphi'(x) = E^x H_0 = E^x H_{\sigma_A} = E^x[\varphi(Y_{\sigma_A}) - \varphi'(Y_{\sigma_A})] = 0,$$
since $\sigma_A < \infty$ a.s. and $\varphi(x) = \varphi'(x)$ for $x \in A$.

Remark 1.12 (i) In particular, we see that $\varphi$ is the unique solution of (1.13) when $A^c$ is finite. In this case, here is another proof of the uniqueness of the solution of (1.13): let $u(x) = \varphi(x) - \varphi'(x)$; then $u|_A = 0$ and $\mathcal{L}u(x) = 0$ for $x \in A^c$. So, noting $u \in L^2$ and using Lemma 1.5, $\mathcal{E}(u,u) = -(\mathcal{L}u, u) = 0$, which implies that $u$ is constant on $X$ (so it is 0 since $u|_A = 0$).
(ii) If $h_A(x) := P^x(\sigma_A = \infty) > 0$ for some $x \in X$, then the function $\varphi + \lambda h_A$ is also a solution of (1.13) for all $\lambda \in \mathbb{R}$, so the uniqueness of the Dirichlet problem fails.

For $A, B \subset X$ such that $A \cap B = \emptyset$, define
$$R_{\mathrm{eff}}(A,B)^{-1} = \inf\{\mathcal{E}(f,f) : f \in H^2, f|_A = 1, f|_B = 0\}. \tag{1.14}$$
(We define $R_{\mathrm{eff}}(A,B) = \infty$ when the right hand side is 0.) We call $R_{\mathrm{eff}}(A,B)$ the effective resistance between $A$ and $B$. It is easy to see that $R_{\mathrm{eff}}(A,B) = R_{\mathrm{eff}}(B,A)$. If $A \subset A'$, $B \subset B'$ with $A' \cap B' = \emptyset$, then $R_{\mathrm{eff}}(A',B') \le R_{\mathrm{eff}}(A,B)$.

Take a bond $e = \{x,y\}$, $x \sim y$, in a weighted graph $(X,\mu)$. We say we cut the bond $e$ when we set the conductance $\mu_{xy}$ to 0, and we say we short the bond $e$ when we identify $x = y$ and set the conductance $\mu_{xy}$ to $\infty$. Clearly, shorting decreases the effective resistance (shorting law), and cutting increases the effective resistance (cutting law). The following proposition shows that among feasible potentials whose voltage is 1 on $A$ and 0 on $B$, it is the harmonic function on $(A \cup B)^c$ that minimizes the energy.

Proposition 1.13 (i) The right hand side of (1.14) is attained by a unique minimizer $\varphi$.
(ii) $\varphi$ in (i) is a solution of the following Dirichlet problem:
$$\begin{cases} \mathcal{L}\varphi(x) = 0 & \forall x \in X \setminus (A \cup B), \\ \varphi|_A = 1, \quad \varphi|_B = 0. \end{cases} \tag{1.15}$$

Proof. (i) We fix a base point $x_0 \in B$ and recall that $H^2$ is a Hilbert space with $\|f\|_{H^2}^2 = \mathcal{E}(f,f) + f(x_0)^2$ (Lemma 1.2 (ii)). Since $\mathcal{V} := \{f \in H^2 : f|_A = 1, f|_B = 0\}$ is a closed convex subset of $H^2$, a general theorem shows that $\mathcal{V}$ has a unique element of minimal norm $\|\cdot\|_{H^2}$ (and $\|f\|_{H^2}^2 = \mathcal{E}(f,f)$ on $\mathcal{V}$ since $f(x_0) = 0$ there).
(ii) Let $g$ be a function on $X$ whose support is finite and contained in $X \setminus (A \cup B)$. Then, for any $\lambda \in \mathbb{R}$, $\varphi + \lambda g \in \mathcal{V}$, so $\mathcal{E}(\varphi + \lambda g, \varphi + \lambda g) \ge \mathcal{E}(\varphi,\varphi)$. Thus $\mathcal{E}(\varphi, g) = 0$. Applying Lemma 1.5(ii), we have $(\mathcal{L}\varphi, g) = 0$. For each $x \in X \setminus (A \cup B)$, by choosing $g(z) = \delta_x(z)$, we obtain $\mathcal{L}\varphi(x)\mu_x = 0$.
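Numerically, the minimizer $\varphi$ of Proposition 1.13 is found by solving the linear system $\mathcal{L}\varphi = 0$ on $X \setminus (A \cup B)$ with $\varphi = 1$ on $A$ and $\varphi = 0$ on $B$; then $R_{\mathrm{eff}}(A,B)^{-1} = \mathcal{E}(\varphi,\varphi)$ by (1.14). A Python/numpy sketch (the unit-weight 4-cycle with $A = \{0\}$, $B = \{2\}$ is our own example; two parallel paths of two unit resistors should give $R_{\mathrm{eff}} = 1$):

```python
import numpy as np

# Unit-weight 4-cycle; effective resistance between A = {0} and B = {2}.
W = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
mu = W.sum(axis=1)

A, B, interior = [0], [2], [1, 3]

phi = np.zeros(len(W))
phi[A] = 1.0
# Harmonicity on the interior: mu_x * phi(x) = sum_y W[x, y] * phi(y).
M = np.diag(mu[interior]) - W[np.ix_(interior, interior)]
rhs = W[np.ix_(interior, A)].sum(axis=1)   # contribution of phi = 1 on A
phi[interior] = np.linalg.solve(M, rhs)

energy = 0.5 * np.sum(W * (phi[:, None] - phi[None, :]) ** 2)
R_eff = 1.0 / energy
print(phi, R_eff)   # harmonic values 0.5 at vertices 1 and 3; R_eff = 1.0
```

The computed $\varphi$ is harmonic off $A \cup B$, and the parallel law (two branches of resistance 2 each) confirms $R_{\mathrm{eff}} = 1$.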

As we mentioned in Remark 1.12 (ii), we do not have uniqueness of the Dirichlet problem in general. So, in the rest of this section, we will assume that $A^c$ is finite in order to guarantee uniqueness of the Dirichlet problem. The next theorem gives a probabilistic interpretation of the effective resistance.

Theorem 1.14 If $A^c$ is finite, then for each $x_0 \in A^c$,
$$R_{\mathrm{eff}}(x_0, A)^{-1} = \mu_{x_0} P^{x_0}(\sigma_A < \sigma_{x_0}^+). \tag{1.16}$$

Proof. Let $v(x) = P^x(\sigma_A < \sigma_{x_0})$. Then, by Proposition 1.11, $v$ is the unique solution of the Dirichlet problem with $v(x_0) = 0$, $v|_A = 1$. By Proposition 1.13 and Lemma 1.5 (noting that $1 - v \in L^2$),
$$R_{\mathrm{eff}}(x_0,A)^{-1} = \mathcal{E}(v,v) = -\mathcal{E}(v, 1-v) = (\mathcal{L}v, 1-v) = \mathcal{L}v(x_0)\mu_{x_0} = E^{x_0}[v(Y_1)]\mu_{x_0}.$$
By the definition of $v$, one can see $E^{x_0}[v(Y_1)] = P^{x_0}(\sigma_A < \sigma_{x_0}^+)$, so the result follows.

Similarly, if $A^c$ is finite one can prove
$$R_{\mathrm{eff}}(B,A)^{-1} = \sum_{x\in B}\mu_x P^x(\sigma_A < \sigma_B^+).$$
Note that by Ohm's law, the right hand side of (1.16) is the current flowing from $x_0$ to $A$. The following lemma is useful and will be used later in Proposition 3.18.

Lemma 1.15 Let $A, B \subset X$ and assume that both $A^c$ and $B^c$ are finite. Then the following holds:
$$R_{\mathrm{eff}}(x, A\cup B)\big(R_{\mathrm{eff}}(x,A)^{-1} - R_{\mathrm{eff}}(x,B)^{-1}\big) \le P^x(\sigma_A < \sigma_B) \le \frac{R_{\mathrm{eff}}(x, A\cup B)}{R_{\mathrm{eff}}(x,A)} \qquad \forall x \notin A \cup B.$$

Proof. Using the strong Markov property, we have

$$P^x(\sigma_A < \sigma_B) = P^x(\sigma_A < \sigma_B,\ \sigma_{A\cup B} < \sigma_x^+) + P^x(\sigma_A < \sigma_B,\ \sigma_{A\cup B} > \sigma_x^+)$$
$$= P^x(\sigma_A < \sigma_B,\ \sigma_{A\cup B} < \sigma_x^+) + P^x(\sigma_{A\cup B} > \sigma_x^+)\,P^x(\sigma_A < \sigma_B).$$
So
$$P^x(\sigma_A < \sigma_B) = \frac{P^x(\sigma_A < \sigma_B,\ \sigma_{A\cup B} < \sigma_x^+)}{P^x(\sigma_{A\cup B} < \sigma_x^+)} \le \frac{P^x(\sigma_A < \sigma_x^+)}{P^x(\sigma_{A\cup B} < \sigma_x^+)}.$$
Using (1.16), the upper bound is obtained. For the lower bound,
$$P^x(\sigma_A < \sigma_B,\ \sigma_{A\cup B} < \sigma_x^+) \ge P^x(\sigma_A < \sigma_x^+ < \sigma_B) \ge P^x(\sigma_A < \sigma_x^+) - P^x(\sigma_B < \sigma_x^+),$$
so using (1.16) again, the proof is complete.

As we see from the proof, we only need to assume that $A^c$ is finite for the upper bound.

Now let $(X,\mu)$ be an infinite weighted graph. Let $\{A_n\}_{n=1}^{\infty}$ be a family of finite sets such that $A_n \subset A_{n+1}$ for $n \in \mathbb{N}$ and $\cup_{n\ge1}A_n = X$. Let $x_0 \in A_1$. By the shorting law, $R_{\mathrm{eff}}(x_0, A_n^c) \le R_{\mathrm{eff}}(x_0, A_{n+1}^c)$, so the following limit exists:
$$R_{\mathrm{eff}}(x_0) := \lim_{n\to\infty} R_{\mathrm{eff}}(x_0, A_n^c). \tag{1.17}$$
Further, the limit $R_{\mathrm{eff}}(x_0)$ is independent of the choice of the sequence $\{A_n\}$ mentioned above. (Indeed, if $\{B_n\}$ is another such family, then for each $n$ there exists $N_n$ such that $A_n \subset B_{N_n}$, so $\lim_{n\to\infty}R_{\mathrm{eff}}(x_0, A_n^c) \le \lim_{n\to\infty}R_{\mathrm{eff}}(x_0, B_n^c)$. By exchanging the roles of $A_n$ and $B_n$, we obtain the opposite inequality.)

Theorem 1.16 Let $(X,\mu)$ be an infinite weighted graph. For each $x \in X$, the following holds:
$$P^x(\sigma_x^+ = \infty) = (\mu_x R_{\mathrm{eff}}(x))^{-1}.$$

Proof. By Theorem 1.14, we have
$$P^x(\sigma_{A_n^c} < \sigma_x^+) = (\mu_x R_{\mathrm{eff}}(x, A_n^c))^{-1}.$$
Taking $n \to \infty$ and using (1.17), we have the desired equality.

Definition 1.17 We say a Markov chain is recurrent at $x \in X$ if $P^x(\sigma_x^+ = \infty) = 0$. We say a Markov chain is transient at $x \in X$ if $P^x(\sigma_x^+ = \infty) > 0$.

The following is well-known for irreducible Markov chains (so in particular it holds for Markov chains corresponding to weighted graphs). See, for example, [97].

Proposition 1.18 (1) $\{Y_n\}_n$ is recurrent at $x \in X$ if and only if $m := \sum_{n=0}^{\infty}P^x(Y_n = x) = \infty$. Further, $m^{-1} = P^x(\sigma_x^+ = \infty)$.
(2) If $\{Y_n\}_n$ is recurrent (resp. transient) at some $x \in X$, then it is recurrent (resp. transient) at all $x \in X$.
(3) $\{Y_n\}_n$ is recurrent if and only if $P^x(\{Y \text{ hits } y \text{ infinitely often}\}) = 1$ for all $x,y \in X$. $\{Y_n\}_n$ is transient if and only if $P^x(\{Y \text{ hits } y \text{ finitely often}\}) = 1$ for all $x,y \in X$.

From Theorem 1.16 and Proposition 1.18, we have the following.

$$\{Y_n\} \text{ is transient (resp. recurrent)} \iff R_{\mathrm{eff}}(x) < \infty \ (\text{resp. } R_{\mathrm{eff}}(x) = \infty) \text{ for some } x \in X \iff R_{\mathrm{eff}}(x) < \infty \ (\text{resp. } R_{\mathrm{eff}}(x) = \infty) \text{ for all } x \in X. \tag{1.18}$$

Example 1.19 Consider $\mathbb{Z}^2$ with weight 1 on each nearest neighbor bond. Let $\partial B_n = \{(x,y) \in \mathbb{Z}^2 : \text{either } |x| \text{ or } |y| \text{ equals } n\}$. By shorting $\partial B_n$ for all $n \in \mathbb{N}$, one can obtain
$$R_{\mathrm{eff}}(0) \ge \sum_{n=0}^{\infty}\frac{1}{4(2n+1)} = \infty.$$
So the simple random walk on $\mathbb{Z}^2$ is recurrent.
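As a numerical sanity check (our own, not part of the argument), the partial sums of the series above grow without bound, roughly like $(1/8)\log N$:

```python
import math

def partial_sum(N):
    """Partial sums of sum_n 1 / (4 * (2n + 1)), a lower bound for R_eff(0)."""
    return sum(1.0 / (4 * (2 * n + 1)) for n in range(N))

for N in (10, 1000, 100000):
    print(N, partial_sum(N), math.log(N) / 8)
# The partial sums keep growing (roughly like log(N)/8), so the series
# diverges, R_eff(0) = infinity, and the walk is recurrent by (1.18).
```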

Let us recall the following fact.

Theorem 1.20 (Pólya 1921) Simple random walk on $\mathbb{Z}^d$ is recurrent if $d = 1,2$ and transient if $d \ge 3$.

The combinatorial proof of this theorem is well-known. For example, for $d = 1$, by counting the total number of paths of length $2n$ that move both right and left $n$ times,
$$P^0(Y_{2n} = 0) = 2^{-2n}\binom{2n}{n} = \frac{(2n)!}{2^{2n}\,n!\,n!} \sim (\pi n)^{-1/2},$$
where Stirling's formula is used in the last step. Thus
$$m = \sum_{n=0}^{\infty}P^0(Y_n = 0) \sim \sum_{n=1}^{\infty}(\pi n)^{-1/2} + 1 = \infty,$$
so $\{Y_n\}$ is recurrent.

This argument is not robust. For example, if one changes the weights on $\mathbb{Z}^d$ so that $c_1 \le \mu_{xy} \le c_2$ for $x \sim y$, one cannot apply the argument at all. The advantage of the characterization of transience/recurrence via the effective resistance is that it allows robust arguments. Indeed, by (1.18) we can see that transience/recurrence is stable under bounded perturbation. This is because, if $c_1\mu'_{xy} \le \mu_{xy} \le c_2\mu'_{xy}$ for all $x,y \in X$, then $c_1 R_{\mathrm{eff}}(x) \le R'_{\mathrm{eff}}(x) \le c_2 R_{\mathrm{eff}}(x)$. One can further prove that transience/recurrence is stable under rough isometry.
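The Stirling asymptotics above are easy to confirm numerically (a small Python sketch of our own):

```python
import math

def return_prob(n):
    """P^0(Y_{2n} = 0) = 2^{-2n} * binom(2n, n) for simple random walk on Z."""
    return math.comb(2 * n, n) / 4 ** n

for n in (10, 100, 1000):
    stirling = 1.0 / math.sqrt(math.pi * n)
    print(n, return_prob(n), stirling, return_prob(n) / stirling)
# The last ratio tends to 1, consistent with P^0(Y_{2n} = 0) ~ (pi n)^{-1/2};
# summing (pi n)^{-1/2} over n diverges, so the walk on Z is recurrent.
```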

Finally in this subsection, we will give more equivalent conditions for transience and discuss a decomposition of $H^2$. Let $H_0^2$ be the closure of $C_0(X)$ in $H^2$, where $C_0(X)$ is the space of compactly supported functions on $X$. For a finite set $B \subset X$, define the capacity of $B$ by
$$\mathrm{Cap}(B) = \inf\{\mathcal{E}(f,f) : f \in H_0^2, f|_B = 1\}.$$
We first give a lemma.

Lemma 1.21 If a sequence of non-negative functions $v_n \in H^2$, $n \in \mathbb{N}$, satisfies $\lim_{n\to\infty}v_n(x) = 1$ for all $x \in X$ and $\lim_{n\to\infty}\mathcal{E}(v_n,v_n) = 0$, then
$$\lim_{n\to\infty}\|u - (u \wedge v_n)\|_{H^2} = 0 \qquad \forall u \in H^2,\ u \ge 0.$$

Proof. Let $u_n = u \wedge v_n$ and define $U_n = \{x \in X : u(x) > v_n(x)\}$. By the assumption, for each $N \in \mathbb{N}$ there exists $N_0 = N_0(N)$ such that $U_n \subset B(0,N)^c$ for all $n \ge N_0$. For $A \subset X$, denote $\mathcal{E}_A(u) = \frac{1}{2}\sum_{x,y\in A}(u(x)-u(y))^2\mu_{xy}$. Since $\mathcal{E}_{U_n^c}(u - u_n) = 0$, we have
$$\mathcal{E}(u-u_n, u-u_n) \le 2\cdot\frac{1}{2}\sum_{x\in U_n}\sum_{y:\,y\sim x}\big((u(x)-u_n(x))-(u(y)-u_n(y))\big)^2\mu_{xy} \le 2\,\mathcal{E}_{B(0,N-1)^c}(u-u_n) \le 4\big(\mathcal{E}_{B(0,N-1)^c}(u) + \mathcal{E}_{B(0,N-1)^c}(u_n)\big) \tag{1.19}$$
for all $n \ge N_0$. As $u_n = (u + v_n - |u - v_n|)/2$, we have
$$\mathcal{E}_{B(0,N-1)^c}(u_n) \le c_1\big(\mathcal{E}_{B(0,N-1)^c}(u) + \mathcal{E}_{B(0,N-1)^c}(v_n) + \mathcal{E}_{B(0,N-1)^c}(|u-v_n|)\big) \le c_2\big(\mathcal{E}_{B(0,N-1)^c}(u) + \mathcal{E}_{B(0,N-1)^c}(v_n)\big).$$
Thus, together with (1.19), we have
$$\mathcal{E}(u-u_n, u-u_n) \le c_3\big(\mathcal{E}_{B(0,N-1)^c}(u) + \mathcal{E}_{B(0,N-1)^c}(v_n)\big) \le c_3\big(\mathcal{E}_{B(0,N-1)^c}(u) + \mathcal{E}(v_n,v_n)\big).$$
Since $u \in H^2$, $\mathcal{E}_{B(0,N-1)^c}(u) \to 0$ as $N \to \infty$, and by the assumption $\mathcal{E}(v_n,v_n) \to 0$ as $n \to \infty$. So we obtain $\mathcal{E}(u-u_n, u-u_n) \to 0$ as $n \to \infty$. By the assumption, it is clear that $u_n \to u$ pointwise, so we obtain $\|u - u_n\|_{H^2} \to 0$.

We say that a quadratic form $(\mathcal{E},\mathcal{F})$ is Markovian if $u \in \mathcal{F}$ and $v = (0 \vee u) \wedge 1$ imply $v \in \mathcal{F}$ and $\mathcal{E}(v,v) \le \mathcal{E}(u,u)$. It is easy to see that quadratic forms determined by weighted graphs are Markovian.

Proposition 1.22 The following are equivalent.
(i) The Markov chain corresponding to $(X,\mu)$ is transient.
(ii) $1 \notin H_0^2$.
(iii) $\mathrm{Cap}(\{x\}) > 0$ for some $x \in X$.
(iii)' $\mathrm{Cap}(\{x\}) > 0$ for all $x \in X$.
(iv) $H^2 \ne H_0^2$.
(v) There exists a non-negative super-harmonic function which is not a constant function.
(vi) For each $x \in X$, there exists $c_1(x) > 0$ such that
$$|f(x)|^2 \le c_1(x)\,\mathcal{E}(f,f) \qquad \forall f \in C_0(X). \tag{1.20}$$

Proof. For fixed $x \in X$, define $\varphi(z) = P^z(\sigma_x < \infty)$. We first show the following: $\varphi \in H_0^2$ and
$$\mathcal{E}(\varphi,\varphi) = -(\mathcal{L}\varphi, 1_{\{x\}}) = R_{\mathrm{eff}}(x)^{-1} = \mathrm{Cap}(\{x\}). \tag{1.21}$$

Indeed, let $\{A_n\}_{n=1}^{\infty}$ be a family of finite sets such that $A_n \subset A_{n+1}$ for $n \in \mathbb{N}$, $x \in A_1$, and $\cup_{n\ge1}A_n = X$. Then $R_{\mathrm{eff}}(x, A_n^c)^{-1} \downarrow R_{\mathrm{eff}}(x)^{-1}$. Let $\varphi_n(z) = P^z(\sigma_x < \tau_{A_n})$. Using Lemma 1.5 (ii), and noting $\varphi_n \in C_0(X)$, we have, for $m \le n$,
$$\mathcal{E}(\varphi_m, \varphi_n) = -(\varphi_m, \mathcal{L}\varphi_n) = -(1_{\{x\}}, \mathcal{L}\varphi_n) = \mathcal{E}(\varphi_n, \varphi_n) = R_{\mathrm{eff}}(x, A_n^c)^{-1}. \tag{1.22}$$
This implies
$$\mathcal{E}(\varphi_m - \varphi_n, \varphi_m - \varphi_n) = R_{\mathrm{eff}}(x, A_m^c)^{-1} - R_{\mathrm{eff}}(x, A_n^c)^{-1}.$$
Hence $\{\varphi_m\}$ is an $\mathcal{E}$-Cauchy sequence. Noting that $\varphi_n \to \varphi$ pointwise, we see that $\varphi_n \to \varphi$ in $H^2$ as well, and $\varphi \in H_0^2$. Taking $n = m$ and letting $n \to \infty$ in (1.22), we obtain (1.21) except for the last equality.

To prove the last equality of (1.21), take any $f \in H_0^2$ with $f(x) = 1$. Then $g := f - \varphi \in H_0^2$ and $g(x) = 0$. Let $g_n \in C_0(X)$ with $g_n \to g$ in $H^2$. Then, by Lemma 1.5 (ii), $\mathcal{E}(\varphi, g_n) = -(\mathcal{L}\varphi, g_n)$. Noting that $\varphi$ is harmonic except at $x$, we see that $\mathcal{L}\varphi \in C_0(X)$. So, letting $n \to \infty$, we have
$$\mathcal{E}(\varphi, g) = -(\mathcal{L}\varphi, g) = -\mathcal{L}\varphi(x)g(x)\mu_x = 0.$$
Thus,
$$\mathcal{E}(f,f) = \mathcal{E}(\varphi+g, \varphi+g) = \mathcal{E}(\varphi,\varphi) + \mathcal{E}(g,g) \ge \mathcal{E}(\varphi,\varphi),$$
which means that $\varphi$ is the unique minimizer in the definition of $\mathrm{Cap}(\{x\})$. So the last equality of (1.21) is obtained.

Given (1.21), we now prove the equivalence.
(i) $\Rightarrow$ (iii)': This is a direct consequence of (1.18) and (1.21).
(iii)' $\Leftrightarrow$ (ii) $\Leftrightarrow$ (iii): This is easy. Indeed, $\mathrm{Cap}(\{x\}) = 0$ if and only if there is $f \in H_0^2$ with $f(x) = 1$ and $\mathcal{E}(f,f) = 0$, that is, $f$ is identically 1.
(iii) $\Rightarrow$ (vi): Let $f \in C_0(X) \subset H_0^2$ with $f(x) \ne 0$, and define $g = f/f(x)$. Then
$$\mathrm{Cap}(\{x\}) \le \mathcal{E}(g,g) = \mathcal{E}(f,f)/f(x)^2.$$

So, letting $c_1(x) = 1/\mathrm{Cap}(\{x\}) > 0$, we obtain (vi).
(vi) $\Rightarrow$ (i): As before, let $\varphi_n(z) = P^z(\sigma_x < \tau_{A_n})$. Then by (1.20), $\mathcal{E}(\varphi_n,\varphi_n) \ge c_1(x)^{-1}$. So, using the fact that $\varphi_n \to \varphi$ in $H^2$ and (1.21), $R_{\mathrm{eff}}(x)^{-1} = \mathcal{E}(\varphi,\varphi) = \lim_n \mathcal{E}(\varphi_n,\varphi_n) \ge c_1(x)^{-1}$. This implies transience by (1.18).
(ii) $\Leftrightarrow$ (iv): (ii) $\Rightarrow$ (iv) is clear since $1 \in H^2$, so we will prove the opposite direction. Suppose $1 \in H_0^2$. Then there exists $f \in C_0(X)$ such that $\|1 - f\|_{H^2} \le \dots$

Remark 1.23 (v) $\Rightarrow$ (i) implies that if the Markov chain corresponding to $(X,\mu)$ is recurrent, then it has the strong Liouville property.

For subspaces $A, B$ of $H^2$, we write $A \oplus B = \{f + g : f \in A, g \in B\}$ if $\mathcal{E}(f,g) = 0$ for all $f \in A$ and $g \in B$.

As we saw above, the Markov chain corresponding to $(X,\mu)$ is recurrent if and only if $H^2 = H_0^2$. When the Markov chain is transient, we have the following decomposition of $H^2$, which is called the Royden decomposition (see [107, Theorem 3.69]).

Proposition 1.24 Suppose that the Markov chain corresponding to $(X,\mu)$ is transient. Then
$$H^2 = H_0^2 \oplus \mathcal{H},$$
where $\mathcal{H} := \{h \in H^2 : h \text{ is a harmonic function on } X\}$. Further, the decomposition is unique.

Proof. For each $f \in H^2$, let $a_f = \inf_{h\in H_0^2}\mathcal{E}(f-h, f-h)$. Then, similarly to the proof of Proposition 1.13, we can show that there is a unique minimizer $v_f \in H_0^2$ such that $a_f = \mathcal{E}(f-v_f, f-v_f)$ and $\mathcal{E}(f-v_f, g) = 0$ for all $g \in H_0^2$; in particular $f - v_f$ is harmonic on $X$. For the uniqueness of the decomposition, suppose $f = u + v = u' + v'$ where $u, u' \in \mathcal{H}$ and $v, v' \in H_0^2$. Then $w := u - u' = v' - v \in \mathcal{H}\cap H_0^2$, so $\mathcal{E}(w,w) = 0$, which implies $w$ is constant. Since $w \in H_0^2$ and the Markov chain is transient, by Proposition 1.22 we have $w \equiv 0$.

1.3 Trace of weighted graphs

Finally in this section, we briefly discuss the trace of weighted graphs, which will be used in Sections 3 and 7. Note that there is a general theory of traces for Dirichlet forms (see [55]). Also note that the trace to an infinite subset of $X$ may fail to be locally finite, but one can consider quadratic forms on such subsets similarly.

Proposition 1.25 (Trace of the weighted graph) Let $V \subset X$ be a non-void set such that $P^x(\sigma_V < \infty) = 1$ for all $x \in X$, and let $f$ be a function on $V$. Then there exists a unique $u \in H^2$ which attains the following infimum:
$$\inf\{\mathcal{E}(v,v) : v \in H^2, v|_V = f\}. \tag{1.23}$$
Moreover, the map $f \mapsto u =: H_V f$ is linear, and there exist weights $\{\hat\mu_{xy}\}_{x,y\in V}$ such that the corresponding quadratic form $\hat{\mathcal{E}}_V(\cdot,\cdot)$ satisfies the following:
$$\hat{\mathcal{E}}_V(f,f) = \mathcal{E}(H_V f, H_V f) \qquad \forall f : V \to \mathbb{R}.$$

Proof. The first part can be proved similarly to Propositions 1.11 and 1.13. It is clear that $H_V(cf) = cH_Vf$, so we will show $H_V(f_1+f_2) = H_V(f_1)+H_V(f_2)$. Let $\varphi = H_V(f_1+f_2)$, $\varphi_i = H_V(f_i)$ for $i = 1,2$, and for $f : V\to\mathbb{R}$ define $Ef : X\to\mathbb{R}$ by $(Ef)|_V = f$ and $Ef(x) = 0$ when $x\in V^c$. As in the proof of Proposition 1.13(ii), we see that $\mathcal{E}(H_Vf,g) = 0$ whenever $\mathrm{Supp}\,g\subset V^c$. So
$\mathcal{E}(\varphi_1+\varphi_2,\varphi_1+\varphi_2) = \mathcal{E}(\varphi_1+\varphi_2, Ef_1+Ef_2) = \mathcal{E}(\varphi_1+\varphi_2,\varphi) = \mathcal{E}(E(f_1+f_2),\varphi) = \mathcal{E}(\varphi,\varphi).$

Using the uniqueness in (1.23), we obtain $\varphi_1+\varphi_2 = \varphi$. This establishes the linearity of $H_V$.
Set $\hat{\mathcal{E}}(f,f) = \mathcal{E}(H_Vf,H_Vf)$. Clearly, $\hat{\mathcal{E}}$ is a non-negative definite symmetric bilinear form, and $\hat{\mathcal{E}}(f,f) = 0$ if and only if $f$ is a constant function. So there exists $\{a_{xy}\}_{x,y\in V}$ with $a_{xy} = a_{yx}$ such that $\hat{\mathcal{E}}(f,f) = \frac12\sum_{x,y\in V}a_{xy}(f(x)-f(y))^2$.
Next, we show that $\hat{\mathcal{E}}$ is Markovian. Indeed, writing $\bar u = (0\vee u)\wedge1$ for a function $u$, since $(H_V\bar u)|_V = \bar u$, we have
$\hat{\mathcal{E}}(\bar u,\bar u) = \mathcal{E}(H_V\bar u,H_V\bar u) \le \mathcal{E}(\overline{H_Vu},\overline{H_Vu}) \le \mathcal{E}(H_Vu,H_Vu) = \hat{\mathcal{E}}(u,u)$ for all $u : V\to\mathbb{R}$,
where the fact that $\mathcal{E}$ is Markovian is used in the second inequality. Now take $p,q\in V$ with $p\ne q$ arbitrary, and consider a function $h$ such that $h(p) = 1$, $h(q) = -\alpha<0$ and $h(z) = 0$ for $z\in V\setminus\{p,q\}$. Then there exist $c_1,c_2\ge0$ such that

$\hat{\mathcal{E}}(h,h) = a_{pq}(h(p)-h(q))^2+c_1h(p)^2+c_2h(q)^2 = a_{pq}(1+\alpha)^2+c_1+c_2\alpha^2,$
$\hat{\mathcal{E}}(\bar h,\bar h) = a_{pq}(\bar h(p)-\bar h(q))^2+c_1\bar h(p)^2+c_2\bar h(q)^2 = a_{pq}+c_1.$
So $(a_{pq}+c_2)\alpha^2+2a_{pq}\alpha\ge0$. Since this holds for all $\alpha>0$, we have $a_{pq}\ge0$. Putting $\hat\mu_{pq} = a_{pq}$ for each $p,q\in V$ with $p\ne q$, we have $\hat{\mathcal{E}}_V = \hat{\mathcal{E}}$, that is, $\hat{\mathcal{E}}$ is associated with the weighted graph $(V,\hat\mu)$. $\square$

We call the induced weights $\{\hat\mu_{xy}\}_{x,y\in V}$ the trace of $\{\mu_{xy}\}_{x,y\in X}$ to $V$. From this proposition, we see that for $x,y\in V$, $R_{\mathrm{eff}}(x,y) = R^V_{\mathrm{eff}}(x,y)$, where $R^V_{\mathrm{eff}}(\cdot,\cdot)$ is the effective resistance for $(V,\hat\mu)$.
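On a finite network the trace can be computed concretely: the traced form $\hat{\mathcal{E}}_V$ is the Schur complement of the graph Laplacian onto $V$, whose negated off-diagonal entries are the weights $\hat\mu_{xy}$. A minimal NumPy sketch (the function names are ours, not from the text), checking on a unit-weight path $0\!-\!1\!-\!2\!-\!3$ traced to $V = \{0,3\}$ that the series law gives $\hat\mu_{03} = 1/3$ and that the effective resistance is preserved:

```python
import numpy as np

def laplacian(mu):
    # graph Laplacian of a symmetric weight matrix mu (zero diagonal)
    return np.diag(mu.sum(axis=1)) - mu

def trace_weights(mu, V):
    # weights of the trace to V: Schur complement of the Laplacian onto V
    comp = [i for i in range(len(mu)) if i not in V]
    L = laplacian(mu)
    S = L[np.ix_(V, V)] - L[np.ix_(V, comp)] @ np.linalg.solve(
        L[np.ix_(comp, comp)], L[np.ix_(comp, V)])
    mu_hat = -S
    np.fill_diagonal(mu_hat, 0.0)
    return mu_hat

def r_eff(mu, x, y):
    # effective resistance via the pseudoinverse of the Laplacian
    G = np.linalg.pinv(laplacian(mu))
    return G[x, x] + G[y, y] - 2 * G[x, y]

# path 0-1-2-3 with unit weights, traced to V = {0, 3}
mu = np.zeros((4, 4))
for i in range(3):
    mu[i, i + 1] = mu[i + 1, i] = 1.0
mu_hat = trace_weights(mu, [0, 3])
print(mu_hat[0, 1], r_eff(mu, 0, 3), r_eff(mu_hat, 0, 1))
```

Here three unit resistors in series give $R_{\mathrm{eff}}(0,3) = 3$, so the traced weight is $1/3$, and the resistance computed in $(V,\hat\mu)$ agrees with the one in the original graph.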

2 Heat kernel upper bounds (The Nash inequality)

In this section, we will consider various inequalities equivalent to the Nash-type heat kernel upper bound, i.e. $p_t(x,y)\le c_1t^{-\theta/2}$ for some $\theta>0$. We would prefer to discuss them in a general framework including weighted graphs. However, some arguments here are rather sketchy in the general framework. (The whole argument is rigorous for weighted graphs, so readers may restrict attention to that case.) This section is strongly motivated by Coulhon's paper [40].
Let $X$ be a locally compact separable metric space and $\mu$ a Radon measure on $X$ such that $\mu(B)>0$ for any non-void ball. $(\mathcal{E},\mathcal{F})$ is called a Dirichlet form on $L^2(X,\mu)$ if it is a symmetric Markovian closed bilinear form on $L^2$. It is well known that, given a Dirichlet form, there is a corresponding symmetric strongly continuous Markovian semigroup $\{P_t\}_{t\ge0}$ on $L^2(X,\mu)$ (see [55, Sections 1.3, 1.4]). Here Markovian means that if $u\in L^2$ satisfies $0\le u\le1$ $\mu$-a.s., then $0\le P_tu\le1$ $\mu$-a.s. for all $t\ge0$. We denote the corresponding non-negative definite $L^2$-generator by $\mathcal{L}$.
We denote the inner product of $L^2$ by $(\cdot,\cdot)$, and for $p\ge1$ we write $\|f\|_p$ for the $L^p$-norm of $f$. For each $\alpha>0$, define
$\mathcal{E}_\alpha(\cdot,\cdot) = \mathcal{E}(\cdot,\cdot)+\alpha(\cdot,\cdot).$
$(\mathcal{E}_1,\mathcal{F})$ is then a Hilbert space.
2.1 The Nash inequality
We first give a preliminary lemma.

Lemma 2.1 (i) $\|P_tf\|_1\le\|f\|_1$ for all $f\in L^1\cap L^2$.
(ii) For $f\in L^2$, define $u(t) = (P_tf,P_tf)$. Then $u'(t) = -2\mathcal{E}(P_tf,P_tf)$.
(iii) For $f\in\mathcal{F}$ and $t\ge0$, $\exp(-\mathcal{E}(f,f)t/\|f\|_2^2)\le\|P_tf\|_2/\|f\|_2$.

Proof. (i) We first show that if $0\le f\in L^2$, then $0\le P_tf$. Indeed, if we let $f_n = f\cdot1_{f^{-1}([0,n])}$, then $f_n\to f$ in $L^2$. Since $0\le f_n\le n$, the Markovian property of $\{P_t\}$ implies that $0\le P_tf_n\le n$. Taking $n\to\infty$, we obtain $0\le P_tf$. So we have $|P_tf|\le P_t|f|$, since $\pm f\le|f|$. Using this and the Markovian property, we have for all $f\in L^2\cap L^1$ and all Borel sets $A\subset X$,
$(|P_tf|,1_A)\le(P_t|f|,1_A) = (|f|,P_t1_A)\le\|f\|_1.$
Hence $P_tf\in L^1$ and $\|P_tf\|_1\le\|f\|_1$.
(ii) Since $P_tf\in\mathrm{Dom}(\mathcal{L})$, we have
$\frac{u(t+h)-u(t)}{h} = \frac1h(P_{t+h}f+P_tf,\ P_{t+h}f-P_tf) = \Big(P_{t+h}f+P_tf,\ \frac{(P_h-I)P_tf}{h}\Big) \xrightarrow[h\downarrow0]{} -2(P_tf,\mathcal{L}P_tf) = -2\mathcal{E}(P_tf,P_tf).$
Hence $u'(t) = -2\mathcal{E}(P_tf,P_tf)$.
(iii) We will prove the inequality for $f\in\mathrm{Dom}(\mathcal{L})$; one can then obtain the result for $f\in\mathcal{F}$ by approximation. Let $\mathcal{L} = \int_0^\infty\lambda\,dE_\lambda$ be the spectral decomposition of $\mathcal{L}$. Then $P_t = e^{-t\mathcal{L}} = \int_0^\infty e^{-\lambda t}\,dE_\lambda$ and $\|f\|_2^2 = \int_0^\infty(dE_\lambda f,f)$. Since $\lambda\mapsto e^{-2\lambda t}$ is convex, by Jensen's inequality,
$\exp\Big(-\int_0^\infty 2\lambda t\,\frac{(dE_\lambda f,f)}{\|f\|_2^2}\Big) \le \int_0^\infty e^{-2\lambda t}\,\frac{(dE_\lambda f,f)}{\|f\|_2^2} = \frac{(P_{2t}f,f)}{\|f\|_2^2} = \frac{\|P_tf\|_2^2}{\|f\|_2^2}.$
Taking the square root of each side, we obtain the desired inequality. $\square$
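Lemma 2.1(iii) can be sanity-checked in the simplest finite setting, a weighted graph with $\mu$ taken to be the counting measure, where $P_t = e^{-t\mathcal{L}}$ is computable exactly by diagonalizing the Laplacian. The following is our own numerical illustration of the inequality, not part of the proof:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
W = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        W[i, j] = W[j, i] = rng.uniform(0.1, 1.0)
L = np.diag(W.sum(1)) - W            # generator; here E(f,f) = f^T L f
lam, U = np.linalg.eigh(L)

def P(t, f):
    # P_t f = e^{-tL} f via the spectral decomposition
    return U @ (np.exp(-t * lam) * (U.T @ f))

f = rng.normal(size=n)
E = f @ L @ f                        # E(f, f)
for t in (0.1, 0.5, 2.0, 10.0):
    lhs = np.exp(-E * t / (f @ f))
    rhs = np.linalg.norm(P(t, f)) / np.linalg.norm(f)
    assert lhs <= rhs + 1e-12        # Lemma 2.1(iii)
print("Lemma 2.1(iii) holds on this example")
```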

Remark 2.2 An alternative proof of (iii) is to use the logarithmic convexity of $\|P_tf\|_2^2$. Indeed,
$\|P_{(t+s)/2}f\|_2^2 = (P_{t+s}f,f) = (P_tf,P_sf)\le\|P_tf\|_2\|P_sf\|_2$ for all $s,t>0$,
so $t\mapsto\|P_tf\|_2^2$ is logarithmically convex. Thus
$t\mapsto\frac{d}{dt}\log\|P_tf\|_2^2 = \frac{\frac{d}{dt}(\|P_tf\|_2^2)}{\|P_tf\|_2^2} = \frac{-2\mathcal{E}(P_tf,P_tf)}{\|P_tf\|_2^2}$ (2.1)
is non-decreasing. (The last equality is due to Lemma 2.1(ii).) The right-hand side of (2.1) is $-\frac{2\mathcal{E}(f,f)}{\|f\|_2^2}$ when $t = 0$, so integrating (2.1) over $[0,t]$, we have
$\log\frac{\|P_tf\|_2^2}{\|f\|_2^2} = \int_0^t\frac{d}{ds}\log\|P_sf\|_2^2\,ds \ge -\frac{2t\,\mathcal{E}(f,f)}{\|f\|_2^2}.$
The following is easy to see. (Note that we only need the first assertion.)

Lemma 2.3 Let $(\mathcal{E},\mathcal{F})$ be a symmetric closed bilinear form on $L^2(X,\mu)$, and let $\{P_t\}_{t\ge0}$, $\mathcal{L}$ be the corresponding semigroup and self-adjoint operator respectively. Then, for each $\lambda>0$, $(\mathcal{E}_\lambda,\mathcal{F})$ is also a symmetric closed bilinear form, and the corresponding semigroup and self-adjoint operator are $\{e^{-\lambda t}P_t\}_{t\ge0}$ and $\mathcal{L}+\lambda I$, respectively. Further, if $(\mathcal{E},\mathcal{F})$ is a regular Dirichlet form on $L^2(X,\mu)$ and the corresponding Hunt process is $\{Y_t\}_{t\ge0}$, then $(\mathcal{E}_\lambda,\mathcal{F})$ is also a regular Dirichlet form, and the corresponding Hunt process is $\{Y_{t\wedge\zeta}\}_{t\ge0}$, where $\zeta$ is an independent exponential random variable with parameter $\lambda$. ($\zeta$ is the killing time; i.e. the process goes to the cemetery point at $\zeta$.)

The next theorem was proved by Carlen-Kusuoka-Stroock ([38]); the original idea of the proof of (i) $\Rightarrow$ (ii) is due to Nash [96].
Theorem 2.4 (The Nash inequality, [38]) The following are equivalent for any $\lambda\ge0$.
(i) There exist $c_1,\theta>0$ such that for all $f\in\mathcal{F}\cap L^1$,
$\|f\|_2^{2+4/\theta}\le c_1\big(\mathcal{E}(f,f)+\lambda\|f\|_2^2\big)\|f\|_1^{4/\theta}.$ (2.2)
(ii) For all $t>0$, $P_t(L^1)\subset L^\infty$ and $P_t : L^1\to L^\infty$ is a bounded operator. Moreover, there exist $c_2,\theta>0$ such that
$\|P_t\|_{1\to\infty}\le c_2e^{\lambda t}t^{-\theta/2}$ for all $t>0$. (2.3)
Here $\|P_t\|_{1\to\infty}$ is the operator norm of $P_t : L^1\to L^\infty$.

When $\lambda = 0$, we refer to (2.2) as $(N_\theta)$ and to (2.3) as $(UC_\theta)$.
Proof. First, note that by Lemma 2.3 it is enough to prove the theorem when $\lambda = 0$.
(i) $\Rightarrow$ (ii): Let $f\in L^2\cap L^1$ with $\|f\|_1 = 1$ and $u(t) := (P_tf,P_tf)$. Then, by Lemma 2.1(ii), $u'(t) = -2\mathcal{E}(P_tf,P_tf)$. Now by (i) and Lemma 2.1(i),
$2u(t)^{1+2/\theta}\le c_1(-u'(t))\|P_tf\|_1^{4/\theta}\le c_1(-u'(t)),$
so $-u'(t)\ge c_2u(t)^{1+2/\theta}$. Set $v(t) = u(t)^{-2/\theta}$; then we obtain $v'(t)\ge2c_2/\theta$. Since $\lim_{t\downarrow0}v(t) = u(0)^{-2/\theta} = \|f\|_2^{-4/\theta}>0$, it follows that $v(t)\ge2c_2t/\theta$. This means $u(t)\le c_3t^{-\theta/2}$, whence $\|P_tf\|_2\le c_3^{1/2}t^{-\theta/4}\|f\|_1$ for all $f\in L^2\cap L^1$, which implies $\|P_t\|_{1\to2}\le c_3^{1/2}t^{-\theta/4}$. Since $P_t = P_{t/2}\circ P_{t/2}$ and $\|P_{t/2}\|_{1\to2} = \|P_{t/2}\|_{2\to\infty}$, we obtain (ii).
(ii) $\Rightarrow$ (i): Let $f\in L^2\cap L^1$ with $\|f\|_1 = 1$. Using (ii) and Lemma 2.1(iii), we have
$\exp\Big(-\frac{2\mathcal{E}(f,f)t}{\|f\|_2^2}\Big)\le\frac{\|P_tf\|_2^2}{\|f\|_2^2}\le\frac{c_4t^{-\theta/2}}{\|f\|_2^2}.$
Rewriting, we have $\mathcal{E}(f,f)/\|f\|_2^2\ge(2t)^{-1}\log(t^{\theta/2}\|f\|_2^2)-(2t)^{-1}A$, where $A = \log c_4$ and we may take $A>0$. Set $\Phi(x) = \sup_{t>0}\{\frac{x}{2t}\log(xt^{\theta/2})-\frac{Ax}{2t}\}$. By elementary computations, we have $\Phi(x)\ge c_5x^{1+2/\theta}$. So
$\mathcal{E}(f,f)\ge\Phi(\|f\|_2^2)\ge c_5\|f\|_2^{2(1+2/\theta)} = c_5\|f\|_2^{2+4/\theta}.$
Since this holds for all $f\in L^2\cap L^1$ with $\|f\|_1 = 1$, we obtain (i). $\square$

Remark 2.5 (1) When one is only concerned with $t\ge1$ (for example on graphs), we have the following equivalence under the assumption that $\|P_t\|_{1\to\infty}\le C$ for all $t\ge0$ ($C$ independent of $t$):
(i) $(N_\theta)$ with $\lambda = 0$ holds with $\mathcal{E}(f,f)$ replaced by $\mathcal{E}(f,f)+\|f\|_1^2$.
(ii) $(UC_\theta)$ with $\lambda = 0$ holds for $t\ge1$.

(2) We have the following generalization of the theorem, due to [41]. Let $m : \mathbb{R}_+\to\mathbb{R}_+$ be a decreasing $C^1$ bijection which satisfies the following: $M(t) := -\log m(t)$ satisfies $M'(u)\le c_0M'(t)$ for all $t\ge0$ and all $u\in[t,2t]$. (Roughly, this condition means that the logarithmic derivative of $m(t)$ has polynomial growth; so functions of exponential growth may satisfy the condition, but functions of double exponential growth do not.) Let $\Theta(x) = -m'(m^{-1}(x))$. Then the following are equivalent.
(i) $c_1\Theta(\|f\|_2^2)\le\mathcal{E}(f,f)$ for all $f\in\mathcal{F}$ with $\|f\|_1\le1$.

(ii) $\|P_t\|_{1\to\infty}\le c_2m(t)$ for all $t>0$.
The above theorem corresponds to the case $\Theta(y) = c_4y^{1+2/\theta}$ and $m(t) = t^{-\theta/2}$.

Corollary 2.6 Suppose the Nash inequality $(N_\theta)$ (Theorem 2.4 with $\lambda = 0$) holds. Let $\varphi$ be an eigenfunction of $\mathcal{L}$ with eigenvalue $\lambda\ge1$. Then
$\|\varphi\|_\infty\le c_1\lambda^{\theta/4}\|\varphi\|_2,$
where $c_1>0$ is a constant independent of $\varphi$ and $\lambda$.

Proof. Since $\mathcal{L}\varphi = \lambda\varphi$, $P_t\varphi = e^{-t\mathcal{L}}\varphi = e^{-\lambda t}\varphi$. By Theorem 2.4, $\|P_t\|_{2\to\infty} = \|P_t\|_{1\to2}\le\|P_t\|_{1\to\infty}^{1/2}\le ct^{-\theta/4}$ for $0<t\le1$. Thus
$e^{-\lambda t}\|\varphi\|_\infty = \|P_t\varphi\|_\infty\le ct^{-\theta/4}\|\varphi\|_2.$
Taking $t = \lambda^{-1}$ and $c_1 = ce$, we obtain the result. $\square$

Example 2.7 Consider $\mathbb{Z}^d$, $d\ge2$, and put weight 1 on each edge $\{x,y\}$, $x,y\in\mathbb{Z}^d$ with $\|x-y\| = 1$. Then it is known that the corresponding simple random walk enjoys the following heat kernel estimate:
$p_n(x,y)\le c_1n^{-d/2}$ for all $x,y\in\mathbb{Z}^d$ and all $n\ge1$. Now, let

$H = \big\{\{(2n_1,2n_2,\dots,2n_d),\ (2n_1+1,2n_2,\dots,2n_d)\} : n_1,\dots,n_d\in\mathbb{Z}\big\}$
and consider a random subgraph $\mathcal{C}(\omega)$ obtained by removing each $e\in H$ with probability $p\in[0,1]$ independently. (So the set of vertices of $\mathcal{C}(\omega)$ is $\mathbb{Z}^d$; here $\omega$ represents the randomness of the environment.) If we define the quadratic forms for the original graph and for $\mathcal{C}(\omega)$ by $\mathcal{E}(\cdot,\cdot)$ and $\mathcal{E}^\omega(\cdot,\cdot)$ respectively, then it is easy to see that $\mathcal{E}^\omega(f,f)\le\mathcal{E}(f,f)\le4\mathcal{E}^\omega(f,f)$ for all $f\in L^2$. Thus, by Remark 2.5(1), the heat kernel of the simple random walk on $\mathcal{C}(\omega)$ still enjoys the estimate $p_n^\omega(x,y)\le c_1n^{-d/2}$ for all $x,y\in\mathcal{C}(\omega)$ and all $n\ge1$, for almost every $\omega$.
2.2 The Faber-Krahn, Sobolev and isoperimetric inequalities
In this subsection, we write $\Omega\subset\subset X$ when $\Omega$ is an open relatively compact subset of $X$. (For weighted graphs, it simply means that $\Omega$ is a finite subset of $X$.) Let $C_0(X)$ be the space of continuous, compactly supported functions on $X$. Define
$\lambda_1(\Omega) = \inf\Big\{\frac{\mathcal{E}(f,f)}{\|f\|_2^2} : f\in\mathcal{F}\cap C_0(X),\ \mathrm{Supp}\,f\subset\mathrm{Cl}(\Omega)\Big\}$ for all $\Omega\subset\subset X$.

By the min-max principle, $\lambda_1(\Omega)$ is the first eigenvalue of the corresponding Laplace operator with zero (Dirichlet) condition outside $\Omega$.
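For intervals in $\mathbb{Z}$ (unit edge weights, so $\mu_x = 2$), $\lambda_1(\Omega)$ can be computed numerically and compared with the Faber-Krahn inequality below for $\theta = 1$, the exponent matching the $t^{-1/2}$ heat kernel decay on $\mathbb{Z}$. The sketch below is our own check, not from the text; it verifies that $\lambda_1(\Omega)\mu(\Omega)^2$ stays bounded below:

```python
import numpy as np

def lambda_1(N):
    # first Dirichlet eigenvalue of an interval of N vertices in Z, unit weights;
    # E(f,f) = f^T L f with L tridiagonal, and ||f||_2^2 = 2 sum f^2 since mu_x = 2
    L = 2 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)
    return np.linalg.eigvalsh(L).min() / 2.0

for N in (5, 20, 80, 320):
    mu = 2.0 * N                       # mu(Omega)
    assert lambda_1(N) * mu**2 > 4.0   # FK(1): lambda_1 >= c mu(Omega)^{-2}
print("FK(1) scaling verified on intervals of Z")
```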

Definition 2.8 (The Faber-Krahn inequality) Let $\theta>0$. We say $(\mathcal{E},\mathcal{F})$ satisfies the Faber-Krahn inequality of order $\theta$ if the following holds:
$\lambda_1(\Omega)\ge c\,\mu(\Omega)^{-2/\theta}$ for all $\Omega\subset\subset X$. $(FK(\theta))$
Theorem 2.9 $(N_\theta)\Longleftrightarrow(FK(\theta))$.

Proof. $(N_\theta)\Rightarrow(FK(\theta))$: This is the easy direction. From $(N_\theta)$, we have
$\|f\|_2^2\le c_1\Big(\frac{\|f\|_1^2}{\|f\|_2^2}\Big)^{2/\theta}\mathcal{E}(f,f)$ for all $f\in\mathcal{F}\cap L^1$. (2.4)
On the other hand, if $\mathrm{Supp}\,f\subset\mathrm{Cl}(\Omega)$, then by Schwarz's inequality $\|f\|_1^2\le\mu(\Omega)\|f\|_2^2$, so $(\|f\|_1^2/\|f\|_2^2)^{2/\theta}\le\mu(\Omega)^{2/\theta}$. Putting this into (2.4), we obtain $(FK(\theta))$.
$(FK(\theta))\Rightarrow(N_\theta)$: We adopt an argument which originated in [60]. Let $u\in\mathcal{F}\cap C_0(X)$ be a non-negative function. For each $\lambda>0$, since $u<2(u-\lambda)$ on $\{u>2\lambda\}$, we have
$\int u^2\,d\mu = \int_{\{u>2\lambda\}}u^2\,d\mu+\int_{\{u\le2\lambda\}}u^2\,d\mu \le 4\int_{\{u>2\lambda\}}(u-\lambda)^2\,d\mu+2\lambda\int_{\{u\le2\lambda\}}u\,d\mu \le 4\int(u-\lambda)_+^2\,d\mu+2\lambda\|u\|_1.$ (2.5)
Note that $(u-\lambda)_+\in\mathcal{F}$ since $(\mathcal{E},\mathcal{F})$ is Markovian (cf. [55, Theorem 1.4.1]). Set $\Omega = \{u>\lambda\}$; then $\Omega$ is an open relatively compact set since $u$ is compactly supported, and $\mathrm{Supp}\,(u-\lambda)_+\subset\mathrm{Cl}(\Omega)$. So, applying $(FK(\theta))$ to $(u-\lambda)_+$ gives
$\int(u-\lambda)_+^2\,d\mu\le c\,\mu(\Omega)^{2/\theta}\,\mathcal{E}((u-\lambda)_+,(u-\lambda)_+)\le c\Big(\frac{\|u\|_1}{\lambda}\Big)^{2/\theta}\mathcal{E}(u,u),$
where we used the Chebyshev inequality in the second inequality. Putting this into (2.5),

$\|u\|_2^2\le 4c\Big(\frac{\|u\|_1}{\lambda}\Big)^{2/\theta}\mathcal{E}(u,u)+2\lambda\|u\|_1.$
Optimizing the right-hand side by taking $\lambda = c_1\mathcal{E}(u,u)^{\theta/(\theta+2)}\|u\|_1^{(2-\theta)/(2+\theta)}$, we obtain

$\|u\|_2^2\le c_2\,\mathcal{E}(u,u)^{\frac{\theta}{\theta+2}}\|u\|_1^{\frac{4}{2+\theta}},$
and thus obtain $(N_\theta)$. For general compactly supported $u\in\mathcal{F}$, we can obtain $(N_\theta)$ for $u_+$ and $u_-$, and so for $u$ as well. For general $u\in\mathcal{F}\cap L^1$, approximation by compactly supported functions gives the desired result. $\square$

Remark 2.10 We can generalize Theorem 2.9 as follows. Let
$\lambda_1(\Omega)\ge\frac{c}{\varphi(\mu(\Omega))^2}$ for all $\Omega\subset\subset X$, (2.6)
where $\varphi : (0,\infty)\to(0,\infty)$ is a non-decreasing function. Then this is equivalent to Remark 2.5(2)(i), where $\Theta(x) = x/\varphi(1/x)^2$, or $\varphi(x) = (x\,\Theta(1/x))^{-1/2}$. Theorem 2.9 is the case $\varphi(x) = x^{1/\theta}$.

In the following, we define $\|\nabla f\|_1$ in two cases. Case 1: When one can define the gradient $\nabla$ on the space and $\mathcal{E}(f,f) = \frac12\int_X|\nabla f(x)|^2\,d\mu(x)$, then $\|\nabla f\|_1 := \int_X|\nabla f(x)|\,d\mu(x)$. Case 2: When $(X,\mu)$ is a weighted graph, then $\|\nabla f\|_1 := \frac12\sum_{x,y\in X}|f(y)-f(x)|\mu_{xy}$. Whenever $\|\nabla f\|_1$ appears, we assume that we are in one of these two cases.
Definition 2.11 (Sobolev inequalities) (i) Let $\theta>2$. We say $(\mathcal{E},\mathcal{F})$ satisfies $(S^2_\theta)$ if
$\|f\|_{2\theta/(\theta-2)}^2\le c_1\mathcal{E}(f,f)$ for all $f\in\mathcal{F}\cap C_0(X)$. $(S^2_\theta)$
(ii) Let $\theta>1$. We say $(\mathcal{E},\mathcal{F})$ satisfies $(S^1_\theta)$ if
$\|f\|_{\theta/(\theta-1)}\le c_2\|\nabla f\|_1$ for all $f\in\mathcal{F}\cap C_0(X)$. $(S^1_\theta)$
In the following, we define $|\partial\Omega|$ in two cases. Case 1: When $X$ is a $d$-dimensional Riemannian manifold and $\Omega$ is a smooth domain, then $|\partial\Omega|$ is the surface measure of $\partial\Omega$. Case 2: When $(X,\mu)$ is a weighted graph, then $|\partial\Omega| = \sum_{x\in\Omega}\sum_{y\in\Omega^c}\mu_{xy}$. Whenever $|\partial\Omega|$ appears, we assume that we are in one of these two cases.
Definition 2.12 (The isoperimetric inequality) Let $\theta>1$. We say $(\mathcal{E},\mathcal{F})$ satisfies the isoperimetric inequality of order $\theta$ if
$\mu(\Omega)^{(\theta-1)/\theta}\le c_1|\partial\Omega|$ for all $\Omega\subset\subset X$. $(I_\theta)$
We write $(I_\infty)$ when $\theta = \infty$, namely when

$\mu(\Omega)\le c_1|\partial\Omega|$ for all $\Omega\subset\subset X$. $(I_\infty)$
Remark 2.13 (i) For a weighted graph $(X,\mu)$ with $\mu_x\ge1$ for all $x\in X$, if $(I_\infty)$ holds, then $(I_\alpha)$ holds for any $\alpha>1$. So $(I_\infty)$ is the strongest inequality among all the isoperimetric inequalities.
(ii) $\mathbb{Z}^d$ satisfies $(I_d)$. The binary tree satisfies $(I_\infty)$.
Theorem 2.14 The following holds for $\theta>0$.

$(I_\theta)\ \overset{\theta>1}{\Longleftrightarrow}\ (S^1_\theta)\ \overset{\theta>2}{\Longrightarrow}\ (S^2_\theta)\ \overset{\theta>2}{\Longleftrightarrow}\ (N_\theta)\ \Longleftrightarrow\ (UC_\theta)\ \Longleftrightarrow\ (FK(\theta))$

Proof. Note that the last two equivalences have already been proved in Theorems 2.4 and 2.9.
$(I_\theta)\Leftarrow(S^1_\theta)$ ($\theta>1$): When $(X,\mu)$ is a weighted graph, simply apply $(S^1_\theta)$ to $f = 1_\Omega$ and we obtain $(I_\theta)$. When $X$ is a Riemannian manifold, by using a Lipschitz function which approximates $f = 1_\Omega$ nicely, we can obtain $(I_\theta)$.
$(I_\theta)\Rightarrow(S^1_\theta)$ ($\theta>1$): We will use the co-area formula given in Lemma 2.16 below. Let $f$ be a compactly supported non-negative function on $X$ (when $X$ is a Riemannian manifold, $f\in C_0^1(X)$ and $f\ge0$). Let $H_t(f) = \{x\in X : f(x)>t\}$ and set $p = \theta/(\theta-1)$. Applying $(I_\theta)$ to $H_t(f)$ and using Lemma 2.16 below, we have

$\|\nabla f\|_1 = \int_0^\infty|\partial H_t(f)|\,dt\ \ge\ c_1^{-1}\int_0^\infty\mu(H_t(f))^{1/p}\,dt = c_1^{-1}\int_0^\infty\|1_{H_t(f)}\|_p\,dt.$ (2.7)
Next take any $g\in L^q$ such that $g\ge0$ and $\|g\|_q = 1$, where $q$ is the value that satisfies $\frac1p+\frac1q = 1$. Then, by the Hölder inequality,

$\int_0^\infty\|1_{H_t(f)}\|_p\,dt\ \ge\ \int_0^\infty\|g\cdot1_{H_t(f)}\|_1\,dt = \int_Xg(x)\int_0^\infty1_{H_t(f)}(x)\,dt\,d\mu(x) = \|fg\|_1,$
since $\int_0^\infty1_{H_t(f)}(x)\,dt = f(x)$. Putting this into (2.7), we obtain
$\|f\|_p = \sup_{g\in L^q:\ \|g\|_q=1}\|fg\|_1\ \le\ c_1\|\nabla f\|_1,$
so we have $(S^1_\theta)$. We can obtain $(S^1_\theta)$ for general $f\in\mathcal{F}\cap C_0(X)$ by approximation.
$(S^1_\theta)\Rightarrow(S^2_\theta)$ ($\theta>2$): Set $\hat\theta = 2(\theta-1)/(\theta-2)$ and let $f\in\mathcal{F}\cap C_0(X)$. Applying $(S^1_\theta)$ to $f^{\hat\theta}$ and using Schwarz's inequality,

$\Big(\int f^{\frac{2\theta}{\theta-2}}\,d\mu\Big)^{\frac{\theta-1}{\theta}} = \|f^{\hat\theta}\|_{\frac{\theta}{\theta-1}}\le c_1\|\nabla f^{\hat\theta}\|_1\le c_2\|f^{\hat\theta-1}\nabla f\|_1\le c_2\|\nabla f\|_2\|f^{\hat\theta-1}\|_2 = c_2\|\nabla f\|_2\Big(\int f^{\frac{2\theta}{\theta-2}}\,d\mu\Big)^{1/2}.$
Rearranging, we obtain $(S^2_\theta)$.
$(S^2_\theta)\Rightarrow(N_\theta)$ ($\theta>2$): For $f\in\mathcal{F}\cap C_0(X)$, applying the Hölder inequality (with $p^{-1} = 4/(\theta+2)$, $q^{-1} = (\theta-2)/(\theta+2)$) and using $(S^2_\theta)$, we have
$\|f\|_2^{2+\frac4\theta}\le\|f\|_1^{\frac4\theta}\|f\|_{\frac{2\theta}{\theta-2}}^2\le c_1\|f\|_1^{\frac4\theta}\mathcal{E}(f,f),$
so we have $(N_\theta)$ in this case. The usual approximation argument gives the desired fact for $f\in\mathcal{F}\cap L^1$.
$(S^2_\theta)\Leftarrow(N_\theta)$ ($\theta>2$): For $f\in\mathcal{F}\cap C_0(X)$ such that $f\ge0$, define
$f_k = (f-2^k)_+\wedge2^k = 2^k1_{A_{k+1}}+(f-2^k)1_{B_k},\quad k\in\mathbb{Z},$
where $A_k = \{f\ge2^k\}$, $B_k = \{2^k\le f<2^{k+1}\}$. Then $f = \sum_{k\in\mathbb{Z}}f_k$ and $f_k\in\mathcal{F}\cap C_0(X)$. So
$\mathcal{E}(f,f) = \sum_{k\in\mathbb{Z}}\mathcal{E}(f_k,f_k)+\sum_{k,k'\in\mathbb{Z},\ k\ne k'}\mathcal{E}(f_k,f_{k'})\ \ge\ \sum_{k\in\mathbb{Z}}\mathcal{E}(f_k,f_k),$ (2.8)
where the last inequality is due to $\sum_{k\ne k'}\mathcal{E}(f_k,f_{k'})\ge0$. (This can be verified in an elementary way for the case of weighted graphs. When $\mathcal{E}$ is strongly local, $\sum_{k\ne k'}\mathcal{E}(f_k,f_{k'}) = 0$.)
Next, we have
$\big(2^{2k}\mu(A_{k+1})\big)^{1+2/\theta} = \Big(\int_{A_{k+1}}f_k^2\,d\mu\Big)^{1+2/\theta}\le\|f_k\|_2^{2+4/\theta}\le c_1\|f_k\|_1^{4/\theta}\mathcal{E}(f_k,f_k)\le c_1\big(2^k\mu(A_k)\big)^{4/\theta}\mathcal{E}(f_k,f_k),$ (2.9)
where we used $(N_\theta)$ for $f_k$ in the second inequality. Let $\alpha = 2\theta/(\theta-2)$, $\beta = \theta/(\theta+2)\in(1/2,1)$, and define $a_k = 2^{\alpha k}\mu(A_k)$, $b_k = \mathcal{E}(f_k,f_k)$. Then (2.9) can be rewritten as $a_k\le c_2a_{k-1}^{2(1-\beta)}b_k^\beta$. Summing over $k\in\mathbb{Z}$ and using the Hölder inequality (with $p = \frac{1}{2(1-\beta)}$, $q = \frac{1}{2\beta-1}$), we have
$\sum_ka_k\le c_2\sum_ka_{k-1}^{2(1-\beta)}b_k^\beta\le c_2\Big(\sum_ka_{k-1}\Big)^{2(1-\beta)}\Big(\sum_kb_k\Big)^\beta\le c_2\Big(\sum_ka_k\Big)^{2(1-\beta)}\Big(\sum_kb_k\Big)^\beta.$
Putting (2.8) into this, we have
$\sum_ka_k\le c_2\,\mathcal{E}(f,f)^{\beta/(2\beta-1)}.$ (2.10)
On the other hand, we have

$\|f\|_\alpha^\alpha = \sum_k\int_{B_k}f^\alpha\,d\mu\le\sum_k2^{\alpha(k+1)}\mu(A_k) = 2^\alpha\sum_ka_k.$
Plugging (2.10) into this, we obtain $(S^2_\theta)$. $\square$

Figure 1: 2-dimensional pre-Sierpinski gasket

Remark 2.15 (i) Generalizations of $(S^p_\theta)$ and $(I_\theta)$ are the following:
$\|f\|_p\le c_1\varphi(\mu(\Omega))\|\nabla f\|_p$ for all $\Omega\subset\subset X$ and all $f\in\mathcal{F}\cap C_0(X)$ such that $\mathrm{Supp}\,f\subset\mathrm{Cl}(\Omega)$,
$\frac{1}{\varphi(\mu(\Omega))}\le c_2\frac{|\partial\Omega|}{\mu(\Omega)}$ for all $\Omega\subset\subset X$,
where $\varphi : (0,\infty)\to(0,\infty)$ is a non-decreasing function as in Remark 2.10. Note that we defined $(S^p_\theta)$ for $p = 1,2$, but the above generalization makes sense for all $p\in[1,\infty]$ (at least for Riemannian manifolds). Theorem 2.14 is the case $\varphi(x) = x^{1/\theta}$ – see [40] for details.
(ii) As we see in the proof, in addition to the equivalence $(I_\theta)\Leftrightarrow(S^1_\theta)$, one can see that the best constant $c_1$ in $(I_\theta)$ is equal to the best constant $c_2$ in $(S^1_\theta)$. This is sometimes referred to as the Federer-Fleming theorem.

(iii) It is easy to see that the pre-Sierpinski gasket (Figure 1) does not satisfy $(I_\theta)$ for any $\theta>1$. On the other hand, we will prove in (3.28) that the heat kernel of the simple random walk enjoys the following estimate: $p_{2n}(0,0)\asymp n^{-\log3/\log5}$ for all $n\ge1$. This gives an example showing that $(N_\theta)$ cannot imply $(I_\theta)$ in general. In other words, the best exponent for isoperimetric inequalities is not necessarily the best exponent for Nash inequalities.
(iv) In [93], there is an interesting approach to proving $(I_\theta)\Rightarrow(N_\theta)$ directly, using the evolution of random sets.

The following lemma was used in the proof of Theorem 2.14.

Lemma 2.16 (Co-area formula) Let $f$ be a non-negative function on $X$ (when $X$ is a Riemannian manifold, $f\in C_0^1(X)$ and $f\ge0$), and define $H_t(f) = \{x\in X : f(x)>t\}$. Then
$\|\nabla f\|_1 = \int_0^\infty|\partial H_t(f)|\,dt.$
Proof. For simplicity we will prove it only when $(X,\mu)$ is a weighted graph. Then,
$\|\nabla f\|_1 = \frac12\sum_{x,y\in X}|f(y)-f(x)|\mu_{xy} = \sum_{x\in X}\sum_{y\in X:\,f(y)>f(x)}(f(y)-f(x))\mu_{xy}$
$= \sum_{x\in X}\sum_{y\in X:\,f(y)>f(x)}\Big(\int_0^\infty1_{\{f(y)>t\ge f(x)\}}\,dt\Big)\mu_{xy} = \int_0^\infty dt\sum_{x,y\in X}1_{\{f(y)>t\ge f(x)\}}\mu_{xy}$
$= \int_0^\infty dt\sum_{x\in H_t(f)^c}\sum_{y\in H_t(f)}\mu_{xy} = \int_0^\infty|\partial H_t(f)|\,dt.\qquad\square$
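On a finite weighted graph the co-area formula is a finite identity: $|\partial H_t(f)|$ is a step function of $t$, so the integral can be evaluated exactly by summing over the level intervals of $f$. A small random check (our own illustration of the lemma):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6
W = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        W[i, j] = W[j, i] = rng.uniform(0.1, 2.0)
f = rng.uniform(0.0, 1.0, size=n)    # a non-negative function on the graph

# left side: ||grad f||_1 = (1/2) sum |f(y)-f(x)| mu_xy
lhs = 0.5 * np.sum(np.abs(f[:, None] - f[None, :]) * W)

# right side: integral of |boundary of H_t(f)|; the integrand is constant
# between consecutive level values of f, so integrate exactly
levels = np.sort(np.unique(np.concatenate(([0.0], f))))
rhs = 0.0
for a, b in zip(levels[:-1], levels[1:]):
    inside = f > 0.5 * (a + b)       # H_t(f) for t in (a, b)
    rhs += (b - a) * np.sum(W[np.ix_(~inside, inside)])
assert abs(lhs - rhs) < 1e-10
print("co-area formula verified")
```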

The next fact is an immediate corollary to Theorem 2.14.

Corollary 2.17 Let $\theta>2$. If $(\mathcal{E},\mathcal{F})$ satisfies $(I_\theta)$, then the following holds:
$p_t(x,y)\le ct^{-\theta/2}$ for all $t>0$ and $\mu$-a.e. $x,y$.

3 Heat kernel estimates using effective resistance

In this section, we will consider the weighted graph $(X,\mu)$. We say that $(X,\mu)$ is loop-free if for any $l\ge3$, there is no set of distinct points $\{x_i\}_{i=1}^l\subset X$ such that $x_i\sim x_{i+1}$ for $1\le i\le l$, where we set $x_{l+1} := x_1$.

Set $R_{\mathrm{eff}}(x,x) = 0$ for all $x\in X$. We now give an important lemma on the effective resistance.

Lemma 3.1 (i) If $c_1 := \inf_{x,y\in X:\,x\sim y}\mu_{xy}>0$, then $R_{\mathrm{eff}}(x,y)\le c_1^{-1}d(x,y)$ for all $x,y\in X$.
(ii) If $(X,\mu)$ is loop-free and $c_2 := \sup_{x,y\in X:\,x\sim y}\mu_{xy}<\infty$, then $R_{\mathrm{eff}}(x,y)\ge c_2^{-1}d(x,y)$ for all $x,y\in X$.
(iii) $|f(x)-f(y)|^2\le R_{\mathrm{eff}}(x,y)\,\mathcal{E}(f,f)$ for all $x,y\in X$ and $f\in H^2$.
(iv) $R_{\mathrm{eff}}(\cdot,\cdot)$ and $R_{\mathrm{eff}}(\cdot,\cdot)^{1/2}$ are both metrics on $X$.

Proof. (i) Take a shortest path between $x$ and $y$ and cut all the bonds that are not along the path. Then we have the inequality by the cutting law.
(ii) Suppose $d(x,y) = n$. Take a shortest path $(x_0,x_1,\dots,x_n)$ between $x$ and $y$ so that $x_0 = x$, $x_n = y$. Now take $f : X\to\mathbb{R}$ so that $f(x_i) = (n-i)/n$ for $0\le i\le n$, and $f(z) = f(x_i)$ if $z$ is in the branch from $x_i$, i.e. if $z$ can be connected to $x_i$ without crossing $\{x_k\}_{k=0}^n\setminus\{x_i\}$. This $f$ is well-defined because $(X,\mu)$ is loop-free, and $f(x) = 1$, $f(y) = 0$. So $R_{\mathrm{eff}}(x,y)^{-1}\le(1/n)^2\sum_{i=0}^{n-1}\mu_{x_ix_{i+1}}\le c_2/n = c_2/d(x,y)$, and the result follows.
(iii) For any non-constant function $u\in H^2$ and any $x\ne y\in X$, we can construct $f\in H^2$ such that $f(x) = 1$, $f(y) = 0$ by a linear transform $f(z) = au(z)+b$ (where $a,b$ are chosen suitably). So

$\sup\Big\{\frac{|u(x)-u(y)|^2}{\mathcal{E}(u,u)} : u\in H^2,\ \mathcal{E}(u,u)>0\Big\} = \sup\Big\{\frac{1}{\mathcal{E}(f,f)} : f\in H^2,\ f(x) = 1,\ f(y) = 0\Big\} = R_{\mathrm{eff}}(x,y),$ (3.1)
and we have the desired inequality.

(iv) It is easy to see that $R_{\mathrm{eff}}(x,y) = R_{\mathrm{eff}}(y,x)$ and that $R_{\mathrm{eff}}(x,y) = 0$ if and only if $x = y$. So we only need to check the triangle inequality. Let $\tilde H^2 = \{u\in H^2 : \mathcal{E}(u,u)>0\}$. Then, for $x,y,z\in X$ that are distinct, we have by (3.1)
$R_{\mathrm{eff}}(x,y)^{1/2} = \sup\Big\{\frac{|u(x)-u(y)|}{\mathcal{E}(u,u)^{1/2}} : u\in\tilde H^2\Big\}$
$\le \sup\Big\{\frac{|u(x)-u(z)|}{\mathcal{E}(u,u)^{1/2}} : u\in\tilde H^2\Big\}+\sup\Big\{\frac{|u(z)-u(y)|}{\mathcal{E}(u,u)^{1/2}} : u\in\tilde H^2\Big\} = R_{\mathrm{eff}}(x,z)^{1/2}+R_{\mathrm{eff}}(z,y)^{1/2}.$
So $R_{\mathrm{eff}}(\cdot,\cdot)^{1/2}$ is a metric on $X$.
Next, let $V = \{x,y,z\}\subset X$ and let $\{\hat\mu_{xy},\hat\mu_{yz},\hat\mu_{zx}\}$ be the trace of $\{\mu_{xy}\}_{x,y\in X}$ to $V$. Define $R_{xy} = \hat\mu_{xy}^{-1}$, $R_{yz} = \hat\mu_{yz}^{-1}$, $R_{zx} = \hat\mu_{zx}^{-1}$. Then, using Proposition 1.25 and the resistance formulas for series and parallel circuits, we have

$R_{\mathrm{eff}}(z,x) = \frac{1}{R_{zx}^{-1}+(R_{xy}+R_{yz})^{-1}} = \frac{R_{zx}(R_{xy}+R_{yz})}{R_{xy}+R_{yz}+R_{zx}},$ (3.2)
and similarly $R_{\mathrm{eff}}(x,y) = \frac{R_{xy}(R_{yz}+R_{zx})}{R_{xy}+R_{yz}+R_{zx}}$ and $R_{\mathrm{eff}}(y,z) = \frac{R_{yz}(R_{zx}+R_{xy})}{R_{xy}+R_{yz}+R_{zx}}$. Hence

$\frac12\big\{R_{\mathrm{eff}}(x,z)+R_{\mathrm{eff}}(z,y)-R_{\mathrm{eff}}(x,y)\big\} = \frac{R_{yz}R_{zx}}{R_{xy}+R_{yz}+R_{zx}}\ \ge\ 0,$ (3.3)
which shows that $R_{\mathrm{eff}}(\cdot,\cdot)$ is a metric on $X$. $\square$
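Lemma 3.1(iv) is easy to check numerically: on a finite network, $R_{\mathrm{eff}}(x,y) = G_{xx}+G_{yy}-2G_{xy}$ where $G$ is the pseudoinverse of the graph Laplacian, and both triangle inequalities can be tested over all triples. The sketch below is our own illustration on a random connected weighted graph, not part of the proof:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8
W = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        if rng.random() < 0.5:
            W[i, j] = W[j, i] = rng.uniform(0.2, 3.0)
for i in range(n - 1):               # keep the graph connected
    if W[i, i + 1] == 0:
        W[i, i + 1] = W[i + 1, i] = 1.0

G = np.linalg.pinv(np.diag(W.sum(1)) - W)
R = np.array([[G[i, i] + G[j, j] - 2 * G[i, j] for j in range(n)]
              for i in range(n)])    # effective resistance matrix

# triangle inequality for both R_eff and R_eff^{1/2}, over all triples
for x in range(n):
    for y in range(n):
        for z in range(n):
            assert R[x, y] <= R[x, z] + R[z, y] + 1e-12
            assert R[x, y] ** 0.5 <= R[x, z] ** 0.5 + R[z, y] ** 0.5 + 1e-12
print("both triangle inequalities hold")
```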

Remark 3.2 (i) Different proofs of the triangle inequality for $R_{\mathrm{eff}}(\cdot,\cdot)$ in Lemma 3.1(iv) can be found in [10] and [75, Theorem 1.12].
(ii) Weighted graphs are resistance forms. (See [79] for the definition and properties of resistance forms.) In fact, most of the results in this section (including this lemma) hold for resistance forms.

3.1 Green density killed on a finite set

For $y\in X$ and $n\in\mathbb{N}$, let
$L(y,n) = \sum_{k=0}^{n-1}1_{\{Y_k=y\}}$
be the local time at $y$ up to time $n-1$. For a finite set $B\subset X$ and $x,y\in X$, define the Green density by
$g_B(x,y) = \frac{1}{\mu_y}E^x[L(y,\tau_B)] = \frac{1}{\mu_y}\sum_kP^x(Y_k = y,\ k<\tau_B).$ (3.4)
Clearly $g_B(x,y) = 0$ when either $x$ or $y$ is outside $B$. Since $\mu_y^{-1}P^x(Y_k = y,\ k<\tau_B) = \mu_x^{-1}P^y(Y_k = x,\ k<\tau_B)$, we have
$g_B(x,y) = g_B(y,x)$ for all $x,y\in X$. (3.5)
Using the strong Markov property of $Y$,

$g_B(x,y) = P^x(\sigma_y<\tau_B)\,g_B(y,y)\ \le\ g_B(y,y).$ (3.6)

Below are further properties of the Green density.

Lemma 3.3 Let $B\subset X$ be a finite set. Then the following hold.
(i) For $x\in B$, $g_B(x,\cdot)$ is harmonic on $B\setminus\{x\}$ and $= 0$ outside $B$.
(ii) (Reproducing property of the Green density) For all $f\in H^2$ with $\mathrm{Supp}\,f\subset B$, it holds that $\mathcal{E}(g_B(x,\cdot),f) = f(x)$ for each $x\in X$.
(iii) $E^x[\tau_B] = \sum_{y\in B}g_B(x,y)\mu_y$ for each $x\in X$.
(iv) $R_{\mathrm{eff}}(x,B^c) = g_B(x,x)$ for each $x\in X$.
(v) $E^x[\tau_B]\le R_{\mathrm{eff}}(x,B^c)\mu(B)$ for each $x\in X$.

Proof. (i) Let $\nu(z) = g_B(x,z)$. Then, for each $y\in B\setminus\{x\}$, noting that $Y_0\ne y$ and $Y_{\tau_B}\ne y$, we have
$\nu(y)\mu_y = E^x\Big[\sum_{i=0}^{\tau_B-1}1_y(Y_{i+1})\Big] = E^x\Big[\sum_{i=0}^{\tau_B-1}\sum_z1_z(Y_i)P(z,y)\Big] = \sum_z\nu(z)\mu_z\frac{\mu_{zy}}{\mu_z} = \sum_z\nu(z)\mu_{yz}.$
Dividing both sides by $\mu_y$, we have $\nu(y) = \sum_zP(y,z)\nu(z)$, so $\nu$ is harmonic on $B\setminus\{x\}$.
(ii) Let $u(y) = \nu(y)\mu_y$. Since $\nu$ is harmonic on $B\setminus\{x\}$ and $\mathrm{Supp}\,f\subset B$, noting that we can apply

Lemma 1.5(ii) because $f\in L^2$ when $B$ is finite, we have
$\mathcal{E}(\nu,f) = \sum_x(-\mathcal{L}\nu)(x)f(x)\mu_x = \sum_x\Big\{\nu(x)-\sum_yP(x,y)\nu(y)\Big\}f(x)\mu_x$
$= \sum_x\Big\{\nu(x)\mu_x-\sum_y\frac{\mu_{xy}}{\mu_x}\mu_x\nu(y)\Big\}f(x) = \sum_x\Big\{\nu(x)\mu_x-\sum_y\frac{\mu_{xy}}{\mu_y}\mu_y\nu(y)\Big\}f(x)$
$= \sum_x\Big\{u(x)-\sum_yP(y,x)u(y)\Big\}f(x).$ (3.7)

Since $u(y) = E^x[\sum_{k=0}^{\tau_B-1}1_{\{Y_k=y\}}]$ and, by the Markov property, $E^x\big[\sum_{k=0}^{\tau_B-1}\sum_y1_{\{Y_k=y\}}P(y,x)\big] = E^x\big[\sum_{k=0}^{\tau_B-1}1_{\{Y_{k+1}=x\}}\big]$, we have

$u(x)-\sum_yP(y,x)u(y) = E^x\Big[\sum_{k=0}^{\tau_B-1}1_{\{Y_k=x\}}\Big]-E^x\Big[\sum_{k=0}^{\tau_B-1}\sum_y1_{\{Y_k=y\}}P(y,x)\Big]$
$= 1+E^x\Big[\sum_{k=1}^{\tau_B-1}1_{\{Y_k=x\}}\Big]-E^x\Big[\sum_{k=0}^{\tau_B-1}1_{\{Y_{k+1}=x\}}\Big]$
$= 1+E^x\Big[\sum_{k=1}^{\tau_B-1}1_{\{Y_k=x\}}\Big]-E^x\Big[\sum_{k=1}^{\tau_B}1_{\{Y_k=x\}}\Big] = 1,$
since $Y_{\tau_B}\ne x$. Putting this into (3.7), we obtain $\mathcal{E}(\nu,f) = f(x)$.
(iii) Multiplying both sides of (3.4) by $\mu_y$ and summing over $y\in B$, we obtain the result.
(iv) If $x\notin B$, both sides are 0, so let $x\in B$. Let $p_B^x(z) = g_B(x,z)/g_B(x,x)$. Then, by Proposition 1.13 and (i) above, we see that $p_B^x$ attains the minimum in the definition of the effective resistance. (Note that the assumption that $B$ is finite is used to guarantee the uniqueness of the minimizer.) Thus, using (ii) above,

$R_{\mathrm{eff}}(x,B^c)^{-1} = \mathcal{E}(p_B^x,p_B^x) = \frac{\mathcal{E}(g_B(x,\cdot),g_B(x,\cdot))}{g_B(x,x)^2} = \frac{1}{g_B(x,x)}.$ (3.8)
(v) If $x\notin B$, both sides are 0, so let $x\in B$. Using (iii), (iv) and (3.6), we have
$E^x[\tau_B] = \sum_{y\in B}g_B(x,y)\mu_y\ \le\ \sum_{y\in B}g_B(x,x)\mu_y = R_{\mathrm{eff}}(x,B^c)\mu(B).$
We thus obtain the desired inequality. $\square$
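For a finite set $B$, $g_B$ is simply the inverse of the graph Laplacian restricted to $B$ (rows and columns in $B$, with the full degrees on the diagonal), which makes Lemma 3.3(iii), (iv) easy to check numerically. A sketch on a unit-weight path graph (our own check, not from the text); for the middle point of $B = \{1,\dots,5\}$ the two arms to $B^c$ have resistance 3 each, so the parallel law gives $R_{\mathrm{eff}} = 3\cdot3/(3+3) = 1.5$:

```python
import numpy as np

# unit-weight path on 7 vertices 0-1-...-6
n = 7
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
L = np.diag(W.sum(1)) - W
mu = W.sum(1)                        # mu_y = sum_x mu_xy

B = [1, 2, 3, 4, 5]                  # finite set B; walk killed on exiting B
gB = np.linalg.inv(L[np.ix_(B, B)])  # g_B(x,y): inverse Dirichlet Laplacian on B

# Lemma 3.3(iii): E^x[tau_B] = sum_y g_B(x,y) mu_y; compare with the solution
# of the standard mean-exit-time system (I - P_B) h = 1
P = W / mu[:, None]
h = np.linalg.solve(np.eye(len(B)) - P[np.ix_(B, B)], np.ones(len(B)))
assert np.allclose(gB @ mu[B], h)

# Lemma 3.3(iv): R_eff(x, B^c) = g_B(x,x); for x = 3 (index 2 in B) this is 1.5
assert abs(gB[2, 2] - 1.5) < 1e-12
print("Lemma 3.3(iii),(iv) verified")
```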

Remark 3.4 As mentioned before Definition 1.3, $u(x)-\sum_yP(y,x)u(y)$ in (3.7) is the total flux flowing into $x$, given the potential $\nu$. So we see that the total flux flowing out from $x$ is 1 when the potential $g_B(x,\cdot)$ is imposed.
The next example shows that $R_{\mathrm{eff}}(x,B^c) = g_B(x,x)$ does not hold in general when $B$ is not finite.

Example 3.5 Consider $\mathbb{Z}^3$ with weight 1 on each nearest-neighbor bond. Let $p$ be an additional point and put a bond with weight 1 between the origin of $\mathbb{Z}^3$ and $p$; $X = \mathbb{Z}^3\cup\{p\}$ with the above-mentioned weights is the weighted graph in this example. Let $B = \mathbb{Z}^3$ and let $B_n = B\cap B(0,n)$. By Lemma 3.3(iv), $R_{\mathrm{eff}}(0,B_n^c) = g_{B_n}(0,0)$. If we set $c_n := R_{\mathrm{eff}}^{\mathbb{Z}^3}(0,B(0,n)^c)$, then it is easy to compute $R_{\mathrm{eff}}(0,B_n^c) = (1+c_n^{-1})^{-1}$. Since the simple random walk on $\mathbb{Z}^3$ is transient, $\lim_{n\to\infty}c_n =: c_0>0$. As will be proved in the proof of Lemma 3.9, $g_B(0,0) = \lim_{n\to\infty}g_{B_n}(0,0)$, so $g_B(0,0) = (1+c_0^{-1})^{-1}$. On the other hand, it is easy to see that $R_{\mathrm{eff}}(0,B^c) = 1>(1+c_0^{-1})^{-1}$, so $R_{\mathrm{eff}}(0,B^c)>g_B(0,0)$. This also shows that in general the resistance between two sets cannot be approximated by the resistances of finite approximating graphs.

For any $A\subset X$ and $A_1,A_2\subset X$ with $A_1\cap A_2 = \emptyset$ and $A\cap A_i = \emptyset$ (for either $i = 1$ or $2$), define
$R_{\mathrm{eff}}^A(A_1,A_2)^{-1} = \inf\{\mathcal{E}(f,f) : f\in H^2,\ f|_{A_1} = 1,\ f|_{A_2} = 0,\ f\text{ is constant on }A\}.$
In other words, $R_{\mathrm{eff}}^A(\cdot,\cdot)$ is the effective resistance for the network where the set $A$ is shorted and reduced to one point. Clearly $R_{\mathrm{eff}}^A(x,A) = R_{\mathrm{eff}}(x,A)$ for $x\in X\setminus A$. We then have the following.
Proposition 3.6 Let $B\subset X$ be a finite set. Then
$g_B(x,y) = \frac12\big(R_{\mathrm{eff}}(x,B^c)+R_{\mathrm{eff}}(y,B^c)-R_{\mathrm{eff}}^{B^c}(x,y)\big)$ for all $x,y\in B$.

Proof. Since the set $B^c$ is shorted and reduced to one point, it is enough to prove this when $B^c$ is a single point, say $z$. Noting that $R_{\mathrm{eff}}^{\{z\}}(x,y) = R_{\mathrm{eff}}(x,y)$, we will prove the following:
$g_{X\setminus\{z\}}(x,y) = \frac12\big(R_{\mathrm{eff}}(x,z)+R_{\mathrm{eff}}(y,z)-R_{\mathrm{eff}}(x,y)\big)$ for all $x,y\in B$. (3.9)
By Lemma 3.3(iv) and (3.6), we have

$g_{X\setminus\{z\}}(x,y) = P^y(\sigma_x<\tau_{X\setminus\{z\}})\,g_{X\setminus\{z\}}(x,x) = P^y(\sigma_x<\sigma_z)\,R_{\mathrm{eff}}(x,z).$ (3.10)
Now let $V = \{x,y,z\}\subset X$, consider the trace of the network to $V$, and consider the function $u$ on $V$ such that $u(x) = 1$, $u(z) = 0$ and $u$ is harmonic at $y$. Using the same notation as in the proof of Lemma 3.1(iv), we have

$P^y(\sigma_x<\sigma_z) = u(y) = \frac{R_{xy}^{-1}}{R_{xy}^{-1}+R_{yz}^{-1}}\cdot1+\frac{R_{yz}^{-1}}{R_{xy}^{-1}+R_{yz}^{-1}}\cdot0 = \frac{R_{yz}}{R_{xy}+R_{yz}}.$
Putting this into (3.10) and using (3.2), we have

$g_B(x,y) = \frac{R_{yz}}{R_{xy}+R_{yz}}\cdot\frac{R_{zx}(R_{xy}+R_{yz})}{R_{xy}+R_{yz}+R_{zx}} = \frac{R_{zx}R_{yz}}{R_{xy}+R_{yz}+R_{zx}}.$ (3.11)
By (3.3), we obtain (3.9). $\square$

Remark 3.7 Take an additional point $p_0$ and consider the $\triangle$-Y transform between $V = \{x,y,z\}$ and $W = \{p_0,x,y,z\}$. Namely, $W = \{p_0,x,y,z\}$ is the network such that $\tilde\mu_{p_0x},\tilde\mu_{p_0y},\tilde\mu_{p_0z}>0$ and all other weights are 0, and the trace of $(W,\tilde\mu)$ to $V$ is $(V,\hat\mu)$. (See, for example, [79, Lemma 2.1.15].) Then $\frac{R_{zx}R_{yz}}{R_{xy}+R_{yz}+R_{zx}}$ in (3.11) is equal to $\tilde\mu_{p_0z}^{-1}$, so $g_B(x,y) = \tilde\mu_{p_0z}^{-1}$.

Corollary 3.8 Let $B\subset X$ be a finite set. Then
$|g_B(x,y)-g_B(x,z)|\le R_{\mathrm{eff}}(y,z)$ for all $x,y,z\in X$.
Proof. By Proposition 3.6, we have
$|g_B(x,y)-g_B(x,z)|\ \le\ \frac{|R_{\mathrm{eff}}(y,B^c)-R_{\mathrm{eff}}(z,B^c)|+|R_{\mathrm{eff}}^{B^c}(x,y)-R_{\mathrm{eff}}^{B^c}(x,z)|}{2}$
$\le\ \frac{R_{\mathrm{eff}}(y,z)+R_{\mathrm{eff}}^{B^c}(y,z)}{2}\ \le\ R_{\mathrm{eff}}(y,z),$
which gives the desired result. $\square$
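Proposition 3.6 can be verified numerically: short $B^c$ to a single node $z$, read off $R_{\mathrm{eff}}(\cdot,B^c)$ and $R_{\mathrm{eff}}^{B^c}(\cdot,\cdot)$ from the pseudoinverse of the shorted Laplacian, and compare with $g_B$ computed as the inverse of the Laplacian restricted to $B$. A sketch on a random connected network (our own check, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 7
W = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        if rng.random() < 0.7:
            W[i, j] = W[j, i] = rng.uniform(0.5, 2.0)
for i in range(n - 1):                 # ensure connectivity via a spanning path
    if W[i, i + 1] == 0:
        W[i, i + 1] = W[i + 1, i] = 1.0

L = np.diag(W.sum(1)) - W
B = [0, 1, 2, 3]                       # finite set B; B^c = {4, 5, 6}
gB = np.linalg.inv(L[np.ix_(B, B)])    # Green density g_B(x, y)

# short B^c to one node z and compute effective resistances there
m = len(B) + 1
Wm = np.zeros((m, m))
Wm[:len(B), :len(B)] = W[np.ix_(B, B)]
for i, x in enumerate(B):
    Wm[i, -1] = Wm[-1, i] = W[x, 4:].sum()
G = np.linalg.pinv(np.diag(Wm.sum(1)) - Wm)
def R(a, b):                           # effective resistance in shorted network
    return G[a, a] + G[b, b] - 2 * G[a, b]

z = m - 1
for i in range(len(B)):
    for j in range(len(B)):
        assert abs(gB[i, j] - 0.5 * (R(i, z) + R(j, z) - R(i, j))) < 1e-10
print("Proposition 3.6 verified")
```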

3.2 Green density on a general set
In this subsection, we discuss the Green density when $B$ may be infinite. Since we do not use the results of this subsection later, readers may skip it.
Let $B\subset X$. We define the Green density by (3.4). By Proposition 1.18, $g_X(x,y)<\infty$ for all $x,y\in X$ when $\{Y_n\}$ is transient, whereas $g_X(x,y) = \infty$ for all $x,y\in X$ when $\{Y_n\}$ is recurrent. Since there is nothing interesting when $g_X(x,y) = \infty$, throughout this subsection we will only consider the case $B\ne X$ when $\{Y_n\}$ is recurrent.
Then we can easily see that the process $\{Y_n^B\}$ killed on exiting $B$ is transient, so $g_B(x,y)<\infty$ for all $x,y\in X$. It is easy to see that (3.5), (3.6), and Lemma 3.3(i), (iii) hold without any change in the proofs.
Recall that $H_0^2$ is the closure of $C_0(X)$ in $H^2$. We can generalize Lemma 3.3(ii) as follows.
Lemma 3.9 (Reproducing property of the Green density) For each $x\in X$, $g_B(x,\cdot)\in H_0^2$. Further, for all $f\in H_0^2$ with $\mathrm{Supp}\,f\subset B$, it holds that $\mathcal{E}(g_B(x,\cdot),f) = f(x)$ for each $x\in X$.
Proof. When $B$ is finite, this is already proved in Lemma 3.3(ii), so let $B$ be infinite (and $B\ne X$ if $\{Y_n\}$ is recurrent). Fix $x_0\in X$, let $B_n = B(x_0,n)\cap B$, and write $\nu_n(z) = g_{B_n}(x,z)$. Then $\tau_{B_n}\uparrow\tau_B$, so that $\nu_n(z)\uparrow\nu(z)<\infty$ for all $z\in X$. Using the reproducing property, for $m\le n$ we have
$\mathcal{E}(\nu_n-\nu_m,\nu_n-\nu_m) = g_{B_n}(x,x)-g_{B_m}(x,x),$
which implies that $\{\nu_n\}$ is a Cauchy sequence in $H^2$. It follows that $\nu_n\to\nu$ in $H^2$ and $\nu\in H_0^2$.
Now, for each $f\in H_0^2$ with $\mathrm{Supp}\,f\subset B$, choose $f_n\in C_0(X)$ so that $\mathrm{Supp}\,f_n\subset B$ and $f_n\to f$ in $H^2$. Then, as we proved above, $\mathcal{E}(f_n,\nu_n) = f_n(x)$. Taking $n\to\infty$ and using Lemma 1.2(i), we obtain $\mathcal{E}(f,\nu) = f(x)$. $\square$
As we saw in Example 3.5, $R_{\mathrm{eff}}(0,B^c) = \lim_{n\to\infty}R_{\mathrm{eff}}(0,(B\cap B(0,n))^c)$ does not hold in general. So we introduce another resistance metric as follows:
$R_*(x,y) := \sup\Big\{\frac{|u(x)-u(y)|^2}{\mathcal{E}(u,u)} : u\in H_0^2\oplus1,\ \mathcal{E}(u,u)>0\Big\}$ for all $x,y\in X$, $x\ne y$,
where $H_0^2\oplus1 = \{f+a : f\in H_0^2,\ a\in\mathbb{R}\}$.
By (3.1), the difference between this and the effective resistance metric is that the supremum is taken over $H_0^2\oplus1$ rather than over all of $H^2$. Clearly $R_*(x,y)\le R_{\mathrm{eff}}(x,y)$.
Here and in the remainder of this subsection, we will consider $R_*(x,B^c)$, so we assume $B\ne X$. (For $R_*(x,B^c)$, we regard $B^c$ as one point by shorting.) Using $R_*(\cdot,\cdot)$, we can generalize Lemma 3.3(iv), (v) as follows.

Lemma 3.10 Let $B\subset X$ be a set such that $B\ne X$. Then the following hold.
(i) $R_*(x,B^c) = g_B(x,x)$ for each $x\in X$.
(ii) $E^x[\tau_B]\le R_*(x,B^c)\mu(B)$ for each $x\in X$.

Proof. (i) If $x\notin B$, both sides are 0, so let $x\in B$. Let $p_B^x(z) = g_B(x,z)/g_B(x,x)$. Note that we cannot follow the proof of Lemma 3.3(iv) directly, because in general we do not have uniqueness for the solution of the Dirichlet problem. Instead, we argue as follows. Rewriting the definition, we have
$R_*(x,B^c)^{-1} = \inf\{\mathcal{E}(f,f) : f\in H_{0,x}^2(B)\},\quad\text{where }H_{0,x}^2(B) := \{f\in H_0^2\oplus1 : \mathrm{Supp}\,f\subset B,\ f(x) = 1\}.$ (3.12)
Take any $v\in H_{0,x}^2(B)$. (Note that $H_{0,x}^2(B)\subset H_0^2$ since $B\ne X$.) Then, by Lemma 3.9,
$\mathcal{E}(v-p_B^x,p_B^x) = \frac{\mathcal{E}(v-p_B^x,g_B(x,\cdot))}{g_B(x,x)} = \frac{v(x)-p_B^x(x)}{g_B(x,x)} = 0.$
So we have
$\mathcal{E}(v,v) = \mathcal{E}(v-p_B^x,v-p_B^x)+\mathcal{E}(p_B^x,p_B^x)\ \ge\ \mathcal{E}(p_B^x,p_B^x),$
which shows that the infimum in (3.12) is attained by $p_B^x$. Thus by (3.8), with $R_*(\cdot,\cdot)$ in place of $R_{\mathrm{eff}}(\cdot,\cdot)$, we obtain (i).
Given (i), (ii) can be proved exactly in the same way as Lemma 3.3(v). $\square$

Remark 3.11 As mentioned in [78, Section 2], when $(X,\mu)$ is transient, one can show that $(\mathcal{E},H_0^2\oplus1)$ is a resistance form on $X\cup\{\infty\}$, where $\infty$ is a new point that can be regarded as a point at infinity. Note that there is no weighted graph $(X\cup\{\infty\},\bar\mu)$ whose associated resistance form is $(\mathcal{E},H_0^2\oplus1)$. Indeed, if there were, then $1_{\{\infty\}}\in H_0^2\oplus1$, which contradicts the fact that $1\notin H_0^2$ (Proposition 1.22).

Given Lemma 3.10, Proposition 3.6 and Corollary 3.8 hold in general with $R_{\mathrm{eff}}(\cdot,\cdot)$ replaced by $R_*(\cdot,\cdot)$, without any change in the proofs.
Finally, note that by Proposition 1.24, we see that $H^2 = H_0^2\oplus1$ if and only if there are no non-constant harmonic functions of finite energy. One can see that this is also equivalent to $R_{\mathrm{eff}}(x,y) = R_*(x,y)$ for all $x,y\in X$. (The necessity can be shown using the fact that the resistance metric determines the resistance form: see [79, Section 2.3].)

3.3 General heat kernel estimates
In this subsection, we give general on-diagonal upper and lower heat kernel estimates. Define

$$B(x,r) = \{y \in X : d(x,y) \le r\}, \qquad V(x,r) = \mu(B(x,r)),$$
and, for a non-void finite set $\Omega \subset X$, let $r(\Omega)$ denote the largest integer $r$ such that $B(x,r) \subset \Omega$ for some $x \in \Omega$ (the inradius of $\Omega$).

Lemma 3.12 Assume that $\inf_{x,y\in X : x\sim y}\mu_{xy} > 0$.
(i) For any non-void finite set $\Omega \subset X$,
$$\lambda_1(\Omega) \ge \frac{c_1}{r(\Omega)\,\mu(\Omega)}. \tag{3.13}$$

(ii) Suppose that there exists a strictly increasing function $v$ on $\mathbb{N}$ such that
$$V(x,r) \ge v(r), \quad \forall x \in X,\ \forall r \in \mathbb{N}. \tag{3.14}$$
Then, for any non-void finite set $\Omega \subset X$,
$$\lambda_1(\Omega) \ge \frac{c_1}{v^{-1}(\mu(\Omega))\,\mu(\Omega)}. \tag{3.15}$$

Proof. (i) Let $f$ be any function with $\operatorname{Supp} f \subset \Omega$, normalized so that $\|f\|_\infty = 1$. Then $\|f\|_2^2 \le \mu(\Omega)$. Now consider a point $x_0 \in X$ such that $|f(x_0)| = 1$, and the largest integer $n$ such that $B(x_0,n) \subset \Omega$. Then $n \le r(\Omega)$, and there exists a sequence $\{x_i\}_{i=0}^n \subset X$ such that $x_i \sim x_{i+1}$ for $i = 0,1,\cdots,n-1$, $x_j \in \Omega$ for $j = 0,1,\cdots,n-1$, and $x_n \notin \Omega$. So we have
$$\mathcal{E}(f,f) \ge \frac12\sum_{i=0}^{n-1}(f(x_i)-f(x_{i+1}))^2\,\mu_{x_ix_{i+1}} \ge \frac{c_1}{n}\Big(\sum_{i=0}^{n-1}|f(x_i)-f(x_{i+1})|\Big)^2 \ge \frac{c_1}{n},$$
where the last inequality is due to $\sum_{i=0}^{n-1}|f(x_i)-f(x_{i+1})| \ge |f(x_0)-f(x_n)| = 1$. Combining these,
$$\frac{\mathcal{E}(f,f)}{\|f\|_2^2} \ge \frac{c_1}{n\,\mu(\Omega)} \ge \frac{c_1}{r(\Omega)\,\mu(\Omega)}.$$
Taking the infimum over all such $f$, we obtain the result.
(ii) Denote $r = r(\Omega)$. Then there exists $x_0 \in \Omega$ such that $B(x_0,r) \subset \Omega$, so (3.14) implies $v(r) \le \mu(\Omega)$. Thus $r \le v^{-1}(\mu(\Omega))$, and (3.15) follows from (3.13). □

Proposition 3.13 (Upper bound: slow decay) Assume that $\inf_{x,y\in X : x\sim y}\mu_{xy} > 0$ and
$$V(x,r) \ge c_1 r^D, \quad \forall x \in X,\ \forall r \in \mathbb{N}, \tag{3.16}$$
for some $D \ge 1$. Then the following holds:
$$\sup_{x\in X} p_t(x,x) \le c_2\, t^{-D/(D+1)}, \quad \forall t \ge 1.$$

Proof. By (3.15), we have $\lambda_1(\Omega) \ge c_1\,\mu(\Omega)^{-1-1/D}$. Thus the result is obtained from Theorem 2.14 by taking $\theta = 2D/(D+1)$. □

Remark 3.14 We can generalize Proposition 3.13 as follows (see [14]): assume that $\inf_{x,y\in X:x\sim y}\mu_{xy} > 0$ and that (3.14) holds for all $r \ge r_0$. Then the following holds:
$$\sup_{x\in X} p_t(x,x) \le c_1\, m(t), \quad \forall t \ge r_0^2,$$
where $m$ is defined by
$$t - r_0^2 = \int_{v(r_0)}^{1/m(t)} v^{-1}(s)\,ds.$$
Indeed, by (3.14), we see that (2.6) holds with $\varphi(s) = c\,v^{-1}(s)\,s$. Thus the result can be obtained by applying Remark 2.5 and Remark 2.10. In fact, the above generalized version of Proposition 3.13 also holds for geodesically complete non-compact Riemannian manifolds with bounded geometry (see [14]). Below is a table of the slow heat kernel decay $m(t)$, given the volume growth $v(r)$.

$v(r)$                  | $\exp(cr)$      | $\exp(cr^\alpha)$              | $cr^D$            | $cr$
$\sup_{x\in X}p_t(x,x)$ | $ct^{-1}\log t$ | $ct^{-1}(\log t)^{1/\alpha}$   | $ct^{-D/(D+1)}$   | $ct^{-1/2}$
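The polynomial entry of the table can be recovered numerically from the defining relation of $m(t)$. Below is a minimal sketch of our own (assuming $r_0 = 1$, so $v(r_0) = 1$, and $v(r) = r^D$), which inverts $t = \int_1^{1/m(t)} v^{-1}(s)\,ds$ by bisection:

```python
# Numerically invert t = \int_1^{1/m(t)} v^{-1}(s) ds for v(r) = r^D (r0 = 1),
# and check that m(t) decays like t^{-D/(D+1)}, as in the table.

def m_of_t(t, D):
    """Solve F(U) = t for U = 1/m(t) by bisection, where
    F(U) = \int_1^U s^{1/D} ds = D/(D+1) * (U^{(D+1)/D} - 1)."""
    F = lambda U: D / (D + 1) * (U ** ((D + 1) / D) - 1.0) - t
    lo, hi = 1.0, 1.0
    while F(hi) < 0:          # find an upper bracket for the root
        hi *= 2.0
    for _ in range(200):      # bisect down to machine precision
        mid = 0.5 * (lo + hi)
        if F(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 1.0 / hi

D = 2.0
# m(t) * t^{D/(D+1)} should approach the constant ((D+1)/D)^{-D/(D+1)}
ratio = m_of_t(1e8, D) * (1e8) ** (D / (D + 1))
print(ratio)  # close to (3/2)^(-2/3)
```

The same check with $D = 1$ reproduces the last column, $m(t) \asymp t^{-1/2}$.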

Next we discuss a general form of the on-diagonal heat kernel lower estimate. This lower bound is quite robust, and the argument works as long as there is a Hunt process and the heat kernel exists. Note that the estimate is independent of the upper bound, so in general the two estimates may not coincide.

Proposition 3.15 (Lower bound) Let $B \subset X$ and $x \in B$. Then
$$p_{2t}(x,x) \ge \frac{P^x(\tau_B > t)^2}{\mu(B)}, \quad \forall t > 0.$$

Proof. Using the Chapman-Kolmogorov equation and the Schwarz inequality, we have
$$P^x(\tau_B > t)^2 \le P^x(Y_t \in B)^2 = \Big(\int_B p_t(x,y)\,d\mu(y)\Big)^2 \le \mu(B)\int_B p_t(x,y)^2\,d\mu(y) \le \mu(B)\,p_{2t}(x,x),$$
which gives the desired inequality. □

3.4 Strongly recurrent case
In this subsection, we restrict ourselves to the 'strongly recurrent' case and give sufficient conditions for precise on-diagonal upper and lower estimates of the heat kernel. (To be precise, Proposition 3.16 holds for general weighted graphs, but the assumption $F(R,\lambda)$ given in the other propositions holds only in the strongly recurrent case.)

Throughout this subsection, we fix a base point $0 \in X$ and let $D \ge 1$, $0 < \alpha \le 1$. As before, $d(\cdot,\cdot)$ is the graph distance, but we do not use any special property of the graph distance except in Remark 3.17. In fact, all the results in this subsection hold for any metric (not necessarily a geodesic metric) on $X$ without any change in the proofs.

Proposition 3.16 For $n \in \mathbb{N}$, let $f_n(x) = p_n(0,x) + p_{n+1}(0,x)$. Assume that $R_{\mathrm{eff}}(0,y) \le c_*\,d(0,y)^\alpha$ holds for all $y \in X$. Let $r \in \mathbb{N}$ and $n = 2[r^{D+\alpha}]$. Then
$$f_n(0) \le c_1\, n^{-D/(D+\alpha)}\Big(c_* \vee \frac{r^D}{V(0,r)}\Big). \tag{3.17}$$
In particular, if $c_2 r^D \le V(0,r)$, then $f_n(0) \le c_3\, n^{-D/(D+\alpha)}$.
Proof. First, note that, similarly to (1.7), we can easily check that
$$\mathcal{E}(f_n, f_n) = f_{2n}(0) - f_{2n+2}(0). \tag{3.18}$$

Choose $x_* \in B(0,r)$ such that $f_n(x_*) = \min_{x\in B(0,r)} f_n(x)$. Then
$$f_n(x_*)\,V(0,r) \le \sum_{x\in B(0,r)} f_n(x)\,\mu_x \le \sum_{x\in X} p_n(0,x)\,\mu_x + \sum_{x\in X} p_{n+1}(0,x)\,\mu_x \le 2,$$
so that $f_n(x_*) \le 2/V(0,r)$. Using Lemma 3.1 (iii), $R_{\mathrm{eff}}(0,y) \le c_*\,d(0,y)^\alpha$, and (3.18), we have
$$f_n(0)^2 \le 2f_n(x_*)^2 + 2|f_n(0)-f_n(x_*)|^2 \le \frac{8}{V(0,r)^2} + 2R_{\mathrm{eff}}(0,x_*)\,\mathcal{E}(f_n,f_n) \le \frac{8}{V(0,r)^2} + 2c_*\,d(0,x_*)^\alpha\,\big(f_{2n}(0)-f_{2n+2}(0)\big). \tag{3.19}$$
The spectral decomposition gives that $k \mapsto f_{2k}(0) - f_{2k+2}(0)$ is non-increasing. Thus
$$n\,(f_{2n}(0)-f_{2n+2}(0)) \le (2[n/2]+1)\,(f_{4[n/2]}(0)-f_{4[n/2]+2}(0)) \le 2\sum_{i=[n/2]}^{2[n/2]}(f_{2i}(0)-f_{2i+2}(0)) \le 2f_{2[n/2]}(0).$$
Since $n = 2[r^{D+\alpha}]$ is even, $f_{2[n/2]}(0) = f_n(0)$; putting this into (3.19), we have
$$f_n(0)^2 \le \frac{8}{V(0,r)^2} + \frac{4c_*\,r^\alpha f_n(0)}{n}.$$
Using $a + b \le 2(a \vee b)$, we have
$$f_n(0) \le c_1\Big(\frac{1}{V(0,r)} \vee \frac{c_*\,r^\alpha}{n}\Big). \tag{3.20}$$
Rewriting, we obtain (3.17). □
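The final rewriting step from (3.20) to (3.17) can be spelled out as follows (a routine computation; the constants are absorbed into $c_1$):

```latex
% Since n = 2[r^{D+\alpha}], we have n \asymp r^{D+\alpha}, and hence
% n^{-D/(D+\alpha)} \asymp r^{-D}.  Therefore
\frac{1}{V(0,r)}
   \;=\; \frac{r^{D}}{V(0,r)}\cdot r^{-D}
   \;\le\; c\,n^{-D/(D+\alpha)}\,\frac{r^{D}}{V(0,r)},
\qquad
\frac{c_*\,r^{\alpha}}{n}
   \;\le\; c\,c_*\,r^{\alpha-(D+\alpha)}
   \;=\; c\,c_*\,r^{-D}
   \;\le\; c'\,c_*\,n^{-D/(D+\alpha)},
% so the maximum of the two terms in (3.20) is bounded by
% c_1 n^{-D/(D+\alpha)} ( c_* \vee r^D/V(0,r) ), which is (3.17).
```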

Remark 3.17 (i) Putting $n = 2[r^\alpha V(0,r)]$ in (3.20), we have the following estimate:
$$f_{2[r^\alpha V(0,r)]}(0) \le \frac{c_1(1 \vee c_*)}{V(0,r)}. \tag{3.21}$$
(ii) When $\alpha = 1$, using Lemma 3.1 (i), we see that the assumption of Proposition 3.16 holds if $\inf_{x,y\in X:x\sim y}\mu_{xy} > 0$. So (3.17) gives another proof of Proposition 3.13.

In the following, we write $B_R = B(0,R)$. Let $\varepsilon = (3c_*\lambda)^{-1/\alpha}$. We say that the weighted graph $(X,\mu)$ satisfies $F(R,\lambda)$ (or simply that $F(R,\lambda)$ holds) if it satisfies the following estimates:

$$V(0,R) \le \lambda R^D, \qquad R_{\mathrm{eff}}(0,z) \le c_*\,d(0,z)^\alpha\ \ \forall z\in B_R, \qquad R_{\mathrm{eff}}(0,B_R^c) \ge \frac{R^\alpha}{\lambda}, \qquad V(0,\varepsilon R) \ge \frac{(\varepsilon R)^D}{\lambda}. \tag{3.22}$$
Proposition 3.18 In the following, we fix $R \ge 1$ and $\lambda > 0$.
(i) If $V(0,R) \le \lambda R^D$ and $R_{\mathrm{eff}}(0,z) \le c_*\,d(0,z)^\alpha$ for all $z \in B_R$, then the following holds:
$$E^x[\tau_{B_R}] \le 2c_*\lambda R^{D+\alpha}, \quad \forall x \in B_R. \tag{3.23}$$

(ii) If $F(R,\lambda)$ holds, then there exist $c_1 = c_1(c_*)$ and $q_0 > 0$ such that the following hold for all $x \in B(0,\varepsilon R)$:
$$E^x[\tau_{B_R}] \ge c_1\lambda^{-q_0}R^{D+\alpha}, \tag{3.24}$$
$$P^x(\tau_{B_R} > n) \ge \frac{c_1\lambda^{-q_0}R^{D+\alpha} - n}{2c_*\lambda R^{D+\alpha}}, \quad \forall n \ge 0. \tag{3.25}$$
Proof. (i) Using Lemma 3.3 (v) and the assumption, we have
$$E^x[\tau_{B_R}] \le R_{\mathrm{eff}}(x,B_R^c)\,\mu(B_R) \le \big(R_{\mathrm{eff}}(0,x) + R_{\mathrm{eff}}(0,B_R^c)\big)\mu(B_R) \le 2c_*\lambda R^{D+\alpha}, \quad \forall x \in B_R. \tag{3.26}$$

(ii) Denote $B_0 = B(0,\varepsilon R)$. By Lemma 1.15 and the assumption, we have, for $y \in B_0$,
$$P^y(\tau_{B_R} < \sigma_{\{0\}}) = P^y(\sigma_{B_R^c} < \sigma_{\{0\}}) \le \frac{R_{\mathrm{eff}}(y,\{0\})}{R_{\mathrm{eff}}(y,B_R^c)} \le \frac{R_{\mathrm{eff}}(0,y)}{R_{\mathrm{eff}}(0,B_R^c) - R_{\mathrm{eff}}(0,y)} \le \frac{c_*\,d(y,0)^\alpha}{R^\alpha/\lambda - c_*\,d(y,0)^\alpha} \le \frac{R^\alpha/(3\lambda)}{R^\alpha/\lambda - R^\alpha/(3\lambda)} = \frac12.$$
Applying this to (3.6) and using Lemma 3.3 (v), we have, for $y \in B_0$,
$$g_{B_R}(0,y) = g_{B_R}(0,0)\,P^y(\sigma_{\{0\}} < \tau_{B_R}) \ge \frac12 R_{\mathrm{eff}}(0,B_R^c) \ge \frac{R^\alpha}{2\lambda}.$$
Thus,
$$E^0[\tau_{B_R}] \ge \sum_{y\in B_0} g_{B_R}(0,y)\,\mu_y \ge \frac{R^\alpha}{2\lambda}\,\mu(B_0) \ge \frac{\varepsilon^D R^{D+\alpha}}{2\lambda^2}.$$
Further, for $x \in B_0$,
$$E^x[\tau_{B_R}] \ge P^x(\sigma_{\{0\}} < \tau_{B_R})\,E^0[\tau_{B_R}] \ge \frac12 E^0[\tau_{B_R}] \ge \frac{\varepsilon^D R^{D+\alpha}}{4\lambda^2},$$
so (3.24) is obtained. Next, by (3.23), (3.24), and the Markov property of $Y$, we have
$$c_1\lambda^{-q_0}R^{D+\alpha} \le E^x[\tau_{B_R}] \le n + E^x\big[1_{\{\tau_{B_R}>n\}}E^{Y_n}[\tau_{B_R}]\big] \le n + 2c_*\lambda R^{D+\alpha}\,P^x(\tau_{B_R}>n).$$
So (3.25) is obtained. □

Proposition 3.19 If $F(R,\lambda)$ holds, then there exist $c_1 = c_1(c_*)$ and $q_0, q_1 > 0$ such that the following holds for all $x \in B(0,\varepsilon R)$:
$$p_{2n}(x,x) \ge c_1\lambda^{-q_1}n^{-D/(D+\alpha)} \quad\text{for } \frac{c_{3.18.1}}{4\lambda^{q_0}}R^{D+\alpha} \le n \le \frac{c_{3.18.1}}{2\lambda^{q_0}}R^{D+\alpha},$$
where $c_{3.18.1}$ denotes the constant $c_1$ in Proposition 3.18.

Proof. Using Proposition 3.15 and (3.25), we have
$$p_{2n}(x,x) \ge \frac{P^x(\tau_{B_R}>n)^2}{\mu(B_R)} \ge \frac{\big(c_{3.18.1}/(4c_*\lambda^{q_0+1})\big)^2}{\lambda R^D} \ge c_1\lambda^{-q_1}n^{-D/(D+\alpha)},$$
for some $c_1, q_1 > 0$. □

Remark 3.20 The above results can be generalized as follows. Let $v, r : \mathbb{N} \to [0,\infty)$ be strictly increasing functions with $v(1) = r(1) = 1$ which satisfy
$$C_1^{-1}\Big(\frac{R}{R'}\Big)^{d_1} \le \frac{v(R)}{v(R')} \le C_1\Big(\frac{R}{R'}\Big)^{d_2}, \qquad C_2^{-1}\Big(\frac{R}{R'}\Big)^{\alpha_1} \le \frac{r(R)}{r(R')} \le C_2\Big(\frac{R}{R'}\Big)^{\alpha_2}$$
for all $1 \le R' \le R < \infty$, where $C_1, C_2 \ge 1$, $1 \le d_1 \le d_2$ and $0 < \alpha_1 \le \alpha_2 \le 1$. Assume that the following holds instead of (3.22), for a suitably chosen $\varepsilon > 0$:
$$V(0,R) \le \lambda v(R), \quad R_{\mathrm{eff}}(0,z) \le c_*\,r(d(0,z))\ \ \forall z\in B_R, \quad R_{\mathrm{eff}}(0,B_R^c) \ge \frac{r(R)}{\lambda}, \quad V(0,\varepsilon R) \ge \frac{v(\varepsilon R)}{\lambda},$$
and let $x \in B(0,\varepsilon R)$. Then the following estimates hold:
$$\frac{c_1}{\lambda^{q_1}\,v(\mathcal{I}(n))} \le p_{2n}(x,x) \le \frac{c_2\,\lambda^{q_2}}{v(\mathcal{I}(n))} \quad\text{for } \frac{c_0}{4\lambda^{q_0}}\,v(R)r(R) \le n \le \frac{c_0}{2\lambda^{q_0}}\,v(R)r(R),$$
where $\mathcal{I}(\cdot)$ is the inverse function of $(v\cdot r)(\cdot)$ (see [87] for details).

3.5 Applications to fractal graphs
In this subsection, we apply the estimates obtained in the previous subsection to fractal graphs.

2-dimensional pre-Sierpinski gasket
Let $V_0$ be the set of vertices of the pre-Sierpinski gasket (Figure 1) and define $V_n = 2^nV_0$. Let $a_n = (2^n, 0)$ and $b_n = (2^{n-1}, 2^{n-1}\sqrt3)$ be the vertices in $V_n$.
Lemma 3.21 It holds that $R_{\mathrm{eff}}(0,\{a_n,b_n\}) = \frac12\big(\frac53\big)^n$.
Proof. Let $p_n = P^0(\sigma_{\{a_n,b_n\}} < \sigma^+_{\{0\}})$. Let $z = (3/2, \sqrt3/2)$ and define $q_1 = P^z(\sigma_{\{a_1,b_1\}} < \sigma_{\{0\}})$. Then, by the Markov property, $p_1 = 4^{-1}(p_1 + q_1 + 1)$ and $q_1 = 2^{-1}(p_1 + 1)$. Solving these, we have $p_1 = 3/5$. Next, for the simple random walk $\{Y_k\}$ on the pre-Sierpinski gasket, define the induced random walk $\{Y^{(n)}_k\}$ on $V_n$ as follows:
$$\eta_0 = \min\{k\ge0 : Y_k \in V_n\}, \quad \eta_i = \min\{k > \eta_{i-1} : Y_k \in V_n\setminus\{Y_{\eta_{i-1}}\}\}, \quad Y^{(n)}_i = Y_{\eta_i}, \quad\text{for } i \in \mathbb{N}.$$

Then it can be easily seen that $\{Y^{(n)}_k\}$ is a simple random walk on $V_n$. Using this fact, we can inductively show $p_{n+1} = p_n \cdot p_1 = p_n\cdot(3/5) = \cdots = (3/5)^{n+1}$. We thus obtain the result by using Theorem 1.14. □
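The value $p_1 = 3/5$ can be checked by exact linear algebra on the six vertices of the level-1 triangle. Below is a self-contained sketch (the vertex labels and the code are ours, for illustration):

```python
from fractions import Fraction

# Level-1 triangle of the pre-Sierpinski gasket.  Vertex names are ours:
# O = 0, a = a_1, b = b_1, m1 = (1,0) and m2 = (1/2, sqrt(3)/2) the midpoints
# adjacent to 0, and z = (3/2, sqrt(3)/2).
edges = [("O", "m1"), ("O", "m2"), ("m1", "m2"), ("m1", "a"), ("m1", "z"),
         ("m2", "b"), ("m2", "z"), ("z", "a"), ("z", "b")]
nbr = {}
for u, v in edges:
    nbr.setdefault(u, []).append(v)
    nbr.setdefault(v, []).append(u)

# h(v) = P^v(hit {a,b} before O): h = 1 on {a,b}, h = 0 at O, and h is
# harmonic (the average over neighbours) at the interior vertices m1, m2, z.
unknown = ["m1", "m2", "z"]
idx = {v: i for i, v in enumerate(unknown)}
fixed = {"O": Fraction(0), "a": Fraction(1), "b": Fraction(1)}

A = [[Fraction(0)] * 3 for _ in range(3)]
rhs = [Fraction(0)] * 3
for v in unknown:                      # deg(v) h(v) - sum_{w~v} h(w) = data
    i = idx[v]
    A[i][i] = Fraction(len(nbr[v]))
    for w in nbr[v]:
        if w in idx:
            A[i][idx[w]] -= 1
        else:
            rhs[i] += fixed[w]

for c in range(3):                     # exact Gauss-Jordan elimination
    p = next(r for r in range(c, 3) if A[r][c] != 0)
    A[c], A[p] = A[p], A[c]
    rhs[c], rhs[p] = rhs[p], rhs[c]
    for r in range(3):
        if r != c and A[r][c] != 0:
            f = A[r][c] / A[c][c]
            A[r] = [x - f * y for x, y in zip(A[r], A[c])]
            rhs[r] -= f * rhs[c]
h = {v: rhs[idx[v]] / A[idx[v]][idx[v]] for v in unknown}

# From 0 the first step goes to m1 or m2 with probability 1/2 each.
p1 = (h["m1"] + h["m2"]) / 2
print(p1)  # 3/5
```

The same system also returns $q_1 = h(z) = 4/5$, in agreement with $q_1 = (1+p_1)/2$.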

Let $d_f = \log3/\log2$ and $d_w = \log5/\log2$.

Proposition 3.22 The following hold for all $R \ge 1$.
(i) $c_1R^{d_f} \le V(0,R) \le c_2R^{d_f}$.
(ii) $R_{\mathrm{eff}}(0,z) \le c_3\,d(0,z)^{d_w-d_f}$, $\forall z \in X$.
(iii) $R_{\mathrm{eff}}(0,B(0,R)^c) \ge c_4R^{d_w-d_f}$.

Proof. (i) Since there are $3^n$ triangles with side length 1 in $B(0,2^n)$, we see that $c_13^n \le V(0,2^n) \le c_23^n$. Next, for each $R$, take $n \in \mathbb{N}$ such that $2^{n-1} \le R < 2^n$. Then $c_13^{n-1} \le V(0,2^{n-1}) \le V(0,R) \le V(0,2^n) \le c_23^n$. Since $3 = 2^{d_f}$, we have the desired estimate.
(ii) We first prove the following:
$$c_3(5/3)^n \le R_{\mathrm{eff}}(0,a_n) \le c_4(5/3)^n. \tag{3.27}$$
Indeed, by the shorting law and Lemma 3.21, $\frac12(5/3)^n = R_{\mathrm{eff}}(0,\{a_n,b_n\}) \le R_{\mathrm{eff}}(0,a_n)$. On the other hand, let $\varphi^{(1)}_n$ and $\varphi^{(2)}_n$ be such that

$$\mathcal{E}(\varphi^{(1)}_n,\varphi^{(1)}_n) = R_{\mathrm{eff}}(0,a_n)^{-1} = R_{\mathrm{eff}}(0,b_n)^{-1} = \mathcal{E}(\varphi^{(2)}_n,\varphi^{(2)}_n).$$
Then, by symmetry, $\varphi^{(1)}_n(b_n) = \varphi^{(2)}_n(a_n) =: C$. So
$$\Big(\frac12\Big(\frac53\Big)^n\Big)^{-1} = R_{\mathrm{eff}}(0,\{a_n,b_n\})^{-1} \le \mathcal{E}\Big(\frac{\varphi^{(1)}_n+\varphi^{(2)}_n}{1+C},\ \frac{\varphi^{(1)}_n+\varphi^{(2)}_n}{1+C}\Big) \le \frac{2}{(1+C)^2}\Big\{\mathcal{E}(\varphi^{(1)}_n,\varphi^{(1)}_n) + \mathcal{E}(\varphi^{(2)}_n,\varphi^{(2)}_n)\Big\} = \frac{4}{(1+C)^2}\,R_{\mathrm{eff}}(0,a_n)^{-1}.$$
We thus obtain (3.27). Now for each $z \in X$, choose $n \ge 0$ such that $2^n \le d(0,z) < 2^{n+1}$. We can then take a sequence $z = z_0, z_1, \cdots, z_n$ such that $z_i \in V_i$, $d(z_i,z_{i+1}) \le 2^i$ for $i = 0,1,\cdots,n-1$ ($z_i = z_{i+1}$ is allowed) and $d(0,z_n) = 2^n$. Similarly to (3.27), we can show that $R_{\mathrm{eff}}(z_i,z_{i+1}) \le c_5(5/3)^i$. Thus, using the triangle inequality, we have

$$R_{\mathrm{eff}}(0,z) \le \sum_{i=0}^{n-1}R_{\mathrm{eff}}(z_i,z_{i+1}) + R_{\mathrm{eff}}(0,z_n) \le c_5\Big(\sum_{i=0}^{n-1}(5/3)^i + (5/3)^n\Big) \le c_6(5/3)^n \le c_7\,d(0,z)^{d_w-d_f}.$$
So the desired estimate is obtained.
(iii) For each $R$, take $n \in \mathbb{N}$ such that $2^{n-1} \le R < 2^n$. Then, by the shorting law and Lemma 3.21,
$$R_{\mathrm{eff}}(0,B(0,R)^c) \ge R_{\mathrm{eff}}(0,\{a_n,b_n\}) = \frac12\Big(\frac53\Big)^n = \frac12\,2^{n(d_w-d_f)} \ge c_8R^{d_w-d_f},$$
so we have the desired estimate. □
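Lemma 3.21 can also be verified by brute force: build the level-$n$ graph inside the triangle with corners $0, a_n, b_n$, ground $\{a_n,b_n\}$, and solve the Kirchhoff equations exactly. A sketch of ours follows; since that triangle meets the rest of the pre-gasket only at the grounded vertices $a_n, b_n$, this finite computation yields the same effective resistance.

```python
from fractions import Fraction as F

# Exact computation of R_eff(0, {a_n, b_n}) on the level-n gasket graph
# with unit resistance per edge.  Skewed exact coordinates: the point
# (x, y*sqrt(3)) is stored as the pair (x, y) of Fractions (an injective
# labelling; no distances are ever computed).

def gasket_edges(n):
    top = (F(2) ** (n - 1), F(2) ** (n - 1))     # label of b_n
    tris = [((F(0), F(0)), (F(2) ** n, F(0)), top)]
    for _ in range(n):                           # subdivide each triangle in 3
        new = []
        for p, q, r in tris:
            mpq = ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)
            mpr = ((p[0] + r[0]) / 2, (p[1] + r[1]) / 2)
            mqr = ((q[0] + r[0]) / 2, (q[1] + r[1]) / 2)
            new += [(p, mpq, mpr), (mpq, q, mqr), (mpr, mqr, r)]
        tris = new
    edges = set()
    for p, q, r in tris:
        for e in ((p, q), (p, r), (q, r)):
            edges.add(tuple(sorted(e)))
    return edges

def effective_resistance(n):
    edges = gasket_edges(n)
    O = (F(0), F(0))
    grounded = {(F(2) ** n, F(0)), (F(2) ** (n - 1), F(2) ** (n - 1))}
    unknown = sorted({v for e in edges for v in e} - grounded)
    idx = {v: i for i, v in enumerate(unknown)}
    m = len(unknown)
    A = [[F(0)] * m for _ in range(m)]           # grounded Laplacian
    rhs = [F(0)] * m
    for u, v in edges:
        for x, y in ((u, v), (v, u)):
            if x in idx:
                A[idx[x]][idx[x]] += 1
                if y in idx:
                    A[idx[x]][idx[y]] -= 1
    rhs[idx[O]] = F(1)                           # unit current injected at 0
    for c in range(m):                           # exact Gauss-Jordan
        p = next(r for r in range(c, m) if A[r][c] != 0)
        A[c], A[p] = A[p], A[c]
        rhs[c], rhs[p] = rhs[p], rhs[c]
        for r in range(m):
            if r != c and A[r][c] != 0:
                f = A[r][c] / A[c][c]
                A[r] = [x - f * y for x, y in zip(A[r], A[c])]
                rhs[r] -= f * rhs[c]
    return rhs[idx[O]] / A[idx[O]][idx[O]]       # potential at 0 = R_eff

for n in (1, 2, 3):
    print(n, effective_resistance(n))  # (1/2)*(5/3)^n: 5/6, 25/18, 125/54
```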

By Propositions 3.16, 3.19 and 3.22, we see that the following heat kernel estimate holds for the simple random walk on the 2-dimensional pre-Sierpinski gasket:

$$c_1 n^{-d_f/d_w} \le p_{2n}(0,0) \le c_2 n^{-d_f/d_w}, \quad \forall n \ge 1. \tag{3.28}$$
Vicsek sets

Figure 2: 2-dimensional Vicsek set

We next consider the Vicsek set (Figure 2). Noting that this graph is loop-free, we can apply Lemma 3.1 (ii), and Lemma 3.1 (i) holds trivially. So, by a similar proof, we can obtain Proposition 3.22 with $d_f = \log5/\log3$ and $d_w = d_f + 1$, so (3.28) holds. This shows that the estimate in Proposition 3.13 is in general the best possible when $D = \log5/\log3$. By considering the generalization of the Vicsek set in $\mathbb{R}^d$, we can see that the same is true when $D = \log(1+2^d)/\log3$ (a sequence that goes to infinity as $d \to \infty$). In fact, it is proved in [14, Theorem 5.1] that the estimate in Proposition 3.13 is in general the best possible for any $D \ge 1$.
Remark 3.23 It is known that the following heat kernel estimate holds for the simple random walk on the pre-Sierpinski gasket, on Vicsek sets, and in general on the so-called nested fractal graphs (see [65, 72]):

$$c_1 n^{-d_f/d_w}\exp\Big(-c_2\Big(\frac{d(x,y)^{d_w}}{n}\Big)^{\frac{1}{d_w-1}}\Big) \le p_n(x,y)+p_{n+1}(x,y) \le c_3 n^{-d_f/d_w}\exp\Big(-c_4\Big(\frac{d(x,y)^{d_w}}{n}\Big)^{\frac{1}{d_w-1}}\Big),$$
for all $x,y \in X$ and $n \ge d(x,y)$. (Note that $p_n(x,y) = 0$ if $n < d(x,y)$.)

4 Heat kernel estimates for random weighted graphs

From this section on, we consider the situation where we have a random weighted graph $\{(X(\omega),\mu^\omega) : \omega \in \Omega\}$ on a probability space $(\Omega,\mathcal{F},\mathbb{P})$. We assume that, for each $\omega \in \Omega$, the graph $X(\omega)$ is infinite, locally finite and connected, and contains a marked vertex $0 \in X(\omega)$. We denote balls in $X(\omega)$ by $B_\omega(x,r)$, their volume by $V_\omega(x,r)$, and write

$$B(R) = B_\omega(R) = B_\omega(0,R), \qquad V(R) = V_\omega(R) = V_\omega(0,R).$$

We write $Y = (Y_n,\ n\ge0,\ P^x_\omega,\ x\in X(\omega))$ for the Markov chain on $X(\omega)$, and denote by $p^\omega_n(x,y)$ its transition density with respect to $\mu^\omega$. Let

$$\tau_R = \tau_{B(0,R)} = \min\{n \ge 0 : Y_n \notin B(0,R)\}.$$
4.1 Random walk on a random graph

Recall that $F(R,\lambda)$ is defined in (3.22). We have the following quenched estimates.

Theorem 4.1 Let $R_0, \lambda_0 \ge 1$. Assume that there exists $p(\lambda) \ge 0$ with $p(\lambda) \le c_1\lambda^{-q_0}$ for some $q_0, c_1 > 0$ such that for each $R \ge R_0$ and $\lambda \ge \lambda_0$,
$$\mathbb{P}\big(\{\omega : (X(\omega),\mu^\omega)\ \text{satisfies}\ F(R,\lambda)\}\big) \ge 1 - p(\lambda). \tag{4.1}$$

Then there exist $\alpha_1, \alpha_2 > 0$ and $\Omega_0 \subset \Omega$ with $\mathbb{P}(\Omega_0) = 1$ such that the following holds: for all $\omega \in \Omega_0$ and $x \in X(\omega)$, there exist $N_x(\omega), R_x(\omega) \in \mathbb{N}$ such that
$$(\log n)^{-\alpha_1}n^{-D/(D+\alpha)} \le p^\omega_{2n}(x,x) \le (\log n)^{\alpha_1}n^{-D/(D+\alpha)}, \quad \forall n \ge N_x(\omega), \tag{4.2}$$
$$(\log R)^{-\alpha_2}R^{D+\alpha} \le E^x_\omega\tau_R \le (\log R)^{\alpha_2}R^{D+\alpha}, \quad \forall R \ge R_x(\omega). \tag{4.3}$$
Further, if (4.1) holds with $p(\lambda) \le \exp(-c_2\lambda^{q_0})$ for some $q_0, c_2 > 0$, then (4.2) and (4.3) hold with $\log\log n$ (resp. $\log\log R$) in place of $\log n$ (resp. $\log R$).

Proof. We will take $\Omega_0 = \Omega_1\cap\Omega_2$, where $\Omega_1$ and $\Omega_2$ are defined below.
First we prove (4.2). Write $w(n) = p^\omega_{2n}(0,0)$. By Propositions 3.16 and 3.19, we have, taking $n = [c_1(\lambda)R^{D+\alpha}]$,
$$\mathbb{P}\big((c_1\lambda^{q_1})^{-1}n^{-D/(D+\alpha)} \le w(n) \le c_1\lambda^{q_1}n^{-D/(D+\alpha)}\big) \ge 1 - 2p(\lambda). \tag{4.4}$$
Let $n_k = [e^k]$ and $\lambda_k = k^{2/q_0}$ (by choosing $R$ suitably). Then, since $\sum_k p(\lambda_k) < \infty$, by the Borel–Cantelli lemma there exists $K_0(\omega)$ with $\mathbb{P}(K_0 < \infty) = 1$ such that
$$c_1^{-1}k^{-2q_1/q_0}\,n_k^{-D/(D+\alpha)} \le w(n_k) \le c_1k^{2q_1/q_0}\,n_k^{-D/(D+\alpha)} \quad\text{for all } k \ge K_0(\omega).$$
Let $\Omega_1 = \{K_0 < \infty\}$. For $k \ge K_0$ we therefore have

$$c_2^{-1}(\log n_k)^{-2q_1/q_0}\,n_k^{-D/(D+\alpha)} \le w(n_k) \le c_2(\log n_k)^{2q_1/q_0}\,n_k^{-D/(D+\alpha)},$$
so that (4.2) holds along the subsequence $n_k$. The spectral decomposition gives that $p^\omega_{2n}(0,0)$ is monotone decreasing in $n$. So, if $n > N_0 = e^{K_0}+1$, let $k \ge K_0$ be such that $n_k \le n < n_{k+1}$. Then
$$w(n) \le w(n_k) \le c_2(\log n_k)^{2q_1/q_0}\,n_k^{-D/(D+\alpha)} \le 2e\,c_2(\log n)^{2q_1/q_0}\,n^{-D/(D+\alpha)}.$$
Similarly, $w(n) \ge w(n_{k+1}) \ge c_3^{-1}n^{-D/(D+\alpha)}(\log n)^{-2q_1/q_0}$. Taking $q_2 > 2q_1/q_0$, so that the constants $c_2, c_3$ can be absorbed into the $\log n$ term, we obtain (4.2) for $x = 0$.
If $x, y \in X(\omega)$ and $k = d_\omega(x,y)$, then, using the Chapman–Kolmogorov equation,
$$p^\omega_{2n}(x,x)\,\big(p^\omega_k(x,y)\,\mu_x(\omega)\big)^2 \le p^\omega_{2n+2k}(y,y).$$
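The Chapman–Kolmogorov comparison used here can be derived by writing out the semigroup twice, keeping only the $u = v = x$ term (all terms are nonnegative), and using the symmetry $p^\omega_k(y,x) = p^\omega_k(x,y)$:

```latex
p^{\omega}_{2n+2k}(y,y)
  \;=\; \sum_{u,v\in X(\omega)}
        p^{\omega}_{k}(y,u)\,\mu_{u}\;p^{\omega}_{2n}(u,v)\,\mu_{v}\;p^{\omega}_{k}(v,y)
  \;\ge\; p^{\omega}_{k}(y,x)\,\mu_{x}\;p^{\omega}_{2n}(x,x)\,\mu_{x}\;p^{\omega}_{k}(x,y)
  \;=\; p^{\omega}_{2n}(x,x)\,\bigl(p^{\omega}_{k}(x,y)\,\mu_{x}\bigr)^{2}.
```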

Let $\omega \in \Omega_1$ and $x \in X(\omega)$, write $k = d_\omega(0,x)$ and $h_\omega(0,x) = (p^\omega_k(x,0)\,\mu_x(\omega))^{-2}$, and let $n \ge N_0(\omega)+2k$. Then
$$p^\omega_{2n}(x,x) \le h_\omega(0,x)\,p^\omega_{2n+2k}(0,0) \le h_\omega(0,x)\,\frac{(\log(n+k))^{q_2}}{(n+k)^{D/(D+\alpha)}} \le h_\omega(0,x)\,\frac{(\log(2n))^{q_2}}{n^{D/(D+\alpha)}} \le \frac{(\log n)^{1+q_2}}{n^{D/(D+\alpha)}},$$
provided $\log n \ge 2^{q_2}h_\omega(0,x)$. Taking

$$N_x(\omega) = \exp\big(2^{q_2}h_\omega(0,x)\big) + 2d_\omega(0,x) + N_0(\omega), \tag{4.5}$$
and $\alpha_1 = 1+q_2$, this gives the upper bound in (4.2). The lower bound is obtained in the same way.
Next we prove (4.3). Write $F(R) = E^x_\omega\tau_R$. By (3.23) and (3.24), we have
$$\mathbb{P}\big(c_1\lambda^{-q_0}R^{D+\alpha} \le F(R) \le 2c_2\lambda R^{D+\alpha},\ \forall x \in B(0,\varepsilon R)\big) \ge 1 - p(\lambda). \tag{4.6}$$

Let $R_n = e^n$ and $\lambda_n = n^{2/q_0}$, and let $F_n$ be the event on the left hand side of (4.6) when $R = R_n$ and $\lambda = \lambda_n$. Then we have $\mathbb{P}(F_n^c) \le p(\lambda_n) \le c\,n^{-2}$, so by Borel–Cantelli, if $\Omega_2 = \liminf_n F_n$, then $\mathbb{P}(\Omega_2) = 1$. Hence there exist $M_0$ with $M_0(\omega) < \infty$ on $\Omega_2$, and $c_3, q_3 > 0$, such that for $\omega \in \Omega_2$ and $x \in X(\omega)$,
$$(c_3\lambda_n^{q_3})^{-1} \le \frac{F(R_n)}{R_n^{D+\alpha}} \le c_3\lambda_n^{q_3}, \tag{4.7}$$
provided $n \ge M_0(\omega)$ and $n$ is also large enough so that $x \in B(0,\varepsilon_{\lambda_n}R_n)$. Writing $M_x(\omega)$ for the smallest such $n$,

$$c_3^{-1}(\log R_n)^{-2q_3/q_0}R_n^{D+\alpha} \le F(R_n) \le c_3(\log R_n)^{2q_3/q_0}R_n^{D+\alpha}, \quad\text{for all } n \ge M_x(\omega).$$
As $F(R)$ is monotone increasing, the same argument as in the proof of (4.2) above enables us to replace $F(R_n)$ by $F(R)$, for all $R \ge R_x = 1+e^{M_x}$. Taking $\alpha_2 > 2q_3/q_0$, we obtain (4.3).
The case $p(\lambda) \le \exp(-c_2\lambda^{q_0})$ can be proved similarly with the following changes: take $\lambda_k = (e+(2/c_2)\log k)^{1/q_0}$ instead of $\lambda_k = k^{2/q_0}$, and take $N_x(\omega) = \exp(\exp(Ch_\omega(0,x))) + 2d_\omega(0,x) + N_0(\omega)$ in (4.5). Then $\log n$ (resp. $\log n_k$, $\log R_n$) in the above proof is changed to $\log\log n$ (resp. $\log\log n_k$, $\log\log R_n$), and the proof goes through. □

We also have the following annealed estimates. Note that there are no log terms in them.

Proposition 4.2 Suppose (4.1) holds for some $p(\lambda) \ge 0$ which goes to 0 as $\lambda \to \infty$, and suppose that the following holds:
$$\mathbb{E}\big[R_{\mathrm{eff}}(0,B(R)^c)\,V(R)\big] \le c_1R^{D+\alpha}, \quad \forall R \ge 1. \tag{4.8}$$
Then

$$c_2R^{D+\alpha} \le \mathbb{E}[E^0_\omega\tau_R] \le c_3R^{D+\alpha}, \quad \forall R \ge 1, \tag{4.9}$$
$$c_4n^{-D/(D+\alpha)} \le \mathbb{E}[p^\omega_{2n}(0,0)], \quad \forall n \ge 1. \tag{4.10}$$

Suppose in addition that there exist $c_5 > 0$, $\lambda_0 > 1$ and $q_0 > 2$ such that
$$\mathbb{P}\big(\lambda^{-1}R^D \le V(R),\ R_{\mathrm{eff}}(0,y) \le \lambda\,d(0,y)^\alpha,\ \forall y \in B(R)\big) \ge 1 - \frac{c_5}{\lambda^{q_0}}, \tag{4.11}$$
for each $R \ge 1$ and $\lambda \ge \lambda_0$. Then
$$\mathbb{E}[p^\omega_{2n}(0,0)] \le c_6n^{-D/(D+\alpha)}, \quad \forall n \ge 1. \tag{4.12}$$

Proof. We begin with the upper bound in (4.9). By (3.26) and (4.8),
$$\mathbb{E}[E^0_\omega\tau_R] \le \mathbb{E}\big[R_{\mathrm{eff}}(0,B(R)^c)\,V(R)\big] \le c_1R^{D+\alpha}.$$

For the lower bounds, it is sufficient to find a set $F \subset \Omega$ of 'good' graphs with $\mathbb{P}(F) \ge c_2 > 0$ such that, for all $\omega \in F$, we have suitable lower bounds on $E^0_\omega\tau_R$ or $p^\omega_{2n}(0,0)$. We assume that $R$ is large enough so that $\varepsilon_{\lambda_0}R \ge 1$, where $\lambda_0$ is chosen large enough so that $p(\lambda_0) < 1/4$. We can then obtain the results for all $n$ (chosen below to depend on $R$) and $R$ by adjusting the constants $c_2, c_4$ in (4.9) and (4.10).
Let $F$ be the event on the left hand side of (4.6) with $\lambda = \lambda_0$. Then $\mathbb{P}(F) \ge \frac34$, and for $\omega \in F$, $E^0_\omega\tau_R \ge c_3\lambda_0^{-q_0}R^{D+\alpha}$. So,
$$\mathbb{E}[E^0_\omega\tau_R] \ge \mathbb{E}[E^0_\omega\tau_R\,;\,F] \ge c_3\lambda_0^{-q_0}R^{D+\alpha}\cdot\mathbb{P}(F) \ge \frac{3c_3}{4}\,\lambda_0^{-q_0}R^{D+\alpha}.$$

Also, by (4.4), if $n = [c_4(\lambda_0)R^{D+\alpha}]$, then $p^\omega_{2n}(0,0) \ge c_5\lambda_0^{-q_1}n^{-D/(D+\alpha)}$ on the event on the left hand side of (4.4). So, given $n \in \mathbb{N}$, choose $R$ so that $n = [c_4(\lambda_0)R^{D+\alpha}]$ and let $F$ be the event on the left hand side of (4.4). Then
$$\mathbb{E}\,p^{\cdot}_{2n}(0,0) \ge \mathbb{P}(F)\,c_5\lambda_0^{-q_1}n^{-D/(D+\alpha)} \ge \frac{3c_5}{4}\,\lambda_0^{-q_1}n^{-D/(D+\alpha)},$$
giving the lower bound in (4.10).

Finally we prove (4.12). Let $H_k$ be the event on the left hand side of (4.11) with $\lambda = k$. By (3.17), we see that $p^\omega_{2n}(0,0) \le c_6kn^{-D/(D+\alpha)}$ if $\omega \in H_k$, where $R$ is chosen to satisfy $n = 2[R^{D+\alpha}]$. Since $\mathbb{P}(\cup_kH_k) = 1$, using (4.11) we have
$$\mathbb{E}\,p^\omega_{2n}(0,0) \le \sum_k c_6(k+1)\,n^{-D/(D+\alpha)}\,\mathbb{P}(H_{k+1}\setminus H_k) \le \sum_k c_6(k+1)\,n^{-D/(D+\alpha)}\,\mathbb{P}(H_k^c) \le c_7n^{-D/(D+\alpha)}\sum_k(k+1)k^{-q_0} \le c_8\,n^{-D/(D+\alpha)},$$
since $q_0 > 2$ makes the sum finite. We thus obtain (4.12). □

Remark 4.3 With some extra effort, one can obtain quenched estimates of $\max_{0\le k\le n}d(0,Y_k)$ and of $\mu(W_n)$, where $W_n = \{Y_0, Y_1, \cdots, Y_n\}$ is the range of the random walk, as well as an annealed lower bound for $E^0_\omega d(0,Y_n)$. See [87, (1.23), (1.28), (1.31)] and [18, (1.16), (1.20), (1.23)].

4.2 The IIC and the Alexander-Orbach conjecture
The problem of random walk on a percolation cluster, the 'ant in the labyrinth' [57], has received much attention both in the physics and the mathematics literature. From the next section on, we will consider random walk on percolation clusters, so $X(\omega)$ will be a percolation cluster.
Let us first recall the bond percolation model on the lattice $\mathbb{Z}^d$: each bond is open with probability $p \in (0,1)$, independently of all the others. Let $\mathcal{C}(x)$ be the open cluster containing $x$; then, if $\theta(p) = \mathbb{P}_p(|\mathcal{C}(x)| = +\infty)$, it is well known (see [61]) that there exists $p_c = p_c(d)$ such that $\theta(p) = 0$ if $p < p_c$ and $\theta(p) > 0$ if $p > p_c$.
If $d = 2$ or $d \ge 19$ (or $d > 6$ for the spread-out models mentioned below), it is known (see for example [61, 106]) that $\theta(p_c) = 0$, and it is conjectured that this holds for all $d \ge 2$. At the critical probability $p = p_c$, it is believed that in any box of side $n$ there exist, with high probability, open clusters of diameter of order $n$. For large $n$ the local properties of these large finite clusters can, in certain circumstances, be captured by regarding them as subsets of an infinite cluster $\mathcal{G}$, called the incipient infinite cluster (IIC for short). This IIC $\mathcal{G} = \mathcal{G}(\omega)$ is our random weighted graph $X(\omega)$.
The IIC was constructed for $d = 2$ in [76], by taking the limit as $N \to \infty$ of the cluster $\mathcal{C}(0)$ conditioned to intersect the boundary of a box of side $N$ centered at the origin. For large $d$ a construction of the IIC in $\mathbb{Z}^d$ is given in [69], using the lace expansion. It is believed that the results there hold for any $d > 6$. [69] also gives the existence and some properties of the IIC for all $d > 6$ for spread-out models: these include the case when there is a bond between $x$ and $y$ with probability $pL^{-d}$ whenever $y$ is in a cube of side $L$ with center $x$, and the parameter $L$ is large enough. We write $\mathcal{G}_d$ for the IIC in $\mathbb{Z}^d$. It is believed that the global properties of $\mathcal{G}_d$ are the same for all $d > d_c$, both for nearest-neighbor and spread-out models.
(Here $d_c$ is the critical dimension, which is 6 for the percolation model.) In [69] it is proved for spread-out models that $\mathcal{G}_d$ has one end, i.e. any two paths from 0 to infinity in $\mathcal{G}_d$ intersect infinitely often. See [106] for a summary of the high-dimensional results.
Let $Y = \{Y_t\}_{t\ge0}$ be the (continuous time) simple random walk on $\mathcal{G}_d = \mathcal{G}_d(\omega)$, and let $q^\omega_t(x,y)$ be its heat kernel. Define the spectral dimension of $\mathcal{G}_d$ by
$$d_s(\mathcal{G}_d) = -2\lim_{t\to\infty}\frac{\log q^\omega_t(x,x)}{\log t}$$
(if this limit exists). Alexander and Orbach [7] conjectured that, for any $d \ge 2$, $d_s(\mathcal{G}_d) = 4/3$. While it is now thought that this is unlikely to be true for small $d$ (see [71, Section 7.4]), the results on the geometry of $\mathcal{G}_d$ for spread-out models in [69] are consistent with this holding for $d$ above the critical dimension.
Recently, it has been proved that the Alexander-Orbach conjecture holds for the random walk on the IIC on a tree ([19]), on a high-dimensional oriented percolation cluster ([18]), and on a high-dimensional percolation cluster ([86]). In all cases, one applies Theorem 4.1; namely, one verifies (4.1) with $D = 2$, $\alpha = 1$ to prove the Alexander-Orbach conjecture. We will discuss the details in the next two sections.
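For the record, the arithmetic behind the value $4/3$: once (4.1) is verified with $D = 2$ and $\alpha = 1$, (4.2) gives $q^\omega_t(0,0) = t^{-D/(D+\alpha)+o(1)} = t^{-2/3+o(1)}$, so that

```latex
d_s(\mathcal{G}_d)
  \;=\; -2\lim_{t\to\infty}\frac{\log q^{\omega}_{t}(0,0)}{\log t}
  \;=\; -2\cdot\Bigl(-\frac{D}{D+\alpha}\Bigr)
  \;=\; 2\cdot\frac{2}{3}
  \;=\; \frac{4}{3}.
```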

5 The Alexander-Orbach conjecture holds when two-point functions behave nicely

This section is based on the paper by Kozma-Nachmias ([86]), with some simplifications from [101].

5.1 The model and main theorems
We write $x \leftrightarrow y$ if $x$ is connected to $y$ by an open path. We write $x \overset{r}{\leftrightarrow} y$ if there is an open path of length less than or equal to $r$ that connects $x$ and $y$.

Definition 5.1 Let $(X,E)$ be an infinite graph.
(i) A bijection $f : X \to X$ is called a graph automorphism of $X$ if $\{f(u),f(v)\} \in E$ if and only if $\{u,v\} \in E$. Denote the set of all automorphisms of $X$ by $\mathrm{Aut}(X)$.
(ii) $(X,E)$ is said to be transitive if for any $u,v \in X$ there exists $\gamma \in \mathrm{Aut}(X)$ such that $\gamma(u) = v$.
(iii) For each $x \in X$, define the stabilizer of $x$ by
$$S(x) = \{\gamma \in \mathrm{Aut}(X) : \gamma(x) = x\}.$$
(iv) A transitive graph $X$ is unimodular if for each $x,y \in X$,
$$|\{\gamma(y) : \gamma \in S(x)\}| = |\{\gamma(x) : \gamma \in S(y)\}|.$$
One of the important properties of unimodular transitive graphs is that they satisfy the following:

$$\sum_{\substack{x,y,z\in X\\ d(0,z)=2K}}\mathbb{P}(x\leftrightarrow 0 \nleftrightarrow z\leftrightarrow y) = \sum_{\substack{x,y,z\in X\\ d(x,z)=2K}}\mathbb{P}(0\leftrightarrow x \nleftrightarrow z\leftrightarrow y), \tag{5.1}$$
for each $K \in \mathbb{N}$. This can be verified on $\mathbb{Z}^d$ using translation invariance, and it can be extended to unimodular transitive graphs using the mass transport technique (see, for example, [102, (3.14)]).
Let $(X,\mu)$ be a unimodular transitive weighted graph that has controlled weights, with weight 1 on each bond, i.e. $\mu_{xy} = 1$ for $x \sim y$. In this section, allowing some abuse of notation, we denote $B(x,r) = \{y \in X : d(x,y) \le r\}$ and set $\partial A = \{x \in A : \exists z \in A^c \text{ such that } z \sim x\}$.
We fix a root $0 \in X$ as before. (Note that, since the graph is unimodular and transitive, the results in this section are independent of the choice of $0 \in X$.) Consider bond percolation on $X$ and let $p_c = p_c(X)$ be its critical probability, namely

$$\mathbb{P}_p(\text{there exists an infinite open cluster}) > 0\ \text{ for } p > p_c(X), \qquad \mathbb{P}_p(\text{there exists an infinite open cluster}) = 0\ \text{ for } p < p_c(X).$$

In this section, we consider the case $p = p_c$ and write $\mathbb{P} := \mathbb{P}_{p_c}$. Throughout this section, we assume that the following limit exists (independently of how $d(0,x) \to \infty$):

$$\mathbb{P}_{\mathrm{IIC}}(F) = \lim_{d(0,x)\to\infty}\mathbb{P}(F \mid 0 \leftrightarrow x) \tag{5.2}$$

for any cylinder event $F$ (i.e., an event that depends only on the status of a finite number of edges). We denote the realization of the IIC by $\mathcal{G}(\omega)$. For critical bond percolation on $\mathbb{Z}^d$, this is proved in [69] for large $d$ using the lace expansion.
We consider the following two conditions. The first is the triangle condition of Aizenman-Newman ([3]), which indicates mean-field behavior of the model:

$$\sum_{x,y\in X}\mathbb{P}(0\leftrightarrow x)\,\mathbb{P}(x\leftrightarrow y)\,\mathbb{P}(y\leftrightarrow 0) < \infty. \tag{5.3}$$
Note that the value of the left hand side of (5.3) does not change if 0 is replaced by any $v \in X$, because $X$ is unimodular and transitive. The second is the following condition on the two-point function: there exist $c_0, c_1, c_2 > 0$ and a decreasing function $\psi : \mathbb{R}_+ \to \mathbb{R}_+$ with $\psi(r) \le c_0\,\psi(2r)$ for all $r > 0$ such that
$$c_1\,\psi(d(0,x)) \le \mathbb{P}(0\leftrightarrow x) \le c_2\,\psi(d(0,x)), \quad \forall x \in X \text{ with } x \ne 0, \tag{5.4}$$
where $d(\cdot,\cdot)$ is the original graph distance on $X$. Because $X$ is transitive, we have $\mathbb{P}(y\leftrightarrow z) = \mathbb{P}(0\leftrightarrow x)$ whenever $d(0,x) = d(y,z)$.

Since $X$ is unimodular and transitive, we can deduce the following from the two conditions above.

Lemma 5.2 (i) (5.3) implies the following open triangle condition:
$$\lim_{K\to\infty}\ \sup_{w:\,d(0,w)\ge K}\ \sum_{x,y\in X}\mathbb{P}(0\leftrightarrow x)\,\mathbb{P}(x\leftrightarrow y)\,\mathbb{P}(y\leftrightarrow w) = 0. \tag{5.5}$$

(ii) (5.5) implies the following estimates: there exist $C_1, C_2, C_3 > 0$ such that
$$\mathbb{P}(|\mathcal{C}(0)| > n) \le C_1n^{-1/2}, \quad \forall n \ge 1, \tag{5.6}$$
$$C_2(p_c-p)^{-1} \le \mathbb{E}_p[|\mathcal{C}(0)|] \le C_3(p_c-p)^{-1}, \quad \forall p < p_c. \tag{5.7}$$

Proof. (i) This is proved in [83] (see [23, Lemma 2.1] for $\mathbb{Z}^d$).
(ii) (5.5) implies (5.7) (see [3, Proposition 3.2] for $\mathbb{Z}^d$ and [102, Page 291] for general unimodular transitive graphs), and the following for $h > 0$ small (see [23, (1.13)] for $\mathbb{Z}^d$ and [102, Page 292] for general unimodular transitive graphs):
$$c_1h^{1/2} \le \sum_{j=1}^\infty\mathbb{P}(|\mathcal{C}(0)| = j)\,(1-e^{-jh}) \le c_2h^{1/2}.$$
Taking $h = 1/n$, we have $\mathbb{P}(|\mathcal{C}(0)| > n) \le C_1n^{-1/2}$. □
For critical bond percolation on $\mathbb{Z}^d$, (5.4) with $\psi(r) = r^{2-d}$ was obtained by Hara, van der Hofstad and Slade [67] for the spread-out model and $d > 6$, and by Hara [66] for the nearest-neighbor

model with $d \ge 19$, using the lace expansion. (They obtained the correct asymptotic behavior, including the constant.) Given (5.4) with $\psi(r) = r^{2-d}$ for $d > 6$, it is easy to check that (5.3) holds as well.
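The last step of the proof of Lemma 5.2 (ii), namely deriving (5.6) from the generating-function bound, uses only the terms with $j > n$, for which $1 - e^{-j/n} \ge 1 - e^{-1}$:

```latex
\mathbb{P}(|\mathcal{C}(0)| > n)\,(1 - e^{-1})
  \;\le\; \sum_{j>n}\mathbb{P}(|\mathcal{C}(0)| = j)\,\bigl(1-e^{-j/n}\bigr)
  \;\le\; \sum_{j=1}^{\infty}\mathbb{P}(|\mathcal{C}(0)| = j)\,\bigl(1-e^{-j/n}\bigr)
  \;\le\; c_2\,n^{-1/2},
% which is (5.6) with C_1 = c_2/(1 - e^{-1}).
```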

For each subgraph $G$ of $X$, write $G_p$ for the result of $p$-bond percolation on $G$. Define $B(x,r;G) = \{z : d_{G_{p_c}}(x,z) \le r\}$, where $d_{G_{p_c}}(x,z)$ is the length of the shortest path between $x$ and $z$ in $G_{p_c}$ (it is $\infty$ if there is no such path). Note that $p_c = p_c(X)$. Now define
$$H(r;G) = \{\partial B(0,r;G) \ne \emptyset\}, \qquad \Gamma(r) = \sup_{G\subset X}\mathbb{P}(H(r;G)).$$
Note that

$$\Gamma(r) = \sup_{G\subset X}\mathbb{P}\big(\partial B(v,r;G) \ne \emptyset\big), \qquad \mathbb{E}[|B(0,r)|] = \mathbb{E}[|B(v,r)|], \qquad \forall v \in X, \tag{5.8}$$
since $X$ is transitive. The following two propositions play a key role.
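The objects $B(0,r;G)$ and $H(r;G)$ are straightforward to compute on a finite sample. Below is a minimal sketch for bond percolation on a box in $\mathbb{Z}^2$; the parameters ($N$, $p$, the seed) are ours, for illustration only:

```python
import random
from collections import deque

def chemical_distances(N=20, p=0.5, seed=1):
    """Sample p-bond percolation on the box [-N,N]^2 and return the chemical
    (shortest open path) distance from 0 to every vertex of its cluster."""
    rng = random.Random(seed)
    status = {}
    def is_open(e):                 # sample each edge lazily, exactly once
        if e not in status:
            status[e] = rng.random() < p
        return status[e]
    dist = {(0, 0): 0}              # BFS computes shortest open-path distances
    queue = deque([(0, 0)])
    while queue:
        x, y = queue.popleft()
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if max(abs(nxt[0]), abs(nxt[1])) <= N and nxt not in dist \
                    and is_open(frozenset(((x, y), nxt))):
                dist[nxt] = dist[(x, y)] + 1
                queue.append(nxt)
    return dist

dist = chemical_distances()
# B(0, r; G) as a set of vertices, for r = 0, ..., 10; the balls are nested
# and B(0, 0; G) = {0}.
balls = [{v for v, d in dist.items() if d <= r} for r in range(11)]
print([len(b) for b in balls])
```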

Proposition 5.3 Assume that the triangle condition (5.3) holds. Then there exists a constant $C > 0$ such that the following hold for all $r \ge 1$: i) $\mathbb{E}[|B(0,r)|] \le Cr$; ii) $\Gamma(r) \le C/r$.
Proposition 5.4 Assume that (5.4) and i), ii) in Proposition 5.3 hold. Then (4.1) in Theorem 4.1 holds for $\mathbb{P}_{\mathrm{IIC}}$ with $p(\lambda) = C\lambda^{-1/2}$, $D = 2$ and $\alpha = 1$.

Consider the simple random walk on the IIC and let $p^\omega_n(\cdot,\cdot)$ be its heat kernel. Combining the above two propositions with Theorem 4.1, we can derive the following theorem.
Theorem 5.5 Assume that (5.3) and (5.4) hold. Then there exist $\alpha_1 > 0$ and $N_0(\omega) \in \mathbb{N}$ with $\mathbb{P}_{\mathrm{IIC}}(N_0(\omega) < \infty) = 1$ such that
$$(\log n)^{-\alpha_1}n^{-\frac23} \le p^\omega_{2n}(0,0) \le (\log n)^{\alpha_1}n^{-\frac23}, \quad \forall n \ge N_0(\omega),\ \mathbb{P}_{\mathrm{IIC}}\text{-a.e. }\omega.$$
In particular, the Alexander-Orbach conjecture holds for the IIC.

By the above mentioned reason, for the critical bond percolations on Zd, the Alexander-Orbach conjecture holds for the IIC for the spread-out model with d>6, and for the nearest-neighbor model with d 19. Remark 5.6 (i) In fact, the existence of the limit in (5.2) is not relevant in the arguments. Indeed, even if the limit does not exist, subsequential limits exist due to compactness, and the above results hold for each limit. So the conclusions of Theorem 5.5 hold for any IIC measure (i.e. any subse- quential limit). (ii) The opposite inequalities of Theorem 5.3 (i.e. E[ B(0,r) ] C0r and (r) C0/r for all r 1) | | hold under weaker assumption. See Proposition 5.12.

In the following subsections, we prove Proposition 5.3 and 5.4.

45 5.2 Proof of Proposition 5.4 The proof splits into three lemmas.

Lemma 5.7 Assume that (5.4) and i) in Proposition 5.3 hold. Then there exists a constant C>0 such that for r 1 and x X with d(0,x) 2r, we have 2 2 1 P( B(0,r) r 0 x) C , 1. (5.9) | | | $  8

Proof. It is enough to prove the following for r 1 and x X with d(0,x) 2r: 2 2 E[ B(0,r) 1 0 x c1r (d(0,x)). (5.10) | |· { $ }  Indeed, we then have, using (5.4) and (5.10), ⇤

2 E[ B(0,r) 0 x] E[ B(0,r) 1 0 x c1r (d(0,x)) 2 | |· { $ } 1 P( B(0,r) r 0 x) | 2|| $ = 2 2 c3 , | | | $  r r P(0 x)  r c2 (d(0,x))  $ ⇤ which gives (5.9). So we will prove (5.10). We have

r r r E[ B(0,r) 1 0 x = P(0 z,0 x) P( 0 y y z y x ) | |· { $ } $ $  { $ }{ $ }{ $ } z z,y X X ⇤ r r P(0 y)P(y z)P(y x). (5.11)  $ $ $ z,y X Here the first inequality is because, if 0 r z,0 x occurs, then there must exist some y such that { $ $ } 0 r y , y r z and y x occur disjointly. The second inequality uses the BK inequality twice. { $ } { $ } { $ } For d(0,y) r and d(0,x) 2r,wehaved(x, y) d(0,x) d(0,y) d(0,x)/2, so that 

P(y x) c4 (d(x, y)) c4 (d(0,x)/2) c5 (d(0,x)). $    Thus,

r r r 2 (RHS of (5.11)) c5 (d(0,x)) P(0 y)P(y z) c5r (d(0,x)) P(0 y) c5r (d(0,x)).  $ $  $  z,y z,y X X Here we use i) in Proposition 5.3 to sum, first over z (note that (5.8) is used here) and then over y in the second and the third inequality. We thus obtain (5.10).

Lemma 5.8 Assume that (5.4) and ii) in Proposition 5.3 hold. Then there exists a constant C>0 such that for r 1 and x X with d(0,x) 2r, we have 2 2 1 1 P( B(0,r) r 0 x) C , 1. (5.12) | | | $  8

46 Proof. It is enough to prove the following for r 1, "<1 and x X with d(0,x) 2r: 2 2 P B(0,r) "r , 0 x c1" (d(0,x)). (5.13) | | $  ⇣ ⌘ Indeed, we then have, using (5.4) and (5.13), 2 1 1 2 1 P( B(0,r) r , 0 x) c1 (d(0,x)) 1 P( B(0,r) r 0 x)= | | $ c3 , | | | $ P(0 x)  c2 (d(0,x))  $ which gives (5.12). So we will prove (5.13). If B(0,r) "r2,theremustexistj [r/2,r] such that @B(0,j) 2"r. We fix the smallest | | 2 | | such j. Now, if 0 x,thereexistsavertexy @B(0,j) which is connected to x by a path that does $ 2 not use any of the vertices in B(0,j 1). We say this “x y o↵ B(0,j 1)”. Let A be a subgraph $ of X such that P(B(0,j)=A) > 0. It is clear that, for any A and any y @A, y x o↵ A @A is 2 { $ \ } independent of B(0,j)=A .Thus, { } P(0 x B(0,j)=A) P(y x o↵ A @A B(0,j)=A) $ |  $ \ | y @A X2 = P(y x o↵ A @A) P(y x) C @A (d(0,x)), (5.14) $ \  $  | | y @A y @A X2 X2 where we used d(x, y) d(0,x) d(0,y) d(0,x)/2 in the last inequality. By the definition of j we have @A 2"r and summing over all A with P(B(0,j)=A) > 0 and @B(0,r/2) = (because | | 6 ; 0 x) gives $ 2 P( B(0,r) "r , 0 x) C"r (d(0,x)) P(B(0,j)=A). | | $  · XA Since the events B(0,j)=A and B(0,j)=A are disjoint for A = A ,wehave { 1} { 2} 1 6 2 P(B(0,j)=A)=P(@B(0,r/2) = ) c/r, 6 ;  XA where ii) in Proposition 5.3 is used in the last inequality. We thus obtain (5.13).

Lemma 5.9 Assume that (5.4) and i), ii) in Proposition 5.3 hold. Then there exists a constant C>0 such that for r 1 and x X with d(0,x) 2r, we have 2 1 1/2 P(Re↵ (0,@B(0,r)) r 0 x) C , 1. (5.15)  | $  8 Proof of Proposition 5.4. Note that R (0,z) d(0,z) holds for all z B(0,R), and B(0,R) e↵  2 | | V (0,R) c B(0,R) for all R 1. By Lemma 5.7, 5.8 and 5.9, for each R 1 and x X with  1| | 2 d(0,x) 2R,wehave 1/2 P( ! : B(0,R) satisfies FR, 0 x) 1 3C 1. { }| $ 8 Using (5.2), we obtain the desired estimate.

So, all we need is to prove Lemma 5.9. We first give definition of lane introduced in [95] (similar notion was also given in [19]).

47 Definition 5.10 (i) An edge e between @B(0,j 1) and @B(0,j) is called a lane for r if there exists a path with initial edge e from @B(0,j 1) to @B(0,r) that does not return to @B(0,j 1). (ii) Let 0 1 and x X with 2 d(0,x) 2r: 1 1/2 P Re↵ (0,@B(0,r)) r, 0 x c1 (d(0,x)). (5.18)  $  Indeed, we then have, using⇣ (5.4) and (5.18), ⌘

P(R_eff(0, ∂B(0,r)) ≤ λ⁻¹r | 0 ↔ x) = P(R_eff(0, ∂B(0,r)) ≤ λ⁻¹r, 0 ↔ x)/P(0 ↔ x) ≤ c₁λ^{−1/2}γ(d(0,x))/(c₂γ(d(0,x))) ≤ c₃λ^{−1/2},

which gives (5.15). So we will prove (5.18). The proof consists of two steps.

Step 1. We will prove the following: there exists a constant C > 0 such that for any r ≥ 1, for any event E measurable with respect to B(0,r) and for any x ∈ X with d(0,x) ≥ 2r,

P(E ∩ {0 ↔ x}) ≤ C √(rP(E)) γ(d(0,x)). (5.19)

We first note that, by (5.10), there exists some j ∈ [r/2, r] such that

E[|∂B(0,j)| · 1_{{0↔x}}] ≤ Cr γ(d(0,x)).

Using this and the Chebyshev inequality, for each M > 0 we have

P(E ∩ {0 ↔ x}) ≤ P(|∂B(0,j)| > M, 0 ↔ x) + P(E ∩ {|∂B(0,j)| ≤ M, 0 ↔ x})
≤ (Cr/M) γ(d(0,x)) + P(E ∩ {|∂B(0,j)| ≤ M, 0 ↔ x}).

For the second term, using (5.14) we have, for each A,

P({B(0,j) = A} ∩ {0 ↔ x}) ≤ C|∂A| γ(d(0,x)) P(B(0,j) = A).

Summing over all subgraphs A compatible with E (the measurability of E with respect to B(0,r) is used here) and with |∂A| ≤ M gives P(E ∩ {|∂B(0,j)| ≤ M, 0 ↔ x}) ≤ CM γ(d(0,x)) P(E). Thus

P(E ∩ {0 ↔ x}) ≤ (Cr/M) γ(d(0,x)) + CM γ(d(0,x)) P(E).

Taking M = √(r/P(E)), we obtain (5.19).

Step 2. For j ∈ [r/4, r/2], let us condition on B(0,j), take an edge e between ∂B(0, j−1) and ∂B(0,j), and denote the end vertex of e in ∂B(0,j) by v_e. Let G_j be the graph that one gets by removing all the edges with at least one vertex in B(0, j−1). Then {e is a lane for r} ⊂ {∂B(v_e, r/2; G_j) ≠ ∅} in the graph G_j. By the definition of Γ and ii) in Proposition 5.3 (note that (5.8) is used here), we have

P(∂B(v_e, r/2; G_j) ≠ ∅ | B(0,j)) ≤ Γ(r/2) ≤ C/r.

Recall the definition of J_j in (5.17). By the above argument, we obtain

E[|J_j| | B(0,j)] = Σ_{v_e ∈ ∂B(0,j)} P(e is a lane for r | B(0,j)) ≤ Σ_{v_e ∈ ∂B(0,j)} P(∂B(v_e, r/2; G_j) ≠ ∅ | B(0,j)) ≤ (C/r) |∂B(0,j)|.

This together with i) in Proposition 5.3 implies

Σ_{j=r/4}^{r/2} E[|J_j|] = Σ_{j=r/4}^{r/2} Σ_A E[|J_j| 1_{{B(0,j)=A}}] ≤ Σ_{j=r/4}^{r/2} Σ_A (C/r)|∂A| P(B(0,j) = A) = (C/r) Σ_{j=r/4}^{r/2} E[|∂B(0,j)|] ≤ (C/r) E[|B(0,r)|] ≤ C′.

So, P(0 is λ-lane rich for r) ≤ C/(λr). Combining this with (5.16), we obtain

P(R_eff(0, ∂B(0,r)) ≤ r/(8λ)) ≤ C/(λr).

(Here R_eff(0, ∂B(0,r)) = ∞ if ∂B(0,r) = ∅.) This together with (5.19) in Step 1, applied to the event E = {R_eff(0, ∂B(0,r)) ≤ r/(8λ)} (replacing λ by a constant multiple only changes the constants), implies (5.18). □
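The Nash–Williams bound used here — the effective resistance across disjoint edge cut sets is at least the sum of the reciprocals of their sizes — can be checked exactly on small graphs. The sketch below is our own illustration, not part of the proof: it computes the effective resistance from the root of a binary tree to its leaves by solving the discrete Dirichlet problem exactly over the rationals, and compares it with the level-by-level cut-set bound, which is attained on this symmetric tree.

```python
from fractions import Fraction

def effective_resistance(edges, source, grounded):
    """R_eff between `source` and the set `grounded`: impose potential 1 at
    the source and 0 on `grounded`, solve the Kirchhoff (discrete Dirichlet)
    system exactly, and return 1 / (current flowing out of the source)."""
    grounded = set(grounded)
    nodes = {u for u, v, c in edges} | {v for u, v, c in edges}
    unknown = sorted(v for v in nodes if v != source and v not in grounded)
    idx = {v: i for i, v in enumerate(unknown)}
    n = len(unknown)
    A = [[Fraction(0)] * n for _ in range(n)]
    b = [Fraction(0)] * n
    for u, v, c in edges:
        c = Fraction(c)
        for a, other in ((u, v), (v, u)):
            if a in idx:
                i = idx[a]
                A[i][i] += c
                if other in idx:
                    A[i][idx[other]] -= c
                elif other == source:
                    b[i] += c                     # known potential 1 at source
    for col in range(n):                          # Gauss-Jordan elimination
        piv = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[piv], b[col], b[piv] = A[piv], A[col], b[piv], b[col]
        for r in range(n):
            if r != col and A[r][col] != 0:
                f = A[r][col] / A[col][col]
                for k in range(n):
                    A[r][k] -= f * A[col][k]
                b[r] -= f * b[col]
    pot = {unknown[i]: b[i] / A[i][i] for i in range(n)}
    pot[source] = Fraction(1)
    pot.update({g: Fraction(0) for g in grounded})
    current = sum(Fraction(c) * (pot[u] - pot[v]) * ((u == source) - (v == source))
                  for u, v, c in edges)
    return 1 / current

# binary tree of depth 3 with unit conductances, grounded at the 8 leaves
edges = []
for level in range(3):
    for w in range(2 ** level):
        edges += [((level, w), (level + 1, 2 * w), 1),
                  ((level, w), (level + 1, 2 * w + 1), 1)]
R = effective_resistance(edges, (0, 0), [(3, w) for w in range(8)])
nash_williams = sum(Fraction(1, 2 ** j) for j in range(1, 4))  # cut at level j: 2^j edges
print(R, nash_williams)  # both equal 7/8: the bound is attained on a symmetric tree
```

On an asymmetric graph the Nash–Williams sum is only a lower bound; on the symmetric tree the series-parallel reduction shows it is exact.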

5.3 Proof of Proposition 5.3 i)

The original proof of Proposition 5.3 i) by Kozma and Nachmias ([86]) (with some simplification in [82]) used a new and elegant induction scheme, but it requires several pages. Very recently (in fact, two weeks before the deadline for these lecture notes), I learned of a nice short proof by Sapozhnikov [101], which we will follow. The proof is also robust in the sense that we do not need the graph to be transitive (nor unimodular).

Proposition 5.11 If there exists C₁ > 0 such that E_p[|C(0)|] ≤ C₁(p_c − p)⁻¹ for all p < p_c, then there exists C₂ > 0 such that E[|B(0,r)|] ≤ C₂r for all r ≥ 1.

Proof. It is sufficient to prove the claim for r ≥ 2/p_c. For p < p_c, every vertex of B(0,r) is connected to 0 by a path of at most r open edges, so comparing percolation at p_c with percolation at p (each p_c-open edge is retained independently with probability p/p_c) gives

E_{p_c}[|B(0,r)|] ≤ (p_c/p)^r E_p[|C(0)|] ≤ (p_c/p)^r C₁(p_c − p)⁻¹.

Taking p = p_c(1 − 1/r), we have (p_c/p)^r = (1 − 1/r)^{−r} ≤ e² and (p_c − p)⁻¹ = r/p_c, so E[|B(0,r)|] ≤ e²C₁r/p_c =: C₂r. □

The following complementary lower bounds will also be needed.

Proposition 5.12 (i) There exists c₁ > 0 such that E[|B(0,r)|] ≥ c₁r for all r ≥ 1.
(ii) If X is unimodular transitive and satisfies (5.3), then there exists c₂ > 0 such that P(H(r)) ≥ c₂/r for all r ≥ 1.

Proof. (i) It is enough to prove E[|{x : d(0,x) = r}|] ≥ 1 for r ≥ 1. Assume by contradiction that E[|{x : d(0,x) = r₀}|] ≤ 1 − c for some r₀ ∈ N and c > 0. Let G(r) = E[|{x : d(0,x) ≤ r}|]. Then

G(2r₀) − G(r₀) = E[|{x : d(0,x) ∈ (r₀, 2r₀]}|]
≤ E[|{(y,x) : d(0,y) = r₀, d(y,x) ∈ [1, r₀] and y is on a geodesic from 0 to x}|]
≤ E[|{y : d(0,y) = r₀}|] · G(r₀) ≤ (1 − c)G(r₀),

where we used Reimer's inequality and the transitivity of X in the second inequality. (Note that we can use Reimer's inequality here because H := {y : d(0,y) = r₀} can be verified by examining all the open edges of B(0, r₀) and the closed edges in its boundary, while {x : d(y,x) ≤ r₀} for y ∈ H can be verified using open edges outside B(0, r₀).) Similarly G(nr₀) ≤ (1 − c)G((n−1)r₀) + G(r₀). A simple calculation shows that this implies that G(nr₀) does not tend to ∞ as n → ∞, which contradicts criticality.

(ii) We use a second moment argument. By (i) and Proposition 5.3 i), we have E|B(0, λr)| ≥ c₁λr and E|B(0,r)| ≤ c₂r for each r ≥ 1, λ ≥ 1. Putting λ = 2c₂/c₁, we get

E|B(0, λr) \ B(0,r)| ≥ c₁λr − c₂r = c₂r.

Next, noting that {0 ↔_{λr} x, 0 ↔_{λr} y} ⊂ ∪_{z∈Z^d} {0 ↔_{λr} z} ∘ {z ↔_{λr} x} ∘ {z ↔_{λr} y}, where ↔_{λr} denotes connection by an open path of length at most λr, the BK inequality gives

E[|B(0, λr)|²] ≤ Σ_{x,y,z} P(0 ↔_{λr} z) P(z ↔_{λr} x) P(z ↔_{λr} y) = [Σ_x P(0 ↔_{λr} x)]³ ≤ c₃(λr)³,

where the last inequality is due to Proposition 5.3 i). The 'inverse Chebyshev' inequality P(Z > 0) ≥ (EZ)²/EZ², valid for any non-negative random variable Z, yields

P(|B(0, λr) \ B(0,r)| > 0) ≥ (c₂r)²/(c₃λ³r³) ≥ c₄/r,

which completes the proof since {|B(0, λr) \ B(0,r)| > 0} ⊂ H(r). □

Note that the idea behind the above proof of (i) is that if E[|{x : d(0,x) = r}|] < 1, then the percolation process is dominated above by a subcritical branching process, which has finite mean total size; this contradicts the fact that the mean cluster size is infinite at criticality. This argument works not only for the boundary of balls in the graph distance, but also for the boundary of balls in any reasonable geodesic metric. I learned the above proof of (i), which is based on that of [85, Lemma 3.1], from Kozma and Nachmias ([84]). The proof of (ii) is from that of [86, Theorem 1.3 (ii)].

5.4 Proof of Proposition 5.3 ii)

In order to prove Proposition 5.3 ii), we will only need (5.6). First, note that (5.6) implies the following estimate: there exists C₁ > 0 such that

P(|C_G(0)| > n) ≤ C₁ n^{−1/2}, ∀n ≥ 1, ∀G ⊂ X, (5.20)

because |C_G(0)| ≤ |C(0)|. Here C_G(0) is the connected component containing 0 for percolation on G with parameter p_c, where p_c = p_c(X).

The key idea of the proof of Proposition 5.3 ii) is a regeneration argument, similar to the one given in the proof of Lemma 5.9.

Proof of Proposition 5.3 ii). Let A ≥ 1 be a large number that satisfies 3³A^{2/3} + C₁A^{2/3} ≤ A, where C₁ is from (5.20). We will prove that Γ(r) ≤ 3Ar⁻¹. For this, it suffices to prove

Γ(3^k) ≤ A/3^k, (5.21)

for all k ∈ N ∪ {0}. Indeed, for any r, by choosing k such that 3^{k−1} ≤ r < 3^k, we have

Γ(r) ≤ Γ(3^{k−1}) ≤ A/3^{k−1} < 3A/r.

We will show (5.21) by induction — it is trivial for k = 0 since A ≥ 1. Assume that (5.21) holds for all j < k, and let ε > 0 be a (small) constant to be chosen later. For any G ⊂ X, we have

P(H(3^k; G)) ≤ P(∂B(0, 3^k; G) ≠ ∅, |C_G(0)| ≤ ε9^k) + P(|C_G(0)| > ε9^k)
≤ P(∂B(0, 3^k; G) ≠ ∅, |C_G(0)| ≤ ε9^k) + C₁/(√ε 3^k), (5.22)

where the last inequality is due to (5.20). We claim that

P(∂B(0, 3^k; G) ≠ ∅, |C_G(0)| ≤ ε9^k) ≤ ε3^{k+1} (Γ(3^{k−1}))². (5.23)

If (5.23) holds, then by (5.22) and the induction hypothesis, we have

P(H(3^k; G)) ≤ ε3^{k+1}(Γ(3^{k−1}))² + C₁/(√ε 3^k) ≤ (3³εA² + C₁ε^{−1/2})/3^k. (5.24)

Put ε = A^{−4/3}. Since (5.24) holds for any G ⊂ X, we have

Γ(3^k) ≤ (3³A^{2/3} + C₁A^{2/3})/3^k ≤ A/3^k,

where the last inequality is by the choice of A. This completes the inductive proof of (5.21).

So, we will prove (5.23). Observe that if |C_G(0)| ≤ ε9^k, then there exists j ∈ [3^{k−1}, 2·3^{k−1}] such that |∂B(0,j; G)| ≤ ε3^{k+1}. We fix the smallest such j. If, in addition, ∂B(0, 3^k; G) ≠ ∅, then at least one vertex y of the at most ε3^{k+1} vertices of level j satisfies ∂B(y, 3^{k−1}; G₂) ≠ ∅, where G₂ ⊂ G is determined from G by removing all edges needed to calculate B(0,j; G). By (5.8) and the definition of Γ (with G₂), this has probability at most Γ(3^{k−1}). Summarizing, we have

P(∂B(0, 3^k; G) ≠ ∅, |C_G(0)| ≤ ε9^k | B(0,j; G)) ≤ ε3^{k+1} Γ(3^{k−1}).

We now sum over the possible values of B(0,j; G) and get an extra factor of P(H(3^{k−1}; G)), because we need to reach level 3^{k−1} from 0 in the first place. Since P(H(3^{k−1}; G)) ≤ Γ(3^{k−1}), we obtain (5.23). □

6 Further results for random walk on IIC

In this section, we will summarize results for random walks on various IICs.

6.1 Random walk on IIC for critical percolation on trees

Let n₀ ≥ 2 and let B be the n₀-ary homogeneous tree with root 0. We consider critical bond percolation on B, i.e. let {η_e : e is a bond of B} be i.i.d. such that P(η_e = 1) = 1/n₀ and P(η_e = 0) = 1 − 1/n₀ for each e. Set

C(0) = {x ∈ B : ∃ η-open path from 0 to x}.

Let B_n = {x ∈ B : d(0,x) = n}, where d(·,·) is the graph distance. Then it is easy to see that Z_n := |C(0) ∩ B_n| is a branching process with offspring distribution Bin(n₀, 1/n₀). Since E[Z₁] = 1, {Z_n} dies out with probability 1, so C(0) is a finite cluster P-a.s. In this case, we can construct the incipient infinite cluster easily as follows.

Lemma 6.1 (Kesten [77]) Let A ⊂ B_{≤k} := {x ∈ B : d(0,x) ≤ k}. Then

∃ lim_{n→∞} P(C(0) ∩ B_{≤k} = A | Z_n ≠ 0) = |A ∩ B_k| · P(C(0) ∩ B_{≤k} = A) =: P₀(A).

Further, there exists a unique probability measure P which is an extension of P₀ to a probability on the set of ∞-connected subsets of B containing 0.
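The size-biasing in Lemma 6.1 can be seen in a quick simulation (our own sketch, taking n₀ = 2, so the offspring law is Bin(2, 1/2)): conditioning the cluster on reaching a distant level n reweights the level-k generation by the martingale Z_k, so E[Z_k | Z_n ≠ 0] should approach E[Z_k²]/E[Z_k] = 1 + k·Var(Z₁), which is 2.5 for k = 3.

```python
import random

def z_levels(n, rng):
    """Generation sizes Z_0,...,Z_n of critical percolation on the binary
    tree (offspring distribution Bin(2, 1/2)); stops early at extinction."""
    zs = [1]
    for _ in range(n):
        zs.append(sum(1 for _ in range(2 * zs[-1]) if rng.random() < 0.5))
        if zs[-1] == 0:
            break
    return zs

rng = random.Random(5)
k, n, samples = 3, 40, 20000
cond, uncond = [], []
for _ in range(samples):
    zs = z_levels(n, rng)
    zk = zs[k] if len(zs) > k else 0
    uncond.append(zk)
    if zs[-1] > 0:          # the cluster reached level n
        cond.append(zk)
# unconditionally E[Z_3] = 1; conditioned on reaching level 40, the mean
# should be near the size-biased limit E[Z_3^2] = 2.5
print(round(sum(uncond) / samples, 2), round(sum(cond) / len(cond), 2))
```

The conditioned mean is computed at finite n = 40, so it only approximates the n → ∞ limit of the lemma.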

Let (Ω, F, P) be the probability space given above and, for each ω ∈ Ω, let G(ω) be the rooted labeled tree with distribution P. So (Ω, F, P) governs the randomness of the medium and, for each ω ∈ Ω, G(ω) is the incipient infinite cluster (IIC) on B.

For each G = G(ω), let {Y_n} be a simple random walk on G. Let µ be a measure on G given by µ(A) = Σ_{x∈A} µ_x, where µ_x is the number of open bonds connected to x ∈ G. Define p^ω_n(x,y) := P^x(Y_n = y)/µ_y.

In this example, (4.1) in Theorem 4.1 holds for P with p(λ) = exp(−cλ^{1/2}), D = 2 and α = 1, so we can obtain (6.1) below. In [19], further results are obtained for this example. Let P_x(·) := P(· | x ∈ G).

Theorem 6.2 (i) There exist c₀, c₁, c₂, α₁ > 0 and a positive random variable S(x) with P_x(S(x) ≥ m) ≤ c₀/log m for all x ∈ B such that the following holds:

c₁n^{−2/3}(log log n)^{−α₁} ≤ p^ω_{2n}(x,x) ≤ c₂n^{−2/3}(log log n)^{α₁} for all n ≥ S(x), x ∈ B. (6.1)

(ii) There exists C ∈ (0,∞) such that for each ε ∈ (0,1), the following holds for P-a.e. ω:

liminf_{n→∞} (log log n)^{(1−ε)/3} n^{2/3} p^ω_{2n}(0,0) ≤ C. (6.2)

We will give a sketch of the proof in the next subsection. (6.2) together with (4.10) shows that one cannot take α₁ = 0 in (6.1). Namely, there is an oscillation of order log log n in the quenched heat kernel estimates.

Remark 6.3 (i) For N ∈ N, let Z^{(N)}_n = N^{−1/3} d(0, Y_{Nn}), n ≥ 0. In [77], Kesten proved that the distribution of {Z^{(N)}_n} under the annealed law P* := P × P^0_ω converges as N → ∞. In particular, {Z^{(N)}_·} is tight with respect to P*. On the other hand, by (6.2), it can be shown that {Z^{(N)}_·} is NOT tight with respect to the quenched law.
(ii) In [45], it is proved that the P-distribution of the rescaled simple random walk on the IIC converges to Brownian motion on the Aldous tree ([5]).
(iii) (4.2) is proved with D = 2 and α = 1 for simple random walk on the IIC of the family tree of the critical Galton–Watson branching process with finite variance offspring distribution ([54]), and for invasion percolation on regular trees ([8]).
(iv) The behavior of simple random walk on the IIC is different for the family tree of the critical Galton–Watson branching process with infinite variance offspring distribution. In [48], it is proved that when the offspring distribution is in the domain of attraction of a stable law with index β ∈ (1,2), then (4.2) holds with D = β/(β−1) and α = 1. In particular, the spectral dimension of the random walk is 2β/(2β−1). It is further proved that there is an oscillation of order log n in the quenched heat kernel estimates. Namely, it is proved that there exists α₂ > 0 such that

liminf_{n→∞} n^{β/(2β−1)}(log n)^{α₂} p^ω_{2n}(0,0) = 0, P-a.e. ω.

Note that for this case, convergence of the P-distribution of N^{−(β−1)/(2β−1)} d(0, Y_{Nn}) as N → ∞ is already proved in [77].

One may wonder whether the off-diagonal heat kernel estimates enjoy the sub-Gaussian form given in Remark 3.23. As we have seen, there is an oscillation already in the on-diagonal estimates for the quenched heat kernel, so one cannot expect estimates of precisely the same form as in Remark 3.23. However, the following theorem shows that such sub-Gaussian estimates hold with high probability for the quenched heat kernel, and that precise sub-Gaussian estimates hold for the annealed heat kernel.

Let {Y_t}_{t≥0} be the continuous time Markov chain with base measure µ. Set q^ω_t(x,y) = P^x(Y_t = y)/µ_y, and let P_{x,y}(·) = P(· | x, y ∈ G), E_{x,y}(·) = E(· | x, y ∈ G). Then the following holds.

Theorem 6.4 ([19]) (i) Quenched heat kernel bounds:
a) Let x, y ∈ G = G(ω) and t > 0 be such that N := ⌈(d(x,y)³/t)^{1/2}⌉ ≥ 8 and t ≥ c₁d(x,y). Then there exists F* = F*(x,y,t) with P_{x,y}(F*(x,y,t)) ≥ 1 − c₂ exp(−c₃N) such that

q^ω_t(x,y) ≤ c₄t^{−2/3} exp(−c₅N), for all ω ∈ F*.

b) Let x, y ∈ G with x ≠ y, m ≥ 1 and λ ≥ 1. Then there exists G* = G*(x,y,m,λ) with P_{x,y}(G*(x,y,m,λ)) ≥ 1 − c₆λ⁻¹ such that

q^ω_{2T}(x,y) ≥ c₇T^{−2/3} e^{−c₈(λ+c₉)m}, for all ω ∈ G*, where T = d(x,y)³/m².

(ii) Annealed heat kernel bounds: Let x, y ∈ B. Then

c₁t^{−2/3} exp(−c₂(d(x,y)³/t)^{1/2}) ≤ E_{x,y}[q_t(x,y)] ≤ c₃t^{−2/3} exp(−c₄(d(x,y)³/t)^{1/2}), (6.3)

where the upper bound holds for c₅d(x,y) ≤ t and the lower bound for c₅(d(x,y) ∨ 1) ≤ t.

Note that (6.3) coincides with the estimate in Remark 3.23 with d_f = 2, d_w = 3.

6.2 Sketch of the Proof of Theorem 6.2

For x ∈ G and r ≥ 1, let M(x,r) be the smallest number m ∈ N such that there exists A = {z₁, …, z_m} with d(x, z_i) ∈ [r/4, 3r/4], 1 ≤ i ≤ m, so that any path from x to B(x,r)^c must pass through the set A. Similarly to (5.16), we have

R_eff(0, B(0,R)^c) ≥ R/(2M(0,R)). (6.4)

So, in order to prove (4.1) with p(λ) = exp(−cλ^{1/2}) (which implies Theorem 6.2 (i) due to the last assertion of Theorem 4.1), it is enough to prove the following.

Proposition 6.5 (i) Let λ > 0, r ≥ 1. Then

P(V(0,r) > λr²) ≤ c₀ exp(−c₁λ), P(V(0,r) < λ⁻¹r²) ≤ c₂ exp(−c₃√λ).

(ii) Let r, m ≥ 1. Then P(M(x,r) ≥ m) ≤ c₄ e^{−c₅m}.
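Proposition 6.5 (i) says that the volume of an r-ball in the IIC is of order r². Before the proof, here is a toy numerical illustration (our own sketch, with n₀ = 2): it builds |B(0,r)| in the spirit of the size-biased decomposition used below — a backbone of length r whose vertices carry, with probability 1/2 each (the size-biased offspring law for Bin(2,1/2)), an independent critical Galton–Watson tree truncated at the remaining depth.

```python
import random

def gw_total(depth, rng):
    """Total size of a critical Galton-Watson tree (offspring Bin(2, 1/2))
    truncated at the given depth."""
    total, gen = 1, 1
    for _ in range(depth):
        gen = sum(1 for _ in range(2 * gen) if rng.random() < 0.5)
        total += gen
        if gen == 0:
            break
    return total

def iic_ball_volume(r, rng):
    """|B(0,r)| in the IIC on the binary tree: a backbone of length r; each
    backbone vertex has one extra (non-backbone) child with probability 1/2,
    carrying an independent critical tree truncated at the ball radius."""
    vol = r + 1                              # the backbone itself
    for j in range(r):
        if rng.random() < 0.5:
            vol += gw_total(r - j - 1, rng)
    return vol

rng = random.Random(7)
r = 40
mean_vol = sum(iic_ball_volume(r, rng) for _ in range(3000)) / 3000
print(round(mean_vol / r ** 2, 3))  # a constant: E|B(0,r)| is of order r^2
```

A short computation with this decomposition gives E|B(0,r)| = r + 1 + r(r+1)/4, so the printed ratio should sit a little above 1/4.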

Proof. (Sketch) Basically, all the estimates can be obtained through large deviation estimates for the total population size of the critical branching process.
(i) Since G is a tree, |B(x,r)| ≤ V(x,r) ≤ 2|B(x,r)|, so we consider |B(x,r)|. Let η be the offspring distribution and p_i := P(η = i). (Note that E[η] = 1, Var[η] < ∞.) Now define the size-biased branching process {Z̃_n} as follows: Z̃₀ = 1, P(Z̃₁ = j) = (j+1)p_{j+1} for all j ≥ 0, and {Z̃_n}_{n≥2} is the usual branching process with offspring distribution η. Let Ỹ_n := Σ_{k=0}^n Z̃_k. Then

Ỹ_{r/2}[r/2] ≤_{(d)} |B(0,r)| ≤_{(d)} Ỹ_r[r].

Here, for a random variable ξ, we denote ξ[n] := Σ_{i=1}^n ξ_i, where {ξ_i} are i.i.d. copies of ξ.
Let Ȳ_n := Σ_{k=0}^n Z_k, i.e. the total population size up to generation n. Then it is easy to get

P(Ȳ_n[n] ≥ λn²) ≤ c exp(−c′λ), P(Ȳ_n[n] ≤ λ⁻¹n²) ≤ c exp(−c′√λ).

We can obtain similar estimates for Ỹ_n[n], so (i) holds.
(ii) For x, y ∈ G, let γ(x,y) be the unique geodesic between x and y. Let D(x) be the descendants of x and D_r(x) = {y ∈ D(x) : d(x,y) = r}. Let H be the backbone and b = b_{r/4} ∈ H be the point with d(0,b) = r/4. Define

A := ∪_{z ∈ γ(0,b)\{b}} (D_{r/4}(z) \ H), A* = {z ∈ A : D_{r/4}(z) ≠ ∅}.

Then any path from 0 to B(0,r)^c must cross A* ∪ {b}, so that M(0,r) ≤ |A*| + 1. Define p_r := P(z ∈ A* | z ∈ A) = P(Z_{r/4} > 0) ≤ c/r. Let {ξ_i} be i.i.d. with distribution Ber(p_r), independent of |A|. Then we see that

|A*| =_{(d)} Σ_{i=1}^{|A|} ξ_i, |A| ≤_{(d)} Z̃_{r/4}[r/4].

Using these, it is easy to obtain P(|A*| > m) ≤ e^{−cm}, so (ii) holds. □

We next give the key lemma for the proof of Theorem 6.2 (ii).

Lemma 6.6 For any ε ∈ (0,1),

limsup_{n→∞} V(0,n)/(n²(log log n)^{1−ε}) = ∞, P-a.s.

Proof. (Sketch) Let D(x; z) = {y ∈ D(x) : γ(x,y) ∩ γ(x,z) = {x}}, and define

Z_n = |{x : x ∈ D(y_i; y_{i+1}), d(x, y_i) ≤ 2^{n−2}, 2^{n−1} ≤ i ≤ 2^{n−1} + 2^{n−2}}|,

where y_i denotes the backbone vertex at level i. Thus Z_n is the number of descendants off the backbone, to depth 2^{n−2}, of points y_i on the backbone between levels 2^{n−1} and 2^{n−1} + 2^{n−2}. So the {Z_n}_n are independent, |B(0, 2^n)| ≥ Z_n, and Z_n =_{(d)} Ỹ_{2^{n−2}}[2^{n−2}]. It can be shown that Ỹ_n[n] ≥ c₁n² Bin(n, p₁/n) for some p₁ > 0, so we have, with a_n = (log n)^{1−ε},

P(|B(0, 2^n)| ≥ a_n 4^n) ≥ P(Z_n ≥ a_n 4^n) ≥ P(Ỹ_{2^{n−2}}[2^{n−2}] ≥ a_n 4^n) ≥ P(Bin(2^{n−2}, p₁2^{−n+2}) ≥ c₂a_n) ≥ c₃ e^{−c₂ a_n log a_n} ≥ c₃/n.

As the {Z_n} are independent, the desired estimate follows from the second Borel–Cantelli lemma. □

Proof of Theorem 6.2 (ii). Let a_n = V(0, 2^n) 2^{−2n} and t_n = 2^n V(0, 2^n) = a_n 2^{3n}. Using (3.21) with α = 1, we have p^ω_{rV(0,r)}(0,0) ≤ c/V(0,r), so p^ω_{t_n}(0,0) ≤ c/V(0, 2^n) = c t_n^{−2/3}/a_n^{1/3}. By Lemma 6.6, a_n ≥ (log n)^{1−ε} ≍ (log log t_n)^{1−ε} along a subsequence. We thus obtain the result. □

6.3 Random walk on the IIC for the critical oriented percolation cluster in Z^d (d > 6)

We first introduce the spread-out oriented percolation model. Let d > 4 and L ≥ 1. We consider an oriented graph with vertices Z^d × Z₊ and oriented bonds {((x,n), (y,n+1)) : n ≥ 0, x, y ∈ Z^d with 0 ≤ ‖x − y‖_∞ ≤ L}. We consider bond percolation on this graph. We write (x,n) → (y,m) if there is a sequence of open oriented bonds that connects (x,n) and (y,m). Let

C(x,n) = {(y,m) : (x,n) → (y,m)}

and define θ(p) = P_p(|C(0,0)| = ∞). Then there exists p_c = p_c(d,L) ∈ (0,1) such that θ(p) > 0 for p > p_c and θ(p) = 0 for p ≤ p_c. In particular, there is no infinite cluster when p = p_c (see [61, Page 369], [62]).

For this example, the construction of the incipient infinite cluster is given by van der Hofstad, den Hollander and Slade [68] for d > 4. Let τ_n(x) = P_{p_c}((0,0) → (x,n)) and τ_n = Σ_x τ_n(x).

Proposition 6.7 ([68]) (i) There exists L₀(d) such that if L ≥ L₀(d), then

∃ lim_{n→∞} P_{p_c}(E | (0,0) → n) =: Q_IIC(E) for any cylindrical event E,

where (0,0) → n means (0,0) → (x,n) for some x ∈ Z^d. Further, Q_IIC can be extended uniquely to a probability measure on the Borel σ-field and C(0,0) is Q_IIC-a.s. an infinite cluster.
(ii) There exists L₀(d) such that if L ≥ L₀(d), then

∃ lim_{n→∞} (1/τ_n) Σ_x P_{p_c}(E ∩ {(0,0) → (x,n)}) =: P_IIC(E) for any cylindrical event E.

Further, P_IIC can be extended uniquely to a probability measure on the Borel σ-field and P_IIC = Q_IIC.

Let us consider simple random walk on the IIC. It is proved in [18] that (4.1) in Theorem 4.1 holds for P_IIC with p(λ) = cλ⁻¹, D = 2 and α = 1, so we have (4.2). Note that although many of the arguments in Section 5 can be generalized to this oriented model straightforwardly, some cannot, due to the orientedness. For example, it is not clear how to adapt Proposition 5.3 ii) to the oriented model. (The definition of Γ(r) needs to be modified in order to take the orientedness into account.) For reference, we will briefly explain how the proof of (4.1) goes in [18], and explain why d > 6 is needed.

Volume estimates. Let Z_R := cV(0,R)/R² and Z = ∫₀¹ W_t(R^d) dt, where W_t is the canonical measure of super-Brownian motion conditioned to survive for all time.

Proposition 6.8 The following holds for d > 4 and L ≥ L₀(d).
(i) lim_{R→∞} E_IIC[Z_R^l] = E[Z^l] ≤ 2^{l²}(l+1)! for all l ∈ N. In particular, c₁R² ≤ E_IIC[V(0,R)] ≤ c₂R² for all R ≥ 1.
(ii) P_IIC(V(0,R)/R² < λ) ≤ c₁ exp{−c₂/√λ} for all R ≥ 1 and 0 < λ ≤ 1.

Proof. (Sketch) (i) First, note that for l ≥ 1 and R large,

E_IIC[Z_R^l] ∼ E_IIC[(c₀R^{−2}|B(0,R)|)^l] = (c₀R^{−2})^l E_IIC[(Σ_{n=0}^{R−1} Σ_{y∈Z^d} 1_{{(0,0)→(y,n)}})^l]. (6.5)

For l ≥ 1 and m⃗ = (m₁, …, m_l), define the IIC (l+1)-point function as

ρ̂^{(l+1)}_{m⃗} = Σ_{y₁,…,y_l∈Z^d} P_IIC((0,0) → (y_i, m_i), ∀i = 1, …, l).

In [68], it is proved that for t⃗ = (t₁, …, t_l) ∈ (0,1]^l,

lim_{s→∞} (c₀/s)^l ρ̂^{(l+1)}_{⌊st⃗⌋} = M̂^{(l+1)}_{1,t⃗} := N(X₁(R^d) X_{t₁}(R^d) ⋯ X_{t_l}(R^d)),

where M̂^{(l+1)}_{1,t⃗} is the (l+1)-st moment of the canonical measure N of super-BM X_t. So, taking R → ∞,

(RHS of (6.5)) = (1/R^l) Σ_{n₁=0}^{R−1} ⋯ Σ_{n_l=0}^{R−1} (c₀/R)^l ρ̂^{(l+1)}_{R t⃗} → ∫₀¹⋯∫₀¹ dt₁⋯dt_l M̂^{(l+1)}_{1,t⃗} = E[Z^l],

where t⃗ = (n₁/R, …, n_l/R).
(ii) By (i), we see that for any ϵ > 0, there exists λ > 0 such that Q_IIC(V(0,R)/R² < λ) < ϵ. Now a chaining argument gives the desired estimate. □

From Proposition 6.8, we can verify the volume estimates in (4.1). As we see, these can be obtained for all d > 4.

Resistance estimates. Following [18], define the cut set at level n ∈ [1,R] as follows:

D(n) = {e = ((w, n−1), (x, n)) ∈ G : (x,n) is RW-connected to level R by a path in G \ {(z,l) : l ≤ n}}.

Here '(x,n) is RW-connected to level R' means that there exists a sequence of non-oriented bonds connecting (x,n) to (y,R) for some y ∈ Z^d.

Proposition 6.9 For d > 6, there exists L₁(d) ≥ L₀(d) such that for L ≥ L₁(d),

E_IIC(|D(n)|) ≤ c₁(a,d), 0 < n ≤ aR.

Let us indicate why d > 6 is needed. We have

E_IIC|D(n)| = Σ_{w,x∈Z^d} Q_IIC[(w,x) ∈ D(n)] = lim_{N→∞} (1/A) Σ_{w,x,y∈Z^d} P_{p_c}[(w,x) ∈ D(n), 0 → y],

where w = (w, n−1), x = (x, n), y = (y, N) and A = lim_{N→∞} τ_N. Let us consider one configuration in the event {(w,x) ∈ D(n), 0 → y}, as indicated in the following figure.

[Figure: a configuration realizing {(w,x) ∈ D(n), 0 → y}; the oriented path from 0 to y branches at levels j and k ≤ l, and a zigzag connection through these branch points reaches x at level n, with j ≤ n ≤ k ≤ l.]

By [70], we have sup_{x∈Z^d} τ_n(x) ≤ K(n+1)^{−d/2} for n ≥ 1, and τ_n = A(1 + O(n^{(4−d)/2})) as n → ∞. Using these and the BK inequality, the contribution of the configuration in the above figure can be bounded above by

Σ_{l=n}^∞ Σ_{k=n}^{l} Σ_{j=0}^{n} c(l−j+1)^{−d/2} ≤ Σ_{l=n}^∞ Σ_{k=n}^{l} c(l−n+1)^{(2−d)/2} ≤ Σ_{l=n}^∞ c(l−n+1)^{(4−d)/2} = c Σ_{m=1}^∞ m^{(4−d)/2} < ∞ for d > 6.

In order to prove Proposition 6.9, one needs to estimate more complicated zigzag paths efficiently.

(Open problem) The critical dimension for oriented percolation is 4. How does the random walk on the IIC for oriented percolation behave when d = 5, 6?
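The elementary sum at the end, Σ_m m^{(4−d)/2}, converges exactly when (d−4)/2 > 1, i.e. d > 6. A two-line numerical check (our own, purely illustrative):

```python
def partial_sums(d, checkpoints):
    """Partial sums of sum_{m>=1} m^{(4-d)/2}, recorded at the given cutoffs;
    this is the tail controlling the zigzag bound above."""
    s, out, last = 0.0, [], 0
    for M in checkpoints:
        for m in range(last + 1, M + 1):
            s += m ** ((4 - d) / 2)
        out.append(s)
        last = M

    return out

cps = (100, 10_000, 100_000)
res = {d: partial_sums(d, cps) for d in (5, 6, 7)}
for d, vals in res.items():
    print(d, [round(v, 3) for v in vals])
# d = 7: the partial sums stabilize (convergent series);
# d = 6: logarithmic growth (borderline); d = 5: clear divergence
```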

6.4 Below the critical dimension

We have seen various IIC models where simple random walk on the IIC enjoys the estimate (4.2) with D = 2 and α = 1. This is a typical mean field behavior that may hold in high dimensions (above the critical dimension). It is natural to ask about the behavior of simple random walk in low dimensions. Very little is known. Let us list the rigorous results known so far.

(i) Random walk on the IIC for 2-dimensional critical percolation ([76])
In [76], Kesten shows the existence of the IIC for 2-dimensional critical percolation. Further, he proves subdiffusive behavior of simple random walk on the IIC in the following sense: let {Y_n}_n be a simple random walk on the IIC; then there exists ε > 0 such that the P-distribution of n^{−1/2+ε} d(0, Y_n) is tight.

(ii) Random walk on the 2-dimensional uniform spanning tree ([20])
It is proved that (4.2) holds with D = 8/5 and α = 1. In particular, it is shown that the spectral dimension of the random walk is 16/13.

(iii) Brownian motion on the critical percolation cluster for the diamond lattice ([64])
A Brownian motion is constructed on the critical percolation cluster for the diamond lattice. Further, it is proved that the heat kernel enjoys a continuous version of (4.2) with α = 1 and some non-trivial D, determined by the maximal eigenvalue of the matrix of the corresponding multi-dimensional branching process.

It is believed that the critical dimension for percolation is 6. It would be very interesting to know the spectral dimension of simple random walk on the IIC of critical percolation for d < 6. (Note that in this case even the existence of the IIC is not proved except for d = 2.) The following numerical simulations (which we borrow from [27]) suggest that the Alexander–Orbach conjecture does not hold for d ≤ 5:

d = 5 ⇒ d_s = 1.34 ± 0.02,  d = 4 ⇒ d_s = 1.30 ± 0.04,
d = 3 ⇒ d_s = 1.32 ± 0.01,  d = 2 ⇒ d_s = 1.318 ± 0.001.

6.5 Random walk on random walk traces and on the Erdős–Rényi random graphs

In this subsection, we discuss the behavior of random walk in two random environments, namely on random walk traces and on the Erdős–Rényi random graphs. They are not IICs, but the techniques discussed in Section 3 and Subsection 4.1 can be applied to some extent. We give a brief overview of the results.

Random walk on random walk traces
Let X(ω) be the trace of simple random walk on Z^d, d ≥ 3, started at 0. Let {Y^ω_t}_{t≥0} be the simple random walk on X(ω) and p^ω_n(·,·) be its heat kernel. It is known in general that if a Markov chain corresponding to a weighted graph is transient, then the simple random walk on the trace of the Markov chain is recurrent P-a.s. (see [28]). The question is to obtain more detailed properties of the random walk when the initial graph is Z^d. The following results show that it behaves like 1-dimensional simple random walk when d ≥ 5.

Theorem 6.10 ([44]) Let d ≥ 5, and let {B_t}_{t≥0}, {W^{(d)}_t}_{t≥0} be independent standard Brownian motions on R and R^d respectively, both started at 0.
(i) There exist c₁, c₂ > 0 such that

c₁n^{−1/2} ≤ p^ω_{2n}(0,0) ≤ c₂n^{−1/2} for large n, P-a.e. ω.

(ii) There exists δ₁ = δ₁(d) > 0 such that {n^{−1/2} d^ω(0, Y^ω_{[tn]})}_{t≥0} converges weakly to {|B_{δ₁t}|}_{t≥0} P-a.e. ω, where d^ω(·,·) is the graph distance on X(ω). Also, there exists δ₂ = δ₂(d) > 0 such that {n^{−1/4} Y^ω_{[tn]}}_{t≥0} converges weakly to {W^{(d)}_{|B_{δ₂t}|}}_{t≥0} P-a.e. ω.

On the other hand, the behavior is different for d = 3, 4.

Theorem 6.11 ([103, 104]) (i) Let d = 4. Then there exist c₁, c₂ > 0 and a slowly varying function γ such that

c₁n^{−1/2} γ(n)^{1/2} ≤ p^ω_{2n}(0,0) ≤ c₂n^{−1/2} γ(n)^{1/2} for large n, P-a.e. ω, (6.6)

n^{1/4}(log n)^{−1/24−δ} ≤ max_{1≤k≤n} |Y^ω_k| ≤ n^{1/4}(log n)^{13/12+δ} for large n, P_ω-a.s. and P-a.e. ω,

for any δ > 0. Further, γ(n) ≈ (log n)^{1/2}, that is,

lim_{n→∞} log γ(n)/log log n = 1/2.

(ii) Let d = 3. Then there exists α > 0 such that

p^ω_{2n}(0,0) ≤ n^{−10/19}(log n)^α for large n, P-a.s. ω.

These estimates suggest that the 'critical dimension' for random walk on the random walk trace in Z^d is 4. Note that some annealed estimates for the heat kernel (which are weaker than the above) are obtained in [44] for d = 4.
One of the key estimates in establishing (6.6) is a sharp estimate for E[R_eff(0, S_n)], where {S_n}_n is the simple random walk on Z^4 started at 0. In [37], Burdzy and Lawler obtained

c₁(log n)^{−1/2} ≤ n^{−1} E[R_eff(0, S_n)] ≤ c₂(log n)^{−1/3} for d = 4. (6.7)

This comes from the naive estimates E[L_n] ≤ E[R_eff(0, S_n)] ≤ E[A_n], where L_n is the number of cut points of {S₀, …, S_n} and A_n is the number of points of the loop-erased random walk of {S₀, …, S_n}. (The logarithmic estimates in (6.7) are those of E[L_n] and E[A_n].) In [103], Shiraishi proves the following:

n^{−1} E[R_eff(0, S_n)] ≈ (log n)^{−1/2} for d = 4,

which means that the exponent coming from the number of cut points is the right one. Intuitively, the reason is as follows. Let {T_j} be the sequence of cut times up to time n. Then the random walk traces near S_{T_j} and S_{T_{j+1}} typically intersect when T_{j+1} − T_j is large, i.e. there exist 'long range intersections'. So E[R_eff(S_{T_j}, S_{T_{j+1}})] ≈ 1, and E[R_eff(0, S_n)] ≈ E[Σ_{j=1}^{a_n} R_eff(S_{T_j}, S_{T_{j+1}})] ≍ E[a_n] ≍ n(log n)^{−1/2}, where a_n := sup{j : T_j ≤ n}.

Random walk on the Erdős–Rényi random graphs
Let V_n := {1, 2, …, n} be labeled vertices. Each bond {i,j} (i, j ∈ V_n) is open with probability p ∈ (0,1), independently of all the others. The realization G(n,p) is the Erdős–Rényi random graph.

It is well-known (see [34]) that this model has a phase transition at p = 1/n in the following sense. Let C^n_i be the i-th largest connected component. If p ∼ c/n with c < 1, then |C^n_1| = O(log n); if c > 1, then |C^n_1| ≍ n and |C^n_j| = O(log n) for j ≥ 2; and if p ∼ c/n with c = 1, then |C^n_j| ≍ n^{2/3} for j ≥ 1.
Now consider the finer scaling p = 1/n + λn^{−4/3} for fixed λ ∈ R — the so-called critical window. Let |C^n_1| and S^n_1 be the size and the surplus (i.e., the minimum number of edges which would need to be removed in order to obtain a tree) of C^n_1. Then the well-known result by Aldous ([4]) says that (n^{−2/3}|C^n_1|, S^n_1) converges weakly to a pair of random variables which are determined by the length of the largest excursion of reflected Brownian motion with drift and by an associated Poisson random variable. Recently, Addario-Berry, Broutin and Goldschmidt ([1]) proved further that there exists a (random) compact metric space M₁ such that n^{−1/3} C^n_1 converges weakly to M₁ in the Gromov–Hausdorff sense. (In fact, these results hold not only for C^n_1, S^n_1 but also for the sequences (C^n_1, C^n_2, …), (S^n_1, S^n_2, …).) Here M₁ can be constructed from a random real tree (given by the excursion mentioned above) by gluing a (random) finite number of points — see [1] for details.
Now we consider the simple random walk {Y^{C^n_1}_m}_m on C^n_1. The following results on the scaling limit and the heat kernel estimates are obtained by Croydon ([43]).

Theorem 6.12 ([43]) There exists a diffusion process ('Brownian motion') {B^{M₁}_t}_{t≥0} on M₁ such that {n^{−1/3} Y^{C^n_1}_{[nt]}}_{t≥0} converges weakly to {B^{M₁}_t}_{t≥0}, P-a.s. Further, there exists a jointly continuous heat kernel p^{M₁}_t(·,·) of {B^{M₁}_t}_{t≥0} which enjoys the following estimates:

c₁ t^{−2/3} (ln₁ t⁻¹)^{−α₁} exp(−c₂ (d(x,y)³/t)^{1/2} (ln₁(d(x,y)/t))^{α₂}) ≤ p^{M₁}_t(x,y)
≤ c₃ t^{−2/3} (ln₁ t⁻¹)^{1/3} exp(−c₄ (d(x,y)³/t)^{1/2} (ln₁(d(x,y)/t))^{−α₃}), ∀x, y ∈ M₁, t ≤ 1,

where ln₁ x := 1 ∨ log x, and c₁, …, c₄, α₁, …, α₃ are positive (non-random) constants.
The same results hold for C^n_i and M_i for each i ∈ N (with constants depending on i).
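The n^{2/3} scaling of |C^n_1| in the critical window is easy to observe numerically. A minimal sketch (our own illustration), using the essentially equivalent G(n, m) multigraph model with m = n/2 random edges and a union-find over the sampled edges:

```python
import random

def largest_component(n, m, rng):
    """|C_1| of the multigraph with n vertices and m uniformly random edges,
    tracked with union-find (path halving + union by size)."""
    parent, size = list(range(n)), [1] * n
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a
    for _ in range(m):
        a, b = find(rng.randrange(n)), find(rng.randrange(n))
        if a != b:
            if size[a] < size[b]:
                a, b = b, a
            parent[b] = a
            size[a] += size[b]
    return max(size[find(v)] for v in range(n))

rng = random.Random(1)
ratios = {}
for n in (1000, 8000, 64000):
    avg = sum(largest_component(n, n // 2, rng) for _ in range(20)) / 20
    ratios[n] = avg / n ** (2 / 3)   # should stay of order 1 at criticality
print({n: round(v, 2) for n, v in ratios.items()})
```

The ratio |C^n_1|/n^{2/3} fluctuates (it converges in law, not to a constant), but it neither vanishes nor diverges as n grows, in contrast to the subcritical (log n) and supercritical (n) regimes.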

7 Random conductance model

Consider a weighted graph (X, µ). As we have seen in Remark 1.7, there are two continuous time Markov chains with transition probability P(x,y) = µ_xy/µ_x. One is the constant speed random walk (CSRW), for which the holding time at each point is exponentially distributed with mean 1, and the other is the variable speed random walk (VSRW), for which the holding time at x is exponentially distributed with mean µ_x⁻¹ for each x ∈ X. The corresponding discrete Laplace operators are given in (1.4), (1.9), which we rewrite here:

L_C f(x) = (1/µ_x) Σ_y (f(y) − f(x)) µ_xy,
L_V f(x) = Σ_y (f(y) − f(x)) µ_xy.

Recall also that for each f, g that have finite support, we have

E(f, g) = −(L_V f, g)_ν = −(L_C f, g)_µ.

7.1 Overview

Consider Z^d, d ≥ 2, let E_d be the set of non-oriented nearest neighbor bonds, and let the conductances {µ_e : e ∈ E_d} be stationary and ergodic on a probability space (Ω, F, P). For each ω ∈ Ω, let ({Y_t}_{t≥0}, {P^x_ω}_{x∈Z^d}) be either the CSRW or the VSRW and let

q^ω_t(x,y) = P^x_ω(Y_t = y)/θ_y

be the transition density of {Y_t}_{t≥0}, where θ is either ν or µ. This model is called the random conductance model (RCM for short). We are interested in the long time behavior of {Y_t}_{t≥0}, especially in obtaining heat kernel estimates for q^ω_t(·,·) and a quenched invariance principle (to be precise, a quenched functional central limit theorem) for {Y_t}_{t≥0}. Note that when Eµ_e < ∞, it was proved in the 1980s that εY_{t/ε²} converges as ε → 0 to Brownian motion on R^d with covariance σ²I in law under P × P⁰_ω, with the possibility σ = 0 in some cases (see [50, 80, 81]). This is sometimes referred to as the annealed (or averaged) invariance principle.
From now on, we will discuss the case when the {µ_e : e ∈ E_d} are i.i.d. with p₊ := P(µ_e > 0) > p_c(Z^d). Under this condition, there exists a unique infinite connected component of edges with strictly positive conductances, which we denote by C_∞. Typically, we will consider the case where 0 ∈ C_∞, namely we consider P(· | 0 ∈ C_∞). We will consider the following cases:

• Case 0: c⁻¹ ≤ µ_e ≤ c for some c ≥ 1,
• Case 1: 0 ≤ µ_e ≤ c for some c > 0,
• Case 2: c ≤ µ_e < ∞ for some c > 0.

(Of course, Case 0 is a special case of both Case 1 and Case 2.) For Case 0, the two-sided quenched Gaussian heat kernel estimates

c₁ t^{−d/2} exp(−c₂|x−y|²/t) ≤ q^ω_t(x,y) ≤ c₃ t^{−d/2} exp(−c₄|x−y|²/t) (7.1)

hold P-a.s. for t ≥ |x − y| by the result in [49], and the quenched invariance principle is proved in [105]. When µ_e ∈ {0,1}, which is a special case of Case 1, the corresponding Markov chain is a random walk on supercritical percolation clusters. In this case, isoperimetric inequalities are proved in [92] (see also [56]), two-sided quenched Gaussian long time heat kernel estimates are obtained in [11] (precisely, (7.1) holds for t ≥ 1 ∨ S_x(ω) ∨ |x − y|, where {S_x}_{x∈Z^d} satisfies P_p(S_x ≥ n, x ∈ C(0)) ≤ c₁ exp(−c₂ n^{ε_d}) for some ε_d > 0), and the quenched invariance principle is proved in [105] for d ≥ 4 and later extended to all d ≥ 2 in [30, 91].
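To make the CSRW/VSRW distinction concrete, here is a toy simulation (our own sketch) on a conductance cycle of Case 0 type. Both walks share the jump chain P(x,y) = µ_xy/µ_x, but the VSRW is reversible with respect to counting measure and so spends asymptotically equal time at every site, while the CSRW occupation profile is proportional to µ_x.

```python
import random, math

def occupation(cond, T, variable_speed, rng):
    """Occupation times of the RCM on the cycle {0,...,L-1}; cond[i] is the
    conductance of the bond {i, i+1 mod L}.  CSRW: every holding time is
    Exp(mean 1); VSRW: the holding time at x is Exp(mean 1/mu_x)."""
    L = len(cond)
    occ = [0.0] * L
    x, t = 0, 0.0
    while t < T:
        mu = cond[(x - 1) % L] + cond[x]      # mu_x: sum of incident conductances
        hold = rng.expovariate(mu if variable_speed else 1.0)
        occ[x] += min(hold, T - t)
        t += hold
        # the jump chain is the same for both walks: P(x, x+1) = cond[x]/mu
        x = (x + 1) % L if rng.random() * mu < cond[x] else (x - 1) % L
    return occ

rng = random.Random(3)
L, T = 8, 2.0e5
cond = [math.exp(rng.uniform(-1.0, 1.0)) for _ in range(L)]   # i.i.d., Case 0 type
mu = [cond[(i - 1) % L] + cond[i] for i in range(L)]
flat_v = [o / T for o in occupation(cond, T, True, rng)]      # VSRW profile
prop_c = [o / T for o in occupation(cond, T, False, rng)]     # CSRW profile
ratio_c = [p * sum(mu) / m for p, m in zip(prop_c, mu)]
print([round(v, 3) for v in flat_v])    # close to 1/8 at every site
print([round(v, 3) for v in ratio_c])   # close to 1: occupation proportional to mu_x
```

This matches the reversing measures ν (counting measure, for L_V) and µ (for L_C) recalled at the start of this section.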

Case 1. This case is treated in [31, 32, 53, 90] for d ≥ 2. (Note that the papers [31, 32] consider a discrete time random walk and [53, 90] consider the CSRW. In fact, one can see that there is not a big difference between the CSRW and the VSRW in this case, as we will see in Theorem 7.2.)

Heat kernel estimates. In [53, 31], it is proved that Gaussian heat kernel bounds do not hold in general, and anomalous behavior of the heat kernel is established for d large (see also [36]). In [53], Fontes and Mathieu consider the VSRW on Z^d with conductances given by µ_xy = ω(x) ∧ ω(y), where {ω(x) : x ∈ Z^d} are i.i.d. with ω(x) ≤ 1 for all x and

P(ω(0) ≤ s) ≍ s^γ as s ↓ 0,

for some γ > 0. They prove the following anomalous annealed heat kernel behavior:

lim_{t→∞} log E[P⁰_ω(Y_t = 0)]/log t = −(γ ∧ d/2).

We now state the main results in [31]. Here we consider the discrete time Markov chain with transition probabilities {P(x,y) : x, y ∈ Z^d} and denote by P^n_ω(0,0) the heat kernel of the Markov chain, which (in this case) coincides with the probability that the Markov chain started at 0 returns to 0 at time n.

Theorem 7.1 (i) For $\mathbb P$-a.e. $\omega$, there exists $C_1(\omega)<\infty$ such that for each $n\ge 1$,
\[ P^{2n}_\omega(0,0)\ \le\ C_1(\omega)\times\begin{cases} n^{-d/2}, & d=2,3,\\ n^{-2}\log n, & d=4,\\ n^{-2}, & d\ge 5. \end{cases} \tag{7.2} \]
Further, for $d\ge 5$, $\lim_{n\to\infty}n^2P^{2n}_\omega(0,0)=0$, $\mathbb P$-a.s.
(ii) Let $d\ge 5$ and $\kappa>1/d$. There exists an i.i.d. law $\mathbb P$ on bounded nearest-neighbor conductances with $p_+=\mathbb P(\mu_e>0)>p_c(d)$ and $C_2(\omega)>0$ such that for a.e. $\omega\in\{|\mathcal C_\infty(0)|=\infty\}$,
\[ P^{2n}_\omega(0,0)\ \ge\ C_2(\omega)\,n^{-2}\exp\big(-(\log n)^\kappa\big),\qquad n\ge 1. \]

(iii) Let $d\ge 5$. For any increasing sequence $\{\lambda_n\}_{n\in\mathbb N}$ with $\lambda_n\to\infty$, there exists an i.i.d. law $\mathbb P$ on bounded nearest-neighbor conductances with $p_+>p_c(d)$ and $C_3(\omega)>0$ such that for a.e. $\omega\in\{|\mathcal C_\infty(0)|=\infty\}$,
\[ P^{2n}_\omega(0,0)\ \ge\ C_3(\omega)\,\frac{n^{-2}}{\lambda_n} \]
along a subsequence that does not depend on $\omega$.

As we can see, Theorem 7.1 (ii), (iii) shows anomalous behavior of the Markov chain for $d\ge 5$. Once (ii) is proved, (iii) follows by a suitable choice of a subsequence. We give the key idea of the proof of (ii) here, and a complete proof in Subsection 7.3. Suppose we can show that for large $n$ there is a box of side length $\ell_n$ centered at the origin such that, in the box, a bond with conductance $1$ (a 'strong' bond) is separated from other sites by bonds with conductance $1/n$ ('weak' bonds), and at least one of the 'weak' bonds is connected to the origin by a path of bonds with conductance $1$ within the box. Then the probability that the walk is back

at the origin at time $n$ is bounded below by the probability that the walk goes directly towards the above place (which costs $e^{-O(\ell_n)}$ of probability), then crosses the weak bond (which costs $1/n$), spends time $n-2\ell_n$ on the strong bond (which costs only $O(1)$ of probability), then crosses a weak bond again (another $1/n$ term), and then goes back to the origin on time (another $e^{-O(\ell_n)}$ term). The cost of this strategy is $O(1)\,e^{-O(\ell_n)}\,n^{-2}$, so if we can take $\ell_n=o(\log n)$ then we get the leading order $n^{-2}$.
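The back-of-the-envelope comparison above can be checked numerically in log-scale: with $\ell_n=(\log n)^\beta$ for some $\beta<1$ (so that $\ell_n=o(\log n)$), the strategy cost $n^{-2}e^{-O(\ell_n)}$ eventually dominates both the Gaussian on-diagonal decay $n^{-d/2}$ for $d\ge5$ and the bound $n^{-2}\exp(-(\log n)^\kappa)$ for any $\kappa>\beta$. The constants in the sketch below ($C=1$, $\beta=0.9$, $\kappa=0.95$, $d=5$) are illustrative choices, not taken from the text.

```python
# Log-scale comparison of the return-probability strategy cost
# n^{-2} * exp(-C * l_n) with l_n = (log n)^beta, beta < 1 (so l_n = o(log n)).
# All constants (C = 1, beta = 0.9, kappa = 0.95, d = 5) are toy choices.

def log_strategy_cost(L, C=1.0, beta=0.9):
    """log of n^{-2} e^{-C l_n}, where L = log n and l_n = (log n)^beta."""
    return -2.0 * L - C * L**beta

L = 2000.0  # L = log n, i.e. an astronomically large n
d = 5

# The trap strategy eventually beats the Gaussian on-diagonal decay n^{-d/2} ...
assert log_strategy_cost(L) > -(d / 2.0) * L
# ... and it is itself bounded below by n^{-2} exp(-(log n)^kappa) for kappa > beta.
assert log_strategy_cost(L) > -2.0 * L - L**0.95
```

Working in log-scale is necessary here: the crossover happens only for $\log n$ of order $(2C)^{1/(1-\beta)}$, far beyond floating-point range for the probabilities themselves.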

Quenched invariance principle. For $t\ge0$, let $Y=\{Y_t\}_{t\ge0}$ be either the CSRW or the VSRW, and define
\[ Y^{(\varepsilon)}_t := \varepsilon Y_{t/\varepsilon^2}. \tag{7.3} \]

In [32, 90], the following quenched invariance principle is proved.

Theorem 7.2 (i) Let $\{Y_t\}_{t\ge0}$ be the VSRW. Then $\mathbb P$-a.s. $Y^{(\varepsilon)}$ converges (under $P^0_\omega$) in law to Brownian motion on $\mathbb R^d$ with covariance $\sigma_V^2 I$, where $\sigma_V>0$ is non-random.
(ii) Let $\{Y_t\}_{t\ge0}$ be the CSRW. Then $\mathbb P$-a.s. $Y^{(\varepsilon)}$ converges (under $P^0_\omega$) in law to Brownian motion on $\mathbb R^d$ with covariance $\sigma_C^2 I$, where $\sigma_C^2=\sigma_V^2/(2d\,\mathbb E\mu_e)$.

Case 2. This case is treated in [16] for $d\ge 2$.

Heat kernel estimates. The following heat kernel estimates for the VSRW are proved in [16]. (We do not give the proof here.)

Theorem 7.3 Let $q^\omega_t(x,y)$ be the heat kernel of the VSRW and let $\eta\in(0,1)$. Then there exist constants $c_1,\dots,c_{11}>0$ (depending on $d$ and the distribution of $\mu_e$) and a family of random variables $\{U_x\}_{x\in\mathbb Z^d}$ with
\[ \mathbb P(U_x\ge n)\le c_1\exp(-c_2 n^\eta), \]
such that the following hold.
(a) For all $x,y\in\mathbb Z^d$ and $t>0$, $q^\omega_t(x,y)\le c_3t^{-d/2}$.
(b) For $x,y\in\mathbb Z^d$ and $t>0$ with $|x-y|\vee t^{1/2}\ge U_x$,
\[ q^\omega_t(x,y)\le c_3t^{-d/2}\exp(-c_4|x-y|^2/t)\qquad\text{if } t\ge|x-y|, \]
\[ q^\omega_t(x,y)\le c_3\exp\big(-c_4|x-y|(1\vee\log(|x-y|/t))\big)\qquad\text{if } t\le|x-y|. \]
(c) For $x,y\in\mathbb Z^d$ and $t>0$,
\[ q^\omega_t(x,y)\ge c_5t^{-d/2}\exp(-c_6|x-y|^2/t)\qquad\text{if } t\ge U_x^2\vee|x-y|^{1+\eta}. \]
(d) For $x,y\in\mathbb Z^d$ and $t>0$ with $t\ge c_7\vee|x-y|^{1+\eta}$,
\[ c_8t^{-d/2}\exp(-c_9|x-y|^2/t)\ \le\ \mathbb E[q^\omega_t(x,y)]\ \le\ c_{10}t^{-d/2}\exp(-c_{11}|x-y|^2/t). \]

Quenched invariance principle. For $t\ge0$, define $Y^{(\varepsilon)}_t$ as in (7.3). Then the following quenched invariance principle is proved in [16].

Theorem 7.4 (i) Let $\{Y_t\}_{t\ge0}$ be the VSRW. Then $\mathbb P$-a.s. $Y^{(\varepsilon)}$ converges (under $P^0_\omega$) in law to Brownian motion on $\mathbb R^d$ with covariance $\sigma_V^2I$, where $\sigma_V>0$ is non-random.
(ii) Let $\{Y_t\}_{t\ge0}$ be the CSRW. Then $\mathbb P$-a.s. $Y^{(\varepsilon)}$ converges (under $P^0_\omega$) in law to Brownian motion on $\mathbb R^d$ with covariance $\sigma_C^2I$, where $\sigma_C^2=\sigma_V^2/(2d\,\mathbb E\mu_e)$ if $\mathbb E\mu_e<\infty$, and $\sigma_C=0$ if $\mathbb E\mu_e=\infty$.

Local central limit theorem. In [17], a sufficient condition is given for the quenched local limit theorem to hold (see [47] for a generalization to a sub-Gaussian type local CLT). Using these results, the following local CLT is proved in [16]. (We do not give the proof here.)

Theorem 7.5 Let $q^\omega_t(x,y)$ be the heat kernel of the VSRW and write $k_t(x)=(2\pi t\sigma_V^2)^{-d/2}\exp(-|x|^2/(2\sigma_V^2t))$, where $\sigma_V$ is as in Theorem 7.4 (i). Let $T>0$, and for $x\in\mathbb R^d$ write $[x]=([x_1],\dots,[x_d])$. Then
\[ \lim_{n\to\infty}\ \sup_{x\in\mathbb R^d}\ \sup_{t\ge T}\big|n^{d/2}q^\omega_{nt}(0,[n^{1/2}x])-k_t(x)\big|=0,\qquad\mathbb P\text{-a.s.} \]
The key idea of the proof is as follows: one can prove the parabolic Harnack inequality using Theorem 7.3. This implies the uniform Hölder continuity of $n^{d/2}q^\omega_{nt}(0,[n^{1/2}\,\cdot\,])$, which, together with Theorem 7.4, implies the pointwise convergence. For the case of the simple random walk on supercritical percolation, this local CLT is proved in [17]. Note that in general, when $\mu_e\le c$ (Case 1), such a local CLT does NOT hold, because of the anomalous behavior of the heat kernel and the quenched invariance principle.

More about the CSRW with $\mathbb E\mu_e=\infty$. According to Theorem 7.4 (ii), one does not have the usual central limit theorem for the CSRW with $\mathbb E\mu_e=\infty$, in the sense that the scaled process degenerates as $\varepsilon\to0$. A natural question is what the right scaling order is, and what the scaling limit is. The answers are given in [13, 22] for the case of heavy-tailed environments with $d\ge3$. Let $\{\mu_e\}$ satisfy
\[ \mathbb P(\mu_e\ge c_1)=1,\qquad \mathbb P(\mu_e\ge u)=c_2u^{-\alpha}(1+o(1))\quad\text{as } u\to\infty, \tag{7.4} \]
for some constants $c_1,c_2>0$ and $\alpha\in(0,1]$. In order to state the result, we first introduce the fractional-kinetics (FK) process.

Definition 7.6 Let $\{B_d(t)\}$ be a standard $d$-dimensional Brownian motion started at $0$, and for $\alpha\in(0,1)$, let $\{V_\alpha(t)\}_{t\ge0}$ be an $\alpha$-stable subordinator independent of $\{B_d(t)\}$, which is determined by $\mathbb E[\exp(-\lambda V_\alpha(t))]=\exp(-t\lambda^\alpha)$. Let $V_\alpha^{-1}(s):=\inf\{t: V_\alpha(t)>s\}$ be the right-continuous inverse of $V_\alpha(t)$. We define the fractional-kinetics process $FK_{d,\alpha}$ by

\[ FK_{d,\alpha}(s)=B_d\big(V_\alpha^{-1}(s)\big),\qquad s\in[0,\infty). \]
The FK process is a non-Markovian process which is $\gamma$-Hölder continuous for all $\gamma<\alpha/2$, and it is self-similar, i.e. $FK_{d,\alpha}(\lambda\,\cdot\,)\overset{(d)}{=}\lambda^{\alpha/2}FK_{d,\alpha}(\cdot)$ for all $\lambda>0$. The density $p(t,x)$ of the process started at $0$ satisfies the fractional-kinetics equation
\[ \frac{\partial^\alpha}{\partial t^\alpha}p(t,x)=\frac12\Delta p(t,x)+\delta_0(x)\frac{t^{-\alpha}}{\Gamma(1-\alpha)}. \]
This process is well known in the physics literature; see [110] for details.
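The subdiffusive time change behind the FK process can be illustrated by a toy Monte Carlo experiment (not a simulation of the RCM itself): a one-dimensional walk with i.i.d. Pareto($\alpha$) waiting times makes only about $t^\alpha$ jumps by time $t$, so its mean-square displacement grows like $t^\alpha$ rather than $t$. The choice $\alpha=0.5$, the time horizons and the sample counts below are arbitrary.

```python
import random

# Toy continuous-time random walk (Montroll-Weiss type): i.i.d. Pareto(alpha)
# waiting times with P(tau > u) = u^{-alpha}, +-1 jumps.  By time t the walk
# has made only ~ t^alpha jumps, so E[X_t^2] ~ t^alpha (subdiffusive).

def msd(t_max, n_samples, alpha=0.5, seed=7):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        t, x = 0.0, 0
        while True:
            t += rng.random() ** (-1.0 / alpha)   # Pareto(alpha) waiting time
            if t > t_max:
                break
            x += 1 if rng.random() < 0.5 else -1  # +-1 jump
        total += x * x
    return total / n_samples

m1, m2 = msd(1e2, 4000), msd(1e4, 4000)
ratio = m2 / m1      # near (1e4 / 1e2)^alpha = 10, far below the diffusive 100
assert 3.0 < ratio < 30.0
```

Under the scaling of Theorem 7.7 (i) below ($x\to\varepsilon x$, $t\to t/\varepsilon^{2/\alpha}$), such heavy-tailed trap dynamics is exactly what produces an FK-type limit.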

Theorem 7.7 Let $d\ge3$ and let $\{Y_t\}_{t\ge0}$ be the CSRW of the RCM that satisfies (7.4).
(i) ([13]) Let $\alpha\in(0,1)$ in (7.4) and let $Y^{(\varepsilon)}_t:=\varepsilon Y_{t/\varepsilon^{2/\alpha}}$. Then $\mathbb P$-a.s. $Y^{(\varepsilon)}$ converges (under $P^0_\omega$) in law to a multiple $c\cdot FK_{d,\alpha}$ of the fractional-kinetics process on $D([0,\infty),\mathbb R^d)$, equipped with the topology of uniform convergence on compact subsets of $[0,\infty)$.
(ii) ([22]) Let $\alpha=1$ in (7.4) with $c_1=c_2=1$ and let $Y^{(\varepsilon)}_t:=\varepsilon Y_{t\log(1/\varepsilon)/\varepsilon^2}$. Then $\mathbb P$-a.s. $Y^{(\varepsilon)}$ converges (under $P^0_\omega$) in law to Brownian motion on $\mathbb R^d$ with covariance $\sigma_C^2I$, where $\sigma_C^2=2^{-1}\sigma_V^2>0$.

Remark 7.8 (i) In [24], a scaling limit theorem similar to Theorem 7.7 (i) was shown for the symmetric Bouchaud trap model (BTM) for $d\ge2$. Let $\{\tau_x\}_{x\in\mathbb Z^d}$ be positive i.i.d. random variables and let $a\in[0,1]$ be a parameter. Define a random weight (conductance) by
\[ \mu_{xy}=\tau_x^a\tau_y^a\qquad\text{if } x\sim y, \]
and let $\mu_x=\tau_x$ be the measure. Then the BTM is the CSRW with transition probabilities $\mu_{xy}/\sum_y\mu_{xy}$ and measure $\mu_x$. If $a=0$, then the BTM is a time change of the simple random walk on $\mathbb Z^d$ and it is called the symmetric BTM, while 'non-symmetric' refers to the general case $a\ne0$. (This terminology is a bit confusing: note that the Markov chain for the BTM is symmetric (reversible) w.r.t. $\mu$ for all $a\in[0,1]$.) According to the result in [24], one may expect that Theorem 7.7 (i) holds for $d=2$ as well, with a suitable logarithmic correction in the scaling exponent.
(ii) For $d=1$, the scaling limit is quite different from the FK process. In [52, 25], it is proved that the scaling limit (in the sense of finite-dimensional distributions) of the BTM on $\mathbb R$ is a singular diffusion in a random environment, called the FIN diffusion. The result can be extended to the RCM for $d=1$.
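The reversibility claim in the remark above is elementary to check: with jump rates $r(x,y)=\mu_{xy}/\tau_x$, detailed balance $\tau_x r(x,y)=\tau_y r(y,x)$ holds for every $a$, since both sides equal $\mu_{xy}=\tau_x^a\tau_y^a$. The toy ring, the $\tau$ values and the parameters below are arbitrary choices, not from the text.

```python
import random

# Detailed-balance check for Bouchaud's trap model: conductances
# mu_xy = tau_x^a * tau_y^a, holding measure mu_x = tau_x, jump rates
# r(x, y) = mu_xy / tau_x.  Then tau_x r(x,y) = tau_y r(y,x) = mu_xy
# for EVERY a in [0, 1].  Toy one-dimensional ring.

rng = random.Random(1)
n = 10
tau = [rng.uniform(0.1, 5.0) for _ in range(n)]   # arbitrary positive traps

for a in (0.0, 0.3, 1.0):
    mu = {}
    for x in range(n):
        y = (x + 1) % n                            # nearest neighbours on the ring
        mu[(x, y)] = mu[(y, x)] = tau[x]**a * tau[y]**a
    for (x, y), m in mu.items():
        lhs = tau[x] * (m / tau[x])                # tau_x * r(x, y)
        rhs = tau[y] * (mu[(y, x)] / tau[y])       # tau_y * r(y, x)
        assert abs(lhs - rhs) < 1e-12
```

The point of the computation is that symmetry w.r.t. $\mu$ is automatic for all $a$; only the $a=0$ case is a pure time change of the simple random walk.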

Remark 7.9 Isoperimetric inequalities and heat kernel estimates are very useful for obtaining various properties of the random walk. For the case of supercritical percolation, estimates of mixing times ([29]) and of the Laplace transform of the range of the random walk ([99]) are obtained with the help of isoperimetric inequalities and heat kernel estimates. For two-dimensional supercritical percolation ([21, 39]), and for other models including IICs on trees and the IIC on $\mathbb Z^d$ with $d\ge19$ ([21]), it is proved that two independent random walks started at the same point collide infinitely often, $\mathbb P$-a.s.

Remark 7.10 (i) The RCM is a special case of random walk in random environment (RWRE). The subject of RWRE has a long history; we refer to [35, 111] for overviews of this field.
(ii) For directed random walks in (space-time) random environments, the quenched invariance principle is obtained in [98]. Since the walk enters a new environment at every time step, one can use independence more efficiently, but the process is no longer symmetric (reversible). Because of the directed nature of the environment, one may consider distributions with a drift, for which a CLT is not even expected to hold in general in the undirected setting; see for example [108, 26] for 'pathologies' that may arise.

7.2 Percolation estimates

Consider supercritical bond percolation ($p>p_c(d)$) on $\mathbb Z^d$ for $d\ge2$. In this subsection, we give some percolation estimates that are needed later. We do not give proofs here, but mention the corresponding references. Let $\mathcal C(0)$ be the open cluster containing $0$; for $x,y\in\mathcal C(0)$, let $d_\omega(x,y)$ be the graph distance in $\mathcal C(0)$ and $|x-y|$ the Euclidean distance. The first statement of the next proposition gives a stretched-exponential decay of truncated connectivities, due to [61, Theorem 8.65]. The second statement is a comparison of the graph distance and the Euclidean distance, due to Antal and Pisztora [9, Corollary 1.3].

Proposition 7.11 Let $p>p_c(d)$. Then the following hold.
(i) There exists $c_1=c_1(p)$ such that
\[ \mathbb P_p\big(|\mathcal C(0)|=n\big)\le\exp\big(-c_1n^{(d-1)/d}\big)\qquad\forall n\in\mathbb N. \]
(ii) There exists $c_2=c_2(p,d)>0$ such that the following holds $\mathbb P_p$-almost surely:
\[ \limsup_{|y|\to\infty}\frac{d_\omega(0,y)\,1_{\{0\leftrightarrow y\}}}{|y|}\ \le\ c_2. \]

For $\alpha>0$, denote by $\mathcal C_{\infty,\alpha}$ the set of sites in $\mathbb Z^d$ that are connected to infinity by a path whose edges satisfy $\mu_b\ge\alpha$. The following proposition is due to [32, Proposition 2.3]. A similar estimate for the size of 'holes' in $\mathcal C_\infty$ can be found in [90, Lemma 3.1].

Proposition 7.12 Assume $p_+=\mathbb P(\mu_b>0)>p_c(d)$. Then there exists $c(p_+,d)>0$ such that if $\alpha$ satisfies
\[ \mathbb P(\mu_b\ge\alpha)>p_c(d)\qquad\text{and}\qquad\mathbb P(0<\mu_b<\alpha)<c(p_+,d), \tag{7.5} \]
then $\mathcal C_{\infty,\alpha}\ne\emptyset$ and $\mathcal C_\infty\setminus\mathcal C_{\infty,\alpha}$ has only finite components, $\mathbb P$-a.s. Moreover, if $\mathcal K(x)$ denotes the (possibly empty) component of $\mathcal C_\infty\setminus\mathcal C_{\infty,\alpha}$ containing $x$, then
\[ \mathbb P\big(x\in\mathcal C_\infty,\ \mathrm{diam}\,\mathcal K(x)\ge n\big)\le c_1e^{-c_2n},\qquad n\ge1, \tag{7.6} \]
for some $c_1,c_2>0$. Here 'diam' denotes the diameter in the Euclidean distance on $\mathbb Z^d$.
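The Antal–Pisztora comparison of Proposition 7.11 (ii) is easy to probe numerically (this is an illustration, of course, not a proof): on a finite box one builds the open-bond configuration, runs a breadth-first search, and compares chemical and Euclidean distances. Everything below — box size, $p=0.7$ (well above $p_c=1/2$ on $\mathbb Z^2$), the random seed, and the loose constant $10$ — is an arbitrary toy choice.

```python
import random
from collections import deque

rng = random.Random(0)
N, p = 60, 0.7
open_bond = {}                       # nearest-neighbour bonds in an N x N box
for x in range(N):
    for y in range(N):
        if x + 1 < N:
            open_bond[((x, y), (x + 1, y))] = rng.random() < p
        if y + 1 < N:
            open_bond[((x, y), (x, y + 1))] = rng.random() < p

def bfs(src):
    """Graph (chemical) distances from src within its open cluster."""
    dist, q = {src: 0}, deque([src])
    while q:
        v = q.popleft()
        x, y = v
        for w in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= w[0] < N and 0 <= w[1] < N and w not in dist:
                e = (v, w) if (v, w) in open_bond else (w, v)
                if open_bond[e]:
                    dist[w] = dist[v] + 1
                    q.append(w)
    return dist

# pick a source that lies in a large (spanning) cluster
src = next(s for s in [(2, 2), (5, 9), (11, 3), (7, 17), (20, 20)]
           if len(bfs(s)) > N * N // 2)
dist = bfs(src)
checked = 0
for v, d in dist.items():
    l1 = abs(v[0] - src[0]) + abs(v[1] - src[1])
    if l1 >= 40:
        assert d <= 10 * l1        # chemical distance comparable to Euclidean
        checked += 1
assert checked > 0
```

At $p=0.7$ the observed ratios are in fact much closer to $1$ than to the slack bound $10$ used in the assertion.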

7.3 Proof of some heat kernel estimates

Proof of Theorem 7.1 (ii). For $\kappa>1/d$, let $\varepsilon>0$ be such that $(1+4d\varepsilon)/d<\kappa$. Let $\mathbb P$ be an i.i.d. conductance law on $\{2^{-N}:N\ge0\}^{E_d}$ such that
\[ \mathbb P(\mu_e=1)>p_c(d),\qquad \mathbb P(\mu_e=2^{-N})=cN^{-(1+\varepsilon)},\quad\forall N\ge1, \tag{7.7} \]
where $c=c(\varepsilon)$ is the normalizing constant. Let $e_1$ denote the unit vector in the first coordinate direction. Define the scale $\ell_N=N^{(1+4d\varepsilon)/d}$, and for each $x\in\mathbb Z^d$ let $A_N(x)$ be the event that the configuration near $x$, $y=x+e_1$ and $z=x+2e_1$ is as follows:
(1) $\mu_{yz}=1$ and $\mu_{xy}=2^{-N}$, while every other bond containing $y$ or $z$ has $\mu_e\le2^{-N}$;
(2) $x$ is connected to the boundary of the box of side length $(\log\ell_N)^2$ centered at $x$ by bonds with conductance one.

Since bonds with $\mu_e=1$ percolate and since $\mathbb P(\mu_e\le2^{-N})\asymp N^{-\varepsilon}$, we have
\[ \mathbb P\big(A_N(x)\big)\ge cN^{-[1+(4d-2)\varepsilon]}. \tag{7.8} \]
Now consider a grid $G_N$ of vertices in $[-\ell_N,\ell_N]^d\cap\mathbb Z^d$ that are separated by distance $2(\log\ell_N)^2$. Since $\{A_N(x):x\in G_N\}$ are independent, we have
\[ \mathbb P\Big(\bigcap_{x\in G_N}A_N(x)^c\Big)\le\big(1-cN^{-[1+(4d-2)\varepsilon]}\big)^{|G_N|}\le\exp\Big\{-c\Big(\frac{\ell_N}{(\log\ell_N)^2}\Big)^dN^{-[1+(4d-2)\varepsilon]}\Big\}\le e^{-cN^\varepsilon}, \tag{7.9} \]
so, using Borel–Cantelli, $\bigcap_{x\in G_N}A_N(x)^c$ occurs only for finitely many $N$.
By Proposition 7.11 (i), every connected component of side length $(\log\ell_N)^2$ in $[-\ell_N,\ell_N]^d\cap\mathbb Z^d$ will eventually be connected to $\mathcal C_\infty$ within $[-2\ell_N,2\ell_N]^d\cap\mathbb Z^d$. Summarizing, there exists $N_0=N_0(\omega)$ with $\mathbb P(N_0<\infty)=1$ such that for $N\ge N_0$, $A_N(x)$ occurs for some $x=x_N(\omega)\in[-\ell_N,\ell_N]^d\cap\mathbb Z^d$ that is connected to $0$ by a path (say $\mathrm{Path}_N$) in $[-2\ell_N,2\ell_N]^d$, on which only the last $N_0$ edges (i.e. those close to the origin) may have conductance smaller than one.
Now let $N\ge N_0$ and let $n$ be such that $2^N\le2n<2^{N+1}$. Let $x_N\in[-\ell_N,\ell_N]^d\cap\mathbb Z^d$ be such that $A_N(x_N)$ occurs, and let $r_N$ be the length of $\mathrm{Path}_N$. Let $\alpha=\alpha(\omega)$ be the minimum of $\mu_e$ over the edges $e$ within $N_0$ steps of the origin. The Markov chain moves from $0$ to $x_N$ in time $r_N$ with probability at least $\alpha^{N_0}(2d)^{-r_N}$, and the probability of staying on the bond $(y,z)$ for time $2n-2r_N-2$ is bounded below independently of $\omega$. The transitions across $(x,y)$ cost order $2^{-N}$ each. Hence we have
\[ P^{2n}_\omega(0,0)\ \ge\ c\,\alpha^{2N_0}(2d)^{-2r_N}2^{-2N}. \tag{7.10} \]
By Proposition 7.11 (ii), $r_N\le c\ell_N$ for large $N$. Since $n\asymp2^N$ and $\ell_N\asymp(\log n)^{(1+4d\varepsilon)/d}$ with $(1+4d\varepsilon)/d<\kappa$, we obtain the result. $\square$

7.4 Corrector and quenched invariance principle

Our goal is to prove Theorem 7.2 and Theorem 7.4 assuming the heat kernel estimates. Let us first give an overview of the proof. As usual for the functional central limit theorem, we use a 'corrector'. Let $\varphi=\varphi_\omega:\mathbb Z^d\to\mathbb R^d$ be a harmonic map, so that $M_t=\varphi(Y_t)$ is a $P^0_\omega$-martingale. Let $I$ be the identity map on $\mathbb Z^d$. The corrector is $\chi(x)=(\varphi-I)(x)=\varphi(x)-x$. It is referred to as the 'corrector' because it corrects the non-harmonicity of the position function. For simplicity, let us consider the CLT (instead of the functional CLT) for $Y$. By definition, we have
\[ \frac{Y_t}{t^{1/2}}=\frac{M_t}{t^{1/2}}-\frac{\chi(Y_t)}{t^{1/2}}. \]
Since we can control $\varphi$ (due to the heat kernel estimates), the martingale CLT gives that $M_t/t^{1/2}$ converges weakly to a normal distribution. So all we need is to prove $\chi(Y_t)/t^{1/2}\to0$. This can be done once we have (a) $P^0_\omega(|Y_t|\ge At^{1/2})$ is small, and (b) $|\chi(x)|/|x|\to0$ as $|x|\to\infty$. (a) holds by the heat kernel upper bound, so the key is to prove (b), namely the sublinearity of the corrector. Note that there may be many global harmonic functions, so we should choose one for which (b) holds. As we will see later, we in fact prove the sublinearity of the corrector on $\mathcal C_{\infty,\alpha}$.

We now discuss the details. Let
\[ \Omega=\begin{cases}[0,1]^{E_d} & \text{for Case 1},\\ [1,\infty]^{E_d} & \text{for Case 2}.\end{cases} \]
(Note that one can choose $c=1$ in Case 1 and Case 2.) The conductances $\{\mu_e:e\in E_d\}$ are defined on $(\Omega,\mathbb P)$, and we write $\mu_{\{x,y\}}(\omega)=\omega_{x,y}$ for the coordinate maps. Let $T_x:\Omega\to\Omega$ denote the shift by $x$, namely $(T_z\omega)_{xy}:=\omega_{x+z,y+z}$.
The construction of the corrector is simple and robust. Let $\{Q_{x,y}(\omega):x,y\in\mathbb Z^d\}$ be a family of non-negative random variables such that $Q_{x,y}(T_z\omega)=Q_{x+z,y+z}(\omega)$, which is stationary and ergodic. Assume that there exists $C>0$ such that the following hold:
\[ \sum_{x\in\mathbb Z^d}Q_{0,x}(\omega)\le C\quad\mathbb P\text{-a.e. }\omega,\qquad\text{and}\qquad \mathbb E\Big[\sum_{x\in\mathbb Z^d}Q_{0,x}|x|^2\Big]<\infty. \tag{7.11} \]
The following construction is based on Mathieu and Piatnitski [91].

Theorem 7.13 There exists a function $\chi:\Omega\times\mathbb Z^d\to\mathbb R^d$ that satisfies the following (1)–(3) for $\mathbb P$-a.e. $\omega$.

(1) (Cocycle property) $\chi(\omega,0)=0$ and, for all $x,y\in\mathbb Z^d$,
\[ \chi(\omega,x)-\chi(\omega,y)=\chi(T_y\omega,x-y). \tag{7.12} \]

(2) (Harmonicity) $\varphi_\omega(x):=x+\chi(\omega,x)$ satisfies $\mathcal L_Q\varphi^j_\omega(z)=0$ for all $1\le j\le d$ and $z\in\mathbb Z^d$, where $\varphi^j_\omega$ is the $j$-th coordinate of $\varphi_\omega$ and
\[ (\mathcal L_Qf)(x)=\sum_{y\in\mathbb Z^d}Q_{x,y}(\omega)\big(f(y)-f(x)\big). \tag{7.13} \]

(3) (Square integrability) There exists $C<\infty$ such that for all $x,y\in\mathbb Z^d$,
\[ \mathbb E\big[|\chi(\cdot,y)-\chi(\cdot,x)|^2Q_{x,y}(\cdot)\big]\le C. \tag{7.14} \]

Remark 7.14 (ii) For $\alpha>0$, the corrector is uniquely determined by the properties (1)–(3) in Theorem 7.13 and (3-2) in Proposition 7.16 below. (iii) In [91, 32] the corrector is defined essentially on $\Omega\times B$, where $B=\{e_1,\dots,e_d,-e_1,\dots,-e_d\}$ is the set of unit vectors. However, one can easily extend the domain of the corrector to $\Omega\times\mathbb Z^d$ by using the cocycle property.

Given the construction of the corrector, we proceed as in Biskup and Prescott [32]. We first give a sufficient condition for the (uniform) sublinearity of the corrector in this general setting, and then show that this sufficient condition can be verified in Case 1 and Case 2.

We consider open clusters with respect to the (in general long-range) percolation given by $\{Q_{xy}\}_{x,y\in\mathbb Z^d}$. Assume that the distribution of $\{Q_{xy}\}$ is such that there exists a unique infinite open cluster $\mathbb P$-a.s., and denote it by $\mathcal C_\infty$. We also consider a (random) one-parameter family of infinite sub-clusters, which we denote by $\mathcal C_{\infty,\alpha}$, where we set $\mathcal C_{\infty,0}=\mathcal C_\infty$. We assume $\mathbb P(0\in\mathcal C_{\infty,\alpha})>0$. (The concrete choice of $\mathcal C_{\infty,\alpha}$ will be given later.)
Let $Y=\{Y_t\}_{t\ge0}$ be the VSRW on $\mathcal C_\infty$ (with base measure $\nu_x\equiv1$ for $x\in\mathcal C_\infty$) that corresponds to $\mathcal L_Q$ in (7.13). We introduce the trace of the Markov chain on $\mathcal C_{\infty,\alpha}$ (cf. Subsection 1.3). Define $\sigma_1,\sigma_2,\dots$, the time intervals between successive visits of $Y$ to $\mathcal C_{\infty,\alpha}$, namely, let
\[ \sigma_{j+1}:=\inf\{n\ge1:\ Y_{\sigma_0+\sigma_1+\dots+\sigma_j+n}\in\mathcal C_{\infty,\alpha}\}, \tag{7.15} \]
with $\sigma_0=0$. For each $x,y\in\mathcal C_{\infty,\alpha}$, let $\hat Q_{xy}:=\hat Q^{(\alpha)}_{xy}(\omega)=P^x_\omega(Y_{\sigma_1}=y)$ and define the operator
\[ (\mathcal L_{\hat Q}f)(x):=\sum_{y\in\mathcal C_{\infty,\alpha}}\hat Q_{xy}\big(f(y)-f(x)\big). \tag{7.16} \]

Let $\hat Y=\{\hat Y_t\}_{t\ge0}$ be the continuous-time random walk corresponding to $\mathcal L_{\hat Q}$. The following theorem gives a sufficient condition for the sublinearity of the corrector $\chi_\omega$ on $\mathcal C_{\infty,\alpha}$. Let $\mathbb P_\alpha(\cdot):=\mathbb P(\cdot\mid0\in\mathcal C_{\infty,\alpha})$ and let $\mathbb E_\alpha$ be the expectation w.r.t. $\mathbb P_\alpha$.

Theorem 7.15 Fix $\alpha\ge0$ and suppose that $\chi_\omega:\mathcal C_{\infty,\alpha}\to\mathbb R^d$, $\theta>0$ and $\hat Y$ satisfy the following (1)–(5) for $\mathbb P_\alpha$-a.e. $\omega$:
(1) (Harmonicity) If $\varphi_\omega(x)=(\varphi^1_\omega(x),\dots,\varphi^d_\omega(x)):=x+\chi_\omega(x)$, then $\mathcal L_{\hat Q}\varphi^j_\omega=0$ on $\mathcal C_{\infty,\alpha}$ for $1\le j\le d$.
(2) (Sublinearity on average) For every $\varepsilon>0$,
\[ \lim_{n\to\infty}\frac1{n^d}\sum_{\substack{x\in\mathcal C_{\infty,\alpha}\\|x|\le n}}1_{\{|\chi_\omega(x)|\ge\varepsilon n\}}=0. \tag{7.17} \]
(3) (Polynomial growth) There exists $\theta>0$ such that
\[ \lim_{n\to\infty}\max_{\substack{x\in\mathcal C_{\infty,\alpha}\\|x|\le n}}\frac{|\chi_\omega(x)|}{n^\theta}=0. \tag{7.18} \]
(4) (Diffusive upper bounds) For a deterministic sequence $b_n=o(n^2)$ and a.e. $\omega$,
\[ \sup_{n\ge1}\max_{\substack{x\in\mathcal C_{\infty,\alpha}\\|x|\le n}}\sup_{t\ge b_n}\frac{E^x_\omega|\hat Y_t-x|}{\sqrt t}<\infty, \tag{7.19} \]
\[ \sup_{n\ge1}\max_{\substack{x\in\mathcal C_{\infty,\alpha}\\|x|\le n}}\sup_{t\ge b_n}t^{d/2}P^x_\omega(\hat Y_t=x)<\infty. \tag{7.20} \]

(5) (Control of big jumps) Let $\tau_n=\inf\{t\ge0:|\hat Y_t-\hat Y_0|\ge n\}$. There exist $c_1>1$ and $N_\omega>0$, finite for $\mathbb P_\alpha$-a.e. $\omega$, such that $|\hat Y_{\tau_n}-\hat Y_0|\le c_1n$ for all $t>0$ and $n\ge N_\omega$.
Then, for $\mathbb P_\alpha$-a.e. $\omega$,
\[ \lim_{n\to\infty}\max_{\substack{x\in\mathcal C_{\infty,\alpha}\\|x|\le n}}\frac{|\chi_\omega(x)|}{n}=0. \tag{7.21} \]

We now discuss how to apply the theorem to Case 1 and Case 2.

Case 1: In this case, we define $Q_{x,y}(\omega)=\omega_{x,y}$. Then it satisfies the conditions for $\{Q_{xy}\}$ given above, including (7.11). Denote by $\mathcal L_\omega$ the generator of the VSRW (which we denote by $\{Y_t\}_{t\ge0}$), i.e.,
\[ (\mathcal L_\omega f)(x)=\sum_{y:\,y\sim x}\omega_{xy}\big(f(y)-f(x)\big). \tag{7.22} \]
So $\mathcal L_Q=\mathcal L_\omega$ in this case. The infinite cluster $\mathcal C_\infty$ is the cluster for $\{b\in E_d:\mu_b>0\}$ (since we assumed $p_+=\mathbb P(\mu_b>0)>p_c(d)$, this percolation is supercritical, so there exists a unique infinite cluster). Fix $\alpha>0$ that satisfies (7.5), and let $\mathcal C_{\infty,\alpha}$ be the cluster for $\{b\in E_d:\mu_b\ge\alpha\}$. (Again this percolation is supercritical, so there exists a unique infinite cluster.) We let $\hat Q_{xy}:=P^x_\omega(Y_{\sigma_1}=y)$. Note that although $\hat Y$ may jump over the 'holes' of $\mathcal C_{\infty,\alpha}$, Proposition 7.12 shows that all jumps are finite. Let $\varphi_\omega(x):=x+\chi(\omega,x)$, where $\chi$ is the corrector on $\mathcal C_\infty$. Then, by the optional stopping theorem, it is easy to see that $\mathcal L_{\hat Q}\varphi_\omega=0$ on $\mathcal C_{\infty,\alpha}$ (see Lemma 7.21).
Case 2: First, note that if we define $Q_{x,y}(\omega)$ as in Case 1, it does not satisfy (7.11). Especially when $\mathbb E\mu_e=\infty$, we cannot define correctors for $\{Y_t\}_{t\ge0}$ in the usual manner. The idea of [16] is to discretize $\{Y_t\}_{t\ge0}$ and to construct the corrector for the discretized process $\{Y_{[t]}\}_{t\ge0}$. Let $q^\omega_t(x,y)$ be the heat kernel of $\{Y_t\}_{t\ge0}$; note that $q^\omega_t(x,y)=P^x_\omega(Y_t=y)=q^\omega_t(y,x)$. We define
\[ Q_{x,y}(\omega)=q^\omega_1(x,y),\qquad\forall x,y\in\mathbb Z^d. \]
Then $Q_{x,y}\le1$ and $\sum_yQ_{x,y}=1$. Note that in this case $\mathbb P(Q_{xy}>0)>0$ for all $x,y\in\mathbb Z^d$, so $\mathcal C_\infty=\mathbb Z^d$. Integrating Theorem 7.3 (b), we have $\mathbb E[E^0_\omega|Y_1|^2]=\mathbb E[\sum_xQ_{0,x}|x|^2]<\infty$, so (7.11) holds. One can easily check the other conditions for $\{Q_{xy}\}$ given above. In this case, we do not need to consider $\mathcal C_{\infty,\alpha}$, and there is no need to take the trace process. So $\alpha=0$, $\hat Q_{x,y}=Q_{x,y}$, $\mathcal L_{\hat Q}=\mathcal L_Q$, and $\hat Y_t=Y_{[t]}$ (a discrete-time random walk) in this case.
The next proposition provides some additional properties of the corrector for Case 1 and Case 2. Together with Lemma 7.28 below, this verifies, for Case 1 and Case 2, the sufficient conditions for the sublinearity of the corrector given in Theorem 7.15. As mentioned above, for Case 2 we consider only $\alpha=0$.

Proposition 7.16 Let $\alpha>0$ for Case 1 and $\alpha=0$ for Case 2, and let $\chi$ be the corrector given in Theorem 7.13. Then $\chi$ satisfies (2), (3) in Theorem 7.13 for $\mathbb P_\alpha$-a.e. $\omega$ if $\mathbb P(0\in\mathcal C_{\infty,\alpha})>0$. Further, it satisfies the following.
(1) (Polynomial growth) There exists $\theta>d$ such that the following holds $\mathbb P_\alpha$-a.e. $\omega$:
\[ \lim_{n\to\infty}\max_{\substack{x\in\mathcal C_{\infty,\alpha}\\|x|\le n}}\frac{|\chi(\omega,x)|}{n^\theta}=0. \tag{7.23} \]
(2) (Sublinearity on average) For each $\varepsilon>0$, the following holds $\mathbb P_\alpha$-a.e. $\omega$:
\[ \lim_{n\to\infty}\frac1{n^d}\sum_{\substack{x\in\mathcal C_{\infty,\alpha}\\|x|\le n}}1_{\{|\chi(\omega,x)|\ge\varepsilon n\}}=0. \]
(3-1) Case 1: (Zero mean under random shifts) Let $Z:\Omega\to\mathbb Z^d$ be a random variable such that (a) $Z(\omega)\in\mathcal C_{\infty,\alpha}(\omega)$; (b) $\mathbb P_\alpha$ is preserved by $\omega\mapsto\tau_{Z(\omega)}(\omega)$; and (c) $\mathbb E_\alpha[d^{(\alpha)}_\omega(0,Z(\omega))^q]<\infty$ for some $q>3d$. Then $\chi(\cdot,Z(\cdot))\in L^1(\Omega,\mathcal F,\mathbb P_\alpha)$ and
\[ \mathbb E_\alpha\big[\chi(\cdot,Z(\cdot))\big]=0. \tag{7.24} \]
(3-2) Case 2: (Zero mean) $\mathbb E[\chi(\cdot,x)]=0$ for all $x\in\mathbb Z^d$.

7.5 Construction of the corrector

In this subsection, we prove Theorem 7.13. We follow the arguments of Barlow and Deuschel [16, Section 5] (see also [32, 33, 91]). We define the process that gives the 'environment seen from the particle' by

\[ Z_t=T_{Y_t}(\omega),\qquad\forall t\in[0,\infty), \tag{7.25} \]
where $\{Y_t\}_{t\ge0}$ is the Markov chain corresponding to $\mathcal L_Q$. Note that the process $Z$ is ergodic under the time shift on $\Omega$ (see, for example, [30, Section 3], [50, Lemma 4.9] for the proof in discrete time). Let $L^2=L^2(\Omega,\mathbb P)$ and for $F\in L^2$ write $F_x=F\circ T_x$. Then the generator of $Z$ is
\[ \hat LF(\omega)=\sum_{x\in\mathbb Z^d}Q_{0,x}(\omega)\big(F_x(\omega)-F(\omega)\big). \]
Define
\[ \hat{\mathcal E}(F,G)=\mathbb E\Big[\sum_{x\in\mathbb Z^d}Q_{0,x}(F-F_x)(G-G_x)\Big],\qquad F,G\in L^2. \]
The following lemma shows that $\hat{\mathcal E}$ is, up to a constant factor, the quadratic form on $L^2$ corresponding to $\hat L$.

Lemma 7.17 (i) For all $F\in L^2$ and $x\in\mathbb Z^d$, it holds that $\mathbb EF=\mathbb EF_x$ and $\mathbb E[Q_{0,x}F_x]=\mathbb E[Q_{0,-x}F]$.
(ii) For $F\in L^2$, it holds that $\hat{\mathcal E}(F,F)<\infty$ and $\hat LF\in L^2$.
(iii) For $F,G\in L^2$, it holds that $\hat{\mathcal E}(F,G)=-2\,\mathbb E[G\hat LF]$.

Proof. (i) The first equality holds because $\mathbb P=\mathbb P\circ T_x$. Since $Q_{0,x}\circ T_{-x}=Q_{-x,0}=Q_{0,-x}$, we have $\mathbb E[Q_{0,x}F_x]=\mathbb E[(Q_{0,x}\circ T_{-x})F]=\mathbb E[Q_{0,-x}F]$, so the second equality holds.
(ii) For $F\in L^2$, we have
\[ \hat{\mathcal E}(F,F)=\mathbb E\Big[\sum_xQ_{0,x}(F-F_x)^2\Big]\le2\,\mathbb E\Big[\sum_xQ_{0,x}(F^2+F_x^2)\Big]=2C\,\mathbb EF^2+2\,\mathbb E\Big[\sum_xQ_{0,-x}F^2\Big]\le4C\|F\|_2^2, \tag{7.26} \]
where we used (i) in the second equality, and $C$ is the constant in (7.11). Also,
\[ \mathbb E|\hat LF|^2=\mathbb E\Big[\sum_{x,y}Q_{0,x}Q_{0,y}(F_x-F)(F_y-F)\Big]\le\mathbb E\Big[\Big(\sum_{x,y}Q_{0,x}Q_{0,y}(F_x-F)^2\Big)^{1/2}\Big(\sum_{x,y}Q_{0,x}Q_{0,y}(F_y-F)^2\Big)^{1/2}\Big]\le C\hat{\mathcal E}(F,F)\le4C^2\|F\|_2^2, \]
where (7.26) is used in the last inequality. We thus obtain (ii).
(iii) Using (i), we have
\[ \mathbb E[Q_{0,-x}G(F-F_{-x})]=\mathbb E[Q_{0,x}G_x(F_x-F)]. \tag{7.27} \]
So
\[ \mathbb E[G\hat LF]=-\sum_x\mathbb E[GQ_{0,x}(F-F_x)]=-\frac12\sum_x\Big(\mathbb E[GQ_{0,x}(F-F_x)]+\mathbb E[GQ_{0,-x}(F-F_{-x})]\Big)=-\frac12\sum_x\mathbb E\big[Q_{0,x}(GF-GF_x+G_xF_x-G_xF)\big]=-\frac12\hat{\mathcal E}(F,G), \]
where (7.27) is used in the third equality, and (iii) is proved. $\square$

Next we look at 'vector fields'. Let $M$ be the measure on $\Omega\times\mathbb Z^d$ defined by
\[ \int_{\Omega\times\mathbb Z^d}G\,dM:=\mathbb E\Big[\sum_{x\in\mathbb Z^d}Q_{0,x}G(\cdot,x)\Big],\qquad\text{for }G:\Omega\times\mathbb Z^d\to\mathbb R. \]
We say that $G:\Omega\times\mathbb Z^d\to\mathbb R$ has the cocycle property (or shift-covariance property) if the following holds:
\[ G(T_x\omega,y-x)=G(\omega,y)-G(\omega,x),\qquad\mathbb P\text{-a.s.} \]
Let $\mathbb L^2=\{G\in L^2(\Omega\times\mathbb Z^d,M):G\text{ has the cocycle property}\}$, and for $G\in L^2(\Omega\times\mathbb Z^d,M)$ write $\|G\|^2_{\mathbb L^2}:=\mathbb E[\sum_{x\in\mathbb Z^d}Q_{0,x}G(\cdot,x)^2]$. It is easy to see that $\mathbb L^2$ is a Hilbert space. Also, for $G\in\mathbb L^2$, it holds that $G(\omega,0)=0$ and $G(T_x\omega,-x)=-G(\omega,x)$.
Define $\nabla:L^2\to\mathbb L^2$ by
\[ \nabla F(\omega,x):=F(T_x\omega)-F(\omega),\qquad\text{for }F\in L^2. \]
Since $\|\nabla F\|^2_{\mathbb L^2}=\hat{\mathcal E}(F,F)<\infty$ (due to Lemma 7.17 (ii)) and $\nabla F(T_x\omega,y-x)=F(T_y\omega)-F(T_x\omega)=\nabla F(\omega,y)-\nabla F(\omega,x)$, we can see that $\nabla F$ is indeed in $\mathbb L^2$.

Now we introduce an orthogonal decomposition of $\mathbb L^2$ as follows:
\[ \mathbb L^2=L^2_{\mathrm{pot}}\oplus L^2_{\mathrm{sol}}. \]
Here $L^2_{\mathrm{pot}}=\mathrm{Cl}\{\nabla F:F\in L^2\}$, where the closure is taken in $\mathbb L^2$, and $L^2_{\mathrm{sol}}$ is the orthogonal complement of $L^2_{\mathrm{pot}}$. (Note that 'pot' stands for 'potential' and 'sol' stands for 'solenoidal'.)
Before giving some properties of $L^2_{\mathrm{pot}}$ and $L^2_{\mathrm{sol}}$, we now give the definition of the corrector. Let $\Pi:\Omega\times\mathbb Z^d\to\mathbb R^d$, $\Pi(\omega,x)=x$, be the position map, and denote by $\Pi_j$ the $j$-th coordinate of $\Pi$. Then $\Pi_j\in\mathbb L^2$: indeed, $\Pi_j(\omega,y-x)=\Pi_j(\omega,y)-\Pi_j(\omega,x)$, so it has the cocycle property, and by (7.11), $\|\Pi_j\|_{\mathbb L^2}<\infty$. Now define $\chi_j\in L^2_{\mathrm{pot}}$ and $\Psi_j\in L^2_{\mathrm{sol}}$ by
\[ \Pi_j=-\chi_j+\Psi_j. \]
This gives the definition of the corrector $\chi=(\chi_1,\dots,\chi_d):\Omega\times\mathbb Z^d\to\mathbb R^d$.

Remark 7.18 The corrector can also be defined by using spectral theory, as in Kipnis and Varadhan [80]. The 'projection' definition above is given in Giacomin, Olla and Spohn [58], and in Mathieu and Piatnitski [91] (see also [16, 32, 33] mentioned above).
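In dimension one the harmonic coordinate, and hence the corrector, can be written down explicitly, which gives a quick sanity check of the picture above: on a chain with edge conductances $\mu_i$, the harmonic coordinate has increments proportional to $1/\mu_i$, normalized by the harmonic mean. The finite chain, the conductance range and the tolerance below are arbitrary toy choices, not from the text.

```python
import random

# 1d illustration: for conductances mu_i on edges (i, i+1), the harmonic
# coordinate has increments phi(i+1) - phi(i) = c / mu_i, with c the harmonic
# mean of the mu_i.  Then, at every interior vertex,
#   (L phi)(i) = mu_i (phi(i+1) - phi(i)) + mu_{i-1} (phi(i-1) - phi(i)) = 0,
# and the corrector chi(i) = phi(i) - i stays sublinear along the chain.

rng = random.Random(3)
n = 1000
mu = [rng.uniform(0.5, 2.0) for _ in range(n)]     # edge conductances
c = n / sum(1.0 / m for m in mu)                   # harmonic mean
phi = [0.0]
for m in mu:
    phi.append(phi[-1] + c / m)

# phi is harmonic for the conductance Laplacian at every interior vertex
for i in range(1, n):
    lap = mu[i] * (phi[i + 1] - phi[i]) + mu[i - 1] * (phi[i - 1] - phi[i])
    assert abs(lap) < 1e-9

chi = [phi[i] - i for i in range(n + 1)]
assert abs(chi[0]) < 1e-12 and abs(chi[n]) < 1e-8  # vanishes at both ends
assert max(abs(v) for v in chi) < 0.2 * n          # sublinear across the chain
```

The zero-mean increments $c/\mu_i-1$ of $\chi$ are exactly the one-dimensional analogue of the zero-mean property (3-1)/(3-2) in Proposition 7.16 below.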

Lemma 7.19 For $G\in L^2_{\mathrm{sol}}$, it holds that
\[ \sum_{x\in\mathbb Z^d}Q_{0,x}G(\omega,x)=0,\qquad\mathbb P\text{-a.s.} \tag{7.28} \]
Hence $M_n:=G(\omega,Y_n)$ is a $P^0_\omega$-martingale for $\mathbb P$-a.e. $\omega$.

Proof. Recall that $G(T_x\omega,-x)=-G(\omega,x)$ for $G\in\mathbb L^2$. Using this and the invariance of $\mathbb P$ under $T_{-x}$, we have, for each $F\in L^2$,
\[ \sum_x\mathbb E[Q_{0,x}G(\cdot,x)F_x(\cdot)]=\sum_x\mathbb E[(Q_{0,x}\circ T_{-x})\,G(T_{-x}\cdot,x)\,F(\cdot)]=\sum_x\mathbb E[Q_{0,-x}(-G(\cdot,-x))F(\cdot)]=-\sum_x\mathbb E[Q_{0,x}G(\cdot,x)F(\cdot)]. \]
So $\sum_x\mathbb E[Q_{0,x}G(\cdot,x)(F+F_x)(\cdot)]=0$. If $G\in L^2_{\mathrm{sol}}$, then, since $\nabla F\in L^2_{\mathrm{pot}}$,
\[ 0=\int_{\Omega\times\mathbb Z^d}G\,\nabla F\,dM=\sum_x\mathbb E[Q_{0,x}G(\cdot,x)(F_x-F)]. \]
So $\mathbb E[\sum_xQ_{0,x}G(\cdot,x)F]=0$. Since this holds for all $F\in L^2$, we obtain (7.28). Since
\[ E^0_\omega[G(\omega,Y_{n+1})-G(\omega,Y_n)\mid Y_n=x]=\sum_yQ_{xy}(\omega)\big(G(\omega,y)-G(\omega,x)\big)=\sum_yQ_{0,y-x}(T_x\omega)G(T_x\omega,y-x)=0, \]
where the cocycle property is used in the second equality and (7.28) is used in the last equality, we see that $M_n:=G(\omega,Y_n)$ is a $P^0_\omega$-martingale. $\square$

Verification of (1)–(3) in Theorem 7.13: (1) and (3) in Theorem 7.13 are clear from the definition of $\chi$ and the definition of the inner product on $L^2(\Omega\times\mathbb Z^d,M)$. By Lemma 7.19, we see that (2) in Theorem 7.13 holds. Here, note that by the cocycle property of $\varphi$,
\[ (\mathcal L_Q\varphi^j_\omega)(x)=\sum_{y\in\mathbb Z^d}Q_{x,y}(\omega)\big(\varphi^j_\omega(y)-\varphi^j_\omega(x)\big)=\sum_{y\in\mathbb Z^d}Q_{0,y-x}(T_x\omega)\varphi^j_{T_x\omega}(y-x)=0 \]
for $1\le j\le d$, where the last equality is due to (7.28). $\square$

In [33, Section 5], there is a nice survey of the potential theory behind the notion of the corrector. There it is shown that $\Pi$ (here $\Pi$ is the identity map) spans $L^2_{\mathrm{pot}}$ (see [33, Corollary 5.6]).

7.6 Proof of Theorem 7.15

The basic idea of the proof of Theorem 7.15 is that sublinearity on average plus heat kernel upper bounds imply pointwise sublinearity; this idea is (according to [32]) due to Y. Peres. In Theorem 7.15, the inputs that should come from the heat kernel upper bounds are reduced to the assumptions (7.19), (7.20). In order to prove Theorem 7.15, the next lemma plays a key role.

Lemma 7.20 Let
\[ \bar R_n:=\max_{\substack{x\in\mathcal C_{\infty,\alpha}\\|x|\le n}}|\chi_\omega(x)|. \]
Under the conditions (1), (2), (4), (5) of Theorem 7.15, for each $\varepsilon>0$ and $\delta>0$ there exist $c_1>1$ and a random variable $n_0=n_0(\omega,\varepsilon,\delta)$, which is a.s. finite, such that
\[ \bar R_n\le\varepsilon n+\delta\bar R_{2c_1n},\qquad n\ge n_0. \tag{7.29} \]
Before proving this lemma, let us prove Theorem 7.15 by showing how this lemma and (7.18) imply (7.21).

Proof of Theorem 7.15. Suppose that $\bar R_n/n\not\to0$, and choose $c$ with $0<c<\limsup_{n\to\infty}\bar R_n/n$. Apply Lemma 7.20 with $\varepsilon:=c/2$ and $\delta:=\frac12(2c_1)^{-\theta}$. Then, for $n\ge n_0$ with $\bar R_n\ge cn$, (7.29) gives
\[ \bar R_{2c_1n}\ \ge\ \delta^{-1}(\bar R_n-\varepsilon n)\ \ge\ (2c_1)^\theta cn, \]
and, inductively (choosing $\delta$ smaller if necessary), $\bar R_{(2c_1)^kn}\ge(2c_1)^{k\theta}cn$. However, (7.18) says that $\bar R_{(2c_1)^kn}/(2c_1)^{k\theta}\to0$ as $k\to\infty$ for each fixed $n$, which is a contradiction. $\square$

We now prove Lemma 7.20. The idea behind the proof is the following. Let $\{\hat Y_t\}$ be the process corresponding to $\mathcal L_{\hat Q}$, started at the maximizer of $\bar R_n$, say $z_*$. Using the martingale property, we estimate $E^{z_*}_\omega[\chi_\omega(\hat Y_t)]$ for times $t=o(n^2)$. The right-hand side of (7.29) expresses the two situations that may occur at time $t$: (i) $|\chi_\omega(\hat Y_t)|\le\varepsilon n$ (by 'sublinearity on average', this happens with high probability); (ii) $\hat Y$ has not yet left the box $[-2c_1n,2c_1n]^d$, and so $|\chi_\omega(\hat Y_t)|\le\bar R_{2c_1n}$. It turns out that these two are the dominating strategies.

Proof of Lemma 7.20. Fix $\varepsilon,\delta>0$ and let $z_*$ be the site where the maximum $\bar R_n$ is achieved. Denote
\[ \mathcal O_n:=\big\{x\in\mathcal C_{\infty,\alpha}:\ |x|\le n,\ |\chi_\omega(x)|\ge\tfrac12\varepsilon n\big\}. \]
Recall that $\{\hat Y_t\}$ is the continuous-time random walk on $\mathcal C_{\infty,\alpha}$ corresponding to $\mathcal L_{\hat Q}$ in (7.16). We denote its expectation for the walk started at $z_*$ by $E^{z_*}_\omega$. Define
\[ S_n:=\inf\{t\ge0:\ |\hat Y_t-z_*|\ge2n\}. \]
Note that, by Theorem 7.15 (5), there exists $n_1(\omega)$, which is a.s. finite, such that $|\hat Y_{t\wedge S_n}-z_*|\le2c_1n$ for all $t>0$ and $n\ge n_1(\omega)$. Using the harmonicity of $x\mapsto x+\chi_\omega(x)$ and the optional stopping theorem, we have
\[ \bar R_n=|\chi_\omega(z_*)|\le E^{z_*}_\omega\Big[\big|\chi_\omega(\hat Y_{t\wedge S_n})+\hat Y_{t\wedge S_n}-z_*\big|\Big]. \tag{7.30} \]
We will consider $t$ satisfying
\[ t\ge b_{3c_1n}, \tag{7.31} \]
where $b_n=o(n^2)$ is the sequence in Theorem 7.15 (4), and we estimate the expectation in (7.30) separately on $\{S_n<t\}$ and $\{S_n\ge t\}$. On $\{S_n<t\}$ we have $|\chi_\omega(\hat Y_{t\wedge S_n})|\le\bar R_{2c_1n}$ and $|\hat Y_{t\wedge S_n}-z_*|\le2c_1n$; moreover, the Markov property at time $S_n$, the bound $|\hat Y_{2t}-z_*|\ge n$ on a set of conditional probability bounded below, (7.19) and the Markov inequality together give $P^{z_*}_\omega(S_n<t)\le(c_2+c_3)\sqrt t/n$. Hence
\[ E^{z_*}_\omega\Big[\big|\chi_\omega(\hat Y_{t\wedge S_n})+\hat Y_{t\wedge S_n}-z_*\big|\,1_{\{S_n<t\}}\Big]\le(\bar R_{2c_1n}+2c_1n)\,\frac{(c_2+c_3)\sqrt t}n. \tag{7.32} \]

By (7.19), the second term is at most $c_4\sqrt t$ for some $c_4=c_4(\omega)$, provided $t\ge b_n$. The first term can be estimated depending on whether $\hat Y_t\in\mathcal O_{2n}$ or not:
\[ E^{z_*}_\omega\big[|\chi_\omega(\hat Y_t)|\,1_{\{S_n\ge t\}}\big]\le\frac{\varepsilon n}2+\bar R_{2c_1n}P^{z_*}_\omega(\hat Y_t\in\mathcal O_{2n}). \]
For the probability of $\hat Y_t\in\mathcal O_{2n}$ we get

\[ P^{z_*}_\omega(\hat Y_t\in\mathcal O_{2n})=\sum_{x\in\mathcal O_{2n}}P^{z_*}_\omega(\hat Y_t=x)\le\sum_{x\in\mathcal O_{2n}}P^{z_*}_\omega(\hat Y_t=z_*)^{1/2}P^x_\omega(\hat Y_t=x)^{1/2}\le c_5\,\frac{|\mathcal O_{2n}|}{t^{d/2}}, \]
where $c_5=c_5(\omega)$ denotes the supremum in (7.20). Here, in the first inequality we used the Cauchy–Schwarz inequality (together with the semigroup property and reversibility), and the second inequality is due to (7.20).
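The Cauchy–Schwarz bound $p_t(x,y)\le\sqrt{p_t(x,x)\,p_t(y,y)}$ for symmetric kernels, which drives the estimate above, can be verified numerically on any finite symmetric chain. The ring below, its size, laziness and the number of steps are arbitrary toy choices, not a model from the text.

```python
import numpy as np

# For a symmetric (doubly stochastic) transition matrix P and an even number
# of steps t = 2s, the semigroup property and Cauchy-Schwarz give
#   P^t(x, y) = sum_z P^s(x, z) P^s(z, y) <= sqrt(P^t(x, x) * P^t(y, y)).
# Toy lazy symmetric walk on a ring.

n = 12
P = np.zeros((n, n))
for i in range(n):
    P[i, (i + 1) % n] = P[i, (i - 1) % n] = 0.3
    P[i, i] = 0.4                      # laziness => aperiodic, still symmetric
assert np.allclose(P, P.T)

Pt = np.linalg.matrix_power(P, 8)      # even number of steps
for x in range(n):
    for y in range(n):
        assert Pt[x, y] <= np.sqrt(Pt[x, x] * Pt[y, y]) + 1e-12
```

In the proof this is exactly what converts the on-diagonal bound (7.20) into control of the off-diagonal probabilities $P^{z_*}_\omega(\hat Y_t=x)$.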

Combining the above, we obtain
\[ E^{z_*}_\omega\Big[\big|\chi_\omega(\hat Y_{t\wedge S_n})+\hat Y_{t\wedge S_n}-z_*\big|\,1_{\{S_n\ge t\}}\Big]\le c_4\sqrt t+\frac{\varepsilon n}2+\bar R_{2c_1n}c_5\,\frac{|\mathcal O_{2n}|}{t^{d/2}}. \tag{7.33} \]
By (7.32) and (7.33), we conclude that

\[ \bar R_n\le(\bar R_{2c_1n}+2c_1n)\,\frac{(c_2+c_3)\sqrt t}n+c_4\sqrt t+\frac{\varepsilon n}2+\bar R_{2c_1n}c_5\,\frac{|\mathcal O_{2n}|}{t^{d/2}}. \tag{7.34} \]
Since $|\mathcal O_{2n}|=o(n^d)$ as $n\to\infty$ due to (7.17), we can choose $t:=\xi n^2$ with $\xi>0$ so small that (7.31) applies and (7.29) holds for the given $\varepsilon$ and $\delta$ once $n$ is sufficiently large. $\square$

7.7 Proof of Proposition 7.16 In this subsection, we prove Proposition 7.16 for Case 1 and Case 2 separately.

7.7.1 Case 1

For Case 1, we need to work on $\mathcal C_{\infty,\alpha}$, but we do not need to use Proposition 7.12 seriously yet (although we use it lightly in (7.39)). By definition, it is clear that $\chi$ satisfies Theorem 7.13 (3) for $\mathbb P_\alpha$, for all $\alpha\ge0$ with $\mathbb P(0\in\mathcal C_{\infty,\alpha})>0$. The next lemma proves Theorem 7.13 (2) for $\mathbb P_\alpha$.

Lemma 7.21 Let $\chi$ be the corrector on $\mathcal C_\infty$ and define $\varphi_\omega(x)=(\varphi^1_\omega(x),\dots,\varphi^d_\omega(x)):=x+\chi(\omega,x)$. Then $\mathcal L_{\hat Q}\varphi^j_\omega(x)=0$ for all $x\in\mathcal C_{\infty,\alpha}$ and $1\le j\le d$.

Proof. Note that $(\mathcal L_{\hat Q}\varphi_\omega)(x)=E^x_\omega[\varphi_\omega(Y_{\sigma_1})]-\varphi_\omega(x)$. Also, since $\{Y_t\}_{t\ge0}$ moves in a finite component of $\mathcal C_\infty\setminus\mathcal C_{\infty,\alpha}$ for $t\in[0,\sigma_1]$, $\varphi_\omega(Y_t)$ is bounded there. Since $\{\varphi_\omega(Y_t)\}_{t\ge0}$ is a martingale and $\sigma_1<\infty$ a.s., the optional stopping theorem gives $E^x_\omega[\varphi_\omega(Y_{\sigma_1})]=\varphi_\omega(x)$. $\square$

Proof of Proposition 7.16 (1). In Case 1, the result holds for all $\theta>d$, as we shall prove. Let $\theta>d$ and write
\[ R_n:=\max_{\substack{x\in\mathcal C_{\infty,\alpha}\\|x|\le n}}|\chi(\omega,x)|. \]
By Proposition 7.11 (ii),
\[ \lambda(\omega):=\sup_{x\in\mathcal C_{\infty,\alpha}}\frac{d^{(\alpha)}_\omega(0,x)}{|x|}<\infty,\qquad\mathbb P_\alpha\text{-a.s.}, \tag{7.35} \]
where $d^{(\alpha)}_\omega(x,y)$ denotes the graph distance between $x$ and $y$ on $\mathcal C_{\infty,\alpha}$. So it is enough to prove $R_n/n^\theta\to0$ on $\{\lambda(\omega)\le\lambda\}$ for all $\lambda<\infty$. Note that on $\{\lambda(\omega)\le\lambda\}$, every $x\in\mathcal C_{\infty,\alpha}\cap[-n,n]^d$ can be connected to the origin by a path inside $\mathcal C_{\infty,\alpha}\cap[-\lambda n,\lambda n]^d$. Using this fact and $\chi(\omega,0)=0$, we have, on $\{\lambda(\omega)\le\lambda\}$,
\[ R_n\le\sum_{\substack{x\in\mathcal C_{\infty,\alpha}\\|x|\le\lambda n}}\sum_{b\in B}1_{\{x,x+b\in\mathcal C_{\infty,\alpha}\}}\big|\chi(\omega,x+b)-\chi(\omega,x)\big|\le\sum_{\substack{x\in\mathcal C_{\infty,\alpha}\\|x|\le\lambda n}}\sum_{b\in B}\sqrt{\frac{\omega_{x,x+b}}\alpha}\,\big|\chi(\omega,x+b)-\chi(\omega,x)\big|, \tag{7.36} \]
where $B=\{e_1,\dots,e_d,-e_1,\dots,-e_d\}$ is the set of unit vectors. Using the Cauchy–Schwarz inequality and (7.14), we get
\[ \mathbb E_\alpha\big[(R_n1_{\{\lambda(\omega)\le\lambda\}})^2\big]\le\frac{2d(\lambda n)^d}\alpha\,\mathbb E_\alpha\Big[\sum_{\substack{x\in\mathcal C_{\infty,\alpha}\\|x|\le\lambda n}}\sum_{b\in B}\omega_{x,x+b}\big|\chi(\omega,x+b)-\chi(\omega,x)\big|^2\Big]\le Cn^{2d} \tag{7.37} \]
for some constant $C=C(\alpha,\lambda,d)<\infty$. Applying Chebyshev's inequality, for $\theta'\in(d,\theta)$,
\[ \mathbb P_\alpha\big(R_n1_{\{\lambda(\omega)\le\lambda\}}\ge n^{\theta'}\big)\le\frac C{n^{2(\theta'-d)}}. \]
Taking $n=2^k$ and using Borel–Cantelli, we have $R_{2^k}/2^{k\theta'}\le C$ on $\{\lambda(\omega)\le\lambda\}$ for large $k$, so $R_{2^k}/2^{k\theta}\to0$ a.e. Since $R_{2^k}/2^{(k+1)\theta}\le R_n/n^\theta\le2^\theta R_{2^{k+1}}/2^{(k+1)\theta}$ for $2^k\le n\le2^{k+1}$, we have $R_n/n^\theta\to0$ a.s. on $\{\lambda(\omega)\le\lambda\}$. $\square$

We will need the following lemma, which is easy to prove using the Hölder inequality (see [30, Lemma 4.5] for the proof).

Lemma 7.22 Let $p>1$ and $r\in[1,p)$. Let $\{X_n\}_{n\in\mathbb N}$ be random variables such that $\sup_{j\ge1}\|X_j\|_p<\infty$, and let $N$ be a random variable taking values in the positive integers such that $N\in L^s$ for some $s>r(1+\frac1p)/(1-\frac rp)$. Then $\sum_{j=1}^NX_j\in L^r$. Explicitly, there exists $C=C(p,r,s)>0$ such that
\[ \Big\|\sum_{j=1}^NX_j\Big\|_r\ \le\ C\,\sup_{j\ge1}\|X_j\|_p\,\|N\|_s^{s[\frac1r-\frac1p]}. \]

Proof of Proposition 7.16 (3-1). Let $Z$ be a random variable satisfying the properties (a)–(c). Since $\chi\in L^2_{\mathrm{pot}}$, there exists a sequence $\psi_n\in L^2$ such that
\[ \chi_n(\cdot,x):=\psi_n\circ T_x-\psi_n\ \longrightarrow\ \chi(\cdot,x)\qquad\text{in }L^2(\Omega\times\mathbb Z^d,M). \]
Without loss of generality, we may assume that $\chi_n(\cdot,x)\to\chi(\cdot,x)$ almost surely. Since $Z$ is $\mathbb P_\alpha$-preserving, we have $\mathbb E_\alpha[\chi_n(\cdot,Z(\cdot))]=0$ once we can show that $\chi_n(\cdot,Z(\cdot))\in L^1$. It thus suffices to prove that
\[ \chi_n(\cdot,Z(\cdot))\ \longrightarrow\ \chi(\cdot,Z(\cdot))\qquad\text{in }L^1. \tag{7.38} \]
Abbreviate $K(\omega):=d^{(\alpha)}_\omega(0,Z(\omega))$ and note that, as in (7.36),
\[ |\chi_n(\omega,Z(\omega))|\le\sum_{\substack{x\in\mathcal C_{\infty,\alpha}\\|x|\le K(\omega)}}\sum_{b\in B}\sqrt{\frac{\omega_{x,x+b}}\alpha}\,\big|\chi_n(\omega,x+b)-\chi_n(\omega,x)\big|. \]
Note that $\sqrt{\omega_{x,x+b}}\,|\chi_n(\omega,x+b)-\chi_n(\omega,x)|1_{\{x\in\mathcal C_{\infty,\alpha}\}}$ is bounded in $L^2$, uniformly in $x$, $b$ and $n$, and assumption (c) shows that $K\in L^q$ for some $q>3d$. Thus, by choosing $p=2$, $s=q/d$ and $N=2d(2K+1)^d$ in Lemma 7.22 (note that $N\in L^s$), we obtain
\[ \sup_{n\ge1}\|\chi_n(\cdot,Z(\cdot))\|_r<\infty \]
for some $r>1$. Hence, the family $\{\chi_n(\cdot,Z(\cdot))\}$ is uniformly integrable, so (7.38) follows from the fact that $\chi_n(\cdot,Z(\cdot))$ converges almost surely. $\square$

The proof of Proposition 7.16 (2) is quite involved, since there are random holes in $\mathcal C_{\infty,\alpha}$. We follow the approach of Berger and Biskup [30].

Lemma 7.23 For $\omega\in\{0\in\mathcal C_{\infty,\alpha}\}$, let $\{x_n(\omega)\}_{n\in\mathbb Z}$ be the intersections of $\mathcal C_{\infty,\alpha}$ with one of the coordinate axes, labeled so that $x_0(\omega)=0$. Then
\[
\lim_{n\to\infty}\frac{\chi(\omega,x_n(\omega))}{n}=0,\qquad\mathbb P_\alpha\text{-a.s.}
\]

Proof. Similarly to (7.25), let $\sigma(\omega):=T_{x_1(\omega)}(\omega)$ denote the "induced" shift. As before, it is standard to prove that $\sigma$ is $\mathbb P_\alpha$-preserving and ergodic (cf. [30, Theorem 3.2]). Further, using Proposition 7.12,
\[
\mathbb E_\alpha\bigl[d^{(\alpha)}_\omega(0,x_1(\omega))^p\bigr]<\infty,\qquad\forall p<\infty. \tag{7.39}
\]
Now define $\Psi(\omega):=\chi(\omega,x_1(\omega))$. Using (7.39), we can apply Proposition 7.16 (3-1) with $Z(\omega)=x_1(\omega)$ and obtain $\Psi\in L^1(\Omega,\mathbb P_\alpha)$ and $\mathbb E_\alpha\Psi(\omega)=0$. Using the cocycle property of $\chi$, we can write
\[
\frac{\chi(\omega,x_n(\omega))}{n}=\frac1n\sum_{k=0}^{n-1}\Psi(\sigma^k\omega),
\]
and so the left-hand side tends to zero $\mathbb P_\alpha$-a.s. by Birkhoff's ergodic theorem.

Since the holes are random, we need to "build up" averages over higher-dimensional boxes inductively, which is done in the next lemma. We first give some definitions and notation. Given $K>0$ and $\epsilon>0$, we say that a site $x\in\mathbb Z^d$ is $(K,\epsilon)$-good if $x\in\mathcal C_\infty(\omega)$ and
\[
|\chi(\omega,y)-\chi(\omega,x)|\le K+\epsilon|x-y|
\]
holds for all $y\in\mathcal C_{\infty,\alpha}(\omega)$. We write $\mathcal G_{K,\epsilon}=\mathcal G_{K,\epsilon}(\omega)$ for the set of $(K,\epsilon)$-good sites. For $1\le\nu\le d$, let
\[
\Lambda_n^\nu=\bigl\{k_1e_1+\cdots+k_\nu e_\nu:\ k_i\in\mathbb Z,\ |k_i|\le n\ \forall i=1,\dots,\nu\bigr\}.
\]
For each $\omega\in\Omega$, define
\[
\varrho_\nu(\omega)=\lim_{\epsilon\downarrow0}\limsup_{n\to\infty}\frac1{|\Lambda_n^\nu|}\sum_{x\in\mathcal C_{\infty,\alpha}(\omega)\cap\Lambda_n^\nu}\ \inf_{y\in\mathcal C_{\infty,\alpha}(\omega)\cap\Lambda_n^1}\mathbb 1_{\{|\chi(\omega,x)-\chi(\omega,y)|\ge\epsilon n\}}. \tag{7.40}
\]
Note that the infimum is taken only over sites in the one-dimensional box $\Lambda_n^1$. Our goal is to show by induction that $\varrho_\nu=0$, $\mathbb P_\alpha$-a.s., for all $\nu=1,\dots,d$. The induction step is given in the following lemma, which is due to [30, Lemma 5.5].

Lemma 7.24 Let $1\le\nu<d$ and suppose that $\varrho_\nu=0$, $\mathbb P_\alpha$-a.s. Then $\varrho_{\nu+1}=0$, $\mathbb P_\alpha$-a.s.

Proof. Let $p=\mathbb P_\alpha(0\in\mathcal C_{\infty,\alpha})$ and fix $\delta\in(0,\tfrac12p^2)$. Consider the shifted boxes
\[
\Lambda^\nu_{n,j}=\tau_{je_{\nu+1}}(\Lambda_n^\nu),\qquad j=1,\dots,L.
\]
Here $L$ is a deterministic number chosen so that the set
\[
\Lambda_0=\bigl\{x\in\Lambda_n^\nu:\ \exists j\in\{0,\dots,L-1\},\ x+je_{\nu+1}\in\Lambda^\nu_{n,j}\cap\mathcal C_{\infty,\alpha}\bigr\}
\]
satisfies $|\Lambda_0|\ge(1-\delta)|\Lambda_n^\nu|$ once $n$ is sufficiently large. Choose $\epsilon>0$ so that
\[
L\epsilon+\delta<\tfrac12p^2, \tag{7.41}
\]
and pick a large value $K>0$. Then for $\mathbb P_\alpha$-a.e. $\omega$, all sufficiently large $n$, and each $j=1,\dots,L$, there exists a set of sites $\Lambda_j\subset\Lambda^\nu_{n,j}\cap\mathcal C_{\infty,\alpha}$ such that
\[
\bigl|(\Lambda^\nu_{n,j}\cap\mathcal C_{\infty,\alpha})\setminus\Lambda_j\bigr|\le\epsilon|\Lambda^\nu_{n,j}|
\quad\text{and}\quad
|\chi(\omega,x)-\chi(\omega,y)|\le\epsilon n,\ \ \forall x,y\in\Lambda_j. \tag{7.42}
\]
Moreover, for $n$ sufficiently large, the $\Lambda_j$ could be picked so that $\Lambda_j\cap\Lambda_n^1\ne\emptyset$ and, assuming $K$ large, the non-$(K,\epsilon)$-good sites could be pitched out with little loss of density to achieve even $\Lambda_j\subset\mathcal G_{K,\epsilon}$. (These claims can be proved directly by the ergodic theorem and the fact that $\mathbb P_\alpha(0\in\mathcal G_{K,\epsilon})$ converges to the density of $\mathcal C_{\infty,\alpha}$ as $K\to\infty$.) Given $\Lambda_1,\dots,\Lambda_L$, let $\Lambda$ be the set of sites in $\Lambda_n^{\nu+1}\cap\mathcal C_{\infty,\alpha}$ whose projection onto the linear subspace $H=\{k_1e_1+\cdots+k_\nu e_\nu:\ k_i\in\mathbb Z\}$ belongs to the corresponding projection of $\Lambda_1\cup\cdots\cup\Lambda_L$. Note that the $\Lambda_j$ could be chosen so that $\Lambda\cap\Lambda_n^1\ne\emptyset$.

By the construction, the projections of the $\Lambda_j$'s, $j=1,\dots,L$, onto $H$ "fail to cover" at most $L\epsilon|\Lambda_n^\nu|$ sites in $\Lambda_0$, and so at most $(\delta+L\epsilon)|\Lambda_n^\nu|$ sites in $\Lambda_n^\nu$ are not of the form $x+je_{\nu+1}$ for some $x\in\bigcup_j\Lambda_j$. It follows that
\[
\bigl|(\Lambda_n^{\nu+1}\cap\mathcal C_{\infty,\alpha})\setminus\Lambda\bigr|\le(\delta+L\epsilon)\,|\Lambda_n^{\nu+1}|, \tag{7.43}
\]
i.e., $\Lambda$ contains all except at most an $(L\epsilon+\delta)$-fraction of all sites in $\Lambda_n^{\nu+1}$. Next note that, if $K$ is sufficiently large, then for all $1\le i,j\le L$ and all $x\in\Lambda_i$, $y\in\Lambda_j$,
\[
|\chi(\omega,x)-\chi(\omega,y)|\le 2\epsilon n+3K+\epsilon L, \tag{7.44}
\]
and hence, once $n$ is so large that $3K+\epsilon L\le\epsilon n$, every pair of sites $x,y\in\Lambda$ satisfies
\[
|\chi(\omega,x)-\chi(\omega,y)|\le 5\epsilon n. \tag{7.45}
\]
If $\varrho_{\nu+1,\epsilon}$ denotes the right-hand side of (7.40) before taking $\epsilon\downarrow0$, the bounds (7.43) and (7.45) and $\Lambda\cap\Lambda_n^1\ne\emptyset$ yield
\[
\varrho_{\nu+1,5\epsilon}(\omega)\le\delta+L\epsilon
\]
for $\mathbb P_\alpha$-a.e. $\omega$. But the left-hand side of this inequality increases as $\epsilon\downarrow0$ while the right-hand side decreases. Thus, taking $\epsilon\downarrow0$ and then $\delta\downarrow0$ proves that $\varrho_{\nu+1}=0$ holds $\mathbb P_\alpha$-a.e.

Proof of Proposition 7.16 (2). First, by Lemma 7.23, we know that $\varrho_1(\omega)=0$ for $\mathbb P_\alpha$-a.e. $\omega$. Using induction, Lemma 7.24 then gives $\varrho_d(\omega)=0$ for $\mathbb P_\alpha$-a.e. $\omega$. Let $\omega\in\{0\in\mathcal C_{\infty,\alpha}\}$. By Lemma 7.23, for each $\epsilon>0$ there exists an a.s. finite $n_0=n_0(\omega)$ such that for all $n\ge n_0(\omega)$ we have $|\chi(\omega,x)|\le\epsilon n$ for all $x\in\Lambda_n^1\cap\mathcal C_{\infty,\alpha}(\omega)$. Using this to estimate away the infimum in (7.40), $\varrho_d=0$ now gives the desired result for all $\epsilon>0$.

7.7.2 Case 2

Recall that for Case 2 we only need to work with $\alpha=0$, so $\mathcal C_{\infty,\alpha}=\mathcal C_\infty=\mathbb Z^d$ in this case. We follow [16, Section 5]. Let $Q^{(n)}_{x,y}=q^\omega_n(x,y)$. Since the base measure is $\nu_x\equiv1$, we have $Q^{(n)}_{x,y}=\sum_zQ^{(n-1)}_{x,z}Q^{(1)}_{z,y}$. We first give a preliminary lemma.

Lemma 7.25 For $G\in L^2$, we have
\[
\mathbb E\Bigl[\sum_yQ^{(n)}_{0,y}\,G(\cdot,y)^2\Bigr]\le n^2\|G\|_{L^2}^2. \tag{7.46}
\]
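The identity $Q^{(n)}_{x,y}=\sum_zQ^{(n-1)}_{x,z}Q^{(1)}_{z,y}$ used above is the Chapman--Kolmogorov equation for the discrete-time kernel. A toy numerical check on a three-state symmetric chain (purely illustrative; the matrix below is not part of the model) is:

```python
# Toy check of Chapman-Kolmogorov: Q^(3) = Q^(2) Q^(1) equals the sum
# over all length-3 paths, for a small symmetric stochastic matrix.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

Q = [[0.50, 0.25, 0.25],
     [0.25, 0.50, 0.25],
     [0.25, 0.25, 0.50]]   # symmetric, rows sum to 1 (base measure nu == 1)

Q2 = matmul(Q, Q)           # Q^(2)
Q3 = matmul(Q2, Q)          # Q^(3) via Q^(2) Q^(1)

# Q^(3) computed directly by summing over all intermediate sites z1, z2
n = len(Q)
Q3_direct = [[sum(Q[x][z1] * Q[z1][z2] * Q[z2][y]
                  for z1 in range(n) for z2 in range(n))
              for y in range(n)] for x in range(n)]

assert all(abs(Q3[x][y] - Q3_direct[x][y]) < 1e-12
           for x in range(n) for y in range(n))
```

Symmetry of the kernel ($Q^{(n)}_{x,y}=Q^{(n)}_{y,x}$) is also preserved under this composition, which is what the stationarity manipulations in the proof below exploit.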

Proof. Let $a_n^2$ be the left-hand side of (7.46). Then, by the cocycle property $G(\cdot,y)=G(T_x\cdot,y-x)+G(\cdot,x)$, we have
\[
a_n^2=\mathbb E\Bigl[\sum_x\sum_yQ^{(n-1)}_{0,x}Q_{x,y}\bigl(G(T_x\cdot,y-x)+G(\cdot,x)\bigr)^2\Bigr]
=\mathbb E\Bigl[\sum_x\sum_yQ^{(n-1)}_{0,x}Q_{x,y}\bigl\{G(T_x\cdot,y-x)^2+2G(T_x\cdot,y-x)G(\cdot,x)+G(\cdot,x)^2\bigr\}\Bigr]=:I+II+III.
\]
Then,
\[
I=\mathbb E\Bigl[\sum_x\sum_yQ^{(n-1)}_{x,0}(T_x\cdot)\,Q_{0,y-x}(T_x\cdot)\,G(T_x\cdot,y-x)^2\Bigr]
=\mathbb E\Bigl[\sum_xQ^{(n-1)}_{x,0}(\cdot)\sum_zQ_{0,z}(\cdot)G(\cdot,z)^2\Bigr]
=\mathbb E\Bigl[\sum_zQ_{0,z}(\cdot)G(\cdot,z)^2\Bigr]=\|G\|_{L^2}^2,
\]
\[
III=\mathbb E\Bigl[\sum_x\sum_yQ^{(n-1)}_{0,x}Q_{x,y}\,G(\cdot,x)^2\Bigr]=\mathbb E\Bigl[\sum_xQ^{(n-1)}_{0,x}G(\cdot,x)^2\Bigr]=a_{n-1}^2,
\]
\[
II=2\,\mathbb E\Bigl[\sum_x\sum_zQ^{(n-1)}_{0,x}(\cdot)\,Q_{0,z}(T_x\cdot)\,G(T_x\cdot,z)\,G(\cdot,x)\Bigr]\le2\,I^{1/2}\,III^{1/2}=2a_{n-1}\|G\|_{L^2}.
\]
Thus $a_n\le a_{n-1}+\|G\|_{L^2}\le\cdots\le n\|G\|_{L^2}$.

The next lemma proves Proposition 7.16 (1). We use the heat kernel lower bound in the proof.

Lemma 7.26 Let $G\in L^2$.
(i) For $1\le p<2$, there exists $c_p>0$ such that
\[
\mathbb E\bigl[|G(\cdot,x)|^p\bigr]^{1/p}\le c_p\,|x|\,\|G\|_{L^2},\qquad\forall x\in\mathbb Z^d. \tag{7.47}
\]
(ii) It follows that $\lim_{n\to\infty}\max_{|x|\le n}n^{-(d+4)}|G(\omega,x)|=0$, $\mathbb P$-a.e. $\omega$.

Proof. (i) Using the cocycle property and the triangle inequality iteratively, for $x=(x_1,\cdots,x_d)$ we have
\[
\mathbb E\bigl[|G(\cdot,x)|^p\bigr]^{1/p}\le\sum_{i=1}^d|x_i|\,\mathbb E\bigl[|G(\cdot,e_i)|^p\bigr]^{1/p},
\]
so it is enough to prove $\mathbb E[|G(\cdot,e_i)|^p]\le c_p\|G\|_{L^2}^p$ for $1\le i\le d$. By Theorem 7.3, there exists an integer-valued random variable $W_i$ with $W_i\ge1$ such that $\mathbb P(W_i=n)\le c_1\exp(-c_2n^\eta)$ for some $\eta>0$, and $q^\omega_t(0,e_i)\ge c_3t^{-d/2}$ for $t\ge W_i$. Set $\xi_{n,i}=q^\omega_n(0,e_i)$. Then
\[
\mathbb E\bigl[|G(\cdot,e_i)|^p\bigr]=\sum_{n=1}^\infty\mathbb E\bigl[|G(\cdot,e_i)|^p\,\mathbb 1_{\{W_i=n\}}\bigr].
\]
Let $\alpha=2/p$ and let $\alpha'=2/(2-p)$ be its conjugate. Then, using the heat kernel bound and the H\"older inequality,
\[
\mathbb E\bigl[|G(\cdot,e_i)|^p\,\mathbb 1_{\{W_i=n\}}\bigr]
=\mathbb E\bigl[\xi_{n,i}^{1/\alpha}|G(\cdot,e_i)|^p\cdot\xi_{n,i}^{-1/\alpha}\mathbb 1_{\{W_i=n\}}\bigr]
\le\mathbb E\bigl[\xi_{n,i}\,G(\cdot,e_i)^2\bigr]^{1/\alpha}\,\mathbb E\bigl[\xi_{n,i}^{-\alpha'/\alpha}\mathbb 1_{\{W_i=n\}}\bigr]^{1/\alpha'}
\]
\[
\le\mathbb E\Bigl[\sum_yQ^{(n)}_{0,y}G(\cdot,y)^2\Bigr]^{1/\alpha}\Bigl((c_3n^{-d/2})^{-\alpha'/\alpha}c_1\exp(-c_2n^\eta)\Bigr)^{1/\alpha'}
\le\bigl(n^2\|G\|_{L^2}^2\bigr)^{1/\alpha}c_4n^{d/(2\alpha)}\exp(-c_5n^\eta)
=c_4n^{(d+4)/(2\alpha)}\exp(-c_5n^\eta)\,\|G\|_{L^2}^p,
\]
where we used (7.46) in the last inequality. Summing over $n\ge1$, we obtain $\mathbb E[|G(\cdot,e_i)|^p]\le c_p\|G\|_{L^2}^p$ for $1\le i\le d$.
(ii) We have
\[
\mathbb P\Bigl(\max_{|x|\le n}|G(\cdot,x)|>\delta_n\Bigr)\le(2n+1)^d\max_{|x|\le n}\mathbb P\bigl(|G(\cdot,x)|>\delta_n\bigr)\le\frac{cn^d}{\delta_n}\max_{|x|\le n}\mathbb E\bigl[|G(\cdot,x)|\bigr]\le\frac{c'n^{d+1}}{\delta_n}\|G\|_{L^2},
\]
where we used (7.47) in the last inequality. Taking $\delta_n=n^{d+3}$ and using the Borel--Cantelli lemma gives the desired result.

We next prove Proposition 7.16 (3-2). One proof is to apply Proposition 7.16 (3-1) with $Z(\cdot)\equiv x$. Here is another proof.

Lemma 7.27 For $G\in L^2_{\rm pot}$, it holds that $\mathbb E[G(\cdot,x)]=0$ for all $x\in\mathbb Z^d$.

Proof. If $G=\nabla F$ where $F\in L^2$, then $\mathbb E[G(\cdot,x)]=\mathbb E[F\circ T_x-F]=\mathbb E\,F\circ T_x-\mathbb E F=0$. In general, there exist $\{F_n\}\subset L^2$ such that $G=\lim_n\nabla F_n$ in $L^2$. Since $\mathbb P(Q_{0,x}>0)=1$ for all $x\in\mathbb Z^d$, it follows that $\nabla F_n(\cdot,x)$ converges to $G(\cdot,x)$ in $\mathbb P$-probability. By (7.47), for each $p\in[1,2)$ the family $\{\nabla F_n(\cdot,x)\}_n$ is bounded in $L^p$, so that $\nabla F_n(\cdot,x)$ converges to $G(\cdot,x)$ in $L^1$ for all $x\in\mathbb Z^d$. Thus $\mathbb E[G(\cdot,x)]=\lim_n\mathbb E[\nabla F_n(\cdot,x)]=0$.

By this lemma, we have $\mathbb E[\chi(\cdot,e_i)]=0$, where $e_i$ is the unit vector, for $1\le i\le d$. Using the cocycle property inductively, we have
\[
\chi(\omega,ne_i)=\sum_{k=1}^n\chi\bigl(T_{(k-1)e_i}\omega,e_i\bigr)=\sum_{k=1}^n\Psi\circ T_{(k-1)e_i}(\omega),
\]
where $\Psi(\omega):=\chi(\omega,e_i)$. By the proof of Lemma 7.27, we see that $\Psi\in L^1$. So Birkhoff's ergodic theorem implies
\[
\lim_{n\to\infty}\frac1n\chi(\omega,ne_i)=0,\qquad\mathbb P\text{-a.e. }\omega. \tag{7.48}
\]
Given this, which corresponds to Lemma 7.23 for Case 1, the proof of Proposition 7.16 (2) for Case 2 is the same as that for Case 1 (in fact slightly easier, since $\mathcal C_\infty=\mathbb Z^d$ in this case).

7.8 End of the proof of the quenched invariance principle

In this subsection, we complete the proof of Theorem 7.2 and Theorem 7.4. Throughout this subsection, we assume that $\alpha$ satisfies (7.5) for Case 1 and that $\alpha=0$ for Case 2. First, we verify some conditions in Theorem 7.15 in order to prove the sublinearity of the corrector. The heat kernel estimates and percolation estimates are used in an essential way here.

Lemma 7.28 (i) The diffusive bounds (7.19), (7.20) hold for $\mathbb P_\alpha$-a.e. $\omega$.
(ii) Let $\tau_n=\inf\{t\ge0:\ |\hat Y_t-\hat Y_0|\ge n\}$. Then for any $c_1>1$ there exists $N=N_\omega(c_1)>0$, which is a.s. finite, such that $|\hat Y_{\tau_n}-\hat Y_0|\le c_1n$ for all $n\ge N_\omega$.

Proof. (i) For Case 2, this follows from Theorem 7.3. For Case 1, the proof is given in [32, Section 6]. (See also [90, Section 4]. In fact, in this case two-sided Gaussian heat kernel bounds, similar to the ones for the simple random walk on supercritical percolation clusters, hold.)
(ii) For Case 1, this follows directly from Proposition 7.12 and the Borel--Cantelli lemma. For Case 2, using Theorem 7.3, we have for any $z\in\mathbb Z^d$ with $|z|\le n$ and $c_1>1$,
\[
P^z_\omega\bigl(|Y_1-z|\ge c_1n\bigr)\le\mathbb P(U_z\ge c_1n)+\mathbb E\Bigl[\sum_{y:|y-z|\ge c_1n}q^\omega_1(z,y);\ U_z<c_1n\Bigr],
\]
and both terms decay stretched-exponentially in $n$, with constants $c_2,\cdots,c_5$ that do not depend on $z$. So the result follows by the Borel--Cantelli lemma.

Proof of Theorem 7.2 and Theorem 7.4. By Proposition 7.16 and Lemma 7.28, the corrector $\chi$ satisfies the conditions of Theorem 7.15. It follows that $\chi$ is sublinear on $\mathcal C_{\infty,\alpha}$ (for Case 2, since $\alpha=0$, $\chi$ is sublinear on $\mathcal C_\infty$). However, for Case 1, by (7.6) the diameter of the largest component of $\mathcal C_\infty\setminus\mathcal C_{\infty,\alpha}$ in the box $[-2n,2n]^d$ is less than $C(\omega)\log n$ for some $C(\omega)<\infty$. Using the harmonicity of $\varphi_\omega$ on $\mathcal C_\infty$, the optional stopping theorem gives
\[
\max_{x\in\mathcal C_\infty\cap[-n,n]^d}|\chi(\omega,x)|\le\max_{x\in\mathcal C_{\infty,\alpha}\cap[-2n,2n]^d}|\chi(\omega,x)|+2C(\omega)\log(2n),
\]
hence $\chi$ is sublinear on $\mathcal C_\infty$ as well.

Having proved the sublinearity of $\chi$ on $\mathcal C_\infty$ for both Case 1 and Case 2, we proceed as in the $d=2$ proof of [30]. Let $\{Y_t\}_{t\ge0}$ be the VSRW and let $X_n:=Y_n$, $n\in\mathbb N$, be the discrete-time process. Also, let $\varphi_\omega(x):=x+\chi(\omega,x)$ and define $M_n:=\varphi_\omega(X_n)$. Fix $\hat v\in\mathbb R^d$. We will first show that (the piecewise linearization of) $t\mapsto\hat v\cdot M_{[tn]}$ scales to Brownian motion. For $K\ge0$ and $m\le n$, let
\[
f_K(\omega):=E^0_\omega\bigl[(\hat v\cdot M_1)^2\,\mathbb 1_{\{|\hat v\cdot M_1|\ge K\}}\bigr],\qquad
V_{n,m}(K):=V^{(\omega)}_{n,m}(K)=\frac1n\sum_{k=0}^{m-1}f_K\circ T_{X_k}(\omega).
\]
Since
\[
V_{n,m}(\epsilon\sqrt n)=\frac1n\sum_{k=0}^{m-1}E^0_\omega\Bigl[\bigl(\hat v\cdot(M_{k+1}-M_k)\bigr)^2\,\mathbb 1_{\{|\hat v\cdot(M_{k+1}-M_k)|\ge\epsilon\sqrt n\}}\ \Big|\ \mathcal F_k\Bigr]
\]
for $\mathcal F_k=\sigma(X_0,X_1,\dots,X_k)$, in order to apply the Lindeberg--Feller functional CLT for martingales, we need to verify the following for $\mathbb P_0$-almost every $\omega$ (recall $\mathbb P_0(\cdot):=\mathbb P(\cdot\mid 0\in\mathcal C_\infty)$):

(i) $V_{n,[tn]}(0)\to Ct$ in $P^0_\omega$-probability for all $t\in[0,1]$ and some $C\in(0,\infty)$,
(ii) $V_{n,n}(\epsilon\sqrt n)\to0$ in $P^0_\omega$-probability for all $\epsilon>0$.

By Theorem 7.13 (3), $f_K\in L^1$ for all $K$. Since $n\mapsto T_{X_n}(\omega)$ (cf. (7.25)) is ergodic, we have
\[
V_{n,n}(K)=\frac1n\sum_{k=0}^{n-1}f_K\circ T_{X_k}(\omega)\ \xrightarrow[n\to\infty]{}\ \mathbb E_0f_K, \tag{7.49}
\]
for $\mathbb P_0$-a.e. $\omega$ and $P^0_\omega$-a.e. path $\{X_k\}_k$ of the random walk. Taking $K=0$ in (7.49), condition (i) above follows by scaling out the $t$-dependence first and working with $tn$ instead of $n$. On the other hand, by (7.49) and the fact that $f_{\epsilon\sqrt n}\le f_K$ for sufficiently large $n$, we have, $\mathbb P_0$-almost surely,
\[
\limsup_{n\to\infty}V_{n,n}(\epsilon\sqrt n)\le\mathbb E_0f_K=\mathbb E_0E^0_\omega\bigl[(\hat v\cdot M_1)^2\,\mathbb 1_{\{|\hat v\cdot M_1|\ge K\}}\bigr]\ \xrightarrow[K\to\infty]{}\ 0,
\]
where we can use the dominated convergence theorem since $\hat v\cdot M_1\in L^2$. Hence, conditions (i) and (ii) hold (in fact, even with $P^0_\omega$-a.s. limits). We thus conclude that the continuous random process
\[
t\mapsto\frac1{\sqrt n}\Bigl(\hat v\cdot M_{[nt]}+(nt-[nt])\,\hat v\cdot(M_{[nt]+1}-M_{[nt]})\Bigr)
\]
converges weakly to Brownian motion with mean zero and variance $\mathbb E_0f_0=\mathbb E_0E^0_\omega[(\hat v\cdot M_1)^2]$. This can be written as $\hat v\cdot D\hat v$, where $D$ is the matrix defined by
\[
D_{i,j}:=\mathbb E_0E^0_\omega\bigl[(e_i\cdot M_1)(e_j\cdot M_1)\bigr],\qquad\forall i,j\in\{1,\cdots,d\}.
\]
Thus, using the Cram\'er--Wold device and the continuity of the process, we obtain that the linear interpolation of $t\mapsto M_{[nt]}/\sqrt n$ converges to $d$-dimensional Brownian motion with covariance matrix $D$. Since $X_n-M_n=-\chi(\omega,X_n)$, in order to prove the convergence of $t\mapsto X_{[nt]}/\sqrt n$ to Brownian motion for $\mathbb P_0$-a.e. $\omega$, it is enough to show that, for $\mathbb P_0$-a.e. $\omega$,

\[
\max_{1\le k\le n}\frac{|\chi(\omega,X_k)|}{\sqrt n}\ \xrightarrow[n\to\infty]{}\ 0,\qquad\text{in }P^0_\omega\text{-probability}. \tag{7.50}
\]
By the sublinearity of $\chi$, we know that for each $\epsilon>0$ there exists $K=K(\omega)<\infty$ such that
\[
|\chi(\omega,x)|\le K+\epsilon|x|,\qquad\forall x\in\mathcal C_\infty(\omega).
\]
Putting $x=X_k$ and substituting $X_k=M_k-\chi(\omega,X_k)$ on the right-hand side, we have, if $\epsilon<\tfrac12$,
\[
|\chi(\omega,X_k)|\le\frac{K}{1-\epsilon}+\frac{\epsilon}{1-\epsilon}|M_k|\le2K+2\epsilon|M_k|.
\]
But the above CLT for $\{M_k\}$ and the additivity of $\{M_k\}$ imply that $\max_{k\le n}|M_k|/\sqrt n$ converges in law to the maximum of Brownian motion $|B(t)|$ over $t\in[0,1]$. Hence, denoting the probability law of Brownian motion by $P$, we have

\[
\limsup_{n\to\infty}P^0_\omega\Bigl(\max_{k\le n}|\chi(\omega,X_k)|\ge\delta\sqrt n\Bigr)\le P\Bigl(\max_{0\le t\le1}|B(t)|\ge\frac{\delta}{2\epsilon}\Bigr).
\]
The right-hand side tends to zero as $\epsilon\to0$ for all $\delta>0$. Hence we obtain (7.50). We thus conclude that $t\mapsto\hat B_n(t)$ converges to $d$-dimensional Brownian motion with covariance matrix $D$, where
\[
\hat B_n(t):=\frac1{\sqrt n}\bigl(X_{[tn]}+(tn-[tn])(X_{[tn]+1}-X_{[tn]})\bigr),\qquad t\ge0.
\]
Noting that
\[
\lim_{n\to\infty}P^0_\omega\Bigl(\sup_{0\le s\le T}\Bigl|\frac1{\sqrt n}Y_{sn}-\hat B_n(s)\Bigr|>u\Bigr)=0,\qquad\forall u,T>0,
\]
which can be proved using the heat kernel estimates (using Theorem 7.3; see [16, Lemma 4.12]) for Case 2, and using the heat kernel estimates for $\hat Y$ and the percolation estimate Proposition 7.12 for Case 1 (see [90, (3.2)]; note that the VSRW version of [90, (3.2)] that we require can be obtained along the same lines as its proof), we see that $t\mapsto Y_{tn}/\sqrt n$ converges to the same limit.

By the reflection symmetry of $\mathbb P_0$, we see that $D$ is a diagonal matrix. Further, the rotation symmetry ensures that $D=(\sigma^2/d)I$, where $\sigma^2:=\mathbb E_0E^0_\omega|M_1|^2$. To see that the limiting process is not degenerate to zero, note that if $\sigma=0$ then $\chi(\cdot,x)=-x$ holds a.s. for all $x\in\mathbb Z^d$. But that is impossible since, as we proved, $x\mapsto\chi(\cdot,x)$ is sublinear a.s.

Finally, we consider the CSRW. For each $x\in\mathcal C_\infty$, let $\mu_x(\omega):=\sum_y\omega_{xy}$. Let $F(\omega)=\mu_0(\omega)$ and
\[
A_t=\int_0^t\mu_{Y_s}\,ds=\int_0^tF(Z_s)\,ds.
\]
Then $\tau_t=\inf\{s\ge0:\ A_s>t\}$ is the right-continuous inverse of $A_t$, and $W_t=Y_{\tau_t}$ is the CSRW. Since $Z$ is ergodic, we have
\[
\lim_{t\to\infty}\frac1tA_t=\mathbb EF=2d\,\mathbb E\mu_e,\qquad\mathbb P\times P^0_\omega\text{-a.s.}
\]
So if $\mathbb E\mu_e<\infty$, then $\tau_t/t\to(2d\,\mathbb E\mu_e)^{-1}=:M>0$ a.s. Using the heat kernel estimates for Case 2 (i.e. using Theorem 7.3; see [16, Theorem 4.11]), and using the heat kernel estimates for $\hat Y$ and the percolation estimate Proposition 7.12 for Case 1 (i.e. using the VSRW version of [90, (3.2)]), we have

\[
\lim_{n\to\infty}P^0_\omega\Bigl(\sup_{0\le s\le T}\Bigl|\frac1{\sqrt n}W_{sn}-\frac1{\sqrt n}Y_{sMn}\Bigr|>u\Bigr)=0,\qquad\forall T,u>0,\ \mathbb P\text{-a.e. }\omega.
\]
Thus $\frac1{\sqrt n}W_{sn}=\frac1{\sqrt n}Y_{sMn}+\bigl(\frac1{\sqrt n}W_{sn}-\frac1{\sqrt n}Y_{sMn}\bigr)$ converges to $\{CB_t\}$, where $B_t$ is Brownian motion and $C^2=M\sigma^2>0$. If $\mathbb E\mu_e=\infty$, then $\tau_t/t\to0$, so $\frac1{\sqrt n}W_{sn}$ converges to a degenerate process.
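The time-change construction above (the CSRW as the VSRW run with the inverse clock $\tau_t$ of $A_t=\int_0^t\mu_{Y_s}\,ds$) can be sketched numerically. The function names and the piecewise-constant trajectory below are illustrative assumptions, not part of the text; with constant $\mu\equiv c$ the inverse clock is exactly $t\mapsto t/c$, which the final assertion checks.

```python
# Sketch: the additive functional A_t and its right-continuous inverse tau_t,
# for a VSRW trajectory with piecewise-constant mu along its visited sites.
import bisect

def clock(mu_seq, holding):
    """Cumulative pairs (t, A_t) at the VSRW jump times, given the value of
    mu at each visited site and the holding time spent there."""
    t, A, out = 0.0, 0.0, [(0.0, 0.0)]
    for mu, h in zip(mu_seq, holding):
        t += h
        A += mu * h          # A_t grows at speed mu_{Y_s}
        out.append((t, A))
    return out

def inverse_clock(path, a):
    """tau_a: the VSRW time s at which A_s reaches level a
    (linear interpolation between jump times)."""
    levels = [p[1] for p in path]
    i = bisect.bisect_left(levels, a)
    (t0, A0), (t1, A1) = path[i - 1], path[i]
    return t0 + (t1 - t0) * (a - A0) / (A1 - A0)

# With constant mu == c, tau_t = t / c exactly.
c = 3.0
path = clock([c] * 10, [1.0] * 10)   # ten unit holding times
assert abs(inverse_clock(path, 6.0) - 6.0 / c) < 1e-12
```

In the ergodic setting of the proof, $A_t/t\to2d\,\mathbb E\mu_e$ forces $\tau_t/t\to M=(2d\,\mathbb E\mu_e)^{-1}$, which is why the CSRW limit is the VSRW limit run at speed $M$.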

Remark 7.29 Note that the approach of Mathieu and Piatnitski [91] (see also [90]) is different from the one above. They prove sublinearity of the corrector in the $L^2$-sense and prove the quenched invariance principle using it. (Here $L^2$ is with respect to the counting measure on $\mathcal C_{\infty,\alpha}$ or on $\mathcal C_\infty$. In the arguments above, sublinearity of the corrector is proved uniformly on $\mathcal C_{\infty,\alpha}$, which is much stronger.) Since compactness is crucial in their arguments, they first prove tightness using the heat kernel estimates; recall that the arguments above do not require a separate proof of tightness. Then, using the Poincar\'e inequality and the so-called two-scale convergence, they prove weak $L^2$ convergence of the corrector. Applying the Poincar\'e inequality once more, they then obtain strong $L^2$ convergence of the corrector (and hence $L^2$-sublinearity of the corrector). In a sense, they deduce weaker estimates of the corrector (which are, however, enough for the quenched CLT) from weaker assumptions.

Acknowledgement. I would like to thank M.T. Barlow, M. Biskup, J.-D. Deuschel, J. Kigami, G. Kozma, P. Mathieu, A. Nachmias and H. Osada for fruitful discussions and comments, and A. Sapozhnikov for sending me his preprint [101]. I would also like to thank G. Ben Arous and F. den Hollander for their encouragement. In preparation for the St. Flour lectures, I gave a series of pre-lectures in Kyoto. I would like to thank the audience of those lectures, especially H. Nawa, N. Yoshida, M. Shinoda, M. Takei, N. Kajino, M. Nakashima and D. Shiraishi, for useful comments and for finding typos in the draft.

References

[1] L. Addario-Berry, N. Broutin, and C. Goldschmidt. The continuum limit of critical random graphs. Preprint 2009. arXiv:0903.4730

[2] M. Aizenman and D.J. Barsky. Sharpness of the phase transition in percolation models. Comm. Math. Phys. 108 (1987), 489–526.

[3] M. Aizenman and C.M. Newman. Tree graph inequalities and critical behavior in percolation models. J. Statist. Phys. 36 (1984), 107–143.

[4] D. Aldous. Brownian excursions, critical random graphs and the multiplicative coalescent. Ann. Probab. 25 (1997), 812–854.

[5] D. Aldous. The continuum random tree. III. Ann. Probab. 21 (1993), 248-289.

[6] D. Aldous and J. Fill. Reversible Markov chains and random walks on graphs. Book in preparation. http://www.stat.berkeley.edu/~aldous/RWG/book.html

[7] S. Alexander and R. Orbach. Density of states on fractals: "fractons". J. Physique (Paris) Lett. 43 (1982), L625–L631.

[8] O. Angel, J. Goodman, F. den Hollander, and G. Slade. Invasion percolation on regular trees. Ann. Probab. 36 (2008), 420–466.

[9] P. Antal and A. Pisztora. On the chemical distance for supercritical Bernoulli percolation. Ann. Probab. 24 (1996), 1036–1048.

[10] M. Barlow. Random walks on graphs. Lectures given at the PIMS summer school; lecture notes for a course at RIMS are available at http://www.math.kyoto-u.ac.jp/~kumagai/rim10.pdf

[11] M.T. Barlow. Random walks on supercritical percolation clusters. Ann. Probab. 32 (2004), 3024–3084.

[12] M.T. Barlow. Diffusions on fractals. Lecture Notes in Mathematics 1690, École d'été de probabilités de Saint-Flour XXV–1995, Springer, New York, 1998.

[13] M.T. Barlow and J. Černý. Convergence to fractional kinetics for random walks associated with unbounded conductances. Probab. Theory Relat. Fields, to appear.

[14] M.T. Barlow, T. Coulhon and A. Grigor’yan. Manifolds and graphs with slow heat kernel decay. Invent. Math. 144 (2001), 609–649.

[15] M.T. Barlow, T. Coulhon and T. Kumagai. Characterization of sub-Gaussian heat kernel estimates on strongly recurrent graphs. Comm. Pure Appl. Math. 58 (2005), 1642–1677.

[16] M.T. Barlow and J.-D. Deuschel. Invariance principle for the random conductance model with unbounded conductances. Ann. Probab. 38 (2010), 234–276.

[17] M.T. Barlow and B.M. Hambly. Parabolic Harnack inequality and local limit theorem for random walks on percolation clusters. Electron. J. Probab. 14 (2009), 1–27.

[18] M.T. Barlow, A.A. J´arai,T. Kumagai and G. Slade. Random walk on the incipient infinite cluster for oriented percolation in high dimensions. Comm. Math. Phys. 278 (2008), 385–431.

[19] M.T. Barlow and T. Kumagai. Random walk on the incipient infinite cluster on trees. Illinois J. Math. 50 (2006), 33–65 (electronic).

[20] M.T. Barlow and R. Masson. Spectral dimension and random walks on the two dimensional uniform spanning tree. Preprint 2009. arXiv:0912.4765

[21] M.T. Barlow, Y. Peres and P. Sousi. Collisions of random walks. Preprint 2010. arXiv:1003.3255

[22] M.T. Barlow and X. Zheng. The random conductance model with Cauchy tails. Ann. Appl. Probab.,to appear.

[23] D.J. Barsky and M. Aizenman. Percolation critical exponents under the triangle condition. Ann. Probab. 19 (1991), 1520–1536.

[24] G. Ben Arous and J. Černý. Scaling limit for trap models on Z^d. Ann. Probab. 35 (2007), 2356–2384.

[25] G. Ben Arous and J. Černý. Bouchaud's model exhibits two aging regimes in dimension one. Ann. Appl. Probab. 15 (2005), 1161–1192.

[26] G. Ben Arous, A. Fribergh, N. Gantert and A. Hammond. Biased random walks on Galton-Watson trees with leaves. Preprint 2007. arXiv:0711.3686

[27] D. Ben-Avraham and S. Havlin. Diffusion and reactions in fractals and disordered systems. Cambridge University Press, Cambridge, 2000.

[28] I. Benjamini, O. Gurel-Gurevich and R. Lyons. Recurrence of random walk traces. Ann. Probab. 35 (2007), 732–738.

[29] I. Benjamini and E. Mossel. On the mixing time of simple random walk on the supercritical percolation cluster. Probab. Theory Relat. Fields 125 (2003), 408–420.

[30] N. Berger, M. Biskup. Quenched invariance principle for simple random walk on percolation clusters. Probab. Theory Relat. Fields 137 (2007), 83–120.

[31] N. Berger, M. Biskup, C.E. Hoffman and G. Kozma. Anomalous heat-kernel decay for random walk among bounded random conductances. Ann. Inst. H. Poincaré Probab. Statist. 44 (2008), 374–392.

[32] M. Biskup and T.M. Prescott. Functional CLT for random walk among bounded random conductances. Electron. J. Probab. 12 (2007), 1323–1348.

[33] M. Biskup and H. Spohn. Scaling limit for a class of gradient fields with non-convex potentials. Ann. Probab., to appear.

[34] B. Bollob´as. Random graphs. Cambridge University Press, second edition, 2001.

[35] E. Bolthausen and A.-S. Sznitman. Ten lectures on random media. DMV Seminar 32, Birkhäuser Verlag, Basel, 2002.

[36] O. Boukhadra. Heat-kernel estimates for random walk among random conductances with heavy tail. Stoch. Proc. Their Appl. 120 (2010), 182–194.

[37] K. Burdzy and G.F. Lawler. Rigorous exponent inequalities for random walks. J. Phys. A: Math. Gen. 23 (1990), L23–L28.

[38] E.A. Carlen, S. Kusuoka and D.W. Stroock. Upper bounds for symmetric Markov transition functions. Ann. Inst. H. Poincaré Probab. Statist. 23 (1987), 245–287.

[39] D. Chen and X. Chen. Two random walks on the open cluster of Z2 meet infinitely often. Preprint 2009.

[40] T. Coulhon. Heat kernel and isoperimetry on non-compact Riemannian manifolds. Contemporary Math- ematics 338, pp. 65–99, Amer. Math. Soc., 2003.

[41] T. Coulhon. Ultracontractivity and Nash type inequalities. J. Funct. Anal., 141 (1996), 510–539.

[42] T. Coulhon and L. Saloff-Coste. Variétés riemanniennes isométriques à l'infini. Rev. Mat. Iberoamericana 11 (1995), 687–726.

[43] D.A. Croydon. Scaling limit for the random walk on the giant component of the critical random graph. Preprint 2009. http://www2.warwick.ac.uk/fac/sci/statistics/staff/academic/croydon/research

[44] D.A. Croydon. Random walk on the range of random walk. J. Stat. Phys. 136 (2009), 349–372.

[45] D.A. Croydon. Convergence of simple random walks on random discrete trees to Brownian motion on the continuum random tree. Ann. Inst. H. Poincaré Probab. Statist. 44 (2008), 987–1019.

[46] D.A. Croydon. Volume growth and heat kernel estimates for the continuum random tree. Probab. Theory Relat. Fields 140 (2008), 207–238.

[47] D.A. Croydon and B.M. Hambly. Local limit theorems for sequences of simple random walks on graphs. Potential Anal. 29 (2008), 351–389.

[48] D. Croydon and T. Kumagai. Random walks on Galton-Watson trees with infinite variance o↵spring distribution conditioned to survive. Electron. J. Probab. 13 (2008), 1419–1441.

[49] T. Delmotte. Parabolic Harnack inequality and estimates of Markov chains on graphs. Rev. Mat. Iberoamericana 15 (1999), 181–232.

[50] A. De Masi, P.A. Ferrari, S. Goldstein and W.D. Wick. An invariance principle for reversible Markov processes. Applications to random motions in random environments. J. Statist. Phys. 55 (1989), 787–855.

[51] P.G. Doyle and J.L. Snell. Random walks and electric networks. Mathematical Association of America, 1984. http://xxx.lanl.gov/abs/math.PR/0001057.

[52] L.R.G. Fontes, M. Isopi and C.M. Newman. Random walks with strongly inhomogeneous rates and singular diffusions: convergence, localization and aging in one dimension. Ann. Probab. 30 (2002), 579–604.

[53] L.R.G. Fontes and P. Mathieu. On symmetric random walks with random conductances on Zd. Probab. Theory Relat. Fields 134 (2006), 565–602.

[54] I. Fujii and T. Kumagai. Heat kernel estimates on the incipient infinite cluster for critical branching processes. Proceedings of the German-Japanese symposium in Kyoto 2006, RIMS Kôkyûroku Bessatsu B6 (2008), 85–95.

[55] M. Fukushima, Y. Oshima and M. Takeda. Dirichlet forms and symmetric Markov processes. de Gruyter, Berlin, 1994.

[56] G. Pete. A note on percolation on Z^d: isoperimetric profile via exponential cluster repulsion. Electron. Comm. Probab. 13 (2008), 377–392.

[57] P.G. de Gennes. La percolation: un concept unificateur. La Recherche, 7 (1976), 919–927.

[58] G. Giacomin, S. Olla and H. Spohn. Equilibrium fluctuations for ∇φ interface model. Ann. Probab. 29 (2001), 1138–1172.

[59] A. Grigor'yan. Analysis on Graphs. http://www.math.uni-bielefeld.de/~grigor/aglect.pdf

[60] A. Grigor’yan. Heat kernel upper bounds on a complete non-compact manifold, Rev. Mat. Iberoamericana, 10 (1994), 395–452.

[61] G. Grimmett. Percolation. 2nd ed., Springer, Berlin, 1999.

[62] G. Grimmett and P. Hiemer. Directed percolation and random walk. In and out of equilibrium (Mambucaba, 2000), pp. 273–297, Progr. Probab. 51, Birkhäuser, Boston, 2002.

[63] G. Grimmett, H. Kesten and Y. Zhang. Random walk on the infinite cluster of the percolation model. Probab. Theory Relat. Fields 96 (1993), 33–44.

[64] B.M. Hambly and T. Kumagai. Diffusion on the scaling limit of the critical percolation cluster in the diamond hierarchical lattice. Comm. Math. Phys. 295 (2010), 29–69.

[65] B.M. Hambly and T. Kumagai. Heat kernel estimates for symmetric random walks on a class of fractal graphs and stability under rough isometries. In: Fractal geometry and applications: A Jubilee of B. Mandelbrot, Proc. of Symposia in Pure Math. 72, Part 2, pp. 233–260, Amer. Math. Soc. 2004.

[66] T. Hara. Decay of correlations in nearest-neighbour self-avoiding walk, percolation, lattice trees and animals. Ann. Probab. 36 (2008), 530–593.

[67] T. Hara, R. van der Hofstad and G. Slade. Critical two-point functions and the lace expansion for spread-out high-dimensional percolation and related models. Ann. Probab., 31 (2003), 349–408.

[68] R. van der Hofstad, F. den Hollander and G. Slade. Construction of the incipient infinite cluster for spread-out oriented percolation above 4 + 1 dimensions. Comm. Math. Phys. 231 (2002), 435–461.

[69] R. van der Hofstad and A.A. J´arai.The incipient infinite cluster for high-dimensional unoriented perco- lation. J. Stat. Phys. 114 (2004), 625–663.

[70] R. van der Hofstad and G. Slade. Convergence of critical oriented percolation to super-Brownian motion above 4 + 1 dimensions. Ann. Inst. H. Poincaré Probab. Statist. 39 (2003), 415–485.

[71] B.D. Hughes. Random walks and random environments, volume 2: random environments. Oxford University Press, Oxford, 1996.

[72] O.D. Jones. Transition probabilities for the simple random walk on the Sierpinski graph. Stoch. Proc. Their Appl. 61 (1996), 45–69.

[73] M. Kanai. Analytic inequalities, and rough isometries between non-compact Riemannian manifolds. Lect. Notes Math. 1201, pp. 122–137, Springer, 1986.

[74] M. Kanai. Rough isometries, and combinatorial approximations of geometries of non-compact Riemannian manifolds. J. Math. Soc. Japan 37 (1985), 391–413.

[75] A. Kasue. Convergence of metric graphs and energy forms. Preprint 2009.

[76] H. Kesten. The incipient infinite cluster in two-dimensional percolation. Probab. Theory Relat. Fields 73 (1986), 369–394.

[77] H. Kesten. Subdiffusive behavior of random walk on a random cluster. Ann. Inst. H. Poincaré Probab. Statist. 22 (1986), 425–487.

[78] J. Kigami. Dirichlet forms and associated heat kernels on the Cantor set induced by random walks on trees. Adv. Math., to appear.

[79] J. Kigami. Analysis on fractals. Cambridge Univ. Press, Cambridge, 2001.

[80] C. Kipnis and S.R.S. Varadhan. Central limit theorem for additive functionals of reversible Markov processes and applications to simple exclusions. Comm. Math. Phys. 104 (1986), 1–19.

[81] S.M. Kozlov. The method of averaging and walks in inhomogeneous environments. Russian Math. Surveys 40 (1985), 73–145.

[82] G. Kozma. Percolation on a product of two trees. Preprint 2010. arXiv:1003.5240

[83] G. Kozma. The triangle and the open triangle. Ann. Inst. H. Poincaré Probab. Statist., to appear.

[84] G. Kozma and A. Nachmias. Personal communications.

[85] G. Kozma and A. Nachmias. Arm exponents in high dimensional percolation. Preprint 2009. arXiv:0911.0871

[86] G. Kozma and A. Nachmias. The Alexander-Orbach conjecture holds in high dimensions. Invent. Math. 178 (2009), 635–654.

[87] T. Kumagai and J. Misumi. Heat kernel estimates for strongly recurrent random walk on random media. J. Theoret. Probab. 21 (2008), 910–935.

[88] R. Lyons and Y. Peres. Probability on trees and networks. Book in preparation. http://mypage.iu.edu/~rdlyons/prbtree/prbtree.html

[89] T. Lyons. Instability of the Liouville property for quasi-isometric Riemannian manifolds and reversible Markov chains. J. Differential Geom. 26 (1987), 33–66.

[90] P. Mathieu. Quenched invariance principles for random walks with random conductances. J. Stat. Phys. 130 (2008), 1025–1046.

[91] P. Mathieu and A. Piatnitski. Quenched invariance principles for random walks on percolation clusters. Proc. Roy. Soc. A 463 (2007), 2287–2307.

[92] P. Mathieu and E. Remy. Isoperimetry and heat kernel decay on percolation clusters. Ann. Probab. 32 (2004), 100–128.

[93] B. Morris and Y. Peres. Evolving sets, mixing and heat kernel bounds. Probab. Theory Relat. Fields 133 (2005), 245–266.

[94] J.-C. Mourrat. Variance decay for functionals of the environment viewed by the particle. Ann. Inst. H. Poincaré Probab. Statist., to appear.

[95] A. Nachmias and Y. Peres, Critical random graphs: diameter and mixing time. Ann. Probab. 36 (2008), 1267–1286.

[96] J. Nash. Continuity of solutions of parabolic and elliptic equations. Amer. J. Math. 80 (1958), 931–954.

[97] J. Norris. Markov chains. Cambridge University Press, Cambridge, 1998.

[98] F. Rassoul-Agha and T. Seppäläinen. An almost sure invariance principle for random walks in a space-time random environment. Probab. Theory Relat. Fields 133 (2005), 299–314.

[99] C. Rau. Sur le nombre de points visités par une marche aléatoire sur un amas infini de percolation. Bull. Soc. Math. France 135 (2007), 135–169.

[100] L. Saloff-Coste. A note on Poincaré, Sobolev, and Harnack inequalities. Int. Math. Res. Not. IMRN 2 (1992), 27–38.

[101] A. Sapozhnikov. Upper bound on the expected size of intrinsic ball. Preprint 2010. arXiv:1006.1521v1

[102] R. Schonmann. Multiplicity of phase transitions and mean-field criticality on highly non-amenable graphs. Comm. Math. Phys. 219 (2001), 271–322.

[103] D. Shiraishi. Exact value of the resistance exponent for four dimensional random walk trace. Preprint 2009.

[104] D. Shiraishi. Heat kernel for random walk trace on Z^3 and Z^4. Ann. Inst. H. Poincaré Probab. Statist., to appear.

[105] V. Sidoravicius and A-S. Sznitman. Quenched invariance principles for walks on clusters of percolation or among random conductances. Probab. Theory Relat. Fields 129 (2004), 219–244.

[106] G. Slade. The lace expansion and its applications. Lecture Notes in Mathematics 1879, École d'Été de Probabilités de Saint-Flour XXXIV–2004, Springer, Berlin, 2006.

[107] P. M. Soardi. Potential theory on infinite networks. Lecture Note in Math. 1590, Springer-Verlag, 1994.

[108] A.-S. Sznitman. On the anisotropic walk on the supercritical percolation cluster. Comm. Math. Phys. 240 (2003), 123–148.

[109] W. Woess. Random walks on infinite graphs and groups. Cambridge Univ. Press, Cambridge, 2000.

[110] G.M. Zaslavsky. Hamiltonian chaos and fractional dynamics. Oxford University Press, Oxford, 2005.

[111] O. Zeitouni. Random walks in random environment. Lecture Notes in Mathematics 1837, pp. 189–312, École d'Été de Probabilités de Saint-Flour XXXI–2001, Springer, Berlin, 2004.
