Anton Bovier

Gaussian processes on trees: From spin glasses to branching Brownian motion

Lecture Notes, Bonn, WS 2014/15

March 11, 2015

Preface

These notes were written for a graduate course I gave at Bonn University in 2014/15. They extend a shorter set of lecture notes for a series of lectures given at the 7th Prague Summer School on Mathematical Statistical Mechanics in August 2013. BBM has been studied over the last 50 years as a subject of interest in its own right, with seminal contributions by McKean, Bramson, Lalley, Sellke, Chauvin, Rouault, and others. Recently, the field has experienced a revival with many remarkable contributions and repercussions in other areas. The construction of the extremal process in our work with Arguin and Kistler [6], as well as, in parallel, in that of Aïdékon, Berestycki, Brunet, and Shi [2], is one highlight. Many results on BBM were extended to branching random walk, see e.g. Aïdékon [1]. Other developments concern the connection to extremes of the Gaussian free field in d = 2 by Bramson and Zeitouni [25], Ding [32], Bramson, Ding, and Zeitouni [24], and Biskup and Louidor [13, 14]. My personal motivation to study this object came, however, from spin glass theory. About ten years ago, Irina Kurkova and I wrote two papers [19, 20] on Derrida's Generalised Random Energy Models (GREM). It turns out that these models are closely related to BBM, with BBM corresponding to a critical case that we could not fully analyse at the time. These lecture notes are mainly motivated by the work we did with Nicola Kistler, Louis-Pierre Arguin, and Lisa Hartung. The aim of the course is to give a comprehensive picture of what goes into the analysis of the extremal properties of branching Brownian motions, seen as Gaussian processes indexed by Galton-Watson trees. Chapters 1-3 provide some standard background material on extreme value theory, point processes, and Gaussian processes. Chapter 4 gives a brief glimpse of spin glasses, in particular the REM and GREM models of Derrida, which provide some important motivation.
Chapter 5 introduces branching Brownian motion and its relation to the Fisher-Kolmogorov-Petrovsky-Piscounov (F-KPP) equation, and states some by now classical results of Bramson and of Lalley and Sellke. Chapter 6 gives a condensed but detailed review of the analysis of the F-KPP equation that was given in Bramson's monograph [22]. The remainder of the notes is devoted to more recent work. In Chapter 7 I review the derivation and description of the extremal process of

BBM, contained mostly in the paper [6] with Louis-Pierre Arguin and Nicola Kistler. Chapter 8 describes recent work with Lisa Hartung [16] on an extension of that work. Chapter 9 reviews two papers [17, 18] with Lisa Hartung on variable speed branching Brownian motion. Discussions of recent work on related problems, such as branching random walks, the Gaussian free field, and others, may possibly be added in later versions. The recent activities in and around BBM have triggered a number of lecture notes and reviews that are hopefully complementary to this one. I mention the review by Gouéré [40] that presents and compares the approaches of Arguin et al. and Aïdékon et al. Ofer Zeitouni has lecture notes on his homepage that deal with branching random walk and the Gaussian free field [75]. Nicola Kistler has just written a survey linking the REM and GREM to other correlated Gaussian processes [53]. I am deeply grateful to my collaborators on the matters of these lectures, Louis-Pierre Arguin, Nicola Kistler, Irina Kurkova, and Lisa Hartung. They did the bulk of the work, and without them none of this would have been achieved. I also thank Jiří Černý and Lisa Hartung for pointing out various mistakes in previous versions of these notes. I thank Marek Biskup, Jiří Černý, and Roman Kotecký for organising the excellent school in Prague that ultimately triggered the idea to produce such notes.

Bonn, February 2015, Anton Bovier

Contents

1 Extreme value theory for iid sequences
   1.1 Basic issues
   1.2 Extremal distributions
      1.2.1 Example: The Gaussian distribution
      1.2.2 Some technical preparation
      1.2.3 Max-stable distributions
      1.2.4 The extremal types theorem
   1.3 Level-crossings and the distribution of the k-th maxima

2 Extremal processes
   2.1 Point processes
   2.2 Laplace functionals
   2.3 Poisson point processes
   2.4 Convergence of point processes
      2.4.1 Vague convergence
      2.4.2 Weak convergence
   2.5 Point processes of extremes
      2.5.1 The point process of exceedances
      2.5.2 The point process of extreme values
      2.5.3 Complete Poisson convergence

3 Normal sequences
   3.1 Normal comparison
   3.2 Applications to extremes

4 Spin glasses
   4.1 Setting and examples
   4.2 The REM
      4.2.1 Rough estimates, the second moment method
      4.2.2 Law of the maximum and the extremal process
   4.3 The GREM, two levels
      4.3.1 The finer computations
   4.4 The general model and relation to branching Brownian motion
   4.5 The Galton-Watson process
   4.6 The REM on the Galton-Watson tree

5 Branching Brownian motion
   5.1 Definition and basics
   5.2 Rough heuristics
   5.3 Recursion relations
   5.4 The F-KPP equation
   5.5 The travelling wave
   5.6 The derivative martingale

6 Bramson's analysis of the F-KPP equation
   6.1 Feynman-Kac representation
   6.2 The maximum principle and its applications
   6.3 Estimates on solutions of the linear F-KPP equation
   6.4 Brownian bridges
   6.5 Hitting of curves
   6.6 Asymptotics of solutions of the F-KPP equation
   6.7 Convergence results

7 The extremal process of BBM
   7.1 Limit theorems for solutions
   7.2 Existence of a limiting process
      7.2.1 Convergence of the Laplace functional
      7.2.2 Flash back to the derivative martingale
      7.2.3 A representation for the Laplace functional
   7.3 Interpretation as cluster point process
      7.3.1 Interpretation via an auxiliary process
      7.3.2 The Poisson process of cluster extremes
      7.3.3 More on the law of the clusters
      7.3.4 The extremal process seen from the cluster extremes

8 Full extremal process
   8.1 Introduction
      8.1.1 The embedding
      8.1.2 The extended convergence result
   8.2 Properties of the embedding
   8.3 The q-thinning

9 Variable speed BBM
   9.1 The construction
   9.2 Two-speed BBM
      9.2.1 Position of extremal particles at time bt
      9.2.2 Convergence of the McKean martingale
      9.2.3 Asymptotic behaviour of BBM
      9.2.4 Convergence of the maximum
      9.2.5 Existence of the limiting process
      9.2.6 The auxiliary process
      9.2.7 The case σ1 > σ2
   9.3 Universality below the straight line
      9.3.1 Outline of the proof
      9.3.2 Localization of paths
      9.3.3 Proof of Proposition 9.29
      9.3.4 Approximating two-speed BBM
      9.3.5 Gaussian comparison

References

Index

Chapter 1
Extreme value theory for iid sequences

The goal of these lectures is the analysis of extremes in a large class of Gaussian processes on trees. To put this into perspective, we need to start with the simplest and most standard situation, the theory of extremes of sequences of independent random variables. This is the subject of this opening chapter. Records and extremes not only fascinate us in all areas of life, they are also of tremendous importance. We are constantly interested in knowing how big, how small, how rainy, how hot, etc., things may possibly be. To answer such questions, an entire branch of statistics, called extreme value statistics, was developed. There are a large number of textbooks on this subject, my personal favourites being those by Leadbetter, Lindgren, and Rootzén [57] and by Resnick [65].

1.1 Basic issues

As usual in statistics, one starts with a set of observations, or data, that correspond to partial observations of some sequence of events. Let us assume that these events are related to the values of some random variables, X_i, i ∈ Z, taking values in the real numbers. Assuming that {X_i}_{i∈Z} is a stochastic process (with discrete time) defined on some probability space (Ω, \mathcal{F}, P), our first question will be about the distribution of its maximum: given n ∈ N, define the maximum up to time n,

M_n \equiv \max_{1 \le i \le n} X_i.   (1.1.1)

We then ask for the distribution of this new random variable, i.e. we ask what is P(M_n ≤ x)? As often, one is interested in this question particularly when n is large, i.e. we are interested in the asymptotics as n ↑ ∞. Certainly M_n may tend to infinity, and its distribution may then have no reasonable limit. The natural first questions about M_n are thus: first, can we rescale M_n in some way such that the rescaled variable converges to a random variable, and second, is

there a universal class of distributions that arises as the distribution of the limits? To answer these questions will be our first target. A second major issue will be to go beyond just the maximum value. Thinking of the marks of flood levels under a bridge, we do not just see one, but a whole bunch of marks. Can we say something about their joint distribution? In other words, what is the law of the maximum, the second largest, the third largest, etc.? Is there, possibly, again a universal law for how this process of extremal marks looks? This will be the second target, and we will see that there is again an answer in the affirmative.

1.2 Extremal distributions

We consider a family of real valued, independent identically distributed random variables Xi, i ∈ N, with common distribution function

F(x) \equiv P(X_i \le x).   (1.2.1)

Recall that, by convention, F(x) is a non-decreasing, right-continuous function F : R → [0,1]. Note that the distribution function of M_n is simply

P(M_n \le x) = P(\forall\, i \le n : X_i \le x) = \prod_{i=1}^{n} P(X_i \le x) = (F(x))^n.   (1.2.2)

As n tends to infinity, this will converge to a trivial limit,

\lim_{n\uparrow\infty} (F(x))^n = \begin{cases} 0, & \text{if } F(x) < 1,\\ 1, & \text{if } F(x) = 1,\end{cases}   (1.2.3)

which simply says that any value that the variables X_i can exceed with positive probability will eventually be exceeded after sufficiently many independent trials. As we have already indicated above, to get something more interesting, we must rescale. It is natural to try something similar to what is done in the central limit theorem: first subtract an n-dependent constant, then rescale by an n-dependent factor. Thus the first question is whether one can find two sequences, b_n and a_n, and a non-trivial distribution function, G(x), such that

\lim_{n\uparrow\infty} P\left(a_n(M_n - b_n) \le x\right) = G(x).   (1.2.4)

1.2.1 Example: The Gaussian distribution.

It is always natural to start playing with the example of a Gaussian distribution. So we now assume that our X_i are Gaussian, i.e. that F(x) = Φ(x), where

\Phi(x) \equiv \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-y^2/2}\, dy.   (1.2.5)

We want to compute

P\left(a_n(M_n - b_n) \le x\right) = P\left(M_n \le a_n^{-1}x + b_n\right) = \Phi\left(a_n^{-1}x + b_n\right)^n.   (1.2.6)

Setting x_n \equiv a_n^{-1}x + b_n, this can be written as

\left(1 - (1 - \Phi(x_n))\right)^n.   (1.2.7)

For this to converge, we must choose x_n such that

1 - \Phi(x_n) = n^{-1} g(x) + o(1/n),   (1.2.8)

in which case

\lim_{n\uparrow\infty} \left(1 - (1 - \Phi(x_n))\right)^n = e^{-g(x)}.   (1.2.9)

Thus our task is to find x_n such that

\frac{1}{\sqrt{2\pi}} \int_{x_n}^{\infty} e^{-y^2/2}\, dy = n^{-1} g(x).   (1.2.10)

At this point it will be very convenient to use an approximation for the function 1 − Φ(u) when u is large, namely

\frac{1}{u\sqrt{2\pi}}\, e^{-u^2/2} \left(1 - 2u^{-2}\right) \le 1 - \Phi(u) \le \frac{1}{u\sqrt{2\pi}}\, e^{-u^2/2}.   (1.2.11)

Note that these bounds are going to be used over and over, and are surely worth memorising. Using this bound, our problem simplifies to solving

\frac{1}{x_n \sqrt{2\pi}}\, e^{-x_n^2/2} = n^{-1} g(x),   (1.2.12)

that is

n^{-1} g(x) = \frac{e^{-\frac12 (a_n^{-1}x + b_n)^2}}{\sqrt{2\pi}\,(a_n^{-1}x + b_n)} = \frac{e^{-b_n^2/2}\, e^{-a_n^{-2}x^2/2}\, e^{-a_n^{-1} b_n x}}{\sqrt{2\pi}\,(a_n^{-1}x + b_n)}.   (1.2.13)

Setting x = 0, we find

\frac{e^{-b_n^2/2}}{\sqrt{2\pi}\, b_n} = n^{-1} g(0).   (1.2.14)

Let us make the ansatz b_n = \sqrt{2\ln n} + c_n. Then we get for c_n

e^{-\sqrt{2\ln n}\, c_n - c_n^2/2} = \sqrt{2\pi}\, \left(\sqrt{2\ln n} + c_n\right).   (1.2.15)

It is convenient to choose g(0) = 1. Then the leading terms for c_n are given by

c_n = -\frac{\ln\ln n + \ln(4\pi)}{2\sqrt{2\ln n}}.   (1.2.16)

The higher order corrections to c_n can be ignored, as they do not affect the validity of (1.2.8). Finally, inspecting (1.2.13), we see that we can choose a_n = \sqrt{2\ln n}. Putting everything together, we arrive at the following assertion.

Lemma 1.1. Let X_i, i ∈ N, be iid standard normal random variables. Let

b_n \equiv \sqrt{2\ln n} - \frac{\ln\ln n + \ln(4\pi)}{2\sqrt{2\ln n}}   (1.2.17)

and

a_n = \sqrt{2\ln n}.   (1.2.18)

Then, for any x ∈ R,

\lim_{n\uparrow\infty} P\left(a_n(M_n - b_n) \le x\right) = e^{-e^{-x}}.   (1.2.19)

Remark 1.2. It will sometimes be convenient to express (1.2.19) in a slightly different, equivalent form. With the same constants a_n, b_n, define the function

u_n(x) \equiv b_n + x/a_n.   (1.2.20)

Then

\lim_{n\uparrow\infty} P\left(M_n \le u_n(x)\right) = e^{-e^{-x}}.   (1.2.21)

This is our first result on the convergence of extremes, and the function e^{-e^{-x}}, called the Gumbel distribution, is the first extremal distribution that we encounter. The next question to ask is how "typical" the result for the Gaussian distribution is. From the computation we see readily that we made no use of the Gaussian hypothesis to get the general form exp(−g(x)) for any possible limit distribution. The fact that g(x) = exp(−x), however, depended on the particular form of Φ. We will see next that, remarkably, only two other types of functions can occur.
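The convergence in Lemma 1.1 is notoriously slow (the error decays only like 1/ln n), and it is instructive to see this numerically. The following sketch (our own illustration in plain Python; the function name is ours) evaluates P(M_n ≤ u_n(x)) = Φ(u_n(x))^n exactly and compares it with the Gumbel limit.

```python
import math

def p_max_below(n, x):
    """Exact P(M_n <= u_n(x)) = Phi(u_n(x))^n for n iid standard Gaussians,
    with u_n(x) = b_n + x/a_n and the norming constants of Lemma 1.1."""
    a_n = math.sqrt(2.0 * math.log(n))
    b_n = a_n - (math.log(math.log(n)) + math.log(4.0 * math.pi)) / (2.0 * a_n)
    u_n = b_n + x / a_n
    tail = 0.5 * math.erfc(u_n / math.sqrt(2.0))  # 1 - Phi(u_n), without cancellation
    return math.exp(n * math.log1p(-tail))        # Phi(u_n)^n, computed stably

n = 10**7
for x in (-1.0, 0.0, 1.0, 2.0):
    print(f"x = {x:+.1f}: P(M_n <= u_n(x)) = {p_max_below(n, x):.4f}, "
          f"Gumbel limit = {math.exp(-math.exp(-x)):.4f}")
```

Even for n = 10^7 the agreement is only to about two decimal places, reflecting the logarithmic rate of convergence.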

1.2.2 Some technical preparation.

Our goal will be to be as general as possible with regard to the allowed distributions F. Of course we must anticipate that in some cases, no limiting distributions can be in construction 1.2 Extremal distributions 5

constructed (think, e.g., of the case of a distribution supported on the two points 0 and 1!). Nonetheless, we are not willing to limit ourselves to random variables with continuous distribution functions, and this will introduce a little bit of complication that, however, can be seen as a useful exercise. Before we continue, let us explain where we are heading. In the Gaussian case we have already seen that we could make certain choices at various places. In general, we can certainly multiply the constants a_n by a finite number and add a finite number to the choice of b_n. This will clearly result in a different form of the extremal distribution, which, however, we think of as morally equivalent. Thus, when classifying extremal distributions, we will think of two distributions, G, F, as equivalent if

F(ax + b) = G(x).   (1.2.22)

The distributions we are looking for arise as limits of the form

F^n(a_n x + b_n) \to G(x).   (1.2.23)

We will want to use that such limits have a particular property, namely that, for some choices of α_n, β_n,

G^n(\alpha_n x + \beta_n) = G(x).   (1.2.24)

This property is called max-stability. Our program is then reduced to classifying all max-stable distributions modulo the equivalence (1.2.22), and to determining their domains of attraction. Note the similarity to the characterisation of the Gaussian distribution as a stable distribution under addition of random variables. Recall the notion of weak convergence of distribution functions:

Definition 1.3. A sequence, F_n, of probability distribution functions is said to converge weakly to a probability distribution function F,

F_n \overset{w}{\to} F,   (1.2.25)

if and only if

F_n(x) \to F(x)   (1.2.26)

for all points x where F is continuous.

The next thing we want to do is to define the notion of the (left-continuous) inverse of a non-decreasing, right-continuous function (that may have jumps and flat pieces).

Definition 1.4. Let ψ : R → R be a monotone increasing, right-continuous function. Then the inverse function ψ^{-1} is defined as

\psi^{-1}(y) \equiv \inf\{x : \psi(x) \ge y\}.   (1.2.27)
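The generalized inverse (1.2.27) is easy to experiment with. The sketch below (our own illustration, not from the text) computes ψ^{-1}(y) = inf{x : ψ(x) ≥ y} by bisection for a step function with a flat piece and a jump, and checks properties (ii) and (iv) of Lemma 1.5 below.

```python
def pseudo_inverse(psi, y, lo=-100.0, hi=100.0, tol=1e-9):
    """Bisection for psi^{-1}(y) = inf{x : psi(x) >= y}, assuming psi is
    non-decreasing and the infimum lies in (lo, hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if psi(mid) >= y:
            hi = mid   # invariant: psi(hi) >= y
        else:
            lo = mid
    return hi

def psi(x):
    """A step function: flat piece of height 1/2 on [0, 1), jump to 1 at x = 1."""
    if x < 0.0:
        return 0.0
    if x < 1.0:
        return 0.5
    return 1.0

# property (ii): psi(psi^{-1}(y)) >= y
for y in (0.2, 0.5, 0.8, 1.0):
    assert psi(pseudo_inverse(psi, y)) >= y

# property (iv): for H(x) = psi(a x + b) - c, H^{-1}(y) = a^{-1}(psi^{-1}(y + c) - b)
a, b, c = 2.0, 1.0, 0.25
H = lambda x: psi(a * x + b) - c
y = 0.25
assert abs(pseudo_inverse(H, y) - (pseudo_inverse(psi, y + c) - b) / a) < 1e-6
print("generalized-inverse checks passed")
```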

We need the following properties of ψ^{-1}.

Lemma 1.5. Let ψ be as in the definition, let a > 0 and b, c be real constants, and set H(x) ≡ ψ(ax + b) − c. Then:

(i) ψ^{-1} is left-continuous.
(ii) ψ(ψ^{-1}(x)) ≥ x.
(iii) If ψ^{-1} is continuous at ψ(x) ∈ R, then ψ^{-1}(ψ(x)) = x.
(iv) H^{-1}(y) = a^{-1}(ψ^{-1}(y + c) − b).
(v) If G is a non-degenerate distribution function, then there exist y_1 < y_2 such that G^{-1}(y_1) < G^{-1}(y_2).

Proof. (i) First note that ψ^{-1} is increasing. Let y_n ↑ y, and assume that lim_n ψ^{-1}(y_n) < ψ^{-1}(y). Then there is a number x_0 such that, for all n, ψ^{-1}(y_n) ≤ x_0 < ψ^{-1}(y). By the definition of ψ^{-1} and property (ii), ψ(x_0) ≥ y_n for all n, while x_0 < inf{x : ψ(x) ≥ y} implies ψ(x_0) < y. Hence lim_n y_n ≤ ψ(x_0) < y, which contradicts the hypothesis that y_n ↑ y. Thus ψ^{-1} is left-continuous.

(ii) is immediate from the definition and the right-continuity of ψ.

(iii) ψ^{-1}(ψ(x)) = inf{x′ : ψ(x′) ≥ ψ(x)}, so obviously ψ^{-1}(ψ(x)) ≤ x. On the other hand, for any ε > 0, ψ^{-1}(ψ(x) + ε) = inf{x′ : ψ(x′) ≥ ψ(x) + ε}. But ψ(x′) can only be strictly greater than ψ(x) if x′ > x, so for any y′ > ψ(x), ψ^{-1}(y′) ≥ x. Thus, if ψ^{-1} is continuous at ψ(x), this implies that ψ^{-1}(ψ(x)) = x.

(iv) The verification of the formula for the inverse of H is elementary and left as an exercise.

(v) If G is not degenerate, then there exist x_1 < x_2 such that 0 < G(x_1) ≡ y_1 < G(x_2) ≡ y_2 ≤ 1. But then G^{-1}(y_1) ≤ x_1, while G^{-1}(y_2) = inf{x : G(x) ≥ G(x_2)}. If the latter were at most x_1, then for all x > x_1 we would have G(x) ≥ G(x_2), and since G is right-continuous, G(x_1) ≥ G(x_2), which is a contradiction. □

The following corollary is important.

Corollary 1.6. If G is a non-degenerate distribution function, and there are constants a > 0, α > 0, and b, β ∈ R such that, for all x ∈ R,

G(ax + b) = G(\alpha x + \beta),   (1.2.28)

then a = α and b = β.

Proof. Set H(x) ≡ G(ax + b). Then, by (iv) of the preceding lemma,

H^{-1}(y) = a^{-1}\left(G^{-1}(y) - b\right),   (1.2.29)

but by (1.2.28) also

H^{-1}(y) = \alpha^{-1}\left(G^{-1}(y) - \beta\right).   (1.2.30)

On the other hand, by (v) of the same lemma, there are at least two values of y for which G^{-1}(y) differs, i.e. there are x_1 < x_2 such that

a^{-1}(x_i - b) = \alpha^{-1}(x_i - \beta), \quad i = 1, 2,   (1.2.31)

which obviously implies the assertion of the corollary. □

Remark 1.7. Note that the assumption that G is non-degenerate is necessary. If, e.g., G(x) has a single jump from 0 to 1 at a point a, then G(5x − 4a) = G(x)!

The next theorem is known as Khintchine’s theorem:

Theorem 1.8. Let F_n, n ∈ N, be distribution functions, and let G be a non-degenerate distribution function. Let a_n > 0 and b_n ∈ R be sequences such that

F_n(a_n x + b_n) \overset{w}{\to} G(x).   (1.2.32)

Then there are constants α_n > 0 and β_n ∈ R, and a non-degenerate distribution function G_*, such that

F_n(\alpha_n x + \beta_n) \overset{w}{\to} G_*(x)   (1.2.33)

if and only if

a_n^{-1} \alpha_n \to a, \qquad (\beta_n - b_n)/a_n \to b,   (1.2.34)

and

G_*(x) = G(ax + b).   (1.2.35)

Remark 1.9. This theorem makes the comment made above precise, saying that different choices of the scaling sequences a_n, b_n can lead only to distributions that are related by a transformation of the form (1.2.35).

Proof. By changing F_n, we can assume for simplicity that a_n = 1, b_n = 0. Let us first show that if α_n → a and β_n → b, then F_n(α_n x + β_n) → G(ax + b) ≡ G_*(x). Let ax + b be a point of continuity of G. Write

F_n(\alpha_n x + \beta_n) = F_n(\alpha_n x + \beta_n) - F_n(ax + b) + F_n(ax + b).   (1.2.36)

By assumption, the last term converges to G(ax + b). Without loss of generality we may assume that α_n x + β_n is monotone increasing. We want to show that

F_n(\alpha_n x + \beta_n) - F_n(ax + b) \to 0.   (1.2.37)

Otherwise, there would be a constant δ > 0 such that, along a subsequence n_k,

\lim_k F_{n_k}(\alpha_{n_k} x + \beta_{n_k}) - F_{n_k}(ax + b) < -\delta.

But since α_{n_k} x + β_{n_k} ↑ ax + b, this implies that, for any y < ax + b, lim_k F_{n_k}(y) − F_{n_k}(ax + b) < −δ. Now, if G is continuous at y, this implies that G(y) − G(ax + b) ≤ −δ. But then either G is discontinuous at ax + b, or there exists a neighbourhood of ax + b in which G has no point of continuity. This is impossible, since a probability distribution function can have only countably many points of discontinuity. Thus (1.2.37) must hold, and hence

F_n(\alpha_n x + \beta_n) \overset{w}{\to} G(ax + b),   (1.2.38)

which proves (1.2.33) and (1.2.35).

Next we want to prove the converse, i.e. we want to show that (1.2.33) implies (1.2.34). Note first that (1.2.33) implies that the sequence α_n x + β_n is bounded, since otherwise there would be subsequences converging to plus or minus infinity, along which F_n(α_n x + β_n) would converge to 0 or 1, contradicting the assumption. This

implies that the sequence has convergent subsequences, α_{n_k}, β_{n_k}, along which

\lim_k F_{n_k}(\alpha_{n_k} x + \beta_{n_k}) = G_*(x).   (1.2.39)

Then the first part of the proof shows that α_{n_k} → a′, β_{n_k} → b′, and G_*(x) = G(a′x + b′). Now, if the full sequence does not converge, there must be another convergent subsequence, α_{n′_k} → a″, β_{n′_k} → b″. But then

G_*(x) = \lim_k F_{n'_k}(\alpha_{n'_k} x + \beta_{n'_k}) = G(a''x + b'').   (1.2.40)

Thus G(a′x + b′) = G(a″x + b″), and so, since G is non-degenerate, Corollary 1.6 implies that a′ = a″ and b′ = b″, contradicting the assumption that the sequences do not converge. This proves the theorem. □

1.2.3 Max-stable distributions.

We are now prepared to continue our search for extremal distributions. Let us formally define the notion of max-stable distributions.

Definition 1.10. A non-degenerate probability distribution function, G, is called max-stable if, for all n ∈ N, there exist a_n > 0, b_n ∈ R such that, for all x ∈ R,

G^n(a_n^{-1} x + b_n) = G(x).   (1.2.41)
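As a quick sanity check of the definition (our own computation, not from the text), the Gumbel distribution G(x) = exp(−e^{−x}) satisfies (1.2.41) with a_n = 1 and b_n = ln n:

```latex
G^n\!\left(a_n^{-1}x + b_n\right)
  = \exp\!\left(-n\,e^{-(x+\ln n)}\right)
  = \exp\!\left(-n \cdot n^{-1} e^{-x}\right)
  = \exp\!\left(-e^{-x}\right)
  = G(x).
```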

The next proposition gives some important equivalent formulations of max-stability and justifies the term.

Proposition 1.11. (i) A probability distribution, G, is max-stable if and only if there exist probability distributions F_n and constants a_n > 0, b_n ∈ R, such that, for all k ∈ N,

F_n(a_{nk}^{-1} x + b_{nk}) \overset{w}{\to} G^{1/k}(x).   (1.2.42)

(ii) G is max-stable if and only if there exist a probability distribution function, F, and constants a_n > 0, b_n ∈ R, such that

F^n(a_n^{-1} x + b_n) \overset{w}{\to} G(x).   (1.2.43)

Proof. We first prove (i). If (1.2.42) holds, then by Khintchine's theorem there exist constants α_k, β_k such that

G^{1/k}(x) = G(\alpha_k x + \beta_k)   (1.2.44)

for all k ∈ N, and thus G is max-stable. Conversely, if G is max-stable, set F_n = G^n, and let a_n, b_n be the constants that provide for (1.2.41). Then

F_n(a_{nk}^{-1} x + b_{nk}) = G^n(a_{nk}^{-1} x + b_{nk}) = \left[G^{nk}(a_{nk}^{-1} x + b_{nk})\right]^{1/k} = G^{1/k}(x),   (1.2.45)

which proves the existence of the sequence F_n and of the respective constants.

Now let us prove (ii). Assume first that G is max-stable. Then choose F = G; the fact that lim_n F^n(a_n^{-1} x + b_n) = G(x) follows trivially, using the constants from the definition of max-stability. Next assume that (1.2.43) holds. Then, for any k ∈ N,

F^{nk}(a_{nk}^{-1} x + b_{nk}) \overset{w}{\to} G(x),   (1.2.46)

and so

F^n(a_{nk}^{-1} x + b_{nk}) \overset{w}{\to} G^{1/k}(x),   (1.2.47)

so G is max-stable by (i). □

There is a slight extension to this result.

Corollary 1.12. If G is max-stable, then there exist functions a(s) > 0, b(s) ∈ R, s ∈ R_+, such that

G^s(a(s)x + b(s)) = G(x).   (1.2.48)

Proof. This follows essentially by interpolation. We have that

G^{[ns]}(a_{[ns]} x + b_{[ns]}) = G(x).   (1.2.49)

But

G^n(a_{[ns]} x + b_{[ns]}) = G^{[ns]/s}(a_{[ns]} x + b_{[ns]})\; G^{n-[ns]/s}(a_{[ns]} x + b_{[ns]})   (1.2.50)
= G^{1/s}(x)\; G^{n-[ns]/s}(a_{[ns]} x + b_{[ns]}).   (1.2.51)

As n ↑ ∞, the last factor tends to one (as the exponent remains bounded), and so

G^n(a_{[ns]} x + b_{[ns]}) \overset{w}{\to} G^{1/s}(x),   (1.2.52)

and

G^n(a_n x + b_n) \overset{w}{\to} G(x).   (1.2.53)

Thus, by Khintchine's theorem,

a_{[ns]}/a_n \to a(s), \qquad (b_n - b_{[ns]})/a_n \to b(s),   (1.2.54)

and

G^{1/s}(x) = G(a(s)x + b(s)).   (1.2.55)

□

1.2.4 The extremal types theorem.

Definition 1.13. Two distribution functions, G, H, are called "of the same type" if and only if there exist a > 0, b ∈ R such that

G(x) = H(ax + b).   (1.2.56)

We have seen that the only distributions that can occur as extremal distributions are max-stable distributions. We will now classify these distributions.

Theorem 1.14. Any max-stable distribution is of the same type as one of the following three distributions:

(I) The Gumbel distribution,

G(x) = e^{-e^{-x}}.   (1.2.57)

(II) The Fréchet distribution with parameter α > 0,

G(x) = \begin{cases} 0, & \text{if } x \le 0,\\ e^{-x^{-\alpha}}, & \text{if } x > 0.\end{cases}   (1.2.58)

(III) The Weibull distribution with parameter α > 0,

G(x) = \begin{cases} e^{-(-x)^{\alpha}}, & \text{if } x < 0,\\ 1, & \text{if } x \ge 0.\end{cases}   (1.2.59)
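All three families can be checked against Definition 1.10 directly. The following numerical verification (our own sketch; the norming constants used here are standard choices, not stated in the text at this point) confirms G^n(a_n^{-1}x + b_n) = G(x) with b_n = ln n, a_n = 1 for the Gumbel, a_n^{-1} = n^{1/α}, b_n = 0 for the Fréchet, and a_n^{-1} = n^{-1/α}, b_n = 0 for the Weibull distribution.

```python
import math

alpha, n = 1.5, 1000

def gumbel(x):
    return math.exp(-math.exp(-x))

def frechet(x):
    return math.exp(-x**(-alpha)) if x > 0 else 0.0

def weibull(x):
    return math.exp(-(-x)**alpha) if x < 0 else 1.0

for x in (0.5, 1.0, 2.0):
    assert abs(gumbel(x + math.log(n))**n - gumbel(x)) < 1e-9       # a_n = 1, b_n = ln n
    assert abs(frechet(n**(1 / alpha) * x)**n - frechet(x)) < 1e-9  # a_n^{-1} = n^{1/alpha}
for x in (-2.0, -1.0, -0.5):
    assert abs(weibull(n**(-1 / alpha) * x)**n - weibull(x)) < 1e-9  # a_n^{-1} = n^{-1/alpha}
print("max-stability verified for all three types")
```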

We do not give the rather lengthy proof of this theorem. It can be found in [57]. Let us state as an immediate corollary the so-called extremal types theorem.

Theorem 1.15. Let X_i, i ∈ N, be a sequence of iid random variables. If there exist sequences a_n > 0, b_n ∈ R, and a non-degenerate probability distribution function, G, such that

P\left(a_n(M_n - b_n) \le x\right) \overset{w}{\to} G(x),   (1.2.60)

then G(x) is of the same type as one of the three extremal-type distributions.

Note that it is of course not true that for an arbitrary distribution of the variables X_i it is possible to obtain a non-trivial limit as in (1.2.60). The following theorem gives necessary and sufficient conditions. We set x_F ≡ sup{x : F(x) < 1}.

Theorem 1.16. The following conditions are necessary and sufficient for a distribution function, F, to belong to the domain of attraction of the three extremal types:

Fréchet: x_F = +∞, and for some α > 0,

\lim_{t\uparrow\infty} \frac{1 - F(tx)}{1 - F(t)} = x^{-\alpha} \quad \forall x > 0.   (1.2.61)

Weibull: x_F < +∞, and for some α > 0,

\lim_{h\downarrow 0} \frac{1 - F(x_F - xh)}{1 - F(x_F - h)} = x^{\alpha} \quad \forall x > 0.   (1.2.62)

Gumbel: there exists g(t) > 0 such that

\lim_{t\uparrow x_F} \frac{1 - F(t + x g(t))}{1 - F(t)} = e^{-x} \quad \forall x.   (1.2.63)

This theorem is not of too much interest for us later, and we refer to [65] for the proof.
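As an illustration of the Fréchet criterion (our own example, not from the text), consider the Pareto law F(x) = 1 − x^{−α} for x ≥ 1. The ratio in (1.2.61) equals x^{−α} exactly for every t, and with the scaling a_n^{-1} = n^{1/α}, b_n = 0 one can evaluate P(M_n ≤ n^{1/α}x) = F(n^{1/α}x)^n in closed form and watch it approach the Fréchet limit exp(−x^{−α}).

```python
import math

alpha = 2.0

def F(x):
    """Pareto(alpha) distribution function."""
    return 1.0 - x**(-alpha) if x >= 1.0 else 0.0

# the ratio in the Frechet criterion is exactly x^{-alpha}, for every t
t, x = 50.0, 3.0
assert abs((1.0 - F(t * x)) / (1.0 - F(t)) - x**(-alpha)) < 1e-12

# P(M_n <= n^{1/alpha} x) = F(n^{1/alpha} x)^n -> exp(-x^{-alpha})
x = 1.5
for n in (10**2, 10**4, 10**6):
    exact = F(n**(1.0 / alpha) * x)**n
    print(n, round(exact, 6), "->", round(math.exp(-x**(-alpha)), 6))
```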

1.3 Level-crossings and the distribution of the k-th maxima.

In the previous section we answered the question of the distribution of the maximum of n iid random variables. It is natural to ask for more, i.e. for the joint distribution of the maximum, the second largest, the third largest, etc. From what we have seen, the levels u_n for which P(X_i > u_n) ∼ τ/n will play a crucial rôle. A natural variable to study is M_n^k, the value of the k-th largest of the first n variables X_i. It will be useful to introduce here the notion of order statistics.

Definition 1.17. Let X_1, ..., X_n be real numbers. Then we denote by M_n^1, ..., M_n^n its order statistics, i.e. for some permutation, π, of n numbers, M_n^k = X_{π(k)}, and

M_n^n \le M_n^{n-1} \le \cdots \le M_n^2 \le M_n^1 \equiv M_n.   (1.3.1)

We will also introduce the notation

S_n(u) \equiv \#\{i \le n : X_i > u\}   (1.3.2)

for the number of exceedances of the level u. Obviously we have the relation

P\left(M_n^k \le u\right) = P(S_n(u) < k).   (1.3.3)

The following result states that the number of exceedances of an extremal level un is Poisson distributed.

Theorem 1.18. Let X_i be iid random variables with common distribution F. If u_n is such that

n(1 - F(u_n)) \to \tau, \quad 0 < \tau < \infty,   (1.3.4)

then

P\left(M_n^k \le u_n\right) = P(S_n(u_n) < k) \to e^{-\tau} \sum_{s=0}^{k-1} \frac{\tau^s}{s!}.   (1.3.5)

Proof. The proof of this theorem is quite simple. We just need to consider all possible ways to realise the event {S_n(u_n) = s}. Namely,

P(S_n(u_n) = s) = \sum_{\{i_1,\dots,i_s\} \subset \{1,\dots,n\}} \prod_{\ell=1}^{s} P(X_{i_\ell} > u_n) \prod_{j \notin \{i_1,\dots,i_s\}} P(X_j \le u_n)
= \binom{n}{s} (1 - F(u_n))^s F(u_n)^{n-s}
= \frac{1}{s!}\, \frac{n!}{n^s (n-s)!}\, \left[n(1 - F(u_n))\right]^s \left[F^n(u_n)\right]^{1 - s/n}.   (1.3.6)

But, for any fixed s, n(1 - F(u_n)) \to \tau, F^n(u_n) \to e^{-\tau}, s/n \to 0, and \frac{n!}{n^s (n-s)!} \to 1. Thus

P(S_n(u_n) = s) \to e^{-\tau}\, \frac{\tau^s}{s!}.   (1.3.7)

Summing over all s < k gives the assertion of the theorem. □
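Behind Theorem 1.18 is nothing but the classical binomial-to-Poisson limit: S_n(u_n) is binomial with parameters n and 1 − F(u_n) ≈ τ/n. The following check (our own sketch) compares the exact binomial probabilities with the Poisson limit in (1.3.5).

```python
import math

def binom_cdf_below(n, p, k):
    """P(S_n < k) for S_n ~ Binomial(n, p)."""
    return sum(math.comb(n, s) * p**s * (1.0 - p)**(n - s) for s in range(k))

def poisson_cdf_below(tau, k):
    """The limit e^{-tau} sum_{s<k} tau^s / s! of (1.3.5)."""
    return math.exp(-tau) * sum(tau**s / math.factorial(s) for s in range(k))

n, tau, k = 10**6, 2.0, 3
exact = binom_cdf_below(n, tau / n, k)   # P(M_n^k <= u_n) when n(1 - F(u_n)) = tau
limit = poisson_cdf_below(tau, k)
print(round(exact, 8), "vs", round(limit, 8))
```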

Using very much the same sort of reasoning, one can generalise the question answered above to that of the numbers of exceedances of several extremal levels.

Theorem 1.19. Let u_n^1 > u_n^2 > \cdots > u_n^r be such that

n(1 - F(u_n^\ell)) \to \tau_\ell,   (1.3.8)

with

0 < \tau_1 < \tau_2 < \cdots < \tau_r < \infty.   (1.3.9)

Then, under the assumptions of the preceding theorem, with S_n^i \equiv S_n(u_n^i),

P\left(S_n^1 = k_1,\, S_n^2 - S_n^1 = k_2,\, \dots,\, S_n^r - S_n^{r-1} = k_r\right) \to \frac{\tau_1^{k_1}}{k_1!}\, \frac{(\tau_2 - \tau_1)^{k_2}}{k_2!} \cdots \frac{(\tau_r - \tau_{r-1})^{k_r}}{k_r!}\, e^{-\tau_r}.   (1.3.10)

Proof. Again, we just have to count the number of arrangements that will place the desired number of variables in the respective intervals. Then

P\left(S_n^1 = k_1,\, S_n^2 - S_n^1 = k_2,\, \dots,\, S_n^r - S_n^{r-1} = k_r\right)
= \binom{n}{k_1, \dots, k_r}\, P\left(X_1, \dots, X_{k_1} > u_n^1 \ge X_{k_1+1}, \dots, X_{k_1+k_2} > u_n^2, \dots\right.
\qquad \left.\dots, u_n^{r-1} \ge X_{k_1+\dots+k_{r-1}+1}, \dots, X_{k_1+\dots+k_r} > u_n^r \ge X_{k_1+\dots+k_r+1}, \dots, X_n\right)
= \binom{n}{k_1, \dots, k_r} (1 - F(u_n^1))^{k_1} \left(F(u_n^1) - F(u_n^2)\right)^{k_2} \cdots \left(F(u_n^{r-1}) - F(u_n^r)\right)^{k_r} F^{n - k_1 - \dots - k_r}(u_n^r).   (1.3.11)

Now we write

F(u_n^{\ell-1}) - F(u_n^\ell) = \frac{1}{n}\left[n(1 - F(u_n^\ell)) - n(1 - F(u_n^{\ell-1}))\right],   (1.3.12)

and use that n(1 - F(u_n^\ell)) - n(1 - F(u_n^{\ell-1})) \to \tau_\ell - \tau_{\ell-1}. Proceeding otherwise as in the proof of Theorem 1.18, we arrive at (1.3.10). □
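The same comparison can be made for Theorem 1.19: for finite n the vector of exceedance counts is multinomial, and the probability in (1.3.11) converges to the product of Poisson weights in (1.3.10). A small numerical check with two levels (our own sketch; the values of τ_1, τ_2 are arbitrary):

```python
import math

n = 10**6
tau1, tau2 = 1.0, 2.0   # n(1 - F(u_n^1)) -> tau1, n(1 - F(u_n^2)) -> tau2
k1, k2 = 1, 2           # prescribed counts above u_n^1 and in (u_n^2, u_n^1]

p1 = tau1 / n           # P(X > u_n^1)
p2 = (tau2 - tau1) / n  # P(u_n^2 < X <= u_n^1)
p0 = 1.0 - tau2 / n     # P(X <= u_n^2)

# exact multinomial probability, as in (1.3.11)
exact = (math.comb(n, k1) * math.comb(n - k1, k2)
         * p1**k1 * p2**k2 * p0**(n - k1 - k2))

# limiting product of Poisson weights, as in (1.3.10)
limit = (tau1**k1 / math.factorial(k1)
         * (tau2 - tau1)**k2 / math.factorial(k2)
         * math.exp(-tau2))

print(round(exact, 8), "vs", round(limit, 8))
```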

Chapter 2
Extremal processes

In this chapter we develop and complete the description of the collection of "extremal values" of a stochastic process. We will develop this theory in the language of point processes. We begin with some background on this subject and on the particular class of processes that will turn out to be fundamental, the Poisson point processes. For more details, see [65, 50, 30, 52].

2.1 Point processes

Point processes are designed to describe the probabilistic structure of point sets in some metric space, for our purposes R^d. For reasons that may not be obvious immediately, a convenient way to represent a collection of points x_i in R^d is to associate to them a point measure. Let us first consider a single point x. We consider the usual Borel sigma-algebra, B ≡ B(R^d), of R^d, which is generated from the open sets in the Euclidean topology of R^d. Given x ∈ R^d, we define the Dirac measure, δ_x, such that, for any Borel set A ∈ B,

\delta_x(A) = \begin{cases} 1, & \text{if } x \in A,\\ 0, & \text{if } x \notin A.\end{cases}   (2.1.1)

Definition 2.1. A point measure on R^d is a measure, μ, on (R^d, B(R^d)), such that there exists a countable collection of points, {x_i ∈ R^d, i ∈ N}, for which

\mu = \sum_{i=1}^{\infty} \delta_{x_i},   (2.1.2)

and such that μ(K) < ∞ whenever K is compact.

Note that the points x_i need not all be distinct. The set S_μ ≡ {x ∈ R^d : μ({x}) ≠ 0} is called the support of μ. A point measure such that μ({x}) ≤ 1 for all x ∈ R^d is called simple.

We denote by M_p(R^d) the set of all point measures on R^d. We equip this set with the sigma-algebra \mathcal{M}_p(R^d), the smallest sigma-algebra that contains all subsets of M_p(R^d) of the form {μ ∈ M_p(R^d) : μ(F) ∈ B}, where F ∈ B(R^d) and B ∈ B([0,∞)). \mathcal{M}_p(R^d) can also be characterised as the smallest sigma-algebra that makes the evaluation maps, μ → μ(F), measurable for all Borel sets F ∈ B(R^d).

Definition 2.2. A point process, N, is a random variable taking values in M_p(R^d), i.e. a measurable map, N : (Ω, \mathcal{F}, P) → M_p(R^d), from a probability space to the space of point measures.

This looks very fancy, but in reality things are quite down-to-earth:

d Proposition 2.3. A map N : Ω → Mp(R ) is a point process, if and only if the map N(·,F) : ω → N(ω,F), is measurable from (Ω,F) → ([0,∞],B([0,∞])), for any Borel set F, i.e. if N(F) is a real random variable.

Proof. Let us first prove necessity, which should be obvious. In fact, since ω → d p N(ω,·) is measurable into (Mp(R ),Mp(R )), and µ → µ(F) is measurable from this space into (R+,B(R+)), the composition of these maps is also measurable. Next we prove sufficiency. Define the set

d −1 G ≡ {A ∈ Mp(R ) : N A ∈ F}. (2.1.3)

d This set is a sigma-algebra and N is measurable from (Ω,F) → (Mp(R ),G) by d definition. But G contains all sets of the form {µ ∈ Mp(R ) : µ(F) ∈ B}, since

N^{-1}{µ ∈ M_p(R^d) : µ(F) ∈ B} = {ω ∈ Ω : N(ω, F) ∈ B} ∈ F, (2.1.4)

since N(·, F) is measurable. Thus G ⊃ ℳ_p(R^d), and N is measurable a fortiori as a map with respect to the smaller sigma-algebra. ⊓⊔

We will need criteria for the convergence of point processes. For this we recall some basic notions from standard measure theory. If B is the Borel sigma-algebra of a metric space E, then T ⊂ B is called a Π-system, if T is closed under finite intersections; G ⊂ B is called a λ-system, or a sigma-additive class, if (i) E ∈ G, (ii) if A, B ∈ G and A ⊃ B, then A \ B ∈ G, and (iii) if A_n ∈ G and A_n ⊂ A_{n+1}, then lim_{n↑∞} A_n ∈ G. The following useful observation is called Dynkin's theorem.

Theorem 2.4. If T is a Π-system and G is a λ-system, then G ⊃ T implies that G contains the smallest sigma-algebra containing T.

The most useful application of Dynkin's theorem is the observation that, if two probability measures are equal on a Π-system that generates the sigma-algebra, then they are equal on the sigma-algebra (since the set on which the two measures coincide forms a λ-system containing T). As a consequence, we can further restrict the criteria to be verified for N to be a point process. In particular, we can restrict the class of F's for which N(·, F) needs to be measurable to bounded rectangles.

Proposition 2.5. Suppose that T ⊂ B is a collection of relatively compact sets satisfying (i) T is a Π-system, (ii) the smallest sigma-algebra containing T is B, and (iii) either there exists E_n ∈ T such that E_n ↑ E, or there exists a partition, {E_n}, of E with ∪_n E_n = E and E_n ∈ T. Then N is a point process on (Ω, F) in (E, B), if and only if the map N(·, I) : ω → N(ω, I) is measurable for any I ∈ T.

Exercise. Check that the set of all finite unions of bounded (semi-open) rectangles indeed forms a Π-system for E = R^d that satisfies the hypotheses of the proposition.

Corollary 2.6. Let T satisfy the hypotheses of Proposition 2.5 and set

G ≡ { {µ : µ(I_j) = n_j, 1 ≤ j ≤ k} : k ∈ N, I_j ∈ T, n_j ≥ 0 }. (2.1.5)

Then the smallest sigma-algebra containing G is ℳ_p(R^d), and G is a Π-system.

Next we show that the law, P_N, of a point process is determined by the laws of the collections of random variables N(F_n), F_n ∈ B(R^d).

Proposition 2.7. Let N be a point process in (R^d, B(R^d)), and suppose that T is as in Proposition 2.5. Define the mass functions

P_{I_1,...,I_k}(n_1,...,n_k) ≡ P(N(I_j) = n_j, ∀ 1 ≤ j ≤ k), (2.1.6)

for I_j ∈ T, n_j ≥ 0. Then P_N is uniquely determined by the collection

{P_{I_1,...,I_k} : k ∈ N, I_j ∈ T}. (2.1.7)

We need some further notions.

Definition 2.8. Two point processes, N_1, N_2, are independent, if and only if, for any collections F_j ∈ B, G_j ∈ B, the vectors

(N_1(F_j), 1 ≤ j ≤ k) and (N_2(G_j), 1 ≤ j ≤ ℓ) (2.1.8)

are independent random vectors.

Definition 2.9. The set function λ defined on (R^d, B(R^d)) by

λ(F) ≡ E N(F) = ∫_{M_p(R^d)} µ(F) P_N(dµ), (2.1.9)

for F ∈ B, is a measure, called the intensity measure of the point process N.

For measurable functions f : R^d → R_+, we define

N(ω, f) ≡ ∫_{R^d} f(x) N(ω, dx). (2.1.10)

Then N(·, f) is a random variable, and we have that

E N(f) = λ(f) = ∫_{R^d} f(x) λ(dx). (2.1.11)
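The definitions above are concrete enough to compute with. The following minimal sketch (the helper names are ours, not from the text) stores a point measure as the list of its atoms and evaluates µ(F), as in (2.1.2), and µ(f), as in (2.1.10):

```python
# A point measure mu = sum_i delta_{x_i} can be stored as the plain list of
# its atoms.  Evaluating mu on a set F means counting atoms in F; integrating
# a function f against mu means summing f over the atoms, cf. (2.1.10).

def mu_of_set(atoms, indicator):
    """mu(F): number of atoms x_i with x_i in F, F given by its indicator."""
    return sum(1 for x in atoms if indicator(x))

def mu_of_fct(atoms, f):
    """mu(f) = sum_i f(x_i), the integral of f against mu."""
    return sum(f(x) for x in atoms)

atoms = [0.5, 1.5, 1.5, 3.0]            # atoms need not be distinct
in_F = lambda x: 0.0 <= x < 2.0         # F = [0, 2)
print(mu_of_set(atoms, in_F))           # mu(F) = 3, so this mu is not simple
print(mu_of_fct(atoms, lambda x: x))    # mu(id) = 6.5
```

Since the atom 1.5 appears twice, this µ is not simple in the sense of the definition above.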

2.2 Laplace functionals

If Q is a probability measure on (M_p, ℳ_p), the Laplace transform of Q is a map, ψ, from non-negative Borel functions on R^d to R_+, defined as

ψ(f) ≡ ∫_{M_p} exp( −∫_{R^d} f(x) µ(dx) ) Q(dµ). (2.2.1)

If N is a point process, the Laplace functional of N is

ψ_N(f) ≡ E e^{−N(f)} = ∫ e^{−N(ω,f)} P(dω) (2.2.2)
       = ∫_{M_p} exp( −∫_{R^d} f(x) µ(dx) ) P_N(dµ).

Proposition 2.10. The Laplace functional, ψ_N, of a point process, N, determines N uniquely.

Proof. For k ≥ 1, F_1,...,F_k ∈ B, and c_1,...,c_k ≥ 0, let f = ∑_{i=1}^k c_i 1_{F_i}(x). Then

N(ω, f) = ∑_{i=1}^k c_i N(ω, F_i), (2.2.3)

and

ψ_N(f) = E exp( −∑_{i=1}^k c_i N(F_i) ). (2.2.4)

This is the Laplace transform of the vector (N(F_i), 1 ≤ i ≤ k), which uniquely determines its law. Hence the proposition follows from Proposition 2.7. ⊓⊔

One can considerably restrict the set of functions required for the Laplace functional to determine the measure, as one can see from the proof above.

2.3 Poisson point processes.

The most important class of point processes for our purposes will be Poisson point processes.

Definition 2.11. Let λ be a σ-finite, positive measure on R^d. Then a point process, N, is called a Poisson point process with intensity measure λ (PPP(λ)), if

(i) for any F ∈ B(R^d) and k ∈ N,

P(N(F) = k) = e^{−λ(F)} (λ(F))^k / k!, if λ(F) < ∞, and P(N(F) = k) = 0, if λ(F) = ∞; (2.3.1)

(ii) If F,G ∈ B are disjoint sets, then N(F) and N(G) are independent random variables.

In the next proposition we will assert the existence of a Poisson point process with any desired intensity measure. In the proof we will give an explicit construction of such a process.

Proposition 2.12. (i) PPP(λ) exists, and its law is uniquely determined by the requirements of the definition. (ii) The Laplace functional of PPP(λ) is given, for f ≥ 0, by

Ψ_N(f) = exp( −∫_{R^d} (1 − e^{−f(x)}) λ(dx) ). (2.3.2)

Proof. Since we know that the Laplace functional determines a point process, in order to prove that the conditions of the definition uniquely determine PPP(λ), we show that they determine the form (2.3.2) of the Laplace functional. Thus suppose that N is a PPP(λ). Let f = c 1_F. Then

Ψ_N(f) = E[exp(−N(f))] = E[exp(−c N(F))] (2.3.3)
       = ∑_{k=0}^∞ e^{−ck} e^{−λ(F)} (λ(F))^k / k! = e^{(e^{−c}−1)λ(F)}
       = exp( −∫ (1 − e^{−f(x)}) λ(dx) ),

which is the desired form. Next, if the F_i are disjoint and f = ∑_{i=1}^k c_i 1_{F_i}, it is straightforward to see that

" k !# k ΨN( f ) = E exp − ∑ ciN(Fi) = ∏E[exp(−ciN(Fi))], (2.3.4) i=1 i=1 due to the independence assumption (ii); a simple calculations shows that this yields again the desired form. Finally, for general f , we can choose a sequence, fn, of the form considered, such that fn ↑ f . By monotone convergence then N( fn) ↑ N( f ). On the other hand, since e−N(g) ≤ 1, we get from dominated convergence that

Ψ_N(f_n) = E e^{−N(f_n)} → E e^{−N(f)} = Ψ_N(f). (2.3.5)

But 1 − e^{−f_n(x)} ↑ 1 − e^{−f(x)}, and monotone convergence gives once more

Ψ_N(f_n) = exp( −∫ (1 − e^{−f_n(x)}) λ(dx) ) → exp( −∫ (1 − e^{−f(x)}) λ(dx) ). (2.3.6)

On the other hand, given the form of the Laplace functional, it is trivial to verify that the conditions of the definition hold, by choosing suitable functions f. Finally, we turn to the construction of PPP(λ). Let us first consider the case λ(R^d) < ∞. Then construct

(i) a Poisson random variable, τ, of parameter λ(R^d);
(ii) a family, X_i, i ∈ N, of independent, R^d-valued random variables with common distribution λ/λ(R^d); this family is independent of τ.

Then set

N* ≡ ∑_{i=1}^τ δ_{X_i}. (2.3.7)

The easiest way to verify that N* is a PPP(λ) is to compute its Laplace functional. This is left as an easy exercise. To deal with the case when λ(R^d) is infinite, decompose λ into a countable sum of finite measures, λ_k, that are just the restrictions of λ to sets F_k, where the F_k form a partition of R^d into sets of finite λ-measure. Then N* is just the sum of the independent PPP(λ_k)'s, N*_k. ⊓⊔
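The two-step construction (2.3.7) is easy to implement. The sketch below (helper names and the choice of a uniform example intensity are ours; the Poisson variable is drawn by naive inversion, which assumes a moderate total mass) samples a PPP with intensity 5·Lebesgue on [0,1]:

```python
import math
import random

# Sketch of the construction (2.3.7) for a finite intensity measure:
# draw tau ~ Poisson(lambda(R^d)), then tau iid points with law
# lambda / lambda(R^d), independent of tau.

def sample_ppp(total_mass, sample_point, rng):
    u, k, p = rng.random(), 0, math.exp(-total_mass)
    acc = p
    while u > acc:                      # invert the Poisson cdf
        k += 1
        p *= total_mass / k
        acc += p
    return [sample_point(rng) for _ in range(k)]

# intensity 5 * Lebesgue on [0,1]: total mass 5, points iid Uniform(0,1)
pts = sample_ppp(5.0, lambda r: r.random(), random.Random(1))
print(len(pts), all(0.0 <= x <= 1.0 for x in pts))
```

Averaging the number of points over many independent draws recovers the total mass λ(R^d) = 5, as the defining property (2.3.1) requires.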

2.4 Convergence of point processes

Before we turn to applications to extremal processes, we still have to discuss the notion of convergence of point processes. As the laws of point processes are probability distributions on the space of point measures, we will naturally think of weak convergence. This means that we will say that a sequence of point processes, N_n, converges weakly to a point process, N, if, for all continuous functions, f, on the space of point measures,

E f(N_n) → E f(N). (2.4.1)

However, to understand what this means, we must discuss what continuous functions on the space of point measures are, i.e. we must introduce a topology on the set of point measures. The appropriate topology for our purposes will be that of vague convergence.

2.4.1 Vague convergence.

We consider the space R^d equipped with its natural Euclidean metric. Clearly, R^d is a complete, separable metric space. We will denote by C_0(R^d) the set of continuous, real-valued functions on R^d that have compact support; C_0^+(R^d) denotes the subset of non-negative functions. We consider M_+(R^d), the set of all σ-finite, positive measures on (R^d, B(R^d)). We denote by ℳ_+(R^d) the smallest sigma-algebra of subsets of M_+(R^d) that makes the maps m → m(f) measurable for all f ∈ C_0^+(R^d).

Definition 2.13. A sequence of measures, µ_n ∈ M_+(R^d), converges vaguely to a measure µ ∈ M_+(R^d) if, for all f ∈ C_0^+(R^d),

µn( f ) → µ( f ). (2.4.2)

Vague convergence defines a topology on the space of measures. Typical open neighbourhoods are of the form

B_{f_1,...,f_k,ε}(µ) ≡ {ν ∈ M_+(R^d) : |ν(f_i) − µ(f_i)| < ε, ∀ 1 ≤ i ≤ k}, (2.4.3)

i.e., to test the closeness of two measures, we test it on their expectations on finite collections of continuous, positive functions with compact support. Given this topology, one can of course define the corresponding Borel sigma-algebra, B(M_+(R^d)), which (fortunately) turns out to coincide with the sigma-algebra ℳ_+(R^d) introduced before. The following properties of vague convergence are useful.

Proposition 2.14. Let µ_n, n ∈ N, be in M_+(R^d). Then the following statements are equivalent:

(i) µ_n converges vaguely to µ, µ_n →^v µ.
(ii) µ_n(B) → µ(B) for all relatively compact sets, B, such that µ(∂B) = 0.
(iii) limsup_{n↑∞} µ_n(K) ≤ µ(K) and liminf_{n↑∞} µ_n(G) ≥ µ(G), for all compact K and all open, relatively compact G.

In the case of point measures, we would of course like to see that the points where the vaguely convergent measures are located themselves converge. The following proposition tells us that this is true.

Proposition 2.15. Let µ_n, n ∈ N, and µ be in M_p(R^d), with µ_n →^v µ. Let K be a compact set with µ(∂K) = 0. Then, for n ≥ n(K) large enough, there is a labelling of the points of µ_n such that

µ_n(· ∩ K) = ∑_{i=1}^p δ_{x_i^n}, µ(· ∩ K) = ∑_{i=1}^p δ_{x_i}, (2.4.4)

with (x_1^n,...,x_p^n) → (x_1,...,x_p). Another useful and unsurprising fact is the following.

Proposition 2.16. The set M_p(R^d) is vaguely closed in M_+(R^d).

Thus, in particular, the limit of a sequence of point measures will, if it exists as a σ-finite measure, be again a point measure. Finally, we need a criterion that describes relatively compact sets.

Proposition 2.17 (Proposition 3.16 in [65]). A subset M of M_+(E) or M_p(E) is relatively compact, if and only if one of the following holds:

• for all f ∈ C_0^+(E),

sup_{µ∈M} µ(f) < ∞; (2.4.5)

• for all relatively compact B ∈ B(E),

sup_{µ∈M} µ(B) < ∞. (2.4.6)

For the proof, see [65].

Proposition 2.18. The topology of vague convergence can be metrised and turns M+ into a complete, separable metric space.

Although we will not use the corresponding metric directly, it may be nice to see how this can be constructed. We therefore give a proof of the proposition that constructs such a metric.

Proof. The idea is to first find a countable collection of functions, h_i ∈ C_0^+(R^d), such that µ_n →^v µ if and only if, for all i ∈ N, µ_n(h_i) → µ(h_i). The construction below is from [50]. Take a family G_i, i ∈ N, that forms a base of relatively compact sets, and assume it to be closed under finite unions and finite intersections. One can find (by Urysohn's lemma) families of functions f_{i,n}, g_{i,n} ∈ C_0^+, such that

f_{i,n} ↑ 1_{G_i}, g_{i,n} ↓ 1_{G_i}. (2.4.7)

Take the countable set of functions g_{i,n}, f_{i,n} as the collection h_i. Now, µ ∈ M_+ is determined by its values on the h_j: first of all, µ(G_i) is determined by these values, since

µ(f_{i,n}) ↑ µ(G_i) and µ(g_{i,n}) ↓ µ(G_i). (2.4.8)

But the family G_i is a Π-system that generates the sigma-algebra B(R^d), and so the values µ(G_i) determine µ. Now, µ_n →^v µ if and only if, for all h_i, µ_n(h_i) → µ(h_i).

From here the idea is simple: Define

d(µ,ν) ≡ ∑_{i=1}^∞ 2^{−i} ( 1 − e^{−|µ(h_i)−ν(h_i)|} ). (2.4.9)

Indeed, if d(µ_n, µ) → 0, then, for each ℓ, |µ_n(h_ℓ) − µ(h_ℓ)| → 0, and conversely. ⊓⊔

It is not very difficult to verify that the resulting metric space is complete and separable.
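A truncated version of the metric (2.4.9) is immediate to compute once the test values µ(h_i) are known. The following sketch (with a finite list of test values standing in for the full countable collection, an illustrative simplification of ours) shows the two basic features, d(µ,µ) = 0 and d(µ,ν) > 0 when some test value differs:

```python
import math

# Truncated version of the metric (2.4.9): given the values mu(h_i) and
# nu(h_i) on finitely many test functions h_i, the distance is a weighted
# sum of damped differences, each term bounded by 2^{-i}.

def vague_dist(mu_vals, nu_vals):
    return sum(
        2.0 ** -(i + 1) * (1.0 - math.exp(-abs(m - n)))
        for i, (m, n) in enumerate(zip(mu_vals, nu_vals))
    )

mu_vals = [1.0, 2.0, 0.0]
print(vague_dist(mu_vals, mu_vals))              # 0.0 for identical values
print(vague_dist(mu_vals, [1.0, 2.5, 0.0]))      # positive otherwise
```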

2.4.2 Weak convergence.

Having established that the space of σ-finite measures is a complete, separable metric space, we can think of weak convergence of probability measures on this space just as if we were working on a Euclidean space. One very useful fact about weak convergence is Skorokhod's theorem, which relates weak convergence to almost sure convergence.

Theorem 2.19. Let X_n, n = 0,1,..., be a sequence of random variables on a complete, separable metric space. Then X_n converges weakly to a random variable X_0, if and only if there exists a family of random variables X_n*, defined on the probability space ([0,1], B([0,1]), m), where m is the Lebesgue measure, such that

(i) for each n, X_n =^D X_n*, and
(ii) X_n* → X_0*, almost surely.

(For a proof, see [12].) While weak convergence usually means that the actual realisations of the sequence of random variables do not converge at all and may oscillate widely, Skorokhod's theorem says that it is possible to find an "equally likely" sequence of random variables, X_n*, that do themselves converge, with probability one. Such a construction is easy in the case when the random variables take values in R. In that case, we associate with the random variable X_n (whose distribution function is F_n, which for simplicity we may assume strictly increasing) the random variable X_n*(t) ≡ F_n^{-1}(t). It is easy to see that

m(X_n* ≤ x) = ∫_0^1 1_{F_n^{-1}(t)≤x} dt = F_n(x) = P(X_n ≤ x). (2.4.10)
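This quantile construction is easily illustrated numerically. In the sketch below (a toy family of laws of our own choosing, not from the text), a single uniform variable t drives all the X_n* = F_n^{-1}(t) at once, and the values converge as the quantile functions do:

```python
import math
import random

# Quantile coupling behind Skorokhod's theorem on R: one uniform variable t
# drives all X_n* = F_n^{-1}(t) simultaneously.  Toy family (our own choice):
# F_n is the law of (1 + 1/n) Z with Z standard exponential, so
# F_n^{-1}(t) = -(1 + 1/n) log(1 - t) -> F_0^{-1}(t) = -log(1 - t).

def quantile(n, t):
    scale = 1.0 if n == 0 else 1.0 + 1.0 / n    # n = 0 encodes the limit law
    return -scale * math.log(1.0 - t)

t = random.Random(7).random()                   # one uniform sample for all n
print([quantile(n, t) for n in (1, 10, 100)], "->", quantile(0, t))
```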

On the other hand, if P(X_n ≤ x) → P(X_0 ≤ x) for all points of continuity of F_0, this means that, for Lebesgue-almost all t, F_n^{-1}(t) → F_0^{-1}(t), i.e. X_n* → X_0*, m-almost surely. Skorokhod's theorem is very useful for extracting important consequences from weak convergence. In particular, it allows one to prove the convergence of certain functionals of sequences of weakly convergent random variables which otherwise would not be obvious. When proving weak convergence, one usually proceeds in two steps.

• Prove tightness of the sequence of laws.
• Prove convergence of the finite-dimensional distributions.

A useful tightness criterion is the following (see [65], Lemma 3.20).

Lemma 2.20. A sequence of point processes, ξ_n, is tight, if and only if, for any f ∈ C_0^+(E), the sequence ξ_n(f) is tight as a sequence of real random variables.

Proof. Assume that, for all f ∈ C_0^+(E), the sequence ξ_n(f) is tight. Let g_i ∈ C_0^+(E) be a sequence of functions such that g_i ↑ 1, as i ↑ ∞. Then, for any ε > 0, there exist c_i < ∞, such that

P(ξ_n(g_i) > c_i) ≤ ε 2^{−i}. (2.4.11)

Let M ≡ ∩_{i≥1} {µ ∈ M_+(E) : µ(g_i) ≤ c_i}. We claim that this implies that, for any f ∈ C_0^+(E), we have sup_{µ∈M} µ(f) < ∞. To see this, note that, since g_i ↑ 1, for any f there will be K_0 < ∞ and i_0 < ∞ such that f ≤ K_0 g_{i_0}, and hence

sup_{µ∈M} µ(f) ≤ sup_{µ∈M} K_0 µ(g_{i_0}) ≤ K_0 c_{i_0}. (2.4.12)

This implies that M is relatively compact in M_+(E). Finally,

P(ξ_n ∉ M) ≤ ∑_{i≥1} P(ξ_n(g_i) > c_i) ≤ ε. (2.4.13)

Hence, for any ε > 0, there is a compact set such that, for all n, ξ_n is in this set with probability at least 1 − ε. This implies that the sequence ξ_n is tight. The converse statement is trivial. ⊓⊔

Corollary 2.21. Let E = R. Assume that, for all a, the sequence ξ_n(1_{x>a}) is tight. Then ξ_n is tight.

Proof. Obviously, we can choose the sequence g_i in the proof above as g_i(x) = 1_{x>a_i} for a_i ↓ −∞. ⊓⊔

Clearly, Laplace functionals are a good tool to verify weak convergence.

Proposition 2.22. Let N_n be a sequence of point processes. Then N_n converges weakly to N, if and only if the Laplace functionals converge, more precisely, if, for all f ∈ C_0^+(E),

Ψ_{N_n}(f) → Ψ_N(f). (2.4.14)

Proof. We just sketch the argument. (2.4.14) asserts that the Laplace transforms of the positive random variables N_n(f) converge, which implies convergence in distribution and hence tightness. On the other hand, Ψ_N(f) determines the law of N, and hence there can be only one limit point of the sequence N_n, which implies weak convergence. The converse assertion is trivial. ⊓⊔

Of course we do not really need to check convergence for all f ∈ C_0^+. For instance, in the case E = R, we may choose the class of functions of the form f(x) = ∑_{ℓ=1}^k c_ℓ 1_{x>u_ℓ}, k ∈ N, c_ℓ > 0, u_ℓ ∈ R. Clearly, the Laplace functionals evaluated on these functions determine the Laplace transforms of the vectors (N((u_1,∞)),...,N((u_k,∞))), and hence the probabilities (assuming the u_ℓ form an increasing sequence)

P(N((u_1,∞)) = m_1,...,N((u_k,∞)) = m_k) (2.4.15)
= P(N((u_1,u_2]) = m_1 − m_2, N((u_2,u_3]) = m_2 − m_3, ..., N((u_{k−1},u_k]) = m_{k−1} − m_k, N((u_k,∞)) = m_k),

and hence the mass functions, which we know determine the law of N by Proposition 2.7. Another useful criterion for weak convergence of point processes is provided by Kallenberg's theorem [50].

Theorem 2.23. Assume that ξ is a simple point process on a metric space E, and that T is a Π-system of relatively compact open sets such that, for I ∈ T,

P(ξ(∂I) = 0) = 1. (2.4.16)

If ξn, n ∈ N are point processes on E, and for all I ∈ T ,

lim_{n↑∞} P(ξ_n(I) = 0) = P(ξ(I) = 0), (2.4.17)

and

lim_{n↑∞} E ξ_n(I) = E ξ(I) < ∞, (2.4.18)

then

ξ_n →^w ξ. (2.4.19)

Remark 2.24. In the case E = R^d, the Π-system, T, can be chosen as the set of finite unions of bounded rectangles.

Proof. The key observation needed to prove the theorem is that simple point processes are uniquely determined by their avoidance functions. This seems rather intuitive, in particular in the case E = R: if we know the probability that an interval contains no point, we know the distribution of the gaps between points, and thus the distribution of the points. Let us note that we can write a point measure, µ, as

µ = ∑_{y∈S} c_y δ_y, (2.4.20)

where S is the support of the point measure and the c_y are positive integers. We can associate to µ the simple point measure

T* µ = µ* = ∑_{y∈S} δ_y. (2.4.21)

Then it is true that the map T* is measurable, and that, if ξ_1 and ξ_2 are point processes such that, for all I ∈ T,

P(ξ_1(I) = 0) = P(ξ_2(I) = 0), (2.4.22)

then

ξ_1* =^D ξ_2*. (2.4.23)

To see this, let

C ≡ { {µ ∈ M_p(E) : µ(I) = 0} : I ∈ T }. (2.4.24)

The set C is easily seen to be a Π-system. Thus, since by assumption the laws, P_i, of the point processes ξ_i coincide on this Π-system, they coincide on the sigma-algebra generated by it. We must now check that T* is measurable as a map from (M_p, σ(C)) to (M_p, ℳ_p), which will hold if, for each I, the map T_1* : µ → µ*(I) is measurable from (M_p, σ(C)) → {0,1,2,...}. Now introduce a family of finite coverings of (the relatively compact set) I by sets A_{n,j} whose diameter is less than 1/n. We choose the family such that, for each j, A_{n+1,j} ⊂ A_{n,i}, for some i. Then

T_1* µ = µ*(I) = lim_{n↑∞} ∑_{j=1}^{k_n} ( µ(A_{n,j}) ∧ 1 ), (2.4.25)

since eventually no A_{n,j} will contain more than one point of µ. Now set T_2* µ = µ(A_{n,j}) ∧ 1. Clearly,

(T_2*)^{-1}{0} = {µ : µ(A_{n,j}) = 0} ∈ σ(C), (2.4.26)

and so T_2* is measurable as desired, and so is T_1*, being a monotone limit of finite sums of measurable maps. But now

P(ξ_1* ∈ B) = P(T* ξ_1 ∈ B) = P( ξ_1 ∈ (T*)^{-1}(B) ) = P_1( (T*)^{-1}(B) ). (2.4.27)

But, since (T*)^{-1}(B) ∈ σ(C), by hypothesis, P_1( (T*)^{-1}(B) ) = P_2( (T*)^{-1}(B) ), which is also equal to P(ξ_2* ∈ B), and this proves (2.4.23). Now, as we have already mentioned, (2.4.18) implies tightness of the sequence ξ_n. Thus, for any subsequence n', there exists a sub-sub-sequence, n'', such that ξ_{n''} converges weakly to a limit, η. By Proposition 2.16 this is a point process. Let us assume for a moment that (a) η is simple, and (b), for any relatively compact A,

P(ξ(∂A) = 0) = 1 ⇒ P(η(∂A) = 0) = 1. (2.4.28)

Then the map µ → µ(I) is a.s. continuous with respect to η, and therefore, if ξ_{n'} →^w η, then

P(ξ_{n'}(I) = 0) → P(η(I) = 0). (2.4.29)

But we assumed that

P(ξ_n(I) = 0) → P(ξ(I) = 0), (2.4.30)

so that, by the foregoing observation and the fact that both η and ξ are simple, ξ =^D η. It remains to check the simplicity of η and (2.4.28). To verify the latter, we will show that, for any compact set, K,

P(η(K) = 0) ≥ P(ξ(K) = 0). (2.4.31)

We use that, for any such K, there exist sequences of functions, f_j ∈ C_0^+(R^d), and compact sets, K_j, such that

1_K ≤ f_j ≤ 1_{K_j}, (2.4.32)

and K_j ↓ K. Thus,

P(η(K) = 0) ≥ P(η(f_j) = 0) = P(η(f_j) ≤ 0). (2.4.33)

But ξ_{n'}(f_j) converges weakly to η(f_j), and so

P(η(f_j) ≤ 0) ≥ limsup_{n'} P(ξ_{n'}(f_j) ≤ 0) ≥ limsup_{n'} P(ξ_{n'}(K_j) = 0). (2.4.34)

Finally, we can approximate K_j by elements I_j ∈ T such that K_j ⊂ I_j ↓ K, so that

limsup_{n'} P(ξ_{n'}(K_j) = 0) ≥ limsup_{n'} P(ξ_{n'}(I_j) = 0) = P(ξ(I_j) = 0) ≥ P(ξ(K) = 0), (2.4.35)

so that (2.4.31) follows. Finally, to show simplicity, we take I ∈ T and show that η has multiple points in I with zero probability. Now,

P(η(I) > η*(I)) = P(η(I) − η*(I) ≥ 1/2) ≤ 2 ( E η(I) − E η*(I) ). (2.4.36)

The latter, however, is zero, due to the assumption of convergence of the intensity measures. ⊓⊔

Remark 2.25. The main requirement in the theorem is the convergence of the so-called avoidance functions, P(ξ_n(I) = 0), in (2.4.17). The convergence of the mean (the intensity measure), (2.4.18), provides tightness; it may be replaced by any other tightness criterion (see [30]). Note that, by Chebyshev's inequality, (2.4.18) implies tightness via Corollary 2.21. We will have to deal with situations where (2.4.18) fails, e.g. in branching Brownian motion.

2.5 Point processes of extremes

We are now ready to describe the structure of the extremes of random sequences in terms of point processes. There are several aspects of these processes that we may want to capture:

(i) the distribution of the largest values of the process: if u_n(x) is the scaling function such that P(M_n ≤ u_n(x)) →^w G(x), it would be natural to look at the point process

N_n ≡ ∑_{i=1}^n δ_{u_n^{-1}(X_i)}. (2.5.1)

As n tends to infinity, most of the points u_n^{-1}(X_i) will disappear to minus infinity, but we may hope that, as a point process, this object will converge.

(ii) the "spatial" structure of the large values: we may fix an extreme level, u_n, and ask for the distribution of the indices i for which X_i exceeds this level. Again, only a finite number of exceedances is to be expected. To represent the exceedances as a point process, it will be convenient to embed 1 ≤ i ≤ n in the unit interval (0,1] via the map i → i/n. This leads us to consider the point process of exceedances on (0,1],

N_n ≡ ∑_{i=1}^n δ_{i/n} 1_{X_i>u_n}. (2.5.2)

(iii) we may consider the two aspects together and consider the point process on R × (0,1],

N_n ≡ ∑_{i=1}^n δ_{(u_n^{-1}(X_i), i/n)}. (2.5.3)

In this chapter we consider only the case of iid sequences, although this is far too restrictive. We turn to the non-iid case later when we study Gaussian processes.
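For orientation, the three processes (2.5.1)–(2.5.3) can be built from a single sample. The sketch below uses standard exponentials, so that F(x) = 1 − e^{−x}, and takes u_n^{-1}(x) = n(1 − F(x)) = n e^{−x} and the level u_n = log(n/τ) (an illustrative choice of rescaling of ours, for which n(1 − F(u_n)) = τ exactly):

```python
import math
import random

# The three point processes (2.5.1)-(2.5.3) built from one iid sample of
# standard exponentials (illustrative choice, not from the text).

n, tau = 1000, 2.0
rng = random.Random(5)
X = [rng.expovariate(1.0) for _ in range(n)]

u_inv = lambda x: n * math.exp(-x)      # rescaling map u_n^{-1}
u_n = math.log(n / tau)                 # extreme level with n(1 - F(u_n)) = tau

values = [u_inv(x) for x in X]                                # process (2.5.1)
exceed = [i / n for i, x in enumerate(X, 1) if x > u_n]       # process (2.5.2)
full = [(u_inv(x), i / n) for i, x in enumerate(X, 1)]        # process (2.5.3)
print(len(exceed))          # roughly Poisson(tau) many exceedances
```

Note that X_i > u_n is equivalent to u_inv(X_i) < τ, which ties the exceedance process to the rescaled-value process.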

2.5.1 The point process of exceedances.

We begin with the simplest object, the process N_n of exceedances of an extremal level u_n.

Theorem 2.26. Let X_i be iid random variables with marginal distribution function F. Let τ > 0 and assume that there is u_n ≡ u_n(τ) such that n(1 − F(u_n(τ))) → τ. Then the point process

Ñ_n ≡ ∑_{i=−∞}^∞ δ_{i/n} 1_{X_i>u_n(τ)} (2.5.4)

converges weakly to the Poisson point process Ñ on R with intensity measure τ dx.

Proof. We will use Kallenberg's theorem. First note that, trivially,

E N_n((c,d]) = ∑_{i=1}^n P(X_i > u_n(τ)) 1_{i/n∈(c,d]} (2.5.5)
            = n(d − c)(1 − F(u_n(τ))) → τ(d − c),

so that the intensity measure converges to the desired one. Next we need to show that

P(N_n(I) = 0) → e^{−τ|I|}, (2.5.6)

for I any finite union of disjoint intervals. But

P(N_n(I) = 0) = P( ∀_{i/n∈I} X_i ≤ u_n ) → e^{−τ|I|}, (2.5.7)

from the basic result on the convergence of the law of the maximum. ⊓⊔
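The avoidance probability in (2.5.6)–(2.5.7) is a binomial-to-Poisson limit: over an interval of unit length there are about n independent trials with success probability p_n, where n p_n → τ. A quick numerical check (taking p_n = τ/n exactly, an illustrative assumption):

```python
import math

# Avoidance probability of the exceedance process on an interval of length 1:
# (1 - p_n)^n with p_n = tau / n, converging to exp(-tau) as n grows.

def avoidance(n, tau, length=1.0):
    p_n = tau / n
    return (1.0 - p_n) ** (n * length)

for n in (10, 100, 10000):
    print(n, avoidance(n, 2.0), math.exp(-2.0))
```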

2.5.2 The point process of extreme values.

Let us now turn to an alternative point process, that of the values of the largest maxima. In the iid case we have already shown enough in Theorem 1.19 to obtain the following result. However, we will use the occasion to show how to use Laplace functionals.

Theorem 2.27. Let X_i be a sequence of iid random variables with marginal distribution function F. Assume that, for all τ ∈ R_+, n(1 − F(u_n(τ))) → τ, uniformly on compact intervals. Then the point process

E_n ≡ ∑_{i=1}^n δ_{u_n^{-1}(X_i)} (2.5.8)

converges weakly to the Poisson Point process on R+ with intensity measure the Lebesgue measure.

Proof. We will show this by proving convergence of the Laplace functionals. For φ : R → R+ a continuous, non-negative function of compact support, we set

Ψ_n(φ) ≡ E[ exp( −∫ φ(x) E_n(dx) ) ] = E[ exp( −∑_{i=1}^n φ(u_n^{-1}(X_i)) ) ]. (2.5.9)

The computations in the iid case are very simple. By independence,

" n !# n −1   −1  E exp − ∑ φ un (Xi) = ∏E exp −φ un (Xi) i=1 i=1  −1 n = E exp −φ un (X1) . (2.5.10) in construction 30 2 Extremal processes

We know that the probability for u_n^{-1}(X_i) to be in the support of φ is small, of order 1/n. Thus, if we write

E[ e^{−φ(u_n^{-1}(X_i))} ] = 1 + E[ e^{−φ(u_n^{-1}(X_i))} − 1 ], (2.5.11)

the second term will be of order 1/n. Therefore,

( E[ e^{−φ(u_n^{-1}(X_1))} ] )^n ∼ exp( n E[ e^{−φ(u_n^{-1}(X_1))} − 1 ] ). (2.5.12)

Finally,

lim_{n↑∞} n E[ e^{−φ(u_n^{-1}(X_1))} − 1 ] = ∫ ( e^{−φ(τ)} − 1 ) dτ. (2.5.13)

To show the latter, note first that (assuming for the moment that φ is differentiable, and using integration by parts)

n E[ e^{−φ(u_n^{-1}(X_1))} − 1 ] = ∫_0^∞ n P( u_n^{-1}(X_1) ∈ dτ ) ( e^{−φ(τ)} − 1 ) (2.5.14)
                               = ∫_0^∞ (d/dτ)( e^{−φ(τ)} − 1 ) n P( u_n^{-1}(X_1) > τ ) dτ.

Since φ has compact support and the integrand converges uniformly on compact sets, the assertion (2.5.13) follows. A standard approximation argument shows that the same holds for all φ ∈ C_0^+. From (2.5.13) we get that the Laplace functional converges to that of the PPP(dτ), which proves the theorem. ⊓⊔
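The content of Theorem 2.27 can be seen in a small simulation. For standard exponentials the rescaled points u_n^{-1}(X_i) = n e^{−X_i} (the same illustrative choice of rescaling as before) are exactly n iid uniforms on (0,n), the finite-window analogue of the limiting Poisson process with Lebesgue intensity, so the mean number of rescaled points in [0,t] should be close to t:

```python
import math
import random

# Rescaled extreme-value points for standard exponentials: n e^{-X_i} with
# X_i ~ Exp(1) is Uniform(0, n), a finite window of PPP(Lebesgue) on R_+.

def extremal_points(n, rng):
    return [n * math.exp(-rng.expovariate(1.0)) for _ in range(n)]

rng = random.Random(0)
t, trials, n = 2.0, 4000, 50
mean_count = sum(
    sum(1 for p in extremal_points(n, rng) if p <= t) for _ in range(trials)
) / trials
print(mean_count)            # close to t = 2.0, the Lebesgue mass of [0, t]
```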

2.5.3 Complete Poisson convergence.

We now come to the final goal of this section, the characterisation of the space-value structure of the extremes as a two-dimensional point process. We consider again u_n(τ) such that n(1 − F(u_n(τ))) → τ. Then we define

N_n ≡ ∑_{i=1}^∞ δ_{(i/n, u_n^{-1}(X_i))} (2.5.15)

as a point process on R² (or, more precisely, on R_+ × R_+).

Theorem 2.28. Let u_n(τ) be as above. Then the point process N_n converges to the Poisson point process, N, on R_+² with intensity measure given by the Lebesgue measure.

Proof. The easiest way to prove this theorem in the iid case is to use Laplace functionals, just as in the proof of Theorem 2.27. For φ ∈ C_0^+(R_+²), clearly,

Ψ_{N_n}(φ) = ∏_{i=1}^∞ ( 1 + E[ exp( −φ(u_n^{-1}(X_i), i/n) ) − 1 ] ) (2.5.16)
           ∼ exp( n^{-1} ∑_{i=1}^∞ n E[ exp( −φ(u_n^{-1}(X_i), i/n) ) − 1 ] ).

Note that, since φ has compact support, there are only finitely many non-zero terms in the sum. As before,

n E[ exp( −φ(u_n^{-1}(X_i), i/n) ) − 1 ] = ∫_0^∞ ( e^{−φ(τ,i/n)} − 1 ) dτ (2.5.17)
      + ∫_0^∞ (d/dτ)( e^{−φ(τ,i/n)} − 1 ) ( n P( u_n^{-1}(X_i) > τ ) − τ ) dτ.

The first term is what we want, while, again because φ has compact support, say I × J (and, for the moment, bounded derivatives, with bound K), the second term is in absolute value smaller than

1_{i/n∈J} K ∫_I | n P( u_n^{-1}(X_1) > τ ) − τ | dτ. (2.5.18)

Since the term inside the absolute value tends to zero, inserting this into the sum over i still gives a vanishing contribution, while

n^{-1} ∑_{i=1}^∞ ∫_0^∞ ( e^{−φ(τ,i/n)} − 1 ) dτ → ∫_0^∞ ∫_0^∞ ( e^{−φ(τ,z)} − 1 ) dτ dz. (2.5.19)

From here the claimed result is obvious. ⊓⊔

Chapter 3 Normal sequences

The assumption that the underlying random processes are iid is, of course, rather restrictive and unrealistic in real applications. A lot of work has been done to extend extreme value theory to other processes. It turns out that the results of the iid model are fairly robust, and survive essentially unchanged under relatively weak mixing assumptions. In this course we are mainly interested in the case of Gaussian processes, and for this reason we present, in the present chapter, some of the main classical results for this case only. More precisely, we consider the case of stationary Gaussian sequences. We will introduce a very powerful tool, Gaussian comparison, that makes the study of extremes in the Gaussian case more amenable. In the stationary case, a normalised Gaussian sequence, (X_i, i ∈ Z), is characterised by

E X_i = 0, E X_i² = 1, E X_i X_j = r_{i−j}, (3.0.1)

where r_k = r_{|k|}. The sequence must, of course, be such that the infinite-dimensional matrix with entries c_{ij} = r_{i−j} is positive definite. Our main target here is to show that, under the so-called Berman condition [11],

r_n ln n → 0, (3.0.2)

the extremes of a stationary normal sequence behave like those of the corresponding iid normal sequence. We shall see that logarithmic decay of the correlations is indeed the boundary for the irrelevance of correlations.

3.1 Normal comparison

In the context of Gaussian random variables, a recurrent idea is to compare one process to another, simpler one. The simplest processes to compare with are, of course, iid variables, but the concept goes much farther. Let us consider a family of Gaussian random variables, ξ_1,...,ξ_n, normalised to have mean zero and variance one (we refer to such Gaussian random variables as centred normal random variables), and let Λ¹ denote their covariance matrix. Let, similarly, η_1,...,η_n be centred normal random variables with covariance matrix Λ⁰. Generally speaking, one is interested in comparing functions of these two processes that typically will be of the form

EF(X1,...,Xn), (3.1.1)

where F : R^n → R. For us the most common case would be

F(X_1,...,X_n) = 1_{X_1≤x_1,...,X_n≤x_n}. (3.1.2)

An extraordinarily efficient tool to compare such processes turns out to be interpolation. Given ξ and η, we define X_1^h,...,X_n^h, for h ∈ [0,1], by

X_i^h = √h ξ_i + √(1−h) η_i. (3.1.3)

Then Xh is normal and has covariance

Λ^h = h Λ¹ + (1−h) Λ⁰. (3.1.4)

The following Gaussian comparison lemma is a fundamental tool in the study of Gaussian processes.

Lemma 3.1. Let η, ξ, X^h be as above. Let F : R^n → R be differentiable and of moderate growth. Set f(h) ≡ E F(X_1^h,...,X_n^h). Then

f(1) − f(0) = (1/2) ∫_0^1 dh ∑_{i≠j} (Λ¹_{ij} − Λ⁰_{ij}) E[ (∂²F/∂x_i∂x_j)(X_1^h,...,X_n^h) ]. (3.1.5)

Proof. Trivially,

f(1) − f(0) = ∫_0^1 dh (d/dh) f(h), (3.1.6)

and

(d/dh) f(h) = (1/2) ∑_{i=1}^n E[ (∂F/∂x_i)(X^h) ( h^{-1/2} ξ_i − (1−h)^{-1/2} η_i ) ], (3.1.7)

where, of course, ∂F/∂x_i is evaluated at X^h. To continue, we use a remarkable formula for Gaussian processes, known as the Gaussian integration by parts formula.

Lemma 3.2. Let X_i, i ∈ {1,...,n}, be a multivariate Gaussian process, and let g : R^n → R be a differentiable function of at most polynomial growth. Then

E[ g(X) X_i ] = ∑_{j=1}^n E(X_i X_j) E[ (∂/∂x_j) g(X) ]. (3.1.8)

Proof. We first consider the scalar case, i.e.

Lemma 3.3. Let X be a centred Gaussian random variable, and let g : R → R be a differentiable function of at most polynomial growth. Then

E[ g(X) X ] = E(X²) E[ g'(X) ]. (3.1.9)

Proof. Let σ² = E X². Then

E[ X g(X) ] = (2πσ²)^{-1/2} ∫_{−∞}^∞ g(x) x e^{−x²/(2σ²)} dx
            = (2πσ²)^{-1/2} ∫_{−∞}^∞ g(x) ( −σ² (d/dx) e^{−x²/(2σ²)} ) dx
            = σ² (2πσ²)^{-1/2} ∫_{−∞}^∞ ( (d/dx) g(x) ) e^{−x²/(2σ²)} dx = E(X²) E[ g'(X) ], (3.1.10)

where we used elementary integration by parts and the assumption that g(x) e^{−x²/(2σ²)} ↓ 0, as x → ±∞. ⊓⊔

To prove the multivariate case, we use a trick from [71]: we let X_i, i = 1,...,n, be a centred Gaussian vector, and let X be a centred Gaussian random variable. Set

X_i' ≡ X_i − X (E X X_i)/(E X²).

Then

E X_i' X = E X_i X − E X_i X = 0, (3.1.11)

and so X is independent of the vector X'. Now compute

E[ X F(X_1,...,X_n) ] = E[ X F( X_1' + X (E X_1 X)/(E X²), ..., X_n' + X (E X_n X)/(E X²) ) ]. (3.1.12)

Using in this expression Lemma 3.3 for the random variable X alone, we obtain

E[ X F(X_1,...,X_n) ] = E X² ∑_{i=1}^n (E X_i X)/(E X²) E[ (∂F/∂x_i)( X_1' + X (E X_1 X)/(E X²), ..., X_n' + X (E X_n X)/(E X²) ) ]
                      = ∑_{i=1}^n E(X_i X) E[ (∂F/∂x_i)(X_1,...,X_n) ],

which proves the lemma. ⊓⊔

Applying Lemma 3.2 in (3.1.7) yields

\[
\frac{d}{dh}f(h)=\frac12\sum_{i\neq j}E(\xi_i\xi_j-\eta_i\eta_j)\,E\frac{\partial^2F}{\partial x_j\partial x_i}
=\frac12\sum_{i\neq j}\bigl(\Lambda^1_{ji}-\Lambda^0_{ji}\bigr)E\Bigl[\frac{\partial^2F}{\partial x_j\partial x_i}\bigl(X_1^h,\dots,X_n^h\bigr)\Bigr],\tag{3.1.13}
\]

which is the desired formula. ⊓⊔
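The integration by parts formula (3.1.8) is easy to test numerically. The following Monte Carlo sketch uses an assumed $2\times2$ covariance matrix and the test function $g(x_1,x_2)=x_1^2x_2$; these are illustration values, not part of the argument above.

```python
import numpy as np

# Monte Carlo sanity check of Gaussian integration by parts (3.1.8):
# E[g(X) X_1] = sum_j E[X_1 X_j] E[dg/dx_j(X)],  with g(x1, x2) = x1^2 x2.
rng = np.random.default_rng(0)
C = np.array([[1.0, 0.5], [0.5, 2.0]])              # assumed covariance
X = rng.multivariate_normal([0.0, 0.0], C, size=2_000_000)
x1, x2 = X[:, 0], X[:, 1]

lhs = np.mean(x1**2 * x2 * x1)                      # E[g(X) X_1]
# dg/dx1 = 2 x1 x2,  dg/dx2 = x1^2
rhs = C[0, 0] * np.mean(2 * x1 * x2) + C[0, 1] * np.mean(x1**2)
print(lhs, rhs)   # both close to the exact value 3*C[0,0]*C[0,1] = 1.5
```

The exact value follows from Wick's rule, $E[x_1^3x_2]=3\,E[x_1^2]E[x_1x_2]$.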

The general comparison lemma can be put to various good uses. The first is a monotonicity result that is sometimes known as Kahane’s theorem [49].

Theorem 3.4. Let ξ and η be two independent n-dimensional Gaussian vectors. Let D1 and D2 be subsets of {1,...,n} × {1,...,n}. Assume that

\[
E\xi_i\xi_j\le E\eta_i\eta_j,\ \text{if }(i,j)\in D_1,\qquad
E\xi_i\xi_j\ge E\eta_i\eta_j,\ \text{if }(i,j)\in D_2,\qquad
E\xi_i\xi_j=E\eta_i\eta_j,\ \text{if }(i,j)\notin D_1\cup D_2.\tag{3.1.14}
\]

n Let F be a function on R of moderate growth, such that its second derivatives satisfy

\[
\frac{\partial^2}{\partial x_i\partial x_j}F(x)\ge0,\ \text{if }(i,j)\in D_1,\qquad
\frac{\partial^2}{\partial x_i\partial x_j}F(x)\le0,\ \text{if }(i,j)\in D_2.\tag{3.1.15}
\]
Then
\[
EF(\xi)\le EF(\eta).\tag{3.1.16}
\]
Proof. The proof of the theorem can be trivially read off the preceding lemma by inserting the hypotheses into the right-hand side of (3.1.5). ⊓⊔

We will need two extensions of these results for functions that are not differentiable. The first is known as Slepian's lemma [69].

Lemma 3.5. Let $\xi$ and $\eta$ be two independent $n$-dimensional Gaussian vectors. Assume that

\[
E\xi_i\xi_j\ge E\eta_i\eta_j,\ \text{for all }i\neq j,\qquad
E\xi_i\xi_i=E\eta_i\eta_i,\ \text{for all }i.\tag{3.1.17}
\]
Then
\[
E\max_{i=1}^n\xi_i\le E\max_{i=1}^n\eta_i.\tag{3.1.18}
\]

Proof. Let
\[
F_\beta(x_1,\dots,x_n)\equiv\beta^{-1}\ln\sum_{i=1}^ne^{\beta x_i}.\tag{3.1.19}
\]
A simple computation shows that, for $i\neq j$,

\[
\frac{\partial^2F_\beta}{\partial x_i\partial x_j}=-\beta\,\frac{e^{\beta(x_i+x_j)}}{\bigl(\sum_{k=1}^ne^{\beta x_k}\bigr)^2}<0,\tag{3.1.20}
\]
and so the theorem implies that, for all $\beta>0$,

\[
EF_\beta(\xi)\le EF_\beta(\eta).\tag{3.1.21}
\]
On the other hand,
\[
\lim_{\beta\uparrow\infty}F_\beta(x_1,\dots,x_n)=\max_{i=1}^nx_i.\tag{3.1.22}
\]
Thus
\[
\lim_{\beta\uparrow\infty}EF_\beta(\xi_1,\dots,\xi_n)\le\lim_{\beta\uparrow\infty}EF_\beta(\eta_1,\dots,\eta_n),\tag{3.1.23}
\]
and hence (3.1.18) holds. ⊓⊔
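The convergence (3.1.22) is in fact a quantitative sandwich: $\max_ix_i\le F_\beta(x)\le\max_ix_i+\beta^{-1}\ln n$, which is what justifies exchanging limit and expectation. A quick numerical check, with an assumed test vector:

```python
import numpy as np

# The soft-max F_beta(x) = beta^{-1} ln sum_i exp(beta x_i) from (3.1.19)
# satisfies  max_i x_i <= F_beta(x) <= max_i x_i + (ln n)/beta,
# which gives the pointwise convergence (3.1.22) as beta -> infinity.
def F_beta(x, beta):
    m = np.max(x)                                   # max-shift for stability
    return m + np.log(np.sum(np.exp(beta * (x - m)))) / beta

x = np.array([0.3, -1.2, 2.5, 2.4])                 # assumed illustration vector
for beta in (1.0, 10.0, 100.0):
    val = F_beta(x, beta)
    assert np.max(x) <= val <= np.max(x) + np.log(len(x)) / beta
print(F_beta(x, 100.0))   # very close to max(x) = 2.5
```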

As a second application, we want to study $P(X_1\le u_1,\dots,X_n\le u_n)$. This corresponds to choosing $F(X_1,\dots,X_n)=\mathbb{1}_{X_1\le u_1,\dots,X_n\le u_n}$.

Lemma 3.6. Let $\xi,\eta$ be as above. Set $\rho_{ij}\equiv\max\bigl(|\Lambda^0_{ij}|,|\Lambda^1_{ij}|\bigr)$, and denote by $x_+\equiv\max(x,0)$ the positive part. Then

\[
P(\xi_1\le u_1,\dots,\xi_n\le u_n)-P(\eta_1\le u_1,\dots,\eta_n\le u_n)
\le\frac{1}{2\pi}\sum_{1\le i<j\le n}\frac{\bigl(\Lambda^1_{ij}-\Lambda^0_{ij}\bigr)_+}{\sqrt{1-\rho_{ij}^2}}\exp\Bigl(-\frac{u_i^2+u_j^2}{2(1+\rho_{ij})}\Bigr).\tag{3.1.24}
\]

Proof. Although the indicator function is not differentiable, we will proceed as if it were, setting
\[
\frac{d}{dx}\mathbb{1}_{x\le u}=-\delta(x-u),\tag{3.1.25}
\]
where $\delta$ denotes the Dirac delta function, i.e. $\int f(x)\delta(x-u)dx\equiv f(u)$. This can be justified by using smooth approximants of the indicator function and passing to the limit at the end (e.g. replace $\mathbb{1}_{x\le u}$ by $(2\pi\sigma^2)^{-1/2}\int_x^\infty\exp\bigl(-\frac{(z-u)^2}{2\sigma^2}\bigr)dz$, do all the computations, and pass to the limit $\sigma\downarrow0$ at the end). With this convention we have that, for $i\neq j$ (the two minus signs from (3.1.25) cancel),

\[
E\frac{\partial^2F}{\partial x_i\partial x_j}\bigl(X_1^h,\dots,X_n^h\bigr)
=E\Bigl[\prod_{k\neq i,j}\mathbb{1}_{X_k^h\le u_k}\,\delta(X_i^h-u_i)\delta(X_j^h-u_j)\Bigr]
\le E\,\delta(X_i^h-u_i)\delta(X_j^h-u_j)=\phi_h(u_i,u_j),\tag{3.1.26}
\]

where $\phi_h$ denotes the density of the bivariate normal distribution with covariance $\Lambda^h_{ij}$, i.e.

\[
\phi_h(u_i,u_j)=\frac{1}{2\pi\sqrt{1-(\Lambda^h_{ij})^2}}\exp\Bigl(-\frac{u_i^2+u_j^2-2\Lambda^h_{ij}u_iu_j}{2\bigl(1-(\Lambda^h_{ij})^2\bigr)}\Bigr).\tag{3.1.27}
\]

Now

\[
\frac{u_i^2+u_j^2-2\Lambda^h_{ij}u_iu_j}{2\bigl(1-(\Lambda^h_{ij})^2\bigr)}
=\frac{(u_i^2+u_j^2)\bigl(1-\Lambda^h_{ij}\bigr)+\Lambda^h_{ij}(u_i-u_j)^2}{2\bigl(1-(\Lambda^h_{ij})^2\bigr)}
\ge\frac{u_i^2+u_j^2}{2\bigl(1+|\Lambda^h_{ij}|\bigr)}\ge\frac{u_i^2+u_j^2}{2(1+\rho_{ij})},\tag{3.1.28}
\]

where $\rho_{ij}=\max\bigl(|\Lambda^0_{ij}|,|\Lambda^1_{ij}|\bigr)\ge|\Lambda^h_{ij}|$. (To prove the first inequality, note that it is trivial if $\Lambda^h_{ij}\ge0$; if $\Lambda^h_{ij}<0$, the result follows after some simple algebra.) Inserting this into (3.1.26) gives

\[
E\frac{\partial^2F}{\partial x_i\partial x_j}\bigl(X_1^h,\dots,X_n^h\bigr)\le\frac{1}{2\pi\sqrt{1-\rho_{ij}^2}}\exp\Bigl(-\frac{u_i^2+u_j^2}{2(1+\rho_{ij})}\Bigr),\tag{3.1.29}
\]

from which (3.1.24) follows immediately. ⊓⊔

Remark 3.7. It is often convenient to replace the assertion of Lemma 3.6 by

\[
\bigl|P(\xi_1\le u_1,\dots,\xi_n\le u_n)-P(\eta_1\le u_1,\dots,\eta_n\le u_n)\bigr|
\le\frac{1}{2\pi}\sum_{1\le i<j\le n}\frac{\bigl|\Lambda^1_{ij}-\Lambda^0_{ij}\bigr|}{\sqrt{1-\rho_{ij}^2}}\exp\Bigl(-\frac{u_i^2+u_j^2}{2(1+\rho_{ij})}\Bigr).\tag{3.1.30}
\]

A simple, but useful corollary is the specialisation of this lemma to the case when ηi are independent random variables.

Corollary 3.8. Let $\xi_i$ be centred normal variables with covariance matrix $\Lambda$, with $\Lambda_{ii}=1$, and let $\eta_i$ be iid standard normal variables. Then

\[
P(\xi_1\le u_1,\dots,\xi_n\le u_n)-P(\eta_1\le u_1,\dots,\eta_n\le u_n)
\le\frac{1}{2\pi}\sum_{1\le i<j\le n}\frac{(\Lambda_{ij})_+}{\sqrt{1-\Lambda_{ij}^2}}\exp\Bigl(-\frac{u_i^2+u_j^2}{2(1+|\Lambda_{ij}|)}\Bigr).\tag{3.1.31}
\]

In particular, if $|\Lambda_{ij}|\le\delta<1$ for all $i\neq j$,

\[
\bigl|P(\xi_1\le u,\dots,\xi_n\le u)-[\Phi(u)]^n\bigr|
\le\frac{1}{2\pi}\sum_{1\le i<j\le n}\frac{|\Lambda_{ij}|}{\sqrt{1-\delta^2}}\exp\Bigl(-\frac{u^2}{1+|\Lambda_{ij}|}\Bigr).\tag{3.1.32}
\]

Proof. The proof of the corollary is straightforward and left to the reader. ⊓⊔
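The bound (3.1.32) can be illustrated numerically. The sketch below compares $P(\max_i\xi_i\le u)$ for an equicorrelated Gaussian vector against the iid value $\Phi(u)^n$; the parameters $n$, $\varepsilon$, $u$ are assumed illustration values.

```python
import numpy as np
from math import erf, exp, pi, sqrt

# Monte Carlo illustration of Corollary 3.8 / (3.1.32): for weak equicorrelation
# eps, P(all xi_i <= u) differs from Phi(u)^n by at most
# (2 pi)^{-1} sum_{i<j} |L_ij| (1 - d^2)^{-1/2} exp(-u^2 / (1 + |L_ij|)).
Phi = lambda v: 0.5 * (1.0 + erf(v / sqrt(2.0)))
rng = np.random.default_rng(1)
n, eps, u = 5, 0.1, 1.5
C = (1 - eps) * np.eye(n) + eps * np.ones((n, n))   # Lambda_ij = eps, i != j
X = rng.multivariate_normal(np.zeros(n), C, size=1_000_000)
p_corr = np.mean(np.all(X <= u, axis=1))
p_iid = Phi(u) ** n
n_pairs = n * (n - 1) // 2
bound = n_pairs * eps / (2 * pi * sqrt(1 - eps**2)) * exp(-u * u / (1 + eps))
print(p_corr - p_iid, bound)   # difference is positive and below the bound
```

Positivity of the difference is Slepian's lemma at work: positive correlation makes the maximum stochastically smaller.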

Another simple corollary is a version of Slepian’s lemma:

Corollary 3.9. Let $\xi,\eta$ be as above. Assume that $\Lambda^1_{ij}\le\Lambda^0_{ij}$, for all $i\neq j$. Then

\[
P\Bigl(\max_{i=1}^n\xi_i\le u\Bigr)-P\Bigl(\max_{i=1}^n\eta_i\le u\Bigr)\le0.\tag{3.1.33}
\]

Proof. The proof of the corollary is again obvious, since under our assumption $\bigl(\Lambda^1_{ij}-\Lambda^0_{ij}\bigr)_+=0$. ⊓⊔

3.2 Applications to extremes

The comparison results of Section 3.1 can readily be used to give a criterion under which the extremes of correlated Gaussian sequences are distributed as in the independent case.

Lemma 3.10. Let $\xi_i$, $i\in\mathbb{Z}$, be a stationary centred normal sequence with unit variance and covariance $r_n$, and set $M_n\equiv\max_{i\le n}\xi_i$. Assume that $\sup_{n\ge1}|r_n|\le\delta<1$. Let $u_n$ be such that

\[
\lim_{n\uparrow\infty}n\sum_{i=1}^n|r_i|\exp\Bigl(-\frac{u_n^2}{1+|r_i|}\Bigr)=0.\tag{3.2.1}
\]
Then
\[
n(1-\Phi(u_n))\to\tau\iff P(M_n\le u_n)\to e^{-\tau}.\tag{3.2.2}
\]
Proof. Using Corollary 3.8, we see that (3.2.1) implies that

\[
\bigl|P(M_n\le u_n)-\Phi(u_n)^n\bigr|\downarrow0.\tag{3.2.3}
\]
Since
\[
n(1-\Phi(u_n))\to\tau\iff\Phi(u_n)^n\to e^{-\tau},\tag{3.2.4}
\]
the assertion of the lemma follows. ⊓⊔

Since the condition $n(1-\Phi(u_n))\to\tau$ determines $u_n$ (if $0<\tau<\infty$), one can easily derive a criterion for (3.2.1) to hold.

Lemma 3.11. Assume that $r_n\ln n\downarrow0$, and that $u_n$ is such that $n(1-\Phi(u_n))\to\tau$, $0<\tau<\infty$. Then (3.2.1) holds.

Proof. We know that, if n(1 − Φ(un)) ∼ τ,

\[
\exp\bigl(-\tfrac12u_n^2\bigr)\sim Ku_nn^{-1},\tag{3.2.5}
\]
and $u_n\sim\sqrt{2\ln n}$. Thus

\[
n|r_i|e^{-\frac{u_n^2}{1+|r_i|}}=n|r_i|e^{-u_n^2}\,e^{\frac{u_n^2|r_i|}{1+|r_i|}}.\tag{3.2.6}
\]

Let $\alpha>0$, and $i\ge n^\alpha$. Then

\[
n|r_i|e^{-u_n^2}\sim 2K^2n^{-1}|r_i|\ln n,\tag{3.2.7}
\]

and
\[
\frac{u_n^2|r_i|}{1+|r_i|}\le2|r_i|\ln n.\tag{3.2.8}
\]
But then
\[
|r_i|\ln n=|r_i|\ln i\,\frac{\ln n}{\ln i}\le|r_i|\ln i\,\frac{\ln n}{\ln n^\alpha}=\alpha^{-1}|r_i|\ln i,\tag{3.2.9}
\]

which tends to zero as $i\uparrow\infty$, since $r_n\ln n\to0$. Thus

\[
\sum_{i\ge n^\alpha}n|r_i|e^{-\frac{u_n^2}{1+|r_i|}}\le C\alpha^{-1}\sup_{i\ge n^\alpha}|r_i|\ln i\,\exp\bigl(2\alpha^{-1}|r_i|\ln i\bigr)\downarrow0,\tag{3.2.10}
\]

as $n\uparrow\infty$. On the other hand, since there exists $\delta>0$ such that $1-|r_i|\ge\delta$ for all $i$,

\[
\sum_{i\le n^\alpha}n|r_i|e^{-\frac{u_n^2}{1+|r_i|}}\le C\,n^{1+\alpha}\,n^{-2/(2-\delta)}(2\ln n)^{1/(2-\delta)},\tag{3.2.11}
\]

which tends to zero as well, provided we choose $\alpha$ such that $1+\alpha<2/(2-\delta)$, i.e. $\alpha<\delta/(2-\delta)$. This proves the lemma. ⊓⊔
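Lemmas 3.10 and 3.11 can be watched in action for a Gaussian AR(1) sequence, whose covariance $r_k=\rho^k$ decays exponentially, so that $r_n\ln n\to0$. The parameters below ($\rho$, $n$, $u$) are assumed illustration values.

```python
import numpy as np
from math import erf, sqrt

# Monte Carlo illustration of Lemma 3.10: for a stationary Gaussian AR(1)
# sequence with r_k = rho^k (so r_n ln n -> 0), the maximum behaves as in
# the iid case: P(M_n <= u) is close to Phi(u)^n.
Phi = lambda v: 0.5 * (1.0 + erf(v / sqrt(2.0)))
rng = np.random.default_rng(2)
rho, n, trials, u = 0.3, 1000, 50000, 3.3
X = rng.standard_normal(trials)                 # stationary start, X ~ N(0,1)
M = X.copy()
for _ in range(n - 1):
    X = rho * X + sqrt(1 - rho**2) * rng.standard_normal(trials)
    M = np.maximum(M, X)
p_emp = np.mean(M <= u)
print(p_emp, Phi(u) ** n)   # the two values agree to a few percent
```

The comparison bound (3.1.32) for these parameters is of order $10^{-2}$, which is the size of the residual discrepancy one should expect.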

The following theorem summarises the results on the stationary Gaussian case.

Theorem 3.12. Let $X_i$, $i\in\mathbb{N}$, be a stationary centred normal sequence with unit variance and covariance $r_n$, such that $r_n\ln n\to0$. Then:
(i) for $0\le\tau\le\infty$,

\[
n(1-\Phi(u_n))\to\tau\iff P(M_n\le u_n)\to e^{-\tau};\tag{3.2.12}
\]

(ii) $u_n(x)$ can be chosen as in the iid normal case; and (iii) the extremal process is Poisson; more precisely,

\[
\sum_{i=1}^n\delta_{u_n^{-1}(X_i)}\to\mathrm{PPP}(e^{-x}dx),\tag{3.2.13}
\]

where $\mathrm{PPP}(\mu)$ denotes the Poisson point process on $\mathbb{R}$ with intensity measure $\mu$.
Proof. Points (i) and (ii) have already been shown. For the convergence of the extremal process, we show convergence of the Laplace functional. Let $\phi:\mathbb{R}\to\mathbb{R}_+$ have compact support (say in $[a,b]$), and assume without loss of generality that $\phi$ has bounded derivatives. Then, by Lemma 3.1,

\[
E\Bigl[e^{-\sum_{i=1}^n\phi(u_n^{-1}(X_i))}\Bigr]-\prod_{i=1}^nE\Bigl[e^{-\phi(u_n^{-1}(X_i))}\Bigr]
=\frac12\int_0^1dh\sum_{i\neq j}\Lambda_{ij}\,E\Bigl[\phi'(u_n^{-1}(X_i^h))\phi'(u_n^{-1}(X_j^h))\,e^{-\sum_{k=1}^n\phi(u_n^{-1}(X_k^h))}\Bigr]\,2\ln n.\tag{3.2.14}
\]

By the assumptions on the function $\phi$, the right-hand side of (3.2.14) is bounded in absolute value by
\[
C\sum_{i\neq j}|\Lambda_{ij}|\,P\bigl(X_i^h>u_n(a),X_j^h>u_n(a)\bigr)\,2\ln n,\tag{3.2.15}
\]
for some constant $C$. Using the formula for the joint density of the variables $X_i^h,X_j^h$ and the bound (3.1.28), we see that

\[
P\bigl(X_i^h>u_n(a),X_j^h>u_n(a)\bigr)
\le\frac{1}{2\pi\sqrt{1-\Lambda_{ij}^2}}\int_{u_n(a)}^\infty dx\int_{u_n(a)}^\infty dy\,e^{-\frac{x^2+y^2}{2(1+|\Lambda_{ij}|)}}
\le\frac{(1+|\Lambda_{ij}|)^2}{2\pi u_n(a)^2\sqrt{1-\Lambda_{ij}^2}}\,e^{-\frac{u_n(a)^2}{1+|\Lambda_{ij}|}}.\tag{3.2.16}
\]

Using that $u_n(a)^2\sim2\ln n$ and inserting this bound into (3.2.15), we see that (3.2.14) tends to zero as $n\uparrow\infty$ whenever the assumption of Lemma 3.10 is satisfied. This proves the theorem. ⊓⊔

Chapter 4 Spin glasses

The motivation for the results I present in these lectures comes from spin glass theory. There is really not enough time to go into this at any depth, and I will only sketch a few key concepts. Those interested in more should read a book on the subject, e.g. [15], [71] or [64], or the less technical introduction by Newman and Stein [62].

4.1 Setting and examples

Spin glasses are spin systems with competing, random interactions. To remain close to the main theme of these lectures, let us remain in the realm of mean-field spin glasses of the type proposed by Sherrington and Kirkpatrick. Here we have a state space, $S_N\equiv\{-1,1\}^N$. On this space we define a Gaussian process $H_N:S_N\to\mathbb{R}$, characterised by its mean, $EH_N(\sigma)=0$ for all $\sigma\in S_N$, and covariance
\[
EH_N(\sigma)H_N(\sigma')\equiv Ng(\sigma,\sigma'),\qquad\sigma,\sigma'\in S_N,\tag{4.1.1}
\]

where $g$ is a positive definite quadratic form. The random functions $H_N$ are called Hamiltonians. In physics, they represent the energy of a configuration of spins, $\sigma$. In the most popular examples, $g$ is assumed to depend only on some distance between $\sigma$ and $\sigma'$. There are two popular distances:
• The Hamming distance,

\[
d_{\mathrm{ham}}(\sigma,\sigma')\equiv\sum_{i=1}^N\mathbb{1}_{\sigma_i\neq\sigma_i'}=\frac12\Bigl(N-\sum_{i=1}^N\sigma_i\sigma_i'\Bigr).\tag{4.1.2}
\]

In this case one chooses

\[
g(\sigma,\sigma')=p\bigl(1-2d_{\mathrm{ham}}(\sigma,\sigma')/N\bigr),\tag{4.1.3}
\]

with $p$ a polynomial with non-negative coefficients. The most popular case is $p(x)=x^2$. One denotes the quantity

\[
1-2d_{\mathrm{ham}}(\sigma,\sigma')/N\equiv R_N(\sigma,\sigma')\tag{4.1.4}
\]

as the overlap between $\sigma$ and $\sigma'$. The resulting class of models is called Sherrington-Kirkpatrick models.
• A second popular distance is the lexicographic distance,

\[
d_{\mathrm{lex}}(\sigma,\sigma')=N+1-\min(i:\sigma_i\neq\sigma_i').\tag{4.1.5}
\]

Note that $d_{\mathrm{lex}}$ is an ultrametric. In this case,

\[
g(\sigma,\sigma')=A\bigl(1-N^{-1}d_{\mathrm{lex}}(\sigma,\sigma')\bigr).\tag{4.1.6}
\]

Here $A:[0,1]\to[0,1]$ can be chosen as any non-decreasing function. It will be convenient to denote $q_N(\sigma,\sigma')\equiv N^{-1}(\min(i:\sigma_i\neq\sigma_i')-1)$ and call it the ultrametric overlap. The models obtained in this way were introduced by Derrida and Gardner [38, 39] and are called generalised random energy models (GREM).
The GREMs have a more explicit representation in terms of iid normal random variables. Let us fix a number, $k\in\mathbb{N}$, of levels. Then introduce $k$ natural numbers, $\ell_1,\dots,\ell_k\ge1$, with $\ell_1+\dots+\ell_k=N$, and $k$ real numbers $a_1,\dots,a_k$, such that $a_1^2+a_2^2+\dots+a_k^2=1$. Finally, let $\{X^m_{\sigma^1\dots\sigma^m}\}$, $\sigma^1\dots\sigma^m\in\{-1,1\}^{\ell_1+\dots+\ell_m}$, $1\le m\le k$, be iid centred normal random variables with variance $N$. The Hamiltonian of a $k$-level GREM is then given as a random function $H_N:\{-1,1\}^N\to\mathbb{R}$, defined as

\[
H_N(\sigma)\equiv\sum_{m=1}^ka_mX^m_{\sigma^1\dots\sigma^m},\tag{4.1.7}
\]

where we write $\sigma\equiv\sigma^1\sigma^2\dots\sigma^k$ with $\sigma^m\in\{-1,1\}^{\ell_m}$. It is straightforward to see that $H_N(\sigma)$ is Gaussian with variance $N$, and that

\[
E\bigl(H_N(\sigma)H_N(\tau)\bigr)=N\sum_{m=1}^ka_m^2\,\mathbb{1}_{\sigma_i=\tau_i,\forall i\le\ell_1+\dots+\ell_m}.\tag{4.1.8}
\]
Thus, if we define the function

\[
\Sigma^2(z)=N\sum_{m=1}^ka_m^2\,\mathbb{1}_{\ell_1+\dots+\ell_m\le z},\qquad z\in[0,N],\tag{4.1.9}
\]
then (4.1.8) can be written as

\[
E\bigl(H_N(\sigma)H_N(\tau)\bigr)=\Sigma^2\bigl(\min(i:\sigma_i\neq\tau_i)-1\bigr).\tag{4.1.10}
\]

This is the form of the correlation given previously, with $\Sigma^2(z)=NA(N^{-1}z)$. Clearly the representation (4.1.7) is very useful for doing explicit computations.
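The hierarchical covariance (4.1.8) is easy to verify empirically. The following sketch samples a two-level GREM with assumed parameters ($N$, $a_1$) and checks the two covariance values, $a_1^2N$ for configurations sharing the first block and $0$ otherwise.

```python
import numpy as np

# Empirical check of the GREM covariance (4.1.8) for k = 2 levels:
# H_N(s1 s2) = a1 X^1_{s1} + a2 X^2_{s1 s2}, with all X iid N(0, N).
rng = np.random.default_rng(4)
N, a1 = 10.0, 0.6
a2 = np.sqrt(1 - a1**2)
samples = 500_000
X1_s = rng.normal(0, np.sqrt(N), samples)        # shared first-level variable
X1_t = rng.normal(0, np.sqrt(N), samples)        # a different first block
X2_a, X2_b, X2_c = (rng.normal(0, np.sqrt(N), samples) for _ in range(3))

H_same = a1 * X1_s + a2 * X2_a                   # H_N(s1 s2)
H_shared = a1 * X1_s + a2 * X2_b                 # H_N(s1 t2): same first block
H_diff = a1 * X1_t + a2 * X2_c                   # H_N(t1 t2): disjoint blocks

print(np.mean(H_same * H_shared) / N)   # ~ a1^2 = 0.36
print(np.mean(H_same * H_diff) / N)     # ~ 0
```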

The main objects one studies in statistical mechanics are the Gibbs measures associated to these Hamiltonians. They are probability measures on SN that assign to σ ∈ SN the probability

\[
\mu_{\beta,N}(\sigma)\equiv\frac{e^{-\beta H_N(\sigma)}}{Z_{\beta,N}},\tag{4.1.11}
\]

where the normalising factor $Z_{\beta,N}$ is called the partition function. The parameter $\beta$ is called the inverse temperature. Note that these measures are random variables on the underlying probability space $(\Omega,\mathcal F,P)$ on which the Gaussian processes $H_N$ are defined. The objective of statistical mechanics is to understand the geometry of these measures for very large $N$. This problem is usually interesting when $\beta$ is large, where the Gibbs measures will feel the geometry of the random process $H_N$. In particular, one should expect that for large enough $\beta$ (and finite $N$), $\mu_{\beta,N}$ will assign mass only to configurations of almost minimal energy. One way to acquire information on the Gibbs measures is thus to first analyse the structure of the minima of $H_N$.

4.2 The REM

In order to illustrate what can go on in spin glasses, Derrida had the bright idea to introduce a simple toy model, the random energy model (REM). In the REM, the values $H_N(\sigma)$ of the Hamiltonian are simply independent Gaussian random variables with mean zero and variance $N$. One might think that this should be too simple but, remarkably, quite a lot can be learned from this example. Understanding the structure of the ground states in this model turns into the classical problem of extreme value theory for iid random variables, which is of course extremely well understood (there are numerous textbooks, of which I like [57] and [65] best). We will set $H_N(\sigma)\equiv-X_\sigma$. The first question is for the value of the maximum of the $X_\sigma$, $\sigma\in S_N$.

4.2.1 Rough estimates, the second moment method

To get a first feeling, we ask for the right order of the maximum. More precisely, we ask for the right choice of a function $u_N:\mathbb{R}\to\mathbb{R}$ such that the maximum is of order $u_N(x)$ with positive, $x$-dependent probability. To do so, introduce the counting function
\[
M_N(x)\equiv\sum_{\sigma\in S_N}\mathbb{1}_{X_\sigma>u_N(x)},\tag{4.2.1}
\]

which is the number of $\sigma$'s such that $X_\sigma$ exceeds $u_N(x)$. Then
\[
P\Bigl(\max_{\sigma\in S_N}X_\sigma>u_N(x)\Bigr)\le EM_N(x)=2^NP(X_\sigma>u_N(x)).\tag{4.2.2}
\]

This may be called a first moment estimate. So a good choice of $u_N(x)$ would be such that $2^NP(X_\sigma>u_N(x))=O(1)$. For later use, we even choose

\[
2^NP(X_\sigma>u_N(x))\sim e^{-x}.\tag{4.2.3}
\]
It is easy to see that this can be achieved with

\[
u_N(x)=N\sqrt{2\ln2}-\frac{\ln(N\ln2)+\ln(4\pi)}{2\sqrt{2\ln2}}+\frac{x}{\sqrt{2\ln2}}.\tag{4.2.4}
\]
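The calibration (4.2.3) can be checked directly with the Gaussian tail function; no simulation is needed. The sketch below evaluates $2^NP(X_\sigma>u_N(x))$ for a few values of $N$ (chosen for illustration) at $x=1$.

```python
from math import erfc, exp, log, pi, sqrt

# Check of (4.2.3): with u_N(x) from (4.2.4),
# 2^N P(X_sigma > u_N(x)) -> e^{-x} for X_sigma ~ N(0, N).
def u_N(x, N):
    a = sqrt(2 * log(2))
    return N * a - (log(N * log(2)) + log(4 * pi)) / (2 * a) + x / a

def tail(u, N):                      # P(X > u) for X ~ N(0, N)
    return 0.5 * erfc(u / sqrt(2 * N))

for N in (50, 200, 1000):
    print(N, 2.0**N * tail(u_N(1.0, N), N))   # approaches e^{-1} = 0.3679...
```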

The fact that $EM_N(x)=O(1)$ precludes, by Chebyshev's inequality, that there are too many exceedances of the level $u_N(x)$. But it may still be true that the probability that $M_N(x)\ge1$ tends to zero. To show that this is not the case, one may want to control, e.g., the variance of $M_N(x)$. Namely, using the Cauchy-Schwarz inequality, we have that
\[
(EM_N(x))^2=\bigl(E(M_N(x)\mathbb{1}_{M_N(x)\ge1})\bigr)^2\le E\bigl(M_N(x)^2\bigr)P(M_N(x)\ge1),\tag{4.2.5}
\]
so that
\[
P(M_N(x)\ge1)\ge\frac{(EM_N(x))^2}{E(M_N(x)^2)}.\tag{4.2.6}
\]
Now it is easy to compute

\[
E\bigl(M_N(x)^2\bigr)=2^NP(X_\sigma>u_N(x))+2^N(2^N-1)P(X_\sigma>u_N(x))^2
=(EM_N(x))^2(1-2^{-N})+EM_N(x).\tag{4.2.7}
\]
Hence
\[
P(M_N(x)\ge1)\ge\frac{1}{1-2^{-N}+1/EM_N(x)}\sim\frac{1}{1+e^{x}},\tag{4.2.8}
\]
for $N$ large. Together with the Chebyshev upper bound, this gives already good control on the probability to exceed $u_N(x)$, at least for large $x$:

\[
\frac{e^{-x}}{1+e^{-x}}\le P\Bigl(\max_{\sigma\in S_N}X_\sigma>u_N(x)\Bigr)\le e^{-x}.\tag{4.2.9}
\]
In particular,
\[
\lim_{x\uparrow\infty}e^{x}\lim_{N\uparrow\infty}P\Bigl(\max_{\sigma\in S_N}X_\sigma>u_N(x)\Bigr)=1,\tag{4.2.10}
\]
which is a bound on the upper tail of the distribution of the maximum. This is called the second moment estimate. First and second moment estimates are usually easy to use and give the desired control on maxima, if they work. Unfortunately, they do not always work as nicely as here.

4.2.2 Law of the maximum and the extremal process

In this simple case, we can of course do better. In fact, the results on extremes of iid random variables from Chapters 1 and 2 imply that
\[
P\Bigl(\max_{\sigma\in S_N}X_\sigma\le u_N(x)\Bigr)\to e^{-e^{-x}},\tag{4.2.11}
\]

as $N\uparrow\infty$. Moreover, with
\[
\mathcal{E}_N\equiv\sum_{\sigma\in S_N}\delta_{u_N^{-1}(X_\sigma)},\tag{4.2.12}
\]
we have:

Corollary 4.1. In the REM, the sequence of point processes $\mathcal{E}_N$ converges in law to the Poisson point process with intensity measure $e^{-z}dz$ on $\mathbb{R}$.
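The Gumbel limit (4.2.11) is visible already for quite small $N$; the sketch below simulates the REM maximum at $N=16$ (an assumed illustration value, so finite-size corrections are still present).

```python
import numpy as np
from math import exp, log, pi, sqrt

# Monte Carlo illustration of (4.2.11): for N = 16, i.e. 2^N = 65536 iid
# N(0, N) variables, P(max X_sigma <= u_N(0)) is already close to
# exp(-e^{-0}) = e^{-1} = 0.368.
rng = np.random.default_rng(3)
N, trials, x = 16, 500, 0.0
a = sqrt(2 * log(2))
u = N * a - (log(N * log(2)) + log(4 * pi)) / (2 * a) + x / a
hits = 0
for _ in range(trials):
    X = sqrt(N) * rng.standard_normal(2**N)      # 2^N iid N(0, N)
    hits += X.max() <= u
print(hits / trials, exp(-exp(-x)))
```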

4.3 The GREM, two levels

To understand what happens when correlation is introduced into the game, we consider the simplest version of the generalised random energy model, with just two hierarchies. That is, for $\sigma^1\in S_{N/2}$ and $\sigma^2\in S_{N/2}$, we define the Hamiltonian

\[
H_N(\sigma^1\sigma^2)\equiv a_1X^1_{\sigma^1}+a_2X^2_{\sigma^1\sigma^2},\tag{4.3.1}
\]

where all the $X$'s are iid centred Gaussian with variance $N$, and $a_1^2+a_2^2=1$. Note that this corresponds to the Gaussian process with covariance

\[
EH_N(\sigma)H_N(\sigma')=NA(q_N(\sigma,\sigma')),\tag{4.3.2}
\]
where
\[
A(x)=\begin{cases}0,&\text{if }x<1/2,\\a_1^2,&\text{if }1/2\le x<1,\\1,&\text{if }x=1.\end{cases}\tag{4.3.3}
\]

4.3.0.1 Second moment method

We may be tempted to retry the second moment method that worked so nicely in the REM. Of course we get

\[
EM_N(x)=2^NP(H_N(\sigma)>u_N(x)),\tag{4.3.4}
\]
as in the REM. But this quantity does not see any correlation, so this should make us suspicious. We know how to check whether this is significant: compute the second moment. This is easily done:

\[
E\bigl[M_N(x)^2\bigr]=\sum_{\sigma^1,\sigma^2,\tau^1,\tau^2}E\bigl[\mathbb{1}_{H_N(\sigma^1\sigma^2)>u_N(x)}\mathbb{1}_{H_N(\tau^1\tau^2)>u_N(x)}\bigr]\tag{4.3.5}
\]
\[
=\sum_{\sigma^1,\sigma^2}E\bigl[\mathbb{1}_{H_N(\sigma^1\sigma^2)>u_N(x)}\bigr]
+\sum_{\sigma^1}\sum_{\sigma^2\neq\tau^2}E\bigl[\mathbb{1}_{H_N(\sigma^1\sigma^2)>u_N(x)}\mathbb{1}_{H_N(\sigma^1\tau^2)>u_N(x)}\bigr]
+\sum_{\sigma^1\neq\tau^1}\sum_{\sigma^2,\tau^2}E\bigl[\mathbb{1}_{H_N(\sigma^1\sigma^2)>u_N(x)}\bigr]E\bigl[\mathbb{1}_{H_N(\tau^1\tau^2)>u_N(x)}\bigr].
\]

The first term yields $2^NP(H_N(\sigma)>u_N(x))$; the last is at most $2^{2N}P(H_N(\sigma)>u_N(x))^2=(EM_N(x))^2$, so we already know that these are fine, i.e. of order one. But the middle term is different. In fact, a straightforward Gaussian computation shows that

\[
E\bigl[\mathbb{1}_{H_N(\sigma^1\sigma^2)>u_N(x)}\mathbb{1}_{H_N(\sigma^1\tau^2)>u_N(x)}\bigr]\sim\exp\Bigl(-\frac{u_N(x)^2}{N(1+a_1^2)}\Bigr).\tag{4.3.6}
\]

Thus, with uN(x) as in the REM, the middle term gives a contribution of order

\[
2^{3N/2}\,2^{-2N/(1+a_1^2)}.\tag{4.3.7}
\]

This does not explode only if $a_1^2\le\frac13$. What is going on here? To see this, look in detail into the computations:

\[
P\bigl(H_N(\sigma^1\sigma^2)\sim u\,\wedge\,H_N(\sigma^1\tau^2)\sim u\bigr)\sim\int\exp\Bigl(-\frac{x^2}{2a_1^2N}-\frac{(u-x)^2}{(1-a_1^2)N}\Bigr)dx,\tag{4.3.8}
\]

which is obtained by first integrating over the value $x$ of $X^1_{\sigma^1}$ and then asking that both $X^2_{\sigma^1\sigma^2}$ and $X^2_{\sigma^1\tau^2}$ fill the gap to $u$. Now this integral, for large $u$, gets its main contribution from values of $x$ that maximise the exponent, which is easily checked to be reached at $x_c=\frac{2ua_1^2}{1+a_1^2}$. For $u=u_N(x)$, this yields indeed (4.3.7). However, in this case
\[
x_c\sim N\,\frac{2a_1^2\sqrt{2\ln2}}{1+a_1^2}=\frac{2\sqrt2\,a_1}{1+a_1^2}\,a_1N\sqrt{\ln2}.\tag{4.3.9}
\]
Now $a_1N\sqrt{\ln2}$ is the maximum that any of the $2^{N/2}$ variables $a_1X^1_{\sigma^1}$ can reach, and the value $x_c$ is larger than that as soon as $a_1>\sqrt2-1$. This indicates that this moment computation is nonsensical above that value of $a_1$. On the other hand, when we compute $P(H_N(\sigma^1\sigma^2)\sim u_N)$ in the same way, we find that the corresponding critical value of the first variable is $x_c=u_Na_1^2\sim\sqrt2\,a_1\bigl(a_1N\sqrt{\ln2}\bigr)=\sqrt2\,a_1\max_{\sigma^1}\bigl(a_1X^1_{\sigma^1}\bigr)$. Thus, here a problem occurs only for $a_1>1/\sqrt2$. In that latter case we must expect a change in the value of the maximum, but for smaller values of $a_1$ we should just be more clever.
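The location of the saddle point in (4.3.8) can be confirmed by brute force; the parameters below are assumed illustration values.

```python
import numpy as np

# Numerical check of the saddle point of (4.3.8): the exponent
# f(y) = -y^2/(2 a1^2 N) - (u - y)^2/((1 - a1^2) N)
# is maximised at y_c = 2 u a1^2 / (1 + a1^2).
a1, N = 0.7, 100.0
u = np.sqrt(2 * np.log(2)) * N
y = np.linspace(0.0, u, 200001)
f = -y**2 / (2 * a1**2 * N) - (u - y)**2 / ((1 - a1**2) * N)
y_num = y[np.argmax(f)]
y_the = 2 * u * a1**2 / (1 + a1**2)
print(y_num, y_the)   # agree to grid resolution
```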

4.3.0.2 Truncated second moments

We see that the problem comes from the excessively large values of the first component contributing to the second moments. A natural idea is to cut these off. Thus we introduce
\[
\widehat M_N(x)\equiv\sum_{\sigma^1,\sigma^2}\mathbb{1}_{H_N(\sigma^1\sigma^2)>u_N(x)}\,\mathbb{1}_{a_1X^1_{\sigma^1}\le a_1N\sqrt{\ln2}}.\tag{4.3.10}
\]
A simple computation shows that

\[
E\widehat M_N(x)=EM_N(x)(1+o(1)),\tag{4.3.11}
\]

as long as $a_1^2<1/2$. On the other hand, when we now compute $E\bigl[\widehat M_N(x)^2\bigr]$, the previously annoying term becomes

\[
E\bigl[\mathbb{1}_{H_N(\sigma^1\sigma^2)>u_N(x)}\mathbb{1}_{H_N(\sigma^1\tau^2)>u_N(x)}\mathbb{1}_{X^1_{\sigma^1}\le N\sqrt{\ln2}}\bigr]\sim2^{-3N/2}\,2^{-N\frac{(1-\sqrt2a_1)^2}{1-a_1^2}},\tag{4.3.12}
\]

which, after multiplication by the $2^{3N/2}$ terms in the sum, is insignificant for $a_1^2<1/2$. Since $M_N(x)\ge\widehat M_N(x)$, it follows that

\[
EM_N(x)\ge P(M_N(x)\ge1)\ge P\bigl(\widehat M_N(x)\ge1\bigr)\ge\frac{(E\widehat M_N(x))^2}{E[\widehat M_N(x)^2]}=O(1).\tag{4.3.13}
\]

So the result can be used to show that, as long as $a_1^2<1/2$, the maximum is of the same order as in the REM. If $a_1^2=1/2$, the bound on (4.3.12) is of order one, so we still get the same behaviour for the maximum. If $a_1$ is even larger, then the behaviour of the maximum changes. Namely, the value $\sqrt{2\ln2}\,N$ cannot be reached any more, and the maximum will be achieved by adding the maximum of the first hierarchy to the maximum in one of the branches of the second hierarchy. This yields for the leading order $(a_1+a_2)N\sqrt{\ln2}$.

4.3.1 The finer computations

As in the REM, we do of course want to compute things more precisely. This means we want to compute the limit of

• $P\bigl(\max_{(\sigma^1,\sigma^2)}H_N(\sigma^1\sigma^2)\le u_N(x)\bigr)$, or, better,
• the limit of the Laplace functionals of the extremal process,
\[
E\exp\Bigl(-\sum_{\sigma^1,\sigma^2\in S_{N/2}}\phi\bigl(u_N^{-1}(H_N(\sigma^1\sigma^2))\bigr)\Bigr).\tag{4.3.14}
\]

Technically, there is rather little difference between the two cases. Let us for simplicity compute the Laplace functional for the case $\phi=\lambda\mathbb{1}_{(x,\infty)}$. Then

\[
E\Bigl[\exp\Bigl(-\lambda\sum_{\sigma^1,\sigma^2\in S_{N/2}}\mathbb{1}_{H_N(\sigma^1\sigma^2)>u_N(x)}\Bigr)\Bigr]\tag{4.3.15}
\]

\[
=\prod_{\sigma^1\in S_{N/2}}E\Bigl[\exp\Bigl(-\lambda\sum_{\sigma^2\in S_{N/2}}\mathbb{1}_{H_N(\sigma^1\sigma^2)>u_N(x)}\Bigr)\Bigr]
=\Bigl[\int\frac{e^{-t^2/2N}}{\sqrt{2\pi N}}\Bigl(E\bigl[e^{-\lambda\mathbb{1}_{a_2X>u_N(x)-a_1t}}\bigr]\Bigr)^{2^{N/2}}dt\Bigr]^{2^{N/2}}
\]
\[
=\Bigl[\int\frac{e^{-t^2/2N}}{\sqrt{2\pi N}}\bigl(1+(e^{-\lambda}-1)P(a_2X>u_N(x)-a_1t)\bigr)^{2^{N/2}}dt\Bigr]^{2^{N/2}}.
\]
Here $X$ is a centred Gaussian random variable with variance $N$. Basically, to get something nontrivial, the probability in the last expression should be at most of order $2^{-N/2}$. Namely, if this probability is much larger than $2^{-N/2}$, then
\[
\bigl(1+(e^{-\lambda}-1)P(a_2X>u_N(x)-a_1t)\bigr)^{2^{N/2}}\approx\exp\bigl(-(1-e^{-\lambda})2^{N/2}P(a_2X>u_N(x)-a_1t)\bigr)\ll1.\tag{4.3.16}
\]
Integrating this over $t$ does not make it any bigger. So the last line in (4.3.15) is of the form

\[
\Bigl[\int_{-\infty}^{t_c}\frac{e^{-t^2/2N}}{\sqrt{2\pi N}}\bigl(1+(e^{-\lambda}-1)P(a_2X>u_N(x)-a_1t)\bigr)^{2^{N/2}}dt
+\int_{t_c}^{\infty}\frac{e^{-t^2/2N}}{\sqrt{2\pi N}}\,O(1)\,dt\Bigr]^{2^{N/2}}.\tag{4.3.17}
\]

Here we defined tc by

\[
P(a_2X>u_N(x)-a_1t_c)=2^{-N/2}.\tag{4.3.18}
\]
By standard Gaussian asymptotics, this yields
\[
t_c=\frac{u_N(x)}{a_1}-\frac{a_2N\sqrt{\ln2}}{a_1}.\tag{4.3.19}
\]
The first term in (4.3.17) is essentially

\[
\int_{-\infty}^{t_c}\frac{e^{-t^2/2N}}{\sqrt{2\pi N}}\bigl(1+2^{N/2}(e^{-\lambda}-1)P(a_2X>u_N(x)-a_1t)\bigr)dt.\tag{4.3.20}
\]
Inserting the standard Gaussian asymptotics and reorganising things, this is asymptotically equal to

\[
1+2^{N/2}\bigl(e^{-\lambda}-1\bigr)\frac{e^{-u_N(x)^2/2N}}{\sqrt{2\pi}\,u_N(x)/\sqrt N}\int_{-\infty}^{t_c-a_1u_N(x)}\frac{e^{-s^2/2a_2^2N}}{\sqrt{2\pi a_2^2N}}ds
\sim1+2^{-N/2}e^{-x}\bigl(e^{-\lambda}-1\bigr)\int_{-\infty}^{t_c-a_1u_N(x)}\frac{e^{-s^2/2a_2^2N}}{\sqrt{2\pi a_2^2N}}ds.\tag{4.3.21}
\]

The integral is essentially one if $t_c-a_1u_N(x)\gg\sqrt N$, which is the case when $a_1^2<1/2$. In the case $a_1^2=1/2$, $t_c-a_1u_N(x)=o(\sqrt N)$, and the integral is $1/2$. The second term in (4.3.17) is bounded by

\[
O\bigl(e^{-t_c^2/2N}\bigr).\tag{4.3.22}
\]

Hence this term gives a vanishing contribution to (4.3.17), and we find that in both cases

\[
\lim_{N\uparrow\infty}E\Bigl[\exp\Bigl(-\lambda\sum_{\sigma^1,\sigma^2\in S_{N/2}}\mathbb{1}_{H_N(\sigma^1\sigma^2)>u_N(x)}\Bigr)\Bigr]
=\exp\bigl(e^{-x}\bigl(e^{-\lambda}-1\bigr)K\bigr),\tag{4.3.23}
\]

which is the Laplace functional of a Poisson process with intensity $Ke^{-x}dx$, $K$ being $1$ or $1/2$, depending on whether $a_1^2<1/2$ or $a_1^2=1/2$, respectively.
In [19] the full picture is explored when the number of levels is arbitrary (but finite). The general result is that the extremal process remains Poisson with intensity measure $e^{-x}dx$ whenever the function $A$ stays strictly below the straight line $A(x)=x$, and it is Poisson with intensity $Ke^{-x}dx$ when $A(x)$ touches the straight line finitely many times. The value of the constant $K<1$ can be expressed in terms of the probability that a Brownian bridge stays below $0$ at the points where $A(x)=x$.
If $a_1^2>1/2$, the entire picture changes. In that case, the maximal values of the process are achieved by adding up the maxima of the two hierarchies, and this leads to a lower value even on the leading scale $N$. The extremal process also changes, but it is simply a concatenation of Poisson processes. This has all been fully explored in [19].
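The threshold at $a_1^2=1/2$ in the computation above is pure algebra and can be checked numerically: with $u=\sqrt{2\ln2}\,N$ and $t_c$ as in (4.3.19), the quantity $t_c-a_1u$ changes sign exactly there. The value of $N$ below is an assumed illustration value.

```python
from math import log, sqrt

# Check of the threshold in (4.3.19)-(4.3.21): t_c - a1 u is positive for
# a1^2 < 1/2, zero at a1^2 = 1/2, negative for a1^2 > 1/2 (leading order).
N = 1000.0
def gap(a1):
    a2 = sqrt(1 - a1**2)
    u = sqrt(2 * log(2)) * N
    t_c = (u - a2 * N * sqrt(log(2))) / a1
    return t_c - a1 * u

print(gap(0.6), gap(sqrt(0.5)), gap(0.8))   # positive, ~0, negative
```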

4.4 The general model and relation to branching Brownian motion

We have seen that the general case corresponds to a Gaussian process indexed by $S_N$ with covariance $EX_\sigma X_{\sigma'}=NA(q_N(\sigma,\sigma'))$, for a non-decreasing function $A$. Now we are invited to think of $S_N$ as the leaves of a tree with binary branching and $N$ levels. Then it makes sense to extend the Gaussian process from the leaves of the tree to the entire tree. If we think of the edges of the tree as being of length one, this process should be indexed by $\sigma\in S_N$ and $t\in[0,N]$, with covariance

\[
E[X_\sigma(t)X_{\sigma'}(s)]\equiv NA\bigl((t\wedge s)/N\wedge q_N(\sigma,\sigma')\bigr).\tag{4.4.1}
\]
These models were introduced by Derrida and Gardner [38, 39] and further investigated in [20]. It turns out that the leading order behaviour of the maximum in these models can be computed by approximating the function $A$ by step functions. The point here is that if $A$ is a step function, one can show easily, using the methods explained above, that

\[
\lim_{N\uparrow\infty}N^{-1}\max_{\sigma\in S_N}X_\sigma=\sqrt{2\ln2}\int_0^1\sqrt{\bar A'(x)}\,dx,\tag{4.4.2}
\]

where $\bar A$ denotes the concave hull of $A$ (i.e. the smallest concave function larger than or equal to $A$). It is straightforward to see that this formula remains true for arbitrary non-decreasing $A$. The proof, given in [20], uses the Gaussian comparison methods explained in Chapter 3.
A similar feature is not true for the subleading corrections. They can be computed for all step functions, but the expression has no obvious limit when these converge to some continuous function. If the concave hull of $A$ is the linear function $x$, we have seen already in the case of two steps that the intensity of the Poisson process of extremes changes when $A$ touches the straight line. In general, a rather complicated looking formula is obtained if $A$ touches the straight line in several points [20]. These observations make the case $A(x)=x$ an obviously interesting target.
Another interesting observation is that the processes introduced above can all be constructed with the help of Brownian motions. In the case $A(x)=x$, one sees easily that this can be realised as follows. Start at time $0$ with a Brownian motion that splits into two independent Brownian motions at time $t=1$. Each of these splits again into two independent copies at time $t=2$, and so on. The case of general non-decreasing $A$ just corresponds to a time change of this process. The case studied above, when $A$ is a step function, corresponds to singular time changes where for some time the Brownian motions stop moving and just split, and then recover the appropriate variance instantaneously.

4.5 The Galton-Watson process

So far we have looked at Gaussian processes on $\{-1,1\}^N$, which could be seen as the leaves of a binary tree of depth $N$. We could also decide to take some different tree. A relevant example would be the continuous-time Galton-Watson tree; see, e.g., [8]. The Galton-Watson process [74] is the most canonical continuous-time branching process. Start with a single particle at time zero. After an exponential time of parameter one, this particle splits into $k$ particles according to some probability distribution $p$ on $\mathbb{N}$. Then each of the new particles splits at independent exponential

times, independently, according to the same branching rule, and so on. For the purposes of this book, we will always assume that $p_0=0$, i.e. no deaths occur. At time $t$, there are $n(t)$ "individuals" $(i_k(t),1\le k\le n(t))$. The point is that the collection of individuals is endowed with a genealogical structure. Namely, for each pair of individuals at time $t$, $i_k(t),i_\ell(t)$, there is a unique time $0\le s\le t$ when they shared a common ancestor for the last time. We call this time $d(i_k(t),i_\ell(t))$.
It is sometimes useful to provide a labelling of the Galton-Watson process in terms of multi-indices. For convenience we think of multi-indices as infinite sequences of non-negative integers. Let us set

\[
I\equiv\mathbb{Z}_+^{\mathbb{N}},\tag{4.5.1}
\]
and let $F\subset I$ denote the subset of multi-indices that have only finitely many entries different from zero. Ignoring trailing zeros, we see that

\[
F=\cup_{k=0}^\infty\mathbb{Z}_+^k,\tag{4.5.2}
\]

where $\mathbb{Z}_+^0$ is the empty multi-index, i.e. the multi-index containing only zeros. A continuous-time Galton-Watson process will be encoded by the set of branching times, $\{t_1<t_2<\dots<t_{w(t)}<\dots\}$ (where $w(t)$ denotes the number of branching times up to time $t$), and by a consistently assigned set of multi-indices for all times $t\ge0$. To do so, we construct for a given tree the sets of multi-indices $\tau(t)$ at time $t$ as follows.

• $\tau(0)=\{(0,0,\dots)\}$.
• For all $j\ge0$ and $t\in[t_j,t_{j+1})$, $\tau(t)=\tau(t_j)$.
• If $\mathbf{i}\in\tau(t_j)$, then $\mathbf{i}+(\underbrace{0,\dots,0}_{w(t_j)},k,0,\dots)\in\tau(t_{j+1})$ if $0\le k\le\ell^{\mathbf{i}}(t_{j+1})-1$, where

\[
\ell^{\mathbf{i}}(t_j)=\#\{\text{offspring of the particle corresponding to }\mathbf{i}\text{ at time }t_j\}.\tag{4.5.3}
\]

Note that here we use the convention that, if a given branch of the tree does not branch at time $t_j$, it is considered as having one offspring. We can then read off the assignment of labels in a backwards consistent fashion as follows. If $\mathbf{i}\equiv(i_1,i_2,i_3,\dots)\in\tau(t)$, we define, for $r\le t$, the multi-index $\mathbf{i}(r)$ via
\[
i_\ell(r)\equiv\begin{cases}i_\ell,&\text{if }t_\ell\le r,\\0,&\text{if }t_\ell>r.\end{cases}\tag{4.5.4}
\]
Clearly, $\mathbf{i}(r)\in\tau(r)$. This construction allows us to define the boundary of the tree at infinite time as follows:

\[
\partial T\equiv\{\mathbf{i}\in I:\forall t<\infty,\ \mathbf{i}(t)\in\tau(t)\}.\tag{4.5.5}
\]

Note that $\partial T$ is an ultrametric space equipped with the ultrametric $m(\mathbf{i},\mathbf{j})\equiv e^{-d(\mathbf{i},\mathbf{j})}$, where $d(\mathbf{i},\mathbf{j})=\sup\{t\ge0:\mathbf{i}(t)=\mathbf{j}(t)\}$.

Note that from the knowledge of the set of multi-indices in $\partial T$ and the set of branching times, the entire tree can be reconstructed. Similarly, knowing $\tau(t)$ allows one to reconstruct the tree up to time $t$.

Remark 4.2. The labelling of the GW-tree is a slight variant of the familiar Ulam-Neveu-Harris labelling (see e.g. [41]). In our labelling, the added zeros keep track of the order in which branching occurred in continuous time. The construction above is also nicely explained for discrete-time GW processes in [73].
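A continuous-time Galton-Watson process is easy to simulate event by event; with $n$ particles alive, the time to the next split is the minimum of $n$ independent $\mathrm{Exp}(1)$ clocks, hence $\mathrm{Exp}(n)$. The sketch below takes binary splitting ($p_2=1$, the Yule process), so that $En(t)=e^t$; the multi-index labelling is not tracked.

```python
import random
from math import exp

# Minimal event-driven simulation of the continuous-time GW process with
# binary splitting: verifies E[n(t)] = e^t.
def gw_population(t_max, rng):
    t, n = 0.0, 1
    while True:
        t += rng.expovariate(n)     # minimum of n independent Exp(1) clocks
        if t > t_max:
            return n
        n += 1

rng = random.Random(5)
t_max, trials = 3.0, 20000
mean_n = sum(gw_population(t_max, rng) for _ in range(trials)) / trials
print(mean_n, exp(t_max))   # both close to e^3 = 20.09
```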

Fig. 4.1 Construction of Te: The green nodes were introduced into the tree ’by hand’.

4.6 The REM on the Galton-Watson tree

A Gaussian process can then be constructed where the covariance is a function of this distance, just as on the binary tree. The simplest Gaussian process on such trees is of course the REM, i.e. at time $t$ there are $n(t)$ iid Gaussian random variables $x_k(t)$, $k=1,\dots,n(t)$, of variance $t$. We will always be interested in the case when $\sum_kkp_k>1$, i.e. when the branching process is supercritical. In that case $\lim_{t\uparrow\infty}t^{-1}\ln n(t)=c>0$, and $\lim_{t\uparrow\infty}n(t)/En(t)=\mathrm{e}$, where $\mathrm{e}$ is an exponential random variable with parameter $1$. We will discuss this in more detail in the next chapter.
What does that mean for the REM? If we condition on $n(t)$, our standard estimates will apply and we get that
\[
P\Bigl(\max_{k\le n(t)}x_k(t)\le u_{n,t}(x)\,\Big|\,n(t)=n\Bigr)\sim e^{-e^{-x}},\tag{4.6.1}
\]

provided
\[
P\bigl(x_k(t)>u_{n,t}(x)\bigr)\sim e^{-x}/n.\tag{4.6.2}
\]
Using the computations in Chapter 1 and taking into account that the variance of $x_k(t)$ is equal to $t$, this implies that
\[
u_{n,t}(x)=t\sqrt{2t^{-1}\ln n}-\frac{\ln\ln n+\ln(4\pi)}{2\sqrt{2t^{-1}\ln n}}+\frac{x}{\sqrt{2t^{-1}\ln n}}.\tag{4.6.3}
\]
Passing to the limit, since the right-hand side of (4.6.1) does not depend on $n$, one easily sees that
\[
P\Bigl(\max_{k\le n(t)}x_k(t)\le u_{n(t),t}(x)\Bigr)\to e^{-e^{-x}}.\tag{4.6.4}
\]
Taking into account the convergence properties of $n(t)$, we see that, asymptotically as $t\uparrow\infty$,
\[
u_{n(t),t}(x)\sim t\sqrt{2c}-\frac{\ln(ct)+\ln(4\pi)}{2\sqrt{2c}}+\frac{\ln\mathrm{e}}{\sqrt{2c}}+\frac{x}{\sqrt{2c}}.\tag{4.6.5}
\]
From this we can see that instead of the random rescaling, we can also use a deterministic rescaling,

\[
u_t(x)\equiv t\sqrt{2c}-\frac{\ln(ct)}{2\sqrt{2c}}+x.\tag{4.6.6}
\]
In that case we get that

\[
P\Bigl(\max_{k\le n(t)}x_k(t)\le u_t(x)\Bigr)\to E\Bigl[\exp\Bigl(-\frac{\mathrm{e}}{\sqrt{4\pi}}\,e^{-\sqrt{2c}\,x}\Bigr)\Bigr],\tag{4.6.7}
\]

where the average is over the exponential random variable $\mathrm{e}$. Here we see for the first time the appearance of a random shift of a Gumbel distribution. Note that I moved some other constants around to arrive at a form for the right-hand side that is conventional in the context of branching Brownian motion.
In the same way we see that, for the REM on the Galton-Watson tree, the extremal process acquires a random intensity if we choose the same deterministic rescaling:
\[
\sum_{k=1}^{n(t)}\delta_{u_t^{-1}(x_k(t))}\to\mathrm{PPP}\Bigl(\frac{\mathrm{e}}{\sqrt{4\pi}}\sqrt{2c}\,e^{-\sqrt{2c}\,x}dx\Bigr),\tag{4.6.8}
\]
the intensity being random through $\mathrm{e}$. In other words, the limiting process is a Poisson process with random intensity, also known as a Cox process.

Chapter 5 Branching Brownian motion

In this chapter we start to look at branching Brownian motion as a continuous time version of the GREM. We collect some basic facts that will be needed later.

5.1 Definition and basics

The simplest way to describe branching Brownian motion is as follows. At time zero, a single particle, $x_1(0)$ (starting at the origin, say), begins to perform Brownian motion in $\mathbb{R}$. After an exponential time, $\tau$, of parameter one, the particle splits into two identical, independent copies of itself that start Brownian motion at $x_1(\tau)$. This process is repeated ad infinitum, producing a collection of $n(t)$ particles $x_k(t)$, $1\le k\le n(t)$. This construction can be easily extended to the case of more general offspring distributions, where particles split into $k\ge1$ particles with probabilities $p_k$. We will always assume that $\sum_{k\ge1}p_k=1$, $\sum_{k\ge1}kp_k=2$, and $\sum_{k\ge2}k(k-1)p_k<\infty$. Note that, while of course nothing happens if a particle splits into one particle, we keep this option to be able to maintain the mean number of offspring equal to two.
Another way of constructing BBM is to first build a Galton-Watson tree with the same branching mechanism as above. Branching Brownian motion (at time $t$) can then be interpreted as a Gaussian process $x_k(t)$, indexed by the leaves of a GW process, such that $Ex_k(t)=0$ and, given a realisation of the Galton-Watson process,

E[xk(t) xℓ(t)] = d(ik(t), iℓ(t)).     (5.1.1)

This observation makes BBM perfectly analogous to the CREM [20] with covariance function A(x) = x, which we have seen to be the critical covariance.

5.2 Rough heuristics

Already in the GREM we have seen that when A touches the straight line x, there appears a constant in the intensity measure that reflects the fact that the process is not allowed to get close to the maximal values possible at this point. The basic intuition of what happens in BBM is that the ancestral paths of BBM have to stay below the straight line with slope √2 all the time. This will force the maximum down slightly. The rough heuristics of the correct rescaling can be obtained from second moment calculations. More precisely, we should consider the events

{∃ k ≤ n(t) : {xk(t) − m(t) > x} ∧ {xk(s) ≤ √2 s, ∀s ∈ (r, t − r)}},     (5.2.1)

for r fixed. Here we write, for given t and s ≤ t, xk(s) for the ancestor at time s of the particle xk(t). One of the observations of Bramson [26] is that the event in (5.2.1) has the same probability as the event

{∃ k ≤ n(t) : xk(t) − m(t) > x},     (5.2.2)

when first t ↑ ∞ and then r ↑ ∞. This is not very easy, but also not too hard. Let us first recall that a Brownian bridge from 0 to 0 in time t can be represented as

z(s) = B(s) − (s/t) B(t).     (5.2.3)

More generally, we will denote by z^t_{x,y} a Brownian bridge from x to y in time t. Clearly,

z^t_{x,y}(s) = x + (s/t)(y − x) + z^t_{0,0}(s).     (5.2.4)

It is clear that the ancestral path from xk(t) to zero is just a Brownian bridge from xk(t) to zero. That is, we can write

xk(t − s) = ((t − s)/t) xk(t) + zk(s),     (5.2.5)

where zk(s), s ∈ [0, t], is a Brownian bridge from 0 to 0 in time t, independent of xk(t). The event in (5.2.1) can then be expressed as

{∃ k ≤ n(t) : {xk(t) − m(t) > x} ∧ {zk(s) ≤ (1 − s/t)(√2 t − xk(t)), ∀s ∈ (r, t − r)}},     (5.2.6)

where of course the Brownian bridges are not independent. The point, however, is that, as far as the maximum is concerned, they might as well be independent. We need to bound the probability of a Brownian bridge to stay below a straight line. The following lemma is taken from Bramson [22], but is of course classic.

Lemma 5.1. Let y1,y2 > 0. Let z be a Brownian bridge from 0 to 0 in time t. Then

P( z(s) ≤ (s/t) y1 + ((t − s)/t) y2, ∀ 0 ≤ s ≤ t ) = 1 − exp(−2 y1 y2 / t).     (5.2.7)
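The formula in Lemma 5.1 is easy to probe by simulation; the following minimal Monte Carlo sketch (all numerical values chosen arbitrarily by us) builds bridges via z(s) = B(s) − (s/t)B(t) and compares the fraction of paths staying below the line with the closed-form answer:

```python
import numpy as np

# Monte Carlo check of Lemma 5.1: a Brownian bridge z from 0 to 0 in time t
# stays below the line s -> (s/t)*y1 + ((t-s)/t)*y2 with probability
# 1 - exp(-2*y1*y2/t).  Parameters t, y1, y2 are arbitrary choices.
rng = np.random.default_rng(0)
t, y1, y2 = 1.0, 1.0, 0.8
n_steps, n_paths, batch = 4000, 30000, 1000
s = np.linspace(0.0, t, n_steps + 1)
line = (s / t) * y1 + ((t - s) / t) * y2

below = 0
for _ in range(n_paths // batch):
    # Brownian paths on the grid, turned into bridges via z(s) = B(s) - (s/t)B(t)
    dB = rng.normal(0.0, np.sqrt(t / n_steps), size=(batch, n_steps))
    B = np.concatenate([np.zeros((batch, 1)), np.cumsum(dB, axis=1)], axis=1)
    z = B - np.outer(B[:, -1], s / t)
    below += int(np.sum(np.all(z <= line, axis=1)))

estimate = below / n_paths
exact = 1.0 - np.exp(-2.0 * y1 * y2 / t)
```

Since the bridge is only monitored on a finite grid, the simulation slightly overestimates the staying probability; refining `n_steps` shrinks this bias.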

The proof and further discussion of properties of Brownian bridges will be given in the next chapter. The first moment estimate shows (the calculations are a bit intricate and constitute a good part of Bramson's first paper [26])

P( ∃ k ≤ n(t) : {xk(t) − m(t) > x} ∧ {zk(s) ≤ (1 − s/t)(√2 t − xk(t)), ∀s ∈ (r, t − r)} )
    ∼ e^t (C x²/t) e^{−(m(t)+x)²/(2t)} / (√(2π) m(t)/√t)
    ∼ C′ e^t x² e^{−√2 x} e^{−m(t)²/(2t)} / t^{3/2}.     (5.2.8)

The t-dependent term is equal to one if

m(t) = √2 t − (3/(2√2)) ln t.     (5.2.9)

To justify that this upper bound is not bad, one has to perform a second moment computation in which one retains the condition that the particle is not above the straight line √2 s. This is not so easy here and actually yields a lower bound where the x² is not present. Indeed, it also turns out that the upper bound is not optimal, and that the x² should be replaced by x. We will show that in the next chapter. A refinement of the truncated second moment method was used more recently by Roberts [67]. One can actually localise the ancestral paths of extremal particles much more precisely, as was shown in [3]. Namely, define

f_{t,γ}(s) ≡ s^γ for 0 ≤ s ≤ t/2, and (t − s)^γ for t/2 ≤ s ≤ t.     (5.2.10)

Then the following holds:

Theorem 5.2 ([3]). Let D = [dl, du] ⊂ R be a bounded interval. Then for any 0 < α < 1/2 < β, and any ε > 0, there is r0 ≡ r0(D, ε, α, β), such that for all r > r0 and t > 3r,

P( ∀ k ≤ n(t) s.t. xk(t) − m(t) ∈ D, it holds that
   ∀ s ∈ [r, t − r], (s/t) xk(t) − f_{t,β}(s) ≤ xk(s) ≤ (s/t) xk(t) − f_{t,α}(s) ) ≥ 1 − ε.     (5.2.11)

Basically this theorem says that the paths of maximal particles lie below the straight line to their endpoint along the arc f_{t,α}(s). This is just due to a property of Brownian bridges: in principle, a Brownian bridge wants to oscillate (a little bit more) than √t at its middle. But if it is not allowed to cross zero, then it also will not get close to zero, because each time it does so, it is unlikely not to do the crossing. This phenomenon is known as entropic repulsion in statistical mechanics. Note also that for the ancestral paths of particles that end up much lower than √2 t, say around at with a < √2, there is no such effect present. The ancestral paths

of these particles are just Brownian bridges from zero to their end position, which are localised around the straight line by a positive upper and a negative lower envelope ± f_{t,γ}, with any γ > 1/2.
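The role of the logarithmic correction in (5.2.9) can be seen numerically: with this centering, the t-dependent factor e^t e^{−m(t)²/(2t)} t^{−3/2} from (5.2.8) tends to one, while the smaller iid correction 1/(2√2) would send it to zero. A small sketch (the function names are ours):

```python
import math

def centering(t, c):
    # candidate centering sqrt(2)*t - c*log(t)
    return math.sqrt(2.0) * t - c * math.log(t)

def t_factor(t, c):
    # t-dependent part of the first-moment estimate (5.2.8):
    # e^t * exp(-m(t)^2/(2t)) / t^(3/2), computed in log scale for stability
    m = centering(t, c)
    return math.exp(t - m * m / (2.0 * t) - 1.5 * math.log(t))

c_bbm = 3.0 / (2.0 * math.sqrt(2.0))   # BBM correction of (5.2.9)
c_iid = 1.0 / (2.0 * math.sqrt(2.0))   # iid correction, for comparison
vals = [t_factor(10.0**k, c_bbm) for k in (3, 4, 5, 6)]
```

With c_bbm the values increase toward 1 as t grows; with c_iid the factor behaves like t^{−1} and vanishes.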

5.3 Recursion relations

The presence of the tree structure in BBM of course suggests deriving recursion relations. To get into the mood, let us show:

Lemma 5.3.

En(t) = e^t.     (5.3.1)

Proof. For notational simplicity we consider only the case of pure binary branching. The idea is to use the recursive structure of the process. Let τ be the first branching time. Clearly, if t < τ, then n(t) = 1. Otherwise, it is n′(t − τ) + n″(t − τ), where n′ and n″ are the numbers of particles in the two independent offspring processes. This reasoning leads to

En(t) = P(τ > t) + ∫₀ᵗ P(τ ∈ ds) 2 En(t − s)     (5.3.2)
      = e^{−t} + 2 ∫₀ᵗ e^{−s} En(t − s) ds.

Differentiating this equation yields

(d/dt) En(t) = −e^{−t} + 2 e^{−t} En(0) + 2 ∫₀ᵗ e^{−s} (d/dt) En(t − s) ds     (5.3.3)
            = −e^{−t} + 2 e^{−t} En(0) − 2 ∫₀ᵗ e^{−s} (d/ds) En(t − s) ds
            = −e^{−t} + 2 En(t) − 2 ∫₀ᵗ e^{−s} En(t − s) ds = En(t),

where we used integration by parts in the second equality, the fact that n(0) = 1, and (5.3.2) in the last step. The assertion follows by solving this differential equation with En(0) = 1. ⊓⊔

We can also show the following classical result (see the standard textbook by Athreya and Ney [8] for this and many further results on branching processes):

Lemma 5.4. If n(t) is the number of particles of BBM at time t, then

M(t) ≡ e^{−t} n(t)     (5.3.4)

is a martingale. Moreover, M(t) converges, a.s. and in L¹, to an exponential random variable of parameter 1.

Proof. Again we write things only for the binary case. The verification that M(t) is a martingale is elementary. Since M(t) is positive with mean one, it is bounded in L1 and hence, by Doob’s martingale convergence theorem, M(t) converges a.s. to a random variable M. To show that the martingale is uniformly integrable and hence converges in L1, we show that

φ(t) ≡ E[M(t)²] = 2 − e^{−t}.     (5.3.5)

This can be done by noting that E[M(t)²] satisfies the recursion

φ(t) = e^{−3t} + 2 ∫₀ᵗ e^{−3s} φ(t − s) ds + (2/3)(1 − e^{−3t}).     (5.3.6)

Differentiating yields the differential equation

φ′(t) = 2 − φ(t),     (5.3.7)

of which (5.3.5) is the unique solution with φ(0) = 1. Convergence to the exponential distribution can be proven similarly. ⊓⊔
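Both computations can be illustrated by simulating the pure binary case. The sketch below (names and parameters ours) uses the fact that with k particles present, the time to the next split is exponential of parameter k:

```python
import math
import random

def yule_population(t, rng):
    # number of particles of binary BBM at time t: each of the k current
    # particles branches at rate 1, so the next split time is Exp(k)
    k, time = 1, 0.0
    while True:
        time += rng.expovariate(k)
        if time > t:
            return k
        k += 1

rng = random.Random(0)
t, runs = 1.0, 50000
samples = [math.exp(-t) * yule_population(t, rng) for _ in range(runs)]
mean_M = sum(samples) / runs                     # should be close to E M(t) = 1
mean_M2 = sum(m * m for m in samples) / runs     # should be close to 2 - e^{-t}
```

The empirical mean of M(t) = e^{−t} n(t) stays near 1 (Lemma 5.3 and the martingale property), and the empirical second moment approaches 2 − e^{−t} from (5.3.5).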

5.4 The F-KPP equation

The fundamental link between BBM and the F-KPP equation [54, 37] is generally attributed to McKean [61], but appears already in Skorohod [68] and Ikeda, Nagasawa, and Watanabe [44, 45, 46].

Lemma 5.5 ([61]). Let f : R → [0,1] and {xk(t) : k ≤ n(t)} a branching Brownian motion starting at 0. Let

"n(t) # v(t,x) ≡ E ∏ f (x − xk(t)) . (5.4.1) k=1

Then, u(t,x) ≡ 1 − v(t,x) is the solution of the F-KPP equation (5.4.2)

∂_t u = (1/2) ∂_x² u + F(u),     (5.4.2)

with initial condition u(0,x) = 1 − f(x), where F(u) = (1 − u) − ∑_{k=1}^∞ pk (1 − u)^k.

Remark 5.6. The reader may wonder why we introduce the function v. It is easy to check that v(t,x) itself solves equation (5.4.2) with F(u) replaced by −F(1 − v). The form of F given in the lemma is the one customarily known as the F-KPP equation and will be used throughout this text.

Proof. The derivation of the F-KPP equation is quite similar to the arguments used in the previous section. Again we restrict to the case of binary branching. Let f :

R → [0,1]. Define v(t,x) by (5.4.1). Then, distinguishing the cases when the first branching occurs before or after t, we get

v(t,x) = e^{−t} ∫ (e^{−z²/(2t)}/√(2πt)) f(x − z) dz     (5.4.3)
         + ∫₀ᵗ e^{−s} ∫ (e^{−z²/(2s)}/√(2πs)) v(t − s, x − z)² dz ds.

Differentiating with respect to t, using integration by parts as before together with the fact that the heat kernel satisfies the heat equation, we find that

∂_t v = (1/2) ∂_x² v + v² − v.     (5.4.4)

Obviously, v(0,x) = f(x). Following the remark above, u = 1 − v then solves (5.4.2). There is a smart way to see that (5.4.4) and (5.4.3) are the same thing. Namely, if H(t,x) ≡ e^{−t} e^{−x²/(2t)}/√(2πt), then H is the Green kernel for the linear operator

∂_t − (1/2) ∂_x² + 1,     (5.4.5)

i.e. the solution of the inhomogeneous linear equation

∂_t v − (1/2) ∂_x² v + v = r     (5.4.6)

with initial condition v₀ is given by

v(t,x) = ∫ H(t, x − y) v₀(y) dy + ∫₀ᵗ ∫ H(s, y) r(t − s, x − y) ds dy.     (5.4.7)

Inserting r(t,x) = v²(t,x), we obtain what is known as the mild formulation of Eq. (5.4.4), and this is precisely (5.4.3). ⊓⊔
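The travelling front hidden in this equation can be seen in a direct numerical solution of (5.4.2) with the binary-branching nonlinearity F(u) = u − u². The following explicit finite-difference sketch (grid sizes chosen by us) starts from Heaviside-type data and measures the speed of the level-1/2 front, which should be close to √2 ≈ 1.41, slightly reduced by the logarithmic correction discussed above:

```python
import numpy as np

# explicit Euler scheme for  du/dt = 0.5*u_xx + u - u^2  (binary branching)
dx, dt = 0.2, 0.01                  # dt/dx^2 = 0.25 keeps the scheme stable
x = np.arange(-20.0, 80.0, dx)
u = (x < 0).astype(float)           # Heaviside initial condition u(0,x)=1_{x<0}

def front(u, x):
    # leftmost grid point where u drops below 1/2
    return x[np.argmax(u < 0.5)]

positions = {}
t, T = 0.0, 40.0
while t < T - 1e-12:
    lap = np.zeros_like(u)
    lap[1:-1] = (u[2:] - 2 * u[1:-1] + u[:-2]) / dx**2
    u = u + dt * (0.5 * lap + u - u * u)
    u[0], u[-1] = 1.0, 0.0          # pin the boundary values
    t += dt
    for mark in (20.0, 40.0):
        if abs(t - mark) < dt / 2:
            positions[mark] = front(u, x)

speed = (positions[40.0] - positions[20.0]) / 20.0
```

Averaged over [20, 40] the measured speed sits a few percent below √2, consistent with the correction −3/(2√2 t) to the instantaneous front speed.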

The first example is obtained by choosing f(x) = 1_{x≥0}. Then

"n(t) #   v(t,x) = 1 = x (t) ≤ x . E ∏ (x−xk(t)≥0) P max k (5.4.8) k=1 k≤n(t)

A second example we will exploit is obtained by choosing f (x) = exp(−φ(x)) for φ a non-negative continuous function with compact support. Then

"n(t) #  Z  −φ(x−x (t)) v(t,x) = E ∏ e k = E exp(− φ(x − z)Et (dz)) , (5.4.9) k=1 in construction 5.5 The travelling wave 63

where E_t ≡ ∑_{k=1}^{n(t)} δ_{xk(t)} is the point process associated to BBM at time t. We see that in this case v is the Laplace functional of the point process E_t.

Definition 5.7. We call the equation (5.4.2) with a general non-linear term the F-KPP equation. We say that F satisfies the standard conditions if F ∈ C¹([0,1]),

F(0) = F(1) = 0,  F(u) > 0 ∀u ∈ (0,1),  F′(0) = 1,  F′(u) ≤ 1 ∀u ∈ [0,1],     (5.4.10)

and

1 − F′(u) = O(u^ρ).     (5.4.11)
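For pure binary branching (p₂ = 1) the nonlinearity of Lemma 5.5 is F(u) = (1 − u) − (1 − u)² = u − u², and the standard conditions (with ρ = 1) can be checked directly; a small numerical sketch:

```python
import numpy as np

# F(u) = (1-u) - sum_k p_k (1-u)^k from Lemma 5.5; binary branching p_2 = 1
p = {2: 1.0}

def F(u):
    return (1.0 - u) - sum(pk * (1.0 - u) ** k for k, pk in p.items())

u = np.linspace(0.0, 1.0, 1001)
vals = F(u)                      # equals u - u^2 on the grid
Fprime = np.gradient(vals, u)    # numerical derivative of F
```

One finds F(0) = F(1) = 0, F > 0 on (0,1), F′(0) = 1 and F′(u) = 1 − 2u ≤ 1, so that 1 − F′(u) = 2u = O(u) as required.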

5.5 The travelling wave

Bramson [22] studied convergence of solutions for a large class of F-KPP equations, with nonlinearities F that include those arising from BBM. In particular, he established under what conditions on the initial data solutions converge to travelling waves. Part of this is based on earlier results by Kolmogorov et al. [54] and Uchiyama [72]. For a purely probabilistic analysis of the travelling wave solutions, see also Harris [43]. We will always be interested in initial conditions that lie between zero and one. It is easy to see that then the solutions remain between zero and one forever. A travelling wave moving with speed λ would be a solution, u, of the F-KPP equation such that

(d/dt) u(t, x + λt) = 0.     (5.5.1)

A simple computation shows that this implies that u(t, x + λt) = w_λ(x), where

(1/2) ∂_x² w_λ(x) + λ ∂_x w_λ(x) + F(w_λ(x)) = 0.     (5.5.2)

If we want a solution that decays to zero at +∞, then for large positive x, w must be close to the solution of the linear equation

(1/2) ∂_x² w_λ(x) + λ ∂_x w_λ(x) + w_λ(x) = 0.     (5.5.3)

But this equation can be solved explicitly. If λ ≠ √2, then there are two linearly independent solutions e^{−b_± x}, where b_± = λ ± √(λ² − 2). Clearly, these values are real only if λ ≥ √2, so only in this case can we have non-oscillatory solutions. If λ = √2, then there are two solutions e^{−√2 x} and x e^{−√2 x}. Kolmogorov et al. [54] and Uchiyama [72] showed by a phase-space analysis that in both cases the heavier-tailed solution describes the correct asymptotics of the travelling wave solution, and that it is unique, up to translations.

Lemma 5.8 ([54, 72]). Let F satisfy the standard conditions. Then, for λ ≥ √2, Equation (5.5.2) has a solution satisfying 0 < w_λ(x) < 1, w_λ(x) → 0 as x → +∞, and w_λ(x) → 1 as x → −∞, which is unique up to translation, i.e. if w, w′ are two solutions, then there exists a ∈ R s.t. w_λ(x) = w′_λ(x + a).

Proof. Note first that if w_λ ∈ [0,1], then, unless w_λ is constant, it must hold that for all x ∈ R, w_λ(x) ∈ (0,1). For, if for some x ∈ R, w_λ(x) = 0, then it must also hold that ∂_x w_λ(x) = 0. But then, since F(w_λ(x)) = 0, the initial value problem with these initial data at x has the unique solution w_λ ≡ 0. The same holds if w_λ(x) = 1. Next we look at the equation in phase space. It reads

q′ = p,     (5.5.4)
p′ = −2F(q) − 2λp.

Clearly, this has the two fixed points (0,0) and (1,0). The Jacobian matrices at these fixed points are

( 0         1   )          ( 0         1   )
( −2F′(0)   −2λ ),   and   ( −2F′(1)   −2λ ).     (5.5.5)

Since F′(0) = 1, the eigenvalues of the Jacobian at the fixed point (0,0) are b_± = −λ ± √(λ² − 2), whereas those at the fixed point (1,0) are a_± = −λ ± √(λ² − 2F′(1)). Since of necessity F′(1) ≤ 0, the eigenvalues a_± are both real, and unless F′(1) = 0, one is positive and the other negative. Hence (1,0) is a saddle point. The eigenvalues b_± have negative real part, but are real only if λ ≥ √2. In any case, the fixed point is stable, but in the case when the eigenvalues are non-real, the solutions of the linearised equations have oscillatory behaviour and cannot stay positive. In the other cases, there exists an integral curve from (1,0) that approaches (0,0) along the direction of the smaller of the two eigenvalues, i.e. a map γ : [0,1] → R², such that

γ0(τ) = V(γ(τ)), (5.5.6)

where V(x) is in the direction of W(q,p) ≡ (p, −2F(q) − 2λp) but has |V(x)| ≡ 1. In the degenerate case λ = √2, one can show that the solution analogously has the behaviour of the heavier-tailed solution of the linear equation. From the existence of such an integral curve it follows that for any function τ : R → [0,1] such that

τ′(t) = |W(γ(τ(t)))|,     (5.5.7)

with the property that lim_{t→−∞} τ(t) = 0 and lim_{t→+∞} τ(t) = 1, we have that

w(t) ≡ γ(τ(t)), (5.5.8)

is a solution of w′(t) = W(w(t)) and satisfies the right conditions at ±∞. Clearly the same is true for w̃(t) ≡ w(t + a), for any a ∈ R, so solutions are unique only up to a translation. ⊓⊔

We will be mainly interested in the case λ = √2. The following theorem slightly specialises Bramson's Theorems A and B from [22] for this case.

Theorem 5.9 ([22]). Let u be a solution of the F-KPP equation (5.4.2) satisfying the standard conditions with 0 ≤ u(0,x) ≤ 1. Then there is a function m(t) such that

u(t,x + m(t)) → ω(x), (5.5.9)

uniformly in x as t ↑ ∞, where ω is one of the solutions of (5.5.2) from Lemma 5.8, if and only if

(i) for some h > 0, limsup_{t→∞} (1/t) ln ∫_t^{t(1+h)} u(0,y) dy ≤ −√2, and
(ii) for some ν > 0, M > 0, N > 0, ∫_x^{x+N} u(0,y) dy > ν for all x ≤ −M.

Moreover, if lim_{x→∞} e^{bx} u(0,x) = 0 for some b > √2, then one may choose

m(t) = √2 t − (3/(2√2)) ln t.     (5.5.10)

Theorem 5.9 is one of the core results of Bramson's work on branching Brownian motion, and most of the material in his monograph [22] is essentially needed for its proof. We will recapitulate his analysis in the next chapter. We see that Condition (i) is in particular satisfied for Heaviside initial conditions. Hence, this result implies that

P( max_{k≤n(t)} xk(t) − m(t) ≤ x ) → ω(x) ≡ 1 − w_{√2}(x),     (5.5.11)

where w_{√2} solves Eq. (5.5.2). It should be noted that it follows already from the results of Kolmogorov et al. [54] that (5.5.11) must hold for some function m(t) with m(t)/t → √2 (see next chapter). But only Bramson's precise evaluation of m(t) shows that the shift is changed from the iid case, where it would have to be

m₀(t) = √2 t − (1/(2√2)) ln t,

which is larger than m(t). One can also check that the law of the maximum for the REM on the GW tree, (4.6.7), does not solve (5.5.2). In the next section we will see that this is, however, not so far off.
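The phase-plane argument in the proof of Lemma 5.8 can also be made concrete numerically for the binary case F(q) = q − q² and λ = √2: integrating (5.5.4) from just off the saddle (1,0), along its unstable direction, produces an orbit that stays in the strip 0 < q < 1 and converges to (0,0). A sketch (step sizes and offsets ours):

```python
import math

# phase-plane system (5.5.4):  q' = p,  p' = -2 F(q) - 2*lam*p,
# with F(q) = q - q^2 (binary branching) and lam = sqrt(2)
lam = math.sqrt(2.0)

def rhs(q, p):
    return p, -2.0 * (q - q * q) - 2.0 * lam * p

# unstable eigenvalue at the saddle (1,0): mu = -lam + sqrt(lam^2 - 2*F'(1));
# here F'(1) = -1, so mu = 2 - sqrt(2) > 0; start slightly off the fixed point
mu = -lam + math.sqrt(lam * lam + 2.0)
eps = 1e-4
q, p = 1.0 - eps, -eps * mu

# classical fourth-order Runge-Kutta along the travelling-wave variable
h, steps = 0.01, 4000
qmin, qmax = q, q
for _ in range(steps):
    k1 = rhs(q, p)
    k2 = rhs(q + h / 2 * k1[0], p + h / 2 * k1[1])
    k3 = rhs(q + h / 2 * k2[0], p + h / 2 * k2[1])
    k4 = rhs(q + h * k3[0], p + h * k3[1])
    q += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
    p += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    qmin, qmax = min(qmin, q), max(qmax, q)
```

Along the orbit, q decreases monotonically from 1 to 0 (p never crosses zero from below, since p = 0 forces p′ = −2F(q) < 0), which is the heteroclinic connection underlying the travelling wave.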

Remark 5.10. Bramson also shows that convergence to travelling waves takes place if the forward tail of the initial conditions decays more slowly, i.e. if

limsup_{t→∞} (1/t) ln ∫_t^{t(1+h)} u(0,y) dy = −b,     (5.5.12)

with b < √2. These cases will not be relevant for us.

Remark 5.11. We will use Bramson's theorem to prove convergence of Laplace functionals of the extremal process of BBM. Note that the initial conditions arising from Laplace functionals with φ having bounded support cannot satisfy condition (ii). This precludes uniform convergence, but not pointwise convergence.

5.6 The derivative martingale

While Bramson analyses the asymptotic behaviour of the functions ω(x), he does not give an explicit form of these solutions, and in particular he does not provide a link to the Gumbel distribution from classical extreme value theory. This was achieved a few years later by Lalley and Sellke (see also the paper by Harris [43]). They wrote a short and very insightful paper [56] that provides, after Bramson's work, the most significant insights into BBM that we have. Its main achievement is to give a probabilistic interpretation of the limiting distribution of the maximum of BBM. But this is not all. For 1 − f satisfying the hypothesis of Theorem 5.9, let v be given by (5.4.1). Now define

v̂(t,x) ≡ E[ ∏_{i=1}^{n(t)} f(√2 t + x − xi(t)) ] = v(t, √2 t + x).     (5.6.1)

One checks that v̂ solves the equation

∂_t v̂ = (1/2) ∂_x² v̂ + √2 ∂_x v̂ − F(1 − v̂).     (5.6.2)

Now let ω solve

(1/2) ∂_x² ω + √2 ∂_x ω − F(1 − ω) = 0.     (5.6.3)

Then v̂(t,x) ≡ ω(x) is a stationary solution to (5.6.2) with initial condition v̂(0,x) = ω(x). Therefore we have the stochastic representation

"n(t) √ # ω(x) = E ∏ω( 2t + x − xi(t)) . (5.6.4) i=1

Now probability enters the game.

Lemma 5.12. The function W(t,x) ≡ ∏_{i=1}^{n(t)} ω(√2 t + x − xi(t)) is a martingale with respect to the natural filtration F_t of BBM. Moreover, W(t,x) converges almost surely and in L¹ to a non-trivial limit, W*(x), and EW*(x) = ω(x).

Proof. The proof is straightforward. Clearly W(t,x) is integrable. By the Markovian nature of BBM,

W(t + s, x) = ∏_{i=1}^{n(t)} ∏_{j=1}^{n_i(s)} ω( √2 (t + s) + x − xi(t) − x_{i,j}(s) ),     (5.6.5)

where the x_{i,j} are independent BBMs. Now

E[W(t + s, x) | F_t] = ∏_{i=1}^{n(t)} E[ ∏_{j=1}^{n_i(s)} ω( √2 (t + s) + x − xi(t) − x_{i,j}(s) ) | F_t ]
                     = ∏_{i=1}^{n(t)} E[ ∏_{j=1}^{n_i(s)} ω( √2 s + y − x_{i,j}(s) ) ] |_{y = √2 t + x − xi(t)}
                     = ∏_{i=1}^{n(t)} ω( √2 t + x − xi(t) ) = W(t,x),     (5.6.6)

where the last equality uses the representation (5.6.4).

Now W(t,x) is bounded and therefore converges almost surely and in L¹ to a limit, W*(x), whose expectation is ω(x). ⊓⊔

There are more martingales. First, by a trivial computation,

Y(t) ≡ ∑_{i=1}^{n(t)} e^{√2 xi(t) − 2t}     (5.6.7)

is a martingale. Since it is positive and EY(t) = 1, it converges almost surely to a finite non-negative limit. But this means that the exponents in the sum must all go to minus infinity, as otherwise no convergence is possible. This means that

min_{i≤n(t)} ( √2 t − xi(t) ) ↑ +∞, a.s.     (5.6.8)

(this argument is convincing but a bit fishy, see the remark below). One of Bramson’s results is that (we will see the proof of this in the next chapter,

see Corollary 7.2)

1 − ω(x) ∼ C x e^{−√2 x}, as x ↑ ∞.     (5.6.9)

Hence

W(t,x) = exp( ∑_{i=1}^{n(t)} ln ω( √2 t + x − xi(t) ) )     (5.6.10)
       ∼ exp( −C ∑_{i=1}^{n(t)} ( √2 t + x − xi(t) ) e^{−√2 (√2 t + x − xi(t))} )
       = exp( −C x e^{−√2 x} Y(t) − C e^{−√2 x} Z(t) ),

where

Z(t) ≡ ∑_{i=1}^{n(t)} ( √2 t − xi(t) ) e^{−√2 (√2 t − xi(t))}.     (5.6.11)

Z(t) is also a martingale, called the derivative martingale, with EZ(t) = 0. The fact that Z(t) is a martingale can be verified by explicit computation, but it will actually not be very important for us. In any case, Z(t) is not even bounded in L¹ (in fact, an easy calculation shows that E|Z(t)| = √(2t/π)), and therefore it is a priori not clear that

Z(t) converges. By the observation (5.6.8), Z(t) is much bigger than Y(t), which implies that unless Y(t) → 0, a.s., it must be true that liminf Z(t) = +∞. This is impossible, since it would imply that W(t,x) → 0, which we know to be false. Hence Y(t) → 0, and thus Z(t) → Z, a.s., where Z is finite and positive. It follows that

lim_{t↑∞} W(t,x) = exp( −C Z e^{−√2 x} ).     (5.6.12)

Remark 5.13. While the argument of Lalley and Sellke for (5.6.8) may not convince everybody¹, the following gives an alternative proof of the fact that Y(t) → 0. By a simple Gaussian computation,

P( ∃ k ≤ n(t) : √2 t − xk(t) < K ) ≤ e^{√2 K} / √(4πt).     (5.6.13)

But this implies that

P( Z(t)/Y(t) < K ) ≤ c t^{−1/2}.     (5.6.14)

Now assume that Y(t) → a > 0, a.s. Then, for large enough t,

P( Z(t) < 2Ka ) ≤ c t^{−1/2}.     (5.6.15)

But this implies that

P( liminf_n Z(2ⁿ) < 2aK ) = 0,     (5.6.16)

and hence

P( limsup_{t↑∞} Z(t) < 2aK ) = 0.     (5.6.17)

But this implies that

liminf_{t↑∞} W(t,x) = 0, a.s.,     (5.6.18)

and since we know that W(t,x) converges almost surely, it must hold that the limit is zero. But this is in contradiction with the fact that the limit is a non-negative random variable with positive expectation and that convergence holds in L¹. Hence it must be the case that Y(t) → 0, almost surely.
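The normalisations in (5.6.7) and (5.6.11) can be checked by simulation: at any fixed t one should see E Y(t) = 1 and E Z(t) = 0. A minimal sketch (ours), using a recursive construction of binary BBM:

```python
import math
import random

def bbm(t, x, rng):
    # positions at time t of binary BBM started from one particle at x
    tau = rng.expovariate(1.0)
    if tau >= t:
        return [x + rng.gauss(0.0, math.sqrt(t))]
    y = x + rng.gauss(0.0, math.sqrt(tau))   # position at the first split
    return bbm(t - tau, y, rng) + bbm(t - tau, y, rng)

rng = random.Random(1)
t, runs = 1.0, 20000
s2 = math.sqrt(2.0)
Ys, Zs = [], []
for _ in range(runs):
    xs = bbm(t, 0.0, rng)
    Ys.append(sum(math.exp(s2 * x - 2 * t) for x in xs))
    Zs.append(sum((s2 * t - x) * math.exp(s2 * x - 2 * t) for x in xs))
mean_Y = sum(Ys) / runs   # should be close to E Y(t) = 1
mean_Z = sum(Zs) / runs   # should be close to E Z(t) = 0
```

A small t is used deliberately: at the critical exponent √2 the variances grow with t, so Monte Carlo averages are only reliable on short horizons.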

Remark 5.14. One may interpret Y(t) as a normalised partition function. Namely, if we set

Z_β(t) ≡ ∑_{i=1}^{n(t)} e^{β xi(t)},     (5.6.19)

then

Y(t) = Z_{√2}(t) / E Z_{√2}(t).     (5.6.20)

¹ Thanks to Marek Biskup for voicing some doubt about this claim.

Then √2 has the natural interpretation of the critical inverse temperature for this model, which can be interpreted as the value where the "law of large numbers" starts to fail in a strong sense, namely that Z_β(t)/EZ_β(t) no longer converges to a non-trivial limit. In the REM, the critical value is √(2 ln 2), and it was shown in [21] that in this case, at the critical value, this ratio converges to 1/2. For BBM, one can show that

Z_β(t) / E Z_β(t)     (5.6.21)

is a uniformly integrable martingale for all values β < √2 that converges a.s. and in L¹ to a positive random variable of mean 1. The reason for the name derivative martingale is that Z(t) looks like the derivative of the martingale Y(t) with respect to β at the value β = √2. This is indeed a strange animal: its limit is almost surely positive, but its expectation is zero for all t.

Remark 5.15. Convergence of the derivative martingale is shown via purely proba- bilistic techniques by Kyprianou [55].

Finally, we return to the law of the maximum. We have that, for any s ≥ 0,  

lim_{t↑∞} P( max_{k≤n(t)} xk(t) − m(t) ≤ x | F_s )
  = lim_{t↑∞} P( max_{k≤n(t+s)} xk(t + s) − m(t + s) ≤ x | F_s )
  = lim_{t↑∞} ∏_{i=1}^{n(s)} P( max_{k≤n_i(t)} x_{i,k}(t) ≤ x + m(t + s) − xi(s) | F_s )
  = lim_{t↑∞} ∏_{i=1}^{n(s)} u(t, x + m(t + s) − xi(s)),     (5.6.22)

where u(t,x) = P( max_{k≤n(t)} xk(t) ≤ x ), i.e. 1 − u is the solution of the F-KPP equation with Heaviside initial condition. Next we use that

m(t + s) − m(t) − √2 s → 0, as t ↑ ∞,     (5.6.23)

for fixed s. This shows that

lim_{t↑∞} P( max_{k≤n(t)} xk(t) − m(t) ≤ x | F_s )
  = lim_{t↑∞} ∏_{i=1}^{n(s)} u(t, x + m(t) + √2 s − xi(s))
  = ∏_{i=1}^{n(s)} ω(x + √2 s − xi(s)) = W(s,x).

As now s ↑ ∞,

W(s,x) → e^{−C Z e^{−√2 x}}, a.s.,     (5.6.24)

which proves the main theorem of Lalley and Sellke [56]:

Theorem 5.16 ([56]). For BBM,

lim_{t↑∞} P( max_{k≤n(t)} xk(t) − m(t) ≤ x ) = E[ e^{−C Z e^{−√2 x}} ].     (5.6.25)

Moreover,

lim_{s↑∞} lim_{t↑∞} P( max_{k≤n(t)} xk(t) − m(t) ≤ x | F_s ) = e^{−C Z e^{−√2 x}}, a.s.     (5.6.26)

Remark 5.17. Of course, the argument above shows that any solution of (5.5.2)

satisfying the conditions of Lemma 5.8 has a representation of the form 1 − E[ e^{−C Z e^{−√2 x}} ], with only different constants C.

Remark 5.18. Lalley and Sellke conjectured in [56] that

(1/T) ∫₀ᵀ 1_{ max_{k≤n(t)} xk(t) − m(t) ≤ x } dt → e^{−C Z e^{−√2 x}}, a.s., as T ↑ ∞.     (5.6.27)

This was proven to be true in [5].

Chapter 6
Bramson's analysis of the F-KPP equation

In this chapter we recapitulate the essential features of Bramson's proof of his main theorems on travelling wave solutions. The material is taken from [22], with a few details occasionally added in proofs, and some omissions.

6.1 Feynman-Kac representation

Bramson's analysis of the asymptotics of solutions of the F-KPP equation in [22] relies on a Feynman-Kac representation. The derivation of this can be seen as a combination of the mild formulation of the pde together with the standard Feynman-Kac representation of the heat kernel. Namely, if u(t,x) is the solution of the linear equation

∂_t u = (1/2) ∂_x² u + k(t,x) u,     (6.1.1)

with initial conditions u(0,x), then the standard Feynman-Kac formula yields the representation ([47], see e.g. [70, 51])

u(t,x) = E_x[ exp( ∫₀ᵗ k(t − s, B_s) ds ) u(0, B_t) ],     (6.1.2)

where the expectation is with respect to ordinary Brownian motion started at x. To get back to the full equation, we use this with

k(t,x) = F(u(t,x))/u(t,x), (6.1.3)

where u itself is the solution of the F-KPP equation. It may seem that this is just a re-writing of the original equation as an integral equation. Still, the ensuing representation is very useful, as it allows to process a priori information on the solution into finer estimates. Note that under the standard conditions on F stated in (5.4.10) and (5.4.11),

0 ≤ k(t,x) ≤ 1     (6.1.4)

and

1 − k(t,x) ∼ C u(t,x)^ρ, when u(t,x) ↓ 0.     (6.1.5)

Bramson's key idea was to use the Feynman-Kac representation not all the way down to the initial condition, but to go back only to some arbitrary time 0 ≤ r < t.¹

Theorem 6.1. Let u be the solution of the F-KPP equation with initial conditions u(0,x). Let k(t,x) ≡ F(u(t,x))/u(t,x). Then, for any 0 ≤ r ≤ t, u satisfies the equation

u(t,x) = E_x[ exp( ∫₀^{t−r} k(t − s, B_s) ds ) u(r, B_{t−r}) ],     (6.1.6)

where (B_s, s ∈ R₊) is Brownian motion starting in x. This can conveniently be rewritten in terms of Brownian bridges. Recall the definition of a Brownian bridge (z^t_{x,y}(s), 0 ≤ s ≤ t) from x to y in time t. Then (6.1.6) can be written as

u(t,x) = ∫_{−∞}^∞ dy u(r,y) (e^{−(x−y)²/(2(t−r))} / √(2π(t − r))) E[ exp( ∫₀^{t−r} k(t − s, z^{t−r}_{x,y}(s)) ds ) ].     (6.1.7)

Note that under the standard conditions k(t,x) lies between zero and one, and tends to one where u is close to zero. Bramson's basic idea to exploit this formula is to prove a priori estimates on the function k(t,x). Note that k(t,x) > 0, and by Condition (5.4.11), k(t,x) ∼ 1 when u(t,x) ∼ 0 (think about the simplest case, when k(t,x) = 1 − u(t,x)). We will see that k(s,x) ∼ 1 if x is above a certain curve, M. On the other hand, he shows that the probability that the Brownian bridge descends below the curve M is negligibly small. The following theorem implements this strategy and is essentially contained in Bramson's monograph [22, Proposition 8.3].

Theorem 6.2. Let u be a solution to the F-KPP equation (5.4.2) with initial condition satisfying

∫₀^∞ y e^{√2 y} u(0,y) dy < ∞.     (6.1.8)

Define

ψ(r,t,z + √2 t) ≡ (e^{−√2 z} / √(2π(t − r))) ∫₀^∞ u(r, y + √2 r) e^{√2 y} e^{−(y−z)²/(2(t−r))} ( 1 − e^{−2y(z + (3/(2√2)) ln t)/(t−r)} ) dy.     (6.1.9)

Then for r large enough, t ≥ 8r, and z ≥ 8r − (3/(2√2)) ln t,

γ⁻¹(r) ψ(r,t,z + √2 t) ≤ u(t,z + √2 t) ≤ γ(r) ψ(r,t,z + √2 t),     (6.1.10)

where γ(r) ↓ 1, as r → ∞.

¹ One should appreciate the beauty of this construction: start with a probabilistic model (BBM), derive a pde whose solutions represent quantities of interest, and then use a different probabilistic representation of the solution (in terms of Brownian motion) to analyse these solutions...

We see that ψ basically controls u, but of course ψ still involves u. The fact that u is bounded above and below in terms of u itself may seem strange, but we shall see that this is very useful.

Proof. The better part of this chapter will be devoted to proving the following two facts: 1. For r large enough, t ≥ 8r and x ≥ m(t) + 8r

u(t,x) ≥ ψ₁(r,t,x)     (6.1.11)
  ≡ C₁(r) e^{t−r} ∫_{−∞}^∞ u(r,y) (e^{−(x−y)²/(2(t−r))}/√(2π(t − r))) P( z^{t−r}_{x,y}(s) > M^x_{r,t}(t − s), s ∈ [0, t − r] ) dy

and

u(t,x) ≤ ψ₂(r,t,x)     (6.1.12)
  ≡ C₂(r) e^{t−r} ∫_{−∞}^∞ u(r,y) (e^{−(x−y)²/(2(t−r))}/√(2π(t − r))) P( z^{t−r}_{x,y}(s) > M′_{r,t}(t − s), s ∈ [0, t − r] ) dy,

where the functions M^x_{r,t}(t − s), M′_{r,t}(t − s) satisfy

M′_{r,t}(t − s) ≤ n_{r,t}(t − s) ≤ M^x_{r,t}(t − s).     (6.1.13)

Here

n_{r,t}(s) ≡ √2 r + ((s − r)/(t − r)) (m(t) − √2 r).     (6.1.14)

Moreover, C1(r) ↑ 1, C2(r) ↓ 1 as r ↑ ∞. 2. The bounds ψ1(r,t,x) and ψ2(r,t,x) satisfy

1 ≤ ψ₂(r,t,x)/ψ₁(r,t,x) ≤ γ(r),     (6.1.15)

where γ(r) ↓ 1 as r ↑ ∞. Assuming these facts, we can conclude the proof of the theorem rather quickly. Set

ψ(r,t,x) = e^{t−r} ∫_{−∞}^∞ u(r,y) (e^{−(x−y)²/(2(t−r))}/√(2π(t − r))) P( z^{t−r}_{x,y}(s) > n_{r,t}(t − s), s ∈ [0, t − r] ) dy.     (6.1.16)

By (6.1.13), we have ψ₁ ≤ ψ ≤ ψ₂. Therefore, for r, t and x large enough,

u(t,x)/ψ(r,t,x) ≤ ψ₂(r,t,x)/ψ(r,t,x) ≤ ψ₂(r,t,x)/ψ₁(r,t,x) ≤ γ(r),     (6.1.17)

and

u(t,x)/ψ(r,t,x) ≥ 1/γ(r).     (6.1.18)

Combining (6.1.17) and (6.1.18) we get

γ−1(r)ψ(r,t,x) ≤ u(t,x) ≤ γ(r)ψ(r,t,x). (6.1.19)

For x ≥ 8r − (3/(2√2)) ln t we get from (6.1.19) that

γ⁻¹(r) ψ(r,t,x + √2 t) ≤ u(t,x + √2 t) ≤ γ(r) ψ(r,t,x + √2 t).     (6.1.20)

The nice thing is that the probability involving the Brownian bridge in the definition of ψ can be explicitly computed, see Lemma 5.1. Since our bridge goes from x + √2 t to y, and the straight line n_{r,t} runs between √2 r and m(t), we have to adjust Lemma 5.1 to the length t − r and to

y₁ = √2 t + x − m(t) = x + (3/(2√2)) ln t > 0 (for t > 1),  y₂ = y − √2 r.

Of course P( z^{t−r}_{x,y}(s) > n_{r,t}(t − s), s ∈ [0, t − r] ) = 0 for y ≤ √2 r. For y > √2 r, from Lemma 5.1 we have

P( z^{t−r}_{x,y}(s) > n_{r,t}(t − s), s ∈ [0, t − r] ) = 1 − exp( −2 (x + (3/(2√2)) ln t)(y − √2 r) / (t − r) ).

(6.1.21)

Changing variables to y′ = y − √2 r in the integral appearing in the definition of ψ, we get (6.1.9). This, together with (6.1.19), concludes the proof of the theorem. ⊓⊔

Remark 6.3. In a later paper [23], Bramson gives an even more explicit representation for the asymptotics of the solutions with speed √2, namely,

u(t, z + m(t)) = 2 C(t,z) z t^{−1} e^t ∫₀^∞ y u(0,y) (1/√(2πt)) exp( −(z + m(t) − y)²/(2t) ) dy,     (6.1.22)

with lim_{z↑∞} lim_{t↑∞} C(t,z) = 1. We will not make use of this and therefore also do not give the proof.

In the remainder of this chapter we will recall Bramson's analysis of the F-KPP equation that leads to the proof of the facts we used in the proof above. In the subsequent chapter we use this theorem crucially to obtain the extremal process of BBM. Let us make some comments about what needs to be shown. In the bounds (6.1.11) and (6.1.12), the exponential factor involving k(s,x) in the Feynman-Kac representation (6.1.7) is replaced by 1. For an upper bound, this is easy, since k ≤ 1; it is, however, important that we are allowed to smuggle in the condition that the Brownian bridge stays above the curve M′_{r,t}, which lies somewhat, but not too much, below the straight line n_{r,t}. For the lower bound, we can of course introduce a condition for the Brownian bridge to stay above the curve M^x_{r,t}, which lies above the straight line. We then have to show that then k(t − s, z^{t−r}_{x,y}(s)) is very close to

1. For this we seem to need to know how the solution u(t,x) behaves, but, in fact, an upper bound suffices. And a trivial upper bound is always given by the solution of the linear F-KPP equation, for which the Feynman-Kac representation is explicit and allows to get very precise bounds. Finally, one needs to show that the two probabilities concerning the Brownian bridge are almost the same. This holds because of what is known as entropic repulsion: a Brownian bridge wants to fluctuate on the order √t. If one imposes a condition not to make such fluctuations in the negative direction, the bridge will prefer to stay positive, and in fact behaves as if one would force it to stay above a curve that rises only up to some t^δ, with δ < 1/2. The remainder of this chapter explains how these simple ideas are implemented.
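The representation (6.1.2) is easy to test in the solvable case k ≡ 1, i.e. the linear equation ∂_t u = (1/2)∂_x²u + u that serves as the trivial upper bound: there the Feynman-Kac average e^t E_x[u(0, B_t)] must match the explicit Gaussian answer. A Monte Carlo sketch (t, x and the initial condition chosen by us):

```python
import math
import random

# Feynman-Kac check for  du/dt = 0.5*u_xx + u  (k = 1 in (6.1.2)):
# u(t,x) = e^t * E_x[ u(0, B_t) ].  With Heaviside data u(0,y) = 1_{y<0}
# the right-hand side equals e^t * Phi(-x/sqrt(t)), Phi the normal cdf.
rng = random.Random(2)
t, x, runs = 2.0, 1.0, 200000
hits = sum(1 for _ in range(runs) if x + rng.gauss(0.0, math.sqrt(t)) < 0)
fk = math.exp(t) * hits / runs

# explicit solution via Phi(-z) = erfc(z/sqrt(2))/2
exact = math.exp(t) * 0.5 * math.erfc((x / math.sqrt(t)) / math.sqrt(2.0))
```

For the full F-KPP equation the factor e^t is replaced by the path-dependent exponential of k, which is exactly what Bramson's bridge estimates control.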

6.2 The maximum principle and its applications

Let us first remark that under the standard conditions on F, existence and uniqueness of solutions of the F-KPP equation follow easily via the standard tools of Picard iteration and the Gronwall lemma.

Theorem 6.4. If F satisfies the standard conditions, then the F-KPP equation with initial data 0 ≤ u₀(x) ≤ 1 has a unique solution for all t ∈ R₊, and 0 ≤ u(t,x) ≤ 1, for all t ∈ R₊ and x ∈ R.

Remark 6.5. Existence and uniqueness hold for general bounded initial conditions if F is assumed to be Lipschitz, but we are only interested in solutions that stay in [0,1].

Proof. The best way to prove existence is to set up a Picard iteration for the mild form of the equation, i.e. to define u¹(t,x) as the unique solution of the heat equation

∂_t u^1(t,x) = ½ ∂_x² u^1(t,x), t ∈ R_+, x ∈ R, u^1(0,x) = u_0(x). (6.2.1)

This can be written as

u^1(t,x) = ∫_{−∞}^{∞} g(t, x−y) u_0(y) dy, (6.2.2)

where

g(t,x) = (1/√(2πt)) e^{−x²/(2t)} (6.2.3)

is the heat kernel. Then define recursively

u^n(t,x) = ∫_{−∞}^{∞} g(t, x−y) u_0(y) dy + ∫_0^t ds ∫_{−∞}^{∞} dy g(t−s, x−y) F(u^{n−1}(s,y)) ≡ T(u^{n−1})(t,x). (6.2.4)

If this sequence converges, the limit will be a solution of

u(t,x) = ∫_{−∞}^{∞} g(t, x−y) u_0(y) dy + ∫_0^t ds ∫_{−∞}^{∞} dy g(t−s, x−y) F(u(s,y)), (6.2.5)

which is a mild formulation of the F-KPP equation. To prove convergence, one shows that the map T is a contraction in the sup-norm (on [0,t_0] × R), for t_0 sufficiently small. This follows easily since F is Lipschitz continuous, so that

|T(u)(t,x) − T(v)(t,x)| ≤ ∫_0^t ds ∫_{−∞}^{∞} dy g(t−s, x−y) C |u(s,y) − v(s,y)| (6.2.6)
 ≤ C sup_{y∈R, s∈[0,t]} |u(s,y) − v(s,y)| ∫_0^t ds ∫_{−∞}^{∞} dy g(t−s, x−y)
 = C t sup_{y∈R, s∈[0,t]} |u(s,y) − v(s,y)|.

Hence,

sup_{x∈R, t∈[0,t_0]} |T(u)(t,x) − T(v)(t,x)| ≤ C t_0 sup_{y∈R, s∈[0,t_0]} |u(s,y) − v(s,y)|, (6.2.7)

so that T is a contraction if t_0 < 1/C. This proves that a solution exists up to time 1/C. From here we may start with u(t_0,x) as a new initial condition to construct the solution up to time 2/C, and so on. This shows global existence. To prove uniqueness, we could use the Gronwall lemma, but it follows immediately from the maximum principle, which we state next.

One of the fundamental tools in Bramson’s analysis is the following maximum principle for solutions of the F-KPP equation. We take the theorem from Bramson, who attributes it to Aronson and Weinberger [7].

Proposition 6.6. Let F satisfy the standard conditions (5.4.10). Assume that u^1(t,x) and u^2(t,x) satisfy the inequalities

0 ≤ u1(0,x) ≤ u2(0,x) ≤ 1, ∀x ∈ (a,b), (6.2.8)

and

∂_t u^2(t,x) − ½ ∂_x² u^2(t,x) − F(u^2(t,x)) ≥ ∂_t u^1(t,x) − ½ ∂_x² u^1(t,x) − F(u^1(t,x)), (6.2.9)

for all (t,x) ∈ (0,T) × (a,b). If a > −∞, assume further that

0 ≤ u1(t,a) ≤ u2(t,a) ≤ 1, ∀t ∈ [0,T], (6.2.10)

and if b < ∞, assume that

0 ≤ u1(t,b) ≤ u2(t,b) ≤ 1, ∀t ∈ [0,T]. (6.2.11)

Then u^2 ≥ u^1, and if the inequality (6.2.8) is strict on an open subinterval of (a,b), then u^2 > u^1 on (0,T] × (a,b).

Proof. The proof consists of reducing the argument to the usual maximum principle for parabolic linear equations. Condition (6.2.9) can be written as

∂_t(u^2 − u^1) − ½ ∂_x²(u^2 − u^1) ≥ F(u^2) − F(u^1) = F′(u^1 + θ(u^2 − u^1))(u^2 − u^1), (6.2.12)

for some θ ∈ [0,1], by the mean value theorem. Now set α ≡ max_{u∈[0,1]} F′(u), and set

v(t,x) ≡ e^{−2αt}(u^2(t,x) − u^1(t,x)). (6.2.13)

Then, by (6.2.12), v satisfies the differential inequality

∂_t v(t,x) − ½ ∂_x² v(t,x) ≥ (F′(u^1 + θ(u^2 − u^1)) − 2α) v(t,x), (6.2.14)

where by construction the coefficient of v on the right-hand side is non-positive. If now v(t_0,x_0) < 0 for some (t_0,x_0), then there must be some point (t_1,x_1), with t_1 ≤ t_0, at which v takes its minimum on [0,t_0] × (a,b). Note that x_1 cannot lie on the boundary of (a,b), due to the boundary conditions. But then, at (t_1,x_1), ∂_t v ≤ 0 and ∂_x² v ≥ 0, so that the left-hand side of (6.2.14) is non-positive, whereas the right-hand side is strictly positive, which leads to a contradiction. Hence v ≥ 0, and so u^2 ≥ u^1. ⊓⊔

Clearly, the preceding theorem implies that two solutions of the F-KPP equation with ordered initial (and boundary) conditions remain ordered for all times. Since, moreover, the constant functions 0 and 1 are solutions of the F-KPP equations, any solution with initial conditions within [0,1] will remain in this interval. This follows, of course, also from the McKean representation (5.4.1). The following corollary compares solutions of the linear and non-linear equation.
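Before that, the ordering statement itself is easy to observe numerically. The following sketch (not part of Bramson's argument; an illustrative explicit finite-difference scheme with the binary-branching nonlinearity F(u) = u(1 − u) and arbitrary grid parameters) evolves two ordered initial conditions and checks that the order is preserved:

```python
# Explicit finite-difference sketch for  d_t u = (1/2) d_x^2 u + F(u)  with the
# binary-branching nonlinearity F(u) = u(1 - u) (an illustrative choice).
# DT <= DX**2 makes the update a monotone map, the discrete analogue of the
# maximum principle, so ordered initial data stay ordered.
DX, DT, L, STEPS = 0.2, 0.02, 10.0, 200
N = int(2 * L / DX) + 1
xs = [-L + i * DX for i in range(N)]

def F(u):
    return u * (1.0 - u)

def step(u):
    v = u[:]                       # boundary values stay frozen
    for i in range(1, N - 1):
        lap = (u[i + 1] - 2.0 * u[i] + u[i - 1]) / DX ** 2
        v[i] = u[i] + DT * (0.5 * lap + F(u[i]))
    return v

u1 = [1.0 if x <= 0.0 else 0.0 for x in xs]   # Heaviside initial data
u2 = [1.0 if x <= 1.0 else 0.0 for x in xs]   # shifted, hence larger, data
for _ in range(STEPS):
    u1, u2 = step(u1), step(u2)
# at t = STEPS * DT = 4, u2 >= u1 pointwise and both remain in [0, 1]
```

The choice DT ≤ DX² guarantees that the update is monotone in each of the three neighbouring values, which is exactly what makes the discrete comparison work.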

Corollary 6.7 (Corollary 1 of [22]). Let u^1 be a solution of the F-KPP equation with F satisfying (5.4.10) and let u^2 satisfy the linear equation

∂_t u^2(t,x) = ½ ∂_x² u^2(t,x) + u^2(t,x). (6.2.15)

If u^1(0,x) = u^2(0,x) for all x, then

u1(t,x) ≤ u2(t,x), (6.2.16)

for all t > 0, x ∈ R. Therefore, if u^1(0,x) = 1_{x≤0}, the re-centring m(t) satisfies

m(t) < √2 t − 2^{−3/2} ln t + C = m̃(t) + C. (6.2.17)

Proof. Since F(u) ≤ u, the two functions satisfy the hypothesis of Proposition 6.6. Thus, (6.2.16) holds. On the other hand, one can solve u^2 explicitly, using the heat kernel, as

u^2(t,x) = (e^t/√(2πt)) ∫_{−∞}^{∞} e^{−(x−y)²/(2t)} u^2(0,y) dy. (6.2.18)

For Heaviside initial conditions this becomes

u^2(t,x) = (e^t/√(2πt)) ∫_{−∞}^{0} e^{−(x−y)²/(2t)} dy = e^t P(N(0,t) > x). (6.2.19)

Thus, we know that

u^2(t, z + m̃(t)) → c_0 e^{−cz}. (6.2.20)

By monotonicity, we see that m(t) must be less than m̃(t), up to constants. Of course, we know this probabilistically already.
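Since (6.2.19) is explicit, it can be evaluated directly with the complementary error function; the following sketch (t = 400 and the probe points are arbitrary choices) checks the exponential decay rate c = √2 behind (6.2.20):

```python
import math

def u2(t, x):
    """The explicit solution (6.2.19): e^t P(N(0,t) > x)."""
    return math.exp(t) * 0.5 * math.erfc(x / math.sqrt(2.0 * t))

def m_tilde(t):
    return math.sqrt(2.0) * t - 2.0 ** (-1.5) * math.log(t)

t = 400.0
ratio = u2(t, m_tilde(t) + 2.0) / u2(t, m_tilde(t) + 1.0)
# for large t the ratio is close to exp(-sqrt(2)) ~ 0.243
```

The value t = 400 keeps e^t and the Gaussian tail within double-precision range while the finite-t corrections to the ratio are already of order 1/t.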

An extension of the maximum principle due to McKean also controls the sign changes of solutions.

Proposition 6.8 (Proposition 3.2 in [22]). Let u^i, i = 1,2, be solutions of the F-KPP equation with F satisfying (5.4.10). Assume that the initial conditions are such that, if x_1 < x_2, then u^2(0,x_1) > u^1(0,x_1) implies that u^2(0,x_2) ≥ u^1(0,x_2). Then this property is preserved for all times, i.e. for all t > 0, if x_1 < x_2, then u^2(t,x_1) > (≥) u^1(t,x_1) implies that u^2(t,x_2) > (≥) u^1(t,x_2).

Proof. We set v = u^2 − u^1. Then

∂_t v = ½ ∂_x² v + F′(u^1 + θ(u^2 − u^1)) v. (6.2.21)

We now use the Feynman-Kac representation in the form (6.1.6) with

k(t,x) ≡ F0(u1(t,x) + θ(t,x)(u2(t,x) − u1(t,x))) (6.2.22)

the coefficient of v on the right of (6.2.21). Set

M(r) ≡ exp(∫_0^r k(t−s, B_s) ds) v(t−r, B_r). (6.2.23)

Then v(t,x) = E_x[M(r)], for any t ≥ r ≥ 0; in fact, M(r) is a martingale. Next we replace the constant time r by the stopping time

τ ≡ inf{0 ≤ s ≤ t : M(s) = 0} ∧t. (6.2.24)

Now choose an x1 such that v(t,x1) > 0. Then

0 < v(t,x1) = Ex1 [M(τ)], (6.2.25)

so that P(M(τ) > 0) > 0. But this means that, with positive probability, τ = t, which can only happen if there is a continuous path, γ, starting at x_1, such that for all 0 ≤ s ≤ t, v(t−s, γ(s)) > 0. In particular, v(0, γ(t)) > 0. Now define another stopping time, σ, by

σ ≡ inf{0 ≤ s ≤ t : B_s = γ(s)} ∧ t, (6.2.26)

where B_0 = x_2. Now M(σ) ≥ 0. If σ < t, the inequality is strict, and if σ = t, it must be the case that B(t) ≥ γ(t), since otherwise the Brownian motion would have had to cross γ. But then v(0, B(t)) > 0 by the assumption on the initial conditions. On the other hand, we have again,

v(t,x2) = Ex2 [M(σ)]. (6.2.27)

But Brownian motion starting at x_2 hits the path γ with positive probability before time t, so the right-hand side of (6.2.27) is strictly positive, hence v(t,x_2) > 0. ⊓⊔

The first main goal is to prove an a priori convergence result to travelling wave solutions that was already known to Kolmogorov, Petrovsky, and Piscounov [54]. We will use the notation

m_ε(t) ≡ sup{x ∈ R : u(t,x) ≥ ε}. (6.2.28)

Theorem 6.9 ([54]). Let u be a solution of the F-KPP equation with F satisfying (5.4.10) and Heaviside initial conditions. Then

u(t, x + m_{1/2}(t)) → w_{√2}(x), (6.2.29)

uniformly in x, as t ↑ ∞, where w_{√2} is the solution of the stationary KPP equation

½ w″ + √2 w′ + F(w) = 0, (6.2.30)

with w(0) = 1/2. Moreover,

m_{1/2}(t)/t → √2. (6.2.31)

Proof. We need two results that are consequences of Proposition 6.8.

Corollary 6.10 (Corollary 1 in [22]). For u as in Theorem 6.9, for any 0 < ε < 1,

u(t, m_ε(t) + x) ↑ w(x), for x > 0,
u(t, m_ε(t) + x) ↓ w(x), for x ≤ 0, (6.2.32)

for some functions w.

Proof. We fix t0,a ∈ R+ and set

u^1(t,x) ≡ u(t, x + m_ε(t_0)),
u^2(t,x) ≡ u(t + a, x + m_ε(t_0 + a)). (6.2.33)

Clearly these functions satisfy the hypothesis of Proposition 6.8. Moreover, u(t, m_ε(t)) = ε, and so

u^2(t_0, 0) = u^1(t_0, 0). (6.2.34)

Hence, by Proposition 6.8, for x ≥ 0, u^2(t_0,x) ≥ u^1(t_0,x), i.e.

u(t_0 + a, x + m_ε(t_0 + a)) ≥ u(t_0, x + m_ε(t_0)). (6.2.35)

Likewise, for x < 0: if for some x < 0 it were true that u^2(t_0,x) > u^1(t_0,x), then the same would need to hold at 0, which is not the case. Hence, u^2(t_0,x) ≤ u^1(t_0,x), i.e.

u(t_0 + a, x + m_ε(t_0 + a)) ≤ u(t_0, x + m_ε(t_0)). (6.2.36)

Hence u(t, m_ε(t) + x) is monotone increasing and bounded from above for x > 0, and decreasing and bounded from below for x ≤ 0. This implies that it must converge to some function w, as claimed. ⊓⊔

We need one more corollary that involves the notion of being more stretched. For g, h monotone decreasing functions, g is more stretched than h if, for any c ∈ R and x_1 < x_2, g(x_1) > h(x_1 + c) implies g(x_2) > h(x_2 + c), and the same with > replaced by ≥.

Corollary 6.11 (Corollary 2 in [22]). Let u^i be solutions of the F-KPP equation as in the theorem. If the initial conditions of u^2 are more stretched than those of u^1, then, for any time, u^2 is more stretched than u^1. In particular, if u^1 has Heaviside initial conditions, then w_{√2} is more stretched than u^1(t,·) for all times.

Proof. Straightforward from Proposition 6.8. ⊓⊔

So now we know that u(t, m_{1/2}(t) + x) converges to some function w(x). Moreover, since u is monotone, so is the limit. Furthermore, u(t,x) is less stretched than w_{√2}. Thus w(x) ≤ w_{√2}(x) for x > 0, and w(x) ≥ w_{√2}(x) for x ≤ 0. It is not hard to show that the limit is indeed w_{√2}; if we set v(t,x) ≡ u(t, x + m(t)), then

∂_t v(t,x) = ∂_t u(t, m(t) + x) + m′(t) ∂_x u(t, m(t) + x) = ½ ∂_x² v + m′(t) ∂_x v + F(v). (6.2.37)

Integrating the left-hand side over x twice starting from zero, and then integrating the result over t over an interval of length 1, we get

∫_0^x dy ∫_0^y dz (u(t+1, z + m(t+1)) − u(t, z + m(t))) (6.2.38)
 ≤ (x²/2) sup_{0≤z≤x} (u(t+1, z + m(t+1)) − u(t, z + m(t))),

which tends to zero by Corollary 6.10, as t ↑ ∞. Applying the same procedure on the right-hand side yields

∫_t^{t+1} ds ∫_0^x dy ∫_0^y dz (½ ∂_x² v + m′(s) ∂_x v + F(v)) (6.2.39)
 = ∫_t^{t+1} ds ( ½ v(s,x) − ½ v(s,0) − (x/2) ∂_x v(s,0)
  + m′(s) ∫_0^x dy (v(s,y) − v(s,0)) + ∫_0^x dy ∫_0^y dz F(v(s,z)) ).

Using the convergence of v to w, as t ↑ ∞, and setting a ≡ lim_{t↑∞} ∂_x v(t,0), the first line converges to

∫_t^{t+1} ds ( ½ v(s,x) − ½ v(s,0) − (x/2) ∂_x v(s,0) ) → ½ w(x) − ¼ − (a/2) x. (6.2.40)

Next, for the same reason,

∫_0^x dy (v(s,y) − v(s,0)) → ∫_0^x (w(y) − ½) dy. (6.2.41)

Finally,

∫_t^{t+1} ds ∫_0^x dy ∫_0^y dz F(v(s,z)) → ∫_0^x dy ∫_0^y dz F(w(z)). (6.2.42)

Since all of (6.2.39) tends to zero, the term involving m′(s) must also become independent of t, asymptotically. This means that

∫_t^{t+1} m′(s) ds = m(t+1) − m(t) → λ, (6.2.43)

for some λ ∈ R. Putting all this together, we get that

½ w(x) − ¼ − (a/2) x + λ ∫_0^x (w(y) − ½) dy + ∫_0^x dy ∫_0^y dz F(w(z)) = 0. (6.2.44)

Differentiating twice with respect to x yields that w satisfies the travelling wave equation (5.5.2) for some value of λ ≥ √2. But we already have an upper bound on m which implies that λ ≤ √2, which leaves us with λ = √2. This concludes the proof of the theorem. ⊓⊔

The maximum principle has several important consequences for the asymptotics of solutions that will be exploited later. The first is a certain monotonicity property.

Lemma 6.12 (Lemma 3.1 in [22]). Let u be a solution of the F-KPP equation as before. Assume that the initial conditions satisfy u(0,x) = 0 for x ≥ M, for some real M. Then, for any t, for all y ≥ x ≥ M,

u(t,y) ≤ u(t,2x − y), (6.2.45)

and therefore

∂_x u(t,x) ≤ 0. (6.2.46)

Proof. Let u^i, i = 1,2, be solutions on R_+ × [x,∞) with initial data u^1(0,y) = 0 for y ≥ x, and u^2(0,y) = u(0, 2x − y) for y ≥ x, and with boundary conditions u^i(t,x) = u(t,x) for all t ≥ 0. By the maximum principle, u^1 ≤ u^2 for all times. Since x ≥ M, we have u^1(t,y) = u(t,y) for y ≥ x. On the other hand, by reflection symmetry, u^2(t,y) = u(t, 2x − y). Hence (6.2.45). Since this inequality implies that u(t, x + ε) ≤ u(t, x − ε), the claim on the derivative at x is immediate. ⊓⊔

The next lemma gives a continuity result for the solutions as functions of the initial conditions.

Lemma 6.13 (Lemma 3.2 in [22]). Let u^i be solutions of the F-KPP equation with F′(u) ≤ 1, and assume that they are bounded over finite time. Then, if the initial conditions satisfy

u^2(0,x) − u^1(0,x) ≤ ε, for all x, (6.2.47)

then

u^2(t,x) − u^1(t,x) ≤ ε e^t, for all t and x. (6.2.48)

The same holds for the absolute values of the differences.

Proof. Set v = u^2 − u^1. Then

∂_t v = ½ ∂_x² v + ((F(u^2) − F(u^1))/(u^2 − u^1)) v. (6.2.49)

By the assumption on F, the coefficient of v is at most 1. Now let v^+ solve (6.2.49) with initial conditions v^+(0,x) = v(0,x) ∨ 0; then, by the maximum principle, v^+(t,x) ≥ v(t,x) ∨ 0 remains true for all times t ≥ 0. Next let v̄ be a solution of the linear equation

∂_t v̄ = ½ ∂_x² v̄ + v̄, (6.2.50)

with initial condition v̄(0,x) = ε. Clearly, v̄(t,x) = ε e^t. On the other hand, by the maximum principle and the assumption on the initial conditions, v^+(t,x) ≤ v̄(t,x). Hence (6.2.48) follows. The same argument can be made for the negative part of v. ⊓⊔

An application of this lemma is the next corollary.

Corollary 6.14 ([22]). Let u be a solution of the F-KPP equation satisfying the standard conditions and assume that

u(t,x + m(t)) → wλ (x), as t ↑ ∞, (6.2.51)

uniformly in x. Then

m(t + s) − m(t) → λs, (6.2.52)

uniformly for s in compact intervals. Hence, m(t)/t → λ.

Proof. Since w^λ is stationary, the previous lemma implies that, for all t ≥ 0 and s_0 ≥ 0, uniformly in s ≤ s_0,

sup_{x∈R} |u(t + s, x + λs + m(t)) − w^λ(x)| ≤ e^{s_0} sup_{x∈R} |u(t, x + m(t)) − w^λ(x)|, (6.2.53)

where the right-hand side tends to zero by assumption. Thus

sup|u(t + s,x + λs + m(t)) − u(t + s,x + m(t + s))| → 0, (6.2.54) x

as t ↑ ∞, which implies m(t + s) − m(t) → λs. ⊓⊔

The next lemma states a quantitative comparison result.

Lemma 6.15 (Lemma 3.3 in [22]). Let u^i, i = 1,2, be solutions of the F-KPP equation with F′(u) ≤ 1. If 0 ≤ u^i(0,x) ≤ 1 for all x, and for x > 0

u1(0,x) = u2(0,x), (6.2.55)

then there exists a constant, C, such that

u^2(t,x) − u^1(t,x) ≤ C t^{−1/4}, (6.2.56)

for x > √2 t − 2^{−5/2} ln t.

Proof. The proof is similar to that of Lemma 6.12. With the same definitions as in that proof, we see that, by the Feynman-Kac formula, the solution v̄ of the linear equation is given as

v̄(t,x) = (e^t/√(2πt)) ∫_{−∞}^{0} e^{−(x−y)²/(2t)} dy ≤ (e^t √t/(√(2π) x)) e^{−x²/(2t)}. (6.2.57)

For x = √2 t − 2^{−5/2} ln t, the right-hand side is asymptotically equal to t^{−1/4}/(√2 √(2π)). ⊓⊔
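The little computation at the end of this proof can be checked numerically (a sketch; t = 500 is an arbitrary large value chosen so that the Gaussian tail still fits into double precision):

```python
import math

t = 500.0  # large, but the Gaussian tail still fits into a double
x = math.sqrt(2.0) * t - 2.0 ** (-2.5) * math.log(t)
vbar = math.exp(t) * 0.5 * math.erfc(x / math.sqrt(2.0 * t))  # e^t P(N(0,t) > x)
predicted = t ** (-0.25) / (math.sqrt(2.0) * math.sqrt(2.0 * math.pi))
ratio = vbar / predicted
# ratio is close to 1, confirming the t^{-1/4} asymptotics
```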

The next lemma is the first result linking the behaviour of the tail of the initial distribution to the convergence to a travelling wave.

Lemma 6.16 (Lemma 3.4 in [22]). Let u be a solution of the F-KPP equation in the standard setting. If λ = √2, assume in addition that 1 − F′(u) = O(u^ρ) for some ρ > 0. If the initial condition is such that, for some λ ≥ √2 and a function γ(x) ↑ 1, as x ↑ ∞,

u(0,x) = γ(x) w^λ(x), (6.2.58)

where w^λ is a solution of the stationary equation with speed λ, then

sup_{x ≥ b_λ(t)} |u(t, x + λt) − w^λ(x)| → 0, as t ↑ ∞, (6.2.59)

where b_λ(t) = (√2 − λ)t − (1/8) ln t.

Proof. Set

u^N(0,x) ≡ u(0,x), for x > N, and u^N(0,x) ≡ w^λ(x), for x ≤ N. (6.2.60)

We know that w^λ decays exponentially as x → ∞. So, due to (6.2.58), for any δ > 0 there is N < ∞ such that, for all x ∈ R,

wλ (x + δ) ≤ uN(0,x) ≤ wλ (x − δ). (6.2.61)

The maximum principle implies then that

wλ (x + δ) ≤ uN(t,λt + x) ≤ wλ (x − δ). (6.2.62)

On the other hand, for x ≥ b(t) + N, |u^N(t,x) − u(t,x)| ≤ C t^{−1/4}. Thus, at the expense of a little error of order t^{−1/4}, we can replace u^N by u in (6.2.62) for such x. But this gives (6.2.59), since w^λ is uniformly continuous. ⊓⊔

The next proposition further strengthens the point that if a solution approaches the travelling wave for large x, then it does so also for smaller x. This will be one main tool to turn the tail asymptotics of Theorem ?? into a proof of convergence to the travelling wave.

Proposition 6.17 (Proposition 3.3 in [22]). Under the assumptions of Lemma 6.16, assume that, for some N, for all x ≥ N,

γ_1^{−1}(x) w^λ(x) − γ_2(t) ≤ u(t, x + m(t)) ≤ γ_1(x) w^λ(x) + γ_2(t), (6.2.63)

where γ1(x) → 1, as x ↑ ∞, and γ2(t) → 0, as t ↑ ∞. Then there exists c(t) tending to −∞ as t ↑ ∞, such that

sup_{x ≥ c(t)} |u(t, x + m(t)) − w^λ(x)| → 0, as t ↑ ∞. (6.2.64)

Proof. The proof of this proposition is a relatively simple application of the maximum principle via the preceding lemma.

Some nice properties hold if F not only satisfies the standard conditions but is even concave, i.e. if F(u)/u is decreasing. This holds if F comes from BBM with binary branching, where F(u)/u = (1 − u). In other situations one can compare solutions with those corresponding to the upper and lower concave hulls,

F̄(u) ≡ u max(F(u′)/u′, u′ ≥ u), (6.2.65)
F̲(u) ≡ u max(F(u′)/u′, u′ ≤ u). (6.2.66)

Then the following lemma can be applied.

Lemma 6.18 (Lemma 3.5 in [22]). Assume that F is concave and that u^i, i = 1,2,3, are solutions of the F-KPP equation satisfying the standard conditions. Then,

u3(0,x) ≤ u2(0,x) + u1(0,x), for all x, (6.2.67)

implies

u3(t,x) ≤ u2(t,x) + u1(t,x), for all x and for all t ≥ 0. (6.2.68)

Similarly, for any 0 < c ≤ 1, if

u1(0,x) ≥ cu2(0,x), (6.2.69)

for all x, then

u1(t,x) ≥ cu2(t,x), (6.2.70) holds for all t ≥ 0 for all x.

The proof is a straightforward application of the maximum principle. The following proposition compares the convergence of general initial data to the Heaviside case that was studied in Kolmogorov’s theorem.

Proposition 6.19 (Proposition 3.4 in [22]). Let u be a solution of the F-KPP equation that satisfies the standard conditions with concave F. Assume that, for some t_0 ≥ 1 and η > 0,

u(t_0, 0) ≥ η. (6.2.71)

Then, for all 0 < θ < 1 and ε > 0, there exist T (independent of η) and a constant c_θ, such that, for t ≥ T,

u(t + t_0 − θ^{−1} ln η, x) > w_{√2}(|x| − m_{1/2}(t) + c_θ) − ε. (6.2.72)

In particular, for all δ > 0,

u(t + t_0 − θ^{−1} ln η, x) → 1, (6.2.73)

uniformly in |x| ≤ (√2 − δ)t. If 1 − F′(u) ≤ u^ρ, then we may choose θ = 1.

Proof. We need some preparation.

Lemma 6.20 (Lemma 3.6 in [22]). Assume that u(t0,0) ≥ η, for some t0 ≥ 1 and η > 0. Then for either J = [0,1] or J = [−1,0],

u(t0,x) ≥ η/10, for all x ∈ J. (6.2.74)

Proof. We bound the solution by that of the linear F-KPP equation and use the Feynman-Kac representation. This implies that

u(t_0, 0) ≤ e ∫_R u(t_0 − 1, y) (e^{−y²/2}/√(2π)) dy. (6.2.75)

2e max( ∫_{−∞}^0 u(t_0 − 1, y) (e^{−y²/2}/√(2π)) dy, ∫_0^∞ u(t_0 − 1, y) (e^{−y²/2}/√(2π)) dy ). (6.2.76)

Let us say that the maximum is realised by the integral over the positive half-line. Then use that, for x ∈ [0,1],

max_{y≥0} e^{(y−x)²/2 − y²/2} = e^{x²/2} ≤ e^{1/2}. (6.2.77)

Thus

u(t_0, 0) ≤ e^{3/2} ∫_0^∞ u(t_0 − 1, y) (e^{−(x−y)²/2}/√(2π)) dy. (6.2.78)

Now use that ∂_t u ≥ ½ ∂_x² u to see that

∫_{−∞}^{∞} u(t_0 − 1, y) (e^{−(x−y)²/2}/√(2π)) dy ≤ u(t_0, x). (6.2.79)

This gives (6.2.74). ⊓⊔
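The elementary maximisation (6.2.77) is easy to verify directly (a sketch; the value x = 0.7 and the grid are arbitrary):

```python
import math

x = 0.7                                   # any x in [0, 1]
ys = [0.01 * k for k in range(1001)]      # grid for y in [0, 10]
grid_max = max(math.exp((y - x) ** 2 / 2.0 - y ** 2 / 2.0) for y in ys)
# the exponent equals (x^2 - 2xy)/2, which is maximal at y = 0
```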

Lemma 6.21. Let u(t,x) be as usual and let

u(0,x) = η, for x ∈ J, and u(0,x) = 0, for x ∉ J, (6.2.80)

for some interval J. For any θ ∈ (0,1), there is C_1^θ > 0, such that for t ≤ θ^{−1} ln(C_1^θ/η) and all x ∈ R,

u(t,x) ≥ η e^{θt} ∫_J (e^{−(x−y)²/(2t)}/√(2πt)) dy. (6.2.81)

If (5.4.11) holds, there is C1 > 0, such that for t < −lnη,

u(t,x) ≥ C_1 η e^t ∫_J (e^{−(x−y)²/(2t)}/√(2πt)) dy. (6.2.82)

Proof. The key idea of the proof is to exploit the fact that if u is small, then F(u)/u is close to one. An upper bound then shows that, for a short time, u will grow as claimed. We introduce a comparison solution u^θ with the same initial conditions as u and with F(u) replaced by F^θ(u) ≡ F(u) ∧ (θu). Clearly F^θ(u) ≤ F(u), and so, by the maximum principle, u(t,x) ≥ u^θ(t,x) for all t ≥ 0 and x ∈ R. On the other hand, since

∂_t u^θ ≤ ½ ∂_x² u^θ + θ u^θ, (6.2.83)

we have the trivial upper bound

u^θ(t,x) ≤ η e^{θt}, (6.2.84)

for all t ≥ 0 and x ∈ R. Since F′(u) ≤ 1, for given θ < 1 there will be a C_1^θ > 0 such that, for all 0 ≤ u ≤ C_1^θ, F(u)/u ≥ θ, and hence F^θ(u)/u = θ. By the bound (6.2.84), this is true at least up to time t^* = θ^{−1} ln(C_1^θ/η). This gives the claimed lower bound. In the case when (5.4.11) holds, we can choose θ = 1 and u^θ = u. Then we know that F(u(t,x))/u(t,x) ≥ 1 − C_2 (η e^t)^ρ, which we can use up to t^* = −ln η. Using again the Feynman-Kac formula, we get, for t < t^*,

u(t,x) ≥ η e^{∫_0^t (1 − C_2 (η e^s)^ρ) ds} ∫_J (e^{−(x−y)²/(2t)}/√(2πt)) dy. (6.2.85)

But

∫_0^{t^*} (η e^s)^ρ ds ≤ ρ^{−1}, (6.2.86)

and hence

u(t,x) ≥ e^{−C_2/ρ} η e^t ∫_J (e^{−(x−y)²/(2t)}/√(2πt)) dy, (6.2.87)

which is what is claimed. ⊓⊔

We now prove the proposition. We consider only the case when (5.4.11) holds and thus set θ = 1. Set t1 = −lnη, t2 = t0 +t1. The two preceding lemmas give

u(t_2, x) ≥ C_1 ∫_J (e^{−(x−y)²/(2t_1)}/√(2πt_1)) dy, (6.2.88)

with J either [−1,0] or [0,1]. Thus, for |x| ≤ 1, we get, again from Lemma 6.21, that

u(t2 + 1,x) ≥ C4. (6.2.89)

We can now compare u(t + t_2 + 1, x) ≥ u^1(t,x), where the initial condition for u^1 is C_4 1_{(−1,1]}(x). Let u^2 be the solution with Heaviside initial conditions. We get

u^1(t,x) ≥ C_4 (u^2(t,x) − u^2(t, x + 1)). (6.2.90)

Now set x = z + m_{1/2}(t). Then the right-hand side of (6.2.90) converges to

C_4 (w_{√2}(z) − w_{√2}(z + 1)). (6.2.91)

Using the tail behaviour of w_{√2}, this is, for large z, at least w_{√2}(z + c_1), for some 0 < c_1 < ∞, and hence

u^1(t, z + m_{1/2}(t) − c_1) ≥ γ_1^{−1}(z) w_{√2}(z) − γ_2(t). (6.2.92)

Then, for all ε > 0, for large enough t,

u^1(t, z + m_{1/2}(t) − c_1) ≥ w_{√2}(z) − ε, (6.2.93)

if z ≥ c(t), where c(t) is from Proposition 6.17 and tends to −∞ as t ↑ ∞. This also implies that, for large enough t,

u^1(t, c(t) + m_{1/2}(t) − c_1) ≥ w_{√2}(c(t)) − ε/2 ≥ 1 − ε. (6.2.94)

Since u^1(t,x) is decreasing in x for x ≥ 0, we have that, for 0 ≤ x ≤ m_{1/2}(t) + c(t) − c_1,

u^1(t,x) ≥ 1 − ε ≥ w_{√2}(x − m_{1/2}(t) + c_1) − ε. (6.2.95)

On the other hand, for x ≥ m_{1/2}(t) + c(t) − c_1, it follows from (6.2.93) that

u^1(t,x) ≥ w_{√2}(x − m_{1/2}(t) + c_1) − ε. (6.2.96)

This implies that, for all x ≥ 0,

u(t + t_0 − ln η + 1, x) ≥ u^1(t,x) ≥ w_{√2}(x − m_{1/2}(t) + c_1) − ε. (6.2.97)

This can be written as

u(t + t_0 − ln η, x) ≥ u^1(t,x) ≥ w_{√2}(x − m_{1/2}(t) + c) − ε, (6.2.98)

with a slightly changed constant c. The case x < 0 follows similarly. Finally, since m_{1/2}(t) ∼ √2 t and w_{√2}(z) ↑ 1 as z ↓ −∞, it follows that

u(t + t_0 − ln η, x) ↑ 1, (6.2.99)

uniformly in t if |x| ≤ (√2 − δ)t, which is (6.2.73). ⊓⊔

The final proposition of this section asserts that the scaling function m1/2(t) is essentially concave.

Proposition 6.22 (Proposition 3.5 in [22]). Let u be a solution of the F-KPP equation satisfying the standard conditions and Heaviside initial conditions. Assume further that F is concave. Then there exists a constant M > 0 such that, for all s ≤ t,

m_{1/2}(s) − (s/t) m_{1/2}(t) ≤ M. (6.2.100)

Proof. Eq. (6.2.100) will follow from the fact that, for all t1 ≤ t2 and s ≥ 0,

m1/2(t1 + s) − m1/2(t1) ≤ m1/2(t2 + s) − m1/2(t2) + M. (6.2.101)

Namely, assume that (6.2.100) fails for some s ≤ t. Define

T = inf{r ∈ [s,t] : m1/2(r)/r = α}, (6.2.102)

where α ≡ inf_{q∈[s,t]} m_{1/2}(q)/q. Let t_1 = sup{r ∈ [0,s] : m_{1/2}(r)/r = α}. Put t_2 = s, and S = T − s. Then

m1/2(t2) > αt2 + M, and m1/2(t2 + S) = α(t2 + S). (6.2.103)

Hence m1/2(t2 + S) − m1/2(t2) < αS − M. (6.2.104)

Furthermore, m1/2(t1) = αt1, and m1/2(t1 + S) ≥ α(t1 + S), so that

m1/2(t1 + S) − m1/2(t1) ≥ αS. (6.2.105)

(6.2.104) and (6.2.105) contradict (6.2.101). It remains to prove (6.2.101). But, by Corollary 6.10 (see (6.2.32)), for t_2 ≥ t_1,

u(t_2, x + m_{1/2}(t_2)) ≥ u(t_1, x + m_{1/2}(t_1)), if x > 0, and u(t_2, x + m_{1/2}(t_2)) ≥ 1/2, if x ≤ 0. (6.2.106)

In particular,

u(t_2, x + m_{1/2}(t_2)) ≥ ½ u(t_1, x + m_{1/2}(t_1)). (6.2.107)

By the concavity result (6.2.69) in Lemma 6.18, this extends to

u(t_2 + s, x + m_{1/2}(t_2)) ≥ ½ u(t_1 + s, x + m_{1/2}(t_1)). (6.2.108)

Since u is monotone, it follows that m_{1/4}(t_2 + s) − m_{1/2}(t_2) ≥ m_{1/2}(t_1 + s) − m_{1/2}(t_1). Moreover, it follows from the second assertion of Corollary 6.11 that u drops from 1/2 to 1/4 uniformly faster than w_{√2}, so that m_{1/4}(t) − m_{1/2}(t) is bounded uniformly in t. This proves (6.2.101). ⊓⊔
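The near-concavity of m_{1/2} can be observed on a numerically computed front (a sketch, not part of Bramson's argument; the explicit scheme, the binary-branching nonlinearity u(1 − u), and the bound 3 in place of M are illustrative choices):

```python
DX, DT = 0.2, 0.02
XMIN, XMAX = -10.0, 30.0
N = int((XMAX - XMIN) / DX) + 1
xs = [XMIN + i * DX for i in range(N)]

def step(u):
    v = u[:]                              # boundaries stay frozen
    for i in range(1, N - 1):
        lap = (u[i + 1] - 2.0 * u[i] + u[i - 1]) / DX ** 2
        v[i] = u[i] + DT * (0.5 * lap + u[i] * (1.0 - u[i]))
    return v

def front(u):
    """m_{1/2}: first crossing of the level 1/2, linearly interpolated."""
    for i in range(N - 1):
        if u[i] >= 0.5 > u[i + 1]:
            return xs[i] + DX * (u[i] - 0.5) / (u[i] - u[i + 1])
    return float("nan")

u = [1.0 if x <= 0.0 else 0.0 for x in xs]  # Heaviside initial data
record = {100: 2, 200: 4, 300: 6, 400: 8}   # step count -> time
m = {}
for k in range(1, 401):
    u = step(u)
    if k in record:
        m[record[k]] = front(u)
# m[s] - (s/t) m[t] stays bounded above, here checked with t = 8
```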

6.3 Estimates on solutions of the linear F-KPP equation

The linear F-KPP equation,

∂_t u = ½ ∂_x² u + u, (6.3.1)

has already served as a reference in the previous sections. We now derive some precise estimates on the behaviour of its solutions from the explicit representation

φ(t,x) = (e^t/√(2πt)) ∫_{−∞}^{∞} u(0,y) e^{−(x−y)²/(2t)} dy. (6.3.2)

This representation reduces the work of obtaining good bounds to standard estimates on Gaussian integrals, i.e. the Gaussian tail bounds (1.2.11), together with cutting up the range of integration. Recall that λ and b are related by b = λ − √(λ² − 2). We first prove an upper bound.

Lemma 6.23 (Lemma 4.1 in [22]). Assume that, for some h > 0 and some t_0, for all x ≥ t_0,

∫_x^{x(1+h)} u(0,y) dy ≤ e^{−bx}. (6.3.3)

Then, for each δ ≥ 0 and a constant C,

(e^t/√(2πt)) ∫_{J^c} u(0,y) e^{−(x−y)²/(2t)} dy ≤ C t^{−1} e^{−b(x−λt) − δ²t/3}, (6.3.4)

for all t > 1 and for all x, where

λ = 1/b + b/2, (6.3.5)

and J = (x − (b+δ)t, x − (b−δ)t).

Proof. The proof of this bound is elementary. See [22]. ut

Bramson states a variation on this bound as follows.

Corollary 6.24 (Corollary 1 in [22]). Assume that, for some h > 0 and some 0 < b ≤ √2,

limsup_{x↑∞} x^{−1} ln( ∫_x^{x(1+h)} u(0,y) dy ) ≤ −b. (6.3.6)

Then, for δ ≥ 0 and 0 < ε < b,

e^t ∫_{J^c} u(0,y) (e^{−(x−y)²/(2t)}/√(2πt)) dy ≤ e^{−b(x−λt) + εx − δ²t/4}, (6.3.7)

for all x and t large enough, where J is as in the previous lemma. In particular,

lnφ(t,x) ≤ −b(x − λt) + εx. (6.3.8)

We need a sharper bound on φ(t,x).

Lemma 6.25. Under the hypothesis of Lemma 6.23, for all x,

φ(t,x) ≤ (e^t/√(2πt)) ∫_{−∞}^{t_0−x} e^{−y²/(2t)} dy + K (e^t e^{−bx}/√(2πt)) ∫_{t_0−x}^{∞} e^{−by − y²/(2t)} dy. (6.3.9)

In particular, for x ≥ λt,

φ(t,x) ≤ e^{t − x²/(2t) + x t_0/t} / (√(2π) x/√t) + K e^{−b(x−λt)}. (6.3.10)

Proof. We shift coordinates so that the integral in (6.3.2) takes the form ∫_{−∞}^{∞} u(0, x+y) e^{−y²/(2t)} dy. We split this integral into the part from −∞ to t_0 − x and the rest. In the first part, we bound u by 1. In the second, we split the integral into pieces of length one, i.e. we write

∫_{t_0−x}^{∞} u(0, x+y) e^{−y²/(2t)} dy = Σ_{n=0}^{∞} ∫_{n+t_0−x}^{n+1+t_0−x} u(0, x+y) e^{−y²/(2t)} dy (6.3.11)
 ≤ Σ_{n=0}^{∞} e^{−min_{y∈[n+t_0−x, n+1+t_0−x]} y²/(2t)} ∫_{n+t_0−x}^{n+1+t_0−x} u(0, x+y) dy
 ≤ Σ_{n=0}^{∞} e^{−min_{y∈[n+t_0−x, n+1+t_0−x]} y²/(2t)} e^{−b(n+t_0)}.

We would like to reconstruct the integral from the last line. To do this, note that, for any n,

max_{y∈[n+t_0−x, n+1+t_0−x]} y²/(2t) − min_{y∈[n+t_0−x, n+1+t_0−x]} y²/(2t) ≤ (2(n + t_0 − x ± 1) + 1)/(2t), (6.3.12)

where ± depends on whether n + t_0 − x is positive or negative. For all n such that 2|n + t_0 − x| ≤ Ct, this is bounded by a constant, while otherwise the summands are anyway smaller than exp(−Ct), which is negligible if C is chosen large enough. Hence

Σ_{n=0}^{∞} e^{−min_{y∈[n+t_0−x, n+1+t_0−x]} y²/(2t)} e^{−b(n+t_0)} ≤ e^C e^{−bx} ∫_{t_0−x}^{∞} e^{−by} e^{−y²/(2t)} dy. (6.3.13)

From here (6.3.9) follows. The last integral is bounded from above by

e^{−bx + b²t/2} √(2πt). (6.3.14)

Combining this with a standard Gaussian estimate for the first integral in (6.3.9) and recalling the definition of λ gives (6.3.10). ut
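For concreteness, the speed λ = 1/b + b/2 from (6.3.5) can be checked against a closed-form evaluation of (6.3.2) (a sketch; b = 1, t = 40, and the initial datum u(0,y) = 1 for y ≤ 0, e^{−by} for y > 0 are illustrative choices with the exponential tail assumed in (6.3.3)):

```python
import math

B, T = 1.0, 40.0
LAM = 1.0 / B + B / 2.0          # predicted speed, cf. (6.3.5)

def phi(t, x):
    # closed form of (6.3.2) for u(0,y) = 1 for y <= 0 and e^{-B y} for y > 0
    tail = 0.5 * math.erfc(x / math.sqrt(2.0 * t))
    bulk = math.exp(B * B * t / 2.0 - B * x) \
        * 0.5 * math.erfc(-(x - B * t) / math.sqrt(2.0 * t))
    return math.exp(t) * (tail + bulk)

# bisection for the median m_phi(T), i.e. phi(T, m) = 1/2
lo, hi = B * T, 3.0 * LAM * T
for _ in range(80):
    mid = 0.5 * (lo + hi)
    if phi(T, mid) > 0.5:
        lo = mid
    else:
        hi = mid
speed = 0.5 * (lo + hi) / T
# speed is close to LAM = 1.5
```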

The next lemma provides a lower bound.

Lemma 6.26. Assume that for some h > 0,

liminf_{x↑∞} x^{−1} ln( ∫_x^{x(1+h)} u(0,y) dy ) ≥ −b, (6.3.15)

with 0 < b < √2. Let M > λ be fixed. Then, for all ε > 0, there is T > 0 such that, for (t,x) ∈ A_T and x ≤ Mt,

ln φ(t,x) ≥ −b(x − λt) − εx, (6.3.16)

where AT = {(t,x) : t ≥ T,x ≥ b0t} ∪ {(t,x) : 1 ≤ t ≤ T,x ≥ b0t}.

Corollary 6.27. Let mφ (t) be defined by φ(t,mφ (t)) = 1/2. If

lim_{x↑∞} x^{−1} ln( ∫_x^{x(1+h)} u(0,y) dy ) = −b, (6.3.17)

for 0 < b < √2. Then

lim_{t↑∞} m_φ(t)/t = λ. (6.3.18)

If the initial data satisfy (6.3.6) with b = √2, then λ = √2.

Proof. In the first case, the result follows from the bounds on φ in Lemma 6.26 and Corollary 6.24. In the second case, the upper bound is still valid with λ = √2, while the lower bound follows by direct computation. ⊓⊔

6.4 Brownian bridges

The use of the Feynman-Kac representation for analysing the asymptotics of solutions of the F-KPP equation requires rather detailed control on the behaviour of Brownian bridges. The following lemma, due to Bramson, gives some basic estimates.

Lemma 6.28 (Lemma 2.2 in [22]). Let z^t be a Brownian bridge from 0 to 0 in time t. Then the following estimates hold.

(i) For y1,y2 > 0,

P(∃ 0≤s≤t : z^t(s) ≥ (s y_1 + (t−s) y_2)/t) ≤ e^{−2y_1y_2/t}. (6.4.1)

(ii) For x > 0 and 0 < s_0 < t,

P(∃ 0≤s≤s_0 : z^t(s) ≥ x) ≤ (2√s_0/x) e^{−x²/(2s_0)}. (6.4.2)

(iii) For y_1, y_2 > 0 and 0 ≤ x ≤ y_2/2,

p(t,x; ∀ 0≤s≤t : z^t(s) ∈ [−y_1,y_2]) ≥ √(2/(πt³)) y_1 e^{−x²/(2t)} ( x − (y_1 + 2y_2) e^{−y_2²/(2t)} − y_1(x + y_1)²/t ), (6.4.3)

where p(t,x; A) denotes the density of the Brownian motion restricted to the event A.

(iv) For y_1, y_2 > 0 and c_0, let z^t_{y_1,y_2} be a Brownian bridge from y_1 to y_2 in time t. Then

P(∀ 0≤s≤t : z^t_{y_1,y_2}(s) > c_0 (s ∧ (t−s))) = √(2/(πt)) ∫_0^∞ (1 − e^{−4y_1x/t})(1 − e^{−4y_2x/t}) e^{−2(x + c_0t/2 − (y_1+y_2)/2)²/t} dx. (6.4.4)

Proof. The proof is a nice exercise using the reflection principle. We first prove (i). First, it is obvious that the probability in question is the same as

P(∃ 0≤s≤t : z^t_{0,y_2−y_1}(s) ≥ y_2). (6.4.5)

But all paths that do not stay below y_2 have a reflected counterpart that ends in y_2 + y_1 instead of y_2 − y_1. Thus the probability for Brownian motion to end in y_2 − y_1 and to cross the level y_2 is the same as the probability to end in y_2 + y_1, hence (6.4.5) equals

p(t, y_2 + y_1)/p(t, y_2 − y_1) = e^{−2y_1y_2/t}, (6.4.6)

where p(t,x) is the heat kernel. This proves (i). To prove (ii), Bramson uses that

t   P ∃s≤s0 : zs > x ≤ 2P ∃s≤s0 : Bs > x . (6.4.7) (which I don’t quite see how to get easily). Since the law of the Brownian bridge at time t/2 is the same as that of Bt/2, the probability of the bridge to exceed a level x in construction 6.4 Brownian bridges 93

Fig. 6.1 Use of the reflection principle

on an interval [0,s1] for s1 ≤ t/2 is the same as that of Brownian motion. This gives the result if s0 ≤ t/2. If s0 > t/2, we have

P(∃ s≤s_0 : z^t_s > x) ≤ P(∃ s≤t/2 : z^t_s > x) + P(∃ s∈[t/2,s_0] : z^t_s > x). (6.4.8)

For the second probability, we have that

P(∃ s∈[t/2,s_0] : z^t_s > x) = P(∃ s∈[t−s_0,t/2] : z^t_s > x) ≤ P(∃ s∈[0,t/2] : z^t_s > x). (6.4.9)

As both probabilities are the same as the corresponding ones for Brownian motion, we get the rather crude bound with the factor two in (6.4.7). Applying the reflection principle to the Brownian motion probability and using the standard Gaussian tail asymptotics then yields the claimed estimate. To prove (iii), one uses the reflection principle to express the restricted density as

p(t,x; ∀ 0≤s≤t : z^t(s) ∈ [−y_1,y_2]) = p(t,x) − p(t, x+2y_1) − p(t, x−2y_2) + p(t, x−2y_1−2y_2). (6.4.10)

Inserting the explicit expression of the heat kernel, this equals

(1/√(2πt)) ( e^{−x²/(2t)} − e^{−(x+2y_1)²/(2t)} − e^{−(x−2y_2)²/(2t)} + e^{−(x−2y_1−2y_2)²/(2t)} ) (6.4.11)
 = (1/√(2πt)) e^{−x²/(2t)} ( 1 − e^{−2y_1(x+y_1)/t} − e^{−2y_2(y_2−x)/t} + e^{−2(y_1+y_2)(y_1+y_2−x)/t} )
 = (1/√(2πt)) e^{−x²/(2t)} ( 1 − e^{−2y_1(x+y_1)/t} − e^{−2y_2(y_2−x)/t} (1 − e^{2y_1(x−2y_2−y_1)/t}) ).

Using that for u > 0, e^{−u} ≥ 1 − u and e^{−u} ≤ 1 − u + u²/2, the right-hand side is bounded from below by

(2y_1/√(2πt³)) e^{−x²/(2t)} ( (x + y_1) − y_1(x + y_1)²/t + e^{−2y_2(y_2−x)/t} (x − 2y_2 − y_1) ). (6.4.12)

Using the assumption x ≤ y_2/2 to simplify this expression yields the assertion of (iii). We skip the proof of part (iv), which is an application of the estimate (i). ⊓⊔

Lemma 6.29. Let 1/2 < γ < 1 and let z^t be a Brownian bridge from 0 to 0 in time t. Then, for all ε > 0, there exists r large enough such that, for all t > 3r,

P(∃ s ∈ [r, t−r] : |z^t(s)| > (s ∧ (t−s))^γ) < ε. (6.4.13)

More precisely,

P(∃ s ∈ [r, t−r] : |z^t(s)| > (s ∧ (t−s))^γ) < 8 ∑_{k=⌊r⌋}^∞ k^{1/2−γ} e^{−k^{2γ−1}/2}. (6.4.14)

Fig. 6.2 BB stays with high probability in the green tube

Proof. The probability in (6.4.13) is bounded from above by

∑_{k=⌈r⌉}^{⌈t−r⌉} P(∃ s ∈ [k−1, k] : |z^t(s)| > (s ∧ (t−s))^γ)
  ≤ 2 ∑_{k=⌈r⌉}^{⌈t/2⌉} P(∃ s ∈ [k−1, k] : |z^t(s)| > (s ∧ (t−s))^γ) (6.4.15)

by the reflection principle for the Brownian bridge. This is now bounded from above by

2 ∑_{k=⌈r⌉}^{⌈t/2⌉} P(∃ s ∈ [0, k] : |z^t(s)| > (k−1)^γ). (6.4.16)

Using the bound of Lemma 6.28 (ii) we have

P(∃ s ∈ [0, k] : |z^t(s)| > (k−1)^γ) ≤ 4 (k−1)^{1/2−γ} e^{−(k−1)^{2γ−1}/2}. (6.4.17)

Using this bound for each summand in (6.4.16), we obtain (6.4.14). Since the sum on the right-hand side of (6.4.14) is finite, (6.4.13) follows. ⊓⊔
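The tube estimate of Lemma 6.29 is easy to probe by simulation. The following sketch (the function name and all parameter choices are ours, not from the text) estimates the probability that a discretized Brownian bridge leaves the tube of width (s ∧ (t−s))^γ on [r, t−r]; by (6.4.13) this should become small as r grows.

```python
import numpy as np

def bridge_exceedance_prob(t=100.0, r=5.0, gamma=0.75, n_steps=2000, n_paths=2000, seed=0):
    """Monte Carlo estimate of P(exists s in [r, t-r]: |z(s)| > (min(s, t-s))^gamma)
    for a Brownian bridge z from 0 to 0 in time t, as in (6.4.13)."""
    rng = np.random.default_rng(seed)
    s = np.linspace(0.0, t, n_steps + 1)
    dt = t / n_steps
    incr = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    B = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(incr, axis=1)], axis=1)
    z = B - (s / t) * B[:, -1][:, None]      # pin the endpoint: bridge from 0 to 0
    mask = (s >= r) & (s <= t - r)
    tube = np.minimum(s, t - s)[mask] ** gamma
    return float((np.abs(z[:, mask]) > tube).any(axis=1).mean())
```

Since the exceedance event for larger r is contained in that for smaller r (on the same simulated paths), the estimates decrease in r, in line with the lemma.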

An important fact in the analysis of Brownian bridges is that hitting probabilities can be approximated by looking at discrete sets of times. This is basically a consequence of the continuity of Brownian motion.

Let ℓ : [0,t] → R be a function that is bounded from above. Let S_k ≡ {s_{k,1}, …, s_{k,2^k}} ⊂ [0,t] be such that, for all k, j, s_{k,j} ∈ [(j−1)t2^{−k}, jt2^{−k}], and that

ℓ(s_{k,j}) ≥ sup{ℓ(s), s ∈ [(j−1)t2^{−k}, jt2^{−k}]} − 2^{−k}. (6.4.18)

Set S^n ≡ ∪_{k≤n} S_k, and set

ℓ^{S^n}(s) ≡ { ℓ(s), if s ∈ S^n,
            { −∞, else. (6.4.19)

Lemma 6.30 (Lemma 2.3 in [22]). With the notation above, the event A ≡ {z(s) > `(s),∀0 ≤ s ≤ t} is measurable, and

lim_{n↑∞} P( z(s) > ℓ^{S^n}(s), ∀ 0 ≤ s ≤ t ) = P(A). (6.4.20)

We will not give the proof, which is fairly straightforward.
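The construction of the sets S_k and the convergence (6.4.20) can be illustrated numerically. The sketch below (all names are ours) checks the bridge only on the discrete set S^n built as in (6.4.18); since S^n grows with n, the resulting probabilities decrease monotonically towards the continuous hitting probability.

```python
import numpy as np

def hitting_prob_discretized(ell, n_levels, t=10.0, n_grid=4096, n_paths=3000, seed=1):
    """Approximate P(z(s) > ell(s) for all s in S^n), where S^n collects, for each
    k <= n and each dyadic interval [(j-1) t 2^{-k}, j t 2^{-k}], one
    (near-)maximizing point of ell, in the spirit of (6.4.18)."""
    rng = np.random.default_rng(seed)
    s = np.linspace(0.0, t, n_grid + 1)
    dt = t / n_grid
    incr = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_grid))
    B = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(incr, axis=1)], axis=1)
    z = B - (s / t) * B[:, -1][:, None]          # Brownian bridge from 0 to 0
    idx = set()                                  # indices of S^n on the grid
    for k in range(1, n_levels + 1):
        for j in range(1, 2 ** k + 1):
            lo, hi = (j - 1) * t * 2 ** (-k), j * t * 2 ** (-k)
            cand = np.where((s >= lo) & (s <= hi))[0]
            idx.add(int(cand[np.argmax(ell(s[cand]))]))  # near-maximizer of ell
    idx = np.array(sorted(idx))
    return float((z[:, idx] > ell(s[idx])).all(axis=1).mean())
```

With the same simulated paths, the estimates are decreasing in n because the events are nested.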

6.5 Hitting probabilities of curves

We will need to compare hitting probabilities of different curves. In particular, we want to show that these are insensitive to slight deformations. The intuitive reason is related to what is called entropic repulsion: if a Brownian motion cannot touch a curve, it will actually stay quite far away from it, with large probability.

Let L_t be a family of functions defined on [0,t]. For C > 0 and 0 < δ < 1/2, define

L̲_t(s) ≡ { L_t(s) − C s^δ, if 0 ≤ s ≤ t/2,
         { L_t(s) − C (t−s)^δ, if t/2 ≤ s ≤ t, (6.5.1)

and

L̄_t(s) ≡ { L_t(s) + C s^δ, if 0 ≤ s ≤ t/2,
         { L_t(s) + C (t−s)^δ, if t/2 ≤ s ≤ t. (6.5.2)

Lemma 6.31 (Lemma 6.1 in [22]). Let Lt be a family of functions such that for some r0 > 0, Lt (s) ≤ 0, ∀s ∈ [r0,t − r0]. (6.5.3)

Then for all ε > 0 there exists r_ε large enough such that, for all r > r_ε and t > 3r,

P( z^t(s) > L̲_t(s), ∀ r ≤ s ≤ t−r ) / P( z^t(s) > L̄_t(s), ∀ r ≤ s ≤ t−r ) − 1 < ε. (6.5.4)

Proof. In the proof of this lemma Bramson uses the Girsanov formula to transform the avoidance problem for one curve into that for another curve, by just adding the appropriate drift to the BM. In fact, if γ_t is one curve and γ̃_t is another, both starting and ending in 0, then clearly

P( z^t(s) > γ_t(s), ∀ r ≤ s ≤ t−r ) = P( z^t(s) − γ_t(s) + γ̃_t(s) > γ̃_t(s), ∀ r ≤ s ≤ t−r ). (6.5.5)

But z^t − γ_t + γ̃_t is a Brownian bridge under the measure P̃, where P̃ is absolutely continuous w.r.t. the law of BM with Radon–Nikodým derivative

dP̃/dP = exp( ∫_0^t ∂_s β(s) dz^t_s − (1/2) ∫_0^t (∂_s β(s))² ds ), (6.5.6)

where β(s) = γ̃_t(s) − γ_t(s). We would be tempted to use this formula directly with L̲_t and L̄_t, but the singularity of s^δ would produce a diverging contribution. Since the condition on the Brownian bridge involves only the interval (r, t−r), we can modify the drift on the intervals [0,r] and [t−r,t]. Bramson uses

β_{r,t}(s) ≡ { 2C r^{δ−1} s, if 0 ≤ s ≤ r,
            { 2C s^δ, if r ≤ s ≤ t/2,
            { 2C (t−s)^δ, if t/2 ≤ s ≤ t−r,
            { 2C r^{δ−1} (t−s), if t−r ≤ s ≤ t. (6.5.7)

Then the discussion above clearly shows that

P( z^t(s) > L̲_t(s), ∀ r < s < t−r ) = E[ e^{−∫_0^t ∂_s β_{r,t}(s) dz^t_s − (1/2)∫_0^t (∂_s β_{r,t}(s))² ds} 1_{z^t(s) > L̄_t(s), ∀ r < s < t−r} ]. (6.5.8)

Now

∫_0^t (∂_s β_{r,t}(s))² ds = 8C² ( r^{2δ−1} + ((t/2)^{2δ−1} − r^{2δ−1})/(2δ−1) ), (6.5.9)

which tends to zero as r and t tend to infinity. To treat the stochastic integral, we need an upper bound on the maximum of z^t. This is most easily done by sneaking in a condition {z^t(s) < 2Cs^ε}, resp. 2C(t−s)^ε, with ε > 1/2. It is easy to see that this condition is verified by the Brownian bridge with probability tending to one, so it makes no difference to add it. But then we get that

∫_0^t ∂_s β_{r,t}(s) dz^t(s) = − ∫_0^t z^t(s) ∂²_s β_{r,t}(s) ds. (6.5.10)

Using now the bound on z^t, it is elementary to see that this integral is bounded in absolute value by O(r^{δ+ε−1}), which for, e.g., ε = 1/2 + (1/2−δ)/2 tends to zero. ⊓⊔
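Lemma 6.31 can also be illustrated by Monte Carlo: the probabilities of staying above the raised and the lowered deformation of a curve are comparable. The sketch below (a crude grid check with our own naming, not Bramson's Girsanov argument) estimates both probabilities for the deformations (6.5.1)–(6.5.2) of the zero curve.

```python
import numpy as np

def stay_above_prob(curve, t=50.0, r=5.0, n_grid=2000, n_paths=20000, seed=2):
    """Monte Carlo estimate of P(z(s) > curve(s) for all s in [r, t-r])
    for a Brownian bridge z from 0 to 0 in time t."""
    rng = np.random.default_rng(seed)
    s = np.linspace(0.0, t, n_grid + 1)
    incr = rng.normal(0.0, np.sqrt(t / n_grid), size=(n_paths, n_grid))
    B = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(incr, axis=1)], axis=1)
    z = B - (s / t) * B[:, -1][:, None]
    mask = (s >= r) & (s <= t - r)
    return float((z[:, mask] > curve(s)[mask]).all(axis=1).mean())

def deformed(base, C, delta, t, sign):
    """The curves of (6.5.1)/(6.5.2): base(s) -/+ C (min(s, t-s))^delta."""
    def curve(s):
        return base(s) + sign * C * np.minimum(s, t - s) ** delta
    return curve
```

Because the upper curve dominates the lower one pointwise, the two events are nested, and the estimate for the lower curve is at least as large; the lemma asserts that the ratio tends to 1 as r grows.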

There are some further results that relate hitting probabilities for certain other deformations of curves. If L : [0,t] → R, define, for 0 ≤ δ < 1/2, C > 0,

θ_{r,t} ∘ L(s) ≡ { L(s + s^δ) + C s^δ, if r ≤ s ≤ t/2,
              { L(s + (t−s)^δ) + C (t−s)^δ, if t/2 ≤ s ≤ t − 2r,
              { L(s), else. (6.5.11)

Then the following holds.

Proposition 6.32 (Proposition 6.2 in [22]). For all ε > 0 there exists rε large enough such that for all r > rε and all t > 3r,

P( z^t(s) > (θ^{−1}_{r,t} ∘ L_t(s)) ∨ (θ_{r,t} ∘ L(s)) ∨ L(s), ∀ r ≤ s ≤ t−r ) / P( z^t(s) > L_t(s), ∀ r ≤ s ≤ t−r ) − 1 < ε, (6.5.12)

and

P( z^t(s) > L_t(s), ∀ r ≤ s ≤ t−r ) / P( z^t(s) > θ^{−1}_{r,t} ∘ L_t(s), ∀ r ≤ s ≤ t−r ) − 1 < ε. (6.5.13)

Finally, Bramson proves the following monotonicity result:

Proposition 6.33 (Proposition 6.3 in [22]). Let ℓ_i, i = 1,2, be upper semicontinuous at all but finitely many values of 0 ≤ s ≤ t, and let ℓ_1(s) ≤ ℓ_2(s) for all 0 ≤ s ≤ t. Let z_{x,y} denote the Brownian bridge from x to y in time t. Then

P[ z_{x,y}(s) > ℓ_2(s), ∀ 0 ≤ s ≤ t ] / P[ z_{x,y}(s) > ℓ_1(s), ∀ 0 ≤ s ≤ t ] (6.5.14)

is monotone increasing in x and y.

6.6 Asymptotics of solutions of the F-KPP equation

We are now moving closer to proving Theorem 6.2. We still need some estimates on solutions. The first statement is a regularity estimate.

Proposition 6.34 (Proposition 7.1 in [22]). Let u be a solution of the F-KPP equation satisfying the standard conditions. Assume that for some h > 0,

limsup_{t↑∞} t^{−1} ln ∫_t^{t(1+h)} u(0,y) dy ≤ −√2. (6.6.1)

Then

lim_{t↑∞} sup_{0≤x_1≤x_2} ( u(t,x_2) − u(t,x_1) ) = 0. (6.6.2)

If, moreover, for some η > 0, N, M > 0, for x < −M,

∫_x^{x+N} u(0,y) dy ≥ η, (6.6.3)

then (6.6.2) holds also if the condition in the supremum is relaxed to x_1 ≤ x_2.

Proof. Eq. (6.2.73) from Proposition 6.19 implies that, for x ∈ [0, (√2−δ)t], u(t,x) → 1, for any δ > 0. By the Feynman-Kac representation (6.1.2), using Condition (6.6.3), for all x < 0,

u(1,x) ≥ (1/√(2π)) ∫_{−∞}^∞ e^{−(x−y)²/2} u(0,y) dy (6.6.4)
       ≥ (1/√(2π)) ∫_{x−M}^{x+N−M} e^{−max(M², (N−M)²)/2} u(0,y) dy ≥ η′ > 0.

Hence u(t,x) → 1 also for x ∈ (−∞, 0], uniformly. Thus we only need to show that (6.6.2) holds with the supremum restricted to t(√2−δ) ≤ x_1 ≤ x_2. To do so, define, for s ≥ 0 and 0 < δ_1 < 1, u^s as the solution with initial data

u^s(0,x) = { u(0,x), if x ≤ δ_1 s,
           { 0, if x > δ_1 s. (6.6.5)

Now Lemma 6.12 implies that, for δ1t ≤ x1 ≤ x2,

u^t(t, x_2) ≤ u^t(t, x_1). (6.6.6)

If we set v^s(t,x) ≡ u(t,x) − u^s(t,x), then

v^s(0,x) = { 0, if x ≤ δ_1 s,
           { u(0,x), if x > δ_1 s. (6.6.7)

But v^s satisfies an equation

∂_t v^s = (1/2) ∂²_x v^s + k^s(t,x) v^s, (6.6.8)

where k^s is some function of the solutions u, u^s. The important point is, however, that k^s(t,x) ≤ 1. Thus the Feynman-Kac representation yields the bound

v^s(t,x) ≤ (e^t/√(2πt)) ∫_{δ_1 s}^∞ u(0,y) e^{−(x−y)²/2t} dy. (6.6.9)

Using the growth bound (6.6.1) on the initial condition, it follows from Corollary 6.24 that v^t(t,x) → 0, uniformly in x ≥ t(√2 − δ_1²/6). Using this together with (6.6.6) implies the desired convergence and proves the proposition. ⊓⊔

The following propositions prepare for a more sophisticated use of the Feynman-Kac representation (6.1.6). This uses in particular the comparison estimates for the hitting probabilities of different curves by the Brownian bridges. We use the notation

of the previous section and fix the reference curve

L_{r,t}(s) = m_{1/2}(s) − (s/t) m_{1/2}(t) − ((t−s)/t) α(r,t), (6.6.10)

where m_{1/2} is defined in Eq. (6.2.28), and α(r,t) = o(t) will be specified later. Next we introduce the functions

M̄_{r,t}(s) ≡ L̄_{r,t}(s) + (s/t) m_{1/2}(t) + ((t−s)/t) α(r,t) = m_{1/2}(s) + L̄_{r,t}(s) − L_{r,t}(s), (6.6.11)

and

M̲_{r,t}(s) ≡ L̲_{r,t}(s) + (s/t) m_{1/2}(t) + ((t−s)/t) α(r,t) = m_{1/2}(s) + L̲_{r,t}(s) − L_{r,t}(s). (6.6.12)

At this point we can extend the rough estimate on m1/2(t) given in (6.2.31) for Heaviside initial conditions to initial conditions satisfying (6.6.1).

Lemma 6.35. Assume that u(0,x) satisfies (6.6.1). Then (6.2.31) holds, i.e. m_{1/2}(t)/t → √2.

Proof. Recall that, by the maximum principle, u(t,x) ≤ φ(t,x), with φ defined in (6.3.2). On the other hand, it is fairly straightforward to show that, if the initial data satisfy (6.6.1), then m_φ(t), i.e. the analog of m_{1/2}(t) defined for φ, behaves like √2 t. Hence limsup_t m_{1/2}(t)/t ≤ √2. The bound (6.2.73) in Proposition 6.19 implies, on the other hand, that liminf_t m_{1/2}(t)/t ≥ √2. ⊓⊔

The strategy of using the Feynman-Kac formula relies on controlling the term k(y,s) appearing in the exponent in appropriate regions of the plane, and then establishing probabilities that the Brownian bridge passes through those. The next proposition states that k is close to one above the curves M̄.

Proposition 6.36 (Proposition 7.2 in [22]). Let u be a solution to the F-KPP equation satisfying the standard assumptions, with initial data satisfying (6.6.1) for some h > 0. Then there is a constant C_2 such that, for r large enough and y > M̄_{r,t}(s), one has that

k(s,y) ≥ { 1 − C_2 e^{−ρ s^δ}, if r ≤ s ≤ t/2,
         { 1 − C_2 e^{−ρ (t−s)^δ}, if t/2 ≤ s ≤ t−r. (6.6.13)

Thus

e^{3r−t} e^{∫_{2r}^{t−r} k(t−s, x(s)) ds} → 1, as r ↑ ∞, (6.6.14)

if x(s) > M̄_{r,t}(t−s) for s ∈ [2r, t−r].

Proof. We give a proof of this proposition under the slightly stronger hypothesis of Lemma 6.3.1. The strategy is the following:

(i) We prove (6.6.14) for M̄_{t,t} defined with m_{1/2}(s) replaced by any function m(s) such that √2 s ≥ m(s) ≥ √2 s − C ln s, for any C < ∞.
(ii) Use this bound to control u(t,x) for x ≥ √2 t − C ln t.
(iii) Deduce from this that

m_{1/2}(t) ≥ √2 t − (3/(2√2)) ln t + M, (6.6.15)

for M < ∞.

We begin with Step (i) and assume that the m_{1/2} in the definition of M̄ is as stated in item (i). By definition, for s ∈ [r, t/2], after some tedious computations,

M̄_{r,t}(s) ≥ m_{1/2}(s + s^δ) + ( 4 + (α(r,t) − m_{1/2}(t))/t ) s^δ. (6.6.16)

By Lemma 6.35, for large enough r < t, the quantity in the bracket is positive, and hence

M̄_{r,t}(s) ≥ m_{1/2}(s + s^δ). (6.6.17)

We use that u(s,x) ≤ φ(s,x), and that by Lemma 6.25, under the Assumption (6.3.6) with b = √2,

φ(s, m_{1/2}(s) + s^δ + z) ≤ e^{s − (m_{1/2}(s+s^δ))²/(2s) + t_0√2} + K e^{−√2 (m_{1/2}(s+s^δ) − √2 s)}
                          ≤ e^{t_0√2} e^{−√2 s^δ} + K e^{−√2 s^δ}. (6.6.18)

Hence, for y > M̄_{r,t}(s) and s ∈ [r, t/2],

u(s,y) ≤ C_3 e^{−√2 s^δ}. (6.6.19)

The analogous result follows in the same way for s ∈ [t/2, t−r]. Using that F satisfies the standard conditions then implies (6.6.13). The conclusion (6.6.14) is obvious. We now come to Step (ii).

Lemma 6.37. Let u be as in the proposition. Let x = x(t) ≡ √2 t − ∆(t), where ∆(t) ≤ √2 t − C ln t, for some C < ∞. Then

u(x,t) ≥ C t^{−1} e^{√2 ∆(t)} ∫_{y_0}^∞ u(0,y) ( e^{√2 y} e^{−y²/2t}/√(2πt) ) dy. (6.6.20)

Proof. We know from Step (i) that, for our choice of M̄, for z_{x,y}(s) > M̄_{r,t}(s) on [2r, t−r],

exp( ∫_{2r}^{t−r} k(s, z_{x,y}(s)) ds ) ≥ C_3 e^t, (6.6.21)

where C_3 depends on r but not on t. Now choose y_0 small enough such that ∫_{y_0}^∞ u(0,y) dy > 0. Now we use the Feynman-Kac representation in the form (6.1.7) (with r = 0) to get the bound

u(x,t) ≥ ∫_{−∞}^∞ u(0,y) ( e^{−(x−y)²/2t}/√(2πt) ) E[ 1_{z_{x,y}(s) > M̄_{r,t}(t−s), ∀ s ∈ [2r,t−r]} e^{∫_{2r}^{t−r} k(s, z_{x,y}(s)) ds} ] dy
       ≥ C_3 e^t ∫_{y_0}^∞ u(0,y) ( e^{−(x−y)²/2t}/√(2πt) ) P( z_{x,y}(s) > M̄_{r,t}(t−s), ∀ s ∈ [2r,t−r] ) dy. (6.6.22)

The curves M̄, M̲ can be chosen with α(r,t) = y_0, with y_0 fixed (usually negative). One can always choose m(s) such that m(s) − (s/t) m(t) ≤ M < ∞. Then, for large enough r, the probability in the last line can be bounded from below using our comparison results for hitting probabilities of Brownian bridges, which imply that

P( z_{x,y}(s) > M̄_{r,t}(t−s), ∀ s ∈ [r,t−r] ) ≥ C_1 P( z_{x,y}(s) > M̲_{r,t}(t−s), ∀ s ∈ [r,t−r] ) ≥ C_1 P( z(s) > 0, ∀ s ∈ [r,t−r] ), (6.6.23)

where in the last line z ≡ z0,0. Using (5.2.7) and the fact that there is a positive probability that at times r and t − r, the Brownian bridge is larger than, say, 1, one sees that there is a constant, C, depending only on r, such that

P(z(s) > 0,∀s ∈ [r,t − r]) ≥ C/t. (6.6.24) Hence we have that

u(t,x) ≥ C_4 t^{−1} e^t ∫_{y_0}^∞ u(0,y) ( e^{−(x−y)²/2t}/√(2πt) ) dy. (6.6.25)

Inserting x = √2 t − ∆(t) and dropping terms that tend to zero, we arrive at (6.6.20). ⊓⊔
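The estimate (6.6.24), and the factors 1 − e^{−2ȳz̄/t} that appear from here on, rest on the exact formula P(z_{x,y}(s) > 0, ∀ s ∈ [0,t]) = 1 − e^{−2xy/t} for a Brownian bridge from x > 0 to y > 0, an immediate consequence of the reflection principle. This can be checked by simulation; in the sketch below (the helper name is ours) the per-step factor 1 − e^{−2ab/dt} is the conditional probability that the bridge between two grid values a, b > 0 stays positive, which removes the discretization bias.

```python
import numpy as np

def bridge_positive_prob_mc(x, y, t, n_grid=400, n_paths=40000, seed=3):
    """Monte Carlo estimate of P(z_{x,y}(s) > 0 for all s in [0,t]) for a Brownian
    bridge from x to y in time t.  Between consecutive grid values a, b > 0 the
    bridge dips below 0 with conditional probability exp(-2ab/dt) (reflection
    principle), so we multiply the survival factors 1 - exp(-2ab/dt)."""
    rng = np.random.default_rng(seed)
    dt = t / n_grid
    s = np.linspace(0.0, t, n_grid + 1)
    incr = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_grid))
    B = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(incr, axis=1)], axis=1)
    z = x + B + (s / t) * (y - x - B[:, -1][:, None])   # bridge from x to y
    a = np.clip(z[:, :-1], 0.0, None)
    b = np.clip(z[:, 1:], 0.0, None)
    surv = 1.0 - np.exp(-2.0 * a * b / dt)   # factor is 0 once the path touches <= 0
    return float(surv.prod(axis=1).mean())
```

For x = y and t large this is of order 2x²/t, which is the source of the C/t in (6.6.24).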

It is now easy to get the desired lower bound on m_{1/2}.

Proposition 6.38 (Proposition 8.1 in [22]). Under the hypothesis stated above, there exist constants C_0 < ∞ and t_0 < ∞ such that, for all t ≥ t_0,

m_{1/2}(t) ≥ √2 t − (3/(2√2)) ln t + C_0 ≡ n(t). (6.6.26)

Proof. We insert ∆(t) = (D/(2√2)) ln t − z into (6.6.20). Then we see that

u(t, √2 t − ∆(t)) ≥ C t^{−3/2} e^{√2 ∆(t)} ∫_{y_0}^∞ u(0,y) e^{√2 y} e^{−y²/2t} dy/√(2π) (6.6.27)
                 ≥ C t^{−3/2} e^{(D/2) ln t − √2 z} ∫_{y_0}^A u(0,y) e^{√2 y} e^{−y²/2t} dy
                 ≥ C̃(A, y_0) t^{(D−3)/2} e^{−√2 z},

for some A < ∞ chosen such that ∫_{y_0}^A u(0,y) dy > 0, where we just replaced the exponentials in the integral by their minimum on [y_0, A]. Clearly, this can be made

Having shown that the true m_{1/2} satisfies the hypothesis made in Step (i), we see that the assertion (6.6.13) holds for M̄ defined with the true m_{1/2}. This concludes the proof of Proposition 6.36. ⊓⊔

Remark 6.39. Bramson proves this proposition under the weaker hypothesis (6.6.1). His argument goes as follows. First, by definition, u(s + s^δ, m_{1/2}(s + s^δ)) = 1/2. Next set v(t_0, 0) ≡ u(s, m_{1/2}(s + s^δ)). Fix s ≫ 0. Then v(t,x) solves the F-KPP equation and v(s^δ, 0) = u(s + s^δ, m_{1/2}(s + s^δ)). But by (6.2.73) in Proposition 6.19, if v(t_0, 0) = η, then v(t + t_0 − ln η, x) ↑ 1, as t ↑ ∞. But if s^δ + ln η ↑ ∞, then v(s^δ, 0) ↑ 1, as s ↑ ∞, in contradiction with the fact that v(s^δ, 0) = 1/2. Hence we must have that ln η + s^δ < ∞, or η ≤ e^{−s^δ}.

This result already suggests that the contributions in the Feynman-Kac representations should come from Brownian bridges that stay above the curves M̄ and thus enjoy the full effect of the k. To show this, one first shows that bridges staying below the curves M̲ do not contribute much. Define the sets

G_{x,y}(r,t) ≡ { z_{x,y} : ∃ s ∈ [2r, t−r] : z_{x,y}(s) ≤ M̲_{r,t}(t−s) }, (6.6.28)

and

G_{x,y}(r_1; r, t) ≡ { z_{x,y} : ∃ s ∈ [2r ∨ r_1, t−r_1] : z_{x,y}(s) ≤ M̲_{r,t}(t−s) }. (6.6.29)

Proposition 6.40 (Proposition 7.3 in [22]). Let u be a solution that satisfies (6.6.1). Assume that α(r,t) = o(t) and that

m_{1/2}(s) ≤ { (s/t) m_{1/2}(t) + ((t−s)/t) α(r,t) + 8 s^δ, if s ∈ [r, t/2],
            { (s/t) m_{1/2}(t) + ((t−s)/t) α(r,t) + 8 (t−s)^δ, if s ∈ [t/2, t−r]. (6.6.30)

Then for y > α(r,t) and x ≥ m1/2(t), for r large enough,

e^{3r−t} E[ e^{∫_{2r}^{t−r} k(t−s, z_{x,y}(s)) ds} 1_{G_{x,y}(r,t)} ] ≤ r^{−2} P( G^c_{x,y}(r,t) ). (6.6.31)

Proof. The proof of this proposition is quite involved and will only be sketched. We need some localising events for the Brownian bridges. Set I_j ≡ [j, j+1) ∪ (t−j−1, t−j], for j = 0, …, j_0−1, with j_0 < t ≤ j_0+1, and I_{j_0} = [j_0, t−j_0]. Define the exit times

S^1(r,t) ≡ sup{ s : 2r ≤ s ≤ t/2, z_{x,y}(s) ≤ M̲_{r,t}(t−s) }, (6.6.32)
S^2(r,t) ≡ inf{ s : t/2 ≤ s ≤ t−r, z_{x,y}(s) ≤ M̲_{r,t}(t−s) }, (6.6.33)

and

S(r,t) ≡ { S^1(r,t), if S^1(r,t) + S^2(r,t) > t,
         { S^2(r,t), if S^1(r,t) + S^2(r,t) ≤ t. (6.6.34)

Then define

A_j(r,t) ≡ { z_{x,y} : S(r,t) ∈ I_j }, j = 0, …, j_0, (6.6.35)

and further

A^1_j(r,t) ≡ { z_{x,y} ∈ A_j(r,t) : z_{x,y}(s) > −(s ∧ (t−s)) + ys/t + x(t−s)/t, ∀ s ∈ I_j }. (6.6.36)

Finally, A^2_j(r,t) = A_j(r,t) − A^1_j(r,t). First one shows that the probability not to cross below M̲ in the interval [r_1, t−r_1] is controlled by that not to cross in [2r, t−r],

P( G^c_{x,y}(r_1; r, t) ) ≤ C_3 (r_1/r) P( G^c_{x,y}(r,t) ), (6.6.37)

for y ≥ α(r,t) and x ≥ m_{1/2}(t). The proof of this [Lemma 7.1 in [22]] is via the usual arguments for bridges (reflection principle). The next step is more complicated. It consists in showing that, if a bridge realises the event A^1_j(r,t) for large j, i.e. if M̲_{r,t} is crossed near the centre, then this will give a small contribution to u. More precisely,

e^{3r−t} E[ e^{∫_{2r}^{t−r} k(t−s, z_{x,y}(s)) ds} 1_{A^1_j(r,t)} ] ≤ C_4 e^{−j^δ/4} P( A^1_j(r,t) ). (6.6.38)

The point is that, around the place where the curve M̲ is crossed, u(s, z(s)) is roughly u(s, m_{1/2}(s) − s^δ). Again by Proposition 6.19, u will be very close to 1 at this position, and hence k will be very small. Fiddling around this produces a loss by a factor e^{−j^δ/4}. Finally, the event in the left-hand side of (6.6.31) is decomposed into the union of the events A_j(r,t), and using the estimates (6.6.38), one obtains a bound of the form (6.6.31), where the choice of the factor 1/r² is quite arbitrary; in fact, one could get any negative power of r. ⊓⊔

The next proposition is a reformulation of Proposition 6.32.

Proposition 6.41 (Proposition 7.4 in [22]). Let u be as above and assume that (6.6.30) holds as in the previous proposition. For a constant M_1 > 0, assume that x ≥ m_{1/2}(t) + M_1 and y ≥ α(r,t) + M_1. Then,

P( z_{x,y}(s) > M̄_{r,t}(t−s), ∀ s ∈ [r,t−r] ) / P( z_{x,y}(s) > m_{1/2}(t−s), ∀ s ∈ [r,t−r] ) → 1, (6.6.39)

and

P( z_{x,y}(s) > m_{1/2}(t−s), ∀ s ∈ [r,t−r] ) / P( z_{x,y}(s) > M̲_{r,t}(t−s), ∀ s ∈ [r,t−r] ) → 1, (6.6.40)

uniformly in t, as r ↑ ∞.

So far the probabilities concern only the bulk part of the bridges. To control the initial and final phases we need to slightly deform the curves. Define

M̄^x_{r,t}(s) ≡ { M̄_{r,t}(s), if s ∈ [0, t−2r],
              { (x + m_{1/2}(t))/2, if s ∈ [t−2r, t], (6.6.41)

M̄^{x,y}_{r,t}(s) ≡ { M̄_{r,t}(s), if s ∈ [r, t−2r],
                  { y/2, if s ∈ [0, r],
                  { (x + m_{1/2}(t))/2, if s ∈ [t−2r, t], (6.6.42)

and

M̲^0_{r,t}(s) ≡ { M̲_{r,t}(s), if s ∈ [r, t−2r],
              { −∞, else. (6.6.43)

Proposition 6.42 (Proposition 7.5 in [22]). Let u be as above and assume that (6.6.30) holds as in the previous proposition. Then,

P( z_{x,y}(s) > M̄^{x_0}_{r,t}(t−s), ∀ s ∈ [0, t−r] ) / P( z_{x,y}(s) > M̲^0_{r,t}(t−s), ∀ s ∈ [0, t−r] ) → 1, (6.6.44)

uniformly in t ≥ 8r, x ≥ x0 ≥ m1/2(t) + 8r, and y ≥ α(r,t), as r ↑ ∞. Moreover,

P( z_{x,y}(s) > M̄^{x_0,y_0}_{r,t}(t−s), ∀ s ∈ [0, t] ) / P( z_{x,y}(s) > M̲^0_{r,t}(t−s), ∀ s ∈ [0, t] ) → 1, (6.6.45)

uniformly in t ≥ 8r, x ≥ x_0 ≥ m_{1/2}(t) + 8r, and y ≥ y_0 ≥ max(α(r,t), 8r), as r ↑ ∞.

6.7 Convergence results

We have now finished the preparatory steps and come to the main convergence results. In this section we always take u(t,x) to be a solution of F-KPP satisfying the standard conditions, and we assume that (6.6.1) and (6.6.3) hold. We will also assume that the finite mass condition (6.1.8) holds. Moreover, we will mostly assume that there exists y_0 such that

∫_{y_0}^∞ u(0,y) dy > 0. (6.7.1)

Now set

ȳ ≡ y − α(r,t), z ≡ x − m_{1/2}(t), z̄ ≡ x − n(t). (6.7.2)

All the curves introduced above with an additional superscript n will denote these curves when m_{1/2}(s) is replaced by n(s) in their construction. We want an upper bound on the probability that a Brownian bridge lies above M̲.

Lemma 6.43 (Corollary 1 in Chapter 8 of [22]). Assume that α(r,t) = O(ln r). Then, if x ≥ n(t)+1, y ≥ α(r,t)+1, there exists a constant C > 0 such that, for r large enough,

P( z_{x,y}(s) > M̲_{r,t}(t−s), ∀ s ∈ [2r, t−r] ) ≤ C r ( 1 − e^{−2ȳz̄/t} ). (6.7.3)

Proof. Playing around with the different curves, Bramson arrives at the upper bound

C P( z_{z̄,ȳ}(s) > 0, ∀ s ∈ [2r, t−2r] ). (6.7.4)

Now it is plausible that, with positive probability, z_{z̄,ȳ}(2r) ∼ z_{z̄,ȳ}(t−2r) ∼ √r. Then the probability to stay positive is (1 − e^{−r/(t−4r)}), which for large t is about what is claimed. ⊓⊔

We now move to the upper bound on m_{1/2}.

Proposition 6.44 (Proposition 8.2 in [22]). Under the general assumptions of this section, for all x ≥ m_{1/2}(t) + 1 and large enough t,

u(t,x) ≤ C_2 e^t ∫_{y_0}^∞ u(0,y) ( e^{−(x−y)²/2t}/√(2πt) ) ( 1 − e^{−2ȳz̄/t} ) dy, (6.7.5)

where C_2 depends on the initial data but not on t. Hence

u(t,x) ≤ C_3 z e^{−√2 z}, (6.7.6)

and

m_{1/2}(t) ≤ √2 t − (3/(2√2)) ln t + C_4, (6.7.7)

for some constant C4.

Proof. Again we assume F concave. We may set y_0 = 0, and by the maximum principle we may assume u(0,y) = 1 for y ≤ 0. The main step in the proof of (6.7.5) is to use the Feynman-Kac representation and to introduce a one in the form 1_{G_{x,y}(r,t)} + 1_{G^c_{x,y}(r,t)}. Then we use Proposition 6.40 and Lemma 6.43. This yields that

E[ e^{∫_0^t k(t−s, z_{x,y}(s)) ds} ] ≤ e^t P( G^c_{x,y}(r,t) ) + e^t r^{−2} P( G^c_{x,y}(r,t) ) ≤ C_5 e^t ( 1 − e^{−2ȳz̄/t} ), (6.7.8)

for some constant C_5, for y ≥ 1, x ≥ n(t)+1. Using monotonicity arguments, one derives with some fiddling the upper bound

u(t,x) ≤ C_6 e^t ∫_0^∞ u(0,y) ( e^{−(x−y)²/2t}/√(2πt) ) ( 1 − e^{−2ȳz̄/t} ) dy, (6.7.9)

for x ≥ m_{1/2}(t)+1. To conclude, we must show that z̄ can be replaced by z, for which we want to show that n(t) and m_{1/2}(t) differ by at most a constant. Set z_2 = x − √2 t. Then the right-hand side of (6.7.9) equals

C_6 e^{−√2 z_2} ∫_0^∞ u(0,y) e^{√2 y} ( e^{−(z_2−y)²/2t}/√(2πt) ) ( 1 − e^{−2ȳz̄/t} ) dy (6.7.10)
  ≤ C_6 t^{−3/2} z̄ e^{−√2 z_2} √(2/π) ∫_0^∞ y e^{√2 y} u(0,y) dy
  ≤ C_7 t^{−3/2} z̄ e^{−√2 z_2} ≤ C_8 z̄ e^{−√2 z̄}. (6.7.11)

Since n(t) ≤ m_{1/2}(t), this implies (6.7.6). But then

u(t, m_{1/2}(t)) ≤ C_8 ( m_{1/2}(t) − n(t) ) e^{−√2 ( m_{1/2}(t) − n(t) )}. (6.7.12)

Since the left-hand side is bounded from below uniformly in t, this implies that m_{1/2}(t) − n(t) is bounded uniformly in t, hence (6.7.7). ⊓⊔

Proposition 6.44 gives the following upper bound on the distribution of the maximum of BBM that will be used in the next chapter.

Lemma 6.45 (Corollary 10 in [4]). There exist numerical constants ρ, t_0 < ∞, such that, for all x > 1 and all t ≥ t_0,

P( max_{k≤n(t)} x_k(t) − m_{1/2}(t) ≥ x ) ≤ ρ x exp( −√2 x − x²/(2t) + (3x/(2√2)) (ln t)/t ). (6.7.13)

Proof. For Heaviside initial conditions and using y0 = −1, (6.7.5) gives

u(t, x + m_{1/2}(t)) ≤ C_2 e^t ∫_{−1}^0 ( e^{−(x+m_{1/2}(t)−y)²/2t}/√(2πt) ) ( 1 − e^{−2(y+1)x/t} ) dy. (6.7.14)

Using that 1 − e^{−u} ≤ u for u ≥ 0, and writing out what m_{1/2}(t) is, the result follows. ⊓⊔
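The tail bound of Lemma 6.45 concerns the law of the maximum of BBM. A direct simulation of binary, rate-one branching Brownian motion — a rough illustration for small t with our own function names, not a verification of the precise constants — looks as follows:

```python
import math
import numpy as np

def bbm_max_sample(t, rng):
    """One sample of the maximum at time t of branching Brownian motion
    (binary branching at rate 1, standard Brownian particles)."""
    tau = rng.exponential(1.0)               # first branching time
    if tau >= t:
        return rng.normal(0.0, math.sqrt(t))
    x = rng.normal(0.0, math.sqrt(tau))      # displacement up to the branch point
    return x + max(bbm_max_sample(t - tau, rng), bbm_max_sample(t - tau, rng))

def bbm_tail_probs(t=4.0, xs=(0.0, 1.0, 2.0), n_samples=1000, seed=4):
    """Estimate P(max_k x_k(t) - m(t) >= x) for each x in xs, with
    m(t) = sqrt(2) t - (3 / (2 sqrt(2))) ln t."""
    rng = np.random.default_rng(seed)
    m = math.sqrt(2.0) * t - 3.0 / (2.0 * math.sqrt(2.0)) * math.log(t)
    samples = np.array([bbm_max_sample(t, rng) for _ in range(n_samples)])
    return [float((samples - m >= x).mean()) for x in xs]
```

For small t the finite-size corrections are large, so only the qualitative decay in x, not the constants of (6.7.13), should be expected to show up.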

Note that this bound establishes the tail behaviour for the probability of the max- imum, Eq. (5.6.9), that was used in Section 5.6. We are now closing in on the final step in the proof. Before we do that, we need one more a priori bound, which is the converse to (6.7.5).

Corollary 6.46 (Corollary 1 in [22]). Under the assumptions as before, for x ≥ m1/2(t) and all t,

u(t,x) ≥ C_5 e^t ∫_{α(r,t)}^∞ u(0,y) ( e^{−(x−y)²/2t}/√(2πt) ) ( 1 − e^{−2ȳz̄/t} ) dy, (6.7.15)

if α(r,t) = O(ln r). C_5 does not depend on x or t.

Proof. The main point is that we can bound the probability that the Brownian bridge stays above M̲_{r,t} on [2r, t−r] (up to constants) by the probability that z_{z̄,ȳ} stays positive on [r, t−r], which in turn is bounded by a constant times (1 − exp(−2ȳz̄/t)). The rest of the proof is then obvious. ⊓⊔

By now we control our functions up to multiplicative constants. The reason for the errors is essentially that we do not control the integrals in the exponent in the Feynman-Kac formula well outside of the interval [2r, t−r]. In the next step we remedy this. Now the representation (6.1.7) comes into play. This avoids having to control what happens up to time r. We now come to the final step, the proof of Theorem 6.2.

Proof (of Theorem 6.2). As advertised before, everything rests on the use of the Feynman-Kac representation in the form (6.1.7). Inserting a condition for the bridge to stay above a curve M̄^x_{r,t}, we get the lower bound

u(t,x) ≥ ∫_{−∞}^∞ u(r,y) ( e^{−(x−y)²/(2(t−r))}/√(2π(t−r)) ) (6.7.16)
         × E[ exp( ∫_0^{t−r} k(t−s, z^{t−r}_{x,y}(s)) ds ) 1_{z^{t−r}_{x,y}(s) > M̄^x_{r,t}(t−s), ∀ s ∈ [0,t−r]} ] dy.

We want to show that, conditioned on the path staying above this curve, the nonlinear term in the integral in the exponent can be replaced by 1. This is provided by Proposition 6.42 for the part of the integral between 2r and t−r, since on this segment M̄^x_{r,t}(s) = M̄_{r,t}(s). Now we are interested in x > m_{1/2}(t). For 0 ≤ s ≤ 2r, M̄^x_{r,t}(t−s) = (x + m_{1/2}(t))/2. Thus, for x = m_{1/2}(t) + z with z large enough, by Proposition 6.44, for x(s) > M̄^x_{r,t}(t−s),

u(t, x(s)) ≤ C_3 z e^{−√2 z}. (6.7.17)

But x(s) > M̄^x_{r,t}(t−s) = (x + m_{1/2}(t))/2 = m_{1/2}(t) + z/2, so

u(t−s, x(s)) ≤ C_7 z e^{−z/√2}. (6.7.18)

Hence,

k(t−s, x(s)) ≥ 1 − C_8 e^{−ρ z/√2}, (6.7.19)

which tends to 1 nicely, as z ↑ ∞, and so

e^{−2r} exp( ∫_0^{2r} k(t−s, x(s)) ds ) ≥ exp( −2r C_8 e^{−ρ z/√2} ). (6.7.20)

If x − m_{1/2}(t) ≥ r, this tends to one uniformly in t > r, as r ↑ ∞. Therefore we have the lower bound

u(t,x) ≥ C_1(r) e^{t−r} ∫_{−∞}^∞ u(r,y) ( e^{−(x−y)²/(2(t−r))}/√(2π(t−r)) ) (6.7.21)
         × P( z^{t−r}_{x,y}(s) > M̄^x_{r,t}(t−s), ∀ s ∈ [0, t−r] ) dy,

for x ≥ m_{1/2}(t) + r, and C_1(r) ↑ 1 as r ↑ ∞, uniformly in t. This is (6.1.11).

To prove a matching upper bound, one first wants to show that, in the Feynman-Kac representation (6.1.2), one may drop the integral from −∞ to −ln r. Indeed, by a simple change of variables, using that the initial conditions are in [0,1] and that F(u)/u is decreasing in u, we get that

∫_{−∞}^{−ln r} ( e^{−(x−y)²/(2t)}/√(2πt) ) u(0,y) E[ exp( ∫_0^t k(t−s, z_{x,y}(s)) ds ) ] dy (6.7.22)
  ≤ ∫_{−∞}^{−ln r} ( e^{−(x+ln r−y)²/(2t)}/√(2πt) ) E[ exp( ∫_0^t k(t−s, z_{x,y−ln r}(s)) ds ) ] dy
  ≤ ∫_{−∞}^{−ln r} ( e^{−(x+ln r−y)²/(2t)}/√(2πt) ) E[ exp( ∫_0^t k(t−s, z_{x+ln r,y}(s)) ds ) ] dy ≤ u^H(t, x + ln r).

Our rough upper (Eq. (6.7.9)) and lower (Eq. (6.7.15)) bounds suffice to show that, for x ≥ m_{1/2}(t), uniformly in t > 3r,

u^H(t, x + ln r)/u(t,x) ↓ 0, as r ↑ ∞. (6.7.23)

This shows that

u(t,x) ≤ C_2(r) ∫_{−ln r}^∞ u(0,y) ( e^{−(x−y)²/2t}/√(2πt) ) E[ e^{∫_0^t k(t−s, z_{x,y}(s)) ds} ] dy, (6.7.24)

where C2(r) → 1, as r ↑ ∞, if x ≥ m1/2(t), uniformly in t > 3r. Now we introduce into the Feynman-Kac representation a one in the form

1 = 1_{z^{t−r}_{x,y}(s) > M̲^0_{r,t}(t−s), ∀ s ∈ [0,t]} + 1_{∃ s ∈ [0,t] : z^{t−r}_{x,y}(s) ≤ M̲^0_{r,t}(t−s)}. (6.7.25)

Note that, by the definition of M̲^0, the second indicator function is equal to the indicator function of the event G_{x,y}(r,t). We want to use Proposition 6.40 to show that the contribution coming from the second indicator function is negligible. To be able to do this, we had to make sure that we can drop the contributions from y < −ln r. Now we readily get

∫_{−∞}^{−ln r} ( e^{−(x−y)²/(2t)}/√(2πt) ) u(0,y) E[ exp( ∫_0^t k(t−s, z_{x,y}(s)) ds ) 1_{G_{x,y}(r,t)} ] dy

  ≤ e^{3r} ∫_{−∞}^{−ln r} ( e^{−(x−y)²/(2t)}/√(2πt) ) u(0,y) E[ exp( ∫_{2r}^{t−r} k(t−s, z_{x,y}(s)) ds ) 1_{G_{x,y}(r,t)} ] dy

  ≤ r^{−2} e^t ∫_{−∞}^{−ln r} ( e^{−(x−y)²/(2t)}/√(2πt) ) u(0,y) P( G^c_{x,y}(r,t) ) dy. (6.7.26)

Now Lemma 6.43 provides a bound for the probability in the last line. Inserting this, and observing that we have the lower bound (6.7.15), we see that this is at most a constant times u(t,x)/r, and hence negligible as r ↑ ∞. We have arrived at the bound

u(t,x) ≤ C_2(r) ∫_{−∞}^∞ u(0,y) ( e^{−(x−y)²/2t}/√(2πt) ) (6.7.27)
         × E[ exp( ∫_0^t k(t−s, z_{x,y}(s)) ds ) 1_{z^{t−r}_{x,y}(s) ≥ M̲^0_{r,t}(t−s), ∀ s ∈ [0,t]} ] dy.

Since, by definition, M̲^0_{r,t} imposes no condition on the interval [0,2r], we move the initial condition forward to time r, using that

u(r,y) = ∫_{−∞}^∞ u(0,y′) ( e^{−(y−y′)²/(2r)}/√(2πr) ) E[ exp( ∫_0^r k(r−s, z_{y,y′}(s)) ds ) ] dy′, (6.7.28)

and the Chapman-Kolmogorov equation for the heat kernel, to get from (6.7.27) the bound, for x > m_{1/2}(t),

u(t,x) ≤ C_3(r) e^{t−r} ∫_{−∞}^∞ u(r,y) ( e^{−(x−y)²/(2(t−r))}/√(2π(t−r)) ) (6.7.29)
         × P( z^{t−r}_{x,y}(s) > M̲^0_{r,t}(t−s), ∀ s ∈ [0, t−r] ) dy,

with C_3(r) tending to one, as r ↑ ∞. This is (6.1.12).

We have now proven the bounds (6.1.11) and (6.1.12). We still need to prove (6.1.15). Here we need to show that the event for the Brownian bridge to stay above M̲^0 is just as unlikely as that to stay above M̄^x. These issues were studied in Section 6.5. Let y_0 ≤ 0 be such that ∫_{y_0}^∞ u(0,y) dy > 0. The bounds we have on m_{1/2}(s) readily imply that (6.6.30) holds with α(r,t) = −ln(r) when r is large enough. Thus we can use Proposition 6.40 to get that

P( z^t_{x,y_0}(s) > M̄^x_{r,t}(t−s), ∀ s ∈ [0, t−r] ) / P( z^t_{x,y_0}(s) > M̲^0_{r,t}(t−s), ∀ s ∈ [0, t−r] ) = 1 − ε(r), (6.7.30)

for t ≥ 8r, x − m_{1/2}(t) ≥ 8r, and r large enough, with ε(r) ↓ 0 as r ↑ ∞.

We want to push this to a bound that works for all y. This requires two more pieces of input. The first is a consequence of Proposition 6.44 and Corollary 6.46.

Corollary 6.47 (Corollary 2 in [22]). Let x_2 ≥ x_1 ≥ 0 and let y_0 be such that (6.7.1) holds. Then, for t large enough,

u(t,x_2)/u(t,x_1) ≥ C_6 e^{−(x_2−y_0)²/2t} / e^{−(x_1−y_0)²/2t}. (6.7.31)

Here C_6 > 0 does not depend on t or x_i.

Proof. Just combine the upper and lower bounds we have by now and use monotonicity properties of the exponentials that appear there. ⊓⊔

Next we need the following consequence of (a trivial version of) the FKG inequality.

Lemma 6.48. Let h,g : R → R be increasing measurable functions. Let X be a real valued random variable. Then

E[h(X)g(X)] ≥ E[h(X)]E[g(X)]. (6.7.32)

Proof. Let µ denote the law of X. Then

E[h(X)g(X)] − E[h(X)]E[g(X)] = (1/2) ∫ µ(dx) ∫ µ(dy) ( h(x) − h(y) )( g(x) − g(y) ). (6.7.33)

Since both h and g are increasing, the two factors in the product have the same sign, and hence the right-hand side of (6.7.33) is non-negative. This implies the assertion of the lemma. ⊓⊔
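For a single real random variable, the inequality of Lemma 6.48 can be checked mechanically; note that for an empirical sample the identity (6.7.33), applied to the empirical measure, makes the inequality hold exactly, not just up to Monte Carlo error. A small sketch (names ours):

```python
import numpy as np

def fkg_gap(h, g, samples):
    """E[h(X)g(X)] - E[h(X)]E[g(X)] for the empirical law of `samples`.
    By Lemma 6.48 this is >= 0 whenever h and g are both increasing."""
    hx, gx = h(samples), g(samples)
    return float((hx * gx).mean() - hx.mean() * gx.mean())
```

With one increasing and one decreasing function the sign flips, which gives a quick sanity check of the argument in the proof.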

Lemma 6.49 (Lemma 8.1 in [22]). Let h : R → [0,1] be increasing and let g : R → R_+ be such that, for x_1 ≤ x_2, it holds that g(x_1) ≤ C g(x_2), for some C > 0. Then, if X is a real valued random variable for which E[h(X)] ≥ 1 − ε, then

E[g(X)h(X)] ≥ (1 −Cε)E[g(X)]. (6.7.34)

Proof. Set g1(x) ≡ sup{g(y) : y ≤ x}. Then g1 is increasing and positive, so the FKG inequality applies and states that

E[g1(X)h(X)] ≥ E[g1(X)]E[h(X)] ≥ (1 − ε)E[g1(X)]. (6.7.35)

By assumption, for any x, g1(x) ≤ Cg(x). Hence

E[g(X)(1 − h(X))] ≤ E[g_1(X)(1 − h(X))] ≤ E[g_1(X)] − (1 − ε)E[g_1(X)] = εE[g_1(X)]. (6.7.36)

Hence

E[g(X)h(X)] ≥ E[g(X) − εg_1(X)] ≥ (1 − Cε)E[g(X)]. (6.7.37)

⊓⊔

We use Lemma 6.49 in the following construction. Let us denote by b(y; x, y_0, t, r) the probability density that the Brownian bridge z^t_{x,y_0} passes through y at time t−r. This can be expressed in terms of the heat kernel p(t,x) as

b(y; x, y_0, t, r) = p(r, y−y_0) p(t−r, x−y) / p(t, x−y_0). (6.7.38)

Noting that the condition on the bridge in the denominator of (6.7.30) has bearing only on the time interval [0, t−r], we see that, using the Markov property,

P( z^t_{x,y_0}(s) > M̄^x_{r,t}(t−s), ∀ s ∈ [0, t−r] ) (6.7.39)
  = ∫ dy b(y; x, y_0, t, r) P( z^{t−r}_{x,y}(s) > M̄^x_{r,t}(t−s), ∀ s ∈ [0, t−r] ).

We now prepare to use the FKG inequality in the form of Lemma 6.49. For this we represent the left-hand side of (6.7.30) as

P( z^t_{x,y_0}(s) > M̄^x_{r,t}(t−s), ∀ s ∈ [0,t−r] ) / P( z^t_{x,y_0}(s) > M̲^0_{r,t}(t−s), ∀ s ∈ [0,t−r] ) (6.7.40)
  = ∫ [ P( z^{t−r}_{x,y}(s) > M̄^x_{r,t}(t−s), ∀ s ∈ [0,t−r] ) / P( z^{t−r}_{x,y}(s) > M̲^0_{r,t}(t−s), ∀ s ∈ [0,t−r] ) ]
      × [ P( z^{t−r}_{x,y}(s) > M̲^0_{r,t}(t−s), ∀ s ∈ [0,t−r] ) b(y; x, y_0, t, r) / P( z^t_{x,y_0}(s) > M̲^0_{r,t}(t−s), ∀ s ∈ [0,t−r] ) ] dy
  ≡ ∫ h^x_{r,t}(y) µ^x_{r,t}(dy).

Notice that µ^x_{r,t} is a probability measure and h^x_{r,t} is increasing. Hence the hypotheses of Lemma 6.49 are satisfied for the expectation of the function h^x_{r,t} with respect to the probability measure µ^x_{r,t}. It remains to choose a function g such that the expectations in the left- and right-hand sides of (6.7.34) turn into the integrals appearing in (6.1.11) and (6.1.12). For this we choose

g_r(y) ≡ ( u(r,y)/p(r, y−y_0) ) 1_{y≥0}. (6.7.41)

Corollary 6.47 ensures that g_r satisfies the hypothesis on g needed in Lemma 6.49 with a constant C = 1/C_6. With these choices,

∫ g_r(y) h^x_{r,t}(y) µ^x_{r,t}(dy) (6.7.42)

R y  t−r x  0 u(r,y)p(t − r,x − y)P zx,y (s) > M r,t (t − s),∀s ∈ [0,t − r] dy = p(t,x − y ) zt (s) > 0 (t − s),∀s ∈ [0,t − r] 0 P x,y0 M r,t R y t−r 0  u(r,y)p(t − r,x − y)P zx,y (s) > M r,t (t − s),∀s ∈ [0,t − r] ≥ (1 −Cε) 0 p(t,x − y ) zt (s) > 0 (t − s),∀s ∈ [0,t − r] 0 P x,y0 M r,t This yields

2 − (x−y) Z ∞ 2(t−r) e  t−r x  u(r,y)p P zx,y (s) > M r,t (t − s),s ∈ [0,t − r] dy (6.7.43) −∞ 2π(t − r) 2 − (x−y) Z ∞ 2(t−r) e t−r 0  ≥ (1 −Cε(r)) u(r,y)p P zx,y (s) > M r,t (t − s),s ∈ [0,t − r] dy. 0 2π(t − r)

Here we used that the integrals from −∞ to zero are non-negative. Finally, one shows that for the values of x in question, these integrals are also negligible compared to those from 0 to ∞, and hence the integral on the right-hand side can also be replaced by that over the entire real line. This yields the upper bound in (6.1.15). Since the lower bound is trivial, we have proven (6.1.15) and this concludes the proof of Theorem 6.2. ut

We can now prove the special case of Bramson's convergence theorem, Theorem 5.9, that is of interest to us.

Theorem 6.50. Let u be a solution of the F-KPP equation with F satisfying the standard conditions. Assume that the finite mass condition (6.1.8) holds, and that for some η > 0 and some M, N > 0,

Z x+N u(0,y)dy > η, for x ≤ −M. (6.7.44) x

Then

u(t, x + m(t)) → w_{√2}(x), as t ↑ ∞, (6.7.45)

uniformly in x ∈ R, where

m(t) = √2 t − (3/(2√2)) ln t. (6.7.46)

Proof. The proof uses Proposition 6.17. To show that the hypotheses of the proposition are satisfied, we show that, for x large enough,

C x e^{−√2 x} γ_1^{−1}(x) − γ_2(t) ≤ u(t, x + m(t)) ≤ C x e^{−√2 x} γ_1(x) + γ_2(t), (6.7.47)

where γ_1(x) → 1, as x ↑ ∞, and γ_2(t) → 0, as t ↑ ∞. But C x e^{−√2 x} is precisely the tail asymptotics of w_{√2}, so that w_{√2} can be chosen (remember that we are free to shift the solution to adjust the constant) such that the hypotheses of Proposition 6.17 are satisfied, and the conclusion of the theorem follows. Eq. (6.7.47) follows from Theorem 6.2 via the following corollary, whose proof we postpone to the next chapter, where various further tail estimates will be proven.

Corollary 6.51. Let u and m(t) be as in the theorem. Then

lim_{x↑∞} lim_{t↑∞} ( e^{x√2} / x ) u(t, x + m(t)) = C, (6.7.48)

for some 0 < C < ∞.

From here we get the desired control on the tail of u and hence convergence to the travelling wave solution. ut
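As a crude numerical illustration of the centring (6.7.46) (my addition, not from the text), one can simulate BBM directly, with the standard normalisation used here (binary splitting at rate 1, standard Brownian particles), and compare the empirical location of the maximum with m(t). The agreement at accessible times t is only rough, since the O(1) random shift and finite-t corrections are substantial.

```python
import numpy as np

rng = np.random.default_rng(0)

def bbm_max(t, dt=0.01):
    """Crude Euler scheme for BBM: standard Brownian particles that each
    split into two at rate 1 (probability ~ dt per time step)."""
    x = np.zeros(1)
    for _ in range(int(t / dt)):
        x = x + rng.normal(0.0, np.sqrt(dt), size=x.size)  # diffusion step
        split = rng.random(x.size) < dt                    # branching events
        x = np.concatenate([x, x[split]])                  # duplicate splitters
    return x.max()

t = 6.0
m = np.sqrt(2) * t - 3.0 / (2.0 * np.sqrt(2.0)) * np.log(t)  # m(t) from (6.7.46)
maxima = np.array([bbm_max(t) for _ in range(150)])
print(f"m(t) = {m:.2f}, empirical median of max = {np.median(maxima):.2f}")
```

The median of the maximum should land within O(1) of m(t); the discrepancy does not shrink to zero, reflecting the non-degenerate limit law of max_k x_k(t) − m(t).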

Chapter 7 The extremal process of BBM

In this chapter we discuss the construction of the extremal process of branching Brownian motion following the paper [6]. This construction has two parts. The first is the proof of convergence of Laplace functionals of the point process of BBM as t ↑ ∞. This will turn out to be doable using estimates on the asymptotics of solutions of the F-KPP equation that are the basis of Bramson's work. The ensuing representation for Laplace functionals is rather indirect. In the second part, one gives an explicit description of the point process that has this Laplace functional as a cluster point process.

7.1 Limit theorems for solutions

The bounds in (6.1.10) have been used by Chauvin and Rouault to compute the probability of deviations of the maximum of BBM, see Lemma 2 in [28]. We will use their arguments in this slightly more general setting. Our aim is to control limits when t and later r tend to infinity.

We will need to analyse solutions u(t, x + √2 t) where x = x(t) depends on t in different ways. First we look at x fixed. This corresponds to particles that are just (3/(2√2)) ln t ahead of the natural front of BBM. We start with the asymptotics of ψ.

Proposition 7.1. Let the assumptions of Theorem 6.2 be satisfied, and assume in addition that y_0 ≡ sup{y : u(0,y) > 0} is finite. Then the following hold for any z ∈ R:

(i)

lim_{t↑∞} e^{z√2} ( t^{3/2} / ((3/(2√2)) ln t) ) ψ(r, t, z + √2 t) = √(2/π) ∫_0^∞ y e^{y√2} u(r, y + √2 r) dy ≡ C(r). (7.1.1)

(ii) lim_{r↑∞} C(r) ≡ C exists and C ∈ (0, ∞).

(iii)

lim_{t↑∞} e^{z√2} ( t^{3/2} / ((3/(2√2)) ln t) ) u(t, z + √2 t) = C. (7.1.2)

Proof. Consider first the limit t ↑ ∞. For fixed z, the integrand converges pointwise to √(2/π) y e^{y√2} u(r, y + √2 r), as can be seen by elementary calculations. In fact, the integrand in (7.1.1), after multiplication with the prefactor, is given by

( t^{3/2} / ((3/(2√2)) ln t √(2π(t−r))) ) u(r, y + √2 r) e^{√2 y} e^{−(y−z)²/2(t−r)} ( 1 − e^{−2y(z + (3/(2√2)) ln t)/(t−r)} ) (7.1.3)
∼ ( t^{3/2} 2(z + (3/(2√2)) ln t) / ((3/(2√2)) ln t √(2π(t−r)³)) ) u(r, y + √2 r) y e^{√2 y} ∼ √(2/π) u(r, y + √2 r) y e^{√2 y}.

To deduce convergence of the integral, we need to show that we can use dominated convergence. We see from the first line in (7.1.3) that the integrand in (6.1.9), after multiplication with the prefactor in (7.1.1), is bounded from above by

const. y e^{√2 y} u(r, y + √2 r). (7.1.4)

Thus we need an a priori bound on u. By the Feynman-Kac representation, the fact that u(0,y) = 0 for y > y_0 implies that

u(r,x) ≤ e^r ∫_{−∞}^{y_0} ( 1/√(2πr) ) e^{−(y−x)²/2r} dy ≤ C_r for x ≤ y_0, and ≤ C_r e^{−(y_0−x)²/2r} for x ≥ y_0. (7.1.5)

Setting x = y + √2 r and inserting this into (7.1.4), we can bound the integrand in (6.1.9) by (ignoring constants)

y e^{√2 y} 1_{y ≤ y_0 − √2 r} + y e^{√2 y} e^{−(y + √2 r − y_0)²/2r} 1_{y ≥ y_0 − √2 r} (7.1.6)
≤ C(r, y_0) 1_{y ≤ y_0 − √2 r} + C(r, y_0) y e^{−y²/2r + y y_0/r} 1_{y ≥ y_0 − √2 r},

which is integrable in y uniformly in t. Hence Lebesgue's dominated convergence theorem applies and the first part of the proposition follows. The more interesting task is to show now that C(r) converges to a non-trivial limit, as r ↑ ∞. By Theorem 6.2, for r large enough,

limsup_{t↑∞} e^{z√2} ( t^{3/2} / ((3/(2√2)) ln t) ) u(t, z + √2 t) ≤ γ(r) lim_{t↑∞} e^{z√2} ( t^{3/2} / ((3/(2√2)) ln t) ) ψ(r, t, z + √2 t) = C(r) γ(r), (7.1.7)

and

liminf_{t↑∞} e^{z√2} ( t^{3/2} / ((3/(2√2)) ln t) ) u(t, z + √2 t) ≥ γ(r)^{−1} lim_{t↑∞} e^{z√2} ( t^{3/2} / ((3/(2√2)) ln t) ) ψ(r, t, z + √2 t) = C(r) γ(r)^{−1}. (7.1.8)

These bounds hold for all r, and since γ(r) → 1, we can conclude that the left-hand sides are bounded from above by liminf_{r↑∞} C(r) and from below by limsup_{r↑∞} C(r). So limsup ≤ liminf, and hence lim_{r↑∞} C(r) = C exists. As a byproduct, we see that also the left-hand sides of (7.1.7) and (7.1.8) must agree, and thus we obtain (7.1.2). A very important aspect is that the bounds (7.1.7) and (7.1.8) hold for any r. But for large enough finite r, the constants C(r) are strictly positive and finite by the representation (7.1.1), and this shows that C is strictly positive and finite. ut

We will make a slight stop on the way and show how the sharp asymptotics for the upper tail of solutions follows from what we just did.

Corollary 7.2. Let u be as in the preceding proposition. Then

lim_{z↑∞} lim_{t↑∞} ( e^{z√2} / z ) u(t, z + m(t)) = C, (7.1.9)

where 0 < C < ∞ is the same constant as in Proposition 7.1.

Proof. Note that

ψ(r, t, z + m(t)) = ψ(r, t, √2 t + z − (3/(2√2)) ln t). (7.1.10)

Then, by the same arguments as in the proof of Proposition 7.1, we get that

lim_{t↑∞} ( e^{z√2} / z ) ψ(r, t, z + m(t)) = √(2/π) ∫_0^∞ y e^{y√2} u(r, y + √2 r) dy = C(r). (7.1.11)

Here C(r) is the same constant as before. Now we can choose z > 8r in order to be able to apply Theorem 6.2 and then let r ↑ ∞. This yields (7.1.9). ut

The next lemma is a variant of the previous proposition for the case when x ∼ √t.

Lemma 7.3. Let u be a solution to the F-KPP equation (5.4.2) with initial condition satisfying the assumptions of Theorem 5.9 and

y0 ≡ sup{y : u(0,y) > 0} < ∞ . (7.1.12)

Then, for a > 0 and y ∈ R,

lim_{t→∞} ( e^{√2 a√t} t^{3/2} / (a√t) ) ψ(r, t, y + a√t + √2 t) = C(r) e^{−√2 y} e^{−a²/2}, (7.1.13)

where C(r) is the constant from above. Moreover, the convergence is uniform for a in compact sets.

Proof. The structure of the proof is the same as in the proposition. Compute the pointwise limit and produce an integrable majorant from a bound using the linearised F-KPP equation. ut

It will be very important to know that, as r ↑ ∞, in the integral representing C(r) only the y's that are of order √r give a non-vanishing contribution. The precise version of this statement is the following lemma.

Lemma 7.4. Let u be a solution to the F-KPP equation (5.4.2) with initial condition satisfying the assumptions of Theorem 5.9 and

y0 ≡ sup{y : u(0,y) > 0} < ∞ . (7.1.14)

Then for any z ∈ R,

lim_{A_1↓0} limsup_{r↑∞} ∫_0^{A_1√r} u(r, z + y + √2 r) y e^{y√2} dy = 0, (7.1.15)
lim_{A_2↑∞} limsup_{r↑∞} ∫_{A_2√r}^∞ u(r, z + y + √2 r) y e^{y√2} dy = 0.

Proof. Again it is clear that we need some a priori information on the behaviour of u. This time this will be obtained by comparing to a solution of the F-KPP equation with Heaviside initial conditions, which we know are probabilities for the maximum of BBM. Namely, by assumption, u(0,y) ≤ 1_{y < y_0 + 1}, and hence

u(t, x) ≤ P( max_{k≤n(t)} x_k(t) > x − y_0 − 1 ). (7.1.16)

For the probabilities we have a sharp estimate given in Lemma 6.45 in the previous chapter. We use this lemma with z ∼ √t, so the z²/t term is relevant. The last term involving ln t/t can, however, be safely ignored. The lemma follows in a rather straightforward and purely computational way once we insert this bound in the integrals. The details can be found in [6]. Note that the precise bound (6.7.13) is needed and the trivial bound comparing to iid variables would not suffice. ut

The following lemma shows how C = lim_{r↑∞} √(2/π) ∫_0^∞ u(r, y + √2 r) y e^{y√2} dy behaves when the spatial argument of u is shifted.

Lemma 7.5. Let u be a solution to the F-KPP equation (5.4.2) with initial condition satisfying the assumptions of Theorem 5.9 and

y0 ≡ sup{y : u(0,y) > 0} < ∞ . (7.1.17)

Then for any z ∈ R,

lim_{r↑∞} √(2/π) ∫_0^∞ y e^{y√2} u(r, z + y + √2 r) dy = e^{−√2 z} lim_{r↑∞} √(2/π) ∫_0^∞ y e^{y√2} u(r, y + √2 r) dy = C e^{−√2 z}. (7.1.18)

Proof. The proof is obvious from the fact proven in Lemma 7.4. Namely, changing variables,

∫_0^∞ y e^{y√2} u(r, z + y + √2 r) dy = e^{−√2 z} ∫_z^∞ (y − z) e^{y√2} u(r, y + √2 r) dy. (7.1.19)

Now the part of the integral between z and zero cannot contribute, as it does not involve y of the order of √r, and for the same reason replacing y − z by y changes nothing in the limit. ut

The following is a slight generalisation of Lemma 7.5.

Lemma 7.6. Let h(x) be a continuous function that is bounded and vanishes for x small enough. Then

lim_{t↑∞} √(2/π) ∫_{−∞}^0 E[ h( y + max_i x_i(t) − √2 t ) ] (−y) e^{−√2 y} dy = ∫_{−∞}^∞ h(z) √2 C e^{−√2 z} dz, (7.1.20)

where C is the constant appearing in the law of the maximum, Theorem 5.16.

Proof. Consider first the case when h(x) = 1_{[b,∞)}(x), for b ∈ R. Then

lim_{t↑∞} √(2/π) ∫_{−∞}^0 E[ h( y + max_i x_i(t) − √2 t ) ] (−y) e^{−√2 y} dy (7.1.21)
= lim_{t↑∞} ∫_{−∞}^0 P( y + max_i x_i(t) − √2 t > b ) √(2/π) (−y) e^{−√2 y} dy
= lim_{t↑∞} ∫_0^∞ P( max_i x_i(t) > b + y + √2 t ) √(2/π) y e^{√2 y} dy
= lim_{t↑∞} ∫_0^∞ u(t, b + y + √2 t) √(2/π) y e^{√2 y} dy,

where u solves the F-KPP equation (5.4.2) with initial condition u(0,x) = 1_{x<0}. Hence the right-hand side of (7.1.21) converges, by Lemma 7.5, to

C e^{−√2 b} = C ∫_b^∞ √2 e^{−√2 z} dz = C ∫_{−∞}^∞ h(z) √2 e^{−√2 z} dz. (7.1.22)

To pass to the general case (7.1.20), note that by linearity, (7.1.22) holds for linear combinations of indicator functions of intervals. The general case then follows from a monotone class type argument. ut

7.2 Existence of a limiting process

We now look at the construction of the limit of the point processes

E_t ≡ ∑_{k≤n(t)} δ_{x_k(t) − m(t)}. (7.2.1)

7.2.1 Convergence of the Laplace functional

To prove the existence of the limiting extremal process we show tightness and the convergence of a sufficiently rich class of Laplace functionals

ψ_t(φ) ≡ E[ exp( −∫ φ(y) E_t(dy) ) ]. (7.2.2)

From Lemma 5.5 one has that the Laplace functional is related to a solution u of the F-KPP equation with initial condition u(0,x) = 1 − e^{−φ(−x)} via

1 − ψt (φ) = u(t,m(t)). (7.2.3)

As we want to use Bramson’s main theorem, Theorem 5.9, we use functions φ such that the hypotheses (i) and (ii) in that theorem are satisfied. A convenient choice is the class of functions of the form

φ(x) = ∑_{ℓ=1}^k c_ℓ 1_{x > u_ℓ}, (7.2.4)

with c_ℓ > 0 and u_ℓ ∈ R. Set g(x) ≡ 1 − e^{−φ(−x)}. Clearly g(x) vanishes for x > −min_ℓ(u_ℓ), while for x < −max_ℓ(u_ℓ), we have that g(x) ≥ 1 − e^{−∑_{ℓ=1}^k c_ℓ} > 0. This implies that g(x) satisfies both hypotheses for the initial conditions of Theorem 5.9. On the other hand, if u(t,x) solves the F-KPP equation with initial condition u(0,x) = g(x),

1 − u(t,x) = E[ ∏_{i=1}^{n(t)} e^{−φ(x_i(t) − x)} ] (7.2.5)

is the Laplace functional for our choice of φ. Thus, if we can show that u(t, x + m(t)) converges to a non-trivial limit, we obtain the following theorem.

Theorem 7.7. The point process Et = ∑k≤n(t) δxk(t)−m(t) converges in law to a point process E .

Proof. First, the sequence of processes E_t is tight by Corollary 2.21, since, for any bounded interval B ⊂ R,

lim_{N↑∞} lim_{t↑∞} P( E_t(B) ≥ N ) = 0, (7.2.6)

so the limiting point process must be locally finite. (Basically, if (7.2.6) failed, then the maximum of BBM would have to be much larger than it is known to be.) We now show (7.2.6). Clearly it is enough to show this for B = [y,∞). Assume that (7.2.6) does not hold. Then there exist ε > 0 and a sequence t_N ↑ ∞ such that, for all N < ∞,

P(EtN ([y,∞)) ≥ N) ≥ ε. (7.2.7)

On the other hand, we know that for any δ > 0, we can find a_δ ∈ R such that

lim_{t↑∞} P( max_{k≤n(t)} x_k(t) ≤ m(t) + a_δ ) ≥ 1 − δ. (7.2.8)

Now choose δ = ε/2. It follows that, for N large enough,

P( { E_{t_N}(B) ≥ N } ∩ { max_{k≤n(t_N+1)} x_k(t_N + 1) ≤ m(t_N + 1) + a_{ε/2} } ) ≥ ε/2. (7.2.9)

But this probability is bounded from above by the probability that the offspring of the N particles at time tN that must be above m(tN) + y remain below m(tN + 1) + aε/2, which is smaller than

∏_{j=1}^N P( x_j(t_N) + max_{k≤n_j(1)} x_k^{(j)}(1) ≤ m(t_N + 1) + a_{ε/2} | x_j(t_N) ≥ m(t_N) + y )
≤ P( max_{k≤n(1)} x_k(1) ≤ m(t_N + 1) − m(t_N) + a_{ε/2} − y )^N (7.2.10)
≤ P( max_{k≤n(1)} x_k(1) ≤ √2 + a_{ε/2} − y )^N.

But the probability in the last line is certainly smaller than 1, and hence, choosing N large enough, the expression in the last line is strictly smaller than ε/2, which leads to a contradiction. This proves (7.2.6).

For any φ of the form (7.2.4) we now know from Theorem 5.9 that u(t, x + m(t)) converges, as t ↑ ∞, to a travelling wave w_{√2}. Hence

lim_{t↑∞} ψ_t(φ) ≡ ψ(φ) (7.2.11)

exists, and is strictly smaller than one. But

ψ_t(φ) = E[ exp( −∫ φ dE_t ) ] = E[ ∏_{k≤n(t)} exp( −φ( x_k(t) − m(t) ) ) ] = E[ ∏_{k≤n(t)} ( 1 − g( m(t) − x_k(t) ) ) ] = 1 − u(t, m(t)),

and therefore lim_{t↑∞} ψ_t(φ) ≡ ψ(φ) exists, which proves (7.2.11). ut

Remark 7.8. The particular choice of the functions φ for which we prove convergence of the Laplace functionals is made since they satisfy the hypotheses of Theorem 5.9. It implies, of course, convergence for all φ ∈ C_c^+(R) (and more). Such φ do not satisfy the assumption (ii) in Bramson's theorem, so convergence of the corresponding solutions cannot be uniform in x, but that is not important. In [6] convergence was proven directly for φ ∈ C_c^+(R) using a truncation procedure. Essentially, it makes no difference whether the support of φ is bounded from above or not, since the maximum of the point process E_t is finite almost surely, uniformly in t.

7.2.2 Flash back to the derivative martingale.

Having established the tail asymptotics for solutions of the F-KPP equation, one might be tempted to compute the law of the maximum (or Laplace functionals) by recursion at an intermediate time r, e.g.

E[ exp( −∫ φ(−x + y) E_t(dy) ) ] = E[ E[ exp( −∫ φ(−x + y) E_t(dy) ) | F_r ] ]
= E[ E[ ∏_{i=1}^{n(r)} exp( −∫ φ( −x + x_i(r) − m(t) + m(t−r) + y ) E^{(i)}_{t−r}(dy) ) | F_r ] ]
≈ E[ E[ ∏_{i=1}^{n(r)} ( 1 − v( t − r, x − x_i(r) + √2 r + m(t−r) ) ) | F_r ] ], (7.2.12)

where the E^{(i)}_s are iid copies of E_s and

v( s, x − x_i(r) + √2 r + m(s) ) ≡ 1 − exp( −∫ φ( −x + x_i(r) − √2 r + y ) E^{(i)}_s(dy) ). (7.2.13)

We have used that m(t) − m(t − r) = √2 r + O(r/t). We want to use the fact (see Theorem 5.2) that the particles that will show up in the support of φ at time t must come from particles that at time r lie in an interval (√2 r − c_1 √r, √2 r − c_2 √r), where c_1 > c_2 > 0 are large, if r is large enough. Hence we may assume that the conditional expectations that appear in (7.2.12) can be replaced by their tail asymptotics, i.e.

v( t − r, x − x_i(r) + √2 r + m(t−r) ) ∼ C ( x − x_i(r) + √2 r ) e^{−√2 ( x − x_i(r) + √2 r )}. (7.2.14)

Hence we would get,

lim_{t↑∞} E[ exp( −∫ φ(x + y) E_t(dy) ) ]
= lim_{r↑∞} E[ exp( −C ∑_{i=1}^{n(r)} ( x − x_i(r) + √2 r ) e^{−√2 ( x − x_i(r) + √2 r )} ) ]
= lim_{r↑∞} E[ exp( −C Z_r e^{−√2 x} − C Y_r x e^{−√2 x} ) ], (7.2.15)

with Z_t and Y_t the martingales from Section 2.4. The constant C will of course depend in general on the initial data, i.e. on φ.

Now one could argue like Lalley and Sellke that Y_r must converge to a non-negative limit, and that √2 r − x_i(r) ↑ ∞ for each i, which implies that Y_r → 0, a.s., and hence

lim_{t↑∞} E[ exp( −∫ φ(x + y) E_t(dy) ) ] = lim_{r↑∞} E[ e^{−C Z_r e^{−√2 x}} ]. (7.2.16)

The argument above is not quite complete. In the next subsection we will show that the final result is nonetheless true.

7.2.3 A representation for the Laplace functional

In Subsection 7.2 we have shown convergence of the Laplace functional. In this section we will always assume that φ is a function of the form (7.2.4). The following proposition exhibits the general form of the Laplace functional of the limiting process.

Proposition 7.9. Let E_t be the process (7.2.1). For φ as given in (7.2.4) and any x ∈ R,

lim_{t↑∞} E[ exp( −∫ φ(y + x) E_t(dy) ) ] = E[ exp( −C(φ) Z e^{−√2 x} ) ], (7.2.17)

where, for u(t,y) the solution of the F-KPP equation with initial condition u(0,y) = 1 − e^{−φ(−y)},

C(φ) = lim_{t↑∞} √(2/π) ∫_0^∞ u(t, y + √2 t) y e^{√2 y} dy (7.2.18)

is a strictly positive constant depending on φ only, and Z is the derivative martingale.

Proof. The initial condition u(0,y) satisfies the hypotheses that were required in the previous results on sharp approximation. The existence of the constant C(φ) then follows from Proposition 7.1. On the other hand, it follows from Theorem 5.9 and the Lalley-Sellke representation that

lim_{t↑∞} u(t, x + m(t)) = 1 − E[ exp( −C Z e^{−√2 x} ) ], (7.2.19)

for some constant C (recall that the relevant solution of (5.5.2) is unique up to shifts (see Lemma 5.8), which translates into uniqueness of the representation (7.2.17) up to the choice of the constant C). Clearly, from the asymptotics of solutions, we know that

lim_{x↑∞} ( 1 − E[ exp( −C Z e^{−√2 x} ) ] ) / ( x e^{−√2 x} ) = C. (7.2.20)

Thus all that is left is to identify this constant with C(φ). But from Corollary 7.2 it follows that C must be C(φ). This is essentially done by rerunning Proposition 7.1 with √2 t replaced by the more precise m(t). This proves (7.2.17). ut

7.3 Interpretation as cluster point process

In the preceding section we have given a full construction of the Laplace functional of the limiting extremal process of BBM and have given a representation of it in Proposition 7.9. Note that all the information on the limiting process is contained in the way the constant C(φ) depends on the function φ. The characterisation of this dependence via a solution of the F-KPP equation with initial condition given in terms of φ does not appear very revealing at first sight. In the following we will remedy this by giving explicit probabilistic descriptions of the underlying point process.

7.3.1 Interpretation via an auxiliary process

We will construct an auxiliary class of point processes that a priori have nothing to do with the real process of BBM. Let (η_i; i ∈ N) denote the atoms of a Poisson point process on (−∞, 0) with intensity measure

√(2/π) (−x) e^{−√2 x} dx. (7.3.1)

For each i ∈ N, consider independent BBMs {x_k^{(i)}(t), k ≤ n^{(i)}(t)}. Note that, for each i ∈ N,

max_{k≤n^{(i)}(t)} x_k^{(i)}(t) − √2 t → −∞, a.s. (7.3.2)

This follows, e.g., from the fact that the martingale Y(t) defined in (5.6.7) converges to zero, a.s. The auxiliary point process of interest is constructed from these ingredients as

Π_t ≡ ∑_i ∑_{k=1}^{n^{(i)}(t)} δ_{ (1/√2) ln Z + η_i + x_k^{(i)}(t) − √2 t }, (7.3.3)

where Z has the same law as the limit of the derivative martingale. The existence and non-triviality of the limit of this process as t ↑ ∞ is not obvious, especially in view of (7.3.2). We will show that not only does it exist, but that it has the same law as the limit of the extremal process of BBM.

Theorem 7.10. Let Et be the extremal process (7.2.1) of BBM. Then

lim_{t↑∞} E_t = lim_{t↑∞} Π_t, in law. (7.3.4)

Proof. The proof of this result just requires the computation of the Laplace transform, which we are already quite skilled in. The Laplace functional of Π_t, using the form of the Laplace functional of a Poisson process, reads

E[ exp( −∫ φ(x) Π_t(dx) ) ] = E[ exp( −∑_i ∑_{k=1}^{n^{(i)}(t)} φ( η_i + (1/√2) ln Z + x_k^{(i)}(t) − √2 t ) ) ] = E[ exp( −∑_i Θ_t^{(i)}(η_i) ) ], (7.3.5)

where we set

Θ_t^{(i)}(x) ≡ ∑_{k=1}^{n^{(i)}(t)} φ( x + (1/√2) ln Z + x_k^{(i)}(t) − √2 t ). (7.3.6)

Now we want to compute the expectation. Using the independence of the Cox process and the processes x^{(i)}(t), and using the explicit form of the Laplace functional of Poisson processes, we see that (7.3.5) is equal to

E[ exp( ∫_{−∞}^0 E[ e^{−Θ_t^{(1)}(x)} − 1 | σ(Z) ] √(2/π) (−x) e^{−√2 x} dx ) ]. (7.3.7)

Note that the conditional expectation in the exponent is just the expectation with respect to one BBM, x^{(1)}. The outer expectation is just the average with respect to the derivative martingale Z. Now the exponent is explicitly given as

∫_{−∞}^0 E[ ∏_{k=1}^{n(t)} e^{−φ( x + (1/√2) ln Z + x_k(t) − √2 t )} − 1 ] √(2/π) (−x) e^{−√2 x} dx (7.3.8)
= −∫_{−∞}^0 u( t, −x + √2 t − (1/√2) ln Z ) √(2/π) (−x) e^{−√2 x} dx
= −∫_0^∞ u( t, x + √2 t − (1/√2) ln Z ) √(2/π) x e^{√2 x} dx,

where u is a solution of the F-KPP equation with initial condition 1 − e^{−φ(−x)}. Hence, by (7.1.18),

lim_{t↑∞} √(2/π) ∫_0^∞ u( t, x + √2 t − (1/√2) ln Z ) x e^{√2 x} dx = Z lim_{t↑∞} √(2/π) ∫_0^∞ u( t, x + √2 t ) x e^{√2 x} dx = Z C(φ), (7.3.9)

where the last equality follows by Proposition 7.9. Thus we arrive finally at

lim_{t↑∞} E[ exp( −∫ φ(x) Π_t(dx) ) ] = E[ e^{−Z C(φ)} ].

This implies that the Laplace functionals of lim_{t↑∞} Π_t and of the extremal process of BBM are equal, which proves the theorem. ut

7.3.2 The Poisson process of cluster extremes

We will now give another interpretation of the extremal process. In the paper [4], the following result was shown: consider the ordered enumeration of the particles of BBM at time t, x1(t) ≥ x2(t) ≥ ··· ≥ xn(t)(t). (7.3.10) Fix r > 0 and construct the equivalence classes of these particles such that each class consists of particles that have a common ancestor at time s > t −r. For each of these classes, we can select as a representative its largest member. This can be constructed recursively as follows:

i1 = 1,

i_k = min{ j > i_{k−1} : q(i_ℓ, j) < t − r, ∀ℓ ≤ k − 1 }. (7.3.11)

This is repeated to exhaustion and provides the desired decomposition. We denote the resulting number of classes by n^*(t). We can think of the x_{i_k}(t) as the heads of families at time t. We then construct the point process

Θ_t^r ≡ ∑_{k=1}^{n^*(t)} δ_{ x_{i_k}(t) − m(t) }. (7.3.12)

In [4] the following result was proven.

Theorem 7.11. Let Θ_t^r be defined above. Then

lim_{r↑∞} lim_{t↑∞} Θ_t^r = Θ, (7.3.13)

where convergence is in law and Θ is the random Poisson process (Cox process) with intensity measure C Z √2 e^{−√2 x} dx, Z being the derivative martingale, and C the constant from Theorem 5.16.

Remark 7.12. The thinning out of the extremal process removes enough correlation to Poissonise the process. This is rather common in the theory of extremal processes, see e.g. [9, 10].
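For concreteness, the recursive selection (7.3.11) of family heads can be sketched as follows (my own illustration, not from [4]). Here `q[i][j]` denotes the time of the most recent common ancestor of particles i and j, particles are indexed in decreasing order of position, and two particles belong to the same class iff their common ancestor is more recent than t − r; the first particle reaching a new class is automatically its largest member.

```python
def cluster_heads(q, t, r):
    """Greedy thinning of (7.3.11), 0-based indices: return the indices
    of the largest particle of each cluster, where i and j share a
    cluster iff q[i][j] >= t - r."""
    heads = [0]                                   # i_1 = 1 in the notes
    for j in range(1, len(q)):
        if all(q[i][j] < t - r for i in heads):   # j starts a new family
            heads.append(j)
    return heads

# toy example: particles 0 and 1 branched recently (common ancestor at
# time 8 > t - r = 7), so they form one cluster represented by particle 0
t, r = 10.0, 3.0
q = [[10, 8, 1, 1],
     [8, 10, 1, 1],
     [1, 1, 10, 2],
     [1, 1, 2, 10]]
print(cluster_heads(q, t, r))  # -> [0, 2, 3]
```

The greedy scan works precisely because the particles are ordered by position: whenever an index j is not absorbed into an existing class, every particle of its class with a larger position would already have been scanned, so j is the class maximum.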

I will not prove this theorem here. Rather, I will show how this result links up to the representation of the extremal process of BBM given above.

Proposition 7.13. Let

Π_t^{ext} ≡ ∑_i δ_{ (1/√2) ln Z + η_i + M^{(i)}(t) − √2 t }, (7.3.14)

where M^{(i)}(t) ≡ max_k x_k^{(i)}(t), i.e. the point process obtained by retaining from Π_t the maximal particles of each of the BBMs. Then

lim_{t↑∞} Π_t^{ext} = P_Z = PPP( Z √2 C e^{−√2 x} dx ), in law, (7.3.15)

as a point process on R, where C is the same constant appearing in the law of the maximum. In particular, the maximum of lim_{t↑∞} Π_t^{ext} has the same law as the limit law of the maximum of BBM.

Proof. Consider the Laplace functional of our thinned auxiliary process,

E[ exp( −∑_i φ( (1/√2) ln Z + η_i + M^{(i)}(t) − √2 t ) ) ]. (7.3.16)

Since the M^{(i)}'s are i.i.d., and denoting by F_η the σ-algebra generated by the Poisson point process of the η's, we get that

E[ exp( −∑_i φ( (1/√2) ln Z + η_i + M^{(i)}(t) − √2 t ) ) ] = E[ exp( −∑_i g_t(η_i) ) ], (7.3.17)

where

g_t(z) ≡ φ( z + (1/√2) ln Z + M(t) − √2 t ). (7.3.18)

Proceeding as in the proof of Theorem 7.10,

E[ exp( −∑_i g_t(η_i) ) ] (7.3.19)
= E[ exp( −∫_{−∞}^0 E_{M_t}[ 1 − e^{−φ( y + (1/√2) ln Z + M(t) − √2 t )} ] √(2/π) (−y) e^{−√2 y} dy ) ],

where E_{M_t} denotes expectation with respect to the maximum of BBM at time t. Using Lemma 7.6, conditionally on Z and with h(x) = 1 − e^{−φ( x + (1/√2) ln Z )}, we see that

lim_{t↑∞} ∫_{−∞}^0 E_{M_t}[ 1 − e^{−φ( y + (1/√2) ln Z + M(t) − √2 t )} ] √(2/π) (−y) e^{−√2 y} dy (7.3.20)
= Z ∫_{−∞}^∞ ( 1 − e^{−φ(a)} ) C √2 e^{−√2 a} da,

so that the limiting Laplace functional is that of the Cox process PPP( Z C √2 e^{−√2 a} da ) on R. The assertion of Proposition 7.13 follows. ut

So this is nice. The process of the maxima of the BBM’s in the auxiliary process is the same Poisson process as the limiting process of the heads of families. This gives a clear interpretation of the BBM’s in the auxiliary process.
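Conditionally on Z, the Poisson process of cluster extremes with intensity C Z √2 e^{−√2 x} dx is also easy to sample, because its image under the map x ↦ C Z e^{−√2 x} is a unit-rate Poisson process on (0, ∞). The following sketch is my own illustration; the constant C is left as a free parameter, since only its existence is established in the text.

```python
import numpy as np

rng = np.random.default_rng(1)

def cluster_extremes(C, Z, n_atoms):
    """Largest n_atoms atoms of PPP(C Z sqrt(2) exp(-sqrt(2) x) dx) on R.
    The intensity of (x, inf) is C*Z*exp(-sqrt(2)*x), so mapping
    x -> C*Z*exp(-sqrt(2)*x) yields a unit-rate Poisson process on
    (0, inf), whose ordered points are cumulative Exp(1) sums."""
    s = np.cumsum(rng.exponential(size=n_atoms))
    return (np.log(C * Z) - np.log(s)) / np.sqrt(2)

pts = cluster_extremes(C=1.0, Z=1.0, n_atoms=200)
# the largest atom has P(max <= x) = exp(-C Z e^{-sqrt(2) x}), the
# Gumbel-type law of the maximum (conditionally on Z) in Theorem 5.16
print(pts[:3])
```

As a check of the construction, the number of atoms above a level a, averaged over many draws, reproduces the Poisson mean C Z e^{−√2 a}.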

7.3.3 More on the law of the clusters

Let us continue to look at the auxiliary process. We know that it is very hard for any of the M^{(i)}(t) to exceed √2 t and thus to end up above a level z as t ↑ ∞. Therefore, many of the atoms η_i of the Poisson process will not have relevant offspring. The following proposition states that there is a narrow window deep below zero from which all relevant particles come. This is analogous to the observation in Lemma 7.4.

Proposition 7.14. For any z ∈ R and ε > 0 there exist 0 < A_1 < A_2 < ∞ and t_0, depending only on z and ε, such that

sup_{t≥t_0} P( ∃ i,k : η_i + x_k^{(i)}(t) − √2 t ≥ z and η_i ∉ ( −A_2 √t, −A_1 √t ) ) < ε. (7.3.21)

Proof. Throughout the proof, the probabilities are considered conditional on Z. Clearly we have

P( ∃ i,k : η_i + x_k^{(i)}(t) − √2 t ≥ z, but η_i ≥ −A_1 √t ) (7.3.22)
≤ E[ ∑_i 1_{η_i ∈ [−A_1 √t, 0]} P( M^{(i)}(t) − √2 t + η_i ≥ z | F_η ) ]
= ∫_{−A_1 √t}^0 P( M(t) − √2 t + y ≥ z ) √(2/π) (−y) e^{−√2 y} dy
= ∫_0^{A_1 √t} P( M(t) − m(t) ≥ z + y + (3/(2√2)) ln t ) √(2/π) y e^{√2 y} dy.

Inserting the bound on the probability from Lemma 6.45, on the domain of integration we get

P( M(t) − m(t) ≥ z + y + (3/(2√2)) ln t ) ≤ ρ' ( (3/(2√2)) ln t + z + y ) e^{−√2 ( (3/(2√2)) ln t + z + y )} e^{−y²/2t}. (7.3.23)

Inserting this into (7.3.22), the latter is bounded by

ρ'' t^{−3/2} ∫_0^{A_1 √t} y ( (3/(2√2)) ln t + z + y ) e^{−y²/2t} dy ≤ ρ''' A_1³. (7.3.24)

Similarly, we have that

P( ∃ i,k : η_i + x_k^{(i)}(t) − √2 t ≥ z, but η_i ≤ −A_2 √t ) (7.3.25)
≤ ∫_{A_2 √t}^∞ P( M(t) ≥ z + y + √2 t ) √(2/π) y e^{√2 y} dy
≤ ρ̃ t^{−3/2} ∫_{A_2 √t}^∞ y² e^{−y²/2t} dy
= ρ̃ ∫_{A_2}^∞ y² e^{−y²/2} dy,

which manifestly tends to zero as A_2 ↑ ∞. ut

We see that there is a close analogy to Lemma 7.4. How should we interpret the auxiliary process? Think of a very large time t, and go back to time t − r. At that time, there is a certain distribution of particles which all lie well below √2 (t − r), by an order of √r. From those, a small fraction will have offspring that reach the excessive size √2 r + √r and thus contribute to the extremes. The selected particles form the Poisson process, while their offspring, which are now BBMs conditioned to be extra large, form the clusters.

7.3.4 The extremal process seen from the cluster extremes

We finally come to yet another description of the extremal process. Here we start with the Poisson process of Proposition 7.13 and look at the law of the clusters "that made it up to there". Now we know that the points in the auxiliary process all came from a √t window below √2 t. Thus we may expect the clusters to look like BBM conditioned to be bigger than √2 t. Therefore we define the process

Ē_t ≡ ∑_{k≤n(t)} δ_{ x_k(t) − √2 t }. (7.3.26)

Obviously, the limit of such a process must be trivial, since the probability that the maximum of BBM shifted by −√2 t does not drift to −∞ is vanishing. However, conditionally on the event { max_k x_k(t) − √2 t ≥ 0 }, the process Ē_t does converge to a well-defined point process Ē = ∑_j δ_{ξ_j} as t ↑ ∞. We may then define the point process of the gaps

D_t ≡ ∑_k δ_{ x_k(t) − max_{j≤n(t)} x_j(t) }, (7.3.27)

and

D = ∑_j δ_{Δ_j}, Δ_j ≡ ξ_j − max_i ξ_i, (7.3.28)

where the ξ_i are the atoms of the limiting process Ē. Note that D is a point process on (−∞, 0] with an atom at 0.

Theorem 7.15. Let P_Z be as in (7.3.15), with atoms (p_i; i ∈ N), and let {D^{(i)}, i ∈ N} be a family of independent copies of the gap process (7.3.28). Then the point process E_t = ∑_{k≤n(t)} δ_{x_k(t)−m(t)} converges in law as t ↑ ∞ to a Poisson cluster point process E given by

E ≡ lim_{t↑∞} E_t = ∑_{i,j} δ_{ p_i + Δ_j^{(i)} }, in law. (7.3.29)

This theorem looks quite reasonable. The only part that may be surprising is that all the Δ_j^{(i)} have the same law, since a priori we only know that they come from a √t neighbourhood below √2 t. This is the content of the next theorem.

Theorem 7.16. Let x ≡ −a√t + b for some a > 0, b ∈ R. The point process

∑_{k≤n(t)} δ_{ x + x_k(t) − √2 t } (7.3.30)

converges in law under P( · | { x + max_k x_k(t) − √2 t > 0 } ), as t ↑ ∞, to a well-defined point process Ē. The limit does not depend on a and b, and the maximum of Ē shifted by x has the law of an exponential random variable of parameter √2.

Proof. Set max Ē_t ≡ max_i x_i(t) − √2 t. We first show that, for z ≥ 0,

lim_{t↑∞} P( x + max Ē_t > z | x + max Ē_t > 0 ) = e^{−√2 z}. (7.3.31)

This follows from the fact that the conditional probability is just

P( x + max Ē_t > z ) / P( x + max Ē_t > 0 ), (7.3.32)

and the numerator and denominator can be well approximated by the functions ψ(r, t, z − x + √2 t) and ψ(r, t, −x + √2 t), respectively. Using Lemma 7.3 for these, we get the assertion (7.3.31).

Second, we show that, for any function φ that is continuous and has support bounded from below, the limit of

E[ exp( −∫ φ(x + z) Ē_t(dz) ) | x + max Ē_t > 0 ] (7.3.33)

exists and is independent of x. The proof is just a tick more complicated than before, but relies again on the properties of the functions ψ. I will skip the details. ut

We can now prove Theorem 7.15.

Proof (Proof of Theorem 7.15). The Laplace functional of the process in (7.3.29) is clearly given by

E[ exp( −C Z ∫_{−∞}^∞ E[ 1 − e^{−∫ φ(y+z) D(dz)} ] √2 e^{−√2 y} dy ) ] (7.3.34)

for the point process D defined in (7.3.28). We show that the limiting Laplace functional of the auxiliary process can be written in the same form. Then Theorem 7.15 follows from Theorem ??. The Laplace functional of the auxiliary process is given by

lim_{t↑∞} Ψ_t(φ) = lim_{t↑∞} E[ exp( −∑_{i,k} φ( η_i + (1/√2) ln Z + x_k^{(i)}(t) − √2 t ) ) ]. (7.3.35)

Using the form for the Laplace transform of a Poisson process we have for the right side in construction 132 7 The extremal process of BBM " !# (i) √ lim exp − φ(η + √1 lnZ + x (t) − 2t) (7.3.36) E ∑ i 2 k t↑∞ i,k " Z 0   Z r √ !# 2 − 2y = E exp −Z lim E 1 − exp − φ(x + y)E t (dx) (−y)e dy . t↑∞ −∞ π

Define ≡ . (7.3.37) Dt ∑ δxi(t)−max j≤n(t) x j(t) i≤n(t) The integral on the right-hand side equals

Z 0  Z r √ 2 − 2y lim E f φ(z + y + maxE t )Dt (dz) (−y)e dy (7.3.38) t↑∞ −∞ π

−x with f (x) ≡ 1 − e . By Proposition 7.14, there exist A1 and A2 such that

Z 0  Z r √ 2 − 2y E f φ(z + y + maxE t )Dt (dz) (−y)e dy (7.3.39) −∞ π √ Z −A t  Z r √ 1 2 − 2y = √ E f φ(z + y + maxE t )Dt (dz) (−y)e dy, −A2 t π +Ωt (A1,A2) (7.3.40)

where the error term satisfies lim_{A_1↓0, A_2↑∞} sup_{t≥t_0} Ω_t(A_1, A_2) = 0. Let m_φ be the minimum of the support of φ. Note that

f( ∫ φ(z + y + max E_t) D_t(dz) )   (7.3.41)

is zero when y + max E_t < m_φ, and that the event {y + max E_t = m_φ} has probability zero. Therefore,

E[ f( ∫ φ(z + y + max E_t) D_t(dz) ) ]   (7.3.42)
= E[ f( ∫ φ(z + y + max E_t) D_t(dz) ) 1_{y + max E_t > m_φ} ]
= E[ f( ∫ φ(z + y + max E_t) D_t(dz) ) | y + max E_t > m_φ ] P( y + max E_t > m_φ ).

One can show (see Corollary 4.12 in [6]) that the conditional law of the pair (D_t, y + max E_t) given {y + max E_t > m_φ} converges, as t ↑ ∞, to a pair of independent random variables, where the limit of y + max E_t is exponential with parameter √2 (see (7.3.31)). Moreover, the convergence is uniform in y ∈ [−A_2 √t, −A_1 √t]. This implies the convergence of the random variable ∫ φ(z + y + max E_t) D_t(dz).

Therefore, writing the expectation over the exponential r.v. explicitly, we obtain

lim_{t↑∞} E[ f( ∫ φ(z + y + max E_t) D_t(dz) ) | y + max E_t > m_φ ]
= e^{√2 m_φ} ∫_{m_φ}^{∞} E[ f( ∫ φ(z + s) D(dz) ) ] √2 e^{−√2 s} ds.   (7.3.43)

On the other hand,

∫_{−A_2 √t}^{−A_1 √t} √(2/π) (−y) e^{−√2 y} P( y + max E_t > m_φ ) dy = C e^{−√2 m_φ} + Ω_t(A_1, A_2)   (7.3.44)

by Lemma 7.6, where lim_{A_1↓0, A_2↑∞} limsup_{t↑∞} Ω_t(A_1, A_2) = 0, using the same approximation as in (7.3.39). Combining (7.3.44), (7.3.43) and (7.3.42), one sees that (7.3.36) converges to

E[ exp( −CZ ∫_{−∞}^{∞} E[ 1 − e^{−∫ φ(y+z) D(dz)} ] √2 e^{−√2 y} dy ) ],   (7.3.45)

which is by (7.3.35) the limiting Laplace transform of the extremal process of branching Brownian motion: this shows (7.3.34) and concludes the proof of Theorem 7.15. □

The properties of BBM conditioned to exceed their natural threshold were already described in detail by Chauvin and Rouault [28]. There will be one branch (the spine) that exceeds the level √2 t by an exponential random variable of parameter √2 (see the preceding proposition). The spine is very close to a straight line of slope √2. From this spine, ordinary BBMs branch off at Poissonian times. Clearly, all the branches that split off at times later than r before the end-time will reach at most the level

√2 (t − r) + √2 r − (3/(2√2)) ln r = √2 t − (3/(2√2)) ln r.   (7.3.46)

Seen from the top, i.e. from √2 t, this tends to −∞ as r ↑ ∞. Thus only branches that are created "a finite time" before the end-time t remain visible in the extremal process. This corresponds, of course, perfectly to the observation in Theorem 7.11 that implies that all the points visible in the extremal process have a common ancestor at a finite time before the end. As a matter of fact, from the observation above one can prove Theorem 7.11 easily.

Chapter 8 Full extremal process

In this chapter we present an extension of the convergence of the extremal process of BBM. This material is taken from Bovier and Hartung [16].

8.1 Introduction

We have seen in Chapter 2 that in extreme value theory (see e.g. [57]) it is customary to give a description of extremal processes that also encodes the locations of the extreme points ("complete Poisson convergence"). We want to obtain an analogous result for BBM. Now our "space" is the Galton-Watson tree, or more precisely its "leaves" at infinity. This requires a construction of this object. In the case of deterministic binary branching at integer times, the leaves of the tree at time n are naturally labelled by sequences σ ≡ (σ_1 σ_2 ... σ_n), with σ_ℓ ∈ {0,1}. These sequences can be naturally mapped into [0,1] via

σ ↦ ∑_{ℓ=1}^{n} σ_ℓ 2^{−ℓ} ∈ [0,1].   (8.1.1)

Moreover, the limit, as n ↑ ∞, of the image of this map is [0,1]. We now want to do the same for a general Galton-Watson tree.
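The map (8.1.1) is easy to make concrete. The following sketch (the helper name is ours) computes the dyadic image of a finite 0/1-sequence; the 2^n level-n leaves land on a grid of mesh 2^{−n}, whose closure as n ↑ ∞ is indeed [0,1].

```python
def embed_binary(sigma):
    """Dyadic embedding (8.1.1): (sigma_1, ..., sigma_n) -> sum_l sigma_l * 2^(-l)."""
    return sum(s * 2.0 ** -(l + 1) for l, s in enumerate(sigma))

# The 8 leaves of the depth-3 binary tree land on the grid {0, 1/8, ..., 7/8}:
leaves = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
values = sorted(embed_binary(s) for s in leaves)
```

Since all weights are powers of 2, the arithmetic is exact in floating point.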

8.1.1 The embedding

Our goal is to define a map γ : {1,...,n(t)} → R_+ in such a way that it encodes the genealogical structure of the underlying supercritical Galton-Watson process. Recall that we write i^k(t) for the multi-index that labels the k-th particle at time t and x_k(t) for the position of that particle. To this end, recall the labelling of the Galton-Watson tree through multi-indices presented in Section 4.5. Then define

γ_t(i) ≡ γ_t(i(t)) ≡ ∑_{j=1}^{W(t)} i_j(t) e^{−t_j}.   (8.1.2)

For a given i, the function (γ_t(i(t)), t ∈ R_+) describes a trajectory of a particle in R_+. Thus to any "particle" at time t we can now associate the position in R × R_+, (x_k(t), γ_t(i^k(t))). The important point is that, for any given particle, this trajectory converges to some point γ(i) ∈ R_+, as t ↑ ∞, almost surely. Hence also the sets γ_t(τ(t)) converge, for any realisation of the tree, to some (random) set γ(τ(∞)). It is easy to see that γ(τ(∞)) is a Cantor set.
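To illustrate why this trajectory converges, here is a small sketch. It assumes (our reading of (8.1.2)) that i_j(t) is the j-th entry of the multi-index and that the weight attached to the j-th branching is e^{−t_j}, with t_j the time of that branching; since the weights decay in t_j, each additional (late) branching moves the image less and less.

```python
import math

def gamma_embed(multi_index, branch_times):
    """Sketch of (8.1.2): map a particle with multi-index (i_1, ..., i_W),
    whose ancestral line branched at times t_1 < ... < t_W, to
    sum_j i_j * exp(-t_j).  The interpretation of the weights is our assumption."""
    return sum(i * math.exp(-t) for i, t in zip(multi_index, branch_times))

# One more branching at the late time t_3 = 5 changes the image by i_3 * e^{-5}:
g2 = gamma_embed((1, 2), (0.7, 1.9))
g3 = gamma_embed((1, 2, 1), (0.7, 1.9, 5.0))
```

This makes t ↦ γ_t(i(t)) a Cauchy trajectory, which is the mechanism behind the almost sure convergence claimed above.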

8.1.2 The extended convergence result

Using the embedding γ defined in the previous section, we now state the following theorem. We use the notation from the previous chapter.

Theorem 8.1. The point process Ẽ_t ≡ ∑_{k=1}^{n(t)} δ_{(γ_t(i^k(t)), x_k(t)−m(t))} → Ẽ on R_+ × R, as t ↑ ∞, where

Ẽ ≡ ∑_{i,j} δ_{(q_i, p_i) + (0, Δ_j^{(i)})},   (8.1.3)

where (q_i, p_i)_{i∈N} are the atoms of a Cox process on R_+ × R with intensity measure Z(dv) × C e^{−√2 x} dx, where Z(dv) is a random measure on R_+, characterised in Lemma 8.4, and the Δ_j^{(i)} are the atoms of the independent and identically distributed point processes Δ^{(i)} from Theorem 7.15.

Remark 8.2. The nice feature of the process Ẽ_t is that it allows one to visualise the different clusters Δ^{(i)} corresponding to the different points of the Poisson process of cluster extremes. In the process ∑_{k=1}^{n(t)} δ_{x_k(t)−m(t)} considered in earlier work, all these points get superimposed and cannot be disentangled. In other words, the process Ẽ encodes both the values and the (rough) genealogical structure of the extremes of BBM.

Remark 8.3. A very similar result is proven by Biskup and Louidor for the discrete Gaussian free field [13, 14]. Their result provided the motivation to prove the corre- sponding theorem for BBM.

The measure Z(dv) is an interesting object in itself. For v, r ∈ R_+ and t > r, we define

Z(v,r,t) = ∑_{j≤n(t)} ( √2 t − x_j(t) ) e^{√2 (x_j(t) − √2 t)} 1_{γ_r(i^j(r)) ≤ v},   (8.1.4)

which is a truncated version of the usual derivative martingale Z_t. In particular, observe that Z(∞,r,t) = Z_t.

Lemma 8.4. For each v ∈ R_+ the limit lim_{r↑∞} lim_{t↑∞} Z(v,r,t) exists almost surely. Set

Z(v) ≡ lim_{r↑∞} lim_{t↑∞} Z(v,r,t).   (8.1.5)

Then 0 ≤ Z(v) ≤ Z, where Z is the limit of the derivative martingale. Moreover, almost surely, Z(v) is monotone increasing in v and continuous.
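For readers who like to see the definitions in executable form, here is a sketch of the truncated sum (8.1.4) at a fixed time t. It takes as (hypothetical) input a list of pairs (x_j, γ_j) of particle positions and embedded ancestor positions; the monotonicity in v asserted in Lemma 8.4 is immediate from the indicator, at least once all weights √2 t − x_j are positive.

```python
import math

SQRT2 = math.sqrt(2.0)

def Z_truncated(particles, v, t):
    """Sketch of (8.1.4): sum over particles of
    (sqrt(2) t - x_j) * exp(sqrt(2) (x_j - sqrt(2) t)) * 1{gamma_j <= v}.
    `particles` is a list of (x_j, gamma_j) pairs -- a hypothetical input format."""
    return sum((SQRT2 * t - x) * math.exp(SQRT2 * (x - SQRT2 * t))
               for x, g in particles if g <= v)

# two particles at positions 1.0 and 0.5 with embedded coordinates 0.2 and 0.9
particles = [(1.0, 0.2), (0.5, 0.9)]
```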

The measure Z(dv) is the analogue of the corresponding "derivative martingale measure" studied in Duplantier et al. [33, 34] and Biskup and Louidor [13, 14] in the context of the Gaussian free field. For a review, see Rhodes and Vargas [66]. These objects are examples of what is known as multiplicative chaos, which was introduced by Kahane [48].

8.2 Properties of the embedding

We need three basic properties of γ. The first lemma states that the map γ_t(i^k(t)) is well approximated by γ_r(i^k(r)) when r is large, but small compared to t, provided t is large enough. More precisely:

Lemma 8.5. Let D = [D̲, D̄] ⊂ R be a compact interval. Define, for 0 ≤ r < t < ∞, the events

A^γ_{r,t}(D) = { ∀k with x_k(t) − m(t) ∈ D : |γ_t(i^k(t)) − γ_r(i^k(r))| ≤ e^{−r/2} }.   (8.2.1)

For any ε > 0 there exists 0 ≤ r(D,ε) < ∞ such that, for any r > r(D,ε) and t > 3r

P[ (A^γ_{r,t}(D))^c ] < ε.   (8.2.2)

Proof. Let ε > 0. Then, by Theorem 2.3 of [3], there exists for each ε > 0 an r_1 < ∞ such that, for all t > 3r_1,

P[ (A^γ_{r,t}(D))^c ] ≤ P[ ∃k : x_k(t) − m(t) ∈ D, ∀s∈[r_1, t−r_1] : x_k(s) ≤ D̄ + E_{t,α}(s), but |γ_t(i^k(t)) − γ_r(i^k(r))| > e^{−r/2} ] + ε/2,   (8.2.3)

where 0 < α < 1/2, E_{t,α}(s) = (s/t) m(t) − f_{t,α}(s), and f_{t,α}(s) = (s ∧ (t − s))^α. Using the "many-to-one lemma" (see Theorem 8.5 of [42]), the probability in (8.2.3) is bounded from above by

e^t P[ x(t) ∈ m(t) + D, ∀s∈[r_1, t−r_1] : x(s) ≤ D̄ + E_{t,α}(s), but ∑_j m_j e^{−t̃_j} 1_{t̃_j∈[r,t]} > e^{−r/2} ],   (8.2.4)

where x is a standard Brownian motion, (t̃_j, j ∈ N) are the points of a Poisson point process with intensity measure 2dx independent of x, the m_j are independent random variables, uniformly distributed on {0, ..., l̃_j − 1}, where finally the l̃_j are i.i.d. according to the size-biased offspring distribution, P(l̃_j = k) = k p_k / 2. Due to

independence, and since m_j ≤ l̃_j, the expression (8.2.4) is bounded from above by

e^t P[ x(t) ∈ m(t) + D, ∀s∈[r_1, t−r_1] : x(s) ≤ D̄ + E_{t,α}(s) ] × P[ ∑_j (l̃_j − 1) e^{−t̃_j} 1_{t̃_j∈[r,t]} > e^{−r/2} ].   (8.2.5)

The first probability in (8.2.5) is bounded by

P[ x(t) ∈ m(t) + D, ∀s∈[r_1, t−r_1] : x(s) − (s/t) x(t) ≤ D̄ − D̲ − f_{t,α}(s) ].   (8.2.6)

Using that ξ(s) ≡ x(s) − (s/t) x(t) is a Brownian bridge from 0 to 0 in time t that is independent of x(t), (8.2.6) equals

P( x(t) ∈ m(t) + D ) P[ ∀s∈[r_1, t−r_1] : ξ(s) ≤ D̄ − D̲ − f_{t,α}(s) ]
≤ P( x(t) ∈ m(t) + D ) P[ ∀s∈[r_1, t−r_1] : ξ(s) ≤ D̄ − D̲ ].   (8.2.7)

Using Lemma 3.4 of [3] to bound the last factor of (8.2.7), we obtain that (8.2.7) is bounded from above by

κ ( r_1 / (t − 2r_1) ) P( x(t) ∈ m(t) + D ),   (8.2.8)

where κ < ∞ is a positive constant. Using this as an upper bound for the first probability in (8.2.5), we can bound (8.2.5) from above by

e^t κ ( r_1 / (t − 2r_1) ) P( x(t) ∈ m(t) + D ) P[ ∑_j (l̃_j − 1) e^{−t̃_j} 1_{t̃_j∈[r,t]} > e^{−r/2} ].   (8.2.9)

By (5.25) of [3] (resp. an easy Gaussian computation) this is bounded from above by

Cκ ( t r_1 / (t − 2r_1) ) P[ ∑_j (l̃_j − 1) e^{−t̃_j} 1_{t̃_j∈[r,t]} > e^{−r/2} ],   (8.2.10)

for some positive constant C < ∞. Using the Markov inequality, (8.2.10) is bounded from above by

Cκ ( t r_1 / (t − 2r_1) ) e^{r/2} E[ ∑_j (l̃_j − 1) e^{−t̃_j} 1_{t̃_j∈[r,t]} ].   (8.2.11)

We condition on the σ-algebra F generated by the Poisson points. Using that l̃_j is independent of the Poisson point process (t̃_j)_j, and that ∑_j e^{−t̃_j} 1_{t̃_j∈[r,t]} is measurable with respect to F, we obtain that (8.2.11) is equal to

Cκ ( t r_1 / (t − 2r_1) ) e^{r/2} E[ E[ ∑_j (l̃_j − 1) e^{−t̃_j} 1_{t̃_j∈[r,t]} | F ] ]   (8.2.12)
= Cκ ( t r_1 / (t − 2r_1) ) e^{r/2} E[ ∑_j e^{−t̃_j} 1_{t̃_j∈[r,t]} E[ (l̃_j − 1) | F ] ].

Since E(l̃_j − 1) = ∑_k (k − 1) k p_k / 2 ≡ K/2 < ∞, we have that (8.2.12) is equal to

Cκ (K/2) ( t r_1 / (t − 2r_1) ) e^{r/2} E[ ∑_j e^{−t̃_j} 1_{t̃_j∈[r,t]} ].   (8.2.13)

By Campbell's theorem (see e.g. [52]), (8.2.13) is equal to

Cκ (K/2) ( t r_1 / (t − 2r_1) ) e^{r/2} ∫_r^t 2 e^{−x} dx ≤ CκK ( t r_1 / (t − 2r_1) ) e^{−r/2},   (8.2.14)

which can be made smaller than ε/2 for all r sufficiently large and t > 3r. □

The second lemma ensures that, with high probability, γ maps no extremal particle into a very small neighbourhood of a fixed a ∈ R_+.

Lemma 8.6. Let a ∈ R_+ and let D be as in Lemma ??. Define the event

B^γ_{r,t}(D, a, δ) = { ∀k with x_k(t) − m(t) ∈ D : γ_r(i^k(r)) ∉ [a − δ, a] }.   (8.2.15)

For any ε > 0 there exist δ > 0 and r(a, D, δ, ε) such that, for any r > r(a, D, δ, ε) and t > 3r,

P[ (B^γ_{r,t}(D, a, δ))^c ] < ε.   (8.2.16)

Proof. Following the proof of Lemma 8.5 step by step, we arrive at the bound

P[ (B^γ_{r,t}(D, a, δ))^c ] ≤ Cκ ( t r_1 / (t − 2r_1) ) P[ ∑_j m_j e^{−t̃_j} 1_{t̃_j∈[0,r]} ∈ [a − δ, a] ].   (8.2.17)

We rewrite the probability in (8.2.17) in the form

∑_{i*=1}^{∞} P[ i* = inf{i : m_i ≠ 0}, ∑_{j≥i*} m_j e^{−t̃_j} 1_{t̃_j∈[0,r]} ∈ [a − δ, a] ].   (8.2.18)

Consider first P( i* = inf{i : m_i ≠ 0} ). This probability is equal to

" ∗ #  1  i −1 1 P(∀i≤i∗ : mi = 0 and mi∗ 6= 0) = E 1 − . (8.2.19) ∗ ∏ li j=1 l j   −1 1+p1 Using that the l j are iid together with the simple bound E l j ≤ 2 , we see that (8.2.19) is bounded from above by

( (1 + p_1)/2 )^{i*−1}.   (8.2.20)

Since (1 + p_1)/2 < 1 by the assumption on p_1, we can choose for each ε' > 0 a K(ε') < ∞ such

that

∑_{i*=K(ε')+1}^{∞} ( (1 + p_1)/2 )^{i*−1} < ε'.   (8.2.21)

Hence we bound (8.2.18) by

∑_{i*=1}^{K(ε')} P[ i* = inf{i : m_i ≠ 0}, ∑_{j≥i*} m_j e^{−t̃_j} 1_{t̃_j∈[0,r]} ∈ [a − δ, a] ] + ε'.   (8.2.22)

We rewrite

∑_{j≥i*} m_j e^{−t̃_j} 1_{t̃_j∈[0,r]} = m_{i*} e^{−t̃_{i*}} 1_{t̃_{i*}∈[0,r]} ( 1 + m_{i*}^{−1} ∑_{j>i*} m_j e^{−(t̃_j − t̃_{i*})} 1_{t̃_j − t̃_{i*} ∈ [0, r − t̃_{i*}]} ).   (8.2.23)

Next, we estimate the probability that t̃_{i*} is large. Observe that t̃_{i*} = ∑_{i=1}^{i*} s_i, where the s_i are i.i.d. exponentially distributed random variables with parameter 2. This implies that t̃_{i*} is Erlang(2, i*) distributed. Thus

P( t̃_{i*} > r^α ) = e^{−2r^α} ∑_{i=0}^{i*−1} (2r^α)^i / i! ≤ C(K(ε')) (2r^α)^{K(ε')} e^{−2r^α}, for all i* ≤ K(ε').   (8.2.24)

Next we want to replace t̃_{i*} in the indicator function in (8.2.23) by a non-random quantity r^α, for some 0 < α < 1, in order to have a bound that depends only on the differences t̃_j − t̃_{i*}. Note first that

| ∑_{j>i*} m_j e^{−(t̃_j − t̃_{i*})} 1_{t̃_j − t̃_{i*} ∈ [0, r − t̃_{i*}]} − ∑_{j>i*} m_j e^{−(t̃_j − t̃_{i*})} 1_{t̃_j − t̃_{i*} ∈ [0, r − r^α]} |   (8.2.25)
= ∑_{j>i*} m_j e^{−(t̃_j − t̃_{i*})} 1_{t̃_j − t̃_{i*} ∈ [r − r^α, r − t̃_{i*}]} ≤ ∑_{j>i*} m_j e^{−(t̃_j − t̃_{i*})} 1_{t̃_j − t̃_{i*} ∈ [r − r^α, r]}.

Using the fact that m_j ≤ l̃_j − 1 for all j and the Markov inequality, we get that

P[ ∑_{j>i*} m_j e^{−(t̃_j − t̃_{i*})} 1_{t̃_j − t̃_{i*} ∈ [r − r^α, r]} > e^{−r/2} ]
≤ e^{r/2} E[ ∑_{j>i*} (l̃_j − 1) e^{−(t̃_j − t̃_{i*})} 1_{t̃_j − t̃_{i*} ∈ [r − r^α, r]} ].   (8.2.26)

Using Campbell's theorem as in (8.2.12), we see that the second line in (8.2.26) is equal to

e^{r/2} (K/2) ∫_{r−r^α}^{r} 2 e^{−x} dx = K ( e^{−r/2 + r^α} − e^{−r/2} ).   (8.2.27)

For any ε' > 0, there exists r_0 < ∞ such that, for all r > r_0, the probabilities in (8.2.24) and (8.2.26) are smaller than ε'. On the event

D = { t̃_{i*} ≤ r^α } ∩ { ∑_{j>i*} m_j e^{−(t̃_j − t̃_{i*})} 1_{t̃_j − t̃_{i*} ∈ [r − r^α, r]} ≤ e^{−r/2} },   (8.2.28)

which has probability at least 1 − 2ε', we can bound (8.2.22) in a nice way. Namely, m_{i*} ≥ 1 by definition, and the m_j are chosen uniformly from {0, ..., l_j − 1} and independently of {t̃_j}_{j≥1}. Moreover, ∑_{j>i*} m_j e^{−(t̃_j − t̃_{i*})} 1_{t̃_j − t̃_{i*} ∈ [0, r − r^α]} ≥ 0 is also independent of t̃_{i*}. It follows that (8.2.22) is bounded from above by

∑_{i*=1}^{K(ε')} P( i* = inf{i : m_i ≠ 0} ) max_{b∈[0,1]} P[ { e^{−t̃_{i*}} ∈ [b − δ − e^{−r/2}, b] } ∧ { t̃_{i*} ≤ r^α } ] + 3ε'.   (8.2.29)

Using the bound on the first probability in (8.2.29) given in (8.2.20), one sees that (8.2.29) is bounded from above by

∑_{i*=1}^{K(ε')} ( (1 + p_1)/2 )^{i*−1} max_{b∈[δ + e^{−r^α} + e^{−r/2}, 1]} P[ t̃_{i*} ∈ [−log b, −log(b − δ − e^{−r/2})] ] + 3ε'.   (8.2.30)

Recalling that t̃_{i*} is Erlang(2, i*) distributed, we have that

P[ t̃_{i*} ∈ [−log b, −log(b − δ − e^{−r/2})] ] = ∑_{i=0}^{i*−1} (1/i!) ( f(b) − f(b − δ − e^{−r/2}) ),   (8.2.31)

where we have set f(x) = x² (−2 log x)^i. By the mean value theorem, uniformly on b ∈ [δ + e^{−r^α} + e^{−r/2}, 1],

| f(b) − f(b − δ − e^{−r/2}) | ≤ 2 (−2 log δ)^i (i* + 1) ( δ + e^{−r/2} ).   (8.2.32)

Inserting this bound into (8.2.31), we get that, for i* ≤ K(ε'),

max_{b∈[δ + e^{−r^α} + e^{−r/2}, 1]} P[ t̃_{i*} ∈ [−log b, −log(b − δ − e^{−r/2})] ]   (8.2.33)
≤ 2 ∑_{i=1}^{i*−1} ( 1/(i−1)! ) (−2 log δ)^{i−1} ( δ + e^{−r/2} )
≤ C(K(ε')) (−log δ)^{K(ε')} ( δ + e^{−r/2} ),   (8.2.34)

for some constant C(K(ε')) < ∞. Now the right-hand side of (8.2.34) can be made smaller than ε' by choosing r large enough and δ small enough. Collecting the bounds in (8.2.24), (8.2.26) and (8.2.34) implies (8.2.16) with ε' = ε/4. □
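The Erlang tail formula used in (8.2.24), and again behind (8.2.31), is easy to check numerically: a sum of i* independent Exp(2) variables has P(T > x) = e^{−2x} ∑_{i<i*} (2x)^i/i!. The sketch below (helper names are ours) compares the closed form with a Monte Carlo estimate.

```python
import math
import random

def erlang_tail(k, rate, x):
    """P(T > x) for T a sum of k i.i.d. Exp(rate) variables (Erlang(rate, k))."""
    return math.exp(-rate * x) * sum((rate * x) ** i / math.factorial(i)
                                     for i in range(k))

rng = random.Random(0)
k, rate, x = 3, 2.0, 1.5
hits = sum(sum(rng.expovariate(rate) for _ in range(k)) > x
           for _ in range(200_000))
mc = hits / 200_000  # Monte Carlo estimate of the tail probability
```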

The following lemma asserts that any two particles that get close to the maximum of BBM will have distinct images under the map γ, unless they branched off each other only recently.

Lemma 8.7. Let D ⊂ R be as in Lemma ??. For any ε > 0 there exist δ > 0 and r(δ,ε) such that, for any r > r(δ,ε) and t > 3r,

P[ ∃ i, j ≤ n(t), d(x_i(t), x_j(t)) ≤ r : x_i(t), x_j(t) ∈ m(t) + D, |γ_t(i^i(t)) − γ_t(i^j(t))| ≤ δ ] < ε.   (8.2.35)

Proof. To control (8.2.35), we first use that, by Theorem 2.1 in [3], for any ε' there is r_1 < ∞ such that, for all t ≥ 3r_1 and r ≤ t/3, the event

{ ∃ i, j ≤ n(t) : d(x_i(t), x_j(t)) ∈ (r_1, r), x_i(t), x_j(t) ∈ m(t) + D }   (8.2.36)

has probability smaller than ε'. Therefore,

P[ ∃ i, j ≤ n(t), d(x_i(t), x_j(t)) ≤ r : x_i(t), x_j(t) ∈ m(t) + D, |γ_t(i^i(t)) − γ_t(i^j(t))| ≤ δ ]   (8.2.37)
≤ P[ ∃ i, j ≤ n(t), d(x_i(t), x_j(t)) ≤ r_1 : x_i(t), x_j(t) ∈ m(t) + D, |γ_t(i^i(t)) − γ_t(i^j(t))| ≤ δ ] + ε'.

The nice feature of the probability in the last line is that r_1 is now independent of r. At the expense of one more ε', we can introduce in addition the condition that the paths of x_i(t), x_j(t) are localised in E_{t,α} over the interval [r_2, t − r_2], for some r_1 < r_2 < ∞ independent of t. Then a second moment estimate (also known as the many-to-two lemma) shows that

P[ ∃ i, j ≤ n(t), d(x_i(t), x_j(t)) ≤ r_1 : x_i(t), x_j(t) ∈ m(t) + D, |γ_t(i^i(t)) − γ_t(i^j(t))| ≤ δ ]
≤ e^{2r_1} K P[ ∃ i ≤ n^{(1)}(t−r_1), j ≤ n^{(2)}(t−r_1) : x̃_i^{(1)}(t−r_1), x̃_j^{(2)}(t−r_1) ∈ m(t) + D̃, ∀s∈[r_2, t−r_2] : x̃_i^{(k)}(s) ≤ D' + E_{t,α}(s), k = 1,2, |γ_t(i^{(1),i}(t)) − γ_t(i^{(2),j}(t))| ≤ δ ] + ε',   (8.2.38)

where we write x_i^{(k)}(t) = x_k(r_1) + x̃_i^{(k)}(t − r_1), etc., and D̃ is a finite enlargement of D such that D + x_k(r_1) ⊂ D̃ with probability at least 1 − ε', and D' is the supremum of D̃. Using the independence of the branches x̃^{(k)} and the same arguments as in Lemma 8.5, we see that the probability in the last line is bounded from above by

Cκ² ( t r_2 / (t − 2r_2) )² P[ | γ_{r_1}(i^1(r_1)) − γ_{r_1}(i^2(r_1)) + ∑_k m_k^j e^{−t̃_k^j} − ∑_{k'} m_{k'}^i e^{−t̃_{k'}^i} | ≤ δ ],   (8.2.39)

where (t̃_k^j, k ∈ N) and (t̃_{k'}^i, k' ∈ N) are the points of independent Poisson point processes with intensity 2dx restricted to [r_1, t]. Moreover, the l_k^j, l_{k'}^i are i.i.d. according to the size-biased offspring distribution, and m_k^j resp. m_{k'}^i are uniformly distributed on {0, ..., l_k^j − 1} resp. {0, ..., l_{k'}^i − 1}. We rewrite (8.2.39) as

P[ ∑_k m_k^j e^{−t̃_k^j} 1_{t̃_k^j∈[r_1,t]} ∈ γ_{r_1}(i^2(r_1)) − γ_{r_1}(i^1(r_1)) + ∑_{k'} m_{k'}^i e^{−t̃_{k'}^i} 1_{t̃_{k'}^i∈[r_1,t]} + [−δ, δ] ].   (8.2.40)

As in (8.2.18), we rewrite the probability in (8.2.40) as

∑_{l=1}^{∞} P[ l = inf{k : m_k^j ≠ 0}, ∑_{k≥l} m_k^j e^{−t̃_k^j} ∈ γ_{r_1}(i^2(r_1)) − γ_{r_1}(i^1(r_1)) + ∑_{k'} m_{k'}^i e^{−t̃_{k'}^i} + [−δ, δ] ].   (8.2.41)

Due to the independence of (t̃_k^j, k ∈ N) and (t̃_{k'}^i, k' ∈ N), we can proceed as with (8.2.18) in the proof of Lemma 8.6 to make (8.2.41) as small as desired by choosing δ small enough. The prefactor in (8.2.39) tends to a constant as t ↑ ∞, and the additional prefactor from (8.2.38) is independent of t and δ. This implies the assertion of Lemma 8.7. □
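Campbell's formula, used repeatedly in the preceding proofs (at (8.2.14) and (8.2.27)), says that for a Poisson point process with intensity λ dx one has E[∑_j f(t̃_j)] = λ ∫ f(x) dx. A Monte Carlo sanity check for f(x) = e^{−x} on [r,t] with λ = 2 (all names below are ours):

```python
import math
import random

def poisson_points(rng, rate, a, b):
    """Points of a Poisson process with constant intensity `rate` on [a, b]:
    sample a Poisson(rate*(b-a)) count via exponential interarrivals,
    then place that many i.i.d. uniform points on [a, b]."""
    n, s = 0, rng.expovariate(1.0)
    while s < rate * (b - a):
        n += 1
        s += rng.expovariate(1.0)
    return [rng.uniform(a, b) for _ in range(n)]

rng = random.Random(1)
r, t, rate = 1.0, 4.0, 2.0
runs = 20_000
est = sum(sum(math.exp(-x) for x in poisson_points(rng, rate, r, t))
          for _ in range(runs)) / runs
# Campbell's formula predicts rate * (e^{-r} - e^{-t})
```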

8.3 The q-thinning

The proof of the convergence of ∑_{i=1}^{n(t)} δ_{(γ(i^i(t)), x_i(t)−m(t))} comes in two main steps. In a first step, we show that the points of the local extrema converge to the desired Poisson point process. Recall from Section 7.3.2 the process Θ_t^r defined in (7.3.12). For r_d ∈ R_+ and t > 3r_d, with R_t = m(t) − m(t − r_d) − √2 r_d = o(1), we can write, in distribution,

Θ_t^{(r_d)} ≡ ∑_{j=1}^{n(r_d)} δ_{x_j(r_d) − √2 r_d + M_j(t − r_d) + R_t},   (8.3.1)

where M_j(t − r_d) ≡ max_{k≤n_j(t−r_d)} x_k^{(j)}(t − r_d) − m(t − r_d), and the x^{(j)} are independent BBMs. Then

Proposition 8.8. Let x̄_{i_k}(t) denote the atoms of the process Θ_t^{(r_d)}. Then

lim_{r_d↑∞} lim_{t↑∞} ∑_{k=1}^{n*(t)} δ_{(γ_t(i^{i_k}(t)), x̄_{i_k}(t))} = ∑_i δ_{(q_i, p_i)} ≡ Ê,   (8.3.2)

where (q_i, p_i)_{i∈N} are the points of the Cox process Ê with intensity measure Z(dv) × C e^{−√2 x} dx, with the random measure Z(dv) defined in (8.1.5). Moreover,

lim_{r↑∞} lim_{r_d↑∞} ∑_{j=1}^{n(r_d)} δ_{(γ_r(i^j(r)), x_j(r_d) − √2 r_d + M_j)} = Ê,   (8.3.3)

where the M_j are i.i.d. and distributed according to the limit law of the maximum of BBM. The proof of Proposition 8.8 relies on Lemma 8.4, which we now prove.

Proof (Proof of Lemma 8.4). For v, r ∈ R_+ fixed, the process Z(v,r,t) defined in (8.1.4) is a martingale in t > r (since Z(∞,r,t) is the derivative martingale and 1_{γ_r(i^i(r))≤v} does not depend on t). To see that Z(v,r,t) converges a.s. as t ↑ ∞, note that

Z(v,r,t) = ∑_{i=1}^{n(r)} 1_{γ_r(i^i(r))≤v} e^{√2(x_i(r) − √2 r)} ( ( √2 r − x_i(r) ) ∑_{j=1}^{n^{(i)}(t−r)} e^{√2(x_j^{(i)}(t−r) − √2(t−r))} + ∑_{j=1}^{n^{(i)}(t−r)} ( √2(t−r) − x_j^{(i)}(t−r) ) e^{√2(x_j^{(i)}(t−r) − √2(t−r))} )
= ∑_{i=1}^{n(r)} 1_{γ_r(i^i(r))≤v} e^{√2(x_i(r) − √2 r)} ( √2 r − x_i(r) ) Y_{t−r}^{(i)} + ∑_{i=1}^{n(r)} 1_{γ_r(i^i(r))≤v} e^{√2(x_i(r) − √2 r)} Z_{t−r}^{(i)}.   (8.3.4)

Here the Z_t^{(i)}, i ∈ N, are i.i.d. copies of the derivative martingale, and the Y_t^{(i)} are i.i.d. copies of the McKean martingale. From Section 5.6 we know that Y_{t−r}^{(i)} → 0, a.s., while Z_{t−r}^{(i)} → Z^{(i)}, a.s., where the Z^{(i)} are non-negative r.v.'s. Hence

lim_{t↑∞} Z(v,r,t) ≡ Z(v,r) = ∑_{i=1}^{n(r)} e^{√2(x_i(r) − √2 r)} Z^{(i)} 1_{γ_r(i^i(r))≤v},   (8.3.5)

where the Z^{(i)}, i ∈ N, are i.i.d. copies of Z. To show that Z(v,r) converges as r ↑ ∞, we go back to (8.1.4). Note that, for fixed v, 1_{γ_r(i^i(r))≤v} is monotone decreasing in r. On the other hand, we have seen in (5.6.8) that min_{i≤n(t)} ( √2 t − x_i(t) ) → +∞, almost surely, as t ↑ ∞. Therefore, the part of the sum in (8.1.4) that involves negative terms (namely those for which x_i(t) > √2 t) converges to zero, almost surely. The remaining part of the sum is decreasing in r, and this implies that the limit, as t ↑ ∞, is monotone decreasing in r, almost surely. Moreover, 0 ≤ Z(v,r) ≤ Z, a.s., where Z is the almost sure limit of the derivative martingale. Thus lim_{r↑∞} Z(v,r) ≡ Z(v) exists. Finally, 0 ≤ Z(v) ≤ Z, and Z(v) is an increasing function of v because Z(v,r) is increasing in v, a.s., for each r. To show that Z(dv) is non-atomic, fix ε, δ > 0 and let D ⊂ R be compact. By Lemma 8.7 there exists r_1(ε,δ) such that, for all r > r_1(ε,δ) and t > 3r,

P[ ∃ i, j ≤ n(t) : d(x_i(t), x_j(t)) ≤ r, x_i(t), x_j(t) ∈ m(t) + D, |γ_t(i^i(t)) − γ_t(i^j(t))| ≤ δ ] < ε.   (8.3.6)

Rewriting (8.3.6) in terms of the thinned process E^{(r/t)}(t) gives

P[ ∃ i_k, i_{k'} : x̄_{i_k}, x̄_{i_{k'}} ∈ m(t) + D, |γ_t(i^{i_k}(t)) − γ_t(i^{i_{k'}}(t))| ≤ δ ] ≤ ε.   (8.3.7)

Assuming for the moment that E^{(r/t)}(t) converges as claimed in Proposition 8.8, this implies that, for any ε > 0 and small enough δ > 0,

P( ∃ i ≠ j : |q_i − q_j| < δ ) < ε.   (8.3.8)

This could not be true if Z(dv) had an atom. This proves Lemma 8.4, provided we can show the convergence of E^{(r/t)}(t). □

The proof of Proposition 8.8 uses the properties of the map γ obtained in Lemmas 8.5 and 8.6. In particular, we use that, in the limit as t ↑ ∞, the image of the extremal particles under γ converges, and that essentially no particle is mapped too close to the boundary of any given compact set. Having these properties at hand, we can use the same procedure as in the proof of Proposition 5 in [4]. Finally, we use Lemma 8.4 to deduce Proposition 8.8.

Proof (Proof of Proposition 8.8). We show the convergence of the Laplace functionals. Let φ : R_+ × R → R_+ be a measurable function with compact support. For simplicity we start by looking at simple functions of the form

φ(x,y) = ∑_{i=1}^{N} a_i 1_{A_i × B_i}(x,y),   (8.3.9)

where A_i = [A̲_i, Ā_i] and B_i = [B̲_i, B̄_i] for N ∈ N, i = 1,...,N, a_i, A̲_i, Ā_i ∈ R_+, and B̲_i, B̄_i ∈ R. The extension to general functions φ then follows by monotone convergence. For such φ, we consider the Laplace functional

" n∗(t) !# ik  Ψt (φ) ≡ E exp − ∑ φ γt (i (t)),x¯ik (t) . (8.3.10) k=1

The idea is that the function γ only depends on the early branchings of the particle. To this end we insert the identity

1 = 1_{A^γ_{r,t}(supp_y φ)} + 1_{(A^γ_{r,t}(supp_y φ))^c}   (8.3.11)

into (8.3.10), where A^γ_{r,t} is defined in (8.2.1), and by supp_y φ we mean the support of φ with respect to the second variable. By Lemma 8.5 we have that, for all ε > 0, there exists r_ε such that, for all r > r_ε,

P[ (A^γ_{r,t}(supp_y φ))^c ] < ε,   (8.3.12)

uniformly in t > 3r. Hence it suffices to show the convergence of

" n∗(t) ! # i  exp − φ γt (i k (t)),x¯i (t) 1 γ . (8.3.13) E ∑ k Ar,t (suppyφ) k=1

We introduce yet another identity into (8.3.13), namely

1 = 1_{∩_{i=1}^N ( B^γ_{r,t}(supp_y φ, A̲_i) ∩ B^γ_{r,t}(supp_y φ, Ā_i) )} + 1_{( ∩_{i=1}^N ( B^γ_{r,t}(supp_y φ, A̲_i) ∩ B^γ_{r,t}(supp_y φ, Ā_i) ) )^c},   (8.3.14)

where we use the shorthand notation B^γ_{r,t}(supp_y φ, a) ≡ B^γ_{r,t}(supp_y φ, a, e^{−r/2}) (recall (8.2.15)). By Lemma 8.6, for every ε > 0 there exists r̄_ε such that, for all r > r̄_ε and

uniformly in t > 3r

P[ ( ∩_{i=1}^N ( B^γ_{r,t}(supp_y φ, A̲_i) ∩ B^γ_{r,t}(supp_y φ, Ā_i) ) )^c ] < ε.   (8.3.15)

Hence we only have to show the convergence of

" n∗(t) ! # ik  exp − φ γt (i (t)),x¯i (t) 1 γ TN γ γ . E ∑ k Ar,t (suppyφ)∩( i=1(Br,t (suppyφ,Ai)∩Br,t (suppyφ,Ai))) k=1 (8.3.16) Observe that on the event in the indicator function in the the last line the following k holds: If for any i ∈ {1,...,N}, γt (i (t)) ∈ [Ai,Ai] andx ¯k(t) ∈ suppyφ then also k γr(i (r)) ∈ [Ai,Ai], and vice versa. Hence (8.3.16) is equal to

" n∗(t) ! # ik  exp − φ γr(i (r)),x¯i (t) 1 γ TN γ γ . E ∑ k Ar,t (suppyφ)∩( i=1(Br,t (suppyφ,Ai)∩Br,t (suppyφ,Ai))) k=1 (8.3.17) Now we apply again Lemma 8.5 and Lemma 8.6 to see that the quantity in (8.3.17) is equal to " n∗(t) !# ik  E exp − ∑ φ γr(i (r)),x¯ik (t) + O(ε). (8.3.18) k=1

Introducing a conditional expectation given F_{r_d}, we get (analogously to (3.16) in [4]), as t ↑ ∞, that (8.3.18) is equal to

" n∗(t) !# i  limE exp − φ γr(i k (r)),x¯i (t) (8.3.19) t↑ ∑ k ∞ k=1

"n(rd ) " j ( j) ## −φ(γr(i (r)),x j(rd )−m(t)+m(t−rd )+max ( j) xi (t−rd )−m(t−rd )) i≤n (t−rd ) = limE ∏ E e Frd t↑∞ j=1

"n(rd ) √ # h j i −φ(γr(i (r)),x j(rd )− 2rd +M) = E ∏ E e Frd , j=1

where M is the limit of the rescaled maximum of BBM. The last expression is com- pletely analogous to Eq. (3.17) in [4]. Following the analysis of this expression up to Eq. (3.25) in [4], we find that (8.3.19) is equal to

c_{r_d} E[ exp( −C ∑_{j≤n(r_d)} (−y_j(r_d)) e^{√2 y_j(r_d)} ∑_{i=1}^{N} (1 − e^{−a_i}) 1_{A_i}( γ_r(i^j(r)) ) ( e^{−√2 B̲_i} − e^{−√2 B̄_i} ) ) ],   (8.3.20)

where y_j(r_d) = x_j(r_d) − √2 r_d, lim_{r_d↑∞} c_{r_d} = 1, and C is the constant from the law of the maximum of BBM. Using Lemma 8.4, (8.3.20) is, in the limit as r_d ↑ ∞ and r ↑ ∞, equal to

" N √ √ ! # ai − 2B − 2 Bi  E exp −C ∑(1 − e ) e i − e (Z(Ai) − Z(Ai)) (8.3.21) i=1  Z √ √   −φ(x,y)  − 2y = E exp e − 1 Z(dx) 2Ce dy .

This is the Laplace functional of the process Eb, which proves Proposition 8.8. To prove Theorem 8.1 we need to combine Proposition 8.8 with the results on the genealogical structure of the extremal particles of BBM obtained in [3] and the convergence of the decoration point process ∆ (see e.g. Theorem 2.3 of [2]).   (rd /t) Proof (Proof of Theorem 8.1). For xik (t) ∈ supp E (t) define the process of recent relatives by (ik) ik ∆t,r = δ0 + ∑ N j , (8.3.22) ik j:τ j >t−r

where the τ_j^{i_k} are the branching times along the path s ↦ x_{i_k}(s), enumerated backwards in time, and the N_j^{i_k} are the point measures of particles whose ancestor was born at τ_j^{i_k}. In the same way, let Δ_r^{(i_k)} be independent copies of Δ_r, which is defined as

Δ_r ≡ lim_{t↑∞} ∑_{i=1}^{n(t)} 1_{d(x̃_i(t), argmax_{j≤n(t)} x̃_j(t)) ≥ t−r} δ_{x̃_i(t) − max_{j≤n(t)} x̃_j(t)},   (8.3.23)

the point measure obtained from Δ by only keeping particles that branched off the maximum after time t − r (see the backward description of Δ in [2]). By Theorem 2.3 of [2] we have that (the labelling i_k refers to the thinned process E^{(r_d/t)}(t))

( x̄_{i_k}(r_d) − √2 r_d + M_{i_k}(t − r_d), Δ^{(i_k)}_{t,r_d} )_{1≤k≤n*(t)} ⇒ ( x_j(r_d) − √2 r_d + M^{(j)}, Δ_{r_d}^{(j)} )_{j≤n(r_d)},   (8.3.24)

as t ↑ ∞, where the M^{(j)} are independent copies of M. Moreover, Δ_{r_d}^{(j)} is independent of (M^{(j)})_{j≤n(r_d)}. Looking now at the Laplace functional for the complete point process Ẽ_t,

Ψ̃_t(φ) ≡ E[ e^{−∫ φ(x,y) Ẽ_t(dx,dy)} ],   (8.3.25)

for φ as in (8.3.9), and doing the same manipulations as in the proof of Proposition 8.8, shows that

" n(t) !#  k  Ψet (φ) = E exp − ∑ φ γr(i (r)),x¯k(t) + O(ε). (8.3.26) k=1

Denote by C_{t,r}(D) the event

C_{t,r}(D) = { ∀ i, j ≤ n(t) with x_i(t), x_j(t) ∈ D + m(t) : d(x_i(t), x_j(t)) ∉ (r, t − r) }.   (8.3.27)

By Theorem 2.1 in [3] we know that, for each D ⊂ R compact,

lim_{r↑∞} sup_{t>3r} P[ (C_{t,r}(D))^c ] = 0.   (8.3.28)

Hence, by introducing 1 = 1_{(C_{t,r}(supp_y φ))^c} + 1_{C_{t,r}(supp_y φ)} into (8.3.26), we obtain that

Ψ̃_t(φ) = E[ e^{ −∑_{k=1}^{n*(t)} ∑_j φ( γ_r(i^{i_k}(r)), x̄_{i_k}(t) + Δ^{(i_k,j)}_{t,r_d} ) } ] + O(ε),   (8.3.29)

where the Δ^{(i_k,j)}_{t,r_d} are the atoms of Δ^{(i_k)}_{t,r_d}. Hence it suffices to show that

∑_{k=1}^{n*(t)} ∑_j δ_{ (γ_r(i^{i_k}(r)), x̄_{i_k}(t)) + (0, Δ^{(i_k,j)}_{t,r_d}) }   (8.3.30)

converges weakly when first taking the limit t ↑ ∞, then r_d ↑ ∞, and finally r ↑ ∞. But by (8.3.24),

lim_{t↑∞} ∑_{k=1}^{n*(t)} ∑_ℓ δ_{ (γ_r(i^{i_k}(r)), x̄_{i_k}(t)) + (0, Δ^{(i_k,ℓ)}_{t,r_d}) } = ∑_{j=1}^{n(r_d)} ∑_ℓ δ_{ (γ_r(i^j(r)), x_j(r_d) − √2 r_d + M_j) + (0, Δ_{r_d}^{(j,ℓ)}) }.   (8.3.31)

The limit, as first r_d and then r tend to infinity, of the process on the right-hand side exists and is equal to Ẽ by Proposition 8.8 (in particular (8.3.3)). This concludes the proof of Theorem 8.1. □

Chapter 9 Variable speed BBM

We have seen that BBM is somewhat related to the GREM where the covariance is a linear function of the distance. It is natural to introduce versions of BBM that have more general covariance functions A(x). This can be achieved by changing the speed (= variance) of the Brownian motion with time. Variable speed BBM was introduced by Derrida and Spohn [31] and has recently been studied by Fang and Zeitouni [36, 35] and others [59, 60, 63]. This entire chapter is based on joint work with Lisa Hartung [17, 18].

9.1 The construction

The general model can be constructed as follows. Let A : [0,1] → [0,1] be a right- continuous increasing function. Fix a time horizon t and let

Σ²(u) = t A(u/t).   (9.1.1)

Note that Σ² is almost everywhere differentiable; denote by σ²(s) its derivative wherever it exists. We define Brownian motion with speed function Σ² as a time change of ordinary Brownian motion on [0,t]:

B^Σ_s = B_{Σ²(s)}.   (9.1.2)

Branching Brownian motion with speed function Σ² is constructed like ordinary branching Brownian motion, except that if a particle splits at some time s < t, then the offspring particles perform variable speed Brownian motions with speed function Σ², i.e. their laws are independent copies of {B^Σ_r − B^Σ_s}_{t≥r≥s}, all starting at the position of the parent particle at time s. We denote by n(s) the number of particles at time s and by {x_i(s); 1 ≤ i ≤ n(s)} the positions of the particles at time s. If we denote by d(x_k(t), x_ℓ(t)) the time of the most recent common ancestor of the particles k and ℓ, then a simple computation

shows that

E[ x_k(s) x_ℓ(s) ] = Σ²( d(x_k(s), x_ℓ(s)) ).   (9.1.3)

Moreover, for different times, we have

E[ x_k(s) x_ℓ(r) ] = Σ²( d(x_k(t), x_ℓ(t)) ∧ s ∧ r ).   (9.1.4)

Remark 9.1. Strictly speaking, we are not talking about a single stochastic process, but about a family {x_k(s), k ≤ n(s)}_{s≤t}, t ∈ R_+, of processes with finite time horizon, indexed by that horizon t.
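The time change (9.1.2) is straightforward to simulate: on a grid, B^Σ accumulates independent Gaussian increments of variance Σ²(s_{i+1}) − Σ²(s_i), so that Var(B^Σ_s) = Σ²(s). The sketch below (function names are ours) uses a two-speed profile as in Section 9.2 with σ_1² = 1/2, σ_2² = 3/2, b = 1/2, which satisfies the normalisation σ_1² b + σ_2² (1 − b) = 1, so Var(B^Σ_t) = t.

```python
import random

def variable_speed_bm(Sigma2, times, rng):
    """One path of B^Sigma_s = B_{Sigma^2(s)} on the given time grid:
    Gaussian increments of variance Sigma2(s_{i+1}) - Sigma2(s_i)
    (Sigma2 must be non-decreasing)."""
    path, x = [0.0], 0.0
    for s0, s1 in zip(times, times[1:]):
        x += rng.gauss(0.0, (Sigma2(s1) - Sigma2(s0)) ** 0.5)
        path.append(x)
    return path

t, b = 10.0, 0.5
Sigma2 = lambda s: 0.5 * s if s < b * t else 0.5 * b * t + 1.5 * (s - b * t)
rng = random.Random(42)
times = [i * t / 100 for i in range(101)]
ends = [variable_speed_bm(Sigma2, times, rng)[-1] for _ in range(20_000)]
var_hat = sum(x * x for x in ends) / len(ends)  # empirical Var(B^Sigma_t)
```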

The case when A is a step function with finitely many steps corresponds to Derrida's GREMs, with the only difference that the binary tree is replaced by a Galton-Watson tree. The case we discuss here corresponds to A being a piecewise linear function. The case when A is arbitrary has been dubbed CREM in [20] (and treated for binary regular trees). In that case the leading order of the maximum was obtained; this analysis carries over mutatis mutandis to the BBM situation. Fang and Zeitouni [36] have obtained the order of the correction (namely t^{1/3}) in the case when A is strictly concave and continuous, but there are no results on the extremal process or the law of the maximum. This result has very recently been strengthened by Maillard and Zeitouni [59], who proved convergence of the law of the maximum to some travelling wave and computed the next order of the correction (which is logarithmic).

9.2 Two-speed BBM

Understanding the piecewise linear case seems to be a prerequisite to getting the full picture. Fang and Zeitouni [36] have in this case obtained the correct order of the corrections, which is completely analogous to the GREM situation, without further information on the law of the maximum and the extremal process. The simplest case is two-speed BBM, i.e.

σ²(s) = σ_1² for 0 ≤ s < bt, and σ²(s) = σ_2² for bt ≤ s ≤ t, with 0 < b ≤ 1,   (9.2.1)

with the normalisation

σ_1² b + σ_2² (1 − b) = 1.   (9.2.2)

Note that in the case b = 1, σ_2 = ∞ is allowed. Fang and Zeitouni [35] showed that

max_{k≤n(t)} x_k(t) = √2 t − (1/(2√2)) ln t + O(1), if σ_1 < σ_2,
max_{k≤n(t)} x_k(t) = √2 t ( σ_1 b + σ_2 (1 − b) ) − (3/(2√2)) ( σ_1 + σ_2 ) ln t + O(1), if σ_1 > σ_2.   (9.2.3)

The second case has a simple interpretation: the maximum is achieved by adding to the maxima of BBM at time bt the maxima of their offspring at time t(1 − b) later, just as in the analogous case of the GREM. The first case looks like the REM, which is not surprising, as we have already seen in the GREM that correlations have no influence as long as A(u) < u. But when looking at the extremal process, we will find that here things are a lot more subtle. The main result of [17] is the following.

Theorem 9.2 ([17]). Let x_k(t) be branching Brownian motion with variable speed as given in (9.2.1). Assume that σ1 < σ2 and b ∈ (0,1). Then
(i)
\[
\lim_{t\uparrow\infty} P\Big[\max_{k\le n(t)} x_k(t) - \tilde m(t) \le x\Big] = E\Big[e^{-C'Y e^{-\sqrt{2}x}}\Big], \qquad (9.2.4)
\]
where \tilde m(t) = \sqrt{2}\,t - \frac{1}{2\sqrt{2}}\ln t, C' is a constant, and Y is a random variable that is the limit of a martingale (but different from Z!).
(ii) The point process
\[
\tilde{\mathcal E}_t \equiv \sum_{k\le n(t)} \delta_{x_k(t)-\tilde m(t)} \to \tilde{\mathcal E}, \qquad (9.2.5)
\]
as t ↑ ∞, in law, where
\[
\tilde{\mathcal E} = \sum_{k,j} \delta_{\eta_k + \sigma_2 \Lambda_j^{(k)}}, \qquad (9.2.6)
\]

where η_k is the k-th atom of a mixture of Poisson point processes with intensity measure C'Y e^{-\sqrt{2}x}dx, with C' and Y as in (i), and \Lambda_i^{(k)} are the atoms of independent and identically distributed point processes \Lambda^{(k)}, which are the limits in law of
\[
\sum_{j\le n(t)} \delta_{x_j(t)-\max_{i\le n(t)} x_i(t)}, \qquad (9.2.7)
\]
where x(t) is BBM of speed 1 conditioned on \max_{j\le n(t)} x_j(t) \ge \sqrt{2}\sigma_2 t.

The picture is completed by the limiting extremal process in the case σ1 > σ2. This result is much simpler and could be guessed from the known facts about the GREM.

Theorem 9.3 ([17]). Let x_k(t) be as in Theorem 9.2, but with σ2 < σ1. Again b ∈ (0,1).
(i) Let \bar{\mathcal E} \equiv \bar{\mathcal E}^{(0)} and \bar{\mathcal E}^{(i)}, i \in \mathbb N, be independent copies of the extremal process of standard branching Brownian motion. Let
\[
m(t) \equiv \sqrt{2}\,t\big(b\sigma_1 + (1-b)\sigma_2\big) - \frac{3}{2\sqrt{2}}(\sigma_1+\sigma_2)\ln t - \frac{3}{2\sqrt{2}}\big(\sigma_1 \ln b + \sigma_2 \ln(1-b)\big), \qquad (9.2.8)
\]
and set
\[
\tilde{\mathcal E}_t \equiv \sum_{k\le n(t)} \delta_{x_k(t)-m(t)}. \qquad (9.2.9)
\]
Then
\[
\lim_{t\uparrow\infty} \tilde{\mathcal E}_t = \tilde{\mathcal E}, \qquad (9.2.10)
\]
exists, and
\[
\tilde{\mathcal E} = \sum_{i,j} \delta_{\sigma_1 e_i + \sigma_2 e_j^{(i)}}, \qquad (9.2.11)
\]

where e_i and e_j^{(i)} are the atoms of the point processes \bar{\mathcal E} and \bar{\mathcal E}^{(i)}, respectively.
We just remark on the main steps in the proof of Theorem 9.2. The first step is a localisation of the particles that will eventually reach the top at the time of the speed change. The following proposition says that these are in a √t-neighbourhood of √2σ1²bt, which is much lower than the position √2σ1bt of the leading particles at time bt. Thus, the faster particles in the second half-time must make up for this. It is not very hard to know everything about the particles after time bt. The main problem one is faced with is to control their initial distribution at this time. Fortunately, this can be done with the help of a martingale. Define

\[
Y_s = \sum_{i=1}^{n(s)} e^{-s(1+\sigma_1^2) + \sqrt{2}\,x_i(s)}. \qquad (9.2.12)
\]

When σ1 < 1, one can show that Y_s is a uniformly integrable positive martingale with mean value one. The martingale Y_s will take over the role of the derivative martingale in the standard case. We now go in more detail through the proofs of the theorems.

9.2.1 Position of extremal particles at time bt

The key to understanding the behaviour of the two-speed BBM is to control the positions at time bt of the particles that are at the top at time t. This is done using straightforward Gaussian estimates.

Proposition 9.4. Let σ1 < σ2. For any d ∈ R and any ε > 0, there exists a constant A > 0 such that for all t large enough
\[
P\Big[\exists_{j\le n(t)}\ \text{s.t.}\ x_j(t) > \tilde m(t) - d \ \text{and}\ x_j(bt) - \sqrt{2}\sigma_1^2 bt \notin [-A\sqrt{t}, A\sqrt{t}]\Big] \le \varepsilon. \qquad (9.2.13)
\]

Proof. Using a first order Chebyshev inequality we bound (9.2.13) by
\[
e^t\, E\Big[\mathbb{1}_{\{\sigma_1\sqrt{bt}\,w_1 - \sqrt{2}\sigma_1^2 bt \notin [-A\sqrt{t},A\sqrt{t}]\}}\, P_{w_2}\Big(\sigma_2\sqrt{(1-b)t}\,w_2 > \tilde m(t) - d - \sigma_1\sqrt{bt}\,w_1\Big)\Big]
\]
\[
= e^t\, E\Big[\mathbb{1}_{\{w_1 - \sqrt{2}\sigma_1\sqrt{bt} \notin [-A',A']\}}\, P_{w_2}\Big(w_2 > \tfrac{\sqrt{2}t - \sigma_1\sqrt{bt}\,w_1}{\sigma_2\sqrt{(1-b)t}} - \tfrac{\ln t}{2\sqrt{2}\,\sigma_2\sqrt{(1-b)t}} - \tfrac{d}{\sigma_2\sqrt{(1-b)t}}\Big)\Big] \equiv (R1) + (R2), \qquad (9.2.14)
\]

where w_1, w_2 are independent N(0,1)-distributed, A' = A/(σ1√b), and P_{w_2} denotes the law of the variable w_2. Introducing into the last line the identity in the form
\[
1 = \mathbb{1}_{\{\sqrt{2}t - \sigma_1\sqrt{bt}\,w_1 < \ln t\}} + \mathbb{1}_{\{\sqrt{2}t - \sigma_1\sqrt{bt}\,w_1 \ge \ln t\}} \qquad (9.2.15)
\]
produces the splitting into the two terms (R1) and (R2); a standard Gaussian tail bound controls (R1).

By the same argument, (R2) is smaller than
\[
e^t \int \mathbb{1}_{\{w_1-\sqrt{2}\sigma_1\sqrt{bt}\notin[-A',A']\}}\, \mathbb{1}_{\{\sqrt{2}t-\sigma_1\sqrt{bt}\,w_1\ge \ln t\}}\, \frac{\sqrt{1-b}\,\sigma_2\sqrt{t}}{\sqrt{2}t-\sigma_1\sqrt{bt}\,w_1}\, (2\pi)^{-1/2}\, e^{-w_1^2/2} \qquad (9.2.17)
\]
\[
\times \exp\Big(-\frac{1}{2}\Big(\frac{\sqrt{2}\sqrt{t} - \sigma_1\sqrt{b}\,w_1 - \ln t/(2\sqrt{2}\sqrt{t}) - d/\sqrt{t}}{\sqrt{1-b}\,\sigma_2}\Big)^2\Big)\, dw_1.
\]
We change variables w_1 = \sqrt{2}\sigma_1\sqrt{bt} + z. Then the integral in (9.2.17) can be bounded from above by
\[
\int_{z\notin[-A',A']} \frac{M}{\sqrt{2\pi\sigma_2^2(1-b)}}\, e^{-\frac{z^2}{2\sigma_2^2(1-b)}}\, dz, \qquad (9.2.18)
\]

where M is some positive constant. (9.2.18) can be made as small as desired by taking A (and thus A0) sufficiently large. ut

Remark 9.5. The point here is that, since σ1² < σ1, these particles are way below max_{k≤n(bt)} x_k(bt), which is near √2σ1bt. The offspring of these particles that want to be at the top at time t will have to travel much faster (at speed √2σ2², rather than just √2σ2) than normal. Fortunately, there are lots of particles to choose from. We will have to control precisely how many.

We need a slightly finer control on the paths of the extremal particles until the time of the speed change. To this end we define two sets on the space of paths X : R_+ → R. The first controls that the path stays in a certain tube up to time s, the second controls the position of the particle at time s. We define the following events.

\[
\mathcal T_{s,r} = \Big\{X \,\Big|\, \forall_{0\le q\le s}\colon \big|X(q) - \tfrac{q}{s}X(s)\big| \le \big((q\wedge(s-q))\vee r\big)^{\gamma}\Big\}
\]
\[
\mathcal G_{s,A,\gamma} = \Big\{X \,\Big|\, X(s) - \sqrt{2}\sigma_1^2 s \in [-As^{\gamma}, +As^{\gamma}]\Big\} \qquad (9.2.19)
\]
Recall that the ancestral path from 0 to x_k(s) can be written as x_k(q) = \tfrac{q}{s}x_k(s) + z_k(q), where z_k is a Brownian bridge from 0 to 0 in time s, independent of x_k(s). We need the following simple fact about Brownian bridges.

Lemma 9.6. Let z(q) be a Brownian bridge starting in zero and ending in zero at time s. Then for all γ > 1/2 the following is true. For all ε > 0 there exists r such that
\[
\lim_{s\uparrow\infty} P\Big(|z(q)| < \big((q\wedge(s-q))\vee r\big)^{\gamma},\ \forall_{0\le q\le s}\Big) > 1 - \varepsilon. \qquad (9.2.20)
\]
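Lemma 9.6 can be checked by direct simulation: build Brownian bridges and count how often they stay inside the tube ((q ∧ (s−q)) ∨ r)^γ. The sketch below is only a rough illustration (the discretisation step, the horizon s, and the parameters r, γ are arbitrary choices):

```python
import math
import random

def bridge_in_tube_prob(s=10.0, r=4.0, gamma=0.7, dt=0.01, runs=300, seed=1):
    """Estimate P(|z(q)| <= ((q ∧ (s-q)) ∨ r)**gamma for all q in [0, s])
    for a Brownian bridge z from 0 to 0 in time s (Euler construction)."""
    rng = random.Random(seed)
    n = int(s / dt)
    inside = 0
    for _ in range(runs):
        # build a Brownian path, then subtract its linear interpolation
        w, path = 0.0, [0.0]
        for _ in range(n):
            w += math.sqrt(dt) * rng.gauss(0.0, 1.0)
            path.append(w)
        ok = True
        for k, wq in enumerate(path):
            q = k * dt
            z = wq - (q / s) * path[-1]              # bridge value at time q
            bound = max(min(q, s - q), r) ** gamma   # tube radius at time q
            if abs(z) > bound:
                ok = False
                break
        inside += ok
    return inside / runs

p = bridge_in_tube_prob()
```

Increasing r (or γ) visibly pushes the estimated probability towards 1, in line with the statement of the lemma.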

Proposition 9.7. Let σ1 < σ2. For any d ∈ R, A > 0, γ > 1/2, and any ε > 0, there exists a constant B > 0 such that, for all t large enough,

\[
P\Big[\exists_{j\le n(t)} \colon x_j(t) > \tilde m(t) - d \,\wedge\, x_j \in \mathcal G_{bt,A,\frac12} \,\wedge\, x_j \notin \mathcal G_{b\sqrt{t},B,\gamma}\Big] \le \varepsilon. \qquad (9.2.21)
\]
Proof. For B and t sufficiently large the probability in (9.2.21) is bounded from above by
\[
P\Big[\exists_{j\le n(t)} \colon x_j(t) > \tilde m(t) - d \,\wedge\, x_j \in \mathcal G_{bt,A,\frac12} \,\wedge\, x_j \notin \mathcal T_{bt,r}\Big]. \qquad (9.2.22)
\]

Let w_1 and w_2 be independent N(0,1)-distributed random variables and z a Brownian bridge starting in zero and ending in zero at time bt. Using a first moment method together with the independence of the Brownian bridge from its endpoint, one sees that (9.2.22) is bounded from above by
\[
e^t\, E\Big[\mathbb{1}_{\{\sigma_1\sqrt{bt}\,w_1-\sqrt{2}\sigma_1^2 bt \in [-A\sqrt{t},A\sqrt{t}]\}}\, P_{w_2}\Big(\sigma_2\sqrt{(1-b)t}\,w_2 > \tilde m(t) - d - \sigma_1\sqrt{bt}\,w_1\Big)\Big] \times P\big(z \notin \mathcal T_{bt,r}\big) < \varepsilon, \qquad (9.2.23)
\]

where the last bound follows from Lemma 9.6 (with ε replaced by ε/M) and the bound (9.2.18) obtained in the proof of Proposition 9.4. ut

Proposition 9.8. Let σ1 < σ2. For any d ∈ R, A, B > 0, γ > 1/2, and any ε > 0, there exists a constant r > 0 such that for all t large enough

\[
P\Big[\exists_{j\le n(t)} \colon x_j(t) > \tilde m(t) - d \,\wedge\, x_j \in \mathcal G_{bt,A,\frac12} \cap \mathcal G_{b\sqrt{t},B,\gamma} \,\wedge\, x_j(b\sqrt{t}+\cdot) - x_j(b\sqrt{t}) \notin \mathcal T_{b(t-\sqrt{t}),r}\Big] \le \varepsilon. \qquad (9.2.24)
\]

Proof. The proof of this proposition is essentially identical to the proof of Proposi- tion 9.7. ut

9.2.2 Convergence of the McKean martingale

In this section we pick up the idea of [56] and consider a suitable convergent martingale for the time inhomogeneous BBM with σ1 < σ2. Let x_i(s), 1 ≤ i ≤ n(s), be the particles of a BBM where the Brownian motions have variance σ1² with σ1 < 1. Define

\[
Y_s = \sum_{i=1}^{n(s)} e^{-s(1+\sigma_1^2)+\sqrt{2}\,x_i(s)}. \qquad (9.2.25)
\]
This turns out to be a uniformly integrable martingale that converges almost surely to a positive limit Y.

Theorem 9.9. If σ1 < 1, Y_s is a uniformly integrable martingale with E[Y_s] = 1. Hence the limit Y = lim_{s→∞} Y_s exists almost surely and in L¹, and is finite and strictly positive.

Remark 9.10. Note further that if σ1 = 0, then Y_t = e^{-t} n(t), which is well known to converge to an exponential random variable.
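The normalisation in (9.2.25) is easy to sanity-check by Monte Carlo: averaging Y_s over independent runs of a BBM with variance σ1² < 1 should give a value close to E[Y_s] = 1. A rough sketch (Euler scheme; all numerical parameters are illustrative choices):

```python
import math
import random

def mckean_martingale(s=3.0, sigma1=0.5, dt=0.02, runs=400, seed=2):
    """Monte Carlo average over independent BBMs (diffusion variance
    sigma1**2, binary branching at rate 1) of
        Y_s = sum_i exp(-s*(1 + sigma1**2) + sqrt(2)*x_i(s)),
    whose expectation equals 1."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        particles = [0.0]
        for _ in range(int(s / dt)):
            nxt = []
            for x in particles:
                x += sigma1 * math.sqrt(dt) * rng.gauss(0.0, 1.0)
                nxt.append(x)
                if rng.random() < dt:   # rate-1 binary branching
                    nxt.append(x)
            particles = nxt
        total += sum(math.exp(-s * (1 + sigma1**2) + math.sqrt(2) * x)
                     for x in particles)
    return total / runs

m = mckean_martingale()
```

The time-discretisation introduces a small downward bias of order dt in the mean, so the estimate should land near, but not exactly at, 1.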

Proof. Clearly,
\[
E[Y_s] = e^{s}\, E\big[e^{-(1+\sigma_1^2)s+\sqrt{2}\,x_1(s)}\big] = 1. \qquad (9.2.26)
\]

Next we show that Ys is a martingale. Let 0 < r < s. Then

\[
E[Y_s | \mathcal F_r] = \sum_{i=1}^{n(r)} E\Big[\sum_{j=1}^{n_i(s-r)} e^{-s(1+\sigma_1^2)+\sqrt{2}(x_j^i(s-r)+x_i(r))} \,\Big|\, \mathcal F_r\Big], \qquad (9.2.27)
\]

where, for 1 ≤ i ≤ n(r), \{x_j^i(s-r),\, 1\le j\le n_i(s-r)\} are the particles of independent BBMs with variance σ1² at time s − r. (9.2.27) is equal to

\[
\sum_{i=1}^{n(r)} e^{-r(1+\sigma_1^2)+\sqrt{2}\,x_i(r)} = Y_r, \qquad (9.2.28)
\]
as desired. It remains to show that Y_s is uniformly integrable. Recall that x_k(r) is the ancestor of x_k(s) at time r ≤ s and write x_k for the entire ancestral path of x_k(s). Define the truncated variable

\[
Y_s^A = \sum_{i=1}^{n(s)} e^{-(1+\sigma_1^2)s+\sqrt{2}\,x_i(s)}\, \mathbb{1}_{\{x_i\in\mathcal G_{s,A,1/2},\ x_i\in\mathcal T_{s,r}\}}. \qquad (9.2.29)
\]

First, Y_s − Y_s^A ≥ 0, and a simple computation shows that

\[
E\big[Y_s - Y_s^A\big] \le E\Big[\sum_{i=1}^{n(s)} e^{-(1+\sigma_1^2)s+\sqrt{2}\,x_i(s)}\, \mathbb{1}_{\{x_i\notin\mathcal G_{s,A,1/2}\}}\Big] + E\Big[\sum_{i=1}^{n(s)} e^{-(1+\sigma_1^2)s+\sqrt{2}\,x_i(s)}\, \mathbb{1}_{\{x_i\notin\mathcal T_{s,r}\}}\Big]
\]
\[
\le e^s \int_{-\infty}^{\infty} e^{-(1+\sigma_1^2)s+\sqrt{2s}\,\sigma_1 x}\, \mathbb{1}_{\{x-\sqrt{2s}\,\sigma_1\notin[-A,A]\}}\, e^{-x^2/2}\,\frac{dx}{\sqrt{2\pi}} + \varepsilon = \int_{|z|>A} e^{-z^2/2}\,\frac{dz}{\sqrt{2\pi}} + \varepsilon, \qquad (9.2.30)
\]

where we used that x_i(s) and the bridge x_i(r) − (r/s)x_i(s) are independent, and that the event \{x_i \notin \mathcal T_{s,r}\} depends only on this bridge and by Lemma 9.6 has probability less than ε. The bound in (9.2.30) can be made as small as desired by taking A and r sufficiently large. The key point is that the second moment of Y_s^A is uniformly bounded in s.

\[
E\big[(Y_s^A)^2\big] = E\Big[\Big(\sum_{i=1}^{n(s)} e^{-(1+\sigma_1^2)s+\sqrt{2}\,x_i(s)}\, \mathbb{1}_{\{x_i\in\mathcal G_{s,A,1/2}\cap\mathcal T_{s,r}\}}\Big)^2\Big] \equiv (T1) + (T2), \qquad (9.2.31)
\]
where

\[
(T1) = E\Big[\sum_{i=1}^{n(s)} e^{-2\big((1+\sigma_1^2)s-\sqrt{2}\,x_i(s)\big)}\, \mathbb{1}_{\{x_i\in\mathcal G_{s,A,1/2}\cap\mathcal T_{s,r}\}}\Big]
\]
\[
(T2) = E\Big[\sum_{\substack{i,j=1\\ i\neq j}} e^{-2(1+\sigma_1^2)s+\sqrt{2}(x_i(s)+x_j(s))}\, \mathbb{1}_{\{x_i,x_j\in\mathcal G_{s,A,1/2}\cap\mathcal T_{s,r}\}}\Big] \qquad (9.2.32)
\]

We start by controlling (T1).
\[
(T1) \le \frac{e^{s-2s(1+\sigma_1^2)}}{\sqrt{2\pi}} \int_{\sqrt{2s}\,\sigma_1-A/\sigma_1}^{\sqrt{2s}\,\sigma_1+A/\sigma_1} e^{2\sqrt{2s}\,\sigma_1 x}\, e^{-x^2/2}\, dx \le \frac{e^{-(1-\sigma_1^2)s}\, e^{A\sqrt{2s}}}{\sqrt{2\pi}} \int_{-A/\sigma_1}^{A/\sigma_1} e^{-x^2/2}\, dx \le e^{-(1-\sigma_1^2)s}\, e^{A\sqrt{2s}} \to 0, \quad s\to\infty. \qquad (9.2.33)
\]

Note that here we used the condition that x_i ∈ G_{s,A,1/2}; otherwise this contribution would diverge. Now we control (T2). Using a second moment bound and dropping the useless parts of the conditions on the Brownian bridges,
\[
(T2) \le K e^{s} \int_0^s e^{s-q} \int_{\sqrt{2}\sigma_1^2 q - I_1(q,s)}^{\sqrt{2}\sigma_1^2 q + I_1(q,s)} \Big( \int_{\sqrt{2}\sigma_1^2 s - A\sqrt{s} - x}^{\sqrt{2}\sigma_1^2 s + A\sqrt{s} - x} e^{-s(1+\sigma_1^2)+\sqrt{2}(x+y)}\, e^{-\frac{y^2}{2\sigma_1^2(s-q)}}\, \frac{dy}{\sigma_1\sqrt{2\pi(s-q)}} \Big)^2 e^{-\frac{x^2}{2\sigma_1^2 q}}\, \frac{dx\, dq}{\sigma_1\sqrt{2\pi q}}, \qquad (9.2.34)
\]

where K = \sum_{k=1}^{\infty} p_k k(k-1) and I_1(q,s) = Aq/\sqrt{s} + ((q\wedge(s-q))\vee r)^{\gamma}. We change variables x = z + \sqrt{2}\sigma_1^2 q and obtain
\[
K e^{s} \int_0^s e^{s-q} \int_{-I_1(q,s)}^{+I_1(q,s)} \Big( \int_{\sqrt{2}\sigma_1^2(s-q) - A\sqrt{s} - z}^{\sqrt{2}\sigma_1^2(s-q) + A\sqrt{s} - z} e^{-s(1+\sigma_1^2)+\sqrt{2}(z+\sqrt{2}\sigma_1^2 q + y)}\, e^{-\frac{y^2}{2\sigma_1^2(s-q)}}\, \frac{dy}{\sigma_1\sqrt{2\pi(s-q)}} \Big)^2 e^{-\frac{(z+\sqrt{2}\sigma_1^2 q)^2}{2\sigma_1^2 q}}\, \frac{dz\, dq}{\sigma_1\sqrt{2\pi q}}. \qquad (9.2.35)
\]

Now we change variables w = \frac{y}{\sigma_1\sqrt{s-q}} - \sqrt{2}\sigma_1\sqrt{s-q}. Then (9.2.35) is equal to

\[
K \int_0^s \int_{-I_1(q,s)}^{+I_1(q,s)} e^{-q(1-2\sigma_1^2)}\, e^{2\sqrt{2}z} \Big( \int_{\frac{-A\sqrt{s}-z}{\sigma_1\sqrt{s-q}}}^{\frac{+A\sqrt{s}-z}{\sigma_1\sqrt{s-q}}} e^{-w^2/2}\, \frac{dw}{\sqrt{2\pi}} \Big)^2 e^{-\frac{(z+\sqrt{2}\sigma_1^2 q)^2}{2\sigma_1^2 q}}\, \frac{dz\, dq}{\sqrt{2\pi\sigma_1^2 q}}. \qquad (9.2.36)
\]
Now the integral with respect to w is bounded by 1. Hence (9.2.36) is bounded from above by
\[
K \int_0^s \int_{-I_1(q,s)/(\sigma_1\sqrt{q})}^{+I_1(q,s)/(\sigma_1\sqrt{q})} e^{-q(1-2\sigma_1^2)}\, e^{-\frac{(z-\sqrt{2}\sigma_1\sqrt{q})^2}{2}}\, \frac{dz\, dq}{\sqrt{2\pi}}. \qquad (9.2.37)
\]

We split the integral over q into the three parts R1, R2, and R3, corresponding to the integration from 0 to r, from r to s − r, and from s − r to s, respectively. Then
\[
R2 \le K \int_r^{s-r} e^{-q(1-2\sigma_1^2)}\, \frac{e^{-\frac12\big(I_1(q,s)/(\sigma_1\sqrt{q})-\sqrt{2}\sigma_1\sqrt{q}\big)^2}}{\sqrt{2\pi}\,\big(\sqrt{2}\sigma_1\sqrt{q} - I_1(q,s)/(\sigma_1\sqrt{q})\big)}\, dq. \qquad (9.2.38)
\]

This is bounded by

\[
K \int_r^{s-r} e^{-(1-\sigma_1^2)q + O(q^{\gamma})}\, dq \le \frac{C}{1-\sigma_1^2}\, e^{-c(1-\sigma_1^2)r}. \qquad (9.2.39)
\]

For R1 the integral over z can only be bounded by one. This gives
\[
R1 \le K \int_0^r e^{(2\sigma_1^2-1)q}\, dq \equiv D_1(r). \qquad (9.2.40)
\]

R3 can be treated the same way as R2 and we get

\[
R3 \le K \int_{s-r}^{s} e^{-(1-\sigma_1^2)q+O(r^{\gamma})}\, dq \le \frac{K}{1-\sigma_1^2}\, e^{-(1-\sigma_1^2)(s-r)+O(r^{\gamma})} \to 0, \quad s\to\infty. \qquad (9.2.41)
\]
Putting all three estimates together, we see that \sup_s E\big[(Y_s^A)^2\big] \le D_2(r). From this it follows that Y_s is uniformly integrable. Namely,
\[
E[Y_s \mathbb{1}_{Y_s>z}] = E[Y_s^A \mathbb{1}_{Y_s>z}] + E[(Y_s-Y_s^A)\mathbb{1}_{Y_s>z}] \qquad (9.2.42)
\]
\[
= E\big[Y_s^A \mathbb{1}_{Y_s^A>z/2}\big] + E\big[Y_s^A\big(\mathbb{1}_{Y_s>z}-\mathbb{1}_{Y_s^A>z/2}\big)\big] + E\big[(Y_s-Y_s^A)\mathbb{1}_{Y_s>z}\big].
\]

For the first term we have
\[
E\big[Y_s^A \mathbb{1}_{Y_s^A>z/2}\big] \le \frac{2}{z}\, E\big[(Y_s^A)^2\big] \le \frac{2}{z}\, D_2(r). \qquad (9.2.43)
\]
For the second, we have

\[
E\big[Y_s^A\big(\mathbb{1}_{Y_s>z}-\mathbb{1}_{Y_s^A>z/2}\big)\big] \le E\big[Y_s^A\, \mathbb{1}_{Y_s-Y_s^A\ge z/2}\, \mathbb{1}_{Y_s^A\le z/2}\big] \le \frac{z}{2}\, P\big[(Y_s-Y_s^A) > z/2\big] \le E\big[Y_s-Y_s^A\big]. \qquad (9.2.44)
\]

The last term in (9.2.42) is also bounded by E[Y_s − Y_s^A]. Choosing now A and r such that E[Y_s − Y_s^A] ≤ ε/3, and then z so large that \frac{2}{z}D_2(r) ≤ ε/3, we obtain that E[Y_s 1_{Y_s>z}] ≤ ε, for large enough z, uniformly in s. Thus Y_s is uniformly integrable, which we wanted to show. The remainder of the assertion of the theorem is now standard theory. □
We will also need to control the processes
\[
\tilde Y^A_{s,\gamma} = \sum_{i=1}^{n(s)} e^{-(1+\sigma_1^2)s+\sqrt{2}\,x_i(s)}\, \mathbb{1}_{x_i\in\mathcal G_{s,A,\gamma}}.
\]
Lemma 9.11. The family of random variables \tilde Y^A_{s,\gamma}, s, A ∈ R_+, 1 > γ > 1/2, is uniformly integrable and converges, as s ↑ ∞ and A ↑ ∞, to Y, both in probability and in L¹.

Proof. The proof of uniform integrability is a rerun of the proof of Theorem 9.9, noting that the bounds on the truncated second moments are uniform in A. Moreover, the same computation as in Eq. (9.2.30) shows that E|Y_s − \tilde Y^A_{s,\gamma}| ≤ ε, uniformly in s, for A large enough. Therefore,

\[
\lim_{A\uparrow\infty} \limsup_{s\uparrow\infty} E\big|Y_s - \tilde Y^A_{s,\gamma}\big| = 0, \qquad (9.2.45)
\]

which implies that Y_s − \tilde Y^A_{s,\gamma} converges to zero in probability. Since Y_s converges to Y almost surely, we arrive at the second assertion of the lemma. □

9.2.3 Asymptotic behaviour of BBM

We now need to control how the particles in the second epoch of high variance make it up to the top. This requires control on the asymptotics of the F-KPP equation even further ahead of the travelling wave than was done in Chapter 7. More precisely, we need to control solutions u(t, x + √2t) for values x = at, a > 0. The following lemma is an extension of Lemma 7.3 to these values of x.

Lemma 9.12. Let u be a solution to the F-KPP equation with initial data satisfying
(i) 0 ≤ u(0,x) ≤ 1;
(ii) for some h > 0, \limsup_{t\to\infty} \frac{1}{t} \log \int_t^{t(1+h)} u(0,y)\, dy \le -\sqrt{2};
(iii) for some v > 0, M > 0, N > 0, it holds that \int_x^{x+N} u(0,y)\, dy > v for all x \le -M;
(iv) moreover, \int_0^{\infty} u(0,y)\, y\, e^{2y}\, dy < \infty.
Then we have, for x = at + o(t),
\[
\lim_{t\to\infty} e^{\sqrt{2}x}\, e^{x^2/2t}\, t^{1/2}\, u(t, x+\sqrt{2}t) = C(a), \qquad (9.2.46)
\]

where C(a) is a strictly positive constant. The convergence is uniform for a in compact intervals.
Proof. Recall the definition of the function ψ(r,t,x+√2t) from (6.1.9). As in the proof of Lemma 7.3, the claimed result follows from the next lemma, which provides the asymptotics of ψ.
Lemma 9.13. For x = at + o(t) we have, under the assumptions of Lemma 9.12,
\[
\lim_{t\to\infty} e^{\sqrt{2}x}\, e^{x^2/2t}\, t^{1/2}\, \psi(r,t,x+\sqrt{2}t) = \frac{1}{\sqrt{2\pi}} \int_0^{\infty} e^{-a^2 r/2}\, u(r, y+\sqrt{2}r)\, e^{(\sqrt{2}+a)y}\big(1-e^{-2ay}\big)\, dy \equiv C(r,a). \qquad (9.2.47)
\]
The convergence is uniform for a in a compact set.
The proof of this lemma is fairly analogous to the corresponding estimate in Chapter 7 and we do not give the details. See again [17]. The proof of Lemma 9.12 is now concluded as in the proof of Proposition 7.1. □
We also need the following variant of Lemma 7.6.
Lemma 9.14. Let u be a solution of the F-KPP equation with initial data satisfying Assumptions (i)-(iv) of Lemma 9.12. Let

\[
C(a) = \lim_{t\to\infty} \frac{1}{\sqrt{2\pi}} \int_0^{\infty} u(t, z+\sqrt{2}t)\, e^{(\sqrt{2}+a)z - a^2 t/2}\, dz. \qquad (9.2.48)
\]

Then for any x ∈ R:
\[
\lim_{t\to\infty} \frac{1}{\sqrt{2\pi}} \int_0^{\infty} u(t, x+z+\sqrt{2}t)\, e^{(\sqrt{2}+a)z-a^2t/2}\, dz = C(a)\, e^{-(\sqrt{2}+a)x}. \qquad (9.2.49)
\]
Moreover, for any bounded continuous function h(x) that is zero for x small enough,

\[
\lim_{t\to\infty} \int_{-\infty}^{0} E\Big[h\Big(y+\max_i \bar x_i(t) - \sqrt{2}t\Big)\Big]\, \frac{1}{\sqrt{2\pi}}\, e^{-(\sqrt{2}+a)y - a^2t/2}\, dy = C(a) \int_{\mathbb R} h(z)\, (\sqrt{2}+a)\, e^{-(\sqrt{2}+a)z}\, dz, \qquad (9.2.50)
\]

where {x̄_i(t), i ≤ n(t)} are the particles of a standard BBM with variance 1. Here C(a) is the constant from (9.2.48) for u satisfying the initial condition 1_{x≤0}. We skip the proof, which is completely analogous to that of Lemma 7.6.
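The role of the factor e^{−a²t/2} in (9.2.48) becomes transparent once one inserts the tail behaviour u(t, z+√2t) ≈ C t^{−1/2} e^{−√2z−z²/2t} from Lemma 9.12: the integrand is then proportional to the N(at, t) Gaussian density, whose mass on (0,∞) is Φ(a√t) → 1, so the integral has a finite, non-degenerate limit. This Gaussian bookkeeping (only the bookkeeping, not the F-KPP solution itself) can be checked numerically:

```python
import math

def gaussian_mass_on_positive_axis(a, t):
    """Mass on (0, infinity) of the N(a*t, t) density, i.e. the integral
    over z > 0 of (2*pi*t)**(-1/2) * exp(a*z - z**2/(2*t) - a**2*t/2),
    which equals Phi(a*sqrt(t))."""
    return 0.5 * (1.0 + math.erf(a * math.sqrt(t) / math.sqrt(2.0)))

# The mass tends to 1 as t grows, so (9.2.48) stays of order one.
masses = [gaussian_mass_on_positive_axis(a=0.5, t=t) for t in (1.0, 10.0, 100.0)]
```

The same computation explains why C(a) depends on a continuously and stays bounded on compact intervals of a > 0.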

9.2.4 Convergence of the maximum

We first show the convergence of the law of the maximum. The following theorem is more precise than the statement in Theorem 9.2.

Theorem 9.15. Let {x_k(t), 1 ≤ k ≤ n(t)} be the particles of two-speed BBM with σ1 < σ2. Set a = √2(σ2 − 1). Then

\[
\lim_{t\to\infty} P\Big[\max_{1\le k\le n(t)} x_k(t) - \tilde m(t) \le y\Big] = E\Big[\exp\big(-\sigma_2 C(a) Y e^{-\sqrt{2}y}\big)\Big]. \qquad (9.2.51)
\]

Y is the limit of the McKean martingale from the last section, and 0 < C(a) < ∞ is the positive constant given by

\[
C(a) = \lim_{r\to\infty} \int_0^{\infty} e^{-a^2 r/2}\, P\Big(\max_{k\le n(r)} \bar x_k(r) > z + \sqrt{2}r\Big)\, e^{(\sqrt{2}+a)z}\big(1-e^{-2az}\big)\, dz, \qquad (9.2.52)
\]

where {x̄_k(t), k ≤ n(t)} are the particles of a standard BBM.

Proof. Denote by {x_i(bt), 1 ≤ i ≤ n(bt)} the particles of a BBM with variance σ1² at time bt, and by F_{bt} the σ-algebra generated by this BBM. Moreover, for 1 ≤ i ≤ n(bt), let {x_j^i((1−b)t), 1 ≤ j ≤ n_i((1−b)t)} denote the particles of independent BBMs with variance σ2² at time (1−b)t. By second moment estimates one can easily show that the order of the maximum is m̃(t), i.e. for any ε > 0 there exists d < ∞ such that
\[
P\Big[\max_{1\le k\le n(t)} x_k(t) - \tilde m(t) \le -d\Big] \le \varepsilon/2. \qquad (9.2.53)
\]

Therefore,
\[
P\Big[-d \le \max_{1\le k\le n(t)} x_k(t) - \tilde m(t) \le y\Big] \le P\Big[\max_{1\le k\le n(t)} x_k(t) - \tilde m(t) \le y\Big] \le P\Big[-d \le \max_{1\le k\le n(t)} x_k(t) - \tilde m(t) \le y\Big] + \varepsilon/2. \qquad (9.2.54)
\]

For details see [35]. On the other hand, by Proposition 9.4, all the particles that may contribute to the event in question are localised at time bt in the narrow window described by G_{bt,A,1/2}. Thus we obtain
\[
P\Big[\max_{1\le k\le n(t)} x_k(t) - \tilde m(t) \le y\Big] = P\Big[\max_{1\le i\le n(bt)}\ \max_{1\le j\le n_i((1-b)t)} \big(x_i(bt) + x_j^i((1-b)t)\big) - \tilde m(t) \le y\Big]
\]
\[
= E\Big[\prod_{1\le i\le n(bt)} P\Big(\max_{1\le j\le n_i((1-b)t)} x_j^i((1-b)t) \le \tilde m(t) - x_i(bt) + y \,\Big|\, \mathcal F_{bt}\Big)\Big]
\]
\[
\le E\Big[\prod_{\substack{1\le i\le n(bt)\\ x_i\in\mathcal G_{bt,A,\frac12}}} P\Big(\max_{1\le j\le n_i((1-b)t)} \sigma_2^{-1} x_j^i((1-b)t) \le \sigma_2^{-1}\big(\tilde m(t) - x_i(bt) + y\big) \,\Big|\, \mathcal F_{bt}\Big)\Big] + \varepsilon. \qquad (9.2.55)
\]

The corresponding lower bound holds without the ε. We write the probability in (9.2.55) as
\[
1 - P\Big(\max_{1\le j\le n_i((1-b)t)} \bar x_j^i((1-b)t) > \sigma_2^{-1}\big(\tilde m(t) - x_i(bt) + y\big) \,\Big|\, \mathcal F_{bt}\Big) = 1 - u\big((1-b)t,\, \sigma_2^{-1}(\tilde m(t) - x_i(bt) + y)\big), \qquad (9.2.56)
\]

where x̄_j^i((1−b)t) are the particles of a standard BBM and u is the solution of the F-KPP equation with Heaviside initial conditions. By Lemma 9.12, and setting
\[
C_t(x) \equiv e^{\sqrt{2}x + x^2/2t}\, t^{1/2}\, u(t, x+\sqrt{2}t), \qquad (9.2.57)
\]

we have that

\[
u\big((1-b)t,\, \sigma_2^{-1}(\tilde m(t)-x_i(bt)+y)\big) = C_{(1-b)t}\Big(\tfrac{\tilde m(t)-x_i(bt)+y}{\sigma_2} - \sqrt{2}(1-b)t\Big)\, e^{-\sqrt{2}\big(\frac{\tilde m(t)-x_i(bt)+y}{\sigma_2}-\sqrt{2}(1-b)t\big)}\, e^{-\frac{1}{2(1-b)t}\big(\frac{\tilde m(t)-x_i(bt)+y}{\sigma_2}-\sqrt{2}(1-b)t\big)^2}\, \big((1-b)t\big)^{-1/2}. \qquad (9.2.58)
\]
Now all the x_i(bt) that appear are of the form x_i(bt) = \sqrt{2}\sigma_1^2 bt + O(\sqrt{t}), so that
\[
C_{(1-b)t}\Big(\tfrac{\tilde m(t)-x_i(bt)+y}{\sigma_2} - \sqrt{2}(1-b)t\Big) = C_{(1-b)t}\big(a(1-b)t + O(\sqrt{t})\big), \qquad (9.2.59)
\]
where (using (9.2.2))
\[
a \equiv \frac{1}{1-b}\Big(\frac{\sqrt{2}-\sqrt{2}\sigma_1^2 b}{\sigma_2} - \sqrt{2}(1-b)\Big) = \sqrt{2}(\sigma_2-1). \qquad (9.2.60)
\]

By Lemma 9.12,
\[
\lim_{t\uparrow\infty} C_{(1-b)t}\Big(\tfrac{\tilde m(t)-x_i(bt)+y}{\sigma_2} - \sqrt{2}(1-b)t\Big) = C(a), \qquad (9.2.61)
\]

with uniform convergence for all i appearing in (9.2.55), where C(a) is the constant given by (9.2.52). After a little algebra we can rewrite the expectation in (9.2.55) as

\[
E\Big[\prod_{\substack{1\le i\le n(bt)\\ x_i\in\mathcal G_{bt,A,1/2}}} \Big\{1 - C(a)\big((1-b)t\big)^{-1/2}\, e^{(1-b)t - \frac{(\tilde m(t)+y-x_i(bt))^2}{2(1-b)t\sigma_2^2}}\, (1+o(1))\Big\}\Big]. \qquad (9.2.62)
\]
Using that x_i(bt) - \sqrt{2}\sigma_1^2 bt \in [-A\sqrt{t}, A\sqrt{t}], we have the uniform bounds

\[
\exp\Big((1-b)t - \frac{(\tilde m(t)+y-x_i(bt))^2}{2(1-b)t\sigma_2^2}\Big) \le \exp\big((1-\sigma_2^2)(1-b)t + \log t + A\sqrt{t}\big), \qquad (9.2.63)
\]

which is exponentially small in t since σ2² > 1. Hence (9.2.62) is equal to

\[
E\Big[\prod_{\substack{1\le i\le n(bt)\\ x_i\in\mathcal G_{bt,A,1/2}}} \exp\Big(-C(a)\big((1-b)t\big)^{-1/2}\, e^{(1-b)t-\frac{(\tilde m(t)+y-x_i(bt))^2}{2(1-b)t\sigma_2^2}}\, (1+o(1))\Big)\Big]. \qquad (9.2.64)
\]
Expanding the square in the exponent in the last line and keeping only the relevant terms yields

\[
(1-b)t - \frac{(\tilde m(t)+y-x_i(bt))^2}{2(1-b)t\sigma_2^2} = (1-b)t - \sigma_2^2(1-b)t - \sqrt{2}y - 2\sigma_1^2 bt + \sqrt{2}x_i(bt) + \tfrac12\ln t - \frac{\big(\sqrt{2}t\sigma_1^2 b - x_i(bt)\big)^2}{2(1-b)\sigma_2^2 t} + o(1)
\]
\[
= -\sqrt{2}y - bt(1+\sigma_1^2) + \sqrt{2}x_i(bt) + \tfrac12\ln t - \frac{\big(\sqrt{2}t\sigma_1^2 b - x_i(bt)\big)^2}{2(1-b)\sigma_2^2 t} + o(1). \qquad (9.2.65)
\]

The terms up to the last one would nicely combine to produce the McKean martingale as coefficient of C(a). Namely,

\[
\prod_{\substack{1\le i\le n(bt)\\ x_i\in\mathcal G_{bt,A,1/2}}} \exp\Big(-C(a)\big((1-b)t\big)^{-1/2}\, e^{-\sqrt{2}y - bt(1+\sigma_1^2) + \sqrt{2}x_i(bt) + \frac12\ln t}\Big) = \exp\Big(-\frac{C(a)}{\sqrt{1-b}}\, e^{-\sqrt{2}y} \sum_{\substack{1\le i\le n(bt)\\ x_i\in\mathcal G_{bt,A,1/2}}} e^{-bt(1+\sigma_1^2)+\sqrt{2}\,x_i(bt)}\Big). \qquad (9.2.66)
\]

However, the last terms are of order one and cannot be neglected. To deal with them, we split the process at time b√t. We write, somewhat abusively, x_i(bt) = x_i(b√t) + x_l^{(i)}(b(t−√t)), where we understand that x_i(b√t) is the ancestor at time b√t of the particle that at time t is labelled i if we think backwards from time t, while the labels of the particles at time b√t run only over the distinct ones, i.e. up to n(b√t), if we think in the forward direction. No confusion should occur if this is kept in mind. Using Propositions 9.7 and 9.8 we can further localise the path of the particle. Recalling the definitions of G_{s,A,γ} and T_{s,r}, we rewrite (9.2.64), up to a term of order ε, as
\[
E\Big[\prod_{\substack{1\le i\le n(b\sqrt{t})\\ x_i\in\mathcal G_{b\sqrt{t},B,\gamma}}} E\Big[\prod_{\substack{1\le l\le n^{(i)}(b(t-\sqrt{t}))\\ x_i\in\mathcal G_{bt,A,\frac12};\ x_l^{(i)}\in\mathcal T_{b(t-\sqrt{t}),r}}} \exp\Big(-C(a)\big((1-b)t\big)^{-1/2}\, e^{(1-b)t - \frac{(\tilde m(t)+y-x_i(b\sqrt{t})-x_l^{(i)}(b(t-\sqrt{t})))^2}{2(1-b)t\sigma_2^2}}\, (1+o(1))\Big) \,\Big|\, \mathcal F_{b\sqrt{t}}\Big]\Big]. \qquad (9.2.67)
\]
Using that x_i(b√t) + x_l^{(i)}(b(t−√t)) − √2σ1²bt ∈ [−A√t, A√t] and m̃(t) = √2t − (1/(2√2))log t, we can rewrite the terms multiplying C(a) in (9.2.67) as
\[
\exp\Big(-(1+\sigma_1^2)bt + \sqrt{2}\big(x_i(b\sqrt{t}) + x_l^{(i)}(b(t-\sqrt{t}))\big) - \tfrac12\log(1-b) - \sqrt{2}y - \frac{\big(x_i(b\sqrt{t})+x_l^{(i)}(b(t-\sqrt{t})) - \sqrt{2}\sigma_1^2 bt\big)^2}{2(1-b)\sigma_2^2 t} + O(1/\sqrt{t})\Big)
\]
\[
\equiv \mathcal E(x_i, x_l^{(i)}) = \mathcal E\big(x_i(b\sqrt{t}), x_l^{(i)}(b(t-\sqrt{t}))\big) = \mathcal E\big(x_i(b\sqrt{t}), x_i(bt)-x_i(b\sqrt{t})\big). \qquad (9.2.68)
\]

Now (9.2.67) takes the form
\[
E\Big[\prod_{\substack{1\le i\le n(b\sqrt{t})\\ x_i\in\mathcal G_{b\sqrt{t},B,\gamma}}} E\Big[\exp\Big(-C(a) \sum_{\substack{1\le l\le n^{(i)}(b(t-\sqrt{t}))\\ x_i\in\mathcal G_{bt,A,\frac12};\ x_l^{(i)}\in\mathcal T_{b(t-\sqrt{t}),r}}} \mathcal E(x_i,x_l^{(i)})\,(1+o(1))\Big) \,\Big|\, \mathcal F_{b\sqrt{t}}\Big]\Big]. \qquad (9.2.69)
\]
The idea is that the exponential in the conditional expectation can be replaced by its linear approximation, and that the linear term can be nicely computed. With
\[
\mathcal X \equiv \sum_{\substack{1\le l\le n^{(i)}(b(t-\sqrt{t}))\\ x_i\in\mathcal G_{bt,A,\frac12};\ x_l^{(i)}\in\mathcal T_{b(t-\sqrt{t}),r}}} \mathcal E(x_i, x_l^{(i)}), \qquad (9.2.70)
\]
and performing some fairly straightforward algebra, we see that

\[
E[\mathcal X|\mathcal F_{b\sqrt{t}}] \le e^{-b(\sigma_1^2 t+\sqrt{t}) - \frac12\log(1-b) - \sqrt{2}y} \int_{K_t-A\sqrt{t}}^{K_t+A\sqrt{t}} e^{\sqrt{2}(z+x_i(b\sqrt{t})) - \frac{(z+x_i(b\sqrt{t})-\sqrt{2}\sigma_1^2 bt)^2}{2\sigma_2^2(1-b)t}}\, e^{-\frac{z^2}{2\sigma_1^2 b(t-\sqrt{t})}}\, \frac{dz}{\sqrt{2\pi\sigma_1^2 b(t-\sqrt{t})}}
\]
\[
= e^{-(1+\sigma_1^2)b\sqrt{t} + \sqrt{2}x_i(b\sqrt{t}) - \frac12\log(1-b) - \sqrt{2}y}\, \Big(\frac{\sigma_2^2(1-b)}{1-\sigma_1^2 b/\sqrt{t}}\Big)^{1/2} \int_{-B\sqrt{t}}^{B\sqrt{t}} e^{-w^2/2t}\, \frac{dw}{\sqrt{2\pi t}}
\]
\[
= e^{-(1+\sigma_1^2)b\sqrt{t} + \sqrt{2}x_i(b\sqrt{t}) - \sqrt{2}y}\, \sigma_2\,(1+o(1)), \qquad (9.2.71)
\]

where o(1) ≤ O(t^{γ−1}) and B = A/(σ1σ2√(b(1−b/√t))). Note that the inequality comes from the fact that we dropped the tube conditions. However, since the Brownian bridge is independent of its endpoint and satisfies the tube condition with probability at least 1 − ε (see Lemma 9.6), the right-hand side of (9.2.71) multiplied by an additional factor (1 − ε) is a lower bound. This is what we want. To justify the replacement of the exponential by the linear approximation, due to the inequalities
\[
1 - x \le e^{-x} \le 1 - x + \tfrac12 x^2, \qquad x > 0, \qquad (9.2.72)
\]
we just need to control the second moment. But

\[
E[\mathcal X^2|\mathcal F_{b\sqrt{t}}] \le e^{-2(1+\sigma_1^2)b\sqrt{t} + 2\sqrt{2}x_i(b\sqrt{t}) - 2\sqrt{2}y}\, E\Big[\big(Y^A_{b(t-\sqrt{t})}\big)^2\Big], \qquad (9.2.73)
\]

where Y^A_{b(t−√t)} is the truncated McKean martingale defined in (9.2.29). Note that its second moment is bounded by D_2(r) (see (9.2.43)). Comparing this to (9.2.73), one sees that

\[
\frac{E[\mathcal X^2|\mathcal F_{b\sqrt{t}}]}{E[\mathcal X|\mathcal F_{b\sqrt{t}}]} \le D_2(r)\, e^{-(1+\sigma_1^2)b\sqrt{t} + \sqrt{2}x_i(b\sqrt{t})} \le C\, e^{-(1-\sigma_1^2)b\sqrt{t} + O(t^{\gamma/2})}, \qquad (9.2.74)
\]

which tends to zero uniformly as t ↑ ∞. Thus the second moment term is negligible. Hence we only have to control

\[
E\Big[\prod_{\substack{1\le i\le n(b\sqrt{t})\\ x_i\in\mathcal G_{b\sqrt{t},B,\gamma}}} \Big(1 - C(a)\, e^{-(1+\sigma_1^2)b\sqrt{t}+\sqrt{2}x_i(b\sqrt{t})-\sqrt{2}y}\, \sigma_2\Big)\Big]
= E\Big[\exp\Big(-C(a)\sigma_2\, e^{-\sqrt{2}y} \sum_{\substack{1\le i\le n(b\sqrt{t})\\ x_i\in\mathcal G_{b\sqrt{t},B,\gamma}}} e^{-(1+\sigma_1^2)b\sqrt{t}+\sqrt{2}x_i(b\sqrt{t})}\, (1+o(1))\Big)\Big]
\]
\[
= E\Big[\exp\Big(-C(a)\sigma_2\, e^{-\sqrt{2}y}\, \tilde Y^B_{b\sqrt{t},\gamma}\, (1+o(1))\Big)\Big], \qquad (9.2.75)
\]

where

\[
\tilde Y^B_{b\sqrt{t},\gamma} = \sum_{i=1}^{n(b\sqrt{t})} e^{-(1+\sigma_1^2)b\sqrt{t}+\sqrt{2}\,x_i(b\sqrt{t})}\, \mathbb{1}_{\{x_i(b\sqrt{t})-\sqrt{2}\sigma_1^2 b\sqrt{t} \in [-Bt^{\gamma/2},\, Bt^{\gamma/2}]\}}. \qquad (9.2.76)
\]

Now, by Lemma 9.11, \tilde Y^B_{b\sqrt{t},\gamma} converges in probability and in L¹ to the random variable Y, when we let first t and then B tend to infinity. Since \tilde Y^B_{b\sqrt{t},\gamma} ≥ 0 and C(a) > 0, it follows that
\[
\lim_{B\uparrow\infty} \liminf_{t\uparrow\infty} E\Big[\exp\big(-C(a)\sigma_2 \tilde Y^B_{b\sqrt{t},\gamma}\, e^{-\sqrt{2}y}\big)\Big] = E\Big[\exp\big(-\sigma_2 C(a)\, Y e^{-\sqrt{2}y}\big)\Big]. \qquad (9.2.77)
\]
Finally, letting r tend to +∞, all the ε-errors (that are still implicitly present) vanish. This concludes the proof of Theorem 9.15. □

9.2.5 Existence of the limiting process

The following existence theorem is the basic step in the proof of Theorem 9.2.

Theorem 9.16. Let σ1 < σ2. Then the point processes E_t = ∑_{k≤n(t)} δ_{x_k(t)−m̃(t)} converge in law to a non-trivial point process E.

It suffices to show that, for φ ∈ C_c(R) positive, the Laplace functional
\[
\Psi_t(\phi) = E\Big[\exp\Big(-\int \phi(y)\, \mathcal E_t(dy)\Big)\Big], \qquad (9.2.78)
\]

of the processes Et converges. The proof of this is essentially a combination of the corresponding proof for ordinary BBM and what we did when showing convergence of the maximum. The result is that

\[
\lim_{t\uparrow\infty} \Psi_t(\phi) = E\big[\exp\big(-\sigma_2 C(a,\phi)\, Y\big)\big], \qquad (9.2.79)
\]

where C(a,φ) is given by

\[
C(a,\phi) = \lim_{t\to\infty} \frac{1}{\sqrt{2\pi}} \int_0^{\infty} v(t, z+\sqrt{2}t)\, e^{(\sqrt{2}+a)z - a^2t/2}\, dz, \qquad (9.2.80)
\]
where v(t,x) is the solution of the F-KPP equation with initial data v(0,x) = 1 − e^{−φ̄(−x)}. Here φ̄(z) ≡ φ(σ2 z).

9.2.6 The auxiliary process

The final step is the interpretation of the limiting process. This is again very much analogous to the standard BBM case. Let (ηi; i ∈ N) be the atoms of a Poisson point

process η on (−∞,0) with intensity measure
\[
\frac{\sigma_2}{\sqrt{2\pi}}\, e^{-(\sqrt{2}+a)z}\, e^{-a^2 t/2}\, dz. \qquad (9.2.81)
\]

For each i ∈ N consider independent standard BBMs x̄^i. The auxiliary point process of interest is the superposition of the i.i.d. BBMs with drift, shifted by ηi + (1/(√2+a)) log Y, where a is the constant defined in (9.2.60):

\[
\Pi_t = \sum_{i,k} \delta_{\big(\eta_i + \frac{1}{\sqrt{2}+a}\log Y + \bar x_k^i(t) - \sqrt{2}t\big)\sigma_2}. \qquad (9.2.82)
\]

Remark 9.17. The form of the auxiliary process is similar to the case of standard BBM, but with a different intensity of the Poisson process. In particular, the intensity decays exponentially with t. This is a consequence of the fact that particles at the time of the speed change were forced to be O(t) below the line √2t, in contrast to the O(√t) in the case of ordinary BBM. The reduction of the intensity of the process with t accounts for the fact that the particles have to be selected from these locations.
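It is worth recording the elementary mechanism behind the double-exponential laws appearing here: a Poisson point process on R with intensity c√2 e^{−√2x} dx satisfies P(max ≤ y) = exp(−c e^{−√2y}), and under the substitution u = e^{−√2x} it becomes a homogeneous process of rate c on (0,∞), so its largest atom is −log(U)/√2 with U ~ Exp(c). A quick simulation check (the constant c is an arbitrary stand-in for quantities like σ2C(a)Y with Y fixed):

```python
import math
import random

def sample_ppp_max(c, rng):
    """Largest atom of a PPP on R with intensity c*sqrt(2)*exp(-sqrt(2)*x)dx.
    The image measure under u = exp(-sqrt(2)*x) is c*du on (0, inf), so
    the smallest image point U is Exp(c) and the maximum is -log(U)/sqrt(2)."""
    u = rng.expovariate(c)
    return -math.log(u) / math.sqrt(2.0)

def empirical_cdf_at_zero(c=1.0, n=4000, seed=3):
    """Empirical P(max <= 0); the theoretical value is exp(-c)."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n) if sample_ppp_max(c, rng) <= 0.0)
    return hits / n

p = empirical_cdf_at_zero()
```

Replacing the fixed constant c by a random σ2C(a)Y and averaging over Y produces exactly the mixed law on the right-hand side of (9.2.51).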

Theorem 9.18. Let Et be the extremal process of the two-speed BBM. Then

\[
\lim_{t\to\infty} \mathcal E_t \overset{\text{law}}{=} \lim_{t\to\infty} \Pi_t. \qquad (9.2.83)
\]
As the proof is similar in nature to that in the standard case, we skip the details. The following proposition shows that, in spite of the different Poisson ingredients, when we look at the process of the maxima of each of the BBMs x̄^i, we end up with a Poisson point process just like in the standard BBM case.

Proposition 9.19. Define the point process

\[
\Pi_t^{\mathrm{ext}} \equiv \sum_{i} \delta_{\big(\eta_i + \frac{1}{\sqrt{2}+a}\log Y + \max_{k\le n_i(t)} \bar x_k^i(t) - \sqrt{2}t\big)\sigma_2}. \qquad (9.2.84)
\]

Then
\[
\lim_{t\to\infty} \Pi_t^{\mathrm{ext}} \overset{\text{law}}{=} P_Y \equiv \sum_{i\in\mathbb N} \delta_{p_i}, \qquad (9.2.85)
\]
where P_Y is the Poisson point process on R with intensity measure σ2 C(a) Y √2 e^{−√2x} dx.

Proof. We consider the Laplace functional of Π_t^{ext}. Let M^{(i)}(t) = max_k x̄_k^{(i)}(t) and, as before, φ̄(z) = φ(σ2 z). We want to show

\[
\lim_{t\uparrow\infty} E\Big[\exp\Big(-\sum_i \bar\phi\big(\eta_i + M^{(i)}(t) - \sqrt{2}t\big)\Big)\Big] = \exp\Big(-\sigma_2 C(a) \int_{-\infty}^{\infty} \big(1-e^{-\phi(x)}\big)\sqrt{2}\, e^{-\sqrt{2}x}\, dx\Big). \qquad (9.2.86)
\]

Since η is a Poisson point process and the M^{(i)} are i.i.d., we have

\[
E\Big[\exp\Big(-\sum_i \bar\phi\big(\eta_i + M^{(i)}(t) - \sqrt{2}t\big)\Big)\Big] = \exp\Big(-\sigma_2 \int_{-\infty}^{0} E\Big[1 - e^{-\bar\phi(z+M(t)-\sqrt{2}t)}\Big]\, e^{-(\sqrt{2}+a)z - a^2t/2}\, \frac{dz}{\sqrt{2\pi}}\Big), \qquad (9.2.87)
\]

where M(t) has the same distribution as the variables M^{(i)}(t). Now we apply Lemma 9.14 with h(x) = 1 − e^{−φ̄(x)}. Hence the result follows by using that φ̄(z) = φ(σ2z) and √2 + a = √2σ2, together with the change of variables x = σ2z. □
The following proposition states that the Poisson points of the auxiliary process that contribute to the limiting process come from a neighbourhood of −at.

Proposition 9.20. Let z ∈ R, ε > 0. Let ηi be the atoms of a Poisson point process with intensity measure Ce^{−(√2+a)x−a²t/2}dx on (−∞,0]. Then there exists B < ∞ such that
\[
\sup_{t\ge t_0} P\Big[\exists i,k \colon \eta_i + \bar x_k^i(t) - \sqrt{2}t \ge z,\ \eta_i \notin [-at-B\sqrt{t},\, -at+B\sqrt{t}]\Big] \le \varepsilon. \qquad (9.2.88)
\]
Proof. By a first order Chebyshev inequality we have

\[
P\Big[\exists i,k \colon \eta_i + \bar x_k^{(i)}(t) - \sqrt{2}t \ge z,\ \eta_i > -at+B\sqrt{t}\Big] \le C \int_{-at+B\sqrt{t}}^{0} P\Big(\max_k \bar x_k(t) \ge \sqrt{2}t - x + z\Big)\, e^{-(\sqrt{2}+a)x}\, e^{-a^2t/2}\, dx
\]
\[
= C \int_0^{at-B\sqrt{t}} P\Big(\max_k \bar x_k(t) \ge \sqrt{2}t + x + z\Big)\, e^{(\sqrt{2}+a)x}\, e^{-a^2t/2}\, dx, \qquad (9.2.89)
\]
by the change of variables x → −x. Using the asymptotics of Lemma 6.45 we can bound (9.2.89) from above by
\[
\rho C \int_0^{at-B\sqrt{t}} t^{-1/2}\, e^{-\sqrt{2}(x+z)}\, e^{-(x+z)^2/2t}\, e^{(\sqrt{2}+a)x}\, e^{-a^2t/2}\, dx \le \rho C\, e^{-(\sqrt{2}+a)z} \int_{-a\sqrt{t}}^{-B} e^{-u^2/2}\, du \le \rho C\, e^{-(\sqrt{2}+a)z}\, e^{-B^2/2}, \qquad (9.2.90)
\]
by the change of variables u = x/\sqrt{t} - a\sqrt{t}. For any z ∈ R, this can be made as small as we want by choosing B large enough. Similarly one bounds
\[
P\Big[\exists i,k \colon \eta_i + \bar x_k^i(t) - \sqrt{2}t \ge z,\ \eta_i < -at-B\sqrt{t}\Big] \le \rho C\, e^{-(\sqrt{2}+a)z}\, e^{-B^2/2}. \qquad (9.2.91)
\]

This concludes the proof. □
The next proposition describes the law of the clusters x̄^{(i)}. This is analogous to Theorem 7.16 (Theorem 3.4 in [6]).

Proposition 9.21. Let x = at + o(t) and let {x̃_k(t), k ≤ n(t)} be a standard BBM under the conditional law P[· | {max_k x̃_k(t) − √2t − x > 0}]. Then the point process

\[
\sum_{k\le n(t)} \delta_{\tilde x_k(t) - \sqrt{2}t - x} \qquad (9.2.92)
\]

converges in law under P[· | {max_k x̃_k(t) − √2t − x > 0}], as t → ∞, to a well-defined point process Ē. The limit does not depend on x − at, and the maximum of Ē has the law of an exponential random variable with parameter √2 + a.
Proof. Set \bar{\mathcal E}_t = \sum_k \delta_{\tilde x_k(t)-\sqrt{2}t} and \max\bar{\mathcal E}_t = \max_k \tilde x_k(t) - \sqrt{2}t. First we show that, for X > 0,
\[
\lim_{t\to\infty} P\big[\max \bar{\mathcal E}_t > X + x \,\big|\, \max \bar{\mathcal E}_t > x\big] = e^{-(\sqrt{2}+a)X}. \qquad (9.2.93)
\]

To see this we rewrite the conditional probability as P[max Ē_t > X+x]/P[max Ē_t > x] and use the uniform bounds from Proposition 6.2. Observing that
\[
\lim_{t\to\infty} \frac{\Psi(r,t,X+x+\sqrt{2}t)}{\Psi(r,t,x+\sqrt{2}t)} = e^{-(\sqrt{2}+a)X}, \qquad (9.2.94)
\]

where Ψ is defined in Equation (??), we get (9.2.93) by first taking t → ∞ and then r → ∞. The general claim of Proposition 9.21 follows from (9.2.93) in exactly the same way as Theorem 7.16. □

Define the gap process

\[
\mathcal D_t = \sum_k \delta_{\tilde x_k(t) - \max_j \tilde x_j(t)}. \qquad (9.2.95)
\]
Denote by ξ_i the atoms of the limiting process Ē, i.e. \bar{\mathcal E} \equiv \sum_j \delta_{\xi_j}, and define

\[
\mathcal D \equiv \sum_j \delta_{\Lambda_j}, \qquad \Lambda_j = \xi_j - \max_i \xi_i. \qquad (9.2.96)
\]

D is a point process on (−∞,0] with an atom at 0.

Corollary 9.22. Let x = −at + o(t). In the limit t → ∞, the random variables D_t and x + max Ē_t are conditionally independent on the event {x + max Ē_t > b}, for any b ∈ R. More precisely, for any bounded functions f, h and φ̄ ∈ C_c(R),
\[
\lim_{t\to\infty} E\Big[f\Big(\int \bar\phi(z)\,\mathcal D_t(dz)\Big)\, h\big(x+\max\bar{\mathcal E}_t\big) \,\Big|\, x+\max\bar{\mathcal E}_t > b\Big] = E\Big[f\Big(\int \bar\phi(z)\,\mathcal D(dz)\Big)\Big]\, \frac{\int_b^{\infty} h(z)\,(\sqrt{2}+a)\, e^{-(\sqrt{2}+a)z}\, dz}{e^{-(\sqrt{2}+a)b}}. \qquad (9.2.97)
\]
Proof. The proof is essentially identical to the proof of Corollary 4.12 in [6]. □

Finally we come to the description of the extremal process as seen from the Poisson process of cluster extremes, which is the formulation of Theorem 9.2, (ii).

Theorem 9.23. Let P_Y be as in (9.2.85) and let {D^{(i)}, i ∈ N} be a family of independent copies of the gap-process (9.2.96) with atoms Λ_j^{(i)}. Then the point process E_t converges in law as t → ∞ to a Poisson cluster point process E given by

\[
\mathcal E \overset{\text{law}}{=} \sum_{i,j} \delta_{p_i + \sigma_2 \Lambda_j^{(i)}}. \qquad (9.2.98)
\]

Also this proof is now very close to that of Theorem 7.15, resp. Theorem 2.1 in [6].

9.2.7 The case σ1 > σ2

In this section we prove Theorem 9.3. The existence of the process Ẽ from (9.2.11) will be a byproduct of the proof. The point is that in this case, if we rerun the computations to find the location of the particles that contribute to the maximum at time bt, the naive computation would place them above the level of the maximal particles at that time. But of course, there are no such particles. Thus the particles that reach the highest level are those that have been maximal at time bt. The following lemma, which is contained in the calculation of the maximal displacement in [35], makes this precise.

Lemma 9.24 ([35]). For all ε > 0, d ∈ R, there exists a constant D large enough such that, for t sufficiently large,

P[∃k ≤ n(t) : xk(t) > m(t) + d and xk(bt) < m1(bt) − D] < ε. (9.2.99)

Proof (of Theorem 9.3). First we establish the existence of a limiting process. Note that m(t) = m_1(bt) + m_2((1−b)t), where m_i(s) = \sqrt{2}\sigma_i s - \frac{3}{2\sqrt{2}}\sigma_i \log s. Recall

φ̄(z) = φ(σ_2 z)   (9.2.100)

and

g(z) = 1 − e^{−φ̄(−z)}.   (9.2.101)

Here we assume that φ(x) = 1_{x>a}, for a ∈ R. Using that the maximal displacement is m(t) in this case, we can proceed as in the proof of Theorem 9.16 and only have to control

Ψ_t(φ) = E[ ∏_{i≤n(bt)} E[ ∏_{j≤n_i((1−b)t)} (1 − g((m(t) − x_i(bt))/σ_2 − x̄^i_j((1−b)t))) | F_{bt} ] ],   (9.2.102)

where x̄^i_j((1−b)t) are the particles of a standard BBM at time (1−b)t and x_i(bt) are the particles of a BBM with variance σ_1 at time bt. Using Lemma 9.24 and Theorem 1.2 of [35] as in the proof of Theorem 9.15 above, we obtain that (9.2.102), for t sufficiently large, equals

E[ ∏_{i≤n(bt): x_i(bt)>m_1(bt)−D} E[ ∏_{j≤n_i((1−b)t)} (1 − g((m(t) − x_i(bt))/σ_2 − x̄^i_j((1−b)t))) | F_{bt} ] ] + O(ε).   (9.2.103)

The remainder of the proof has an iterated structure. In a first step we show that, conditioned on F_{bt}, for each i ≤ n(bt) the points {x_i(bt) + x^i_j((1−b)t) − m(t) | x_i(bt) > m_1(bt) − D} converge to the corresponding points of the point process x_i(bt) − m_1(bt) + σ_2 Ẽ^{(i)}, where the Ẽ^{(i)} are independent copies of the extremal process of standard BBM. To this end observe that

u((1−b)t, z) = 1 − E[ ∏_{j≤n((1−b)t)} (1 − g(z − x̄_j((1−b)t))) ]   (9.2.104)

solves the F-KPP equation with initial condition u(0,z) = g(z). Moreover, the assumptions of Theorem ?? are satisfied. Hence (9.2.103) is equal to

ε + E[ ∏_{i≤n(bt): x_i(bt)>m_1(bt)−D} E[ e^{−C(φ̄) Z e^{−√2 (m_1(bt)−x_i(bt))/σ_2}} | F_{bt} ] ] (1 + o(1)).   (9.2.105)

Here C(φ̄) is the constant from standard BBM, i.e.

C(φ̄) = lim_{t↑∞} √(2/π) ∫_0^∞ u(t, y + √2 t) y e^{√2 y} dy,   (9.2.106)

see Lemma ??. Note furthermore that already in (9.2.105) the concatenated structure of the limiting point process becomes visible. In a second step we establish that the points x_i(bt) − m_1(bt) that have a descendant in the lead at time t converge to Ẽ. Define

h_D(y) ≡ E[ exp(−C(φ̄) Z e^{−√2 (σ_1/σ_2) y}) ]  if σ_1 y < D,  and  h_D(y) ≡ 1  if σ_1 y ≥ D.   (9.2.107)

Then the expectation in (9.2.105) can be written as (we ignore the error term o(1), which is easily controlled using that the probability that the number of terms in the product is larger than N tends to zero as N ↑ ∞, uniformly in t)

E[ ∏_{i≤n(bt)} h_D(m_1(bt)/σ_1 − x̄_i(bt)) ],   (9.2.108)

where now x̄ is a standard BBM. Defining

v_D(t,z) = 1 − E[ ∏_{i≤n(t)} h_D(z − x̄_i(t)) ],   (9.2.109)

vD is a solution of the F-KPP equation with initial condition vD(0,z) = 1 − hD(z). But this initial condition satisfies the assumptions of Theorem ?? and therefore,

v_D(t, m(t) + x) → E[ e^{−C̃(D,Z,C(φ̄)) Z̃ e^{−√2 x}} ],   (9.2.110)

where Z̃ is an independent copy of Z and

C̃(D,Z,C(φ̄)) = lim_{t↑∞} √(2/π) ∫_0^∞ v_D(t, y + √2 t) y e^{√2 y} dy.   (9.2.111)

By the same argument as in the standard BBM setting, one obtains that

C̃(Z,C(φ̄)) ≡ lim_{D↑∞} C̃(D,Z,C(φ̄)) = lim_{t↑∞} √(2/π) ∫_0^∞ v(t, y + √2 t) y e^{√2 y} dy,   (9.2.112)

where v is the solution of the F-KPP equation with initial condition v(0,z) = 1 − h(z), with

h(z) = E[ exp(−C(φ̄) Z e^{−√2 (σ_1/σ_2) z}) ].   (9.2.113)

Therefore, taking the limit t → ∞ and then D ↑ ∞ in (9.2.110), we get that

lim_{t→∞} Ψ_t(φ(· + x)) = lim_{D↑∞} lim_{t→∞} v_D(t, m(t) + x) = E[ e^{−C̃(Z,C(φ̄)) Z̃ e^{−√2 x}} ].   (9.2.114)

To see that the constants C̃(Z,C(φ̄)) are strictly positive, one uses that the Laplace functionals Ψ_t(φ) are bounded from above by

E[ exp( −φ( max_{i≤n(bt)} x_i(bt) + max_{j≤n_1((1−b)t)} x_j((1−b)t) − m(t) ) ) ].   (9.2.115)

Here we used that the offspring of any of the particles at time bt have the same law. So the sum of the two maxima in the expression above has the same distribution as the largest descendant at time t of the largest particle at time bt. The limit of Eq. (9.2.115) as t ↑ ∞ exists and is strictly smaller than 1 by the convergence in law of the recentered maximum of a standard BBM. But this implies the positivity of the constants C̃. Hence a limiting point process exists. Finally, one may easily check that the right-hand side of (9.2.114) coincides with the Laplace functional of

the point process defined in (9.2.11), by basically repeating the computations above. ⊓⊔

Remark 9.25. Note that in particular the structure of the variance profile is contained in the constant C̃(D,Z,C(φ̄)), and that also the information on the structure of the limiting point process is contained in this constant. In fact, we see that in all cases we have considered in this paper, the Laplace functional of the limiting process has the form

lim_{t↑∞} Ψ_t(φ(· + x)) = E[ exp(−C(φ) M e^{−√2 x}) ],   (9.2.116)

where M is a martingale limit (either Y or Z) and C is a map from the space of positive continuous functions with compact support to the real numbers. This map contains all the information on the specific limiting process. This is compatible with the findings in [59] in the case where the speed is a concave function of s/t. The universal form (9.2.116) is thus misleading: without knowledge of the specific form of C(φ), (9.2.116) contains almost no information.
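The exponential structure of (9.2.116) is ultimately the Laplace functional of a Poisson point process via Campbell's formula: for a PPP with intensity e^{−√2x}dx, E[e^{−∑φ(p_i)}] = exp(−∫(1−e^{−φ(x)})e^{−√2x}dx). The following Python sketch (our own illustration, not part of the proofs; the truncation window and parameter names are ours) checks this numerically for φ = c·1_{(u_0,∞)}.

```python
import math
import random

SQRT2 = math.sqrt(2.0)
A, B = -2.0, 6.0          # truncation window for the simulation (assumption)
U0, C_PHI = 0.5, 1.0      # phi(x) = C_PHI * 1_{x > U0}

def poisson(lam, rng):
    # Knuth's algorithm; fine for moderate intensities
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def sample_ppp(rng):
    # PPP with intensity e^{-sqrt(2) x} dx on [A, B]:
    # Poisson number of points, each drawn by inverting the normalized CDF.
    mass = (math.exp(-SQRT2 * A) - math.exp(-SQRT2 * B)) / SQRT2
    pts = []
    for _ in range(poisson(mass, rng)):
        u = rng.random()
        pts.append(-math.log(math.exp(-SQRT2 * A) - SQRT2 * mass * u) / SQRT2)
    return pts

rng = random.Random(0)
n_samples = 20000
mc = sum(math.exp(-C_PHI * sum(1 for x in sample_ppp(rng) if x > U0))
         for _ in range(n_samples)) / n_samples
# Campbell's formula: E e^{-sum phi(p_i)} = exp(-(1-e^{-c}) int_{U0}^{B} e^{-sqrt2 x} dx)
exact = math.exp(-(1 - math.exp(-C_PHI))
                 * (math.exp(-SQRT2 * U0) - math.exp(-SQRT2 * B)) / SQRT2)
print(mc, exact)
```

The Monte Carlo estimate and the closed form agree up to sampling error, which is the content of the remark: the law of the limit is determined by the intensity measure together with the constant in the exponent.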

Remark 9.26. There is no difficulty in extending this result to the case of multi-speed BBM with decreasing speeds.

9.3 Universality below the straight line

It turns out that in the case when the covariance Σ²(s) stays strictly below the straight line s for all s ∈ (0,t), the same picture emerges as in the two-speed case, with the McKean martingale depending only on the slope at 0 and the decoration process depending only on the slope at 1. In [18] a rather large class of functions A was considered. Here we will simplify the presentation by restricting ourselves to smooth functions. Let A : [0,1] → [0,1] be a non-decreasing, twice differentiable function with bounded second derivative that satisfies the following conditions:

(A1) For all x ∈ (0,1): A(x) < x, A(0) = 0 and A(1) = 1.
(A2) A′(0) = σ_b² < 1 and A′(1) = σ_e² > 1.

Theorem 9.27. Assume that A : [0,1] → [0,1] satisfies (A1)-(A2). Let m̃(t) = √2 t − (1/(2√2)) log t. Then there is a constant C(σ_e) depending only on σ_e and a random variable Y_{σ_b} depending only on σ_b such that

(i)
lim_{t↑∞} P[ max_{1≤i≤n(t)} x_i(t) − m̃(t) ≤ x ] = E[ e^{−C(σ_e) Y_{σ_b} e^{−√2 x}} ].   (9.3.1)

(ii) The point process

∑_{k≤n(t)} δ_{x_k(t)−m̃(t)} → E_{σ_b,σ_e} = ∑_{i,j} δ_{p_i + σ_e Λ^{(i)}_j},   (9.3.2)

as t ↑ ∞, in law, where the p_i are the atoms of a Poisson point process on R with intensity measure C(σ_e) Y_{σ_b} e^{−√2 x} dx, and the Λ^{(i)} are the limits of the processes as in (??), but conditioned on the event {max_k x̃_k(t) ≥ √2 σ_e t}.

(iii) If A′(1) = ∞, then C(∞) = 1/√(4π) and Λ^{(i)} = δ_0, i.e. the limiting process is a Cox process.
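It is worth emphasising how m̃(t) differs from the Bramson centering of standard BBM that appears elsewhere in these notes: the logarithmic correction has coefficient 1/(2√2) instead of 3/(2√2), i.e. to this order the recentering is that of e^t independent Gaussians (the REM). A two-line numerical check of the resulting gap (the function names are ours, purely illustrative):

```python
import math

def m_tilde(t):
    # centering of Theorem 9.27: sqrt(2) t - log(t) / (2 sqrt(2))
    return math.sqrt(2) * t - math.log(t) / (2 * math.sqrt(2))

def m_bramson(t):
    # Bramson's centering for standard BBM: sqrt(2) t - 3 log(t) / (2 sqrt(2))
    return math.sqrt(2) * t - 3 * math.log(t) / (2 * math.sqrt(2))

# The gap is exactly log(t)/sqrt(2): it diverges, slowly, with t.
for t in (10.0, 100.0, 1000.0):
    print(t, m_tilde(t) - m_bramson(t))
```

So the maximum under (A1)-(A2) sits strictly above the standard BBM maximum for large t, consistent with the weaker correlations imposed by A(x) < x.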

The random variable Y_{σ_b} is the limit of the uniformly integrable martingale

Y_{σ_b}(s) = ∑_{i=1}^{n(s)} e^{−s(1+σ_b²) + √2 σ_b x̄_i(s)},   (9.3.3)

where x¯i(s) is standard branching Brownian motion.
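The normalisation E[Y_{σ_b}(s)] = 1 follows from the many-to-one lemma (for rate-1 binary branching, E[n(s)] = e^s and E[e^{√2σ_b B_s}] = e^{σ_b² s}), and can also be seen in a crude simulation. The following Euler-scheme sketch is our own toy code (binary branching at rate 1 is an assumption, as is the step size), not part of the text:

```python
import math
import random

def bbm_particles(s, dt, rng):
    # Euler scheme for standard binary BBM up to time s: per step, each
    # particle moves by a Gaussian increment of variance dt and branches
    # (is duplicated) with probability dt.
    xs = [0.0]
    for _ in range(int(round(s / dt))):
        new = []
        for x in xs:
            x += rng.gauss(0.0, math.sqrt(dt))
            new.append(x)
            if rng.random() < dt:
                new.append(x)   # branching: duplicate the particle
        xs = new
    return xs

def mckean_martingale(s, sigma_b, dt, rng):
    # Y_{sigma_b}(s) evaluated on one simulated realisation
    return sum(math.exp(-s * (1 + sigma_b ** 2) + math.sqrt(2) * sigma_b * x)
               for x in bbm_particles(s, dt, rng))

rng = random.Random(1)
s, sigma_b = 1.0, 0.5
samples = [mckean_martingale(s, sigma_b, 0.02, rng) for _ in range(3000)]
mean = sum(samples) / len(samples)
print(mean)   # should fluctuate around E[Y] = 1
```

For σ_b < 1 the martingale is L²-bounded, so the empirical mean concentrates near 1 already for moderate sample sizes.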

9.3.1 Outline of the proof

The proof of Theorem 9.27 is based on the corresponding result obtained in [17] for the case of two speeds, and on a Gaussian comparison method. We start by showing the localisation of paths, namely that the paths of all particles that reach a height of order m̃(t) at time t have to lie within a certain tube. Then we show tightness of the extremal process.

The remainder of the work consists in proving the convergence of the finite-dimensional distributions. To this end we use Laplace transforms. We introduce auxiliary two-speed BBMs whose covariance functions approximate Σ²(s) well around 0 and t. Moreover, we choose them in such a way that their covariance functions lie above, respectively below, Σ²(s) in a neighbourhood of 0 and t. We then use Gaussian comparison methods to compare the Laplace transforms. The Gaussian comparison comes in three main steps. In a first step we introduce the usual interpolating process and introduce a localisation condition on its paths. In a second step we justify a certain integration by parts formula that is adapted to our setting. Finally, the resulting quantities are decomposed into a part that has a sign and a part that converges to zero.

9.3.2 Localization of paths

An important first step is again the localisation of the ancestral paths of particles that reach extreme levels. This is essentially inherited from properties of the standard Brownian bridge. For a given covariance function Σ 2, and a subinterval I ⊂ [0,t], define the following events on the space of paths, X : R+ → R,

T^γ_{t,I,Σ²} = { X | ∀ s ∈ I : |X(s) − (Σ²(s)/t) X(t)| < (Σ²(s) ∧ (t − Σ²(s)))^γ }.   (9.3.4)

Proposition 9.28. Let x denote the variable speed BBM with covariance function Σ². For any 1/2 < γ < 1 and for all ε > 0 and d ∈ R, there exists r sufficiently large such that, for all t > 3r,

P[ ∃ k ≤ n(t) : {x_k(t) > m̃(t) + d} ∧ {x_k ∉ T^γ_{t,I_r,Σ²}} ] < ε,   (9.3.5)

where I_r ≡ {s : Σ²(s) ∈ [r, t − r]}.

To prove Proposition 9.28 we need Lemma 6.29 on Brownian bridges.

Proof (of Proposition 9.28). Using a first moment method, the probability in (9.3.5) is bounded from above by

e^t P[ B_{Σ²(t)} > m̃(t) + d, B_{Σ²(·)} ∉ T^γ_{t,I_r,Σ²} ],   (9.3.6)

where B_{Σ²(·)} is a time change of an ordinary Brownian motion. Using that Σ²(s) is a non-decreasing function on [0,t] with Σ²(t) = t, we bound (9.3.6) from above by

e^t P[ {B_t > m̃(t) + d} ∧ {∃ s ∈ [r, t−r] : |B_s − (s/t) B_t| > (s ∧ (t−s))^γ} ].   (9.3.7)

Now, ξ(s) ≡ B_s − (s/t) B_t is a Brownian bridge from 0 to 0 in time t, and it is well known that ξ(s) is independent of B_t. Therefore, (9.3.7) is equal to

e^t P(B_t > m̃(t) + d) P(∃ s ∈ [r, t−r] : |ξ(s)| > (s ∧ (t−s))^γ).   (9.3.8)

Using the standard Gaussian tail bound,

∫_u^∞ e^{−x²/2} dx ≤ u^{−1} e^{−u²/2},  for u > 0,   (9.3.9)

we have

e^t P(B_t > m̃(t) + d) ≤ e^t (√t / (√(2π)(m̃(t)+d))) e^{−(m̃(t)+d)²/(2t)} = (t / (√(2π)(m̃(t)+d))) e^{−√2 d} ≤ M,   (9.3.10)
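The tail bound (9.3.9) follows from ∫_u^∞ e^{−x²/2}dx ≤ ∫_u^∞ (x/u)e^{−x²/2}dx = u^{−1}e^{−u²/2}, and it is sharp up to a factor 1 + O(u^{−2}). A quick sanity check, using the identity ∫_u^∞ e^{−x²/2}dx = √(π/2)·erfc(u/√2) (our own illustration):

```python
import math

def gauss_tail(u):
    # int_u^infty e^{-x^2/2} dx, via the complementary error function
    return math.sqrt(math.pi / 2.0) * math.erfc(u / math.sqrt(2.0))

def tail_bound(u):
    # the bound (9.3.9): u^{-1} e^{-u^2/2}
    return math.exp(-u * u / 2.0) / u

for u in (0.5, 1.0, 2.0, 5.0):
    print(u, gauss_tail(u), tail_bound(u), gauss_tail(u) / tail_bound(u))
```

The ratio tends to 1 as u grows, which is why the bound can be used in both directions, as in (9.3.10).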

for some constant M > 0. By Lemma 6.29, we can choose r large enough such that

γ P(∃s ∈ [r,t − r] : |ξ(s)| > (s ∧ (t − s)) ) < ε/M. (9.3.11)

Using the bounds (9.3.10) and (9.3.11), we can bound (9.3.8) from above by ε. ⊓⊔
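The decomposition B_s = ξ(s) + (s/t)B_t used in the proof separates the bridge from the endpoint: Cov(ξ(s), B_t) = min(s,t) − (s/t)·t = 0 for s ≤ t, and for jointly Gaussian variables zero covariance means independence. A small simulation sketch of this fact (ours, not from the text; grid size and seed are arbitrary):

```python
import math
import random

def brownian_path(t, n_steps, rng):
    # Standard Brownian motion on [0, t], on a uniform grid.
    dt = t / n_steps
    path = [0.0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return path

rng = random.Random(2)
t, n_steps, s_index = 1.0, 100, 30      # s = 0.3
pairs = []
for _ in range(5000):
    path = brownian_path(t, n_steps, rng)
    b_s, b_t = path[s_index], path[-1]
    xi = b_s - (s_index / n_steps) * b_t   # bridge value xi(s) = B_s - (s/t) B_t
    pairs.append((xi, b_t))

n = len(pairs)
mean_xi = sum(p[0] for p in pairs) / n
mean_bt = sum(p[1] for p in pairs) / n
emp_cov = sum((p[0] - mean_xi) * (p[1] - mean_bt) for p in pairs) / n
print(emp_cov)   # theoretical value: 0
```

This is exactly what allows the probability in (9.3.7) to factor into the product (9.3.8).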

Proof (of Theorem 9.27). We show the convergence of the extremal process

E_t = ∑_{k≤n(t)} δ_{x_k(t)−m̃(t)}   (9.3.12)

by showing the convergence of the finite dimensional distributions and tightness. Tightness of (Et )t≥0 follows trivially from a first order Chebyshev estimate that shows that for any d ∈ R and ε > 0, there exists N = N(ε,d) such that, for all t > 0,

P(E_t([d,∞)) ≥ N) < ε.   (9.3.13)

To show the convergence of the finite dimensional distributions define, for u ∈ R,

N_u(t) = ∑_{i=1}^{n(t)} 1_{x_i(t)−m̃(t)>u},   (9.3.14)

which counts the number of points that lie above u. Moreover, we define the corresponding quantity for the process E_{σ_b,σ_e} (defined in (9.3.2)),

N_u = ∑_{i,j} 1_{p_i + σ_e Λ^{(i)}_j > u}.   (9.3.15)

Observe that, in particular,

P( max_{1≤k≤n(t)} x_k(t) − m̃(t) ≤ u ) = P(N_u(t) = 0).   (9.3.16)

The key step in the proof of Theorem 9.27 is the following proposition, which asserts the convergence of the finite-dimensional distributions of the process E_t.

Proposition 9.29. For all k ∈ N and u1,...,uk ∈ R

{N_{u_1}(t),…,N_{u_k}(t)} → {N_{u_1},…,N_{u_k}},  in distribution,   (9.3.17)

as t ↑ ∞.

The proof of this proposition is postponed to the following sections. Assuming the proposition, we can now conclude the proof of the theorem. (9.3.17) implies the convergence of the finite-dimensional distributions of E_t. Since tightness follows from the control of the first moment, we obtain Assertion (ii) of Theorem 9.27. Assertion (i) follows immediately from Eq. (9.3.16).

To prove Assertion (iii), we need to show that, as σ_e ↑ ∞, it holds that C(σ_e) ↑ 1/√(4π) and the processes Λ^{(i)} converge to the trivial process δ_0. Then,

E_{σ_b,∞} = ∑_i δ_{p_i},   (9.3.18)

where (p_i, i ∈ N) are the points of a PPP with random intensity measure (1/√(4π)) Y_{σ_b} e^{−√2 x} dx.

Lemma 9.30. The point process E_{σ_b,σ_e} converges in law, as σ_e ↑ ∞, to the point process E_{σ_b,∞}.

Proof. The proof of Lemma 9.30 is based on a result concerning the cluster processes Λ^{(i)}. We write Λ_{σ_e} for a single copy of these processes and add the subscript to make the dependence on the parameter σ_e explicit. We recall from [17] that the

process Λ_{σ_e} is constructed as follows. Define the processes E_{σ_e} as the limits of the point processes

E^t_{σ_e} ≡ ∑_{k=1}^{n(t)} δ_{x_k(t) − √2 σ_e t},   (9.3.19)

where x is standard BBM at time t conditioned on the event {max_{k≤n(t)} x_k(t) > √2 σ_e t}. We show here that, as σ_e tends to infinity, the processes E_{σ_e} converge to a point process consisting of a single atom at 0. More precisely, we show that

lim_{σ_e↑∞} lim_{t↑∞} P( E^t_{σ_e}([−R,∞)) > 1 | max_{k≤n(t)} x_k(t) > √2 σ_e t ) = 0.   (9.3.20)

Now,

P( E^t_{σ_e}([−R,∞)) > 1 | max_{k≤n(t)} x_k(t) > √2 σ_e t )   (9.3.21)
≤ P( supp E^t_{σ_e} ∩ [0,∞) ≠ ∅ ∧ E^t_{σ_e}([−R,∞)) > 1 | max_{k≤n(t)} x_k(t) > √2 σ_e t )
≤ ∫_0^∞ P( supp E^t_{σ_e} ∩ dy ≠ ∅ ∧ E^t_{σ_e}([−R,∞)) > 1 | max_{k≤n(t)} x_k(t) > √2 σ_e t )
= ∫_0^∞ P( supp E^t_{σ_e} ∩ dy ≠ ∅ | max_{k≤n(t)} x_k(t) > √2 σ_e t )
   × P( E^t_{σ_e}([−R,∞)) > 1 | supp E^t_{σ_e} ∩ dy ≠ ∅ ).

But P(· | supp E^t_{σ_e} ∩ dy ≠ ∅) ≡ P_{t,y+√2σ_e t}(·) is the Palm measure on BBM, i.e. the conditional law of BBM given that there is a particle at time t in dy (see Kallenberg [50, Theorem 12.8]). Chauvin, Rouault, and Wakolbinger [29, Theorem 2] describe the tree under the Palm measure P_{t,z} as follows. Pick one particle at time t at the location z. Then pick a spine, Y, which is a Brownian bridge from 0 to z in time t. Next pick a Poisson point process π on [0,t] with intensity 2. For each point p ∈ π,

start a random number ν_p of independent branching Brownian motions (B^{Y(p),i}, i ≤ ν_p) starting at Y(p). The law of ν_p is given by the size-biased distribution, P(ν_p = k − 1) ∼ k p_k / 2. See Figure 9.1. Now let z = √2 σ_e t + y for y ≥ 0. Under the Palm measure, the point process E_{σ_e}(t) then takes the form

E_{σ_e}(t) = δ_y + ∑_{p∈π, i<ν_p} ∑_{j=1}^{n^{Y(p),i}(t−p)} δ_{B^{Y(p),i}_j(t−p) − √2 σ_e t},  in distribution.   (9.3.22)
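The size-biased offspring law along the spine can be made concrete: if the offspring distribution is (p_k), the spine sees k children with probability proportional to k p_k, of which ν = k − 1 start fresh BBMs. A small sketch with a hypothetical offspring law (the numbers are ours, purely for illustration):

```python
# Hypothetical offspring distribution p_k (illustration only, not from the text).
p = {1: 0.3, 2: 0.5, 3: 0.2}

m = sum(k * pk for k, pk in p.items())          # mean offspring number
size_biased = {k: k * pk / m for k, pk in p.items()}

# nu = k - 1 under the size-biased law; its mean is (E[k^2] - m) / m.
mean_nu = sum((k - 1) * q for k, q in size_biased.items())
second_moment = sum(k * k * pk for k, pk in p.items())
print(size_biased, mean_nu, (second_moment - m) / m)
```

The size-biased law always stochastically dominates (p_k): the spine is more likely to sit on a branch point with many children, which is the intuition behind the description of the Palm tree above.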

Fig. 9.1 The cluster process seen from infinity for σ_e small (left) and σ_e very large (right)

Since, for 1 > γ > 1/2,

lim_{σ_e↑∞} lim_{t↑∞} P( ∀ s ≥ σ_e^{−1/2} : Y(t−s) − y + √2 σ_e s ∈ [−(σ_e s)^γ, (σ_e s)^γ] ) = 1,   (9.3.23)

if we define the set

G^t_{σ_e} ≡ { Y : ∀ t ≥ s ≥ σ_e^{−1/2} : Y(t−s) − y + √2 σ_e s ∈ [−(σ_e s)^γ, (σ_e s)^γ] },   (9.3.24)

it will suffice to show that, for all R ∈ R+,

lim_{σ_e↑∞} lim_{t↑∞} P( ∃ p ∈ π, i < ν_p, j : B^{Y(p),i}_j(t−p) ≥ y − R ∧ Y ∈ G^t_{σ_e} ) = 0.   (9.3.25)

The probability in (9.3.25) is bounded by

P( ∃ p ∈ π, i ≤ ν_p, j : B^{Y(p),i}_j(t−p) ≥ y − R ∧ Y ∈ G^t_{σ_e} )   (9.3.26)
≤ E[ ∫_0^t ∑_{i=1}^{ν_p} 1_{max_j B^{Y(p),i}_j(t−p) > y−R} 1_{Y∈G^t_{σ_e}} π(dp) ]
≤ E[ ∫_0^t E[ ∑_{i=1}^{ν_p} 1_{max_j B^{Y(p),i}_j(t−p) ≥ y−R} 1_{Y∈G^t_{σ_e}} | F^π ] π(dp) ]
≤ 2K ∫_0^t P( max_j B^{Y(t−s)}_j(s) ≥ y − R ∧ Y ∈ G^t_{σ_e} ) ds.

Here we used the independence of the offspring BBMs, and that the conditional probability given the σ-algebra F^π generated by the Poisson process π, appearing in the integral over π, depends only on p. For the integral over s up to σ_e^{−1/2}, we just bound the integrand by 2K. For larger values, we use the localisation provided by

the condition that Y ∈ G^t_{σ_e}, to get that the right-hand side of (9.3.26) is not larger than

2K ∫_0^{σ_e^{−1/2}} ds + 2K ∫_{σ_e^{−1/2}}^t e^s P( B(s) > −R + √2 σ_e s − (σ_e s)^γ ) ds.   (9.3.27)

By (9.3.9), (9.3.27) is bounded from above by

2K σ_e^{−1/2} + 2K ∫_{σ_e^{−1/2}}^∞ e^{(1−σ_e²)s + √2 σ_e (R + (σ_e s)^γ)} ds.   (9.3.28)

From this it follows that (9.3.28) (which no longer depends on t) converges to zero, as σ_e ↑ ∞, for any R ∈ R. Hence we see that

P( E^t_{σ_e}([−R,∞)) > 1 | supp E^t_{σ_e} ∩ dy ≠ ∅ ) ↓ 0,   (9.3.29)

uniformly in y ≥ 0, as t and then σ_e tend to infinity. Next,

∫_0^∞ P( supp E^t_{σ_e} ∩ dy ≠ ∅ | max_{k≤n(t)} x_k(t) > √2 σ_e t )   (9.3.30)
≤ ∫_0^∞ P( max_{k≤n(t)} x_k(t) ≥ √2 σ_e t + y | max_{k≤n(t)} x_k(t) > √2 σ_e t ).

But by Proposition 7.5 in [17], the probability in the integrand converges to exp(−√2 σ_e y), as t ↑ ∞. It follows from the proof that this convergence is uniform in y, and hence, by dominated convergence, the right-hand side of (9.3.30) is finite. Therefore, (9.3.20)

holds. As a consequence, Λ_{σ_e} converges to δ_0.

It remains to show that the intensity of the Poisson process converges as claimed. Theorems 1 and 2 of [27] relate the constant C(σ_e) defined by (9.2.52) to the intensity of the shifted BBM conditioned to exceed the level √2 σ_e t as follows:

1/(√(4π) C(σ_e)) = lim_{s↑∞} E[ ∑_k 1_{x̄_k(s) > √2 σ_e s} ] / P( max_k x̄_k(s) > √2 σ_e s )   (9.3.31)
= lim_{s↑∞} E[ ∑_k 1_{x̄_k(s) − max_i x̄_i(s) > √2 σ_e s − max_i x̄_i(s)} | max_k x̄_k(s) > √2 σ_e s ]

= Λ_{σ_e}((−E, 0]),

where, by Theorem 7.5 in [17], E is an exponentially distributed random variable with parameter √2 σ_e, independent of Λ_{σ_e}. As we have just shown that Λ_{σ_e} → δ_0, it follows that the right-hand side tends to one, as σ_e ↑ ∞, and hence C(σ_e) ↑ 1/√(4π). Hence the intensity measure of the PPP appearing in E_{σ_b,σ_e} converges to the desired intensity measure (1/√(4π)) Y_{σ_b} e^{−√2 x} dx.

This proves Assertion (iii) of Theorem 9.27.

9.3.3 Proof of Proposition 9.29

We prove Proposition 9.29 via convergence of Laplace transforms. Define the

Laplace transform of {N_{u_1}(t),…,N_{u_k}(t)},

L_{u_1,…,u_k}(t,c) = E[ exp( −∑_{l=1}^k c_l N_{u_l}(t) ) ],  c = (c_1,…,c_k) ∈ R_+^k,   (9.3.32)

and the Laplace transform L_{u_1,…,u_k}(c) of {N_{u_1},…,N_{u_k}}. Proposition 9.29 is then a consequence of the next proposition.

Proposition 9.31. For any k ∈ N, u1,...,uk ∈ R and c1,...,ck ∈ R+

lim_{t→∞} L_{u_1,…,u_k}(t,c) = L_{u_1,…,u_k}(c).   (9.3.33)

The proof of Proposition 9.31 requires two main steps. First, we prove the result for the case of two-speed BBM. This was done in our previous paper [17]. In fact, we will need a slight extension of that result where we allow a slight dependence of the speeds on t. This will be given in the next subsection. The second step is to show that the Laplace transforms in the general case can be well approximated by those of two-speed BBM. This uses the usual Gaussian comparison argument in a slightly subtle way.

9.3.4 Approximating two-speed BBM

As we will see later, it is enough to compare the process with covariance function A with processes whose covariance functions are piecewise linear with a single change in slope. We will produce approximate upper and lower bounds by choosing these in such a way that the covariances near zero and near one are below, resp. above, that of the original process. In fact, it is not hard to convince oneself that the following holds.

Lemma 9.32. There exist families of functions Ā_t and A̲_t that are piecewise linear, continuous functions with a single change point for the derivative, such that Ā_t(0) = A̲_t(0) = 0 and Ā_t(1) = A̲_t(1) = 1. Moreover,

lim_{t↑∞} Ā′_t(0) = lim_{t↑∞} A̲′_t(0),  and  lim_{t↑∞} Ā′_t(1) = lim_{t↑∞} A̲′_t(1).   (9.3.34)

Moreover, (i) for all s with Σ²(s) ∈ [0, t^{1/3}] or Σ²(s) ∈ [t − t^{1/3}, t],

Σ̄²(s) ≥ Σ²(s) ≥ Σ̲²(s).   (9.3.35)

(ii) If A(x) = 0 on some finite interval [0,δ], then Eq. (9.3.35) only holds for all s with Σ²(s) ∈ [t − t^{1/3}, t], while, for s ∈ [0, (δ ∧ b)t), it holds that

Σ̄²(s) = Σ²(s) = Σ̲²(s) = 0.   (9.3.36)

Let {ȳ_i, i ≤ n(t)} be the particles of a BBM with speed function Σ̄² and let {y̲_i, i ≤ n(t)} be the particles of a BBM with speed function Σ̲². We want to show that the limiting extremal processes of these processes coincide. Set

N̄_u(t) ≡ ∑_{i=1}^{n(t)} 1_{ȳ_i(t)−m̃(t)>u},   (9.3.37)
N̲_u(t) ≡ ∑_{i=1}^{n(t)} 1_{y̲_i(t)−m̃(t)>u}.   (9.3.38)

Lemma 9.33. For all u1,...,uk and all c1,...,ck ∈ R+, the limits

lim_{t↑∞} E[ exp( −∑_{l=1}^k c_l N̄_{u_l}(t) ) ]   (9.3.39)

and

lim_{t↑∞} E[ exp( −∑_{l=1}^k c_l N̲_{u_l}(t) ) ]   (9.3.40)

exist. If A′(1) < ∞, the two limits coincide with L_{u_1,…,u_k}(c).

If A′(1) = σ_e² = ∞, then the limit in (9.3.39) converges to that in (9.3.40), as ρ ↑ ∞.

Proof. We first consider the case when A0(1) < ∞. To prove Lemma 9.33, we show that the extremal processes

Ē_t = ∑_{i=1}^{n(t)} δ_{ȳ_i(t)−m̃(t)}  and  E̲_t = ∑_{i=1}^{n(t)} δ_{y̲_i(t)−m̃(t)}   (9.3.41)

both converge to E_{σ_b,σ_e}, which was defined in (9.3.2). Note that this gives at first convergence of Laplace functionals for functions φ with compact support, while the N_u(t) have support that is unbounded from above. Convergence for these, however, carries over due to tightness. The assertion in the case when σ_e = ∞ follows directly from Lemma 9.30. ⊓⊔

9.3.5 Gaussian comparison

We now come to the heart of the proof, namely the application of Gaussian com- parison to control the process with covariance function A in terms of those with the piecewise linear covariances. We distinguish from now on the expectation with respect to the underlying tree structure and the one with respect to the Brownian movement of the particles.

• E_n: expectation w.r.t. the Galton-Watson process.
• E_B: expectation w.r.t. the Gaussian process, conditioned on the σ-algebra F^{tree}_t generated by the Galton-Watson process.

The proof of Proposition 9.31 is based on the following lemma that compares

the Laplace transform Lu1,...,uk (t,c) with the corresponding Laplace transform for the comparison processes.

Lemma 9.34. For any k ∈ N, u1,...,uk ∈ R and c1,...,ck ∈ R+ we have

L_{u_1,…,u_k}(t,c) ≤ E[ exp( −∑_{l=1}^k c_l N̄_{u_l}(t) ) ] + o(1),   (9.3.42)
L_{u_1,…,u_k}(t,c) ≥ E[ exp( −∑_{l=1}^k c_l N̲_{u_l}(t) ) ] + o(1).   (9.3.43)

Proof. The proofs of (9.3.42) and (9.3.43) are very similar. Hence we focus on proving (9.3.42). We will, however, indicate what has to be changed when proving the lower bound as we go along. For simplicity, all overlined quantities refer to Σ̄²; corresponding quantities where Σ̄² is replaced by Σ̲² are underlined. Set

f(x(t)) ≡ f(x_1(t),…,x_{n(t)}(t)) ≡ exp( −∑_{i=1}^{n(t)} ∑_{l=1}^k c_l 1_{x_i(t)−m̃(t)>u_l} ).   (9.3.44)

We want to control

E_B[ exp( −∑_{l=1}^k c_l N_{u_l}(t) ) ] − E_B[ exp( −∑_{l=1}^k c_l N̄_{u_l}(t) ) ]
= E_B f(x_1(t),…,x_{n(t)}(t)) − E_B f(ȳ_1(t),…,ȳ_{n(t)}(t)).   (9.3.45)

A straightforward application of the interpolation formula from Lemma 3.1 will produce a term that is very similar to what appears in a second moment estimate. But we already know that this will not give a useful estimate, unless we introduce an appropriate truncation that reflects the typical behaviour of the ancestral paths of extremal particles. Thus we have to sneak these in in a clever way. Define as usual the interpolating processes

x^h_i = √h x_i + √(1−h) ȳ_i.   (9.3.46)

Remark 9.35. We understand the interpolating process {x^h_i, i ≤ n(t)} as a new Gaussian process with the same underlying branching structure and speed function

Σ²_h(s) = h Σ²(s) + (1−h) Σ̄²(s).   (9.3.47)
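The point of Remark 9.35 is that the interpolation acts linearly on covariances: for independent centred Gaussian vectors X and Y, the vector √h X + √(1−h) Y has covariance h Cov(X) + (1−h) Cov(Y), which is again of BBM type here because both summands share the branching structure. A toy two-particle check (our own illustration; the 2×2 matrices are hypothetical covariances at a fixed time):

```python
import math
import random

def chol2(c):
    # Cholesky factor of a 2x2 covariance matrix [[a, b], [b, d]]
    a, b, d = c[0][0], c[0][1], c[1][1]
    l11 = math.sqrt(a)
    l21 = b / l11
    l22 = math.sqrt(d - l21 * l21)
    return l11, l21, l22

def sample(c, rng):
    l11, l21, l22 = chol2(c)
    g1, g2 = rng.gauss(0, 1), rng.gauss(0, 1)
    return (l11 * g1, l21 * g1 + l22 * g2)

cov_x = [[1.0, 0.6], [0.6, 1.0]]    # covariance under the speed Sigma^2 (assumed)
cov_y = [[1.0, 0.2], [0.2, 1.0]]    # covariance under the comparison speed (assumed)
h = 0.3
rng = random.Random(3)

n = 20000
acc = 0.0
for _ in range(n):
    x = sample(cov_x, rng)
    y = sample(cov_y, rng)
    z = (math.sqrt(h) * x[0] + math.sqrt(1 - h) * y[0],
         math.sqrt(h) * x[1] + math.sqrt(1 - h) * y[1])
    acc += z[0] * z[1]
emp = acc / n
predicted = h * cov_x[0][1] + (1 - h) * cov_y[0][1]   # = 0.32
print(emp, predicted)
```

The empirical cross-covariance of the interpolated pair matches h·0.6 + (1−h)·0.2, exactly the statement Σ²_h = hΣ² + (1−h)Σ̄² at the level of two particles.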

Then, (9.3.45) is equal to

∫_0^1 E_B[ (d/dh) f(x^h(t)) ] dh,   (9.3.48)

where

(d/dh) f(x^h(t)) = (1/2) ∑_{i=1}^{n(t)} (∂/∂x_i) f(x^h_1(t),…,x^h_{n(t)}(t)) ( (1/√h) x_i(t) − (1/√(1−h)) ȳ_i(t) )   (9.3.49)

and (∂/∂x_j) f(x^h_1(t),…,x^h_{n(t)}(t)) is the weak partial derivative¹:

(∂/∂x_j) f(x^h_1(t),…,x^h_{n(t)}(t)) = − ∑_{l=1}^k c_l δ(x^h_j(t) − m̃(t) − u_l) f(x^h_1(t),…,x^h_{n(t)}(t)).   (9.3.50)

¹ In this proof it always suffices to consider weak derivatives, since the only non-differentiable function involved is f(x) = 1_{x≥u}, which can be monotonically approximated by smooth functions: e.g. replace all indicator functions in f by (1/√(2πσ²)) ∫_{−∞}^x e^{−z²/(2σ²)} dz, rewrite (9.3.45) as in (9.3.48), and then take σ ↓ 0.

h We must introduce the condition on the path of xi into (9.3.49) at this stage. To do so, we insert into the right-hand side of (9.3.49) a one in the form

1 = 1_{x^h_i ∈ T^γ_{t,Ī,Σ²_h}} + 1_{x^h_i ∉ T^γ_{t,Ī,Σ²_h}},   (9.3.51)

with

Ī ≡ [ t(δ^<_0(t) ∧ δ^<_1(t)), t(1 − δ^>_1(t)) ],   (9.3.52)

where T^γ_{t,I,Σ²_h} was defined in (9.3.4). Here δ^{<,>}_1(t) ≡ δ^{<,>}(t), while δ^{<,>}_0 is defined in the same way, but with respect to the speed function Σ̄² instead of Σ². We call the two resulting summands S_< and S_>, respectively. The next lemma shows that S_> does not contribute to the expectation in (9.3.49), as t → ∞.

Lemma 9.36. With the notation above, we have

lim_{t→∞} E_n[ ∫_0^1 E_B(|S_>|) dh ] = 0.   (9.3.53)

The proof of this lemma will be postponed. We continue with the proof of Lemma 9.34. We are left with controlling

E_B(S_<) = (1/2) E_B[ ∑_{i=1}^{n(t)} (∂/∂x_i) f(x^h(t)) 1_{x^h_i ∈ T^γ_{t,Ī,Σ²_h}} ( x_i(t)/√h − ȳ_i(t)/√(1−h) ) ].   (9.3.54)

By the definition of T^γ_{t,Ī,Σ²_h},

1_{x^h_i ∈ T^γ_{t,Ī,Σ²_h}} = 1_{∀ s ∈ Ī : |ξ_i(Σ²_h(s))| ≤ (Σ²_h(s) ∧ (t − Σ²_h(s)))^γ},   (9.3.55)

where ξ_i(·) is a Brownian bridge from 0 to 0 in time t that is independent of x^h_i(t). We want to apply a Gaussian integration by parts formula to (9.3.54). However, we need to take care of the fact that each summand in (9.3.54) depends on the whole path of ξ_i through the term in (9.3.55). This is not really a problem, due to the fact that the condition in (9.3.55) only depends on the properties of the Brownian bridge, which is independent of the endpoint. By the Gaussian integration by parts formula (3.1.9) we have

E_B[ x_i(t) (∂/∂x_i) f(x^h(t)) 1_{x^h_i ∈ T^γ_{t,Ī,Σ²_h}} ]   (9.3.56)
= ∑_{j=1}^{n(t)} E_B( x_i(t) x^h_j(t) ) E_B[ 1_{x^h_i ∈ T^γ_{t,Ī,Σ²_h}} (∂²/∂x_j ∂x_i) f(x^h(t)) ].   (9.3.57)

The same formula holds, of course, with x_i replaced by ȳ_i. Hence

E_B(S_<) = ∑_{i,j=1, i≠j}^{n(t)} ( E_B(x_i(t) x_j(t)) − E_B(ȳ_i(t) ȳ_j(t)) ) E_B[ 1_{x^h_i ∈ T^γ_{t,Ī,Σ²_h}} (∂²/∂x_i ∂x_j) f(x^h(t)) ],   (9.3.58)

where

(∂²/∂x_i ∂x_j) f(x^h(t)) = ∑_{l,l′=1}^k c_l c_{l′} δ(x^h_i − m̃(t) − u_l) δ(x^h_j − m̃(t) − u_{l′}) f(x^h_1(t),…,x^h_{n(t)}(t)).   (9.3.59)

Introducing

1 = 1_{d(x^h_i(t),x^h_j(t)) ∈ Ī} + 1_{d(x^h_i(t),x^h_j(t)) ∉ Ī}   (9.3.60)

into (9.3.58), we rewrite (9.3.58) as (T1) + (T2), where

(T1) = ∑_{i,j=1, i≠j}^{n(t)} ( E_B(x_i(t) x_j(t)) − E_B(ȳ_i(t) ȳ_j(t)) ) E_B[ 1_{d(x^h_i(t),x^h_j(t)) ∈ Ī} 1_{x^h_i ∈ T^γ_{t,Ī,Σ²_h}} (∂²/∂x_i ∂x_j) f(x^h(t)) ],   (9.3.61)

(T2) = ∑_{i,j=1, i≠j}^{n(t)} ( E_B(x_i(t) x_j(t)) − E_B(ȳ_i(t) ȳ_j(t)) ) E_B[ 1_{d(x^h_i(t),x^h_j(t)) ∉ Ī} 1_{x^h_i ∈ T^γ_{t,Ī,Σ²_h}} (∂²/∂x_i ∂x_j) f(x^h(t)) ].   (9.3.62)

The term (T1) is controlled by the following lemma.

Lemma 9.37. With the notation above, there exists a constant C̄ < ∞ such that

E_n[ ∫_0^1 (T1) dh ] ≤ C̄ ∫_{t(δ^<_0(t) ∧ δ^<_1(t))}^{t(1−δ^>_1(t))} | e^{−s+Σ²(s)+O(s^γ)} − e^{−s+Σ̄²(s)+O(s^γ)} | ds.   (9.3.63)

Moreover, we have:

Lemma 9.38. If Σ² satisfies (A1)-(A2), and Σ̄² is as defined in Lemma 9.32, then

lim_{t→∞} ∫_Ī | e^{−s+Σ̄²(s)+O(s^γ)} − e^{−s+Σ²(s)+O(s^γ)} | ds = 0.   (9.3.64)

The proofs of these lemmata are technical but fairly straightforward and will not be given here. Details can be found in [18]. Up to this point, the proof of (9.3.43) works exactly as the proof of (9.3.42), with Σ̄² replaced by Σ̲². For (T2) and (T̲2) we have:

Lemma 9.39. For almost all realisations of the Galton-Watson process, the following statements hold:

lim_{t↑∞} (T2) ≤ 0,   (9.3.65)

and

lim_{t↑∞} (T̲2) ≥ 0.   (9.3.66)

The proof of this lemma is technical and will be skipped. From Lemma 9.37, Lemma 9.38, and Lemma 9.39, together with (9.3.54), the bound (9.3.42) follows. As pointed out, using Lemma 9.39, the bound (9.3.43) also follows. Thus, Lemma 9.34 is proved. ⊓⊔
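The one-dimensional version of the integration by parts formula (3.1.9) used above, E[X g(X)] = E[X²] E[g′(X)] for centred Gaussian X (Stein's identity), can be verified numerically; the multivariate version in (9.3.56)-(9.3.57) follows the same pattern with the covariances E_B(x_i x^h_j). A quadrature sketch with g = tanh (our own illustration, not from the text):

```python
import math

def gauss_expect(fn, sigma, n=4001, width=10.0):
    # E[fn(X)] for X ~ N(0, sigma^2), via the trapezoid rule on [-w*sigma, w*sigma]
    lo, hi = -width * sigma, width * sigma
    hstep = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        x = lo + i * hstep
        w = 0.5 if i in (0, n - 1) else 1.0
        dens = math.exp(-x * x / (2 * sigma * sigma)) / (sigma * math.sqrt(2 * math.pi))
        total += w * fn(x) * dens
    return total * hstep

sigma = 1.3
lhs = gauss_expect(lambda x: x * math.tanh(x), sigma)
# g'(x) = 1 - tanh(x)^2
rhs = sigma ** 2 * gauss_expect(lambda x: 1.0 - math.tanh(x) ** 2, sigma)
print(lhs, rhs)
```

The two quadratures agree to high precision, which is the mechanism by which (9.3.58) trades derivatives of f for the covariance differences E_B(x_i x_j) − E_B(ȳ_i ȳ_j).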

We can now conclude the proof of Proposition 9.31.

Proof (of Proposition 9.31). Taking the limit as t ↑ ∞ in (9.3.42) and (9.3.43) and using Lemma 9.33 gives, in the case A′(1) < ∞,

limsup_{t↑∞} L_{u_1,…,u_k}(t,c) ≤ L_{u_1,…,u_k}(c),
liminf_{t↑∞} L_{u_1,…,u_k}(t,c) ≥ L_{u_1,…,u_k}(c).   (9.3.67)

Hence lim_{t↑∞} L_{u_1,…,u_k}(t,c) exists and is equal to L_{u_1,…,u_k}(c). In the case A′(1) = ∞, the same result follows if, in addition, we take ρ ↑ ∞ after taking t ↑ ∞. This concludes the proof of Proposition 9.31. ⊓⊔

References

1. E. Aïdékon. Convergence in law of the minimum of a branching random walk. Annals of Probability, 41:1362–1426, 2013. 2. E. Aïdékon, J. Berestycki, E. Brunet, and Z. Shi. Branching Brownian motion seen from its tip. Probab. Theor. Rel. Fields, 157:405–451, 2013. 3. L.-P. Arguin, A. Bovier, and N. Kistler. Genealogy of extremal particles of branching Brown- ian motion. Comm. Pure Appl. Math., 64(12):1647–1676, 2011. 4. L.-P. Arguin, A. Bovier, and N. Kistler. Poissonian statistics in the extremal process of branch- ing Brownian motion. Ann. Appl. Probab., 22(4):1693–1711, 2012. 5. L.-P. Arguin, A. Bovier, and N. Kistler. An ergodic theorem for the frontier of branching Brownian motion. Electron. J. Probab., 18:no. 53, 25, 2013. 6. L.-P. Arguin, A. Bovier, and N. Kistler. The extremal process of branching Brownian motion. Probab. Theor. Rel. Fields, 157:535–574, 2013. 7. D. G. Aronson and H. F. Weinberger. Nonlinear diffusion in population genetics, combustion, and nerve pulse propagation. In Partial differential equations and related topics (Program, Tu- lane Univ., New Orleans, La., 1974), pages 5–49. Lecture Notes in Math., Vol. 446. Springer, Berlin, 1975. 8. K. B. Athreya and P. E. Ney. Branching processes. Springer-Verlag, New York, 1972. Die Grundlehren der mathematischen Wissenschaften, Band 196. 9. G. Ben Arous, V. Gayrard, and A. Kuptsov. A new REM conjecture. In In and out of equilib- rium. 2, volume 60 of Progr. Probab., pages 59–96. Birkhäuser, Basel, 2008. 10. G. Ben Arous and A. Kuptsov. REM universality for random Hamiltonians. In Spin glasses: statics and dynamics, volume 62 of Progr. Probab., pages 45–84. Birkhäuser Verlag, Basel, 2009. 11. S. M. Berman. Limit theorems for the maximum term in stationary sequences. Ann. Math. Statist., 35:502–516, 1964. 12. P. Billingsley. Weak convergence of measures: Applications in probability. Society for Indus- trial and Applied Mathematics, Philadelphia, Pa., 1971. 
Conference Board of the Mathematical Sciences Regional Conference Series in Applied Mathematics, No. 5. 13. M. Biskup and O. Louidor. Extreme local extrema of two-dimensional discrete Gaussian free field. ArXiv e-prints, June 2013. 14. M. Biskup and O. Louidor. Conformal symmetries in the extremal process of two-dimensional discrete Gaussian free field. ArXiv e-prints, Oct. 2014. 15. A. Bovier. Statistical mechanics of disordered systems. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge, 2006. 16. A. Bovier and L. Hartung. Extended convergence of the extremal process of branching Brownian motion. preprint, 2014. 17. A. Bovier and L. Hartung. The extremal process of two-speed branching Brownian motion. Electron. J. Probab., 19(18):1–28, 2014.

18. A. Bovier and L. Hartung. Variable speed branching Brownian motion 1. Extremal processes in the weak correlation regime. preprint, 2014.
19. A. Bovier and I. Kurkova. Derrida’s generalised random energy models. I. Models with finitely many hierarchies. Ann. Inst. H. Poincaré Probab. Statist., 40(4):439–480, 2004.
20. A. Bovier and I. Kurkova. Derrida’s generalized random energy models. II. Models with continuous hierarchies. Ann. Inst. H. Poincaré Probab. Statist., 40(4):481–495, 2004.
21. A. Bovier, I. Kurkova, and M. Löwe. Fluctuations of the free energy in the REM and the p-spin SK models. Ann. Probab., 30(2):605–651, 2002.
22. M. Bramson. Convergence of solutions of the Kolmogorov equation to travelling waves. Mem. Amer. Math. Soc., 44(285):iv+190, 1983.
23. M. Bramson. Location of the travelling wave for the Kolmogorov equation. Probab. Theory Related Fields, 73(4):481–515, 1986.
24. M. Bramson, J. Ding, and O. Zeitouni. Convergence in law of the maximum of the two-dimensional discrete Gaussian free field. ArXiv e-prints, Jan. 2013.
25. M. Bramson and O. Zeitouni. Tightness of the recentered maximum of the two-dimensional discrete Gaussian free field. Comm. Pure Appl. Math., 65(1):1–20, 2012.
26. M. D. Bramson. Maximal displacement of branching Brownian motion. Comm. Pure Appl. Math., 31(5):531–581, 1978.
27. B. Chauvin and A. Rouault. KPP equation and supercritical branching Brownian motion in the subcritical speed area. Application to spatial trees. Probab. Theory Related Fields, 80(2):299–314, 1988.
28. B. Chauvin and A. Rouault. Supercritical branching Brownian motion and K-P-P equation in the critical speed-area. Math. Nachr., 149:41–59, 1990.
29. B. Chauvin, A. Rouault, and A. Wakolbinger. Growing conditioned trees. Stochastic Process. Appl., 39(1):117–130, 1991.
30. D. J. Daley and D. Vere-Jones. An introduction to the theory of point processes. Springer Series in Statistics. Springer-Verlag, New York, 1988.
31. B. Derrida and H. Spohn. Polymers on disordered trees, spin glasses, and traveling waves. J. Statist. Phys., 51(5-6):817–840, 1988. New directions in statistical mechanics (Santa Barbara, CA, 1987).
32. J. Ding. Exponential and double exponential tails for maximum of two-dimensional discrete Gaussian free field. Probab. Theory Related Fields, pages 1–15, 2011.
33. B. Duplantier, R. Rhodes, S. Sheffield, and V. Vargas. Critical Gaussian multiplicative chaos: Convergence of the derivative martingale. Ann. Probab., 42(5):1769–1808, 2014.
34. B. Duplantier, R. Rhodes, S. Sheffield, and V. Vargas. Renormalization of critical Gaussian multiplicative chaos and KPZ relation. Comm. Math. Phys., 330(1):283–330, 2014.
35. M. Fang and O. Zeitouni. Branching random walks in time inhomogeneous environments. Electron. J. Probab., 17:no. 67, 18, 2012.
36. M. Fang and O. Zeitouni. Slowdown for time inhomogeneous branching Brownian motion. J. Stat. Phys., 149(1):1–9, 2012.
37. R. Fisher. The wave of advance of advantageous genes. Ann. Eugen., 7:355–369, 1937.
38. E. Gardner and B. Derrida. Magnetic properties and function q(x) of the generalised random energy model. J. Phys. C, 19:5783–5798, 1986.
39. E. Gardner and B. Derrida. Solution of the generalised random energy model. J. Phys. C, 19:2253–2274, 1986.
40. J.-B. Gouéré. Branching Brownian motion seen from its left-most particule. ArXiv e-prints, May 2013.
41. R. Hardy and S. C. Harris. A conceptual approach to a path result for branching Brownian motion. Stochastic Process. Appl., 116(12):1992–2013, 2006.
42. R. Hardy and S. C. Harris. A spine approach to branching diffusions with applications to L^p-convergence of martingales. In Séminaire de Probabilités XLII, Lecture Notes in Mathematics, pages 281–330. Springer Berlin Heidelberg, 2009.
43. S. C. Harris. Travelling-waves for the FKPP equation via probabilistic arguments. Proc. Roy. Soc. Edinburgh Sect. A, 129(3):503–517, 1999.

44. N. Ikeda, M. Nagasawa, and S. Watanabe. Markov branching processes I. J. Math. Kyoto Univ., 8:233–278, 1968.
45. N. Ikeda, M. Nagasawa, and S. Watanabe. Markov branching processes II. J. Math. Kyoto Univ., 8:365–410, 1968.
46. N. Ikeda, M. Nagasawa, and S. Watanabe. Markov branching processes III. J. Math. Kyoto Univ., 9:95–160, 1969.
47. M. Kac. On distributions of certain Wiener functionals. Trans. Amer. Math. Soc., 65:1–13, 1949.
48. J.-P. Kahane. Sur le chaos multiplicatif. Ann. Sci. Math. Québec, 9(2):105–150, 1985.
49. J.-P. Kahane. Une inégalité du type de Slepian et Gordon sur les processus gaussiens. Israel J. Math., 55(1):109–110, 1986.
50. O. Kallenberg. Random measures. Akademie-Verlag, Berlin, 1983.
51. I. Karatzas and S. E. Shreve. Brownian motion and stochastic calculus. Graduate Texts in Mathematics. Springer, New York, 1988.
52. J. F. C. Kingman. Poisson processes, volume 3 of Oxford Studies in Probability. The Clarendon Press, Oxford University Press, New York, 1993. Oxford Science Publications.
53. N. Kistler. Derrida’s random energy models. From spin glasses to the extremes of correlated random fields. Springer, to appear, 2013.
54. A. Kolmogorov, I. Petrovsky, and N. Piscounov. Etude de l’équation de la diffusion avec croissance de la quantité de matière et son application à un problème biologique. Moscou Universitet, Bull. Math., 1:1–25, 1937.
55. A. E. Kyprianou. Travelling wave solutions to the K-P-P equation: alternatives to Simon Harris’ probabilistic analysis. Ann. Inst. H. Poincaré Probab. Statist., 40(1):53–72, 2004.
56. S. P. Lalley and T. Sellke. A conditional limit theorem for the frontier of a branching Brownian motion. Ann. Probab., 15(3):1052–1061, 1987.
57. M. Leadbetter, G. Lindgren, and H. Rootzén. Extremes and related properties of random sequences and processes. Springer Series in Statistics. Springer-Verlag, New York, 1983.
58. T. M. Liggett. Random invariant measures for Markov chains, and independent particle systems. Z. Wahrsch. Verw. Gebiete, 45(4):297–313, 1978.
59. P. Maillard and O. Zeitouni. Slowdown in branching Brownian motion with inhomogeneous variance. ArXiv e-prints, July 2013.
60. B. Mallein. Maximal displacement of a branching random walk in time-inhomogeneous environment. ArXiv e-prints, July 2013.
61. H. P. McKean. Application of Brownian motion to the equation of Kolmogorov-Petrovskii-Piskunov. Comm. Pure Appl. Math., 28(3):323–331, 1975.
62. C. Newman and D. Stein. Spin glasses and complexity. Princeton University Press, Princeton, NJ, 2013.
63. J. Nolen, J.-M. Roquejoffre, and L. Ryzhik. Power-like delay in time inhomogeneous Fisher-KPP equations. preprint, 2013.
64. D. Panchenko. The Sherrington-Kirkpatrick model. Springer, 2013.
65. S. Resnick. Extreme values, regular variation, and point processes, volume 4 of Applied Probability. A Series of the Applied Probability Trust. Springer-Verlag, New York, 1987.
66. R. Rhodes and V. Vargas. Gaussian multiplicative chaos and applications: A review. Probab. Surv., 11:315–392, 2014.
67. M. I. Roberts. A simple path to asymptotics for the frontier of a branching Brownian motion. Ann. Probab., 41(5):3518–3541, 2013.
68. A. V. Skorohod. Branching diffusion processes. Teor. Verojatnost. i Primenen., 9:492–497, 1964.
69. D. Slepian. The one-sided barrier problem for Gaussian noise. Bell System Tech. J., 41:463–501, 1962.
70. D. Stroock and S. Varadhan. Multidimensional diffusion processes, volume 233. Springer, 1979.
71. M. Talagrand. Spin glasses: a challenge for mathematicians, volume 46 of Ergebnisse der Mathematik und ihrer Grenzgebiete. (3). [Results in Mathematics and Related Areas. (3)]. Springer-Verlag, Berlin, 2003.

72. K. Uchiyama. The behavior of solutions of some nonlinear diffusion equations for large time. J. Math. Kyoto Univ., 18(3):453–508, 1978.
73. T. Watanabe. Exact packing measure on the boundary of a Galton-Watson tree. J. London Math. Soc. (2), 69(3):801–816, 2004.
74. H. W. Watson and F. Galton. On the probability of the extinction of families. Journal of the Anthropological Institute of Great Britain and Ireland, 4:138–144, 1875.
75. O. Zeitouni. Branching random walks and Gaussian free fields. Lecture notes, 2013.

Index

almost sure convergence, 23
ancestral path, 58
asymptotics of upper tails, 117

Berman condition, 33
boundary
  of Galton-Watson process, 53
branching Brownian motion
  definition, 57
  variable speed, 151
branching process, 52
Brownian bridge, 58, 72, 92, 111
  basic estimates, 92

complete Poisson convergence, 30
convergence theorem, 64
covariance, 47, 57
Cox process, 55

derivative martingale, 66, 67, 122
Dynkin’s theorem, 17

embedding in R
  of Galton-Watson tree, 137
entropic repulsion, 59, 75, 95
ergodic theorem, 70
exceedances
  point process of, 28
existence, 120
existence of solutions, 75
extremal distributions, 2
extremal process, 15
extremal types, 10
extremal types theorem, 10
extreme value theory, 45

F-KPP equation, 61
  linear, 89
Feynman-Kac formula, 83, 99
Feynman-Kac representation, 71
finite mass condition, 104
first moment estimate, 59
FKG inequality, 110
Fréchet distribution, 10

Galton-Watson process, 52, 137
Galton-Watson tree, 52
  embedding in R, 137
Gaussian comparison, 34, 184
Gaussian integration by parts, 35
Gaussian process, 47, 57
Gaussian sequence
  stationary, 33
Gaussian tail bounds, 3
generalised random energy model, 44, 47
Girsanov formula, 96
GREM, 44, 47, 151
Gumbel distribution, 4, 47, 65
  random shift, 55

Hamiltonian, 43
Hamming distance, 43

integration by parts
  Gaussian, 35
intensity measure, 18
interpolating process, 184
interpolation, 34
inverse
  left continuous, 5

Kahane’s theorem, 36
Kallenberg’s theorem, 25
Khintchine’s theorem, 7
Kolmogorov’s theorem, 79

labelling, 53
Laplace functional, 18, 50, 63
  convergence, 120
Laplace transform, 18
law of maximum, 69
lexicographic distance, 44
linear F-KPP equation, 71, 75, 89
localisation of paths, 59

martingale, 60, 123, 154
max-stability, 5
max-stable distribution, 8
maxima of iid sequences, 1
maximum principle, 76
mild solution, 62
more stretched, 80
multi-index, 53
  labelling, 53

normal comparison, 34
normal sequence, 33

order statistics, 11

particle number, 60
partition function, 68
Picard iteration, 75
point measure, 15
point process, 15, 16, 63
  of BBM, 120
  Poisson, 19
Poisson point process, 19

random energy model, 45
relatively compact, 22
REM, 45

second moment method, 46, 47
  REM, 46
  truncated, 49
Skorokhod’s theorem, 23
Slepian’s lemma, 36
spin glass, 43
standard conditions, 63, 71
stationary Gaussian sequence, 33
stretched
  more, 80
supercritical, 54

tightness, 24, 120, 177
travelling wave, 63

ultrametric, 44, 53
uniqueness of solutions, 75

vague convergence, 21
variable speed, 151

weak convergence, 5
  of point processes, 20
Weibull distribution, 10

Yule process, 52