TREE-VALUED MARKOV CHAINS DERIVED FROM GALTON-WATSON PROCESSES

David Aldous and Jim Pitman

Technical Report No. 481

Department of Statistics

University of California

367 Evans Hall # 3860

Berkeley, CA 94720-3860

February 1997

Abstract

Let $\mathcal{G}$ be a Galton-Watson tree, and for $0 \le u \le 1$ let $\mathcal{G}_u$ be the subtree of $\mathcal{G}$ obtained by retaining each edge with probability $u$. We study the tree-valued Markov process $(\mathcal{G}_u, 0 \le u \le 1)$ and an analogous process $(\mathcal{G}^*_u, 0 \le u \le 1)$ in which $\mathcal{G}^*_1$ is a critical or subcritical Galton-Watson tree conditioned to be infinite. Results simplify and are further developed in the special case of Poisson offspring distribution.

Running head. Tree-valued Markov chains.

Key words. Borel distribution, branching process, conditioning, Galton-Watson process, generalized Poisson distribution, h-transform, pruning, random tree, size-biasing, spinal decomposition, thinning.

AMS Subject classifications: 05C80, 60C05, 60J27, 60J80



Research supported in part by N.S.F. Grants DMS9404345 and 9622859

Contents

1 Introduction
  1.1 Related topics
2 Background and technical set-up
  2.1 Notation and terminology for trees
  2.2 Galton-Watson trees
  2.3 Poisson-Galton-Watson trees
  2.4 Uniform Random Trees
  2.5 Conditioning on non-extinction
3 Pruning Random Trees
  3.1 Transition rates
  3.2 Pruning a Galton-Watson tree
  3.3 Pruning a GW tree conditioned on non-extinction
  3.4 The supercritical case
4 The PGW pruning process
  4.1 The joint law of $(\mathcal{G}_\lambda, \mathcal{G}_\mu)$
  4.2 Transition rates and the ascension process
  4.3 The PGW$^\infty(1)$ distribution
  4.4 The process $(\mathcal{G}^*_\mu, 0 \le \mu \le 1)$
  4.5 A representation of the ascension process
  4.6 The spinal decomposition of $\mathcal{G}^*_\mu$
  4.7 Some distributional identities
  4.8 Size-Modified PGW-trees

1 Introduction

This paper develops some theory for Galton-Watson trees $\mathcal{G}$ (i.e. family trees associated with Galton-Watson branching processes), starting from the following two known facts.

(i) [Lemma 10] For fixed $0 \le u \le 1$ let $\mathcal{G}_u$ be the "pruned" tree obtained by cutting edges of $\mathcal{G}$ (and discarding the attached branch) independently with probability $1 - u$. Then $\mathcal{G}_u$ is another Galton-Watson tree.

(ii) [Proposition 2] For critical or subcritical $\mathcal{G}$ one can define a tree $\mathcal{G}^\infty$, interpretable as $\mathcal{G}$ conditioned on non-extinction. Qualitatively, $\mathcal{G}^\infty$ consists of a single infinite "spine" to which finite subtrees are attached.

We interpret (i) as defining a pruning process $(\mathcal{G}_u, 0 \le u \le 1)$, which is a tree-valued continuous-time inhomogeneous Markov chain such that $\mathcal{G}_0$ is the trivial tree consisting only of the root vertex, and $\mathcal{G}_1 = \mathcal{G}$. An analogous pruning process $(\mathcal{G}^*_u, 0 \le u \le 1)$ with $\mathcal{G}^*_1 = \mathcal{G}^\infty$ is constructed from the conditioned tree $\mathcal{G}^\infty$ of (ii). Section 3 gives a careful description of the transition rates and transition probabilities for these processes. The two processes are qualitatively different, in the following sense. If $\mathcal{G}$ is supercritical then on the event that $\mathcal{G}$ is infinite there is a random ascension time $A$ such that $\mathcal{G}_{A-}$ is finite but $\mathcal{G}_A$ is infinite: the chain "jumps to infinite size" at time $A$. In contrast, the process $(\mathcal{G}^*_u)$ "grows to infinity" at time 1, meaning that $\mathcal{G}^*_u$ is finite for $u < 1$ but $\mathcal{G}^*_1 = \mathcal{G}^\infty$ is infinite. A connection between the two processes is made (Section 3.4) by conditioning $(\mathcal{G}_u, 0 \le u < A)$ on the event that $A$ equals the critical time, i.e. the $a$ for which $\mathcal{G}_a$ has mean offspring equal to 1. By rescaling the time parameter we may take $a = 1$, and the conditioned process is then identified with $(\mathcal{G}^*_u, 0 \le u < 1)$.

These results simplify, and further connections appear, in the special case of Poisson offspring distribution, which is the subject of Section 4. There we consider $(\mathcal{G}_\mu, 0 \le \mu < \infty)$, where $\mathcal{G}_\mu$ is the family tree of the Galton-Watson branching process with Poisson($\mu$) offspring, and the associated pruned conditioned process $(\mathcal{G}^*_\mu, 0 \le \mu \le 1)$. To highlight four properties:

• The distribution of $\mathcal{G}^*_1$ has several different interpretations as a limit (Section 4.3).

• For fixed $\mu < 1$, the distribution of $\mathcal{G}^*_\mu$ is the distribution of $\mathcal{G}_\mu$, size-biased by the total size of $\mathcal{G}_\mu$ (Section 4.4).

• The process $(\mathcal{G}_\mu)$ run until its ascension time $A > 1$ has a representation in terms of $(\mathcal{G}^*_\mu)$ as (Section 4.5)

$$(\mathcal{G}_\mu, 0 \le \mu < A) \overset{d}{=} (\mathcal{G}^*_{\mu U}, 0 \le \mu < (-\log U)/(1 - U))$$

where $U$ is uniform$(0,1)$, independent of $(\mathcal{G}^*_\mu, 0 \le \mu \le 1)$.

• In constructing $\mathcal{G}^*_\mu$ by pruning $\mathcal{G}^*_1$, a certain vertex becomes distinguished, i.e. the highest vertex of the spine of $\mathcal{G}^*_1$ retained in $\mathcal{G}^*_\mu$. This vertex turns out to be distributed uniformly on $\mathcal{G}^*_\mu$, and a simple spinal decomposition of $\mathcal{G}^*_\mu$ into independent tree components is obtained by cutting the edges of $\mathcal{G}^*_\mu$ along the path from the root to the distinguished vertex (Section 4.6).
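As a numerical illustration of the third property, the sketch below assumes the standard fact that a Galton-Watson tree with Poisson($a$) offspring, $a > 1$, is finite with probability $q(a)$, the smallest root of $q = e^{-a(1-q)}$, and it reads the event $(A > a)$ as the event that the tree at time $a$ is still finite. Under those assumptions $P(A > a) = q(a)$, which is what $A = (-\log U)/(1-U)$ with $U$ uniform gives, since the map $u \mapsto (-\log u)/(1-u)$ is decreasing. The function names are illustrative and not taken from the paper.

```python
import math

def extinction_probability(a, iterations=10_000):
    """Smallest root q in [0, 1] of q = exp(-a * (1 - q)), by fixed-point iteration from 0."""
    q = 0.0
    for _ in range(iterations):
        q = math.exp(-a * (1.0 - q))
    return q

def ascension_time_from_uniform(u):
    """The map u -> (-log u) / (1 - u), for u in (0, 1)."""
    return -math.log(u) / (1.0 - u)

for a in (1.5, 2.0, 3.0):
    q = extinction_probability(a)
    # Plugging U = q(a) into the map recovers a, so P(A > a) = P(U < q(a)) = q(a).
    print(a, q, ascension_time_from_uniform(q))
```

Each printed row should show the third column agreeing with the first up to rounding error.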

Other topics include consequent distributional identities relating Borel and size-biased Borel distributions (Sections 4.5 and 4.7) and the interpretation of trees conditioned to be infinite as explicit Doob h-transforms, with the related identification of the Martin boundary of $(\mu, \mathcal{G}_\mu)$ (Section 4.4).

None of the individual results is especially hard; the length of the paper is due partly to our development of a precise formalism for writing rigorous proofs of such results. Section 2 contains this formalism and discussion of known results.

1.1 Related topics

Of course, branching processes form a classical part of probability theory. Various "probability on trees" topics of contemporary interest are treated in the forthcoming monograph by Lyons [30], which explores several aspects of Galton-Watson trees but touches only tangentially on the specific topics of this paper.

Our motivation came from the following considerations, which will be elaborated in a more wide-ranging but less detailed companion paper [7]. Suppose that for each $N$ there is a Markov chain taking values in the set of forests on $N$ vertices. Looking at the tree containing a given vertex gives a tree-valued process, and taking $N \to \infty$ limits may give a tree-valued Markov chain. The prototype example (not exactly forest-valued, of course) is the random graph process $(\mathcal{G}(N, P(\mathrm{edge}) = \mu/N), 0 \le \mu \le N)$ for which the limit tree-valued Markov chain is our pruned Poisson-Galton-Watson process $(\mathcal{G}_\mu, 0 \le \mu < \infty)$ [1]. The pruned conditioned Poisson-Galton-Watson process $(\mathcal{G}^*_\mu, 0 \le \mu \le 1)$ arises in a more subtle way as a limit of the Marcus-Lushnikov discrete coalescent process with additive kernel (see [6] for background on the general Marcus-Lushnikov process and [42, 43, 44, 38] for recent results on the additive case). More exotic variations of $(\mathcal{G}_\mu)$, e.g. a stationary Markov process in which branches grow and are cut down upon becoming infinite, arise as other $N \to \infty$ limits and are studied in [7]. Finally, we remark that the unconditioned and conditioned critical Poisson-Galton-Watson distributions arise as $N \to \infty$ limits in several other contexts (as "fringes" in random tree models [3], in particular in random spanning trees [2, 37]; in the Wright-Fisher model) where there is no natural pruning structure.

2 Background and technical set-up

Here we set up our general notation for random trees, and present some background material about Galton-Watson trees.

2.1 Notation and terminology for trees

Except where otherwise indicated, by a tree $t$ we mean a rooted labeled tree, that is a set $V = \mathrm{verts}(t)$, called the set of vertices or labels of $t$, equipped with a directed edge relation $\overset{t}{\to}$ such that for some (obviously unique) element $\mathrm{root}(t) \in V$ there is for each vertex $v \in V$ a unique path from the root to $v$, that is a finite sequence of vertices $(v_0 = \mathrm{root}(t), v_1, \ldots, v_h = v)$ such that $v_{i-1} \overset{t}{\to} v_i$ for each $1 \le i \le h$. Then $h = h(v, t)$ is the height of vertex $v$ in the tree $t$. Formally, $t$ is identified by its vertex set $V$ and its set of directed edges, that is the set $\{(v, w) \in V \times V : v \overset{t}{\to} w\}$. If a subset $S$ of $\mathrm{verts}(t)$ is such that the restriction of the relation $\overset{t}{\to}$ to $S \times S$ defines a tree $s$ with $\mathrm{verts}(s) = S$, then either $S$ or $s$ may be called a subtree of $t$. Let $\#V \in \{0, 1, 2, \ldots, \infty\}$ be the number of elements of a set $V$, and for a tree $t$ let $\#t = \#\mathrm{verts}(t)$. The number of edges of a tree $t$ is $\#t - 1$. For a tree $t$ and $v \in \mathrm{verts}(t)$ let $\mathrm{children}(v, t) := \{w \in \mathrm{verts}(t) : v \overset{t}{\to} w\}$ denote the set of children of $v$ in $t$, and let $c_v t := \#\mathrm{children}(v, t)$, the number of children of $v$ in $t$. Each non-root vertex $w$ of $t$ is a child of some unique vertex $v$ of $t$, say $v = \mathrm{parent}(w, t)$. Let $s$ and $t$ be two trees. Call $s$ a relabeling of $t$ if there exists a bijection $\ell : \mathrm{verts}(s) \to \mathrm{verts}(t)$ such that $v \overset{s}{\to} w$ if and only if $\ell(v) \overset{t}{\to} \ell(w)$. Then $\mathrm{root}(t) = \ell(\mathrm{root}(s))$ and $h(v, s) = h(\ell(v), t)$.

In the discussion above there was no notion of "birth order" for children. To incorporate this notion, for $n \in \mathbb{N} := \{1, 2, \ldots\}$ let $\mathbf{T}_n$ denote the set of family trees (also called rooted ordered trees or planted plane trees [14, 46]) with $n$ vertices. Figure 1 illustrates the 5 trees comprising $\mathbf{T}_4$.

[Figure 1: the 5 family trees comprising $\mathbf{T}_4$, each with its root marked.]

We interpret an element $t$ of $\mathbf{T} := \cup_n \mathbf{T}_n$ as a finite family tree with the root representing a single progenitor, and each vertex of the tree representing an individual of the family. Then for $g = 0, 1, 2, \ldots$ each vertex at height $g$ corresponds to an individual in the $g$th generation of the family. While the graphical representation of $t$ in Figure 1 involves no explicit labeling of the vertices $v$ of $t$, we identify an individual in the $g$th generation of $t$ as a sequence of $g$ integers, for instance $(2, 7, 4)$ to indicate a third generation individual who is the 4th child of the 7th child of the 2nd child of the progenitor. Thus, following Harris [22, §VI.2] and Kesten [25], we identify each $t \in \mathbf{T}_n$ as a rooted labeled tree with $\mathrm{verts}(t) \subseteq \mathbf{V}$, where $\mathbf{V} := \cup_{g=1}^{\infty} \mathbb{N}^g \cup \{0\}$ is the set of all finite sequences of positive integers, together with a root element denoted 0. Regard $\mathbf{V}$ as a rooted tree, with $v \overset{\mathbf{V}}{\to} w$ if and only if $w = (v, j)$ for some $j \in \mathbb{N}$, where $(v, j) \in \mathbb{N}^{g+1}$ is defined by appending $j$ to $v \in \mathbb{N}^g$ for $g \ge 1$, and where $(0, j) = j \in \mathbb{N}$. If $w = (v, j)$ call $w$ the $j$th child of $v$, call $j$ the rank of $w$, and write $j = \mathrm{rank}(w)$. So for each non-root $w \in \mathbf{V}$, the positive integer $\mathrm{rank}(w) \in \mathbb{N}$ is the last component of the finite sequence $w$.

A (finite or infinite) family tree is a subtree $t$ of $\mathbf{V}$ such that $0 \in \mathrm{verts}(t)$, and for each $v \in \mathrm{verts}(t)$ the set of children of $v$ is the set $\{(v, j) : 1 \le j \le c_v t\}$. Let $\mathbf{T}^{(\infty)}$ denote the set of all such family trees $t$. Each $t \in \mathbf{T}^{(\infty)}$ is a subtree of $\mathbf{V}$, whose edge relation $\overset{t}{\to}$ on $\mathrm{verts}(t)$ is defined by restriction of the edge relation $\overset{\mathbf{V}}{\to}$ on $\mathbf{V}$. A family tree $t$ is therefore uniquely identified by its vertex set $\mathrm{verts}(t) \subseteq \mathbf{V}$, and it is convenient to identify $t$ with $\mathrm{verts}(t)$. Thus $\mathbf{T}^{(\infty)}$ is identified as a collection of subsets of $\mathbf{V}$ subject to certain constraints indicated above, and we may write for instance $v \in t$ instead of $v \in \mathrm{verts}(t)$ to indicate that $v$ is a vertex of $t$. From the definitions above

$$\mathbf{T}_n := \{t \in \mathbf{T}^{(\infty)} : \#t = n\}; \qquad \mathbf{T} := \{t \in \mathbf{T}^{(\infty)} : \#t < \infty\} = \bigcup_{n=1}^{\infty} \mathbf{T}_n.$$

The height of a finite tree is the maximum height of all vertices in the tree. There is a natural restriction map $r_h : \mathbf{T}^{(\infty)} \to \mathbf{T}^{(h)}$ where $\mathbf{T}^{(h)}$ is the set of finite family trees of height at most $h$. For $t$ identified as a subset of $\mathbf{V}$, $r_h t$ is the tree formed by all vertices of $t$ of height at most $h$. A tree $t \in \mathbf{T}^{(\infty)}$ is identified by the sequence $(r_h t, h \ge 0)$. Note that the $r_h t \in \mathbf{T}^{(h)}$ are subject only to the consistency condition that $r_h t = r_h r_{h+1} t$. The set $\mathbf{T}^{(\infty)}$ is now identified as a subset of an infinite product of countable sets

$$\mathbf{T}^{(\infty)} \subseteq \mathbf{T}^{(0)} \times \mathbf{T}^{(1)} \times \mathbf{T}^{(2)} \times \cdots.$$

We give $\mathbf{T}^{(\infty)}$ the topology derived by this identification from the product of discrete topologies on the $\mathbf{T}^{(h)}$. So a sequence of trees $t_n$ has a limit $\lim_n t_n = t \in \mathbf{T}^{(\infty)}$ iff for every $h$ there exists $t^{(h)} \in \mathbf{T}^{(h)}$ and $n(h)$ such that $r_h t_n = t^{(h)}$ for all $n \ge n(h)$; the limit is then the unique $t \in \mathbf{T}^{(\infty)}$ with $r_h t = t^{(h)}$. In particular, for each $t \in \mathbf{T}^{(\infty)}$, the sequence $r_n t$ has limit $t$ as $n \to \infty$.

Let $s$ be a tree whose vertex set $V$ is a subset of some set $S$ equipped with a total ordering. In particular, we have in mind the cases $S = \mathbb{N}$ with the usual ordering, and $S = \mathbf{V}$ with lexicographical ordering and 0 as least element. Suppose each $v \in \mathrm{verts}(s)$ has only a finite number of children. Then there is a natural relabeling $\ell$ of the vertices of $s$ by $\mathbf{V}$ which defines a family tree $t$ associated with $s$, say $t = \mathrm{fam}(s)$. The relabeling $\ell : \mathrm{verts}(s) \to \mathbf{V}$ is defined as follows. Let $\ell(\mathrm{root}(s)) = 0$. For each non-root vertex $v$ of $s$ let $\mathrm{rank}(v)$ be the rank of $v$ amongst its siblings, that is the number of $w \in \mathrm{verts}(s)$ such that $w$ has the same parent as $v$ and $w \le v$. For $v$ with height $h \ge 1$ in $s$ let $(\mathrm{root}(s), v_1, \ldots, v_h = v)$ be the path from $\mathrm{root}(s)$ to $v$ in $s$, and define

$$\ell(v) := (\mathrm{rank}(v_1), \ldots, \mathrm{rank}(v_h)) \in \mathbb{N}^h.$$

The tree $t = \mathrm{fam}(s) \in \mathbf{T}^{(\infty)}$ is the subtree of $\mathbf{V}$ whose vertex set is $\ell(\mathrm{verts}(s))$, the range of the relabeling map $\ell : \mathrm{verts}(s) \to \mathbf{V}$.
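The relabeling map can be made concrete with a short sketch. It assumes a finite tree given by its root and a hypothetical `parent` dictionary (child label to parent label), with labels that Python can compare, and it represents the root element 0 of the family tree by the empty tuple `()`. None of these representation choices come from the paper.

```python
def fam(root, parent):
    """Return the relabeling map of a finite rooted tree as a dict label -> rank tuple.

    `parent` maps each non-root vertex to its parent. The family-tree label of a
    vertex is the tuple of sibling ranks along the path from the root; the root
    itself gets the empty tuple, standing in for the root element 0.
    """
    children = {v: [] for v in set(parent) | {root}}
    for child, par in parent.items():
        children[par].append(child)

    relabel = {root: ()}
    stack = [root]
    while stack:
        v = stack.pop()
        # rank(w) counts siblings w' with w' <= w, i.e. the 1-based position in sorted order
        for rank, w in enumerate(sorted(children[v]), start=1):
            relabel[w] = relabel[v] + (rank,)
            stack.append(w)
    return relabel

# Root 5 has children 3 and 9; vertex 9 has the single child 1.
print(fam(5, {3: 5, 9: 5, 1: 9}))
```

The set of values of the returned dictionary is the vertex set of the family tree associated with the input tree.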

For $t \in \mathbf{T}^{(\infty)}$ and $g \ge 0$, let $\mathrm{gen}(g, t)$ be the $g$th generation of individuals in $t$, in other words the set of vertices of $t$ of height $g$. To illustrate notation, there are the following identities between subsets of the set $\mathbf{V}$: for all $t$, for all $h = 0, 1, \ldots$:

$$\mathrm{gen}(h+1, t) = \bigcup_{v \in \mathrm{gen}(h, t)} \mathrm{children}(v, t); \qquad r_h t = \bigcup_{g=0}^{h} \mathrm{gen}(g, t).$$

Let $Z_h t := \#\mathrm{gen}(h, t)$, the size of generation $h$ of $t$. The above identities imply

$$Z_{h+1} t = \sum_{v \in \mathrm{gen}(h, t)} c_v t; \qquad \# r_h t = \sum_{n=0}^{h} Z_n t; \qquad \#t = \sum_{n=0}^{\infty} Z_n t.$$
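These identities are easy to check on a small example. The sketch below assumes a finite family tree stored as a Python set of rank tuples, with `()` standing in for the root; the helper names are illustrative only.

```python
def child_count(tree, v):
    """c_v t: the children of v are (v, 1), ..., (v, c_v t)."""
    c = 0
    while v + (c + 1,) in tree:
        c += 1
    return c

def generation(tree, h):
    """gen(h, t): the vertices of height h."""
    return {v for v in tree if len(v) == h}

def generation_sizes(tree):
    """[Z_0 t, Z_1 t, ...] up to the height of the finite tree."""
    top = max(len(v) for v in tree)
    return [len(generation(tree, h)) for h in range(top + 1)]

t = {(), (1,), (2,), (2, 1), (2, 2), (2, 2, 1)}
Z = generation_sizes(t)
print(Z)       # generation sizes Z_0 t, Z_1 t, ...
print(sum(Z))  # total size of t
```

For each height $h$ below the top, summing `child_count` over `generation(t, h)` reproduces the next entry of `Z`.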

The abbreviation

$$Z t := Z_1 t = c_0 t$$

makes a convenient notation for the number of individuals in the first generation of $t$, that is the number of children of the root 0 of $t$. Denote the trivial family tree with the single vertex 0 by $\bullet$. Starting from $r_0 t = \bullet$, a family tree $t$ is conveniently specified as the unique tree $t$ such that $r_h t = t^{(h)}$ for all $h$ for some sequence of trees $t^{(h)} \in \mathbf{T}^{(h)}$ determined recursively as follows. Given that $t^{(h)} \in \mathbf{T}^{(h)}$ has been defined, the set of vertices $\mathrm{gen}(h, t) = \mathrm{gen}(h, t^{(h)}) = r_h t - r_{h-1} t$ is determined, hence so is the size $Z_h t = Z_h t^{(h)}$ of this set; for each possible choice of $Z_h t$ non-negative integers $(c_v, v \in \mathrm{gen}(h, t^{(h)}))$, there is a unique $t^{(h+1)} \in \mathbf{T}^{(h+1)}$ such that $r_h t^{(h+1)} = t^{(h)}$ and $c_v t^{(h+1)} = c_v$ for all $v \in \mathrm{gen}(h, t^{(h)})$. So a unique tree $t \in \mathbf{T}^{(\infty)}$ is determined by specifying for each $h \ge 0$ the way in which these $Z_h t$ non-negative integers are chosen given that $r_h t = t^{(h)}$ for some $t^{(h)} \in \mathbf{T}^{(h)}$.

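A random family tree is a random element $\mathcal{T}$ of $\mathbf{T}^{(\infty)}$, formally specified by its sequence of height restrictions, say $\mathcal{T} = (r_h \mathcal{T}, h = 0, 1, \ldots)$, where each $r_h \mathcal{T}$ is a random variable with values in the countable set $\mathbf{T}^{(h)}$, and $r_h \mathcal{T} = r_h r_{h+1} \mathcal{T}$ for all $h$. The distribution of $\mathcal{T}$, denoted $\mathrm{dist}(\mathcal{T})$, is then determined by the sequence of distributions of $r_h \mathcal{T}$ for $h \ge 0$. Such a distribution is determined by a specification for each $h \ge 0$ of the joint conditional distribution given $r_h \mathcal{T}$ of the numbers of children $c_v \mathcal{T}$ as $v$ ranges over $\mathrm{gen}(h, r_h \mathcal{T})$.

Define convergence of distributions on $\mathbf{T}^{(\infty)}$ by weak convergence relative to the product of discrete topologies on $\mathbf{T}^{(h)}$. That is, for random family trees $\mathcal{T}_n, n = 1, 2, \ldots$ and $\mathcal{T}$, we say that $\mathcal{T}_n$ converges in distribution to $\mathcal{T}$, and write either $\mathcal{T}_n \overset{d}{\to} \mathcal{T}$, or $\mathrm{dist}(\mathcal{T}_n) \to \mathrm{dist}(\mathcal{T})$, or $\lim_n \mathrm{dist}(\mathcal{T}_n) = \mathrm{dist}(\mathcal{T})$ if

$$P(r_h \mathcal{T}_n = t) \to P(r_h \mathcal{T} = t) \quad \forall\, h \ge 0,\ t \in \mathbf{T}. \tag{1}$$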

2.2 Galton-Watson trees

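Let $p(\cdot) = (p(0), p(1), \ldots)$ be a probability distribution on the non-negative integers with $p(1) < 1$. Following [22, 25, 34, 35], call a random family tree $\mathcal{G}$ a Galton-Watson (GW) tree with offspring distribution $p(\cdot)$ if the number of children $Z\mathcal{G}$ of the root has distribution $p(\cdot)$:

$$P(Z\mathcal{G} = n) = p(n) \quad \forall\, n \ge 0$$

and for each $h = 1, 2, \ldots$, conditionally given $r_h \mathcal{G} = t^{(h)}$, the numbers of children $c_v \mathcal{G}, v \in \mathrm{gen}(h, t^{(h)})$, are i.i.d. according to $p(\cdot)$. Equivalently, for each $h \ge 1$ the distribution of $r_h \mathcal{G}$ is determined by the formula

$$P(r_h \mathcal{G} = t) = \prod_{v \in r_{h-1} t} p(c_v t) \quad \forall\, t \in \mathbf{T}^{(h)} \tag{2}$$

where the product is over all vertices $v$ of $t$ of height at most $h - 1$. The restriction of the distribution of $\mathcal{G}$ on $\mathbf{T}^{(\infty)}$ to the set $\mathbf{T}$ of finite family trees is then given by the formula

$$P(\mathcal{G} = t) = \prod_{v \in t} p(c_v t) \quad \forall\, t \in \mathbf{T} \tag{3}$$

where the product is over all vertices $v$ of $t$.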
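The two product formulae can be evaluated directly for small trees. The sketch below assumes a finite family tree stored as a set of rank tuples (with `()` for the root) and an offspring distribution stored as a dictionary from child counts to exact fractions; both representations and all function names are choices made for this illustration.

```python
from fractions import Fraction

def child_count(tree, v):
    """c_v t: the children of v are (v, 1), ..., (v, c_v t)."""
    c = 0
    while v + (c + 1,) in tree:
        c += 1
    return c

def gw_probability(tree, p):
    """Formula (3): product of p(c_v t) over all vertices v of the finite tree."""
    prob = Fraction(1)
    for v in tree:
        prob *= p.get(child_count(tree, v), Fraction(0))
    return prob

def gw_restriction_probability(tree, h, p):
    """Formula (2): product over vertices of height at most h - 1 (tree has height <= h)."""
    prob = Fraction(1)
    for v in tree:
        if len(v) <= h - 1:
            prob *= p.get(child_count(tree, v), Fraction(0))
    return prob

# Critical binary branching: 0 or 2 children, each with probability 1/2.
p = {0: Fraction(1, 2), 2: Fraction(1, 2)}
cherry = {(), (1,), (2,)}
print(gw_probability(cherry, p))                 # root has 2 children, both childless
print(gw_restriction_probability(cherry, 1, p))  # only the root's child count matters
```

The gap between the two values is the probability that both first-generation individuals are childless, which is the factor contributed by the height-1 vertices in (3) but not in (2).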

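Denote the mean of the offspring distribution of $\mathcal{G}$ by $\mu$:

$$\mu := E(Z\mathcal{G}) = \sum_n n\, p(n).$$

It is well known that $P(\#\mathcal{G} < \infty) = 1$, or equivalently $P(Z_h \mathcal{G} > 0) \to 0$ as $h \to \infty$, if and only if $\mu \le 1$. So for $\mu \le 1$ the distribution of $\mathcal{G}$ is completely determined by formula (3). Let

$$S_n = \sum_{i=1}^{n} X_i \quad \text{where the } X_i \text{ are i.i.d. with the offspring distribution } p(\cdot). \tag{4}$$

Whatever $\mu$, it is known [15] that the distribution of the total progeny $\#\mathcal{G}$ on the event $(\#\mathcal{G} < \infty)$ is given by the formula

$$P(\#\mathcal{G} = n) = \frac{1}{n} P(S_n = n - 1) \quad \forall\, n = 1, 2, \ldots \tag{5}$$

Let $k \ge 1$. Given $Z\mathcal{G} = k$, for $1 \le i \le k$ let $\mathcal{G}(i)$ be the subtree of $\mathcal{G}$ formed by the $i$th child of the root and all its descendants, and observe that the associated family trees $\mathrm{fam}(\mathcal{G}(i))$ for $1 \le i \le k$ are i.i.d. copies of $\mathcal{G}$. So

$$\mathrm{dist}(\#\mathcal{G} \mid Z\mathcal{G} = k) = \mathrm{dist}\left(1 + \sum_{i=1}^{k} \nu_i\right) \tag{6}$$

where $\nu_1, \nu_2, \ldots$ are i.i.d. copies of $\#\mathcal{G}$. For all $k \ge 1$ and $n \ge 1$

$$P(\#\mathcal{G} = n + 1 \mid Z\mathcal{G} = k) = P\left(\sum_{i=1}^{k} \nu_i = n\right) = \frac{k}{n} P(S_n = n - k) \tag{7}$$

for $S_n$ as in (4), where the first equality spells out (6), and the second equality generalizes (5). See [15, 27, 39, 47, 48] for derivations of this formula and other interpretations of the distribution of $\#\mathcal{G}$ involving random walks and queues.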

2.3 Poisson-Galton-Watson trees

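For $\mu \ge 0$ let $\mathcal{G}_\mu$ be a GW tree with the Poisson($\mu$) offspring distribution $p_\mu(n) := e^{-\mu} \mu^n / n!$. Denote the distribution of $\mathcal{G}_\mu$ on $\mathbf{T}^{(\infty)}$ by PGW($\mu$). From (3) and (2)

$$P(\mathcal{G}_\mu = t) = e^{-\mu \#t} \mu^{\#t - 1} \prod_{v \in t} \frac{1}{(c_v t)!} \quad \forall\, t \in \mathbf{T} \tag{8}$$

$$P(r_h \mathcal{G}_\mu = t) = P(\mathcal{G}_\mu = t) \exp(\mu Z_h t) \quad \forall\, t \in \mathbf{T}^{(h)}. \tag{9}$$

In this case, $S_n$ in (4) and (5) has Poisson($n\mu$) distribution. So from (5) the total progeny of a PGW($\mu$) tree has probability distribution $P_\mu$ on $\{1, 2, \ldots, \infty\}$ which is known as the Borel($\mu$) distribution [12, 35, 50]:

$$P(\#\mathcal{G}_\mu = n) = P_\mu(n) := \frac{(n\mu)^{n-1}}{n!} e^{-n\mu} \quad \forall\, n = 1, 2, \ldots \tag{10}$$

From (7), the sum of $k$ independent random variables $N_\mu(i)$, each with the Borel($\mu$) distribution (10), has distribution on $\{1, 2, \ldots, \infty\}$ specified by

$$P\left(\sum_{i=1}^{k} N_\mu(i) = n\right) = \frac{k}{n} \frac{(n\mu)^{n-k}}{(n-k)!} e^{-n\mu} \quad \forall\, n = k, k+1, \ldots \tag{11}$$

This is the Borel-Tanner distribution [21, 49, 50] with parameters $k$ and $\mu$.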
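Formulae (10) and (11) can be checked numerically against (5) and against a direct convolution. The sketch below works on the log scale to avoid overflow; the function names and the choice $\mu = 0.5$ are illustrative, and the mean $1/(1-\mu)$ used in the last check is the standard value for the total progeny of a subcritical branching process rather than something stated in this section.

```python
import math

def poisson_pmf(lam, n):
    """P(Poisson(lam) = n) for lam > 0, computed on the log scale."""
    return math.exp(-lam + n * math.log(lam) - math.lgamma(n + 1))

def borel_pmf(mu, n):
    """Formula (10): (n mu)^(n-1) e^(-n mu) / n!, for mu > 0 and n >= 1."""
    return math.exp((n - 1) * math.log(n * mu) - n * mu - math.lgamma(n + 1))

def borel_tanner_pmf(mu, k, n):
    """Formula (11): (k/n) P(Poisson(n mu) = n - k), for n >= k."""
    return (k / n) * poisson_pmf(n * mu, n - k)

mu = 0.5
# (5) with S_n ~ Poisson(n mu) reproduces (10).
print(borel_pmf(mu, 4), poisson_pmf(4 * mu, 3) / 4)
# Convolving two Borel(mu) laws reproduces (11) with k = 2.
n = 7
print(sum(borel_pmf(mu, j) * borel_pmf(mu, n - j) for j in range(1, n)),
      borel_tanner_pmf(mu, 2, n))
# Total mass and mean of Borel(mu) for mu < 1 (truncated series).
print(sum(borel_pmf(mu, j) for j in range(1, 2000)),
      sum(j * borel_pmf(mu, j) for j in range(1, 2000)))
```

Each printed pair should agree to floating-point accuracy, and the last line should be close to 1 and to $1/(1-\mu) = 2$.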

2.4 Uniform Random Trees

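Let $\mathbf{R}_{[n]}$ be the set of all rooted trees labeled by $[n] := \{1, 2, \ldots, n\}$. For a rooted labeled tree $t$ with $n$ vertices, let $\tilde{t}$ denote the corresponding rooted unlabeled tree. Formally, define $\tilde{t}$ to be the set of all trees $s \in \mathbf{R}_{[n]}$ obtained by some relabeling of the vertices of $t$ by $[n]$. So $\tilde{t} \in \tilde{\mathbf{R}}_{[n]}$ where $\tilde{\mathbf{R}}_{[n]}$ is a set of equivalence classes of elements of $\mathbf{R}_{[n]}$. Let $\mathcal{U}_n$ be a random tree with uniform distribution on $\mathbf{R}_{[n]}$. Aldous [4] observed that for $\mathcal{G}_\mu$ a PGW($\mu$) tree

$$(\tilde{\mathcal{G}}_\mu \mid \#\mathcal{G}_\mu = n) \overset{d}{=} \tilde{\mathcal{U}}_n. \tag{12}$$

That is, $\mathcal{G}_\mu$ given $(\#\mathcal{G}_\mu = n)$ and $\mathcal{U}_n$ induce identical distributions on $\tilde{\mathbf{R}}_{[n]}$ when unlabeled. To put this another way, fix $\mu > 0$ and generate a PGW($\mu$) family tree $\mathcal{G}_\mu$. Given that $\mathrm{verts}(\mathcal{G}_\mu) = V$ for some set $V$ with $\#V = n$, let $\mathcal{G}^\dagger_\mu \in \mathbf{R}_{[n]}$ be $\mathcal{G}_\mu$ relabeled by a uniform random permutation $\sigma : V \to [n]$. Then (12) amounts to:

$$\mathrm{dist}(\mathcal{G}^\dagger_\mu \mid \#\mathcal{G}_\mu = n) = \mathrm{dist}(\mathcal{U}_n). \tag{13}$$

Call a function $\phi$ of a rooted labeled tree $t$ an invariant if $\phi(t) = \phi(s)$ whenever $s$ is a relabeling of $t$. For example, the number $Z_h t$ of vertices of $t$ at height $h$ is an invariant. So is the matrix $M t = (M_{h,c} t;\ h \ge 0, c \ge 0)$ where $M_{h,c} t$ is the number of individuals in generation $h$ of $t$ that have $c$ children. The identity (12) can be restated as

$$(\phi(\mathcal{G}_\mu) \mid \#\mathcal{G}_\mu = n) \overset{d}{=} \phi(\mathcal{U}_n) \quad \forall \text{ invariant } \phi. \tag{14}$$

This identity for $\phi = M$ and $\phi = (Z_1, \ldots, Z_n)$ was discovered earlier and exploited by Kolchin [26, 27]. The following proposition records a sharper result, implicit in the discussion of [5] and explicit in [39], which obviously implies all of these identities (12)-(14). We call formula (15) the $\mathcal{U}_n$-representation of $\mathrm{dist}(\mathcal{G}_\mu \mid \#\mathcal{G}_\mu = n)$.

Proposition 1 For each $n = 1, 2, \ldots$ the conditional distribution of a PGW($\mu$) tree $\mathcal{G}_\mu$ given $\#\mathcal{G}_\mu = n$ is the same for all $\mu > 0$, and identical to the distribution of $\mathrm{fam}(\mathcal{U}_n)$. This common distribution is given by

$$P(\mathcal{G}_\mu = t \mid \#\mathcal{G}_\mu = n) = P(\mathrm{fam}(\mathcal{U}_n) = t) = \frac{n!}{n^{n-1}} \prod_{v \in t} \frac{1}{(c_v t)!} \tag{15}$$

for all $t \in \mathbf{T}$ with $\#t = n$.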
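One consequence of (15) that is easy to test is that the right side sums to 1 over all family trees with $n$ vertices. The sketch below enumerates those trees through their child-count sequences in depth-first order, an encoding chosen for this illustration (it is not used in the paper), and sums (15) in exact arithmetic.

```python
from fractions import Fraction
from math import factorial

def child_count_sequences(n):
    """Yield the depth-first child-count sequence of every family tree with n vertices."""
    def extend(prefix, open_slots):
        # open_slots = vertices announced (as the root or as someone's child) but not yet placed
        if len(prefix) == n:
            if open_slots == 0:
                yield tuple(prefix)
            return
        if open_slots == 0:
            return  # the tree closed up before reaching n vertices
        remaining = n - len(prefix)
        for c in range(remaining):
            yield from extend(prefix + [c], open_slots - 1 + c)
    yield from extend([], 1)

def u_n_probability(counts):
    """Right side of (15) for a tree with the given child counts."""
    n = len(counts)
    prob = Fraction(factorial(n), n ** (n - 1))
    for c in counts:
        prob /= factorial(c)
    return prob

for n in range(1, 7):
    trees = list(child_count_sequences(n))
    print(n, len(trees), sum(u_n_probability(t) for t in trees))
```

For $n = 4$ the enumeration produces the 5 trees of Figure 1, and for each $n$ the exact total printed in the last column should be 1.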

2.5 Conditioning on non-extinction

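An infinite random tree $\mathcal{G}^\infty$, which we call $\mathcal{G}$ conditioned on non-extinction, is derived in the following proposition from a critical or subcritical GW tree $\mathcal{G}$. The probabilistic description of $\mathcal{G}^\infty$ involves the size-biased distribution $p^*(\cdot)$ associated with a probability distribution $p(\cdot)$ on the non-negative integers with mean $\mu \in (0, \infty)$:

$$p^*(n) := \mu^{-1} n\, p(n) \quad \forall\, n \ge 0. \tag{16}$$

Here is an exact statement of "known result (ii)" in the Introduction.

Proposition 2 (Kesten [25]). Let $Z_n \mathcal{G} := \#\mathrm{gen}(n, \mathcal{G})$ be the number of individuals in the $n$th generation of a GW tree $\mathcal{G}$ with offspring distribution $p(\cdot)$ such that $p(0) < 1$ and $\mu \le 1$. Then

(i)

$$\mathrm{dist}(\mathcal{G} \mid Z_n \mathcal{G} > 0) \to \mathrm{dist}(\mathcal{G}^\infty) \text{ as } n \to \infty \tag{17}$$

where $\mathrm{dist}(\mathcal{G}^\infty)$ is the distribution of a random family tree $\mathcal{G}^\infty$ specified by

$$P(r_h \mathcal{G}^\infty = t) = \mu^{-h} (Z_h t)\, P(r_h \mathcal{G} = t) \quad \forall\, t \in \mathbf{T}^{(h)},\ h \ge 0. \tag{18}$$

(ii) Almost surely $\mathcal{G}^\infty$ contains a unique infinite path $(\mathrm{root} = V_0, V_1, V_2, \ldots)$ such that $V_{h+1}$ is a child of $V_h$ for every $h = 0, 1, 2, \ldots$.

(iii) For each $h$ the joint distribution of $r_h \mathcal{G}^\infty$ and $V_h$ is given by

$$P(r_h \mathcal{G}^\infty = t, V_h = v) = \mu^{-h} P(r_h \mathcal{G} = t) \quad \forall\, t \in \mathbf{T}^{(h)},\ v \in \mathrm{gen}(h, t). \tag{19}$$

(iv) The joint distribution of $(V_0, V_1, V_2, \ldots)$ and $\mathcal{G}^\infty$ is determined recursively as follows: for each $h = 0, 1, 2, \ldots$, given $(V_0, V_1, \ldots, V_h)$ and $r_h \mathcal{G}^\infty$, the numbers of children $c_v \mathcal{G}^\infty$ are independent as $v$ ranges over $\mathrm{gen}(h, \mathcal{G}^\infty)$, with distribution $p(\cdot)$ for $v \ne V_h$, and with the size-biased distribution $p^*(\cdot)$ for $v = V_h$; given also the numbers of children $c_v \mathcal{G}^\infty$ for $v \in \mathrm{gen}(h, r_h \mathcal{G}^\infty)$, the vertex $V_{h+1}$ has uniform distribution on the set of $c_{V_h} \mathcal{G}^\infty$ children of $V_h$.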
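The recursive description in part (iv) translates directly into a sampler for the height restriction of the conditioned tree together with the first vertices of its spine. The sketch below assumes an offspring distribution with finite support given as a list `p = [p(0), p(1), ...]`, stores vertices as rank tuples with `()` for the root, and uses illustrative function names throughout.

```python
import random

def size_biased(p):
    """Formula (16): p*(n) = n p(n) / mu for a list p = [p(0), p(1), ...]."""
    mu = sum(n * pn for n, pn in enumerate(p))
    return [n * pn / mu for n, pn in enumerate(p)]

def sample_conditioned_tree(p, height, rng):
    """Sample the restriction to the given height of the conditioned tree, with its spine."""
    p_star = size_biased(p)
    counts = range(len(p))
    tree = {()}
    spine = [()]
    generation = [()]
    for _ in range(height):
        next_generation = []
        next_spine_vertex = None
        for v in generation:
            on_spine = (v == spine[-1])
            # spine vertex: size-biased offspring law; every other vertex: p
            c = rng.choices(counts, weights=p_star if on_spine else p)[0]
            children = [v + (j,) for j in range(1, c + 1)]
            next_generation.extend(children)
            if on_spine:
                # p*(0) = 0, so the spine vertex always has at least one child
                next_spine_vertex = rng.choice(children)
        tree.update(next_generation)
        spine.append(next_spine_vertex)
        generation = next_generation
    return tree, spine

rng = random.Random(0)
tree, spine = sample_conditioned_tree([0.5, 0.0, 0.5], 4, rng)
print(sorted(tree, key=lambda v: (len(v), v)))
print(spine)
```

With the critical binary law used in the example, the size-biased law puts all its mass on 2, so every spine vertex has exactly two children.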

That (18) defines a probability distribution for an infinite family tree $\mathcal{G}^\infty$ follows from the well known fact that $(\mu^{-n} Z_n \mathcal{G}, n = 0, 1, \ldots)$ is a non-negative martingale with expectation 1. The sequence of height restrictions $(r_n \mathcal{G}^\infty, n = 0, 1, \ldots)$ which determines $\mathcal{G}^\infty$ is a Markov chain with state space $\mathbf{T}$ obtained as the Doob $h$-transform of the Markov chain $(r_n \mathcal{G}, n = 0, 1, \ldots)$ via the space-time harmonic function $h(n, t) := \mu^{-n} Z_n t$. See [41] for background and other applications of $h$-transforms, and [31] for an elegant treatment of the recursive construction (iv) of $\mathcal{G}^\infty$ and the infinite path $(V_h)$, which we call the spine of $\mathcal{G}^\infty$. As observed in [31], the construction (iv) of $(V_h)$ and $\mathcal{G}^\infty$ such that (18) and (19) hold can also be carried out in the supercritical case $1 < \mu < \infty$. While our focus in this paper will be on the case $0 < \mu \le 1$, we note in passing that for $\mu > 1$ the path $(V_h)$ is almost surely not the only infinite path from the root in $\mathcal{G}^\infty$; rather, there are uncountably many such paths almost surely. Also, the conditional limit theorem (17) does not hold for $\mu > 1$ with $\mathcal{G}^\infty$ constructed via (18). Rather, there is the elementary result that

$$\lim_{n \to \infty} \mathrm{dist}(\mathcal{G} \mid Z_n \mathcal{G} > 0) = \mathrm{dist}(\mathcal{G} \mid \#\mathcal{G} = \infty) \quad \text{for } \mu > 1$$

where the right side does not have the same distribution as $\mathcal{G}^\infty$ defined by (18), except when $p(\cdot)$ is degenerate.

In particular, Proposition 2 contains the classical result ([9] sec. I.8) that for the usual integer-valued GW process $(Z_h \mathcal{G}, h = 0, 1, \ldots)$ started at $Z_0 = 1$, for $\mu \le 1$ there is the conditioned limit theorem

$$\lim_{n \to \infty} \mathrm{dist}(Z_1 \mathcal{G}, \ldots, Z_h \mathcal{G} \mid Z_n \mathcal{G} > 0) = \mathrm{dist}(Z_1 \mathcal{G}^\infty, \ldots, Z_h \mathcal{G}^\infty) \quad \forall\, h = 1, 2, \ldots \tag{20}$$

where $(Z_h \mathcal{G}^\infty, h = 0, 1, \ldots)$ is a Markov chain with state space $\mathbb{N}$ and homogeneous transition probabilities defined as follows. Given $Z_h \mathcal{G}^\infty = m$ say, the number $Z_{h+1} \mathcal{G}^\infty$ of individuals in the next generation of $\mathcal{G}^\infty$ is distributed as the sum of $m$ independent random variables, with $m - 1$ of these variables distributed according to the offspring distribution $p(\cdot)$ of $(Z_h \mathcal{G})$, and one variable distributed according to the size-biased offspring distribution $p^*(\cdot)$. In other words, the process $(Z_h \mathcal{G}^\infty - 1, h = 0, 1, \ldots)$ is a branching process with immigration, starting from an initial population of zero, with offspring distribution $p(\cdot)$ and immigration distribution $\hat{p}(\cdot)$, where $\hat{p}(\cdot)$ is the distribution of $Z^* - 1$ for $Z^*$ with the size-biased offspring distribution $p^*(\cdot)$, that is

$$\hat{p}(n) := \mu^{-1} (n + 1)\, p(n + 1) \quad \forall\, n = 0, 1, \ldots \tag{21}$$

It is elementary and well known that

$$\hat{p}(\cdot) = p(\cdot) \text{ iff } p(\cdot) \text{ is Poisson}(\lambda) \text{ for some } \lambda \ge 0.$$

The following corollary is an easy consequence of this fact combined with the previous Proposition.
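The fixed-point property of the Poisson family under the map (21) can be illustrated numerically. The sketch below represents an offspring distribution by a finite list, truncating the Poisson law at a point where the neglected mass is far below floating-point resolution; the helper name, the truncation level and the binomial comparison are choices made here for illustration.

```python
import math

def immigration_distribution(p):
    """Formula (21): (n + 1) p(n + 1) / mu, for a list p = [p(0), p(1), ...]."""
    mu = sum(n * pn for n, pn in enumerate(p))
    return [(n + 1) * p[n + 1] / mu for n in range(len(p) - 1)]

lam = 0.8
poisson = [math.exp(-lam) * lam**n / math.factorial(n) for n in range(60)]
binomial = [0.36, 0.48, 0.16]  # Binomial(2, 0.4), which has the same mean 0.8

print(immigration_distribution(poisson)[:4], poisson[:4])  # the two lists agree
print(immigration_distribution(binomial), binomial)        # the two lists differ
```

In the binomial case the map gives the Binomial(1, 0.4) law, one trial fewer, so the distribution is not reproduced.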

Corollary 3 (Spinal Decomposition of $\mathcal{G}^\infty$). In the setting of Proposition 2, with $(V_h)$ the infinite spine of $\mathcal{G}^\infty$ derived by conditioning $\mathcal{G}$ on non-extinction, for $i = 0, 1, \ldots$ let $\mathcal{G}^{(i)}$ be the family tree derived from the subtree of $\mathcal{G}^\infty$ with root $V_i$ in the random forest obtained from $\mathcal{G}^\infty$ by deleting each edge along the spine, and let $V_{i+1}$ be the $J_i$th child of $V_i$. Then

(i) the trees $\mathcal{G}^{(i)}, i = 0, 1, \ldots$ are independent and almost surely finite, with identical distribution

$$P(\mathcal{G}^{(i)} = t) = \sum_{m=0}^{\infty} P(\mathcal{G} = t \mid Z\mathcal{G} = m)\, \hat{p}(m) \quad \forall\, t \in \mathbf{T}; \tag{22}$$

(ii) the trees $\mathcal{G}^{(i)}$ have the same distribution as $\mathcal{G}$ iff $p(\cdot)$ is Poisson;

(iii) conditionally given $(\mathcal{G}^{(i)}, i = 0, 1, \ldots)$ the ranks $J_i$ are independent and $J_i$ has uniform $[Z\mathcal{G}^{(i)} + 1]$ distribution, where $Z\mathcal{G}^{(i)} = c_{V_i} \mathcal{G}^\infty - 1$.

The common distribution of the $\mathcal{G}^{(i)}$ described by (22) is that of a modified GW tree, in which the number of first generation individuals has distribution $\hat{p}(\cdot)$, while these and all subsequent individuals have offspring distribution $p(\cdot)$. Note that $V_h = (J_0, \ldots, J_{h-1}) \in \mathbb{N}^h$ for all $h \ge 1$. So the spinal decomposition specifies the joint distribution of the spine of $\mathcal{G}^\infty$ and the sequence of finite family trees derived by cutting all edges of the spine. This determines the distribution of $\mathcal{G}^\infty$, for it is clear that $\mathcal{G}^\infty$ is a measurable function of $(V_h)$ and the $\mathcal{G}^{(i)}$. We record now for later use the following consequence of Proposition 2.

Lemma 4 In the setting of Proposition 2, with $(V_h)$ the infinite spine of $\mathcal{G}^\infty$ derived by conditioning $\mathcal{G}$ on non-extinction, let $H$ be a non-negative integer random variable independent of $\mathcal{G}^\infty$, and let $\hat{\mathcal{G}}_H$ be the family tree derived from the finite subtree of $\mathcal{G}^\infty$ that contains the root after $\mathcal{G}^\infty$ is cut into two subtrees by deletion of the edge $(V_H, V_{H+1})$. Then for each $t \in \mathbf{T}$ and each vertex $v$ of $t$ at height $h$

$$P(\hat{\mathcal{G}}_H = t, V_H = v) = P(H = h)\, \mu^{-h}\, \theta(c_v t)\, P(\mathcal{G} = t) \tag{23}$$

where

$$\theta(n) := \frac{\hat{p}(n)}{p(n)} = \frac{(n + 1)\, p(n + 1)}{\mu\, p(n)} \quad \forall\, n = 0, 1, \ldots \tag{24}$$

for $p(\cdot)$ the offspring distribution of $\mathcal{G}$ and $\mu := \sum_n n\, p(n) \le 1$.

Proof. By conditioning on $H$ it suffices to prove (23) for a constant $H$, say $H = h$. Let $t_h = r_h t$. Then for each $v$ at height $h$ in $t$ the left side of (23) equals

$$P(r_h \hat{\mathcal{G}}_h = t_h, V_h = v)\; P(\hat{\mathcal{G}}_h = t \mid r_h \hat{\mathcal{G}}_h = t_h, V_h = v). \tag{25}$$

But for $H = h$ fixed, $r_h \hat{\mathcal{G}}_h = r_h \mathcal{G}^\infty$ by construction, so (19) gives

$$P(r_h \hat{\mathcal{G}}_h = t_h, V_h = v) = \mu^{-h} P(r_h \mathcal{G} = t_h). \tag{26}$$

Also, given $r_h \hat{\mathcal{G}}_h = t_h$ and $V_h = v$, according to part (iv) of Proposition 2, $\hat{\mathcal{G}}_h$ develops over generations $h + 1, h + 2, \ldots$ much like $\mathcal{G}$, with individuals having independent numbers of offspring, except that in $\hat{\mathcal{G}}_h$ each individual except $v$ has offspring distribution $p(\cdot)$, whereas $v$ has offspring distribution $\hat{p}(\cdot)$. By consideration of the product formulae (2) and (3), it follows that for any particular tree $t$ in which $v$ has $n$ offspring,

$$P(\hat{\mathcal{G}}_h = t \mid r_h \hat{\mathcal{G}}_h = t_h, V_h = v) = \theta(n)\, P(\mathcal{G} = t \mid r_h \mathcal{G} = t_h) \tag{27}$$

for $\theta(n)$ as in (24). Combine (26) and (27) in (25) to obtain (23) for a fixed $H = h$. $\Box$

Conditioning on the total progeny. Kennedy [24] obtained an analog of the conditioned limit theorem (20) as $n \to \infty$ with conditioning on $(\#\mathcal{G} = n)$ instead of $(Z_n \mathcal{G} > 0)$, where $\#\mathcal{G} = \sum_n Z_n \mathcal{G}$ is the total progeny. His assumption on the offspring distribution $p(\cdot)$ is that the generating function $g(s) := \sum_n p(n) s^n$ satisfies

$$\exists\, a > 0 \text{ with } g(a) = a\, g'(a) < \infty, \text{ and } g''(a) < \infty. \tag{28}$$

Reinterpreting his argument in terms of family trees gives the convergence assertion in (29): the equality follows easily from the product formula (3).

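Proposition 5 Let $\mathcal{G}$ be a GW tree whose offspring generating function $g$ satisfies (28). Let $\bar{\mathcal{G}}$ be the critical GW tree whose offspring generating function is $g(a)^{-1} g(as) = g(a)^{-1} \sum_n a^n p(n) s^n$ for $a$ as in (28). Then

$$\mathrm{dist}(\mathcal{G} \mid \#\mathcal{G} = n) = \mathrm{dist}(\bar{\mathcal{G}} \mid \#\bar{\mathcal{G}} = n) \to \mathrm{dist}(\bar{\mathcal{G}}^\infty) \text{ as } n \to \infty \tag{29}$$

where $\bar{\mathcal{G}}^\infty$ is $\bar{\mathcal{G}}$ conditioned on non-extinction.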
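The change of offspring distribution in Proposition 5 can be computed explicitly when the support is finite. The equation $g(a) = a\, g'(a)$ in (28) says that the tilted distribution $a^n p(n) / g(a)$ has mean 1, and this mean increases with $a$, so bisection applies. The sketch below assumes a list `p = [p(0), p(1), ...]` with $p(0) > 0$ and some mass on 2 or more; the function names are illustrative.

```python
def tilted_distribution(p, a):
    """The offspring law a^n p(n) / g(a) whose generating function is g(as) / g(a)."""
    weights = [pn * a**n for n, pn in enumerate(p)]
    total = sum(weights)
    return [w / total for w in weights]

def mean(p):
    return sum(n * pn for n, pn in enumerate(p))

def critical_tilt(p, tol=1e-12):
    """Solve g(a) = a g'(a), i.e. find a > 0 at which the tilted law has mean 1."""
    assert p[0] > 0 and any(pn > 0 for pn in p[2:]), "need p(0) > 0 and mass on {2, 3, ...}"
    lo, hi = 0.0, 1.0
    while mean(tilted_distribution(p, hi)) < 1.0:
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if mean(tilted_distribution(p, mid)) < 1.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

p = [0.5, 0.3, 0.2]              # subcritical: mean 0.7
a = critical_tilt(p)
print(a)                          # for this p the equation reduces to 0.5 = 0.2 a^2
print(tilted_distribution(p, a), mean(tilted_distribution(p, a)))
```

In this subcritical example the solution exceeds 1, in line with the discussion of condition (28) in terms of the offspring mean.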

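In terms of the offspring mean $\mu$, condition (28) is always satisfied if $\mu > 1$ and $p(\cdot)$ is nondegenerate, with $a < 1$. If $\mu = 1$ then (28) holds if and only if $\sum_n n^2 p(n) < \infty$, in which case $a = 1$; then $\bar{\mathcal{G}} = \mathcal{G}$ and $\bar{\mathcal{G}}^\infty = \mathcal{G}^\infty$. If $\mu < 1$ then (28) requires $p(n)$ to decay exponentially, and $a > 1$; then the distributions of $\bar{\mathcal{G}}^\infty$ and $\mathcal{G}^\infty$ are different.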

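Assume for this paragraph that (28) holds, and consider what happens if we condition on $(\#\mathcal{G} \ge n)$ instead of $(\#\mathcal{G} = n)$ and then let $n \to \infty$:

$$\lim_{n \to \infty} \mathrm{dist}(\mathcal{G} \mid \#\mathcal{G} \ge n) = \begin{cases} \mathrm{dist}(\mathcal{G} \mid \#\mathcal{G} = \infty) & \text{if } \mu > 1 \\ \mathrm{dist}(\mathcal{G}^\infty) & \text{if } \mu = 1 \\ \mathrm{dist}(\bar{\mathcal{G}}^\infty) & \text{if } \mu < 1 \end{cases}$$

The first case is elementary, and the second two cases follow easily from Proposition 5. Note the paradoxical fact that while

$$\bigcap_n (Z_n \mathcal{G} > 0) = \bigcap_n (\#\mathcal{G} \ge n) = (\#\mathcal{G} = \infty)$$

and both intersections involve decreasing sequences of events,

$$\lim_n \mathrm{dist}(\mathcal{G} \mid Z_n \mathcal{G} > 0) = \lim_n \mathrm{dist}(\mathcal{G} \mid \#\mathcal{G} \ge n) \text{ only if } \mu \ge 1. \tag{30}$$

For $\mu > 1$ both limits in (30) are the naively defined $\mathrm{dist}(\mathcal{G} \mid \#\mathcal{G} = \infty)$. For $\mu = 1$ both limits yield $\mathrm{dist}(\mathcal{G}^\infty)$, which suggests the intuitive interpretation of $\mathcal{G}^\infty$ as $\mathrm{dist}(\mathcal{G} \mid \#\mathcal{G} = \infty)$. However, such interpretations are potentially slippery, as shown by the fact that for $0 < \mu < 1$

$$\lim_n \mathrm{dist}(\mathcal{G} \mid Z_n \mathcal{G} > 0) = \mathrm{dist}(\mathcal{G}^\infty) \ne \mathrm{dist}(\bar{\mathcal{G}}^\infty) = \lim_n \mathrm{dist}(\mathcal{G} \mid \#\mathcal{G} \ge n). \tag{31}$$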

3 Pruning Random Trees

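Let $\mathcal{T}$ be a random family tree. Call a $\mathbf{T}^{(\infty)}$-valued process $(\mathcal{T}_u, 0 \le u \le 1)$ a uniform pruning of $\mathcal{T}$ if

$$\mathcal{T}_1 = \mathcal{T} \text{ almost surely and } (\mathcal{T}_u, 0 \le u \le 1) \overset{d}{=} (\mathcal{T}(u), 0 \le u \le 1) \tag{32}$$

where $(\mathcal{T}(u), 0 \le u \le 1)$ is constructed as follows from some $\mathcal{T}(1)$ with $\mathcal{T}(1) \overset{d}{=} \mathcal{T}$. Here $\overset{d}{=}$ denotes equality in distribution, meaning equality of finite dimensional distributions in a display such as (32). Suppose that given $\mathcal{T}(1)$ there are independent uniform$(0,1)$ random variables $\xi_e$ attached to the edges $e$ of $\mathcal{T}(1)$; let $\mathcal{T}^\dagger(u)$ be the component containing the root in the subgraph of $\mathcal{T}(1)$ consisting of those edges $e$ with $\xi_e < u$, and let $\mathcal{T}(u) := \mathrm{fam}(\mathcal{T}^\dagger(u))$ be $\mathcal{T}^\dagger(u)$ relabeled as a family tree.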

Let $I$ be an interval of the form either $[0, \tau]$ for some $0 < \tau < \infty$ or $[0, \tau)$ for some $0 < \tau \le \infty$. Call a $\mathbf{T}^{(\infty)}$-valued process $(\mathcal{T}_t, t \in I)$ a uniform pruning process if

$$(\mathcal{T}_{ut}, 0 \le u \le 1) \text{ is a uniform pruning of } \mathcal{T}_t \text{ for all } t \in I. \tag{33}$$

If $(\mathcal{T}_u, 0 \le u \le 1)$ is a uniform pruning of $\mathcal{T}_1$ then $(\mathcal{T}_u, 0 \le u \le 1)$ is a uniform pruning process, and almost surely

$$\mathcal{T}_0 = \bullet, \text{ the single-vertex tree, and } \lim_{u \uparrow 1} \mathcal{T}_u = \mathcal{T}_1. \tag{34}$$

The second equality means that for almost every $\omega$ in the basic probability space, for each $h$ there exists a $u(h, \omega) < 1$ such that $r_h \mathcal{T}_u(\omega) = r_h \mathcal{T}_1(\omega)$ for all $u(h, \omega) < u \le 1$. A uniform pruning process is an inhomogeneous Markov process. All uniform pruning processes $(\mathcal{T}_t, t \in I)$ share the same co-transition probabilities, which have an obvious invariance under scaling: for $0 < u \le 1$ and $z > 0$,

$$\mathrm{dist}(\mathcal{T}_{uz} \mid \mathcal{T}_z = t) = \mathrm{dist}(\mathcal{T}^t_u) \quad \forall\, t \in \mathbf{T}^{(\infty)} \tag{35}$$

where $(\mathcal{T}^t_u, 0 \le u \le 1)$ is a uniform pruning of the fixed tree $t$. For a finite tree $t \in \mathbf{T}$ these transition probabilities are described by the following formula, which is derived by conditioning on which subtree $s$ of $t$ remains containing the root after cutting each edge of $t$ with probability $1 - u$:

$$P(\mathcal{T}_{uz} = r \mid \mathcal{T}_z = t) = P(\mathcal{T}^t_u = r) = \sum_s u^{\#s - 1} (1 - u)^{n(s,t)} \tag{36}$$

where the sum ranges over all subtrees $s$ of $t$ with $0 \in s$ and $\mathrm{fam}(s) = r$, and $n(s, t)$ is the number of edges $(v, w)$ of $t$ such that $v \in s$ and $w \in t - s$. This formula determines the co-transition probabilities of every uniform pruning process, for it is easily seen that a $\mathbf{T}^{(\infty)}$-valued process $(\mathcal{T}_t)$ is a uniform pruning process iff for each $h \ge 0$ the height restricted $\mathbf{T}$-valued process $(r_h \mathcal{T}_t)$ is a uniform pruning process.
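Formula (36) can be evaluated by brute force for a small fixed tree. The sketch below assumes the tree is a set of rank tuples with `()` standing in for the root, enumerates every subtree $s$ containing the root, relabels it as a family tree, and accumulates $u^{\#s-1}(1-u)^{n(s,t)}$. The function names and the example tree are illustrative choices.

```python
from collections import defaultdict
from itertools import combinations

def fam(s):
    """Relabel a root-containing subtree s (a set of rank tuples) as a family tree."""
    relabel = {(): ()}
    # parents are shorter than their children, so they are relabeled first
    for v in sorted(s - {()}, key=lambda x: (len(x), x)):
        par = v[:-1]
        rank = sum(1 for w in s if len(w) == len(v) and w[:-1] == par and w <= v)
        relabel[v] = relabel[par] + (rank,)
    return frozenset(relabel.values())

def pruning_distribution(t, u):
    """Distribution of the pruned tree given the fixed tree t, via formula (36)."""
    non_root = sorted(t - {()})
    dist = defaultdict(float)
    for k in range(len(non_root) + 1):
        for chosen in combinations(non_root, k):
            s = set(chosen) | {()}
            if any(v[:-1] not in s for v in chosen):
                continue  # s is not a subtree containing the root
            # n(s, t): edges of t leading from a vertex of s to a vertex outside s
            cut = sum(1 for w in non_root if w not in s and w[:-1] in s)
            dist[fam(s)] += u ** (len(s) - 1) * (1 - u) ** cut
    return dict(dist)

t = {(), (1,), (2,), (1, 1), (1, 2)}
dist = pruning_distribution(t, 0.3)
for r, prob in sorted(dist.items(), key=lambda item: (len(item[0]), sorted(item[0]))):
    print(sorted(r), prob)
print(sum(dist.values()))
```

The probabilities add up to 1, and the single-vertex tree has probability $(1-u)^2$ because both edges at the root must be cut.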

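If $(\mathcal{T}_u, 0 \le u \le 1)$ is derived from $\mathcal{T}_1$ by uniform pruning, the size $Z\mathcal{T}_u$ of the first generation of $\mathcal{T}_u$ is distributed as the sum of $Z\mathcal{T}_1$ independent Bernoulli$(u)$ variables. In terms of probability generating functions [16]

$$g_u(s) := \sum_{n=0}^{\infty} s^n P(Z\mathcal{T}_u = n) \text{ is given by } g_u(s) = g_1(1 - u + us). \tag{37}$$

In particular, if $\mathrm{dist}(Z\mathcal{T}_1) = \mathrm{Poisson}(\lambda)$ for some $\lambda \ge 0$ then $\mathrm{dist}(Z\mathcal{T}_u) = \mathrm{Poisson}(\lambda u)$.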
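The Poisson case of (37) can be checked numerically. The sketch below thins a Poisson distribution by keeping each child independently with probability $u$ and compares the result with the Poisson$(\lambda u)$ probabilities. The helper names, the truncation point `n_max`, and the parameter values are choices made for this illustration.

```python
import math

def poisson_pmf(mean, n):
    """P(N = n) for N with Poisson(mean) distribution."""
    return math.exp(-mean) * mean**n / math.factorial(n)

def thinned_pmf(pmf, u, n, n_max=100):
    """P(Z_u = n), where Z has probability function pmf and each of the Z
    children is retained independently with probability u.
    The sum over Z is truncated at n_max."""
    return sum(pmf(m) * math.comb(m, n) * u**n * (1 - u)**(m - n)
               for m in range(n, n_max + 1))

lam, u = 2.5, 0.3
max_err = max(
    abs(thinned_pmf(lambda m: poisson_pmf(lam, m), u, n) - poisson_pmf(lam * u, n))
    for n in range(10)
)
```

With these values `max_err` is at the level of floating point rounding, as (37) predicts.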

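The following two lemmas record some more technical properties of uniform pruning processes for ease of later reference.

Lemma 6 Let $(\mathcal{T}^n_u, u \in I_n)$ be a sequence of uniform pruning processes and let $t_n \in I_n$ be such that $t_n \to 1$ and $\mathcal{T}^n_{t_n} \xrightarrow{d} \mathcal{T}$ as $n \to \infty$ for some random family tree $\mathcal{T}$. Then

(i) for each $\varepsilon > 0$

$$\mathrm{dist}(\mathcal{T}^n_u, 0 \le u \le 1 - \varepsilon) \to \mathrm{dist}(\mathcal{T}_u, 0 \le u \le 1 - \varepsilon) \tag{38}$$

where $(\mathcal{T}_u, 0 \le u \le 1)$ is a uniform pruning of $\mathcal{T}_1$ with $\mathcal{T}_1 \overset{d}{=} \mathcal{T}$;

(ii) if $t_n \ge 1$ for all $n$ then (38) holds also for $\varepsilon = 0$.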

Proof. The convergence in (38) should be understood as convergence of finite dimensional distributions of height restricted processes for each finite height $h$. The probability that a height restricted uniform pruning process $(\mathcal{T}_u)$ passes through some sequence of trees $t_i$, $1 \le i \le k$, at times $0 < u_1 < \cdots < u_k$ is the product of $P(\mathcal{T}_{u_k} = t_k)$ and $k-1$ co-transition probabilities of the form (36). The conclusions of the lemma follow from the fact that the function of $(u, r, t)$ displayed in (36) is continuous in $u$ for all $r, t \in \mathbf{T}$. $\Box$

Lemma 7 If $(\mathcal{T}_u, 0 \le u < 1)$ is a uniform pruning process then $\mathcal{T}_1 := \lim_{u \uparrow 1} \mathcal{T}_u$ exists almost surely, and $(\mathcal{T}_u, 0 \le u \le 1)$ is a uniform pruning of $\mathcal{T}_1$.

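Proof. The process $(Z\mathcal{T}_u, 0 \le u < 1)$ is almost surely increasing, hence has an almost sure limit, say $Z_1 \le \infty$. Moreover, $Z_1 < \infty$ almost surely, as the following argument by contradiction shows. If $P(Z_1 = \infty) = 3\varepsilon > 0$, then for all $\delta < \varepsilon$ there exists $u \in (1/2, 1)$ with $P(Z\mathcal{T}_u > 3/\delta) > 2\varepsilon$. But from the pruning property, conditionally given $Z\mathcal{T}_u > 3/\delta$ the number $Z\mathcal{T}_{1/2}$ exceeds the number of successes in $3/\delta$ independent trials with success probability $1/(2u) > 1/2$. By the law of large numbers, $P(Z\mathcal{T}_{1/2} > 1/\delta \mid Z\mathcal{T}_u > 3/\delta) \to 1$ as $\delta \to 0$. Therefore, for all sufficiently small $\delta$ this conditional probability exceeds $1/2$, and for such small $\delta$ we deduce that $P(Z\mathcal{T}_{1/2} > 1/\delta) > \varepsilon$. Let $\delta \to 0$ to deduce that $P(Z\mathcal{T}_{1/2} = \infty) \ge \varepsilon$, in contradiction to the fact that $P(Z\mathcal{T}_{1/2} = \infty) = 0$ because $\mathcal{T}_{1/2} \in \mathbf{T}^{(\infty)}$ and $Zt < \infty$ for all $t \in \mathbf{T}^{(\infty)}$. Thus $P(Z_1 < \infty) = 1$. Therefore, $\lim_{u \uparrow 1} r_1 \mathcal{T}_u$ exists almost surely and equals $\mathcal{T}_1^{1}$ say, where $\mathcal{T}_1^{1}$ is the tree of height at most one in which the root has $Z_1$ children.

Now proceed inductively. Suppose for some $h$ that $\lim_{u \uparrow 1} r_h \mathcal{T}_u$ exists almost surely and equals say $\mathcal{T}_1^{h} \in \mathbf{T}^{(h)}$. Let $U_h = \inf\{u : r_h \mathcal{T}_u = \mathcal{T}_1^{h}\}$. Then $P(U_h < 1) = 1$ by inductive hypothesis, and for each $v$ in generation $h$ of $\mathcal{T}_1^{h}$ the number $c_v \mathcal{T}_u$ of children of $v$ in the next generation of $\mathcal{T}_u$ is increasing for $U_h < u < 1$, so $c_v \mathcal{T}_u \uparrow c_v(1)$ say as $u \uparrow 1$. That the $c_v(1)$ are a.s. finite for all $v$ in generation $h$ of $\mathcal{T}_1^{h}$ can be shown by a reprise of the previous argument by contradiction for $h = 0$ after conditioning on $\mathcal{T}_1^{h}$. It follows that $\lim_{u \uparrow 1} r_{h+1} \mathcal{T}_u = \mathcal{T}_1^{h+1}$ almost surely, where $\mathcal{T}_1^{h+1} \in \mathbf{T}^{(h+1)}$ is such that $r_h \mathcal{T}_1^{h+1} = \mathcal{T}_1^{h}$ and each $v$ in generation $h$ of $\mathcal{T}_1^{h}$ has $c_v(1)$ children in $\mathcal{T}_1^{h+1}$. So by induction, these limits $\mathcal{T}_1^{h}$ exist for all $h$ almost surely, and hence almost surely $\lim_{u \uparrow 1} \mathcal{T}_u = \mathcal{T}_1$ where $\mathcal{T}_1 \in \mathbf{T}^{(\infty)}$ is defined by $r_h \mathcal{T}_1 = \mathcal{T}_1^{h}$ for all $h$. $\Box$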

3.1 Transition rates

The transition rates for a uniform pruning process will be given in Lemma 8. Note that in this paper, Markov processes are constructed in fairly explicit fashion, in contrast to the usual way of specifying a Markov process by stating its transition rates.

For $\mathcal{T}_1$ with $\#\mathcal{T}_1 < \infty$, a uniform pruning process $(\mathcal{T}_u, 0 \le u \le 1)$ is an inhomogeneous Markov chain with step function paths of jump-hold type on the countable state space $\mathbf{T}$. This chain is determined by its co-transition probabilities (36), or by its co-transition rates, which are much simpler and can be described as follows. For $t$ in $\mathbf{T}^{(\infty)}$ and $w$ a non-root vertex of $t$, let $v = \mathrm{parent}(w)$. Deleting $(v,w)$ from the set of edges of $t$ defines a directed graph on $\mathrm{verts}(t)$ with two component subtrees, say $t^{w}$ and $t_{w}$, with $\mathrm{root}(t^{w}) = \mathrm{root}(t) = 0$, and $\mathrm{root}(t_{w}) = w$. Call $t^{w}$ the remaining tree and $t_{w}$ the pruned branch derived by pruning $t$ below $w$, or by cutting the edge $(v,w)$ of $t$. From the construction of a uniform pruning process $(\mathcal{T}_u, 0 \le u \le 1)$ with independent uniform times, there is the following formula: for all finite family trees $t$ and $r$ the co-transition rate $\hat{q}_u(t \to r)$ from $t$ to $r$ at time $0 < u \le 1$ is

$$\hat{q}_u(t \to r) = u^{-1} V(r,t) \tag{39}$$

where

$$V(r,t) := \#\{w \in \mathrm{verts}(t) - \{0\} : \mathrm{fam}(t^{w}) = r\}. \tag{40}$$

In words, $V(r,t)$ is the number of ways to choose a non-root vertex $w$ of $t$ such that if $t$ is pruned below $w$, and the remaining tree $t^{w}$ is relabeled as a family tree, the result is $r$. Lemma 9 below gives a more explicit description of how the set $\mathcal{V}(r,t)$ and its size $V(r,t)$ are determined by $r$ and $t$.

Consider now the forwards transition rates of a uniform pruning process $(\mathcal{T}_u, 0 \le u \le 1)$. For two finite family trees $r$ and $t$ and $0 < u < 1$, let $q_u(r \to t)$ be the rate of forwards transitions from $r$ to $t$ at time $u$, given $\mathcal{T}_u = r$. Combining (39) and the obvious identity of unconditional rates

$$P(\mathcal{T}_u = r)\, q_u(r \to t) = P(\mathcal{T}_u = t)\, \hat{q}_u(t \to r) \quad \forall\, r, t \in \mathbf{T}$$

gives the following formula.

Lemma 8 For a uniform pruning process $(\mathcal{T}_u, 0 \le u \le 1)$ with $\#\mathcal{T}_1 < \infty$, the forwards transition rate from $r$ to $t$ at time $0 < u < 1$ is

$$q_u(r \to t) = \frac{P(\mathcal{T}_u = t)\, V(r,t)}{u\, P(\mathcal{T}_u = r)}. \tag{41}$$

The meaning of the combinatorial factor $V(r,t)$ can be clarified in terms of the following operation on family trees. Suppose that $r, s \in \mathbf{T}^{(\infty)}$, that $v$ is a vertex of $r$, and that $j \in [c_v r + 1]$. Let $t(r,v,j,s)$ denote the family tree obtained by attaching the root of $s$ to $r$ as the $j$th child of $v$. This is the unique family tree $t$ whose vertices contain $v$ and a child $w$ of $v$ of rank $j$ such that $\mathrm{fam}(t^{w}) = r$ and $\mathrm{fam}(t_{w}) = s$. Note that $c_v t = c_v r + 1$, and that $\#t = \#r + \#s$. The notation is illustrated by Figure 2, in which $t = t(r,v,2,s) = t(r,v,3,s)$ so that $V(r,t) = 2$.

     

[Figure: family trees r, s and t, each drawn from its root, with the vertex v marked.]

Figure 2

The following lemma is intuitively clear from pictures like Figure 2; we leave its proof to the dedicated reader!

Lemma 9 For $r, t \in \mathbf{T}$, let $\mathcal{V}(r,t) := \{w \in \mathrm{verts}(t) - \{0\} : \mathrm{fam}(t^{w}) = r\}$.

(i) If $\mathcal{V}(r,t)$ is not empty, then this set is of the form $\{w_1, w_2, \ldots, w_k\}$ where $k = V(r,t)$ and the $w_i$ are consecutive siblings. That is, the $w_i$ have a common parent $v \in t$, and $w_i$ is the $(m+i)$th child of $v$ for some $m \ge 0$ and $1 \le i \le k$.

(ii) For any particular $w \in \mathcal{V}(r,t)$, the set $\mathcal{V}(r,t)$ is the maximal set $\{w_1, w_2, \ldots, w_k\}$ of consecutive children of $\mathrm{parent}(w)$ such that $w = w_i$ for some $i$ and $\mathrm{fam}(t^{w_i}) = \mathrm{fam}(t^{w})$ for all $i$.

(iii) The number $V(r,t)$ equals the number of representations of $t$ as $t = t(r,v,j,s)$. That is, $V(r,t)$ is the number of triples $(v,j,s)$ with $v \in r$, $j \in [c_v r + 1]$, $s \in \mathbf{T}$ such that $t(r,v,j,s) = t$.

(iv) For $r$ and $t$ such that $V(r,t) \ge 1$, there is a unique vertex $v$ of $r$ and a unique $s \in \mathbf{T}$ such that $t(r,v,j,s) = t$ for some $j \in [c_v r + 1]$.

(v) For given $r$, $v$ and $s$ let $I = I(r,v,s)$ be the set of $i$ such that the descendants in $r$ of the $i$th child of $v$ form a family tree identical to $s$. As $j$ ranges over $i_1 \le j \le i_m + 1$ where $\{i_1, \ldots, i_m\}$ is a maximal sequence of consecutive elements of $I$, the tree $t = t(r,v,j,s)$ is the same, and such that $V(r,t) = m + 1$, whereas every other choice of $j \in [c_v r + 1]$ defines a tree $t' = t(r,v,j,s)$ with $V(r,t') = 1$.
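The attachment operation $t(r,v,j,s)$ and the count $V(r,t)$ can be made concrete in a few lines of Python. This is only a sketch under an assumed encoding: a family tree is a nested tuple of its children's subtrees in birth order, and a vertex $v$ is named by the tuple of 1-based child ranks on the path from the root. The function names `attach`, `prunings` and `V` are hypothetical, chosen for this example.

```python
def attach(r, path, j, s):
    """t(r, v, j, s): attach the root of s as the j-th child (1-based) of
    the vertex v of r reached from the root by following `path`."""
    if not path:
        return r[:j - 1] + (s,) + r[j - 1:]
    i = path[0]
    return r[:i - 1] + (attach(r[i - 1], path[1:], j, s),) + r[i:]

def prunings(t):
    """Yield the remaining tree, relabeled as a family tree, for each
    choice of a non-root vertex w below which t is pruned."""
    for i, child in enumerate(t):
        yield t[:i] + t[i + 1:]              # w is the (i+1)-th child of the root
        for sub in prunings(child):          # w lies strictly below that child
            yield t[:i] + (sub,) + t[i + 1:]

def V(r, t):
    """Number of non-root vertices w of t whose removal leaves r, as in (40)."""
    return sum(1 for remaining in prunings(t) if remaining == r)

leaf = ()
r = ((leaf,), leaf)            # root v with two children; the second is a leaf
t = attach(r, (), 2, leaf)     # attach a leaf as the 2nd child of the root
```

Here the second child of the root of `r` already equals `s = leaf`, so attaching at rank 2 or rank 3 gives the same tree and $V(r,t) = 2$, in line with part (v) of Lemma 9 for $m = 1$; attaching at rank 1 gives a tree $t'$ with $V(r,t') = 1$.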

Given a family tree $r$, a vertex $v$ of $r$, and a random family tree $\mathcal{S}$, say that a random family tree $\mathcal{T}$ is constructed by random attachment of $\mathcal{S}$ to $r$ at $v$ if $\mathcal{T} = t(r,v,J,\mathcal{S})$ where $J$ has uniform $[c_v r + 1]$ distribution independently of $\mathcal{S}$. Part (iii) of the above lemma shows that the distribution of $\mathcal{T}$ is then determined by the following formula: for all $v \in r$ and $j \in [c_v r + 1]$

$$P(\mathcal{T} = t) = \frac{V(r,t)\, P(\mathcal{S} = s)}{c_v r + 1} \quad \text{for } t = t(r,v,j,s). \tag{42}$$

3.2 Pruning a Galton-Watson tree

Throughout this section let $(\mathcal{G}_u, 0 \le u \le 1)$ be the $\mathbf{T}^{(\infty)}$-valued process derived by uniform pruning of a GW tree $\mathcal{G}_1$ with offspring distribution $p_1(\cdot)$. We shall describe the forwards transition rates ((45) and Proposition 12) and transition probabilities (Proposition 14) for this process.

Repeated application of the argument which justified (37) yields "known fact (i)" from the Introduction.

Lemma 10 (Lyons [29]) The tree $\mathcal{G}_u$ is a GW tree whose offspring distribution $p_u(n) := P(Z\mathcal{G}_u = n)$ is determined for all $n = 0, 1, 2, \ldots$ by the formula

$$g_u(s) := \sum_n p_u(n) s^n = g_1(1 - u + us) \quad \forall\, u \in [0,1]. \tag{43}$$

In particular, if $\mathcal{G}_1$ is a PGW$(\mu_1)$ tree then $\mathcal{G}_u$ is a PGW$(u\mu_1)$ tree.

Lyons [29, 30] and Haase [20] give applications of Lemma 10 to the theory of percolation on trees. Note the case $g_1(s) = s^k$ for some positive integer $k$, when $(\mathcal{G}_u, 0 \le u \le 1)$ is a uniform pruning of the deterministic tree in which each vertex has $k$ children. Let

$$\mu_1 := E(Z\mathcal{G}_1) = \sum_n n\, p_1(n). \tag{44}$$

Note that for all $0 \le u \le 1$

$$E(Z\mathcal{G}_u) = \sum_n n\, p_u(n) = u\mu_1.$$

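The forwards transition rates. Assuming $\mu_1 \le 1$, so $\#\mathcal{G}_1 < \infty$ almost surely, we find from (3) and (41) that for $r, s \in \mathbf{T}$, for each $v \in r$ and $j \in [c_v r + 1]$ the forwards transition rate from $r$ to $t = t(r,v,j,s)$ at time $u$ given $\mathcal{G}_u = r$ is

$$q_u(r \to t) = \frac{V(r,t)\, \mu_1\, \phi_u(c_v r)}{c_v r + 1}\, P(\mathcal{G}_u = s) \quad \text{for each } t = t(r,v,j,s) \tag{45}$$

where

$$\phi_u(n) := \frac{p_u^{\circ}(n)}{p_u(n)} \quad \text{for } p_u^{\circ}(n) = (u\mu_1)^{-1}(n+1)\, p_u(n+1). \tag{46}$$

Note that $p_u^{\circ}(\cdot)$ is the distribution of $Z^{*}\mathcal{G}_u - 1$ for $Z^{*}\mathcal{G}_u$ with the size-biased distribution derived from the distribution $p_u$ of $Z\mathcal{G}_u$. The following lemma, which is the key to a later calculation, shows that $p_u^{\circ}(\cdot)$ is also the distribution of $Z\mathcal{G}_u^{\circ}$ where $(\mathcal{G}_u^{\circ}, 0 \le u \le 1)$ is a uniform pruning of $\mathcal{G}_1^{\circ}$ defined as a GW process with $\mathrm{dist}(Z\mathcal{G}_1^{\circ}) = \mathrm{dist}(Z^{*}\mathcal{G}_1 - 1)$.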

Lemma 11 For a non-negative integer random variable $Z$ with mean $\mu \in (0,\infty)$, let $Z^{*}$ have the size-biased distribution $P(Z^{*} = n) = nP(Z = n)/\mu$, let $Z^{\circ} = Z^{*} - 1$, and let $Z_u$ denote the sum of $Z$ independent Bernoulli$(u)$ variables. Then for each $0 \le u \le 1$,

$$(Z^{\circ})_u \overset{d}{=} (Z_u)^{\circ}. \tag{47}$$

Proof. Let $g(s) := \sum_n P(Z = n) s^n$. Then $Z^{\circ}$ has generating function $g'(s)/\mu$ where $\mu = g'(1)$, and $Z_u$ has generating function $g(1 - u + us)$ and mean $\mu u$. So (47) amounts to

$$\frac{g'(1 - u + us)}{\mu} = \frac{1}{\mu u}\, \frac{d}{ds}\, g(1 - u + us)$$

which is true by the chain rule. $\Box$

In terms of operators on the set of probability measures on the non-negative integers with finite non-zero mean, the lemma states that the operator $\mathrm{dist}(Z) \to \mathrm{dist}(Z^{\circ})$ commutes with the operator $\mathrm{dist}(Z) \to \mathrm{dist}(Z_u)$.
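The commutation rule (47) is easy to test on a distribution with finite support. In the sketch below a distribution on $\{0, 1, \ldots, N\}$ is a Python list of probabilities; the function names and the particular list `p` are assumptions made for illustration.

```python
import math

def thin(p, u):
    """Probability function of Z_u: each of the Z individuals is kept
    independently with probability u."""
    n_max = len(p) - 1
    return [sum(p[m] * math.comb(m, n) * u**n * (1 - u)**(m - n)
                for m in range(n, n_max + 1))
            for n in range(n_max + 1)]

def size_bias_minus_one(p):
    """Probability function of Z* - 1, where P(Z* = n) = n P(Z = n) / mean."""
    mean = sum(n * q for n, q in enumerate(p))
    return [(n + 1) * p[n + 1] / mean for n in range(len(p) - 1)]

p = [0.1, 0.2, 0.3, 0.25, 0.15]   # an arbitrary offspring distribution
u = 0.4
lhs = thin(size_bias_minus_one(p), u)   # size-bias first, then thin
rhs = size_bias_minus_one(thin(p, u))   # thin first, then size-bias
```

The two lists `lhs` and `rhs` agree up to rounding error, which is the content of Lemma 11 for this choice of `p` and `u`.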

Formula (45) only displays transition rates between finite trees, assuming $\mu_1 \le 1$. However, by comparison with (42), and by consideration of similar formulae for height restricted processes, we obtain the following intuitive description of the forwards evolution of $(\mathcal{G}_u)$ which is valid even without assuming $\mu_1 \le 1$. Let GW$(u)$ denote the distribution of the GW tree $\mathcal{G}_u$.

Proposition 12 The distribution of the process $(\mathcal{G}_u, 0 \le u \le 1)$ with $\mathcal{G}_0 = \bullet$ is uniquely determined by the following transition rates: at each time $u \in (0,1)$, given $\mathcal{G}_u = r$, each vertex $v$ of $r$ with $c$ children runs a risk of appending a new branch at rate $u^{-1}(c+1)\, p_u(c+1)/p_u(c)$ where $p_u(n) = P(Z\mathcal{G}_u = n)$ is determined by (43); given that a branch is appended to $v \in \mathcal{G}_u$ at time $u$ the new branch is appended as if by random attachment of a branch with distribution GW$(u)$.

Let $k = \sup\{n : p_1(n) > 0\}$. Provided $c < k$, $p_u(c)$ is a strictly positive and continuous function of $u \in (0,1)$, hence so is the rate $u^{-1}(c+1)\, p_u(c+1)/p_u(c)$ appearing above. So these rates are determined for all $0 \le c < k$, as limits as $\varepsilon \to 0$ of naively defined conditional probabilities from the joint distribution of $\mathcal{G}_u$ and $\mathcal{G}_{u+\varepsilon}$.

As a corollary of Proposition 12 there is the following characterization of a PGW tree in terms of the evolution of its uniform pruning process. Part (i) of the corollary sharpens a similar description of uniform pruning for an unlabeled PGW tree in Aldous [1].

Corollary 13 (i) If $\mathcal{G}_1$ is a PGW$(\mu_1)$ tree then $\mathcal{G}_u$ is a PGW$(u\mu_1)$ tree. The rate of attachment of branches to $v \in \mathcal{G}_u$ at time $u$ is then identically equal to $\mu_1$, for all $0 < u < 1$ and all $v$; given that a branch is appended to $v \in \mathcal{G}_u$ the new branch is appended as if by random attachment of a branch with distribution PGW$(u\mu_1)$.

(ii) Conversely, if a uniform pruning $(\mathcal{G}_u)$ of a GW tree $\mathcal{G}_1$ is such that for some $0 < u < 1$ [...] $> 0$, the rate of attachment of branches to $v$ given $\mathcal{G}_u$ does not depend on the number of children of $v$ in $\mathcal{G}_u$, then $\mathcal{G}_1$ is a PGW$(\mu_1)$ tree for some $0 \le \mu_1 < \infty$.

Proof. Part (i) is the specialization of Proposition 12 to a PGW tree. The assumption in (ii) forces $u^{-1}(c+1)\, p_u(c+1)/p_u(c) = g(u)$ for some $g(u)$ not depending on $c$. It follows that $p_u$ is Poisson$(u\mu_1)$ for some $\mu_1 < \infty$, and hence that $g_1(1 - u + us) = g_u(s) = \exp(-u\mu_1(1 - s))$. Since this determines $g_1(z) = \exp(-\mu_1(1 - z))$ for $z \in (1-u, 1)$, and a probability generating function $g_1(z)$ is an analytic function of $z$ for $|z| < 1$, it follows that $g_1(z) = \exp(-\mu_1(1 - z))$ for all $|z| < 1$, and hence that $\mathrm{dist}(Z\mathcal{G}_1) = \mathrm{Poisson}(\mu_1)$. $\Box$

Forwards transition probabilities. By extension of the earlier notion of attaching the root of one family tree $s$ as the $j$th child of some vertex $v$ of another family tree $r$ to form a new tree $t(r,v,j,s)$, we can make sense of attaching various trees as variously ranked children of various vertices of $r$. Given non-negative integers $k(v), v \in r$, and for each $v \in r$ an increasing sequence of $k(v)$ integers

$$1 \le j_1(v) < \cdots < j_{k(v)}(v) \le c_v r + k(v) \tag{48}$$

and a sequence of $k(v)$ trees $s_1(v), \ldots, s_{k(v)}(v)$, we can construct a new family tree by attaching the root of $s_i(v)$ to $r$ as the $j_i(v)$th child of $v$ for each $i \in [k(v)]$ and each $v \in r$. Given $r$ and $k(v), v \in r$, and some distribution for a random family tree $\mathcal{S}$, say that a random tree $\mathcal{T}$ is constructed by random attachment of $k(v)$ independent copies of $\mathcal{S}$ to $v$ for each $v \in r$ if the distribution of $\mathcal{T}$ is that induced by making the above construction with a uniform random choice of $(j_i(v), i \in [k(v)])$ subject to (48) and random choice of $s_i(v)$ according to the distribution of $\mathcal{S}$, independently for each $(v,i)$ with $v \in r$, $i \in [k(v)]$. Assuming for simplicity that $r$ is a finite tree and $\mathcal{S}$ is an almost surely finite tree, the probability of making the construction with any particular choice of $j_i(v)$ and $s_i(v)$ is then

$$\prod_{v \in r} \binom{c_v r + k(v)}{k(v)}^{-1} \prod_{i=1}^{k(v)} P(\mathcal{S} = s_i(v)). \tag{49}$$

As for random attachment of one copy of $\mathcal{S}$ to one vertex $v$ of $r$ discussed earlier, various choices of $j_i(v)$ and $s_i(v)$ may result in construction of the same family tree $t$. So for $\mathcal{T}$ constructed by random attachment of $k(v)$ independent copies of $\mathcal{S}$ to $v$ for each $v \in r$, the probability $P(\mathcal{T} = t)$ is obtained by summing the expression (49) over all such choices.

The following proposition describes the joint distribution of $\mathcal{G}_u$ and $\mathcal{G}_1$ for each fixed $0 \le u < 1$ in terms of random attachment of branches. The joint distribution of $\mathcal{G}_u$ and $\mathcal{G}_z$ for arbitrary $0 \le u < z \le 1$ then follows by rescaling. For $c \ge 0$ let $P_u(c, \cdot)$ denote the conditional distribution of $Z\mathcal{G}_1 - Z\mathcal{G}_u$ given $Z\mathcal{G}_u = c$. That is, for $m \ge 0$

$$P_u(c,m) = P(Z\mathcal{G}_1 - Z\mathcal{G}_u = m \mid Z\mathcal{G}_u = c) = \frac{p_1(m+c)}{p_u(c)} \binom{m+c}{c} u^{c} (1-u)^{m}. \tag{50}$$

It is known [40] that $P_u(c, \cdot)$ does not depend on $c$, meaning that $Z\mathcal{G}_1 - Z\mathcal{G}_u$ and $Z\mathcal{G}_u$ are independent, iff $p_1$ is Poisson.
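The "if" direction of the last statement can be seen numerically from (50): when $p_1$ is Poisson$(\mu_1)$ and $p_u$ is Poisson$(u\mu_1)$, the right side of (50) reduces to the Poisson$(\mu_1(1-u))$ probability of $m$, whatever the value of $c$. The following sketch checks this; the function names and parameter values are illustrative assumptions.

```python
import math

def poisson_pmf(mean, n):
    """P(N = n) for N with Poisson(mean) distribution."""
    return math.exp(-mean) * mean**n / math.factorial(n)

def extra_children_pmf(p1, pu, u, c, m):
    """Right side of (50): P(Z G_1 - Z G_u = m | Z G_u = c)."""
    return p1(m + c) / pu(c) * math.comb(m + c, c) * u**c * (1 - u)**m

mu1, u = 1.7, 0.35
p1 = lambda n: poisson_pmf(mu1, n)        # offspring law of G_1
pu = lambda n: poisson_pmf(u * mu1, n)    # offspring law of G_u, by Lemma 10

worst = max(
    abs(extra_children_pmf(p1, pu, u, c, m) - poisson_pmf(mu1 * (1 - u), m))
    for c in range(6) for m in range(6)
)
```

The value `worst` is of the order of rounding error, so in this example the conditional law of the number of pruned children does not depend on `c`.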

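Proposition 14 Fix $0 \le u < 1$. Given $\mathcal{G}_u$, let $(K_u(v), v \in \mathrm{verts}(\mathcal{G}_u))$ be independent with distributions $P_u(c_v \mathcal{G}_u, \cdot)$, and given the $K_u(v)$ for all $v \in \mathcal{G}_u$ let $\hat{\mathcal{G}}_1$ be defined by random attachment of $K_u(v)$ independent copies of $\mathcal{G}_1$ to $v$ for each $v \in \mathcal{G}_u$. Then $(\mathcal{G}_u, \hat{\mathcal{G}}_1) \overset{d}{=} (\mathcal{G}_u, \mathcal{G}_1)$.

Proof. From the uniform pruning construction, $(\mathcal{G}_u, \mathcal{G}_1) \overset{d}{=} (\mathrm{fam}(\mathcal{G}_u^{\dagger}), \mathcal{G}_1)$ where $\mathcal{G}_u^{\dagger}$ is the subtree of $\mathcal{G}_1$ containing the root after removing each edge of $\mathcal{G}_1$ independently with probability $1-u$. It is easily seen that conditionally given $\mathcal{G}_u^{\dagger}$, independently as $v$ ranges over $\mathcal{G}_u^{\dagger}$ the number of children $w$ of $v$ in $\mathcal{G}_1$ that are not children of $v$ in $\mathcal{G}_u^{\dagger}$ has distribution $P_u(c_v \mathcal{G}_u^{\dagger}, \cdot)$. Moreover, each one of these children $w \in \mathcal{G}_1 - \mathcal{G}_u^{\dagger}$ is the root of a subtree of $\mathcal{G}_1$ which when identified as a family tree is an independent copy of $\mathcal{G}_1$. The identity $(\mathcal{G}_u, \hat{\mathcal{G}}_1) \overset{d}{=} (\mathcal{G}_u, \mathcal{G}_1)$ now follows after appropriate relabeling, using the two following consequences of the uniform pruning construction:

(i) for each $v \in \mathbf{V}$, given $v \in \mathcal{G}_u^{\dagger}$ and both the number $c$ of children of $v$ in $\mathcal{G}_u^{\dagger}$ and the number $c + k$ of children of $v$ in $\mathcal{G}_1$, the set of $c$ ranks of the children in $\mathcal{G}_u^{\dagger}$ is a uniform random subset of $[c + k]$ of size $c$;

(ii) for each $h \ge 0$, conditionally given $r_h \mathcal{G}_u^{\dagger}$, $r_h \mathcal{G}_1$, and the $c(v)$ and $c(v) + k(v)$ as $v$ ranges over vertices of $\mathcal{G}_u^{\dagger}$ of height $h$, these random subsets of sizes $c(v)$ picked from $[c(v) + k(v)]$ are independent. $\Box$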

3.3 Pruning a GW tree conditioned on non-extinction

In this section, let $\mathcal{G}_1$ be a GW tree which is critical or subcritical, with non-degenerate offspring distribution, so

$$\mu_1 := E(Z\mathcal{G}_1) \in (0,1] \text{ and } \#\mathcal{G}_1 < \infty \text{ almost surely.}$$

As in the previous section, let

$$(\mathcal{G}_u, 0 \le u \le 1) \text{ be a uniform pruning of } \mathcal{G}_1.$$

Let $\mathcal{G}_1^{\infty}$ be $\mathcal{G}_1$ conditioned on non-extinction, as in Proposition 2, and let

$$(\mathcal{G}_u^{*}, 0 \le u \le 1) \text{ be a uniform pruning of } \mathcal{G}_1^{\infty}.$$

Intuitively,

$$(\mathcal{G}_u^{*}, 0 \le u \le 1) \text{ is } (\mathcal{G}_u, 0 \le u \le 1) \text{ conditioned on } \#\mathcal{G}_1 = \infty \tag{51}$$

but as indicated around (31), this interpretation is hazardous for $\mu_1 < 1$. Our descriptions of $(\mathcal{G}_u^{*}, 0 \le u \le 1)$ parallel similar descriptions, in terms of weak limits or h-transforms, of bridges and excursions of Markov processes such as Brownian motion [17, 41].

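To be careful about (51), it follows from Proposition 2 and Lemma 6 that

$$\mathrm{dist}(\mathcal{G}_u^{*}, 0 \le u \le 1) = \lim_{h \uparrow \infty} \mathrm{dist}(\mathcal{G}_u, 0 \le u \le 1 \mid Z_h \mathcal{G}_1 > 0) \tag{52}$$

and from Proposition 5 and Lemma 6 that provided $\mu_1 = 1$

$$\mathrm{dist}(\mathcal{G}_u^{*}, 0 \le u \le 1) = \lim_{n \uparrow \infty} \mathrm{dist}(\mathcal{G}_u, 0 \le u \le 1 \mid \#\mathcal{G}_1 = n). \tag{53}$$

Here (52) and (53) refer to convergence of finite dimensional distributions, which extends easily to convergence of distributions on suitable path spaces. Note that (53) is false for $\mu_1 < 1$. With the conditions and notation of Proposition 5, the limit in distribution on the right side of (53) is rather the distribution of the uniform pruning $(\tilde{\mathcal{G}}_u^{*}, 0 \le u \le 1)$ of $\tilde{\mathcal{G}}_1^{\infty}$ obtained by conditioning $\tilde{\mathcal{G}}$ on non-extinction, where $\tilde{\mathcal{G}}$ is the critical GW tree such that

$$\mathrm{dist}(\tilde{\mathcal{G}} \mid \#\tilde{\mathcal{G}} = n) = \mathrm{dist}(\mathcal{G} \mid \#\mathcal{G} = n) \quad \forall\, n \ge 1.$$

Since $\mathcal{G}_1^{\infty}$ is a tree with only one infinite path, it is obvious that $\#\mathcal{G}_u^{*} < \infty$ almost surely for all $u < 1$. And, from (34),

$$\mathcal{G}_{1-}^{*} = \mathcal{G}_1^{*} = \mathcal{G}_1^{\infty} \text{ almost surely.} \tag{54}$$

The following proposition provides an explicit formula for the distribution of $\mathcal{G}_u^{*}$ via its density relative to the distribution of $\mathcal{G}_u$. Recall that the distribution of $\mathcal{G}_u$ is given by the product formula (3) with $p_u(\cdot)$ in place of $p(\cdot)$.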



Proposition 15 The distribution of the uniform pruning process $(\mathcal{G}_u^{*}, 0 \le u < 1)$ with countable state space $\mathbf{T}$ is determined by the co-transition probabilities (36) of any uniform pruning process and the following distribution of $\mathcal{G}_u^{*}$ for each $0 \le u < 1$:

$$P(\mathcal{G}_u^{*} = t) = h^{*}(u,t)\, P(\mathcal{G}_u = t) \quad \forall\, t \in \mathbf{T} \tag{55}$$

where $h^{*}(0, \bullet) = 1$ and for $0 < u < 1$

$$h^{*}(u,t) := (1-u) \sum_{v \in t} \mu_1^{-h(v)}\, \phi_u(c_v t) \tag{56}$$

with $h(v)$ the height of $v$, and $c_v t$ the number of children of $v$ in $t$, and $\phi_u(n) := (n+1)\, p_u(n+1)\, (u \mu_1\, p_u(n))^{-1}$ for $p_u(\cdot)$ the offspring distribution of $\mathcal{G}_u$ as in (43).

After proving this proposition, we point out some reformulations of it.

Proof. It suffices to derive (55) for $(\mathcal{G}_u^{*})$ constructed as $\mathcal{G}_u^{*} = \mathrm{fam}(\mathcal{G}_u^{\dagger})$ where $\mathcal{G}_u^{\dagger}$ is the component containing the root in the subgraph of $\mathcal{G}_1^{\infty}$ consisting of those edges $e$ with $\xi_e \le u$ where the $\xi_e$ are independent uniform$(0,1)$ random variables. Let $(V_h)$ be the infinite spine of $\mathcal{G}_1^{\infty}$, let

$$H_u := \sup\{h : V_h \in \mathcal{G}_u^{\dagger}\}$$

that is the height of the highest spinal vertex of $\mathcal{G}_1^{\infty}$ that is a vertex of $\mathcal{G}_u^{\dagger}$. Formula (55) follows from formula (57) of the next lemma by summation over $v \in t$. $\Box$

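Lemma 16 For $0 \le u < 1$ let $V_u^{*}$ be the vertex of $\mathcal{G}_u^{*}$ at height $H_u$ which is the image of $V_{H_u} \in \mathcal{G}_u^{\dagger}$ via the relabeling map from $\mathcal{G}_u^{\dagger}$ to $\mathcal{G}_u^{*}$. Then for all $t \in \mathbf{T}$ and $v \in t$,

$$P(\mathcal{G}_u^{*} = t, V_u^{*} = v) = (1-u)\, \mu_1^{-h(v)}\, \phi_u(c_v t)\, P(\mathcal{G}_u = t). \tag{57}$$

Proof. The next lemma allows this formula to be read from (23) by application of Lemma 4 with $\mathcal{G} = \mathcal{G}_u$, $\mathcal{G}^{\infty} = \mathrm{fam}(\mathcal{G}_u^{\infty\dagger})$ for $\mathcal{G}_u^{\infty\dagger}$ as defined below, and $H = H_u$. $\Box$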

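Lemma 17 Fix $0 < u < 1$. Let $\mathcal{G}_u^{\infty\dagger}$ be the component containing 0 in the random graph defined by deletion of each edge $e$ of $\mathcal{G}_1^{\infty}$ not in its infinite spine and such that $\xi_e > u$, where the $\xi_e$ are the independent uniform$(0,1)$ variables used to construct $\mathcal{G}_u^{*}$. Then

(i) The tree $\mathcal{G}_u^{*}$ is the family tree derived from the finite subtree of $\mathrm{fam}(\mathcal{G}_u^{\infty\dagger})$ which remains after cutting the spine of $\mathrm{fam}(\mathcal{G}_u^{\infty\dagger})$ between heights $H_u$ and $H_u + 1$.

(ii) Let $\mathcal{G}_u^{\infty}$ be $\mathcal{G}_u$ conditioned on non-extinction. Then

$$\mathrm{fam}(\mathcal{G}_u^{\infty\dagger}) \overset{d}{=} \mathcal{G}_u^{\infty}.$$

(iii) The height $H_u$ is independent of $\mathrm{fam}(\mathcal{G}_u^{\infty\dagger})$ with the geometric$(1-u)$ distribution:

$$P(H_u = n) = (1-u)\, u^{n} \quad \forall\, n = 0, 1, \ldots$$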

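Proof. Part (i) is evident from the definitions, and (iii) is obvious because $H_u$ is the least $n$ such that the edge $e := (V_n, V_{n+1})$ has $\xi_e > u$. To prove (ii), observe that from the description of $\mathcal{G}_1^{\infty}$ in Proposition 2 (iv), the definition of $\mathcal{G}_u^{\infty\dagger}$, and (43), at each level $h$ of $\mathrm{fam}(\mathcal{G}_u^{\infty\dagger})$, the vertices of $\mathrm{fam}(\mathcal{G}_u^{\infty\dagger})$ have independent numbers of offspring with the offspring distribution $p_u(\cdot)$ of $\mathcal{G}_u$ for non-spinal vertices, and a modified offspring distribution for spinal vertices. According to Proposition 2 (iv) applied to $\mathcal{G} = \mathcal{G}_u$, a similar statement applies to $\mathcal{G}_u^{\infty}$. So to show $\mathrm{fam}(\mathcal{G}_u^{\infty\dagger}) \overset{d}{=} \mathcal{G}_u^{\infty}$, it suffices to check that

(a) the offspring distribution of each spinal vertex of $\mathrm{fam}(\mathcal{G}_u^{\infty\dagger})$ is identical to the offspring distribution of each spinal vertex of $\mathcal{G}_u^{\infty}$;

(b) given that a spinal vertex of $\mathrm{fam}(\mathcal{G}_u^{\infty\dagger})$ has $c$ children, the rank of the next spinal vertex of $\mathrm{fam}(\mathcal{G}_u^{\infty\dagger})$ is uniform on $[c]$.

But, in the notation of Lemma 11, by Proposition 2 (iv) applied to $\mathcal{G}_1^{\infty}$ and the construction of $\mathcal{G}_u^{\infty\dagger}$, each spinal vertex of $\mathrm{fam}(\mathcal{G}_u^{\infty\dagger})$ has offspring according to the distribution of $((Z\mathcal{G}_1)^{*} - 1)_u + 1 = ((Z\mathcal{G}_1)^{\circ})_u + 1$. On the other hand, by Proposition 2 (iv) applied to $\mathcal{G}_u^{\infty}$, each spinal vertex of $\mathcal{G}_u^{\infty}$ has offspring according to the distribution of $(Z\mathcal{G}_u)^{*} = (Z\mathcal{G}_u)^{\circ} + 1$. But

$$(Z\mathcal{G}_u)^{\circ} + 1 \overset{d}{=} ((Z\mathcal{G}_1)_u)^{\circ} + 1 \overset{d}{=} ((Z\mathcal{G}_1)^{\circ})_u + 1 \tag{58}$$

where the first equality is due to (43) and the second is the commutation rule of Lemma 11. This proves (a), and (b) is quite easy. $\Box$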

Reformulation of Proposition 15. Since $(\mathcal{G}_u^{*}, 0 \le u < 1)$ and $(\mathcal{G}_u, 0 \le u < 1)$ are two uniform pruning processes with the same co-transition probabilities (36), for all $r \in \mathbf{T}$ and $0 < u < 1$

$$\mathrm{dist}(\mathcal{G}_t^{*}, 0 \le t \le u \mid \mathcal{G}_u^{*} = r) = \mathrm{dist}(\mathcal{G}_t, 0 \le t \le u \mid \mathcal{G}_u = r). \tag{59}$$

We interpret this as an equality of distributions on the path space $\Omega[0,u]$, where for $u > 0$ we let $\Omega[0,u]$ be the space of all right continuous step function paths $\omega : [0,u] \to \mathbf{T}$ with at most a finite number of jumps, equipped with the $\sigma$-field generated by the maps $\omega \to \omega_t$ where $\omega = (\omega_t, 0 \le t \le u)$. Then (59) combined with (55) amounts to the following formula: for each non-negative measurable function $f$ defined on $\Omega[0,u]$,

$$E[f(\mathcal{G}_t^{*}, 0 \le t \le u)] = E[h^{*}(u, \mathcal{G}_u)\, f(\mathcal{G}_t, 0 \le t \le u)]. \tag{60}$$

Implicit in Proposition 15 is the consistency of this prescription of distributions of $(\mathcal{G}_t^{*}, 0 \le t \le u)$ as $u$ varies. By a standard argument, this consistency amounts to:

Corollary 18 Relative to the filtration generated by $(\mathcal{G}_u, 0 \le u < 1)$, the process $(h^{*}(u, \mathcal{G}_u), 0 \le u < 1)$ is a non-negative martingale, with

$$E\, h^{*}(u, \mathcal{G}_u) = 1 \quad \forall\, 0 \le u < 1.$$

In other words, $h^{*}(u,t)$ defined by (56) is a space-time harmonic function for the chain $(\mathcal{G}_u, 0 \le u < 1)$, and the Doob $h^{*}$-transform of $(\mathcal{G}_u, 0 \le u < 1)$ is $(\mathcal{G}_u^{*}, 0 \le u < 1)$. See the end of Section 4.4 for identification of the corresponding Martin boundary.

3.4 The sup ercritical case

In the critical case, the GW tree conditioned to be infinite (Proposition 2) has another interpretation as a limit of supercritical GW trees conditioned on non-extinction. This fact, and its consequence for pruning processes, are spelled out in Proposition 19.

It is convenient here to rescale the time parameter to make it identical to the offspring mean. So suppose $(\mathcal{G}_\mu, \mu \in I)$ is a uniform pruning process parameterized by an interval $I$ with $[0,1] \subset I \subseteq [0,\infty)$, such that $\mathcal{G}_\mu$ is a GW tree with $E(Z\mathcal{G}_\mu) = \mu$ for all $\mu \in I$. For example, take $I = [0,k]$ for some $k = 2, 3, \ldots$, and let $(\mathcal{G}_{uk}, 0 \le u \le 1)$ be a uniform pruning of the deterministic regular tree in which each vertex has $k$ children. Or, as in the next section, take $\mathcal{G}_\mu$ to be a PGW$(\mu)$ tree.

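Proposition 19

$$\lim_{\mu \downarrow 1} \mathrm{dist}(\mathcal{G}_\mu \mid \#\mathcal{G}_\mu = \infty) = \lim_{\mu \downarrow 1} \mathrm{dist}(\mathcal{G}_1 \mid \#\mathcal{G}_\mu = \infty) = \mathrm{dist}(\mathcal{G}_1^{\infty}) \tag{61}$$

$$\lim_{\mu \downarrow 1} \mathrm{dist}(\mathcal{G}_u, 0 \le u \le 1 \mid \#\mathcal{G}_\mu = \infty) = \mathrm{dist}(\mathcal{G}_u^{*}, 0 \le u \le 1) \tag{62}$$

Proof. By application of Lemma 6, it suffices to show that

$$\mathrm{dist}(\mathcal{G}_\mu \mid \#\mathcal{G}_\mu = \infty) \to \mathrm{dist}(\mathcal{G}_1^{\infty}) \text{ as } \mu \downarrow 1. \tag{63}$$

Let

$$F(\mu) := P(\#\mathcal{G}_\mu < \infty); \quad \bar{F}(\mu) := 1 - F(\mu) = P(\#\mathcal{G}_\mu = \infty). \tag{64}$$

Let $g_\mu$ be the generating function of $Z\mathcal{G}_\mu$. It is well known that the extinction probability $F(\mu)$ is the least non-negative root of the equation

$$F = g_\mu(F) \tag{65}$$

and that $\bar{F}(\mu)$ is strictly positive iff $\mu > 1$. Fix $\nu > 1$ with $\nu \in I$. It is assumed that $(\mathcal{G}_{u\nu}, 0 \le u \le 1)$ is a uniform pruning of $\mathcal{G}_\nu$, and hence

$$g_\mu(s) = g_\nu(1 - \mu/\nu + (\mu/\nu)\, s). \tag{66}$$

It is easy to see from this formula and the strict convexity of $g_\nu$ that $F(\mu)$ is a continuous decreasing function of $\mu$. Hence $\bar{F}(\mu) \downarrow 0$ as $\mu \downarrow 1$. Now fix $h \ge 0$ and $t \in \mathbf{T}$ and compute

$$P(r_h \mathcal{G}_\mu = t \mid \#\mathcal{G}_\mu = \infty) = P(r_h \mathcal{G}_\mu = t)\, \frac{1 - F(\mu)^{Z_h t}}{\bar{F}(\mu)} \to P(r_h \mathcal{G}_1 = t)\, Z_h t = P(r_h \mathcal{G}_1^{\infty} = t) \text{ as } \mu \downarrow 1$$

where the continuity of $P(r_h \mathcal{G}_\mu = t)$ is evident from (2) and the explicit formula for $p_\mu(\cdot)$ in terms of $p_\nu(\cdot)$ which can be read from (66), and the last equality is (18) for $\mu = 1$. $\Box$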

The ascension time. In terms of the uniform pruning process $(\mathcal{G}_\mu, \mu \in I)$ we define the ascension time $A := \inf\{\mu : \#\mathcal{G}_\mu = \infty\}$. So $A > 1$ a.s. The events $(A \le \mu)$ and $(\#\mathcal{G}_\mu = \infty)$ are identical, so

$$P(A \le \mu) = P(\#\mathcal{G}_\mu = \infty) = \bar{F}(\mu) \quad \forall\, \mu \in I \tag{67}$$

and we interpret (62) intuitively as

$$\mathrm{dist}(\mathcal{G}_u, 0 \le u \le 1 \mid A = 1) = \mathrm{dist}(\mathcal{G}_u^{*}, 0 \le u \le 1). \tag{68}$$

In contrast, Proposition 12 implies that $\#\mathcal{G}_{A-} < \infty$ a.s. and implies an explicit formula for the joint law of $\mathcal{G}_{A-}$ and $A$, which can be used to obtain a continuously varying conditional distribution of $\mathcal{G}_{A-}$ given $A = a$ for $a > 1$. But this distribution is concentrated on finite trees rather than on infinite ones. In Section 4.2 we study the Poisson case where there is much simplification.

4 The PGW pruning pro cess

It follows easily from Lemma 10 by Kolmogorov's extension theorem that there exists a unique distribution for a $\mathbf{T}^{(\infty)}$-valued inhomogeneous Markov process $(\mathcal{G}_\mu, 0 \le \mu < \infty)$ such that

$$\mathrm{dist}(\mathcal{G}_\mu) = \mathrm{PGW}(\mu) \quad \forall\, 0 \le \mu < \infty \tag{69}$$

and $(\mathcal{G}_{u\mu}, 0 \le u \le 1)$ is a uniform pruning of $\mathcal{G}_\mu$ for each $\mu > 0$. Essentially the same Markov process, with states identified as rooted unlabeled trees rather than family trees, featured in Aldous [1]. We shall show how the results of section 3 may be simplified and extended in this PGW setting. We sometimes give both "proof by specialization" from the general GW result in section 3, and an "autonomous proof" directly exploiting Poisson structure.

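4.1 The joint law of $(\mathcal{G}_\lambda, \mathcal{G}_\mu)$

For each fixed pair of times $(\lambda, \mu)$ with $0 \le \lambda < \mu < \infty$ this joint law is determined by the PGW$(\mu)$ distribution of $\mathcal{G}_\mu$ and the conditional distribution of $\mathcal{G}_\lambda$ given $\mathcal{G}_\mu$, as given by (36). The following specialization of Proposition 14 describes the conditional distribution of $\mathcal{G}_\mu$ given $\mathcal{G}_\lambda$: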

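Proposition 20 Fix $\lambda$ and $\mu$ with $0 \le \lambda < \mu < \infty$. Given $\mathcal{G}_\lambda$, let $(N_{\lambda,\mu}(v), v \in \mathcal{G}_\lambda)$ be independent with Poisson$(\mu - \lambda)$ distribution, and given the $N_{\lambda,\mu}(v), v \in \mathcal{G}_\lambda$, let $\hat{\mathcal{G}}_\mu$ be defined by random attachment of $N_{\lambda,\mu}(v)$ independent copies of $\mathcal{G}_\mu$ to $v$ for each $v \in \mathcal{G}_\lambda$. Then $(\mathcal{G}_\lambda, \hat{\mathcal{G}}_\mu) \overset{d}{=} (\mathcal{G}_\lambda, \mathcal{G}_\mu)$. In particular

$$(\#\mathcal{G}_\lambda, \#\mathcal{G}_\mu) \overset{d}{=} \Big(\#\mathcal{G}_\lambda,\ \#\mathcal{G}_\lambda + \sum_{i=1}^{N_{\lambda,\mu}} \#\mathcal{G}_\mu(i)\Big) \tag{70}$$

where $N_{\lambda,\mu}$ is the sum of $N_{\lambda,\mu}(v)$ over all vertices $v$ of $\mathcal{G}_\lambda$, so given $\mathcal{G}_\lambda$ with $\#\mathcal{G}_\lambda = \ell$ the distribution of $N_{\lambda,\mu}$ is Poisson with mean $\ell(\mu - \lambda)$, and given $\mathcal{G}_\lambda$ and $N_{\lambda,\mu}$ the $\mathcal{G}_\mu(i)$ are i.i.d. copies of $\mathcal{G}_\mu$.

For general offspring distribution, the process $(\#\mathcal{G}_u)$ is not necessarily Markov, but in the Poisson setting it is. We suspect that either of the results of the following corollary can be used to characterize the Poisson case.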

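Corollary 21 The process $(\#\mathcal{G}_\mu, \mu \ge 0)$ with state space $\{1, 2, \ldots, \infty\}$ has the Markov property, and

$$\text{the process } ((1 - \mu)\, \#\mathcal{G}_\mu, 0 \le \mu < 1) \text{ is a martingale} \tag{71}$$

relative to the filtration generated by $(\mathcal{G}_\mu, 0 \le \mu < 1)$.

Proof by specialization. The forwards Markov property of $(\#\mathcal{G}_\mu, \mu \ge 0)$ follows easily from (70) and the Markov property of $(\mathcal{G}_\mu, \mu \ge 0)$. The martingale property is the particular case of Corollary 18 for $\mathcal{G}_1$ a PGW$(1)$ tree.

Autonomous proof. Result (71) can be checked directly from (70) as follows. Since

$$E(\#\mathcal{G}_\mu) = E\Big(\sum_{h=0}^{\infty} Z_h \mathcal{G}_\mu\Big) = \sum_{h=0}^{\infty} \mu^{h} = (1 - \mu)^{-1} \tag{72}$$

formula (70) implies that for $0 \le \lambda \le \mu < 1$

$$E(\#\mathcal{G}_\mu \mid \#\mathcal{G}_\lambda) = \#\mathcal{G}_\lambda + \#\mathcal{G}_\lambda\, (\mu - \lambda)(1 - \mu)^{-1} = \Big(\frac{1 - \lambda}{1 - \mu}\Big)\, \#\mathcal{G}_\lambda.$$

With the Markov property of $(\#\mathcal{G}_\mu, 0 \le \mu < 1)$ this gives

$$E((1 - \mu)\, \#\mathcal{G}_\mu \mid \mathcal{G}_\nu, 0 \le \nu \le \lambda) = E((1 - \mu)\, \#\mathcal{G}_\mu \mid \#\mathcal{G}_\lambda) = (1 - \lambda)\, \#\mathcal{G}_\lambda$$

which is (71). $\Box$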

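Formulae for the forwards transition probabilities of the Markov chain $(\#\mathcal{G}_\mu, \mu \ge 0)$ can be obtained as follows from the representation (70). Consider for $\alpha > 0$ and $0 \le \mu \le 1$, the distribution of

$$X_{\alpha,\mu} := \sum_{i=1}^{N_\alpha} X_{\mu,i}$$

where $N_\alpha$ has Poisson$(\alpha)$ distribution, and given $N_\alpha = n$ the $X_{\mu,i}$ for $1 \le i \le n$ are i.i.d. with the Borel$(\mu)$ distribution $P_\mu$. The distribution of $X_{\alpha,\mu}$ is known [13] as the generalized Poisson distribution (GPD) on $\{0, 1, \ldots, \infty\}$ with parameters $(\alpha, \mu)$, and given by the formula

$$P(X_{\alpha,\mu} = k) = \frac{1}{k!}\, \alpha\, (\alpha + k\mu)^{k-1}\, e^{-\alpha - k\mu} \quad \forall\, k = 0, 1, \ldots \tag{73}$$

which follows easily from (11). According to (70), for $0 \le \lambda \le \mu < \infty$ the conditional distribution of $\#\mathcal{G}_\mu - \#\mathcal{G}_\lambda$ given $\#\mathcal{G}_\lambda = \ell$ is GPD$(\alpha, \mu)$ for $\alpha = \ell(\mu - \lambda)$. That is to say

$$P(\#\mathcal{G}_\mu = \ell + k \mid \#\mathcal{G}_\lambda = \ell) = P(X_{\alpha,\mu} = k) \tag{74}$$

as in (73) for $\alpha = \ell(\mu - \lambda)$. Take $k = m - \ell$ in (74) to obtain the following formula for $0 \le \lambda < \mu$ and $1 \le \ell \le m < \infty$:

$$P(\#\mathcal{G}_\mu = m \mid \#\mathcal{G}_\lambda = \ell) = \frac{\ell(\mu - \lambda)}{(m - \ell)!}\, (m\mu - \ell\lambda)^{m - \ell - 1}\, e^{-(m\mu - \ell\lambda)}. \tag{75}$$

Using Bayes' rule and the Borel distributions (10) of $\#\mathcal{G}_\lambda$ and $\#\mathcal{G}_\mu$, this formula can be inverted to obtain an expression for the co-transition probability $P(\#\mathcal{G}_\lambda = \ell \mid \#\mathcal{G}_\mu = m)$. Due to the $\mathcal{U}_n$-representation (15) of $\mathrm{dist}(\mathcal{G}_\mu \mid \#\mathcal{G}_\mu = n)$ and the scaling property of a uniform pruning process, this co-transition probability depends on $(\lambda, \mu)$ only through the ratio $\lambda/\mu$, and has a simple combinatorial interpretation in terms of pruning a uniform random tree. See Section 4.8 for further discussion and references to earlier appearances of the same co-transition probabilities.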
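Formulas (73) and (75) lend themselves to a quick numerical sanity check. The sketch below evaluates the GPD probabilities on the log scale, sums them for one subcritical parameter choice, and compares the mean with $\alpha/(1-\mu)$, the value suggested by (72) for a Poisson$(\alpha)$ number of Borel$(\mu)$ summands. The function names, the truncation at 400 terms, and the parameter values are assumptions made for this example.

```python
import math

def gpd_pmf(alpha, mu, k):
    """Formula (73): P(X = k) for the generalized Poisson distribution,
    evaluated via logarithms to avoid overflow for large k."""
    log_p = (math.log(alpha) + (k - 1) * math.log(alpha + k * mu)
             - (alpha + k * mu) - math.lgamma(k + 1))
    return math.exp(log_p)

def size_transition(lam, mu, l, m):
    """Formula (75): P(#G_mu = m | #G_lam = l), read off from (74)."""
    return gpd_pmf(l * (mu - lam), mu, m - l)

alpha, mu = 1.2, 0.5
total = sum(gpd_pmf(alpha, mu, k) for k in range(400))
mean = sum(k * gpd_pmf(alpha, mu, k) for k in range(400))
```

For these values `total` is numerically 1 and `mean` is numerically $\alpha/(1-\mu) = 2.4$; `size_transition` agrees with the closed form displayed in (75).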

4.2 Transition rates and the ascension pro cess

For each $h = 1, 2, \ldots$ the height restricted process $(r_h \mathcal{G}_\mu, 0 \le \mu < \infty)$ is an inhomogeneous Markov chain with countable state space $\mathbf{T}^{(h)}$ and step-function paths, whose co-transition rates and co-transition probabilities can be read from the general formulae of section 3.

The process $(\mathcal{G}_\mu, 0 \le \mu < \infty)$ develops with time running forwards by a process of attachment of trees which can be described as follows, due to Corollary 13. Starting from $\mathcal{G}_0 = \bullet$, the single-vertex tree, at each time $\mu \ge 0$ and at each vertex $v$ of $\mathcal{G}_\mu$, attachments to $v$ are made at rate 1; given $\mathcal{G}_\mu = r$ and that an attachment is made to $v \in r$ at time $\mu$, the tree to be attached has PGW$(\mu)$ distribution, and the root of this tree is attached to $r$ as the $j$th child of $v$ for a $j$ chosen independently and uniformly from $[c_v r + 1]$. Thus for each $v \in r$ and $j \in [c_v r + 1]$ the forwards transition rate from $r$ to $t = t(r,v,j,s)$ at time $\mu$ given $\mathcal{G}_\mu = r$ is

$$q_\mu(r \to t) = \frac{V(r,t)\, P(\mathcal{G}_\mu = s)}{c_v r + 1} \quad \text{for each } t = t(r,v,j,s). \tag{76}$$

The rates (76) for $r, t \in \mathbf{T}$ determine the evolution of $(\mathcal{G}_\mu, \mu \ge 0)$ only up to the ascension time $A := \inf\{\mu : \#\mathcal{G}_\mu = \infty\}$. As noted in Aldous [1], what happens at time $A$ is that the process attaches an infinite branch to some vertex of $\mathcal{G}_{A-}$, which is an almost surely finite random tree, to form an infinite tree $\mathcal{G}_A$.

Consider now the ascension process, whose state at time $\mu$ is $\mathcal{G}_\mu$ if $0 \le \mu < A$ and is $\infty$ if $\mu \ge A$, where the state $\infty$ represents any infinite tree. The ascension process is a Markov chain with countable state-space $\mathbf{T} \cup \{\infty\}$, where $\infty$ is an absorbing state. The distribution of the ascension process is determined by the initial state $\mathcal{G}_0 = \bullet$, the transition rates (76), and the ascension rate

$$q_\mu(r \to \infty) = (\#r)\, P(\#\mathcal{G}_\mu = \infty) =: (\#r)\, \bar{F}(\mu). \tag{77}$$

Note that the total rate of transitions out of each state $r \in \mathbf{T}$ is $\#r$, which is the sum of the combined rate $(\#r)\, F(\mu)$ of transitions to all other finite trees, and the rate $(\#r)\, \bar{F}(\mu)$ for transitions to $\infty$, where $F(\mu) + \bar{F}(\mu) = 1$.

We recall now some known features of the function $\bar{F} : [1, \infty) \to [0, 1)$ and the conditional distribution of $\mathcal{G}_\mu$ given that $\#\mathcal{G}_\mu < \infty$. See Alon-Spencer

[8, x6.4 and x6.5], or Aldous [1] for further discussion. For the Poisson

generating function g s = exp1 s the equation 65 shows that for





>1 the non-extinction probability F  is the strictly p ositive solution of

1

  

1 F  = expF . It follows that the inverse function F :0; 1] !

[1; 1is

2

log 1 u u u

1



F u= =1+ + +  78

u 2 3



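For $\mu \ge 1$ define the conjugate $\hat\mu \le 1$ by $\hat\mu = \mu\bar{F}(\mu)$, where $\bar{F}(\mu) = 1 - F(\mu)$ is the extinction probability. Then for $\mu \ge 1$ it follows from (8) that

$$P(\mathcal{G}_\mu = \mathbf{t}) = \bar{F}(\mu)\,P(\mathcal{G}_{\hat\mu} = \mathbf{t}) \quad \forall\, \mathbf{t} \in \mathbf{T} \qquad (79)$$

and hence for $\mu \ge 1$

$$\mathrm{dist}(\mathcal{G}_\mu \mid \#\mathcal{G}_\mu < \infty) = \mathrm{dist}(\mathcal{G}_{\hat\mu}) = \mathrm{PGW}(\hat\mu). \qquad (80)$$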

The formula 77 for the ascension rate combined with these results gives the

following formulae for the joint distribution of A and G .Formula 81 and

A

the variant of 82 for an unlab eled G  app ear as formulae 13 and 15 in



[1].

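Lemma 22 For $\mu \ge 1$ and $\mathbf{t} \in \mathbf{T}$,

$$P(A \le \mu) = F(\mu) \qquad (81)$$

$$P(\mathcal{G}_{A-} = \mathbf{t}, A \in d\mu) = F(\mu)\,\#\mathbf{t}\,P(\mathcal{G}_\mu = \mathbf{t})\,d\mu \qquad (82)$$

$$P(\mathcal{G}_{A-} = \mathbf{t} \mid A = \mu) = (1 - \hat\mu)\,\#\mathbf{t}\,P(\mathcal{G}_{\hat\mu} = \mathbf{t}) \qquad (83)$$

for $F(\mu)$ and the conjugate $\hat\mu$ as defined above. Let $U$ have uniform(0,1) distribution. Then

$$\left(A, \frac{\hat{A}}{A}\right) := \left(A, \bar{F}(A)\right) \overset{d}{=} \left(\frac{-\log U}{1 - U},\, U\right). \qquad (84)$$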



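Proof. First, $P(A \le \mu) = P(\#\mathcal{G}_\mu = \infty) = F(\mu)$, giving (81). Next, given $\mathcal{G}_\mu = \mathbf{t}$, each of the $\#\mathbf{t}$ vertices has chance $F(\mu)\,d\mu$ of acquiring an infinite branch during time $d\mu$, giving (82). Next, for fixed $\mu \ge 1$ the conditional probabilities $P(\mathcal{G}_{A-} = \mathbf{t} \mid A = \mu)$ are by (82) proportional to $\#\mathbf{t}\,P(\mathcal{G}_\mu = \mathbf{t})$, and thus by (80) proportional to $\#\mathbf{t}\,P(\mathcal{G}_{\hat\mu} = \mathbf{t})$. From (72), the mean of the Borel($\hat\mu$) distribution is $1/(1 - \hat\mu)$, and (83) follows. Finally, the fact that $A$ has distribution function $F$ means that $A \overset{d}{=} F^{-1}(U) \overset{d}{=} F^{-1}(1 - U) = -\log U/(1 - U)$. So

$$(A, F(A)) \overset{d}{=} \left(\frac{-\log U}{1 - U},\, 1 - U\right)$$

and hence

$$(A, \bar{F}(A)) = (A, 1 - F(A)) \overset{d}{=} \left(\frac{-\log U}{1 - U},\, U\right)$$

which is (84). $\Box$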
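As a numerical companion to (78), (80) and (84), the following Python sketch solves $1 - F = \exp(-\mu F)$ by bisection, evaluates the inverse function (78) and the conjugate $\hat\mu = \mu\bar{F}(\mu)$, and maps a uniform variable to the pair $(A, \hat{A})$ as in (84). The function names and the choice of bisection are illustrative assumptions, not part of the paper.

```python
import math
import random


def nonextinction_prob(mu, iters=200):
    """Positive root F of 1 - F = exp(-mu * F); taken to be 0 when mu <= 1."""
    if mu <= 1.0:
        return 0.0
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        # g(x) = 1 - x - exp(-mu x) is positive on (0, F) and negative on (F, 1]
        if 1.0 - mid - math.exp(-mu * mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def inverse_F(u):
    """Formula (78): F^{-1}(u) = -log(1 - u) / u, with value 1 at u = 0."""
    return 1.0 if u == 0.0 else -math.log1p(-u) / u


def conjugate(mu):
    """Conjugate mu_hat = mu * (1 - F(mu)) for mu >= 1."""
    return mu * (1.0 - nonextinction_prob(mu))


def ascension_from_uniform(u):
    """Map u in (0, 1) to (A, A_hat) as in (84): A = -log(u)/(1-u), A_hat = u*A."""
    a = -math.log(u) / (1.0 - u)
    return a, u * a


def sample_ascension(rng=random):
    """Draw (A, A_hat) from one uniform variable, rejecting the endpoint u = 0."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return ascension_from_uniform(u)
```

For instance, `conjugate(2.0)` returns a value $\hat\mu < 1$ with $\hat\mu e^{-\hat\mu} = 2e^{-2}$, which is the usual characterization of the conjugate pair.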

4.3 The PGW$^\infty$(1) distribution

For $0 < \mu \le 1$ write PGW$^\infty$($\mu$) for the distribution of the infinite tree obtained by conditioning PGW($\mu$) to be infinite (Proposition 2). In other words, PGW$^\infty$($\mu$) is the distribution of the tree constructed by attaching a sequence of i.i.d. PGW($\mu$) family trees to a single infinite spine, as described in Corollary 3. The particular case PGW$^\infty$(1) plays a fundamental role in the sequel. The following proposition summarizes some of the many ways this process arises as a weak limit. Here $\mathcal{G}_\mu$ is a PGW($\mu$) family tree, and fam($U_n$) is the family tree derived from $U_n$ with uniform distribution on the set $\mathbf{R}_{[n]}$ of $n^{n-1}$ rooted trees labeled by $[n]$.

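Lemma 23 (i) (Grimmett [19])

$$\mathrm{fam}(U_n) \overset{d}{\to} \mathrm{PGW}^\infty(1) \quad \text{as } n \to \infty.$$

(ii) For each $\mu \in (0, \infty)$,

$$\mathrm{dist}(\mathcal{G}_\mu \mid \#\mathcal{G}_\mu = n) \overset{d}{\to} \mathrm{PGW}^\infty(1) \quad \text{as } n \uparrow \infty.$$

(iii) For each $\mu \in (0, 1]$,

$$\mathrm{dist}(\mathcal{G}_\mu \mid \#\mathcal{G}_\mu \ge n) \overset{d}{\to} \mathrm{PGW}^\infty(1) \quad \text{as } n \uparrow \infty.$$

(iv) (Kesten [25]) For each $\mu \in (0, 1]$,

$$\mathrm{dist}(\mathcal{G}_\mu \mid Z_h\mathcal{G}_\mu > 0) \overset{d}{\to} \mathrm{PGW}^\infty(\mu) \quad \text{as } h \uparrow \infty.$$

(v)

$$\mathrm{dist}(\mathcal{G}_\mu \mid \#\mathcal{G}_\mu = \infty) \overset{d}{\to} \mathrm{PGW}^\infty(1) \quad \text{as } \mu \downarrow 1.$$

Proof. Grimmett [19] presented the variant of (i) for unlabeled trees, but his argument yields the sharper result for family trees. Part (ii) is the particular case of Proposition 5 for a PGW($\mu$) tree. Either of (i) and (ii) follows from the other due to the $U_n$-representation (15) of $\mathrm{dist}(\mathcal{G}_\mu \mid \#\mathcal{G}_\mu = n)$. Part (iii) follows easily from (ii). Part (iv) is Proposition 2 for PGW($\mu$). Part (v) can be read from (61) for the Poisson family.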



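4.4 The process $(\mathcal{G}^*_\mu, 0 \le \mu \le 1)$

Let $(\mathcal{G}^*_\mu, 0 \le \mu \le 1)$ be a uniform pruning of $\mathcal{G}^*_1$ with PGW$^\infty$(1) distribution. As explained in the more general setting of Section 3.3, this process should be understood intuitively as $(\mathcal{G}_\mu, 0 \le \mu \le 1)$ conditioned on $\#\mathcal{G}_1 = \infty$, where $(\mathcal{G}_\mu, 0 \le \mu \le 1)$ is a uniform pruning of $\mathcal{G}_1$ with PGW(1) distribution. Note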

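that $\mathcal{G}^*_1 = \mathcal{G}^*_{1-} = \mathcal{G}^\infty_1$ almost surely, and that for all $\mathbf{t} \in \mathbf{T}$, $0 < \lambda < 1$ and $0 < \mu < \infty$

$$\mathrm{dist}\left((\mathcal{G}^*_{t\lambda}, 0 \le t \le 1) \mid \mathcal{G}^*_\lambda = \mathbf{t}\right) = \mathrm{dist}\left((\mathcal{G}_{t\mu}, 0 \le t \le 1) \mid \mathcal{G}_\mu = \mathbf{t}\right). \qquad (85)$$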

According to the following corollary, for each fixed $\mu$ with $0 \le \mu < 1$, the distribution of $\mathcal{G}^*_\mu$ is the PGW($\mu$) distribution size-biased by total population size. We denote this probability distribution on finite family trees by PGW$^*$($\mu$). Sheth [42, 43] studied various features of the PGW$^*$($\mu$) distribution in connection with a model for a coalescent process, and the following corollary of Proposition 15 is closely related to Sheth's results.



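Corollary 24 The process $(\mathcal{G}^*_\mu, 0 \le \mu < 1)$ is an inhomogeneous Markov chain with countable state space $\mathbf{T}$, whose distribution is uniquely determined by (85) and the following formula:

$$P(\mathcal{G}^*_\mu = \mathbf{t}) = (1 - \mu)\,\#\mathbf{t}\,P(\mathcal{G}_\mu = \mathbf{t}) \quad \forall\, \mu \in [0, 1),\ \mathbf{t} \in \mathbf{T}. \qquad (86)$$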





Pro of by sp ecialization. Apply Prop osition 15 for G a PGW1 tree.

1



Then  =1,and n = 1 for all 0

1

u

simpli es to 86 . 38

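Autonomous proof. Formula (85) combined with (86) amounts to the following formula: for each non-negative measurable function $f$ defined on the path space over $[0, \mu]$ defined above (60),

$$E[f(\mathcal{G}^*_t, 0 \le t \le \mu)] = (1 - \mu)\,E[\#\mathcal{G}_\mu\, f(\mathcal{G}_t, 0 \le t \le \mu)].$$

That this is a consistent prescription of $\mathrm{dist}(\mathcal{G}^*_t, 0 \le t \le \mu)$ as $\mu$ varies amounts to the martingale property of $((1 - \mu)\#\mathcal{G}_\mu, 0 \le \mu < 1)$ obtained in Corollary 21. The existence and uniqueness in distribution of a process $(\mathcal{G}^*_\mu, 0 \le \mu < 1)$ satisfying (86) and (85) are now clear by Kolmogorov's extension theorem. Lemma 7 implies the existence of $\mathcal{G}^*_1 := \mathcal{G}^*_{1-} \in \mathbf{T}^\infty$ as an almost sure limit, and implies that $(\mathcal{G}^*_\mu, 0 \le \mu \le 1)$ is a uniform pruning of $\mathcal{G}^*_1$. To finish the argument it just has to be shown that $\mathcal{G}^*_1 \overset{d}{=} \mathcal{G}^\infty_1$. That is, for each $h \ge 0$ and $\mathbf{t} \in \mathbf{T}$

$$P(r_h\mathcal{G}^*_\mu = \mathbf{t}) \to P(r_h\mathcal{G}^\infty_1 = \mathbf{t}) \quad \text{as } \mu \uparrow 1. \qquad (87)$$

But from (85), for $\mu \in (0, 1)$ we can compute

$$P(r_h\mathcal{G}^*_\mu = \mathbf{t}) = \sum_{n=1}^{\infty} P(r_h\mathcal{G}_\mu = \mathbf{t} \mid \#\mathcal{G}_\mu = n)\,P(\#\mathcal{G}^*_\mu = n) \qquad (88)$$

where we know by the $U_n$-representation (15) of $\mathrm{dist}(\mathcal{G}_\mu \mid \#\mathcal{G}_\mu = n)$ and Lemma 23 that

$$P(r_h\mathcal{G}_\mu = \mathbf{t} \mid \#\mathcal{G}_\mu = n) = P(r_h\mathcal{G}_1 = \mathbf{t} \mid \#\mathcal{G}_1 = n) \to P(r_h\mathcal{G}^\infty_1 = \mathbf{t})$$

as $n \to \infty$. But for each fixed $n$

$$P(\#\mathcal{G}^*_\mu = n) = (1 - \mu)\,n\,P(\#\mathcal{G}_\mu = n) \to 0 \quad \text{as } \mu \uparrow 1$$

by inspection of the Borel formula (10), and (87) now follows easily from (88). $\Box$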



The corollary above identifies the process $(\mathcal{G}^*_\mu, 0 \le \mu < 1)$ as the Doob $h$-transform of $(\mathcal{G}_\mu, 0 \le \mu < 1)$ associated with the space-time harmonic function $h(\mu, \mathbf{t}) := (1 - \mu)\#\mathbf{t}$ for the inhomogeneous Markov chain $(\mathcal{G}_\mu, 0 \le \mu < 1)$ with state-space $\mathbf{T}$. The autonomous proof yields also the following corollary, where the explicit formula (89) is obtained from (75). The limit relation (90) is evident from this argument without calculation, but it can also be checked from (89) and (10) using Stirling's formula.

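Corollary 25 Every non-negative function $h(\mu, n)$ such that $h(0, 1) = 1$ and the process $(h(\mu, \#\mathcal{G}_\mu), 0 \le \mu < 1)$ is a martingale relative to the filtration generated by $(\mathcal{G}_\mu, 0 \le \mu < 1)$ admits a unique representation as

$$h(\mu, n) = \sum_{m \in \{1, 2, \ldots, \infty\}} P(m)\,h_m(\mu, n)$$

for some probability distribution $P(\cdot)$ on $\{1, 2, \ldots, \infty\}$, where for $m = 1, 2, \ldots$

$$h_m(\mu, n) = \frac{P(\#\mathcal{G}_1 = m \mid \#\mathcal{G}_\mu = n)}{P(\#\mathcal{G}_1 = m)} = (1 - \mu)\,n\,e^{\mu n}\,\frac{m!\,(m - \mu n)^{m-n-1}}{(m - n)!\,m^{m-1}} \qquad (89)$$

and for $m = \infty$

$$h_\infty(\mu, n) = \lim_{m \to \infty} h_m(\mu, n) = (1 - \mu)\,n. \qquad (90)$$

In particular, the only $h$ such that $h(\mu, \#\mathcal{G}_\mu) \to 0$ almost surely as $\mu \uparrow 1$ is $h(\mu, n) = h_\infty(\mu, n)$, so that $h(\mu, \#\mathcal{G}_\mu) = h_\infty(\mu, \#\mathcal{G}_\mu)$ as above.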

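Essentially, this is an identification of the Martin boundary of the space-time process $((\mu, \#\mathcal{G}_\mu), 0 \le \mu < 1)$ with the set $\{1, 2, \ldots, \infty\}$. Similarly, the Martin boundary of the space-time process $((\mu, \mathcal{G}_\mu), 0 \le \mu < 1)$ can be identified with the subset $\mathbf{T}^\infty_0$ of $\mathbf{T}^\infty$ comprising those trees $\mathbf{t}$ such that if $(\mathcal{T}_\mu, 0 \le \mu \le 1)$ is a uniform pruning of $\mathbf{t}$ then $\#\mathcal{T}_\mu < \infty$ almost surely for all $0 \le \mu < 1$. The extreme harmonic function $h$ corresponding to such a $\mathbf{t}$ is

$$h(\mu, \mathbf{t}) = P(\mathcal{T}_\mu = \mathbf{t})/P(\mathcal{G}_\mu = \mathbf{t}).$$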

 

4.5 A representation of the ascension pro cess

Consider again the ascension time $A := \inf\{\mu : \#\mathcal{G}_\mu = \infty\}$ for the PGW pruning process, studied in Section 4.2. Combining Corollary 24 with the

formulae of Lemma 22 leads to the following rather surprising representation

of the ascension pro cess G ; 0  



Prop osition 26

 !

log U

d



G ; 0  



U

1 U



; 0    1. where U is uniform 0; 1, independent of G

 40



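Proof. Take $(\mathcal{G}^*_\mu, 0 \le \mu \le 1)$ independent of $A$. Then

$$P(\mathcal{G}^*_{\hat{A}} = \mathbf{t} \mid A = a) = P(\mathcal{G}^*_{\hat{a}} = \mathbf{t}) = (1 - \hat{a})\,\#\mathbf{t}\,P(\mathcal{G}_{\hat{a}} = \mathbf{t})$$

by (86). Comparing with (83), we see $(A, \mathcal{G}_{A-}) \overset{d}{=} (A, \mathcal{G}^*_{\hat{A}})$. By conditioning on these terminal values and reversing time, (85) implies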

d



G ; 0  



^

A=A

Use 84 to rewrite the rightsideintermsofU and 91 follows. 2

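Since $P(A > 1) = 1$, the identity in distribution (91) implies in particular that

$$\mathcal{G}_\mu \overset{d}{=} \mathcal{G}^*_{\mu U} \quad \forall\, 0 \le \mu \le 1 \qquad (92)$$

where $U$ has uniform distribution on $(0, 1)$, so that $\mu U$ is uniform on $(0, \mu)$, independent of $(\mathcal{G}^*_\mu, 0 \le \mu \le 1)$. We spell out the meaning of (92) in the following corollary: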

Corollary 27 Fix $0 < \mu \le 1$. Let $\mathcal{G}^\infty_1$ have PGW$^\infty$(1) distribution, and independently of $\mathcal{G}^\infty_1$ let $U_\mu$ have uniform distribution on $(1 - \mu, 1)$. Given $\mathcal{G}^\infty_1$ and $U_\mu$, construct a random forest $\mathcal{F}_\mu$ by cutting each edge of $\mathcal{G}^\infty_1$ independently with probability $U_\mu$. Then the family tree derived from the component of $\mathcal{F}_\mu$ that contains the root of $\mathcal{G}^\infty_1$ has distribution PGW($\mu$).

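The identity (92) can also be checked as follows. By Corollary 24 and the $U_n$-representation (15) of $\mathrm{dist}(\mathcal{G}_\mu \mid \#\mathcal{G}_\mu = n)$, the random trees $\mathcal{G}_\mu$ and $\mathcal{G}^*_{\mu U}$, with $U$ uniform on $(0, 1)$ as in (92), share a common conditional distribution given their total size. So (92) amounts to $\#\mathcal{G}_\mu \overset{d}{=} \#\mathcal{G}^*_{\mu U}$ for all $0 \le \mu \le 1$, that is

$$P_\mu(n) = \frac{1}{\mu}\int_0^\mu P^*_\lambda(n)\,d\lambda \quad \forall\, 0 < \mu \le 1,\ n = 1, 2, \ldots \qquad (93)$$

where $P_\mu(\cdot)$ is the Borel($\mu$) distribution of $\#\mathcal{G}_\mu$ displayed in (10), and $P^*_\mu(\cdot)$ is the distribution of $\#\mathcal{G}^*_\mu$, which by Corollary 24 is the size-biased Borel($\mu$) distribution

$$P^*_\mu(n) := P(\#\mathcal{G}^*_\mu = n) = (1 - \mu)\,n\,P_\mu(n). \qquad (94)$$

But (93) in turn amounts to

$$\frac{d}{d\mu}\left[\mu P_\mu(n)\right] = P^*_\mu(n) \quad \forall\, 0 < \mu < 1,\ n = 1, 2, \ldots \qquad (95)$$

which is easily checked by calculus using the formula (10) for $P_\mu(n)$. We restate (93) in the following corollary: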

Corollary 28 For each $0 < \mu \le 1$ the Borel($\mu$) distribution is the uniform mixture of size-biased Borel($\lambda$) distributions over $0 < \lambda < \mu$.
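The mixture identity of Corollary 28 can be checked numerically. The sketch below assumes the standard Borel probability function $P_\mu(n) = e^{-\mu n}(\mu n)^{n-1}/n!$ for formula (10), which is not displayed in this section, and approximates the integral in (93) by composite Simpson quadrature; the function names and the number of quadrature steps are illustrative choices.

```python
import math


def borel_pmf(mu, n):
    """Borel(mu) probability of n: exp(-mu n) (mu n)^(n-1) / n!  (assumed form of (10))."""
    if mu == 0.0:
        return 1.0 if n == 1 else 0.0
    return math.exp(-mu * n + (n - 1) * math.log(mu * n) - math.lgamma(n + 1))


def size_biased_borel_pmf(mu, n):
    """Size-biased Borel(mu) probability of n, as in (94): (1 - mu) n P_mu(n)."""
    return (1.0 - mu) * n * borel_pmf(mu, n)


def uniform_mixture(mu, n, steps=2000):
    """(1/mu) * integral over (0, mu) of P*_lambda(n), by composite Simpson (steps even)."""
    h = mu / steps
    total = size_biased_borel_pmf(0.0, n) + size_biased_borel_pmf(mu, n)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight * size_biased_borel_pmf(i * h, n)
    return total * h / 3.0 / mu
```

Comparing `uniform_mixture(mu, n)` with `borel_pmf(mu, n)` for a few values of `mu` and `n` gives agreement up to quadrature error, as (93) predicts.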

Let $N_\mu$ denote a random variable with Borel($\mu$) distribution $P_\mu$, and $N^*_\mu$ a random variable with size-biased Borel($\mu$) distribution $P^*_\mu$. From (94) and (95), for $0 < \mu < 1$

$$E[f(N^*_\mu)] = (1 - \mu)\,E[N_\mu f(N_\mu)] = E[f(N_\mu)] + \mu\frac{d}{d\mu}E[f(N_\mu)] \qquad (96)$$

where the first equality holds for every non-negative function $f$, and the second equality holds at $\mu$ for any $f$ such that the derivative exists, as can be seen by differentiation of the next formula (97) with $n f(n)$ instead of $f(n)$. Apply (93) and the first equality of (96) with $f(n)/n$ instead of $f(n)$ to deduce that for all non-negative functions $f$ and $0 < \mu \le 1$

$$E\left(\frac{f(N_\mu)}{N_\mu}\right) = \frac{1}{\mu}\int_0^\mu (1 - \lambda)\,E[f(N_\lambda)]\,d\lambda. \qquad (97)$$





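These formulae imply numerous identities involving moments of $N_\mu$ and $N^*_\mu$. For example, (96) for $f(n) = n$ gives easily

$$E\,N^*_\mu = (1 - \mu)\,E\,N_\mu^2 = (1 - \mu)^{-2}. \qquad (98)$$

Take $f = 1$ in (97) to obtain

$$E(1/N_\mu) = \frac{1}{\mu}\int_0^\mu (1 - \lambda)\,d\lambda = 1 - \mu/2. \qquad (99)$$

For $d_k(n) := (n)_k/n^k$, where $(n)_k := n!/(n - k)!$, the formula $d_{k+1}(n) = d_k(n) - k\,d_k(n)/n$ combined with (97) and an easy induction shows that

$$E\,d_k(N_\mu) = \mu^{k-1}/k, \quad \forall\, k \ge 1,\ \mu \in [0, 1]. \qquad (100)$$

A similar calculation yields

$$E\,d_k(N^*_\mu) = \mu^{k-1}, \quad \forall\, k \ge 1,\ \mu \in [0, 1). \qquad (101)$$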
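The moment identities (100) and (101) lend themselves to a direct numerical check by summing against the Borel probabilities. The sketch below again assumes the standard Borel probability function $e^{-\mu n}(\mu n)^{n-1}/n!$ for (10) and truncates the series at a hypothetical cutoff `n_max`, which is adequate when $\mu$ is not too close to 1.

```python
import math


def borel_pmf(mu, n):
    """Borel(mu) probability of n, mu > 0 (assumed form of formula (10))."""
    return math.exp(-mu * n + (n - 1) * math.log(mu * n) - math.lgamma(n + 1))


def d_k(n, k):
    """Probability of no repeats in k independent uniform picks from n elements."""
    prob = 1.0
    for i in range(k):
        prob *= (n - i) / n  # vanishes automatically once k > n
    return prob


def expected_d_k(mu, k, size_biased=False, n_max=5000):
    """E d_k(N_mu), or E d_k(N*_mu) when size_biased is True, by truncated summation."""
    total = 0.0
    for n in range(1, n_max + 1):
        weight = borel_pmf(mu, n)
        if size_biased:
            weight *= (1.0 - mu) * n  # size-biasing as in (94)
        total += weight * d_k(n, k)
    return total
```

With `mu = 0.5` and `k = 3`, for example, the two sums should be close to $\mu^2/3$ and $\mu^2$ respectively.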



Since $d_k(n)$ is the probability of no repeats in a sequence of $k$ independent uniform random picks from a set of $n$ elements, we deduce the following curious result. See also Section 4.7 for related results.

Corollary 29 Let $\mathcal{G}_\mu$ have PGW($\mu$) distribution for some $0 < \mu \le 1$. Given $\mathcal{G}_\mu$ let $V_1, \ldots, V_k$ be $k$ vertices of $\mathcal{G}_\mu$ picked independently and uniformly at random. Then the unconditional probability that these $k$ vertices of $\mathcal{G}_\mu$ are all distinct is $\mu^{k-1}/k$. For $\mathcal{G}^*_\mu$ instead of $\mathcal{G}_\mu$ the corresponding probability is $\mu^{k-1}$ for all $0 < \mu < 1$.



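4.6 The spinal decomposition of $\mathcal{G}^*_\mu$

Fix $0 \le \mu < 1$. A random tree $\mathcal{G}^*_\mu$ with PGW$^*$($\mu$) distribution has a number of remarkable properties as a consequence of the results in Section 3.3. Following the notation of that section, suppose that $\mathcal{G}^*_\mu$ has been constructed as $\mathcal{G}^*_\mu = \mathrm{fam}(\mathcal{G}^\dagger_\mu)$ where $\mathcal{G}^\dagger_\mu$ is the component containing the root in the subgraph of $\mathcal{G}^\infty_1$ consisting of those edges $e$ with $\xi_e \le \mu$, where the $\xi_e$ are independent uniform(0,1) random variables. Let $(V_h)$ be the infinite spine of $\mathcal{G}^\infty_1$, and let $H_\mu := \sup\{n : V_n \in \mathcal{G}^\dagger_\mu\}$, so $H_\mu$ has the geometric($1 - \mu$) distribution

$$P(H_\mu = n) = (1 - \mu)\,\mu^n \quad \forall\, n = 0, 1, \ldots \qquad (102)$$

Let $V^*_\mu \in \mathcal{G}^*_\mu$ be the vertex of $\mathcal{G}^*_\mu$ at height $H_\mu$ which corresponds to $V_{H_\mu} \in \mathcal{G}^\dagger_\mu$ via the relabeling map from $\mathcal{G}^\dagger_\mu \to \mathcal{G}^*_\mu$. According to formula (57) specialized to the case at hand, there is the following refinement of (86). For $0 \le \mu < 1$ and $\mathcal{G}^*_\mu$ and $V^*_\mu \in \mathcal{G}^*_\mu$ defined as above,

$$P(\mathcal{G}^*_\mu = \mathbf{t}, V^*_\mu = v) = (1 - \mu)\,P(\mathcal{G}_\mu = \mathbf{t}) \quad \forall\, \mathbf{t} \in \mathbf{T},\ v \in \mathbf{t}.$$

That is, the vertex $V^*_\mu$ of $\mathcal{G}^*_\mu$ at height $H_\mu$ is a uniform random vertex of $\mathcal{G}^*_\mu$, meaning that the conditional distribution of $V^*_\mu$ given $\mathcal{G}^*_\mu$ is uniform on $\mathcal{G}^*_\mu$. By further examination of the argument in Section 3.3 we deduce: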



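Corollary 30 (Spinal decomposition of $\mathcal{G}^*_\mu$). Fix $0 \le \mu < 1$. For a random tree $\mathcal{G}^*_\mu$ with PGW$^*$($\mu$) distribution, and $V^*_\mu$ a uniform random vertex of $\mathcal{G}^*_\mu$, let $H_\mu$ be the height of $V^*_\mu$, and let $(\mathrm{root} = V^*_0, \ldots, V^*_{H_\mu} = V^*_\mu)$ be the path in $\mathcal{G}^*_\mu$ from the root to $V^*_\mu$; call it a spine of $\mathcal{G}^*_\mu$. For $0 \le i \le H_\mu$ let $\mathcal{G}_\mu(i)$ be the family tree derived from the subtree of $\mathcal{G}^*_\mu$ with root $V^*_i$ in the forest obtained by deleting all edges of $\mathcal{G}^*_\mu$ on the path from the root to $V^*_\mu$. Then

(i) $H_\mu$ has the geometric($1 - \mu$) distribution (102).

(ii) given $H_\mu = h$ the $\mathcal{G}_\mu(i)$ for $0 \le i \le h$ are independent with PGW($\mu$) distribution, and

(iii) given $H_\mu = h \ge 1$ and these family trees $\mathcal{G}_\mu(i)$ for $0 \le i \le h$, the path from the root to $V^*_\mu$ is defined by $V^*_i = (J_1, \ldots, J_i)$ for $1 \le i \le h$ where the $J_i$ are independent and $J_i$ has uniform $[Z_1\mathcal{G}_\mu(i-1) + 1]$ distribution.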

Proof. Lemma 17 combined with the spinal decomposition of $\mathcal{G}^\infty_1$ given in Corollary 3 shows that this result holds for the particular construction of $\mathcal{G}^*_\mu$ and $H_\mu$ used to obtain (57), with $V^*_{H_\mu} = V^*_\mu$. But by change of variables the result must also be true as stated for any triple $(\mathcal{G}^*_\mu, V^*_\mu, H_\mu)$ with the same joint distribution as this particular triple $(\mathcal{G}^*_\mu, V^*_\mu, H_\mu)$. $\Box$



 

We remark that our spinal decomposition is the probabilistic analog of similar combinatorial decompositions of Joyal [23] and Labelle [28], which were partly anticipated by Meir and Moon [32].

Note that for a fixed $0 < \mu < 1$, it only makes sense to speak of a spine of a PGW$^*$($\mu$) distributed tree $\mathcal{G}^*_\mu$ rather than the spine of $\mathcal{G}^*_\mu$, because the construction of the spine involves the extra randomization of picking a uniform random vertex $V^*_\mu$ of $\mathcal{G}^*_\mu$. But according to the above discussion, if $(\mathcal{G}^*_\mu, 0 \le \mu < 1)$ is a uniform pruning process such that $\mathcal{G}^*_\mu$ has PGW$^*$($\mu$) distribution for each $0 \le \mu < 1$, then with probability one this process grows a unique infinite spine as $\mu \uparrow 1$. This is the spine of the tree $\mathcal{G}^\infty_1$ with PGW$^\infty$(1) distribution, and this infinite spine induces a finite spine in $\mathcal{G}^*_\mu$ for each $0 \le \mu < 1$ in such a way that the length $H_\mu + 1$ of this spine is increasing as $\mu$ increases. In this construction each of the non-negative integer-valued processes $(H_\mu, 0 \le \mu < 1)$ and $(\#\mathcal{G}^*_\mu, 0 \le \mu < 1)$ is increasing and inhomogeneous Markov. The transition probabilities of $(H_\mu)$ are quite obvious. Those of $(\#\mathcal{G}^*_\mu)$ can be read from formula (75) and the fact that $(\#\mathcal{G}^*_\mu)$ is the Doob $h_\infty$-transform of $(\#\mathcal{G}_\mu)$ for $h_\infty(\mu, n) = (1 - \mu)\,n$.







Since in the setting of Corollary 30 the entire family tree $\mathcal{G}^*_\mu$ can be reconstructed from the random elements whose joint law is described by (i)-(iii), the spinal decomposition implies the following recursive construction of a PGW$^*$($\mu$) tree:

Corollary 31 Let random elements

$$H_\mu;\quad \mathcal{G}_\mu(i),\ 0 \le i \le H_\mu;\quad J_i,\ 1 \le i \le H_\mu$$

have the joint distribution described in (i)-(iii) of Corollary 30. Recursively define trees $\mathcal{G}^*_\mu(i)$, $0 \le i \le H_\mu$, and vertices $V^*_i$, $0 \le i \le H_\mu$, as follows. Let $\mathcal{G}^*_\mu(0) = \mathcal{G}_\mu(0)$, $V^*_0 = 0$; for $1 \le i \le H_\mu$ let $\mathcal{G}^*_\mu(i)$ be the family tree obtained by attachment of $\mathcal{G}_\mu(i)$ to $\mathcal{G}^*_\mu(i-1)$ as the $J_i$th child of $V^*_{i-1}$, and let $V^*_i = (V^*_{i-1}, J_i)$. Then $\mathcal{G}^*_\mu(H_\mu)$ has PGW$^*$($\mu$) distribution, and the vertex $V^*_{H_\mu}$ at height $H_\mu$ is a random vertex of $\mathcal{G}^*_\mu(H_\mu)$.
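The recursive construction of Corollary 31 translates almost line by line into a sampler. The following Python sketch is one possible rendering under stated assumptions: a family tree is represented as a nested list (a tree is the list of its root's subtrees, in birth order), Poisson variates come from a simple multiplication method, and each $J_i$ is drawn uniformly from one more than the number of children of the current spine vertex. All function names are hypothetical.

```python
import math
import random


def poisson(mu, rng):
    """Poisson(mu) variate by the multiplication method (adequate for small mu)."""
    limit = math.exp(-mu)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def pgw_tree(mu, rng):
    """PGW(mu) family tree as a nested list; terminates almost surely for mu <= 1."""
    return [pgw_tree(mu, rng) for _ in range(poisson(mu, rng))]


def tree_size(tree):
    """Total number of vertices of a nested-list tree."""
    return 1 + sum(tree_size(child) for child in tree)


def size_biased_pgw_tree(mu, rng):
    """Sketch of the Corollary 31 construction; returns (tree, [J_1, ..., J_H])."""
    h = 0
    while rng.random() < mu:  # geometric law (102): P(H = n) = (1 - mu) mu^n
        h += 1
    pieces = [pgw_tree(mu, rng) for _ in range(h + 1)]
    spine = []
    node = pieces[0]  # current spine vertex, initially the root
    for i in range(1, h + 1):
        j = rng.randint(1, len(node) + 1)  # uniform over the possible child ranks
        node.insert(j - 1, pieces[i])      # attach the next tree as the j-th child
        spine.append(j)
        node = pieces[i]
    return pieces[0], spine
```

Following the returned child ranks from the root leads to the marked vertex at height `len(spine)`. For `mu` close to 1 the recursion in `pgw_tree` can become deep, so an iterative version would be preferable there.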

 

4.7 Some distributional identities

Distributional relationships between random trees imply distributional relationships between the integer-valued random variables which record the sizes of trees. In this section we spell out several such relationships.

Borel and size-biased Borel distributions. The spinal decomposition (Corollary 30) expresses $\mathcal{G}^*_\mu$ as the union of $H_\mu + 1$ subtrees which can be relabeled as independent PGW($\mu$) trees. Since we know that $\#\mathcal{G}^*_\mu$ has a size-biased Borel($\mu$) distribution (94), we deduce:

Corollary 32 Let $N_\mu(1), N_\mu(2), \ldots$ be independent with the Borel($\mu$) distribution (10), independent also of $H_\mu$ with the geometric($1 - \mu$) distribution (102). Then

$$N^*_\mu := N_\mu(1) + \cdots + N_\mu(H_\mu + 1) \ \text{ has size-biased Borel}(\mu)\text{ distribution.} \qquad (103)$$

  



An elementary proof of Corollary 32 can be given as follows. By conditioning on $H_\mu = h$ and using the Borel-Tanner formula (11) for the distribution of the sum of $h + 1$ independent Borel($\mu$) variables, for $N^*_\mu$ defined by the sum in (103), and $P^*_\mu(\cdot)$ the size-biased Borel($\mu$) distribution, we obtain for all $0 \le h \le n - 1$

$$P(H_\mu = h, N^*_\mu = n) = \frac{(h + 1)(n - 1)!}{n^{h+1}(n - h - 1)!}\,P^*_\mu(n). \qquad (104)$$

The right side is easily summed over $0 \le h \le n - 1$ to confirm that $P(N^*_\mu = n) = P^*_\mu(n)$.
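The elementary proof can be mirrored numerically. The sketch below assumes the standard forms of the Borel probability function, $e^{-\mu n}(\mu n)^{n-1}/n!$, and of the Borel-Tanner formula for a sum of $k$ independent Borel($\mu$) variables, $(k/n)\,e^{-\mu n}(\mu n)^{n-k}/(n-k)!$; these are taken to be what (10) and (11) state, since neither is displayed in this section. It then checks (104) and the marginal identity of Corollary 32.

```python
import math


def borel_pmf(mu, n):
    """Borel(mu) probability of n, mu > 0 (assumed form of formula (10))."""
    return math.exp(-mu * n + (n - 1) * math.log(mu * n) - math.lgamma(n + 1))


def borel_tanner_pmf(mu, k, n):
    """P(sum of k i.i.d. Borel(mu) variables = n) (assumed form of formula (11))."""
    if n < k:
        return 0.0
    return (k / n) * math.exp(-mu * n + (n - k) * math.log(mu * n) - math.lgamma(n - k + 1))


def joint_pmf(mu, h, n):
    """P(H = h, N* = n) for H geometric(1 - mu) independent of the Borel summands."""
    return (1.0 - mu) * mu ** h * borel_tanner_pmf(mu, h + 1, n)


def spine_weight(h, n):
    """Coefficient (h + 1)(n - 1)! / (n^(h+1) (n - h - 1)!) appearing in (104)."""
    return (h + 1) * math.factorial(n - 1) / (n ** (h + 1) * math.factorial(n - h - 1))
```

Summing `joint_pmf(mu, h, n)` over `h` from 0 to `n - 1` should reproduce `(1 - mu) * n * borel_pmf(mu, n)`, and each term should equal `spine_weight(h, n)` times that quantity.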



The decomposition (103) should be compared to the obvious consequence of (6) that for $Z_\mu$ with Poisson($\mu$) distribution independent of i.i.d. Borel($\mu$) variables $N_\mu(i)$

$$N_\mu := N_\mu(1) + \cdots + N_\mu(Z_\mu) + 1 \ \text{ has Borel}(\mu)\text{ distribution.} \qquad (105)$$

We do not know of any reference to the representation (103) of the size-biased Borel distribution in terms of the more elementary Borel and geometric distributions, or to the companion representation (93) of the Borel distribution as a mixture of size-biased Borel distributions. But both the Borel and size-biased Borel distributions have found applications in various contexts [11, 10, 21, 42, 43, 44, 50] where these representations might prove useful. See [13, 36, 45] for study of the general class of Lagrangian distributions, which includes both the Borel and size-biased Borel distributions as particular cases.

It is a well known consequence of (105) that the Borel($\mu$) distribution $P_\mu$ of $N_\mu$ is infinitely divisible. Representation (103) gives $P^*_\mu$ as a convolution of $P_\mu$ and $\mathrm{dist}(N_\mu(1) + \cdots + N_\mu(H_\mu))$, and the latter inherits the infinite divisibility property of $H_\mu$. We deduce

Corollary 33 The size-biased Borel($\mu$) distribution $P^*_\mu(\cdot)$ is infinitely divisible.

More on the height of a random vertex. For any finite rooted random tree $T$, let $H(T)$ denote the height of a uniform random vertex $V$ of $T$. Let $U_n$ have uniform distribution on the $n^{n-1}$ rooted trees labeled by $[n]$. From (86) and the $U_n$-representation (15) of $\mathrm{dist}(\mathcal{G}_\mu \mid \#\mathcal{G}_\mu = n)$ we have for $0 \le \mu < 1$ that

$$P(H(\mathcal{G}^*_\mu) = h \mid \#\mathcal{G}^*_\mu = n) = P(H(\mathcal{G}_\mu) = h \mid \#\mathcal{G}_\mu = n) = P(H(U_n) = h). \qquad (106)$$

On the other hand, by the spinal decomposition of $\mathcal{G}^*_\mu$ (Corollary 30), there is the identity $(H(\mathcal{G}^*_\mu), \#\mathcal{G}^*_\mu) \overset{d}{=} (H_\mu, N^*_\mu)$ with the joint distribution (104) for $(H_\mu, N^*_\mu)$. It follows that

$$P(H(U_n) = h) = \frac{(h + 1)(n - 1)!}{n^{h+1}(n - h - 1)!} \quad \forall\, 0 \le h \le n - 1. \qquad (107)$$

For $n \ge 2$ let $D_n$ be the number of vertices on the path from 1 to 2 in $U_n$. By symmetry, the conditional distribution of $H(U_n)$, given that the random vertex $V$ of $U_n$ is not the root, is identical to the unconditional distribution of $D_n - 1$. So

$$P(H(U_n) = h) = \frac{1}{n}\,1(h = 0) + \frac{n - 1}{n}\,P(D_n = h + 1)\,1(h \ge 1)$$

and (107) amounts to the result of Meir and Moon [32] that

$$P(D_n = k) = \frac{k\,(n - 2)!}{n^{k-1}(n - k)!} \quad \forall\, 2 \le k \le n. \qquad (108)$$
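For small $n$ the height distribution (107) can be confirmed by exhaustive enumeration. The sketch below lists every rooted tree on the vertex set $\{0, \ldots, n-1\}$ by trying all parent assignments and discarding those containing a cycle, then tabulates the height of every vertex using exact rational arithmetic. The enumeration strategy and the function names are illustrative assumptions, and the approach is only practical for very small $n$.

```python
from fractions import Fraction
from itertools import product
from math import factorial


def depth(v, parent, root, n):
    """Height of v above the root, or None if the parent pointers never reach it."""
    steps = 0
    while v != root:
        v = parent[v]
        steps += 1
        if steps > n:  # more than n steps means we are trapped in a cycle
            return None
    return steps


def rooted_trees(n):
    """Yield (root, parent map, depths of non-root vertices) for each rooted tree."""
    for root in range(n):
        others = [v for v in range(n) if v != root]
        for choice in product(range(n), repeat=n - 1):
            parent = dict(zip(others, choice))
            depths = [depth(v, parent, root, n) for v in others]
            if None not in depths:
                yield root, parent, depths


def height_distribution(n):
    """Exact law of the height of a uniform vertex in a uniform rooted labeled tree."""
    counts = [0] * n
    trees = 0
    for _root, _parent, depths in rooted_trees(n):
        trees += 1
        counts[0] += 1  # the root itself has height 0
        for d in depths:
            counts[d] += 1
    assert trees == n ** (n - 1)  # Cayley's count of rooted labeled trees
    return [Fraction(c, trees * n) for c in counts]


def formula_107(n, h):
    """Right side of (107)."""
    return Fraction((h + 1) * factorial(n - 1), n ** (h + 1) * factorial(n - h - 1))
```

For $n = 3$ the enumeration gives the probabilities $1/3$, $4/9$, $2/9$ for heights 0, 1, 2, in agreement with (107).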

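As a variation, for $\mathcal{G}_\mu$ with PGW($\mu$) distribution, we can apply the identity (92) to compute for $h = 0, 1, 2, \ldots$ and $\mu \in (0, 1]$

$$P(H(\mathcal{G}_\mu) \ge h) = \frac{1}{\mu}\int_0^\mu P(H(\mathcal{G}^*_\lambda) \ge h)\,d\lambda = \frac{1}{\mu}\int_0^\mu \lambda^h\,d\lambda = \frac{\mu^h}{h + 1}. \qquad (109)$$

Compare with the consequence of Corollary 30 (i) that for $\mu \in (0, 1)$

$$P(H(\mathcal{G}^*_\mu) \ge h) = \mu^h. \qquad (110)$$

These formulae can also be checked by conditioning on $\#\mathcal{G}_\mu$ or $\#\mathcal{G}^*_\mu$ to reduce to (107) and then using (100) and (101).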

Some asymptotic distributions. It is known [32] that as $n \to \infty$ the asymptotic distribution of $H(U_n)/\sqrt{n}$ is that of a random variable $R$ with the Rayleigh density $r e^{-r^2/2}$ for each $r > 0$. Also, as $\mu \uparrow 1$ the asymptotic distribution of $(1 - \mu)^2 N^*_\mu$ is that of $Z^2$, where $Z$ now denotes a normal variable with $E\,Z = 0$ and $E\,Z^2 = 1$. Both these assertions are easily checked by asymptotic density calculations, which establish corresponding local limit theorems. Since from (86) and (15) we have that

$$\mathrm{dist}(H(\mathcal{G}^*_\mu) \mid \#\mathcal{G}^*_\mu = n) = \mathrm{dist}(H(\mathcal{G}_\mu) \mid \#\mathcal{G}_\mu = n) = \mathrm{dist}(H(U_n))$$

and $\#\mathcal{G}^*_\mu \overset{d}{=} N^*_\mu$, it follows that for $\mu$ close to 1 the distribution of $(1 - \mu)H(\mathcal{G}^*_\mu)$ must be close to that of $(1 - \mu)\sqrt{N^*_\mu}\,R$ where $R$ is independent of $N^*_\mu$, and hence also close to that of $|Z|R$ where $R$ is independent of $Z$. On the other hand, we know that $H(\mathcal{G}^*_\mu)$ has geometric($1 - \mu$) distribution, so it is elementary that the asymptotic distribution of $(1 - \mu)H(\mathcal{G}^*_\mu)$ is that of a standard exponential variable $\varepsilon$. Thus we deduce the non-trivial identity in distribution

$$|Z|R \overset{d}{=} \varepsilon. \qquad (111)$$

In terms of probability densities, this amounts to the well known formula

$$\int_0^\infty \sqrt{\frac{2}{\pi}}\; e^{-(y^2 + t^2/y^2)/2}\,dy = e^{-t}$$

which can be verified by showing that the functions of $t$ on both sides are equal at 0 and satisfy the same ordinary differential equation. Let $\gamma_t$ denote a random variable with the gamma($t$) distribution defined by the density $\Gamma(t)^{-1}x^{t-1}e^{-x}$ for $x > 0$. Since

$$|Z| \overset{d}{=} \sqrt{2\gamma_{1/2}}\,; \quad R \overset{d}{=} \sqrt{2\gamma_1}\,; \quad \varepsilon \overset{d}{=} \gamma_1$$

the identity (111) is the special case $t = 1/2$ of the identity in distribution

$$4\,\gamma_t\,\gamma_{t+1/2} \overset{d}{=} \gamma_{2t}^2 \quad \forall\, t > 0 \qquad (112)$$

where $\gamma_t$ and $\gamma_{t+1/2}$ are independent, which is due to Wilks [51]. Evaluation of moments shows that both (111) and (112) are equivalent to the duplication formula for the gamma function

$$\Gamma(2z) = 2^{2z-1}\,\Gamma(z)\,\Gamma(z + 1/2)/\Gamma(1/2).$$

See Gordon [18] for further probabilistic interpretations of gamma function identities.
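The moment computation behind (111) can be illustrated in a few lines. The sketch below uses the closed-form moments $E|Z|^s = 2^{s/2}\Gamma((s+1)/2)/\sqrt{\pi}$, $E\,R^s = 2^{s/2}\Gamma(1 + s/2)$ and $E\,\varepsilon^s = \Gamma(1 + s)$, which follow from the gamma representations above, and compares the product of the first two with the third; the function names are illustrative.

```python
import math


def abs_normal_moment(s):
    """E|Z|^s for a standard normal Z, from |Z| = sqrt(2 * gamma_{1/2})."""
    return 2.0 ** (s / 2.0) * math.gamma((s + 1.0) / 2.0) / math.sqrt(math.pi)


def rayleigh_moment(s):
    """E R^s for the Rayleigh density r * exp(-r^2 / 2), from R = sqrt(2 * gamma_1)."""
    return 2.0 ** (s / 2.0) * math.gamma(1.0 + s / 2.0)


def exponential_moment(s):
    """E eps^s for a standard exponential variable."""
    return math.gamma(1.0 + s)


def duplication_rhs(z):
    """Right side of the duplication formula: 2^(2z-1) G(z) G(z + 1/2) / G(1/2)."""
    return 2.0 ** (2.0 * z - 1.0) * math.gamma(z) * math.gamma(z + 0.5) / math.gamma(0.5)
```

Independence of $|Z|$ and $R$ makes the $s$th moment of the product equal to `abs_normal_moment(s) * rayleigh_moment(s)`, and matching this with `exponential_moment(s)` is the duplication formula evaluated at $z = (s + 1)/2$.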

4.8 Size-Modified PGW-trees

Several results of the previous sections have natural generalizations to the following class of distributions on the set $\mathbf{T}$ of finite family trees. Call a random tree $\mathcal{G}^*$ a size-modified Poisson-Galton-Watson tree (SMPGW tree), or use the same acronym for its distribution, if $\mathcal{G}^*$ has distribution of the form

$$P(\mathcal{G}^* = \mathbf{t}) = f(\#\mathbf{t})\,P(\mathcal{G}_1 = \mathbf{t}) \quad \forall\, \mathbf{t} \in \mathbf{T} \qquad (113)$$

for some $f$ with $E\,f(\#\mathcal{G}_1) = 1$, where $\mathcal{G}_1$ is a PGW(1) tree. The distribution of such a tree $\mathcal{G}^*$ is determined by its size distribution $Q$, that is the distribution of $\#\mathcal{G}^*$ on the positive integers, which is given by $Q(n) = f(n)P_1(n)$ where $P_1(\cdot)$ is the Borel(1) distribution of $\#\mathcal{G}_1$. Let $(U_n)$ be a sequence of random trees such that $U_n$ has uniform distribution on the set $\mathbf{R}_{[n]}$ of $n^{n-1}$ rooted trees labeled by $[n]$, and let $\mathcal{T}_n := \mathrm{fam}(U_n)$. By the $U_n$-representation (15) of $\mathrm{dist}(\mathcal{G}_\mu \mid \#\mathcal{G}_\mu = n)$, formula (113) is equivalent to

$$\mathrm{dist}(\mathcal{G}^* \mid \#\mathcal{G}^* = n) = \mathrm{dist}(\mathcal{T}_n) \quad \forall\, n : P(\#\mathcal{G}^* = n) > 0. \qquad (114)$$

That is to say

$$P(\mathcal{G}^* = \mathbf{t}) = \sum_{n=1}^{\infty} Q(n)\,P(\mathcal{T}_n = \mathbf{t}) \qquad (115)$$

where $P(\mathcal{T}_n = \mathbf{t})$ is given by formula (15). So the most general SMPGW distribution is obtained as the distribution of a random tree $\mathcal{G}^*$ constructed as follows. Independent of the sequence of random trees $(\mathcal{T}_n)$ let $S$ have distribution $Q$. Then $\mathcal{G}^* := \mathcal{T}_S$ has the distribution displayed in (115). The set of all SMPGW distributions on $\mathbf{T}$ is therefore a simplex whose extreme points are the distributions of $\mathcal{T}_n$ for $n = 1, 2, \ldots$.

Clearly, PGW($\mu$) for $\mu \in [0, 1]$ and PGW$^*$($\mu$) for $\mu \in [0, 1)$ are SMPGW distributions. Typically a result for SMPGW can be established first for the extreme distributions of $\mathcal{T}_n$, either by a combinatorial argument or by conditioning a result already obtained for the PGW or PGW$^*$ family, and then extended to the SMPGW family by mixing over the extreme distributions. Following are several illustrations of this theme.

 

Proposition 34 Suppose $(\mathcal{G}^*_u, 0 \le u \le 1)$ is a uniform pruning of $\mathcal{G}^*_1$ with a SMPGW distribution. Then

(i) The process $(\#\mathcal{G}^*_u, 0 \le u \le 1)$ is Markov with the following co-transition probabilities: for 0

$$P(\#\mathcal{G}^*_{qu} = \ell \mid \#\mathcal{G}^*_u = m) = P(\#U_{m,q} = \ell) \quad \forall\, 1 \le \ell \le m \qquad (116)$$

where $U_{m,q}$ is the component subtree containing root($U_m$) after each edge of $U_m$ is deleted independently with probability $1 - q$.

ii Moon [33] For al l 0

$$P(\#U_{m,q} = \ell) = \binom{m}{\ell}\,\frac{(1 - q)\,\ell^{\ell}\,q^{\ell-1}\,(m - q\ell)^{m-\ell-1}}{m^{m-1}} \qquad (117)$$

(iii) For each $0 \le u \le 1$ the tree $\mathcal{G}^*_u$ is an SMPGW tree whose size distribution is given by

$$P(\#\mathcal{G}^*_u = \ell) = \sum_{m=1}^{\infty} P(\#U_{m,u} = \ell)\,P(\#\mathcal{G}^*_1 = m). \qquad (118)$$



Proof. In the particular case when $\mathcal{G}^*_1 = \mathcal{G}_\mu$ has a PGW($\mu$) distribution for some $0 \le \mu \le 1$, the Markov property of $(\#\mathcal{G}^*_u, 0 \le u \le 1)$ was established in Corollary 21. By conditioning on $\#\mathcal{G}^*_1 = n$ in this special case, it follows that $(\#\mathcal{G}^*_u, 0 \le u \le 1)$ must also be Markov with the same co-transition probabilities in the extreme case when $\mathcal{G}^*_1 = \mathcal{T}_n$, hence also by mixing for any SMPGW tree $\mathcal{G}^*_1$. The formula (116) for the co-transition probabilities follows easily from (114). Moon found formula (117) by a combinatorial argument. By application of Bayes rule, this formula for the co-transition probabilities of $(\#\mathcal{G}_u, 0 \le u \le 1)$ can be obtained from the forwards transition probabilities (75) of the same process, or vice versa. For part (iii), it is enough to consider the case $\mathcal{G}^*_1 = \mathrm{fam}(U_n)$, in which case $(\mathcal{G}^*_u)$ can be constructed as $\mathcal{G}^*_u = \mathrm{fam}(U_{n,u})$ after using independent uniform variables to define $(U_{n,u}, 0 \le u \le 1)$ as an increasing process of subtrees of $U_n$. The problem is to show that

$$\mathrm{dist}(\mathrm{fam}(U_{n,u}) \mid \#\mathrm{fam}(U_{n,u}) = m) = \mathrm{dist}(\mathrm{fam}(U_m)). \qquad (119)$$

By an easy combinatorial argument, for each subset $V$ of $[n]$ with $\#V = m$, given $\mathrm{verts}(U_{n,u}) = V$ the tree $U_{n,u}$ has uniform distribution on the set of all $m^{m-1}$ rooted trees labeled by $V$. It follows that for each $V \subseteq [n]$ with $\#V = m$

$$\mathrm{dist}(\mathrm{fam}(U_{n,u}) \mid \mathrm{verts}(U_{n,u}) = V) = \mathrm{dist}(\mathrm{fam}(U_m))$$

which of course implies (119). $\Box$
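The law of $\#U_{m,q}$ in part (ii) can be cross-checked by brute force for small $m$. The sketch below enumerates all rooted labeled trees on $m$ vertices and all subsets of retained edges, with exact rational arithmetic, and compares the resulting law of the root component size with the closed form $\binom{m}{\ell}(1-q)\,\ell^{\ell}q^{\ell-1}(m - q\ell)^{m-\ell-1}/m^{m-1}$. That closed form is our reading of Moon's formula (117), whose typeset version is damaged in this copy, so it should be treated as an assumption; the function names are likewise hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import comb


def reaches_root(v, parent, root, n):
    """True if following parent pointers from v arrives at the root without cycling."""
    steps = 0
    while v != root:
        v = parent[v]
        steps += 1
        if steps > n:
            return False
    return True


def rooted_trees(n):
    """Yield (root, parent map) for every rooted tree on the vertices 0..n-1."""
    for root in range(n):
        others = [v for v in range(n) if v != root]
        for choice in product(range(n), repeat=n - 1):
            parent = dict(zip(others, choice))
            if all(reaches_root(v, parent, root, n) for v in others):
                yield root, parent


def root_component_law(m, q):
    """Exact law of the root component size when each edge is kept with probability q."""
    law = [Fraction(0)] * (m + 1)
    trees = 0
    for root, parent in rooted_trees(m):
        trees += 1
        others = list(parent)
        for kept in product((True, False), repeat=m - 1):
            keep = dict(zip(others, kept))  # keep[v]: is the edge (v, parent[v]) retained?
            weight = Fraction(1)
            for flag in kept:
                weight *= q if flag else 1 - q
            size = 1
            for v in others:
                w = v
                while w != root and keep[w]:
                    w = parent[w]
                if w == root:
                    size += 1
            law[size] += weight
    return [p / trees for p in law]


def moon_formula(m, l, q):
    """Assumed closed form for P(#U_{m,q} = l); q must be a Fraction in (0, 1)."""
    return comb(m, l) * (1 - q) * l ** l * q ** (l - 1) * (m - l * q) ** (m - l - 1) / Fraction(m ** (m - 1))
```

With `q = Fraction(1, 3)` the enumerated law and the closed form agree exactly for `m` up to 4, and the case $\ell = m$ reduces to $q^{m-1}$, the probability that every edge is retained.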

Consider now the closure $\overline{\mathrm{SMPGW}}$ of SMPGW, that is the set of all probability distributions on $\mathbf{T}^\infty$ obtainable as weak limits of some sequence of SMPGW distributions. By an easy variation of the autonomous proof of Corollary 24, every distribution in $\overline{\mathrm{SMPGW}}$ is a mixture of the PGW$^\infty$(1) distribution of $\mathcal{G}^\infty_1$ and some SMPGW distribution. That is to say, $\mathcal{G}^*$ has a $\overline{\mathrm{SMPGW}}$ distribution iff

$$P(\mathcal{G}^* \in \cdot) = \sum_{n \in \{1, 2, \ldots, \infty\}} P(\#\mathcal{G}^* = n)\,P(\mathcal{T}_n \in \cdot) \qquad (120)$$

where $\mathcal{T}_\infty \overset{d}{=} \mathcal{G}^\infty_1$ has PGW$^\infty$(1) distribution. The following characterization of the PGW$^*$ process now follows from Lemma 7 and Corollary 25:



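Proposition 35 Suppose that $(\mathcal{G}^*_u, 0 \le u < 1)$ is a uniform pruning process such that $\mathcal{G}^*_u$ has a SMPGW distribution for each $0 \le u < 1$. Then $(\mathcal{G}^*_u, 0 \le u \le 1)$ is a uniform pruning of $\mathcal{G}^*_1 := \lim_{u \uparrow 1}\mathcal{G}^*_u$, which has a $\overline{\mathrm{SMPGW}}$ distribution, and the following conditions are equivalent:

(i) $\lim_{u \uparrow 1} P(\#\mathcal{G}^*_u \le n) = 0$ for every $n = 1, 2, \ldots$

(ii) $\mathrm{dist}(\mathcal{G}^*_u) = \mathrm{PGW}^*(u)$ for every $u \in (0, 1)$

(iii) $\mathrm{dist}(\mathcal{G}^*_1) = \mathrm{PGW}^\infty(1)$.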

The spinal decomposition for a SMPGW tree. It is instructive to consider the analog for a SMPGW tree of the spinal decomposition of PGW$^*$($\mu$) stated in Corollary 30. Let $V^*$ be a uniform random vertex of $\mathcal{G}^*$, let $H(\mathcal{G}^*)$ be the height of $V^*$, and construct family trees $\mathcal{G}(i)$ as before by cutting all edges along the path from 0 to $V^*$. Then instead of (i), (ii) and (iii) in Corollary 30 it is clear that by application of that Corollary and (107) we have

(i)

$$P(H(\mathcal{G}^*) = h, \#\mathcal{G}^* = n) = \frac{(h + 1)(n - 1)!}{n^{h+1}(n - h - 1)!}\,P(\#\mathcal{G}^* = n) \quad \forall\, 0 \le h \le n - 1$$

(ii) given $H(\mathcal{G}^*) = h$ and $\#\mathcal{G}^* = n$ the $\mathcal{G}(i)$ for $0 \le i \le h$ are distributed like $h + 1$ independent PGW(1) trees conditionally given that the sum of their sizes is $n$.

(iii) The conditional distribution of the path from 0 to $V^*$ given $H(\mathcal{G}^*) = h \ge 1$ and these family trees $\mathcal{G}(i)$ for $0 \le i \le h$ is just as described in (iii) of Corollary 30 for $\mathcal{G}^* = \mathcal{G}^*_\mu$.

It follows easily that the trees $\mathcal{G}(i)$ are i.i.d. iff $\mathcal{G}^*$ has PGW$^*$($\mu$) distribution for some $\mu \in (0, 1)$. To be more precise about the converse:

Corollary 36 If a SMPGW tree $\mathcal{G}^*$ is such that given $H(\mathcal{G}^*) = 1$ the sizes $\#\mathcal{G}(0)$ and $\#\mathcal{G}(1)$ are independent, then $\mathcal{G}^*$ has PGW$^*$($\mu$) distribution for some $\mu \in (0, 1)$.

In particular, for $\mathcal{G}^*$ with PGW($\mu$) distribution the $\mathcal{G}(i)$ for $0 \le i \le H(\mathcal{G}^*)$ are not conditionally i.i.d. given $H(\mathcal{G}^*)$. But they are exchangeable:

Corollary 37 For a SMPGW tree $\mathcal{G}^*$, the family trees $\mathcal{G}(i)$ for $0 \le i \le H(\mathcal{G}^*)$ are conditionally exchangeable given $H(\mathcal{G}^*)$.

This consequence of the spinal decomposition of $\mathcal{G}^*_\mu$ can also be proved by first checking it combinatorially for $\mathcal{G}^* = \mathcal{T}_n$. Indeed, this case is implicit in the combinatorial results [23, 28].

References

[1] D.J. Aldous. A random tree model associated with random graphs. Random Structures Algorithms, 1:383-402, 1990.

[2] D.J. Aldous. The construction of uniform spanning trees and uniform labelled trees. SIAM J. Discrete Math., 3:450-465, 1990.

[3] D.J. Aldous. Asymptotic fringe distributions for general families of random trees. Ann. Appl. Probab., 1:228-266, 1991.

[4] D.J. Aldous. The continuum random tree I. Ann. Probab., 19:1-28, 1991.

[5] D.J. Aldous. The continuum random tree II: an overview. In M.T. Barlow and N.H. Bingham, editors, Stochastic Analysis, pages 23-70. Cambridge University Press, 1991.

[6] D.J. Aldous. Deterministic and stochastic models for coalescence: a review of the mean-field theory for probabilists. Unpublished. Available via homepage http://www.stat.berkeley.edu/users/aldous, 1996.

[7] D.J. Aldous and J. Pitman. Tree- and forest-valued Markov chains. In preparation, 1997.

[8] N. Alon and J.H. Spencer. The Probabilistic Method. Wiley, New York, 1992.

[9] K.B. Athreya and P. Ney. Branching Processes. Springer, 1972.

[10] S. Berg and J. Jaworski. Probability distributions related to the local structure of a random mapping. In A. Frieze and T. Łuczak, editors, Random Graphs, volume 2, pages 1-21. Wiley, 1992.

[11] S. Berg and L. Mutafchiev. Random mappings with an attracting center: Lagrangian distributions and a regression function. J. Appl. Probab., 27:622-636, 1990.

[12] E. Borel. Sur l'emploi du théorème de Bernoulli pour faciliter le calcul d'un infinité de coefficients. Application au problème de l'attente à un guichet. C.R. Acad. Sci. Paris, 214:452-456, 1942.

[13] P.C. Consul. Generalized Poisson Distributions. Dekker, 1989.

[14] N. Dershowitz and S. Zaks. Enumerations of ordered trees. Discrete Mathematics, 31:9-28, 1980.

[15] M. Dwass. The total progeny in a branching process. J. Appl. Probab., 6:682-686, 1969.

[16] W. Feller. An Introduction to Probability Theory and its Applications, Vol 1, 3rd ed. Wiley, New York, 1968.

[17] P. Fitzsimmons, J. Pitman, and M. Yor. Markovian bridges: construction, Palm interpretation, and splicing. In E. Çinlar, K.L. Chung, and M.J. Sharpe, editors, Seminar on Stochastic Processes, 1992, pages 101-134. Birkhäuser, Boston, 1993.

[18] L. Gordon. A stochastic approach to the gamma function. Amer. Math. Monthly, 101:858-865, 1994.

[19] G.R. Grimmett. Random labelled trees and their branching networks. J. Austral. Math. Soc. Ser. A, 30:229-237, 1980.

[20] H. Haase. On the incipient cluster of the binary tree. Arch. Math. (Basel), 63:465-471, 1994.

[21] F.A. Haight and M.A. Breuer. The Borel-Tanner distribution. Biometrika, 47:143-150, 1960.

[22] T.E. Harris. The Theory of Branching Processes. Springer-Verlag, New York, 1963.

[23] A. Joyal. Une théorie combinatoire des séries formelles. Adv. in Math., 42:1-82, 1981.

[24] D.P. Kennedy. The Galton-Watson process conditioned on the total progeny. J. Appl. Probab., 12:800-806, 1975.

[25] H. Kesten. Subdiffusive behavior of random walk on a random cluster. Ann. Inst. H. Poincaré Probab. Statist., 22:425-487, 1987.

[26] V.F. Kolchin. Branching processes, random trees, and a generalized scheme of arrangements of particles. Mathematical Notes of the Acad. Sci. USSR, 21:386-394, 1977.

[27] V.F. Kolchin. Random Mappings. Optimization Software, New York, 1986. Translation of Russian original.

[28] G. Labelle. Une nouvelle démonstration combinatoire des formules d'inversion de Lagrange. Adv. in Math., 42:217-247, 1981.

[29] R. Lyons. Random walks, capacity, and percolation on trees. Ann. Probab., 20:2043-2088, 1992.

[30] R. Lyons. Probability on trees. Book in preparation, 1996.

[31] R. Lyons, R. Pemantle, and Y. Peres. Conceptual proof of L log L criteria for mean behavior of branching processes. Ann. Probab., 23:1125-1138, 1995.

[32] A. Meir and J.W. Moon. The distance between points in random trees. J. Comb. Theory, 8:99-103, 1970.

[33] J.W. Moon. A problem on random trees. J. Comb. Theory B, 10:201-205, 1970.

[34] J. Neveu. Arbres et processus de Galton-Watson. Ann. Inst. H. Poincaré Probab. Statist., 22:199-207, 1986.

[35] R. Otter. The multiplicative process. Ann. Math. Statist., 20:206-224, 1949.

[36] A.G. Pakes and T.P. Speed. Lagrange distributions and their limit theorems. SIAM Journal on Applied Mathematics, 32:745-754, 1977.

[37] R. Pemantle. Uniform random spanning trees. In J. Laurie Snell, editor, Topics in Contemporary Probability, pages 1-54, Boca Raton, FL, 1995. CRC Press.

[38] J. Pitman. Coalescent random forests. Technical Report 457, Dept. Statistics, U.C. Berkeley, 1996.

[39] J. Pitman. Enumerations of trees and forests related to branching processes and random walks. Technical Report 482, Dept. Statistics, U.C. Berkeley, 1997.

[40] C.R. Rao and H. Rubin. On a characterization of the Poisson distribution. Sankhya, Ser. A, 26:294-298, 1964.

[41] L.C.G. Rogers and D. Williams. Diffusions, Markov Processes and Martingales, volume 1, Foundations. Wiley, 1994. 2nd edition.

[42] R.K. Sheth. Merging and hierarchical clustering from an initially Poisson distribution. Mon. Not. R. Astron. Soc., 276:796-824, 1995.

[43] R.K. Sheth. Galton-Watson branching processes and the growth of gravitational clustering. Mon. Not. R. Astron. Soc., 281:1277-1289, 1996.

[44] R.K. Sheth and J. Pitman. Coagulation and branching process models of gravitational clustering. To appear in Mon. Not. R. Astron. Soc., 1997.

[45] M. Sibuya, N. Miyawaki, and U. Sumita. Aspects of Lagrangian probability distributions. Studies in Applied Probability. Essays in Honour of Lajos Takács (J. Appl. Probab.), 31A:185-197, 1994.

[46] R. Stanley. Enumerative combinatorics, vol. 2. Book in preparation, to be published by Cambridge University Press, 1996.

[47] L. Takács. Queues, random graphs and branching processes. J. Applied Mathematics and Simulation, 1:223-243, 1988.

[48] L. Takács. Limit distributions for queues and random rooted trees. J. Applied Mathematics and Stochastic Analysis, 6:189-216, 1993.

[49] J.C. Tanner. A problem of interference between two queues. Biometrika, 40:58-69, 1953.

[50] J.C. Tanner. A derivation of the Borel distribution. Biometrika, 1961:222-224, 1961.

[51] S.S. Wilks. Certain generalizations in the analysis of . Biometrika, 24:471-494, 1932.