<<

A study of large and non-fringe subtrees in conditional Galton-Watson trees Full Version

Xing Shi Cai, Luc Devroye

School of Computer Science, McGill University of Montreal, Canada,

[email protected]

[email protected]

October 8, 2018

We study the conditions for families of fringe or non-fringe subtrees to gw exist with high probability (whp) in Tn , a Galton-Walton tree of size n. gw We first give a Poisson approximation of fringe subtree counts in Tn , which permits us to determine the height of the maximal complete r-ary fringe subtree. Then we determine the maximal Kn such that every tree of size gw at most Kn appears as fringe subtree in Tn whp. Finally, we show that non-fringe subtree counts are concentrated and determine, as an application, gw the height of the maximal complete r-ary non-fringe subtree in Tn .

1 Introduction

In this paper, we study the conditions for families of fringe or non-fringe subtrees to exist whp (with high probability) in a Galton-Walton tree conditional to be of size n. In particular, we want to find the height of the maximal complete r-ary fringe and non- fringe subtrees. We also want to determine the threshold kn such that all trees of size at most k appear as fringe subtrees. In doing so, we extend Janson’s [22] result on fringe

arXiv:1602.03850v1 [math.PR] 11 Feb 2016 n subtrees counts and prove a new concentration theorem for non-fringe subtree counts. Let T be the set of all rooted, ordered, and unlabeled trees, which we refer to as plane trees. All trees considered in this paper belong to T. (See Janson [21, sec. 2.1] for details.) Figure 1 lists all plane trees of at most 4 nodes. Given a tree T P T and a node v P T , let Tv denote the subtree rooted at v. We call 1 1 Tv a fringe subtree of T . If Tv is isomorphic to some tree T P T, then we write T “ Tv

1 Figure 1: Plane trees of at most 4 nodes. and say that T has a fringe subtree of shape T 1 rooted at v, or simply T contains T 1 as a fringe subtree. 1 On the other hand, if Tv can be made isomorphic to T by replacing some or none of 1 its own fringe subtrees with leaves (nodes without children), then we write T ă Tv and say that T has a non-fringe subtree of shape T 1 rooted at v, or simply T contains T 1 as a 1 1 non-fringe subtree. (Note that T “ Tv implies that T ă Tv.) We also use the notation T 1 ă T to denote that T has a non-fringe subtree of shape T 1 at its root. Figure 2 shows some examples of fringe and non-fringe subtrees.

fringe subtrees of T

T non-fringe subtrees of T

Figure 2: Examples of fringe and non-fringe subtrees.

Let ξ be a non-negative integer-valued random variable. The Galton-Watson tree T gw with offspring distribution ξ is the random tree generated by starting from the root and independently giving each node a random number of children, where the numbers gw gw of children are all distributed as ξ. The conditional Galton-Watson tree Tn is T restricted to the event |T gw| “ n, i.e., T gw has n nodes. The comprehensive survey by Janson [21] describes the history and the basic properties of these trees. In the study of conditional Galton-Watson trees, the following is usually assumed throughout the paper:

gw Condition A. Let Tn be a conditional Galton-Watson tree of size n with offspring dis- tribution ξ, such that Eξ “ 1 and 0 ă σ2 :“ Var pξq ă 8. Let T gw be the corresponding unconditional Galton-Watson tree.

2 We summarize notations:

¨ T — the set of all rooted, ordered and unlabeled trees (plane trees)

¨ T — a tree in T

¨ Tv — a fringe subtree of T rooted at node v P T ¨ ξ — a non-negative integer-valued random variable with Eξ “ 1 and 0 ă σ2 :“ Var pξq ă 8

¨ h — the span of ξ, i.e., gcdti ě 1 : pi ą 0u ¨ T gw — an unconditional Galton-Watson tree with offspring distribution ξ

gw gw gw ¨ Tn — T given that |T | “ n

gw gw gw ¨ Tn,v — a fringe subtree of Tn rooted at node v P Tn

gw gw gw ¨ Tn,˚ — a fringe subtree of Tn rooted at a uniform random node of Tn

¨ Tn — a sequence of trees

¨ Sn — the set of all trees of size n ¨ S — a set of trees

` gw ¨ Sďn — the set tT P T : |T | ď n, P tT “ T u ą 0u

¨ An — a sequence of sets of trees

gw gw ¨ NS pTn q — the number of fringe subtrees of Tn that belong to S ¨ πpSq — P tT gw P Su

nf gw gw ¨ NT pTn q — the number of non-fringe subtrees of Tn of shape T ¨ πnf pT q — P tT ă T gwu, the probability that T gw has a non-fringe subtree T at its root

gw If p1 “ 0, then there exist positive integers n such that P t|T | “ nu “ 0. For such gw gw n, Tn is not well-defined. But it is easy to show that P t|T | “ nu ą 0 for all n ě n0 with n ´ 1 ” 0 pmod hq, where h is span of ξ and n0 depends only on ξ (see [21, cor. gw gw 15.6]). Therefore, in this paper, for all asymptotic results about T and Tn , the limits are always taken along the subsequence with n ´ 1 ” 0 pmod hq. Extending a result by Aldous [1], Janson [21, thm. 7.12] proved the following theorem:

gw gw Theorem 1. Assume Condition A. The conditional distribution LpTn,˚ |Tn q converges in probability to LpT gwq. In other words, for all T P T, as n Ñ 8,

gw N pT q p T n “ P T gw “ T |T gw Ñ P tT gw “ T u . (1) n n,˚ n (

3 Later Janson strengthened the above result, proving the asymptotic normality of gw gw NT pTn q by studying additive functionals on Tn [22]. gw gw A natural generalization of NT pTn q is to consider fringe subtree counts NTn pTn q where Tn P T is a sequence of trees instead of a fixed tree T . Let Popλq denote a Poisson random variable with mean λ. We have:

gw Theorem 2. Assume Condition A. Let πpT q :“ P tT “ T u and let kn Ñ 8, kn “ opnq. Then gw lim sup dTV pNT pTn q, PopnπpT qqq “ 0, (2) nÑ8 T :|T |“kn where dTV p ¨ , ¨ q denotes the total variation distance. Therefore, letting Tn be a sequence of trees with |Tn| “ kn, we have as n Ñ 8:

gw (i) If nπpTnq Ñ 0, then NTn pTn q “ 0 whp.

gw d (ii) If nπpTnq Ñ µ P p0, 8q, then NTn pTn q Ñ Popµq.

(iii) If nπpTnq Ñ 8, then

N pT gwq ´ nπpT q Tn n n Ñd Np0, 1q, nπpTnq

a d where Np0, 1q denotes the standard normal distribution, and Ñ denotes conver- gence in distribution.

In Theorem 2, (i)–(iii) follow directly from (2) by the following lemma, whose simple proof we omit:

Lemma 1. Let Xn be a sequence of random variables. Let µn be a sequence of non- negative real numbers. Assume that dTV pXn, Popµnqq Ñ 0 as n Ñ 8. We have:

(i) If µn Ñ 0, then Xn “ 0 whp.

d (ii) If µn Ñ µ P p0, 8q, then Xn Ñ Popµq.

? d (iii) If µn Ñ 8, then pXn ´ µnq{ µn Ñ Np0, 1q. Theorem 2 can be generalized as follows:

Theorem 3. Assume Condition A. Let Skn be the set of all trees of size kn, where gw gw kn Ñ 8 and kn “ opnq. For S Ď Skn , let πpSq :“ P tT P Su and NS pTn q :“ gw vPT gw Tn,v P S . Then n J K ř gw lim sup dTV pNS pTn q, PopnπpSqqq “ 0. nÑ8 SĎSkn

Therefore, letting An be a sequence of sets of trees with An Ď Skn , we have:

4 gw (i) If nπpAnq Ñ 0, then NAn pTn q “ 0 whp.

gw d (ii) If nπpAnq Ñ µ P p0, 8q, then NAn pTn q Ñ Popµq.

(iii) If nπpAnq Ñ 8, then

N pT gwq ´ nπpA q An n n Ñd Np0, 1q. nπpAnq a The proof of Theorem 2 is given in Section 3. It uses many ingredients from previous results on fringe subtrees, especially from Janson [22]. (In particular, Lemma 6.2 of gw [22] makes the computation of the variance of NSn pTn q quite easy, which is crucial for the proof.) The key step of the proof is based on subtree switching. It permits the coupling of two conditional Galton-Watson trees. Then we can apply the exchangeable pair method by Chatterjee et al. [5], which is an extension of Stein’s method [29], to study the convergence of total variation distance. This approach can easily be modified to prove Theorem 3, whose steps we sketch at the end of Section 3. Binary search trees and recursive trees are also well-studied random tree models (see Drmota [11]). Many authors have found results similar to Theorem 2 for these two types of trees, see, e.g., [14, 18, 8, 9, 17]. For recent developments, see Holmgren and Janson [20]. We say that a tree T is possible if P tT gw “ T u ą 0. As an application of Theorems 2 gw and 3, we ask the following question — when does Tn contain all possible trees within a family of trees (possibly depending on n). As shown in subsection 4.1, this is essentially a variation of the coupon collector problem. In subsection 4.2 we answer the above question for the set of complete r-ary trees. Let gw Hn,r be the maximal integer such that Tn contains all complete r-ary trees of height at most Hn,r as fringe subtrees. Lemma 13 shows that Hn,r ´ logr log n converges in probability to an explicit constant. ` ` Let Sďk be the set of all possible trees of size at most k. Let Kn “ maxtk : Sďk Ď gw gw ` gw YvPTn Tn,v u, i.e., Kn is the maximal k such that every tree in Sďk appears in Tn as fringe subtrees. In subsection 4.3, we show that, roughly speaking, if the tail of the offspring distribution does not drop off too quickly, Kn{ log n converges in probability p to a positive constant. Otherwise, we have Kn{ log n Ñ 0. For example, for a random p Cayley tree, we have Kn log logpnq{ logpnq Ñ 1. For many well-known Galton-Watson trees, we also give the second order asymptotic term of Kn. Non-fringe subtrees are more complicated to analyze. However since on average fringe gw subtrees in Tn behave like unconditional Galton-Watson trees when n is large, the number of non-fringe subtrees of shape T should be more or less nP tT ă T gwu. The following theorem is a precise version of this intuition.

nf gw nf gw Theorem 4. Assume Condition A. Let π pT q :“ P tT ă T u. Let NT pTn q :“ gw gw T T . Let T be a sequence of trees with T k where k and vPTn ă n,v n | n| “ n n Ñ 8 kn “ opJnq. We haveK ř

5 nf nf gw p (i) If nπ pTnq Ñ 0, then NTn pTn q Ñ 0.

nf nf gw nf p (ii) If nπ pTnq Ñ 8, then NTn pTn q{pnπ pTnqq Ñ 1. Chyzak et al. [7] studied non-fringe subtrees for various random trees, including simply generated trees. They proved that if for all n we have Tn “ T where T is fixed, then nf gw NTn pTn q has a central limit theorem. However, Theorem 4 cannot be simply derived from their result as our Tn depends upon n.

nf Remark. It is tempting to try to prove that if nπ pTnq Ñ µ P p0, 8q, then we have nf gw d NTn pTn q Ñ Popµq, which is true for fringe subtrees. Unfortunately, this is not true in general. See Lemma 26 in Section 5.2.

In Section 5, we give a proof of Theorem 4 and apply it to study the maximal complete gw r-ary non-fringe subtree in Tn . The paper ends with some open questions in Section 6.

2 Notations and Preliminaries

In this section we list some well-known results regarding conditional Galton-Watson trees and Poisson approximations.

2.1 Conditional Galton-Watson trees

Given a tree T of size k, let v1, . . . , vk be the nodes of T in dfs (depth-first-search) order. Let di be the degree (the number of children) of vi. We call pd1, d2, . . . , dkq the preorder degree sequence of T . Let N :“ t1, 2,...u and let N0 :“ t0u Y N. It is well-known that (see Janson [21, lem. 15.2]):

k Lemma 2. A sequence pd1, d2, . . . , dkq P N0 is the preorder degree sequence of some tree if and only if it satisfies

j d ě j, p1 ď j ď k ´ 1q i“1 i (3) k d “ k ´ 1. #ři“1 i Figure 3 gives a demonstrationř of Lemma 2. k Let Dk Ď N0 be the set of all preorder degree sequences of length k. Observe:

1 Corollary 1. If pd1, d2, . . . , dkq P Dk, then it is impossible that there exists 1 ď k ă k such that pd1, d2, . . . , dk1 q P Dk1 .

n n n n gw n n n n Let ξ :“ pξ1 , ξ2 , . . . , ξn q be the preorder degree sequence of Tn . Let ξ :“ pξ1 , ξ2 ,..., ξn q n be a uniform random cyclic rotation of ξ . Let ξ1, ξ2,... be i.i.d. copies of ξ. Let n n Sn :“ i“1 ξi. The next lemma is a well-known connection between ξ andr ξ1, ξr2, .r . . , ξn r (see, e.g., Otter [25], Kolchin [23], Dwass [12] and Pitman [26]). For a complete proof, see Jansonř [21, cor. 15.4]. r

6 3 j i“1 di ´ j for j “ 1,..., 7 ř 2

1

0 The degree sequence pd1, . . . , d7q “ p2, 1, 0, 3, 0, 0, 0q 1 2 3 4 5 6 7 -1

Figure 3: Example of preorder tree degree sequence.

Lemma 3. Assume that P t|T gw| “ nu ą 0. We have

n n n L pξ1 , ξ2 ,..., ξn q “ pξ1, ξ2, . . . , ξn | Sn “ n ´ 1q ,

L where “ denotes “identicallyr r distributed”r and the right-hand-side denotes pξ1, ξ2, . . . , ξnq restricted to the event that Sn “ n ´ 1.

Let pi :“ P tξ “ iu. Let h be the span of ξ, i.e., h :“ gcdti ě 1 : pi ą 0u. We recall the following result (see Janson [22, (4.3)] or Kolchin [23]):

Lemma 4. Assume Condition A. We have h P t|T gw| “ nu „ ? n´3{2, 2πσ2 as n Ñ 8 with n ´ 1 ” 0 pmod hq. The following lemma is a special case of [22, lem. 5.1]. We nonetheless give a short proof for later reference in the paper.

gw Lemma 5. Assume P t|T | “ nu ą 0. Let T P Sk with 1 ď k ď n.

gw gw (i) Let NT pTn q :“ vPT gw Tn,v “ T . Then n J K ř E rN pT gwqs P tS “ n ´ ku T n “ πpT q n´k . n P tSn “ n ´ 1u

gw gw gw (ii) Let NSk pTn q “ vPT |Tn,v | “ k . Then n J K ř gw E rNSk pTn qs P tSn´k “ n ´ ku “ πpSkq . n P tSn “ n ´ 1u

7 n n n Proof. Let pd1, d2, . . . , dkq be the preorder degree sequence of T . Recall that pξ1 , ξ2 , . . . , ξn q gw is the preorder degree sequence of Tn . Let

n n n Ii “ ξi “ d1, ξi`1 “ d2, . . . , ξi`k´1 “ dk , J K where the indices are taken modulo n. Note that if n ´ k ` 1 ă i ď n, then it is impossible that Ii “ 1, because the length of the preorder degree sequence of the fringe subtree T gw must be strictly less than n,vi 1 k. Therefore, if Ii ą 1 and n ´ k ` 1 ă i ď n, then there exists a k ă k such that 1 pd1, d2, . . . , dkq is also a preorder degree sequence, which is impossible by Corollary 1. gw n Therefore for all 1 ď i ď n, Ii “ Tvi “ T and NT pTn q “ i“1 Ii. Recalling that pξn, ξn,..., ξnq is a uniform randomJ rotationK of pξn, ξn, . . . , ξnq and using Lemma 3, we 1 2 n 1 2 n ř have

r r r n n gw n n n E rNT pTn qs “ E Ii “ P ξi “ d1, ξi`1 “ d2, . . . , ξi`k´1 “ dk «i“1 ff i“1 n ÿ ÿ (

“ P tξi “ d1, ξi`1 “ d2, . . . , ξi`k´1 “ dk | Sn “ n ´ 1u i“1 ÿ “ nP tξ1 “ d1, ξ2 “ d2, . . . , ξk “ dk | Sn “ n ´ 1u P trξ “ d , ξ “ d , . . . , ξ “ d s X rS “ n ´ 1su “ n 1 1 2 2 k k n . P tSn “ n ´ 1u

k Since pd1, . . . dkq is a preorder degree sequence, i“1 di “ k ´ 1 by Lemma 2. Therefore, using the fact that ξ1, ξ2, . . . , ξn are independent, the last expression equals ř P trξ “ d , ξ “ d , . . . , ξ “ d s X rS “ n ´ ksu P tS “ n ´ ku n 1 1 2 2 k k n´k “ nP tT gw “ T u n´k . P tSn “ n ´ 1u P tSn “ n ´ 1u

Thus part (i) is proved. Part (ii) follows by summing the equality in (i) over all T P Sk. The following approximations are useful for estimating the expectation and the vari- ance of the number of fringe subtrees.

Lemma 6 (Lemma 5.2 and 6.2 of Janson [22]). Assume Condition A. We have: (i) Uniformly for all k with 1 ď k ď n{2,

P tS “ n ´ ku k n´k “ 1 ` O ` o n´1{2 . P tS “ n ´ 1u n n ˆ ˙ ` ˘ (ii) Uniformly for all k with 1 ď k ď n{4,

P tS “ n ´ 2k ` 1u P tS “ n ´ ku 2 1 k k2 n´2k ´ n´k “ O ` O ` O . P tS “ n ´ 1u P tS “ n ´ 1u n n3{2 n2 n ˆ n ˙ ˆ ˙ ˆ ˙ ˆ ˙

8 2.2 Poisson Approximation Stein’s method [29] bounds the errors in approximating the sum of random variables with a normal distribution. It was extended to Poisson distributions first by Chen [6] and later by many others, e.g., Arratia et al. [3] and Barbour et al. [4]. Among these, we found that it is easier to use the exchangeable pairs method by Chatterjee et al. [5] to prove Theorem 2. In particular, we use it of the form of Ross [28, thm. 4.37] as follows. Given a function f : N0 Ñ R, let

kfk :“ max |fpiq|, k∆fk :“ max |fpi ` 1q ´ fpiq|. iPN0 iPN0

Lemma 7. Let W be a non-negative integer-valued random variable with EW “ λ. Let pW, W 1q be an exchangeable pair, i.e., pW, W 1q “L pW 1,W q. Let F be a σ-algebra such L that W is F-measurable. If Z “ Popλq, then there exists a function f : N0 Ñ R with

kfk ď λ´1{2, k∆fk ď λ´1, such that for all c P R,

1 dTV pW, Zq ď E rpλ ´ cP tW “ W ` 1|Fuq fpW ` 1qs ´E rpW ´ cP tW 1 “ W ´ 1|Fuq fpW qs .

The following Lemma is a special case of Roos [27, thm. 1], which applies to mixed Poisson distributions. Barbour et al. proved a similar result using Stein’s method [4, thm. 1.C]. We include our proof for its simplicity.

Lemma 8. If X “L Popµq and Y “L Popνq, then

? ? |µ ´ ν| dTV pX,Y q ď µ ´ ν “ ? ? . µ ` ν ˇ ˇ ˇ ˇ Proof. Let xi “ P tX “ iu and yi “ P tY “ iu. We have

1 8 1 8 ? ? ? ? d pX,Y q “ |x ´ y | “ | x ´ y |p x ` y q TV 2 i i 2 i i i i i“1 i“1 ÿ ÿ 1{2 1 8 ? ? 8 ? ? ď | x ´ y |2 p x ` y q2 2 i i j j ˜i“1 j“1 ¸ ÿ ÿ 1{2 1 8 ? 8 ? “ 2 ´ 2 x y 2 ` 2 x y 2 i i j j ˜˜ i“1 ¸ ˜ j“1 ¸¸ ÿ ÿ 2 1{2 8 ? “ 1 ´ xiyi , ¨ ˜i“1 ¸ ˛ ÿ ˝ ‚

9 where the second step uses the Cauchy-Schwartz inequality. An easy calculation shows that ? 8 8 i{2 ? 2 ? ´ µ`ν pµνq ? µ ` ν p µ ´ νq x y “ e 2 “ exp µν ´ “ exp ´ . i i i! 2 2 i“1 i“1 ÿ ÿ ´ ¯ ˆ ˙ Thus we have ? ? 2 dTV pX,Y q ď 1 ´ exp ´p µ ´ νq b ? ? ď p µ ´ `νq2 pby 1 ´˘ e´x ď xq ? ? “ bµ ´ ν . ˇ ˇ 3 Sequences of fringeˇ subtreesˇ

In this section we prove Theorem 2 and sketch the proof of Theorem 3. Recall that gw gw πpT q :“ P tT “ T u, and NT pnq :“ vPT gw Tn,v “ T . Theorem 2 states that J n K gw sup dTV pNT pTřn q, PopnπpT qqq “ op1q, T PSkn

gw gw whenever kn “ opnq. If πpT q “ 0, then NT pTn q “ 0 and dTV pNT pTn q, PopnπpT qqq “ gw gw 0 deterministically. Thus we can assume that P tT P Skn u “ P t|T | “ knu ą 0 for all n, and that the above supremum is taken over all T P Skn with πpT q ą 0. gw To prove this, we first show that ENT pTn q „ nπpT q. Then we construct a coupling gw gw gw gw NT pTn q „ NT pTn q such that pNT pTn q,NT pTn qq forms an exchangeable pair, and apply Lemma 7 to finish the proof. gw gw Recall that N T : gw T S , i.e., N T is the number of fringe s Skn p n q “ vPTn v P kn s Skn p n q gw gw subtrees of size kn in T . AlsoJ recall that πpKSq :“ P tT P Su. n ř Lemma 9. Let k “ kn “ opnq. We have EN pT gwq k sup T n ´ 1 “ O ` o n´1{2 , nπpT q n T PSk ˇ ˇ ˆ ˙ ˇ ˇ ` ˘ ˇ ˇ and ˇ ˇ EN pT gwq k Sk n ´ 1 “ O ` o n´1{2 . nπpS q n ˇ k ˇ ˆ ˙ ˇ ˇ ` ˘ ˇ ˇ Proof. Since k “ opnq, weˇ have k ă n{2ˇ for n large. Thus by Lemma 5 and 6, uniformly for all T P Sk, EN pT gwq P tS “ n ´ ku k T n ´ πpT q “ πpT q n´k ´ 1 “ πpT q O ` opn´1{2q . n P tS “ n ´ 1u n ˇ ˇ ˇ n ˇ ˆ ˆ ˙ ˙ ˇ ˇ ˇ ˇ ˇ ˇ ˇ ˇ Summingˇ over all treesˇ T with ˇ|T | “ k gives the secondˇ part of the lemma.

10 Lemma 10. Let k “ kn “ opnq. There exists a constant C1 such that 1 k k2 Var pN pT gwqq ď C pnπpS qq2 ` ` . Sk n 1 k n n3{2 n2 ˆ ˙ n n n n gw Proof. Recall that ξ :“ pξ1 , ξ2 , . . . , ξn q is the preorder degree sequence of Tn and that Dk is the set all preorder degree sequences of length k. Let

n n n Ji :“ pξi , ξi`1, . . . ξi`k´1q P Dk , J K n n where the indices are modulo n. Thus Ji “ 1 if and only if pξi , . . . ξi`k´1q satisfies (3). Note that if n ´ k ` 1 ă i ď k, then it is impossible that Ji “ 1. (See Lemma 5 for a gw gw n gw similar argument.) Thus NSk pTn q :“ vPT Tn,v P Sk “ i“1 Ji. n J K Recall that ξn :“ pξn, ξn,..., ξnq is a uniform random rotation of ξn. Let 1 2 n ř ř n n n r r r Ji :“r pξi , ξi`1,... ξi`k´1q P Dk . J K Then N T gw n J n J . Sk p n q “ i“1 i “r i“1 ri r r Using the fact that J1,..., Jn are identically distributed and Lemma 5, we have ř ř r 1 gw gw P tSn´k “ n ´ ku EJ1 “ r ENSk prTn q “ P t|T | “ ku . n P tSn “ n ´ 1u Similar to the proofr of Lemma 5, we have

n n n n EJ1Jk`1 “ P pξ1 ,..., ξk q P Dk, pξk`1,..., ξ2kq P Dk “ P !tpξ , . . . , ξ q P D , pξ , . . . , ξ q P D |)S “ n ´ 1u r r r1 rk k kr`1 2rk k n P tS “ n ´ 2k ` 1u “ P t|T gw| “ ku2 n´2k P tSn “ n ´ 1u , m where ξ1, ξ2,... are i.i.d. copies of ξ and Sm :“ i“1 ξi. Consider two indices i ‰ j. Note that if |i ´ j| ă k or |i ` n ´ j| ă k, then EJ J “ 0. ř i j This is because a fringe subtree of size k, besides itself, cannot contain another fringe subtree of the same size. On the other hand, if |i ´ j| ą k and |i ` n ´ j| ąr rk, i.e., n n n n n pξi ,..., ξi`k´1q and pξj ,..., ξj`k´1q do not overlap, then E JiJj “ E J1Jk`1 since ξ is permutation invariant. Therefore, there exists a constant” C1 suchı that” ı r r r r r r r r r gw Var pNSk pTn qq “ E JiJj ´ E Ji E Jj ď npn ´ 2kq EJ1Jk`1 ´ EJ1EJk`1 1ďi,jďn ÿ ´ ” ı ” ı ” ı¯ ´ ¯ r r P trS r“ n ´ 2k ` 1u PrtSr “ nr´ kru 2 ď pnP t|T gw| “ kuq2 n´2k ´ n´k P tS “ n ´ 1u P tS “ n ´ 1u « n ˆ n ˙ ff 1 k k2 “ C pnπpS qq2 ` ` , 1 k n n3{2 n2 ˆ ˙ where in the last step we use k ď n{4 for n large and part (ii) of Lemma 6.

11 The following Lemma is the key part of the proof of Theorem 2.

Lemma 11. Assume that k “ kn “ opnq and k Ñ 8. We have as n Ñ 8,

gw gw sup dTV pNT pTn q, PopE rNT pTn qsqq Ñ 0. T PSk

gw gw Proof. Let T P Sk. We use the abbreviations Γn “ NT pTn q and Ψn “ NSk pTn q. Let n n n n pd1, . . . , dkq be the preorder degree sequence of T . Recall that ξ :“ pξ1 , ξ2 ,..., ξn q is a gw uniform random cyclic rotation of the preorder degree sequence of Tn . Let r r r r n n n n n n Ii “ ξi “ d1, ξi`1 “ d2,... ξi`k´1 “ dk , and Ji “ pξi , ξi`1,..., ξi`k´1q P Dk , J K J K where all the indices are modulo n. Then by the same argument in Lemma 10, Γn “ nr r r n r r r r r i“1 Ii and Ψn “ i“1 Ji. Note that Ii “ 1 implies that Ji “ 1 since pd1, . . . , dkq P Dk. n n n n n n Let ξ :“ pξ1 , ξ2 ,..., ξn q be an independent copy ξ . Let Ii and Ji be defined for ξ as ř n ř Ii andrJi for ξ . r r r n n n n Wep constructp p a newp sequence ξ :“ pξ1 , ξ2 ,...,rξn q as follows.p p Let Z be a uniformp randomr r variabler on t1, 2, . . . , nu that is independent from everything else. Let ξn “ ξn if JZ “ 0 or JZ “ 0. Otherwise JZs“ JZ “s 1s and ins this case we let s r ξn if i P tZ,Z ` 1,...,Z ` k ´ 1u pmod nq, r pξn i r p i “ n #ξi otherwise. p s Equivalently, we canr construct ξn using a method which we call subtree switching. gw gw Let Tn be an independent copy of Tn . Let Z be as before. Let vpZq and vpZq be gw gw the Z-th nodes (in dfs order) in Tsn and Tn respectively. If the two fringe subtrees gw r gw gw gw gw Tn,vpZq and Tn,vpZq are both of size k, then in Tn we replace Tn,vpZq with Tn,vpZq rand get gw gw gwr n a new tree Tn . Otherwise let Tn “ Tn . Let ξ be a uniform random rotation of r gw n r the preorderr degree sequence of Tn . This construction of ξ is equivalent to the above one. s s s n n n gw Let Ii be defined for ξ as Ii sfor ξ . Let Γn :“ i“1 Ii “s NTn pTn q. It is not hard to verify that ξn L ξn and that Γ , Γ L Γ , Γ , i.e., Γ , Γ is an exchangeable pair “ p n nq “ p n nq ř p n nq defineds in Lemma 7. Wes leaver the detailsr tos the reader. s s n n n Let rn :“ EIr1 “ EsI1 and qn :“ EJs1 “ EJs1. Let F “ σpξ1 s, ξ2 ,..., ξn q, i.e., the sigma n algebra generated by ξ . Since the event Γn “ Γn `1 happens if and only if JZ “ JZ “ 1, IZ “ 0 and IZ r“ 1, wep have r p r r r r s r p n r P Γpn “ Γn ` 1|F “ P Yi“1 Z “ i, Ji “ 1, Ii “ 0, Ji “ 1, Ii “ 1 F n! ” ı ˇ ) ( ˇ s “ P tZ “ iu P Jri “ 1, rIi “ 0 pF P pIi “ 1 ˇ i“1 ÿ ! ˇ ) ! ) 1 n r r r n ˇ p “ J 1 ´ I r “ n Jˇ ´ I , n i i n n i i i“1 i“1 ÿ ´ ¯ ÿ ´ ¯ r r r r

12 where we use that rIi “ 1s Ď rJi “ 1s and rIi “ 1s Ď rJi “ 1s. n n Let c “ n{EJ1 “ n{qn. Note that i“1 EIi “ EΓn “ nrn, and that i“1 EJi “ EΨn “ nqn. Thus r r p p ř ř r r n r n r ζ :“ EΓ ´ cP Γ “ Γ ` 1|F “ EΓ ´ n J ´ E J ` E J ´ I 1 n n n n q n i i i i n i“1 (n ÿ ´ ” ı ” ı ¯ r s r r r r r r r r “ nr ´ n nq ` n Γ ´ n J ´ EJ “ n Γ ` n pE rΨ s ´ Ψ q . n q n q n q i i q n q n n n n n i“1 n n ÿ ´ ¯ r r The event Γn “ Γn ´ 1 happens if and only if JZ “ JZ “ 1, IZ “ 1 and IZ “ 0. Therefore

s n r p r p P Γn “ Γn ´ 1|F “ P Yi“1 Z “ i, Ji “ 1, Ii “ 1, Ji “ 1, Ii “ 0 F n! ” ı ˇ ) ( ˇ s “ P tZ “ iu P Jri “ 1, rIi “ 1 pF P pJi “ 1,ˇIi “ 0 i“1 ÿ ! ˇ ) ! ) 1 n r r ˇ qp pr “ I P J “ 1 ´ P I “ˇ1 “ n Γ ´ n Γ , n i i i n n n n i“1 ÿ ´ ! ) ! )¯ r p p where we again use that rIi “ 1s Ď rJi “ 1s and rIi “ 1s Ď rJi “ 1s. Therefore

n qn rn rn ζ2 :“ Γn ´ cP Γrn “ Γn ´ 1r|F “ Γn ´ p Γn ´p Γn “ Γn. qn n n qn ( ´ ¯ It follows from Lemmas 7 that there exists a function f : N0 Ñ R with k∆fk ď ´1 ´1 ´1{2 ´1{2 E rΓns “ pnrnq and kfk ď E rΓns “ pnrnq , such that

dTV pΓn, PopEΓnqq ď E rζ1fpΓn ` 1q ´ ζ2fpΓnqs r r r “ E n Γ ` n pE rΨ s ´ Ψ q fpΓ ` 1q ´ n Γ fpΓ q q n q n n n q n n „ˆ n n ˙ n  r r “ E n Γ pfpΓ ` 1q ´ fpΓ qq ` fpΓ ` 1q n pE rΨ s ´ Ψ q q n n n n q n n „ n n  rn rn ď E rΓn ¨ |fpΓn ` 1q ´ fpΓnq|s ` E r|fpΓn ` 1q| ¨ |E rΨns ´ Ψn|s qn qn rn rn ď k∆fk E rΓns ` kfk E r|E rΨns ´ Ψn|s qn qn rn 1 rn 1 1{2 ď nrn ` ? pVar pΨnqq qn nrn qn nrn r r Var pΨ q 1{2 “ n ` n n :“ α ` α . q nq2 1 2 n „ n  Let pmax :“ maxiě0 pi. By the assumptions of Theorem 2, Var pξq ą 0, which implies that ξ is not a constant. Thus pmax ă 1. By Lemma 9, we have EΓ r “ n „ πpT q :“ P tT gw “ T u “ p p ¨ ¨ ¨ p ď pk . n n n n d1 d2 dk max

13 And by Lemma 9 and Lemma 4 EΨ q “ n „ πpS q :“ P t|T gw| “ ku “ Θpk´3{2q. n n kn

k ´3{2 Therefore α1 “ rn{qn “ Oppmax{k q “ op1q. By Lemma 10, we have

r Var pΨ q r 1 k k2 α2 “ n n “ n pnπpS qq2 O ` O ` O 2 nq2 np1 ` op1qqπpS q2 kn n n3{2 n2 n kn ˆ ˆ ˙ ˆ ˙ ˆ ˙˙ r k r k2 “ Opr q ` O n ` O n Ñ 0, n n1{2 n ˆ ˙ ˆ ˙ k k since rn ď pmax Ñ 0, rnk ď pmaxk Ñ 0 and k “ opnq. Therefore α2 Ñ 0. Thus dTV pΓn, PopEΓnqq ď α1 ` α2 Ñ 0. Since this upper holds whenever T P Sk, the lemma follows.

Proof of Theorem 2. Let k “ kn. By Lemma 9,

EN pT gwq k sup T n ´ 1 “ O ` o n´1{2 . nπpT q n T PSk ˇ ˇ ˆ ˙ ˇ ˇ ` ˘ ˇ ˇ Therefore, for all T P Sk, weˇ have ˇ |nπpT q ´ EN pT gwq| k T n “ nπpT q O ` o n´1{2 nπ T n p q ˆ ˆ ˙ ˙ a 2 ` ˘ a k πpT q k “ O ` opπpT qq “ o kpmax , ˜c n ¸ ´a ¯ k where the last step uses k “ opnq and πpT q ď pmax. It follows from Lemma 8 that

gw gw |nπpT q ´ ENT pTn q| k sup dTV pPopnπpT qq, PopENT pTn qqq ď O sup “ o kpmax . T PSk ˜T PSk nπpT q ¸ ´a ¯ Thus by Lemma 11 and the triangle inequality, a

gw sup dTV pNT pTn q, PopnπpT qqq T PSk gw gw gw ď sup dTV pNT pTn q, PopENT pTn qqq ` sup dTV pPopnπpT qq, PopENT pTn qqq T PSk T PSk k “ op1q ` o kpmax “ op1q. ´a ¯ Statements (i)–(iii) of theorem follow by Lemma 1.

14 3.1 The sketch of the proof of Theorem 3

Let An Ď Skn be a sequence of sets of trees, i.e., An is a set of some trees of size kn. gw Theorem 3 states that sup dTV pNA pT q, πpAnqq Ñ 0. We can show this by AnĎSkn n n adapting the proof of Theorem 2. The key step is to construct an exchangeable pair gw gw pNAn pTn q,NAn pTn qq as in the proof of Lemma 11 with new definitions of Ii, Ji, Ii and Ji. n Let DpAnq be thes set of preorder degree sequences of trees in An. Recallr thatr pξ :“ n n n gw ppξ1 , ξ2 ,..., ξn q is a uniform random rotation of the preorder degree sequence of Tn . Let r r r r n n n n n n Ii “ pξi , ξi`1,..., ξi`k´1q P DpSkn q , Ji “ ξi ` ξi`1 ¨ ¨ ¨ ` ξi`k´1 “ k ´ 1 . J K J K gw n n n n n n ThenrNAn prTn rq “ i“r1 Ii. Let ξ :“ pξ1 , ξ2 r,..., ξrn q ber an independentr copy of ξ . Define I and J for ξn as I and J for ξn. Now we construct ξn L ξn as in Lemma 11, i i ř i i “ r gw L pgw p p pgw gw r which gives a random tree Tn “ Tn . Then pNTn pTn q,NTn pTn qq is an exchangeable pair. Fromp herep we canp arguer as inr the restr of the proof for Theorems r 2. s s 4 Families of fringe subtrees

gw In this section, we apply Theorem 2 and 3 to study the conditions for Tn to contain every tree that belongs to a family of trees.

4.1 Coupon collector problem As shown later, our problem is essentially a variation of the famous coupon collector problem—if in every draw we get a coupon with a uniform random type among n types, how many draws do we need to collect all n types of coupons? The next lemma is about a generalization of this problem needed later. For the original problem, see Erd˝osand R´enyi [13] and Flajolet et al. [16]. For more about the generalized version defined below, see Neal [24].

Lemma 12 (Generalized coupon collector). Let Xn be a random variable that takes values in t1, . . . , nu. Let pn,i :“ P tXn “ iu. Assume that pn,i ą 0 for all 1 ď i ď n. Let Xn,1,Xn,2,... be i.i.d. copies of Xn. Let

Nn :“ infti ě 1 : |tXn,1,Xn,2,...,Xn,iu| “ nu.

Let mn be a sequence of real numbers. We have

n mn 1 1 ´ p1 ´ pn,iq ď P tNn ď mnu ď n . p1 ´ p qmn i“1 i“1 n,i ÿ ř

15 Proof. Let m “ mn. Let Zn,i “ i R tXn,1,...,Xn,mu . Then Nn ď m if and only if n J K Zn :“ i“1 Zn,i “ 0, i.e., P tNn ď mu “ P tZn “ 0u “ 1 ´ P tZn ě 1u. The first inequality of this lemma follows from the following: ř n n n m m P tZn ě 1u ď EZn “ EZn,i “ P Xj“1Xn,j ‰ i “ p1 ´ pn,iq . i“1 i“1 i“1 ÿ ÿ ( ÿ For 1 ď i ‰ j ď n, we have

m m m E rZn,iZn,js ´ E rZn,is E rZn,js “ p1 ´ pn,i ´ pn,jq ´ p1 ´ pn,iq p1 ´ pn,jq p m “ p1 ´ p qm 1 ´ n,j ´ p1 ´ p qm ă 0. n,i 1 ´ p n,j „ˆ n,i ˙  Therefore

Var pZnq “ E rZn,iZn,js ´ E rZn,is E rZn,js 1ďi,jďn ÿ 2 “ pE rZn,iZn,js ´ E rZn,is E rZn,jsq ` E rZn,is ´ E rZn,is ď EZn. 1ďi‰jďn 1ďiďn ÿ ÿ ` ˘ Thus by Chebyshev’s inequality, as in the second moment method (see e.g., Alon and Spencer [2, chap. 4]), we have

Var pZnq 1 1 P tZn “ 0u ď P t|Zn ´ EZn| ě EZnu ď ď “ . 2 n mn pEZnq EZn i“1p1 ´ pn,iq 4.2 Complete r-ary fringe subtrees ř

A tree T is called possible if πpT q ą 0. Let r ą 0 be a fixed integer and hn be a sequence of positive integers. A simple application of Theorem 2 is to find sufficient conditions gw such that whp every (or not every) possible complete r-ary tree appears in Tn as fringe subtrees.

Let hn Ñ 8 be a sequence of positive integers. Let Ahn,r be the set of all possible complete r-ary trees of height at most hn. Let

gw Hn,r “ max th : Tn contains all trees in Ahn,r as fringe subtreesu .

Lemma 13. Assume Condition A and pr ą 0 for some r ě 2. Let 1 1 1 α “ log log ` log . r r p r ´ 1 p ˆ 0 r ˙

Let ωn Ñ 8 be an arbitrary sequence.

gw (i) If hn ď logrplog n ´ ωnq ´ αr, then whp Tn contains all trees in Ahn,r as fringe subtrees.

16 gw (ii) If hn ě logrplog n ` ωnq ´ αr, then whp Tn does not contain all trees in Ahn,r as fringe subtrees. Also, p Hn,r ´ logr log n Ñ ´ αr. Proof. Let T r-ary denote the complete r-ary tree of height h . Note that if T r-ary appears hn n hn gw gw in Tn as a fringe subtree, then every tree in Ahn,r also appears in Tn as a fringe r-ary hn hn subtree. The tree Tn has `n :“ r leaves and vn :“ pr ´1q{pr´1q “ p`n ´1q{pr´1q internal vertices, which all have degree r. Thus we have π T r-ary : P T gw T r-ary pvn p`n . p hn q “ “ hn “ r 0

If hn ď logrplog n ´ ωnq ´ αr, then (

hn log n ´ ωn `n “ r ď . rαr Therefore 1 1 1 log “ v log ` ` log π T r-ary n p n p p hn q r 0 `n ´ 1 1 1 “ log ` `n log r ´ 1 pr p0 1 1 1 “ ` log ` log ` Op1q n r ´ 1 p p ˆ r 0 ˙ log n ´ ω ď n rαr ` Op1q rαr “ log n ´ ωn ` Op1q. Thus log nπ T r-ary ω O 1 , which implies that nπ T r-ary . It follows p p hn qq ě n ` p q Ñ 8 p hn q Ñ 8 gw p from Theorem 2 that NT r-ary pT q Ñ 8. Thus (i) is proved. hn n Similar computations show that with the assumptions of (ii), we have nπ T r-ary 0, p hn q Ñ gw p which implies that NT r-ary pT q Ñ 0 by Theorem 2. The last statement of the lemma hn n follows directly from (i) and (ii). We have a similar result for the set of 1-ary trees (chains) of height at most h. The proof is virtually identical to the previous lemma and we leave it to the reader.

Lemma 14. Assume Condition A and p1 ą 0. Let ωn Ñ 8 be an arbitrary sequence. We have: 1 gw (i) If hn ď plog n ´ ωnq{ log , then whp T contains all trees in Ah ,1 as fringe p1 n n subtrees.

1 gw (ii) If hn ě plog n ` ωnq{ log , then whp T does not contain all trees in Ah ,1 as p1 n n fringe subtrees. Therefore H p n,1 Ñ 1. log1{p1 pnq

17 4.2.1 Binary trees

gw Consider Tn with a binomial p2, 1{2q offspring distribution, i.e., p0 “ p2 “ 1{4 and bin gw p1 “ 1{2. Let Tn be Tn with each degree-one node labeled of having a left or a right bin child uniformly and independently at random. Then Tn is a tree in which nodes have a left position and a right position where child nodes can attach, and each position can be occupied by at most one child. We call such a tree a binary tree. Figure 4 gives an bin example of Tn for n “ 7.

bin gw T Possible outcomes of T7 given that T7 “ T

bin Figure 4: Example of Tn .

bin Let Tn be a binary tree of size n that has n0, n1, n2 nodes of degree 0, 1, 2 respectively, bin where n0 `n1 `n2 “ n. Let Tn be Tn with the difference between left and right children bin bin gw n1 being forgotten. We have P Tn “ Tn |Tn “ Tn “ 1{2 , since there are in total 2n1 ways to label the n degree-one nodes of T . Therefore 1 n ( bin bin bin bin gw gw P Tn “ Tn “ P Tn “ Tn |Tn “ Tn P tTn “ Tnu P tT gw “ T u ( “ P T bin “ T bin|T gw “ T ( n n n n n P t|T gw| “ nu 1 1 1 ( 1 “ “ . 2n1 4n0n2 2n1 P t|T gw| “ nu 4nP t|T gw| “ nu In other words, as is well-known from the connection between simply generated trees bin and Galton-Watson trees of size n, see, e.g., Janson [21, pp. 132], Tn is uniformly distributed among all binary trees of size n. gw Thus our analysis of maximum r-ary fringe subtree in Tn can be easily adapted to uniform random binary trees. For example, an argument similar to Lemma 14 shows that bin the maximum one-ary fringe subtree (chain) in Tn that consists of only left children has length about log4 n.

4.3 All possible fringe subtrees

` Recall that Sďk denotes the set of all possible trees of size at most k, i.e., ` gw Sďk :“ tT P T : |T | ď k, P tT “ T u ą 0u. Also recall that gw ` gw Kn :“ maxtk : Sďk ĎYvPTn tTn,v uu.

18 We would like to study the growth of Kn with n. Let Geppq denote Geometric p distribution, i.e., P tGeppq “ iu “ pp1 ´ pqi for i P N0. Let Beppq denote Bernoulli p distribution and let Bipd, pq denote Binomial pd, pq distribution. Recall that Popλq is a Poissonpλq random variable. Table 1 shows five types of well-known conditional Galton-Watson trees. See Janson [21, sec. 10] for more examples. Name Definition L i`1 Plane trees ξ “ Gep1{2q pi “ 1{2 pi ě 0q L Full binary trees ξ “ 2 Bep1{2q p0 “ p2 “ 1{2 Motzkin trees ξ uniform in t0, 1, 2u p0 “ p1 “ p2 “ 1{3 L d 1 i 1 d´i d-ary trees ξ “ Bipd, 1{dq pi “ i 1 ´ d d p0 ď i ď dq L ´1 Labeled trees ξ “ Pop1q pi “ `e ˘{ `i! ˘ ` ˘ Table 1: Some well-known conditional Galton-Watson trees.

gw 1 We can assume that P t|T | “ knu ą 0 for all n P N. Otherwise, let kn :“ maxti ď gw 1 kn : P t|T | “ iu ą 0u. It is not difficult to show that kn ´ kn ď h “ Op1q for kn large. (See [21, lem. 12.3] for details.) Thus this assumption does not change results in this subsection. gw d gw Janson showed that Tn,˚ Ñ T [21, thm. 7.12]. In other words, fringe subtrees on min ` average behave like unconditional Galton-Watson trees. Let Tk be a tree T P Sďk gw min gw that minimizes P tT “ T u. Then Tk also is the least likely tree to appear in Tn ` min as fringe subtree among all trees in Sďk when n is large. So intuitively if whp Tk gw ` min appears in Tn , then every tree in Sďk should also appears whp. And if whp Tk does ` not appear, then whp there is at least one tree in Sďk that is missing. Therefore, the problem can be reduced to finding pmin :“ P T gw “ T min “ min P tT gw “ T u . k k ` T PSďk ( min Lemma 15. Assume Condition A. If kn Ñ 8 and npkn {kn Ñ 8, then whp every tree in S` appears in T gw. ďkn n min k Proof. Let k “ kn. Recall that pmax :“ maxiě0 pi and pmax ă 1. Therefore pk ď pmax. ´1 Thus we can assume that k ď 2 logpnq{ logppmaxq when k is large. Otherwise we have min ´1 npk ď n Ñ 0, which contradicts the assumption. gw 1 gw Thus by Theorem 3, NSk pTn q ě yn :“ r 2 nP tT “ |k|us whp. Let An be the event gw that Tn contains all possible trees of size k as fringe subtrees. Let Bnpiq be the event gw that NSk pTn q “ i for given i ě yn. Thus gw P tAnu “ P tAn X rNSk pTn q ă ynsu ` P tAn|Bnpiqu P tBnpiqu iěy ÿn ě P tAn|Bnpynqu P tBnpiqu , iěy ÿn “ P tAn|Bnpynqu p1 ´ op1qq,

19 where the inequality comes from the obvious fact that P tAn|Bnpaqu ě P tAn|Bnpbqu c given that a ě b. So it suffices to prove that P tAn|Bnpynqu Ñ 0. gw We can show that this is equivalent to a coupon collector problem. Let Tn |Bnpynq be gw ˚ gw Tn restricted to the event Bnpynq. Let T be a random tree distributed as Tn |Bnpynq. gw Replace each of its yn subtrees with an independent copy of Tk . The result is still a gw random tree distributed as Tn |Bnpynq. (We leave the verification to the readers.) gw So P tAn|Bnpynqu equals the probability of that yn independent copies of Tk contain ` every tree in Sďk. It follows from Lemma 12 (the coupon collector) that

c gw yn gw P tAn|Bnpynqu ď p1 ´ P tTk “ T uq ď exp t´ynP tTk “ T uu T PS` T PS` ÿďk ÿďk 1 P tT gw “ T u ď exp ´ nP t|T gw| “ ku 2 P t|T gw| “ ku T PS` " * ÿďk ` min ď Op|Sďk|q exp ´npk . ? It is well-known that the number of plane trees( of size exactly k is 4k´1{ πk3p1 ` op1qq See, e.g., Flajolet and Sedgewick [15, pp. 406]. It follows that there exists a constant C ` k j´1 k such that |Sďk| ď j“1 C4 ď C4 for all k P N. Thus for large enough k, the last expression above is at most ř k min min C4 exp ´npk “ C exp k logp4q ´ npk Ñ 0.

c ( ( Therefore we have P tAn|Bnpynqu Ñ 0. Theorem 5. Assume Condition A. Assume that as k Ñ 8,

min α β logp1{pk q „ γk plog kq , where α ě 1, β ě 0, γ ą 0 are constants. Let kn Ñ 8 be a sequence of positive integers. Let m “ log n. Then for all constants δ ą 0, we have: (i) If k 1 δ m γ log m1{α β 1{α, then whp T gw contains all trees in S` as n ď p ´ qr { p q s n ďkn fringe subtrees.

(ii) If k 1 δ m γ log m1{α β 1{α, then whp T gw does not contain all trees in S` n ě p ` qr { p q s n ďkn as fringe subtrees.

As a result, β 1{α K p α n Ñ . β 1{α γ plog n{plog log nq q ˆ ˙ min The behavior of pk varies for different offspring distributions. But as mentioned min in the introduction, the types of trees that we are interested in all have pk that are covered by Theorem 5.

20 Proof. Let k “ kn. Part (i) assumes that

m 1{α k ď p1 ´ δq . γplog m1{αqβ „  Taking a logarithm, we have 1 m log k ď logp1 ´ δq ` log “ p1 ` op1qq log m1{α. α γplog m1{αqβ Thus 1 p1 ´ δqαm log γkα log k β γ o 1 log m1{α β 1 δ αm. min „ p q ď p ` p qq 1{α β p q „ p ´ q pk γplog m q Therefore, recalling m “ log n,

min 1 α log npk “ m ´ log min ě m ´ p1 ` op1qqp1 ´ δq m “ Ωpmq. pk It follows that

min log k ´ log npk “ O plog mq ´ Ωpmq Ñ ´8.

min gw Thus npk {k Ñ 8 and it follows from Lemma 15 that whp Tn contains every tree in ` Sďk as a fringe subtree. 1{α β 1{α min Similar computations show that if k ě p1 ` δqrm{γplog m q s then npk Ñ 0. It gw p gw follows from Theorem 2 that N min pT q Ñ 0. Thus whp T does not contain every Tk n n ` tree in Sďk as a fringe subtree. The rest of this section is organized as follows. In the next subsection, we give a min general method of finding pk . Then we divide offspring distributions in two categories and show that Theorem 5 is applicable to all the Galton-Watson trees listed in Table 1.

min 4.3.1 Computing pk

Let Ik :“ tj : 1 ď j ď k, pj ą 0u. Let L1 :“ p0 and for k ě 2 let

p 1{i L :“ min p i : i P I . k 0 p k´1 # ˆ 0 ˙ +

Since Lk is non-increasing, L :“ limkÑ8 Lk exists. Equivalently, we have

p 1{i L :“ inf p i : i P N, p ą 0 . 0 p i # ˆ 0 ˙ + min 1{k Theorem 6. Assume Condition A. We have ppk q Ñ L as k Ñ 8, where the limit is taken along the subsequence k with P t|T gw| “ ku ą 0. As a result, we have L ă 1.

21 min 1{k In fact, we have a stronger result for the upper bound of ppk q .

Lemma 16. Assume Condition A. For all fixed i with pi ą 0, there exist constants 1 2 gw Ci ą 1 and Ci,Ci , kpiq ą 0 such that for all k ě kpiq with P t|T | “ ku ą 0, there are 1 ´Ci k at least k Ci trees T of size k with k p 1{i 0 ă P tT gw “ T u ď C2 p i . i 0 p « ˆ 0 ˙ ff Proof. We assume that h (the span of ξ) is 1. The proof for h ą 1 is similar. Let j be the smallest positive integer such that j is coprime with i and pj ą 0. (Such a j exists except for the trivial case that p0 ` pi “ 1.) Let x “ pk ´ 1q mod i. By the Chinese reminder theorem, there exists a smallest non-negative integer y such that

y ” x pmod iq, #y ” 0 pmod jq. Note that y depends only on i. Therefore, if k ě kpiq :“ y ` 1, we can choose y k ´ 1 ´ y n “ k ´ n ´ n , n “ , n “ , 0 i j j j i i such that n0, ni, nj are all non-negative integers with

n0 ` ni ` nj “ k, and ini ` jnj “ k ´ 1.

Let Skpn0, ni, njq be the set of plane trees of size k that has n0, ni and nj nodes with degree 0, i and j respectively. It is well-known that when the above two conditions hold, we have 1 k 1 k! |S pn , n , n q| :“ “ . k 0 i j k n , n , n k n !n !n ! ˆ 0 i j˙ 0 i j (See [15, pp. 194].) Since i is a constant and y only depends on i, there exists a constant ˚ Ci such that

˚ ˚ ˚ |n0 ´ kp1 ´ 1{iq| ď Ci , |ni ´ k{i| ď Ci , nj ď Ci . Using these inequalities and Stirling’s approximation [15, pp. 407], it is easy to verify 1 that there exists a constant Ci ą 0 such that k 1{i 1´1{i ´ C1 1 1 C1 k |S pn , n , n q| ě k´ i 1 ´ :“ k´ i C . k 0 i j i i i ˜ˆ ˙ ˆ ˙ ¸

And for every T P Spn0, ni, njq, we have

k 1{i C˚ C˚ p P tT gw “ T u ď pni pn0 ď p´ i p´ i pk{ipkp1´1{iq :“ C2 p i . i 0 i 0 i 0 i 0 p « ˆ 0 ˙ ff

22 gw ` Proof of Theorem 6. Let T be a tree with |T | “ k and P tT “ T u ą 0, i.e., T P Sďk. Let ni be the number of nodes of degree i in T . Note that if i ą 0 and i R Ik´1, then ni “ 0. Since by (3) the sum of the degrees in a preorder degree sequence equals the size of the tree minus one, we have

n0 ` n1 ` . . . nk´1 “ k, and n1 ` 2n2 ... ` pk ´ 1qnk´1 “ k ´ 1.

Using the convention that 00 “ 1, we have for k ě 2

gw n0 n1 nk´1 P tT “ T u “ p0 p1 ¨ ¨ ¨ pk´1 n1 n2 nk´1 n0`n1`...`nk´1 p1 p2 pk´1 “ p0 ¨ ¨ ¨ ` p0 p0 p0 ˆ ˙ ˆ ˙ ˆ ˙k´1 1 ini 1 i“1 ini i i pi pi k k ř “ p0 ě p0 min p iPIk´1 p iPI « 0 ff « 0 ff źk´1 ˆ ˙ ˆ ˙ k´1 k´1 “ p0Lk ě p0L . (4)

min 1{k As a result lim infkÑ8 ppk q ě L. To show the other way, let ε ą 0 be a constant, and let α “ minti : Li`1 ď L ` εu. 1{α Therefore 0 ă p0ppα{p0q ď L ` ε. By Lemma 16, there is at least one tree T of size k such that

1 k p α pmin ď P tT gw “ T u ď C p α ď C pL ` εqk, k α 0 p α « ˆ 0 ˙ ff

min 1{k where Cα ą 0 is constant. Thus lim supkÑ8 ppk q ď L ` ε. Since ε is arbitrary, we min 1{k have lim supkÑ8 ppk q ď L. gw Recall that pmax :“ maxiě0 pi ă 1. For all trees T with size k, we have P tT “ T u ď k min 1{k min 1{k pmax, i.e., ppk q ď pmax. Thus L “ limkÑ8 ppk q ď pmax ă 1.

4.3.2 When L ą 0

min 1{k 0 If L ą 0, then by Theorem 6, logp1{ppk q q „ logp1{Lqk “ logp1{Lqkplog kq . Thus we can apply Theorem 5 with γ “ logp1{Lq, α “ 1 and β “ 0 to get

K p 1 n Ñ . logpnq log p1{Lq The following Lemma computes L for some well-known Galton-Watson trees. See Janson [21, sec. 10] for more about these trees.

Lemma 17. (i) Full binary tree: If ξ “L 2 Bep1{2q, then L “ 1{2. (ii) Motzkin tree: If L p0 “ p1 “ p2 “ 1{3, then L “ 1{3. (iii) d-ary tree: If ξ “ Bipd, 1{dq for d ě 2, then L “ pd ´ 1qd´1{dd. (iv) Plane tree: If ξ “L Gep1{2q, then L “ 1{4.

23 Proof. (i): If ξ „ 2 Bep1{2q, then p0 “ 1{2, p2 “ 1{2 and pi “ 0 for i R t0, 2u. Thus for k ě 3, we have 1{i 1{2 pi p2 Lk “ min p0 “ p0 “ 1{2. i:iăk,pią0 p p ˆ 0 ˙ ˆ 0 ˙ Therefore L “ limkÑ8 Lk “ 1{2. (ii) and (iii) follow from similar simple calculations. (iv): For all i ě 1, we have

p 1{i 1 1 1{i p i “ “ 1{4. 0 p 2 2i ˆ 0 ˙ ˆ ˙

Therefore Lk “ 1{4 for all k ě 1, and L “ 1{4. Define minti P N : L “ Lu if L “ L for some j, κ “ i`1 j #8 otherwise. If κ ă 8, then we call the Galton-Watson tree well-behaved. Examples of such trees include those for which ξ is bounded, and those for which ξ has a polynomial or sub- exponential tail. The case ξ “L Gep1{2q is also well-behaved. Thus the four types of Galton-Watson trees in Lemma 17 are well-behaved. The following theorem gives better thresholds than Theorem 5.

Theorem 7. Assume Condition A and let the Galton-Watson tree be well-behaved. Then for all constants δ ą 0, we have: (i) If k log n 1 δ log log n log 1 , then whp T gw contains all trees in S` n ď p ´ p ` q q{ L n ďkn as fringe subtrees.

1 gw (ii) If kn ě plog n ´ p1 ´ δq log log nq{ log L , then whp Tn does not contain all trees in S` as fringe subtrees. ďkn Thus as n Ñ 8, we have K logp1{Lq ´ log n p n Ñ ´ 1. log log n The main idea is that when κ ă 8, there are exponentially many trees of size k that gw have small probability to appear as fringe subtrees in Tn . Then we can use Lemma 12 (the coupon collector) to find the sufficient condition for one of them to not to appear whp.

Proof. (i): Write m “ log n and k “ kn. Using (4), it is easy to verify that in this case min npk {k Ñ 8. Thus (i) follows from Lemma 15. (ii): The proof is similar to the one of Lemma 15. As in that proof, we can assume gw 3 gw that k “ Oplog nq. Thus by Theorem 3, whp NSk pTn q ď yn :“ t 2 nP t|T | “ kuu. Let

24 gw An be the event that Tn contains all possible trees of size k as fringe subtrees. Let gw Bnpiq be the event that NSk pTn q “ i for some i ď yn. Then

gw P tAnu ď P tNSk pTn q ą ynu ` P tAn|Bnpiqu P tBnpiqu iďy ÿn ď op1q ` P tAn|Bnpynqu P tBnpiqu , iďy ÿn ď op1q ` P tAn|Bnpynqu .

Thus it suffices to prove that P tAn|Bnpynqu Ñ 0. Using the same coupling as in the proof of Lemma 15, we have P tAn|Bnpynqu equals gw ` the probability that yn independent copies of Tk do not contains all trees in Sďk. It follows from Lemma 12 (the coupon collector) that P tAn|Bnpynqu Ñ 0 if ` p1 ´ T PSďk gw yn P tTk “ T uq goes to infinity. 1{κ ř By definition of κ, we have L “ p0ppκ{p0q . It follows from Lemma 16 that there 1 1 2 ´Cκ k exists constants Cκ ą 1 and Cκ,Cκ ą 0 such that there are at least k Cκ trees T in ` Sďk with P tT gw “ T u pp pp {p q1{κqk C2Lk P tT gw “ T u “ ď C2 0 κ 0 “ κ . k P t|T gw| “ ku κ P t|T gw| “ ku P t|T gw| “ ku Therefore

2 k yn gw 1 C L p1 ´ P tT “ T uqyn ě k´Cκ Ck 1 ´ κ . k κ P t|T gw| “ ku T PS` ˆ ˙ ÿďk Since L ă 1 and P t|T gw| “ ku “ Θpk´3{2q, we have Lk{P t|T gw| “ ku “ op1q. Thus for k large enough, the logarithm of the above is C2Lk k logpC q ´ C1 logpkq ` y log 1 ´ κ κ κ n P t|T gw| “ ku ˆ ˙ 1 C2Lk ě k logpC q ´ y κ 2 κ n P t|T gw| “ ku 1 3 1 ě k logpC q ´ nC2Lk “ k logpC q ´ OpnLkq. 2 κ 2 κ 2 κ

k 1´δ By our assumptions, k “ Ωplog nq and L ď plog nq {n. Since Cκ ą 0, we have 1 plog nq1´δ k logpC q ´ OpnLkq ě Ωplog nq ´ O n Ñ 8, 2 κ n ˆ ˙ which implies P tAn|Bnpynqu Ñ 0. p Remark. If L ą 0 and κ “ 8, then Theorem 5 shows that Kn{ logpnq Ñ 1{ logp1{Lq. But the second order term of Kn is sensitive to small modifications of the offspring distribution, which makes it slightly more challenging to analyze the second order term.

25 4.3.3 When L “ 0

1{i It is clear that L “ 0 if and only if ξ has infinite support and lim infiÑ8 p0ppi{p0q “ 0, which implies lim supiÑ8 logp1{piq{i “ 8, along the subsequence with pi ą 0. If in addition we have pi ą 0 for all i ě 0 and logp1{piq „ fpiq for some f : r0, 8q Ñ r0, 8q with fpiq{i Ò 8, then we say that ξ has an f-super-exponential tail. We have the following threshold for Galton-Watson trees with such a property.

Theorem 8. Assume Condition A and that ξ has an f-super-exponential tail. Let f ´1 denote the inverse of f. Then for all constants δ ą 0, we have 1. If k f ´1 1 δ log n 1, then whp T gw contains all trees in S` as fringe n ď pp ´ q q ` n ďkn subtrees.

2. If k f ´1 1 δ log n 1, then whp T gw does not contain all trees in S` as n ě pp ` q q ` n ďkn fringe subtrees.

Therefore, K p n Ñ 1. f ´1plog nq

Proof. (i): Let k “ kn. Choose ε ą 0 such that p1 ´ δqp1 ` εq ă p1 ´ δ{2q. Since logp1{piq „ fpiq, there exists an integer ipεq such that for all i ą ipεq,

log pi ě ´p1 ` ε{2qfpiq.

1{i Let wi :“ log p0ppi{p0q . We have as k Ñ 8,

1 log pi min wi “ min 1 ´ logpp0q ` ipεqăiăk ipεqăiăk i i ˆ ˙ p1 ` ε{2qfpiq ě logpp0q ´ max ipεqăiăk i p1 ` ε{2qfpk ´ 1q “ logpp q ´ Ñ ´8, 0 k ´ 1 where we use that fpiq{i Ò 8. Since min1ďiďipεq wi is a constant, we have for large k,

p1 ` ε{2qfpk ´ 1q log Lk :“ min wi ě logpp0q ´ . 1ďiăk k ´ 1 It follows from (4) that for k large enough,

min k´1 log pk ě logpp0Lk q ě logpp0q ` pk ´ 1q log p0 ´ p1 ` ε{2qfpk ´ 1q. ě ´p1 ` εqfpk ´ 1q, where the last step uses fpkq{k Ò 8.

26 The assumption k ´ 1 ď f ´1pp1 ´ δq log nq implies that fpk ´ 1q ď p1 ´ δq log n and k “ Oplog nq. Thus

min log pk ě ´p1 ` εqp1 ´ δq log n ě ´p1 ´ δ{2q log n.

min δ{2 min gw Thus npk ě n . We have npk {k Ñ 8. It follows from Lemma 15 that Tn contains all possible trees of size at most k as fringe subtree whp. star (ii): Let Tk´1 be the tree in which one node has degree k ´ 1 and all other nodes are leaves. Computations similar to above show that if k ´ 1 ě f ´1pp1 ` δq log nq, then star gw star nπpTk´1 q Ñ 0. Therefore Tn does not contain Tk´1 whp. ´c1i2 Example (The discrete Gaussian distribution). When pi “ ce for some appropriate positive normalization constants c and c1, we have L “ 0, and Theorem 8 applies. Then

K p 1 n Ñ ? , logpnq c1 as n Ñ 8. a Example (The Cayley trees). A better example is the Galton-Watson tree with offspring L ´1 distribution ξ “ Pop1q, i.e., the Cayley tree. It has pi “ e {i! and logp1{piq „ i logpiq. It is easy to see that K log log n p n Ñ 1. log n Using (4) it is not difficult to verify that the tail drops so fast that the least possible star tree of size k is Tk´1 . This is a special case of the following general observation. 1{i Lemma 18. Assume Condition A. If pi ą 0 for all i ě 0 and pi Ó 0, then for k large min gw star k´1 L enough, pk “ P T “ Tk´1 “ p0 pk´1. In particular, this is true for ξ “ Pop1q. In the latter case we have ( min k´1 log pk “ logpp0 pk´1q “ ´k log kp1 ` Op1{kqq.

It follows from Theorem 5 that for ξ “L Pop1q,

log log n p K Ñ 1, n log n as n Ñ 8. This matches the result given by Theorem 8. However, we can be more precise by applying the following theorem with γ “ 1. Theorem 9. Assume Condition A. Also assume that

min logp1{pi q “ γplog iqpi ` Op1qq, where γ ą 0 is a constant. Define

m “ log n, m1 “ logpm{γq, m2 “ log m1.

Let kn Ñ 8 be a sequence of positive integers. Then for all constants δ ą 0, we have:

27 m m2 gw ` (i) If kn ď 1 ´ p1 ` δq , then whp T contains all trees in S γpm1´m2q m1pm1´m2q n ďkn as fringe subtrees.´ ¯

m m2 gw (ii) If kn ě 1 ´ p1 ´ δq , then whp T does not contain all trees γpm1´m2q m1pm1´m2q n in S` as fringe subtrees. ďkn ´ ¯ Thus, as n Ñ 8, log log n p 1 K Ñ , n log n γ and more precisely,

log n log n K γ log ´ log log 2 n γ γ plog log nq p ´ 1 ˆ Ñ ´ 1. » ” log n ı fi log log log n – fl min Proof. Let k “ kn. For (i), we can show that npk {k Ñ 8. It follows from Lemma 15 gw that whp Tn contains all trees of size at most k as fringe subtrees. min For (ii), similarly we can show that npk Ñ 0. It follows from Theorem 2 that whp ` gw there is at least one tree in Sďk that does not appear in Tn as fringe subtrees.

5 Non-fringe subtrees

In this section we prove Theorem 4, the concentration of non-fringe subtree counts in conditional Galton-Watson trees. Given a tree T , let vpT q be the number of its internal nodes and let `pT q be the number nf gw gw n n n n of its leaves. Recall that N T : gw T T , and that ξ : ξ , ξ ,..., ξ T p n q “ uPTn ă n,u “ p 1 2 n q J K gw is a uniform random rotation of the preorder degree sequence of Tn . To simplify the notation, write v :ř“ vpT q and ` :“ `pT q. Byr Lemmar 2,r T hasr a preorder degree sequence of the form

pa1, 0, a2, 0,..., a`, 0q :“

pa1,1, a1,2, . . . , a1,rp1q, 0, a2,1, a2,2, . . . , a2,rp2q, 0, . . . , a`,1, a2,2, . . . , a`,rp`q, 0q (5) for positive integers rp1q, rp2q, . . . , rp`q and that

` ` ` rpsq

rpsq “ v, as,t ą 0, as :“ as,t “ v ` ` ´ 1. (6) s“1 s“1 s“1 t“1 ÿ ÿ ÿ ÿ Therefore, if T ă T 1, then T 1 has a preorder degree sequence of the form

pa1, b1, a2, b2,..., a`, b`q (7) where b1,..., b` are preorder degree sequences of some plane trees. Thus each non- gw n fringe subtree of shape T in Tn corresponds to a segment of ξ of the form of (7). If

r 28 none of the segments overlap with each other, then we can permute them into the form n pa1,..., a`, b1,..., b`q. Since ξ is permutation invariant, the new sequence still has the n nf gw distribution of ξ . In other words, NT pTn q is almost distributed like the number of n the patterns pa1, a2,..., a`q inr ξ . The problem withr this argument is that non-fringe subtrees can overlap. But as shown later in this section, under the assumptionsr of Theorem 4, the effect of such overlaps is negligible. We will use Dn to denote the set of preorder degree sequences of trees with size n. Let Dn be the set of sequences that are cyclic rotations of sequences in Dn. Given d :“ pd1, . . . , dnq P Dn, let degipdq :“ pdi, di`1, . . . , di`k´1q such that degipdq P Dk for somerk ě 1, where the indices are all modulo n. Lemma 2 guarantees that such degipdq exists and is unambiguous.r Let Tipdq be the tree with the preorder degree sequence degipdq.

5.1 Factorial moments

Let pxqr :“ xpx ´ 1q ¨ ¨ ¨ px ´ r ` 1q. For a random variable X, EpXqr is called the r-th factorial moment of X. We give exact formulas for the first and second factorial nf gw moments of NT pTn q in this subsection. Lemma 19. Assume that P t|T gw| “ nu ą 0. Let T be a tree. We have

nf gw E N pT q P Sn v T “ n ´ vpT q ´ `pT q T n “ πnf pT q ´ p q . n P S n 1 “ ‰ t n “ ´ u (

n nf gw n Proof. Let v :“ vpT q and ` :“ `pT q. Let Ii “ T ă Tipξ q . Then NT pTn q “ i“1 Ii. By the permutation invariance of ξn, we have EJ N nf pT gwqK “ E r n I s “ nP tI “ 1u. T n i“1 i ř1 Recall that T has a preorder degree sequence of ther form pa1, 0,..., a`, 0q satisfying “ n ‰ ř (6). Let A Ď Dn be the set of sequencesr such that ξ P A if and only if I1 “ 1. In other words, d :“ pd1, d2, . . . , dnq P A if and only if deg1pdq “ pa1, b1,..., a`, b`q for some b1,..., b` rwhich are preorder degree sequences ofr trees. By permuting deg1pdq into 1 1 1 1 1 pa1, a2,..., a`, b1, b2,..., b`q, we get a new sequence d :“ pd1, d2, . . . , dnq P A where

1 A :“ pe1, e2, . . . , enq P Dn : pe1, e2, . . . , evq “ pa1, a2,..., a`q . ! ) Such a permutation defines a mappingr f : A Ñ A1. 1 1 1 For every d P A , condition (6) implies that in d after pa1,..., a`q, there are at least ` consecutive segments that are preorder degree sequences of trees, i.e., there is a unique d P A with fpdq “ d1. Thus f is a one-to-one mapping. If d1 “ fpdq, then P ξn “ d “ P ξn “ d1 , since ξn is permutation invariant. Therefore we have ! ) ! ) r r r n n 1 P tI1 “ 1u “ P ξ P A “ P ξ P A . ! ) ! ) r r

29 n Recall that by Lemma 3, ξ „ pξ1, ξ2, . . . , ξn|Sn “ n ´ 1q, where ξ1, ξ2, . . . , ξn are i.i.d. n copies of ξ and Sn “ s“1 ξs. We have n 1 r n n n P ξ P Ař “ P pξ1 , ξ2 ,..., ξv q “ pa1, a2,..., a`q

! ) P!tpξ1, ξ2, . . . , ξvq “ pa1, a2,..., a`q,S)n “ n ´ 1u r “ r r r P tSn “ n ´ 1u P tS “ n ´ v ´ `u “ P tT ă T gwu n´v , P tSn “ n ´ 1u ` where in the last step we use s“1 as “ v ` ` ´ 1. nf gw To compute EpNT pTn qq2,ř we enumerate all the cases that T can appear as overlap- ping non-fringe subtrees by constructing a set of trees tT ‘ T u as follows. For trees T , 1 S and node v P T , let T “ SplaypT, v, Sq denote tree T with subtree Tv replaced by S. 1 Thus Tv “ S . Let VpT q denote the internal nodes of T . Then define the collection

tT ‘ T u “ tSplaypT, v, T qu z tT u . » fi vPVpT q:T ă T ďv Note that |tT ‘ T u| ă vpT q. Also– note that given T 1 P tTfl‘ T u we can always find a unique node v P VpT q such that T 1 “ SplaypT, v, T q. See Figure 5 for an example of tT ‘ T u.

Figure 5: An example of tT ‘ T u.

Lemma 20. Assume that P t|T gw| “ nu ą 0. Let T be a tree. We have

P Sn 2v T “ n ` 1 ´ 2pvpT q ` `pT qq E pN nf pT gwqq “ npn ´ 2vpT q ` 1qπnf pT q2 ´ p q T n 2 P S n 1 t n “ ´ u ( 1 1 “ ‰ P Sn v T 1 “ n ´ vpT q ´ `pT q ` 2n πnf pT 1q ´ p q . P tSn “ n ´ 1u T 1PtT ‘T u ( ÿ

30 Proof. Let v “ vpT q and ` “ `pT q. Let Ii be defined as in the proof of Lemma 19. Since I1,...,In are indicator random variables and permutation invariant, we have

n nf gw E pNT pTn qq2 “ E rIiIjs “ n E rI1Iis . 1ďi‰jďn i“2 “ ‰ ÿ ÿ n n The event I1Ii “ 1 happens if and only if T ă T1pξ q and T ă Tipξ q both happen. n Thus instead of summing E rI1Iis over i, we can sum P ξ “ d over pairs pi, dq P r r t2, . . . , nu ˆ Dn that satisfy T ă T1pdq and T ă Tipdq, i.e., ! ) r 1 1 1 deg1pdq “ par1, b1, a2, b2,..., a`, b`q, and degipdq “ pa1, b1, a2, b2,..., a`, b`q,

1 1 where pa1, 0,..., a`, 0q is the preorder degree sequence of T and b1, b1,..., b`, b` are preorder degree sequences of trees. Let A be the set of such pairs. Then iě2 E rI1Iis “ P ξn d . pi,dqPA “ ř For 1 ď!r ď n, let) Irpdq be the set of positions in d that are occupied by degrpdq, i.e., ř r

Irpdq :“ tj mod n : r ď j ă r ` | degrpdq|u .

in Let I1 pdq Ď I1pdq be the set of positions in d that are occupied by the parts of deg1pdq out in in out that correspond to a1, . . . , a`. Let I1 pdq “ I1pdqzI1 pdq. Define Ii pdq and Ii pdq accordingly. Let A1 Ď A be the set of pi, dq in A such that

1 in in A “ pi, dq P A : I1 pdq X Ii pdq “ H .

Let A2 :“ AzA1. ( 2 in in If pi, dq P A , then I1 pdq X Ii pdq ‰ H. In other words, either Ti is fringe subtree of T1 and Ti is rooted at a node that corresponds to an internal node of T (regarding that 1 T1 is a non-fringe subtree of the shape T ), or vice versa. Thus there exists a T P tT ‘T u 1 1 such that either T ă T1pdq or T ă Tipdq. By symmetry, we have

n 1 n P ξ “ d “ 2 P T ă T1pξ q pi,dqPA2 T 1PtT ‘T u ÿ ! ) ÿ ! ) 1 1 r P Srn v T 1 “ n ´ vpT q ´ `pT q “ 2 πnf pT 1q ´ p q , (8) P tSn “ n ´ 1u T 1PtT ‘T u ( ÿ where the last step follows from Lemma 19. 1 Now consider pi, pd1, . . . , dnqq P A . Arrange pd1, . . . , dnq in a cycle. Paint the segment deg1ppd1, . . . , dnqq red and the segment degippd1, . . . , dnqq blue. One of the three cases must be true: (i) I1pdq X Iipdq “ H — The red segment and the blue segment do not overlap. (ii) Iipdq Ď I1pdq — The red segment contains the blue segment. (iii) I1pdq Ď Iipdq — The blue segment contains the red segment. (Since deg1pdq and degipdq are both preorder degree sequences, if I1pdq X Iipdq ‰ H then either (ii) or (iii)

31 Figure 6: Examples of three cases in A2.

must happen. And since i ‰ 1 we cannot have Iipdq “ I1pdq.) Figure 6 gives examples of the three cases. We permute pd1, . . . , dnq as follows. For (i) and (ii), we first permute the red segment from pa1, b1,..., a`, b`q to pa1,..., a`, b1,..., b`q. Then we permute the blue segment 1 1 1 1 of from pa1, b1,..., a`, b`q to pa1,..., a`, b1,..., b`q. It is clear this can be done in case (i). And it is not difficult to see that in case (ii) the positions that are occupied by the blue segment is completely contained by the positions that are occupied by b`1 for some 1 1 ď ` ď `. This means that Ti is a fringe subtree of T1 and the root of Ti does not correspond to an internal node of T (because T1 is a non-fringe subtree in the shape of T ). So the first step of the permutation moves the blue segment but does not change its contents and we can carry out the second step without problem. In case (iii), we reverse the order of the two steps. After this the starting position of the red segment may have changed. We rotate the new sequence such that the red segment still starts from position 1. 1 1 1 1 1 1 In the end, we get a new pair pi , pd1, . . . , dnqq such that pd1, d2, . . . , dvq “ pa1,..., a`q 1 1 1 and pdi1 , di1`1, . . . , di1`v´1q “ pa1,..., a`q. Let B Ď tv ` 1, . . . , nu ˆ Dn be the set of such pairs. The above permutation defines a mapping f : A1 Ñ B. Since given pi1, d1q P B, we can without ambiguity recover the red segment and blue segmentr of d1, the mapping is reversible, i.e., f is one-to-one. Since ξn is permutation invariant, if pi1, d1q “ fpi, dq, then P ξn “ d “ P ξn “ d1 . Therefore r ! ) ! ) r r P ξn “ d “ P ξn “ d . pi,dqPA1 pi,dqPB ÿ ! ) ÿ ! ) r r Given pi, pd1, . . . , dnqq P B, we can move the segment pdi, . . . , di`v´1q to the position 1 1 v ` 1 to get a new sequence pd1, . . . , dnq P C, where

C :“ pe1, . . . , enq P Dn : pe1, . . . , e2vq “ pa1,..., a`, a1,..., a`q . ! ) Since there are n ´ 2v ` 1 possibler values of i, this permutation gives us a pn ´ 2v ` 1q- to-one mapping h : B Ñ C and if d1 “ hpi, dq, then P ξn “ d “ P ξn “ d1 . ! ) ! ) r r 32 We obtain as usual

n n n P ξ “ d “ P pξ1 ,..., ξ2vq “ pa1,..., a`, a1,..., a`q dPC ÿ ! ) ! ) r “ P tpξr1, . . . , ξr2vq “ pa1,..., a`, a1,..., a`q | Sn “ n ´ 1u

“ P tpξ1, . . . , ξ2vq “ pa1,..., a`, a1,..., a`qu P tS “ pn ´ 1q ´ 2pv ` ` ´ 1qu ˆ n´2v P tSn “ n ´ 1u P tS “ n ` 1 ´ 2pv ` `qu “ πnf pT q2 n´2v . P tSn “ n ´ 1u It follows that

P ξn “ d “ P ξn “ d “ pn ´ 2v ` 1q P ξn “ d pi,dqPA1 pi1,dqPB dPC ÿ ! ) ÿ ! ) ÿ ! ) r r P tS “ n ` 1 ´ 2rpv ` `qu “ pn ´ 2v ` 1qπnf pT q2 n´2v . (9) P tSn “ n ´ 1u The lemma follows by combining (8) and (9) with the following:

n nf gw n EpNT pTn qq2 “ n E rI1Iis “ n P ξ “ d i“2 pi,dqPA ÿ ÿ ! ) “ n P ξn “ d ` n r P ξn “ d . pi,dqPA1 pi,dqPA2 ÿ ! ) ÿ ! ) r r 5.2 Sequence of non-fringe subtrees

Let Tn be a sequence of trees. Let vn :“ vpTnq and `n :“ `pTnq. In this subsection we nf gw prove Theorem 4, the concentration of NTn pTn q. First we narrow down the scopes of vn and `n.

Lemma 21. Assume Condition A. There exists a constant C1 ą 0 such that

sup nπnf pT q Ñ 0. T :vpT qąC1 log n

nf gw p Thus if vn ą C1 log n, then NTn pTn q Ñ 0 as n Ñ 8.

nf gw p sup NT pTn q Ñ 0. T :vpT qąC1 log n gw The whole Tn is of size n and is a non-fringe subtree itself.

33 gw vpT q Proof. Recall that pmax :“ maxiě0 pi. We have P tT ă T u ď pmax. Since pmax ă 1, if C1 “ 2{ logp1{pmaxq, then

nf C1 log n ´2 sup nπ pT q ď npmax “ nn Ñ 0. T :vpT qąC1 log n

By Lemma 19, we have

nf gw gw P tSn´vn “ n ´ vn ´ `nu vn ´1 E NTn pTn q “ nP tTn ă T u ď npmaxP tSn “ n ´ 1u . P tSn “ n ´ 1u “ ‰ It follows from the well-known local limit theorem (see, e.g., Kolchin [23][thm. 1.4.2]) ´1{2 that P tSn “ n ´ 1u “ Θpn q. Therefore if vn ą C1 log n, the above expression is at most vn 1{2 ´2 1{2 npmaxO n ď nn Opn q Ñ 0.

Lemma 22. Assume Condition A` and˘ that vn “ Oplog nq. There exists a sequence 1{2 nf nf gw p bn “ opn log nq such that if `n ą bn, then nπ pTnq Ñ 0 and NTn pTn q Ñ 0 as n Ñ 8. Proof. Janson [21, thm. 19.7] showed that if Var pξq ă 8 then there exists a sequence 1 1{2 gw 1 1 bn “ opn q such that the maximum degree of Tn is less than bn whp. Let bn “ vnbn “ 1{2 1 opn log nq. If `n ě bn, then Tn must contain at least one node with degree at least bn. The probability of this event goes to zero.

nf gw nf Lemma 23. Assume Condition A. If |Tn| “ opnq, then E NTn pTn q {pnπ pTnqq Ñ 1 as n Ñ 8. “ ‰

Proof. |Tn| “ opnq implies that vn “ opnq and `n “ opnq. Therefore it follows Lemma 6 and 19 that E N nf pT gwq P tS “ n ´ v ´ ` u Tn n “ n´vn n n Ñ 1. nπnf T P S n 1 “ p nq ‰ t n “ ´ u nf nf gw nf 2 Lemma 24. Assume Condition A. If nπ pTnq Ñ 8, then EpNTn pTn qq2{pnπ pTnqq Ñ 1 as n Ñ 8.

Proof. Let v “ vn, ` “ `n and T “ Tn. Let C1 and bn be defined as in Lemma 21 and 22 respectively. We can assume that v ď C1 log n “ opnq and ` ď bn “ opnq. Otherwise we cannot have nπnf pT q Ñ 8 by these two lemmas. If T 1 P tT ‘T u, then vpT 1q ă 2v “ opnq and `pT 1q ă 2` “ opnq. Therefore, it follows from Lemma 6 and 20 that

nf gw nf 2 P tSn´2v “ n ` 1 ´ 2pv ` `qu EpNT pTn qq2 “ npn ´ 2v ` 1qπ pT q P tSn “ n ´ 1u 1 1 P Sn v T 1 “ n ´ vpT q ´ `pT q ` 2n πnf pT 1q ´ p q P tSn “ n ´ 1u T 1PtT ‘T u ( ÿ “ p1 ` op1qqpnπnf pT qq2 ` Opnq πnf pT 1q. T 1PtT ‘T u ÿ

34 nf 1 nf 2 Thus it suffices to show that n T 1PtT ‘T u π pT q “ opnπ pT qq . Consider the superset A of tT ‘ T u that contains trees which can be obtained by replacing a proper non-leaf subtreeř of T with another copy of T . (We do not restrict where this replacement can happen as in the definition of tT ‘T u.) Note that |A| “ v´1, since T has v internal nodes and one of them is the root. If T 1 P A, then T 1 contains T as a fringe subtree. Thus πnf pT 1q ď πnf pT q. In the case that v is bounded, we have n P tT 1 ă T gwu ď nvπnf pT q “ Opnπnf pT qq “ opnπnf pT qq2. T 1PA ÿ Thus we can assume that v Ñ 8. For T 1 P A, if T 1 has at least 3v{2 internal nodes, call T 1 big, otherwise call it small. Let Abig and Asmall be the sets of big and small trees in A respectively. 1 1 If T P Abig, then besides internal nodes that correspond to internal nodes of T , T 1 gw nf v{2 contains at least v{2 extra internal nodes. So we have P tT ă T u ď π pT qpmax. Since v{2 v Ñ 8 and pmax ă 1, vpmax “ op1q. Using that |A| ă v, we have

1 gw nf v{2 nf n P tT ă T u ď nvπ pT qpmax “ opnπ pT qq. T 1PA ÿbig Let Ti,j be a fringe subtree in T whose root is at depth i and is the j-th node of this 1 level. If replacing Ti,j with a copy of T makes a new tree Ti that has strictly less than 3v{2 internal nodes, then Ti,j must contain more than v{2 internal nodes. Therefore, for 1 each i, there is at most one possible such j. For an example of Ti , see Figure 7.

1 Figure 7: An example of T1 for T with 7 internal nodes.

As T has v internal nodes, there are at most v ´ 1 possible i that can make Ti,j a 1 proper and non-leaf subtree. Since Ti has at least i internal nodes besides these in the nf 1 nf i copy of T that replaced Ti,j, we have π pTi q ď π pT qpmax. In summary, we have v v nf 1 nf 1 nf i nf n π pT q ď n π pTi q ď n π pT qpmax ď Opnπ pT qq. T 1PA i“1 i“1 ÿsmall ÿ ÿ Therefore, n P tT 1 ă T gwu ď n P tT 1 ă T gwu ` n P tT 1 ă T gwu T 1PtT ‘T u T 1PA T 1PA ÿ ÿbig ÿsmall “ opnπnf pT qq ` Opnπnf pT qq “ opnπnf pT qq2.

35 The method used in proving Lemma 20 and 24 can be adapted to show the following lemma for higher factorial moments. We leave the proof for interested readers.

nf Lemma 25. Assume Condition A. If nπ pTnq Ñ 8, then for all fixed r ě 2,

nf gw EpNTn pTn qqr nf r Ñ 1, pnπ pTnqq as n Ñ 8.

nf The condition that nπ pTnq Ñ 8 is necessary, as shown by the following lemma.

Lemma 26. Assume Condition A. Let Lhpnq be a chain (complete 1-ary tree) of height hpnq. Let X :“ N nf pT gwq. If nP L ă T gw Ñ µ P p0, 8q as n Ñ 8, then n Lhpnq n hpnq

( 2 1 ` p1 3 3p1 ` 2p1 ` 1 EXn Ñ µ, Var pXnq Ñ µ , E pXn ´ EXnq Ñ µ 2 . 1 ´ p1 p1 ´ p1q “ ‰ As a result, lim infnÑ8 dTV pXn, Popµqq ą 0.

gw h Proof. Let h “ hpnq. Since nP tLh ă T u “ np1 Ñ µ P p0, 8q, we have h “ log1{p1 n ` Op1q. Lh has h internal nodes and one leaf. Thus it follows from Lemma 19 and 6 that

nf P tSn´h “ n ´ h ´ 1u EXn “ nπ pLhq Ñ µ. P tSn “ n ´ 1u

Since tLh ‘ Lhu “ tLh`i : 1 ď i ď h ´ 1u, by Lemma 6,

1 1 nf 1 P Sn´vpT 1q “ n ´ vpT q ´ `pT q ζ1 :“ 2n π pT q 1 P tSn “ n ´ 1u T PtLh‘Lhu ( ÿ h´1 P tS “ n ´ h ´ i ´ 1u “ 2n πnf pL q n´h´i h`i P tS “ n ´ 1u i“1 n ÿ h´1 h´1 p “ p1 ` op1qq2n ph`i “ p1 ` op1qq2nph pi Ñ 2µ 1 . 1 1 1 1 ´ p i“1 i“1 1 ÿ ÿ We also have by Lemma 6,

nf 2 P tSn´2h “ n ` 1 ´ 2ph ` 1qu 2 ζ2 :“ npn ´ 2hqπ pLhq Ñ µ . P tSn “ n ´ 1u Therefore, it follows from Lemma 20 that

p1 2 EpXnq2 “ ζ1 ` ζ2 Ñ 2µ ` µ . 1 ´ p1 Thus 2 1 ` p1 Var pXnq “ E rpXnq2s ` E rXns ´ E rXns Ñ µ . 1 ´ p1

36 So we have Var pX q 1 ` p n Ñ 1 ą 1. E rXns 1 ´ p1

With an argument similar to Lemma 20, we can compute E rpXnq3s, which yields 2 3 3p1 ` 2p1 ` 1 E pXn ´ EXnq Ñ µ 2 . p1 ´ p1q Since “ ‰ 3 3 3 E |Xn ´ EXn| “ 2E |Xn ´ EXn| ˆ Xn ă EXn ` E pXn ´ EXnq 3 J 3 K “ ‰ ď 2pE“Xnq ` E pXn ´ EXnq , ‰ “ ‰ the above limit implies that “ ‰ 3 C :“ lim sup E |Xn ´ EXn| ă 8. nÑ8 Now we can finish by following the method“ of Barbour‰ et al. [4, thm. 3B]. Let L Zn “ PopEXnq be a coupling of Xn that minimizes P tZn ‰ Xnu. Therefore we have dTV pXn, PopEXnqq “ P tZn ‰ Xnu. Thus 2 2 Var pXnq ´ E rXns “ E pXn ´ EXnq ´ E pZn ´ EXnq 2 2 “ E “rpXn ´ EXnq‰ ´ pZ“n ´ EXnq s ˆ‰ Xn ‰ Zn 2 J K ď E “pXn ´ EXnq ˆ Xn ‰ Zn ‰ 1{3 J K 3 2{3 ď P “pXn ‰ Znq pE |Xn ´ EXn‰| q , where in the last step we use H¨older’s inequality“ [19, pp. 129].‰ So Var pX q ´ EX 3 d pX , PopEX qq “ P tX ‰ Z u ě n n . TV n n n n Ep|X ´ EX |3qq2{3 ˆ n n ˙ Therefore 3 1 1 ` p1 lim inf dTV pXn, PopEXnqq ě ą 0. nÑ8 C2 1 ´ p ˆ 1 ˙ Since EXn Ñ µ, we also have lim infnÑ8 dTV pXn, Popµqq ą 0.

Proof of Theorem 4. (i): By Lemma 21 we can assume vn ď C1 log n for n large. Oth- nf gw nf gw erwise we have P NTn pTn q ě 1 ď ENTn pTn q Ñ 0 along the subsequence that vn ą C1 log n. Similarly we can assume that `n “ opnq by Lemma 22. nf gw ( nf nf nf gw p Then by lemma 23, ENTn pTn q „ nπ pTnq. Thus nπ pTnq Ñ 0 implies NTn pTn q Ñ 0. (ii): We can assume that vn ď C1 log n and `n “ opnq by Lemma 21 and 22. Otherwise nf nf gw nf we cannot have nπ pTnq Ñ 8. Therefore, by Lemma 23, ENTn pTn q „ nπ pTnq Ñ 8. nf gw nf 2 By Lemma 24, we have EpNTn pTn qq2 “ p1 ` op1qqpnπ pTnqq . Therefore, nf gw nf gw nf gw nf gw 2 Var NTn pTn q “ E pNTn pTn qq2 ` E NTn pTn q ´ pE NTn pTn q q nf 2 nf nf 2 ` ˘ “ p1“` op1qqpnπ p‰Tnqq “` p1 ` op1qqp‰ nπ “pTnqq ´ p1‰` op1qqpnπ pTnqq nf 2 “ opnπ pTnqq .

nf gw nf p Thus NTn pTn q{pnπ pTnqq Ñ 1.

37 5.3 Complete r-ary non-fringe subtrees

gw Theorem 4 allows us to find the maximal complete r-ary non-fringe subtree in Tn . We omit the proofs of the following results due to their similarities to Lemma 13 and 14:

Lemma 27. Assume Condition A and that pr ą 0 for some r ě 2. Let Hn,r be the gw height of the maximal complete r-ary non-fringe subtree in Tn . Then as n Ñ 8, s H p n,r Ñ 1. logrplog nq s Lemma 28. Assume Condition A and that p1 ą 0. Let Hn,1 be the height of the maximal gw chain (complete 1-ary) non-fringe subtree in Tn . Then as n Ñ 8, s H p n,1 Ñ 1. log1{p1 n s gw Example (The binary tree). Recall that when p0 “ p2 “ 1{4 and p1 “ 1{2, Tn is equivalent to a uniform random binary tree of size n. It follows from Lemma 28 that p Hn,1{ log2 n Ñ 1. This result was previously proved by Devroye et al. [10].

6s Open questions

nf gw nf gw nf p Theorem 4 shows that if nπ pTnq :“ nP tTn ă T u Ñ 8, then NTn pTn q{nπ pTnq Ñ 1. nf gw We believe that with an extra assumption that vpTnq Ñ 8, it is also true that pNTn pTn q´ nf nf nπ pTnqq{ nπ pTnq converges in distribution to a normal distribution. (Janson [22, thm. 1.9] has shown that this is indeed the case when vpnq is bounded.) Theorema 3 generalizes Theorem 2 by considering the number of fringe subtrees whose shapes belong to a set of trees Skn instead of being a single tree Tn. It may be possible to generalize Theorem 4 in similar way, i.e., we consider the non-fringe subtrees whose shapes belong to a set of trees Skn instead of being a single tree Tn. Another problem may be of interest is to get a non-fringe version of Theorem 5, i.e., what are the sufficient conditions for all (or not all) trees of size at most k to appear in gw Tn as non-fringe subtrees. Let T be a tree and v be a node of T . Recall that Tv denotes the fringe subtree rooted at v. If by removing some or none the subtrees of Tv, we can make it isomorphic to another tree T 1, then we say that T contains an embedded subtree of the shape T 1 at v. A more challenging open question is to determine the size of the maximum complete r-ary embedded subtree in large conditional Galton-Watson trees.

References

[1] D. Aldous. Asymptotic fringe distributions for general families of random trees. The Annals of Applied Probability, 1(2):228–266, 1991.

38 [2] N. Alon and J. H. Spencer. The Probabilistic Method. John Wiley & Sons, Hoboken, NJ, USA, 3rd edition, 2008.

[3] R. Arratia, L. Goldstein, and L. Gordon. Two moments suffice for Poisson approx- imations: The Chen-Stein method. The Annals of Probability, 17(1):9–25, 1989.

[4] A. D. Barbour, L. Holst, and S. Janson. Poisson Approximation. Oxford University Press, New York, NY, USA, 1993.

[5] S. Chatterjee, P. Diaconis, and E. Meckes. Exchangeable pairs and Poisson approx- imation. Probability Surveys, 2:64–106, 2005.

[6] L. H. Y. Chen. Poisson approximation for dependent trials. The Annals of Proba- bility, 3(3):534–545, 1975.

[7] F. Chyzak, M. Drmota, T. Klausner, and G. Kok. The distribution of patterns in random trees. Combinatorics, Probability and Computing, 17(01):21–59, 2008.

[8] L. Devroye. Limit laws for local counters in random binary search trees. Random Structures and Algorithms, 2(3):303–315, 1991.

[9] L. Devroye. Limit laws for sums of functions of subtrees of random binary search trees. SIAM Journal on Computing, 32(1):152–171, 2002.

[10] L. Devroye, P. Flajolet, F. Hurtado, M. Noy, and W. Steiger. Properties of random triangulations and trees. Discrete and Computational Geometry, 22(1):105–117, 1999.

[11] M. Drmota. Random Trees. Springer, Vienna, Austria, 2009.

[12] M. Dwass. The total progeny in a branching process and a related random walk. Journal of Applied Probability, 6(3):682–686, 1969.

[13] P. Erd˝osand A. R´enyi. On a classical problem of probability theory. Magyar Tudom´anyosAkad´emiaMatematikai Kutat´oInt´ezet´enekK¨ozlem´enyei, 6:215–220, 1961.

[14] Q. Feng, H. M. Mahmoud, and A. Panholzer. Phase changes in subtree varieties in random recursive and binary search trees. SIAM Journal on Discrete Mathematics, 22(1):160–184, 2008.

[15] P. Flajolet and R. Sedgewick. Analytic Combinatorics. Cambridge University Press, Cambridge, UK, 2009.

[16] P. Flajolet, D. Gardy, and L. Thimonier. Birthday paradox, coupon collectors, caching algorithms and self-organizing search. Discrete Applied Mathematics, 39 (3):207–229, 1992.

39 [17] P. Flajolet, X. Gourdon, and C. Mart´ınez.Patterns in random binary search trees. Random Structures and Algorithms, 11(3):223–244, 1997.

[18] M. Fuchs. Subtree sizes in recursive trees and binary search trees: Berry–Esseen bounds and Poisson approximations. Combinatorics, Probability and Computing, 17(5):661–680, 2008.

[19] A. Gut. Probability: A Graduate Course. Springer, New York, NY, USA, 2013.

[20] C. Holmgren and S. Janson. Limit laws for functions of fringe trees for binary search trees and recursive trees. ArXiv e-prints, June 2014.

[21] S. Janson. Simply generated trees, conditioned Galton-Watson trees, random allo- cations and condensation. Probability Surveys, 9:103–252, 2012.

[22] S. Janson. Asymptotic normality of fringe subtrees and additive functionals in conditioned galton–watson trees. Random Structures and Algorithms, 48(1):57–101, 2016.

[23] V. F. Kolchin. Random Mappings. Translations Series in Mathematics and Engi- neering. Optimization Software, New York, NY, USA, 1986.

[24] P. Neal. The generalised coupon collector problem. Journal of Applied Probability, 45(3):621–629, 2008.

[25] R. Otter. The number of trees. Annals of Mathematics, 49(2):583–599, 1948.

[26] J. Pitman. Enumerations of trees and forests related to branching processes and random walks. Microsurveys in Discrete Probability, 41:163–180, 1998.

[27] B. Roos. Improvements in the Poisson approximation of mixed Poisson distribu- tions. Journal of Statistical Planning and Inference, 113(2):467–483, 2003.

[28] N. Ross. Fundamentals of Stein’s method. Probability Surveys, 8:210–293, 2011.

[29] C. Stein. A bound for the error in the normal approximation to the distribution of a sum of dependent random variables. In Proceedings of the 6th Berkeley Symposium on Mathematical Statistics and Probability, Volume 2: Probability Theory, 583–602, Berkeley, CA, USA, 1972. University of California Press.

40