Graduate Theses and Dissertations
Iowa State University Capstones, Theses and Dissertations

2011

Coalescence in Bellman-Harris and multi-type branching processes

Jyy-i Joy Hong
Iowa State University


Recommended Citation: Hong, Jyy-i Joy, "Coalescence in Bellman-Harris and multi-type branching processes" (2011). Graduate Theses and Dissertations. 10103. https://lib.dr.iastate.edu/etd/10103

Coalescence in Bellman-Harris and multi-type branching processes

by

Jyy-I Hong

A dissertation submitted to the graduate faculty

in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Major: Mathematics

Program of Study Committee:

Krishna B. Athreya, Major Professor

Clifford Bergman

Dan Nordman

Ananda Weerasinghe

Paul E. Sacks

Iowa State University

Ames, Iowa

2011

Copyright © Jyy-I Hong, 2011. All rights reserved.

DEDICATION

I would like to dedicate this thesis to my parents Wan-Fu Hong and Wen-Hsiang Tseng for their unconditional love and support. Without them, the completion of this work would not have been possible.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS ...... vii

ABSTRACT ...... viii

CHAPTER 1. PRELIMINARIES ...... 1

1.1 Introduction ...... 1

1.2 Discrete-time Single-type Galton-Watson Branching Processes ...... 3

1.2.1 Definitions and Notations ...... 3

1.2.2 Limit Theorems ...... 4

1.3 Discrete-time Multi-type Galton-Watson Branching Processes ...... 7

1.3.1 Definitions, Assumptions and Notations ...... 7

1.3.2 Limit Theorems ...... 10

1.4 Continuous-time Single-type Age-dependent Bellman-Harris Branching Processes . . 13

1.4.1 Definitions, Assumptions and Notations ...... 13

1.4.2 Limit Theorems ...... 14

1.4.3 Age Distribution in Bellman-Harris Processes ...... 16

1.5 Continuous-time Multi-type Age-dependent Bellman-Harris Branching Processes . . . 18

1.5.1 Definitions, Assumptions and Notations ...... 18

1.5.2 Limit Theorems ...... 19

1.5.3 Age Distribution in Multi-type Age-dependent Branching Processes ...... 19

CHAPTER 2. REVIEW OF THE COALESCENCE IN DISCRETE-TIME SINGLE-TYPE GALTON-WATSON BRANCHING PROCESSES ...... 21

2.1 Introduction ...... 21

2.2 The Supercritical Case ...... 23

2.3 The Critical Case ...... 23

2.4 The Subcritical Case ...... 24

2.5 The Explosive Case ...... 25

CHAPTER 3. COALESCENCE IN DISCRETE-TIME MULTI-TYPE GALTON-WATSON BRANCHING PROCESSES ...... 27

3.1 Introduction ...... 27

3.2 Results in The Supercritical Case ...... 28

3.2.1 The Statements of Results ...... 29

3.2.2 The Proof of Theorem 3.1 ...... 30

3.2.3 The Proof of Theorem 3.2 ...... 32

3.2.4 The Proof of Theorem 3.3 ...... 33

3.3 Results in The Critical Case ...... 35

3.3.1 The Statements of Results ...... 35

3.3.2 The Proof of Theorem 3.5 ...... 37

3.3.3 The Proof of Theorem 3.6 ...... 41

3.3.4 The Proof of Theorem 3.7 ...... 43

3.4 Results in The Subcritical Case ...... 46

3.4.1 The Statements of Results ...... 46

3.4.2 The Proof of Theorem 3.8 ...... 48

3.4.3 The Proof of Theorem 3.9 ...... 50

3.4.4 The Proof of Theorem 3.10 ...... 51

3.5 The Markov Property on Types ...... 54

3.5.1 The Statements of Results ...... 54

3.5.2 The Proof of Theorem 3.11 ...... 56

3.5.3 The Proof of Theorem 3.12 ...... 59

CHAPTER 4. COALESCENCE IN CONTINUOUS-TIME SINGLE-TYPE AGE-DEPENDENT BELLMAN-HARRIS BRANCHING PROCESSES ...... 66

4.1 Introduction ...... 66

4.2 Results in The Supercritical Case ...... 67

4.2.1 The Statements of Results ...... 67

4.2.2 The Proof of Theorem 4.1 ...... 69

4.2.3 The Proof of Theorem 4.2 ...... 73

4.2.4 The Proof of Theorem 4.3 ...... 75

4.2.5 The Proof of Theorem 4.4 ...... 85

4.3 Results in The Subcritical Case ...... 86

4.3.1 The Statements of Results ...... 86

4.3.2 The Proof of Theorem 4.5 ...... 87

4.3.3 The Proof of Theorem 4.6 ...... 96

CHAPTER 5. COALESCENCE IN CONTINUOUS-TIME MULTI-TYPE AGE-DEPENDENT BELLMAN-HARRIS BRANCHING PROCESSES ...... 101

5.1 Introduction ...... 101

5.2 Results in The Supercritical Case ...... 102

5.2.1 The Statements of Results ...... 102

5.2.2 The Proof of Theorem 5.1 ...... 104

5.2.3 The Proof of Theorem 5.2 ...... 114

5.2.4 The Proof of Theorem 5.3 ...... 119

5.3 The Generation Problem in The Supercritical Case ...... 122

5.3.1 The Statement of Result ...... 123

5.3.2 The Proof of Theorem 5.4 ...... 123

CHAPTER 6. APPLICATION TO BRANCHING RANDOM WALKS ...... 128

6.1 Introduction ...... 128

6.2 Review of Results in The Supercritical Case ...... 129

6.3 Results in The Explosive Case ...... 130

6.3.1 The Statements of Theorems in The Explosive Case ...... 131

6.3.2 The Proof of Theorem 6.3 ...... 132

6.3.3 The Proof of Theorem 6.4 ...... 137

CHAPTER 7. OPEN PROBLEMS ...... 139

7.1 Problems in Discrete-time Multi-type Galton-Watson Branching Processes ...... 139

7.2 Problems in Continuous-time Single-type Bellman-Harris Branching Processes . . . . 139

7.3 Problems in Continuous-time Multi-type Bellman-Harris Branching Processes . . . . . 140

BIBLIOGRAPHY ...... 141

ACKNOWLEDGEMENTS

First of all, I would like to take this opportunity to express my appreciation to my advisor Dr. Krishna B. Athreya for his excellent teaching and expert guidance. From the knowledge of mathematics to advice on study, research and even life, I always felt comfortable and encouraged to discuss with him and learn from him. Secondly, I would like to thank the members of my committee: Dr. Ananda Weerasinghe, Dr. Paul E. Sacks, Dr. Clifford Bergman and Dr. Dan Nordman, for their precious advice and suggestions. Also, my thanks go to Dr. Jerold Mathews for the wonderful reading time he spent in improving my English and to Dr. Leslie Hogben for her help in many ways.

Furthermore, I would like to give my thanks to Dr. Jhishen Tsay (National Sun Yat-sen University, Taiwan) and his wife Chun-Tor Lee for their continuous encouragement when I was struggling and depressed. In addition, I would like to thank all the friends from LIFE at Campus Baptist Church for everything they have done for me.

Finally, I would like to give my deepest appreciation to my mother, my sister, Mou-Mou, my brother, Shih-Hsun, and my brother-in-law, Cohuahua, for their love, care, understanding and patience. With their endless support in my life, I was able to pursue my dreams. I also give my special thanks to my husband, Chao-Chun, for his love and company during these years; it means a lot to me.

ABSTRACT

For branching processes, there are many well-known limit theorems regarding the evolution of the population forward in time. In this dissertation, we investigate the other direction of the evolution, that is, the past of the processes. We pick some individuals at random by simple random sampling without replacement and trace their lines of descent backward in time until they meet. We study the coalescence problem for the discrete-time multi-type Galton-Watson branching process and for both the continuous-time single-type and multi-type Bellman-Harris branching processes, including the generation number, the death time (in the continuous-time processes) and the type (in the multi-type processes) of the last common ancestor (also called the most recent common ancestor) of the randomly chosen individuals in the different cases (supercritical, critical, subcritical and explosive).

CHAPTER 1. PRELIMINARIES

1.1 Introduction

The study of branching processes has a long history and was essentially motivated by the observation of the extinction of certain family lines of the European aristocracy in contrast to the rapid exponential growth of the whole population. Francis Galton formulated this extinction problem and originally posed it in the Educational Times in 1874, and the Reverend Henry William Watson replied with a solution (see Harris (1963)). Seneta and Heyde (1977) have pointed out that the French mathematician Bienaymé had formulated essentially the same model fifty years earlier.

The model of Galton and Watson (called the Galton-Watson branching process) appeared to have been neglected for many years after its creation. After 1940, interest in this model increased, partly because of the analogy between the growth of families and nuclear chain reactions and also partly because of the increased general interest in applications of probability theory. Since then, branching processes have been regarded as appropriate probability models for the description of the behavior of systems whose components (cells, particles, individuals in general) reproduce, are transformed, or die

(see Harris [20], Athreya and Ney [5], Jagers [22], Mode [28] and Sevastyanov [35]). Nowadays, this theory is an area of active and interesting research.

There are many generalizations of the single-type Galton-Watson branching process in discrete time. Of these, the multi-type branching process model in discrete time is a natural one. The multi-type branching process is important because it is constructed in a way that closely matches real-life situations and hence can be used to study a wide variety of real-life problems, including those related to differences in types of ethnicities, types of genes, types of cosmic rays, etc. Another generalization is the continuous-time single-type case known as the Bellman-Harris branching process, which is widely used in many fields. This device was suggested by Scott and Uhlenbeck (1942) in their treatment of cosmic rays, where the continuous variable is energy, and was used by Bartlett (1946) and Leslie (1948) in dealing with human populations, where the continuous variable is age.

In the rest of this chapter, we review basic definitions and results for single-type (Section 1.2) and multi-type (Section 1.3) discrete-time Galton-Watson branching processes and single-type (Section 1.4) and multi-type (Section 1.5) continuous-time Bellman-Harris age-dependent branching processes. We discuss results on the extinction probabilities, the growth rates of the population and some other convergence properties. The results are fundamental and may be found in the books on branching processes mentioned earlier. Here, we state the results which are needed in this thesis based on the books Branching Processes by Athreya and Ney [5] and The Theory of Branching Processes by T. E. Harris [20].

In Chapter 2, we state the problem of coalescence in branching processes and review the results for all cases (supercritical, critical, subcritical and explosive) of the discrete-time single-type Galton-Watson branching processes.

In Chapter 3, we extend the results on the problem of coalescence to the discrete-time multi-type Galton-Watson branching process, including the supercritical (Section 3.2), critical (Section 3.3) and subcritical (Section 3.4) cases. Also, we present the Markov property on types (Section 3.5) along the line of descent of an individual randomly chosen from the current generation by simple random sampling.

In Chapter 4, we consider the continuous-time single-type Bellman-Harris branching processes and give proofs for the problem of coalescence in the supercritical (Section 4.2) and subcritical (Section 4.3) cases. In Chapter 5, we extend these results to the continuous-time multi-type Bellman-Harris branching processes. Using the results on the problem of coalescence, we are also able to investigate branching random walks (Chapter 6).

Although the research on branching processes has a long history, the study of the problem of coalescence is still in its infancy. In Chapter 7, we state some interesting open questions related to this topic.

1.2 Discrete-time Single-type Galton-Watson Branching Processes

1.2.1 Definitions and Notations

A discrete-time single-type Galton-Watson branching process is the simplest type of branching process. This process can be thought of as a population evolving in time. It starts at time 0 with $Z_0$ individuals, each of which lives a unit of time and produces its offspring upon death according to the offspring distribution $\{p_j\}_{j \ge 0}$, independently of the others. Let $Z_1$ be the total number of children produced by the $Z_0$ individuals, that is,
$$Z_1 = \sum_{i=1}^{Z_0} \xi_{0,i},$$
where $\{\xi_{0,i}\}_{i \ge 1}$ are i.i.d. random variables with the probability distribution $\{p_j\}_{j \ge 0}$. These individuals constitute the first generation, and they in turn go on to produce the second generation of population $Z_2$, and so on. So the total size of the population in the $(n+1)$st generation, $n = 0, 1, 2, \dots$, is given by
$$Z_{n+1} = \begin{cases} \displaystyle\sum_{i=1}^{Z_n} \xi_{n,i} & \text{if } Z_n > 0, \\ 0 & \text{if } Z_n = 0, \end{cases}$$
where $\{\xi_{n,i} : i \ge 1, n \ge 0\}$ are i.i.d. random variables with the probability distribution $\{p_j\}_{j \ge 0}$.

Then $\{Z_n\}_{n \ge 0}$ is called a Galton-Watson branching process with initial population $Z_0$ and offspring distribution $\{p_j\}_{j \ge 0}$. Here, $\xi_{n,i}$ is the number of offspring of the $i$th individual of the $n$th generation. Let
$$m \equiv \sum_{j=1}^{\infty} j p_j$$
be the mean of the offspring distribution $\{p_j\}_{j \ge 0}$. We shall refer to the Galton-Watson process as subcritical, critical, supercritical or explosive according as $0 < m < 1$, $m = 1$, $1 < m < \infty$ or $m = \infty$, respectively.

Moreover, if $T$ denotes the full family tree generated in this way, every individual in $T$ can be identified by a finite string $(i_0, i_1, \dots, i_n)$, meaning that this individual is in the $n$th generation and is the $i_n$th child of the individual $(i_0, i_1, \dots, i_{n-1})$ of the $(n-1)$st generation.
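As an illustration (ours, not from the thesis), the recursion defining $Z_{n+1}$ can be simulated directly. All function names below are our own, and the offspring distribution in the example ($p_0 = 1/4$, $p_1 = 1/4$, $p_2 = 1/2$, so $m = 5/4$) is an arbitrary choice:

```python
import random

def gw_generation(z, offspring_sampler, rng):
    # One step of the recursion: Z_{n+1} is a sum of Z_n i.i.d. offspring counts.
    return sum(offspring_sampler(rng) for _ in range(z))

def simulate_gw(n_generations, offspring_sampler, z0=1, seed=0):
    # Return the trajectory (Z_0, Z_1, ..., Z_n); 0 is absorbing (extinction).
    rng = random.Random(seed)
    z, path = z0, [z0]
    for _ in range(n_generations):
        z = gw_generation(z, offspring_sampler, rng)
        path.append(z)
        if z == 0:
            path.extend([0] * (n_generations + 1 - len(path)))
            break
    return path

def offspring(rng):
    # Sample from p_0 = 1/4, p_1 = 1/4, p_2 = 1/2 (mean m = 5/4, supercritical).
    u = rng.random()
    return 0 if u < 0.25 else (1 if u < 0.5 else 2)

print(simulate_gw(10, offspring))
```

Running it prints one trajectory $(Z_0, \dots, Z_{10})$; with the constant sampler `lambda rng: 1` the population stays at 1 forever, a quick sanity check.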

1.2.2 Limit Theorems

In this section, we collect some well-known results for discrete-time single-type Galton-Watson branching processes.

Theorem 1.1. (Supercritical Case) Let $p_0 = 0$ and $1 < m < \infty$. Then

(a) $P(Z_n \to \infty \mid Z_0 > 0) = 1$.

(b) (Harris, 1960) $\left\{ W_n \equiv \dfrac{Z_n}{m^n} : n \ge 0 \right\}$ is a nonnegative martingale and hence $\lim_{n\to\infty} W_n \equiv W$ exists w.p.1.

(c) (Kesten and Stigum, 1966)
$$\sum_{j=1}^{\infty} (j \log j) p_j < \infty \quad \text{if and only if} \quad E(W \mid Z_0 = 1) = 1,$$
and then $W$ has an absolutely continuous distribution on $(0, \infty)$ with a positive density.

(d) (Seneta and Heyde, 1970) There exist constants $C_n$ such that
$$\frac{C_{n+1}}{C_n} \to m \quad \text{and} \quad \frac{Z_n}{C_n} \to W \ \text{w.p.1},$$
and $P(0 < W < \infty) = 1$. In particular, $\sum_{j=1}^{\infty} (j \log j) p_j < \infty$ if and only if $C_n \sim m^n$.

(e) (Athreya and Schuh [4]) $E(W : W \le x) \equiv L(x)$ is slowly varying at $\infty$, i.e., for all $0 < c < \infty$, $\dfrac{L(cx)}{L(x)} \to 1$ as $x \to \infty$.

Under the assumption $p_0 = 0$, the population size $Z_n$ of a supercritical Galton-Watson branching process goes to infinity as $n \to \infty$ with probability 1, and it grows like $m^n$. This is the stochastic analogue of the so-called Malthusian law of geometric population growth.
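The martingale convergence in Theorem 1.1(b) can be watched numerically. In the sketch below (ours; the offspring law $p_1 = p_2 = 1/2$, so $m = 1.5$ and $p_0 = 0$, is an arbitrary choice), the successive values of $W_n = Z_n / m^n$ settle down as $n$ grows:

```python
import random

def wn_trajectory(n_generations, seed=42):
    # Track W_n = Z_n / m^n for the offspring law p_1 = p_2 = 1/2 (m = 1.5, p_0 = 0).
    rng = random.Random(seed)
    m, z, ws = 1.5, 1, []
    for gen in range(1, n_generations + 1):
        z = sum(1 + (rng.random() < 0.5) for _ in range(z))  # each parent leaves 1 or 2 children
        ws.append(z / m ** gen)
    return ws

ws = wn_trajectory(25)
# The fluctuations of W_n shrink as n grows; the limit W is strictly positive
# here because p_0 = 0, so the process never dies out.
print(ws[:3], ws[-3:])
```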

In the next two theorems, we present the results for the critical and subcritical cases.

Theorem 1.2. (Critical Case) Let $m = 1$, $p_j \ne 1$ for any $j \ge 1$ and $\sigma^2 \equiv \sum_{j=1}^{\infty} j^2 p_j - 1 < \infty$. Then

(a) $P(Z_n \to 0 \mid 0 < Z_0 < \infty) = 1$.

(b) (Kolmogorov, 1938)
$$n P(Z_n > 0) \to \frac{2}{\sigma^2} \quad \text{as } n \to \infty.$$

(c) (Yaglom, 1947)
$$P\left( \frac{Z_n}{n} > x \,\Big|\, Z_n > 0 \right) \to e^{-\frac{2x}{\sigma^2}}, \quad 0 < x < \infty.$$

(d) (Athreya [12]) For $1 \le k < n$, let
$$V_{n,k} \equiv \left\{ \frac{Z_{n-k,i}^{(k)}}{n-k}\, I_{\left(Z_{n-k,i}^{(k)} > 0\right)} : 1 \le i \le Z_k \right\}$$
on the event $\{Z_k > 0\}$, where $\{Z_{j,i}^{(k)} : j \ge 0\}$ is the Galton-Watson process initiated by the $i$th individual in the $k$th generation. Let $k \to \infty$, $n \to \infty$ such that $\frac{k}{n} \to u$, $0 < u < 1$. Then the sequence of point processes $\{V_{n,k}\}_{n \ge 1}$ conditioned on $\{Z_n \ge 1\}$ converges weakly to the point process
$$V \equiv \{\eta_j : j = 1, 2, \dots, N_u\},$$
where $\{\eta_j\}_{j \ge 1}$ are i.i.d. exp(1), $N_u$ is Geo($u$), i.e., $P(N_u = k) = (1-u)u^{k-1}$, $k \ge 1$, and $\{\eta_j\}_{j \ge 1}$ and $N_u$ are independent.

Theorem 1.3. (Subcritical Case) (Yaglom, 1947) Let $0 < m \equiv \sum_{j=1}^{\infty} j p_j < 1$. Then

(a) For $j \ge 1$, $\lim_{n\to\infty} P(Z_n = j \mid Z_n > 0) \equiv b_j$ exists, $\sum_{j=0}^{\infty} b_j = 1$, and $B(s) \equiv \sum_{j=0}^{\infty} b_j s^j$, $0 \le s \le 1$, is the unique solution of the functional equation

$$B(f(s)) = m B(s) + (1 - m), \quad 0 \le s \le 1,$$
where $f(s) \equiv \sum_{j=0}^{\infty} p_j s^j$, in the class of all probability generating functions vanishing at 0.

(b) $\sum_{j=1}^{\infty} j b_j < \infty$ if and only if $\sum_{j=1}^{\infty} (j \log j) p_j < \infty$.

(c) $\displaystyle\lim_{n\to\infty} \frac{P(Z_n > 0 \mid Z_0 = 1)}{m^n} = \frac{1}{\sum_{j=1}^{\infty} j b_j}$.

(d) If $Z_0$ is a random variable and $E Z_0 < \infty$, then
$$\lim_{n\to\infty} P(Z_n = j \mid Z_n > 0) = b_j, \quad \forall j \ge 1,$$
and if, in addition, $\sum_{j=1}^{\infty} (j \log j) p_j < \infty$, then
$$\sum_{j=1}^{\infty} j b_j < \infty \quad \text{and} \quad \lim_{n\to\infty} \frac{P(Z_n > 0)}{m^n} = \frac{E Z_0}{\sum_{j=1}^{\infty} j b_j}.$$

In both the critical and subcritical Galton-Watson branching processes, the population dies out eventually with probability 1. But, conditioned on the event of non-extinction, i.e., the set $\{Z_n > 0\}$, $Z_n$ goes to infinity in distribution with growth rate $n$ in the critical case, while $Z_n$ converges in distribution to a proper random variable in the subcritical case as $n \to \infty$.
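Kolmogorov's estimate in Theorem 1.2(b) can be checked exactly, without simulation, through the recursion $P(Z_{n+1} > 0) = 1 - f(1 - P(Z_n > 0))$, which follows from $f_{n+1} = f \circ f_n$. The sketch below (ours) uses the critical binary offspring law $p_0 = p_2 = 1/2$, for which $\sigma^2 = 1$:

```python
def f(s):
    # Offspring p.g.f. for p_0 = p_2 = 1/2: f(s) = (1 + s^2)/2, so m = 1 and sigma^2 = 1.
    return 0.5 * (1.0 + s * s)

q, qs = 1.0, []           # q_0 = P(Z_0 > 0) = 1
for n in range(1, 201):
    q = 1.0 - f(1.0 - q)  # q_{n+1} = 1 - f(1 - q_n)
    qs.append(q)

# Kolmogorov: n * P(Z_n > 0) -> 2 / sigma^2 = 2.
print(100 * qs[99], 200 * qs[199])   # both approach 2 from below
```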

We present the results of P. L. Davies and D. R. Grey for the explosive Galton-Watson branching process as follows.

Theorem 1.4. (Explosive Case) Let $p_0 = 0$, $m = \infty$ and, for some $0 < \alpha < 1$,
$$\frac{\sum_{j > x} p_j}{x^{-\alpha} L(x)} \to 1 \quad \text{as } x \to \infty,$$
where $L : (1, \infty) \to (0, \infty)$ is a function slowly varying at $\infty$. Then

(a) (Davies [16])
$$\alpha^n \log Z_n \to \eta \quad \text{w.p.1},$$
where $P(0 < \eta < \infty) = 1$ and $\eta$ has a continuous distribution.

(b) (Grey [19]) Let $\{Z_n^{(1)}\}_{n \ge 0}$ and $\{Z_n^{(2)}\}_{n \ge 0}$ be two i.i.d. copies of a Galton-Watson branching process with the offspring distribution $\{p_j\}_{j \ge 0}$ satisfying $p_0 = 0$, $m \equiv \sum_{j=1}^{\infty} j p_j = \infty$ and $Z_0^{(1)} = Z_0^{(2)} = 1$. Then, w.p.1,
$$\frac{Z_n^{(1)}}{Z_n^{(2)}} \to \begin{cases} 0, & \text{with prob. } \frac{1}{2}, \\ \infty, & \text{with prob. } \frac{1}{2}. \end{cases}$$

It is easy to deduce (b) from (a) in the above theorem, as
$$\alpha^n \left( \log Z_n^{(1)} - \log Z_n^{(2)} \right) \to \eta_1 - \eta_2 \equiv \eta, \text{ say},$$
and
$$P(\eta > 0) = P(\eta < 0) = \frac{1}{2}.$$

1.3 Discrete-time Multi-type Galton-Watson Branching Processes

1.3.1 Definitions, Assumptions and Notations

In a discrete-time single-type Galton-Watson branching process, we assume that each individual lives for a fixed unit time and then produces its children according to the same offspring distribution.

In this section, we allow a number of distinguishable types of individuals having different offspring distributions.

First, we consider a finite number $d$ of individual types. Such processes arise in a variety of applications in biology and physics, and they can represent genetic or mutant types in real populations such as animal populations, bacterial populations or photons, etc.

Throughout this section and the next chapter, we adopt the following conventions.

1. $N_0$ is the set of all nonnegative integers.

2. $N_0^d \equiv \left\{ \mathbf{j} = (j_1, j_2, \dots, j_d) : j_i \in N_0,\ i = 1, 2, \dots, d \right\}$.

3. $\mathbf{0} = (0, 0, \dots, 0)$ and $\mathbf{1} = (1, 1, \dots, 1)$ in $N_0^d$.

4. $\mathbf{e}_i = (0, \dots, 0, 1, 0, \dots, 0) \in N_0^d$ with the 1 in the $i$th component.

5. $\mathbf{u} \le \mathbf{v}$ means $u_i \le v_i$ for $i = 1, 2, \dots, d$, while $\mathbf{u} < \mathbf{v}$ means $u_i \le v_i$ for all $i$ and $u_i < v_i$ for at least one $i$.

6. The vector of absolute values is $|\mathbf{x}| = |x_1| + |x_2| + \cdots + |x_d|$.

7. The sup norm is $\|\mathbf{x}\| = \max\{|x_1|, |x_2|, \dots, |x_d|\}$.

8. The product notation is $\mathbf{x}^{\mathbf{y}} = \prod_{i=1}^{d} x_i^{y_i}$.

9. For a matrix $M$, the sup norm is $\|M\| = \max\{|m_{ij}| : i, j = 1, 2, \dots, d\}$.

Let $\mathbf{Z}_n = (Z_{n,1}, Z_{n,2}, \dots, Z_{n,d})$ be the population vector in the $n$th generation, $n = 0, 1, 2, \dots$, where $Z_{n,i}$ is the number of individuals of type $i$ in the $n$th generation. We assume that each individual of type $i$, $i = 1, 2, \dots, d$, lives a unit of time and, upon death, produces children of all types according to the offspring distribution $\left\{ p^{(i)}(\mathbf{j}) \equiv p^{(i)}(j_1, j_2, \dots, j_d) \right\}_{\mathbf{j} \in N_0^d}$, independently of other individuals, where $p^{(i)}(j_1, j_2, \dots, j_d)$ is the probability that a type $i$ parent produces $j_1$ children of type 1, $j_2$ children of type 2, $\dots$, $j_d$ children of type $d$. Therefore, each component of the vector of probability generating functions $\mathbf{f} = \left( f^{(1)}, f^{(2)}, \dots, f^{(d)} \right)$ can be written as
$$f^{(i)}(s_1, s_2, \dots, s_d) = \sum_{j_1, j_2, \dots, j_d \ge 0} p^{(i)}(j_1, j_2, \dots, j_d)\, s_1^{j_1} s_2^{j_2} \cdots s_d^{j_d},$$
where $0 \le s_r \le 1$, $r = 1, 2, \dots, d$; this is the probability generating function of the numbers of the various types produced by a type $i$ individual.

Thus, a discrete-time multi-type Galton-Watson branching process $\{\mathbf{Z}_n\}_{n \ge 0}$ is a Markov chain on $N_0^d$ with the transition function
$$P(\mathbf{i}, \mathbf{j}) = P(\mathbf{Z}_{n+1} = \mathbf{j} \mid \mathbf{Z}_n = \mathbf{i}), \quad \forall\, \mathbf{i}, \mathbf{j} \in N_0^d,$$
such that $\sum_{\mathbf{j} \in N_0^d} P(\mathbf{i}, \mathbf{j})\, \mathbf{s}^{\mathbf{j}} = \left( \mathbf{f}(\mathbf{s}) \right)^{\mathbf{i}}$ (see notation 8). When the process is initiated in state $\mathbf{e}_i$, we will denote the process $\{\mathbf{Z}_n\}_{n \ge 0}$ by

$$\mathbf{Z}_n^{(i)} = \left( Z_{n,1}^{(i)}, Z_{n,2}^{(i)}, \dots, Z_{n,d}^{(i)} \right),$$
where $Z_{n,j}^{(i)}$ is the number of type $j$ individuals in the $n$th generation for a process with $\mathbf{Z}_0 = \mathbf{e}_i$. The generating function of $\mathbf{Z}_n^{(i)}$ will be denoted by $\mathbf{f}_n^{(i)}(\mathbf{s})$.

Also, we let $\xi_{n,r}^{(j)}$ be the vector of offspring of the $r$th individual of type $j$ in the $n$th generation; then $\xi_{n,r}^{(j)} \sim \{p^{(j)}(\cdot)\}$, i.e., $P(\xi_{n,r}^{(j)} = \cdot) = p^{(j)}(\cdot)$. The population in the $(n+1)$th generation can then be expressed as
$$\mathbf{Z}_{n+1} = \sum_{j=1}^{d} \sum_{r=1}^{Z_{n,j}} \xi_{n,r}^{(j)}.$$

Let $m_{ij} = E(Z_{1,j} \mid \mathbf{Z}_0 = \mathbf{e}_i)$ be the expected number of type $j$ offspring of a single type $i$ individual in one generation, for any $i, j = 1, 2, \dots, d$. Then we define the mean matrix
$$M = \left\{ m_{ij} : i, j = 1, 2, \dots, d \right\}.$$
Clearly, $E(\mathbf{Z}_n \mid \mathbf{Z}_0) = \mathbf{Z}_0 M^n$. We let $m_{ij}^{(n)}$ be the $(i, j)$th element of $M^n$. When the higher moments exist, we denote them by the following notation. First, we let

$$q_n^{(r)}(i, j) = E\left( Z_{n,i}^{(r)} Z_{n,j}^{(r)} - \delta_{i,j} Z_{n,i}^{(r)} \right), \quad i, j, r = 1, 2, \dots, d,$$
and define the matrix
$$Q_n^{(r)} = \left\{ q_n^{(r)}(i, j) : i, j = 1, 2, \dots, d \right\},$$
the vector of matrices $\mathbf{Q}_n = \left( Q_n^{(1)}, Q_n^{(2)}, \dots, Q_n^{(d)} \right)$, the quadratic form
$$Q_n^{(r)}[\mathbf{s}] = \frac{1}{2} \sum_{i=1}^{d} \sum_{j=1}^{d} s_i\, q_n^{(r)}(i, j)\, s_j,$$
and the vector of quadratic forms
$$\mathbf{Q}_n[\mathbf{s}] = \left( Q_n^{(1)}[\mathbf{s}], Q_n^{(2)}[\mathbf{s}], \dots, Q_n^{(d)}[\mathbf{s}] \right), \tag{1.1}$$
and let $\mathbf{Q}[\mathbf{s}] \equiv \mathbf{Q}_1[\mathbf{s}]$.

We also impose the following assumptions on the process $\{\mathbf{Z}_n\}_{n \ge 0}$:

1. The branching process $\{\mathbf{Z}_n\}_{n \ge 0}$ is a non-singular process, i.e., for every $i$, the probability that each individual has exactly one offspring of the same type is less than 1.

2. The branching process $\{\mathbf{Z}_n\}_{n \ge 0}$ is a positive regular process. That is, some power of the mean matrix $M$ is strictly positive: there exists an $n$ such that $m_{ij}^{(n)} > 0$ for all $i, j = 1, 2, \dots, d$.

By the Perron-Frobenius theorem, the positive regular matrix $M$ has a maximal eigenvalue $\rho$ which is positive and simple, with associated positive right and left eigenvectors $\mathbf{u}$ and $\mathbf{v}$. Moreover, these can be normalized so that $\mathbf{u} \cdot \mathbf{v} = 1$ and $\mathbf{u} \cdot \mathbf{1} = 1$; then one can write
$$M^n = \rho^n P + R^n,$$
where $P$ is the matrix whose $(i, j)$th entry is $u_i v_j$, $PR = RP = 0$, and $|r_{ij}^{(n)}| \le c \rho_0^n$, $i, j = 1, 2, \dots, d$, for some $c < \infty$ and $0 < \rho_0 < \rho$, where $r_{ij}^{(n)}$ is the $(i, j)$th entry of $R^n$.

In a discrete-time multi-type Galton-Watson branching process, the role of the crucial criticality parameter is now played by the maximal eigenvalue $\rho$ of the mean matrix $M$. The process is called a supercritical, critical or subcritical branching process according as $\rho > 1$, $\rho = 1$ or $\rho < 1$, respectively.
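The maximal eigenvalue $\rho$ and the eigenvectors $\mathbf{u}, \mathbf{v}$ can be approximated by power iteration, since positive regularity makes the Perron root dominant. A minimal sketch (ours; the $2 \times 2$ matrix is an arbitrary positive example):

```python
def mat_vec(M, x):
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def vec_mat(x, M):
    return [sum(x[i] * M[i][j] for i in range(len(x))) for j in range(len(M))]

def perron(M, iters=200):
    # Power iteration: u is the right eigenvector (M u = rho u), v the left (v M = rho v).
    u = [1.0] * len(M)
    v = [1.0] * len(M)
    rho = 1.0
    for _ in range(iters):
        u, v = mat_vec(M, u), vec_mat(v, M)
        rho = max(u)                      # entries stay positive for a positive matrix
        u = [c / rho for c in u]
        v = [c / max(v) for c in v]
    s = sum(u)                            # normalize u . 1 = 1 ...
    u = [c / s for c in u]
    dot = sum(a * b for a, b in zip(u, v))
    v = [c / dot for c in v]              # ... and u . v = 1
    return rho, u, v

M = [[1.0, 1.0],
     [0.5, 1.0]]
rho, u, v = perron(M)
print(rho)   # ~ 1 + sqrt(1/2) = 1.7071...
```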

1.3.2 Limit Theorems

Let {Zn}n≥1 be a nonsingular and positive regular branching process and let M be its mean matrix with the maximal eigenvalue ρ.

First, we present the result on the probability of extinction.

Theorem 1.5. (Harris, 1963) Let
$$\mathbf{q} = \left( q^{(1)}, q^{(2)}, \dots, q^{(d)} \right),$$
where $q^{(i)}$ is the probability of eventual extinction of the process initiated by a single individual of type $i$, $i = 1, 2, \dots, d$. Then

(a) If $\rho \le 1$, then $\mathbf{q} = \mathbf{1}$.

(b) If $\rho > 1$, then $\mathbf{q} < \mathbf{1}$.

Next, we look at three limit theorems for multi-type branching processes.

In fact, the asymptotic behavior of the multi-type branching process offers no new surprises. In the supercritical case, as in the single-type process, the total population $|\mathbf{Z}_n|$ grows at a geometric rate $\rho^n$ (given an analog of the $L \log L$ condition in Theorem 1.1(c)), and the proportions of individuals of the various types approach the corresponding ratios of the components of the left eigenvector of the mean matrix $M$.

Theorem 1.6. (Supercritical Case) Let $\rho > 1$. Then

(a) (Kesten and Stigum, 1966)
$$\lim_{n\to\infty} \frac{\mathbf{Z}_n}{\rho^n} = \mathbf{v} W \quad \text{w.p.1},$$
where $W$ is a nonnegative random variable such that
$$P(W > 0) > 0 \quad \text{if and only if} \quad E\|\mathbf{Z}_1\| \log \|\mathbf{Z}_1\| < \infty.$$
Moreover, if $E\|\mathbf{Z}_1\| \log \|\mathbf{Z}_1\| < \infty$, then
$$E(W \mid \mathbf{Z}_0 = \mathbf{e}_i) = u_i, \quad i = 1, 2, \dots, d,$$
and $P(W = 0 \mid \mathbf{Z}_0 = \mathbf{e}_i) = q^{(i)}$ for $i = 1, 2, \dots, d$.

(b) Let $W_n = \dfrac{\mathbf{u} \cdot \mathbf{Z}_n}{\rho^n}$ and let $\mathcal{F}_n$ be the $\sigma$-algebra generated by $\{\mathbf{Z}_i : 1 \le i \le n\}$. Then $\{(W_n, \mathcal{F}_n) : n \ge 0\}$ is a nonnegative martingale and hence $\lim_{n\to\infty} W_n$ exists w.p.1 and equals $W$ in (a).

Hoppe (1976) combined the functional equation approach of Seneta with the exponential martingale of Heyde to show that the analogous results of Seneta for the single-type Galton-Watson branching process also hold for the multi-type process.

Theorem 1.7. (Hoppe [21]) Let $1 < \rho < \infty$ and $\mathbf{Z}_0 = \mathbf{e}_i$ for any $i = 1, 2, \dots, d$. Then there exist a positive sequence $\{\mathbf{C}_n\}_{n \ge 1}$ of vectors and related scalars $\{\gamma_n\}_{n \ge 0}$ such that

(a) $\lim_{n\to\infty} \mathbf{C}_n \cdot \mathbf{Z}_n = W^{(i)}$ w.p.1;

(b) $\lim_{n\to\infty} \dfrac{\gamma_n}{\gamma_{n+1}} = \rho$;

(c) $\lim_{n\to\infty} \dfrac{\mathbf{C}_n}{\gamma_n} = \mathbf{u}$;

(d) $\lim_{n\to\infty} \gamma_n\, \mathbf{u} \cdot \mathbf{Z}_n = W^{(i)}$ w.p.1;

(e) $\lim_{n\to\infty} \gamma_n \mathbf{Z}_n = W^{(i)} \mathbf{v}$ in probability;

(f) $W^{(i)}$ is a random variable such that $P(W^{(i)} < \infty) = 1$ and $P(W^{(i)} = 0) = q^{(i)}$;

(g) $\mathbf{C}_n \sim \rho^{-n} L(\rho^{-n})\, \mathbf{u}$ as $n \to \infty$, where $L(s)$ varies slowly as $s \to 0$.

In the critical case, conditioning on non-extinction and normalizing the process $\mathbf{Z}_n$ by dividing it by the generation number $n$, the limit law is again exponential, as it is in the single-type critical branching process.

Theorem 1.8. (Critical Case) Let $\rho = 1$ and $E\|\mathbf{Z}_1\|^2 < \infty$. Then

(a) (Joffe and Spitzer, 1967)
$$\lim_{n\to\infty} n P(\mathbf{Z}_n \ne \mathbf{0} \mid \mathbf{Z}_0 = \mathbf{i}) = \frac{\mathbf{i} \cdot \mathbf{u}}{\mathbf{v} \cdot \mathbf{Q}[\mathbf{u}]}.$$

(b) (Joffe and Spitzer, 1967) If $\mathbf{w} \cdot \mathbf{v} > 0$, then $\dfrac{\mathbf{Z}_n \cdot \mathbf{w}}{n}$, conditioned on $\mathbf{Z}_n \ne \mathbf{0}$, converges in distribution to the random variable $Y$ with density
$$f_1(x) = \frac{1}{\gamma_1} e^{-\frac{x}{\gamma_1}}, \quad x \ge 0,$$
where $\gamma_1 = \dfrac{\mathbf{v} \cdot \mathbf{w}}{\mathbf{v} \cdot \mathbf{Q}[\mathbf{u}]}$.

(c) (Ney, 1967) If $\mathbf{w} \cdot \mathbf{v} = 0$, then $\dfrac{\mathbf{Z}_n \cdot \mathbf{w}}{\sqrt{n}}$, conditioned on $\mathbf{Z}_n \ne \mathbf{0}$, converges in distribution to the random variable with density
$$f_2(x) = \frac{1}{\gamma_2} e^{-\frac{|x|}{\gamma_2}}, \quad -\infty < x < \infty,$$
for some $\gamma_2 > 0$.

The same approach applies to the subcritical case; that is, conditioned on the event of non-extinction, the process converges in distribution to a proper random variable. Moreover, the probability of the event $\{\mathbf{Z}_n \ne \mathbf{0}\}$ of non-extinction decays at the geometric rate $\rho^n$.

Theorem 1.9. (Subcritical Case) (Joffe and Spitzer, 1967) Let $\rho < 1$. Then

(a)
$$\frac{\mathbf{v} \cdot [\mathbf{1} - \mathbf{f}_n(\mathbf{s})]}{\rho^n} \downarrow Q(\mathbf{s}) \quad \text{as } n \to \infty, \quad \mathbf{0} \le \mathbf{s} \le \mathbf{1},$$
where $Q(\cdot)$ is non-increasing, and is positive if and only if $E\|\mathbf{Z}_1\| \log \|\mathbf{Z}_1\| < \infty$;

(b)
$$\lim_{n\to\infty} \frac{\mathbf{1} - \mathbf{f}_n(\mathbf{s})}{\rho^n} = Q(\mathbf{s})\, \mathbf{u};$$

(c)
$$\lim_{n\to\infty} \rho^{-n} P(\mathbf{Z}_n \ne \mathbf{0} \mid \mathbf{Z}_0 = \mathbf{i}) = Q(\mathbf{0})\, (\mathbf{i} \cdot \mathbf{u});$$

(d)
$$\lim_{n\to\infty} P(\mathbf{Z}_n = \mathbf{j} \mid \mathbf{Z}_0 = \mathbf{i}, \mathbf{Z}_n \ne \mathbf{0}) = b(\mathbf{j})$$
exists, is independent of $\mathbf{i}$, and is a probability measure on $N_0^d \setminus \{\mathbf{0}\}$. Furthermore,
$$\sum_{\mathbf{j}} |\mathbf{j}|\, b(\mathbf{j}) < \infty \quad \text{if and only if} \quad E\|\mathbf{Z}_1\| \log \|\mathbf{Z}_1\| < \infty.$$

1.4 Continuous-time Single-type Age-dependent Bellman-Harris Branching Processes

1.4.1 Definitions, Assumptions and Notations

In the discrete-time single-type Galton-Watson branching process, the lifetime of each individual was one unit of time. A natural generalization is to allow these lifetimes to be random variables.

Here, we first consider a single-type branching process, and we assume that each individual lives for a random length of time, say $L$, with distribution function $G$ and, upon its death, produces a random number $\xi$ of children according to the offspring distribution $\{p_j\}_{j \ge 0}$. The reproduction of each individual is independent of its lifetime and of the other individuals.

Let $Z(t)$ be the population at time $t$, i.e., the number of individuals alive at time $t$. Then $\{Z(t) : t \ge 0\}$ is called a continuous-time single-type Bellman-Harris branching process with lifetime distribution $G(\cdot)$ and offspring distribution $\{p_j\}_{j \ge 0}$. The Galton-Watson branching process can be viewed as a special case of the Bellman-Harris branching process with lifetime $L \equiv 1$. A Bellman-Harris process is in general not Markovian, unless the lifetimes are independent exponentially distributed random variables; in that case the process is called a continuous-time Markov branching process. A general Bellman-Harris process is also called a continuous-time age-dependent branching process.
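A Bellman-Harris process is naturally simulated event by event, keeping a priority queue of scheduled death times: at each death an individual is removed and its children are scheduled independently. The sketch below is ours (the function name and the example distributions are arbitrary choices, not from the thesis):

```python
import heapq, random

def simulate_bellman_harris(t_max, lifetime, offspring, seed=0):
    # Z(t_max) for a process started from a single newborn at time 0.
    # lifetime(rng) samples L ~ G; offspring(rng) samples xi ~ {p_j}; they are
    # independent of each other and across individuals, as in the definition above.
    rng = random.Random(seed)
    deaths = [lifetime(rng)]              # min-heap of pending death times
    alive = 1
    while deaths and deaths[0] <= t_max:
        t = heapq.heappop(deaths)
        alive -= 1
        for _ in range(offspring(rng)):   # children born at the parent's death time t
            alive += 1
            heapq.heappush(deaths, t + lifetime(rng))
    return alive

# Markov special case: exponential lifetimes, critical binary splitting p_0 = p_2 = 1/2.
z = simulate_bellman_harris(
    5.0,
    lifetime=lambda rng: rng.expovariate(1.0),
    offspring=lambda rng: 2 * (rng.random() < 0.5),
)
print(z)
```

With the constant offspring sampler `lambda rng: 1` the population stays at exactly one individual for all time, which gives a quick sanity check.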

As in the single-type Galton-Watson branching process, let
$$m \equiv \sum_{j=1}^{\infty} j p_j,$$
and the Bellman-Harris branching process is called a supercritical, critical or subcritical process according as $m > 1$, $m = 1$ or $m < 1$.

For the lifetime distribution $G$, we assume throughout that $G(0+) = 0$, i.e., that there is zero probability of instantaneous death. Harris (1963) showed that, together with finite mean individual production, this guarantees the a.s. finiteness of the process for all time $t > 0$, i.e., $P(Z(t) < \infty) = 1$ for all $0 < t < \infty$.

Next, we introduce a parameter α which will describe the growth rate of the population in the supercritical case.

Definition 1.1. The Malthusian parameter for $m$ and $G$ is the root $\alpha \in \mathbb{R}$ (provided it exists) of
$$m \int_0^{\infty} e^{-\alpha x}\, dG(x) = 1.$$

Due to the monotonicity of the left side of the equation as a function of $\alpha$, such a root, when it exists, is always unique. Also, such a Malthusian parameter always exists and is necessarily nonnegative when $m \ge 1$.
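For a concrete lifetime distribution, the defining equation of the Malthusian parameter can be solved numerically. The sketch below (ours) takes $G$ exponential with rate $\lambda$, where $\int_0^\infty e^{-\alpha x}\,dG(x) = \lambda/(\lambda + \alpha)$ in closed form, so the root is $\alpha = \lambda(m - 1)$ exactly and the bisection can be checked against it:

```python
def malthusian_exp_lifetime(m, lam):
    # Solve m * lam/(lam + a) = 1 for a by bisection; phi is strictly decreasing.
    def phi(a):
        return m * lam / (lam + a) - 1.0
    lo, hi = -0.999 * lam, 50.0    # bracket keeping lam + a > 0; assumes the root is below 50
    for _ in range(200):           # interval halving to machine precision
        mid = 0.5 * (lo + hi)
        if phi(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(malthusian_exp_lifetime(2.0, 1.0))   # supercritical: alpha ~ 1.0
print(malthusian_exp_lifetime(0.5, 1.0))   # subcritical: alpha ~ -0.5
```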

1.4.2 Limit Theorems

Let $\alpha$ denote the Malthusian parameter for the offspring mean $m$ and the lifetime distribution $G$. Let $f$ be the generating function of the offspring distribution and let $F(s, t) = \sum_{j=0}^{\infty} P(Z(t) = j \mid Z(0) = 1)\, s^j$; then $F(s, t)$ is the unique bounded solution of the integral equation
$$F(s, t) = s[1 - G(t)] + \int_0^t f\left( F(s, t - x) \right) dG(x), \quad |s| \le 1.$$
We shall say that $F$ is the generating function of the process determined by $(f, G)$.
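Setting $s = 0$ in the integral equation gives $u(t) = P(Z(t) = 0) = \int_0^t f(u(t - x))\,dG(x)$, which can be solved numerically on a grid. The first-order discretization below is ours; it is checked against the Markov special case $G = \text{Exp}(1)$ with critical binary splitting $f(s) = (1 + s^2)/2$, where $u(t) = t/(t + 2)$ in closed form:

```python
import math

def extinction_by_time(t_max, f, G, h=0.005):
    # Right-endpoint discretization of u(t) = int_0^t f(u(t-x)) dG(x) on a grid of step h.
    n = int(t_max / h)
    u = [0.0] * (n + 1)                    # u[0] = P(Z(0) = 0) = 0
    dG = [G(j * h) - G((j - 1) * h) for j in range(1, n + 1)]
    for k in range(1, n + 1):
        u[k] = sum(f(u[k - j]) * dG[j - 1] for j in range(1, k + 1))
    return u[n]

u10 = extinction_by_time(10.0, f=lambda s: 0.5 * (1 + s * s), G=lambda x: 1 - math.exp(-x))
print(round(u10, 3))   # close to the exact value 10/12 = 0.833...
```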

Let $q$ be the probability of extinction, i.e., $q = P(Z(t) = 0 \text{ for some } t)$. Then the following theorem is a direct generalization of the geometric convergence rate of the generating function of a Galton-Watson process.

Theorem 1.10. If $m \ne 1$, $0 < \gamma = f'(q)$, $G$ is non-lattice, the Malthusian parameter $\alpha$ for $\gamma$ and $G$ exists, and $\mu_\alpha = \gamma \int_0^{\infty} t e^{-\alpha t}\, dG(t) < \infty$, then
$$\lim_{t\to\infty} e^{-\alpha t} \left[ q - F(s, t) \right] \equiv Q(s) \quad \text{exists for } 0 \le s \le 1.$$
Furthermore,
$$Q(s) \equiv 0 \quad \text{if and only if} \quad m < 1 \ \text{and} \ \sum_{j=1}^{\infty} (j \log j) p_j = \infty.$$
If $m > 1$ or $\sum_{j=1}^{\infty} (j \log j) p_j < \infty$, then $Q(s) \ne 0$ for $s \ne q$.

The next theorem is for the supercritical case.

Theorem 1.11. (Supercritical Case) Let $1 < m < \infty$. Then

(a) If $\sum_{j=1}^{\infty} (j \log j) p_j = \infty$, then
$$e^{-\alpha t} Z(t) \to 0 \quad \text{w.p.1}.$$

(b) If $Z(0) = 1$ and $\sum_{j=1}^{\infty} (j \log j) p_j < \infty$, then
$$e^{-\alpha t} Z(t) \to W \quad \text{w.p.1},$$
where $W$ is a nonnegative random variable such that

(i) $EW = 1$;

(ii) $W$ has an absolutely continuous distribution on $(0, \infty)$;

(iii) $P(W = 0) = q = P(Z(t) = 0 \text{ for some } t)$.

As in the discrete-time Galton-Watson branching processes, there exist Seneta-Heyde normalizing constants for the continuous-time Bellman-Harris branching processes. Schuh and Cohn (1982) showed, by different approaches, that if $1 < m < \infty$, then without the hypothesis $\sum_{j=1}^{\infty} (j \log j) p_j < \infty$ there exist constants $C_t$ such that
$$\frac{Z(t)}{C_t} \to W \quad \text{w.p.1}$$
as $t \to \infty$, where $W$ is a random variable with a continuous distribution on $(0, \infty)$ such that $P(W = 0) = q = P(Z(t) = 0 \text{ for some } t)$.

Next, when $m = 1$, we have an analog of the exponential limit law of the critical Galton-Watson branching process.

Theorem 1.12. (Critical Case) If $m = 1$, $\sigma^2 = f''(1) < \infty$, $\mu = \int_0^{\infty} t\, dG(t) < \infty$, and $t^2 [1 - G(t)] \to 0$ as $t \to \infty$, then

(a) $\displaystyle\lim_{t\to\infty} P\left( \frac{Z(t)}{t} \le x \,\Big|\, Z(t) > 0 \right) = 1 - e^{-\frac{2\mu x}{\sigma^2}}, \quad x \ge 0$;

(b) $P(Z(t) > 0) \sim \dfrac{2\mu}{\sigma^2 t}$.

In the subcritical case, conditioned on the event of non-extinction, the process $Z(t)$ converges in distribution to a proper random variable as $t \to \infty$. The result is stated as follows.

Theorem 1.13. (Subcritical Case) Let $m < 1$ and $\sum_{j=1}^{\infty} (j \log j) p_j < \infty$. Assume that the Malthusian parameter $\alpha$ for $m$ and the lifetime distribution $G$ exists and $\int_0^{\infty} t e^{-\alpha t}\, dG(t) < \infty$. Then

(a) $P(Z(t) > 0) \sim c\, e^{\alpha t}$ for some $c > 0$ (note that $\alpha < 0$ here);

(b) for all $j \ge 1$,
$$\lim_{t\to\infty} P(Z(t) = j \mid Z(t) > 0) = b_j$$
exists, $\sum_{j=1}^{\infty} b_j = 1$ and $\sum_{j=1}^{\infty} j b_j < \infty$.

For proofs of these results, see Athreya and Ney [5].

1.4.3 Age Distribution in Bellman-Harris Processes

An important and useful aspect of age-dependent branching processes is the limit behavior of the age distribution.

Consider a Bellman-Harris branching process $\{Z(t) : t \ge 0\}$ which starts at time 0 with one individual of age 0. This individual lives for a length of time $L$ with lifetime distribution $G$ and, upon its death, produces $\xi$ children according to the offspring distribution $\{p_j\}_{j \ge 0}$, independently of the other individuals alive at the same time and of its lifetime. Each individual then lives for a length of time, dies and produces its offspring in the same way, and so on.

We impose the assumption that G(0+) = 0 and G is non-lattice.

We also adopt the following notation: for any family history ω,

1. Z(t, ω) is the number of individuals alive at time t.

2. Z(x, t, ω) is the number of individuals alive at time t whose age is less than or equal to x.

3. $A(x, t, \omega) = \dfrac{Z(x, t, \omega)}{Z(t, \omega)}$.

4. α is the Malthusian parameter for m and G.

5. $A(x) = \dfrac{\int_0^x e^{-\alpha u}[1 - G(u)]\,du}{\int_0^\infty e^{-\alpha u}[1 - G(u)]\,du}$

6. $G_x(t) = \dfrac{G(x + t) - G(x)}{1 - G(x)}$

Theorem 1.14. (Athreya and Kaplan [2]) Let $1 < m = \sum_{j=1}^\infty j p_j < \infty$ and $p_0 = 0$. Then

(a) $\sup_x |A(x, t, \omega) - A(x)| \xrightarrow{P} 0$

(b) If $\sum_{j=1}^\infty (j \log j)p_j < \infty$, then

$\sup_x |A(x, t, \omega) - A(x)| \to 0$ w.p.1

as t → ∞.

(c) For any bounded continuous a.e. (w.r.t. Lebesgue measure) function h(·) on the support of G,

$$\int_0^\infty h(x)\,dA(x, t, \omega) \xrightarrow{P} \int_0^\infty h(x)\,dA(x)$$

as $t \to \infty$.

Remark 1.1. In the above theorem, (a) implies (c).

Theorem 1.15. (Athreya and Kaplan [2]) Let $1 < m = \sum_{j=1}^\infty j p_j < \infty$ and $p_0 = 0$. If $\sum_{j=1}^\infty (j \log j)p_j = \infty$, then, for any $K$ in the support of $G$,

lim Z(K, t, ω)e−αt = 0 w.p.1 t→∞

Theorem 1.16. (Athreya and Kaplan [3]) Let $m = 1$ and assume that $\lim_{t\to\infty} \sup_{x\ge 0}\, [1 - G_x(t)] = 0$. Then, for any $\epsilon > 0$,

$$\lim_{t\to\infty} P\Big( \sup_{x\ge 0} |A(x, t, \omega) - A(x)| > \epsilon \,\Big|\, Z(t) > 0 \Big) = 0.$$

1.5 Continuous-time Multi-type Age-dependent Bellman-Harris Branching Processes

1.5.1 Definitions, Assumptions and Notations

The Bellman-Harris processes can be made more general by allowing individuals to be of different types. The population consists of d types of individuals whose lifetimes and reproductive behaviors are dependent on their types.

The lifetime Li of a type i individual is a random variable with distribution Gi(·), i = 1, 2, ··· , d.

Also, a type i individual, upon its death, produces $\xi_{i,j}$ children of type j, $j = 1, 2, \cdots, d$, according to the probability distribution $\{p^{(i)}(\mathbf{j}) \equiv p^{(i)}(j_1, j_2, \cdots, j_d)\}_{\mathbf{j}\in\mathbb{N}^d}$ and independently of the other individuals, where $p^{(i)}(j_1, j_2, \cdots, j_d)$ is the probability that a type i parent produces $j_1$ children of type 1, $j_2$ children of type 2, $\cdots$, $j_d$ children of type d. As in the multi-type Galton-Watson process, we still denote the generating functions of the offspring distributions by $\mathbf{f}(\mathbf{s}) = (f^{(1)}(\mathbf{s}), f^{(2)}(\mathbf{s}), \cdots, f^{(d)}(\mathbf{s}))$.

Let $\mathbf{Z}(t) = (Z_1(t), Z_2(t), \cdots, Z_d(t))$ be the population vector of the individuals alive at time t, $t \ge 0$, where $Z_i(t)$ is the number of individuals of type i alive at time t. Then $\{\mathbf{Z}(t) : t \ge 0\}$ is called a continuous-time multi-type age-dependent branching process.

As in the discrete-time multi-type Galton-Watson branching processes, we let $m_{ij} = E(\xi_{i,j})$ be the expected number of type j offspring of a single type i individual in one generation for any $i, j = 1, 2, \cdots, d$, and define the mean matrix $M = \{m_{ij} : i, j = 1, 2, \cdots, d\}$.

Assume that M is nonsingular and positively regular, and write ρ for its Perron-Frobenius root (the maximal eigenvalue). The process is called supercritical, critical or subcritical according as ρ > 1, ρ = 1 or ρ < 1, respectively.

Let $\widehat{G}(\alpha) = \int_0^\infty e^{-\alpha t}\,G(dt)$ be the Laplace transform of any probability distribution G. Let $\widehat{M}(\alpha) = \big( m_{ij}\widehat{G}_i(\alpha) \big)_{i,j=1}^d$.

Now, we can define an analog of the concept of a Malthusian parameter for multi-type Bellman-Harris processes.

Definition 1.2. The Malthusian parameter α for the matrix M and the probability distributions {Gi : i = 1, 2, ··· , d} is defined to be the number α for which the matrix Mb(α) has the Perron-Frobenius root (the maximal eigenvalue) 1, provided it exists.

In the critical and supercritical cases, the Malthusian parameter α always exists and is nonnegative.

1.5.2 Limit Theorems

Let u and v be the right and left eigenvectors of M corresponding to the maximal eigenvalue ρ, normalized such that $\mathbf{1} \cdot \mathbf{u} = \mathbf{u} \cdot \mathbf{v} = 1$.

The Malthusian parameter α is, again, related to the growth rate in the supercritical case.

Here, we also assume that the offspring mean matrix M is nonsingular and strictly positive.

Theorem 1.17. (Supercritical Case) Let $\rho > 1$ and $E(\xi_{ij}^2) < \infty$ for $i, j = 1, 2, \cdots, d$. Then

$$\lim_{t\to\infty} e^{-\alpha t}\mathbf{Z}(t) = \mathbf{v}W \quad \text{exists w.p.1,}$$

where $W$ is a one-dimensional random variable.

Theorem 1.18. (Critical Case) Let $\rho = 1$, $E(\xi_{ij}^2) < \infty$ for $i, j = 1, 2, \cdots, d$ and $t^2\big[ 1 - G_i(t) \big] \to 0$ as $t \to \infty$ for all $i = 1, 2, \cdots, d$. Then

$$\lim_{t\to\infty} t\,P\big( \mathbf{Z}(t) \ne \mathbf{0} \,\big|\, \mathbf{Z}(0) = e_i \big) = u_i\, \frac{\boldsymbol{\mu}\cdot(\mathbf{u}\otimes\mathbf{v})}{Q[\mathbf{u}]}$$

where $\boldsymbol{\mu} = (\mu_1, \mu_2, \cdots, \mu_d)$ is the vector of means of $(G_1(t), G_2(t), \cdots, G_d(t))$, $\mathbf{u}\otimes\mathbf{v} = (u_1v_1, u_2v_2, \cdots, u_dv_d)$ and $Q$ is the second moment quadratic form associated with $\mathbf{f}$.

Moreover, the corresponding exponential limit law for $\frac{\mathbf{Z}(t)}{t}$ conditioned on $\{\mathbf{Z}(t) \ne \mathbf{0}\}$ was proved by H. Weiner (1970) under the very strong assumption that all moments of $\mathbf{f}(\mathbf{s})$ exist.

The next result is the analog of the limit law in the subcritical multi-type Galton-Watson branching process.

Theorem 1.19. (Subcritical Case) Let $\rho < 1$. Assume that $E\big( \xi_{ij}\log^+\xi_{ij} \big) < \infty$ for all $i, j = 1, 2, \cdots, d$, that the Malthusian parameter $\alpha$ exists, and that $\int_0^\infty t e^{\alpha t}\,dG_i(t) < \infty$ for $i = 1, 2, \cdots, d$. Then, as $t \to \infty$,

(a) $P(\mathbf{Z}(t) \ne \mathbf{0} \mid \mathbf{Z}(0) = e_i) \sim c_i e^{-\alpha t}$ for some $c_i > 0$.

(b) $P(\mathbf{Z}(t) = \mathbf{j} \mid \mathbf{Z}(0) = e_i, \mathbf{Z}(t) \ne \mathbf{0}) \to b(\mathbf{j})$, where $b(\mathbf{j})$ is a probability measure on $\mathbb{R}_+^d - \{\mathbf{0}\}$.

1.5.3 Age Distribution in Multi-type Age-dependent Branching Processes

We also have analogs of the limit theorems for the age distribution in the multi-type age-dependent process.

Assume, for each i = 1, 2, ··· , d, the lifetime distribution Gi of a type i individual is non-lattice and

$G_i(0+) = 0$. Also, for any family history ω, let

1. $\mathbf{Z}(t, \omega) = \big( Z_1(t, \omega), Z_2(t, \omega), \cdots, Z_d(t, \omega) \big)$, where $Z_i(t, \omega)$ is the number of type i individuals alive at time t.

2. $|\mathbf{Z}(t, \omega)| = \sum_{i=1}^d Z_i(t, \omega)$

3. $\mathbf{Z}(t, x, \omega) = \big( Z_1(t, x, \omega), Z_2(t, x, \omega), \cdots, Z_d(t, x, \omega) \big)$, where $Z_i(t, x, \omega)$ is the number of type i individuals alive at time t whose age is less than or equal to x.

4. $A_i(t, x, \omega) = \dfrac{Z_i(t, x, \omega)}{Z_i(t, \omega)}$ for $Z_i(t, \omega) > 0$, $i = 1, 2, \cdots, d$.

5. $A_i(x) = \dfrac{\int_0^x e^{-\alpha u}[1 - G_i(u)]\,du}{\int_0^\infty e^{-\alpha u}[1 - G_i(u)]\,du}$

Recall that ξi j is the number of type j children produced by a type i parent, i, j = 1, 2, ··· , d, according to the offspring distribution.

Theorem 1.20. Assume the process is supercritical, i.e., $\rho > 1$, and $E\big( \xi_{ij}\log^+\xi_{ij} \big) < \infty$ for all $i, j = 1, 2, \cdots, d$. Then, conditioned on the event of non-extinction, for any $i = 1, 2, \cdots, d$,

$$\sup_{x\ge 0} \big| A_i(t, x, \omega) - A_i(x) \big| \to 0 \quad \text{w.p.1 as } t \to \infty.$$

For proofs, see Athreya and Ney [5].

CHAPTER 2. REVIEW OF THE COALESCENCE IN DISCRETE-TIME SINGLE-TYPE GALTON-WATSON BRANCHING PROCESSES

2.1 Introduction

To investigate an old population, there are two interesting research directions. One is to predict the future behavior of the population, such as the probability of extinction, the growth rate, the stability of the composition of the population in the multi-type case, and the limit distribution of the ages of individuals. On the other hand, we can also study the evolution of this old population backward in time given its probability structure. We have seen many classical theorems in the first direction for different branching processes in Chapter 1. In this chapter, we review some known results in the other direction for the discrete-time single-type Galton-Watson branching processes.

We start with the coalescence problem for the binary tree case.

Consider a binary tree T starting with one individual. In such a tree, each individual produces exactly two offspring upon its death. So, there are $2^n$ individuals in the nth generation for any $n = 0, 1, 2, \cdots$.

[Figure: the first generations of a binary family tree.]

Now, pick two individuals from the nth generation by simple random sampling without replacement and trace their lines of descent backward in time until they meet. Call the generation of their last common ancestor $X_n$. Then, for $k = 1, 2, \cdots, n$,

$$P(X_n < k) = \frac{\binom{2^k}{2}\, 2^{n-k}\, 2^{n-k}}{\binom{2^n}{2}} = \frac{2^k(2^k - 1)\, 2^{2(n-k)}}{2^n(2^n - 1)} = \frac{1 - 2^{-k}}{1 - 2^{-n}}.$$

So, $\lim_{n\to\infty} P(X_n < k) = 1 - 2^{-k}$, $k = 1, 2, \cdots$. Thus, $X_n \xrightarrow{d} \mathrm{Geo}(\frac{1}{2})$. Similar results hold for any regular b-ary tree, $b \ge 2$. This suggests that a similar behavior might hold for Galton-Watson trees.
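The formula above can be checked by brute force. The following sketch (our own illustration, not part of the original text) enumerates all pairs of leaves of a small binary tree and compares the exact coalescence probabilities with $(1 - 2^{-k})/(1 - 2^{-n})$:

```python
from itertools import combinations, product
from fractions import Fraction

def coalescence_cdf(n, k):
    """P(X_n < k): enumerate all pairs of the 2^n leaves of a depth-n binary
    tree (a leaf is its path of n bits from the root) and count those whose
    last common ancestor lies in a generation earlier than k."""
    leaves = list(product((0, 1), repeat=n))
    pairs = list(combinations(leaves, 2))
    hits = 0
    for a, b in pairs:
        # generation of the last common ancestor = length of the common prefix
        g = 0
        while a[g] == b[g]:
            g += 1
        if g < k:
            hits += 1
    return Fraction(hits, len(pairs))

for n in (3, 4):
    for k in range(1, n + 1):
        formula = (1 - Fraction(1, 2 ** k)) / (1 - Fraction(1, 2 ** n))
        assert coalescence_cdf(n, k) == formula
```

For n = 4 and k = 1, for instance, both sides equal 8/15, and as n grows the value approaches the geometric limit $1 - 2^{-k}$.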

Let $\{Z_n\}_{n\ge 0}$ be a discrete-time single-type Galton-Watson branching process with offspring distribution $\{p_j\}_{j\ge 0}$ and initial size $Z_0$. Here, the process $\{Z_n\}_{n\ge 0}$ is generated in the way described in Section 1.2, and we also adopt the notation introduced in that section.

If T denotes the full family tree, then every individual in T can be identified by a finite string

(i0, i1, ··· , in) meaning that this individual is in the nth generation and is the inth child of the individual

(i0, i1, ··· , in−1) in the (n − 1)st generation.

Pick two individuals from the population in the nth generation (assuming $Z_n \ge 2$) by simple random sampling without replacement and trace their lines of descent backward in time until they meet for the first time. Call this common ancestor the last common ancestor or the most recent common ancestor of these two randomly chosen individuals. Let $X_{n,2}$ be the number of the generation to which this common ancestor belonged. Then we can ask the following questions.

(1) What is the distribution of Xn,2?

(2) What happens to Xn,2 when n → ∞?

Similarly, we can pick k individuals randomly from the nth generation, k ≥ 2, and trace their lines of descent backward in time until they meet. Let $X_{n,k}$ be the generation number of the last common ancestor of these randomly chosen individuals. Moreover, let $Y_n$ be the generation number of the last common ancestor of all the individuals in the nth generation; we are also interested in the limit behaviors of the distributions of $X_{n,k}$, $k \ge 2$, and $Y_n$ as $n \to \infty$. In each of the following sections, we present the results on the coalescence problem for the different cases (supercritical, critical, subcritical and explosive) of the discrete-time single-type Galton-Watson branching process. For proofs, see Athreya [10] and [12].
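The backward tracing that defines $X_{n,k}$ can be made concrete with a small simulation (our own sketch; the offspring law below is a hypothetical choice, not one used in the text):

```python
import random

def simulate_gw_coalescence(n, offspring, rng):
    """Grow n generations of a Galton-Watson tree, recording each
    individual's parent index, then sample two individuals from generation n
    without replacement and trace both lines back to their last common
    ancestor.  Returns the coalescence generation X_{n,2}, or None if
    generation n has fewer than two individuals."""
    parent = [[None]]                       # generation 0: a single root
    for g in range(1, n + 1):
        kids = []
        for i in range(len(parent[g - 1])):
            kids.extend([i] * offspring(rng))
        if not kids:                        # the line died out early
            return None
        parent.append(kids)
    if len(parent[n]) < 2:
        return None
    a, b = rng.sample(range(len(parent[n])), 2)
    g = n
    while a != b:                           # walk both lines back in time
        a, b = parent[g][a], parent[g][b]
        g -= 1
    return g

rng = random.Random(0)
offspring = lambda r: r.choice((1, 3))      # hypothetical law: p0 = 0, mean m = 2
draws = [simulate_gw_coalescence(10, offspring, rng) for _ in range(200)]
assert any(x is not None for x in draws)
assert all(x is None or 0 <= x < 10 for x in draws)
```

Each individual stores the index of its parent, so tracing the two sampled lines backward until the indices agree yields exactly the coalescence generation.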

2.2 The Supercritical Case

In the supercritical case, Athreya showed that the coalescence time $X_{n,k}$ goes all the way back to the beginning of the tree for any $k \ge 2$. That is, $X_{n,k}$ converges in distribution to a proper random variable as $n \to \infty$.

Theorem 2.1. Let $p_0 = 0$ and $1 < m \equiv \sum_{j=1}^\infty j p_j < \infty$. Then, for almost all trees T,

(a) for all $1 \le r < \infty$,

lim P(Xn,2 < r|T ) ≡ π2(r, T ) exists n→∞

and π2(r, T ) ↑ 1 as r ↑ ∞.

(b) for all $k \ge 2$ and all $1 \le r < \infty$,

lim P(Xn,k < r|T ) ≡ πk(r, T ) exists n→∞

and πk(r, T ) ↑ 1 as r ↑ ∞.

(c) for almost all trees T ,

Yn → N(T )

where N(T ) = max{n ≥ 1 : Zn = 1}. Also,

$$\lim_{n\to\infty} P(Y_n = k) = (1 - p_1)p_1^k, \quad k \ge 0.$$

2.3 The Critical Case

In a discrete-time single-type critical Galton-Watson branching process, unlike in the supercritical case, the coalescence time $X_{n,k}$ of k randomly chosen individuals, $k \ge 2$, as well as the coalescence time $Y_n$ of the whole population, is not close to the beginning of the tree when n gets large. In fact, they are of order n. That is, $\frac{X_{n,k}}{n}$ (conditioned on the set $\{Z_n \ge k\}$), $k \ge 2$, and $\frac{Y_n}{n}$ (conditioned on $\{Z_n \ge 1\}$) converge to proper random variables, respectively, as $n \to \infty$.

Theorem 2.2. Let $m = 1$, $p_1 < 1$ and $\sigma^2 = \sum_{j=1}^\infty j^2 p_j - 1 < \infty$. Then, for $0 < u < 1$,

  Xn,2 (a) lim P < u Zn ≥ 2 ≡ H2(u) exists and for 0 < u < 1, n→∞ n

H2(u) ≡ 1 − Eϕ(Nu)

where Nu is a geometric random variable with distribution

$$P(N_u = k) = (1 - u)u^{k-1}, \quad k \ge 1,$$

and for $j \ge 1$,

$$\varphi(j) \equiv E\left( \frac{\sum_{i=1}^{j} \eta_i^2}{\big( \sum_{i=1}^{j} \eta_i \big)^2} \right)$$

where $\{\eta_i\}_{i\ge 1}$ are i.i.d. exponential random variables with $E\eta_1 = 1$.

Further, $H_2(\cdot)$ is absolutely continuous on $[0, 1]$ with $H_2(0+) = 0$ and $H_2(1-) = 1$.

(b) for $0 < u < 1$, $1 < k < \infty$,

$$\lim_{n\to\infty} P\Big( \frac{X_{n,k}}{n} < u \,\Big|\, Z_n \ge k \Big) \equiv H_k(u) \quad \text{exists}$$

and $H_k(\cdot)$ is an absolutely continuous distribution function with $H_k(0+) = 0$ and $H_k(1-) = 1$.

(c) for $0 < u < 1$, $\lim_{n\to\infty} P\big( \frac{Y_n}{n} < u \,\big|\, Z_n \ge 1 \big) = u$.

Remark 2.1. The result on $\frac{Y_n}{n}$ in Theorem 2.2 (c) is available in Zubkov [36].
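The constant $\varphi(j)$ in Theorem 2.2 (a) is explicit: for i.i.d. standard exponentials, $(\eta_1/S, \cdots, \eta_j/S)$ with $S = \sum_i \eta_i$ is uniformly distributed on the simplex, which gives $\varphi(j) = 2/(j+1)$. This closed form is our own side computation, not a statement from the text; a quick Monte Carlo sketch confirms it:

```python
import random

def phi(j, trials, rng):
    """Monte Carlo estimate of phi(j) = E[ (sum eta_i^2) / (sum eta_i)^2 ]
    for i.i.d. standard exponential eta_1, ..., eta_j."""
    total = 0.0
    for _ in range(trials):
        etas = [rng.expovariate(1.0) for _ in range(j)]
        s = sum(etas)
        total += sum(e * e for e in etas) / (s * s)
    return total / trials

rng = random.Random(42)
for j in (1, 2, 5, 10):
    est = phi(j, 20000, rng)
    # compare with the Dirichlet closed form 2/(j+1)
    assert abs(est - 2.0 / (j + 1)) < 0.02
```

With $\varphi(j) = 2/(j+1)$, the quantity $E\varphi(N_u)$ in the theorem can then be evaluated directly from the geometric distribution of $N_u$.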

2.4 The Subcritical Case

The next theorem provides a sharp contrast to the results in the supercritical and critical cases.

In a subcritical Galton-Watson branching process, the coalescence time $X_{n,k}$, $k \ge 2$, takes place close to the present, and the same is true for the coalescence time $Y_n$ of the whole population in the nth generation.

Theorem 2.3. Let $0 < m \equiv \sum_{j=1}^\infty j p_j < 1$. Then

(a) For $k \ge 1$, $\lim_{n\to\infty} P(n - X_n > k \mid Z_n \ge 2) = \dfrac{E\phi_k(Y)}{E\psi_k(Y)} \equiv \pi_k$, say, where

$$\phi_k(j) = E\left( \frac{\sum_{\substack{i_1, i_2 = 1 \\ i_1 \ne i_2}}^{j} Z_{k,i_1} Z_{k,i_2}}{\Big( \sum_{i=1}^{j} Z_{k,i} \Big)\Big( \sum_{i=1}^{j} Z_{k,i} - 1 \Big)}\; I\Big( \sum_{i=1}^{j} Z_{k,i} \ge 1 \Big) \right)$$

and

$$\psi_k(j) = P\Big( \sum_{i=1}^{j} Z_{k,i} \ge 2 \Big)$$

where {Zr,i : r ≥ 0}, i = 1, 2, ··· are i.i.d. copies of a Galton-Watson branching process {Zr :

r ≥ 0} with Z0 = 1 and the given offspring distribution {p j} j≥0 and Y is a random variable with

distribution {b j} j≥1 where

b j ≡ lim P(Zn = j|Zn > 0, Z0 = 1) which exists. n→∞

Further, if $\sum_{j=1}^\infty (j \log j)p_j < \infty$, then $\lim_{k\uparrow\infty} \pi_k = 0$ and hence $n - X_n$ conditioned on $\{Z_n \ge 2\}$ converges to a proper distribution on $\{1, 2, \cdots\}$.

(b) For k ≥ 1, lim P(n − Yn > k|Zn ≥ 1) ≡ π˜ k exists and equals n→∞

$$E\left( \frac{1 - q_k^Y}{m^k} \right) - E\left( \frac{Y q_k^{Y-1}(1 - q_k)}{m^k} \right)$$

where Y is a random variable with distribution

P(Y = j) = b j = lim P(Zn = j|Zn > 0, Z0 = 1) n→∞

and $q_k = P(Z_k = 0 \mid Z_0 = 1)$.

Further, if $\sum_{j=1}^\infty (j \log j)p_j < \infty$, then $\lim_{k\to\infty} \tilde{\pi}_k = 0$.

That is, n−Yn conditioned on {Zn > 0} converges in distribution as n → ∞ to a proper distribution on {1, 2, ···}.

2.5 The Explosive Case

If one considers a rapidly growing population, then two individuals chosen randomly from the nth generation are unlikely to be closely related to each other when n gets large. Theorem 2.1 says that in the supercritical case the coalescence times do go all the way back to the beginning of the tree. Surprisingly, it turns out that, when m = ∞, this is only true for the coalescence time $Y_n$ of the whole population in the nth generation. The coalescence times $X_{n,k}$, $k \ge 2$, turn out to be very close to the present and, in fact, $n - X_{n,k}$, $k \ge 2$, converges in distribution to a proper random variable as $n \to \infty$.

Theorem 2.4. Let $p_0 = 0$ and $m = \sum_{j=1}^\infty j p_j = \infty$. Assume that, for some $0 < \alpha < 1$ and a function $L : (1, \infty) \to (0, \infty)$ slowly varying at $\infty$,

$$\frac{\sum_{j > x} p_j}{x^{-\alpha} L(x)} \to 1 \quad \text{as } x \to \infty.$$

Then

(a) For almost all trees T and k = 1, 2, ··· , as n → ∞,

P(Xn,2 < k|T ) → 0

and

P(n − Xn,2 < k) → π2(k) exists

and π2(k) ↑ 1 as k ↑ ∞.

(b) For any 1 < j < ∞ and k = 1, 2, ···

P(Xn, j < k|T ) → 0 as n → ∞

and P(n − Xn, j < k) → π j(k) exists and π j(k) ↑ 1 as k ↑ ∞.

d (c) Yn −−−→ N(T ) ≡ max{ j : Z j = 1} < ∞ and

$$P(Y_n = k) \to (1 - p_1)p_1^{k-1}, \quad k \ge 1.$$
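As a concrete instance of the hypothesis of Theorem 2.4 (our own example, not from the text), take $p_j = c\,j^{-3/2}$ for $j \ge 1$, with $c = 1/\zeta(3/2)$:

```latex
m = \sum_{j\ge 1} j\,p_j = c \sum_{j\ge 1} j^{-1/2} = \infty,
\qquad
\sum_{j>x} p_j \;\sim\; c \int_x^\infty t^{-3/2}\,dt \;=\; 2c\,x^{-1/2}
\quad (x \to \infty),
```

so the tail condition holds with $\alpha = \tfrac{1}{2}$ and $L(x) \equiv 2c$, which is constant and hence slowly varying.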

CHAPTER 3. COALESCENCE IN DISCRETE-TIME MULTI-TYPE GALTON-WATSON BRANCHING PROCESSES

3.1 Introduction

Throughout this chapter, we consider a d-type ($2 \le d < \infty$) Galton-Watson branching process and adopt all the definitions and notations described in Section 1.3. Let $\{\mathbf{Z}_n\}_{n\ge 0}$ be a discrete-time multi-type Galton-Watson branching process, i.e.,

Zn = (Zn,1, Zn,2, ··· , Zn,d)

is the population vector in the nth generation, $n = 0, 1, 2, \cdots$, where $Z_{n,i}$ is the number of individuals of type i in the nth generation, $1 \le i \le d$. We impose the following assumptions on the process $\{\mathbf{Z}_n\}_{n\ge 0}$:

 1. The branching process Zn n≥0 is a non-singular process, i.e., for every i, the probability that each individual has exactly one offspring of the same type is less than 1.

 2. The branching process Zn n≥0 is a positive regular process. That is, the mean matrix M is strictly (n) positive (there exists an n such that mi j > 0 for all i, j = 1, 2, ··· , d).

Let T denote the full discrete-time multi-type Galton-Watson family tree. Every individual in T can be identified by a finite string $\big( (r_0, i_0), (r_1, i_1), \cdots, (r_n, i_n) \big)$ meaning that this individual is in the nth generation and is the $r_n$th child of type $i_n$ of the individual $\big( (r_0, i_0), (r_1, i_1), \cdots, (r_{n-1}, i_{n-1}) \big)$ in the $(n-1)$st generation.

Let k ≥ 2 be a positive integer. Now, we pick k individuals from the population in the nth generation

(assuming $|\mathbf{Z}_n| \ge k$) by simple random sampling without replacement and trace their lines of descent backward in time until they meet for the first time. Call this common ancestor the last common ancestor or the most recent common ancestor of these k randomly chosen individuals. Let $X_{n,k}$, the coalescence time, be the number of the generation to which the last common ancestor belonged. Then we are interested in the following questions.

(1) What is the distribution of Xn,k?

(2) What happens to Xn,k when n → ∞?

(3) What happens when k → ∞?

(4) What happens to the generation number of the last common ancestor of the whole population in

the nth generation when n gets large?

We have seen the results on the discrete-time single-type Galton-Watson branching process in Chap- ter 2 and we would like to extend those to the multi-type supercritical, critical and subcritical branching processes. Moreover, we are also interested in the questions involving the types:

(5) What is the joint distribution of the type and the generation number of the last common ancestor

and the types of the randomly chosen individuals?

(6) What happens to this joint distribution when n gets large?

We present the results for the supercritical, critical and subcritical cases in Sections 3.2, 3.3 and 3.4, respectively. In Section 3.5, we also investigate the Markov property of the limit law of the types of the ancestors of a randomly chosen individual from the nth generation along its line of descent.

3.2 Results in The Supercritical Case

For a supercritical branching process, we assume that each individual produces at least one offspring w.p.1 upon death, that is, $P(\mathbf{Z}_1 = \mathbf{0} \mid \mathbf{Z}_0 = e_i) = 0$ for any $i = 1, 2, \cdots, d$. Also,

$$E\big( Z_{1,j} \,\big|\, \mathbf{Z}_0 = e_i \big) \equiv m_{ij} < \infty \quad \text{for all } 1 \le i, j \le d.$$

Let ρ be the maximal eigenvalue of M = {mi j : i, j = 1, 2, ··· , d}. 29

3.2.1 The Statements of Results

Theorem 3.1. Let $\rho > 1$, $\mathbf{Z}_0 = e_{i_0}$ and $E\big( \|\mathbf{Z}_1\| \log \|\mathbf{Z}_1\| \,\big|\, \mathbf{Z}_0 = e_i \big) < \infty$ for all $1 \le i \le d$. Then, for $k = 2, 3, \cdots$,

(a) for almost all trees T and r = 1, 2, ··· ,

$$P(X_{n,k} < r \mid \mathcal{T}) \to \phi_k(r, \mathcal{T}) \equiv 1 - \frac{\sum_{i=1}^{|\mathbf{Z}_r|} W_{r,i}^k}{\Big( \sum_{i=1}^{|\mathbf{Z}_r|} W_{r,i} \Big)^k}$$

as $n \to \infty$, where $\{W_{r,i} : i \ge 1, r \ge 1\}$ are i.i.d. copies of $W \equiv \lim_{n\to\infty} \frac{\mathbf{u}\cdot\mathbf{Z}_n}{\rho^n}$ in Theorem 1.6 (a).

(b) there exists a random variable $\tilde{X}_k$ such that $X_{n,k} \xrightarrow{d} \tilde{X}_k$ as $n \to \infty$, where

$$P(\tilde{X}_k < r) \equiv \phi_k(r) = 1 - E\left( \frac{\sum_{i=1}^{|\mathbf{Z}_r|} W_{r,i}^k}{\Big( \sum_{i=1}^{|\mathbf{Z}_r|} W_{r,i} \Big)^k} \right)$$

for any $r = 1, 2, \cdots$.

Remark 3.1. It may be noted that $W \equiv \lim_{n\to\infty} \frac{\mathbf{u}\cdot\mathbf{Z}_n}{\rho^n}$ has the same distribution for all $\mathbf{Z}_0$.

Now an interesting question arises. Since the coalescence time $X_{n,k}$ of k randomly chosen individuals in the nth generation converges in distribution to a proper random variable $\tilde{X}_k$ as $n \to \infty$ for every integer $k \ge 2$, what happens to $\tilde{X}_k$ as $k \to \infty$? The following theorem tells us that $\tilde{X}_k$ also converges in distribution, as $k \to \infty$, to the last generation in which the tree consists of only one individual.

Theorem 3.2. Let $\rho > 1$ and $E(\|\mathbf{Z}_1\| \log \|\mathbf{Z}_1\|) < \infty$. Let $U = \min\{ n \ge 1 : |\mathbf{Z}_n| \ge 2 \}$ be the first time the population exceeds 1. Then $\tilde{X}_k \xrightarrow{d} U - 1$ as $k \to \infty$.

Next, we pick two individuals (i.e., consider k = 2) at random by simple random sampling without replacement from the nth generation and trace their lines of descent backward in time to find their last common ancestor. Let $X_{n,2}$ be the generation number of this common ancestor, $\eta_n$ the type of this common ancestor and $(\zeta_{n,1}, \zeta_{n,2})$ the types of the chosen individuals. The following theorem asserts that the joint distribution of $(X_{n,2}, \eta_n, \zeta_{n,1}, \zeta_{n,2})$ converges as $n \to \infty$ to a proper distribution.

Theorem 3.3. Let $\rho > 1$, $\mathbf{Z}_0 = e_{i_0}$ and $E(\|\mathbf{Z}_1\| \log \|\mathbf{Z}_1\|) < \infty$. Then

$$\lim_{n\to\infty} P(X_{n,2} = r, \eta_n = j, \zeta_{n,1} = i_1, \zeta_{n,2} = i_2) \equiv \varphi_2(r, j, i_1, i_2) \quad \text{exists}$$

and $\sum_{(r, j, i_1, i_2)} \varphi_2(r, j, i_1, i_2) = 1$.

The following extension of Theorem 3.3 is also valid for any integer $k = 2, 3, \cdots$.

Theorem 3.4. Let ρ > 1, Z0 = ei0 and EkZ1k log kZ1k < ∞. Then, for any 2 ≤ k < ∞,

$$\lim_{n\to\infty} P(X_{n,k} = r, \eta_n = j, \zeta_{n,1} = i_1, \zeta_{n,2} = i_2, \cdots, \zeta_{n,k} = i_k) \equiv \varphi_k(r, j, i_1, i_2, \cdots, i_k)$$

exists and $\sum_{(r, j, i_1, i_2, \cdots, i_k)} \varphi_k(r, j, i_1, i_2, \cdots, i_k) = 1$.

3.2.2 The Proof of Theorem 3.1

We need the following lemma for the proofs.

Lemma 3.1. (O’Brien, 1980) Assume W1, W2, ··· are pairwise independent.

$$\frac{\max\{W_1, W_2, \cdots, W_n\}}{\sum_{i=1}^{n} W_i} \to 0 \quad \text{in probability}$$

if and only if $L(x) \equiv E(W : W \le x)$ is slowly varying at $\infty$.
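Lemma 3.1 can be illustrated numerically (our own sketch): for i.i.d. exponential weights, $E(W : W \le x) \to EW = 1$ is slowly varying, the largest summand grows only like $\log n$ while the sum grows like $n$, and the ratio collapses:

```python
import random

def max_over_sum(n, rng):
    """max(W_1, ..., W_n) / (W_1 + ... + W_n) for i.i.d. Exp(1) weights."""
    w = [rng.expovariate(1.0) for _ in range(n)]
    return max(w) / sum(w)

rng = random.Random(7)
r_small = sum(max_over_sum(10, rng) for _ in range(500)) / 500
r_large = sum(max_over_sum(1000, rng) for _ in range(500)) / 500
assert r_large < r_small / 5    # roughly log(n)/n, so it shrinks as n grows
```

This is the mechanism used below in the proof of Theorem 3.1 (b): no single $W_{r,i}$ dominates the sum as $|\mathbf{Z}_r| \to \infty$.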

Now, we begin the proof of Theorem 3.1. Let $\{\mathbf{Z}^{(l)}_{p,i,n-p}\}_{n\ge p}$ be the discrete-time multi-type Galton-Watson branching process initiated by the ith individual of type l in the pth generation.

For any k ≥ 2, we pick k individuals by simple random sampling without replacement from the population in the nth generation and let Xn,k be the generation number of their last common ancestor.

(a) For almost all trees T and r = 1, 2, ··· ,

$$P(X_{n,k} \ge r \mid \mathcal{T}) = \frac{\sum_{l=1}^{d} \sum_{i=1}^{Z_r^{(l)}} |\mathbf{Z}^{(l)}_{r,i,n-r}| \big( |\mathbf{Z}^{(l)}_{r,i,n-r}| - 1 \big) \cdots \big( |\mathbf{Z}^{(l)}_{r,i,n-r}| - k + 1 \big)}{|\mathbf{Z}_n| \big( |\mathbf{Z}_n| - 1 \big) \cdots \big( |\mathbf{Z}_n| - k + 1 \big)}$$

$$= \frac{\sum_{l=1}^{d} \sum_{i=1}^{Z_r^{(l)}} \frac{|\mathbf{Z}^{(l)}_{r,i,n-r}|}{\rho^{n-r}} \cdot \frac{|\mathbf{Z}^{(l)}_{r,i,n-r}| - 1}{\rho^{n-r}} \cdots \frac{|\mathbf{Z}^{(l)}_{r,i,n-r}| - k + 1}{\rho^{n-r}}}{\Big( \sum_{l=1}^{d} \sum_{i=1}^{Z_r^{(l)}} \frac{|\mathbf{Z}^{(l)}_{r,i,n-r}|}{\rho^{n-r}} \Big) \Big( \sum_{l=1}^{d} \sum_{i=1}^{Z_r^{(l)}} \frac{|\mathbf{Z}^{(l)}_{r,i,n-r}|}{\rho^{n-r}} - \frac{1}{\rho^{n-r}} \Big) \cdots \Big( \sum_{l=1}^{d} \sum_{i=1}^{Z_r^{(l)}} \frac{|\mathbf{Z}^{(l)}_{r,i,n-r}|}{\rho^{n-r}} - \frac{k-1}{\rho^{n-r}} \Big)}$$

Since $\rho > 1$ and $E\|\mathbf{Z}_1\|\log\|\mathbf{Z}_1\| < \infty$, by Theorem 1.6 (a), we know that $\frac{|\mathbf{Z}^{(l)}_{r,i,n-r}|}{\rho^{n-r}} \to (\mathbf{1}\cdot\mathbf{v})W_{r,i}$ w.p.1 for any $i = 1, 2, \cdots$, where $\{W_{r,i} : i \ge 1, r \ge 1\}$ are i.i.d. copies of $W$ in Theorem 1.6 (a). So,

$$P(X_{n,k} \ge r \mid \mathcal{T}) \to \frac{\sum_{l=1}^{d} \sum_{i=1}^{Z_r^{(l)}} \big( (\mathbf{1}\cdot\mathbf{v}) W_{r,i} \big)^k}{\Big( \sum_{l=1}^{d} \sum_{i=1}^{Z_r^{(l)}} (\mathbf{1}\cdot\mathbf{v}) W_{r,i} \Big)^k} = \frac{\sum_{i=1}^{|\mathbf{Z}_r|} W_{r,i}^k}{\Big( \sum_{i=1}^{|\mathbf{Z}_r|} W_{r,i} \Big)^k} \equiv 1 - \phi_k(r, \mathcal{T})$$

and hence (a) is proved.

(b) Since $P(X_{n,k} \ge r) = E\big( P(X_{n,k} \ge r \mid \mathcal{T}) \big)$, by the bounded convergence theorem,

$$P(X_{n,k} \ge r) \to E\left( \frac{\sum_{i=1}^{|\mathbf{Z}_r|} W_{r,i}^k}{\Big( \sum_{i=1}^{|\mathbf{Z}_r|} W_{r,i} \Big)^k} \right) \equiv 1 - \phi_k(r) \quad \text{as } n \to \infty$$

for $r = 1, 2, \cdots$.

Moreover, since EkZ1k log kZ1k < ∞, by Kesten and Stigum’s result (1966), EW < ∞. Hence, if

$$L(x) \equiv E(W : W \le x)$$

then

$$\lim_{x\to\infty} L(x) = \lim_{x\to\infty} E(W : W \le x) = EW$$

where 0 < EW < ∞. So, for any 0 < c < ∞,

$$\lim_{x\to\infty} \frac{L(cx)}{L(x)} = 1.$$

That is, the function E(Wr,1 : Wr,1 ≤ x) in x is slowly varying at ∞.

Therefore, by Lemma 3.1,

$$\frac{\max_{1\le i\le n} W_{r,i}}{\sum_{i=1}^{n} W_{r,i}} \to 0 \quad \text{in probability}$$

as n → ∞. So, since |Zr| → ∞ w.p.1 as r → ∞, by the bounded convergence theorem, we have

$$E\left( \frac{\sum_{i=1}^{|\mathbf{Z}_r|} W_{r,i}^k}{\Big( \sum_{i=1}^{|\mathbf{Z}_r|} W_{r,i} \Big)^k} \right) \le E\left( \Big( \frac{\max_{1\le i\le |\mathbf{Z}_r|} W_{r,i}}{\sum_{i=1}^{|\mathbf{Z}_r|} W_{r,i}} \Big)^{k-1} \right) \to 0 \quad \text{as } r \to \infty.$$

Thus, $\phi_k$ is a proper probability distribution function. So, there exists a random variable $\tilde{X}_k$ with $P(\tilde{X}_k < r) = \phi_k(r)$ for any $r \ge 1$ such that $X_{n,k} \xrightarrow{d} \tilde{X}_k$ as $n \to \infty$, and we have completed the proof of Theorem 3.1.

Remark 3.2. Theorem 3.1 (a) and (b) should be valid with just $\rho > 1$. That is, the assumption $E\|\mathbf{Z}_1\|\log\|\mathbf{Z}_1\| < \infty$ can be dropped. This would need Hoppe's result [21] and the result that the function $E(W : W \le x)$ is slowly varying at $\infty$. For the single-type case, this was proved by Athreya and Schuh [4]; it can be adapted to the multi-type case.

3.2.3 The Proof of Theorem 3.2

We prove this theorem in two steps.

Step 1.

Since $U = \min\{ n \ge 1 : |\mathbf{Z}_n| \ge 2 \}$, for almost all trees $\mathcal{T}$ and any $r = 1, 2, \cdots$, we have

$$\phi_k(r, \mathcal{T}) = 1 - \frac{\sum_{i=1}^{|\mathbf{Z}_r|} W_{r,i}^k}{\Big( \sum_{i=1}^{|\mathbf{Z}_r|} W_{r,i} \Big)^k} = \begin{cases} 1 - \dfrac{W_{r,1}^k}{W_{r,1}^k} = 0 & \text{if } r \le U - 1, \\[2ex] 1 - \dfrac{\sum_{i=1}^{|\mathbf{Z}_r|} W_{r,i}^k}{\Big( \sum_{i=1}^{|\mathbf{Z}_r|} W_{r,i} \Big)^k} & \text{if } r \ge U. \end{cases}$$

Also, the assumption that P(Z1 = 0|Z0 = ei) = 0 for all i = 1, 2, ··· , d implies that

$$P\left( 0 < \frac{\max_{1\le i\le N} W_{r,i}}{\sum_{i=1}^{N} W_{r,i}} < 1 \right) = 1$$

for any $N \ge 2$. So, for almost all trees $\mathcal{T}$,

$$\frac{\sum_{i=1}^{|\mathbf{Z}_r|} W_{r,i}^k}{\Big( \sum_{i=1}^{|\mathbf{Z}_r|} W_{r,i} \Big)^k} \le \left( \frac{\max_{1\le i\le |\mathbf{Z}_r|} W_{r,i}}{\sum_{i=1}^{|\mathbf{Z}_r|} W_{r,i}} \right)^{k-1} \to 0 \quad \text{as } k \to \infty$$

and hence, for $r = 1, 2, \cdots$,

$$\lim_{k\to\infty} \phi_k(r, \mathcal{T}) = \begin{cases} 0 & \text{if } r \le U - 1, \\ 1 & \text{if } r \ge U, \end{cases}$$

and Step 1 is proved.

Step 2.

We have that

$$E\left( \frac{\sum_{i=1}^{|\mathbf{Z}_r|} W_{r,i}^k}{\big( \sum_{i=1}^{|\mathbf{Z}_r|} W_{r,i} \big)^k} \right) = E\left( \frac{\sum_{i=1}^{|\mathbf{Z}_r|} W_{r,i}^k}{\big( \sum_{i=1}^{|\mathbf{Z}_r|} W_{r,i} \big)^k}\, I(r \le U - 1) \right) + E\left( \frac{\sum_{i=1}^{|\mathbf{Z}_r|} W_{r,i}^k}{\big( \sum_{i=1}^{|\mathbf{Z}_r|} W_{r,i} \big)^k}\, I(r \ge U) \right)$$

$$= P(r \le U - 1) + E\left( E\left( \frac{\sum_{i=1}^{|\mathbf{Z}_r|} W_{r,i}^k}{\big( \sum_{i=1}^{|\mathbf{Z}_r|} W_{r,i} \big)^k}\, I(r \ge U) \,\Big|\, |\mathbf{Z}_r| \right) \right)$$

Since $\{W_{r,i} : i \ge 1\}$ are i.i.d.,

$$E\left( E\left( \frac{\sum_{i=1}^{|\mathbf{Z}_r|} W_{r,i}^k}{\big( \sum_{i=1}^{|\mathbf{Z}_r|} W_{r,i} \big)^k}\, I(r \ge U) \,\Big|\, |\mathbf{Z}_r| \right) \right) = E\left( |\mathbf{Z}_r| \cdot E\left( \Big( \frac{W_{r,1}}{\sum_{i=1}^{|\mathbf{Z}_r|} W_{r,i}} \Big)^k I(r \ge U) \,\Big|\, |\mathbf{Z}_r| \right) \right)$$

Also, $P\Big( 0 < \frac{W_{r,i}}{\sum_{i=1}^{|\mathbf{Z}_r|} W_{r,i}} < 1 \Big) = 1$ implies that $\Big( \frac{W_{r,i}}{\sum_{i=1}^{|\mathbf{Z}_r|} W_{r,i}} \Big)^k I(r \ge U) \to 0$ w.p.1 as $k \to \infty$, and hence

$$E\left( |\mathbf{Z}_r| \cdot E\left( \Big( \frac{W_{r,1}}{\sum_{i=1}^{|\mathbf{Z}_r|} W_{r,i}} \Big)^k I(r \ge U) \,\Big|\, |\mathbf{Z}_r| \right) \right) \to 0 \quad \text{as } k \to \infty$$

by the bounded convergence theorem.

Therefore,

$$P(\tilde{X}_k < r) = \phi_k(r) = E\big( \phi_k(r, \mathcal{T}) \big) \to 1 - P(r \le U - 1) = P(U - 1 < r)$$

for any $r = 1, 2, \cdots$. So, $\tilde{X}_k \xrightarrow{d} U - 1$ as $k \to \infty$ and the proof is complete.

3.2.4 The Proof of Theorem 3.3

Let $\boldsymbol{\xi}^{(i)}_{n,j} = \big( \xi^{(i)1}_{n,j}, \xi^{(i)2}_{n,j}, \cdots, \xi^{(i)d}_{n,j} \big)$ be the offspring vector of the jth individual of type i in the nth generation. Let $\{\mathbf{Z}^{j(l)}_{p,r,s,n}\}_{n\ge 0}$ be the multi-type Galton-Watson branching process initiated by the sth child of type l of the pth individual of type j in the rth generation. So,

$$\mathbf{Z}^{j(l)}_{p,r,s,n} = \big( Z^{j(l)1}_{p,r,s,n}, Z^{j(l)2}_{p,r,s,n}, \cdots, Z^{j(l)d}_{p,r,s,n} \big), \quad n \ge 0,$$

has the same distribution as $\{\mathbf{Z}_n \mid \mathbf{Z}_0 = e_l\}$ does.

Let An,i be the type of the ancestor in the next generation after the last common ancestor of the ith chosen individual, i = 1, 2. Then

We have

$$P(X_{n,2} = r, \eta_n = j, \zeta_{n,1} = \zeta_{n,2} = i, A_{n,1} = A_{n,2}) = E\Big( P(X_{n,2} = r, \eta_n = j, \zeta_{n,1} = \zeta_{n,2} = i, A_{n,1} = A_{n,2} \mid \mathcal{T}) \Big)$$

$$= E\left( \frac{\sum_{p=1}^{Z_r^{(j)}} \sum_{l=1}^{d} \sum_{\substack{s,t=1 \\ s \ne t}}^{\xi^{(j)l}_{r,p}} Z^{j(l)i}_{p,r,s,n-r-1}\, Z^{j(l)i}_{p,r,t,n-r-1}}{|\mathbf{Z}_n| \big( |\mathbf{Z}_n| - 1 \big)} \right) = E\left( \frac{\sum_{p=1}^{Z_r^{(j)}} \sum_{l=1}^{d} \sum_{\substack{s,t=1 \\ s \ne t}}^{\xi^{(j)l}_{r,p}} \frac{Z^{j(l)i}_{p,r,s,n-r-1}}{\rho^{n-r-1}} \cdot \frac{Z^{j(l)i}_{p,r,t,n-r-1}}{\rho^{n-r-1}}}{\frac{|\mathbf{Z}_n|}{\rho^{n-r-1}} \cdot \frac{|\mathbf{Z}_n| - 1}{\rho^{n-r-1}}} \right)$$

$$\longrightarrow E\left( \frac{\sum_{p=1}^{Z_r^{(j)}} \sum_{l=1}^{d} \sum_{\substack{s,t=1 \\ s \ne t}}^{\xi^{(j)l}_{r,p}} (v_i W_{p,r,s})(v_i W_{p,r,t})}{\Big( \sum_{s=1}^{|\mathbf{Z}_{r+1}|} W_{r+1,s} \Big)^2} \right) = E\left( \frac{v_i^2 \sum_{p=1}^{Z_r^{(j)}} \sum_{l=1}^{d} \sum_{\substack{s,t=1 \\ s \ne t}}^{\xi^{(j)l}_{r,p}} W_{p,r,s} W_{p,r,t}}{\Big( \sum_{s=1}^{|\mathbf{Z}_{r+1}|} W_{r+1,s} \Big)^2} \right)$$

where $\{W_{p,r,s}\}_{s\ge 1}$ and $\{W_{r+1,s}\}_{s\ge 1}$ are i.i.d. copies of the random variable $W$ with $\lim_{n\to\infty} \frac{\mathbf{Z}_n}{\rho^n} = \mathbf{v}W$. Similarly, we have

$$P(X_{n,2} = r, \eta_n = j, \zeta_{n,1} = \zeta_{n,2} = i, A_{n,1} \ne A_{n,2}) = E\left( \frac{\sum_{p=1}^{Z_r^{(j)}} \sum_{\substack{l,q=1 \\ l \ne q}}^{d} \sum_{s=1}^{\xi^{(j)l}_{r,p}} \sum_{t=1}^{\xi^{(j)q}_{r,p}} \frac{Z^{j(l)i}_{p,r,s,n-r-1}}{\rho^{n-r-1}} \cdot \frac{Z^{j(q)i}_{p,r,t,n-r-1}}{\rho^{n-r-1}}}{\frac{|\mathbf{Z}_n|}{\rho^{n-r-1}} \cdot \frac{|\mathbf{Z}_n| - 1}{\rho^{n-r-1}}} \right) \longrightarrow E\left( \frac{v_i^2 \sum_{p=1}^{Z_r^{(j)}} \sum_{\substack{l,q=1 \\ l \ne q}}^{d} \sum_{s=1}^{\xi^{(j)l}_{r,p}} \sum_{t=1}^{\xi^{(j)q}_{r,p}} W_{p,r,s} W_{p,r,t}}{\Big( \sum_{s=1}^{|\mathbf{Z}_{r+1}|} W_{r+1,s} \Big)^2} \right),$$

$$P(X_{n,2} = r, \eta_n = j, i_1 = \zeta_{n,1} \ne \zeta_{n,2} = i_2, A_{n,1} = A_{n,2}) = E\left( \frac{\sum_{p=1}^{Z_r^{(j)}} \sum_{l=1}^{d} \sum_{\substack{s,t=1 \\ s \ne t}}^{\xi^{(j)l}_{r,p}} \frac{Z^{j(l)i_1}_{p,r,s,n-r-1}}{\rho^{n-r-1}} \cdot \frac{Z^{j(l)i_2}_{p,r,t,n-r-1}}{\rho^{n-r-1}}}{\frac{|\mathbf{Z}_n|}{\rho^{n-r-1}} \cdot \frac{|\mathbf{Z}_n| - 1}{\rho^{n-r-1}}} \right) \longrightarrow E\left( \frac{v_{i_1} v_{i_2} \sum_{p=1}^{Z_r^{(j)}} \sum_{l=1}^{d} \sum_{\substack{s,t=1 \\ s \ne t}}^{\xi^{(j)l}_{r,p}} W_{p,r,s} W_{p,r,t}}{\Big( \sum_{s=1}^{|\mathbf{Z}_{r+1}|} W_{r+1,s} \Big)^2} \right)$$

and

$$P(X_{n,2} = r, \eta_n = j, i_1 = \zeta_{n,1} \ne \zeta_{n,2} = i_2, A_{n,1} \ne A_{n,2}) \longrightarrow E\left( \frac{v_{i_1} v_{i_2} \sum_{p=1}^{Z_r^{(j)}} \sum_{\substack{l,q=1 \\ l \ne q}}^{d} \sum_{s=1}^{\xi^{(j)l}_{r,p}} \sum_{t=1}^{\xi^{(j)q}_{r,p}} W_{p,r,s} W_{p,r,t}}{\Big( \sum_{s=1}^{|\mathbf{Z}_{r+1}|} W_{r+1,s} \Big)^2} \right).$$

Therefore, as $n \to \infty$,

$$P(X_{n,2} = r, \eta_n = j, \zeta_{n,1} = i_1, \zeta_{n,2} = i_2) \longrightarrow v_{i_1} v_{i_2}\, E\left( \frac{\sum_{p=1}^{Z_r^{(j)}} \sum_{\substack{s,t=1 \\ s \ne t}}^{|\boldsymbol{\xi}^{(j)}_{r,p}|} W_{p,r,s} W_{p,r,t}}{\Big( \sum_{s=1}^{|\mathbf{Z}_{r+1}|} W_{r+1,s} \Big)^2} \right) \equiv \varphi_2(r, j, i_1, i_2).$$

By Theorem 3.1, we know that $X_{n,2} \xrightarrow{d} \tilde{X}_2$ and hence $\{X_{n,2}\}_{n\ge 0}$ is tight. Also, $\eta_n$, $\zeta_{n,1}$ and $\zeta_{n,2}$ are random variables taking values in the finite set $\{1, 2, \cdots, d\}$. Hence, $\{(X_{n,2}, \eta_n, \zeta_{n,1}, \zeta_{n,2})\}_{n\ge 0}$ is tight and the limit $\varphi_2(r, j, i_1, i_2)$ of $P(X_{n,2} = r, \eta_n = j, \zeta_{n,1} = i_1, \zeta_{n,2} = i_2)$ is a probability distribution. Thus, $\sum_{(r, j, i_1, i_2)} \varphi_2(r, j, i_1, i_2) = 1$ and the proof is complete.

3.3 Results in The Critical Case

Now, we consider a discrete-time multi-type critical Galton-Watson branching process.

We begin with Theorem 3.5 which shows the convergence of some point process constructed from the original branching process and, by using it, we are able to prove the results on the coalescence problem for the multi-type critical branching process.

3.3.1 The Statements of Results

For any $t < n$, let $\big\{ \mathbf{Z}^{(l)}_{t,i,n-t} = \big( Z^{(l)1}_{t,i,n-t}, Z^{(l)2}_{t,i,n-t}, \cdots, Z^{(l)d}_{t,i,n-t} \big) \big\}_{n\ge t}$ be the branching process initiated by the ith individual of type l in the tth generation and let $J_t^{(l)}$ be the set of all $i \in \{1, 2, \cdots, Z_t^{(l)}\}$ such that $|\mathbf{Z}^{(l)}_{t,i,n-t}| > 0$, $l = 1, 2, \cdots, d$.

Theorem 3.5. Let $\rho = 1$ and $E\|\mathbf{Z}_1\|^2 < \infty$. On the event $A_n \equiv \{ |\mathbf{Z}_n| > 0 \}$, for $t < n$, consider the random point process

$$V_n \equiv \left\{ \frac{\mathbf{Z}^{(l)}_{t,i,n-t}}{n - t} \;\Big|\; i \in J_t^{(l)},\ l = 1, 2, \cdots, d \right\}.$$

Let $n \to \infty$, $t \to \infty$ and $\frac{t}{n} \to \alpha$ for $\alpha \in (0, 1)$. Then, conditioned on $A_n$, the distribution of the random point process $V_n$ converges to that of a random point process $V \equiv \{ \mathbf{Y}_i \mid 1 \le i \le N_\alpha \}$, where $\{ \mathbf{Y}_i = (v_1 Y_i, v_2 Y_i, \cdots, v_d Y_i) \}_{i\ge 1}$ are i.i.d. random vectors with $Y_i \sim \exp\big( \frac{1}{\mathbf{v}\cdot Q[\mathbf{u}]} \big)$, $N_\alpha$ is a random variable independent of $\{Y_i\}_{i\ge 1}$ with distribution $P(N_\alpha = j) = (1 - \alpha)\alpha^{j-1}$ for $j \ge 1$, and $Q$ is the quadratic form as defined in (1.1).

Since the two vectors u and v, the left and right eigenvectors of the offspring mean matrix M associated with the maximal eigenvalue ρ, are normalized such that $\mathbf{u}\cdot\mathbf{v} = \mathbf{u}\cdot\mathbf{1} = 1$, an analog, stated in the following corollary, can be obtained along the lines of the proof of Theorem 3.5.

Corollary 3.1. Under the same hypotheses as Theorem 3.5, consider the random point process

$$V_n^0 \equiv \left\{ \frac{|\mathbf{Z}^{(l)}_{t,i,n-t}|}{n - t} \;\Big|\; i \in J_t^{(l)},\ l = 1, 2, \cdots, d \right\}.$$

Let $n \to \infty$, $t \to \infty$ and $\frac{t}{n} \to \alpha$ for $\alpha \in (0, 1)$. Then, conditioned on $A_n$, the distribution of the random point process $V_n^0$ converges to that of a random point process $V^0 \equiv \{ Y_i \mid 1 \le i \le N_\alpha \}$, where $\{Y_i\}_{i\ge 1}$ are i.i.d. exponential random variables with $Y_i \sim \exp\big( \frac{1}{\mathbf{v}\cdot Q[\mathbf{u}]} \big)$ and $N_\alpha$ is a random variable independent of $\{Y_i\}_{i\ge 1}$ with distribution $P(N_\alpha = j) = (1 - \alpha)\alpha^{j-1}$ for $j \ge 1$.

Next, we move on to the coalescence problem for the critical branching process. Let k ≥ 2 be an integer. Pick k individuals at random from the nth generation (by simple random sampling without replacement) and trace their lines of descent backward in time to find their last common ancestor. Let $X_{n,k}$ be the generation number of this common ancestor.

Theorem 3.6. Let $\rho = 1$ and $E\|\mathbf{Z}_1\|^2 < \infty$. Then, for $k = 2, 3, \cdots$, there exists a random variable $\tilde{X}_k$ such that $\Big( \frac{X_{n,k}}{n} \,\Big|\, |\mathbf{Z}_n| \ge k \Big) \xrightarrow{d} \tilde{X}_k$ as $n \to \infty$ and, for any $\alpha \in (0, 1)$,

$$P(\tilde{X}_k < \alpha) = 1 - E\big( \phi_k(N_\alpha) \big) \equiv H_k(\alpha)$$

where $\phi_k(x) = E\left( \frac{\sum_{i=1}^{x} Y_i^k}{\big( \sum_{i=1}^{x} Y_i \big)^k} \right)$, and $\{Y_i\}_{i\ge 1}$ and $N_\alpha$ are as defined in Theorem 3.5.

Theorem 3.6 tells us that the generation number of the last common ancestor of any finite number of individuals randomly chosen from the population of the nth generation grows like n. That is, the coalescence time $X_{n,k}$ is close neither to the beginning of the tree nor to the present when n gets large. This result is consistent with what we have seen in the discrete-time single-type Galton-Watson branching process.

Now, we trace the lines of descent of all the individuals in the nth generation backward in time until they meet. Let $T_n$ be the coalescence time of the whole population of the nth generation (we also call $T_n$ the total coalescence time of all the individuals in the nth generation). The asymptotic behavior of the total coalescence time $T_n$ in the multi-type critical branching process offers no new surprises. As in the one-dimensional case, when we condition on non-extinction and normalize by dividing by the generation number, the limit distribution again is uniform on (0, 1).

Theorem 3.7. Let $\rho = 1$ and $E\|\mathbf{Z}_1\|^2 < \infty$. Then there exists a random variable $\tilde{T}$ such that $\Big( \frac{T_n}{n} \,\Big|\, |\mathbf{Z}_n| > 0 \Big) \xrightarrow{d} \tilde{T}$ as $n \to \infty$, where $\tilde{T}$ has a uniform distribution on (0, 1).

3.3.2 The Proof of Theorem 3.5

To prove the convergence of the random point processes $\{V_n\}$, we first consider the Laplace functional of the process $V_n$:

$$\varphi_n(\theta_1, \theta_2, \cdots, \theta_d, f_1, f_2, \cdots, f_d) \equiv E\left( e^{-\sum_{l=1}^{d} \sum_{i\in J_t^{(l)}} \sum_{p=1}^{d} \theta_p f_p\big( \frac{Z^{(l)p}_{t,i,n-t}}{n-t} \big)} \;\Big|\; |\mathbf{Z}_n| > 0,\ \mathbf{Z}_0 = e_{i_0} \right)$$

where $\theta_1, \theta_2, \cdots, \theta_d > 0$ and $f_1, f_2, \cdots, f_d : \mathbb{R}^+ \to \mathbb{R}^+$ are bounded and continuous functions.

Let $Y_{n,t} = e^{-\sum_{l=1}^{d} \sum_{i\in J_t^{(l)}} \sum_{p=1}^{d} \theta_p f_p\big( \frac{Z^{(l)p}_{t,i,n-t}}{n-t} \big)}$. Then

$$P\big( |\mathbf{Z}_n| > 0 \,\big|\, \mathbf{Z}_0 = e_{i_0} \big)\, E\big( Y_{n,t} \,\big|\, |\mathbf{Z}_n| > 0, \mathbf{Z}_0 = e_{i_0} \big) = E\big( Y_{n,t} I_{\{|\mathbf{Z}_n|>0\}} \,\big|\, \mathbf{Z}_0 = e_{i_0} \big) = E\Big( E\big( Y_{n,t} I_{\{|\mathbf{Z}_n|>0\}} \,\big|\, \mathbf{Z}_j, j \le t \big) \,\Big|\, \mathbf{Z}_0 = e_{i_0} \Big)$$

By the Markov property,

$$E\big( Y_{n,t} I_{\{|\mathbf{Z}_n|>0\}} \,\big|\, \mathbf{Z}_j, j \le t \big) = E\big( Y_{n,t} I_{\{|\mathbf{Z}_n|>0\}} \,\big|\, \mathbf{Z}_t \big) = E\big( Y_{n,t} I_{\{|\mathbf{Z}_n|>0\}} I_{\{|\mathbf{Z}_t|>0\}} \,\big|\, \mathbf{Z}_t \big)$$
$$= E\big( Y_{n,t} \,\big|\, \mathbf{Z}_t \big) I_{\{|\mathbf{Z}_t|>0\}} - E\big( Y_{n,t} I_{\{|\mathbf{Z}_n|=0\}} \,\big|\, \mathbf{Z}_t \big) I_{\{|\mathbf{Z}_t|>0\}}$$
$$= g^{(1)}_{n-t}(\theta)^{Z_t^{(1)}} g^{(2)}_{n-t}(\theta)^{Z_t^{(2)}} \cdots g^{(d)}_{n-t}(\theta)^{Z_t^{(d)}}\, I_{\{|\mathbf{Z}_t|>0\}} - \big( q^{(1)}_{n-t} \big)^{Z_t^{(1)}} \big( q^{(2)}_{n-t} \big)^{Z_t^{(2)}} \cdots \big( q^{(d)}_{n-t} \big)^{Z_t^{(d)}}\, I_{\{|\mathbf{Z}_t|>0\}}$$

where $g^{(l)}_j(\theta) = E\Big( e^{-\sum_{p=1}^{d} \theta_p f_p\big( \frac{Z_j^{(p)}}{j} \big) I_{\{|\mathbf{Z}_j|>0\}}} \,\Big|\, \mathbf{Z}_0 = e_l \Big)$ and $q^{(l)}_j = P\big( |\mathbf{Z}_j| = 0 \,\big|\, \mathbf{Z}_0 = e_l \big)$ for $j \ge 1$.

Note that

$$g^{(l)}_j(\theta) = E\Big( e^{-\sum_{p=1}^{d} \theta_p f_p( \frac{Z_j^{(p)}}{j} ) I_{\{|\mathbf{Z}_j|>0\}}} \,\Big|\, \mathbf{Z}_0 = e_l \Big)$$
$$= E\Big( e^{-\sum_{p=1}^{d} \theta_p f_p( \frac{Z_j^{(p)}}{j} ) I_{\{|\mathbf{Z}_j|>0\}}} \,\Big|\, |\mathbf{Z}_j| = 0, \mathbf{Z}_0 = e_l \Big) P\big( |\mathbf{Z}_j| = 0 \,\big|\, \mathbf{Z}_0 = e_l \big) + E\Big( e^{-\sum_{p=1}^{d} \theta_p f_p( \frac{Z_j^{(p)}}{j} ) I_{\{|\mathbf{Z}_j|>0\}}} \,\Big|\, |\mathbf{Z}_j| > 0, \mathbf{Z}_0 = e_l \Big) P\big( |\mathbf{Z}_j| > 0 \,\big|\, \mathbf{Z}_0 = e_l \big)$$
$$= q^{(l)}_j + \big( 1 - q^{(l)}_j \big)\, E\Big( e^{-\sum_{p=1}^{d} \theta_p f_p( \frac{Z_j^{(p)}}{j} )} \,\Big|\, |\mathbf{Z}_j| > 0, \mathbf{Z}_0 = e_l \Big).$$

Let $\tilde{g}_j^{(l)}(\theta) = E\Big( e^{-\sum_{p=1}^{d} \theta_p f_p\left(Z_j^{(p)}/j\right)} \,\Big|\, |Z_j| > 0, Z_0 = e_l \Big)$.
It is known that in the critical case, i.e., $\rho = 1$, if $E\|Z_1\|^2 < \infty$, then, as $j \to \infty$,

\[
j\big(1 - q_j^{(l)}\big) = j\, P\big(|Z_j| > 0 \,\big|\, Z_0 = e_l\big) \to \frac{u_l}{v \cdot Q[u]}
\]
and
\[
\frac{Z_j}{j} \,\Big|\, |Z_j| > 0, Z_0 = e_l \;\xrightarrow{d}\; vY,
\]
where $Y \sim \exp\!\big(\frac{1}{v \cdot Q[u]}\big)$. Since $f_1, f_2, \cdots, f_d$ are bounded and continuous, as $j \to \infty$, we have

\[
\tilde{g}_j^{(l)}(\theta) \to E\Big( e^{-\sum_{p=1}^{d} \theta_p f_p(v_p Y)} \Big) \equiv g(\theta) = \frac{1}{v \cdot Q[u]} \int_0^{\infty} e^{-\sum_{p=1}^{d} \theta_p f_p(v_p y)}\, e^{-\frac{y}{v \cdot Q[u]}}\, dy.
\]

Also, we have that

\[
g_j^{(l)}(\theta) = q_j^{(l)} + \big(1 - q_j^{(l)}\big)\, \tilde{g}_j^{(l)}(\theta) = 1 + \big(1 - q_j^{(l)}\big)\big(\tilde{g}_j^{(l)}(\theta) - 1\big)
\]
and so, as $j \to \infty$,

\[
\big(g_j^{(l)}(\theta)\big)^j = \left( 1 + \frac{j\big(1 - q_j^{(l)}\big)\big(\tilde{g}_j^{(l)}(\theta) - 1\big)}{j} \right)^{\!j} \to e^{\frac{u_l}{v \cdot Q[u]}(g(\theta)-1)}.
\]

Now, consider the quantity
\[
\frac{E\Big( \big(g^{(1)}_{n-t}(\theta)\big)^{Z_t^{(1)}} \big(g^{(2)}_{n-t}(\theta)\big)^{Z_t^{(2)}} \cdots \big(g^{(d)}_{n-t}(\theta)\big)^{Z_t^{(d)}} \,\Big|\, Z_0 = e_{i_0} \Big)}{P\big(|Z_t| > 0 \,\big|\, Z_0 = e_{i_0}\big)}
= \frac{E\left( \prod_{l=1}^{d} \Big[ \big(g^{(l)}_{n-t}(\theta)\big)^{n-t} \Big]^{\frac{t}{n-t}\cdot\frac{Z_t^{(l)}}{t}} \,\Bigg|\, Z_0 = e_{i_0} \right)}{P\big(|Z_t| > 0 \,\big|\, Z_0 = e_{i_0}\big)}
\]

and, if $n \to \infty$, $t \to \infty$ and $\frac{t}{n} \to \alpha$, $0 < \alpha < 1$, then it converges to

\begin{align*}
E\Big( e^{\frac{u_1}{v\cdot Q[u]}(g(\theta)-1)\frac{\alpha}{1-\alpha} v_1 Y}\, e^{\frac{u_2}{v\cdot Q[u]}(g(\theta)-1)\frac{\alpha}{1-\alpha} v_2 Y} \cdots e^{\frac{u_d}{v\cdot Q[u]}(g(\theta)-1)\frac{\alpha}{1-\alpha} v_d Y} \Big)
&= E\Big( e^{\frac{u \cdot v}{v\cdot Q[u]}(g(\theta)-1)\frac{\alpha}{1-\alpha} Y} \Big) \\
&= E\Big( e^{\frac{1}{v\cdot Q[u]}(g(\theta)-1)\frac{\alpha}{1-\alpha} Y} \Big) \\
&= \frac{1}{v\cdot Q[u]} \int_0^\infty e^{\frac{1}{v\cdot Q[u]}(g(\theta)-1)\frac{\alpha}{1-\alpha} y}\, e^{-\frac{y}{v\cdot Q[u]}}\, dy \\
&= \frac{1}{v\cdot Q[u]} \int_0^\infty e^{-\frac{1}{v\cdot Q[u]}\left(1-(g(\theta)-1)\frac{\alpha}{1-\alpha}\right) y}\, dy \\
&= \frac{1}{1-(g(\theta)-1)\frac{\alpha}{1-\alpha}} \\
&= \frac{1-\alpha}{1-\alpha\, g(\theta)}.
\end{align*}
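The integral evaluation above can be checked numerically. The sketch below plugs in hypothetical values for $\alpha$, for the Laplace-functional limit $g = g(\theta) \in (0,1)$, and for the constant $c = v \cdot Q[u]$, and compares a midpoint-rule approximation of the integral with the closed form $(1-\alpha)/(1-\alpha g)$.

```python
import math

# Hypothetical numerical values for alpha, g(theta) and c = v.Q[u].
alpha, g, c = 0.4, 0.6, 1.3

def lhs(alpha, g, c, upper=200.0, steps=200000):
    """Midpoint-rule approximation of
    (1/c) * Integral_0^inf exp((g-1)*(alpha/(1-alpha))*y/c - y/c) dy."""
    beta = alpha / (1 - alpha)
    h = upper / steps
    total = 0.0
    for k in range(steps):
        y = (k + 0.5) * h
        total += math.exp((g - 1) * beta * y / c - y / c)
    return total * h / c

approx = lhs(alpha, g, c)
exact = (1 - alpha) / (1 - alpha * g)   # the closed form derived above
print(abs(approx - exact) < 1e-4)       # the two values agree
```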

On the other hand, consider the following:
\[
\frac{E\Big( \big(q^{(1)}_{n-t}\big)^{Z_t^{(1)}} \big(q^{(2)}_{n-t}\big)^{Z_t^{(2)}} \cdots \big(q^{(d)}_{n-t}\big)^{Z_t^{(d)}} \,\Big|\, Z_0 = e_{i_0} \Big)}{P\big(|Z_t| > 0 \,\big|\, Z_0 = e_{i_0}\big)}
= \frac{E\left( \prod_{l=1}^{d} \Big[ \Big(1 - \big(1 - q^{(l)}_{n-t}\big)\Big)^{n-t} \Big]^{\frac{t}{n-t}\cdot\frac{Z_t^{(l)}}{t}} \,\Bigg|\, Z_0 = e_{i_0} \right)}{P\big(|Z_t| > 0 \,\big|\, Z_0 = e_{i_0}\big)}
\]

and if $n \to \infty$, $t \to \infty$ and $\frac{t}{n} \to \alpha$, $0 < \alpha < 1$, then it converges to
\begin{align*}
E\Big( e^{-\frac{u_1}{v\cdot Q[u]}\frac{\alpha}{1-\alpha} v_1 Y}\, e^{-\frac{u_2}{v\cdot Q[u]}\frac{\alpha}{1-\alpha} v_2 Y} \cdots e^{-\frac{u_d}{v\cdot Q[u]}\frac{\alpha}{1-\alpha} v_d Y} \Big)
&= E\Big( e^{-\frac{u \cdot v}{v\cdot Q[u]}\frac{\alpha}{1-\alpha} Y} \Big) = E\Big( e^{-\frac{1}{v\cdot Q[u]}\frac{\alpha}{1-\alpha} Y} \Big) \\
&= \frac{1}{v\cdot Q[u]} \int_0^\infty e^{-\frac{1}{v\cdot Q[u]}\frac{\alpha}{1-\alpha} y}\, e^{-\frac{y}{v\cdot Q[u]}}\, dy \\
&= \frac{1}{v\cdot Q[u]} \int_0^\infty e^{-\frac{1}{v\cdot Q[u]}\left(\frac{\alpha}{1-\alpha}+1\right) y}\, dy \\
&= \frac{1}{\frac{\alpha}{1-\alpha}+1} = 1 - \alpha.
\end{align*}
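The analogous integral for the $q$-part can be checked the same way. With hypothetical values of $\alpha$ and $c = v \cdot Q[u]$, the quantity $(1/c)\int_0^\infty e^{-(\alpha/(1-\alpha)+1)y/c}\,dy$ evaluates numerically to $1 - \alpha$:

```python
import math

alpha, c = 0.3, 2.0           # hypothetical values of alpha and c = v.Q[u]
beta = alpha / (1 - alpha)

# Midpoint rule for (1/c) * Integral_0^inf exp(-(beta + 1) * y / c) dy.
upper, steps = 200.0, 200000
h = upper / steps
total = sum(math.exp(-(beta + 1) * (k + 0.5) * h / c) for k in range(steps))
approx = total * h / c
print(abs(approx - (1 - alpha)) < 1e-4)   # the integral equals 1 - alpha
```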

Moreover, by Theorem 1.8 (a), we know that

\[
\frac{P\big(|Z_t| > 0 \,\big|\, Z_0 = e_{i_0}\big)}{P\big(|Z_n| > 0 \,\big|\, Z_0 = e_{i_0}\big)} = \frac{n}{t} \cdot \frac{t\, P\big(|Z_t| > 0 \,\big|\, Z_0 = e_{i_0}\big)}{n\, P\big(|Z_n| > 0 \,\big|\, Z_0 = e_{i_0}\big)} \to \frac{1}{\alpha} \cdot \frac{u_{i_0}/(v \cdot Q[u])}{u_{i_0}/(v \cdot Q[u])} = \frac{1}{\alpha}
\]
as $t, n \to \infty$ with $t/n \to \alpha$.
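The asymptotics $t\,P(|Z_t|>0) \to u_{i_0}/(v \cdot Q[u])$ used here are the multi-type form of Kolmogorov's estimate $n\,P(|Z_n|>0) \to 2/\sigma^2$ for a single-type critical process. As a hedged one-dimensional illustration, assume the hypothetical offspring law $P(0) = P(2) = 1/2$, whose generating function is $f(s) = (1+s^2)/2$ and whose offspring variance is $\sigma^2 = 1$; the survival probability then obeys the exact recursion $s_{n+1} = 1 - f(1 - s_n) = s_n - s_n^2/2$, which can be iterated directly:

```python
# Exact recursion for s_n = P(Z_n > 0) under the critical offspring law
# P(0 children) = P(2 children) = 1/2, with f(s) = (1 + s^2)/2:
#   s_{n+1} = 1 - f(1 - s_n) = s_n - s_n^2 / 2.
s = 1.0                      # s_0 = P(Z_0 > 0) = 1
n = 100000
for _ in range(n):
    s = s - s * s / 2.0
print(round(n * s, 3))       # n * P(Z_n > 0) approaches 2/sigma^2 = 2
```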

Hence,

\[
\varphi_n(\theta_1, \theta_2, \cdots, \theta_d, f_1, f_2, \cdots, f_d) = E\big( Y_{n,t} \,\big|\, |Z_n| > 0, Z_0 = e_{i_0} \big)
\]

\begin{align*}
&= \left[ \frac{E\Big( \prod_{l=1}^{d}\big(g^{(l)}_{n-t}(\theta)\big)^{Z^{(l)}_t} \,\Big|\, Z_0 = e_{i_0} \Big)}{P\big(|Z_t|>0 \,\big|\, Z_0 = e_{i_0}\big)} - \frac{E\Big( \prod_{l=1}^{d}\big(q^{(l)}_{n-t}\big)^{Z^{(l)}_t} \,\Big|\, Z_0 = e_{i_0} \Big)}{P\big(|Z_t|>0 \,\big|\, Z_0 = e_{i_0}\big)} \right] \cdot \frac{P\big(|Z_t|>0 \,\big|\, Z_0 = e_{i_0}\big)}{P\big(|Z_n|>0 \,\big|\, Z_0 = e_{i_0}\big)} \\
&\to \left[ \frac{1-\alpha}{1-\alpha\, g(\theta)} - (1-\alpha) \right] \cdot \frac{1}{\alpha} \\
&= \frac{(1-\alpha)\, g(\theta)}{1 - \alpha\, g(\theta)} = \sum_{j=0}^{\infty} (1-\alpha)\, \alpha^j g(\theta)^{j+1} = \sum_{j=1}^{\infty} (1-\alpha)\, \alpha^{j-1} g(\theta)^j.
\end{align*}
Let $V \equiv \{ \mathbf{Y}_i : 1 \le i \le N_\alpha \}$, where $\mathbf{Y}_i = (v_1 Y_i, v_2 Y_i, \cdots, v_d Y_i)$, $\{Y_i\}_{i\ge1}$ are i.i.d. random variables with $Y_i \sim \exp\!\big(\frac{1}{v \cdot Q[u]}\big)$, and $N_\alpha$ is a random variable, independent of $\{Y_i\}_{i\ge1}$, with distribution $P(N_\alpha = j) = (1-\alpha)\alpha^{j-1}$ for $j \ge 1$. Then, for any $\theta_1, \theta_2, \cdots, \theta_d > 0$ and any bounded, nonnegative and continuous functions $f_1, f_2, \cdots, f_d$, the Laplace functional of $V$ is

\[
E\Big( e^{-\sum_{i=1}^{N_\alpha} \sum_{p=1}^{d} \theta_p f_p\left(Y_i^{(p)}\right)} \Big) = \sum_{j=1}^{\infty} (1-\alpha)\, \alpha^{j-1} g(\theta)^j.
\]
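The geometric series just obtained can be verified numerically for hypothetical values of $\alpha$ and of the limit $g = g(\theta) \in (0,1)$:

```python
# Hypothetical values of alpha and of the Laplace-functional limit g(theta).
alpha, g = 0.35, 0.7

# Partial sum of sum_{j>=1} (1-alpha) * alpha^(j-1) * g^j; the tail beyond
# j = 200 is of order (alpha*g)^200 and numerically negligible.
series = sum((1 - alpha) * alpha ** (j - 1) * g ** j for j in range(1, 200))
closed = (1 - alpha) * g / (1 - alpha * g)
print(abs(series - closed) < 1e-12)
```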

Therefore, for any $\alpha \in (0, 1)$, by the continuous mapping theorem for random measures (see Kallenberg [25]), the sequence of random point processes
\[
V_n \equiv \left\{ \frac{Z^{(l)}_{t,i,n-t}}{n-t} : i \in J_t^{(l)},\; l = 1, 2, \cdots, d \right\}, \quad n \ge 1,
\]
conditioned on $\{|Z_n| > 0, Z_0 = e_{i_0}\}$, converges in distribution to the random point process $V \equiv \{\mathbf{Y}_i : 1 \le i \le N_\alpha\}$ as $n, t \to \infty$, $\frac{t}{n} \to \alpha$. The proof is complete.

3.3.3 The Proof of Theorem 3.6

Now, we are going to prove the convergence in distribution of the generation number $X_{n,k}$ of the last common ancestor of $k$ individuals randomly chosen from the population in the $n$th generation.
First, conditioned on the set $\{|Z_n| \ge 2\}$, for almost all trees $\mathcal{T}$ and any integer $r = 1, 2, \cdots$, we have that

\[
P\big( X_{n,k} \ge r \,\big|\, \mathcal{T} \big) = \frac{\sum_{l=1}^{d} \sum_{i=1}^{Z_r^{(l)}} |Z^{(l)}_{r,i,n-r}| \big(|Z^{(l)}_{r,i,n-r}|-1\big) \cdots \big(|Z^{(l)}_{r,i,n-r}|-k+1\big)}{|Z_n|\big(|Z_n|-1\big) \cdots \big(|Z_n|-k+1\big)}.
\]

Hence, for any $\alpha \in (0, 1)$, let $r = [n\alpha] + 1$. Then

\begin{align*}
P\Big( \frac{X_{n,k}}{n} < \alpha \,\Big|\, |Z_n| \ge 2 \Big)
&= P\big( X_{n,k} < n\alpha \,\big|\, |Z_n| \ge 2 \big) \\
&= 1 - P\big( X_{n,k} \ge r \,\big|\, |Z_n| \ge 2 \big) \\
&= 1 - E\left( \frac{\sum_{l=1}^{d}\sum_{i=1}^{Z_r^{(l)}} |Z^{(l)}_{r,i,n-r}|\big(|Z^{(l)}_{r,i,n-r}|-1\big)\cdots\big(|Z^{(l)}_{r,i,n-r}|-k+1\big)}{|Z_n|\big(|Z_n|-1\big)\cdots\big(|Z_n|-k+1\big)} \,\Bigg|\, |Z_n| \ge 2 \right).
\end{align*}
Expanding the falling factorials as $x(x-1)\cdots(x-k+1) = x^k + \sum_{s=1}^{k-1}(-1)^s \big( \sum_{1\le q_1<\cdots<q_s\le k-1} q_1 q_2 \cdots q_s \big) x^{k-s}$ and writing the conditioning on $\{|Z_n| \ge 2\}$ explicitly, this equals
\begin{align*}
&= 1 - \frac{1}{P(|Z_n| \ge 2 \mid |Z_n| > 0)}\, E\left( \frac{\sum_{l=1}^{d}\sum_{i \in J_r^{(l)}} \Big(\frac{|Z_{r,i,n-r}|}{n-r}\Big)^{k}}{\Big(\sum_{l=1}^{d}\sum_{i \in J_r^{(l)}} \frac{|Z_{r,i,n-r}|}{n-r}\Big)^{k}}\; I_{\{|Z_n|\ge2\}} \,\Bigg|\, |Z_n| > 0 \right) \\
&\quad + \frac{1}{P(|Z_n| \ge 2 \mid |Z_n| > 0)}\, E\left( \Bigg[ \frac{\sum_{l=1}^{d}\sum_{i=1}^{Z_r^{(l)}} |Z^{(l)}_{r,i,n-r}|^k}{|Z_n|^k} - \frac{\sum_{l=1}^{d}\sum_{i=1}^{Z_r^{(l)}} |Z^{(l)}_{r,i,n-r}|^k}{|Z_n|\big(|Z_n|-1\big)\cdots\big(|Z_n|-k+1\big)} \Bigg] I_{\{|Z_n|\ge2\}} \,\Bigg|\, |Z_n| > 0 \right) \\
&\quad - \frac{1}{P(|Z_n| \ge 2 \mid |Z_n| > 0)} \sum_{s=1}^{k-1} (-1)^s \!\!\sum_{1 \le q_1 < \cdots < q_s \le k-1}\!\! q_1 \cdots q_s\; E\left( \frac{\sum_{l=1}^{d}\sum_{i=1}^{Z_r^{(l)}} |Z^{(l)}_{r,i,n-r}|^{k-s}}{|Z_n|\big(|Z_n|-1\big)\cdots\big(|Z_n|-k+1\big)}\; I_{\{|Z_n|\ge2\}} \,\Bigg|\, |Z_n| > 0 \right).
\end{align*}
Since, on the event $\{|Z_n| > 0\}$, we have that

\[
0 \le \frac{\sum_{l=1}^{d}\sum_{i=1}^{Z_r^{(l)}} |Z^{(l)}_{r,i,n-r}|^k}{|Z_n|^k} \le 1
\quad\text{and}\quad
0 \le \frac{\sum_{l=1}^{d}\sum_{i=1}^{Z_r^{(l)}} |Z^{(l)}_{r,i,n-r}|^{k-s}}{|Z_n|\big(|Z_n|-1\big)\cdots\big(|Z_n|-k+1\big)} \le \frac{|Z_n|^{k-s}}{|Z_n|\big(|Z_n|-1\big)\cdots\big(|Z_n|-k+1\big)}, \quad s = 0, 1, \cdots, k-1,
\]
and, on $\{|Z_n| \ge k\}$, the right-hand ratios are bounded uniformly in $n$ (by $k^k/k!$), thus, by the bounded convergence theorem, as $n \to \infty$,

\[
E\left( \Bigg[ \frac{\sum_{l=1}^{d}\sum_{i=1}^{Z_r^{(l)}} |Z^{(l)}_{r,i,n-r}|^k}{|Z_n|\big(|Z_n|-1\big)\cdots\big(|Z_n|-k+1\big)} - \frac{\sum_{l=1}^{d}\sum_{i=1}^{Z_r^{(l)}} |Z^{(l)}_{r,i,n-r}|^k}{|Z_n|^k} \Bigg] I_{\{|Z_n|\ge2\}} \,\Bigg|\, |Z_n| > 0 \right) \to 0
\]
and
\[
E\left( \frac{\sum_{l=1}^{d}\sum_{i=1}^{Z_r^{(l)}} |Z^{(l)}_{r,i,n-r}|^{k-s}}{|Z_n|\big(|Z_n|-1\big)\cdots\big(|Z_n|-k+1\big)}\; I_{\{|Z_n|\ge2\}} \,\Bigg|\, |Z_n| > 0 \right) \to 0 \quad \text{for } s = 1, 2, \cdots, k-1.
\]

It is also known that, in the critical case, $\big(Z_n/n \,\big|\, |Z_n| > 0\big) \xrightarrow{d} vY$, where $Y$ is exponentially distributed with parameter $\frac{1}{v \cdot Q[u]}$, so

\[
P\big(|Z_n| \ge 2 \,\big|\, |Z_n| > 0\big) \to 1 \quad \text{as } n \to \infty.
\]

Therefore, by the continuous mapping theorem and Theorem 3.5,

\[
P\Big( \frac{X_{n,k}}{n} < \alpha \,\Big|\, |Z_n| > 0 \Big) \to 1 - E\left( \frac{\sum_{i=1}^{N_\alpha} Y_i^k}{\big(\sum_{i=1}^{N_\alpha} Y_i\big)^k} \right) \equiv H_k(\alpha) \quad \text{as } n \to \infty.
\]

Let $\varphi_k(x) = E\left( \dfrac{\sum_{i=1}^{x} Y_i^k}{\big(\sum_{i=1}^{x} Y_i\big)^k} \right)$. Since $EY_1 < \infty$, we have that $\varphi_k(x) \to 0$ as $x \to \infty$. Also,

\[
\lim_{\alpha \to 1} P(N_\alpha = x) = \lim_{\alpha \to 1} (1-\alpha)\alpha^{x-1} = 0 \quad \text{for any } x \ge 1.
\]

So, $N_\alpha \to \infty$ as $\alpha \to 1$. By the bounded convergence theorem again,

\[
E\big(\varphi_k(N_\alpha)\big) = E\left( \frac{\sum_{i=1}^{N_\alpha} Y_i^k}{\big(\sum_{i=1}^{N_\alpha} Y_i\big)^k} \right) \downarrow 0 \quad \text{as } \alpha \to 1,
\]
and hence $H_k(\alpha) = 1 - E\big(\varphi_k(N_\alpha)\big) \uparrow 1$ as $\alpha \to 1$. Moreover, $H_k(0) = 0$. Therefore, $H_k$ is a proper probability distribution and hence there exists a random variable $\tilde{X}_k$ with $P(\tilde{X}_k \le \alpha) = H_k(\alpha)$ for $\alpha \in (0, 1)$ such that

\[
\frac{X_{n,k}}{n} \,\Big|\, |Z_n| > 0 \;\xrightarrow{d}\; \tilde{X}_k \quad \text{as } n \to \infty.
\]

This completes the proof of Theorem 3.6.

3.3.4 The Proof of Theorem 3.7

At the end of this section, we prove the convergence in distribution, as $n \to \infty$, of the total coalescence time $T_n$ normalized by dividing by the generation number $n$. For any $\alpha \in (0, 1)$ and any $n \in \mathbb{N}$, let $r = [n\alpha] + 1$.

Let $Z^{(l)}_{r,i,n-r}$ be the $d$-type Galton-Watson branching process initiated by the $i$th individual of type $l$ in the $r$th generation, where $i = 1, 2, \cdots, Z_r^{(l)}$ and $l = 1, 2, \cdots, d$.
The event $\{T_n \ge r\}$ for $1 \le r \le n$, conditioned on $\{|Z_n| > 0\}$, occurs if and only if all the individuals in the $n$th generation come from the $(n-r)$th generation of the tree initiated by exactly one individual in the $r$th generation. That is, $|Z^{(l)}_{r,i,n-r}| = 0$ for all but one pair $(l, i)$ with $l \in \{1, 2, \cdots, d\}$ and $i \in \{1, 2, \cdots, Z_r^{(l)}\}$. Then, for almost all trees $\mathcal{T}$,

\[
P\big(T_n \ge r \,\big|\, \mathcal{T}\big) = \sum_{l=1}^{d} \sum_{i=1}^{Z_r^{(l)}} P\big(|Z^{(l)}_{r,i,n-r}| > 0\big) \cdot \prod_{j \ne i} P\big(|Z^{(l)}_{r,j,n-r}| = 0\big) \cdot \prod_{p \ne l} \prod_{j=1}^{Z_r^{(p)}} P\big(|Z^{(p)}_{r,j,n-r}| = 0\big).
\]
Hence,

\begin{align*}
P\Big(\frac{T_n}{n} > \alpha \,\Big|\, |Z_n| > 0\Big)
&= P\big(T_n > n\alpha \,\big|\, |Z_n| > 0\big) \\
&= P\big(T_n \ge r \,\big|\, |Z_n| > 0\big) \\
&= E\left( \sum_{l=1}^{d}\sum_{i=1}^{Z_r^{(l)}} P\big(|Z^{(l)}_{r,i,n-r}|>0\big) \prod_{j\ne i} P\big(|Z^{(l)}_{r,j,n-r}|=0\big) \prod_{p\ne l}\prod_{j=1}^{Z_r^{(p)}} P\big(|Z^{(p)}_{r,j,n-r}|=0\big) \,\Bigg|\, |Z_n|>0 \right) \\
&= \frac{1}{P(|Z_n|>0)}\, E\big( [\cdots]\; I_{\{|Z_n|>0\}} \big)
= \frac{1}{P(|Z_n|>0)}\, E\big( [\cdots]\; I_{\{|Z_n|>0\}} I_{\{|Z_r|>0\}} \big) \\
&= \frac{1}{P(|Z_n|>0)}\, E\big( [\cdots]\; \big(1 - I_{\{|Z_n|=0\}}\big) I_{\{|Z_r|>0\}} \big)
= \frac{1}{P(|Z_n|>0)}\, E\big( [\cdots]\; I_{\{|Z_r|>0\}} \big) \\
&= \frac{P(|Z_r|>0)}{P(|Z_n|>0)}\, E\left( \sum_{l=1}^{d} Z_r^{(l)}\, g^{(l)}_{n-r} \big(1-g^{(l)}_{n-r}\big)^{Z_r^{(l)}-1} \prod_{p\ne l}\big(1-g^{(p)}_{n-r}\big)^{Z_r^{(p)}} \,\Bigg|\, |Z_r|>0 \right),
\end{align*}
where $[\cdots]$ abbreviates the double sum displayed above and $g^{(l)}_n = P\big(|Z_n| > 0 \,\big|\, Z_0 = e_l\big)$. Rewriting the last expression,
\[
= \frac{P(|Z_r|>0)}{P(|Z_n|>0)}\, E\left( \sum_{l=1}^{d} \frac{Z_r^{(l)}}{r} \cdot (n-r)\, g^{(l)}_{n-r} \cdot \frac{r}{n-r} \cdot \Big(1 - \frac{(n-r)\, g^{(l)}_{n-r}}{n-r}\Big)^{(n-r)\frac{r}{n-r}\frac{Z_r^{(l)}}{r} - 1} \prod_{p\ne l} \Big(1 - \frac{(n-r)\, g^{(p)}_{n-r}}{n-r}\Big)^{(n-r)\frac{r}{n-r}\frac{Z_r^{(p)}}{r}} \,\Bigg|\, |Z_r|>0 \right).
\]

Let hn be the function defined by

\[
h_n(x_1, x_2, \cdots, x_d) \equiv \sum_{l=1}^{d} x_l \cdot (n-r)\, g^{(l)}_{n-r} \cdot \frac{r}{n-r} \cdot \Big(1 - \frac{(n-r)\, g^{(l)}_{n-r}}{n-r}\Big)^{(n-r)\frac{r}{n-r} x_l - 1} \prod_{p\ne l} \Big(1 - \frac{(n-r)\, g^{(p)}_{n-r}}{n-r}\Big)^{(n-r)\frac{r}{n-r} x_p}.
\]

Since, as n → ∞,

\[
n\, g^{(l)}_n = n\, P\big(|Z_n| > 0 \,\big|\, Z_0 = e_l\big) \to \frac{u_l}{v \cdot Q[u]}
\]
and

\[
\frac{r}{n-r} \to \frac{\alpha}{1-\alpha},
\]
and hence, as $n \to \infty$,

\[
h_n(x_1, x_2, \cdots, x_d) \to \sum_{l=1}^{d} x_l \cdot \frac{u_l}{v \cdot Q[u]} \cdot \frac{\alpha}{1-\alpha} \cdot e^{-\frac{u_l}{v\cdot Q[u]}\, x_l\, \frac{\alpha}{1-\alpha}} \prod_{p\ne l} e^{-\frac{u_p}{v\cdot Q[u]}\, x_p\, \frac{\alpha}{1-\alpha}} \equiv h(x_1, x_2, \cdots, x_d).
\]

We have that hn → h uniformly on any compact set since hn and h are continuous and bounded. Then, as n → ∞,

\begin{align*}
E\left( h_n\Big(\frac{Z_r^{(1)}}{r}, \frac{Z_r^{(2)}}{r}, \cdots, \frac{Z_r^{(d)}}{r}\Big) \,\Bigg|\, |Z_r| > 0 \right)
&\to E\big( h(v_1 Y, v_2 Y, \cdots, v_d Y) \big) \\
&= E\left( \sum_{l=1}^{d} v_l Y \cdot \frac{u_l}{v \cdot Q[u]} \cdot \frac{\alpha}{1-\alpha} \cdot e^{-\frac{u_l}{v\cdot Q[u]}\, v_l Y\, \frac{\alpha}{1-\alpha}} \prod_{p\ne l} e^{-\frac{u_p}{v\cdot Q[u]}\, v_p Y\, \frac{\alpha}{1-\alpha}} \right) \\
&= E\left( \frac{Y}{v \cdot Q[u]} \cdot \frac{\alpha}{1-\alpha} \cdot e^{-\frac{Y}{v\cdot Q[u]}\, \frac{\alpha}{1-\alpha}} \right)
\end{align*}

and

\[
\frac{P\big(|Z_r| > 0 \,\big|\, Z_0 = e_l\big)}{P\big(|Z_n| > 0 \,\big|\, Z_0 = e_l\big)} = \frac{n}{r} \cdot \frac{r\, P\big(|Z_r| > 0 \,\big|\, Z_0 = e_l\big)}{n\, P\big(|Z_n| > 0 \,\big|\, Z_0 = e_l\big)} \to \frac{1}{\alpha} \cdot \frac{u_l/(v \cdot Q[u])}{u_l/(v \cdot Q[u])} = \frac{1}{\alpha}.
\]

So, for α ∈ (0, 1),

\begin{align*}
\lim_{n\to\infty} P\Big(\frac{T_n}{n} > \alpha \,\Big|\, |Z_n| > 0\Big)
&= \frac{1}{\alpha}\, E\left( \frac{Y}{v\cdot Q[u]} \cdot \frac{\alpha}{1-\alpha} \cdot e^{-\frac{Y}{v\cdot Q[u]}\, \frac{\alpha}{1-\alpha}} \right) \\
&= \frac{1}{\alpha} \cdot \frac{1}{v\cdot Q[u]} \cdot \frac{\alpha}{1-\alpha} \int_0^\infty y\; e^{-\frac{1}{v\cdot Q[u]}\frac{\alpha}{1-\alpha}\, y}\; e^{-\frac{y}{v\cdot Q[u]}}\; \frac{1}{v\cdot Q[u]}\; dy \\
&= 1 - \alpha.
\end{align*}
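As before, the final integral can be checked numerically with hypothetical values of $\alpha$ and $c = v \cdot Q[u]$: the displayed prefactor times $\int_0^\infty y\, e^{-(\alpha/(1-\alpha))y/c}\, e^{-y/c}\, dy$ indeed returns $1 - \alpha$.

```python
import math

alpha, c = 0.3, 2.0                   # hypothetical alpha and c = v.Q[u]
beta = alpha / (1 - alpha)

# Midpoint rule for Integral_0^inf y * exp(-beta*y/c) * exp(-y/c) dy.
upper, steps = 400.0, 400000
h = upper / steps
integral = sum((k + 0.5) * h * math.exp(-(beta + 1) * (k + 0.5) * h / c)
               for k in range(steps)) * h

# Prefactor from the displayed computation: (1/alpha)*(alpha/(1-alpha))*(1/c^2).
value = (1 / alpha) * beta * integral / (c * c)
print(abs(value - (1 - alpha)) < 1e-3)   # the limit equals 1 - alpha
```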

Hence, $\big(T_n/n \,\big|\, |Z_n| > 0\big) \xrightarrow{d} \tilde{T}$ as $n \to \infty$, where $\tilde{T}$ is a uniform $(0, 1)$ random variable, which proves Theorem 3.7.

3.4 Results in The Subcritical Case

3.4.1 The Statements of Results

In this section, we consider a discrete-time multi-type subcritical branching process, i.e., $0 < \rho < 1$, where $\rho$ is the maximal eigenvalue of the mean matrix $M$, and investigate the coalescence problems for this process.

For any integer $k \ge 2$, let $X_{n,k}$ be the generation number of the last common ancestor of any $k$ randomly chosen individuals in the $n$th generation. First, we prove the result in Theorem 3.8 regarding the limit behavior of $X_{n,2}$ as $n \to \infty$, i.e., the case $k = 2$.

Theorem 3.8. Let $0 < \rho < 1$ and $E\|Z_1\| \log \|Z_1\| < \infty$. Then there exists a random variable $\tilde{X}_2$ such that $\big(n - X_{n,2} \,\big|\, |Z_n| \ge 2\big) \xrightarrow{d} \tilde{X}_2$ as $n \to \infty$, and, for any $r = 0, 1, 2, \cdots$,

\[
P(\tilde{X}_2 \le r) = 1 - \frac{1}{\rho^r P(|Y| \ge 2)}\, E\big( \phi(Y^{(1)}, Y^{(2)}, \cdots, Y^{(d)}, r) \big) \equiv H_2(r),
\]
where

\[
\phi(t_1, t_2, \cdots, t_d, r) = E\left( \frac{\sum_{l=1}^{d} \sum_{\substack{i,j=1 \\ i\ne j}}^{t_l} |\tilde{Z}^{(l)}_{r,i}||\tilde{Z}^{(l)}_{r,j}| + \sum_{l \ne p} \sum_{i=1}^{t_l} \sum_{j=1}^{t_p} |\tilde{Z}^{(l)}_{r,i}||\tilde{Z}^{(p)}_{r,j}|}{\Big(\sum_{l=1}^{d}\sum_{i=1}^{t_l} |\tilde{Z}^{(l)}_{r,i}|\Big)\Big(\sum_{l=1}^{d}\sum_{i=1}^{t_l} |\tilde{Z}^{(l)}_{r,i}| - 1\Big)}\; I_{\big\{\sum_{l=1}^{d}\sum_{i=1}^{t_l} |\tilde{Z}^{(l)}_{r,i}| \ge 2\big\}} \right)
\]
and $\big\{ \tilde{Z}^{(l)}_{r,i} : i \ge 1 \big\}_{r \ge 0}$ are i.i.d. copies of the branching process initiated by an individual of type $l$, $l = 1, 2, \cdots, d$.

Theorem 3.8 shows that the coalescence time Xn,2 does not go way back to the beginning of the tree.

Instead, it is very close to the present, so that the difference $n - X_{n,2}$ between the generation number of the last common ancestor and the number of the current generation converges in distribution as $n \to \infty$; we have seen this phenomenon in the single-type subcritical branching process.

We also have an analog for general $k = 2, 3, \cdots$.

Corollary 3.2. Let $0 < \rho < 1$ and $E\|Z_1\| \log \|Z_1\| < \infty$. Then, for $k = 2, 3, \cdots$, there exists a random variable $\tilde{X}_k$ such that $\big(n - X_{n,k} \,\big|\, |Z_n| \ge k\big) \xrightarrow{d} \tilde{X}_k$ as $n \to \infty$.

Now, to find the limit behavior of $T_n$, which is the generation number of the last common ancestor of the whole population, we trace the lines of descent of all the individuals in the $n$th generation backward in time till they meet. The following theorem tells us that the limit law $\tilde{T}$ of the total coalescence time is very close to the present.

Theorem 3.9. Let $0 < \rho < 1$ and $E\|Z_1\| \log \|Z_1\| < \infty$. Then there exists a random variable $\tilde{T}$ such that

\[
\big( n - T_n \,\big|\, |Z_n| > 0 \big) \xrightarrow{d} \tilde{T} \quad \text{as } n \to \infty,
\]
and, for any $r = 0, 1, 2, \cdots$,

\[
P(\tilde{T} \le r) = \rho^{-r}\, E\left( \sum_{l=1}^{d} Y^{(l)} g_r^{(l)} \big(1 - g_r^{(l)}\big)^{Y^{(l)}-1} \prod_{p \ne l} \big(1 - g_r^{(p)}\big)^{Y^{(p)}} \right) \equiv \pi(r),
\]
where $Y$ is the random vector with distribution $\{b(j)\}_{j \in \mathbb{N}_0^d}$ defined as in Theorem 1.9 (d).

Next, we would like to look at the limit of the joint distribution of the generation number and the type of the last common ancestor and the types of the randomly chosen individuals.

Consider $k = 2$, i.e., pick two individuals at random from the $n$th generation (by simple random sampling without replacement) and trace their lines of descent back in time to find their last common ancestor. Let $X_{n,2}$ be the generation number of this common ancestor, $\eta_n$ the type of this last common ancestor and $\zeta_{n,i}$ the type of the $i$th chosen individual.

Theorem 3.10. Let 0 < ρ < 1 and EkZ1k log kZ1k < ∞. Then

\[
\lim_{n\to\infty} P\big( X_{n,2} = r, \eta_n = j, \zeta_{n,1} = i_1, \zeta_{n,2} = i_2 \,\big|\, |Z_n| \ge 2 \big) \equiv \psi_2(r, j, i_1, i_2) \quad \text{exists}
\]
and $\sum_{(r, j, i_1, i_2)} \psi_2(r, j, i_1, i_2) = 1$.

Corollary 3.3. Let $0 < \rho < 1$ and $E\|Z_1\| \log \|Z_1\| < \infty$. Then

\[
\lim_{n\to\infty} P\big( X_{n,k} = r, \eta_n = j, \zeta_{n,1} = i_1, \zeta_{n,2} = i_2, \cdots, \zeta_{n,k} = i_k \,\big|\, |Z_n| \ge k \big) \equiv \psi_k(r, j, i_1, i_2, \cdots, i_k)
\]
exists and $\sum_{(r, j, i_1, i_2, \cdots, i_k)} \psi_k(r, j, i_1, i_2, \cdots, i_k) = 1$.

3.4.2 The Proof of Theorem 3.8

For any r ≥ 0,

\begin{align*}
P\big(n - X_{n,2} > r \,\big|\, |Z_n| \ge 2\big)
&= P\big(X_{n,2} < n - r \,\big|\, |Z_n| \ge 2\big) \\
&= E\left( \frac{\sum_{l=1}^{d}\sum_{\substack{i,j=1 \\ i\ne j}}^{Z^{(l)}_{n-r}} |Z^{(l)}_{n-r,i,r}||Z^{(l)}_{n-r,j,r}| + \sum_{l\ne p}\sum_{i=1}^{Z^{(l)}_{n-r}}\sum_{j=1}^{Z^{(p)}_{n-r}} |Z^{(l)}_{n-r,i,r}||Z^{(p)}_{n-r,j,r}|}{|Z_n|(|Z_n|-1)} \,\Bigg|\, |Z_n| \ge 2 \right) \\
&= \frac{1}{P(|Z_n| \ge 2)}\, E\big( [\cdots]\; I_{\{|Z_n|\ge2\}} \big)
= \frac{1}{P(|Z_n| \ge 2)}\, E\big( [\cdots]\; I_{\{|Z_n|\ge2\}}\, I_{\{|Z_{n-r}|>0\}} \big) \\
&= \frac{P(|Z_{n-r}|>0)}{P(|Z_n| \ge 2 \mid |Z_n|>0)\, P(|Z_n|>0)}\, E\Big( \phi\big(Z^{(1)}_{n-r}, Z^{(2)}_{n-r}, \cdots, Z^{(d)}_{n-r}, r\big) \,\Big|\, |Z_{n-r}| > 0 \Big),
\end{align*}
where $[\cdots]$ abbreviates the ratio displayed above, $\tilde{Z}^{(l)}_{r,i} \sim Z_r$ with $Z_0 = e_l$ for all $i$ and $l = 1, 2, \cdots, d$, and $\phi(t_1, t_2, \cdots, t_d, r)$ is defined as in the statement of Theorem 3.8.

We know that $\big(Z_{n-r} \,\big|\, |Z_{n-r}| > 0\big) \xrightarrow{d} Y \equiv (Y^{(1)}, Y^{(2)}, \cdots, Y^{(d)})$ as $n \to \infty$, so
\[
E\Big( \phi\big(Z^{(1)}_{n-r}, Z^{(2)}_{n-r}, \cdots, Z^{(d)}_{n-r}, r\big) \,\Big|\, |Z_{n-r}| > 0 \Big) \to E\big( \phi(Y^{(1)}, Y^{(2)}, \cdots, Y^{(d)}, r) \big)
\]
as $n \to \infty$, since $\phi(\cdot, r)$ is continuous for any fixed $r \ge 0$. Also, as $n \to \infty$, we have that $P\big(|Z_n| \ge 2 \,\big|\, |Z_n| > 0\big) \to P\big(|Y| \ge 2\big)$ and
\[
\frac{P(|Z_{n-r}| > 0)}{P(|Z_n| > 0)} \to \rho^{-r},
\]
hence, for any $r \ge 0$,

\begin{align*}
\lim_{n\to\infty} P\big(n - X_{n,2} > r \,\big|\, |Z_n| \ge 2\big)
&= \lim_{n\to\infty} \frac{1}{P(|Z_n| \ge 2 \mid |Z_n|>0)} \cdot \frac{P(|Z_{n-r}|>0)}{P(|Z_n|>0)}\, E\Big( \phi\big(Z^{(1)}_{n-r}, \cdots, Z^{(d)}_{n-r}, r\big) \,\Big|\, |Z_{n-r}|>0 \Big) \\
&= \frac{1}{\rho^r P(|Y| \ge 2)}\, E\big( \phi(Y^{(1)}, Y^{(2)}, \cdots, Y^{(d)}, r) \big) \\
&\equiv 1 - H_2(r).
\end{align*}

Now, it remains to show that $H_2(r) \to 1$ as $r \to \infty$.
Let $f_r(s) = \big(f_r^{(1)}(s), f_r^{(2)}(s), \cdots, f_r^{(d)}(s)\big)$ be the probability generating function of $Z_r$. Then
\begin{align*}
\frac{1}{\rho^r}\, E\big( \phi(Y^{(1)}, Y^{(2)}, \cdots, Y^{(d)}, r) \big)
&= \frac{1}{\rho^r}\, E\left( 1 - \prod_{l=1}^{d} f_r^{(l)}(0)^{Y^{(l)}} - \sum_{l=1}^{d} Y^{(l)} \big(1 - f_r^{(l)}(0)\big) f_r^{(l)}(0)^{Y^{(l)}-1} \prod_{p\ne l} f_r^{(p)}(0)^{Y^{(p)}} \right) \\
&= E\left( \frac{1 - \prod_{l=1}^{d} f_r^{(l)}(0)^{Y^{(l)}}}{\rho^r} \right) - \sum_{l=1}^{d} \frac{1 - f_r^{(l)}(0)}{\rho^r}\, E\left( Y^{(l)} f_r^{(l)}(0)^{Y^{(l)}-1} \prod_{p\ne l} f_r^{(p)}(0)^{Y^{(p)}} \right).
\end{align*}

First, since $E\big(Z_1^{(j)} \log Z_1^{(j)}\big) < \infty$ for any $j = 1, 2, \cdots, d$, we have, for $l = 1, 2, \cdots, d$, $EY^{(l)} < \infty$ and

\[
\frac{1 - f_r^{(l)}(0)}{\rho^r} \to \frac{u_l}{u \cdot EY} \quad \text{as } r \to \infty.
\]

Also, $f_r^{(l)}(0) \to 1$ as $r \to \infty$ for $l = 1, 2, \cdots, d$. By the bounded convergence theorem,

\[
\sum_{l=1}^{d} \frac{1 - f_r^{(l)}(0)}{\rho^r}\, E\left( Y^{(l)} f_r^{(l)}(0)^{Y^{(l)}-1} \prod_{p\ne l} f_r^{(p)}(0)^{Y^{(p)}} \right) \to \sum_{l=1}^{d} \frac{u_l}{u \cdot EY}\, EY^{(l)} = 1
\]
as $r \to \infty$.
Secondly, under the condition $E\big(Z_1^{(j)} \log Z_1^{(j)}\big) < \infty$ for any $j = 1, 2, \cdots, d$, for any $a = (a_1, a_2, \cdots, a_d) \in \mathbb{N}_0^d$, we have that

\[
\rho^{-r}\, P\big( |Z_r| > 0 \,\big|\, Z_0 = a \big) \to \frac{u \cdot a}{u \cdot EY}
\]
and

\[
\frac{v \cdot \big(1 - f_r(0)\big)}{1 - \prod_{l=1}^{d} f_r^{(l)}(0)^{a_l}} = \sum_{l=1}^{d} \frac{v_l \big(1 - f_r^{(l)}(0)\big)}{1 - \prod_{l=1}^{d} f_r^{(l)}(0)^{a_l}}.
\]

Since, as $r$ increases, $1 - f_r^{(l)}(0)$ is decreasing, the ratio $\dfrac{1 - \prod_{l=1}^{d} f_r^{(l)}(0)^{a_l}}{v \cdot \big(1 - f_r(0)\big)}$ is increasing, and hence, by the monotone convergence theorem, we have

\begin{align*}
E\left( \frac{1 - \prod_{l=1}^{d} f_r^{(l)}(0)^{Y^{(l)}}}{v \cdot \big(1 - f_r(0)\big)} \right)
&= E\left( \frac{\rho^{-r}\big(1 - \prod_{l=1}^{d} f_r^{(l)}(0)^{Y^{(l)}}\big)}{\rho^{-r}\, v \cdot \big(1 - f_r(0)\big)} \right)
= E\left( \frac{\rho^{-r}\, P\big(|Z_r| > 0 \,\big|\, Z_0 = Y\big)}{\rho^{-r}\, v \cdot \big(1 - f_r(0)\big)} \right) \\
&\to E\left( \frac{u \cdot Y/(u \cdot EY)}{1/(u \cdot EY)} \right) = E(u \cdot Y) = u \cdot EY.
\end{align*}
So, as $r \to \infty$,

\[
E\left( \frac{1 - \prod_{l=1}^{d} f_r^{(l)}(0)^{Y^{(l)}}}{\rho^r} \right) = E\left( \frac{1 - \prod_{l=1}^{d} f_r^{(l)}(0)^{Y^{(l)}}}{v \cdot \big(1 - f_r(0)\big)} \cdot \frac{v \cdot \big(1 - f_r(0)\big)}{\rho^r} \right) \to u \cdot EY \cdot \frac{1}{u \cdot EY} = 1.
\]
So, we obtain that $\frac{1}{\rho^r}\, E\big( \phi(Y^{(1)}, Y^{(2)}, \cdots, Y^{(d)}, r) \big) \to 1 - 1 = 0$ as $r \to \infty$ and hence $H_2(r) \to 1$ as $r \to \infty$, provided $P\big(|Y| \ge 2\big) > 0$. That is, $H_2(\cdot)$ is a probability distribution on $\mathbb{N}_0$. Therefore, there exists a random variable $\tilde{X}_2$ such that $\big(n - X_{n,2} \,\big|\, |Z_n| \ge 2\big) \xrightarrow{d} \tilde{X}_2$ as $n \to \infty$.

3.4.3 The Proof of Theorem 3.9

Let $Z^{(l)}_{n-r,i,r}$ be the $d$-type Galton-Watson branching process initiated by the $i$th individual of type $l$ in the $(n-r)$th generation, where $i = 1, 2, \cdots, Z_{n-r}^{(l)}$ and $l = 1, 2, \cdots, d$. For any $r \ge 0$,

\begin{align*}
P\big(n - T_n \le r \,\big|\, |Z_n| > 0\big)
&= P\big(T_n \ge n - r \,\big|\, |Z_n| > 0\big) \\
&= E\left( \sum_{l=1}^{d}\sum_{i=1}^{Z^{(l)}_{n-r}} P\big(|Z^{(l)}_{n-r,i,r}|>0\big) \prod_{j\ne i} P\big(|Z^{(l)}_{n-r,j,r}|=0\big) \prod_{p\ne l}\prod_{j=1}^{Z^{(p)}_{n-r}} P\big(|Z^{(p)}_{n-r,j,r}|=0\big) \,\Bigg|\, |Z_n|>0 \right) \\
&= \frac{P(|Z_{n-r}|>0)}{P(|Z_n|>0)}\, E\left( \sum_{l=1}^{d} Z^{(l)}_{n-r}\, g_r^{(l)} \big(1-g_r^{(l)}\big)^{Z^{(l)}_{n-r}-1} \prod_{p\ne l}\big(1-g_r^{(p)}\big)^{Z^{(p)}_{n-r}} \,\Bigg|\, |Z_{n-r}|>0 \right),
\end{align*}

where $g_n^{(l)} = P\big(|Z_n| > 0 \,\big|\, Z_0 = e_l\big)$.
Let $h(x_1, x_2, \cdots, x_d) = \sum_{l=1}^{d} x_l\, g_r^{(l)} \big(1-g_r^{(l)}\big)^{x_l-1} \prod_{p\ne l} \big(1-g_r^{(p)}\big)^{x_p}$; then $h$ is continuous at every $(x_1, x_2, \cdots, x_d)$ and

\[
P\big(n - T_n \le r \,\big|\, |Z_n| > 0\big) = \frac{P(|Z_{n-r}|>0)}{P(|Z_n|>0)}\, E\Big( h\big(Z^{(1)}_{n-r}, Z^{(2)}_{n-r}, \cdots, Z^{(d)}_{n-r}\big) \,\Big|\, |Z_{n-r}| > 0 \Big).
\]

Since $\big(Z_{n-r} \,\big|\, |Z_{n-r}|>0\big) \equiv \big((Z^{(1)}_{n-r}, Z^{(2)}_{n-r}, \cdots, Z^{(d)}_{n-r}) \,\big|\, |Z_{n-r}|>0\big) \xrightarrow{d} Y \equiv (Y^{(1)}, Y^{(2)}, \cdots, Y^{(d)})$ as $n \to \infty$,
\[
E\Big( h\big(Z^{(1)}_{n-r}, Z^{(2)}_{n-r}, \cdots, Z^{(d)}_{n-r}\big) \,\Big|\, |Z_{n-r}| > 0 \Big) \to E\big( h(Y^{(1)}, Y^{(2)}, \cdots, Y^{(d)}) \big)
\]
as $n \to \infty$. Also, $P(|Z_{n-r}|>0)/P(|Z_n|>0) \to \rho^{-r}$, and then, for each $r \ge 0$,

\[
P\big(n - T_n \le r \,\big|\, |Z_n| > 0\big) \to \rho^{-r}\, E\left( \sum_{l=1}^{d} Y^{(l)} g_r^{(l)} \big(1-g_r^{(l)}\big)^{Y^{(l)}-1} \prod_{p\ne l} \big(1-g_r^{(p)}\big)^{Y^{(p)}} \right) \equiv \pi(r)
\]
as $n \to \infty$. Moreover, since $g_r^{(l)} \to 0$ and $\rho^{-r} g_r^{(l)} \to \frac{u_l}{u \cdot EY}$ as $r \to \infty$, we have

\begin{align*}
\lim_{r\to\infty} \pi(r)
&= \lim_{r\to\infty} E\left( \sum_{l=1}^{d} Y^{(l)}\, \rho^{-r} g_r^{(l)} \big(1-g_r^{(l)}\big)^{Y^{(l)}-1} \prod_{p\ne l} \big(1-g_r^{(p)}\big)^{Y^{(p)}} \right) \\
&= E\left( \sum_{l=1}^{d} Y^{(l)} \frac{u_l}{u \cdot EY} \right) = \frac{1}{u \cdot EY} \sum_{l=1}^{d} u_l\, EY^{(l)} = 1.
\end{align*}

Therefore, π(·) is a proper probability distribution and hence there exists a random variable T˜ with

\[
P\big(\tilde{T} \le r\big) = \pi(r), \quad r = 0, 1, 2, \cdots
\]

such that $\big(n - T_n \,\big|\, |Z_n| > 0\big) \xrightarrow{d} \tilde{T}$ as $n \to \infty$.

3.4.4 The Proof of Theorem 3.10

Let $\xi^i_{n,j} = \big(\xi^{i(1)}_{n,j}, \xi^{i(2)}_{n,j}, \cdots, \xi^{i(d)}_{n,j}\big)$ be the vector of offspring of the $j$th individual of type $i$ in the $n$th generation.

Let $\{Z^{j,l}_{p,r,s,n} : n \ge 0\}$ be the branching process initiated by the $s$th child of type $l$ of the $p$th individual of type $j$ in the $r$th generation, where

\[
Z^{j,l}_{p,r,s,n} = \big(Z^{j,l,(1)}_{p,r,s,n}, Z^{j,l,(2)}_{p,r,s,n}, \cdots, Z^{j,l,(d)}_{p,r,s,n}\big).
\]

Choose two individuals at random from the $n$th generation and let $A_{n,i}$ be the type of the ancestor of the $i$th chosen individual in the generation immediately following the nearest common ancestor (the last common ancestor), $i = 1, 2$. Then

\begin{align*}
&P\big( n - X_{n,2} = r, \eta_n = j, \zeta_{n,1} = \zeta_{n,2} = i, A_{n,1} = A_{n,2} \,\big|\, |Z_n| \ge 2 \big) \\
&\quad = E\left( \frac{\sum_{p=1}^{Z^{(j)}_{n-r}} \sum_{l=1}^{d} \sum_{\substack{s,t=1 \\ s \ne t}}^{\xi^{j(l)}_{n-r,p}} Z^{j,l,(i)}_{p,n-r,s,r-1}\, Z^{j,l,(i)}_{p,n-r,t,r-1}}{|Z_n|(|Z_n|-1)} \,\Bigg|\, |Z_n| \ge 2 \right) \\
&\quad = \frac{P(|Z_{n-r}|>0)}{P(|Z_n| \ge 2 \mid |Z_n|>0)\, P(|Z_n|>0)}\, E\left( \frac{\sum_{p=1}^{Z^{(j)}_{n-r}} \sum_{l=1}^{d} \sum_{\substack{s,t=1 \\ s \ne t}}^{\xi^{j(l)}_{n-r,p}} Z^{j,l,(i)}_{p,n-r,s,r-1}\, Z^{j,l,(i)}_{p,n-r,t,r-1}}{|Z_n|(|Z_n|-1)}\; I_{\{|Z_n|\ge2\}} \,\Bigg|\, |Z_{n-r}| > 0 \right).
\end{align*}
Let $\xi^j = \big(\xi^{j(1)}, \xi^{j(2)}, \cdots, \xi^{j(d)}\big)$ denote i.i.d. copies of the offspring vector of an individual of type $j$; then

\[
\xi^j_{n-r,p} = \big(\xi^{j(1)}_{n-r,p}, \xi^{j(2)}_{n-r,p}, \cdots, \xi^{j(d)}_{n-r,p}\big) \sim \xi^j = \big(\xi^{j(1)}, \xi^{j(2)}, \cdots, \xi^{j(d)}\big).
\]

Let $\tilde{Z}^l_{r-1,s}$ be i.i.d. copies of $Z_{r-1}$ with $Z_0 = e_l$; then

\[
Z^{j,l}_{p,n-r,s,r-1} = \big(Z^{j,l,(1)}_{p,n-r,s,r-1}, Z^{j,l,(2)}_{p,n-r,s,r-1}, \cdots, Z^{j,l,(d)}_{p,n-r,s,r-1}\big) \sim \tilde{Z}^l_{r-1,s} = \big(\tilde{Z}^{l(1)}_{r-1,s}, \tilde{Z}^{l(2)}_{r-1,s}, \cdots, \tilde{Z}^{l(d)}_{r-1,s}\big).
\]

So,

\[
P\big( n - X_{n,2} = r, \eta_n = j, \zeta_{n,1} = \zeta_{n,2} = i, A_{n,1} = A_{n,2} \,\big|\, |Z_n| \ge 2 \big)
\]

\[
= \frac{P(|Z_{n-r}|>0)}{P(|Z_n| \ge 2 \mid |Z_n|>0)\, P(|Z_n|>0)}\, E\Big( \varphi_1\big(Z^{(1)}_{n-r}, Z^{(2)}_{n-r}, \cdots, Z^{(d)}_{n-r}, r\big) \,\Big|\, |Z_{n-r}| > 0 \Big),
\]
where

\[
\varphi_1(x_1, x_2, \cdots, x_d, r) = E\left( \frac{\sum_{p=1}^{x_j} \sum_{l=1}^{d} \sum_{\substack{s,t=1 \\ s\ne t}}^{\xi^{j(l)}} \tilde{Z}^{l(i)}_{r-1,s}\, \tilde{Z}^{l(i)}_{r-1,t}}{\Big(\sum_{j=1}^{d}\sum_{p=1}^{x_j}\sum_{l=1}^{d}\sum_{s=1}^{\xi^{j(l)}} |\tilde{Z}^l_{r-1,s}|\Big)\Big(\sum_{j=1}^{d}\sum_{p=1}^{x_j}\sum_{l=1}^{d}\sum_{s=1}^{\xi^{j(l)}} |\tilde{Z}^l_{r-1,s}| - 1\Big)}\; I_{\big\{\sum_{j=1}^{d}\sum_{p=1}^{x_j}\sum_{l=1}^{d}\sum_{s=1}^{\xi^{j(l)}} |\tilde{Z}^l_{r-1,s}| \ge 2\big\}} \right).
\]

Since $\varphi_1(\cdot, r)$ is continuous and $\big(Z_{n-r} \,\big|\, |Z_{n-r}| > 0\big) \xrightarrow{d} Y$ as $n \to \infty$, we have

\[
P\big( n - X_{n,2} = r, \eta_n = j, \zeta_{n,1} = \zeta_{n,2} = i, A_{n,1} = A_{n,2} \,\big|\, |Z_n| \ge 2 \big) \to \frac{1}{\rho^r P(|Y| \ge 2)}\, E\big( \varphi_1(Y^{(1)}, Y^{(2)}, \cdots, Y^{(d)}, r) \big)
\]
as $n \to \infty$.

Similarly, we also have that, as n → ∞,

\[
P\big( n - X_{n,2} = r, \eta_n = j, \zeta_{n,1} = \zeta_{n,2} = i, A_{n,1} \ne A_{n,2} \,\big|\, |Z_n| \ge 2 \big) \to \frac{1}{\rho^r P(|Y| \ge 2)}\, E\big( \varphi_2(Y^{(1)}, Y^{(2)}, \cdots, Y^{(d)}, r) \big),
\]

\[
P\big( n - X_{n,2} = r, \eta_n = j, \zeta_{n,1} = i_1 \ne i_2 = \zeta_{n,2}, A_{n,1} = A_{n,2} \,\big|\, |Z_n| \ge 2 \big) \to \frac{1}{\rho^r P(|Y| \ge 2)}\, E\big( \varphi_3(Y^{(1)}, Y^{(2)}, \cdots, Y^{(d)}, r) \big)
\]
and

\[
P\big( n - X_{n,2} = r, \eta_n = j, \zeta_{n,1} = i_1 \ne i_2 = \zeta_{n,2}, A_{n,1} \ne A_{n,2} \,\big|\, |Z_n| \ge 2 \big) \to \frac{1}{\rho^r P(|Y| \ge 2)}\, E\big( \varphi_4(Y^{(1)}, Y^{(2)}, \cdots, Y^{(d)}, r) \big),
\]
where

$D \equiv \sum_{j=1}^{d}\sum_{p=1}^{x_j}\sum_{l=1}^{d}\sum_{s=1}^{\xi^{j(l)}} |\tilde{Z}^l_{r-1,s}|$ denotes the total population appearing in the denominators, and
\[
\varphi_2(x_1, x_2, \cdots, x_d, r) = E\left( \frac{\sum_{p=1}^{x_j} \sum_{l \ne q} \sum_{s=1}^{\xi^{j(l)}} \sum_{t=1}^{\xi^{j(q)}} \tilde{Z}^{l(i)}_{r-1,s}\, \tilde{Z}^{q(i)}_{r-1,t}}{D(D-1)}\; I_{\{D \ge 2\}} \right),
\]
\[
\varphi_3(x_1, x_2, \cdots, x_d, r) = E\left( \frac{\sum_{p=1}^{x_j} \sum_{l=1}^{d} \sum_{\substack{s,t=1 \\ s\ne t}}^{\xi^{j(l)}} \tilde{Z}^{l(i_1)}_{r-1,s}\, \tilde{Z}^{l(i_2)}_{r-1,t}}{D(D-1)}\; I_{\{D \ge 2\}} \right)
\]
and
\[
\varphi_4(x_1, x_2, \cdots, x_d, r) = E\left( \frac{\sum_{p=1}^{x_j} \sum_{l \ne q} \sum_{s=1}^{\xi^{j(l)}} \sum_{t=1}^{\xi^{j(q)}} \tilde{Z}^{l(i_1)}_{r-1,s}\, \tilde{Z}^{q(i_2)}_{r-1,t}}{D(D-1)}\; I_{\{D \ge 2\}} \right).
\]

Let $\varphi = \varphi_1 + \varphi_2 + \varphi_3 + \varphi_4$; then, as $n \to \infty$,

\[
P\big( n - X_{n,2} = r, \eta_n = j, \zeta_{n,1} = i_1, \zeta_{n,2} = i_2 \,\big|\, |Z_n| \ge 2 \big) \to \frac{1}{\rho^r P(|Y| \ge 2)}\, E\big( \varphi(Y^{(1)}, Y^{(2)}, \cdots, Y^{(d)}, r) \big) \equiv \psi_2(r, j, i_1, i_2).
\]

Since $\big(n - X_{n,2} \,\big|\, |Z_n| \ge 2\big) \xrightarrow{d} \tilde{X}_2$ as $n \to \infty$, the sequence $\{n - X_{n,2}\}_{n\ge0}$ is tight; also, $\{\eta_n\}_{n\ge0}$, $\{\zeta_{n,1}\}_{n\ge0}$ and $\{\zeta_{n,2}\}_{n\ge0}$ only take values in the finite set $\{1, 2, \cdots, d\}$, so $\{(n - X_{n,2}, \eta_n, \zeta_{n,1}, \zeta_{n,2})\}_{n\ge0}$ is tight. Therefore, the limit $\psi_2(r, j, i_1, i_2)$ of $P\big(n - X_{n,2} = r, \eta_n = j, \zeta_{n,1} = i_1, \zeta_{n,2} = i_2 \,\big|\, |Z_n| \ge 2\big)$ is a probability mass function on $\mathbb{N}_0 \times \{1, 2, \cdots, d\} \times \{1, 2, \cdots, d\} \times \{1, 2, \cdots, d\}$. That is,
\[
\sum_{(r, j, i_1, i_2)} \psi_2(r, j, i_1, i_2) = 1.
\]

3.5 The Markov Property on Types

3.5.1 The Statements of Results

 Consider a discrete-time multi-type Galton-Watson branching process Zn n≥0. We pick an individual at random from the nth generation and record its type, then trace its line of descent backward and also record the types of its ancestors along the line of descent.

Let In,0 be the type of this randomly chosen individual.

Let $I_{n,i}$ be the type of the ancestor of this individual in the $(n-i)$th generation, $i = 1, 2, \cdots, n$. The first theorem is a result on the limit behavior of the types of the last $k$ ancestors of a randomly chosen individual in the $n$th generation and the Markov property of the types of the ancestors along its line of descent as $n \to \infty$, for the supercritical case.

Theorem 3.11. Let $1 < \rho < \infty$, $|Z_0| = 1$, $E\|Z_1\| \log \|Z_1\| < \infty$ and assume $P(Z_1 = 0 \mid Z_0 = e_i) = 0$ for any $i = 1, 2, \cdots, d$. Then, for any integer $k \ge 0$, there exist random variables $\tilde{I}_0, \tilde{I}_1, \cdots, \tilde{I}_k$ such that

\[
\big(I_{n,0}, I_{n,1}, \cdots, I_{n,k}\big) \xrightarrow{d} \big(\tilde{I}_0, \tilde{I}_1, \cdots, \tilde{I}_k\big) \quad \text{as } n \to \infty,
\]

and, for any $i_0, i_1, \cdots, i_k \in \{1, 2, \cdots, d\}$,

\[
P\big( \tilde{I}_0 = i_0, \tilde{I}_1 = i_1, \cdots, \tilde{I}_k = i_k \big) = \frac{v_{i_k}\, m_{i_k i_{k-1}}\, m_{i_{k-1} i_{k-2}} \cdots m_{i_1 i_0}}{(\mathbf{1} \cdot v)\, \rho^k}
\]

where $v = (v_1, v_2, \cdots, v_d)$ is a left eigenvector of the offspring mean matrix $M = \big(m_{ij} : i, j = 1, 2, \cdots, d\big)$ associated with the maximal eigenvalue $\rho$.
Moreover, $\{\tilde{I}_n\}_{n\ge0}$ is a Markov chain with the state space $\{1, 2, \cdots, d\}$ and

 (a) the initial distribution λ0 ≡ λ0(1), λ0(2), ··· , λ0(d) where

\[
\lambda_0(i) = \frac{v_i}{\mathbf{1} \cdot v} \quad \text{for any } i = 1, 2, \cdots, d.
\]

 (b) the transition probability P ≡ pi j : i, j = 1, 2, ··· , d , where

\[
p_{ij} = \frac{v_j m_{ji}}{v_i\, \rho} \quad \text{for any } n = 0, 1, 2, \cdots
\]

 (c) the stationary distribution π ≡ π1, π2 ··· , πd where

\[
\pi_i = \frac{u_i v_i}{u \cdot v} \quad \text{for any } i = 1, 2, \cdots, d.
\]
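The identities behind Theorem 3.11 — that the rows of $P = (p_{ij})$ sum to one because $v$ is a left eigenvector, and that $\pi_i = u_i v_i/(u \cdot v)$ is stationary because $u$ is a right eigenvector — can be checked numerically. The sketch below uses an illustrative $2 \times 2$ mean matrix (the entries are hypothetical, not from the thesis) and numpy's eigendecomposition.

```python
import numpy as np

# Illustrative 2-type mean matrix (not from the thesis): M[i, j] is the
# expected number of type-(j+1) children of a type-(i+1) parent.
M = np.array([[1.0, 2.0],
              [0.5, 1.5]])

vals, rvecs = np.linalg.eig(M)           # right eigenvectors: M u = rho u
k = int(np.argmax(vals.real))
rho = vals.real[k]
u = np.abs(rvecs[:, k].real)             # Perron right eigenvector, u > 0

valsL, lvecs = np.linalg.eig(M.T)        # left eigenvectors: v M = rho v
v = np.abs(lvecs[:, int(np.argmax(valsL.real))].real)

d = M.shape[0]
P = np.array([[v[j] * M[j, i] / (v[i] * rho) for j in range(d)]
              for i in range(d)])        # p_ij = v_j m_ji / (v_i rho)
pi = u * v / (u @ v)                     # pi_i = u_i v_i / (u . v)

print(rho > 1)                           # this example is supercritical
print(np.allclose(P.sum(axis=1), 1.0))   # each row of P sums to 1
print(np.allclose(pi @ P, pi))           # pi P = pi: pi is stationary
```

Any positive mean matrix works here; the eigenvector normalizations cancel in both $p_{ij}$ and $\pi_i$, so no rescaling of $u$ or $v$ is needed.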

We also have an analog of this result, on the limit law and the Markov property of the types along the ancestral line of an individual randomly chosen from the $n$th generation as $n \to \infty$, for the critical case.

Theorem 3.12. Let $\rho = 1$, $|Z_0| = 1$ and $E\|Z_1\|^2 < \infty$. Then, for any integer $k \ge 0$, there exist random variables $\tilde{I}_0, \tilde{I}_1, \cdots, \tilde{I}_k$ such that

\[
\big(I_{n,0}, I_{n,1}, \cdots, I_{n,k}\big) \,\big|\, |Z_n| > 0 \;\xrightarrow{d}\; \big(\tilde{I}_0, \tilde{I}_1, \cdots, \tilde{I}_k\big) \quad \text{as } n \to \infty,
\]
and, for any $i_0, i_1, \cdots, i_k \in \{1, 2, \cdots, d\}$,

\[
P\big( \tilde{I}_0 = i_0, \tilde{I}_1 = i_1, \cdots, \tilde{I}_k = i_k \big) = \frac{v_{i_k}\, m_{i_k i_{k-1}}\, m_{i_{k-1} i_{k-2}} \cdots m_{i_1 i_0}}{\mathbf{1} \cdot v}
\]
where $v = (v_1, v_2, \cdots, v_d)$ is the left eigenvector of the offspring mean matrix $M = \big(m_{ij} : i, j = 1, 2, \cdots, d\big)$ associated with the maximal eigenvalue 1.
Moreover, $\{\tilde{I}_n\}_{n\ge0}$ is a Markov chain with the state space $\{1, 2, \cdots, d\}$ and

 (a) the initial distribution λ0 ≡ λ0(1), λ0(2), ··· , λ0(d) where

\[
\lambda_0(i) = \frac{v_i}{\mathbf{1} \cdot v} \quad \text{for any } i = 1, 2, \cdots, d.
\]

 (b) the transition probability P ≡ pi j : i, j = 1, 2, ··· , d , where

\[
p_{ij} = \frac{v_j m_{ji}}{v_i} \quad \text{for any } n = 0, 1, 2, \cdots
\]

 (c) the stationary distribution π ≡ π1, π2 ··· , πd where

\[
\pi_i = \frac{u_i v_i}{u \cdot v} \quad \text{for any } i = 1, 2, \cdots, d.
\]

3.5.2 The Proof of Theorem 3.11

Let Zn = (Zn,1, Zn,2, ··· , Zn,d) be the population vector in the nth generation, n = 0, 1, 2, ··· , where

$Z_{n,i}$ is the number of individuals of type $i$ in the $n$th generation. We will prove this theorem using the principle of mathematical induction.
For $k = 0$: since, in the supercritical case, it is known that $\frac{Z_n}{\rho^n} \to vW$ w.p.1 as $n \to \infty$ and $P(0 < W < \infty) = 1$ (see Theorem 1.6), we have, by the bounded convergence theorem,

\[
P(I_{n,0} = i_0) = E\left( \frac{Z_{n,i_0}}{|Z_n|} \right) = E\left( \frac{Z_{n,i_0}/\rho^n}{|Z_n|/\rho^n} \right) \to E\left( \frac{v_{i_0} W}{(\mathbf{1} \cdot v)\, W} \right) = \frac{v_{i_0}}{\mathbf{1} \cdot v} \equiv \lambda_0(i_0) \quad \text{as } n \to \infty.
\]

Also, $\sum_{i=1}^{d} \lambda_0(i) = \sum_{i=1}^{d} \frac{v_i}{\mathbf{1} \cdot v} = 1$, i.e., $\big(\lambda_0(i) : i = 1, 2, \cdots, d\big)$ is a proper probability distribution, and hence there exists a random variable $\tilde{I}_0$ with $P(\tilde{I}_0 = i) = \lambda_0(i)$ for $i = 1, 2, \cdots, d$ such that $I_{n,0} \xrightarrow{d} \tilde{I}_0$ as $n \to \infty$.

Next, we prove that the theorem holds for $k = 1$.
Let $\xi^{(i)}_{n,j} = \big(\xi^{(i)1}_{n,j}, \xi^{(i)2}_{n,j}, \cdots, \xi^{(i)d}_{n,j}\big)$ be the vector of offspring of the $j$th individual of type $i$ in the $n$th generation.

Then $\big\{\xi^{(i_1)i_0}_{n,j}\big\}_{j\ge1,\, n\ge1}$ are i.i.d. random variables with $E\big(\xi^{(i_1)i_0}_{n,j}\big) = m_{i_1 i_0} < \infty$.

Since $Z_{n,i_1} \to \infty$ w.p.1, by the strong law of large numbers, as $n \to \infty$,

\[
\frac{1}{Z_{n,i_1}} \sum_{j=1}^{Z_{n,i_1}} \xi^{(i_1)i_0}_{n,j} \to m_{i_1 i_0} \quad \text{w.p.1.}
\]

So, by the bounded convergence theorem,

\[
P\big(I_{n,1} = i_1 \,\big|\, I_{n,0} = i_0\big) = E\left( \frac{\sum_{j=1}^{Z_{n-1,i_1}} \xi^{(i_1)i_0}_{n-1,j}}{Z_{n,i_0}} \right) = E\left( \frac{1}{Z_{n-1,i_1}} \sum_{j=1}^{Z_{n-1,i_1}} \xi^{(i_1)i_0}_{n-1,j} \cdot \frac{Z_{n-1,i_1}/\rho^{n-1}}{Z_{n,i_0}/\rho^{n}} \cdot \frac{1}{\rho} \right)
\]

\[
\to m_{i_1 i_0} \cdot \frac{v_{i_1} W}{v_{i_0} W} \cdot \frac{1}{\rho} = \frac{v_{i_1} m_{i_1 i_0}}{\rho\, v_{i_0}} \quad \text{as } n \to \infty.
\]

Hence,

\[
P\big(I_{n,0} = i_0, I_{n,1} = i_1\big) = P\big(I_{n,1} = i_1 \,\big|\, I_{n,0} = i_0\big)\, P\big(I_{n,0} = i_0\big) \to \frac{v_{i_1} m_{i_1 i_0}}{(\mathbf{1} \cdot v)\, \rho} \equiv \lambda_1(i_0, i_1) \quad \text{as } n \to \infty,
\]
and

\[
\sum_{i=1}^{d} \sum_{j=1}^{d} \lambda_1(i, j) = \frac{1}{(\mathbf{1} \cdot v)\, \rho} \sum_{i=1}^{d} \Big( \sum_{j=1}^{d} v_j m_{ji} \Big) = \sum_{i=1}^{d} \frac{\rho\, v_i}{(\mathbf{1} \cdot v)\, \rho} = \sum_{i=1}^{d} \lambda_0(i) = 1,
\]
since $v$ is the left eigenvector of $M$ associated with the eigenvalue $\rho$.
So, $\big(\lambda_1(i, j) : i, j = 1, 2, \cdots, d\big)$ is a proper probability distribution with one marginal distribution $\lambda_0$. Thus, there exists a random variable $\tilde{I}_1$ such that $P\big(\tilde{I}_0 = i, \tilde{I}_1 = j\big) = \lambda_1(i, j)$ for $i, j = 1, 2, \cdots, d$ and $\big(I_{n,0}, I_{n,1}\big) \xrightarrow{d} \big(\tilde{I}_0, \tilde{I}_1\big)$ as $n \to \infty$.

Assume that there exist random variables I˜0, I˜1, ··· , I˜k such that

\[
P\big( \tilde{I}_0 = i_0, \tilde{I}_1 = i_1, \cdots, \tilde{I}_k = i_k \big) = \frac{v_{i_k}\, m_{i_k i_{k-1}} \cdots m_{i_1 i_0}}{(\mathbf{1} \cdot v)\, \rho^k} \equiv \lambda_k(i_0, i_1, \cdots, i_k)
\]
and, as $n \to \infty$,

\[
\big(I_{n,0}, I_{n,1}, \cdots, I_{n,k}\big) \xrightarrow{d} \big(\tilde{I}_0, \tilde{I}_1, \cdots, \tilde{I}_k\big).
\]

Then

\begin{align*}
&P\big( I_{n,k+1} = i_{k+1}, I_{n,k} = i_k, \cdots, I_{n,1} = i_1 \,\big|\, I_{n,0} = i_0 \big) \\
&\quad = E\left( \frac{\sum_{j_{k+1}=1}^{Z_{n-(k+1),i_{k+1}}} \sum_{j_k=1}^{\xi^{(i_{k+1})i_k}_{n-(k+1),j_{k+1}}} \cdots \sum_{j_1=1}^{\xi^{(i_2)i_1}_{n-2,j_2}} \xi^{(i_1)i_0}_{n-1,j_1}}{Z_{n,i_0}} \right) \\
&\quad = E\left( \frac{1}{Z_{n-(k+1),i_{k+1}}} \sum_{j_{k+1}=1}^{Z_{n-(k+1),i_{k+1}}} \frac{1}{\xi^{(i_{k+1})i_k}_{n-(k+1),j_{k+1}}} \sum_{j_k=1}^{\xi^{(i_{k+1})i_k}_{n-(k+1),j_{k+1}}} \cdots \frac{1}{\xi^{(i_2)i_1}_{n-2,j_2}} \sum_{j_1=1}^{\xi^{(i_2)i_1}_{n-2,j_2}} \xi^{(i_1)i_0}_{n-1,j_1} \cdot \frac{Z_{n-(k+1),i_{k+1}}/\rho^{\,n-(k+1)}}{Z_{n,i_0}/\rho^{\,n}} \cdot \frac{1}{\rho^{\,k+1}} \right)
\end{align*}
and, again by Theorem 1.6, the strong law of large numbers and the bounded convergence theorem, we have that, as $n \to \infty$,

\[
P\big( I_{n,k+1} = i_{k+1}, I_{n,k} = i_k, \cdots, I_{n,1} = i_1 \,\big|\, I_{n,0} = i_0 \big) \to \frac{v_{i_{k+1}}\, m_{i_{k+1} i_k}\, m_{i_k i_{k-1}} \cdots m_{i_1 i_0}}{v_{i_0}\, \rho^{k+1}}.
\]

Hence, as n → ∞,

\begin{align*}
P\big( I_{n,0} = i_0, I_{n,1} = i_1, \cdots, I_{n,k+1} = i_{k+1} \big)
&= P\big( I_{n,k+1} = i_{k+1}, \cdots, I_{n,1} = i_1 \,\big|\, I_{n,0} = i_0 \big)\, P\big( I_{n,0} = i_0 \big) \\
&\to \frac{v_{i_{k+1}}\, m_{i_{k+1} i_k}\, m_{i_k i_{k-1}} \cdots m_{i_1 i_0}}{(\mathbf{1} \cdot v)\, \rho^{k+1}} \equiv \lambda_{k+1}(i_0, i_1, \cdots, i_{k+1})
\end{align*}
and

\[
\sum_{i_0=1}^{d} \sum_{i_1=1}^{d} \cdots \sum_{i_{k+1}=1}^{d} \lambda_{k+1}(i_0, i_1, \cdots, i_{k+1}) = \sum_{i_0=1}^{d} \sum_{i_1=1}^{d} \cdots \sum_{i_k=1}^{d} \lambda_k(i_0, i_1, \cdots, i_k) = 1.
\]

So, there exists a random variable I˜k+1 such that

\[
P\big( \tilde{I}_0 = i_0, \tilde{I}_1 = i_1, \cdots, \tilde{I}_k = i_k, \tilde{I}_{k+1} = i_{k+1} \big) = \lambda_{k+1}(i_0, i_1, \cdots, i_k, i_{k+1}) = \frac{v_{i_{k+1}}\, m_{i_{k+1} i_k} \cdots m_{i_1 i_0}}{(\mathbf{1} \cdot v)\, \rho^{k+1}}
\]

and, as $n \to \infty$, $\big(I_{n,0}, I_{n,1}, \cdots, I_{n,k}, I_{n,k+1}\big) \xrightarrow{d} \big(\tilde{I}_0, \tilde{I}_1, \cdots, \tilde{I}_k, \tilde{I}_{k+1}\big)$.
By the principle of mathematical induction, we have proved that, for any integer $k \ge 0$, there exist random variables $\tilde{I}_0, \tilde{I}_1, \cdots, \tilde{I}_k$ such that

\[
\big(I_{n,0}, I_{n,1}, \cdots, I_{n,k}\big) \xrightarrow{d} \big(\tilde{I}_0, \tilde{I}_1, \cdots, \tilde{I}_k\big) \quad \text{as } n \to \infty,
\]

 and, for any i0, i1, ··· , ik ∈ 1, 2, ··· , d ,

\[
P\big( \tilde{I}_0 = i_0, \tilde{I}_1 = i_1, \cdots, \tilde{I}_k = i_k \big) = \frac{v_{i_k}\, m_{i_k i_{k-1}}\, m_{i_{k-1} i_{k-2}} \cdots m_{i_1 i_0}}{(\mathbf{1} \cdot v)\, \rho^k}.
\]

Now, we prove the Markov property of $\{\tilde{I}_n\}_{n\ge0}$.
For any $n \ge 1$ and any $i, j, i_0, \cdots, i_{n-1} \in \{1, 2, \cdots, d\}$, we have

\begin{align*}
P\big( \tilde{I}_{n+1} = j \,\big|\, \tilde{I}_n = i, \tilde{I}_{n-1} = i_{n-1}, \cdots, \tilde{I}_0 = i_0 \big)
&= \frac{P\big( \tilde{I}_{n+1} = j, \tilde{I}_n = i, \tilde{I}_{n-1} = i_{n-1}, \cdots, \tilde{I}_0 = i_0 \big)}{P\big( \tilde{I}_n = i, \tilde{I}_{n-1} = i_{n-1}, \cdots, \tilde{I}_0 = i_0 \big)} \\
&= \frac{v_j\, m_{ji}\, m_{i i_{n-1}} \cdots m_{i_1 i_0} / \big((\mathbf{1} \cdot v)\, \rho^{n+1}\big)}{v_i\, m_{i i_{n-1}} \cdots m_{i_1 i_0} / \big((\mathbf{1} \cdot v)\, \rho^{n}\big)} = \frac{v_j m_{ji}}{v_i\, \rho} \equiv p_{ij}.
\end{align*}
So, the conditional probability distribution of the future state of the chain $\{\tilde{I}_n\}_{n\ge0}$, given the present state and the past states, only depends on the present state. Therefore, $\{\tilde{I}_n\}_{n\ge0}$ is a Markov chain with the state space $\{1, 2, \cdots, d\}$ such that

 (a) the initial distribution λ0 ≡ λ0(1), λ0(2), ··· , λ0(d) where

\[
\lambda_0(i) = \frac{v_i}{\mathbf{1} \cdot v} \quad \text{for any } i = 1, 2, \cdots, d.
\]

 (b) the transition probability P ≡ pi j : i, j = 1, 2, ··· , d , where

\[
p_{ij} = \frac{v_j m_{ji}}{v_i\, \rho} \quad \text{for any } n = 0, 1, 2, \cdots
\]

It remains to show that the Markov chain $\{\tilde{I}_n\}_{n\ge0}$ has a stationary distribution $\pi \equiv (\pi_1, \pi_2, \cdots, \pi_d)$, where

\[
\pi_i = \frac{u_i v_i}{u \cdot v} \quad \text{for any } i = 1, 2, \cdots, d.
\]
Since $u > 0$ and $v > 0$, $\pi_i = \frac{u_i v_i}{u \cdot v} > 0$. Also,
\[
\sum_{i=1}^{d} \pi_i = \sum_{i=1}^{d} \frac{u_i v_i}{u \cdot v} = \frac{u \cdot v}{u \cdot v} = 1.
\]
So, $\pi \equiv (\pi_1, \pi_2, \cdots, \pi_d)$ is a probability distribution.
Moreover, since $u$ is a right eigenvector of $M$ associated with the eigenvalue $\rho$, for any $j = 1, 2, \cdots, d$,
\[
\sum_{i=1}^{d} \pi_i\, p_{ij} = \sum_{i=1}^{d} \frac{u_i v_i}{u \cdot v} \cdot \frac{v_j m_{ji}}{v_i\, \rho} = \frac{v_j}{\rho\, (u \cdot v)} \sum_{i=1}^{d} m_{ji}\, u_i = \frac{v_j}{\rho\, (u \cdot v)} \cdot \rho\, u_j = \frac{v_j u_j}{u \cdot v} = \pi_j
\]
and hence $\pi$ is a stationary distribution of the transition probability $P$.

Therefore, the proof is complete.

3.5.3 The Proof of Theorem 3.12

Before we prove Theorem 3.12, we need the following lemmas.

Let u and v = (v1, v2, ··· , vd) be the right and left eigenvector, respectively, of the offspring mean matrix M associated with the maximal eigenvalue ρ.

Lemma 3.2. (Mode, 1971) Let $\rho = 1$, $|Z_0| = 1$ and $E\|Z_1\|^2 < \infty$. Then, for any $i = 1, 2, \cdots, d$,

\[
\lim_{n\to\infty} E\Big( \frac{Z_{n,i}}{n} \,\Big|\, |Z_n| > 0 \Big) = v_i\, \big(v \cdot Q[u]\big).
\]

Remark 3.3. From Lemma 3.2, we know that, as $n \to \infty$, $\big(Z_{n,i}/n \,\big|\, |Z_n| > 0\big)$ converges to $v_i\, (v \cdot Q[u])$ in $L^1$ and hence in probability.

Lemma 3.3. (Karlin, 1966) Let $K = \{x = (x_1, x_2, \cdots, x_d) : x_i > 0,\; x \cdot u = 1\}$. Then

lim sup kxMn − ρn − vk = 0. n→∞ x∈K

Lemma 3.3 is in the same spirit as the Perron-Frobenius theorem, and we will make use of it to prove the next lemma.

Lemma 3.4. Let $\rho = 1$, $|Z_0| = 1$ and $E\|Z_1\|^2 < \infty$. Then, for any $i = 1, 2, \cdots, d$ and any $\epsilon > 0$,
\[
\lim_{n\to\infty} P\Big( \omega : \Big| \frac{Z_{n,i}(\omega)}{u\cdot Z_n(\omega)} - v_i \Big| > \epsilon \,\Big|\, |Z_n| > 0 \Big) = 0.
\]
Proof. Let $Z^{(j)l}_m(n) = \big( Z^{(j)l}_{m,1}(n), Z^{(j)l}_{m,2}(n), \cdots, Z^{(j)l}_{m,d}(n) \big)$, where $Z^{(j)l}_{m,i}(n)$ is the number of type $i$ offspring in the $(m+n)$th generation of the $l$th individual of type $j$ in the $n$th generation. Then, by the additive property,

$Z^{(k)}_{n+m,i}$ = the number of individuals of type $i$ in the $(m+n)$th generation of the process initiated with an individual of type $k$, so that
\[
Z^{(k)}_{n+m,i} = \sum_{j=1}^{d} \sum_{l=1}^{Z^{(k)}_{n,j}} Z^{(j)l}_{m,i}(n).
\]
Let $X_n = (X_{n,1}, X_{n,2}, \cdots, X_{n,d}) \equiv \dfrac{Z_n}{u\cdot Z_n}$, that is,
\[
X_{n,j} = \frac{Z_{n,j}}{u\cdot Z_n} \quad \text{for all } j = 1, 2, \cdots, d.
\]

Then

\[
Z^{(k)}_{n+m} = \sum_{j=1}^{d} \sum_{l=1}^{Z^{(k)}_{n,j}} Z^{(j)l}_{m}(n)
= \sum_{j=1}^{d} \sum_{l=1}^{Z^{(k)}_{n,j}} \big( Z^{(j)l}_{m}(n) - e_j M^m \big) + \sum_{j=1}^{d} Z^{(k)}_{n,j}\, e_j M^m
\]

and hence

\[
u\cdot Z^{(k)}_{n+m} = \sum_{j=1}^{d} \sum_{l=1}^{Z^{(k)}_{n,j}} \big( u\cdot Z^{(j)l}_{m}(n) - \rho^m u_j \big) + \rho^m \big( u\cdot Z^{(k)}_{n} \big),
\]
using $M^m u = \rho^m u$, so that $u\cdot (e_j M^m) = \rho^m u_j$. So,

\[
X^{(k)}_{n+m} = \frac{Z^{(k)}_{n+m}}{u\cdot Z^{(k)}_{n+m}}
= \frac{\rho^{-m} \sum_{j=1}^{d} \frac{Z^{(k)}_{n,j}}{u\cdot Z^{(k)}_n}\, e_j M^m + \sum_{j=1}^{d} \frac{X_{n,j}}{Z_{n,j}} \sum_{l=1}^{Z^{(k)}_{n,j}} \rho^{-m}\big( Z^{(j)l}_m(n) - e_j M^m \big)}{1 + \sum_{j=1}^{d} \frac{X_{n,j}}{Z_{n,j}} \sum_{l=1}^{Z^{(k)}_{n,j}} \rho^{-m}\big( u\cdot Z^{(j)l}_m(n) - \rho^m u_j \big)}.
\]
Suppressing the superscript $k$ and letting
\[
r_{nm} = \sum_{j=1}^{d} \frac{X_{n,j}}{Z_{n,j}} \sum_{l=1}^{Z_{n,j}} \rho^{-m}\big( u\cdot Z^{(j)l}_m(n) - \rho^m u_j \big)
\]
and
\[
\alpha_{nm} = \sum_{j=1}^{d} \frac{X_{n,j}}{Z_{n,j}} \sum_{l=1}^{Z_{n,j}} \rho^{-m}\big( Z^{(j)l}_m(n) - e_j M^m \big),
\]
we have
\[
X_{n+m} = \frac{\rho^{-m} X_n M^m + \alpha_{nm}}{1 + r_{nm}}
\]
and hence
\[
X_{n+m} - v = \frac{X_n M^m \rho^{-m} - v - v r_{nm} + \alpha_{nm}}{1 + r_{nm}}. \tag{3.1}
\]

First, since $\{u\cdot Z^{(j)l}_m\}$ are i.i.d. random variables with $E\big( u\cdot Z^{(j)l}_m - \rho^m u_j \big) = 0$ and $\{Z^{(j)l}_m\}$ are i.i.d. random vectors with $E\big( Z^{(j)l}_m(n) - e_j M^m \big) = 0$, by the strong law of large numbers, on the set $\{|Z_n| > 0,\ Z_n \to \infty\}$,
\[
\frac{1}{Z_{n,j}} \sum_{l=1}^{Z_{n,j}} \big( u\cdot Z^{(j)l}_m(n) - \rho^m u_j \big) \to 0 \quad \text{w.p.1}
\]
and
\[
\frac{1}{Z_{n,j}} \sum_{l=1}^{Z_{n,j}} \big( Z^{(j)l}_m(n) - e_j M^m \big) \to 0 \quad \text{w.p.1},
\]
and hence $\lim_{n\to\infty} |r_{nm}| = 0$ and $\lim_{n\to\infty} \|\alpha_{nm}\| = 0$. Therefore, for any $\eta > 0$,
\[
\lim_{n\to\infty} P\big( |r_{nm}| > \eta,\ Z_n \to \infty \,\big|\, |Z_n| > 0 \big) = \lim_{n\to\infty} P\big( \|\alpha_{nm}\| > \eta,\ Z_n \to \infty \,\big|\, |Z_n| > 0 \big) = 0.
\]

Let $\epsilon > 0$ be arbitrary. Then, by Lemma 3.3, there exists an $m_0$ such that for all $m \ge m_0$,
\[
\sup_{x\in K} \|x M^m \rho^{-m} - v\| \le \epsilon.
\]

From (3.1), we have, for any $\epsilon > 0$, $\eta > 0$, that
\[
P\Big( Z_n \to \infty,\ \|X_{n+m} - v\| \le \frac{\epsilon + \eta + \|v\|\eta}{1 - \eta} \,\Big|\, |Z_n| > 0 \Big)
\ge 1 - P\big( |r_{nm}| > \eta,\ Z_n \to \infty \,\big|\, |Z_n| > 0 \big) - P\big( \|\alpha_{nm}\| > \eta,\ Z_n \to \infty \,\big|\, |Z_n| > 0 \big).
\]
This implies that for any $\epsilon > 0$, $\eta > 0$,
\[
\limsup_{n\to\infty} P\Big( \|X_{n+m} - v\| > \frac{\epsilon + \eta + \|v\|\eta}{1 - \eta} \,\Big|\, |Z_n| > 0 \Big) = 0,
\]
which proves Lemma 3.4.



Remark 3.4. Lemma 3.4 demonstrates that the proportions of individuals of the various types approach the corresponding ratios of the components of the left eigenvector $v$ of the mean matrix $M$ associated with the maximal eigenvalue $\rho$. That is, for any $i = 1, 2, \cdots, d$, the random quantity $\frac{Z_{n,i}}{|Z_n|}$, conditioned on $\{|Z_n| > 0\}$, converges to $\frac{v_i}{\mathbf{1}\cdot v}$ (a non-random quantity) in probability as $n\to\infty$.
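The mechanism behind this convergence is already visible in the deterministic mean iteration of Lemma 3.3: for $\rho = 1$, the iterates $xM^n$ of any $x$ with $x\cdot u = 1$ converge to $v$. A pure-Python sketch, in which the $2\times 2$ matrix and its eigenvectors are hypothetical illustrations rather than anything from the text:

```python
# Hypothetical critical mean matrix (rho = 1); M[i][j] maps x -> x M
M = [[0.5, 0.5],
     [1.0, 0.0]]
v = [2/3, 1/3]   # left eigenvector: v M = v
x = [0.9, 0.1]   # any starting vector with x . u = 1 (here u = (1, 1))

# iterate x <- x M; x . u is preserved, and x converges to v
for _ in range(50):
    x = [x[0] * M[0][j] + x[1] * M[1][j] for j in range(2)]

assert max(abs(x[j] - v[j]) for j in range(2)) < 1e-6
```

The second eigenvalue of this particular matrix is $-0.5$, so the error contracts geometrically, mirroring the rate at which type proportions stabilize.
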

Now, we are ready to prove Theorem 3.12 using the principle of mathematical induction.

First, consider $k = 0$. For any $i_0 = 1, 2, \cdots, d$, by the bounded convergence theorem,
\[
P\big( I_{n,0} = i_0 \,\big|\, |Z_n| > 0 \big) = E\Big( \frac{Z_{n,i_0}}{|Z_n|} \,\Big|\, |Z_n| > 0 \Big) \to \frac{v_{i_0}}{\mathbf{1}\cdot v} \equiv \lambda_0(i_0)
\]
and $\sum_{i=1}^{d} \lambda_0(i) = 1$. So, there exists a random variable $\tilde{I}_0$ on $\{1, 2, \cdots, d\}$ such that $P(\tilde{I}_0 = i) = \lambda_0(i)$ for all $i = 1, 2, \cdots, d$ and
\[
\big( I_{n,0} \,\big|\, |Z_n| > 0 \big) \xrightarrow{d} \tilde{I}_0 \quad \text{as } n\to\infty.
\]

Next, for $k = 1$, let $\xi^{(i)}_{n,j} = \big( \xi^{(i)1}_{n,j}, \xi^{(i)2}_{n,j}, \cdots, \xi^{(i)d}_{n,j} \big)$ be the offspring vector of the $j$th individual of type $i$ in the $n$th generation. Then $\{\xi^{(i_1)i_0}_{n,j}\}_{j\ge 1, n\ge 1}$ are i.i.d. random variables with $E\big( \xi^{(i_1)i_0}_{n,j} \big) = m_{i_1 i_0} < \infty$. Since, on the set $\{|Z_n| > 0\}$, $Z_{n,i_0} \to \infty$ w.p.1, by the strong law of large numbers, as $n\to\infty$,
\[
\Big( \frac{1}{Z_{n,i_0}} \sum_{j=1}^{Z_{n,i_0}} \xi^{(i_1)i_0}_{n,j} \,\Big|\, |Z_n| > 0 \Big) \to m_{i_1 i_0} \quad \text{w.p.1}.
\]
Also, it is known from Lemma 3.2 that $\big( \frac{Z_{n,i}}{n} \,\big|\, |Z_n| > 0 \big)$ converges to $v_i Y$ in $L^1$ and hence in probability as $n\to\infty$, where $Y$ is the exponential random variable defined as in Theorem 1.8 (b). So, by the bounded convergence theorem,

\[
P\big( I_{n,1} = i_1 \,\big|\, I_{n,0} = i_0, |Z_n| > 0 \big)
= E\Bigg( \frac{\sum_{j=1}^{Z_{n-1,i_1}} \xi^{(i_1)i_0}_{n-1,j}}{Z_{n,i_0}} \,\Bigg|\, |Z_n| > 0 \Bigg)
= E\Bigg( \frac{1}{Z_{n-1,i_1}} \sum_{j=1}^{Z_{n-1,i_1}} \xi^{(i_1)i_0}_{n-1,j} \cdot \frac{Z_{n-1,i_1}/(n-1)}{Z_{n,i_0}/n} \cdot \frac{n-1}{n} \,\Bigg|\, |Z_n| > 0 \Bigg)
\to m_{i_1 i_0} \cdot \frac{v_{i_1} Y}{v_{i_0} Y} = \frac{v_{i_1} m_{i_1 i_0}}{v_{i_0}} \quad \text{as } n\to\infty.
\]

Hence, we have that, as $n\to\infty$,
\[
P\big( I_{n,0} = i_0, I_{n,1} = i_1 \,\big|\, |Z_n| > 0 \big)
= P\big( I_{n,1} = i_1 \,\big|\, I_{n,0} = i_0, |Z_n| > 0 \big)\, P\big( I_{n,0} = i_0 \,\big|\, |Z_n| > 0 \big)
\to \frac{v_{i_1} m_{i_1 i_0}}{\mathbf{1}\cdot v} \equiv \lambda_1(i_0, i_1)
\]
and
\[
\sum_{i=1}^{d} \sum_{j=1}^{d} \lambda_1(i, j) = \sum_{i=1}^{d} \frac{1}{\mathbf{1}\cdot v} \sum_{j=1}^{d} v_j m_{ji} = \sum_{i=1}^{d} \frac{v_i}{\mathbf{1}\cdot v} = \sum_{i=1}^{d} \lambda_0(i) = 1
\]
since $v$ is the left eigenvector of $M$ associated with the eigenvalue $\rho = 1$.

So, $\{\lambda_1(i, j) : i, j = 1, 2, \cdots, d\}$ is a proper probability distribution with one marginal distribution $\lambda_0$. Thus, there exists a random variable $\tilde{I}_1$ such that $P\big( \tilde{I}_0 = i, \tilde{I}_1 = j \big) = \lambda_1(i, j)$ for $i, j = 1, 2, \cdots, d$ and
\[
\big( (I_{n,0}, I_{n,1}) \,\big|\, |Z_n| > 0 \big) \xrightarrow{d} \big( \tilde{I}_0, \tilde{I}_1 \big) \quad \text{as } n\to\infty.
\]

Now, assume that there exist random variables I˜0, I˜1, ··· , I˜k such that

\[
P\big( \tilde{I}_0 = i_0, \tilde{I}_1 = i_1, \cdots, \tilde{I}_k = i_k \big) = \frac{v_{i_k} m_{i_k i_{k-1}} \cdots m_{i_1 i_0}}{(\mathbf{1}\cdot v)\rho^k} \equiv \lambda_k(i_0, i_1, \cdots, i_k)
\]
and, as $n\to\infty$,
\[
\big( (I_{n,0}, I_{n,1}, \cdots, I_{n,k}) \,\big|\, |Z_n| > 0 \big) \xrightarrow{d} \big( \tilde{I}_0, \tilde{I}_1, \cdots, \tilde{I}_k \big).
\]

Then
\[
P\big( I_{n,k+1} = i_{k+1}, I_{n,k} = i_k, \cdots, I_{n,1} = i_1 \,\big|\, I_{n,0} = i_0, |Z_n| > 0 \big)
= E\Bigg( \frac{\sum_{j_{k+1}=1}^{Z_{n-(k+1),i_{k+1}}} \sum_{j_k=1}^{\xi^{(i_{k+1})i_k}_{n-(k+1),j_{k+1}}} \cdots \sum_{j_1} \xi^{(i_1)i_0}_{n-1,j_1}}{Z_{n,i_0}} \,\Bigg|\, |Z_n| > 0 \Bigg),
\]
which, after normalizing each inner sum by the number of its summands, can be written as
\[
E\Bigg( \frac{1}{Z_{n-(k+1),i_{k+1}}} \sum_{j_{k+1}=1}^{Z_{n-(k+1),i_{k+1}}} \frac{1}{\xi^{(i_{k+1})i_k}_{n-(k+1),j_{k+1}}} \sum_{j_k=1}^{\xi^{(i_{k+1})i_k}_{n-(k+1),j_{k+1}}} \cdots\ \xi^{(i_1)i_0}_{n-1,j_1} \cdot \frac{Z_{n-(k+1),i_{k+1}}/\big(n-(k+1)\big)}{Z_{n,i_0}/n} \cdot \frac{n-(k+1)}{n} \,\Bigg|\, |Z_n| > 0 \Bigg)
\]
and, again by Theorem 1.6, the strong law of large numbers and the bounded convergence theorem, we have that, as $n\to\infty$,
\[
P\big( I_{n,k+1} = i_{k+1}, I_{n,k} = i_k, \cdots, I_{n,1} = i_1 \,\big|\, I_{n,0} = i_0, |Z_n| > 0 \big) \to \frac{v_{i_{k+1}} m_{i_{k+1} i_k} m_{i_k i_{k-1}} \cdots m_{i_1 i_0}}{v_{i_0}\,\rho^{k+1}}.
\]

Hence, as $n\to\infty$,
\[
P\big( I_{n,0} = i_0, I_{n,1} = i_1, \cdots, I_{n,k+1} = i_{k+1} \,\big|\, |Z_n| > 0 \big)
= P\big( I_{n,k+1} = i_{k+1}, \cdots, I_{n,1} = i_1 \,\big|\, I_{n,0} = i_0, |Z_n| > 0 \big)\, P\big( I_{n,0} = i_0 \,\big|\, |Z_n| > 0 \big)
\]
\[
\to \frac{v_{i_{k+1}} m_{i_{k+1} i_k} m_{i_k i_{k-1}} \cdots m_{i_1 i_0}}{(\mathbf{1}\cdot v)\rho^{k+1}} \equiv \lambda_{k+1}(i_0, i_1, \cdots, i_{k+1})
\]
and
\[
\sum_{i_0=1}^{d} \sum_{i_1=1}^{d} \cdots \sum_{i_{k+1}=1}^{d} \lambda_{k+1}(i_0, i_1, \cdots, i_{k+1}) = \sum_{i_0=1}^{d} \sum_{i_1=1}^{d} \cdots \sum_{i_k=1}^{d} \lambda_k(i_0, i_1, \cdots, i_k) = 1.
\]

So, there exists a random variable $\tilde{I}_{k+1}$ such that
\[
P\big( \tilde{I}_0 = i_0, \tilde{I}_1 = i_1, \cdots, \tilde{I}_k = i_k, \tilde{I}_{k+1} = i_{k+1} \big) = \lambda_{k+1}(i_0, i_1, \cdots, i_k, i_{k+1}) = \frac{v_{i_{k+1}} m_{i_{k+1} i_k} \cdots m_{i_1 i_0}}{(\mathbf{1}\cdot v)\rho^{k+1}}
\]
and, as $n\to\infty$, $\big( (I_{n,0}, I_{n,1}, \cdots, I_{n,k}, I_{n,k+1}) \,\big|\, |Z_n| > 0 \big) \xrightarrow{d} \big( \tilde{I}_0, \tilde{I}_1, \cdots, \tilde{I}_k, \tilde{I}_{k+1} \big)$.

By the principle of mathematical induction, we have proved that, for any integer $k \ge 0$, there exist random variables $\tilde{I}_0, \tilde{I}_1, \cdots, \tilde{I}_k$ such that
\[
\big( (I_{n,0}, I_{n,1}, \cdots, I_{n,k}) \,\big|\, |Z_n| > 0 \big) \xrightarrow{d} \big( \tilde{I}_0, \tilde{I}_1, \cdots, \tilde{I}_k \big) \quad \text{as } n\to\infty,
\]
and, for any $i_0, i_1, \cdots, i_k \in \{1, 2, \cdots, d\}$,
\[
P\big( \tilde{I}_0 = i_0, \tilde{I}_1 = i_1, \cdots, \tilde{I}_k = i_k \big) = \frac{v_{i_k} m_{i_k i_{k-1}} m_{i_{k-1} i_{k-2}} \cdots m_{i_1 i_0}}{\mathbf{1}\cdot v}.
\]

By an argument similar to that in the proof for the supercritical case, one can show the Markov property of $\{\tilde{I}_n\}_{n\ge 0}$ in the critical case, and thus the proof is complete.

CHAPTER 4. COALESCENCE IN CONTINUOUS-TIME SINGLE-TYPE AGE-DEPENDENT BELLMAN-HARRIS BRANCHING PROCESSES

4.1 Introduction

Now, we consider a continuous-time single-type age-dependent Bellman-Harris branching process $\{Z(t) : t \ge 0\}$ with offspring distribution $\{p_j\}_{j\ge 0}$ and lifetime distribution $G$. Assume that this process is initiated with one individual of age 0. That is, $Z(0) = 1$.

For any family tree $T$, since every individual lives a random length of time distributed according to $G$, those who are alive at any given time $t > 0$ may belong to different generations. But if we ignore the lifetime structure of this process, there is a corresponding discrete-time single-type Galton-Watson branching process $\{Y_n\}_{n\ge 0}$ with offspring distribution $\{p_j\}_{j\ge 0}$, where $Y_n$ is the number of individuals in the continuous-time process $\{Z(t)\}$ who were born as $n$th-generation offspring; that is, each of these $Y_n$ individuals has exactly $n$ ancestors along its line of descent. We call $\{Y_n\}_{n\ge 0}$ the embedded generation process of the process $\{Z(t) : t \ge 0\}$. Therefore, the results (presented in Section 1.2) on the discrete-time Galton-Watson branching process can be applied to the embedded process $\{Y_n\}_{n\ge 0}$ of the continuous-time Bellman-Harris branching process. In this chapter, we adopt all the definitions and notations described in Section 1.4.
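The embedded generation process can be made concrete with a small simulation. The sketch below is illustrative only: it assumes offspring distribution $p_1 = p_2 = 1/2$ (so $p_0 = 0$ and $m = 1.5$) and Exp(1) lifetimes, neither of which is specified by the text, and it truncates the tree at time $t$, so the generation counts it reports cover only individuals born by time $t$:

```python
import random

# Simulate a Bellman-Harris tree up to time t.  Each individual carries
# (generation, birth_time); lifetimes are Exp(1); at death it leaves 1 or 2
# offspring with equal probability (so p_0 = 0 and the tree never dies out).
# Y[n] counts individuals of generation n born by time t -- a truncated view
# of the embedded Galton-Watson process (Y_n).
def simulate(t, seed=0):
    rng = random.Random(seed)
    alive_at_t = []          # (generation, birth_time) of individuals alive at t
    Y = {}                   # embedded generation counts (truncated at t)
    stack = [(0, 0.0)]       # the ancestor: generation 0, born at time 0
    while stack:
        gen, birth = stack.pop()
        Y[gen] = Y.get(gen, 0) + 1
        death = birth + rng.expovariate(1.0)
        if death > t:
            alive_at_t.append((gen, birth))
        else:
            for _ in range(rng.choice([1, 2])):
                stack.append((gen + 1, death))
    return alive_at_t, Y

alive, Y = simulate(t=5.0)
print(len(alive), Y)
```

Ignoring the birth times and keeping only `Y` is exactly the passage from the continuous-time process to its embedded generation process.
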

Let us now consider the coalescence problem for the continuous-time process.

Let $k \ge 2$ be a positive integer. We pick $k$ individuals from those who are alive at time $t$ (assuming $Z(t) \ge k$) by simple random sampling without replacement and trace their lines of descent backward in time until they meet for the first time.

Let $X_k(t)$ be the generation number of the last common ancestor of these $k$ randomly chosen individuals. Then we can ask the same questions as we did for the discrete-time Galton-Watson branching process. That is,

(1) What is the distribution of Xk(t)?

(2) What happens to $X_k(t)$ as $t \to \infty$?

Moreover, for a continuous-time branching process, we can ask further questions about the "time".

Let $D_k(t)$ be the coalescence time of the lines of descent of any $k$ individuals randomly chosen from the population alive at time $t$. Note that $D_k(t)$ is also the death time of the last common ancestor, and the following questions are of interest:

(3) What is the distribution of Dk(t)?

(4) What happens to $D_k(t)$ as $t \to \infty$?
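Questions (1)-(4) can also be explored empirically before any theory. The following sketch is a hypothetical illustration (offspring law $p_1 = p_2 = 1/2$ and Exp(1) lifetimes are assumed, not taken from the text): it simulates the tree, samples two individuals alive at time $t$ without replacement, and walks their lines of descent back to their last common ancestor, returning one draw of $X_2(t)$:

```python
import random

# One Monte Carlo draw of X_2(t): the generation of the last common ancestor
# of two individuals sampled without replacement from those alive at time t.
def sample_X2(t, rng):
    # nodes[i] = (parent_index, generation, death_time); root has parent -1
    nodes = [(-1, 0, rng.expovariate(1.0))]
    alive = []
    stack = [0]
    while stack:
        i = stack.pop()
        _, gen, death = nodes[i]
        if death > t:
            alive.append(i)
        else:
            for _ in range(rng.choice([1, 2])):   # 1 or 2 offspring, p = 1/2 each
                nodes.append((i, gen + 1, death + rng.expovariate(1.0)))
                stack.append(len(nodes) - 1)
    if len(alive) < 2:
        return None
    a, b = rng.sample(alive, 2)
    # walk both lines of descent backward until they meet
    anc_a = set()
    while a != -1:
        anc_a.add(a)
        a = nodes[a][0]
    while b not in anc_a:
        b = nodes[b][0]
    return nodes[b][1]   # generation number of the last common ancestor

rng = random.Random(1)
draws = [x for x in (sample_X2(5.0, rng) for _ in range(200)) if x is not None]
print(sum(draws) / len(draws))
```

The death time of the node found by the backward walk would, in the same way, give draws of $D_2(t)$.
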

In Sections 4.2 and 4.3, we present results on the limit behavior of the generation number and the death time of the last common ancestor in the supercritical and subcritical cases of the continuous-time single-type age-dependent Bellman-Harris branching process.

4.2 Results in The Supercritical Case

4.2.1 The statement of Results

The first theorem is regarding the generation number of the last common ancestor of k individuals randomly chosen from the population alive at time t.

Let $L_{n,i,k}$ be the lifetime of the ancestor in the $k$th generation of the $i$th individual in the $n$th generation; then $\{L_{n,i,k} : n \ge 0,\ i \ge 1,\ k = 0, 1, \cdots, n-1\}$ are i.i.d. copies of the lifetime random variable with distribution $G$. Let $S_{n,i} = \sum_{k=0}^{n-1} L_{n,i,k}$; then $S_{n,i}$ is the birth time of the $i$th individual in the $n$th generation.

Theorem 4.1. Let $1 < m < \infty$, $p_0 = 0$ and suppose the lifetime distribution $G$ is non-lattice with $G(0+) = 0$. If $\sum_{j=1}^{\infty} (j \log j) p_j < \infty$, then, for any integer $k \ge 2$,

(a) for almost all trees $T$ and $r = 0, 1, 2, \cdots$,
\[
P\big( X_k(t) < r \,\big|\, T \big) \to \phi_k(r, T) \equiv 1 - \frac{\sum_{i=1}^{Y_r} \big( e^{-\alpha S_{r,i}} W_{r,i} \big)^k}{W^k}
\]
as $t\to\infty$, where $\{W_{r,i}\}_{i\ge 1}$ are i.i.d. copies of the random variable $W$ in Theorem 1.11 (b);

(b) there exists a random variable $\tilde{X}_k$ on $\{0, 1, 2, \cdots\}$ such that $X_k(t) \xrightarrow{d} \tilde{X}_k$ as $t\to\infty$ and
\[
P\big( \tilde{X}_k < r \big) = 1 - E\Bigg( \frac{\sum_{i=1}^{Y_r} \big( e^{-\alpha S_{r,i}} W_{r,i} \big)^k}{W^k} \Bigg) \equiv \phi_k(r)
\]
for any $r = 0, 1, 2, \cdots$.

Similar to the result for the discrete-time process, as $k\to\infty$ the random variable $\tilde{X}_k$ also converges in distribution to a proper random variable, namely the index of the last generation consisting of only one individual. This is stated in the next theorem.

Theorem 4.2. Let $1 < m < \infty$ and $U = \min\{n \ge 1 : Y_n \ge 2\}$. Under the same hypotheses as Theorem 4.1, $\tilde{X}_k \xrightarrow{d} U - 1$ as $k\to\infty$.

Now, we switch our focus to the death time of the last common ancestor.

Let $L_{s,i}$ be the total lifetime of the $i$th individual alive at time $s$. Then $\{L_{s,i}\}_{i\ge 1}$ are i.i.d. copies of the lifetime random variable with distribution $G$.

Let $a_{s,i}$ be the corresponding age and $R_{s,i}$ the corresponding residual lifetime at time $s$. That is,
\[
R_{s,i} = L_{s,i} - a_{s,i} \quad \text{for any } i \ge 1 \text{ and any } s \ge 0.
\]

Theorem 4.3. Let $1 < m < \infty$, $p_0 = 0$ and suppose the lifetime distribution $G$ is non-lattice with $G(0+) = 0$. If $\sum_{j=1}^{\infty} (j \log j) p_j < \infty$, then, for any integer $k \ge 2$,

(a) for almost all trees $T$ and any $s \ge 0$,
\[
P\big( D_k(t) \le s \,\big|\, T \big) \to H_k(s, T) \equiv 1 - \frac{\sum_{i=1}^{Z(s)} \big( e^{-\alpha R_{s,i}} \tilde{W}_{s,i} \big)^k}{\big( \sum_{i=1}^{Z(s)} e^{-\alpha R_{s,i}} \tilde{W}_{s,i} \big)^k}
\]
as $t\to\infty$, where $\{\tilde{W}_{s,i}\}_{i\ge 1}$ are i.i.d. copies of the sum $\sum_{j=1}^{\xi} W_j$, $\xi$ is a random variable with the offspring distribution $\{p_j\}_{j\ge 0}$ and $\{W_j\}_{j\ge 1}$ are i.i.d. copies of $W$ as defined in Theorem 1.11 (b);

(b) there exists a random variable $\tilde{D}_k$ on the set of non-negative real numbers such that $D_k(t) \xrightarrow{d} \tilde{D}_k$ as $t\to\infty$ and
\[
P\big( \tilde{D}_k \le s \big) = 1 - E\Bigg( \frac{\sum_{i=1}^{Z(s)} \big( e^{-\alpha R_{s,i}} \tilde{W}_{s,i} \big)^k}{\big( \sum_{i=1}^{Z(s)} e^{-\alpha R_{s,i}} \tilde{W}_{s,i} \big)^k} \Bigg) \equiv H_k(s)
\]
for any $s \ge 0$.

The next theorem shows that $\tilde{D}_k$, the limit of the death time of the last common ancestor of $k$ randomly chosen individuals, converges in distribution, as $k\to\infty$, to the first time at which the process splits into more than one individual.

Theorem 4.4. Let $1 < m < \infty$ and $U = \min\{n \ge 1 : Y_n \ge 2\}$. Under the same hypotheses as Theorem 4.3, there exists a random variable $\tilde{D}$ such that $\tilde{D}_k \xrightarrow{d} \tilde{D}$ as $k\to\infty$ and, for any $s \ge 0$,
\[
P\big( \tilde{D} \le s \big) = P\big( L_0 + L_1 + \cdots + L_{U-1} \le s \big),
\]
where $\{L_i\}_{i\ge 0}$ are i.i.d. copies of the lifetime random variable $L$ with distribution $G$ and $U$ is as defined in Theorem 4.2.

4.2.2 The proof of Theorem 4.1

Let $\{Y_n\}_{n\ge 0}$ be the embedded generation process of the continuous-time Bellman-Harris process $\{Z(t) : t \ge 0\}$.

Let $\{Z_{r,i}(t) : t > 0\}$ be the continuous-time single-type age-dependent Bellman-Harris branching process initiated with the $i$th individual in the $r$th generation when it is of age 0.

Let $L_{r,i,k}$ be the lifetime of the ancestor in the $k$th generation of the $i$th individual in the $r$th generation; then $\{L_{r,i,k} : r \ge 0,\ i \ge 1,\ k = 0, 1, \cdots, r-1\}$ are i.i.d. copies of the lifetime random variable with distribution $G$. Let $S_{r,i} = \sum_{k=0}^{r-1} L_{r,i,k}$; then $S_{r,i}$ is the birth time of the $i$th individual in the $r$th generation.

(a) For almost all trees $T$ and any $r = 0, 1, 2, \cdots$,
\[
P\big( X_k(t) \ge r \,\big|\, T \big)
= \frac{\sum_{i=1}^{Y_r} Z_{r,i}(t-S_{r,i})\big( Z_{r,i}(t-S_{r,i}) - 1 \big) \cdots \big( Z_{r,i}(t-S_{r,i}) - k + 1 \big)}{Z(t)\big( Z(t) - 1 \big) \cdots \big( Z(t) - k + 1 \big)} \tag{4.1}
\]
\[
= \frac{\sum_{i=1}^{Y_r} e^{-\alpha(t-S_{r,i})} Z_{r,i}(t-S_{r,i}) \cdot e^{-\alpha(t-S_{r,i})}\big( Z_{r,i}(t-S_{r,i}) - 1 \big) \cdots e^{-\alpha(t-S_{r,i})}\big( Z_{r,i}(t-S_{r,i}) - k + 1 \big) \cdot e^{-k\alpha S_{r,i}}}{e^{-\alpha t} Z(t) \cdot e^{-\alpha t}\big( Z(t) - 1 \big) \cdots e^{-\alpha t}\big( Z(t) - k + 1 \big)},
\]

where $\alpha$ is the Malthusian parameter for the offspring mean $m$ and the lifetime distribution $G$. It is known from Theorem 1.11 that if $Z(0) = 1$, $p_0 = 0$ and $\sum_{j=1}^{\infty} (j \log j) p_j < \infty$, then

e−αtZ(t) → W w.p.1 as t → ∞

where W is a random variable such that P(W > 0) = 1. So, as t → ∞,

\[
P\big( X_k(t) \ge r \,\big|\, T \big) \to \frac{\sum_{i=1}^{Y_r} \big( e^{-\alpha S_{r,i}} W_{r,i} \big)^k}{W^k} \equiv 1 - \phi_k(r, T)
\]
as $t\to\infty$, where $\{W_{r,i}\}_{i\ge 1}$ are i.i.d. copies of $W$.

(b) Since $P(X_k(t) \ge r) = E\big( P(X_k(t) \ge r \,|\, T) \big)$, by the bounded convergence theorem,
\[
P\big( X_k(t) \ge r \big) \to E\Bigg( \frac{\sum_{i=1}^{Y_r} \big( e^{-\alpha S_{r,i}} W_{r,i} \big)^k}{W^k} \Bigg) \equiv 1 - \phi_k(r) \quad \text{as } t\to\infty
\]

for r = 1, 2, ··· .

To finish the proof, we need to show that $\phi_k$ is a proper probability distribution, i.e., $\phi_k(r) \to 1$ as $r\to\infty$, and for this it is sufficient to prove that
\[
\sum_{i=1}^{Y_r} \big( e^{-\alpha S_{r,i}} W_{r,i} \big)^k \to 0 \quad \text{in probability as } r\to\infty.
\]
(Then, by the bounded convergence theorem, we have that $E\Big( \sum_{i=1}^{Y_r} \big( e^{-\alpha S_{r,i}} W_{r,i} \big)^k / W^k \Big) \to 0$ as $r\to\infty$ and hence the proof is complete.)

First, we have that

\[
\Big( \max_{1\le i\le Y_r} e^{-\alpha S_{r,i}} W_{r,i} \Big)^k
\le \sum_{i=1}^{Y_r} \big( e^{-\alpha S_{r,i}} W_{r,i} \big)^k
\le \Big( \max_{1\le i\le Y_r} e^{-\alpha S_{r,i}} W_{r,i} \Big)^{k-1} \sum_{i=1}^{Y_r} e^{-\alpha S_{r,i}} W_{r,i}. \tag{4.2}
\]

Next,
\[
E\Big( \sum_{i=1}^{Y_r} e^{-\alpha S_{r,i}} W_{r,i} \Big)
= E\Bigg( E\Big( \sum_{i=1}^{Y_r} e^{-\alpha S_{r,i}} W_{r,i} \,\Big|\, L_0, L_1, \cdots, L_{r-1}, Y_0, Y_1, \cdots, Y_r \Big) \Bigg)
= E\Bigg( \sum_{i=1}^{Y_r} e^{-\alpha S_{r,i}}\, E\big( W_{r,i} \,\big|\, L_0, L_1, \cdots, L_{r-1}, Y_0, Y_1, \cdots, Y_r \big) \Bigg).
\]
Note that $\{W_{r,i}\}_{i\ge 1}$ are i.i.d. copies of $W$ and are independent of $\big( L_0, L_1, \cdots, L_{r-1}, Y_0, Y_1, \cdots, Y_r \big)$, so
\[
E\Big( \sum_{i=1}^{Y_r} e^{-\alpha S_{r,i}} W_{r,i} \Big)
= EW \cdot E\Big( \sum_{i=1}^{Y_r} e^{-\alpha S_{r,i}} \Big)
= EW \cdot E\Big( Y_r\, E\big( e^{-\alpha S_{r,1}} \,\big|\, Y_r \big) \Big)
= EW \cdot EY_r \cdot E\big( e^{-\alpha S_{r,1}} \big)
= EW \cdot EY_r \cdot \big( E e^{-\alpha L} \big)^r,
\]
since $\{S_{r,i}\}_{i\ge 1} \equiv \big\{ \sum_{k=0}^{r-1} L_{r,i,k} \big\}_{i\ge 1}$ are identically distributed and $\{L_{r,i,k} : 0 \le k \le r-1\}$ are i.i.d. copies of the lifetime random variable $L$ for each $i \ge 1$.

From Theorem 1.11 (b), it is known that $EW = 1$. Then,
\[
E\Big( \sum_{i=1}^{Y_r} e^{-\alpha S_{r,i}} W_{r,i} \Big) = EW \cdot m^r \cdot \big( \varphi_L(\alpha) \big)^r = EW = 1 < \infty, \tag{4.3}
\]
where $\varphi_L(\alpha) \equiv \int_0^{\infty} e^{-\alpha u}\, dG(u)$, and $m\varphi_L(\alpha) = 1$ since $\alpha$ is the Malthusian parameter for $m$ and $G$.
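Equation (4.3) rests on the defining relation $m\varphi_L(\alpha) = 1$ of the Malthusian parameter. Since $m\varphi_L(\alpha)$ is strictly decreasing in $\alpha$, the root can be found by bisection; the sketch below uses an Exp(1) lifetime and $m = 1.5$ as purely illustrative assumptions, for which $\varphi_L(\alpha) = 1/(1+\alpha)$ and hence $\alpha = m - 1$ exactly:

```python
# Solve m * phi(alpha) = 1 by bisection; phi is the Laplace transform of the
# lifetime distribution, assumed decreasing in alpha.
def malthusian(m, phi, lo=0.0, hi=100.0):
    for _ in range(200):              # 200 halvings: precision far below 1e-12
        mid = (lo + hi) / 2
        if m * phi(mid) > 1:          # still supercritical growth: alpha larger
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

m = 1.5
alpha = malthusian(m, lambda a: 1.0 / (1.0 + a))   # Exp(1) lifetime
assert abs(alpha - (m - 1)) < 1e-9                 # exact root is m - 1 here
```
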

For any $\eta > 0$, by Markov's inequality,
\[
P\Big( \sum_{i=1}^{Y_r} e^{-\alpha S_{r,i}} W_{r,i} > \eta \Big) \le \frac{1}{\eta} E\Big( \sum_{i=1}^{Y_r} e^{-\alpha S_{r,i}} W_{r,i} \Big) = \frac{1}{\eta}.
\]
For any $\epsilon > 0$,
\[
P\Bigg( \Big( \max_{1\le i\le Y_r} e^{-\alpha S_{r,i}} W_{r,i} \Big)^{k-1} \sum_{i=1}^{Y_r} e^{-\alpha S_{r,i}} W_{r,i} > \epsilon \Bigg)
\le P\Big( \sum_{i=1}^{Y_r} e^{-\alpha S_{r,i}} W_{r,i} > \eta \Big)
+ P\Bigg( \Big( \max_{1\le i\le Y_r} e^{-\alpha S_{r,i}} W_{r,i} \Big)^{k-1} > \frac{\epsilon}{\eta} \Bigg)
\le \frac{1}{\eta} + P\Bigg( \max_{1\le i\le Y_r} e^{-\alpha S_{r,i}} W_{r,i} > \Big( \frac{\epsilon}{\eta} \Big)^{1/(k-1)} \Bigg). \tag{4.4}
\]

So, to prove that
\[
\sum_{i=1}^{Y_r} \big( e^{-\alpha S_{r,i}} W_{r,i} \big)^k \to 0 \quad \text{in probability as } r\to\infty,
\]
it suffices, from (4.2) and (4.4), to prove that
\[
\max_{1\le i\le Y_r} e^{-\alpha S_{r,i}} W_{r,i} \to 0 \quad \text{in probability as } r\to\infty.
\]

Let $\mathcal{F}_r$ be the $\sigma$-algebra generated by all the information up to the $r$th generation in the embedded tree. Then, for any $\epsilon > 0$,
\[
P\Big( \max_{1\le i\le Y_r} e^{-\alpha S_{r,i}} W_{r,i} > \epsilon \,\Big|\, \mathcal{F}_r \Big)
= P\Big( \exists\, i = 1, 2, \cdots, Y_r \text{ s.t. } e^{-\alpha S_{r,i}} W_{r,i} > \epsilon \,\Big|\, \mathcal{F}_r \Big)
\le \sum_{i=1}^{Y_r} P\big( e^{-\alpha S_{r,i}} W_{r,i} > \epsilon \,\big|\, \mathcal{F}_r \big)
= \sum_{i=1}^{Y_r} P\big( W_{r,i} > \epsilon e^{\alpha S_{r,i}} \,\big|\, \mathcal{F}_r \big).
\]

Let $\eta(y) = \sup_{x\ge y} x P(W > x)$. Since $EW < \infty$, $xP(W > x) \to 0$ as $x\to\infty$. So, for any $l > 0$, there exists $a > 0$ such that $yP(W > y) < l\epsilon$ for all $y \ge a$, and hence $\eta(a) \le l\epsilon$. Let $n > \frac{1}{\alpha} \ln \frac{a}{\epsilon}$, so that $\epsilon e^{\alpha n} > a$. Hence,
\[
P\Big( \max_{1\le i\le Y_r} e^{-\alpha S_{r,i}} W_{r,i} > \epsilon \Big)
\le P\Big( \min_{1\le i\le Y_r} S_{r,i} \le n \Big) + P\Big( \max_{1\le i\le Y_r} e^{-\alpha S_{r,i}} W_{r,i} > \epsilon,\ \min_{1\le i\le Y_r} S_{r,i} > n \Big),
\]
and, since $\epsilon e^{\alpha S_{r,i}} > \epsilon e^{\alpha n} > a$ on the event $\{\min_{1\le i\le Y_r} S_{r,i} > n\}$,
\[
P\Big( \max_{1\le i\le Y_r} e^{-\alpha S_{r,i}} W_{r,i} > \epsilon,\ \min_{1\le i\le Y_r} S_{r,i} > n \Big)
\le E\Bigg( \sum_{i=1}^{Y_r} P\big( W_{r,i} > \epsilon e^{\alpha S_{r,i}},\ \min_{1\le i\le Y_r} S_{r,i} > n \,\big|\, \mathcal{F}_r \big) \Bigg)
\le \frac{1}{\epsilon} E\Bigg( \sum_{i=1}^{Y_r} \eta(a)\, e^{-\alpha S_{r,i}} \Bigg)
= \frac{1}{\epsilon} \eta(a)\, E\Bigg( \sum_{i=1}^{Y_r} e^{-\alpha S_{r,i}} \Bigg).
\]
Thus, by (4.3), we have that
\[
P\Big( \max_{1\le i\le Y_r} e^{-\alpha S_{r,i}} W_{r,i} > \epsilon \Big) \le P\Big( \min_{1\le i\le Y_r} S_{r,i} \le n \Big) + \frac{1}{\epsilon}\eta(a) \le P\Big( \min_{1\le i\le Y_r} S_{r,i} \le n \Big) + l. \tag{4.5}
\]

Moreover,
\[
P\Big( \min_{1\le i\le Y_r} S_{r,i} \le n \Big)
= \sum_{x=0}^{\infty} P\Big( \min_{1\le i\le Y_r} S_{r,i} \le n \,\Big|\, Y_r = x \Big) P\big( Y_r = x \big)
\le \sum_{x=0}^{\infty} x\, P\big( S_{r,1} \le n \big) P\big( Y_r = x \big)
= P\big( S_{r,1} \le n \big)\, EY_r
= P\big( e^{-\theta S_{r,1}} \ge e^{-\theta n} \big)\, EY_r,
\]

where $\theta > \alpha$ is chosen such that $m\varphi_L(\theta) < 1$. Then, by Markov's inequality,
\[
P\Big( \min_{1\le i\le Y_r} S_{r,i} \le n \Big) \le \frac{E\big( e^{-\theta S_{r,1}} \big)}{e^{-\theta n}}\, m^r = e^{\theta n} \big( E e^{-\theta L} \big)^r m^r = e^{\theta n} \big( m\varphi_L(\theta) \big)^r \to 0 \tag{4.6}
\]

as $r\to\infty$.

Now (4.5) and (4.6) together imply that $\limsup_{r\to\infty} P\big( \max_{1\le i\le Y_r} e^{-\alpha S_{r,i}} W_{r,i} > \epsilon \big) \le l$ for any $l > 0$. Letting $l \to 0$, we have, for any $\epsilon > 0$,
\[
P\Big( \max_{1\le i\le Y_r} e^{-\alpha S_{r,i}} W_{r,i} > \epsilon \Big) \to 0 \quad \text{as } r\to\infty,
\]
and so $\phi_k$ is a proper probability distribution. Hence there exists a random variable $\tilde{X}_k$ on $\{0, 1, 2, \cdots\}$ such that $X_k(t) \xrightarrow{d} \tilde{X}_k$ as $t\to\infty$ and
\[
P\big( \tilde{X}_k < r \big) = 1 - E\Bigg( \frac{\sum_{i=1}^{Y_r} \big( e^{-\alpha S_{r,i}} W_{r,i} \big)^k}{W^k} \Bigg) \equiv \phi_k(r)
\]

for any r = 0, 1, 2, ··· .

The proof of Theorem 4.1 is complete.

4.2.3 The proof of Theorem 4.2

The number of individuals alive at time $t$ can be expressed as the sum, over all individuals in the $r$th generation, of their descendants alive at time $t$. That is, for $t > 0$,
\[
Z(t) = \sum_{i=1}^{Y_r} Z_{r,i}(t - S_{r,i}),
\]
and then (4.1) can be written as
\[
\frac{\sum_{i=1}^{Y_r} Z_{r,i}(t-S_{r,i})\big( Z_{r,i}(t-S_{r,i}) - 1 \big) \cdots \big( Z_{r,i}(t-S_{r,i}) - k + 1 \big)}{\Big( \sum_{i=1}^{Y_r} Z_{r,i}(t-S_{r,i}) \Big)\Big( \sum_{i=1}^{Y_r} Z_{r,i}(t-S_{r,i}) - 1 \Big) \cdots \Big( \sum_{i=1}^{Y_r} Z_{r,i}(t-S_{r,i}) - k + 1 \Big)}
= \frac{\sum_{i=1}^{Y_r} \prod_{l=1}^{k} e^{-\alpha t}\big( Z_{r,i}(t-S_{r,i}) - l + 1 \big)}{\prod_{l=1}^{k} \Big( \sum_{i=1}^{Y_r} e^{-\alpha t} Z_{r,i}(t-S_{r,i}) - (l-1)e^{-\alpha t} \Big)}.
\]
Since $e^{-\alpha t} Z_{r,i}(t-S_{r,i}) = e^{-\alpha S_{r,i}} \cdot e^{-\alpha(t-S_{r,i})} Z_{r,i}(t-S_{r,i}) \to e^{-\alpha S_{r,i}} W_{r,i}$ w.p.1 as $t\to\infty$, each factor in the numerator and denominator converges.

So, for almost all trees T ,

\[
P\big( X_k(t) \ge r \,\big|\, T \big) \to \frac{\sum_{i=1}^{Y_r} \big( e^{-\alpha S_{r,i}} W_{r,i} \big)^k}{\big( \sum_{i=1}^{Y_r} e^{-\alpha S_{r,i}} W_{r,i} \big)^k} \quad \text{as } t\to\infty,
\]
where $\{W_{r,i}\}_{i\ge 1}$ are i.i.d. copies of $W$ defined in Theorem 1.11 (b).

Then, by the bounded convergence theorem, for any $r = 0, 1, 2, \cdots$,
\[
P\big( X_k(t) \ge r \big) \to E\Bigg( \frac{\sum_{i=1}^{Y_r} \big( e^{-\alpha S_{r,i}} W_{r,i} \big)^k}{\big( \sum_{i=1}^{Y_r} e^{-\alpha S_{r,i}} W_{r,i} \big)^k} \Bigg) \quad \text{as } t\to\infty
\]
and hence, for any $r = 0, 1, 2, \cdots$,
\[
P\big( \tilde{X}_k \ge r \big)
= E\Bigg( \frac{\sum_{i=1}^{Y_r} \big( e^{-\alpha S_{r,i}} W_{r,i} \big)^k}{\big( \sum_{i=1}^{Y_r} e^{-\alpha S_{r,i}} W_{r,i} \big)^k}\, I_{(r \le U-1)} \Bigg)
+ E\Bigg( \frac{\sum_{i=1}^{Y_r} \big( e^{-\alpha S_{r,i}} W_{r,i} \big)^k}{\big( \sum_{i=1}^{Y_r} e^{-\alpha S_{r,i}} W_{r,i} \big)^k}\, I_{(r \ge U)} \Bigg)
= P(r \le U - 1)
+ E\Bigg( \frac{\sum_{i=1}^{Y_r} \big( e^{-\alpha S_{r,i}} W_{r,i} \big)^k}{\big( \sum_{i=1}^{Y_r} e^{-\alpha S_{r,i}} W_{r,i} \big)^k}\, I_{(r \ge U)} \Bigg),
\]
where $U = \min\{n \ge 1 : Y_n \ge 2\}$; on the event $\{r \le U-1\}$ we have $Y_r = 1$, so the ratio equals 1.

Since $p_0 = 0$ and $\sum_{j=1}^{\infty} (j \log j) p_j < \infty$, $P(0 < W_{r,i} < \infty) = 1$ for all $i \ge 1$ and all $r = 0, 1, 2, \cdots$. So, on the set $\{r \ge U\}$,
\[
0 < \frac{e^{-\alpha S_{r,i}} W_{r,i}}{\sum_{i=1}^{Y_r} e^{-\alpha S_{r,i}} W_{r,i}} < 1 \quad \text{w.p.1}
\]
and hence, for any $r = 0, 1, 2, \cdots$, as $k\to\infty$,
\[
\Bigg( \frac{e^{-\alpha S_{r,i}} W_{r,i}}{\sum_{i=1}^{Y_r} e^{-\alpha S_{r,i}} W_{r,i}} \Bigg)^k \to 0 \quad \text{w.p.1}.
\]
Therefore, by the bounded convergence theorem again, we have that
\[
P\big( \tilde{X}_k \ge r \big) \to P\big( U - 1 \ge r \big) \quad \text{as } k\to\infty
\]
for any $r = 0, 1, 2, \cdots$, and the proof is complete.

4.2.4 The proof of Theorem 4.3

We need the following lemmas to prove Theorem 4.3.

Let $\xi_{s,i}$ be the number of offspring of the $i$th individual alive at time $s$.

Lemma 4.1. For any $s \ge 0$, let $\{W_{s,i,j} : j \ge 1, i \ge 1\}$ be i.i.d. copies of $W$ defined in Theorem 1.11 (b), independent of $\{\xi_{s,i}\}_{i\ge 1}$. Let $\tilde{W}_{s,i} = \sum_{j=1}^{\xi_{s,i}} W_{s,i,j}$. Then, under the hypotheses of Theorem 4.3, as $s\to\infty$,
\[
\frac{1}{Z(s)} \max_{1\le i\le Z(s)} \tilde{W}_{s,i} \to 0 \quad \text{in probability}.
\]
Proof. In a continuous-time single-type age-dependent Bellman-Harris branching process, $\{\xi_{s,i}\}_{i\ge 1}$ are i.i.d. copies of the offspring random variable. Also, $\{W_{s,i,j}\}$ are i.i.d. and independent of $\{\xi_{s,i}\}_{i\ge 1}$, so $\{\tilde{W}_{s,i}\}_{i\ge 1}$ are i.i.d. random variables and
\[
E\tilde{W}_{s,i} = E\Big( \sum_{j=1}^{\xi_{s,i}} W_{s,i,j} \Big) = E\xi_{s,i} \cdot E W_{s,i,1} = m \in (1, \infty). \tag{4.7}
\]

Thus, since $E\tilde{W}_{s,i} < \infty$, for any $\epsilon > 0$,
\[
n P\big( \tilde{W}_{s,i} > \epsilon n \big) \to 0 \quad \text{as } n\to\infty \tag{4.8}
\]
and then,
\[
P\Big( \frac{1}{n} \max_{1\le i\le n} \tilde{W}_{s,i} > \epsilon \Big)
= 1 - P\Big( \max_{1\le i\le n} \tilde{W}_{s,i} \le \epsilon n \Big)
= 1 - P\big( \tilde{W}_{s,i} \le \epsilon n \text{ for all } i = 1, 2, \cdots, n \big)
= 1 - \prod_{i=1}^{n} P\big( \tilde{W}_{s,i} \le \epsilon n \big)
= 1 - \Big( P\big( \tilde{W}_{s,1} \le \epsilon n \big) \Big)^n
= 1 - \Bigg( 1 - \frac{n P\big( \tilde{W}_{s,1} > \epsilon n \big)}{n} \Bigg)^n \to 1 - e^{-0} = 0 \quad \text{by (4.8)}
\]
as $n\to\infty$.

Therefore, by the bounded convergence theorem, as $s\to\infty$,
\[
P\Big( \frac{1}{Z(s)} \max_{1\le i\le Z(s)} \tilde{W}_{s,i} > \epsilon \Big)
= E\Bigg( P\Big( \frac{1}{Z(s)} \max_{1\le i\le Z(s)} \tilde{W}_{s,i} > \epsilon \,\Big|\, Z(s) \Big) \Bigg) \to 0,
\]
since $P\big( Z(s) \to \infty \text{ as } s\to\infty \big) = 1$ under the assumption $p_0 = 0$. Then, Lemma 4.1 is proved.



Lemma 4.2. For any k > 0, let Z(s, k) be the number of individuals alive at time s with the residual lifetime less than or equal to k. Then, under the hypotheses of Theorem 4.3, as s → ∞,

\[
\frac{Z(s,k)}{Z(s)} \to B(k) \quad \text{in probability},
\]
where
\[
B(k) = \frac{\int_{[0,\infty)} e^{-\alpha x} \big( G(x+k) - G(x) \big)\, dx}{\int_{[0,\infty)} e^{-\alpha x} \big( 1 - G(x) \big)\, dx}.
\]
Proof. For any fixed $k > 0$, consider a function $g$ such that
\[
g(a) \equiv P\big( R_{s,i} \le k \,\big|\, a_{s,i} = a \big)
= P\big( L_{s,i} - a_{s,i} \le k \,\big|\, L_{s,i} > a \big)
= P\big( a < L_{s,i} \le a + k \,\big|\, L_{s,i} > a \big)
= \frac{G(a+k) - G(a)}{1 - G(a)}.
\]

Let $\mathcal{F}_s$ be the $\sigma$-algebra generated by the entire history of this branching process up to time $s$. Then, for any $\epsilon > 0$,

\[
P\Bigg( \bigg| \frac{Z(s,k)}{Z(s)} - B(k) \bigg| > \epsilon \Bigg)
= E\Bigg( P\Bigg( \bigg| \frac{1}{Z(s)} \sum_{i=1}^{Z(s)} I_{(R_{s,i}\le k)} - B(k) \bigg| > \epsilon \,\Bigg|\, \mathcal{F}_s, (a_{s,1}, a_{s,2}, \cdots, a_{s,Z(s)}) \Bigg) \Bigg)
\]
\[
\le E\Bigg( P\Bigg( \bigg| \frac{1}{Z(s)} \sum_{i=1}^{Z(s)} \big( I_{(R_{s,i}\le k)} - g(a_{s,i}) \big) \bigg| > \frac{\epsilon}{2} \,\Bigg|\, \mathcal{F}_s, (a_{s,1}, \cdots, a_{s,Z(s)}) \Bigg) \Bigg)
+ P\Bigg( \bigg| \frac{1}{Z(s)} \sum_{i=1}^{Z(s)} g(a_{s,i}) - B(k) \bigg| > \frac{\epsilon}{2} \Bigg). \tag{4.9}
\]
Note that, for any $i \ge 1$,
\[
E\big( I_{(R_{s,i}\le k)} - g(a_{s,i}) \,\big|\, \mathcal{F}_s, (a_{s,1}, \cdots, a_{s,Z(s)}) \big) = 0
\]
and
\[
\mathrm{Var}\big( I_{(R_{s,i}\le k)} - g(a_{s,i}) \,\big|\, \mathcal{F}_s, (a_{s,1}, \cdots, a_{s,Z(s)}) \big)
= g(a_{s,i}) - \big( g(a_{s,i}) \big)^2 \le \frac{1}{4}.
\]
Conditioned on $\mathcal{F}_s$ and $(a_{s,1}, \cdots, a_{s,Z(s)})$, the variables $\{ I_{(R_{s,i}\le k)} - g(a_{s,i}) : i = 1, 2, \cdots, Z(s) \}$ are independent. So, by Chebyshev's inequality, for any $\epsilon > 0$,
\[
P\Bigg( \bigg| \frac{1}{Z(s)} \sum_{i=1}^{Z(s)} \big( I_{(R_{s,i}\le k)} - g(a_{s,i}) \big) \bigg| > \frac{\epsilon}{2} \,\Bigg|\, \mathcal{F}_s, (a_{s,1}, \cdots, a_{s,Z(s)}) \Bigg)
\le \frac{4}{\epsilon^2} \frac{1}{Z(s)^2} \sum_{i=1}^{Z(s)} \mathrm{Var}\big( I_{(R_{s,i}\le k)} - g(a_{s,i}) \,\big|\, \mathcal{F}_s, (a_{s,1}, \cdots, a_{s,Z(s)}) \big)
\le \frac{4}{\epsilon^2} \frac{1}{Z(s)^2} \cdot \frac{Z(s)}{4}
= \frac{1}{\epsilon^2 Z(s)} \to 0 \quad \text{w.p.1 as } s\to\infty.
\]
Then, by the bounded convergence theorem,
\[
E\Bigg( P\Bigg( \bigg| \frac{1}{Z(s)} \sum_{i=1}^{Z(s)} \big( I_{(R_{s,i}\le k)} - g(a_{s,i}) \big) \bigg| > \frac{\epsilon}{2} \,\Bigg|\, \mathcal{F}_s, (a_{s,1}, \cdots, a_{s,Z(s)}) \Bigg) \Bigg) \to 0 \quad \text{as } s\to\infty. \tag{4.10}
\]
It remains to prove that

\[
P\Bigg( \bigg| \frac{1}{Z(s)} \sum_{i=1}^{Z(s)} g(a_{s,i}) - B(k) \bigg| > \frac{\epsilon}{2} \Bigg) \to 0 \quad \text{as } s\to\infty.
\]
Let $A(x, s) = \frac{1}{Z(s)} \sum_{i=1}^{Z(s)} I_{(a_{s,i}\le x)}$. Then, by Theorem 1.14, as $s\to\infty$,
\[
\sup_{x} \big| A(x, s) - A(x) \big| \to 0 \quad \text{w.p.1},
\]
where $A$ is as defined in Section 1.4.

Since $g(x) = \frac{G(x+k) - G(x)}{1 - G(x)}$ is a bounded and continuous function, by Theorem 1.14 again, we have, as $s\to\infty$,
\[
\frac{1}{Z(s)} \sum_{i=1}^{Z(s)} g(a_{s,i}) \equiv \int_{[0,\infty)} g(x)\, dA(x, s) \to \int_{[0,\infty)} g(x)\, dA(x) \quad \text{w.p.1},
\]
where
\[
\int_{[0,\infty)} g(x)\, dA(x)
= \frac{\int_0^{\infty} \frac{G(x+k) - G(x)}{1 - G(x)}\, e^{-\alpha x} \big( 1 - G(x) \big)\, dx}{\int_0^{\infty} e^{-\alpha x} \big( 1 - G(x) \big)\, dx}
= \frac{\int_0^{\infty} e^{-\alpha x} \big( G(x+k) - G(x) \big)\, dx}{\int_0^{\infty} e^{-\alpha x} \big( 1 - G(x) \big)\, dx}
= B(k).
\]
So, as $s\to\infty$, $\frac{1}{Z(s)} \sum_{i=1}^{Z(s)} g(a_{s,i}) \to B(k)$ w.p.1 and hence in probability. Therefore, for any $\epsilon > 0$,
\[
P\Bigg( \bigg| \frac{1}{Z(s)} \sum_{i=1}^{Z(s)} g(a_{s,i}) - B(k) \bigg| > \frac{\epsilon}{2} \Bigg) \to 0 \quad \text{as } s\to\infty. \tag{4.11}
\]
From (4.9), (4.10) and (4.11), we have that, for any $\epsilon > 0$,
\[
P\Bigg( \bigg| \frac{Z(s,k)}{Z(s)} - B(k) \bigg| > \epsilon \Bigg) \to 0 \quad \text{as } s\to\infty,
\]
and the proof is complete.



Lemma 4.3. Let $\tilde{W}_{s,i}$ and $Z(s,k)$ be the random variables defined in Lemma 4.1 and Lemma 4.2, respectively. Then, under the hypotheses of Theorem 4.3, there exists a $\theta > 0$ such that, as $s\to\infty$,
\[
P\Bigg( \frac{1}{Z(s,k)} \sum_{i=1}^{Z(s)} \tilde{W}_{s,i}\, I_{(R_{s,i}\le k)} \ge \theta \Bigg) \to 1.
\]
Proof. Let $n_{s,1} = \min\{1 \le j \le Z(s) : R_{s,j} \le k\}$ and $n_{s,i} = \min\{n_{s,i-1} < j \le Z(s) : R_{s,j} \le k\}$ for $i \ge 2$. Then
\[
\frac{1}{Z(s,k)} \sum_{i=1}^{Z(s)} \tilde{W}_{s,i}\, I_{(R_{s,i}\le k)} = \frac{1}{Z(s,k)} \sum_{i=1}^{Z(s,k)} \tilde{W}_{s,n_{s,i}}. \tag{4.12}
\]
It is known from (4.7) that $E\tilde{W}_{s,1} > 0$ and hence there exists an $\eta > 0$ such that $P\big( \tilde{W}_{s,1} \ge \eta \big) > 0$.

Let $\mathcal{F}_s$ be the $\sigma$-algebra generated by all the information of this Bellman-Harris branching process up to time $s$. Then
\[
P\big( \tilde{W}_{s,n_{s,i}} \ge \eta \big)
= E\Bigg( \sum_{j=1}^{Z(s,k)} P\big( \tilde{W}_{s,j} \ge \eta,\ n_{s,i} = j \,\big|\, \mathcal{F}_s \big) \Bigg)
= E\Bigg( \sum_{j=1}^{Z(s,k)} P\big( \tilde{W}_{s,j} \ge \eta \,\big|\, \mathcal{F}_s \big)\, P\big( n_{s,i} = j \,\big|\, \mathcal{F}_s \big) \Bigg)
= P\big( \tilde{W}_{s,1} \ge \eta \big).
\]
Let
\[
X_{s,i} = \begin{cases} 1, & \text{if } \tilde{W}_{s,n_{s,i}} \ge \eta, \\ 0, & \text{if } \tilde{W}_{s,n_{s,i}} < \eta. \end{cases}
\]
Then
\[
\frac{1}{Z(s,k)} \sum_{i=1}^{Z(s,k)} \tilde{W}_{s,n_{s,i}}
\ge \eta\, \frac{1}{Z(s,k)} \sum_{i=1}^{Z(s,k)} X_{s,i}
= \eta \Bigg( \frac{1}{Z(s,k)} \sum_{i=1}^{Z(s,k)} \Big( X_{s,i} - P\big( \tilde{W}_{s,n_{s,i}} \ge \eta \big) \Big) \Bigg) + \eta\, P\big( \tilde{W}_{s,n_{s,i}} \ge \eta \big). \tag{4.13}
\]
Conditioned on $\mathcal{F}_s$, the $\{X_{s,i}\}_{i\ge 1}$ are independent with $E\big( X_{s,i} - P( \tilde{W}_{s,n_{s,i}} \ge \eta \,|\, \mathcal{F}_s ) \,\big|\, \mathcal{F}_s \big) = 0$ and
\[
\mathrm{Var}\big( X_{s,i} - P( \tilde{W}_{s,n_{s,i}} \ge \eta \,|\, \mathcal{F}_s ) \,\big|\, \mathcal{F}_s \big)
= P\big( \tilde{W}_{s,1} \ge \eta \,\big|\, \mathcal{F}_s \big)\Big( 1 - P\big( \tilde{W}_{s,1} \ge \eta \,\big|\, \mathcal{F}_s \big) \Big) \le \frac{1}{4}.
\]
Then, by Chebyshev's inequality and Lemma 4.2, for any $\epsilon > 0$,
\[
P\Bigg( \bigg| \frac{1}{Z(s,k)} \sum_{i=1}^{Z(s,k)} \Big( X_{s,i} - P\big( \tilde{W}_{s,n_{s,i}} \ge \eta \big) \Big) \bigg| > \epsilon \,\Bigg|\, \mathcal{F}_s \Bigg)
\le \frac{1}{\epsilon^2} \frac{1}{Z(s,k)^2} \sum_{i=1}^{Z(s,k)} \mathrm{Var}\big( X_{s,i} - P( \tilde{W}_{s,n_{s,i}} \ge \eta ) \,\big|\, \mathcal{F}_s \big)
\le \frac{1}{4\epsilon^2 Z(s,k)}
= \frac{1}{4\epsilon^2 Z(s)} \cdot \frac{Z(s)}{Z(s,k)} \to 0 \quad \text{in probability as } s\to\infty.
\]
Therefore, by the bounded convergence theorem,
\[
P\Bigg( \bigg| \frac{1}{Z(s,k)} \sum_{i=1}^{Z(s,k)} \Big( X_{s,i} - P\big( \tilde{W}_{s,n_{s,i}} \ge \eta \big) \Big) \bigg| > \epsilon \Bigg) \to 0 \quad \text{as } s\to\infty.
\]
Hence,
\[
P\Bigg( \frac{1}{Z(s,k)} \sum_{i=1}^{Z(s,k)} \Big( X_{s,i} - P\big( \tilde{W}_{s,n_{s,i}} \ge \eta \big) \Big) < -\epsilon \Bigg) \to 0 \quad \text{as } s\to\infty. \tag{4.14}
\]

Let $\theta = \frac{1}{2}\eta\, P\big( \tilde{W}_{s,1} \ge \eta \big)$; then $\theta > 0$. Also, (4.12), (4.13) and (4.14) together imply that
\[
P\Bigg( \frac{1}{Z(s,k)} \sum_{i=1}^{Z(s)} \tilde{W}_{s,i}\, I_{(R_{s,i}\le k)} \ge \theta \Bigg)
\ge P\Bigg( \eta \Bigg( \frac{1}{Z(s,k)} \sum_{i=1}^{Z(s,k)} \Big( X_{s,i} - P\big( \tilde{W}_{s,n_{s,i}} \ge \eta \big) \Big) \Bigg) + \eta\, P\big( \tilde{W}_{s,n_{s,i}} \ge \eta \big) \ge \frac{1}{2}\eta\, P\big( \tilde{W}_{s,1} \ge \eta \big) \Bigg)
\]
\[
= 1 - P\Bigg( \frac{1}{Z(s,k)} \sum_{i=1}^{Z(s,k)} \Big( X_{s,i} - P\big( \tilde{W}_{s,n_{s,i}} \ge \eta \big) \Big) < -\frac{1}{2} P\big( \tilde{W}_{s,1} \ge \eta \big) \Bigg) \to 1 \quad \text{as } s\to\infty.
\]
So, we have that
\[
P\Bigg( \frac{1}{Z(s,k)} \sum_{i=1}^{Z(s)} \tilde{W}_{s,i}\, I_{(R_{s,i}\le k)} \ge \theta \Bigg) \to 1 \quad \text{as } s\to\infty,
\]
and hence Lemma 4.3 is proved.



Now, we are ready to prove Theorem 4.3. Consider a continuous-time single-type Bellman-Harris branching process $\{Z(t) : t \ge 0\}$ with $Z(0) = 1$.

Recall the following notations: For any i = 1, 2, ··· , Z(s),

(1) $\xi_{s,i}$ is the number of offspring of the $i$th individual alive at time $s$;

(2) $L_{s,i}$ is the total lifetime of the $i$th individual alive at time $s$;

(3) $a_{s,i}$ is the corresponding age;

(4) $R_{s,i}$ is the corresponding residual lifetime at time $s$.

Let $\tilde{Z}_{t-s-R_{s,i},j}$ be the branching process initiated by the $j$th offspring of the $i$th individual alive at time $s$.

Pick $k$ individuals randomly from those alive at time $t$ and trace their lines of descent backward in time until they meet. Denote the coalescence time by $D_k(t)$, which is also the death time of the last common ancestor of these randomly chosen individuals.

For almost all trees $T$ and $s \ge 0$,
\[
P\big( D_k(t) \le s \,\big|\, T \big) = 1 - P\big( D_k(t) > s \,\big|\, T \big)
\]
\[
= 1 - \frac{\sum_{i=1}^{Z(s)} \Big( \sum_{j=1}^{\xi_{s,i}} \tilde{Z}_{t-s-R_{s,i},j} \Big) \Big( \sum_{j=1}^{\xi_{s,i}} \tilde{Z}_{t-s-R_{s,i},j} - 1 \Big) \cdots \Big( \sum_{j=1}^{\xi_{s,i}} \tilde{Z}_{t-s-R_{s,i},j} - k + 1 \Big)}{\Big( \sum_{i=1}^{Z(s)} \sum_{j=1}^{\xi_{s,i}} \tilde{Z}_{t-s-R_{s,i},j} \Big) \Big( \sum_{i=1}^{Z(s)} \sum_{j=1}^{\xi_{s,i}} \tilde{Z}_{t-s-R_{s,i},j} - 1 \Big) \cdots \Big( \sum_{i=1}^{Z(s)} \sum_{j=1}^{\xi_{s,i}} \tilde{Z}_{t-s-R_{s,i},j} - k + 1 \Big)}
\]
\[
= 1 - \frac{\sum_{i=1}^{Z(s)} \prod_{l=1}^{k} e^{-\alpha R_{s,i}} \Big( \sum_{j=1}^{\xi_{s,i}} \tilde{Z}_{t-s-R_{s,i},j}\, e^{-\alpha(t-s-R_{s,i})} - (l-1)\, e^{-\alpha(t-s-R_{s,i})} \Big)}{\prod_{l=1}^{k} \Bigg( \sum_{i=1}^{Z(s)} e^{-\alpha R_{s,i}} \Big( \sum_{j=1}^{\xi_{s,i}} \tilde{Z}_{t-s-R_{s,i},j}\, e^{-\alpha(t-s-R_{s,i})} - (l-1)\, e^{-\alpha(t-s-R_{s,i})} \Big) \Bigg)}
\]
and then, by Theorem 1.11,
\[
P\big( D_k(t) \le s \,\big|\, T \big) \to 1 - \frac{\sum_{i=1}^{Z(s)} \Big( e^{-\alpha R_{s,i}} \sum_{j=1}^{\xi_{s,i}} W_{s,i,j} \Big)^k}{\Big( \sum_{i=1}^{Z(s)} e^{-\alpha R_{s,i}} \sum_{j=1}^{\xi_{s,i}} W_{s,i,j} \Big)^k}
= 1 - \frac{\sum_{i=1}^{Z(s)} \big( e^{-\alpha R_{s,i}} \tilde{W}_{s,i} \big)^k}{\Big( \sum_{i=1}^{Z(s)} e^{-\alpha R_{s,i}} \tilde{W}_{s,i} \Big)^k} \equiv H_k(s, T) \quad \text{as } t\to\infty,
\]
where $\{W_{s,i,j}\}_{j\ge 1}$ are i.i.d. copies of $W$ in Theorem 1.11 and $\tilde{W}_{s,i} \equiv \sum_{j=1}^{\xi_{s,i}} W_{s,i,j}$ for $i \ge 1$ and $s \ge 0$.

So, by the bounded convergence theorem, as $t\to\infty$,
\[
P\big( D_k(t) \le s \big) = E\Big( P\big( D_k(t) \le s \,\big|\, T \big) \Big) \to 1 - E\Bigg( \frac{\sum_{i=1}^{Z(s)} \big( e^{-\alpha R_{s,i}} \tilde{W}_{s,i} \big)^k}{\big( \sum_{i=1}^{Z(s)} e^{-\alpha R_{s,i}} \tilde{W}_{s,i} \big)^k} \Bigg) \equiv H_k(s).
\]

Next, we need to show that $H_k$ is a proper probability distribution, i.e., that $H_k(s) \to 1$ as $s\to\infty$, which is the same as showing that
\[
E\Bigg( \frac{\sum_{i=1}^{Z(s)} \big( e^{-\alpha R_{s,i}} \tilde{W}_{s,i} \big)^k}{\big( \sum_{i=1}^{Z(s)} e^{-\alpha R_{s,i}} \tilde{W}_{s,i} \big)^k} \Bigg) \to 0 \quad \text{as } s\to\infty.
\]
It suffices to prove that, as $s\to\infty$,
\[
\frac{\sum_{i=1}^{Z(s)} \big( e^{-\alpha R_{s,i}} \tilde{W}_{s,i} \big)^k}{\big( \sum_{i=1}^{Z(s)} e^{-\alpha R_{s,i}} \tilde{W}_{s,i} \big)^k} \to 0 \quad \text{in probability}.
\]
Moreover, since
\[
\Bigg( \frac{\max_{1\le i\le Z(s)} e^{-\alpha R_{s,i}} \tilde{W}_{s,i}}{\sum_{i=1}^{Z(s)} e^{-\alpha R_{s,i}} \tilde{W}_{s,i}} \Bigg)^k
\le \frac{\sum_{i=1}^{Z(s)} \big( e^{-\alpha R_{s,i}} \tilde{W}_{s,i} \big)^k}{\big( \sum_{i=1}^{Z(s)} e^{-\alpha R_{s,i}} \tilde{W}_{s,i} \big)^k}
\le \Bigg( \frac{\max_{1\le i\le Z(s)} e^{-\alpha R_{s,i}} \tilde{W}_{s,i}}{\sum_{i=1}^{Z(s)} e^{-\alpha R_{s,i}} \tilde{W}_{s,i}} \Bigg)^{k-1},
\]
it is enough to show that, as $s\to\infty$,
\[
\frac{\max_{1\le i\le Z(s)} e^{-\alpha R_{s,i}} \tilde{W}_{s,i}}{\sum_{i=1}^{Z(s)} e^{-\alpha R_{s,i}} \tilde{W}_{s,i}} \to 0 \quad \text{in probability}.
\]
For any fixed $k > 0$,
\[
\sum_{i=1}^{Z(s)} e^{-\alpha R_{s,i}} \tilde{W}_{s,i} \ge e^{-\alpha k} \sum_{i=1}^{Z(s)} \tilde{W}_{s,i}\, I_{(R_{s,i}\le k)}
\]
and then
\[
\frac{\max_{1\le i\le Z(s)} e^{-\alpha R_{s,i}} \tilde{W}_{s,i}}{\sum_{i=1}^{Z(s)} e^{-\alpha R_{s,i}} \tilde{W}_{s,i}}
\le \frac{\max_{1\le i\le Z(s)} \tilde{W}_{s,i}}{e^{-\alpha k} \sum_{i=1}^{Z(s)} \tilde{W}_{s,i}\, I_{(R_{s,i}\le k)}}
= \frac{e^{\alpha k}\, \frac{1}{Z(s)} \max_{1\le i\le Z(s)} \tilde{W}_{s,i}}{\frac{Z(s,k)}{Z(s)} \cdot \frac{1}{Z(s,k)} \sum_{i=1}^{Z(s)} \tilde{W}_{s,i}\, I_{(R_{s,i}\le k)}}, \tag{4.15}
\]
where $Z(s,k)$ is the number of individuals alive at time $s$ with residual lifetime less than or equal to $k$.

From Lemma 4.2, we know that, as $s\to\infty$,
\[
\frac{Z(s,k)}{Z(s)} \to B(k) \quad \text{in probability}
\]
and hence
\[
P\Bigg( \frac{Z(s,k)}{Z(s)} < \frac{1}{2} B(k) \Bigg) \to 0 \quad \text{as } s\to\infty.
\]

Also, from Lemma 4.3, we have that, for some $\theta > 0$, as $s\to\infty$,
\[
P\Bigg( \frac{1}{Z(s,k)} \sum_{i=1}^{Z(s)} \tilde{W}_{s,i}\, I_{(R_{s,i}\le k)} \ge \theta \Bigg) \to 1.
\]
So, for any $\delta > 0$, there exists an $M > 0$ such that for every $s > M$,
\[
P\Bigg( \frac{Z(s,k)}{Z(s)} < \frac{1}{2} B(k) \Bigg) < \frac{\delta}{2}
\quad\text{and}\quad
P\Bigg( \frac{1}{Z(s,k)} \sum_{i=1}^{Z(s)} \tilde{W}_{s,i}\, I_{(R_{s,i}\le k)} < \theta \Bigg) < \frac{\delta}{2}.
\]
Let $A = \Big\{ \frac{Z(s,k)}{Z(s)} \ge \frac{1}{2} B(k) \Big\}$ and $B = \Big\{ \frac{1}{Z(s,k)} \sum_{i=1}^{Z(s)} \tilde{W}_{s,i}\, I_{(R_{s,i}\le k)} \ge \theta \Big\}$. Then, for any $\epsilon > 0$,
\[
P\Bigg( \frac{e^{\alpha k}\, \frac{1}{Z(s)} \max_{1\le i\le Z(s)} \tilde{W}_{s,i}}{\frac{Z(s,k)}{Z(s)} \cdot \frac{1}{Z(s,k)} \sum_{i=1}^{Z(s)} \tilde{W}_{s,i}\, I_{(R_{s,i}\le k)}} > \epsilon \Bigg)
= P\Bigg( \frac{1}{Z(s)} \max_{1\le i\le Z(s)} \tilde{W}_{s,i} > \epsilon e^{-\alpha k}\, \frac{Z(s,k)}{Z(s)} \cdot \frac{1}{Z(s,k)} \sum_{i=1}^{Z(s)} \tilde{W}_{s,i}\, I_{(R_{s,i}\le k)} \Bigg)
\]
\[
\le P\Bigg( \frac{1}{Z(s)} \max_{1\le i\le Z(s)} \tilde{W}_{s,i} > \frac{1}{2} \epsilon e^{-\alpha k} B(k)\theta;\ A \cap B \Bigg) + P(A^C) + P(B^C)
\le P\Bigg( \frac{1}{Z(s)} \max_{1\le i\le Z(s)} \tilde{W}_{s,i} > \frac{1}{2} \epsilon \theta e^{-\alpha k} B(k) \Bigg) + \delta
\]
for every $s > M$. Thus, for any $\delta > 0$, by Lemma 4.1,
\[
\limsup_{s\to\infty} P\Bigg( \frac{e^{\alpha k}\, \frac{1}{Z(s)} \max_{1\le i\le Z(s)} \tilde{W}_{s,i}}{\frac{Z(s,k)}{Z(s)} \cdot \frac{1}{Z(s,k)} \sum_{i=1}^{Z(s)} \tilde{W}_{s,i}\, I_{(R_{s,i}\le k)}} > \epsilon \Bigg)
\le \limsup_{s\to\infty} P\Bigg( \frac{1}{Z(s)} \max_{1\le i\le Z(s)} \tilde{W}_{s,i} > \frac{1}{2} \epsilon \theta e^{-\alpha k} B(k) \Bigg) + \delta = \delta,
\]
i.e., the left-hand side tends to 0 for any $\epsilon > 0$, and hence, from (4.15), we have that
\[
\lim_{s\to\infty} P\Bigg( \frac{\max_{1\le i\le Z(s)} e^{-\alpha R_{s,i}} \tilde{W}_{s,i}}{\sum_{i=1}^{Z(s)} e^{-\alpha R_{s,i}} \tilde{W}_{s,i}} > \epsilon \Bigg) = 0 \quad \text{for any } \epsilon > 0,
\]
i.e., as $s\to\infty$,
\[
\frac{\max_{1\le i\le Z(s)} e^{-\alpha R_{s,i}} \tilde{W}_{s,i}}{\sum_{i=1}^{Z(s)} e^{-\alpha R_{s,i}} \tilde{W}_{s,i}} \to 0 \quad \text{in probability}.
\]
By the bounded convergence theorem,
\[
E\Bigg( \frac{\sum_{i=1}^{Z(s)} \big( e^{-\alpha R_{s,i}} \tilde{W}_{s,i} \big)^k}{\big( \sum_{i=1}^{Z(s)} e^{-\alpha R_{s,i}} \tilde{W}_{s,i} \big)^k} \Bigg) \to 0 \quad \text{as } s\to\infty,
\]
and thus $H_k$ is a proper probability distribution on $\mathbb{R}_+ \equiv \{x \in \mathbb{R} : x \ge 0\}$. So, there exists a random variable $\tilde{D}_k$ on $\mathbb{R}_+$ such that $D_k(t) \xrightarrow{d} \tilde{D}_k$ as $t\to\infty$ and
\[
P\big( \tilde{D}_k \le s \big) = 1 - E\Bigg( \frac{\sum_{i=1}^{Z(s)} \big( e^{-\alpha R_{s,i}} \tilde{W}_{s,i} \big)^k}{\big( \sum_{i=1}^{Z(s)} e^{-\alpha R_{s,i}} \tilde{W}_{s,i} \big)^k} \Bigg) \equiv H_k(s)
\]
for any $s \ge 0$. The proof of Theorem 4.3 is complete.

4.2.5 The proof of Theorem 4.4

Recall that $\{Y_n\}_{n\ge 0}$ is the embedded generation process of this Bellman-Harris process and $U = \min\{n\ge 1: Y_n \ge 2\}$, which is the first generation with more than one individual. Let $L_{s,i,j}$ be the lifetime of the ancestor in the $j$th generation of the $i$th individual alive at time $s$. Then $\{L_{s,i,j}\}_{j\ge 1}$ are i.i.d. random variables with the lifetime distribution $G$ for any $i = 1, 2, \cdots, Z(s)$ and $s > 0$.

From Theorem 4.3, for any s > 0, we have that

$$P\left(\widetilde D_k > s\right) = E\left(\frac{\sum_{i=1}^{Z(s)}\left(e^{-\alpha R_{s,i}}\widetilde W_{s,i}\right)^k}{\left(\sum_{i=1}^{Z(s)} e^{-\alpha R_{s,i}}\widetilde W_{s,i}\right)^k}\right)$$
$$= E\left(\sum_{i=1}^{Z(s)}\left(\frac{e^{-\alpha R_{s,i}}\widetilde W_{s,i}}{\sum_{j=1}^{Z(s)} e^{-\alpha R_{s,j}}\widetilde W_{s,j}}\right)^k I_{(L_{s,i,0}+L_{s,i,1}+\cdots+L_{s,i,U-1} > s)}\right) + E\left(\sum_{i=1}^{Z(s)}\left(\frac{e^{-\alpha R_{s,i}}\widetilde W_{s,i}}{\sum_{j=1}^{Z(s)} e^{-\alpha R_{s,j}}\widetilde W_{s,j}}\right)^k I_{(L_{s,i,0}+L_{s,i,1}+\cdots+L_{s,i,U-1} \le s)}\right)$$
$$= P\left(L_0 + L_1 + \cdots + L_{U-1} > s\right) + E\left(\sum_{i=1}^{Z(s)}\left(\frac{e^{-\alpha R_{s,i}}\widetilde W_{s,i}}{\sum_{j=1}^{Z(s)} e^{-\alpha R_{s,j}}\widetilde W_{s,j}}\right)^k I_{(L_{s,i,0}+L_{s,i,1}+\cdots+L_{s,i,U-1} \le s)}\right)$$
where $\{L_i\}_{i\ge 0}$ are i.i.d. random variables with the lifetime distribution.

Since $p_0 = 0$, $1 < m < \infty$, $\sum_{j=1}^{\infty}(j\log j)p_j < \infty$ and $\widetilde W_{s,i} = \sum_{j=1}^{\xi_{s,i}} W_{s,i,j}$, we have that $P\left(0 < \widetilde W_{s,i} < \infty\right) = 1$ for all $i \ge 1$ and $s > 0$. So, on the set $\{L_{s,i,0}+L_{s,i,1}+\cdots+L_{s,i,U-1} \le s\}$,
$$0 < \frac{e^{-\alpha R_{s,i}}\widetilde W_{s,i}}{\sum_{i=1}^{Z(s)} e^{-\alpha R_{s,i}}\widetilde W_{s,i}} < 1 \quad \text{w.p.1,}$$
and hence, on the set $\{L_{s,i,0}+L_{s,i,1}+\cdots+L_{s,i,U-1} \le s\}$, for any $s > 0$, as $k\to\infty$,

$$\left(\frac{e^{-\alpha R_{s,i}}\widetilde W_{s,i}}{\sum_{i=1}^{Z(s)} e^{-\alpha R_{s,i}}\widetilde W_{s,i}}\right)^k \to 0 \quad \text{w.p.1.}$$

Therefore, by the bounded convergence theorem again, we have that

$$P\left(\widetilde D_k > s\right) \to P\left(L_0 + L_1 + \cdots + L_{U-1} > s\right) \quad \text{as } k\to\infty \text{ for any } s > 0.$$
The proof is complete.
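The limit law above lends itself to simulation. The sketch below is illustrative only: it assumes exponential lifetimes and an offspring law supported on $\{1, 2\}$ (so $p_0 = 0$ as the theorem requires), under which $U$ is geometric and $L_0 + \cdots + L_{U-1}$ can be sampled directly; none of these distributional choices come from the text.

```python
import random

def sample_U(p_two=0.5):
    """First generation with at least 2 individuals when each individual has
    1 child (prob 1 - p_two) or 2 children (prob p_two): while the line stays
    single, each generation is an independent trial, so U is geometric."""
    u = 1
    while random.random() > p_two:
        u += 1
    return u

def sample_limit_time(p_two=0.5, mean_life=1.0):
    """One draw of L_0 + ... + L_{U-1}, the k -> infinity limit law of the
    coalescence time (exponential lifetimes are an assumption here)."""
    return sum(random.expovariate(1.0 / mean_life) for _ in range(sample_U(p_two)))

random.seed(1)
n = 20000
mean_D = sum(sample_limit_time() for _ in range(n)) / n
print(round(mean_D, 2))  # E[U] * E[L] = 2 * 1, so this should be near 2
```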

4.3 Results in The Subcritical Case

4.3.1 The statements of Results

The first result we establish for the subcritical case is the convergence of the age chart of the population.

Let $a_{t,i}$ be the age of the $i$th individual alive at time $t$, $i = 1, 2, \cdots, Z(t)$. Recall that $f(s) = \sum_{j=0}^{\infty} p_j s^j$, $0 \le s \le 1$, is the probability generating function of the offspring distribution $\{p_j\}_{j\ge 0}$ of the process $\{Z(t)\}$.

For any continuous and bounded function $h: \mathbb{R}_+ \to \mathbb{R}_+$, let
$$H_h(s,t) = E\left(e^{-s\sum_{i=1}^{Z(t)} h(a_{t,i})}\right).$$

Theorem 4.5. Let $0 < m < 1$ and $\sum_{j=1}^{\infty}(j\log j)p_j < \infty$. Assume that the lifetime distribution $G$ is non-lattice, $G(0+) = 0$, and such that the Malthusian parameter $\alpha$ exists and $\int_0^{\infty} te^{-\alpha t}\,dG(t) < \infty$. If
$$\sup_n\,\sup_{n\le t<n+1}\frac{1 - f\left(H_h(s,t-u)\right) - m\left(1 - H_h(s,t-u)\right)}{1 - f\left(H_h(s,n-u)\right) - m\left(1 - H_h(s,n-u)\right)} < \infty$$
for all $s, u \ge 0$, then, conditioned on the set $\{Z(t) > 0\}$, the point process

$$A(t) \equiv \{a_{t,i} : 1 \le i \le Z(t)\}$$
converges in distribution, as $t\to\infty$, to the point process

$$\widetilde A \equiv \{\tilde a_i : 1 \le i \le Y\}$$

where $Y$ is the random variable with the distribution $\{b_j\}_{j\ge 0}$ as defined in Theorem 1.13.

Theorem 4.6. Let $0 < m < 1$ and $\sum_{j=1}^{\infty}(j\log j)p_j < \infty$. Assume that the lifetime distribution $G$ is non-lattice, $G(0+) = 0$, and such that the Malthusian parameter $\alpha$ exists and $\int_0^{\infty} te^{-\alpha t}\,dG(t) < \infty$. If
$$\sup_n\,\sup_{n\le t<n+1}\frac{1 - f\left(H_h(s,t-u)\right) - m\left(1 - H_h(s,t-u)\right)}{1 - f\left(H_h(s,n-u)\right) - m\left(1 - H_h(s,n-u)\right)} < \infty$$
for all $s, u \ge 0$, then

$$t - D_2(t) \xrightarrow{d} \widetilde D_2 \quad \text{as } t\to\infty,$$
and, for any $u \ge 0$,

$$P\left(\widetilde D_2 \le u\right) = 1 - \frac{1}{e^{\alpha u}P(Y\ge 2)}\,E\left(\phi\left(\widetilde A, u\right)\right) \equiv H_2(u)$$
where $Y$ is the random variable with the distribution $\{b_j\}_{j\ge 0}$ as defined in Theorem 1.13,

$$\phi\left((a_1, a_2, \cdots, a_k), u\right) = E\left(\frac{\sum_{\substack{i,j=1\\ i\ne j}}^{k} \widetilde Z_i(a_i+u)\,\widetilde Z_j(a_j+u)}{\left(\sum_{i=1}^{k}\widetilde Z_i(a_i+u)\right)\left(\sum_{i=1}^{k}\widetilde Z_i(a_i+u) - 1\right)}\, I_{\left(\sum_{i=1}^{k}\widetilde Z_i(a_i+u)\ge 2\right)}\right)$$
for any positive integer $k$ and any positive real numbers $a_1, a_2, \cdots, a_k$, where $\{\widetilde Z_i(t)\}_{i\ge 1}$ are i.i.d. copies of $Z(t)$.

Following the same lines as the proof of Theorem 4.6, we can extend the result to any integer $k \ge 2$.

Corollary 4.1. Let $0 < m < 1$ and $\sum_{j=1}^{\infty}(j\log j)p_j < \infty$. Then, under the same hypotheses as in Theorem 4.6, for any $k \ge 2$, there exists $\widetilde D_k$ on the set of non-negative real numbers such that $t - D_k(t) \xrightarrow{d} \widetilde D_k$ as $t\to\infty$.

4.3.2 The proof of Theorem 4.5

Let $\{Z(t): t\ge 0\}$ be the continuous-time single-type age-dependent Bellman-Harris branching process with $Z(0) = 1$.

Let at,i be the age of the ith individual alive at time t, i = 1, 2, ··· , Z(t).

Let $h: \mathbb{R}_+ \to \mathbb{R}_+$ be any bounded and continuous function.

For any $s \ge 0$, we consider the Laplace functional of $(a_{t,1}, a_{t,2}, \cdots, a_{t,Z(t)})$ conditioned on the set $\{Z(t) > 0\}$; then

$$E\left(e^{-s\sum_{i=1}^{Z(t)}h(a_{t,i})}\,\Big|\, Z(t)>0\right) = \frac{1}{P(Z(t)>0)}\, E\left(e^{-s\sum_{i=1}^{Z(t)}h(a_{t,i})}\, I_{(Z(t)>0)}\right)$$
$$= \frac{1}{P(Z(t)>0)}\left[E\left(e^{-s\sum_{i=1}^{Z(t)}h(a_{t,i})}\right) - E\left(e^{-s\sum_{i=1}^{Z(t)}h(a_{t,i})}\, I_{(Z(t)=0)}\right)\right]$$
$$= \frac{1}{P(Z(t)>0)}\left[E\left(e^{-s\sum_{i=1}^{Z(t)}h(a_{t,i})}\right) - P\left(Z(t)=0\right)\right]$$
$$= \frac{1}{P(Z(t)>0)}\left[E\left(e^{-s\sum_{i=1}^{Z(t)}h(a_{t,i})}\right) - 1 + P\left(Z(t)>0\right)\right]$$
$$= \frac{1}{P(Z(t)>0)}\left[E\left(e^{-s\sum_{i=1}^{Z(t)}h(a_{t,i})}\right) - 1\right] + 1. \tag{4.16}$$

Let $H(s,t) = E\left(e^{-s\sum_{i=1}^{Z(t)}h(a_{t,i})}\right)$. Recall that $f(s) = \sum_{j=0}^{\infty}p_j s^j$ is the probability generating function of the offspring distribution $\{p_j\}_{j\ge 0}$. Let $\xi$ be the number of offspring of an individual in the process. Note that $\xi \sim \{p_j\}_{j\ge 0}$.

Let $L_0$ be the total lifetime of the first ancestor in this process, so $L_0 \sim G$. Let $\{Z_j(t): t\ge 0\}$ be the Bellman-Harris branching process initiated by the $j$th individual in the first generation. Since $Z(t) = \sum_{j=1}^{\xi} Z_j(t - L_0)$ on $\{L_0 \le t\}$, we have

$$E\left(e^{-s\sum_{i=1}^{Z(t)}h(a_{t,i})}\,\Big|\, L_0\le t,\ \xi\right) = E\left(e^{-s\sum_{j=1}^{\xi}\sum_{i=1}^{Z_j(t-L_0)}h(a_{t,j,i})}\,\Big|\, L_0\le t,\ \xi\right) = \left(H(s, t-L_0)\right)^{\xi}$$
where $a_{t,j,i}$ is the age (at time $t$) of the $i$th individual from the tree initiated by the $j$th individual in the first generation.

Hence, on $\{L_0 \le t\}$,
$$E\left(e^{-s\sum_{i=1}^{Z(t)}h(a_{t,i})}\,\Big|\, L_0\right) = E\left(H(s,t-L_0)^{\xi}\,\Big|\, L_0\right) = f\left(H(s,t-L_0)\right).$$

Therefore, we have that

$$H(s,t) = E\left(e^{-s\sum_{i=1}^{Z(t)}h(a_{t,i})}\right) = E\left(e^{-s\sum_{i=1}^{Z(t)}h(a_{t,i})};\ L_0 > t\right) + E\left(e^{-s\sum_{i=1}^{Z(t)}h(a_{t,i})};\ L_0 \le t\right)$$
$$= e^{-sh(t)}P\left(L_0 > t\right) + \int_{[0,t]} f\left(H(s,t-u)\right)dG(u)$$
$$= e^{-sh(t)}\left(1 - G(t)\right) + \int_{[0,t]} f\left(H(s,t-u)\right)dG(u), \tag{4.17}$$
and it implies that, for any $s \ge 0$, $H(s,t)$ satisfies the integral equation
$$H(s,t) = e^{-sh(t)}\left(1-G(t)\right) + \int_{[0,t]} f\left(H(s,t-u)\right)dG(u), \qquad H(s,0) = e^{-sh(0)}. \tag{$*$}$$
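The integral equation $(*)$ can be solved numerically by marching forward on a time grid, since $H(s,t)$ depends only on the values $H(s,t-u)$ with $u > 0$. The sketch below is purely illustrative and uses simplifying assumptions not taken from the text: constant $h \equiv 1$, exponential $G$, and a linear pgf $f$.

```python
import math

def solve_H(s, T, dt=0.01, lam=1.0, m=0.5):
    """March the renewal-type equation (*) on a grid:
    H(s,t) = e^{-s h(t)} (1 - G(t)) + int_0^t f(H(s,t-u)) dG(u),
    under illustrative assumptions: h(t) = 1, G(t) = 1 - e^{-lam t}, and the
    linear pgf f(x) = (1 - m) + m x (offspring 0 or 1, mean m < 1)."""
    f = lambda x: (1 - m) + m * x
    n = int(T / dt)
    H = [math.exp(-s)] + [0.0] * n          # H(s,0) = e^{-s h(0)}
    for j in range(1, n + 1):
        acc = 0.0
        for i in range(j):                  # midpoint rule, dG(u) = lam e^{-lam u} du
            u = (i + 0.5) * dt
            acc += f(H[j - 1 - i]) * lam * math.exp(-lam * u) * dt
        H[j] = math.exp(-s) * math.exp(-lam * j * dt) + acc
    return H[n]

# Sanity checks: H(0,t) = 1 identically, and H(s,t) decreases in s.
print(round(solve_H(0.0, 5.0), 3), solve_H(1.0, 5.0) < solve_H(0.0, 5.0))
```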

Moreover,

$$H(\infty, t) \equiv \lim_{s\to\infty} H(s,t) = P\left(Z(t) = 0\right).$$

Then, by (4.16) and (4.17),

$$E\left(e^{-s\sum_{i=1}^{Z(t)}h(a_{t,i})}\,\Big|\, Z(t)>0\right) = 1 - \frac{1}{P(Z(t)>0)}\left(1 - H(s,t)\right)$$
$$= 1 - \frac{1}{P(Z(t)>0)}\left[1 - e^{-sh(t)}\left(1-G(t)\right) - \int_{[0,t]} f\left(H(s,t-u)\right)dG(u)\right]$$
$$= 1 - \frac{1}{P(Z(t)>0)}\left[\left(1 - e^{-sh(t)}\right)\left(1-G(t)\right) + \int_{[0,t]}\left[1 - f\left(H(s,t-u)\right)\right]dG(u)\right].$$

For any fixed s ≥ 0, let

$$H(t) = 1 - H(s,t) \tag{4.18}$$

$$\xi_1(t) = \left(1 - e^{-sh(t)}\right)\left(1 - G(t)\right) \tag{4.19}$$
$$\xi_2(t) = \int_0^t\left[1 - f\left(H(s,t-u)\right) - mH(t-u)\right]dG(u) \tag{4.20}$$

$$\xi_3(t) = \xi_1(t) + \xi_2(t) \tag{4.21}$$
and then
$$H(t) = \xi_3(t) + m\int_{[0,t]} H(t-u)\,dG(u). \tag{4.22}$$

To complete the proof of Theorem 4.5, we need the following definition and lemmas.

Definition 4.1. A function $\xi$ is directly Riemann integrable if

(a) $\sum_{n=0}^{\infty} \delta\left(\sup_{n\delta\le t<(n+1)\delta}\xi(t)\right)$ and $\sum_{n=0}^{\infty} \delta\left(\inf_{n\delta\le t<(n+1)\delta}\xi(t)\right)$ converge absolutely for sufficiently small $\delta > 0$; and

(b) $\sum_{n=0}^{\infty} \delta\left(\sup_{n\delta\le t<(n+1)\delta}\xi(t)\right) - \sum_{n=0}^{\infty} \delta\left(\inf_{n\delta\le t<(n+1)\delta}\xi(t)\right) \to 0$ as $\delta\to 0$.

Remark 4.1. Some sufficient conditions for direct Riemann integrability of $\xi$ are

(1) $\xi \ge 0$, bounded, continuous and $\sum_{n=0}^{\infty}\sup_{n\le t<n+1}\xi(t) < \infty$;

(2) $\xi \ge 0$ is non-increasing and Riemann integrable;

(3) ξ is bounded by a directly Riemann integrable function;

(4) ξ is constant on the intervals (n, n + 1) and absolutely integrable.
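Conditions (a) and (b) of Definition 4.1 can be checked numerically for a concrete function. The sketch below does this for $\xi(t) = e^{-t}$, which falls under condition (2); for a non-increasing $\xi$ the gap between the two sums telescopes to $\delta\left(\xi(0)-\xi(T)\right)$, so it vanishes linearly in $\delta$.

```python
import math

def direct_riemann_sums(xi, delta, T=60.0):
    """Upper and lower sums from Definition 4.1 for a non-increasing xi,
    truncated at T (the tail of e^{-t} beyond T is negligible)."""
    n = int(T / delta)
    upper = sum(delta * xi(k * delta) for k in range(n))        # sup on [kd,(k+1)d)
    lower = sum(delta * xi((k + 1) * delta) for k in range(n))  # inf on [kd,(k+1)d)
    return upper, lower

xi = lambda t: math.exp(-t)  # non-increasing, Riemann integrable: condition (2)
for delta in (0.5, 0.1, 0.02):
    up, lo = direct_riemann_sums(xi, delta)
    print(round(up, 4), round(lo, 4), round(up - lo, 4))  # gap shrinks like delta
```

Both sums bracket $\int_0^{\infty} e^{-t}\,dt = 1$ and squeeze together as $\delta \downarrow 0$, which is exactly what (a) and (b) require.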

Lemma 4.4 is a well-known result in renewal theory. See Feller [17].

Lemma 4.4. Let $G$ be a probability distribution function and $G^{*n}$ denote its $n$-fold convolution. Let $U = \sum_{n=0}^{\infty} G^{*n}$. If $\xi$ is directly Riemann integrable and $G$ is non-lattice, then
$$\lim_{t\to\infty}\left(\xi * U\right)(t) = \frac{\int_0^{\infty}\xi(u)\,du}{\int_0^{\infty}u\,dG(u)}.$$
Lemma 4.5. If the Malthusian parameter $\alpha$ of $m$ and $G$ exists, if $e^{-\alpha t}\xi(t)$ is directly Riemann integrable, and if $G$ is non-lattice, then the solution $H$ of the integral equation

$$H(t) = \xi(t) + m\int_0^t H(t-u)\,dG(u), \qquad t \ge 0,$$
satisfies
$$H(t) \sim e^{\alpha t}\,\frac{\int_0^{\infty} e^{-\alpha u}\xi(u)\,du}{m\int_0^{\infty} ue^{-\alpha u}\,dG(u)} \quad \text{as } t\to\infty.$$
The proof of Lemma 4.5 can be found in Athreya and Ney [5].
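Lemma 4.4 can be verified numerically in the one case where the renewal measure is available in closed form: for an exponential $G$ with rate $\lambda$, $U(du) = \delta_0(du) + \lambda\,du$, so $(\xi*U)(t) = \xi(t) + \lambda\int_0^t\xi(u)\,du$. A sketch; the choice $\xi(t) = e^{-t}$ is illustrative, not from the text:

```python
import math

lam = 2.0                      # G(t) = 1 - e^{-lam t}; mean lifetime 1/lam
xi = lambda t: math.exp(-t)    # a directly Riemann integrable choice

def xi_star_U(t, steps=10000):
    """(xi * U)(t) for exponential G, using U(du) = delta_0(du) + lam du,
    i.e. xi(t) + lam * int_0^t xi(u) du (midpoint rule)."""
    dt = t / steps
    integral = sum(xi((i + 0.5) * dt) for i in range(steps)) * dt
    return xi(t) + lam * integral

limit = 1.0 / (1.0 / lam)      # int_0^inf xi(u) du / int_0^inf u dG(u)
print(round(xi_star_U(30.0), 4), limit)
```

For large $t$ the convolution indeed settles at $\int_0^\infty \xi / \int_0^\infty u\,dG = 1/(1/2) = 2$, matching the lemma.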

Lemma 4.6. Let H be the function defined in (4.18). Then, under the hypotheses of Theorem 4.5,

$$\sup_{s,t\ge 0} e^{-\alpha t} H(t) < \infty.$$

Proof. For any fixed s ≥ 0 and for any t ≥ 0 , we have

$$H(t) = 1 - H(s,t) = \left(1 - e^{-sh(t)}\right)\left(1 - G(t)\right) + \int_{[0,t]}\left[1 - f\left(H(s,t-u)\right)\right]dG(u).$$

Note that f (1) = 1, 0 < H(s, t − u) < 1 and f is a continuous function. Then, by the mean value theorem, there exists c such that H(s, t − u) < c < 1 and

$$f'(c) = \frac{f(1) - f\left(H(s,t-u)\right)}{1 - H(s,t-u)}. \tag{4.23}$$

Therefore,

$$H(t) \le \left(1 - G(t)\right) + \int_{[0,t]} f'(c)\left(1 - H(s,t-u)\right)dG(u) \le \left(1 - G(t)\right) + m\int_{[0,t]} H(t-u)\,dG(u), \tag{4.24}$$
since $f'$ is non-decreasing and hence $f'(c) \le f'(1) = m$.

Let $me^{-\alpha t}\,dG(t) = dG_{\alpha}(t)$ and $g_{\alpha}(t) \equiv e^{-\alpha t}\left(1 - G(t)\right)$. Note that $\int_0^{\infty} te^{-\alpha t}\,dG(t) < \infty$ implies the Riemann integrability of $g_{\alpha}$.

So, gα ≥ 0 is non-increasing and Riemann-integrable, and hence gα is directly Riemann integrable by condition (2) in Remark 4.1.

Moreover, $G$ being non-lattice implies that $G_{\alpha}$ is also non-lattice. Let $G_{\alpha}^{*n}$ be the $n$-fold convolution of $G_{\alpha}$ and $U_{\alpha} = \sum_{n=0}^{\infty} G_{\alpha}^{*n}$. Then, by Lemma 4.4, we have that
$$\lim_{t\to\infty} g_{\alpha} * U_{\alpha}(t) = \frac{\int_0^{\infty} g_{\alpha}(u)\,du}{\int_0^{\infty} u\,dG_{\alpha}(u)} < \infty.$$

Multiplying both sides of (4.24) by $e^{-\alpha t}$ and writing $H_{\alpha}(t) = e^{-\alpha t}H(t)$, we get
$$e^{-\alpha t}H(t) \le e^{-\alpha t}\left(1-G(t)\right) + m\int_{[0,t]} e^{-\alpha t}H(t-u)\,dG(u)$$
$$= g_{\alpha}(t) + \int_{[0,t]} e^{-\alpha(t-u)}H(t-u)\,dG_{\alpha}(u)$$
$$= g_{\alpha}(t) + H_{\alpha} * G_{\alpha}(t)$$
$$\le g_{\alpha}(t) + \left(g_{\alpha} + H_{\alpha}*G_{\alpha}\right)*G_{\alpha}(t)$$
$$= \cdots$$
$$\le g_{\alpha}(t) + g_{\alpha}*G_{\alpha}(t) + g_{\alpha}*G_{\alpha}^{*2}(t) + g_{\alpha}*G_{\alpha}^{*3}(t) + \cdots$$
$$= g_{\alpha} * U_{\alpha}(t),$$
and hence $\limsup_{t\to\infty} e^{-\alpha t}H(t)$ is bounded by a constant for any $s \ge 0$. So,
$$\sup_{s,t\ge 0} e^{-\alpha t}H(t) < \infty.$$



Lemma 4.7. Let $\xi_1$ be the function defined in (4.19). Then, under the hypotheses of Theorem 4.5, $e^{-\alpha t}\xi_1(t)$ is directly Riemann integrable.

Proof. Note that

$$e^{-\alpha t}\xi_1(t) = e^{-\alpha t}\left(1 - e^{-sh(t)}\right)\left(1 - G(t)\right) \le e^{-\alpha t}\left(1 - G(t)\right) \equiv g_{\alpha}(t),$$
where $g_{\alpha}$ is directly Riemann integrable, as shown in the proof of Lemma 4.6.

So, $e^{-\alpha t}\xi_1$ is directly Riemann integrable by condition (3) in Remark 4.1.



Lemma 4.8. Let $\xi_2$ be the function defined in (4.20). Then, under the hypotheses of Theorem 4.5,
$$\int_0^{\infty} e^{-\alpha t}\left|\xi_2(t)\right|dt < \infty.$$

Proof. Recall that
$$H(t) = \xi_1(t) + \xi_2(t) + m\int_{[0,t]} H(t-u)\,dG(u)$$
$$\Rightarrow\quad e^{-\alpha t}H(t) = e^{-\alpha t}\xi_1(t) + e^{-\alpha t}\xi_2(t) + m\int_{[0,t]} e^{-\alpha t}H(t-u)\,dG(u).$$

Let $H_{\alpha}(t) = e^{-\alpha t}H(t)$, $\xi_{1\alpha}(t) = e^{-\alpha t}\xi_1(t)$ and $\xi_{2\alpha}(t) = e^{-\alpha t}\xi_2(t)$; then
$$H_{\alpha}(t) = \xi_{1\alpha}(t) + \xi_{2\alpha}(t) + \int_{[0,t]} H_{\alpha}(t-u)\,dG_{\alpha}(u) = \xi_{1\alpha}(t) + \xi_{2\alpha}(t) + H_{\alpha}*G_{\alpha}(t). \tag{4.25}$$

We know that $\xi_{1\alpha}$ is bounded by 1 and, by Lemma 4.6, $H_{\alpha}$ is also bounded, so $\xi_{2\alpha}$ is bounded. Taking Laplace transforms on both sides of (4.25), we have that

$$\hat H_{\alpha}(\theta) = \hat\xi_{1\alpha}(\theta) + \hat\xi_{2\alpha}(\theta) + \hat H_{\alpha}(\theta)\hat G_{\alpha}(\theta)$$
$$\Rightarrow\quad \hat H_{\alpha}(\theta)\left(1 - \hat G_{\alpha}(\theta)\right) + \left(-\hat\xi_{2\alpha}(\theta)\right) = \hat\xi_{1\alpha}(\theta).$$

Note that, by (4.23),

$$\frac{f(1) - f\left(H(s,t-u)\right)}{1 - H(s,t-u)} = f'(c) < f'(1) = m \tag{4.26}$$
and hence
$$\xi_2(t) = \int_0^t\left[1 - f\left(H(s,t-u)\right) - mH(t-u)\right]dG(u) < 0.$$
So, we have that $H_{\alpha} \ge 0$, $\xi_{1\alpha} \ge 0$, $\xi_{2\alpha} \le 0$ and $\hat G_{\alpha}(\theta) \le 1$. Thus, $\hat H_{\alpha}(\theta)\left(1 - \hat G_{\alpha}(\theta)\right) \ge 0$, $-\hat\xi_{2\alpha}(\theta) \ge 0$ and $\hat\xi_{1\alpha}(\theta) \ge 0$. Also, by the monotone convergence theorem,

$$\lim_{\theta\downarrow 0}\hat\xi_{1\alpha}(\theta) = \lim_{\theta\downarrow 0}\int_0^{\infty} e^{-\theta t}\xi_{1\alpha}(t)\,dt = \int_0^{\infty}\xi_{1\alpha}(t)\,dt = \int_0^{\infty} e^{-\alpha t}\xi_1(t)\,dt < \infty,$$
and hence $\lim_{\theta\downarrow 0}\left(-\hat\xi_{2\alpha}(\theta)\right) < \infty$ since $\hat H_{\alpha}(\theta)\left(1-\hat G_{\alpha}(\theta)\right)$, $-\hat\xi_{2\alpha}(\theta)$ and $\hat\xi_{1\alpha}(\theta)$ are all non-negative. Therefore, by the monotone convergence theorem again,

$$\int_0^{\infty} e^{-\alpha t}\left|\xi_2(t)\right|dt = \int_0^{\infty} e^{-\alpha t}\left(-\xi_2(t)\right)dt = \int_0^{\infty}\left(-\xi_{2\alpha}(t)\right)dt = \lim_{\theta\downarrow 0}\int_0^{\infty} e^{-\theta t}\left(-\xi_{2\alpha}(t)\right)dt = \lim_{\theta\downarrow 0}\left(-\hat\xi_{2\alpha}(\theta)\right) < \infty.$$



Lemma 4.9. Let $\xi_2$ be the function defined in (4.20). Then, under the hypotheses of Theorem 4.5, $e^{-\alpha t}\xi_2(t)$ is directly Riemann integrable.

Proof. By the assumption, we have that
$$M \equiv \sup_n\,\sup_{n\le t<n+1}\frac{1 - f\left(H_h(s,t-u)\right) - m\left(1 - H_h(s,t-u)\right)}{1 - f\left(H_h(s,n-u)\right) - m\left(1 - H_h(s,n-u)\right)} < \infty.$$

So, for $n \le t < n+1$, we have

$$e^{-\alpha t}\left|\xi_2(t)\right| \le e^{-\alpha(n+1)}\int_0^t\left|1 - f\left(H(s,t-u)\right) - mH(t-u)\right|dG(u)$$
$$= e^{-\alpha(n+1)}\left[\int_0^n\left|1 - f\left(H(s,t-u)\right) - mH(t-u)\right|dG(u) + \int_n^t\left|1 - f\left(H(s,t-u)\right) - mH(t-u)\right|dG(u)\right]$$
$$\le e^{-\alpha(n+1)}M\int_0^n\left|1 - f\left(H(s,n-u)\right) - mH(n-u)\right|dG(u) + e^{-\alpha(n+1)}\int_n^t\left|1 - f\left(H(s,t-u)\right) - mH(t-u)\right|dG(u).$$
Since $|f| \le 1$, $|H| \le 1$ and $\left|1 - f\left(H(s,t-u)\right) - mH(t-u)\right| \le 1$, we have
$$e^{-\alpha t}\left|\xi_2(t)\right| \le Me^{-\alpha(n+1)}\left|\xi_2(n)\right| + e^{-\alpha(n+1)}\left(G(t) - G(n)\right),$$
and then
$$\sup_{n\le t<n+1} e^{-\alpha t}\left|\xi_2(t)\right| \le e^{-\alpha}\left(Me^{-\alpha n}\left|\xi_2(n)\right| + e^{-\alpha n}\left(1 - G(n)\right)\right).$$

By Lemma 4.8, the terms $e^{-\alpha n}\left|\xi_2(n)\right|$ are summable in $n$, and $e^{-\alpha n}\left(1 - G(n)\right) = g_{\alpha}(n)$, where $g_{\alpha}$ is directly Riemann integrable. Hence the right-hand side above is dominated by a directly Riemann integrable function. Therefore, by condition (3), $e^{-\alpha t}\xi_2(t)$ is also directly Riemann integrable.



Now, we continue to prove Theorem 4.5.

By Lemma 4.7 and Lemma 4.9, we have that

$$e^{-\alpha t}\xi_3(t) = e^{-\alpha t}\xi_1(t) + e^{-\alpha t}\xi_2(t)$$
is directly Riemann integrable.

Then, by Lemma 4.5, we know that the solution $H$ of the integral equation
$$H(t) = \xi_3(t) + m\int_{[0,t]} H(t-u)\,dG(u)$$
satisfies

$$H(t) \sim ce^{\alpha t} \quad \text{as } t\to\infty,$$

where
$$c = \frac{\int_0^{\infty} e^{-\alpha u}\xi_3(u)\,du}{m\int_0^{\infty} ue^{-\alpha u}\,dG(u)}.$$
Note that $H(t) = 1 - H(s,t)$ and hence $c \equiv c(s)$ depends on $s$.

Then

$$\lim_{t\to\infty} E\left(e^{-s\sum_{i=1}^{Z(t)}h(a_{t,i})}\,\Big|\, Z(t)>0\right) = \lim_{t\to\infty}\left(1 - \frac{1}{P(Z(t)>0)}\left(1 - H(s,t)\right)\right)$$
$$= 1 - \lim_{t\to\infty}\frac{e^{-\alpha t}\left(1 - H(s,t)\right)}{e^{-\alpha t}P(Z(t)>0)} = 1 - \frac{c(s)}{Q(0)} \equiv \phi(s).$$

Moreover, since, by the bounded convergence theorem,

$$\lim_{s\to 0^+} H(s,t) = \lim_{s\to 0^+} E\left(e^{-s\sum_{i=1}^{Z(t)}h(a_{t,i})}\right) = 1,$$
we have that

$$\lim_{s\to 0^+} H(t) = \lim_{s\to 0^+}\left(1 - H(s,t)\right) = 0,$$

$$\lim_{s\to 0^+}\xi_1(t) = \lim_{s\to 0^+}\left(1 - e^{-sh(t)}\right)\left(1 - G(t)\right) = 0,$$
and, by the bounded convergence theorem again,

$$\lim_{s\to 0^+}\xi_2(t) = \lim_{s\to 0^+}\int_0^t\left[1 - f\left(H(s,t-u)\right) - mH(t-u)\right]dG(u) = 0.$$
Hence, $\lim_{s\to 0^+}\xi_3(t) = \lim_{s\to 0^+}\left(\xi_1(t) + \xi_2(t)\right) = 0$. Also, for any $s \ge 0$,

$$e^{-\alpha t}\left|\xi_3(t)\right| \le e^{-\alpha t}\xi_1(t) + e^{-\alpha t}\left|\xi_2(t)\right|,$$

where $e^{-\alpha t}\xi_1(t)$ and $e^{-\alpha t}\left|\xi_2(t)\right|$ are integrable. Then, by the dominated convergence theorem,

$$\lim_{s\to 0^+}\int_0^{\infty} e^{-\alpha t}\xi_3(t)\,dt = \int_0^{\infty}\lim_{s\to 0^+} e^{-\alpha t}\xi_3(t)\,dt = 0,$$
and hence

$$\lim_{s\to 0^+}\phi(s) = \lim_{s\to 0^+}\left(1 - \frac{c(s)}{Q(0)}\right) = 1 - \frac{1}{Q(0)}\lim_{s\to 0^+}\frac{\int_0^{\infty} e^{-\alpha u}\xi_3(u)\,du}{m\int_0^{\infty} ue^{-\alpha u}\,dG(u)} = 1 - 0 = 1.$$
Therefore, $\phi$ is a Laplace functional of a point process.

Since, for any $s \ge 0$,
$$\phi(s) = \lim_{t\to\infty} E\left(e^{-s\sum_{i=1}^{Z(t)}h(a_{t,i})}\,\Big|\, Z(t)>0\right)$$
and

$$\left(Z(t)\,\big|\, Z(t)>0\right) \xrightarrow{d} Y \quad \text{as } t\to\infty,$$
there exists a point process $\widetilde A \equiv \{\tilde a_i: 1\le i\le Y\}$ such that

$$\phi(s) = E\left(e^{-s\sum_{i=1}^{Y}h(\tilde a_i)}\right)$$
for any $s \ge 0$, and, as $t\to\infty$,

$$\left(A(t)\,\big|\, Z(t)>0\right) \xrightarrow{d} \widetilde A.$$

The proof is complete.

4.3.3 The proof of Theorem 4.6

Let $Z_{t,i}(u)$ be the branching process initiated by the $i$th individual alive at time $t$. So,
$$Z(t) = \sum_{i=1}^{Z(t-u)} Z_{t-u,i}\left(a_{t-u,i} + u\right). \tag{4.27}$$

For any u ≤ t,

$$P\left(t - D_2(t) \ge u \,\big|\, Z(t)\ge 2\right) = P\left(D_2(t) \le t-u \,\big|\, Z(t)\ge 2\right)$$
$$= E\left(\frac{\sum_{\substack{i,j=1\\ i\ne j}}^{Z(t-u)} Z_{t-u,i}(a_{t-u,i}+u)\,Z_{t-u,j}(a_{t-u,j}+u)}{Z(t)\left(Z(t)-1\right)}\,\Bigg|\, Z(t)\ge 2\right)$$
$$= \frac{1}{P(Z(t)\ge 2)}\, E\left(\frac{\sum_{\substack{i,j=1\\ i\ne j}}^{Z(t-u)} Z_{t-u,i}(a_{t-u,i}+u)\,Z_{t-u,j}(a_{t-u,j}+u)}{Z(t)\left(Z(t)-1\right)}\, I_{(Z(t)\ge 2)}\right)$$
$$= \frac{1}{P(Z(t)\ge 2)}\, E\left(\frac{\sum_{\substack{i,j=1\\ i\ne j}}^{Z(t-u)} Z_{t-u,i}(a_{t-u,i}+u)\,Z_{t-u,j}(a_{t-u,j}+u)}{Z(t)\left(Z(t)-1\right)}\, I_{(Z(t)\ge 2)}\, I_{(Z(t-u)>0)}\right).$$

By (4.27) and the definition of the conditional probability, we have

$$P\left(t - D_2(t) \ge u \,\big|\, Z(t)\ge 2\right)$$
$$= \frac{P(Z(t-u)>0)}{P\left(Z(t)\ge 2\,\big|\, Z(t)>0\right)P(Z(t)>0)}\, E\Bigg(\frac{\sum_{\substack{i,j=1\\ i\ne j}}^{Z(t-u)} \widetilde Z_i(a_{t-u,i}+u)\,\widetilde Z_j(a_{t-u,j}+u)}{\left(\sum_{i=1}^{Z(t-u)}\widetilde Z_i(a_{t-u,i}+u)\right)\left(\sum_{i=1}^{Z(t-u)}\widetilde Z_i(a_{t-u,i}+u)-1\right)}\, I_{\left(\sum_{i=1}^{Z(t-u)}\widetilde Z_i(a_{t-u,i}+u)\ge 2\right)}\,\Bigg|\, Z(t-u)>0\Bigg)$$
$$= \frac{P(Z(t-u)>0)}{P\left(Z(t)\ge 2\,\big|\, Z(t)>0\right)P(Z(t)>0)}\, E\left(\phi\left(A(t-u), u\right)\,\Big|\, Z(t-u)>0\right),$$
where $\{\widetilde Z_i(t)\}_{i\ge 1}$ are i.i.d. copies of $Z(t)$, $A(t-u) = \left(a_{t-u,1}, a_{t-u,2}, \cdots, a_{t-u,Z(t-u)}\right)$, and

$$\phi\left((a_1, a_2, \cdots, a_k), u\right) = E\left(\frac{\sum_{\substack{i,j=1\\ i\ne j}}^{k}\widetilde Z_i(a_i+u)\,\widetilde Z_j(a_j+u)}{\left(\sum_{i=1}^{k}\widetilde Z_i(a_i+u)\right)\left(\sum_{i=1}^{k}\widetilde Z_i(a_i+u)-1\right)}\, I_{\left(\sum_{i=1}^{k}\widetilde Z_i(a_i+u)\ge 2\right)}\right)$$
for any positive integer $k$ and any positive real numbers $a_1, a_2, \cdots, a_k$.

Since, for any fixed $u$, $\phi(\cdot, u)$ is bounded and continuous, we have, by Theorem 1.13 (b),

$$E\left(\phi\left(A(t-u), u\right)\,\Big|\, Z(t-u)>0\right) \to E\left(\phi\left(\widetilde A, u\right)\right) \quad \text{as } t\to\infty.$$

Moreover, since, by Theorem 1.13 (a), $P(Z(t)>0) \sim ce^{\alpha t}$ for some $c > 0$, we have that

$$\lim_{t\to\infty} P\left(t - D_2(t) \ge u\,\big|\, Z(t)\ge 2\right) = \lim_{t\to\infty}\frac{1}{P\left(Z(t)\ge 2\,\big|\, Z(t)>0\right)}\cdot\frac{ce^{\alpha(t-u)}}{ce^{\alpha t}}\, E\left(\phi\left(A(t-u),u\right)\,\Big|\, Z(t-u)>0\right)$$
$$= \frac{1}{P(Y\ge 2)}\, e^{-\alpha u}\, E\left(\phi\left(\widetilde A, u\right)\right) \equiv 1 - H_2(u).$$

It remains to show that $H_2$ is a proper probability distribution, i.e., $H_2(u) \to 1$ as $u\to\infty$. It suffices to prove that

$$\lim_{u\to\infty} e^{-\alpha u} E\left(\phi\left(\widetilde A, u\right)\right) = 0.$$

First, we have

$$E\left(\phi\left(\widetilde A, u\right)\right) = E\left(\frac{\sum_{\substack{i,j=1\\ i\ne j}}^{Y}\widetilde Z_i(\tilde a_i+u)\,\widetilde Z_j(\tilde a_j+u)}{\left(\sum_{i=1}^{Y}\widetilde Z_i(\tilde a_i+u)\right)\left(\sum_{i=1}^{Y}\widetilde Z_i(\tilde a_i+u)-1\right)}\, I_{\left(\sum_{i=1}^{Y}\widetilde Z_i(\tilde a_i+u)\ge 2\right)}\right)$$
$$= E\left(E\left(\frac{\sum_{i\ne j}\widetilde Z_i(\tilde a_i+u)\,\widetilde Z_j(\tilde a_j+u)}{\left(\sum_i \widetilde Z_i(\tilde a_i+u)\right)\left(\sum_i \widetilde Z_i(\tilde a_i+u)-1\right)}\, I_{\left(\sum_i \widetilde Z_i(\tilde a_i+u)\ge 2\right)}\,\Bigg|\,\widetilde A\right)\right)$$
$$\le E\left(P\left(\text{there exist } 1\le i, j\le Y \text{ with } i\ne j,\ \widetilde Z_i(\tilde a_i+u)>0 \text{ and } \widetilde Z_j(\tilde a_j+u)>0\,\Big|\,\widetilde A\right)\right)$$
$$= E\left(1 - P\left(\widetilde Z_i(\tilde a_i+u)=0 \text{ for all } i=1,2,\cdots,Y\,\Big|\,\widetilde A\right) - P\left(\widetilde Z_i(\tilde a_i+u)>0 \text{ for some } i \text{ and } \widetilde Z_j(\tilde a_j+u)=0 \text{ for all } j\ne i\,\Big|\,\widetilde A\right)\right).$$

For any $0\le s\le 1$ and $t\ge 0$, let $F(s,t) = \sum_{j=0}^{\infty} P\left(Z(t)=j\right)s^j$. By Theorem 1.10, we have that
$$\lim_{t\to\infty} e^{-\alpha t}\left(1 - F(s,t)\right) \equiv Q(s) \quad \text{exists for } 0\le s\le 1.$$

So,

$$e^{-\alpha u} E\left(\phi\left(\widetilde A, u\right)\right) \le e^{-\alpha u}\, E\left(1 - \prod_{i=1}^{Y} F(0, \tilde a_i+u) - \sum_{i=1}^{Y}\left(1 - F(0,\tilde a_i+u)\right)\prod_{j\ne i} F(0,\tilde a_j+u)\right).$$

Note that the assumption $\sum_{j=1}^{\infty}(j\log j)p_j < \infty$ implies $0 < EY < \infty$ and hence $P(0 < Y < \infty) = 1$. Now, conditioned on the limit age chart $\widetilde A$, we have that

$$\lim_{u\to\infty} e^{-\alpha u}\left(1 - \prod_{i=1}^{Y} F(0,\tilde a_i+u)\right) = \lim_{u\to\infty} e^{-\alpha u}\left(1 - \prod_{i=1}^{Y}\left(1 - Q(0)e^{\alpha(\tilde a_i+u)}\right)\right)$$
$$= \lim_{u\to\infty}\frac{1 - \prod_{i=1}^{Y}\left(1 - Q(0)e^{\alpha(\tilde a_i+u)}\right)}{e^{\alpha u}}
= \lim_{u\to\infty}\frac{\sum_{i=1}^{Y} Q(0)e^{\alpha\tilde a_i}\,\alpha e^{\alpha u}\prod_{j\ne i}\left(1 - Q(0)e^{\alpha(\tilde a_j+u)}\right)}{\alpha e^{\alpha u}}$$
$$= \lim_{u\to\infty}\sum_{i=1}^{Y} Q(0)e^{\alpha\tilde a_i}\prod_{j\ne i}\left(1 - Q(0)e^{\alpha(\tilde a_j+u)}\right) = Q(0)\sum_{i=1}^{Y} e^{\alpha\tilde a_i}$$

and
$$\lim_{u\to\infty} e^{-\alpha u}\sum_{i=1}^{Y}\left(1 - F(0,\tilde a_i+u)\right)\prod_{j\ne i} F(0,\tilde a_j+u)
= \lim_{u\to\infty} e^{-\alpha u}\sum_{i=1}^{Y} Q(0)e^{\alpha(\tilde a_i+u)}\prod_{j\ne i}\left(1 - Q(0)e^{\alpha(\tilde a_j+u)}\right)$$
$$\ge \lim_{u\to\infty} e^{-\alpha u}\sum_{i=1}^{Y} Q(0)e^{\alpha(\tilde a_i+u)}\prod_{j\ne i}\left(1 - Q(0)e^{\alpha u}\right)
= \lim_{u\to\infty}\sum_{i=1}^{Y} Q(0)e^{\alpha\tilde a_i}\left(1 - Q(0)e^{\alpha u}\right)^{Y-1} = Q(0)\sum_{i=1}^{Y} e^{\alpha\tilde a_i}.$$

Hence, conditioned on $\widetilde A$,

$$0 \le \lim_{u\to\infty} e^{-\alpha u}\left(1 - \prod_{i=1}^{Y} F(0,\tilde a_i+u) - \sum_{i=1}^{Y}\left(1 - F(0,\tilde a_i+u)\right)\prod_{j\ne i} F(0,\tilde a_j+u)\right)$$
$$= \lim_{u\to\infty} e^{-\alpha u}\left(1 - \prod_{i=1}^{Y} F(0,\tilde a_i+u)\right) - \lim_{u\to\infty} e^{-\alpha u}\sum_{i=1}^{Y}\left(1 - F(0,\tilde a_i+u)\right)\prod_{j\ne i} F(0,\tilde a_j+u)$$
$$\le Q(0)\sum_{i=1}^{Y} e^{\alpha\tilde a_i} - Q(0)\sum_{i=1}^{Y} e^{\alpha\tilde a_i} = 0 \quad \text{w.p.1.}$$

Therefore, by the bounded convergence theorem,

$$\lim_{u\to\infty} e^{-\alpha u} E\left(\phi\left(\widetilde A, u\right)\right) = \lim_{u\to\infty} e^{-\alpha u}\, E\left(1 - \prod_{i=1}^{Y} F(0,\tilde a_i+u) - \sum_{i=1}^{Y}\left(1 - F(0,\tilde a_i+u)\right)\prod_{j\ne i} F(0,\tilde a_j+u)\right) = 0,$$
and the proof is complete.
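The limits obtained above via L'Hopital's rule can also be checked numerically. The sketch below verifies the first one, $e^{-\alpha u}\left(1 - \prod_i\left(1 - Q(0)e^{\alpha(\tilde a_i + u)}\right)\right) \to Q(0)\sum_i e^{\alpha\tilde a_i}$, for a made-up age chart and illustrative values of $\alpha < 0$ and $Q(0)$ (none of these numbers come from the text):

```python
import math

alpha, Q0 = -0.7, 0.3          # illustrative: alpha < 0 (subcritical), Q(0) = 0.3
ages = [0.5, 1.2, 2.0]         # a made-up realization of the limit age chart

def lhs(u):
    """e^{-alpha u} * (1 - prod_i (1 - Q0 e^{alpha (a_i + u)}))."""
    prod = 1.0
    for a in ages:
        prod *= 1.0 - Q0 * math.exp(alpha * (a + u))
    return math.exp(-alpha * u) * (1.0 - prod)

target = Q0 * sum(math.exp(alpha * a) for a in ages)  # Q(0) sum_i e^{alpha a_i}
print(abs(lhs(20.0) - target))  # already tiny for moderate u
```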

CHAPTER 5. COALESCENCE IN CONTINUOUS-TIME MULTI-TYPE AGE-DEPENDENT BELLMAN-HARRIS BRANCHING PROCESSES

5.1 Introduction

In this chapter, we consider a continuous-time $d$-type ($2\le d<\infty$) age-dependent Bellman-Harris branching process $\{Z(t): t\ge 0\}$, where

Z(t) = (Z1(t), Z2(t), ··· , Zd(t))

is the population vector of the individuals alive at time t and Zi(t) is the number of individuals of type i alive at time t, t ≥ 0.

Recall that in a continuous-time multi-type Bellman-Harris branching process, each type $i$ individual, upon its death, produces $\xi_{i,j}$ children of type $j$, $j = 1, 2, \cdots, d$, according to the probability distribution $\{p^{(i)}(\mathbf{j}) \equiv p^{(i)}(j_1, j_2, \cdots, j_d)\}_{\mathbf{j}\in\mathbb{N}^d}$ and independently of other individuals, where $p^{(i)}(j_1, j_2, \cdots, j_d)$ is the probability that a type $i$ parent produces $j_1$ children of type 1, $j_2$ children of type 2, $\cdots$, $j_d$ children of type $d$.

As in the single-type Bellman-Harris process, there is an embedded generation process $\{Y_n\}_{n\ge 0}$ for the multi-type Bellman-Harris branching process, where $Y_n = (Y_{n,1}, Y_{n,2}, \cdots, Y_{n,d})$ and $Y_{n,i}$ is the number of individuals of type $i$ in the $n$th generation. It is clear that $\{Y_n\}_{n\ge 0}$ is a discrete-time multi-type Galton-Watson branching process.

Throughout this section, we will adopt all the definitions and notations from Section 1.5 and have the following assumptions:

(1) this process is initiated by one individual of type i0 of age 0, i.e., Z(0) = ei0 and a0,1 = 0.

(2) $M = \left(m_{ij}: i, j = 1, 2, \cdots, d\right)$ is nonsingular and positively regular, and we write $\rho$ for its Perron-Frobenius root (the maximal eigenvalue).

Also, we denote by $\alpha$ the Malthusian parameter for the matrix $\widehat M(\alpha) = \left(m_{ij}\widehat G_i(\alpha)\right)_{i,j=1}^d$, where $\widehat G_i(\alpha) = \int_0^{\infty} e^{-\alpha t}\,G_i(dt)$.
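Concretely, $\alpha$ is the value at which the Perron root of $\widehat M(\alpha)$ equals 1, and it can be found by bisection since that root decreases in $\alpha$. The sketch below does this for an illustrative 2-type example with exponential lifetimes, so that $\widehat G_i(\alpha) = \lambda_i/(\lambda_i+\alpha)$; all numbers are made up and not from the text.

```python
import math

M = [[1.0, 0.5], [0.5, 1.0]]   # made-up mean matrix; Perron root rho = 1.5 > 1
lam = [1.0, 2.0]               # exponential lifetimes: G_i-hat(a) = lam_i/(lam_i + a)

def perron_root_2x2(A):
    """Larger eigenvalue of a non-negative 2x2 matrix."""
    a, b, c, d = A[0][0], A[0][1], A[1][0], A[1][1]
    tr, det = a + d, a * d - b * c
    return (tr + math.sqrt(tr * tr - 4.0 * det)) / 2.0

def perron_of_Mhat(alpha):
    Mhat = [[M[i][j] * lam[i] / (lam[i] + alpha) for j in range(2)] for i in range(2)]
    return perron_root_2x2(Mhat)

lo, hi = 0.0, 5.0              # perron_of_Mhat decreases in alpha on [0, inf)
for _ in range(60):            # bisect for perron_of_Mhat(alpha) = 1
    mid = (lo + hi) / 2.0
    lo, hi = (mid, hi) if perron_of_Mhat(mid) > 1.0 else (lo, mid)
alpha = (lo + hi) / 2.0
print(round(alpha, 4), round(perron_of_Mhat(alpha), 6))
```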

5.2 Results in The Supercritical Case

For any $k \ge 2$, we pick $k$ individuals at random from those alive at time $t$ and trace their lines of descent backward in time until they meet.

Let $D_k(t)$ be the death time of the last common ancestor of these randomly chosen individuals at time $t$. We investigate the limit behavior of $D_k(t)$ when $t$ gets large. The result for the supercritical case is stated in Theorem 5.1. First, we assume that $P\left(Z_1 = 0\,|\,Z_0 = e_i\right) = 0$ for any $i = 1, 2, \cdots, d$, and $E\left(Z_{1,j}\,|\,Z_0 = e_i\right) \equiv m_{ij} < \infty$ for all $1\le i, j\le d$.

5.2.1 The statements of Results

Let $\xi_{s,i,p} = \left(\xi_{s,i,p,1}, \xi_{s,i,p,2}, \cdots, \xi_{s,i,p,d}\right)$ be the offspring vector of the $p$th individual of type $i$ alive at time $s$. Let $L_{s,i,p}$ be the total lifetime of the $p$th individual of type $i$ alive at time $s$. Then $\{L_{s,i,p}\}_{p\ge 1}$ are i.i.d. copies of the lifetime random variable with distribution $G_i$.

Let as,i,p be the corresponding age and Rs,i,p be the corresponding residual lifetime at time s. That is, Rs,i,p = Ls,i,p − as,i,p for any p ≥ 1, i = 1, 2, ··· , d and any s ≥ 0.

Theorem 5.1. Let $1 < \rho < \infty$ and suppose the lifetime distribution $G_i$ is non-lattice with $G_i(0+) = 0$ for $i = 1, 2, \cdots, d$. If $E\left(\|Z_1\|\log\|Z_1\|\,\big|\,Z_0 = e_i\right) < \infty$ for all $1\le i\le d$, then, for any integer $k\ge 2$,

(a) for almost all trees T and any s ≥ 0,

$$P\left(\widetilde D_k \le s\,\big|\, T\right) \equiv H_k(s, T) = 1 - \frac{\sum_{i=1}^{d}\sum_{p=1}^{Z_i(s)}\left(e^{-\alpha R_{s,i,p}}\widetilde W_{s,i,p}\right)^k}{\left(\sum_{i=1}^{d}\sum_{p=1}^{Z_i(s)} e^{-\alpha R_{s,i,p}}\widetilde W_{s,i,p}\right)^k}$$

where $\{\widetilde W_{s,i,p}\}_{p\ge 1}$ are the i.i.d. copies of the sum $\sum_{j=1}^{d}\sum_{q=1}^{\xi_{s,i,p,j}} W_{s,i,p,j,q}$, and $\{W_{s,i,p,j,q}: q\ge 1,\ 1\le j\le d\}$ are i.i.d. copies of $W$ as defined in Theorem 1.17.

(b) there exists a random variable $\widetilde D_k$ on the set of non-negative real numbers such that $D_k(t) \xrightarrow{d} \widetilde D_k$ as $t\to\infty$ and

$$P\left(\widetilde D_k \le s\right) = 1 - E\left(\frac{\sum_{i=1}^{d}\sum_{p=1}^{Z_i(s)}\left(e^{-\alpha R_{s,i,p}}\widetilde W_{s,i,p}\right)^k}{\left(\sum_{i=1}^{d}\sum_{p=1}^{Z_i(s)} e^{-\alpha R_{s,i,p}}\widetilde W_{s,i,p}\right)^k}\right) \equiv H_k(s)$$
for any $s\ge 0$.

Next, we investigate the generation number Xk(t) of the last common ancestor of any k randomly chosen individuals alive at time t.

Consider the case in which every individual in the branching process has the same lifetime distribution $G$ no matter what type it is.

Let $L_{r,i,p,q}$ be the lifetime of the $q$th-generation ancestor of the $p$th individual of type $i$ in the $r$th generation, $q = 0, 1, \cdots, r-1$, $p\ge 1$, $i = 1, 2, \cdots, d$ and $r\ge 0$. Let $S_{r,i,p} = \sum_{q=0}^{r-1} L_{r,i,p,q}$; then $S_{r,i,p}$ is the birth time of the $p$th individual of type $i$ in the $r$th generation. Then we have the following theorem.

Theorem 5.2. Let $1 < \rho < \infty$ and suppose the lifetime distribution $G$ is non-lattice with $G(0+) = 0$. If $E\left(\|Z_1\|\log\|Z_1\|\,\big|\,Z_0 = e_i\right) < \infty$ for all $1\le i\le d$, then, for any integer $k\ge 2$,

(a) for almost all trees $T$ and any integer $r \ge 0$,

$$P\left(\widetilde X_k \le r\,\big|\, T\right) \equiv H_k(r, T) = 1 - \frac{\sum_{i=1}^{d}\sum_{p=1}^{Y_{r,i}}\left(e^{-\alpha S_{r,i,p}}\widetilde W_{r,i,p}\right)^k}{W^k}$$
where $\{\widetilde W_{r,i,p}\}_{p\ge 1}$ are i.i.d. copies of $W$ as defined in Theorem 1.17.

(b) there exists a random variable $\widetilde X_k$ on the set of non-negative integers such that $X_k(t) \xrightarrow{d} \widetilde X_k$ as $t\to\infty$ and

$$P\left(\widetilde X_k \le r\right) = 1 - E\left(\frac{\sum_{i=1}^{d}\sum_{p=1}^{Y_{r,i}}\left(e^{-\alpha S_{r,i,p}}\widetilde W_{r,i,p}\right)^k}{W^k}\right) \equiv \phi_k(r)$$
for any $r \ge 0$.

Moreover, let $\eta(t)$ be the type of the last common ancestor and $\zeta_1(t), \zeta_2(t)$ be the types of the two randomly chosen individuals at time $t$. We also have the limiting joint distribution of $\left(X_2(t), \eta(t), \zeta_1(t), \zeta_2(t)\right)$.

Theorem 5.3. Let $1 < \rho < \infty$ and suppose the lifetime distribution $G$ is non-lattice with $G(0+) = 0$. If $E\left(\|Z_1\|\log\|Z_1\|\,\big|\, Z_0 = e_i\right) < \infty$ for all $1\le i\le d$, then

$$\lim_{t\to\infty} P\left(X_2(t) = r,\ \eta(t) = j,\ \zeta_1(t) = i_1,\ \zeta_2(t) = i_2\right) \equiv \varphi_2(r, j, i_1, i_2) \quad \text{exists,}$$
and $\sum_{(r, j, i_1, i_2)}\varphi_2(r, j, i_1, i_2) = 1$.

5.2.2 The proof of Theorem 5.1

Let $\widetilde Z_{t-s-R_{s,i,p},j,q}$ be the branching process initiated by the $q$th offspring of type $j$ of the $p$th individual of type $i$ alive at time $s$.

Pick $k$ individuals randomly from those alive at time $t$ and trace their lines of descent backward in time until they meet. Denote by $D_k(t)$ the coalescence time, which is also the death time of the last common ancestor of these randomly chosen individuals.

For almost all trees T and s ≥ 0,

$$P\left(D_k(t) \le s\,\big|\, T\right) = 1 - P\left(D_k(t) > s\,\big|\, T\right)$$
$$= 1 - \frac{\sum_{i=1}^{d}\sum_{p=1}^{Z_i(s)}\left(\sum_{j=1}^{d}\sum_{q=1}^{\xi_{s,i,p,j}}\widetilde Z_{t-s-R_{s,i,p},j,q}\right)\left(\sum_{j=1}^{d}\sum_{q=1}^{\xi_{s,i,p,j}}\widetilde Z_{t-s-R_{s,i,p},j,q} - 1\right)\cdots\left(\sum_{j=1}^{d}\sum_{q=1}^{\xi_{s,i,p,j}}\widetilde Z_{t-s-R_{s,i,p},j,q} - k + 1\right)}{\left(\sum_{i=1}^{d}\sum_{p=1}^{Z_i(s)}\sum_{j=1}^{d}\sum_{q=1}^{\xi_{s,i,p,j}}\widetilde Z_{t-s-R_{s,i,p},j,q}\right)\left(\sum_{i,p,j,q}\widetilde Z_{t-s-R_{s,i,p},j,q} - 1\right)\cdots\left(\sum_{i,p,j,q}\widetilde Z_{t-s-R_{s,i,p},j,q} - k + 1\right)}$$
$$= 1 - \frac{\sum_{i=1}^{d}\sum_{p=1}^{Z_i(s)}\prod_{l=1}^{k} e^{-\alpha R_{s,i,p}}\left(\sum_{j=1}^{d}\sum_{q=1}^{\xi_{s,i,p,j}}\widetilde Z_{t-s-R_{s,i,p},j,q}\, e^{-\alpha(t-s-R_{s,i,p})} - (l-1)e^{-\alpha(t-s-R_{s,i,p})}\right)}{\prod_{l=1}^{k}\left(\sum_{i=1}^{d}\sum_{p=1}^{Z_i(s)} e^{-\alpha R_{s,i,p}}\sum_{j=1}^{d}\sum_{q=1}^{\xi_{s,i,p,j}}\widetilde Z_{t-s-R_{s,i,p},j,q}\, e^{-\alpha(t-s-R_{s,i,p})} - (l-1)e^{-\alpha(t-s)}\right)}$$
and then, by Theorem 1.17,
$$P\left(D_k(t) \le s\,\big|\, T\right) \to 1 - \frac{\sum_{i=1}^{d}\sum_{p=1}^{Z_i(s)}\left(e^{-\alpha R_{s,i,p}}\sum_{j=1}^{d}\sum_{q=1}^{\xi_{s,i,p,j}} W_{s,i,p,j,q}\right)^k}{\left(\sum_{i=1}^{d}\sum_{p=1}^{Z_i(s)} e^{-\alpha R_{s,i,p}}\sum_{j=1}^{d}\sum_{q=1}^{\xi_{s,i,p,j}} W_{s,i,p,j,q}\right)^k} \quad \text{as } t\to\infty$$
$$= 1 - \frac{\sum_{i=1}^{d}\sum_{p=1}^{Z_i(s)}\left(e^{-\alpha R_{s,i,p}}\widetilde W_{s,i,p}\right)^k}{\left(\sum_{i=1}^{d}\sum_{p=1}^{Z_i(s)} e^{-\alpha R_{s,i,p}}\widetilde W_{s,i,p}\right)^k} \equiv H_k(s, T)$$

where $\{W_{s,i,p,j,q}: q\ge 1,\ 1\le j\le d\}$ are i.i.d. copies of $W$ in Theorem 1.17 and $\widetilde W_{s,i,p} \equiv \sum_{j=1}^{d}\sum_{q=1}^{\xi_{s,i,p,j}} W_{s,i,p,j,q}$ for $p\ge 1$, $1\le i\le d$ and $s\ge 0$.

So, by the bounded convergence theorem, as t → ∞,

$$P\left(D_k(t)\le s\right) = E\left(P\left(D_k(t)\le s\,\big|\, T\right)\right) \to 1 - E\left(\frac{\sum_{i=1}^{d}\sum_{p=1}^{Z_i(s)}\left(e^{-\alpha R_{s,i,p}}\widetilde W_{s,i,p}\right)^k}{\left(\sum_{i=1}^{d}\sum_{p=1}^{Z_i(s)} e^{-\alpha R_{s,i,p}}\widetilde W_{s,i,p}\right)^k}\right) \equiv H_k(s).$$

Next, we need to show that $H_k$ is a proper probability distribution, i.e., that $H_k(s)\to 1$ as $s\to\infty$, which is the same as showing that

$$E\left(\frac{\sum_{i=1}^{d}\sum_{p=1}^{Z_i(s)}\left(e^{-\alpha R_{s,i,p}}\widetilde W_{s,i,p}\right)^k}{\left(\sum_{i=1}^{d}\sum_{p=1}^{Z_i(s)} e^{-\alpha R_{s,i,p}}\widetilde W_{s,i,p}\right)^k}\right) \to 0 \quad \text{as } s\to\infty.$$
It suffices to prove that, as $s\to\infty$,
$$\frac{\sum_{i=1}^{d}\sum_{p=1}^{Z_i(s)}\left(e^{-\alpha R_{s,i,p}}\widetilde W_{s,i,p}\right)^k}{\left(\sum_{i=1}^{d}\sum_{p=1}^{Z_i(s)} e^{-\alpha R_{s,i,p}}\widetilde W_{s,i,p}\right)^k} \to 0 \quad \text{in probability.}$$
Moreover, since
$$\frac{\sum_{i=1}^{d}\sum_{p=1}^{Z_i(s)}\left(e^{-\alpha R_{s,i,p}}\widetilde W_{s,i,p}\right)^k}{\left(\sum_{i=1}^{d}\sum_{p=1}^{Z_i(s)} e^{-\alpha R_{s,i,p}}\widetilde W_{s,i,p}\right)^k} \le \left(\frac{\max_{1\le p\le Z_i(s),\,1\le i\le d} e^{-\alpha R_{s,i,p}}\widetilde W_{s,i,p}}{\sum_{i=1}^{d}\sum_{p=1}^{Z_i(s)} e^{-\alpha R_{s,i,p}}\widetilde W_{s,i,p}}\right)^{k-1},$$
it is enough to show that, as $s\to\infty$,
$$\frac{\max_{1\le p\le Z_i(s),\,1\le i\le d} e^{-\alpha R_{s,i,p}}\widetilde W_{s,i,p}}{\sum_{i=1}^{d}\sum_{p=1}^{Z_i(s)} e^{-\alpha R_{s,i,p}}\widetilde W_{s,i,p}} \to 0 \quad \text{in probability.}$$
For any fixed $k > 0$,
$$\sum_{i=1}^{d}\sum_{p=1}^{Z_i(s)} e^{-\alpha R_{s,i,p}}\widetilde W_{s,i,p} \ge e^{-\alpha k}\sum_{i=1}^{d}\sum_{p=1}^{Z_i(s)}\widetilde W_{s,i,p}\, I_{(R_{s,i,p}\le k)}$$
and then

$$\frac{\max_{1\le p\le Z_i(s),\,1\le i\le d} e^{-\alpha R_{s,i,p}}\widetilde W_{s,i,p}}{\sum_{i=1}^{d}\sum_{p=1}^{Z_i(s)} e^{-\alpha R_{s,i,p}}\widetilde W_{s,i,p}} \le \frac{\max_{p,i}\widetilde W_{s,i,p}}{e^{-\alpha k}\sum_{i=1}^{d}\sum_{p=1}^{Z_i(s)}\widetilde W_{s,i,p}\, I_{(R_{s,i,p}\le k)}} = \frac{e^{\alpha k}\,\frac{1}{|Z(s)|}\max_{p,i}\widetilde W_{s,i,p}}{\sum_{i=1}^{d}\frac{Z_i(s,k)}{|Z(s)|}\cdot\frac{1}{Z_i(s,k)}\sum_{p=1}^{Z_i(s)}\widetilde W_{s,i,p}\, I_{(R_{s,i,p}\le k)}} \tag{5.1}$$

where Zi(s, k) is the number of individuals of type i alive at time s with the residual lifetime less than or equal to k.

We need the following lemmas to prove Theorem 5.1.

Lemma 5.1. Under the hypotheses of Theorem 5.1, as s → ∞,

$$\frac{1}{|Z(s)|}\max_{1\le p\le Z_i(s),\,1\le i\le d}\widetilde W_{s,i,p} \to 0 \quad \text{in probability.}$$
Proof. In a continuous-time multi-type age-dependent Bellman-Harris branching process, $\{\xi_{s,i,p}: p\ge 1\}$ are i.i.d. for any $i = 1, 2, \cdots, d$ and $s\ge 0$. Also, $\{W_{s,i,p,j,q}: p, q\ge 1,\ 1\le i, j\le d\}$ are i.i.d. and independent of $\{\xi_{s,i,p}: p\ge 1,\ 1\le i\le d\}$, so $\{\widetilde W_{s,i,p}: p\ge 1,\ 1\le i\le d\}$ are independent random variables and

$$E\left(\widetilde W_{s,i,p}\right) = E\left(\sum_{j=1}^{d}\sum_{q=1}^{\xi_{s,i,p,j}} W_{s,i,p,j,q}\right) = E\left|\xi_{s,i,p}\right|\cdot E\left(W_{s,i,p,1,1}\right) \in (0,\infty). \tag{5.2}$$

Thus, since $E\widetilde W_{s,i,p} < \infty$, for any $\epsilon > 0$,

$$nP\left(\widetilde W_{s,i,p} > n\epsilon\right) \to 0 \quad \text{as } n\to\infty, \tag{5.3}$$
and then,

$$P\left(\frac{1}{|Z(s)|}\max_{1\le p\le Z_i(s),\,1\le i\le d}\widetilde W_{s,i,p} > \epsilon\right) = E\left(P\left(\max_{1\le p\le Z_i(s),\,1\le i\le d}\widetilde W_{s,i,p} > \epsilon|Z(s)|\,\Big|\, Z(s)\right)\right)$$
$$= E\left(1 - P\left(\widetilde W_{s,i,p} \le \epsilon|Z(s)| \text{ for all } p = 1, 2, \cdots, Z_i(s),\ i = 1, 2, \cdots, d\,\Big|\, Z(s)\right)\right)$$
$$= E\left(1 - \prod_{i=1}^{d}\prod_{p=1}^{Z_i(s)} P\left(\widetilde W_{s,i,p} \le \epsilon|Z(s)|\,\Big|\, Z(s)\right)\right) = E\left(1 - \prod_{i=1}^{d}\left(P\left(\widetilde W_{s,i,1} \le \epsilon|Z(s)|\,\Big|\, Z(s)\right)\right)^{Z_i(s)}\right)$$
$$= E\left(1 - \prod_{i=1}^{d}\left(1 - \frac{Z_i(s)\,P\left(\widetilde W_{s,i,1} > \epsilon|Z(s)|\,\big|\, Z(s)\right)}{Z_i(s)}\right)^{Z_i(s)}\right).$$

Since $Z_i(s)\to\infty$ w.p.1 as $s\to\infty$ for $i = 1, 2, \cdots, d$, by the bounded convergence theorem, as $s\to\infty$,

$$P\left(\frac{1}{|Z(s)|}\max_{1\le p\le Z_i(s),\,1\le i\le d}\widetilde W_{s,i,p} > \epsilon\right) = E\left(P\left(\frac{1}{|Z(s)|}\max_{p,i}\widetilde W_{s,i,p} > \epsilon\,\Big|\, Z(s)\right)\right) \to E\left(1 - \prod_{i=1}^{d} e^{0}\right) = 0.$$
Then, Lemma 5.1 is proved.
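The mechanism behind Lemma 5.1 is simply that the maximum of $n$ i.i.d. variables with a finite mean grows slower than $n$ (for exponential variables it grows like $\log n$). A quick Monte Carlo illustration, with exponential draws as stand-ins for the $\widetilde W_{s,i,p}$:

```python
import random

random.seed(2)

def max_over_n(n):
    """max of n i.i.d. exponential(1) draws, divided by n."""
    return max(random.expovariate(1.0) for _ in range(n)) / n

for n in (100, 1000, 10000):
    print(n, round(max_over_n(n), 4))  # roughly log(n)/n, tending to 0
```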



Lemma 5.2. For any $i = 1, 2, \cdots, d$ and any $k > 0$, let $Z_i(s,k)$ be the number of individuals of type $i$ alive at time $s$ with residual lifetime less than or equal to $k$. Then, under the hypotheses of Theorem 5.1, as $s\to\infty$,
$$\frac{Z_i(s,k)}{Z_i(s)} \to B_i(k) \quad \text{in probability,}$$
where
$$B_i(k) = \frac{\int_{[0,\infty)} e^{-\alpha x}\left(G_i(x+k) - G_i(x)\right)dx}{\int_{[0,\infty)} e^{-\alpha x}\left(1 - G_i(x)\right)dx}.$$

Proof. Recall that Ls,i,p is the total lifetime of the pth individual of type i alive at time s, as,i,p is the corresponding age and Rs,i,p is the corresponding residual lifetime at time s. For any fixed i = 1, 2, ··· , d and any fixed k > 0, consider a function g such that

$$g(a) \equiv P\left(R_{s,i,p}\le k\,\big|\, a_{s,i,p} = a\right) = P\left(L_{s,i,p} - a_{s,i,p}\le k\,\big|\, L_{s,i,p} > a\right) = P\left(a < L_{s,i,p}\le a+k\,\big|\, L_{s,i,p} > a\right) = \frac{G_i(a+k) - G_i(a)}{1 - G_i(a)}.$$

Let $\mathcal{F}_s$ be the $\sigma$-algebra generated by all the history of this branching process up to time $s$. Then, for any $\epsilon > 0$,

$$P\left(\left|\frac{Z_i(s,k)}{Z_i(s)} - B_i(k)\right| > \epsilon\right) = E\left(P\left(\left|\frac{Z_i(s,k)}{Z_i(s)} - B_i(k)\right| > \epsilon\,\Big|\,\mathcal{F}_s, (a_{s,i,1}, \cdots, a_{s,i,Z_i(s)})\right)\right)$$
$$= E\left(P\left(\left|\frac{1}{Z_i(s)}\sum_{p=1}^{Z_i(s)}\left(I_{(R_{s,i,p}\le k)} - g(a_{s,i,p})\right) + \frac{1}{Z_i(s)}\sum_{p=1}^{Z_i(s)} g(a_{s,i,p}) - B_i(k)\right| > \epsilon\,\Big|\,\mathcal{F}_s, (a_{s,i,1}, \cdots, a_{s,i,Z_i(s)})\right)\right)$$
$$\le E\left(P\left(\left|\frac{1}{Z_i(s)}\sum_{p=1}^{Z_i(s)}\left(I_{(R_{s,i,p}\le k)} - g(a_{s,i,p})\right)\right| > \frac{\epsilon}{2}\,\Big|\,\mathcal{F}_s, (a_{s,i,1}, \cdots, a_{s,i,Z_i(s)})\right)\right) + P\left(\left|\frac{1}{Z_i(s)}\sum_{p=1}^{Z_i(s)} g(a_{s,i,p}) - B_i(k)\right| > \frac{\epsilon}{2}\right) \tag{5.4}$$

Note that, for any $p \ge 1$,
$$E\left(I_{(R_{s,i,p}\le k)} - g(a_{s,i,p})\,\Big|\,\mathcal{F}_s, (a_{s,i,1}, \cdots, a_{s,i,Z_i(s)})\right) = 0$$
and

$$\mathrm{Var}\left(I_{(R_{s,i,p}\le k)} - g(a_{s,i,p})\,\Big|\,\mathcal{F}_s, (a_{s,i,1}, \cdots, a_{s,i,Z_i(s)})\right) = g(a_{s,i,p}) - \left(g(a_{s,i,p})\right)^2 \le \frac{1}{4}.$$
Conditioned on $\mathcal{F}_s$ and $(a_{s,i,1}, \cdots, a_{s,i,Z_i(s)})$, the variables $\{I_{(R_{s,i,p}\le k)} - g(a_{s,i,p}): p = 1, 2, \cdots, Z_i(s)\}$ are independent. So, by Chebyshev's inequality, for any $\epsilon > 0$,

$$P\left(\left|\frac{1}{Z_i(s)}\sum_{p=1}^{Z_i(s)}\left(I_{(R_{s,i,p}\le k)} - g(a_{s,i,p})\right)\right| > \frac{\epsilon}{2}\,\Big|\,\mathcal{F}_s, (a_{s,i,1}, \cdots, a_{s,i,Z_i(s)})\right)$$
$$\le \frac{4}{\epsilon^2}\cdot\frac{1}{Z_i(s)^2}\sum_{p=1}^{Z_i(s)}\mathrm{Var}\left(I_{(R_{s,i,p}\le k)} - g(a_{s,i,p})\,\Big|\,\mathcal{F}_s, (a_{s,i,1}, \cdots, a_{s,i,Z_i(s)})\right) \le \frac{4}{\epsilon^2}\cdot\frac{1}{Z_i(s)^2}\cdot\frac{Z_i(s)}{4} = \frac{1}{\epsilon^2 Z_i(s)} \to 0 \quad \text{as } s\to\infty,$$
and hence

$$P\left(\left|\frac{1}{Z_i(s)}\sum_{p=1}^{Z_i(s)}\left(I_{(R_{s,i,p}\le k)} - g(a_{s,i,p})\right)\right| > \frac{\epsilon}{2}\,\Big|\,\mathcal{F}_s, (a_{s,i,1}, \cdots, a_{s,i,Z_i(s)})\right) \to 0 \quad \text{w.p.1}$$
as $s\to\infty$. Then, by the bounded convergence theorem,

$$E\left(P\left(\left|\frac{1}{Z_i(s)}\sum_{p=1}^{Z_i(s)}\left(I_{(R_{s,i,p}\le k)} - g(a_{s,i,p})\right)\right| > \frac{\epsilon}{2}\,\Big|\,\mathcal{F}_s, (a_{s,i,1}, \cdots, a_{s,i,Z_i(s)})\right)\right) \to 0 \quad \text{as } s\to\infty. \tag{5.5}$$
It remains to prove that

$$P\left(\left|\frac{1}{Z_i(s)}\sum_{p=1}^{Z_i(s)} g(a_{s,i,p}) - B_i(k)\right| > \frac{\epsilon}{2}\right) \to 0 \quad \text{as } s\to\infty.$$

Let $A_i(x,s) = \frac{1}{Z_i(s)}\sum_{p=1}^{Z_i(s)} I_{(a_{s,i,p}\le x)}$. Then, by Theorem 1.20, as $s\to\infty$,

$$\sup_x\left|A_i(x,s) - A_i(x)\right| \to 0 \quad \text{w.p.1,}$$
where $A_i$ is as defined in Section 1.5. Since $g(x) = \frac{G_i(x+k) - G_i(x)}{1 - G_i(x)}$ is a bounded and continuous function, by Theorem 1.20 again, we have, as $s\to\infty$,

$$\frac{1}{Z_i(s)}\sum_{p=1}^{Z_i(s)} g(a_{s,i,p}) \equiv \int_{[0,\infty)} g(x)\,dA_i(x,s) \to \int_{[0,\infty)} g(x)\,dA_i(x) \quad \text{w.p.1,}$$
where
$$\int_{[0,\infty)} g(x)\,dA_i(x) = \frac{\int_0^{\infty}\frac{G_i(x+k)-G_i(x)}{1-G_i(x)}\, e^{-\alpha x}\left(1-G_i(x)\right)dx}{\int_0^{\infty} e^{-\alpha x}\left(1-G_i(x)\right)dx} = \frac{\int_0^{\infty} e^{-\alpha x}\left(G_i(x+k)-G_i(x)\right)dx}{\int_0^{\infty} e^{-\alpha x}\left(1-G_i(x)\right)dx} = B_i(k).$$

So, as $s\to\infty$, $\frac{1}{Z_i(s)}\sum_{p=1}^{Z_i(s)} g(a_{s,i,p}) \to B_i(k)$ w.p.1 and hence in probability. Therefore, for any $\epsilon > 0$,
$$P\left(\left|\frac{1}{Z_i(s)}\sum_{p=1}^{Z_i(s)} g(a_{s,i,p}) - B_i(k)\right| > \frac{\epsilon}{2}\right) \to 0 \quad \text{as } s\to\infty. \tag{5.6}$$

From (5.4), (5.5) and (5.6), we have that, for any $\epsilon > 0$,
$$P\left(\left|\frac{Z_i(s,k)}{Z_i(s)} - B_i(k)\right| > \epsilon\right) \to 0 \quad \text{as } s\to\infty,$$
and the proof is complete.
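The limit $B_i(k)$ has a transparent closed form when the lifetime is exponential: $G_i(x+k)-G_i(x) = (1-e^{-\lambda k})(1-G_i(x))$, so the ratio collapses to $1-e^{-\lambda k}$ for every $\alpha$ with $\alpha+\lambda>0$. A numerical check; the exponential lifetime is an illustrative assumption, not from the text:

```python
import math

def B(k, alpha=0.5, lam=1.0, upper=80.0, steps=50000):
    """B_i(k) of Lemma 5.2 by midpoint-rule integration, assuming an
    exponential lifetime G_i(x) = 1 - e^{-lam x} (needs alpha + lam > 0)."""
    G = lambda x: 1.0 - math.exp(-lam * x)
    dx = upper / steps
    num = den = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        w = math.exp(-alpha * x)
        num += w * (G(x + k) - G(x)) * dx
        den += w * (1.0 - G(x)) * dx
    return num / den

# For exponential lifetimes the ratio equals 1 - e^{-lam k}, independent of
# alpha: the stationary chance that the residual lifetime is at most k.
for k in (0.5, 1.0, 2.0):
    print(round(B(k), 4), round(1.0 - math.exp(-k), 4))
```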

Lemma 5.3. For any $i = 1, 2, \cdots, d$, let $\widetilde W_{s,i,p}$ and $Z_i(s,k)$ be the random variables defined in Lemma 5.1 and Lemma 5.2, respectively. Then, under the hypotheses of Theorem 5.1, there exists a $\theta_i > 0$ such that, as $s\to\infty$,
$$P\left(\frac{1}{Z_i(s,k)}\sum_{p=1}^{Z_i(s)}\widetilde W_{s,i,p}\, I_{(R_{s,i,p}\le k)} \ge \theta_i\right) \to 1.$$
Proof. Let $n_{s,i,1} = \min\{1\le j\le Z_i(s): R_{s,i,j}\le k\}$ and $n_{s,i,p} = \min\{n_{s,i,p-1} < j\le Z_i(s): R_{s,i,j}\le k\}$ for $p\ge 2$. Then

\[
\frac{1}{Z_i(s, k)} \sum_{p=1}^{Z_i(s)} \widetilde{W}_{s,i,p}\, I_{(R_{s,i,p}\le k)}
= \frac{1}{Z_i(s, k)} \sum_{p=1}^{Z_i(s,k)} \widetilde{W}_{s,i,n_{s,i,p}}\, I_{(R_{s,i,n_{s,i,p}}\le k)}
= \frac{1}{Z_i(s, k)} \sum_{p=1}^{Z_i(s,k)} \widetilde{W}_{s,i,n_{s,i,p}}. \tag{5.7}
\]
It is known from (5.2) that $E\widetilde{W}_{s,i,1} > 0$ and hence there exists an $\eta > 0$ such that $P\big( \widetilde{W}_{s,i,1} \ge \eta \big) > 0$.

Let Fs be the σ−algebra generated by all the information of this Bellman-Harris branching process up to time s. Then, for any p ≥ 1,

\[
P\big( \widetilde{W}_{s,i,n_{s,i,p}} \ge \eta \big)
= E\Big( P\big( \widetilde{W}_{s,i,n_{s,i,p}} \ge \eta \,\big|\, \mathcal{F}_s \big) \Big)
= E\Big( \sum_{j=1}^{Z_i(s,k)} P\big( \widetilde{W}_{s,i,n_{s,i,p}} \ge \eta,\ n_{s,i,p} = j \,\big|\, \mathcal{F}_s \big) \Big)
= E\Big( \sum_{j=1}^{Z_i(s,k)} P\big( \widetilde{W}_{s,i,j} \ge \eta,\ n_{s,i,p} = j \,\big|\, \mathcal{F}_s \big) \Big)
\]
\[
= E\Big( \sum_{j=1}^{Z_i(s,k)} P\big( \widetilde{W}_{s,i,j} \ge \eta \,\big|\, \mathcal{F}_s \big)\, P\big( n_{s,i,p} = j \,\big|\, \mathcal{F}_s \big) \Big)
= P\big( \widetilde{W}_{s,i,1} \ge \eta \big).
\]

Let
\[
X_{s,i,p} =
\begin{cases}
1, & \text{if } \widetilde{W}_{s,i,n_{s,i,p}} \ge \eta, \\
0, & \text{if } \widetilde{W}_{s,i,n_{s,i,p}} < \eta;
\end{cases}
\]
then

\[
\frac{1}{Z_i(s, k)} \sum_{p=1}^{Z_i(s,k)} \widetilde{W}_{s,i,n_{s,i,p}}
\ge \frac{1}{Z_i(s, k)} \sum_{p=1}^{Z_i(s,k)} \eta\, X_{s,i,p}
= \eta \bigg( \frac{1}{Z_i(s, k)} \sum_{p=1}^{Z_i(s,k)} \Big( X_{s,i,p} - P\big( \widetilde{W}_{s,i,n_{s,i,p}} \ge \eta \big) \Big) \bigg) + \eta \bigg( \frac{1}{Z_i(s, k)} \sum_{p=1}^{Z_i(s,k)} P\big( \widetilde{W}_{s,i,n_{s,i,p}} \ge \eta \big) \bigg)
\]
\[
= \eta \bigg( \frac{1}{Z_i(s, k)} \sum_{p=1}^{Z_i(s,k)} \Big( X_{s,i,p} - P\big( \widetilde{W}_{s,i,n_{s,i,p}} \ge \eta \big) \Big) \bigg) + \eta\, P\big( \widetilde{W}_{s,i,n_{s,i,p}} \ge \eta \big). \tag{5.8}
\]

Conditioned on $\mathcal{F}_s$, the $\big\{ X_{s,i,p} \big\}_{p \ge 1}$ are independent with $E\big( X_{s,i,p} - P\big( \widetilde{W}_{s,i,n_{s,i,p}} \ge \eta \,\big|\, \mathcal{F}_s \big) \,\big|\, \mathcal{F}_s \big) = 0$ and
\[
\mathrm{Var}\big( X_{s,i,p} - P\big( \widetilde{W}_{s,i,n_{s,i,p}} \ge \eta \,\big|\, \mathcal{F}_s \big) \,\big|\, \mathcal{F}_s \big)
= P\big( \widetilde{W}_{s,i,n_{s,i,p}} \ge \eta \,\big|\, \mathcal{F}_s \big) \Big( 1 - P\big( \widetilde{W}_{s,i,n_{s,i,p}} \ge \eta \,\big|\, \mathcal{F}_s \big) \Big)
= P\big( \widetilde{W}_{s,i,1} \ge \eta \,\big|\, \mathcal{F}_s \big) \Big( 1 - P\big( \widetilde{W}_{s,i,1} \ge \eta \,\big|\, \mathcal{F}_s \big) \Big) \le \frac{1}{4};
\]
then, by Chebyshev's inequality and Lemma 5.2, for any $\epsilon > 0$,

\[
P\Big( \Big| \frac{1}{Z_i(s, k)} \sum_{p=1}^{Z_i(s,k)} \Big( X_{s,i,p} - P\big( \widetilde{W}_{s,i,n_{s,i,p}} \ge \eta \big) \Big) \Big| > \epsilon \,\Big|\, \mathcal{F}_s \Big)
\le \frac{1}{\epsilon^2}\, \frac{1}{Z_i(s, k)^2} \sum_{p=1}^{Z_i(s,k)} \mathrm{Var}\Big( X_{s,i,p} - P\big( \widetilde{W}_{s,i,n_{s,i,p}} \ge \eta \big) \,\Big|\, \mathcal{F}_s \Big)
\]
\[
\le \frac{1}{4 \epsilon^2\, Z_i(s, k)}
= \frac{1}{4 \epsilon^2}\, \frac{1}{Z_i(s)}\, \frac{Z_i(s)}{Z_i(s, k)} \to 0 \quad \text{in probability as } s \to \infty.
\]

Therefore, by the bounded convergence theorem,

\[
P\Big( \Big| \frac{1}{Z_i(s, k)} \sum_{p=1}^{Z_i(s,k)} \Big( X_{s,i,p} - P\big( \widetilde{W}_{s,i,n_{s,i,p}} \ge \eta \big) \Big) \Big| > \epsilon \Big)
= E\bigg( P\Big( \Big| \frac{1}{Z_i(s, k)} \sum_{p=1}^{Z_i(s,k)} \Big( X_{s,i,p} - P\big( \widetilde{W}_{s,i,n_{s,i,p}} \ge \eta \big) \Big) \Big| > \epsilon \,\Big|\, \mathcal{F}_s \Big) \bigg) \to 0 \quad \text{as } s \to \infty.
\]

Hence,

\[
P\Big( \frac{1}{Z_i(s, k)} \sum_{p=1}^{Z_i(s,k)} \Big( X_{s,i,p} - P\big( \widetilde{W}_{s,i,n_{s,i,p}} \ge \eta \big) \Big) < -\epsilon \Big) \to 0 \quad \text{as } s \to \infty. \tag{5.9}
\]
Let $\theta_i = \frac{1}{2}\, \eta\, P\big( \widetilde{W}_{s,i,1} \ge \eta \big)$; then $\theta_i > 0$. Also, (5.7), (5.8) and (5.9) together imply that
\[
P\Big( \frac{1}{Z_i(s, k)} \sum_{p=1}^{Z_i(s)} \widetilde{W}_{s,i,p}\, I_{(R_{s,i,p}\le k)} \ge \theta_i \Big)
\]

\[
\ge P\bigg( \eta \Big( \frac{1}{Z_i(s, k)} \sum_{p=1}^{Z_i(s,k)} \Big( X_{s,i,p} - P\big( \widetilde{W}_{s,i,n_{s,i,p}} \ge \eta \big) \Big) \Big) + \eta\, P\big( \widetilde{W}_{s,i,n_{s,i,p}} \ge \eta \big) \ge \frac{1}{2}\, \eta\, P\big( \widetilde{W}_{s,i,1} \ge \eta \big) \bigg)
\]
\[
= P\bigg( \frac{1}{Z_i(s, k)} \sum_{p=1}^{Z_i(s,k)} \Big( X_{s,i,p} - P\big( \widetilde{W}_{s,i,n_{s,i,p}} \ge \eta \big) \Big) \ge -\frac{1}{2}\, P\big( \widetilde{W}_{s,i,1} \ge \eta \big) \bigg)
\]
\[
= 1 - P\bigg( \frac{1}{Z_i(s, k)} \sum_{p=1}^{Z_i(s,k)} \Big( X_{s,i,p} - P\big( \widetilde{W}_{s,i,n_{s,i,p}} \ge \eta \big) \Big) < -\frac{1}{2}\, P\big( \widetilde{W}_{s,i,1} \ge \eta \big) \bigg) \to 1 \quad \text{as } s \to \infty.
\]
So, we have that
\[
P\Big( \frac{1}{Z_i(s, k)} \sum_{p=1}^{Z_i(s)} \widetilde{W}_{s,i,p}\, I_{(R_{s,i,p}\le k)} \ge \theta_i \Big) \to 1 \quad \text{as } s \to \infty
\]
and hence Lemma 5.3 is proved. $\square$



Now, we can continue to prove Theorem 5.1.

From Lemma 5.2, we know that, as s → ∞,

\[
\frac{Z_i(s, k)}{Z_i(s)} \to B_i(k) \quad \text{in probability}
\]
for any $i = 1, 2, \cdots, d$ and, from Theorem 1.17, we have that
\[
\frac{Z_i(s)}{|Z(s)|} \to \frac{v_i}{\mathbf{1} \cdot v} \quad \text{w.p.1 as } s \to \infty.
\]
Hence,
\[
P\Big( \frac{Z_i(s, k)}{|Z(s)|} < \frac{1}{2}\, B_i(k)\, \frac{v_i}{\mathbf{1} \cdot v} \Big) \to 0 \quad \text{as } s \to \infty.
\]

Also, from Lemma 5.3, we have that, for some θi > 0, as s → ∞,

\[
P\Big( \frac{1}{Z_i(s, k)} \sum_{p=1}^{Z_i(s)} \widetilde{W}_{s,i,p}\, I_{(R_{s,i,p}\le k)} \ge \theta_i \Big) \to 1.
\]

So, for any δ > 0 and any i = 1, 2, ··· , d, there exists an M > 0 such that for every s > M,

\[
P\Big( \frac{Z_i(s, k)}{|Z(s)|} < \frac{1}{2}\, B_i(k)\, \frac{v_i}{\mathbf{1} \cdot v} \Big) < \frac{\delta}{2d}
\quad \text{and} \quad
P\Big( \frac{1}{Z_i(s, k)} \sum_{p=1}^{Z_i(s)} \widetilde{W}_{s,i,p}\, I_{(R_{s,i,p}\le k)} < \theta_i \Big) < \frac{\delta}{2d}.
\]
Let

\[
A = \Big\{ \frac{Z_i(s, k)}{|Z(s)|} \ge \frac{1}{2}\, B_i(k)\, \frac{v_i}{\mathbf{1} \cdot v} \ \text{for all } i = 1, 2, \cdots, d \Big\}
\quad \text{and} \quad
B = \Big\{ \frac{1}{Z_i(s, k)} \sum_{p=1}^{Z_i(s)} \widetilde{W}_{s,i,p}\, I_{(R_{s,i,p}\le k)} \ge \theta_i \ \text{for all } i = 1, 2, \cdots, d \Big\}.
\]
Then, for any $\epsilon > 0$,
\[
P\Bigg( \frac{e^{\alpha k}\, \frac{1}{|Z(s)|} \max\limits_{\substack{1\le i\le d \\ 1\le p\le Z_i(s)}} \widetilde{W}_{s,i,p}}{\sum\limits_{i=1}^d \frac{Z_i(s,k)}{|Z(s)|}\, \frac{1}{Z_i(s,k)} \sum\limits_{p=1}^{Z_i(s)} \widetilde{W}_{s,i,p}\, I_{(R_{s,i,p}\le k)}} > \epsilon \Bigg)
= P\Bigg( \frac{1}{|Z(s)|} \max_{\substack{1\le i\le d \\ 1\le p\le Z_i(s)}} \widetilde{W}_{s,i,p} > \epsilon\, e^{-\alpha k} \sum_{i=1}^d \frac{Z_i(s,k)}{|Z(s)|}\, \frac{1}{Z_i(s,k)} \sum_{p=1}^{Z_i(s)} \widetilde{W}_{s,i,p}\, I_{(R_{s,i,p}\le k)} \Bigg)
\]
\[
\le P\Bigg( \frac{1}{|Z(s)|} \max_{\substack{1\le i\le d \\ 1\le p\le Z_i(s)}} \widetilde{W}_{s,i,p} > \epsilon\, e^{-\alpha k} \sum_{i=1}^d \frac{1}{2}\, B_i(k)\, \frac{v_i}{\mathbf{1}\cdot v}\, \theta_i\ ;\ A \cap B \Bigg) + P(A^C) + P(B^C)
\le P\Bigg( \frac{1}{|Z(s)|} \max_{\substack{1\le i\le d \\ 1\le p\le Z_i(s)}} \widetilde{W}_{s,i,p} > \frac{\epsilon}{2}\, e^{-\alpha k} \sum_{i=1}^d \theta_i\, B_i(k)\, \frac{v_i}{\mathbf{1}\cdot v} \Bigg) + \delta
\]
for every $s > M$. Thus, for any $\delta > 0$, by Lemma 5.1,
\[
\limsup_{s\to\infty} P\Bigg( \frac{e^{\alpha k}\, \frac{1}{|Z(s)|} \max\limits_{\substack{1\le i\le d \\ 1\le p\le Z_i(s)}} \widetilde{W}_{s,i,p}}{\sum\limits_{i=1}^d \frac{Z_i(s,k)}{|Z(s)|}\, \frac{1}{Z_i(s,k)} \sum\limits_{p=1}^{Z_i(s)} \widetilde{W}_{s,i,p}\, I_{(R_{s,i,p}\le k)}} > \epsilon \Bigg)
\le \limsup_{s\to\infty} P\Bigg( \frac{1}{|Z(s)|} \max_{\substack{1\le i\le d \\ 1\le p\le Z_i(s)}} \widetilde{W}_{s,i,p} > \frac{\epsilon}{2} \sum_{i=1}^d \theta_i\, e^{-\alpha k} B_i(k)\, \frac{v_i}{\mathbf{1}\cdot v} \Bigg) + \delta = \delta,
\]
i.e.,
\[
\lim_{s\to\infty} P\Bigg( \frac{e^{\alpha k}\, \frac{1}{|Z(s)|} \max\limits_{\substack{1\le i\le d \\ 1\le p\le Z_i(s)}} \widetilde{W}_{s,i,p}}{\sum\limits_{i=1}^d \frac{Z_i(s,k)}{|Z(s)|}\, \frac{1}{Z_i(s,k)} \sum\limits_{p=1}^{Z_i(s)} \widetilde{W}_{s,i,p}\, I_{(R_{s,i,p}\le k)}} > \epsilon \Bigg) = 0 \quad \text{for any } \epsilon > 0,
\]
and hence, from (5.1), we have that

\[
\lim_{s\to\infty} P\Bigg( \frac{\max\limits_{\substack{1\le i\le d \\ 1\le p\le Z_i(s)}} e^{-\alpha R_{s,i,p}}\, \widetilde{W}_{s,i,p}}{\sum\limits_{i=1}^d \sum\limits_{p=1}^{Z_i(s)} e^{-\alpha R_{s,i,p}}\, \widetilde{W}_{s,i,p}} > \epsilon \Bigg) = 0 \quad \text{for any } \epsilon > 0,
\]
i.e., as $s \to \infty$,
\[
\frac{\max\limits_{\substack{1\le i\le d \\ 1\le p\le Z_i(s)}} e^{-\alpha R_{s,i,p}}\, \widetilde{W}_{s,i,p}}{\sum\limits_{i=1}^d \sum\limits_{p=1}^{Z_i(s)} e^{-\alpha R_{s,i,p}}\, \widetilde{W}_{s,i,p}} \to 0 \quad \text{in probability}.
\]

By the bounded convergence theorem,

\[
E\Bigg( \frac{\sum_{i=1}^d \sum_{p=1}^{Z_i(s)} \big( e^{-\alpha R_{s,i,p}}\, \widetilde{W}_{s,i,p} \big)^k}{\Big( \sum_{i=1}^d \sum_{p=1}^{Z_i(s)} e^{-\alpha R_{s,i,p}}\, \widetilde{W}_{s,i,p} \Big)^k} \Bigg) \to 0 \quad \text{as } s \to \infty,
\]
and thus $H_k$ is a proper probability distribution on $\mathbb{R}_+ \equiv \{ x \in \mathbb{R} : x \ge 0 \}$. So, there exists a random variable $\widetilde{D}_k$ on $\mathbb{R}_+$ such that $D_k(t) \xrightarrow{d} \widetilde{D}_k$ as $t \to \infty$ and
\[
P\big( \widetilde{D}_k \le s \big) = 1 - E\Bigg( \frac{\sum_{i=1}^d \sum_{p=1}^{Z_i(s)} \big( e^{-\alpha R_{s,i,p}}\, \widetilde{W}_{s,i,p} \big)^k}{\Big( \sum_{i=1}^d \sum_{p=1}^{Z_i(s)} e^{-\alpha R_{s,i,p}}\, \widetilde{W}_{s,i,p} \Big)^k} \Bigg) \equiv H_k(s)
\]
for any $s \ge 0$. The proof of Theorem 5.1 is complete. $\square$

5.2.3 The proof of Theorem 5.2

Let $\{Y_n\}_{n\ge 0}$ be the embedded generation process of the continuous-time multi-type Bellman-Harris process $\{Z(t) : t \ge 0\}$. Let $\{Z_{r,i,p}(t) : t > 0\}$ be the continuous-time multi-type age-dependent Bellman-Harris branching process initiated by the $p$th individual of type $i$ in the $r$th generation when it is of age 0.

Let $L_{r,i,p,q}$ be the lifetime of the $q$th-generation ancestor of the $p$th individual of type $i$ in the $r$th generation; then $\big\{ L_{r,i,p,q} : r \ge 0,\ i = 1, 2, \cdots, d,\ p \ge 1,\ q = 0, 1, \cdots, r-1 \big\}$ are i.i.d. copies with the lifetime distribution $G$. Let $S_{r,i,p} = \sum_{q=0}^{r-1} L_{r,i,p,q}$; then $S_{r,i,p}$ is the birth time of the $p$th individual of type $i$ in the $r$th generation.

(a) For almost all trees T and any r = 0, 1, 2, ··· ,

\[
P\big( X_k(t) \ge r \,\big|\, \mathcal{T} \big)
= \frac{\sum\limits_{i=1}^d \sum\limits_{p=1}^{Y_{r,i}} \binom{Z_{r,i,p}(t-S_{r,i,p})}{k}}{\binom{|Z(t)|}{k}}
= \frac{\sum\limits_{i=1}^d \sum\limits_{p=1}^{Y_{r,i}} Z_{r,i,p}(t-S_{r,i,p}) \big( Z_{r,i,p}(t-S_{r,i,p}) - 1 \big) \cdots \big( Z_{r,i,p}(t-S_{r,i,p}) - k + 1 \big)}{|Z(t)| \big( |Z(t)| - 1 \big) \cdots \big( |Z(t)| - k + 1 \big)} \tag{5.10}
\]
\[
= \sum_{i=1}^d \sum_{p=1}^{Y_{r,i}} \frac{e^{-\alpha(t-S_{r,i,p})} Z_{r,i,p}(t-S_{r,i,p}) \cdot e^{-\alpha(t-S_{r,i,p})} \big( Z_{r,i,p}(t-S_{r,i,p}) - 1 \big) \cdots e^{-\alpha(t-S_{r,i,p})} \big( Z_{r,i,p}(t-S_{r,i,p}) - k + 1 \big) \cdot e^{-k\alpha S_{r,i,p}}}{e^{-\alpha t}|Z(t)| \cdot e^{-\alpha t}\big( |Z(t)| - 1 \big) \cdots e^{-\alpha t}\big( |Z(t)| - k + 1 \big)}, \tag{5.11}
\]

where α is the Malthusian parameter for the offspring mean m and the lifetime distribution G.

It is known from Theorem 1.17 that if $|Z_0| = 1$, $P(Z_1 = 0 \mid Z_0 = e_i) = 0$ for any $i = 1, 2, \cdots, d$ and $E\big( \|Z_1\| \log \|Z_1\| \,\big|\, Z_0 = e_i \big) < \infty$ for all $1 \le i \le d$, then

e−αtZ(t) → vW w.p.1 as t → ∞

where W is a random variable such that P(W > 0) = 1. So, as t → ∞,

\[
P\big( X_k(t) \ge r \,\big|\, \mathcal{T} \big) \to \frac{\sum_{i=1}^d \sum_{p=1}^{Y_{r,i}} \big( e^{-\alpha S_{r,i,p}}\, W_{r,i,p} \big)^k}{W^k} \equiv 1 - \phi_k(r, \mathcal{T})
\]
as $t \to \infty$, where the $\big\{ W_{r,i,p} \big\}$ are i.i.d. copies of $W$.

(b) Since $P(X_k(t) \ge r) = E\big( P( X_k(t) \ge r \mid \mathcal{T} ) \big)$, by the bounded convergence theorem,
\[
P(X_k(t) \ge r) \to E\Bigg( \frac{\sum_{i=1}^d \sum_{p=1}^{Y_{r,i}} \big( e^{-\alpha S_{r,i,p}}\, W_{r,i,p} \big)^k}{W^k} \Bigg) \equiv 1 - \phi_k(r) \quad \text{as } t \to \infty
\]

for r = 1, 2, ··· .

To finish the proof, we need to show that $\phi_k$ is a proper probability distribution, i.e., $\phi_k(r) \to 1$ as $r \to \infty$, and it is sufficient to prove that
\[
\sum_{i=1}^d \sum_{p=1}^{Y_{r,i}} \big( e^{-\alpha S_{r,i,p}}\, W_{r,i,p} \big)^k \to 0 \quad \text{in probability as } r \to \infty.
\]

We follow lines similar to those of the proof for the single-type Bellman-Harris process.

First, we have that

\[
\sum_{i=1}^d \sum_{p=1}^{Y_{r,i}} \big( e^{-\alpha S_{r,i,p}}\, W_{r,i,p} \big)^k
\le \Big( \max_{\substack{1\le i\le d \\ 1\le p\le Y_{r,i}}} e^{-\alpha S_{r,i,p}}\, W_{r,i,p} \Big)^{k-1} \sum_{i=1}^d \sum_{p=1}^{Y_{r,i}} e^{-\alpha S_{r,i,p}}\, W_{r,i,p}. \tag{5.12}
\]
Also,
\[
E\Big( \sum_{i=1}^d \sum_{p=1}^{Y_{r,i}} e^{-\alpha S_{r,i,p}}\, W_{r,i,p} \Big)
= E\bigg( E\Big( \sum_{i=1}^d \sum_{p=1}^{Y_{r,i}} e^{-\alpha S_{r,i,p}}\, W_{r,i,p} \,\Big|\, L_{r,i,p,q},\ 0 \le q \le r-1,\ 1 \le p \le Y_{r,i},\ 1 \le i \le d,\ Y_0, Y_1, \cdots, Y_r \Big) \bigg)
\]
\[
= E\bigg( \sum_{i=1}^d \sum_{p=1}^{Y_{r,i}} e^{-\alpha S_{r,i,p}}\, E\Big( W_{r,i,p} \,\Big|\, L_{r,i,p,q},\ 0 \le q \le r-1,\ 1 \le p \le Y_{r,i},\ 1 \le i \le d,\ Y_0, Y_1, \cdots, Y_r \Big) \bigg).
\]
Note that $\big\{ W_{r,i,p} \big\}_{p\ge 1,\, 1\le i\le d}$ are i.i.d. copies of $W$ and are independent of $\big\{ L_{r,i,p,q},\ 0 \le q \le r-1,\ 1 \le p \le Y_{r,i},\ 1 \le i \le d,\ Y_0, Y_1, \cdots, Y_r \big\}$, so
\[
E\Big( \sum_{i=1}^d \sum_{p=1}^{Y_{r,i}} e^{-\alpha S_{r,i,p}}\, W_{r,i,p} \Big)
= EW \cdot E\Big( \sum_{i=1}^d \sum_{p=1}^{Y_{r,i}} e^{-\alpha S_{r,i,p}} \Big)
= EW \cdot E\bigg( E\Big( \sum_{i=1}^d \sum_{p=1}^{Y_{r,i}} e^{-\alpha S_{r,i,p}} \,\Big|\, Y_r \Big) \bigg)
\]
\[
= EW \cdot E\Big( |Y_r|\, E\big( e^{-\alpha S_{r,1,1}} \,\big|\, Y_r \big) \Big)
= EW \cdot E|Y_r| \cdot E\big( e^{-\alpha S_{r,1,1}} \big)
= EW \cdot E|Y_r| \cdot \big( E e^{-\alpha L} \big)^r,
\]
where the $\big\{ S_{r,i,p} \big\}_{p\ge 1,\, 1\le i\le d}$ are identically distributed, $S_{r,i,p} \equiv \sum_{q=0}^{r-1} L_{r,i,p,q}$, and $\big\{ L_{r,i,p,q} : 0 \le q \le r-1 \big\}$ are i.i.d. copies of the lifetime random variable $L$ for each $p \ge 1$ and $1 \le i \le d$.

 Since E kZ1k log kZ1k Z0 = ei < ∞, it is known that 0 < EW < ∞. Then,

\[
\lim_{r\to\infty} E\Big( \sum_{i=1}^d \sum_{p=1}^{Y_{r,i}} e^{-\alpha S_{r,i,p}}\, W_{r,i,p} \Big)
= \lim_{r\to\infty} EW \cdot \rho^{-r} E|Y_r| \cdot \big( \rho\, \varphi_L(\alpha) \big)^r \tag{5.14}
\]
\[
= EW \cdot \lim_{r\to\infty} \rho^{-r} E|Y_r| \tag{5.15}
\]
\[
= c\, EW \tag{5.16}
\]
for some $0 < c < \infty$, where $\varphi_L(\alpha) \equiv \int_0^\infty e^{-\alpha u}\, dG(u)$, and $\rho\, \varphi_L(\alpha) = 1$ since $\alpha$ is the Malthusian parameter for $\rho$ and $G$.

For any $\eta > 0$, by Markov's inequality,
\[
\limsup_{r\to\infty} P\Big( \sum_{i=1}^d \sum_{p=1}^{Y_{r,i}} e^{-\alpha S_{r,i,p}}\, W_{r,i,p} > \eta \Big)
\le \lim_{r\to\infty} \frac{1}{\eta}\, E\Big( \sum_{i=1}^d \sum_{p=1}^{Y_{r,i}} e^{-\alpha S_{r,i,p}}\, W_{r,i,p} \Big)
= \frac{c\, EW}{\eta} < \infty.
\]

For any $\epsilon > 0$,
\[
\limsup_{r\to\infty} P\bigg( \Big( \max_{\substack{1\le i\le d \\ 1\le p\le Y_{r,i}}} e^{-\alpha S_{r,i,p}}\, W_{r,i,p} \Big)^{k-1} \sum_{i=1}^d \sum_{p=1}^{Y_{r,i}} e^{-\alpha S_{r,i,p}}\, W_{r,i,p} > \epsilon \bigg)
\]
\[
\le \limsup_{r\to\infty} P\Big( \sum_{i=1}^d \sum_{p=1}^{Y_{r,i}} e^{-\alpha S_{r,i,p}}\, W_{r,i,p} > \eta \Big)
+ \limsup_{r\to\infty} P\bigg( \Big( \max_{\substack{1\le i\le d \\ 1\le p\le Y_{r,i}}} e^{-\alpha S_{r,i,p}}\, W_{r,i,p} \Big)^{k-1} \sum_{i=1}^d \sum_{p=1}^{Y_{r,i}} e^{-\alpha S_{r,i,p}}\, W_{r,i,p} > \epsilon,\ \sum_{i=1}^d \sum_{p=1}^{Y_{r,i}} e^{-\alpha S_{r,i,p}}\, W_{r,i,p} \le \eta \bigg)
\]
\[
\le \frac{c\, EW}{\eta}
+ \limsup_{r\to\infty} P\bigg( \max_{\substack{1\le i\le d \\ 1\le p\le Y_{r,i}}} e^{-\alpha S_{r,i,p}}\, W_{r,i,p} > \Big( \frac{\epsilon}{\eta} \Big)^{1/(k-1)} \bigg). \tag{5.17}
\]

So, to prove that

\[
\sum_{i=1}^d \sum_{p=1}^{Y_{r,i}} \big( e^{-\alpha S_{r,i,p}}\, W_{r,i,p} \big)^k \to 0 \quad \text{in probability as } r \to \infty,
\]
it suffices, from (5.12) and (5.17), to prove that
\[
\max_{\substack{1\le i\le d \\ 1\le p\le Y_{r,i}}} e^{-\alpha S_{r,i,p}}\, W_{r,i,p} \to 0 \quad \text{in probability as } r \to \infty.
\]

Let $\mathcal{F}_r$ be the $\sigma$-algebra generated by all the information up to the $r$th generation in the embedded tree. Then, for any $\epsilon > 0$,

\[
P\Big( \max_{\substack{1\le i\le d \\ 1\le p\le Y_{r,i}}} e^{-\alpha S_{r,i,p}}\, W_{r,i,p} > \epsilon \,\Big|\, \mathcal{F}_r \Big)
= P\Big( \exists\, i \in \{1, \cdots, d\},\ \exists\, p \in \{1, \cdots, Y_{r,i}\} \text{ such that } e^{-\alpha S_{r,i,p}}\, W_{r,i,p} > \epsilon \,\Big|\, \mathcal{F}_r \Big)
\]
\[
\le \sum_{i=1}^d \sum_{p=1}^{Y_{r,i}} P\big( e^{-\alpha S_{r,i,p}}\, W_{r,i,p} > \epsilon \,\big|\, \mathcal{F}_r \big)
= \sum_{i=1}^d \sum_{p=1}^{Y_{r,i}} P\big( W_{r,i,p} > \epsilon\, e^{\alpha S_{r,i,p}} \,\big|\, \mathcal{F}_r \big).
\]

Let $\eta(y) = \sup_{x \ge y} x\, P(W > x)$. Since $EW < \infty$, $x\, P(W > x) \to 0$ as $x \to \infty$. So, for any $l > 0$, there exists $a > 0$ such that $y\, P(W > y) < \frac{l}{c\, EW}$ for all $y \ge a$, and hence $\eta(a) \le \frac{l}{c\, EW}$.

Let $n > \frac{1}{\alpha} \ln \frac{a}{\epsilon}$, so that $\epsilon\, e^{\alpha n} > a$. Hence,
\[
P\Big( \max_{\substack{1\le i\le d \\ 1\le p\le Y_{r,i}}} e^{-\alpha S_{r,i,p}}\, W_{r,i,p} > \epsilon \Big)
\le P\Big( \min_{\substack{1\le i\le d \\ 1\le p\le Y_{r,i}}} S_{r,i,p} \le n \Big)
+ P\Big( \max_{\substack{1\le i\le d \\ 1\le p\le Y_{r,i}}} e^{-\alpha S_{r,i,p}}\, W_{r,i,p} > \epsilon,\ \min_{\substack{1\le i\le d \\ 1\le p\le Y_{r,i}}} S_{r,i,p} > n \Big),
\]
and on the event $\big\{ \min S_{r,i,p} > n \big\}$ each $\epsilon\, e^{\alpha S_{r,i,p}} > a$, so
\[
P\Big( \max_{\substack{1\le i\le d \\ 1\le p\le Y_{r,i}}} e^{-\alpha S_{r,i,p}}\, W_{r,i,p} > \epsilon,\ \min S_{r,i,p} > n \,\Big|\, \mathcal{F}_r \Big)
\le \sum_{i=1}^d \sum_{p=1}^{Y_{r,i}} P\big( W_{r,i,p} > \epsilon\, e^{\alpha S_{r,i,p}} \,\big|\, \mathcal{F}_r \big)
\le \frac{\eta(a)}{\epsilon} \sum_{i=1}^d \sum_{p=1}^{Y_{r,i}} e^{-\alpha S_{r,i,p}}.
\]
Taking expectations and using $E\big( \sum_{i=1}^d \sum_{p=1}^{Y_{r,i}} e^{-\alpha S_{r,i,p}} \big) = \rho^{-r} E|Y_r| \to c$, we get
\[
\limsup_{r\to\infty} P\Big( \max_{\substack{1\le i\le d \\ 1\le p\le Y_{r,i}}} e^{-\alpha S_{r,i,p}}\, W_{r,i,p} > \epsilon,\ \min S_{r,i,p} > n \Big) \le \frac{\eta(a)}{\epsilon}\, c \le \frac{l}{\epsilon\, EW}. \tag{5.18}
\]

Moreover,

\[
P\Big( \min_{\substack{1\le i\le d \\ 1\le p\le Y_{r,i}}} S_{r,i,p} \le n \Big)
= \sum_x P\Big( \min_{\substack{1\le i\le d \\ 1\le p\le Y_{r,i}}} S_{r,i,p} \le n \,\Big|\, Y_r = x \Big)\, P\big( Y_r = x \big)
\le E|Y_r|\, P\big( S_{r,1,1} \le n \big)
= E|Y_r|\, P\big( e^{-\theta S_{r,1,1}} \ge e^{-\theta n} \big),
\]
where $\theta > \alpha$ is chosen such that $\rho\, \varphi_L(\theta) < 1$. Then, by Markov's inequality,
\[
P\Big( \min_{\substack{1\le i\le d \\ 1\le p\le Y_{r,i}}} S_{r,i,p} \le n \Big)
\le \frac{E\big( e^{-\theta S_{r,1,1}} \big)}{e^{-\theta n}}\, E|Y_r|
= e^{\theta n}\, \rho^{-r} E|Y_r|\, \big( \rho\, E e^{-\theta L} \big)^r \to 0 \quad \text{as } r \to \infty, \tag{5.19}
\]
since $\rho\, E e^{-\theta L} = \rho\, \varphi_L(\theta) < 1$ and $\rho^{-r} E|Y_r| \to c < \infty$.

Now, (5.18) and (5.19) together imply that

\[
\limsup_{r\to\infty} P\Big( \max_{\substack{1\le i\le d \\ 1\le p\le Y_{r,i}}} e^{-\alpha S_{r,i,p}}\, W_{r,i,p} > \epsilon \Big) \le \frac{l}{\epsilon\, EW}
\]
for any $l > 0$. Hence, for any $\epsilon > 0$,
\[
P\Big( \max_{\substack{1\le i\le d \\ 1\le p\le Y_{r,i}}} e^{-\alpha S_{r,i,p}}\, W_{r,i,p} > \epsilon \Big) \to 0 \quad \text{as } r \to \infty,
\]

and so $\phi_k$ is a proper probability distribution; hence there exists a random variable $\widetilde{X}_k$ on $\{0, 1, 2, \cdots\}$ such that $X_k(t) \xrightarrow{d} \widetilde{X}_k$ as $t \to \infty$ and

\[
P\big( \widetilde{X}_k < r \big) = 1 - E\Bigg( \frac{\sum_{i=1}^d \sum_{p=1}^{Y_{r,i}} \big( e^{-\alpha S_{r,i,p}}\, W_{r,i,p} \big)^k}{W^k} \Bigg) \equiv \phi_k(r)
\]
for any $r = 0, 1, 2, \cdots$.

The proof of Theorem 5.2 is complete.

5.2.4 The proof of Theorem 5.3

Let Yn = (Yn,1, Yn,2, ··· , Yn,d) be the embedded Galton-Watson branching process.

Let Yn,i(t) be the number of individuals of type i in the nth generation alive at time t. Then

\[
Z_i(t) = \sum_{n=0}^{\infty} Y_{n,i}(t).
\]

Let ξn,i,p = (ξn,i,p,1, ξn,i,p,2, ··· , ξn,i,p,d) be the offspring vector of the pth individual of type i in the nth generation.

Let $\xi_{n,i,p}(t)$ be the vector of living offspring of the $p$th individual of type $i$ in the $n$th generation.

Let Zn,i,p, j,q(t) be the continuous-time multi-type Bellman-Harris branching process initiated by the  qth child of type j of the pth individual of type i in the nth generation. Then Zn,i,p, j,q(t): t ≥ 0 is distributed as {Z(t)|Z(0) = e j : t ≥ 0}.

Let D2(t) be the death time of the last common ancestor of these two randomly chosen individuals alive at time t. By Theorem 5.1, we have that

d D2(t) −−−→ D˜ 2 as t → ∞.

Let X2(t) be the generation number of this last common ancestor. Recall that every individual has the same lifetime distribution no matter what type it is of according to the assumption. So,

d X2(t) −−−→ X˜2 as t → ∞.

Let Ai(t) be the type of the ancestor in the next generation after the last common ancestor of the ith chosen individual, i = 1, 2. Then, for almost all trees T ,

\[
P\big( X_2(t) = r,\ \eta(t) = j,\ \zeta_1(t) = \zeta_2(t) = i_1,\ A_1(t) = A_2(t) \,\big|\, \mathcal{T} \big)
= \frac{\sum\limits_{p=1}^{Y_{r,j}-Y_{r,j}(t)} \sum\limits_{i=1}^d \sum\limits_{m\ne n}^{\xi_{r,j,p,i}} Z_{r,j,p,i,m,i_1}(t - D_2(t)) \cdot Z_{r,j,p,i,n,i_1}(t - D_2(t))}{|Z(t)| \cdot \big( |Z(t)| - 1 \big)}
\to \frac{\sum\limits_{p=1}^{Y_{r,j}} \sum\limits_{i=1}^d \sum\limits_{m\ne n}^{\xi_{r,j,p,i}} v_{i_1} W_{r,p,m}\, v_{i_1} W_{r,p,n}}{W^2} \quad \text{as } t \to \infty.
\]

So,

\[
P\big( X_2(t) = r,\ \eta(t) = j,\ \zeta_1(t) = \zeta_2(t) = i_1,\ A_1(t) = A_2(t) \big)
= E\Big( P\big( X_2(t) = r,\ \eta(t) = j,\ \zeta_1(t) = \zeta_2(t) = i_1,\ A_1(t) = A_2(t) \,\big|\, \mathcal{T} \big) \Big)
\]
\[
= E\Bigg( \frac{\sum_{p=1}^{Y_{r,j}-Y_{r,j}(t)} \sum_{i=1}^d \sum_{m\ne n}^{\xi_{r,j,p,i}} Z_{r,j,p,i,m,i_1}(t - D_2(t)) \cdot Z_{r,j,p,i,n,i_1}(t - D_2(t))}{|Z(t)| \cdot \big( |Z(t)| - 1 \big)} \Bigg)
\]
\[
= E\Bigg( \frac{\sum_{p=1}^{Y_{r,j}-Y_{r,j}(t)} \sum_{i=1}^d \sum_{m\ne n}^{\xi_{r,j,p,i}} e^{-\alpha(t-D_2(t))} Z_{r,j,p,i,m,i_1}(t - D_2(t)) \cdot e^{-\alpha(t-D_2(t))} Z_{r,j,p,i,n,i_1}(t - D_2(t))\, e^{-2\alpha D_2(t)}}{e^{-\alpha t}|Z(t)| \cdot e^{-\alpha t}\big( |Z(t)| - 1 \big)} \Bigg)
\]
\[
= E\Bigg( e^{-2\alpha D_2(t)}\, E\Bigg( \frac{\sum_{p=1}^{Y_{r,j}-Y_{r,j}(t)} \sum_{i=1}^d \sum_{m\ne n}^{\xi_{r,j,p,i}} e^{-\alpha(t-D_2(t))} Z_{r,j,p,i,m,i_1}(t - D_2(t)) \cdot e^{-\alpha(t-D_2(t))} Z_{r,j,p,i,n,i_1}(t - D_2(t))}{e^{-\alpha t}|Z(t)| \cdot e^{-\alpha t}\big( |Z(t)| - 1 \big)} \,\Bigg|\, D_2(t) \Bigg) \Bigg).
\]

Let $h(D_2(t)) = e^{-2\alpha D_2(t)}$; note that $0 \le h(D_2(t)) \le 1$. Let
\[
g(t, D_2(t)) = E\Bigg( \frac{\sum_{p=1}^{Y_{r,j}-Y_{r,j}(t)} \sum_{i=1}^d \sum_{m\ne n}^{\xi_{r,j,p,i}} e^{-\alpha(t-D_2(t))} Z_{r,j,p,i,m,i_1}(t - D_2(t)) \cdot e^{-\alpha(t-D_2(t))} Z_{r,j,p,i,n,i_1}(t - D_2(t))}{e^{-\alpha t}|Z(t)| \cdot e^{-\alpha t}\big( |Z(t)| - 1 \big)} \,\Bigg|\, D_2(t) \Bigg)
\]
and
\[
g = E\Bigg( \frac{\sum_{p=1}^{Y_{r,j}} \sum_{i=1}^d \sum_{m\ne n}^{\xi_{r,j,p,i}} v_{i_1} W_{r,p,m}\, v_{i_1} W_{r,p,n}}{W^2} \Bigg);
\]
note that $g$ is a constant. Then, since $D_2(t) \xrightarrow{d} \widetilde{D}_2$ as $t \to \infty$, by the bounded convergence theorem, we have that

g(t, D2(t)) → g w.p.1 as t → ∞ and, also, h is a bounded continuous function, hence

Eh(D2(t)) → Eh(D˜ 2) as t → ∞.

Therefore,

\[
\big| E\big( h(D_2(t))\, g(t, D_2(t)) \big) - g\, E h(\widetilde{D}_2) \big|
= \big| E\big( h(D_2(t)) \big( g(t, D_2(t)) - g \big) \big) + g \big( E h(D_2(t)) - E h(\widetilde{D}_2) \big) \big|
\]
\[
\le E\big( h(D_2(t))\, \big| g(t, D_2(t)) - g \big| \big) + |g|\, \big| E h(D_2(t)) - E h(\widetilde{D}_2) \big|
\le E\big| g(t, D_2(t)) - g \big| + |g|\, \big| E h(D_2(t)) - E h(\widetilde{D}_2) \big|
\to 0 \quad \text{as } t \to \infty,
\]
by the bounded convergence theorem.

That is,

\[
P\big( X_2(t) = r,\ \eta(t) = j,\ \zeta_1(t) = \zeta_2(t) = i_1,\ A_1(t) = A_2(t) \big)
\to E\big( e^{-2\alpha \widetilde{D}_2} \big)\, E\Bigg( \frac{\sum_{p=1}^{Y_{r,j}} \sum_{i=1}^d \sum_{m\ne n}^{\xi_{r,j,p,i}} v_{i_1} W_{r,p,m}\, v_{i_1} W_{r,p,n}}{W^2} \Bigg) \quad \text{as } t \to \infty.
\]

Similarly, as t → ∞,

\[
P\big( X_2(t) = r,\ \eta(t) = j,\ \zeta_1(t) = \zeta_2(t) = i_1,\ A_1(t) \ne A_2(t) \big)
\to E\big( e^{-2\alpha \widetilde{D}_2} \big)\, E\Bigg( \frac{\sum_{p=1}^{Y_{r,j}} \sum_{k\ne l}^{d} \sum_{m=1}^{\xi_{r,j,p,k}} \sum_{n=1}^{\xi_{r,j,p,l}} v_{i_1} W_{r,p,m}\, v_{i_1} W_{r,p,n}}{W^2} \Bigg),
\]

\[
P\big( X_2(t) = r,\ \eta(t) = j,\ \zeta_1(t) = i_1,\ \zeta_2(t) = i_2,\ i_1 \ne i_2,\ A_1(t) = A_2(t) \big)
\to E\big( e^{-2\alpha \widetilde{D}_2} \big)\, E\Bigg( \frac{\sum_{p=1}^{Y_{r,j}} \sum_{i=1}^d \sum_{m\ne n}^{\xi_{r,j,p,i}} v_{i_1} W_{r,p,m}\, v_{i_2} W_{r,p,n}}{W^2} \Bigg)
\]
and

\[
P\big( X_2(t) = r,\ \eta(t) = j,\ \zeta_1(t) = i_1,\ \zeta_2(t) = i_2,\ i_1 \ne i_2,\ A_1(t) \ne A_2(t) \big)
\to E\big( e^{-2\alpha \widetilde{D}_2} \big)\, E\Bigg( \frac{\sum_{p=1}^{Y_{r,j}} \sum_{k\ne l}^{d} \sum_{m=1}^{\xi_{r,j,p,k}} \sum_{n=1}^{\xi_{r,j,p,l}} v_{i_1} W_{r,p,m}\, v_{i_2} W_{r,p,n}}{W^2} \Bigg).
\]

So, for any r ≥ 0 and any j, i1, i2 = 1, 2, ··· , d, we have that

\[
P\big( X_2(t) = r,\ \eta(t) = j,\ \zeta_1(t) = i_1,\ \zeta_2(t) = i_2 \big)
\to v_{i_1} v_{i_2}\, E\big( e^{-2\alpha \widetilde{D}_2} \big)\, E\Bigg( \frac{\sum_{p=1}^{Y_{r,j}} \sum_{m\ne n}^{|\xi_{r,j,p}|} W_{r,p,m}\, W_{r,p,n}}{W^2} \Bigg) \equiv \varphi_2(r, j, i_1, i_2) \quad \text{as } t \to \infty.
\]

Since $X_2(t) \xrightarrow{d} \widetilde{X}_2$ as $t \to \infty$ and $\widetilde{X}_2$ has a proper probability distribution, $\{X_2(t) : t \ge 0\}$ is tight. Also, $\eta(t)$, $\zeta_1(t)$ and $\zeta_2(t)$ take values only in the finite set $\{1, 2, \cdots, d\}$, so $\big\{ (X_2(t), \eta(t), \zeta_1(t), \zeta_2(t)) : t \ge 0 \big\}$ is tight. Thus, the limit $\varphi_2$ is a probability distribution. Hence,
\[
\sum_{(r, j, i_1, i_2)} \varphi_2(r, j, i_1, i_2) = 1.
\]
The proof is complete. $\square$

5.3 The Generation Problem in Supercritical Case

We know that if P(Z1 = 0|Z0 = ei) = 0 for any i = 1, 2, ··· , d, then |Z(t)| → ∞ w.p.1. For a continuous-time Bellman-Harris branching process, since the lifetime is a random quantity, individuals alive at time t may belong to different generations. It is clear that the population will grow old as time t gets large but the question is how fast the generation number grows.

Now, we pick an individual at random from those alive at time $t$ and let $M(t)$ be the generation number of this individual. Our interest is to determine the growth rate of $M(t)$ with $t$. In this section, we assume that all individuals of the various types have the same lifetime distribution $G$, although their offspring distributions may be different.

5.3.1 The Statement of the Result

Theorem 5.4. Let $1 < \rho < \infty$ and let the lifetime distribution $G$ be non-lattice with $G(0+) = 0$. If $E\big( \|Z_1\| \log \|Z_1\| \,\big|\, Z_0 = e_i \big) < \infty$ for all $1 \le i \le d$, then
\[
\frac{M(t)}{t} \to \frac{1}{\mu_\alpha} \quad \text{in probability as } t \to \infty,
\]
where $\mu_\alpha = \rho \int_{[0,\infty)} x\, e^{-\alpha x}\, dG(x)$.
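The constants in Theorem 5.4 can be made concrete in a worked example with invented parameters (not part of the thesis): assuming exponential lifetimes $G(t) = 1 - e^{-\lambda t}$, the Malthusian equation $\rho \int_0^\infty e^{-\alpha t}\, dG(t) = 1$ reduces to $\rho\lambda/(\lambda + \alpha) = 1$, with root $\alpha = \lambda(\rho - 1)$ and $\mu_\alpha = 1/(\lambda\rho)$. The sketch below solves the equation numerically by bisection and checks the closed forms.

```python
# Invented worked example: exponential lifetimes G(t) = 1 - exp(-lam*t).
# The Malthusian equation rho * E[exp(-alpha*L)] = 1 becomes
# rho*lam/(lam + alpha) = 1, so alpha = lam*(rho - 1), and
# mu_alpha = rho * E[L*exp(-alpha*L)] = rho*lam/(lam + alpha)**2 = 1/(lam*rho).

def malthusian_alpha(rho, lam, lo=0.0, hi=1e6, tol=1e-12):
    # bisection on the decreasing function a -> rho*lam/(lam + a) - 1
    f = lambda a: rho * lam / (lam + a) - 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(mid) > 0.0 else (lo, mid)
    return 0.5 * (lo + hi)

rho, lam = 2.0, 1.0
alpha = malthusian_alpha(rho, lam)
mu_alpha = rho * lam / (lam + alpha) ** 2  # rho * E[L * exp(-alpha*L)]
print(alpha, mu_alpha, 1.0 / mu_alpha)    # approximately 1.0, 0.5, 2.0
```

Here $1/\mu_\alpha = 2$ would be the in-probability growth rate of the generation number $M(t)/t$ for this particular lifetime law.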

5.3.2 The proof of Theorem 5.4

We need the following lemma to prove the theorem.

Lemma 5.4. (Athreya, Athreya and Iyer [11]) Let $\{L_i\}_{i\ge 1}$ be i.i.d. positive random variables with distribution $G$ and $G(0) = 0$. Let $\rho > 1$ and let $0 < \alpha < \infty$ be the Malthusian parameter given by $\rho \int_0^\infty e^{-\alpha t}\, dG(t) = 1$. Let $\{\widetilde{L}_i\}_{i\ge 1}$ be i.i.d. positive random variables with distribution function $G_\alpha(x) = \rho \int_0^x e^{-\alpha t}\, dG(t)$, $x \ge 0$. Let $S_0 = 0$ and $S_n = \sum_{i=1}^n L_i$, $n \ge 1$. For $t \ge 0$, let $N(t) = k$ if $S_k \le t < S_{k+1}$. Further, let $R_t = S_{N(t)+1} - t$ be the residual lifetime at time $t$ for $\{L_i\}_{i\ge 1}$. Let $\widetilde{N}(t)$ and $\widetilde{R}_t$ be the corresponding objects for $\{\widetilde{L}_i\}_{i\ge 1}$. Then

(a) for any $k \ge 1$ and any bounded Borel measurable function $\phi : \mathbb{R}^k \to \mathbb{R}$,
\[
E\big( \phi(\widetilde{L}_1, \widetilde{L}_2, \cdots, \widetilde{L}_k) \big) = E\big( e^{-\alpha S_k}\, \rho^k\, \phi(L_1, L_2, \cdots, L_k) \big),
\]
and

(b) $\lim\limits_{l\to\infty} \lim\limits_{t\to\infty} E\big( e^{\alpha \widetilde{R}_t} : \widetilde{R}_t > l \big) = 0$.
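Lemma 5.4 can be checked by simulation in a case where everything is explicit. Assuming exponential lifetimes $G = \mathrm{Exp}(\lambda)$ (an illustrative choice, not the lemma's generality), the tilted law $dG_\alpha(t) = \rho\, e^{-\alpha t}\, dG(t)$ is again exponential with rate $\lambda + \alpha = \lambda\rho$, so the tilted renewal process satisfies $\widetilde{N}(t)/t \to 1/\mu_\alpha$ by the strong law of large numbers:

```python
import random

# Illustrative check with an invented exponential lifetime law:
# G = Exp(lam), rho*lam/(lam + alpha) = 1, so the size-biased law
# G_alpha is Exp(lam + alpha) = Exp(lam*rho) with mean 1/(lam*rho),
# which equals mu_alpha; hence N~(t)/t should approach lam*rho.
random.seed(1)
rho, lam = 2.0, 1.0
alpha = lam * (rho - 1.0)      # Malthusian parameter for Exp(lam) lifetimes
mu_alpha = 1.0 / (lam * rho)   # mean of the tilted lifetime distribution

def N(t):
    # number of tilted renewals completed by time t
    s, n = 0.0, 0
    while True:
        s += random.expovariate(lam + alpha)  # draw a lifetime from G_alpha
        if s > t:
            return n
        n += 1

t = 10000.0
print(N(t) / t, 1.0 / mu_alpha)  # the two rates should be close
```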

Now, we can begin the proof.

Let Z0 = ei0 for some i0 = 1, 2, ··· , d.

Let Yn = (Yn,1, Yn,2, ··· , Yn,d) be the embedded Galton-Watson branching process.

Let Yn,i(t) be the number of individuals of type i in the nth generation alive at time t. Then

\[
Z_i(t) = \sum_{n=0}^{\infty} Y_{n,i}(t).
\]

Let Ln,i, j be the lifetime of the jth individual of type i in the nth generation. 124

Let $L_{n,i,j,k}$ be the lifetime of the $k$th-generation ancestor of the $j$th individual of type $i$ in the $n$th generation; then $\big\{ L_{n,i,j,k} : n \ge 0,\ i = 1, 2, \cdots, d,\ j \ge 1,\ k = 0, 1, \cdots, n-1 \big\}$ are i.i.d. copies with the lifetime distribution $G$. Let $S_{n,i,j} = \sum_{k=0}^{n-1} L_{n,i,j,k}$; then $S_{n,i,j}$ is the birth time of the $j$th individual of type $i$ in the $n$th generation. Recall that $\rho$ is the maximal eigenvalue of the offspring mean matrix $M$ with right eigenvector $u$ and left eigenvector $v$.

Let $\alpha$ be the Malthusian parameter for $M$ and the lifetime distribution $G$. Then, since all the individuals have the same lifetime distribution,
\[
\rho \int_{[0,\infty)} e^{-\alpha t}\, dG(t) = 1.
\]

Let $dG_\alpha(t) = \rho\, e^{-\alpha t}\, dG(t)$; then
\[
\mu_\alpha \equiv \int_{[0,\infty)} t\, dG_\alpha(t) = \rho \int_{[0,\infty)} t\, e^{-\alpha t}\, dG(t).
\]
For any $c > \frac{1}{\mu_\alpha}$, we have

\[
P(M(t) > ct) = E\Big( \frac{1}{|Z(t)|} \sum_{i=1}^d \sum_{n>ct} Y_{n,i}(t) \Big)
\le \sum_{i=1}^d E\Big( \frac{1}{Z_i(t)} \sum_{n>ct} Y_{n,i}(t) \Big)
\]
\[
= \sum_{i=1}^d \bigg( E\Big( \frac{1}{Z_i(t)} \sum_{n>ct} Y_{n,i}(t) : Z_i(t) \le \epsilon\, e^{\alpha t} v_i \Big)
+ E\Big( \frac{1}{Z_i(t)} \sum_{n>ct} Y_{n,i}(t) : Z_i(t) > \epsilon\, e^{\alpha t} v_i \Big) \bigg)
\equiv \sum_{i=1}^d \big( a_i(t) + b_i(t) \big) \tag{5.20}
\]
for any $0 < \epsilon < \infty$.

We first claim that $\lim_{\epsilon\downarrow 0} \limsup_{t\to\infty} b_i(t) = 0$ for $1 \le i \le d$. Let
\[
\delta_{n,i,r}(t) =
\begin{cases}
1, & \text{if the $r$th individual of type $i$ in the $n$th generation is alive at time $t$,} \\
0, & \text{otherwise.}
\end{cases}
\]

Then,

\[
E\big( Y_{n,i}(t) \big) = E\Big( \sum_{r=1}^{Y_{n,i}} \delta_{n,i,r}(t) \Big)
= E\bigg( E\Big( \sum_{r=1}^{Y_{n,i}} \delta_{n,i,r}(t) \,\Big|\, Y_{n,i},\ n \ge 0 \Big) \bigg)
= E\Big( Y_{n,i}\, E\big( \delta_{n,i,1}(t) \,\big|\, Y_{n,i},\ n \ge 0 \big) \Big)
= E\big( Y_{n,i} \big)\, E\big( \delta_{n,i,1}(t) \big)
\]
\[
= m^{(n)}_{i_0 i}\, P\big( S_{n,i,1} \le t < S_{n,i,1} + L_{n,i,1} \big)
= m^{(n)}_{i_0 i}\, P\big( S_n \le t < S_{n+1} \big)
= m^{(n)}_{i_0 i}\, P\big( N(t) = n \big),
\]
where $S_k = \sum_{i=1}^k L_i$, $k \ge 1$, $\{L_i\}_{i\ge 1}$ are i.i.d. random variables with the lifetime distribution $G$ and $N(t) = n$ if $S_n \le t < S_{n+1}$. So, by Lemma 5.4 (a), we have
\[
b_i(t) = E\Big( \frac{1}{Z_i(t)} \sum_{n>ct} Y_{n,i}(t) : Z_i(t) > \epsilon\, e^{\alpha t} v_i \Big)
\le \frac{1}{\epsilon\, e^{\alpha t} v_i} \sum_{n>ct} E\big( Y_{n,i}(t) \big)
= \frac{e^{-\alpha t}}{\epsilon\, v_i} \sum_{n>ct} m^{(n)}_{i_0 i}\, P\big( S_n \le t < S_{n+1} \big)
\]
\[
= \frac{1}{\epsilon\, v_i \rho} \sum_{n>ct} \frac{m^{(n)}_{i_0 i}}{\rho^n}\, E\Big( e^{\alpha(S_{n+1} - t)}\, e^{-\alpha S_{n+1}}\, \rho^{n+1}\, I_{(S_n \le t < S_{n+1})} \Big)
= \frac{1}{\epsilon\, v_i \rho} \sum_{n>ct} \frac{m^{(n)}_{i_0 i}}{\rho^n}\, E\Big( e^{\alpha(\widetilde{S}_{n+1} - t)}\, I_{(\widetilde{S}_n \le t < \widetilde{S}_{n+1})} \Big)
= \frac{1}{\epsilon\, v_i \rho} \sum_{n>ct} \frac{m^{(n)}_{i_0 i}}{\rho^n}\, E\Big( e^{\alpha \widetilde{R}_t}\, I_{(\widetilde{N}(t) = n)} \Big).
\]

Let $\beta_{i_0,i} = \sup_{n\ge 1} \frac{m^{(n)}_{i_0 i}}{\rho^n}$. Since
\[
\frac{m^{(n)}_{i_0 i}}{\rho^n} \to u_{i_0} v_i \quad \text{as } n \to \infty,
\]
we have $0 < \beta_{i_0,i} < \infty$. Therefore,

\[
b_i(t) \le \frac{1}{\epsilon\, v_i \rho} \sum_{n>ct} \frac{m^{(n)}_{i_0 i}}{\rho^n}\, E\big( e^{\alpha \widetilde{R}_t}\, I_{(\widetilde{N}(t)=n)} \big)
\le \frac{\beta_{i_0,i}}{\epsilon\, v_i \rho} \sum_{n>ct} E\big( e^{\alpha \widetilde{R}_t}\, I_{(\widetilde{N}(t)=n)} \big)
= \frac{\beta_{i_0,i}}{\epsilon\, v_i \rho}\, E\big( e^{\alpha \widetilde{R}_t}\, I_{(\widetilde{N}(t)>ct)} \big).
\]

From Lemma 5.4 (b), we have that, for any $\epsilon > 0$, there exists an $l > 0$ such that
\[
\limsup_{t\to\infty} E\big( e^{\alpha \widetilde{R}_t} : \widetilde{R}_t > l \big) < \epsilon^2.
\]

So,

\[
b_i(t) \le \frac{\beta_{i_0,i}}{\epsilon\, v_i \rho} \Big( E\big( e^{\alpha \widetilde{R}_t}\, I_{(\widetilde{N}(t)>ct)} : \widetilde{R}_t > l \big) + E\big( e^{\alpha \widetilde{R}_t}\, I_{(\widetilde{N}(t)>ct)} : \widetilde{R}_t \le l \big) \Big)
\le \frac{\beta_{i_0,i}}{\epsilon\, v_i \rho} \Big( E\big( e^{\alpha \widetilde{R}_t} : \widetilde{R}_t > l \big) + e^{\alpha l}\, P\big( \widetilde{N}(t) > ct \big) \Big).
\]

Moreover, by the strong law of large numbers,

\[
\frac{\widetilde{N}(t)}{t} \to \frac{1}{\mu_\alpha} \quad \text{w.p.1 as } t \to \infty
\]
and hence
\[
\lim_{t\to\infty} P\Big( \frac{\widetilde{N}(t)}{t} > c \Big) = 0 \quad \text{for any } c > \frac{1}{\mu_\alpha}.
\]

Therefore, we have that

\[
0 \le \limsup_{t\to\infty} b_i(t)
\le \frac{\beta_{i_0,i}}{\epsilon\, v_i \rho} \Big( \limsup_{t\to\infty} E\big( e^{\alpha \widetilde{R}_t} : \widetilde{R}_t > l \big) + e^{\alpha l} \limsup_{t\to\infty} P\big( \widetilde{N}(t) > ct \big) \Big)
< \frac{\beta_{i_0,i}\, \epsilon}{v_i \rho}
\]
for any $1 \le i \le d$. Hence

\[
\lim_{\epsilon\downarrow 0}\, \limsup_{t\to\infty} b_i(t) = 0 \quad \text{for all } i = 1, 2, \cdots, d.
\]

Next, we claim that $\lim_{\epsilon\downarrow 0} \limsup_{t\to\infty} a_i(t) = 0$ for all $i = 1, 2, \cdots, d$. Since
\[
a_i(t) \equiv E\Big( \frac{1}{Z_i(t)} \sum_{n>ct} Y_{n,i}(t) : Z_i(t) \le \epsilon\, e^{\alpha t} v_i \Big) \le P\big( Z_i(t) \le \epsilon\, e^{\alpha t} v_i \big)
\]
and
\[
e^{-\alpha t} Z_i(t) \to v_i W \quad \text{w.p.1 as } t \to \infty,
\]

and, under the assumptions that $P(Z_1 = 0 \mid Z_0 = e_i) = 0$ and $E\big( \|Z_1\| \log \|Z_1\| \,\big|\, Z_0 = e_i \big) < \infty$ for all $1 \le i \le d$, we have $P(0 < W < \infty) = 1$. Then $P(W \le \epsilon) \to 0$ as $\epsilon \downarrow 0$. Hence,

\[
\lim_{\epsilon\downarrow 0}\, \limsup_{t\to\infty} a_i(t)
\le \lim_{\epsilon\downarrow 0}\, \limsup_{t\to\infty} P\big( Z_i(t) \le \epsilon\, e^{\alpha t} v_i \big)
= \lim_{\epsilon\downarrow 0} P(W \le \epsilon) = 0
\]
for all $1 \le i \le d$.

Then, from (5.20), we have that

\[
0 \le \limsup_{t\to\infty} P\big( M(t) > ct \big) \le \sum_{i=1}^d \Big( \limsup_{t\to\infty} a_i(t) + \limsup_{t\to\infty} b_i(t) \Big)
\]
for any $\epsilon > 0$ and hence,

\[
0 \le \limsup_{t\to\infty} P\big( M(t) > ct \big) \le \sum_{i=1}^d \Big( \lim_{\epsilon\downarrow 0}\, \limsup_{t\to\infty} a_i(t) + \lim_{\epsilon\downarrow 0}\, \limsup_{t\to\infty} b_i(t) \Big) = 0,
\]
i.e., $\lim_{t\to\infty} P\big( M(t) > ct \big) = 0$ for any $c > \frac{1}{\mu_\alpha}$. By a similar argument, we can prove that

\[
\lim_{t\to\infty} P\big( M(t) < ct \big) = 0
\]
for any $c < \frac{1}{\mu_\alpha}$. Since, for any $\epsilon > 0$,

\[
P\Big( \Big| \frac{M(t)}{t} - \frac{1}{\mu_\alpha} \Big| > \epsilon \Big)
= P\Big( \frac{M(t)}{t} > \frac{1}{\mu_\alpha} + \epsilon \Big) + P\Big( \frac{M(t)}{t} < \frac{1}{\mu_\alpha} - \epsilon \Big),
\]
we have that

\[
\lim_{t\to\infty} P\Big( \Big| \frac{M(t)}{t} - \frac{1}{\mu_\alpha} \Big| > \epsilon \Big)
= \lim_{t\to\infty} P\Big( \frac{M(t)}{t} > \frac{1}{\mu_\alpha} + \epsilon \Big) + \lim_{t\to\infty} P\Big( \frac{M(t)}{t} < \frac{1}{\mu_\alpha} - \epsilon \Big) = 0
\]
for any $\epsilon > 0$. So,

\[
\frac{M(t)}{t} \to \frac{1}{\mu_\alpha} \quad \text{in probability as } t \to \infty
\]
and hence the proof is complete. $\square$

CHAPTER 6. APPLICATION TO BRANCHING RANDOM WALKS

6.1 Introduction

A branching random walk is a branching tree such that with each line of descent a random walk is associated. Let $\{Z_n\}_{n\ge 0}$ be a discrete-time single-type Galton-Watson branching process with offspring distribution $\{p_j\}_{j\ge 0}$. Let $Z_0 = 1$; then there is a unique probability measure on the family tree initiated by this ancestor.

On this family tree, we impose the following movement structure.

If an individual is located at $x$ on the real line $\mathbb{R}$ and, upon death, produces $k$ children, then these $k$ children move to $x + X_{k,j}$, for $1 \le j \le k$, where $(X_{k,1}, X_{k,2}, \cdots, X_{k,k})$ is a random vector with a joint distribution $\pi_k$ on $\mathbb{R}^k$, for each $k$. The random vector $X_k \equiv (X_{k,1}, X_{k,2}, \cdots, X_{k,k})$ is stochastically independent of the history up to that generation as well as of the movements of the offspring of other individuals.

Let $\zeta_n \equiv \{x_{n,i} : 1 \le i \le Z_n\}$ be the positions of the $Z_n$ individuals of the $n$th generation. For each $n \ge 0$, $\zeta_n$ is a collection of a random number of random points on $\mathbb{R}$ and hence is a point process. The sequence of pairs $\{(Z_n, \zeta_n)\}_{n\ge 0}$ is called a branching random walk. The probability distribution of this process is completely specified by

1. the offspring distribution $\{p_j\}_{j\ge 0}$;

2. the family of probability measures $\{\pi_k\}_{k\ge 1}$;

3. the initial population size $Z_0$; and

4. the locations $\zeta_0 \equiv \{x_{0,i} : 1 \le i \le Z_0\}$ of the initial ancestors.
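A minimal simulation sketch of the process specified by items 1-4 above (the concrete offspring law, uniform on $\{1, 2, 3\}$, and the $N(0,1)$ displacement distribution are invented illustrative choices; any $\{p_j\}$ and $\{\pi_k\}$ would do):

```python
import random

# Minimal simulation of the point process zeta_n: each individual at x
# dies and leaves k children located at x + X_{k,j}.  The offspring law
# (uniform on {1, 2, 3}) and i.i.d. N(0, 1) displacements are invented
# choices; they satisfy p_0 = 0 and 1 < m = 2 < infinity.
random.seed(2)

def next_generation(positions):
    children = []
    for x in positions:
        k = random.choice((1, 2, 3))
        children.extend(x + random.gauss(0.0, 1.0) for _ in range(k))
    return children

zeta = [0.0]                     # Z_0 = 1, ancestor located at the origin
for n in range(8):
    zeta = next_generation(zeta)
print(len(zeta), min(zeta), max(zeta))  # Z_8 and the spread of zeta_8
```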

It is clear that {ζn}n≥0 is also a Markov chain whose state space is the set of all finite subsets of R.

The problem of our interest is what happens to the point process $\zeta_n$ as $n \to \infty$. In particular,

(1) If Zn(x) is the number of points in ζn that are less than or equal to x, then how does Zn(x) behave as n → ∞?

(2) Does there exist $\{x_n\}_{n\ge 0}$ such that the proportion $\frac{Z_n(x_n)}{Z_n}$ has a nontrivial limit as $n \to \infty$?

It is clear that the movement along any one line of descent is that of a classical random walk. Thus, if the $X_{k,i}$ are identically distributed with mean $\mu$ and finite variance $\sigma^2$, then the location of an individual of the $n$th generation should be approximately Gaussian with mean $n\mu$ and variance $n\sigma^2$ by the central limit theorem.

This suggests that if $Z_n \to \infty$ as $n \to \infty$ and if $x_n = \sigma\sqrt{n}\, x + n\mu$, then $\frac{Z_n(x_n)}{Z_n}$ could have $\Phi(x)$, the standard $N(0,1)$ cumulative distribution function (c.d.f.), as its limit. Or, if $X_{k,1} \in D(\alpha)$ with $0 < \alpha \le 2$, i.e., $X_{k,1}$ is in the domain of attraction of a stable law of order $\alpha$, then there exist $a_n$ and $b_n$ such that $\frac{Z_n(a_n + b_n y)}{Z_n}$ converges to a standard stable law c.d.f. as $n \to \infty$. This turns out to be true in the supercritical case ($1 < m = \sum_{j=1}^\infty j p_j < \infty$), but the same result does not hold for the explosive case ($m = \infty$).

Recall that the coalescence time of two randomly chosen individuals in the $n$th generation of a supercritical Galton-Watson branching process stays very close to the beginning of the tree, while in the explosive case it grows at rate $n$ as $n$ gets large. Surprisingly, this causes the difference in the limit behavior of the proportion $\frac{Z_n(x_n)}{Z_n}$ between the supercritical and explosive cases.

6.2 Review of Results in The Supercritical Case

Consider a supercritical Galton-Watson branching process $\{Z_n\}_{n\ge 0}$. The following theorems are results on its corresponding branching random walk.

Theorem 6.1. (Athreya [9]) Let $p_0 = 0$, $1 < m \equiv \sum_{j=1}^\infty j p_j < \infty$ and let $\pi_k$ be such that $\{ X_{k,i} : i = 1, 2, \cdots, k \}_{k\ge 1}$ are identically distributed. Let $E X_{k,1} = 0$ and $E X_{k,1}^2 = \sigma^2 < \infty$. Then,

(a) for any $y \in \mathbb{R}$,
\[
\frac{Z_n(\sqrt{n}\,\sigma y)}{Z_n} \to \Phi(y) \quad \text{in mean square},
\]
where $\Phi(y)$ is the c.d.f. of the standard normal $N(0, 1)$;

(b) if $Y_n$ is the position of a randomly chosen individual from the $n$th generation, then, for any $y \in \mathbb{R}$,
\[
P\big( Y_n \le \sqrt{n}\,\sigma y \big) \to \Phi(y).
\]
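A hedged Monte Carlo check of part (a) is easy to run. The offspring law (uniform on $\{1, 2, 3\}$) and the $N(0,1)$ displacements below are invented choices satisfying $p_0 = 0$, $1 < m < \infty$, $E X_{k,1} = 0$ and $E X_{k,1}^2 = \sigma^2 = 1$:

```python
import math
import random

# The fraction of generation-n individuals located at or below
# sqrt(n)*sigma*y should be close to Phi(y); averaging over several
# independent trees suppresses the tree-to-tree fluctuation.
random.seed(3)
sigma, n, y = 1.0, 12, 0.5
Phi = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def fraction_below(n, y):
    positions = [0.0]
    for _ in range(n):
        positions = [x + random.gauss(0.0, sigma)
                     for x in positions
                     for _ in range(random.choice((1, 2, 3)))]
    return sum(p <= math.sqrt(n) * sigma * y for p in positions) / len(positions)

avg = sum(fraction_below(n, y) for _ in range(20)) / 20
print(round(avg, 3), round(Phi(y), 3))  # the two numbers should be close
```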

Theorem 6.2. (Athreya [9]) Let $p_0 = 0$, $1 < m \equiv \sum_{j=1}^\infty j p_j < \infty$ and let $\pi_k$ be such that $\{ X_{k,i} : i = 1, 2, \cdots, k \}_{k\ge 1}$ are identically distributed. Let $X_{k,1} \in D(\alpha)$, $0 < \alpha \le 2$. Then

(a) there exist $a_n$, $b_n$ such that
\[
\frac{Z_n(a_n + b_n y)}{Z_n} \to G_\alpha(y) \quad \text{in mean square},
\]
where $G_\alpha(\cdot)$ is a standard stable law c.d.f. (of order $\alpha$);

(b) if $Y_n$ is the position of a randomly chosen individual from the $n$th generation, then, for any $y \in \mathbb{R}$,
\[
P\big( Y_n \le a_n + b_n y \big) \to G_\alpha(y).
\]

The results depend on the fact that, when $p_0 = 0$ and $1 < m \equiv \sum_{j=1}^\infty j p_j < \infty$, the coalescence time $X_{n,2}$ is way back in time, so the positions of two randomly chosen individuals in the $n$th generation are essentially independent and each has the marginal distribution of a random walk at step $n$.

Remark 6.1. Theorem 6.1 and Theorem 6.2 hold under the following weaker assumption about $\pi_k$, the distribution of $(X_{k,1}, X_{k,2}, \cdots, X_{k,k})$, which does not require $\{X_{k,1}\}_{k\ge 1}$ to be identically distributed. It suffices to assume:

(i) $\forall k \ge 1$, $(X_{k,1}, X_{k,2}, \cdots, X_{k,k})$ has a distribution that is invariant under permutations.

(ii) $\{p_k\}_{k\ge 0}$ is the offspring distribution with
\[
\sum_{k=1}^\infty p_k\, E X_{k,1}^2 < \infty, \qquad 1 < m = \sum_{k=1}^\infty k p_k < \infty, \qquad p_0 = 0.
\]

6.3 Results in The Explosive Case

In this section, we consider the explosive Galton-Watson branching process such that the offspring distribution $\{p_j\}_{j\ge 0}$ is in the domain of attraction of a stable law of order $\alpha$ with $0 < \alpha < 1$.
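A concrete offspring law of this kind can be sampled by inversion. The construction below is an invented illustration (not taken from the text): with $U$ uniform on $(0, 1]$, $\xi = \lfloor U^{-1/\alpha} \rfloor$ satisfies $P(\xi \ge k) = k^{-\alpha}$ exactly for integers $k \ge 1$, so $\xi \ge 1$ (hence $p_0 = 0$), the tail is regularly varying of index $\alpha$, and $E\xi = \infty$:

```python
import random

# Inverse-transform sampling of a heavy-tailed offspring law in the
# domain of attraction of a stable law of order alpha, 0 < alpha < 1.
# P(xi >= k) = k**(-alpha) for every integer k >= 1, so the mean is
# infinite: the explosive case m = infinity.
random.seed(4)
alpha = 0.5

def heavy_offspring():
    u = 1.0 - random.random()          # uniform on (0, 1], avoids u = 0
    return int(u ** (-1.0 / alpha))

draws = [heavy_offspring() for _ in range(100000)]
tail = sum(d >= 10 for d in draws) / len(draws)
print(round(tail, 3), round(10 ** -alpha, 3))  # empirical vs exact tail at k = 10
```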

6.3.1 The Statements of Theorems in The Explosive Case

First, we pick an individual at random from the nth generation.

Recall the following notations:

(1) Yn is the position of this randomly chosen individual;

(2) Zn(x) is the number of points in ζn that are less than or equal to x for any x ∈ R;

(3) $X_k \equiv (X_{k,1}, X_{k,2}, \cdots, X_{k,k})$ are the movements of all the offspring of an individual with $k$ offspring and have the joint distribution $\pi_k$;

(4) ζn ≡ {xni : 1 ≤ i ≤ Zn} are the positions of the Zn individuals of the n-th generation.

Theorem 6.3. Let $m = \infty$, $p_0 = 0$, $\{p_j\}_{j\ge 0} \in D(\alpha)$, $0 < \alpha < 1$. Let $\{ X_{k,i} : 1 \le i \le k \}_{k\ge 1}$ be identically distributed. Let $E X_{k,1} = 0$ and $E X_{k,1}^2 = \sigma^2 < \infty$. Then, for any fixed $y \in \mathbb{R}$,

(a) $P\big( Y_n \le \sqrt{n}\,\sigma y \big) \to \Phi(y)$ as $n \to \infty$;

(b) $\frac{Z_n(\sqrt{n}\,\sigma y)}{Z_n} \xrightarrow{d} \delta_y$ as $n \to \infty$, where $\delta_y$ is Bernoulli$(\Phi(y))$, i.e., $P(\delta_y = 1) = \Phi(y) = 1 - P(\delta_y = 0)$.

The result in Theorem 6.3 (b) can be strengthened to the joint convergence of
\[
\frac{Z_n(\sqrt{n}\,\sigma y_i)}{Z_n}, \quad i = 1, 2, \cdots, k,
\]
and hence we have the following theorem.

Theorem 6.4. Under the hypothesis of Theorem 6.3,

(a) for any $-\infty < y_1 < y_2 < \infty$,
\[
\Big( \frac{Z_n(\sqrt{n}\,\sigma y_1)}{Z_n},\ \frac{Z_n(\sqrt{n}\,\sigma y_2)}{Z_n} \Big) \xrightarrow{d} \big( \delta_1(\Phi(y_1)),\ \delta_2(\Phi(y_2)) \big),
\]
which takes values $(0, 0)$, $(0, 1)$ and $(1, 1)$ with probabilities $1 - \Phi(y_2)$, $\Phi(y_2) - \Phi(y_1)$ and $\Phi(y_1)$, respectively;

(b) for any $-\infty < y_1 < y_2 < \cdots < y_k < \infty$,
\[
\Big( \frac{Z_n(\sqrt{n}\,\sigma y_i)}{Z_n} : 1 \le i \le k \Big) \xrightarrow{d} \big( \delta_1, \cdots, \delta_k \big),
\]
where each $\delta_i$ is 0 or 1, and further $\delta_i = 1 \Rightarrow \delta_j = 1$ for $j \ge i$ and
\[
P\big( \delta_1 = 0,\ \delta_2 = 0,\ \cdots,\ \delta_{j-1} = 0,\ \delta_j = 1,\ \cdots,\ \delta_k = 1 \big)
= P\big( \delta_{j-1} = 0,\ \delta_j = 1 \big) = \Phi(y_j) - \Phi(y_{j-1}).
\]

Remark 6.2. Theorem 6.4 suggests that
\[
\Big( Z_n(y) = \frac{Z_n(\sqrt{n}\,\sigma y)}{Z_n},\ -\infty < y < \infty \Big)
\]
converges weakly in the Skorohod space $D(-\infty, \infty)$ to
\[
\big( X(y) \equiv I_{(N \le y)},\ -\infty < y < \infty \big),
\]
where $N$ is a $N(0, 1)$ random variable.

So, we have the following result and only tightness needs to be established:

If $Y_n$ is the position of a randomly chosen individual in the $n$th generation, then in all cases (as long as $p_0 = 0$), given the tree (random walk) $\mathcal{T}$, for all $y \in \mathbb{R}$,
\[
P\big( Y_n \le \sqrt{n}\,\sigma y \,\big|\, \mathcal{T} \big) \xrightarrow{d} \delta_y \sim \text{Bernoulli}(\Phi(y)).
\]

6.3.2 The Proof of Theorem 6.3

To prove Theorem 6.3, we need the following lemma.

Lemma 6.1. Let $\{\mu_n\}_{n\ge 1}$ be probability distributions on $[0, 1]$ such that, as $n \to \infty$,
\[
\int_{[0,1]} x\, d\mu_n \to \lambda \quad \text{and} \quad \int_{[0,1]} x^2\, d\mu_n \to \lambda
\]
for some $0 < \lambda < 1$. Then
\[
\mu_n \xrightarrow{w} \mu \quad \text{as } n \to \infty,
\]
where $\mu$ is the probability distribution on $[0, 1]$ with $\mu\{1\} = \lambda$ and $\mu\{0\} = 1 - \lambda$.

Proof. First, we have that, for any $n \in \mathbb{N}$,
\[
\int_{[0,1]} x\, d\mu_n = \mu_n\{1\} + \int_{(0,1)} x\, d\mu_n
\]
and

Proof. First, we have that, for any n ∈ N, Z Z  xdµn = µn {1} + xdx [0,1] (0,1) 133 and Z Z 2  2 x dµn = µn {1} + x dx. [0,1] (0,1)

So, Z Z Z Z 2 2 xdµn − x dµn = xdx − x dµn (0,1) (0,1) [0,1] [0,1] and hence Z 2 lim x − x dµn n→∞ (0,1) Z Z Z Z 2 = lim xdµn − lim x dµn = lim xdx − lim xdµn n→∞ (0,1) n→∞ (0,1) n→∞ [0,1] n→∞ [0,1] = λ − λ = 0

Now, for any $0 < a < b < 1$, we have that
\[
\int_{(0,1)} \big( x - x^2 \big)\, d\mu_n \ge \int_{(a,b]} \big( x - x^2 \big)\, d\mu_n \ge \big( r - r^2 \big)\, \mu_n\big( (a, b] \big),
\]
where $r = \min\{ a,\ 1 - b \}$, and thus
\[
\limsup_{n\to\infty} \mu_n\big( (a, b] \big) \le \frac{1}{r - r^2} \lim_{n\to\infty} \int_{(0,1)} \big( x - x^2 \big)\, d\mu_n = 0 = \mu\big( (a, b] \big).
\]
Since $\lim_{n\to\infty} \mu_n\big( (a, b] \big) = \mu\big( (a, b] \big)$ for any $a, b \in [0, 1]$ with $\mu\{a\} = \mu\{b\} = 0$, we have that

v µn −−−→ µ as n → ∞.

Also, µ is a probability measure on [0, 1], so

w µn −−−−→ µ as n → ∞.


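A quick numeric illustration of Lemma 6.1 (the two-point measures below are an invented example, not from the text): let $\mu_n$ put mass $\lambda$ at $1 - 1/n$ and mass $1 - \lambda$ at $1/n$. Both moments tend to $\lambda$, and the mass piles up on $\{0, 1\}$ exactly as the lemma asserts.

```python
# Invented example: mu_n = lam * delta_{1 - 1/n} + (1 - lam) * delta_{1/n}.
# First moment:  lam*(1 - 1/n) + (1 - lam)/n       -> lam
# Second moment: lam*(1 - 1/n)**2 + (1 - lam)/n**2 -> lam
lam = 0.7
for n in (10, 100, 1000, 10000):
    m1 = lam * (1 - 1 / n) + (1 - lam) * (1 / n)
    m2 = lam * (1 - 1 / n) ** 2 + (1 - lam) * (1 / n) ** 2
    print(n, round(m1, 5), round(m2, 5))  # both columns approach 0.7
```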

Now, we begin the proof of Theorem 6.3.

(a) Recall that ζn ≡ {xni : 1 ≤ i ≤ Zn} are the positions of the Zn individuals of the n-th generation.

For any fixed $y \in \mathbb{R}$, let
\[
\delta_{n,i} =
\begin{cases}
1, & \text{if } x_{n,i} \le \sqrt{n}\,\sigma y, \\
0, & \text{otherwise.}
\end{cases}
\]

Then we have that

\[
Z_n(\sqrt{n}\,\sigma y) = \sum_{i=1}^{Z_n} \delta_{n,i}.
\]

So,

\[
E\Big( \frac{Z_n(\sqrt{n}\,\sigma y)}{Z_n} \Big)
= E\Big( \frac{1}{Z_n} \sum_{i=1}^{Z_n} \delta_{n,i} \Big)
= E\Big( E\Big( \frac{1}{Z_n} \sum_{i=1}^{Z_n} \delta_{n,i} \,\Big|\, Z_n \Big) \Big)
= E\Big( E\big( \delta_{n,1} \,\big|\, Z_n \big) \Big)
= E\big( \delta_{n,1} \big)
\]
\[
= P\big( x_{n,1} \le \sqrt{n}\,\sigma y \big)
= P\big( x_{0,1} + S_n \le \sqrt{n}\,\sigma y \big)
= P\big( S_n \le \sqrt{n}\,\sigma y - x_{0,1} \big),
\]
where $S_n = \sum_{i=1}^n \eta_i$, the $\{\eta_i\}_{i\ge 1}$ are i.i.d. copies with distribution $\pi_1$, and $x_{0,1}$ is the location of the initial ancestor of the $n$th generation individual located at the position $x_{n,1}$. Since $E X_{k,1} = 0$ and $E X_{k,1}^2 = \sigma^2 < \infty$, by the central limit theorem, we have

\[
P\Big( \frac{S_n}{\sqrt{n}\,\sigma} \le y - \frac{x_{0,1}}{\sqrt{n}\,\sigma} \Big) \to \Phi(y) \quad \text{as } n \to \infty.
\]

Hence,
\[
P\big( Y_n \le \sqrt{n}\,\sigma y \big)
= E\big( P\big( Y_n \le \sqrt{n}\,\sigma y \,\big|\, Z_n \big) \big)
= E\Big( \frac{Z_n(\sqrt{n}\,\sigma y)}{Z_n} \Big) \to \Phi(y) \quad \text{as } n \to \infty.
\]

(b) We will prove that, for any $y \in \mathbb{R}$,
\[
\frac{Z_n(\sqrt{n}\,\sigma y)}{Z_n} \xrightarrow{d} \text{Bernoulli}(\Phi(y)) \quad \text{as } n \to \infty.
\]

From (a), we already know that, for any fixed $y \in \mathbb{R}$,
\[
E\Big( \frac{Z_n(\sqrt{n}\,\sigma y)}{Z_n} \Big) \to \Phi(y) \quad \text{as } n \to \infty.
\]

It suffices to show that, for any fixed $y \in \mathbb{R}$, we also have
\[
E\Big( \Big( \frac{Z_n(\sqrt{n}\,\sigma y)}{Z_n} \Big)^2 \Big) \to \Phi(y) \quad \text{as } n \to \infty.
\]

Recall that, for any fixed $y \in \mathbb{R}$,
\[
\delta_{n,i} =
\begin{cases}
1, & \text{if } x_{n,i} \le \sqrt{n}\,\sigma y, \\
0, & \text{otherwise,}
\end{cases}
\]
and then

\[
E\Big( \Big( \frac{Z_n(\sqrt{n}\,\sigma y)}{Z_n} \Big)^2 \Big)
= E\Big( \Big( \frac{1}{Z_n} \sum_{i=1}^{Z_n} \delta_{n,i} \Big)^2 \Big)
= E\Big( \frac{1}{Z_n^2} \sum_{i=1}^{Z_n} \delta_{n,i}^2 \Big) + E\Big( \frac{1}{Z_n^2} \sum_{i \ne j}^{Z_n} \delta_{n,i}\, \delta_{n,j} \Big).
\]

 First, it is known that, in the explosive case under the assumption that p0 = 0, P Zn → ∞ = 1. Also, we have that

\[
P\Big( 0 \le \frac{1}{Z_n^2} \sum_{i=1}^{Z_n} \delta_{n,i}^2 \le \frac{1}{Z_n} \Big) = 1.
\]

Hence,

\[
P\Big( \frac{1}{Z_n^2} \sum_{i=1}^{Z_n} \delta_{n,i}^2 \to 0 \Big) = 1
\]
and then, by the bounded convergence theorem,
\[
E\Big( \frac{1}{Z_n^2} \sum_{i=1}^{Z_n} \delta_{n,i}^2 \Big) \to 0 \quad \text{as } n \to \infty. \tag{6.1}
\]

Secondly, by the symmetry consideration conditioned on the branching tree (but not the random walk), we have that

\[
E\Big( \frac{1}{Z_n^2} \sum_{i \ne j}^{Z_n} \delta_{n,i}\, \delta_{n,j} \Big)
= E\Big( E\Big( \frac{1}{Z_n^2} \sum_{i \ne j}^{Z_n} \delta_{n,i}\, \delta_{n,j} \,\Big|\, Z_n \Big) \Big)
= E\Big( \frac{Z_n(Z_n - 1)}{Z_n^2}\, E\big( \delta_{n,1}\, \delta_{n,2} \,\big|\, Z_n \big) \Big).
\]

Note that, by the bounded convergence theorem,

\[
E\Big( \frac{Z_n(Z_n - 1)}{Z_n^2} \Big) \to 1 \quad \text{as } n \to \infty. \tag{6.2}
\]

Now, let $\tau_{n,2}$ be the generation number of the last common ancestor of two randomly chosen individuals in the $n$th generation. Then, by Theorem 2.4, we have
\[
n - \tau_{n,2} \xrightarrow{d} \widetilde{\tau}_2 \quad \text{as } n \to \infty
\]
for some random variable $\widetilde{\tau}_2$.

Let $x_{\tau_n}$ be the position of the last common ancestor of these two individuals corresponding to the positions $x_{n,1}$ and $x_{n,2}$. Then we can write
\[
x_{n,i} = x_{\tau_n} + Y_{n,i}, \quad i = 1, 2,
\]
where $Y_{n,i}$ is the net displacement of the individual with position $x_{n,i}$ from generation $\tau_{n,2}$ to $n$.

Clearly, $Y_{n,1}$ and $Y_{n,2}$ are independent. Moreover, $x_{\tau_n}$, $Y_{n,1}$ and $Y_{n,2}$ can be written as
\[
x_{\tau_n} = x_{0,1} + \sum_{j=1}^{\tau_{n,2}} \eta_j \quad \text{and} \quad Y_{n,i} = \sum_{j=1}^{n-\tau_{n,2}} \eta_{i,j} \quad \text{for } i = 1, 2,
\]
respectively, where $\{\eta_j\}_{j\ge 1}$, $\{\eta_{1,j}\}_{j\ge 1}$ and $\{\eta_{2,j}\}_{j\ge 1}$ are i.i.d. copies with distribution $\pi_1$ and are independent of each other.

Therefore,

\begin{align*}
E\big(\delta_{n,1}\delta_{n,2}\big)
&= E\Big[E\big(\delta_{n,1}\delta_{n,2}\,\big|\,n-\tau_{n,2}\big)\Big] \\
&= E\Big[E\Big(I_{\{x_{n,1} \le \sqrt{n}\,\sigma y\}}\, I_{\{x_{n,2} \le \sqrt{n}\,\sigma y\}}\,\Big|\,n-\tau_{n,2}\Big)\Big] \\
&= E\Bigg[E\Bigg(I_{\big\{x_{0,1} + \sum_{j=1}^{\tau_{n,2}}\eta_j + \sum_{j=1}^{n-\tau_{n,2}}\eta_{1,j} \le \sqrt{n}\,\sigma y\big\}}\,
I_{\big\{x_{0,1} + \sum_{j=1}^{\tau_{n,2}}\eta_j + \sum_{j=1}^{n-\tau_{n,2}}\eta_{2,j} \le \sqrt{n}\,\sigma y\big\}}\,\Bigg|\,n-\tau_{n,2}\Bigg)\Bigg] \\
&= E\Bigg[E\Bigg(I_{\big\{\sum_{j=1}^{\tau_{n,2}}\eta_j \le \sqrt{n}\,\sigma y - x_{0,1} - \sum_{j=1}^{n-\tau_{n,2}}\eta_{1,j}\big\}}\,
I_{\big\{\sum_{j=1}^{\tau_{n,2}}\eta_j \le \sqrt{n}\,\sigma y - x_{0,1} - \sum_{j=1}^{n-\tau_{n,2}}\eta_{2,j}\big\}}\,\Bigg|\,n-\tau_{n,2}\Bigg)\Bigg] \\
&= E\Bigg[P\Bigg(\sum_{j=1}^{\tau_{n,2}}\eta_j \le \sqrt{n}\,\sigma y - x_{0,1} - \max\Bigg\{\sum_{j=1}^{n-\tau_{n,2}}\eta_{1,j},\ \sum_{j=1}^{n-\tau_{n,2}}\eta_{2,j}\Bigg\}\,\Bigg|\,n-\tau_{n,2}\Bigg)\Bigg].
\end{align*}

Since $n - \tau_{n,2} \xrightarrow{\;d\;} \tilde{\tau}_2$ as $n \to \infty$ and $P(\tilde{\tau}_2 < \infty) = 1$, we have that, for $i = 1, 2$,

$$\sum_{j=1}^{n-\tau_{n,2}} \eta_{i,j} \xrightarrow{\;d\;} \sum_{j=1}^{\tilde{\tau}_2} \eta_{i,j} \quad \text{as } n \to \infty.$$

Also, $\tau_{n,2} \xrightarrow{\;d\;} \infty$ and $\dfrac{\tau_{n,2}}{n} \xrightarrow{\;d\;} 1$ as $n \to \infty$. Hence, as $n \to \infty$,
$$P\Bigg(\sum_{j=1}^{\tau_{n,2}}\eta_j \le \sqrt{n}\,\sigma y - x_{0,1} - \max\Bigg\{\sum_{j=1}^{n-\tau_{n,2}}\eta_{1,j},\ \sum_{j=1}^{n-\tau_{n,2}}\eta_{2,j}\Bigg\}\,\Bigg|\,n-\tau_{n,2}\Bigg) \to \Phi(y) \quad \text{w.p.1.}$$

Then, by the bounded convergence theorem,

$$E\big(\delta_{n,1}\delta_{n,2}\big) \to \Phi(y) \quad \text{as } n \to \infty. \tag{6.3}$$

So, (6.1), (6.2) and (6.3) together imply that
$$E\Bigg[\bigg(\frac{Z_n(\sqrt{n}\,\sigma y)}{Z_n}\bigg)^{2}\Bigg] \to \Phi(y) \quad \text{as } n \to \infty.$$

By Lemma 6.1, we have that, for any $y \in \mathbb{R}$,
$$\frac{Z_n(\sqrt{n}\,\sigma y)}{Z_n} \xrightarrow{\;d\;} \mathrm{Bernoulli}\big(\Phi(y)\big) \quad \text{as } n \to \infty$$

and hence the proof is complete.
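The dichotomy in Theorem 6.3 can also be seen numerically. The sketch below is ours, not part of the thesis: it simulates a branching random walk whose offspring law is a truncated Zipf distribution, used here only as a finite stand-in for the infinite-mean explosive regime with $p_0 = 0$; all function names and parameter choices are our own. Because two individuals in generation $n$ typically coalesce in the recent past, they share most of their ancestral walk, and the fraction of generation-$n$ positions below $\sqrt{n}\,\sigma y$ tends to sit near $0$ or $1$ rather than near $\Phi(y)$.

```python
import math
import random

def simulate_fraction(n_gen, y, seed, max_pop=200000):
    """Simulate a branching random walk and return the fraction of
    generation-n individuals with position <= sqrt(n) * sigma * y (sigma = 1).

    Offspring law: P(K = k) proportional to 1/k^2 on k = 1..50, a truncated
    stand-in for an infinite-mean ('explosive') law with p0 = 0.
    Displacements: i.i.d. standard normal steps.
    """
    rng = random.Random(seed)
    ks = list(range(1, 51))
    weights = [1.0 / k**2 for k in ks]          # truncated Zipf(2) weights
    positions = [0.0]                           # generation 0: one individual at 0
    for _ in range(n_gen):
        nxt = []
        for x in positions:
            k = rng.choices(ks, weights=weights)[0]
            for _ in range(k):
                nxt.append(x + rng.gauss(0.0, 1.0))
        positions = nxt
        if len(positions) > max_pop:
            # uniform subsample to keep the run tractable; the empirical
            # fraction below the cutoff is approximately preserved
            positions = rng.sample(positions, max_pop)
    cutoff = math.sqrt(n_gen) * y
    return sum(1 for x in positions if x <= cutoff) / len(positions)

if __name__ == "__main__":
    # Repeat the experiment: the fractions should cluster toward 0 or 1
    # across replicates, rather than concentrating at Phi(0) = 0.5.
    fracs = [simulate_fraction(n_gen=10, y=0.0, seed=s) for s in range(8)]
    print([round(f, 3) for f in fracs])
```

With a finite truncation and small $n$ the clustering is only approximate, so the printed fractions should be read qualitatively; the theorem concerns the $n \to \infty$ limit.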

6.3.3 The Proof of Theorem 6.4

(a) Let $-\infty < y_1 < y_2 < \infty$ be any two fixed real numbers. Then,
$$P\Bigg(\frac{Z_n(\sqrt{n}\,\sigma y_1)}{Z_n} \le \frac{Z_n(\sqrt{n}\,\sigma y_2)}{Z_n}\Bigg) = 1.$$

So,
$$P\Bigg(\frac{Z_n(\sqrt{n}\,\sigma y_1)}{Z_n} = 1,\ \frac{Z_n(\sqrt{n}\,\sigma y_2)}{Z_n} = 0\Bigg) = 0$$

for any $n = 1, 2, \dots$, and hence
$$P\Bigg(\frac{Z_n(\sqrt{n}\,\sigma y_1)}{Z_n} = 1,\ \frac{Z_n(\sqrt{n}\,\sigma y_2)}{Z_n} = 0\Bigg) \to 0$$

as n → ∞.

Also, by Theorem 6.3, we have that
$$\frac{Z_n(\sqrt{n}\,\sigma y_i)}{Z_n} \xrightarrow{\;d\;} \delta_i\big(\Phi(y_i)\big) \quad \text{as } n \to \infty,$$
where $\delta_i(\Phi(y_i))$ is a Bernoulli random variable with $P\big(\delta_i(\Phi(y_i)) = 1\big) = \Phi(y_i) = 1 - P\big(\delta_i(\Phi(y_i)) = 0\big)$ for $i = 1, 2$.

Therefore, as $n \to \infty$,
$$P\Bigg(\frac{Z_n(\sqrt{n}\,\sigma y_1)}{Z_n} = 0,\ \frac{Z_n(\sqrt{n}\,\sigma y_2)}{Z_n} = 0\Bigg) = P\Bigg(\frac{Z_n(\sqrt{n}\,\sigma y_2)}{Z_n} = 0\Bigg) \to 1 - \Phi(y_2)$$

and
$$P\Bigg(\frac{Z_n(\sqrt{n}\,\sigma y_1)}{Z_n} = 1,\ \frac{Z_n(\sqrt{n}\,\sigma y_2)}{Z_n} = 1\Bigg) = P\Bigg(\frac{Z_n(\sqrt{n}\,\sigma y_1)}{Z_n} = 1\Bigg) \to \Phi(y_1).$$

Moreover, since $\big(\delta_1(\Phi(y_1)),\ \delta_2(\Phi(y_2))\big)$ only takes values in the set $\{(0,0), (0,1), (1,0), (1,1)\}$,
$$P\Bigg(\frac{Z_n(\sqrt{n}\,\sigma y_1)}{Z_n} = 0,\ \frac{Z_n(\sqrt{n}\,\sigma y_2)}{Z_n} = 1\Bigg) \to \Phi(y_2) - \Phi(y_1).$$

Hence, (a) is proved.

(b) Let $k \in \mathbb{N}$ be any positive integer and $-\infty < y_1 < y_2 < \dots < y_k < \infty$ be any fixed real numbers. Let $i_1, i_2, \dots, i_k \in \{0, 1\}$. If there exist $l, m$ with $l < m$ such that $i_l = 1$ and $i_m = 0$, then
$$P\Bigg(\frac{Z_n(\sqrt{n}\,\sigma y_1)}{Z_n} = i_1,\ \frac{Z_n(\sqrt{n}\,\sigma y_2)}{Z_n} = i_2,\ \dots,\ \frac{Z_n(\sqrt{n}\,\sigma y_k)}{Z_n} = i_k\Bigg) = 0$$

for any $n = 1, 2, \dots$, and hence
$$P\Bigg(\frac{Z_n(\sqrt{n}\,\sigma y_1)}{Z_n} = i_1,\ \frac{Z_n(\sqrt{n}\,\sigma y_2)}{Z_n} = i_2,\ \dots,\ \frac{Z_n(\sqrt{n}\,\sigma y_k)}{Z_n} = i_k\Bigg) \to 0$$

as n → ∞.

Secondly, if $i_1 = i_2 = \dots = i_k = 1$, then
$$P\Bigg(\frac{Z_n(\sqrt{n}\,\sigma y_1)}{Z_n} = 1,\ \frac{Z_n(\sqrt{n}\,\sigma y_2)}{Z_n} = 1,\ \dots,\ \frac{Z_n(\sqrt{n}\,\sigma y_k)}{Z_n} = 1\Bigg) = P\Bigg(\frac{Z_n(\sqrt{n}\,\sigma y_1)}{Z_n} = 1\Bigg) \to \Phi(y_1)$$

as n → ∞, by Theorem 6.3.

Also, if $i_1 = i_2 = \dots = i_k = 0$, then, by Theorem 6.3 again,
$$P\Bigg(\frac{Z_n(\sqrt{n}\,\sigma y_1)}{Z_n} = 0,\ \frac{Z_n(\sqrt{n}\,\sigma y_2)}{Z_n} = 0,\ \dots,\ \frac{Z_n(\sqrt{n}\,\sigma y_k)}{Z_n} = 0\Bigg) = P\Bigg(\frac{Z_n(\sqrt{n}\,\sigma y_k)}{Z_n} = 0\Bigg) \to 1 - \Phi(y_k)$$

as n → ∞.

Moreover, if $i_1 = \dots = i_{j-1} = 0 < 1 = i_j = \dots = i_k$ for some $1 < j \le k$, then
$$P\Bigg(\frac{Z_n(\sqrt{n}\,\sigma y_1)}{Z_n} = 0,\ \dots,\ \frac{Z_n(\sqrt{n}\,\sigma y_{j-1})}{Z_n} = 0,\ \frac{Z_n(\sqrt{n}\,\sigma y_j)}{Z_n} = 1,\ \dots,\ \frac{Z_n(\sqrt{n}\,\sigma y_k)}{Z_n} = 1\Bigg)$$
$$= P\Bigg(\frac{Z_n(\sqrt{n}\,\sigma y_{j-1})}{Z_n} = 0,\ \frac{Z_n(\sqrt{n}\,\sigma y_j)}{Z_n} = 1\Bigg) \to \Phi(y_j) - \Phi(y_{j-1})$$

as $n \to \infty$, by (a). Therefore, the proof of part (b) is complete.
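The cases enumerated in parts (a) and (b) can be summarized as follows: the limiting finite-dimensional distributions are exactly those of the random step function $y \mapsto I(N \le y)$, where $N$ denotes a standard normal random variable (the notation $N$ is ours; the identification below simply restates the limits computed above):

```latex
% Finite-dimensional distributions of y -> I(N <= y), N ~ N(0,1), which match
% the limits established in (a) and (b):
\begin{align*}
P\big( I(N \le y_1) = i_1, \dots, I(N \le y_k) = i_k \big)
 = \begin{cases}
     \Phi(y_1), & i_1 = \dots = i_k = 1, \\
     1 - \Phi(y_k), & i_1 = \dots = i_k = 0, \\
     \Phi(y_j) - \Phi(y_{j-1}), & i_1 = \dots = i_{j-1} = 0,\ i_j = \dots = i_k = 1, \\
     0, & i_l = 1,\ i_m = 0 \text{ for some } l < m.
   \end{cases}
\end{align*}
```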

CHAPTER 7. OPEN PROBLEMS

7.1 Problems in Discrete-time Multi-type Galton-Watson Branching Processes

1. In the critical case, we are able to prove the results on the coalescence times using the convergence of the point process, but the problem regarding the limit behavior of the joint distribution of the generation number and the type of the last common ancestor and the types of the randomly chosen individuals remains open.

2. The coalescence problem in the explosive case is still open.

7.2 Problems in Continuous-time Single-type Bellman-Harris Branching Processes

1. For the proofs in Chapter 4, we impose the condition $\sum_{j=1}^{\infty} (j \log j)\, p_j < \infty$. Can we still prove the results without this hypothesis?

2. The coalescence problems in the critical case are open, including the results on the limit distributions of the generation number and the death time of the last common ancestor of the randomly chosen individuals.

3. Prove the direct Riemann integrability of $e^{-\alpha t}\xi_2(t)$ in Lemma 4.9 in the proof of Theorem 4.5 under some other sufficient conditions on the offspring and lifetime distributions.

4. The age chart for a continuous-time single-type Markov Bellman-Harris branching process, i.e., when the lifetime distribution $G$ is exponential with parameter $\lambda$.

5. The limit behavior of the generation number of the last common ancestor of the randomly chosen individuals is still open.

6. All the analogs of the coalescence problems in the explosive case are unknown.

7.3 Problems in Continuous-time Multi-type Bellman-Harris Branching Processes

1. To find the limit distribution of the generation number of the last common ancestor of the randomly chosen individuals, we assume that the lifetime distributions for individuals of different types are the same. Can we achieve this without imposing this hypothesis?

2. What happens to the limit distribution of the generation number of the last common ancestor in a continuous-time multi-type Markov Bellman-Harris branching process?

3. The coalescence problems in the critical, subcritical and explosive cases for a continuous-time multi-type Bellman-Harris branching process remain open.

4. The results on the limit behavior of the generation number of any randomly chosen individual from those alive at time $t$ in the critical, subcritical and explosive cases are still unknown.

5. Can we drop the hypothesis $E\big(\|Z_1\| \log \|Z_1\| \,\big|\, Z_0 = e_i\big) < \infty$ for all $1 \le i \le d$ for the proofs for a continuous-time multi-type Bellman-Harris branching process?
