Dynamics of Large Rank-Based Systems of Interacting Diffusions

Cameron Bruggeman

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Graduate School of Arts and Sciences

COLUMBIA UNIVERSITY

2016 c 2016 Cameron Bruggeman All Rights Reserved ABSTRACT

Dynamics of Large Rank-Based Systems of Interacting Diffusions

Cameron Bruggeman

We study systems of n dimensional diffusions whose drift and dispersion coefficients depend only on the relative ranking of the processes. We consider the question of how long it takes for a particle to go from one rank to another. It is argued that as n gets large, the distribution of particles satisfies a Porous Medium Equation. Using this, we derive a deterministic limit for the system of particles. This limit allows for direct calculation of the properties of the rank traversal time. The results are extended to the case of asymmetrically colliding particles. These models are of interest in the study of financial markets and economic inequality. In par- ticular, we derive limits for the performance of some Functionally Generated Portfolios originating from Stochastic Portfolio Theory. Table of Contents

List of Figures iii

1 Introduction 1 1.1 The Model ...... 1 1.2 The Problem and Intuition ...... 3 1.3 Existence of Strong and Weak Solutions ...... 5 1.4 Reflecting Brownian Motion and the Gap Process ...... 6 1.5 Stationary Distributions ...... 7 1.6 Outline of The Paper ...... 11

2 Convergence of Empirical Distributions 12 2.1 A Simple Case: Affine σ2(·)...... 12 2.2 Tightness of the Empirical Distributions ...... 19 2.2.1 Tightness of Measures on a Space of Measures ...... 21 2.2.2 Tightness of ItˆoProcesses ...... 22 2.2.3 Central Moment Processes ...... 24 2.3 Limits Satisfy the Porous Medium Equation ...... 27 2.4 Convergence of the Porous Medium Equation (PME) ...... 31 2.5 The General Case: Non-Affine σ2(·) ...... 32 2.6 Atlas Models ...... 33 2.6.1 Multi-Atlas Models ...... 34 2.7 Zipf’s Law ...... 36 2.8 Gini Index ...... 37

i 3 Limiting Particle Dynamics 44 3.1 Rank Dynamics ...... 46 3.2 Hitting Times ...... 47 3.3 Time Reversal ...... 52 3.4 Rank-Rank Correlations ...... 53 3.5 Atlas Models ...... 55 3.6 Martingale Models ...... 57

4 Asymmetric Collisions 60 4.1 Existence and Stationary Distributions ...... 60 4.2 Taking the Limit ...... 64

4.3 ρen is Tight ...... 65

4.3.1 Construction of K0 ...... 66

4.3.2 Construction of Kr ...... 69 4.4 Porous Medium Equation ...... 70

5 Local Time and Ranked-Based Portfolios 72 5.1 Limits of Local Time ...... 72 5.2 A Primer on Stochastic Portfolio Theory ...... 77 5.3 Rank Based Portfolios in Large Markets ...... 79

Bibliography 85

Appendix 88

ii List of Figures

2.1 Capital Distribution Curves ...... 36 2.2 Gini Index Change ...... 43

3.1 Rank-Rank Plot ...... 55 3.2 Rank-Rank Correlation over Time ...... 56

iii Acknowledgments

First and foremost, I would like to thank my adviser Ioannis Karatzas who introduced me to my research problem, and has provided encouragement and many helpful suggestions over the past four years. You were a constant source of new directions to push the research, while always allowing me the freedom to explore new ideas. I am grateful to all of the participants at the INTECH Research Meetings for their many helpful comments and incites. I’d like to thank Robert Fernholz for his many presentations and questions, all of which provided the basis for my intuition for the subject. I am especially grateful to Mykhaylo Shkolnikov. Your work formed the basis for the results in this thesis, and conversations with you consistently helped work past any sticking points that came up. I would like to thank my coauthors; Johannes Ruf, Andrey Sarantsev, Jos`eScheinkman and Alex Snyder. You helped to introduce me to the broader world of academia. I have had many great teachers along the way, without whom I would not have made it this far. I am grateful to my highschool teacher Robert Gleeson, for introducing me to math contests and always doing whatever he could to allow me to push further. I would like to thank all of my undergraduate professors. In particular Brian Forrest, Kathryn Hare, and Andu Nica. You introduced me to analysis and changed the way I thought about math. I am thankful to all my friends and teammates in the Columbia and Waterloo Frisbee com- munities. Dan Balzerson, Bryanne Root, Nate Hunter, Chris Sanderson, Alexander Isik: The experiences I’ve had on and off the field with you are some of my best memories. I’d especially like to thank Tim Gilboy. Your competitive drive continuously pushed me to be my best, and you’ve always been there to help celebrate the good times, or helped distract from the bad times. I am eternally grateful to my girlfriend Iryna Lozynska. You provided unwavering these last four years, even when I didn’t deserve it. You were always willing to listen to my mathematical ramblings, and helped keep me grounded when times got hard.

iv Finally, I would like to thank my family. From when you first read The Number Devil to me, you’ve always done everything you could to support my interests in Math. Thank you.

v To my parents

vi CHAPTER 1. INTRODUCTION

Chapter 1

Introduction

1.1 The Model

When attempting to model random phenomenon in real-world contexts, the model:

dY (t) = b(t, ω)dt + s(t, ω)dW (t) (1.1) is often used (E.g. transmission of signals through noise, random population dynamics, modeling city size, evolution of asset prices). A common choice of the drift and variance is given by:

dY (t) = bY (t)dt + sY (t)dW (t) (1.2) for constants b, s ∈ R. In addition to being easy to work with computationally, this model has the nice property of scale invariance, as well as almost surely remaining positive for all time, if it starts out at Y (0) > 0. This invariance allows us to use the same model regardless of the choice of units (for instance, currency in mathematical finance), as well as allowing the model to remain valid at a later time when inflation has occurred. Unfortunately, this model fails to capture empirically observed behaviour, such as the higher volatility of small stocks. If we wish to maintain scale invariance, while also having a notion of what it means to be “small”, we need to consider not just a single stock, but instead a collection Y1, ...Yn of stocks. We will take the approach of determining the size of a stock by considering its rank among all other stocks. More general approaches would be to determine the size of the stock by considering its market share (E.g. the volatility-stabilized models studied in [14, 24, 25]).

1 CHAPTER 1. INTRODUCTION

Given the processes Y1(·), ..., Yn(·), we define the ranked processes Y(1)(·) ≤ ... ≤ Y(n)(·) to be the capitalization processes ranked in ascending order. The rank of stock i at time t is denoted by ri(t), with ties broken lexicographically. So Yi(t) = Y(ri(t)) and ri(t) < rj(t) if Xi(t) = Xj(t) and i < j. We adopt the Rank-Based model of market capitalizations, introduced in [1]:

i dYi(t) = bri(t)Yi(t)dt + sri(t)Yi(t)dW (t) (1.3) where b1, ..., bn and s1, ..., sn are real valued constants. In what follows, we will consider Xi(t) = log Yi(t), the log-capitalization processes. An application of Itˆo’srule to the equation (1.3) gives:

i dXi(t) = gri(t)dt + sri(t)dW (t) (1.4)

1 2 where gi = bi − 2 si . The importance of rank in the dynamics of these systems leads to their being known as Competing Brownian Particle Systems. One is often concerned with the ranked particles

X(1), ..., X(n), and their dynamics can be written as:

1 dX (t) = g dt + s dW k + (dLk−1,k − dLk,k+1) (1.5) (k) k k t 2

k,k+1 where L denotes the local time of the process X(k+1)(·) − X(k)(·) with the understanding that L0,1 ≡ 0 ≡ Ln,n+1, and W k are independent Brownian Motions. Intuitively, we can think of the ranked particles as n particles living on the real line, that reflect off of each other when they collide. If one attempts to derive the dynamics of (1.5) from (1.4), higher-order local time terms appear; that is, local times of the difference of non-adjacent particles, corresponding to triple and higher- 2 order collisions [2]. It can be shown that in the model (1.4), with sk > 0 for all k, such higher-order collisions accumulate zero local time and so these terms disappear and we are left with (1.5) (see [19]). If u is not an integer, we denote

X(u) = X(buc).

There has been interest in the properties of Rank-Based models, such as the existence of higher- order collisions (e.g. [18, 27]), existence of stationary distributions and ergodicity (e.g. [1, 19]), and the shape of the empirical distribution of particles as n goes to infinity (e.g. [28, 7, 20]). Very little work, however, has been done on studying the dynamics of a single particle within a market-model of this sort, which is the main topic of this dissertation.

2 CHAPTER 1. INTRODUCTION

1.2 The Problem and Intuition

Question 1.2.1. Within model (1.4), how long does it take for a particle to go from rank k to rank `?

This could be used to answer the following:

• How long does it take for someone to go from Lower Middle Class to Upper Middle Class?

• What are the odds that a company falls out of the S&P 500 in the next 3 years?

• What effect does implementing a new tax have on social mobility?

Mathematically, if we choose i such that ri(0) = k and define T = inf{t ≥ 0 : ri(t) = `}, these questions are equivalent to:

• What is ET ?

• What is P(T < 3)?

0 0 • What effect does changing the drift terms from g1, ..., gn to as new set of constants g1, ..., gn have on the distribution of T ?

If we would like to model the distribution of wealth in a country, the exact value of n should not matter. To this end, we introduce functions µ, σ : [0, 1] → R, and define:

gk = µ(k/n) (1.6)

sk = σ(k/n). (1.7)

It is clear that given any sequence g1, ..., gn (resp. s1, ..., sn), one can find a function µ(·) (resp. σ(·)) satisfying (1.6) (resp. (1.7)); though the choice of functions is not unique. In this dissertation, we will look for limits as n → ∞, and so we treat the functions µ(·), σ(·) as given, because this will give us a unique system. Unless otherwise stated, we will make the following assumptions on µ(·), σ(·):

2 Assumption 1.2.2. The functions µ, σ : [0, 1] → R are continuous, and σ (·) > 0.

3 CHAPTER 1. INTRODUCTION

ri(t) Because a particle with rank ri(t) receives drift µ(ri(t)/n), it is the “percentile rank” n that is the important of where a particles ranks amongst the entire system. This is the same as considering Fn(Xi(t), t) where 1 F (x, t) = #{j : X (t) ≤ x} (1.8) n n j is the empirical distribution of the system of particles at time t. Then (1.4) is equivalent to:

i dXi(t) = µ(Fn(Xi(t), t))dt + σ(Fn(Xi(t), t))dWt (1.9)

We will also consider examples in which the drift coefficients are derived from a distribution, and not a function. In this case, we define the drifts gk by: Z gk = n µ(dr), k = 1, ..., n − 1 (1.10) k−1 k [ n , n ) Z gn = µ(dr). (1.11) n−1 [ n ,1] In the case when µ(dr) = µ(r)dr for some continuous function µ(·), the model (1.10) approximates the model (1.6) for large values of n.

The difficulty in answering Question 1.2.1 comes from the fact that the particle Xi(·) is not

Markovian when considered as a one-dimensional process, since its evolution depends on Fn(x, t).

However, to model Fn(x, t), we need to consider every particle (including Xi). Thus, we must study the entire system when trying to understand a single particle. It is possible to write down a PDE satisfied by the distribution of the stopping time T , which could then be solved numerically; n however, this PDE is defined on R , so modeling systems like the distribution of wealth in the United States is not feasible (n > 300, 000, 000), and more sophisticated techniques are required. Intuitively, when n is large, the movement of a single particle should not have an effect on the overall distribution of particles. We might expect, or hope, that for large systems, the randomness of the system disappears, and the distribution Fn(x, t) is close to something deterministic. If the system (1.9) starts from a stationary distribution, then the limit F (x) = limn Fn(x, t) should be invariant in time. If these limits do indeed hold, then the limit of (1.9) takes the form:

dX(t) = µ(F (X(t)))dt + σ(F (X(t)))dW (t). (1.12) the system (1.12) is just a one-dimensional Stochastic Differential Equation, and techniques for studying its dynamics are well understood.

4 CHAPTER 1. INTRODUCTION

The remainder of this chapter will review the theory for the model (1.4) for fixed n. Chapter 2 is dedicated to proving the convergence of Fn(X, t) to a deterministic F (x). The results are based heavily on techniques developed in the papers [20, 28]. In particular, the limit is shown to satisfy a certain partial differential equation, which is known to have a unique stationary solution. An example is given of a case when µ is a distribution, using the formulation (1.10), corresponding to the large n limit of the Simple Atlas Model. Chapter 4 is devoted to studying the results of Chapter 2, in the case when (1.5) has asymmetric collisions. This also includes a new interpretation of results about the stationary distribution of such systems for fixed values of n. Chapter 3 justifies the limit (1.12). Expressions are derived for the moments of the hitting times of this limit. The limit of a Simple Atlas Model is studied again, with the atomic part of µ manifesting itself as a local time term in (1.12).

1.3 Existence of Strong and Weak Solutions

Before attempting to talk about stable distributions or to take limits, we must first verify that solutions to (1.4) do indeed exist. By [3], we know that a weak solution exists for all time, and is unique in distribution. For much of this paper, weak solutions will be enough; however the techniques for demonstrating strong solvability are useful, and so we will summarize the known results here. Strong solutions are constructed as follows: while the particles are non-overlapping (that is

Xi(t) 6= Xj(t) for i 6= j), the system simply behaves as a system of independent Brownian motions with different drifts and variances. In the case of a collision between two particles, strong solutions continue to exist according to [10], allowing us to piece together a solution up until the first time that three particles collide simultaneously [18]. This argument can be extended to the following theorem:

Proposition 1.3.1. The system (1.4) has a strong solution up until the first time τ that a triple collision occurs:

τ = inf{t : ∃i < j < k : Xi(t) = Xj(t) = Xk(t)}.

It is an open problem, whether Strong Solutions continue to exist after such a time. Work has

5 CHAPTER 1. INTRODUCTION been done on finding conditions for the absence of Triple Collisions (see e.g. [18], [27]) culminating in the following theorem:

Proposition 1.3.2 ([27]). The system (1.4) almost surely avoids triple collisions if, and only if, 2 the sequence k 7→ sk is concave.

That the above criterion does not involve the drifts is unsurprising; triple collisions are inherently a local phenomenon, and Girsanov’s Theorem allows for the removal of drift in such situations 2 (assuming all the σk are strictly positive).

1.4 Reflecting Brownian Motion and the Gap Process

The study of triple collisions and the existence of stationary distributions both rely on considering the gap process

Z(·) = (X(2)(·) − X(1)(·), ..., X(n)(·) − X(n−1)(·)). (1.13)

n−1 n−1 This process lies entirely in the positive orthant R+ . In the interior of R+ , Z acts as a Brownian Motion with covariance matrix:

 2 2 2  s1 + s2 −s2 0 ... 0    2 2 2 2   −s2 s2 + s3 −s3 ... 0  A =   , (1.14)  ......   . . . . .    2 2 0 0 0 . . . sn−1 + sn and drift vector b where:

bk = gk+1 − gk. (1.15)

When Z(·) hits one of the faces Fk = {zk = 0}, it reflects in the direction

1 r = e − (e − e ), (1.16) k k 2 k+1 k−1

n−1 n−1 where e1, ..., en−1 is the standard basis for R (and e0 = en = 0 ∈ R ). In this setting, a triple collision occurs if Zk(t) = 0 = Zk+1(t) for some k = 0, ..., n − 2 and t ∈ [0, ∞). If there is a triple collision, then the process Z hits an edge Ek,k+1 = Fk ∩ Fk+1 where Fk = {x : xk = 0}, and it is no longer immediately clear how the reflection occurs. It is shown in [2] that the dynamics of Z can be written down explicitly using the local times of Z at all possible intersections of the faces

6 CHAPTER 1. INTRODUCTION

Fk. However, all of these higher-order local time terms are indentically zero under the assumption that the matrix R with columns rk is completely-S as defined below; see [19] for the details.

Definition 1. A square matrix Q is called completely S if for each principal submatrix Qe of Q, there is a ye ≥ 0 such that Qeye > 0, where the inequalities are interpreted component-wise.

For a complete discussion of such matrices, see [26]. It is an easy exercise to check that the matrix R, with columns r1, ..., rn−1, satisfies the completely-S condition, and so we may write:

Z(t) = Z(0) + B(t) + bt + R ·L(t), 0 ≤ t < ∞,

Zk where Lk(t) = L0 (t) is the local time at zero of the process Zk, and B(·) is a Brownian Motion with covariance matrix A. The study of this reflecting Brownian Motion will be crucial in the development of stationary distributions for the Competing Brownian Particle Systems, both in the following section and in subsequent chapters.

1.5 Stationary Distributions

Since the systems described in (1.9) are all translation invariant (corresponding to scale invariance X (t) of the processes Yi(t) = e i ), there is no hope of finding a stationary distribution for the locations of the particles. Instead, we look at the gaps between consecutive particles. For k = 1, ..., n − 1, we let Zk = X(k+1) − X(k) as in (1.15). Before proceeding with any theorem, we can easily convince ourselves that the following will be a necessary condition for stability to be possible:

Assumption 1.5.1. For any 1 ≤ k < n, we have

g1 + ... + gk − kg¯ > 0, (1.17) where −1 g¯ = n (g1 + ... + gn). (1.18)

Expanding the definition ofg ¯ using (1.18), the condition (1.17) becomes:

k k n Pk Pn X X X g` g` n g > k g + k g , or equivalently, `=1 > `=k+1 . ` ` ` k n − k `=1 `=1 `=k+1

7 CHAPTER 1. INTRODUCTION

Thus, the condition (1.17) is equivalent to saying that the average drift of the bottom k particles is greater than the average drift of the top n − k particles, for each k = 1, ..., n − 1. If this condition were to fail, then it would be possible for the top n − k particles to drift away from the bottom, which would ruin any hope for the existence of a stationary distribution.

Ifg ¯ = 0, then (1.17) becomes g1 + ... + gk > 0, which is equivalent to condition (3.3) in [19]. Note that when one is only concerned with the relative position of the particles, centering the drifts makes no difference, so we may safely assume thatg ¯ = 0. We have the following theorem from [1, 19]:

Theorem 1.5.2. Under Assumption 1.5.1, the process (Z1(·), ..., Zk(·)) has a stationary distribu- tion.

2 n−1 Following [1, 19], we define for any function f ∈ C (R+ ), the differential operators:

n−1 2 n−1 1 X ∂ f X ∂f n−1 [Af](z) = ak,` (z) + bk (z), z ∈ R+ (1.19) 2 ∂zk∂z` ∂zk k,`=1 k=1

[Dkf](z) = hrk, ∇i, z ∈ Fk, k = 1, ..., n − 1 (1.20) where ak,` are the entries in the matrix (1.14) and rk are given in (1.16). If ν is the invariant distribution of Theorem 1.5.2, then Lemma 2 in [19] says that there exist measure ν0k on Fk such 2 n−1 that for any Cb (R+ ):

n−1 Z 1 X Z [Af](z)dν(z) + [Dkf](z)dν0k(z) = 0. (1.21) n−1 2 R+ k=1 Fk The equation (1.21) is known as the Basic Adjoint Relation (BAR). In general, this is not an 2 easy equation to solve. However, there exists an explicit solution when the sequence k 7→ σk is affine, that is: 2 2 2 2 2 2 s2 − s1 = s3 − s2 = ... = sn − sn−1. (1.22)

Theorem 1.5.3 ([19]). Under the assumption (1.22), the stationary distribution ν of Z is given by: n−1 Z  Y −1  ν(A) = ak exp − ha, zi dx. (1.23) k=1 A

8 CHAPTER 1. INTRODUCTION

where a = (a1, ..., an−1) is given by

2 2 sk + sk+1 ak = . (1.24) 4(g1 + ... + gk − kg¯)

In other words, under the stationary distribution, the one dimensional marginal of Zk is an expo- nential distribution with mean ak. Furthermore, the gaps Z1, ..., Zn−1 are independent under the stationary distribution.

The assumption of affinity of the variances is implied by the affinity of the function σ2(·) in (1.9). Conversely, if the variances are affine for all values of n, then this is equivalent to the affinity 2 of σ (·) due to continuity. If we let n get large, then if k ∼ κn for some κ ∈ (0, 1), the value of ak can be approximated by:

σ2(κ) a ∼ n−1 , k 2Θ(κ) − κΘ(1) where Z r Θ(r) := µ(x)dx. (1.25) 0 We see from this that the size of the gaps shrinks as n−1, and so the total distance from ranks αn to βn should be on the order of |β −α|n·n−1 = |β −α|. In Chapter 2, we will make this observation rigorous and find this limiting distance as a function of α, β. Since we do not want our system of particles to have a net drift (i.e., we wantg ¯ = 0), the Assumption 1.5.1 has the following limiting version:

Assumption 1.5.4. Θ(1) = 0 and Θ(u) > 0 for u ∈ (0, 1). Furthermore, µ(·) is continuous, and µ(0), µ(1) 6= 0.

In the case of model (1.10), we define Θ by:

Z Θ(r) = µ(dr) = µ([0, r)), r ∈ [0, 1) (1.26) [0,r) Z Θ(1) = µ(dr) = µ([0, 1]). (1.27) [0,1]

In this setting, Assumption 1.5.4 becomes simply:

9 CHAPTER 1. INTRODUCTION

Assumption 1.5.5. In the setting of (1.10), the function Θ defined in (1.26) satisfies Θ(1) = 0, and Θ(u) > 0 if u ∈ (0, 1).

Lemma A.0.1 says that Assumption 1.5.4 implies Assumption 1.5.1 for large enough values of n. If we are in the setting of the distributional model of (1.10), then Assumption 1.5.5 implies Assumption 1.5.1 trivially for all n. Since we are concerned with taking the limit as n → ∞, we will always assume that n is taken large enough to imply Assumption 1.5.1.

Note 1.5.1. Assumption 1.5.4 is a condition saying that the lower ranked particles get pushed up, and the higher ranked particles are pushed down. The assumption that µ(0), µ(1) 6= 0 implies µ(0) > 0 > µ(1) since Θ > 0. A bound of µ(u) ≥ 0 for u ∈ [0, ) and µ(u) ≤ 0 for u ∈ (1 − , 1] would be sufficient to imply Assumption 1.5.1. We will see in Proposition 2.1.1 that the position of the top ranked particle is on the order of:

−1 Z 1−n 1 X(n)(t) ∼ C du. (1.28) 1/2 |µ(1)|(1 − u)

If µ(u) → µ(1) 6= 0 as u → 1, then this integral is on the order of log n. We will make frequent use of the growth rate in (1.28) in the remainder of this dissertation, and so we will remain in the setting of Assumption 1.5.4. As an example of what could go wrong, consider if µ(u) ∼ (1 − u)2 for u ∼ 1. Then the location of the top ranked particle is on the order of n. When passing to the true stock prices Y (t) = exp(X(t)), this corresponds to the largest Y(n)(t) having a positive fraction of the total wealth. This concentration of wealth in one individual does not work well with our plan of taking a continuous limit. While this is not immediately fatal to our methods, we find that assuming Assumption 1.5.4 leaves a sufficiently interesting and useful problem.

In the remainder of the paper, we will make the following assumptions, unless otherwise noted:

Assumption 1.5.6. The initial gaps (X(k+1)(0)−X(k)(0)) are distributed according to the station- ary distribution, and the median particle X(n/2)(0) = 0 begins at the origin.

Note that this latter assumption is just a centering condition; plays no role in studying the rank hitting times.

10 CHAPTER 1. INTRODUCTION

1.6 Outline of The Paper

In Chapter 2, we prove the convergence of the empirical distribution function (x, t) 7→ Fn(x, t) in (1.8), to a deterministic limiting distribution. In the case where σ2(·) is affine, the precisely known form of the stationary distribution allows us to take the limit using elementary techniques (see Theorem 2.1.3). When σ2(·) is not affine, we argue that a weak limit point must exist in some space, and that the limiting distribution F (x, t) must satisfy a certain Porous Medium Equation (see e.g. [28, 20] for similar arguments). Convergence and uniqueness results from [20] are then used to show that the limit is unique and stationary in time. The case of Simple Atlas model (see [1],[19]) is studied as an example of the model (1.10). Since σ2(·) is affine in these models, the method of proof is the same as in Theorem 2.1.3. Finally, we discuss Zipf’s Law and Gini Coefficients for the limiting distributions. Chapter 3 makes rigorous the limiting dynamics in (1.12). We study the rank process R(t) = F (X(t)) for this limit; finding a simple form for its dynamics in (3.7). The hitting times of R(t) (which are the limiting answer to Question 1.2.1) are studied, and a recursive expression is found for the moments of these hitting times. Similar results are obtained for the Atlas Model, where the Atlas stock behaves as a hard barrier in the limit, resulting in a local time term appearing in (1.12). In Chapter 4, we consider systems of particles with asymmetric collisions (see [21]). A new approach is taken to infer the mass of the particles (resulting in asymmetric collisions), which leads to a natural generalization of Assumption 1.5.1 for the existence of a stationary distribution of the gaps. In Theorem 4.4.3, we prove a convergence result for Fn using arguments similar to Theorem 2.5.1. In Chapter 5, we study the convergence of the local time interactions between adjacent particles

(that is, the local time of the gap process Zk). A deterministic limit is obtained in Theorem 5.1.1. This limit is then used to consider the large market limits of rank-based portfolios developed in [12].

11 CHAPTER 2. CONVERGENCE OF EMPIRICAL DISTRIBUTIONS

Chapter 2

Convergence of Empirical Distributions

This chapter is devoted to the convergence of the Empirical Distribution Fn(x, t) as defined in (1.8). We begin with the simplest case, where µ(·), σ(·) are continuous functions on [0, 1] and σ2(·) is affine. Later sections will develop techniques to manage the convergence when σ2(·) is non-affine, when the function µ(·) is replaced by a measure µ (e.g. in the case of Atlas Models) and when there are asymmetric collisions between particles. We introduce the empirical quantile function

−1 qn(·, t) = Fn (·, t) (2.1) defined on (0, 1] as the inverse of the empirical distribution. The quantile function satisfies the relation

qn(u, t) = X(un)(t). (2.2)

2.1 A Simple Case: Affine σ2(·)

We first present the case where the variance function σ2(·) is assumed to be affine. In this setting, we know from Theorem 1.5.3 that the stationary distribution for the process of gaps between consecutive particles is the product of independent exponential distributions with known mean. This exact form allows us to prove a Law of Large Numbers result, and directly calculate the weak limit of the empirical distribution.

12 CHAPTER 2. CONVERGENCE OF EMPIRICAL DISTRIBUTIONS

2 Proposition 2.1.1. For u ∈ (0, 1), the quantiles qn(u) := qn(u, 0) in (2.1) converge both in L and in probability as n → ∞, to 1 Z u σ2(x) q(u) = dx, (2.3) 2 1/2 Θ(x) R x where Θ(x) = 0 µ(r)dr is as in (1.25).

Proof. From Assumption 1.5.6, we know that X(n/2)(0) = 0, and so we have

bunc X n qn(u) = ξk , k=n/2

n where the ξk = X(k+1)(0) − X(k)(0) has an exponential distribution with mean

σ( k )2 + σ( k+1 )2 an = n n k 4(µ(1/n) + ... + µ(k/n) − kµ¯) as described in Theorem 1.5.3, where

µ¯ := n−1(µ(1/n) + ... + µ(n/n)).

n Further, the ξk are independent by Theorem 1.5.3. The expected value of qn(u) is then given by:

bunc X n Eqn(u) = ak . (2.4) k=n/2

Let an(v) = nan = nan where k ≤ v < k+1 . Then: kn(v) k n n

σ2( kn(v) ) + σ2( kn(v)+1 ) 2σ2(v) + o(1) σ2(v) an(u) = n n n = = + o(1) = q0(v) + o(1), 1 kn(v) µ( n )+...+µ( n ) 4(Θ(v) + o(1)) 2Θ(v) 4n n where the o(1) terms all converge to 0 as n → ∞. Since Θ(v), σ2(v) are both bounded away from both 0 and ∞ for all v ∈ (1/2, u), we can see that an(v) is uniformly bounded for all n, v. Rewriting (2.4) gives:

bunc Z u Z u X n n Eqn(u) = ak = a (v)dv + o(1) → q(v)dv. k=n/2 1/2 1/2

The first equality follows from the definition of an, and the o(1) term corrects for the error if u 6= k/n for some k. The convergence follows from Dominated Convergence Theorem.

13 CHAPTER 2. CONVERGENCE OF EMPIRICAL DISTRIBUTIONS

Next, we will show that V ar(qn(u)) → 0, which will allow us to conclude the proposition by applying Chebychev’s Inequality. Since qn(u) is a sum of independent random variables, its variance is the sum of the variances. Since:

2 n n 2 −2 n  −2 V ar(ξk ) = (ak ) ≤ n sup a (ν) ≤ Cn ν∈(1/2,u) for some C depending on u, σ2, Θ, we can conclude that

un X −2 −1 V ar(qn(u)) ≤ Cn ≤ Cn → 0. k=n/2

We will show in later sections that this result also holds for non-affine σ2(·), as well as more general µ (e.g. for Atlas Models). The difficulty in applying the proof for non-affine σ2(·) is that we are unable to use an explicit form of the stationary distribution. Before proceeding, we give a Central Limit Theorem for the initial distribution in the affine case:

Lemma 2.1.2. Let q˜n(α) be the continuous linear interpolation of the quantile function qn(α, 0). Then the random sequence of continuous functions

√   α 7→ n q˜n(α, 0) − q(α) , n ∈ N, converges in distribution to the process:

α 7→ B(J(α)), where B(u), u ∈ (−∞, ∞) is a Brownian motion indexed by R with B(0) = 0, and: Z α σ4(r)dr J(α) := 2 dr, 0 < α < 1. 1/2 4Θ(r)

Proof. The proof is a simple modification of the proof of Donsker’s Theorem, and is omitted. See Theorem 2.4.20 in [23] for a detailed proof.

We now begin the task of arguing that the convergence of the distribution Fn(x, t) is uniform in both space and time. We will prove the following Theorem:

14 CHAPTER 2. CONVERGENCE OF EMPIRICAL DISTRIBUTIONS

Theorem 2.1.3. If σ2(·) is affine, then for every fixed T > 0, we have

sup Fn(x, t) − F (x) → 0 (2.5) x∈R,t∈[0,T ] in L2 and probability, where F = q−1 (2.6) and q is the quantile function given in (2.3).

2 We note that since Fn,F are bounded, convergence in probability and L are equivalent. From (2.3), we know that q is differentiable with derivative bounded from above and away from 0 on compact subsets of (0, 1). The Inverse Function Theorem and (2.6) then allow us to make the same conclusion for F on compact subsets of R.

So far, we have only argued that the quantiles qn converge pointwise. The next couple of lemmas will show how we can relate the pointwise convergence of quantiles, to uniform convergence of the Cumulative Distribution. We will first relate convergence of F to that of q. Next, we show how to extend this to uniform convergence in the spacial argument. Next, we show convergence for t > 0, argue uniform convergence in the temporal argument, and finally combine all of these results. For the remainder of this section, we will make no use of the affinity of σ2(·). When proving the non-affine version of the result, we will only prove pointwise convergence, and then invoke the remainder of this section to pass from pointwise to uniform convergence.

Note 2.1.1. We use the term pointwise convergence of qn(·, ·) → q(·, ·) to refer to convergence for fixed inputs u, t, and not to convergence for fixed ω. What we have proven above could be called “pointwise convergence in probability”

Lemma 2.1.4. For a fixed t ∈ [0, ∞), the convergence in probability Fn(x, t) → F (x) holds for each x ∈ R if and only if qn(u, t) → q(u) in probability for each u.

Proof. Without loss of generality, we may assume that t = 0. First, suppose that Fn(x, 0) → F (x) for each x ∈ R. Fix u ∈ (0, 1),  > 0. Then consider the event A(u, ) = {qn(u) > q(u) + } ⊂

{Fn(q(u) + ) < u}. Fixing δ > 0, we have with probability 1 − o(1) that:

0 0 Fn(q(u) + ) > F (q(u) + ) − δ = F (q(u)) + F (c) − δ = u + F (c) − δ (2.7)

15 CHAPTER 2. CONVERGENCE OF EMPIRICAL DISTRIBUTIONS for some c ∈ (q(u), q(u) + ). Since F 0 is bounded from below on (q(u), q(u) + ), we may pick δ small enough to force this last quantity to be strictly greater than u, and then we have:

0 u < u + F (c) − δ = F (q(u) + ) − δ < Fn(q(u) + ). (2.8)

But on A(u, ), Fn(q(u) + ) < u, and so (2.8) would imply u < u. Hence, A(u, ) and (2.7) must be mutually exclusive events. But (2.7) holds with probability 1 − o(1) and so P(A(u, )) =

P({qn(u) > q(u) + }) = o(1). We similarly bound the probability that qn(u) < q(u) − , and so the first direction is proven. The above argument only used the fact that the functions F, q are inverses of each other and the boundedness of first derivatives on compact sets. Since such bounds hold for both q and F , the argument for the reverse direction is the same.

Lemma 2.1.5. If G(·) is an increasing, bounded function on U ⊂ R, and Gn(·) is a random sequence of functions such that Gn(·) → G(·) in probability for each x ∈ U, then

sup |Gn(x) − G(x)| → 0 x∈U in probability.

Proof. Fix  > 0, and let x1 < ... < xK be points in U such that if x ∈ U, then there is k such that xk ≤ x ≤ xk+1 and G(x) − G(xk) <  and G(xk+1) − G(x) < . Note that the boundedness of G shows that m can be chosen to be finite. For each i = 1, ..., K, we have

P(|Gn(xi) − G(xi)| < ) = 1 − o(1).

Since m is finite, we can conclude that

P(|Gn(xi) − G(xi)| < , ∀i = 1, ..., K) = 1 − o(1).

On this event, we have

Gn(x) − G(x) ≤ Gn(xk+1) − G(x)

≤ Gn(xk+1) − G(xk+1) +  ≤ 2.

With an identical lower bound, we get supx∈U |Gn(x)−G(x)| < 2 with probability 1−o(1).

16 CHAPTER 2. CONVERGENCE OF EMPIRICAL DISTRIBUTIONS

These lemmas allow us to get uniformity in space. However, it will be easier to first show uniformity in the temporal argument for a fixed point x, and then show uniformity in the spacial argument.

Lemma 2.1.6. For any fixed u ∈ (0, 1) and t ∈ [0, ∞), qn(u, 0) → q(u) in probability, if and only if qn(u, t) → q(u) in probability.

Proof. Since q(u) is deterministic, convergence in probability is a statement about the distributions of qn(u, t), and not of the underlying measure space. We know that qn(u, t)−q¯n(t) ∼ qn(u, 0)−q¯n(0) R in distribution, whereq ¯n(t) = qn(u, t)du is the average position of the particles at time t. This is because qn(u, t) − q¯n(t) can be written in terms of the gap process at time t, and the distribution of gaps is invariant in time by Assumption 1.5.6. The processq ¯n(t) can be written as:

Z 1 n n −1 X −1 X h ki q¯n(t) = qn(r, t)du = n X(k)(t) = n X(k)(0) + µ(k/n)t + σ(k/n)Wt 0 k=1 k=1 Z 1 n   −1 X k =q ¯n(0) + µ(u)du + o(1) t + n σ(k/n)Wt . 0 k=1

R 1 Since 0 µ(u)du = 0, the second term is o(1) as n → ∞. The quadratic variation of the martingale term is given by:

n n D −1 X kE −2 X 2 −1 n σ(k/n)Wt = n σ (k/n)t ≤ n kσk∞t = o(1). k=1 k=1

Because of this, we can writeq ¯n(t) =q ¯n(0) + oP (1) and hence qn(u, t) is distributed as

qn(x, t) ∼ qn(x, 0) + oP (1) where oP (1) is a term converging to zero in probability. Hence, qn(u, t) → q(u) in probability if and only if qn(0, t) → q(u) in probability.

Corollary 2.1.7. Fn(x, t) → F (x, t) pointwise in probability for all fixed x, t.

Proof. Pointwise convergence of qn was proven in Lemma 2.1.6, and Lemma 2.1.4 allows us to convert this to a statement about pointwise convergence of Fn.

17 CHAPTER 2. CONVERGENCE OF EMPIRICAL DISTRIBUTIONS

Lemma 2.1.8. For fixed x ∈ R and T ∈ (0, ∞), we have

lim sup |Fn(x, t) − F (x)| = 0 n t≤T in probability.

Proof. We have already seen in Proposition 2.1.1 and Lemma 2.1.4, that this holds for fixed times t. We let ψ(·) be a function with ψ(y) = 1 for y ≤ x, with ψ(y) = 0 for y ≥ x + δ, and with ψ smooth in between the two (i.e. ψ is a smooth approximation of 1(−∞,x]). Letting:

−1 X Zt = n ψ(Xi(t)), 0 ≤ t < ∞, i we have Fn(x, t) ≤ Zt ≤ Fn(x + δ, t). Applying Itˆo’sRule to this process, together with (1.9), we obtain:

n −1 X 0 i dZt = n ψ (Xi(t))σ(Fn(Xi(t), t))dW i=1 X  1  +n−1 ψ0(X )µ(F (X , t)) + ψ00(X )σ2(F (X , t)) dt. i n i 2 i n i i

0 00 Since ψ , ψ , σ, µ are all bounded, the drift term is bounded in magnitude by C1. The martingale −1 0 2 2 term Mt has variance bounded above by n kψ k∞kσ k∞, for any fixed s, S:     P sup Zt − Z0 ≥  ≤ P sup Mt − M0 ≥  − C1S s≤t≤s+S t≤S

1 2 ≤ 2 E(Ms+S − Ms) ( − C1S) C2S −1 ≤ 2 n . ( − C1S)

−1 Taking S = /(2C1), we see that this probability is O(n ).

We now let sk = kS for k = 1, ..., T/S. From Corollary 2.1.7, we know that for each k, we have |Fn(x + δ, sk) − F (x + δ)| <  with probability 1 − o(1). Further, we have just argued that, with probability 1 − o(1), the process Zt changes by less than  on the interval [sk, sk+1]. Since the number of intervals is T/S, a constant independent of n, we have shown that these bounds hold

18 CHAPTER 2. CONVERGENCE OF EMPIRICAL DISTRIBUTIONS for all k with probability 1 − o(1). This gives:

h i sup Fn(x, t) ≤ sup Zt = max Zt + sup Zt − Zt k k k t≤T t≤T k,t∈[sk,sk+1)

≤ Fn(x + δ, tk) +  ≤ F (x + δ) +  +  = F (x + δ) + 2.

By taking δ small and using the continuity of F , we can make:

sup Fn(x, t) ≤ F (x) + 3. t≤T

We can similarly lower bound Fn(x, t) and so we’re done.

Note 2.1.2. A careful reading of the proof of Lemma 2.1.8 reveals that the assumption µ ∈ L∞ is a much stronger condition than necessary. Since ψ0 ∈ L∞, in order to bound the drift term we need only upper bound n−1(|µ(1/n)| + ... + |µ(n/n)|). In the setting of (1.10), this is equivalent to R |µ(dr)| < ∞ (ie. µ having finite total variation). This weakened assumption will allow the result to follow through in the context of Atlas Models.

We are now in a position to prove Theorem 2.1.3.

Proof of Theorem 2.1.3. On the strength of Lemma 2.1.8, supt≤T Fn(x, t) and inft≤T Fn(x, t) con- verge to F (x) in probability. Since they both are increasing in x, we can apply Lemma 2.1.5 to con- clude that both the supremum and infemum converge F (x) uniformly in x, and so Fn(x, t) → F (x) uniformly.

2.2 Tightness of the Empirical Distributions

We now turn our attention to the case when σ2(·) is not affine. In this section, we will argue the existence of limit points for the sequence Fn of empirical distributions. In the following sections, we will show that there is a unique limit point. For any metric space X, we let P(X) denote the set of all probability measures on the σ-algebra B(X) of Borel sets of X. For any set A ⊂ X, and  > 0 define

A := {p ∈ X : ∃q ∈ A, d(p, q) < }.

We give P(X) the Levy-Prokhorov distance

π(µ, ν) := inf{ > 0|µ(A) ≤ ν(A) + , ν(A) ≤ µ(A) +  : ∀A ∈ B(X)}.

19 CHAPTER 2. CONVERGENCE OF EMPIRICAL DISTRIBUTIONS which metrizes the topology of weak convergence of probability measures in P(X). Furthermore, P(X) is separable if X is separable.

For any fixed t, ω, the empirical distribution Fn(dx, t, ω) as in (1.8) is an element of MR := P(R), the space of probability measures on the real line. The mapping t 7→ Fn(dx, t, ω) is then an element of C([0,T ], MR); the space of continuous mappings from [0,T ] to probability measures on the real line. The mapping ω 7→ Fn(dx, ·, ω) then induces a measure πn on the space P(C([0,T ], P(R))). If we could argue that {πn}n∈N is a tight sequence of measures, then it would have limit points by Prokorov’s Theorem. This line of argument is taken up in Chapter 4. For this chapter, we will slightly modify the approach, giving a stronger result with a more natural proof. Instead of considering the mapping t 7→ Fn(dx, t) as an element of C([0,T ], MR), we define n n n −1 X H (ω) = H := n δXi(·,ω) ∈ P(C([0,T ])) =: MT . (2.9) i=1 Then Hn is the empirical distribution of the entire trajectories of the particles, not just their n positions at a fixed time t. The mapping ω 7→ H (ω) then induces a measure ρn ∈ P(MT ). The n distribution Fn is only a function of the ranked particles (X(1), ..., X(n)), while H depends also on n the names; hence H contains more information than the Fn.

To show that {ρn}n∈N is tight in P(MT ) we will slowly strip away the layers of abstraction.

First, we show that the tightness of {ρn}n∈N is equivalent to the tightness of a sequence of measures

τn ∈ MT . We will then argue that the tightness of this sequence {τn}n∈N is equivalent to a sequence of measures νn ∈ P(R).

The random measure ρn naturally induces a measure τn on C([0,T ]) as follows: First, draw n n a random measure H according to ρn and then draw a random path X(·) according to H . In our setting, this is equivalent to first randomly letting the system of n particles evolve, and then randomly, uniformly picking one of the n particle trajectories.

More generally, given any metric space X, and a measure ρn ∈ P(P(X)), one can generate a measure τn ∈ P(X) according to the following:

Z Z  Z  f(x)τn(dx) := f(x)ν(dx) ρn(dν) (2.10) X P(X) X for any measurable function f : X → R.

20 CHAPTER 2. CONVERGENCE OF EMPIRICAL DISTRIBUTIONS

We will argue that if {τn}n∈N is tight, then {ρn}n∈N is as well. This will allow us to work in the more familiar setting of measures on the space of continuous functions, instead of measures on a space of measures. In the next subsection, we will argue that the tightness of the measures on continuous diffusions is implied by the tightness of the initial distribution of these diffusions. Finally, in the last subsection we will argue this tightness by considering the empirical central moment processes. The first two subsections are similar to the proof of Lemma 1.3 in [20].

2.2.1 Tightness of Measures on a Space of Measures

For any measure ν on a metric space X, and measurable function f : X → R, define: Z νf = f(x)ν(dx). (2.11)

Lemma 2.2.1. Suppose X is a separable metric space, and P(X) is given the topology of weak convergence. Let ρn be a probability measure on P(X) (so ρn ∈ P(P(X))), and let νn ∈ P(X) be a realization of ρn. The measure ρn induces a measure τn ∈ P(X) by (2.10) for any function f on

X. Then if {τn}n∈N is tight, {ρn}n∈N is tight as well.

Proof. Fix  > 0. To show that ρn is tight in P(P(X)), we need to find a compact set K¯ ⊂ P(X) such that ρn(K¯ ) > 1 −  for all n. Since X is separable, Prokhorov’s Theorem tells us that the set K ⊂ P(X) is tight if and only if its closure K¯ is compact. Let r & 0, and Kr ⊂ X be compact sets. We then define K by:

K = {ν ∈ P(X): ν(Kr) > 1 − r, : ∀r}.

¯ K is tight in P(X) and hence K is compact. Since {τn}n∈N is a tight sequence of measures, for each r ∈ N, we can choose Kr such that:

τn(Kr) > 1 − wr where wr & 0 will be chosen later. We can then write: Z ρn E νn(Kr) := νn(Kr)ρn(dνn) = τn1Kr = τn(Kr) > 1 − wr. (2.12) P(X)

21 CHAPTER 2. CONVERGENCE OF EMPIRICAL DISTRIBUTIONS

The quantity νn(Kr) can be viewed as a random variable taking values in [0, 1], and using (2.12) and Markov’s Inequality applied to 1 − νn(Kr), we get the bound

ρn r − wr P (νn(Kr) ≥ 1 − r) ≥ . (2.13) r −r By picking wr small enough, the bound in (2.13) can be made at most 2 . And so by summing these probabilities, we can conclude that ρn(K¯ ) ≥ ρn(K) > 1 − .

Let Xn(·) be a random continuous function chosen by first picking an instance of the n particle n system (X1(·), ..., Xn(·)), and then picking k at random from {1, ..., n} and setting X = Xk. Then n X is distributed according to τn, yielding the following corollary.

n Corollary 2.2.2. If the sequence of ItˆoProcesses {X (·)}n∈N is a tight sequence of random vari- ables in C([0,T ]), then the sequence of measures ρn is tight in P(MT ).

Corollary 2.2.2 allows us to only consider the tightness of diffusions, instead of the tightness of measures.

2.2.2 Tightness of ItˆoProcesses

We now give conditions that guarantee the tightness of a sequence of ItˆoProcesses. For the remainder of the section, we will assume that Xn(·) is a continuous, real valued process on [0,T ], n such that X (0) is distributed as πn and Z t Z t n n X (t) = X (0) + bn(s, ω)ds + sn(s, ω)dWs, 0 0 where bn, sn are progressively measurable.

Lemma 2.2.3. Suppose that bn, sn are uniformly bounded for all n, ω, t, and that {πn}n∈N is a n tight sequence of measures. Then the sequence {X (·)}n∈N is a tight sequence of random continuous functions.

n Proof. Fix  > 0. In order to show that {X (·)}n∈N is a tight sequence, we need to find a pre- compact set K ⊂ C([0,T ]) such that P(Xn ∈ K) > 1 −  holds for all n. According to the Arzela- Ascoli Theorem, K is precompact if it is uniformly bounded and equicontinuous. Fix M > 0,

r & 0 and let δr & 0 be chosen later. We then define the compact set:

22 CHAPTER 2. CONVERGENCE OF EMPIRICAL DISTRIBUTIONS

K = {X(·) ∈ C([0,T ]) : |X(0)| < M, sup |X(t) − X(s)| < r, ∀r}. s,t∈[0,T ],|s−t|<δr

Since {πn}n∈N is tight, we know we can pick M such that

P(|Xn(0)| > M) < /2. (2.14)

By Lemma 2.2.4, we may choose a δr > 0 such that for any fixed time s ≤ T :

n n −(r+2) −1 P( sup |X (t) − X (s)| > r/2) < 2 T δr. (2.15) s≤t≤s+δr

−1 By partitioning [0,T ] into T δr subintervals of length δr, and applying (2.15) to each subinter- val, we can argue that:

n n n  −(r+1) P sup |X (s) − X (t)| > r < 2 . (2.16) s,t∈[0,T ],|s−t|<δr Combining (2.14), (2.16) and summing over r we obtain the bound P(Xn ∈ K) > 1 − .

Lemma 2.2.4. In the setting of Lemma 2.2.3 there exists δr such that (2.15) holds for any s ≤ T and n ∈ N.

Proof. We fix n, and let  := r/2. Without loss of generality, take s = 0. Then we can prove (2.15) by proving the following:

P(Xn(·) leaves (X(0) − , X(0) + ) by time t) = o(t). (2.17)

By Doob’s and Burkholder-Davis-Gundy inequalities, we can bound (2.17) by:

p  n  EhXit P X (·) leaves (X(0) − , X(0) + ) by time t ≤ Cp p ( − kbk∞t) p p/2 −p ≤ Cpkσk∞t ( − kbk∞t) , for some constant Cp. Note that  is fixed, and so as t → 0, the denominator stays bounded away from zero. Thus, by picking p = 4, we see that as t → 0, this bound is O(t2) = o(t) as desired.

23 CHAPTER 2. CONVERGENCE OF EMPIRICAL DISTRIBUTIONS

n Corollary 2.2.5. If the sequences bn, sn are both uniformly bounded, and {X (0)}n∈N is a tight  n sequence of random variables, then the ItˆoProcesses X (·) n≥1 are a tight sequence of random continuous functions, and hence the associated empirical distributions in MT form a tight sequence.

2.2.3 Central Moment Processes

In order to apply Corollary 2.2.5 to our setting, we need to argue that the distributions of Xn(0) form a tight sequence. In the case of affine variance, this is a simple calculation using the exact form of the stationary distribution of the gaps. We have no such representation for non-affine variance, and so other techniques must be developed. n Since we assume that X(n/2)(0) = 0, the median of the distribution of X (0) is zero. So to get tightness, we need to find a universal bound on how “spread out” the particles are. To this end, we fix p ≥ 1 and consider consider the empirical central moment process given by:

p −1 X p V (t) = n X(k)(t) − X(t) (2.18) k We consider the central moments, as opposed to just the moments, because the central moments can be written in terms of the gap process, and is a measure of “spread”, as opposed to just being a measure of size. If p is an even integer, then applying Itˆo’srule to this process, we get the dynamics:

X p−1 1 dV p(t) =pn−1 X (t) − X(t) µ(k/n)dt + σ(k/n)dW k − dLk−1,k(t) + dLk,k+1(t) (k) t 2 k −1 X p−1 − pn X(k)(t) − X(t) dX(t) k p(p − 1) X n − 1  p−2 + n−1 σ(k/n)2 X (t) − X(t) dt. 2 n (k) k Because V p can be written in terms of named particles, we could apply Itˆo’srule to this formu- lation and would then see that the finite variation process would have to be absolutely continuous. This then tells us that the coefficients on the local time terms must be zero. One could also observe this through a direct calculation. The coefficient of dX(t) is:

p−1 −1 X  p−1 pn X(k)(t) − X(t) = pV (t). k

24 CHAPTER 2. CONVERGENCE OF EMPIRICAL DISTRIBUTIONS

−2 P 2 Since X(t) is a martingale, with quadratic variation n k σ(k/n) , we can bound the quadratic variation of V p(t) by

2(p−1) 2 −2 X 2  2 −2 X 2 (p−1) 2 dhV i(t) ≤ p n σ(k/n) X(k)(t) − X(t) dt + p n σ(k/n) V (t) dt. k k   This can be bounded by dhV i/dt ≤ n−12p2 max σ2 V 2(1−p)(t). The drift term can be written as:

p−1 −1 X   pn µ(k/n) X(k)(t) − X(t) k p(p − 1) X n − 1  p−2 + n−1 σ(k/n)2 X (t) − X(t) =: A(t) + B(t). (2.19) 2 n (k) k We now assume that p is a positive even integer, and so p − 1 is odd. By Lemma A.0.2, we may

find a constant C1 > 0 such that for sufficiently large n:

p−1 −A(t) ≥ C1V (t). (2.20)

It is also immediate that we can find a constant C2 > 0 such that:

p−2 B(t) ≤ C2V (t). (2.21)

Proposition 2.2.6. For any fixed p > 0, there is a K such that for large enough n,

n p E X (0)| ≤ K (2.22)

Proof. Without loss of generality, take p ≥ 4 to be an even integer. Since the existence of a pth moment implies all smaller moments, we lose nothing by making this assumption. For large enough values of n, the drift of V p(t) can be bounded above by

p−2 p−1 A(t) + B(t) ≤ C2V (t) − C1V (t). (2.23)

For any fixed 1 ≤ p, q, V p(t) and V q(t) are the pth and qth central moments respectively of the same distribution (the empirical distribution of the n particles). We hence have the following the inequality: (V p(·))q/p ≤ V q(·) (2.24)

25 CHAPTER 2. CONVERGENCE OF EMPIRICAL DISTRIBUTIONS for any 1 ≤ p ≤ q when q is an even integer. In general, the ratio between the pth and qth moment can grow arbitrarily large; however since for a fixed n, the empirical distributions cannot assign arbitrarily small mass to sets, we can write

q q pq/p p q/p V (t) ≤ max X(k)(t) − X(t) = max X(k) − X(t) ≤ nV (t) . k k

This then gives the bound V q(·) ≤ C . (2.25) (V p(·))q/p n,p,q Using (2.23), (2.24) and (2.25), we can bound the drift by:

(p−2)/p (p−1)/p  p   p  A(t) + B(t) ≤ C3,n V (t) − C4,n V (t) . (2.26)

Note here that, unlike C1,C2, there is a dependence on n of C3,n,C4,n. Since (p − 1)/p > (p − 2)/p, p p (p−1)/p for large enough values of V (t), the drift is bounded above by −C5,nV (t) . This tells us that V p(·) experiences a mean reversion when it gets large, and so we get:

V p(t) lim = 0. (2.27) t→∞ t We can also see that EV p(0) < ∞ for all p, though we have no guarantee that this bound will be uniform across all n. However, if we write:

V p(t) = M(t) + A(t) + B(t) where M(t) is the martingale term and A(t),B(t) are as in (2.19), then we have that:

M(t) lim = 0. (2.28) t→∞ t And hence, we get:

A(t) + B(t) h i 0 = lim = E A(0) + B(0) t t

From the bounds (2.20) and (2.21), we can conclude that EA(0) and EB(0) both exist, since EV p(0) exists. We are then justified in writing:

26 CHAPTER 2. CONVERGENCE OF EMPIRICAL DISTRIBUTIONS

p−1 p−2 C1EV (0) ≤ E(−A(0)) = EB(0) ≤ C2EV (0). (2.29)

Now invoking (2.24), we get:

1+1/(p−2) p−2 p−1 h p−2 (p−1)/(p−2)i  p−2  C2EV (0) ≥ C1EV (0) ≥ C1E V (0) ≥ C1 EV (0) C p−2 2 ≥ EV p−2(0). C1

This bound depends only on C1,C2, which have no dependence on n other than it being large. Hence, any central moment of Xn(0) may be bounded independently of n. To show that all moments are bounded, not just the central moments, we need to uniformly bound EXn(0). We know that n the median of X (0) is 0 (since X(n/2)(0) = 0), and we know that:

n E|X (0) − m| ≤ K, ∀n > n0.

Assume that m > 0. Since half the mass of Xn(0) is negative, we must have that:

m E|Xn(0) − m| ≥ . 2 This gives the bound m ≤ 2K, which completes the proof.

Proposition 2.2.7. The sequence ρn is tight in P(MT ).

Proof. From the above discussion, we know that the sequence of random variables Xn(0) is L1 bounded, and hence tight. The sequence of distributions ρn is then tight by Corollary 2.2.5.

2.3 Limits Satisfy the Porous Medium Equation

In this section, we will discuss the convergence of random measure Fn(·, ω) defined in (1.8). We say that the random measures Fn(·, ω) converge to a random measure H(·, ω) weakly almost surely, to mean that:

P(ω : Fn(·, ω) → H(·, ω) weakly) = 1. (2.30)

27 CHAPTER 2. CONVERGENCE OF EMPIRICAL DISTRIBUTIONS

Proposition 2.2.7 tells us that the sequence ρn has a weak limit point in P(MT ). By invoking the Skorokhod Representation Theorem (Theorem 3.5.1 in [9]) and passing to a subsequence, we n n may assume that H → H weakly almost surely (i.e. H → H almost surely as elements of MT ), where Hn is the empirical distribution of trajectories defined in (2.9). More explicitly, given any bounded, continuous function φ : C([0,T ]) → R, we have the almost sure convergence:

−1 X Hn H n φ(Xk) = E φ(X) → E φ(X). (2.31) k Note that since we almost surely have convergence of Hn, the property (2.31) holds for all φ almost surely. More precisely:

P((2.31) holds for all φ ∈ C0(R)) = 1.

Given H ∈ MT , we may define

H HH := H(x, t) = P (X(t) ≤ x).

n Applying this same definition to H , we recover HHn = Fn. If φ(X) = f(X(t), t) for some bounded continuous function f : R × [0, ∞) → R, then (2.31) now becomes:

Z Z −1 X n f(Xk(t), t) = f(x, t)Fn(dx, t) → f(x, t)H(dx, t). (2.32) k n Thus the convergence of H gives us a convergence of Fn → H in the weak sense of (2.32). We will later show that H = F as defined in (2.6), though for now we will continue to denote the weak limit point by H to remove any hint of a circular argument. The remainder of this section will be dedicated towards arguing that H, from (2.30), must satisfy the Porous Medium Equation:

1 H = σ2(H)H  − µ(H)H (2.33) t 2 x x x in a weak sense to be made precise later. Here Ht (resp. Hx) represents the t (resp. x) derivative of H. In the next sections, we will argue that if we start from a stationary distribution, that the limit will also be stationary, and hence Ht = 0. This will allow us to solve for H as a function of x.

A Heuristic

We first present a heuristic argument, and then spend the rest of the section making the argument rigorous. This heuristic captures the essence of the proof, and in later chapters we will return to

28 CHAPTER 2. CONVERGENCE OF EMPIRICAL DISTRIBUTIONS this argument to see how the proofs and results must be changed to deal with slight changes in the model (e.g., the case of asymmetric collisions). This argument is adapted from [28].

Assume that the limit H is smooth, and let f : R → R be a compactly supported, smooth function, and define n 1 X Z Zn = f(X (t)) = f(x)F (dx, t). (2.34) t n i n i=1 n Then by (2.32) the process Zt converges, the limit of this process will be Z Z n Zt → Zt := f(x)H(dx, t) = f(x)Hx(x, t)dx. R Taking time derivative of this process, we get dZt = f(x)Hx,t(x, t)dx. Integrating by parts, this R 0 becomes dZt = − f (x)Ht(x, t)dx. We now interchange the order of these operations; take the time derivative first, and then take the limit. This would give:

n 1 X 1 dZn = (f 0(X )µ(F (X , t)) + f 00(X )σ2(F (X , t)))dt + f 0(X )σ(F (X , t))dW i. t n i n i 2 i n i i n i t i=1 Since the W i are all independent and |f|, σ2 are bounded, the variation of this process is O(n−1), and so the martingale term goes to zero. Hence, we expect to get a limit of:

Z n  0 1 00 2  dZt → f (x)µ(H(x, t)) + f (x)σ (H(x, t)) Hx(x, t)dx. R 2 If the interchange of limit and derivative was justifiable, we could now compare the two equa- tions. Integrating by parts on the f 00 term, gives us the relation:

Z Z 0  0 1 2 0  − f (x)Ht(x, t)dx = − f (x) (σ (H)Hx)x + f (x)µ(H(x, t))Hx(x, t) dx, R R 2 1 2  and so Ht = 2 σ (H)Hx x − µ(H)Hx in a weak sense. Before stating and proving the convergence result precisely, we need a technical lemma that will allow us to interchange a limit and integral.

Lemma 2.3.1. For all continuous, compactly supported, f : R × [0,T ] → R and g : R → R we have the following:

Z t Z Z t Z

sup f(x, s)g(Fn(x, s))Fn(dx, s)ds − f(x, s)g(H(x, s))H(dx, s)ds → 0 (2.35) t≤T 0 0

29 CHAPTER 2. CONVERGENCE OF EMPIRICAL DISTRIBUTIONS almost surely.

Proof. We can bound (2.35) above by:

T Z n H H E f(Xn(t), t)g(Fn(Xn(t), t)) − E f(X(t), t)g(H(X(t), t)) dt, 0

n where Xn(·),X(·) are distributed according to H , H respectively. Since f, g are bounded, we need only show convergence for a fixed t, and then apply the Dominated Convergence Theorem. Then we have: Z Z f(x, t)g(Fn(x, t))Fn(dx, t) = f(x, t)G ◦ Fn(dx, t), (2.36) R R · where G(·) = −∞ g. Since G is continuous, the (signed) distribution with distribution function

G ◦ Fn converges weakly to the distribution with distribution function G ◦ H. But f is continuous, so (2.36) converges by weak convergence, and so for fixed t we are done, and the result for all t follows from the Dominated Convergence Theorem.

We are now in a position to state and prove precisely the PME relation.

Theorem 2.3.2. In the notation of Lemma 2.3.1, suppose that Hn → H weakly, where Hn is given in (2.9). Then H (see (2.30)) satisfies the Porous Medium Equation in a weak sense: Z Z g(x, t)H(dx, t) − g(x, 0)H(dx, 0) = (2.37) R R Z t Z  1 2  gx(x, s)µ(H(x, s)) + gxx(x, s)σ (H(x, s)) + gt(x, s) H(dx, s)ds 0 R 2 ∞ for all g ∈ Cc (R × [0,T ]) and t ∈ [0,T ] almost surely.

Proof. Once again, by invoking the Skorokhod Representation Theorem, we may suppose that

Fn → H weakly almost surely. Proceeding as in the heuristic, though now with a time component, we let:

n 1 X Z Zn = g(X (t), t) = g(x, t)F (dx, t) t n i n i=1 Z Zt = g(x, t)H(dx, t)

n Since Fn → H weakly, we see that Zt → Zt for each t almost surely. Using Itˆo’srule, we may write:

30 CHAPTER 2. CONVERGENCE OF EMPIRICAL DISTRIBUTIONS

Z t Z n n  1 2  Zt − Z0 = gx(x, s)µ(Fn(x, s)) + gxx(x, s)σ (Fn(x, s)) + gt(x, s) Fn(dx, s)ds 0 R 2 By applying Lemma 2.3.1, we see that this difference converges for each t ∈ [0,T ] almost surely ∞ to the right hand side of (2.37). Finally, since Cc is separable, we may choose a dense set of functions {g1, ...}, and conclude that (2.37) holds for each gk almost surely. But both sides of

(2.37) are continuous functions of g, and so the density of the set {g1, ...} allows us to conclude ∞ that (2.37) holds for all g ∈ Cc almost surely.

2.4 Convergence of the Porous Medium Equation (PME)

Convergence results for the Porous Medium Equation (PME) are developed extensively in [20]; we will summarise these results here.

Definition 2. Given any probability distributions µ, ν on R, and p ≥ 1, the Wasserstein distance between µ and ν is given by:

1/p  p Wp(µ, ν) = inf E |X − Y | , (X,Y )∈Π(µ,ν) where Π(µ, ν) denotes the set of random variables (X,Y ) where X,Y have one dimensional marginals

µ, ν respectively. If Fµ,Fν denote the distribution functions of µ, ν, then the Wasserstein distance is also given by: Z 1 1/p  −1 −1 p Wp(µ, ν) = |Fµ − Fν | . 0 A thorough treatment of the Wasserstein distance can be found in [29].

Assumption 2.4.1. The function µ(·) is C1 with β-H¨olderfirst derivative for some β > 0. The function σ2(·) is C2 with β-H¨oldersecond derivative.

Lemma 2.4.2. A function H(x) is a weak solution to:

1 0 = σ2(H)H  − µ(H)H (2.38) 2 x x x

−1 if and only if H(x) = F (x +x ˜) for some x˜ ∈ R where F = q is as in Theorem 2.1.3. We call such a function a stable solution to the PME.

31 CHAPTER 2. CONVERGENCE OF EMPIRICAL DISTRIBUTIONS

Proof. This is a special case of Proposition 3.1 in [20].

Theorem 2.4.3. Suppose that H(x, t) satisfies the Porous Medium Equation in a weak sense, and that µ(·), σ(·) satisfies Assumption 2.4.1 and Assumption 1.5.4. Further, suppose that R x2H(dx, 0) <

∞ almost surely. Then as H(·, t) converges t → ∞, in the W2 distance to a stable solution of the form appearing in Lemma 2.4.2.

Proof. This is a special case of Theorem 3.6 in [20].

2.5 The General Case: Non-Affine σ2(·)

We are now in a position to deal with the case where σ2(·) is not affine.

Theorem 2.5.1. Suppose that Assumptions 1.5.4 and 2.4.1 both hold. Then (2.5) holds in proba- bility and L2.

Proof. We recall the definitions of Section 2.3. Hn is the empirical distribution of the trajectories, n n as defined in (2.9). H = H (ω) ∈ MT is a random measure, and hence induces a measure ρn ∈

P(MT ). By Proposition 2.2.7, we know that {ρn}n∈N is tight, and so every subsequence of {ρn}n∈N has a weak limit point. By passing to a subsequence, and invoking the Skorokhod Representation Theorem, we may assume that Hn → H weakly almost surely. Hence, by Theorem 2.3.2, the empirical distributions Fn(x, t) of (1.8) almost surely have a weak limit H(x, t) that satisfies the PME (2.37). In order to invoke Theorem 2.4.3, we need to argue that H(·, 0) almost surely has a finite second moment. To this end, let Mn(ω),M(ω) denote the second moment of Fn(·, 0, ω),H(·, 0, ω) respectively. Then since Fn(·, 0) → H(·, 0) weakly, M ≤ lim infn Mn by Lemma A.0.3. By Fatou’s Lemma and Proposition 2.2.6, we then have that:

EM ≤ lim inf EMn < ∞. n By invoking Theorem 2.4.3, the weak limit H(·, t, ω) converges to a solution to (2.38) as t → ∞. But, since we started from a stationary distribution, the distribution of H(·, t, ω) must be invariant under t, and hence H(·, t, ω) must already be a solution to (2.38). Because we have fixed X(n/2)(0),

32 CHAPTER 2. CONVERGENCE OF EMPIRICAL DISTRIBUTIONS we must have that H(0, 0, ω) = 1/2 almost surely. We can now invoke Lemma 2.4.2 to conclude that H = F almost surely.

Since F is continuous, and Fn → F weakly almost surely, we must have that Fn(x, t) → F (x) almost surely for each x, t. By Lemma 2.1.4, this then implies the results of Proposition 2.1.1. The remainder of the proof of Theorem 2.1.3 did not rely on σ2(·) being affine, and so we can repeat the rest of it in this setting.

We can now conclude, that any subsequence of Fn has a further subsequence such that (2.5) holds. But this implies that the convergence holds for the entire sequence Fn.

Corollary 2.5.2. In the setting of Theorem 2.5.1, the quantiles qn(u, 0) → q(u) converge in prob- ability.

2.6 Atlas Models

Here we will discuss how to adapt the previous results to Simple Atlas models (See [1, 19]). The n particle simple Atlas Model is the system where the volatilities are constant s1 = ... = sn = σ, and the drifts are given by g2 = ... = gn = −g and g1 = (n − 1)g. This is the simplest model in which the Assumption 1.5.1 is satisfied, and hence the n particle system has a stationary distribution of gaps.

Unfortunately, it is impossible to express this model using (1.6), since the drift g1 grows infinitely large as n → ∞. To capture this high drift near the origin, we must instead use a distribution

µ and define gk according to (1.10). In this setting, we let µ(dx) = gδ0(dx) − gdx where δ0 is a pointmass at 0. A careful reading of the proof of Theorem 2.1.3, shows that Θ (see (1.25)) is the R key function that captures the drift, not µ. We can still define Θ(u) = [0,u) µ = g(1 − u) as in (1.26). A reading of the proof of Proposition 2.1.1 shows that the same results will hold in this case. σ2(u) Since Θ(u) 6→ 0 as u → 0, the function 2Θ(u) is integrable at u = 0, and so the lower end of the distribution (the “Atlas”) is at a finite position. Thus, for ease of notation, we shift our distribution to have X(1)(0) = 0, and get that qn(u, 0) → q(u), where

33 CHAPTER 2. CONVERGENCE OF EMPIRICAL DISTRIBUTIONS

Z u σ2 σ2  1  q(u) = dx = log . (2.39) 0 2g(1 − x) 2g 1 − u This corresponds to the limit −2g  F (x) = 1 − exp x (2.40) σ2 of the distribution function. The remainder of the proof of Theorem 2.1.3 follows immediately (see Note 2.1.2), which gives the following:

Proposition 2.6.1. In the context of the Atlas Model described above, we have, for all T > 0:

 −2g   sup 1 − exp 2 x − Fn(x, t) → 0 x≥0,t≤T σ in probability. For x ≤ 0, the convergence is Fn(x, t) → 0.

2.6.1 Multi-Atlas Models

Here we consider the case of having multiple “Atlas” positions in the model. Each Atlas pushes up those above it, but does nothing with the particles below; creating a buffer that makes it harder to fall past an Atlas. A possible application would be to place Atlase-like stocks at the cut-offs for entry into various indices of stocks, such as the S&P500, entry into which provides a boost to the trading volume of the stock. These models could also be applied to model a Class Structure in an economy, where the upper class bands together to help out other members of the upper class, making it more difficult to fall from the top.

Formally, suppose that 0 = a0 < a1 < ... < am ≤ 1 are the position of Atlas stocks, and h1, ..., hm are the related drifts. Then we let distribution µ be defined by:

m X µ(dx) = hk((1 − ak)δak (dx) − 1(ak,1)(x)dx) k=0 . Integrating, we get: m X Θ(u) = hk1(ak,1](u)(1 − u) k=0

34 CHAPTER 2. CONVERGENCE OF EMPIRICAL DISTRIBUTIONS

Example 1. Consider the case when m = 1, and a1 = a. Then we can write:

Z u σ2(u) q(u) = du 0 2Θ(u) σ2  1  = log 2h0 1 − u for u ≤ a. Similarly, for u > a, we can write:

Z u σ2 q(u) = q(a) + dx a 2(h0 + h1)(1 − x) σ2  1  σ2  1 − a  = log + log 2h0 1 − a 2(h0 + h1) 1 − u σ2 h 1 1  = 1 log  + log  . 2(h0 + h1) h0 1 − a 1 − u

An interesting (and somewhat counter intuitive) consequence of this calculation, is that the ath quantile does not move any further away from the Atlas stock, and the quantiles above the level a move closer to the Atlas. Thus, any attempt of an upper class to get together and help each other is actually completely counter productive, and hurts the upper class as a whole.

We could also take h1 to be negative, so long as it is not so small that Θ loses its positivity. In this case, the upper class is effectively taxing entry into its elite level, and the class does begin to separate itself more from the Atlas stock. This is an example of a situation where “Sharing is Bad, Stealing is Good”.

Example 2. Motivated by Example 1, we return to the setting of a continuous function µ. Suppose R u that γ : [0, 1] → R is supported on [a, 1], and satisfies the condition that Γ(u) = 0 γ is non-negative and Γ(1) = 0. Then, in the same spirit as above, the model with drift µ+γ can be thought of as the members of the upper class deciding to help each other out. We can repeat the above calculation and reach the same conclusion (i.e., the upper class being less separated from the lower class). By considering γ such that −γ satisfies these assumptions, and additionally assuming Θ + Γ > 0 on (0, 1), then the upper class is more separated than before.

Note 2.6.1. The interpretation of the above models as a redistribution of wealth is not entirely correct, as there is only a conservation of log-dollars, and not real dollars. Redistribution of real dollars amongst sub populations has recently been studied by Fernholz [13].

35 CHAPTER 2. CONVERGENCE OF EMPIRICAL DISTRIBUTIONS

Figure 2.1: Capital Distribution Curves Capital distribution plots of the U.S. at the end of each decade from 1929-1999. Rankings given in reverse order of this paper (1 is the highest). Taken from [11].

2.7 Zipf’s Law

Zipf’s Law refers to the observation that many physical and social data sets can be well approx- imated by a power law distribution. More specifically, it says that if the data points are ordered a1 > a2 > ... (note the reverse ordering to the convention used throughout this paper), then:

a j α i ∼ , i < j (2.41) aj i for some α ≥ 0. This formulation is equivalent to the plot of log-log plot of size versus rank being linear. This is approximately true as seen in Figure 2.1. Throughout, we have been modeling the logarithm of the original quantities of interest. Taking exponentials, and considering the reverse ordering, we recast (2.41) as:

eq(1−u)  1 − v α ∼ (2.42) eq(1−v) 1 − u or, equivalently:

36 CHAPTER 2. CONVERGENCE OF EMPIRICAL DISTRIBUTIONS

1 q(u) ∼ C + α log (2.43) 1 − u R u σ2(r) for some constant C. But we know that q(u) = 1/2 Θ(r) dr. For r ∼ 1, Θ(r) ∼ |µ(1)|(1 − r) and 2 2 σ2(1) σ (r) ∼ σ (1). Hence, the integrand in q(u) approaches 2µ(1)(1−r) , and so we have: 1 q(u) ∼ C + α log , u ∼ 1 (2.44) 1 − u

σ2(1) with α = 2|µ(1)| We have shown that, any large enough model of the form (1.4) will obey Zipf’s Law at the top of its distribution. This is very much in accordance with Figure 2.1.

Example 3. A common n-particle model is to let each Xi be a reflecting Brownian Motion with negative drift. More precisely:

i Xi dXi(t) = −gdt + σdWt + dL (t, b) (2.45) for some boundary b. Each reflecting Brownian Motion has an exponential stationary distribution, and so the stationary quantile function is logarithmic in 1 − u. Hence, the expected empirical distribution of these systems satisfies (2.43) with equality. σ2 1 We have already seen in (2.40) that for an Atlas model, q(u) = 2g log( 1−u ), which agrees with the stationary distribution of the reflecting Brownian Motions with b = 0. This equality is not a coincidence: we will show in Chapter 3 that in the limit as n → ∞, a single tagged particle in an Atlas Model behaves exactly as (2.45).

2.8 Gini Index

The Gini Index is used as a measure of wealth inequality within a group of people. Given a E|X−Y | distribution ν, the Gini Index is define to be G = 2EX where X,Y independent random variables with distribution ν. In the event that EX = ∞, we simply take the G = 1. In our setting, the wealth of a random individual is distributed as eq(U) where U is a uniform random variable on [0, 1]. We can write the Gini Index as:

R 1(1 − u)eq(u)du G = 1 − 2 0 . R 1 q(u) 0 e du

37 CHAPTER 2. CONVERGENCE OF EMPIRICAL DISTRIBUTIONS

A larger Gini Index corresponds to more inequality; G = 0 corresponds to complete equality, while G = 1 corresponds to the top % of people having infinite money for all . Observed values of this Gini Index in various countries in the world range from 0.23 in Sweden, up to 0.65 in South

Africa. The United States has a Gini-Index of 0.45. We will first argue that the Gini Index Gn(t) converges uniformly in probability to the Gini Index of the limiting distribution F . Throughout this section, we will once again assume that σ2(·) is affine, as this allows for more precise bounds.

Lemma 2.8.1. Assume that σ2(·) is affine. Then for any fixed t, the total wealth

−1 X  Sn(t) = n exp qn(k/n, t)

R 1 converges to 0 exp(q(u))du in probability if σ2(1) α = < 1. (2.46) −2µ(1)

R 1 q(u) Note that the condition α < 1 is exactly the condition that the integral 0 e du is finite. To 1 q(u) C −α see this, we recall from (2.44) that q(u) ∼ C + α log 1−u for u ∼ 1, and hence e ∼ e (1 − u) . The integral of this converges if and only if α < 1. For u bounded away from 1, the integrand eq(u) is bounded, and hence always integrable. Since we know that the quantiles converge uniformly on compact subsets of the of (0, 1) (from Theorem 2.1.3), the middle of the wealth distribution can be handled easily. Small values of u result in very negative values of q(u), and so have little effect on the total wealth. The bulk of the proof of Lemma 2.8.1 then relies on showing that the total contribution of values of u close to u = 1 can be controlled. This relies on the exact form of the stationary distribution in the affine setting.

1 Proof. Without loss of generality, take t = 0. There is a C such that q(u) ≤ C+α log 1−u . Similarly, there is an η > 0 and α0 < 1 such that

1 an < α0 (2.47) k n − k for all k > (1 − η)n, and a c > 0 such that

n −1 ak < cn (2.48)

38 CHAPTER 2. CONVERGENCE OF EMPIRICAL DISTRIBUTIONS if for all k ∈ (n/2, (1 − η)n).

n qn(k/n) n n n For k > (1−η)n, consider the exponential Vk = e . We can write Vk = exp(ξn/2 +...+ξk ), n n where ξn/2, ..., ξk is a collection of independent exponential random variables with total mean

n qn(k/n) ∼ q(k/n). We would like to bound Vk = e in expectation:

k k −1 n  X n Y  n EVk = E exp ξi = 1 − ai (2.49) i=n/2 i=n/2 n due to the independence of the ξk and the Laplace transform of an exponential distribution:

n n −1 E exp(ξi ) = (1 − ai ) . (2.50)

n n This formula holds for ai < 1, which is guarenteed by (2.47). Since the αi are all small, we approximate this product in the usual fashion:

k k Y Y n n −1 − log(1−ai ) (1 − ai ) = e i=n/2 i=n/2 k Y −1(−an−(an)2/2) ≤ e i i i=n/2 k  X n n 2  = exp ai + (ai ) /2 i=n/2 n k  X n 2  X n ≤ exp (ai ) exp ai . i=n/2 i=n/2 Using (2.47) and (2.48), this gives:

n (1−η)n n X X X 1 π2α02 a2 ≤ c2n−2 + α02 ≤ c2n−1 + = O(1), i (n − i)2 6 i=n/2 i=n/2 i=(1−η)n k (1−η)n k X X X 1 n − k  k  a ≤ cn−1 + α0 ≤ c − α0 log = c + α0 − log(1 − ) + log η . i n − i ηn n i=n/2 i=n/2 i=(1−η)n Combining these, we have the bound:

0 0  k −α EV n ≤ exp(O(1) + c)ηα 1 − k n 0 0  k −α = eO(1)ηα 1 − . n

39 CHAPTER 2. CONVERGENCE OF EMPIRICAL DISTRIBUTIONS

−1 Pn−1 n Next, suppose that we fix a level η > ι > 0, and look at n k=(1−ι)n EVk ; which is the contribution to the average wealth of the population by the top ι people. Then we can bound this by:

n n−1 X 0 X 1 n−1 EV n ≤ eO(1)ηα n−1 k (1 − k )α0 k=(1−ι)n k=(1−ι)n n Z 1 O(1) −α0 1 → e η α0 du. (1−ι) (1 − u) Since α0 < 1, this integral converges, and so can be made arbitrarily small by pick ι small. We are now in a position to prove the Lemma.

We split Sn into the three sums:

ιn−1 (1−ι)n n −1 X  −1 X  −1 X  Sn = n exp qn(k/n, t) + n exp qn(k/n, t) + n exp qn(k/n, t) . k=0 k=ιn k=(1−ι)n+1

1 2 3 1 Denote these by Sn = S + S + S . The first sum, S , has a summand bounded by 1 for small ι < 1/2, and so S1 < ι (this is the wealth of the poorest people, so it makes sense that it is 2 R 1−ι q(u) small). The middle sum S converges to ι e du since the quantiles all converge uniformly in this interval (Theorem 2.5.1). Finally, the last sum, S3 has an expectation that converges to zero as ι → 0. We can therefore pick ι small enough, and apply Markov’s inequality to conclude that

S3 <  with probability 1 − o(1). Hence, we can write:

Z 1−ι 1 2 3 q(u) Sn = S + S + S = O(ι) + e du +  ι with probability 1 − o(1). By letting , ι → 0, we get the desired result (note that ι is chosen as a function of ).

Proposition 2.8.2. The convergence in Lemma 2.8.1 is uniform over t ∈ [0,T ] for any fixed T ∈ (0, ∞).

Proof. The argument will follow the outline of Lemma 2.1.8. For any fixed time points 0 = t0 < t1 < ... < tL = T , we have uniform convergence over all ti. We would like to argue that Sn(t) cannot change “too much” between any ti, ti+1.

40 CHAPTER 2. CONVERGENCE OF EMPIRICAL DISTRIBUTIONS

−1 X X (t) dSn(t) =n de (k) k X  1  X =n−1 µ(k/n) + σ2(k/n) eX(k)(t)dt + n−1 σ(k/n)eX(k)(t)dW k(t) 2 k k  1  X ≤ kµk + kσ2k S dt + n−1 σ(k/n)eX(k)(t)dW k(t). ∞ 2 ∞ n k The quadratic variation of this process can be bounded by:

X −2 2 2X (t) dhSni = n σ (k/n)e (k) dt k 2 X −1 X (t) 2 ≤ kσ k∞( n e (k) ) dt k 2 2 = kσ k∞Sn.

This tells us that Sn should be bounded above by a Geometric Brownian Motion. If we take logarithms, by Itˆo’sformula we have:

d log Sn ≤ Cdt + dM(t).

We can then argue as in Lemma 2.1.8:

 Sn(t)    P sup log ≥  ≤ P sup Mt − M0 ≥  − Cs s0≤t≤s0+s Sn(0) s0≤t≤s0+s C ≤ p E(hMi(s)p)2 ( − Cs)2p kσ2kp sp ≤ ∞ . ( − Cs)2p

2p −2p 2 p p For any fixed , and small enough s, this probability is bounded by Cp2  kσ k∞s . If we −1 make a partition with ti+1 − ti < s, then we will need T s time points, and so the probability that this bound fails for the process starting any time ti is bounded above by:

2p −2p 2 p p−1 CpT 2  kσ k∞s .

41 CHAPTER 2. CONVERGENCE OF EMPIRICAL DISTRIBUTIONS

By picking p > 1, this bound can be made arbitrarily small (at the expense of have many time points). But for any fixed number of time points, we have a uniform convergence result, and so we’re done by arguing the same as in Lemma 2.1.8.

2 Corollary 2.8.3. Suppose σ (·) is affine, and let w : [0, 1] → R be bounded and continuous. For any p > 0 with σ2(1) p < 1, 2|µ(1)| or p < 0 with σ2(0) p < 1, (2µ(0))

R 1 pqn(u,t) R 1 pq(u) the weighted p-diversity 0 w(u)e converges in probability to 0 w(u)e uniformly in t.

Proof. The proof is identical to Proposition 2.8.2.

1 Example 4. Consider the Atlas model described earlier. Then by (2.39), q(u) = α log( 1−u ), and so the distribution of wealth is given as:

1 eq(u) = (1 − u)α is a power law distribution. If α ≥ 1, then this distribution is not integrable, and the Gini index is 1. If 0 < α < 1, then we have:

Z 1 Z 1 1 eq(u)du = (1 − u)−αdu = 0 0 1 − α Z 1 Z 1 1 (1 − u)eq(u)du = (1 − u)1−αdu = 0 0 2 − α 1 − α α G = 1 − 2 s = . 2 − α 2 − α

Example 5. Consider the Atlas-type model given by:

σ2(u) = σ2

µ(u) = (h0 + ah1)δ0(u) − h0 − h11u>1−a.

This can be thought of as a traditional Atlas model, with an additional tax put on the top a ∈ (0, 1) of the population. We would like to know how much this tax reduces inequality. The Gini Index of such models is plotted in Figure 2.2, with the x-axis representing the ratio of h1/h0.

42 CHAPTER 2. CONVERGENCE OF EMPIRICAL DISTRIBUTIONS

Figure 2.2: Gini Index Change 2 We have chosen σ = 1 and h0 so that the starting Gini Index is 0.45, which is the approximate value for the United States today.

These calculations give a way of estimating the change in Gini Index that would occur from having drift µ(·), to having drift µ∗(·). In this example, this change is manifested by putting an additional drift of −h1 on the top a wealthiest people. If at some time t, we made this change, the Gini Index would not change instantaneously. To study how the wealth distribution would evolve with time, one would use the non-stationary Porous Medium Equation (2.37) studied in [20] with drift µ∗, and initial condition H(·, 0) = q−1.

43 CHAPTER 3. LIMITING PARTICLE DYNAMICS

Chapter 3

Limiting Particle Dynamics

In this chapter, we use the results of the previous chapter to study the limiting dynamics of a single tagged particle. Unless stated otherwise, we will assume that we are in a setting where the results of Theorem 2.5.1 hold. n Fix a rank r ∈ (0, 1) and an n ∈ N. Pick i such that Fn(Xi (0), 0) = brnc/n, and define n n X (·) = Xi (·). Let X(·) be a diffusion with X(0) = q(r) and evolving according to:

i dX(t) = µ(F (X(t)))dt + σ(F (X(t)))dWt . (3.1)

Here, Xn is a single “tagged particle” starting from rank rn, and X is the proposed limiting trajectory. Importantly, we drive both processes with the same Brownian Motion W i. Because X is driven by the Brownian Motion W i, which in turn depends on the value of n, the process X does have a dependence on n, and so is not a true limit. However, because all of the limits in Chapter 2 are deterministic, the convergence of the Fn does not depend on the choice of 1 n 1 n Brownian Motion W(n) = (W , ..., W ) = (W(n), ..., W(n)). We may thus choose each W(n) in such i a way that for all n, W(n) = W for some fixed Brownian Motion W . Under this identification, X has no dependence on n and we are free to argue the convergence Xn → X.

2 n Theorem 3.0.1. Suppose that µ, σ are both Lipschitz with Lipschitz constants Kµ,Kσ2 , and X ,X are as described above, and the results of Theorem 2.5.1 hold. Then for any T > 0:

E sup |X(t) − Xn(t)|2 → 0 t≤T

44 CHAPTER 3. LIMITING PARTICLE DYNAMICS

n 2 Proof. We will apply Gronwall’s inequality. Let g(t) = E sups≤t |X(s) − X (s)| .

Z t (n) 2 (n) 2 g(t) ≤ K1E|X(0) − X (0)| + K2T E|µ(F (X(s))) − µ(Fn(X (s), s))| ds 0 Z t (n) 2 +K3 E|σ(F (X(s))) − σ(Fn(X (s), s))| ds 0 Z t 2 2 (n) 2 ≤ K1E|q(r) − qn(r + o(1), 0)| + K4(Kµ + Kσ2 ) E|F (X(s)) − Fn(X (s), s)| ds 0 Z t (n) = o(1) + K5 E|F (X(s)) − Fn(X (s), s)|ds 0 Z t (n) 2 (n) (n) 2 ≤ o(1) + K6 E|F (X(s)) − F (X (s))| + E|F (X (s)) − Fn(X (s), s)| ds 0 Z t Z t 0 2 (n) 2 2 ≤ o(1) + K6 kF k∞E|X(s) − X (s)| ds + K6 E sup |F (x) − Fn(x, s)| ds 0 0 x Z t Z t ≤ o(1) + K7 g(s)ds + K6 o(1) 0 0 Z t = o(1) + K7 g(s)ds, 0 where K1, ..., K7 are constants independent of g. The result follows from applying Gronwall’s Inequality to the function g(t) (Problem 5.2.7 in [23]).

Remark 1. In considering the limit of the Fn, there was no assumption made on the existence of strong solutions. This is because Fn is completely determined by the rank process, and strong solutions exist for this process. However, the limiting process X(t) is a strong solution to its SDE. Two possible heuristic explanations for why this is the case are as follows:

2 2 2 • In the limit, k 7→ σk = σ (k/n) is approximately linear (if σ is differentiable) at any sequence k, k + 1, k + 2, so one might expect the solution to behave approximately like it would for linear σ2(·).

2 2 • Since the difference between σk and σk+1 goes to zero, how the particles “disentangle” after a triple collision has little effect on the motion of these particles (i.e. coming out in first or last has about the same effect). The same holds for collisions of any fixed order, and the convergence of the particle distribution tells us that, in the limit, there are never collisions of any fixed percentage of particles (i.e., there are never any collisions of order αn for any α > 0).

45 CHAPTER 3. LIMITING PARTICLE DYNAMICS

The remainder of the chapter focuses around proving convergence similar to the one above, and studying properties of the limiting process.

3.1 Rank Dynamics

Recall Question 1.2.1: “How long does it take to go from rank k to rank `?”. When taking the limit of a large number of particles, it does not make sense to consider fixed ranks k, `, but instead we need to consider ranks αn and βn for some 0 ≤ α < β ≤ 1. Since the rank of Xn(t) is given by n n n nFn(X (t), t), we are looking for how long it takes the process R (t) := Fn(X (t), t) to hit a given level β starting from the level α. As before, we will argue that Rn(t) → R(t) := F (X(t)), and then study the dynamics of this latter process. From the Inverse Function Theorem applied to F = q−1 as in (2.6) and the definition of q in (2.3) and Θ in (1.25), we get:

2Θ(F (x)) F (x) = (3.2) x σ2(F (x)) 2Θ0(F )F 4Θ(F )σ0(F )σ(F )F F (x) = x − x (3.3) xx σ(F )2 σ(F )4 4Θ(F )Θ0(F ) 8Θ2(F )σ0(F )σ(F ) = − , (3.4) σ4(F ) σ6(F ) where in the last expression, all functions are evaluated x.

n Proposition 3.1.1. For any T > 0, supt≤T |R(t) − R (t)| → 0 in probability.

Proof. Since F is differentiable with bounded derivative (see (3.2)), we can write:

n n n n |R (t) − R(t)| ≤ |F (X(t)) − F (X (t))| + |F (X (t)) − Fn(X (t), t)|

0 n ≤ kF k∞|X(t) − X (t)| + kF − Fnk∞

The first term in this latter result converges to zero by Theorem 3.0.1, while the second does Theorem 2.5.1.

We now turn our attention to the study of the dynamics of the process R(t) = F (X(t)), 0 ≤

46 CHAPTER 3. LIMITING PARTICLE DYNAMICS t < ∞ as a diffusion. Since F is a C2 function, we can apply Ito’s rule to get:

1 dR = F (X )dX + F (X )dhXi (3.5) t x t t 2 xx t t  1  = F (X )µ(R ) + F (X )σ2(R ) dt + σ(X )F (R )dW . (3.6) x t t 2 xx t t t x t t

Substituting the expressions for Fx,Fxx in (3.2) into the dynamics (3.6), and using the definition

F (Xt) = Rt, we get:

 0 0 2 0  2Θ(Rt)Θ (Rt) 2Θ(Rt)Θ (Rt) 4Θ (Rt)σ(Rt)σ (Rt) 2Θ(Rt) dRt = 2 + 2 − 4 dt + dWt. σ (Rt) σ (Rt) σ (Rt) σ(Rt)  0  = h(Rt) dWt + h (Rt)dt 1 = (h2)0(R )dt + h(R )dW (3.7) 2 t t t where 2Θ(r) h(r) = . (3.8) σ(r) The special form of this diffusion will allow for some nice calculations of hitting times, as well as allow us to show time reversal properties.

3.2 Hitting Times

n n Throughout this section Ta (resp. Ta ) is the hitting time of the process Rt (resp. Rt ) at the level n a. We similarly define Ta,b (resp. Ta,b) as the time to leave the interval (a, b).

n n Lemma 3.2.1. Ta → Ta and Ta,b → Ta,b in probability.

Proof. First we recall that since F (Xt), 0 ≤ t < ∞ is a diffusion (with dispersion coefficient bounded away form zero), that c 7→ Tc is almost surely continuous at c = a. Fix T,  > 0 with T large enough (n) so P(Ta ≤ T ) ≥ 1 − . Then, with probability 1 − o(1), we have |Fn(X (t), t) − F (X(t))| ≤  for (n) (n) t ≤ T . But this means that |Fn(X (Ta+),Ta+) − a − | ≤ . So, Ta ≤ Ta+ with probability (n) 1 − o(1). Applying a similar argument to get Ta− ≤ Ta , letting  → 0 (and hence T → ∞), and

(n) calling on the continuity of c 7→ Tc, we can conclude that Tb − Tb → 0 in probability. n To get the convergence of the hitting times Ta,b, we simply note that Ta,b = Ta ∧ Tb.

47 CHAPTER 3. LIMITING PARTICLE DYNAMICS

For the remainder of the section, we will study the random variables Ta,b and Ta. We define the functions:

u(x, t) = u(x, t, a, b) = Px(Ta,b > t)

−λTa,b v(x, λ) = v(x, λ, a, b) = Exe .

Proposition 3.2.2. (1): The function u is the smallest non-negative classical solution to the PDE

1 2  ut = h (x)ux (3.9) 2 x u(x, 0) =1, x ∈ (a, b) (3.10) and u ∈ C2,1((a, b) × [0, ∞)). (2): For any fixed λ < 0, the function v(·, λ) is the largest classical solution of:

1 h2(x)v  = − λv (3.11) 2 x x v(x) ≤1, x ∈ (a, b) (3.12) and v(·, λ) is of class C2((a, b)).

Proof. This follows from the results in Section 5 of [22], and the identity:

1 1 h2(x)u + h(x)h0(x)u = h2(x)u  . 2 xx x 2 x x

Lemma 3.2.3. For fixed 0 < a < b < 1, there is an η, K > 0 such that u(x, t) ≤ Ke−ηt for all x ∈ (a, b)

Proof. We pick 0 < a0 < a < b < b0 < 1. Then u(x, t, a, b) < u(x, t, a0, b0), and u(x, t, a0, b0) is continuous on [a, b]. Since u(x, t, a0, b0) < 1 for each t > 0 (See, for example, [5]), we must have that:

p := sup u(x, 1, a, b) ≤ sup u(x, 1, a0, b0) < 1. x∈(a,b) x∈[a,b] n So we must have P(Ta,b > n) ≤ p by the Strong Markov Property. Taking η < log p and K large enough, we are done.

48 CHAPTER 3. LIMITING PARTICLE DYNAMICS

Corollary 3.2.4. Ta,b < ∞ almost surely.

This corollary also follows from the Feller Test (See [23] Proposition 5.5.22), and the local integrability of 1/h2(·). However, Lemma 3.2.3 guarantees the existence of moments and the Laplace transform of Ta,b.

Corollary 3.2.5. The Laplace transform λ 7→ v(x, λ) exists for λ in a neighbourhood of 0, and n hence the n-th moment ExTa,b exists and is given by: dnv m(n)(x, a, b) := E T n = (x, 0). x a,b dλn Lemma 3.2.6. For a < x < b,

R b −2 x h (z)dz Px(Ta < Tb) = . R b −2 a h (z)dz Proof. A scale function for the diffusion R(t) is given by: Z y Z z 0 Z y Z z Z y 2  h (w)   0  h (a) s(y) = exp − 2 dw dz = exp − 2 (log(h(w)) dw dz = 2 dz. a a h(w) a a a h (z) And so R b −2 s(b) − s(x) x h (z)dz Px(Ta < Tb) = = . s(b) − s(a) R b −2 a h (z)dz

Corollary 3.2.7. Ta < ∞ Px a.s. for each x.

Proof. Without loss of generality, suppose that a < x. Then pick x < b < 1. We know that

Ta,b < ∞ almost surely. But Ta,b = 1Ta

Proposition 3.2.8. The moments m(n)(x) defined in Lemma 3.2.5 satisfy the recursive relation- ship: Z x h Z y i m(n)(x, a, b) = K − 2n m(n−1)(z, a, b)dz h−2(y)dy, (3.13) a a where

m(0)(x) = 1 y Z b R (n−1) Z b −1  a m (z)dz  1  K = 2n 2 dy 2 dy . a h (y) a h (y)

49 CHAPTER 3. LIMITING PARTICLE DYNAMICS

(0) 0 Proof. First, m (x) = ExTa,b = 1 since Ta,b < ∞ a.s.. Now, taking λ derivatives both sides of (3.11), we get: 1 d dn dn dn−1 (h2(x) v(x, λ)) = −λ v(x, λ) − n v(x, λ). 2 dx dλn x dλn dλn−1 Setting λ = 0, this gives the relation:

1 (h2(x)m(n)(x)) = −nm(n−1)(x) 2 x x Z x 2 (n) (n−1) h (x)mx (x) = K − 2n m (y)dy a Z x Z x R y (n−1) (n) K a m (z)dz m (x) = 2 dy − 2n 2 dy. a h (y) a h (y)

Substituting m(n)(b) = 0, we can solve for K as:

y Z b R (n−1) Z b −1  a m (z)dz  1  K = 2n 2 dy 2 dy . a h (y) a h (y)

We will now extend this result to calculating the moments of the hitting time Ta by letting b → 1.

Theorem 3.2.9. For fixed a, we define for x > a, m(n)(x) = m(n)(x, a) = m(n)(x, a, 1). Then m(n)(x) satisfies the recursive relationship:

m(0)(x) = 1 Z x Z 1 m(n)(x) = 2n h−2(y) m(n−1)(z)dzdy. a y

(0) Proof. Once again, m (x) = 1 since Ta < ∞ almost surely. Since the Ta,b % Ta as b → 1, we can write: m(n)(x) = lim m(n)(x, a, b). b→1 In the notation of the above proposition,

 Z b Z y  Z b −1 K = K(b) = 2n h−2(y) m(n−1)(z, a, b)dzdy h−2(y)dy . a a a

2 2 2 −2 R b −2 Since h (x) ∼ 4µ (1)(1 − x) σ (1) as x → 1, the integral a h (y)dy diverges as b → 1. R 1 (n−1) Claim 1. K(b) → 2n a m (z)dz as b → 1

50 CHAPTER 3. LIMITING PARTICLE DYNAMICS

Proof. We know that m(n−1)(z, a, b) ≤ m(n−1)(z, a), and so K(b) is bounded above by:

R b f(y)h−2(y)dy 2n a , R b −2 a h (y)dy

R y (n−1) R b −2 where f(y) = a m (z)dz. Since a h → ∞ as b → 1, and f(y) is continuous, this limit is R 1 (n−1) given by 2nf(1), and hence lim K(b) ≤ 2n a m (z)dz as desired. To get the reverse inequality, we fix L < f(1). Define f(b, y) by:

Z y f(b, y) = m(n−1)(z, a, b)dz. a

Then f(b, y) is increasing in both b, y and f(b, y) → f(1) as b, y → 1. Hence, for large enough b −2 and y > y0, f(b, y) > L. Since the measure h (y)dy puts infinite mass near y = 1, we can ignore y ≤ y0 in the limit, and so we can apply the same argument as above to conclude that:

lim inf K(b) ≥ 2nL. b→1

Letting L % f(1) completes the proof of the claim.

We can now write m(n)(x) as:

Z x  Z y  m(n)(x) = lim h−2(y) K − 2n m(n−1)(z, a, b)dz dy b→1 a a Z x  Z 1 Z y  = 2n h−2(y) m(n−1)(z)dz − m(n−1)(z)dz dy a a a Z x Z 1 = 2n h−2(y) m(n−1)(z)dzdy. a y

Proposition 3.2.10. m(n)(x) < ∞ for all n, x > a.

(n) 1 n Proof. We will show inductively that m (x) = O(log( 1−x ) ). Clearly this holds for n = 0.

51 CHAPTER 3. LIMITING PARTICLE DYNAMICS

Supposing it holds for n, we get:

Z x R 1 m(n)(z)dz (n+1) y m (x) = 2(n + 1) 2 dy a h (y) Z x Z 1  1  = O log( )n O(y−2)dy a y 1 − y Z x  1  = O y log( )n O(y−2)dy a 1 − y Z x  1  = O y−1 log( )n dy a 1 − y  1  = O log( )n+1 . 1 − x

Example 6. Let µ(x) = 1 − 2x and σ(·) = 1. Then Θ(r) = r(1 − r), and h(r) = 2r(1 − r). Let a ∈ (0, 1) and x > a. Then:

Z x 1 − y ExTa = 2 2 2 dy a 4y (1 − y) 1 1 x(1 − a) = − + log . a x (1 − x)a

3.3 Time Reversal

Suppose now that instead of starting from a fixed R(0) = r, we randomly choose a level r from a uniform distribution on [0, 1]. This is equivalent to randomly choosing X(0) according to the distribution F . For the rank process R, the stable distribution is simply the uniform distriubtion on [0, 1]. According to [17], the process Ret = RT −t, 0 ≤ t ≤ T is a diffusion, with drift given by the generalized Nelson equation:

0 2 0 2 0 b(y) = −(hh )(y) + (h (y)) + h (y)(log 1[0,1](y)) = −(hh0)(y) + 2(hh0)(y) + 0

= (hh0)(y)

52 CHAPTER 3. LIMITING PARTICLE DYNAMICS

and dispersion h(y). Hence, the processes R and Re have the same law. This lets us write:

P(RT ∈ da|R0 ∈ db) = P(Re0 ∈ da|ReT ∈ da)

= P(R0 ∈ da|RT ∈ db)

P(R0 ∈ da) = P(RT ∈ db|R0 ∈ da) P(RT ∈ db)

= P(RT ∈ db|R0 ∈ da).

Here the first equality is the definition of Re, the second is the equality of Re and R as processes, the third is Baye’s Rule, and the last holds because Rt is a Uniform random variable on [0, 1] for each t, and hence P(RT ∈ db) = P(R0 ∈ da).

R0=a R0=b Proposition 3.3.1. P (Rt ∈ db) = P (Rt ∈ da) for all t ≥ 0, a, b ∈ (0, 1).

This proposition states that it is exactly as easy to climb up the social ladder as it is to fall down it; which could either be good or bad depending on who you are.

Remark 2. The result at first appears to be counter intuitive: if the particles at the top have a negative drift, surely there is a larger chance of them falling down than going up. What happens in this case is that the particles are more thinly spread at the top, and so there is a larger range of “target” values to hit a certain rank. More precisely, being in rank dα means being in position q0(α)dx where x = q(α). Since q0(α) gets larger as α → 1, this balances out the fact that being in position dx is less likely for large values of x.

3.4 Rank-Rank Correlations

The Gini Coefficient provides a way of measuring economic inequality as a single number; however it says nothing about the mobility of individuals. The rank-based model has two degrees of freedom σ2(·) (µ and σ). In Chapter 2, the function of interest was the ratio d(·) = 2Θ(·) , while in this chapter, 2Θ(·) the function of interest is h(·) = σ(·) . These two functions form a more natural basis for studying the model; with d(·) capturing the behaviour of the total market (i.e. the distribution); and h(·) capturing the behaviour of individuals within the system. The metric we choose to use here is to study the correlation between a randomly chosen indi- viduals rank at time t = 0, and at some later time. We will see that this will depend only on h(·)

53 CHAPTER 3. LIMITING PARTICLE DYNAMICS and not on d(·). We will denote C(t) = cor(R(t),R(0)).

Then we have:

E[(R(t) − 1 )(R(0) − 1 )] C(t) = 2 2 (3.14) pV arR(0) · V arR(t) Since R(0) and R(t) are both uniformly distributed, V arR(0) = V arR(t) = 1/12. Using this, and expanding the numerator of (3.14), we get:

 1 C(t) = 12 ER(t)R(0) − 4 Substituting in (3.7), this becomes:

C(t) = 12 ER(0)R(t) − 3 h  Z t 1 Z t i = 12E R(0) R(0) + (h2)0(R(s))ds + h(R(s))dW (s) − 3 0 2 0   Z t 1 = 12E R(0)2 − 3 + 12E R(0)(h2)0(R(s))ds 0 2 1 Z t = 12 − 3 + 6E R(0)(h2)0(R(s))ds 3 0 Z t = 1 + 6E R(0)(h2)0(R(s))ds 0

Differentiating this expression at t = 0, we get:

d h i Z 1 C(0) = 6E R(0)(h2)0(R(0)) = 6 u(h2)0(u)du (3.15) dt 0 Z 1 h 2 1 2 i 2 = 6 uh (u)|u=0 − h (u)du = −6khk2. (3.16) 0

The first equality is using the fact that R(0) is uniform on [0, 1], the second is differentiation by parts, and the last is because h(0) = h(1) = 0. The equation (3.15) shows that the larger h gets, the more economic mobility there is. But h is inversely proportional to σ; so larger volatilities correspond to less economic mobility, which seems counter-intuitive. This is explained by the fact that we started from a stationary distribution; and higher values of σ spread out this distribution.

54 CHAPTER 3. LIMITING PARTICLE DYNAMICS

Figure 3.1: Rank-Rank Plot A simulated plot of E[R(t)|R(0)] versus R(0) with σ2 = 1, µ(x) = 1 − 2x.

So while a higher value of σ allows for more mobility in an individuals wealth X(t), it lowers the mobility of their rank, as it takes more money to affect a change in rank R(t). If we impose the restriction that the wealth distribution remains unchanged (i.e. d(·) remains unchanged), then 2 σ2 increasing σ by a factor of α we must increase Θ by a factor of α , since d(·) = 2Θ . But this will 2Θ(·) raise h(·) = σ(·) by a factor of α, which agrees with our intuition. Figure 3.1 shows a plot of the function r 7→ E[R(t)|R(0) = r], for different value of t, while Figure 3.2 shows a the function C(t). Both plots were the results of simulations using σ2(·) ≡ 1, and µ(r) = 1 − 2r, which corresponds to h(r) = 2r(1 − r).

3.5 Atlas Models

Consider the simple Atlas model with parameters σ2, g. We know from Proposition 2.6.1 that σ2 Fn(x, t) → 1 − exp(− 2g x). In particular, this tells us that the location of the Atlas stock goes to n zero in probability as n gets large (i.e. X(1)(t) → 0). In [8], Dembo and Tsai consider an Atlas model with infinitely many particles (note the difference between this setting and ours). They show

55 CHAPTER 3. LIMITING PARTICLE DYNAMICS

Figure 3.2: Rank-Rank Correlation over Time A simulated plot of the correlation between R(t) and R(0) over time with σ2 = 1, µ(x) = 1 − 2x. that the Atlas particle, when suitably rescaled, behaves as a fractional Brownian motion.

Proposition 3.5.1. In the setting of the Atlas model, if Rn(0) → α, the sequence Xn(·) converges in C([0,T ]) in probability for any fixed T > 0 to a process X(·) satisfying

σ2 1 X(0) = q(α) = log . (3.17) 2g 1 − u

n n Proof. Fix  > 0, and let A = {supt∈[0,T ] |X(1)(t)| < }. Because of the convergence X(1)(t) → 0, we may pick n large enough so that

P (A) > 1 − .

Applying Lemma A.0.4 (or [6]) with the push happening on (−∞, −] instead of (−∞, 0], we can n − a conclude that X (t) ≥ X (t) occurs with probability 1 − o(1) on the set A, where dX (t) = −gdt + σdW (t) + dLa(t) is a Brownian motion with drift −g reflecting at the point a. We can similarly argue that Xn(t) ≤ X(t), with probability 1 − o(1) and so letting  go to zero, we see that the lim Xn exists and is given by X0.

Considering the rank process R(t) = F (X(t)), we see that 1 − R(t) acts as an Geometric

56 CHAPTER 3. LIMITING PARTICLE DYNAMICS

Brownian Motion with reflection at 1 − R = 1 (i.e. at R = 0).

Proposition 3.5.2. Consider a starting rank (0, 1) 3 R(0) = r < a ∈ (0, 1). Then the Laplace transform of Ta is given by:

 q(r) q g2   gq(a)cosh σ 2λ + σ2 E exp(−λTa) = exp − . σ2  q(a) q g2  cosh σ 2λ + σ2 Proof. First note that R(t) reaching the point a is equivalent to X(t) reaching the point y := q(a), from the starting point x := q(r). We first divide by σ to move ourselves to the case of unit variance. That is, we consider a reflected Brownian motion Xe(t) = X(t)/σ, that satisfies:

Xe(0) = x/σ

g Xe=0 dXe(t) = − dt + dWt + dL (t) σ

And the hitting time of interest is now Ta = Sy/σ, which is the hitting time of Xe at the level y/σ. In order to remove the drift, we invoke Girsanov’s Theorem. To this end, we define Zt = g g2 exp(− σ Xe(t)− 2σ2 t), and the measure Q given by dQ/dP = 1/Zt on Ft. By the Novikov Condition, we may let t → ∞ and have a well defined measure Q on F∞. Under Q, Xe(·) is a reflected Brownian Motion with no drift, which then gives:

    P −λTa Q −λTa E e = E ZTa e

g g2 Q − Xe(Ta)− Ta −λT  = E e σ 2σ2 e a

gy g2 − Q −(λ+ )Ta = e σ2 E e 2σ2

 q 2  cosh x 2λ + g − gy σ σ2 = e σ2  y q g2  cosh σ 2λ + σ2

Where the third equality was obtained using the fact that Xe(Ta) = y/σ, and the last used formula 3.2.0.1 in [4].

3.6 Martingale Models

Recall that the quantities Xi(t) represent log-capitalizations of companies, and so Yi(t) = exp(Xi(t)) represent the capitalization of firm i. The dynamics of Yi are then given by:

57 CHAPTER 3. LIMITING PARTICLE DYNAMICS

s2 dYi(t) ri(t)  i = gri(t) + dt + sri(t)dWt . (3.18) Yi(t) 2

If we attempt to impose the restriction that Yi(·) is a martingale, then it must be the case that 2 gk = −sk/2 < 0 when k = ri(t). Assumption 1.5.4 does not allow µ(·) to be strictly negative, and so we cannot make each Yi be a martingale in our setting. Instead, we fix a level 0 < a < 1, and choose µ(·), σ(·) so that: σ2(r) µ(r) = − , a < r < 1. (3.19) 2 For r ≤ a, we define µ(r), σ(r) to make Assumption 1.5.4 hold. This will let us guarantee that the large companies have the desired martingale behaviour, while assigning the smaller companies a larger drift to play the role of “Atlas”. In the case when σ2(·) is affine, one could take a = 0 and use a true Atlas model. Using equation (3.19), we may write Θ0(u) = −σ2(u)/2. We can then rewrite (2.3) as:

Z u σ2(r) Z u Θ0(r)  1      q(u) = dr = − dr = log Θ( ) − log Θ(u) = C − log Θ(u) . 1/2 2Θ(r) 1/2 Θ(r) 2 Translating this from the logarithmic scale, we get:

eC eq(u) = . Θ(u)

The model now only has a single functional degree of freedom, as well as a real constant C. The entire model can thus fit by comparison with the capital distribution curves given in Figure 2.1, and the estimate of the volatility (or drift) at a single rank. The function h(·) given by (3.8) now becomes:

s 2Θ(r) −2Θ(r)2 h(r) = = . r Θ0(r) Using this, Theorem 3.2.9 becomes:

Z x −2 ExTa = 2 (1 − r)h (r)dr a Z x (−Θ0(r)) = (1 − r) 2 dr a Θ(r) Z x = (1 − r)(Θ−2)0(r)dr a

58 CHAPTER 3. LIMITING PARTICLE DYNAMICS

Integrating this expression by parts becomes:

1 − x 1 − a Z x 1 ExTa = 2 − 2 + 2 dr. Θ(x) Θ(a) a Θ(r) Zipf’s Law and the Martingale constraint introduced in this section can be used to justify an Atlas model with g = −σ2/2, giving a linear log-log plot of market capitalization versus rank with slope −1. However, as shown in Figure 2.1, empirical observations suggest that this curve is not linear. Using the method discussed above, we can maintain the Martingale property, while allowing for non-linear capital distribution curves. The Martingale assumption also allows for model fitting in which only the distribution is avail- able. In applications such as modeling equity markets, longitudinal data is freely available. How- ever, in some applications, such as the distribution of wealth, such data is harder to come by, and only the overall wealth distribution is available. This reduces the problem of fitting the entire curve, to estimating the volatility at a single rank.

59 CHAPTER 4. ASYMMETRIC COLLISIONS

Chapter 4

Asymmetric Collisions

In this chapter, we will consider the model studied in [21]. Here, the particles once again evolve as diffusions with drift and dispersion coefficients depending on their rank, but we now add in local time interactions when two particles collide. Explicitly:

1 1 i +  (ri(t),ri(t)−1) −  (ri(t)+1,ri(t)) dXi(t) = gr (t)dt + sr (t)dW (t) + p − dL − p − dL , (4.1) i i ri(t) 2 ri(t) 2

± − + (k+1,k) Zk where pk ∈ (0, 1) are such that pk + pk+1 = 1; and L (·) ≡ L0 (·) is the local time at zero ± of the kth gap process. Our assumption of 0 < pk < 1 is stronger than those used in [21], where ± pk = 0, 1 was also allowed. We will begin by providing a study of the model for fixed values of n; in particular finding the condition for the existence of a stationary distribution analogous to Assumption 1.5.1. We then follow the proof of Theorem 2.5.1 to develop a Porous Medium Equation result for this new setting. The main difficulty in this new setting is the existence of singular drift terms in (4.1), and so Lemma 2.2.3 is no longer applicable.

4.1 Existence and Stationary Distributions

The gap process here once again evolves according to a Reflecting Brownian Motion with covariance matrix A defined in (1.14), drift b defined in (1.15). The reflection matrix in this case is given by:

60 CHAPTER 4. ASYMMETRIC COLLISIONS

 −  1 −p2 0 ... 0 0    + −  −p2 1 −p3 ... 0 0 R =   . (4.2)  ......   ......    + 0 0 0 ... −pn−1 1 More explicitly, Z(t) = Z(0) + B(t) + bt + R ·L(t), (4.3)

n−1 where B(t) is a Brownian Motion in R with covariance matrix A. Note: Due to the different particle ordering convention taken in this paper compared to [21] (and others), the notation here is slightly different and care should be taken in comparing results. It is an easy exercise to show that this matrix is completely-S, and hence the gap process does indeed follow this RBM without the appearance of higher order local time terms. Once again, we have weak solutions for all time, and strong solutions until the first triple collision [21].

Proposition 4.1.1 (Sarantsev [27]). The system (4.1) almost surely has no triple collisions if and only if: 2 + − 2 + 2 − sk(pk+1 + pk−1) ≥ sk−1pk + sk+1pk (4.4) for k = 2, ..., n − 1

The ranked particles X(k)(t) of (4.1) evolve according to:

k + (k,k−1) − (k+1,k) dX(k)(t) = µ(k/n)dt + σ(k/n)dB (t) + pk dL (t) − pk dL (t). (4.5)

± If the pk = 1/2, then we get symmetric collisions, and are back in our original setting. This can be seen either by comparing the reflection matrix R in (4.2) with the reflection matrix in (1.16), ± ± or by noting that with pk = 1/2, there are no local time terms in (4.1). With pk 6= 1/2, it is as if some particles are “heavier” than others, and so push them around more. Following this intuition, we define mk such that:

+ mk−1 − mk+1 pk = , pk = . (4.6) mk + mk−1 mk+1 + mk ± Intuitively, mk is the mass of particle k, and these choices of pk are made to follow a conservation ± of momentum. It is easy to see that for any values of pk , there is a corresponding sequence of mk, so there is no loss in generality doing this.

61 CHAPTER 4. ASYMMETRIC COLLISIONS

In the case of symmetric collisions, we showed that a necessary and sufficient condition for stability is for the average drift of the bottom m particles to exceed the average drift of the top n − m particles for any choice of m. One would guess that to extend this to asymmetric collisions, we need simply turn these into weighted averages. This is indeed the correct condition as shown below. It is shown in Theorem 6.2 of [16] that Z has a stationary distribution if and only if R−1b < 0 component-wise. Since b depends only on the differences in drifts, we may assume without loss of generality that m1g1 + ... + mngn = 0.

−1 Lemma 4.1.2. If m1g1 + ... + mngn = 0, then R b = v where

mk + mk+1 vk = − (m1g1 + ... + mkgk). mkmk+1

Proof. We want to check that Rv = b, which we will do in three cases. If k = 1:

m1 + m2 − m2 + m3 (Rv)1 = − (m1g1) + p2 (m1g1 + m2g2) m1m2 m2m3 m1 m1g1 + m2g2 = −g1 − g1 + m2 m2

= g2 − g1.

If 2 ≤ k ≤ n − 2:

+ − (Rv)k = − pk vk−1 + vk − pk+1vk+1 m1g1 + ... + mk−1gk−1  1 1  = − + (m1g1 + ... + mkgk) mk mk mk+1 m g + ... + m g + 1 1 k+1 k+1 mk+1

=gk+1 − gk.

62 CHAPTER 4. ASYMMETRIC COLLISIONS

If k = n − 1:

+ (Rv)n−1 = −pn−1vn−2 + vn−1 m1g1 + ... + mn−2gn−2  1 1  = − + (m1g1 + ... + mn−1gn−1) mn−1 mn−1 mn m1g1 + ... + mn−1gn−1 = −gn−1 − mn mngn = −gn−1 + mn

= gn − gn−1,

Pn where the second last equality is due to the assumption that k=1 mkgk = 0.

Corollary 4.1.3. The system (4.1) has a stationary distribution of its gaps if and only if:

Pl m g Pn m g k=1 k k > k=1 k k (4.7) Pl Pn m k=1 mk k=1 k for all 1 ≤ l ≤ n − 1.

Proof. This proof is analogous to the discussion following Assumption 1.5.1.

First, note that the gaps depend on the drifts g1, ..., gn only through their differences bk = gk+1 − gk. Because of this, adding a constant a to all of the gk does not change the dynamics of the gap process Z(t), and hence has no effect on the existence of a gap process. Pick a so that:

n X mk(gk + a) = 0. (4.8) k=1

We know that a stationary distribution exists if and only if R−1b < 0 component-wise. By n Lemma 4.1.2 applied to the drift vectors gk + a k=1, this is equivalent to:

k X ml(gl + a) > 0, ∀k = 1, ..., n − 1. (4.9) l=1 Using the definition of a in (4.8), the condition (4.9) gives exactly (4.7).

The condition (4.7) is equivalent to the condition R−1b < 0 appearing in other papers (eg. ± [21]). However, the change of notation from pk to mk offers a novel interpretation of the condition.

63 CHAPTER 4. ASYMMETRIC COLLISIONS

In the study of stationary distributions in [21], with D = diag(A), it is shown that if R,A satisfy the “Skew-Symmetry” condition

2(R − I − A) = RD + DR, (4.10) then the invariant distribution is the product of independent exponentials (this is analgous to Theorem 1.5.3). Using this, one could easily repeat the proof of Theorem 2.1.3 to obtain the limiting distribution under the condition (4.10).

4.2 Taking the Limit

± As we did in (1.6), in order to take a limit as n → ∞, we must choose a way for pk to have a dependence on n. We proceed as before, and let m : [0, 1] → (0, ∞) be a continuous function, ± and define mk = m(k/n). Using (4.6), we can then define pk . Similar to the µ(·), σ(·), we will ± treat the function m(·) as being given, and derive the reflection weights pk accordingly. We define R x M(x) = 0 m(r)dr, and since we only care about the ratio of the masses, we will normalize m so that M(1) = 1. We will now define

Z u Θ(e u) = µ(r)m(r)dr, (4.11) 0 and make the following assumption, analagous to Assumption 1.5.4:

Assumption 4.2.1. With Θe as in (4.11), Θ(1)˜ = 0 and Θ(˜ u) > 0 for 0 < u < 1. Furthermore, µ(·) is continuous and µ(0), µ(1) 6= 0.

Proceeding as in the proof of Lemma A.0.1, we see that this implies the conditions of Corol- lary 4.1.3, and hence the existence of a stable distribution for n large enough, as long as µ(0), µ(1) 6= 0. We will consider the weighted Empirical Distribution Function

n n  X −1 X F˜ (x, t) = m m 1 (4.12) n ` k X(k)(t)≤x `=1 k=1 of the particles at time t. Further, define the weighted distribution of trajectories

n n −1 n  X  X He = m` mkδX(k) ∈ MT = P(C([0,T ])), (4.13) `=1 k=1

64 CHAPTER 4. ASYMMETRIC COLLISIONS

n and let ρen ∈ P(MT ) be the distribution of He . Note that unlike in Chapter 2, here we consider the trajectories of the ranked particles as opposed to that of the named ones. This will require a ˜ different argument for the tightness of ρen, but once projecting down to the level of Fn, there is no difference between considering the ranked particles or named particles. If we attempted to apply the methods of Chapter 2 to show that ρen is tight, we would run into problems when applying Corollary 2.2.5, since this does not cover diffusions with local time terms.

4.3 ρen is Tight

We will prove a weaker version of tightness than the proof of Theorem 2.5.1, though one that will be sufficient for establishing the PME. Consider Hen(·) as a continuous function from [0,T ] to n MR := P(R), the space of probability measures on the real line. That is, He ∈ C([0,T ], MR). The space MR is given the topology of weak convergence induced by the Levy metric, and C([0,T ], MR) is given the topology of uniform convergence. Under this identification,ρ ˜n is a measure on

C([0,T ], MR).

Lemma 4.3.1. ρen is tight as a measure on C([0,T ], MR).

In Lemma 1.3 of [15], and the proof of Theorem 1.1 in [28], a characterization of tightness for such a space is given by:

Lemma 4.3.2. The sequence ρen is tight if for every  > 0, there is a countable dense subset {f1, ...} of Cc(R), and compact sets K0 ⊂ MR and K1, ... ⊂ C([0,T ]) such that:

−r ρen(Hr) > 1 − 2 , ∀r, n ≥ 0 (4.14) where:

H0 = {ξ ∈ C([0,T ], MR): ξ(t) ∈ K0, ∀t ∈ [0,T ]}

Hr = {ξ ∈ C([0,T ], MR):(ξ(·), f) ∈ Kr}, r ≥ 1, where (ν, f) = R fdν.

65 CHAPTER 4. ASYMMETRIC COLLISIONS

4.3.1 Construction of K0

We would like to construct a compact set KR ⊂ MR such that

ρen({ξ ∈ C([0,T ], MR): ξ(t) ∈ K0, ∀t ∈ [0,T ]}) > 1 − .

In order for the set K0 to be compact, it needs to be tight set of measures on R. Hence, we fix −i some sequence i = 2 → 0, and define the set K0 by:

K0 = {ν : ν([−Ki,Ki]) > 1 − i ∀i}, where the real numbers Ki will be chosen later. n Let λm denote the distribution on [0, 1] with

n mk λm({k/n}) = P n . (4.15) ` = 1 m`

n Then λm converges weakly to the measure λm with density

λm(dx) = m(x)dx. (4.16)

If we can argue that for some a < b that X(an)(t),X(bn)(t) ∈ [−K,K] for all t ≤ T , then it follows n n that for any c ∈ (a, b), X(cn)(t) ∈ [−K,K] as well. This then says that He (t)([−K,K]) ≥ λm([a, b]) ∞ for all t ∈ [0,T ]. Pick a function φ : R → R such that φ ∈ C , φ(z) = 1 if z ≤ 0, φ(z) = 0 if z ≥ 1, and φ is decreasing between 0 and 1. For any fixed y  x, define

z − x ψ (z) = φ . (4.17) x,y y − x

Then 1(−∞,x] ≤ ψx,y ≤ 1(−∞,y]. With ψ = ψx,y, let:

−1 x,y  X  X Zt = Zt = m` mkψ(X(k)(t)). ` k Because of the definition (4.17), we have the bounds

0 −1 kψ k∞ ≤ C1(y − x)

00 −2 kψ k∞ ≤ C2(y − x) .

66 CHAPTER 4. ASYMMETRIC COLLISIONS

We can now write:

 X −1 X h σ2(k/n) i dZ = m m µ(k/n)ψ0(X (t)) + ψ00(X (t)) dt t ` k (k) 2 (k) ` k −1  X  X 0 k + m` mkσ(k/n)ψ (X(k)(t))dW (t). ` k

Since we will consider y  x, the drift term in dZt can then be bounded by

kσ2k kµk kψ0k + ∞ kψ00k ≤ C (y − x)−1, ∞ ∞ 2 ∞ 3 and the quadratic variation bounded by

−2  X  X 2 2 0 2 −1 −1 m` mkkσ k∞kψ k∞ ≤ C4(y − x) n . ` k

Here C3,C4 depend on the parameters µ, σ, but are independent of x, y. Letting κ = y − x, and Nt denote the martingale part of Zt, we calculate:

−1 P( sup |Zt − Z0| ≥ ) ≤ P( sup |Nt − N0| ≥  − C3κ S) 0≤t≤S 0≤t≤S

EhNSi ≤ C5 −1 2 ( − C3κ S) −1 C5C4κ Sn ≤ −1 2 . ( − C3κ S) n

Now, assuming that Sκ−1  , we can continue to write:

κ−1S P( sup Zt − Z0 ≥ ) ≤ C6 2 , (4.18) 0≤t≤S  n

where the constant C6 is again independent of x, y.

Lemma 4.3.3. For any  > 0, there exists an K > 0, and a compact interval A ⊂ (0, 1) such that P (X (t) ∈ [−K,K] ∀t ∈ [0,T ], α ∈ A) > 1 − , and λn (A) > 1 −  for all sufficiently large n. ρen (αn) m

Proof. We pick 0 < a < b < 1 such that λm([0, a]) = /3 and λm([b, 1]) = /3. Then A = [a, b] n n n satisfies λm(A) = 1 − 2/3, and so for large enough n, λm(A) ≥ 1 − , λm([b, 1]) > /4, λm([0, a]) > /4 by weak .

67 CHAPTER 4. ASYMMETRIC COLLISIONS

We fix 0  L  K, and α0 ∈ (0, 1) such that

n λm([α, 1]) < /16 (4.19) for all n. As in the proof of Proposition 2.2.7, we can consider the weighted central moment function: n n −1 p  X  X ¯ p V (t) = m` mk(X(k)(t) − X(t)) , (4.20) `=1 k=1 where X¯(t) is the weighted mean

n n −1 ¯  X  X X(t) = m` mkX(k)(t). `=1 k=1 and use this to argue that the sequence of stationary distributions is tight. Hence, we can pick L large enough so that

P(qn(α0, 0) > L) < /2 (4.21)

L,K Now, consider Zt = Zt . We know that:

F˜n(L, t) ≤ Zt ≤ F˜n(K, t).

We also know that, F˜n(L, 0) > 1 − /16 with at least probability 1 − /2 by combining (4.19) and (4.21). Using (4.18), we may take K large enough, such that T κ−1  , and so that:

2 16 C6T −1 P(sup |Zt − Z0| ≥ /16) ≤ 2 1 n = o(1). t≤T  (K − L)

This probability can be made as arbitrarily small by increasing n. We have now shown that, with probability 1 − /2 − o(1):

F˜n(K, t) ≥ Zt

≥ Z0 − /16

≥ F˜n(L, 0) − /16

≥ 1 − /16 − /16

= 1 − /8

68 CHAPTER 4. ASYMMETRIC COLLISIONS holds for all t ≤ T . In plain English, we have shown that for all time t, no more than /8 mass is n ever above a level K. But, the set A has at least /4 mass above it (since λm([b, 1]) > /4), and so no member of A can ever get above the level K. Hence, with probability 1 − /2 − o(1), for all t ≤ T , and α ∈ A, we have:

X(αn)(t) ≤ K.

The lower bound is similar, so we’re done.

n Now, for each i, we choose Ki,Ai as in Lemma 4.3.3 so that λm(Ai) > 1 − i, and P(X(αn)(t) ∈

[−Ki,Ki]∀t, α ∈ Ai) > 1 − i. This says that Hen(t) put mass at least 1 − i on the set [Ki,Ki] for −i all t, except on a set of probability i = 2 . Summing these, we see this occurs for all i except on a set of measure , and so ρen(H0) > 1 −  as desired. Since Lemma 4.3.3 only holds for large values of n, we then expand the set K0 to make (4.14) hold for small n.

4.3.2 Construction of Kr

For each r ∈ N, we need to choose a compact set Kr ⊂ C([0,T ]) such that:

−r ρen({ξ ∈ C([0,T ], MR):(ξ(·), f) ∈ Kr}) > 1 − 2 .

To construct the sets Hr, we first choose {f1, ...} to be a dense set of compactly supported smooth n functions. Then defining Zt = Zt = (Hen(t), fr), we have: n n  X −1 X Zt = m` mkfr(X(k)(t)) `=1 k=1 n n −1  X  X 0 k dZt = m` mkfr(X(k)(t))σ(k/n)dB `=1 k=1 n n  X −1 X  1  + m m f 0 (X )µ(k/n) + m f 00(X )σ2(k/n) dt. ` k r k k 2 r k `=1 k=1

Both the drift and dispersion terms of dZt are bounded by constants independent of n. Z(t) is also necessarily bounded, since fr is bounded. This, combined with Corollary 2.2.5 gives us that the sequence of random variables Zn(·) ∈ C([0,T ]) forms a tight sequence. But this means that there is a compact Kr ⊂ C([0,T ]) such that

n −r ρen({ξ ∈ C([0,T ], MR):(ξ(·), f) ∈ Kr}) = P(Z ∈ Kr) > 1 − 2 .

69 CHAPTER 4. ASYMMETRIC COLLISIONS

4.4 Porous Medium Equation

We now know that ρen is tight as a measure on C([0,T ], MR). We would like to establish a Porous Medium Equation type relation for any weak limit of this sequence. Fix a weakly convergent subsequence, and denote it also as ρen → ρe for ease of notation. Since we are only concerned with the distribution of the weak limit, we may assume, by Skorokhod’s Representation Theorem

(Theorem 3.5.1 in [9]), that the convergence Hen(·) → He(·) is almost sure on C([0,T ], MR). For us, it will be enough to have this convergence happen pointwise. Hence, we have that F˜n(·, t) → F˜(·, t) weakly for all t almost surely for some limiting distribution F˜.

 −1  Lemma 4.4.1. limn→∞ Fn − M (F˜n) = 0 uniformly in x, t, ω.

Proof. The proof is immediate from the definitions.

The remainder of the proof mirrors Theorem 2.3.2 exactly; with weighted averages replacing averages. For any fixed smooth, compactly supported function f, we define Zt by: n −1 n  X n  X Zt = m`=1 mkf(X(k)(t)) k=1 Z = f(x)F˜n(dx, t) Z → f(x)F˜(dx, t)

=:Zt.

Taking time derivatives, the local time terms once again dissapear since we are taking a weighted sum, and we get: n n −1 n  X  X 0 dZt = m` mkµ(Fn(X(k)(t), t))f (X(k)(t))dt `=1 k=1 n n  X −1 X 1 + m m σ2(F (X (t), t))f 00(X (t))dt ` k 2 n (k) (k) `=1 k=1 n n −1  X  X 0 k + m` mkσ(Fn(X(k)(t), t))f (X(k)(t))dWt `=1 k=1 Z 1 = [µ(F (x, t))f 0(x) + σ2(F (x, t)f 00(x))]F˜ (dx, t)dt + O(n−1/2) n 2 n n Z 1 = [µ(M −1(F˜ (x, t)))f 0(x) + σ2(M −1(F˜ (x, t)))f 00(x)]F˜ (dx, t)dt + o(1) + O(n−1/2). n 2 n n

70 CHAPTER 4. ASYMMETRIC COLLISIONS

Where the last equality follows from the fact that µ, σ2 are uniformly continuous, the fact that f is bounded, and Lemma 4.4.1. A careful reading of the proof of Lemma 2.3.1 shows that only weak convergence of Fn(·, t) → F (·, t) for each t was required in order interchange limits and integration. Hence, we’ve shown the following:

−1 −1 Proposition 4.4.2. Let µ˜ = µ ◦ M and σ˜ = σ ◦ M . Suppose that ρen → ρe weakly as measures on C([0,T ], MR). Then any realization of the limit ρe almost surely satisfies:

Z Z g(x, t)F˜(dx, t) − g(x, 0)F˜(dx, 0) = (4.22) Z t Z 1 2 [gx(x, s)˜µ(F˜(x, s)) + gxx(x, s)˜σ (F˜(x, s)) + gt(x, s)]F˜(dx, s)ds 0 2

∞ for all g ∈ CC (R × [0,T ]) and t ∈ [0,T ].

The same reasoning in Chapter 2 applies again to argue that since we started from a stationary distribution, the limit must be stationary. But there is a unique stationary solution to the PME, and so we have: 1 Z u σ˜2(r) q˜n(u, t) → q˜(u) := R r dr. (4.23) 2 M(1/2) 0 µ˜(w)dw Since q =q ˜ ◦ M, we have just shown that:

Theorem 4.4.3. In the setting of model (4.1), we have:

Z M(u) σ˜2(r) qn(u, t) → R r dr =: qM (u) M(1/2) 2 0 µ(x)dx in probability uniformly on compact sets of (0, 1) × [0, ∞). We also have:

−1 Fn(·, t) → qM (·) uniformly for t ∈ [0,T ].

71 CHAPTER 5. LOCAL TIME AND RANKED-BASED PORTFOLIOS

Chapter 5

Local Time and Ranked-Based Portfolios

k X(k+1)−X(k) The local time between two ranked particles L (t) = L0 (t) can be thought of as a measure of how often the particles X(k) and X(k+1) exchange names. In portfolio theory, this measures how often an investor Alice, that always invests in X(k), switches investments with another investor

Bob, who always invested in X(k+1). Note that since X(k) ≤ X(k+1) this switch is always due to Bob’s investment doing poorly, or Alice’s doing well. In both cases, the higher the value of Lk, the better Alice does relative to Bob. Throughout this chapter, we assume that σ2(·) is affine to allow for precise estimates as in Proposition 2.8.2.

5.1 Limits of Local Time

Lk(t) We know from [1] that limt→∞ t = 2(g1 + ... + gk) ∼ 2nΘ(k/n). We would like to extend this to a deterministic limiting result as n → ∞. For a fixed u ∈ (0, 1), we define:

−1 bunc Ln(u, t) = n L (t)

L(u, t) = 2Θ(u)t.

We will argue the following theorem:

72 CHAPTER 5. LOCAL TIME AND RANKED-BASED PORTFOLIOS

Theorem 5.1.1. In the setting of Theorem 2.1.3, for any fixed T < ∞, Ln(u, t) → L(u, t) uniformly in the following sense:

sup |Ln(u, t) − L(u, t)| → 0, (5.1) u∈(0,1),t∈[0,T ] where the limit is taken as a limit in probability.

The proof will be broken down into several steps, similar to that of Proposition 2.8.2. In Lemma 5.1.2, we will find an expression for Lk(t) for finite n. Next, we will argue that for any fixed u, t, that the desired convergence holds. This will then be extended to all u for a fixed t. Finally, the fact that Ln,L are both increasing functions of t will allow us to extend to all times t.

Lemma 5.1.2. k k  X l  L (t) = 2 µ(`/n)t + σ(`/n)W (t) − (X(`)(t) − X(`)(0)) . (5.2) `=1

l 1 `−1 ` Proof. Recall that dX(`)(t) = µ(`/n)dt + σ(`/n)dW (t) + 2 (dL (t) − dL (t)). Using this, we can compute:

k k X X 1 X (t) − X (0) = µ(`/n)t + σ`W `(t) + (L`−1(t) − L`(t)) (`) (`) 2 `=1 `=1 k X h i 1 = µ(`/n)t + σ(`/n)W `(t) − Lk(t), 2 `=1

Pk 1 `−1 ` where the second equality comes from evaluating the telescoping sum `=1 2 (L (t) − L (t)). Solving for Lk gives the desired result.

Lemma 5.1.3. For any fixed u ∈ (0, 1), t ∈ [0, ∞), the limit limn Ln(u, t) = L(u, t) holds in probability.

Proof. Using Lemma 5.1.2, we can write:

−1 Ln(u, t) =2n (µ(1/n) + ... + µ(bunc/n))t

+ 2n−1(σ(1/n)W 1(t) + ... + σ(bunc/n)W bunc(t))

−1 + 2n (qn(1/n, t) + ... + qn(bunc/n, t))

−1 − 2n (qn(1/n, 0) + ... + qn(bunc/n, 0)).

73 CHAPTER 5. LOCAL TIME AND RANKED-BASED PORTFOLIOS

The first term here is easily seen to converge to 2Θ(u)t. The second term is a Gaussian random variable with mean zero, and variance O(n−1), and so converges to zero in probability. It remains to show that the final two terms converge to the same value, and so cancel each other out in the limit. −1 Pbunc R u We will accomplish this by arguing that n `=1 qn(`/n, t) ∼ 0 qn(r, t)dr converges in prob- ability to a constant, independent of t, as n → ∞. We fix α > 0, and rewrite this integral as: Z u Z α Z u qn(r, t) = qn(r, t) + qn(r, t) = A + B 0 0 α R u Since qn(r, t) → q(r) uniformly in probability on [α, u], we immediately see that B → α q(r) in R u 2 probability. If we let α → 0, this becomes 0 q(r). Note that as r → 0, q(r)/ log(r) → σ (0)/2µ(0), and so q(r) is integrable near r = 0 (since log r is integrable). We will now argue that as α → 0, A → 0 in probability (uniformly over all n). We consider the sum:

Sn(α) := qn(1/n, t) + ... + qn(α, t). (5.3)

We know that the difference qn(α, t) − qn(k/n, t) can be written as the sum of exponential distributions. We also know that qn(α, t) = oP (1) + q(α). Hence, we can rewrite (5.3) as: αn X n Sn(α) = αnqn(α, t) − kξk k=1

n n σ2(k/n)+σ2((k+1)/n) where ξk is an exponentially distributed random variable with mean ak = 4(µ(1/n)+...µ(k/n)) (see n Theorem 1.5.3). The coefficient k in front of ξk comes from the fact that the gap gets added once for each particle of rank ≤ k. Since we can bound

k n 2 −2 V ar(ξn) = (ak ) ≤ Ck , the sum of exponential has variance bounded by: αn αn αn X n X 2 n X 2 −2 V ar kξk = k V ar(ξk ) ≤ C k k = Cαn. k=1 k 1 Using this bound, we can write:

S (α) Pαn kξn n = αq (α, t) − k=1 k n n n

= αq(α) + oP (1) − V (n, α),

74 CHAPTER 5. LOCAL TIME AND RANKED-BASED PORTFOLIOS

−1 Pαn n 2 −1 where V (n, α) = n k=1 kξk is a random variable with variance O(α n ). The mean of V (n, α) can be bounded by: αn αn −1 X n −1 X −1 EV (n, α) = n kEξk ≤ n kCk ≤ αC. k=1 k=1 By applying Chebyshev’s Inequality to V (n, α), we see that, since both the mean and variance go to zero uniformly as α → 0, we can make P(|V (n, α)| > /3) → 0 for small values of α as n → ∞. R α The expression αq(α) → 0 as α → 0, since q(α) ∼ C log(α). Hence, to make | 0 q(r, t)dr| <  with high probability, we first pick α small enough to make αq(α) < /3 and so P(|V (n, α)| > /3).

Then pick n to make oP (1) < /3 with high probability (note that the error represented by oP (1) goes to zero as n → ∞, but depends on α, and so it is necessary for fix α before letting n → ∞). This completes the proof of the Lemma.

If one was only concerned about the local time at a single rank, we could now easily prove the following:

Corollary 5.1.4. For any fixed u ∈ (0, 1),T ∈ (0, ∞), supt∈[0,T ] |Ln(u, t) − L(u, t)| → 0 in proba- bility.

Proof. (Sketch) By Lemma 5.1.3, we can show convergence for any fixed set of time points t1 <

... < tL. But since Ln,L are increasing in t, and L is continuous, we can extend this to all time by Lemma 2.1.5.

Since our goal is to prove convergence uniformly for all values of u, we will do this first before using the monotonicity of Local Times in t to extend to all t (we are saving the easy part for last).

Lemma 5.1.5. For any fixed t, (5.1) holds uniformly over u ∈ (0, 1).

Proof. Once again, we write:

−1 Ln(u, t) =2n (µ(1/n) + ... + µ(bunc/n))t

+ 2n−1(σ(1/n)W 1(t) + ... + σ(bunc/n)W bunc(t))

−1 + 2n (qn(1/n, t) + ... + qn(bunc/n, t))

−1 − 2n (qn(1/n, 0) + ... + qn(bunc/n, 0)).

75 CHAPTER 5. LOCAL TIME AND RANKED-BASED PORTFOLIOS

The convergence of the first term is simply the convergence of a Riemann Sum, which is uniform for all u ∈ (0, 1). For the second term, consider the process:

−1 1 bunc  u 7→ Gn(u) = n σ(1/n)W (t) + ... + σ(bunc/n)W (t) .

2 Then Gn(u) is a right-continuous martingale indexed by u. We can bound the L norm by:

2 2 −2  1 n  EGn(1) = n E σ(1/n)W (t) + ... + σ(n/n)W (t) n X = n−2 σ2(k/n)t k=1 ≤ n−1kσ2kt.

By Doob’s Martingale inequality, we can conclude that Gn(u) → 0 uniformly in u in probability. Next, we consider the terms:

Z u −1 2n (qn(1/n, t) + ... + qn(bunc/n, t)) ≈ qn(r, t)dr. 0

We once again fix α and consider three cases: (i) u ∈ [α, 1−α], (ii) u ∈ (0, α) and (iii) u ∈ (1−α, 1). By symmetry, cases (ii) and (iii) are identical. R u R α R u (i). We can write 0 qn = 0 qn + α qn. Since qn → q uniformly on u ∈ [α, 1 − α], and the R u first integral is the same for all u in this region, we see that 0 qn converges uniformly for all u ∈ [α, 1 − α]. R u R α (ii). Note that 0 |qn| ≤ 0 |qn| for u ∈ (0, α]. But we know from the proof of Lemma 5.1.3 R α R u that 0 |qn| is small in probability for small α, we can uniformly bound 0 qn by something small in probability. The rest of the argument is identical to the end of Lemma 5.1.3

We are now in a position to prove Theorem 5.1.1.

Proof. Fix times t1 < ... < tL,  > 0. We know by Lemma 5.1.5 that for large enough n:

|Ln(u, ti) − L(u, ti)| < ,

76 CHAPTER 5. LOCAL TIME AND RANKED-BASED PORTFOLIOS

for all u ∈ (0, 1), i = 1, ..., L with probability 1 − o(1). Then, for t ∈ [ti, ti+1), we have:

Ln(u, t) ≤ Ln(u, ti+1)

≤ L(u, ti+1) + 

= L(u, t) + 2Θ(u)(ti+1 − t) + .

Since Θ(u) is bounded, if the mesh size of the partition goes to zero, this error can be made smaller than  uniformly over u, so we are done. By similarly bounding from below, we get the desired convergence.

5.2 A Primer on Stochastic Portfolio Theory

This section will present a quick primer on Stochastic Portfolio Theory as it relates to rank-based results. Details of everything in this section can be found in [12]. Formulas presented here might have a slightly different form than those presented in [12]; this is due to the opposite convention for ordering particles, and due to certain variables (µ, Θ) already having a reserved meaning in this paper. This is done to agree with the notation in the papers [20, 28]. X Recall that the exponential Yi = e i is used to model the capitalization of a company. The market weight of this company is given by:

Yi(t) νi(t) = . Y1(t) + ... + Yn(t)

(This is denoted µk in [12]). We similarly define ν(k) to be the market weight of Y(k). A portfolio n π is a mapping π :Ω × [0,T ] → R , such that π1 + ... + πn = 1. πi(t) represents the proportion of one’s current capital invested in company i at time t. Once again, π(k)(t) is used to denote the proportion of capital invested in the kth ranked company at time t.

Assuming we start with a single dollar, the wealth process Vπ(t) associated with a portfolio is the amount of money we have at time t, assuming we have invested according to π. This process satisfies: n π X d log V (t) = γπdt + πi(t)dWi(t) (5.4) i=1 V (0) = 1 (5.5)

77 CHAPTER 5. LOCAL TIME AND RANKED-BASED PORTFOLIOS

where γi is given by:

X 1 X  γ (t) = π (t)g + π (t)s2 − π (t)2s2 , (5.6) π (k) k 2 (k) k (k) k k k ∗ where the latter summand in (5.6) is denoted γπ and is known as the excess growth rate, which captures how much extra growth is gained by diversification. (5.4) and (5.6) are special cases of equations (1.1.13) and (1.1.21) in [12]. The portfolio given by π = ν is what would happen if one invested in the entire market, and is known as the market portfolio.

n n Theorem 5.2.1 (Theorem 4.2.1 in [12]). Let ∆ = {x ∈ R : x1 +...+xn = 1}. Let S be a positive 2 n C function defined on a neighbourhood of ∆ , and let S(x1, ..., xn) = S(x(1), ..., x(n)). Then S generates the portfolio π given by:

 X  π(k) = Dk log(S(ν)) + 1 − ν(j)Dj log S(ν) ν(k), (5.7) j

has a wealth process Vπ satisfying:

log(Vπ(t)/Vν(t)) = log S(ν(t))/S(ν(0)) (5.8) Z t 1 X − D S(ν(s))ν (s)ν (s)τ (s)ds 2S(ν(s)) i,j (i) (j) (ij) 0 i,j n−1 Z t X k + (π(k)(s) − π(k+1)(s))dL (s), k=1 0

where τ(ij) is given by:

2 2 2 X 2 2 τ(k`)(t) = 1k=`sk − skν(k)(t) − s` ν(`)(t) + skν(k)(t). k

Example 7. (This is Example 4.3.2 in [12]). Let S(x) = x(1) + ... + x(m). Then S generates a portfolio π with:

 ν(k)(t)  S(ν(t)) , k ≤ m π(k)(t) =  0, k > m

78 CHAPTER 5. LOCAL TIME AND RANKED-BASED PORTFOLIOS

This corresponds to investing in the smallest m companies exclusively, weighted by their relative size. In other words, we buy and hold the smallest m stocks, and only re-balance when the mth and (m + 1)st smallest companies switch spots.

5.3 Rank Based Portfolios in Large Markets

The range of possible functions S is quite large, and in general, there need not be a natural extension of S to a function on n variables. One could attempt to restrict S to a class of functions with a P natural extension (e.g. functions of the form S(x) = g( s(νk))). However all such attempts the author has made have either resulted in too limited a class of portfolios, or too unwieldy of calculations. Instead, we will present a sequence of example calculations for portfolios of interest in the Stochastic Portfolio Theory literature. For a discussion of the finite n version of these portfolios, see Chapter 4 of [12].

In all of these examples, we will argue that Vπ/Vν converges to a deterministic limit by consider- ing seperately the three terms on the righthand side of (5.8). Argument for the first term, log S(ν) is a modification of the proof of Proposition 2.8.2. The second and third terms (when present), are divided into expressions containing indices near the edges, and indices near the middle (similar to the proof of Proposition 2.8.2). Using Theorem 2.1.3, Proposition 2.8.2 and Theorem 5.1.1, the indices near the middle are shown to have uniform convergence to the desired limit. The total effect of the indices near the edges is argued to be small, and so we can safely ignore these contributions and take the nicely behaved limit in the interior. Finally, throughout this section, unless otherwise stated, we will make the assumption:

σ2(1) < 1. (5.9) 2|µ(1)|

By Proposition 2.8.2, this is the condition needed to guarantee the convergence of the integral

R 1 qn(u,t) terms 0 e du. If this integral diverged, then in the limit as n → ∞, the total market weight of companies ranked 1, ..., (1 − )n will converge to zero as n → ∞ for any  > 0.

Example 8. Fix two levels 0 ≤ α < β ≤ 1. Let

βn X S(x) = Sn(x) = x(k) (5.10) k=αn

79 CHAPTER 5. LOCAL TIME AND RANKED-BASED PORTFOLIOS

As in Example 7,

 ν(k)(t)  S(ν(t)) , k/n ∈ [α, β] π(k)(t) =  0, else.

We also have DijS = 0, and π(k+1) −π(k) = 0 unless k as at the boundary αn or βn. The wealth process is given by:

Z t Z t αn βn log(Vπ(t)/Vν(t)) = log(S(ν(t))/S(ν(0)) − π(αn)(s)dL (s) + π(βn)dL (s). 0 0 To show convergence of the first term (to zero), we need only argue that S(ν(t)) converges to a non-zero constant, independent of t, as n → ∞. But we can write S of (5.10) as:

R β qn(u,t) α e du S(ν) = 1 . R qn(u,t) 0 e du Where qn(u, t) = X(un)(t) is the empirical quantile function at time t. The limit of both the numerator and denominator exist and converge by Proposition 2.8.2 and (5.9). These limits are both independent of t, and so the log of their ratio is zero. Explicitly, the limit is given by:

R β eq(u)du lim S(ν) = α . (5.11) n→∞ R 1 q(u) 0 e du If α = 0, then the first integral term is zero. Else, it can be written as:

ν (s) π (s)dLαn(s) = (αn) dLαn(s) (αn) S(ν)

qn(α,s) e αn = 1 dL (s) R qn(u,t) S(ν)n 0 e du eqn(α,s) = 1 Ln(α, ds) R qn(u,t) S(ν) 0 e du We know from Proposition 2.8.2 and (5.11) that the term in front of the local time converges uniformly to a constant independent of t. The local time also converges by Theorem 5.1.1; and so this term converges uniformly in time to:

Z t q(α) αn e π(αn)(s)dL (s) → L(u, t) R β q(u) 0 α e du

80 CHAPTER 5. LOCAL TIME AND RANKED-BASED PORTFOLIOS

A similar result holds for β, and so we have the limiting relative log wealth process:

eq(β)Θ(β) − eq(α)Θ(α) lim log(Vπ(t)/Vν(t)) = 2 t (5.12) n R β q(u) α e du σ2(1) q(β) −κ Since we have made the assumption that κ = 2|µ(1)| < 1, we know that e = O((1 − β) ), and Θ(β) = O(1 − β). This combines to say that the term:

lim eq(β)Θ(β) = 0, β%1 which agrees with the substitution of β = 1. We can rewrite the righthand side of (5.12) as:

eq(β)Θ(β) − eq(α)Θ(α) R β(µ(u) + q0(u)Θ(u))eq(u)du = α R β q(u) R β q(u) α e du α e du If we want to choose α, β to maximize this ratio, it is equivalent to finding α, β to maximize the ratio of the integrands, and then pick α ∼ β; which corresponds to us only holding a single stock with rank βopt. βopt can be found by: h(µ(β) + q0(β)Θ(β))eq(β) i β = argmax opt β eq(β) h 0 i = argmaxβ µ(β) + q (β)Θ(β) h 1 i = argmax µ(β) + σ2(β) . β 2

1 2 The expression µ(β) + 2 σ (β) is exactly the drift term of the stock Y(βn). This says that in the limit, the optimal ‘index’ type portfolio, is equivalent to the naive strategy of simply picking the stock with the highest return.

σ2(1) Example 9. (See Example 3.4.4 from [12]) For a fixed p > 0 with p 2|µ(1)k < 1 or p < 0 with σ2(0) 2µ(0) < 1, consider the diversity weighted portfolio π generated by S(x) = kxkp. The portfolio weights satisfy:

νp (t) π (t) = (k) (k) S(ν(t))p The formula (5.8) becomes:

∗ d log Vπ(t)/Vν(t) = d log S(ν(t)) + (1 − p)γπdt (5.13)

81 CHAPTER 5. LOCAL TIME AND RANKED-BASED PORTFOLIOS

∗ (see (5.6) for the definition of γπ). The disappearance of the local time terms in (5.13) is not surprising, since the diversity weighted portfolio can be expressed in terms of names, which have no local time terms in their dynamics. We have  Z 1 1/p Z 1 −1 S(ν) = n1/p−1 epqn(t,u)du eqn(t,u)du . (5.14) 0 0 Both the numerator and denominator converge uniformly in time by Corollary 2.8.3, and the n1/p−1 cancels out when taking a ratio, so:

log S(ν(t))/S(ν(0)) → 0

∗ in probability uniformly in t. It remains to show that γπ(t) converges to a deterministic limit independent of t.

∗ X 2 2 2 2γπ(t) = π(k)(t)σ (k/n) − π(k)(t)σ (k/n) k p 2p X ν(k)(t) X ν(k)(t) = σ2(k/n) − σ2(k/n) S(ν(t))p S(ν(t))2p k k R pq (u,t) X epqn(k/n,t)  e n du −1 = σ2(k/n) n1−p np(R eqn(u,t)du)p (R eqn(u,t)du)p k R pq (u,t) 2 X e2pqn(k/n,t)  ( e n du) −1 − σ2(k/n) n2−2p n2p(R eqn(u,t)du)2p (R eqn(u,t)du)2p k X n−1epqn(k/n,t) X n−2e2pqn(k/n,t) = σ2(k/n) − σ2(k/n) . R epqn(u,t)du (R epqn(u,t)du)2 k k

R 1 2 pq(u) R 1 pq(u) The first sum here converges to 0 σ (u)e du/ 0 e du. For k ∈ (ιn, (1 − ι)n), the second −2 P 2 summand O(n ), and so the sum converges to zero. To bound edge values, we note that ak ≤ P 2 ( ak) if ak ≥ 0. Using this, we have:

X n−2e2pqn(k/n,t)  X n−1epqn(k/n,t) 2 σ2(k/n) ≤ σ(k/n) (R epqn(u,t)du)2 R epqn(u,t)du k:k∈ /(ιn,(1−ι)n) k:k∈ /(ιn,(1−ι)n) R pq(u)  σ(u)e du2 → (0,ι)∪(1−ι,1) , R epq(u)du where the convergence is in probability, and uniform in t. This bound can be made abitrarily small by letting ι → 0, so we’re done. The end result is that:

82 CHAPTER 5. LOCAL TIME AND RANKED-BASED PORTFOLIOS

R 1 2 pq(u) 1 − p 0 σ (u)e du log Vπ(t)/Vν(t) → t. 2 R 1 pq(u) 0 e du We see that the threshold p = 1 is where the portfolio π switches from outperforming the market (p < 1) to under-performing (p > 1). At p = 1, we have π = ν. For p > 1, the diversity weighted portfolio π invests more heavily in larger stocks than ν. Because the stability condition Θ > 0 imposes the restriction that larger stocks have a negative growth rate, we would expect some the smaller stocks to outperform the larger ones, so this result is not entirely unexpected.

Example 10. For a fixed C2 function w(u) defined on [0, 1], consider the portfolio generated by Pn S(x) = k=1 w(k/n)x(k). This generates a portfolio with weights:

w(k/n)ν(k)(t) π(k)(t) = P , ` w(`/n)ν(`)(t) and the wealth process satisfies:

n 1 X  k k + 1  d log(V (t)/V (0)) = d log S(ν(t)) + w  − w  ν (t)dLk(t). π π 2S(ν(t)) n n (k) k=1 Once again, because of (5.9) and Corollary 2.8.3, we have uniform convergence of S(ν), and so log(S(ν(t))/S(ν(0))) → 0. The summands can be written as:

qn(k/n,t) k k  k + 1 ν(k)(t) k 0 −1 00 −1  e dL (t) w − w dL (t) = − w (k/n)n + kw ko(n ) 1 n n 2S(ν(t)) R qn(u,t) n 2S(ν(t)) 0 e du qn(u,t) 0 −1 00 −1  e = − w (u)n + kw ko(n ) 1 Ln(u, dt), R qn(r,t) 2S(ν(t)) 0 e dr

k k+1 2 −1 where u is chosen to lie in [ n , n ]. Since w is C , the term o(n ) is uniform over all values of u. To show convergence of these terms, we once again divide our interval into three part P [0, ι), [ι, 1 − ι], (1 − ι, 1]. To bound the two outer intervals, we note that since k ν(k) ≤ 1, we know that we can write:

83 CHAPTER 5. LOCAL TIME AND RANKED-BASED PORTFOLIOS

n n X   k  k + 1 ν(k)(t) X ν(k)(t) w − w dLk(t) = (−w0(u) + o(1)) L (u, dt) n n 2S(ν(t)) 2S(ν(t)) n k=(1−ι)n k=(1−ι)n n X kw0k + o(1) ≤ ν (t)L (u, dt) 2S(ν(t)) (k) n k=(1−ι)n n X kw0k + o(1) 1 ≤ L (u, dt). 2S(ν(t)) (n − k) + 1 n k=(1−ι)n

1 The last inequality comes from the bound ν(k) ≤ n−k+1 (e.g., the third highest stock can have no more than 1/3 of the total market cap). Since S(ν) → S in probability, we can conclude that:

n Z t 1 X   k  k + 1 w − w ν (t)dLk(t) 2S(ν(t)) n n (k) 0 k=(1−ι)n n kw0k  X 1 ≤ + o (1) L (u, t). 2S P (n − k + 1) n k=(1−ι)n

But since

−1  ELn(u, t) = 2n µ(1/n) + ... + µ(bnc/n) ∼ 2Θ(u) ≤ C(1 − u), we can bound this sum in expectation by:

n n k X 1 X C(1 − ) L (u, t) ≤ n (n − k + 1) n (n − k + 1) k=(1−ι)n k=(1−ι)n n X ≤ Cn−1 k=(1−ι)n

≤ Cι.

So this can be made small by taking ι small. Hence, by Markov’s Inequality, we can take ι small enough, so that with high probability, this term contributes less that  to the wealth process. The bound for the interval [0, ι) is identical (or could be argued from the fact that for the −1 small stocks, ν(k) = O(n )). The middle interval converges since the integrand and integrator all converge uniformly in probability. Hence, we are left with the result:

84 CHAPTER 5. LOCAL TIME AND RANKED-BASED PORTFOLIOS

R 1 0 q(u) 0 w (u)e Θ(u)du log Vπ(t)/Vν(t) → − t. R 1 q(u) 0 w(u)e du

85 BIBLIOGRAPHY

Bibliography

[1] Adrian D. Banner, Robert Fernholz, and Ioannis Karatzas. Atlas models of equity markets. Ann. Appl. Probab., 15(4):2296–2330, 11 2005.

[2] Adrian D. Banner and Raouf Ghomrasni. Local times of ranked continuous semimartingales. Stochastic Process. Appl., 118(7):1244–1253, 2008.

[3] R. F. Bass and E.´ Pardoux. Uniqueness for diffusions with piecewise constant coefficients. Probab. Theory Related Fields, 76(4):557–572, 1987.

[4] Andrei Nikolaevitch Borodin and Paavo Salminen. Handbook of Brownian motion : facts and formulae. Probability and its applications. Birkhuser, Basel, Boston, Berlin, 2002.

[5] C. Bruggeman and J. Ruf. A one-dimensional diffusion hits points fast. 2015. Preprint. Available at arXiv:1508.03822.

[6] Cameron Bruggeman and Andrey Sarantsev. Weak approximation of reflected diffusions on the positive half-line by solutions of sde. 2015. Preprint. Available at arXiv:1509.01776.

[7] A. Dembo, M. Shkolnikov, S. R. Srinivasa Varadhan, and O. Zeitouni. Large deviations for diffusions interacting through their ranks. ArXiv e-prints, November 2012.

[8] Amir Dembo and Li-Cheng Tsai. Equilibrium fluctuation of the atlas model. 2015. Preprint. Available at arXiv:1503.03581.

[9] Richard M. Dudley. Uniform Central Limit Theorems. Cambridge University Press, 1999.

86 BIBLIOGRAPHY

[10] E. Robert Fernholz, Tomoyuki Ichiba, Ioannis Karatzas, and Vilmos Prokaj. Planar diffusions with rank-based characteristics and perturbed Tanaka equations. Probab. Theory Related Fields, 156(1-2):343–374, 2013.

[11] R. Fernholz, T. Ichiba, and I. Karatzas. A second-order stock market model. ArXiv e-prints, February 2013.

[12] Robert Fernholz. Stochastic Portfolio Theory, volume 48 of Applications of Mathematics (New York). Springer-Verlag, New York, 2002. Stochastic Modelling and Applied Probability.

[13] Robert Fernholz. Self-financing Atlas submodels and stock market structure. 2015. In preper- ation.

[14] Robert Fernholz and Ioannis Karatzas. Relative arbitrage in volatility-stabilized markets. Annals of Finance, 1(2):149–177, 2005.

[15] J¨urgenG¨artner. On the McKean-Vlasov limit for interacting diffusions. Mathematische Nachrichten, 137(1):197–248, 1988.

[16] J. Michael Harrison and R. J. Williams. Brownian models of open queueing networks with homogeneous customer populations. Stochastics, 22(2):77–115, 1987.

[17] U. G. Haussmann and E. Pardoux. Time reversal of diffusions. Ann. Probab., 14(4):1188–1205, 10 1986.

[18] Tomoyuki Ichiba, Ioannis Karatzas, and Mykhaylo Shkolnikov. Strong solutions of stochastic equations with rank-based coefficients. Probab. Theory Related Fields, 156(1-2):229–248, 2013.

[19] Tomoyuki Ichiba, Vassilios Papathanakos, Adrian Banner, Ioannis Karatzas, and Robert Fern- holz. Hybrid atlas models. Ann. Appl. Probab., 21(2):609–644, 04 2011.

[20] Benjamin Jourdain and Julien Reygner. Propagation of chaos for rank-based interacting dif- fusions and long time behaviour of a scalar quasilinear parabolic equation. Stochastic Partial Differential Equations: Analysis and Computations, 1(3):455–506, 2013.

[21] Ioannis Karatzas, Soumik Pal, and Mykhaylo Shkolnikov. Systems of Brownian particles with asymmetric collisions. 2012. Preprint. Available at arXiv:1210.0259v1.

87 BIBLIOGRAPHY

[22] Ioannis Karatzas and Johannes Ruf. Distribution of the time to explosion for one-dimensional diffusions. Probab. Theory Related Fields, 2015.

[23] Ioannis Karatzas and Steven E. Shreve. Brownian motion and stochastic calculus, volume 113 of Graduate Texts in Mathematics. Springer-Verlag, New York, second edition, 1991.

[24] Soumik Pal. Analysis of market weights under volatility-stabilized market models. Ann. Appl. Probab., 21(3):1180–1213, 06 2011.

[25] Radka Pickov´a.Generalized volatility-stabilized processes. Annals of Finance, 10(1):101–125, 2014.

[26] Martin I. Reiman and R. J. Williams. A boundary property of semimartingale reflecting Brownian motions. Probab. Theory Related Fields, 77(1):87–97, 1988.

[27] Andrey Sarantsev. Triple and simultaneous collisions of competing brownian particles. 2015. Preprint. Available at arXiv:1401.6255.

[28] Mykhaylo Shkolnikov. Large systems of diffusions interacting through their ranks. Stochastic Processes and their Applications, 122(4):1730 – 1747, 2012.

[29] C´edricVillani. Optimal transport : old and new. Grundlehren der mathematischen Wis- senschaften. Springer, Berlin, 2009.

88 APPENDIX

Appendix

Lemma A.0.1. If µ(·) is continuous in (1.6), with µ(0), µ(1) 6= 0, then Assumption 1.5.4 implies Assumption 1.5.1 for large enough values of n.

Proof. By the positivity assumption on Θ, we must have that µ(0) > 0 > µ(1). Since µ is continu- ous, we know that µ(r) > 0 for r <  for some . We can also assume that µ(r) < 0 for r > 1 − , and so Assumption1.5.1 must hold for k < n and k > (1 − )n. Next, consider the interval [, 1 − ]. We know that Θ is bounded away from zero on this interval. Next, we fix a k ∈ [n, (1 − )n]. We would like to argue that for large values of n, µ(1/n)+...+µ(k/n) that |Θ(k/n) − n | is small, uniformly over all choices of k. We can upper bound this difference by:

k µ(1/n) + ... + µ(k/n) X Z `/n Θ(k/n) − ≤ (µ(r) − µ(l))dr n `=1 (`−1)/n Z 1 drne

≤ µ(r) − µ( ) dr 0 n since drne/n → r, and µ is bounded, we can invoke dominated convergence theorem to conclude that this sum goes to zero. But then, since Θ(k/n) is bounded away from zero uniformly for all valid choices of k, and (µ(1/n) + ... + µ(k/n))/n goes to Θ(k/n) uniformly, we can conclude that µ(1/n) + ... + µ(k/n) > 0 for all valid k for large enough n as desired.

Lemma A.0.2. Let x1 ≤ ... ≤ xn. be an increasing sequence such that x1 + ... + xn = 0. Let p yk = xk where p is a positive, odd integer. Then there exists a constant 0 < C < ∞ such that for large enough n, we have the following bound:

X X (−µ(k/n))yk ≥ C |yk| =: CA. (A.1) k k

89 APPENDIX

Proof. Now, since µ(1) < 0 < µ(0), and Θ > 0 on (0, 1), we may choose 0 < α < β < 1 R x such that α µ(r)dr > 0 for all x ∈ (α, β), Θ(α) = Θ(β), 2a > µ(x) > a > 0 for x < α, and −2b < µ(x) < −b < 0 for x > β. Next, we divide the sum into three parts, k ≤ αn, k ∈ (αn, βn) and k > βn. That is:

X X X X (−µ(k/n))yk = (−µ(k/n))yk + (−µ(k/n))yk + (−µ(k/n))yk = S1 + S2 + S3. k k≤αn αn≤k≤βn k≥βn

First, consider the middle term S2. With y(r) = ybrnc:

βn Z β −1 X −2 n (−µ(k/n))xk = −µ(r)y(r) + O(n sup |y(r)|) k=αn α r∈[α,β] Z β = −µ(r)y(r) + O(n−1A) α Z β = −Θ(β)y(β) + Θ(α)y(α) + Θ(r)dy(r) + O(n−1A) α Z β ≥ Θ(r)dy(r) + O(n−1A) α ≥ O(n−1A).

R u+n−1 −1 Here, the first equality comes from approximating u µ(r) by n µ(u). The first inequality R β follows from the assumption Θ(α) = Θ(β) and y(β) ≥ y(α), and so α µ(r)dr = 0. The third is integration by parts. And the last comes from the assumption that Θ(r) > Θ(α) for r ∈ (α, β) and so:

Z β Z β Θ(r)dy(r) ≥ Θ(α) dy(r) α α = Θ(α)(y(β) − y(α))

= Θ(β)y(β) − Θ(α)y(α).

We have shown that this middle terms in the sum can’t decrease the total sum by too much (at most O(n−1A)). We now turn our attention to the two outer sums.

First, suppose that xαn ≤ 0 ≤ xβn. Then the first and third sum can then be bounded by:

90 APPENDIX

−1 X X  S1 + S3 − n (−µ(k/n))yk + (−µ(k/n))yk k≤αn k≥βn

−1  X X  ≥n min(a, b) |yk| + |yk| k≤αn k≥βn ≥2 min(a, b) min(α, 1 − β)A.

This last inequality comes from the fact that the average of |yk| over either k ≤ αn or k ≥ βn must be greater than the average over all k. This follows from the fact that |yk| is increasing as either k increases above βn or decreases below αn.

Now, suppose that there is some  < α such that xn > 0. In this case, the sign of −µ(k/n) is no longer equal to the sign of xk for all k ≤ αn.

Suppose that  is chosen so that yn−1 < 0 ≤ yn. Because xk is an increasing sequence, and must sum to zero, we get the bound:

X X X 0 = xk + xk + xk k≤n n≤k≤αn αn≤k X X ≥ − |xk| + |xk| k≤n αn≤k X ≥ − |xk| + (1 − α)n|xαn|. k≤n

This then yields:

X |xk| ≥ (1 − α)nxαn. (A.2) k≤n We can now write:

P X X  k≤n |xk|p (1 − α)nxαn p |y | = |x |p ≥ n ≥ n k k n n k≤n k≤n

1−p p p 1−p p −1 X ≥  (1 − α) n|xαn| ≥  (1 − α) (α − ) |yk| n≤k≤αn

Now, since  < α, we can write this last inequality as:

91 APPENDIX

X X 1 − αp X |y | ≥ α1−p(1 − α)p(α)−1 |y | = |y |. (A.3) k k α k k≤n n≤k≤αn n≤k≤αn The coefficient on the right-hand side of (A.3) increases to infinity as α & 0. Hence, we can pick α sufficiently small to make it be at least 4. We can now bound S1 by:

X X X S1 = (−µ(k/n))yk ≥ a|yk| − 2a|yk| k≤αn k≤n n≤k≤αn X X a X a 4 X ≥ a a|y | − 4−1(2a) |y | = |y | ≥ |y |. k k 2 k 2 5 k k≤n k≤n k≤n k≤αn Here the second inequality used (A.3), and the final one used the elementary inequality that if c > 4d then c > (4/5)(c+d). We have shown that for small enough α, we still have the comparison:

X S1 > c |yk|. k≤αn

This was all that was needed in the simpler case when xαn < 0, so the rest of the argument follows after making an identical argument for S3.

2 2 Lemma A.0.3. If Xn → X weakly, then EX ≤ lim infn EXn.

Proof. By Skorohod’s Representation Theorem, we may assume that Xn → X almost everywhere. The result then follows immediately by Fatou’s Lemma.

Lemma A.0.4. Consider a sequence of processes Xn(t) satisfying X(0) = x and:

n n dX (t) = n1(−∞,0](X (t))dt − gdt + σdW (t)

Then Xn(·) → X(·) almost surely on C([0,T ]), where X(t) satisfies X(0) = x and:

dX(t) = −gdt + σdW (t) + dL(t) where L(t) is the local time of the process X(t) at 0.

Proof. Note: This is joint work with Andrey Sarantsev as in [6]. Much more general results are possible, though the version stated here suffices for this paper, and contains much of the spirit of the more general proofs.

92 APPENDIX

By a simple comparison argument, we can see that X1(t) ≤ X2(t) ≤ ... ≤ X(t), where the first inequalities come from the fact that the drift functions n1(−∞,0] are increasing in n, and the last inequality is from the fact that X(t) dominates any diffusion of the form:

dY (t) = −gdt + b(Y (t))dt + σdW (t) where b is supported only on (−∞, 0]. Hence, Z(t) = lim Xn(T ) exists almost surely, and the limit is almost surely bounded above by X(t). Recall the fact that if an increasing sequence of functions on a compact set converges to a continuous function, then the convergence is actually uniform. With this in hand, we need only show that Z(t) = X(t) since we know that X(t) is almost surely continuous. If we can show that Z(t) ≥ 0 for all t almost surely, then it would be a solution to the Skorohod problem that defines X(t). But this is unique and would hence imply that Z(t) = X(t). To see that Z(t) ≥ 0, we will fix , M > 0, and use scale functions to show that, as n → ∞, the probability of Xn(t) reaching − before M goes to zero. Then letting M go to infinity, we see that the probability of hitting M before time T goes to zero, and hence the probability of hitting − by time T goes to zero, so the limit Z(t) is almost surely non-negative. We make this rigorous below: A scale function of Xn is given by:

 2 2g x σ σ2  2g (e − 1) : x ≥ 0 sn(x) = 2 − 2(n−g) x σ σ2  −2(n−g) (e − 1) : x < 0

From this, we can easily see that the probability Xn reaches level − before level M is given by:

s (M) − s (x) n n → 0. sn(M) − sn(−)

This is because sn(y) is constant in n for y ≥ 0, but goes to −∞ for y < 0. Next, since Z(t) ≤ X(t), we can conclude that the probability X(t) reaches M by time T is bounded by the probability that Z(t) does, which goes to zero as M goes to infinity since Z is continuous, and T < ∞. Following the preceding argument, we are now done.

93