APPROXIMATE EQUILIBRIA IN LARGE GAMES

A DISSERTATION SUBMITTED TO THE DEPARTMENT OF MANAGEMENT SCIENCE AND ENGINEERING AND THE COMMITTEE ON GRADUATE STUDIES OF STANFORD UNIVERSITY IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

Yu Wu August 2012

© 2012 by Yu Wu. All Rights Reserved. Re-distributed by Stanford University under license with the author.

This work is licensed under a Creative Commons Attribution-Noncommercial 3.0 United States License. http://creativecommons.org/licenses/by-nc/3.0/us/

This dissertation is online at: http://purl.stanford.edu/jv163hr1839

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.

Ramesh Johari, Primary Adviser

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.

Nicholas Bambos

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.

Christina Aperjis

Approved for the Stanford University Committee on Graduate Studies. Patricia J. Gumport, Vice Provost Graduate Education

This signature page was generated electronically upon submission of this dissertation in electronic format. An original signed hard copy of the signature page is on file in University Archives.

Abstract

The complexity of studying large games often scales with the size of the system: as the number of players increases, computing an exact Nash equilibrium can quickly become intractable. However, when the number of players approaches infinity and no individual player has a significant impact on the system, we can approximate the system by treating each player as playing not against the other individual players but against a single aggregate of all other players. In this dissertation, we apply this idea to study and approximate Nash equilibria in two large scale games.

In Part I, we consider a model of priced resource sharing that combines both queueing behavior and strategic behavior. We study a priority service model where a single server allocates its capacity to agents in proportion to their payment to the system, and users from different classes act to minimize the sum of their cost for processing delay and payment. As the exact processing time of this system is hard to compute and cannot be characterized in closed form, we introduce the concept of aggregate equilibrium to approximate the exact Nash equilibrium, by assuming each individual player plays against a common aggregate priority that characterizes the large system. We then introduce the notion of heavy traffic equilibrium as an alternative approximation of the Nash equilibrium, derived by considering the asymptotic regime where the system load approaches capacity. We show that both aggregate equilibrium and heavy traffic equilibrium are asymptotically exact in heavy traffic. We present some numerical results for both approximate equilibria, and discuss efficiency and revenue, and in particular provide a bound for the price of anarchy of the heavy traffic equilibrium.

In Part II, we study reputation systems in large scale online marketplaces. We develop a large market model to study reputation mechanisms in online marketplaces. We consider two types of sellers: commitment sellers, who are intrinsically honest but may be unable to accurately describe items because of limited expertise; and strategic sellers, who are driven by a profit maximization motive. We focus on stationary equilibria for this dynamic market, in particular on separating equilibria where strategic sellers are incentivized to describe the items they have for sale truthfully, and characterize the conditions under which such equilibria exist. We then complement our theoretical results with computational analysis and provide insights on the features of markets that may incentivize truthfulness in equilibrium.

Acknowledgements

First, I would like to give my deepest gratitude to my advisor Ramesh Johari for his endless support and constant encouragement in my education at Stanford. His passion, optimism, and attitude towards work and life have inspired and will continue to inspire me. I am very grateful to Christina Aperjis and Loc Bui for their insightful suggestions in collaborating in several areas of this thesis. I would like to thank Nick Bambos, Ashish Goel and Balaji Prabhakar for serving on my dissertation committee and providing great advice on the development of this work. I want to thank all the faculty and staff in the Department of Management Science and Engineering for their generous help, which made my stay at Stanford a very enjoyable one.

I would like to gratefully acknowledge support from a Larry Yung Stanford Graduate Fellowship, and the National Science Foundation under grant CMMI-0948434.

In these five years at Stanford, I benefited tremendously from fruitful discussions with fellow students, especially Krishnamurthy Iyer, Xiangrui Meng and Zizhuo Wang. I enjoyed my life at Stanford very much, which has been filled with joyful memories with my friends here: Ming Chen, Hao Chen, Pei He, Yi Liu, Zhen Qian, Xu Tan and many others. I would also like to express my gratitude to Shanshan Xu, Weipeng Zhang and Yali Zhang for giving me constant emotional support despite the geographical distance between us. I owe a special thank you to Beichen Wang, who embraces me with heartfelt friendship and influences me with his kindness and integrity.

Finally, I would like to thank my parents Qingping Wu and Yuan Wang for their unconditional love, support and encouragement. I owe the most thanks to them.

Contents

Abstract

Acknowledgements

1 Part I: Resource Sharing Game
  1.1 Introduction
  1.2 Resource Sharing Game
    1.2.1 Characterizing Processing Times
    1.2.2 Nash Equilibrium
  1.3 Job Level Game: Aggregate Approximation
    1.3.1 Aggregate Priority
    1.3.2 Aggregate Processing Time
    1.3.3 Aggregate Equilibrium
  1.4 Job Level Game: Heavy Traffic Approximation
    1.4.1 Heavy Traffic Processing Time
    1.4.2 Heavy Traffic Equilibrium
    1.4.3 Sensitivity
    1.4.4 Efficiency
    1.4.5 Revenue
  1.5 Class Level Game
    1.5.1 Aggregate Equilibrium
    1.5.2 Heavy Traffic Equilibrium
  1.6 Numerics
    1.6.1 Approximation Errors: AE vs. HTE
    1.6.2 Approximation Sensitivity
    1.6.3 Price of Anarchy
  1.7 Extensions
    1.7.1 A Multi-server Model
  1.8 Conclusion

2 Part II: Reputation System
  2.1 Introduction
  2.2 Model
    2.2.1 Preliminaries
    2.2.2 Market Dynamics and Stationary Equilibrium
    2.2.3 Separating Equilibrium
  2.3 Existence of Separating Equilibria
  2.4 Approximating Separating Equilibria
  2.5 Computational Analysis
  2.6 Conclusion

Bibliography

List of Tables

1.1 Fairness-efficiency tradeoff on α

List of Figures

1.1 Relative error of AE and HTE with different system loads
1.2 Relative error of HTE under different parameters
1.3 Price of anarchy of HTE under different parameters
2.1 Density functions and their logarithmic ratio
2.2 The function B(r) with different class weights

Chapter 1

Part I: Resource Sharing Game

1.1 Introduction

A range of resource sharing systems, such as computing or communication services, exhibit two distinct characteristics: queueing behavior and strategic behavior. Queueing behavior arises because jobs or flows are served with the limited capacity of system resources. Strategic behavior arises because these jobs or flows are typically generated by self-interested, payoff-maximizing users. Analysis of strategic behavior in queueing systems has a long history, dating to the seminal work of Naor [47]; see the book by Hassin and Haviv [22] for a comprehensive survey. The interaction of queueing and strategic behaviors has become especially important recently, with the rise of paid resource sharing systems such as cloud computing platforms. For example, [2] and [12] discuss systems with multiple service providers, modeled as first-come-first-served queues, that compete in both price and response time for potential buyers.

In this chapter we consider a particular queueing model where a single server is shared among multiple jobs, and the service capacity allocated to each job depends on its priority level. The particular scheduling policy we consider is known in the literature as the discriminatory processor sharing (DPS) policy, introduced by Kleinrock [38]. In the DPS model, the server shares its capacity in proportion to the priority level of all jobs currently in the system. This service allocation rule is a special case of a more general scheduling policy for queueing networks known as proportionally fair resource sharing [34, 42]; such scheduling policies have been studied extensively in the context of networked resource sharing (see [32, 54] and references therein). A survey of the DPS literature can also be found in [3].

We consider a DPS system in steady state, and study a job level game where every individual job is a single strategic user. Each user chooses a payment β, which corresponds to the priority level of that user. The user also incurs a cost proportional to total processing time. The users' goal is to choose priority levels to minimize the sum of expected processing cost and payment. (We also briefly discuss a class level game, where every class is a single user.)

This game is inspired by resource sharing in real services. For example, in Amazon EC2 Spot Instances, a user can bid her own price (priority) and enjoy the service as long as the dynamic benchmark price computed by the system is lower than the bid. The service is terminated either upon completion of the task or when the system price rises above the bid price. In many file hosting websites, users can purchase premium packages which increase upload/download bandwidth and speed, and allow parallel tasks among other benefits. Note that in these services, the resource is shared among all users that currently request service, and a higher payment leads to higher performance.

A central difficulty in the analysis of equilibria arises because exact computation of the steady state processing time of a single job, given the priority choices of other jobs, is not possible in closed form. Since the queueing behavior computation itself involves numerical complexity, equilibrium characterization in closed form for the strategic behavior is essentially impossible. Thus obtaining structural insight into the games is a significant challenge.
To tackle this problem, we propose an approximate approach to equilibrium characterization that is amenable to analysis, computable in closed form, and provably exact in an appropriate asymptotic regime where the load on the system increases, known as the heavy traffic regime [37, 49]. The heavy traffic asymptotic regime is widely used in the analysis of queueing systems and is especially valuable for studying systems with many users. Asymptotics yield two benefits. First, they significantly simplify stochastic analysis. Second, they also simplify our game theoretic analysis. Informally, an important reason is that when the number of users grows large, no single user has a large impact on the whole system; this effect allows us to simplify calculation of equilibria. (Note that this is similar to the "large market" approximation used to justify competitive price equilibria in economics.)

We conclude with a brief survey of related work. Priority pricing problems in queueing systems with disciplines other than DPS have been investigated, such as the design of efficient, incentive compatible pricing for nonpreemptive priority FIFO queues [43, 36]. Besides the service discipline, our work also differs from this work because in our model users choose their own priority levels, while in previous models the service provider fixes the priority levels available. Other studies focus on optimal arrival of users, including the work by Glazer and Hassin [20] and Jain et al. [28]. In our paper, although we briefly discuss the case where arrivals are endogenously generated, in the main discussion we assume that arrival rates are exogenously given. In a recent work, Yolken and Bambos [58] studied a proportional priority game similar to our class level game, but in their model each class of users forms a separate queue. This difference leads to quite different structures and results in the two models.

Turning our attention to DPS specifically, we note that most prior work on DPS considers only analysis of the queueing system without any strategic choice of priorities. Haviv and van der Wal [23] consider a DPS system in which users choose their own priority levels to minimize their costs, but only study a model with one class of users. This is essentially different from the multiple-class game studied in this paper because in the one-class game processing time can be computed in closed form, while that is not the case in the multiple-class game.

To the best of our knowledge, there is no previous work that considers priority pricing in the multiple class DPS system, because expected processing time cannot be characterized in closed form.

The remainder of Part I is organized as follows. In Section 1.2, we describe the queueing game setup. In Sections 1.3 and 1.4, we introduce the notions of aggregate equilibrium and heavy traffic equilibrium for the job level game, and present the results on parameter sensitivity, efficiency, and revenue under the heavy traffic equilibrium. Then we apply similar approximation techniques to class level games in Section 1.5. Numerical results for the approximate equilibria are presented in Section 1.6. Some extensions of the model are discussed in Section 1.7. Substantial portions of the content in this chapter appear in the papers by Wu et al. [56, 57].

1.2 Resource Sharing Game

We consider a queueing game in which $K$ classes of jobs share a single server of unit capacity. Class $i$ ($i = 1, \dots, K$) jobs arrive according to a Poisson process with arrival rate $\lambda_i$ and have i.i.d. exponentially distributed service requirements (measured in units of service, e.g., processing cycles) with mean $1/\mu_i$. We assume for simplicity that $\mu_i = \mu$ for all classes. Let $\lambda = \sum_k \lambda_k$ denote the total arrival rate to the system. Also, let $\rho_i = \lambda_i/\mu$ be the load of class $i$, and define the system load as $\rho = \sum_k \rho_k = \sum_k \lambda_k/\mu$. To ensure stability, we assume $\rho < 1$. It is well known that under this condition, the resulting queueing system is ergodic and possesses a unique steady state distribution [39].

Waiting and being served in the system induces a cost $c_i$ per unit time for users of class $i$. Without loss of generality we assume $c_1 > c_2 > \cdots > c_K$: if two classes $i$ and $j$ have the same cost $c_i = c_j$, then they can be merged into one class with arrival rate $\lambda_i + \lambda_j$.

We assume that the server allocates its capacity according to the discriminatory processor sharing (DPS) policy. Under this policy, each job is associated with a priority level. If there are currently $N$ jobs in the system and job $\ell$ has chosen priority level $\beta_\ell$, then the fraction of service capacity allocated to job $\ell$ is $\beta_\ell / \sum_{m=1}^{N} \beta_m$.

Upon arrival, without observing the state of the system, each job chooses a priority level $\beta > 0$. We consider a family of pricing rules for priority that we refer to as α-fair pricing rules, where $\alpha > 0$. Formally, we assume that if a job chooses priority level $\beta$, then the system manager charges that job a price $\beta^\alpha$. Varying $\alpha$ allows us to study a range of pricing schemes. In particular, as $\alpha \to 0$, jobs face a strongly diminishing marginal cost to higher choices of $\beta$; while as $\alpha \to \infty$, jobs face a strongly increasing marginal cost with higher choices of $\beta$.

The pricing rules we consider are closely related to α-fair allocation rules studied in the networking literature [35, 45]. In an α-fair allocation system, one unit of resource is allocated to $N$ users, whose utility functions are characterized by $\alpha$: $U^{(\alpha)}(x) = x^{1-\alpha}/(1-\alpha)$ if $\alpha \neq 1$, and $U^{(\alpha)}(x) = \log(x)$ if $\alpha = 1$. Users make payments for use of the system. Let $w_\ell$ be the payment of user $\ell$; the payments determine users' weights in the system. Formally, suppose the payment vector of users is $\mathbf{w}$ and the allocation vector is $\mathbf{x}$; then the resource manager solves the following optimization problem:

\[
\max_{\mathbf{x}} \; \sum_{\ell=1}^{N} w_\ell\, U^{(\alpha)}(x_\ell) \quad \text{s.t.} \quad \sum_{\ell=1}^{N} x_\ell \le 1.
\]

The solution of this problem is $x_\ell = w_\ell^{1/\alpha} / \sum_m w_m^{1/\alpha}$. A well-known example of an α-fair allocation rule is the proportionally fair allocation rule, obtained when $\alpha = 1$ [34]: resource is allocated in proportion to payment. Now, suppose that the α-fair pricing rule is used in our model, so that $w_\ell = \beta_\ell^\alpha$. Then the α-fair allocation rule reduces to the discriminatory processor sharing policy described above, i.e., allocation of server capacity in proportion to the priority levels $\beta_\ell$.

In this paper we will generally be interested in scenarios where all jobs of the same class $i$ choose the same priority level. In an abuse of notation we denote by $\beta_i$ the priority level chosen by all class $i$ jobs, and in this case we succinctly denote $(\beta_1, \dots, \beta_K)$ by $\boldsymbol{\beta}$. We refer to $\boldsymbol{\beta}$ as the class priority vector. Let $V(\beta; \boldsymbol{\beta})$ be the expected processing time for a job with priority $\beta$ that arrives to the system in steady state, with the class priority vector given by $\boldsymbol{\beta}$. Observe that with this notation, a class $i$ job with priority level $\beta_i$ has expected processing time $V(\beta_i; \boldsymbol{\beta})$. For convenience we define $W_i(\boldsymbol{\beta}) = V(\beta_i; \boldsymbol{\beta})$. The total cost of a user is $cV(\beta; \boldsymbol{\beta}) + \beta^\alpha$, where $c$ is the user's unit time cost and $\beta$ is its priority level.

We frequently make use of Little's law, which provides a relationship between steady state expected processing times and steady state expected queue lengths [39]. In particular, let $N_i$ denote the steady state number of class $i$ jobs in the system. In a system consisting of $K$ classes $(\lambda_i, \beta_i)$, $i = 1, \dots, K$, Little's law establishes that in steady state, for every class $i$ we have

\[
E[N_i] = \lambda_i W_i(\boldsymbol{\beta}). \tag{1.1}
\]
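The reduction from the α-fair allocation rule to the DPS rule described earlier in this section can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function names and the sample priority levels and exponent are hypothetical and chosen only for illustration. It evaluates the closed-form α-fair solution with payments $w_\ell = \beta_\ell^\alpha$ and compares it with shares proportional to $\beta_\ell$.

```python
def alpha_fair_allocation(weights, alpha):
    """Closed-form alpha-fair solution: x_l = w_l**(1/alpha) / sum_m w_m**(1/alpha)."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    powered = [w ** (1.0 / alpha) for w in weights]
    total = sum(powered)
    return [p / total for p in powered]


def dps_shares(priorities):
    """DPS rule: each job receives capacity in proportion to its priority level."""
    total = sum(priorities)
    return [b / total for b in priorities]


# Hypothetical example values (not taken from the text).
priorities = [1.0, 2.0, 5.0]                  # priority levels beta_l
alpha = 2.5                                   # pricing exponent
payments = [b ** alpha for b in priorities]   # alpha-fair price beta_l**alpha

shares_from_payments = alpha_fair_allocation(payments, alpha)
shares_from_dps = dps_shares(priorities)
```

Under these assumptions the two share vectors agree up to floating point error, which is the sense in which α-fair pricing combined with α-fair allocation yields the DPS policy.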

1.2.1 Characterizing Processing Times

First we characterize the processing times $V(\beta; \boldsymbol{\beta})$ and $W_i(\boldsymbol{\beta})$. For the $K$-class DPS model, Fayolle et al. [19] show that the expected steady state processing time $W_i$ for each class $i$ can be determined by solving a linear system.

Theorem 1. [19] In a $K$-class DPS model with class priority vector $\boldsymbol{\beta}$, $(W_1(\boldsymbol{\beta}), \dots, W_K(\boldsymbol{\beta}))$ is the unique solution of the following system of equations:

\[
\mu W_k(\boldsymbol{\beta}) - \sum_{i=1}^{K} \frac{\lambda_i \beta_i}{\beta_i + \beta_k} \left[ W_k(\boldsymbol{\beta}) + W_i(\boldsymbol{\beta}) \right] = 1, \qquad k = 1, \dots, K. \tag{1.2}
\]
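The linear system (1.2) is straightforward to solve numerically for small $K$. The sketch below is an illustration only, under stated assumptions: the helper names `solve_linear_system` and `dps_processing_times` are hypothetical, and a plain Gaussian elimination stands in for whatever linear solver one would normally use. It assembles the coefficient matrix of (1.2) row by row and solves for $(W_1, \dots, W_K)$.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[r]] for r, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def dps_processing_times(lam, beta, mu):
    """Expected processing times W_1..W_K from the linear system (1.2)."""
    k_classes = len(lam)
    a = [[0.0] * k_classes for _ in range(k_classes)]
    for k in range(k_classes):
        a[k][k] += mu
        for i in range(k_classes):
            coeff = lam[i] * beta[i] / (beta[i] + beta[k])
            a[k][k] -= coeff   # coefficient multiplying W_k
            a[k][i] -= coeff   # coefficient multiplying W_i
    return solve_linear_system(a, [1.0] * k_classes)
```

Two sanity checks follow from (1.2) itself: with equal priorities every class sees the processor sharing value $1/(\mu - \lambda)$, and summing $\lambda_k$ times equation $k$ over $k$ gives $\sum_k \lambda_k W_k = \rho/(1-\rho)$ for any priority vector.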

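On the other hand, computing the processing time $V(\beta; \boldsymbol{\beta})$ for general $\beta$ can be reduced to computing the processing times $W_i(\boldsymbol{\beta})$, $i = 1, \dots, K$, as stated by the following proposition.

Proposition 2. Let $N_i$ be the steady state number of class $i$ jobs in a $K$-class DPS system with class priority vector $\boldsymbol{\beta}$. Then the steady state processing time of a job with priority $\beta$ is

\[
V(\beta; \boldsymbol{\beta}) = U_0(\beta; \boldsymbol{\beta}) + \sum_{i=1}^{K} U_i(\beta; \boldsymbol{\beta})\, E[N_i], \tag{1.3}
\]

where

\[
U_i(\beta; \boldsymbol{\beta}) = \frac{\beta_i}{\beta_i + \beta}\, U_0(\beta; \boldsymbol{\beta}), \quad i = 1, \dots, K; \qquad \text{and} \qquad U_0(\beta; \boldsymbol{\beta}) = \left[ \mu - \sum_{i=1}^{K} \frac{\lambda_i \beta_i}{\beta_i + \beta} \right]^{-1}. \tag{1.4}
\]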

Proof. We follow the same approach as in [23] (see also Theorem 4.8 in [22]) by decomposing the problem into two parts. Fix a job with priority $\beta > 0$, and fix $\boldsymbol{\beta} = (\beta_1, \dots, \beta_K) > 0$, where $\beta_i$ is the priority level of class $i$ jobs. Let $U(\beta; \boldsymbol{\beta}, n_1, \dots, n_K)$ be the expected time in the system if this job has priority $\beta$, and is currently being processed with $n_i$ class $i$ jobs in the system. Then, conditioning on the first transition, we have the following recursion:

\[
\begin{aligned}
U(\beta; \boldsymbol{\beta}, n_1, \dots, n_K) = {} & \frac{1}{\lambda + \mu} + \sum_{i=1}^{K} \frac{\lambda_i}{\lambda + \mu}\, U(\beta; \boldsymbol{\beta}, n_1, \dots, n_i + 1, \dots, n_K) \\
& + \sum_{i=1}^{K} \frac{\mu}{\lambda + \mu} \cdot \frac{n_i \beta_i}{\sum_{k=1}^{K} n_k \beta_k + \beta}\, U(\beta; \boldsymbol{\beta}, n_1, \dots, n_i - 1, \dots, n_K).
\end{aligned} \tag{1.5}
\]

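Following the same argument as in [22], we can show that $U(\beta; \boldsymbol{\beta}, n_1, \dots, n_K)$ is linear in each $n_i$. That is, there are functions $U_i(\beta; \boldsymbol{\beta})$ ($i = 0, \dots, K$) such that:

\[
U(\beta; \boldsymbol{\beta}, n_1, \dots, n_K) = U_0(\beta; \boldsymbol{\beta}) + \sum_{i=1}^{K} U_i(\beta; \boldsymbol{\beta})\, n_i. \tag{1.6}
\]

Substituting (1.6) into (1.5) yields

\[
\left[ \sum_{i=1}^{K} n_i \beta_i + \beta \right] \left[ 1 + \sum_{i=1}^{K} \lambda_i U_i(\beta; \boldsymbol{\beta}) \right] = \beta \mu \left[ U_0(\beta; \boldsymbol{\beta}) + \sum_{i=1}^{K} U_i(\beta; \boldsymbol{\beta})\, n_i \right] + \mu \sum_{i=1}^{K} \beta_i U_i(\beta; \boldsymbol{\beta})\, n_i.
\]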

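Comparing the coefficients of each $n_i$ as well as the constant term, we obtain:

\[
\begin{aligned}
\beta_k \left[ 1 + \sum_{i=1}^{K} \lambda_i U_i(\beta; \boldsymbol{\beta}) \right] &= \mu (\beta + \beta_k)\, U_k(\beta; \boldsymbol{\beta}), \qquad k = 1, \dots, K; \\
1 + \sum_{i=1}^{K} \lambda_i U_i(\beta; \boldsymbol{\beta}) &= \mu\, U_0(\beta; \boldsymbol{\beta}).
\end{aligned}
\]

Hence the solution to (1.5) is

\[
U_i(\beta; \boldsymbol{\beta}) = \frac{\beta_i}{\beta_i + \beta}\, U_0(\beta; \boldsymbol{\beta}), \quad i = 1, \dots, K; \qquad U_0(\beta; \boldsymbol{\beta}) = \left[ \mu - \sum_{i=1}^{K} \frac{\lambda_i \beta_i}{\beta_i + \beta} \right]^{-1}.
\]

The above equation, along with (1.6), gives the expression for $V(\beta; \boldsymbol{\beta})$ in terms of the expected queue lengths.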

The values of $E[N_i]$ can be obtained by applying Little's law to the solution of the system of linear equations (1.2). We conclude, therefore, that solving for $V(\beta; \boldsymbol{\beta})$ in (1.3) can be reduced to computing $W_i(\boldsymbol{\beta})$.

In general, explicitly solving (1.2) requires the inversion of a $K \times K$ matrix with complexity $O(K^3)$, and hence there is no closed form expression for $W_i(\boldsymbol{\beta})$ or $V(\beta; \boldsymbol{\beta})$. Nevertheless, when $K = 1$ or $K = 2$, we are able to solve for $W_i(\boldsymbol{\beta})$ and $V(\beta; \boldsymbol{\beta})$ in closed form. The solution of (1.3) with $K = 1$ was first established in [22, 23] as follows:

\[
V(\beta; \hat{\beta}) = \frac{1}{\mu(1 - \rho)} \cdot \frac{\beta(1 - \rho) + \hat{\beta}}{\hat{\beta}(1 - \rho) + \beta}. \tag{1.7}
\]

When $K = 2$, the solution for $W_i(\boldsymbol{\beta})$ is given by [19], and the solution for $V$ directly follows. Both solutions are lengthy and omitted for brevity.
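As a consistency check on the one-class case, the closed form (1.7) can be compared with the decomposition (1.3) and (1.4) evaluated at $K = 1$, where $W = 1/(\mu - \lambda)$ and hence $E[N] = \lambda/(\mu - \lambda)$ by Little's law. The following is a minimal sketch with hypothetical function names; the parameter values used in any call are illustrative assumptions.

```python
def one_class_processing_time(beta, beta_hat, lam, mu):
    """Closed form (1.7): tagged job of priority beta among a single class (lam, beta_hat)."""
    rho = lam / mu
    numerator = beta * (1.0 - rho) + beta_hat
    denominator = mu * (1.0 - rho) * (beta_hat * (1.0 - rho) + beta)
    return numerator / denominator


def one_class_via_proposition_2(beta, beta_hat, lam, mu):
    """Same quantity from (1.3)-(1.4) with K = 1."""
    u0 = 1.0 / (mu - lam * beta_hat / (beta_hat + beta))
    u1 = beta_hat / (beta_hat + beta) * u0
    expected_jobs = lam / (mu - lam)   # Little's law with W = 1 / (mu - lam)
    return u0 + u1 * expected_jobs
```

The two functions should return the same value for any $\beta, \hat{\beta} > 0$ and $\lambda < \mu$; when $\beta = \hat{\beta}$ both reduce to the processor sharing value $1/(\mu(1 - \rho))$.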

1.2.2 Nash Equilibrium

In this DPS system with jobs generated by strategic users, we consider two types of games: the job level game and the class level game. In the job level game, each job is an individual user, aiming to minimize its expected total cost by choosing its own priority level $\beta$. Although jobs from the same class are allowed to choose different priority levels, because jobs of the same class share the same parameters ex ante, we restrict our attention to symmetric equilibria of the resource sharing game; these are equilibria where jobs from the same class choose the same priority level. Such an equilibrium can be characterized by a class priority vector $(\beta_1, \dots, \beta_K)$.

Definition 1. A job level Nash equilibrium consists of a class priority vector $\boldsymbol{\beta} = (\beta_1, \dots, \beta_K)$ such that

\[
\beta_i = \arg\min_{\beta > 0} \left[ c_i V(\beta; \boldsymbol{\beta}) + \beta^\alpha \right], \qquad \forall\, i = 1, \dots, K. \tag{1.8}
\]

In the class level game, each class is regarded as a single user and chooses a priority level for all of its jobs; therefore the equilibrium is again characterized by a class priority vector.

Definition 2. A class level Nash equilibrium consists of a class priority vector $\boldsymbol{\beta} = (\beta_1, \dots, \beta_K)$ such that

\[
\beta_i = \arg\min_{\beta > 0} \left[ c_i W_i(\beta_1, \dots, \beta_{i-1}, \beta, \beta_{i+1}, \dots, \beta_K) + \beta^\alpha \right], \qquad \forall\, i = 1, \dots, K.
\]

We emphasize that, although jobs from the same class choose the same priority in both the symmetric job level equilibrium and the class level equilibrium, these two equilibria are not identical. The difference is that in the class level game, changing the priority level of a whole class $i$ causes an externality within the class itself, while by contrast, in the job level game, a single job alters its priority level in isolation. We mainly study the job level game, and then discuss how our study can be adapted to the class level game in Section 1.5.

Existence of a Nash equilibrium can be guaranteed when $\alpha \ge 1$, by exploiting convexity of the job cost function in (1.8). When $\alpha < 1$, the payment term $\beta^\alpha$ is strictly concave, so convexity of the objective function is not guaranteed. Although analytically establishing existence of a Nash equilibrium in this regime remains an open question, our numerical computation with best response dynamics converges to a Nash equilibrium even when $\alpha < 1$.

Proposition 3. There exists a Nash equilibrium for the job level game when $\alpha \ge 1$.

Proof. It is straightforward to show that $V(\beta; \boldsymbol{\beta}) \le 1/(\mu(1 - \rho))$: the expected processing time of any job, regardless of priority, cannot be longer than the expected length of a busy period in an M/M/1 queue with arrival rate $\lambda$ and service rate $\mu$ [39]. (This follows because the discriminatory processor sharing policy is work-conserving, so the length of a busy period will be identical to that in an M/M/1 queue.) It then follows that there exists an upper bound $\bar{\beta}$ such that no job ever has $\beta > \bar{\beta}$ as an optimal strategy.

Next we show that $V(\beta; \boldsymbol{\beta})$ is convex in $\beta$. Combining (1.3) and (1.4) gives $V(\beta; \boldsymbol{\beta}) = f(\beta)/g(\beta)$, where

\[
f(\beta) = 1 + \sum_{i=1}^{K} \frac{\lambda_i W_i \beta_i}{\beta_i + \beta} \qquad \text{and} \qquad g(\beta) = \mu - \sum_{i=1}^{K} \frac{\lambda_i \beta_i}{\beta_i + \beta}.
\]

It is easy to check that $f(\beta) > 0$, $f'(\beta) < 0$, $f''(\beta) > 0$; and $g(\beta) > 0$, $g'(\beta) > 0$, $g''(\beta) < 0$; thus

\[
\left( \frac{f(\beta)}{g(\beta)} \right)'' = \frac{f''(\beta) g(\beta) - f(\beta) g''(\beta)}{[g(\beta)]^2} - \frac{2 \left[ f'(\beta) g(\beta) - f(\beta) g'(\beta) \right] g(\beta) g'(\beta)}{[g(\beta)]^4} > 0.
\]

Moreover, $\beta^\alpha$ is convex in $\beta$ since $\alpha \ge 1$, so $c_i V(\beta; \boldsymbol{\beta}) + \beta^\alpha$ is convex in $\beta$. It follows by Rosen's existence theorem [52] that a pure Nash equilibrium exists for the game in this case.

As usual, this existence result is nonconstructive, since it uses a fixed point theorem. In general, given the implicit equations that define the processing times $W_i(\boldsymbol{\beta})$ and $V(\beta; \boldsymbol{\beta})$, there is no closed form characterization of the Nash equilibrium, and no tractable approach for computation is available. Although we could resort to heuristics (e.g., best response dynamics) to approach a Nash equilibrium, each step of such an algorithm requires computing a range of processing times with fixed parameters, and as established above each such computation has complexity $O(K^3)$. Further, there is no theoretical guarantee that such dynamics will converge. (We note, though, that numerical computation suggests that best response dynamics do converge.) Equilibrium computation is therefore not possible in closed form in general; as a result, we are left with essentially no structural insight into the behavior of players in the game.
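To make the best response heuristic mentioned above concrete, the following is a minimal sketch of one possible implementation for the job level game; it is an illustration under stated assumptions, not the procedure used for the numerical results in this chapter. All function names are hypothetical. Each round solves (1.2) for the current class priority vector, evaluates $V(\beta; \boldsymbol{\beta})$ through (1.3) and (1.4), and minimizes each class's cost in (1.8) by golden-section search, which is appropriate when the cost is convex in $\beta$ ($\alpha \ge 1$). The search cap is an assumption of this sketch: from (1.3) and (1.4), $V(\beta; \boldsymbol{\beta})$ is decreasing in $\beta$ and tends to $1/(\mu(1-\rho)^2)$ as $\beta \to 0$, so a payment above $c/(\mu(1-\rho)^2)$ cannot be optimal.

```python
import math


def solve(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [b[r]] for r, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def class_times(lam, beta, mu):
    """W_i from the linear system (1.2)."""
    k = len(lam)
    a = [[0.0] * k for _ in range(k)]
    for row in range(k):
        a[row][row] += mu
        for i in range(k):
            coeff = lam[i] * beta[i] / (beta[i] + beta[row])
            a[row][row] -= coeff
            a[row][i] -= coeff
    return solve(a, [1.0] * k)


def job_time(b, lam, beta, mu, w):
    """V(b; beta) from (1.3)-(1.4), with E[N_i] = lam_i * W_i by Little's law."""
    u0 = 1.0 / (mu - sum(l * bi / (bi + b) for l, bi in zip(lam, beta)))
    backlog = sum(bi / (bi + b) * l * wi for l, bi, wi in zip(lam, beta, w))
    return u0 * (1.0 + backlog)


def best_response(c, lam, beta, mu, alpha, w, iters=80):
    """Golden-section search for the minimizer of c * V(b; beta) + b**alpha."""
    rho = sum(lam) / mu
    # Search cap (assumption of this sketch): V <= 1 / (mu * (1 - rho)**2).
    lo, hi = 1e-9, (c / (mu * (1.0 - rho) ** 2)) ** (1.0 / alpha)
    ratio = (math.sqrt(5.0) - 1.0) / 2.0

    def cost(b):
        return c * job_time(b, lam, beta, mu, w) + b ** alpha

    for _ in range(iters):
        x1 = hi - ratio * (hi - lo)
        x2 = lo + ratio * (hi - lo)
        if cost(x1) < cost(x2):
            hi = x2
        else:
            lo = x1
    return 0.5 * (lo + hi)


def best_response_dynamics(c, lam, mu, alpha, rounds=200, tol=1e-8):
    """Iterate simultaneous best responses; returns (beta, converged_flag)."""
    beta = [1.0] * len(lam)
    for _ in range(rounds):
        w = class_times(lam, beta, mu)
        new = [best_response(ci, lam, beta, mu, alpha, w) for ci in c]
        if max(abs(x - y) for x, y in zip(new, beta)) < tol:
            return new, True
        beta = new
    return beta, False
```

A call such as `best_response_dynamics([3.0, 1.0], [0.3, 0.4], 1.0, 2.0)` (illustrative values) returns the last iterate together with a flag indicating whether successive iterates agreed to within `tol`. Since convergence is not guaranteed in theory, the flag should be checked rather than assumed.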

1.3 Job Level Game: Aggregate Approximation

A central difficulty in computing Nash equilibria lies in the intractability of computing the processing time $V(\beta; \boldsymbol{\beta})$. In the next two sections, we consider an alternate approach to the equilibrium analysis, by approximating the processing time. If we can find a closed form approximation of $V(\beta; \boldsymbol{\beta})$, we may also approximate the Nash equilibrium in closed form and study the equilibrium analytically.

Our first approximation is inspired by the closed form expression (1.7). Although $V(\beta; \boldsymbol{\beta})$ does not admit an explicit form in general, it has a simple expression if there is only one class of jobs in the system (i.e., $K = 1$). Note that in the DPS system, the service rate of any individual job depends on the priorities of other jobs only through the sum of those priorities; therefore the idea of aggregating the priorities of the whole system into a single class is promising.

1.3.1 Aggregate Priority

Formally, suppose the system consists of $K$ classes $(\lambda_i, \beta_i)$, $i = 1, \dots, K$, with $\boldsymbol{\beta} > 0$. We seek an aggregate priority $\hat{\beta}$ such that the $K$ classes of jobs in the system are approximately equivalent to a single class with arrival rate $\lambda = \sum_{i=1}^{K} \lambda_i$ and priority $\hat{\beta}$. To find a suitable value of $\hat{\beta}$, consider two parallel systems for an additional job with priority $\beta$. In the original DPS system with $K$ classes $(\lambda_i, \beta_i)$, $i = 1, \dots, K$, the service capacity our chosen job enjoys at time $t$ is $\beta / (\sum_{i=1}^{K} N_i(t) \beta_i + \beta)$ if it is in the system at time $t$, and 0 otherwise. We then consider a second system where we imagine that there is only one class $(\lambda, \hat{\beta})$. If we denote by the random variable $\hat{N}(t)$ the number of other jobs at time $t$ in this system, the service capacity allocated to our chosen job is $\beta / (\hat{N}(t) \hat{\beta} + \beta)$ if this job is in the system at time $t$, and 0 otherwise.

We hope these two systems are identical from the point of view of our single job, which would be ensured if the service allocated to each job, as well as the number of jobs in the two systems, were the same. In other words, we require the two systems to satisfy $\hat{N}(t) = \sum_{i=1}^{K} N_i(t)$ almost surely, and

\[
\sum_{i=1}^{K} N_i(t) \beta_i = \hat{N}(t) \hat{\beta} = \sum_{i=1}^{K} N_i(t) \hat{\beta} \quad \text{a.s.} \tag{1.9}
\]

However, this is impossible on a sample path basis, since $\hat{\beta}$ cannot be equal to every $\beta_i$ (except in the degenerate case where all $\beta_i$'s are equal). Nevertheless, we can develop a choice for $\hat{\beta}$ by taking expectations on both sides of (1.9). Applying Little's law (1.1), and replacing the class level expected processing time $W_i(\boldsymbol{\beta})$ by the equivalent job level expected processing time, we have:

\[
\hat{\beta} = \frac{E\left[ \sum_{i=1}^{K} N_i \beta_i \right]}{E\left[ \sum_{i=1}^{K} N_i \right]} = \frac{\sum_{i=1}^{K} \lambda_i W_i(\boldsymbol{\beta}) \beta_i}{\sum_{i=1}^{K} \lambda_i W_i(\boldsymbol{\beta})} = \frac{\sum_{i=1}^{K} \lambda_i V(\beta_i; \boldsymbol{\beta}) \beta_i}{\sum_{i=1}^{K} \lambda_i V(\beta_i; \boldsymbol{\beta})}.
\]

The preceding approximation suggests a reasonable approach to choosing the aggregate priority $\hat{\beta}$; however, it still requires computation of the full $K$-class expected job level processing time. But recall that our intention is that the system with all $K$ classes $(\lambda_i, \beta_i)$, $i = 1, \dots, K$, should be well approximated by a system with a single class $(\lambda, \hat{\beta})$. Therefore we hope that $V(\beta_i; \boldsymbol{\beta}) \approx V(\beta_i; \hat{\beta})$, where the latter is the job level processing time in a system with only one class. Using this approximation, we arrive at the following definition.

Definition 3. $\hat{\beta}$ is the aggregate priority of the system with $K$ classes $(\lambda_i, \beta_i)$, $i = 1, \dots, K$, if

\[
\hat{\beta} = \frac{\sum_{i=1}^{K} \lambda_i V(\beta_i; \hat{\beta}) \beta_i}{\sum_{i=1}^{K} \lambda_i V(\beta_i; \hat{\beta})}. \tag{1.10}
\]

Lemma 4. For any system with $\boldsymbol{\beta} > 0$, the aggregate priority exists and is unique.

Proof. Define

\[
f_{\boldsymbol{\beta}}(x) = \sum_{i=1}^{K} (x - \beta_i) \lambda_i V(\beta_i; x).
\]

It follows that any $\hat{\beta}$ satisfying (1.10) is a solution to $f_{\boldsymbol{\beta}}(\hat{\beta}) = 0$. It is straightforward to check by differentiating that for any $\boldsymbol{\beta} > 0$, $f_{\boldsymbol{\beta}}$ is a convex function, with $f_{\boldsymbol{\beta}}(0) < 0$ and $f_{\boldsymbol{\beta}}(x) \to \infty$ as $x \to \infty$. Hence for any $\boldsymbol{\beta} > 0$, there exists a unique solution to $f_{\boldsymbol{\beta}}(\hat{\beta}) = 0$ in $(0, \infty)$, and that unique solution is the aggregate priority $\hat{\beta}$ we are looking for.

We note that the aggregate priority $\hat{\beta}$ is a function of the system priorities $\boldsymbol{\beta}$. Although there is no general closed form expression for $\hat{\beta}$, the above proof shows that it can be computed numerically by a simple bisection method.
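The bisection computation of the aggregate priority can be sketched as follows. This is a minimal illustration with hypothetical function names, not the implementation used for the numerical results. It evaluates $f_{\boldsymbol{\beta}}(x) = \sum_i (x - \beta_i) \lambda_i V(\beta_i; x)$ using the one-class formula (1.7) with the total arrival rate $\lambda$, and brackets the root by $[\min_i \beta_i, \max_i \beta_i]$. That bracket is an assumption of this sketch; it is valid because $V > 0$ makes every term of $f_{\boldsymbol{\beta}}$ nonpositive at $\min_i \beta_i$ and nonnegative at $\max_i \beta_i$.

```python
def one_class_time(beta, x, lam, mu):
    """Closed form (1.7): tagged priority beta against a single class (lam, x)."""
    rho = lam / mu
    return (beta * (1.0 - rho) + x) / (mu * (1.0 - rho) * (x * (1.0 - rho) + beta))


def aggregate_priority(lam, beta, mu, tol=1e-12):
    """Bisection on f(x) = sum_i (x - beta_i) * lam_i * V(beta_i; x), as in (1.10)."""
    total = sum(lam)   # arrival rate of the single aggregated class

    def f(x):
        return sum((x - b) * l * one_class_time(b, x, total, mu)
                   for l, b in zip(lam, beta))

    lo, hi = min(beta), max(beta)   # f(lo) <= 0 <= f(hi) because V > 0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

When all classes share the same priority the bracket collapses and the function returns that common value, matching the degenerate case noted after (1.9).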

1.3.2 Aggregate Processing Time

With the aggregate priority and equation (1.7), we define the aggregate processing time as follows.

Definition 4. The aggregate processing time for a job with priority level $\beta > 0$ in the system with $K$ classes $(\lambda_i, \beta_i)$, $i = 1, \dots, K$, with $\boldsymbol{\beta} > 0$, is defined as

\[
V^{AG}(\beta; \boldsymbol{\beta}) = V(\beta; \hat{\beta}(\boldsymbol{\beta})), \tag{1.11}
\]

where $\hat{\beta}(\boldsymbol{\beta})$ is the aggregate priority of the system, and $V$ is the one-job-one-class processing time given by (1.7).

Once again, $\hat{\beta}(\boldsymbol{\beta})$ depends on $\rho$, but we do not parameterize this dependence for simplicity.

Based on the preceding heuristic derivation, our conjecture is that $V^{AG}$ is a potentially good proxy for the true processing time. In the remainder of this subsection, we verify that our calculation is correct in an asymptotic regime when the system is heavily loaded. The following theorem states that the (scaled) aggregate processing time converges to the (scaled) exact processing time in heavy traffic.

Theorem 5. Let $\boldsymbol{\beta} > 0$. As $\rho \to 1$,

\[
(1 - \rho) \left[ V^{AG}(\beta; \boldsymbol{\beta}) - V(\beta; \boldsymbol{\beta}) \right] \to 0.
\]

To show this theorem, we first study the heavy traffic regime. That is, the system arrival rate $\lambda$ approaches the total service capacity $\mu$, or equivalently, the system load $\rho$ approaches 1. In heavy traffic, a phenomenon known as state space collapse gives us a simplified solution for the steady state distribution of the system [49]; informally, state space collapse refers to the fact that the numbers of jobs of each class in the system become perfectly correlated when the system is heavily loaded.

In a slight abuse of notation, whenever we write $\rho \to 1$, we mean that we consider a sequence of systems such that $(\rho_1, \dots, \rho_K)$ converges to some $(\bar{\rho}_1, \dots, \bar{\rho}_K)$ with $\sum_{i=1}^{K} \bar{\rho}_i = 1$. Moreover, we emphasize that both $V(\beta; \boldsymbol{\beta})$ and $W_i(\boldsymbol{\beta})$ depend on $\rho$, though we suppress this dependence for notational brevity. Let $N_i$ denote the steady state number of class $i$ jobs in the system. Then we have the following result on the joint steady state distribution of $(N_1, \dots, N_K)$ for a DPS system in heavy traffic.

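Theorem 6. [48] Let $N_i$ be the steady state number of class $i$ jobs in a $K$-class DPS system with class priority vector $\boldsymbol{\beta}$. Then as $\rho \to 1$, we have

$$(1 - \rho)(N_1, \ldots, N_K) \xrightarrow{d} Z \cdot \left(\frac{\bar{\rho}_1}{\beta_1}, \ldots, \frac{\bar{\rho}_K}{\beta_K}\right),$$

where "$\xrightarrow{d}$" denotes convergence in distribution, and $Z$ is an exponentially distributed random variable with mean $1/\gamma(\boldsymbol{\beta})$, where $\gamma(\boldsymbol{\beta}) = \sum_{i=1}^{K} \bar{\rho}_i/\beta_i$.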

With this state space collapse result, we are able to approximate V (β; β) in heavy traffic.

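Lemma 7. In a $K$-class DPS system with class priority vector $\boldsymbol{\beta}$,

$$\lim_{\rho \to 1} (1 - \rho)\,V(\beta; \boldsymbol{\beta}) = \frac{1}{\lambda \gamma \beta},$$

where $\gamma = \sum_{i=1}^{K} \bar{\rho}_i/\beta_i$ and $\bar{\rho}_i = \rho_i/\rho$ does not change as $\rho \to 1$.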

Proof. Convergence of the joint distribution implies convergence of marginal distributions, so $(1 - \rho)N_i \xrightarrow{d} Z\bar{\rho}_i/\beta_i$ for each $i$. Moreover, the second moment of $(1 - \rho)N_i$ is shown to be uniformly bounded [48], so the scaled variables $(1 - \rho)N_i$ are uniformly integrable. It follows from [9] that in this case convergence in distribution implies convergence in the mean, and hence,

$$\lim_{\rho \to 1} (1 - \rho)\,E[N_i] = \frac{\bar{\rho}_i}{\beta_i}\,E[Z] = \frac{\bar{\rho}_i}{\beta_i \gamma}. \tag{1.12}$$

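Taking advantage of this approximation of $E[N_i]$, we are now able to approximate $V(\beta; \boldsymbol{\beta})$. Note that

$$\lim_{\rho \to 1} U_0(\beta; \boldsymbol{\beta}) = \left[\lambda\left(1 - \sum_{i=1}^{K} \frac{\bar{\rho}_i \beta_i}{\beta_i + \beta}\right)\right]^{-1} = \left(\lambda\beta \sum_{i=1}^{K} \frac{\bar{\rho}_i}{\beta_i + \beta}\right)^{-1}$$

is finite, so substituting (1.4) and (1.12) into (1.3) yields

$$\begin{aligned}
\lim_{\rho \to 1} (1 - \rho)\,V(\beta; \boldsymbol{\beta})
&= \lim_{\rho \to 1} U_0(\beta; \boldsymbol{\beta}) \left[\sum_{i=1}^{K} \frac{\rho_i}{\gamma(\boldsymbol{\beta})(\beta_i + \beta)} + (1 - \rho)\right] \\
&= \lim_{\rho \to 1} \frac{(1 - \rho)\gamma(\boldsymbol{\beta}) \prod_{i=1}^{K} (\beta_i + \beta) + \sum_{i=1}^{K} \rho_i \prod_{j \neq i} (\beta_j + \beta)}{\mu\gamma(\boldsymbol{\beta})\left[(1 - \rho) \prod_{i=1}^{K} (\beta_i + \beta) + \beta \sum_{i=1}^{K} \rho_i \prod_{j \neq i} (\beta_j + \beta)\right]} \\
&= \left(\lim_{\rho \to 1} U_0(\beta; \boldsymbol{\beta})\right) \left(\sum_{i=1}^{K} \frac{\beta_i}{\beta_i + \beta} \lim_{\rho \to 1} (1 - \rho)\,E[N_i]\right) \\
&= \frac{1}{\lambda \gamma \beta}.
\end{aligned}$$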

Now we prove Theorem 5.

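Proof. Let $\hat{\beta} = \hat{\beta}(\boldsymbol{\beta})$; then $\hat{\beta}$ is the unique solution of the following equation:

$$\hat{\beta} = \frac{\sum_{i=1}^{K} \lambda_i V(\beta_i; \hat{\beta})\,\beta_i}{\sum_{i=1}^{K} \lambda_i V(\beta_i; \hat{\beta})}. \tag{1.13}$$

From (1.7) we have

$$\frac{\hat{\beta}}{\hat{\beta}(1 - \rho) + \beta} \le \mu(1 - \rho)\,V(\beta; \hat{\beta}) = \frac{\beta(1 - \rho) + \hat{\beta}}{\hat{\beta}(1 - \rho) + \beta} \le (1 - \rho) + \frac{\hat{\beta}}{\beta}. \tag{1.14}$$

It is clear from the definition that $\hat{\beta}/\beta_i \le \max_j \beta_j / \min_j \beta_j \triangleq M$ for all $i$. Then it follows from (1.14) that

$$\lim_{\rho \to 1} \frac{\mu(1 - \rho)\,V(\beta_i; \hat{\beta})\,\beta_i}{\hat{\beta}} = 1, \quad \forall\, i = 1, \ldots, K.$$

Substituting this limit into (1.13), we immediately get

$$\lim_{\rho \to 1} \hat{\beta} = \frac{\lambda}{\sum_{i=1}^{K} \lambda_i/\beta_i} = \frac{1}{\gamma}. \tag{1.15}$$

Since $\hat{\beta}$ has a finite limit when $\rho \to 1$, (1.14) implies that

$$\lim_{\rho \to 1} \frac{\lambda(1 - \rho)\,V(\beta; \hat{\beta})\,\beta}{\hat{\beta}} = 1, \quad \forall\, \beta > 0. \tag{1.16}$$

Then (1.15) and (1.16) yield

$$\lim_{\rho \to 1} (1 - \rho)\,V(\beta; \hat{\beta}) = \frac{1}{\lambda \gamma \beta}.$$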

It follows from Lemma 7 that the proof is complete.

Therefore, the aggregate processing time is a good approximation for the exact processing time in heavy traffic.
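The limit invoked twice in the proof of Theorem 5 can be spelled out as a squeeze. The following worked step is a sketch that uses only (1.14) and the bound $M$ from that proof. Evaluating (1.14) at $\beta = \beta_i$ and multiplying through by $\beta_i/\hat{\beta}$ gives

$$\frac{\beta_i}{\hat{\beta}(1 - \rho) + \beta_i} \le \frac{\mu(1 - \rho)\,V(\beta_i; \hat{\beta})\,\beta_i}{\hat{\beta}} \le 1 + (1 - \rho)\frac{\beta_i}{\hat{\beta}}.$$

Because $\hat{\beta}$ is a weighted average of the $\beta_j$, both $\hat{\beta}/\beta_i$ and $\beta_i/\hat{\beta}$ are at most $M$. The left side is therefore at least $1/(1 + M(1 - \rho))$ and the right side is at most $1 + M(1 - \rho)$, and both bounds tend to 1 as $\rho \to 1$.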

1.3.3 Aggregate Equilibrium

Recall from Definition 1 that $\boldsymbol{\beta} = (\beta_1, \ldots, \beta_K)$ is a job-level Nash equilibrium if (1.8) holds, i.e.,

$$\beta_i = \arg\min_{\beta > 0}\; c_i V(\beta; \boldsymbol{\beta}) + \beta^{\alpha}, \quad i = 1, \ldots, K.$$

For general $K$ it is quite hard to solve for the preceding equilibrium because: i) computing $V(\beta; \boldsymbol{\beta})$ requires matrix inversion to solve the linear system (1.2), which can only be done numerically; and ii) even if we are able to solve $V(\beta; \boldsymbol{\beta})$ numerically and obtain optimality conditions for each player (which cannot be done in closed form), we would still need to solve a possibly nonlinear system with $K$ equations and $K$ unknowns to compute the Nash equilibrium.

In this section, we propose a novel concept of equilibrium which can be used to approximate the Nash equilibrium, yet can be computed in closed form when α = 1. We approximate $V(\beta; \boldsymbol{\beta})$ by $V^{AG}(\beta; \boldsymbol{\beta})$ in the objective function, and based on this approximation we define a concept of equilibrium that we call aggregate equilibrium (AE), as follows.

Definition 5. An aggregate equilibrium of the game consists of a set of priorities $\boldsymbol{\beta} = (\beta_1, \ldots, \beta_K)$ such that

$$\beta_i = \arg\min_{\beta > 0}\; c_i V^{AG}(\beta; \boldsymbol{\beta}) + \beta^{\alpha}, \quad i = 1, \ldots, K,$$

where $V^{AG}$ is the aggregate processing time given in Definition 4.

We have the following theorem on the existence, uniqueness, and computation of this equilibrium when α = 1.

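Theorem 8. If α = 1, then for any cost profile such that $c_K/c_1 \ge (1 - \rho)^2/(2 - \rho)^2$, the aggregate equilibrium exists, is unique, and can be computed in the following closed form:

$$\hat{\beta} = \left[\frac{(2\rho - 1)\sqrt{S_3} + \left[(2\rho - 1)^2 S_3 + 4 S_1 S_2 (2 - \rho)\right]^{1/2}}{2 S_1 (2 - \rho)\sqrt{\mu}}\right]^2, \tag{1.17}$$

$$\hat{\beta}_i = \left(\frac{\hat{\beta}\rho(2 - \rho)c_i}{\mu(1 - \rho)}\right)^{1/2} - (1 - \rho)\hat{\beta}, \tag{1.18}$$

where

$$S_1 = \sum_{i=1}^{K} \frac{\rho_i}{\sqrt{c_i}}, \qquad S_2 = \sum_{i=1}^{K} \rho_i \sqrt{c_i}, \qquad S_3 = \frac{\rho(2 - \rho)}{1 - \rho}.$$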

Proof. By the optimality condition, we can express each optimal $\beta_i$ as a function of only one parameter, $\hat{\beta}$, then use the consistency condition to explicitly solve for $\hat{\beta}$, and thus obtain each aggregate equilibrium $\beta_i$.

Differentiating the objective function with respect to β, we find for a class $i$ job that if $\hat{\beta}_i > 0$, then

$$c_i \hat{\beta} \rho (2 - \rho) = \mu(1 - \rho)\left[\hat{\beta}_i + (1 - \rho)\hat{\beta}\right]^2.$$

The unique F.O.C. solution is:

$$\hat{\beta}_i = \left(\frac{\hat{\beta}\rho(2 - \rho)c_i}{\mu(1 - \rho)}\right)^{1/2} - (1 - \rho)\hat{\beta}.$$

The second derivative of the objective function is positive at $\hat{\beta}_i$. Therefore, when $c_i > c^* = \mu(1 - \rho)^3 \hat{\beta}/(\rho(2 - \rho))$, there is a unique optimal priority for class $i$ jobs given by (1.18), and when $0 < c_i \le c^*$ there is no optimal priority for class $i$ jobs. In fact, in the case $0 < c_i \le c^*$, if $\hat{\beta}_i > 0$, any job from class $i$ has an incentive to deviate and choose a lower priority; yet they cannot reach β = 0, because zero priority can never be optimal for a whole class.

Suppose $c_i > c^*$ for all $i$. Then the expected processing time for a class $i$ job is:

$$\hat{V}(\hat{\beta}_i, \hat{\beta}) = \frac{1}{\mu}\left[1 + \sqrt{\frac{\mu\hat{\beta}(2 - \rho)\rho}{c_i(1 - \rho)}}\right]. \tag{1.19}$$

Thus it follows from the consistency condition

$$\hat{\beta} = \frac{\sum_{i=1}^{K} \lambda_i \hat{V}(\hat{\beta}_i, \hat{\beta})\,\hat{\beta}_i}{\sum_{i=1}^{K} \lambda_i \hat{V}(\hat{\beta}_i, \hat{\beta})}$$

that

$$\sum_{i=1}^{K} \rho_i \left[1 + \sqrt{\frac{\mu\hat{\beta}(2 - \rho)\rho}{c_i(1 - \rho)}}\right]\hat{\beta} = \sum_{i=1}^{K} \rho_i \left[1 + \sqrt{\frac{\mu\hat{\beta}(2 - \rho)\rho}{c_i(1 - \rho)}}\right]\left[\sqrt{\frac{c_i\hat{\beta}\rho(2 - \rho)}{\mu(1 - \rho)}} - (1 - \rho)\hat{\beta}\right].$$

This can be reduced to a quadratic equation in $x = \hat{\beta}^{1/2}$:

$$S_1(2 - \rho)(S_3\mu)^{1/2}x^2 - (2\rho - 1)S_3 x - S_2(S_3/\mu)^{1/2} = 0,$$

where

$$S_1 = \sum_{i=1}^{K} \frac{\rho_i}{\sqrt{c_i}}, \qquad S_2 = \sum_{i=1}^{K} \rho_i \sqrt{c_i}, \qquad S_3 = \frac{\rho(2 - \rho)}{1 - \rho}.$$

There is a unique positive solution for $\hat{\beta} = x^2$, which is given by (1.17).

Note that $\hat{\beta}$ is in turn dependent on the $c_i$'s. Thus, to confirm that we have identified the equilibrium, we must verify that this solution satisfies $c_i > c^* = \mu(1 - \rho)^3\hat{\beta}/(\rho(2 - \rho))$. We note that $S_1 \ge \rho/\sqrt{c_1}$, $S_2 \le \rho\sqrt{c_1}$, and

$$\begin{aligned}
\sqrt{c^*} &= \frac{1 - \rho}{2 - \rho} \cdot \frac{(2\rho - 1) + \left[(2\rho - 1)^2 + 4 S_1 S_2 (1 - \rho)/\rho\right]^{1/2}}{2 S_1} \\
&= \frac{1 - \rho}{2(2 - \rho)}\left[\frac{2\rho - 1}{S_1} + \left[\left(\frac{2\rho - 1}{S_1}\right)^2 + \frac{4 S_2 (1 - \rho)}{S_1 \rho}\right]^{1/2}\right] \\
&\le \frac{1 - \rho}{2(2 - \rho)}\left[\frac{2\rho - 1}{\rho} + \left[\left(\frac{2\rho - 1}{\rho}\right)^2 + \frac{4(1 - \rho)}{\rho}\right]^{1/2}\right]\sqrt{c_{\max}} \\
&= \frac{1 - \rho}{2 - \rho}\sqrt{c_1} \le \sqrt{c_K}.
\end{aligned}$$

The result then follows.
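The closed form in Theorem 8 can be evaluated directly. The sketch below is illustrative only: the function names are hypothetical, $\rho_i = \lambda_i/\mu$ is assumed, and the helper `one_class_time` uses the form of V that appears in (1.14). The helper exists only so that the consistency condition can be checked numerically.

```python
import math


def one_class_time(beta, beta_hat, rho, mu):
    """V(beta; beta_hat) in the form used in (1.14) (assumed to match (1.7))."""
    return (beta * (1 - rho) + beta_hat) / (
        mu * (1 - rho) * (beta_hat * (1 - rho) + beta))


def aggregate_equilibrium(lams, costs, mu):
    """Aggregate equilibrium for alpha = 1 via (1.17) and (1.18)."""
    rhos = [l / mu for l in lams]
    rho = sum(rhos)
    # Existence condition of Theorem 8 on the cost profile.
    if min(costs) / max(costs) < ((1 - rho) / (2 - rho)) ** 2:
        raise ValueError("cost profile violates the condition of Theorem 8")
    s1 = sum(r / math.sqrt(c) for r, c in zip(rhos, costs))
    s2 = sum(r * math.sqrt(c) for r, c in zip(rhos, costs))
    s3 = rho * (2 - rho) / (1 - rho)
    disc = math.sqrt((2 * rho - 1) ** 2 * s3 + 4 * s1 * s2 * (2 - rho))
    x = ((2 * rho - 1) * math.sqrt(s3) + disc) / (
        2 * s1 * (2 - rho) * math.sqrt(mu))
    beta_hat = x * x                                   # (1.17)
    betas = [math.sqrt(beta_hat * rho * (2 - rho) * c / (mu * (1 - rho)))
             - (1 - rho) * beta_hat for c in costs]    # (1.18)
    return beta_hat, betas
```

As a sanity check, plugging the returned priorities back into the weighted average on the right-hand side of the consistency condition should reproduce `beta_hat` up to rounding.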

Theorem 8 immediately leads to the two following corollaries.

Corollary 9. If α = 1, given a cost profile, there exists a load threshold ρ0 < 1 such that for all ρ > ρ0, the aggregate equilibrium exists, is unique, and can be computed as in Theorem 8.

The preceding corollary shows that as the system approaches heavy traffic, the aggregate equilibrium is eventually guaranteed to exist.

Corollary 10. If α = 1, for any cost profile such that cK /c1 ≥ 1/4, aggregate equilibrium exists, is unique, and can be computed as in Theorem 8.
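Both corollaries can be read off the condition $c_K/c_1 \ge (1 - \rho)^2/(2 - \rho)^2$ of Theorem 8. As a worked step, write $r = \sqrt{c_K/c_1} \in (0, 1]$ (a shorthand introduced here for illustration only). For $r < 1$ the condition is equivalent to

$$\frac{1 - \rho}{2 - \rho} \le r \iff \rho \ge \frac{1 - 2r}{1 - r},$$

so one admissible threshold in Corollary 9 is $\rho_0 = \max\{0, (1 - 2r)/(1 - r)\} < 1$. Moreover, $(1 - \rho)/(2 - \rho)$ is decreasing in ρ on $[0, 1)$ and equals $1/2$ at ρ = 0, so

$$\frac{(1 - \rho)^2}{(2 - \rho)^2} \le \frac{1}{4} \quad \text{for all } \rho \in [0, 1),$$

which is why $c_K/c_1 \ge 1/4$ suffices at every load.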

We note that although the aggregate priority βˆ still exists and is unique, it does not admit a closed form in general. Moreover, we are unable to solve for the aggregate equilibrium in closed form when α ≠ 1. In the next section, we discuss another approximation method that further simplifies the approximation and gains significant tractability.

1.4 Job Level Game: Heavy Traffic Approximation

In the previous section, we defined aggregate equilibrium as an approximation of the exact Nash equilibrium. We showed that it is asymptotically exact in the heavy traffic regime, but admits a closed form expression only when α = 1. In this section, we focus only on heavy traffic, and consider an even simpler approximation in this regime. Such an approximation is relevant for large systems such as cloud computing services, where providers will typically not want to provision significant excesses of capacity (see footnote 1).

1.4.1 Heavy Traffic Processing Time

If we only focus on heavy traffic, then Lemma 7 suggests we can approximate processing time by a simple expression.

Definition 6. The heavy traffic processing time for a job with priority level β in a system with K classes with class priority vector β is defined as

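$$V^{HT}(\beta; \boldsymbol{\beta}) = \frac{1}{1 - \rho} \cdot \frac{1}{\mu\beta\gamma(\boldsymbol{\beta})}, \quad \text{where } \gamma(\boldsymbol{\beta}) = \frac{1}{\rho}\sum_{i=1}^{K} \frac{\rho_i}{\beta_i}.$$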

In fact, 1/γ(β) works as the aggregate priority in this system. Unlike the general aggregate priority, it has a closed form because of the state space collapse in heavy traffic. We note that V HT (β; β) has a closed form, and is easy to compute. Moreover, it is asymptotically exact in the heavy traffic regime, which directly follows from Lemma 7.

Theorem 11. Let β > 0. As ρ → 1,

$$(1 - \rho)\,[V^{HT}(\beta; \boldsymbol{\beta}) - V(\beta; \boldsymbol{\beta})] \to 0.$$

Footnote 1: One such justification for heavy traffic capacity provisioning comes from Nair et al. [46], who study optimal capacity provisioning for online service providers. They find that as the market size becomes large, heavy traffic emerges as a consequence of a profit maximizing strategy for the service provider, with exact scaling depending on the strength of positive externalities among users.

We note here that one reason we consider the case where µi = µ for all i is that in the absence of this assumption, a similar result to Proposition 2 becomes more challenging (in particular, because (1.5) is no longer tractable). However, an appropriate generalization of Theorem 6 holds even for heterogeneous µi, and based on this fact we conjecture that the analysis of this paper can be carried out even with heterogeneity of µi. We leave this for future work.

1.4.2 Heavy Traffic Equilibrium

As with the aggregate approximation, we can approximate $V(\beta; \boldsymbol{\beta})$ by $V^{HT}(\beta; \boldsymbol{\beta})$ in the objective function, and based on this approximation we define a concept of equilibrium that we call heavy traffic equilibrium (HTE).

Definition 7. A heavy traffic equilibrium of the game consists of a set of priorities $\boldsymbol{\beta} = (\beta_1, \ldots, \beta_K)$ such that

$$\beta_i = \arg\min_{\beta > 0}\; c_i V^{HT}(\beta; \boldsymbol{\beta}) + \beta^{\alpha}, \quad i = 1, \ldots, K.$$

At this point, we pause to clearly distinguish between the Nash equilibrium, the heavy-traffic equilibrium, and the aggregate equilibrium. There are two major differences between the Nash equilibrium and the aggregate/heavy traffic equilibrium: (1) there is an auxiliary parameter βˆ in aggregate equilibrium and γ in heavy-traffic equilibrium which approximately "aggregates" the strategies of all jobs in the system; and (2) each individual job simply optimizes believing that all other jobs are using the aggregate priority, which is computationally much easier than optimizing against the exact strategies of all other jobs. Therefore, compared to the computationally prohibitive Nash equilibrium, the aggregate and heavy traffic equilibria sacrifice some accuracy to reduce complexity.

Because of the simple and closed form of heavy traffic processing time, we can explicitly compute the heavy traffic equilibrium for any α.

Theorem 12. A heavy-traffic equilibrium always exists, and it is unique. Moreover, it can be calculated in closed form:

$$\beta_i = c_i^{\frac{1}{\alpha+1}}\left[\alpha(1 - \rho)\rho^{-1}S_1\right]^{-\frac{1}{\alpha}}, \tag{1.20}$$

where $S_1 = \sum_{i=1}^{K} \lambda_i c_i^{-\frac{1}{\alpha+1}}$.

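Proof. We note that the best response of a class $i$ job given that all class $j$ jobs choose $\beta_j$ ($j = 1, 2, \ldots, K$) is

$$\beta_i(\boldsymbol{\beta}) = \arg\min_{\beta \ge 0}\; c_i V^{HT}(\beta, \boldsymbol{\beta}) + \beta^{\alpha} = \arg\min_{\beta \ge 0} \left\{\frac{c_i}{\mu(1 - \rho)\beta\gamma(\boldsymbol{\beta})} + \beta^{\alpha}\right\}.$$

The first order condition of optimality yields a unique solution:

$$\beta_i^*(\boldsymbol{\beta}) = \left(c_i\left[\alpha\mu(1 - \rho)\gamma(\boldsymbol{\beta})\right]^{-1}\right)^{\frac{1}{\alpha+1}}. \tag{1.21}$$

And the second derivative of the objective function at this point is

$$\frac{2c_i}{\mu(1 - \rho)\gamma(\boldsymbol{\beta})} \cdot \frac{1}{(\beta_i^*)^3} + \alpha(\alpha - 1)(\beta_i^*)^{\alpha - 2} = \frac{(\alpha + 1)c_i}{\mu(1 - \rho)\gamma(\boldsymbol{\beta})} \cdot \frac{1}{(\beta_i^*)^3} > 0.$$

Therefore, (1.21) is the unique minimizer of the objective function. Recall that $\gamma(\boldsymbol{\beta}) = \sum_{i=1}^{K} \rho_i/(\rho\beta_i)$. Thus, at the equilibrium,

$$\gamma(\boldsymbol{\beta}^*) = \sum_{i=1}^{K} \rho_i\rho^{-1}\left(c_i^{-1}\alpha\mu(1 - \rho)\gamma(\boldsymbol{\beta}^*)\right)^{\frac{1}{\alpha+1}} \;\Rightarrow\; \gamma(\boldsymbol{\beta}^*) = (S_1/\lambda)^{\frac{\alpha+1}{\alpha}}\left(\alpha\mu(1 - \rho)\right)^{\frac{1}{\alpha}}, \tag{1.22}$$

where $S_1 = \sum_{i=1}^{K} \lambda_i c_i^{-\frac{1}{\alpha+1}}$. Plugging (1.22) into (1.21) yields the result:

$$\beta_i^* = c_i^{\frac{1}{\alpha+1}}\left[\lambda^{-1}\alpha\mu(1 - \rho)S_1\right]^{-\frac{1}{\alpha}} = c_i^{\frac{1}{\alpha+1}}\left[\rho^{-1}\alpha(1 - \rho)S_1\right]^{-\frac{1}{\alpha}}.$$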

Therefore, the heavy-traffic equilibrium always exists, is unique, and can be calculated by the above closed form expressions.
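The closed form (1.20) and the best response (1.21) can be cross-checked numerically. The sketch below is illustrative, with hypothetical function names; it evaluates (1.20) in O(K) operations and then confirms that each class's priority is a best response to the resulting priority vector.

```python
def heavy_traffic_equilibrium(lams, costs, mu, alpha):
    """HTE priorities from the closed form (1.20)."""
    rho = sum(lams) / mu
    s1 = sum(l * c ** (-1.0 / (alpha + 1)) for l, c in zip(lams, costs))
    scale = (alpha * (1 - rho) * s1 / rho) ** (-1.0 / alpha)
    return [c ** (1.0 / (alpha + 1)) * scale for c in costs]


def best_response(cost, betas, lams, mu, alpha):
    """Best response (1.21) of a single job to class priorities `betas`."""
    rho = sum(lams) / mu
    # gamma(beta) = (1 / rho) * sum_i rho_i / beta_i, with rho_i = lam_i / mu
    gamma = sum((l / mu) / b for l, b in zip(lams, betas)) / rho
    return (cost / (alpha * mu * (1 - rho) * gamma)) ** (1.0 / (alpha + 1))
```

At the equilibrium, `best_response(costs[i], betas, lams, mu, alpha)` should agree with `betas[i]` for every class, which is the fixed-point property used in the proof.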

We have two remarks on this result. First, this closed form expression allows us to carry out analysis of sensitivity, efficiency, and revenue of the HTE (see later in this section). Second, the HTE is easily computable with complexity $O(K)$. In comparison, the complexity for computing the exact processing time with fixed parameters is $O(K^3)$, and as discussed, computing the exact NE is intractable.

We have observed above that the difference between the heavy traffic processing time and the exact processing time approaches zero as ρ → 1, when scaled by a factor 1 − ρ. Using this approximation, we can also prove an approximation theorem for the heavy traffic equilibrium: we show that deviating by any constant factor from the HTE is not profitable as ρ → 1.

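Theorem 13. Consider a sequence of systems indexed by $n$ such that classes have the same service capacity µ, and the loads of the systems satisfy $\rho^{(n)} \to 1$ as $n \to \infty$. Let $\boldsymbol{\beta}^{(n)}$ be the unique HTE of the $n$-th system. Then for any δ ≥ 0,

$$\lim_{n \to \infty} (1 - \rho^{(n)})\left[c_i V_{(n)}\big(\beta_i^{(n)}; \boldsymbol{\beta}^{(n)}\big) + \big(\beta_i^{(n)}\big)^{\alpha} - c_i V_{(n)}\big(\delta\beta_i^{(n)}; \boldsymbol{\beta}^{(n)}\big) - \big(\delta\beta_i^{(n)}\big)^{\alpha}\right] \le 0.$$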

Here V is subscripted by (n) to indicate that the processing time is computed in system n with load ρ(n).

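Proof. First, we observe that for any β > 0 and $\boldsymbol{\beta} = (\beta_1, \ldots, \beta_K) > 0$, $V(\beta; \boldsymbol{\beta})$ and $V^{HT}(\beta; \boldsymbol{\beta})$ depend on β and $\boldsymbol{\beta}$ only through the ratios $\beta_i/\beta$ and $\beta_i/\beta_j$ for any $i, j$.

Now, since $\boldsymbol{\beta}^{(n)}$ is the HTE of the $n$-th system, for any δ ≥ 0, we have that

$$c_i V^{HT}_{(n)}\big(\beta_i^{(n)}; \boldsymbol{\beta}^{(n)}\big) + \big(\beta_i^{(n)}\big)^{\alpha} \le c_i V^{HT}_{(n)}\big(\delta\beta_i^{(n)}; \boldsymbol{\beta}^{(n)}\big) + \big(\delta\beta_i^{(n)}\big)^{\alpha}. \tag{1.23}$$

Define $\theta_j = \beta_j^{(n)}/\beta_i^{(n)}$ ($j = 1, \ldots, K$). Then (1.20) implies that $\theta_j = \beta_j^{(n)}/\beta_i^{(n)}$ is independent of $n$ and the load $\rho^{(n)}$. Therefore, $V^{HT}_{(n)}(\beta_i^{(n)}; \boldsymbol{\beta}^{(n)}) = V^{HT}_{(n)}(1; \boldsymbol{\theta})$ and $V_{(n)}(\beta_i^{(n)}; \boldsymbol{\beta}^{(n)}) = V_{(n)}(1; \boldsymbol{\theta})$. And thus,

$$\lim_{n \to \infty} (1 - \rho^{(n)})\left[V^{HT}_{(n)}\big(\beta_i^{(n)}; \boldsymbol{\beta}^{(n)}\big) - V_{(n)}\big(\beta_i^{(n)}; \boldsymbol{\beta}^{(n)}\big)\right] = \lim_{n \to \infty} (1 - \rho^{(n)})\left[V^{HT}_{(n)}(1; \boldsymbol{\theta}) - V_{(n)}(1; \boldsymbol{\theta})\right] = 0. \tag{1.24}$$

The last equality follows from the asymptotic exactness of $V^{HT}$. Similarly, $\beta_j^{(n)}/(\delta\beta_i^{(n)})$ is also independent of $n$ and system load, so we have that

$$\lim_{n \to \infty} (1 - \rho^{(n)})\left[V^{HT}_{(n)}\big(\delta\beta_i^{(n)}; \boldsymbol{\beta}^{(n)}\big) - V_{(n)}\big(\delta\beta_i^{(n)}; \boldsymbol{\beta}^{(n)}\big)\right] = 0. \tag{1.25}$$

Then the claim follows by plugging (1.24) and (1.25) into (1.23).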

In the theorem, we consider deviations by a multiplicative constant factor rather than by an additive constant because (1.20) implies that, as ρ → 1, the heavy traffic equilibrium increases without bound; as a result, it is straightforward to check that any additive constant deviation has no beneficial effect as ρ approaches 1. Note that the processing time is only asymptotically exact up to a 1 − ρ scaling, thus the same is true for this approximation theorem as well. Indeed, this is what we give up by studying heavy traffic: while we gain analytical tractability, the “resolution” to which we can study deviations is scaled by 1 − ρ. This tradeoff is systematic throughout the study of large scale queueing models even without strategic behavior.

1.4.3 Sensitivity

The tractability of heavy traffic equilibrium allows us to analytically study parameter sensitivity, as well as efficiency and revenue at the HTE. We let $\boldsymbol{\beta}^*$ denote the HTE. In this subsection, we analyze the sensitivity of the HTE, i.e., how the equilibrium behaves with respect to changes in system parameters. These observations follow directly from (1.20).

Sensitivity with respect to c. If all $c_i$ are scaled by a constant ζ > 0, then every $\beta_i^*$ is scaled by $\zeta^{1/\alpha}$. This is rather intuitive since the objective function is the sum of the expected processing cost and $\beta_i^{\alpha}$, and a common rescaling of all priorities does not change any $V_i$. Therefore the equilibrium is the same up to a scaling factor. Furthermore, simple first derivative analysis shows that all equilibrium $\beta_i$'s are increasing in any single $c_j$. Note that increasing $c_j$ provides extra incentive for class $j$ users to invest in priority $\beta_j$. In equilibrium, all other $\beta_i$'s also increase as a result of priority competition in the server.

Sensitivity with respect to ρ. The ratio $\beta_i^*/\beta_j^* = (c_i/c_j)^{\frac{1}{\alpha+1}}$ is independent of ρ, i.e., changing ρ will change each $\beta_i^*$ but will not affect $\beta_i^*/\beta_j^*$ for any $i, j$. Therefore the ratio between the service capacity allocated to any pair of jobs, as well as the ratio between the heavy traffic processing times of a pair of jobs, is invariant to the load of the system.

Sensitivity with respect to α. When α → 0, every $\beta_i^* \to \infty$; when α → ∞, every $\beta_i^* \to 1$. This is due to the fact that as α → 0, jobs face a strongly diminishing marginal cost to higher choices of β, and hence prefer to choose higher β at the equilibrium; the effect is reversed as α → ∞.
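The scaling claim for $c$ can be verified in one line from (1.20). The derivation below is a worked step added for illustration, with $S_1$ as defined in Theorem 12. Replacing every $c_i$ by $\zeta c_i$ turns $S_1$ into $\zeta^{-\frac{1}{\alpha+1}}S_1$, so

$$\beta_i^*(\zeta c) = (\zeta c_i)^{\frac{1}{\alpha+1}}\left[\alpha(1 - \rho)\rho^{-1}\zeta^{-\frac{1}{\alpha+1}}S_1\right]^{-\frac{1}{\alpha}} = \zeta^{\frac{1}{\alpha+1} + \frac{1}{\alpha(\alpha+1)}}\,\beta_i^*(c) = \zeta^{\frac{1}{\alpha}}\,\beta_i^*(c).$$

The bracketed factor in (1.20) is common to all classes and cancels in any ratio $\beta_i^*/\beta_j^*$, which is why only $(c_i/c_j)^{\frac{1}{\alpha+1}}$ remains there and the load ρ drops out.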

1.4.4 Efficiency

In HTE, efficiency is characterized by the expected total cost incurred to the system in one unit of time:

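$$C = \sum_{i=1}^{K} \lambda_i c_i V^{HT}(\beta_i^*; \boldsymbol{\beta}^*) = \frac{\rho}{1 - \rho}\left(\sum_{i=1}^{K} \lambda_i c_i^{\frac{\alpha}{\alpha+1}}\right)\left(\sum_{i=1}^{K} \lambda_i c_i^{-\frac{1}{\alpha+1}}\right)^{-1}. \tag{1.26}$$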

We call C the system processing cost (a more efficient system has a lower value of C).

Given fixed $\lambda_i$ and $c_i$ ($i = 1, \ldots, K$), the efficiency depends on the system parameter α and the load ρ as follows.

Dependence of C on ρ. We note that $C$ is proportional to ρ/(1 − ρ), and hence is increasing in ρ. This is because a larger load ρ implies a busier system, and therefore the processing time is longer. (Note that we fixed λ, so varying ρ is equivalent to varying µ.)

Dependence of C on α. It is well known that the system optimal scheduling policy is the c-µ rule [13]: classes are given strict priority in descending order of $c_i\mu_i$ (or equivalently in this paper, in descending order of $c_i$, since we assume that all $\mu_i$ are the same). That is, for any $1 \le i, j \le K$, class $j$ jobs are preempted by class $i$ jobs if $c_i\mu_i > c_j\mu_j$. Jobs with the same value of $c\mu$ are served in first-in-first-out (FIFO) order. Since $\beta_i^*/\beta_j^* = (c_i/c_j)^{\frac{1}{\alpha+1}}$, for $c_i > c_j$ the ratio $\beta_i^*/\beta_j^*$ is higher with smaller α, so we expect higher α to lead to less efficient equilibria. This intuition is stated analytically in the following proposition.

                  c-µ rule       α → 0                     α → ∞
priority ratio    (c_i/c_j)^∞    c_i/c_j   ← β_i/β_j →     1
fairness          least          less      ←         →     more
efficiency        optimal        more      ←         →     less

Table 1.1: Fairness-efficiency tradeoff on α.

Proposition 14. The HTE system processing cost C is increasing in α > 0.

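Proof. Taking the first derivative of $C$ with respect to α, we have that

$$\frac{\partial C}{\partial \alpha} = \frac{\rho}{(1 - \rho)(\alpha + 1)^2}\left(\sum_{i} \lambda_i c_i^{-\frac{1}{\alpha+1}}\right)^{-2} \sum_{i < j} \lambda_i \lambda_j \left(c_i^{\frac{\alpha}{\alpha+1}} c_j^{-\frac{1}{\alpha+1}} - c_j^{\frac{\alpha}{\alpha+1}} c_i^{-\frac{1}{\alpha+1}}\right) \ln\frac{c_i}{c_j}.$$

Since $c_i^{\frac{\alpha}{\alpha+1}} c_j^{-\frac{1}{\alpha+1}} - c_j^{\frac{\alpha}{\alpha+1}} c_i^{-\frac{1}{\alpha+1}}$ and $\ln\frac{c_i}{c_j}$ have the same sign, this derivative is positive and $C$ is increasing in α.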
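Proposition 14 is easy to probe numerically from the closed form (1.26). The following sketch is illustrative only; the function name is hypothetical and the parameter values in the usage note are arbitrary.

```python
def hte_system_cost(lams, costs, mu, alpha):
    """System processing cost C at the HTE, evaluated from (1.26)."""
    rho = sum(lams) / mu
    e = 1.0 / (alpha + 1)
    numer = sum(l * c ** (alpha * e) for l, c in zip(lams, costs))
    denom = sum(l * c ** (-e) for l, c in zip(lams, costs))
    return rho / (1 - rho) * numer / denom
```

Evaluating `hte_system_cost([0.3, 0.6], [4.0, 1.0], 1.0, a)` on an increasing grid of `a` gives an increasing sequence of costs, in line with the proposition. When all costs are equal to a common value `c`, the expression collapses to `c * rho / (1 - rho)` regardless of α.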

We note that even when α approaches 0, the HTE does not approach the social optimum. In fact, for any $i, j$ such that $c_i > c_j$, we have that $\beta_i^*/\beta_j^* = (c_i/c_j)^{\frac{1}{\alpha+1}}$, and hence $1 < \beta_i^*/\beta_j^* < c_i/c_j$. On the other hand, with the c-µ rule, if $c_i > c_j$, then class $i$ jobs completely preempt class $j$ jobs, which can be interpreted as the case where $\beta_i^*/\beta_j^* = \infty$. Therefore, it is clear that the HTE can never be as efficient as the c-µ rule, for any choice of α. Note that α determines how strict the priority is in the system: as α increases, the system tends to be more fair at the loss of social efficiency. We summarize this idea on α in Table 1.1.

Moreover, we can upper bound the price of anarchy (PoA) of the HTE, as stated in the following theorem. The PoA is the ratio $C/C^{opt}$, where $C^{opt}$ is the minimum expected system processing cost (achieved by the c-µ rule).

Theorem 15. The price of anarchy (PoA) of the HTE is upper-bounded by:

$$\frac{C}{C^{opt}} < \frac{\sum_{i=1}^{K-1} (\lambda_i/\lambda_K)(c_i/c_K)^{\frac{\alpha}{\alpha+1}} + 1}{\sum_{i=1}^{K-1} (\lambda_i/\lambda_K)(c_i/c_K)^{-\frac{1}{\alpha+1}} + 1} < \frac{\lambda - \lambda_K}{\lambda_K}\left(\frac{c_1}{c_K}\right)^{\frac{\alpha}{\alpha+1}} + 1. \tag{1.27}$$

Proof. To compute the system processing cost of the c-µ rule, consider the following M/M/1 queue models. Class 1 jobs have the highest priority in the system and themselves form an M/M/1 queue with parameter $(\lambda_1, \mu)$; therefore a basic M/M/1 queue result [39] implies that the expected number of class 1 jobs in the system is $E[N_1] = \rho_1/(1 - \rho_1)$. Then Little's law implies that the expected processing time for a class 1 job is $E[N_1]/\lambda_1$. If we further consider both class 1 and class 2 jobs, they are not preempted by any other jobs in the system; therefore they form yet another M/M/1 queue with parameter $(\lambda_1 + \lambda_2, \mu)$, and thus the expected number of class 2 jobs in the system is $E[N_2] = (\rho_1 + \rho_2)/(1 - \rho_1 - \rho_2) - E[N_1]$. In general, the expected number of class $i$ jobs in the system is

$$E[N_i] = \frac{\sum_{j=1}^{i} \rho_j}{1 - \sum_{j=1}^{i} \rho_j} - \frac{\sum_{j=1}^{i-1} \rho_j}{1 - \sum_{j=1}^{i-1} \rho_j}.$$

Finally, the system processing cost of the system with the c-µ rule is

$$C^{opt} = \sum_{i=1}^{K} c_i \lambda_i \left(E[N_i]/\lambda_i\right) = \sum_{i=1}^{K} c_i \left(\frac{\sum_{j=1}^{i} \rho_j}{1 - \sum_{j=1}^{i} \rho_j} - \frac{\sum_{j=1}^{i-1} \rho_j}{1 - \sum_{j=1}^{i-1} \rho_j}\right).$$

To bound the PoA, we first note that $c_i$ is decreasing in $i$ and the expected number of all jobs in the system is ρ/(1 − ρ); therefore $C^{opt} > c_K \sum_{i=1}^{K} E[N_i] = c_K \rho/(1 - \rho)$. On the other hand, let $c_i' = c_i/c_K$ and $\lambda_i' = \lambda_i/\lambda_K$ for $i = 1, \ldots, K - 1$; then

$$\frac{C}{C^{opt}} < \frac{C(1 - \rho)}{c_K \rho} = \frac{\sum_{i=1}^{K-1} \lambda_i' c_i'^{\frac{\alpha}{\alpha+1}} + 1}{\sum_{i=1}^{K-1} \lambda_i' c_i'^{-\frac{1}{\alpha+1}} + 1} < \sum_{i=1}^{K-1} \lambda_i' c_i'^{\frac{\alpha}{\alpha+1}} + 1.$$

Therefore

$$\frac{C}{C^{opt}} < \frac{\lambda - \lambda_K}{\lambda_K}\left(\frac{c_1}{c_K}\right)^{\frac{\alpha}{\alpha+1}} + 1.$$

We note that there exist some systems in which the PoA of HTE can be made arbitrarily large. Here is an example. Let the $\lambda_i$'s all be equal, set $c_i = m c_K$ for $i = 1, \ldots, K - 1$, and let $\rho = 1 - m^{-2}$. Then $\sum_{j=1}^{i} \rho_j < 1 - \frac{1}{K}$ for $i < K$ and $\rho/(1 - \rho) = m^2 - 1$. We have that

$$C^{opt} < c_K(m^2 + mK), \qquad C = \frac{c_K \rho}{1 - \rho} \cdot \frac{(K - 1)m^{\frac{\alpha}{\alpha+1}} + 1}{(K - 1)m^{-\frac{1}{\alpha+1}} + 1}.$$

Hence,

$$\frac{C}{C^{opt}} > \frac{(K - 1)m^{\frac{\alpha}{\alpha+1}} + 1}{(K - 1)m^{-\frac{1}{\alpha+1}} + 1} \cdot \frac{m^2 - 1}{m^2 + mK}.$$

Letting $m$ go to infinity makes the PoA arbitrarily large. Also, note that in this case the PoA bound is $(K - 1)m^{\frac{\alpha}{\alpha+1}} + 1$, which means that the PoA bound is "asymptotically tight".
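The quantities in this proof can be computed for a concrete system to see how the PoA sits below the two bounds in (1.27). The sketch below is illustrative: all function names are hypothetical, classes are assumed to be listed in decreasing order of cost (so the last entry is class K), and all classes share the service rate µ.

```python
def cmu_cost(lams, costs, mu):
    """C^opt under the c-mu rule: strict preemptive priority by decreasing cost."""
    order = sorted(range(len(costs)), key=lambda i: -costs[i])
    total, load, prev_jobs = 0.0, 0.0, 0.0
    for i in order:
        load += lams[i] / mu
        jobs = load / (1 - load)        # M/M/1 count for the top classes
        total += costs[i] * (jobs - prev_jobs)
        prev_jobs = jobs
    return total


def hte_cost(lams, costs, mu, alpha):
    """System processing cost C at the HTE, from (1.26)."""
    rho = sum(lams) / mu
    e = 1.0 / (alpha + 1)
    numer = sum(l * c ** (alpha * e) for l, c in zip(lams, costs))
    denom = sum(l * c ** (-e) for l, c in zip(lams, costs))
    return rho / (1 - rho) * numer / denom


def poa_bounds(lams, costs, alpha):
    """Middle and right-hand expressions of (1.27); class K is the last entry."""
    lam_k, c_k = lams[-1], costs[-1]
    e = 1.0 / (alpha + 1)
    pairs = list(zip(lams[:-1], costs[:-1]))
    num = sum((l / lam_k) * (c / c_k) ** (alpha * e) for l, c in pairs) + 1
    den = sum((l / lam_k) * (c / c_k) ** (-e) for l, c in pairs) + 1
    upper = (sum(lams) - lam_k) / lam_k * (costs[0] / c_k) ** (alpha * e) + 1
    return num / den, upper
```

For instance, with arrival rates `[0.3, 0.6]`, costs `[4.0, 1.0]`, `mu = 1.0` and `alpha = 2.0`, the ratio `hte_cost(...) / cmu_cost(...)` lies below both values returned by `poa_bounds`.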

We note that (1.27) is only an upper bound and in many cases is a loose bound. Moreover, this upper bound can be made arbitrarily large through appropriate parameter choices; further, this is tight, in the sense that there exist systems where the PoA of HTE is in fact arbitrarily large. For example, let all $\lambda_i$ be equal, set $c_K = 1$ and $c_i = m$ for $i = 1, \ldots, K - 1$, and choose µ so that $\rho = 1 - m^{-2}$. Then it can be shown that $C/C^{opt} = \Omega\big((K - 1)m^{\frac{\alpha}{\alpha+1}}\big)$ as $m \to \infty$ (see the proof of the theorem for details).

We also note that the PoA bound is increasing in α, which matches the intuition that a scheme closer to strict priority in descending cost order yields higher social welfare. If we let α → 0, then the PoA is asymptotically bounded by $\lambda/\lambda_K$. In that case, if the arrival rates of all classes are the same, then the PoA is bounded by $K$. We can also let $\lambda_1, \ldots, \lambda_{K-1} \to 0$ to make the PoA approach 1, but this is not surprising since in this case the system essentially consists of only one class.

1.4.5 Revenue

The revenue of the server per unit time is the sum of expected payments in one unit of time:

$$R = \sum_{i=1}^{K} \lambda_i (\beta_i^*)^{\alpha} = \frac{\rho}{\alpha(1 - \rho)}\left(\sum_{i=1}^{K} \lambda_i c_i^{\frac{\alpha}{\alpha+1}}\right)\left(\sum_{i=1}^{K} \lambda_i c_i^{-\frac{1}{\alpha+1}}\right)^{-1}. \tag{1.28}$$

Given fixed $\lambda_i$ and $c_i$ ($i = 1, \ldots, K$), the revenue depends on the system parameter α and the load ρ as follows.

Dependence of R on ρ. The revenue is proportional to ρ/(1 − ρ), and is therefore increasing in ρ. Heavier traffic induces greater congestion, and hence jobs have to invest more in their purchase of priority in order to keep the same performance.

Dependence of R on α. The revenue depends on α through three terms, and it seems that in general the effect of changing α in the last two terms is significantly smaller than that of changing α in the first term ρ/(α(1 − ρ)). Hence we would expect that the revenue is in general decreasing in α. The next result shows this intuition holds if $c_1/c_K$ is not too high.

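Proposition 16. The revenue $R$ is decreasing in α > 0 if $c_1/c_K < e^4$.

Proof. Taking the first derivative of $R$ with respect to α, we have that

$$\frac{\partial R}{\partial \alpha} = \frac{\rho}{(1 - \rho)\alpha^2}\left(\sum_{i} \lambda_i c_i^{-\frac{1}{\alpha+1}}\right)^{-2} \times \left[\frac{\alpha}{(\alpha + 1)^2}\sum_{i < j} \lambda_i \lambda_j \left(c_i^{\frac{\alpha}{\alpha+1}} c_j^{-\frac{1}{\alpha+1}} - c_j^{\frac{\alpha}{\alpha+1}} c_i^{-\frac{1}{\alpha+1}}\right)\ln\frac{c_i}{c_j} - \sum_{i, j} \lambda_i \lambda_j c_i^{\frac{\alpha}{\alpha+1}} c_j^{-\frac{1}{\alpha+1}}\right]. \tag{1.29}$$

If $c_1/c_K < e^4$, then $\max_{i,j} \ln(c_i/c_j) \le 4 \le \frac{(\alpha + 1)^2}{\alpha}$. It follows from (1.29) that ∂R/∂α is negative and $R$ is decreasing in α.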

On the other hand, $R$ could be increasing in α in some cases. For instance, if $K = 2$ and $c_1/c_2$ is large enough, then ∂R/∂α is positive around α = 1 (see the proof of the proposition for details). To explain this special scenario where the monotonicity does not hold, we first note that a smaller α in general induces a higher revenue because jobs have an incentive to purchase higher priority (as a response to the stronger diminishing marginal cost effect). However, in the HTE, significant asymmetry in costs will result in significant asymmetry in equilibrium priorities. Therefore when $c_1/c_K$ is large, the optimal priorities already exhibit significant differences even when α is not small, and thus in equilibrium at small α, jobs have lower incentive to increase their priorities compared to what they do with mutually comparable costs.

With both (1.26) and (1.28), it is quite surprising to see that in the HTE,

total cost of all jobs = C = αR = α · total revenue of the system.

Thus, we obtain an interesting insight: another interpretation of α is the users' equilibrium cost per unit revenue. We have shown that the users' total cost is increasing in α (i.e., the system efficiency is decreasing in α), and the system revenue is decreasing in α under some mild conditions. Therefore, in a wide range of regimes, from the standpoint of the system manager, smaller α is more favorable in terms of both efficiency and revenue. Note that smaller α is somewhat more "unfair," however, as it approaches a strict priority system.
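The identity C = αR can be checked without using the closed forms (1.26) and (1.28), by computing both sides directly from the equilibrium priorities. The sketch below is illustrative, with hypothetical function names; the delay cost uses the definition of the heavy traffic processing time and the revenue sums the payments $\beta^{\alpha}$.

```python
def hte_priorities(lams, costs, mu, alpha):
    """HTE priorities from the closed form (1.20)."""
    rho = sum(lams) / mu
    s1 = sum(l * c ** (-1.0 / (alpha + 1)) for l, c in zip(lams, costs))
    scale = (alpha * (1 - rho) * s1 / rho) ** (-1.0 / alpha)
    return [c ** (1.0 / (alpha + 1)) * scale for c in costs]


def hte_delay_cost(lams, costs, mu, alpha):
    """C = sum_i lam_i * c_i * V^HT(beta_i*; beta*), computed from Definition 6."""
    rho = sum(lams) / mu
    betas = hte_priorities(lams, costs, mu, alpha)
    gamma = sum((l / mu) / b for l, b in zip(lams, betas)) / rho
    return sum(l * c / ((1 - rho) * mu * b * gamma)
               for l, c, b in zip(lams, costs, betas))


def hte_revenue(lams, costs, mu, alpha):
    """R = sum_i lam_i * (beta_i*)^alpha, the server's revenue per unit time."""
    betas = hte_priorities(lams, costs, mu, alpha)
    return sum(l * b ** alpha for l, b in zip(lams, betas))
```

For any admissible parameters, `hte_delay_cost(...)` and `alpha * hte_revenue(...)` should agree up to floating-point rounding.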

1.5 Class Level Game

Based on the heavy traffic processing time approximation results in Theorem 6, one can also propose similar approximate equilibrium concepts for class level games. However, although the processing time approximation allows us to greatly simplify the computation of best response strategies, we are not able to obtain a closed form expression for the class level AE or HTE.

Take heavy traffic equilibrium as an example. As we did in Section 1.4.1, we can define $W_i^{HT}(\boldsymbol{\beta}) = 1/(\mu(1 - \rho)\beta_i\gamma(\boldsymbol{\beta}))$. It then follows from Theorem 6 and Little's law that

$$\lim_{\rho \to 1} (1 - \rho)\,W_i(\boldsymbol{\beta}) = \lim_{\rho \to 1} \frac{(1 - \rho)\,E[N_i]}{\lambda_i} = \frac{1}{\lambda\beta_i\gamma(\boldsymbol{\beta})}.$$

Therefore $W^{HT}$ is asymptotically exact: as ρ → 1,

$$(1 - \rho)\,[W_i^{HT}(\boldsymbol{\beta}) - W_i(\boldsymbol{\beta})] \to 0.$$

Similarly, we can define $\boldsymbol{\beta} = (\beta_1, \ldots, \beta_K)$ as a class level heavy traffic equilibrium if for all $i = 1, \ldots, K$,

$$\beta_i = \arg\min_{\beta > 0}\; c_i W_i^{HT}(\beta_1, \ldots, \beta_{i-1}, \beta, \beta_{i+1}, \ldots, \beta_K) + \beta^{\alpha} = \arg\min_{\beta > 0}\; \frac{c_i\rho}{\mu(1 - \rho)} \cdot \frac{1}{\beta}\left[\sum_{j \neq i} \frac{\rho_j}{\beta_j} + \frac{\rho_i}{\beta}\right]^{-1} + \beta^{\alpha}.$$

Denote $\gamma_{-i}(\beta; \boldsymbol{\beta}) = \sum_{j \neq i} \frac{\rho_j}{\beta_j} + \frac{\rho_i}{\beta}$. We can obtain a system of non-linear equations to compute this class level heavy traffic equilibrium, but we are not able to get any closed form expressions for it, mainly due to two features of $\gamma_{-i}(\beta; \boldsymbol{\beta})$ that are different from $\gamma(\boldsymbol{\beta})$ of a job level game.

First, $\gamma_{-i}(\beta; \boldsymbol{\beta})$ is subscripted by $i$, which implies that different classes face different environments in the system. Note that in a job level game, the approximate processing time is inspired by a "one-versus-many" view that is rigorously justified in a heavy traffic regime: a single job has little impact on the whole system when the system is sufficiently loaded. By contrast, the same is not true in a class level model: even if the load increases, a single class amounts to a constant fraction of jobs in the system, so the equivalent aggregate priority as seen by one class will differ from the aggregate priority as seen by another class.

Second, $\gamma_{-i}(\beta; \boldsymbol{\beta})$ explicitly depends on β, the action of class $i$, due to the intra-class externality behavior: in a class level game, a class chooses one priority level for all its jobs simultaneously; while in a job level game, a job can choose any priority level regardless of other jobs belonging to its class.

To make progress, recall that our main interest in the class level model comes from the view that a single user of a resource sharing service corresponds to an entire class, i.e., that this user generates multiple jobs and chooses a single priority level for her entire family of jobs. We note that for resource sharing services, a regime of particular interest is one where the number of users is large. In our context, therefore, we are led to consider a regime where the number of classes is large. In the limit, the intra-class externality effect will become negligible. This observation motivates us to consider a limiting model in which the number of classes approaches infinity and any single class becomes infinitesimal. Thus, it can be connected to the job level game model.

Next we formalize this idea. We consider a series of systems indexed by $K$ satisfying the following properties.

Definition 8. A limiting class level game consists of the following elements.

1. There are $K$ classes in system $K$, each with cost $c_k^K$ and arrival rate $\lambda_k^K$, $k = 1, \ldots, K$, and service requirement rate $\mu^K$.

2. The costs $c_k^K$ are i.i.d. samples from a distribution $F(\cdot)$ with a positive and closed support $S_c$.

3. The arrival rates $\lambda_k^K$ are i.i.d. samples from a distribution $G(\cdot)$ with a positive and closed support $S_\lambda$ (independent of the costs). Let $E_G[\lambda] = \int_{S_\lambda} \lambda\,dG$, the limiting mean arrival rate.

4. The service rate $\mu^K$ in system $K$ is chosen to ensure that the system is stable, i.e., $\rho^K \triangleq \sum_{k=1}^{K} \lambda_k^K/\mu^K < 1$.

5. $\lim_{K \to \infty} \mu^K/K$ exists; denoting this limit by µ, we assume that $E_G[\lambda] < \mu$.

6. Feasible priority levels are bounded by a positive and closed support $S_\beta = [\underline{\beta}, \overline{\beta}]$.

As $K \to \infty$, the limit of these systems consists of a continuum of classes with independent cost and arrival rate distributions $F$ and $G$, and with limiting per class service rate µ.

Suppose we are given a limiting class level game. Let $B : S_c \to \mathbb{R}^+$ denote a priority strategy function that maps the unit cost of a class (and its jobs) to the priority level chosen by this class (and its jobs). Let $V(\beta; B, F, G, \mu)$ be the expected steady state processing time of a job with priority β, processed with a continuum of classes of jobs characterized by priority strategy function $B$, cost distribution $F$, arrival rate distribution $G$, and per class service rate µ. Then the symmetric Nash equilibrium for this continuum game is a strategy function $B(\cdot)$ such that:

$$B(c) = \arg\min_{\beta \ge 0}\; c\,V(\beta; B, F, G, \mu) + \beta^{\alpha}, \quad \forall\, c \in S_c.$$

Given the complexity of $V$ in general, this cannot be solved in closed form. Next we define and study the aggregate and heavy traffic equilibria of the limiting class level game.

1.5.1 Aggregate Equilibrium

First we analogously define the aggregate priority of a system, inspired by aggregate priority for the job level game.

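Definition 9. $\hat{\beta}$ is the aggregate priority of the system characterized by $(B, F, G)$ if

$$\hat{\beta} = \frac{\int_{S_c}\int_{S_\lambda} \lambda V(B(c), \hat{\beta})\,B(c)\,dF\,dG}{\int_{S_c}\int_{S_\lambda} \lambda V(B(c), \hat{\beta})\,dF\,dG}.$$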

Comparing this definition to that in the finite case, we note that the summation over all classes is replaced by integration over the support space, since now we have a continuum of classes. Similarly, the aggregate approximation of processing time of a job in the system is the processing time as if this job were facing a single class with the aggregate priority.

Definition 10. The aggregate processing time for a job with priority level β in this limiting class level game is defined as V^{AG}(β; B) = V(β; \hat{β}), where \hat{β} is the aggregate priority of the system and V is the one-job-one-class processing time given by (1.7).

Analogous to the approximation theorem in job level games, we have a similar approximation result in heavy traffic for class level games with infinitely many classes.

Theorem 17. Suppose we are given a limiting class level game, and any positive strategy function B : S_c → S_β. Then for all β > 0:

    \lim_{ρ→1} (1 − ρ) V^{AG}(β; B) = \lim_{K→∞} \lim_{ρ^K→1} (1 − ρ^K) W_1(β, B(c_2), ..., B(c_K)).

Proof. We denote \hat{β} by \hat{β}(ρ) when we send ρ to 1. It is clear that the aggregate priority \hat{β} = \hat{β}(ρ) is bounded above, since the feasible priority levels are bounded. Thus let \hat{β} be any limit point, say along a subsequence ρ_n; then it follows from (1.14) that on this subsequence,

    \lim_{n→∞} \frac{λ (1 − ρ_n) V(β; \hat{β}(ρ_n)) β}{\hat{β}(ρ_n)} = 1,  ∀ β > 0.    (1.30)

Substituting this limit into the definition of aggregate priority, we immediately obtain

    \lim_{n→∞} \hat{β}(ρ_n) = \left( \int_{S_c} B(c)^{-1} dF \right)^{-1}.    (1.31)

As this is true for all convergent subsequences, we conclude:

    \lim_{ρ→1} \hat{β}(ρ) = \left( \int_{S_c} B(c)^{-1} dF \right)^{-1}.

Then (1.31) and (1.30) yield

    \lim_{ρ→1} (1 − ρ) λ V(β; \hat{β}(ρ)) = \left( β \int_{S_c} B(c)^{-1} dF \right)^{-1}.    (1.32)

On the other hand, it follows from Theorem 6 and Little’s law that

    \lim_{ρ^K→1} (1 − ρ^K) λ^K W_1(β, B(c_2), ..., B(c_K)) = \frac{1}{γ_K(β) β},

where γ_K(β) = \frac{ρ_1}{β} + \sum_{i=2}^{K} \frac{ρ_i}{β_i}. Then the strong law of large numbers implies

    \lim_{K→∞} γ_K(β) = \lim_{K→∞} \left( \frac{λ_1}{λ^K β} + \sum_{i=2}^{K} \frac{λ_i}{λ^K B(c_i)} \right) = \int_{S_c} B(c)^{-1} dF.

Therefore

    \lim_{K→∞} \lim_{ρ^K→1} (1 − ρ^K) λ^K W_1(β, B(c_2), ..., B(c_K)) = \left( β \int_{S_c} B(c)^{-1} dF \right)^{-1}.    (1.33)

Equations (1.32) and (1.33) together complete the proof.
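As a numerical illustration of (1.30) and (1.31), the following minimal sketch checks that the harmonic mean of B under F is a fixed point of Definition 9 once V(B(c), \hat{β}) is replaced by its limiting shape \hat{β}/B(c). The discrete cost distribution, the strategy B(c) = sqrt(c), and the function names are assumptions made only for this example; they do not come from the text.

```python
import numpy as np

def harmonic_mean_priority(priorities, probs):
    """Heavy traffic limit of the aggregate priority, as in (1.31)."""
    return 1.0 / float(np.sum(probs / priorities))

def definition9_rhs(priorities, probs, beta_hat):
    """Right-hand side of Definition 9 with V(B(c), beta_hat) replaced by the
    limiting shape beta_hat / B(c) suggested by (1.30). The common factor
    1 / (lambda * (1 - rho)) and the integral over lambda cancel in the ratio."""
    weights = beta_hat / priorities
    numerator = float(np.sum(probs * weights * priorities))
    denominator = float(np.sum(probs * weights))
    return numerator / denominator

# Hypothetical discrete cost distribution F and strategy B(c) = sqrt(c).
costs = np.array([1.0, 3.0, 6.0, 11.0])
probs = np.full(costs.size, 1.0 / costs.size)
priorities = np.sqrt(costs)

beta_hat = harmonic_mean_priority(priorities, probs)
# beta_hat cancels inside the ratio, so the harmonic mean is the only fixed point.
assert np.isclose(definition9_rhs(priorities, probs, beta_hat), beta_hat)
```

Because \hat{β} cancels inside the ratio, the limiting fixed point is unique in this sketch, which mirrors the subsequence argument in the proof.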

Finally, we define the aggregate equilibrium of the class level game.

Definition 11. Suppose we are given a limiting class level game. A class level aggregate equilibrium is characterized by a positive priority strategy function B such that

    B(c) = \arg\min_{β ≥ 0} [ c V^{AG}(β, B) + β^α ],  ∀ c ∈ S_c,

where V^{AG}(β; B) is the aggregate processing time defined in (1.11) with ρ = E_G[λ]/µ.

Computing the exact aggregate equilibrium is similar to the computation for the job level game. To ensure positivity of the equilibrium, as in Theorem 8, we assume that min Sc/ max Sc > 1/4. Further, suppose that the feasible priority region Sβ is taken sufficiently large to ensure that the equilibrium priority strategy never hits the upper boundary. Both (1.18) and (1.19) hold with ci replaced by c and βi replaced by B(c). Then the consistency condition gives the closed form solution for the aggregate priority level in the class level aggregate equilibrium when α = 1:

    \hat{β} = \left[ \frac{(2ρ − 1) \sqrt{S_3} + [(2ρ − 1)^2 S_3 + 4 S_1 S_2 (2 − ρ)]^{1/2}}{2 S_1 (2 − ρ) \sqrt{µ}} \right]^2,

where S_1 = \int_{S_c} c^{-1/2} dF, S_2 = \int_{S_c} c^{1/2} dF and S_3 = (2 − ρ)/(ρ(1 − ρ)). Further, we obtain B(c) via (1.18) upon replacing c_i by c; establishing that B(c) is strictly positive follows by an argument similar to the end of Theorem 8.

In our next result, we relate the aggregate equilibrium of the class level game to the aggregate equilibria of the job level game in a series of finite systems; in this sense, we show that a class level game with infinitely many classes behaves like a job level game, validating that the in-class externality has been mitigated. If K is fixed, system K is a finite class system. The optimal strategies in aggregate equilibrium are given by (1.18) and the corresponding aggregate equilibrium priority is given by (1.17). Now if K → ∞, we find that the sequence of aggregate equilibrium priorities converges almost surely and the limit is \hat{β}.

Proposition 18. Suppose we are given a limiting class level game, and let \hat{β}^K denote the (random) aggregate equilibrium priority for the job level game induced in the K’th system. Let \hat{β} be the aggregate priority in the class level aggregate equilibrium. Then:

    \lim_{K→∞} \hat{β}^K = \hat{β}.

Proof. The proof follows by applying the strong law of large numbers in (1.17).

1.5.2 Heavy Traffic Equilibrium

Let us define

    γ(B) = \int_{S_c} \int_{S_λ} \frac{λ}{µ ρ B(c)} dF dG = \int_{S_c} B(c)^{-1} dF,    (1.34)

where the last equality comes from the independence between λ and c together with ρ = E_G[λ]/µ, and define the heavy traffic processing time in the limiting class level game by

    V^{HT}(β; B) = \frac{1}{µ(1 − ρ)} \frac{1}{γ(B) β}.    (1.35)

Then we have a similar approximation result in heavy traffic for the limiting class level game, which shows that V^{HT} is asymptotically exact.

Theorem 19. Suppose we are given a limiting class level game, and any positive strategy function B : S_c → S_β. Then for all β > 0:

    \lim_{ρ→1} (1 − ρ) V^{HT}(β, B) = \lim_{K→∞} \lim_{ρ^K→1} (1 − ρ^K) W_1(β, B(c_2), ..., B(c_K)).

Proof. By definitions (1.34) and (1.35) we have

    \lim_{ρ→1} (1 − ρ) λ V^{HT}(β; B) = \left( β \int_{S_c} B(c)^{-1} dF \right)^{-1}.    (1.36)

The rest of the proof is a repetition of the proof of Theorem 17.

Inspired by heavy traffic equilibrium for the job level game, we can also define the class level heavy traffic equilibrium for the limiting class level game.

Definition 12. Suppose we are given a limiting class level game. A class level heavy traffic equilibrium is characterized by a positive priority strategy function B such that

    B(c) = \arg\min_{β ≥ 0} [ c V^{HT}(β, B) + β^α ],  ∀ c ∈ S_c,

where V^{HT}(β, B) is defined as in (1.35), with ρ = E_G[λ]/µ.

This class level heavy traffic equilibrium can also be solved in closed form.

Theorem 20. The class level heavy traffic equilibrium for a limiting class level game always exists and is unique. Moreover, it can be calculated in closed form as follows:

    B(c) = c^{\frac{1}{α+1}} (µ α (1 − ρ) S_2)^{-\frac{1}{α}},

where S_2 = \int_{S_c} c^{-\frac{1}{α+1}} dF is independent of ρ.

The proof is a repetition of the proof of Theorem 12. Moreover, we can relate the HTE of the class level game to the HTE of the job level game in a series of finite systems, just as we did for the aggregate equilibrium. The result can be verified by applying the strong law of large numbers in (1.21).

Proposition 21. Suppose we are given a limiting class level game. Let β_i^K denote the strategy of a class i job used in heavy traffic equilibrium in the Kth system, and let B(·) be the equilibrium strategy function used in the limiting class level game. Then

    \lim_{K→∞} ( β_i^K − B(c_i^K) ) = 0.
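The closed form in Theorem 20 can be checked numerically. The sketch below is an illustration under stated assumptions rather than part of the original analysis: it takes a hypothetical discrete cost distribution (so the integrals become finite sums), ignores the bounds of S_β, and verifies that the closed-form B(c) satisfies the first order condition of c V^{HT}(β; B) + β^α, with V^{HT} as in (1.35) and γ(B) as in (1.34). All function names and parameter values are chosen only for this example.

```python
import numpy as np

def s2(costs, probs, alpha):
    """S_2 = integral of c^(-1/(alpha+1)) dF for a discrete F."""
    return float(np.sum(probs * costs ** (-1.0 / (alpha + 1.0))))

def hte_strategy(costs, probs, alpha, mu, rho):
    """Closed-form class level heavy traffic equilibrium of Theorem 20."""
    scale = (mu * alpha * (1.0 - rho) * s2(costs, probs, alpha)) ** (-1.0 / alpha)
    return costs ** (1.0 / (alpha + 1.0)) * scale

def gamma(priorities, probs):
    """gamma(B) = integral of B(c)^(-1) dF, as in (1.34)."""
    return float(np.sum(probs / priorities))

def best_response(costs, gamma_value, alpha, mu, rho):
    """Minimizer of c / (mu (1 - rho) gamma beta) + beta^alpha over beta > 0,
    obtained from the first order condition (interior solution assumed)."""
    return (costs / (alpha * mu * (1.0 - rho) * gamma_value)) ** (1.0 / (alpha + 1.0))

# Hypothetical example: four equally likely cost levels.
costs = np.array([1.0, 3.0, 6.0, 11.0])
probs = np.full(costs.size, 0.25)
alpha, mu, rho = 1.0, 1.0, 0.9

equilibrium = hte_strategy(costs, probs, alpha, mu, rho)
response = best_response(costs, gamma(equilibrium, probs), alpha, mu, rho)
# At equilibrium, every class best-responds to the gamma induced by B itself.
assert np.allclose(equilibrium, response)
```

The check reflects the consistency requirement of Definition 12: the strategy function is an equilibrium exactly when it is a best response to the γ(B) that it generates.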

1.6 Numerics

In this section, we study the numerics of our two approximations with different system parameters (K, {c_i}, {λ_i}, α, ρ); this complements our theoretical analysis above. We only consider the finite class job level game in this section. To quantify the heterogeneity of the c_i's, we assume the c_i's are drawn i.i.d. from a uniform distribution on [0, 10], plus a constant c_0. A smaller c_0, therefore, induces a potentially larger ratio between the smallest and largest c_i. Similarly, we assume the λ_i's are drawn i.i.d. from uniform[0, 10] + λ_0. Given the aggregate equilibrium β^{AG}, the heavy traffic equilibrium β^{HT} and a Nash equilibrium β^{NE}, we use the relative error as a measure of approximation, i.e., \max_i (β_i^{AG or HT} − β_i^{NE}) / β_i^{NE}. We compute the exact Nash equilibrium using best response dynamics; surprisingly, we found that best response dynamics converge to NE for all parameter choices below.
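The sampling scheme and error measure described above can be sketched as follows. This is only an illustration: the function names are hypothetical, the exact Nash equilibrium is taken as a given input (the best response dynamics are not reproduced here), and the absolute value in the error is an assumption, since the text writes the difference without it.

```python
import numpy as np

def sample_parameters(num_classes, c0, lam0, rng):
    """Draw costs c_i ~ U[0, 10] + c0 and arrival rates lambda_i ~ U[0, 10] + lam0."""
    costs = rng.uniform(0.0, 10.0, size=num_classes) + c0
    rates = rng.uniform(0.0, 10.0, size=num_classes) + lam0
    return costs, rates

def relative_error(beta_approx, beta_ne):
    """Largest relative deviation of an approximate equilibrium from the NE.
    Assumption: deviations are measured in absolute value."""
    beta_approx = np.asarray(beta_approx, dtype=float)
    beta_ne = np.asarray(beta_ne, dtype=float)
    return float(np.max(np.abs(beta_approx - beta_ne) / beta_ne))

rng = np.random.default_rng(0)
costs, rates = sample_parameters(num_classes=10, c0=1.0, lam0=1.0, rng=rng)

# Made-up priority vectors, used only to exercise the error measure.
example_error = relative_error([1.1, 2.0], [1.0, 2.5])
```

With c_0 = λ_0 = 1 the samples fall in [1, 11], matching the setting c_i ∼ U[1, 11] and λ_i ∼ U[1, 11] used in Section 1.6.1.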

1.6.1 Approximation Errors: AE vs. HTE

First we compare the approximation error of AE and HTE. Both approximations are shown to be asymptotically exact in heavy traffic; therefore we would expect small approximation errors when ρ is close to one. The main difference between AE and HTE is that the construction of AE does not depend on the state space collapse result (Theorem 6), while HTE directly uses the approximation given by this result. Hence the aggregate approximation should be more robust when the system is not in heavy traffic. All these conjectures are verified by our numerical analysis.

We fix the system at α = 1 (in order to have a closed form solution for the aggregate equilibrium), K = 10, c_0 = λ_0 = 1 (i.e., c_i ∼ U[1, 11] and λ_i ∼ U[1, 11]). Then we compare the approximation errors of AE and HTE under different system loads ρ = 0.5, 0.8, 0.9 and 0.95. For each value of ρ, we have 100 simulation samples (of cost vector and arrival rate vector), and the approximation errors are summarized by the boxplots in Figure 1.1, with AE on the left and HTE on the right. It is clear that (i) both AE and HTE approximate the exact NE well in the heavy traffic regime: their relative approximation errors are less than 0.12 when ρ = 0.95; (ii) AE has much better approximation performance than HTE outside heavy traffic (note the different scales on the y-axes of the two boxplots); (iii) in general, AE is a very good approximation of the exact Nash equilibrium. In our simulations, the maximum relative error of AE is less than 0.15 even when the system load is 0.5, which is far from the heavy traffic regime.

1.6.2 Approximation Sensitivity

Next we study the impact of changes in system parameters on approximation accuracy. We use HTE as an example; similar results can be obtained for AE.

Figure 1.1: Relative error of AE and HTE with different system loads. (Boxplots of approximation error at ρ = 0.5, 0.8, 0.9, 0.95; left panel: aggregate equilibrium, right panel: heavy traffic equilibrium.)

To illustrate the change, we fix most of these parameters at K = 10, c_0 = λ_0 = 1, α = 1, ρ = 0.9, but vary one or two of them at a time. For each set of parameters, we have 100 simulation samples, and the approximation errors are summarized by the boxplots.

In the upper panel of Figure 1.2, we see that heterogeneity in the system weakens the approximation. Numerical results show that the approximation error is higher with smaller c_0 or a larger number of classes; both increase heterogeneity. Heterogeneity in the arrival rates appears to cause less degradation in the approximation, but it can be shown that arrival rate heterogeneity and significant cost heterogeneity together can amplify approximation errors.

In the lower panel of Figure 1.2, we vary α and ρ. The results suggest that the approximation error is lower with larger α. Note that α describes the marginal cost to payment; therefore larger α induces smaller payments and the relative heterogeneity among user decisions diminishes. Regarding ρ, the error decreases as we approach heavy traffic (ρ close to 1), as expected.

We conclude by noting that the approximation error can be made arbitrarily large through appropriate parameter choices; one such example is given when K = 2, c_1/c_2 → ∞ and λ_1/λ_2 → ∞. In this case, since λ_2 and c_2 are relatively small, the optimal priority level for class 2 in the exact NE is also extremely small. However, in the heavy traffic approximation, the processing time V^{HT}(β, β) is inversely proportional to β; so in HTE no user will choose an extremely small β, leading to an arbitrarily large error rate.

Figure 1.2: Relative error of HTE under different parameters. (Boxplots of approximation error. Upper panels: K = 2, 5, 10, 20 with α=1, ρ=0.9, c0=1, λ0=1; c0 = 0.1, 0.5, 1, 5 with α=1, ρ=0.9, K=10, λ0=1; λ0 = 0.1, 0.5, 1, 5 with α=1, ρ=0.9, K=10, c0=1. Lower panels: ρ = 0.6, 0.8, 0.9, 0.99 for α = 0.5, α = 1 and α = 5, each with K=10, c0=1, λ0=1.)

1.6.3 Price of Anarchy

Finally, we study the actual price of anarchy of the approximate equilibrium. In Theorem 15 we give an upper bound for the price of anarchy of HTE, which is tight only when the system is under extreme parameters. In this subsection, we compute the actual prices of anarchy of the approximate equilibria.

We still fix most of the system parameters at K = 10, c_0 = λ_0 = 1, α = 1, ρ = 0.9, and vary one or two of them at a time. For each set of parameters, we have 100 simulation samples, and the resulting prices of anarchy are summarized by the boxplots. In the upper panel of Figure 1.3, we see that heterogeneity in the system increases the price of anarchy. Recall that the c-µ rule uses a strict priority allocation scheme, while the priority system of DPS is not strict; the more heterogeneity the system has, the larger the difference between these two allocation rules. In the lower panel of Figure 1.3, we vary α and ρ. The results suggest that the price of anarchy is higher with larger α and larger ρ. As we mentioned before, α is the fairness index of the system: larger α implies a fairer system, which is therefore farther away from the most unfair but efficiency-optimizing c-µ rule. When ρ increases, the system is busier and the advantage of strict priority is amplified by the increasing number of waiting jobs.

1.7 Extensions

We believe our work makes significant progress on two fronts. First, the DPS queueing model is important in its own right as a benchmark model for analysis of priority pricing for shared resource services. Our analysis provides extensive insight into this queueing system with strategic behavior. Second, and perhaps of greater longer term interest, our approximation methodology suggests a broader research program for understanding strategic behavior in queueing systems: by exploiting large system asymptotics, we can simplify both the complexity of the stochastic system and the complexity of the economic system.

Our approximation methods can be applied to several extensions of the one server DPS model.

Figure 1.3: Price of anarchy of HTE under different parameters. (Boxplots of price of anarchy. Upper panels: K = 2, 5, 10, 20 with α=1, ρ=0.9, c0=1, λ0=1; c0 = 0.1, 0.5, 1, 5 with α=1, ρ=0.9, K=10, λ0=1; λ0 = 0.1, 0.5, 1, 5 with α=1, ρ=0.9, K=10, c0=1. Lower panels: ρ = 0.6, 0.8, 0.9, 0.99 for α = 0.5, α = 1 and α = 5, each with K=10, c0=1, λ0=1.)

Random order of service. Consider an alternative prioritized allocation policy, the random order of service (ROS) policy. In the ROS policy, only one job is served at a time and upon completion of this job, a new job starts to be served with probability proportional to its priority level. Therefore, if there are currently N jobs waiting in the system and job ℓ has chosen priority level β_ℓ, then in the ROS policy, job ℓ is the next job to start service with probability β_ℓ / \sum_{m=1}^{N} β_m. Although ROS and DPS differ in the ways they allocate service among jobs, the ratio of expected service allocated to two jobs is the same as the ratio of their priorities in both schemes. Therefore, we would expect some similarities in the expected processing times of jobs in these two allocation policies. In fact, Ayesta et al. [7] show that in the heavy traffic regime, the expected processing time of a class j job in a ROS system is exactly V^{HT}(β_j; β), the same as in a DPS system. Thus in the heavy traffic ROS system, the expected processing time of a job with priority β is V^{HT}(β; β). Therefore all our results on heavy traffic equilibria of the DPS system also hold for the ROS system.

Endogenous arrival rates. Some previous work (e.g., [43, 36]) on priority pricing in queueing systems allows for strategic choice of the arrival rate. In our model, this might mean jobs only enter if their total cost (cost of waiting plus payment) does not exceed a reservation utility. A significant challenge here is that when arrival rates are endogenized, heavy traffic cannot be exogenously guaranteed. However, we believe approximating the processing time may still yield valuable insight into this game. Characterizing the quality of approximate equilibria in this regime remains an open direction. In our DPS model, suppose we allow jobs to leave the system without requesting service; then each job optimizes its expected payoff:

    \max\{0, u_i − c_i V(β, β) − β^α\}

by simultaneously choosing whether to enter, and the priority level β if it enters. Here u_i is the utility of service completion. Clearly computing the expected processing time V remains a hurdle here, and thus we believe our approximation technique still applies and significantly reduces the computational complexity.
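To make the entry decision concrete, the following minimal sketch replaces V by the heavy traffic form (1.35) and treats the load ρ and the aggregate γ as given constants. Both are strong simplifying assumptions made only for this illustration: as noted above, heavy traffic cannot be exogenously guaranteed once arrival rates are endogenous, so this shows only the single-job optimization, not an equilibrium of the extended game. The function names and numerical values are hypothetical.

```python
def ht_processing_time(beta, gamma_value, mu, rho):
    """Heavy traffic processing time 1 / (mu (1 - rho) gamma beta), as in (1.35)."""
    return 1.0 / (mu * (1.0 - rho) * gamma_value * beta)

def entry_decision(utility, cost, gamma_value, alpha, mu, rho):
    """Return (enters, priority, payoff) for a job with completion utility u_i
    and unit delay cost c_i, holding gamma and rho fixed (an assumption)."""
    # Priority minimizing c * V_HT(beta) + beta^alpha, from the first order condition.
    beta = (cost / (alpha * mu * (1.0 - rho) * gamma_value)) ** (1.0 / (alpha + 1.0))
    total_cost = cost * ht_processing_time(beta, gamma_value, mu, rho) + beta ** alpha
    payoff = utility - total_cost
    if payoff > 0.0:
        return True, beta, payoff
    # Staying out yields the outside payoff of zero.
    return False, None, 0.0

# Hypothetical numbers: the optimal priority is 4 and the total cost is 8.
high_value_job = entry_decision(utility=10.0, cost=4.0, gamma_value=1.0,
                                alpha=1.0, mu=1.0, rho=0.75)
low_value_job = entry_decision(utility=5.0, cost=4.0, gamma_value=1.0,
                               alpha=1.0, mu=1.0, rho=0.75)
```

In this example the job with utility 10 enters and the job with utility 5 stays out; a full treatment would also require ρ and γ to be consistent with the set of jobs that choose to enter.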

1.7.1 A Multi-server Model

Our model considered a single resource; more generally, we can extend some of our basic results to a network setting. In this subsection, we provide a description of a network generalization of our model.

Consider a setting with J resources and K classes. Jobs of class i arrive at rate λ_i. Each class requires service from a subset of the resources; in particular, let r_i denote the subset of resources that are used by a job of class i. Each job generates an exponentially distributed workload with mean 1/µ_j at resource j; let ρ_{ij} = λ_i/µ_j be the traffic intensity of class i at resource j.

We assume that each resource operates as an independent market. In other words, each job bids independently at each resource, and each resource uses the DPS policy to allocate resources to jobs. Let β_{ij} be the bid of a class i job at resource j; for simplicity in this section, we assume that the payment of the job to that resource is equal to β_{ij} (i.e., that α = 1 at each resource).

Finally, in the model we consider, we assume that each job simultaneously requires service from each of the resources it demands. We assume that each job is sensitive to its maximum processing time across the resources. This might be a reasonable model if, for example, the resources correspond to resources used to farm out parallelized jobs; in that case, the user would be sensitive to the completion time of the slowest job run on the resources they demand. We have the following equivalent definition of a Nash equilibrium.

Definition 13. A Nash equilibrium of the network game consists of a class priority vector β = (β_{ij}, i = 1, ..., K; j ∈ r_i) such that

    β_i = \arg\min_{β > 0} \left[ c_i \max_{j ∈ r_i} \{ V_j(β_{ij}; β^{(j)}) \} + \sum_{j ∈ r_i} β_{ij} \right],  ∀ i = 1, ..., K,

where β^{(j)} = (β_{kj}, k such that j ∈ r_k) is the class priority vector of jobs that use resource j.

Here V_j(β; β^{(j)}) is the processing time in a DPS system for a job with priority β at resource j, when the class priority vector at resource j is β^{(j)}, as before. We can analogously define a heavy traffic equilibrium (HTE) of the network game by replacing V_j by V_j^{HT}. Note that this notion is formally justified if we consider a heavy traffic limit where ρ_{ij} converges to a limit \bar{ρ}_{ij}, such that \sum_{i: j ∈ r_i} \bar{ρ}_{ij} = 1 for all j. (In particular this corresponds to the limit where every resource approaches heavy traffic simultaneously.)

If we let \bar{V}_j denote the processing time of a particular class i job at resource j, then observe that \max_{j ∈ r_i} \{\bar{V}_j\} is a convex function of (\bar{V}_j, j ∈ r_i). As a result, the objective function of user i in the definition of heavy traffic equilibrium can be shown to be convex, and so by standard arguments it is straightforward to show that an HTE exists for this game; for brevity we omit the details. Unfortunately, due to the complexity of the network setting, it is not possible in general either to establish uniqueness of the equilibrium or to compute the equilibrium in closed form.

However, we can use our earlier results to obtain a bound on the price of anarchy in this model. We require one additional piece of notation. Let b_j(\bar{V}_j; β^{(j)}) be the value of β that ensures that the heavy traffic processing time of a job at resource j is \bar{V}_j, when the class priority vector of other jobs at resource j is β^{(j)}. In other words, let b_j(\bar{V}_j, β^{(j)}) be the solution β to:

    \bar{V}_j = V_j^{HT}(β; β^{(j)}).

Though b_j(·) can be computed in closed form, the solution is tedious and not particularly insightful. For our purposes, all we require is that it is convex, decreasing, and differentiable in \bar{V}_j > 0. We can then prove the following theorem.

Theorem 22. Suppose λ_i = λ for all i. Let β^* be an HTE of the network game, and let V_{ij}^* = V_j(β_{ij}^*, β^{(j)*}). Further, define:

    b'_{ij} = \frac{∂ b_j}{∂ \bar{V}_j} (V_{ij}^*, β^{(j)*}).

Then the price of anarchy, i.e., the ratio of the HTE processing cost to the minimal system processing cost, is bounded above by:

    (K − 1) \max_j \sqrt{ \frac{\max_{i: j ∈ r_i} (−b'_{ij})}{\min_{i: j ∈ r_i} (−b'_{ij})} } + 1.

Proof. Our proof technique follows the analysis of the price of anarchy of the network game in [29]. In that paper, it is shown that by a decomposition approach, the price of anarchy of a network resource allocation game can be studied by reduction to the price of anarchy of a collection of single resource games. In particular, we write the processing time cost function of a user as a function of her processing times as:

    C_i(V_i) = c_i \max_{j ∈ r_i} V_{ij}.

Now consider a new game where the cost function of user i is instead given by:

    \hat{C}_i(V_i) = \sum_{j ∈ r_i} (−b'_{ij})(V_{ij} − V_{ij}^*) + c_i \max_{j ∈ r_i} V_{ij}^*.

This cost function is derived by linearizing around the processing time vector observed in equilibrium. It has two important properties: first, \hat{C}_i(V_i^*) = C_i(V_i^*); and second, because of convexity of the original cost function, the first order condition for optimality at the equilibrium can be used to show that for any V_i, there holds \hat{C}_i(V_i) ≤ C_i(V_i). In particular, the optimal system processing cost can only be lower under the new cost function.

Next, observe that in equilibrium, since a user minimizes C_i(V_i) + \sum_{j ∈ r_i} b_j(V_{ij}, β^{(j)}), the directional derivative of C_i(V_i^*) in the direction of the vector (1, ..., 1) must be equal to c_i, since the job must have equalized its processing times at the different resources in equilibrium. The first order condition can then be used to conclude that \sum_{j ∈ r_i} (−b'_{ij}) = c_i. (Note that −b'_{ij} ≥ 0 since b_j is decreasing in processing time.) This ensures that \hat{C}_i(V_i) > 0 for all i and feasible V_i, and in particular that c_i \max_j V_{ij}^* − \sum_j (−b'_{ij}) V_{ij}^* ≥ 0.

Finally, we note that \hat{C} is linear in the processing times at different resources; thus the first order conditions for optimality for job i decompose across the resources. Thus if we now consider independent games at each resource j, where player i with j ∈ r_i plays with unit time cost −b'_{ij}, then it follows that an HTE for that game would also be β^{(j)*}. Since the equilibrium actions are the same in the network game and in the independent single server games, while the optimal social cost is lower in the latter, the result then follows using the same argument as in [29] by using the price of anarchy bound for single resource games established in Theorem 15.

1.8 Conclusion

In this chapter, we considered a model of priced resource sharing that combines both queueing behavior and strategic behavior. We study a priority service model where a single server allocates its capacity to agents in proportion to their payment to the system, and users act to minimize the sum of their cost for processing delay and payment. Calculation of the exact Nash equilibrium proves to be difficult, because both the queueing and strategic interactions introduce significant complexity into the characterization of agents' processing times. We introduced two novel concepts of approximate equilibrium, the aggregate equilibrium (AE) and the heavy traffic equilibrium (HTE). Both approximate equilibria are asymptotically exact and can be computed in closed form. This tractability enables us to conduct a series of parametric and normative analyses of the system. Our main contributions are as follows.

(1) Longstanding approach to analyze queueing system. Our approximation approaches are of great use even without applying them to game theoretic settings. The aggregate approximation and heavy traffic approximation of processing time significantly increase the tractability of analyzing the complicated DPS queueing system.

(2) Approximate notions of equilibrium. When applying our approximation methods not only to queueing systems but also to queueing games, we suggest natural corresponding notions of equilibrium that we call aggregate equilibrium (AE) and heavy traffic equilibrium (HTE). In an AE or HTE, users minimize the sum of their payment and the approximate processing time cost, rather than their true expected processing time cost. We show that under mild conditions, both AE and HTE exist and are unique, and that they can be computed in closed form in terms of system parameters. Moreover, AE and HTE are asymptotically exact in the heavy traffic regime. They are thus both simple to compute, and asymptotically accurate when the system approaches heavy traffic.

(3) Economic analysis: parameter sensitivity, efficiency, and revenue. A significant benefit of our approach is that since we can compute the equilibrium in closed form, it is straightforward to carry out analysis of efficiency and revenue. We study how the system behavior changes when cost or arrival rate parameters are scaled, and more importantly, we investigate social efficiency and system revenue of HTE under different system parameters, and give a bound for the price of anarchy of HTE. We obtain some intriguing insights: in particular, we show that within a particular class of pricing schemes, and for a wide range of parameter choices, the incentives of the revenue maximizing service provider become aligned with minimization of total system processing cost.

Chapter 2

Part II: Reputation System

2.1 Introduction

Reputation mechanisms have played a significant role in the rise and success of online marketplaces such as eBay. In such markets, buyers and sellers who have never met each other must trust that the transaction will be carried out to their satisfaction. Some potential sources of uncertainty can be eliminated by the platform, e.g., through verification of payment. However, the platform cannot guarantee a priori that the counterparty is trustworthy (and in particular, that sellers are honest in their description of items for sale) without physically inspecting every item being listed. Reputation scores serve as a proxy for the perceived trustworthiness of a seller, and encourage buyers to participate in trade where they otherwise may not.

In this chapter, we develop a large market model to study conditions under which sellers can be made truthful via appropriate design of a reputation mechanism. In particular, we study a market with long-lived sellers and short-lived buyers. In each period, each seller gets an item of either high or low value and sells it to a buyer. Sellers can intentionally or unintentionally misadvertise a low value item as high value, but will then receive negative feedback from the buyer. There is an aggregation mechanism in the system that calculates the reputation of a seller from her feedback history. The reputation of a seller is observable to the buyer and indicates how trustworthy she is, and therefore influences the payment she gets from the buyer.

In the model we consider, there are two distinct types of sellers: commitment sellers and strategic sellers. The commitment sellers are intrinsically honest, but suffer from the fact that their expertise may simply not be high enough to properly describe the items they have for sale. In particular, commitment sellers are characterized by a fixed probability of inaccurate description over their entire lifetime. The strategic sellers are driven by a profit maximization motive, and can willfully inflate the advertisement of an item if it will yield higher expected discounted profit over their lifetime. Motivated by the discussion above, we are interested in a particular equilibrium outcome of the market where the optimal strategy for all strategic sellers is to always truthfully advertise the value of the items they sell.

Our paper makes three main contributions, as follows.

(1) Formulation of a large market model for reputation systems in online marketplaces. A major challenge in modeling and analyzing the behavior of sellers in a setting as described above is that the entire market represents a very complex dynamic game. As previously argued in the literature on large dynamic games (see, e.g., [1, 27]), traditional equilibrium concepts such as perfect Bayesian equilibrium are both implausible (due to their high demands on rationality) and intractable (with difficulty even ensuring existence of equilibrium) for such markets. Instead, we focus on a large market approximation, where each buyer and seller is an infinitesimal member of the overall marketplace. Using this model we define an appropriate notion of stationary equilibrium for the dynamic market. In this equilibrium notion, individual agents conjecture that the cross-sectional distribution of types and reputations in the marketplace (the population state) remains constant over time, and optimize accordingly.
The equilibrium requires a consistency check: that the population state arises from these optimal decisions of individual agents. A key benefit of our approach is that it allows computational analysis to obtain insight into the structure of equilibria, something that is quite challenging with traditional equilibrium notions for dynamic markets.

(2) Existence of separating equilibria. Having defined stationary equilibrium, we focus our attention on a particular type of equilibrium that we call separating equilibrium (by analogy with similar terminology for signaling and screening games in economics). A separating equilibrium is characterized by two features: first, that strategic sellers are truthful at all reputations; and second, as a result, in steady state their reputation scores are separated from those of commitment sellers. As long as a commitment seller has a positive probability of “making a mistake” in her advertisement, she will receive a non-negligible number of negative feedbacks relative to strategic sellers. Therefore a buyer will be able to distinguish by reputation score alone whether a seller is of strategic or commitment type.

The importance of this notion is that it allows us to simplify analysis of equilibrium. In particular, a key construct in analyzing equilibrium is the expected amount the buyers would be willing to pay for an item of given advertisement; this payment varies with the reputation score of the seller. In a separating equilibrium, this payment is completely determined by the distribution of commitment sellers’ types when one’s reputation score is not perfect. In turn, the payment function completely determines the incentives of strategic sellers to cheat in their advertisements; in particular, we can characterize conditions on strategic sellers’ types to guarantee they will be truthful. Using these dual insights, we can find conditions on the distribution of commitment seller and strategic seller types under which a separating equilibrium exists.

(3) Computational analysis. As noted above, a significant benefit of the large market approximation is that we can actually compute optimal strategies and equilibria. We conclude this chapter by discussing some numerical insights into conditions under which strategic sellers might be truthful.
We find that truthfulness is incentivized in markets where the reputation system can effectively help buyers infer the types of sellers. In particular, if a market has different types of sellers and those who are more likely to sell high value items also have better expertise and are more likely to accurately assess the value of items, then strategic sellers have less incentive to cheat. Moreover, strategic sellers in a market that lacks intrinsically honest sellers who have high reputations are more likely to be truthful.

The decision problem of a strategic seller in an electronic marketplace with a reputation mechanism has been studied both in the context of an equilibrium, where buyers play a best response to the seller’s strategy [14, 17], and in isolation [5, 6, 18]. In contrast to prior work, we take a large market approach with multiple types of sellers who vary in a number of dimensions, such as expertise and whether they are strategic. In the context of electronic marketplaces, reputation mechanisms have also been studied empirically [50, 11, 41, 24, 31] and experimentally [51].

Our work is also related to recent progress in using large market approximations to analyze large dynamic games and dynamic markets. Stationary equilibrium is also sometimes called mean field equilibrium because of its relationship to mean field models in physics, where large systems exhibit macroscopic behavior that is considerably more tractable than their microscopic description. In the context of dynamic games, SE and related approaches have been proposed under a variety of monikers across economics and engineering; see, e.g., studies of anonymous sequential games [30, 8]; dynamic stochastic general equilibrium in macroeconomic modeling [53]; Nash certainty equivalent control [26, 25]; and mean field games [40]. More closely related to our work are recent papers studying information percolation and aggregation in large distributed markets [15, 16], and mean field equilibria in large dynamic auctions [27, 21, 10]. Our paper contributes to this literature by studying a large market model with a reputation mechanism, and using the large market approximation to inform reputation .

The remainder of this chapter is organized as follows. Section 2.2 introduces our model, and the notion of stationary equilibrium, as well as the notion of separating equilibrium. Section 2.3 discusses existence of separating equilibria.
Section 2.4 discusses existence of an approximation of separating equilibria, where we simplify the process of estimating commitment sellers’ steady state distribution of reputations; this yields additional insight beyond our original model. Section 2.5 discusses our computational analysis. Much of the content of this chapter appears in the paper by Wu et al. [55].

2.2 Model

A note on the presentation: Here and throughout this chapter, since our modeling approach relies on standard technical arguments in the literature on mean field and continuum dynamic games, we suppress measure-theoretic terminology and some related technical details for clarity of presentation. (See, e.g., [27, 10, 16] for examples with similar construction.)

2.2.1 Preliminaries

Discrete time. Trade takes place in discrete time periods; since we eventually consider a stationary notion of equilibrium, for simplicity we consider a market where time runs from t = −∞ to t = +∞.

Sellers and buyers. Our model consists of two distinct kinds of agents: sellers have one item for sale in each period, and buyers want one item for purchase in each period. Sellers live for infinite time, while each buyer lives for exactly one period; this is referred to as a model with “long-lived” sellers and “short-lived” buyers. We consider a continuum (or nonatomic) game, where each seller and buyer is infinitesimal relative to the whole market; such a model is reasonable for large markets with many buyers and sellers. Formally, we assume a unit mass of sellers. In every period, each seller has an item for sale whose value is either high (vH) or low (vL), where 0 ≤ vL < vH. The values of items are independent across different periods and across sellers, but all buyers value items in the same way; we make the latter assumption for simplicity. In each period, each seller sells his good in a second price auction among N > 1 buyers; buyers bid to maximize expected payoff, given information about the seller (see below). Note that since buyers are homogeneous and live for only one period, it is a dominant strategy for each bidder to bid his expected value for the item, and thus the revenue to the seller will be the expected value of the item. (For our purposes the auction format is not critical; all that matters is that the latter property hold.)

Advertisement and feedback. At the beginning of each period, each seller (privately) observes the item he has for sale and posts a description that potential buyers can see. A seller can describe an item as low or high value regardless of what the true value is. Potential buyers observe the description that a seller posts as well as his reputation score, computed from feedback on past transactions. Since buyers are short-lived, this is the only information they use to compute the expected value of the item; i.e., the payment to a seller from the corresponding buyer is a function of just the description and the reputation score of the seller. After purchase, the buyer receives the item and observes its true value. The buyer then leaves negative feedback (equal to zero) for the seller if and only if he received a low value item and the seller had described it as high value. Otherwise, the buyer gives positive feedback (equal to one) to the seller.

Reputation score. We assume that a seller’s reputation score is the exponential moving average of the vector of feedbacks that he has received up to now. The mechanism is characterized by a single parameter α, with 0 < α < 1, and defined as follows: given the score $\hat{r}$ of a seller in the previous period and his most recent feedback $f_0$, the new reputation score is $\alpha f_0 + (1-\alpha)\hat{r}$. Thus α measures how strongly recent feedback affects the reputation score.
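The update rule can be sketched in a few lines of Python. This is only an illustration of the exponential moving average described above; the function names and the `initial_score` argument (standing in for the score before the first recorded feedback) are assumptions made for the example, not part of the model.

```python
def update_reputation(prev_score, feedback, alpha):
    """One step of the exponential moving average: alpha * f0 + (1 - alpha) * r_hat."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if feedback not in (0, 1):
        raise ValueError("feedback must be 0 (negative) or 1 (positive)")
    return alpha * feedback + (1.0 - alpha) * prev_score


def reputation_from_history(feedbacks_most_recent_first, alpha, initial_score=1.0):
    """Replay a finite feedback history, oldest first, starting from initial_score.

    initial_score is a hypothetical starting value; in the model sellers have
    an infinite history, so no starting value is needed there.
    """
    score = initial_score
    for feedback in reversed(feedbacks_most_recent_first):
        score = update_reputation(score, feedback, alpha)
    return score
```

A larger α makes a single negative feedback cost more: starting from a perfect score, one negative feedback lowers the score to 1 − α.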

Consider a specific seller and let $f_i$ denote his i-th most recent feedback, where $f_i = 0$ (resp., $f_i = 1$) indicates negative (resp., positive) feedback, for i = 0, 1, 2, .... Then a simple recursion yields that the seller’s reputation score is given by $\alpha \sum_{i=0}^{\infty} (1-\alpha)^i f_i$. Note here that the sum is up to i = ∞ because we assume sellers are long-lived, i.e., that they have been in the system for infinitely many periods already. In this paper, we focus on the exponential moving average primarily because of its simplicity. In addition, exponential smoothing is a good model of how people update their impressions (without a reputation mechanism in place) [4, 44, 33]. We note that exponential smoothing has been previously studied in the context of reputation mechanisms for electronic marketplaces [18, 5].

Commitment sellers and strategic sellers. We assume that the market consists of many types of sellers, where sellers can vary with respect to three attributes. The first attribute is the probability of having a high value item for sale in each period; we refer to this as the seller’s value. The second attribute is the ability to accurately determine whether an item has high or low value; we refer to this as the seller’s expertise. Finally, sellers may vary with respect to whether they are intrinsically honest when describing an item they have for sale.

In our model, sellers that are intrinsically honest always describe their items truthfully to the best of their knowledge (i.e., given their expertise). Such sellers are called commitment sellers. In particular, if a commitment seller thinks he has a low value item for sale, he will describe it as low value. However, due to lack of expertise, a commitment seller may think that a low value item has high value and, as a result, describe it as a high value item. We say that a commitment seller has expertise p and value q, or simply is of type (p, q), if (1) in each period, he has a high value item for sale with probability q, and (2) each time that he has a low value item for sale, he describes it as a high value item with probability 1 − p (because he thinks that the value is actually high). We assume that a high value item is easier to identify and is never misadvertised as a low value item. Hence a commitment seller of type (p, q) makes a mistake with probability (1 − q)(1 − p), which is called the mistake rate of this type. Let $F^C$ denote the distribution of commitment sellers; in particular, $f^C(p, q)$ is the density of sellers that have expertise p and value q.

The second group of sellers are strategic sellers; these are sellers that are only interested in maximizing their infinite horizon discounted payoffs, and thus may not be honest. In particular, a strategic seller may intentionally describe a low value item as high value (even if he knows that the value is actually low) if this will increase his infinite horizon expected discounted payoff. A strategic seller will describe a low value item truthfully only if he is properly incentivized by the reputation mechanism. In order to focus on whether strategic sellers are incentivized to be truthful, we assume that all strategic sellers have expertise p = 1, that is, can always determine the true value of an item.
We say that a strategic seller is of type (δ, q) if (1) in each period, he has a high value item for sale with probability q, and (2) his discount factor is δ. Let $F^S$ denote the distribution of strategic sellers; in particular, $f^S(\delta, q)$ is the density of sellers that have value q and discount factor δ. Finally we define $\underline{\delta} = \sup\{\delta_0 \ge 0 \mid F^S(\delta, q) = 0,\ \forall q,\ \delta < \delta_0\}$, and similarly define $\underline{q}$. Formally, we assume that the fraction of strategic sellers is ρ, 0 < ρ < 1.

2.2.2 Market Dynamics and Stationary Equilibrium

Note that in our market, there are dynamics at both microscopic and macroscopic levels. At a microscopic level, the reputation scores of both commitment sellers and strategic sellers evolve dynamically in response to their respective advertisement behavior. At a macroscopic level, the cross-sectional distribution of types and reputations in the market evolves over time. Our focus in this section is on developing a stationary notion of equilibrium for these dynamics.

We first consider microscopic dynamics. Observe that the reputation score of a commitment seller evolves according to a Markov chain and the corresponding invariant distribution is determined by his type (p, q); denote this $\Phi_\alpha^{(p,q)}$. Similarly, it is straightforward to show that if a strategic seller uses a strategy that is stationary and only depends on his current reputation score (called a stationary Markov strategy), then his reputation evolves as a Markov chain as well. (We show below that such a stationary Markov optimal strategy exists for the seller in a stationary market.) In this case the Markov chain for the reputation score of the strategic seller will converge to an invariant distribution that is determined by the parameters of his type (δ, q); denote this $\Psi_\alpha^{(\delta,q)}$.

Now let us consider the relation to macroscopic dynamics. Let $G_\alpha^C(t)$ denote the joint distribution over type and reputation for commitment sellers at time t; and similarly, let $G_\alpha^S(t)$ denote the joint distribution over type and reputation for strategic sellers at time t. We refer to the ordered pair $(G_\alpha^C(t), G_\alpha^S(t))$ as the population state at time t. We refer to a stationary market as a large market where the population state remains stationary, i.e., constant for all time; let $(G_\alpha^C, G_\alpha^S)$ denote these stationary values.
Now note that since we are looking at a continuum market, it must be the case that the marginal distribution $G_\alpha^C(r \mid p, q)$ of the reputation score of a commitment seller of type (p, q) is exactly the distribution $\Phi_\alpha^{(p,q)}$; and similarly, the marginal distribution $G_\alpha^S(r \mid \delta, q)$ is exactly the distribution $\Psi_\alpha^{(\delta,q)}$.

Combined with the exogenous distributions $F^C$ and $F^S$, we see that in a stationary market, the density of commitment sellers of type (p, q) and reputation r is $g_\alpha^C(r, p, q) = \phi_\alpha^{(p,q)}(r)\, f^C(p, q)$; and the density of strategic sellers of type (δ, q) and reputation r is $g_\alpha^S(r, \delta, q) = \psi_\alpha^{(\delta,q)}(r)\, f^S(\delta, q)$.

To this point we have not discussed how the dynamics of the sellers are affected by the behavior of the buyers. Suppose now that the market is stationary, and that the distributions $G_\alpha^C$ and $G_\alpha^S$ are common knowledge among all sellers and buyers in the market. Given these distributions and the strategy that each type of strategic seller is using, buyers can compute the probability that a seller of a given reputation score is of some type, and in turn infer the probability that a seller is advertising an item truthfully as a function of his reputation score.

Let B(r) denote the probability that an item has high value when a seller with reputation score r claims so in his description. Then, the expected value of an item that is described as high value by a seller of reputation score r is equal to vB(r) ≡ vH · B(r) + vL · (1 − B(r)). On the other hand, an item that is described as low value is indeed a low value item, because strategic sellers have no reason to describe a high value item as low value and we are assuming that commitment sellers do not confuse a high value item for a low value item.

We therefore assume that a seller with reputation score r will receive payment of vB(r) for an item that he describes as high value and payment vL for an item that he describes as low value. Observe that if B(r) > 0, then the seller receives a strictly higher payment when he describes an item as high value; thus, a strategic seller may be tempted to exaggerate the value of the item in his description. On the other hand, if B(r) is increasing and the seller posts a description of a high value item, then the payment is increasing in the reputation score; therefore, a strategic seller may be incentivized to describe an item truthfully in order to have a better reputation, and thus better payments, in the future.
Note that these considerations do not apply to commitment sellers, because they are not strategic and always describe an item according to the value they assess. We can now define stationary equilibrium.

Definition 14. A (type symmetric) stationary equilibrium of this market consists of a pair of distributions $G_\alpha^C$ and $G_\alpha^S$, a set of stationary Markov strategies used by strategic sellers (indexed by type (δ, q)), and a function B(r) for r ∈ [0, 1] such that the following three conditions hold:

(1) the strategy of type (δ, q) strategic sellers maximizes their infinite horizon expected discounted payoff when the payoff to a seller of reputation score r describing a high (resp., low) value item is equal to vB(r) (resp., vL);

(2) B(r) is the probability that an item has high value when a seller with reputation score r claims so in his description, assuming the population state is constant at $(G_\alpha^C, G_\alpha^S)$ and strategic sellers use the given set of strategies;

(3) $(G_\alpha^C, G_\alpha^S)$ is the steady state population state of a market where sellers follow the strategies in (1), and buyers bid according to the beliefs in (2).

Recall that in order to compute B(r) one needs to know the stationary distribution of different types of sellers in the market (via $G_\alpha^C$ and $G_\alpha^S$) and the strategies of strategic sellers. A stationary equilibrium should be viewed as an equilibrium concept for a formal “mean field” dynamic market with a continuum of buyers and sellers, where the joint distribution of types and reputations remains constant over time despite churn at the individual level. (In the interests of brevity we omit the formal probabilistic justification that such a model is well-posed; see, e.g., [27, 10, 16] for further details.)

2.2.3 Separating Equilibrium

In this paper, we are interested in equilibria where the strategic sellers are truthful at all reputation scores. As a result, the reputation scores of strategic sellers converge to 1. To simplify the analysis, we assume that for all commitment sellers the expertise p and value q are both strictly smaller than 1, that is, for every commitment seller there is a positive probability that he confuses a low value item for a high value item and in each period he has a low value item for sale with a positive probability. Formally, this means the distribution $F^C$ places zero mass on commitment sellers with expertise p = 1 or value q = 1.

Under this assumption, if in a stationary equilibrium the reputation scores of strategic sellers are equal to 1, commitment sellers and strategic sellers are separated by their reputation scores: the scores of commitment sellers are strictly smaller than 1, whereas the reputation score of each strategic seller is exactly equal to 1. We call this a separating equilibrium.

Definition 15. A separating equilibrium is a stationary equilibrium where

(1) for every strategic seller, it is optimal to describe his items truthfully at all reputation scores; and

(2) a seller’s reputation score is equal to 1 if and only if he is a strategic seller.

The separation of reputation scores at equilibrium significantly reduces the complexity of analyzing and computing an equilibrium, because the function B(·) is almost independent of strategic sellers. More specifically, B(r) does not depend on strategic sellers when r < 1, because only commitment sellers have reputation scores that are strictly smaller than 1. On the other hand, B(1) = 1 because only strategic sellers have reputation scores equal to 1 and they are always truthful in a separating equilibrium. Therefore, we can check whether there exists a separating equilibrium for a given seller distribution in two steps. First, we use the distribution of commitment types in the market to compute B(r) for r < 1. The second step is to check whether it is optimal for all strategic sellers to be always truthful when the payment for describing a high value item at reputation score r is vB(r).

We conclude this section by briefly discussing the relation to [6, 5]. The setting we have described here is similar to that of [6, 5]; however, there are two key differences. First, in this paper we consider a market with multiple types of sellers who vary in a number of dimensions, such as expertise and whether they are strategic. The second difference is the assumptions we make with respect to the payment to the seller as a function of his reputation score and description.¹ This payment naturally arises as the belief of Bayesian buyers, which allows us to take an equilibrium approach where buyers are optimizing and the payment function arises endogenously. This is in contrast to [6, 5], where the payment function is assumed to be fixed and exogenous and the focus is on designing a mechanism that optimally incentivizes strategic sellers.

¹ The payment described here coincides with the one in [6] only when vL = 0.

2.3 Existence of Separating Equilibria

In this section, we find conditions under which a separating equilibrium exists. Our approach is as follows. First, we note that the payment function B(r) is derived by buyers using Bayes’ rule. Second, we note that in a separating equilibrium, only the commitment sellers contribute to the definition of B(r) for r < 1; in particular, B(r) is independent of the strategy of strategic sellers when r < 1.

We start by considering how B(r) would look in a separating equilibrium. Consider a commitment seller of type (p, q). In every period, he has an item with high value with probability q and an item with low value with probability 1 − q. The seller always describes a high value item as high value. On the other hand, he describes a low value item as a high value item with probability 1 − p, where p represents his expertise. Thus in every period, the reputation of this commitment seller increases from r to t(r) ≡ α + (1 − α)r with probability q + (1 − q)p and decreases to c(r) ≡ (1 − α)r with probability (1 − q)(1 − p). Recall that we denote the density of the corresponding invariant distribution of this Markov chain by $\phi_\alpha^{(p,q)}$. We now use Bayes’ rule and conclude that if the equilibrium were separating, then for r < 1, there holds:

$$B(r) = \frac{\int_0^1 \int_0^1 f^C(p, q)\, \phi_\alpha^{(p,q)}(r) \cdot q \; dp\, dq}{\int_0^1 \int_0^1 f^C(p, q)\, \phi_\alpha^{(p,q)}(r) \cdot \bigl(q + (1-q)(1-p)\bigr) \; dp\, dq}. \tag{2.1}$$
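As a rough numerical illustration of (2.1), the sketch below replaces the density of commitment types by a finite list of weighted types and approximates each invariant distribution by enumerating the most recent feedbacks. Everything about the discretization is an assumption made for the example: the truncation `depth`, the treatment of older feedback as positive, the number of `bins`, and the function names are hypothetical and not taken from the text.

```python
from itertools import product


def invariant_mass_by_bin(p, q, alpha, depth=12, bins=20):
    """Approximate the invariant distribution of a type (p, q) commitment seller.

    The score alpha * sum_i (1 - alpha)^i * f_i is truncated after `depth`
    feedbacks (older feedback is assumed positive) and binned on [0, 1].
    """
    mistake = (1.0 - q) * (1.0 - p)      # probability of negative feedback
    tail = (1.0 - alpha) ** depth        # total weight of the truncated history
    mass = [0.0] * bins
    for feedbacks in product((0, 1), repeat=depth):
        score = tail + alpha * sum(
            (1.0 - alpha) ** i * f for i, f in enumerate(feedbacks)
        )
        prob = 1.0
        for f in feedbacks:
            prob *= (1.0 - mistake) if f == 1 else mistake
        mass[min(int(score * bins), bins - 1)] += prob
    return mass


def high_value_belief(types, alpha, depth=12, bins=20):
    """Binned analogue of (2.1); `types` is a list of (weight, p, q) triples.

    Returns one value per bin, or None for bins that no type reaches.
    """
    num = [0.0] * bins
    den = [0.0] * bins
    for weight, p, q in types:
        mass = invariant_mass_by_bin(p, q, alpha, depth, bins)
        claims_high = q + (1.0 - q) * (1.0 - p)  # probability of a "high" description
        for b in range(bins):
            num[b] += weight * mass[b] * q
            den[b] += weight * mass[b] * claims_high
    return [n / d if d > 0 else None for n, d in zip(num, den)]
```

With a single commitment type the reputation score carries no information, so every reachable bin returns q / (q + (1 − q)(1 − p)); B(r) varies with r only when several types are present.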

We emphasize that in a separating equilibrium, B(r) for r < 1 is determined by just the commitment sellers. Now we consider a single strategic seller of type (δ, q) and characterize his optimal strategy, assuming the payment function is given by (2.1). We actually solve a slightly more general problem: we focus on conditions on B(r) under which it is optimal for this seller to always describe the item he has for sale truthfully regardless of his reputation score. Existence is then derived as a corollary.

For the remainder of this section, we assume that B(r) is non-decreasing. Suppose that the strategic seller’s reputation score is r. Since the reputation score is given by the exponential moving average, if the seller receives negative feedback in this period his reputation score will decrease to c(r). On the other hand, if he receives positive feedback in this period, his reputation score will increase to t(r). We define $t^{(i)}$ recursively, so that $t^{(0)}(r) = r$ and $t^{(i+1)}(r) = t(t^{(i)}(r))$. Let V(r) be the maximum infinite horizon discounted payoff of the seller when his current reputation score is r. The optimal strategy of this seller is given by the following Bellman equation.

$$V(r) = q \cdot \bigl[v_{B(r)} + \delta \cdot V(t(r))\bigr] + (1-q) \cdot \max\bigl\{v_{B(r)} + \delta \cdot V(c(r)),\; v_L + \delta \cdot V(t(r))\bigr\} \tag{2.2}$$

In particular, with probability q the seller has a high value item and it is optimal for him to describe it truthfully. As a result, he receives a payment of vB(r) and his reputation score increases to t(r). With probability 1 − q the seller has a low value item for sale. If he advertises the low value item truthfully, he receives payment vL and his reputation increases to t(r). If he describes the low value item as a high value item, he receives a higher payment in this period (vB(r)), but his reputation score drops to c(r). We say that it is optimal for a seller to be truthful at reputation r if it is optimal for him to describe a low value item truthfully when his reputation score is r. The following lemma characterizes under what conditions it is optimal for the seller to be truthful at all reputation scores.
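The Bellman equation (2.2) can also be solved numerically. The following is a minimal sketch using value iteration on a uniform grid of reputation scores with linear interpolation between grid points; the grid, the interpolation, the stopping rule, and all names are assumptions made for illustration and do not come from the text.

```python
def solve_bellman(B, alpha, delta, q, v_high, v_low,
                  grid_size=201, tol=1e-10, max_iter=100_000):
    """Value iteration for (2.2) on a uniform grid over [0, 1].

    Returns (grid, V, truthful), where truthful[k] is True when describing a
    low value item truthfully is weakly optimal at grid[k].
    """
    grid = [k / (grid_size - 1) for k in range(grid_size)]
    # v_B(r) = v_H * B(r) + v_L * (1 - B(r)): payment for a "high" description
    pay_high = [v_high * B(r) + v_low * (1.0 - B(r)) for r in grid]

    def interp(values, r):
        pos = min(max(r, 0.0), 1.0) * (grid_size - 1)
        lo = min(int(pos), grid_size - 2)
        w = pos - lo
        return (1.0 - w) * values[lo] + w * values[lo + 1]

    V = [0.0] * grid_size
    truthful = [True] * grid_size
    for _ in range(max_iter):
        new_V, truthful = [], []
        for r, pay in zip(grid, pay_high):
            up = interp(V, alpha + (1.0 - alpha) * r)   # V(t(r))
            down = interp(V, (1.0 - alpha) * r)         # V(c(r))
            cheat = pay + delta * down                  # misreport a low value item
            honest = v_low + delta * up                 # report it truthfully
            truthful.append(honest >= cheat)
            new_V.append(q * (pay + delta * up) + (1.0 - q) * max(cheat, honest))
        diff = max(abs(a - b) for a, b in zip(new_V, V))
        V = new_V
        if diff < tol:
            break
    return grid, V, truthful
```

For δ < 1 the update is a contraction, so the iteration converges; with δ = 0 the seller is myopic, V(r) reduces to vB(r), and misreporting is optimal wherever B(r) > 0.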

Lemma 23. It is optimal for a type (δ, q) strategic seller to be truthful at all r ∈ [0, 1] if and only if

$$B(r) \le q \cdot \sum_{i=0}^{\infty} \delta^{i+1} \bigl[B(t^{(i+1)}(r)) - B(t^{(i)}(c(r)))\bigr] \tag{2.3}$$

for all r ∈ [0, 1].

Proof. By (2.2), it is optimal for the seller to be truthful at reputation r if and only if

δ · [V (t(r)) − V (c(r))] ≥ vB(r) − vL = (vH − vL)B(r).

Let $\hat{V}(r)$ be the expected infinite horizon discounted payoff of the seller if his current reputation score is r and he is truthful in all future periods. By the one step deviation principle, the seller will not deviate from always being truthful if and only if

$$\delta \cdot \bigl[\hat{V}(t(r)) - \hat{V}(c(r))\bigr] \ge (v_H - v_L) B(r) \tag{2.4}$$

for r ∈ [0, 1]. We observe that

$$\hat{V}(r) = \sum_{i=0}^{\infty} \delta^i \bigl(q \cdot v_{B(t^{(i)}(r))} + (1-q) \cdot v_L\bigr),$$

$$\hat{V}(t(r)) - \hat{V}(c(r)) = q \cdot (v_H - v_L) \sum_{i=0}^{\infty} \delta^i \bigl[B(t^{(i+1)}(r)) - B(t^{(i)}(c(r)))\bigr].$$

To conclude the proof, we substitute the latter in (2.4).
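Condition (2.3) is straightforward to evaluate numerically for a given B. The sketch below truncates the infinite sum and checks the inequality on a finite grid of reputation scores, so it is only an approximate check; the truncation length, the grid, and the function names are assumptions made for the example.

```python
def truthful_margin(B, r, alpha, delta, q, terms=2000):
    """Right-hand side of (2.3) minus B(r), with the sum truncated after `terms`."""
    up = alpha + (1.0 - alpha) * r     # t^{(1)}(r)
    down = (1.0 - alpha) * r           # t^{(0)}(c(r)) = c(r)
    total, discount = 0.0, delta       # discount holds delta^{i+1}
    for _ in range(terms):
        total += discount * (B(up) - B(down))
        up = alpha + (1.0 - alpha) * up
        down = alpha + (1.0 - alpha) * down
        discount *= delta
    return q * total - B(r)


def always_truthful(B, alpha, delta, q, grid_size=101):
    """Check (2.3) on a uniform grid of reputation scores in [0, 1]."""
    return all(
        truthful_margin(B, k / (grid_size - 1), alpha, delta, q) >= 0.0
        for k in range(grid_size)
    )
```

For instance, with α = 0.1 and δ = q = 0.99 the check passes for B(r) = r² and fails for the linear function B(r) = r, where the inequality is violated at r = 1.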

It is obvious from (2.3) that a strategic seller with higher value q and larger discount factor δ is more likely to be truthful. Since δ ≤ 1 and q ≤ 1, a necessary condition for the existence of separating equilibrium is

$$B(r) < \sum_{i=0}^{\infty} \bigl[B(t^{(i+1)}(r)) - B(t^{(i)}(c(r)))\bigr]. \tag{2.5}$$

On the other hand, when (2.5) holds, a separating equilibrium exists if both δ and q are sufficiently close to 1. Recall that when the seller has reputation r and describes an item as high value, he receives a payment of vH · B(r) + vL · (1 − B(r)). The following proposition shows under what properties of B it is possible to have a strategic seller who is always truthful.

Proposition 24. Suppose that B(r) is non-decreasing with B(0) = 0.

(i) If B(r) is strictly convex, there exist δ∗ < 1 and q∗ < 1 such that it is optimal for a type (δ, q) strategic seller to be always truthful if δ ≥ δ∗ and q ≥ q∗.

(ii) If B(r) is concave, then it is not optimal for the seller to be always truthful.

(iii) If B(r) is logarithmically concave and

$$B(1) \le q \cdot \sum_{i=0}^{\infty} \delta^{i+1} \bigl[B(1) - B(t^{(i)}(1-\alpha))\bigr],$$

then it is optimal to be truthful for all r ∈ [0, 1].

Proof. We first observe that

$$B(t^{(i+1)}(r)) - B(t^{(i)}(c(r))) = B(t^{(i+1)}(r)) - B\bigl(t^{(i+1)}(r) - \alpha(1-\alpha)^i\bigr).$$
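The identity rests on the fact that the two reputation paths differ by exactly α(1 − α)^i after i further positive feedbacks. A short derivation, using only the definitions t(r) = α + (1 − α)r and c(r) = (1 − α)r (the closed form for $t^{(i)}$ follows by induction from 1 − t(r) = (1 − α)(1 − r)):

```latex
\begin{align*}
t^{(i)}(r) &= 1 - (1-\alpha)^{i}(1-r), \\
t^{(i+1)}(r) - t^{(i)}(c(r))
  &= (1-\alpha)^{i}\bigl(1 - (1-\alpha)r\bigr) - (1-\alpha)^{i+1}(1-r) \\
  &= (1-\alpha)^{i}\bigl[1 - (1-\alpha)r - (1-\alpha) + (1-\alpha)r\bigr] \\
  &= \alpha(1-\alpha)^{i}.
\end{align*}
```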

(i) Since B is strictly convex, non-decreasing and B(0) = 0,

$$B(t^{(i+1)}(r)) - B\bigl(t^{(i+1)}(r) - \alpha(1-\alpha)^i\bigr) > \alpha(1-\alpha)^i \, \frac{B(t^{(i+1)}(r))}{t^{(i+1)}(r)} \ge \alpha(1-\alpha)^i B(r).$$

We thus conclude that there exist δ∗ < 1 and q∗ < 1 such that (2.3) holds for all r ∈ [0, 1]. Because the RHS of (2.3) is increasing in both δ and q, we conclude that if δ ≥ δ∗ and q ≥ q∗, then it is optimal for the seller to be always truthful. (ii) We show that if B is concave, then (2.3) does not hold at r = 1. In particular,

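$$B(t^{(i+1)}(1)) - B\bigl(t^{(i+1)}(1) - \alpha(1-\alpha)^i\bigr) = B(1) - B\bigl(1 - \alpha(1-\alpha)^i\bigr) \le \alpha(1-\alpha)^i B(1),$$

thus for any δ < 1 and q < 1,

$$q \cdot \sum_{i=0}^{\infty} \delta^{i+1} \bigl[B(t^{(i+1)}(1)) - B(t^{(i)}(c(1)))\bigr] < \sum_{i=0}^{\infty} \alpha(1-\alpha)^i B(1) = B(1).$$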

(iii) It suffices to show that if B is logarithmically concave and (2.3) holds for r = 1, then (2.3) also holds for any r ∈ [0, 1]. Thus, it suffices that

$$\frac{B(t^{(i+1)}(r)) - B\bigl(t^{(i+1)}(r) - \alpha(1-\alpha)^i\bigr)}{B(r)} \ge \frac{B(1) - B\bigl(1 - \alpha(1-\alpha)^i\bigr)}{B(1)}$$

for all r ∈ [0, 1]. We now fix some r ∈ [0, 1]. Let $x \equiv t^{(i+1)}(r)$ and $k \equiv \alpha(1-\alpha)^i$. We have that x ≥ r and B(x) ≥ B(r), because B is non-decreasing. To conclude the proof, we show that

B(1)(B(x) − B(x − k)) ≥ B(x)(B(1) − B(1 − k)).

This inequality trivially holds if B(x − k) = 0. On the other hand, if B(x − k) > 0, it suffices to show that B(x)/B(x − k) is nonincreasing in x. Since B is logarithmically concave, log(B(x)) − log(B(x − k)) is nonincreasing in x, which implies that B(x)/B(x − k) is nonincreasing in x.

Proposition 24 (i) says that if B is strictly convex, then it is optimal for a strategic seller to be always truthful if (1) he cares enough about future payments (i.e., his discount factor is sufficiently large), and (2) he is sufficiently likely to have high value items for sale in each period. Note that the aforementioned conditions are in fact necessary to incentivize truthfulness; for instance, if the seller does not value future payments, he will not try to maintain a good reputation.

On the other hand, Proposition 24 (ii) says that if B is concave, then it is not optimal for the seller to be truthful for any (δ, q). In particular, the seller will always deviate from being truthful when his reputation score is equal to 1. We can get some intuition for this result by considering condition (2.4) when r = 1. At maximum score, the gain for being truthful now (given that the seller is truthful in the future) is proportional to $B(1) - B(t^{(i)}(1-\alpha))$, while the gain from deviating is proportional to B(1). When B is concave, then $B(1) - B(t^{(i)}(1-\alpha))$ is much smaller than B(1), while for a strictly convex function, $B(1) - B(t^{(i)}(1-\alpha))$ is significant relative to B(1).

Proposition 24 (iii) says that if B is logarithmically concave (that is, if its logarithm is a concave function), then it is optimal for a seller to be always truthful if and only if it is optimal to be truthful at r = 1. Thus, if B is log-concave, then in order to verify whether it is optimal for a seller to be always truthful it suffices to verify a single inequality. See [6] for a discussion on the connection between log-concavity and payment functions that arise in electronic marketplaces.

Returning to separating equilibria, our main insight in this section is then derived as a corollary to Proposition 24. In particular, observe that if all strategic sellers are guaranteed to be truthful given the function B(r) induced by just the commitment sellers, then we have found a separating equilibrium. Thus we have the following existence result.

Corollary 25. Given a market, suppose that B(r) as defined in (2.1) is non-decreasing with B(0) = 0.

(i) If B(r) is strictly convex, there exist δ∗ < 1 and q∗ < 1 such that if δ ≥ δ∗ and q ≥ q∗, then there exists a separating equilibrium.

(ii) If B(r) is concave, then there never exists a separating equilibrium.

(iii) If B(r) is logarithmically concave and $F^S$ only assigns nonzero mass to (δ, q) such that

$$B(1) \le q \cdot \sum_{i=0}^{\infty} \delta^{i+1} \bigl[B(1) - B(t^{(i)}(1-\alpha))\bigr],$$

then there exists a separating equilibrium.
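For a logarithmically concave B, part (iii) reduces the check to a single inequality at r = 1. A minimal sketch of that check is given below, using the fact that $t^{(i)}(1-\alpha) = 1 - \alpha(1-\alpha)^i$; the function name and the truncation of the sum are assumptions, and the function does not itself verify that B is log-concave.

```python
def truthful_at_top(B, alpha, delta, q, terms=2000):
    """Check B(1) <= q * sum_i delta^(i+1) * [B(1) - B(t^{(i)}(1 - alpha))].

    The infinite sum is truncated after `terms` terms. Log-concavity of B is
    assumed by the caller and is not verified here.
    """
    total = 0.0
    discount = delta   # delta^{i+1}
    gap = alpha        # 1 - t^{(i)}(1 - alpha) = alpha * (1 - alpha)^i
    for _ in range(terms):
        total += discount * (B(1.0) - B(1.0 - gap))
        discount *= delta
        gap *= 1.0 - alpha
    return B(1.0) <= q * total
```

With α = 0.1, the function B(r) = r² (which is log-concave) passes the check for δ = q = 0.99 but fails it for δ = 0.5, illustrating how the condition restricts the support of the strategic types.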

2.4 Approximating Separating Equilibria

In this section we consider an approximation of separating equilibria, where we simplify the process of estimating commitment sellers’ steady state distribution of reputations. In particular, we assume that buyers compute B(r) using an approximation of the invariant distribution $\Phi_\alpha^{(p,q)}$ of each commitment seller. This approximation significantly simplifies the computation of B(r) and yields additional insight beyond our original model.

Recall that a commitment seller of type (p, q) has a low value item with probability 1 − q which he falsely advertises with probability 1 − p; he thus receives negative feedback with probability (1 − q)(1 − p). Therefore, in the long run the reputation of a type (p, q) commitment seller will oscillate around 1 − (1 − q)(1 − p). If the parameter α is small, the oscillation will not be significant and 1 − (1 − q)(1 − p) provides a good approximation of the seller’s reputation score. This motivates us to consider an approximate equilibrium where buyers believe that a seller of type (p, q) has reputation score 1 − (1 − q)(1 − p). Equivalently, we assume that if a commitment seller has reputation score r and expertise p, then buyers believe that his value is q = (r − p)/(1 − p).
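The quality of this approximation can be explored with a small simulation. The sketch below draws feedback for a single commitment seller and records his reputation path; the seed, the path length, the burn-in period, and the function name are assumptions made for illustration only.

```python
import random
import statistics


def simulate_commitment_scores(p, q, alpha, steps=50_000, burn_in=1_000, seed=0):
    """Simulate the reputation path of a type (p, q) commitment seller."""
    rng = random.Random(seed)
    mistake = (1.0 - q) * (1.0 - p)   # probability of negative feedback per period
    score, path = 1.0, []
    for step in range(steps + burn_in):
        feedback = 0 if rng.random() < mistake else 1
        score = alpha * feedback + (1.0 - alpha) * score
        if step >= burn_in:           # discard the transient from the starting score
            path.append(score)
    return path


if __name__ == "__main__":
    for alpha in (0.05, 0.5):
        path = simulate_commitment_scores(p=0.6, q=0.5, alpha=alpha)
        print(alpha, statistics.fmean(path), statistics.pstdev(path))
```

For p = 0.6 and q = 0.5 the mistake rate is 0.2, so the average score should be close to 0.8 for any α, while the spread around that value shrinks as α decreases.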

Definition 16. An approximate separating equilibrium consists of a pair of distributions $G_\alpha^C$ and $G_\alpha^S$, a set of stationary Markov strategies used by strategic sellers, and a function B(r) for r ∈ [0, 1] such that the following three conditions hold:

(1) when the payoff to a seller of reputation score r is equal to vB(r) (resp., vL) for describing a high (resp., low) value item, it is optimal for all strategic sellers to always describe their items truthfully;

(2) $(G_\alpha^C, G_\alpha^S)$ is the steady state population state of the market and

(i) for all commitment types (p, q), $g_\alpha^C(1, p, q) = 0$;

(ii) for all strategic types (δ, q), $g_\alpha^S(r, \delta, q) > 0$ only if r = 1;

(3) B(r) is the probability that an item has high value when a seller with reputation score r claims so in his description assuming that a commitment seller of type (p, q) has reputation score 1 − (1 − q)(1 − p), that is,

$$B(r) = \begin{cases} \dfrac{\int_0^1 f^C\bigl(p, \frac{r-p}{1-p}\bigr)\, \frac{r-p}{1-p} \; dp}{\int_0^1 f^C\bigl(p, \frac{r-p}{1-p}\bigr) \bigl(\frac{r-p}{1-p} + 1 - r\bigr) \; dp} & \text{if } r < 1, \\[2ex] 1 & \text{if } r = 1. \end{cases}$$

We now focus on the case that all commitment sellers have the same expertise p0 and derive conditions under which an approximate separating equilibrium exists. Since all commitment sellers have expertise p0, we have that $f^C(p, q) = 0$ when p ≠ p0. We further assume that $f^C(p_0, q) > 0$ for all q ∈ [0, 1).² Given that we are approximating the reputation of a seller of type (p, q) with 1 − (1 − p)(1 − q), we have that for r ∈ [p0, 1),

$$B(r) = \frac{f^C\bigl(p_0, \frac{r-p_0}{1-p_0}\bigr)\, \frac{r-p_0}{1-p_0}}{f^C\bigl(p_0, \frac{r-p_0}{1-p_0}\bigr) \bigl(\frac{r-p_0}{1-p_0} + 1 - r\bigr)} = \frac{r - p_0}{1 - 2p_0 + r p_0}. \tag{2.6}$$

² The result we will discuss also holds under the more general assumption that there exists some q0 such that $f^C(p_0, q) > 0$ if and only if q ∈ [q0, 1).

Since we are assuming that $f^C(p_0, \cdot)$ is positive everywhere on [0, 1), B(r) is well defined for r ≥ p0. We further assume that if a seller’s reputation score is less than p0, buyers believe that the seller is of type (p0, q = 0) (or equivalently, the inferred value is max{(r − p0)/(1 − p0), 0}). Therefore B(r) = 0 for all r < p0. Finally, B(1) = 1, because we are interested in approximate separating equilibria. Note that B is continuous and non-decreasing throughout [0, 1]. Moreover, B is concave on [p0, 1]; however, Proposition 24 does not apply because B is not concave throughout its domain. In fact, in this case it is optimal for a strategic seller to be always truthful as long as his discount factor and value are large enough. As a result, an approximate separating equilibrium may exist when all commitment sellers have the same expertise. This is the content of the following proposition.
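The shape properties of B stated above can be checked numerically. The following is a minimal sketch, assuming the piecewise form described in the text (zero below p0, the closed form (2.6) on [p0, 1), and one at r = 1); the names `B_approx` and `second_difference` and the choice p0 = 0.5 are illustrative assumptions.

```python
def B_approx(r, p0):
    """Closed-form B(r) from (2.6), extended to all of [0, 1] as in the text."""
    if r >= 1.0:
        return 1.0
    if r < p0:
        return 0.0  # buyers infer value q = 0 below p0
    return (r - p0) / (1 - 2 * p0 + r * p0)


def second_difference(f, r, h):
    """Discrete curvature: non-positive where f is concave around r."""
    return f(r - h) - 2 * f(r) + f(r + h)


if __name__ == "__main__":
    p0, h = 0.5, 1e-3  # illustrative values, not taken from the text

    def B(r):
        return B_approx(r, p0)

    grid = [k * h for k in range(1001)]
    monotone = all(B(a) <= B(b) + 1e-12 for a, b in zip(grid, grid[1:]))
    concave_right = all(second_difference(B, r, h) <= 1e-12
                        for r in grid if p0 + h <= r <= 1 - h)
    convex_kink = second_difference(B, p0, h) > 0
    print("non-decreasing on [0, 1]:", monotone)
    print("concave on [p0, 1]:", concave_right)
    print("convex kink at p0:", convex_kink)
```

The positive second difference at p0 is the numerical counterpart of the remark that B is not concave throughout its domain: the slope jumps from zero to a positive value at r = p0.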

Proposition 26. Suppose that $f^C(p, q) > 0$ if and only if p = p0 and q ∈ [0, 1). Then there exist δ* < 1 and q* < 1 such that if q ≥ q* and δ ≥ δ*, an approximate separating equilibrium exists.

Proof. An approximate separating equilibrium exists if it is optimal for all strategic sellers to be always truthful; hence it suffices to check (2.5) for any reputation r.

For any value r, if c(r) < p0, then B(c(r)) = 0, and

$$
\sum_{i=0}^{\infty} \left[ B\!\left(t^{(i+1)}(r)\right) - B\!\left(t^{(i)}(c(r))\right) \right]
> B(t(r)) - B(c(r)) = B(t(r)) \ge B(r).
$$

If c(r) ≥ p0, then all $t^{(i+1)}(r)$ and $t^{(i)}(c(r))$ are greater than or equal to p0, and

$$
B(r_1) - B(r_2) = \frac{(r_1 - r_2)(1 - p_0)^2}{(1 - 2p_0 + r_1 p_0)(1 - 2p_0 + r_2 p_0)}.
$$

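Note that since $t^{(i+1)}(r) - t^{(i)}(c(r)) = \alpha(1 - \alpha)^i$, it suffices to show that

$$
\frac{r - p_0}{1 - 2p_0 + r p_0}
\le \sum_{i=0}^{\infty}
\frac{\alpha(1 - \alpha)^i (1 - p_0)^2}
     {\left(1 - 2p_0 + t^{(i+1)}(r) \cdot p_0\right)\left(1 - 2p_0 + t^{(i)}(c(r)) \cdot p_0\right)}.
$$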

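It is easy to show that the LHS is increasing in r and the RHS is decreasing in r. Therefore we only need to show that the previous inequality holds for r = 1, that is,

$$
1 \le \sum_{i=0}^{\infty} \frac{\alpha(1 - \alpha)^i (1 - p_0)}{1 - 2p_0 + t^{(i)}(1 - \alpha) \cdot p_0}.
$$

To conclude the proof, we observe that $t^{(i)}(1 - \alpha) \le 1$, so that each denominator is at most 1 − p0 and each summand is at least $\alpha(1 - \alpha)^i$, and that $\sum_{i=0}^{\infty} \alpha(1 - \alpha)^i = 1$.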

Proposition 26 shows that when all commitment sellers have the same expertise, then an approximate separating equilibrium exists under very general conditions. Recall that a large discount factor and a large probability of having a high value item for sale are in fact necessary conditions to incentivize truthfulness, regardless of what the specific payment function is.
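The inequality at the heart of the proof of Proposition 26 can also be checked numerically. The sketch below is illustrative only. It assumes the update maps t(r) = (1 − α)r + α and c(r) = (1 − α)r, which are consistent with the identity $t^{(i+1)}(r) - t^{(i)}(c(r)) = \alpha(1-\alpha)^i$ used in the proof but are not restated in this section, and it truncates the infinite sum. The function names and parameter values are hypothetical.

```python
def B_approx(r, p0):
    """Closed-form B(r) from (2.6), extended to all of [0, 1] as in the text."""
    if r >= 1.0:
        return 1.0
    if r < p0:
        return 0.0
    return (r - p0) / (1 - 2 * p0 + r * p0)


def t_iter(r, alpha, i):
    """i-fold positive-feedback update, ASSUMING t(r) = (1 - alpha) * r + alpha."""
    return 1 - (1 - alpha) ** i * (1 - r)


def truthfulness_gap(r, p0, alpha, terms=400):
    """Truncated sum of B(t^(i+1)(r)) - B(t^(i)(c(r))) minus B(r)."""
    c = (1 - alpha) * r  # ASSUMED negative-feedback update c(r)
    rhs = sum(
        B_approx(t_iter(r, alpha, i + 1), p0) - B_approx(t_iter(c, alpha, i), p0)
        for i in range(terms)
    )
    return rhs - B_approx(r, p0)


if __name__ == "__main__":
    alpha = 0.1
    for p0 in (0.3, 0.5, 0.7):  # illustrative expertise levels
        worst = min(truthfulness_gap(k / 200, p0, alpha) for k in range(201))
        print(f"p0={p0}: smallest gap over the grid = {worst:.4f}")
```

A non-negative gap at every grid point is what the proof predicts for the approximate B; the check says nothing about the thresholds δ* and q*, which depend on parts of the model outside this section.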

2.5 Computational Analysis

In our model, buyers use (2.1) to infer the probability that an item described as high value actually has high value. We can thus find out whether a separating equilibrium exists by numerically checking whether a certain distribution of commitment sellers gives rise to a payment function that incentivizes strategic sellers to be always truthful. Throughout this section, we set α = 0.1 and, given a distribution of commitment sellers, we check whether the corresponding function B(·) satisfies (2.5) for all r ∈ [0, 1]. If (2.5) holds, then we can conclude that the market has a separating equilibrium as long as both δ and q are close enough to 1. We first give an example where a separating equilibrium exists, and then we discuss several features of the distribution of commitment types that make the existence of a separating equilibrium more likely.

Example 1. There are four types of commitment sellers in the market with expertise and value attributes (p, q) drawn from the set {(0.3, 0.3), (0.4, 0.4), (0.5, 0.5), (0.6, 0.6)} and $f^C(0.3, 0.3) = f^C(0.4, 0.4) = f^C(0.5, 0.5) = 10 \cdot f^C(0.6, 0.6)$. Then (2.5) holds for all r ∈ [0, 1] and a separating equilibrium exists if δ and q are sufficiently large (e.g., when δ → 1 and q = 0.93).

However, a separating equilibrium may not exist in general. As we have seen in Section 2.3, the shape of B(r) plays an important role in whether a separating equilibrium exists. For r < 1, B(r) is defined in (2.1) as a function of the prior density $f^C(p, q)$ and the densities $\phi_\alpha^{(p,q)}$ of the invariant distributions. For a given commitment type (p, q), $\phi_\alpha^{(p,q)}$ only depends on the mistake rate (1 − p)(1 − q) and the parameter α of the reputation mechanism. On the other hand, the probability that an item described by a type (p, q) seller as high value actually has low value is equal to 1 − q/(1 − p + pq); we will refer to this as the conditional mistake rate of type (p, q). We now consider three different properties of the distribution of commitment types that might help lead a marketplace towards more truthful strategic sellers, and therefore make the existence of a separating equilibrium more likely.

(1) Diversity of mistake rates (1 − p)(1 − q). A necessary condition for strategic sellers to be truthful is that there is sufficient heterogeneity in the values of mistake rates (1 − p)(1 − q), so that the resulting function B(r) is increasing at a sufficient rate throughout its domain. As an extreme counterexample, consider a market with I types of commitment sellers where (1 − pi)(1 − qi) takes the same value for all i ∈ I (i.e., there is no heterogeneity in mistake rates). Then, the density functions $\phi_\alpha^{(p_i,q_i)}(r)$ are identical for all i ∈ I, and B(r) is constant for r < 1. In other words, the payment that a seller receives does not depend on his reputation when his score is in [0, 1). On the other hand, if B(r) > 0 for r < 1, the seller receives a strictly higher payment for describing a high value item. Then, it is always optimal for a strategic seller with score r < 1 to exaggerate the value of a low value item in his description. Thus, strategic sellers are not truthful for r < 1 and a separating equilibrium does not exist. This counterexample suggests that if the mistake rates are similar or identical across all commitment sellers, then the effect of reputation as a type signal is weakened and the incentive for maintaining a high reputation to signal high type is reduced.

(2) Positive correlation of value and expertise. The reputation of a seller reveals how likely he is to receive negative feedback. For a commitment seller with type (p, q), the (unconditional) probability of getting negative feedback in a given period is equal to his mistake rate (1 − p)(1 − q). Observe that a commitment seller with high (resp., low) expertise p may have low (resp., high) reputation if his value type q is low (resp., high). However, when buyers observe an item advertised as high value sold by a type (p, q) seller, the probability that this item is misadvertised is equal to the conditional mistake rate 1 − q/(1 − p + pq), not the overall mistake rate. In a market where high expertise sellers have low value types and low expertise sellers have high value types, it is possible that a seller with a higher overall mistake rate has a lower conditional mistake rate. In that case, the function B(·) is decreasing in (0, 1) and the reputation system does not properly incentivize strategic sellers.

Consider a market with two types of commitment sellers, (p1, q1) = (0.6, 0.2) and (p2, q2) = (0.4, 0.4). Then type 1 sellers have a lower overall mistake rate but a higher conditional mistake rate. We assume each of these types arises with the same probability in the population of sellers. In this market, a seller with higher reputation is more likely to be of type 1, i.e., $\phi_\alpha^{(p_1,q_1)}(r)/\phi_\alpha^{(p_2,q_2)}(r)$ is increasing in r. However, buyers are willing to pay less to interact with a seller of type 1, because the corresponding conditional mistake rate is larger than that of type 2. As a result, the function B(·) is decreasing in (0, 1), strategic sellers are not truthful when r < 1, and a separating equilibrium does not exist. Figure 2.1 illustrates this idea by showing the density functions $\phi_\alpha^{(p_i,q_i)}(r)$ in the left panel and $\log\left[\phi_\alpha^{(p_1,q_1)}(r)/\phi_\alpha^{(p_2,q_2)}(r)\right]$ in the right panel. As r increases, the weight $\phi_\alpha^{(p_1,q_1)}(r)$ dominates $\phi_\alpha^{(p_2,q_2)}(r)$ and B(r) approaches q1/(1 − p1 + p1q1).

Figure 2.1: Density functions and their logarithmic ratio. (Left panel: reputation density against r for types (0.6, 0.2) and (0.4, 0.4); right panel: log(density ratio) against r.)

On the contrary, if sellers with higher expertise p also have higher value q, then the overall and conditional mistake rates are consistent. In particular, if pi > pj whenever qi > qj, then (1 − pi)(1 − qi) < (1 − pj)(1 − qj) and 1 − qi/(1 − pi + piqi) < 1 − qj/(1 − pj + pjqj). This guarantees that the function B is increasing throughout its domain.

(3) Fewer “good” commitment sellers and more “bad” commitment sellers. This property results in a function B(r) which is steep for large r, and thus is more likely to incentivize strategic sellers to be always truthful. On the other hand, if there are many commitment sellers in the market with very low overall mistake rates (“good” commitment sellers), i.e., the prior density for “good” commitment sellers is high, then it follows from Bayes’ rule that B(r) is not steep for large values of r. Then, strategic sellers may be better off occasionally deviating from being truthful, because as long as they keep their reputations at a high level (not necessarily one) they can still receive high payments. For example, consider a market similar to Example 1 where type (0.6, 0.6) instead is twice as likely as each of the other types. In Figure 2.2, the function B(r) of Example 1 is shown in the left panel and the B(r) of the new market is shown in the right panel. We can see that increasing the mass of the “best” commitment sellers in the market increases the convexity of the function B(r) for r close to 1, and numerical computations show that in the new market a strategic seller will deviate from being truthful when his reputation score is in (0.71, 1).

Figure 2.2: The function B(r) with different class weights. (Both panels plot the probability function B(r) against r.)
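Returning to property (2), the distinction between the overall and the conditional mistake rate is easy to tabulate. The following minimal sketch uses the two formulas given in this section; the function names, including the helper `rates_are_consistent`, are hypothetical and introduced only for illustration.

```python
def overall_mistake_rate(p, q):
    """Probability that a type (p, q) seller receives negative feedback."""
    return (1 - p) * (1 - q)


def conditional_mistake_rate(p, q):
    """Probability that an item advertised as high value is in fact low value."""
    return 1 - q / (1 - p + p * q)


def rates_are_consistent(types):
    """True if a lower overall rate never comes with a higher conditional rate."""
    for a in types:
        for b in types:
            if (overall_mistake_rate(*a) < overall_mistake_rate(*b)
                    and conditional_mistake_rate(*a) > conditional_mistake_rate(*b)):
                return False
    return True


if __name__ == "__main__":
    two_type_market = [(0.6, 0.2), (0.4, 0.4)]
    example_1 = [(0.3, 0.3), (0.4, 0.4), (0.5, 0.5), (0.6, 0.6)]
    for p, q in two_type_market:
        print(f"type ({p}, {q}): overall={overall_mistake_rate(p, q):.3f}, "
              f"conditional={conditional_mistake_rate(p, q):.3f}")
    print("two-type market consistent:", rates_are_consistent(two_type_market))
    print("Example 1 types consistent:", rates_are_consistent(example_1))
```

For the two-type market, type (0.6, 0.2) has the lower overall rate (about 0.32 against 0.36) but the higher conditional rate (about 0.615 against 0.474), which is the reversal that makes B(·) decreasing. The types of Example 1, where expertise and value move together, show no such reversal.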

2.6 Conclusion

In this chapter, we formulate a large market model for reputation systems where each seller and buyer is an infinitesimal member of the market and define an appropriate notion of stationary equilibrium for the dynamic market. We are particularly interested in equilibria where strategic sellers are truthful, and therefore focus on the notion of separating equilibrium where commitment sellers and strategic sellers are separated by their reputation scores. We give conditions under which it is optimal for strategic sellers to be always truthful and a separating equilibrium exists. We gain additional insight by theoretically analyzing an approximation of separating equilibrium and through numerical analysis.

The notion of separating equilibrium is of great interest not only because it is directly related to the goal of incentivizing truthfulness, but also because the separation of reputation scores significantly reduces the complexity of analyzing and computing an equilibrium. Our work on large market approximation and separating equilibrium suggests a general method to study reputation systems and truthfulness in complex marketplaces.

[1] Sachin Adlakha, Ramesh Johari, and Gabriel Y. Weintraub. Equilibria of dy- namic games with many players: Existence, approximation, and market struc- ture. arXiv, abs/1011.5537, 2010.

[2] Gad Allon and Itai Gurvich. Pricing and dimensioning competing large-scale service providers. Manufacturing & Service Operations Management, 12:449– 469, 2010.

[3] Eitan Altman, Konstantin Avrachenkov, and Urtzi Ayesta. A survey on discrim- inatory processor sharing. Queueing Systems, 53(1-2):53–63, 2006.

[4] Norman H. Anderson. Foundations of Information Integration Theory. New York: Academic Press, 1981.

[5] Christina Aperjis and Ramesh Johari. Designing reputation mechanisms for efficient trade. Technical report, Stanford University, 2010.

[6] Christina Aperjis and Ramesh Johari. Optimal windows for aggregating ratings in electronic marketplaces. Management Science, 56(5):864–880, 2010.

[7] U. Ayesta, A. Izagirre, and I.M. Verloop. Heavy traffic analysis of the dis- criminatory random-order-of-service discipline. Performance Evaluation Review, 39(2):41–43, September 2011.

[8] J. Bergin and D. Bernhardt. Anonymous sequential games: existence and char- acterization of equilibria. Economic Theory, 5(3):461–489, 1995.

73 BIBLIOGRAPHY 74

[9] P. Billingsley. Weak Convergence of Measures: Applications in Probability. So- ciety for Industrial Mathematics, Philadelphia, PA, 1987.

[10] Aaron Bodoh-Creed. Approximation of large games with applications to uniform price auctions. 2012. Submitted.

[11] Luis Cabral and Ali Hortacsu. Dynamics of seller reputation: Theory and evi- dence from eBay. J. of Industr. Econom. (to appear), 58:54–78, 2010.

[12] Y. Chen, C. Maglaras, and G. Vulcano. Design of an aggregated marketplace under congestion effects: Asymptotic analysis and equilibrium characterization. Working Paper, 2010.

[13] D.R. Cox and W.L. Smith. Queues. Methuen and Wiley, London and New York, 1961.

[14] Chrysanthos Dellarocas. Reputation mechanism design in online trading envi- ronments with pure moral hazard. Inform. Systems Res., 16(2):209–230, 2005.

[15] D. Duffie, S. Malamud, and G. Manso. Information percolation with equilibrium search dynamics. Econometrica, 77(5):1513–1574, 2009.

[16] D. Duffie, S. Malamud, and G. Manso. Information percolation in segmented markets. 2012. Working paper.

[17] Mehmet Ekmekci. Sustainable reputations with rating systems. Journal of Eco- nomic Theory, 146:479–503, 2011.

[18] Ming Fan, Yong Tan, and Andrew B. Whinston. Evaluation and design of online cooperative feedback mechanisms for reputation management. IEEE Trans. on Knowl. and Data Eng., 17(2):244–254, 2005.

[19] G. Fayolle, I. Mitrani, and R. Iasnogorodski. Sharing a processor among many job classes. Journal of the ACM, 27(3):519–532, 1980.

[20] A. Glazer and R. Hassin. ?/m/1: On the equilibrium distribution of customer arrivals. European Journal of Operational Research, 13:146–150, 1983. BIBLIOGRAPHY 75

[21] R. Gummadi, P. Key, and A. Proutiere. Optimal bidding strategies and equilibria in dynamic auctions with budget constraints. In Ad Auctions Workshop 2012, 2012.

[22] R. Hassin and M. Haviv. To queue or not to queue: Equilibrium behavior in queueing systems. Kluwer Academic Publishers, 2003.

[23] M. Haviv and J. van der Wal. Equilibrium strategies for processor sharing and queues with relative priorities. Probability in the Engineering and Informational Sciences, 11(4):403–412, 1997.

[24] Daniel Houser and John Wooders. Reputation in auctions: Theory, and evidence from eBay. J. Econom. & Management Str., 15(2):353–369, 06 2006. available at http://ideas.repec.org/a/bla/jemstr/v15y2006i2p353-369.html.

[25] M. Huang, P. E. Caines, and R. P. Malham´e.Large-population cost-coupled LQG problems with nonuniform agents: Individual-mass behavior and decentralized -Nash equilibria. IEEE Transactions on Automatic Control, 52(9):1560–1571, 2007.

[26] M. Huang, R. P. Malham´e,and P. E. Caines. Large population stochastic dy- namic games: closed-loop Mckean-Vlasov systems and the Nash certainty equiv- alence principle. Communications in Information and Systems, 6(3):221–251, 2006.

[27] Krishnamurthy Iyer, Ramesh Johari, and Mukund Sundararajan. Mean field equilibria of dynamic auctions with learning. 2011. Submitted.

[28] R. Jain, S. Juneja, and N. Shimkin. The concert queueing game: to wait or to be late. Discrete Event Dynamic Systems, 21:103–134, 2011.

[29] R. Johari and J. N. Tsitsiklis. Efficiency loss in a network resource allocation game. Mathematics of Operations Research, 29(3):407–435, 2004.

[30] B. Jovanovic and R. W. Rosenthal. Anonymous sequential games. Journal of Mathematical Economics, 17:77–87, 1988. BIBLIOGRAPHY 76

[31] Kirthi Kalyanam and Shelby McIntyre. Return on reputation in online auction markets. Working Paper, Santa Clara University. June 2001.

[32] W. Kang, F. Kelly, N. Lee, and R. Williams. State space collapse and diffusion approximation for a network operation under a fair bandwidth sharing policy. The Annals of Applied Probability, 19(5):1719–1780, 2009.

[33] Yoshihisa Kashima and Andrew R. Z. Kerekes. A distributed memory model of averaging phenomena in person impression formation. Journal of Experimental Social Psychology, 30(5):407 – 455, 1994.

[34] F.P. Kelly, A. Maulloo, and D. Tan. Rate control in communication networks: shadow prices, proportional fairness and stability. Journal of the Operational Research Society, 49(3):237–252, 1998.

[35] Frank Kelly. Charging and rate control for elastic traffic. European Transactions on Telecommunications, 8:33–37, 1997.

[36] Y. J. Kim and M. V. Mannino. Optimalincentive-compatiblepricing for m/g/1queues. Operations Research Letters, 31:459–461, 2003.

[37] J.F.C. Kingman. On queues in heavy traffic. Journal of the Royal Statistical Society. Series B (Methodological), 24(2):383–392, 1962.

[38] L. Kleinrock. Time-shared systems: A theoretical treatment. Journal of ACM, 14(2):242–261, 1967.

[39] L. Kleinrock. Queueing Systems, Volume 1: Theory. Wiley–Interscience, New York, 1975.

[40] J. M. Lasry and P. L. Lions. Mean field games. Japanese Journal of Mathematics, 2:229–260, 2007.

[41] David Lucking-Reiley, Doug Bryan, Naghi Prasad, and Daniel Reeves. Pennies from eBay: The determinants of price in online auctions. J. Industrial Econom., 55(2):223–233, 2007. BIBLIOGRAPHY 77

[42] L. Massouli´eand J. Roberts. Bandwidth sharing and admission control for elastic traffic. Telecommunication Systems, 15(1-2):185–201, 2000.

[43] H. Mendelson and S. Whang. Optimal incentive-compatible priority pricing for the m/m/1 queue. Operations Research, 38:870–883, 1990.

[44] Robin M.Hogarth and Hillel J. Einhorn. Order effects in belief updating: The belief-adjustment model. Cognitive Psychology, 24(1):1–55, 1992.

[45] J. Mo and J. Walrand. Fair end-to-end window-based congestion control. IEEE/ACM Trans. Netw., 8:556–567, October 2000.

[46] J. Nair, A. Wierman, and B. Zwart. Exploiting network effects in the provisioning of large scale systems. Proceedings of 29th International Symposium on Computer Performance, Modeling, Measurements and Evaluation., 2011.

[47] P. Naor. The regulation of queue size by levying tolls. Econometrica, 37(1):15–24, 1969.

[48] K. Rege and B. Sengupta. Queue-length distribution for the discriminatory processor-sharing queue. Operations Research, 44(4):653–657, 1996.

[49] M.I. Reiman. Open queueing networks in heavy traffic. Mathematics of Opera- tions Research, 9(3):pp. 441–458, 1984.

[50] Paul Resnick and Richard Zeckhauser. The Economics of the Internet and E- Commerce, volume 11, chapter Trust among strangers in internet transactions: Empirical analysis of eBay’s reputation system, pages 127–157. Elsevier Science Ltd., 2002.

[51] Paul Resnick, Richard Zeckhauser, John Swanson, and Kate Lockwood. The value of reputation on eBay: A controlled experiment. Experimental Econom., 9(2):79–101, June 2006.

[52] J B Rosen. Existence and uniqueness of equilibrium points for concave n-person games. Econometrica, 33(3):520–534, July 1965. BIBLIOGRAPHY 78

[53] N. L. Stokey, R. E. Lucas, Jr., and E. C. Prescott. Recursive methods in economic dynamics. Harvard University Press, Cambridge, MA, 1989.

[54] I.M. Verloop, U. Ayesta, and R. Nunez-Queija. Heavy-traffic analysis of a multiple-phase network with discriminatory processor sharing. Operations Re- search, 59(3):648–660, 2011.

[55] Yu Wu, Christina Aperjis, and Ramesh Johari. Reputation mechanisms in large online markets. Working Paper, 2012.

[56] Yu Wu, Loc Bui, and Ramesh Johari. Aggregate equilibrium of priority pricing for resource sharing systems. Working Paper, 2011.

[57] Yu Wu, Loc Bui, and Ramesh Johari. Heavy traffic approximations of equilibria in resource sharing games. Technical Report, arXiv:1109.6166, 2011.

[58] Benjamin Yolken and Nicholas Bambos. Game based capacity allocation for utility computing environments. Telecommun System, 47:165–181, 2011. Yu Wu
