A Fluid Approximation for Large-Scale Service Systems

Yunan Liu, Ward Whitt Department of Industrial Engineering and Operations Research Columbia University 500 West 120th Street New York, NY 10027-6699 {yl2342,ww2040}@columbia.edu

ABSTRACT tact centers (telephone call centers) and hospitals, we intro- We introduce and analyze a deterministic fluid model that duce and analyze a deterministic fluid model that serves as an approximation for a many-server non-Markovian queue- serves as an approximation for the Gt/GI/st + GI many- server queueing model, which has a general time-varying ing model with time-varying parameters; see [1] and [3] for background on contact centers, see [15] for hospitals. Large- arrival process (the Gt), a general service-time distribution scale service systems tend to be quite complicated, with mul- (the first GI), a time-dependent number of servers (the st) and allows abandonment from queue according to a gen- tiple classes of customers and multiple pools of servers. Here eral abandonment-time distribution (the +GI). This fluid we restrict attention to the basic model with a single class of model approximates the associated queueing system when customers handled by a single group of homogeneous servers, the arrival rate and number of servers are both large. We working in parallel. also show that the system dynamics greatly simplifies in two The motivating queueing model is Gt/GI/st + GI, which special cases: (i) when the service time distribution is ex- has a general time-varying arrival process (the Gt) partially ponential (M) and (ii) when the service time distribution is characterized by its arrival rate function, independent and deterministic (D) and the model is stationary. We develop identically distributed (i.i.d.) service times with a general an efficient algorithm to compute all standard performance service-time distribution (the first GI), a time-varying (de- functions in both cases. terministic) staffing function (number of servers, the st), and In case (i), we establish an asymptotic loss of memory abandonment allowed from queue according to i.i.d. pa- (ALOM) property, i.e., asymptotic independence from the tience times with a general distribution (the +GI). The initial conditions as time evolves. We show that the differ- stochastic model has four complicating features that make ence in the performance functions with different initial con- it difficult to analyze directly: (i) time varying arrival rate ditions dissipates over time exponentially fast, under reg- and staffing function, (ii) customer abandonment, (iii) large ularity conditions. In contrast, in case (ii) we show that scale (many servers) and (iv) non-Markovian probability ALOM fails dramatically. Instead, although all model pa- structure. We aim to provide tools that will increase under- rameters are constants, we show that the performance rapidly standing of these complicating features and thereby help im- approaches a periodic steady state (PSS) with a period equal prove system performance. For setting appropriate staffing to the service time, whenever the system does not start with policies to cope with time-varying demands in meeting per- the unique stationary distribution. Moreover, the form of formance goals subject to performance constraints, see [4, the PSS depends on the initial condition. Simulation and 6]. See [2, 5, 12] for recent development. a heavy-traffic limit confirm that this anomalous behavior The fluid model is intended to serve as an approximation also occurs in the large-scale queueing model. for the queueing model when both the number of servers and the arrival rate are large, and the system experiences occasional periods of significant overloading. Such approx- Categories and Subject Descriptors imations are usually justified by many-server heavy-traffic G.3 [Probability and Statistics]: Queueing Theory limits as the arrival rate and number of servers increase, e.g., see [13], but this work is concerned with analyzing the General Terms performance of the approximating fluid model. Specifically, here we provide an overview of our recent work in [7]-[11]. Performance, Theory In §2, we briefly describe the transient performance of the Gt/GI/st +GI fluid queue. In §3, we discuss the asymptotic Keywords loss of memory (ALOM) property of the Gt/M/st +GI fluid large-scale service systems, queues with time-varying ar- queue. In §4, we discuss the convergence to a periodic steady rivals; nonstationary queues; many-server queues; customer state (PSS) in the GI/D/s + GI fluid queue with constant abandonment; deterministic fluid model; non-Markovian queues, model parameters. asymptotic loss of memory, periodic steady state 2. THEMODELANDITSPERFORMANCE

1. INTRODUCTION In the Gt/GI/st + GI ﬂuid model there is a determin- Motivated by the need for tools to improve the perfor- istic arrival process with arrival-rate function λ ≡ {λ(t) : mance of large-scale service systems, such as customer con- t ≥ 0}; i.e., the total input of ﬂuid over the interval [0,t] 0.8

0.6

B(t)

0.2 t service ! t ( ) ( ) 0 facility 0 1 2 3 4 5 6 7 8 9 10 arrival departure Time t G 0.8

0.6

(a) Underloaded: B(t)

0.2

0 B(t)=S(t) 0 1 2 3 4 5 6 7 8 9 10 Q(t)>0 Time t b t service 1 ( t) queue )0,( ! t 0.8 facility ( ) 0.6 arrival departure B(t) G 0.4 0.2

0 " t 0 1 2 3 4 5 6 7 8 9 10 F ( ) Time t 2 abandonment 1.5

(b) Overloaded: B(t)=S(t), Q(t)>0 1

0.5 X(t) = B(t) + Q(t)

0 0 1 2 3 4 5 6 7 8 9 10 Time t Figure 1: Flows in overloaded and underloaded intervals. Figure 2: The performance measures of a Gt/M/s+M example with different initial conditions. t is Λ(t) ≡ 0 λ(u) du, t ≥ 0. There is a staffing (service capacity) functionR s ≡ {s(t) : t ≥ 0}. There are service- time and abandon-time cdf’s G and F , respectively, with used to determine all performance measures, such as the to- pdf’s g and f. These cdf’s apply deterministically yielding tal fluid in queue Q(t) and in service B(t), the potential proportions, i.e., a proportion G(x) (F (x)) of the fluid en- waiting time v(t) (the virtual waiting time of an infinitely tering service (the queue) completes service (abandons) and patient arrival), the abandonment rate α(t) and the depar- departs by time x after it has entered. There is unlimited ture rate σ(t). Figure 1 illustrates the flow dynamics in waiting space and the service discipline is FCFS. different regimes. The transient dynamics can be characterized by two two- parameter performance functions: Q(t,y) and B(t,y), where Q(t,y)(B(t,y)) is the quantity of fluid in queue (in service) 3. EXPONENTIAL SERVICE TIMES at time t that has been in queue for time less than or equal The algorithm in [8] greatly simplifies when the service to y. Under regularity conditions, these functions are abso- time distribution is exponential (M). Then the fixed-point lutely continuous in the second argument, i.e., equation for b(t,x) has an explicit solution. y y We illustrate the algorithm by considering an example Q(t,y)= q(t,x)dx and B(t,y)= b(t,x)dx, y ≥ 0, Z0 Z0 with a sinusoidal arrival rate λ(t)=1+0.6 sin(t), constant staffing s(t) = 1, exponential abandonment time with rate where the fluid density in queue b and the density in ser- θ = 0.5 and service rate µ = 1. (This would apply to a vice q are non-negative integrable functions that satisfy the queueing system with n servers after multiplying λ(t) and following fundamental evolution equations: s(t) by n; we would then scale up the performance as well.) We consider four different initial conditions: (i) empty, (ii) F¯(x + u) q(t + u, x + u) = q(t,x) ¯ , 0 ≤ x < w(t), underloaded with B(0) = 0.5 and Q(0) = 0, (iii) overloaded F (x) with B(0) = 1 and Q(0) = 0.4 and (iv) overloaded with G¯(x + u) B(0) = 1 and Q(0) = 0.8. (Simulations confirm that these b(t + u, x + u) = b(t,x) . G¯(x) plots describe the queue performance well for large n.) Figure 2 shows that the differences of fluid contents in Under regularity conditions, we show that the head-of-line these four cases converge to zero extremely fast. Thus Fig- waiting time w is a non-negative continuous function that ure 2 shows the asymptotic loss of memory (ALOM) prop- solves the following ODE: erty; i.e., the performance is asymptotically independent of b(t, 0) the initial conditions as time evolves. Indeed, in [10] we w′(t) = 1 − . q(t, w(t)) show that the differences dissipate exponentially fast, under regularity conditions. We provide an algorithm to compute the transient dynamics In [10], we apply ALOM to establish convergence to steady of b(t,y) and q(t,y) for 0 ≤ t ≤ T , assuming that the system state in the G/M/s + GI fluid queue, characterized by [14], alternates between underloaded intervals (with B(t) ≤ s(t)) and convergence to a PSS in the Gt/M/s + GI fluid queue and overloaded intervals (with Q(t) > 0 and B(t) = s(t)). when the arrival rate function is periodic. In [11], we also Theorem 3.1 in [8] shows that the two-parameter density treat the open network generalization of the Gt/Mt/st +GIt function b(t,x) solves a fixed-point equation in each over- fluid queue where the service and patience distributions at loaded interval. The two density functions b and q can be each queue are allowed to be time-varying. 2.5

2 6. REFERENCES 1.5 [1] Z. Aksin, M. Armony, and V. Mehrotra. The modern b(t,0) 1

0.5 call center: a multi-disciplinary perspective on

0 0 0.5 1 1.5 2 2.5 3 3.5 operations management research. Production and Time t

0.6 Operations Management, 16:665–688, 2007.

0.5 0.4 [2] Z. Feldman, A. Mandelbaum, W. A. Massey, and

w(t) 0.3

0.2 W. Whitt. Staﬃng of time-varying queues to achieve 0.1 Management Science 0 time-stable performance. , 0 0.5 1 1.5 2 2.5 3 3.5 Time t 0.8 54:324–338, 2008. 0.6 [3] N. Gans, G. Koole, and A. Mandelbaum. Telephone

0.4 Q(t) call centers: tutorial, reviews and research prospects. 0.2 Manufacturing Service Operations Management,

0 0 0.5 1 1.5 2 2.5 3 3.5 Time t 5:79–141, 2003. [4] L. V. Green, P. J. Kolesar, and W. Whitt. Coping Figure 3: The PSS performance of the GI/D/s + M with time-varying demand when setting staffing fluid queue with different initial conditions. requirements for a service system. Production and Operations Management, 16:13–39, 2007. [5] I. Gurvich and W. Whitt. Scheduling flexible servers 4. STATIONARY MODEL WITH D SERVICE with convex delay costs in many-server service systems. Manufacturing Service Operations In §§2 and 3 our analysis is based on the assumption that Management, 11:237–253, 2009. the service time distribution G has a density g. Now we [6] O. Jennings, A. Mandelbaum, W. A. Massey, and consider the GI/D/s + GI fluid queue that has constant W. Whitt. Server staffing to meet time-vary demand. parameters and deterministic service times 1/µ. Determin- Management Science, 42:1383–1394, 1996. istic service times are of applied interest, because computer- generated service times, such as automated messages, may [7] Y. Liu and W. Whitt. A fluid model for a large-scale well be deterministic. On the one hand, we know the queue- service system experienceing periods of overloading. length process X(t) of a M/D/n + M stochastic queue has Columbia University, NY, NY, 2010. a well-defined steady state as t → ∞ for all arrival rates, http://www.columbia.edu/∼ww2040/allpapers.html. because abandonment ensures stability. We demonstrate [8] Y. Liu and W. Whitt. A fluid model for a many-server by showing that {X(t) : t ≥ 0} is a regenerative process Gt/GI/st + GI queue. Columbia University, NY, NY, with finite regeneration cycle τ, under regularity conditions. 2010. However, the times between regenerations grow exponen- http://www.columbia.edu/∼ww2040/allpapers.html. tially fast as n →∞ where n is the number of servers. This [9] Y. Liu and W. Whitt. The heavily loaded many-server suggests that in a M/D/n + M queue with large n and λ, queue with abandonment and deterministic service the queue length converges to that steady state very slowly. times. Columbia University, NY, NY, 2010. (That is confirmed by our analysis.) http://www.columbia.edu/∼ww2040/allpapers.html. On the other hand, as n → ∞, the sequence of appro- [10] Y. Liu and W. Whitt. Large-time asymptotics for the priately scaled (under the fluid scaling) stochastic queues Gt/Mt/st + GIt many-server fluid queue with converges to the M/D/s + M fluid model. We consider the customer abandonment. Columbia University, NY, case ρ = λ/sµ > 1, under which the system is overloaded, NY, 2010. but still stable because of the abandonment. We show the http://www.columbia.edu/∼ww2040/allpapers.html. performance in the fluid model converges to a PSS as time [11] Y. Liu and W. Whitt. A network of time-varying evolves. Moreover, there are infinitely many possible PSS’s, many-server fluid queues with customer abandonment. depending on the initial conditions. The system is station- Columbia University, NY, NY, 2010. ary if and only if the system starts with the unique station- http://www.columbia.edu/∼ww2040/allpapers.html. ary fluid content. We establish the many-server heavy-traffic [12] Y. Liu and W. Whitt. Stabilizing customer limit in [9], under regularity conditions. abandonments in many-server queues with In Figure 3, we illustrate this periodic behavior by plotting time-varying arrivals. Columbia University, NY, NY, the performance of the M/D/s + M fluid model with λ = 2, 2010. µ = s = 1 and exponential abandonment with rate θ = 2. http://www.columbia.edu/∼ww2040/allpapers.html. We consider two initial conditions: (i) critically loaded with [13] G. Pang, R. Talreja, and W. Whitt. Martingale proofs b(0,x) = 1.5 · 1{0≤x≤1/2} + 0.5 · 1{1/2