The Analysis of Dynamical Queueing Systems (Background)

Technological innovations are creating new types of communication systems. During the 20th century, we saw the evolution of electronic communication from the telegraph to the telephone to the internet. In turn, these new types of communication systems have created new types of communication services such as call centers, electronic commerce and wireless communications.

Communication services managers must make important business decisions to stay competitive and profitable. They have to maximize the communication resources that they make available to the customer. The word “resource” can refer to telephone lines, telephone operators, optical fiber or bandwidth. At the same time, managers must minimize the cost of providing these resources, which in turn allows them to maximize their profits.

Unpredictable but growing demand for new services delivered by new communication systems makes managerial decision making difficult. The mathematical field of queueing theory has traditionally provided managers with a useful set of decision-making formulas and algorithms for designing communication systems and services. An example is the classical blocking formula that was developed in 1917. For a group of telephone trunk lines that connect two central offices, this formula gives the probability that “all circuits are busy” (no trunk line is currently available), given the fixed number of trunk lines provided by the service and the average number of trunk lines requested by the customers. This formula can then be “inverted” so that a manager can decide on the smallest number of trunk lines needed, for a given customer demand, to keep a specific performance measure below some acceptable threshold level. The measure that we use here is the probability of all circuits being busy.
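The blocking formula and its inversion can be sketched in a few lines of Python. This is a minimal illustration that uses the standard recursive form of the Erlang blocking formula. The function names and the numerical values in the demonstration are assumptions chosen for illustration only.

```python
def erlang_b(trunks: int, offered_load: float) -> float:
    """Probability that all `trunks` lines are busy (Erlang's blocking formula)."""
    blocking = 1.0  # with zero trunks every call is blocked
    for k in range(1, trunks + 1):
        # Standard recursion: B(k) = a * B(k-1) / (k + a * B(k-1))
        blocking = offered_load * blocking / (k + offered_load * blocking)
    return blocking


def min_trunks(offered_load: float, threshold: float) -> int:
    """Smallest number of trunks whose blocking probability is at most `threshold`."""
    if not 0.0 < threshold <= 1.0:
        raise ValueError("threshold must lie in (0, 1]")
    trunks, blocking = 0, 1.0
    while blocking > threshold:
        trunks += 1
        blocking = offered_load * blocking / (trunks + offered_load * blocking)
    return trunks


if __name__ == "__main__":
    load = 10.0  # hypothetical average number of trunk lines requested
    needed = min_trunks(load, 0.01)  # hypothetical 1% blocking target
    print(needed, erlang_b(needed, load))
```

The inversion is a simple search because the blocking probability decreases as trunk lines are added, so the first trunk count that meets the threshold is also the smallest one.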

Historically, this was the first problem that led to the creation of queueing theory. A. K. Erlang modeled the usage of a telephone trunk group as a time-homogeneous Markov process. Randomness is the best way to model mathematically the aggregate behavior of people. The transient (short run) probabilities for such a process solve a set of ordinary differential equations. When we look at the asymptotic regime of large time (long run), these differential equations reduce to a set of linear equations, which are considerably easier to solve. This type of equilibrium or steady state analysis made classical queueing theory an important tool for modeling many different types of voice communications systems. The simplest, fundamental queueing model of this type that Erlang used has multiple servers (corresponding to telephone operators) but no buffer where customers can wait when all the servers are busy (when customers hear a busy signal, they leave). We model the number of busy servers as a continuous time Markov process, where we assume that the call arrival traffic is a Poisson process with a constant mean rate and the time each customer spends on the phone is random with a negative exponential distribution.
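To make the contrast between the transient and steady state analyses concrete, the following sketch integrates the differential equations for the multiserver model with no buffer and compares the result with the solution of the linear balance equations. It assumes a simple forward-Euler scheme and an initially empty system. The function names and parameter values are illustrative assumptions.

```python
def forward_equations_step(p, lam, mu, dt):
    """One forward-Euler step of the Kolmogorov forward equations for M/M/c/c."""
    c = len(p) - 1
    new_p = []
    for n in range(c + 1):
        inflow = 0.0
        if n > 0:
            inflow += lam * p[n - 1]            # an arrival moves n-1 -> n
        if n < c:
            inflow += (n + 1) * mu * p[n + 1]   # a departure moves n+1 -> n
        outflow = (lam if n < c else 0.0) + n * mu
        new_p.append(p[n] + dt * (inflow - outflow * p[n]))
    return new_p


def transient_distribution(c, lam, mu, horizon, dt=1e-3):
    """Distribution of the number of busy servers at time `horizon`, starting empty."""
    p = [1.0] + [0.0] * c
    for _ in range(int(round(horizon / dt))):
        p = forward_equations_step(p, lam, mu, dt)
    return p


def steady_state(c, lam, mu):
    """Solution of the balance equations: pi_n is proportional to (lam/mu)^n / n!."""
    weights = [1.0]
    for n in range(1, c + 1):
        weights.append(weights[-1] * (lam / mu) / n)
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    print(transient_distribution(2, 1.0, 1.0, 0.1))   # short run
    print(transient_distribution(2, 1.0, 1.0, 30.0))  # long run
    print(steady_state(2, 1.0, 1.0))
```

For a short horizon the distribution still reflects the empty starting state, while for a long horizon it agrees with the steady state solution, whose last entry is the blocking probability.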

The success of using queueing theory to model circuit-switched telephone networks for voice communications in the first half of the 20th century led to another major triumph in the latter half of the 20th century, as queueing theory was applied by computer scientists such as Kleinrock [2] to model data communication systems. The simplest, fundamental queueing model for data communications has a single server where jobs arrive according to a Poisson process. When that single server is busy, there is a waiting buffer of unlimited capacity for all the jobs that are not in service. The time that each job takes to be served is random with a negative exponential distribution. By again using the steady state theory for continuous time Markov chains, a queueing analysis was invented that contributed to the design of the first prototype for the Internet (ARPANET).
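For this single server model, the steady state theory gives a geometric distribution for the number of jobs in the system whenever the arrival rate is below the service rate. The sketch below evaluates that standard result. The function name and the rates used in the demonstration are illustrative assumptions.

```python
def mm1_steady_state(lam: float, mu: float, n_max: int):
    """Return (P(N = n) for n = 0..n_max, mean number in system) for a stable M/M/1 queue."""
    rho = lam / mu  # traffic intensity
    if rho >= 1.0:
        raise ValueError("a steady state exists only when lam < mu")
    probs = [(1.0 - rho) * rho ** n for n in range(n_max + 1)]
    mean_jobs = rho / (1.0 - rho)
    return probs, mean_jobs


if __name__ == "__main__":
    probs, mean_jobs = mm1_steady_state(lam=1.0, mu=2.0, n_max=5)  # hypothetical rates
    print(probs, mean_jobs)
```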

Given the new types of communication systems and services available today, it is no longer possible to make many of the simplifying assumptions of classical queueing theory. One simplifying assumption that has become increasingly untenable is that all queues act in isolation. The reality is that most of them interact as a queueing network. Mathematically, this dramatically increases the complexity of the queueing models needed to capture this interactivity.

Another major mathematical assumption has been to restrict our analysis of Markov processes to ones that are time homogeneous. This is akin to assuming that the mean arrival rate for telephone calling traffic is constant, or independent of the time of day. Managers and engineers can no longer rely on this fiction in their daily work, so it is important for the mathematician to respond. Unfortunately, the mathematician must now throw out the entire theory of steady state and equilibrium analysis for continuous time Markov chains when trying to address this issue. What are the mathematical tools needed to replace these techniques? These are the topics to be explored in the following series of research summaries.

1. A. K. Erlang, Solutions of some problems in the theory of probabilities of significance in automatic telephone exchanges, The Post Office Electrical Engineers’ Journal 10 (1918) 189–197. Translated from Elektroteknikeren 13 (1917) (in Danish).

2. L. Kleinrock, Queueing Systems, Volume II: Computer Applications, John Wiley & Sons, 1976.

I. Uniform Acceleration for Dynamical Queueing Analysis

One innovation that I developed towards this end in my Ph.D. thesis was the notion of uniform acceleration. This is an asymptotic theory in which time for a specific queueing system is no longer scaled to infinity, but its arrival and service rates are. When these rates are constant, this asymptotic analysis is equivalent to a steady state analysis. It is precisely when these rates are time-varying that the two types of analysis diverge. I showed in my thesis that this analysis applies to the M/M/1 queue with time-varying rates, which is a dynamic variant of the fundamental queueing model for data communications. One can use this system to model customers waiting to download files from a web server. With this time-dependent analysis it became possible to obtain simple formulas that capture the dynamic behavior of this queueing model. In particular, the analysis can deal with the time-dependent phenomenon of backlog. For steady state analysis, we say that such a queue is stable if its mean arrival rate is strictly less than its mean service rate. Before this type of analysis was employed, it was not mathematically understood that, when rates are time-varying, the arrival rate can be less than the service rate at a given time while the system is still unstable at that time. This occurs when the arrival rates of the past have so exceeded the service rates of the past that a backlog in the queue develops. Consequently, the system has not yet recovered by the time the arrival rate drops below the service rate.
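The backlog effect can be illustrated with a deterministic fluid sketch of a single queue with time-varying rates. This is a simplified stand-in for the asymptotic analysis itself, not a reproduction of it. The rate functions, the time step and the function names are all illustrative assumptions.

```python
def fluid_queue(arrival_rate, service_rate, horizon, dt=1e-3, q0=0.0):
    """Euler path of the fluid model q' = lam(t) - mu(t), kept at q >= 0."""
    steps = int(round(horizon / dt))
    q, path = q0, [q0]
    for k in range(steps):
        t = k * dt
        q = max(q + (arrival_rate(t) - service_rate(t)) * dt, 0.0)
        path.append(q)
    return path


# Hypothetical rates: a rush (lam = 2) on [0, 1), then a lull (lam = 0.5); mu = 1 throughout.
def lam(t):
    return 2.0 if t < 1.0 else 0.5


def mu(t):
    return 1.0


if __name__ == "__main__":
    dt = 1e-3
    path = fluid_queue(lam, mu, horizon=4.0, dt=dt)
    for t in (1.0, 2.0, 3.5):
        # Print the time, whether lam(t) < mu(t), and the fluid queue length.
        print(t, lam(t) < mu(t), round(path[int(round(t / dt))], 3))
```

In this example the arrival rate is already below the service rate at time 2, yet the fluid queue is still positive there because the backlog built up during the rush has not finished draining.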

This queueing problem has been a touchstone in my research career, and I have returned to it once every ten years to say something new about it. The paper in 1985 [6] was a rigorous asymptotic analysis of the transition probabilities for the queueing model. The paper written jointly in 1995 with Avishai Mandelbaum of Technion [5] used the same asymptotic scaling but applied it directly to the random sample path behavior of the queueing system. This was achieved by using the latest probabilistic techniques from the theory of strong approximations, which give asymptotic expansions for the sample paths of a Poisson process. This gave a more detailed analysis of the queue when it is unstable. For the first time, the fluid approximation for this system, as heuristically described by queueing theorists such as Kleinrock (see [2] in the Background references), was shown to arise rigorously as a functional strong law of large numbers limit. Moreover, we showed that the deviation from this fluid limit satisfies its own functional central limit theorem. In a new paper, just completed [1] and co-authored with Mor Harchol-Balter of Carnegie Mellon University and my Ph.D. student Robert C. Hampshire, we take this sample path analysis of the number of customers in the queueing system and apply it to the asymptotic analysis of the time spent by each customer in the system with respect to the service discipline.

Inspired by call centers, a paper co-authored with Avishai Mandelbaum and Martin Reiman of Bell Labs [3] generalizes the notion of uniform acceleration to multiserver queues. Here the scaling of the arrival rates (demand) is matched by a scaling of the number of telephone operators (supply). This is a natural asymptotic regime for business managers. In this paper we identify a large, new class of dynamical queueing network models (having time-varying rates) and show that they can be approximated by simpler sets of dynamical systems.

1. Fluid and Diffusion Limits for Transient Sojourn Times of Processor Sharing Queues with Time Varying Rates. R. C. Hampshire, M. Harchol-Balter and W. A. Massey (2006). To appear in the special issue of Queueing Systems and Their Applications on Fair Resource Sharing, Volume 53, Issue 1/2, June 2006.

2. The Analysis of Queues with Time-Varying Rates for Telecommunication Models. W. A. Massey, (2002), Telecommunications Systems 21, pp. 173–204.

3. Strong Approximations for Markovian Service Networks, A. Mandelbaum, W. A. Massey, and M. I. Reiman, Queueing Systems and Their Applications, 30 (1998) pp. 149-201.

4. Uniform Acceleration Expansions for Markov Chains with Time-Varying Rates, W. A. Massey and W. Whitt, Annals of Applied Probability, Vol. 8, No. 4 (1998) pp. 1130-1155.

5. Strong Approximations for Time Dependent Queues, W. A. Massey and A. Mandelbaum, Mathematics of Operations Research, 20:1 (February 1995), pp. 33-64.

6. Asymptotic Analysis of the Time Dependent M/M/1 Queue, W. A. Massey, Mathematics of Operations Research, 10 (May 1985), pp. 305-327.

II. A Matrix Operator Calculus for Queueing Network Analysis via Special Functions and Stochastic Orderings

Just as constant arrival and service rates are not necessarily a reasonable modeling assumption, similar objections can be made to assuming that every queueing system acts in isolation. The reality is that queueing systems interact to form a network. A major breakthrough for data communications was the discovery of the Jackson network. Such a network is a collection of single server queues (or nodes) with infinite buffers. We assume that each node is fed by an independent, external Poisson arrival stream. The time for each job in service has a negative exponential distribution. Finally, every customer who completes a job at one node either leaves the network or randomly enters the queue of another node. Given this interaction, the numbers of customers waiting at each node form a collection of correlated random variables. Nevertheless, it was shown by Jackson [1] that in steady state the joint distribution of these customer population sizes makes them act as if they were statistically independent of each other! One need only replace the external arrival rates with aggregate arrival rates that are found by solving a set of linear equations.
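As a small illustration of this result, the sketch below solves the linear equations for the aggregate arrival rates and then evaluates the resulting product of single queue distributions. It uses plain fixed-point iteration, which converges when every customer eventually leaves the network. The two-node example, the routing probabilities and the function names are illustrative assumptions.

```python
def aggregate_rates(external, routing, iterations=1000):
    """Solve lam_i = external_i + sum_j lam_j * routing[j][i] by fixed-point iteration."""
    n = len(external)
    lam = list(external)
    for _ in range(iterations):
        lam = [external[i] + sum(lam[j] * routing[j][i] for j in range(n))
               for i in range(n)]
    return lam


def product_form_probability(counts, lam, service):
    """Steady state P(N_1 = n_1, ..., N_k = n_k) when every node is stable."""
    prob = 1.0
    for n_i, lam_i, mu_i in zip(counts, lam, service):
        rho = lam_i / mu_i
        if rho >= 1.0:
            raise ValueError("every node must be stable")
        prob *= (1.0 - rho) * rho ** n_i  # each node looks like an M/M/1 queue
    return prob


if __name__ == "__main__":
    external = [1.0, 0.0]        # hypothetical external arrival rates
    routing = [[0.0, 1.0],       # node 0 always sends its customers to node 1
               [0.5, 0.0]]       # node 1 sends half back to node 0, half leave
    service = [4.0, 4.0]         # hypothetical service rates
    lam = aggregate_rates(external, routing)
    print(lam, product_form_probability([0, 0], lam, service))
```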

My papers contain many original contributions to this field. The result of Jackson holds only when we assume that every node is stable. If the entire network is unstable, there can still be a stable sub-network. We must then do three things to find it. First, formulate a generalized throughput equation. Second, show that even though the new equation is non-linear, it still has a unique solution that can be found through an efficient, finite step algorithm. Finally, show that this equation identifies the largest stable sub-network and that its limiting distribution still looks like an independent collection of single queues in equilibrium. The latter is a non-trivial issue, since the limiting theory for Markov chains only applies to the case of an entire network being stable, not some sub-network. All these issues were addressed in a paper co-authored with Jonathan Goodman of the Courant Institute [4].
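The idea of a non-linear throughput equation can be sketched as follows. The specific form used here, in which a node forwards traffic at the smaller of its arrival rate and its service rate, is stated as an assumption rather than quoted from [4]. The solver is plain fixed-point iteration, not the finite step algorithm of that paper. The example network and the function names are hypothetical.

```python
def generalized_throughput(external, routing, service, tol=1e-12, max_iter=100000):
    """Iterate lam_i = external_i + sum_j min(lam_j, mu_j) * routing[j][i].

    Assumed form of the equation: a saturated node j can pass traffic on
    no faster than its service rate mu_j.
    """
    n = len(external)
    lam = list(external)
    for _ in range(max_iter):
        new = [external[i]
               + sum(min(lam[j], service[j]) * routing[j][i] for j in range(n))
               for i in range(n)]
        if max(abs(a - b) for a, b in zip(new, lam)) < tol:
            return new
        lam = new
    return lam


def stable_nodes(lam, service):
    """Indices of the nodes whose arrival rate is strictly below their service rate."""
    return [i for i, (l, m) in enumerate(zip(lam, service)) if l < m]


if __name__ == "__main__":
    external = [2.0, 0.0]             # hypothetical: node 0 is overloaded
    routing = [[0.0, 1.0], [0.0, 0.0]]  # two nodes in series
    service = [1.0, 3.0]
    lam = generalized_throughput(external, routing, service)
    print(lam, stable_nodes(lam, service))
```

In this two-node series example the first node is unstable, but it feeds the second node at no more than its own service rate, so the second node forms a stable sub-network.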

Another innovation of my thesis was to develop a matrix operator calculus that concisely describes fundamental Markovian queueing networks such as Jackson’s. Tensor products of right and left shift operators enable us to construct the infinitesimal generator that governs the stochastic behavior of the entire network. The matrix calculus yields simple but illuminating proofs of well known results. More importantly, it also leads to new insights about these networks, including their transient behavior.

A well known result for the classical M/M/1 queue is that its transient behavior can be expressed in terms of modified Bessel functions. By understanding the relationship between these special functions and the operator calculus for this queue, I was able to formulate a new generalized Bessel function that can be used to solve some transient probabilities related to a series Jackson network [3].

I also used the notion of stochastic ordering, which induces a partial order on the family of probability distributions for a partially ordered state space. The operator calculus can easily verify that there is an upper bound for the transient behavior of a Jackson network in terms of the known transient distribution for the M/M/1 queue [5] and [6]. What is surprising about this result [5] is that, traditionally, stochastic comparisons are made by sample path or coupling arguments. This bound for the Jackson network turned out to be one where no sample path argument could work. In effect, I proved that the result was not intuitive. This led to the paper [2] that transcended queueing theory and addressed the larger theory of stochastic comparisons for Markov chains on a partially ordered state space. Recall that topological spaces are defined by a collection of open sets, and that one can have strong and weak topologies. This paper showed that many different stochastic orders on the same partially ordered space can be defined by restricting the collection of increasing sets, which creates weak and strong stochastic orders.
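The role of increasing sets can be illustrated on a very small partially ordered space. In the sketch below, one distribution is dominated by another when it assigns no more probability to every set in a chosen collection of increasing sets. The four-point state space, the two distributions and the function names are illustrative assumptions, not examples taken from [2].

```python
from itertools import combinations, product

STATES = list(product((0, 1), repeat=2))  # the space {0,1}^2 with componentwise order


def leq(x, y):
    """Componentwise partial order on the states."""
    return all(a <= b for a, b in zip(x, y))


def increasing_sets(states):
    """All subsets A such that x in A and x <= y together imply y in A."""
    found = []
    for size in range(len(states) + 1):
        for subset in combinations(states, size):
            chosen = set(subset)
            if all(y in chosen for x in chosen for y in states if leq(x, y)):
                found.append(chosen)
    return found


def dominated(p, q, sets):
    """True when p(A) <= q(A) for every set A in `sets`."""
    return all(sum(p[s] for s in a) <= sum(q[s] for s in a) + 1e-12 for a in sets)


if __name__ == "__main__":
    p = {(0, 0): 0.5, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.5}
    q = {(0, 0): 0.0, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 0.0}
    strong = increasing_sets(STATES)
    # A restricted collection: one increasing set per coordinate.
    weak = [{s for s in STATES if s[i] == 1} for i in range(2)]
    print(dominated(p, q, strong), dominated(p, q, weak))
```

Here the comparison holds for the restricted collection but fails for the full collection of increasing sets, so the two collections define genuinely different orders on the same space.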

1. Jobshop-Like Queueing Systems, J. R. Jackson, Management Science 10(1) (1963) 131–142.

2. Stochastic Orderings for Markov Processes on Partially Ordered Spaces, W. A. Massey, Mathematics of Operations Research, 12 (May 1987), pp. 350-367.

3. Calculating Exit Times for Series Jackson Networks, W. A. Massey, Journal of Applied Probability, 24 (March 1987), pp. 226-234.

4. The Non-Ergodic Jackson Network, J. B. Goodman and W. A. Massey, Journal of Applied Probability, 21 (December 1984), pp. 860-869.

5. An Operator Analytic Approach to the Jackson Network, W. A. Massey, Journal of Applied Probability, 21 (June 1984), pp. 379-393.

6. Open Networks of Queues: Their Algebraic Structure and Estimating Their Transient Behavior, W. A. Massey, Advances in Applied Probability, 16 (March 1984), pp. 176-201.

III. Understanding Dynamic Queueing Behavior Through Offered Load Analysis

The papers here are related to the study of queueing models for voice traffic. We found that the time-dependent behavior of these systems could be explained not by asymptotics but by using conditional probabilities of what we refer to in classical telephony as the offered load process. The latter is the behavior of a voice or real-time communications system with an infinite supply of resources (i.e., servers). We can also describe the offered load process as simply the total amount of resources requested by the customers at any given time. We made the discovery that the qualitative dynamics of many real-time communication systems can be studied through an exact analysis of the mathematically simpler offered load process, which captures a significant amount of the time-dependent behavior. In particular, we used these methods in [1] to formulate the efficient provisioning of bandwidth for network design.
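A minimal sketch of this idea follows, under the simplifying assumption of exponential service times, for which the mean offered load m(t) solves m'(t) = lam(t) - mu * m(t). The blocking probability of a system with finitely many servers is then approximated by evaluating the Erlang blocking formula at m(t), in the spirit of the modified offered load approximation studied in [4]. The arrival rate function, the parameter values and the function names are illustrative assumptions.

```python
import math


def offered_load_path(arrival_rate, mu, horizon, dt=1e-3, m0=0.0):
    """Euler path of m'(t) = lam(t) - mu * m(t), the mean of the infinite-server queue."""
    steps = int(round(horizon / dt))
    m, path = m0, [m0]
    for k in range(steps):
        m += (arrival_rate(k * dt) - mu * m) * dt
        path.append(m)
    return path


def erlang_b(servers, load):
    """Erlang blocking probability for `servers` servers and offered load `load`."""
    blocking = 1.0
    for k in range(1, servers + 1):
        blocking = load * blocking / (k + load * blocking)
    return blocking


def mol_blocking(servers, load_path):
    """Approximate time-dependent blocking: Erlang B evaluated at m(t)."""
    return [erlang_b(servers, m) for m in load_path]


if __name__ == "__main__":
    def lam(t):
        return 10.0 + 5.0 * math.sin(t)  # hypothetical time-varying call rate

    loads = offered_load_path(lam, mu=1.0, horizon=20.0)
    blocking = mol_blocking(15, loads)
    print(max(loads), max(blocking))
```

Because m(t) depends on the whole history of the arrival rate, its peak lags behind the peak of lam(t), which is one of the time-dependent effects that a pointwise steady state formula would miss.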

1. Provisioning for Bandwidth Sharing and Exchange. R. C. Hampshire, W. A. Massey, D. Mitra and Q. Wang. Telecommunications Network Design and Management, pp. 207-225, Oper. Res./Comput. Sci. Interfaces Ser., 23, Kluwer Acad. Publ., Boston, MA, 2003.

2. The Analysis of Queues with Time-Varying Rates for Telecommunication Models. W. A. Massey, (2002), Telecommunications Systems 21, pp. 173–204.

3. Server Staffing to Meet Time-Varying Demand, O. B. Jennings, A. Mandelbaum, W. A. Massey and W. Whitt, Management Science, 42:10 (October 1996), pp. 1383-1394.

4. An Analysis of the Modified-Offered-Load Approximation for the Nonstationary Erlang Loss Model, W. A. Massey and W. Whitt, Annals of Applied Probability, 4:4 (November 1994), pp. 1145-1160.

5. The Physics of the M(t)/G/infinity Queue, S. Eick, W. A. Massey and W. Whitt, Operations Research, 41 (July-August 1993), pp. 400-408.

6. Networks of Infinite-Server Queues with Nonstationary Poisson Input, W. A. Massey and W. Whitt, Queueing Systems and Their Applications, 13:1 (May 1993), pp. 183-250.

IV. Call Centers, Fluid Models, Dynamical Systems and Classical Mechanics

Markovian service networks are a family of stochastic processes newly formulated and analyzed in [5]. They are inspired by the study of call centers. A natural asymptotic scaling for these networks corresponds to expanding services in proportion to the growth of customer demand. Under this scaling we can establish fluid and diffusion limit theorems. The deterministic fluid limit is a non-linear dynamical system. The diffusion limit is typically a Gaussian process whose mean vector and covariance matrix combine with the fluid limit to form another dynamical system. Designing a call center for profit maximization leads to yet another dynamical system, whose analysis involves applying the calculus of variations to the fluid limit. I should also point out that paper [6] led to my patent for server staffing.
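A fluid model of this kind can be sketched for a single call center with a fixed number of agents and customers who abandon while waiting. The particular equation used below, q'(t) = lam(t) - mu * min(q, c) - beta * max(q - c, 0), is a commonly used special case and is stated here as an assumption rather than as the general network model of [5]. The parameter values and the function name are hypothetical.

```python
def call_center_fluid(arrival_rate, mu, beta, agents, horizon, dt=1e-3, q0=0.0):
    """Euler path of q' = lam(t) - mu * min(q, c) - beta * max(q - c, 0).

    q is the fluid number of customers in the system, c the number of agents,
    mu the service rate per agent and beta the abandonment rate per waiting customer.
    """
    steps = int(round(horizon / dt))
    q, path = q0, [q0]
    for k in range(steps):
        in_service = min(q, agents)
        waiting = max(q - agents, 0.0)
        q += (arrival_rate(k * dt) - mu * in_service - beta * waiting) * dt
        path.append(q)
    return path


if __name__ == "__main__":
    # Hypothetical overloaded center: 15 calls per unit time, 10 agents.
    path = call_center_fluid(lambda t: 15.0, mu=1.0, beta=0.5, agents=10, horizon=40.0)
    print(path[-1])
```

The min and max terms are what make this dynamical system non-linear: the system behaves like an infinite-server model while agents are free and switches to a different linear regime once all agents are busy.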

1. Variational Optimization for Call Center Staffing. R. C. Hampshire and W. A. Massey (2005). Proceedings of the 2005 Richard Tapia Celebration of Diversity in Computing Conference.

2. Fluid Approximation of a Priority Call Center with Time-Varying Arrivals. A. D. Ridley, W. A. Massey and M. Fu (2004). The Telecommunications Review, Volume 15, pp. 69-77.

3. Queue Lengths and Waiting Times for Multi-Server Queues with Abandonment and Retrials. A. Mandelbaum, W. A. Massey, M. I. Reiman, B. Rider and A. Stolyar (2002), Telecommunications Systems 21, pp. 149–172.

4. The Analysis of Queues with Time-Varying Rates for Telecommunication Models. W. A. Massey, (2002), Telecommunications Systems 21, pp. 173–204.

5. Strong Approximations for Markovian Service Networks, A. Mandelbaum, W. A. Massey, and M. I. Reiman, Queueing Systems and Their Applications, 30 (1998) pp. 149-201.

6. Server Staffing to Meet Time-Varying Demand, O. B. Jennings, A. Mandelbaum, W. A. Massey and W. Whitt, Management Science, 42:10 (October 1996), pp. 1383-1394.
