Performance Analysis of Multiclass Queueing Networks via Brownian Approximation

by Xinyang Shen

B.Sc., Jilin University, China, 1993

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF

DOCTOR OF PHILOSOPHY

in

THE FACULTY OF GRADUATE STUDIES (Faculty of Commerce and Business Administration; Operations and Logistics)

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA May 23, 2001

© Xinyang Shen, 2001

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Faculty of Commerce and Business Administration
The University of British Columbia

Abstract

This dissertation focuses on the performance analysis of multiclass open queueing networks using semi-martingale reflecting Brownian motion (SRBM) approximation. It consists of four parts.

In the first part, we derive a strong approximation for a multiclass feedforward queueing network, where jobs after service completion can only move to a downstream service station. Job classes are partitioned into groups. Within a group, jobs are served in the order of arrival; that is, a first-in-first-out (FIFO) discipline is in force, and among groups, jobs are served under a pre-assigned preemptive priority discipline. We obtain an SRBM as the result of strong approximation for the network, through an inductive approach. Based on the strong approximation, some procedures are proposed to approximate the stationary distribution of various performance measures of the queueing network. Our work extends and complements the previous work done on the feedforward queueing network. The numerical examples show that the strong approximation provides a better approximation than that suggested by a straightforward interpretation of the heavy traffic limit theorem.

In the second part, we develop a Brownian approximation for a general multiclass queueing network with a set of single-server stations that operate under a combination of FIFO (first-in-first-out) and priority service disciplines and are subject to random breakdowns. Our intention here is to illustrate how to approximate a queueing network by an SRBM, not to justify such approximation. We illustrate through numerical examples in comparison against simulation that the SRBM model, while not always supported by a heavy traffic limit theorem, possesses good accuracy in most cases, even when the systems are moderately loaded. Through analyzing special networks, we also discuss the existence of the SRBM approximation in relation to the stability and the heavy traffic limits of the networks.

In most queueing network applications, the stationary distributions of queueing networks are of great interest. It becomes natural to approximate these stationary distributions by the stationary distributions of the approximating SRBMs. Although we are able to characterize the stationary distribution of an SRBM, except in a few limited cases it is extremely difficult to obtain the stationary distribution analytically. In the third part of the dissertation, we propose a numerical algorithm, referred to as BNA/FM (Brownian network analyzer with finite element method), for computing the stationary distribution of an SRBM in a hypercube. SRBM in a hypercube serves as an approximate model of queueing networks with finite buffers. Our BNA/FM algorithm is based on the finite element method and an extension of a generic algorithm developed in the previous work. It uses piecewise polynomials to form an approximate subspace of an infinite-dimensional functional space. The BNA/FM algorithm is shown to produce good estimates for stationary probabilities, in addition to stationary moments. This is in contrast to the BNA/SM (Brownian network analyzer with spectral method) developed in the previous work, where global polynomials are used to form the approximate subspace and they sometimes fail to produce meaningful estimates of these stationary probabilities. We also report extensive computational experiences from our implementation that will be useful for future numerical research on SRBMs. A three-station tandem network with finite buffers is presented to illustrate the effectiveness of the Brownian approximation model and our BNA/FM algorithm.

In the last part of the dissertation, we extend the BNA/FM algorithm to calculate the stationary distribution of an SRBM in an orthant. This type of SRBM arises as a Brownian approximation model for queueing networks with infinite buffers. We prove the convergence theorems which justify the extension. A three-machine job shop example is presented to illustrate the accuracy of our extended BNA/FM algorithm. In fact, this extended algorithm is also used in the first two parts of this dissertation to analyze the performance of several queueing network examples, and it gives fairly good performance estimates in most cases.

Table of Contents

Abstract

Table of Contents

List of Figures

List of Tables

Acknowledgements

1 Introduction

2 Strong Approximation for Multiclass Feedforward Queueing Networks
  2.1 Introduction
  2.2 Preliminaries
  2.3 A Multiclass Single Server Station
    2.3.1 Queueing Model
    2.3.2 Preliminary Lemmas
    2.3.3 Functional Law-of-Iterated-Logarithm
    2.3.4 Strong Approximation
    2.3.5 Fine Tuning the Strong Approximation for Sojourn Time
    2.3.6 A Packet Queue Application
  2.4 Multiclass Feedforward Networks
    2.4.1 Queueing Network Model
      2.4.1.1 Primitive Data and Assumptions
      2.4.1.2 Performance Measures and Their Dynamics
    2.4.2 Main Result
  2.5 Performance Analysis Procedure
    2.5.1 Product Form Solution
  2.6 Numerical Examples
    2.6.1 Single Station With Two Job Classes
    2.6.2 Two-station Tandem Queue
  2.7 Appendix
    2.7.1 Proofs and An Elementary Lemma
    2.7.2 General Traffic Intensity Case

3 Brownian Approximations of Multiclass Queueing Networks
  3.1 Introduction
  3.2 Model Formulation
    3.2.1 Notation and Conventions
    3.2.2 The Primitive Processes
    3.2.3 The Derived Processes
  3.3 Conditions on the Primitive Processes
  3.4 The SRBM Approximation
  3.5 Discussions and Variations
    3.5.1 Issues Surrounding the SRBM
    3.5.2 Kumar-Seidman Network
    3.5.3 Alternative Approximation with State-Space Collapse
  3.6 Special Cases and Numerical Results
    3.6.1 A Single-Class Single-Server Queue with Breakdown
    3.6.2 A Generalized Jackson Queueing Network
    3.6.3 A Variation of the Bramson Network
    3.6.4 A More-Complex Multiclass Network
  3.7 Concluding Remarks

4 Computing the Stationary Distribution of SRBM in a Hypercube
  4.1 Introduction
  4.2 SRBM in a Hypercube
  4.3 The BNA/FM Algorithm
    4.3.1 The Generic Algorithm
    4.3.2 The BNA/FM Algorithm for SRBM in a hypercube
  4.4 Computational Issues of the BNA/FM Algorithm
    4.4.1 Solving Linear Systems of Equations
      4.4.1.1 Sparseness and Computing Memory
      4.4.1.2 Direct Method
      4.4.1.3 Iterative Method
    4.4.2 Computational Complexity
    4.4.3 Mesh Selection
    4.4.4 Ill-Conditioned System Matrix
    4.4.5 Scaling
  4.5 Numerical Examples
    4.5.1 Comparison with SC Solution
    4.5.2 A 3-dimensional SRBM with Product Form Solution
  4.6 A Queueing Network Application
  4.7 Concluding Remarks
  4.8 Appendix
    4.8.1 Integrals Calculations
      4.8.1.1 Computing I1
      4.8.1.2 Computing I2
      4.8.1.3 Computing I3
      4.8.1.4 Computing I4
      4.8.1.5 Computing I5
      4.8.1.6 Computing I6
      4.8.1.7 Computing I7
      4.8.1.8 Computing IBk
      4.8.1.9 Computing IBk
    4.8.2 Basic Integrals
    4.8.3 Compute the Basic Integrals
    4.8.4 Integrals of Hermite Basis Functions

5 Computing the Stationary Distribution of an SRBM in an Orthant
  5.1 Introduction
  5.2 Definition and Preliminaries
  5.3 The BNA/FM Algorithm for an SRBM in an Orthant
  5.4 Convergence of the Algorithm
  5.5 A Job Shop Example

References

List of Figures

2.1 A multiclass feedforward queueing network
2.2 Single station with two job classes
2.3 Two-station tandem queue

3.1 Kumar-Seidman Network
3.2 A Variation of Bramson Network
3.3 A generalized Jackson queueing network
3.4 Queue length by station in a variation of Bramson network with Poisson arrivals and exponential services
3.5 A two-station multiclass network
3.6 Cumulative stationary distributions of waiting time in the complex queueing network: group 1 jobs
3.7 Cumulative stationary distributions of waiting time in the complex queueing network: group 3 jobs

4.1 A finite element mesh of a two-dimensional hypercube state space
4.2 Percentage errors of approximate marginal stationary distribution P1
4.3 Percentage errors of approximate marginal stationary distribution P2

5.1 Queueing Network Model of the Job Shop Facility
5.2 The marginal stationary distribution of the waiting time of machine 1 in the job shop
5.3 The marginal stationary distribution of the waiting time of machine 2 in the job shop
5.4 The marginal stationary distribution of the waiting time of machine 3 in the job shop

List of Tables

2.1 Average queue length in a single-station network
2.2 System specifications of two-station tandem queue
2.3 Average queue length of network 1
2.4 Average queue length of network 2

3.1 Parameters of single station queue with breakdown
3.2 Average sojourn time in the single-class single-server queue with breakdown
3.3 Parameters of the generalized Jackson queueing network
3.4 Average sojourn time of the generalized Jackson queueing network
3.5 Average waiting times in the Bramson network
3.6 Average sojourn time of each job class in the Bramson network
3.7 Mean service times of three cases of the complex queueing network
3.8 Average sojourn time of each job class in the complex queueing network

4.1 Comparisons of different scaling
4.2 Estimates of stationary means from different algorithms for a special two-dimensional SRBM
4.3 Comparisons for a 3-dimensional SRBM with product form stationary density
4.4 The parameters of the tandem queue
4.5 The average queue lengths of the tandem queue
4.6 The average throughput rates of the tandem queue
4.7 The average job loss rates of the tandem queue
4.8 The iterations of the SRBM approximation for average queue lengths of the tandem queue
4.9 The iterations of the SRBM approximation for average throughput rates of the tandem queue
4.10 The iterations of the SRBM approximation for average job loss rates of the tandem queue
4.11 The comparison of tail probabilities of system No. 1 for the tandem queue
4.12 The comparison of tail probabilities of system No. 2 for the tandem queue

5.1 Means of inter-arrival times and service times of the job shop
5.2 Average waiting time at each machine at the job shop
5.3 The iterations of the SRBM approximations for average waiting times for the job shop example
5.4 The iterations of the SRBM approximation errors for traffic intensities for the job shop example
5.5 The iterations of the SRBM approximation for job loss rates for the job shop example

Acknowledgements

I am sincerely grateful to my advisor, Professor Hong Chen, for introducing me to stochastic processes and reflected Brownian motions, and for his valuable advice, support and many contributions to this dissertation.

Throughout my graduate studies at the University of British Columbia, I have benefited a great deal from Professors Derek Atkins, Shelby Brumelle, Martin Puterman, and Murray Frank. They have been generous with their time and ideas, and have given invaluable advice on the directions of my research. I thank Professors Daniel Granot, Frieda Granot, Thomas McCormick, and Maurice Queyranne for their guidance and encouragement in the early stage of my career at the University of British Columbia.

During the course of research for this dissertation, I have also benefited from ideas inspired by conversations with Professors Jim Dai at the School of Industrial and Systems Engineering, Georgia Institute of Technology, and David Yao in the Department of Industrial Engineering and Operations Research at Columbia University. I am grateful for their valuable contributions to this dissertation.

I wish to thank the Faculty of Commerce and Business Administration and the Natural Sciences and Engineering Research Council of Canada (NSERC) for the MacPhee Fellowship and Postgraduate Scholarship.

I would also like to thank my fellow graduate students for their help and friendship, including Xiaomei Ding, Qing Li, Li Liu, Yong Liu, Grey Sosic, Dongsheng Zhou and Yonggan Zhao.

Finally, I want to thank my parents and my sisters for their constant support throughout the many years of my education. Among all those people who have helped me along the way, none is more important for the completion of this dissertation than my wife, Haitao Huang. She has provided all her love, support and understanding through many disappointments as well as accomplishments.

Chapter 1 Introduction

Queueing network models have provided a very useful tool for the design and operations management of manufacturing systems and communication networks. (See, for example, Jackson (1963), Kleinrock (1976), Buzacott and Shanthikumar (1992), and Yao (1994).) They have been used for performance analysis and for the dynamic and optimal control of these systems. In this dissertation, we focus on performance analysis of multiclass open queueing networks with a mixture of FIFO (first-in-first-out) and static priority service disciplines via Brownian approximation.

Despite much effort, exact analysis of the performance of queueing networks has been largely limited to some special queueing networks which have a product-form stationary distribution; see, for example, Baskett et al. (1975) and Kelly (1979). Given the complexity of queueing networks, it is known to be almost impossible to obtain exact analytical results for general queueing networks. Therefore, many approximations have been developed to evaluate the performance of queueing networks. Brownian approximation (also known as diffusion approximation) and decomposition approximation are the two most important approximations that have been actively studied and applied.

Decomposition-based approximation (Shanthikumar and Buzacott (1981), Whitt (1983), Bitran and Tirupati (1988), Segal and Whitt (1989), and Whitt (1994)), also known as the parametric decomposition method, approximates the arrival process to each station in a network by a renewal process and decomposes the whole network into single stations. Then the performance of each station can be obtained via the well known G/G/1 model, and the performance of the network can be obtained by a straightforward aggregation.

The Brownian approximation refers to approximating certain processes of interest, such as the workload, queue length and sojourn time processes of queueing networks, by a semi-martingale reflecting Brownian motion (SRBM), or its special case, a reflecting Brownian motion (RBM). It is based on the heavy traffic limit theorem and the strong approximation. Substantial effort has been made in the past 30 years to establish limit theorems that support the Brownian approximation. The first part of this dissertation, presented in Chapter 2, represents such an effort. In that chapter, an RBM model is derived for a multiclass feedforward queueing network under a combination of FIFO and static priority disciplines. The approximation is rigorously justified by strong approximation theorems. In our approach, the workload processes of higher priority job classes in the queueing network are approximated by non-zero diffusion processes. If traditional heavy traffic limit theorems were used, those workload processes would be approximated by zero, which is obviously not a good approximation. The numerical examples indicate that the RBM derived from the strong approximation does provide a much better approximation model than the one derived from the diffusion approximation.

Based on the strong approximation developed in Chapter 2, we heuristically derive a Brownian approximation for a more general multiclass queueing network in Chapter 3. In this network, the single-server stations are operated under a combination of FIFO and priority service disciplines, and are subject to random breakdowns. The focus in that chapter is not on proving limit theorems so as to justify why the network in question can be approximated by an SRBM. Rather, our intention and contribution is to illustrate how to approximate a general network by an SRBM. We illustrate through numerical examples in comparison against simulation that the SRBM model we propose, while not always supported by a heavy traffic limit theorem and strong approximation, possesses good accuracy in most cases, even when the networks are moderately loaded. Through both analytical and numerical results, we also identify cases in which the SRBM may not exist or may work poorly.

In most queueing network applications, it is important to obtain the stationary distributions of queueing networks, which can be used for both performance analysis and optimal control. With an SRBM being an approximation for queueing networks, the stationary distribution of an SRBM is the natural candidate for approximating the stationary distribution of the corresponding queueing network. However, except for some special cases, the analytical solution for the stationary distribution of an SRBM is difficult to derive; therefore, numerical methods have been sought. In Chapters 4 and 5, we develop two numerical algorithms for computing the stationary distributions of SRBMs in a hypercube and in an orthant, respectively. SRBM in a hypercube has been identified as an approximate model of open queueing networks with finite buffers, and SRBM in an orthant has been identified as an approximate model of open queueing networks with infinite buffers. The computational algorithm for SRBM in an orthant provides a numerical tool for obtaining the performance estimates of the multiclass infinite buffer queueing networks studied in Chapters 2 and 3. The core of this algorithm is based on the computational algorithm for an SRBM in a hypercube which is developed in Chapter 4. Other than being an essential component of the numerical algorithm for an SRBM in an orthant, the algorithm developed in Chapter 4 also provides a numerical tool for obtaining the performance estimates of finite buffer queueing networks.

In Chapter 4, we present and implement a general numerical algorithm, which we call BNA/FM (Brownian network analyzer with finite element method), for computing the stationary distribution of an SRBM in a hypercube. The BNA/FM algorithm is shown to be very stable and to produce good estimates for stationary distributions, in addition to stationary moments. We report extensive computational experiences from our implementation that will be useful for future numerical research on SRBMs. We also present a three-station tandem network with finite buffers in order to illustrate the effectiveness of the combination of the Brownian approximation and our BNA/FM algorithm.

Due to the noncompactness of the orthant state space, it is impossible to apply the BNA/FM algorithm to an SRBM in an orthant directly. In Chapter 5, we propose an approximation that extends the BNA/FM algorithm to compute the stationary distribution of an SRBM in an orthant. First, for an SRBM in an orthant, we construct a sequence of SRBMs in hypercubes which converge weakly to the SRBM in an orthant. Next, we apply the BNA/FM algorithm proposed in Chapter 4 to calculate the stationary distributions of the constructed SRBMs defined in hypercubes. In this way, we obtain an approximation to the stationary distribution of the SRBM in an orthant. As theoretical support, we show that the constructed SRBMs in hypercubes converge weakly to the SRBM in an orthant. In addition, by conjecturing that the stationary distributions of the constructed SRBMs in hypercubes are relatively compact, we are able to show that the stationary distributions of the SRBMs in hypercubes converge in distribution to the stationary distribution of the SRBM in an orthant. At the end, we analyze a three-station job shop example by using the extended algorithm, and the results are very encouraging. The extended algorithm is also applied in Chapters 2 and 3 of this dissertation and provides fairly good performance estimates for several queueing network examples.

Chapter 2 Strong Approximation for Multiclass Feedforward Queueing Networks

2.1 Introduction

In this chapter, we first consider a single-server station serving multiple classes of jobs, where jobs of different classes may have different arrival and service time distributions. Job classes are partitioned into groups. Within a group, jobs are served in the order of arrival (i.e., a first-in-first-out service discipline is in force), and among groups, jobs are served under a pre-assigned (static) preemptive priority discipline. Our key result is to establish that if the input data (i.e., the arrival and the service processes) satisfy an approximation (such as the functional law-of-iterated-logarithm (FLIL) approximation or the strong approximation), then the output data (i.e., the departure processes) and the performance measures (such as the queue length, the workload and the sojourn time processes) satisfy a similar approximation. In an obvious way, this result extends to a feedforward multiclass network, where jobs can only move from a lower numbered station to a higher numbered station. We identify explicitly the strong approximation limit under the network setting.

This chapter relates to, extends and complements some existing works of Peterson (1991), Harrison and Williams (1992) and Chen and Mandelbaum (1994). Peterson (1991) first studied a multiclass feedforward queueing network to derive a heavy traffic limit theorem or a diffusion approximation theorem under a heavy traffic condition. It is shown that the limit can be described by a J-dimensional reflecting Brownian motion (RBM), where J equals the number of service stations in the network. In particular, a state space collapse phenomenon is observed for higher priority job classes, namely, the limit of the workload or the queue length of high priority jobs is zero. Thus, a direct application of this theorem would yield zero as an approximation to the queue length or the workload of a higher priority class. Clearly this is not satisfactory. Usually heuristics are offered in this case to provide a better approximation. Harrison and Williams (1992) studied the reflecting Brownian motion suggested by the heavy traffic limit theorem and obtained a necessary and sufficient condition for the existence of a product form stationary distribution of the Brownian model. On the other hand, Chen and Mandelbaum (1994) derived a strong approximation for a generalized Jackson network; it was explained that the strong approximation refines the heavy traffic limit theorem in that it

provides the rate of convergence and it does not require the network to be under a heavy traffic condition. This chapter extends the strong approximation analysis in Chen and Mandelbaum (1994) to the multiclass feedforward network. The strong approximation yields appropriate approximations for the network's workload, queue length and sojourn time processes of all job classes (not just the lowest priority job class). The numerical examples indicate that the strong approximation does provide a much better approximation than the approximation suggested by the diffusion approximation, for both higher and lower priority classes.

There is a large literature on diffusion approximations, and readers are referred to survey papers by Whitt (1974), Lemoine (1978), Glynn (1990), Harrison and Nguyen (1993) and Williams (1996). Csörgő, Deheuvels and Horváth (1987), Zhang, Hsu and Wang (1990), and Glynn and Whitt (1991a, 1991b) are among the first to study the strong approximation for queueing systems. Horváth (1990) and Chen and Mandelbaum (1994) obtain the strong approximation for open generalized Jackson networks, and Zhang (1997) for closed generalized Jackson networks. The strong approximation has also been used to study time-dependent queues (see Mandelbaum and Massey (1995) and Mandelbaum, Massey and Reiman (1998)) and state-dependent queues (see Mandelbaum and Pats (1998)).

The chapter is organized as follows. In the next section, we introduce the functional law-of-iterated-logarithm (FLIL) and the functional strong approximation theorem (FSAT) for two fundamental processes, which set a basis for our analysis of the queue. In Section 2.3, we obtain the FLIL and the FSAT for a single station queue, which are the key results of the chapter. In Section 2.4, by viewing some of the results in the previous section as an input-output theorem, we extend the strong approximation to the multiclass feedforward network. Section 2.5 provides a procedure to approximate by a reflecting Brownian motion (RBM) a feedforward queueing network with renewal (exogenous) arrival processes, i.i.d. service times and Markovian routing. This section is almost self-contained, to make it convenient for those readers who only need to obtain a strong approximation. Numerical examples are given in Section 2.6 to compare the performance measure estimates given by the simulation, the approximations suggested by the strong approximation, and the approximation suggested by the diffusion approximation. In order to simplify the presentation, we assume that the traffic intensity is no greater than 1 throughout Sections 2.3-2.6; this would include almost all cases of practical interest. The more general case, where the traffic intensity may be strictly greater than 1, is summarized in the appendix for a single-station queue.

We denote by $\Re^K$ the $K$-dimensional Euclidean space, and by $\Re_+^K = \{x \in \Re^K : x \ge 0\}$ its nonnegative orthant. Let $\Re = \Re^1$ and $\Re_+ = \Re_+^1$. All vectors are assumed to be column vectors, and the prime ($'$) is used to denote the transpose of a vector and a matrix. We denote by $e^j$ the $j$th unit vector (whose $j$th element equals 1 and all other elements equal zero) and by $e$ a vector of 1's (whose elements all equal 1), both in an appropriate dimension from the context. For $x = (x_k)_{k=1}^K \in \Re^K$, define the norm $\|x\| = \max_{1 \le k \le K} |x_k|$. Let $\mathcal{D}^K$ be the set of $K$-dimensional functions which are right-continuous and have left limits (RCLL), and let $\mathcal{D}_0^K = \{x \in \mathcal{D}^K : x(0) \ge 0\}$. For $X = (X_k) \in \mathcal{D}^K$, define the norm
$$\|X\|_T = \sup_{0 \le t \le T} \|X(t)\| = \sup_{0 \le t \le T} \max_{1 \le k \le K} |X_k(t)|.$$
Sometimes for convenience, we write $\|X(t)\|_T$ for $\|X\|_T$. The composition $\{x(y(t)),\ t \ge 0\}$ of $x : \Re_+ \to \Re^K$ with $y : \Re_+ \to \Re_+^K$ is the function from $\Re_+$ to $\Re^K$ whose $k$th coordinate is the real-valued function $\{x_k(y_k(t)),\ t \ge 0\}$, $k = 1, \ldots, K$.

2.2 Preliminaries

In this section, we consider two fundamental processes. For ease of exposition, we present all results for one-dimensional processes. Since all the results are pathwise on an appropriate probability space, they have obvious generalizations to multidimensional cases, and without explicitly stating so, we shall quote these generalizations in the latter sections. Let $X \in \mathcal{D}$, and let $Y = \{Y(t),\ t \ge 0\}$ denote its inverse, defined by
$$Y(t) = \sup\{s \ge 0 : X(s) \le t\}. \qquad (2.1)$$
One important example of the above pair is that $X$ is the partial sum of a sequence $\xi = \{\xi_i,\ i = 1, 2, \ldots\}$ of nonnegative i.i.d. random variables, namely,
$$X(t) := \sum_{i=1}^{\lfloor t \rfloor} \xi_i, \qquad t \ge 1, \qquad (2.2)$$
(with $X(t) = 0$ for $t < 1$), and its corresponding $Y$ (defined by (2.1)) is a renewal process. We have the following result of the functional law-of-iterated-logarithm for the pair.

Theorem 2.1 Consider the $(X, Y)$ pair as introduced above. Suppose that
$$\|X(t) - \bar X(t)\|_T = \sup_{0 \le t \le T} |X(t) - \bar X(t)| \overset{\mathrm{a.s.}}{=} O(\sqrt{T \log\log T}) \qquad (2.3)$$
as $T \to \infty$, where $\bar X(t) = mt$ with $m > 0$. Then
$$\|Y(t) - \bar Y(t)\|_T = \sup_{0 \le t \le T} |Y(t) - \bar Y(t)| \overset{\mathrm{a.s.}}{=} O(\sqrt{T \log\log T}) \qquad (2.4)$$
as $T \to \infty$, where $\bar Y(t) = t/m$.

The proof of this theorem follows almost the same lines as the proof of Theorem 2.1.3 in Csörgő and Horváth (1993). For any $X$ having a bound like (2.3), we say that $X$ has a FLIL approximation $\bar X$. We note that the FLIL approximation (2.3) implies a functional strong-law-of-large-numbers (FSLLN) limit result. Namely, it implies that almost surely,
$$\bar X^n(t) := \frac{1}{n} X(nt) \to \bar X(t) = mt, \qquad \text{as } n \to \infty,$$
and the convergence is uniform on any compact set (u.o.c.). Next, we consider a refined approximation, namely, the functional strong approximation, for the processes $X$ and $Y$.

Theorem 2.2 (FSAT) Consider the $(X, Y)$ pair as introduced above. Suppose that for some $r > 2$,
$$\|X - \hat X\|_T \overset{\mathrm{a.s.}}{=} o(T^{1/r}), \qquad (2.5)$$
as $T \to \infty$, with
$$\hat X(t) = mt + \sigma B(t), \qquad t \ge 0,$$
where $m$ and $\sigma$ are positive constants, and $B = \{B(t),\ t \ge 0\}$ is a standard Brownian motion (i.e., a Wiener process). Then we have
$$\|Y - \hat Y\|_T \overset{\mathrm{a.s.}}{=} o(T^{1/r'}), \qquad (2.6)$$
with $r' = r$ if $r < 4$ and arbitrary $r' < 4$ if $r \ge 4$, and with
$$\hat Y(t) = ut - u\sigma B(ut), \qquad t \ge 0,$$
where $u = 1/m$. Furthermore, when $X$ is the partial sum of nonnegative i.i.d. random variables as given by (2.2), if $\xi_1$ has a finite $r$th moment with $r > 2$, then we can assume that $X$ and $Y$ are defined on a probability space on which there exists a standard Brownian motion $B = \{B(t),\ t \ge 0\}$ such that both (2.5) and (2.6) hold, with $m = E(\xi_1)$ and $\sigma$ being the standard deviation of $\xi_1$.

This theorem follows from Theorems 2.1.1 and 2.1.3 of Csörgő and Horváth (1993), where they actually give a slightly better bound in (2.6) when (2.5) holds with $r \ge 4$. We note that the strong approximation (2.5) leads to a functional central limit theorem (FCLT) limit. Specifically, let
$$\hat X^n(t) = \frac{1}{\sqrt{n}} [X(nt) - nmt];$$
the approximation (2.5) implies that
$$\hat X^n \Rightarrow X^* \qquad \text{as } n \to \infty,$$
where $X^*$ is a driftless Brownian motion with standard deviation $\sigma$, and "$\Rightarrow$" denotes weak convergence in $\mathcal{D}$ (refer to, for example, Billingsley (1968) and Whitt (1980)).

Now, we introduce the notion of $r$-strong continuity that will be used extensively in this chapter. A function $x \in \mathcal{D}^K$ is said to be strong continuous with degree $r$, or $r$-strong continuous, for some $r \in (2,4)$, if
$$\sup_{0 \le u, v \le T,\ |u - v| \le h(T)} \|x(u) - x(v)\| = o(T^{1/r}), \qquad \text{as } T \to \infty, \qquad (2.7)$$
where $h(T) = \sqrt{T \log\log T}$, and it is simply said to be strong continuous if it is $r$-strong continuous for all $r \in (2,4)$. We note that an $r$-strong continuous function may not be continuous. A stochastic process $X = \{X(t),\ t \ge 0\}$ in $\mathcal{D}^K$ is said to be an $r$-strong continuous process for some $r \in (2,4)$ if, with probability 1, its sample paths are $r$-strong continuous. For simplicity, we shall assume throughout this chapter that all $r$-strong continuous stochastic processes are defined on such a probability space. A stochastic process is simply said to be strong continuous if it is $r$-strong continuous for all $r \in (2,4)$. We say a stochastic process $X$ has a strong approximation if, for some $r \in (2,4)$, there exists a probability space on which a version of $X$ (for simplicity we still write it as $X$) and an $r$-strong continuous stochastic process $\tilde X$ are defined such that
$$\sup_{0 \le t \le T} |X(t) - mt - \tilde X(t)| \overset{\mathrm{a.s.}}{=} o(T^{1/r}),$$
where $m$ is a (deterministic) constant. When the above equality holds, we also say that $X$ has a strong approximation $\hat X = \{\hat X(t),\ t \ge 0\}$ with $\hat X(t) = mt + \tilde X(t)$.
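As a concrete illustration of Theorem 2.2 (this small simulation is not part of the thesis; the gamma interarrival distribution and all parameter values below are our own choices), one can check numerically that the counting process $Y$ fluctuates around $ut$ with the variance $u^3\sigma^2 t$ dictated by its Brownian approximation $\hat Y(t) = ut - u\sigma B(ut)$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Interarrival times: gamma with shape a and scale s (mean m = a*s, variance sigma^2 = a*s^2).
a, s = 2.0, 0.5
m, sigma = a * s, np.sqrt(a) * s          # mean and std of one interarrival time
u = 1.0 / m                               # renewal (arrival) rate

T = 200.0                                 # time horizon
n_paths = 2000
counts = np.empty(n_paths)

for i in range(n_paths):
    # Generate enough interarrival times to cover [0, T] with high probability.
    n = int(2 * u * T + 10 * np.sqrt(u * T) + 50)
    arrivals = np.cumsum(rng.gamma(a, s, size=n))
    counts[i] = np.searchsorted(arrivals, T, side="right")   # Y(T) = #{arrivals <= T}

# Brownian (strong) approximation: Y(t) ~ u*t - u*sigma*B(u*t),
# so E[Y(T)] ~ u*T and Var[Y(T)] ~ u^3 * sigma^2 * T.
print("simulated  E[Y(T)] =", counts.mean(), "  approx:", u * T)
print("simulated Var[Y(T)] =", counts.var(), "  approx:", u**3 * sigma**2 * T)
```

Any interarrival distribution with a finite $r$th moment ($r > 2$) could be used here; only $m$ and $\sigma$ enter the approximation.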

Lemma 2.3 (i) A Wiener process (i.e., a standard Brownian motion) is a strong continuous process.

(ii) If a process has a strong approximation, then it must have a FLIL approximation. Specifically, if the process $X$ satisfies
$$\|X(t) - mt - \tilde X(t)\|_T \overset{\mathrm{a.s.}}{=} o(T^{1/r})$$
with $\tilde X = \{\tilde X(t),\ t \ge 0\}$ being $r$-strong continuous, then we have
$$\|X(t) - mt\|_T \overset{\mathrm{a.s.}}{=} O(\sqrt{T \log\log T}).$$

(iii) If a process $X = \{X(t),\ t \ge 0\} \in \mathcal{D}$ has a FLIL approximation, then there exist positive $M$ and $T_0$ such that with probability 1,
$$\|X(t)\| \le Mt \qquad \text{for } t \ge T_0.$$

(iv) A linear combination of $r$-strong continuous functions is $r$-strong continuous, and a (deterministic) linear combination of $r$-strong continuous processes is $r$-strong continuous.

(v) Let $\tilde X = \{\tilde X(t),\ t \ge 0\}$ be an $r$-strong continuous process, and let $\tau = \{\tau(t),\ t \ge 0\}$ be a process with $\tau(t) \in [0, \infty)$ for all $t \ge 0$ having a FLIL approximation:
$$\|\tau(t) - at\|_T \overset{\mathrm{a.s.}}{=} O(\sqrt{T \log\log T}).$$
Then we have
$$\|\tilde X(\tau(t)) - \tilde X(at)\|_T \overset{\mathrm{a.s.}}{=} o(T^{1/r}).$$

(vi) Let $\xi = \{\xi_n,\ n \ge 1\}$ be a sequence of random variables and let
$$X(t) = \sum_{n=1}^{\lfloor t \rfloor} \xi_n.$$
Assume that $X$ has a strong approximation,
$$\|X(t) - mt - \tilde X(t)\|_T \overset{\mathrm{a.s.}}{=} o(T^{1/r}).$$
Suppose that the process $E$ has a strong approximation,
$$\|E(t) - \beta t - \tilde E(t)\|_T \overset{\mathrm{a.s.}}{=} o(T^{1/r}).$$
Let $Y(t) = X(E(t))$, $t \ge 0$. Then $Y = \{Y(t),\ t \ge 0\}$ has the following strong approximation:
$$\|Y(t) - m\beta t - m\tilde E(t) - \tilde X(\beta t)\|_T \overset{\mathrm{a.s.}}{=} o(T^{1/r}).$$

Proof. Part (i) of this lemma is a special case of Lemma 3.6.3 in Chen and Mandelbaum (1994), and parts (ii)-(v) clearly follow from the definitions of the strong continuity, the FLIL approximations and the strong approximations. For (vi), we have
$$\|Y(t) - m\beta t - m\tilde E(t) - \tilde X(\beta t)\|_T
\le \|X(E(t)) - mE(t) - \tilde X(E(t))\|_T + m\|E(t) - \beta t - \tilde E(t)\|_T + \|\tilde X(E(t)) - \tilde X(\beta t)\|_T
\overset{\mathrm{a.s.}}{=} o(T^{1/r}),$$
where the last equality follows from the strong approximation assumptions for $X$ and $E$, parts (iii) and (v) of the lemma and the strong continuity of $\tilde X$. $\square$

We state an additional property of strong continuity, which relates to the regulator of the one-dimensional reflection mapping. (Refer to Harrison (1985) for the definition and the properties of the one-dimensional reflection mapping.) The proof of the following proposition can be found in Appendix 2.7.1.

Proposition 2.4 Suppose that $x \in \mathcal{D}_0$ is an $r$-strong continuous function ($2 < r < 4$). Let
$$\tilde y(t) = \sup_{0 \le s \le t} [-\theta s - x(s)]^+ - [-\theta]^+ t. \qquad (2.8)$$
Then $\tilde y$ is an $r$-strong continuous function.

Finally, we state a bound for a special class of reflecting Brownian motions (which we shall show may arise as the strong approximation for the feedforward multiclass queueing network under study in this chapter). Let $X$ be a $K$-dimensional Brownian motion starting at $X(0) = x \in \Re_+^K$ with drift $\theta$ and covariance matrix $\Gamma$ (on an appropriate probability space). Let $R$ be a $K \times K$ lower-triangular matrix with positive diagonal elements. Then, by inductively applying the one-dimensional reflection mapping, we can show that there exists a unique pair $(Y, Z)$ satisfying
$$Z = X + RY \ge 0,$$
$$Y \text{ is nondecreasing with } Y(0) = 0,$$
$$\int_0^\infty Z_k(t)\, dY_k(t) = 0, \qquad k = 1, \ldots, K.$$
The unique $Z$ is called the reflecting Brownian motion and $Y$ the regulator of the reflecting Brownian motion, associated with the data $(x, \theta, \Gamma, R)$.
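The inductive construction just described can be carried out numerically. The sketch below is only an illustration (the drift, covariance and reflection matrix are made-up data, and a discretized path stands in for the Brownian motion $X$): it applies the one-dimensional reflection map coordinate by coordinate, exploiting the lower-triangular structure of $R$.

```python
import numpy as np

def reflect_lower_triangular(X, R):
    """X: discretized path of shape (n, K) with X[0] in the nonnegative orthant.
    R: (K, K) lower-triangular matrix with positive diagonal entries.
    Returns (Y, Z) with Z[t] = X[t] + R @ Y[t] >= 0 componentwise, Y nondecreasing,
    Y[0] = 0, and Y_k increasing only while Z_k = 0 (up to discretization error)."""
    n, K = X.shape
    Y = np.zeros((n, K))
    Z = np.zeros((n, K))
    for k in range(K):
        # Net input to coordinate k once the regulators of the earlier coordinates are known.
        U = X[:, k] + Y[:, :k] @ R[k, :k]
        # One-dimensional reflection (Skorokhod) map: Y_k(t) = sup_{s<=t} [-U(s)]^+ / R_kk.
        Y[:, k] = np.maximum.accumulate(np.maximum(-U, 0.0)) / R[k, k]
        Z[:, k] = U + R[k, k] * Y[:, k]
    return Y, Z

# Made-up data: a 2-dimensional Brownian path with negative drift, started at the origin.
rng = np.random.default_rng(1)
n, dt = 20000, 1e-3
theta = np.array([-1.0, -0.5])
cov_factor = np.linalg.cholesky(np.array([[1.0, 0.3], [0.3, 2.0]]))
increments = theta * dt + (rng.standard_normal((n, 2)) @ cov_factor.T) * np.sqrt(dt)
X = np.vstack([np.zeros((1, 2)), np.cumsum(increments, axis=0)])
R = np.array([[1.0, 0.0], [-0.4, 1.0]])
Y, Z = reflect_lower_triangular(X, R)
print("min of Z over the path (should be >= 0):", Z.min())
```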

Theorem 2.5 Let $Z$ be a $K$-dimensional reflecting Brownian motion associated with the data $(x, \theta, \Gamma, R)$, where $R$ is a lower-triangular matrix with positive diagonal elements. Suppose that $R^{-1}\theta < 0$. Then
$$\sup_{0 \le t \le T} \|Z(t)\| \overset{\mathrm{a.s.}}{=} O(\log T), \qquad (2.9)$$
and hence, for any $r > 0$,
$$\sup_{0 \le t \le T} \|Z(t)\| \overset{\mathrm{a.s.}}{=} o(T^{1/r}).$$

The proof of this theorem is in Appendix 2.7.1.

2.3 A Multiclass Single Server Station

We formally describe the queueing model in Section 2.3.1, and then establish its FLIL theorem and its strong approximation theorem in Sections 2.3.3 and 2.3.4, respectively. In Section 2.3.5, we provide an improved strong approximation for the sojourn times. We provide a packet queue application in Section 2.3.6.

2.3.1 Queueing Model

The queueing model under consideration is a single-server station serving $K$ classes of jobs. Let $\mathcal{K} = \{1, \ldots, K\}$ denote the set of job class indices. Jobs of all classes arrive exogenously, wait for service and, after service completion, leave the system. To specify the service discipline, we partition $\mathcal{K}$ into $L$ groups, $1, \ldots, L$, and let $g_\ell$ denote the set of classes belonging to group $\ell$. For any $\ell < \ell'$, a job of a class in $g_\ell$ has a higher preemptive-resume priority over any job of any class in $g_{\ell'}$; as a result, the presence of the latter job has no impact on the former job. In this sense, for $\ell < \ell'$, a job of a class in $g_\ell$ does not see any job of any class in $g_{\ell'}$. Within each group, jobs are served under the FIFO discipline. Let $\pi$ be a mapping from $\mathcal{K}$ (the job class index set) to $\mathcal{L} := \{1, \ldots, L\}$ (the job group index set); specifically, $k \in g_{\pi(k)}$; that is, class $k$ is in group $\pi(k)$. A job of class $k$ is referred to as a (group) $\ell$ job, or it is said to be in $g_\ell$, if $k \in g_\ell$. The station is assumed to have an infinite waiting room. We note that with $L = 1$, the station models a multiclass queue under a (pure) FIFO service discipline, and with $L = K$ (implying each group $\ell$ contains a single class), the station models a multiclass queue under a (pure) priority service discipline.

The queue is described by the following primitive data: $K$ counting processes $A_k = \{A_k(t),\ t \ge 0\}$ ($k \in \mathcal{K}$), and $K$ sequences of nonnegative random variables $v_k = \{v_k(n),\ n \ge 1\}$ ($k \in \mathcal{K}$), all defined on the same probability space. The (integer-valued) quantity $A_k(t)$ indicates the number of class $k$ jobs that have arrived (exogenously) to the system during $[0,t]$. The random variable $v_k(n)$ is the service time required by the $n$th class $k$ job. We assume that initially there are no jobs in the system.

We introduce some notation. Let $u_k(1)$ be the arrival time of the first class $k$ job and $u_k(n)$, $n > 1$, the interarrival time between the $(n-1)$st and the $n$th class $k$ jobs (corresponding to the jump points of $A_k$). We call $u = (u_k)$ with $u_k = \{u_k(n),\ n \ge 1\}$ the interarrival time sequence, and call $v = (v_k)$ the service time sequence. We introduce the summation,
$$V_k(0) = 0, \qquad V_k(n) = \sum_{m=1}^{n} v_k(m), \qquad n \ge 1,\ k \in \mathcal{K},$$
and define its associated counting process,
$$S_k(t) = \sup\{n \ge 0 : V_k(n) \le t\}.$$
Let $V_k(t) = V_k(\lfloor t \rfloor)$ for any $t \ge 0$. Let $V = (V_k)$, $A = (A_k)$ and $S = (S_k)$. We call $A$ an exogenous arrival process, and $S$ a service process. The performance measures of interest are the $L$-dimensional aggregated workload process $Z = (Z_\ell)$ with $Z_\ell = \{Z_\ell(t),\ t \ge 0\}$ ($\ell = 1, \ldots, L$), the $K$-dimensional queue length process $Q = (Q_k)$ with $Q_k = \{Q_k(t),\ t \ge 0\}$ ($k = 1, \ldots, K$), and the $L$-dimensional cumulative idle time process $Y = (Y_\ell)$ with $Y_\ell = \{Y_\ell(t),\ t \ge 0\}$ ($\ell = 1, \ldots, L$). All of these processes are nonnegative. The quantity $Z_\ell(t)$ indicates the total amount of current work for the station embodied in jobs that are in groups 1 to $\ell$ and that are queued or in service at the station at time $t$. The quantity $Q_k(t)$ is integer valued and indicates the number of class $k$ jobs at the station at time $t$. The quantity $Y_\ell(t)$ indicates the cumulative amount of time

that the server does not serve jobs in groups 1 to $\ell$ during $[0,t]$. It is clear that $Y$ must be nondecreasing with $Y(0) = 0$. To describe the dynamics of the queue, we need some additional notation.

• $D_k(t)$ counts the number of departures of class $k$ jobs from the station during $[0, t]$ after their service completions;

• $W_k(t)$ is the workload process of class $k$ jobs;

• $T_k(t)$ is the total amount of time that the server at the station has served jobs of class $k$ during $[0, t]$;

• $\tau_\ell(t)$ is the arrival time of the group $\ell$ job which has most recently completed service [$\tau_\ell(t)$ is zero if there have been no service completions for group $\ell$];

• $v_k(t)$ is the partial service time (if any) that has been performed on a job of class $k$ during $(\tau_\ell(t), t]$, where $k \in g_\ell$;

• $\mathcal{S}_k(t)$ is the sojourn time of class $k$ jobs at time $t$, denoting the time which will be spent in the queue by the first class $k$ job to arrive at a time greater than or equal to $t$;

• $\eta_k(t)$ is the time at which the first class $k$ job arrives during $[t, \infty)$;

• $\mathcal{T}_k(t)$ is the time that a class $k$ job would spend at the station if it arrived at time $t$.

From the above definitions, we have the following dynamic relations: for $k \in \mathcal{K}$ and $\ell = 1, \ldots, L$,
$$Q_k(t) = A_k(t) - D_k(t), \qquad (2.10)$$
$$W_k(t) = V_k(A_k(t)) - V_k(D_k(t)) - v_k(t), \qquad (2.11)$$
$$Z_\ell(t) = \sum_{i=1}^{\ell} \sum_{j \in g_i} W_j(t), \qquad (2.12)$$
$$Z_\ell(t) = \sum_{i=1}^{\ell} \sum_{j \in g_i} V_j(A_j(t)) - t + Y_\ell(t), \qquad (2.13)$$
$$D_k(t) = S_k(T_k(t)) = A_k(\tau_\ell(t)), \quad \text{where } \ell = \pi(k), \qquad (2.14)$$
$$0 \le v_k(t) \le \max_{1 \le n \le A_k(t)} v_k(n), \qquad (2.15)$$
$$0 \le \eta_k(t) - t \le u_k(A_k(t) + 1), \qquad (2.16)$$
$$\mathcal{S}_k(t) = \mathcal{T}_k(\eta_k(t)). \qquad (2.17)$$

Relations (2.10) and (2.11) are flow-balance relations in terms of the head count of jobs and in terms of work, respectively. The second equality in (2.14) follows from the FIFO service discipline within jobs in group $\ell$; namely, any group $\ell$ job that arrived before the time $\tau_\ell(t)$ must have finished service by time $t$. The dynamics of $\mathcal{T}_k$ can be described by a recursive relationship,
$$\mathcal{T}_k(t) = Z_{\pi(k)}(t) + \sum_{i=1}^{\pi(k)-1} \sum_{j \in g_i} \big[ V_j(A_j(\mathcal{T}_k(t) + t)) - V_j(A_j(t)) \big] + V_k(A_k(t)) - V_k(A_k(t) - 1), \qquad (2.18)$$
where $Z_{\pi(k)}(t)$ is the current workload at time $t$ (contributed by those jobs having priority no less than that of class $k$) just before this class $k$ job arrives, $V_j(A_j(\mathcal{T}_k(t) + t)) - V_j(A_j(t))$ is the workload of class $j$ jobs that arrive after time $t$ and before the completion of this class $k$ job, and $V_k(A_k(t)) - V_k(A_k(t) - 1)$ is the service time required for this job. The summation term in (2.18) is the total amount of work embodied in the jobs which arrive during the sojourn time of the concerned job and which have higher priority than the concerned job (the class $k$ job). Define the "net-put" process $N_\ell(t)$ by
$$N_\ell(t) = \sum_{i=1}^{\ell} \sum_{j \in g_i} V_j(A_j(t)) - t. \qquad (2.19)$$
The quantity $N_\ell(t)$ represents the total workload input of all types of jobs in $g_1 \cup \cdots \cup g_\ell$ during $[0, t]$ minus the work that would be finished if the server were never idle. The equality (2.13) can alternatively be written as
$$Z_\ell(t) = N_\ell(t) + Y_\ell(t) \ge 0. \qquad (2.20)$$
Under the work-conserving (i.e., nonidling) condition, $Y_\ell(\cdot)$ can increase at time $t$ only when $Z_\ell(t) = 0$. Hence, the pair $(Z_\ell, Y_\ell)$ jointly satisfies the one-dimensional reflection mapping theorem in Harrison (1985), which yields
$$Y_\ell(t) = \sup_{0 \le s \le t} [-N_\ell(s)]^+. \qquad (2.21)$$
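Relations (2.19)-(2.21) translate directly into a pathwise computation: given the primitive data, one can build $N_\ell$ on a time grid and recover $Y_\ell$ and $Z_\ell$ by a running maximum. The following sketch does this for a hypothetical two-class, two-group example (the Poisson arrivals and exponential services are our own illustrative choices; nothing in (2.19)-(2.21) requires them):

```python
import numpy as np

rng = np.random.default_rng(2)
T, dt = 500.0, 0.01
grid = np.arange(0.0, T, dt)

# Class 1 alone in group 1 (high priority), class 2 alone in group 2.
lam = [0.3, 0.3]            # arrival rates (assumed Poisson for this example)
mean_svc = [1.0, 1.0]       # mean service times (assumed exponential for this example)

def workload_input(rate, mean_v):
    """Return V(A(t)) on the grid: total work brought by arrivals in [0, t]."""
    n = rng.poisson(rate * T)
    arrivals = np.sort(rng.uniform(0.0, T, size=n))
    services = rng.exponential(mean_v, size=n)
    A = np.searchsorted(arrivals, grid, side="right")        # A(t) on the grid
    V = np.concatenate(([0.0], np.cumsum(services)))
    return V[A]

VA = [workload_input(lam[k], mean_svc[k]) for k in range(2)]

for ell in range(1, 3):
    N = sum(VA[:ell]) - grid                      # net-put (2.19) for groups 1..ell
    Y = np.maximum.accumulate(np.maximum(-N, 0))  # regulator (2.21)
    Z = N + Y                                     # workload (2.20)
    rho = sum(lam[k] * mean_svc[k] for k in range(ell))
    print(f"groups 1..{ell}: time-average workload {Z.mean():.3f}, traffic intensity {rho:.2f}")
```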

We assume that there exist a long-run average arrival rate and an average service time; namely,
$$\frac{A(t)}{t} \to \lambda \quad \text{as } t \to \infty, \qquad \frac{V(n)}{n} \to m \quad \text{as } n \to \infty.$$
We shall call $\lambda_k$, the $k$th coordinate of $\lambda$, the (exogenous) arrival rate of class $k$ jobs, and call $m_k$, the $k$th coordinate of $m$, the average service time of class $k$ jobs (alternatively, the mean service time of class $k$ jobs when $v_k(n)$ has the same finite mean for all $n \ge 1$). We assume that for all $k \in \mathcal{K}$, $\lambda_k > 0$ and $m_k > 0$. Call $\mu_k := 1/m_k$ the service rate of class $k$. Define
$$\beta_\ell = \sum_{j \in g_\ell} \lambda_j m_j, \qquad \rho_\ell = \sum_{i=1}^{\ell} \beta_i, \qquad \rho = \rho_L, \qquad (2.22)$$
where $\beta_\ell$ is the aggregated traffic intensity for the job classes in $g_\ell$, $\rho_\ell$ the aggregated traffic intensity for all classes in $g_1 \cup \cdots \cup g_\ell$ ($\rho_0 = 0$), and $\rho$ is the traffic intensity of the service station. We shall assume that the traffic intensity $\rho \le 1$ throughout this section. The discussion for the case when $\rho > 1$ is in Appendix 2.7.2.

2.3.2 Preliminary Lemmas

We state and prove three lemmas that will be used in establishing the main results.

Lemma 2.6 Suppose that for $k = 1, \ldots, K$, as $T \to \infty$,
$$\sup_{0 \le t \le T} |A_k(t) - \lambda_k t| \overset{\mathrm{a.s.}}{=} O(\sqrt{T \log\log T}),$$
$$\sup_{0 \le t \le T} |V_k(t) - m_k t| \overset{\mathrm{a.s.}}{=} O(\sqrt{T \log\log T}).$$
Then we have
$$\|v_k(t)\|_T \overset{\mathrm{a.s.}}{=} O(\sqrt{T \log\log T}), \qquad k = 1, \ldots, K,$$
as $T \to \infty$. If we further assume that
$$\sup_{0 \le t \le T} |V_k(t) - m_k t - \tilde V_k(t)| \overset{\mathrm{a.s.}}{=} o(T^{1/r}),$$
with $r \in (2,4)$ and $\tilde V_k$ being an $r$-strong continuous process, then we have
$$\|v_k(t)\|_T \overset{\mathrm{a.s.}}{=} o(T^{1/r}), \qquad k = 1, \ldots, K,$$
as $T \to \infty$.

Proof. We only prove the second half of the lemma; the proof of the first half is almost the same. Use the convention $V_k(-1) = 0$ and $v_k(0) = 0$ in the proof below. From (2.15), we have for $k = 1, \ldots, K$,
$$\|v_k(t)\|_T \le \sup_{0 \le t \le T} \Big\{ \max_{0 \le n \le A_k(t)} v_k(n) \Big\}
= \sup_{0 \le t \le T} v_k(A_k(t))
= \sup_{0 \le t \le T} \{ V_k(A_k(t)) - V_k(A_k(t) - 1) \}$$
$$\le \sup_{0 \le t \le T} |V_k(A_k(t)) - m_k A_k(t) - \tilde V_k(A_k(t))|
+ \sup_{0 \le t \le T} |V_k(A_k(t) - 1) - m_k (A_k(t) - 1) - \tilde V_k(A_k(t) - 1)|$$
$$+ \sup_{0 \le t \le T} |\tilde V_k(A_k(t)) - \tilde V_k(A_k(t) - 1)| + m_k
\overset{\mathrm{a.s.}}{=} o(T^{1/r}),$$
where the last equality follows from the strong approximation assumption for $V_k$, the strong continuity of $\tilde V_k$, and Lemma 2.3 (iii) and (v). $\square$

In view of (2.16), we can use an argument similar to the one leading to Lemma 2.6 to prove the following lemma.

Lemma 2.7 Suppose that for $k = 1, \ldots, K$, as $T \to \infty$,
$$\sup_{0 \le t \le T} |A_k(t) - \lambda_k t| \overset{\mathrm{a.s.}}{=} O(\sqrt{T \log\log T}).$$
Then
$$\|\eta_k(t) - t\|_T \overset{\mathrm{a.s.}}{=} O(\sqrt{T \log\log T}), \qquad k = 1, \ldots, K,$$
as $T \to \infty$. If we further assume that for $k = 1, \ldots, K$, as $T \to \infty$,
$$\sup_{0 \le t \le T} |A_k(t) - \lambda_k t - \tilde A_k(t)| \overset{\mathrm{a.s.}}{=} o(T^{1/r}),$$
with $r \in (2,4)$ and $\tilde A_k$ being $r$-strong continuous, then
$$\|\eta_k(t) - t\|_T \overset{\mathrm{a.s.}}{=} o(T^{1/r}), \qquad k = 1, \ldots, K,$$
as $T \to \infty$.

Lemma 2.8 Suppose that the FLIL approximations
$$\sup_{0 \le t \le T} |A_k(t) - \lambda_k t| \overset{\mathrm{a.s.}}{=} O(\sqrt{T \log\log T}),$$
$$\sup_{0 \le t \le T} |S_k(t) - \mu_k t| \overset{\mathrm{a.s.}}{=} O(\sqrt{T \log\log T}),$$
$$\sup_{0 \le t \le T} |V_k(t) - m_k t| \overset{\mathrm{a.s.}}{=} O(\sqrt{T \log\log T}),$$
hold as $T \to \infty$. Then there exist positive $M_0$ and $T_0$ such that with probability 1,
$$\mathcal{T}_k(t) \le M_0 t \qquad \text{for } t \ge T_0.$$

Proof. It follows from the assumptions of the lemma that
$$\|V_k(A_k(t)) - \lambda_k m_k t\|_T \overset{\mathrm{a.s.}}{=} O(\sqrt{T \log\log T}).$$
Therefore, there exist positive constants $T_1$ and $a$ such that with probability 1,
$$V_k(A_k(t)) \le \lambda_k m_k t + a\sqrt{t \log\log t} \qquad \text{for } t \ge T_1.$$
From (2.18), we deduce that, with probability 1,
$$\mathcal{T}_k(t) \le Z_{\pi(k)}(t) + \rho_{\pi(k)-1} \mathcal{T}_k(t) + \rho_{\pi(k)-1} t
+ a \sum_{i=1}^{\pi(k)-1} \sum_{j \in g_i} \sqrt{(\mathcal{T}_k(t)+t) \log\log (\mathcal{T}_k(t)+t)}
+ V_k(A_k(t)) - V_k(A_k(t)-1) - \sum_{i=1}^{\pi(k)-1} \sum_{j \in g_i} V_j(A_j(t)),$$
for $t \ge T_1$. This implies that, with probability 1,
$$(1 - \rho_{\pi(k)-1}) \mathcal{T}_k(t) \le Z_{\pi(k)}(t) + \rho_{\pi(k)-1} t
+ a \sum_{i=1}^{\pi(k)-1} \sum_{j \in g_i} \sqrt{(\mathcal{T}_k(t)+t) \log\log (\mathcal{T}_k(t)+t)}
+ V_k(A_k(t)) - V_k(A_k(t)-1),$$
for $t \ge T_1$. Since we assume that $\rho \le 1$, we know that $\rho_{\pi(k)-1} < 1$. The above inequality, together with Lemma 2.3 (iii) and (2.26) in Theorem 2.9, yields the desired result. Even though the proof of Theorem 2.9 (which is to be provided later) makes use of this lemma, the proof of (2.26) in that theorem does not depend on this lemma and is under the same conditions as this lemma. Therefore, the proof is complete. $\square$

2.3.3 Functional Law-of-Iterated-Logarithm

The key result of this section is to show that if the primitive data (the input processes) have FLIL approximations, then the departure process (the output process) and the key performance measures of the queue also have FLIL approximations. To this end, assume that all the primitive data, the exogenous arrival process and the service process, have FLIL approximations: as $T \to \infty$,
$$\sup_{0 \le t \le T} |A_k(t) - \lambda_k t| \overset{\mathrm{a.s.}}{=} O(\sqrt{T \log\log T}), \qquad (2.23)$$
$$\sup_{0 \le t \le T} |S_k(t) - \mu_k t| \overset{\mathrm{a.s.}}{=} O(\sqrt{T \log\log T}), \qquad (2.24)$$
$$\sup_{0 \le t \le T} |V_k(t) - m_k t| \overset{\mathrm{a.s.}}{=} O(\sqrt{T \log\log T}). \qquad (2.25)$$
In fact, it follows from Theorem 2.1 that (2.25) implies (2.24). In addition, a sufficient condition for the above approximations is that for each $k \in \mathcal{K}$, the (exogenous) interarrival time sequence $u_k$ and the service time sequence $v_k$ are i.i.d. sequences with finite variances. The main results follow.

Theorem 2.9 Suppose that the FLIL assumptions (2.23)-(2.25) hold. Assume that the traffic intensity $\rho \le 1$. Then as $T \to \infty$, for $\ell = 1, \ldots, L$ and $k = 1, \ldots, K$,
$$\sup_{0 \le t \le T} |Z_\ell(t)| \overset{\mathrm{a.s.}}{=} O(\sqrt{T \log\log T}), \qquad (2.26)$$
$$\sup_{0 \le t \le T} |W_k(t)| \overset{\mathrm{a.s.}}{=} O(\sqrt{T \log\log T}), \qquad (2.27)$$
$$\sup_{0 \le t \le T} |Q_k(t)| \overset{\mathrm{a.s.}}{=} O(\sqrt{T \log\log T}), \qquad (2.28)$$
$$\sup_{0 \le t \le T} |\mathcal{S}_k(t)| \overset{\mathrm{a.s.}}{=} O(\sqrt{T \log\log T}), \qquad (2.29)$$
$$\sup_{0 \le t \le T} |D_k(t) - \lambda_k t| \overset{\mathrm{a.s.}}{=} O(\sqrt{T \log\log T}), \qquad (2.30)$$
$$\sup_{0 \le t \le T} |Y_\ell(t) - (1 - \rho_\ell) t| \overset{\mathrm{a.s.}}{=} O(\sqrt{T \log\log T}), \qquad (2.31)$$
$$\sup_{0 \le t \le T} |T_k(t) - \lambda_k m_k t| \overset{\mathrm{a.s.}}{=} O(\sqrt{T \log\log T}), \qquad (2.32)$$
$$\sup_{0 \le t \le T} |\tau_\ell(t) - t| \overset{\mathrm{a.s.}}{=} O(\sqrt{T \log\log T}). \qquad (2.33)$$

Remark. We note that this theorem holds without assuming a renewal arrival process and i.i.d. service times. One simple example is to have a compound arrival process (modeling batch arrivals). The results hold even without assuming that $A_k$ and $S_k$ are integer-valued, as long as the FLIL approximations (2.23)-(2.25) hold. Consider a specific example, where the arrival process $A_k$ takes the form
$$A_k(t) = \int_0^t a_k(s)\, ds,$$
and the service process is $S_k(t) = \mu_k t$. This may represent an ATM communication system, where $a_k(t)$ models the rate at which cells (of class $k$) are generated at time $t$ and $\mu_k$ the rate at which cells (of class $k$) can be processed. (The process $\{a_k(t),\ t \ge 0\}$ is often modeled by a sum of randomly on-off sources.)
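A minimal sketch of the on-off construction mentioned above (purely illustrative; the exponential on and off durations, the number of sources and the rates are assumptions we introduce, not part of the thesis): the instantaneous rate $a_k(t)$ is the superposition of independent on-off sources, and $A_k(t) = \int_0^t a_k(s)\,ds$ is obtained by numerical integration.

```python
import numpy as np

rng = np.random.default_rng(3)

def on_off_rate(T, dt, n_sources=10, rate_on=1.0, mean_on=1.0, mean_off=2.0):
    """Superposition of independent exponential on-off sources, sampled on a grid."""
    n = int(T / dt)
    a = np.zeros(n)
    for _ in range(n_sources):
        t = 0.0
        on = rng.random() < mean_on / (mean_on + mean_off)   # start in "on" with its stationary probability
        rate = np.zeros(n)
        while t < T:
            dur = rng.exponential(mean_on if on else mean_off)
            i, j = int(t / dt), min(int((t + dur) / dt), n)
            if on:
                rate[i:j] = rate_on
            t, on = t + dur, not on
        a += rate
    return a

T, dt = 100.0, 0.01
a = on_off_rate(T, dt)
A = np.cumsum(a) * dt          # A_k(t) = integral of a_k(s) ds on the grid
print("long-run arrival rate estimate:", A[-1] / T)
```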

Proof (of Theorem 2.9). From the definition of $N_\ell(t)$, we have
$$N_\ell(t) = \sum_{i=1}^{\ell} \sum_{k \in g_i} \big\{ [V_k(A_k(t)) - m_k A_k(t)] + m_k [A_k(t) - \lambda_k t] \big\} + (\rho_\ell - 1) t.$$

By the FLIL assumptions (2.23) and (2.25) and Lemma 2.3 (iii), we have
$$\sup_{0 \le t \le T} |N_\ell(t) - (\rho_\ell - 1)t| \overset{\mathrm{a.s.}}{=} O(\sqrt{T \log\log T}).$$

Because the pair $(Z_\ell, Y_\ell)$ satisfies the oblique reflection mapping (in view of (2.20) and (2.21)), the Lipschitz continuity of the reflection mapping implies (2.26) and (2.31). In view of (2.11) and (2.14), we have
$$\sum_{k \in g_\ell} W_k(t) = \sum_{k \in g_\ell} \big\{ [V_k(A_k(t)) - m_k A_k(t)] + m_k [A_k(t) - \lambda_k t] \big\}
- \sum_{k \in g_\ell} \big\{ [V_k(D_k(t)) - m_k D_k(t)] + m_k [A_k(\tau_\ell(t)) - \lambda_k \tau_\ell(t)] \big\}
+ \beta_\ell (t - \tau_\ell(t)) - \sum_{k \in g_\ell} v_k(t).$$
Using Lemma 2.6 and the FLIL assumptions (2.23) and (2.25) yields
$$\sup_{0 \le t \le T} \Big| \sum_{k \in g_\ell} W_k(t) - \beta_\ell (t - \tau_\ell(t)) \Big| \overset{\mathrm{a.s.}}{=} O(\sqrt{T \log\log T}),$$
which can be rewritten as
$$\sup_{0 \le t \le T} \big| Z_\ell(t) - Z_{\ell-1}(t) - \beta_\ell (t - \tau_\ell(t)) \big| \overset{\mathrm{a.s.}}{=} O(\sqrt{T \log\log T}).$$
This combined with (2.26) implies the FLIL approximation (2.33) for $\tau_\ell$.

For $k = 1, \ldots, K$, let $\ell = \pi(k)$; the FLIL approximation (2.30) for the departure process $D_k$ follows from
$$\sup_{0 \le t \le T} |D_k(t) - \lambda_k t| = \sup_{0 \le t \le T} |A_k(\tau_\ell(t)) - \lambda_k t|
= \sup_{0 \le t \le T} \big| [A_k(\tau_\ell(t)) - \lambda_k \tau_\ell(t)] + \lambda_k (\tau_\ell(t) - t) \big|
\overset{\mathrm{a.s.}}{=} O(\sqrt{T \log\log T}).$$

From the relation (2.14), we can write
$$D_k(t) - \lambda_k t = S_k(T_k(t)) - \lambda_k t = [S_k(T_k(t)) - \mu_k T_k(t)] + \mu_k [T_k(t) - \lambda_k m_k t].$$
Therefore, the FLIL approximation (2.32) for $T_k$ can be derived from (2.30), (2.24) and the fact that $0 \le T_k(t) \le t$. Similarly, the FLIL approximation (2.28) for the queue length process $Q_k$ can be proved by observing
$$Q_k(t) = [A_k(t) - \lambda_k t] - [D_k(t) - \lambda_k t].$$
By noting Lemma 2.6, we have that for $k = 1, \ldots, K$,
$$\sup_{0 \le t \le T} |W_k(t)| = \sup_{0 \le t \le T} \big| [V_k(A_k(t)) - m_k A_k(t)] + m_k [A_k(t) - \lambda_k t]
- [V_k(D_k(t)) - m_k D_k(t)] - m_k [D_k(t) - \lambda_k t] - v_k(t) \big|
\overset{\mathrm{a.s.}}{=} O(\sqrt{T \log\log T}),$$
which is the FLIL approximation (2.27) for the workload process $W_k$.

Finally, we establish the FLIL approximation (2.29) for the sojourn time process $\mathcal{S}_k$. It follows from (2.18) and Lemma 2.8 that
$$\|\mathcal{T}_k(t) - \rho_{\pi(k)-1} \mathcal{T}_k(t)\|_T
\le \|Z_{\pi(k)}(t)\|_T
+ \sum_{i=1}^{\pi(k)-1} \sum_{j \in g_i} \|V_j(A_j(\mathcal{T}_k(t)+t)) - m_j A_j(\mathcal{T}_k(t)+t)\|_T
+ \|V_k(A_k(t)) - V_k(A_k(t)-1)\|_T$$
$$+ \sum_{i=1}^{\pi(k)-1} \sum_{j \in g_i} m_j \|A_j(\mathcal{T}_k(t)+t) - \lambda_j (\mathcal{T}_k(t)+t)\|_T
+ \sum_{i=1}^{\pi(k)-1} \sum_{j \in g_i} \|V_j(A_j(t)) - m_j A_j(t)\|_T
+ \sum_{i=1}^{\pi(k)-1} \sum_{j \in g_i} m_j \|A_j(t) - \lambda_j t\|_T
\overset{\mathrm{a.s.}}{=} O(\sqrt{T \log\log T}).$$
Therefore, we have
$$\|\mathcal{T}_k(t)\|_T \overset{\mathrm{a.s.}}{=} O(\sqrt{T \log\log T}).$$
Since $\mathcal{S}_k(t) = \mathcal{T}_k(\eta_k(t))$, the above combined with Lemma 2.7 yields (2.29). $\square$

2.3.4 Strong Approximation

The key result of this section is to show that if the primitive data of the queue have $r$-strong approximations (for some $r \in (2,4)$), then the performance measures (such as the workload process, the queue length process and the sojourn time process) and the output process (namely, the departure process) also have $r$-strong approximations.

To this end, we assume that the processes $A_k(t)$, $S_k(t)$, $V_k(t)$ are defined on an appropriate probability space such that for some $r \in (2,4)$ and for $k = 1, \ldots, K$,
$$\sup_{0 \le t \le T} |A_k(t) - \lambda_k t - \tilde A_k(t)| \overset{\mathrm{a.s.}}{=} o(T^{1/r}), \qquad (2.34)$$
$$\sup_{0 \le t \le T} |S_k(t) - \mu_k t - \tilde S_k(t)| \overset{\mathrm{a.s.}}{=} o(T^{1/r}), \qquad (2.35)$$
$$\sup_{0 \le t \le T} |V_k(t) - m_k t - \tilde V_k(t)| \overset{\mathrm{a.s.}}{=} o(T^{1/r}), \qquad (2.36)$$
where $\lambda_k > 0$, $\mu_k > 0$, $m_k = 1/\mu_k$, and $\tilde V_k(t) = -m_k \tilde S_k(m_k t)$. It follows from Theorem 2.2 that (2.36) implies (2.35). We shall also assume that $\tilde A_k$ and $\tilde S_k$ are $r$-strong continuous processes, $k \in \mathcal{K}$. By Theorem 2.2, if we assume that the sequence of service times $v_k$ and the sequence of interarrival times $u_k$ are mutually independent nonnegative i.i.d. sequences having finite $r$th moment with $r \in (2,4)$, then we can have (2.34)-(2.36), with
$$\tilde A_k(t) = \lambda_k^{1/2} c_{0,k} B_{0,k}(t), \qquad (2.37)$$
$$\tilde S_k(t) = \mu_k^{1/2} c_k B_{1,k}(t), \qquad (2.38)$$
where
$$\lambda_k = 1/E(u_k(n)), \qquad \mu_k = 1/E(v_k(n)),$$
$$c_{0,k} = \text{coefficient of variation of } u_k(n), \qquad c_k = \text{coefficient of variation of } v_k(n),$$
and $B_{0,k}(t)$ and $B_{1,k}(t)$, $k \in \mathcal{K}$, are mutually independent standard Brownian motions. (The coefficient of variation of a random variable is its standard deviation divided by its mean.)

Theorem 2.10 Suppose that the strong approximation assumptions (2.34)-(2.36) hold with $\tilde A_k$ and $\tilde S_k$ being $r$-strong continuous for some $r \in (2,4)$. Assume that the traffic intensity $\rho \le 1$. Then for $\ell \in \mathcal{L}$ and $k \in \mathcal{K}$, as $T \to \infty$,
$$\sup_{0 \le t \le T} |Z_\ell(t) - \tilde Z_\ell(t)| \overset{\mathrm{a.s.}}{=} o(T^{1/r}), \qquad (2.39)$$
$$\sup_{0 \le t \le T} |\mathcal{S}_k(t) - \tilde{\mathcal{S}}_k(t)| \overset{\mathrm{a.s.}}{=} o(T^{1/r}), \qquad (2.40)$$
$$\sup_{0 \le t \le T} |D_k(t) - \tilde D_k(t)| \overset{\mathrm{a.s.}}{=} o(T^{1/r}), \qquad (2.41)$$
$$\sup_{0 \le t \le T} |Q_k(t) - \tilde Q_k(t)| \overset{\mathrm{a.s.}}{=} o(T^{1/r}), \qquad (2.42)$$
$$\sup_{0 \le t \le T} |W_k(t) - \tilde W_k(t)| \overset{\mathrm{a.s.}}{=} o(T^{1/r}), \qquad (2.43)$$
where
$$\tilde Z_\ell(t) = \tilde N_\ell(t) + \tilde Y_\ell(t), \qquad (2.44)$$
$$\tilde N_\ell(t) = (\rho_\ell - 1)t + \sum_{i=1}^{\ell} \sum_{j \in g_i} \big[ m_j \tilde A_j(t) + \tilde V_j(\lambda_j t) \big], \qquad (2.45)$$
$$\tilde Y_\ell(t) = \sup_{0 \le s \le t} \{-\tilde N_\ell(s)\}^+, \qquad (2.46)$$
$$\tilde{\mathcal{S}}_k(t) = \frac{1}{1 - \rho_{\pi(k)-1}} \tilde Z_{\pi(k)}(t), \qquad (2.47)$$
$$\tilde D_k(t) = \lambda_k t + \tilde A_k(t) - \frac{\lambda_k}{\beta_{\pi(k)}} \big[ \tilde Z_{\pi(k)}(t) - \tilde Z_{\pi(k)-1}(t) \big], \qquad (2.48)$$
$$\tilde Q_k(t) = \frac{\lambda_k}{\beta_{\pi(k)}} \big[ \tilde Z_{\pi(k)}(t) - \tilde Z_{\pi(k)-1}(t) \big], \qquad (2.49)$$
$$\tilde W_k(t) = \frac{\lambda_k m_k}{\beta_{\pi(k)}} \big[ \tilde Z_{\pi(k)}(t) - \tilde Z_{\pi(k)-1}(t) \big] = m_k \tilde Q_k(t), \qquad (2.50)$$
with the convention $\tilde Z_0 \equiv 0$.

Remarks

1. By Proposition 2.4, the process $\tilde Z_\ell$ is an $r$-strong continuous process; hence, $\tilde{\mathcal{S}}_k$, $\tilde Q_k$ and $\tilde W_k$ are $r$-strong continuous. In particular, let
$$\bar D_k(t) = \tilde A_k(t) - \frac{\lambda_k}{\beta_{\pi(k)}} \big[ \tilde Z_{\pi(k)}(t) - \tilde Z_{\pi(k)-1}(t) \big];$$
then $\bar D_k$ is $r$-strong continuous, and the departure process $D_k$ has the strong approximation
$$\|D_k(t) - \lambda_k t - \bar D_k(t)\|_T \overset{\mathrm{a.s.}}{=} o(T^{1/r}).$$
This property is essential for the inductive use of the strong approximation of the single-station queue to be extended to a feedforward network case.

2. In Peterson (1991), a state space collapse phenomenon is observed for priority job classes; namely, the workload of a high priority group vanishes in the usual heavy traffic normalization. The strong approximation theorem enables us to identify a more refined approximation; in this case, it suggests approximating the workload processes of higher priority classes by reflecting Brownian motions instead of zero. Indeed, our numerical examples in Section 2.6 show that the approximations suggested by the strong approximation outperform a straightforward interpretation of the heavy traffic approximation. On the other hand, we can recover the results of heavy traffic limits from the strong approximation by assuming the equalities (2.37) and (2.38), and we have
$$Z^*(t) = \tilde Z_L(t), \qquad (2.51)$$
where $Z^*(t)$ is the diffusion approximation of the workload of the station in Peterson (1991). We can also recover the corresponding weak convergence results following a similar approach in Chen and Mandelbaum (1994), which shows how to derive the diffusion limit theorem from the strong approximation theorem for a generalized Jackson network. We note that the strong approximation limit is not unique; it could be any process that differs from the limit in the theorem by an order of magnitude no more than $o(T^{1/r})$. Finally, it follows from Theorem 2.5 that when the reflecting Brownian motion has a negative drift (the $\rho < 1$ case), any constant (including zero) can be the strong approximation limit.

3. The second equality in (2.50) is Little's law for the strong approximation limits of the workload and the queue length processes.

4. We would like to point out that, by assuming the equalities (2.37) and (2.38), though our approximation for the workload process of the lowest priority group is consistent with the result in Peterson (1991), we have a slightly different approximation for the sojourn time process. For class $k$ jobs which are in the group $g_L$, the diffusion approximation of the sojourn time $\mathcal{S}_k$ in Peterson (1991) is
$$\tilde{\mathcal{S}}^*_k(t) = \frac{Z^*(t)}{\rho_L - \rho_{L-1}}, \qquad (2.52)$$
while in our chapter, the strong approximation gives
$$\tilde{\mathcal{S}}_k(t) = \frac{\tilde Z_L(t)}{1 - \rho_{L-1}}. \qquad (2.53)$$
These two formulas are consistent when the traffic intensity at the station is 1, which is the heavy traffic assumption in the diffusion approximation.
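To indicate how Theorem 2.10 can be used for steady-state performance estimates (a sketch under our own simplifying choices, not the procedure of Section 2.5): with (2.37)-(2.38), each $\tilde Z_\ell$ is a one-dimensional reflecting Brownian motion with drift $\rho_\ell - 1$ and variance $\sum_{i \le \ell}\sum_{j \in g_i} \lambda_j m_j^2 (c_{0,j}^2 + c_j^2)$; when $\rho_\ell < 1$ its stationary mean is the variance divided by $2(1 - \rho_\ell)$, and (2.47) and (2.49) then give estimates of mean sojourn (queue) times and mean queue lengths.

```python
import numpy as np

# Hypothetical two-class example: class 1 alone in group 1 (high priority),
# class 2 alone in group 2; both served by the same single server.
lam  = np.array([0.3, 0.3])     # arrival rates
m    = np.array([1.0, 1.0])     # mean service times
c0sq = np.array([1.0, 1.0])     # squared CV of interarrival times (1.0 corresponds to Poisson)
csq  = np.array([1.0, 1.0])     # squared CV of service times (1.0 corresponds to exponential)
group = np.array([1, 2])        # group (priority) index of each class

L = group.max()
EZ = np.zeros(L + 1)            # EZ[l] approximates the stationary mean of Z_l; EZ[0] = 0
rho = np.zeros(L + 1)
for l in range(1, L + 1):
    in_groups = group <= l
    rho[l] = np.sum(lam[in_groups] * m[in_groups])
    var_rate = np.sum(lam[in_groups] * m[in_groups] ** 2
                      * (c0sq[in_groups] + csq[in_groups]))
    # One-dimensional RBM with drift rho_l - 1 < 0: stationary mean = variance / (2 (1 - rho_l)).
    EZ[l] = var_rate / (2.0 * (1.0 - rho[l]))

beta = np.array([np.sum(lam[group == l] * m[group == l]) for l in range(L + 1)])
for k in range(len(lam)):
    l = group[k]
    EQ = lam[k] / beta[l] * (EZ[l] - EZ[l - 1])          # from (2.49)
    ES = EZ[l] / (1.0 - rho[l - 1])                      # from (2.47)
    print(f"class {k+1}: approx mean queue length {EQ:.3f}, approx mean queue delay {ES:.3f}")
```

For the exponential example above, these estimates reproduce the exact M/M/1 preemptive-priority mean values, which is one way to sanity-check an implementation.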

Proof (of Theorem 2.10). First, by Lemma 2.3 (ii), the strong approximation assumptions (2.34)-(2.36) imply the FLIL assumptions (2.23)-(2.25); hence, Theorem 2.9 prevails. In the remainder of the proof, we shall repeatedly use Lemma 2.3 without explicitly referring to it. Next, we rewrite the net-put process as
$$N_\ell(t) = \sum_{i=1}^{\ell} \sum_{k \in g_i} V_k(A_k(t)) - t
= \sum_{i=1}^{\ell} \sum_{k \in g_i} \big\{ [V_k(A_k(t)) - m_k A_k(t) - \tilde V_k(A_k(t))] + m_k [A_k(t) - \lambda_k t - \tilde A_k(t)] + [\tilde V_k(A_k(t)) - \tilde V_k(\lambda_k t)] \big\} + \tilde N_\ell(t).$$
Thus, $N_\ell(t)$ has a strong approximation:
$$\sup_{0 \le t \le T} |N_\ell(t) - \tilde N_\ell(t)| \overset{\mathrm{a.s.}}{=} o(T^{1/r}).$$
Since the pair $(Z_\ell, Y_\ell)$ satisfies the reflection mapping, by the Lipschitz continuity of the reflection mapping, we have
$$\sup_{0 \le t \le T} |Z_\ell(t) - \tilde Z_\ell(t)| \overset{\mathrm{a.s.}}{=} o(T^{1/r}),$$
$$\sup_{0 \le t \le T} |Y_\ell(t) - \tilde Y_\ell(t)| \overset{\mathrm{a.s.}}{=} o(T^{1/r}),$$
where $\tilde Z_\ell$ and $\tilde Y_\ell$ are defined in (2.44) and (2.46), respectively. This proves the strong approximation (2.39) for the workload process $Z_\ell$. In view of (2.11) and (2.14), we have

$$\sum_{k \in g_\ell} W_k(t) = \sum_{k \in g_\ell} \big\{ [V_k(A_k(t)) - m_k A_k(t) - \tilde V_k(A_k(t))] + m_k [A_k(t) - \lambda_k t - \tilde A_k(t)] + [\tilde V_k(A_k(t)) - \tilde V_k(\lambda_k t)]$$
$$\quad - [V_k(D_k(t)) - m_k D_k(t) - \tilde V_k(D_k(t))] - m_k [A_k(\tau_\ell(t)) - \lambda_k \tau_\ell(t) - \tilde A_k(\tau_\ell(t))] - m_k [\tilde A_k(\tau_\ell(t)) - \tilde A_k(t)]$$
$$\quad - [\tilde V_k(D_k(t)) - \tilde V_k(\lambda_k t)] + \lambda_k m_k [t - \tau_\ell(t)] - v_k(t) \big\}.$$
Thus, by Lemma 2.6, it follows that
$$\sup_{0 \le t \le T} \Big| \sum_{k \in g_\ell} W_k(t) - \beta_\ell [t - \tau_\ell(t)] \Big| \overset{\mathrm{a.s.}}{=} o(T^{1/r}),$$
or equivalently,
$$\sup_{0 \le t \le T} \big| Z_\ell(t) - Z_{\ell-1}(t) - \beta_\ell [t - \tau_\ell(t)] \big| \overset{\mathrm{a.s.}}{=} o(T^{1/r}). \qquad (2.54)$$

Now fix $k \in \mathcal{K}$ and $\ell = \pi(k)$. Rewrite $D_k$ as
$$D_k(t) = [A_k(\tau_\ell(t)) - \lambda_k \tau_\ell(t) - \tilde A_k(\tau_\ell(t))] + [\tilde A_k(\tau_\ell(t)) - \tilde A_k(t)] + \lambda_k t + \lambda_k [\tau_\ell(t) - t] + \tilde A_k(t)$$
$$= [A_k(\tau_\ell(t)) - \lambda_k \tau_\ell(t) - \tilde A_k(\tau_\ell(t))] + [\tilde A_k(\tau_\ell(t)) - \tilde A_k(t)]
+ \frac{\lambda_k}{\beta_\ell} \big\{ \beta_\ell (\tau_\ell(t) - t) + [Z_\ell(t) - Z_{\ell-1}(t)] \big\} - \frac{\lambda_k}{\beta_\ell} [Z_\ell(t) - Z_{\ell-1}(t)] + \tilde A_k(t) + \lambda_k t.$$
This, together with (2.54), yields the strong approximation (2.41) for the departure process $D_k$.

Similarly, we prove the strong approximation (2.42) for the queue length process $Q_k$ by observing
$$\sup_{0 \le t \le T} |Q_k(t) - \tilde Q_k(t)| = \sup_{0 \le t \le T} \big| [A_k(t) - \lambda_k t - \tilde A_k(t)] - [D_k(t) - \tilde D_k(t)] \big| \overset{\mathrm{a.s.}}{=} o(T^{1/r}), \qquad \text{as } T \to \infty,$$
and the strong approximation (2.43) for the workload process $W_k$ by observing
$$W_k(t) - \tilde W_k(t) = [V_k(A_k(t)) - m_k A_k(t) - \tilde V_k(A_k(t))] + m_k [A_k(t) - \lambda_k t - \tilde A_k(t)] + [\tilde V_k(A_k(t)) - \tilde V_k(\lambda_k t)]$$
$$\quad - [V_k(D_k(t)) - m_k D_k(t) - \tilde V_k(D_k(t))] - m_k [D_k(t) - \tilde D_k(t)] - [\tilde V_k(D_k(t)) - \tilde V_k(\lambda_k t)] - v_k(t).$$

Finally, we establish the strong approximation (2.40) for the sojourn time process $\mathcal{S}_k$. Note

that Sk(t) = Tk{r}k(t)); we first rewrite (2.18),

Z 7k{t) = Zn(k){t) + Pn(k)-l%{t) + [Z„(k){t) ~ ir(k){t)] 7r(fc)-l

»="i jest

+ mjlAjiTkit) + t)- Xj{Tk(t) +1) - Aj{Tk{t) + t)]

+ mjiAjiTkit) + t)- Aj(t)] + [VjiAjfait) + t)) - Vj(Xjt)]

- [Vj{Aj{t)) - XjTTljt - TTljAj(t) - Vj(Xjt)]}

+[Vk{Ak(t)) - Xkmkt - mkAk(t) - Vk(Xkt)}

-[Vk{Ak{t) - 1) - Xkmkt - mkAk{t) - Vk{Xkt)).

Note that ||7fc(t)|| =' 0{y/TloglogT) as shown in the proof of Theorem 2.9; in view of

Lemma 2.3 and the strong approximations for Zn(k), Vj and Aj, we obtain from the above 2.3 A Multiclass Single Server Station 28

equality,

l/r l|Tfc(t) - pr{k)-iTk(t) - Zv{k)(t)\\T =• °(T ), as T -> co, or equivalently,

1 \\Tk(t) - z Zn{k)(t)\\T °==- op "), as T -> oo.

The latter, together with Lemmas 2.7 and (2.17), implies (2.40). •

2.3.5 Fine Tuning the Strong Approximation for Sojourn Time

A more detailed study reveals that the strong approximation of sojourn time that we obtained in Theorem 2.10 should be better interpreted as the strong approximation to the sojourn queue time which is the time between the arrival of the job and the time that it just begins service. The reason is that, in the proof of FSAT approximation for the sojourn time process

we approximate Vk{Ak(t)) — Vk(Ak(t) — 1), the service time, by zero. If we replace Vk(Ak(t)) —

Vk(Ak{t) — 1) by zero in (2.18), then the new Tk is exactly the sojourn queue time. Thus, the strong approximation for the sojourn queue time is same as that for the sojourn time given by Theorem 2.10. As we know, the sojourn time should be larger than the sojourn queue time. For a single-class single-station queue, the sojourn time of a job equals to its sojourn queue time plus its service time. For preemptive priority queueing networks, even if a job is in service, it may well be interrupted by another arriving job with higher priority. Hence, in general, the sojourn time of a job should be longer than or equal to its sojourn queue time plus its service time. Our numerical examples suggest that approximating the service time by its mean would yield an improved strong approximation for the sojourn time (2.47),

1 Sfc(t) = i [Zn(k)(t) + mk]. (2.55)

We note that we might obtain a better approximation by using a true service time random

variable in place of mk in the above.

Because of the nature of strong approximation, we could replace mk in (2.55) by any constant and the strong approximation still holds. However, there is strong reason to believe

that mk is the best constant to put in (2.55). Our numerical examples in Section 2.6 show that (2.55) gives fairly good approximation to the steady-state average sojourn time. For 2.3 A Multiclass Single Server Station 29

M/G/l preemptive priority queue, the approximated steady-state mean sojourn time is

ESk = ( -— r +mk), (2.56)

l-ftr(fc)-l 2(l-p7r(fc)) which is the same as the exact mean sojourn time; see (3.39) in Kleinrock (1976).

2.3.6 A Packet Queue Application

Our strong approximation in Theorem 2.10 provides a framework to evaluate the performance of single-station multiclass queues. It holds without assuming renewal arrival processes or renewal service times. Here, we provide as an example a batch renewal model which was used by Fendick, Saksena and Whitt (1989) to study the dependence in packet communication networks. As we know, a batch renewal process is a renewal process if and only if it is a batch-Poisson process and the batch size is geometrically distributed on the positive integers. Typically, the superposition process of batch renewal processes are not renewal. The strong approximation theorem can be applied to them to get performance measures easily.

Let uk = {ufc(n), n > 1} be i.i.d. batch interarrival times, with mean \k and squared

coefficient of variation (SCV) c^k\ let vk = {vk(n), n > 1} be i.i.d. service times, with mean

mk and squared coefficient of variation c\\ let nk = {nk(n), n > 1} be i.i.d. batch sizes with

r)k[n) being the nth batch size of class A; packets and with mean bk and squared coefficient of

variation c\k. Furthermore, we assume that all three of them have finite rth moments with r > 2 and are mutually independent. All classes belong to one priority group, so service discipline is strictly FIFO.

Let Ak(t) denote the arrival process of class k packets.

1*1

Xk(t) = <>0, i=l

Yk(t) = sup{s > 0 : Xk(s) < t}, 0*W' t>o, i=l L«J Vk(t) = £«*W> t>0, i=l 2.4 Multiclass Feedforward Networks 30

with X{t) = 0, U{t) = 0, and Vk{t) = 0 for t < 1. Thus

A(t) = C/(r(t)).

By Theorem 2.2 and Lemma 2.3, there exist three independent standard Brownian motions Bg, B%, and B\ such that

/2 a r \\Uk(t)-bkt-bl cbkB^bkt)\\T d- o(T^ '),

l /2 a \\Vk(t)-mkt-m k ckBl(mkt)\\T * o(T^'),

/2 a 1 \\Yk(t)-Xkt-\l c0kB^(t)\\T d: op ''),

/2 a l r \\Ak(t) - Xkbkt - X^b^B^t) - bl cbkBV(Xkbkt)\\ d- o{T' '),

where r' = r if r < 4, and r' < 4 if r > 4. Therefore, by Theorem 2.10, the strong approximation for the total workload process at the station is

Z(t) = N(t)+Y(t), K N(t) = (p-Vt + ^frk^hcokBfit) fc=i

l /2 J l /2 +mkb k cbkBk (Xkbkt) - m k 'ckB%\mkXkbkt)]} ,

+ Ut) = sup {-Ne(s)} , 0

where p = 2~lk=\ ^k^k^k- This is a one-dimensional reflecting Brownian motion. In particular, we can get approximated steady-state mean workload by

EZ El^krnl[bkcl + bl(clk + clk)} 2(1 - p)

Fendick, Saksena and Whitt (1989) obtained the same result by using a heavy traffic limit theorem.

2.4 Multiclass Feedforward Networks

In this section, we shall assume the FLIL approximation without much discussion, and focus on the strong approximation. The reasons are that the derivation of the FLIL approximation 2.4 Multiclass Feedforward Networks 31 for the network case is quite similar to that for the single-station case and that the strong approximation yields more useful approximation.

2.4-1 Queueing Network Model

We first describe the primitive data and then the performance measures and their dynamics.

2.4.1.1 Primitive Data and Assumptions The queueing network consists of a set of J service stations, indexed by j = 1,... , J, serving K classes of jobs, indexed by k = 1,... , K. There are L priority groups, indexed by £, £=!,... , L, and gt is the set of all job classes belong to group £. Let 7r(-) be a many-to-one mapping from class indices to group indices; specifically, job class k belongs to the priority group 7r(fc). Jobs from group £(£ = !,... ,L) are served exclusively at station j = cr(£), where cr(-) is a many-to-one mapping from group indices to station indices. While each group is served at one station exclusively, each station may serve more than one group. For simplicity, we define a(0) = 0. Note that the composition a o IT is a many-to-one mapping from class indices to station indices. If £ < m, then jobs in group £ are assumed to have a preemptive priority over jobs in group m (£, m = 1,... , L). Within a group, jobs of all classes are served in the order of arrival, that is first-in first-out (FIFO). The network is a feedforward queueing network in the sense that any job at station i can turn into another class at station j only if j > i (i, j = 1,... , J). To illustrate our notation, consider the network given by Figure 2.1, which has J = 2 stations serving K — 6 classes of jobs with L = 4 priority groups. Job class 1 belongs to priority group 1; job classes 2 and 3 belong to priority group 2; job class 4 belongs to priority group 3 and job classes 5 and 6 belong to priority group 4. Priority groups 1 and 2 reside at station 1 and all the other groups reside at station 2. Then, 7r, g and a defined above can be written as

TT(1) = 1, TT(2) = TT(3) = 2,

TT(4) = 3, TT(5) = ?r(6) = 4;

5i = {1}, 92 = {2,3}, 53 = {4}, 54 = {5,6}, CT(1)=CT(2) = 1, a(3) = CT(4) = 2.

To facilitate our analysis, we make some assumptions on the way of indexing priority groups and job classes. As we will see later, this is critical to obtaining a compact presentation of the main results. 2.4 Multiclass Feedforward Networks 32

1

0.5

2 0.5 4

3 5

6

FIGURE 2.1. A multiclass feedforward queue network

• Every station has at least one priority group, and every priority group has at least one job class; hence, necessarily, K > L > J'.

• For k,j = l,...,K, n(k) < ir(j) if k < j. Therefore, jobs of class 1 must be in group 1 and jobs of class K must be in group L.

• For £, m = 1,... , L, a(tj < a(m) if I < m. Thus, jobs from group 1 must be served in station 1, and jobs from group L must be served in station J.

The queueing network is described by the following primitive data: 2K sequences of non-

negative random variables uk = {ufc(n), n > 1} and vk = {vk{n), n > 1} (k = 1,... ,K), and K sequences of if-dimensional vector k(n), n > 1} (k = 1,... ,K), all defined on the same probability space. We assume that there are no jobs in the network at time t = 0. The random variable Ufc(l) is the time of the first exogenously arrived class k job, and Ufc(n), n > 1, is the time between the (n — l)st and nth exogenous arrived class k jobs. The random variable Ufc(n) is the service time required for the nth class k job. The random variable cf>k describes the routing mechanism for class k jobs: the nth class k job after service completion turns into a class j job if (f>k(n) = eJ, and leave the network if (j>k(n) = 0. By the feedforward structure and our numbering convention, it follows that, for all n > 1 and k,i = 1,...,K, the class transitions must satisfy

We introduce the summations,

n

£4(0) =0, Uk{n) = Y,Uk{m),n>l, k = l,...,K, m=l n

Vt(0) =0, Vk{n) = Y^Vk(™),n>l, k = l,...,K, n $*(0)=0, $fc(n) = E 4>k(m),n > 1. fc = l,...,if. m=l

Define their associated counting processes

Ek{t) = sup{n > 0 : Uk{n) < t}, k = l,...,K,

Sk(t) = sup{n > 0 : Vk(n) < t}, k = l,...,K.

1 Let U = (Uk), V = (Vk), $ = ($ ,... E = (Ek) and 5 = (Sk). We call E an exogenous arrival process, S a service process, and $ a routing sequence. Note that we do not assume that the arrival process E, the service process S and the routing processes are renewal processes. Similar to the single-station queueing model above, we assume that there exist a long-run average arrival rate, an average service time and a long-run average transition (routing) rate; namely,

a as t —> oo, t V(n) m as n —> oo, n $fc(n)

k as n —> oo,

where Pk is the kth row of a K x K matrix P = (pkj). We shall call ak, the A;th coordinate

of a, the (exogenous) arrival rate of class k job and call mk, the kth coordinate of m, the average service time of class k job (alternatively the mean service time of class k job when

vk(n) has the same finite mean for all n > 1). Call pkj, the jth coordinate of P'k (and the (k,j)th element of P), the average transition rate that a class k job turns into a class j

k job after completing its service. When

l-< k < K, mk > 0 and call p,k := l/mk the service rate of class k. By our assumption on the routing sequence, it follows that matrix P is a strictly upper triangular matrix. For the 2.4 Multiclass Feedforward Networks 34 network shown in Figure 2.1, the routing matrix takes the form

/ o o o 0 oo\ 0 0 0 0.5 0 0 0 0 0 0 1 0 P = 0 0 0 0 0 0 0 0 0 0 0 0 \ 0 0 0 0 0 0/ where it is assumed that upon service completion, any class 3 job will turn into a class 5 job, a class 2 job will either leave the network or turn into a class 4 job with equal probability and jobs of all other classes will leave the network. We assume that the primitive processes (V, E, ... , $K) have strong approximations, namely, we assume that they are defined on a probability space such that there exist K (K + 2)-dimensional r-continuous process (V, E,&,... , l>K) satisfying

as T -» 00, for some r e (2,4). We note, in particular, that we neither assume that (V, E, $x, ... , $K) is a Brownian motion, nor assume its components V, E and $K are mutually independent.

k However, if uk, vk and (j> , k = 1,...,K, are mutually independent i.i.d. sequences, and uk

and vk have finite moments of order r £ (2,4), then by a multidimensional generalization

of Theorem 2.2, the strong approximation assumptions (2.57)-(2.59) hold with mk (the kth

component of m), l/ak (where ak is the kth component of a) being the means of random

k J variables vk(l) and uk(l), respectively, with pkj = P{ (n) = e } (the 7th component of

K Pk), and with V, E, ... , $ being mutually independent driftless Brownian motions. The covariance matrices of these Brownian motions are, respectively,

= Vki(ki -Pke), k = 1,... ,K,

(Tv)u 2.4 Multiclass Feedforward Networks 35

where co,jt and ck are the coefficients of variations, of random variables, Uk(l) and Vfc(l), respectively, and Sa = 1 if i = £ and 6a = 0 otherwise. Our model has a slightly more general structure than the one described in Peterson (1991). In particular, we allow the routing sequences to include Markovian routing, while Peterson (1991) only considers the deterministic routing.

2.4.1.2 Performance Measures and Their Dynamics

The performance measures of interest are the L-dimensional (aggregated) workload process Z = (Zt) with Zt — {Zi(t), t > 0} (£ = 1,... ,L), the ^-dimensional workload process W = (Wk) with Wk = {Wk(t), t > 0} (k = 1,K), the if-dimensional queue length process Q — (Qk) with Qk = {Qk(t), t > 0} (k — 1,.... ,K), and the L-dimensional cumulative idle time process Y = (Yi) with Yi = {Yi(t), t > 0} (1 = 1,... ,L). The process Z is nonnegative with Zi(t) indicating the total amount of immediate work for station a(£) embodied in jobs that are in groups 1 to £ and that are either queued or in service at station a(£) at time t. The quantity Wk{t) indicates the amount of work embodied in all class k jobs that are either queued or in service at time t. The quantity Qk{t) indicates the number of class k jobs in the network at time t. We assume that Q(0) = 0 and thus Z(0) = 0. The quantity Yi[t) indicates the cumulative amount of time that the server at station -K(£) does not process jobs in groups 1 to £ during [0,t\. It is clear that Y must be nondecreasing and Y(0) = 0. We introduce some additional notation.

• Ak(t) is the total number of class k jobs arrived to station a(ir(k)) during [0, t] either exogenously or from other stations.

• Dk(t) is the total number of service completions of class k jobs at station o~(n(k)) during

[0,*].

• Tk(t) is the total amount of time that server e(n(k)) has served jobs of class k during [o,t].

• Ti(t) is the arrival time of the gi job which has most recently completed service at station a(£) (r^(i) is zero if there have been no service completions for group £).

• Sfc(t.) is the sojourn time of class k jobs at time t at station cr(7r(A;)), denoting the time which will be spent at station a(ir(k)) by the first class k job who arrives at time greater than or equal to t.

• T]k(t) is the time at which the first class k job arrives during [t, oo). 2.4 Multiclass Feedforward Networks 36

Tk{t) is the time that a class k job would spend at station CT(7T(/C)), if it arrived at time

Define two L x K matrices: a higher priority group constituent matrix C and a group con• stituent matrix C\. The (£, k)th component of C, Ctk — 1 if cr(£) = a(n(k)) and Tv(k) < £,

and Ctk — 0 otherwise. The (£,k)th component of C\, Cuk = 1 if n{k) = £, and Cuk = 0 otherwise. Define a K x K (strictly) higher priority class constituent matrix Ci — (C2ij) with

Cnj = 1 if o~(n(i)) = ir(j), and C^ij = 0 otherwise. Consider the example shown in Figure 2.1; under the priority group specification given earlier, the higher prior• ity group constituent matrix C, the group constituent matrix C\, and the (strictly) higher priority class constituent matrix C2, respectively, take the form,

(1 0 0 0 0 o\ (l 0 0 o\ 1 1 1 0 0 0 0 1 1 0 C = 0 0 0 1 0 0 cx = 0 0 v° 0 0 1 1 1J 1J (0 0 0 0 0 0) 1 0 0 0 0 0 1 0 0 0 0 0 Co 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0

Let A = (I — P')~la and call its kth component, A^, the long run average arrival rate of class k jobs, A; = 1,... ,K. Let M = diag(m) and A = diag(A) be K x K diagonal matrices with Aith diagonal elements mk and A^, respectively. Let p = CMX. Note that p is of dimension L; if gt is the lowest priority group at its station CT(£), then pi is the traffic intensity at that station. We shall assume that the traffic intensity at all stations are no greater than one and hence,

1 p < e. Let 5k = (1 — p1r(k-i))~ if o-(n(k — 1)) = <7(TT(A;)) and 5k = 1 if a(ir(k — 1)) < CT(7T(A;)),

and let A = diag(<5) be a K x K diagonal matrix whose A;th diagonal element is 5k- 2.4 Multiclass Feedforward Networks 37

It is from the above definitions that we have the following dynamic relations,

Q(t) = A(t)-D(t), (2.60) W(t) = V(A(t)) - V(D(t)) - u(t), (2.61) Y(t) = et-CT{t), (2.62)

Z(t) = CW(t), (2.63)

Z(t) = N(t)+Y(t), (2.64)

N(t) = CV{A(t)) - et,

D(t) = A(C[r(t)), (2.65)

K A(t) (2.66)

S(t) (2.67)

= Zn{k)(t) + C2k[V(A(Tk(t)+t)) - V(A(t))} +Vk(Ak(t)) - Vk(Ak(t) -- 1X2.68) Tk(t)

where uk(t), the fcth component of v(t), is the partial service time (if any) that has been performed on the class k job during which is dominated by an inequality similar to

(2.15). In (2.68), C'2k is the A:th row of matrix C2, and from the context, we hope it will not be confused with the (2, fc)th element of matrix C. For understanding the above relations, it is helpful to compare them with the relations (2.10)-(2.18) for the single-station case. In particular, relation (2.64) is a workload flow balance relation (in terms of time) for service stations, and relation (2.60) is a job flow balance relation (in terms of number of jobs) for job classes. We shall assume that the work-conserving condition is in force. Hence, the pair (Z, Y) satisfies the reflection mapping relation, which implies that

Y(t) = sup [-N(s)] = sup [es - CV(A(s))}. (2.69) 0

2.4.2 Main Result

For the queueing network model described in Section 2.4.1, jobs can route from station i to station j only if j > i. Now we argue that given the strong approximations (2.57)-(2.59) for the primitive data, we could inductively apply Theorem 2.10 (the strong approximation the• orem for a single station) to the network from stations 1 to station J. First, by Theorem 2.10 and Remark 1 after it, the departure process of each job class from station 1 has a strong approximation; this, the assumption (2.59) (that the routing sequence has a strong approxi• mation) and Lemma 2.3(vi) imply that the arrival processes to station 2 from station 1 also 2.4 Multiclass Feedforward Networks 38

have strong approximations. Since jobs arrive at station 2 either exogenously or from station 1, the total arrival process to station 2 for each job class must have a strong approximation as well. Hence, by applying Theorem 2.10 to station 2, we know in particular that the departure process of each job class from station 2 satisfies a strong approximation. Inductively, we can show that the departure process and the arrival process for each class in the network must have some strong approximations. Therefore, we could apply Theorems 2.10 to each station to obtain the strong approximations for all the performance measures, especially the work• load process of each job class, the aggregated workload processes, the queue length processes and the sojourn time processes. The following theorem presents the strong approximations in a compact form.

Theorem 2.11 Suppose that the strong approximations (2.57)-(2.59) hold. Let

H \C[{CMh.C[)

G CM{I - P')-lP'H and

R (I + G) -l

Then, as T ->• oo

a.s. l T \\Z{t) - Z(t)\\r o(T' ) (2.70)

a.s. 1/r (2.71) \\Y(t)-Y(t)\\T o(T ) \\Q(t)-Q(t)\\T a.s. o{TllT) (2.72) a.s. o(Tl'r) (2.73) \\W(t)-W(t)\\T

a.s. o(Tllr) (2.74) ||S(t)-S(t)||r

where

Q{t) = HZ(t) for t > 0, (2.75)

W{t) = MHZ(t) = MQ(t), (2.76) S(t) = b\c[Z{t) +ml; (2.77) 2.4 Multiclass Feedforward Networks 39 and (Z,Y) are defined as follows:

Z(t) = 6t + X{t) + RY(t) > 0 for t > 0, (2.78)

6 = R{p-e), (2.79) K

fc X{t) = RC V(Xt) + M(I - PY'lEit) + E *(A*t)] (2.80) k=l Y(t) is continuous and nondecreasing with Y(0) = 0, (2.81) POO (2.82) / Z£{t)dYe{t) = 0 for / = 1,... ,L, Jo

Remarks

1. It is shown in the Appendix (Lemma 2.12) that the matrix H is well defined. In fact, let p = C\M\; then the (A;,^)th element of the K x L matrix H is given by

if e = n{k), if e = ir(k)-l and a(l) = o{ir{k)), otherwise.

(The £th component of /3, fie, is the traffic intensity of priority group l at station a(t).) It is also shown in the Appendix that the matrix G is well defined and is strictly lower triangular. Hence, matrix R is also well-defined and is lower triangular.

2. Since matrix R is triangular, by inductively applying the one-dimensional reflection mapping, it is clear that for the given 9 in (2.79) and X in (2.80), relations (2.78), (2.81) and (2.82) uniquely determine the process Z and Y. In particular, when the interarrival sequences Uk {k = 1,...,K), the service sequences (k — \,...,K) and the routing sequences

K

L F = RC YVK + M(I - PT'FE + £ Ar|](fc 7 - P)- M C'R', (2.83) k=\ 2.4 Multiclass Feedforward Networks 40

where TE, Ty and T$ are as given towards the end of Section 2.4.1.1. When the se•

k quences uk, vk and (j) , k = 1,K, are i.i.d. but not mutually independent, X given in (2.80) is still a Brownian motion but with its covariance matrix computed differently.

3. The second equality in (2.76)' is the Little's law for strong approximation limits.

Proof (of Theorem 2.11). In view of the previous discussion, we only need to show that the strong approximation limits in (2.70)-(2.74) are given by (2.75)-(2.82). Specifically, the starting points of our proof are the following results: first the FLIL approximations hold, in particular,

\\A(t)-Xt\\T =• 0(y/T log logT), (2.84)

\\D(t)-Xt\\T =• O^T log logT), (2.85) =• 0(^/T log logT), (2.86) \\r{t) -et\\T

\\T(t)\\T =' 0{y/Tlog logT). (2.87)

Second, the strong approximations (2.57)-(2.59), the strong approximations (2.70)-(2.74), and

a 1 \\A(t) - Xt - A(t)\\T d- op "), (2.88)

IkWIIr =• o(Tl"), (2.89)

a 1 \W)-et\\T 4- op "), (2.90) hold. That the FLIL approximations (2.84)-(2.87) hold can be proved under the FLIL approx• imation assumptions for the primitive data which are implied by the strong approximation assumptions (2.57)-(2.59); the proof is similar to that for Theorem 2.9 and will not be pro• vided. That the strong approximations (2.70)-(2.74) and (2.88) hold for some limits Z, Y, Q, W, S and A follow from an induction proof as outlined before the theorem. That (2.89) and (2.90) hold is an extension of Lemmas 2.6 and 2.7 respectively. What remains now is to identify limits Z, Y, Q, W, S and A and show that they satisfy (2.75)-(2.82).

1 u n For any two processes X and Y, if \\X - Y\\T =' op "), we shall write X{t) » Y(t) to simplify the presentation. With this notation, if Y(t) w X(t), then Y is a strong approx• imation for X implies that X is also a strong approximation for X. In the following proof, we shall repeatedly use without explicitly referring to the FLIL approximations (2.84)-(2.87), the strong approximations (2.57)-(2.59) and Lemma 2.3. 2.4 Multiclass Feedforward Networks 41

First, note that A = AC[e] from (2.61), (2.65) and (2.89), we have

W(t) = V(A(t))-V(D(t))-u(t)

« [MA(t) + V(A(t))] - [MD(t) + V(D(t))]

« [MXt + MA[t) + V{Xt)} - [MA{C[T(t)) + V{Xt)]

ss [MXt + MA(t)] - [MAC[T(t) + MA{C[T(t))}

ss MAC[{et-T{t)); this, combined with (2.73), yields

W{t) x MAC[(et-T{t)). (2.91)

From the above and (2.63), we have

Z(t) = CW(t) ss CMAC[(et - r(i)); (2.92) substituting the above into (2.91) leads to

W{t) ss MAC'^CMAC'^Zit) = MHZ(t)\ this establishes the first equality in (2.76). We also note that the above, together with (2.91), implies

AC[(et - T(I)) SS M-lW{t) « HZ{t). (2.93)

In view of (2.60), (2.65), (2.92) and (2.93), the relation (2.75) and the second equation in (2.76) follow from

Q(t) = A(t)-D(t)

ss [Xt + A{t)]-A{C[T{t))

ss [Ai + A(t)] - [AC[r{t) + A{C[T{t))}

ss AC[{et-T(t))

ss HZ{t). 2.4 Multiclass Feedforward Networks 42

Next, in view of (2.65) and (2.66), we have

K A(t) = £(i) + X>fc(-D*W) fc=i K

k=l K

k « at + E(t) + P'A(C'1T(t)) + '%2$ (\kt) fc=i

as at + E(t) + ^2 ^fc(Afci) + P'ACV(i) + P'i(i). k=l

Note that A(i) =s Ai + A[t) and A = a + P'X = a + P'AC'e; the above leads to

K

1 1 k A(t) « (I-P )- E(t) + ^2$ (\kt)-P'AC[(et-T(t)) k=l K

~ (/_P')-i ^(i) + ^$*(Afci) -P'HZ(t) k=l where the last approximation follows from (2.93). Using the above approximation we can rewrite (2.64) to obtain,

Z[t) = CV{A{t)) -et + Y{t)

« C[MA(t) + V(A{t))]-et + Y{t)

« CMXt + CMA(t) + V{Xt) -et + Y{t) K (p -e)t + C l fc V{Xt) + M(I - P'r[E{t) + £ l> (Afci)] -CM(I - P')-lP'HZ{t) + Y{t);

this, together with (2.70), implies that

Z{t)a8t + X(t)+KY(t),

with 6 and X as defined by (2.79) and (2.80), respectively. This establishes the relation (2.78). The relations (2.81) and (2.82) follow from the corresponding properties for the original processes Y and Z and the Lipschitz continuity ofthe reflection mapping; specifically, the first 2.5 Performance Analysis Procedure 43 relation corresponds to the nondecreasing property of Y and the second relation corresponds to the work-conserving condition as stated by (2.69). Finally, we establish (2.77). In view of (2.68) and (2.87) , we have

Tk(t) = ZAk){t) + C'2k [V(A(Tk(t) + t)) - V(A(t))j + Vk(Ak(t)) - Vk(Ak(t) - 1),

« + [MA(rfc(i) +1) + y(i(rfc(t) +.i)) - M>i(t) - y(i(t))] +mk

« ZAk)(t) + C'2k [MX(Tk(t) + t) + MA(Tk(t) + t) + V(X(Tk(t) + t))

-MXt- MA{t) -V{Xt)] + mk

« Zn(k)(t) + C'2k [MXTk(t) + MA(t) + V(Xt) - MA(t) - V(Xt))] + mk

= ZAk)(t) + C'2kMXTk{t) + mk; this establishes the relation

T{t) w A(C[Z{t)+m).

Therefore, combined with (2.67) and (2.90), we can conclude (2.77). •

2.5 Performance Analysis Procedure

Based on the strong approximation theorem in Section 2.4, we outline a procedure to approx• imate various performance measures of queueing networks. Specifically we consider the case where the interarrival time, the service time and the routing sequences are mutually indepen• dent i.i.d. sequences. In this case, we can approximate the aggregated workload process Z by an RBM Z with drift 6, covariance matrix F and reflection matrix R, which are described by

R = {I + CM{I - P')-XP'H)-\

-R{e-p), K

X RC FVA + M(I - PT^E + £ Afcr|](7 - P)- M CR'. k=l

Readers are referred to Section 2.4.1.1 for the definitions of the vectors and matrices used in the above equalities. In particular, we note that all of them are from the service disciplines, the routing probability, and the mean and the variance of the interarrival and service times. Following from Lemma 2.12 in the Appendix and Lemma 3.2 of Chen (1996), when p < 1, Z has a unique stationary distribution. Under certain condition (which will be elaborated 2.5 Performance Analysis Procedure 44 below), the stationary distribution has an explicit product form. In the more general case, the stationary distribution can be computed numerically. Dai and Harrison (1992) first develop such an algorithm, refered as BNA/SM in this dissertation (which approximates the station• ary distribution by polynomial functions). In Chapter 4 and 5, we develop a new numerical algorithm BNA/FM (which approximates the stationary distribution by finite element func• tions). All numerical examples in this chapter are calculated by this new algorithm. Though this new algorithm can compute the stationary distribution function, we focus on approxi• mating the mean of the stationary distribution only; approximating the distribution function will be dealt with in later chapters. Given the estimate for the stationary distribution of the aggregated workload process, we could obtain estimates for some other performance measures of queueing networks. Let E(Zi) (C = 1,... , L) be the stationary mean for the aggregated workload. We shall describe two alternative methods to obtain the estimates of the stationary mean queue length and mean sojourn time. The first method is to approximate the mean queue length by (2.75) in Section 2.4, and we have

E{Q)=HE{Z). (2.94)

Then, we use Little's Law to obtain mean sojourn time as

E(Sk) = ±-E(Qk). (2.95)

The second method is to approximate the mean sojourn time via (2.77) by

E(Sk) = 5k[E{Z1t(k))+mk], (2.96)

and then obtain the mean queue length by Little's Law,

E(Qk) = XkE(Sk). (2.97)

These two methods are summarized as Algorithms 1 and 2. They usually give different approximations. From our numerical experiments in Section 2.6, Algorithm 2 seems to provide much more accurate estimation than Algorithm 1. Therefore, Algorithm 2 is recommended to obtain approximations for the mean stationary queue length and the mean stationary sojourn time. The numerical evidence also suggests that both algorithms are doing well and

are asymptotically identical for a class k if /3n^ (the traffic intensity of priority group 7r(fc)) 2.5 Performance Analysis Procedure 45

Algorithm 1 Computing steady-state average queue length and sojourn time

E(Q) = HE(Z),

JS(Sfc) = ^E(Qk).

Algorithm 2 Computing steady-state average queue length and sojourn time

E(h) = 8k[E{Zv{k))+mk],

E(Qk) = \kE(Sk).

is close to 1. Intuitively, when is close to 1 and p^k) < 1, ^ is close to or equal to 1 and the workload of all other priority groups at that station should almost be zero. Thus, both algorithms give

E(Sk) « E(Zk).

2.5.1 Product Form Solution

Harrison and Williams (1992) showed that Z has a product form stationary distribution if and only if

Tij = ^RjiTu for all 1 < i < j < L, (2.98)

in which case the solution is

L p(x)=n^exp(~/^:E^' x - °' ^"2- ) 1=1

where n\,... ,KL are the positive constants defined as

2(1 Pe)Ru Kt = ~ for<=l I. (2.100)

The product form condition is rarely satisfied. Peterson (1991) pointed out a special case where the product form condition is satisfied; this special case requires struc• tures; namely, all jobs at each station have the same service time distributions and are served under FIFO service disciplines, and jobs follow deterministic routing. 2.6 Numerical Examples 46

2.6 Numerical Examples

This section is devoted to analyzing two examples, both of which are feedforward queueing networks as described in Section 2.4. We apply our strong approximation to these models to obtain RBM models. Then, we compare the performance estimates from our RBM approxima• tions with the estimates obtained from the RBM approximations obtained by using diffusion approximation in Peterson (1991) and with simulation results. To calculate the steady-state performance measures from RBM models, we use the numerical algorithm BNA/FM devel• oped in Chapter 5.

2.6.1 Single Station With Two Job Classes

Consider the single station network pictured in Figure 2.2. There are two job classes. Class 1 jobs have higher preemptive priority over class 2 jobs. We consider four versions of systems:

1. All interarrival and service times are taken to be of order 4 (SCV=0.25).

2. All interarrival and service times are taken to be exponential (SCV=1).

3. All interarrival and service times are taken to be Gamma distributed with SCV=2.

4. All interarrival times are taken to be exponential and all service times are taken to be Erlang of order 4.

The arrival rates of both classes are 1. For each system, we examine five cases of the mean service times:

1. m\ = 0.7, m2 = 0.1;

2. mi = 0.5, m-2 = 0.3;

3. mi = 0.3, m-2 = 0.5;

4. mi = 0.1, 77i2 = 0.7;

5. mi = 0.2, m2 = 0.2.

Although there is no product form solution for the joint stationary distribution of (Z\, Z2),

the marginal distributions of Z\ and Z2 are exponentially distributed with means

mKcoi +ci) E{ZX) = 2(1-Pi)

E(Z2) = 2(1- pa) 2.6 Numerical Examples 47

V

FIGURE 2.2. Single station with two job classes

respectively (p2 = Aim2 + A2m2 and pi = \\m\). The following three analytical methods are used to obtain approximations of the mean queue lengths and mean sojourn times:

1. By Algorithm 1 in Section 2.5:

T"I(CQI + CI) B(Si) = EQ1 = 2(1-Pi) '

} £7(S2) = EQ2-—{ 2(13^) 2(1 -Pl) -

2. By Algorithm 2 in Section 2.5

EQi = —^7;—+m i 2(1-Pi) 2 1 (ELi^i(4 + ^) E§2 — EQ2 { + m2). -Pl 2(1 -P2)

3. By the diffusion approximation in Peterson (1991):

ESi = EQi=mi, 1 SLi "?(<& +

m2 2(1 - p2)

(Note that a straightforward interpretation ofthe diffusion approximation would yield E%\ = = 0; the suggested approximation above represents a slight improvement.) Table 2.1 summarizes the mean queue length estimates of each job class by using strong approximations, diffusion approximations and simulation. The columns "Strong (1)" and "Strong (2)" in Table 2.1 correspond to the approximations obtained by Algorithm 1 and 2.6 Numerical Examples 48

Algorithm 2, respectively. The numbers in parentheses after the simulation figures show 95% confidence intervals as percentages of the simulation numbers. The numbers in parentheses following other figures are the percentage errors (in absolute value) as compared to simulation numbers. This convention will also be used in the subsequent tables. First we note that as indicated in Section 2.3.5 (c.f. (2.56)), Algorithm 2 gives the exact mean queue lengths for the cases (systems 1 and 2) of Poisson arrivals. (In the corresponding rows, we could have reported the percentage errors relative to the exact mean, but we report the percentage errors relative to simulation results for the consistency with other rows.) It seems that in almost all other cases, Algorithm 2 of the strong approximation also gives the best approximations and its estimates are quite close to simulation results. When SCVs of interarrival and service times are 1, Algorithms 1 and 2 coincide. This is true even when the arrival rates are not equal to 1. Note that the strong approximation Algorithm (2) also performs well when the station is lightly loaded (with p = 0.4). It should be expected that the diffusion approximation would not give a good estimate for the mean queue length of the higher priority class, but it is quite surprising that it also does poorly in estimating the mean queue length of the lower priority class. However, we note that the approximation for the mean queue length of the lower priority class does improve as the traffic intensity for the lower priority class increases (relative to the traffic intensity for the higher priority class); this corresponds to the case when m.2 increases from 0.1 to 0.7 and mi decreases from 0.7 to 0.1 in Table 2.1.

2.6.2 Two-station Tandem Queue

Pictured in Figure 2.3 is a two-station tandem queueing network. Each station has two different job classes. We assume that all exogenous arrival processes and service processes are mutually independent renewal processes. We will estimate the performance of this network under two different service disciplines:

1. The service discipline at station 1 is preemptive priority and the service discipline at station 2 is FCFS. Class 1 jobs have a higher priority over class 2 jobs at station 1.

2. The service discipline at station 1 is FCFS and the service discipline at station 2 is preemptive priority. Class 3 jobs have a higher priority over class 4 jobs at station 2.

For each different service discipline type, we will compute three versions of systems with different service and interarrival time distributions. We list parameters of three systems in Table 2.2. All service and interarrival time distributions are taken to be Erlang of order 4 2.6 Numerical Examples 49

System mi 7712 Class Strong (1) Strong (2) Diffusion Simulation Qx 0.58(42.6%) 1.11(9.9%) 0.70(30.7%) 1.01(0.5%) 0.7 0.1 Q2 2.17(10.0%) 2.42(0.4%) 6.25(159.3%) 2.41(1.9%) Qx 0.25(56.1%) 0.63(10.5%) 0.50(12.3%) 0.57(0.3%) 0.5 0.3 Q2 1.00(27.0%) 1.45(5.8%) 1.42(3.6%) 1.37(1.0%) Qx 0.11(65.5%) 0.33(6.5%) 0.30(3.2%) 0.31(0.2%) 0.3 0.5 1 Q2 0.79(35.1%) 1.32(9.1%) 0.85(29.8%) 1.21(0.7%) Qx 0.03(72.0%) 0.10(0.0%) 0.10(0.0%) 0.10(0.1%) 0.1 0.7 Q2 0.88(34.8%) 1.47(8.9%) 0.89(34.1%) 1.35(0.6%) Qx 0.06(70.0%) 0.21(6.3%) 0.20(0.0%) 0.20(0.1%) 0.2 0.2 Q2 0.10(64.3%) 0.29(3.6%) 0.17(39.3%) 0.28(0.3%) 2.33(1.7%) 2.33(1.7%) 0.70(70.5%) 2.37(2.2%) 0.7 0.1 Qi Q2 8.67(2.9%) 8.67(2.9%) 25.0(180.0%) 8.93(4.9%) Qx 1.00(0.0%) 1.00(0.0%) 0.50(50.0%) 1.00(1.1%) 0.5 0.3 Q2 4.00(2.6%) 4.00(2.6%) 5.67(45.4%) 3.90(1.8%) Qx 0.43(0.0%) 0.43(0.0%) 0.30(30.2%) 0.43(0.7%) 0.3 0.5 2 Q2 3.14(0.6%) 3.14(0.6%) 3.40(7.6%) 3.16(1.9%) Qx 0.11(0.0%) 0.11(0.0%) 0.10(9.1%) 0.11(0.5%) 0.1 0.7 Q2 3.56(0.8%) 3.56(0.8%) 3.57(1.1%) 3.53(2.5%) 0.25(0.0%) 0.25(0.0%) 0.20(20.0%) 0.25(0.5%) 0.2 0.2 Qi Q2 0.42(6.8%) 0.42(6.8%) 0.66(70.9%) 0.39(0.6%) Qx 4.67(13.3%) 3.97(3.6%) 0.70(83.0%) 4.12(3.7%) 0.7 0.1 Q2 17.31(0.1%) 17.00(3.6%) 50.0(190.2%) 17.23(8.8%) Qx 2.00(25.0%) 1.50(6.3%) 0.50(68.8%) 1.60(2.0%) 0.5 0.3 Q2 8.00(4.3%) 7.40(3.5%) 11.33(47.7%) 7.67(5.0%) Qx 0.87(42.6%) 0.56(8.2%) 0.30(50.8%) 0.61(1.2%) 0.3 0.5 3 Q2 6.29(10.7%) 5.57(1.9%) 6.8(19.7%) 5.68(3.8%) Qx 0.22(70.8%) 0.12(7.7%) 0.10(23.1%) 0.13(0.6%) 0.1 0.7 7.12(7.7%) 6.34(4.1%) 7.14(8.0%) 6.61(3.8%) Q2 0.50(51.0%) 0.30(9.0%) 0.20(39.4%) 0.33(0.9%) 0.2 0.2 Qi Q2 0.83(57.2%) 0.58(10.0%) 1.33(151.6%) 0.53(1.0%) Qx 1.46(15.1%) 1.72(0.0%) 0.70(59.3%) 1.72(1.5%) 0.7 0.1 Q2 5.42(2.9%) 5.54(0.7%) 7.81(40.0%) 5.58(3.9%) Qx 0.63(22.2%) 0.81(0.0%) 0.50(38.3%) 0.81(0.7%) 0.5 0.3 Q2 2.50(8.8%) 2.73(0.4%) 3.54(29.2%) 2.74(2.2%) Qx 0.27(28.9%) 0.38(0.0%) 0.30(21.1%) 0.38(0.4%) 0.3 0.5 4 Q2 1.96(12.1%) 2.23(0.0%) 2.13(4.5%) 2.23(2.0%) Qx 0.07(36.4%) 0.11(0.0%) 0.10(9.1%) 0.11(0.3%) 0.1 0.7 Q2 2.22(11.9%) 2.51(0.4%) 2.23(11.5%) 2.52(2.0%) 0.16(30.4%) 0.24(4.3%) 0.20(13.0%) 0.23(0.3%) 0.2 0.2 Qx Q2 0.26(25.6%) 0.35(0.0%) 0.42(20.0%) 0.35(0.5%)

TABLE 2.1. Average queue length in a single-station network 2.6 Numerical Examples 50

1 C O

2 4

FIGURE 2.3. Two-station tandem queue

System Distribution Cii Q.2 m\ m2 m3 7714 1 Ei 1.0 3.0 0.5 0.1 0.3 0.2 2 M 1.0 3.0 0.5 0.1 0.3 0.2 3 Gamma 1.0 3.0 0.5 0.1 0.3 0.2

TABLE 2.2. System specifications of two-station tandem queue

(SCV = 0.25) in the first system, all are taken to be exponential (SCV = 1) in the second system, and all are taken to be Gamma distribution with SCV = 2 in the third system. Tables 2.3 and 2.4 present the simulation estimates and strong approximation estimates of the mean queue length for each job class for each different queueing system configuration. We use BNA/FM algorithm to calculate the mean aggregated workload numerically and then use Algorithm 2 in Section 2.5 to obtain the mean queue lengths. (In this case, both Algorithm 1 and the estimate based on the diffusion approximation provide inferior estimates as well, so they are not presented.) The strong approximation estimates of the mean queue lengths for this two station network are quite impressive compared with the simulation estimates, except the estimate for the job class 4 in type 1 queueing network, in which the strong approximation significantly underestimates the queue length for job class 4. We have no theoretical explanation for it at the moment, though we feel that the large errors might be due to the large variations in the interarrival times of this class (which correspond to the departure times of class 2, the lower priority class at station 1). 2.6 Numerical Examples 51

Approximation System Qi Q2 Q3 Qi No. Method Simulation 0.57 (0.3%) 2.71 (1.1%) 0.99 (1.3%) 4.00 (1.1%) 1 Strong Appr. 0.63 (10.5%) 2.70 (0.4%) 1.04 (5.1%) 2.83 (29.3%) Simulation 1.00 (0.9%) 9.02 (2.6%) 3.32 (2.5%) 12.50 (1.9%) 2 Strong Appr. 1.01 (1.0%) 8.99 (3.3%) 3.2 (3.6%) 9.31 (25.5%) Simulation 1.59 (1.6%) 17.54 (4.0%) 6.59 (4.6%) 24.41 (3.9%) 3 Strong Appr. 1.54 (3.1%) 17.15 (2.2%) 6.25 (5.2%) 18.45 (24.4%)

TABLE 2.3. Average queue length of network 1

Approximation System Qi Q2 Q3 Qi No. Method Simulation 0.78 (0.6%) 1.51 (0.8%) 0.31 (0.2%) 3.79 (1.2%) 1 Strong Appr. 0.85 (9.0%) 1.34 (11.3%) 0.33 (6.5%) 3.58 (5.5%) Simulation 1.90 (2.1%) 4.50 (2.2%) 0.40 (0.7%) 12.17 (2.9%) 2 Strong Appr. 1.89 (0.5%) 4.47 (0.6%) 0.41 (3.5%) 11.80 (3.0%) Simulation 3.42 (3.1%) 8.53 (3.5%) 0.55 (1.0%) 23.86 (4.4%) 3 Strong Appr. 3.30 (3.5%) 8.69 (1.9%) 0.51 (7.3%) 22.65 (5.1%)

TABLE 2.4. Average queue length of network 2 2.7 Appendix

2.7 Appendix

2.7.1 Proofs and An Elementary Lemma

Proof (of Proposition 2.4). Without loss of generality, we assume that a;(0) = 0. If u we have

/(«) - f(v) = sup {-ds - x{s)} - sup {-9s -x{s)} - [-9}+{u - v) 0

sup {-6(v + s) - x{v + s)} - sup {-9s - x(s)} ,0

sup {-9s - x(v + s) + x(v)} 0

If 9 > 0, we have

= sup {-9s - x(s)} - sup {-9s - x(s)} 0 0.

If 9 < 0, we have 2.7 Appendix 53

/(«) - /(«) = sup {-0s - x(s)} - sup {-0s - x(s)} + 6(u - v) 0 sup {-6s - x{s)} - sup {-6s - x{s)} + 6{u - v) u—v

= sup {—6s - x(s + u - v)} - sup {-6s - x(s)} 0 — sup {x(s) — x(s + u — v)} 0 — sup \x(s + u — v) — x(s)\. 0

Let h(T) = v/Tlog logT, for V6>, we have

sup \f{u)-f{v)\ 0

< sup I sup \x(s + u - v) - x(s)\V sup \x(v + s) - x(v)\ > (0

= o(TllT), as T ^ oo.

• Proof(of Theorem 2.5). We provide an inductive proof on dimension K. First, consider K = 1. Then R is a positive scalar. In this case, the one-dimensional RBM Z can be expressed as

Z(t) =X{t) + sup {-X{s)}+, 0

P(||Z||T > z) = P( sup sup [X{t) - X(s)} > z)

0 z) 0 z) 0

for any z > 0, where v = 2\8\/a2. Taking z = (21ogT)/V in the above yields

p(||Z||r/logT>2/i/) then by the Borel-Cantelli lemma, we establish (2.9). If ^(0) = x ^ 0, define Xl(t) = X(t) — -X"(0). Let $ denote the one-dimensional reflection mapping; that is, Z = &{X). Define Zl = <&(Xl), Since X1 is a Brownian motion with a negative drift starting from the origin, the above proof establishes that the bound (2.9) holds for Zl; then the Lipschitz continuity of the reflection mapping $ establishes the bound (2.9) for Z. Next, suppose that the theorem holds for dimension d— 1; we show it also holds for dimen• sion K = d. Since R is a lower triangular matrix, it follows from the induction hypothesis that

a sup \Zk(t)\ = O(logT) (2.101) 0

d d-l

l l l {R')ddZd{t) = Y,{R~)dkXk(t) - J2(R~)dkZk(t) + Yd(t), fc=l k=l with (R-^dd > 0. Let d Xd(t) = ^(R-^dkXkityiR-1)^ k=l and Zd = §{Xd). Note that Xd is a one-dimensional process (actually a Brownian motion), Zd is its reflected process. Then, in view of (2.101) and the Lipschitz continuity of the one- 2.7 Appendix 55

dimensional reflection mapping <&, we have

d sup \Zd(t) - Z (t)\ "=• O(logT). (2.102) 0

Furthermore, note that Xd is a (one-dimensional) Brownian motion with a negative drift,

1 1 since R~ 9 < 0 and (R~ )dd > 0. Applying the proved result for the one-dimensional case yields,

sup \Zd{t)\ ad- O(logT); 0

sup \Zd(t)\ O(logT). 0

The above and (2.101) prove (2.9) for K = d. •

Lemma 2.12 Both matrices H = AC[(CMhC[)-1 and G = CM(I - P')-lP'H are well- defined, and matrix G is strictly lower triangular.

Proof. Let ki denote the number of job classes at station i, and 4 denote the number of priority groups at station i, i = 1,... , J. Because the queueing network is feedforward, we can express the matrix P' as

Pn 0 0

P21 P22 0 P' = (2.103)

PJI Pj2 PJZ ••• PJJ where P^- (1 < i,j < J) is a ki x kj sub-matrix. Since there is no self-feedback, all elements of Pji, i = 1,... , J, must be zero. Matrix (P2)' has a structure similar to (2.103) where the diagonal sub-matrices are zero. Since

(i-p'r1 = (/+P+P2+ ...)', 2.7 Appendix 56 and M = diag{m.j}, to prove the result of this lemma, it suffices to prove that H is well defined and CP'H is a lower triangular matrix. Matrix C can be represented by

Cu 0 0 0 0 C = c22

0 0 0 CJJ

where Cu (1 < i < J) is a £t x kt sub-matrix. This implies that D = CMKC[ is lower triangular. Furthermore, D can be written as

Du 0 0 0

0 D22 0 0 D

0 0 0 ... DJJ\ where Du (1 < i < J) is a £{ x 4 sub-matrix, and has the form of

Ift+i 0 ... 0 Du = Pe+i Pe+2

where R = C\M\ and group I has the highest priority at station i. We know that the inverse of Du has the form

1 0 0 i_ 1 0 Pt+2 Pt+2 Du' =

Thus, it can be verified easily that

if £ = 7r(Jb), Pir(fc) -7^- if l = ir{k)-l and a(^) = a(7r(fc)), 0 otherwise. 2.7 Appendix 57

It is obvious that H has the following representation

#11 0 0 ... 0

0 tf22 0 ... 0

0 0 0 ... HJJ where Ha (1 < i < J) is a ki x l± sub-matrix. By these representations, it is clear that CP'H is lower triangular. This completes the proof. •

2.7.2 General Traffic Intensity Case

In this section, we show that strong approximation can also be applied for approximation even if the traffic intensity at a station is larger than 1. For simplicity, we only consider single-station queueing networks. This is an extension of the discussion in Section 2.3. We are satisfied not to provide a proof here since the proof is analogous to that in Section 2.3. Define a partition of K by

Kn = {k G IC : p^k) < 1},

Kb = {k G K : p^fc)-! < 1 < Pn(k)},

Ks = {k G K : p^ > 1 and p^-i > 1}.

Note that both Kb and Ks are empty sets, if p < 1; also note that the set Kb would be an

empty set if pw(t) = 1 for some k G Kn. Also note that all classes in Kn have higher priorities

than classes in Kb and Ks, and all classes in Kb have higher priorities than classes in Ks.

Hence, jobs of classes in Kn do not see jobs of classes in Kb and Ks, and jobs of classes in Kb

do not see jobs of classes in Ks. On the other hand, jobs of classes in Ks see all jobs of classes

in Kn and Kb in front of them in the queue, and jobs of classes in Kb see all jobs of classes

in Kn in front of them in the queue. Based on what they see, jobs of classes in Kn observe the queue with a traffic intensity less than or equal to 1, or the queue is non-bottleneck or

balanced bottleneck; jobs of classes in Ks observe the queue with a traffic intensity strictly

greater than 1, or the queue is strictly bottleneck. When Kb ^ 0, all jobs in Kn see the queue with a traffic intensity strictly less than 1. In this case, if a job in Kb were given the highest (preemptive) priority over the other jobs in Kb, then it would observe the queue is non-bottleneck, and if a job in Kb were given the lowest (preemptive) priority to the other jobs in Kb, then it would observe that the queue is strictly bottleneck; hence, overall, jobs in Kb hold a balance or fall between non-bottlenecks and strict bottlenecks. Actually, when 2.7 Appendix 58

K,b ^ 0, /C(, = geb for some 1 < 4 < L, and hence, all jobs in K,b are served in the order of their arrival (FIFO).

Theorem 2.13 Suppose that the FLIL assumptions (2.23)-(2.25) hold. Then as T oo,

a sup \zt{t) - zt(t)\ = o(Vriogiogr), 0

s sup \Wk(t) - (Xk - X*k)mkt\ °= - O(VTloglogT), o

a 5 sup \Qk(t)-(Xk-X*k)t\ = ' 0(VT log logT), 0

sup |Sfc(*)-S*(t)| =' Otx/TloglogT), o

a sup |Z?it(t)-A^| = 0(x/TloglogT), 0

+ a sup |y,(£) - (i - Pl) t\ = o(VriogiogT), 0

a sup \Tk(t) - X*kmkt\ = 0(VTlog log T), 0

a.s. l T sup \Qk(t)-(Xk-X*k)t-Ak(t) + Wk(t)\ o(Tl ) 0

a.s. 1 sup \Wk(t)-(Xk-X*k)mkt-Wk(t)\ op ") 0

Zi(t) = Nt(t) + Yi(t), i

Nt(t) = (w-lJt + EE^iW-^^*)]'

+ Yt(t) = swp {-Nt(s)} 0

ffl i + Sfc(t) = r-3 {^)W + EE j[ jW ^^

-Aj(ZAk)(t)+t) - Vj(^(^{k)(t)+t))]},

t x t wk(t) = Ak(fn(k)t)-^YJ^MMk))-M j)} P^ jegt

+ + -z^[YHk)(t) - (1 - pm) t] + ^-[Yv(kyi(t) - (1 - P.(fc)-i) t], Pn(k) P-x(k)

Wk(t) = -mkWk(t)+mkAk{t)-Vk(Xkt) + Vk{X*kt),

and Wk is r-strong continuous.

In fact, we can get an equivalent but easier to understand strong approximation form for the workload and queue length of individual job classes. 2.7 Appendix

Corollary 2.15 Suppose that the assumptions (2.34)-(2.36) hold with 2 < r < 4. Then k = 1,... , K, as T —> oo,

a 1 sup \Qk(t) - Qk(t)\ =- o(T ^), 0

a l r sup \Wk(t)-Wk(t)\ d- o(T' ), 0

k Wk(t) = ^[ZAk) - ZAk)_x{t) - W< \t)} + mk[Ak(t) - Ak(

+[Vk(\kt)-Vk(\*kt)],

k k W) = ^-[Zv{k)(t) - ZAk)_x{t) - W^ \t)} + [Ak(t) - Ak(^ k)}, Piv(k) We(t) = £ {miMt) ~ MM) ~ rnM^t) + V,(\*t)} . i€9e 61

Chapter 3 Brownian Approximations of Multiclass Queueing Networks

3.1 Introduction

In the performance evaluation of queueing networks, the Brownian approximation refers to approximating certain processes of interest, such as workload, queue lengths and sojourn times, by a semi-martingale reflecting Brownian motion (SRBM, or its special case, a reflect• ing Brownian motion). (Refer to Section 3.4 for the definition of the SRBM.) In the context of a generalized Jackson network, which is essentially a single-class model — in particular, every station processes one class of jobs, the Brownian approximation is supported by either a heavy traffic limit theorem (also known as the diffusion approximation, refer to Reiman (1984), and Harrison and Williams (1987)), or by the strong approximation, another type of limit theorem (refer to Chen and Mandelbaum (1994)). In multiclass queueing networks, the situation is more complex, since there are known examples where the SRBM does not exist (e.g., Dai and Wang (1993)). However, in the special case of a multiclass feedforward network, Harrison and Williams (1992) does establish an SRBM approximation, supported by the heavy traffic limit theorem of Peterson (1991) and the strong approximation theorem in Chapter 2. In another special case, the re-entrant line network under a first-buffer-first-served (FBFS) discipline and a last-buffer-first-served (LBFS) discipline, Dai, et al. (1997) proposes an RBM approximation, which is supported by the heavy traffic limit theorem of Chen and Zhang (1996) for the FBFS discipline, and by Chen and Ye (1999) and Bramson and Dai (1999) for the LBFS discipline. Although not supported by a heavy traffic limit theorem, Harrison and Nguyen (1990, 1993) develops SRBM approximations for a general multiclass network with a first-in-first-out (FIFO) service discipline, and shows through numerical examples that the approximations work quite well. Harrison and Pich (1996) extends the approximations to networks with multiple, unreliable servers and other features. Motivated by these previous works, the objective of our study in this chapter is to de• velop SRBM approximations to general multiclass networks operating under priority service disciplines and with unreliable servers. In fact, we allow a mixture of priority and FIFO disciplines: each station in the network processes one or several priority groups, whereas 3.1 Introduction 62 within each group the classes are served on a FIFO basis. We start with conditions, which are diffusion or strong approximations, on the primitive processes: external arrivals, service requirements, service capacities (up and down times), and the routing mechanism; and de• velop the SRBM approximations for derived processes such as workload, queue lengths and sojourn times. The focus of the chapter is not on proving limit theorems so as to justify why the network in question can be approximated by an SRBM. Rather, our intention is to illustrate how to approximate a general multiclass network with priorities and server breakdowns by an SRBM. We make no claim that the proposed approximations can always be justified by some limit theorems. Quite to the contrary, through both analysis and numerical results, we identify cases where the SRBM approximation may not exist, or may work poorly. In particular, through the well-known Kumar-Seidman network (Kumar and Seidman (1990)), we show that the SRBM may not exist even when the network is stable, or when a heavy traffic limit exists. 
On the other hand, if the network is "close" enough (in some parametric sense) to the limiting network under heavy traffic, then the SRBM does exist. Furthermore, in the case of a Bramson-like network (Bramson (1994)), where we can charac• terize the region in which the SRBM exists, we illustrate through numerical examples that near the boundary of the SRBM existence region the approximation could work poorly. In almost all of the previous works on the Brownian approximation of queueing networks involving priority service disciplines, the queue-length and the workload of the higher prior• ity classes are approximated as zero. This is known as the state-space collapse phenomenon (Reiman (1984,1988)). Although this is supported by the heavy traffic limit (when the latter exists), zero is clearly not a good approximation, in particular when the traffic intensity of the higher priority classes is substantial. This is illustrated by numerical examples in Chapter 2 via a feedforward network. In contrast, the SRBM approximation proposed in this chapter provides a non-zero approximation to the higher priority classes (and in this regard, general• izing the work of Chapter 2). On the other hand, we also develop the SRBM approximation based on the state-space collapse phenomenon, which does have the computational advantage of a lower-dimensional state space, and may work reasonably well when the traffic intensity associated with the higher priority classes is negligible.

To implement the SRBM approximation, numerical computation is required. For this pur• pose, an algorithm, refered as BNA/SM, was developed in Dai (1990), and Dai and Harrison (1992). In chapter 4 and 5, we provid an alternative implementation based on the finite el• ement basis (instead of the global polynomial basis used in BNA/SM), which improves the 3.1 Introduction 63 numerical stability. This is the algorithm used in this chapter, and the resulting approxima• tions exhibit good accuracies in most cases, even when the traffic intensity is moderate. To evaluate the performance of queueing networks with the kind of generality specified above, the SRBM approximation appears to be the only viable alternative to simulation. Traditional, decomposition-based approximations (e.g., Shanthikumar and Buzacott (1981), Whitt (1983), among others), which appear to work quite well in the generalized Jackson network, do not extend easily to the multiclass setting. The decomposition approach typically approximates the arrivals to each station by a renewal process, whereas in multiclass networks the arrivals are most often highly correlated, so much so that they could easily cause instability even when the usual traffic condition is satisfied (which suffices to guarantee stability in the generalized Jackson network); refer to, e.g., Kumar and Seidman (1990), and Bramson (1994), among others. In other words, applying the decomposition-based approximation to a multiclass network could result in predicting say, a finite steady-state mean queue length at a station, whereas the true queue-length process may not even possess a stationary distribution. Compared with simulation, on the other hand, the advantage of the SRBM approximation lies in its tractability, as an analytical model, in solving optimization or scheduling problems associated with the queueing networks, in a wide range of applications; refer to, e.g., Harrison and Wein (1990), Wein (1990) and Laws (1992). The rest of the chapter is organized as follows. We start with a formal description in Sec• tion 3.2 of the multiclass network under study. In particular, we specify the primitive and the derived processes, and the dynamical equations that relate the two. In Section 3.3, we present the conditions on the primitive processes, in terms of diffusion or strong approxi• mations, and illustrate how these asymptotics hold in several example systems. Our main approximation scheme is developed in Section 3.4, where we derive the SRBM approxima• tion for the aggregated workload process based on the conditions imposed on the primitives, and extend the approximation to other derived processes. Several issues surrounding the SRBM approximation are discussed in Section 3.5, and further illustrated through analyzing the Kumar-Seidman network. The connection to the notion of state-space collapse is also examined in that section. Extensive numerical studies are presented in Section 3.6, where the SRBM approximation is compared against simulation for a variety of networks. Brief concluding remarks are summarized in Section 3.7. 3.2 Model Formulation 64

3.2 Model Formulation

3.2.1 Notation and Conventions

We study a queueing network that consists of J single-server stations, indexed by j € J = {1,J}, and K job classes, indexed by k € IC := {1,... , K}. K job classes are partitioned into L (non-empty) groups, gi,...,gi- Group gt, or sometimes simply called group £, jobs are served exclusively at station sigma{£). Thus, a is a many-to-one mapping from C : = {£\, ...,£L}- onto J. Necessarily, J < L < K. Within each group, jobs are served in their order of arrival (i.e., FIFO). (While all job classes with a group follow the same service discipline, they are statistically different in terms of arrival times, service requirements and routing mechanism as shall be specified in Section

3.2.2.) Among different groups, group £\ jobs have a preemptive priority over group £2 jobs

if £\ < £2, provided that they are served at the same station, i.e., o-(£\) = a(^)- Define a mapping n from IC onto L: n(k) = £ if and only if A: € Then the above convention implies that class k jobs are served exclusively at station CT(7T(A;));whe n a(ir(i)) = o(n(k)), class i jobs have a preemptive priority over class k jobs if ir(i) < n(k); they are served FIFO if n(i) = ir(k). For any k € IC, let h(k) denote the index of the group with the next higher priority at station cr(7r(A;)) (i.e., h(k) < ir(k) and a(h(k)) = cr(7r(A;)); and if h(k) < £ < n{k), then it must be a{£) ^ a(n(k))), and let h(k) = 0 if class k has the highest priority at its station. We introduce a J x L station constituent matrix C = {Cjt)j£jj£c'- The (j, £)th component of C, Cji = 1 if a(£) = j (i.e., group £ is served at station j) and Cji — 0 otherwise. Next, we introduce two LxK matrices: group constituent matrix G = [Gik)iec,keK. and higher priority

group constituent matrix H — {Hik)i£C,k£K'- The {£, k)th. component of G, Gik = 1 if fc'G gt, and Gtk = 0 otherwise. That is, each row £ of G specifies what job classes belong to the group gt. The (£, k)th component oi H, Ha- = 1 if a(£) = o(n(k)) and -n[k) < £; otherwise, Htk = 0. Note that the condition for = 1 necessarily includes k € gi. Hence, each row £ of H includes not only the classes that belong to the group gi, but also all those classes that are served at the same station o(£) but with a higher priority than those in g^. While matrix C specifying the station constituents in terms of groups and matrix G specifying the group constituents in terms of classes, the matrix product CG spells out the station constituents in terms of classes.

Some examples are in order. 3.2 Model Formulation 65

Example 3.1 Consider the generalized Jackson network, where there is a single stream of external arrivals and hence a single job class at each station. In this case, we can simply set 9k = {k} and a(k) = k for k G IC, and hence K = J = L and C = G = H = I.

Example 3.2 Consider a multiclass network with pure priorities, i.e., each priority group contains a single class. An example of such a network is shown in Figure 3.1, which is known as the Kumar-Seidman network. Here, classes 2 and 4 have higher priorities at their respective

Class 1 Class 2

Class 4 Class 3

FIGURE 3.1. Kumar-Seidman Network

stations. The partition of the classes can be specified as g\ = {4}, g2 = {1}, gs — {2} and

g4 = {3}. Then, the matrices C, G and H can be identified as follows:

I 0 0 0 1 { 0 0 0 1 \ 1 0 0 0 1 0 0 1 C = G = and H = 0 1 0 0 0 1 0 0 0 1 1 0 \ 0 0 1 0 ) \ I

Example 3.3 Consider a two-station five-class network as depicted in Figure 3.2. All jobs are served FIFO. This is a variation of what is known as the Bramson network. Here, because

0.05

0.8

FIGURE 3.2. A Variation of Bramson Network

of FIFO, we have L = J= {1,2}, gx = {1,5} and g2 = {2,3,4}, and a(l) = 1 and CT(2) = 2. 3.2 Model Formulation 66

(The priority does not play any role in this case, and we could alternatively define, e.g.,

g2 = {1, 5} and gx = {2, 3,4}, and o(l) = 2 and a(2) = 1.) Furthermore, C = I,

1 0 0 0 1 G^f ). (3.1) y o i i i o y

3.2.2 The Primitive Processes

The network under study is driven by the following four primitive processes:

(i) External Arrivals. Jobs of class A; arrive at the network exogenously according to a

counting process Ek = {Ek(t),t > 0}, where Ek(t) indicates the number of arrivals up

to time t, with Ek(0) = 0.

(ii) Service Requirements. The n-th job in class A; requires vk(n) units of processing time

from the server at station 1}, and denote

n

V*(0)=0, Vk(n) = Y,vk(m); Vk = {Vk(n), n > 1}.

771=1

(iii) Service Capacities. For each station j, the server is constrained by a (cumulative) service capacity process, Cj = {cj(t),t > 0}, where Cj(t) is the maximum amount of work that the server j can accomplish up to t. Hence, if server j is perfectly reliable, then Cj{t) = t. (Assume all servers work at unit rate.) If the server is subject to random disruptions (breakdowns), then Cj(t) is the cumulative up-time over the interval [0,t]. (More on this in Example 3.6.)

(iv) Routing Mechanism. After service completion a job may change classes (in particular if it is routed to another station). For each class k £ IC, define k = { 1} such that the nth job of class k after service completion becomes a class j job if fc(n) = eJ (the j'-th unit vector), or leaves the network if = (4>k) the routing sequence. Let n $FC(0) = 0, $fc(n) = ^ k{m), n > 1.

771=1 Denote $k{t) = $*(|tj), for all t > 0; and $k = {$k{t), t > 0}. 3.2 Model Formulation 67

3.2.3 The Derived Processes

The derived processes are the performance measures of interest in the network. We shall focus on the following four processes:

(I) the workload process: W = (Wk)keic, where Wk = {Wk{t),t > 0}, and Wk(t) denotes the amount of work embodied in all class k jobs that are either queued or in service at time t:

(II) the queue-length process: Q = [Qk)ke)Ci where Qk — {Qk(t),t > 0}, and Qk{t) represents the number of class k jobs either in service or in queue at time t;

(III) the sojourn-time process: S = {Sk)keKi where Sk = {Sk{t),t > 0}, with Sk{t) denoting the time that will be spent at station a{k) by the first job of class k that arrives at the station at time t or afterwards;

(IV) the aggregated workload process: Z = (Z^k^ic, where Zt = {Zt(t),t > 0} (£ € £), and Zt(t) represents the total amount of work embodied in those jobs that are either queued or in service at station o{£) at time t and with priorities no lower than £.

To specify the dynamics of the above processes, we also need some other (intermediate) performance measures:

• Ak(t): total number of class k jobs arrived to station a(ir(k)) during [0, t], either exter• nally or from other stations;

• Dk{t): total number of departures (service completions) of class k jobs from station a(n(k)) during [0, t];

• Yt(t): the cumulative amount of service capacity during [0, t] that station -K(£) does not use to serve jobs with priority £ or higher. In particular, if group gt has the lowest priority at station a(£), then Yt{t) indicates the cumulative amount of service capacity during [0, t] at station o(£) that is lost due to the lack of jobs for service (corresponding the idle time if one interprets the service capacity as the time available for service).

Furthermore, denote:

• Vk[t)'- the arrival epoch of the first job in class k during [t, co);

• Tk{t): the time a class k job will spend at station a(k), given it arrives at t;

• rt(t): the arrival time of the job in group gt that has most recently completed service at station o(£) (r^(i) = 0 if none of the jobs in group gt has been completed by t); 3.2 Model Formulation 68

• vk(t): the partial service time (if any) that has been performed on a class k job during

(T7T(fc)-*]-

Note the following bounds for nk(t) and uk(t), k £ IC (which will be used later):

0

um where uk(m) is the interarrival time between the (m — l)-st and the m-th class k jobs (J P

points of Ek)\ and

0 < vk(t) < max vk(n), (3.3)

l

i.e., uk(t) is bounded by the maximum service requirement of all class k jobs that have arrived up to t.

We can now derive the following relations:

K A(t) (3.4) lr—1 K— 1 D(t) = A(G'r(t)), (3.5)

Q(t) = A(t)-D(t), (3.6) W(t) = V(A(t))-V(D(t))-u(t), (3.7)

Z(t) = HV{A{t))-C'c{t)+Y{t), (3.8)

Z(t) = HW(t), (3.9)

S(t) = nm- (3.10)

The equation in (3.4) specifies the total arrivals (of any class) as the sum of the exter•

nal arrivals and internal transitions; (3.5) is best understood in component form: Dk(t) =

Ak(r^^(t)), taking into account that all jobs in group g^^) are served FIFO; (3.6) says the difference between arrivals and departures is what is left in the network (i.e., queued or in service); and (3.7) is a variation of (3.6) in terms of workload, and partially completed jobs are accounted for in the last term u(t). Furthermore, writing (3.8) in component form, we have

Mt)= X) MAk{t)) - C(T{i){t)+ Yt{t), ieC. 3.3 Conditions on the Primitive Processes 69

Note that we must have, for all I € £,

dYt > 0, Ye{0) = 0; ZtdYt = 0. (3.11)

In words, Y^{t) is nondecreasing in t, starting from 0; and it can only increase at t if Zi(t) = 0. These are consistent with the specification of the (preemptive) priority discipline and the work-conserving condition. The relation in (3.9) follows from the definitions of the two workload processes, W and Z, and the matrix H. Finally, (3.10) is self-explanatory, given the definition of T and n. Note, however, that 7~ is governed by the following recursive relation:

%(t) = Zw{k)(t)+ Y, lVi(Mt + %(t)))-Vi(Mt))] i:ff(7T(t))=

+ [Tk(t) - C^fc)) {t + Tk(t)) + ca{n(k)) (f,)]

+ [Vk(Ak(t))-Vk(Ak(t)-l)], (3.12) where the first term on the right-hand side is the aggregated workload over all those jobs that are no lower in priorities than the class k job that indexes 7fc, the second term is the sum of the work embodied in those jobs that arrive during the sojourn time of the class k job in question and have higher (preemptive) priorities, the third term is the down-time of station a(n{k)) during the sojourn time of the class k job in question, and the fourth term is the job's own service time.

3.3 Conditions on the Primitive Processes

Recall in Section 3.2.2, the four primitive processes that drive the dynamics of the network are:

k E = (Ek)keK, V = (Vk)keK, c=(Cj)jeJ, $ = (® )keK, denoting, respectively, the external arrivals, the service requirements, the service capacities, and the routing mechanism. 3.3 Conditions on the Primitive Processes 70

The starting point of our approximation scheme is to assume that (E, V, c, $) jointly sat• isfies the following:

E(t) « at + E{t), (3.13)

V{t) « mt+V(t), (3.14)

c(t) « Kt + c{t), (3.15)

k k ® {t) « P'kt + $ {t), keJC. (3.16)

Here,

a = {ak)keK, rn = {mk)keic, K = {Kj)jej, P'k = {pki)ieic, (3.17) are non-negative vectors (drift rates); and

k k E = (Ek)keK, V = (Vk)kac, c=(cj)jeJ, $ = @ )ieic, (3.18)

are driftless Brownian motions, with covariance matrices denoted FE, IV, Tc and F$k, re• spectively. The approximation "w" means, e.g., in the case of (3.13): either

sup \\E(t) -at-E{t)\\ =' o(T1/r), asT->oo, (3.19) 0

for some r e (2,4) (which is known as strong approximation; see, e.g., Csorgo and Horvath (1993)), or

n-1/2[E{n-)-na-} A E(-), as n-> oo, (3.20)

where A denotes weak convergence (see, e.g., Billingsley (1968)). The same holds for (3.14), (3.15) and (3.16) as well. Several technical remarks are in order. First, the strong approximation generally assumes that there exists a probability space on which a version of E and E are defined such that (3.19) holds (for this version); whereas here we simply assume that the primitive processes have already been defined on an appropriate probability space. Second the strong approxi• mation in (3.19) clearly implies the weak convergence (or, diffusion approximation) in (3.20). However, the strong approximation also requires stronger conditions. For example, if E is a re• newal process, then the interarrival times are required to have finite moments of higher order (than the second moment, which is the requirement of the weak convergence). Hence, wher- 3.3 Conditions on the Primitive Processes 71 ever applicable, the strong approximation is preferred. In addition to its stronger conclusion (than the weak convergence), another advantage is the it does not require the time-and-space scaling associated with the sequence of networks, which are necessary objects in the diffu• sion approximation. However, relatively fewer networks have been shown to follow strong approximations; refer to Section 3.5.1 for more details. Note the vectors in (3.17) have the following physical interpretation (all in the sense of long-run averages):

• ak is the external arrival rate of class k jobs (hence, if A; denotes a class that is generated

only by internal transitions, then necessarily ak = 0);

• rrik is the mean service requirement of class k jobs;

• Kj is the service capacity of station j, i.e., the maximum output rate from the station;

• Pki is the proportion of class A; jobs that transfers (upon service completion at station a(ir(k))) into class i (and transits to station a(n(i))).

For future reference, we collect here some more notation that relates to the primitive processes: Denote:

• P = (pki)i the routing matrix. Note the A;th row of P is Pk, whose transpose appears in (3.16). P is assumed to be substochastic, with Pn -> 0 as n —> oo.

• A = (I — P')~la, whose A>th component, A^, is the (nominal) arrival rate of class A; jobs at station a(k), including both external arrivals (if any) and internal transfers.

• M = diag(m) and A = diag(A), two K x K diagonal matrices with the A>th diagonal

elements being and Xk, respectively.

• (3 = HMX, whose £-th component is the summation of the group traffic intensities (which are given by GMX) for those groups at station o(t) with the priority no less than group £.

• p = CGMX, whose j-th component is the traffic intensity of station j. Note that if gi is

the lowest priority group at its station, then (5i = Pa{i)- Throughout below, we assume p < K, i.e., the traffic intensity at every station is strictly less than its service capacity.

Below we illustrate through some examples how to specify the parameters that characterize the approximations in (3.13)-(3.16), specifically, the drift terms and the covariance matrices (of the driftless Brownian motions). 3.3 Conditions on the Primitive Processes 72

Example 3.4 (Renewal Arrivals, i.i.d. Service Requirement and Markovian Routing) These

are the most common assumptions in queueing networks. Specifically, for each k £ IC, EK

is a renewal process with arrival rate ak and squared coefficient of variation (SCV) of in•

terarrival times CQ k, and (E\, ...,EK) are mutually independent. (The squared coefficient of variation of a random variable equals its variance divided by the square of its mean.) Then, the approximation in (3.13) holds, and the covariation matrix TE is:

(FE)ik = oikcltk8ik, i, k £ IC.

This can be justified by either the functional central limit theorem (also known as Donsker's Theorem) or the functional strong approximation theorem (Csorgo and Horvath (1993)).

Similarly, the service requirements, (vk)ke)c, are K mutually independent i.i.d. sequences,

with mean (mk)ke)c and SCV (c|)fce/c- (The SCV c\ should be distinguishable from the capacity process Cj from the context.) Hence, the approximation in (3.14) also holds, with the covariance matrix,

(Tv)ik = mlclSik, i, k £ IC.

k The same also applies to the routing sequences, ( )keiCi which are also mutually inde• pendent i.i.d. sequences. In particular, the approximation in (3.16) holds, and the covariance matrix

(T^Jim =Pki(km -Pkm), i,m £ K, for all k £ IC.

Example 3.5 (Batch Arrivals) Suppose for each k £ IC, class k jobs arrive in batches:

l The interarrival times of the batches are i.i.d. with mean ak and SCV c^k, and the batch

sizes are i.i.d. with mean bk and SCV c^k. Assume that the batch sizes are independent of interarrival times and that the arrivals among different classes are mutually independent.

Then the approximation in (3.13) holds with ak = akbk and

2 2 {Ts)ik = akb k{clk + c bk)6ik, i,k £ IC.

(See Chapter 2)

Example 3.6 (Perfect Service Stations) If station j is perfectly reliable, then the jth com• ponent of the approximation in (3.15) holds with Kj = 1 and dj = 0. 3.4 The SRBM Approximation 73

Example 3.7 (Service Stations with Breakdowns) A station (server) j is subject to "au• tonomous breakdowns," if breakdowns can occur regardless whether it is idle or working. In this case, let {(uj(n),dj(n)),n > 1} be an i.i.d. sequence, where Uj(n) and dj(n) denote the duration of the n-th up-time and the n-th down-time (including repair time), respectively. For simplicity, assume that Uj(n) and dj(n) are independent. (The station can start in either the first up period or the first down period.) Let dj and c\ (UJ and c\.) denote the mean and the squared coefficient of variation of dj(n) (uj{n)). Then following Chen and Whitt (1993), the approximation in (3.15) holds with

UJ + dj and

u]dj(clj+cl)

{T^1 = Uj+dj 5jh

A station (server) is subject to "operational breakdowns" if breakdowns can only occur while the station is actively processing jobs. If a station j has an infinite supply of work and is hence always engaged in service, then the distinction between the operational breakdown and the autonomous breakdown disappears for the process c. This is the case in heavy traffic. In general, Harrison and Pich (1996) proposes a refined approximation as follows:

{Te)ji = Pi . "frWi+^lj

Either of the above modes of the breakdowns implicitly assumes that there is a dedicated repair-person for each service station. When more than one service stations share one repair• person, the approximation (3.15) still prevails, except it becomes more difficult to identify

the approximation parameters n and Tc. See Harrison and Pich (1996) for details.

3.4 The SRBM Approximation

The essence of our approximation scheme is to start with the asymptotic conditions on the primitive processes in (3.13)-(3.16), and derive the corresponding approximations for the derived processes, in particular, the four processes (W, Q, S, Z) specified in Section 3.2.3. 3.4 The SRBM Approximation 74

A key assumption we need is that the arrival process A (also defined in Section 3.2.3) satisfies the following:

A(t) w Xt + A(t), (3.21) where A is a continuous process such that either A is r-strong continuous or the weak limit of A(nt)/\/n (as n —>• oo) exists and has a continuous sample path. (A process x G T>K (a space of K-dimensional functions which are right-continuous and have left-limits) is said to be r-strong continuous if

sup \\x{u) — x{v)\\ as T —»• oo; |u-u|

sup \\A{t) - Xt\\ - 0{y/T\oglogT), asT->oo, o Xt, u.o.c, as n —> oo. n When either of the above holds, we shall write:

A{t) ~ At. (3.22)

Furthermore, we can show that the above is equivalent to either one of the following two approximations:

D(t) ~ At, (3.23)

r(t) ~ et. (3.24)

In view of (3.22) and (3.23) — that the departure rate equals the arrival rate — it is clear that the approximations in (3.22)-(3.24) should hold if the queueing network is stable (or weakly stable). (It is shown in Chen (1995) that the approximation in (3.22) holds for a weakly stable queueing network under any work-conserving discipline. The approximation in (3.24) can be similarly established.) Indeed, here we are only interested in approximating a stable network. 3.4 The SRBM Approximation 75

In deriving the approximations below, we shall repeatedly make use of the following result (refer to Chapter 2): Suppose two processes, X and T, satisfy the following:

X{t)^at + X{t) and T{t) ~ bt, such that either X is r-strong continuous or the weak limit of X[nt)/y/n (as n -> oo) is continuous. Then,

X(T{t)) « aT(t) + X{bt).

We are now ready to derive the approximations for the processes (W, Q, S, Z). First, ap• plying (3.21) to (3.5) yields,

D(t) = A{G'T{t)) « AG'T(t)+A{t). (3.25)

Next, applying (3.13) and (3.16) to (3.4) yields,

K

1 k A(t) « at + E{t) + £i$ {\kt)+P'D{t). (3.26)

fc=i

Note that A = a + P'X and A = AG'e. Hence, combining (3.21), (3.25) and (3.26) leads to

K

fc A(t) « (I-P'Y £(i)+£$ (Afci) fc=i -{I -P')-lP'AG'[et-T(t)}. (3.27)

From the bound in (3.3), we can show that the residual service time satisfies v(t) w 0. Then, applying (3.14), (3.21) and (3.25) to (3.7), we obtain,

W(t) « M[A(t) - D[t)] « MAG'[et - r(t)}. (3.28)

In view of (3.9), the above implies:

Z(t) w HMAG'[et-T{t)}. (3.29) 3.4 The SRBM Approximation 76

Making use of first, (3.15), (3.14) and (3.21); and then, (3.27) and (3.29), we obtain

Z{t) « H[MA{t)+ V{Xt)]-C'[Kt + c{t)} + Y{t) r K

l k « HMXt + HM(I - P')~ E{t) + J2$ (Xkt) k=l (3.30) +HV{Xt) - NZ(t) - C'[Kt - c{t)} + Y{t), where

N = HM(I - P')_1P'A, and A = AG'[HMAG'] (3.31)

We note the above inverse matrix always exists and the (k, £)-th component of the matrix A is as follows

' Xk/(GMX)e itkegt,

Aki = I -Xk/{GMX)i> like ge and £ = h{£'), 0 otherwise, where (GMX)i is the traffic intensity of group £, £ £ C. Assuming the existence of the inverse of I + N, denote:

R = (I + N)~l = (HMAG')[HM(I - P'^AG'}'1, (3.32)

6 = R{P-C'K), (3.33) and

K

l fc X{t) = R HM(I - P')~ [E{t) + X £ (Afci)] + HV{Xt) - C'c{t) . (3.34)

Here A" is a driftless Brownian motion. To get a feel for the covariance matrix of X, denoted T^, consider the case when the primitive processes, (E, V, c, $), are mutually independent. Then, we can derive

K 1 A r 7 p lMH R ?x = HM(I - p'r (r£+E ***) ( - r ' k=l

+HATVH' + CTCC^R', (3.35) 3.4 The SRBM Approximation 77

where, recall, TE, IV, TC and T$K are, respectively, the covariance matrices of the driftless Brownian motions, E, V, c and defined in (3.13)-(3.16).

We can now rewrite the approximation in (3.30) as

Z(t) idt + X{t) + RY{t).

If the above, along with (3.11) and the fact that Z > 0, defines a reflection mapping that is continuous (as is the case for the generalized Jackson network where R is an M-matrix), then we will have

Z(t) « Z(t). (3.36) where Z satisfies

Z(t) =6t + X(i) + RY(t) > 0 for t > 0; (3.37)

dY > 0 and Y(0) = 0; (3.38)

ZidYt = 0, t £ C. (3.39)

The above mapping (called the "reflection mapping"), which maps the processes X and Y to Z, defines the latter process (Z) as a "semimartingale reflecting Brownian motion" (SRBM), which we shall also refer to as SRBM(#, T, R). (Note that a semimartingale is a martingale plus a process of bounded variation. Here, X is a martingale, and 6t + RY(t) is a linear combination of monotone functions - 6t and Y(t) - and hence, is of bounded variation.) Before proceeding any further, we must make two important technical remarks here. First, the above mapping is not always well-defined — more on this later. Second, X, being a driftless Brownian motion is a martingale with respect to the (natural) filtration generated by itself. For the above approximation to work, however, X is required to be a martingale with respect to the filtration generated by (X, Y). Nevertheless, this usually holds under non- anticipative service disciplines such as FIFO and priority; see Williams (1998) for details. For the time being, let us assume the above mapping uniquely defines Z (in distribution). Then, given the approximation (3.36), from (3.28) and (3.29), we have

W{t) « W{t) := MKG'[HMKG'\~lZ(t) = MAZ{t); (3.40) 3.4 The SRBM Approximation 78

and

Q(t) « Q{t) := M~lW(t) = AZ{t). (3.41)

Finally, we derive the approximation for the sojourn-time process. In view of (3.14) and (3.22), we can show that the service time of a class k job has the following approximation,

Vk(Ak(t))-Vk(Ak(t)-l) « mk, kelC.

(We could replace mk by any constant, but choosing mk, the average service time of a class k job, appears to give a better approximation.) Then, in view of (3.12), we have

«

»(i)

= Z*(k){t) + Ph(k)Tk{t) + mk, where we define Bo = 0. Hence, we have

+m rk{t) « ^(fc)W * _ «

Note that the approximations in (3.13) and (3.22), along with the bound in (3.2), clearly

imply r]k(t) ~ t. Then, in view of (3.10), the above approximation yields,

2 {t) mk Sk(t) & Sk(t):= ^ \ . (3.42) K o-{n(k)) ~ Ph(k)

To summarize, our approximation scheme is built upon the following three conditions:

(A) the primitive processes (E, V, c, 5>) satisfy the asymptotics in (3.13), (3.14), (3.15) and (3.16);

(B) the arrival process A satisfies the approximation in (3.21);

(C) the SRBM Z is well defined (which includes conditions such as the matrix (I + N) is invertible and X is a martingale with respect to the filtration generated by (X,Y)).

Under these conditions, the aggregated workload process Z can be approximated by Z, and the other derived processes, in particular, W (workload), Q (queue length) and S (sojourn time) follow the approximations in (3.40), (3.41) and (3.42), respectively. 3.5 Discussions and Variations 79

Condition (A) is basically a modeling choice, and given a specific application, the required asymptotics are often routinely justified, as shown in the examples in Section 3.2.2. Condition (B) implies that A(t)/t converges almost surely as t —> oo. This requires the network to be weakly stable (and hence stable). Numerous examples (such as those in Kumar and Seidman (1990) and Bramson (1994)) indicate that a multiclass queueing network may not be weakly stable under the traffic condition p < e. Under some mild conditions, it appears that (B) is equivalent to what is known as the state-space collapse condition, which is a key for the heavy traffic limit theorem; refer to Bramson (1998) and Williams (1998), also see Section 3.5.3. Condition (C) is the most complex among the three, and we shall examine it in more details in the next section.

3.5 Discussions and Variations

3.5.1 Issues Surrounding the SRBM

First, and most importantly, the SRBM must be well-defined via the relations in (3.37)- (3.39) That is, these relations must uniquely determine the distribution of Z (i.e., uniqueness in law). Suppose the covariance matrix is non-degenerate. Then it is known (Reiman and Williams (1988) and Taylor and Williams (1993)) that the SRBM Z is well-defined if and only if the reflection matrix R is completely-5. (A square matrix A is an 5-matrix if there exists an x > 0 such that Ax > 0. It is completely-5 if all of its principal minors are S-matrices.) Except for some rare cases (usually associated with very specially correlated primitive processes), the covariance matrix is almost always non-degenerate. The key is therefore whether the reflection matrix R is completely-5. Some sufficient conditions are (i) iii is an M-matrix; or (ii) R is a P-matrix. (Refer to Cottle, et al. (1992) for other conditions that are either equivalent to or sufficient for the completely-5 condition.) As we shall see in Section 3.5.2 and Section 3.6.3, the reflection matrix R may not always be completely-5. Indeed, the example in Section 3.6.3 shows that R may not even be invertible.

The second issue is whether or not the SRBM Z possesses a stationary distribution. Har• rison and Williams (1987) proves that Z corresponding to a generalized Jackson network does possess a stationary distribution when the traffic intensity is less than unity (where the service station is assumed to be perfect and hence the service capacity is one). Dupuis and Williams (1994) relates the existence of the SRBM stationary distribution to the stability of a linear Skorohold problem. Applying a linear Lyapunov function to this Skorohold problem, 3.5 Discussions and Variations 80

Chen (1996) finds certain more explicit conditions, which, in particular, imply that if the traffic intensity of a multiclass feedforward network is less than unity, then the corresponding Z has a stationary distribution (See Chapter 2). The existence of a stationary distribution of Z is obviously motivated by the need to approximate the stationary performance measures of queueing networks. Hence, it is only natural to expect that the issue has a close relationship with the stability of queueing net• works. For the latter subject, a substantial body of literature has emerged in recent years. For instance, Rybko and Stolyar (1992), and Dai (1995) show that the queueing network is stable if a corresponding fluid network is stable. Note, however, that the stability of a queueing network does not in general imply that the SRBM Z is even well-defined (see the examples in Section 3.5.2 and Section 3.6.3). And, the converse also remains an open problem, i.e., whether the existence of a well-defined Z (given the traffic intensity is less than the service capacity) implies the stability of a corresponding queueing network.

The third issue is whether or not the approximation in (3.36) is supported by a limit theorem, in the following sense (which is the strong approximation): there exists a common probability space in which both Z and Z are defined, such that

sup \\Z{t)-Z{t)\\=- o{Tl"), asT->co, 0 2. Indeed, such an approximation is established in Chen and Mandelbaum (1994) for a generalized Jackson network, and in Chapter 2 for a multiclass feedforward network. In both chapters, the reflection mapping is well-defined and Lipschitz continuous. This assumption, however, appears too strong for a general multiclass network, as it would require that the relations in (3.37)-(3.39) uniquely determine the processes Z and Y, in a pathwise sense, for any given process X (with X(0) > 0). It is known that this cannot hold (Bernard and El Kharroubi (1991) and Mandelbaum (1989)). An alternative interpretation of the approximation is to view the network under study as one in a sequence of networks indexed by n = 1,2,.... Let pn be the traffic intensity of the nth network, and assume that \pn\pn — K] -¥ 6 (which implies pn —>• K as n —> oo). This is known as the heavy traffic condition. If we can show that as n —> co,

-Lzn(n-,(pn-K))Az(-,0), (3.43) then we may interpret the above as:

Zn(t; (pn - «)) « yfrZ(t/n; (pn - «))=£(*; (pn - «)), 3.5 Discussions and Variations 81

where the second distributional equality follows from the scaling property of the Brownian motion and the reflection mapping. Suppose the network under study is the one indexed by m. Then, we have

m m m m p = p , Z(t) = Z (t;(p -K)), Z(t) = Z(t;(p -K)); and the approximation in (3.36) holds. In this case, we can claim that (3.36) is supported by a heavy traffic limit theorem. Indeed, such an approximation is established in Harrison and Williams (1987) for the generalized Jackson network based on the heavy traffic limit theorem of Reiman (1984), and in Harrison and Williams (1992) for a multiclass feedforward network based on the heavy traffic limit theorem of Peterson (1991). In these special cases, the SRBM Z(-\ (p — e)) is well-defined for any p < e (where K = e). In anticipation of a heavy traffic limit theorem, this type of approximation is also proposed in Harrison and Nyuyen (1990;1993) for a general multiclass network under a FIFO service discipline. In more general cases, however, as we shall demonstrate in Section 3.5.2 below via the Kumar-Seidman network, the situation is much more complex: Z(-;p — e) may not exist for some p < e (although it does exist when p is close enough to e), even if the corresponding heavy traffic limit (3.43) does exist.

3.5.2 Kumar-Seidman Network

For the Kumar-Seidman network shown in Figure 3.1, we have

1 1 / —ml ml 0 0 \

0 0 7712 1 0 A = 0 0 -771.3

1 V ml 0 0 0 )

1 / 1 0 Trig 177l4 —mj m4 \

0 1 7773 177I4 —7773 177l4 R = —m 1 0

1 1 ml rri2 —m i m 0 1 \ 2 /

1 1 / 1 0 — 7773 7774 77l3 7774 \

1 _1 0 1 — 7773 m4 m3 7774 I + N = 1 1 —ml m2 m i m2 1 0 —m 1m2 m ilm 0 1 V 1 2 / 3.5 Discussions and Variations 82

Note that a principal minor of R:

1

mj"1m2 is an 5-matrix if and only if

(3.44) and it is straightforward to verify that this is also a necessary and sufficient condition for R to be completely-S. Note that this condition does not depend on a\ and 0:3. Hence, by making a\ and 03 small enough, we can create a case in which the network is stable but the SRBM is not well-defined. Under the traffic condition p < e, the condition (3.44) is strictly stronger than

ai7772 + 037774 < 1 (3.45) which is a sufficient condition for the stability of Kumar-Seidman network (see, e.g., Botvich and Zamyatin (1992)). On the other hand, under the heavy traffic condition p = e, (3.44) becomes equivalent to (3.45), which is shown in Chen and Zhang (1998) as a necessary and sufficient condition for the existence of a heavy traffic limit theorem.

If the network parameters a and m is such that p < e and 77127714 > 77117773 but (3.45) still holds, then the matrix R is not completely-5 and hence the proposed Brownian approxima• tion does not exist. This network can clearly be considered as one particular network in a sequence of networks that approach the heavy traffic limit (if we assume the limiting net• work satisfies (3.45)). This suggests that in this case the heavy traffic limit does not provide the mathematical support for the approximation. But if the particular network chosen is "closer enough" to the limiting network, i.e., the parameters a and m are chosen in a small neighborhood of

{(a, m) : p = e and aim2 + 037774 < 1} = {(a,m) : p = e and 77127774 = 77117773}, then (3.44) must hold and the the SRBM approximation is well-defined.

3.5.3 Alternative Approximation with State-Space Collapse

Following the usual heavy traffic limit theorem, the diffusion approximation for the workload or the queue length of a higher priority group is simply zero. For instance, Peterson (19991) 3.5 Discussions and Variations 83 provides such an example for a feedforward network; and for the Kumar-Seidman network in Figure 3.1, Chen and Zhang (1998) shows that the diffusion limits for the queue-length processes of classes 2 and 4 are zero. This phenomenon is known as a state-space collapse principle, and is critical to the heavy traffic approximation of multiclass networks. Bramson (1998) and Williams (1998) have shown that a multiplicative form of state-space collapse holds for some interesting families of mul• ticlass networks, and this multiplicative form implies regular state-space collapse provided that the reflection matrix R for the limiting SRBM is well defined and completely-

Now, we derive the SRBM model with state-space collapse. Denote:

• U = (JJj) with Uj = [Uj(t),t > 0}, j G v7, where Uj(t) indicates the total amount of workload at station j at time t\ hence, if group I has the lowest priority at station a(£),

then = Ze(t);

• I = (Ij) with Ij = {Ij(t),t >}, j G J, where Ij(t) indicates the cumulative amount of unused capacity during [0,t] at the station j; hence, if group t has the lowest priority

at station a(E), then Ia^(t) = Ye(t).

• A = (&tj)tzcj&j, with

1 if group t is the lowest priority group served at station j, 0 otherwise;

Then, similar to (3.8), we have

U(t) = CGV(A(t))-c(t) + I(t). (3.46) 3.5 Discussions and Variations 84

Under state-space collapse, the workload of all but the lowest priority group vanishes; hence, we have

Z(t) « AU(t). (3.47)

Combined with (3.29), this implies:

[et - r(t)] « [HMAGTlAU{t). (3.48)

From (3.27), (3.48) and (3.46), we obtain

U{t) « CG[MA(t) + V(Xt)] — [nt + c(t)] + I(t) K

CGMXt + CGM{I - P'Y £(*) + £ 4* (Afct)

fc=i

+CGV(Xt) - CGM{I - P')_1^'AAC/(i) - [nt - c{t)} + I(t).

Thus, the above can be written as

U{t)sLot + X(t) + RI{t), where

-1 R = {I + CGM{I - .P'r^'AA) ,

e = R{CGMX-K), and

K

l K X(t) = R\CGM{I - P')-[E(t) + ^2$ (Xkt)] + CGV(Xt) - 6{t) .

fc=i

Note that the jth component of CGMX is the traffic intensity of station j, j 6 J• In summary, we have

U(t) » U(t),

Z{t) « Z(t) := AU(t), 3.6 Special Cases and Numerical Results 85 where U satisfies

U(t) = et + X{t) +RI{t) > 0,

dl > 0 and 1(0) = 0, -

Ujdlj = 0, j £ J.

The approximations for the other derived processes, Q, W and S, remain the same as in (3.40), (3.41), and (3.42), respectively.

To conclude this part, let us revisit the Kumar-Seidman network. We can draw the same conclusion as in Section 3.5.2 by considering the approximation under state-space collapse.

In particular, when 77117773 7^ m2m4, the reduced two-dimensional reflection matrix for the SRBM takes the following form:

1 II 7773 7774 RL =

1 — 777J 77727773 7774 \ m1 7772 1 which is completely-5 if and only if (3.44) holds, the same condition as in Section 3.5.2.

3.6 Special Cases and Numerical Results

3.6.1 A Single-Class Single-Server Queue with Breakdown

Consider a single-class, single-server queue subject to autonomous breakdown. We charac• terize the job arrival process by its first two. moments: the arrival rate a and the SCV of interarrival time CQ. Similarly, let m be the mean service time and c2, the SCV of the service

2 time. Let d and c d [u and c^) be the mean and the SCV of the down- (up-)time. Following the results in Section 3.4, we can construct an SRBM Z as an approximation to the workload process with the following data:

U 9 am — u + cf R 1

2 2 2 2 d u (c u + c ) 2 2 m a(c s +CQ) + r (u + d)3 3.6 Special Cases and Numerical Results 86

2 2 2 Case a r m c 14 c d cd 1 4.0 1 0.15 1 12 1 4 1 2 4.0 1 0.15 1 12 2 4 2 3 4.0 2 0.15 2 12 1 4 1 4 4.0 2 0.15 2 12 2 4 2 5 4.0 1 0.15 1 18 1 2 1 6 4.0 1 0.15 1 18 2 2 2 7 4.0 2 0.15 2 18 1 2 1 8 4.0 2 0.15 2 18 2 2 2 9 1.0 1 0.75 1 18 1 2 1 10 1.0 1 0.75 1 18 2 2 2 11 1.0 2 0.75 2 18 1 2 1 12 1.0 2 0.75 2 18 2 2 2

TABLE 3.1. Parameters of single station queue with breakdown

For one-dimensional SRBM, the steady-state distribution is known to be exponentially dis• tributed with mean E(Z) = jg. From (3.42), we obtain the steady-state average system time:

E(5) = (^+m)/«, where k = s+a- In Table 3.1, we list several different sets of parameters of the queue. In simulation, the random variables (either representing service times, up-times and down-times) are fitted with Erlang distributions, exponential distributions, or Gamma distributions corresponding to the SCV being less than 1, equal to 1, or larger than 1. This scheme is used for simulations in all subsequent examples as well. Table 3.2 presents both simulation and SRBM estimates for E(

3.6.2 A Generalized Jackson Network

Figure 3.3 depicts a three-station generalized Jackson queueing network. Jobs arrive to sta• tions 1 and 2 following i.i.d. Poisson processes with rate 0.5. Jobs completing service at station 1 proceed to either station 2 or station 3, with equal probability. Jobs completing service at station 2 either go to station 1 or exit the system, with equal probability. Jobs 3.6 Special Cases and Numerical Results 87

Case Simulation Results SRBM Estimates 1 5.924 (6.8%) 6.000 (1.3%) 2 10.017 (7.7%) 11.00 (9.8%) 3 6.972 (7.4%) 6.800 (2.5%) 4 12.003 (7.3%) 11.800 (1.7%) 5 1.070 (3.0%) 1.100 (2.8%) 6 1.464 (5.3%) 1.700 (16.1%) 7 1.471 (2.6%) 1.433 (2.6%) 8 1.861 (4.2%) 2.033 (9.2%) 9 6.230 (6.2%) 6.200 (0.5%) 10 7.501 (6.9%) 7.400 (1.3%) 11 10.282 (5.0%) 10.367 (0.8%) 12 12.245 (7.3%) 11.567 (5.5%)

TABLE 3.2. Average sojourn time in the single-class single-server queue with breakdown completing service at station 3 either go to station 1 or exit the system, with equal probabil• ity. All service-times are i.i.d. at each station, independent among stations, and independent of the arrivals. The mean and SCV of the service times at station i are mi and cf, i = 1,2,3. We consider 11 versions of the system whose parameters are listed in Table 3.3. In Cases 1 through 5, station 1 is heavily loaded, while stations 2 and 3 are moderately loaded; in Cases 6 and 7, all stations are lightly loaded; in Cases 8 and 9, all stations are moderately loaded; and Case 10 and 11, all stations are heavily loaded. Table 3.4 shows the simulation estimates and SRBM approximations of the steady-state average sojourn time at each station. In this example as well as all other examples in sub• sequent subsections, we used the numerical algorithm BNA/FM developed in Chapter 5 to compute the mean sojourn time from SRBM model. As evident from the comparison, SRBM results are surprisingly good even when all stations have low traffic intensity. 3.6 Special Cases and Numerical Results

2 1 Vfl\ r c Cc Case ai a.2 m2 m3 2 3 Pi P2 P3 1 0.50 0.50 0.60 0.60 0.80 0.25 0.25 0.25 0.90 0.75 0.60 2 0.50 0.50 0.60 0.60 0.80 1.00 1.00 1.00 0.90 0.75 0.60 3 0.50 0.50 0.60 0.60 0.80 2.00 2.00 2.00 0.90 0.75 0.60 4 0.50 0.50 0.60 0.60 0.80 2.00 0.25 0.25 0.90 0.75 0.60 5 0.50 0.50 0.60 0.60 0.80 0.25 2.00 2.00 0.90 0.75 0.60 6 0.50 0.50 0.20 0.20 0.40 0.25 0.25 0.25 0.30 0.25 0.30 7 0.50 0.50 0.20 0.20 0.40 0.25 0.25 0.25 0.30 0.25 0.30 8 0.50 0.50 0.40 0.50 0.80 0.25 0.25 0.25 0.60 0.63 0.60 9 0.50 0.50 0.40 0.50 0.80 2.00 2.00 2.00 0.60 0.63 0.60 10 0.75 0.75 0.40 0.50 0.80 0.25 0.25 0.25 0.90 0.94 0.90 11 0.75 0.75 0.40 0.50 0.80 2.00 2.00 2.00 0.90 0.94 0.90

TABLE 3.3. Parameters of the generalized Jackson queueing network

3.6.3 A Variation of the Bramson Network

For the network shown in Figure 3.2, we have

Pi1 0 \ 0 P21 0 P21 and 0 P21 V Pi1 0 /

1 1 _ 1/1 + A(m3 + 2m4)/92 -3Xm5p2 \

d V -P2PT1 1 + ^PT1 ) ' where

l 1 d = [1 + Am5p1 + A(m3 + 2m4)p2 ] - 3\m5p1 .

Note that d > 0 if and only if

(3.49) (7712 + 7773 + 7774) > (7772 - 7774)7775; and it is easy to see that this condition is a necessary and sufficient condition for the matrix R to be completely-5 (which in this case is equivalent to R being an M-matrix). (Note that if the two sides of (3.49) are equal, then the matrix inverse (I + ./V)-1 in (3.32) does not exist and R is not defined.) Notice that this condition does not depend on the arrival rate A (A = a\). This implies that similar to the Kumar-Seidman network in Section 3.5.2, we 3.6 Special Cases and Numerical Results 89

FIGURE 3.3. A generalized Jackson queueing network can create a case in which the network is stable (by making A small enough) but the SRBM approximation is not well-defined. For instance, m = (0.02,0.8,0.05,0.05,0.88) provides such a case where (3.49) is violated. In this case, if we let A = 1, the traffic intensities at both stations are 0.9. The simulation result presented in Figure 3.4 suggests that the total number of jobs in system grows with time and that the network cannot be stable. (Poisson arrivals and exponential service times are used in the simulation.) In this sense, the existence of the SRBM appears to be closely related to the stability of the queueing network. Next, we fix the arrival rate at A = 1, and vary the mean service times (m, see Table 3.5), which all satisfy (3.49), and hence, the corresponding SRBM exists. In Cases 1, 2, 8, and 9, the m values are very close to the region where the SRBM does not exist; and interestingly, the associated SRBM approximation gives very poor estimates in Case 1 and 2 but reasonably good estimates in Case 8 and 9. The best approximation appears in Case 4, where the mean service times are identical at station 2 and nearly identical at station 1.

3.6.4 A More-Complex Multiclass Network

Consider the two-station network shown in Figure 3.5. Job classes 1, 2, and 3 are processed at station 1, where the service discipline is FIFO; classes 4, 5 and 6 are processed at station 2, where class 4 has a higher priority over classes 5 and 6 (which are at the same priority rank, and hence served FIFO). Thus, there exist three priority job groups; jobs in class 1, 2 and 3 3.6 Special Cases and Numerical Results 90

System No. Method E(5i) E(52) E(53)

i Simulation 4.879 (3.9%) 1.572 (0.8%) 1.280 (0.4%) 1 SRBM 4.544 (6.9%) 1.578 (0.4%) 1.303 (1.8%) Simulation 6.019 (4.1%) 2.399 (1.4%) 2.000 (0.8%) Z SRBM 5.957 (1.0%) 2.385 (0.6%) 2.004 (0.2%) o Simulation 7.610 (4.1%) 3.405 (1.4%) 2.880 (1.2%) 0 SRBM 7.668 (0.8%) 3.394 (0.3%) 2.839 (1.4%) 1.902 (0.6%) A Simulation 7.402 (5.8%) 2.020 (1.1%) 4 SRBM 7.426 (0.3%) 2.029 (0.5%) 1.857 (2.4%)

r Simulation 5.009 (5.1%) 3.029 (1.6%) 2.296 (1.0%) 0 SRBM 4.976 (0.7%) 3.054 (0.8%) 2.385 (3.9%) Simulation 0.255 (0.2%) 0.239 (0.1%) 0.504 (0.2%) 6 SRBM 0.257 (0.8%) 0.242 (1.3%) 0.500 (0.8%) Simulation 0.318 (0.5%) 0.297 (0.4%) 0.649 (0.6%) 7 SRBM 0.317 (0.3%) 0.297 (0.0%) 0.629 (3.1%) Q Simulation 0.777 (0.6%) 1.024 (0.5%) 1.522 (0.8%) O SRBM 0.780 (0.4%) 1.028 (0.4%) 1.504 (1.2%) Simulation 1.262 (0.8%) 1.714 (1.0%) 2.596 (1.3%) 9 SRBM 1.290 (2.2%) 1.743 (1.7%) 2.609 (0.5%) Simulation 2.620 (1.9%) 5.550 (4.1%) 5.191 (3.4%) 10 SRBM 2.602 (0.7%) 5.416 (2.4%) 5.015 (3.4%) Simulation 5.738 (3.6%) .11.091 (3.1%) 11.677 (4.1%) 11 SRBM 5.808 (1.2%) 11.128 (0.3%) 11.669 (0.1%)

TABLE 3.4. Average sojourn time of the generalized Jackson queueing network belong to group 1; class 4 jobs belong to group 2; class 5 and 6 jobs belong to group 3. Classes 1, 2 and 3 are associated with external arrivals, which are Poisson processes with rates 0.2,

2 0.4 and 0.5 respectively. Service times of all six classes are i.i.d. with the SCV denoted c s.

2 Consider three versions of the network, labeled as systems A, B and C, with c s =0.25, 1, and 2, respectively. In each system, there are 3 different sub-cases as shown in Table 3.7, where

nij is the mean service time of class i, i = 1,... , 6; p\ and p2 denote the traffic intensities at 3.6 Special Cases and Numerical Results 91

No. m Method Station 1 Station 2 Simulation 5.759 (5.7%) 9.898 ;9.2%) 1 (0.40,0.80,0.05,0.05,0.45) SRBM 11.937 (107.3%) 12.030 (21.5%) Simulation 3.544 (5.2%) 7.866 [7.0%) 2 (0.40,0.80,0.05,0.05,0.40) SRBM 9.760 (175.4%) 14.388 (82.9%) Simulation 0.748 (2.3%) 6.370 ;io.3%) 3 (0.40,0.80,0.05,0.05,0.20) SRBM 1.012 (35.3%) 6.879 [8.0%) Simulation 2.375 (3.9%) 2.797 ;e.9%) 4 (0.40,0.30,0.30,0.30,0.45) SRBM 2.432 (2.4%) 2.743 ;i.9%) Simulation 8.659 (6.9%) 5.767 [7.4%) 5 (0.02,0.30,0.30,0.30,0.88) SRBM 8.959 (3.5%) 4.838 (16.1%) Simulation 6.349 (5.9%) 4.645 (5.6%) 6 (0.10,0.30,0.30,0.30,0.80) SRBM 7.156 (12.7%) 4.164 (10.4%) Simulation 4.922 (4.8%) 3.719 (4.1%) 7 (0.20,0.30,0.30,0.30,0.70) SRBM 5.552 (12.8%) 3.549 (4.6%) Simulation 9.499 (4.9%) 6.311 (4.4%) 8 (0.02,0.425,0.05,0.425,0.80) SRBM 9.911 (4.3%) 5.432 (13.9%) Simulation 9.484 (4.3%) 6.074 (4.0%) 9 (0.02,0.43,0.05,0.42,0.88) SRBM 10.034 (5.8%) 5.529 (9.0%)

TABLE 3.5. Average waiting times in the Bramson network the two stations. The routing matrix is

/ 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0.5 P = 0 0 0 0 0 0 0.5 0 0 0 0 0

0 0 0 0 0 0 /

The performance measure of interest is the average sojourn times experienced by each job class at the two stations, denoted E(

E(Si) = mi + E(Zi), i = l,2,3,

E(Si) = mi + E(Z2), t = 4,

EOS,) = —l—(mi + E(Z3)), i = 5,6. 3.6 Special Cases and Numerical Results 92

No. Method E(5i) E(52) E(53) E(S4) E(5B) Simulation 5.68 (5.4%) 10.89 (7.4%) 10.25 (7.8%) 9.45 (8.4%) 6.69 (5.2%) 1 SRBM 12.34 (117.2%) 12.83 (17.8%) 12.08 (17.8%) 12.08 (27.8%) 12.39 (85.2%) Simulation 3.55 (4.9%) 8.83 (6.8%) 8.17 (7.3%) 7.49 (7.9%) 4.34 (4.6%) 2 SRBM 10.16 (186.2%) 15.19 (72.0%) 14.44 (76.7%) 14.44 (92.8%) 10.16 (134.1%) Simulation 1.02 (1.5%) 7.31 (7.8%) 6.62 (8.5%) 6.08 (9.1%) 1.08 (2.0%) 3 SRBM 1.41 (38.2%) 7.68 (5.1%) 6.93 (4.7%) 6.93 (14.0%) 1.21 (12.0%) Simulation 2.78 (3.4%) 3.09 (6.8%) 3.10 (6.9%) 3.10 (6.8%) 2.83 (3.3%) 4 SRBM 2.83 (1.8%) 3.04 (1.6%) 3.04 (1.9%) 3.04 (1.9%) 2.88 (1.8%) Simulation 8.65 (6.9%) 5.91 (6.4%) 6.15 (6.3%) 6.14 (6.3%) 9.57 (6.2%) 5 SRBM 8.98 (3.8%) 5.14 (13.0%) 5.14 (16.4%) 5.14 (16.3%) 9.84 (2.8%) Simulation 6.44 (5.8%) 4.85 (5.2%) 5.00 (5.2%) 4.99 (5.1%) 7.16 (5.2%) 6 SRBM 7.26 (12.7%) 4.46 (8.0%) 4.46 (10.8%) 4.46 (10.6%) 7.96 (11.2%) Simulation 5.13 (4.6%) 3.96 (4.0%) 4.05 (3.9%) 4.05 (3.9%) 5.61 (4.2%) 7 SRBM 5.75 (12.1%) 3.85 (2.8%) 3.85 (4.9%) 3.85 (4.9%) 6.25 (11.4%) Simulation 9.44 (4.9%) 6.74 (4.4%) 6.36 (4.6%) 6.74 (4.3%) 10.46 (4.5%) 8 SRBM 9.93 (5.2%) 5.86 (13.1%) 5.48 (13.8%) 5.86 (13.1%) 10.71 (2.4%) Simulation 9.42 (4.3%) 6.50 (3.4%) 6.14 (3.6%) 6.49 (3.4%) 10.44 (4.0%) 9 SRBM 10.05 (6.7%) 5.96 (8.3%) 5.58 (9.1%) 5.95 (8.3%) 10.91 (4.5%)

TABLE 3.6. Average sojourn time of each job class in the Bramson network

Case mi m-2 m3 m.4 m5 m6 Pi P2 1 0.5 0.5 1.0 0.50 0.50 1.0 0.90 0.65 2 1.0 0.5 0.4 1.00 1.00 0.4 0.80 0.90 3 0.5 0.5 0.4 0.25 0.25 2.0 0.60 0.70

TABLE 3.7. Mean service times of three cases of the complex queueing network

The numerical comparisons against simulation are summarized in Table 3.8. The quality of the approximation in this example is very good. Next we present some numerical results on the stationary distribution for Case 2 of System A. In Figures 3.6 and 3.7, the cumulative stationary distributions are plotted for the waiting times of job groups 1 and 3, respectively. The SRBM approximations of the waiting times

are by the approximations of the workload process, specifically. W\ = W2 = W3 for job group 1 and W$ = W% for job group 3; their stationary distributions are generated by BNA/FM algorithm. From these two figures, we can see that SRBM provides fairly good approximation for the stationary distribution as well. To approximate the stationary distribution of the sojourn time, we propose to replace the

mean service time mk in (3.42) by a random variable vk, which has the same distribution as 3.6 Special Cases and Numerical Results 93

E CM) E(58) Case Method E(5i) EOS,) E(53) ( E(55) System A Simu. 4.88 (4.2%) 4.86 (4.2%) 5.36 (3.0%) 0.55 (0.3%) 1.33 (0.8%) 1.70 (0.6%) 1 SRBM 4.70 (3.7%) 4.70 (3.3%) 5.30 (1.1%) 0.57 (3.6%) 1.29 (3.0%) 1.91 (12.4%) Simu. 2.54 (1.8%) 2.13 (2.1%) 2.02 (2.2%) 1.14 (0.4%) 9.23 (5.3%) 8.30 (5.3%) 2 SRBM 2.52 (0.8%) 2.02 (5.2%) 1.92 (5.0%) 1.15 (0.9%) 8.77 (5,0%) 7.77 (6.4%) Simu. 1.00 (0.7%) 0.96 (0.8%) 0.86 (0.7%) 0.26 (0.3%) 2.46 (2.0%) 4.45 (1.5%) 3 SRBM 0.98 (2.0%) 0.98 (21.%) 0.88 (2.3%) 0.25 (3.8%) 2.57 (4.5%) 4.51 (1.3%) System B Simu. 7.07 (5.3%) 7.15 (5.6%) 7.64 (4.8%) 0.65 (0.5%) 2.48 (1.7%) 2.81 (1.5%) 1 SRBM 7.26 (2.7%) 7.26 (1.5%) 7.76 (1.6%) 0.63 (3.1%) 2.31 (6.9%) 2.93 (4.3%) Simu. 3.76 (1.8%) 3.28 (1.9%) 3.19 (2.0%) 1.46 (0.7%) 14.15 (5.8%) 13.34 (6.4%) 2 SRBM 3.85 (2.4%) 3.35 (2.1%) 3.25 (1.9%) 1.45 (0.7%) 14.61 (3.3%) 13.61 (2.0%) Simu. 1.28 (1.1%) 1.23 (1.2%) 1.13 (1.6%) 0.28 (0.4%) 4.01 (3.6%) 5.95 (2.7%) 3 SRBM 1.26 (1.6%) 1.26 (2.4%) 1.16 (2.7%) 0.27 (3.6%) 4.09 (2.0%) 6.03 (1.3%) System C Simu. 9.90 (5.4%) 10.17 (5.3%) 10.61 (5.5%) 0.77 (1.5%) 3.95 (2.4%) 4.26 (2.4%) 1 SRBM 10.18 (2.8%) 10.18 (0.1%) 10.67 (0.6%) 0.71 (7.8%) 3.58 (9.4%) 4.20 (1.4%) Simu. 5.42 (3.4%) 4.90 (3.8%) 4.80 (3.5%) 1.87 (1.4%) 22.99 (7.9%) 22.09 (8.2%) 2 SRBM 5.62 (3.7%) 5.12 (4.5%) 5.02 (4.6%) 1.85 (1.1%) 22.82 (0.7%) 21.82 (1.2%) Simu. 1.60 (1.5%) 1.57 (1.4%) 1.46 (1.3%) 0.30 (0.7%) 5.95 (4.2%) 7.90 (3.4%) 3 SRBM 1.62 (1.3%) 1.62 (3.2%) 1.52 (4.1%) 0.30 (0.0%) 6.12 (2.3%) 8.07 (2.2%)

TABLE 3.8. Average sojourn time of each job class in the complex queueing network 3.7 Concluding Remarks 94

Value. (X103)

Time (X103) total number of jobs at station 1 total number of jobs at station 2

FIGURE 3.4. Queue length by station in a variation of Bramson network with Poisson arrivals and exponential services

the service requirement of class A; and is independent of Zn(k) (*) i namely,

Zv(k)(t)+Vk

Sk(t) « Sk{t): =

Without such modification, P > "W(«<7(7r(A:)) - Ph{k))) = 1> which is clearly not the case for the stationary distribution of the original queueing process. The added difficulty for this modification is that we have to numerically compute the convolution of two random variables.

3.7 Concluding Remarks

We have extended previous schemes of Brownian approximation to a general multiclass queue• ing network with a mixture of priority and FIFO service disciplines. Our proposed approxi• mation is consistent with the existing literature when the network follows a pure FIFO service discipline, and it is supported by the strong approximation theorem in a feedforward network. Through concrete examples, we have also demonstrated the relations (or the lack thereof, in 3.7 Concluding Remarks 95

Class 1 Class 4

Class 2 Class 5 0.5

Class 3 Class 6

FIGURE 3.5. A two-station multiclass network general) among the Brownian approximation model, the heavy traffic limit theorem, and the stability of the queueing network. Specifically, the stability of a queueing network may not always imply the existence of the Brownian approximation; and in some sense, neither does the heavy traffic limit theorem always imply the existence of the Brownian approximation. While in most of our numerical examples the Brownian approximation provides quite accu• rate estimates, we have also demonstrated that the approximation may not work well near the boundary of the existence region of the SRBM. 3.7 Concluding Remarks 96

FIGURE 3.6. Cumulative stationary distributions of waiting time in the complex queueing network: group 1 jobs 3.7 Concluding Remarks 97

| -»- Simulation -o-SRBM

FIGURE 3.7. Cumulative stationary distributions of waiting time in the complex queueing network: group 3 jobs 98 Chapter 4 Computing the Stationary Distribution of SRBM in a Hypercube

4.1 Introduction

In this chapter, we propose a numerical algorithm for computing the stationary distribution of an SRBM in a hypercube state space. Such an SRBM often arises as an approximate model for finite buffer queueing networks. Queueing networks have long been used to model manufacturing systems and commu• nication networks, and have provided a very useful tool for the design and the operations management of these systems. (See, for example, Jackson 1963, Kleinrock 1976, Buzacott and Shanthikumar 1992, and Yao 1994.) In modeling and analyzing these systems, one ofthe fun• damental issues is the performance analysis of queueing networks. Despite much effort, exact analysis of queueing networks has been largely limited to exponential networks with infinite buffers. (See, for example, Kelly 1979, Walrand 1988 and Yao 1994.) Almost all real world systems modeled by queueing networks have finite buffer capacity. In many applications, buffer constraints are not essential (or not hard constraints); in this case, analytically simpler queueing networks with infinite buffers have been used. But in some other applications, buffer constraints have an important impact on the performance of the systems and may not be ignored. (See examples in Bertsekas and Gallager (1992), Buzacott and Shanthikumar (1992) and Yao (1994).) For certain queueing networks with finite buffers, Brownian models can be formulated for approximate analysis of these networks. See, for example, Dai and Harrison (1991) and W. Dai (1996); Brownian model of a three station tandem network is given in Section 4.6 of this chapter. In the Brownian model of a queueing network with finite buffers, an SRBM in a hypercube is used to approximate queue length process. The data specifying the SRBM can be computed explicitly from certain parameters of the queueing network. The parameters involved are the first and second moments of the interarrival and service time distributions, and the routing probabilities. The theoretic foundation for our SRBM is the work of Dai and Williams (1995) that provides a necessary and sufficient condition for the existence of an SRBM in a convex polyhedron. For a given SRBM, one would like to compute certain characteristics of the 4.1 Introduction 99

SRBM. Motivated by queueing network applications, one often focuses on the stationary distribution of an SRBM. Computed quantities from the stationary distribution are used to estimate certain performance measures of the corresponding queueing network. Only SRBMs whose data satisfy a certain skew symmetry condition in Harrison and Williams (1987) have explicit formulas for stationary distributions. In this chapter, we propose an algorithm for computing the stationary distribution of an SRBM in a hypercube. In general, we shall use a Brownian network analyzer (BNA) to refer to an algorithm for computing the stationary distribution of an SRBM. (This is motivated by Whitt (1983) who uses a queueing network analyzer (QNA) to refer to an algorithm for computing the stationary distribution of a queueing network.) Our algorithm is closely related to a numeric algorithm developed by Dai and Harrison (1991) for computing the stationary distribution in a two-dimensional rectangle. Their algorithm consists of two parts: the first part requires a finite dimensional approximation of an infinite dimensional functional space, and the second part uses a specific sequence of global polynomials to form the approximation subspace. For convenience, we will refer to the first part of their algorithm as a generic algorithm, and the second part as a BNA/SM algorithm (a BNA algorithm with a spectral method) in this chapter. (The latter follows a convention in numerical literature (Canuto et al. (1988).) The specific BNA algorithm we propose is based on an extension of the generic algorithm of Dai and Harrison (1991), and uses a finite element method or piecewise polynomials to form the approximation subspace. We shall refer to it BNA/FM algorithm. The BNA/SM algorithm was shown to often produce accurate estimate ofthe stationary mean of an SRBM. However, it sometimes fails to produce good estimates for stationary probabilities. Stationary probabilities and tail probabilities are important quantities of an SRBM that can be used to answer some important questions regarding quality of service for the system the corresponding queueing network models. Even in computing the stationary mean of an SRBM, there have been cases where BNA/SM fails to provide a meaningful estimate. See Case A.l in Table II of Dai, Nguyen, Reiman (1994), although we point out that the case is for an SRBM living in a high-dimensional orthant, not a hypercube. Our BNA/FM algorithm is shown to produce accurate estimates of the stationary mean as well as the stationary probabilities. (See Section 4.4.4 for more comparisons between the two algorithms.)

Implementing the BNA/FM algorithm in arbitrary dimensions has been a long, difficult project. An exploratory implementation for SRBMs in one and two dimensions was written in 1992 by Dai. W. Dai (1996) implemented a version in his thesis for SRBMs in two and three dimensions with a uniform mesh. Finally, in this thesis, we implemented a version for SRBMs in arbitrary dimensions with a general lattice mesh. Our general implementation, in the C++ programming language, supersedes all the previous implementations. The numerical results and experiments reported in this chapter are from this implementation. In addition to developing the BNA/FM algorithm and reporting its successful implementation, we also summarize our numerical experiences from our extensive computations using the implementation. It is hoped that these experiences can guide further numerical research on SRBMs.

Once an approximating subspace is chosen, there is still a choice of which basis to use to represent the subspace. With a fixed subspace, the choice of a basis can affect the computational accuracy significantly due to round-off errors in numerical computation. We should point out that the sometimes poor performance of BNA/SM in Dai and Harrison (1991) may not be intrinsic to the algorithm. It may be due to a poor choice of basis for the global polynomials. Both the generic and BNA/SM algorithms were generalized to an SRBM living in a high-dimensional orthant and simplex in Dai and Harrison (1991, 1992). In the next chapter, we extend the BNA/FM algorithm to compute the stationary distribution of an SRBM in an orthant. Schwerer (1997) proposed to use a linear program to compute stationary moments of an SRBM.

Brownian approximation, a version of diffusion approximation or the functional central limit theorem, has long been used for approximating queueing networks. The SRBMs arise as the limits of certain performance processes of queueing networks under appropriate scaling in time and space under a heavy traffic condition. Most of these limit theorems, known as functional central limit theorems or heavy traffic limit theorems, have focused on queueing networks with infinite buffers, where the corresponding SRBMs are defined in a nonnegative orthant. For surveys in this area, readers are referred to Whitt (1974), Lemoine (1978), Glynn (1990), Harrison and Nguyen (1993), Chen and Mandelbaum (1994), Williams (1996) and Chen and Yao (2001). Relatively less effort has been made on the Brownian approximation for networks with finite buffers. Bardhan and Mithal (1993) first attempted to establish such a theorem. Dai and W. Dai (1998) established a limit theorem for certain feedforward finite buffer networks that identifies the SRBM in a hypercube as its limit.

As will be discussed in Section 4.4.2, our BNA/FM algorithm, like the BNA/SM of Dai and Harrison, suffers from the "curse of dimensionality": the complexity of the algorithm grows exponentially in the dimension of the state space. In the Brownian approximation of a queueing network, the dimension corresponds to the number of stations of the queueing network. For a queueing network with a large number of stations, we admit that it may be more efficient to simulate the queueing network itself than to use the Brownian model. However, a multiclass queueing network can get "large" by having a large number of job classes but a small number of stations. In such a case, performance analysis based on formulating a Brownian model and solving for the stationary density is an attractive alternative to brute force simulation of the queueing network.

The rest of the chapter is organized as follows. In the next section, we define the semimartingale reflecting Brownian motion (SRBM) in a hypercube. We also present the basic adjoint relationship that characterizes the stationary density of the SRBM. In Section 4.3, we start by recapitulating the generic algorithm of Dai and Harrison (1991) with an extension to the multi-dimensional hypercube, and then propose our BNA/FM algorithm. In Section 4.4, we report several important issues emerging from our implementation of the algorithm. Some numerical experiments are presented in Section 4.5 to show the accuracy of the BNA/FM algorithm. In Section 4.6, we present a three-station tandem network with finite buffers, and show how SRBMs, armed with the BNA/FM algorithm, can effectively be used for its performance analysis. We conclude the chapter with Section 4.7.

Finally, we introduce some notation to be used in this chapter. Let ℝ^K denote the K-dimensional Euclidean space, and ℝ^K_+ the nonnegative K-dimensional orthant. For a subset S of ℝ^K, let C_b²(S) be the space of twice differentiable functions whose first and second order partial derivatives are continuous and bounded on S, and let B(S) be the set of Borel measurable functions on S.

4.2 SRBM in a Hypercube

Let K ≥ 1 be a fixed integer. A K-dimensional hypercube S is defined as

S = {x ∈ ℝ^K : 0 ≤ x ≤ b},   (4.1)

where b is a K-dimensional strictly positive vector. In this section, we define a semimartingale reflecting Brownian motion (SRBM) that lives in the state space S. We then state the basic adjoint relationship that characterizes the stationary distribution of the SRBM (Theorem 4.5). The characterization is the starting point for computing the stationary distribution, which is the primary quantity that we wish to compute in this chapter. Given a K-dimensional vector θ, a K × K symmetric and strictly positive definite matrix Γ, and a K × 2K matrix R, we now define an SRBM associated with the data (θ, Γ, R) on the hypercube state space S. Readers who choose to work with the analytical problem associated with the data (S, θ, Γ, R) without going through SRBMs may go directly to Theorem 4.5 at the end of this section.

Definition 4.1 For x ∈ S, an (S, θ, Γ, R)-SRBM that starts from x is an {F_t}-adapted, K-dimensional process Z defined on some filtered probability space (Ω, F, {F_t}, P_x) such that

Z = X + RY,   (4.2)

where

1. Z has continuous paths in S, P_x-a.s.,

2. under P_x, X is a K-dimensional Brownian motion with drift θ and covariance matrix Γ such that {X(t) − θt, F_t, t ≥ 0} is a martingale, and X(0) = x, P_x-a.s.,

3. Y is an {F_t}-adapted, 2K-dimensional process such that, P_x-a.s.,

(a) Y(0) = 0,

(b) Y_i, i = 1, ..., 2K, is continuous and nondecreasing,

(c) for i = 1, ..., 2K, Y_i can increase only when Z is on the face F_i, where F_i = {x ∈ S : x_i = 0} and F_{K+i} = {x ∈ S : x_i = b_i} are the ith lower and upper boundary faces of the hypercube S, respectively.

In (3c), we mean that, for each t ≥ 0, Z(t) ∉ F_i implies Y_i(t − δ) = Y_i(t + δ) for some δ > 0. This is equivalent to ∫_0^∞ 1_{{Z(s) ∉ F_i}} dY_i(s) = 0 for all i. Loosely speaking, an SRBM behaves like a Brownian motion with drift vector θ and covariance matrix Γ in the interior of the hypercube S, with the process being confined to the hypercube by instantaneous "reflection" (or "pushing") at the boundary, where the direction of "reflection" on the ith face F_i is given by the ith column of R. The parameters θ, Γ and R are called the drift vector, covariance matrix and reflection matrix of the SRBM, respectively.

The existence of an SRBM depends on the properties of the reflection matrix R. Dai and Williams (1995) provided a sufficient condition on R for the existence of an SRBM in a general polyhedral state space. For convenience, we partition R as R = (R_1, R_2), where both R_1 and R_2 are K × K matrices, formed by the first and the last K columns of R, respectively. To specialize their condition to our case, we introduce the notion of the reflection matrix associated with a vertex. Note that our hypercube has 2^K vertices, and each vertex is given by ∩_{i∈α} F_i ∩ ∩_{i∈β} F_{K+i} for a (unique) index set α ⊆ {1, ..., K} with β = {1, ..., K} \ α. For each vertex α, the reflection matrix R^α associated with the vertex is the K × K matrix given by

R^α = (I_α − I_β)[R_1 I_α + R_2 I_β],

where I_α is a K × K diagonal matrix whose ith diagonal component equals one if i ∈ α and equals zero otherwise, and I_β is similarly defined.

Definition 4.2 A square matrix A is said to be an S-matrix if there is a vector x > 0 such that Ax > 0. The matrix A is said to be completely-S if each principal submatrix of A is an S-matrix.

Definition 4.3 The K × 2K reflection matrix R is said to satisfy the completely-S condition if, for each vertex α, R^α is a completely-S matrix.

It follows from Propositions 1.1 and 1.2 of Dai and Williams (1995) that a necessary condition for the existence of the SRBM Z associated with (S, θ, Γ, R), for each initial x ∈ S, is that the reflection matrix R satisfy the completely-S condition. When R satisfies the completely-S condition, it follows from Theorem 1.3 of Dai and Williams (1995) that there exist processes (Z, X, Y) defined on a common filtered probability space (Ω, F, {F_t}), on which a family of probability measures {P_x} is defined, such that for each x ∈ S, under P_x, Z is an (S, θ, Γ, R)-SRBM starting from x. Furthermore, Z is a strong Markov process that is Feller continuous.

Definition 4.4 A probability measure π_0 on S is called a stationary distribution for Z if, for each bounded Borel measurable function f on S and each t ≥ 0,

∫_S E_x[f(Z(t))] π_0(dx) = ∫_S f(x) π_0(dx).   (4.3)

Here, E_x denotes the expectation under P_x.

Using the same argument as in Section 7 of Harrison and Williams (1987), one can show that the stationary distribution π_0 is unique and has a density p_0 with respect to Lebesgue measure dx on S. As stated in the introduction, the primary purpose of this chapter is to compute the stationary density p_0. We now provide an analytical characterization for p_0. To this end, for each k = 1, ..., 2K, define the measure π_k on the boundary face F_k via

π_k(A) = E_{π_0} [∫_0^1 1_{{Z(u) ∈ A}} dY_k(u)],   A ⊆ F_k Borel,   (4.4)

where E_{π_0} denotes the expectation under the probability measure P_{π_0}(·) = ∫_S P_x(·) π_0(dx). It then follows again from the arguments in Harrison and Williams (1987) that π_k has a density

p_k with respect to the surface Lebesgue measure dσ_k on F_k. Furthermore, p_0, p_1, ..., p_{2K} satisfy the following basic adjoint relationship (BAR):

∫_S L f(x) p_0(x) dx + Σ_{k=1}^{2K} ∫_{F_k} D_k f(x) p_k(x) dσ_k = 0   for all f ∈ C_b²(S),   (4.5)

where

L f(x) = (1/2) Σ_{i,j=1}^{K} Γ_{ij} ∂²f(x)/(∂x_i ∂x_j) + θ · ∇f(x),   (4.6)

D_k f(x) = v_k · ∇f(x),   (4.7)

v_k is the kth column of the reflection matrix R, and ∇f is the gradient of f. The following theorem is a special case of Dai and Kurtz (1997), where a general polyhedral state space was considered. As before, θ is a K-dimensional vector, Γ is a K × K symmetric and strictly positive definite matrix, and R is a K × 2K matrix.

Theorem 4.5 Assume that R satisfies the completely-S condition in Definition 4.3. There exists a unique nonnegative function p = (p_0, p_1, ..., p_{2K}) with ∫_S p_0(x) dx = 1 and ∫_{F_k} p_k(x) dσ_k < ∞ for k = 1, ..., 2K that satisfies the basic adjoint relationship (4.5). Furthermore, π_0(·) = ∫_· p_0(x) dx is the stationary distribution of the SRBM Z associated with the data (S, θ, Γ, R), and π_k(·) = ∫_· p_k(x) dσ_k is the measure on F_k defined in (4.4).

Theorem 4.5 provides an analytical characterization of the stationary density of an SRBM. One would hope to find an analytical solution from the characterization. This has been possible only for some very special cases. Harrison, Landau and Shepp (1985) derived an analytical expression for a two-dimensional driftless SRBM. Harrison and Williams (1987) identified a certain skew symmetry condition for an SRBM to have a product-form stationary distribution. In general, a numerical algorithm is needed to compute the stationary distribution. As we will see in the next section, the characterization provides a starting point for a generic algorithm for computing the stationary density p.

We now define some quantities related to the stationary distribution of an SRBM. For i = 1, ..., K and k = 1, ..., 2K, define

q_i = ∫_S x_i p_0(x) dx,   (4.8)

δ_k = ∫_{F_k} p_k(x) dσ_k.   (4.9)

The vector q = (q_1, ..., q_K) is called the stationary mean. It is also the long-run average value of Z. The quantity δ_k represents the long-run average amount of pushing per unit of time needed on boundary F_k in order to keep the SRBM Z inside the state space S. These quantities, along with stationary probabilities, are of interest in queueing network applications.
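To make the role of these quantities concrete, the following is a minimal sketch (not part of the thesis implementation) of how the stationary mean q_i in (4.8) could be approximated by midpoint quadrature once a density estimate is available as a callable function. The product-form density used in main() is only a hypothetical stand-in for a computed estimate such as w^n introduced later in this chapter.

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

// Midpoint-rule estimate of q_i = integral over S of x_i * p0(x) dx, where
// S = [0,b_1] x ... x [0,b_K], cf. (4.8).  In practice p0 would be a computed
// density estimate; here any callable density can be supplied.
double meanCoordinate(int i, const std::vector<double>& b, int m,
                      double (*p0)(const std::vector<double>&)) {
    const int K = (int)b.size();
    long cells = 1;
    double vol = 1.0;
    for (int j = 0; j < K; ++j) { cells *= m; vol *= b[j] / m; }
    double sum = 0.0;
    std::vector<double> x(K);
    for (long idx = 0; idx < cells; ++idx) {
        long r = idx;
        for (int j = 0; j < K; ++j) {              // decode the lattice cell index
            x[j] = ((r % m) + 0.5) * b[j] / m;     // midpoint of the cell in dimension j
            r /= m;
        }
        sum += x[i] * p0(x);
    }
    return sum * vol;
}

// Hypothetical stand-in density: a product of truncated exponentials on [0,1]^2.
double toyDensity(const std::vector<double>& x) {
    const double c = 2.0 / (1.0 - std::exp(-2.0));
    return c * std::exp(-2.0 * x[0]) * c * std::exp(-2.0 * x[1]);
}

int main() {
    std::vector<double> b = {1.0, 1.0};
    std::printf("q_1 ~ %.4f\n", meanCoordinate(0, b, 200, toyDensity));
    return 0;
}
```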

4.3 The BNA/FM Algorithm

In this section, we develop the BNA/FM algorithm for computing the stationary density p of an SRBM. As mentioned in the introduction, the implementation of the BNA/FM algorithm has been a long, difficult project. Although we will not discuss the implementation in detail, we will discuss some key issues and our computational experiences with the implementation in Section 4.4.

Dai and Harrison (1991) developed a BNA/SM algorithm for computing the stationary distribution of an SRBM in a two-dimensional rectangle. Both their BNA/SM and our BNA/FM algorithms are specialized versions of a generic algorithm, which involves a finite dimensional approximation of an infinite dimensional functional space. It is the approximation schemes that differentiate the BNA/SM and BNA/FM algorithms. In BNA/SM, global polynomials are used to form the approximating subspaces, whereas in our BNA/FM algorithm, piecewise polynomials are used. A piecewise polynomial is defined through a partition of the state space; within each subdomain of the partition it is a polynomial. A global polynomial is one defined on the entire state space. The spectral algorithm achieves its accuracy by increasing the maximum degree of the polynomials, whereas the BNA/FM algorithm achieves its accuracy by refining the partition of the state space.

The pros and cons of both the spectral method and the finite element method in many problem domains, notably in fluid dynamics, are well documented; see, for example, Canuto et al. (1988) and Becker, Carey and Oden (1981). As discussed in the introduction, the BNA/SM of Dai and Harrison (1991) generally produces good estimates of the stationary mean of an SRBM. However, it sometimes produces poor estimates of stationary probabilities. As will be shown in Section 4.5, our BNA/FM algorithm produces accurate estimates for stationary probabilities as well.

In the remainder of this section, we first recapitulate the generic algorithm of Dai and Harrison (1991) with an extension to the multi-dimensional hypercube. We also extend their framework by allowing approximating functions that are not necessarily C² smooth. Such an extension is essential when we propose our BNA/FM algorithm in Section 4.3.2.

4.3.1 The Generic Algorithm

Functional Space L²(S)

To facilitate the description of the generic algorithm, we adopt some new notation to present the basic adjoint relationship (4.5) in a compact form. First we define a linear space of functions:

L²(S) = {g = (g_0, g_1, ..., g_{2K}) ∈ B(S) × B(F_1) × ... × B(F_{2K}) : ∫_S |g_0|² dx + Σ_{k=1}^{2K} ∫_{F_k} |g_k|² dσ_k < ∞}.

The space L²(S) is a tensor product of the L² space on the interior and the L² spaces on the boundaries. For u, v ∈ B(S) × B(F_1) × ... × B(F_{2K}), define

⟨u, v⟩ ≡ ∫_S u_0 v_0 dx + Σ_{k=1}^{2K} ∫_{F_k} u_k v_k dσ_k

whenever the right side is well defined. When u, v ∈ L²(S), ⟨u, v⟩ defines a proper inner product on L²(S). The norm of a function u ∈ L²(S) is defined as the nonnegative real number given by

‖u‖ = √⟨u, u⟩.   (4.10)

For two functions u, v ∈ L²(S), we say that u and v are orthogonal in L²(S) if ⟨u, v⟩ = 0. With this new notation, the basic adjoint relationship (4.5) can be rewritten as

⟨Af, p⟩ = 0 for all f ∈ C_b²(S),   (4.11)

where p = (p_0, p_1, ..., p_{2K}) and Af = (Lf, D_1 f, ..., D_{2K} f).

The Least Squares Problem

Since the hypercube S is compact, it is easy to verify that Af ∈ L²(S) for each f ∈ C_b²(S). Thus, we can define

H = closure of {Af : f ∈ C_b²(S)},

where the closure is taken with respect to the norm (4.10) in L²(S). If we assume that the unknown density p is in L²(S), then (4.11) implies that p is orthogonal to Af for all f ∈ C_b²(S), and thus to all h ∈ H. In other words, if we assume that p ∈ L²(S), then p satisfies the basic

adjoint relationship (4.5) if and only if p ∈ H^⊥, where H^⊥ denotes the orthogonal complement of H in L²(S). Let us assume for the moment that the unknown density function p is in L²(S). For any h⁰ ∈ L²(S), h⁰ − ĥ⁰ ∈ H^⊥, where ĥ⁰ is the projection of h⁰ onto H, that is,

ĥ⁰ = argmin_{h ∈ H} ‖h⁰ − h‖².

Thus h⁰ − ĥ⁰, in place of p, satisfies the basic adjoint relationship (4.11). If the function h⁰ − ĥ⁰ does not change sign, it follows from Theorem 4.5 that

p = κ(h⁰ − ĥ⁰),   (4.12)

where κ is a constant such that the integral of p_0 on S equals one. The question of whether the function h⁰ − ĥ⁰ changes sign remains an open research problem. It was conjectured by Dai and Harrison (1991) that the function does not change sign. We state their conjecture, adapted to the high-dimensional hypercube, in the following.

Conjecture 4.6 Suppose that p_0 is an integrable Borel function on S and p_k, k = 1, ..., 2K, are finite Borel measures on F_1, ..., F_{2K}, respectively. If they jointly satisfy the basic adjoint relationship (4.5), then p_0 does not change sign in S.

Consistent with the numerical experience of Dai and Harrison (1991), we found that the function h⁰ − ĥ⁰ did not change sign in any of our numerical experiments. For all numerical examples shown in this chapter, we choose h⁰ = (1, 0, ..., 0) ∈ L²(S). If we assume that p is in L²(S), then because

⟨h⁰, p⟩ = ∫_S p_0(x) dx = 1,

we immediately know that h⁰ ∉ H. In several places in this section, we have made the assumption that p ∈ L²(S). Unfortunately, this assumption does not hold for some (S, θ, Γ, R)-SRBMs. See, for example, Harrison, Landau and Shepp (1985). When p ∉ L²(S), the key relation (4.12) fails to hold. However, the algorithm to estimate p proposed later in this section remains valid.

Galerkin Approximations

Let us again assume that p ∈ L²(S) and fix h⁰ = (1, 0, ..., 0). To find p using equation (4.12), one needs to compute ĥ⁰, the projection of h⁰ onto H. The space H is linear and infinite dimensional. (By the infinite dimensionality of H, we mean that infinitely many functions are needed to form a basis for the space.) Solving the least squares problem exactly in an infinite dimensional space is in general impossible. Instead we seek an approximate solution to (4.11) by using a finite dimensional subspace H_n to approximate the space H. This is known as a Galerkin approximation in numerical analysis (cf. Becker, Carey and Oden 1981).

Suppose that we have a sequence of finite dimensional subspaces {H_n} that satisfies H_n → H in L²(S) as n → ∞. (By H_n → H we mean that, for any h ∈ H, there exists a sequence {h^n} with h^n ∈ H_n such that ‖h^n − h‖ → 0 as n → ∞.) Let

ĥ^n = argmin_{h ∈ H_n} ‖h⁰ − h‖².

Since H_n → H, we have ‖ĥ^n − ĥ⁰‖ → 0 as n → ∞. Let

w^n(x) = κ_n [h⁰(x) − ĥ^n(x)],   (4.13)

where κ_n is a normalizing constant that makes the integral of w_0^n over S equal one. Dai and Harrison (1991) proposed to use w^n to approximate the stationary density p. Indeed, when p ∈ L²(S), it was proved that

‖w^n − p‖ → 0 as n → ∞,   (4.14)

assuming that Conjecture 4.6 holds. When p ∉ L²(S), convergence in L² as in (4.14) cannot be expected. However, w^n in (4.13) is still well defined. Dai and Harrison (1991) conjectured that w^n converges to p in a certain weaker sense.

As in Dai and Harrison (1991), our choice of the finite dimensional subspace H_n will be of the form

H_n = {Af : f ∈ C_n}   (4.15)

for some finite dimensional space C_n. However, there is an important difference. In Dai and Harrison (1991), C_n was chosen as a subspace of C_b²(S), whereas in the current exposition we do not make such a restriction. For a function f that is not in C², the operator Af is undefined in the conventional sense because the second order derivatives of f do not exist at some points. In such cases, Af in (4.15) will be interpreted through generalized derivatives as described in Oden and Reddy (1976).

To introduce the generalized derivatives, let us define the norm ‖·‖_{H²} via

‖f‖²_{H²} = ∫_S f² dx + max_{1≤i≤K} ∫_S (∂f/∂x_i)² dx + max_{1≤i,j≤K} ∫_S (∂²f/(∂x_i ∂x_j))² dx.

One can show that there exists a constant κ_1 > 0 such that

‖Af‖ ≤ κ_1 ‖f‖_{H²}   (4.16)

for any f ∈ C_b²(S). We use C̄_b²(S) to denote the closure of C_b²(S) under the norm ‖·‖_{H²}. A standard procedure can be used to define the first-order and second-order derivatives for each f ∈ C̄_b²(S). Thus, the operator Af can be extended to f ∈ C̄_b²(S), and the inequality (4.16) extends to any f ∈ C̄_b²(S).

Suppose that one is given a sequence of finite-dimensional subspaces {C_n} of C̄_b²(S) with C_n → C̄_b²(S) in the sense that, for every f ∈ C_b²(S), one can find a sequence {f_n} with f_n ∈ C_n such that ‖f − f_n‖_{H²} → 0 as n → ∞. One can then verify via (4.16) that H_n → H.

To numerically compute ĥ^n, let f_i^n, i = 1, ..., N_n, be a finite set of linearly independent basis functions of C_n, where N_n is the dimension of the subspace C_n. Then we can express ĥ^n as

ĥ^n = Σ_{i=1}^{N_n} u_i A f_i^n   (4.17)

for some scalars {u_i}. To find the coefficients {u_i}, observing that ⟨h⁰ − ĥ^n, A f_i^n⟩ = 0 for i = 1, ..., N_n, we obtain the following linear equations:

A u = y,   (4.18)

where

A_{ij} = ⟨A f_i^n, A f_j^n⟩,   u = (u_1, ..., u_{N_n})',   y = (⟨h⁰, A f_1^n⟩, ..., ⟨h⁰, A f_{N_n}^n⟩)'.   (4.19)

The matrix A is symmetric and positive semi-definite. By deleting some redundant basis functions if necessary, we can and will assume that the matrix A is positive definite. Thus there exists a unique solution to the linear system of equations (4.18).

To summarize the generic algorithm, let C_n be a given finite dimensional subspace of C̄_b²(S). First solve for u from the system of linear equations (4.18) with coefficients computed via the formulas in (4.19). Then form the projection ĥ^n using u via (4.17). Finally, construct the function w^n via (4.13). The resulting function w^n is proposed as an estimate of the unknown density p.

Each choice of C_n, and consequently H_n, yields an approximation w^n of p. Even for a fixed C_n, different choices of basis produce different coefficients A and y in (4.19). Because of numerical round-off error, the resulting ĥ^n, and hence w^n, depends on the choice of basis. In the next section, we propose to use the finite element method to generate the approximating sequence {C_n}, for which a natural choice of basis exists.
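To fix ideas, here is a minimal sketch of the computational core of the generic algorithm, under the assumption that the inner products A_{ij} = ⟨Af_i^n, Af_j^n⟩ and y_i = ⟨h⁰, Af_i^n⟩ have already been evaluated by numerical quadrature. The dense Cholesky solve below is only a stand-in for the sparse direct and iterative solvers discussed in Section 4.4; it is not the thesis implementation.

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

// Given A_ij = <Af_i, Af_j> and y_i = <h0, Af_i>, solve A u = y by a dense
// Cholesky factorization A = L L'.  The coefficients u then define
// hhat^n = sum_i u_i Af_i and w^n = kappa_n (h0 - hhat^n) as in (4.13).
std::vector<double> solveSPD(std::vector<std::vector<double>> A,
                             std::vector<double> y) {
    const int n = (int)y.size();
    for (int j = 0; j < n; ++j) {                 // Cholesky: store L in lower triangle
        for (int k = 0; k < j; ++k) A[j][j] -= A[j][k] * A[j][k];
        A[j][j] = std::sqrt(A[j][j]);
        for (int i = j + 1; i < n; ++i) {
            for (int k = 0; k < j; ++k) A[i][j] -= A[i][k] * A[j][k];
            A[i][j] /= A[j][j];
        }
    }
    std::vector<double> u(y);
    for (int i = 0; i < n; ++i) {                 // forward solve L z = y
        for (int k = 0; k < i; ++k) u[i] -= A[i][k] * u[k];
        u[i] /= A[i][i];
    }
    for (int i = n - 1; i >= 0; --i) {            // backward solve L' u = z
        for (int k = i + 1; k < n; ++k) u[i] -= A[k][i] * u[k];
        u[i] /= A[i][i];
    }
    return u;
}

int main() {
    // Toy 2x2 system standing in for (4.18); in practice n is large and A is sparse.
    std::vector<std::vector<double>> A = {{4.0, 1.0}, {1.0, 3.0}};
    std::vector<double> y = {1.0, 2.0};
    std::vector<double> u = solveSPD(A, y);
    std::printf("u = (%.4f, %.4f)\n", u[0], u[1]);
    return 0;
}
```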

4.3.2 The BNA/FM Algorithm for an SRBM in a Hypercube

In this section, we construct a sequence of functional subspaces C_n using the finite element method (FEM). The resulting algorithm to compute w^n is called the BNA/FM algorithm. The BNA/FM algorithm differs significantly from the BNA/SM used in Dai and Harrison (1991). As evidenced in Section 4.5, our BNA/FM algorithm is able to produce accurate approximations of stationary probabilities.

A mesh is a partition of the state space into a finite number of subdomains called finite elements. Since the domain S is a hypercube, it is natural to use a lattice mesh to divide the domain S, where each finite element is again a hypercube. The lattices are allowed to be non-uniform so that we can choose the sizes of the lattices freely. Each corner of a finite element is called a node. Figure 4.1 shows, for example, the domain of a two-dimensional hypercube (rectangle) that is partitioned into 8 × 6 elements with 9 × 7 nodes.

Let x = (x_1, ..., x_K) denote a free variable in S. For every dimension j = 1, ..., K, we divide the interval [0, b_j] into n_j subintervals. Let y_j^0 = 0 < y_j^1 < ... < y_j^{n_j} = b_j be the partition points in dimension j. We have ∏_{j=1}^K (n_j + 1) nodes and ∏_{j=1}^K n_j finite elements. The corresponding mesh is denoted n_1 × n_2 × ... × n_K. We use Δ to denote a generic mesh. Also, we label nodes in such a way that node (i_1, ..., i_K) corresponds to the spatial coordinate (y_1^{i_1}, ..., y_K^{i_K}). For future reference, we define

h_j^k = y_j^{k+1} − y_j^k,   k = 0, ..., n_j − 1 and j = 1, ..., K,

and ‖Δ‖ = max_{k,j} h_j^k.

For each mesh Δ, we now construct the finite dimensional space C_Δ. The corresponding H_Δ is constructed via (4.15). Each function f in C_Δ is a polynomial when restricted to each finite element. It is C² in the interior of each element, and is C¹ globally. With these requirements, we use third order Hermite functions to construct the basis for the subspace C_Δ.

FIGURE 4.1. A finite element mesh of a two-dimensional hypercube state space

See Carey and Oden (1981) for some basic properties of Hermite functions and other possibilities for constructing bases. The one-dimensional Hermite basis functions over the interval [−1, 1] are

φ(x) = (|x| − 1)²(2|x| + 1),   for −1 ≤ x ≤ 1,

ψ(x) = x(|x| − 1)²,   for −1 ≤ x ≤ 1.

For an interval [y_j^{k−1}, y_j^{k+1}] in the jth dimension, define

φ_j^k(x_j) = φ((x_j − y_j^k)/h_j^{k−1})   if x_j ∈ [y_j^{k−1}, y_j^k] and k > 0,
           = φ((x_j − y_j^k)/h_j^k)       if x_j ∈ [y_j^k, y_j^{k+1}] and k < n_j,
           = 0                            otherwise,

and

ψ_j^k(x_j) = h_j^{k−1} ψ((x_j − y_j^k)/h_j^{k−1})   if x_j ∈ [y_j^{k−1}, y_j^k] and k > 0,
           = h_j^k ψ((x_j − y_j^k)/h_j^k)           if x_j ∈ [y_j^k, y_j^{k+1}] and k < n_j,
           = 0                                      otherwise.

Now, by using tensor products, we are able to construct tensor-product Hermite basis functions for each node in high dimensions. At node (i_1, ..., i_K), the basis functions are of the form

f_{i_1,...,i_K,r_1,...,r_K}(x_1, ..., x_K) = ∏_{j=1}^K g_{i_j,r_j}(x_j),

where each r_j is 0 or 1, and

g_{i_j,r_j}(x_j) = φ_j^{i_j}(x_j) if r_j = 0, and ψ_j^{i_j}(x_j) if r_j = 1.   (4.20)

Each node has 2^K tensor-product basis functions. Hence, we have a total of n = 2^K ∏_{j=1}^K (n_j + 1) basis functions. Furthermore, for ease of programming, we re-index these basis functions as

f_i(x_1, ..., x_K) = f_{i_1,...,i_K,r_1,...,r_K}(x_1, ..., x_K),   (4.21)

where

i = 2^K Σ_{k=1}^{K} i_k ∏_{l=1}^{k−1} (n_l + 1) + Σ_{k=1}^{K} 2^{k−1} r_k.   (4.22)

Now we have completed the construction of the finite-dimensional subspaces C_Δ. One can check that C_Δ ⊂ C̄_b²(S). The following theorem is needed to justify the use of the BNA/FM algorithm.

Theorem 4.7 As ‖Δ‖ → 0, C_Δ → C̄_b²(S) in the ‖·‖_{H²} norm.

Proof. Let f ∈ C_b²(S) be fixed. For any ε > 0, we would like to show the following assertion: there exists δ > 0 such that, for any partition Δ with ‖Δ‖ < δ, there is a function g ∈ C_Δ with ‖g − f‖_{H²} < ε. Since polynomials are dense in C̄_b²(S) under the ‖·‖_{H²} norm, for any ε > 0 there exists a polynomial f_1 such that ‖f_1 − f‖_{H²} < ε. Thus, it is enough to prove the assertion for a polynomial f. For each partition Δ, let g be the finite element interpolation of f. Since any polynomial function is C⁴ smooth, the theorem follows from the following interpolation error estimate in Theorem 6.6 of Oden and Reddy (1976):

‖f − g‖_{H²} ≤ κ ‖Δ‖² max_{x ∈ S} max_{0 ≤ |α| ≤ 4} |∂^{|α|} f(x) / (∂x_1^{α_1} ⋯ ∂x_K^{α_K})|,

where κ is a constant independent of Δ and f, α = (α_1, ..., α_K), and |α| = Σ_k α_k. □

The implementation of the BNA/FM algorithm requires us to solve the system of linear equations (4.18) with the matrix A and vector y constructed as in (4.19). The computation of A_{ij} and y_i can be quite tedious. Explicit formulas for their computation are given in the Appendix of this chapter.
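As an illustration of how the basis of Section 4.3.2 can be evaluated in code, the following sketch implements the one-dimensional Hermite functions φ and ψ, their scaled versions φ_j^k and ψ_j^k attached to a (possibly non-uniform) one-dimensional partition, and the tensor-product basis value of (4.20). It is only a sketch of the construction under the definitions above; the actual thesis software also assembles the corresponding derivatives and quadrature formulas.

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

// One-dimensional cubic Hermite shape functions on [-1, 1].
double phi(double x) { double a = std::fabs(x); return a > 1.0 ? 0.0 : (a - 1.0) * (a - 1.0) * (2.0 * a + 1.0); }
double psi(double x) { double a = std::fabs(x); return a > 1.0 ? 0.0 : x * (a - 1.0) * (a - 1.0); }

// A one-dimensional partition y[0] = 0 < y[1] < ... < y[n] = b.
struct Partition1D { std::vector<double> y; };

// Scaled "value-type" basis function phi_j^k attached to node k.
double phiNode(const Partition1D& p, int k, double x) {
    int n = (int)p.y.size() - 1;
    if (k > 0 && x >= p.y[k - 1] && x <= p.y[k])
        return phi((x - p.y[k]) / (p.y[k] - p.y[k - 1]));
    if (k < n && x >= p.y[k] && x <= p.y[k + 1])
        return phi((x - p.y[k]) / (p.y[k + 1] - p.y[k]));
    return 0.0;
}

// Scaled "slope-type" basis function psi_j^k attached to node k.
double psiNode(const Partition1D& p, int k, double x) {
    int n = (int)p.y.size() - 1;
    if (k > 0 && x >= p.y[k - 1] && x <= p.y[k]) {
        double h = p.y[k] - p.y[k - 1];
        return h * psi((x - p.y[k]) / h);
    }
    if (k < n && x >= p.y[k] && x <= p.y[k + 1]) {
        double h = p.y[k + 1] - p.y[k];
        return h * psi((x - p.y[k]) / h);
    }
    return 0.0;
}

// Tensor-product basis value at node (i_1,...,i_K) with type bits r_j, cf. (4.20).
double tensorBasis(const std::vector<Partition1D>& mesh,
                   const std::vector<int>& node, const std::vector<int>& r,
                   const std::vector<double>& x) {
    double v = 1.0;
    for (size_t j = 0; j < mesh.size(); ++j)
        v *= (r[j] == 0) ? phiNode(mesh[j], node[j], x[j])
                         : psiNode(mesh[j], node[j], x[j]);
    return v;
}

int main() {
    Partition1D p{{0.0, 0.25, 0.6, 1.0}};                  // a non-uniform 1-D partition
    std::printf("phi_1(0.25) = %.3f (should be 1)\n", phiNode(p, 1, 0.25));
    std::vector<Partition1D> mesh = {p, p};                // 2-D lattice mesh
    std::printf("basis value  = %.4f\n",
                tensorBasis(mesh, {1, 2}, {0, 1}, {0.3, 0.7}));
    return 0;
}
```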

4.4 Computational Issues of the BNA/FM Algorithm

We have implemented the BNA/FM algorithm in a software package using the C++ programming language. The software runs on both the Linux and Sun Solaris operating systems. Although the algorithm itself is easy to understand, programming it is a big challenge because of the complexity of the BNA/FM implementation. In this section, we discuss several important issues emerging from our implementation. They are critical to the success of applying our BNA/FM algorithm to solve practical problems. Some of the challenges, like the curse of dimensionality, apply to other algorithms as well.

4.4.1 Solving Linear Systems of Equations

Recall that the BNA/FM algorithm uses the subspace C_Δ constructed in Section 4.3.2 for a given mesh Δ. The total number of basis functions is

n = 2^K ∏_{j=1}^K (n_j + 1),   (4.24)

where K is the dimension of the state space S and n_j is the number of subintervals in the jth dimension. To obtain a numerical estimate of the density function p, we must solve the system of linear equations (4.18), Au = y, where the n × n matrix A and the n-vector y are given in (4.19). The most computationally expensive part of the BNA/FM algorithm is solving the linear system of equations (4.18).

In general, there are two types of methods for solving a system of linear equations: direct methods and iterative methods. A direct method would yield an exact solution in a finite number of steps if all calculations were exact (without round-off error). An iterative computation ends when a solution with a prescribed precision is found. There is no prior knowledge of the number of steps needed in an iterative method; because of round-off error, there is not even a guarantee that the iterative method will converge at all. There is a huge literature studying the pros and cons of both types of methods. Whether one method dominates the other is often problem specific, and depends on fine tuning such as the pivoting and preconditioning performed.

In our software, we have implemented both iterative methods and direct methods. Users can experiment with both and choose the better one for a specific problem when they run the software. As mentioned above, both of these methods have their own advantages and disadvantages. We describe them in more detail in Sections 4.4.1.2 and 4.4.1.3. Due to the possibly huge size of the linear system, we have exploited several techniques from numerical analysis to make our software stable, accurate, fast, and computationally efficient. Since both methods need to take advantage of the sparseness of the matrix A, we first discuss sparse matrices.

4.4.1.1 Sparseness and Computing Memory

In practice, n in (4.24) can be quite large. The reader may refer to Table 4.3 in Section 4.5.2 to get a feeling for how big n can be even for a 3-dimensional problem. If we treated the system matrix A as dense, we would need to store n² elements, and the direct methods would need O(n³) operations to solve the system of linear equations. This can be prohibitively expensive (in terms of computer memory and computer time) even in two or three dimensions. Fortunately, the matrix A is quite sparse because our basis functions are locally supported. Because any basis function defined in Section 4.3.2 is non-zero only on the elements adjacent to its node, the maximum number of nonzero elements in each row of the matrix A defined in (4.19) is 6^K. Thus, the total number of nonzero elements in A is at most 6^K n, instead of n². Exploiting this sparseness can be extremely beneficial even just for memory storage. For example, for the 3-dimensional problem with a 10 × 10 × 10 mesh in Section 4.5.2, it takes at least 600 Megabytes of memory in a dense matrix storage mode, compared to only about 20 Megabytes in a sparse matrix representation. Therefore, we must exploit the sparseness structure of the matrix A in order to solve large problems.
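The following is a minimal sketch of compressed sparse row (CSR) storage together with a back-of-the-envelope version of the memory comparison quoted above; it does not reproduce the SparseLib++ classes used by the thesis software, and the printed numbers are rough estimates only.

```cpp
#include <cstdio>
#include <vector>

// Minimal compressed-sparse-row (CSR) storage: an n x n matrix with nnz nonzeros
// is held in nnz values, nnz column indices and n+1 row pointers, instead of
// n*n dense entries.  (Illustrative only; the thesis software uses SparseLib++.)
struct CsrMatrix {
    int n = 0;
    std::vector<double> val;     // nonzero values, row by row
    std::vector<int>    col;     // column index of each stored value
    std::vector<int>    rowPtr;  // entries of row i are val[rowPtr[i] .. rowPtr[i+1]-1]

    std::vector<double> multiply(const std::vector<double>& x) const {   // y = A x
        std::vector<double> y(n, 0.0);
        for (int i = 0; i < n; ++i)
            for (int k = rowPtr[i]; k < rowPtr[i + 1]; ++k)
                y[i] += val[k] * x[col[k]];
        return y;
    }
};

int main() {
    // Rough memory estimate for K = 3 and the i = 10 mesh of Table 4.3:
    // n = 2^3 * 11^3 = 10,648 basis functions, at most 6^3 = 216 nonzeros per row.
    long n = 8L * 11 * 11 * 11;
    long denseBytes  = n * n * 8;              // 8-byte doubles, full matrix
    long sparseBytes = n * 216 * (8 + 4);      // value + 4-byte column index per nonzero
    std::printf("dense storage  ~ %ld MB\n", denseBytes  / (1L << 20));
    std::printf("sparse storage ~ %ld MB\n", sparseBytes / (1L << 20));
    return 0;
}
```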

In our implementation, we use a software library called SparseLib++ for efficient sparse matrix storage and computation. See Dongarra, Lumsdaine and Remington (1994) for detailed information on SparseLib++. Despite taking advantage of the sparseness, for large problems, especially when the dimension of the state space is greater than or equal to 4, the matrix A can still be too big to be stored in the memory of a low-end PC or UNIX workstation. When the physical memory is exhausted, most operating systems can use hard-disk space as virtual memory to store the system matrix A. However, the algorithm then becomes extremely slow due to the slow data transmission between the CPU and the hard disk compared to the fast data transmission between the CPU and memory.

4.4.1.2 Direct Method

Most direct methods are based on LU factorization and Gaussian elimination. See, for example, Chapter 3.2 of Maron (1987). To solve Au = y, one computes a factorization A = U'DU, where D is a diagonal matrix and U is upper triangular with unit diagonal elements. Although A is sparse, U is in general not as sparse; that is, the factorization tends to fill in the originally sparse blocks of the system matrix A with nonzero elements. This fill-in demands much more memory than the original sparse structure, and the new nonzero entries increase the operation count. To preserve the sparseness of the matrix A as much as possible, we have to use specialized factorization methods.

A typical procedure for solving a sparse linear system by a direct method has three parts: symbolic factorization, numerical factorization, and forward and backward solution. The main idea is to keep the matrix as sparse as possible by finding a permutation matrix P such that the factorization of PAP' has as moderate a fill-in as can reasonably be expected. An ordering step, used to reorder the system matrix A, is required in order to find the permutation matrix P. By exploiting sparseness, we hope to reduce the number of operations to O(n²) or less, compared to O(n³) for a naive direct method, and to keep the storage requirement as low as possible. Of course, the actual savings in computer time and memory space are very much problem dependent.

In our implementation of the direct methods, instead of writing our own code, we integrate a software package, SPOOLES 2.2, as the linear equation solver. SPOOLES is a free software library designed to solve sparse systems of linear equations Au = y for u by direct methods (see Ashcraft and Grimes 1999). It was written in the C programming language with an object oriented design philosophy. The web page http://www.netlib.org/linalg/spooles contains the latest release of the package. The library provides three sparse matrix ordering methods:

minimum degree, nested dissection and multi-section. Its computations are based on BLAS3 numerical kernels to take advantage of high performance computing architectures, and it supports pivoting for numerical stability. (BLAS3 is a collection of linear algebra subroutines that perform common linear algebra operations with matrix operands. For more information about BLAS3, see Dongarra 1990.)

In solving a linear system of equations by a direct method, pivoting is used to improve numerical stability. When the system matrix A is ill-conditioned (see Section 4.4.4 below for the definition), pivoting is a good choice. Pivoting is an option in our software, so users can enable or disable it when running the software. However, in our experiments we do not find any significant difference in the accuracy of the final solutions. On the other hand, in our experiments pivoting takes almost twice the computing time and memory space of no pivoting. Thus, we do not recommend using pivoting with our software.

The ordering method has a significant impact on how efficient the direct method is. How to order sparse matrices is itself an active research area in the numerical analysis community. From our experiments, we found that one ordering method, called the better of generalized nested dissection and multi-section (the function orderViaBestOfNDandMS() in the SPOOLES 2.2 library), is by and large the most efficient choice in terms of computing time and memory usage. Hence, this particular ordering method is used in the implementation. For a gentle introduction to ordering sparse matrices, readers are referred to Ashcraft (1999).

4.4.1.3 Iterative Method

One of the attractive properties of the BNA/FM algorithm is that the system matrix A is sparse. However, when direct methods are used to solve the system Au = y, the sparseness of A cannot be fully exploited due to the fill-in of nonzero elements. Thus, memory usage can be a bottleneck when direct methods are used to solve the system of linear equations. This is why we have also implemented an iterative method in our software. The basic idea behind iterative methods is to generate a sequence of approximate solution vectors. In each iteration, only the nonzero entries of A are involved in the computation. Thus, iterative methods avoid the fill-in problem that direct methods face and therefore save memory. This can be seen from the comparison of the memory usage of the direct and iterative methods in Table 4.3 in Section 4.5.2.

In our software, we use an existing iterative methods library, IML++ version 1.2, to solve the linear system of equations. The library, by Dongarra, Lumsdaine and Remington (1994), is a collection of algorithms implemented in C++ for solving both symmetric and non-symmetric linear systems of equations using iterative techniques. A preconditioner, which is generated by SparseLib++, is necessary in order to get a good convergence speed. Since the system matrix A is symmetric and positive definite, we use the preconditioned conjugate gradient (CG) iterative algorithm (the function CG() in IML++) with an incomplete Cholesky preconditioner (the class ICPreconditioner_double in SparseLib++) in our software.

Although an iterative method takes much less memory than a direct method, it usually takes much more computing time to solve the linear system of equations. In our example in Section 4.5.2, the average running time of the iterative method is twice that of the direct method. When the size of the system matrix A increases, the computing time for each iteration increases and the number of iterations needed increases too. So there is a trade-off between saving memory space and saving computing time. From our experience, we suggest using the direct method to solve the linear system when the number of nodes is small or moderate. Only when the memory usage becomes a problem should one turn to the iterative method.
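To illustrate why an iterative solver needs only matrix-vector products (and hence suffers no fill-in), here is a minimal sketch of an unpreconditioned conjugate gradient iteration for a symmetric positive definite system Au = y. It is not the IML++ routine used by our software, which additionally applies an incomplete Cholesky preconditioner and therefore converges much faster on ill-conditioned systems; the small matrix in main() is an arbitrary example.

```cpp
#include <cmath>
#include <cstdio>
#include <functional>
#include <vector>

using Vec = std::vector<double>;
using MatVec = std::function<Vec(const Vec&)>;

double dot(const Vec& a, const Vec& b) {
    double s = 0.0;
    for (size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Conjugate gradient for a symmetric positive definite system A u = y.
// Only products A*p are needed, so the sparsity of A is fully exploited.
Vec conjugateGradient(const MatVec& A, const Vec& y, double tol, int maxIter) {
    Vec u(y.size(), 0.0), r = y, p = r;
    double rr = dot(r, r);
    for (int it = 0; it < maxIter && std::sqrt(rr) > tol; ++it) {
        Vec Ap = A(p);
        double alpha = rr / dot(p, Ap);
        for (size_t i = 0; i < u.size(); ++i) { u[i] += alpha * p[i]; r[i] -= alpha * Ap[i]; }
        double rrNew = dot(r, r);
        double beta = rrNew / rr;
        for (size_t i = 0; i < p.size(); ++i) p[i] = r[i] + beta * p[i];
        rr = rrNew;
    }
    return u;
}

int main() {
    // Small SPD example standing in for the Galerkin system (4.18).
    std::vector<std::vector<double>> A = {{4, 1, 0}, {1, 3, 1}, {0, 1, 2}};
    MatVec mv = [&](const Vec& x) {
        Vec r(3, 0.0);
        for (int i = 0; i < 3; ++i)
            for (int j = 0; j < 3; ++j) r[i] += A[i][j] * x[j];
        return r;
    };
    Vec u = conjugateGradient(mv, {1.0, 2.0, 3.0}, 1e-10, 100);
    std::printf("u = (%.4f, %.4f, %.4f)\n", u[0], u[1], u[2]);
    return 0;
}
```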

4.4.2 Computational Complexity

The size of the matrix A is n × n. Because of the sparseness of A, it may take only O(n) calculations to generate the matrix A. But the number of arithmetic operations needed to solve the linear system is O(n³) via either LU factorization or Gaussian elimination. For example, if we set n_i = 5 for i = 1, ..., K, then n = O(12^K). Thus, the computational complexity increases exponentially with the dimension K. In other words, the computing time needed may increase exponentially as the dimension of the problem increases. For example, to solve a 3-dimensional problem with a 4 × 4 × 4 mesh, it takes our software about 9 seconds to obtain an estimate on a computer. But for a 4-dimensional problem with a 4 × 4 × 4 × 4 mesh, it takes our software more than 24 minutes to obtain an estimate with similar accuracy on the same computer.

4.4.3 Mesh Selection

As motivation, consider the special case of a one-dimensional (S, θ, σ², R)-SRBM, where S = [0, b] and R = (1, −1). Such an SRBM is also called a two-sided regulated Brownian motion by Harrison (1985). It is known that the stationary density is given by

p(x) = (2θ/σ²) e^{2θx/σ²} / (e^{2θb/σ²} − 1)   for x ∈ [0, b], if θ ≠ 0,

and p(x) = 1/b for x ∈ [0, b] if θ = 0. (See, for example, Harrison (1985).) When θ < 0, the density changes most rapidly near x = 0: the magnitude of its derivative peaks at x = 0. In this case, intuitively

the numerical algorithm would do better by selecting a mesh with smaller subintervals near the origin and (relatively) larger subintervals near the upper bound b. Similarly, when θ > 0, smaller subintervals would be preferred near the upper bound b. For the boundary case θ = 0, a uniform mesh would be best. This is indeed what we observe with the actual implementation of our numerical algorithm.

Unfortunately, determining where the density changes quickly is itself a difficult problem. For a driftless (θ = 0) SRBM in a two-dimensional rectangle, Harrison, Landau and Shepp (1985) have a conformal mapping representation of the stationary density. In particular, they were able to explicitly identify which corner has a singular pole. Prior information on the location of singularities can be used to build a more refined mesh. If an SRBM in an orthant has a stationary distribution, the mesh should be selected with smaller subintervals near the origin. On the other hand, the problem of determining whether an SRBM in an orthant has a stationary distribution is non-trivial. Dupuis and Williams (1994) show that the stationary distribution of an infinite buffer SRBM exists if its corresponding Skorohod problem is attracted to the origin (or, equivalently, is stable), and Chen (1996), using a linear Lyapunov function, provides some sufficient conditions for its stability.

For the generalization, observe in the above that when θ < 0, the corresponding infinite buffer SRBM has a stationary distribution (which can be obtained by letting b → ∞). The case θ > 0 can be transformed into the case θ < 0 if one considers b − Z, which is also an SRBM but with parameters (S, −θ, σ², R); this is equivalent to transforming the upper bound to the origin and the origin to the upper bound.

Now consider an SRBM Z in a general hypercube with data (S, θ, Γ, R). To help with the mesh selection, consider first the corresponding infinite buffer SRBM (ℝ^K_+, θ, Γ, R_1), where R_1 is the submatrix of R consisting of the first K columns of R. (Readers are referred to Taylor and Williams (1993) for a formal definition of an SRBM in a nonnegative orthant.) If this infinite buffer SRBM has a stationary distribution, then the mesh should be selected with smaller subintervals near the origin. Otherwise, we can consider all of the (2^K − 1) transformations, each of which transforms one of the other (2^K − 1) corners of the hypercube to the origin. Specifically, let α be a subset of {1, ..., K}. Consider the transformation

Z_i^α = b_i − Z_i for i ∈ α, and Z_i^α = Z_i for i ∉ α.

Then Z^α is also an SRBM in S. To describe its parameters, we introduce two matrices. Let I_α be a K-dimensional diagonal matrix whose ith diagonal element equals 1 if i ∈ α and equals 0 otherwise. Let E^α be a 2K × 2K matrix defined as follows: (E^α)_{ii} = 1 for i ∉ α and i − K ∉ α; (E^α)_{i,K+i} = (E^α)_{K+i,i} = 1 for i ∈ α; and all other elements of E^α equal zero. (We note that [E^α]^{−1} = E^α.) It is direct to check that Z^α is an SRBM associated with the data (S, θ^α, Γ^α, R^α) starting from Z^α(0) = I_α b + [I − 2I_α] Z(0), where

θ^α = [I − 2I_α] θ,   Γ^α = [I − 2I_α] Γ [I − 2I_α],   R^α = [I − 2I_α] R E^α.

We wish to identify an index subset α (of {1, ..., K}) such that the corresponding infinite buffer SRBM of Z^α has a stationary distribution; in this case, the mesh should be selected with smaller subintervals near the corner where x_i = b_i for i ∈ α and x_i = 0 for i ∈ β. We note that there could be boundary cases (corresponding to θ = 0 in the one-dimensional case). On the other hand, the problem of determining whether an infinite buffer SRBM has a stationary distribution is non-trivial. Dupuis and Williams (1994) show that the stationary distribution of an infinite buffer SRBM exists if its corresponding Skorohod problem is attracted to the origin (or, equivalently, is stable), and Chen (1996), using a linear Lyapunov function, provides some sufficient conditions for its stability. Consider the two-dimensional

case. Let θ = (θ_k)_{k=1}^2 and R_1 = (r_{ik})_{2×2}. In this case, the necessary and sufficient condition is known to be

θ_1 + r_{12} r_{22}^{−1} (−θ_2)^+ < 0 and θ_2 + r_{21} r_{11}^{−1} (−θ_1)^+ < 0.
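A minimal sketch of this recipe for guiding the mesh selection is given below: cornerTransform computes the data (θ^α, Γ^α, R^α) of Z^α from the formulas above, and stable2d evaluates the two-dimensional condition just stated (assuming r_11 > 0 and r_22 > 0). The numerical data in main() are purely illustrative and are not taken from the thesis examples.

```cpp
#include <cstdio>
#include <vector>

using Mat = std::vector<std::vector<double>>;

// Data of a K-dimensional hypercube SRBM: drift theta, covariance Gamma, and the
// K x 2K reflection matrix R (columns 1..K: lower faces, K+1..2K: upper faces).
struct SrbmData { std::vector<double> theta; Mat Gamma; Mat R; };

// Map the corner with x_i = b_i for i in alpha to the origin:
// theta^a = (I-2Ia)theta, Gamma^a = (I-2Ia)Gamma(I-2Ia), R^a = (I-2Ia) R E^a,
// where E^a swaps columns i and K+i of R for every i in alpha.
SrbmData cornerTransform(const SrbmData& d, const std::vector<bool>& alpha) {
    const int K = (int)d.theta.size();
    std::vector<double> s(K);
    for (int i = 0; i < K; ++i) s[i] = alpha[i] ? -1.0 : 1.0;   // diagonal of I - 2Ia
    SrbmData out = d;
    for (int i = 0; i < K; ++i) {
        out.theta[i] = s[i] * d.theta[i];
        for (int j = 0; j < K; ++j) out.Gamma[i][j] = s[i] * d.Gamma[i][j] * s[j];
        for (int j = 0; j < K; ++j) {
            int lo = alpha[j] ? j + K : j;      // source columns after the swap E^a
            int hi = alpha[j] ? j : j + K;
            out.R[i][j]     = s[i] * d.R[i][lo];
            out.R[i][j + K] = s[i] * d.R[i][hi];
        }
    }
    return out;
}

// Two-dimensional stability check for the corresponding infinite-buffer SRBM,
// using the first two columns of R and assuming positive diagonal entries.
bool stable2d(const SrbmData& d) {
    double t1 = d.theta[0], t2 = d.theta[1];
    double r11 = d.R[0][0], r12 = d.R[0][1], r21 = d.R[1][0], r22 = d.R[1][1];
    auto pos = [](double v) { return v > 0.0 ? v : 0.0; };
    return t1 + r12 / r22 * pos(-t2) < 0.0 && t2 + r21 / r11 * pos(-t1) < 0.0;
}

int main() {
    // Illustrative 2-D data only (not from the thesis).
    SrbmData d{{1.0, -1.0},
               {{1.0, 0.0}, {0.0, 1.0}},
               {{1.0, 0.0, -1.0, 0.0}, {-1.0, 1.0, 0.0, -1.0}}};
    std::printf("origin corner stable: %d\n", (int)stable2d(d));
    SrbmData t = cornerTransform(d, {true, false});   // send the corner x1 = b1 to the origin
    std::printf("after transform: theta = (%.1f, %.1f), stable: %d\n",
                t.theta[0], t.theta[1], (int)stable2d(t));
    return 0;
}
```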

4.4.4 Ill-Conditioned System Matrix

In using our computed w^n to approximate the stationary density p, there are two sources of error. The first source is due to the fact that C_Δ is an approximation of C̄_b²(S). Such an error is called the approximation error. Even if the computation of w^n could be carried out using infinite precision, this error would exist. It decreases as the mesh gets finer. The other source is the numerical round-off error in computing w^n once an approximating subspace C_Δ is given. Round-off error occurs because only finite precision arithmetic is carried out on a computer.

Our numerical computation of w^n consists primarily of two parts: the system generation, i.e., calculating the coefficient matrix A, and the system solution, i.e., solving the linear equations. There can be some round-off error in the calculation of A. But significantly more round-off error occurs in computing the solution of the large linear system (4.18), Au = y. The accuracy of u depends on the properties of A. If A is nearly singular, the solution u is extremely sensitive to small changes in the coefficient matrix A and the right-hand side y. In this case, A, or the system, is said to be ill-conditioned.

The degree of ill-conditioning of a linear system is measured by the condition number of the matrix A. The larger the condition number, the worse conditioned the system. The condition number can be determined using the extreme eigenvalues of A. The formal mathematical definition of the condition number is

Cond(A) = ‖A‖ · ‖A^{−1}‖,

where ‖·‖ is the usual matrix norm. Estimating the condition number of A is not an easy task, since it involves the inverse of A, which takes much more effort to obtain than solving the linear system directly. As the mesh is refined, the size of the system increases, and so does the condition number of the system, as we have observed in our numerical experiments. From our experiments, we note that as the mesh is refined, the system becomes progressively more ill-conditioned, and the round-off error increases. At some point, the round-off error can completely dominate the approximation error. In such cases, further refining the mesh actually decreases the quality of the approximation w^n. We note that, in running the current implementation of the BNA/SM algorithm of Dai and Harrison (1992), we sometimes observe that their algorithm fails to produce positive numbers when the maximum degree of the polynomials used is as small as 8. In such cases, we believe that the round-off error dominates the approximation error even when only moderate accuracy of the final estimate is attempted, whereas in all our cases, the final estimate degrades only when a high degree of accuracy is reached.

Several other factors affect the condition number in our BNA/FM algorithm. Whether the mesh is uniform or non-uniform has an effect, and so does the choice of basis functions. We have used third order Hermite functions; other orders or hybrid polynomials are possible. See, for example, Carey and Oden (1981). It could be the case that the current implementation of the BNA/SM algorithm can be improved by choosing alternative bases for the global polynomials. Currently, the entry A_{ij} = ⟨Af_i, Af_j⟩, where A involves second order derivatives. Such a construction of A follows naturally from the current form of the basic adjoint relationship (4.5) that characterizes the stationary density. The condition number for such an A is several orders of magnitude larger than that of a matrix formed by ⟨f_i, f_j⟩. See page 197 of Carey and Oden (1981) for a similar observation. If one can find an alternative characterization of the stationary density, for example by carrying out integration by parts once in the basic

adjoint relationship (4.5), one may be able to formulate a system matrix that has a much smaller condition number. Such an investigation is a possible future research direction.
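For a symmetric positive definite matrix, the 2-norm condition number equals the ratio of the largest to the smallest eigenvalue, and a rough estimate can be obtained with two power iterations, as sketched below. This is only a cheap diagnostic toy, not the approach a production linear algebra package would take, and the matrix in main() is an arbitrary small example.

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

using Mat = std::vector<std::vector<double>>;
using Vec = std::vector<double>;

Vec matVec(const Mat& A, const Vec& x) {
    Vec y(A.size(), 0.0);
    for (size_t i = 0; i < A.size(); ++i)
        for (size_t j = 0; j < x.size(); ++j) y[i] += A[i][j] * x[j];
    return y;
}

// Power iteration: estimate the largest eigenvalue of a symmetric matrix A
// (Rayleigh quotient of the converged direction).
double largestEig(const Mat& A, int iters) {
    Vec x(A.size(), 1.0);
    for (int it = 0; it < iters; ++it) {
        Vec y = matVec(A, x);
        double n = 0.0;
        for (double v : y) n += v * v;
        n = std::sqrt(n);
        for (size_t i = 0; i < y.size(); ++i) x[i] = y[i] / n;
    }
    Vec Ax = matVec(A, x);
    double num = 0.0;
    for (size_t i = 0; i < x.size(); ++i) num += x[i] * Ax[i];
    return num;                               // x'Ax with ||x|| = 1
}

// Rough 2-norm condition number for SPD A:
// lambda_min(A) = sigma - lambda_max(sigma*I - A) for any sigma >= lambda_max(A).
double condEstimate(const Mat& A, int iters) {
    double lmax = largestEig(A, iters);
    Mat B = A;
    for (size_t i = 0; i < A.size(); ++i)
        for (size_t j = 0; j < A.size(); ++j) B[i][j] = (i == j ? lmax : 0.0) - A[i][j];
    double lmin = lmax - largestEig(B, iters);
    return lmax / lmin;
}

int main() {
    Mat A = {{4, 1, 0}, {1, 3, 1}, {0, 1, 2}};
    std::printf("cond(A) ~ %.2f\n", condEstimate(A, 500));
    return 0;
}
```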

4.4.5 Scaling

For an (S, θ, Γ, R)-SRBM, our computational experience shows that proper scaling of the data (S, θ, Γ, R) has a significant effect on the accuracy and efficiency of our numerical approximation of the stationary density. The fact that the data can be scaled is based on the following proposition, whose proof follows readily from (4.3), Definition 4.1, and Theorem 1.3 of Dai and Williams (1995). Dai and Harrison (1992) have a similar proposition for an SRBM in an orthant.

Proposition 4.8 Suppose that Z is a K-dimensional SRBM with data (S, θ, Γ, R), and that Z has a stationary distribution π with mean vector m. Let S be the hypercube defined in (4.1), let D be a positive diagonal matrix, and let a be a positive scalar. The new process Z* defined by

Z*(t) = DZ(at),   (4.25)

is also an SRBM with data (S*, θ*, Γ*, R*), where

θ* = aDθ,   Γ* = aDΓD,   R* = DR,   (4.26)

and

S* = {x ∈ ℝ^K : 0 ≤ x ≤ Db}.

Moreover, Z* has a stationary distribution π* with a finite mean vector m*; they are related to π and m via

π*(x) = π(D^{−1}x),   (4.27)

m* = Dm.   (4.28)

a        q_1         q_2         Iterations
50.00    0.399164    0.783404    530
10.00    0.521410    0.955050    330
 1.00    0.513462    0.95336     116
 0.10    0.500231    0.950072     88
 0.01    0.499998    0.950001    407
Exact    0.50000     0.950000

TABLE 4.1. Comparisons of different scalings

To illustrate the scaling effects, we consider a two-dimensional SRBM example which has a product form stationary density function. The data associated with this SRBM are:

θ = (10, −10), Γ = I and S = [0,1] × [0,1]. As in Chapter 2 of Dai (1990), one can check that the data satisfy the skew symmetry condition of Harrison and Williams (1987). Thus, this SRBM has a product form stationary density function, and the mean vector of the stationary distribution can be computed to be (0.5, 0.95). By Proposition 4.8, if we scale Z by D = I and a scalar a, we get a different SRBM Z* with the same stationary distribution as Z. We list our numerical approximations of the means of the stationary distribution of Z* in Table 4.1 for several different values of a, using a uniform 10 × 10 mesh. From this table, it can be observed that the smaller a is, the more accurate the estimates are. However, this does not mean that the BNA/FM algorithm gives poor estimates for this problem when a is large; rather, it indicates that a mesh denser than 10 × 10 should be used in order to produce good approximations when a is large. In this table, we also show the number of iterations needed by the iterative method. Loosely speaking, more iterations means that the system matrix A is more ill-conditioned. Thus, we can partially conclude that smaller Γ and θ give better approximations in our algorithm. In practice, if some elements of θ or some diagonal elements of Γ are large, we should scale them properly before carrying out the numerical computation.
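A minimal sketch of the rescaling (4.26) is shown below. It uses the data of the experiment just described (θ = (10, −10), Γ = I, D = I, and a scalar a); the reflection matrix in main() is only an illustrative placeholder, since the reflection data are not needed to demonstrate the scaling itself, and by (4.28) the stationary mean scales as m* = Dm.

```cpp
#include <cstdio>
#include <vector>

using Mat = std::vector<std::vector<double>>;

// Hypercube SRBM data and the rescaling of (4.26):
// theta* = a D theta, Gamma* = a D Gamma D, R* = D R, and b* = D b.
struct Srbm { std::vector<double> theta, b; Mat Gamma; Mat R; };

Srbm rescale(const Srbm& in, const std::vector<double>& D, double a) {
    Srbm out = in;
    const size_t K = in.theta.size();
    for (size_t i = 0; i < K; ++i) {
        out.theta[i] = a * D[i] * in.theta[i];
        out.b[i] = D[i] * in.b[i];
        for (size_t j = 0; j < K; ++j) out.Gamma[i][j] = a * D[i] * in.Gamma[i][j] * D[j];
        for (size_t j = 0; j < in.R[i].size(); ++j) out.R[i][j] = D[i] * in.R[i][j];
    }
    return out;
}

int main() {
    // Example of Section 4.4.5: theta = (10,-10), Gamma = I, S = [0,1]^2.
    // The reflection matrix below is an illustrative placeholder only.
    Srbm z{{10.0, -10.0}, {1.0, 1.0},
           {{1.0, 0.0}, {0.0, 1.0}},
           {{1.0, 0.0, -1.0, 0.0}, {0.0, 1.0, 0.0, -1.0}}};
    Srbm zs = rescale(z, {1.0, 1.0}, 0.1);     // a = 0.1, D = I, as in one row of Table 4.1
    std::printf("theta* = (%.1f, %.1f), Gamma*_11 = %.2f\n",
                zs.theta[0], zs.theta[1], zs.Gamma[0][0]);
    return 0;
}
```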

4.5 Numerical Examples

In this section, we present two SRBMs whose stationary distributions can be obtained by methods other than the algorithm proposed in this chapter, and we compare the accuracy of our algorithm against those known methods. In the first case, we show that the BNA/FM algorithm produces estimates as good as those of the BNA/SM algorithm. In the second case, we show that the BNA/FM algorithm produces good estimates of stationary probabilities. We also present empirical evidence of the complexity of our algorithm.

4.5.1 Comparison with SC Solution

In this subsection we apply our BNA/FM algorithm to a two-dimensional SRBM that was studied by Dai and Harrison (1991). The data of this SRBM are θ = 0, Γ = 2I, S = [0, a] × [0, 1], and the reflection matrix R given in that paper.

As discussed in Section 2.5 of Dai and Harrison (1991), the density function p ∉ L²(S). However, the BNA/FM algorithm still gives a very accurate approximation, consistent with the results obtained by Dai and Harrison (1991) using the BNA/SM algorithm. As in Dai and Harrison (1991), we fix the height of the rectangle and let the length a of the rectangle vary. For various values of the length parameter a, Table 4.2

compares three different estimates of q_1 and q_2. The BNA/FM estimates are obtained by our algorithm with a 9 × 9 uniform mesh. The SC estimates were obtained by Trefethen and Williams (1986) using an explicit expression for the stationary density. The expression was obtained by Harrison, Landau and Shepp (1985) for general two-dimensional driftless SRBMs, and is based on the Schwarz-Christoffel (SC) transformation in complex variables. The BNA/SM and SC estimates are taken from Dai and Harrison (1991). It is clear from the table that the accuracy of our BNA/FM algorithm is at least as good as that of BNA/SM in Dai and Harrison (1991). It takes less than 1 second of CPU time and 800 Kilobytes of memory for either the iterative or the direct method to obtain the BNA/FM estimates for each value of the length parameter a. A very coarse estimate of the condition number of the matrix A is 4.7 × 10^11, which is very large. Because of the ill-conditioning, we have observed that the number of iterations needed to reach 6 decimal digits of precision is very close to the size of the linear system. For example, the number of iterations for a = 1 is 374, while the size of the linear system is 400 for the 9 × 9 mesh.

a      Method    q_1         q_2
0.5    BNA/SM    0.258229    0.380822
       BNA/FM    0.258548    0.380244
       SC        0.258585    0.380018
1.0    BNA/SM    0.551325    0.448675
       BNA/FM    0.551511    0.448571
       SC        0.551506    0.448494
1.5    BNA/SM    0.878800    0.471640
       BNA/FM    0.879476    0.471676
       SC        0.879534    0.471624
2.0    BNA/SM    1.238442    0.483103
       BNA/FM    1.239767    0.482937
       SC        1.239964    0.482830

TABLE 4.2. Estimates of stationary means from different algorithms for a special two-dimensional SRBM

4.5.2 A 3-dimensional SRBM with Product Form Solution

One of the main reasons we developed the BNA/FM algorithm is to approximate the stationary distribution function, not just its mean values. To see how well this algorithm performs, we introduce a special 3-dimensional SRBM whose stationary density has an explicit product form solution. We then compare the numerical results from our BNA/FM algorithm with the analytical solution. The data of the SRBM are given as

R = [  1  −1   0  −1   1   0
       1   1   0  −1  −1   0
       0   0   1   0   0  −1 ],

θ' = (1, −1, −0.5), Γ = I and S = [0,1] × [0,1] × [0,1]. Since the data satisfy the skew symmetry condition in Harrison and Williams (1987), the stationary density function p_0 is of exponential form,

p_0(x) = 2 e^{−2x_2 − x_3} / [(1 − e^{−2})(1 − e^{−1})],   x ∈ S.
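Since p_0 factors into truncated exponentials on [0, 1] with exponents (0, −2, −1), the exact stationary means reported in Table 4.3 follow from the elementary formula for the mean of a density proportional to e^{ηx} on [0, 1]. The following small check recomputes them; the exponents are simply read off the density above.

```cpp
#include <cmath>
#include <cstdio>

// Mean of a density proportional to exp(eta * x) on [0, 1]:
//   m(eta) = (e^eta (eta - 1) + 1) / (eta (e^eta - 1)),  with m(0) = 1/2.
double truncatedExpMean(double eta) {
    if (std::fabs(eta) < 1e-12) return 0.5;
    return (std::exp(eta) * (eta - 1.0) + 1.0) / (eta * (std::exp(eta) - 1.0));
}

int main() {
    const double eta[3] = {0.0, -2.0, -1.0};   // exponents of p0 in coordinates 1, 2, 3
    for (int i = 0; i < 3; ++i)
        std::printf("q_%d = %.6f\n", i + 1, truncatedExpMean(eta[i]));
    // Prints 0.500000, 0.343482, 0.418023, matching the "Exact" row of Table 4.3.
    return 0;
}
```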

Table 4.3 compares the exact means of the stationary distribution with the approximate results obtained by our BNA/FM algorithm. In this numerical example, we use a uniform mesh.

          Means                                 LU Method         Iterative Method
          q_1        q_2        q_3             Time   Memory     Time   Memory        n
i = 4     0.500043   0.344660   0.418012          9      4.1        15     3.1       1,000
i = 6     0.500033   0.343893   0.418021         33     13.6        62     8.4       2,744
i = 8     0.500021   0.343677   0.418023         86     35.9       180    17.6       5,832
i = 10    0.500013   0.343592   0.418023        200     76.7       420    33.2      10,648
i = 12    0.500009   0.343551   0.418023        441    149.0       930    57.4      17,576
Exact     0.500000   0.343482   0.418023

TABLE 4.3. Comparisons for a 3-dimensional SRBM with product form stationary density

The index i in the table denotes the number of partitions in each dimension, and n denotes the size of the linear system for each mesh. In this table, we also show the computing time and memory usage for both the iterative and direct methods. The computing time is measured in seconds and the memory in Megabytes. The approximate results obtained by the direct and iterative methods are very close, so we only list the results obtained by the direct method. Table 4.3 shows that if we require 1% accuracy, which is usually good enough in queueing network applications, the convergence is very fast (i = 4 is already sufficient). It also shows that, as the mesh is refined, the accuracy of the approximate means improves slowly, while the required computing time and memory grow rapidly. Compared with the direct method, the iterative method takes almost twice as much computing time but only about half as much memory. Using less memory will certainly help in solving large-scale practical problems, even though it may take longer computing time.

Figures 4.2-4.4 are plots regarding the computation of the three two-dimensional marginal

stationary distributions P_1, P_2, and P_3, where P_1, P_2, and P_3 are defined as

P_1(x_1, x_2) = ∫_{{s ∈ S : s_1 ≤ x_1, s_2 ≤ x_2}} p_0(s) ds,

P_2(x_1, x_3) = ∫_{{s ∈ S : s_1 ≤ x_1, s_3 ≤ x_3}} p_0(s) ds,

P_3(x_2, x_3) = ∫_{{s ∈ S : s_2 ≤ x_2, s_3 ≤ x_3}} p_0(s) ds.

The vertical axes of these figures represent the percentage errors of our computed results compared against the exact results. As we can see from the three figures, the BNA/FM algorithm provides very accurate estimates of the stationary distribution.

FIGURE 4.3. Percentage errors of approximate marginal stationary distribution P_1

FIGURE 4.4. Percentage errors of approximate marginal stationary distribution P_3

4.6 A Queueing Network Application

In this section, we show how our BNA/FM algorithm, proposed for computing the stationary distribution of SRBMs, can be used to predict the performance of a 3-station queueing network with finite buffers. Pictured in Figure 4.5 is a queueing network of 3 stations in series. Each station has a single server with a first-in-first-out service discipline. The buffer size at each station is assumed to be finite. We use b_i to denote the buffer size (the number of waiting rooms plus 1) at station i, i = 1, 2, 3. Jobs arrive at station 1 according to a Poisson process with rate λ = 1. After completing service at station 1, they go to station 2, and after completing service there, they proceed to station 3. They exit the system after completing service at station 3. To deal with the finiteness of the buffers, we adopt the convention that a job arriving to a full buffer is simply discarded (or lost). Such a network is referred to as a loss network, and is commonly used to model computer networks.

The service times at each station are assumed to be i.i.d. positive random variables, and service times at different stations are assumed to be independent. The service time distribution at station 1 is taken to be Erlang of order 4; thus, the squared coefficient of variation (variance divided by the mean squared) of the service time distribution is c_1² = 1/4 = 0.25. The service time distribution at station 2 is taken to be exponential, and thus c_2² = 1. The service time distribution at station 3 is taken to be a Gamma distribution with c_3² = 2. The service rate μ_i and the buffer size b_i at each station i, i = 1, 2, 3, are shown in Table 4.4.

Let Z_i(t) be the queue length, including possibly the job being served, at station i at time t, i = 1, 2, 3. Following the approach in Harrison and Nguyen (1990), W. Dai (1996) proposed an SRBM in a 3-dimensional box to approximate the queue length process Z = {Z(t), t ≥ 0},

The SRBM has the following data: state space S = {z ∈ R^3 : 0 ≤ z_i ≤ b_i, i = 1, 2, 3}, reflection matrix

R = (  1   0   0  -1   0   0
      -1   1   0   0  -1   0
       0  -1   1   0   0  -1 ),

FIGURE 4.5. Finite queues in tandem

System | b1 | b2 | b3 | μ1    | μ2    | μ3
1      | 10 | 10 | 10 | 1/0.9 | 1/0.9 | 1/0.9
2      | 20 | 25 | 25 | 0.9   | 0.9   | 0.9
3      | 10 | 15 | 15 | 0.9   | 0.9   | 0.9
4      |  3 |  5 |  5 | 1/0.9 | 1/0.9 | 1/0.9

TABLE 4.4. The parameters of the loss network

and covariance matrix

Γ = ( 1 + μ1/4    -μ1/4          0
      -μ1/4        μ1/4 + μ2    -μ2
       0          -μ2            μ2 + 2μ3 ).        (4.30)
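The following sketch shows how the data in (4.30) could be assembled from the network parameters; it is an illustration under the stated assumptions (the function and variable names are ours, not the thesis software).

```python
# Sketch: assembling the SRBM data of (4.30) for the three-station tandem
# loss network.  lam, mu and cs2 (squared coefficients of variation of the
# service times) are the network parameters; ca2 = 1 for Poisson arrivals.
import numpy as np

def srbm_data_tandem(lam, mu, cs2, ca2=1.0):
    """Return (Gamma, R) for the queue-length SRBM of the 3-station tandem queue."""
    # variability contributions: arrival stream plus the three service processes
    v = [lam * ca2] + [mu[i] * cs2[i] for i in range(3)]
    Gamma = np.array([
        [v[0] + v[1], -v[1],        0.0        ],
        [-v[1],        v[1] + v[2], -v[2]      ],
        [0.0,         -v[2],         v[2] + v[3]],
    ])
    # Reflection matrix (I - P', -I): lower reflections at 0, upper at b_i.
    IminusPt = np.array([[1., 0., 0.], [-1., 1., 0.], [0., -1., 1.]])
    R = np.hstack([IminusPt, -np.eye(3)])
    return Gamma, R
```

With lam = 1, cs2 = (0.25, 1.0, 2.0) this reproduces the entries of (4.30).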

Using the BNA/FM algorithm proposed in Section 4.3, one can compute the stationary distribution and the stationary mean of the SRBM. The stationary mean is then used to estimate the long-run average queue lengths of the loss network. The BNA/FM

rows in Table 4.5 list the estimates of the average queue lengths q_1, q_2, and q_3 for the different cases. For comparison, we have simulated the loss network in each case; the corresponding estimates are given in the Simulation rows. In each case, the simulation estimates are based on 10 batches of 200,000 time units, with the first 10,000 time units truncated. The numbers in parentheses after the simulation figures show 95% confidence intervals as a percentage of the simulation figures. The numbers in parentheses following all other figures are percentage errors (in absolute value) relative to the simulation results. For this loss network, there are other performance measures that are important in practice. For example, one might be interested in the throughput at each station. (The throughput γ_i at station i is the long-run average number of jobs leaving station i per unit of time.) The throughput γ_i at station i is related to the utilization rate ρ_i at the station via

γ_i = μ_i ρ_i,    i = 1, 2, 3.

Let m_i = 1/μ_i denote the mean service time at station i, and recall the definition of δ_k (k = 1, ..., 6) in (4.9). Then the Brownian estimate of ρ_i is given by

ρ_i = 1 - m_i δ_i,    i = 1, 2, 3.        (4.31)
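A minimal sketch of how these Brownian performance estimates could be computed from the BNA/FM output follows; the input names delta and mu are assumptions made for illustration.

```python
# Sketch: delta[k-1] holds the boundary quantities delta_1..delta_6 produced by
# BNA/FM, and mu holds the three service rates.
def brownian_performance(delta, mu):
    rho = [1.0 - (1.0 / mu[i]) * delta[i] for i in range(3)]   # (4.31)
    throughput = [mu[i] * rho[i] for i in range(3)]            # gamma_i = mu_i * rho_i
    loss_rate = [delta[3 + i] for i in range(3)]               # pushing at the upper faces
    return rho, throughput, loss_rate
```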

Also, the long-run fraction of jobs lost at station i can be estimated via δ_{3+i}, i = 1, 2, 3. Tables 4.6 and 4.7 list the simulation results and the SRBM estimates of the average throughput rates and the job loss rates for the different cases. In obtaining the Brownian model with the covariance matrix given in (4.30), we implicitly assumed that the actual utilization rate ρ_i can be replaced by 1. This assumption requires that each station be heavily loaded and each buffer size be large. As discussed in Dai and Harrison (1991), one can refine the Brownian model by replacing the covariance matrix by

( 1 + ρ1μ1/4     -ρ1μ1/4                0
  -ρ1μ1/4         ρ1μ1/4 + ρ2μ2        -ρ2μ2
   0              -ρ2μ2                 ρ2μ2 + 2ρ3μ3 ).        (4.32)

Since the utilization rate ρ = (ρ_1, ρ_2, ρ_3) itself is unknown, we denote the covariance matrix in (4.32) by Γ(ρ). We now use an iterative procedure, sketched below, to find ρ and the other performance measures simultaneously. We initialize ρ(0) = (1, 1, 1). Assuming that ρ(n - 1) is known, we use the BNA/FM algorithm to find the stationary density corresponding to the covariance matrix Γ(ρ(n - 1)); the associated δ(n - 1) is obtained at the same time using formula (4.9). Then we use (4.31) to obtain the update ρ(n). The iterations, along with the refined Brownian estimates, are given in Tables 4.8 to 4.10; the case n = 1 corresponds to the original Brownian model, whose results are shown in Tables 4.5 to 4.7. The numerical results in Tables 4.8 to 4.10 show that this iterative procedure provides a slightly better Brownian model for performance evaluation than the original one, especially for System No. 4. Comparing the numerical results with the simulation results, the SRBM model gives a fairly good approximation. The approximation at station 3 is not as good as at stations 1 and 2, which may be due to the large variation of the service times at station 3.

Another performance measure of interest is the (tail) probability that the number of jobs at a station is at least k, for a positive integer k. Such performance measures are needed to assess the quality of service of a queueing network. In Tables 4.11 and 4.12, we use Systems No. 1 and 2 to compare the tail probabilities at each station calculated from the SRBM with those from simulation. From these two tables, we find that the SRBM estimates are not very close to the simulation results, but they are good enough in practice when high precision is not required. We note that there are two possible sources of error here: one is the BNA/FM algorithm itself; the other is the use of SRBMs to approximate the original queueing networks. We believe that the main error comes from the SRBM approximation.
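A sketch of this fixed-point refinement, assuming a hypothetical interface run_bna_fm() to the BNA/FM solver, is the following.

```python
# Sketch of the refinement iteration described above.  run_bna_fm(Gamma) is a
# hypothetical interface: given a covariance matrix it is assumed to return the
# boundary quantities delta_1..delta_6 of (4.9).
def covariance_from_rho(rho, mu):
    # build (4.32) from rho(n-1) and the network parameters of Table 4.4
    v = [1.0, rho[0] * mu[0] * 0.25, rho[1] * mu[1], 2.0 * rho[2] * mu[2]]
    return [[v[0] + v[1], -v[1], 0.0],
            [-v[1], v[1] + v[2], -v[2]],
            [0.0, -v[2], v[2] + v[3]]]

def refine_brownian_model(mu, run_bna_fm, n_iter=3):
    rho = [1.0, 1.0, 1.0]                       # rho(0)
    history = []
    for n in range(1, n_iter + 1):
        Gamma = covariance_from_rho(rho, mu)    # Gamma(rho(n-1))
        delta = run_bna_fm(Gamma)               # stationary density -> delta(n-1)
        rho = [1.0 - delta[i] / mu[i] for i in range(3)]   # update via (4.31)
        history.append((n, list(rho), list(delta)))
    return history
```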

System No. | Method     | q1            | q2            | q3
1          | BNA/FM     | 3.619 (1.4%)  | 3.651 (3.2%)  | 4.172 (12.5%)
           | Simulation | 3.669 (0.7%)  | 3.539 (1.1%)  | 3.709 (0.6%)
2          | BNA/FM     | 14.669 (1.6%) | 12.137 (1.7%) | 11.565 (2.5%)
           | Simulation | 14.912 (0.5%) | 12.344 (1.5%) | 11.286 (0.9%)
3          | BNA/FM     | 6.304 (2.2%)  | 6.731 (2.2%)  | 6.780 (4.1%)
           | Simulation | 6.443 (0.4%)  | 6.883 (1.1%)  | 6.515 (1.2%)
4          | BNA/FM     | 1.370 (0.4%)  | 1.795 (10.2%) | 2.086 (22.6%)
           | Simulation | 1.364 (0.2%)  | 1.629 (0.5%)  | 1.701 (0.5%)

TABLE 4.5. The average queue lengths of the tandem queue

System No. | Method     | γ1            | γ2            | γ3
1          | BNA/FM     | 0.976 (0.3%)  | 0.947 (0.4%)  | 0.859 (3.9%)
           | Simulation | 0.973 (0.2%)  | 0.951 (0.2%)  | 0.894 (0.3%)
2          | BNA/FM     | 0.896 (0.0%)  | 0.875 (0.1%)  | 0.836 (0.3%)
           | Simulation | 0.896 (0.0%)  | 0.876 (0.1%)  | 0.839 (0.2%)
3          | BNA/FM     | 0.876 (0.3%)  | 0.848 (0.8%)  | 0.788 (1.9%)
           | Simulation | 0.879 (0.2%)  | 0.855 (0.3%)  | 0.803 (0.2%)
4          | BNA/FM     | 0.838 (0.1%)  | 0.784 (3.2%)  | 0.611 (17.2%)
           | Simulation | 0.837 (0.0%)  | 0.810 (0.2%)  | 0.738 (0.3%)

TABLE 4.6. The average throughput rates of the tandem queue

System No. | Method     | δ4            | δ5            | δ6
1          | BNA/FM     | 0.024 (4.0%)  | 0.030 (25.0%) | 0.088 (63.0%)
           | Simulation | 0.025 (3.1%)  | 0.024 (3.1%)  | 0.054 (2.3%)
2          | BNA/FM     | 0.104 (0.0%)  | 0.021 (0.0%)  | 0.039 (11.4%)
           | Simulation | 0.104 (2.1%)  | 0.021 (4.4%)  | 0.035 (3.1%)
3          | BNA/FM     | 0.124 (2.5%)  | 0.028 (3.7%)  | 0.061 (27.1%)
           | Simulation | 0.121 (1.1%)  | 0.027 (3.3%)  | 0.048 (3.0%)
4          | BNA/FM     | 0.162 (1.3%)  | 0.054 (80.0%) | 0.173 (162.1%)
           | Simulation | 0.160 (0.5%)  | 0.030 (1.6%)  | 0.066 (1.2%)

TABLE 4.7. The average job loss rates of the tandem queue

No. | n          | q1(n)          | q2(n)          | q3(n)
1   | 1          | 3.619 (1.4%)   | 3.651 (3.2%)   | 4.172 (12.5%)
    | 2          | 3.585 (2.3%)   | 3.507 (0.9%)   | 4.022 (8.4%)
    | 3          | 3.585 (2.3%)   | 3.515 (0.7%)   | 4.046 (9.1%)
    | Simulation | 3.669 (0.7%)   | 3.539 (1.1%)   | 3.709 (0.6%)
2   | 1          | 14.669 (1.6%)  | 12.137 (1.7%)  | 11.565 (2.5%)
    | 2          | 14.671 (1.6%)  | 12.170 (1.4%)  | 11.534 (2.2%)
    | 3          | 14.671 (1.6%)  | 12.141 (1.6%)  | 11.532 (2.2%)
    | Simulation | 14.912 (0.5%)  | 12.344 (1.5%)  | 11.286 (0.9%)
3   | 1          | 6.304 (2.2%)   | 6.731 (2.2%)   | 6.780 (4.1%)
    | 2          | 6.310 (2.1%)   | 6.700 (2.7%)   | 6.730 (3.3%)
    | 3          | 6.310 (2.1%)   | 6.702 (2.7%)   | 6.733 (3.3%)
    | Simulation | 6.443 (0.4%)   | 6.883 (1.1%)   | 6.515 (1.2%)
4   | 1          | 1.370 (0.4%)   | 1.795 (10.2%)  | 2.086 (22.6%)
    | 2          | 1.363 (0.0%)   | 1.634 (0.3%)   | 1.896 (11.5%)
    | 3          | 1.363 (0.0%)   | 1.656 (1.7%)   | 1.967 (15.6%)
    | Simulation | 1.364 (0.2%)   | 1.629 (0.5%)   | 1.701 (0.5%)

TABLE 4.8. The iterations of the SRBM approximation for average queue lengths of the tandem queue

No. | n          | γ1(n)         | γ2(n)         | γ3(n)
1   | 1          | 0.976 (0.3%)  | 0.947 (0.4%)  | 0.859 (3.9%)
    | 2          | 0.978 (0.5%)  | 0.955 (0.4%)  | 0.892 (0.2%)
    | 3          | 0.978 (0.5%)  | 0.954 (0.3%)  | 0.889 (0.6%)
    | Simulation | 0.973 (0.2%)  | 0.951 (0.2%)  | 0.894 (0.3%)
2   | 1          | 0.896 (0.0%)  | 0.875 (0.1%)  | 0.836 (0.3%)
    | 2          | 0.897 (0.1%)  | 0.876 (0.0%)  | 0.839 (0.0%)
    | 3          | 0.896 (0.0%)  | 0.876 (0.0%)  | 0.839 (0.0%)
    | Simulation | 0.896 (0.0%)  | 0.876 (0.1%)  | 0.839 (0.2%)
3   | 1          | 0.876 (0.3%)  | 0.848 (0.8%)  | 0.788 (1.9%)
    | 2          | 0.876 (0.3%)  | 0.850 (2.6%)  | 0.797 (0.7%)
    | 3          | 0.876 (0.3%)  | 0.850 (2.6%)  | 0.797 (0.7%)
    | Simulation | 0.879 (0.2%)  | 0.855 (0.3%)  | 0.803 (0.2%)
4   | 1          | 0.838 (0.1%)  | 0.784 (3.2%)  | 0.611 (17.2%)
    | 2          | 0.849 (1.4%)  | 0.819 (1.1%)  | 0.744 (0.8%)
    | 3          | 0.848 (1.3%)  | 0.816 (0.7%)  | 0.717 (2.8%)
    | Simulation | 0.837 (0.0%)  | 0.810 (0.2%)  | 0.738 (0.3%)

TABLE 4.9. The iterations of the SRBM approximation for average throughput rates of the tandem queue

No. | n          | δ4(n)          | δ5(n)          | δ6(n)
1   | 1          | 0.024 (4.0%)   | 0.030 (25.0%)  | 0.088 (63.0%)
    | 2          | 0.022 (12.0%)  | 0.023 (4.3%)   | 0.062 (14.8%)
    | 3          | 0.022 (12.0%)  | 0.023 (4.3%)   | 0.065 (20.4%)
    | Simulation | 0.025 (3.1%)   | 0.024 (3.1%)   | 0.054 (2.3%)
2   | 1          | 0.104 (0.0%)   | 0.021 (0.0%)   | 0.039 (11.4%)
    | 2          | 0.104 (0.0%)   | 0.020 (2.0%)   | 0.037 (5.7%)
    | 3          | 0.104 (0.0%)   | 0.020 (2.0%)   | 0.037 (5.7%)
    | Simulation | 0.104 (2.1%)   | 0.021 (4.4%)   | 0.035 (3.1%)
3   | 1          | 0.124 (2.5%)   | 0.028 (3.7%)   | 0.061 (27.1%)
    | 2          | 0.124 (2.5%)   | 0.026 (3.7%)   | 0.053 (10.4%)
    | 3          | 0.124 (2.5%)   | 0.026 (3.7%)   | 0.053 (10.4%)
    | Simulation | 0.121 (1.1%)   | 0.027 (3.3%)   | 0.048 (3.0%)
4   | 1          | 0.162 (1.3%)   | 0.054 (80.0%)  | 0.173 (162.1%)
    | 2          | 0.151 (0.6%)   | 0.030 (0.0%)   | 0.075 (13.6%)
    | 3          | 0.152 (5.0%)   | 0.032 (6.7%)   | 0.099 (50.0%)
    | Simulation | 0.160 (0.5%)   | 0.030 (1.6%)   | 0.066 (1.2%)

TABLE 4.10. The iterations of the SRBM approximation for average job loss rates of the tandem queue

k  | 1-P1(k) BNA/FM | 1-P1(k) Simu. | 1-P2(k) BNA/FM | 1-P2(k) Simu. | 1-P3(k) BNA/FM | 1-P3(k) Simu.
1  | 0.8035 | 0.7232 | 0.7850 | 0.6846 | 0.8421 | 0.6675
2  | 0.6391 | 0.5802 | 0.6190 | 0.5446 | 0.6988 | 0.5547
3  | 0.5016 | 0.4566 | 0.4846 | 0.4290 | 0.5702 | 0.4561
4  | 0.3866 | 0.3519 | 0.3742 | 0.3324 | 0.4553 | 0.3689
5  | 0.2905 | 0.2633 | 0.2827 | 0.2519 | 0.3532 | 0.2920
6  | 0.2100 | 0.1877 | 0.2063 | 0.1839 | 0.2627 | 0.2237
7  | 0.1427 | 0.1238 | 0.1421 | 0.1260 | 0.1830 | 0.1628
8  | 0.0864 | 0.0702 | 0.0878 | 0.0764 | 0.1130 | 0.1092
9  | 0.0394 | 0.0248 | 0.0412 | 0.0345 | 0.0521 | 0.0614
10 | 0      | 0      | 0      | 0      | 0      | 0

TABLE 4.11. The comparison of tail probabilities of system No. 1 for the tandem queue

k  | 1-P1(k) BNA/FM | 1-P1(k) Simu. | 1-P2(k) BNA/FM | 1-P2(k) Simu. | 1-P3(k) BNA/FM | 1-P3(k) Simu.
1  | 0.9930 | 0.9898 | 0.9566 | 0.9336 | 0.9525 | 0.8818
2  | 0.9847 | 0.9817 | 0.9135 | 0.8935 | 0.9054 | 0.8353
4  | 0.9635 | 0.9600 | 0.8283 | 0.8128 | 0.8123 | 0.7477
6  | 0.9340 | 0.9298 | 0.7441 | 0.7307 | 0.7211 | 0.6638
8  | 0.8932 | 0.8868 | 0.6611 | 0.6484 | 0.6322 | 0.5831
10 | 0.8367 | 0.8263 | 0.5791 | 0.5667 | 0.5459 | 0.5048
12 | 0.7583 | 0.7421 | 0.4983 | 0.4864 | 0.4623 | 0.4298
14 | 0.6495 | 0.6258 | 0.4186 | 0.4084 | 0.3817 | 0.3575
16 | 0.4988 | 0.4658 | 0.3401 | 0.3310 | 0.3042 | 0.2881
18 | 0.2898 | 0.2430 | 0.2627 | 0.2553 | 0.2300 | 0.2219
19 | 0.1567 | 0.1005 | 0.2238 | 0.2178 | 0.1942 | 0.1897
20 | 0      | 0      | 0.1864 | 0.1804 | 0.1593 | 0.1583
22 | 0      | 0      | 0.1113 | 0.1054 | 0.0924 | 0.0979
24 | 0      | 0      | 0.0370 | 0.0328 | 0.0296 | 0.0406
25 | 0      | 0      | 0      | 0      | 0      | 0

TABLE 4.12. The comparison of tail probabilities of System No. 2 for the tandem queue

4.7 Concluding Remarks

In this chapter, we have proposed a finite element method algorithm to compute the stationary distribution of SRBMs in either a hypercube or an orthant state space. This algorithm extends and complements previous algorithms. In particular, we find the algorithm accurate, stable, and capable of computing the stationary density function (in addition to the mean of the stationary distribution). Computing the density function allows us to predict important performance measures in real applications, such as the service level in a production or communication network. We have applied the algorithm to a finite buffer queueing network and to a job shop example with infinite buffers, and our numerical results indicate that the algorithm in general provides very good approximations.

However, the algorithm has some limitations. The biggest one is its computational complexity, caused by the curse of dimensionality; this also applies to the BNA/SM algorithm and is almost impossible to avoid. Because of this computational complexity, we can only carry out computations for 3-dimensional SRBMs, which in practice can serve as approximations to three-station queueing networks. On the other hand, the BNA/FM algorithm is well suited to parallel computing, and we expect that larger problems can be solved on high-end workstations.

Since the algorithm is still in the research phase and most of our effort has gone into studying the algorithm itself, the user interface of the software implementation is still weak. Currently, users have to manually set the truncation boundaries, select a mesh, and conduct the iterations one by one, so there is a learning curve. Although implementing the algorithm was already a substantial effort, it would be desirable to automate these manual procedures and make the final software easy to use, so that anyone could run it to evaluate the performance of queueing networks. This requires a significant amount of programming work, and we plan to pursue it as our next step.

4.8 Appendix

Here we write down explicit formulas for computing A_ij and y_i. From (4.6)-(4.7), the definitions of the operators L and D_k, and (4.19), we have

A_ij = < A f_i, A f_j >.

Expanding the generator L and the boundary operators D_k applied to the tensor-product basis functions f_i and f_j, A_ij becomes a linear combination, with coefficients built from the data (θ, Γ, R), of the elementary interior and boundary integrals I_1, ..., I_7, IB_k and IB'_k defined below. The vector y in (4.19) is calculated similarly: taking the initial function h^0 = (1, 1, ..., 1), each y_i reduces to the same type of elementary integrals. By noting that ∫_S p_0(x) dx = 1, the normalizing constant κ is obtained by integrating the computed (unnormalized) density over S, which again reduces to the integrals below. Finally, the stationary means q_i, i = 1, ..., K, in (4.8) can be expressed as

q_i = ∫_S x_i p_0(x) dx,

which expands, through the basis representation of p_0, into the integrals I_6 and I_7 weighted by the computed coefficients.

For notational convenience, we define the following integrals:

I_1 = ∫_S (∂²f_i(x)/∂x_k∂x_l)(∂²f_j(x)/∂x_p∂x_q) dx,

I_2 = ∫_S (∂²f_i(x)/∂x_k∂x_l)(∂f_j(x)/∂x_p) dx,

I_3 = ∫_S (∂f_i(x)/∂x_k)(∂f_j(x)/∂x_p) dx,

I_4 = ∫_S ∂²f_i(x)/∂x_k∂x_l dx,

I_5 = ∫_S ∂f_i(x)/∂x_k dx,

I_6 = ∫_S (∂²f_i(x)/∂x_k∂x_l) x_j dx,

I_7 = ∫_S (∂f_i(x)/∂x_k) x_j dx,

IB_k = ∫_{F_k} (∂f_i(x)/∂x_p)(∂f_j(x)/∂x_q) dσ_k,

IB'_k = ∫_{F_k} ∂f_i(x)/∂x_p dσ_k.

4.8.1 Integral Calculations

In order to compute the integrals I_1, ..., I_7, IB_k, and IB'_k in a computer program, we have to express all of them through simple operations on elementary functions. First we notice that each basis function has the tensor-product form f_i(x) = ∏_u g_{i_u, r_u}(x_u), so that

∂²f_i(x)/∂x_k∂x_l = g''_{i_k, r_k}(x_k) ∏_{u ≠ k} g_{i_u, r_u}(x_u)    if k = l,

∂²f_i(x)/∂x_k∂x_l = g'_{i_k, r_k}(x_k) g'_{i_l, r_l}(x_l) ∏_{u ≠ k, l} g_{i_u, r_u}(x_u)    if k ≠ l,

∂f_i(x)/∂x_k = g'_{i_k, r_k}(x_k) ∏_{u ≠ k} g_{i_u, r_u}(x_u).
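As an aside, the factorized integrals below could equally be evaluated numerically; the following sketch (illustrative only, not from the thesis) computes a tensor-product integral as a product of one-dimensional Gauss-Legendre quadratures.

```python
# Sketch: any of the integrals I_1,...,I_7 factorizes into a product of
# one-dimensional integrals, one per coordinate, with derivative orders
# determined by how often that coordinate is differentiated.
import numpy as np

def tensor_integral(factors_1d, bounds):
    """factors_1d[u] is a callable x -> value of the u-th one-dimensional
    integrand (e.g. g_i*g_j, g_i'*g_j, or g_i''*g_j''); bounds[u] = (0, b_u)."""
    total = 1.0
    nodes, weights = np.polynomial.legendre.leggauss(6)   # exact for cubic products
    for h, (a, b) in zip(factors_1d, bounds):
        x = 0.5 * (b - a) * nodes + 0.5 * (b + a)          # map [-1, 1] -> [a, b]
        total *= 0.5 * (b - a) * np.sum(weights * h(x))
    return total
```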

Because of this tensor-product form, every one of the integrals I_1, ..., I_7 splits into a product of one-dimensional integrals, one factor per coordinate u = 1, ..., K. The factor for coordinate u is ∫_0^{b_u} g_{i_u,r_u}(x_u) g_{j_u,s_u}(x_u) dx_u, modified as follows: a derivative is placed on g_{i_u,r_u} for each occurrence of u among the differentiation indices {k, l} of f_i, a derivative is placed on g_{j_u,s_u} for each occurrence of u among {p, q} of f_j, the factor g_{j_u,s_u} is absent for the integrals I_4, I_5, I_6 and I_7 that involve only f_i, and the weight x_u is inserted in the factor with u = j for I_6 and I_7. The original derivation enumerates all the resulting cases (k = l or k ≠ l, p = q or p ≠ q, and the possible coincidences among k, l, p and q); each case is simply the corresponding arrangement of derivative factors. For example, for I_3 we have, when k = p,

I_3 = ( ∏_{u ≠ k} ∫_0^{b_u} g_{i_u,r_u}(x_u) g_{j_u,s_u}(x_u) dx_u ) ( ∫_0^{b_k} g'_{i_k,r_k}(x_k) g'_{j_k,s_k}(x_k) dx_k ),

and, when k ≠ p,

I_3 = ( ∏_{u ≠ k,p} ∫_0^{b_u} g_{i_u,r_u}(x_u) g_{j_u,s_u}(x_u) dx_u ) ( ∫_0^{b_k} g'_{i_k,r_k}(x_k) g_{j_k,s_k}(x_k) dx_k ) ( ∫_0^{b_p} g_{i_p,r_p}(x_p) g'_{j_p,s_p}(x_p) dx_p ).

The boundary integrals IB_k are treated in the same way: on the face F_k (respectively F_{K+k}), the factor for coordinate k is not integrated but evaluated at x_k = 0 (respectively x_k = b_k), while the remaining coordinates contribute one-dimensional integrals with derivatives placed according to the indices p and q. For instance, when p = q = k on the face F_k,

IB_k = g'_{i_k,r_k}(0) g'_{j_k,s_k}(0) ∏_{u ≠ k} ∫_0^{b_u} g_{i_u,r_u}(x_u) g_{j_u,s_u}(x_u) dx_u.

The second family of boundary integrals, IB'_k = ∫_{F_k} (∂f_i(x)/∂x_p) dσ_k, is handled in exactly the same way. For the faces F_k, k = 1, ..., K,

IB'_k = g'_{i_k,r_k}(0) ∏_{u ≠ k} ∫_0^{b_u} g_{i_u,r_u}(x_u) dx_u    if p = k,

IB'_k = g_{i_k,r_k}(0) ( ∏_{u ≠ k,p} ∫_0^{b_u} g_{i_u,r_u}(x_u) dx_u ) ( ∫_0^{b_p} g'_{i_p,r_p}(x_p) dx_p )    if p ≠ k,

and for the faces F_{K+k}, k = 1, ..., K, the same formulas hold with the factor for coordinate k evaluated at b_k instead of 0.

4.8.2 Basic Integrals

By inspecting the results obtained in the previous subsection, we can identify the one-dimensional integrals that still need to be calculated. They are denoted B_1, ..., B_24 and are all of the form

∫_0^{b_u} D^a g_{i_u,r_u}(x_u) · D^b g_{j_u,s_u}(x_u) · w(x_u) dx_u,

where D^a and D^b denote differentiation of order 0, 1 or 2, the second factor may be absent, the weight w(x_u) is either 1 or x_u, and the node index j_u equals i_u, i_u + 1 or i_u - 1 (only neighbouring nodes give nonzero products, since the basis functions have local support). Each B_m is evaluated in closed form; the result is a piecewise expression in the local mesh sizes h_u^{i_u - 1} and h_u^{i_u}, with one branch for each combination of the Hermite indices r_u, s_u ∈ {0, 1} and separate branches for the boundary nodes i_u = 0 and i_u = n_u. The details of the calculation are shown in Section 4.8.3, and the rational coefficients that appear come from the reference-element integrals listed in Section 4.8.4.

4.8.3 Compute the Basic Integrals

To simplify the notation and the calculation, we only consider the case i_u ≠ 0, n_u when we compute B_1, ..., B_24; the one-sided boundary cases are analogous. Each basic integral is computed by splitting the range of integration into the two elements adjacent to node i_u, mapping each element onto the reference interval, and applying the reference integrals of Section 4.8.4. For example, writing y_u^{i_u} for the grid point and h_u^{i_u - 1}, h_u^{i_u} for the lengths of the two adjacent elements,

B_1 = ∫ g_{i_u,r_u}(x_u) g_{i_u,s_u}(x_u) dx_u
    = h_u^{i_u-1} ∫_{-1}^{0} g_{i_u,r_u}(y h_u^{i_u-1} + y_u^{i_u}) g_{i_u,s_u}(y h_u^{i_u-1} + y_u^{i_u}) dy
      + h_u^{i_u} ∫_{0}^{1} g_{i_u,r_u}(y h_u^{i_u} + y_u^{i_u}) g_{i_u,s_u}(y h_u^{i_u} + y_u^{i_u}) dy.

After substituting the reference shape functions φ and ψ (with the appropriate powers of the mesh size coming from the chain rule for the slope functions and for the derivatives), each branch reduces to one of the reference integrals of Section 4.8.4 multiplied by a power of h_u^{i_u-1} or h_u^{i_u}; for instance, for r_u = s_u = 0 one obtains B_1 = (13/35)(h_u^{i_u-1} + h_u^{i_u}). The remaining basic integrals B_2, ..., B_24 are obtained in exactly the same way.

4.8.4 Integrals of Hermite Basis Functions

To help readers verify the calculations in the previous sections, we list the reference-element integrals that are used directly when computing the basic integrals B_1, ..., B_24. On the reference element, the cubic Hermite shape functions are

φ(y) = 1 - 3y² + 2|y|³,    ψ(y) = y(|y| - 1)²,    -1 ≤ y ≤ 1,

so that φ carries the nodal value and ψ carries the nodal slope. The required quantities are the integrals over [0, 1] and [-1, 0] of all products of φ, ψ, φ(· - 1), ψ(· - 1) and their first and second derivatives, with and without the weight y. All of them are elementary integrals of polynomials; representative values, as used above, include

∫_0^1 φ(y)² dy = 13/35,    ∫_0^1 φ(y)ψ(y) dy = 11/210,    ∫_0^1 ψ(y)² dy = 1/105,
∫_0^1 φ(y)φ(y - 1) dy = 9/70,    ∫_0^1 φ'(y)² dy = 6/5,    ∫_0^1 φ'(y)ψ'(y) dy = 1/10,
∫_0^1 φ''(y)² dy = 12,    ∫_0^1 ψ''(y)² dy = 4,    ∫_0^1 ψ''(y)ψ''(y - 1) dy = 2.

The full list can be regenerated symbolically; a short sketch follows.
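The following sketch (ours, not part of the original implementation) regenerates these reference integrals symbolically, assuming the shape functions given above.

```python
# Sketch: reference-element integrals of the cubic Hermite shape functions.
import sympy as sp

y = sp.symbols('y')
# Branches of the shape functions that are valid on the interval [0, 1]:
phi  = 1 - 3*y**2 + 2*y**3                 # phi(y)
psi  = y*(y - 1)**2                        # psi(y)
phiL = 1 - 3*(y - 1)**2 - 2*(y - 1)**3     # phi(y-1): branch of phi on [-1, 0]
psiL = (y - 1)*y**2                        # psi(y-1): branch of psi on [-1, 0]

def ref(f, g, m=0, n=0):
    """Integral over [0,1] of the m-th derivative of f times the n-th derivative of g."""
    return sp.integrate(sp.diff(f, y, m) * sp.diff(g, y, n), (y, 0, 1))

print(ref(phi, phi))          # 13/35
print(ref(phi, psi))          # 11/210
print(ref(phi, phiL))         # 9/70
print(ref(psi, psiL, 2, 2))   # 2
```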

Chapter 5

Computing the Stationary Distribution of an SRBM in an Orthant

5.1 Introduction

In Chapter 4, we developed the BNA/FM algorithm to compute the stationary distribution of an SRBM in a hypercube. In this chapter, we extend the algorithm to compute the stationary distribution of an SRBM in an orthant. The SRBM in a hypercube often serves as an approximate model for finite buffer queueing networks, and the SRBM in an orthant often serves as an approximate model for infinite buffer queueing networks. The computed stationary distribution of an SRBM can then be used to approximate the stationary distribution of the corresponding queueing network. Assuming Conjecture 5.11, we are able to provide a full proof of the convergence theorems that justify the extension. All numerical experiments done thus far show that the extended algorithm gives fairly accurate performance estimates.

Queueing networks have been widely used to model manufacturing systems and communication networks, and have provided a very useful tool for the design and operations management of these systems (see, for example, Jackson (1963), Kleinrock (1976), Buzacott and Shanthikumar (1992), and Yao (1994)). In modeling and analyzing these systems, one of the fundamental issues is the performance analysis of queueing networks. Despite much effort, exact analysis of queueing networks has been very limited (see, for example, Kelly (1979), Walrand (1988), and Yao (1994)).

To evaluate the performance of general queueing networks, many approximation methods have been proposed. Among these methods, Brownian approximation appears to be attractive and useful. In a Brownian approximation, an SRBM is used to approximate the queue length process, workload process, or sojourn time process of a queueing network. The data specifying the SRBM can be computed explicitly from certain parameters of the queueing network; the parameters involved are the first and second moments of the interarrival and service time distributions, and the routing probabilities. Compared with simulation, the advantage of the SRBM approximation lies in its tractability as an analytical model for solving optimization and scheduling problems associated with queueing networks in a wide range of applications; see, e.g., Harrison and Wein (1990), Wein (1990), and Laws (1992).

Traditionally, SRBMs arise as the limits of certain performance processes of queueing networks under appropriate scaling in time and space and under heavy traffic conditions. Most of these limit theorems, known as functional central limit theorems or heavy traffic limit theorems, have focused on queueing networks with infinite buffers, where the corresponding SRBMs are defined in nonnegative orthants. For a survey of this area, readers are referred to Whitt (1974), Lemoine (1978), Glynn (1990), Harrison and Nguyen (1993), Chen and Mandelbaum (1994), Williams (1996), and Chen and Yao (2001). Recently, strong approximation has also been adopted to derive SRBMs from queueing networks (see Chapter 2 of this dissertation).

Motivated by queueing network applications, one often focuses on the stationary distributions of SRBMs. Quantities computed from the stationary distributions are used to approximate certain performance measures of the corresponding queueing networks. However, it is unlikely that a general analytical solution for the stationary distribution will ever be found; therefore, numerical solutions have been sought. Dai and Harrison (1991, 1992) introduced and implemented a numerical algorithm which we refer to as BNA/SM. In the previous chapter, we developed an alternative implementation for computing the stationary distribution of an SRBM in a hypercube, called BNA/FM, which is based on a finite element basis (instead of the global polynomial basis used in BNA/SM). In this chapter, we augment the algorithm so that it can calculate the stationary distribution of SRBMs in orthants. In fact, this extended algorithm has been used to estimate the performance measures of several queueing examples in Chapters 2 and 3 of this dissertation, and the results are very encouraging.

The rest of the chapter is organized as follows. In the next section, we define an SRBM in a general convex polyhedral state space; hypercubes and orthants are two special cases of convex polyhedra. In Section 5.3, we develop an approximation scheme which extends the algorithm of the previous chapter to compute the stationary distribution of an SRBM in an orthant. In Section 5.4, we provide two convergence theorems which justify the approximation scheme proposed in Section 5.3. The numerical algorithm developed here has already been used in Chapters 2 and 3, where it provided good approximations for several queueing networks. In Section 5.5, we present an additional job shop example and show how the Brownian approximation, armed with the BNA/FM algorithm, can effectively be used for the performance analysis of queueing networks.

Finally, we introduce some notation to be used in this chapter. Let ℝ^k denote the k-dimensional Euclidean space and ℝ^k_+ the nonnegative k-dimensional orthant. For a subset S of ℝ^k, let C²(S) be the space of twice differentiable functions whose first and second order partial derivatives are continuous and bounded on S, and let B(S) be the set of functions on S that are Borel measurable. For a Borel set U ⊂ ℝ^k, k ≥ 1, we define

C([0, T], U) = {w : [0, T] → U, w is continuous}.

For any continuous function f defined from [t_1, t_2] ⊂ [0, ∞) into ℝ^k, k ≥ 1, let

Osc(f, [t_1, t_2]) = sup_{t_1 ≤ s ≤ t ≤ t_2} | f(t) - f(s) |.

5.2 Definition and Preliminaries

In this section, we define a semimartingale reflecting Brownian motion (SRBM) that lives in a convex polyhedron. We then state the basic adjoint relationship that characterizes the stationary distribution of the SRBM (Theorem 5.5), which is the starting point for our numerical algorithm.

A K-dimensional convex polyhedron S (K ≥ 1) is defined in terms of m (m ≥ 1) K-dimensional vectors {n_i, i ∈ J}, J = {1, ..., m}, and an m-dimensional vector b = (b_1, ..., b_m)', where prime denotes transpose, as

S = {x ∈ ℝ^K : n_i · x ≥ b_i for all i ∈ J},

where n_i · x = n_i' x denotes the inner product of the vectors n_i and x. It is normally assumed that the interior of S is nonempty and that the set {(n_1, b_1), ..., (n_m, b_m)} is minimal in the sense that no proper subset defines S. This is equivalent to the assumption that each of the faces

F_i = {x ∈ S : n_i · x = b_i},    i ∈ J,

has dimension K - 1. Throughout this chapter, we are concerned with SRBMs defined in either hypercubes or orthants, which are two special cases of the general convex polyhedron. A K-dimensional hypercube S is equivalent to a K-dimensional convex polyhedron with

m = 2K,
b = (0, ..., 0, b_{K+1}, ..., b_{2K}),
(n_1, ..., n_{2K}) = (I, -I),

where I is the K-dimensional identity matrix. Similarly, a K-dimensional orthant is equivalent to a K-dimensional convex polyhedron with

m = K,
b = (0, ..., 0),
(n_1, ..., n_K) = I.
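For concreteness, here is a small sketch of the (n_i, b_i) description of these two state spaces; the sign convention for the upper-face offsets of the hypercube is our reading of the definition above.

```python
# Sketch: half-space descriptions {x : n_i . x >= b_i} of the two state spaces.
import numpy as np

def polyhedron_hypercube(b_upper):
    """Hypercube {x : 0 <= x_i <= b_upper[i]}."""
    K = len(b_upper)
    N = np.hstack([np.eye(K), -np.eye(K)])                  # columns n_1, ..., n_2K
    b = np.concatenate([np.zeros(K), -np.asarray(b_upper)]) # upper faces: -x_i >= -b_i
    return N, b

def polyhedron_orthant(K):
    """Nonnegative orthant {x : x_i >= 0}."""
    return np.eye(K), np.zeros(K)
```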

Given a K-dimensional vector θ, a K × K symmetric, positive definite matrix Γ, and a K × m matrix R, we shall define a semimartingale reflecting Brownian motion (SRBM) associated with the data (S, θ, Γ, R) on a general convex polyhedron.

Definition 5.1 For x ∈ S, an SRBM associated with the data (S, θ, Γ, R) that starts from x is an {F_t}-adapted, K-dimensional process Z defined on some filtered probability space (Ω, F, {F_t}, P_x) such that

Z = X + RY, (5.1)

where

1. Z has continuous paths in S, Px-a.s.,

2. under P_x, X is a K-dimensional Brownian motion with drift θ and covariance matrix Γ such that {X(t) - θt, F_t, t ≥ 0} is a martingale and X(0) = x, P_x-a.s.,

3. Y is an {Ft}-adapted, m-dimensional process such that Px-a.s.,

(a) Y(0) = 0,

(b) Y is continuous and nondecreasing,

(c) for i = 1, ..., m, Y_i can increase only when Z is on the face F_i.

By (3c) we mean that, for each t > 0, Z(t) ∉ F_i implies Y_i(t - δ) = Y_i(t + δ) for some δ > 0. This is equivalent to ∫_0^∞ 1_{{Z(s) ∉ F_i}} dY_i(s) = 0 for all i. Loosely speaking, an SRBM behaves like a Brownian motion with drift vector θ and covariance matrix Γ in the interior

K a vertex. A K-dimensional hypercube has 2 vertexes, and each vertex is given by C\i^aFi for a (unique) index set a C (1,... , 2K} such that for each i E a, i + K £ a, and for each i £ a and i < K, i + K G a. It can be seen that \a\ = K. For an 2K x 2K matrix A and

each vertex set a, let Aa denote the K x K matrix contained from A by deleting those rows and columns with indices in J \ a. The reflection matrix Ra associated with the vertex a is the K x K matrix, given by

a R = ((I,-I)'R)Q.

Definition 5.2 A square matrix A is said to be an S matrix if there is a vector x > 0 such that Ax > 0. The matrix A is said to be completely-S if each principal submatrix of A is an

S-matrix.

Definition 5.3 For an SRBM in an orthant, The reflection matrix R is said to satisfy the completely-S condition if R is an S-matrix. For an SRBM in a hypercube, the reflection matrix

R is said to satisfy the completely-S condition if for each vertex a, Ra is a completely-S matrix.

It follows from Propositions 1.1 and 1.2 of Dai and Williams (1995) that a necessary condition for the existence of the SRBM Z associated with (5,0, Y, R), for each initial x G S, is that the reflection matrix R satisfy completely-S condition. On the other hand, when R satisfies the completely-S condition, it follows from Theorem 1.3 of Dai and Williams (1995) that there exist processes (Z, X, Y) defined on a common filtered probability space (fi, T,

{Ft}), on which a family of probability measures {Px} is defined, such that for each x G S,

under Px, Z is an (S, 6, Y, ii!)-SRBM starting from x. Furthermore, Z is a strong Markov process that is Feller continuous. 5.2 Definition and Preliminaries 178

Definition 5.4 A probability measure π_0 on S is called a stationary distribution for Z if for each bounded Borel measurable function f on S,

∫_S E_x[f(Z(t))] dπ_0(x) = ∫_S f(x) dπ_0(x)    for all t ≥ 0.        (5.2)

Here, E_x denotes the expectation under P_x.

Using the same argument as in Section 7 of Harrison and Williams (1987), one can show that if the stationary distribution π_0 exists, then it is unique and has a density p_0 with respect to Lebesgue measure dx on S. As stated in the introduction, the primary purpose of this chapter is to compute the stationary density p_0. We now provide an analytical characterization of p_0. To this end, for each k = 1, ..., m, define the measure π_k on the boundary face F_k via

π_k(B) = 2 E_{π_0} [ ∫_0^1 1_{{Z(s) ∈ B}} dY_k(s) ],        (5.3)

where E_{π_0} denotes the expectation under the probability measure P_{π_0}(·) = ∫_S P_x(·) π_0(dx). It then follows again from the arguments in Harrison and Williams (1987) that π_k has a density p_k with respect to the surface Lebesgue measure dσ_k on F_k. Furthermore, p_0, p_1, ..., p_m satisfy the following basic adjoint relationship (BAR):

∫_S L f(x) p_0(x) dx + Σ_{k=1}^m ∫_{F_k} D_k f(x) p_k(x) dσ_k = 0,    for all f ∈ C²_b(S),        (5.4)

where

L f(x) = (1/2) Σ_{j,k=1}^K Γ_{jk} ∂²f(x)/∂x_j∂x_k + Σ_{j=1}^K θ_j ∂f(x)/∂x_j,        (5.5)

D_k f(x) = v_k' ∇f(x),        (5.6)

v_k is the kth column of the reflection matrix R, and ∇f is the gradient of f. The following theorem is a special case of Dai and Kurtz (1997), where a general polyhedral state space was considered. As before, θ is a K-dimensional vector, Γ is a K × K symmetric and strictly positive definite matrix, and R is a K × m matrix.

Theorem 5.5 Assume that R satisfies the completely-S condition in Definition 5.3 and that the stationary distribution exists. There exists a unique nonnegative function p = (p_0, p_1, ..., p_m) with ∫_S p_0(x) dx = 1 and ∫_{F_k} p_k(x) dσ_k < ∞ for k = 1, ..., m that satisfies the basic adjoint relationship (5.4). Furthermore, π_0(·) = ∫_· p_0(x) dx is the stationary distribution of the SRBM Z associated with the data (S, θ, Γ, R), and π_k(·) = ∫_· p_k(x) dσ_k is the measure on F_k defined in (5.3).

Theorem 5.5 provides an analytical characterization of the stationary density of an SRBM. One would hope to find an analytical solution from this characterization; however, this has been possible only in some very special cases. Harrison and Williams (1987) derived so-called skew symmetry conditions that are necessary and sufficient for an SRBM in an orthant to have a product-form stationary distribution. The conditions are

2 R_ii Γ_ij R_jj = R_ii R_ij Γ_jj + R_jj R_ji Γ_ii,    for all i ≠ j,        (5.7)

and the stationary density function is

p_0(x) = ∏_{k=1}^K γ_k exp(-γ_k x_k),    x ∈ S,        (5.8)

where γ_1, ..., γ_K are positive constants defined by

γ_k = -2 (R^{-1} θ)_k R_kk / Γ_kk,    k = 1, ..., K.        (5.9)

Except in these limited cases, an analytical solution for the stationary distribution of a general SRBM in an orthant or a hypercube is not known; thus numerical solutions have been sought. We now define some mean quantities related to the stationary distribution of an SRBM. For i = 1, ..., K and k = 1, ..., m, define

q_i = ∫_S x_i p_0(x) dx,        (5.10)

δ_k = ∫_{F_k} p_k(x) dσ_k.        (5.11)

The vector q = (q_1, ..., q_K)' is the stationary mean value of Z. The quantity δ_k represents the long-run average amount of pushing per unit of time needed on the boundary face F_k in order to keep the SRBM Z inside the state space S. These quantities, along with the stationary distributions, are of interest in queueing network applications.

To better understand these mean quantities, we now put them into the context of conventional queueing networks. Generally speaking, SRBMs in orthants arise as Brownian approximations to infinite buffer queueing networks, and SRBMs in hypercubes arise as Brownian approximations to finite buffer queueing networks. If an SRBM Z is used to approximate the queue length process of an infinite buffer queueing network, then q_i in (5.10), i = 1, ..., K, is the approximation to the average queue length at station i, and 1 - m_i δ_i, i = 1, ..., K, is the approximation to the traffic intensity at station i, where m_i is the mean service time at station i. If the SRBM Z is used to approximate the queue length process of a finite buffer queueing network, then q_i and 1 - m_i δ_i, i = 1, ..., K, have the same interpretation, and furthermore, δ_{K+i}, i = 1, ..., K, is the approximation to the job loss rate at station i. On the other hand, if an SRBM Z is used to approximate the workload process of an infinite buffer queueing network, then q_i, i = 1, ..., K, is the approximation to the average queueing (waiting) time at station i (not including the mean service time), and 1 - δ_i, i = 1, ..., K, is the approximation to the traffic intensity at station i. If the SRBM Z is used to approximate the workload process of a finite buffer queueing network, then again q_i and 1 - δ_i, i = 1, ..., K, have the same interpretation, and δ_{K+i}, i = 1, ..., K, is the approximation to the rate at which work is lost at station i.

5.3 The BNA/FM Algorithm for an SRBM in an Orthant

In Chapter 4, we developed the BNA/FM algorithm for computing the stationary distribution of an SRBM in a hypercube. In this section, we extend the algorithm to compute the stationary distribution of an SRBM in an orthant.

If we partition an orthant state space by a lattice as in the previous chapter, we obtain either an infinite number of elements or some unbounded elements. This violates the finite element property, so we cannot use the BNA/FM algorithm directly. In this section, we construct a sequence of SRBMs in hypercubes which converges weakly to the SRBM in the orthant, and then apply the BNA/FM algorithm to this sequence of SRBMs to obtain a sequence of stationary distributions which converges weakly to the stationary distribution of the SRBM in the orthant.

Let Z be a K-dimensional SRBM with data (S, θ, Γ, R), where S = ℝ^K_+ is an orthant. Let {b^n} be an increasing sequence of positive vectors such that lim_{n→∞} b^n = ∞. Define S^n as the K-dimensional hypercube

S^n = {x ∈ ℝ^K : 0 ≤ x ≤ b^n}.

Let R̄ = (R, -I); we can then construct a sequence of independent K-dimensional SRBMs {Z^n(t)} associated with the data (S^n, θ, Γ, R̄), each starting from 0. As we will prove in the next section, {Z^n} converges weakly to Z. Let π_0 be the stationary distribution of Z and π_0^n the stationary distribution of Z^n (assuming that Z does have a stationary distribution). We conjecture that {π_0^n} is relatively compact, based on which we can show that π_0^n converges to π_0 in distribution as n → ∞. This result is quite intuitive, and our numerical experiments support the conjecture.

As is clear from the above discussion, in order to calculate the stationary distribution of Z, we first construct the sequence {Z^n} and then use the BNA/FM algorithm to calculate the stationary density function p_0^n of Z^n, n = 1, 2, .... We would stop the iteration once two successive computed densities are sufficiently close, say

∫_{S^{n-1}} | p_0^n(x) - p_0^{n-1}(x) | dx < ε,        (5.12)

where ε is a predetermined positive constant close to 0. However, evaluating inequality (5.12) can be very difficult since it involves numerical integration over a multidimensional state space. Instead of using (5.12), our implementation compares the stationary mean quantities between two iterations to determine whether to stop, and this works well in all experiments done so far.

Let Z be the SRBM approximation to the original queueing network with infinite buffers. Once we construct {Z^n}, a sequence of SRBMs in hypercubes, every Z^n can be interpreted as an SRBM approximation to a conventional finite buffer queueing network. Intuitively, the approximation scheme used in this section corresponds to approximating an infinite buffer queueing network by a sequence of finite buffer queueing networks with increasing buffer sizes. From the interpretation at the end of Section 5.2, we know that if Z^n is a good approximation to Z, then for i = 1, ..., K,

| q_i^n - q_i^{n-1} | < ε,    | ρ_i^n - ρ_i^{n-1} | < ε,    δ_{K+i}^n < ε,

where ρ_i denotes the traffic intensity at station i of the original queueing network; ρ_i^n = 1 - m_i δ_i^n if Z is an approximation to the queue length process, and ρ_i^n = 1 - δ_i^n if Z is an approximation to the workload process.

Another non-trivial problem is how to choose the upper bounds {b^n} so that the algorithm converges within as few iterations as possible. If the initial bound b^0 is small and the increase between iterations is small, then many iterations are needed to obtain a good approximation; because of the computational complexity of the BNA/FM algorithm discussed in Section 4.4.2, this is not desirable. On the other hand, if the initial upper bound is too big, the algorithm requires a mesh with many grid points to cover the truncated region; the size of the linear system then becomes large, the computation takes much more time and memory, and it is prone to rounding errors. Harrison and Nguyen (1990) conjecture that, whether or not the skew symmetry conditions in (5.7) hold, the stationary distribution π_0(x) decays exponentially as x → ∞, the parameter vector of that exponential tail being precisely the vector γ. This is confirmed in all of our numerical experiments. Based on it, we choose the initial upper bound b^0 such that, for i = 1, ..., K,

exp(-γ_i b_i^0) < ε.        (5.13)

To construct the subsequent {b^n}, define a constant vector c such that c_i = 5 if b_i^0/5 < 5 and c_i = 10 if b_i^0/5 ≥ 5, i = 1, ..., K. Then b^n is selected as

b^n = b^{n-1} + c.

As it turns out, three iterations are usually sufficient to obtain a good approximation. We note that this choice of c is used simply because it works well in our numerical experiments.

The remaining question is how to choose the initial function h^0. As can be seen from the discussion in the last chapter, h^0 should be chosen such that

p - h^0 = A f,

with f being as smooth as possible. Unfortunately, such a prior choice is very difficult. In Chapter 4, we simply used h^0 = (1, 0, ..., 0). Since the stationary density function of an SRBM in an orthant is believed to have an exponential tail, a better initial function would be

h^0 = ( ∏_{k=1}^K γ_k exp(-γ_k x_k), 0, ..., 0 ).

Our experiments show that this initial function significantly improves the convergence speed and the accuracy of the algorithm.
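A sketch of the overall truncation scheme of this section follows; it assumes a hypothetical interface run_bna_fm_on_box() that returns the stationary mean and boundary quantities for a given box, and it uses our reading of (5.13) for the initial bound.

```python
# Sketch of the truncation scheme of Section 5.3.
import numpy as np

def orthant_stationary_mean(gamma, run_bna_fm_on_box, eps=1e-3, max_iter=5):
    gamma = np.asarray(gamma, dtype=float)
    b = np.log(1.0 / eps) / gamma              # initial box: exp(-gamma_i b_i) ~ eps
    c = np.where(b / 5.0 < 5.0, 5.0, 10.0)     # increments as described above
    q_prev, delta = None, None
    for _ in range(max_iter):
        q, delta = run_bna_fm_on_box(b)        # BNA/FM on the hypercube [0, b]
        if q_prev is not None and np.max(np.abs(q - q_prev)) < eps:
            return q, delta, b                 # stationary means have stabilized
        q_prev, b = q, b + c                   # enlarge the box and repeat
    return q_prev, delta, b
```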

5.4 Convergence of the Algorithm

In this section, we first introduce the Skorohod problem and the oscillation result established in Dai and W. Dai (1998). Then we prove the main convergence theorem for the sequence of SRBMs in hypercubes. Finally, we prove the convergence theorem for the sequence of stationary distributions, under the assumption that a conjecture holds.

Definition 5.6 Given T > 0 and x ∈ C([0, T], ℝ^K) with x(0) ∈ S, an (S, R)-regulation of x over [0, T] is a pair (z, y) ∈ C([0, T], S) × C([0, T], ℝ_+^{2K}) such that

(i) z(t) = x(t) + Ry(t) for all t ∈ [0, T],

(ii) z(t) ∈ S for all t ∈ [0, T],

(iii) for each i ∈ J, y_i is non-decreasing, y_i(0) = 0, and y_i can increase only at times t ∈ [0, T] for which z(t) ∈ F_i.
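To make Definition 5.6 concrete, the sketch below computes a discrete-time analogue of an (S, R)-regulation in the simplest case K = 1 with S = [0, b], reflection direction +1 at the lower face and −1 at the upper face (that is, R = 1 and R_b = −1 in the notation of the theorem that follows). It is an illustrative Euler-type recursion on a sampled path, not part of the development in the text.

    def regulate_on_interval(x, b):
        # Discrete two-sided regulation of a sampled path x on S = [0, b]:
        # returns (z, y_low, y_up) with z[k] = x[k] + y_low[k] - y_up[k],
        # z[k] in [0, b], and y_low (resp. y_up) non-decreasing, increasing
        # only when z hits 0 (resp. b).  Assumes x[0] lies in [0, b].
        z, y_low, y_up = [x[0]], [0.0], [0.0]
        for k in range(1, len(x)):
            raw = z[-1] + (x[k] - x[k - 1])        # unregulated move
            push_low = max(0.0, -raw)              # pushing at the lower face
            push_up = max(0.0, raw - b)            # pushing at the upper face
            z.append(min(max(raw, 0.0), b))
            y_low.append(y_low[-1] + push_low)
            y_up.append(y_up[-1] + push_up)
        return z, y_low, y_up

    # Example: a path that would leave [0, 2] on both sides is kept inside.
    z, y_low, y_up = regulate_on_interval([0.0, 1.5, 2.8, 1.0, -0.7], b=2.0)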

Theorem 5.7 Let R_b = −diag(a), where a is a K-dimensional positive vector, let R be a K × K matrix that satisfies the completely-S condition, and let Z be an SRBM in an orthant with data (S, θ, Γ, R). Let {b^n} be an increasing sequence of positive vectors such that b^n = (b_1^n, ..., b_K^n) → ∞ as n → ∞. For n = 0, 1, ..., define S^n as the hypercube associated with upper bound b^n and Z^n as the SRBM with data (S^n, θ, Γ, (R, R_b)) starting from 0. Then

    Z^n ⇒ Z,   as n → ∞,

where ⇒ means convergence in distribution.

Before we prove this theorem, we first need to verify that the Z^n constructed in the above theorem is well defined, that is, that the matrix (R, R_b) satisfies the completely-S condition. For this purpose, we now prove the following lemma.

Lemma 5.8 Let R_b and R be defined as in Theorem 5.7. Then the K × 2K matrix (R, R_b) satisfies the completely-S condition.

Proof of Lemma 5.8. Our proof is adapted from that of Lemma 5.1 in Dai and W. Dai (1998). Since S^n is simple, by Proposition 1.1 in Dai and Williams (1995), it is enough to show that for each vertex α ⊂ J = {1, ..., 2K}, the principal submatrix ((I, −I)'(R, R_b))_α is a completely-S matrix. It is easy to see that

    (I, −I)'(R, R_b) = [  R    R_b ]
                       [ −R   −R_b ].

Let M = (N R̃)_α, where N = (I, −I)' and R̃ = (R, R_b); then M has the form

    M = [ M_1  M_2 ]
        [ M_3  M_4 ],

where M_1 is a principal submatrix of R, M_4 is a principal submatrix of −R_b, M_2 is a submatrix of R_b, and M_3 is a submatrix of −R. Because of the definition of the set α, M_2 does not contain any diagonal elements of R_b; since R_b is diagonal, M_2 = 0. Because R is a completely-S matrix, M_1 is a completely-S matrix. Because −R_b is a diagonal matrix with positive diagonal elements, M_4 is a completely-S matrix. Since M_2 = 0, M is block lower triangular with completely-S diagonal blocks, and hence M is a completely-S matrix. □

Consider a sequence of hypercubes S^r indexed by r > 0. The reflection matrix associated with each state space S^r is R. The following oscillation result is concerned with deterministic continuous paths, and it is a special case of a theorem in Dai and W. Dai (1998).

Lemma 5.9 Assume that R satisfies the completely-S condition. There exist constants κ > 0 and r̄ > 0 that depend only on R such that for any T > 0, r > r̄, x ∈ C([0, T], ℝ^K) with x(0) ∈ S^r, and any (S^r, R)-regulation (z, y) of x over [0, T], the following holds for each interval [t_1, t_2] ⊂ [0, T]:

    Osc(z, [t_1, t_2]) ≤ κ Osc(x, [t_1, t_2]),   (5.14)

    Osc(y, [t_1, t_2]) ≤ κ Osc(x, [t_1, t_2]).   (5.15)
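For intuition, the oscillation functional appearing in (5.14) and (5.15) has an obvious discrete analogue on sampled paths; the helper below measures it in the maximum norm (the largest coordinatewise range) and is only an illustration of the quantity being bounded, not part of the proof.

    def osc(path, t1, t2):
        # Oscillation of a sampled vector path over the index window [t1, t2],
        # measured in the maximum norm: the largest coordinatewise range.
        window = path[t1:t2 + 1]
        dim = len(window[0])
        return max(
            max(p[i] for p in window) - min(p[i] for p in window)
            for i in range(dim)
        )

    # Example: the first coordinate moves by 3.0, the second by 1.0.
    print(osc([(0.0, 0.0), (2.0, 1.0), (3.0, 0.5)], 0, 2))  # prints 3.0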

A sequence of stochastic processes {X^n} in C([0, ∞), ℝ^K) is said to be relatively compact if for every sequence {n_k} there is a subsequence {n_{k_j}} such that {X^{n_{k_j}}} converges in distribution.

Lemma 5.10 Under the hypotheses of Theorem 5.7, the sequence {(Z^n, X^n, Y^n)} is relatively compact.

Proof of Lemma 5.10. First of all, since the {X^n} are independent Brownian motions with the same drift θ and the same covariance matrix Γ, it is straightforward to see that X^n converges in distribution to a Brownian motion with drift θ and covariance matrix Γ. To prove this lemma, it is necessary and sufficient to verify the conditions given in Corollary 7.4 in Chapter 3 of Ethier and Kurtz (1986).

By Lemma 5.9, there exist constants κ > 0 and n_0 > 0 such that for all 0 ≤ t_1 < t_2 and n ≥ n_0,

    Osc(Z^n, [t_1, t_2]) ≤ κ Osc(X^n, [t_1, t_2]),   (5.16)

    Osc(Y^n, [t_1, t_2]) ≤ κ Osc(X^n, [t_1, t_2]),   (5.17)

which imply that P^n-a.s., for any t > 0 and n ≥ n_0,

    |Y^n(t)| ≤ κ Osc(X^n, [0, t]) ≤ 2κ sup_{0≤s≤t} |X^n(s)|.

This, combined with the relative compactness of X^n, implies the compact containment condition (a) of Corollary 3.7.4 in Ethier and Kurtz (1986). Next, for T > 0 and δ > 0, we define the modulus of continuity of a path x(·) by

    w(x(·), δ, T) = inf_{ {t_i} } max_i Osc(x(·), [t_{i−1}, t_i]),   (5.18)

where the infimum extends over the finite sets {t_i} of points satisfying 0 = t_0 < t_1 < ... < t_r = T and t_j − t_{j−1} > δ for j = 1, ..., r. Because {X^n} is relatively compact, for each η > 0 and T > 0, there exists δ > 0 such that

    limsup_{n→∞} P{ w(X^n, δ, T) > η/(κ + 1) } ≤ η/(κ + 1).

From (5.16) and (5.17), for n ≥ n_0,

    w(Y^n, δ, T) ≤ κ w(X^n, δ, T),

    w(Z^n, δ, T) ≤ κ w(X^n, δ, T).

Thus, for n ≥ n_0,

    limsup_{n→∞} P{ w(Y^n, δ, T) > η }
        ≤ limsup_{n→∞} P{ κ w(X^n, δ, T) > η }
        ≤ limsup_{n→∞} P{ w(X^n, δ, T) > η/(κ + 1) }
        ≤ η.

Therefore, condition (b) of Corollary 3.7.4 holds for Y^n. Similarly, it also holds for Z^n. This completes the proof. □

Proof of Theorem 5.7. By Lemma 5.10, the sequence {(Z^n, X^n, Y^n)} is relatively compact. Let (Z, X, Y*) be a weak limit defined on a probability space (Ω, ℱ, P). That is, there exists

a sequence {n_k} such that as n_k → ∞,

    (Z^{n_k}, X^{n_k}, Y^{n_k}) ⇒ (Z, X, Y*).

By the Skorohod representation theorem (Theorem 3.1.8 in Ethier and Kurtz (1986)), we can assume that {(Z^{n_k}, X^{n_k}, Y^{n_k})} and (Z, X, Y*) are defined on the same probability space (Ω, ℱ, P) such that

    (Z^{n_k}, X^{n_k}, Y^{n_k}) → (Z, X, Y*)   u.o.c., as n_k → ∞.   (5.19)

Note that Z^{n_k} is an SRBM and (Z^{n_k}, X^{n_k}, Y^{n_k}) satisfies all the conditions in Definition 5.1.

Let us define ℱ_t = σ{(Z(s), X(s), Y*(s)), 0 ≤ s ≤ t}. It is easy to check that (Z, X, Y*) satisfies (1) and (3)-(5) in Definition 5.1. From the proof of Lemma 5.10, we know that X is a K-dimensional Brownian motion with drift θ and covariance matrix Γ starting from 0. For any positive integer m, 0 ≤ s_1 < ... < s_m ≤ s < t < ∞, f_1, ..., f_m ∈ C_b(ℝ^K) and u = s or t, by the uniform integrability of {X^{n_k}} (they are independent and identically distributed processes) and the assumed weak convergence,

    E[(X^{n_k}(u) − θu) f_1(Z^{n_k}(s_1)) ... f_m(Z^{n_k}(s_m))] → E[(X(u) − θu) f_1(Z(s_1)) ... f_m(Z(s_m))]

as k → ∞. By the martingale property of X^{n_k}, the expectation involving X^{n_k} above has the same value for u = s and u = t, and so the same is true for the expectation involving X. The latter property implies that {X(t) − θt, ℱ_t, t ≥ 0} is a martingale. Now we shall prove (6) in Definition 5.1, i.e., Y*_i(·) can increase only at times t when Z_i(t) = 0, for i = 1, ..., K. Because Y_i^{n_k}(·) can increase only at times t when Z_i^{n_k}(t) = 0, i = 1, ..., K, we have for each T > 0,

    ∫_0^T (Z_i^{n_k}(t) ∧ 1) dY_i^{n_k}(t) = 0.

Following from Lemma 2.4 in Dai and Williams (1995), we have

    ∫_0^T (Z_i(t) ∧ 1) dY*_i(t) = 0

for i = 1, ..., K. This implies that, P-a.s., Y*_i, i = 1, ..., K, can increase only at times t such that Z_i(t) = 0. Similarly, since Y_{i+K}^{n_k}(·) can increase only at times t when Z_i^{n_k}(t) = b_i^{n_k}, i = 1, ..., K, we have

    ∫_0^T ((b_i^{n_k} − Z_i^{n_k}(t)) ∧ 1) dY_{i+K}^{n_k}(t) = 0.

Following from the oscillation result (5.16) and the fact that b^{n_k} → ∞ as k → ∞, we have

    0 = lim_{k→∞} ∫_0^T ((b_i^{n_k} − Z_i^{n_k}(t)) ∧ 1) dY_{i+K}^{n_k}(t) = ∫_0^T dY*_{i+K}(t),

which implies that Y*_{i+K}(t) = 0, P-a.s., for i = 1, ..., K. Let Y = (Y*_1, ..., Y*_K). Then we have

    ℱ_t = σ{(Z(s), X(s), Y*(s)), 0 ≤ s ≤ t}
        = σ{(Z(s), X(s), Y(s)), 0 ≤ s ≤ t}.

Because (Z^{n_k}, X^{n_k}, Y^{n_k}) satisfies equation (5.1) for every n_k ≥ 0, we have

    Z(t) = X(t) + RY(t).

Thus, (Z, X, Y) satisfies all the properties in Definition 5.1 with ℱ_t defined above, and in particular, Z is an SRBM associated with data (S, θ, Γ, R) starting from 0. Because R is completely-S, this SRBM is unique in distribution, and thus we obtain

    Z^n ⇒ Z,   as n → ∞.

□

The following conjecture is used to prove the convergence theorem for the stationary distributions.

Conjecture 5.11 {π^n, n ≥ 1} is relatively compact.

Theorem 5.12 Assume the hypotheses of Theorem 5.7, that Conjecture 5.11 holds, and that there exists a stationary distribution π for the SRBM Z. Let π^n be the stationary distribution of the SRBM Z^n. Then

    π^n ⇒ π,   as n → ∞.

Proof of Theorem 5.12. By Conjecture 5.11, there is a subsequence {π^{n_j}} of {π^n} converging weakly to some limit π̄. Then for each bounded Borel function f, t > 0, and compact M ⊂ S,

    | ∫_S E_x[f(Z_t)] π̄(dx) − ∫_S f(x) π̄(dx) |

        ≤ lim_{n_j→∞} | ∫_S E_x[f(Z_t)] π̄(dx) − ∫_S E_x[f(Z_t)] π^{n_j}(dx) |

          + lim_{n_j→∞} | ∫_S E_x[f(Z_t)] π^{n_j}(dx) − ∫_S E_x[f(Z_t^{n_j})] π^{n_j}(dx) |

          + lim_{n_j→∞} | ∫_S f(x) π̄(dx) − ∫_S f(x) π^{n_j}(dx) |

        ≤ lim_{n_j→∞} ∫_M | E_x[f(Z_t)] − E_x[f(Z_t^{n_j})] | π^{n_j}(dx)

          + lim_{n_j→∞} ∫_{S \ M} | E_x[f(Z_t)] − E_x[f(Z_t^{n_j})] | π^{n_j}(dx)

        ≤ 2 sup_x |f(x)| · lim_{n_j→∞} π^{n_j}(S \ M).

Here we have used the stationarity of π^{n_j}, that is, ∫_S E_x[f(Z_t^{n_j})] π^{n_j}(dx) = ∫_S f(x) π^{n_j}(dx). The second inequality follows from the compactness of {π^n} (which yields π^{n_j} ⇒ π̄), and the last inequality follows from Theorem 5.7, the boundedness of f, and the compactness of the set M. Since {π^n} is relatively compact, it follows from Prohorov's theorem (Theorem 3.2.2 of Ethier and Kurtz (1986)) that for every ε > 0 there is a compact M ⊂ S such that π^n(S \ M) < ε for all n. Consequently, we have

    | ∫_S E_x[f(Z_t)] π̄(dx) − ∫_S f(x) π̄(dx) | = 0.

This implies that π̄ is a stationary distribution of the SRBM Z. Following from the uniqueness of the stationary distribution (Dai and Harrison (1991)), we conclude that π̄ = π and hence π^n ⇒ π. □

5.5 A Job Shop Example

We now consider a simple job shop queueing network model. The job shop consists of 3 machines, numbered from 1 to 3. There are four types of jobs, each having different interarrival times, a different deterministic machine visitation sequence, and different machine service times. All the service times and interarrival times are assumed to be independently distributed. Since each job type at each machine has a different service process, we number those service processes from 1 to 10 as illustrated in Figure 5.1. The interarrival times of type 1 and 2 jobs are Gamma distributed with squared coefficient of variation (SCV) equal to 2, and the interarrival times of type 3 and 4 jobs are exponentially distributed (SCV = 1). The service times at machines 1 and 2 are Gamma distributed with SCV equal to 2, and the service times at machine 3 are Erlang distributed (SCV = 0.25). The travel times between machines are assumed to be negligible. The visitation sequences for the four types of jobs are illustrated in Figure 5.1. We consider several scenarios that have different arrival rates, denoted by α_j, j = 1, ..., 4, and different mean service times, denoted by m_i, i = 1, ..., 10. These data are listed in Table 5.1. Following the approach in Chapter 3, an SRBM approximation can be constructed to approximate the workload process. We assume this job shop has infinite buffers at every machine, so the SRBM is defined in an orthant state space. Next, using the BNA/FM algorithm proposed in this chapter, we can compute its stationary distribution together with the stationary means of the SRBM. The stationary means are then used to estimate the long-run average queue waiting time at every machine. The SRBM rows in Table 5.2 list the estimates of the average queue waiting time at each machine. For comparison, we also simulate every case using Arena 3.0, and the corresponding estimates are given in the Simulation rows. From Table 5.2, one can see that the BNA/FM estimates are fairly accurate compared to the simulation results. Next, we use Case 1 as an example to show how we choose the truncation boundary in each iteration. Table 5.3 lists the average queue waiting times calculated by our algorithm when different upper bounds are used. As can be seen from the table, when the upper bound increases, the difference in average waiting times between two consecutive iterations decreases. Table 5.4 shows the difference between the traffic intensities of the SRBM approximations and the actual values. Table 5.5 lists the workload loss rates. Again, we can see that when the size of the bounds increases, the values in Table 5.4 and Table 5.5 decrease to 0.
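Since all interarrival and service times in this example are specified only through a mean and an SCV, and for a Gamma family the SCV equals the reciprocal of the shape parameter, the distributional assumptions can be reproduced with a short sampler like the one below. This is a hedged sketch for illustration; the reported simulation estimates were obtained with Arena 3.0, not with this code, and the mean used in the example call is illustrative.

    import numpy as np

    def sample_gamma_times(mean, scv, size, rng=None):
        # Gamma times with a given mean and squared coefficient of variation:
        # shape = 1 / scv and scale = mean * scv, so that
        # mean = shape * scale and variance = shape * scale**2 = scv * mean**2.
        # scv = 2    -> Gamma(shape 0.5)  (type 1, 2 interarrivals; machines 1, 2)
        # scv = 1    -> exponential       (type 3, 4 interarrivals)
        # scv = 0.25 -> Erlang-4          (machine 3 service times)
        rng = np.random.default_rng() if rng is None else rng
        return rng.gamma(1.0 / scv, mean * scv, size)

    # Example: SCV = 2 times, as used for type 1 and 2 job interarrivals.
    rng = np.random.default_rng(0)
    times = sample_gamma_times(mean=2.0, scv=2.0, size=100000, rng=rng)
    # np.mean(times) is close to 2.0 and np.var(times) / np.mean(times)**2 to 2.0.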

In this example, we choose ε = 10^{-3}, since in most cases we compare our results with simulation estimates, which are not expected to have very high accuracy; we feel this error bound is sufficient for performance analysis. To further increase the accuracy, we would need to increase the bound size and refine the mesh, and it would therefore take much more computational time to obtain the results. In all three tables, the bound values of No. 4 satisfy the inequalities (5.13), and they are used as the starting truncation points when performing the computation. Bounds No. 1 to 3 in these tables are included to show the reader that a smaller upper bound produces bigger errors. In the actual computation, we stop at No. 6 because, starting from No. 5, for i = 1, 2, 3, we have

    |1 − δ_i − ρ_i| < 10^{-3}.

In No. 7 and 8, even though we increase the upper bound b_1 further, the average queue waiting times vary very little. Now, we again use Case 1 and select the upper bound of No. 6 to calculate the marginal stationary distribution of the waiting time at every machine. The results are plotted in Figures 5.2 to 5.4 together with the simulation estimates. From those figures, it can be seen that the SRBM estimates are fairly close to the simulation estimates.

    Case   α_1    α_2    α_3    α_4    m_1   m_2   m_3   m_4   m_5   m_6   m_7   m_8   m_9   m_10
    1      0.25   0.5    0.5    0.5    0.4   0.8   0.8   0.8   0.6   0.2   0.8   0.2   0.4   0.6
    2      0.25   0.25   0.5    0.5    0.4   0.8   0.8   0.8   0.6   0.2   0.8   0.2   0.4   0.6

TABLE 5.1. Means of inter-arrival times and service times of the job shop

FIGURE 5.1. Queueing Network Model of the Job Shop Facility (service processes 1, 7, 2, 8 belong to type 1 jobs; 3, 4, 9 to type 2; 5, 10 to type 3; and 6 to type 4)

    Case   Method       Machine 1   Machine 2   Machine 3
    1      SRBM         12.406      5.260       2.460
           Simulation   12.631      4.845       2.482
    2      SRBM         3.221       1.685       1.342
           Simulation   3.146       1.519       1.328

TABLE 5.2. Average waiting time at each machine at the job shop

    No.          b_1   b_2   b_3   Machine 1   Machine 2   Machine 3
    1            30    16    7     9.218       4.287       1.945
    2            40    20    10    10.666      4.785       2.240
    3            50    25    15    11.534      5.046       2.406
    4            60    30    15    12.017      5.177       2.427
    5            70    35    20    12.276      5.235       2.458
    6            80    40    25    12.406      5.260       2.460
    7            90    40    25    12.464      5.261       2.460
    8            100   40    25    12.491      5.262       2.460
    Simulation                     12.631      4.845       2.482

TABLE 5.3. The iterations of the SRBM approximations for average waiting times for the job shop example

    No.   b_1   b_2   b_3   Machine 1   Machine 2   Machine 3
    1     30    16    7     1.597e-2    1.752e-2    2.504e-2
    2     40    20    10    6.555e-3    6.784e-3    9.011e-3
    3     50    25    15    2.716e-3    2.646e-3    2.874e-3
    4     60    30    15    1.103e-3    8.889e-4    1.466e-3
    5     70    35    20    4.545e-4    3.725e-4    5.674e-4
    6     80    40    25    1.832e-4    1.365e-4    3.083e-4
    7     90    40    25    9.935e-5    9.935e-5    2.692e-4
    8     100   40    25    6.723e-5    8.502e-5    2.557e-4

TABLE 5.4. The iterations of the SRBM approximation errors for traffic intensities for the job shop example

    No.   b_1   b_2   b_3   Machine 1   Machine 2   Machine 3
    1     30    16    7     9.270e-3    1.340e-2    9.839e-3
    2     40    20    10    4.065e-3    4.976e-3    3.019e-3
    3     50    25    15    1.790e-3    1.850e-3    4.914e-4
    4     60    30    15    7.818e-4    6.403e-4    5.504e-4
    5     70    35    20    3.445e-4    2.183e-4    2.114e-4
    6     80    40    25    1.474e-4    6.993e-5    1.730e-4
    7     90    40    25    6.347e-5    7.006e-5    1.805e-4
    8     100   40    25    3.138e-5    7.000e-5    1.849e-4

TABLE 5.5. The iterations of the SRBM approximation for job loss rates for the job shop example

FIGURE 5.2. The marginal stationary distribution of the waiting time of machine 1 in the job shop (SRBM vs. simulation)

FIGURE 5.3. The marginal stationary distribution of the waiting time of machine 2 in the job shop (SRBM vs. simulation)

FIGURE 5.4. The marginal stationary distribution of the waiting time of machine 3 in the job shop (SRBM vs. simulation)

References

[1] Ashcraft, Cleve and Roger Grimes. (1999). SPOOLES: an object-oriented sparse matrix library. Proceedings of the 1999 SIAM Conference on Parallel Processing for Scientific Computing, March 22-27, 1999.

[2] Ashcraft, Cleve (1999). Ordering sparse matrices and transforming front trees. Working paper.

[3] Bardhan, I. and S. Mithal. (1993). Heavy traffic limits for an open network of finite buffer overflow queues: the single class case. Preprint.

[4] Baskett, Forest, K. Mani Chandy, Richard R. Muntz, and Fernando G. Palacios. (1975). Open, closed, and mixed networks of queues with different classes of customers. Journal of the Association for Computing Machinery, 22, 2, 248-260.

[5] Becker, Eric B. and Graham F. Carey and J. Tinsley Oden. (1981). Finite Elements: An Introduction. Prentice Hall, Englewood Cliffs, NJ.

[6] Bernard, A. and A. El Kharroubi. (1991). Regulation de processus dans le premier orthant de R^n. Stochastics and Stochastics Reports, 34, 149-167.

[7] Bertsekas, D. and R. Gallager. (1992). Data Networks, Prentice-Hall, New Jersey.

[8] Billingsley, P. (1968). Convergence of Probability Measures. Wiley, New York.

[9] Bitran, Gabriel R. and Devanath Tirupati. (1988). Multiproduct queueing networks with deterministic routing: decomposition approach and the notion of interference. Management Science, 34, 1, 75-100.

[10] Botvich, D. D. and A. A. Zamyatin. (1992). Ergodicity of conservative communication networks. Rapport de recherche, 1772, INRIA.

[11] Bramson, M. (1994). Instability of FIFO queueing networks. Annals of Applied Probability, 4, 414-431.

[12] Bramson, M. (1998). State space collapse with application to heavy traffic limits for multiclass queueing networks. Queueing Systems, 30, 89-148.

[13] Bramson, M. and J. G. Dai. (1999). Heavy traffic limits for some queueing networks. Preprint.

[14] Buzacott, J. and J. G. Shanthikumar. (1992). Design of manufacturing systems using queueing models. Queueing Systems, 12, 135-213.

[15] Canuto, C., M. Y. Hussaini, A. Quarteroni and T. A. Zang. (1988). Spectral methods in fluid dynamics. Springer-Verlag, Berlin.

[16] Carey, Graham F. and J. Tinsley Oden. (1981). Finite Elements: A Second Course. Prentice Hall, Englewood Cliffs, NJ.

[17] Chen, H. and A. Mandelbaum. (1994). "Hierarchical modelling of stochastic networks, Part II: strong approximations", in D.D. Yao (ed.), Stochastic Modeling and Analysis of Manufacturing Systems, 107-131, Springer-Verlag.

[18] Chen, H. (1995). Fluid approximations and stability of multiclass queueing networks: work-conserving disciplines. Annals of Applied Probability, 5, 636-665.

[19] Chen, H. (1996). A sufficient condition for the positive recurrence of a semimartingale reflecting Brownian motion in an orthant, Annals of Applied Probability, 6, 758-765.

[20] Chen, H. and Ward Whitt. (1993). Diffusion approximations for open queueing networks with service interruptions. Queueing Systems, Theory and Applications, 13, 335-359.

[21] Chen, H. and D. D. Yao. (2001). Fundamentals of Queueing Networks, Springer-Verlag.

[22] Chen, H. and H. Ye. (1999). Existence condition for the diffusion approximations of multiclass priority queueing networks. Preprint.

[23] Chen, H. and H. Zhang. (1996). Diffusion approximations for re-entrant lines with a first-buffer-first-served priority discipline. Queueing Systems, Theory and Applications, 23, 177-195.

[24] Chen, H. and H. Zhang. (1998). Diffusion approximations for Kumar-Seidman network under a priority service discipline. Operations Research Letters, 23, 171-181.

[25] Cottle, R. W., J. S. Pang and R. E. Stone. (1992). The Linear Complementarity Problem, Academic Press.

[26] Csorgo, M., P. Deheuvels and L. Horvath (1985). An approximation of stopped sums with applications in queueing theory. Advances in Applied Probability, 19, 674-690.

[27] Csorgo, M. and L. Horvath (1993). Weighted Approximations in Probability and Statistics. Wiley, New York.

[28] Dai, J. G. (1990). Steady-state analysis of reflected Brownian motions: characterization, numerical methods and queueing applications. PhD Thesis, Stanford University.

[29] Dai, J. G. (1995). On positive Harris recurrence of multiclass queueing networks: a unified approach via fluid models. Annals of Applied Probability, 5, 49-77.

[30] Dai, J. G. and W. Dai. (1998). A heavy traffic limit theorem for a class of open queueing networks with finite buffers. Queueing Systems, 32, 5-40.

[31] Dai, J. G. and J. M. Harrison. (1991). Steady-state analysis of RBM in a rectangle: numerical methods and a queueing application. Annals of Applied Probability, 1, 16-35.

[32] Dai, J. G. and J. M. Harrison. (1992). Reflected Brownian motion in an orthant: numerical methods for steady-state analysis. Annals of Applied Probability, 2, 65-86.

[33] Dai, J. G. and T. G. Kurtz. (1997). Characterization of the stationary distribution for a semimartingale reflecting Brownian motion in a convex polyhedron. Preprint.

[34] Dai, J. G., V. Nguyen, and M. I. Reiman. (1994). Sequential bottleneck decompositions: An approximation method for generalized Jackson networks. Operations Research, 42, 119-136.

[35] Dai, J. G. and Y. Wang. (1993). Nonexistence of Brownian models for certain multiclass queueing networks. Queueing Systems, Theory and Applications, 13, 41-46.

[36] Dai, J. G., D. H. Yeh and C. Zhou. (1997). The QNET method for re-entrant queueing networks with priority disciplines. Operations Research, 45, 610-623.

[37] Dai, J. G. and R. J. Williams. (1995). Existence and uniqueness of semimartingale reflecting Brownian motions in convex polyhedrons. Theory of Probability and its Applications, 40, 1-40.

[38] Dai, W. (1996). Brownian approximations for queueing networks with finite buffers: modeling, heavy traffic analysis and numerical implementations. Ph.D. Dissertation, Georgia Institute of Technology.

[39] Dongarra, J., A. Lumsdaine, R. Pozo and K. Remington. (1994). A sparse matrix library in C++ for high performance architectures. Proceedings of the Second Object Oriented Numerics Conference, 214-218.

[40] Dongarra, J. A. (1990). A Set of Level 3 Basic Linear Algebra Subprograms. ACM Transactions on Mathematical Software, Vol. 16, 1-17.

[41] Dupuis, P. and R. J. Williams. (1994). Lyapunov functions for semimartingale reflecting Brownian motions. Annals of Probability, 22, 680-702.

[42] Ethier, S. N. and T. G. Kurtz. (1986). Markov Processes: Characterization and Convergence, Wiley, New York.

[43] Fendick, K. W., V. R. Saksena, and Ward Whitt. (1989). Dependence in packet queues, IEEE Transactions on Communications, 37, 1173-1183.

[44] Glynn, P. W. (1990). Diffusion approximations, in D.P. Heyman and M.J. Sobel (eds.), Handbooks in Operations Research and Management Science, II: Stochastic Models, 145-198. North-Holland, Amsterdam.

[45] Glynn, P. W. and Ward Whitt. (1991a). A new view of the heavy-traffic limit theorem for infinite-server queues. Advances in Applied Probability, 23, 188-209.

[46] Glynn, P. W. and Ward Whitt. (1991b). Departures from many queues in series. Annals of Applied Probability, 1, 546-572.

[47] Harrison, J. M. (1985). Brownian Motion and Stochastic Flow Systems. Wiley, New York.

[48] Harrison, J. M., H. Landau and L. A. Shepp. (1985). The stationary distribution of reflected Brownian motion in a planar region. Annals of Probability, 13, 744-757.

[49] Harrison, J. M. and V. Nguyen. (1990). The QNET method for two-moment analysis of open queueing networks. Queueing Systems: Theory and Applications, 6, 1-32.

[50] Harrison, J. M. and V. Nguyen. (1993). Brownian models of multiclass queueing networks: current status and open problems. Queueing Systems: Theory and Applications, 13, 5-40.

[51] Harrison, J. M. and M. T. Pich. (1996). Two-moment analysis of open queueing networks with general workstation capabilities. Operations Research, 44, 936-950.

[52] Harrison, J. M. and L.M. Wein. (1990). Scheduling networks of queues: heavy traffic analysis of a two-station closed network. Operations Research, 38, 1052-1064.

[53] Harrison, J. M. and R.J. Williams. (1987). Multidimensional reflected Brownian motions having exponential stationary distributions. Annals of Probability, 15, 115-137.

[54] Harrison, J. M. and R. J. Williams. (1992). Brownian models of feedforward queueing networks: quasireversibility and product form solutions. Annals of Applied Probability, 2, 263-293.

[55] Horvath, L. (1990). Strong approximations of open queueing networks, Mathematics of Operations Research, 17, 487-508.

[56] Jackson, J. R. (1963). Job shop-like queueing systems. Management Science, 10, 131-142.

[57] Kelly, F. P. (1979). Reversibility and Stochastic Networks, Wiley, New York.

[58] Kleinrock, Leonard. (1976). Queueing Systems II: Computer Applications, Wiley, New York.

[59] Kumar, P. R. and T. I. Seidman. (1990). Dynamic instabilities and stabilization methods in distributed real-time scheduling of manufacturing systems. IEEE Transactions on Automatic Control, 35, 289-298.

[60] Laws, C. N. (1992). Resource pooling in queueing networks with dynamic routing. Advances in Applied Probability, 24, 699-726.

[61] Lemoine, A. J. (1978). Networks of queues - a survey of weak convergence results. Management Science, 24, 1175-1193.

[62] Mandelbaum, A. (1989). The dynamic complementarity problem, Preprint.

[63] Mandelbaum, A. and W. A. Massey. (1995). Strong approximation for time-dependent queues, Mathematics of Operations Research, 20, 33-64.

[64] Mandelbaum, A., W. A. Massey and M. Reiman. (1998). Strong approximation for Markov service networks, Queueing systems, Theory and Applications, 30, 149-201.

[65] Mandelbaum, A. and G. Pats. (1998). Stochastic networks I: approximations and applications with continuous diffusion limits. Annals of Applied Probability, 8, 569-646.

[66] Maron, M. J. (1987). Numerical Analysis: A Practical Approach (Second Edition), Macmillan Publishing Company, New York.

[67] Oden J. T. and J. N. Reddy. (1976). An introduction to the mathematical theory of finite elements. A Wiley-Interscience Publication.

[68] Peterson, W. P. (1991). A heavy traffic limit theorem for networks of queues with multiple customer types, Mathematics of Operations Research, 16, 90-118.

[69] Reiman, M. I. (1984). Open queueing networks in heavy traffic. Mathematics of Operations Research, 9, 441-458.

[70] Reiman, M. I. (1988). A multiclass feedback queue in heavy traffic. Advances in Applied Probability. 20, 179-207.

[71] Reiman, M. I. and R. J. Williams. (1988). A boundary property of semimartingale reflecting Brownian motions. Probability Theory and Related Fields, 77, 87-97 and 80 (1989), 633.

[72] Rybko, A. N. and Stolyar, A. L. (1992). Ergodicity of stochastic processes describing the operation of open queueing networks. Problemy Peredachi Informatsii, 28, 2-26.

[73] Schwerer, E. (1997). A Linear Programming Approach to the Steady-state Analysis of Markov Processes. Ph.D. Dissertation, Stanford University.

[74] Segal, Moshe and Ward Whitt. (1989). A queueing network analyzer for manufacturing, in Teletraffic Science for New Cost-effective Systems ITC 12, Ed. M. Bonatti, 1146-1152, North-Holland, Amsterdam.

[75] Shanthikumar, J. G. and J. A. Buzacott. (1981). Open queueing network models of dynamic job shops. Int. J. Prod. Res., 19, 255-266.

[76] Taylor, L. M. and R. J. Williams. (1993). Existence and uniqueness of semimartingale reflecting Brownian motions in an orthant. Probability Theory and Related Fields, 96, 283-317.

[77] Trefethen, L. and R. J. Williams. (1985). Conformal mapping solution of Laplace's equation on a polygon with oblique derivative boundary conditions. J. Comput. Appl. Math., 14, 227-249.

[78] Walrand, J. (1988). An Introduction to Queueing Networks, Prentice Hall, Englewood Cliffs, New Jersey.

[79] Wein, L. M. (1990). Scheduling networks of queues: heavy traffic analysis of a two-station network with controllable inputs. Operations Research, 38, 1065-1078.

[80] Whitt, Ward. (1974). Heavy traffic theorems for queues: a survey, in A.B. Clarke (ed.), Mathematical Methods in Queueing Theory, 307-350, Springer Verlag.

[81] Whitt, Ward. (1980). Some useful functions for functional limit theorems, Mathematics of Operations Research, 5, 67-85.

[82] Whitt, Ward. (1983). The queueing network analyzer. Bell System Technical Journal, 62, 9, 2779-2815.

[83] Whitt, Ward. (1994). Towards better multi-class parametric decomposition approximations for open queueing networks. Annals of Operations Research, 48, 221-248.

[84] Williams, R. J. (1996). On the approximation of queueing networks in heavy traffic, in Stochastic Networks: Theory and Applications, F.P. Kelly, S. Zachary, and I. Ziedins (eds.), Oxford University Press, Oxford, 35-56.

[85] Williams, R. J. (1998). Diffusion approximations for open multiclass queueing networks: sufficient conditions involving state space collapse. Queueing Systems, Theory and Applications, 30, 27-88.

[86] Yao, D. D. (1994). Stochastic Modeling and Analysis of Manufacturing Systems. Edited by David Yao. Springer-Verlag, New York.

[87] Zhang, H. (1997). Strong approximations for irreducible closed queueing networks. Advances in Applied Probability, 29, 498-522.

[88] Zhang, H., G. Hsu and R. Wang. (1990). Strong approximations for multiple channel queues in heavy traffic, Journal of Applied Probability, 28, 658-670.