Response Time Distribution of a Class of Limited Queues∗

Miklos Telek, Department of Telecommunication, Technical University of Budapest, MTA-BME Information Systems Research Group, Hungary, [email protected]

Benny Van Houdt, Department of Math and Computer Science, University of Antwerp - imec, Belgium, [email protected]

ABSTRACT

Processor sharing queues are often used to study the performance of time-sharing systems. In such systems the total service rate µ(m) depends on the number of jobs m present in the system, and a limit, called the multi-programming level (MPL), is implemented on the number of jobs k that can be served simultaneously. Prior work showed that under highly variable job sizes, setting the MPL k beyond the value k* = arg max_m µ(m) may reduce the mean response time.

In order to study the impact of the MPL k on the response time distribution, we analyse the MAP/PH/LPS-k(m) queue. In such a queue jobs arrive according to a Markovian arrival process (MAP), have phase-type (PH) distributed sizes, at most k jobs are processed in parallel, and the total service rate depends on the number of jobs being served. Jobs that arrive when there are k or more jobs present are queued.

We derive an expression for the Laplace transform of the response time distribution and numerically invert it to study the impact of the MPL k. Numerical results illustrate to what extent increasing k beyond k* increases the quantiles and tail probabilities of the response time distribution. They further demonstrate that for bursty arrivals and larger MPL values k, more variable job sizes may reduce the mean response time.

*This work was supported by the FWO grant G024514N and the OTKA 123914 project.

IFIP WG 7.3 Performance 2017. Nov. 14-16, 2017, New York, NY, USA. Copyright is held by author/owner(s).

1. INTRODUCTION

Processing jobs with highly variable job sizes in first-come-first-served (FCFS) order is clearly suboptimal, as short jobs get stuck behind long jobs. Therefore many systems employ some form of time sharing such that short jobs experience much better delay characteristics, which often results in an overall gain in the mean system response time. In such systems the efficiency tends to improve as the number of jobs processed in parallel increases, up to some point after which the efficiency declines as too many jobs are competing for the available resources [13]. In a queueing theoretic setting such a system corresponds to a processor sharing (PS) queue where the overall processing rate of the server depends on the number of jobs present. If we denote by µ(m) the processing rate when m jobs are served in parallel, then µ(m) is called the service curve; this curve tends to be unimodal and attains a maximum at some value k*, i.e., µ(k*) ≥ µ(m) for any m. As it is desirable that the system operates at high efficiency, a common approach is to limit the number of jobs processed in parallel to some k, where k is called the multi-programming level (MPL), and setting k = k* is a very natural choice. Whenever the number of jobs n in the system exceeds k, n − k of the jobs queue to receive service. Thus, the system operates as a limited processor sharing queue in which the service rate depends on the number of jobs in service.

As indicated in [5], when the job sizes are highly variable the mean response times can be further reduced by picking an MPL k that exceeds k* (especially if the load is not too high). The intuition is that although setting k > k* reduces the service rate when there are many jobs present, short jobs tend to have an easier time passing long jobs, which outweighs the reduced service rate (if the load is not too high). However, service disciplines that minimize the mean response time, like shortest job next, also tend to cause some form of starvation for the long jobs. Thus it would be interesting to see how the MPL k affects not only the mean response time, but the response time distribution. In fact it can be anticipated that increasing k beyond k* may increase the tail probabilities, as long jobs have a more difficult time completing.

The main objective of this paper is to develop an efficient numerical method to compute the response time distribution in a limited processor sharing queue and to investigate how it is affected by the MPL k. To this end we derive an expression for the Laplace transform (LT) of the response time distribution of the MAP/PH/LPS-k(m) queue, where jobs arrive according to a Markovian arrival process (MAP) of order n_a, jobs have an order n_s phase-type (PH) distributed size, at most k jobs are processed in parallel, and the service rate depends on the number of jobs being served. Jobs that arrive when there are k or more jobs present are queued (in an infinite buffer), and queued jobs are taken in FCFS order from the queue whenever a job completes service. To obtain the response time distribution we numerically invert the LT, which requires us to numerically evaluate the LT at various values of s. For this purpose we propose two approaches: a Kronecker and a spectral expansion approach. At first glance both approaches may appear problematic, as they rely on matrix computations where the size of the matrices involved is O(n_a n_s^k). However, as we are mostly interested in highly variable job sizes, we focus on hyper-exponential distributions with n_s = 2 phases only (as in [5]). Further, it is easy (see [5]) to reduce these O(n_a 2^k) size matrices to matrices of size O(n_a k), which allows fast numerical inversion, as indicated by the various numerical examples (for n_a = 1, n_s = 2 and k = 15, computing the probability P[R ≤ t] that the sojourn time is less than or equal to t for a given t requires less than 10 seconds on a regular laptop).

The main insights provided by the numerical examples are that while increasing the MPL k beyond k* reduces the mean response times in case of highly variable job sizes, it does significantly increase the tail probabilities. More surprisingly, when the arrivals occur in bursts and the MPL k is large (e.g., k = 15), more variable job sizes can result in significantly lower mean response times (while the tail probabilities still increase with increasing job size variability).

The paper is structured as follows. The queueing model analyzed in this paper is described in Section 2, while Section 3 reviews some of the related work in this area. Section 4 focuses on how to compute the queue length distribution. The analysis of the response time distribution is presented in Section 5 and is the main technical contribution of the paper. Section 6 discusses various numerical examples and conclusions are drawn in Section 7.

2. MODEL AND NOTATIONS

To model general service time distributions and job arrival patterns, we assume that customers arrive according to a Markovian arrival process (MAP) and that their job lengths are phase-type (PH) distributed. PH distributions and MAPs are distributions and point processes with a modulating finite state background Markov chain [7]. At the expense of increasing the size of the background Markov chain, any general distribution and point process can be approximated arbitrarily closely by PH distributions and MAPs, respectively. MAPs can describe point processes with possibly dependent inter-arrival times and include the special case of renewal processes with PH distributed inter-arrival times. Various fitting tools are available online for PH distributions, as well as some tools to approximate MAPs (e.g., [6]).

We consider an infinite buffer service node, where jobs arrive according to a MAP(D_0, D_1) with mean arrival rate λ = δ D_1 1I, where 1I is the column vector of ones of appropriate size and δ is the stationary distribution of the MAP, satisfying δ(D_0 + D_1) = 0 and δ 1I = 1. The service time distribution is PH(α, A). The completion rate vector of PH(α, A) is denoted by a = −A 1I. The size of the arrival MAP is n_a and the size of the service PH is n_s.

The server serves at most k jobs in parallel, such that the service speed depends on the number of jobs served in parallel. If m (m ≤ k) jobs are served in parallel, the service speed of each of those jobs is µ_m = µ(m)/m, and during such a period the service time is PH(α, µ_m A). We only consider MPL values k for which the stability condition

    λ α(−A)^{−1} 1I < k µ_k

holds.

We are interested in the response time distribution of this system. In order to compute it, we first compute the stationary distribution of the number of customers, the service phases and the phase of the arrival process, based on which we evaluate the distribution observed by a tagged customer arriving to the system. This part of the analysis, presented in Section 4, is a straightforward application of matrix analytic methods [10, 7]. The challenge is to determine the response time distribution given the initial distribution of a tagged customer in an efficient manner: this is by far the main technical contribution of the paper and is presented in Section 5. The response time distribution is obtained in the Laplace transform domain, from which a numerical inverse transformation provides the time domain result.

3. RELATED WORK

The closest related work is [5], which considers exactly the same queueing system but focuses on the mean response time only. As the MPL that minimizes the mean response time depends to a large extent on the arrival rate λ, the authors also propose two dynamic schemes that adjust the MPL to minimize the mean response time. In [12] a numerical scheme is proposed to study the response time distribution of a limited processor sharing queue, but the system is a closed queueing system and the job durations are exponential. While that analysis can be adapted to an open queueing network with Poisson input, relaxing the exponential job durations appears problematic.

Limited processor sharing (LPS) queues (where the overall service rate does not depend on the number of jobs being processed) have been studied by a variety of authors. In [9] the impact of the MPL k on the sojourn time tail asymptotics is studied and a robust setting for k is proposed that achieves good tail asymptotics for both heavy and light tailed job size distributions. A method to assess the mean response time in an LPS queue (with Poisson arrivals and PH service) via matrix analytic methods was proposed in [14] and is very similar in nature to Section 4. A closed form approximation for the mean sojourn time in an LPS queue with general service times was proposed in [2]. Fluid, diffusion and heavy traffic approximations for LPS queues have also been developed in [17, 15, 16]. Further, a monotonicity result for the G/GI/LPS-k queue was presented in [11], which shows that the queue length distribution monotonically increases (decreases) in the stochastic ordering sense as a function of the MPL k for service time distributions with an increasing (decreasing) hazard rate.

4. STATIONARY ANALYSIS OF THE NUMBER OF CUSTOMERS

Let N(t) be the number of customers in the system at time t, let Z(t) be the phase of the MAP characterizing the arrivals at time t, and let {Y_1(t), Y_2(t), ..., Y_min(N(t),k)(t)} represent the service phases of the customers under service at time t. Then X(t) = (N(t), {Z(t), Y_1(t), Y_2(t), ..., Y_min(N(t),k)(t)}) is a continuous time Markov process. We say that N(t) is the "level" process and {Z(t), Y_1(t), ..., Y_min(N(t),k)(t)} is the "phase" process.

Denote the stationary probabilities of level m by π_m, i.e., π_m is the row vector containing the stationary probabilities of (N(t), {Z(t), Y_1(t), ..., Y_min(m,k)(t)}) with N(t) = m.

Define the Kronecker sum for vectors as ⊕_{i=1}^m a = Σ_{i=1}^m (⊗_{j=1}^{i−1} I) ⊗ a ⊗ (⊗_{ℓ=i+1}^m I); then we have the following result.

Theorem 1. Let L_m = D_0 ⊕ (⊕_{i=1}^m µ_m A) and F_m = D_1 ⊗ α ⊗ (⊗_{i=1}^m I), for m = 0, ..., k−1. Let B_m = I ⊗ (⊕_{i=1}^m µ_m a), for m = 1, ..., k, and let L = L_k, F = D_1 ⊗ (⊗_{i=1}^k I), B = I ⊗ (⊕_{i=1}^k µ_k aα) and B_{k+1} = B. Denote by R the minimal non-negative solution of the quadratic matrix equation

    0 = F + R L + R² B,  (1)

let R_k = R and define

    R_m = −F_m (L_{m+1} + R_{m+1} B_{m+2})^{−1},  (2)

for m = 0, ..., k−1. Then

    π_m = π_0 Π_{j=0}^{m−1} R_j,   π_{k+j} = π_k R^j,

for m = 0, ..., k and j ≥ 0, where π_0 is the unique vector such that π_0 (L_0 + R_0 B_1) = 0 and Σ_i π_i 1I = 1.

Proof. Due to the fact that N(t) can change by at most one in a single state transition, X(t) is a Quasi-Birth-Death (QBD) Markov process, where the first k+1 levels (0, 1, ..., k) of this QBD are irregular and level dependent. The generator of the QBD is given by

    ( L_0  F_0                                  )
    ( B_1  L_1  F_1                             )
    (       ⋱    ⋱      ⋱                     )
    (      B_{k−1}  L_{k−1}  F_{k−1}            )
    (               B_k      L       F          )
    (                        B       L    F     )
    (                                ⋱   ⋱  ⋱ )

The result follows from a standard application of matrix analytic methods [10, 7].
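As an aside, the minimal non-negative solution R of a quadratic matrix equation such as (1) can be obtained with the classical fixed-point iteration R ← −(F + R²B)L^{−1}, started from R = 0. The sketch below is purely illustrative (a toy scalar QBD rather than the blocks of Theorem 1; in practice faster algorithms such as logarithmic reduction or Newton iteration are used):

```python
import numpy as np

def qbd_R(F, L, B, tol=1e-12, max_iter=10**5):
    """Minimal non-negative solution of 0 = F + R L + R^2 B,
    computed via the natural fixed point R <- -(F + R^2 B) L^{-1}."""
    R = np.zeros_like(F)
    L_inv = np.linalg.inv(L)
    for _ in range(max_iter):
        R_new = -(F + R @ R @ B) @ L_inv
        if np.max(np.abs(R_new - R)) < tol:
            return R_new
        R = R_new
    raise RuntimeError("no convergence")

# Scalar sanity check: an M/M/1 queue with arrival rate 1 and service
# rate 2, viewed as a QBD, has R = rho = 0.5 (the minimal root of
# 1 - 3R + 2R^2 = 0, the other root being 1).
F = np.array([[1.0]]); L = np.array([[-3.0]]); B = np.array([[2.0]])
print(qbd_R(F, L, B))  # ~0.5
```

Starting from the zero matrix, the iterates increase monotonically to the minimal non-negative solution, which is why this simple scheme is a safe (if slow) default.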
The matrices R_m are of size n_a n_s^m × n_a n_s^{m+1} and R is a square matrix of size n_a n_s^k. While this may appear problematic for computational purposes, we can use the common method of reducing the size of the matrices by simply keeping track of the number of servers that are currently in each service phase (at the expense of complicating the expressions for the blocks of the generator matrix); see Section 6 for more details. When n_s = 2, as in our numerical experiments, this makes the block sizes linear in k (as opposed to exponential).

Theorem 2. After an arrival of a stationary customer the system state distribution is

    π̂_{m+1} = (1/λ) π_m F_m,

where F_m = F for m ≥ k.

Proof. The probability of being in phase u of level m+1 just after an arrival epoch is proportional to (π_m F_m)_u, and the result then follows by a standard Markov chain argument [3]. It suffices to check that

    Σ_{m=0}^∞ π̂_{m+1} 1I = Σ_{m=0}^∞ (1/λ) π_m F_m 1I = Σ_{m=0}^∞ (1/λ) π_m (D_1 1I ⊗ (⊗_{i=1}^{min(m,k)} 1I)) = (1/λ) δ D_1 1I = 1.

5. RESPONSE TIME DISTRIBUTION

The Laplace transform of the response time, R, is denoted by r(s) = IE(e^{−sR}). To compute it we introduce the following quantities.

Let entry (u,v) of the matrix W(s,i,j) be the Laplace transform of the probability that i customers are served and j arrive in an interval of length t in which the phase process starts in phase u and ends in phase v, while service is performed by k servers during the entire interval. More precisely, W(s,i,j) = ∫_t e^{−st} W̄(t,i,j) dt and

    W̄(t,i,j)_{u,v} = P(N_ser(t) = i, N_arr(t) = j, N(τ) ≥ k ∀τ < t, {Z(t),Y_1(t),...,Y_k(t)} = v | {Z(0),Y_1(0),...,Y_k(0)} = u),

where N_arr(t) and N_ser(t) denote the number of arrivals and service completions in (0,t), respectively. As the elements of W̄(t,i,j) are probabilities, for Re(s) > 0 we have

    ∫_0^∞ |e^{−st} W̄(t,i,j)_{u,v}| dt ≤ ∫_0^∞ |e^{−st}| dt = ∫_0^∞ e^{−Re(s)t} dt < ∞,

and consequently W(s,i,j)_{u,v} is analytic for Re(s) > 0 (as an LT is analytic in its region of absolute convergence). In the subsequent analysis we present Laplace domain expressions for matrices of probabilities; these expressions are obtained as solutions of linear equations in explicit (a matrix inverse) or implicit (the solution of a Sylvester equation) form. In either case, without further notice, the solution is finite and analytic for Re(s) > 0 by the same reasoning as for W(s,i,j)_{u,v}.

Define entry u of the column vector S_i(s) as the Laplace transform of the remaining service time R_rem of a tagged customer when there are i customers in the system, the phase process is in phase u and the tagged customer is in the first server. That is,

    S_i(s)_u = IE(e^{−s R_rem} | N(0) = i, tagged in server 1, {Z(0),Y_1(0),...,Y_min(i,k)(0)} = u).

Theorem 3. The Laplace transform of the response time, r(s), can be expressed as

    r(s) = Σ_{m=0}^{k−1} (1/λ) π_m F_m S_{m+1}(s) + Σ_{j=0}^∞ Σ_{i=0}^∞ (1/λ) π_k R^i F W(s,i,j) B^0 S_{k+j}(s),  (3)

where B^0 = (I ⊗ ⊕_{i=1}^k µ_k a)(I ⊗ α ⊗ (⊗_{ℓ=2}^k I)).

Proof. We rely on Theorem 2 and distinguish between two cases: the tagged customer either gets to a server directly, or it is buffered upon its arrival. In the first case the matrix F_m ensures that the newly arrived customer is in server one and its service time is described by S_{m+1}(s). In the second case, (1/λ) π_k R^i F describes the probability that there are k+i customers in front of the tagged one, and Σ_{j=0}^∞ W(s,i,j) describes all arrival patterns until i customers have departed and the tagged customer starts service. The matrix B^0 ensures that the tagged customer gets to server one, from which point S_{k+j}(s) describes the response time.

Theorem 4. The matrices W(s,i,j) satisfy the following relations:

    L(s) ≜ W(s,0,0) = (sI − L)^{−1},  (4)
    W(s,i,0) = W(s,i−1,0) B L(s) = L(s)(B L(s))^i,  (5)
    W(s,0,j) = W(s,0,j−1) F L(s) = L(s)(F L(s))^j,  (6)
    W(s,i,j) = (W(s,i,j−1) F + W(s,i−1,j) B) L(s).  (7)

Proof. (4) describes the case without any arrivals or service completions; in this case only the phase can change, according to the matrix L. (5) represents the case where there are i−1 service completions and no arrivals in (0,τ) with τ < t, a service completion at time τ according to the matrix B, and no arrivals or service completions in (τ,t). (6) is similar to (5), and (7) considers the cases where the last event is an arrival or a service completion.

5.1 Analysis of S_i(s)

To compute S_i(s) we need matrices that distinguish between a service completion of the tagged customer and that of another customer. To this end we define

    B̂ = I ⊗ µ_k aα ⊗ (⊗_{ℓ=2}^k I),   B̂_m = I ⊗ µ_m a ⊗ (⊗_{ℓ=2}^m I),

for m ≤ k, for the service completion of the tagged customer (residing in server one), while

    B̄ = I ⊗ I ⊗ (⊕_{ℓ=2}^k µ_k aα),   B̄_m = I ⊗ I ⊗ (⊕_{ℓ=2}^m µ_m a),

for m ≤ k, correspond to another customer completing service. Note that B̂ + B̄ = B and B̂_m + B̄_m = B_m. Furthermore, define L_m(s) ≜ (sI − L_m)^{−1} for m < k and F^0_m = D_1 ⊗ (⊗_{i=1}^m I) ⊗ α, which is the same as F_m except that it puts the incoming customer in server m+1 (instead of server one).

Theorem 5. S_i(s) satisfies the following recurrence relations:

    S_1(s) = L_1(s)(B̂_1 1I + F^0_1 S_2(s)),  (8)
    S_m(s) = L_m(s)(B̂_m 1I + B̄_m S_{m−1}(s) + F^0_m S_{m+1}(s)),  (9)

for 1 < m < k,

    S_k(s) = L(s)(B̂_k 1I + B̄_k S_{k−1}(s) + F S_{k+1}(s)),  (10)
    S_{k+j}(s) = L(s)(B̂ 1I + B̄ S_{k+j−1}(s) + F S_{k+j+1}(s)),  (11)

for j ≥ 1.

Proof. We detail only (9), as the other cases can be derived similarly. The response time associated with S_m(s) for 1 < m < k can be computed assuming that there are only phase transitions according to the matrix L_m up to some time τ (τ < t), at which one of three possible events occurs:
• the service completion of the tagged customer, according to the vector B̂_m 1I;
• the service completion of a non-tagged customer, according to the matrix B̄_m, from which the remaining response time is S_{m−1}(s);
• an arrival, captured by F^0_m (which keeps the tagged customer in server one and places the incoming one in server m+1), from which the remaining response time is S_{m+1}(s).

To avoid the infinite recurrence relation in (11) we compute S_{k+1}(s) explicitly, based on V(s,i,h). Entry (u,v) of V(s,i,h) is the Laplace transform of the probability that i non-tagged customers are served and h customers arrive in an interval of length t that starts in phase u with k+1 customers and ends in phase v, while there are at least k+1 customers in the system during the entire interval of length t.

There are two differences between V(s,i,h) and W(s,i,h). The first is that W(s,i,h) considers the service of all customers, while V(s,i,h) considers the service of the non-tagged customers only and assumes that the tagged customer is not served in (0,t). The second (and more intricate) difference is that W(s,i,h) does not put any restrictions on the order of the i service completions and h arrivals, while for V(s,i,h) the number of served customers may never exceed the number of arrived customers, ensuring that there are at least k+1 customers in the system during the entire interval.

Theorem 6. For V(s,i,h) we have

    V(s,0,0) = (sI − L)^{−1} = L(s),  (12)
    V(s,0,h) = V(s,0,h−1) F L(s) = (L(s)F)^h L(s),  (13)
    V(s,i,i) = V(s,i−1,i) B̄ L(s),  (14)
    V(s,i,h) = (V(s,i,h−1) F + V(s,i−1,h) B̄) L(s),  (15)

for 0 < i < h.

Proof. The reasoning is similar to the proof of Theorem 4 and we only emphasize the differences here. The matrix B̄ corresponds to the service completion of a non-tagged customer, and (14) ensures that a state in which the number of served and arrived customers is identical is reachable only from a state in which the number of arrivals equals the number of service completions plus one. This ensures that there can never be fewer than k+1 customers in the system.

Theorem 7. For Re(s) ≥ 0, S_{k+1}(s) can be computed as

    S_{k+1}(s) = V(s,0)(I − R̄(s))^{−1} B̂ 1I + V(s,0) B̄ S_k(s),  (16)

where V(s,0) = (sI − L − R̄(s)B̄)^{−1} and R̄(s) is the minimal non-negative solution of

    s R̄(s) = F + R̄(s) L + R̄²(s) B̄.  (17)

Proof. According to the definition of V(s,i,h) we have

    S_{k+1}(s) = Σ_{m=0}^∞ Σ_{ℓ=0}^∞ V(s,ℓ,ℓ+m) B̂ 1I + Σ_{ℓ=0}^∞ V(s,ℓ,ℓ) B̄ S_k(s),  (18)

where the first term describes the case when the tagged customer is served before the queue length decreases to k, and the second term the case when the tagged customer is not served while the queue length is larger than k; from the point when the queue length reduces to k its remaining service time is described by S_k(s).

To avoid infinite summations we introduce

    V(s,i) = Σ_{ℓ=0}^∞ V(s,ℓ,i+ℓ),

for which the following recurrence relation holds for i > 0, based on (15):

    V(s,i) = V(s,i−1) F L(s) + V(s,i+1) B̄ L(s).

Substituting L(s) = (sI − L)^{−1} gives

    s V(s,i) = V(s,i−1) F + V(s,i) L + V(s,i+1) B̄,

which indicates that V(s,i) follows a matrix geometric sequence,

    V(s,i) = V(s,i−1) R̄(s) = V(s,0) R̄^i(s),

where R̄(s) is the solution of (17). For Re(s) ≥ 0 the spectral radius of R̄(s) is less than 1, since (F + L + B̄) 1I = −B̂ 1I ≤ 0, which implies that (I − R̄(s))^{−1} exists. For V(s,0) we have, due to (14),

    V(s,0) = V(s,0,0) + Σ_{ℓ=1}^∞ V(s,ℓ,ℓ) = L(s) + Σ_{ℓ=1}^∞ V(s,ℓ−1,ℓ) B̄ L(s) = L(s) + V(s,1) B̄ L(s),

from which V(s,0) can be computed as

    V(s,0) = L(s) + V(s,0) R̄(s) B̄ L(s) = (sI − L − R̄(s) B̄)^{−1}.

Finally, from (18) we have

    S_{k+1}(s) = Σ_{m=0}^∞ V(s,0) R̄^m(s) B̂ 1I + V(s,0) B̄ S_k(s) = V(s,0)(I − R̄(s))^{−1} B̂ 1I + V(s,0) B̄ S_k(s),  (19)

which completes the theorem.

Thanks to (16), (10) can be written as

    S_k(s) = L(s)(B̂_k 1I + B̄_k S_{k−1}(s)) + L(s) F (V(s,0)(I − R̄(s))^{−1} B̂ 1I + V(s,0) B̄ S_k(s)).  (20)

(20) together with (8) and (9) forms a set of linear equations for S_m(s) (m = 1, ..., k) for any fixed value of s.

Theorem 8. For j > 0, S_{k+j}(s) can be computed as

    S_{k+j}(s) = Σ_{ℓ=0}^{j−1} (V(s,0)B̄)^ℓ V(s,0)(I − R̄(s))^{−1} B̂ 1I + (V(s,0)B̄)^j S_k(s).  (21)

Proof. The recurrence relation in (16) remains valid for higher numbers of customers as well, that is,

    S_{k+j}(s) = V(s,0)(I − R̄(s))^{−1} B̂ 1I + V(s,0) B̄ S_{k+j−1}(s).  (22)

Successive substitution of this recurrence relation for k+1, ..., k+j yields (21).

5.2 Computing the response time distribution

According to (3) and (21), the Laplace transform of the response time can be written as

    r(s) = r_1(s) + r_2(s) + r_3(s),

with

    r_1(s) = Σ_{m=0}^{k−1} (1/λ) π_m F_m S_{m+1}(s),
    r_2(s) = Σ_{j=0}^∞ Σ_{i=0}^∞ (1/λ) π_k R^i F W(s,i,j) B^0 (V(s,0)B̄)^j S_k(s),
    r_3(s) = Σ_{j=1}^∞ Σ_{i=0}^∞ (1/λ) π_k R^i F W(s,i,j) B^0 Σ_{ℓ=0}^{j−1} (V(s,0)B̄)^ℓ V(s,0)(I − R̄(s))^{−1} B̂ 1I.

Using Theorem 1 we can compute π_m (m = 1, ..., k), and from the linear system formed by (8), (9) and (20) we can retrieve S_m(s) (m = 1, ..., k) for any fixed value of s. Based on these, we can compute the first term, r_1(s). We now focus on the analysis of r_2(s) and r_3(s), using two different computational approaches.

5.2.1 Computing the response time by Kronecker expansion

The first approach to compute r_2(s) and r_3(s) consists of relying on a Kronecker expansion. While this solution is easy to implement, the size of the matrices involved is the square of the size of the blocks characterizing the generator of the Markov chain. This implies that the run time complexity when n_s = 2 grows as k^6. The spectral approach presented in the next section avoids the need to work with such large matrices and has a time complexity that is cubic in k when n_s = 2 (see Section 6 for more details on the run time complexity).
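The Kronecker-expansion approach repeatedly solves Stein/Sylvester-type matrix equations of the form X = C + AXB. A minimal dense-solver sketch via vectorization is given below (illustrative only: the small random matrices stand in for the actual model blocks, and at scale a cubic-time solver such as MATLAB's dlyap is preferable to forming the Kronecker product explicitly):

```python
import numpy as np

def solve_stein(A, B, C):
    """Solve X = C + A X B via vectorization:
    vec(A X B) = (B^T kron A) vec(X), with column-stacking vec,
    so vec(X) = (I - B^T kron A)^{-1} vec(C)."""
    n, m = C.shape
    K = np.eye(n * m) - np.kron(B.T, A)
    x = np.linalg.solve(K, C.flatten(order="F"))
    return x.reshape((n, m), order="F")

# Small random instance, scaled so the fixed point is well defined
# (spectral radius of B^T kron A below one).
rng = np.random.default_rng(0)
A = 0.3 * rng.random((3, 3))
B = 0.3 * rng.random((3, 3))
C = rng.random((3, 3))
X = solve_stein(A, B, C)
print(np.max(np.abs(X - (C + A @ X @ B))))  # residual near machine precision
```

The explicit Kronecker product turns an n × m matrix equation into an nm × nm linear system, which is exactly why the run time of this approach grows with the square of the block size, as noted above.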

Thanks to (16), (10) can be written as Theorem 9.   1  T  ˆ ¯ r2(s) = Sk(s) ⊗ πk N(s)vec(M(s, 0)) Sk(s) = L(s) Bk1I+ BkSk−1(s) λ   where ⊗ is Kronecker product, T denotes transpose, vec() is ¯ −1 ˆ ¯ + L(s)FV(s, 0) I − R(s) B1I+ BSk(s) . (20) the column stacking vector operator,  −1 (20) together with (8) and (9) form a set of linear equations U(s) = I − L(s)T BT ⊗ R L(s)T FT ⊗ I for Sm(s)(m = 1, . . . , k) for any fixed value of s. and M(s, 0) and N(s) are the solutions of the Sylvester equa- 1 Theorem 8. For j > 0, Sk+j(s) can be computed as tions

j−1 M(s, 0) = FL(s) + RM(s, 0)BL(s), X ` −1 Sk+j(s) = V(s, 0)B¯ V(s, 0) I − R¯ (s) Bˆ1I 1These equations are of the form AXB −X +C = 0 and can `=0 therefore be solved in cubic time using the dlyap MATLAB j + V(s, 0)B¯ Sk(s), (21) command.     N(s) = B0T ⊗ I + B¯ T V(s, 0)T ⊗ I N(s)U(s). where for N(s) we have

P∞ i N(s) = Proof. Let M(s, j) = i=0 R FW(s, i, j). For j = 0, ∞ we have j  0T  X ¯ T T   0T  j ∞ ∞ B ⊗ I + B V(s, 0) ⊗ I B ⊗ I U(s) X i X i i M(s, 0) = R FW(s, i, 0) = R FL(s)(BL(s)) j=1     i=0 i=0 = B0T ⊗ I + B¯ T V(s, 0)T ⊗ I N(s)U(s), = FL(s) + RM(s, 0)BL(s) which completes the proof. which is a Sylvester matrix equation for M(s, 0). For j ≥ 1, one finds ∞ Theorem 10. X i j+1 M(s,j) = FW(s, 0, j) + R FW(s, i, j) = (FL(s)) r3(s) = i=1 ∞ 1 ˆ T  −1 X i S(s) ⊗ πk N(s)U(s)(I − U(s)) vec(M(s, 0)), + R F (W(s, i, j − 1)F + W(s, i − 1, j)B) L(s) λ i=1 −1 ∞ where Sˆ(s) = V(s, 0) I − R¯ (s) Bˆ1I and N(s), U(s) and j+1 X i = (FL(s)) + R FW(s, i, j − 1) FL(s) M(s, 0) are defined in Theorem 9. i=1 −1 | {z } Proof. Using Sˆ(s) = V(s, 0)I − R¯ (s) Bˆ1I, we write M(s,j−1)−FW(s,0,j−1) ∞ ∞ ∞ X i−1 1 X X 0 ` + R R FW(s, i − 1, j) BL(s) r (s) = π M(s, j)B V(s, 0)B¯ Sˆ(s) 3 λ k i=1 `=0 j=`+1 | {z } M(s,j) ∞ 1 X  T T T ` 0T  = Sˆ(s) B¯ V(s, 0)  B ⊗ π · = M(s, j − 1)FL(s) + RM(s, j)BL(s), λ k `=0 ∞ whose closed form solution can be obtained using the vec X j operator, by recalling that vec(XYZ) = (ZT ⊗ X)vec(Y ) U(s) vec(M(s, 0)) j=`+1 vec(M(s, j)) = L(s)T FT ⊗ Ivec(M(s, j − 1)) ∞ 1 X  T T T ` 0T  = Sˆ(s) B¯ V(s, 0)  B ⊗ π · T T  λ k + L(s) B ⊗ R vec(M(s, j)). `=0 `+1 −1 Hence, U(s) (I − U(s)) vec(M(s, 0))

1  T  −1 vec(M(s, j)) = U(s)vec(M(s, j − 1)), = Sˆ(s) ⊗ πk N(s)U(s)(I − U(s)) vec(M(s, 0)), λ with  −1 U(s) = I − L(s)T BT ⊗ R L(s)T FT ⊗ I. 5.2.2 Computing the response time by spectral ex- Therefore, pansion We start with the next theorem which assumes that the j vec(M(s, j)) = U(s) vec(M(s, 0)). matrix V(s, 0)B¯ is diagonalizable. For all the numerical ex- periments performed in this paper, it turns out that these This implies that r (s) can be written in the following man- 2 matrices are indeed diagonalizable. As such all the numeri- ner by using the equality vec(XYZ) = (ZT ⊗X)vec(Y ) with cal results presented in this paper made use of Theorem 11 0 ¯j X = πk, Y = M(s, j) and Z = B V(s, 0)B Sk(s): as it avoids the need to perform a Kronecker expansion. ∞ X 1 0 ¯j r2(s) = πkM(s, j)B V(s, 0)B Sk(s) 11. When V(s, 0)B¯ is diagonalizable for a fixed λ Theorem j=0 PN s and its spectral decomposition is n=1 θnunvn, then r2(s) ∞ and r (s) can be computed as 1 X  T T T j 0T  3 = S (s) B¯ V(s, 0)  B ⊗ π vec(M(s, j)) λ k k j=0 N 1 X ˜ 0 1   r2(s) = πk M(s, θn)B unvnSk(s), = S (s)T ⊗ π · λ λ k k n=1 ∞ j X  T T   0T  j B¯ V(s, 0) ⊗ I B ⊗ I U(s) vec(M(s, 0)) and j=0 N ˜ ˜ | {z } 1 X M(s, 1) − M(s, θn) 0 N(s) r3(s) = πk B unvn λ 1 − θn 1   n=1 T −1 = Sk(s) ⊗ πk N(s)vec(M(s, 0)), ¯  ˆ λ · V(s, 0) I − R(s) B1I, where M˜ (s, θ) is the solution of the Sylvester equation2 and

∞ j−1 N X 1 0 X X ` M˜ (s, θ) = FL(s) + θM˜ (s, θ)FL(s) + RM˜ (s, θ)BL(s). r3(s) = πkM(s, j)B θ unvn· λ n (23) j=1 `=0 n=1 V(s, 0)I − R¯ (s)−1Bˆ1I Proof. Using M(s, j) defined in Theorem 9, r2(s) and N ∞ j r3(s) can be written as 1 X X 0 1 − θn = πk M(s, j)B unvn· λ 1 − θn n=1 j=1 ∞ X 1 0 j −1 r (s) = π M(s, j)B V(s, 0)B¯ S (s), V(s, 0)I − R¯ (s) Bˆ1I 2 λ k k j=0 N ˜ ˜ 1 X M(s, 1) − M(s, θn) 0 ∞ = π B u v · X 1 0 k n n r (s) = π M(s, j)B · λ 1 − θn 3 λ k n=1 j=1 V(s, 0)I − R¯ (s)−1Bˆ1I. j−1 X ` −1 V(s, 0)B¯ V(s, 0)I − R¯ (s) Bˆ1I, `=0 If V(s, 0)B¯ is non-diagonalizable, then its spectral de- where M(s, j) was shown to satisfy composition contains at least one non-trivial Jordan block. The following theorem discusses the case when the spectral ¯ M(s, 0) = FL(s) + RM(s, 0)BL(s), decomposition of V(s, 0)B is composed by a single Jordan block of maximal size. M(s, j) = M(s, j − 1)FL(s) + RM(s, j)BL(s), Theorem 12. If for a fixed s the spectral decomposition for j ≥ 1. Multiplying the last equation with θj and sum- of V(s, 0)B¯ is ming over j gives θ 1  θ 1 ∞ ∞ −1   X j X j V(s, 0)B¯ = Θ   Θ, θ M(s, j) = θ M(s, j − 1)FL(s)  .. ..   . . j=1 j=1 θ ∞ X j + R θ M(s, j)BL(s). then j=1 1 r (s) = π Φ(s)ΘS (s), 2 λ k k ∞ For M˜ (s, θ) = P θj M(s, j) we therefore have 1 j=0 r (s) = π Ψ(s)ΘV(s, 0)I − R¯ (s)−1Bˆ1I, 3 λ k M˜ (s, θ) − M(s, 0) = θM˜ (s, θ)FL(s) where the `th column of Φ(s) and Ψ(s) can be computed as + RM˜ (s, θ)BL(s) − RM(s, 0)BL(s), `−1 ˜ (n) X M (s, θ) 0 −1 [Φ(s)] = [B Θ ] , ` n! `−n from which, for any s and θ, M˜ (s, θ) is the solution of the n=0 `−1 ˜ Sylvester equation X M(s, 1) 0 −1 [Ψ(s)] = [B Θ ] ` (1 − θ)n+1 `−n n=0 M˜ (s, θ) = FL(s) + θM˜ (s, θ)FL(s) + RM˜ (s, θ)BL(s). `−1 n ˜ (n−u) X X 1 M (s, θ) 0 −1 (24) − [B Θ ] , (1 − θ)u+1 (n − u)! `−n n=0 u=0 If V(s, 0)B¯ is diagonalizable and its spectral decomposition such that M˜ (0)(s, θ) = M˜ (s, θ) and M˜ (n)(s, θ) (n ≥ 1) is the is PN θ u v then n=1 n n n solution of the Sylvester equation3

For the given spectral decomposition of V(s, 0)B̄, we have

\[
r_2(s) = \frac{1}{\lambda}\sum_{j=0}^{\infty}\pi_k M(s,j)B^0 \sum_{n=1}^{N}\theta_n^j u_n v_n S_k(s)
       = \frac{1}{\lambda}\sum_{n=1}^{N}\pi_k \tilde{M}(s,\theta_n)B^0 u_n v_n S_k(s). \tag{24}
\]

The derivatives \(\tilde{M}^{(n)}(s,\theta)\) obey

\[
\tilde{M}^{(n)}(s,\theta) = n\tilde{M}^{(n-1)}(s,\theta)F_L(s) + \theta\tilde{M}^{(n)}(s,\theta)F_L(s) + R\tilde{M}^{(n)}(s,\theta)B_L(s). \tag{25}
\]

Proof. First we note that (25) is the n-th derivative of (23) with respect to θ. Equation (25) is a Sylvester equation³ for \(\tilde{M}^{(n)}(s,\theta)\) when \(\tilde{M}^{(n-1)}(s,\theta)\) is known. This means that, for a fixed θ, \(\tilde{M}^{(n)}(s,\theta)\) can be computed iteratively starting from n = 0.

²This equation is of the form AXB + XC = D and can be solved in cubic time using a Hessenberg decomposition of (B, C) and a Schur decomposition of A, see [4].
³This equation is of the same form as (23) and can therefore also be solved in cubic time.

When the spectral decomposition of V(s, 0)B̄ consists of a single Jordan block with eigenvalue θ, we have

\[
r_2(s) = \frac{1}{\lambda}\sum_{j=0}^{\infty}\pi_k M(s,j)B^0\Theta^{-1}
\begin{pmatrix}\theta & 1 & & \\ & \theta & \ddots & \\ & & \ddots & 1 \\ & & & \theta\end{pmatrix}^{\!j}\Theta S_k(s)
= \frac{1}{\lambda}\pi_k \Phi(s)\Theta S_k(s),
\]

where

\[
\Phi(s) = \sum_{j=0}^{\infty} M(s,j)B^0\Theta^{-1}
\begin{pmatrix}\theta^j & \binom{j}{1}\theta^{j-1} & \cdots & \binom{j}{N}\theta^{j-N}\\
& \theta^j & \binom{j}{1}\theta^{j-1} & \cdots \\
& & \ddots & \ddots \end{pmatrix},
\]

with \(\binom{j}{i} = 0\) for \(i > j\). The first column of Φ(s), [Φ(s)]₁, can be computed as

\[
[\Phi(s)]_1 = \sum_{j=0}^{\infty} M(s,j)[B^0\Theta^{-1}]_1\,\theta^j = \tilde{M}(s,\theta)[B^0\Theta^{-1}]_1.
\]

For the second column we have

\[
[\Phi(s)]_2 = \sum_{j=1}^{\infty} M(s,j)[B^0\Theta^{-1}]_1\, j\theta^{j-1}
            + \sum_{j=0}^{\infty} M(s,j)[B^0\Theta^{-1}]_2\,\theta^j
            = \tilde{M}^{(1)}(s,\theta)[B^0\Theta^{-1}]_1 + \tilde{M}(s,\theta)[B^0\Theta^{-1}]_2,
\]

and the last column is

\[
[\Phi(s)]_N = \sum_{n=0}^{N-1}\left(\sum_{j=n}^{\infty} M(s,j)[B^0\Theta^{-1}]_{N-n}\binom{j}{n}\theta^{j-n}\right)
            = \sum_{n=0}^{N-1}\frac{\tilde{M}^{(n)}(s,\theta)}{n!}[B^0\Theta^{-1}]_{N-n}.
\]

Similarly,

\[
r_3(s) = \frac{1}{\lambda}\pi_k\Psi(s)\Theta V(s,0)\left(I - \bar{R}(s)\right)^{-1}\hat{B}_1\mathbb{1},
\]

where

\[
\Psi(s) = \sum_{j=1}^{\infty} M(s,j)B^0\Theta^{-1}\sum_{\ell=0}^{j-1}
\begin{pmatrix}\theta^{\ell} & \binom{\ell}{1}\theta^{\ell-1} & \cdots & \binom{\ell}{N}\theta^{\ell-N}\\
& \theta^{\ell} & \binom{\ell}{1}\theta^{\ell-1} & \cdots \\
& & \ddots & \ddots\end{pmatrix}
= \sum_{j=1}^{\infty} M(s,j)B^0\Theta^{-1}
\begin{pmatrix}\frac{1-\theta^j}{1-\theta} & \frac{d}{d\theta}\frac{1-\theta^j}{1-\theta} & \cdots & \frac{1}{N!}\frac{d^N}{d\theta^N}\frac{1-\theta^j}{1-\theta}\\
& \frac{1-\theta^j}{1-\theta} & \frac{d}{d\theta}\frac{1-\theta^j}{1-\theta} & \cdots\\
& & \ddots & \ddots\end{pmatrix}.
\]

The first column of Ψ(s) is

\[
[\Psi(s)]_1 = \sum_{j=0}^{\infty} M(s,j)[B^0\Theta^{-1}]_1\,\frac{1-\theta^j}{1-\theta}
            = \frac{\tilde{M}(s,1)-\tilde{M}(s,\theta)}{1-\theta}\,[B^0\Theta^{-1}]_1,
\]

and its N-th column is

\[
[\Psi(s)]_N = \sum_{n=0}^{N-1}\sum_{j=n}^{\infty} M(s,j)[B^0\Theta^{-1}]_{N-n}\,\frac{1}{n!}\frac{d^n}{d\theta^n}\frac{1-\theta^j}{1-\theta}
            = \sum_{n=0}^{N-1}\frac{1}{n!}\frac{d^n}{d\theta^n}\!\left[\frac{\tilde{M}(s,1)-\tilde{M}(s,\theta)}{1-\theta}\right][B^0\Theta^{-1}]_{N-n},
\]

where

\[
\frac{d^n}{d\theta^n}\frac{\tilde{M}(s,1)-\tilde{M}(s,\theta)}{1-\theta}
= \frac{n!\,\tilde{M}(s,1)}{(1-\theta)^{n+1}} - \sum_{u=0}^{n}\frac{n!}{(n-u)!}\,\frac{\tilde{M}^{(n-u)}(s,\theta)}{(1-\theta)^{u+1}}.
\]

The cases in which multiple, potentially non-trivial Jordan blocks appear in the spectral decomposition of V(s, 0)B̄ can be handled by combining these results and are omitted here.

6. NUMERICAL RESULTS
In this section we present numerical results obtained by numerically inverting the Laplace transform r(s) of the response time distribution. As we are mainly interested in systems where the workload consists of a mixture of long and short jobs, we use a hyper-exponential (HEXP) distribution with n_s = 2 phases to model the job size distribution. Under a 2-phase HEXP distribution a job has an exponentially distributed length with mean 1/γ₁ with probability p and an exponentially distributed length with mean 1/γ₂ with probability 1 − p. The parameters p, γ₁ and γ₂ are set by matching the mean job duration E[X], its squared coefficient of variation SCV (for any SCV ≥ 1) and a shape parameter. Unless otherwise stated, we set E[X] = 1,

\[
\gamma_1 = 1 + \sqrt{\frac{SCV-1}{SCV+1}}, \qquad
\gamma_2 = 1 - \sqrt{\frac{SCV-1}{SCV+1}}, \qquad
p = \frac{\gamma_1}{2},
\]

and

\[
\alpha = \begin{pmatrix} p & 1-p \end{pmatrix}, \qquad
A = \begin{pmatrix} -\gamma_1 & \\ & -\gamma_2 \end{pmatrix}.
\]

SCV    p        1/γ₁     1/γ₂
1      0.5000   1        1
2      0.7887   0.6340   2.3660
5      0.9082   0.5505   5.4495
10     0.9523   0.5251   10.4749
19     0.9743   0.5132   19.4868

Table 1: Parameter settings of the hyper-exponential job size distribution for various SCV values.

Table 1 lists the resulting parameters for SCV = 1, 2, 5, 10 and 19. For instance, when SCV = 10 about 95% of the jobs are type-1 jobs, which are typically short, while the remaining 5% of the jobs have a length that is on average 20 times longer. We use the same prototypical service rate curve µ(m) as in [5], which is depicted in Figure 1. Note that the service rate is maximized when k = 4 or 5.
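To make the parameterization concrete, the following short sketch (our own illustration in Python, not code from the paper; the helper name hexp_params is ours) reproduces the Table 1 entries and checks the matched moments:

```python
import math

def hexp_params(scv):
    """Balanced-means 2-phase hyper-exponential with E[X] = 1 and the given SCV.

    Returns (p, gamma1, gamma2): with probability p the job length is
    exponential with rate gamma1, otherwise exponential with rate gamma2.
    """
    r = math.sqrt((scv - 1.0) / (scv + 1.0))
    gamma1, gamma2 = 1.0 + r, 1.0 - r
    p = gamma1 / 2.0
    return p, gamma1, gamma2

p, g1, g2 = hexp_params(10.0)
mean = p / g1 + (1 - p) / g2                        # E[X], should be 1
ex2 = 2 * (p / g1**2 + (1 - p) / g2**2)             # E[X^2] of the mixture
scv = ex2 / mean**2 - 1.0                           # squared coef. of variation
print(round(p, 4), round(1 / g1, 4), round(1 / g2, 4))  # 0.9523 0.5251 10.4749
print(round(mean, 10), round(scv, 10))              # 1.0 10.0
```

The printed triple matches the SCV = 10 row of Table 1, confirming that this parameterization indeed fixes E[X] = 1 while hitting the target SCV.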

Figure 1: Prototypical service rate curve of [5].

Figure 2: Mean job response time as a function of the MPL k under Poisson arrivals with rate λ = 0.8.

To numerically invert the Laplace transform we make use of the Euler algorithm [1, Section 5]. To determine the probability P[R ≤ t] that the response time is at most t, this algorithm evaluates r(s)/s in a set of 2M points, where Re(s) = M ln(10)/(3t) > 0. For our numerical results we set M = 32 and all computations were performed in double precision using the MATLAB function euler_inversion, which can be downloaded from the MathWorks File Exchange server [8]. This function requires as input the function that evaluates r(s)/s for a specific value of s.
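For readers without MATLAB, the inversion step is short enough to sketch directly. The following Python stand-in for euler_inversion implements the Euler node/weight recipe of the Abate–Whitt framework as we understand it (a sketch, not the authors' code); since r(s) is not reproduced here, it is checked on a transform with a known inverse instead of on r(s)/s itself:

```python
import math

def euler_inversion(fhat, t, M=32):
    """Approximate f(t) from its Laplace transform fhat via Euler summation
    (Abate-Whitt style); evaluates fhat at 2M+1 points with
    Re(s) = M*ln(10)/(3t)."""
    # Euler-summation weights xi_0 .. xi_{2M}
    xi = [0.0] * (2 * M + 1)
    xi[0] = 0.5
    for k in range(1, M + 1):
        xi[k] = 1.0
    xi[2 * M] = 2.0 ** (-M)
    for k in range(1, M):
        xi[2 * M - k] = xi[2 * M - k + 1] + 2.0 ** (-M) * math.comb(M, k)
    # Alternating weighted sum of fhat along a vertical line in the s-plane
    a = M * math.log(10.0) / 3.0
    total = 0.0
    for k in range(2 * M + 1):
        s = (a + 1j * math.pi * k) / t
        total += (-1) ** k * xi[k] * fhat(s).real
    return 10.0 ** (M / 3.0) / t * total

# Sanity check on a known pair: fhat(s) = 1/(s+1) inverts to f(t) = exp(-t)
approx = euler_inversion(lambda s: 1.0 / (s + 1.0), 1.0)
print(abs(approx - math.exp(-1.0)) < 1e-4)  # True
```

With M = 32 the dominant error in double precision comes from roundoff amplified by the factor 10^{M/3}, which is consistent with the paper's choice of M and double-precision arithmetic.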

Figure 3: Mean job response time as a function of the MPL k under Poisson arrivals with rate λ = 0.9.

As stated in Section 4, the theorems presented in this paper make use of matrices of size n_s^k n_a, which allows us to specify the matrices in an elegant manner using Kronecker sums and products. Even if n_s = 2 this would result in matrices of size 2^k n_a. To reduce the computation times, we use the common approach of reducing the size of the matrices involved by keeping track of the number of jobs in each phase (instead of the service phase of each individual job), e.g., [14]. For n_s = 2 this would reduce the size of the matrices to (k+1)n_a (as there can be between 0 and k jobs in phase 1). However, looking at the analysis in Section 5, it should be noted that we always need to maintain the phase of the tagged job (located in server 1). This implies that we can only collapse the remaining k − 1 service phases, and the size of the matrices used in our computations therefore equals 2k n_a.

For all the numerical experiments presented in this section we were able to make use of Theorem 11, as the matrices V(s, 0)B̄ turned out to be diagonalizable for all the parameter settings considered. For Poisson input (λ = 0.8) and HEXP service, the computation time to obtain a single probability of the form P[R > t] was less than 1 second for k = 6, about 3.2 seconds for k = 10 and 7.5 seconds for k = 15 (on an Intel Core i7-3630QM 2.40 GHz laptop). These computation times are not very sensitive to the specific value of t or the SCV of the job sizes. For MAP input with n_a = 2 instead of Poisson input the computation times roughly increase by a factor 8 (as the size of the matrices doubles and the computation times are cubic in the size of the matrices involved). If we rely on Theorems 9 and 10 instead of Theorem 11, the computation times for k = 6, 10 and 15 increase to approximately 4, 40 and 440 seconds, which clearly demonstrates the effectiveness of Theorem 11 in reducing the computation times.

6.1 Poisson arrivals
The Poisson arrival process with rate λ is a special case of a MAP with n_a = 1, D_0 = −λ and D_1 = λ. Figures 2 and 3 depict the mean response time in case of Poisson job arrivals with rate λ = 0.8 and 0.9 for various choices of the service time SCV. These plots are similar to [5, Figure 3] and confirm that selecting a higher MPL k when the job sizes are highly variable is beneficial for the mean response time.

Figures 4 and 5 show the response time distribution for various choices of the MPL k for the setting with SCV = 19. These figures clearly illustrate that while the mean response time is minimized for an MPL value well beyond 5 (where the service rate curve is maximized), the tail of the response time distribution behaves very differently and is minimized by setting k = 4, the smallest k value for which µ(k) ≥ µ(m) for all m. This observation is in agreement with the one made in [9] that under light-tailed job size distributions, the tail asymptotics of a limited processor sharing queue (with a fixed service rate curve) tend to improve as the MPL k decreases. It also confirms our intuition that the mean response time improves up to some point as for larger k short jobs have an easier time passing long jobs, but this makes it harder for the long jobs to complete service, which worsens the tail behavior of the response time distribution.

Figure 4: Response time distribution for various MPL k values under Poisson arrivals with rate λ = 0.8 and SCV = 19.

Figure 5: Response time distribution for various MPL k values under Poisson arrivals with rate λ = 0.9 and SCV = 19.

Figure 6: Response time distribution of short jobs, i.e., type-1 jobs, for various MPL k values under Poisson arrivals with rate λ = 0.8 and SCV = 19.

Figure 7: Response time distribution of long jobs, i.e., type-2 jobs, for various MPL k values under Poisson arrivals with rate λ = 0.8 and SCV = 19.

This is further illustrated in Figures 6 and 7, where we depict the response time distribution conditioned on the initial service phase of a job. We say that jobs that started service in phase i, for i = 1, 2, are type-i jobs. In case of HEXP job lengths, this implies that the type-i jobs have a mean length of 1/γ_i; roughly speaking, the type-1 jobs correspond to short jobs (with a mean length of about 0.5) and the type-2 jobs are typically long (with a mean close to 20). To compute these distributions, denoted as R_short and R_long, we can rely on the results presented in Section 5, except that the vector α used in the matrices B_0 and F_0 needs to be replaced by e_i, the stochastic vector with entry i equal to 1. Ideally we would like to compute the response time distribution of a job given its size, but this distribution cannot be directly obtained from Section 5.

Figure 6 shows that the response time distribution of the short jobs tends to improve as k increases up to k = 12, the value that minimizes the mean response time. For larger t the smaller k values still perform better, but this is most likely due to the fact that not all type-1 jobs are truly short. Similarly, Figure 7 shows that increasing k is typically bad for the long jobs.

We end this section by depicting the 99th percentile of the response time distribution for various choices of the SCV in Figure 8. This percentile is very meaningful whenever the main concern of the system is to guarantee a good response time for the majority of the jobs, e.g., 99% of the jobs. Figure 8 indicates that, except for the SCV = 19 case, the 99th percentile is minimized by setting the MPL equal to 4 or 5. It indicates that the tail asymptotics kick in rather quickly and that using a larger MPL k might not give the desired result. The fact that a larger k reduces the 99th percentile of the response time for an SCV equal to 19 is not surprising, as in this case about 97.5% of the jobs are type-1 jobs (see Table 1) and these benefit from increased k values.

6.2 Phase-type renewal arrivals
In this section we investigate the impact of the arrival process by replacing the Poisson arrivals used in the previous section with a phase-type renewal process. This means that consecutive inter-arrival times are still independent, but the inter-arrival time distribution follows a phase-type distribution. Similar to the service time distribution, we assume an HEXP inter-arrival time distribution characterized by the mean arrival rate λ and its squared coefficient of variation SCV_a, that is

\[
n_a = 2, \qquad
D_0 = \lambda\begin{pmatrix} -\gamma_1 & \\ & -\gamma_2 \end{pmatrix}, \qquad
D_1 = \lambda\begin{pmatrix} p\gamma_1 & (1-p)\gamma_1 \\ p\gamma_2 & (1-p)\gamma_2 \end{pmatrix},
\]

where \(\gamma_{1,2} = 1 \pm \sqrt{(SCV_a - 1)/(SCV_a + 1)}\) and \(p = \gamma_1/2\).
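The MAP of such an HEXP renewal process is easy to assemble and sanity-check; a sketch assuming numpy (the helper name map_hexp_renewal is ours) that verifies that the stationary arrival rate of (D0, D1) equals λ:

```python
import numpy as np

def map_hexp_renewal(lam, scv_a):
    """(D0, D1) of a MAP whose arrivals form a renewal process with 2-phase
    hyper-exponential inter-arrival times (mean 1/lam, squared CoV scv_a)."""
    r = np.sqrt((scv_a - 1.0) / (scv_a + 1.0))
    g1, g2 = 1.0 + r, 1.0 - r
    p = g1 / 2.0
    D0 = lam * np.diag([-g1, -g2])
    D1 = lam * np.array([[p * g1, (1 - p) * g1],
                         [p * g2, (1 - p) * g2]])
    return D0, D1

D0, D1 = map_hexp_renewal(0.8, 10.0)
Q = D0 + D1                                # generator of the phase process
assert np.allclose(Q @ np.ones(2), 0.0)    # rows of a generator sum to 0
# Stationary phase vector theta: solve theta Q = 0 together with theta 1 = 1
A = np.vstack([Q.T[0], np.ones(2)])
theta = np.linalg.solve(A, np.array([0.0, 1.0]))
rate = theta @ D1 @ np.ones(2)             # stationary arrival rate theta D1 1
print(round(rate, 10))  # 0.8
```

The check confirms the construction: scaling the unit-mean HEXP rates by λ yields a MAP whose long-run arrival rate is exactly λ, independent of SCV_a.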

Figure 8: The 99th percentile of the response time distribution as a function of the MPL k under Poisson arrivals with λ = 0.8.

Figure 10: Response time distribution for k = 15 under PH renewal input with rate λ = 0.8 and SCV_a = 10 for job sizes with SCV = 1, 5 and 19.

Figure 9 depicts the mean response time as a function of k when the PH renewal process has a mean arrival rate λ = 0.8 and an SCV_a = 10. Note that in such a case the inter-arrival times are a mixture of many rather short inter-arrival times and less frequent long ones. The most notable result in this figure (which we also confirmed by simulation) is that when the MPL k is large, the mean response time reduces as the job sizes become more variable (i.e., the SCV increases). The intuition behind this rather unexpected result is as follows. In case of bursty arrivals there are some time intervals in which the queue receives many jobs in a short period of time. When these jobs are a mixture of long and short jobs and k is large, some of the jobs, being the short ones, can be cleared more quickly. This implies that the queue length can be reduced more quickly after such a burst of arrivals when the job sizes are more variable. If the queue length decreases more quickly, the total service capacity increases more rapidly after such a burst of arrivals, which is clearly beneficial for the response times. Another example that confirms this intuition is presented further on in case of MAP arrivals with correlated inter-arrival times. Figure 10 indicates that while the mean response times may improve with the variability of the job sizes when k is large, e.g., k = 15, the tail probabilities behave in line with our expectations: more variable job sizes give higher tail probabilities.

Figure 9: Mean job response time as a function of the MPL k under PH renewal input with rate λ = 0.8 and SCV_a = 10 for various SCV values for the job sizes.

Figure 11: Response time distribution for various MPL k values under PH renewal input with rate λ = 0.8 and SCV_a = 10 and job sizes with SCV = 19.

Figures 9 and 11 further illustrate that increasing the MPL k beyond k* also reduces the mean response time in case of PH renewal input, and that the tail probabilities are affected in a similar manner as in the Poisson case (that is, setting k = k* remains optimal for the tail).

6.3 Markov modulated Poisson arrivals
In this section we look at the impact of having correlated job inter-arrival times. For this purpose we rely on a 2-state Markov modulated Poisson process (MMPP). Such a process alternates between two states and while in state i it generates arrivals according to a Poisson process with rate λ_i, for i = 1, 2. The sojourn times in each state are exponential and we assume they have the same mean 1/φ. Such a process can be represented as a MAP in the following manner:

\[
n_a = 2, \qquad
D_0 = \begin{pmatrix} -\lambda_1 - \phi & \phi \\ \phi & -\lambda_2 - \phi \end{pmatrix}, \qquad
D_1 = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}.
\]

Figure 12 depicts the mean response time as a function of the MPL k for various SCV values of the job size in case of MMPP input with λ_1 = 0.7, λ_2 = 0.9 and 1/φ = 1000, while Figure 13 depicts the sojourn time distribution for the same MMPP when SCV = 19.
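The MMPP-to-MAP mapping above can be checked in a few lines; again a sketch assuming numpy (the helper name mmpp2 is ours), which builds (D0, D1) and confirms that the stationary arrival rate is the average of λ_1 and λ_2:

```python
import numpy as np

def mmpp2(lam1, lam2, phi):
    """MAP (D0, D1) of a 2-state MMPP: Poisson rate lam_i in state i,
    exponential sojourn times with mean 1/phi in both states."""
    D0 = np.array([[-lam1 - phi, phi],
                   [phi, -lam2 - phi]])
    D1 = np.diag([float(lam1), float(lam2)])
    return D0, D1

D0, D1 = mmpp2(0.7, 0.9, 1.0 / 1000)
Q = D0 + D1                     # generator of the modulating chain
theta = np.array([0.5, 0.5])    # equal mean sojourn times -> uniform stationary law
assert np.allclose(theta @ Q, 0.0)
rate = theta @ D1 @ np.ones(2)  # long-run arrival rate (lam1 + lam2) / 2
print(round(rate, 10))  # 0.8
```

Because both states have the same mean sojourn time 1/φ, the modulating chain spends half its time in each state, so the settings (0.7, 0.9) and (0.5, 1.1) used below share the same overall load while differing in burstiness.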

Figure 12: Mean job response time as a function of the MPL k under MMPP input with rates λ_1 = 0.7, λ_2 = 0.9 and a mean sojourn time 1/φ = 1000.

Figure 13: Response time distribution for k = 15 under MMPP input with rates λ_1 = 0.7, λ_2 = 0.9 and a mean sojourn time 1/φ = 1000 for job sizes with SCV = 19.

Figure 14: Mean job response time as a function of the MPL k under MMPP input with rates λ_1 = 0.5, λ_2 = 1.1 and a mean sojourn time 1/φ = 1000.

The main conclusions are the same as in the Poisson setting: under high job size variability, setting the MPL k beyond k* reduces the mean response time at the expense of an increase in the tail probabilities. Figure 12 may appear to be in conflict with the intuition provided in the previous section in case the arrivals occur in a bursty manner, as even for larger k a higher SCV value results in a larger mean response time (in contrast to Figure 9). However, this is merely due to the fact that an MMPP with λ_1 = 0.7 and λ_2 = 0.9 is not very bursty. If we further increase the difference between the arrival rates λ_1 and λ_2 to 0.6, as illustrated in Figure 14, we note that more variable job sizes do result in lower mean response times for large k.

7. CONCLUSIONS
Motivated by time sharing systems, we studied a processor sharing queueing system where at most k jobs are served simultaneously, the overall service rate µ(m) depends on the number of jobs m in service and additional jobs are buffered (and served in FCFS order). Arrivals are assumed to occur according to a Markovian arrival process (MAP) and the job sizes follow a phase-type (PH) distribution. We derived an expression for the Laplace transform of the response time distribution via both a Kronecker and a spectral expansion approach.

By numerically inverting the Laplace transform we demonstrated that while increasing the multi-programming level (MPL) k beyond k* = arg max_m µ(m) may decrease the mean response time in case of highly variable job sizes (as shown in [5]), this has a negative effect on the tail behavior, which may already be visible at the 99th percentile of the response time distribution. Further, in case of bursty arrivals and a large MPL k, having more variable job sizes may reduce the mean response times, while the tails still behave as expected (more job size variability leads to larger tail probabilities).

In short, the results shed light on the potential drawbacks associated with increasing the MPL value beyond k* when performing admission control.

8. ACKNOWLEDGMENTS
The authors are grateful to Gábor Horváth and Illés Horváth for some helpful discussions.

9. REFERENCES
[1] J. Abate and W. Whitt. A unified framework for numerically inverting Laplace transforms. INFORMS Journal on Computing, 18(4):408–421, 2006.
[2] B. Avi-Itzhak and S. Halfin. Expected response times in a non-symmetric time sharing queue with a limited number of service positions. In Proceedings of ITC, volume 12, pages 2–1, 1988.
[3] P. Brémaud. Markov Chains: Gibbs Fields, Monte Carlo Simulation, and Queues, volume 31. Springer Science & Business Media, 2013.
[4] J. Gardiner, A. Laub, J. Amato, and C. Moler. Solution of the Sylvester matrix equation AXBᵀ + CXDᵀ = E. ACM Trans. Math. Software, 18(2):223–231, 1992.
[5] V. Gupta and M. Harchol-Balter. Self-adaptive admission control policies for resource-sharing systems. SIGMETRICS Perform. Eval. Rev., 37(1):311–322, June 2009.
[6] J. Kriege and P. Buchholz. PH and MAP Fitting with Aggregated Traffic Traces, pages 1–15. Springer International Publishing, Cham, 2014.
[7] G. Latouche and V. Ramaswami. Introduction to Matrix Analytic Methods in Stochastic Modeling. Society for Industrial and Applied Mathematics, Philadelphia, 1999.
[8] T. McClure. Numerical inverse Laplace transform. Computer software, MathWorks File Exchange, 2013.
[9] J. Nair, A. Wierman, and B. Zwart. Tail-robust scheduling via limited processor sharing. Perform. Eval., 67(11):978–995, Nov. 2010.
[10] M. Neuts. Matrix Geometric Solutions in Stochastic Models. Johns Hopkins University Press, Baltimore, 1981.
[11] M. Nuyens and W. van der Weij. Monotonicity in the limited processor-sharing queue. Stochastic Models, 25(3):408–419, 2009.
[12] K. Rege and B. Sengupta. Response time distribution in a multiprogrammed computer with terminal traffic. Performance Evaluation, 8(1):41–50, 1988.
[13] M. Welsh, D. Culler, and E. Brewer. SEDA: An architecture for well-conditioned, scalable internet services. SIGOPS Oper. Syst. Rev., 35(5):230–243, Oct. 2001.
[14] F. Zhang and L. Lipsky. Modelling restricted processor sharing. In PDPTA, pages 353–359, 2006.
[15] J. Zhang, J. Dai, and B. Zwart. Law of large number limits of limited processor-sharing queues. Mathematics of Operations Research, 34(4):937–970, 2009.
[16] J. Zhang, J. Dai, and B. Zwart. Diffusion limits of limited processor sharing queues. The Annals of Applied Probability, 21(2):745–799, 2011.
[17] J. Zhang and B. Zwart. Steady state approximations of limited processor sharing queues in heavy traffic. Queueing Systems, 60(3):227–246, 2008.