Outline

• Mean value analysis for Jackson networks
• Cyclic network
• Extension of Jackson networks
• BCMP network
• More than classical queueing theory

Li Xia, Tsinghua Univ.

Mean value analysis

• Two methods to analyze closed Jackson networks
  – Buzen’s algorithm to compute G(N) and the steady-state distribution
  – Mean value analysis (MVA) to recursively compute the average performance metrics; it can also recursively compute the marginal distributions
• Mean value analysis was proposed by
  1. Reiser, M.; Lavenberg, S. S. (1980). "Mean-Value Analysis of Closed Multichain Queuing Networks". Journal of the ACM 27 (2): 313-322. (IBM Zurich research, IBM Watson research)
  2. Sevcik, K. C.; Mitrani, I. (1981). "The Distribution of Queuing Network States at Input and Output Instants". Journal of the ACM 28 (2): 358-371.

Arrival theorem

• Arrival theorem
  – A generalization of the PASTA theorem
  – Also called the random observer property (ROP) or job observer property
  – “Upon arrival at a station, a job observes the system as if in steady state at an arbitrary instant for the system without that job”

Arrival theorem

• Applicability condition
  – Always holds in open product-form networks with unbounded queues at each node (even where the arrival process at a node is not Poisson)
  – May not hold for some networks
    • Cyclic queue with M=2 and N=2, D/D/1, μ=1, and each server starting with 1 job: the number of jobs seen by job 1 on arrival is always 0, not the 0.5 that the arrival theorem would predict
  – For Poisson arrivals: the PASTA theorem
  – For an open Jackson network
    • q(n)=p(n)
  – For a closed Jackson network
    • qi(N,n-i)=p(N-1,n-i): the state distribution seen by an arriving customer equals the steady-state distribution of the network with one customer fewer


Mean value analysis

• Based on two basic principles
  – Arrival theorem (PASTA or ROP for Jackson networks)
  – Little’s law
• For a closed Jackson network
  – qn(N)=pn(N-1): the queue length seen by an arrival equals that of the network with one customer fewer
  – Little’s law is applicable throughout the network

Mean value analysis

• With the arrival theorem
  – Wi(N) = [1+Li(N-1)]/μi (also valid for M/M/1 or M/M/c)
    • Wi(N): mean response time at node i for a network with N customers
    • Li(N-1): average number of customers at node i for a network with N-1 customers
• With Little’s law
  – Li(N) = λi(N)Wi(N)
    • λi(N): throughput (arrival rate) of node i in an N-customer network, which is unknown

Calculation of λi(n)

• Calculate the visit ratios vi from the traffic equations
  $$v_i = \sum_{j=1}^{M} v_j r_{ji}, \quad \text{for all } i = 1, \dots, M$$
  Add one more equation: v1+v2+…+vM = 1
  – vi is the relative throughput of node i
• Calculate λ(n):
  $$\lambda(n) = \frac{n}{\sum_{i=1}^{M} v_i W_i(n)}$$

• Throughput of node i: λi(n)=λ(n)vi
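The visit-ratio step can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `visit_ratios` and its `normalize` option are hypothetical, the routing matrix is assumed irreducible so that exactly one traffic equation is redundant, and exact fractions are used so the result can be compared with hand calculation.

```python
from fractions import Fraction as F

def visit_ratios(R, normalize="sum"):
    """Solve v_i = sum_j v_j * R[j][i] for a routing matrix R (rows sum to 1).

    normalize="sum"   -> v_1 + ... + v_M = 1
    normalize="first" -> v_1 = 1
    (Hypothetical helper; assumes R is irreducible.)
    """
    M = len(R)
    # Build (R^T - I) v = 0, then replace the last (redundant) equation
    # with the normalization condition.
    A = [[F(R[j][i]) - (1 if i == j else 0) for j in range(M)] for i in range(M)]
    b = [F(0)] * M
    A[M - 1] = [F(1)] * M if normalize == "sum" else [F(1)] + [F(0)] * (M - 1)
    b[M - 1] = F(1)
    # Gauss-Jordan elimination with exact rational arithmetic.
    for col in range(M):
        pivot = next(r for r in range(col, M) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(M):
            if r != col and A[r][col] != 0:
                factor = A[r][col] / A[col][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[col])]
                b[r] -= factor * b[col]
    return [b[i] / A[i][i] for i in range(M)]
```

Either normalization works for MVA, because λi(n)=λ(n)vi does not change when all vi are rescaled by the same constant.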

Algorithm of mean value analysis

• Solve traffic equations to obtain vi, i=1,2,…,M

• Initialize Li(0)=0, i=1,2,…,M

• For n=1:N, calculate

– Wi(n) = [1+Li(n-1)]/μi

– λ(n)=n/[v1W1(n)+…+vMWM(n)]

– λi(n)= λ(n) vi, i=1,2,…,M

– Li(n) =λi(n)Wi(n), i=1,2,…,M
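The recursion above can be sketched in a few lines of Python. The function name `mva` and its return convention are assumptions made for illustration; the sketch covers only single-server nodes, and exact fractions are used so the output can be checked by hand.

```python
from fractions import Fraction as F

def mva(v, mu, N):
    """Mean value analysis for a closed network of single-server nodes.

    v  : visit ratios (any positive scaling)
    mu : service rates
    N  : number of customers
    Returns (W, lam, L) for the N-customer network: mean response times,
    node throughputs, and mean queue lengths.
    """
    M = len(v)
    L = [F(0)] * M                       # L_i(0) = 0
    W, lam = [F(0)] * M, [F(0)] * M
    for n in range(1, N + 1):
        # Arrival theorem: an arrival sees the (n-1)-customer network.
        W = [(1 + L[i]) / mu[i] for i in range(M)]
        # Little's law applied to the whole network gives lambda(n).
        total = n / sum(v[i] * W[i] for i in range(M))
        lam = [total * v[i] for i in range(M)]
        # Little's law applied to each node.
        L = [lam[i] * W[i] for i in range(M)]
    return W, lam, L
```

Calling `mva([F(1), F(3, 4), F(1, 2)], [2, 1, 3], 2)` runs the recursion for a 3-node network with 2 customers; the work per step is proportional to M, so the total cost is proportional to M·N.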

Discussion of mean value analysis

• Calculates the average metrics easily
  – Average queue length, mean waiting time, mean response time
  – Can also calculate the marginal distribution recursively
• Recursive algorithm
  – Complexity is linear in the system size
• For multiclass networks
  – Also applicable, but the complexity grows exponentially with the number of classes

Example

• Similar to Example 4.5 on page 203 of Gross’ book (machine repair problem)
  – Closed Jackson network with M=3, N=2
  – Exception: all the nodes are single-server

– Service rate: μ1=2, μ2=1, μ3=3

– Routing prob.: r12=3/4, r13=1/4, r21=2/3, r23=1/3, r31=1

Example (cont.)

2 v v v • Visit rate equation set: 123 3 3 vv Let v1=1, solve the equation 214 11 v =3/4, v =1/2 v31 v 2 v 2 3 43 • State space: (M+N-1) choose N, it is 6 (0,0,2), (0,1,1), (0,2,0), (1,0,1), (1,1,0), (2,0,0) • Buzen’s algorithm to compute G(N), then compute the steady state distribution π = …

Example (cont.)

• MVA method
  – For n=1:
    • W1(1)=[1+L1(0)]/μ1=1/2; W2(1)=1; W3(1)=1/3
    • λ(1)=1/[1*1/2+3/4*1+1/2*1/3]=12/17
    • λ1(1)=λ(1)v1=12/17; λ2(1)=9/17; λ3(1)=6/17
    • L1(1)=λ1(1)W1(1)=6/17; L2(1)=9/17; L3(1)=2/17
  – For n=2:
    • W1(2)=[1+L1(1)]/μ1=23/34; W2(2)=26/17; W3(2)=19/51
    • λ(2)=2/[1*23/34+3/4*26/17+1/2*19/51]=204/205
    • λ1(2)=λ(2)v1=204/205; λ2(2)=153/205; λ3(2)=102/205
    • L1(2)=λ1(2)W1(2)=138/205; L2(2)=234/205; L3(2)=38/205

Thinking of closed Jackson network

• Relation of parameters
  – Arrival rate (throughput) λ vs. service rates μ
    • As μ increases, λ increases
  – Arrival rate λ vs. number of customers N
    • As N increases, λ increases toward an upper bound
• MVA for marginal distribution
  – Recursive formula, similar to M/M/1
    pi(n,N) = [λi(N)/μi] · pi(n-1,N-1)
  – Given the marginal distributions → compute the steady state distribution by Jackson’s theorem, as if independent, multiply …
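A minimal sketch of the marginal-distribution recursion, assuming single-server nodes and that pi(0,N) is obtained by normalization (one minus the sum of the other terms); the function name `mva_marginals` is hypothetical.

```python
from fractions import Fraction as F

def mva_marginals(v, mu, N):
    """Return p[i][n] = P{n customers at node i} in the N-customer network."""
    M = len(v)
    p = [[F(1)] for _ in range(M)]             # empty network: p_i(0, 0) = 1
    for k in range(1, N + 1):
        # Mean queue lengths of the (k-1)-customer network from its marginals.
        L = [sum(n * q for n, q in enumerate(p[i])) for i in range(M)]
        W = [(1 + L[i]) / mu[i] for i in range(M)]
        lam = k / sum(v[i] * W[i] for i in range(M))
        new_p = []
        for i in range(M):
            util = lam * v[i] / mu[i]           # lambda_i(k) / mu_i
            tail = [util * q for q in p[i]]     # p_i(n, k) for n = 1..k
            new_p.append([1 - sum(tail)] + tail)
        p = new_p
    return p
```

For the 3-node example with v=(1, 3/4, 1/2), μ=(2, 1, 3) and N=2, the mean of each returned marginal can be compared with the Li(2) values from the MVA recursion.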

Cyclic network

• A special case of closed Jackson network

– rij=1 if j=i+1 (node M routes back to node 1), and 0 otherwise

[Figure: five servers with rates μ1, …, μ5 arranged in a ring]

Cyclic network

• Product-form solution of steady state distribution

– Traffic equation is special, vi+1=vi, so set all vi=1

– ρi= vi/μi=1/ μi

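$$p(n_1, n_2, \dots, n_M) = \frac{1}{G(N)} \rho_1^{n_1} \rho_2^{n_2} \cdots \rho_M^{n_M} = \frac{1}{G(N)} \cdot \frac{1}{\mu_1^{n_1} \mu_2^{n_2} \cdots \mu_M^{n_M}}$$

where

$$G(N) = \sum_{n_1 + \dots + n_M = N} \frac{1}{\mu_1^{n_1} \mu_2^{n_2} \cdots \mu_M^{n_M}}$$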

Extension of Jackson networks

• Load-dependent arrival rate and service rate
  – Similar results, product-form solution
• Consider travel time between nodes
  – Model the travel time as extra nodes with ample servers, which still keeps the form of a Jackson network
• Multiclass Jackson network
  – Each class has its own routing structure, arrival rates and service rates
  – Applicable to computer and communication systems
  – BCMP network, which still has a product-form solution

Non-Jackson network

• Many variants of Jackson networks
• State-dependent routing probability
  – A customer has the flexibility to decide its next stop
    • E.g., choose the node with less congestion
  – Even with exponential interarrival and service times, there is no product-form solution
  – Use a Markov model for the analysis, but it suffers from the “curse of dimensionality”
    • A product-form solution avoids this curse in computation
    • Storage is a curse if the probability of every state must be stored
  – Avoid storing the whole distribution by accumulating iteratively, e.g., for all s: L=L+n*p(s), with only one running variable L

BCMP network

• BCMP network definition
  – M servers, K classes of customers
  – 4 kinds of service disciplines
    • FCFS, PS, IS (infinite servers, or ample servers), LCFS with preemptive-resume
  – Class transition
    • A class k customer leaving server i moves to server j as class r, with probability qij,kr
  – Service time distribution
    • FCFS: IID exponential for all classes
    • PS, IS, LCFS: any Coxian distribution (including exponential)

BCMP network

• The steady state distribution of a BCMP network has a product-form solution
  – Handle each server independently and multiply the factors together
  – Calculate the normalization constant
    • Is there an algorithm similar to Buzen’s?
  – Scalability: avoids the curse of dimensionality
    • Applicable to large-scale problems

More than classical queueing theory

• Heavy tail traffic
• Phase-Type (PH) distribution
• Matrix Analytic Method (MAM)

Heavy tail traffic

• Assumption
  – Service time is exponentially distributed
  – Or: job size is exponentially distributed
• In practice, especially in computer systems
  – Job size is not exponentially distributed
  – Heavy tail, high variance, decreasing failure rate

Heavy tail traffic

• Data measurement is important (ACM SIGMETRICS)
  – Collect the job sizes in a computer system

[Figure: P{jobsize > x} versus x for x = 1, 2, 4, 8, 16, 32, with a heavy-tail curve and an exponential curve]

Looks like an exponential distribution, $\bar{F}(x) = e^{-x}$
But actually it is not: check the 1st moment, 2nd moment, …

Pareto distribution

• If we use a log-log plot

[Figure: log-log plot of P{jobsize > x} versus x (x = 1, 2, 4, …, 32), with a Pareto curve and an exponential curve]

This fits a Pareto distribution well: $\bar{F}(x) = x^{-\alpha}$, $x \ge 1$, $0 < \alpha < 2$, $\alpha \in [0.8, 1.2]$

Pareto distribution

• Properties of the Pareto distribution
  – Decreasing failure rate
    $$r(x) = \frac{f(x)}{\bar{F}(x)} = \frac{\alpha x^{-\alpha-1}}{x^{-\alpha}} = \frac{\alpha}{x}, \quad x \ge 1$$
    • The older a job is, the more CPU time it will still need in the future
  – Infinite or near-infinite variance
    • If α≤1: E[x]=∞, E[x^n]=∞, E[x|x>a]=∞
    • If α>1: E[x]<∞, E[x^n]=∞ for n≥2, E[x|x>a]<∞
  – Heavy-tail property
    • The 1% largest jobs comprise 90% of the system load
    • More biased than the 80-20 rule (Pareto principle)
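The properties above can be explored numerically with a small sketch. The function names are hypothetical, and the load-share formula is derived here for the unbounded Pareto tail x^(-α) on x ≥ 1 with α > 1 (finite mean): the largest fraction q of jobs are those above t = q^(-1/α), and their share of the total load is t^(1-α) = q^((α-1)/α).

```python
def pareto_tail(x, alpha):
    """P{X > x} for a Pareto distribution with support x >= 1."""
    return x ** (-alpha) if x >= 1 else 1.0

def pareto_failure_rate(x, alpha):
    """r(x) = f(x) / P{X > x} = alpha / x for x >= 1, decreasing in x."""
    return alpha / x

def load_share_of_largest(q, alpha):
    """Fraction of the total load carried by the largest fraction q of jobs.

    Assumes an unbounded Pareto tail with alpha > 1, so that the mean is finite.
    """
    if alpha <= 1:
        raise ValueError("mean is infinite for alpha <= 1")
    return q ** ((alpha - 1) / alpha)
```

Under these assumptions the share carried by the largest jobs grows toward 1 as α decreases toward 1; measured workloads with a bounded maximum job size would need a bounded Pareto model instead.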

Pareto distribution

• Also known as the power-law distribution
  – Hot concepts: power-law; small-world
  – Widely found in practice, almost everywhere!
    • Most of the resources/contributions belong to a few people/units
    • In business, the 80-20 rule
    • In computer networks, heavy-tail traffic (has won many awards and top papers, Nature/Science)
    • On the Internet, the file sizes at websites
    • In social networks/Internet topology, the degree of nodes

Question: based on the power-law/Pareto distribution, what can we do? Focus on the biggest jobs…

Phase-Type distribution

• Since the job size is not exponential (heavy-tail), Markovian tools cannot be applied directly
• All is not lost: we can use approximation
  – Phase-type distribution to approximate most distributions, for service time
  – Markovian arrival process to approximate most distributions, for interarrival time

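Phase-Type distribution

• A generalization of the concept of phases
• PH distribution: the time to enter an absorbing state in a Markov process
  – A PH distribution has two parameters, (α,T)
    • α is the initial distribution
    • T is the part of the transition probability/rate matrix Q restricted to the transient states
• For example, the Erlang-2 distribution (states 1 → 2 → 3, each transition at rate μ; T is the upper-left 2×2 block of Q):
  $$Q = \begin{pmatrix} -\mu & \mu & 0 \\ 0 & -\mu & \mu \\ 0 & 0 & 0 \end{pmatrix}, \qquad \alpha = (1, 0)$$

PH distribution

• Coxian distribution (from state 1, move to state 2 at rate μp1 or directly to the absorbing state 3 at rate μ(1-p1)):
  $$Q = \begin{pmatrix} -\mu & \mu p_1 & \mu(1-p_1) \\ 0 & -\mu & \mu \\ 0 & 0 & 0 \end{pmatrix}, \qquad \alpha = (1, 0)$$
• Hyperexponential distribution, with an initial distribution α = (q,1-q):
  $$Q = \begin{pmatrix} -\mu_1 & 0 & \mu_1 \\ 0 & -\mu_2 & \mu_2 \\ 0 & 0 & 0 \end{pmatrix}$$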

The calculation of pdf of PH distribution

• Use the C-K equation and matrix calculation of Q
• Calculate the transient behavior of the absorbing state, pn(x):
  $$F(x) = 1 - \alpha e^{Tx} \mathbf{e}$$
  where e is a column vector of ones
  – pn(x) = P{X ≤ x}
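As a rough sketch of how the distribution function of a PH distribution (α,T) can be evaluated numerically, the code below computes 1 − α·exp(Tx)·1 with a simple matrix exponential. The helper names are hypothetical, and the scaling-and-squaring Taylor series is a stand-in chosen to keep the example dependency-free; a library routine for the matrix exponential would normally be preferred.

```python
import math

def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mat_exp(A, terms=30):
    """exp(A) by scaling and squaring with a truncated Taylor series."""
    n = len(A)
    norm = max(sum(abs(x) for x in row) for row in A)
    squarings = max(0, math.ceil(math.log2(norm))) if norm > 0 else 0
    scaled = [[x / 2 ** squarings for x in row] for row in A]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term becomes scaled^k / k!
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def ph_cdf(alpha, T, x):
    """F(x) = 1 - alpha * exp(T x) * 1 for a PH distribution (alpha, T)."""
    E = mat_exp([[t * x for t in row] for row in T])
    return 1.0 - sum(a * sum(row) for a, row in zip(alpha, E))
```

For an Erlang-2 distribution with μ=1, `ph_cdf([1, 0], [[-1, 1], [0, -1]], x)` can be compared with the closed form 1 − e^(−x)(1 + x).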

Matrix Analytic Method (MAM)

• A computation method to handle Markovian queueing models with multiple dimensions
  – M/PH/1, M/Er/1, MAP/M/1, MAP/PH/1, …
  – State: (# of customers, phase status)
  – The transition rate matrix has a block tri-diagonal structure: the QBD (Quasi-Birth-Death) structure

[Matrix figure: block tri-diagonal transition rate matrix, zero outside the three block diagonals]

Matrix Analytic Method (MAM)

• Proposed by M.F. Neuts (1935-2014, University of Arizona) in 1975
  – Developed fast in recent years
  – Target: give a numerical method to obtain the distribution of a Markovian queueing model whose transition structure
    • Repeats with a certain pattern
    • Grows unboundedly in no more than 1 dimension
• Computing R or G
  – Recursively solve the equation: $R = A_0 + R A_1 + R^2 A_2$
  – Steady distribution: $\pi_n = \pi_1 R^{n-1}$
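A minimal sketch of computing R by successive substitution, starting from R = 0. It assumes that A0, A1 and A2 are the level-up, same-level and level-down transition probability blocks of a discrete-time (or uniformized) QBD, for which the fixed-point form above applies directly; the function names and the stopping rule are illustrative choices, and faster schemes exist.

```python
def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mat_add(*mats):
    """Elementwise sum of equally sized matrices."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*mats)]

def solve_R(A0, A1, A2, tol=1e-12, max_iter=100000):
    """Iterate R <- A0 + R*A1 + R^2*A2 from R = 0 until successive iterates agree.

    Assumption: A0 (up), A1 (local), A2 (down) are probability blocks of a
    stable discrete-time QBD.
    """
    n = len(A0)
    R = [[0.0] * n for _ in range(n)]
    for _ in range(max_iter):
        R_next = mat_add(A0, mat_mul(R, A1), mat_mul(mat_mul(R, R), A2))
        diff = max(abs(a - b) for ra, rb in zip(R_next, R) for a, b in zip(ra, rb))
        R = R_next
        if diff < tol:
            return R
    raise RuntimeError("R iteration did not converge")
```

With 1×1 blocks the model reduces to a birth-death chain, and the iteration converges to the ratio of the up and down probabilities, which gives a simple sanity check.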
