On the relation between matrix-geometric and discrete phase-type distributions

Sietske Greeuw

Kongens Lyngby/Amsterdam, 2009

Master thesis in Mathematics, University of Amsterdam

Technical University of Denmark, Informatics and Mathematical Modelling, Building 321, DK-2800 Kongens Lyngby, Denmark, www.imm.dtu.dk

University of Amsterdam, Faculty of Science, Science Park 404, 1098 XH Amsterdam, The Netherlands, www.science.uva.nl

Summary

A discrete phase-type distribution describes the time until absorption in a discrete-time Markov chain with a finite number of transient states and one absorbing state. The density f(n) of a discrete phase-type distribution can be expressed by the initial probability vector α, the transition probability matrix T of the transient states of the Markov chain and the vector t containing the probabilities of entering the absorbing state from the transient states:

f(n) = αT^{n−1}t, n ∈ N.

If we take a probability density of the same form, but do not necessarily require α, T and t to have the probabilistic Markov-chain interpretation, we obtain the density of a matrix-geometric distribution. Matrix-geometric distributions can equivalently be defined as distributions on the non-negative integers that have a rational probability generating function. In this thesis it is shown that the class of matrix-geometric distributions is strictly larger than the class of discrete phase-type distributions. We give an example of a set of matrix-geometric distributions that are not of discrete phase type. We also show that there is a possible order reduction when representing a discrete phase-type distribution as a matrix-geometric distribution. The results parallel the continuous case, where the class of matrix-exponential distributions is strictly larger than the class of continuous phase-type distributions, and where there is also a possible order reduction.

Keywords: discrete phase-type distributions, phase-type distributions, matrix-exponential distributions, matrix-geometric distributions.

Resumé

A discrete phase-type distribution is the distribution of the time until absorption in a discrete-time Markov chain with a finite number of transient states and one absorbing state. The density f(n) of a discrete phase-type distribution can be expressed in terms of the initial probability vector α, the matrix T describing the possible transitions between the transient states of the Markov chain, and the vector t of probabilities of jumping to the absorbing state from the transient states:

f(n) = αT^{n−1}t, n ∈ N.

If we take a density of the same form, but do not necessarily require α, T and t to have a probabilistic Markov-chain interpretation, we obtain the density of a matrix-geometric distribution. Matrix-geometric distributions can equivalently be defined as distributions on the non-negative integers that have a rational probability generating function. In this thesis it is shown that the class of matrix-geometric distributions is strictly larger than the class of discrete phase-type distributions. An example is given of a set of matrix-geometric distributions that are not of discrete phase type. We also show that a reduction of the size of the representation is possible when a discrete phase-type distribution is represented as a matrix-geometric distribution. The results correspond to the continuous case, where the class of matrix-exponential distributions is strictly larger than the class of continuous phase-type distributions, and where a reduction of the size of the representation is also possible.

Preface

This thesis was prepared in partial fulfillment of the requirements for acquiring the master of science degree in Mathematics. It has been prepared during a five-month Erasmus stay at the Technical University of Denmark, at the department of Informatics and Mathematical Modelling.

I would like to thank my two supervisors, Bo Friis Nielsen and Michel Mandjes, for both being very supportive of my decision to do my master project abroad. Furthermore I want to thank Bo for his friendly and informative guidance, and Michel for his help with the final parts of my thesis.

Finally I want to thank my family and my friends and all other people who have supported me during this process.

Amsterdam, March 2009

Sietske Greeuw

Contents

Summary

Resumé

Preface

1 Introduction

2 Phase-type distributions
2.1 Discrete phase-type distributions
2.2 Continuous phase-type distributions

3 Matrix-exponential distributions
3.1 Definition
3.2 Examples of matrix-exponential distributions

4 Matrix-geometric distributions
4.1 Definition
4.2 Existence of genuine matrix-geometric distributions
4.3 Order reduction for matrix-geometric distributions
4.4 Properties of matrix-geometric distributions

Chapter 1

Introduction

In this thesis the relation between matrix-geometric and discrete phase-type distributions will be explored. Matrix-geometric distributions (MG) are distributions on the non-negative integers that possess a density of the form

f(n) = αT^{n−1}t, n ∈ N, (1.1)

together with f(0) = ∆, the point mass in zero. The parameters (α, T, t) are a row vector, a matrix and a column vector, respectively, that satisfy the necessary conditions for f(n) to be a density. Hence αT^{n−1}t ≥ 0 for all n ∈ N and ∑_{n=0}^∞ f(n) = ∆ + α(I − T)^{−1}t = 1.

The class MG is a generalization of the class of discrete phase-type distributions (DPH). A discrete phase-type distribution describes the time until absorption in a finite-state discrete-time Markov chain. A discrete phase-type density has the same form as (1.1) but requires the parameters to have the interpretation as initial probability vector (α), sub-transition probability matrix (T) and exit probability vector (t) of the underlying Markov chain.

The analogous setup in continuous time is given by matrix-exponential distributions (ME), which are a generalization of continuous phase-type distributions (PH). The latter describe the time until absorption in a continuous-time finite-state Markov chain.

The name phase-type refers to the states of the Markov chain. Each visit to a state can be seen as a phase with a geometric (in continuous time, exponential) distribution, and the resulting Markovian structure gives rise to a discrete (continuous) phase-type distribution. The method of phases was introduced by A.K. Erlang at the beginning of the 20th century and was later generalized by M.F. Neuts. A general probability distribution can be approximated arbitrarily closely by a phase-type distribution. This makes phase-type distributions a powerful tool in the modelling of e.g. queueing systems. Another nice feature of phase-type distributions is that they often give rise to closed-form solutions.

The main reference on discrete phase-type distributions is ‘Probability distributions of phase type’ by M.F. Neuts, published in 1975 [12]. This article gives a thorough introduction to discrete phase-type distributions, their main properties and their use in the theory of queues. In his 1981 book ‘Matrix-geometric Solutions in Stochastic Models’ [13], M.F. Neuts again gives an introduction to phase-type distributions, this time focussing on the continuous case. Another clear introduction to continuous phase-type distributions can be found in the book by G. Latouche and V. Ramaswami [10]. A good survey of the class ME is given in the entry on matrix-exponential distributions in the Encyclopedia of Statistical Sciences by Asmussen and O’Cinneide [4]. In this article the relation of the class ME to the class PH is discussed, some properties of ME are given and the use of these distributions in applied probability is explained. The authors mention the class of matrix-geometric distributions as the analogous discrete counterpart of ME. However, no explicit examples are given.

The main question addressed in this thesis is what the relation is between the classes DPH and MG. From their definition it is clear that

DPH ⊆ MG.

It is, however, not immediately clear whether there exist distributions that are in MG but do not have a discrete phase-type representation. The second question is on the order of the distribution. This order is defined as the minimal dimension of the matrix T in (1.1) needed to represent the distribution. It is of interest to know if there exist discrete phase-type distributions that have a lower-order representation when they are represented as a matrix-geometric distribution. Such a reduced-order representation can be much more convenient to work with, for example in solving high-dimensional queueing systems.

In the continuous case it is known that

PH ⊊ ME,

that is, PH is a strict subset of ME [3, 4]. A standard example of a distribution that is in ME but not in PH is a distribution represented by a density which has a zero on (0, ∞). Such a distribution cannot be in PH, as a phase-type density is strictly positive on (0, ∞) [13]. There is no equivalent result in discrete time, as discrete phase-type densities can be zero with a certain periodicity. A second way of showing that ME is larger than PH is by using the characterization of phase-type distributions stated by O’Cinneide in [14], which is based on the absolute value of the poles of the Laplace transform. In [15] O’Cinneide gives a lower bound on the order of a phase-type representation. He shows that order reduction is possible when using a matrix-exponential representation rather than a phase-type representation.

We will use the characterization of discrete phase-type distributions by O’Cinneide [14] to give a set of distributions that is in MG but not in DPH. We will also show by example that order reduction is possible when going from a DPH to an MG representation. This result is based on discrete phase-type densities which have a certain periodicity.

The rest of the thesis is organized as follows. Section 2.1 gives an introduction to and some important properties of discrete phase-type distributions. In Section 2.2 the class of continuous phase-type distributions PH is briefly addressed. In Section 3.1 the class ME is introduced and in Section 3.2 some examples of matrix-exponential distributions are given, where the focus lies on their relation to the class PH. In Chapter 4 the answers to the questions stated in this introduction are given. In Section 4.1 we introduce matrix-geometric distributions and give an equivalent definition for these distributions based on the rationality of their probability generating function. In Section 4.2 we give an example that illustrates that MG is strictly larger than DPH, and in Section 4.3 we give an example of order reduction between these classes. For the sake of completeness, Section 4.4 is devoted to some properties of matrix-geometric distributions. They equal the properties of discrete phase-type distributions stated in Chapter 2, but the proofs now need to be given analytically, as the probabilistic Markov-chain interpretation is no longer available.

Chapter 2

Phase-type distributions

This chapter is about phase-type distributions. In Section 2.1 discrete phase- type distributions are introduced. They are the distribution of the time until absorption in a discrete-time finite-state Markov chain with one absorbing state. Some properties of these distributions will be studied, the connection with the geometric distribution will be explained, and some closure properties will be examined. In Section 2.2, we have a look at continuous phase-type distributions. As their development is analogous to the discrete case, we will not explore them as extensively as the discrete case, but we will only state the most relevant results. The presentation of phase-type distributions in this chapter is based on [12],[6] and [13]. In the following, I and e will denote the identity matrix and a vector of ones respectively, both of appropriate size. Row vectors will be denoted by bold face Greek letters, whereas column vectors will be denoted by bold face Latin letters. We define N to be the set of positive integers, i.e. N = {1, 2, 3,...}.

2.1 Discrete phase-type distributions

Let {X_n}_{n∈N≥0} be a discrete-time Markov chain on the state space E = {1, 2, ..., m, m + 1}. We let {1, 2, ..., m} be the transient states of the Markov chain and m + 1 be the absorbing state. The transition probability matrix of this Markov chain is given by

P = [ T  t ]
    [ 0  1 ].

Here T is the m × m sub-transition probability matrix for the transient states, and t is the exit vector which gives the probability of absorption into state m + 1 from any transient state. Since P is a transition probability matrix, each row of P must sum to 1, hence Te + t = e.

The probability of initiating the Markov chain in state i is denoted by α_i = P(X_0 = i). The initial probability vector of the Markov chain is then given by (α, α_{m+1}) = (α_1, α_2, ..., α_m, α_{m+1}) and we have ∑_{i=1}^{m+1} α_i = 1.

Definition 2.1 (Discrete phase-type distribution) A random variable τ has a discrete phase-type distribution if τ is the time until absorption in a discrete-time Markov chain,

τ := min{n ∈ N≥0 : X_n = m + 1}.

The name phase-type distribution refers to the states (phases) of the underlying Markov chain. If T is an m × m matrix we say that the density is of order m. The order of the discrete phase-type distribution is the minimal size of the matrix T needed to represent the density.

2.1.1 Density and distribution function

In order to find the density of τ we look at the probability that the Markov chain is in one of the transient states i ∈ {1, 2, . . . , m} after n steps,

p_i^{(n)} = P(X_n = i) = ∑_{k=1}^m α_k (T^n)_{ki}.

We can collect these probabilities in a vector and get ρ^{(n)} = (p_1^{(n)}, p_2^{(n)}, ..., p_m^{(n)}). Note that ρ^{(0)} = α.

Lemma 2.2 The density of a discrete phase-type random variable τ is given by

fτ(n) = αT^{n−1}t, n ∈ N (2.1)

and fτ(0) = α_{m+1}.

Proof. The probability of absorption of the Markov chain at time n is given by the sum over the probabilities of the Markov chain being in one of the states {1, 2, . . . , m} at time n − 1 multiplied by the probability that absorption takes place from that state. The state of the Markov chain at time n − 1 depends on the initial state of the Markov chain and the (n − 1)-step transition probability matrix T n−1. Hence we get

fτ(n) = P(τ = n) = ∑_{i=1}^m p_i^{(n−1)} t_i = ρ^{(n−1)}t = αT^{n−1}t, n ∈ N.

Note that fτ(0) is the probability of absorption of the Markov chain in zero steps, which is given by α_{m+1}, the probability of initiating in the absorbing state. □
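The density formula in Lemma 2.2 is easy to check numerically. Below is a small sketch (the parameters α and T are illustrative, not an example from the text) that evaluates f(n) and confirms that the probabilities sum to one:

```python
import numpy as np

# Illustrative (made-up) representation of a discrete phase-type variable.
alpha = np.array([0.6, 0.4])            # initial probabilities on transient states
T = np.array([[0.3, 0.4],
              [0.2, 0.5]])              # sub-transition probability matrix
t = (np.eye(2) - T) @ np.ones(2)        # exit vector, t = (I - T)e

def dph_density(n):
    """f(n) = alpha T^(n-1) t for n >= 1; f(0) is the atom alpha_{m+1} (0 here)."""
    if n == 0:
        return 1.0 - alpha.sum()
    return float(alpha @ np.linalg.matrix_power(T, n - 1) @ t)

# f should be a probability density: non-negative and summing to one.
total = sum(dph_density(n) for n in range(0, 200))
print(total)                            # close to 1 up to truncation error
```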

The density of τ is completely defined by the initial probability vector α and the sub-transition probability matrix T , since t = (I − T ) e. We write

τ ∼ DPH(α, T)

to denote that τ is of discrete phase type with parameters α and T. The density of a discrete phase-type distribution is said to have an atom in zero of size α_{m+1} if fτ(0) = α_{m+1}.

A representation (α,T ) for a phase-type distribution is called irreducible if every state of the Markov chain can be reached with positive probability if the initial distribution is given by α. We can always find such a representation by simply leaving out the states that cannot be reached.

From now on we will omit the subscript τ and simply write f(n) for the density of a discrete phase-type random variable. We have to check that f(n) = αT n−1t is a well-defined density on the non-negative integers. Since α,T and t only have non-negative entries (as their entries are probabilities) we know that f(n) is non-negative for all n. The infinite sum of f(n) is given by:

∑_{n=0}^∞ f(n) = f(0) + ∑_{n=1}^∞ αT^{n−1}t
              = α_{m+1} + α(∑_{n=0}^∞ T^n)t
              = α_{m+1} + α(I − T)^{−1}t
              = α_{m+1} + α(I − T)^{−1}(I − T)e
              = α_{m+1} + αe = 1.

As T is a sub-stochastic matrix, all its eigenvalues are less than one in absolute value (see e.g. [2], Proposition I.6.3). Therefore we have in the above that the series ∑_{n=0}^∞ T^n converges to the matrix (I − T)^{−1} (see e.g. [16], Lemma B.1).

We denote by F (n) = P (τ ≤ n) the distribution function of τ. The distribution function can be deduced by the following probabilistic argument.

Lemma 2.3 The distribution function of a discrete phase-type variable is given by

F(n) = 1 − αT^n e. (2.2)

Proof. We look at the probability that absorption has not yet taken place and hence the Markov chain is in one of the transient states. We get

1 − F(n) = P(τ > n) = ∑_{i=1}^m p_i^{(n)} = ρ^{(n)}e = αT^n e.


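The closed form F(n) = 1 − αT^n e of Lemma 2.3 can be checked against the cumulative sum of the density; the parameters below are again illustrative only:

```python
import numpy as np

# Made-up representation, same style as before.
alpha = np.array([0.6, 0.4])
T = np.array([[0.3, 0.4],
              [0.2, 0.5]])
t = (np.eye(2) - T) @ np.ones(2)

def pmf(n):
    # f(0) = alpha_{m+1}, f(n) = alpha T^(n-1) t for n >= 1
    if n == 0:
        return 1.0 - alpha.sum()
    return float(alpha @ np.linalg.matrix_power(T, n - 1) @ t)

def cdf(n):
    # closed form F(n) = 1 - alpha T^n e
    return 1.0 - float(alpha @ np.linalg.matrix_power(T, n) @ np.ones(2))

cumulative = [sum(pmf(k) for k in range(n + 1)) for n in range(10)]
closed_form = [cdf(n) for n in range(10)]
```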

2.1.2 Probability generating function and moments

For a discrete random variable X defined on the non-negative integers with density p_n = P(X = n), the probability generating function (pgf) is given by

H(z) = p_0 + zp_1 + z^2 p_2 + ... = ∑_{k=0}^∞ z^k p_k, |z| ≤ 1.

Hence H(z) = E(z^X). Denote by H^{(i)}(z) the i-th derivative of H(z). We have

H(1) = ∑_{k=0}^∞ p_k = 1,    H^{(1)}(1) = ∑_{k=0}^∞ k p_k = E(X),

and in general we get the k-th factorial moment, if it exists, by

H^{(k)}(1) = E(X(X − 1) ... (X − (k − 1))).

For a probability generating function, the following properties hold:

1. The probability generating function uniquely determines the probability density of X, since

H^{(k)}(0)/k! = p_k for k = 0, 1, 2, ....

It follows that if H_X(z) = H_Y(z) for all z with |z| ≤ 1, then f_X(n) = f_Y(n) for all n ∈ N≥0.

2. If X, Y are two independent random variables, then the pgf of the random variable Z = X + Y is given by

H_Z(z) = E(z^{X+Y}) = E(z^X z^Y) = E(z^X)E(z^Y) = H_X(z)H_Y(z).

3. If Z is equal to X with probability p and equal to Y with probability (1 − p), then H_Z(z) = pH_X(z) + (1 − p)H_Y(z).

Lemma 2.4 The probability generating function of a discrete phase-type random variable is given by

H(z) = α_{m+1} + αz(I − zT)^{−1}t. (2.3)

Proof.

H(z) = E(z^τ) = ∑_{n=0}^∞ z^n f(n)
     = f(0) + ∑_{n=1}^∞ z^n αT^{n−1}t
     = α_{m+1} + αz ∑_{n=1}^∞ (zT)^{n−1}t
     = α_{m+1} + αz ∑_{n=0}^∞ (zT)^n t
     = α_{m+1} + αz(I − zT)^{−1}t.

□

Note that H(z) is a rational expression (a ratio of two polynomials) in z. This is because all elements of (I − zT) are polynomials in z, and taking the inverse leads to ratios of those polynomials. The poles of a rational function are given by the roots of the denominator. Hence the poles of H(z) are the solutions to the equation det(I − zT) = 0. As the eigenvalues of the matrix T are found by solving det(T − zI) = 0, we see that the poles of the probability generating function are reciprocals of eigenvalues of T (there might be fewer poles than eigenvalues due to cancellation).
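The closed form (2.3) can be compared numerically with the series definition of the pgf; a sketch with illustrative (made-up) parameters:

```python
import numpy as np

# Illustrative representation; alpha_{m+1} = 1 - alpha e is the atom at zero.
alpha = np.array([0.6, 0.4])
T = np.array([[0.3, 0.4],
              [0.2, 0.5]])
t = (np.eye(2) - T) @ np.ones(2)
a0 = 1.0 - alpha.sum()

def H_closed(z):
    # closed form (2.3): H(z) = alpha_{m+1} + alpha z (I - zT)^{-1} t
    return a0 + z * float(alpha @ np.linalg.solve(np.eye(2) - z * T, t))

def H_series(z, terms=500):
    # series definition H(z) = sum_n z^n f(n), truncated
    out = a0
    for n in range(1, terms):
        out += z**n * float(alpha @ np.linalg.matrix_power(T, n - 1) @ t)
    return out
```

In particular H(1) = 1, and at an interior point such as z = 0.5 the truncated series agrees with the closed form.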

From the probability generating function we can obtain the expectation of a discrete phase-type random variable by differentiating H(z) and taking z = 1. We first need the following lemma:

Lemma 2.5 For |z| ≤ 1, and I and T the identity and sub-transition probability matrix, it holds that

∑_{k=1}^∞ k(zT)^k = (I − zT)^{−2} − (I − zT)^{−1}. (2.4)

Proof. We have

(I − zT) ∑_{k=1}^∞ k(zT)^k = zT − (zT)^2 + 2(zT)^2 − 2(zT)^3 + 3(zT)^3 − ...
 = zT + (zT)^2 + (zT)^3 + ...
 = ∑_{k=0}^∞ (zT)^k − I
 = (I − zT)^{−1} − I,

from which the desired result follows. □

Using Lemma 2.5 we can now calculate the first derivative of H(z) and find the expectation by substituting z = 1:

H^{(1)}(z) = d/dz (α_{m+1} + αz(I − zT)^{−1}t)
           = α(I − zT)^{−1}t + αz d/dz(∑_{k=0}^∞ (zT)^k)t
           = α(I − zT)^{−1}t + αz ∑_{k=1}^∞ k(zT)^{k−1}Tt
           = α(I − zT)^{−1}t + α ∑_{k=1}^∞ k(zT)^k t
           = α(I − zT)^{−1}t + α((I − zT)^{−2} − (I − zT)^{−1})t,

so that

H^{(1)}(1) = αe + α(I − T)^{−1}e − αe = α(I − T)^{−1}e,

where we used for the last equality again that t = (I − T)e.

Corollary 2.6 The expectation of a discrete phase-type random variable is given by

E(X) = α(I − T)^{−1}e. (2.5)

The factorial moments of a discrete phase-type random variable can be obtained by successive differentiation of the probability generating function and are given by

E(X(X − 1) ... (X − (k − 1))) = H^{(k)}(1) = k! αT^{k−1}(I − T)^{−k}e.

The formulation in (2.5) of the mean of a discrete phase-type distribution suggests that the matrix (I − T)^{−1} contains the mean number of visits to each state of the Markov chain prior to absorption.

Lemma 2.7 Let U = (I − T)^{−1}. Then u_{ij} gives the mean number of visits to state j prior to absorption, given that the Markov chain initiated in state i.

Proof. Let B_j denote the number of visits to state j prior to absorption. Further, let E_i and P_i denote the expectation, respectively the probability, conditioned on the event X_0 = i. Then

E_i(B_j) = E_i(∑_{n=0}^{τ−1} 1_{{X_n=j}})
 = ∑_{n=0}^∞ P_i(X_n = j, τ − 1 ≥ n)
 = ∑_{n=0}^∞ P_i(X_n = j, τ > n)
 = ∑_{n=0}^∞ (T^n)_{ij}.

Now we can write U = (E_i(B_j))_{ij} to get

U = ∑_{n=0}^∞ T^n = (I − T)^{−1}. □
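Corollary 2.6 and Lemma 2.7 can be illustrated numerically. For the made-up upper-triangular T below, U = (I − T)^{−1} can be computed by hand, and the mean α(I − T)^{−1}e agrees with the truncated sum ∑ n f(n):

```python
import numpy as np

# Illustrative two-state representation with an upper-triangular T.
alpha = np.array([1.0, 0.0])
T = np.array([[0.5, 0.25],
              [0.0, 0.5]])
e = np.ones(2)
t = (np.eye(2) - T) @ e

U = np.linalg.inv(np.eye(2) - T)       # expected visit counts, here [[2, 1], [0, 2]]
mean = float(alpha @ U @ e)            # E(X) = 3 for these numbers

# cross-check the mean against the truncated sum  sum_n n f(n)
approx = sum(n * float(alpha @ np.linalg.matrix_power(T, n - 1) @ t)
             for n in range(1, 400))
```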

2.1.3 Geometric and negative binomial distribution

As a first example of a discrete phase-type distribution we have a look at the phase-type representation of the geometric distribution. Let the random variable X be of discrete phase type with parameters α = 1 and T = (1 − p) for 0 < p < 1. Then it is readily seen that the exit vector is given by t = p, and we can compute the density of X by

f(n) = αT^{n−1}t = (1 − p)^{n−1} p.

Hence X is geometrically distributed with success probability p. The probability generating function is given by

H(z) = α_{m+1} + αz(I − zT)^{−1}t = 1 · z(1 − z(1 − p))^{−1} p = zp / (1 − (1 − p)z).
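In code, the one-state representation reproduces the geometric pmf exactly; a minimal sketch:

```python
import numpy as np

# One-phase representation of a geometric distribution, success prob p.
p = 0.3
alpha, T = np.array([1.0]), np.array([[1.0 - p]])
t = np.array([p])                       # exit vector t = (I - T)e = p

def f(n):
    # f(n) = alpha T^(n-1) t, which should equal (1-p)^(n-1) p
    return float(alpha @ np.linalg.matrix_power(T, n - 1) @ t)
```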

We can generalize the previous example by considering the (m + 1) × (m + 1) transition probability matrix

P = [ 1−p  p    0   ...  0    0    0 ]
    [ 0    1−p  p   ...  0    0    0 ]
    [ ...                            ]
    [ 0    0    0   ...  1−p  p    0 ]
    [ 0    0    0   ...  0    1−p  p ]
    [ 0    0    0   ...  0    0    1 ].   (2.6)

If we take the initial probability vector (1, 0, ..., 0) we have a Markov chain that runs through all the states, and makes a transition to the following state according to a geom(p) distribution. The discrete phase-type distribution represented by the first m rows and columns of P is the sum of m geometric distributions, which is a negative binomial distribution. The density g(n) of a negative binomial distribution with success probability p and m successes is given by

g(n) = C(n + m − 1, m − 1) (1 − p)^n p^m

and has support on the non-negative integers. This density counts the number of failures n until m successes have occurred. The phase-type representation in (2.6), however, counts all trials. Hence the density corresponding to this representation is given by

f(n) = C(n − 1, m − 1) (1 − p)^{n−m} p^m,

which is zero on the first m − 1 integers, as C(n − 1, m − 1) is zero for n < m. A further generalization is possible by taking the sum of m geometric distributions that each have a different success probability p_k for 1 ≤ k ≤ m. The resulting distribution is called the generalized negative binomial distribution. The corresponding transition probability matrix is given by

P = [ 1−p_1  p_1    0    ...  0          0        0   ]
    [ 0      1−p_2  p_2  ...  0          0        0   ]
    [ ...                                             ]
    [ 0      0      0    ...  1−p_{m−1}  p_{m−1}  0   ]
    [ 0      0      0    ...  0          1−p_m    p_m ]
    [ 0      0      0    ...  0          0        1   ],

and the initial probability vector is again given by (1, 0, ..., 0).
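The negative binomial construction from (2.6) can be verified by building the m-phase chain explicitly; a sketch with illustrative values of m and p:

```python
import numpy as np
from math import comb

# m geom(p) phases in series, started in phase 1 (cf. (2.6)).
m, p = 3, 0.4
T = np.diag([1 - p] * m) + np.diag([p] * (m - 1), k=1)
alpha = np.zeros(m)
alpha[0] = 1.0
t = (np.eye(m) - T) @ np.ones(m)        # exit only from the last phase, with prob p

def f(n):
    return float(alpha @ np.linalg.matrix_power(T, n - 1) @ t)

# f(n) should equal C(n-1, m-1) (1-p)^(n-m) p^m and vanish for n < m.
```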

2.1.4 Non-uniqueness of representation

The representation of a discrete phase-type distribution is not unique. Consider the phase-type density f(n) with parameters α and T, where we take T to be irreducible. Since T is a sub-stochastic matrix and the matrix (I − T) is invertible, we know that T has at least one eigenvalue λ with 0 < λ < 1, with a corresponding non-negative left eigenvector ν. We can normalize ν such that νe = 1. If we now take α = ν we get

f(n) = νT^{n−1}t = λ^{n−1}νt = λ^{n−1}ν(I − T)e = λ^{n−1}(νe − νTe) = λ^{n−1}(1 − λ).

It follows that the geometric density with success probability (1 − λ) can be represented as a discrete phase-type density both with parameters (ν,T ) of dimension m, say, and with parameters (1, λ) of dimension 1.
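A numerical illustration of this non-uniqueness: the matrix T below is chosen (made up) so that ν = (1/2, 1/2) is a normalized left eigenvector with eigenvalue λ = 1/2, and the two-state representation (ν, T) indeed yields the geometric density with success probability 1 − λ:

```python
import numpy as np

# nu T = 0.5 nu and nu e = 1 for this illustrative T.
T = np.array([[0.2, 0.3],
              [0.3, 0.2]])
nu = np.array([0.5, 0.5])
lam = 0.5
t = (np.eye(2) - T) @ np.ones(2)

def f(n):
    # density of the two-state representation (nu, T)
    return float(nu @ np.linalg.matrix_power(T, n - 1) @ t)

# (nu, T) of dimension 2 and (1, lam) of dimension 1 represent the
# same geometric density with success probability 1 - lam.
```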

2.1.5 Properties of discrete phase-type distributions

In this section we will look at some properties for the set of discrete phase-type distributions. We will give only probabilistic proofs for these properties. The analytic proofs can be found in Section 4.4 on properties for matrix-geometric distributions.

Property 2.1 Any probability density on a finite number of positive integers is of discrete phase-type.

Proof. Let {g(i)}_{i∈I} be a probability density on the positive integers with finite support, hence I ⊂ N is finite. Then we can make a Markov chain {X_n}_{n∈N≥0} on a

The underlying Markov chain {Xn}n∈N≥0 will consist of five states (labelled 1, 2, 3, 4, 5 and visited in that order), where the fifth state is the absorbing state, and hence S will be of order four. We take β = (1, 0, 0, 0) and denote by s the exit vector of the phase-type representation, which is given by (I − S)e. The Markov chain initiates in state one with probability 1, P(X0 = 1) = 1, and since the probability of absorption in one step should equal g(1) we get 1 2 1 P(X1 = 5) = g(1) = 3 , and hence P(X1 = 2) = 3 . This gives s1 = 3 and 2 1 S1,2 = 3 . From state 2 the probability of absorption must be g(2) = 6 . We find 2 1 1 3 P(X1 = 2)P(X2 = 5) = 3 s2 = 6 which leads to s2 = 4 and hence S2,3 = 4 . Continuing this way, we find that the transition probability matrix  2 1  0 3 0 0 3 3 1  0 0 4 0 4   5 1  P =  0 0 0   6 6   0 0 0 0 1  0 0 0 0 1 describes the underlying Markov process of the discrete phase-type density g(n), and leads to the representation  2  0 3 0 0 3 0 0 4 0 β = (1, 0, 0, 0) and S =  5  . 0 0 0 6  0 0 0 0

Property 2.1 is of limited use in practice, unless the set I is small. As a consequence of this first property, the discrete phase-type distributions are dense in the set of distributions on the non-negative integers. This means that for any probability distribution F on N≥0 there is a sequence F_n of discrete phase-type distributions that converges in distribution to F (this convergence is called weak convergence).

Property 2.2 The convolution of a finite number of discrete phase-type densities is itself of phase type.

Proof. We will prove this property for two phase-type distributions. Let X and Y be two independent phase-type random variables with distributions represented by (α, T) and (β, S) respectively. We let T be of order m and S of order k, and denote the exit vectors of the two densities by t and s respectively. Consider the transition probability matrix

P = [ T  tβ  β_{k+1}t ]
    [ 0  S   s        ]
    [ 0  0   1        ]

and initial probability vector γ = (α, α_{m+1}β, α_{m+1}β_{k+1}). The matrix P describes a discrete-time Markov chain with m + k + 1 phases, where the first m transient states are visited according to initial probability vector α and transition probability matrix T, and they are left with the probabilities given in the exit vector t. The next k states are entered with the initial probabilities given in β and visited according to the matrix S. Absorption from those states into state m + k + 1 takes place with the exit probabilities given in s. With probability α_{m+1} the chain is entered in state m + 1, and with probability α_{m+1}β_{k+1} the chain is entered immediately in the absorbing state m + k + 1. If we denote the upper left part of P by V,

V = [ T  tβ ]
    [ 0  S  ],

we see that (γ, V) is a representation for the discrete phase-type random variable Z = X + Y. □
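The construction in this proof can be verified numerically in the simplest case of two one-phase (geometric) variables; the parameters below are illustrative:

```python
import numpy as np

# Two independent geometric variables, no atoms at zero.
alpha, T = np.array([1.0]), np.array([[0.5]])     # X ~ geom(1/2)
beta,  S = np.array([1.0]), np.array([[0.25]])    # Y ~ geom(3/4)
t = (np.eye(1) - T) @ np.ones(1)
s = (np.eye(1) - S) @ np.ones(1)

# (gamma, V) from the proof of Property 2.2; alpha_{m+1} = beta_{k+1} = 0 here.
V = np.block([[T, np.outer(t, beta)],
              [np.zeros((1, 1)), S]])
gamma = np.concatenate([alpha, np.zeros(1)])
v = (np.eye(2) - V) @ np.ones(2)

def dens(n, a, M, x):
    return float(a @ np.linalg.matrix_power(M, n - 1) @ x)

def direct_convolution(n):
    # P(X + Y = n) computed directly from the two densities
    return sum(dens(k, alpha, T, t) * dens(n - k, beta, S, s)
               for k in range(1, n))
```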

Property 2.3 Any finite mixture of probability densities of phase type is itself of phase type.

Proof. Let the vector (p_1, p_2, ..., p_k) denote the mixing density. Let the densities f_i(n) be represented by the matrices T_i and initial probability vectors α_i. Then the mixture

f(n) = ∑_{i=1}^k p_i f_i(n)

can be represented by the transition probability matrix

P = [ T_1  0    0   ...  0    t_1 ]
    [ 0    T_2  0   ...  0    t_2 ]
    [ ...                         ]
    [ 0    0    0   ...  T_k  t_k ]
    [ 0    0    0   ...  0    1   ]

and initial probability vector β = (p_1α_1, ..., p_kα_k, α_{m+1}). □

For two matrices A and B of size k × m and p × q respectively, the Kronecker product A ⊗ B is given by

A ⊗ B = [ a_{11}B  ...  a_{1m}B ]
        [ ...           ...     ]
        [ a_{k1}B  ...  a_{km}B ].

Hence A ⊗ B has size kp × mq. We can use the Kronecker product to express the maximum and minimum of two discrete phase-type variables.

Property 2.4 Let X and Y be discrete phase-type distributed with parameters (α, T) and (β, S) respectively. Then U = min(X, Y) is discrete phase-type distributed with parameters (α ⊗ β, T ⊗ S), and V = max(X, Y) is discrete phase-type distributed with parameters

γ = (α ⊗ β, β_{k+1}α, α_{m+1}β),

L = [ T ⊗ S  I ⊗ s  t ⊗ I ]
    [ 0      T      0     ]
    [ 0      0      S     ].

Proof. In order to prove that the minimum U is of discrete phase type we have to find a Markov chain where the time until absorption has the same distribution as the minimum of X and Y. This minimum is the time until the first of the two underlying Markov chains of X and Y gets absorbed. We make a new Markov chain that keeps track of the state of both underlying Markov chains of X and Y. The state space of this new Markov chain consists of all combinations of states of X and Y. Hence, if X has order m and Y has order k, we have the new state space {(1, 1), (1, 2), ..., (1, k), (2, 1), (2, 2), ..., (2, k), ..., (m, 1), (m, 2), ..., (m, k)}. The mk × mk transition probability matrix corresponding to this new Markov chain is given by T ⊗ S, since each entry t_{ij}S gives, for a transition in the Markov chain corresponding to (α, T), all possible transitions in the Markov chain corresponding to (β, S). The corresponding initial probability vector is given by α ⊗ β. The maximum V is the time until both underlying Markov chains of X and Y get absorbed. This means that after one of the Markov chains gets absorbed we need to keep track of the second Markov chain until it too gets absorbed. Hence we need a new Markov chain with mk + m + k states, where the first mk states are the same as in the case of the minimum above. Then with the probabilities in the exit vector s the Markov chain of Y gets absorbed, and our new Markov chain continues in the states of X. And with the probabilities in the exit vector t the Markov chain of X gets absorbed and our new Markov chain continues in the states of Y. Hence the matrix L describes the transition probabilities in the new Markov chain. If one of the Markov chains initiates in the absorbing state, we only have to keep track of the remaining Markov chain. Hence the initial probability vector corresponding to the matrix L is given by γ. □
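For the minimum, the Kronecker construction can be checked in the simplest case: the minimum of two independent geometric variables, which must again be geometric. A sketch with illustrative p and q:

```python
import numpy as np

# Two independent geometric variables with success probs p and q.
p, q = 0.3, 0.5
alpha, T = np.array([1.0]), np.array([[1 - p]])
beta,  S = np.array([1.0]), np.array([[1 - q]])

# Property 2.4: min(X, Y) ~ DPH(alpha ⊗ beta, T ⊗ S).
gamma = np.kron(alpha, beta)
M = np.kron(T, S)                       # here the scalar (1-p)(1-q)
u = (np.eye(1) - M) @ np.ones(1)

def f_min(n):
    return float(gamma @ np.linalg.matrix_power(M, n - 1) @ u)

# the minimum of two independent geometrics is geometric with
# success probability r = 1 - (1-p)(1-q)
r = 1 - (1 - p) * (1 - q)
```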

The last property is about the N-fold convolution of a discrete phase-type distri- bution, where N is also discrete phase-type distributed. For ease of computation we will assume in the following that the probability of initiating in the absorbing state is zero.

Property 2.5 Let Xi ∼ DPH(α,T ) i.i.d. and N ∼ DPH(β,S). Then the N-fold convolution of the Xi,

Z = ∑_{i=1}^N X_i

is again discrete phase-type distributed, with representation (γ, V) where

γ = (α ⊗ β) and V is the upper left part of the transition probability matrix

P = [ T ⊗ I + (tα) ⊗ S   t ⊗ s ]
    [ 0                  1     ],

where the identity matrix has the same dimension as the matrix S.

Proof. The interpretation of the matrix V = (T ⊗ I + (tα) ⊗ S) is that first the transient states of the Markov chain corresponding to the Xi are visited according to the transition probabilities in T . When this Markov chain gets absorbed with the probabilities given in t, the Markov chain underlying N is visited according to S. If absorption in this second Markov chain does not take place, the first Markov chain will be revisited with initial probabilities given in α. The new system corresponding to V is absorbed when both the first and the second Markov chain get absorbed. 

We will make this probabilistic proof more clear with the following example.

Example 2.2 Let X_i ∼ DPH(α, T) be represented by the following vector and matrix:

α = (1, 0),   T = [ 1/2  1/3 ]
                  [ 1/3  1/3 ],

and let N ∼ DPH(β, S) be geometrically distributed with success probability p ∈ (0, 1), hence β = 1, S = (1 − p).

This leads to the exit vectors t = (1/6, 1/3)^⊤ and s = p. The underlying Markov chains for the X_i and for N are shown in Figure 2.1.

Figure 2.1: The Markov chains corresponding to X_i and N: (I) the underlying Markov chain of X_i; (II) the underlying Markov chain of N.

Using γ = (α ⊗ β) and V = (T ⊗ I + (tα) ⊗ S) we get for the representation of Z:

γ = (1, 0),   V = [ 1/2 + (1/6)(1−p)   1/3 + 0·(1−p) ]
                  [ 1/3 + (1/3)(1−p)   1/3 + 0·(1−p) ],   v = (p/6, p/3)^⊤.

The interpretation of this N-fold convolution of the X_i is that first states 1 and 2 of (I) are visited, and when the absorbing state 3 is reached the Markov chain given in (II) is started. With probability (1−p) this Markov chain chooses to go for another round in (I), and with probability p it gets absorbed in state b, after which the total system is absorbed. We can think of the Markov chain underlying Z as a combination of (I) and (II) in which the states 3 and a are left out, since the probabilities of going from state 1 or 2 through state 3 and state a back to state 1 are equal to the new probabilities, given in the matrix V, of going directly from state 1 or 2 to state 1. Figure 2.2 depicts this Markov chain underlying the random variable Z.

Figure 2.2: Underlying Markov chain of Z.
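A small NumPy sketch (with the arbitrary choice p = 0.4) confirms that the representation (γ, V) of Example 2.2 is stochastic and reproduces the first convolution probabilities, which can also be computed by conditioning on N directly.

```python
import math
import numpy as np

p = 0.4  # success probability of N; any p in (0, 1) works here

# Example 2.2 data: X_i ~ DPH(alpha, T), N ~ DPH(beta, S) geometric(p).
alpha = np.array([1.0, 0.0])
T = np.array([[1/2, 1/3],
              [1/3, 1/3]])
t = np.array([1/6, 1/3])          # exit vector of the X_i chain
S = np.array([[1 - p]])           # 1x1, so the Kronecker products stay small

# Representation (gamma, V) of Z = X_1 + ... + X_N from Property 2.5.
gamma = np.kron(alpha, np.array([1.0]))
V = np.kron(T, np.eye(1)) + np.kron(np.outer(t, alpha), S)
v = np.kron(t, np.array([p]))     # exit vector t (x) s with s = p

# Sanity check: V and v together are stochastic, V e + v = e.
assert np.allclose(V @ np.ones(2) + v, np.ones(2))

# P(Z = 1) = P(N = 1) P(X = 1) = p * 1/6.
assert math.isclose(gamma @ v, p / 6)

# P(Z = 2) = P(N=1) f_X(2) + P(N=2) f_X(1)^2 = 7p/36 + p(1-p)/36.
assert math.isclose(gamma @ V @ v, 7 * p / 36 + p * (1 - p) / 36)
```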

2.2 Continuous phase-type distributions

In analogy with the discrete case, continuous phase-type distributions (or simply phase-type distributions) are defined as the time until absorption in a continuous-time Markov chain.

Let {X_t}_{t≥0} be a continuous-time Markov chain on the state space E = {1, 2, ..., m, m+1}, where we again take state m+1 as the absorbing state. The intensity matrix of this process is given by

Λ = [ T  t ]
    [ 0  0 ].

Here T is an m × m sub-intensity matrix, and since each row of an intensity matrix must sum to zero we have t = −Te. The initial distribution of this process will again be denoted by (α, α_{m+1}) with αe + α_{m+1} = 1.

Definition 2.8 (Phase-type distribution) A random variable τ has a phase-type distribution if τ is the time until absorption in a continuous-time Markov chain, τ := inf{t > 0 : X_t = m+1}.

We write τ ∼ PH(α, T), where α is the initial probability vector and T the sub-intensity matrix of the transient states of the Markov chain. The transition probability matrix of the Markov chain at time t is given by P^t = e^{Λt}. For the transition probabilities in the transient states of the Markov chain we have

P(X_t = j, t ≤ τ | X_0 = i) = (e^{Tt})_{ij}.

Lemma 2.9 The density of a phase-type random variable τ is given by

f(t) = αe^{Tt}t,  t > 0,    (2.7)

and f(0) = α_{m+1}.

Proof. We have f(t)dt = P (τ ∈ (t, t + dt]). By conditioning on the initial state of the Markov chain, i, and on the state at time t, j, we get

f(t)dt = Σ_{i,j=1}^{m} P(τ ∈ (t, t+dt] | X_t = j, X_0 = i) P(X_t = j | X_0 = i) P(X_0 = i)
       = Σ_{i,j=1}^{m} P(τ ∈ (t, t+dt] | X_t = j) (P^t)_{ij} α_i.

The probability of absorption in the time interval (t, t+dt] when X_t = j is given by t_j dt, where t_j is the j-th element of the exit vector t. Also, for all states i, j ∈ {1, ..., m} we have (P^t)_{ij} = (e^{Tt})_{ij}, which leads to

f(t)dt = Σ_{i,j=1}^{m} α_i (e^{Tt})_{ij} t_j dt = αe^{Tt}t dt.

As in the discrete case we have f(0) = α_{m+1}, which is the probability of starting in the absorbing state. □

The continuous analogue of the probability generating function is the Laplace transform. For a non-negative random variable X with density f(x) the Laplace transform is defined as

L(s) = E[e^{−sX}] = ∫_0^∞ e^{−sx} f(x) dx.

Without proof we state the following:

Corollary 2.10 The distribution of a phase-type random variable is given by

F(t) = 1 − αe^{Tt}e.    (2.8)

The Laplace-transform of a phase-type random variable X is given by

L(s) = α_{m+1} + α(sI − T)^{−1}t.    (2.9)

Note that analogously to the discrete case the Laplace transform of a phase-type distribution is a rational function in s.
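Formula (2.9) can be checked numerically against a direct integration of E[e^{−sτ}]. The Erlang-2 representation below is an assumed illustrative example, for which the transform is known in closed form.

```python
import numpy as np
from scipy.integrate import quad
from scipy.linalg import expm

# Erlang-2 with rate lam as a phase-type distribution (assumed example).
lam = 1.5
alpha = np.array([1.0, 0.0])
T = np.array([[-lam, lam],
              [0.0, -lam]])
t = -T @ np.ones(2)               # exit rate vector, t = -T e

s = 0.7                           # evaluate the transform at some s > 0

# Formula (2.9): L(s) = alpha_{m+1} + alpha (sI - T)^{-1} t; here alpha_{m+1} = 0.
L_formula = alpha @ np.linalg.solve(s * np.eye(2) - T, t)

# Direct numerical integration of E[e^{-s tau}] with density f(x) = alpha e^{Tx} t.
L_quad, _ = quad(lambda x: np.exp(-s * x) * alpha @ expm(T * x) @ t, 0, 50)

assert np.isclose(L_formula, L_quad, atol=1e-6)
# Both should equal (lam/(s+lam))^2, the Erlang-2 Laplace transform.
assert np.isclose(L_formula, (lam / (s + lam)) ** 2)
```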

An important property of continuous phase-type distributions that does not hold in the discrete case is that a continuous phase-type density is strictly positive on (0, ∞). This can be seen by rewriting the matrix e^{Tt} using uniformization. If we take a phase-type density f(t) = αe^{Tt}t with irreducible representation (α, T), we can write

λ = −min_i T_ii  and  T = λ(K − I),  where  K = I + λ^{−1}T.

The matrix K is entrywise non-negative and sub-stochastic, since

Σ_j K_ij = Σ_j T_ij/λ + 1 ≤ 1,

and its entries lie in [0, 1]:

K_ij = T_ij/λ = −T_ij/min_i T_ii ∈ [0, 1] for i ≠ j,   K_ii = T_ii/λ + 1 = −T_ii/min_i T_ii + 1 ∈ [0, 1].

We find for e^{Tt}:

e^{Tt} = e^{λ(K−I)t} = e^{−λt} Σ_{i=0}^∞ (λt)^i K^i / i!,

which is a non-negative matrix. If the matrix K^i is irreducible from some i on (hence the matrix T is irreducible), this matrix will be strictly positive. If this is not the case, there will be states in the underlying Markov chain that cannot be reached from certain other states. However, the choice of α will make sure that all states can be reached with positive probability, as we have chosen the representation (α, T) to be irreducible (meaning we left out all the states that cannot be reached). Hence αe^{Tt} is a strictly positive row vector, and multiplying this vector with the non-negative, non-zero column vector t ensures that αe^{Tt}t > 0 for all t > 0.
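The uniformization identity can be sketched in NumPy. The sub-intensity matrix below is an assumed example, and the series is truncated at 60 terms, which is ample for the chosen λt.

```python
import numpy as np
from scipy.linalg import expm

# A sub-intensity matrix T (assumed example) and a time point.
T = np.array([[-2.0, 1.0, 0.5],
              [0.3, -1.5, 0.2],
              [0.0, 0.4, -1.0]])
time = 0.8

lam = -np.min(np.diag(T))         # lambda = -min_i T_ii
K = np.eye(3) + T / lam           # K = I + T/lambda, entrywise non-negative
assert np.all(K >= 0)

# e^{Tt} = e^{lambda(K-I)t} = e^{-lambda t} sum_i (lambda t)^i K^i / i!
series = np.zeros((3, 3))
term = np.eye(3)                  # (lambda t)^0 K^0 / 0!
for i in range(1, 60):
    series = series + term
    term = term @ K * (lam * time) / i
series *= np.exp(-lam * time)

assert np.allclose(series, expm(T * time), atol=1e-10)
assert np.all(series >= 0)        # hence alpha e^{Tt} t >= 0 as well
```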

Example 2.3 If we take a phase-type density with representation

α = (1, 0, 0),   T = [ −λ   λ   0 ]
                     [  0  −λ   λ ]
                     [  0   0  −λ ]   and   t = (0, 0, λ)^⊤,

then we have a continuous phase-type distribution where each phase is visited an exp(λ)-distributed time before the process enters the next phase. Since there are three transient states, this is the sum of three exponential distributions and hence an Erlang-3 distribution. The density of this distribution is given by

f(t) = αe^{Tt}t
     = (1, 0, 0) [ e^{−λt}   e^{−λt}λt   (1/2)e^{−λt}λ²t² ] [ 0 ]
                 [ 0         e^{−λt}     e^{−λt}λt        ] [ 0 ]
                 [ 0         0           e^{−λt}          ] [ λ ]
     = λe^{−λt}(λt)²/2,

which we recognize as the Erlang-3 density.

Chapter 3

Matrix-exponential distributions

This chapter is about matrix-exponential distributions. They are distributions with a density of the same form as continuous phase-type distributions, f(t) = αe^{Tt}t, but they do not necessarily possess the probabilistic interpretation as the time until absorption in a continuous-time finite-state Markov chain. In this chapter we explore some identities and give some examples of matrix-exponential distributions. It serves as a preparation for the next chapter, in which we will study the discrete analogue of matrix-exponential distributions. A thorough introduction to matrix-exponential distributions is given by Asmussen and O'Cinneide in [4].

In Section 3.1 the definition of a matrix-exponential distribution is given, and the equivalent definition as distributions with a rational Laplace transform is addressed. In Section 3.2 we give three examples of matrix-exponential distributions and explain their relationship to phase-type distributions.

3.1 Definition

Definition 3.1 (Matrix-exponential distribution) A random variable X has a matrix-exponential distribution if the density of X is of the form

f(t) = αe^{Tt}t,  t ≥ 0.

Here α and t are a row and column vector of length m and T is an m × m matrix, all with entries in C.

If we denote the class of matrix-exponential distributions by ME and the class of phase-type distributions by PH, we immediately have the relation PH ⊆ ME. In the next section we will show that in fact PH ⊊ ME. The matrix-exponential distributions generalize phase-type distributions in the sense that they need not possess the probabilistic interpretation as the distribution of the time until absorption in a continuous-time Markov chain. This means that the vector α and matrix T are not necessarily an initial probability vector and a sub-intensity matrix, which allows them to have entries in C. As T is no longer a sub-intensity matrix, the relation t = −Te does not necessarily hold. Hence the vector t becomes a parameter, and we write

X ∼ ME (α,T, t) for a matrix-exponential random variable X.

There is an equivalent definition of the class of matrix-exponential distributions, that says that a random variable X has a matrix-exponential distribution if the Laplace transform L(s) of X is a rational function in s. The connection between the rational Laplace transform and density of a matrix-exponential distribution is made explicit in the following proposition, which is taken from Asmussen & Bladt [3].

Proposition 3.2 The Laplace transform of a matrix-exponential distribution can be written as

L(s) = (b_1 + b_2 s + b_3 s² + ... + b_n s^{n−1}) / (s^n + a_1 s^{n−1} + ... + a_{n−1} s + a_n),    (3.1)

for some n ≥ 1 and some constants a_1, ..., a_n, b_1, ..., b_n. From L(0) = 1 it follows that we have a_n = b_1. The distribution has the following representation

f(t) = αe^{Tt}t

where

α = (b_1, b_2, ..., b_n),

T = [ 0      1        0        ...  0 ]
    [ 0      0        1        ...  0 ]
    [ ...                          ... ]
    [ 0      0        0        ...  1 ]
    [ −a_n   −a_{n−1} −a_{n−2} ...  −a_1 ]

and

t = (0, 0, ..., 0, 1)^⊤.    (3.2)

Proof. The proof of this proposition can be found in [3] pages 306-310.

If T is an m × m matrix, the corresponding matrix-exponential distribution is said to be of order m. If we write L(s) = p(s)/q(s), then d = deg(q) is the degree of the matrix-exponential distribution. Note that Proposition 3.2 is of great importance, as it gives a method to find an order-d representation of a matrix-exponential density that has a Laplace transform of degree d.
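A minimal sketch of the construction in Proposition 3.2 follows; the helper name `me_representation` is chosen here for illustration. It is tested on the Erlang-2 transform L(s) = 1/(s+1)², whose density is known to be t·e^{−t}.

```python
import numpy as np
from scipy.linalg import expm

def me_representation(a, b):
    """Companion-form ME representation (alpha, T, t) of Proposition 3.2
    for L(s) = (b1 + ... + bn s^{n-1}) / (s^n + a1 s^{n-1} + ... + an)."""
    n = len(a)
    alpha = np.array(b, dtype=float)
    T = np.zeros((n, n))
    T[:-1, 1:] = np.eye(n - 1)        # superdiagonal of ones
    T[-1, :] = -np.array(a[::-1])     # last row: -an, -a_{n-1}, ..., -a1
    t = np.zeros(n)
    t[-1] = 1.0
    return alpha, T, t

# Erlang-2 with rate 1: L(s) = 1/(s+1)^2 = 1/(s^2 + 2s + 1),
# so a = (a1, a2) = (2, 1) and b = (b1, b2) = (1, 0).
alpha, T, t = me_representation([2.0, 1.0], [1.0, 0.0])
for x in [0.5, 1.0, 2.5]:
    density = alpha @ expm(T * x) @ t
    assert np.isclose(density, x * np.exp(-x))   # Erlang-2 density x e^{-x}
```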

3.2 Examples of matrix-exponential distributions

In this section we will give three examples of matrix-exponential distributions. Our main aim with these examples is to illustrate that the class of matrix-exponential distributions is strictly larger than the class of phase-type distributions, and that it is sometimes more convenient to use a matrix-exponential representation of a distribution because it can be represented with a lower-order representation (i.e. the matrix T has a lower dimension). To be able to say whether or not a distribution is of phase-type we use the following characterization theorem for phase-type distributions by C.A. O'Cinneide [14].

Theorem 3.3 (Characterization of phase-type distributions) A distribu- tion on [0, ∞) with rational Laplace-Stieltjes transform is of phase-type if and only if it is either the point mass at zero, or (a) it has a continuous positive density on the positive reals, and (b) its Laplace-Stieltjes transform has a unique pole of maximal real part.

The proof of this theorem is far from simple and can be found in [14]. Condition (a) of the characterization, stating that a phase-type density is always positive on (0, ∞), has already been addressed in Chapter 2 on page 20. Condition (b) implies that a distribution can only be of phase-type if the sinusoidal effects in the tail behavior of the density of a phase-type distribution (which cause two conjugate complex poles in the Laplace-Stieltjes transform) decay at a faster exponential rate than the dominant effects.

The first example is of a matrix-exponential distribution that satisfies neither Condition (a) nor Condition (b) of Theorem 3.3, and hence is not of phase-type.

Example 3.1 A distribution that is genuinely matrix-exponential (hence not of phase-type) is represented by the following density:

f(t) = Ke^{−λt}(1 − cos(ωπt)),  t ≥ 0,

for λ and ω real coefficients and K a scaling coefficient given by

K = λ(λ² + π²ω²)/(π²ω²).

Figure 3.1 shows this density for λ = 1 and ω = 2, which implies K = 1 + 1/(4π²).

This density is zero whenever t equals 2n/ω for n ∈ N, and therefore violates Condition (a) of Theorem 3.3. By rewriting this density using the identity cos x = (e^{ix} + e^{−ix})/2 we find a matrix-exponential representation for this density:

f(t) = Ke^{−λt}(1 − cos(ωπt))
     = Ke^{−λt}(1 − (e^{iωπt} + e^{−iωπt})/2)
     = Ke^{−λt} − (K/2)e^{−t(λ−iωπ)} − (K/2)e^{−t(λ+iωπ)}
     = (1, −1, −1) [ e^{−λt}   0              0             ] [ K   ]
                   [ 0         e^{−t(λ−iωπ)}  0             ] [ K/2 ]
                   [ 0         0              e^{−t(λ+iωπ)} ] [ K/2 ].

Figure 3.1: Density plot of f(t) = (1 + 1/(4π²)) e^{−t} (1 − cos(2πt)).

The Laplace transform of f(t) is given by

L(s) = K ∫_0^∞ e^{−st} e^{−λt}(1 − cos(ωπt)) dt
     = Kπ²ω² / ((s+λ)(s+λ−iπω)(s+λ+iπω))
     = Kπ²ω² / (s³ + 3λs² + (π²ω² + 3λ²)s + π²ω²λ + λ³).    (3.3)

This is a rational expression in s, which again shows that f(t) is a matrix-exponential density. The poles of the Laplace transform are −λ, −λ + iπω and −λ − iπω, which all have real part −λ. Hence this distribution also violates Condition (b) of the characterization theorem for phase-type distributions. Now, by using Proposition 3.2 and Formula (3.3), we find that another matrix-exponential representation for f(t) = αe^{Tt}t is given by

α = (Kπ²ω², 0, 0),

T = [ 0              1             0   ]
    [ 0              0             1   ]
    [ −π²ω²λ − λ³    −π²ω² − 3λ²   −3λ ]

and

t = (0, 0, 1)^⊤.
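The two representations of Example 3.1, the complex diagonal one and the real companion form, can be compared numerically; for λ = 1, ω = 2 both should agree with the closed-form density.

```python
import numpy as np
from scipy.linalg import expm

lam, om = 1.0, 2.0
K = lam * (lam**2 + np.pi**2 * om**2) / (np.pi**2 * om**2)

# Complex diagonal representation found by expanding cos via e^{ix}.
alpha_c = np.array([1.0, -1.0, -1.0])
T_c = np.diag([-lam, -(lam - 1j*np.pi*om), -(lam + 1j*np.pi*om)])
t_c = np.array([K, K/2, K/2])

# Real companion representation from Proposition 3.2 and Formula (3.3).
alpha_r = np.array([K * np.pi**2 * om**2, 0.0, 0.0])
T_r = np.array([[0.0, 1.0, 0.0],
                [0.0, 0.0, 1.0],
                [-(np.pi**2*om**2*lam + lam**3),
                 -(np.pi**2*om**2 + 3*lam**2), -3*lam]])
t_r = np.array([0.0, 0.0, 1.0])

for x in [0.25, 1.0, 2.7]:
    f = K * np.exp(-lam*x) * (1 - np.cos(om*np.pi*x))   # closed form
    f_c = (alpha_c @ expm(T_c * x) @ t_c).real
    f_r = alpha_r @ expm(T_r * x) @ t_r
    assert np.isclose(f, f_c) and np.isclose(f, f_r)
```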

Figure 3.2: Density plot of g(t) = (1/(1 + 1/(2+8π²))) e^{−t} (1 + (1/2)cos(2πt)).

A slight change in this first example leads to a density that violates Condition (b) but not Condition (a) of Theorem 3.3 and hence is again matrix-exponential and not of phase-type.

Example 3.2 The distribution represented by the density

g(t) = (1/(1 + 1/(2+8π²))) e^{−t} (1 + (1/2)cos(2πt)),  t ≥ 0,

is genuinely matrix-exponential. Its Laplace transform is given by

L(s) = K/(s+1) + K/(4(s+1−2πi)) + K/(4(s+1+2πi)),

where K = 1/(1 + 1/(2+8π²)). The poles of the Laplace transform are −1, −1+2πi and −1−2πi, hence there is no unique pole of maximal real part and Condition (b) of Theorem 3.3 is not satisfied. We do have g(t) > 0 for all t ≥ 0. Figure 3.2 shows a plot of this density.

In 'Phase-Type Distributions and Invariant Polytopes' by C.A. O'Cinneide [15], the author states the following theorem on the minimal order of a phase-type distribution.

Theorem 3.4 Let μ be a phase-type distribution with −λ_1 the pole of its transform of maximal real part, and with poles at −λ_2 ± iθ, where θ > 0. Then the order n of μ satisfies

θ/(λ_2 − λ_1) ≤ cot(π/n).

Using the fact that tan x ≥ x for 0 ≤ x < π/2, this implies that

n ≥ πθ/(λ_2 − λ_1).    (3.4)

Using this theorem we can give an example of a distribution that has a lower order when represented as a matrix-exponential density than when represented as a phase-type density.

Example 3.3 This is an elaboration of an example in [15] on page 524. The density

h(t) = (1/(1/2 + 1/(1−ε))) (e^{−(1−ε)t} + e^{−t} cos t),  t ≥ 0,

for 0 < ε < 1 has Laplace transform

L(s) = 2(−1+ε)(−3+ε − 4s + εs − 2s²) / ((−3+ε)(−1+ε−s)(2 + 2s + s²)).

The denominator of L(s) is of degree three, and hence h(t) has a matrix-exponential representation of order three, which can again be found by using Proposition 3.2. The poles of the Laplace transform are −1−i, −1+i and −1+ε. Thus by Theorem 3.4 and Inequality (3.4) we have for the order m of the phase-type representation of h(t) that

m ≥ π/ε.

Hence we have a phase-type density of degree three but with arbitrarily large order. Note that for ε = 0 this density is not of phase-type, since it then does not have a unique pole of maximal real part. Figure 3.3 shows the density plot of h(t) for ε = 0, 7/9 and 1/2.

In this chapter we have shown that the class of matrix-exponential distributions is strictly larger than the class of phase-type distributions. We have seen that there are phase-type distributions that have a larger order than their degree. In the next chapter we will explore the relation between matrix-geometric distributions, which are the discrete analogues of matrix-exponential distributions, and discrete phase-type distributions.

Figure 3.3: Density plot of h(t) for ε = 0, 7/9 and 1/2.

Chapter 4

Matrix-geometric distributions

This chapter is about the class of matrix-geometric distributions (MG). They are distributions on the non-negative integers with a density of the same form as discrete phase-type distributions,

f(n) = αT^{n−1}t.

However, matrix-geometric distributions do not necessarily possess the probabilistic interpretation as the time until absorption in a discrete-time Markov chain. They can equivalently be defined as distributions with a rational probability generating function. The main question of this thesis, to be answered in this chapter, is how matrix-geometric distributions relate to discrete phase-type distributions. We will show that

DPH ⊊ MG

and that there is a possible order reduction by taking the matrix-geometric representation of a discrete phase-type distribution. Matrix-geometric distributions are mentioned by Asmussen and O'Cinneide in [4] as the discrete equivalent of matrix-exponential distributions, but no explicit example of such a distribution is given. They are used by Akar in [1] to study the queue lengths and waiting times of a discrete-time queueing system, but no explicit examples are given in this article either. Sengupta [17] shows a class of distributions that are obviously in MG, but by the use of a transformation he shows that they are actually in DPH.

In Section 4.1 matrix-geometric distributions are defined. In Section 4.2 we show that there exist distributions that are genuinely matrix-geometric, hence not of discrete phase-type, by the use of the characterization theorem of discrete phase-type distributions stated by O'Cinneide in [14]. In Section 4.3 we will show that there exist phase-type distributions that have a lower degree, and hence a lower-order matrix-geometric representation, than the order of their discrete phase-type representation. Finally, in Section 4.4 we will revisit some properties of discrete phase-type distributions and show that they also hold for the class MG, by proving them analytically.

4.1 Definition

Definition 4.1 (Matrix-geometric distribution) A random variable X has a matrix-geometric distribution if the density of X is of the form

f(n) = αT^{n−1}t,  n ∈ N,

where α and t are a row and column vector of length m and T is an m × m matrix with entries in C.

We have to make a note about the support of these distributions. Matrix- geometric distributions have support on the non-negative integers, and the matrix-geometric density is actually a mixture of the above density defined on n ∈ N and the point mass at zero. For convenience we will assume in this chapter that f(0) = 0.

The following proposition shows that we can equivalently define matrix-geometric distributions as distributions that have a rational probability generating func- tion. It is the discrete equivalent of Proposition 3.2.

Proposition 4.2 The probability generating function of a matrix-geometric distribution can be written as

H(z) = (b_n z + b_{n−1} z² + ... + b_2 z^{n−1} + b_1 z^n) / (1 + a_1 z + a_2 z² + ... + a_{n−1} z^{n−1} + a_n z^n),    (4.1)

for some n ≥ 1 and constants a_1, ..., a_n, b_1, ..., b_n in R. The corresponding density is given by

f(n) = αT^{n−1}t

where

α = (1, 0, ..., 0),

T = [ −a_1      0  0  ...  0  1 ]
    [ −a_n      0  0  ...  0  0 ]
    [ −a_{n−1}  1  0  ...  0  0 ]
    [ ...                       ]
    [ −a_2      0  0  ...  1  0 ]

and

t = (b_n, b_1, b_2, ..., b_{n−1})^⊤.    (4.2)

We call this the canonical form of the matrix-geometric density.

Proof. From Equation (2.9) and Proposition 3.2 we know that we have the following formula for the Laplace transform of a matrix-exponential distribution:

L(s) = α(sI − T)^{−1}t = (b_1 + b_2 s + b_3 s² + ... + b_n s^{n−1}) / (s^n + a_1 s^{n−1} + ... + a_{n−1} s + a_n).

In Proposition 3.2 the corresponding vectors α and t and matrix T are given in terms of the coefficients of the Laplace transform.

The probability generating function of a matrix-geometric distribution is given by H(z) = αz(I − zT)^{−1}t. This formula equals the matrix formula for the Laplace transform with s replaced by 1/z, which leads to the following fractional form of the probability generating function:

H(z) = L(1/z) = (b_1 + b_2/z + ... + b_n/z^{n−1}) / (1/z^n + a_1/z^{n−1} + ... + a_{n−1}/z + a_n)
     = (b_n z + b_{n−1} z² + ... + b_2 z^{n−1} + b_1 z^n) / (1 + a_1 z + a_2 z² + ... + a_{n−1} z^{n−1} + a_n z^n).

The corresponding vectors α and t and matrix T are hence the same as in Proposition 3.2, Formula (3.2). To obtain the canonical form for a matrix-geometric density given in Formula (4.2), we take the transpose of α, t and T and rename the states to obtain α = (1, 0, ..., 0). □

As in the continuous case, the order of a matrix-geometric distribution is said to be m if T is an m × m matrix. If we write H(z) = p(z)/q(z), the degree of a matrix-geometric distribution is given by n = deg(p). It follows from Proposition 4.2 that for a matrix-geometric distribution we can always find a representation that has the same order as the degree of the distribution.
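A sketch of the canonical construction of Proposition 4.2 follows (the function name `mg_canonical` is ours). As a test, the order-2 discrete phase-type distribution of Example 2.2, whose generating function works out to H(z) = (z/6 + z²/18)/(1 − (5/6)z + z²/18), is rebuilt from its coefficients.

```python
import numpy as np

def mg_canonical(a, b):
    """Canonical MG representation (alpha, T, t) of Proposition 4.2 for
    H(z) = (bn z + ... + b1 z^n)/(1 + a1 z + ... + an z^n),
    with a = [a1, ..., an], b = [b1, ..., bn] and n >= 2."""
    n = len(a)
    alpha = np.zeros(n)
    alpha[0] = 1.0
    T = np.zeros((n, n))
    T[0, 0] = -a[0]                # first row: -a1, 0, ..., 0, 1
    T[0, n - 1] = 1.0
    for r in range(1, n):
        T[r, 0] = -a[n - r]        # -an, -a_{n-1}, ..., -a2 down column 1
        if r >= 2:
            T[r, r - 1] = 1.0      # shifted subdiagonal of ones
    t = np.zeros(n)
    t[0] = b[n - 1]                # t = (bn, b1, b2, ..., b_{n-1})
    t[1:] = b[:-1]
    return alpha, T, t

# Coefficients of Example 2.2: a = [-5/6, 1/18], b = [b1, b2] = [1/18, 1/6].
alpha_c, T_c, t_c = mg_canonical([-5/6, 1/18], [1/18, 1/6])

alpha_d = np.array([1.0, 0.0])
T_d = np.array([[1/2, 1/3], [1/3, 1/3]])
t_d = np.array([1/6, 1/3])

for n in range(1, 11):
    f_c = alpha_c @ np.linalg.matrix_power(T_c, n - 1) @ t_c
    f_d = alpha_d @ np.linalg.matrix_power(T_d, n - 1) @ t_d
    assert np.isclose(f_c, f_d)
```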

If a random variable X has a matrix-geometric distribution with parameters α, T and t, we write X ∼ MG(α, T, t). Although the relation Te + t = e does not necessarily hold, it is always possible to find a representation that does satisfy this relation. For a non-singular matrix M we can write for the density f(n) of X:

f(n) = αT^{n−1}t = αMM^{−1}T^{n−1}MM^{−1}t = αM(M^{−1}TM)^{n−1}M^{−1}t.

For this density we now want the relation

M −1TMe + M −1t = e to hold, which can found to be true when M satisfies the relation

Me = (I − T )−1t.
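The normalization can be illustrated with a diagonal choice of M, which works whenever the entries of (I − T)^{−1}t are nonzero; the representation below is an arbitrary assumed example, used only to show the transformation.

```python
import numpy as np

# A matrix-geometric representation for which T e + t != e (assumed example).
alpha = np.array([0.5, 0.5])
T = np.array([[0.4, 0.5],
              [0.05, 0.7]])
t = np.array([0.2, 0.1])

# Choose M with M e = (I - T)^{-1} t; a diagonal M does the job
# as long as the entries of (I - T)^{-1} t are nonzero.
w = np.linalg.solve(np.eye(2) - T, t)
M = np.diag(w)
Minv = np.linalg.inv(M)

alpha2, T2, t2 = alpha @ M, Minv @ T @ M, Minv @ t

# The transformed representation satisfies T2 e + t2 = e ...
assert np.allclose(T2 @ np.ones(2) + t2, np.ones(2))
# ... and reproduces the same values alpha T^{n-1} t.
for n in range(1, 8):
    f1 = alpha @ np.linalg.matrix_power(T, n - 1) @ t
    f2 = alpha2 @ np.linalg.matrix_power(T2, n - 1) @ t2
    assert np.isclose(f1, f2)
```

The identity T(Me) + t = (I − (I − T))Me + t = Me − t + t = Me shows why row sums come out right: M^{−1}TMe + M^{−1}t = M^{−1}Me = e.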

4.2 Existence of genuine matrix-geometric distributions

From the definition of matrix-geometric distributions it is immediately clear that the relation DPH ⊆ MG must hold. The question arises, though, whether or not DPH ⊊ MG. In the continuous case it was easy to find an example of a distribution that is matrix-exponential but not of phase-type, as phase-type densities can never be zero on (0, ∞). But this is not the case for discrete phase-type distributions, as they can have a periodicity which allows them to be zero with a certain period d.

In 'Characterization of phase-type distributions' by C.A. O'Cinneide [14], the following characterization of discrete phase-type distributions is given.

Theorem 4.3 (Characterization of discrete phase-type distributions) A probability measure µ on {0, 1,...} with rational generating function is of phase type if and only if for some positive integer ω each of the generating functions

H_{μ,k}(z) = Σ_{n=0}^∞ μ(ωn + k) z^n,    (4.3)

for k = 0, 1, ..., ω − 1, is either a polynomial or has a unique pole of minimal absolute value.

The proof of this theorem can be found in [14], pages 49-56. Here, we will give an outline of the proof.

Necessity. Let μ be discrete phase-type. Then either μ is the point mass at zero, for which Theorem 4.3 holds with ω = 1, or μ has an irreducible representation (α, T) of positive order. In the latter case, we let {X_n}_{n∈N} be the Markov chain corresponding to this representation. Let ω be the least common multiple of the periods of all the periodic classes of the chain X. Let τ be the time until absorption of the Markov chain, so τ has distribution μ. If P(τ = k mod ω) = 0, then H_{μ,k}(z) is identically zero and hence a polynomial. If however P(τ = k mod ω) > 0, we can define a new Markov chain {Y_n} = {X_{k+ωn}}, which, given that τ = k mod ω, is an absorbing Markov chain with one absorbing state and possibly fewer transient states than our original chain, since we can leave out the states that can never be reached. If we denote by α_k the initial distribution of Y_0 and by S_k the transition probability matrix of its transient states, then (α_k, S_k) is an irreducible representation of the conditional distribution of (τ−k)/ω given τ = k mod ω. By the definition of ω, the transition matrix S_k has no periodic states. If we call this distribution μ_k, we have H_{μ_k}(z) = H_{μ,k}(z)/H_{μ,k}(1) for all z.

In order to prove that H_{μ,k} has a unique pole of minimal absolute value it now suffices to prove that H_{μ_k}(z) has one, since H_{μ,k}(z) is just a positive multiple of H_{μ_k}(z) and their poles are equal. Let ζ_k denote the unique eigenvalue of S_k of largest absolute value, whose existence is assured by the Perron-Frobenius theorem (see e.g. [8], page 240). The uniqueness of ζ_k is a consequence of S_k being aperiodic, which implies that it cannot have non-real eigenvalues of maximal modulus (see Thm. 3.3.1 in [11]). Since S_k is a transition probability matrix, ζ_k is less than one. As the poles of H_{μ_k}(z) are the reciprocals of the eigenvalues of S_k, it follows that 1/ζ_k is the unique pole of minimal absolute value of H_{μ_k}(z), and hence of H_{μ,k}(z). The other poles of H_{μ,k}(z) have greater absolute value, which concludes the proof of necessity. □

Sufficiency. Let μ be a probability measure on {0, 1, ...} with rational generating function and with each of the generating functions H_{μ,k}(z) either a polynomial or with a unique pole of minimal absolute value. In [14] it is proven that a probability measure with rational generating function with a unique pole of minimal absolute value is of discrete phase-type. The proof of this statement makes use of invariant polytopes and is beyond the scope of this thesis. However, this statement is the crux of the proof of the sufficiency of the condition to be of discrete phase-type. We can finish the proof with the following argument. The probability measure μ_k with generating function H_{μ_k}(z) = H_{μ,k}(z)/H_{μ,k}(1) is of discrete phase-type, as H_{μ_k}(z) has a unique pole of minimal absolute value. Let τ_k be a random variable with distribution μ_k for k = 0, 1, ..., ω − 1. Then ωτ_k is again of discrete phase-type, as we can modify the state space such that from every original state there is a deterministic series of ω − 1 transitions through ω − 1 new states, after which again an original state is reached. As a constant c can be modelled as a phase-type distribution with a transition matrix of dimension c with T_{i,i+1} = 1 for all states i, we have that ωτ_k + k is also discrete phase-type distributed. Now μ is a mixture of the distributions of ωτ_k + k in proportions H_{μ,k}(1) for k = 0, 1, ..., ω − 1. Hence μ is a finite mixture of discrete phase-type distributions and is so itself of discrete phase-type, which completes the proof of sufficiency. □

Based on this characterization of discrete phase-type distributions we can prove the following theorem:

Theorem 4.4 The class of matrix-geometric distributions is strictly larger than the class of discrete phase-type distributions.

Proof. We prove this theorem by giving an example of a matrix-geometric distribution that does not meet the necessary condition of Theorem 4.3 to be of discrete phase-type. Consider the matrix-geometric distribution with density

g(n) = Kp^{n−1}(1 + cos(√2(n−1))),  n ∈ N,    (4.4)

for K a normalizing constant and p ∈ (0, 1). We can rewrite this density to get it into matrix-geometric form:

g(n) = Kp^{n−1}(1 + cos(√2(n−1)))
     = Kp^{n−1}(1 + (1/2)e^{i√2(n−1)} + (1/2)e^{−i√2(n−1)})
     = Kp^{n−1} + (K/2)(pe^{i√2})^{n−1} + (K/2)(pe^{−i√2})^{n−1}
     = (1, 1, 1) [ p  0          0          ]^{n−1} [ K   ]
                 [ 0  pe^{i√2}   0          ]       [ K/2 ]
                 [ 0  0          pe^{−i√2}  ]       [ K/2 ].    (4.5)

We write as usual g(n) = αT^{n−1}t. According to Theorem 4.3, a necessary condition for this density to be of discrete phase-type is that the probability generating functions H_{g,k}(z) = Σ_{n=0}^∞ g(ωn+k)z^n for some ω ∈ N are either a polynomial or have a unique pole of minimal absolute value for all k = 0, 1, ..., ω − 1. For k = 1, 2, ..., ω − 1 these generating functions are given by

H_{g,k}(z) = Σ_{n=0}^∞ g(ωn+k)z^n
           = Σ_{n=0}^∞ αT^{ωn+k−1}t z^n
           = α (Σ_{n=0}^∞ (zT^ω)^n) T^{k−1}t
           = α(I − zT^ω)^{−1}T^{k−1}t.    (4.6)

For k = 0 we get

H_{g,0}(z) = Σ_{n=0}^∞ g(ωn)z^n
           = g(0) + Σ_{n=1}^∞ αT^{ωn−1}t z^n
           = 0 + αz (Σ_{n=1}^∞ (zT^ω)^{n−1}) T^{ω−1}t
           = αz(I − zT^ω)^{−1}T^{ω−1}t.    (4.7)

We see that the zeros of the denominators of these generating functions are given by the reciprocals of the eigenvalues of the matrix T^ω. We have to check for Formula (4.7) that none of these zeros cancel. We can use the representation of g(n) in (4.5) to find

αz(I − zT^ω)^{−1}T^{ω−1}t = K zp^{ω−1}/(1 − zp^ω) + (K/2) zp^{ω−1}e^{i√2(ω−1)}/(1 − zp^ω e^{i√2ω}) + (K/2) zp^{ω−1}e^{−i√2(ω−1)}/(1 − zp^ω e^{−i√2ω}).

Hence the poles of both (4.6) and (4.7) are given by p^{−ω}, p^{−ω}e^{−i√2ω} and p^{−ω}e^{i√2ω}, which are the reciprocals of the eigenvalues of the matrix

T^ω = [ p^ω  0             0             ]
      [ 0    p^ω e^{i√2ω}  0             ]
      [ 0    0             p^ω e^{−i√2ω} ]

and which are not zeros of the numerator of H_{g,0}(z). These poles all have equal absolute value p^{−ω}. The only possibility for H_{g,k}(z) to have a unique pole of minimal absolute value is for ω = k√2π, because then e^{i√2ω} = e^{k·2πi} = 1 and T^ω has the eigenvalue p^{k√2π} with multiplicity three. However, according to Theorem 4.3, ω must be chosen in N, which proves that it is not possible for H_{g,k}(z) to have a unique pole of minimal absolute value. Finally, we know that H_{g,k}(z) is not the zero polynomial, since g(n) is positive for all n ∈ N, which is a consequence of 1 + cos(√2(n−1)) not having a zero in the integers. We can conclude that the distribution represented by g(n) is genuinely matrix-geometric and that the class of matrix-geometric distributions is strictly larger than the class of discrete phase-type distributions. □

Example 4.1 The density

g(n) = (1 / (21/2 + 19/(362 − 360cos√2))) (9/10)^{n−1} (1 + cos(√2(n−1)))

is genuinely matrix-geometric. Figure 4.1 shows the density plot of g(n).
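The density of Example 4.1 can be checked numerically: it is positive, it sums (up to a negligible geometric tail) to one, and it agrees with the complex diagonal form (4.5).

```python
import numpy as np

p = 9/10
K = 1 / (21/2 + 19/(362 - 360*np.cos(np.sqrt(2))))

def g(n):
    # Closed form of the density of Example 4.1.
    return K * p**(n-1) * (1 + np.cos(np.sqrt(2) * (n - 1)))

# Matrix-geometric form (4.5) with a complex diagonal T.
alpha = np.array([1.0, 1.0, 1.0])
T = np.diag([p, p*np.exp(1j*np.sqrt(2)), p*np.exp(-1j*np.sqrt(2))])
t = np.array([K, K/2, K/2])

ns = np.arange(1, 2001)
# g is a genuine probability density: positive and summing to one.
assert np.all(g(ns) > 0)
assert np.isclose(np.sum(g(ns)), 1.0)
# The matrix form reproduces the closed form.
for n in [1, 2, 7, 30]:
    val = (alpha @ np.linalg.matrix_power(T, n-1) @ t).real
    assert np.isclose(val, g(n))
```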

4.3 Order reduction for matrix-geometric distributions

Example 3.3 of Chapter 3 showed a class of phase-type distributions that have degree three, and hence a matrix-exponential representation of order three, but that need m ≥ π/ε phases for their phase-type representation, which can hence be arbitrarily large as ε ↓ 0. In this section we will show that such an order reduction can also take place when going from a discrete phase-type representation to a matrix-geometric representation.

Figure 4.1: Density plot of g(n) = (1/(21/2 + 19/(362 − 360cos√2))) (9/10)^{n−1}(1 + cos(√2(n−1))).

Theorem 4.5 A distribution represented by the density

h(n) = Kp^{n−1}(1 + cos((n−1)π/m)),  n ∈ N,    (4.8)

for K a normalization constant, p ∈ (0, 1) and m ∈ N, m ≥ 2, is discrete phase-type of order at least 2m. It is matrix-geometric of order 3.

Proof. To prove this theorem we have to prove the following three steps:

1. The density h(n) is matrix-geometric of order three.

2. There is a discrete phase-type representation for h(n) of order 2m.

3. The order-2m representation is minimal: there is no lower-order discrete phase-type representation for h(n).

Proof of 1. The probability generating function of h(n) is given by:

H(z) = Σ_{n=1}^∞ Kp^{n−1}(1 + cos((n−1)π/m)) z^n
     = Kz Σ_{n=1}^∞ (pz)^{n−1} + (Kz/2) (Σ_{n=1}^∞ (pze^{iπ/m})^{n−1} + Σ_{n=1}^∞ (pze^{−iπ/m})^{n−1})
     = Kz/(1−pz) + (Kz/2) · 1/(1 − pze^{iπ/m}) + (Kz/2) · 1/(1 − pze^{−iπ/m}).    (4.9)

This is a rational function in z of degree three. Hence h(n) has an order three matrix-geometric representation (see also Example 4.2).

Proof of 2. The density h(n) is zero for n = m+1, 3m+1, 5m+1, .... Hence it has a period of 2m, and the following discrete phase-type representation is a reasonable candidate to represent h(n):

h̃(n) = αT^{n−1}t    (4.10)

with α = (1, 0, ..., 0),

T = [ 0         t_{1,2}  0        ...  0            ]
    [ 0         0        t_{2,3}  ...  0            ]
    [ ...                                           ]
    [ 0         0        0        ...  t_{2m−1,2m}  ]
    [ t_{2m,1}  0        0        ...  0            ]

and

t = (1 − t_{1,2}, 1 − t_{2,3}, ..., 1 − t_{2m−1,2m}, 1 − t_{2m,1})^⊤.    (4.11)

By equating the first 2m values of h(n) and h̃(n) we can solve for all the parameters {t_{1,2}, t_{2,3}, ..., t_{2m-1,2m}, t_{2m,1}} in representation (4.11). We have

\begin{aligned}
h(1) &= \alpha t = 1 - t_{1,2} \\
h(2) &= \alpha T t = t_{1,2}(1 - t_{2,3}) \\
&\;\;\vdots \\
h(m) &= \alpha T^{m-1} t = t_{1,2} t_{2,3} \cdots t_{m-1,m}(1 - t_{m,m+1}) \\
h(m+1) &= \alpha T^{m} t = t_{1,2} t_{2,3} \cdots t_{m,m+1}(1 - t_{m+1,m+2}) \\
&\;\;\vdots \\
h(2m) &= \alpha T^{2m-1} t = t_{1,2} t_{2,3} \cdots t_{2m-1,2m}(1 - t_{2m,1}).
\end{aligned}   (4.12)

This system of equations has a unique solution, as the equations can be solved recursively. We find

\begin{aligned}
t_{1,2} &= 1 - h(1) \\
t_{2,3} &= 1 - \frac{h(2)}{1 - h(1)} \\
&\;\;\vdots \\
t_{m+1,m+2} &= 1 - \frac{h(m+1)}{1 - h(1) - \cdots - h(m)} \\
&\;\;\vdots \\
t_{2m,1} &= 1 - \frac{h(2m)}{1 - h(1) - \cdots - h(2m-1)}.
\end{aligned}

Note that t_{i,j} ∈ (0, 1) for (i, j) ≠ (m + 1, m + 2), since h(i) ∈ (0, 1) for i ∈ {1, ..., m, m + 2, ..., 2m} and

\begin{aligned}
h(1) + \cdots + h(i) < 1 \;&\Rightarrow\; h(i) < 1 - h(1) - \cdots - h(i-1) \\
&\Rightarrow\; \frac{h(i)}{1 - h(1) - \cdots - h(i-1)} < 1 \\
&\Rightarrow\; t_{i,j} = 1 - \frac{h(i)}{1 - h(1) - \cdots - h(i-1)} > 0.
\end{aligned}

That t_{i,j} < 1 follows directly from the equations in (4.12). From h(m + 1) = 0 we deduce that t_{m+1,m+2} = 1, which implies that absorption is not possible from state m + 1, so that indeed h̃(n) = 0 for all n ≡ m + 1 (mod 2m).

We now have two densities h(n) and h̃(n) that agree on n = 1, ..., 2m. It remains to prove that h(n) = h̃(n) for all n ∈ N. We will do this inductively, by first proving that

  h(2m+1) = \tilde h(2m+1)   (4.13)

and then, assuming h(s) = h̃(s) for all s ≤ S for some S ∈ N, by showing that

  \frac{h(s+1)}{h(s)} = \frac{\tilde h(s+1)}{\tilde h(s)}   (4.14)

for all s ∈ N, from which h(s + 1) = h̃(s + 1) follows.

We have h(2m + 1) = 2Kp^{2m}, since cos(2mπ/m) = cos 2π = 1. For the scaling constant K the following formula can be found by summing separately the terms of h(n) that have the same value of cos((n−1)π/m):

\begin{aligned}
\frac{1}{K} &= \sum_{n=1}^{\infty} p^{n-1}\left(1 + \cos\frac{(n-1)\pi}{m}\right) \\
&= \sum_{j=0}^{2m-1} \frac{p^j}{1 - p^{2m}}\left(1 + \cos\frac{j\pi}{m}\right).
\end{aligned}
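The regrouping of the series over blocks of length 2m can be checked numerically. A small sketch (the function names and sample parameter values are ours):

```python
import math

def inv_K_series(p, m, terms=5000):
    # 1/K as the full series sum_{n>=1} p^(n-1) (1 + cos((n-1) pi / m))
    return sum(p**(n - 1) * (1 + math.cos((n - 1) * math.pi / m))
               for n in range(1, terms + 1))

def inv_K_closed(p, m):
    # 1/K after summing the geometric subseries with equal cosine values
    return sum(p**j * (1 + math.cos(j * math.pi / m))
               for j in range(2 * m)) / (1 - p**(2 * m))
```

For p = 9/10 and m = 2 both expressions give 1910/181, the reciprocal of the constant that appears in Example 4.2 below.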

In order to compute h̃(2m + 1) we need to know what the general form of h̃(n) is. Let s ∈ N be given by

  s = d + \xi \cdot 2m   (4.15)

where 1 ≤ d ≤ 2m and ξ ∈ {0, 1, 2, ...}. The general form of h̃(s) can be deduced from the structure of the matrix T and is given by

  \tilde h(s) = \alpha T^{s-1} t = (1 - t_{d,d+1})\, t_{1,2}^{\xi+1} t_{2,3}^{\xi+1} \cdots t_{d-1,d}^{\xi+1}\, t_{d,d+1}^{\xi} \cdots t_{2m-1,2m}^{\xi}\, t_{2m,1}^{\xi}.   (4.16)

Now we find

\begin{aligned}
\tilde h(2m+1) &= \alpha T^{2m} t \\
&= (1 - t_{1,2})\, t_{1,2}\, t_{2,3} \cdots t_{2m,1} \\
&= h(1)\,(1 - h(1))\,\frac{1 - h(1) - h(2)}{1 - h(1)} \cdots \frac{1 - h(1) - \cdots - h(2m)}{1 - h(1) - \cdots - h(2m-1)} \\
&= h(1)\left(1 - h(1) - \cdots - h(2m)\right) \\
&= 2K\left(1 - K \sum_{j=0}^{2m-1} p^j\left(1 + \cos\frac{j\pi}{m}\right)\right) \\
&= 2K - 2K^2 \cdot \frac{1}{K}\left(1 - p^{2m}\right) \\
&= 2K p^{2m},
\end{aligned}

which is indeed equal to h(2m + 1), and hence (4.13) is proven. For the induction step we get

  \frac{h(s+1)}{h(s)} = \frac{K p^{d+\xi\cdot 2m}\left(1 + \cos\frac{(d+\xi\cdot 2m)\pi}{m}\right)}{K p^{d-1+\xi\cdot 2m}\left(1 + \cos\frac{(d-1+\xi\cdot 2m)\pi}{m}\right)} = p\,\frac{1 + \cos\frac{d\pi}{m}}{1 + \cos\frac{(d-1)\pi}{m}},

since cos((d + ξ·2m)π/m) = cos(dπ/m + 2ξπ) = cos(dπ/m), whereas

\begin{aligned}
\frac{\tilde h(s+1)}{\tilde h(s)} &= t_{d,d+1}\,\frac{1 - t_{d+1,d+2}}{1 - t_{d,d+1}} \\
&= \frac{\left(1 - \frac{1 - \sum_{i=1}^{d+1} h(i)}{1 - \sum_{i=1}^{d} h(i)}\right)\frac{1 - \sum_{i=1}^{d} h(i)}{1 - \sum_{i=1}^{d-1} h(i)}}{1 - \frac{1 - \sum_{i=1}^{d} h(i)}{1 - \sum_{i=1}^{d-1} h(i)}} \\
&= \frac{1 - \sum_{i=1}^{d} h(i) - \left(1 - \sum_{i=1}^{d+1} h(i)\right)}{1 - \sum_{i=1}^{d-1} h(i) - \left(1 - \sum_{i=1}^{d} h(i)\right)} \\
&= \frac{\sum_{i=1}^{d+1} h(i) - \sum_{i=1}^{d} h(i)}{\sum_{i=1}^{d} h(i) - \sum_{i=1}^{d-1} h(i)} \\
&= \frac{h(d+1)}{h(d)} = p\,\frac{1 + \cos\frac{d\pi}{m}}{1 + \cos\frac{(d-1)\pi}{m}}.
\end{aligned}

Hence (4.14) is proven and this completes the proof of Step 2: our reasonable candidate h̃(n) is indeed an order-2m discrete phase-type representation of the density h(n).
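The construction of Step 2 is easy to exercise numerically. The sketch below (function names are ours) builds (α, T, t) of order 2m from the recursion for the t_{i,j} and lets αT^{n−1}t be compared with h(n) well beyond n = 2m:

```python
import math
import numpy as np

def h_density(K, p, m, n):
    # The density (4.8)
    return K * p**(n - 1) * (1 + math.cos((n - 1) * math.pi / m))

def dph_representation(p, m):
    # Normalizing constant K from the sum over one period of length 2m
    K = (1 - p**(2 * m)) / sum(p**j * (1 + math.cos(j * math.pi / m))
                               for j in range(2 * m))
    h = [h_density(K, p, m, n) for n in range(1, 2 * m + 1)]
    # Solve (4.12) recursively: t_{i,i+1} = 1 - h(i) / (1 - h(1) - ... - h(i-1))
    tt, cum = [], 0.0
    for hi in h:
        tt.append(1 - hi / (1 - cum))
        cum += hi
    T = np.zeros((2 * m, 2 * m))
    for i in range(2 * m - 1):
        T[i, i + 1] = tt[i]
    T[2 * m - 1, 0] = tt[-1]      # the cyclic entry t_{2m,1}
    t = 1 - T.sum(axis=1)         # exit probabilities
    alpha = np.zeros(2 * m)
    alpha[0] = 1.0
    return alpha, T, t, K
```

For instance, with p = 9/10 and m = 3 the resulting h̃(n) = αT^{n−1}t agrees numerically with h(n) for every n checked, and t_{m+1,m+2} = 1 as predicted.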

Proof of 3. We want to prove that there is no discrete phase-type representation for h(n) of an order lower than 2m. Suppose h(n) has a discrete phase-type representation (β, S) of order d < 2m. Since h(m + 1) = 0 we know that there must be at least one entry in the vector of exit probabilities s which is zero, say s_i = 0 for some i ∈ {1, ..., d}. Let d̃ be the period of the matrix S. As S has order d, we know that d̃ ≤ d. We also know that d̃ > 1, because if S has no period then βS^{n−1}s is strictly positive for all n > d, and (β, S) is certainly not a representation for h(n). The periodicity d̃ implies that for 1 ≤ j ≤ d − 1

  h(j) = c_\kappa\, h(j + \kappa \tilde d)   (4.17)

for κ ∈ N and positive constants c_κ < 1. As h(m + 1) = 0, (4.17) implies that h(m + 1 + d̃) = 0. But this is a contradiction: since d̃ ≤ d < 2m we have m + 1 < m + 1 + d̃ < 3m + 1, and h(n) is not zero between m + 1 and 3m + 1. Hence (β, S) cannot be a discrete phase-type representation for h(n), and we can conclude that a discrete phase-type representation for h(n) must be at least of order 2m. □

Example 4.2 We take p = 9/10 and m = 2 in (4.8), which leads to the following density:

  f(n) = \frac{181}{1910}\left(\frac{9}{10}\right)^{n-1}\left(1 + \cos\frac{(n-1)\pi}{2}\right).   (4.18)

Figure 4.2 shows the density plot of f(n).

The probability generating function of f(n) is given by

  H(z) = \frac{\frac{181}{955}\,z - \frac{1629}{19100}\,z^2 + \frac{14661}{191000}\,z^3}{1 - \frac{9}{10}\,z + \frac{81}{100}\,z^2 - \frac{729}{1000}\,z^3},

which leads to the following matrix-geometric representation for f(n) by the use of Proposition 4.2: f(n) = αT^{n−1}t with

  \alpha = (1, 0, 0), \qquad
  T = \begin{pmatrix} \frac{9}{10} & 0 & 1 \\ \frac{729}{1000} & 0 & 0 \\ -\frac{81}{100} & 1 & 0 \end{pmatrix}
  \qquad\text{and}\qquad
  t = \begin{pmatrix} \frac{181}{955} \\ \frac{14661}{191000} \\ -\frac{1629}{19100} \end{pmatrix}.
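One can verify numerically that this order-3 triple reproduces the density (4.18). A small sketch (the variable and function names are ours):

```python
import math
import numpy as np
from numpy.linalg import matrix_power

# Order-3 matrix-geometric representation of f(n) from Example 4.2
alpha = np.array([1.0, 0.0, 0.0])
T = np.array([[9/10,      0.0, 1.0],
              [729/1000,  0.0, 0.0],
              [-81/100,   1.0, 0.0]])
t = np.array([181/955, 14661/191000, -1629/19100])

def f_direct(n):
    # The density (4.18)
    return (181/1910) * (9/10)**(n - 1) * (1 + math.cos((n - 1) * math.pi / 2))

def f_mg(n):
    return alpha @ matrix_power(T, n - 1) @ t
```

Note that T has a negative entry and t is not a probability vector, so this triple has no Markov-chain interpretation: it is a matrix-geometric representation, not a phase-type one.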

By solving the system of equations in (4.12) we can find the following discrete phase-type representation of order 4 for f(n):

  \alpha = (1, 0, 0, 0), \qquad
  T = \begin{pmatrix} 0 & \frac{774}{955} & 0 & 0 \\ 0 & 0 & \frac{1539}{1720} & 0 \\ 0 & 0 & 0 & 1 \\ \frac{1719}{1900} & 0 & 0 & 0 \end{pmatrix}
  \qquad\text{and}\qquad
  t = \begin{pmatrix} \frac{181}{955} \\ \frac{181}{1720} \\ 0 \\ \frac{181}{1900} \end{pmatrix}.
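The order-4 representation can be checked in the same way; the sketch below (names are ours) also confirms that each row of T together with the corresponding entry of t sums to one, so this triple is a genuine Markov-chain (discrete phase-type) representation:

```python
import math
import numpy as np
from numpy.linalg import matrix_power

# Order-4 discrete phase-type representation of f(n) from Example 4.2
alpha = np.array([1.0, 0.0, 0.0, 0.0])
T = np.array([[0.0,       774/955, 0.0,       0.0],
              [0.0,       0.0,     1539/1720, 0.0],
              [0.0,       0.0,     0.0,       1.0],
              [1719/1900, 0.0,     0.0,       0.0]])
t = np.array([181/955, 181/1720, 0.0, 181/1900])

def f_direct(n):
    # The density (4.18)
    return (181/1910) * (9/10)**(n - 1) * (1 + math.cos((n - 1) * math.pi / 2))

def f_dph(n):
    return alpha @ matrix_power(T, n - 1) @ t
```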

Figure 4.2: Density plot of f(n) = \frac{181}{1910}\left(\frac{9}{10}\right)^{n-1}\left(1 + \cos\frac{(n-1)\pi}{2}\right).

4.4 Properties of matrix-geometric distributions

For the sake of completeness we will revisit in this section some of the properties of discrete phase-type distributions given in Section 2.1. We will show that these properties also hold for the class of matrix-geometric distributions, by giving analytic proofs based on probability generating functions or distribution functions. The first property, stating that every probability density on a finite number of positive integers is of discrete phase-type, clearly extends to matrix-geometric densities, as the discrete phase-type densities are a subset of the matrix-geometric ones. Hence we start with the analogue of Property 2.2.

Property 4.1 The convolution of a finite number of matrix-geometric densities is itself matrix-geometric.

Proof. We will prove this property for two matrix-geometric densities. Let X and Y be two independent matrix-geometric random variables with distributions represented by (α,T, t) and (β, S, s) respectively. We let T be of order m and S of order k. Consider the matrix

  V = \begin{pmatrix} T & t\beta \\ 0 & S \end{pmatrix}

and vectors

  \gamma = (\alpha,\ \alpha_{m+1}\beta) \qquad\text{and}\qquad v = \begin{pmatrix} \beta_{k+1}\, t \\ s \end{pmatrix},

so that \gamma_{m+k+1} = \alpha_{m+1}\beta_{k+1}.

The probability generating function of the matrix-geometric distribution with representation (γ, V, v) is given by

\begin{aligned}
H(z) &= \gamma_{m+k+1} + z\gamma \begin{pmatrix} I - zT & -zt\beta \\ 0 & I - zS \end{pmatrix}^{-1} \begin{pmatrix} \beta_{k+1}\, t \\ s \end{pmatrix} \\
&= \gamma_{m+k+1} + z\gamma \begin{pmatrix} (I - zT)^{-1} & z(I - zT)^{-1} t\beta (I - zS)^{-1} \\ 0 & (I - zS)^{-1} \end{pmatrix} \begin{pmatrix} \beta_{k+1}\, t \\ s \end{pmatrix} \\
&= \gamma_{m+k+1} + z\gamma \begin{pmatrix} \beta_{k+1}(I - zT)^{-1} t + z(I - zT)^{-1} t\beta (I - zS)^{-1} s \\ (I - zS)^{-1} s \end{pmatrix} \\
&= \alpha_{m+1}\beta_{k+1} + \beta_{k+1}\, z\alpha(I - zT)^{-1} t + z^2 \alpha(I - zT)^{-1} t\, \beta(I - zS)^{-1} s + \alpha_{m+1}\, z\beta(I - zS)^{-1} s \\
&= \left(\alpha_{m+1} + z\alpha(I - zT)^{-1} t\right)\left(\beta_{k+1} + z\beta(I - zS)^{-1} s\right) \\
&= H_X(z)\, H_Y(z).
\end{aligned}

Hence the sum of X and Y is again matrix-geometrically distributed. 
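The block construction above is easy to exercise numerically. The sketch below (function names are ours) treats the simplest case, in which neither distribution has an atom at zero, so α_{m+1} = β_{k+1} = 0 and the top part of v vanishes; the sum of two independent geometric variables serves as the test case.

```python
import numpy as np
from numpy.linalg import matrix_power

def conv_representation(alpha, T, t, beta, S, s):
    # (gamma, V, v) for X + Y, assuming no atoms at zero
    m, k = len(t), len(s)
    V = np.block([[T, np.outer(t, beta)],
                  [np.zeros((k, m)), S]])
    gamma = np.concatenate([alpha, np.zeros(k)])
    v = np.concatenate([np.zeros(m), s])
    return gamma, V, v

def mg_density(gamma, V, v, n):
    return gamma @ matrix_power(V, n - 1) @ v
```

With X ~ Geom(0.3) and Y ~ Geom(0.6), both supported on {1, 2, ...}, the density γV^{n−1}v matches the explicit convolution Σ_i P(X = i)P(Y = n − i).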

Property 4.2 Any finite mixture of matrix-geometric probability densities is itself matrix-geometric.

Proof. Let the vector (p_1, p_2, ..., p_k) denote the mixing density, and let the matrix-geometric densities f_i(n) be represented by the parameters (α_i, T_i, t_i). Then the mixture

  f(n) = \sum_{i=1}^{k} p_i f_i(n)

has a matrix-geometric representation (β, V, v) where

  \beta = (p_1\alpha_1, \ldots, p_k\alpha_k), \qquad
  V = \begin{pmatrix} T_1 & 0 & \cdots & 0 \\ 0 & T_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & T_k \end{pmatrix}
  \qquad\text{and}\qquad
  v = \begin{pmatrix} t_1 \\ t_2 \\ \vdots \\ t_k \end{pmatrix}.

The probability generating function of f is given by

\begin{aligned}
H_f(z) &= z\beta(I - zV)^{-1} v \\
&= p_1\, z\alpha_1(I - zT_1)^{-1} t_1 + \cdots + p_k\, z\alpha_k(I - zT_k)^{-1} t_k \\
&= p_1 H_{f_1}(z) + \cdots + p_k H_{f_k}(z). \qquad\square
\end{aligned}

Property 4.3 Let X and Y be independent matrix-geometrically distributed random variables with representations (α, T, t) and (β, S, s) respectively. Then U = min(X, Y) again has a matrix-geometric distribution, with parameters (γ, V, v) where γ = α ⊗ β, V = T ⊗ S and v = (I − (T ⊗ S))e.

Proof. Denote the dimension of the matrix T by k and the dimension of the matrix S by m. We have

  P(U > n) = P(\min(X, Y) > n) = P(X > n)\, P(Y > n),

and hence we have to show that

  (\alpha \otimes \beta)(T \otimes S)^n e_{km} = \alpha T^n e_k\; \beta S^n e_m.

Here we denote by e_i a column vector of ones of length i. For the Kronecker product it holds that (A ⊗ B)(C ⊗ D) = AC ⊗ BD whenever the matrices A, B, C and D are of such sizes that the products AC and BD can be formed. Using this property, and keeping in mind the sizes of the different vectors and matrices, we find that

\begin{aligned}
(\alpha \otimes \beta)(T \otimes S)^n e_{km} &= (\alpha \otimes \beta)(T^n \otimes S^n)(e_k \otimes e_m) \\
&= (\alpha T^n \otimes \beta S^n)(e_k \otimes e_m) \\
&= \alpha T^n e_k \otimes \beta S^n e_m \\
&= \alpha T^n e_k\; \beta S^n e_m.
\end{aligned}

Hence for the minimum U of X and Y we have U ∼ MG(γ, V, v). □
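A numerical check of Property 4.3 (function names are ours): build (γ, V, v) by Kronecker products and compare the survival function γV^n e of the minimum with the product of the individual survival functions.

```python
import numpy as np
from numpy.linalg import matrix_power

def min_representation(alpha, T, beta, S):
    # (gamma, V, v) for U = min(X, Y), as in Property 4.3
    gamma = np.kron(alpha, beta)
    V = np.kron(T, S)
    v = (np.eye(V.shape[0]) - V) @ np.ones(V.shape[0])
    return gamma, V, v

def survival(alpha, T, n):
    # P(X > n) = alpha T^n e
    return alpha @ matrix_power(T, n) @ np.ones(T.shape[0])
```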

Property 4.4 Let X_i ∼ MG(α, T, t) be i.i.d. and N ∼ MG(β, S, s), independent of the X_i. Then the N-fold convolution of the X_i,

  Z = \sum_{i=1}^{N} X_i,

is again matrix-geometrically distributed, with representation (γ, V, v) where γ = α ⊗ β, V = T ⊗ I + (tα) ⊗ S and v = t ⊗ s, and where the identity matrix has the same dimension as the matrix S.

Proof. See [12], pages 173–206. □

Bibliography

[1] N. Akar. A matrix-analytical method for the discrete-time Lindley equation using the generalized Schur decomposition. In SMCTOOLS'06, October 2006.
[2] S. Asmussen. Applied Probability and Queues, volume 51 of Applications of Mathematics. Springer, 2nd edition, 2003.
[3] S. Asmussen and M. Bladt. Renewal theory and queueing algorithms for matrix-exponential distributions. In A. Alfa and S.R. Chakravarthy, editors, Matrix-analytic Methods in Stochastic Models, pages 313–341. 1997.
[4] S. Asmussen and C.A. O'Cinneide. Matrix-exponential distributions. Encyclopedia of Statistical Sciences, 2006.
[5] S. Asmussen and M. Olsson. Phase-type distributions (Update). Encyclopedia of Statistical Sciences Update, 2, 1998.
[6] M. Bladt and B.F. Nielsen. Lecture notes on phase-type distributions, 2008.
[7] M. Bladt and B.F. Nielsen. Multivariate matrix-exponential distributions. Submitted, 2008.
[8] G.R. Grimmett and D.R. Stirzaker. Probability and Random Processes. Oxford University Press, 3rd edition, 2001.
[9] V.G. Kulkarni. A new class of multivariate phase type distributions. Operations Research, 37(1):151–158, 1989.
[10] G. Latouche and V. Ramaswami. Introduction to Matrix Analytic Methods in Stochastic Modelling. ASA-SIAM Series on Statistics and Applied Probability. ASA-SIAM, 1999.

[11] H. Minc. Nonnegative Matrices. John Wiley and Sons, 1998.
[12] M.F. Neuts. Probability distributions of phase type. In Liber Amicorum Professor Emeritus H. Florin, pages 173–206. Department of Mathematics, University of Louvain, Belgium, 1975.
[13] M.F. Neuts. Matrix-Geometric Solutions in Stochastic Models: An Algorithmic Approach. Johns Hopkins Series in Mathematical Sciences; 2. The Johns Hopkins University Press, Baltimore, 1981.
[14] C.A. O'Cinneide. Characterization of phase-type distributions. Stochastic Models, 6(1):1–57, 1990.
[15] C.A. O'Cinneide. Phase-type distributions and invariant polytopes. Advances in Applied Probability, 23(3):515–535, 1991.
[16] E. Seneta. Non-negative Matrices and Markov Chains. Springer Series in Statistics. Springer-Verlag, 2nd edition, 1981.
[17] B. Sengupta. Phase-type representations for matrix-geometric solutions. Stochastic Models, 6(1):163–167, 1990.