Appendix A List of Distributions
Here we list common statistical distributions used throughout the book. The frequently used indicator symbol 1{·} and gamma function Γ(α) are defined as follows.
Definition A.1 The indicator symbol is defined as
\mathbf{1}\{\cdot\} = \begin{cases} 1, & \text{if the condition in } \{\cdot\} \text{ is true}, \\ 0, & \text{otherwise}. \end{cases} \quad (A.1)
Definition A.2 The standard gamma function is defined as
\Gamma(\alpha) = \int_0^{\infty} t^{\alpha-1} e^{-t}\, dt, \quad \alpha > 0. \quad (A.2)
A.1 Discrete Distributions
A.1.1 Poisson Distribution, Poisson(λ)
A Poisson distribution function is denoted as Poisson(λ). The random variable N has a Poisson distribution, denoted N ∼ Poisson(λ), if its probability mass function is
p(k) = \Pr[N = k] = \frac{\lambda^k}{k!} e^{-\lambda}, \quad \lambda > 0 \quad (A.3)

for all k ∈ {0, 1, 2, ...}. Expectation, variance and variational coefficient of a random variable N ∼ Poisson(λ) are
\mathrm{E}[N] = \lambda, \quad \mathrm{Var}[N] = \lambda, \quad \mathrm{Vco}[N] = \frac{1}{\sqrt{\lambda}}. \quad (A.4)
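The moments in (A.4) can be checked numerically; the sketch below assumes SciPy is available (the book itself does not use it):

```python
import math
from scipy.stats import poisson

lam = 4.0
dist = poisson(lam)

# pmf agrees with (A.3): p(k) = lambda^k e^{-lambda} / k!
k = 3
assert abs(dist.pmf(k) - lam**k * math.exp(-lam) / math.factorial(k)) < 1e-12

# E[N] = lambda, Var[N] = lambda, Vco[N] = 1/sqrt(lambda), see (A.4)
assert abs(dist.mean() - lam) < 1e-12
assert abs(dist.var() - lam) < 1e-12
assert abs(dist.std() / dist.mean() - 1 / math.sqrt(lam)) < 1e-12
```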
P. Shevchenko, Modelling Operational Risk Using Bayesian Inference, DOI 10.1007/978-3-642-15923-7, © Springer-Verlag Berlin Heidelberg 2011
A.1.2 Binomial Distribution, Bin(n, p)
The binomial distribution function is denoted as Bin(n, p). The random variable N has a binomial distribution, denoted N ∼ Bin(n, p), if its probability mass function is
p(k) = \Pr[N = k] = \binom{n}{k} p^k (1-p)^{n-k}, \quad p \in (0,1), \; n \in \{1, 2, \ldots\} \quad (A.5)

for all k ∈ {0, 1, 2, ..., n}. Expectation, variance and variational coefficient of a random variable N ∼ Bin(n, p) are

\mathrm{E}[N] = np, \quad \mathrm{Var}[N] = np(1-p), \quad \mathrm{Vco}[N] = \sqrt{\frac{1-p}{np}}. \quad (A.6)
Remark A.1 N is the number of successes in n independent trials, where p is the probability of a success in each trial.
A.1.3 Negative Binomial Distribution, NegBin(r, p)
A negative binomial distribution function is denoted as NegBin(r, p). The random variable N has a negative binomial distribution, denoted N ∼ NegBin(r, p), if its probability mass function is
p(k) = \Pr[N = k] = \binom{r+k-1}{k} p^r (1-p)^k, \quad p \in (0,1), \; r \in (0, \infty) \quad (A.7)

for all k ∈ {0, 1, 2, ...}. Here, the generalised binomial coefficient is
\binom{r+k-1}{k} = \frac{\Gamma(k+r)}{k!\,\Gamma(r)}, \quad (A.8)

where Γ(r) is the gamma function. Expectation, variance and variational coefficient of a random variable N ∼ NegBin(r, p) are
\mathrm{E}[N] = \frac{r(1-p)}{p}, \quad \mathrm{Var}[N] = \frac{r(1-p)}{p^2}, \quad \mathrm{Vco}[N] = \frac{1}{\sqrt{r(1-p)}}. \quad (A.9)
Remark A.2 If r is a positive integer, N is the number of failures in a sequence of independent trials until r successes, where p is the probability of a success in each trial.
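The pmf (A.7) and moments (A.9) can be checked against `scipy.stats.nbinom`, which happens to use the same (r, p) convention (a sketch assuming SciPy is available):

```python
import math
from scipy.stats import nbinom

r, p = 2.5, 0.4           # non-integer r is allowed, see (A.7)
dist = nbinom(r, p)

# pmf agrees with (A.7)-(A.8): Gamma(k+r)/(k! Gamma(r)) p^r (1-p)^k
k = 3
coeff = math.gamma(k + r) / (math.factorial(k) * math.gamma(r))
assert abs(dist.pmf(k) - coeff * p**r * (1 - p)**k) < 1e-12

# moments agree with (A.9)
assert abs(dist.mean() - r * (1 - p) / p) < 1e-9
assert abs(dist.var() - r * (1 - p) / p**2) < 1e-9
assert abs(dist.std() / dist.mean() - 1 / math.sqrt(r * (1 - p))) < 1e-9
```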
A.2 Continuous Distributions
A.2.1 Uniform Distribution, U(a, b)
A uniform distribution function is denoted as U(a, b). The random variable X has a uniform distribution, denoted X ∼ U(a, b), if its probability density function is
f(x) = \frac{1}{b-a}, \quad a < b \quad (A.10)

for x ∈ [a, b]. Expectation, variance and variational coefficient of a random variable X ∼ U(a, b) are
\mathrm{E}[X] = \frac{a+b}{2}, \quad \mathrm{Var}[X] = \frac{(b-a)^2}{12}, \quad \mathrm{Vco}[X] = \frac{b-a}{\sqrt{3}\,(a+b)}. \quad (A.11)
A.2.2 Normal (Gaussian) Distribution, N (μ, σ )
A normal (Gaussian) distribution function is denoted as N(μ, σ). The random variable X has a normal distribution, denoted X ∼ N(μ, σ), if its probability density function is
f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right), \quad \sigma^2 > 0, \; \mu \in \mathbb{R} \quad (A.12)

for all x ∈ R. Expectation, variance and variational coefficient of a random variable X ∼ N(μ, σ) are
\mathrm{E}[X] = \mu, \quad \mathrm{Var}[X] = \sigma^2, \quad \mathrm{Vco}[X] = \sigma/\mu. \quad (A.13)
A.2.3 Lognormal Distribution, LN(μ, σ )
A lognormal distribution function is denoted as LN(μ, σ). The random variable X has a lognormal distribution, denoted X ∼ LN (μ, σ), if its probability density function is
f(x) = \frac{1}{x\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(\ln x - \mu)^2}{2\sigma^2}\right), \quad \sigma^2 > 0, \; \mu \in \mathbb{R} \quad (A.14)

for x > 0. Expectation, variance and variational coefficient of a random variable X ∼ LN(μ, σ) are

\mathrm{E}[X] = e^{\mu + \frac{1}{2}\sigma^2}, \quad \mathrm{Var}[X] = e^{2\mu+\sigma^2}\left(e^{\sigma^2} - 1\right), \quad \mathrm{Vco}[X] = \sqrt{e^{\sigma^2} - 1}. \quad (A.15)
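The moments in (A.15) can be checked with `scipy.stats.lognorm`; note its parameterisation differs from LN(μ, σ) and is a frequent source of error (a sketch assuming SciPy is available):

```python
import math
from scipy.stats import lognorm

mu, sigma = 0.5, 0.75
# scipy's lognorm takes the shape s = sigma and scale = exp(mu)
dist = lognorm(s=sigma, scale=math.exp(mu))

# moments agree with (A.15)
assert abs(dist.mean() - math.exp(mu + sigma**2 / 2)) < 1e-9
assert abs(dist.var() - math.exp(2*mu + sigma**2) * (math.exp(sigma**2) - 1)) < 1e-9
assert abs(dist.std() / dist.mean() - math.sqrt(math.exp(sigma**2) - 1)) < 1e-9
```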
A.2.4 t Distribution, T(ν, μ, σ²)

A t distribution function is denoted as T(ν, μ, σ²). The random variable X has a t distribution, denoted X ∼ T(ν, μ, σ²), if its probability density function is
f(x) = \frac{\Gamma((\nu+1)/2)}{\Gamma(\nu/2)\sqrt{\nu\pi\sigma^2}} \left(1 + \frac{(x-\mu)^2}{\nu\sigma^2}\right)^{-(\nu+1)/2} \quad (A.16)

for σ² > 0, μ ∈ R, ν = 1, 2, ... and all x ∈ R. Expectation, variance and variational coefficient of a random variable X ∼ T(ν, μ, σ²) are
\mathrm{E}[X] = \mu \;\text{ if } \nu > 1; \quad \mathrm{Var}[X] = \sigma^2 \frac{\nu}{\nu-2} \;\text{ if } \nu > 2; \quad \mathrm{Vco}[X] = \frac{\sigma}{\mu}\sqrt{\frac{\nu}{\nu-2}} \;\text{ if } \nu > 2. \quad (A.17)
A.2.5 Gamma Distribution, Gamma(α, β)
A gamma distribution function is denoted as Gamma(α, β). The random variable X has a gamma distribution, denoted as X ∼ Gamma(α, β), if its probability density function is
f(x) = \frac{x^{\alpha-1}}{\Gamma(\alpha)\beta^{\alpha}} \exp(-x/\beta), \quad \alpha > 0, \; \beta > 0 \quad (A.18)

for x > 0. Expectation, variance and variational coefficient of a random variable X ∼ Gamma(α, β) are

\mathrm{E}[X] = \alpha\beta, \quad \mathrm{Var}[X] = \alpha\beta^2, \quad \mathrm{Vco}[X] = 1/\sqrt{\alpha}. \quad (A.19)
A.2.6 Weibull Distribution, Weibull(α, β)
A Weibull distribution function is denoted as Weibull(α, β). The random variable X has a Weibull distribution, denoted as X ∼ Weibull(α, β), if its probability density function is

f(x) = \frac{\alpha}{\beta^{\alpha}} x^{\alpha-1} \exp\left(-(x/\beta)^{\alpha}\right), \quad \alpha > 0, \; \beta > 0 \quad (A.20)

for x > 0. The corresponding distribution function is

F(x) = 1 - \exp\left(-(x/\beta)^{\alpha}\right), \quad \alpha > 0, \; \beta > 0. \quad (A.21)
Expectation and variance of a random variable X ∼ Weibull(α, β) are

\mathrm{E}[X] = \beta\,\Gamma(1 + 1/\alpha), \quad \mathrm{Var}[X] = \beta^2\left(\Gamma(1 + 2/\alpha) - \left(\Gamma(1 + 1/\alpha)\right)^2\right).
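These gamma-function expressions can be checked against `scipy.stats.weibull_min`, whose (c, scale) parameters match (α, β) here (a sketch assuming SciPy is available):

```python
import math
from scipy.stats import weibull_min

alpha, beta = 1.5, 2.0
dist = weibull_min(c=alpha, scale=beta)   # c = alpha (shape), scale = beta

mean = beta * math.gamma(1 + 1 / alpha)
var = beta**2 * (math.gamma(1 + 2 / alpha) - math.gamma(1 + 1 / alpha)**2)
assert abs(dist.mean() - mean) < 1e-8
assert abs(dist.var() - var) < 1e-8

# distribution function agrees with (A.21)
x = 1.3
assert abs(dist.cdf(x) - (1 - math.exp(-(x / beta)**alpha))) < 1e-12
```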
A.2.7 Pareto Distribution (One-Parameter), Pareto(ξ, x0)
A one-parameter Pareto distribution function is denoted as Pareto(ξ, x0). The random variable X has a Pareto distribution, denoted as X ∼ Pareto(ξ, x0), if its distribution function is
F(x) = 1 - \left(\frac{x}{x_0}\right)^{-\xi}, \quad x \ge x_0, \quad (A.22)

where x0 > 0 and ξ > 0. The support starts at x0, which is typically known and not considered as a parameter. Therefore the distribution is referred to as a single-parameter Pareto. The corresponding probability density function is
f(x) = \frac{\xi}{x_0}\left(\frac{x}{x_0}\right)^{-\xi-1}. \quad (A.23)
Expectation, variance and variational coefficient of X ∼ Pareto(ξ, x0) are

\mathrm{E}[X] = x_0 \frac{\xi}{\xi-1} \;\text{ if } \xi > 1; \quad \mathrm{Var}[X] = x_0^2 \frac{\xi}{(\xi-1)^2(\xi-2)} \;\text{ if } \xi > 2; \quad \mathrm{Vco}[X] = \frac{1}{\sqrt{\xi(\xi-2)}} \;\text{ if } \xi > 2.
A.2.8 Pareto Distribution (Two-Parameter), Pareto2(α, β)
A two-parameter Pareto distribution function is denoted as Pareto2(α, β). The random variable X has a Pareto distribution, denoted as X ∼ Pareto2(α, β), if its distribution function is
F(x) = 1 - \left(1 + \frac{x}{\beta}\right)^{-\alpha}, \quad x \ge 0, \quad (A.24)

where α > 0 and β > 0. The corresponding probability density function is
f(x) = \frac{\alpha\beta^{\alpha}}{(x+\beta)^{\alpha+1}}. \quad (A.25)
The moments of a random variable X ∼ Pareto2(α, β) are
\mathrm{E}[X^k] = \frac{\beta^k\, k!}{\prod_{i=1}^{k}(\alpha - i)}, \quad \alpha > k.
A.2.9 Generalised Pareto Distribution, GPD(ξ, β)
A GPD distribution function is denoted as GPD(ξ, β). The random variable X has a GPD distribution, denoted as X ∼ GPD(ξ, β), if its distribution function is
H_{\xi,\beta}(x) = \begin{cases} 1 - (1 + \xi x/\beta)^{-1/\xi}, & \xi \ne 0, \\ 1 - \exp(-x/\beta), & \xi = 0, \end{cases} \quad (A.26)

where x ≥ 0 when ξ ≥ 0 and 0 ≤ x ≤ −β/ξ when ξ < 0. The corresponding probability density function is

h(x) = \begin{cases} \frac{1}{\beta}\left(1 + \xi x/\beta\right)^{-1/\xi - 1}, & \xi \ne 0, \\ \frac{1}{\beta}\exp(-x/\beta), & \xi = 0. \end{cases} \quad (A.27)
Expectation, variance and variational coefficient of X ∼ GPD(ξ, β), ξ ≥ 0, are
\mathrm{E}[X^n] = \frac{\beta^n\, n!}{\prod_{k=1}^{n}(1 - k\xi)}, \; \xi < \frac{1}{n}; \quad \mathrm{E}[X] = \frac{\beta}{1-\xi}, \; \xi < 1;

\mathrm{Var}[X] = \frac{\beta^2}{(1-\xi)^2(1-2\xi)}, \quad \mathrm{Vco}[X] = \frac{1}{\sqrt{1-2\xi}}, \; \xi < \frac{1}{2}. \quad (A.28)
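The GPD moments in (A.28) can be checked with `scipy.stats.genpareto`, whose shape c corresponds to ξ and scale to β (a sketch assuming SciPy is available):

```python
from scipy.stats import genpareto

xi, beta = 0.25, 1.0
dist = genpareto(c=xi, scale=beta)   # scipy's c is the shape xi

# moments agree with (A.28) for xi < 1/2
assert abs(dist.mean() - beta / (1 - xi)) < 1e-9
assert abs(dist.var() - beta**2 / ((1 - xi)**2 * (1 - 2 * xi))) < 1e-9
assert abs(dist.std() / dist.mean() - 1 / (1 - 2 * xi)**0.5) < 1e-9

# distribution function agrees with (A.26) for xi != 0
x = 2.0
assert abs(dist.cdf(x) - (1 - (1 + xi * x / beta)**(-1 / xi))) < 1e-12
```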
A.2.10 Beta Distribution, Beta(α, β)
A beta distribution function is denoted as Beta(α, β). The random variable X has a beta distribution, denoted as X ∼ Beta(α, β), if its probability density function is
f(x) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\, x^{\alpha-1}(1-x)^{\beta-1}, \quad 0 \le x \le 1, \quad (A.29)

for α > 0 and β > 0. Expectation, variance and variational coefficient of a random variable X ∼ Beta(α, β) are

\mathrm{E}[X] = \frac{\alpha}{\alpha+\beta}, \quad \mathrm{Var}[X] = \frac{\alpha\beta}{(\alpha+\beta)^2(1+\alpha+\beta)}, \quad \mathrm{Vco}[X] = \sqrt{\frac{\beta}{\alpha(1+\alpha+\beta)}}.
A.2.11 Generalised Inverse Gaussian Distribution, GIG(ω,φ,ν)
A GIG distribution function is denoted as GIG(ω,φ,ν). The random variable X has a GIG distribution, denoted as X ∼ GIG(ω,φ,ν), if its probability density function is
f(x) = \frac{(\omega/\phi)^{(\nu+1)/2}}{2 K_{\nu+1}\!\left(2\sqrt{\omega\phi}\right)}\, x^{\nu} e^{-x\omega - x^{-1}\phi}, \quad x > 0, \quad (A.30)

where φ > 0, ω ≥ 0 if ν < −1; φ > 0, ω > 0 if ν = −1; φ ≥ 0, ω > 0 if ν > −1; and
K_{\nu+1}(z) = \frac{1}{2}\int_0^{\infty} u^{\nu} e^{-z(u + 1/u)/2}\, du.
K_ν(z) is called a modified Bessel function of the third kind; see for instance Abramowitz and Stegun ([3], p. 375). The moments of a random variable X ∼ GIG(ω, φ, ν) are not available in closed form through elementary functions but can be expressed in terms of Bessel functions:

\mathrm{E}[X^{\alpha}] = \left(\frac{\phi}{\omega}\right)^{\alpha/2} \frac{K_{\nu+1+\alpha}\!\left(2\sqrt{\omega\phi}\right)}{K_{\nu+1}\!\left(2\sqrt{\omega\phi}\right)}, \quad \alpha \ge 1, \; \phi > 0, \; \omega > 0.
Often, using the notation R_ν(z) = K_{ν+1}(z)/K_ν(z), this is written as

\mathrm{E}[X^{\alpha}] = \left(\frac{\phi}{\omega}\right)^{\alpha/2} \prod_{k=1}^{\alpha} R_{\nu+k}\!\left(2\sqrt{\omega\phi}\right), \quad \alpha = 1, 2, \ldots
The mode is easily calculated from \frac{\partial}{\partial x}\left(x^{\nu} e^{-(\omega x + \phi/x)}\right) = 0 as

\mathrm{mode}[X] = \frac{1}{2\omega}\left(\nu + \sqrt{\nu^2 + 4\omega\phi}\right),

which differs only slightly from the expected value for large ν, i.e.
mode[X] → E[X] for ν → ∞.
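The normalisation of (A.30), the Bessel-function moment formula and the mode can all be checked numerically (a sketch assuming SciPy is available for `kv`, the modified Bessel function of the third kind):

```python
import math
from scipy.integrate import quad
from scipy.special import kv

omega, phi, nu = 1.5, 2.0, 0.5
z = 2 * math.sqrt(omega * phi)

# GIG density (A.30); the normalising constant uses K_{nu+1}
norm = (omega / phi)**((nu + 1) / 2) / (2 * kv(nu + 1, z))
f = lambda x: norm * x**nu * math.exp(-omega * x - phi / x)

total, _ = quad(f, 0, math.inf)
assert abs(total - 1) < 1e-8            # the density integrates to one

# E[X] from the Bessel moment formula (alpha = 1) vs direct integration
mean_bessel = math.sqrt(phi / omega) * kv(nu + 2, z) / kv(nu + 1, z)
mean_num, _ = quad(lambda x: x * f(x), 0, math.inf)
assert abs(mean_num - mean_bessel) < 1e-8

# the mode zeroes the log-density derivative nu/x - omega + phi/x^2
mode = (nu + math.sqrt(nu**2 + 4 * omega * phi)) / (2 * omega)
assert abs(nu / mode - omega + phi / mode**2) < 1e-9
```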
A.2.12 d-variate Normal Distribution, N_d(μ, Σ)

A d-variate normal distribution function is denoted as N_d(μ, Σ), where μ = (μ_1, ..., μ_d) ∈ R^d and Σ is a positive definite (d × d) matrix. The corresponding probability density function is

f(x) = \frac{1}{(2\pi)^{d/2}\sqrt{\det \Sigma}} \exp\left(-\frac{1}{2}(x-\mu)^{\mathrm{T}} \Sigma^{-1}(x-\mu)\right), \quad x \in \mathbb{R}^d, \quad (A.31)

where Σ^{−1} is the inverse of the matrix Σ. Expectations and covariances of a random vector X = (X_1, ..., X_d) ∼ N_d(μ, Σ) are

\mathrm{E}[X_i] = \mu_i, \quad \mathrm{Cov}[X_i, X_j] = \Sigma_{i,j}, \quad i, j = 1, \ldots, d. \quad (A.32)
A.2.13 d-variate t-Distribution, T_d(ν, μ, Σ)

A d-variate t-distribution function with ν degrees of freedom is denoted as T_d(ν, μ, Σ), where ν > 0, μ = (μ_1, ..., μ_d) ∈ R^d is a location vector and Σ is a positive definite (d × d) matrix. The corresponding probability density function is

f(x) = \frac{\Gamma\!\left(\frac{\nu+d}{2}\right)}{\Gamma\!\left(\frac{\nu}{2}\right)(\nu\pi)^{d/2}\sqrt{\det \Sigma}} \left(1 + \frac{(x-\mu)^{\mathrm{T}} \Sigma^{-1}(x-\mu)}{\nu}\right)^{-\frac{\nu+d}{2}}, \quad (A.33)

where x ∈ R^d and Σ^{−1} is the inverse of the matrix Σ. Expectations and covariances of a random vector X = (X_1, ..., X_d) ∼ T_d(ν, μ, Σ) are

\mathrm{E}[X_i] = \mu_i \;\text{ if } \nu > 1, \; i = 1, \ldots, d; \quad \mathrm{Cov}[X_i, X_j] = \frac{\nu \Sigma_{i,j}}{\nu-2} \;\text{ if } \nu > 2, \; i, j = 1, \ldots, d. \quad (A.34)

Appendix B Selected Simulation Algorithms
B.1 Simulation from GIG Distribution
To generate realisations of a random variable X ∼ GIG(ω, φ, ν) with ω, φ > 0, a special algorithm is required because the distribution function cannot be inverted in closed form. The following algorithm can be found in Dagpunar [67]:
Algorithm B.1 (Simulation from GIG distribution)
1. Set \alpha = \sqrt{\omega/\phi}, \quad \beta = 2\sqrt{\omega\phi},
\quad m = \frac{1}{\beta}\left(\nu + \sqrt{\nu^2 + \beta^2}\right),
\quad g(y) = \frac{1}{2}\beta y^3 - y^2\left(\frac{1}{2}\beta m + \nu + 2\right) + y\left(\nu m - \frac{1}{2}\beta\right) + \frac{1}{2}\beta m.
2. Set y_0 = m. While g(y_0) \le 0 do y_0 = 2y_0;
y_+: root of g in the interval (m, y_0);
y_-: root of g in the interval (0, m).
3. Calculate
a = (y_+ - m)\left(\frac{y_+}{m}\right)^{\nu/2} \exp\left(-\frac{\beta}{4}\left(y_+ + \frac{1}{y_+} - m - \frac{1}{m}\right)\right),
b = (y_- - m)\left(\frac{y_-}{m}\right)^{\nu/2} \exp\left(-\frac{\beta}{4}\left(y_- + \frac{1}{y_-} - m - \frac{1}{m}\right)\right),
c = -\frac{\beta}{4}\left(m + \frac{1}{m}\right) + \frac{\nu}{2}\ln m.
4. Repeat U, V \sim U(0, 1), \quad Y = m + a\frac{V}{U} + b\frac{1-V}{U},
until Y > 0 and -\ln U \ge -\frac{\nu}{2}\ln Y + \frac{\beta}{4}\left(Y + \frac{1}{Y}\right) + c.
Then X = Y/α is GIG(ω, φ, ν). To generate a sequence of n realisations from a GIG random variable, step 4 is repeated n times.
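A Python sketch of Algorithm B.1, assuming NumPy and SciPy are available; the function name `rgig` and the use of `brentq` for the roots of the cubic g are our choices, not part of the algorithm itself:

```python
import numpy as np
from scipy.optimize import brentq

def rgig(omega, phi, nu, n, seed=None):
    """Sample n realisations of X ~ GIG(omega, phi, nu), omega, phi > 0,
    by the ratio-of-uniforms method of Algorithm B.1 (Dagpunar)."""
    rng = np.random.default_rng(seed)
    alpha = np.sqrt(omega / phi)
    beta = 2.0 * np.sqrt(omega * phi)
    # m is the mode of h(y) = y^nu exp(-beta (y + 1/y) / 2)
    m = (nu + np.sqrt(nu**2 + beta**2)) / beta

    def g(y):
        # cubic whose roots y-, y+ give the ratio-of-uniforms bounding box
        return (0.5 * beta * y**3 - y**2 * (0.5 * beta * m + nu + 2.0)
                + y * (nu * m - 0.5 * beta) + 0.5 * beta * m)

    y0 = m
    while g(y0) <= 0.0:
        y0 *= 2.0
    y_plus = brentq(g, m, y0)       # g(m) = -2 m^2 < 0 < g(y0)
    y_minus = brentq(g, 1e-12, m)   # g(0+) = beta m / 2 > 0 > g(m)

    def s(y):  # sqrt(h(y) / h(m))
        return (y / m)**(nu / 2.0) * np.exp(-0.25 * beta * (y + 1.0/y - m - 1.0/m))

    a = (y_plus - m) * s(y_plus)
    b = (y_minus - m) * s(y_minus)  # negative
    c = -0.25 * beta * (m + 1.0/m) + 0.5 * nu * np.log(m)

    out = np.empty(n)
    i = 0
    while i < n:
        u = 1.0 - rng.random()      # U ~ U(0, 1]
        v = rng.random()
        y = m + (a * v + b * (1.0 - v)) / u
        if y > 0.0 and -np.log(u) >= -0.5*nu*np.log(y) + 0.25*beta*(y + 1.0/y) + c:
            out[i] = y / alpha      # rescale: X = Y / alpha
            i += 1
    return out

samples = rgig(1.0, 1.0, 0.5, 1000, seed=0)
```

The sample mean can be compared with the Bessel-function moment formula of Appendix A.2.11 as a consistency check.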
B.2 Simulation from α-stable Distribution
To generate realisations of a random variable X ∼ αStable(α,β,σ,μ), defined by (6.56), a special algorithm is required because the density of α-stable distribution is not available in closed form. An elegant and efficient solution was proposed in Chambers, Mallows and Stuck [50]; also see Nolan [176].
Algorithm B.2 (Simulation from α-stable distribution)
1. Simulate W from the exponential distribution with mean 1.
2. Simulate U from U(-\pi/2, \pi/2).
3. Calculate

Z = \begin{cases} S_{\alpha,\beta}\, \dfrac{\sin(\alpha(U + B_{\alpha,\beta}))}{(\cos U)^{1/\alpha}} \left(\dfrac{\cos(U - \alpha(U + B_{\alpha,\beta}))}{W}\right)^{\frac{1-\alpha}{\alpha}}, & \alpha \ne 1, \\[2ex] \dfrac{2}{\pi}\left(\left(\dfrac{\pi}{2} + \beta U\right)\tan U - \beta \ln \dfrac{\pi W \cos U}{\pi + 2\beta U}\right), & \alpha = 1, \end{cases}
where
S_{\alpha,\beta} = \left(1 + \beta^2 \tan^2(\pi\alpha/2)\right)^{\frac{1}{2\alpha}}, \quad B_{\alpha,\beta} = \frac{1}{\alpha}\arctan\left(\beta \tan(\pi\alpha/2)\right).
The obtained Z is a sample from αStable(α, β, 1, 0).
4. Then

X = \begin{cases} \mu + \sigma Z, & \alpha \ne 1, \\ \mu + \sigma Z + \dfrac{2}{\pi}\beta\sigma\ln\sigma, & \alpha = 1, \end{cases}
is a sample from αStable(α,β,σ,μ).
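Algorithm B.2 vectorises naturally; a sketch in Python/NumPy follows (the function name `rstable` and the seeding interface are ours, and NumPy availability is assumed):

```python
import numpy as np

def rstable(alpha, beta, sigma, mu, n, seed=None):
    """Sample n realisations of alphaStable(alpha, beta, sigma, mu)
    by the Chambers-Mallows-Stuck method (Algorithm B.2)."""
    rng = np.random.default_rng(seed)
    W = rng.exponential(1.0, n)                       # step 1
    U = rng.uniform(-np.pi / 2, np.pi / 2, n)         # step 2
    if alpha != 1:                                    # step 3, alpha != 1
        B = np.arctan(beta * np.tan(np.pi * alpha / 2)) / alpha
        S = (1 + beta**2 * np.tan(np.pi * alpha / 2)**2) ** (1 / (2 * alpha))
        Z = (S * np.sin(alpha * (U + B)) / np.cos(U) ** (1 / alpha)
             * (np.cos(U - alpha * (U + B)) / W) ** ((1 - alpha) / alpha))
        return mu + sigma * Z                         # step 4
    Z = (2 / np.pi) * ((np.pi / 2 + beta * U) * np.tan(U)   # step 3, alpha = 1
         - beta * np.log(np.pi * W * np.cos(U) / (np.pi + 2 * beta * U)))
    return mu + sigma * Z + (2 / np.pi) * beta * sigma * np.log(sigma)
```

For α = 2 the output reduces to a Gaussian with variance 2σ², which gives a simple sanity check of the implementation.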
Note that there are different parameterisations of the α-stable distribution. The algorithm above is for representation (6.56).

Solutions for Selected Problems
Problems of Chapter 2
2.1 The likelihood function for independent data N ={N1, N2,...,NM } from Poisson(λ) is
\ell_N(\lambda) = \prod_{i=1}^{M} e^{-\lambda} \frac{\lambda^{n_i}}{n_i!},
\quad \ln \ell_n(\lambda) = -\lambda M + \ln \lambda \sum_{i=1}^{M} n_i - \sum_{i=1}^{M} \ln(n_i!).

The MLE \hat{\Lambda} maximising the log-likelihood function \ln \ell_N(\lambda) is
\hat{\Lambda} = \frac{1}{M}\sum_{i=1}^{M} N_i.
Using the properties of the Poisson distribution, E[Ni ]=Var[Ni ]=λ, it is easy to get
\mathrm{E}[\hat{\Lambda}] = \frac{1}{M}\sum_{i=1}^{M} \mathrm{E}[N_i] = \lambda; \quad \mathrm{Var}[\hat{\Lambda}] = \frac{1}{M^2}\sum_{i=1}^{M} \mathrm{Var}[N_i] = \frac{\lambda}{M}.
To estimate the variance of \hat{\Lambda} using a normal approximation, find the information matrix

I(\lambda) = -\frac{1}{M} \mathrm{E}\left[\frac{\partial^2 \ln \ell_N(\lambda)}{\partial \lambda^2}\right] = \frac{1}{M\lambda^2} \mathrm{E}\left[\sum_{i=1}^{M} N_i\right] = \frac{1}{\lambda}.
Thus, using asymptotic normal distribution approximation,
\mathrm{Var}[\hat{\Lambda}] \approx I^{-1}(\lambda)/M = \lambda/M.
In both cases the variance depends on the unknown true parameter λ, which can be estimated, for a given realisation n, as \hat{\lambda}.

2.4 Consider
L(u) = u_1 L_1 + \cdots + u_J L_J, \quad u \in \mathbb{R}^J,

and set
\varphi_u(t) = \varrho[t L(u)], \quad t > 0.
Then, using the homogeneity property \varrho[tL] = t\varrho[L],

\frac{d\varphi_u(t)}{dt} = \varrho[L(u)].

On the other hand,

\frac{d\varphi_u(t)}{dt} = \sum_{j=1}^{J} u_j \left.\frac{\partial \varrho[L(x)]}{\partial x_j}\right|_{x=tu} = \sum_{j=1}^{J} u_j \frac{\partial \varrho[L(u)]}{\partial u_j},

where the last equality again uses the homogeneity property. Thus

\varrho[L(\mathbf{1})] = \sum_{j=1}^{J} \left.\frac{\partial \varrho[L_1 + \cdots + L_J + h L_j]}{\partial h}\right|_{h=0}

completes the proof.

2.5 The sum of risks is gamma distributed:
Z1 + Z2 + Z3 ∼ Gamma(α1 + α2 + α3,β).
Thus \mathrm{VaR}_{0.999}[Z_i] = F_G^{-1}(0.999\,|\,\alpha_i, \beta) and
\mathrm{VaR}_{0.999}[Z_1 + Z_2 + Z_3] = F_G^{-1}(0.999\,|\,\alpha_1 + \alpha_2 + \alpha_3, \beta),
where F_G^{-1}(\cdot\,|\,\alpha, \beta) is the inverse of the Gamma(α, β) distribution function. Using, for example, the MS Excel spreadsheet function GAMMAINV(·), find
\mathrm{VaR}_{0.999}[Z_1] \approx 5.414, \quad \mathrm{VaR}_{0.999}[Z_2] \approx 6.908, \quad \mathrm{VaR}_{0.999}[Z_3] \approx 8.133, \quad \mathrm{VaR}_{0.999}[Z_1 + Z_2 + Z_3] \approx 11.229.
The sum of VaRs is \mathrm{VaR}_{0.999}[Z_1] + \mathrm{VaR}_{0.999}[Z_2] + \mathrm{VaR}_{0.999}[Z_3] \approx 20.455 and thus the diversification is ≈ 45%.
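The quantiles can equally be obtained with `scipy.stats.gamma.ppf` in place of GAMMAINV. Since the problem statement is not restated here, the shape parameters below are an assumption: the quoted values are consistent with α = (0.5, 1, 1.5) and a common scale β = 1:

```python
from scipy.stats import gamma

# assumed problem parameters, inferred from the quoted quantiles
alphas, beta, q = [0.5, 1.0, 1.5], 1.0, 0.999

vars_ = [gamma.ppf(q, a, scale=beta) for a in alphas]
var_sum = gamma.ppf(q, sum(alphas), scale=beta)  # Z1+Z2+Z3 ~ Gamma(sum alpha, beta)
divers = 1 - var_sum / sum(vars_)

print([round(v, 3) for v in vars_])   # [5.414, 6.908, 8.133]
print(round(var_sum, 3))              # 11.229
print(round(divers, 2))               # 0.45
```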
Problems of Chapter 3
3.1 By definition of the expected shortfall we have
\mathrm{E}[Z|Z > L] = \frac{1}{1 - H(L)} \int_L^{\infty} z\, h(z)\, dz = \frac{\mathrm{E}[Z]}{1 - H(L)} - \frac{1}{1 - H(L)} \int_0^{L} z\, h(z)\, dz.
Substituting h(z) calculated via the characteristic function (3.11) and changing the variable x = t × L, we obtain
\int_0^{L} z\, h(z)\, dz = \frac{2}{\pi} \int_0^{L} z \int_0^{\infty} \mathrm{Re}[\chi(t)] \cos(tz)\, dt\, dz = \frac{2L}{\pi} \int_0^{\infty} \mathrm{Re}[\chi(x/L)] \left(\frac{\sin x}{x} - \frac{1 - \cos x}{x^2}\right) dx.
Recognising that the term involving sin(x)/x corresponds to H(L), we obtain

\mathrm{E}[Z|Z > L] = \frac{1}{1 - H(L)} \left(\mathrm{E}[Z] - H(L)L + \frac{2L}{\pi} \int_0^{\infty} \mathrm{Re}[\chi(x/L)]\, \frac{1 - \cos x}{x^2}\, dx\right).
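The final formula can be checked numerically on a case where everything is known in closed form. For Z ~ Exp(1) one has χ(t) = 1/(1 − it), H(L) = 1 − e^{−L} and E[Z|Z > L] = L + 1; the sketch below assumes SciPy is available and truncates the integral at a large finite bound:

```python
import math
from scipy.integrate import quad

L = 1.0
mean_Z = 1.0
H_L = 1 - math.exp(-L)

def re_chi(t):                 # Re[chi(t)] for Exp(1): chi(t) = 1/(1 - i t)
    return 1 / (1 + t**2)

def integrand(x):
    # (1 - cos x)/x^2 written as 2 sin^2(x/2)/x^2 for numerical stability
    return re_chi(x / L) * 2 * math.sin(x / 2)**2 / x**2

integral, _ = quad(integrand, 0, 500, limit=500)   # tail beyond 500 is negligible
es = (mean_Z - H_L * L + 2 * L / math.pi * integral) / (1 - H_L)

# for Exp(1), the expected shortfall is L + 1 exactly
assert abs(es - (L + 1)) < 1e-3
```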
Problems of Chapter 4

4.1 The linear estimator \hat{\theta}_{tot} = w_1\hat{\theta}_1 + \cdots + w_K\hat{\theta}_K is unbiased, i.e. \mathrm{E}[\hat{\theta}_{tot}] = \theta, if w_1 + \cdots + w_K = 1, because \mathrm{E}[\hat{\theta}_k] = \theta. Minimisation of the variance
\mathrm{Var}[\hat{\theta}_{tot}] = w_1^2\sigma_1^2 + \cdots + w_K^2\sigma_K^2

under the constraint w_1 + \cdots + w_K = 1 is equivalent to unconstrained minimisation of the