On Asymptotic Diversification Effects for Heavy-Tailed Risks

Dissertation for the attainment of the doctoral degree of the Faculty of Mathematics and Physics of the Albert-Ludwigs-Universität Freiburg im Breisgau

submitted by Georg Mainik

February 2010

Dean: Prof. Dr. Kay Königsmann
Referees: Prof. Dr. Ludger Rüschendorf, Prof. Dr. Paul Embrechts

Date of doctoral examination: 29 April 2010

Contents

1 Introduction
  1.1 Motivation
  1.2 Overview of related theory
  1.3 Central results and structure of the thesis
  1.4 Acknowledgements

2 Multivariate regular variation
  2.1 Basic notation and model assumptions
  2.2 Canonical forms of exponent and spectral measures
  2.3 Dependence functions
  2.4 Copulas
  2.5 Spectral densities of Gumbel copulas
  2.6 Spectral densities of elliptical distributions

3 Extreme risk index
  3.1 Basic approach
  3.2 Representations in terms of spectral measures
  3.3 Portfolio optimization and diversification effects
  3.4 Minimization of risk measures

4 Estimation
  4.1 Basic approach
  4.2 Main results
  4.3 Empirical processes with functional index
  4.4 Proofs of the main results
  4.5 Examples and comments

5 Stochastic order relations
  5.1 Introduction
  5.2 Ordering of extreme portfolio losses
  5.3 Ordering of spectral measures
  5.4 Convex and supermodular orders
  5.5 Examples

6 Modelling and simulation
  6.1 Objectives and design
  6.2 Estimation of the tail index
  6.3 Models
  6.4 Simulation results
  6.5 Conclusions

A Auxiliary results
  A.1 Regular variation
  A.2 Empirical processes

Bibliography

Index

Chapter 1

Introduction

1.1 Motivation

Management of financial risks is one of the central challenges in the area of finance and insurance. In particular, any market agent has a natural interest in protection against extreme losses arising from market crashes, natural catastrophes, or political turbulence. Furthermore, financial crashes have proved to be a serious danger not only to the prosperity of shareholders and financial corporations, but also to the general economic and political stability of entire societies. Thus, in addition to the microeconomic interest, management of extreme financial risks is an issue of macroeconomic, political, and social importance. This issue is even greater in a rapidly changing world with advancing globalization of financial markets and acceleration of communication networks. As a consequence of these developments, permanent validation and adjustment of the models used in risk management is essential to the market agents and the regulating authorities alike.

Portfolio diversification is one of the basic microeconomic approaches to the reduction of non-systemic risk. The first probabilistic concept of diversification was given by Markowitz (1952). Quantifying risk and dependence by variance and correlation, respectively, Markowitz gave arguments explaining why diversification decreases the portfolio risk. Designed for the multivariate Gaussian model assumption, this approach shaped the intuition of portfolio diversification in economics (cf. Sharpe, 1964; Lintner, 1965) and has proved to be appropriate in many application areas.

However, the mean-variance portfolio theory cannot be applied to all diversification problems. First, this approach assumes existence of second moments, which is questionable in some applications. There are even real-world risk data sets suggesting infinite first moments (cf. Moscadelli, 2004; Nešlehová et al., 2006). Another limitation is the two-sided view of risk, which follows automatically from measuring risk by variance. This may be misleading for asymmetric distributions and extreme risks related to rare events. Moreover, measuring dependence by correlation is also problematic in many applications (cf. Embrechts et al., 2002). Indeed, since moments depend on the entire distribution of a random variable in its value domain, they may be inappropriate for the quantification of risk and dependence in the tail region.

The limitations of the Markowitz approach are well known and addressed by recent developments in financial risk theory. In particular, the two-sided view of risk is commonly replaced by the notion of downside risk and related risk measures such as the value-at-risk or the expected shortfall. The theory of coherent risk measures founded by Artzner et al. (1999) provides an axiomatic characterization of risk measures that favours portfolio diversification. This objective can be considered as an extension of the original motivation of Markowitz, who wanted to find a mathematical reasoning for portfolio diversification (cf. Markowitz, 1991).

Talking about rare events and extreme losses, one has to mention heavy-tailed distributions. The evidence of heavy tails in finance and insurance is manifold, and the application area is so vast that an outline of the relevant literature would go far beyond the scope of this introduction. To mention only the basic facts, heavy-tailed models are generally accepted in the realm of insurance (cf. Hogg and Klugman, 1984; Embrechts et al., 1997) and have been vividly discussed in mathematical finance since its earliest days (cf. Mandelbrot, 1963; Fama, 1965).

The ambiguity concerning heavy-tailed modelling of financial returns is a result of the high complexity of financial markets. On the one hand, extremal behaviour of the market data seems to suggest heavy-tailed distributions (cf., among others, Longin, 1996; Beirlant et al., 2004b). On the other hand, reliable estimation of tail parameters in non-stationary time series with clustering effects is known to be difficult (cf. Malevergne et al., 2006). This implies that indications of heavy tails in financial return data should be handled very carefully. Still, despite the ongoing scientific discussion, it is commonly agreed that heavy tails are worth considering.

1.2 Overview of related theory

Extremal behaviour of random variables has been a subject of active research since the very beginnings of probability theory. The core basis of extreme value theory is the Fisher–Tippett theorem, which characterizes the limit distributions for maxima of i.i.d. random variables (Fisher and Tippett, 1928; Gnedenko, 1943). The Fisher–Tippett theorem states that there are only three possible limit distribution types: the Fréchet, the Gumbel, and the Weibull distributions. This result has been applied to the estimation of high quantiles and probabilities of extremal events in many application areas, such as hydrology, network modelling, and risk management in finance and insurance. An elaborate overview of the latter application area is given by Embrechts et al. (1997).

The probabilistic structure of multivariate extremes was characterized by de Haan and Resnick (1977). This seminal result provided a sound theoretical basis for various approaches to the modelling and the estimation of extremal dependence. In particular, modelling concepts based on multivariate extreme value theory and copulas have been developed for an adequate description and analysis of risks and risk portfolios. Comprehensive treatments of this topic are given in McNeil et al. (2005) and Malevergne and Sornette (2006).

Further developments in modelling extremal dependence are based on the notions of tail dependence and tail copulas, multivariate excess distributions, and the empirical distribution of excess directions. See, among others, Falk et al. (1994); Schmidt and Stadtmüller (2006); Klüppelberg and Resnick (2008); Hauksson et al. (2000). The current state of extreme value theory in both univariate and multivariate settings with particular emphasis on statistical applications is presented in de Haan and Ferreira (2006), whereas heavy-tailed models are specially treated by Resnick (2007).

There are also various applications of multivariate extreme value theory related to sums of random variables (cf. Wüthrich, 2003; Barbe et al., 2006; Böcker and Klüppelberg, 2008; Kortschak and Albrecher, 2009a). However, the focus of these results is on loss aggregation rather than on portfolio diversification. Although closely related, these problems are not identical. In particular, characterization of the risk aggregated in the sum of random variables does not provide a direct link to the location of the optimal portfolio. Furthermore, as highlighted in Remark 5.31, the influence of the tail index on diversification effects turns out to be different from its influence on the aggregated risk.

1.3 Central results and structure of the thesis

This thesis is dedicated to the diversification of extreme risks. In contrast to the Markowitz approach, the problem is reduced to the comparison and minimization of the extreme portfolio risks, while the consideration of average gains is omitted. Thus application of these results in practical portfolio optimization might need additional modelling of gains in the non-extreme region and a trade-off between the risks and the expected gains.

This problem statement should not be misunderstood as a prudent view of risk. The high abstraction level of the present approach is motivated by the interest in a purely mathematical concept of diversification effects for extreme risks. To achieve this, all conditions that are not essential to the extremal behaviour are dropped. The results are derived upon the assumption of multivariate regular variation, which specifies only the tail behaviour and the asymptotic dependence structure in the tail region. This setting is rather different from the Markowitz optimization problem, which was primarily designed for the multivariate Gaussian case. In particular, the distributions considered here need not be symmetric or have finite variances, and even the case of infinite first moments is included.

The general problem setting can be described as follows. Given a multivariate regularly varying random vector $X = (X^{(1)}, \ldots, X^{(d)})$, one is interested in the asymptotic comparison of portfolio losses

\[ \xi^\top X := \xi^{(1)} X^{(1)} + \ldots + \xi^{(d)} X^{(d)} \]

with portfolio weights $\xi^{(i)} \in \mathbb{R}$ satisfying $\xi^{(1)} + \ldots + \xi^{(d)} = 1$. This comparison is based upon the limit

\[ \gamma_\xi := \lim_{t \to \infty} \frac{P\{\xi^\top X > t\}}{P\{\|X\|_1 > t\}}. \]

The existence of this limit is obtained from the multivariate regular variation of $X$, which also implies the asymptotic quantile relation

\[ \lim_{\lambda \downarrow 0} \frac{F^{\leftarrow}_{\xi^\top X}(1 - \lambda)}{F^{\leftarrow}_{\|X\|_1}(1 - \lambda)} = \gamma_\xi^{1/\alpha}. \]

Thus $\gamma_\xi$ determines the first-order term for the approximation of high portfolio loss quantiles. As recently shown by Degen et al. (2010), first-order approximation may need further improvement by second-order terms. Still, analysis of first-order properties remains important.

The basic result underlying this thesis is the characterization of the limit ratio $\gamma_\xi$ as a functional of the portfolio vector $\xi$ and the characteristics of the multivariate regular variation, given by the tail index $\alpha$ and the spectral measure $\Psi$. This characterization also includes short positions, i.e., negative portfolio weights. Due to its extraordinary role, the functional $\gamma_\xi$ is called extreme risk index of the portfolio $\xi$. The mapping $\xi \mapsto \gamma_\xi$ characterizes the problem of portfolio optimization with respect to extreme risks. It turns out that this function is convex if the loss expectations exist and concave otherwise, under the additional assumption that the portfolio weights are non-negative and there are no extreme gains. Moreover, positive dependence of loss components decreases the total portfolio risk in case of infinite loss expectations. In the general case, i.e., with possible extreme gains and short positions, the properties of the function $\xi \mapsto \gamma_\xi$ are less definite. The convexity for $\alpha \ge 1$ remains, but the qualitative behaviour for $\alpha < 1$ depends on the spectral measure. Examples are presented where $\xi \mapsto \gamma_\xi$ is convex, concave, or neither.

These results contradict the intuition of diversification within the Markowitz theory and the theory of coherent risk measures. Similar effects are already known from aggregation results in multivariate regularly varying models (cf. Rootzén and Klüppelberg, 1999; Nešlehová et al., 2006; Embrechts et al., 2009a,b). Thus the extreme risk index $\gamma_\xi$ can be considered as an extension of these findings to diversification problems. Moreover, $\gamma_\xi$ allows the comparison of extreme portfolio risks in the infinite mean case, where coherent risk measurement is not possible (cf. Delbaen, 2009).

The next series of results is related to the estimation of the extreme risk index $\gamma_\xi$ from i.i.d. observations of the random loss vector. The estimation of the optimal portfolio $\xi_{\mathrm{opt}}$ minimizing $\gamma_\xi$ is also included. A semiparametric estimator $\widehat{\gamma}_\xi$ is proposed that exhibits the same convexity and concavity properties as the true extreme risk index $\gamma_\xi$. Thus the qualitative results on the optimization of $\gamma_\xi$ remain valid for the estimate $\widehat{\gamma}_\xi$. Asymptotic normality and strong consistency of the estimator $\widehat{\gamma}_\xi$ are obtained uniformly over compact sets of portfolio vectors $\xi$. The uniform asymptotic normality means weak convergence of the properly normalized estimation error to a Gaussian limit process with index $\xi$. The proofs incorporate empirical process theory presented in van der Vaart and Wellner (1996). The uniform strong consistency of $\widehat{\gamma}_\xi$ immediately yields the strong consistency of the estimated optimal portfolio $\widehat{\xi}_{\mathrm{opt}}$.

In addition to the comparison of risks for different portfolio vectors, this thesis addresses the comparison of stochastic models. A new notion of stochastic ordering is introduced, appropriate to the ordering of random vectors with respect to the resulting extreme portfolio risks. This asymptotic portfolio loss order $\preceq_{\mathrm{apl}}$ is characterized for multivariate regularly varying random vectors. The equivalent criterion obtained here incorporates ordering of marginal distributions with respect to $\preceq_{\mathrm{apl}}$ and ordering of spectral measures with respect to a specific integral order relation that is linked to the extreme risk index $\gamma_\xi$. In addition to the criteria based upon spectral measures, sufficient criteria for $\preceq_{\mathrm{apl}}$ are obtained in terms of some well-known multivariate stochastic order relations.

The statistical results are complemented by an extensive simulation study. Monte Carlo performance tests for the estimators $\widehat{\gamma}_\xi$ and $\widehat{\xi}_{\mathrm{opt}}$ are implemented in two exemplary models. The models are chosen to illustrate how the estimates of the tail index $\alpha$ and the spectral measure $\Psi$ influence the bias and the variability of the estimator $\widehat{\gamma}_\xi$. Furthermore, the optimization quality achieved by the estimator $\widehat{\xi}_{\mathrm{opt}}$ is compared for different values of the tail index and the degree of dependence between the loss components. In particular, simulation results demonstrate that portfolio optimization may be problematic if $\alpha$ is close to 1 and the loss components are non-negative. The optimization results obtained in other cases are reliable.

The thesis is organized as follows. Chapter 2 provides the basic notation of multivariate regular variation and multivariate extreme value theory, followed by an overview of alternative approaches to the modelling of dependence structures, auxiliary results, and examples. Chapter 3 is dedicated to the characterization of diversification effects in multivariate regularly varying models and the resulting optimization problem. The extreme risk index $\gamma_\xi$ is introduced, analysed, and applied to the minimization of risk measures. Chapter 4 comprises the statistical results, including consistency and asymptotic normality of the estimated extreme risk index $\widehat{\gamma}_\xi$ and consistency of the estimated optimal portfolio $\widehat{\xi}_{\mathrm{opt}}$. Stochastic ordering with respect to extreme portfolio losses is discussed in Chapter 5. Finally, the simulation study is presented in Chapter 6. Auxiliary results on regular variation and empirical processes are collected in Appendix A.

Some of the results presented in Chapter 3 and a special case of the results obtained in Chapter 4 are published in Mainik and Rüschendorf (2010).

1.4 Acknowledgements

I would like to thank my scientific advisor Professor Dr. Ludger Rüschendorf for drawing my attention to extremal dependence structures, for encouraging me to work independently, and for the support he gave me by his valuable comments and decisive questions.

I thank my colleagues at the Department of Mathematical Stochastics for the friendly atmosphere, especially appreciating the time and the discussions I had with Wolfgang Kluge, Nataliya Koval, Joachim Schneegans, Eva-Maria Schopp, and Volker Pohl. Further thanks go to the department's heart and soul, Mrs. Monika Hattenbach, for stylistic proofreading and for serving as a wandering encyclopaedia of LaTeX layout tricks.

I thank my friends Katja Guschanski, Adrian Kantian, Florian Dennert, Hilmar Böhm, and Roman Iakoubov for the time they shared with me and for all the moments of humour and truth.

Greatest thanks go to my family. I thank my parents, Johannes and Ludmila Mainik, and my brother Andreas for their unconditional support that gave me so much confidence. Enormous thanks are due to my wife, Annette, for walking this path along with me and being a constant source of power, will, and joy.

Chapter 2

Multivariate regular variation

This chapter introduces the heavy-tailed multivariate regularly varying probability distributions, which constitute the theoretical framework of the thesis. The definition and the basic properties of univariate and multivariate regular variation are given in Section 2.1, whereas further extensions, related notions, examples, and auxiliary results needed later are collected in a sequence of sections addressing special topics.

Thus, the so-called canonical standardizations of exponent and spectral measures are introduced in Section 2.2, while the notion of dependence functions is the subject of Section 2.3, and the copula approach to the modelling of multivariate extremes is discussed in Section 2.4. Highlighting the interconnections between the different dependence notions, these sections put them into a common theoretical framework.

The chapter is concluded by two sections that deal with specific models appearing in examples of Chapters 5 and 6. Section 2.5 presents an exemplary computation of the canonical spectral densities associated with Gumbel copulas. Finally, Section 2.6 provides a general representation for the spectral densities of multivariate regularly varying elliptical distributions.

2.1 Basic notation and model assumptions

Let $X$ be a random vector in $\mathbb{R}^d$ with components $X^{(1)}, \ldots, X^{(d)}$ representing the gains and the losses of risky assets. Focusing on the risks, let $X$ be a random loss vector, i.e., $X^{(i)} > 0$ quantifies losses and $X^{(i)} < 0$ quantifies gains generated by the $i$-th asset. According to the application area (financial or actuarial), it is natural to distinguish between the general case with components $X^{(i)}$ taking both positive and negative values (loss-gain case) and the case when the value domains of all components $X^{(i)}$ are restricted to $\mathbb{R}_+ := [0, \infty)$ (pure loss case). As it turns out later, portfolio diversification for loss vectors $X \in \mathbb{R}_+^d$ exhibits some remarkable properties that do not hold in the general case.

Denoting the weight of the $i$-th asset in the portfolio by $\xi^{(i)}$, one obtains the portfolio vector $\xi \in \mathbb{R}_+^d$, or $\xi \in \mathbb{R}^d$ if negative portfolio weights (short positions) are permitted. The portfolio loss is given by the scalar product of the portfolio vector $\xi$ and the loss vector $X$. In the following, vectors are regarded as columns and the portfolio loss is written as $\xi^\top X$:

\[ \xi^\top X := \xi^{(1)} X^{(1)} + \ldots + \xi^{(d)} X^{(d)}. \]

As a special case, this notation includes components $X^{(i)}$ representing relative losses of assets $Z^{(i)}$:

\[ X^{(i)} = \frac{Z_0^{(i)} - Z_T^{(i)}}{Z_0^{(i)}}, \quad i = 1, \ldots, d. \]

In this setting the scalar product $\xi^\top X$ equals the random loss generated by investing the value $\xi^{(i)}$ in the $i$-th asset for $i = 1, \ldots, d$:

\[ \xi^\top X = \sum_{i=1}^{d} \frac{\xi^{(i)}}{Z_0^{(i)}} \left( Z_0^{(i)} - Z_T^{(i)} \right). \]

Furthermore, the relative loss of the portfolio $\xi$ is given by $\xi^\top X / \sum_{i=1}^{d} \xi^{(i)}$.

The probability distributions of the loss components $X^{(i)}$ are assumed to be heavy-tailed in the sense that the tail index of each $X^{(i)}$ is finite:

\[ \alpha_i := \sup \left\{ \beta \in [0, \infty) : E \left| X^{(i)} \right|^{\beta} < \infty \right\} < \infty. \tag{2.1} \]

Sometimes it is more convenient to operate with the upper and the lower tail index of a random variable $Y$, obtained from the positive part $Y_+ := Y \cdot 1\{Y > 0\}$ and the negative part $Y_- := |Y| \cdot 1\{Y < 0\}$, respectively. It is obvious that the total tail index of $Y$ defined in (2.1) is equal to the minimum of the upper and the lower one. A random variable $Y$ is called heavier-tailed than another random variable $Z$ if the tail index of $Y$ is lower than that of $Z$.

It is well known that in the case of unequal component tail indices $\alpha_i$ the contribution of lighter tails to the portfolio loss $\xi^\top X$ is asymptotically negligible if all portfolio weights $\xi^{(i)}$ are positive and there is no mutual neutralization of the heavier tails due to linear dependence between gains and losses. Consequently, the study of asymptotic diversification effects can be reduced to the non-trivial case by assuming that the component tail indices $\alpha_i$ are equal:

\[ \alpha_1 = \ldots = \alpha_d =: \alpha. \tag{2.2} \]

The heavy-tail property (2.1) is strengthened by the assumption of (univariate) regular variation.

Definition 2.1. A non-negative random variable $Y$ is called regularly varying with tail index $\alpha \in [0, \infty)$ if the following condition is satisfied:

\[ \forall x > 0: \quad \frac{P\{Y > tx\}}{P\{Y > t\}} \to x^{-\alpha}, \quad t \to \infty. \tag{2.3} \]
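Condition (2.3) can be checked numerically for a concrete distribution. The following Python sketch is an illustration that is not part of the original text: it assumes a Burr-type survival function $P\{Y > t\} = (1 + t^c)^{-k}$ with the hypothetical parameters $c = 2$ and $k = 1.5$, which is regularly varying with tail index $\alpha = ck = 3$, and evaluates the ratio in (2.3) for growing thresholds $t$.

```python
import numpy as np

def burr_survival(t, c=2.0, k=1.5):
    """Survival function P{Y > t} = (1 + t^c)^(-k) of a Burr-type variable, t >= 0.

    The parameters c and k are illustrative assumptions, not taken from the text.
    """
    return (1.0 + np.power(t, c)) ** (-k)

def tail_ratio(t, x, c=2.0, k=1.5):
    """Ratio P{Y > t*x} / P{Y > t} appearing in condition (2.3)."""
    return burr_survival(t * x, c, k) / burr_survival(t, c, k)

alpha = 2.0 * 1.5          # tail index c * k implied by the assumed parameters
x = 2.0
thresholds = (1e1, 1e2, 1e3, 1e4)
ratios = [tail_ratio(t, x) for t in thresholds]
limit = x ** (-alpha)      # the limit x^(-alpha) claimed by (2.3)

for t, ratio in zip(thresholds, ratios):
    print(f"t = {t:8.0f}   ratio = {ratio:.8f}   limit = {limit:.8f}")
```

The deviation from the limit shrinks as $t$ grows; it reflects the slowly varying factor of the survival function.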

It is easy to see that regular variation (2.3) implies the heavy-tail property (2.1) and that the tail index $\alpha$ characterizing these two properties is necessarily the same. Analogously to the heavy-tail property, regular variation can also be considered separately for upper and lower tails.

The notion of regular variation for random variables and corresponding probability distributions is intimately related to the regular variation of functions. A function $f$ defined on a neighbourhood of $\infty$, $f : (c, \infty) \to \mathbb{R}$, $c \in \mathbb{R}$, is called regularly varying (at $\infty$) if there exists a function $g : (0, \infty) \to \mathbb{R}$ such that

\[ \forall x > 0: \quad \lim_{t \to \infty} \frac{f(tx)}{f(t)} = g(x). \tag{2.4} \]

It is well known that for measurable functions $f$ any solution $g$ of (2.4) is a power function,

\[ g(x) = x^{\beta}, \quad \beta \in \mathbb{R}, \tag{2.5} \]

and that the regular variation index $\beta$ in (2.5) is unique for each $f$. In the special case $\beta = 0$ the function $f$ is called slowly varying, and any measurable regularly varying function $f$ necessarily has the form

\[ f(t) = l(t) \cdot t^{\beta} \tag{2.6} \]

with a slowly varying function $l$. For more details on regular variation of functions see Bingham et al. (1987).

Thus, in terms of the probability distribution function corresponding to a random variable $Y$,

\[ F_Y(t) := P\{Y \le t\}, \]

regular variation of $Y$ with tail index $\alpha \in [0, \infty)$ in the upper or in the lower tail means regular variation with index $\beta = -\alpha$ of the function $t \mapsto 1 - F_Y(t)$ at $t = \infty$ or of the function $t \mapsto F_Y(t)$ at $t = -\infty$, respectively.

In order to obtain a non-trivial dependence structure in the tails, the univariate regular variation of the asset losses $X^{(i)}$ is strengthened by the assumption of multivariate regular variation.

Definition 2.2. A random vector $X$ taking values in $\mathbb{R}^d$ is called multivariate regularly varying with tail index $\alpha \in (0, \infty)$ if there exist a sequence $a_n \to \infty$ and a (non-zero) Radon measure $\nu$ on the Borel $\sigma$-field $\mathcal{B}([-\infty, \infty]^d \setminus \{0\})$ such that $\nu([-\infty, \infty]^d \setminus \mathbb{R}^d) = 0$ and, as $n \to \infty$,

\[ n P^{a_n^{-1} X} \xrightarrow{v} \nu \quad \text{on } \mathcal{B}([-\infty, \infty]^d \setminus \{0\}), \tag{2.7} \]

where $\xrightarrow{v}$ denotes the vague convergence of Radon measures and $P^{a_n^{-1} X}$ is the probability distribution of $a_n^{-1} X$.

It should be noted that random vectors with non-negative components yield limit measures $\nu$ that are concentrated on $[0, \infty]^d \setminus \{0\}$. Therefore multivariate regular variation in this special case can also be defined by vague convergence on $\mathcal{B}([0, \infty]^d \setminus \{0\})$. For a full account of technical details related to the notion of multivariate regular variation, vague convergence, and the Borel $\sigma$-fields on the punctured spaces $[-\infty, \infty]^d \setminus \{0\}$ and $[0, \infty]^d \setminus \{0\}$ the reader is referred to Resnick (2007) or Lindskog (2004).

It is well known that the limit measure $\nu$ obtained in (2.7) is unique except for a constant factor, has a singularity in the origin in the sense that $\nu((-\varepsilon, \varepsilon)^d) = \infty$ for any $\varepsilon > 0$, and exhibits the scaling property

\[ \nu(tA) = t^{-\alpha} \nu(A) \tag{2.8} \]

for all sets $A \in \mathcal{B}([-\infty, \infty]^d \setminus \{0\})$ that are bounded away from 0.

It is also well known that (2.7) implies that the random variable $\|X\|$ with an arbitrary norm $\|\cdot\|$ on $\mathbb{R}^d$ is univariate regularly varying with tail index $\alpha$. Moreover, the sequence $a_n$ can always be chosen as

\[ a_n := F^{\leftarrow}_{\|X\|}(1 - 1/n), \tag{2.9} \]

where $F^{\leftarrow}_{\|X\|}$ is the quantile function of $\|X\|$. The resulting limit measure $\nu$ is normalized on the set $A_{\|\cdot\|} := \{x \in \mathbb{R}^d : \|x\| > 1\}$ by

\[ \nu\left(A_{\|\cdot\|}\right) = 1. \tag{2.10} \]

Thus, after normalizing $\nu$ by (2.10), the scaling relation (2.8) yields an equivalent rewriting of the multivariate regular variation condition (2.7) in terms of weak convergence:

\[ \mathcal{L}\left(t^{-1} X \mid \|X\| > t\right) \xrightarrow{w} \nu|_{A_{\|\cdot\|}} \quad \text{on } \mathcal{B}\left(A_{\|\cdot\|}\right) \tag{2.11} \]

for $t \to \infty$, where $\nu|_{A_{\|\cdot\|}}$ is the restriction of $\nu$ to the set $A_{\|\cdot\|}$.

In addition to (2.7) it is assumed that the limit measure $\nu$ is non-degenerate in the following sense:

\[ \nu\left(\left\{x \in \mathbb{R}^d : \left|x^{(i)}\right| > 1\right\}\right) > 0, \quad i = 1, \ldots, d. \tag{2.12} \]

This assumption ensures that all asset losses $X^{(i)}$ are relevant for the extremes of the portfolio loss $\xi^\top X$. If (2.12) is satisfied in the upper tail region, i.e., if

\[ \nu\left(\left\{x \in \mathbb{R}^d : x^{(i)} > 1\right\}\right) > 0, \quad i = 1, \ldots, d, \tag{2.13} \]

then $\nu$ also characterizes the asymptotic distribution of the componentwise maxima $M_n := (M^{(1)}, \ldots, M^{(d)})$ with $M^{(i)} := \max\{X_1^{(i)}, \ldots, X_n^{(i)}\}$ by the limit relation

\[ P\left\{a_n^{-1} M_n \in [-\infty, x]\right\} \xrightarrow{w} \exp\left(-\nu\left([-\infty, \infty]^d \setminus [-\infty, x]\right)\right) \tag{2.14} \]

for $x \in (0, \infty]^d$. Therefore $\nu$ is called exponent measure. For more details concerning the asymptotic distributions of maxima see Resnick (1987).

Another consequence of the scaling property (2.8) is the product representation of $\nu$ in polar coordinates

\[ (r, s) := \tau(x) := \left(\|x\|, \|x\|^{-1} x\right) \]

with respect to an arbitrary norm $\|\cdot\|$ on $\mathbb{R}^d$. The induced measure $\nu^{\tau} := \nu \circ \tau^{-1}$ necessarily satisfies

\[ \nu^{\tau} = c \cdot \rho_\alpha \otimes \Psi \tag{2.15} \]

with the constant factor $c = \nu(A_{\|\cdot\|}) > 0$, the measure $\rho_\alpha$ on $(0, \infty]$ defined by

\[ \rho_\alpha((x, \infty]) := x^{-\alpha}, \quad x \in (0, \infty], \tag{2.16} \]

and a probability measure $\Psi$ on the unit sphere $S^d_{\|\cdot\|}$ with respect to $\|\cdot\|$,

\[ S^d_{\|\cdot\|} := \left\{s \in \mathbb{R}^d : \|s\| = 1\right\}. \]

The measure $\Psi$ is called spectral measure of $\nu$ or $X$. Since the term "spectral measure" is already used in other areas, $\Psi$ is also referred to as angular measure. In the special case of $\mathbb{R}_+^d$-valued random vectors $X$ it may be convenient to reduce the domain of $\Psi$ to $S^d_{\|\cdot\|} \cap \mathbb{R}_+^d$.
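The polar decomposition suggests a simple empirical view of the spectral measure: transform the observations into polar coordinates and keep the angular parts of those with the largest norms. The following Python sketch is a rough illustration under assumptions that are not taken from the text: the 1-norm, i.i.d. Pareto components with $\alpha = 1$, the sample size, and the helper names `polar_transform` and `extreme_directions` are all hypothetical choices. For independent components the extreme directions are expected to cluster near the coordinate axes.

```python
import numpy as np

def polar_transform(x, p=1):
    """Map each row of x to polar coordinates (r, s) = (||x||_p, x / ||x||_p)."""
    r = np.linalg.norm(x, ord=p, axis=1)
    s = x / r[:, None]
    return r, s

def extreme_directions(x, k, p=1):
    """Return the angular parts of the k observations with the largest norm."""
    r, s = polar_transform(x, p=p)
    idx = np.argsort(r)[-k:]
    return s[idx]

rng = np.random.default_rng(0)
alpha = 1.0  # assumed tail index of the simulated margins
# i.i.d. Pareto(alpha) losses via inverse transform; 1 - U lies in (0, 1].
sample = (1.0 - rng.random((100_000, 2))) ** (-1.0 / alpha)

directions = extreme_directions(sample, k=100)
# Share of the dominating component in each extreme observation.
dominant_share = directions.max(axis=1)
print("mean share of the largest component:", dominant_share.mean())
```

The number `k` of retained observations plays the role of a threshold choice; in this sketch it is fixed by hand.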

Although the domain of the spectral measure $\Psi$ depends on the norm $\|\cdot\|$ underlying the polar coordinates, the representation (2.15) is norm-independent in the following sense: if (2.15) holds for some norm $\|\cdot\|$, then it also holds for any other norm $\|\cdot\|'$ that is equivalent to $\|\cdot\|$. Hence, due to the equivalence of all norms on $\mathbb{R}^d$, the product representation (2.15) holds for any norm on $\mathbb{R}^d$. The tail index $\alpha$ is the same, and the spectral measure $\Psi'$ on the unit sphere corresponding to $\|\cdot\|'$ is obtained from $\Psi$ by the following transformation:

\[ \Psi' = \Psi^{T}, \quad T(s) := \left(\|s\|'\right)^{-1} s. \]

Consequently, only the factor $c = \nu(A_{\|\cdot\|})$ in (2.15) depends on the underlying norm $\|\cdot\|$ and the exponent measure $\nu$. However, since the normalization of $\nu$ is not unique, the factor $c$ can be chosen arbitrarily. In particular, setting $c = 1$ for a given norm $\|\cdot\|$ does not lead to any loss of generality.

Finally, it should be noted that multivariate regular variation of the loss vector $X$ is intimately related to the univariate regular variation of portfolio losses $\xi^\top X$. As shown in Basrak et al. (2002), multivariate regular variation of $X$ implies existence of a portfolio vector $\xi_0 \in \mathbb{R}^d$ such that $\xi_0^\top X$ is regularly varying with tail index $\alpha$ and any portfolio loss $\xi^\top X$ satisfies

\[ \lim_{t \to \infty} \frac{P\{\xi^\top X > t\}}{P\{\xi_0^\top X > t\}} = c(\xi, \xi_0) \in [0, \infty). \tag{2.17} \]

This means that all portfolio losses $\xi^\top X$ are either regularly varying with tail index $\alpha$ or asymptotically negligible compared to $\xi_0^\top X$.

Moreover, it is also worth remarking that for $\mathbb{R}_+^d$-valued random vectors $X$ the converse implication is true in the sense that (2.17) and univariate regular variation of $\xi_0^\top X$ imply multivariate regular variation of the random vector $X$. This sort of Cramér-Wold theorem was established in Basrak et al. (2002) and Boman and Lindskog (2009).

For further details on regular variation of functions or random variables and related applications in extreme value theory the reader is referred to the vast literature on these topics. See, among others, Bingham et al. (1987); Resnick (1987); Basrak et al. (2002); Hult and Lindskog (2006); de Haan and Ferreira (2006); Resnick (2007).
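The limit relation (2.17) can be illustrated by a small Monte Carlo experiment. The following Python sketch rests on assumptions that are not part of the text: two i.i.d. Pareto losses with tail index $\alpha = 1.5$, the portfolios $\xi = (0.3, 0.7)$ and $\xi_0 = (1, 0)$, and a finite threshold $t = 100$ standing in for the limit. For independent, identically distributed, non-negative regularly varying components it is a standard fact that the limit in (2.17) equals $\sum_i (\xi^{(i)})^{\alpha}$ when $\xi_0$ selects a single component; the sketch compares the empirical ratio with this value.

```python
import numpy as np

def pareto_sample(rng, alpha, size):
    """Pareto(alpha) variates on [1, inf) via inverse transform sampling."""
    return (1.0 - rng.random(size)) ** (-1.0 / alpha)

rng = np.random.default_rng(1)
alpha, n, t = 1.5, 2_000_000, 100.0   # illustrative choices, not from the text

x = pareto_sample(rng, alpha, (n, 2))  # i.i.d. loss components
xi = np.array([0.3, 0.7])              # portfolio under consideration
xi0 = np.array([1.0, 0.0])             # reference portfolio: first asset only

# Empirical exceedance probability of the portfolio loss xi^T X.
p_xi = np.mean(x @ xi > t)
# Exact exceedance probability of xi0^T X = X^(1) for the Pareto margin.
p_xi0 = t ** (-alpha)

ratio = p_xi / p_xi0
limit_iid = np.sum(xi ** alpha)        # limit constant in the i.i.d. case
print(f"empirical ratio: {ratio:.3f}, limit for i.i.d. components: {limit_iid:.3f}")
```

The empirical ratio carries both Monte Carlo noise and a bias from the finite threshold, so only rough agreement with the limit should be expected.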

2.2 Canonical forms of exponent and spectral measures

An immediate consequence of the multivariate regular variation property is the characterization of asymptotic distributions of loss vectors $X$ in terms of the exponent measure $\nu$. Moreover, standardized by setting $c = 1$ in the product representation (2.15), the exponent measure $\nu$ depends only on the tail index $\alpha$ characterizing the severity of losses and the spectral measure $\Psi$ that can be regarded as the asymptotic distribution of loss directions. However, in some applications other standardizations of the exponent measure $\nu$ may be more suitable. In particular, there may be a demand for versions of $\nu$ and $\Psi$ that are invariant under specific marginal transformations, such as multiplication of each $X^{(i)}$ by a constant factor $c_i > 0$.

This technical issue is addressed by the so-called canonical exponent measures, defined as

\[ \nu^* := \nu \circ T, \tag{2.18} \]

where the transformation $T$ is given by

\[ T : \left(x^{(1)}, \ldots, x^{(d)}\right) \mapsto \left(T_\alpha\left(\nu(B_1) \cdot x^{(1)}\right), \ldots, T_\alpha\left(\nu(B_d) \cdot x^{(d)}\right)\right) \tag{2.19} \]

with

\[ B_i := \left\{x \in \mathbb{R}^d : \left|x^{(i)}\right| > 1\right\}, \quad i = 1, \ldots, d, \]

and

\[ T_\alpha(t) := t_+^{1/\alpha} - t_-^{1/\alpha}. \]

The transformed measure $\nu^*$ has unit marginal weights $\nu^*(B_i)$,

\[ \nu^*(B_i) = \nu(T(B_i)) = \nu\left(\nu(B_i)^{1/\alpha} \cdot B_i\right) = \nu(B_i)^{-1} \cdot \nu(B_i) = 1, \quad i = 1, \ldots, d, \tag{2.20} \]

and inherits the scaling property (2.8) with the scaling index $-\alpha$ standardized to $-1$:

\[ \nu^*(tA) = \nu(T(tA)) = \nu\left(t^{1/\alpha} T(A)\right) = t^{-1} \nu^*(A). \tag{2.21} \]

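The scaling property (2.21) implies a product representation of $\nu^*$ in polar coordinates, and the canonical spectral measure $\Psi^*$ is defined as the angular component of this product:

\[ \nu^* \circ \tau^{-1} = \rho_1 \otimes \Psi^*, \tag{2.22} \]

with the measure $\rho_1$ defined according to (2.16), i.e., $\rho_1((x, \infty]) := x^{-1}$. The unit margin condition (2.20) can be equivalently written as

\begin{align*}
1 &= \nu^*(B_i) \\
  &= \int_{S^d_{\|\cdot\|}} \int_{(0,\infty)} 1\left\{r \cdot \left|s^{(i)}\right| > 1\right\} \, d\rho_1(r) \, d\Psi^*(s) \\
  &= \int_{S^d_{\|\cdot\|}} \int_{(0,\infty)} 1\left\{s^{(i)} \ne 0\right\} \cdot 1\left\{r > \left|s^{(i)}\right|^{-1}\right\} \, d\rho_1(r) \, d\Psi^*(s) \\
  &= \int_{S^d_{\|\cdot\|}} 1\left\{s^{(i)} \ne 0\right\} \left|s^{(i)}\right| \, d\Psi^*(s) \\
  &= \int_{S^d_{\|\cdot\|}} \left|s^{(i)}\right| \, d\Psi^*(s), \quad i = 1, \ldots, d. \tag{2.23}
\end{align*}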

It is easy to see that $\nu^*$ and $\Psi^*$ are invariant under componentwise rescaling, i.e., the canonical exponent measure $\nu^*_{wX}$ and the canonical spectral measure $\Psi^*_{wX}$ of the random vector

\[ wX := \left(w^{(1)} X^{(1)}, \ldots, w^{(d)} X^{(d)}\right) \]

with $w \in (0, \infty)^d$ are equal to those of $X$:

\[ \nu^*_{wX} = \nu^*_X, \quad \Psi^*_{wX} = \Psi^*_X. \tag{2.24} \]

In some literature, the canonical standardizations $\nu^*$ and $\Psi^*$ are referred to as the exponent or spectral measures.

For random vectors $X$ with values in $\mathbb{R}_+^d$ and exponent measures $\nu$ defined on $\mathcal{B}([0, \infty]^d \setminus \{0\})$ the representation (2.19) of the transformation $T$ can be simplified to

\[ T(x) = \left(\left(\nu\left(B_1 \cap \mathbb{R}_+^d\right) \cdot x^{(1)}\right)^{1/\alpha}, \ldots, \left(\nu\left(B_d \cap \mathbb{R}_+^d\right) \cdot x^{(d)}\right)^{1/\alpha}\right) \tag{2.25} \]

and the unit margin condition (2.20) can be written as

\[ \nu^*\left(\left\{x \in \mathbb{R}_+^d : x^{(i)} > 1\right\}\right) = 1, \quad i = 1, \ldots, d. \tag{2.26} \]

Moreover, the spectral measure $\Psi^*$ is defined on $\mathcal{B}(S^d_{\|\cdot\|} \cap \mathbb{R}_+^d)$, which allows (2.23) to be rewritten as

\[ \int_{S^d_{\|\cdot\|} \cap \mathbb{R}_+^d} s^{(i)} \, d\Psi^*(s) = 1, \quad i = 1, \ldots, d. \]

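It should be noted that the canonical spectral measure $\Psi^*$ still depends on the underlying norm. If the polar coordinates are based on the 1-norm, the unit margin condition (2.23) implies that the total mass of $\Psi^*$ is constant:
\[ \int_{S^d_1} d\Psi^*(s) = \int_{S^d_1} \|s\|_1 \, d\Psi^*(s) = \int_{S^d_1} \left(s^{(1)} + \ldots + s^{(d)}\right) d\Psi^*(s) = d. \tag{2.27} \]
In order to obtain formulas translating Ψ into the canonical form $\Psi^*$ and vice versa, consider
\[
\begin{aligned}
\Psi^*(A) &= \nu^*\left(\tau^{-1}((1, \infty) \times A)\right) = \nu\left(T \circ \tau^{-1}((1, \infty) \times A)\right) \\
&= \rho_\alpha \otimes \Psi\left(\tau \circ T \circ \tau^{-1}((1, \infty) \times A)\right) \\
&= \int_{S^d_{\|\cdot\|}} \int_{(0,\infty)} 1\left\{(r, s) \in \tau \circ T \circ \tau^{-1}((1, \infty) \times A)\right\} d\rho_\alpha(r) \, d\Psi(s),
\end{aligned}
\]
where $\tau(x) = (\|x\|, \|x\|^{-1} x)$ is the transformation into polar coordinates and T is the transformation defined in (2.19). Due to $\tau^{-1}(r, s) = rs$ and $T^{-1}(rs) = r^\alpha T^{-1}(s)$, the integrand can be rewritten as
\[
\begin{aligned}
1\left\{\tau \circ T^{-1} \circ \tau^{-1}(r, s) \in (1, \infty) \times A\right\}
&= 1\left\{\tau \circ T^{-1}(rs) \in (1, \infty) \times A\right\} \\
&= 1\left\{\left\|T^{-1}(rs)\right\| > 1\right\} \cdot 1\left\{\left\|T^{-1}(rs)\right\|^{-1} T^{-1}(rs) \in A\right\} \\
&= 1\left\{r > \left\|T^{-1}(s)\right\|^{-1/\alpha}\right\} \cdot 1\left\{\left\|T^{-1}(s)\right\|^{-1} T^{-1}(s) \in A\right\}.
\end{aligned}
\]
Finally, the identity
\[ \int_{(0,\infty)} 1\left\{r > \left\|T^{-1}(s)\right\|^{-1/\alpha}\right\} d\rho_\alpha(r) = \left\|T^{-1}(s)\right\| \]
implies
\[ \Psi^*(A) = \int_{S^d_{\|\cdot\|}} 1\left\{\left\|T^{-1}(s)\right\|^{-1} T^{-1}(s) \in A\right\} \cdot \left\|T^{-1}(s)\right\| \, d\Psi(s). \tag{2.28} \]
Similar arguments yield
\[ \Psi(A) = \nu^*\left(T^{-1} \circ \tau^{-1}((1, \infty) \times A)\right) = \int_{S^d_{\|\cdot\|}} 1\left\{\|T(s)\|^{-1} T(s) \in A\right\} \cdot \|T(s)\|^\alpha \, d\Psi^*(s). \tag{2.29} \]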

Besides the transformation T defined in (2.19), there is another approach that leads to the canonical exponent measure $\nu^*$. Let the random vector X be restricted to $\mathbb{R}^d_+$ and denote the probability distribution function of each component $X^{(i)}$ by $F_i$. Then the random vector
\[ Z := \left(\frac{1}{1 - F_1(X^{(1)})}, \ldots, \frac{1}{1 - F_d(X^{(d)})}\right) \tag{2.30} \]
satisfies the multivariate regular variation condition (2.7) with exponent measure $\nu^*$ and $a_n = n$:
\[ n P^{n^{-1} Z} \xrightarrow{v} \nu^* \quad \text{on } \mathcal{B}([0, \infty]^d \setminus \{0\}). \tag{2.31} \]
If X is not restricted to $\mathbb{R}^d_+$, this approach can be applied in each orthant separately.

It is worth remarking that canonical standardizations $\nu^*$ play a major role beyond the framework of classical multivariate regular variation with components $X^{(i)}$ having the same tail index. The standardization (2.30) and the limit condition (2.31) provide a basis for the characterization of multivariate extremes of random vectors with component tail indices $\alpha_i$ that are not necessarily equal. However, since different $\alpha_i$ yield trivial asymptotics of portfolio losses, this case is out of scope here.

For further details on canonical exponent measures and multivariate extreme value theory in the general case the reader is referred to Resnick (1987); de Haan and Ferreira (2006); Resnick (2007).
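The marginal standardization (2.30) can be illustrated by simulation. The sketch below assumes toy Pareto-type margins $F_i(x) = 1 - (1 + x/c_i)^{-\alpha}$ with hypothetical scales $c_i$ and independent components; these choices are not taken from the text and serve only to show that each $Z^{(i)}$ has the standard Pareto tail $P\{Z^{(i)} > t\} = 1/t$, whatever the scale of $X^{(i)}$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
alpha = 3.0
scales = np.array([1.0, 5.0])          # hypothetical marginal scales c_i

def survival(x, scale, alpha):
    """Assumed marginal survival function 1 - F_i(x) = (1 + x / c_i)^(-alpha)."""
    return (1.0 + x / scale) ** (-alpha)

# Sample X by inversion of the assumed margins (independent components).
u = rng.random((n, 2))
x = scales * ((1.0 - u) ** (-1.0 / alpha) - 1.0)

# Standardization (2.30): Z_i = 1 / (1 - F_i(X_i)).
z = 1.0 / survival(x, scales, alpha)

# Empirical tail probabilities of both components at a common threshold.
t = 50.0
empirical_tail = (z > t).mean(axis=0)
```

Both entries of `empirical_tail` should be close to `1 / t`, although the two margins of X differ by a factor of five in scale.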

2.3 Extreme value distributions and dependence functions

As already highlighted in (2.14), the exponent measure ν is intimately related to the asymptotic distributions of componentwise maxima. It is easy to see that the distributional limits

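\[ G(x) = \exp\left(-\nu\left([-\infty, x]^c\right)\right), \quad x \in \mathbb{R}^d_+, \tag{2.32} \]
obtained in (2.14) satisfy
\[
\begin{aligned}
G^t(x) &= \exp\left(-t \nu\left([-\infty, x]^c\right)\right) \\
&= \exp\left(-\nu\left(\left[-\infty, t^{-1/\alpha} x\right]^c\right)\right) \\
&= G\left(t^{-1/\alpha} x\right), \quad t \in (0, \infty), \ x \in \mathbb{R}^d_+.
\end{aligned}
\tag{2.33}
\]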
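The identity (2.33) can be verified on a simple example. The sketch below assumes an exponent measure concentrated on the axes, $\nu([0, x]^c) = \sum_i c_i (x^{(i)})^{-\alpha}$, which corresponds to asymptotically independent components; the weights $c_i$ are hypothetical and the example is not taken from the text.

```python
import numpy as np

alpha = 1.5
c = np.array([2.0, 0.5, 1.0])   # hypothetical marginal weights

def nu_complement(x):
    """Assumed exponent measure of [0, x]^c: sum_i c_i * x_i^(-alpha)."""
    x = np.asarray(x, dtype=float)
    return float(np.sum(c * x ** (-alpha)))

def G(x):
    """Limit distribution (2.32) generated by the assumed exponent measure."""
    return np.exp(-nu_complement(x))

x = np.array([1.3, 0.7, 2.1])
t = 4.0

# Both sides of (2.33): G^t(x) and G(t^(-1/alpha) x).
lhs = G(x) ** t
rhs = G(t ** (-1.0 / alpha) * x)
```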

Consequently, these distributions are max-stable, satisfying the characteristic condition

\[ G^t(x) = G\left(a_1(t) \cdot x^{(1)} + b_1(t), \ldots, a_d(t) \cdot x^{(d)} + b_d(t)\right) \]
(cf. Resnick, 1987) with $a_i(t) = t^{-1/\alpha}$ and $b_i = 0$ for $i = 1, \ldots, d$.

The standardization of the exponent measure ν to the canonical form $\nu^*$ can be regarded as detachment of the dependence structure from marginal distributions. Indeed, the max-stable distribution generated by $\nu^*$,
\[ G^*(x) := \exp\left(-\nu^*\left([-\infty, x]^c\right)\right), \quad x \in \mathbb{R}^d_+, \tag{2.34} \]
is simple max-stable, i.e., the marginal distributions $G^*_i$ are unit Fréchet,
\[ G^*_i(t) = \exp\left(-t^{-1}\right), \quad i = 1, \ldots, d, \]
and $G^*$ satisfies
\[ \left(G^*(x)\right)^t = G^*\left(t^{-1} x\right), \quad t \in (0, \infty), \ x \in \mathbb{R}^d_+. \tag{2.35} \]
Moreover, it is easy to see that G can be obtained from $G^*$ by transformation of margins:
\[ G(x) = \exp\left(-\nu\left([-\infty, x]^c\right)\right) = \exp\left(-\nu^* \circ T^{-1}\left([-\infty, x]^c\right)\right) = G^*\left(T^{-1}(x)\right) \tag{2.36} \]
with T defined in (2.19).

A popular approach to the parametrization of dependence structures characterized by $\nu^*$ is based on dependence functions
\[ L(x) := -\log G^*(1/x) = \nu^*\left([-\infty, 1/x]^c\right), \quad x \in \mathbb{R}^d_+, \tag{2.37} \]
where 1/x is understood componentwise: $1/x := (1/x^{(1)}, \ldots, 1/x^{(d)})$. In some literature, dependence functions are also referred to as stable tail dependence functions or tail dependence functions (cf. Beirlant et al., 2004a; de Haan and Ferreira, 2006). Thus, in terms of dependence functions, the scaling property (2.21) of $\nu^*$ can be written as
\[ t L(x) = L(tx), \quad t > 0, \ x \in \mathbb{R}^d_+. \tag{2.38} \]
Obviously, (2.37) allows one to obtain the canonical exponent measure $\nu^*$ from the corresponding dependence function L. Moreover, in case of existence, the densities of $\nu^*$ and $\Psi^*$ can be obtained by successive partial differentiation of L. Further details related to this method and an exemplary calculation are given in Section 2.5.

2.4 Copulas

Addressing the copula approach in the multivariate extreme value theory, this section highlights the interconnections between copulas and exponent measures. The central results stated in Lemmas 2.3 and 2.4 provide translations between these notions, thus allowing one to put the related applications into a common framework.

A function $C : [0, 1]^d \to [0, 1]$ is called a copula if C is a multivariate probability distribution function with uniform margins, i.e.,
\[ C(t, 1, \ldots, 1) = C(1, t, 1, \ldots, 1) = \ldots = C(1, \ldots, 1, t) = t \]
for $t \in [0, 1]$ (cf. Joe, 1997). It is easy to see that any function $F : \mathbb{R}^d \to [0, 1]$ defined as
\[ F\left(x^{(1)}, \ldots, x^{(d)}\right) = C\left(F_1\left(x^{(1)}\right), \ldots, F_d\left(x^{(d)}\right)\right), \tag{2.39} \]
with a copula C and univariate probability distribution functions $F_1, \ldots, F_d$ is a multivariate probability distribution function with marginal distribution functions given by $F_i$. Moreover, the converse is also true in the sense that for any multivariate distribution function F with margins $F_1, \ldots, F_d$ there exists an associated copula C satisfying (2.39). If the marginal distributions $F_i$ are continuous, the copula associated with F is unique and given by
\[ C(u) = F\left(F_1^{\leftarrow}\left(u^{(1)}\right), \ldots, F_d^{\leftarrow}\left(u^{(d)}\right)\right). \tag{2.40} \]
This result is also known as Sklar's theorem.

Suppose that the distribution function F is simple max-stable, i.e., F satisfies (2.35). Then the marginal distributions are given by $F_i(y) = \exp(-y^{-1})$ and the copula C of F necessarily satisfies
\[ C^t\left(u^{(1)}, \ldots, u^{(d)}\right) = C\left(\left(u^{(1)}\right)^t, \ldots, \left(u^{(d)}\right)^t\right) \tag{2.41} \]
for any $t \in (0, \infty)$. It is easy to see that any copula satisfying (2.41) can be used for generating max-stable random vectors. Therefore copulas endowed with this property are also called extreme value copulas (cf. Capéraà et al., 1997).

The relation between the copula and the canonical exponent measure $\nu^*$ of a multivariate regularly varying random vector X is characterized by the following result. To keep the proof simple, it is stated for $\mathbb{R}^d_+$-valued random vectors with continuous marginal distributions. However, an extension to the general case is possible.

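Lemma 2.3. Let X be a multivariate regularly varying random vector with values in $\mathbb{R}^d_+$, continuous marginal distributions $F_1, \ldots, F_d$, and canonical exponent measure $\nu^*$. Then for any $x \in \mathbb{R}^d_+ \setminus \{0\}$ the copula C of X satisfies
\[ \lim_{t \to \infty} t \cdot \left(1 - C\left(1 - \frac{1}{tx}\right)\right) = \nu^*\left([0, x]^c\right) \tag{2.42} \]
with 1 − 1/x defined componentwise: $1 - 1/x := (1 - 1/x^{(1)}, \ldots, 1 - 1/x^{(d)})$.

Proof. The invariance of copulas under strictly increasing marginal transformations implies that the copula C of X is also the copula of the random vector Z obtained from X according to (2.30). Further it is easy to see that the margins $Z^{(i)}$ of Z are standard Pareto distributed:
\[ F_{Z^{(i)}}(t) := P\left\{Z^{(i)} \le t\right\} = 1 - \frac{1}{t}, \quad i = 1, \ldots, d. \]
Hence, combining (2.31) with (2.40) and $F^{\leftarrow}_{Z^{(i)}}(t) = 1/(1 - t)$, one obtains
\[
\begin{aligned}
t \cdot \left(1 - C\left(1 - \frac{1}{tx}\right)\right)
&= t P\left(\bigcup_{i=1}^d \left\{Z^{(i)} > F^{\leftarrow}_{Z^{(i)}}\left(1 - \frac{1}{t x^{(i)}}\right)\right\}\right) \\
&= t P\left(\bigcup_{i=1}^d \left\{Z^{(i)} > t x^{(i)}\right\}\right) \\
&= t P\left\{t^{-1} Z \in [0, x]^c\right\} \\
&\to \nu^*\left([0, x]^c\right). \qquad \square
\end{aligned}
\]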

The following result can be regarded as an asymptotic analogue to the scaling property (2.41) of extreme value copulas. Being a direct consequence of Lemma 2.3, it is stated for probability distributions on $\mathbb{R}^d_+$ with continuous margins. In order to extend this result to the general case one only needs an appropriate version of Lemma 2.3.

Lemma 2.4. Let X be a multivariate regularly varying random vector with values in $\mathbb{R}^d_+$ and continuous marginal distributions. Then for any $t \in (0, \infty)$ and $u \in [0, 1]^d$ the copula C of X satisfies
\[ C^t(1 - u/v) = C\left((1 - u/v)^t\right) + o(1/v) \tag{2.43} \]
for $v \to \infty$ with $(1 - u/v)^t$ defined componentwise:
\[ (1 - u/v)^t := \left(\left(1 - u^{(1)}/v\right)^t, \ldots, \left(1 - u^{(d)}/v\right)^t\right). \]

Proof. It suffices to show that

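\[ \lim_{v \to \infty} v \cdot \left(1 - C^t(1 - u/v)\right) = \lim_{v \to \infty} v \cdot \left(1 - C\left((1 - u/v)^t\right)\right). \tag{2.44} \]
Note that (2.42) implies
\[ \lim_{v \to \infty} v \cdot \left(1 - C(1 - u/v)\right) = \nu^*\left([0, 1/u]^c\right) \tag{2.45} \]
and that $C(1 - u/v) \to 1$ as $v \to \infty$. This yields
\[
\begin{aligned}
v \cdot \left(1 - C^t(1 - u/v)\right) &= v \cdot \left(C^t(1) - C^t(1 - u/v)\right) \\
&= v t \cdot \left(1 - C(1 - u/v)\right) + v \cdot o\left(1 - C(1 - u/v)\right) \\
&= t \nu^*\left([0, 1/u]^c\right) + o(1).
\end{aligned}
\]
Moreover, (2.45) implies
\[
\begin{aligned}
v \cdot \left(1 - C\left((1 - u/v)^t\right)\right) &= v \cdot \left(1 - C\left(1 - tu/v + o(1/v)\right)\right) \\
&= v \cdot \left(1 - C(1 - tu/v)\right) - v \cdot \left(C\left(1 - tu/v + o(1/v)\right) - C(1 - tu/v)\right) \\
&= \nu^*\left([0, 1/(tu)]^c\right) + o(1) - v \cdot \left(C\left(1 - tu/v + o(1/v)\right) - C(1 - tu/v)\right).
\end{aligned}
\]
Now recall that the copula C is a probability distribution function on $[0, 1]^d$ with uniform margins. Let $U \sim C$ be a random vector distributed according to C. Then, using the extended notation $[a, b] := [\min(a, b), \max(a, b)]$, one obtains
\[
\begin{aligned}
\left|C\left(1 - tu/v + o(1/v)\right) - C(1 - tu/v)\right|
&= \left|P\left\{U \in [0, 1 - tu/v + o(1/v)]\right\} - P\left\{U \in [0, 1 - tu/v]\right\}\right| \\
&\le \sum_{i=1}^d P\left\{U^{(i)} \in \left[1 - tu^{(i)}/v, \ 1 - tu^{(i)}/v + o(1/v)\right]\right\} \\
&= o(1/v).
\end{aligned}
\]
Finally, (2.44) follows from the scaling property (2.21) of $\nu^*$ via
\[ t \nu^*\left([0, 1/u]^c\right) = \nu^*\left([0, 1/(tu)]^c\right). \qquad \square \]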
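The asymptotic scaling property (2.43) can be observed numerically for a copula that is not an extreme value copula. The sketch below uses the bivariate Clayton copula, which is an assumption made for illustration only (it does not appear in the text); it is assumed here to be the copula of a multivariate regularly varying vector, e.g. one with identical Pareto margins. The quantity computed is $v \cdot |C^t(1 - u/v) - C((1 - u/v)^t)|$, which should vanish as v grows.

```python
import numpy as np

def clayton(u, theta=2.0):
    """Bivariate Clayton copula; an assumed example, not from the text."""
    u = np.asarray(u, dtype=float)
    return (np.sum(u ** (-theta)) - 1.0) ** (-1.0 / theta)

def scaled_gap(v, u, t):
    """v * |C^t(1 - u/v) - C((1 - u/v)^t)|, cf. (2.43)."""
    arg = 1.0 - np.asarray(u, dtype=float) / v
    return v * abs(clayton(arg) ** t - clayton(arg ** t))

u = np.array([1.0, 2.0])
t = 3.0

# The scaled gap should shrink roughly like 1/v.
gaps = [scaled_gap(v, u, t) for v in (1e2, 1e3, 1e4)]
```

For an extreme value copula the gap would be exactly zero by (2.41), so the Clayton case shows that (2.43) is a genuinely asymptotic statement.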

The results of Lemmas 2.3 and 2.4 suggest the interpretation of multivariate regular variation as a smooth scaling behaviour of the copula in the tail region. It is easy to see that a probability distribution on $\mathbb{R}^d_+$ with margins $F_1, \ldots, F_d$ satisfying
\[ \lim_{t \to \infty} \frac{1 - F_i(t)}{1 - F_1(t)} = c_i \in (0, \infty), \quad i = 2, \ldots, d, \]
and copula C associated with some exponent measure $\nu^*$ by (2.42) is multivariate regularly varying and non-degenerate in the sense of (2.12). Similar results can also be obtained for probability distributions on $\mathbb{R}^d$.

It is also worth remarking that the classical definition of the copula with margins standardized to unif(0, 1) is not the only possible way of detaching the dependence structure from the marginal distributions. Depending on the application area, copula-like objects with other margin standardizations may be more suitable. In particular, the Pareto copula introduced by Klüppelberg and Resnick (2008) is specially designed for better integration in the framework of the multivariate extreme value theory.

Very common examples of copulas are

• Archimedean copulas:

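\[ C(u) = \varphi^{-1}\left(\varphi\left(u^{(1)}\right) + \ldots + \varphi\left(u^{(d)}\right)\right), \tag{2.46} \]
where the function $\varphi : [0, 1] \to \mathbb{R}$ is the so-called Archimedean generator (cf. McNeil and Nešlehová, 2009).

• Gumbel copulas, which are Archimedean copulas with generator $\varphi(z) = (-\log z)^\vartheta$ and dependence parameter $\vartheta \in [1, \infty)$:
\[ C_\vartheta(u) := \exp\left(-\left(\sum_{i=1}^d \left(-\log u^{(i)}\right)^\vartheta\right)^{1/\vartheta}\right). \tag{2.47} \]
Passing to the limit for $\vartheta \to \infty$, the parameter domain of the Gumbel copula is sometimes extended to $[1, \infty]$:
\[ C_\infty(u) := \exp\left(-\max\left\{-\log u^{(1)}, \ldots, -\log u^{(d)}\right\}\right). \]
It is easy to see that Gumbel copulas exhibit the scaling property (2.41):
\[
\begin{aligned}
C_\vartheta^t(u) &= \exp\left(-t \left(\left(-\log u^{(1)}\right)^\vartheta + \ldots + \left(-\log u^{(d)}\right)^\vartheta\right)^{1/\vartheta}\right) \\
&= \exp\left(-\left(t^\vartheta \left(-\log u^{(1)}\right)^\vartheta + \ldots + t^\vartheta \left(-\log u^{(d)}\right)^\vartheta\right)^{1/\vartheta}\right) \\
&= C_\vartheta\left(\left(u^{(1)}\right)^t, \ldots, \left(u^{(d)}\right)^t\right).
\end{aligned}
\tag{2.48}
\]
Moreover, Gumbel copulas are natural limits for multivariate regularly varying random variables with Archimedean copulas (see Remark 2.5).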

• Elliptical copulas are the copulas of elliptical distributions (cf. Definition 2.6), with the special case of t-copulas obtained from multivariate Student-t distributions (cf. Example 2.7). Elliptical copulas are a popular choice for models designed to capture gains and losses simultaneously.

It should be noted that (2.34) allows one to obtain the dependence function L and the copula C of a simple max-stable distribution by the following transformations:
\[ L(x) = -\log C\left(\exp\left(-x^{(1)}\right), \ldots, \exp\left(-x^{(d)}\right)\right) \tag{2.49} \]
and
\[ C(u) = \exp\left(-L\left(-\log u^{(1)}, \ldots, -\log u^{(d)}\right)\right). \tag{2.50} \]
In particular, (2.49) yields an explicit formula for the dependence function of the Gumbel copula:
\[ L_\vartheta(x) = \left(\left(x^{(1)}\right)^\vartheta + \ldots + \left(x^{(d)}\right)^\vartheta\right)^{1/\vartheta} = \|x\|_\vartheta, \quad \vartheta \in [1, \infty]. \tag{2.51} \]
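The scaling property (2.48) and the passage from the copula to the dependence function via (2.49) and (2.51) can be checked with a few lines of code. This is a minimal sketch with arbitrarily chosen test points; the function names are illustrative.

```python
import numpy as np

def gumbel_copula(u, theta):
    """Gumbel copula C_theta(u) from (2.47)."""
    u = np.asarray(u, dtype=float)
    return np.exp(-np.sum((-np.log(u)) ** theta) ** (1.0 / theta))

def dependence_function(x, theta):
    """Dependence function obtained from the copula via (2.49)."""
    x = np.asarray(x, dtype=float)
    return -np.log(gumbel_copula(np.exp(-x), theta))

theta = 2.5

# Scaling property (2.48): C^t(u) = C(u^t), powers taken componentwise.
u = np.array([0.3, 0.8, 0.55])
t = 1.7
scaling_lhs = gumbel_copula(u, theta) ** t
scaling_rhs = gumbel_copula(u ** t, theta)

# Explicit formula (2.51): L_theta(x) equals the theta-norm of x.
x = np.array([0.4, 1.1, 2.0])
L_value = dependence_function(x, theta)
norm_value = np.sum(x ** theta) ** (1.0 / theta)

# Homogeneity (2.38): t * L(x) = L(t * x).
homogeneity_gap = abs(t * L_value - dependence_function(t * x, theta))
```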

Remark 2.5. As shown by Genest and Rivest (1989), Gumbel copulas are the only Archimedean copulas that satisfy (2.41). Moreover, if a random vector X on $\mathbb{R}^d_+$ has identically distributed regularly varying margins $X^{(i)}$ and an Archimedean copula with generator φ such that the function $t \mapsto \varphi(1 - 1/t)$ is regularly varying at $t = \infty$ with index −ϑ for $\vartheta \in [1, \infty)$, then X is multivariate regularly varying with canonical exponent measure equal to that of the Gumbel copula $C_\vartheta$ (cf. Genest and Rivest, 1989; Barbe et al., 2006). In particular, this entails that the asymptotic dependence structures in the tail region provided by Archimedean copulas are related to spheric surfaces with respect to the norm $\|\cdot\|_\vartheta$ for $\vartheta \in [1, \infty]$. For further details on the geometric characterization of max-stable dependence structures see Molchanov (2008).

2.5 Spectral densities of Gumbel copulas

The aim of this section is two-fold. On the one hand, it demonstrates general arguments that allow one to obtain canonical spectral densities from dependence functions. On the other hand, it provides an explicit representation of the canonical spectral densities corresponding to Gumbel copulas, which is used in computational examples of Chapter 5. Going back to Coles and Tawn (1991), these results are carried out here in detail as a demonstration of the pathway from dependence functions to spectral densities.

Let $\Psi^*$ denote the spectral measure of $\nu^*$ with respect to the 1-norm:
\[ \nu^* \circ \tau^{-1} = \rho_1 \otimes \Psi^*, \]
where $\tau : x \mapsto (r, s)$ is the polar coordinate transformation with the radial part $r = \|x\|_1$ and the angular part $s = \|x\|_1^{-1} x$. Parametrizing the angular part by the first d − 1 components,
\[ s = s(w) = \left(w^{(1)}, \ldots, w^{(d-1)}, 1 - \left(w^{(1)} + \ldots + w^{(d-1)}\right)\right), \]
one obtains a bijective parametrization of the unit simplex $\Sigma^d \subset \mathbb{R}^d_+$ as $\Sigma^d = s(W)$ with
\[ W := \left\{x \in \mathbb{R}^{d-1}_+ : x^{(1)} + \ldots + x^{(d-1)} \le 1\right\}. \]
Suppose that $\nu^*$ has a density:
\[ \nu^*(A) = \int_A q(x) \, dx, \quad A \in \mathcal{B}\left(\mathbb{R}^d_+ \setminus \{0\}\right), \]
and let h denote the density of $\Psi^*$ transformed according to the parametrization s = s(w):
\[ \Psi^*(B) = \int_{s^{-1}(B)} h(w) \, dw, \quad B \in \mathcal{B}\left(\Sigma^d\right). \]
Given the existence of the canonical exponent density q, the scaling property $\nu^*(tA) = t^{-1} \nu^*(A)$ can be equivalently written as
\[ q(tx) = t^{-d-1} q(x), \quad t > 0. \tag{2.52} \]
Finally, (2.52) and straightforward transformation arguments yield a representation of the spectral density h in terms of the exponent density q:
\[ h(w) = q(s(w)). \tag{2.53} \]
It is worth remarking that (2.53) and (2.52) also allow one to obtain the exponent density q from the spectral density h:
\[ q(x) = h(w(x)) \left(r(x)\right)^{-d-1}, \tag{2.54} \]
with the obvious notation $r(x) = \|x\|_1$ and $w(x) = \|x\|_1^{-1} \left(x^{(1)}, \ldots, x^{(d-1)}\right)$.

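Now consider the dependence function $L(x) = \nu^*([0, 1/x]^c)$ or, which is even more convenient, the function
\[
\begin{aligned}
l(x) := L(1/x) &= \nu^*\left([0, x]^c\right) \\
&= \nu^*\left(B_1 \cup \ldots \cup B_d\right) \\
&= \sum_{\emptyset \neq J \subset \{1, \ldots, d\}} (-1)^{|J|+1} \nu^*\left(\bigcap_{j \in J} B_j\right)
\end{aligned}
\tag{2.55}
\]
with sets $B_j$ defined for $j = 1, \ldots, d$ as
\[ B_j := \left\{y \in \mathbb{R}^d_+ : y^{(j)} > x^{(j)}\right\}. \]
It is easy to see that subsequent partial differentiation of the representation (2.55) with respect to $x^{(1)}, \ldots, x^{(d)}$ eliminates all summands with $|J| < d$. This yields
\[ \frac{\partial}{\partial x^{(d)}} \cdots \frac{\partial}{\partial x^{(1)}} l(x) = (-1)^{d+1} \frac{\partial}{\partial x^{(d)}} \cdots \frac{\partial}{\partial x^{(1)}} \nu^*\left(B_1 \cap \ldots \cap B_d\right). \]
Finally, due to
\[ \nu^*\left(B_1 \cap \ldots \cap B_d\right) = \int_{x^{(1)}}^\infty \cdots \int_{x^{(d)}}^\infty q\left(y^{(1)}, \ldots, y^{(d)}\right) dy^{(d)} \cdots dy^{(1)}, \]
one obtains
\[ q(x) = -\frac{\partial}{\partial x^{(d)}} \cdots \frac{\partial}{\partial x^{(1)}} l(x). \tag{2.56} \]
Now return to the Gumbel copula $C_\vartheta$, where (2.51) yields
\[ l_\vartheta(x) = \left(\left(x^{(1)}\right)^{-\vartheta} + \ldots + \left(x^{(d)}\right)^{-\vartheta}\right)^{1/\vartheta}. \]
Thus, according to (2.56), the canonical exponent density of $C_\vartheta$ for $\vartheta \in (1, \infty)$ is given by
\[
q_\vartheta(x) = -\frac{\partial}{\partial x^{(d)}} \cdots \frac{\partial}{\partial x^{(1)}} l_\vartheta(x)
= \left(\prod_{k=1}^{d-1} (k\vartheta - 1)\right) \left(\prod_{i=1}^d x^{(i)}\right)^{-\vartheta-1} \left(\sum_{j=1}^d \left(x^{(j)}\right)^{-\vartheta}\right)^{\frac{1}{\vartheta} - d}
\tag{2.57}
\]
and the corresponding spectral density $h_\vartheta$ is obtained from (2.53):
\[ h_\vartheta(w) = q_\vartheta\left(w^{(1)}, \ldots, w^{(d-1)}, 1 - \left(w^{(1)} + \ldots + w^{(d-1)}\right)\right). \tag{2.58} \]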
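For d = 2 the density (2.57)-(2.58) is explicit enough to check the normalizations derived earlier: the total mass of $\Psi^*$ with respect to the 1-norm should equal d = 2 by (2.27), and the first moment should equal 1 by (2.23). The following is a minimal numerical sketch; the midpoint rule and the grid size are choices made here for illustration and are not part of the text.

```python
import numpy as np

def gumbel_spectral_density(w, theta):
    """Bivariate h_theta(w) from (2.57)-(2.58) with s = (w, 1 - w)."""
    w = np.asarray(w, dtype=float)
    s1, s2 = w, 1.0 - w
    return ((theta - 1.0)
            * (s1 * s2) ** (-theta - 1.0)
            * (s1 ** (-theta) + s2 ** (-theta)) ** (1.0 / theta - 2.0))

def midpoint_integral(f, n=200_000):
    """Midpoint rule on (0, 1); never evaluates f at the endpoints."""
    w = (np.arange(n) + 0.5) / n
    return float(np.mean(f(w)))

theta = 3.0

# Total mass of the canonical spectral measure, expected to be d = 2.
total_mass = midpoint_integral(lambda w: gumbel_spectral_density(w, theta))

# First moment of the first component, expected to be 1.
first_moment = midpoint_integral(lambda w: w * gumbel_spectral_density(w, theta))
```

For ϑ < 2 the density is unbounded at the endpoints (though integrable), so a plain midpoint rule converges more slowly there; the value ϑ = 3 is used to keep the sketch simple.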

Figure 2.1 shows plots of spectral densities $h_\vartheta(w)$ in the two-dimensional case, where $w = s^{(1)}$.

[Figure 2.1: plot of the spectral density (vertical axis) against s_1 on [0, 1] (horizontal axis), one curve per selected value of theta.]

Figure 2.1 Canonical spectral densities $h_\vartheta(s^{(1)})$ of two-dimensional Gumbel copulas with selected values of the dependence parameter ϑ.

2.6 Spectral densities of elliptical distributions

This section is dedicated to the multivariate regularly varying elliptical distributions, with the central result given in Lemma 2.8 by the representations of spectral densities for arbitrary norms and dimensions. Special emphasis is put on explicit formulas that can be used in numerical computations. The results of this section are used in Chapters 5 and 6 and are not needed before. However, elliptical distributions give a good example of a non-trivial class of multivariate regularly varying models, and the general result obtained in Lemma 2.8 is of interest in its own right.

Definition 2.6. The distribution of a random vector X on $\mathbb{R}^d$ is called elliptical if
\[ X \stackrel{d}{=} \mu + \tilde{R} \cdot A \cdot U, \tag{2.59} \]
where $\mu \in \mathbb{R}^d$, $\tilde{R}$ is a non-negative random variable, A is a non-zero d × d matrix, and U is a random vector in $\mathbb{R}^d$ that is independent of $\tilde{R}$ and uniformly distributed on the unit sphere induced by the 2-norm:
\[ U \sim \mathrm{unif}\left(S^d_2\right), \quad S^d_2 := \left\{x \in \mathbb{R}^d : \|x\|_2 = 1\right\}. \]

It is easy to see that $E\|X\|_2^2 < \infty$ if and only if $E\tilde{R}^2 < \infty$ and that in this case the covariance matrix of X is equal to $d^{-1} E\tilde{R}^2 \cdot C$ with
\[ C := A \cdot A^\top. \]

The matrix C, which is unique except for a constant factor, is called the generalized covariance matrix of X. Further it should be noted that elliptical distributions with non-invertible C are degenerate in the sense that they concentrate the entire probability mass on affine subspaces of $\mathbb{R}^d$.

According to Hult and Lindskog (2002), multivariate regular variation of an elliptical distribution is equivalent to the regular variation of the radial factor $\tilde{R}$, and the tail index of X is inherited from $\tilde{R}$ without change.

Example 2.7. The multivariate Student-t distribution is an elliptical distribution with radial factor $\tilde{R} = |Y|$, where Y is centred Student-t distributed with η degrees of freedom, $\eta \in (0, \infty)$. It is well known that the univariate Student-t distribution is regularly varying with tail index η. Consequently, multivariate Student-t distributions are multivariate regularly varying with tail index η.

The spectral measure of a regularly varying elliptical distribution depends only on the generalized covariance matrix C, the tail index α and the norm underlying the polar coordinates. If the matrix C is invertible, then the resulting spectral measure Ψ with respect to an arbitrary norm $\|\cdot\|$ on $\mathbb{R}^d$ has a density on the corresponding unit sphere $S^d_{\|\cdot\|}$. Hult and Lindskog (2002) derived representations of the spectral densities for $\|\cdot\|_2$ and $\|\cdot\|_\infty$ in the 2-dimensional case. The following lemma allows one to extend these results to arbitrary norms and dimensions.

Lemma 2.8. Let the random vector X be elliptically distributed with generalized covariance matrix $C = A \cdot A^\top$ and multivariate regularly varying with tail index α. If C is invertible, then the spectral measure Ψ of X on the unit sphere $S^d_{\|\cdot\|} := \{x \in \mathbb{R}^d : \|x\| = 1\}$ is given by
\[ \Psi(B) = \frac{\int_{g^{-1}(B)} \|Au\|^\alpha \, d\sigma(u)}{\int_{S^d_2} \|Au\|^\alpha \, d\sigma(u)}, \quad B \in \mathcal{B}\left(S^d_{\|\cdot\|}\right), \tag{2.60} \]
with the mapping $g : S^d_2 \to S^d_{\|\cdot\|}$ defined by
\[ g(u) := \frac{1}{\|Au\|} Au \tag{2.61} \]
and dσ denoting the surface integral on $S^d_2$.

Proof. First it should be noted that the centring constant μ has no influence on the regular variation properties. Therefore it suffices to consider elliptical distributions centred by setting μ = 0. Let R and S denote the radial and the angular parts with respect to $\|\cdot\|$:
\[ R := \|X\| = \tilde{R} f(U) \]
with
\[ f(u) := \|Au\|, \quad u \in S^d_2, \]
and
\[ S := \frac{1}{\|X\|} X = g(U), \]
where
\[ g(u) = \frac{1}{\|Au\|} Au = \frac{1}{f(u)} Au, \quad u \in S^d_2. \]
The assumption that the generalized covariance matrix C has an inverse implies that the matrix A also has one. Hence, if s = g(u) for $s \in S^d_{\|\cdot\|}$ and $u \in S^d_2$, then
\[ A^{-1} s = \frac{1}{f(u)} u \]
and $u = g^{-1}(s)$ can be obtained by projecting $A^{-1} s$ on $S^d_2$:
\[ u = g^{-1}(s) = \frac{1}{\|A^{-1} s\|_2} A^{-1} s, \quad s \in S^d_{\|\cdot\|}. \]
Now let Ψ denote the spectral measure of X with respect to $\|\cdot\|$. Then, due to (2.11), the spectral measure Ψ(B) of any measurable set $B \in \mathcal{B}(S^d_{\|\cdot\|})$ can be obtained as
\[
\begin{aligned}
\Psi(B) &= \lim_{z \to \infty} \frac{P\{R > z, S \in B\}}{P\{R > z\}} \\
&= \lim_{z \to \infty} \frac{\int_{g^{-1}(B)} P\{\tilde{R} f(U) > z \mid U = u\} \, dP^U(u)}{\int_{S^d_2} P\{\tilde{R} f(U) > z \mid U = u\} \, dP^U(u)} \\
&= \lim_{z \to \infty} \frac{\int_{g^{-1}(B)} P\{\tilde{R} > z/f(u)\} \, dP^U(u)}{\int_{S^d_2} P\{\tilde{R} > z/f(u)\} \, dP^U(u)}.
\end{aligned}
\]

Due to (2.6), regular variation of $\tilde{R}$ yields
\[ P\{\tilde{R} > t\} = t^{-\alpha} l(t) \]
with a slowly varying function l. Moreover, $U \sim \mathrm{unif}(S^d_2)$ allows one to obtain the probability distribution $P^U$ of U from the surface measure σ on $S^d_2$ by normalization of the total mass to 1:
\[ P^U(A) = \frac{\int_A d\sigma(u)}{\int_{S^d_2} d\sigma(u)}, \quad A \in \mathcal{B}\left(S^d_2\right). \]
This yields
\[
\begin{aligned}
\Psi(B) &= \lim_{z \to \infty} \frac{\int_{g^{-1}(B)} (z/f(u))^{-\alpha} \, l(z/f(u)) \, d\sigma(u)}{\int_{S^d_2} (z/f(u))^{-\alpha} \, l(z/f(u)) \, d\sigma(u)} \\
&= \lim_{z \to \infty} \frac{\int_{g^{-1}(B)} f^\alpha(u) \frac{l(z/f(u))}{l(z)} \, d\sigma(u)}{\int_{S^d_2} f^\alpha(u) \frac{l(z/f(u))}{l(z)} \, d\sigma(u)}.
\end{aligned}
\tag{2.62}
\]
It is easy to see that $f(S^d_2)$ is a compact sub-interval of (0, ∞). Hence, due to the uniform convergence theorem for slowly varying functions (cf. Bingham et al., 1987, Theorem 1.2.1), one obtains
\[ \lim_{z \to \infty} \frac{l(z/f(u))}{l(z)} = 1 \]
uniformly in $u \in S^d_2$. Consequently, (2.62) yields
\[ \Psi(B) = \frac{\int_{g^{-1}(B)} f^\alpha(u) \, d\sigma(u)}{\int_{S^d_2} f^\alpha(u) \, d\sigma(u)} \]
and the identity $f(u) = \|Au\|$ completes the proof. □

Finally, in order to make this result applicable in computations, the representation (2.60) must be translated into explicit formulas that can be implemented in a computer program.

Assume that $S^d_{\|\cdot\|}$ is a piecewise $C^1$ surface in $\mathbb{R}^d$, which is particularly true for all unit spheres $S^d_p$ corresponding to p-norms $\|\cdot\|_p$ with $p \in [1, \infty]$. In this case (2.60) yields a density of Ψ with respect to the surface area on $S^d_{\|\cdot\|}$. It is easy to see that $\|s\| = 1$ implies

\[ f\left(g^{-1}(s)\right) = \left\|A g^{-1}(s)\right\| = \left\|A \frac{1}{\|A^{-1} s\|_2} A^{-1} s\right\| = \frac{1}{\|A^{-1} s\|_2}. \]
Consequently, with $v_{g^{-1}}(s)$ denoting the factor of local surface area deformation induced by the mapping $g^{-1} : S^d_{\|\cdot\|} \to S^d_2$, one obtains
\[ \Psi(B) = \frac{\int_B \|A^{-1} s\|_2^{-\alpha} \, v_{g^{-1}}(s) \, d\sigma(s)}{\int_{S^d_{\|\cdot\|}} \|A^{-1} s\|_2^{-\alpha} \, v_{g^{-1}}(s) \, d\sigma(s)}. \tag{2.63} \]
Given a bijective and sufficiently smooth parametrization s = s(w) of $S^d_{\|\cdot\|}$ by $w \in W \subset \mathbb{R}^{d-1}$, (2.63) can be rewritten as
\[ \Psi(B) = \frac{\int_{s^{-1}(B)} \|A^{-1} s(w)\|_2^{-\alpha} \sqrt{\det G(w)} \, dw}{\int_W \|A^{-1} s(w)\|_2^{-\alpha} \sqrt{\det G(w)} \, dw}, \tag{2.64} \]
where G(w) is the metric tensor of the mapping $w \mapsto g^{-1} \circ s(w)$:
\[ G(w) := D\left(g^{-1} \circ s\right)(w)^\top \cdot D\left(g^{-1} \circ s\right)(w) = D^\top s(w) \cdot D^\top g^{-1}(s) \cdot D g^{-1}(s) \cdot D s(w), \]
with D denoting the (m × n) differential matrix of a smooth mapping from $\mathbb{R}^n$ to $\mathbb{R}^m$.

An explicit representation of the differential matrix $D g^{-1}$ is obtained as follows. It is easy to see that $g^{-1}(s) = h_2 \circ h_1(s)$ with
\[ h_1(s) := \frac{1}{\|A^{-1} s\|_2} s \]
and
\[ h_2(x) := A^{-1} x. \]
Moreover, $h_2$ is linear. Therefore, identifying linear mappings with corresponding matrices, one obtains $D h_2 = h_2 = A^{-1}$ and
\[ D g^{-1}(s) = A^{-1} D h_1(s). \]
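The mapping g from (2.61), its inverse, and the representation (2.60) can be illustrated for d = 2 and the 1-norm by discretizing the Euclidean unit circle. The sketch below is only an illustration under stated assumptions: the parameter values are hypothetical, the matrix A is taken as the Cholesky factor of C (one of many matrices with $A A^\top = C$), and the function names are not from the text.

```python
import numpy as np

def g(u, A):
    """Mapping (2.61): g(u) = A u / ||A u||, here with the 1-norm."""
    Au = A @ u
    return Au / np.abs(Au).sum(axis=0)

def g_inverse(s, A_inv):
    """Inverse mapping: projection of A^{-1} s onto the Euclidean unit sphere."""
    v = A_inv @ s
    return v / np.sqrt((v ** 2).sum(axis=0))

def spectral_measure_weights(A, alpha, n=100_000):
    """Discretize (2.60) for d = 2: equally spaced u on the unit circle with
    normalized weights proportional to ||A u||_1^alpha."""
    angles = 2.0 * np.pi * (np.arange(n) + 0.5) / n
    u = np.stack([np.cos(angles), np.sin(angles)])   # shape (2, n)
    weights = np.abs(A @ u).sum(axis=0) ** alpha
    return u, weights / weights.sum()

# Hypothetical parameters, chosen only for illustration.
sigma1, sigma2, rho, alpha = 2.0, 3.0, 0.4, 2.0
C = np.array([[sigma1 ** 2, rho * sigma1 * sigma2],
              [rho * sigma1 * sigma2, sigma2 ** 2]])
A = np.linalg.cholesky(C)            # one matrix with A @ A.T == C
A_inv = np.linalg.inv(A)

u, weights = spectral_measure_weights(A, alpha)
s = g(u, A)                          # points on the 1-norm unit sphere
round_trip_error = float(np.max(np.abs(g_inverse(s, A_inv) - u)))

# Approximate spectral mass of the positive and of the negative quadrant.
mass_pos = float(weights[(s > 0).all(axis=0)].sum())
mass_neg = float(weights[(s < 0).all(axis=0)].sum())
```

Since g(−u) = −g(u) and the weights are even in u, the two quadrant masses should agree, which gives a simple sanity check of the discretization.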

The differential $D h_1$ is obtained by partial differentiation of
\[ h_1(s) = \left(s^\top \left(A^{-1}\right)^\top A^{-1} s\right)^{-1/2} s = \left(s^\top C^{-1} s\right)^{-1/2} s. \]
Denoting $x(s) := h_1(s)$, one has $D h_1(s) = \left(\frac{\partial}{\partial s^{(j)}} x^{(i)}(s)\right)_{i,j}$, and the partial derivatives are given by
\[
\begin{aligned}
\frac{\partial}{\partial s^{(j)}} x^{(i)}(s) &= \frac{\partial}{\partial s^{(j)}} \left(s^{(i)} \cdot \left(s^\top C^{-1} s\right)^{-1/2}\right) \\
&= \left(s^\top C^{-1} s\right)^{-1/2} \left(\delta_{i,j} - s^{(i)} \frac{s^\top C^{-1} e_j}{s^\top C^{-1} s}\right)
\end{aligned}
\]
with $\delta_{i,j} := 1\{i = j\}$ and $e_j$ denoting the j-th unit vector in $\mathbb{R}^d$. This yields
\[ D g^{-1}(s) = \left(s^\top C^{-1} s\right)^{-1/2} A^{-1} \left(I_d - \frac{s \cdot s^\top \cdot C^{-1}}{s^\top C^{-1} s}\right) \tag{2.65} \]
with $I_d$ denoting the (d × d) identity matrix.

The representations (2.64) and (2.65) are used in numerical computations presented in Chapters 5 and 6. Graphic examples of spectral densities on the unit sphere $S^2_1 = \{x \in \mathbb{R}^2 : \|x\|_1 = 1\}$ parametrized by
\[
s(w) = \begin{cases}
(1 - w, \ w)^\top, & w \in [0, 1) \\
(1 - w, \ 2 - w)^\top, & w \in [1, 2) \\
(w - 3, \ 2 - w)^\top, & w \in [2, 3) \\
(w - 3, \ w - 4)^\top, & w \in [3, 4)
\end{cases}
\tag{2.66}
\]
are given in Figure 2.2.

[Figure 2.2, first panel: sigma1=2, sigma2=3, rho=0.4, one curve per selected value of alpha. Second panel: sigma1=2, sigma2=3, alpha=2, one curve per selected value of rho. Horizontal axis in both panels: w (parametrization of the 1-norm unit sphere), from 0 to 4; vertical axis: spectral density (transformed).]

Figure 2.2 Spectral densities of bivariate elliptical distributions with respect to $\|\cdot\|_1$ for selected values of the tail index α and the generalized correlation matrix $C = (C_{i,j})$ with $C_{1,1} = \sigma_1^2$, $C_{2,2} = \sigma_2^2$, and $C_{1,2} = \sigma_1 \sigma_2 \rho$. The unit sphere $S^2_1$ is parametrized according to (2.66).

Chapter 3

Extreme risk index

This chapter introduces the newly developed notion of the extreme risk index, which is essential to the whole thesis. Section 3.1 presents the basic approach to the comparison of extreme portfolio losses and the resulting definition of the extreme risk index. Section 3.2 is dedicated to integral representations of the extreme risk index in terms of the tail index α and spectral measures associated with arbitrary norms on Rd. Application of the extreme risk index to portfolio optimization and general properties of diversification effects in multivariate regularly varying models are discussed in Section 3.3. Finally, application of the extreme risk index to the minimization of risk measures is studied in Section 3.4.

3.1 Basic approach

In the following it is assumed that the random loss vector X satisfies model conditions introduced in Section 2.1, i.e., X is multivariate regularly varying with tail index α ∈ (0, ∞) and exponent measure ν that is non-degenerate in the sense of (2.12). Moreover, if not mentioned otherwise, Ψ denotes the spectral measure of X with respect to the 1-norm and ν is normalized by

\[ \nu\left(\left\{x \in \mathbb{R}^d : \|x\|_1 > 1\right\}\right) = 1, \tag{3.1} \]
so that the product representation (2.15) of ν in polar coordinates yields
\[ \nu \circ \tau^{-1} = \rho_\alpha \otimes \Psi \]
with $\tau(x) := (\|x\|_1, \|x\|_1^{-1} \cdot x)$.

It is easy to see that multiplication of the portfolio vector ξ by a constant factor c > 0 results in the multiplication of the portfolio loss by c. Consequently, the influence of the portfolio composition on the portfolio loss can be studied by considering standardized portfolios. Following the intuition of diversifying a unit capital over a number of assets, portfolio vectors are standardized by the sums of their components. Hence the set of portfolio vectors can be reduced to an affine hyperplane in $\mathbb{R}^d$:
\[ H_1 := \left\{x \in \mathbb{R}^d : x^{(1)} + \ldots + x^{(d)} = 1\right\}. \]

Additional regulations, such as bounds for short sales, result in restriction to subsets of H1. In particular, if short positions are not permitted, the set of comparable portfolio vectors is reduced to the unit simplex:

\[ \Sigma^d := \left\{x \in \mathbb{R}^d_+ : x^{(1)} + \ldots + x^{(d)} = 1\right\}. \]
It is obvious that extreme losses of a portfolio ξ, i.e., events when $\xi^\top X$ exceeds a large bound t > 0, can be written as $\{X \in A_{\xi,t}\}$ with sets $A_{\xi,t}$ defined as
\[ A_{\xi,t} := \left\{x \in \mathbb{R}^d : \xi^\top x > t\right\}. \tag{3.2} \]
Thus analysis of extreme portfolio losses involves the asymptotics of the probabilities $P\{X \in A_{\xi,t}\}$ for $\xi \in H_1$ and $t \to \infty$. In order to make these vanishing probabilities comparable, they are normalized by the probability $P\{X \in A_t\}$ with the set $A_t$ defined as
\[ A_t := \left\{x \in \mathbb{R}^d : \|x\|_1 > t\right\}. \tag{3.3} \]

The sets Aξ,t and At can be regarded as different kinds of extremal events: Aξ,t indicates high losses of the portfolio ξ, whereas At is a generic extremal event indicating that some components of the vector X produce high losses or gains.

It is easy to see that the sets At and Aξ,t can be obtained from A1 and Aξ,1 by rescaling,

\[ A_t = t \cdot A_1, \quad A_{\xi,t} = t \cdot A_{\xi,1}, \]
and that the Euclidean distance between $A_{\xi,1}$ and the origin depends on the portfolio vector ξ:
\[ \inf_{x \in A_{\xi,1}} \|x\|_2 = \|\xi\|_2^{-1} \ge \|\xi\|_1^{-1}. \tag{3.4} \]
In particular, for $\xi \in \Sigma^d$ this distance is bounded from below by 1. An illustration of the sets $A_1$ and $A_{\xi,1}$ in $\mathbb{R}^2$ is given in Figure 3.1.

Figure 3.1 Sets $A_1$ and $A_{\xi,1}$ in $\mathbb{R}^2$ for ξ = (0.28, 0.72)

Note that the normalization (3.1) of the exponent measure ν can also be written as $\nu(A_1) = 1$. Consequently, multivariate regular variation (2.7) of X yields
\[ \lim_{t \to \infty} \frac{P\{X \in A_{\xi,t}\}}{P\{X \in A_t\}} = \lim_{t \to \infty} \frac{P\{t^{-1} X \in A_{\xi,1}\}}{P\{t^{-1} X \in A_1\}} = \frac{\nu(A_{\xi,1})}{\nu(A_1)} = \nu(A_{\xi,1}). \tag{3.5} \]
Hence for any pair of portfolio vectors $\xi_1, \xi_2 \in H_1$ one obtains
\[ \lim_{t \to \infty} \frac{P\{\xi_1^\top X > t\}}{P\{\xi_2^\top X > t\}} = \frac{\nu(A_{\xi_1,1})}{\nu(A_{\xi_2,1})}. \]
This means that for multivariate regularly varying loss vectors X the asymptotic probabilities of extreme portfolio losses can be compared in terms of the functional
\[ \gamma_\xi := \nu(A_{\xi,1}). \]
In the special case $A_{\xi,1} \subset A_1$, which is particularly true for $\xi \in \Sigma^d$, (3.5) can also be interpreted as convergence of the conditional probability $P\{\xi^\top X > t \mid \|X\|_1 > t\}$, i.e., the probability that the portfolio ξ yields high losses given that some assets $X^{(i)}$ generate an extreme outcome. Thus the functional $\gamma_\xi$ also characterizes the relative sensitivity of the portfolio ξ to extremal events.

Moreover, multivariate regular variation of X yields the asymptotic relation of tail probabilities
\[ \lim_{t \to \infty} \frac{1 - F_{\xi^\top X}(rt)}{1 - F_{\|X\|_1}(t)} = \gamma_\xi \cdot r^{-\alpha} \tag{3.6} \]
for any r > 0 and the asymptotic quantile relation
\[ \lim_{t \to \infty} \frac{F^{\leftarrow}_{\xi^\top X}(1 - u/t)}{F^{\leftarrow}_{\|X\|_1}(1 - 1/t)} = \gamma_\xi^{1/\alpha} \cdot u^{-1/\alpha} \tag{3.7} \]
for any u > 0 (cf. Resnick, 1987, Proposition 0.8, parts (v) and (vi)). Consequently, $\gamma_\xi$ allows one to order both the probabilities of extreme losses and high loss quantiles for all portfolios $\xi \in H_1$. This means that $\gamma_\xi$ provides all information that is needed for comparing the influence of the portfolio vector ξ on the severity of extreme losses.

It should also be noted that the scaling relations (3.6) and (3.7) allow one to estimate probabilities of extreme losses and high loss quantiles and to extrapolate these estimates beyond the observable area. The estimated values can be used in portfolio optimization. An empirical study based on these scaling relations is provided in Hauksson et al. (2000).

Highlighting the central role of the functional $\gamma_\xi$ in the characterization of extreme portfolio losses, the foregoing arguments justify the introduction of a new notion.

Definition 3.1. Let X be multivariate regularly varying with tail index $\alpha \in (0, \infty)$ and exponent measure ν normalized by $\nu(A_1) = 1$. Then for any portfolio vector $\xi \in H_1$ the functional
\[ \gamma_\xi := \nu(A_{\xi,1}) \]
is called extreme risk index of ξ.

The extreme risk index is a natural way to quantify the influence of the asymptotic dependence structure on extreme portfolio losses in the framework of multivariate regular variation. It complements the available palette of approaches including the coefficient of tail dependence (cf. Joe, 1997), the extremal dependence measure (cf. Resnick, 2004), the Pickands dependence function (cf. Pickands, 1981), and the tail copula (cf. Schmidt and Stadtmüller, 2006), which are rather focused on applications beyond portfolio optimization. Indeed, the coefficient of tail dependence and the extremal dependence measure are very useful for fitting and testing models. However, these functionals are single-number characteristics and therefore they are not able to carry sufficient information about the location of the optimal portfolio. On the other hand, parametrization of dependence structures by dependence functions and tail copulas is based on sets of the form $\mathbb{R}^d \setminus [-\infty, x]$ for $x \in \mathbb{R}^d_+$, which are naturally related to simultaneous exceeding of bounds by the components $X^{(i)}$. Thus the extreme risk index fills the gap for an approach that addresses extremes of portfolio losses directly.

3.2 Representations in terms of spectral measures

This section provides representations of the extreme risk index γξ in terms of the spectral measure Ψ and the tail index α. The basic representation using the spectral measure corresponding to the 1-norm is given in Lemma 3.2. An extension of this result to spectral measures associated with other norms is obtained in Lemma 3.4.

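Lemma 3.2. Suppose that the random vector $X$ is multivariate regularly varying with tail index $\alpha$ and let $\Psi$ denote the spectral measure of $X$ with respect to the 1-norm. Then the extreme risk index $\gamma_\xi$ of a portfolio $\xi \in H_1$ satisfies
\[
\gamma_\xi = \int_{\mathbb{S}_1^d} f_{\xi,\alpha}(s)\, d\Psi(s) \tag{3.8}
\]
with
\[
f_{\xi,\alpha}(s) := \left(\xi^\top s\right)_+^\alpha = \left(\xi^\top s\right)^\alpha \cdot 1\left\{\xi^\top s > 0\right\}. \tag{3.9}
\]

Proof. Recall that the product representation (2.15) and the normalization $\nu(A_1) = 1$ imply $\nu^\tau = \rho_\alpha \otimes \Psi$ with $\tau(x) := (\|x\|_1, \|x\|_1^{-1} \cdot x)$. This yields
\begin{align*}
\gamma_\xi &= \nu(A_{\xi,1}) \\
&= \int_{\mathbb{R}^d} 1\{x \in A_{\xi,1}\}\, d\nu(x) \\
&= \int_{\mathbb{S}_1^d} \int_{\mathbb{R}_+} 1\left\{r \cdot \xi^\top s > 1\right\} d\rho_\alpha(r)\, d\Psi(s) \\
&= \int_{\mathbb{S}_1^d} \int_{\mathbb{R}_+} 1\left\{\xi^\top s > 0\right\} \cdot 1\left\{r > 1/\left(\xi^\top s\right)\right\} d\rho_\alpha(r)\, d\Psi(s) \\
&= \int_{\mathbb{S}_1^d} \left(\xi^\top s\right)_+^\alpha d\Psi(s). \qquad \square
\end{align*}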

Figure 3.2 Support of $f_{\xi,\alpha}$ (thick line) on $\mathbb{S}_1^d$ for $d = 2$ and $\xi = (0.72, 0.28)$. The plot (axes $x_1$, $x_2$) also shows the set $A_{\xi,1}$ and the vector $\xi$.

In the following, the integral representation (3.8) of $\gamma_\xi$ will also be written in the shorthand form $\gamma_\xi = \Psi f_{\xi,\alpha}$. In the pure loss case without short positions, i.e., if $X$ is restricted to $\mathbb{R}^d_+$ and $\xi$ is restricted to $\Sigma^d$, (3.8) can be rewritten as
\[
\gamma_\xi = \int_{\Sigma^d} f_{\xi,\alpha}(s)\, d\Psi(s) \tag{3.10}
\]
with the integrand $f_{\xi,\alpha}$ simplified to
\[
f_{\xi,\alpha}(s) = \left(\xi^\top s\right)^\alpha.
\]
This implies that in the pure loss case the extreme risk index of a uniformly diversified portfolio does not depend on the spectral measure $\Psi$:
\[
\gamma_{d^{-1}(1,\ldots,1)} = \int_{\Sigma^d} \left(d^{-1}\left(s^{(1)} + \ldots + s^{(d)}\right)\right)^\alpha d\Psi(s) = d^{-\alpha}. \tag{3.11}
\]
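The representation (3.8) is easy to evaluate when the spectral measure is discrete. The following Python sketch is purely illustrative: the atoms, the weights, and the function names are hypothetical choices, not taken from the text. It evaluates $\gamma_\xi = \Psi f_{\xi,\alpha}$ for a finite list of atoms and checks the identity (3.11) for the uniformly diversified portfolio in the pure loss case.

```python
def f_xi_alpha(xi, s, alpha):
    """Integrand (3.9): the positive part of xi^T s, raised to the power alpha."""
    t = sum(x * c for x, c in zip(xi, s))
    return t ** alpha if t > 0 else 0.0


def extreme_risk_index(xi, atoms, weights, alpha):
    """gamma_xi for the discrete spectral measure sum_i weights[i] * delta_{atoms[i]}."""
    return sum(w * f_xi_alpha(xi, s, alpha) for s, w in zip(atoms, weights))


# Hypothetical discrete spectral measure on the unit simplex (d = 3).
atoms = [(1.0, 0.0, 0.0), (0.2, 0.3, 0.5), (0.0, 0.6, 0.4), (0.25, 0.25, 0.5)]
weights = [0.4, 0.3, 0.2, 0.1]  # probability weights with total mass 1
alpha = 2.5
d = 3

uniform = (1.0 / d,) * d
gamma_uniform = extreme_risk_index(uniform, atoms, weights, alpha)
gamma_single = extreme_risk_index((1.0, 0.0, 0.0), atoms, weights, alpha)
```

Since every atom lies on the simplex, `gamma_uniform` agrees with `d ** (-alpha)` up to rounding, whatever weights are chosen, whereas `gamma_single` changes with the weights.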

In the general case, i.e., if $X$ takes values in $\mathbb{R}^d$ and the spectral measure $\Psi$ is not concentrated on $\Sigma^d$, (3.11) fails because the support of the integrand $f_{\xi,\alpha}$ depends on the portfolio vector $\xi$. See Figure 3.2 for a graphic example. An immediate consequence of the representation (3.8) is the continuity of the extreme risk index $\gamma_\xi$ with respect to the portfolio vector $\xi$.

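Corollary 3.3. Suppose that the assumptions of Lemma 3.2 are satisfied. Then the mapping $\xi \mapsto \gamma_\xi$ is continuous for any tail index $\alpha \in (0, \infty)$ and any spectral measure $\Psi$.

Proof. It is easy to see that
\[
|\gamma_{\xi_1} - \gamma_{\xi_2}| \le \Psi\,|f_{\xi_1,\alpha} - f_{\xi_2,\alpha}| \le \|f_{\xi_1,\alpha} - f_{\xi_2,\alpha}\|_\infty.
\]
Moreover, for any fixed $\alpha \in (0, \infty)$ the parametrization $\xi \mapsto f_{\xi,\alpha}$ is continuous in the sense that for any $\varepsilon > 0$ there exists $\delta > 0$ such that $|\xi_1 - \xi_2| \le \delta$ implies $\|f_{\xi_1,\alpha} - f_{\xi_2,\alpha}\|_\infty \le \varepsilon$. □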

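The following result extends the representation (3.8) to spectral measures associated with other norms.

Lemma 3.4. Suppose that the assumptions of Lemma 3.2 are satisfied. Let $\Psi_\diamond$ and $\mathbb{S}^d_\diamond$ denote the spectral measure of $X$ and the unit sphere corresponding to an arbitrary norm $\|\cdot\|_\diamond$ on $\mathbb{R}^d$. Then
\[
\gamma_\xi = c \cdot \int_{\mathbb{S}^d_\diamond} f_{\xi,\alpha}(s)\, d\Psi_\diamond(s), \tag{3.12}
\]
where
\[
c = \int_{\mathbb{S}_1^d} \|s\|_\diamond^\alpha\, d\Psi(s) = \left(\int_{\mathbb{S}^d_\diamond} \|s\|_1^\alpha\, d\Psi_\diamond(s)\right)^{-1}. \tag{3.13}
\]

Proof. Recall the product representation (2.15) and let $\nu_\diamond$ denote the version of the exponent measure normalized by $\nu_\diamond(A_\diamond) = 1$ with $A_\diamond := \{x \in \mathbb{R}^d : \|x\|_\diamond > 1\}$. Repeating the calculations from the proof of Lemma 3.2 for $\nu_\diamond(A_{\xi,1})$, one obtains
\[
\nu_\diamond(A_{\xi,1}) = \int_{\mathbb{S}^d_\diamond} f_{\xi,\alpha}(s)\, d\Psi_\diamond(s).
\]
In order to obtain $\gamma_\xi = \nu(A_{\xi,1})$, recall that the version $\nu$ of the exponent measure normalized by $\nu(A_1) = 1$ is a constant multiple of $\nu_\diamond$:
\[
\nu = c \cdot \nu_\diamond, \quad c \in (0, \infty).
\]
Due to $\nu(A_1) = 1 = \nu_\diamond(A_\diamond)$, the constant $c$ necessarily satisfies
\[
c = \nu(A_\diamond) = \left(\nu_\diamond(A_1)\right)^{-1}.
\]
Finally, it is easy to see that
\begin{align*}
\nu(A_\diamond) &= \int_{\mathbb{S}_1^d} \int_{\mathbb{R}_+} 1\left\{r > \|s\|_\diamond^{-1}\right\} d\rho_\alpha(r)\, d\Psi(s) \\
&= \int_{\mathbb{S}_1^d} \|s\|_\diamond^\alpha\, d\Psi(s)
\end{align*}
and, analogously,
\[
\nu_\diamond(A_1) = \int_{\mathbb{S}^d_\diamond} \|s\|_1^\alpha\, d\Psi_\diamond(s). \qquad \square
\]

This result shows that the reference norm underlying the spectral measure can be chosen arbitrarily. Indeed, starting with another reference norm $\|\cdot\|_\diamond$ and using the corresponding measures $\nu_\diamond$ and $\Psi_\diamond$, one obtains merely a rescaled version of the same functional $\gamma_\xi$. However, fixing the reference norm helps to define the extreme risk index $\gamma_\xi$ uniquely.

Finally, it is worth remarking that the identity (3.11) is a special property of the 1-norm: if another reference norm $\|\cdot\|_\diamond$ and the resulting spectral measure $\Psi_\diamond$ on $\mathbb{S}^d_\diamond \cap \mathbb{R}^d_+$ are used, the value of the integral $\Psi_\diamond f_{d^{-1}(1,\ldots,1),\alpha}$ depends on the mass distribution under $\Psi_\diamond$:
\[
\int_{\mathbb{S}^d_\diamond \cap \mathbb{R}^d_+} \left(d^{-1}\left(s^{(1)} + \ldots + s^{(d)}\right)\right)^\alpha d\Psi_\diamond(s) = \nu_\diamond(d \cdot A_1) = d^{-\alpha} c^{-1}.
\]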

3.3 Portfolio optimization and diversification effects

This section is dedicated to portfolio optimization with respect to the extreme risk index $\gamma_\xi$. The central results are given in Lemmas 3.5 and 3.6, which characterize the optimization problems in the pure loss case without short positions and in the general case, respectively. These results are discussed and illustrated with special emphasis on the inversion of diversification effects when the tail index $\alpha$ passes the critical value 1.

Consider the problem of finding the portfolio that minimizes the probability of extreme losses. As shown above, the relative probability of extreme losses can be measured by $\gamma_\xi$. Consequently, minimization of the function $\xi \mapsto \gamma_\xi$ yields a portfolio that is optimal with respect to extreme risks. The following result characterizes the optimization problem arising in the pure loss case without short positions.

Lemma 3.5. Suppose that the random vector $X$ takes values in $\mathbb{R}^d_+$ and is multivariate regularly varying with tail index $\alpha \in (0, \infty)$. Then the mapping $\xi \mapsto \gamma_\xi$ for $\xi \in \Sigma^d$ is

(a) convex for $\alpha > 1$;

(b) linear for $\alpha = 1$;

(c) concave for $\alpha \in (0, 1)$.

The convexity or concavity properties in (a) and (c) are strict if the spectral measure $\Psi$ of $X$ is not concentrated on a linear subspace of $\Sigma^d$.

Proof. Part (a). The convexity of $\xi \mapsto \gamma_\xi$ follows from the convexity of the mapping $t \mapsto t^\alpha$ for $t > 0$ and $\alpha \ge 1$. Given $\lambda \in (0, 1)$ and $\xi_1, \xi_2 \in \Sigma^d$, the representation (3.10) yields
\begin{align*}
\lambda \gamma_{\xi_1} + (1 - \lambda) \gamma_{\xi_2} &= \int_{\Sigma^d} \lambda \left(\xi_1^\top s\right)^\alpha + (1 - \lambda) \left(\xi_2^\top s\right)^\alpha d\Psi(s) \\
&\ge \int_{\Sigma^d} \left(\lambda \xi_1^\top s + (1 - \lambda) \xi_2^\top s\right)^\alpha d\Psi(s) \\
&= \gamma_{\lambda \xi_1 + (1 - \lambda) \xi_2}.
\end{align*}
Strict convexity holds if the inequality above is strict, i.e., if
\[
\int_{\Sigma^d} \lambda \left(\xi_1^\top s\right)^\alpha + (1 - \lambda) \left(\xi_2^\top s\right)^\alpha d\Psi(s) > \int_{\Sigma^d} \left(\lambda \xi_1^\top s + (1 - \lambda) \xi_2^\top s\right)^\alpha d\Psi(s)
\]
for all $\xi_1, \xi_2 \in \Sigma^d$ such that $\xi_1 \ne \xi_2$. Since the mapping $t \mapsto t^\alpha$ is strictly convex for $\alpha > 1$, equality holds only if $\xi_1^\top s = \xi_2^\top s$ almost surely with respect to $\Psi$. This can also be written as
\[
\Psi\left\{s \in \Sigma^d : (\xi_1 - \xi_2)^\top s = 0\right\} = 1,
\]
which means exactly that the entire probability mass of $\Psi$ is concentrated on $\Sigma^d \cap (\xi_1 - \xi_2)^\perp$.

Part (b) is trivial since for $\alpha = 1$ the mapping $t \mapsto t^\alpha$ is linear and the mapping $\xi \mapsto \gamma_\xi$ is therefore a composition of linear mappings.

Part (c) is analogous to part (a) due to the strict concavity of $t \mapsto t^\alpha$ for $\alpha \in (0, 1)$. □

Consequently, the location of the optimal portfolio
\[
\xi_{\mathrm{opt}} := \arg\min_{\xi \in \Sigma^d} \gamma_\xi
\]
in the pure loss case without short positions can be described as follows:

Figure 3.3 Left: density of the discrete spectral measure $\Psi(w)$ defined in (3.15) for $w = \frac{1}{12}(3, 2, 4, 1, 2)$. Right: resulting extreme risk index $\gamma_\xi(\Psi(w), \alpha)$, plotted against $\xi^{(1)}$, and the optimal portfolios (vertical lines) for the selected values $\alpha = 2, 2.5, 3, 3.5, 4$.

• For $\alpha > 1$ the typical location of $\xi_{\mathrm{opt}}$ is in the interior of $\Sigma^d$. The optimal portfolio is unique if there is no mass concentration on linear subspaces under $\Psi$.

• For $\alpha \le 1$ the minimum of $\gamma_\xi$ is achieved in a vertex of $\Sigma^d$, i.e.,
\[
\min_{\xi \in \Sigma^d} \gamma_\xi = \min_{i = 1, \ldots, d} \gamma_{e_i} \tag{3.14}
\]
with $e_i$ denoting the $i$-th unit vector.
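The two cases above can be explored numerically. The sketch below is an illustration under stated assumptions: the five-atom spectral measure on $\Sigma^2$, the grid resolution, and the function names are hypothetical choices made for this example. It minimizes $\gamma_\xi$ over a grid of portfolios $\xi = (\xi^{(1)}, 1 - \xi^{(1)})$ and compares a case with $\alpha > 1$ to a case with $\alpha < 1$.

```python
def gamma_pure_loss(xi1, atoms, weights, alpha):
    """gamma_xi from (3.10) for d = 2 and the portfolio xi = (xi1, 1 - xi1)."""
    return sum(
        w * (xi1 * s1 + (1.0 - xi1) * s2) ** alpha
        for (s1, s2), w in zip(atoms, weights)
    )


def grid_argmin(atoms, weights, alpha, steps=100):
    """Grid point xi1 = i / steps that minimizes gamma_xi on the simplex."""
    grid = [i / steps for i in range(steps + 1)]
    return min(grid, key=lambda xi1: gamma_pure_loss(xi1, atoms, weights, alpha))


# Hypothetical discrete spectral measure: five equally spaced atoms on Sigma^2.
atoms = [(i / 4, 1.0 - i / 4) for i in range(5)]
weights = [3 / 12, 2 / 12, 4 / 12, 1 / 12, 2 / 12]

xi1_convex = grid_argmin(atoms, weights, alpha=2.0)   # alpha > 1
xi1_concave = grid_argmin(atoms, weights, alpha=0.5)  # alpha < 1
```

For these weights the minimizer for $\alpha = 2$ lies in the interior of the simplex, while for $\alpha = 0.5$ the grid search ends in a vertex, in line with (3.14). A grid search is used only for transparency; for $\alpha > 1$ any convex optimization routine would serve equally well.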

Graphic examples for these facts are given in Figures 3.3 and 3.4 with discrete spectral measures $\Psi(w)$ defined by
\[
\Psi(w) := \sum_{i=1}^{n(w)} w^{(i)}\, \delta_{(i-1,\, n(w)-i)/(n(w)-1)}, \tag{3.15}
\]
where $w$ is a vector of weights and $n(w)$ is the size of $w$.

Figure 3.4 Left: density of the discrete spectral measure $\Psi(w)$ defined in (3.15) for $w = \frac{1}{50}(10, 15, 10, 5, 2, 1, 0, 0, 0, 2, 5)$. Right: resulting extreme risk index $\gamma_\xi(\Psi(w), \alpha)$, plotted against $\xi^{(1)}$, for the selected values $\alpha = 0.5, 0.7, 1, 1.5, 2, 2.5$.

The results of Lemma 3.5 and the conclusions above have an interesting consequence: if only the losses are taken into account, then portfolio diversification does not reduce the danger of extreme losses in the case $\alpha \in (0, 1]$. Moreover, for $\alpha < 1$ portfolio diversification typically increases extreme risks. The representation $\gamma_\xi = \int_{\Sigma^d} (\xi^\top s)^\alpha\, d\Psi(s)$ suggests that these negative effects are stronger in the case of low positive dependence, i.e., when the probability mass of $\Psi$ is concentrated around the vertices of the unit simplex $\Sigma^d$. Analogously, for $\alpha > 1$ low positive dependence makes positive diversification effects stronger. This is illustrated in Figure 3.5, where $\gamma_\xi$ and the normalized version $\gamma_\xi/\gamma_{(1,0)}$ are plotted for symmetric 3-point spectral measures $\Psi_\lambda$ defined as
\[
\Psi_\lambda := \lambda \cdot \delta_{(\frac{1}{2}, \frac{1}{2})} + (1 - \lambda) \cdot \frac{1}{2}\left(\delta_{(1,0)} + \delta_{(0,1)}\right), \quad \lambda \in [0, 1], \tag{3.16}
\]
with the parameter $\lambda$ quantifying the degree of positive dependence.

It should be noted that the normalization (3.11) makes the original values of $\gamma_\xi$ inconvenient for the comparison of diversification effects resulting from different spectral measures. Instead, an appropriate transformation of $\gamma_\xi$ is needed that assigns unit values to portfolios consisting of single assets. While a general solution to this problem is obtained in Chapter 5, the symmetry of the special case (3.16) allows one to quantify the diversification effects by the

ratio $\gamma_\xi/\gamma_{(1,0)}$. Thus lower values of $\gamma_\xi/\gamma_{(1,0)}$ indicate reduction of risk, and values below 1 indicate that the diversification effect of the portfolio $\xi$ is positive in the sense that $\xi$ is less risky than the single-asset portfolio $(1, 0)$.

Figure 3.5(a) represents the case $\alpha > 1$. It shows that the best diversification effects are achieved if $X^{(1)}$ and $X^{(2)}$ are asymptotically independent, i.e., if $\lambda = 0$, and that the worst case is given by $\lambda = 1$, where the comonotonic distribution of asset losses erases all diversification effects.

While the case $\alpha > 1$ accords with the usual intuition of portfolio diversification, there is a sort of phase change when the tail index $\alpha$ passes the critical value 1. The case $\alpha < 1$ is represented by Figure 3.5(b), showing that diversification effects for $\alpha \in (0, 1)$ are negative or zero and that the asymptotic independence of asset losses leads to the worst diversification effects, whereas the comonotonic case without any diversification effects is the best one. Thus in the case of infinite means the sensitivity to extremal events can only be optimized by minimizing the number of uncertainty sources and not by diversification.

It is worth remarking that the transition from sub-additivity of risks for $\alpha > 1$ to super-additivity for $\alpha < 1$ has been repeatedly observed in studies on risk aggregation in multivariate regularly varying models (cf. Rootzén and Klüppelberg, 1999; Alink et al., 2004; Daníelsson et al., 2005; Nešlehová et al., 2006; Barbe et al., 2006; Embrechts et al., 2009a,b). Lemma 3.5 is intimately related to these results.

Finally, it should also be noted that negative diversification effects in infinite mean models were already noticed in the early days of probability theory. If, for example, $X_1, \ldots, X_n$ are i.i.d. $\alpha$-stable random variables, then
\[
n^{-1}(X_1 + \ldots + X_n) \overset{d}{=} n^{(1/\alpha) - 1} X_1,
\]
which implies negative diversification effects for $\alpha < 1$. A general result on negative diversification effects in similar settings can be obtained from the Marcinkiewicz–Zygmund Strong Law of Large Numbers for i.i.d. random variables, cf. Nešlehová et al. (2006).

Now consider the optimization of $\gamma_\xi$ in the general case, i.e., for loss vectors in $\mathbb{R}^d$ and portfolios in $H_1$. As already highlighted in Corollary 3.3, the mapping $\xi \mapsto \gamma_\xi$ is continuous for any fixed $\Psi$ and $\alpha$. Consequently, if $\gamma_\xi$ is minimized over a compact subset $H \subset H_1$, there always exists an optimal portfolio
\[
\xi_{\mathrm{opt}} = \xi_{\mathrm{opt}}(H) := \arg\min_{\xi \in H} \gamma_\xi.
\]
The uniqueness of the optimal portfolio for $\alpha > 1$ and non-degenerate $\Psi$ is due to the following analogue of Lemma 3.5.

Figure 3.5 Influence of positive dependence on the extreme risk index $\gamma_\xi$ and the normalized version $\gamma_\xi/\gamma_{(1,0)}$ for $\alpha > 1$ and $\alpha < 1$ in the pure loss case without short positions. Panels: (a) $\alpha = 2$, (b) $\alpha = 0.5$; each panel shows the original values (left) and the normalized values (right) plotted against $\xi^{(1)}$ for $\lambda = 0, 0.2, 0.4, 0.6, 0.8, 1$. The underlying spectral measure $\Psi_\lambda$ is defined in (3.16).
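The qualitative behaviour shown in Figure 3.5 can be reproduced from the definition (3.16) alone. The following sketch is illustrative; the function names are hypothetical, and the closed form used in the code is obtained by inserting the three atoms of $\Psi_\lambda$ into (3.10) for $\xi = (\xi^{(1)}, 1 - \xi^{(1)})$.

```python
def gamma_three_point(xi1, lam, alpha):
    """gamma_xi for Psi_lambda from (3.16) and xi = (xi1, 1 - xi1) in Sigma^2."""
    # Atom (1/2, 1/2) with mass lam: xi^T s = 1/2 for every xi in Sigma^2.
    diagonal_part = lam * 0.5 ** alpha
    # Atoms (1, 0) and (0, 1) with mass (1 - lam) / 2 each.
    vertex_part = 0.5 * (1.0 - lam) * (xi1 ** alpha + (1.0 - xi1) ** alpha)
    return diagonal_part + vertex_part


def normalized_index(xi1, lam, alpha):
    """Ratio gamma_xi / gamma_(1,0) used to compare diversification effects."""
    return gamma_three_point(xi1, lam, alpha) / gamma_three_point(1.0, lam, alpha)


# Equally weighted portfolio under asymptotic independence (lam = 0).
ratio_light_tail = normalized_index(0.5, lam=0.0, alpha=2.0)
ratio_heavy_tail = normalized_index(0.5, lam=0.0, alpha=0.5)
# Equally weighted portfolio in the comonotonic case (lam = 1).
ratio_comonotonic = normalized_index(0.5, lam=1.0, alpha=0.5)
```

For $\lambda = 0$ the ratio at $\xi = (\frac{1}{2}, \frac{1}{2})$ equals $2^{1-\alpha}$, so it is below 1 for $\alpha = 2$ and above 1 for $\alpha = 0.5$; for $\lambda = 1$ it equals 1 for every $\alpha$.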

Lemma 3.6. Suppose that the random vector $X$ takes values in $\mathbb{R}^d$ and is multivariate regularly varying with tail index $\alpha \in (0, \infty)$. Then the mapping $\xi \mapsto \gamma_\xi$ for $\xi \in H_1$ is

(a) convex for $\alpha \ge 1$;

(b) strictly convex if $\alpha > 1$ and the total mass of $\Psi$ is concentrated neither on a linear subspace of $\mathbb{S}_1^d$ nor on the intersection of $\mathbb{S}_1^d$ with two unequal half-spaces of $\mathbb{R}^d$.

Proof. Part (a). Consider the integrand $f_{\xi,\alpha}$ defined in (3.9) as a function of $\xi \in H_1$. For any $s \in \mathbb{S}_1^d$ it can be decomposed as
\[
f_{\xi,\alpha}(s) = g_\alpha\left(\xi^\top s\right)
\]
with $g_\alpha(t) := t_+^\alpha$. Since $g_\alpha$ is convex for $\alpha \ge 1$ and $\xi \mapsto \xi^\top s$ is linear, $f_{\xi,\alpha}$ is convex in $\xi$ for all $s \in \mathbb{S}_1^d$. This implies convexity of the mapping $\xi \mapsto \Psi f_{\xi,\alpha}$.

Part (b). Suppose that $\alpha > 1$ and the mapping $\xi \mapsto \gamma_\xi$ is not strictly convex, i.e., there exist two unequal portfolio vectors $\xi_1, \xi_2 \in H_1$ and some $\lambda \in (0, 1)$ such that
\[
\Psi\left[\lambda g_\alpha\left(\xi_1^\top s\right) + (1 - \lambda) g_\alpha\left(\xi_2^\top s\right) - g_\alpha\left(\lambda \xi_1^\top s + (1 - \lambda) \xi_2^\top s\right)\right] = 0. \tag{3.17}
\]
Since the mapping $\xi \mapsto f_{\xi,\alpha}$ is convex for all $s \in \mathbb{S}_1^d$, the integrand in (3.17) is non-negative. Therefore the integral can only be equal to 0 if the integrand is equal to 0 $\Psi$-a.s. Moreover, the function $g_\alpha : \mathbb{R} \to [0, \infty)$ is strictly convex on $[0, \infty)$ for $\alpha > 1$. Consequently, (3.17) can only hold in the following two cases:

(i) $\xi_1^\top s = \xi_2^\top s$ $\Psi$-a.s., i.e., the mass of $\Psi$ is concentrated on a linear subspace of $\mathbb{S}_1^d$,

(ii) $\xi_1^\top s \le 0$ and $\xi_2^\top s \le 0$ $\Psi$-a.s., i.e., the mass of $\Psi$ is concentrated on the intersection of $\mathbb{S}_1^d$ with two unequal half-spaces of $\mathbb{R}^d$. □

In contrast to the pure loss setting, the integrand $f_{\xi,\alpha}$ defined in (3.9) is not concave for $\alpha < 1$. As a consequence, $\gamma_\xi$ is not necessarily concave for $\alpha < 1$ in the loss-gain case.

A very basic example illustrating Lemma 3.6 is given by the parametric family of spectral measures $\Psi_\lambda$ on $\mathbb{S}_1^d$ defined as
\[
\Psi_\lambda := \lambda \Psi_1 + (1 - \lambda) \Psi_0, \quad \lambda \in [0, 1], \tag{3.18}
\]
with
\[
\Psi_1 := \frac{1}{2}\left(\delta_{(-\frac{1}{2}, \frac{1}{2})} + \delta_{(\frac{1}{2}, -\frac{1}{2})}\right)
\]
and
\[
\Psi_0 := \frac{1}{4}\left(\delta_{(1,0)} + \delta_{(-1,0)} + \delta_{(0,1)} + \delta_{(0,-1)}\right).
\]
Thus the parameter $\lambda$ quantifies the degree of negative dependence in the spectral measure $\Psi_\lambda$, with $\lambda = 0$ corresponding to asymptotic independence and $\lambda = 1$ indicating total negative dependence. The resulting extreme risk index $\gamma_\xi$ is given by
\begin{align*}
\gamma_\xi(\Psi_\lambda, \alpha) &= \frac{\lambda}{2} \left|\frac{\xi^{(1)} - \xi^{(2)}}{2}\right|^\alpha + \frac{1 - \lambda}{4} \left(\left|\xi^{(1)}\right|^\alpha + \left|\xi^{(2)}\right|^\alpha\right) \\
&= \frac{\lambda}{2} \left|\xi^{(1)} - \frac{1}{2}\right|^\alpha + \frac{1 - \lambda}{4} \left(\left|\xi^{(1)}\right|^\alpha + \left|1 - \xi^{(1)}\right|^\alpha\right).
\end{align*}

Figure 3.6 illustrates the influence of the negative dependence parameter $\lambda$ on the extreme risk index $\gamma_\xi(\Psi_\lambda, \alpha)$ for selected values of $\lambda$ and $\alpha$. The case $\alpha > 1$ is represented by Figure 3.6(a), showing that $\gamma_\xi$ remains convex and that stronger negative dependence improves the diversification effects of portfolios without short positions. Figure 3.6(b) represents the case $\alpha < 1$. It shows that $\gamma_\xi$ is very sensitive to negative dependence and fails to be concave in $\xi \in \Sigma^d$. Moreover, increasing negative dependence makes diversification effects positive, which demonstrates that the phase change observed in the pure loss case does not occur in general.

Besides the loss of concavity in $\xi \in \Sigma^d$ for $\alpha < 1$, Figure 3.6 illustrates that short positions typically increase the probability of extreme portfolio losses and therefore should be taken with care. Indeed, investing a unit capital with short positions increases the portfolio norm $\|\xi\|_2$. Hence, due to (3.4), the corresponding set $A_{\xi,1}$ gets closer to the origin, which typically increases $\gamma_\xi = \nu(A_{\xi,1})$.
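The closed form of $\gamma_\xi(\Psi_\lambda, \alpha)$ can be cross-checked against a direct evaluation of (3.8) over the six atoms of (3.18). The sketch below is illustrative only; the function names and the test portfolios are hypothetical choices for this example.

```python
def gamma_direct(xi, lam, alpha):
    """gamma_xi via (3.8): weighted sum of (xi^T s)_+^alpha over the atoms of (3.18)."""
    atoms = [
        ((-0.5, 0.5), lam / 2), ((0.5, -0.5), lam / 2),             # atoms of Psi_1
        ((1.0, 0.0), (1 - lam) / 4), ((-1.0, 0.0), (1 - lam) / 4),  # atoms of Psi_0
        ((0.0, 1.0), (1 - lam) / 4), ((0.0, -1.0), (1 - lam) / 4),
    ]
    total = 0.0
    for (s1, s2), weight in atoms:
        t = xi[0] * s1 + xi[1] * s2
        if t > 0:
            total += weight * t ** alpha
    return total


def gamma_closed_form(xi1, lam, alpha):
    """Closed form for xi = (xi1, 1 - xi1), including short positions."""
    return (lam / 2) * abs(xi1 - 0.5) ** alpha + ((1 - lam) / 4) * (
        abs(xi1) ** alpha + abs(1 - xi1) ** alpha
    )


# Largest deviation between both evaluations over a few portfolios and parameters.
max_gap = max(
    abs(gamma_direct((x, 1 - x), lam, a) - gamma_closed_form(x, lam, a))
    for x in (-1.0, 0.0, 0.3, 0.5, 1.0, 2.0)
    for lam in (0.0, 0.4, 1.0)
    for a in (0.5, 2.0)
)
# Effect of a short position for alpha = 2 and lam = 0.
gamma_short = gamma_closed_form(2.0, 0.0, 2.0)   # xi = (2, -1)
gamma_single = gamma_closed_form(1.0, 0.0, 2.0)  # xi = (1, 0)
```

Both evaluations agree up to rounding, and the short portfolio $\xi = (2, -1)$ has a larger extreme risk index than the single-asset portfolio $(1, 0)$, which matches the remark on short positions above.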

3.4 Minimization of risk measures

This section is dedicated to the application of the extreme risk index $\gamma_\xi$ to risk minimization. It demonstrates that the portfolio $\xi_{\mathrm{opt}}$ obtained by minimization of $\gamma_\xi$ is asymptotically optimal with respect to some well-known and rather natural risk measures.

The ordering (3.7) of high quantiles of the portfolio loss $\xi^\top X$ by $\gamma_\xi$ has immediate consequences for risk measures such as the value-at-risk (VaR) and the expected shortfall (ES). Recall the definition of VaR and ES for a random loss $Y$ with distribution function $F_Y$. To avoid technicalities, assume that $F_Y$ is continuous. Then the value-at-risk at the confidence level $(1 - \lambda)$ for (typically small) $\lambda \in (0, 1)$ is defined by
\[
\mathrm{VaR}_{1-\lambda}(Y) := F_Y^{\leftarrow}(1 - \lambda).
\]
The expected shortfall at the confidence level $(1 - \lambda)$ is defined by
\[
\mathrm{ES}_{1-\lambda}(Y) := \frac{1}{\lambda} \int_{(1-\lambda, 1)} F_Y^{\leftarrow}(u)\, du \tag{3.19}
\]
(cf. McNeil et al., 2005). It is well known that continuity of $F_Y$ implies
\[
\mathrm{ES}_{1-\lambda}(Y) = \mathrm{E}\left[Y \mid Y > \mathrm{VaR}_{1-\lambda}(Y)\right].
\]

Figure 3.6 Influence of negative dependence on the extreme risk index $\gamma_\xi$ and the normalized version $\gamma_\xi/\gamma_{(1,0)}$ for $\alpha > 1$ and $\alpha < 1$ in the loss-gain case with short positions. Panels: (a) $\alpha = 2$, (b) $\alpha = 0.5$; each panel shows the original values (left) and the normalized values (right) plotted against $\xi^{(1)} \in [-1, 2]$ for $\lambda = 0, 0.2, 0.4, 0.6, 0.8, 1$. The underlying spectral measure $\Psi_\lambda$ is defined in (3.18).

Now consider $\mathrm{VaR}_{1-\lambda}(\xi^\top X)$ and $\mathrm{ES}_{1-\lambda}(\xi^\top X)$ for $\lambda \downarrow 0$ and a multivariate regularly varying loss vector $X$. As an immediate result of (3.7) one obtains that the portfolio vector $\xi_{\mathrm{opt}}$ minimizing $\gamma_\xi$ also minimizes $\mathrm{VaR}_{1-\lambda}$ with respect to extreme risks, i.e., asymptotically for $\lambda \downarrow 0$. Furthermore, for $\alpha > 1$ there holds
\[
\lim_{\lambda \downarrow 0} \frac{\mathrm{ES}_{1-\lambda}(\xi^\top X)}{\mathrm{VaR}_{1-\lambda}(\xi^\top X)} = \frac{\alpha}{\alpha - 1}. \tag{3.20}
\]
This asymptotic relation is a consequence of Karamata's theorem (cf. Theorem A.1). Consequently, $\xi_{\mathrm{opt}}$ also minimizes $\mathrm{ES}_{1-\lambda}(\xi^\top X)$ for $\lambda \downarrow 0$.

The asymptotic result (3.20) can be generalized to the class of spectral risk measures. Introduced by Acerbi (2002), spectral risk measures are obtained as weighted averages of loss quantiles,
\[
M_\phi(Y) := \int_{(0,1)} F_Y^{-1}(p)\, \phi(p)\, dp,
\]
where the weight function $\phi : (0, 1) \to \mathbb{R}$ is an admissible risk spectrum, i.e., it is non-negative, non-decreasing, and satisfies
\[
\int_{(0,1)} \phi(p)\, dp = 1.
\]
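The limit relation (3.20) can be observed numerically for a single regularly varying loss. The sketch below is an illustration under stated assumptions: the loss $Y$ is taken to have the hypothetical survival function $P\{Y > x\} = 1/(1 + x^2)$ for $x \ge 0$, which is regularly varying with $\alpha = 2$, and the integral (3.19) is evaluated by the midpoint rule after the substitution $u = 1 - \lambda w^4$, chosen here only to remove the singularity of the quantile function at $u = 1$.

```python
import math


def tail_quantile(p):
    """Quantile F_Y^{<-}(1 - p) for the survival function 1 / (1 + x^2), x >= 0."""
    return math.sqrt((1.0 - p) / p)


def value_at_risk(lam):
    """VaR at confidence level 1 - lam."""
    return tail_quantile(lam)


def expected_shortfall(lam, steps=2000):
    """ES from (3.19) by the midpoint rule after substituting u = 1 - lam * w**4."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        w = (i + 0.5) * h
        # du = -4 * lam * w**3 dw; the factor 1 / lam in (3.19) cancels lam.
        total += tail_quantile(lam * w ** 4) * 4.0 * w ** 3
    return total * h


alpha = 2.0
ratios = {
    lam: expected_shortfall(lam) / value_at_risk(lam) for lam in (0.1, 0.01, 1e-4)
}
limit = alpha / (alpha - 1.0)
```

The values in `ratios` approach `limit` as `lam` decreases. Working with the tail probability `p = 1 - u` rather than with `u` itself avoids the loss of precision that occurs when `u` is very close to 1.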

As a consequence of (3.19), $\mathrm{ES}_{1-\lambda}$ is a spectral risk measure. Thus (3.20) can be considered as a limit relation for the rescaled and properly normalized risk spectrum. Analogously, for any admissible risk spectrum $\phi_1 : (0, 1) \to \mathbb{R}$ the transformations
\[
\tau_\lambda : u \mapsto 1 - \lambda^{-1}(1 - u)
\]
for $\lambda \in (0, 1)$ and $u \in (1 - \lambda, 1)$ induce a family of rescaled admissible risk spectra $\phi_\lambda$ defined by
\begin{align*}
\phi_\lambda(u) &= \tau_\lambda'(u)\, \phi_1(\tau_\lambda(u)) \cdot 1_{(0,1)}(\tau_\lambda(u)) \\
&= \lambda^{-1} \phi_1\left(1 - \lambda^{-1}(1 - u)\right) \cdot 1_{(1-\lambda, 1)}(u). \tag{3.21}
\end{align*}
This notion leads to the following generalization of (3.20).

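Lemma 3.7. Let $Y$ be a continuously distributed random variable on $\mathbb{R}$ and suppose that $Y_+$ is regularly varying with tail index $\alpha > 1$. Further, let $\phi_\lambda$ be admissible risk spectra defined in (3.21) with $\phi_1$ satisfying
\[
\forall t \in (1, \infty) \quad \phi_1(1 - 1/t) \le K \cdot t^{-1/\alpha + 1 - \varepsilon} \tag{3.22}
\]
for some $K > 0$ and $\varepsilon > 0$. Then
\[
\lim_{\lambda \downarrow 0} \frac{M_{\phi_\lambda}(Y)}{\mathrm{VaR}_{1-\lambda}(Y)} = \int_{(1,\infty)} t^{1/\alpha - 2}\, \phi_1(1 - 1/t)\, dt. \tag{3.23}
\]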

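Proof. The spectral risk measure $M_{\phi_\lambda}(Y)$ is defined as
\[
M_{\phi_\lambda}(Y) = \int_{(1-\lambda, 1)} F_Y^{\leftarrow}(u)\, \phi_\lambda(u)\, du.
\]
It is easy to see that the representation (3.21) of $\phi_\lambda$ and the substitution $u = \tau_\lambda^{-1}(1 - 1/t) = 1 - \lambda/t$ yield
\[
M_{\phi_\lambda}(Y) = \int_{(1,\infty)} t^{-2} F_Y^{\leftarrow}(1 - \lambda/t)\, \phi_1(1 - 1/t)\, dt.
\]
Consequently, the ratio $M_{\phi_\lambda}(Y)/\mathrm{VaR}_{1-\lambda}(Y)$ can be written as
\[
\frac{M_{\phi_\lambda}(Y)}{\mathrm{VaR}_{1-\lambda}(Y)} = \int_{(1,\infty)} t^{-2} g_\lambda(t)\, \phi_1(1 - 1/t)\, dt \tag{3.24}
\]
with
\[
g_\lambda(t) := F_Y^{\leftarrow}(1 - \lambda/t) / F_Y^{\leftarrow}(1 - \lambda).
\]
It is well known that regular variation of $Y$ with tail index $\alpha$ implies regular variation of the function $t \mapsto F_Y^{\leftarrow}(1 - 1/t)$ with index $1/\alpha$. This yields $g_\lambda(t) \to t^{1/\alpha}$ for $\lambda \downarrow 0$ pointwise in $t \in (1, \infty)$. Hence the integrand in (3.24) converges pointwise to the integrand in (3.23). Furthermore, (3.22) implies
\[
t^{-2} g_\lambda(t)\, \phi_1(1 - 1/t) \le K \cdot t^{-1/\alpha - \varepsilon/2} g_\lambda(t) \cdot t^{-1 - \varepsilon/2} \tag{3.25}
\]
and the uniform convergence theorem for regularly varying functions (cf. Bingham et al., 1987, Theorem 1.5.2) yields
\[
t^{-1/\alpha - \varepsilon/2} g_\lambda(t) \to t^{-\varepsilon/2} \tag{3.26}
\]
for $\lambda \downarrow 0$ uniformly in $t \in (1, \infty)$. Moreover, $t^{-\varepsilon/2}$ is bounded for $t > 1$ and $t^{-1-\varepsilon/2}$ is integrable on $(1, \infty)$. Hence (3.25) and (3.26) imply the existence of an integrable bound for $t^{-2} g_\lambda(t)\, \phi_1(1 - 1/t)$ on $(1, \infty)$, and the dominated convergence theorem completes the proof. □

Applying Lemma 3.7 to multivariate regularly varying random vectors with non-degenerate extreme risk index $\gamma_\xi$, one obtains the following result.

Corollary 3.8. Let $X$ be multivariate regularly varying with spectral measure $\Psi$ and tail index $\alpha > 1$. Further, let $\phi_\lambda$ be admissible risk spectra defined in (3.21) with $\phi_1$ satisfying (3.22). Then, for any portfolio vectors $\xi_1, \xi_2 \in H_1$ with $\gamma_{\xi_1}, \gamma_{\xi_2} > 0$,
\[
\lim_{\lambda \downarrow 0} \frac{M_{\phi_\lambda}\left(\xi_1^\top X\right)}{M_{\phi_\lambda}\left(\xi_2^\top X\right)} = \left(\frac{\gamma_{\xi_1}}{\gamma_{\xi_2}}\right)^{1/\alpha}. \tag{3.27}
\]

Proof. It is easy to see that positive $\gamma_{\xi_1}$ and $\gamma_{\xi_2}$ imply that the portfolio losses $\xi_1^\top X$ and $\xi_2^\top X$ are regularly varying with tail index $\alpha$. Furthermore, (3.7) yields
\[
\lim_{\lambda \downarrow 0} \frac{\mathrm{VaR}_{1-\lambda}\left(\xi_1^\top X\right)}{\mathrm{VaR}_{1-\lambda}\left(\xi_2^\top X\right)} = \left(\frac{\gamma_{\xi_1}}{\gamma_{\xi_2}}\right)^{1/\alpha}.
\]
Thus (3.27) can be easily obtained from (3.23). □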
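As a consistency check, which is not part of the original argument, the expected shortfall relation (3.20) can be recovered from (3.23). The constant risk spectrum $\phi_1 \equiv 1$ is a choice made for this worked example. By (3.21) it gives $\phi_\lambda = \lambda^{-1} \cdot 1_{(1-\lambda, 1)}$, so that $M_{\phi_\lambda} = \mathrm{ES}_{1-\lambda}$ by (3.19), and condition (3.22) holds with $K = 1$ and any $\varepsilon \in (0, 1 - 1/\alpha]$. The right-hand side of (3.23) then becomes
\[
\int_{(1,\infty)} t^{1/\alpha - 2}\, dt = \left[\frac{t^{1/\alpha - 1}}{1/\alpha - 1}\right]_{t=1}^{t=\infty} = \frac{1}{1 - 1/\alpha} = \frac{\alpha}{\alpha - 1},
\]
where the integral is finite because $\alpha > 1$. This agrees with (3.20).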

It is easy to see that this result simplifies the minimization of $M_{\phi_\lambda}(\xi^\top X)$ for small $\lambda$. In particular, if the admissible portfolio set $H \subset H_1$ is compact and $\gamma_\xi$ is positive on $H$, the minimization of $M_{\phi_\lambda}(\xi^\top X)$ for $\lambda \downarrow 0$ is equivalent to the minimization of $\gamma_\xi$.

Chapter 4

Estimation of the extreme risk index and the optimal portfolio

This chapter is dedicated to the estimation of the extreme risk index $\gamma_\xi$ and the optimal portfolio $\xi_{\mathrm{opt}}$ in the i.i.d. setting. Based on the theory of empirical processes presented in van der Vaart and Wellner (1996), strong consistency and asymptotic normality of the proposed estimator $\hat\gamma_\xi$ are obtained uniformly in the portfolio vector $\xi$. Section 4.1 introduces the estimation approach and the basic notation and gives a first insight into the distribution of the subsample underlying the estimator $\hat\gamma_\xi$. The main results of this chapter, concerning consistency and asymptotic normality of the estimator $\hat\gamma_\xi$, are stated in Section 4.2. Section 4.3 introduces notions and auxiliary results related to empirical measures and empirical processes with functional index. Finally, the pending proofs of the main results from Section 4.2 are given in Section 4.4, and the second-order conditions underlying the asymptotic normality results are discussed in Section 4.5.

4.1 Basic approach

The representation (3.8) suggests the following plug-in approach to the estimation of the extreme risk index $\gamma_\xi$ and the optimal portfolio $\xi_{\mathrm{opt}}$:

1. Estimate the tail index $\alpha$ by an estimator $\hat\alpha$.

2. Estimate the spectral measure $\Psi$ on $\mathbb{S}_1^d$ by an estimator $\hat\Psi$.

3. Estimate $\gamma_\xi$ by
\[
\hat\gamma_\xi := \int_{\mathbb{S}_1^d} \hat f(s)\, d\hat\Psi(s) \tag{4.1}
\]
with $\hat f = f_{\xi,\hat\alpha} = (\xi^\top s)_+^{\hat\alpha}$.

4. Obtain an estimate for the optimal portfolio by minimizing $\hat\gamma_\xi$ on the admissible portfolio set $H \subset H_1$:
\[
\hat\xi_{\mathrm{opt}} = \hat\xi_{\mathrm{opt}}(H) := \arg\min_{\xi \in H} \hat\gamma_\xi. \tag{4.2}
\]
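The four steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the estimator analysed in this chapter in full generality: the tail index is estimated by one common variant of the Hill estimator with threshold $R_{n:k+1}$, the spectral measure by the empirical measure of the angular parts of the $k$ observations with the largest 1-norm, and the minimization in step 4 is a grid search over $\Sigma^2$. The sample values and all function names are hypothetical.

```python
import math


def polar_1norm(x):
    """1-norm polar coordinates (R, S) of a nonzero vector x."""
    r = sum(abs(c) for c in x)
    return r, tuple(c / r for c in x)


def hill_estimator(radii, k):
    """Hill estimate of alpha from the k largest radii, with threshold R_{n:k+1}."""
    top = sorted(radii, reverse=True)[: k + 1]
    mean_log_excess = sum(math.log(r / top[k]) for r in top[:k]) / k
    return 1.0 / mean_log_excess


def plug_in_estimator(sample, k):
    """Steps 1 to 3: return alpha_hat and the function xi -> gamma_hat_xi."""
    polar = [polar_1norm(x) for x in sample]
    radii = [r for r, _ in polar]
    alpha_hat = hill_estimator(radii, k)
    # Angular parts of the k observations with the largest radial parts.
    largest = sorted(range(len(sample)), key=lambda i: radii[i], reverse=True)[:k]
    angles = [polar[i][1] for i in largest]

    def gamma_hat(xi):
        total = 0.0
        for s in angles:
            t = sum(a * b for a, b in zip(xi, s))
            if t > 0:
                total += t ** alpha_hat
        return total / k

    return alpha_hat, gamma_hat


# Hypothetical loss observations in R^2_+ (illustration only).
sample = [(3.0, 1.0), (0.5, 0.2), (1.0, 6.0), (0.1, 0.3), (2.0, 2.5),
          (0.4, 0.4), (8.0, 1.0), (0.3, 0.9), (1.5, 0.5), (0.2, 0.1)]
alpha_hat, gamma_hat = plug_in_estimator(sample, k=4)

# Step 4: grid search over portfolios xi = (xi1, 1 - xi1) in Sigma^2.
grid = [(i / 100, 1.0 - i / 100) for i in range(101)]
xi_opt_hat = min(grid, key=gamma_hat)
```

Because all observations in this toy sample are non-negative, the value of `gamma_hat` at the uniform portfolio equals `2 ** (-alpha_hat)` up to rounding, which is the empirical counterpart of (3.11). In practice the choice of `k` matters considerably, and the sample size here is far too small for meaningful inference.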

Since $\hat\gamma_\xi$ is obtained by plugging $\hat\Psi$ and $\hat\alpha$ into the representation (3.8), the minimization problem for $\hat\gamma_\xi$ has the same properties as for $\gamma_\xi$ and is characterized by Lemmas 3.5 and 3.6.

Although there are established methods for the estimation of $\alpha$ and $\Psi$ as well as for the optimization of $\hat\gamma_\xi$, the final result of the plug-in approach (4.1) is not trivial. It must be ensured that the solutions of the approximating problems yield sensible approximations for both the optimal argument $\xi_{\mathrm{opt}}$ and the optimal value $\gamma_{\xi_{\mathrm{opt}}}$, which involves consistency of the estimator $\hat\gamma_\xi$ uniformly in $\xi$. Furthermore, it is desirable to establish asymptotic normality of $\hat\gamma_\xi$ as a function of the portfolio vector $\xi$.

Before formulating the estimators and stating the main results, some notation is needed. In the following, let $X$ denote a random vector that is multivariate regularly varying with tail index $\alpha$ and spectral measure $\Psi$ on $\mathbb{S}_1^d$. Further, let $X_1, \ldots, X_n$ denote an i.i.d. sample of $X$,
\[
X_1, \ldots, X_n \ \text{i.i.d.} \sim X,
\]
whereas the 1-norm polar coordinates of $X$ and $X_i$ are denoted by $(R, S)$ and $(R_i, S_i)$ respectively:
\[
(R, S) := \left(\|X\|_1, \|X\|_1^{-1} X\right), \quad (R_i, S_i) := \left(\|X_i\|_1, \|X_i\|_1^{-1} X_i\right).
\]
To avoid technical difficulties, it is assumed throughout the chapter that the distribution function of the radial parts is continuous:
\[
F_R(t) := P\{R \le t\} \in C(\mathbb{R}).
\]
Since this assumption is fulfilled in common applications and models, this restriction is not problematic.

As usual in extreme value theory, the estimates of tail-related parameters are obtained from a subsample of $k = k(n)$ observations associated with the $k$ upper order statistics $R_{n:1}, \ldots, R_{n:k}$ of the radial parts $R_1, \ldots, R_n$. The growth of $k$ is linked to $n$ by the following assumption:
\[
k(n) \to \infty, \quad \frac{k(n)}{n} \to 0. \tag{4.3}
\]

Let $i(n, 1), \ldots, i(n, k)$ denote the indices corresponding to the $k$ observations with the greatest values of $R_i$, ordered as they appear in the sample, i.e., $i(n, 1), \ldots, i(n, k)$ satisfy $1 \le i(n, 1) < \ldots < i(n, k) \le n$ and there exists a permutation $\pi$ of the tuple $(1, \ldots, k)$ such that
\[
\left(R_{i(n,1)}, \ldots, R_{i(n,k)}\right) = \left(R_{n:\pi(1)}, \ldots, R_{n:\pi(k)}\right). \tag{4.4}
\]
The subsample $X_{i(n,1)}, \ldots, X_{i(n,k)}$ contains all the information that is needed for the plug-in estimator (4.1) of $\gamma_\xi$. The estimator $\hat\alpha$ of the tail index $\alpha$ is obtained from the upper order statistics of the radial parts,
\[
\hat\alpha = \hat\alpha\left(R_{n:1}, \ldots, R_{n:k}\right), \tag{4.5}
\]
which can be based on various approaches (cf. Hill, 1975; Pickands, 1975; Smith, 1987; Dekkers et al., 1989). The spectral measure $\Psi$ is estimated by the empirical measure of the angular parts $S_{i(n,1)}, \ldots, S_{i(n,k)}$:
\[
\hat\Psi := \mathbb{P}_n := \frac{1}{k} \sum_{j=1}^{k} \delta_{S_{i(n,j)}}. \tag{4.6}
\]

There is a vast literature on the estimation of the exponent measure $\nu$ and the spectral measure $\Psi$, incorporating methods based on convergence of point processes (cf. de Haan and Resnick, 1993) and empirical processes (cf. Einmahl et al., 1993, 1997, 2001; de Haan and Sinha, 1999; Schmidt and Stadtmüller, 2006). However, there is no reference that would cover the asymptotic behaviour of $\hat\gamma_\xi$. Indeed, estimation of the exponent measure $\nu$ on sets related to portfolio losses is studied only by de Haan and Sinha (1999) for estimators that are essentially different from $\hat\gamma_\xi$. Moreover, neither the estimation of $\Psi$ with respect to function classes containing the functions $f_{\xi,\alpha}$ nor the uniform consistency of these estimates, which is needed in portfolio optimization, has been considered so far.

The first step on the way to the consistency and the asymptotic normality of the estimator $\hat\gamma_\xi$ is the analysis of the probability distribution underlying the subsample $X_{i(n,1)}, \ldots, X_{i(n,k)}$. The following lemma is fundamental to the main results stated in Section 4.2.

Lemma 4.1. Suppose that the distribution function $F_R$ of the radial part $R = \|X\|_1$ is continuous and let $U_n$ denote the $(k+1)$-st upper order statistic of $R_1, \ldots, R_n$ transformed by $F_R$:
\[
U_n := F_R\left(R_{n:k+1}\right). \tag{4.7}
\]
Then, for any $u \in (0, 1)$,
\[
\mathcal{L}\left(S_{i(n,1)}, \ldots, S_{i(n,k)} \mid U_n = u\right) = \otimes_{i=1}^{k} \Psi_u, \tag{4.8}
\]
where
\[
\Psi_u := \mathcal{L}\left(S \mid F_R(R) > u\right). \tag{4.9}
\]

Proof. The continuity of $F_R$ implies that the sample indices $i(n, 1), \ldots, i(n, k)$ and the permutation $\pi$ satisfying (4.4) are unique almost surely. Moreover, the random variables
\[
Y_i := F_R(R_i), \quad i = 1, \ldots, n,
\]
are independent and uniformly distributed on $(0, 1)$, whereas by construction of $Y_i$ there holds $R_i = F_R^{\leftarrow}(Y_i)$ almost surely and
\[
\left(Y_{i(n,1)}, \ldots, Y_{i(n,k)}\right) = \left(Y_{n:\pi(1)}, \ldots, Y_{n:\pi(k)}\right).
\]
It is well known (cf. Hajós and Rényi, 1954) that the conditional probability distribution of $Y_{n:1}, \ldots, Y_{n:k}$ given $Y_{n:k+1} = u$ is equal to the probability distribution of the order statistic of $k$ i.i.d. random variables that are uniformly distributed on $(u, 1)$. Moreover, the random permutation $\pi$ satisfying (4.4) is uniformly distributed on the permutation group $S_k$. This implies that the subsample $Y_{i(n,1)}, \ldots, Y_{i(n,k)}$ is conditionally i.i.d.:
\[
\mathcal{L}\left(Y_{i(n,1)}, \ldots, Y_{i(n,k)} \mid U_n = u\right) = \otimes_{i=1}^{k} \mathrm{unif}(u, 1).
\]
Since for a random variable $Z \sim \mathrm{unif}(u, 1)$ the probability distribution of $F_R^{\leftarrow}(Z)$ is equal to $\mathcal{L}(R \mid F_R(R) > u)$, one obtains
\[
\mathcal{L}\left(R_{i(n,1)}, \ldots, R_{i(n,k)} \mid U_n = u\right) = \otimes_{i=1}^{k} \mathcal{L}(R \mid F_R(R) > u).
\]
Finally, it is easy to see that this conditional i.i.d. property is inherited by the subsample $X_{i(n,1)}, \ldots, X_{i(n,k)}$ and its angular parts $S_{i(n,1)}, \ldots, S_{i(n,k)}$ in the following sense:
\[
\mathcal{L}\left(X_{i(n,1)}, \ldots, X_{i(n,k)} \mid U_n = u\right) = \otimes_{i=1}^{k} \mathcal{L}(X \mid F_R(R) > u)
\]
and
\[
\mathcal{L}\left(S_{i(n,1)}, \ldots, S_{i(n,k)} \mid U_n = u\right) = \otimes_{i=1}^{k} \mathcal{L}(S \mid F_R(R) > u). \qquad \square
\]

An immediate consequence of Lemma 4.1 is the following representation of $\mathcal{L}(S_{i(n,1)}, \ldots, S_{i(n,k)})$ as a mixture of product measures.

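Corollary 4.2. If $F_R$ is continuous, then
\[
P\left\{\left(S_{i(n,1)}, \ldots, S_{i(n,k)}\right) \in A\right\} = \int_{[0,1]} \Psi_u^k(A)\, dP^{U_n}(u) \tag{4.10}
\]
for $A \in \mathcal{B}\left((\mathbb{S}_1^d)^k\right)$, where $P^{U_n}$ is the probability distribution of $U_n$ and
\[
\Psi_u^k := \otimes_{i=1}^{k} \Psi_u, \quad u \in (0, 1).
\]

Since $F_R^{\leftarrow}(u) \to \infty$ for $u \uparrow 1$, the behaviour of $\Psi_u$ for $u \uparrow 1$ is related to the regular variation of $X$. One easily obtains the following result.

Lemma 4.3. Suppose that the random variable $X$ is multivariate regularly varying. Then
\[
\Psi_u \overset{w}{\to} \Psi, \quad u \uparrow 1.
\]

Proof. The measure $\Psi_u$ is obtained from the measure
\[
\mu_u := \mathcal{L}\left(t(u)^{-1} X \mid R > t(u)\right), \quad t(u) := F_R^{\leftarrow}(u),
\]
by projection on $\mathbb{S}_1^d$:
\[
\Psi_u = \mu_u^T
\]
with $T : x \mapsto \|x\|_1^{-1} x$. Moreover, representation (2.11) of multivariate regular variation implies
\[
\mu_u \overset{w}{\to} \nu|_{A_1}
\]
with $\nu|_{A_1}$ denoting the restriction of $\nu$ to the set $A_1 = \{x \in \mathbb{R}^d_+ : \|x\|_1 > 1\}$. Finally, the continuous mapping theorem yields
\[
\mu_u^T \overset{w}{\to} \left(\nu|_{A_1}\right)^T = \Psi. \qquad \square
\]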

4.2 Main results

This section contains the main results of the present chapter. Strong consistency and asymptotic normality of the estimator $\hat\gamma_\xi$ uniformly in the portfolio vector $\xi$ are stated in Theorems 4.4 and 4.8 respectively. Strong consistency of the estimated optimal portfolio $\hat\xi_{\mathrm{opt}}$ and the estimated optimal value $\hat\gamma_{\hat\xi_{\mathrm{opt}}}$ is obtained in Corollary 4.5. Resting upon the auxiliary results from Section 4.3, the proofs of Theorems 4.4 and 4.8 are deferred to Section 4.4.

Let $(\Omega, \mathcal{A}, P)$ denote the probability space underlying the i.i.d. sample $X_1, \ldots, X_n$. It is obvious that the estimator $\hat\gamma_\xi$ is a stochastic process with index $\xi \in H_1$, i.e., $\hat\gamma_\xi : \Omega \to \mathbb{R}$ is Borel measurable for any portfolio vector $\xi$. Furthermore, Corollary 3.3 implies that $\hat\gamma_\xi(\omega)$ is continuous in $\xi$ for any $\omega \in \Omega$, which allows one to consider $\hat\gamma_\xi$ as a mapping from $\Omega$ into $C(H_1)$. Moreover, it should be noted that for any compact subset $H \subset H_1$ the function space $C(H)$ is separable with respect to the uniform distance (cf. Lemma A.7). Hence restriction of the portfolio vector $\xi$ to a compact set $H \subset H_1$ makes the mapping $\omega \mapsto (\hat\gamma_\xi(\omega) : \xi \in H)$ measurable with respect to the Borel $\sigma$-field on $(C(H), \|\cdot\|_\infty)$. Thus strong consistency and asymptotic normality of $\hat\gamma_\xi$ uniformly in $\xi \in H$ can be considered as almost sure convergence of $\hat\gamma_\xi$ and weak convergence of $\sqrt{k}(\hat\gamma_\xi - \gamma_\xi)$ as mappings in $\ell^\infty(H)$.

The following theorem states strong consistency of $\hat\gamma_\xi$ uniformly in $\xi \in H$ for a compact subset $H \subset H_1$.

Theorem 4.4. Let $X_1, \ldots, X_n$ be i.i.d. multivariate regularly varying random variables on $\mathbb{R}^d$ with tail index $\alpha \in (0, \infty)$ and spectral measure $\Psi$ on $S_1^d$, and assume that the distribution function $F_R$ of the radial parts is continuous. Further, let $H$ be a compact subset of $H_1$ and suppose that the estimator $\hat\alpha$ is consistent almost surely:

\[ \hat\alpha \to \alpha \quad \mathrm{P}\text{-a.s.} \tag{4.11} \]

Then the estimator $\hat\gamma_\xi$ is strongly consistent uniformly in $\xi \in H$:

\[ \sup_{\xi \in H} |\hat\gamma_\xi - \gamma_\xi| \to 0 \quad \mathrm{P}\text{-a.s.} \tag{4.12} \]
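The estimator in Theorem 4.4 can be made concrete with a small simulation. The following is only a minimal sketch under stated assumptions: the radial part is taken as $R = \|X\|_1$ and the angular part as $S = X / \|X\|_1$, the empirical measure $\mathbb{P}_n$ is read as the average over the angular parts of the $k$ observations with the largest radii, and the tail index is estimated by the Hill estimator. The function names `hill_alpha` and `gamma_hat` and the Pareto toy sample are hypothetical and not taken from the thesis.

```python
import math
import random


def hill_alpha(radii, k):
    """Hill estimate of the tail index from the k largest radii (needs 1 <= k < len(radii))."""
    r = sorted(radii, reverse=True)
    # r[k] is the (k+1)-th largest radius and serves as the threshold.
    return k / sum(math.log(r[j] / r[k]) for j in range(k))


def gamma_hat(sample, xi, k, alpha=None):
    """Average of (xi^T s)_+^alpha over the angular parts of the k largest observations.

    If alpha is None, the Hill estimate is plugged in (an assumption of this sketch).
    """
    radii = [sum(abs(c) for c in x) for x in sample]  # R = ||X||_1
    top = sorted(range(len(sample)), key=lambda i: radii[i], reverse=True)[:k]
    a = hill_alpha(radii, k) if alpha is None else alpha
    total = 0.0
    for i in top:
        # xi^T S with S = X / ||X||_1
        proj = sum(w * c for w, c in zip(xi, sample[i])) / radii[i]
        total += max(proj, 0.0) ** a
    return total / k


if __name__ == "__main__":
    random.seed(0)
    # Toy data: two independent Pareto(2) components (hypothetical example).
    sample = [(random.paretovariate(2.0), random.paretovariate(2.0)) for _ in range(5000)]
    for w in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(w, gamma_hat(sample, (w, 1.0 - w), k=250))
```

Evaluating `gamma_hat` on a grid of portfolio vectors, as in the loop above, gives a finite-sample picture of the mapping $\xi \mapsto \hat\gamma_\xi$ whose uniform convergence is asserted in (4.12).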

It should be noted that (4.12) means convergence of $\hat\gamma_\xi$ to $\gamma_\xi$ as a mapping in $l^\infty(H)$ and that both the minimum and the minimizing argument are continuous functions on $(C(H), \|\cdot\|_\infty)$. Hence, as an immediate consequence of Theorem 4.4, one obtains the following result.

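Corollary 4.5. Suppose that the conditions of Theorem 4.4 are satisfied and the optimal portfolio $\xi_{\mathrm{opt}} = \arg\min_{\xi \in H} \gamma_\xi$ is unique. Then the estimator $\hat\xi_{\mathrm{opt}}$ and the estimated optimal value $\hat\gamma_{\hat\xi_{\mathrm{opt}}}$ are consistent almost surely:

\[ \hat\xi_{\mathrm{opt}} \to \xi_{\mathrm{opt}} \quad \mathrm{P}\text{-a.s.}, \qquad \hat\gamma_{\hat\xi_{\mathrm{opt}}} \to \gamma_{\xi_{\mathrm{opt}}} \quad \mathrm{P}\text{-a.s.} \]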
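The plug-in optimization behind Corollary 4.5 can be sketched as a grid search. This is a hedged illustration only: it assumes $d = 2$, takes the admissible set to be the segment $\{(w, 1 - w) : w \in [0, 1]\}$ (an assumption about $H$, not a statement from the thesis), and works directly with a list of angular parts and a given value of $\alpha$. The names `gamma_hat` and `argmin_on_segment` are hypothetical.

```python
def gamma_hat(angles, xi, alpha):
    """Empirical average of (xi^T s)_+^alpha over the given angular parts."""
    total = 0.0
    for s in angles:
        proj = sum(w * c for w, c in zip(xi, s))
        total += max(proj, 0.0) ** alpha
    return total / len(angles)


def argmin_on_segment(angles, alpha, m=100):
    """Grid search for the minimizer of gamma_hat over {(w, 1 - w) : w in [0, 1]}."""
    grid = [(j / m, 1.0 - j / m) for j in range(m + 1)]
    best = min(grid, key=lambda xi: gamma_hat(angles, xi, alpha))
    return best, gamma_hat(angles, best, alpha)


if __name__ == "__main__":
    # Two atoms on the axes, i.e. angular mass split evenly between e_1 and e_2.
    print(argmin_on_segment([(1.0, 0.0), (0.0, 1.0)], alpha=2.0))
```

For angular mass split evenly between the two axes and $\alpha = 2$, the objective is $(w^2 + (1 - w)^2)/2$, so the grid search returns the equally weighted portfolio.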

The asymptotic normality results for the estimator $\hat\gamma_\xi$ are established upon a condition related to the asymptotic independence of the radial and the angular parts of the subsample $X_{i(n,1)}, \ldots, X_{i(n,k)}$. It involves the estimator $\hat\alpha = \hat\alpha(R_{n:1}, \ldots, R_{n:k})$ and the empirical measure $\mathbb{P}_n$ of the angular parts $S_{i(n,1)}, \ldots, S_{i(n,k)}$, defined in (4.6). The measure $\mathbb{P}_n$ is indexed by elements of the function class

\[ \mathcal{F}_{H,\alpha} := \{ f_{\xi,\alpha} : \xi \in H \} \tag{4.13} \]

where $\alpha \in (0, \infty)$ and $H \subset H_1$ is a compact set of admissible portfolios. The measurability of the empirical measure $\mathbb{P}_n = \mathbb{P}_n(\omega)$ and the random measure

\[ \Psi_{U_n}(\omega) := \Psi_{U_n(\omega)} \tag{4.14} \]

with $\Psi_u$ defined in (4.9), as mappings in $l^\infty(\mathcal{F}_{H,\alpha})$, is established in Lemma 4.12.

Condition 4.6. The random variable

\[ Y_n := \sqrt{k}\,(\hat\alpha - \alpha) \tag{4.15} \]

and the mapping $\mathbb{G}_n : \Omega \to l^\infty(\mathcal{F}_{H,\alpha})$ defined as

\[ \mathbb{G}_n := \sqrt{k}\,(\mathbb{P}_n - \Psi_{U_n}) \tag{4.16} \]

are asymptotically independent, i.e., for any bounded continuous functions $h_1 \in C_b(\mathbb{R})$ and $h_2 \in C_b(l^\infty(\mathcal{F}_{H,\alpha}))$ there holds

\[ \lim_{n \to \infty} \big( \mathrm{E}[h_1(Y_n) h_2(\mathbb{G}_n)] - \mathrm{E} h_1(Y_n) \, \mathrm{E} h_2(\mathbb{G}_n) \big) = 0. \]

Remark 4.7. The major reason for stating the explicit assumption of asymptotic independence between $Y_n = \sqrt{k}(\hat\alpha - \alpha)$ and $\mathbb{G}_n$ in Condition 4.6 is the resulting generality of Theorem 4.8, which leaves the estimator $\hat\alpha$ unspecified. However, since asymptotic independence of radial and angular parts is an essential feature of multivariate regularly varying models, Condition 4.6 merely reflects the natural intuition towards any sensible estimator $\hat\alpha = \hat\alpha(R_{i(n,1)}, \ldots, R_{i(n,k)})$ and the empirical process $\mathbb{G}_n$, constructed from the angular parts $S_{i(n,1)}, \ldots, S_{i(n,k)}$. In particular, the Hill estimator $\hat\alpha_H$, representing one of the most fundamental approaches to the estimation of the tail index $\alpha$, satisfies Condition 4.6 automatically. See Lemma 4.21 and Corollary 4.22 for further details.

The next theorem states asymptotic normality for $\hat\gamma_\xi$ as a process with index $\xi \in H$ and, under weaker conditions, pointwise in $\xi$. For a definition of the Brownian bridge on a function class the reader is referred to the formulation of the Donsker property (4.27).

Theorem 4.8. Let $X_1, \ldots, X_n$ be i.i.d. multivariate regularly varying random variables on $\mathbb{R}^d$ with tail index $\alpha \in (0, \infty)$ and spectral measure $\Psi$ on $S_1^d$, and suppose that the distribution function $F_R$ of the radial parts is continuous. Further, let $H$ be a compact subset of $H_1$ and assume that Condition 4.6 is satisfied.

(a) Suppose that the estimator $\hat\alpha$ is asymptotically normal,

\[ \sqrt{k}\,(\hat\alpha - \alpha) \xrightarrow{w} Y \sim N(\mu_\alpha, \sigma_\alpha^2), \tag{4.17} \]

and that there exists a mapping $b \in l^\infty(H)$ such that

\[ \sqrt{k}\,(\Psi_{U_n} - \Psi) f_{\xi,\alpha} \xrightarrow{\mathrm{P}} b(\xi) \quad \text{in } l^\infty(H). \tag{4.18} \]

Then

\[ \sqrt{k}\,(\hat\gamma_\xi - \gamma_\xi) \xrightarrow{w} b(\xi) + G_\Psi f_{\xi,\alpha} + \Psi[\partial_\alpha f_{\xi,\alpha}] \cdot Y \quad \text{in } l^\infty(H), \tag{4.19} \]

where $b(\xi)$ is the asymptotic bias term from (4.18), $G_\Psi$ is a Brownian bridge on the function class $\mathcal{F}_{H,\alpha}$ "with time" $\Psi$, $Y$ is a Gaussian random variable that is independent of $G_\Psi$ and distributed according to (4.17), and $\partial_\alpha f_{\xi,\alpha}$ denotes the partial derivative of $f_{\xi,\alpha}$ with respect to $\alpha$:

\[ \partial_\alpha f_{\xi,\alpha} := \frac{\partial}{\partial\alpha} f_{\xi,\alpha} = (\xi^\top s)_+^\alpha \log\big( (\xi^\top s)_+ \big). \]

(b) Suppose that (4.17) is satisfied and that

\[ \sqrt{k}\,(\Psi_{U_n} f_{\xi_i,\alpha} - \Psi f_{\xi_i,\alpha}) \to b(\xi_i) \in \mathbb{R} \tag{4.20} \]

holds for $\xi_1, \ldots, \xi_p \in H$. Then

\[ \sqrt{k}\,\big( (\hat\gamma_{\xi_1}, \ldots, \hat\gamma_{\xi_p}) - (\gamma_{\xi_1}, \ldots, \gamma_{\xi_p}) \big) \xrightarrow{w} N(M, C) \tag{4.21} \]

where the mean vector $M = M(\alpha, \xi_1, \ldots, \xi_p)$ and the covariance matrix $C = C(\alpha, \xi_1, \ldots, \xi_p)$ are given by

\[ M^{(i)} = b(\xi_i) + \mu_\alpha \Psi[\partial_\alpha f_{\xi_i,\alpha}], \]
\[ C_{i,j} = \Psi[f_{\xi_i,\alpha} f_{\xi_j,\alpha}] - \Psi f_{\xi_i,\alpha} \Psi f_{\xi_j,\alpha} + \sigma_\alpha^2 \, \Psi[\partial_\alpha f_{\xi_i,\alpha}] \, \Psi[\partial_\alpha f_{\xi_j,\alpha}] \]

with $i, j$ ranging in $\{1, \ldots, p\}$ and $(\mu_\alpha, \sigma_\alpha^2)$ being the mean and the variance of the random variable $Y$ in (4.17).

It is well known that the estimators of $\alpha$ mentioned in Section 4.1 are asymptotically normal under appropriate second-order conditions specifying the convergence rate of the distribution $\mathcal{L}(t^{-1} R \mid R > t)$ for $t \to \infty$. A comprehensive elaboration on this topic is given in de Haan and Ferreira (2006). For original results see (among others) Davis and Resnick (1984); Drees (1995); Dekkers et al. (1989); Smith (1987); Drees et al. (2004).

Condition (4.18) can be understood as a second-order condition related to the weak convergence of the angular parts $S_{i(n,1)}, \ldots, S_{i(n,k)}$. Since multivariate regular variation leaves convergence rates completely unspecified, similar conditions are necessary for establishing asymptotic normality in regularly varying models. An explicit criterion that is sufficient for (4.18) is obtained in Lemma 4.19 and illustrated in Example 4.20.

4.3 Empirical processes with functional index

The present section provides auxiliary results underlying the uniform consistency and asymptotic normality results stated in Section 4.2. After a brief introduction to the notion of Glivenko–Cantelli and Donsker theorems, basic properties of function classes relevant to the estimator $\hat\gamma_\xi$ are highlighted in Lemmas 4.9, 4.10, and 4.11. Further, measurability of the empirical measure $\mathbb{P}_n$ defined in (4.6) and the empirical process $\mathbb{G}_n$ defined in (4.16) is obtained in Lemma 4.12, whereas uniform Glivenko–Cantelli, Donsker, and pre-Gaussian properties of relevant function classes are established in Lemmas 4.13, 4.14, and 4.15. Finally, convergence of $\mathbb{P}_n$ and $\mathbb{G}_n$ as mappings in $l^\infty$ is obtained in Lemmas 4.17 and 4.18.

The estimator $\hat\gamma_\xi$ of the extreme risk index $\gamma_\xi$ proposed in Section 4.1 is obtained by indexing the empirical measure $\mathbb{P}_n$ defined in (4.6) with a random element $f_{\xi,\hat\alpha}$ of the function class

\[ \mathcal{F}_H := \{ f_{\xi,\alpha} : \xi \in H, \ \alpha \in (0, \infty) \}, \tag{4.22} \]

where $H \subset H_1$ is a compact set of admissible portfolio vectors. Additionally, it will be shown further on that consistency of the estimator $\hat\alpha$ and smoothness of the parametrization $\alpha \mapsto f_{\xi,\alpha}$ allow one to reduce the index set of the empirical measure $\mathbb{P}_n$ and the empirical process $\mathbb{G}_n$ defined in (4.16) to the function class

\[ \mathcal{F}_{H,\alpha} := \{ f_{\xi,\alpha} : \xi \in H \} \tag{4.23} \]

with $\alpha \in (0, \infty)$ being the true tail index.

Moreover, restriction of $\mathbb{P}_n$ and $\mathbb{G}_n$ to function classes $\mathcal{F}_{H,\alpha}$ allows one to avoid measurability issues. As stated in Lemma 4.12, $\mathbb{P}_n$ and $\mathbb{G}_n$ are Borel measurable mappings from $\Omega$ into $l^\infty(\mathcal{F}_{H,\alpha})$, which is not true for empirical measures and empirical processes in general. In particular, measurability issues appear even in the basic case of the empirical distribution function of an i.i.d. sample $Y_1, \ldots, Y_k \sim \mathrm{unif}(0, 1)$,

\[ \mathbb{F}_k(t) := \frac{1}{k} \sum_{i=1}^{k} 1\{t \ge Y_i\}, \quad t \in [0, 1]. \]
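To make the object in the last display concrete, the following sketch evaluates the empirical distribution function of a sample and its uniform distance to the $\mathrm{unif}(0,1)$ distribution function. It is an illustration only; the helper names `ecdf` and `sup_distance_to_uniform` are hypothetical, and the code does not touch the measurability question discussed in the text.

```python
def ecdf(sample):
    """Return t -> (1/k) * #{i : t >= Y_i} for the given sample."""
    k = len(sample)

    def F(t):
        return sum(1 for y in sample if t >= y) / k

    return F


def sup_distance_to_uniform(sample):
    """sup_t |F_k(t) - t| on [0, 1], evaluated exactly at the jump points."""
    ys = sorted(sample)
    k = len(ys)
    # At the i-th order statistic the step function jumps from i/k to (i+1)/k.
    return max(max((i + 1) / k - y, y - i / k) for i, y in enumerate(ys))


if __name__ == "__main__":
    data = [0.2, 0.5, 0.9]
    print(ecdf(data)(0.5), sup_distance_to_uniform(data))
```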

It is well known that $\mathbb{F}_k : \Omega \to D[0, 1]$ is not Borel measurable if the set $D[0, 1]$ of càdlàg functions on the unit interval is endowed with the supremum norm (cf. Billingsley, 1968).

It should be noted that the measurability of $\mathbb{P}_n$ and $\mathbb{G}_n$ is needed for applying the conditional i.i.d. property obtained in Lemma 4.1 to $\mathbb{P}_n$ and $\mathbb{G}_n$. Indeed, conditional distributions of the random mappings $\mathbb{P}_n, \mathbb{G}_n : \Omega \to l^\infty(\mathcal{F}_{H,\alpha})$ are well-defined only if $\mathbb{P}_n$ and $\mathbb{G}_n$ are measurable.

It is also worth remarking that Borel measurability of $\mathbb{P}_n$ and $\mathbb{G}_n$ allows one to use standard notions of stochastic convergence, i.e., weak, in probability, or almost sure, for $\mathbb{P}_n$ and $\mathbb{G}_n$ as mappings in $l^\infty(\mathcal{F}_{H,\alpha})$. However, this advantage is not essential. The notions of stochastic convergence can be extended using outer expectations and outer probabilities, so that convergence of measurable and non-measurable mappings can be considered in a common framework (cf. van der Vaart and Wellner, 1996). For the sake of completeness, the extended notion of weak convergence as well as the notions of convergence in outer probability and outer almost surely are sketched in Appendix A.2.

Consistency and asymptotic normality of $\hat\gamma_\xi$ can be viewed as special versions of the Glivenko–Cantelli and the Donsker theorems (cf. van der Vaart and Wellner, 1996, and references therein). Let $\mathbb{P}_{k,\Psi}$ denote the empirical measure corresponding to $k$ i.i.d. random variables with probability distribution $\Psi$:

\[ \mathbb{P}_{k,\Psi} := \frac{1}{k} \sum_{i=1}^{k} \delta_{Y_i}, \quad Y_1, \ldots, Y_k \ \text{i.i.d.} \sim \Psi. \tag{4.24} \]

A class $\mathcal{F}$ of measurable functions is called Glivenko–Cantelli if the Glivenko–Cantelli theorem holds for $\mathbb{P}_{k,\Psi}$ uniformly in $f \in \mathcal{F}$:

\[ \|\mathbb{P}_{k,\Psi} - \Psi\|_{\mathcal{F}} := \sup_{f \in \mathcal{F}} |(\mathbb{P}_{k,\Psi} - \Psi) f| \to 0 \tag{4.25} \]

as $k \to \infty$ in outer probability or outer almost surely (cf. Definition A.5). In particular, the strong Glivenko–Cantelli property of $\mathbb{P}_{k,\Psi}$ uniformly in $\mathcal{F}$ means

\[ \|\mathbb{P}_{k,\Psi} - \Psi\|_{\mathcal{F}}^* \to 0 \quad \mathrm{P}\text{-a.s.}, \]

where $(\cdot)^*$ denotes the measurable cover on the underlying probability space $(\Omega, \mathcal{A}, \mathrm{P})$ and can be omitted if $\|\mathbb{P}_{k,\Psi} - \Psi\|_{\mathcal{F}}$ is measurable (cf. Remark A.4).

Let $\mathbb{G}_{k,\Psi}$ denote the empirical process corresponding to $\mathbb{P}_{k,\Psi}$:

\[ \mathbb{G}_{k,\Psi} := \sqrt{k}\,(\mathbb{P}_{k,\Psi} - \Psi). \tag{4.26} \]

A class $\mathcal{F}$ of measurable functions is called Donsker if the Donsker theorem holds for $\mathbb{G}_{k,\Psi}$ uniformly in $f \in \mathcal{F}$:

\[ \mathbb{G}_{k,\Psi} \xrightarrow{w} G_\Psi \quad \text{in } l^\infty(\mathcal{F}), \quad k \to \infty, \tag{4.27} \]

where $G_\Psi$ is the Brownian bridge "with time" $\Psi$, i.e.,

\[ (G_\Psi f_1, \ldots, G_\Psi f_m) \sim N(0, C) \tag{4.28} \]

and the covariance matrix $C = (C_{i,j})$ is given by

\[ C_{i,j} := \Psi\big[ (f_i - \Psi f_i)(f_j - \Psi f_j) \big] = \Psi f_i f_j - \Psi f_i \Psi f_j. \tag{4.29} \]

A class $\mathcal{F}$ of measurable functions is called pre-Gaussian if the limit process $G_\Psi$ in (4.27) is tight. It should be noted that the mere existence of a zero-mean Gaussian process $G_\Psi$ with covariance structure specified in (4.29) is ensured by Kolmogorov's extension theorem. More precisely, the Donsker property and pre-Gaussianity of a function class $\mathcal{F}$ mean that there exist a probability space $(\Omega', \mathcal{A}', \mathrm{P}')$ and a tight, Borel measurable mapping $G_\Psi : \Omega' \to l^\infty(\mathcal{F})$ satisfying (4.28), (4.29), and (4.27).

If the empirical process $\mathbb{G}_{k,\Psi}$ is not measurable in $l^\infty(\mathcal{F})$, the symbol "$\xrightarrow{w}$" in (4.27) represents the extended notion of weak convergence for non-measurable mappings (cf. Definition A.6). The usage of the same symbol for both classical and extended weak convergence is consistent since these notions coincide in case of measurability.

However, standard Glivenko–Cantelli and Donsker theorems cannot be applied directly to the empirical measure $\mathbb{P}_n$ and the empirical process $\mathbb{G}_n$ of the subsample $S_{i(n,1)}, \ldots, S_{i(n,k)}$. Although conditionally i.i.d. given $U_n = u$ (cf. Lemma 4.1), the random variables $S_{i(n,1)}, \ldots, S_{i(n,k)}$ are not necessarily independent. Moreover, the probability distribution of each $S_{i(n,j)}$ varies with $n$. Thus uniform convergence results for $\mathbb{P}_n f_{\xi,\hat\alpha}$ and $\mathbb{G}_n f_{\xi,\hat\alpha}$ demand special versions of Glivenko–Cantelli and Donsker theorems that take into account the structure of the underlying probability distribution $\mathcal{L}(S_{i(n,1)}, \ldots, S_{i(n,k)})$.

Being the central results of this section, the specialized Glivenko–Cantelli and Donsker theorems for the empirical measure $\mathbb{P}_n$ and the empirical process $\mathbb{G}_n$ are stated in Lemmas 4.17 and 4.18. These results conclude a foregoing series of auxiliary lemmas dedicated to the basic properties of the function classes comprising the integrands $f_{\xi,\alpha}$ or their partial derivatives in $\alpha$, the measurability of $\mathbb{P}_n$ and $\mathbb{G}_n$ as mappings in $l^\infty$, and the uniform Glivenko–Cantelli, Donsker, and pre-Gaussian properties of relevant function classes.

The following result states that the functions $f_{\xi,\alpha}$ and the partial derivatives $\partial_\alpha f_{\xi,\alpha}$ and $\partial_\alpha^2 f_{\xi,\alpha}$ are uniformly bounded if the parameters $\xi$ and $\alpha$ range over compact sets.

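Lemma 4.9. Suppose that the sets $H \subset H_1$ and $I \subset (0, \infty)$ are compact. Then the function classes

\[ \mathcal{F}_{H,I} := \{ f_{\xi,\alpha} : \xi \in H, \ \alpha \in I \}, \]
\[ \partial_\alpha \mathcal{F}_{H,I} := \{ \partial_\alpha f_{\xi,\alpha} : \xi \in H, \ \alpha \in I \}, \]
\[ \partial_\alpha^2 \mathcal{F}_{H,I} := \{ \partial_\alpha^2 f_{\xi,\alpha} : \xi \in H, \ \alpha \in I \} \]

are uniformly bounded.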
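The boundedness claim of Lemma 4.9 can be checked numerically for the scalar functions that generate these classes. The sketch below assumes $f_{\xi,\alpha}(s) = (\xi^\top s)_+^\alpha$, so that it suffices to look at $t \mapsto t_+^\alpha$ and its first two derivatives in $\alpha$ on a rectangle $I \times [-b, b]$, with the value $0$ for $t \le 0$. The function names and the grid resolution are hypothetical choices, and a grid maximum is of course only an approximation of the supremum.

```python
import math


def g(alpha, t):
    """t_+^alpha."""
    return max(t, 0.0) ** alpha


def dg(alpha, t):
    """First derivative of t_+^alpha in alpha: t_+^alpha * log(t_+), set to 0 for t <= 0."""
    return 0.0 if t <= 0 else t ** alpha * math.log(t)


def d2g(alpha, t):
    """Second derivative in alpha: t_+^alpha * log(t_+)^2, set to 0 for t <= 0."""
    return 0.0 if t <= 0 else t ** alpha * math.log(t) ** 2


def sup_on_grid(fn, interval, b, m=200):
    """Maximum of |fn(alpha, t)| over an (m+1) x (m+1) grid on interval x [-b, b]."""
    a_lo, a_hi = interval
    return max(
        abs(fn(a_lo + (a_hi - a_lo) * i / m, -b + 2.0 * b * j / m))
        for i in range(m + 1)
        for j in range(m + 1)
    )


if __name__ == "__main__":
    for fn in (g, dg, d2g):
        print(fn.__name__, sup_on_grid(fn, (0.5, 3.0), 2.0))
```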

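Proof. The functions $f_{\xi,\alpha}$ have the form

\[ f_{\xi,\alpha}(s) = g_\alpha \circ h_\xi(s) \tag{4.30} \]

with

\[ g_\alpha(t) := t_+^\alpha \quad \text{and} \quad h_\xi(s) := \xi^\top s. \tag{4.31} \]

This yields

\[ \partial_\alpha f_{\xi,\alpha} = (\partial_\alpha g_\alpha) \circ h_\xi \quad \text{and} \quad \partial_\alpha^2 f_{\xi,\alpha} = (\partial_\alpha^2 g_\alpha) \circ h_\xi. \tag{4.32} \]

Furthermore, $s \in S_1^d$ implies

\[ |h_\xi(s)| = |\xi^\top s| \le \|\xi\|_\infty \cdot \|s\|_1 = \|\xi\|_\infty \]

and compactness of $H$ yields

\[ \forall \xi \in H \quad h_\xi(S_1^d) \subset [-b, b] \tag{4.33} \]

with

\[ b = b(H) := \sup\{ \|\xi\|_\infty : \xi \in H \} < \infty. \tag{4.34} \]

Thus it suffices to show that the mappings

\[ (\alpha, t) \mapsto g_\alpha(t) = t_+^\alpha, \]
\[ (\alpha, t) \mapsto \partial_\alpha g_\alpha(t) = t_+^\alpha \log(t_+), \]
\[ (\alpha, t) \mapsto \partial_\alpha^2 g_\alpha(t) = t_+^\alpha \log^2(t_+) \]

are bounded on the compact domain $I \times [-b, b]$, which is an obvious consequence of their continuity. $\square$

The next result shows that compactness of the admissible portfolio set $H$ implies continuity of $f_{\xi,\alpha}$ and $\partial_\alpha f_{\xi,\alpha}$ with respect to the parameter $\xi$.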

Lemma 4.10. Let $H \subset H_1$ be compact and $\alpha \in (0, \infty)$. Then the parametrizations $\xi \mapsto f_{\xi,\alpha}$ and $\xi \mapsto \partial_\alpha f_{\xi,\alpha}$ are continuous mappings from $H$ into $l^\infty(S_1^d)$.

Proof. It was already highlighted in (4.33) that $h_\xi(s)$ is bounded uniformly in $\xi \in H$. Furthermore, it is easy to see that the mapping $\xi \mapsto h_\xi(s)$ is Lipschitz uniformly in $s \in S_1^d$:

\[ |h_{\xi_1}(s) - h_{\xi_2}(s)| = |(\xi_1 - \xi_2)^\top s| \le \|\xi_1 - \xi_2\|_\infty \cdot \|s\|_1 = \|\xi_1 - \xi_2\|_\infty. \tag{4.35} \]

Moreover, the mapping $t \mapsto g_\alpha(t) = t_+^\alpha$ is continuous and hence uniformly continuous on compact domains. Consequently, the mapping $\xi \mapsto f_{\xi,\alpha}(s) = g_\alpha \circ h_\xi(s)$ is continuous uniformly in $s \in S_1^d$. This yields continuity of the parametrization $\xi \mapsto f_{\xi,\alpha}$ as a mapping from $H$ into $l^\infty(S_1^d)$. For the continuity of $\xi \mapsto \partial_\alpha f_{\xi,\alpha}$ one only needs to replace $g_\alpha$ by $\partial_\alpha g_\alpha$. $\square$

Now recall the conditional angular distribution $\Psi_u = \mathcal{L}(S \mid F_R(R) > u)$ defined for $u \in [0, 1)$ (cf. Lemma 4.1) and the extension $\Psi_1 := \Psi$ justified by $\Psi_u \xrightarrow{w} \Psi$ as $u \uparrow 1$ (cf. Lemma 4.3). The following result states that the parametrization $u \mapsto \Psi_u$ is continuous in $l^\infty$, which is essential for the measurability of the random centring $\Psi_{U_n}$ in (4.16).

Lemma 4.11. Let $X$ be multivariate regularly varying with tail index $\alpha \in (0, \infty)$ and spectral measure $\Psi$ on $S_1^d$. Further, assume that the distribution function $F_R$ of the radial parts is continuous and that the set $H \subset H_1$ of admissible portfolio vectors is compact. Then

(a) the mappings $u \mapsto \Psi_u f_{\xi,\alpha}$ and $u \mapsto \Psi_u[\partial_\alpha f_{\xi,\alpha}]$ are continuous in $u \in [0, 1]$ for any $\xi \in H$;

(b) the measure $\Psi_u$ converges to $\Psi$ in $l^\infty$:

\[ \|\Psi_u - \Psi\|_{\mathcal{F}^*} := \sup_{f \in \mathcal{F}^*} |\Psi_u f - \Psi f| \to 0, \quad u \uparrow 1, \]

for $\mathcal{F}^* = \mathcal{F}_{H,\alpha}$ and $\mathcal{F}^* = \partial_\alpha \mathcal{F}_{H,\alpha} := \{ \partial_\alpha f_{\xi,\alpha} : \xi \in H \}$.

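Proof. Part (a). Continuity of $F_R$ implies $F_R(R) \sim \mathrm{unif}(0, 1)$. Hence one obtains for $u \in [0, 1)$ and $f \in \mathcal{F}^*$:

\[ \Psi_u f = \mathrm{E}[f(S) \mid F_R(R) > u] = \frac{1}{\mathrm{P}\{F_R(R) > u\}} \int_{(u,1]} \mathrm{E}[f(S) \mid F_R(R) = v] \, d\mathrm{P}^{F_R(R)}(v) = \frac{1}{1 - u} \int_{(u,1)} \Psi'_v f \, dv \tag{4.36} \]

with

\[ \Psi'_u := \mathcal{L}(S \mid F_R(R) = u), \quad u \in (0, 1). \tag{4.37} \]

Now let $u_1, u_2 \in [0, 1)$ and suppose without loss of generality $u_1 < u_2$. Then

\[ \Psi_{u_1} f - \Psi_{u_2} f = \frac{1}{1 - u_1} \int_{(u_1,1)} \Psi'_v f \, dv - \frac{1}{1 - u_2} \int_{(u_2,1)} \Psi'_v f \, dv = \frac{1}{1 - u_1} \int_{(u_1,u_2]} \Psi'_v f \, dv - \frac{u_2 - u_1}{(1 - u_1)(1 - u_2)} \int_{(u_2,1)} \Psi'_v f \, dv. \]

This implies

\[ |\Psi_{u_1} f - \Psi_{u_2} f| \le \frac{u_2 - u_1}{1 - u_1} \|f\|_\infty + \frac{(u_2 - u_1)(1 - u_2)}{(1 - u_1)(1 - u_2)} \|f\|_\infty = \frac{2 \|f\|_\infty}{1 - u_1} \cdot (u_2 - u_1). \tag{4.38} \]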
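The Lipschitz-type bound (4.38) can be sanity-checked on a toy example. In the sketch below the function `phi` is a purely hypothetical stand-in for $v \mapsto \Psi'_v f$, chosen only because it is bounded by $1$ and has an antiderivative in closed form; `psi_u_f` then evaluates the averaged integral from (4.36). Nothing here is derived from an actual spectral measure.

```python
import math


def phi(v):
    """Hypothetical stand-in for v -> Psi'_v f, bounded in absolute value by 1."""
    return math.cos(10.0 * v)


def psi_u_f(u):
    """(1 - u)^{-1} * integral of phi over (u, 1), using the antiderivative sin(10 v) / 10."""
    return (math.sin(10.0) - math.sin(10.0 * u)) / (10.0 * (1.0 - u))


def bound(u1, u2, sup_norm=1.0):
    """Right-hand side of (4.38) for u1 < u2."""
    return 2.0 * sup_norm * (u2 - u1) / (1.0 - u1)


def check(m=200):
    """Verify the bound on all grid pairs u1 < u2 in [0, 1)."""
    us = [i / m for i in range(m)]
    return all(
        abs(psi_u_f(a) - psi_u_f(b)) <= bound(a, b) + 1e-12
        for a in us
        for b in us
        if a < b
    )


if __name__ == "__main__":
    print(check())
```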

Hence the mappings $u \mapsto \Psi_u f$ are continuous in $u \in [0, 1)$ for $f \in \mathcal{F}_{H,\alpha}$ and $f \in \partial_\alpha \mathcal{F}_{H,\alpha}$. Finally, continuity of $u \mapsto \Psi_u f$ in $u = 1$ is an immediate consequence of the weak convergence $\Psi_u \xrightarrow{w} \Psi = \Psi_1$ established in Lemma 4.3.

Part (b). Consider the mapping

\[ (u, \xi) \mapsto \varphi(u, \xi) := |(\Psi_u - \Psi) f_{\xi,\alpha}| \]

for $u \in [0, 1]$ and $\xi \in H$. It is obvious that $\varphi(1, \xi) = 0$ for $\xi \in H$. Moreover, part (a) and Lemma 4.10 imply that $\varphi$ is continuous. Consequently, $\varphi$ is uniformly continuous on the compact domain $[0, 1] \times H$ and therefore

\[ \sup_{f \in \mathcal{F}_{H,\alpha}} |(\Psi_u - \Psi) f| = \sup_{\xi \in H} \varphi(u, \xi) \le \sup_{\xi \in H, \, v \in [u,1]} \varphi(v, \xi) \to 0 \]

as $u \uparrow 1$. The same arguments apply to the function class $\partial_\alpha \mathcal{F}_{H,\alpha}$. $\square$

The next result states that all empirical measures and empirical processes that are relevant to the uniform consistency and asymptotic normality of the estimator $\hat\gamma_\xi$ on compact portfolio sets $H$ are measurable in $l^\infty$.

Lemma 4.12. The empirical measures $\mathbb{P}_n$ and $\mathbb{P}_{k,\Psi}$, the random measure $\Psi_{U_n}$, and the empirical processes $\mathbb{G}_n$ and $\mathbb{G}_{k,\Psi}$ are Borel measurable as mappings in $l^\infty(\mathcal{F}^*)$ for $\mathcal{F}^* = \mathcal{F}_{H,\alpha}$ and $\mathcal{F}^* = \partial_\alpha \mathcal{F}_{H,\alpha}$.

Proof. Recall that $\mathbb{G}_n = \sqrt{k}(\mathbb{P}_n - \Psi_{U_n})$ and $\mathbb{G}_{k,\Psi} = \sqrt{k}(\mathbb{P}_{k,\Psi} - \Psi)$ and note that $\Psi$ is measurable as a mapping from $\Omega$ into $l^\infty(\mathcal{F}^*)$ because it is constant. Hence it suffices to establish the measurability of the random measures $\mathbb{P}_n$, $\mathbb{P}_{k,\Psi}$, and $\Psi_{U_n}$.

It is obvious that for any probability measure $Q$ on $\mathcal{B}(S_1^d)$ the mapping $f \mapsto Qf$ is continuous in $f \in \mathcal{F}^*$:

\[ |Q f_1 - Q f_2| \le \|f_1 - f_2\|_\infty, \quad f_1, f_2 \in \mathcal{F}^*. \tag{4.39} \]

Hence the mappings $f \mapsto \mathbb{P}_n(\omega) f$, $f \mapsto \mathbb{P}_{k,\Psi}(\omega) f$, and $f \mapsto \Psi_{U_n}(\omega) f$ are continuous in $f \in \mathcal{F}^*$ for any $\omega \in \Omega$.

Moreover, the continuity of the parametrizations $\xi \mapsto f_{\xi,\alpha}$ and $\xi \mapsto \partial_\alpha f_{\xi,\alpha}$ established in Lemma 4.10 and the compactness of the parameter domain $H$ imply that the function class $\mathcal{F}^*$ is compact in $l^\infty(S_1^d)$. Hence, according to Lemma A.7, the random measures $\mathbb{P}_n$, $\mathbb{P}_{k,\Psi}$, and $\Psi_{U_n}$ are Borel measurable as mappings from $\Omega$ into $l^\infty(\mathcal{F}^*)$ if they are stochastic processes with index $f \in \mathcal{F}^*$, i.e., if $\mathbb{P}_n f$, $\mathbb{P}_{k,\Psi} f$, and $\Psi_{U_n} f$ are measurable for any $f \in \mathcal{F}^*$.

It is easy to see that the mappings $\omega \mapsto \mathbb{P}_n(\omega) f$ and $\omega \mapsto \mathbb{P}_{k,\Psi}(\omega) f$ for $f \in \mathcal{F}^*$ are measurable by construction (cf. (4.6) and (4.24)). Finally, the measurability of the mapping $\omega \mapsto \Psi_{U_n}(\omega) f = \Psi_{U_n(\omega)} f$ is an immediate consequence of the measurability of the random variable $U_n$ and the continuity of the mapping $u \mapsto \Psi_u f$ established in Lemma 4.11(a). $\square$

Sufficient criteria for Glivenko–Cantelli, Donsker, and pre-Gaussian properties of a function class $\mathcal{F}$ can be formulated in terms of its entropy (cf. Definition A.8). The verification of entropy conditions can be based on the properties of Vapnik–Červonenkis (VC) classes of sets and functions presented in van der Vaart and Wellner (1996, Section 2.6). For the reader's convenience, definitions of these objects and auxiliary results needed below are collected in the appendix, starting with Definition A.16. The next result is needed for the proof of Lemma 4.14.

Lemma 4.13. Any finite-dimensional vector space $\mathcal{H}$ of measurable functions $f : \mathcal{X} \to \mathbb{R}$ is VC-major.

Proof. According to Definition A.16(d), $\mathcal{H}$ is VC-major if the set class

\[ \{ \{x \in \mathcal{X} : h(x) > t\} : h \in \mathcal{H}, \ t \in \mathbb{R} \} \]

is VC. Obviously, this set class can also be written as

\[ \{ \{x \in \mathcal{X} : h_t(x) > 0\} : h \in \mathcal{H}, \ t \in \mathbb{R} \} \tag{4.40} \]

with

\[ h_t(x) := h(x) + t. \]

The function class $\{h_t : h \in \mathcal{H}, \ t \in \mathbb{R}\}$ is a finite-dimensional vector space and therefore VC-subgraph according to Lemma A.18. Hence Lemma A.19 implies that the set class (4.40) is VC. $\square$

The following result allows one to obtain entropy bounds for the function classes $\mathcal{F}_{H,\alpha}$ and $\partial_\alpha \mathcal{F}_{H,\alpha}$ that are essential to the proof of Lemma 4.15.

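Lemma 4.14. Let $H \subset H_1$ be compact and $\alpha \in (0, \infty)$. Then the function classes $\mathcal{F}_{H,\alpha}$ and $\partial_\alpha \mathcal{F}_{H,\alpha}$ are VC-hull.

Proof. Recall that $\mathcal{F}_{H,\alpha}$ and $\partial_\alpha \mathcal{F}_{H,\alpha}$ are uniformly bounded due to Lemma 4.9. Hence, according to Lemma A.21, it suffices to demonstrate that these function classes are VC-major.

Consider first the function class $\mathcal{F}_{H,\alpha} = g_\alpha \circ \mathcal{H}_H$ with $g_\alpha(t) = t_+^\alpha$ and

\[ \mathcal{H}_H := \{ h_\xi : \xi \in H \}. \]

It is easy to see that $\mathcal{H}_H$ is a subset of a finite-dimensional vector space of functions and therefore VC-major (cf. Lemma 4.13). Since $g_\alpha$ is monotone, the VC-major property is inherited by $\mathcal{F}_{H,\alpha} = g_\alpha \circ \mathcal{H}_H$ (cf. Lemma A.20).

Now consider the function class $\partial_\alpha \mathcal{F}_{H,\alpha} = (\partial_\alpha g_\alpha) \circ \mathcal{H}_H$. According to Definition A.16, one needs to show that the set class

\[ \mathcal{E} := \big\{ \{ s \in S_1^d : \partial_\alpha f_{\xi,\alpha}(s) > t \} : t \in \mathbb{R}, \ \xi \in H \big\} \]

is VC. The decomposition $\partial_\alpha f_{\xi,\alpha} = (\partial_\alpha g_\alpha) \circ h_\xi$ with $\partial_\alpha g_\alpha(t) = t_+^\alpha \log(t_+)$ yields

\[ \{ s \in S_1^d : \partial_\alpha f_{\xi,\alpha}(s) > t \} = \{ s \in S_1^d : h_\xi(s) < v_1(t) \} \cup \{ s \in S_1^d : h_\xi(s) > v_2(t) \} \]

where $v_1(t)$ and $v_2(t)$ are defined as

\[ v_1 := \inf\{ y \in \mathbb{R} : y_+^\alpha \log(y_+) \le t \}, \]
\[ v_2 := \sup\{ y \in \mathbb{R} : y_+^\alpha \log(y_+) \le t \}, \]

with $\inf(\emptyset) := \infty$ and $\sup(\emptyset) := -\infty$. Furthermore, it is easy to see that the set class

\[ \big\{ \{ s \in S_1^d : h_\xi(s) > v_2(t) \} : t \in \mathbb{R}, \ \xi \in H \big\} \]

is contained in the set class

\[ \mathcal{E}_0 := \big\{ \{ s \in S_1^d : h_\xi(s) > t \} : t \in [-\infty, \infty], \ \xi \in \mathbb{R}^d \big\}. \]

Moreover, due to $h_\xi(s) = \xi^\top s$ the set class

\[ \big\{ \{ s \in S_1^d : h_\xi(s) < v_1(t) \} : t \in \mathbb{R}, \ \xi \in H \big\} \]

is also contained in $\mathcal{E}_0$. This implies

\[ \mathcal{E} \subset \mathcal{E}_0 \sqcup \mathcal{E}_0 := \{ A \cup B : A, B \in \mathcal{E}_0 \}. \]

Finally, the set class $\mathcal{E}_0$ is VC according to Lemma 4.13 and the VC property is inherited by $\mathcal{E}_0 \sqcup \mathcal{E}_0$ due to Lemma A.17. $\square$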
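The level-set decomposition used in the proof of Lemma 4.14 can be illustrated numerically. The sketch below computes $v_1(t)$ and $v_2(t)$ by bisection, using that $y \mapsto y_+^\alpha \log(y_+)$ (with value $0$ for $y \le 0$) decreases on $(0, e^{-1/\alpha})$ and increases afterwards. The function name `level_bounds`, the search cap `upper`, and the iteration count are assumptions of this sketch, and the code presumes that the root $v_2(t)$ lies below `upper`.

```python
import math


def dg(y, alpha):
    """y_+^alpha * log(y_+), with the value 0 for y <= 0."""
    return 0.0 if y <= 0 else y ** alpha * math.log(y)


def level_bounds(t, alpha, upper=1e6, iters=200):
    """Return (v1, v2) = (inf, sup) of {y : dg(y, alpha) <= t}.

    Uses inf(empty set) = +inf and sup(empty set) = -inf.
    """
    y_min = math.exp(-1.0 / alpha)  # location of the global minimum -1 / (alpha * e)
    if t < dg(y_min, alpha):
        return math.inf, -math.inf
    # v2: dg is increasing on [y_min, upper]; find the largest y with dg(y) <= t.
    lo, hi = y_min, upper
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if dg(mid, alpha) <= t:
            lo = mid
        else:
            hi = mid
    v2 = lo
    if t >= 0:
        # Every y <= 0 satisfies dg(y) = 0 <= t, so the infimum is -inf.
        return -math.inf, v2
    # v1: dg is decreasing on (0, y_min]; find the smallest y with dg(y) <= t.
    lo, hi = 0.0, y_min
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if dg(mid, alpha) <= t:
            hi = mid
        else:
            lo = mid
    return hi, v2


if __name__ == "__main__":
    v1, v2 = level_bounds(-0.1, 2.0)
    for y in (-1.0, 0.1, 0.5, 0.95, 2.0):
        # Both columns agree: dg(y) > t exactly when y < v1 or y > v2.
        print(y, dg(y, 2.0) > -0.1, (y < v1) or (y > v2))
```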

The subsequent lemma is based upon the uniform entropy bounds (cf. Definition A.8) for VC-hull function classes and is essential to the convergence of the empirical measure Pn defined in (4.6) and the empirical process Gn defined in (4.16) as mappings in l∞. A function class F is called universally Glivenko–Cantelli, Donsker, or pre-Gaussian, if the corresponding property holds uniformly for all probability measures on the sample space.

Lemma 4.15. Let H ⊂ H1 be compact and α ∈ (0, ∞). Then the function classes FH,α and ∂αFH,α are universally Glivenko–Cantelli, Donsker, and pre-Gaussian.

Proof. Let $\mathcal{F}^*$ denote $\mathcal{F}_{H,\alpha}$ or $\partial_\alpha \mathcal{F}_{H,\alpha}$. First it should be noted that all functions $f \in \mathcal{F}^*$ are measurable. Furthermore, Lemma 4.9 implies that $\mathcal{F}^*$ is uniformly bounded. Hence the constant function

\[ F(s) := 1_{S_1^d}(s) \cdot \sup_{f \in \mathcal{F}^*} \|f\|_\infty \tag{4.41} \]

can serve as an envelope function for $\mathcal{F}^*$ (cf. Definition A.10).

Moreover, in the proof of Lemma 4.12 it was highlighted that the function class $\mathcal{F}^*$ is a compact subset of $l^\infty(S_1^d)$. Consequently, $\mathcal{F}^*$ is separable with respect to the uniform distance and therefore universally $\Psi$-measurable, i.e., $\Psi$-measurable for any probability measure $\Psi$ on $\mathcal{B}(S_1^d)$ (cf. Definition A.11).

Hence, according to Theorem A.12, the uniformly bounded and universally $\Psi$-measurable function class $\mathcal{F}^*$ is universally Glivenko–Cantelli if it satisfies the following entropy condition:

\[ \sup_{Q \in \mathcal{Q}_n} \log N\big( \varepsilon \|F\|_{Q,1}, \mathcal{F}^*, L^1(Q) \big) = o(n), \tag{4.42} \]

where $\mathcal{Q}_n$ denotes the set of all discrete probability measures on $S_1^d$ with atoms of size $j/n$ for $j \in \{1, \ldots, n\}$.

Another consequence of the separability of $\mathcal{F}^*$ in $l^\infty(S_1^d)$ is the separability of the function classes

\[ \mathcal{F}^*_{\delta,P} := \{ f - g : f, g \in \mathcal{F}^*, \ \|f - g\|_{P,2} < \delta \}, \]
\[ \mathcal{F}^{*2}_{\infty} := \{ (f - g)^2 : f, g \in \mathcal{F}^* \}. \]

Hence these function classes are universally $\Psi$-measurable and Theorem A.13 yields that $\mathcal{F}^*$ is universally Donsker and pre-Gaussian if the following uniform entropy condition is satisfied:

\[ \int_0^\infty \sup_{Q \in \mathcal{Q}} \sqrt{ \log N\big( \varepsilon \|F\|_{Q,2}, \mathcal{F}^*, L^2(Q) \big) } \, d\varepsilon < \infty, \tag{4.43} \]

where $\mathcal{Q}$ denotes the set of all finitely discrete probability measures.

It is easy to see that $\|\cdot\|_{Q,1} \le \|\cdot\|_{Q,2}$ for any probability measure $Q$ on $S_1^d$. Furthermore, since the envelope function $F$ is constant, there holds $\|F\|_{Q,1} = \|F\|_{Q,2}$. Hence any $\varepsilon \|F\|_{Q,2}$-ball is a subset of an $\varepsilon \|F\|_{Q,1}$-ball and therefore

\[ N\big( \varepsilon \|F\|_{Q,1}, \mathcal{F}^*, L^1(Q) \big) \le N\big( \varepsilon \|F\|_{Q,2}, \mathcal{F}^*, L^2(Q) \big). \]

Thus (4.43) implies (4.42) and it suffices to verify (4.43). Moreover, it should be noted that $\mathcal{F}^*$ is covered by a single ball of size $\|F\|_{Q,2}$:

\[ \forall f \in \mathcal{F}^* \quad \|f\|_{Q,2} \le \|F\|_{Q,2}. \]

Consequently, the integrand in (4.43) is equal to $0$ for $\varepsilon > 1$ and (4.43) can be reduced to

\[ \int_0^1 \sup_{Q \in \mathcal{Q}} \sqrt{ \log N\big( \varepsilon \|F\|_{Q,2}, \mathcal{F}^*, L^2(Q) \big) } \, d\varepsilon < \infty. \tag{4.44} \]

According to Lemma 4.14, the function class $\mathcal{F}^*$ is VC-hull. Hence Lemma A.22 implies that for all probability measures $Q$ on $S_1^d$ there holds

\[ \log N\big( \varepsilon \|F\|_{Q,2}, \mathcal{F}^*, L^2(Q) \big) \le K \left( \frac{1}{\varepsilon} \right)^{2 - 2 V_m(\mathcal{F}^*)^{-1}} \tag{4.45} \]

with constant $K$ depending only on the VC-index $V_m(\mathcal{F}^*) \in \mathbb{N}$ of the VC-subgraph set class connected with $\mathcal{F}^*$. Since the exponent in (4.45) is smaller than $2$, the square root of the right-hand side is integrable at $0$, which yields the integrability condition (4.44) for the function class $\mathcal{F}^*$. $\square$

Remark 4.16. An alternative proof of Lemma 4.15 can be obtained by applying Theorem A.14 to the bracketing entropy bounds (cf. Definition A.9) resulting from explicitly constructed brackets of size $\varepsilon$ with respect to the supremum norm that cover $\mathcal{F}_{H,\alpha}$ and $\partial_\alpha \mathcal{F}_{H,\alpha}$. The covering numbers obtained by explicit construction are polynomial with respect to $1/\varepsilon$ and the resulting entropy numbers are logarithmic, thus showing that the entropy bound (4.45) is, although sufficient, far from being minimal.

The following result is a consequence of the universal Glivenko–Cantelli property obtained in Lemma 4.15 and the mixture representation (4.10) of the distribution $\mathcal{L}(S_{i(n,1)}, \ldots, S_{i(n,k)})$.

Lemma 4.17. Suppose that $X$ is multivariate regularly varying with tail index $\alpha \in (0, \infty)$ and spectral measure $\Psi$. Further, let $H \subset H_1$ be compact. Then the empirical measure $\mathbb{P}_n$ defined in (4.6) and the random measure $\Psi_{U_n}$ defined in (4.14) satisfy

\[ \mathbb{P}_n - \Psi_{U_n} \to 0 \quad \mathrm{P}\text{-a.s. in } l^\infty(\mathcal{F}^*) \tag{4.46} \]

for $\mathcal{F}^* = \mathcal{F}_{H,\alpha}$ and for $\mathcal{F}^* = \partial_\alpha \mathcal{F}_{H,\alpha}$.

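Proof. Recall that $\mathbb{P}_n$ and $\Psi_{U_n}$ are Borel measurable in $l^\infty(\mathcal{F}^*)$ according to Lemma 4.12. Consequently, the distribution $\mathcal{L}(\mathbb{P}_n - \Psi_{U_n})$ on $\mathcal{B}(l^\infty(\mathcal{F}^*))$ is well-defined and Corollary 4.2 yields

\[ \mathcal{L}(\mathbb{P}_n - \Psi_{U_n}) = \int_{[0,1]} \mathcal{L}(\mathbb{P}_n - \Psi_{U_n} \mid U_n = u) \, d\mathrm{P}^{U_n}(u) = \int_{[0,1]} \mathcal{L}(\mathbb{P}_{k,\Psi_u} - \Psi_u) \, d\mathrm{P}^{U_n}(u) \tag{4.47} \]

with the empirical measure $\mathbb{P}_{k,\Psi}$ defined in (4.24). As shown in Lemma 4.15, the function class $\mathcal{F}^*$ is universally Glivenko–Cantelli, which yields

\[ \|\mathbb{P}_{k,\Psi_u} - \Psi_u\|_{\mathcal{F}^*} := \sup_{f \in \mathcal{F}^*} |(\mathbb{P}_{k,\Psi_u} - \Psi_u) f| \to 0 \quad \mathrm{P}\text{-a.s.} \]

for $k \to \infty$ uniformly in $\Psi_u$ for $u \in [0, 1]$. This can be equivalently written as

\[ \sup_{u \in [0,1]} \sup_{m \ge k} \|\mathbb{P}_{m,\Psi_u} - \Psi_u\|_{\mathcal{F}^*} \xrightarrow{\mathrm{P}} 0, \quad k \to \infty. \tag{4.48} \]

Applying (4.48) to the mixture representation (4.47), one obtains

\[ \sup_{m \ge n} \|\mathbb{P}_m - \Psi_{U_m}\|_{\mathcal{F}^*} \xrightarrow{\mathrm{P}} 0 \]

for $n \to \infty$, which is equivalent to (4.46). $\square$

Analogously, combining the universal Donsker and pre-Gaussian properties obtained in Lemma 4.15 with the mixture representation (4.10) of the distribution $\mathcal{L}(S_{i(n,1)}, \ldots, S_{i(n,k)})$, one obtains weak convergence of the empirical process $\mathbb{G}_n$ defined in (4.16).

Lemma 4.18. Suppose that $X$ is multivariate regularly varying with tail index $\alpha \in (0, \infty)$ and spectral measure $\Psi$. Further, let $H \subset H_1$ be compact. Then the empirical process $\mathbb{G}_n = \sqrt{k}(\mathbb{P}_n - \Psi_{U_n})$ satisfies

\[ \mathbb{G}_n \xrightarrow{w} G_\Psi \quad \text{in } l^\infty(\mathcal{F}_{H,\alpha}). \tag{4.49} \]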

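Proof. Since $\mathbb{G}_n$ is Borel measurable in $l^\infty(\mathcal{F}_{H,\alpha})$ (cf. Lemma 4.12), weak convergence (4.49) is understood in the classical setting. Thus it suffices to show that

\[ \lim_{n \to \infty} \mathrm{E} h(\mathbb{G}_n) = \mathrm{E} h(G_\Psi) \tag{4.50} \]

for any function $h \in C_b(l^\infty(\mathcal{F}_{H,\alpha}))$. Furthermore, (4.47) implies

\[ \mathcal{L}(\mathbb{G}_n) = \int_{[0,1]} \mathcal{L}(\mathbb{G}_{k,\Psi_u}) \, d\mathrm{P}^{U_n}(u) \tag{4.51} \]

and, as a consequence,

\[ |\mathrm{E} h(\mathbb{G}_n) - \mathrm{E} h(G_\Psi)| = \left| \int_{[0,1]} \mathrm{E} h(\mathbb{G}_{k,\Psi_u}) \, d\mathrm{P}^{U_n}(u) - \mathrm{E} h(G_\Psi) \right| = \left| \int_{[0,1]} \big( \mathrm{E} h(\mathbb{G}_{k,\Psi_u}) - \mathrm{E} h(G_\Psi) \big) \, d\mathrm{P}^{U_n}(u) \right| \le \int_{[0,1]} \big| \mathrm{E} h(\mathbb{G}_{k,\Psi_u}) - \mathrm{E} h(G_\Psi) \big| \, d\mathrm{P}^{U_n}(u). \]

It is well known that condition (4.3) linking $n$ and $k = k(n)$ implies that $U_n = F_R(R_{n:k+1})$ converges to $1$ almost surely. Hence there exists a sequence $v_n \uparrow 1$ such that

\[ \lim_{n \to \infty} \mathrm{P}\{U_n < v_n\} = 0. \tag{4.52} \]

Moreover, it is easy to see that

\[ |\mathrm{E} h(\mathbb{G}_n) - \mathrm{E} h(G_\Psi)| \le \int_{[0,v_n)} \big| \mathrm{E} h(\mathbb{G}_{k,\Psi_u}) - \mathrm{E} h(G_\Psi) \big| \, d\mathrm{P}^{U_n}(u) + \int_{[v_n,1]} \big| \mathrm{E} h(\mathbb{G}_{k,\Psi_u}) - \mathrm{E} h(G_\Psi) \big| \, d\mathrm{P}^{U_n}(u) \le 2 \|h\|_\infty \mathrm{P}\{U_n < v_n\} + \sup_{u \ge v_n} \big| \mathrm{E} h(\mathbb{G}_{k,\Psi_u}) - \mathrm{E} h(G_\Psi) \big|. \]

Hence (4.52) implies

\[ \limsup_{n \to \infty} |\mathrm{E} h(\mathbb{G}_n) - \mathrm{E} h(G_\Psi)| \le \limsup_{n \to \infty} \sup_{u \ge v_n} \big| \mathrm{E} h(\mathbb{G}_{k,\Psi_u}) - \mathrm{E} h(G_\Psi) \big|. \]

Thus, in order to verify (4.50), it suffices to show that for any sequence $u_k \uparrow 1$ there holds

\[ \big| \mathrm{E} h(\mathbb{G}_{k,\Psi_{u_k}}) - \mathrm{E} h(G_\Psi) \big| \to 0. \]

Since $h \in C_b(l^\infty(\mathcal{F}_{H,\alpha}))$ was chosen arbitrarily, this is equivalent to

\[ \mathbb{G}_{k,\Psi_k} \xrightarrow{w} G_\Psi \quad \text{in } l^\infty(\mathcal{F}_{H,\alpha}) \tag{4.53} \]

with $\Psi_k := \Psi_{u_k}$.

Recall that $\Psi_u \xrightarrow{w} \Psi$ for $u \uparrow 1$ (cf. Lemma 4.3) and therefore $\Psi_k \xrightarrow{w} \Psi$. Furthermore, Lemma 4.15 states that the function class $\mathcal{F}_{H,\alpha}$ is universally Donsker and pre-Gaussian and Lemma 4.9 implies that $\mathcal{F}_{H,\alpha}$ is uniformly bounded. Hence, according to Lemma A.15, weak convergence (4.53) holds if the class $\mathcal{F}_{H,\alpha}$ and the sequence $\Psi_k$ satisfy

\[ \sup_{f_1, f_2 \in \mathcal{F}_{H,\alpha}} |\sigma_{\Psi_k}(f_1 - f_2) - \sigma_\Psi(f_1 - f_2)| \to 0, \tag{4.54} \]

where $\sigma_\Psi(f)$ denotes the variance seminorm $\|f - \Psi f\|_{\Psi,2}$:

\[ \sigma_\Psi(f) := \big( \Psi (f - \Psi f)^2 \big)^{1/2} = \big( \Psi f^2 - (\Psi f)^2 \big)^{1/2}. \]

It is easy to see that for any $a, b \ge 0$ there holds $|a - b| \le \sqrt{|a^2 - b^2|}$. Consequently, (4.54) can be verified by

\[ \sup_{f_1, f_2 \in \mathcal{F}_{H,\alpha}} \big| \sigma_{\Psi_k}^2(f_1 - f_2) - \sigma_\Psi^2(f_1 - f_2) \big| \to 0. \tag{4.55} \]

Let $F$ denote the envelope function of $\mathcal{F}_{H,\alpha}$ and consider the function

\[ g := (f_1 - f_2). \]

It is obvious that $\|g\|_\infty \le 2 \|F\|_\infty$ and therefore

\[ \big| \sigma_{\Psi_k}^2(g) - \sigma_\Psi^2(g) \big| = \big| \Psi_k g^2 - (\Psi_k g)^2 - \big( \Psi g^2 - (\Psi g)^2 \big) \big| = \big| \Psi_k g^2 - \Psi g^2 - (\Psi_k g - \Psi g) \cdot (\Psi_k g + \Psi g) \big| \le \big| \Psi_k g^2 - \Psi g^2 \big| + |\Psi_k g - \Psi g| \cdot 4 \|F\|_\infty. \tag{4.56} \]

Hence, due to $(f_1 - f_2)^2 \le 2 f_1^2 + 2 f_2^2$ and $|f_1 - f_2| \le |f_1| + |f_2|$, it suffices to show that for $k \to \infty$

\[ \sup_{f \in \mathcal{F}_{H,\alpha}} \big| \Psi_k f^2 - \Psi f^2 \big| \to 0 \quad \text{and} \quad \sup_{f \in \mathcal{F}_{H,\alpha}} |\Psi_k f - \Psi f| \to 0. \]

It is obvious that $f \in \mathcal{F}_{H,\alpha}$ implies $f^2 \in \mathcal{F}_{H,2\alpha}$. Hence it suffices to verify

\[ \sup_{f \in \mathcal{F}_{H,\alpha}} |\Psi_k f - \Psi f| \to 0, \quad k \to \infty, \tag{4.57} \]

for arbitrary $\alpha \in (0, \infty)$, which is an immediate consequence of $\Psi_k = \Psi_{u_k}$ and Lemma 4.11(b). $\square$

4.4 Proofs of the main results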

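Based upon auxiliary results from Section 4.3, this section provides the pending proofs of the uniform strong consistency and the uniform asymptotic normality stated for the estimator $\hat\gamma_\xi$ in Theorems 4.4 and 4.8.

Proof of Theorem 4.4. Consider the decomposition

\[ \hat\gamma_\xi - \gamma_\xi = \mathbb{P}_n f_{\xi,\hat\alpha} - \Psi f_{\xi,\alpha} = \mathbb{P}_n [f_{\xi,\hat\alpha} - f_{\xi,\alpha}] + (\mathbb{P}_n - \Psi_{U_n}) f_{\xi,\alpha} + (\Psi_{U_n} - \Psi) f_{\xi,\alpha}. \tag{4.58} \]

It is easy to see that Lemma 4.11(b) and the almost sure convergence $U_n \to 1$ imply

\[ \sup_{\xi \in H} |(\Psi_{U_n} - \Psi) f_{\xi,\alpha}| = \|\Psi_{U_n} - \Psi\|_{\mathcal{F}_{H,\alpha}} \to 0 \quad \mathrm{P}\text{-a.s.} \]

Furthermore, Lemma 4.17 yields

\[ \sup_{\xi \in H} |(\mathbb{P}_n - \Psi_{U_n}) f_{\xi,\alpha}| \to 0 \quad \mathrm{P}\text{-a.s.} \]

Hence it suffices to show that

\[ \sup_{\xi \in H} |\mathbb{P}_n [f_{\xi,\hat\alpha} - f_{\xi,\alpha}]| \to 0 \quad \mathrm{P}\text{-a.s.} \tag{4.59} \]

First-order Taylor expansion yields

\[ \mathbb{P}_n [f_{\xi,\hat\alpha} - f_{\xi,\alpha}] = \mathbb{P}_n [\partial_\alpha f_{\xi,\alpha'}] \cdot (\hat\alpha - \alpha) \]

with $\alpha'$ between $\hat\alpha$ and $\alpha$. Moreover, strong consistency of $\hat\alpha$ implies

\[ 1\{\hat\alpha \notin I\} \to 0 \quad \mathrm{P}\text{-a.s.} \tag{4.60} \]

for any compact interval $I \subset (0, \infty)$ with non-empty interior containing $\alpha$. This allows one to reduce (4.59) to

\[ \sup_{\xi \in H} \big| 1\{\hat\alpha \in I\} \cdot \mathbb{P}_n [\partial_\alpha f_{\xi,\alpha'}] \cdot (\hat\alpha - \alpha) \big| \to 0 \quad \mathrm{P}\text{-a.s.} \]

Finally, strong consistency of $\hat\alpha$ and uniform boundedness of the function class $\partial_\alpha \mathcal{F}_{H,I}$ established in Lemma 4.9 complete the proof. $\square$

The proof of Theorem 4.8 is obtained by combining Lemma 4.18, which yields asymptotic normality of $\mathbb{P}_n f_{\xi,\alpha}$ for the true tail index $\alpha$, with the delta method applied to the estimated function $f_{\xi,\hat\alpha}$.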

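Proof of Theorem 4.8. Part (a). Consider the decomposition

\[ \sqrt{k}\,(\hat\gamma_\xi - \gamma_\xi) = \sqrt{k}\,(\mathbb{P}_n f_{\xi,\hat\alpha} - \Psi f_{\xi,\alpha}) = \sqrt{k}\,\big( \mathbb{P}_n [f_{\xi,\hat\alpha} - f_{\xi,\alpha}] + (\mathbb{P}_n - \Psi_{U_n}) f_{\xi,\alpha} + (\Psi_{U_n} - \Psi) f_{\xi,\alpha} \big) = \mathbb{P}_n \big[ \sqrt{k}\,(f_{\xi,\hat\alpha} - f_{\xi,\alpha}) \big] + \mathbb{G}_n f_{\xi,\alpha} + \sqrt{k}\,(\Psi_{U_n} - \Psi) f_{\xi,\alpha}. \tag{4.61} \]

First it should be noted that assumption (4.18) postulates

\[ \sqrt{k}\,(\Psi_{U_n} - \Psi) f_{\xi,\alpha} \xrightarrow{\mathrm{P}} b(\xi) \quad \text{in } l^\infty(H) \]

and Lemma 4.18 implies

\[ \mathbb{G}_n f_{\xi,\alpha} \xrightarrow{w} G_\Psi f_{\xi,\alpha} \quad \text{in } l^\infty(H). \]

Thus it suffices to consider the first term in (4.61). Second-order Taylor expansion yields

\[ \mathbb{P}_n \big[ \sqrt{k}\,(f_{\xi,\hat\alpha} - f_{\xi,\alpha}) \big] = \mathbb{P}_n [\partial_\alpha f_{\xi,\alpha}] \cdot \sqrt{k}\,(\hat\alpha - \alpha) + \frac{1}{2} \mathbb{P}_n \big[ \partial_\alpha^2 f_{\xi,\alpha'} \big] \cdot \sqrt{k}\,(\hat\alpha - \alpha)^2 \tag{4.62} \]

with some $\alpha'$ between $\alpha$ and $\hat\alpha$. It is easy to see that asymptotic normality (4.17) of $\hat\alpha$ implies

\[ \sqrt{k}\,(\hat\alpha - \alpha)^2 \xrightarrow{\mathrm{P}} 0 \quad \text{and} \quad 1\{\hat\alpha \notin I\} \xrightarrow{\mathrm{P}} 0 \]

for any compact interval $I \subset (0, \infty)$ with non-empty interior containing $\alpha$. Furthermore, according to Lemma 4.9, the function class $\partial_\alpha^2 \mathcal{F}_{H,I}$ is uniformly bounded. Hence

\[ \mathbb{P}_n \big[ \partial_\alpha^2 f_{\xi,\alpha'} \big] \cdot \sqrt{k}\,(\hat\alpha - \alpha)^2 = \big( 1\{\hat\alpha \notin I\} + 1\{\hat\alpha \in I\} \big) \cdot \mathbb{P}_n \big[ \partial_\alpha^2 f_{\xi,\alpha'} \big] \cdot \sqrt{k}\,(\hat\alpha - \alpha)^2 = o_{\mathrm{P}}(1) + 1\{\hat\alpha \in I\} \cdot \mathbb{P}_n \big[ \partial_\alpha^2 f_{\xi,\alpha'} \big] \cdot o_{\mathrm{P}}(1) = o_{\mathrm{P}}(1) \]

uniformly in $\xi \in H$. Moreover, Lemmas 4.17 and 4.11(b) imply

\[ \mathbb{P}_n [\partial_\alpha f_{\xi,\alpha}] = (\mathbb{P}_n - \Psi_{U_n}) [\partial_\alpha f_{\xi,\alpha}] + \Psi_{U_n} [\partial_\alpha f_{\xi,\alpha}] \to \Psi [\partial_\alpha f_{\xi,\alpha}] \quad \mathrm{P}\text{-a.s.} \]

uniformly in $\xi \in H$. Hence asymptotic normality (4.17) of $\hat\alpha$ yields

\[ \mathbb{P}_n [\partial_\alpha f_{\xi,\alpha}] \cdot \sqrt{k}\,(\hat\alpha - \alpha) \xrightarrow{w} \Psi [\partial_\alpha f_{\xi,\alpha}] \cdot Y \]

with a Gaussian random variable $Y \sim N(\mu_\alpha, \sigma_\alpha^2)$. Finally, Condition 4.6, postulating asymptotic independence of the random variable $Y_n := \sqrt{k}(\hat\alpha - \alpha)$ and the empirical process $\mathbb{G}_n$, yields the result (4.19).

Part (b). This result is merely the finite-dimensional version of part (a). It is easy to see that replacing the assumption (4.18) by (4.20) affects only the last term in (4.61) and results in an exchange of the uniform convergence to $b(\xi)$ for a pointwise version. Hence the pointwise asymptotic normality (4.21) of $\hat\gamma_\xi$ follows immediately along the lines of the proof of part (a). $\square$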
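The limiting mean vector and covariance matrix of Theorem 4.8(b) can be evaluated explicitly when the spectral measure is discrete. The following sketch assumes a spectral measure given by finitely many atoms with weights, takes $f_{\xi,\alpha}(s) = (\xi^\top s)_+^\alpha$, and treats the bias terms $b(\xi_i)$, the mean $\mu_\alpha$, and the variance $\sigma_\alpha^2$ as user-supplied inputs. All function names and the two-atom example are hypothetical.

```python
import math


def f(xi, s, alpha):
    """f_{xi,alpha}(s) = (xi^T s)_+^alpha."""
    return max(sum(w * c for w, c in zip(xi, s)), 0.0) ** alpha


def df(xi, s, alpha):
    """Derivative of f in alpha: (xi^T s)_+^alpha * log((xi^T s)_+), 0 if xi^T s <= 0."""
    p = sum(w * c for w, c in zip(xi, s))
    return 0.0 if p <= 0 else p ** alpha * math.log(p)


def integrate(fn, atoms, weights):
    """Integral of fn with respect to the discrete measure sum_j weights[j] * delta_{atoms[j]}."""
    return sum(w * fn(s) for s, w in zip(atoms, weights))


def limit_moments(xis, atoms, weights, alpha, mu_alpha, sigma2_alpha, bias=None):
    """Mean vector M and covariance matrix C as in the statement of Theorem 4.8(b)."""
    p = len(xis)
    bias = [0.0] * p if bias is None else bias
    Pf = [integrate(lambda s, x=x: f(x, s, alpha), atoms, weights) for x in xis]
    Pdf = [integrate(lambda s, x=x: df(x, s, alpha), atoms, weights) for x in xis]
    M = [bias[i] + mu_alpha * Pdf[i] for i in range(p)]
    C = [
        [
            integrate(lambda s, a=xis[i], b=xis[j]: f(a, s, alpha) * f(b, s, alpha), atoms, weights)
            - Pf[i] * Pf[j]
            + sigma2_alpha * Pdf[i] * Pdf[j]
            for j in range(p)
        ]
        for i in range(p)
    ]
    return M, C


if __name__ == "__main__":
    atoms, weights = [(1.0, 0.0), (0.0, 1.0)], [0.5, 0.5]
    xis = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]
    print(limit_moments(xis, atoms, weights, alpha=2.0, mu_alpha=0.0, sigma2_alpha=4.0))
```

In the two-atom example the equally weighted portfolio has no Brownian-bridge variance, because $f_{\xi,\alpha}$ is constant on the support of the measure; its limiting variance comes entirely from the term involving $\sigma_\alpha^2$, i.e. from the estimation of $\alpha$.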

4.5 Examples and comments

This section concludes the present chapter with remarks and examples on the second-order conditions underlying Theorem 4.8. Lemma 4.19 and Example 4.20 illustrate the second-order condition (4.18) for the angular parts, whereas Lemma 4.21 and Corollary 4.22 are dedicated to Condition 4.6, which postulates asymptotic independence of the estimates obtained from the radial and the angular parts.

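Lemma 4.19. Let $X$ be multivariate regularly varying with tail index $\alpha \in (0, \infty)$ and spectral measure $\Psi$. Further, let $H \subset H_1$ be compact and assume that the distribution function $F_R$ of the radial part is continuous. Suppose that
$$\sqrt{k}\,\left(\Psi_{1-k/n} - \Psi\right) f_{\xi,\alpha} \to b(\xi) \quad \text{in } l^\infty(H) \tag{4.63}$$
and that the mapping
$$u \mapsto \Psi'_u f_{\xi,\alpha}, \qquad \Psi'_u := \mathcal{L}(S \mid F_R(R) = u),$$
is continuous in $u \in (0, 1]$ for any $\xi \in H$. Then
$$\sqrt{k}\,(\Psi_{U_n} - \Psi) f_{\xi,\alpha} \xrightarrow{P} b(\xi) \quad \text{in } l^\infty(H). \tag{4.64}$$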

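Proof. Due to (4.63) it suffices to show
$$\left(\Psi_{U_n} - \Psi_{1-k/n}\right) f_{\xi,\alpha} = o_P\left(1/\sqrt{k}\right) \tag{4.65}$$
uniformly in $\xi \in H$. It is easy to see that continuity of the mapping $u \mapsto \Psi'_u f_{\xi,\alpha}$ and the representation (4.36) of $\Psi_u f_{\xi,\alpha}$ imply differentiability of the mapping $u \mapsto \Psi_u f_{\xi,\alpha}$:
$$\frac{\partial}{\partial u} \Psi_u f_{\xi,\alpha} = \frac{\partial}{\partial u} \left( \frac{1}{1-u} \int_{(u,1)} \Psi'_v f_{\xi,\alpha} \, dv \right) = \frac{1}{(1-u)^2} \int_{(u,1)} \Psi'_v f_{\xi,\alpha} \, dv - \frac{1}{1-u} \Psi'_u f_{\xi,\alpha} = \frac{1}{1-u} \left(\Psi_u - \Psi'_u\right) f_{\xi,\alpha}. \tag{4.66}$$
Hence
$$\left(\Psi_{U_n} - \Psi_{1-k/n}\right) f_{\xi,\alpha} = \frac{1}{1-u^*} \left(\Psi_{u^*} - \Psi'_{u^*}\right) f_{\xi,\alpha} \cdot \left(U_n - (1 - k/n)\right) \tag{4.67}$$
with $u^*$ between $(1 - k/n)$ and $U_n$. It is well known (cf. Smirnov, 1949) that the random variable $U_n = F_R(R_{n:k+1})$ satisfies
$$\frac{U_n - (1 - k/n)}{\sqrt{k}/n} \xrightarrow{w} \mathcal{N}(0, 1). \tag{4.68}$$
As a result one obtains
$$U_n - (1 - k/n) = O_P\left(\sqrt{k}/n\right)$$
and therefore
$$1 - u^* = k/n + O_P\left(\sqrt{k}/n\right).$$
Consequently, (4.67) yields
$$\left(\Psi_{U_n} - \Psi_{1-k/n}\right) f_{\xi,\alpha} = \frac{O_P\left(\sqrt{k}/n\right)}{k/n + o_P(k/n)} \cdot \left(\Psi_{u^*} - \Psi'_{u^*}\right) f_{\xi,\alpha} = O_P\left(1/\sqrt{k}\right) \cdot \left(\Psi_{u^*} - \Psi'_{u^*}\right) f_{\xi,\alpha}.$$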

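Hence it suffices to show that
$$\sup_{\xi \in H} \left| \left(\Psi_{u^*} - \Psi'_{u^*}\right) f_{\xi,\alpha} \right| = o_P(1). \tag{4.69}$$
It is easy to see that
$$\sup_{\xi \in H} \left| \left(\Psi_{u^*} - \Psi'_{u^*}\right) f_{\xi,\alpha} \right| = \sup_{\xi \in H} \left| \frac{1}{1-u^*} \int_{(u^*,1)} \left(\Psi'_v - \Psi'_{u^*}\right) f_{\xi,\alpha} \, dv \right| \le \sup_{v \in [u^*,1],\, \xi \in H} \left| \left(\Psi'_v - \Psi'_{u^*}\right) f_{\xi,\alpha} \right|. \tag{4.70}$$
Recall that the mapping $u \mapsto \Psi'_u f_{\xi,\alpha}$ is supposed to be continuous on $(0, 1]$ for $\xi \in H$. Furthermore, the mapping $\xi \mapsto \Psi'_u f_{\xi,\alpha}$ is continuous on $H$ for $u \in (0, 1]$ since each $\Psi'_u$ is a probability measure and $\xi \mapsto f_{\xi,\alpha}$ is continuous in $l^\infty(\mathbb{S}_1^d)$ according to Lemma 4.10. Hence the mapping $(u, \xi) \mapsto \Psi'_u f_{\xi,\alpha}$ is continuous on $(0, 1] \times H$ and therefore uniformly continuous on $[\varepsilon, 1] \times H$ for any $\varepsilon > 0$. This implies
$$\sup_{v \in [u,1],\, \xi \in H} \left| \left(\Psi'_v - \Psi'_u\right) f_{\xi,\alpha} \right| \to 0, \quad u \uparrow 1. \tag{4.71}$$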

Finally, since $u^*$ is always chosen between $(1 - k/n)$ and $U_n$, one easily obtains $u^* \xrightarrow{P} 1$. Thus (4.69) is a consequence of (4.70) and (4.71). □

The following example illustrates Lemma 4.19 and shows that the bias term $b = b(\xi)$ in the angular second-order condition (4.18) depends on the choice of the extreme subsample size $k = k(n)$.

Example 4.20. Consider a multivariate regularly varying distribution with conditional angular distribution $\Psi'_u := \mathcal{L}(S \mid F_R(R) = u)$ given by
$$\Psi'_u := u \Psi'_1 + (1 - u) \Psi'_0,$$
where $\Psi'_1$ and $\Psi'_0$ are arbitrary probability measures on $\mathcal{B}(\mathbb{S}_1^d)$. Given the continuity of the radial distribution $F_R$, the conditional angular distribution $\Psi_u := \mathcal{L}(S \mid F_R(R) > u)$ is equal to
$$\Psi_u = \frac{1}{1-u} \int_{(u,1)} \Psi'_v \, dv = \Psi'_1 \frac{1}{1-u} \int_{(u,1)} v \, dv + \Psi'_0 \frac{1}{1-u} \int_{(u,1)} (1 - v) \, dv = \frac{1+u}{2} \Psi'_1 + \frac{1-u}{2} \Psi'_0 = \frac{1}{2} \left(\Psi'_1 + \Psi'_0\right) + \frac{u}{2} \left(\Psi'_1 - \Psi'_0\right). \tag{4.72}$$
In particular, (4.72) yields that the spectral measure $\Psi = \Psi_1$ is equal to $\Psi'_1$. Another consequence of (4.72) is
$$\Psi_{1-k/n} - \Psi = \Psi_{1-k/n} - \Psi_1 = \frac{1 - k/n}{2} \left(\Psi'_1 - \Psi'_0\right) - \frac{1}{2} \left(\Psi'_1 - \Psi'_0\right) = \frac{k}{2n} \left(\Psi'_0 - \Psi'_1\right).$$

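Hence condition (4.63) is equivalent to
$$\frac{k^{3/2}}{2n} \left(\Psi'_0 - \Psi'_1\right) f_{\xi,\alpha} \to b(\xi) \quad \text{in } l^\infty(H).$$
Consequently, (4.63) is satisfied if $k^{3/2}/(2n) \to \lambda \in [0, \infty)$, and the asymptotic bias term $b(\xi)$ appearing in Theorem 4.8 is then given by
$$b(\xi) = \lambda \cdot \left(\Psi'_0 - \Psi'_1\right) f_{\xi,\alpha}.$$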

In particular, b(ξ) is non-zero for λ > 0.
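The scaling in Example 4.20 can be checked numerically. The following is a minimal sketch, not part of the original text: it replaces the integrals $\Psi'_1 f_{\xi,\alpha}$ and $\Psi'_0 f_{\xi,\alpha}$ by two hypothetical scalar values and verifies the identity $\sqrt{k}\,(\Psi_{1-k/n} - \Psi) f_{\xi,\alpha} = \frac{k^{3/2}}{2n} (\Psi'_0 - \Psi'_1) f_{\xi,\alpha}$ obtained from (4.72).

```python
import math


def psi_u(u, psi1_f, psi0_f):
    """Psi_u f for the mixture model of Example 4.20, cf. (4.72).

    psi1_f and psi0_f stand for the numbers Psi'_1 f and Psi'_0 f
    (hypothetical inputs chosen by the caller).
    """
    return 0.5 * (1.0 + u) * psi1_f + 0.5 * (1.0 - u) * psi0_f


def scaled_bias(n, k, psi1_f, psi0_f):
    """sqrt(k) * (Psi_{1-k/n} - Psi) f with Psi = Psi_1."""
    diff = psi_u(1.0 - k / n, psi1_f, psi0_f) - psi_u(1.0, psi1_f, psi0_f)
    return math.sqrt(k) * diff


def closed_form(n, k, psi1_f, psi0_f):
    """k^(3/2) / (2n) * (Psi'_0 - Psi'_1) f."""
    return k ** 1.5 / (2.0 * n) * (psi0_f - psi1_f)


if __name__ == "__main__":
    a, c = 0.3, 0.7  # hypothetical values of Psi'_1 f and Psi'_0 f
    for n in (10 ** 3, 10 ** 5, 10 ** 7):
        k = n ** (2.0 / 3.0)  # keeps k^(3/2) / n constant
        print(n, scaled_bias(n, k, a, c), closed_form(n, k, a, c))
```

With $k$ growing like $n^{2/3}$ the scaled bias stays constant, whereas a slower growth of $k$ would drive it to zero.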

Another point that is worth a remark is the asymptotic independence of the normalized estimation error $Y_n = \sqrt{k}\,(\hat\alpha - \alpha)$ and the empirical process $\mathbb{G}_n$ stated in Condition 4.6. As already highlighted in Remark 4.7, this condition is rather natural in the framework of multivariate regular variation and is automatically satisfied by the Hill estimator. The rest of this section provides a proof for this assertion.

The Hill estimator (cf. Hill, 1975), defined as
$$\hat\alpha_H := \left( \frac{1}{k} \sum_{i=1}^{k} \log\left(R_{n:i}/R_{n:k+1}\right) \right)^{-1}, \tag{4.73}$$
is one of the earliest and most popular estimators for the tail index $\alpha$ of a heavy-tailed distribution. Denoting $\tilde R_{i(n,j)} := R_{i(n,j)}/R_{n:k+1}$, one obtains the representation
$$\hat\alpha_H^{-1} = \frac{1}{k} \sum_{j=1}^{k} \log \tilde R_{i(n,j)}. \tag{4.74}$$
Hence the tuple $(\hat\alpha_H^{-1}, \mathbb{P}_n f_{\xi,\alpha})$ can be written as
$$\left(\hat\alpha_H^{-1}, \mathbb{P}_n f_{\xi,\alpha}\right) = \left(\tilde{\mathbb{P}}_n \tilde l, \tilde{\mathbb{P}}_n \tilde f_{\xi,\alpha}\right) \tag{4.75}$$
with the empirical measure $\tilde{\mathbb{P}}_n$ defined by
$$\tilde{\mathbb{P}}_n := \frac{1}{k} \sum_{j=1}^{k} \delta_{(\tilde R_{i(n,j)},\, S_{i(n,j)})}$$
and the functional indices $\tilde l$, $\tilde f_{\xi,\alpha}$ defined by
$$\tilde l(r, s) := \log(r) \quad \text{and} \quad \tilde f_{\xi,\alpha}(r, s) := f_{\xi,\alpha}(s).$$
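The Hill estimator (4.73) translates directly into code. The sketch below is illustrative only: it assumes that $R_{n:1} \ge R_{n:2} \ge \dots$ denote the decreasing order statistics of the radii, and the helper `pareto_sample` is a hypothetical generator used for a sanity check, not part of the thesis.

```python
import math
import random


def hill_estimator(radii, k):
    """Hill estimate of the tail index from the k largest observations.

    Assumption: R_{n:1} >= R_{n:2} >= ... are the decreasing order
    statistics, so R_{n:k+1} is the (k+1)-th largest radius.
    """
    if not 1 <= k < len(radii):
        raise ValueError("k must satisfy 1 <= k < n")
    ordered = sorted(radii, reverse=True)
    threshold = ordered[k]  # R_{n:k+1}
    if threshold <= 0:
        raise ValueError("threshold R_{n:k+1} must be positive")
    # Mean of log(R_{n:i} / R_{n:k+1}) over i = 1, ..., k, cf. (4.74).
    mean_log_excess = sum(math.log(r / threshold) for r in ordered[:k]) / k
    return 1.0 / mean_log_excess


def pareto_sample(n, alpha, rng):
    """Exact Pareto(alpha) sample on [1, inf) via inverse transform."""
    return [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    sample = pareto_sample(20000, 2.0, rng)
    print(hill_estimator(sample, 2000))  # should be close to alpha = 2
```

Because only the ratios $R_{n:i}/R_{n:k+1}$ enter, the estimate is invariant under rescaling of the radii, which is the reason why the rescaled radii $\tilde R_{i(n,j)}$ suffice in (4.74).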

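Recall that $R_{n:k+1} = F_R^{\leftarrow}(U_n)$ for continuous $F_R$. Consequently, Lemma 4.1 yields
$$\mathcal{L}\left( \left(\tilde R_{i(n,1)}, S_{i(n,1)}\right), \ldots, \left(\tilde R_{i(n,k)}, S_{i(n,k)}\right) \,\middle|\, U_n = u \right) = \otimes_{i=1}^{k} \tilde P_u,$$
where
$$\tilde P_u := \mathcal{L}\left( R/F_R^{\leftarrow}(u), S \mid F_R(R) > u \right).$$
The representation (4.75) shows that the asymptotic independence of the normalized estimation error $Y_n := \sqrt{k}\,(\hat\alpha_H - \alpha)$ and the empirical process $\mathbb{G}_n$ assumed in Condition 4.6 is intimately related to the asymptotic behaviour of the empirical process
$$\tilde{\mathbb{G}}_n := \sqrt{k}\,\left(\tilde{\mathbb{P}}_n - \tilde P_{U_n}\right) \tag{4.76}$$
with the random centring $\tilde P_{U_n}(\omega) := \tilde P_{U_n(\omega)}$ and functional index $f$ being an element of the function class $\tilde{\mathcal{F}}_{H,\alpha} \cup \{\tilde l\}$ with $\tilde{\mathcal{F}}_{H,\alpha}$ defined as
$$\tilde{\mathcal{F}}_{H,\alpha} := \left\{ \tilde f_{\xi,\alpha} : \xi \in H \right\}, \quad \alpha \in (0, \infty).$$
The following lemma states weak convergence of the empirical process $\tilde{\mathbb{G}}_n$ to a Gaussian process.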

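Lemma 4.21. Suppose that $X$ is multivariate regularly varying with tail index $\alpha \in (0, \infty)$ and spectral measure $\Psi$. Further, let $H \subset H_1$ be compact and assume that the distribution function $F_R$ of the radial part is continuous. Then the empirical process $\tilde{\mathbb{G}}_n$ defined in (4.76) satisfies
$$\tilde{\mathbb{G}}_n \xrightarrow{w} \mathbb{G}_{\rho_\alpha \otimes \Psi} \quad \text{in } l^\infty\left( \tilde{\mathcal{F}}_{H,\alpha} \cup \{\tilde l\} \right). \tag{4.77}$$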

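Proof. It is easy to see that the multivariate regular variation condition (2.11) for $X$ yields
$$\tilde P_u \xrightarrow{w} \rho_\alpha \otimes \Psi, \quad u \uparrow 1.$$
Thus the weak convergence (4.77) can be considered as an extension of Lemma 4.18 and proven by adapting the proof of Lemma 4.18 to $\tilde{\mathbb{G}}_n$ and $\tilde{\mathcal{F}}_{H,\alpha} \cup \{\tilde l\}$. It is easy to see that an envelope function for the function class $\tilde{\mathcal{F}}_{H,\alpha} \cup \{\tilde l\}$ is given by
$$\tilde F(r, s) := \max\left(\log(r), F(s)\right),$$
where $F$ is an envelope function of the function class $\mathcal{F}_{H,\alpha}$. Since $F$ is bounded, the integrability of $\tilde F$ and $\tilde F^2$ depends only on the integrability of $\tilde l$ and $\tilde l^2$. Denoting the projection on the first component by $\pi_1$,
$$\pi_1(r, s) := r,$$
one obtains
$$\tilde P_u \tilde l = \int_{(1,\infty)} \log(r) \, d\tilde P_u^{\pi_1}(r) = \int_{(1,\infty)} \int_{(1,\infty)} 1\{v < r\} \, v^{-1} \, dv \, d\tilde P_u^{\pi_1}(r) = \int_{(1,\infty)} v^{-1} \int_{(1,\infty)} 1\{v < r\} \, d\tilde P_u^{\pi_1}(r) \, dv = \int_{(1,\infty)} v^{-1} \cdot P\left\{ R/F_R^{\leftarrow}(u) > v \mid R > F_R^{\leftarrow}(u) \right\} dv = \int_{(1,\infty)} v^{-1} \cdot \frac{P\{R > vt\}}{P\{R > t\}} \, dv \tag{4.78}$$
with $t = t(u) := F_R^{\leftarrow}(u)$. Thus regular variation of the random variable $R$ and the generalized Karamata Theorem stated in Lemma A.2 yield $\tilde P_u \tilde l < \infty$ for $u \in (0, 1)$ and
$$\lim_{u \uparrow 1} \tilde P_u \tilde l = \lim_{t \to \infty} \int_{(1,\infty)} v^{-1} \cdot \frac{P\{R > vt\}}{P\{R > t\}} \, dv = \int_{(1,\infty)} v^{-1} \cdot v^{-\alpha} \, dv = \frac{1}{\alpha}. \tag{4.79}$$
Analogously to (4.78) one obtains
$$\tilde P_u \tilde l^2 = \int_{(1,\infty)} 2 v^{-1} \log(v) \cdot \frac{P\{R > vt\}}{P\{R > t\}} \, dv$$
and, by Lemma A.2, $\tilde P_u \tilde l^2 < \infty$ for $u \in (0, 1)$ and
$$\lim_{u \uparrow 1} \tilde P_u \tilde l^2 = \lim_{t \to \infty} \int_{(1,\infty)} 2 v^{-1} \log(v) \cdot \frac{P\{R > vt\}}{P\{R > t\}} \, dv = \int_{(1,\infty)} 2 v^{-1} \log(v) \cdot v^{-\alpha} \, dv = \frac{2}{\alpha^2}. \tag{4.80}$$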

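The integrability of $\tilde F^2$ with respect to $\tilde P_u$ obtained in (4.80) allows one to extend the uniform Donsker and pre-Gaussian properties established for the function class $\mathcal{F}_{H,\alpha}$ in Lemma 4.15 to the function class $\tilde{\mathcal{F}}_{H,\alpha} \cup \{\tilde l\}$. Furthermore, (4.80) allows one to extend Lemma 4.18 to $\tilde{\mathbb{G}}_n$ and $\tilde{\mathcal{F}}_{H,\alpha} \cup \{\tilde l\}$ by verifying condition (4.55), i.e., it suffices to show that
$$\sup_{f_1, f_2 \in \tilde{\mathcal{F}}_{H,\alpha} \cup \{\tilde l\}} \left| \sigma^2_{\tilde P_k}(f_1 - f_2) - \sigma^2_{\rho_\alpha \otimes \Psi}(f_1 - f_2) \right| \to 0 \tag{4.81}$$
with $\tilde P_k := \tilde P_{u_k}$ for an arbitrary sequence $u_k \uparrow 1$. It is easy to see that (4.55) is equivalent to
$$\sup_{f_1, f_2 \in \tilde{\mathcal{F}}_{H,\alpha}} \left| \sigma^2_{\tilde P_k}(f_1 - f_2) - \sigma^2_{\rho_\alpha \otimes \Psi}(f_1 - f_2) \right| \to 0,$$
which allows one to reduce (4.81) to
$$\sup_{f \in \tilde{\mathcal{F}}_{H,\alpha}} \left| \sigma^2_{\tilde P_k}\left(\tilde l - f\right) - \sigma^2_{\rho_\alpha \otimes \Psi}\left(\tilde l - f\right) \right| \to 0. \tag{4.82}$$
Denoting $g := (\tilde l - f)$, one obtains analogously to (4.56)
$$\left| \sigma^2_{\tilde P_k}(g) - \sigma^2_{\rho_\alpha \otimes \Psi}(g) \right| \le \left| \left(\tilde P_k - \rho_\alpha \otimes \Psi\right) g^2 \right| + \left| \left(\tilde P_k - \rho_\alpha \otimes \Psi\right) g \right| \cdot \left| \left(\tilde P_k + \rho_\alpha \otimes \Psi\right) g \right|.$$
Hence, due to $(\tilde l - f)^2 \le 2 \tilde l^2 + 2 f^2$, $|\tilde l - f| \le |\tilde l| + |f|$, and (4.57), condition (4.82) can be reduced to
$$\left(\tilde P_k - \rho_\alpha \otimes \Psi\right) \tilde l^2 \to 0 \quad \text{and} \quad \left(\tilde P_k - \rho_\alpha \otimes \Psi\right) \tilde l \to 0$$
for $k \to \infty$, which is an immediate consequence of (4.79) and (4.80). □

The final result of this section is obtained by combination of Lemma 4.21 with the delta method.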
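The limits (4.79) and (4.80) used in the proof of Lemma 4.21 can be verified numerically in the exact Pareto case, where $P\{R > vt\}/P\{R > t\} = v^{-\alpha}$ holds for all $t \ge 1$. The following sketch is an illustration under that assumption; it substitutes $v = e^x$ and applies a truncated midpoint rule, so the cutoff `upper` and the step count are hypothetical tuning choices.

```python
import math


def tail_integral(alpha, weight, upper=60.0, steps=200_000):
    """Midpoint rule for int_1^inf weight(log v) * v^(-1) * v^(-alpha) dv.

    After the substitution v = exp(x) the integrand becomes
    weight(x) * exp(-alpha * x) on (0, inf), truncated here at `upper`.
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += weight(x) * math.exp(-alpha * x)
    return total * h


def first_moment(alpha):
    """Limit of the first moment of log-excesses, cf. (4.79)."""
    return tail_integral(alpha, lambda x: 1.0)


def second_moment(alpha):
    """Limit of the second moment of log-excesses, cf. (4.80)."""
    return tail_integral(alpha, lambda x: 2.0 * x)


if __name__ == "__main__":
    alpha = 2.0
    m1, m2 = first_moment(alpha), second_moment(alpha)
    print(m1, m2, m2 - m1 ** 2)  # compare with 1/alpha, 2/alpha^2, 1/alpha^2
```

The difference of the two limits, $2/\alpha^2 - (1/\alpha)^2 = \alpha^{-2}$, is the variance of the limiting log-excess distribution.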

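Corollary 4.22. Suppose that the conditions of Lemma 4.21 are satisfied and that
$$\sqrt{k}\,\left( \tilde P_{U_n} \tilde l - \alpha^{-1} \right) \xrightarrow{P} b \in \mathbb{R}. \tag{4.83}$$
Then $\hat\alpha_H$ is asymptotically normal and satisfies Condition 4.6, i.e., the random variable $Y_n := \sqrt{k}\,(\hat\alpha_H - \alpha)$ is asymptotically independent from $\mathbb{G}_n$.

Remark 4.23. It is well known that condition (4.83) can be ensured by strengthening the regular variation of the radial part $R$ by a second-order condition and imposing an additional regularity condition on the sequence $k = k(n)$. See Section 6.2 for further details.

Proof of Corollary 4.22. Denote $\vartheta := \alpha^{-1}$ and $\hat\vartheta_H := \hat\alpha_H^{-1}$. Then second-order Taylor expansion yields
$$Y_n = \sqrt{k}\,\left( \hat\vartheta_H^{-1} - \vartheta^{-1} \right) = -\vartheta^{-2} \sqrt{k}\,\left( \hat\vartheta_H - \vartheta \right) + \vartheta^{*-3} \sqrt{k}\,\left( \hat\vartheta_H - \vartheta \right)^2$$
for some $\vartheta^*$ between $\hat\vartheta_H$ and $\vartheta$. Hence, due to $\hat\vartheta_H = \tilde{\mathbb{P}}_n \tilde l$, one obtains
$$Y_n = -\vartheta^{-2} \sqrt{k}\,\left( \tilde{\mathbb{P}}_n \tilde l - \vartheta \right) + \vartheta^{*-3} \sqrt{k}\,\left( \tilde{\mathbb{P}}_n \tilde l - \vartheta \right)^2. \tag{4.84}$$
Furthermore, assumption (4.83) and Lemma 4.21 imply that
$$\sqrt{k}\,\left( \tilde{\mathbb{P}}_n \tilde l - \vartheta \right) = \tilde{\mathbb{G}}_n \tilde l + \sqrt{k}\,\left( \tilde P_{U_n} \tilde l - \vartheta \right) \xrightarrow{w} \mathcal{N}\left(b, \sigma_\vartheta^2\right) \tag{4.85}$$
with
$$\sigma_\vartheta^2 := (\rho_\alpha \otimes \Psi)\, \tilde l^2 - \left( (\rho_\alpha \otimes \Psi)\, \tilde l \right)^2 = \alpha^{-2}.$$
This yields $\vartheta^* \xrightarrow{P} \vartheta$ and
$$\vartheta^{*-3} \sqrt{k}\,\left( \tilde{\mathbb{P}}_n \tilde l - \vartheta \right)^2 = o_P(1). \tag{4.86}$$
Applying (4.85) and (4.86) to (4.84), one obtains asymptotic normality of $\hat\alpha_H$:
$$Y_n = \sqrt{k}\,(\hat\alpha_H - \alpha) \xrightarrow{w} Y \sim \mathcal{N}\left(-b\alpha^2, \alpha^2\right). \tag{4.87}$$
Now consider asymptotic independence of $Y_n$ and $\mathbb{G}_n$. Since $Y_n \xrightarrow{w} Y$ and $\mathbb{G}_n \xrightarrow{w} \mathbb{G}_\Psi$, asymptotic independence of $Y_n$ and $\mathbb{G}_n$ is equivalent to the joint convergence
$$(Y_n, \mathbb{G}_n) \xrightarrow{w} (Y, \mathbb{G}_\Psi) \tag{4.88}$$
with independent $Y$ and $\mathbb{G}_\Psi$. Due to (4.84), (4.85), and $\mathbb{G}_n f_{\xi,\alpha} = \tilde{\mathbb{G}}_n \tilde f_{\xi,\alpha}$, joint convergence (4.88) and independence of $Y$ and $\mathbb{G}_\Psi$ are immediate consequences of Lemma 4.21 and the product structure in the "time" $\rho_\alpha \otimes \Psi$ of the Brownian bridge $\mathbb{G}_{\rho_\alpha \otimes \Psi}$. □

Chapter 5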

Stochastic order relations

This chapter is dedicated to the ordering of multivariate probability distributions with respect to extreme portfolio losses. Following the concept of stochastic order relation, a new notion suitable for asymptotic ordering of portfolio losses is introduced and characterized in the framework of multivariate regular variation. The discussion of diversification effects started in Section 3.3 is continued, with particular interest paid to ordering of spectral measures and inversion of diversification effects in infinite-mean models. A brief introduction to stochastic order relations is given in Section 5.1. The definition and some basic properties of the asymptotic portfolio loss order are stated in Section 5.2. Characterization of the asymptotic portfolio loss order for multivariate regularly varying models is addressed in Section 5.3, whereas sufficient criteria in terms of some well-known stochastic order relations are derived in Section 5.4. Finally, a series of examples is collected in Section 5.5.

5.1 Introduction

Definition 5.1. (a) A binary relation $\preceq$ on a set $\mathcal{X}$ is called order, if $\preceq$ exhibits the following properties:

• reflexivity: $x \preceq x$ for all $x \in \mathcal{X}$;
• transitivity: $x \preceq y$ and $y \preceq z$ implies $x \preceq z$;
• antisymmetry: $x \preceq y$ and $y \preceq x$ implies $x = y$.

(b) A binary relation $\preceq$ on a set $\mathcal{X}$ is called preorder, if it is reflexive and transitive.


In the sequel, stochastic order relations will be understood as orders or preorders of random variables on a given measurable space $(\mathcal{X}, \mathcal{A})$. Typically, the sample space $\mathcal{X}$ will be either $\mathbb{R}^d$ or $\mathbb{R}^d_+$ and $\mathcal{A}$ will be the corresponding Borel σ-field. Furthermore, if appropriate, the notion of stochastic ordering will extend to the underlying probability distributions. Thus, if $X \preceq Y$ depends only on the underlying probability distributions $P^X$ and $P^Y$, this order relation can also be equivalently written as $P^X \preceq P^Y$.

The most basic example of a stochastic order relation is the usual stochastic order $\preceq_{\mathrm{st}}$ for univariate random variables. A random variable $X$ is called lower than $Y$ in usual stochastic order, $X \preceq_{\mathrm{st}} Y$, if
$$\forall t \in \mathbb{R} \quad P\{X > t\} \le P\{Y > t\}. \tag{5.1}$$
Obviously, $\preceq_{\mathrm{st}}$ is a distribution property and therefore $X \preceq_{\mathrm{st}} Y$ is equivalent to $P^X \preceq_{\mathrm{st}} P^Y$.

Besides the univariate case, there are various ordering notions for random vectors that reflect specific characteristics of dependence between their components. The following definition introduces the supermodular order $\preceq_{\mathrm{sm}}$. Being consistent with correlation and some other dependence characteristics, supermodular ordering plays an important role in diverse applications (cf. Marshall and Olkin, 1979; Bäuerle, 1997; Müller and Stoyan, 2002; Bergenthum and Rüschendorf, 2007; Embrechts et al., 2009b). It will be shown in Section 5.4 that $\preceq_{\mathrm{sm}}$ has particular consequences for the ordering of asymptotic portfolio losses.

Definition 5.2. (a) A function $f : \mathbb{R}^d \to \mathbb{R}$ is called supermodular, if for all $x, y \in \mathbb{R}^d$
$$f(x \wedge y) + f(x \vee y) \ge f(x) + f(y). \tag{5.2}$$

(b) Let $X$ and $Y$ be random vectors in $\mathbb{R}^d$. Then $X$ is said to be smaller than $Y$ in supermodular order, $X \preceq_{\mathrm{sm}} Y$, if $Ef(X) \le Ef(Y)$ for all supermodular functions $f : \mathbb{R}^d \to \mathbb{R}$ such that the expectations exist.

Remark 5.3. (a) Alternatively to (5.2), supermodular functions can be defined in terms of difference operators
$$\Delta_i^\varepsilon f(x) := f(x + \varepsilon e_i) - f(x),$$
where $\varepsilon > 0$, $i = 1, \ldots, d$, and $e_i$ denotes the $i$-th unit vector in $\mathbb{R}^d$. As shown by Kemperman (1977), condition (5.2) is equivalent to
$$\Delta_i^\varepsilon \Delta_j^\delta f(x) \ge 0 \tag{5.3}$$
for all $x \in \mathbb{R}^d$, all $\varepsilon, \delta > 0$, and all $i, j \in \{1, \ldots, d\}$ such that $i < j$.

(b) From (5.2) it is easy to see that the function $f(-x)$ is supermodular if and only if $f(x)$ is supermodular. Consequently, $X \preceq_{\mathrm{sm}} Y$ is equivalent to $-X \preceq_{\mathrm{sm}} -Y$.

(c) It is well known that $\preceq_{\mathrm{sm}}$ is invariant under non-decreasing component transformations (cf. Müller and Stoyan, 2002). Thus $X \preceq_{\mathrm{sm}} Y$ is equivalent to $C_X \preceq_{\mathrm{sm}} C_Y$ with $C_X$ and $C_Y$ denoting the copulas of $X$ and $Y$.

(d) As an immediate consequence of (5.3), any function $f(x)$ depending on only one component $x^{(i)}$ of $x \in \mathbb{R}^d$ is supermodular. Thus $X \preceq_{\mathrm{sm}} Y$ implies that $X$ and $Y$ have the same marginal distributions, i.e., $X^{(i)} \stackrel{d}{=} Y^{(i)}$ for $i = 1, \ldots, d$.

Another important example of stochastic orders is given by the family of convex order relations (cf. Müller and Stoyan, 2002; Rüschendorf, 2004). As will be shown in Section 5.4, convex order relations are also relevant for the ordering of extreme portfolio losses.

Definition 5.4. Let $X$ and $Y$ be random vectors in $\mathbb{R}^d$. Then $X$ is said to be smaller than $Y$ in

(a) convex order, $X \preceq_{\mathrm{cx}} Y$, if $Ef(X) \le Ef(Y)$ for all convex functions $f : \mathbb{R}^d \to \mathbb{R}$ such that the expectations exist;

(b) linear convex order, $X \preceq_{\mathrm{lcx}} Y$, if $\xi^\top X \preceq_{\mathrm{cx}} \xi^\top Y$ for all $\xi \in \mathbb{R}^d$;

(c) positive linear convex order, $X \preceq_{\mathrm{plcx}} Y$, if $\xi^\top X \preceq_{\mathrm{cx}} \xi^\top Y$ for all $\xi \in \mathbb{R}^d_+$;

(d) directionally convex order, $X \preceq_{\mathrm{dcx}} Y$, if $Ef(X) \le Ef(Y)$ for all directionally convex (i.e., supermodular and componentwise convex) functions $f : \mathbb{R}^d \to \mathbb{R}$ such that the expectations exist.
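The supermodularity condition (5.2), which also enters the directionally convex order in Definition 5.4(d), can be tested on a finite grid. The sketch below is a hypothetical helper for illustration: passing the grid check is only a necessary condition for supermodularity on $\mathbb{R}^d$, whereas a single violation disproves it.

```python
import itertools


def is_supermodular_on_grid(f, grid, d, tol=1e-12):
    """Check f(x ^ y) + f(x v y) >= f(x) + f(y) for all x, y in grid^d.

    Here x ^ y and x v y are the componentwise minimum and maximum.
    """
    points = list(itertools.product(grid, repeat=d))
    for x in points:
        for y in points:
            lower = tuple(min(a, b) for a, b in zip(x, y))
            upper = tuple(max(a, b) for a, b in zip(x, y))
            if f(lower) + f(upper) < f(x) + f(y) - tol:
                return False
    return True


if __name__ == "__main__":
    grid = [-1, 0, 1, 2]
    print(is_supermodular_on_grid(lambda x: x[0] * x[1], grid, 2))   # product
    print(is_supermodular_on_grid(lambda x: -x[0] * x[1], grid, 2))  # negated product
    print(is_supermodular_on_grid(lambda x: x[0] ** 2, grid, 2))     # one component only
```

The product $x^{(1)} x^{(2)}$ passes the check, its negative fails, and a function of a single component satisfies (5.2) with equality, in line with Remark 5.3(d).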

There are also many other notions of stochastic ordering which are not mentioned here. In fact, any stochastic application may induce a specific order relation for the incorporated random variables. As a result of this variety, the interconnections between stochastic order relations are an important aspect of their classification. In particular, an order relation $\preceq$ is called stronger than another order relation $\preceq^*$, if $X \preceq Y$ implies $X \preceq^* Y$. Interconnections of this type are called order hierarchy. It is obvious that all results established for an order relation $\preceq^*$ automatically apply to any stronger order relation $\preceq$. The subsequent remark is an excerpt from the order hierarchies presented in Müller and Stoyan (2002).

Remark 5.5. The following implication chains hold for all random vectors $X$ and $Y$ in $\mathbb{R}^d$:

(a) $(X \preceq_{\mathrm{sm}} Y) \Rightarrow (X \preceq_{\mathrm{dcx}} Y) \Rightarrow (X \preceq_{\mathrm{plcx}} Y)$;

(b) $(X \preceq_{\mathrm{cx}} Y) \Rightarrow (X \preceq_{\mathrm{lcx}} Y) \Rightarrow (X \preceq_{\mathrm{plcx}} Y)$.

5.2 Ordering of extreme portfolio losses

The present section introduces the notion of asymptotic portfolio loss order $\preceq_{\mathrm{apl}}$. Depending only on the asymptotic behaviour of portfolio losses, this notion is suitable for the analysis of asymptotic diversification effects. Some basic properties of this newly introduced notion are discussed in Remark 5.7 and Lemmas 5.9 and 5.10. Finally, an ordering result is obtained for elliptical distributions in Lemma 5.11.

Definition 5.6. Let $X$ and $Y$ be random variables in $\mathbb{R}^d$. Then $X$ is called lower than $Y$ in asymptotic portfolio loss order, $X \preceq_{\mathrm{apl}} Y$, if
$$\forall \xi \in \Sigma^d \quad \limsup_{t \to \infty} \frac{P\{\xi^\top X > t\}}{P\{\xi^\top Y > t\}} \le 1. \tag{5.4}$$

Remark 5.7. (a) If the denominator in (5.4) is equal to 0, then the value of the fraction is determined by $0/0 := 1$ and $c/0 := \infty$ for $c > 0$.

(b) The relation $\preceq_{\mathrm{apl}}$ is a preorder, i.e., $\preceq_{\mathrm{apl}}$ is reflexive and transitive, but not antisymmetric. Thus $X \preceq_{\mathrm{apl}} Y$ and $Y \preceq_{\mathrm{apl}} X$ does not imply $P^X = P^Y$. In particular, $\preceq_{\mathrm{apl}}$ is shift invariant:
$$\forall c \in \mathbb{R}^d \quad X \preceq_{\mathrm{apl}} X + c,$$
which immediately yields $X \preceq_{\mathrm{apl}} X + c \preceq_{\mathrm{apl}} X$.

(c) It is easy to see that condition (5.4) is equivalent to
$$\forall \xi \in \mathbb{R}^d_+ \quad \limsup_{t \to \infty} \frac{P\{\xi^\top X > t\}}{P\{\xi^\top Y > t\}} \le 1. \tag{5.5}$$

(d) Although originally designed for random vectors, $\preceq_{\mathrm{apl}}$ is well-defined in the univariate case. Due to $\Sigma^1 = \{1\}$, $Z \preceq_{\mathrm{apl}} Z^*$ for random variables $Z$ and $Z^*$ in $\mathbb{R}$ means
$$\limsup_{t \to \infty} \frac{P\{Z > t\}}{P\{Z^* > t\}} \le 1.$$
In particular, the situation when the components of random vectors $X$ and $Y$ are ordered with respect to $\preceq_{\mathrm{apl}}$,
$$\forall i \in \{1, \ldots, d\} \quad X^{(i)} \preceq_{\mathrm{apl}} Y^{(i)},$$
is a rather natural setting for the comparison of diversification effects resulting from different dependence structures.
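The shift invariance stated in Remark 5.7(b) can be illustrated in the univariate setting of Remark 5.7(d). The sketch below assumes, purely for illustration, a Pareto distributed variable $Z$ with $P\{Z > t\} = t^{-\alpha}$ for $t \ge 1$ and evaluates the tail ratio of $Z + c$ and $Z$; the function names are hypothetical.

```python
def pareto_tail(t, alpha):
    """P{Z > t} for a Pareto(alpha) variable Z on [1, inf)."""
    return 1.0 if t <= 1.0 else t ** (-alpha)


def tail_ratio(t, alpha, shift):
    """P{Z + shift > t} / P{Z > t}, the fraction in the univariate apl condition."""
    return pareto_tail(t - shift, alpha) / pareto_tail(t, alpha)


if __name__ == "__main__":
    for t in (1e2, 1e4, 1e6):
        print(t, tail_ratio(t, 1.5, 10.0))  # decreases towards 1
```

For any fixed threshold the shifted variable has the heavier tail, but the ratio tends to 1 as $t \to \infty$, so that both $Z \preceq_{\mathrm{apl}} Z + c$ and $Z + c \preceq_{\mathrm{apl}} Z$ hold.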

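Before proceeding with further study of $\preceq_{\mathrm{apl}}$, some additional notation must be introduced.

Remark 5.8. In the sequel, the product $vx$ of $v, x \in \mathbb{R}^d$ is understood componentwise:
$$vx := \left( v^{(1)} x^{(1)}, \ldots, v^{(d)} x^{(d)} \right).$$
Thus the writings $vx = v \cdot x$ and $v^\top x = v^\top \cdot x$ have different meanings.

The subsequent lemma comprises some basic invariance properties of $\preceq_{\mathrm{apl}}$.

Lemma 5.9. The asymptotic portfolio loss order $\preceq_{\mathrm{apl}}$ is invariant under the following operations.

(a) permutation:
$$X \preceq_{\mathrm{apl}} Y \Rightarrow \left( X^{(i_1)}, \ldots, X^{(i_d)} \right) \preceq_{\mathrm{apl}} \left( Y^{(i_1)}, \ldots, Y^{(i_d)} \right)$$
for any permutation $(i_1, \ldots, i_d)$ of $(1, \ldots, d)$;

(b) marginalization:
$$X \preceq_{\mathrm{apl}} Y \Rightarrow \left( X^{(i_1)}, \ldots, X^{(i_m)} \right) \preceq_{\mathrm{apl}} \left( Y^{(i_1)}, \ldots, Y^{(i_m)} \right)$$
for any sub-index $(i_1, \ldots, i_m) \subset (1, \ldots, d)$. In particular, $X \preceq_{\mathrm{apl}} Y$ implies $X^{(i)} \preceq_{\mathrm{apl}} Y^{(i)}$ for $i = 1, \ldots, d$;

(c) componentwise rescaling:
$$\forall v \in \mathbb{R}^d_+ \quad X \preceq_{\mathrm{apl}} Y \Rightarrow vX \preceq_{\mathrm{apl}} vY.$$

Proof. All these properties are immediate consequences of Remark 5.7(c). □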

In addition to the invariance properties stated above, $\preceq_{\mathrm{apl}}$ is monotonic with respect to linear expansions in the special case of random vectors in $\mathbb{R}^d_+$.

Lemma 5.10. Let $X$ be a random vector in $\mathbb{R}^d_+$. Then
$$\forall v \in [1, \infty)^d \quad X \preceq_{\mathrm{apl}} vX. \tag{5.6}$$

Proof. Consider the vector
$$w := \left( v^{(1)} - 1, \ldots, v^{(d)} - 1 \right).$$
It is obvious that $v \in [1, \infty)^d$ implies $w \in [0, \infty)^d$ and therefore
$$\xi^\top \cdot (vX) = \xi^\top X + \xi^\top \cdot (wX) \ge \xi^\top X.$$
Hence $P\{\xi^\top X > t\} \le P\{\xi^\top \cdot (vX) > t\}$, which immediately yields (5.6). □

Defined in terms of linear combinations $\xi^\top X$, the asymptotic portfolio loss order $\preceq_{\mathrm{apl}}$ accords with the structure of elliptical distributions introduced in Definition 2.6. This entails the following result.

Lemma 5.11. Let $X$ and $Y$ be elliptically distributed,
$$X \stackrel{d}{=} \mu_X + R_X A_X U, \qquad Y \stackrel{d}{=} \mu_Y + R_Y A_Y U \tag{5.7}$$
(cf. (2.59)). Further, suppose that
$$R_X \preceq_{\mathrm{apl}} R_Y \tag{5.8}$$
and that the generalized covariance matrices $C_X = A_X A_X^\top$ and $C_Y = A_Y A_Y^\top$ satisfy
$$\forall \xi \in \Sigma^d \quad \xi^\top C_X \xi \le \xi^\top C_Y \xi. \tag{5.9}$$
Then $X \preceq_{\mathrm{apl}} Y$.

Proof. According to Remark 5.7(b), $\preceq_{\mathrm{apl}}$ is shift invariant. Consequently, assuming $\mu_X = \mu_Y = 0$ does not lead to any loss of generality. Furthermore, it is easy to see that $\xi^\top A_X = 0$ implies $\xi^\top X = \xi^\top \mu_X = 0$. Hence, due to $0 \preceq_{\mathrm{apl}} \xi^\top Y$ for any $\xi \in \Sigma^d$, it suffices to consider $\xi \in \Sigma^d$ with $\xi^\top A_X \ne 0$. In this case (5.7) yields
$$\xi^\top X \stackrel{d}{=} R_X \cdot \xi^\top A_X U = a_{\xi,X} R_X \cdot v_{\xi,X}^\top U,$$
where $a_{\xi,X} \in (0, \infty)$ and $v_{\xi,X} \in \mathbb{R}^d$ are defined by
$$a_{\xi,X} := \left\| \xi^\top A_X \right\|_2 = \left( \xi^\top C_X \xi \right)^{1/2}, \tag{5.10}$$
$$v_{\xi,X} := a_{\xi,X}^{-1} \cdot \left( \xi^\top A_X \right)^\top = a_{\xi,X}^{-1} \cdot A_X^\top \xi. \tag{5.11}$$
Moreover, it is obvious that $\|v_{\xi,X}\|_2 = 1$. Hence the random variable $v_{\xi,X}^\top U$ is an orthogonal projection of $U \sim \mathrm{unif}(\mathbb{S}_2^d)$ on a vector of unit length and symmetry arguments yield
$$v_{\xi,X}^\top U \stackrel{d}{=} V$$
with $V := e_1^\top U = U^{(1)}$. In particular, the distribution of $v_{\xi,X}^\top U$ does not depend on $\xi$ or $A_X$. This implies
$$\xi^\top X \stackrel{d}{=} a_{\xi,X} R_X V \tag{5.12}$$
and, analogously,
$$\xi^\top Y \stackrel{d}{=} a_{\xi,Y} R_Y V \tag{5.13}$$
with
$$a_{\xi,Y} := \left\| \xi^\top A_Y \right\|_2 = \left( \xi^\top C_Y \xi \right)^{1/2}. \tag{5.14}$$
Furthermore, (5.10), (5.14), and (5.9) yield
$$a_{\xi,X} \le a_{\xi,Y}$$
and therefore
$$a_{\xi,X} R_X V \preceq_{\mathrm{apl}} a_{\xi,Y} R_X V.$$
Hence it suffices to show that $a_{\xi,Y} R_X V \preceq_{\mathrm{apl}} a_{\xi,Y} R_Y V$, which is equivalent to $R_X V \preceq_{\mathrm{apl}} R_Y V$. Note that independence of $R_X$ and $U$ in (5.7) implies independence of $R_X$ and $V$ in (5.12). Analogously, $R_Y$ is independent from $V$ in (5.13). Moreover, $R_X$ and $R_Y$ are non-negative. This yields
$$P\{R_X V > t\} = P\{R_X V_+ > t\} = \int_{(0,1)} \int_{[0,\infty)} 1\{rv > t\} \, dP^{R_X}(r) \, dP^V(v) = \int_{(0,1)} P\{R_X > t/v\} \, dP^V(v) = \int_{(0,1)} f(t/v) \cdot P\{R_Y > t/v\} \, dP^V(v), \tag{5.15}$$
where $f : \mathbb{R}_+ \to \mathbb{R}_+$ is defined by
$$f(t) := \frac{P\{R_X > t\}}{P\{R_Y > t\}}.$$
Since $t/v > t$ for $v \in (0, 1)$, representation (5.15) yields
$$P\{R_X V > t\} \le \sup\{f(z) : z > t\} \cdot P\{R_Y V > t\}. \tag{5.16}$$
Finally, recall that (5.8) is equivalent to
$$\limsup_{t \to \infty} f(t) \le 1.$$
Thus (5.16) implies
$$\limsup_{t \to \infty} \frac{P\{R_X V > t\}}{P\{R_Y V > t\}} \le 1. \quad □$$
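The two algebraic facts behind (5.10) and (5.11), namely $\|\xi^\top A_X\|_2 = (\xi^\top C_X \xi)^{1/2}$ and $\|v_{\xi,X}\|_2 = 1$, can be checked for a concrete matrix. The following sketch uses plain Python lists; the matrix `A` and the portfolio vector `xi` in the usage part are hypothetical example values.

```python
import math


def mat_t_vec(A, xi):
    """Return A^T xi for a square matrix A given as a list of rows."""
    d = len(A)
    return [sum(A[i][j] * xi[i] for i in range(d)) for j in range(d)]


def gram(A):
    """Return the generalized covariance matrix C = A A^T."""
    d = len(A)
    return [[sum(A[i][m] * A[j][m] for m in range(d)) for j in range(d)]
            for i in range(d)]


def quadratic_form(C, xi):
    """Return xi^T C xi."""
    d = len(C)
    return sum(xi[i] * C[i][j] * xi[j] for i in range(d) for j in range(d))


def scale_and_direction(A, xi):
    """Return a = ||xi^T A||_2 and v = A^T xi / a, cf. (5.10) and (5.11)."""
    w = mat_t_vec(A, xi)
    a = math.sqrt(sum(c * c for c in w))
    return a, [c / a for c in w]


if __name__ == "__main__":
    A = [[1.0, 0.0], [0.5, 2.0]]  # hypothetical square root of C
    xi = [0.3, 0.7]               # portfolio weights in the unit simplex
    a, v = scale_and_direction(A, xi)
    print(a, math.sqrt(quadratic_form(gram(A), xi)), v)
```

The scalar $a_{\xi,X}$ thus carries all the dependence on $\xi$ and $A_X$, while the direction $v_{\xi,X}$ only selects a unit vector on which $U$ is projected.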

5.3 Ordering of spectral measures

This section is dedicated to the characterization of the asymptotic portfolio loss order $\preceq_{\mathrm{apl}}$ in the framework of multivariate regular variation. The results obtained here highlight the influence of the tail index $\alpha$ and the spectral measure $\Psi$ on $\preceq_{\mathrm{apl}}$, with primary focus put on dependence structures captured by $\Psi$. It is shown that $\preceq_{\mathrm{apl}}$ corresponds to a family of order relations on the set of canonical spectral measures and that these order relations are intimately related to the extreme risk index $\gamma_\xi$. Main results are stated in Theorems 5.18 and 5.20, providing criteria for $X \preceq_{\mathrm{apl}} Y$ in terms of componentwise ordering $|X^{(i)}| \preceq_{\mathrm{apl}} |Y^{(i)}|$ for $i = 1, \ldots, d$ and ordering of canonical spectral measures. An application to spectral measures of elliptical distributions is given in Lemma 5.19.

Lemma 5.12. Let $X$ and $Y$ be multivariate regularly varying on $\mathbb{R}^d$ and suppose that
$$\forall \xi \in \Sigma^d \quad \gamma_\xi(Y) > 0. \tag{5.17}$$
Further, assume that
$$\lim_{t \to \infty} \frac{P\{\|X\|_1 > t\}}{P\{\|Y\|_1 > t\}} = 0. \tag{5.18}$$
Then $X \preceq_{\mathrm{apl}} Y$.

Proof. According to Chapter 3, there holds
$$\lim_{t \to \infty} \frac{P\{\xi^\top X > t\}}{P\{\|X\|_1 > t\}} = \gamma_\xi(X) \quad \text{and} \quad \lim_{t \to \infty} \frac{P\{\xi^\top Y > t\}}{P\{\|Y\|_1 > t\}} = \gamma_\xi(Y).$$
Hence (5.17) and (5.18) yield
$$\limsup_{t \to \infty} \frac{P\{\xi^\top X > t\}}{P\{\xi^\top Y > t\}} = \limsup_{t \to \infty} \left( \frac{P\{\xi^\top X > t\}}{P\{\|X\|_1 > t\}} \cdot \frac{P\{\|Y\|_1 > t\}}{P\{\xi^\top Y > t\}} \cdot \frac{P\{\|X\|_1 > t\}}{P\{\|Y\|_1 > t\}} \right) = \frac{\gamma_\xi(X)}{\gamma_\xi(Y)} \cdot \limsup_{t \to \infty} \frac{P\{\|X\|_1 > t\}}{P\{\|Y\|_1 > t\}} = 0. \quad □$$

As a particular consequence of this result, $\preceq_{\mathrm{apl}}$ is trivial in the case of different tail indices and non-degenerate portfolio losses.

Corollary 5.13. If $X$ and $Y$ are multivariate regularly varying on $\mathbb{R}^d$ with tail indices $\alpha_X$ and $\alpha_Y$ and $Y$ satisfies (5.17), then $\alpha_X > \alpha_Y$ implies $X \preceq_{\mathrm{apl}} Y$.

Proof. Recall that multivariate regular variation of $X$ implies regular variation of $\|X\|_1$ with tail index $\alpha_X$. Analogously, $\|Y\|_1$ is regularly varying with tail index $\alpha_Y$. Finally, $\alpha_X > \alpha_Y$ yields (5.18). □

Thus the primary setting for studying the influence of dependence structures on the ordering of extreme portfolio losses is the case of random variables $X$ and $Y$ with equal tail indices:
$$\alpha_X = \alpha_Y =: \alpha.$$
In the framework of multivariate regular variation, asymptotic dependence in the tail region is characterized by the spectral measure $\Psi$ or its canonical version $\Psi^*$. Since $\preceq_{\mathrm{apl}}$ and $\Psi^*$ are invariant under componentwise rescalings (cf. Lemma 5.9(c) and (2.24)), the canonical spectral measure $\Psi^*$ is more suitable for the characterization of $\preceq_{\mathrm{apl}}$. The following lemma provides a representation of the extreme risk index $\gamma_\xi$ in terms of $\Psi^*$. Note that the formulation makes use of the componentwise product notation introduced in Remark 5.8.

Lemma 5.14. Let $X$ be multivariate regularly varying on $\mathbb{R}^d$ with tail index $\alpha \in (0, \infty)$. If $X$ satisfies the non-degeneracy condition (2.12), then
$$\gamma_\xi(X) = \int_{\mathbb{S}_1^d} g_{\xi,\alpha}(vs) \, d\Psi^*(s), \tag{5.19}$$
where $\Psi^*$ denotes the canonical spectral measure of $X$, the rescaling vector $v = (v^{(1)}, \ldots, v^{(d)})$ is defined by
$$v^{(i)} := \gamma_{e_i}(X) + \gamma_{-e_i}(X), \tag{5.20}$$
and the function $g_{\xi,\alpha} : \mathbb{R}^d \to \mathbb{R}$ is defined as
$$g_{\xi,\alpha}(x) := \left( \sum_{i=1}^{d} \xi^{(i)} \cdot \left( \left(x^{(i)}_+\right)^{1/\alpha} - \left(x^{(i)}_-\right)^{1/\alpha} \right) \right)_+^{\alpha}. \tag{5.21}$$

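Proof. According to (2.18), the canonical exponent measure $\nu^*$ of $X$ is obtained from the exponent measure $\nu$ as
$$\nu^* = \nu \circ T$$
with the transformation $T : \mathbb{R}^d \to \mathbb{R}^d$ defined by
$$T(x) := \left( T_\alpha\left(\nu(B_1) \cdot x^{(1)}\right), \ldots, T_\alpha\left(\nu(B_d) \cdot x^{(d)}\right) \right), \tag{5.22}$$
$$T_\alpha(t) := t_+^{1/\alpha} - t_-^{1/\alpha}, \tag{5.23}$$
$$B_i := \left\{ x \in \mathbb{R}^d : \left|x^{(i)}\right| > 1 \right\}. \tag{5.24}$$
Furthermore, according to (2.22), $\nu^*$ satisfies
$$\nu^* \circ \tau^{-1} = \rho_1 \otimes \Psi^*,$$
where $\Psi^*$ denotes the canonical spectral measure of $X$ and $\tau$ denotes the transformation into polar coordinates, $\tau(x) = (\|x\|_1, \|x\|_1^{-1} \cdot x)$. This yields
$$\gamma_\xi(X) = \nu(A_{\xi,1}) = \nu^*\left(T^{-1}(A_{\xi,1})\right) = \nu^*\left\{ x \in \mathbb{R}^d : T(x) \in A_{\xi,1} \right\} = \int_{\mathbb{S}_1^d} \int_{(0,\infty)} 1\left\{ \xi^\top T(rs) > 1 \right\} d\rho_1(r) \, d\Psi^*(s). \tag{5.25}$$
It is easy to see that (5.23) implies $T_\alpha(rt) = r^{1/\alpha} T_\alpha(t)$ for $r > 0$ and $t \in \mathbb{R}$. Consequently, (5.22) yields
$$T(rx) = r^{1/\alpha} T(x) \tag{5.26}$$
for $r > 0$ and $x \in \mathbb{R}^d$. Applying (5.26) to (5.25), one obtains
$$\gamma_\xi(X) = \int_{\mathbb{S}_1^d} \int_{(0,\infty)} 1\left\{ r^{1/\alpha} \xi^\top T(s) > 1 \right\} d\rho_1(r) \, d\Psi^*(s) = \int_{\mathbb{S}_1^d} \int_{(0,\infty)} 1\left\{ \xi^\top T(s) > 0 \right\} 1\left\{ r > \left(\xi^\top T(s)\right)^{-\alpha} \right\} d\rho_1(r) \, d\Psi^*(s) = \int_{\mathbb{S}_1^d} 1\left\{ \xi^\top T(s) > 0 \right\} \left(\xi^\top T(s)\right)^{\alpha} d\Psi^*(s) = \int_{\mathbb{S}_1^d} \left(\xi^\top T(s)\right)_+^{\alpha} d\Psi^*(s). \tag{5.27}$$
Finally, consider the sets $B_i$ defined in (5.24). It is easy to see that
$$\nu(B_i) = \nu(A_{e_i}) + \nu(A_{-e_i}) = \gamma_{e_i}(X) + \gamma_{-e_i}(X) = v^{(i)}.$$
Hence
$$\left(\xi^\top T(s)\right)_+^{\alpha} = \left( \sum_{i=1}^{d} \xi^{(i)} \cdot T_\alpha\left(v^{(i)} s^{(i)}\right) \right)_+^{\alpha} = g_{\xi,\alpha}(vs). \quad □$$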
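The functions $T_\alpha$ from (5.23) and $g_{\xi,\alpha}$ from (5.21) are easy to implement, which allows a quick check of the scaling property (5.26). The sketch below is illustrative; the function names are hypothetical and vectors are represented as plain tuples.

```python
def t_alpha(t, alpha):
    """T_alpha(t) = t_+^(1/alpha) - t_-^(1/alpha), cf. (5.23)."""
    return max(t, 0.0) ** (1.0 / alpha) - max(-t, 0.0) ** (1.0 / alpha)


def g(xi, alpha, x):
    """g_{xi,alpha}(x) = (sum_i xi_i * T_alpha(x_i))_+^alpha, cf. (5.21)."""
    inner = sum(w * t_alpha(c, alpha) for w, c in zip(xi, x))
    return max(inner, 0.0) ** alpha


if __name__ == "__main__":
    alpha, r = 2.0, 3.0
    xi, x = (0.4, 0.6), (0.7, -0.3)
    # Scaling property (5.26) in one coordinate:
    print(t_alpha(r * x[1], alpha), r ** (1.0 / alpha) * t_alpha(x[1], alpha))
    # Resulting homogeneity of degree 1 for g:
    print(g(xi, alpha, tuple(r * c for c in x)), r * g(xi, alpha, x))
```

Since $T_\alpha$ is homogeneous of degree $1/\alpha$ and the outer power is $\alpha$, the function $g_{\xi,\alpha}$ is homogeneous of degree 1 in its argument.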

As already mentioned above, $\preceq_{\mathrm{apl}}$ and $\Psi^*$ are invariant under rescaling of components. Consequently, characterization of $\preceq_{\mathrm{apl}}$ can be reduced to the case when the marginal weights $v^{(i)} = \gamma_{e_i}(X) + \gamma_{-e_i}(X)$ in (5.19) are standardized by
$$\forall i, j \in \{1, \ldots, d\} \quad \lim_{t \to \infty} \frac{P\{|X^{(i)}| > t\}}{P\{|X^{(j)}| > t\}} = 1. \tag{5.28}$$
This condition will be referred to as the balanced tails condition. The following result shows that this condition significantly simplifies representation (5.19).

Lemma 5.15. Suppose that $X$ is multivariate regularly varying on $\mathbb{R}^d$ with tail index $\alpha \in (0, \infty)$.

(a) If $X$ has balanced tails in the sense of (5.28), then
$$\frac{\gamma_\xi(X)}{\gamma_{e_1}(X) + \gamma_{-e_1}(X)} = \Psi^* g_{\xi,\alpha}. \tag{5.29}$$

(b) The non-degeneracy condition (2.12) is equivalent to the existence of a vector $w \in (0, \infty)^d$ such that $wX$ has balanced tails.

(c) The extreme risk index $\gamma_\xi$ of the rescaled vector $wX$ obtained in part (b) satisfies
$$\frac{\gamma_\xi(wX)}{\gamma_{e_1}(wX) + \gamma_{-e_1}(wX)} = \Psi_X^* g_{\xi,\alpha}. \tag{5.30}$$

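Proof. Part (a). Consider the integrand $g_{\xi,\alpha}(vs)$ in the representation (5.19):
$$g_{\xi,\alpha}(vs) = \left( \sum_{i=1}^{d} \xi^{(i)} \cdot \left( \left(v^{(i)} s^{(i)}\right)_+^{1/\alpha} - \left(v^{(i)} s^{(i)}\right)_-^{1/\alpha} \right) \right)_+^{\alpha}.$$
The balanced tails condition (5.28) implies that $X$ is non-degenerate in the sense of (2.12). Furthermore, all weights $v^{(i)}$ in the representation (5.19) are equal:
$$1 = \lim_{t \to \infty} \frac{P\{|X^{(i)}| > t\} / P\{\|X\|_1 > t\}}{P\{|X^{(j)}| > t\} / P\{\|X\|_1 > t\}} = \frac{\gamma_{e_i}(X) + \gamma_{-e_i}(X)}{\gamma_{e_j}(X) + \gamma_{-e_j}(X)} = \frac{v^{(i)}}{v^{(j)}}, \quad i, j \in \{1, \ldots, d\}.$$
Hence $g_{\xi,\alpha}(vs)$ simplifies to
$$g_{\xi,\alpha}(vs) = v^{(1)} g_{\xi,\alpha}(s) = \left( \gamma_{e_1}(X) + \gamma_{-e_1}(X) \right) g_{\xi,\alpha}(s).$$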

Part (b). Suppose that $X$ satisfies (2.12). Then the sets $B_i$ defined in (5.24) satisfy $\nu(B_i) > 0$ for $i = 1, \ldots, d$. Consequently, the random variables $|X^{(i)}|$ are regularly varying with tail index $\alpha$. Denoting
$$w^{(i)} := \left(\nu(B_i)\right)^{-1/\alpha}, \tag{5.31}$$
one obtains
$$\lim_{t \to \infty} \frac{P\{|w^{(i)} X^{(i)}| > t\}}{P\{\|X\|_1 > t\}} = \lim_{t \to \infty} \left( \frac{P\{|X^{(i)}| > t/w^{(i)}\}}{P\{|X^{(i)}| > t\}} \cdot \frac{P\{|X^{(i)}| > t\}}{P\{\|X\|_1 > t\}} \right) = \left(w^{(i)}\right)^{\alpha} \cdot \nu(B_i) = 1$$
for $i = 1, \ldots, d$. Hence for any $i, j \in \{1, \ldots, d\}$ there holds
$$\lim_{t \to \infty} \frac{P\{|w^{(i)} X^{(i)}| > t\}}{P\{|w^{(j)} X^{(j)}| > t\}} = \lim_{t \to \infty} \left( \frac{P\{|w^{(i)} X^{(i)}| > t\}}{P\{\|X\|_1 > t\}} \cdot \frac{P\{\|X\|_1 > t\}}{P\{|w^{(j)} X^{(j)}| > t\}} \right) = 1.$$
To prove the inverse implication, suppose that $Z := wX$ has balanced tails for some $w \in (0, \infty)^d$. With $\nu$ denoting the exponent measure of $X$, one obtains
$$\frac{\nu(B_i)}{\nu(B_1)} = \lim_{t \to \infty} \frac{P\{|X^{(i)}| > t\}}{P\{|X^{(1)}| > t\}} = \lim_{t \to \infty} \frac{P\{|Z^{(i)}| > w^{(i)} t\}}{P\{|Z^{(1)}| > w^{(1)} t\}} = \lim_{t \to \infty} \left( \frac{P\{|Z^{(i)}| > w^{(i)} t\}}{P\{|Z^{(i)}| > t\}} \cdot \frac{P\{|Z^{(1)}| > t\}}{P\{|Z^{(1)}| > w^{(1)} t\}} \cdot \frac{P\{|Z^{(i)}| > t\}}{P\{|Z^{(1)}| > t\}} \right) = \left( \frac{w^{(i)}}{w^{(1)}} \right)^{-\alpha} \in (0, \infty), \quad i \in \{1, \ldots, d\}.$$
Since multivariate regular variation of $X$ implies $\nu(B_j) > 0$ for at least one index $j \in \{1, \ldots, d\}$, this yields $\nu(B_i) > 0$ for all $i$.

Part (c). This is an immediate consequence of Lemma 5.15(a) and the invariance of canonical spectral measures under componentwise rescaling. □

Representation (5.29) suggests that ordering of the normalized extreme risk indices $\gamma_\xi/(\gamma_{e_1} + \gamma_{-e_1})$ in the balanced tails setting can be considered as an integral order relation for canonical spectral measures with respect to the function class
$$\mathcal{G}_\alpha := \left\{ g_{\xi,\alpha} : \xi \in \Sigma^d \right\}. \tag{5.32}$$
This justifies the following definition.

Definition 5.16. Let $\Psi^*$ and $\Phi^*$ be canonical spectral measures on $\mathbb{S}_1^d$ and let $\alpha > 0$. Then the order relation $\Psi^* \preceq_{\mathcal{G}_\alpha} \Phi^*$ is defined by
$$\forall g \in \mathcal{G}_\alpha \quad \Psi^* g \le \Phi^* g. \tag{5.33}$$

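Remark 5.17. (a) As a consequence of Lemma 3.5(b), $\preceq_{\mathcal{G}_\alpha}$ is indifferent for $\alpha = 1$ and canonical spectral measures on $\Sigma^d$, i.e., any $\Psi^*$ and $\Phi^*$ on $\mathcal{B}(\Sigma^d)$ satisfy
$$\Psi^* \preceq_{\mathcal{G}_1} \Phi^* \quad \text{and} \quad \Phi^* \preceq_{\mathcal{G}_1} \Psi^*. \tag{5.34}$$

(b) The order relation $\preceq_{\mathcal{G}_\alpha}$ is mixing invariant in the sense that uniform ordering of two parametric families $\{\Psi^*_\vartheta : \vartheta \in \Theta\}$ and $\{\Phi^*_\vartheta : \vartheta \in \Theta\}$,
$$\forall \vartheta \in \Theta \quad \Psi^*_\vartheta \preceq_{\mathcal{G}_\alpha} \Phi^*_\vartheta,$$
implies
$$\int_\Theta \Psi^*_\vartheta \, d\mu(\vartheta) \preceq_{\mathcal{G}_\alpha} \int_\Theta \Phi^*_\vartheta \, d\mu(\vartheta)$$
for any probability measure $\mu$ on $\Theta$. This property is particularly useful for construction of ordered parametric models.

The next result shows that multivariate regular variation entails a strong relation between $\preceq_{\mathcal{G}_\alpha}$ and $\preceq_{\mathrm{apl}}$. In particular, $\preceq_{\mathcal{G}_\alpha}$ characterizes the influence of the asymptotic dependence structure on the ordering of portfolio losses.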
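The order relation from Definition 5.16 can be explored for discrete measures on the unit simplex $\Sigma^2$. The sketch below rests on assumptions that are not taken from the text: the two measures `INDEPENDENT` and `COMONOTONE` are hypothetical examples treated as probability measures with equal marginal first moments (the normalization of canonical spectral measures in Chapter 2 may differ), and the condition (5.33) is only tested on a finite grid of portfolio vectors $\xi$.

```python
def g(xi, alpha, s):
    """g_{xi,alpha}(s) for s with non-negative components, cf. (5.21)."""
    inner = sum(w * c ** (1.0 / alpha) for w, c in zip(xi, s))
    return max(inner, 0.0) ** alpha


def integrate(measure, xi, alpha):
    """Psi g_{xi,alpha} for a discrete measure given as (weight, atom) pairs."""
    return sum(p * g(xi, alpha, s) for p, s in measure)


def is_ordered(psi, phi, alpha, grid_size=50, tol=1e-12):
    """Check Psi g <= Phi g for all xi on a grid of the unit simplex in R^2."""
    for i in range(grid_size + 1):
        xi = (i / grid_size, 1.0 - i / grid_size)
        if integrate(psi, xi, alpha) > integrate(phi, xi, alpha) + tol:
            return False
    return True


# Hypothetical examples: all mass on the axes vs. all mass on the diagonal.
INDEPENDENT = [(0.5, (1.0, 0.0)), (0.5, (0.0, 1.0))]
COMONOTONE = [(1.0, (0.5, 0.5))]

if __name__ == "__main__":
    for alpha in (0.5, 1.0, 2.0):
        print(alpha,
              is_ordered(INDEPENDENT, COMONOTONE, alpha),
              is_ordered(COMONOTONE, INDEPENDENT, alpha))
```

In this toy setting the measure concentrated on the axes is smaller for $\alpha = 2$, both directions hold for $\alpha = 1$ as in (5.34), and the ordering is reversed for $\alpha = 1/2$, which illustrates the inversion of diversification effects in infinite-mean models mentioned at the beginning of this chapter.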

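Theorem 5.18. Let $X$ and $Y$ be multivariate regularly varying random vectors on $\mathbb{R}^d$ with tail index $\alpha \in (0, \infty)$ and canonical spectral measures $\Psi_X^*$ and $\Psi_Y^*$. Further, suppose that $X$ and $Y$ satisfy the balanced tails condition (5.28).

(a) If $|X^{(1)}| \preceq_{\mathrm{apl}} |Y^{(1)}|$, then $\Psi_X^* \preceq_{\mathcal{G}_\alpha} \Psi_Y^*$ implies $X \preceq_{\mathrm{apl}} Y$.

(b) If $|X^{(1)}| \preceq_{\mathrm{apl}} |Y^{(1)}|$ and $|Y^{(1)}| \preceq_{\mathrm{apl}} |X^{(1)}|$, then $\Psi_X^* \preceq_{\mathcal{G}_\alpha} \Psi_Y^*$ is equivalent to $X \preceq_{\mathrm{apl}} Y$.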

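Proof. Part (a). Since $X$ has balanced tails, Lemma 5.15(a) yields
$$\lim_{t \to \infty} \frac{P\{\xi^\top X > t\}}{P\{|X^{(1)}| > t\}} = \lim_{t \to \infty} \left( \frac{P\{\xi^\top X > t\}}{P\{\|X\|_1 > t\}} \cdot \frac{P\{\|X\|_1 > t\}}{P\{|X^{(1)}| > t\}} \right) = \frac{\gamma_\xi(X)}{\gamma_{e_1}(X) + \gamma_{-e_1}(X)} = \Psi_X^* g_{\xi,\alpha}.$$
Analogously one obtains
$$\lim_{t \to \infty} \frac{P\{\xi^\top Y > t\}}{P\{|Y^{(1)}| > t\}} = \Psi_Y^* g_{\xi,\alpha}.$$
Moreover, $\Psi_X^* \preceq_{\mathcal{G}_\alpha} \Psi_Y^*$ implies
$$\frac{\Psi_X^* g_{\xi,\alpha}}{\Psi_Y^* g_{\xi,\alpha}} \le 1. \tag{5.35}$$
Consequently,
$$\limsup_{t \to \infty} \frac{P\{\xi^\top X > t\}}{P\{\xi^\top Y > t\}} = \limsup_{t \to \infty} \left( \frac{P\{\xi^\top X > t\}}{P\{|X^{(1)}| > t\}} \cdot \frac{P\{|Y^{(1)}| > t\}}{P\{\xi^\top Y > t\}} \cdot \frac{P\{|X^{(1)}| > t\}}{P\{|Y^{(1)}| > t\}} \right) = \frac{\Psi_X^* g_{\xi,\alpha}}{\Psi_Y^* g_{\xi,\alpha}} \cdot \limsup_{t \to \infty} \frac{P\{|X^{(1)}| > t\}}{P\{|Y^{(1)}| > t\}} \le 1 \tag{5.36}$$
due to (5.35) and $|X^{(1)}| \preceq_{\mathrm{apl}} |Y^{(1)}|$.

Part (b). Since the implication $(\Psi_X^* \preceq_{\mathcal{G}_\alpha} \Psi_Y^*) \Rightarrow (X \preceq_{\mathrm{apl}} Y)$ is already established in Part (a), it suffices to show $(X \preceq_{\mathrm{apl}} Y) \Rightarrow (\Psi_X^* \preceq_{\mathcal{G}_\alpha} \Psi_Y^*)$. It is easy to see that the combination of $|X^{(1)}| \preceq_{\mathrm{apl}} |Y^{(1)}|$ and $|Y^{(1)}| \preceq_{\mathrm{apl}} |X^{(1)}|$ implies
$$\lim_{t \to \infty} \frac{P\{|X^{(1)}| > t\}}{P\{|Y^{(1)}| > t\}} = 1.$$
Thus (5.36) yields
$$\frac{\Psi_X^* g_{\xi,\alpha}}{\Psi_Y^* g_{\xi,\alpha}} = \limsup_{t \to \infty} \frac{P\{\xi^\top X > t\}}{P\{\xi^\top Y > t\}}$$
and $X \preceq_{\mathrm{apl}} Y$ implies $\Psi_X^* \preceq_{\mathcal{G}_\alpha} \Psi_Y^*$. □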

Combining Theorem 5.18 with Lemma 5.11, one obtains an ordering re- sult for the canonical spectral measures of multivariate regularly varying elliptical distributions. The notation Ψ∗ = Ψ∗(α, C) is justified by the fact that spectral measures of elliptical distributions depend only on the tail index α and the generalized covariance matrix C (cf. Lemma 2.8).

Lemma 5.19. Let C and D be d-dimensional covariance matrices satisfying

Ci,i = Di,i > 0, i = 1, . . . , d, (5.37) and ∀ξ ∈ Σd ξ>Cξ ≤ ξ>Dξ. (5.38) Then ∗ ∗ ∀α > 0 Ψ (α, C) Gα Ψ (α, D). 102 Chapter 5. Stochastic order relations

Proof. Fix $\alpha \in (0,\infty)$ and consider random vectors

\[
X \stackrel{d}{=} R \cdot A \cdot U, \quad Y \stackrel{d}{=} R \cdot B \cdot U,
\]

where $A$ and $B$ are square roots of the matrices $C$ and $D$ in (5.38), i.e., $C = A \cdot A^\top$, $D = B \cdot B^\top$, and $R$ is an arbitrary regularly varying non-negative random variable with tail index $\alpha$.

As an immediate consequence of Lemma 5.11 one obtains $X \preceq_{\mathrm{apl}} Y$. Furthermore, invariance of $\preceq_{\mathrm{apl}}$ under componentwise rescaling yields $wX \preceq_{\mathrm{apl}} wY$ for $w = (w^{(1)}, \dots, w^{(d)})$ with

\[
w^{(i)} := C_{i,i}^{-1/2} = D_{i,i}^{-1/2}, \quad i = 1, \dots, d.
\]

Moreover, as a particular consequence of arguments underlying (5.12) and (5.13), one obtains

\[
w^{(i)} X^{(i)} \stackrel{d}{=} w^{(j)} Y^{(j)}, \quad i, j \in \{1, \dots, d\}.
\]

Hence the random vectors $wX$ and $wY$ satisfy the balanced tails condition (5.28), whereas their components are mutually ordered with respect to $\preceq_{\mathrm{apl}}$. Finally, Theorem 5.18(b) and invariance of canonical spectral measures under componentwise rescalings yield

\[
\Psi^*(\alpha, C) = \Psi^*_{wX} \preceq_{G_\alpha} \Psi^*_{wY} = \Psi^*(\alpha, D). \qquad \Box
\]
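Conditions (5.37) and (5.38) are easy to check numerically in small dimensions. The following is a minimal sketch for the bivariate case, assuming that $\Sigma^d$ denotes the unit simplex and that the matrices have the form $C(\rho)$ with unit diagonal and off-diagonal entry $\rho$, so that (5.37) holds automatically. The function names `quad_form` and `satisfies_5_38` are hypothetical and serve illustration only.

```python
def quad_form(rho: float, xi1: float) -> float:
    """Return xi^T C(rho) xi for C(rho) = [[1, rho], [rho, 1]] and xi = (xi1, 1 - xi1)."""
    xi2 = 1.0 - xi1
    return xi1 * xi1 + 2.0 * rho * xi1 * xi2 + xi2 * xi2


def satisfies_5_38(rho_c: float, rho_d: float, grid: int = 101) -> bool:
    """Check xi^T C xi <= xi^T D xi on an equidistant grid of the unit simplex (d = 2)."""
    for k in range(grid):
        xi1 = k / (grid - 1)
        # Small tolerance guards against floating-point noise at the vertices.
        if quad_form(rho_c, xi1) > quad_form(rho_d, xi1) + 1e-12:
            return False
    return True


print(satisfies_5_38(-0.5, 0.3))  # condition (5.38) with C = C(-0.5), D = C(0.3)
print(satisfies_5_38(0.3, -0.5))  # roles of C and D exchanged
```

Since $\xi^{(1)} \xi^{(2)} \ge 0$ on the simplex, the quadratic form is non-decreasing in $\rho$, so in this family the check succeeds exactly when the off-diagonal entry of $C$ does not exceed that of $D$.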

The subsequent result extends Theorem 5.18 to random vectors that do not have balanced tails.

Theorem 5.20. Let $X$ and $Y$ be multivariate regularly varying random vectors on $\mathbb{R}^d$ with tail index $\alpha \in (0,\infty)$ and canonical spectral measures $\Psi^*_X$ and $\Psi^*_Y$. Further, assume that $|X^{(i)}| \preceq_{\mathrm{apl}} |Y^{(i)}|$ with

\[
\lambda_i := \limsup_{t\to\infty} \frac{P\{|X^{(i)}| > t\}}{P\{|Y^{(i)}| > t\}} \in (0, 1] \tag{5.39}
\]

for $i = 1, \dots, d$ and that the vector $v = (v^{(1)}, \dots, v^{(d)})$ defined by

\[
v^{(i)} := \lambda_i^{-1/\alpha} \tag{5.40}
\]

satisfies

\[
X \preceq_{\mathrm{apl}} vX \quad \text{or} \quad v^{-1} Y \preceq_{\mathrm{apl}} Y. \tag{5.41}
\]

Then $\Psi^*_X \preceq_{G_\alpha} \Psi^*_Y$ implies $X \preceq_{\mathrm{apl}} Y$.

Proof. According to Lemma 5.15(b), there exists $w \in \mathbb{R}^d_+$ such that $wY$ satisfies the balanced tails condition (5.28). Furthermore, the tails of the random vector

\[
vwX := \left( v^{(1)} w^{(1)} X^{(1)}, \dots, v^{(d)} w^{(d)} X^{(d)} \right)
\]

with $v$ defined in (5.40) are also balanced. Indeed, it is easy to see that

\[
\lim_{t\to\infty} \frac{P\{|w^{(i)} Y^{(i)}| > t\}}{P\{|Y^{(i)}| > t\}}
= \lim_{t\to\infty} \frac{P\{|v^{(i)} w^{(i)} X^{(i)}| > t\}}{P\{|v^{(i)} X^{(i)}| > t\}}
= \left( w^{(i)} \right)^\alpha
\]

for $i = 1, \dots, d$. Analogously one obtains

\[
\lim_{t\to\infty} \frac{P\{|v^{(i)} X^{(i)}| > t\}}{P\{|X^{(i)}| > t\}} = \left( v^{(i)} \right)^\alpha = \lambda_i^{-1}
\]

and, as a result,

\begin{align*}
\limsup_{t\to\infty} \frac{P\{|v^{(i)} w^{(i)} X^{(i)}| > t\}}{P\{|w^{(i)} Y^{(i)}| > t\}}
&= \limsup_{t\to\infty} \frac{P\{|v^{(i)} X^{(i)}| > t\}}{P\{|Y^{(i)}| > t\}} \\
&= \limsup_{t\to\infty} \left( \frac{P\{|v^{(i)} X^{(i)}| > t\}}{P\{|X^{(i)}| > t\}} \cdot \frac{P\{|X^{(i)}| > t\}}{P\{|Y^{(i)}| > t\}} \right) \\
&= \lambda_i^{-1} \cdot \limsup_{t\to\infty} \frac{P\{|X^{(i)}| > t\}}{P\{|Y^{(i)}| > t\}} \\
&= 1
\end{align*}

for $i = 1, \dots, d$. Hence the balanced tails condition for $wY$ implies that the tails of $vwX$ are also balanced.

Furthermore, invariance of canonical spectral measures under componentwise rescaling yields

\[
\Psi^*_{vwX} = \Psi^*_X \preceq_{G_\alpha} \Psi^*_Y = \Psi^*_{wY}.
\]

Thus, applying Theorem 5.18(a), one obtains

\[
vwX \preceq_{\mathrm{apl}} wY. \tag{5.42}
\]

Since $v^{(i)} = \lambda_i^{-1/\alpha} > 0$ for $i = 1, \dots, d$, condition (5.42) is equivalent to

\[
wX \preceq_{\mathrm{apl}} v^{-1} wY. \tag{5.43}
\]

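Moreover, assumption (5.41) implies

\[
wX \preceq_{\mathrm{apl}} vwX \quad \text{or} \quad v^{-1} wY \preceq_{\mathrm{apl}} wY. \tag{5.44}
\]

Combining this ordering statement with (5.42) and (5.43), one obtains

\[
wX \preceq_{\mathrm{apl}} wY.
\]

Finally, invariance of $\preceq_{\mathrm{apl}}$ with respect to componentwise rescaling yields $X \preceq_{\mathrm{apl}} Y$. □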
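The effect of the rescaling vector $v$ from (5.40) can be illustrated with exact power tails. The following is a minimal sketch under the assumption that $P\{|Y^{(i)}| > t\} = t^{-\alpha}$ and $P\{|X^{(i)}| > t\} = \lambda_i t^{-\alpha}$ for large $t$; the function names are hypothetical. It shows that multiplying $X^{(i)}$ by $v^{(i)} = \lambda_i^{-1/\alpha}$ makes the tail ratio in (5.39) equal to one.

```python
def power_tail(t: float, alpha: float, scale: float = 1.0) -> float:
    """Assumed tail model: P{|Z| > t} = scale * t**(-alpha) for large t."""
    return scale * t ** (-alpha)


def rescaled_tail_ratio(t: float, alpha: float, lam: float) -> float:
    """Return P{|v X| > t} / P{|Y| > t} with v = lam**(-1/alpha) as in (5.40)."""
    v = lam ** (-1.0 / alpha)
    # P{|v X| > t} = P{|X| > t / v}, where X has tail lam * t**(-alpha).
    tail_vx = power_tail(t / v, alpha, scale=lam)
    tail_y = power_tail(t, alpha)
    return tail_vx / tail_y


# The ratio equals lam * v**alpha = 1, independently of t.
print(rescaled_tail_ratio(t=1000.0, alpha=1.5, lam=0.25))
```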

In the special case of random vectors in $\mathbb{R}^d_+$, Lemma 5.10 yields a significant simplification of Theorem 5.20.

Corollary 5.21. Let $X$ and $Y$ be multivariate regularly varying random vectors on $\mathbb{R}^d_+$ with tail index $\alpha \in (0,\infty)$ and canonical spectral measures $\Psi^*_X$ and $\Psi^*_Y$. Further, suppose that

\[
\lambda_i := \limsup_{t\to\infty} \frac{P\{|X^{(i)}| > t\}}{P\{|Y^{(i)}| > t\}} \in (0, 1], \quad i = 1, \dots, d. \tag{5.45}
\]

Then $\Psi^*_X \preceq_{G_\alpha} \Psi^*_Y$ implies $X \preceq_{\mathrm{apl}} Y$.

Proof. Assumption (5.45) immediately yields that the rescaling vector $v$ defined in (5.40) is an element of $[1, \infty)^d$. Moreover, condition $v^{-1} Y \preceq_{\mathrm{apl}} Y$ is equivalent to $Y \preceq_{\mathrm{apl}} vY$. Hence condition (5.41) of Theorem 5.20 is satisfied due to Lemma 5.10. □

The final result of this section is due to the indifference of $\preceq_{G_\alpha}$ for $\alpha = 1$ mentioned in Remark 5.17(a). This special property of spectral measures on $\Sigma^d$ allows one to reduce $\preceq_{\mathrm{apl}}$ to the ordering of components. It should be noted that this result cannot be extended to the general case of spectral measures on $S^d_1$.

Lemma 5.22. Let $X$ and $Y$ be multivariate regularly varying on $\mathbb{R}^d_+$ with tail index $\alpha = 1$. Further, suppose that $Y$ satisfies the non-degeneracy condition (2.12) and that $X^{(i)} \preceq_{\mathrm{apl}} Y^{(i)}$ for $i = 1, \dots, d$. Then $X \preceq_{\mathrm{apl}} Y$.

Proof. According to Lemma 5.15(b), there exists $w \in (0, \infty)^d$ such that $wY$ satisfies the balanced tails condition (5.28). Furthermore, due to the invariance of $\preceq_{\mathrm{apl}}$ under componentwise rescaling, $X \preceq_{\mathrm{apl}} Y$ is equivalent to $wX \preceq_{\mathrm{apl}} wY$.

Thus it can be assumed without loss of generality that $Y$ has balanced tails. This yields

\[
\lambda_i := \limsup_{t\to\infty} \frac{P\{X^{(i)} > t\}}{P\{Y^{(i)} > t\}} = \limsup_{t\to\infty} \frac{P\{X^{(i)} > t\}}{P\{Y^{(1)} > t\}}, \quad i = 1, \dots, d.
\]

Hence the assumption $X^{(i)} \preceq_{\mathrm{apl}} Y^{(i)}$ for $i = 1, \dots, d$ implies $\lambda_i \in [0, 1]$ for all $i$. Moreover, the balanced tails condition for $Y$ yields

\[
\gamma_{e_1}(Y) = \dots = \gamma_{e_d}(Y). \tag{5.46}
\]

Now consider the random vector $X$ and denote

\[
j := \operatorname*{arg\,max}_{i \in \{1, \dots, d\}} \gamma_{e_i}(X).
\]

Recall that $\gamma_{e_i}(X) = \nu_X(A_{e_i})$ with $\nu_X$ denoting the exponent measure of $X$ and that $\nu_X$ is non-zero. This yields $\gamma_{e_j}(X) > 0$ even if $X$ does not satisfy the non-degeneracy condition (2.12). Moreover, according to Lemma 3.5(b), the mapping $\xi \mapsto \gamma_\xi(X)$ is linear. This implies

\[
\gamma_\xi(X) = \sum_{i=1}^d \xi^{(i)} \cdot \gamma_{e_i}(X) \le \gamma_{e_j}(X), \quad \xi \in \Sigma^d. \tag{5.47}
\]

Analogously, the mapping $\xi \mapsto \gamma_\xi(Y)$ is linear. As a result, (5.46) yields

\[
\gamma_\xi(Y) = \sum_{i=1}^d \xi^{(i)} \cdot \gamma_{e_i}(Y) = \gamma_{e_1}(Y), \quad \xi \in \Sigma^d. \tag{5.48}
\]

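Hence

\begin{align*}
\limsup_{t\to\infty} \frac{P\{\xi^\top X > t\}}{P\{\xi^\top Y > t\}}
&= \limsup_{t\to\infty} \left( \frac{P\{\xi^\top X > t\}}{P\{X^{(j)} > t\}} \cdot \frac{P\{X^{(j)} > t\}}{P\{Y^{(1)} > t\}} \cdot \frac{P\{Y^{(1)} > t\}}{P\{\xi^\top Y > t\}} \right) \\
&= \frac{\gamma_\xi(X)}{\gamma_{e_j}(X)} \cdot \lambda_j \cdot \frac{\gamma_{e_1}(Y)}{\gamma_\xi(Y)} \\
&\le 1
\end{align*}

due to $\lambda_j \le 1$, (5.47), and (5.48). □

5.4 Convex and supermodular orders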

The present section is dedicated to the interconnections between the asymptotic portfolio loss order $\preceq_{\mathrm{apl}}$ and other stochastic order relations. The central result is stated in Theorem 5.23 and entails a collection of sufficient criteria for $\preceq_{\mathrm{apl}}$ in terms of convex and supermodular order relations. Particular attention is paid to the inversion of diversification effects for $\alpha < 1$. An application to copula based models is given in Lemma 5.27.

Recall the equivalent ordering criterion in terms of $\preceq_{G_\alpha}$ stated in Theorem 5.18(b). It is easy to see that this result characterizes the influence of dependence in the tail region on asymptotic diversification effects in multivariate regularly varying models. Thus, given canonical spectral measures $\Psi^*_X$ and $\Psi^*_Y$, comparison of resulting diversification effects for a fixed $\alpha \in (0,\infty)$ is a matter of computation.

In addition to explicit analytical or numerical calculation of the integrals $\Psi^* g_{\xi,\alpha}$, sufficient criteria for $\preceq_{G_\alpha}$ can be formulated in terms of ordered expectations of specific convex functions. Furthermore, ordering of these expectations can be derived from well-known stochastic order relations, such as the supermodular order and the family of convex orders. This approach was applied by Embrechts et al. (2009b) to the ordering of risks for the portfolio vector $\xi = (1, \dots, 1)$ and a specific family of multivariate regularly varying models with identically distributed, non-negative margins $X^{(i)}$ (cf. Example 5.29).

The next theorem generalizes these arguments to multivariate regularly varying random vectors in $\mathbb{R}^d$ with balanced tails and tail index $\alpha \ne 1$. The case $\alpha = 1$ is not included for two reasons. First, this case is partly trivial due to the indifference of $\preceq_{G_\alpha}$ for spectral measures on $\Sigma^d$ (cf. Remark 5.17(a)). Second, Karamata's theorem used in the proof of the integrable case $\alpha > 1$ (cf. Theorem A.1) does not yield the desired result for random variables with tail index $\alpha = 1$.

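Theorem 5.23. Let $X$ and $Y$ be multivariate regularly varying on $\mathbb{R}^d$ with the same tail index $\alpha \ne 1$. Further, suppose that $X$ and $Y$ satisfy the balanced tails condition (5.28).

(a) If $\alpha > 1$, $|X^{(1)}| \preceq_{\mathrm{apl}} |Y^{(1)}|$ with

\[
\limsup_{t\to\infty} \frac{P\{|X^{(1)}| > t\}}{P\{|Y^{(1)}| > t\}} = 1, \tag{5.49}
\]

and there exists $u_0 > 0$ such that

\[
\forall u \ge u_0 \ \forall \xi \in \Sigma^d \quad E h_u\left( \xi^\top X \right) \le E h_u\left( \xi^\top Y \right) \tag{5.50}
\]

with $h_u(t) := (t - u)_+$, then $\Psi^*_X \preceq_{G_\alpha} \Psi^*_Y$.

(b) If $\alpha < 1$, $|X^{(1)}|$ and $|Y^{(1)}|$ are mutually ordered with respect to $\preceq_{\mathrm{apl}}$, i.e.,

\[
|X^{(1)}| \preceq_{\mathrm{apl}} |Y^{(1)}| \quad \text{and} \quad |Y^{(1)}| \preceq_{\mathrm{apl}} |X^{(1)}|, \tag{5.51}
\]

and there exists $u_0 > 0$ such that

\[
\forall u \ge u_0 \ \forall \xi \in \Sigma^d \quad E f_u\left( \left( \xi^\top X \right)_+ \right) \le E f_u\left( \left( \xi^\top Y \right)_+ \right) \tag{5.52}
\]

with $f_u(t) := -(t \wedge u)$, then $\Psi^*_Y \preceq_{G_\alpha} \Psi^*_X$.

The proof will be given after some conclusions and remarks. In particular, it should be noted that the relation between $\preceq_{G_\alpha}$ and $\preceq_{\mathrm{apl}}$ established in Theorem 5.18 immediately yields the following result.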

Corollary 5.24. (a) If random vectors $X$ and $Y$ satisfy the conditions of Theorem 5.23(a), then $X \preceq_{\mathrm{apl}} Y$;

(b) If $X$ and $Y$ satisfy the conditions of Theorem 5.23(b), then $Y \preceq_{\mathrm{apl}} X$.

It is also worth remarking that the ordering of portfolio losses $\xi^\top X$ required in condition (5.50) is well known as the stop-loss order (cf. Müller and Stoyan, 2002). Furthermore, it is easy to see that the integrands $h_u$ and $f_u$ in (5.50) and (5.52) are convex. Consequently, sufficient criteria for these conditions can be formulated in terms of convex and supermodular stochastic order relations introduced in Definitions 5.2 and 5.1. Thus the order hierarchy mentioned in Remark 5.5 immediately yields a collection of criteria for (5.50) and (5.52).

Remark 5.25. (a) The following criteria are sufficient for (5.50) and (5.52):

• $(\xi^\top X)_+ \preceq_{\mathrm{cx}} (\xi^\top Y)_+$ for all $\xi \in \Sigma^d$,
• $X$ and $Y$ are restricted to $\mathbb{R}^d_+$ and $X \preceq Y$ with $\preceq$ denoting either $\preceq_{\mathrm{plcx}}$, $\preceq_{\mathrm{lcx}}$, $\preceq_{\mathrm{cx}}$, $\preceq_{\mathrm{dcx}}$, or $\preceq_{\mathrm{sm}}$.

(b) Additionally, condition (5.50) follows from $X \preceq Y$ with $\preceq$ denoting either $\preceq_{\mathrm{plcx}}$, $\preceq_{\mathrm{lcx}}$, $\preceq_{\mathrm{cx}}$, $\preceq_{\mathrm{dcx}}$, or $\preceq_{\mathrm{sm}}$.

Finally, a comment should be made upon convex ordering of non-integrable random variables and diversification for $\alpha < 1$.

Remark 5.26. The phase change at $\alpha = 1$, i.e., the inversion of diversification effects taking place when the tail index $\alpha$ crosses this critical value, demonstrates that the implications of convex ordering are essentially different for integrable and non-integrable random variables. Indeed, it is easy to see that if a random variable $Z$ on $\mathbb{R}$ satisfies $E[Z_+] = E[Z_-] = \infty$, then the only integrable convex functions of $Z$ are the constant ones. Moreover, if $Z$ is restricted to $\mathbb{R}_+$ and $EZ = \infty$, then the only integrable convex functions of $Z$ are the non-increasing ones.

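Proof of Theorem 5.23. Part (a). Consider the expectations in (5.50). It is easy to see that for $u > 0$

\begin{align*}
\frac{1}{u} E h_u\left( \xi^\top X \right)
&= \frac{1}{u} \int_{(u,\infty)} P\{\xi^\top X > t\} \, dt \\
&= \int_{(1,\infty)} P\{\xi^\top X > tu\} \, dt
\end{align*}

and, as a consequence,

\[
\frac{u^{-1} E h_u\left( \xi^\top X \right)}{P\{|X^{(1)}| > u\}}
= \frac{P\{\xi^\top X > u\}}{P\{|X^{(1)}| > u\}} \int_{(1,\infty)} \frac{P\{\xi^\top X > tu\}}{P\{\xi^\top X > u\}} \, dt.
\]

Moreover, Lemma 5.15(a) implies

\[
\lim_{u\to\infty} \frac{P\{\xi^\top X > u\}}{P\{|X^{(1)}| > u\}} = \frac{\gamma_\xi(X)}{\gamma_{e_1}(X) + \gamma_{-e_1}(X)} = \Psi^*_X g_{\xi,\alpha} \tag{5.53}
\]

and Karamata's theorem (cf. Theorem A.1) yields

\[
\lim_{u\to\infty} \int_{(1,\infty)} \frac{P\{\xi^\top X > tu\}}{P\{\xi^\top X > u\}} \, dt = \int_{(1,\infty)} t^{-\alpha} \, dt = \frac{1}{\alpha - 1}.
\]

As a result, one obtains

\[
\lim_{u\to\infty} \frac{u^{-1} E h_u\left( \xi^\top X \right)}{P\{|X^{(1)}| > u\}} = \frac{1}{\alpha - 1} \Psi^*_X g_{\xi,\alpha}
\]

and, analogously,

\[
\lim_{u\to\infty} \frac{u^{-1} E h_u\left( \xi^\top Y \right)}{P\{|Y^{(1)}| > u\}} = \frac{1}{\alpha - 1} \Psi^*_Y g_{\xi,\alpha}.
\]

Hence (5.50) and (5.49) yield

\begin{align*}
1 &\ge \limsup_{u\to\infty} \frac{u^{-1} E h_u\left( \xi^\top X \right)}{u^{-1} E h_u\left( \xi^\top Y \right)} \\
&= \limsup_{u\to\infty} \left( \frac{u^{-1} E h_u\left( \xi^\top X \right)}{P\{|X^{(1)}| > u\}} \cdot \frac{P\{|Y^{(1)}| > u\}}{u^{-1} E h_u\left( \xi^\top Y \right)} \cdot \frac{P\{|X^{(1)}| > u\}}{P\{|Y^{(1)}| > u\}} \right) \\
&= \frac{\Psi^*_X g_{\xi,\alpha}}{\Psi^*_Y g_{\xi,\alpha}}
\end{align*}

for all $\xi \in \Sigma^d$, which exactly means $\Psi^*_X \preceq_{G_\alpha} \Psi^*_Y$.

Part (b). Note that (5.51) implies

\[
\lim_{t\to\infty} \frac{P\{|X^{(1)}| > t\}}{P\{|Y^{(1)}| > t\}} = 1 \tag{5.54}
\]

and that (5.52) yields

\[
\forall u > u_0 \ \forall v \ge 0 \quad E f_{u+v}\left( \xi^\top X \right) - E f_{u+v}\left( \xi^\top Y \right) \le 0. \tag{5.55}
\]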

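Furthermore, it is easy to see that any random variable $Z$ in $\mathbb{R}_+$ satisfies

\begin{align*}
E[Z \wedge u]
&= \int_{(0,\infty)} (t \wedge u) \, dP_Z(t) \\
&= \int_{(0,\infty)} \int_{(0,\infty)} 1\{s < t\} \cdot 1\{s < u\} \, ds \, dP_Z(t) \\
&= \int_{(0,\infty)} 1\{s < u\} \int_{(0,\infty)} 1\{s < t\} \, dP_Z(t) \, ds \\
&= \int_{(0,u)} P\{Z > s\} \, ds.
\end{align*}

This implies

\[
E f_{u+v}(Z) = E f_u(Z) - \int_{(u,u+v)} P\{Z > t\} \, dt.
\]

Consequently, (5.55) yields

\[
\forall u \ge u_0 \ \forall v > 0 \quad E f_u\left( \xi^\top X \right) - E f_u\left( \xi^\top Y \right) \le I(u, v) \tag{5.56}
\]

where

\begin{align*}
I(u, v) &:= \int_{(u,u+v)} \left( P\{\xi^\top X > t\} - P\{\xi^\top Y > t\} \right) dt \\
&= \int_{(u,u+v)} \varphi(t) \cdot P\{|X^{(1)}| > t\} \, dt
\end{align*}

with

\[
\varphi(t) := \frac{P\{\xi^\top X > t\} - P\{\xi^\top Y > t\}}{P\{|X^{(1)}| > t\}}.
\]

Moreover, (5.54), (5.53), and an analogue of (5.53) for $Y$ yield

\begin{align*}
\varphi(t) &= \frac{P\{\xi^\top X > t\}}{P\{|X^{(1)}| > t\}} - \frac{P\{\xi^\top Y > t\}}{P\{|Y^{(1)}| > t\}} \cdot \frac{P\{|Y^{(1)}| > t\}}{P\{|X^{(1)}| > t\}} \\
&\to \Psi^*_X g_{\xi,\alpha} - \Psi^*_Y g_{\xi,\alpha}, \quad t \to \infty. \tag{5.57}
\end{align*}

Now suppose that $\Psi^*_Y \preceq_{G_\alpha} \Psi^*_X$ is not satisfied, i.e., there exists $\xi \in \Sigma^d$ such that $\Psi^*_Y g_{\xi,\alpha} > \Psi^*_X g_{\xi,\alpha}$. Then (5.57) yields $\varphi(t) \le -\varepsilon$ for some $\varepsilon > 0$ and sufficiently large $t$. This implies

\[
I(u, v) \le -\varepsilon \int_{(u,u+v)} P\{|X^{(1)}| > t\} \, dt \tag{5.58}
\]

for sufficiently large $u$ and all $v \ge 0$. Moreover, regular variation of $|X^{(1)}|$ with tail index $\alpha < 1$ implies $E|X^{(1)}| = \infty$. Consequently, the integral on the right side of (5.58) tends to infinity for $v \to \infty$:

\[
\forall u > 0 \quad \lim_{v\to\infty} \int_{(u,u+v)} P\{|X^{(1)}| > t\} \, dt = \infty.
\]

Hence, choosing $u$ and $v$ sufficiently large, one can achieve $I(u, v) < c$ for any $c \in \mathbb{R}$. In particular, $u$ and $v$ can be chosen such that

\[
I(u, v) < E f_u\left( \xi^\top X \right) - E f_u\left( \xi^\top Y \right),
\]

which contradicts (5.56). Thus $\Psi^*_Y g_{\xi,\alpha} > \Psi^*_X g_{\xi,\alpha}$ cannot be true and there necessarily holds $\Psi^*_Y \preceq_{G_\alpha} \Psi^*_X$. □

The following result is a consequence of ordering criteria mentioned in Remark 5.25 and the invariance of the supermodular order under non-decreasing component transformations (cf. Remark 5.3(c)).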

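Lemma 5.27. Let $\Psi^*_1$ and $\Psi^*_2$ be canonical spectral measures on $\Sigma^d$. Further, for $i = 1, 2$, let $C_i$ denote the copula of the simple max-stable distribution $G^*_i$ induced by $\Psi^*_i$ according to (2.34) and (2.22). Then $C_1 \preceq_{\mathrm{sm}} C_2$ implies

(a) $\Psi^*_1 \preceq_{G_\alpha} \Psi^*_2$ for $\alpha \in (1, \infty)$;

(b) $\Psi^*_2 \preceq_{G_\alpha} \Psi^*_1$ for $\alpha \in (0, 1)$.

Proof. Let $\nu^*_i$ denote the canonical exponent measures corresponding to $\Psi^*_i$ and $G^*_i$. It is easy to see that the transformed measures

\[
\nu_{\alpha,i} := \nu^*_i \circ T^{-1}, \quad i = 1, 2,
\]

with $\alpha > 0$ and the transformation $T$ defined as

\[
T : x \mapsto \left( \left( x^{(1)} \right)^{1/\alpha}, \dots, \left( x^{(d)} \right)^{1/\alpha} \right), \quad x \in \mathbb{R}^d_+,
\]

exhibit the scaling property (2.8) with index $-\alpha$:

\[
\nu_{\alpha,i}(tA) = t^{-\alpha} \nu_{\alpha,i}(A), \quad A \in \mathcal{B}\left( \mathbb{R}^d_+ \setminus \{0\} \right).
\]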

Hence the transformed distributions

\[
G_{\alpha,i}(x) := G^*_i \circ T^{-1}(x) = \exp\left( -\nu_{\alpha,i}\left( [0, x]^c \right) \right) \tag{5.59}
\]

are max-stable with exponent measures $\nu_{\alpha,i}$. It is well known that max-stable distributions with identical heavy-tailed margins are multivariate regularly varying (cf. Resnick, 1987). Moreover, the limit measure $\nu$ in the multivariate regular variation condition (2.7) can be chosen equal to the exponent measure associated with the property of max-stability. Consequently, each probability distribution $G_{\alpha,i}$ for $i = 1, 2$ and $\alpha > 0$ is multivariate regularly varying with tail index $\alpha$ and canonical spectral measure $\Psi^*_i$. Furthermore, it is easy to see that $X \sim G_{\alpha,1}$ and $Y \sim G_{\alpha,2}$ have identical margins in the sense that

\[
X^{(i)} \stackrel{d}{=} Y^{(j)}, \quad i, j \in \{1, \dots, d\}.
\]

Moreover, due to the invariance of $\preceq_{\mathrm{sm}}$ under non-decreasing marginal transformations, $C_1 \preceq_{\mathrm{sm}} C_2$ implies

\[
G_{\alpha,1} \preceq_{\mathrm{sm}} G_{\alpha,2}
\]

for all $\alpha > 0$. Finally, application of the ordering criteria from Remark 5.25 to $X \sim G_{\alpha,1}$ and $Y \sim G_{\alpha,2}$ completes the proof. □

5.5 Examples

This section concludes the chapter with a series of examples with parametric models illustrating some results from the foregoing sections. Examples 5.28 and 5.29 demonstrate the application of Lemma 5.27 to copula based models and the phenomenon of phase change for random vectors in $\mathbb{R}^d_+$. The fact that the phase change does not necessarily occur in the general case is demonstrated by multivariate Student-t distributions in Example 5.30.

Example 5.28. Recall the family of Gumbel copulas introduced in (2.47):

\[
C_\vartheta(u) := \exp\left( -\left( \sum_{i=1}^d \left( -\log u^{(i)} \right)^\vartheta \right)^{1/\vartheta} \right), \quad \vartheta \in [1, \infty). \tag{5.60}
\]

It was already shown in (2.48) that Gumbel copulas are so-called extreme value copulas, i.e., they are copulas of simple max-stable distributions.
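The Gumbel copula family and its extreme value property can be checked numerically. The following is a minimal sketch: the function name `gumbel_copula` is hypothetical, and the extreme value property is tested in the form $C(u^t) = C(u)^t$ for $t > 0$ with the power applied componentwise, which is the standard characterization of extreme value copulas.

```python
import math
from typing import Sequence


def gumbel_copula(u: Sequence[float], theta: float) -> float:
    """Evaluate the d-dimensional Gumbel copula at u, with 0 < u[i] <= 1 and theta >= 1."""
    if theta < 1.0:
        raise ValueError("theta must be at least 1")
    inner = sum((-math.log(ui)) ** theta for ui in u)
    return math.exp(-(inner ** (1.0 / theta)))


u = (0.3, 0.7)
t = 2.5

# theta = 1 gives the independence copula, i.e. the product of the arguments.
print(gumbel_copula(u, 1.0), u[0] * u[1])

# Extreme value property: C(u**t) = C(u)**t with componentwise powers.
lhs = gumbel_copula([ui ** t for ui in u], 2.0)
rhs = gumbel_copula(u, 2.0) ** t
print(lhs, rhs)
```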

According to Wei and Hu (2002), Gumbel copulas with dependence parameter $\vartheta \in [1, \infty)$ are ordered by $\preceq_{\mathrm{sm}}$:

\[
\forall \vartheta_1, \vartheta_2 \in [1, \infty) \quad \vartheta_1 \le \vartheta_2 \Rightarrow C_{\vartheta_1} \preceq_{\mathrm{sm}} C_{\vartheta_2}. \tag{5.61}
\]

Consequently, Lemma 5.27 applies to the family of canonical spectral measures $\Psi^*_\vartheta$ corresponding to the Gumbel copulas $C_\vartheta$. Thus $1 \le \vartheta_1 \le \vartheta_2 < \infty$ implies $\Psi^*_{\vartheta_1} \preceq_{G_\alpha} \Psi^*_{\vartheta_2}$ for $\alpha > 1$ and there is a phase change when $\alpha$ crosses the value 1, i.e., for $\alpha \in (0, 1)$ there holds $\Psi^*_{\vartheta_2} \preceq_{G_\alpha} \Psi^*_{\vartheta_1}$.

Applying Theorem 5.18, one obtains ordering with respect to $\preceq_{\mathrm{apl}}$ for random vectors $X$ and $Y$ on $\mathbb{R}^d_+$ that are multivariate regularly varying with canonical spectral measures of Gumbel type and have balanced tails ordered by $\preceq_{\mathrm{apl}}$. In particular, this is the case if $X$ and $Y$ have identical regularly varying marginal distributions and Archimedean copulas that satisfy appropriate regularity conditions (cf. Remark 2.5).

Figure 5.1 illustrates the resulting diversification effects in the bivariate case, including indifference to portfolio diversification for $\alpha = 1$ and the phase change occurring when $\alpha$ crosses this critical value. The graphics show the function $\xi^{(1)} \mapsto \Psi^*_\vartheta g_{\xi,\alpha}$ for selected values of $\vartheta$ and $\alpha$. Due to $X \in \mathbb{R}^d_+$, representation $\Psi^*_\vartheta g_{\xi,\alpha} = \gamma_\xi / (\gamma_{e_1} + \gamma_{-e_1})$ simplifies to $\Psi^*_\vartheta g_{\xi,\alpha} = \gamma_\xi / \gamma_{e_1}$ and therefore

\[
\Psi^*_\vartheta g_{e_1,\alpha} = \Psi^*_\vartheta g_{e_2,\alpha} = 1.
\]

As already mentioned above, Theorem 5.23 generalizes some arguments from Embrechts et al. (2009b). The next example presents the model originally addressed by the authors.

Example 5.29. Another family of extreme value copulas that are ordered by $\preceq_{\mathrm{sm}}$ is the family of $d$-dimensional Galambos copulas with parameter $\vartheta \in (0, \infty)$:

\[
C_\vartheta(u) := \exp\left( \sum_{I \subset \{1, \dots, d\}} (-1)^{|I|} \left( \sum_{i \in I} \left( -\log u^{(i)} \right)^{-\vartheta} \right)^{-1/\vartheta} \right). \tag{5.62}
\]

According to Wei and Hu (2002), $\vartheta_1 \le \vartheta_2$ implies $C_{\vartheta_1} \preceq_{\mathrm{sm}} C_{\vartheta_2}$. Thus Lemma 5.27 yields ordering of the corresponding canonical spectral measures $\Psi^*_\vartheta$ with respect to $\preceq_{G_\alpha}$. Similarly to the case of Gumbel copulas, $\vartheta_1 \le \vartheta_2$ implies $\Psi^*_{\vartheta_1} \preceq_{G_\alpha} \Psi^*_{\vartheta_2}$ for $\alpha > 1$ and $\Psi^*_{\vartheta_2} \preceq_{G_\alpha} \Psi^*_{\vartheta_1}$ for $\alpha \in (0, 1)$.

Finally, it is worth remarking that Galambos copulas correspond to the canonical exponent measures of random vectors $X$ in $\mathbb{R}^d_+$ with identically distributed regularly varying margins $X^{(i)}$ and dependence structure of $-X$ given by an Archimedean copula with a regularly varying generator $\phi(1 - 1/t)$.
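The inclusion-exclusion form (5.62) can be evaluated directly for small $d$. The following is a minimal sketch, assuming that the outer sum in (5.62) ranges over the non-empty subsets $I$ (the empty set would give an undefined term); the function name `galambos_copula` is hypothetical. For $d = 2$ the result is compared with the familiar bivariate closed form $u v \exp\{((-\log u)^{-\vartheta} + (-\log v)^{-\vartheta})^{-1/\vartheta}\}$.

```python
import math
from itertools import combinations
from typing import Sequence


def galambos_copula(u: Sequence[float], theta: float) -> float:
    """Evaluate the Galambos copula (5.62) at u, with 0 < u[i] < 1 and theta > 0."""
    if theta <= 0.0:
        raise ValueError("theta must be positive")
    x = [-math.log(ui) for ui in u]
    exponent = 0.0
    # Assumption: the sum runs over all non-empty index subsets I.
    for size in range(1, len(x) + 1):
        for subset in combinations(x, size):
            inner = sum(xi ** (-theta) for xi in subset)
            exponent += (-1) ** size * inner ** (-1.0 / theta)
    return math.exp(exponent)


u, v, theta = 0.3, 0.7, 1.5
closed_form = u * v * math.exp(
    ((-math.log(u)) ** (-theta) + (-math.log(v)) ** (-theta)) ** (-1.0 / theta)
)
print(galambos_copula((u, v), theta), closed_form)
```

Evaluating the function for increasing values of `theta` at a fixed argument gives non-decreasing values, in line with the ordering of the family in the dependence parameter.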

[Figure 5.1, four panels: normalized extreme risk index plotted against $\xi^{(1)} \in [0, 1]$. (a) Varying $\alpha \in \{6, 3, 1.5, 1, 0.8, 0.6\}$ for $\vartheta = 1.4$. (b) Varying $\alpha \in \{6, 3, 1.5, 1, 0.8, 0.6\}$ for $\vartheta = 2$. (c) Varying $\vartheta \in \{1.2, 1.4, 1.6, 2, 4, 6\}$ for $\alpha = 3 > 1$. (d) Varying $\vartheta \in \{1.2, 1.4, 1.6, 2, 4, 6\}$ for $\alpha = 0.6 < 1$.]

Figure 5.1 Bivariate Gumbel copulas: Diversification effects represented by functions $\xi^{(1)} \mapsto \Psi^*_\vartheta g_{\xi,\alpha}$ for selected values of $\vartheta$ and $\alpha$.

Models of this type were discussed in recent studies of aggregation effects for extreme risks (cf. Alink et al., 2004, 2005; Nešlehová et al., 2006; Barbe et al., 2006; Embrechts et al., 2009a,b).

The final example of this chapter illustrates results established in Lemmas 5.11 and 5.19. In particular, it shows that elliptical distributions do not exhibit a phase change at $\alpha = 1$.

Example 5.30. Recall multivariate Student-t distributions introduced in Example 2.7 and consider the case with equal degrees of freedom, i.e.,

\[
X \stackrel{d}{=} \mu_X + R \cdot A_X \cdot U, \quad Y \stackrel{d}{=} \mu_Y + R \cdot A_Y \cdot U, \tag{5.63}
\]

where $R \stackrel{d}{=} |Z|$ for a Student-t distributed random variable $Z$ with degrees of freedom equal to $\alpha \in (0, \infty)$. Further, let the generalized covariance matrices $C_X = C(\rho_X)$ and $C_Y = C(\rho_Y)$ be defined as

\[
C(\rho) := \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} \tag{5.64}
\]

and assume that $\rho_X \le \rho_Y$. Then $C_X$ and $C_Y$ satisfy condition (5.9) and Lemma 5.11 yields $X \preceq_{\mathrm{apl}} Y$. Moreover, Lemma 5.19 implies a uniform ordering of diversification effects in the sense that

\[
\Psi^*_X = \Psi^*_{\alpha,\rho_X} \preceq_{G_\alpha} \Psi^*_{\alpha,\rho_Y} = \Psi^*_Y
\]

for all $\alpha \in (0, \infty)$.

Figure 5.2 shows functions $\xi^{(1)} \mapsto \Psi^*_{\alpha,\rho} g_{\xi,\alpha}$ for selected parameter values $\rho$ and $\alpha$ that illustrate the ordering of asymptotic portfolio losses by $\rho$ and the missing phase change at $\alpha = 1$. The indifference to portfolio diversification for $\alpha = 1$ is also absent. Moreover, symmetry of elliptical distributions implies $\gamma_{-e_1} = \gamma_{e_1}$ and, as a result,

\[
\Psi^*_{\alpha,\rho} g_{e_1,\alpha} = \Psi^*_{\alpha,\rho} g_{e_2,\alpha} = 1/2.
\]

Thus the standardization of the plots in Figure 5.2 is different from that in Figure 5.1.

[Figure 5.2, four panels: normalized extreme risk index plotted against $\xi^{(1)} \in [0, 1]$. (a) Varying $\alpha \in \{0.6, 0.8, 1, 1.2, 1.6, 2\}$ for $\rho = 0.5 > 0$. (b) Varying $\alpha \in \{0.6, 0.8, 1, 1.2, 1.6, 2\}$ for $\rho = -0.5 < 0$. (c) Varying $\rho \in \{-0.6, -0.3, 0, 0.3, 0.6\}$ for $\alpha = 2 > 1$. (d) Varying $\rho \in \{-0.6, -0.3, 0, 0.3, 0.6\}$ for $\alpha = 0.5 < 1$.]

Figure 5.2 Bivariate elliptical distributions with generalized covariance matrices defined in (5.64): Diversification effects represented by functions $\xi^{(1)} \mapsto \Psi^*_{\alpha,\rho} g_{\xi,\alpha}$ for selected values of $\rho$ and $\alpha$.

Remark 5.31. All examples the author is aware of (some of them are presented in this thesis) suggest that the diversification coefficient $\Psi^* g_{\xi,\alpha}$ is decreasing in $\alpha$. This means that risk diversification is stronger for lighter component tails than for heavier ones. However, it should be noted that the influence of the tail index $\alpha$ on risk aggregation is different from that. The asymptotic risk aggregation coefficient

\[
q_d := \lim_{t\to\infty} \frac{P\{X^{(1)} + \dots + X^{(d)} > t\}}{P\{X^{(1)} > t\}}
\]

introduced by Wüthrich (2003) is known to be increasing in $\alpha$ when the loss components $X^{(i)}$ are non-negative (cf. Barbe et al., 2006). It is easy to see that the restriction to non-negative $X^{(i)}$ implies

\[
q_d = \lim_{t\to\infty} \frac{P\{\|X\|_1 > t\}}{P\{X^{(1)} > t\}} = \frac{1}{\gamma_{e_1}}.
\]

Moreover, denoting the uniformly diversified portfolio by $\eta$,

\[
\eta := d^{-1}(1, \dots, 1),
\]

one obtains

\[
q_d = \lim_{t\to\infty} \frac{P\{\eta^\top X > d^{-1} t\}}{P\{X^{(1)} > t\}} = d^\alpha \frac{\gamma_\eta}{\gamma_{e_1}}.
\]

Thus $q_d$ is a product of the factor $d^\alpha$, which is increasing in $\alpha$, and the ratio $\gamma_\eta / \gamma_{e_1}$, which is closely related to the diversification coefficient $\Psi^* g_{\xi,\alpha}$. In particular, given equal marginal weights, i.e.,

\[
\gamma_{e_1} = \dots = \gamma_{e_d},
\]

Lemma 5.15(a) yields

\[
\frac{\gamma_\eta}{\gamma_{e_1}} = \Psi^* g_{\eta,\alpha}.
\]

As already mentioned above, the coefficients $\Psi^* g_{\xi,\alpha}$ with $\xi \in \Sigma^d$ are decreasing in $\alpha$ in all examples considered here. This means that the aggregation and the diversification of risks are influenced by the tail index $\alpha$ in different, maybe even always contrary ways.

Whether this contrary influence holds in general is an open question. One can easily prove that the extreme risk index $\gamma_\xi = \Psi f_{\xi,\alpha}$ is decreasing in $\alpha$ for $\xi \in \Sigma^d$. However, this result cannot be extended to $\Psi^* g_{\xi,\alpha}$ directly since $\Psi^* g_{\xi,\alpha}$ is related to $\Psi f_{\xi,\alpha}$ by the normalizations (5.29) and (5.30). Currently, it remains unclear whether $\Psi^* g_{\xi,\alpha}$ with arbitrary $\xi \in \Sigma^d$ or at least $\Psi^* g_{\eta,\alpha}$ is generally decreasing in $\alpha$. This is an interesting subject for further research.

Chapter 6

Modelling and simulation

This final chapter presents a simulation study that illustrates the results of Chapters 3 and 4 concerning the estimation and the minimization of the extreme risk index $\gamma_\xi$. The simulation results give insight into several aspects of the estimation and optimization procedures. Particular attention is paid to the bias of the estimated tail index $\hat\alpha$ and the inversion of diversification effects at $\alpha = 1$ for random vectors with non-negative components.

The chapter is organized as follows. Section 6.1 outlines the objectives and the design of the simulation study. A brief introduction to the estimation of the tail index $\alpha$ is given in Section 6.2, whereas specifications of the implemented models are provided in Section 6.3. The simulation results are presented in Section 6.4. Finally, conclusions are outlined in Section 6.5.

6.1 Objectives and design

The primary goal of the simulation study presented in this chapter is a practical performance test for the portfolio optimization approach proposed in Section 4.1. There are several questions worth considering. In particular, it is of great interest to learn more about the reliability of the estimators $\hat\gamma_\xi$ and $\hat\xi_{\mathrm{opt}}$ in applications. Furthermore, since estimates of the tail index $\alpha$ may be considerably biased, the influence of this bias on $\hat\gamma_\xi$ and $\hat\xi_{\mathrm{opt}}$ should be well understood. Finally, it is also important to gain an impression of the variability of the estimator $\hat\gamma_\xi$ and the way it is influenced by $\hat\alpha$ and $\hat\Psi$.

Besides developing an intuition of estimation errors, evaluation of the portfolio optimization approach with respect to its final results is needed. A practitioner would certainly like to know whether the estimated optimal portfolio $\hat\xi_{\mathrm{opt}}$ is useful for the minimization of the true extreme risk index $\gamma_\xi$ and to what extent the available diversification effects are utilized. Moreover, it is interesting to see how the optimization results are influenced by dependence structures and the inversion of diversification effects for $\alpha = 1$ in case of $\mathbb{R}^d_+$-valued random vectors. Finally, a simulation study may help to discover remarkable effects and to find directions for further research.

In addition to the reliability of estimators and portfolio optimization results, the simulation example for random vectors in $\mathbb{R}^d_+$ illustrates a modelling approach that is flexible enough to implement arbitrary spectral measures. This flexibility considerably extends the palette of commonly used models. Thus, if the asymptotic dependence structures of well-known parametric models do not accord with the data, the practitioner can easily build a custom model that satisfies all demands.

The design of the simulation study is the following. Two bivariate models are implemented, representing the pure loss case, i.e., random vectors in $\mathbb{R}^d_+$, and the general loss-gain case with random vectors in $\mathbb{R}^d$. To keep the examples simple and demonstrative, each model features only two parameters. The pure loss example implements the tail index $\alpha$ and a second-order parameter that is crucial for the estimation of $\alpha$. The spectral measure is fixed. On the other hand, the loss-gain example implements the tail index $\alpha$ and a dependence parameter that forms the diversification effects. The estimation of the tail index is based on the Hill estimator $\hat\alpha_H$. This well-known estimation approach was already mentioned in Section 4.5.

To answer the questions outlined above, the corresponding characteristics are calculated for varying parameter values. The bias of $\hat\gamma_\xi$ and the contributions of $\hat\alpha_H$ and $\hat\Psi$ to it are addressed by calculation of the expectations $E\hat\gamma_\xi$, $E\hat\Psi f_{\xi,\alpha}$, and $E\hat\alpha_H$. Typical ranges of the estimates are assessed by empirical quantiles (2.5% and 97.5%) for $\hat\alpha_H$, $\hat\gamma_\xi$, $\hat\Psi f_{\xi,\alpha}$, and $\hat\xi^{(1)}_{\mathrm{opt}}$. Since these quantile ranges provide an intuitive characterization for the variability of the estimators mentioned above, calculation of variances is omitted. In addition to the quantiles, total maximal and minimal values from all simulation runs are recorded for $\hat\gamma_\xi$.

The expectations and the empirical quantiles are approximated by Monte Carlo experiments with 5000 simulation runs. In each simulation run, i.i.d. samples of size $n = 10000$ are simulated. The simulation results related to the estimators $\hat\alpha$, $\hat\gamma_\xi$, and $\hat\xi_{\mathrm{opt}}$ are illustrated in graphics. Results concerning the quality of portfolio optimization are presented in tables.

6.2 Estimation of the tail index

This section outlines some basic results and techniques related to the estimation of the tail index $\alpha$, which is an essential part of the plug-in estimator $\hat\gamma_\xi = \hat\Psi f_{\xi,\hat\alpha}$. Brief introductions are given to the second-order regular variation conditions, the asymptotic normality of the Hill estimator, and the choice of the tail fraction size $k = k(n)$ in practical applications with fixed sample size $n$.

Second-order conditions

A crucial point in the semiparametric approach underlying the extreme risk index $\gamma_\xi$ is the estimation of the tail index $\alpha$. One should bear in mind that the original multivariate regular variation assumption and the resulting regular variation of the radial part $R$ do not specify the convergence rate. Consequently, the convergence

\[
\frac{P\{R > tx\}}{P\{R > t\}} \to x^{-\alpha}, \quad t \to \infty, \tag{6.1}
\]

may be arbitrarily slow and the reliability of any tail index estimator $\hat\alpha$ may be arbitrarily poor if the first-order condition (6.1) is not strengthened by a second-order condition specifying the quality of approximation. Let $F_R$ denote the distribution function of the radial part $R$. Then a second-order condition consistent with (6.1) can be formulated as follows.

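Condition 6.1. There exist $\beta \le 0$ and a function $A(t)$ such that $A(t) \to 0$ as $t \to \infty$, the sign of $A(t)$ is constant for sufficiently large $t$, and

\[
\lim_{t\to\infty} \frac{\dfrac{1 - F_R(tx)}{1 - F_R(t)} - x^{-\alpha}}{A(t)} = x^{-\alpha} \, \frac{x^{\alpha\beta} - 1}{\beta/\alpha}. \tag{6.2}
\]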

Remark 6.2. Sufficient criteria for (6.2) can be formulated in terms of power expansions

\[
1 - F_R(t) = t^{-\alpha} \left( c_1 + c_2 t^{\alpha\beta} (1 + o(1)) \right), \quad c_1 > 0, \ c_2 \ne 0. \tag{6.3}
\]

It is easy to see that (6.3) with $\alpha > 0$ and $\beta < 0$ implies (6.2) with the same parameters $\alpha$ and $\beta$. The normalizing function $A$ can be chosen as

\[
A(t) := \frac{\beta c_2}{\alpha c_1} t^{\alpha\beta}. \tag{6.4}
\]
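The claim of Remark 6.2 can be checked numerically for an exact power expansion. The following is a minimal sketch, assuming the tail $1 - F_R(t) = t^{-\alpha}(c_1 + c_2 t^{\alpha\beta})$ for large $t$ (i.e., (6.3) without the $o(1)$ term) and the normalizing function (6.4). The function names and the parameter values are chosen for illustration only.

```python
def tail(t: float, alpha: float, beta: float, c1: float, c2: float) -> float:
    """Assumed tail 1 - F_R(t) = t**(-alpha) * (c1 + c2 * t**(alpha * beta)) for large t."""
    return t ** (-alpha) * (c1 + c2 * t ** (alpha * beta))


def second_order_lhs(t: float, x: float, alpha: float, beta: float,
                     c1: float, c2: float) -> float:
    """Left-hand side of (6.2) at a finite t, with A(t) chosen as in (6.4)."""
    a_t = beta * c2 / (alpha * c1) * t ** (alpha * beta)
    ratio = tail(t * x, alpha, beta, c1, c2) / tail(t, alpha, beta, c1, c2)
    return (ratio - x ** (-alpha)) / a_t


def second_order_rhs(x: float, alpha: float, beta: float) -> float:
    """Right-hand side of (6.2) for beta < 0."""
    return x ** (-alpha) * (x ** (alpha * beta) - 1.0) / (beta / alpha)


alpha, beta, c1, c2, x = 2.0, -0.5, 1.0, 0.5, 2.0
for t in (1e1, 1e2, 1e4):
    print(t, second_order_lhs(t, x, alpha, beta, c1, c2), second_order_rhs(x, alpha, beta))
```

With these parameter values the right-hand side equals 0.5, and the left-hand side approaches it as $t$ grows.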

For further details on the second-order regular variation and alternatives to the formulation (6.2) the reader is referred to de Haan and Ferreira (2006).

Hill's estimator: consistency and asymptotic normality

The estimation of the tail index $\alpha$ in the present simulation study is based on the approach introduced by Hill (1975). The Hill estimator

\[
\hat\alpha_H := \frac{k}{\sum_{i=1}^k \log\left( R_{n:i} / R_{n:k+1} \right)} \tag{6.5}
\]

is one of the most popular and frequently used estimators for the tail index $\alpha$. It is easy to see that $\hat\alpha_H$ is scale invariant in the sense that

\[
\hat\alpha_H(R_1, \dots, R_n) = \hat\alpha_H(K R_1, \dots, K R_n)
\]

for any constant factor $K > 0$. Furthermore, it is well known (cf. de Haan and Ferreira, 2006) that $\hat\alpha_H$ is not shift invariant and that in some cases adding a shift to the sample $R_1, \dots, R_n$ considerably increases the bias of $\hat\alpha_H$. On the other hand, this effect can be used for bias reduction.

Being one of the earliest and most intensively studied ones, Hill's estimator often serves as a benchmark for other estimation approaches. Another result justifying the extraordinary role of the Hill estimator is obtained by Mason (1982), who proved that weak consistency of $\hat\alpha_H$ is equivalent to the regular variation assumption (6.1). Moreover, Hill's estimator is asymptotically normal under the second-order condition (6.2) and an additional condition on the choice of $k = k(n)$:

\[
\lim_{n\to\infty} \sqrt{k} \, A\left( F_R^{\leftarrow}(1 - k/n) \right) = \lambda \in \mathbb{R}. \tag{6.6}
\]

If these conditions are satisfied, then there holds

\[
\sqrt{k} \left( \frac{1}{\hat\alpha_H} - \frac{1}{\alpha} \right) \stackrel{w}{\to} N\left( \frac{\lambda}{1 - \beta}, \frac{1}{\alpha^2} \right) \tag{6.7}
\]

(cf. de Haan and Ferreira, 2006, and references therein). Applying the delta method, one obtains

\[
\sqrt{k} \left( \hat\alpha_H - \alpha \right) \stackrel{w}{\to} N\left( \frac{\alpha^2 \lambda}{\beta - 1}, \alpha^2 \right). \tag{6.8}
\]

A simplified version of (6.8) was derived in (4.87), with assumption (4.83) replacing the effect of conditions (6.2) and (6.6). Moreover, deriving (6.8) from (6.2) and (6.6), one obtains (4.83) as a by-product. Applying Corollary 4.22, one obtains that the Hill estimator $\hat\alpha_H$ satisfies the conditions of Theorem 4.8. This is an additional argument for the suitability of the Hill estimator in the present simulation study.
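The estimator (6.5) and its invariance properties can be sketched as follows. This is a minimal illustration, assuming that $R_{n:1} \ge R_{n:2} \ge \dots$ denote the order statistics in descending order and that the sample is exactly Pareto distributed, so that no second-order bias is present. The function name `hill_estimator`, the seed, and the parameter values are hypothetical choices.

```python
import math
import random
from typing import Sequence


def hill_estimator(sample: Sequence[float], k: int) -> float:
    """Hill estimator (6.5) of the tail index, based on the k largest observations."""
    if not 1 <= k < len(sample):
        raise ValueError("k must satisfy 1 <= k < n")
    ordered = sorted(sample, reverse=True)  # assumed convention: R_{n:1} >= R_{n:2} >= ...
    threshold = ordered[k]                  # R_{n:k+1}
    log_excess = sum(math.log(r / threshold) for r in ordered[:k])
    return k / log_excess


rng = random.Random(2010)
alpha = 2.0
# paretovariate(alpha) has tail P{R > t} = t**(-alpha) for t >= 1.
sample = [rng.paretovariate(alpha) for _ in range(10_000)]

est = hill_estimator(sample, 500)
scaled = hill_estimator([7.5 * r for r in sample], 500)   # scale invariance
shifted = hill_estimator([r + 5.0 for r in sample], 500)  # no shift invariance
print(est, scaled, shifted)
```

A positive shift shrinks every ratio $R_{n:i} / R_{n:k+1}$ towards one, so the shifted estimate is larger than the unshifted one, whereas a rescaling by $K > 0$ cancels in the ratios.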

Remark 6.3. Recall that regular variation of the function $1 - F_R(t)$ for $t \to \infty$ with index $-\alpha$ implies regular variation of the function $F_R^{\leftarrow}(1 - u)$ for $u \downarrow 0$ with index $-1/\alpha$. Thus, if $F_R$ satisfies condition (6.3), then (6.4) yields

\[
A\left( F_R^{\leftarrow}(1 - k/n) \right) \sim \left( (k/n)^{-1/\alpha} \right)^{\alpha\beta} = (k/n)^{-\beta}.
\]

If condition (6.6) is satisfied with $\lambda \ne 0$, this implies $k^{1/2} \sim (k/n)^{\beta}$ and therefore $k \sim n^{-2\beta/(1-2\beta)}$. Hence the convergence rate in the asymptotic normality result (6.8) is equal to $n^{\beta/(1-2\beta)}$.
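The exponents in Remark 6.3 are easy to tabulate. The following is a minimal sketch with hypothetical function names; it only evaluates the two exponents for a given second-order parameter $\beta < 0$ and ignores the constants hidden in the relation $\sim$.

```python
def subsample_exponent(beta: float) -> float:
    """Exponent e such that k ~ n**e when (6.6) holds with lambda != 0."""
    if beta >= 0.0:
        raise ValueError("beta must be negative")
    return -2.0 * beta / (1.0 - 2.0 * beta)


def rate_exponent(beta: float) -> float:
    """Exponent r such that the convergence rate k**(-1/2) in (6.8) is n**r."""
    return beta / (1.0 - 2.0 * beta)


for beta in (-0.25, -0.5, -1.0, -2.0):
    print(beta, subsample_exponent(beta), rate_exponent(beta))
```

For $\beta = -0.5$ this gives $k \sim n^{1/2}$ and the rate $n^{-1/4}$. Values of $\beta$ closer to zero lead to smaller admissible subsamples and slower rates.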

Choice of the extreme subsample: the Hill plot

As shown in (6.8), the asymptotic bias of $\hat\alpha_H$ is influenced by the second-order parameter $\beta$ and the parameter $\lambda$ related to the choice of the extreme subsample size $k = k(n)$. Although these results allow adaptive choice of $k$ in the asymptotic setting (cf. Hall and Welsh, 1985; Drees and Kaufmann, 1998; Beirlant et al., 1999; Daníelsson et al., 2001), the choice of $k$ in practical applications with fixed sample size $n$ still depends on the statistician's personal intuition. One of the most basic approaches to the necessary trade-off between the bias resulting from choosing $k$ too large and the variance resulting from choosing $k$ too small is the Hill plot, where estimated values $\hat\alpha_H$ are plotted against the extreme subsample size $k$. Figure 6.1 shows an example of a Hill plot obtained from a simulated sample of $n = 10000$ random variables $R_i = |Y_i|$, where $Y_i$ are i.i.d. Student-t distributed with $\eta = 3$ degrees of freedom and, consequently, regularly varying with tail index $\alpha = 3$ (cf. Example 2.7).

It is worth remarking that the Hill plot may be very misleading if the heavy-tail assumption is not satisfied. For explicit examples illustrating this drawback and for an introduction to exploratory techniques for verifying the heavy-tail property in data sets, the reader is referred to Embrechts et al. (1997).
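The data behind a Hill plot such as the one described above can be produced along the following lines. This is a minimal sketch, not the code used in the study: it assumes the usual form of the Hill estimator, $\hat\alpha_H = k / \sum_{i=1}^{k} \log(R_{(i)}/R_{(k+1)})$ with descending order statistics $R_{(1)} \ge R_{(2)} \ge \ldots$, which may differ in indexing from the definition given earlier in the chapter. The function names and the seed are hypothetical.

```python
import math
import random


def hill_estimates(sample, k_values):
    """Hill estimates of the tail index for several subsample sizes k.

    Assumed form: alpha_hat(k) = k / sum_{i=1..k} log(R_(i) / R_(k+1)),
    where R_(1) >= R_(2) >= ... are the descending order statistics.
    """
    ordered = sorted(sample, reverse=True)
    logs = [math.log(r) for r in ordered]
    # prefix[k] = log R_(1) + ... + log R_(k)
    prefix = [0.0]
    for value in logs:
        prefix.append(prefix[-1] + value)
    estimates = {}
    for k in k_values:
        mean_log_excess = prefix[k] / k - logs[k]  # logs[k] is log R_(k+1)
        estimates[k] = 1.0 / mean_log_excess
    return estimates


def abs_student_t(df, rng):
    """One draw of |Y| with Y Student-t, using Y = Z / sqrt(chi2_df / df)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square with df degrees of freedom
    return abs(z) / math.sqrt(chi2 / df)


rng = random.Random(2010)  # arbitrary seed, for reproducibility only
n = 10000
sample = [abs_student_t(3.0, rng) for _ in range(n)]
hill_plot = hill_estimates(sample, range(10, 1001))  # pairs (k, alpha_hat(k))
for k in (50, 200, 400, 1000):
    print(k, round(hill_plot[k], 3))
```

Plotting the values of `hill_plot` against `k` gives the restricted view of panel (b) in Figure 6.1. Using cumulative sums of the log order statistics keeps the cost of the whole curve at one sort plus one pass over the data.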

6.3 Models

The present section introduces the two models implemented in the simulation study, representing the pure loss case, i.e., random vectors in $\mathbb{R}_+^2$, and the loss-gain case with sample space $\mathbb{R}^2$. The specification of the models is complemented by notes on the first-order and second-order regular variation of the radial parts, spectral measures, and example sample plots.

Figure 6.1 Hill plot generated from n = 10000 random variables $R_1,\ldots,R_n$, where $R_i = |Y_i|$ and $Y_i$ are i.i.d. Student-t distributed with η = 3 degrees of freedom. Panels (alpha plotted against k; true value and estimated value): (a) k between 10 and n = 10000; (b) k between 10 and n/10 = 1000.

Pure loss case

The case when the random vector $X$ takes values in $\mathbb{R}_+^d$ is represented by a model constructed from the distribution of the radial part $R = \|X\|_1$ and the conditional distribution of the angular part $\mathcal{L}(S \mid F_R(R) = u)$ for $u \in [0, 1]$. The radial part is simulated from the Burr distribution with parameters $a, b > 0$:

\[ F_R(t) = P\{R \le t\} = 1 - (1 + t^a)^{-b} =: F_{a,b}(t) \tag{6.9} \]

(cf. Burr, 1942; Tadikamalla, 1980; Cook and Johnson, 1981; Kortschak and Albrecher, 2009b). It is easy to see that the Burr(a, b) distribution is regularly varying with tail index $\alpha = ab$:

\[ 1 - F_{a,b}(t) = (1 + t^a)^{-b} = t^{-ab} \left(1 + t^{-a}\right)^{-b}. \]

Moreover, differentiability of the mapping $u \mapsto (1+u)^{-b}$ at $u = 0$ yields

\[ (1+u)^{-b} = 1 + u \cdot (-b + o(1)), \quad u \to 0, \]

and therefore

\[ \left(1 + t^{-a}\right)^{-b} = 1 + t^{-a}(-b + o(1)), \quad t \to \infty. \]

Thus the Burr(a, b) distribution satisfies condition (6.3) and, as a consequence, the second-order condition (6.2). The second-order parameter $\beta$ is equal to

\[ \beta = \frac{-a}{\alpha} = -\frac{1}{b}. \]

The dependence structure of the random vector $X = RS$ is characterized by the conditional probability distribution $\mathcal{L}(S \mid F_R(R))$ of the angular part $S = (S^{(1)}, S^{(2)})$. Due to $S^{(2)} = 1 - S^{(1)}$, it suffices to specify $\mathcal{L}(S^{(1)} \mid F_R(R))$. In the present simulation example, $S^{(1)}$ is obtained by conditioning a mixture of two Gaussian distributions to the interval $[0, 1]$:

\[ \mathcal{L}\bigl(S^{(1)} \mid F_R(R) = u\bigr) = \mathcal{L}\bigl(Z_u \mid Z_u \in [0, 1]\bigr), \]

where the probability measure $\mathcal{L}(Z_u)$ is defined as

\[ \mathcal{L}(Z_u) := \lambda(u) \cdot \mathcal{N}\bigl(\mu_1(u), \sigma_1^2(u)\bigr) + (1 - \lambda(u)) \cdot \mathcal{N}\bigl(\mu_2(u), \sigma_2^2(u)\bigr) \]

with

\[ \lambda(u) := 0.5 - 0.2u, \quad \mu_1(u) := 0.3u, \quad \sigma_1(u) := 0.4 - 0.3u, \quad \mu_2(u) := 1 - 0.4u, \quad \sigma_2(u) := 0.4 - 0.1u. \]

Figure 6.2 Pure loss case: densities of the conditional distributions $\mathcal{L}(S^{(1)} \mid F_R(R) = u)$ for selected values of u. Density plotted against s_1 on [0, 1] for u = 0, 0.2, 0.4, 0.6, 0.8, 1.

It is easy to see that the random vector $X$ with the radial part $R$ and the angular part $S$ defined above is multivariate regularly varying with tail index $\alpha = ab$ and spectral measure $\Psi$ on $\Sigma^2$ given by

\[ \Psi = \mathcal{L}\bigl((Z_1, 1 - Z_1) \mid Z_1 \in [0, 1]\bigr). \]

Figure 6.2 shows density plots of the conditional probability distributions $\mathcal{L}(S^{(1)} \mid F_R(R) = u)$ for selected values of $u$. Example plots of simulated i.i.d. samples $X_1,\ldots,X_n$ for $n = 10000$ are shown in Figure 6.3.
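The construction of the pure loss model can be sketched in code as follows. This is an illustration under stated assumptions, not the implementation used in the study: the radial part is drawn by inverting the Burr distribution function (6.9), the uniform variable `u` is then used as the value of $F_R(R)$, and the conditioning of $Z_u$ on $[0, 1]$ is realized by simple rejection. All function names and the seed are hypothetical.

```python
import random


def burr_quantile(u, a, b):
    """Inverse of F_{a,b}(t) = 1 - (1 + t**a)**(-b) for u in [0, 1)."""
    return ((1.0 - u) ** (-1.0 / b) - 1.0) ** (1.0 / a)


def burr_cdf(t, a, b):
    """Burr(a, b) distribution function from (6.9)."""
    return 1.0 - (1.0 + t ** a) ** (-b)


def sample_angular_part(u, rng):
    """Draw S^(1) from the Gaussian mixture Z_u conditioned on [0, 1]."""
    lam = 0.5 - 0.2 * u
    mu1, sigma1 = 0.3 * u, 0.4 - 0.3 * u
    mu2, sigma2 = 1.0 - 0.4 * u, 0.4 - 0.1 * u
    while True:  # rejection step: retry until Z_u falls into [0, 1]
        if rng.random() < lam:
            z = rng.gauss(mu1, sigma1)
        else:
            z = rng.gauss(mu2, sigma2)
        if 0.0 <= z <= 1.0:
            return z


def sample_pure_loss(n, a, b, rng):
    """Simulate n observations X = R * (S1, 1 - S1) of the pure loss model."""
    points = []
    for _ in range(n):
        u = rng.random()            # u plays the role of F_R(R)
        r = burr_quantile(u, a, b)  # inversion method for the radial part
        s1 = sample_angular_part(u, rng)
        points.append((r * s1, r * (1.0 - s1)))
    return points


rng = random.Random(1)  # arbitrary seed
alpha, b = 3.0, 1.4
sample = sample_pure_loss(10000, alpha / b, b, rng)  # a = alpha / b since alpha = a * b
```

Because the radial part is generated by inversion, no separate evaluation of $F_R(R)$ is needed: the uniform input and the value of the distribution function at the output coincide.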

Loss-gain case

The case of random variables with values in $\mathbb{R}^d$ is represented by the centred, bivariate Student-t distribution, parametrized by the degrees of freedom $\eta$ and the generalized covariance matrix $C$ (cf. Definition 2.6 and Example 2.7). Given $\eta \in (0, \infty)$ and a symmetric, positive semi-definite matrix $C \in \mathbb{R}^{2 \times 2}$, the corresponding centred, bivariate Student-t random vector $X$ is defined by

\[ X \overset{d}{=} Y \cdot A \cdot U, \tag{6.10} \]

where $U \sim \mathrm{unif}(\mathbb{S}_2^2)$, $C = A \cdot A^{\top}$, and $Y$ is a centred Student-t distributed random variable with $\eta$ degrees of freedom that is independent of $U$. As already mentioned in Example 2.7, $X$ is multivariate regularly varying with tail index $\alpha = \eta$. Figure 6.4 shows samples simulated from bivariate Student-t distributions with selected values of $\eta$ and $C$.

Figure 6.3 Pure loss case: simulated i.i.d. samples $X_1,\ldots,X_n$ of size n = 10000 with varying parameters α and b.

Spectral densities of elliptical distributions are derived in Section 2.6. Plots of bivariate spectral densities with respect to the 1-norm are shown in Figure 2.2. The values of $\eta$ and $C$ used to obtain these graphics are taken from the parameter sets used in the simulation study.

It is also well known (cf., among many others, Beirlant et al., 2004a) that the centred, univariate Student-t distribution with $\eta$ degrees of freedom has the representation (6.3) with $\alpha\beta = -2$ and, as a result, satisfies the second-order condition (6.2) with $\beta = -2/\eta$. This property is inherited by the radial part $R = |Y| \cdot \|A \cdot U\|_1$.
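The construction (6.10) translates into a short simulation routine. The sketch below follows the text literally: a univariate Student-t variable multiplies a linearly transformed point that is uniform on the Euclidean unit circle. The Cholesky factor is one possible choice of $A$ with $A \cdot A^{\top} = C$ and requires $C$ to be positive definite, which is an extra assumption compared to the text. Function names and the seed are hypothetical.

```python
import math
import random


def cholesky_2x2(c11, c12, c22):
    """Lower-triangular A with A * A^T = C for a positive definite 2x2 matrix C."""
    a11 = math.sqrt(c11)
    a21 = c12 / a11
    a22 = math.sqrt(c22 - a21 * a21)
    return (a11, 0.0), (a21, a22)


def student_t(df, rng):
    """Centred Student-t draw: Z / sqrt(chi2_df / df)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)
    return z / math.sqrt(chi2 / df)


def sample_loss_gain(n, eta, sigma1, sigma2, rho, rng):
    """Simulate n observations X = Y * A * U as in (6.10)."""
    c12 = rho * sigma1 * sigma2
    (a11, a12), (a21, a22) = cholesky_2x2(sigma1 ** 2, c12, sigma2 ** 2)
    points = []
    for _ in range(n):
        phi = rng.uniform(0.0, 2.0 * math.pi)
        u1, u2 = math.cos(phi), math.sin(phi)  # U uniform on the Euclidean unit circle
        y = student_t(eta, rng)
        points.append((y * (a11 * u1 + a12 * u2), y * (a21 * u1 + a22 * u2)))
    return points


rng = random.Random(3)  # arbitrary seed
sample = sample_loss_gain(10000, 3.0, 2.0, 3.0, 0.4, rng)
```

The parameter values in the last line correspond to $\sigma_1 = 2$, $\sigma_2 = 3$, $\rho = 0.4$, and $\eta = 3$, one of the settings that appears in the figures of this chapter.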

6.4 Simulation results

This section presents the results of the simulation study concerning the bias and the variability of the estimators $\hat\alpha_H$, $\hat\gamma_\xi$, and $\hat\xi_{\mathrm{opt}}$, and the efficiency of risk minimization by the estimated optimal portfolio $\hat\xi_{\mathrm{opt}}$. Each subsection is dedicated to a particular aspect of the estimation or optimization procedures and illustrated by graphics or tables.

Tail index α

As already mentioned in Section 6.2, the bias of $\hat\alpha_H$ depends on the second-order parameter $\beta$ and the tail fraction size $k$. Furthermore, for any fixed sample size $n$, increasing $k$ typically increases the bias of $\hat\alpha_H$, thus turning the choice of $k$ in practical applications into a bias-variance trade-off. To achieve maximal comparability of simulation results, the tail fraction size $k$ is chosen equal for all experiments implemented in the simulation study. This choice is based on Monte Carlo approximations of the expectations $\mathrm{E}\,\hat\alpha_H$ and the quantiles $F^{\leftarrow}_{\hat\alpha_H}(0.025)$ and $F^{\leftarrow}_{\hat\alpha_H}(0.975)$. For each combination of model parameters, 5000 Monte Carlo simulation runs are performed. In each run, i.i.d. samples $X_1,\ldots,X_n$ of size $n = 10000$ are simulated and estimates $\hat\alpha_H$ are calculated for all $k$ between 10 and 1000. Some of these simulation results are shown in Figures 6.5 and 6.6. The uniform value of the tail fraction size $k$ for all subsequent experiments is set to $k = 400$.

Figure 6.4 Loss-gain case: simulated i.i.d. samples $X_1,\ldots,X_n$ of size n = 10000 for $\sigma_1 := \sqrt{C_{1,1}} = 2$, $\sigma_2 := \sqrt{C_{2,2}} = 3$, and varying parameters α and $\rho := C_{1,2}/(\sigma_1\sigma_2)$.

In addition to the choice of $k$, Monte Carlo approximations of $\mathrm{E}\,\hat\alpha_H$, $F^{\leftarrow}_{\hat\alpha_H}(0.025)$, and $F^{\leftarrow}_{\hat\alpha_H}(0.975)$ allow a comparison of the bias and the variability of the estimator $\hat\alpha_H$ for different models and parameter values. This comparison confirms the predictions resulting from the asymptotic normality statement (6.8). Figure 6.5 depicts the results for the Burr(a, b) radial parts in the pure loss example, where the tail index $\alpha = ab$ and the second-order parameter $\beta = -1/b$ can be varied independently of each other. The graphics show that increasing $b$ leads to increasing bias.

The results obtained in the loss-gain example are illustrated in Figure 6.6. Here, due to (6.10), the radial part $R = \|X\|_1$ is a mixture of rescaled absolute values of Student-t distributions:

\[ R \overset{d}{=} V |Y|, \]

where $Y$ is Student-t distributed with $\alpha$ degrees of freedom and the random variable $V$ is bounded, non-negative, and independent of $Y$. The left side of Figure 6.6 shows that the bias of $\hat\alpha_H$ increases with increasing $\alpha$. This is a special property of the Student-t distribution, where the second-order parameter $\beta$ is tied to the degrees of freedom parameter $\alpha$ via $\beta = -2/\alpha$. The right side of Figure 6.6 shows that the influence of the dependence parameter $\rho$ on the bias of $\hat\alpha_H$ is almost negligible.
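The Monte Carlo approximation of $\mathrm{E}\,\hat\alpha_H$ and of the 2.5% and 97.5% quantiles can be sketched for the Burr radial parts as shown below. This is a scaled-down illustration with hypothetical function names: it uses 200 runs of size n = 2000 instead of 5000 runs of size n = 10000, keeps the tail fraction k/n = 0.04 of the study, assumes the usual form of the Hill estimator, and takes plain order statistics as empirical quantiles.

```python
import math
import random
import statistics


def hill(sample, k):
    """Hill estimate based on the k largest observations (assumed standard form)."""
    ordered = sorted(sample, reverse=True)
    threshold = math.log(ordered[k])  # log of the (k+1)-th largest value
    return k / sum(math.log(r) - threshold for r in ordered[:k])


def burr_sample(n, a, b, rng):
    """Burr(a, b) sample by inversion of F_{a,b}(t) = 1 - (1 + t**a)**(-b)."""
    return [((1.0 - rng.random()) ** (-1.0 / b) - 1.0) ** (1.0 / a) for _ in range(n)]


def monte_carlo_hill(a, b, n, k, runs, rng):
    """Simulated mean and empirical 2.5% / 97.5% quantiles of the Hill estimator."""
    estimates = sorted(hill(burr_sample(n, a, b, rng), k) for _ in range(runs))
    lower = estimates[int(0.025 * (runs - 1))]
    upper = estimates[int(0.975 * (runs - 1))]
    return statistics.fmean(estimates), lower, upper


# Tail index alpha = a * b = 3 with b = 1.4, i.e. beta = -1 / 1.4.
mean_estimate, lower, upper = monte_carlo_hill(
    a=3.0 / 1.4, b=1.4, n=2000, k=80, runs=200, rng=random.Random(6)
)
print(mean_estimate, lower, upper)
```

Repeating the call for a range of values of `k` yields curves of the kind shown in Figures 6.5 and 6.6, from which a common value of the tail fraction size can be read off.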

Bias of $\hat\gamma_\xi$

The bias of the estimator $\hat\gamma_\xi$ is illustrated by comparative plots of the simulated expectations $\mathrm{E}\,\hat\gamma_\xi$ and $\mathrm{E}\,\hat\Psi f_{\xi,\alpha}$ vs. the true extreme risk index $\gamma_\xi$. These graphics allow a comparison of the bias originating from the estimation of the spectral measure $\Psi$ with the bias introduced by the estimation of the tail index $\alpha$.

The pure loss case is represented by Figures 6.7 and 6.8. Simulation results for varying $\alpha$ and fixed second-order parameter $\beta = -1/b$ are illustrated in Figure 6.7, whereas Figure 6.8 depicts the influence of the second-order parameter $\beta = -1/b$ on the estimation results for fixed $\alpha$. All graphics show that $\mathrm{E}\,\hat\Psi f_{\xi,\alpha}$ is very close to $\gamma_\xi = \Psi f_{\xi,\alpha}$. Thus the bias of $\hat\gamma_\xi$ is mainly due to the bias of $\hat\alpha_H$, whereas estimation of the spectral measure $\Psi$ works well in this example.

The results in the loss-gain case are similar. In particular, $\mathrm{E}\,\hat\Psi f_{\xi,\alpha}$ is always close to $\gamma_\xi$, indicating that the bias of $\hat\gamma_\xi$ results from the bias of $\hat\alpha_H$. Figure 6.9 shows that the bias of $\hat\gamma_\xi$ depends on the value of the tail index $\alpha$. Furthermore, Figure 6.10 suggests that the influence of the dependence parameter $\rho$ on the bias of $\hat\gamma_\xi$ is rather weak. These findings agree with the conclusions from Figure 6.6 concerning the bias of $\hat\alpha_H$.
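The two quantities compared in these plots can be computed from a single sample as sketched below. The sketch rests on assumptions that are not restated in this section: $f_{\xi,\alpha}(s) = (\xi^{\top} s)_+^{\alpha}$, and $\hat\Psi$ is the empirical distribution of the angular parts $X_i/\|X_i\|_1$ of the $k$ observations with the largest radii. The toy data (Pareto radius, independent uniform angular part) and all names are hypothetical and are not one of the models of Section 6.3.

```python
import math
import random


def hill(radii, k):
    """Hill estimate based on the k largest radii (assumed standard form)."""
    ordered = sorted(radii, reverse=True)
    threshold = math.log(ordered[k])
    return k / sum(math.log(r) - threshold for r in ordered[:k])


def extreme_risk_index(points, xi, alpha, k):
    """Empirical counterpart of Psi f_{xi,alpha} from the k points with largest 1-norm."""
    by_radius = sorted(points, key=lambda x: abs(x[0]) + abs(x[1]), reverse=True)
    total = 0.0
    for x1, x2 in by_radius[:k]:
        radius = abs(x1) + abs(x2)
        projection = (xi[0] * x1 + xi[1] * x2) / radius  # xi^T s with s = x / ||x||_1
        total += max(projection, 0.0) ** alpha
    return total / k


rng = random.Random(4)  # arbitrary seed
points = []
for _ in range(10000):
    r = (1.0 - rng.random()) ** (-1.0 / 3.0)  # Pareto radius with tail index 3
    s1 = rng.random()                         # angular part, uniform on [0, 1]
    points.append((r * s1, r * (1.0 - s1)))

k = 400
xi = (0.5, 0.5)
alpha_hat = hill([x1 + x2 for x1, x2 in points], k)
gamma_true_alpha = extreme_risk_index(points, xi, 3.0, k)       # uses the true alpha
gamma_est_alpha = extreme_risk_index(points, xi, alpha_hat, k)  # plug-in with alpha_hat
print(alpha_hat, gamma_true_alpha, gamma_est_alpha)
```

For the equally weighted portfolio used here, the version with the true tail index returns $2^{-3}$ regardless of the angular parts, whereas the plug-in version returns $2^{-\hat\alpha_H}$ and therefore inherits the full estimation error of the Hill estimator.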

Figure 6.5 Performance of $\hat\alpha_H$ in the pure loss case (Burr(a, b) distributed radial parts). Monte Carlo approximations of $\mathrm{E}\,\hat\alpha_H$, $F^{\leftarrow}_{\hat\alpha_H}(0.025)$, and $F^{\leftarrow}_{\hat\alpha_H}(0.975)$. Tail index: α = ab, second-order parameter: β = −1/b. Left side: fixed b, varying α. Right side: fixed α, varying b. Sample size: n = 10000. Extreme subsample size k: between 10 and 1000. Number of Monte Carlo simulation runs: 5000. Panels (alpha plotted against k; true value, simulated expectations, simulated quantiles): left α = 0.8, 2, 4 with b = 1.4; right b = 1.0, 1.4, 1.8 with α = 3.

Figure 6.6 Performance of $\hat\alpha_H$ in the loss-gain case (radial parts of a bivariate Student-t distribution with η degrees of freedom). Monte Carlo approximations of $\mathrm{E}\,\hat\alpha_H$, $F^{\leftarrow}_{\hat\alpha_H}(0.025)$, and $F^{\leftarrow}_{\hat\alpha_H}(0.975)$. Tail index: α = η. Second-order parameter: β = −2/α. Left: fixed dependence parameter ρ, varying α. Right: fixed α, varying ρ. Sample size: n = 10000. Extreme subsample size k: between 10 and 1000. Number of Monte Carlo simulation runs: 5000. Panels (alpha plotted against k; true value, simulated expectations, simulated quantiles): left α = 0.8, 2, 4 with ρ = 0; right ρ = −0.4, 0, 0.4 with α = 3.

Figure 6.7 Pure loss case: simulated expectations $\mathrm{E}\,\hat\gamma_\xi$ and $\mathrm{E}\,\hat\Psi f_{\xi,\alpha}$ vs. $\gamma_\xi = \Psi f_{\xi,\alpha}$ with fixed second-order parameter β = −1/b and varying tail index α. Panels (ERI plotted against xi_1; true values, simulated values with true alpha, simulated values with estimated alpha): α = 0.8, 2, 4 with b = 1.4.

Figure 6.8 Pure loss case: simulated expectations $\mathrm{E}\,\hat\gamma_\xi$ and $\mathrm{E}\,\hat\Psi f_{\xi,\alpha}$ vs. $\gamma_\xi = \Psi f_{\xi,\alpha}$ with fixed tail index α and varying second-order parameter β = −1/b. Panels (ERI plotted against xi_1; true values, simulated values with true alpha, simulated values with estimated alpha): b = 1, 1.4, 1.8 with α = 3.

Figure 6.9 Loss-gain case: simulated expectations $\mathrm{E}\,\hat\gamma_\xi$ and $\mathrm{E}\,\hat\Psi f_{\xi,\alpha}$ vs. $\gamma_\xi = \Psi f_{\xi,\alpha}$ with fixed dependence parameter ρ and varying tail index α. Panels (ERI plotted against xi_1; true values, simulated values with true alpha, simulated values with estimated alpha): α = 0.8, 2, 4 with ρ = 0.

Figure 6.10 Loss-gain case: simulated expectations $\mathrm{E}\,\hat\gamma_\xi$ and $\mathrm{E}\,\hat\Psi f_{\xi,\alpha}$ vs. $\gamma_\xi = \Psi f_{\xi,\alpha}$ with fixed tail index α and varying dependence parameter ρ. Panels (ERI plotted against xi_1; true values, simulated values with true alpha, simulated values with estimated alpha): ρ = −0.4, 0, 0.4 with α = 3.

Quantiles of $\hat\gamma_\xi$ and $\hat\xi_{\mathrm{opt}}$

The variability of the estimated extreme risk index $\hat\gamma_\xi$ and the estimated optimal portfolio $\hat\xi_{\mathrm{opt}}$ is assessed by simulated quantiles. The quantile ranges

\[ J_{\hat\gamma_\xi}(0.95) := \left[ F^{\leftarrow}_{\hat\gamma_\xi}(0.025),\, F^{\leftarrow}_{\hat\gamma_\xi}(0.975) \right], \quad \xi \in \Sigma^2, \tag{6.11} \]

determine the width of pointwise 95% confidence intervals for $\gamma_\xi$. Analogously, the width of a 95% confidence interval for the optimal portfolio weight $\hat\xi^{(1)}_{\mathrm{opt}}$ is equal to the width of the quantile range

\[ J_{\hat\xi_{\mathrm{opt}}}(0.95) := \left[ F^{\leftarrow}_{\hat\xi^{(1)}_{\mathrm{opt}}}(0.025),\, F^{\leftarrow}_{\hat\xi^{(1)}_{\mathrm{opt}}}(0.975) \right]. \]

Due to $d = 2$, the optimal portfolio $\hat\xi_{\mathrm{opt}}$ is completely determined by $\hat\xi^{(1)}_{\mathrm{opt}}$.

Figures 6.11, 6.12, 6.13, and 6.14 depict the simulated expectations and quantiles of the estimators $\hat\gamma_\xi$ and $\hat\xi^{(1)}_{\mathrm{opt}}$. To assess the estimation errors introduced by $\hat\alpha_H$, the plots are complemented by analogous results obtained from the true value of $\alpha$, i.e., the simulated expectations and quantiles of $\hat\Psi f_{\xi,\alpha}$ and the portfolio weight $\tilde\xi^{(1)}_{\mathrm{opt}}$ minimizing $\hat\Psi f_{\xi,\alpha}$. Plots of the true extreme risk index $\gamma_\xi$ are omitted since $\gamma_\xi$ and $\mathrm{E}\,\hat\Psi f_{\xi,\alpha}$ are almost identical (cf. Figures 6.7, 6.8, 6.9, and 6.10).

Figure 6.11 represents the results obtained in the pure loss example for fixed $\alpha$ and varying $b$. The graphics suggest that the parameter $b$ does not influence the width of $J_{\hat\xi_{\mathrm{opt}}}$. Comparison with the corresponding quantiles of $\hat\Psi f_{\xi,\alpha}$ shows that estimation of $\alpha$ considerably contributes to the variability of $\hat\gamma_\xi$. A striking difference between the quantiles of $\hat\gamma_\xi = \hat\Psi f_{\xi,\hat\alpha}$ and those of $\hat\Psi f_{\xi,\alpha}$ is the behaviour of the quantile ranges $J_{\hat\gamma_\xi}$ defined in (6.11) and the analogous quantile ranges for $\hat\Psi f_{\xi,\alpha}$,

\[ J_{\hat\Psi f_{\xi,\alpha}}(0.95) := \left[ F^{\leftarrow}_{\hat\Psi f_{\xi,\alpha}}(0.975) - F^{\leftarrow}_{\hat\Psi f_{\xi,\alpha}}(0.025) \right]. \]

Indeed, the width of $J_{\hat\Psi f_{\xi,\alpha}}$ tends to 0 for $\xi_1 \to 1/2$, whereas $J_{\hat\gamma_\xi}$ has nearly the same width for all $\xi$. Thus estimation of $\alpha$ neutralizes the special property (3.11) of the pure loss case, where

\[ \gamma_{d^{-1}(1,\ldots,1)} = \Psi f_{d^{-1}(1,\ldots,1),\alpha} = d^{-\alpha} \]

for all $\Psi$. Finally, it should also be noted that increasing $b$ slightly increases the bias and the variability of the estimated optimal portfolio $\hat\xi_{\mathrm{opt}}$. This effect can be explained by the increasing bias of $\hat\alpha$.
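The special property of the equally weighted portfolio can be verified in one line. The computation below is a sketch that assumes $f_{\xi,\alpha}(s) = (\xi^{\top} s)_+^{\alpha}$, a definition from Chapter 3 that is recalled here as an assumption, and uses that $\Psi$ is a probability measure on the unit simplex $\Sigma^d$, so that $s_1 + \ldots + s_d = 1$ for every point $s$ in its support.

\[ \xi = d^{-1}(1,\ldots,1): \qquad \xi^{\top} s = d^{-1} \sum_{i=1}^{d} s_i = d^{-1} \quad\Longrightarrow\quad \Psi f_{\xi,\alpha} = \int_{\Sigma^d} d^{-\alpha}\, \Psi(ds) = d^{-\alpha}. \]

\[ \hat\Psi f_{\xi,\alpha} = d^{-\alpha} \ \text{ (no estimation error)}, \qquad \hat\gamma_\xi = \hat\Psi f_{\xi,\hat\alpha} = d^{-\hat\alpha} \ \text{ (fluctuates with } \hat\alpha \text{ only)}. \]

For $d = 2$ and $\alpha = 3$ this gives the value $2^{-3} = 0.125$ at $\xi_1 = 1/2$, where the quantile range of $\hat\Psi f_{\xi,\alpha}$ collapses to a single point.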

Another special property of the pure loss case is the inversion of diversification effects at $\alpha = 1$ (cf. Lemma 3.5). The consequences of this phenomenon for portfolio optimization are demonstrated in Figure 6.12. The extremely wide quantile ranges of $\hat\xi^{(1)}_{\mathrm{opt}}$ show that optimization results in the pure loss case with tail index $\alpha$ close to 1 should be treated with care. There are two reasons for this issue. First, concavity of the mapping $\xi \mapsto \hat\gamma_\xi$ for $\hat\alpha < 1$ implies that $\hat\xi_{\mathrm{opt}}$ is either $(0, 1)$ or $(1, 0)$, which may be very different from the true optimal portfolio $\xi_{\mathrm{opt}}$ if the true tail index $\alpha$ is greater than 1. The same problem arises for $\alpha < 1$ and $\hat\alpha > 1$. Second, if $\gamma_{(0,1)} = \Psi f_{(0,1),1}$ and $\gamma_{(1,0)} = \Psi f_{(1,0),1}$ are nearly equal, false estimates of $\Psi$ may lead to a wrong ordering of $\hat\gamma_{(0,1)}$ and $\hat\gamma_{(1,0)}$. If both $\alpha$ and $\hat\alpha$ are less than 1, this disorder forces $\xi_{\mathrm{opt}}$ and $\hat\xi_{\mathrm{opt}}$ into different vertices of $\Sigma^2$. Thus, for $\alpha$ close to 1, portfolio optimization is extremely sensitive to erratic estimates of $\alpha$ and $\Psi$.

The loss-gain example is less complicated. Figure 6.13 shows plots for fixed $\rho$ and varying $\alpha$. The graphics demonstrate that the ratios

\[ \frac{F^{\leftarrow}_{\hat\gamma_\xi}(0.975) - F^{\leftarrow}_{\hat\gamma_\xi}(0.025)}{F^{\leftarrow}_{\hat\Psi f_{\xi,\alpha}}(0.975) - F^{\leftarrow}_{\hat\Psi f_{\xi,\alpha}}(0.025)} \tag{6.12} \]

slightly increase for increasing $\alpha$. This effect is at least partly due to the underestimation of $\alpha$, which is particularly strong in the present model (cf. Figure 6.6). Moreover, it should be noted that increasing $\alpha$ decreases the variability of the estimated optimal portfolio. This originates from the more pronounced diversification effects for larger values of $\alpha$.

Simulation results for fixed $\alpha$ and varying dependence parameter $\rho$ are shown in Figure 6.14. The graphics demonstrate how the variability of the optimal portfolio depends on the strength of diversification effects. Since $\rho$ quantifies the positive or negative dependence, lower values of this parameter result in stronger diversification effects and better detection of the optimal portfolio from the estimated curves $\{\hat\gamma_\xi : \xi^{(1)} \in [0, 1]\}$.
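The first reason given above for the instability in the pure loss case, namely that the minimizer jumps to a vertex as soon as the tail index used in the optimization drops below 1, can be reproduced with a few lines of code. The sketch assumes $f_{\xi,\alpha}(s) = (\xi^{\top} s)^{\alpha}$ on the unit simplex and uses a hypothetical, perfectly symmetric set of angular parts with equal weights in place of an estimated spectral measure; the grid search and all names are illustrative.

```python
def risk_curve(angular_parts, alpha, grid):
    """Values of gamma_xi on a grid of xi_1 for angular parts s_1 in [0, 1]."""
    curve = []
    for xi1 in grid:
        values = [(xi1 * s1 + (1.0 - xi1) * (1.0 - s1)) ** alpha for s1 in angular_parts]
        curve.append(sum(values) / len(values))
    return curve


def optimal_weight(angular_parts, alpha, grid):
    """Grid point xi_1 with the smallest value of the risk curve."""
    curve = risk_curve(angular_parts, alpha, grid)
    return grid[curve.index(min(curve))]


grid = [i / 100 for i in range(101)]
angular_parts = [(i + 0.5) / 50 for i in range(50)]  # hypothetical, symmetric around 1/2

weight_below_one = optimal_weight(angular_parts, 0.8, grid)  # concave curve: a vertex
weight_above_one = optimal_weight(angular_parts, 1.2, grid)  # convex curve: interior point
print(weight_below_one, weight_above_one)
```

With the same angular parts, a change of the tail index from 1.2 to 0.8 moves the minimizer from the centre of the simplex to one of its vertices, which is the kind of jump behind the wide quantile ranges in Figure 6.12.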

Maximal and minimal values of $\hat\gamma_\xi$

In addition to the estimation of quantiles, the implemented Monte Carlo procedure comprises recording of total maxima and minima of $\hat\gamma_\xi$ after 5000 simulation runs. These results are illustrated by comparative plots that also include the true values of $\gamma_\xi$, the simulated expectations $\mathrm{E}\,\hat\gamma_\xi$, and the simulated quantiles $F^{\leftarrow}_{\hat\gamma_\xi}(0.025)$ and $F^{\leftarrow}_{\hat\gamma_\xi}(0.975)$. Both in the pure loss and in the loss-gain case, the conclusions about the variability of $\hat\gamma_\xi$ in the worst 5% of all outcomes are similar to the conclusions drawn from the simulated quantiles embracing the best 95%.

Figure 6.11 Pure loss case: simulated expectations and quantiles (2.5% and 97.5%) for the estimator $\hat\gamma_\xi$, the functional $\hat\Psi f_{\xi,\alpha}$, the estimated optimal portfolio weight $\hat\xi^{(1)}_{\mathrm{opt}}$, and the portfolio weight $\tilde\xi^{(1)}_{\mathrm{opt}}$ minimizing $\hat\Psi f_{\xi,\alpha}$. Selected results with fixed tail index α and varying second-order parameter β = −1/b. Panels (ERI plotted against xi_1; true alpha, estimated alpha): b = 1, 1.4, 1.8 with α = 3.

Figure 6.12 Pure loss case: simulated expectations and quantiles (2.5% and 97.5%) for the estimator $\hat\gamma_\xi$, the functional $\hat\Psi f_{\xi,\alpha}$, the estimated optimal portfolio weight $\hat\xi^{(1)}_{\mathrm{opt}}$, and the portfolio weight $\tilde\xi^{(1)}_{\mathrm{opt}}$ minimizing $\hat\Psi f_{\xi,\alpha}$. Selected results with fixed second-order parameter β = −1/b and tail index α close to 1. Panels (ERI plotted against xi_1; true alpha, estimated alpha): α = 0.8, 1, 1.2 with b = 1.4.

Figure 6.13 Loss-gain case: simulated expectations and quantiles (2.5% and 97.5%) for the estimator $\hat\gamma_\xi$, the functional $\hat\Psi f_{\xi,\alpha}$, the estimated optimal portfolio weight $\hat\xi^{(1)}_{\mathrm{opt}}$, and the portfolio weight $\tilde\xi^{(1)}_{\mathrm{opt}}$ minimizing $\hat\Psi f_{\xi,\alpha}$. Selected results with fixed dependence parameter ρ and varying tail index α. Panels (ERI plotted against xi_1; true alpha, estimated alpha): α = 0.8, 2, 4 with ρ = 0.

Figure 6.14 Loss-gain case: simulated expectations and quantiles (2.5% and 97.5%) for the estimator $\hat\gamma_\xi$, the functional $\hat\Psi f_{\xi,\alpha}$, the estimated optimal portfolio weight $\hat\xi^{(1)}_{\mathrm{opt}}$, and the portfolio weight $\tilde\xi^{(1)}_{\mathrm{opt}}$ minimizing $\hat\Psi f_{\xi,\alpha}$. Selected results with fixed tail index α and varying dependence parameter ρ. Panels (ERI plotted against xi_1; true alpha, estimated alpha): ρ = −0.4, 0, 0.4 with α = 3.

Figure 6.15 shows simulation results obtained in the pure loss example for varying $\alpha$ and fixed $b$, whereas the setting with varying $b$ and fixed $\alpha$ is illustrated in Figure 6.16. These graphics suggest that the variability of $\hat\gamma_\xi$ increases with increasing $\alpha$. The influence of the parameter $b$ on the variability of $\hat\gamma_\xi$ is almost zero. Results obtained in the loss-gain case are illustrated in Figures 6.17 and 6.18. In this model, the variability of $\hat\gamma_\xi$ primarily depends on the strength of diversification effects.

Figure 6.15 Pure loss case: minima and maxima of the estimator $\hat\gamma_\xi$ after 5000 simulation runs vs. simulated expectations and quantiles (2.5% and 97.5%) of $\hat\gamma_\xi$ and true values of $\gamma_\xi$. Selected results with fixed second-order parameter β = −1/b and varying tail index α. Panels (ERI plotted against xi_1; true values, simulated expectations, simulated quantiles, simulated max/min): α = 0.8, 2, 4 with b = 1.4.

Figure 6.16 Pure loss case: minima and maxima of the estimator $\hat\gamma_\xi$ after 5000 simulation runs vs. simulated expectations and quantiles (2.5% and 97.5%) of $\hat\gamma_\xi$ and true values of $\gamma_\xi$. Selected results with fixed tail index α and varying second-order parameter β = −1/b. Panels (ERI plotted against xi_1; true values, simulated expectations, simulated quantiles, simulated max/min): b = 1, 1.4, 1.8 with α = 3.

Figure 6.17 Loss-gain case: minima and maxima of the estimator $\hat\gamma_\xi$ after 5000 simulation runs vs. simulated expectations and quantiles (2.5% and 97.5%) of $\hat\gamma_\xi$ and true values of $\gamma_\xi$. Selected results with fixed dependence parameter ρ and varying tail index α. Panels (ERI plotted against xi_1; true values, simulated expectations, simulated quantiles, simulated max/min): α = 0.8, 2, 4 with ρ = 0.

Figure 6.18 Loss-gain case: minima and maxima of the estimator $\hat\gamma_\xi$ after 5000 simulation runs vs. simulated expectations and quantiles (2.5% and 97.5%) of $\hat\gamma_\xi$ and true values of $\gamma_\xi$. Selected results with fixed tail index α and varying dependence parameter ρ. Panels (ERI plotted against xi_1; true values, simulated expectations, simulated quantiles, simulated max/min): ρ = −0.4, 0, 0.4 with α = 3.

Efficiency of risk minimization

From a practitioner's point of view, one of the most important aspects is the quality of optimization results. If the minimization of extreme risks is the only objective, then the estimation errors $|\hat\gamma_\xi - \gamma_\xi|$ and $\|\hat\xi_{\mathrm{opt}} - \xi_{\mathrm{opt}}\|$ can be neglected as long as the optimization error $|\gamma_{\xi_{\mathrm{opt}}} - \gamma_{\hat\xi_{\mathrm{opt}}}|$ is relatively small compared to the minimal value $\gamma_{\xi_{\mathrm{opt}}}$ and the quantity

\[ L_{\gamma_\xi} := \max_{\xi \in \Sigma^d} \gamma_\xi - \gamma_{\xi_{\mathrm{opt}}}, \]

which characterizes the latitude of the diversification effects. Thus the efficiency of risk minimization can be measured by the ratios

\[ \frac{w_{\gamma_{\hat\xi_{\mathrm{opt}}}}(1 - \lambda)}{\gamma_{\hat\xi_{\mathrm{opt}}}} \tag{6.13} \]

and

\[ \frac{w_{\gamma_{\hat\xi_{\mathrm{opt}}}}(1 - \lambda)}{L_{\gamma_\xi}} \tag{6.14} \]

with $w_{\gamma_{\hat\xi_{\mathrm{opt}}}}(1 - \lambda)$ quantifying the width of a typical range of $\gamma_{\hat\xi_{\mathrm{opt}}}$:

\[ w_{\gamma_{\hat\xi_{\mathrm{opt}}}}(1 - \lambda) := \max \left\{ |\gamma_\xi - \gamma_{\xi_{\mathrm{opt}}}| : \xi^{(1)} \in \left[ F^{\leftarrow}_{\hat\xi^{(1)}_{\mathrm{opt}}}(\lambda/2),\, F^{\leftarrow}_{\hat\xi^{(1)}_{\mathrm{opt}}}(1 - \lambda/2) \right] \right\} \]

for small $\lambda > 0$. Note that

\[ P\left\{ |\gamma_{\hat\xi_{\mathrm{opt}}} - \gamma_{\xi_{\mathrm{opt}}}| \le w_{\gamma_{\hat\xi_{\mathrm{opt}}}}(1 - \lambda) \right\} \ge 1 - \lambda, \]

since $\hat\xi^{(1)}_{\mathrm{opt}}$ lies in the quantile range above with probability of at least $1 - \lambda$.

An extensive account of the values $w_{\gamma_{\hat\xi_{\mathrm{opt}}}}(0.95)/\gamma_{\xi_{\mathrm{opt}}}$ and $w_{\gamma_{\hat\xi_{\mathrm{opt}}}}(0.95)/L_{\gamma_\xi}$ for both models and various parameter values is provided in Tables 6.1, 6.2, 6.3, and 6.4. The pure loss case is represented by Tables 6.1 and 6.2. Their contents demonstrate the impact of the phase change phenomenon on the diversification effects. The ratio $w_{\gamma_{\hat\xi_{\mathrm{opt}}}}(0.95)/\gamma_{\xi_{\mathrm{opt}}}$ is the highest for $\alpha$ close to 1 and, even worse, the ratio $w_{\gamma_{\hat\xi_{\mathrm{opt}}}}(0.95)/L_{\gamma_\xi}$ is equal or close to 1 in this case, which means that the estimated optimal portfolio may be the most unfavourable one. On the other hand, portfolio optimization works well for $\alpha$ sufficiently distinct from 1. Thus the ratio $w_{\gamma_{\hat\xi_{\mathrm{opt}}}}(0.95)/L_{\gamma_\xi}$ is below 10% for $\alpha \ge 2$ and even below 1% for $\alpha = 4$.

The numbers obtained in the loss-gain example are presented in Tables 6.3 and 6.4. Since there is no phase change, the optimization results are much more reliable. Decreasing $\alpha$ still leads to decreasing quality of optimization results, which is mainly due to the weakening of the diversification effects for smaller values of $\alpha$. However, the optimization errors are not dramatic. The maximal value of $w_{\gamma_{\hat\xi_{\mathrm{opt}}}}(0.95)/L_{\gamma_\xi}$ obtained in this example is 8%, whereas for $\alpha \ge 2$ the values are below 1.3% and even below 0.8% for $\alpha = 4$. This means that with probability of at least 95% more than 92% of diversification effects are utilized and for larger values of $\alpha$ the utilization of diversification effects is almost 100%. The influence of the dependence parameter $\rho$ on the optimization quality accords with the previous results, i.e., an increase of $\rho$ typically entails an increase of the ratios $w_{\gamma_{\hat\xi_{\mathrm{opt}}}}(0.95)/\gamma_{\xi_{\mathrm{opt}}}$ and $w_{\gamma_{\hat\xi_{\mathrm{opt}}}}(0.95)/L_{\gamma_\xi}$. The only exception is $\rho = 0.8$, where these ratios are 0 for $\alpha \ge 1$. This is a consequence of the weak diversification effects for large $\rho$ and the distinct asymmetry of the marginal weights $\gamma_{(0,1)}$ and $\gamma_{(1,0)}$. In such cases, the optimal portfolio $\xi_{\mathrm{opt}}$ is always in a vertex of $\Sigma^2$ and is perfectly detected from the estimates of $\gamma_\xi$.
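The efficiency ratios can be evaluated numerically along the following lines for a bivariate portfolio. This is a sketch under simplifying assumptions: the true curve $\xi^{(1)} \mapsto \gamma_\xi$ is passed as a function, the maximum and minimum are taken over a finite grid, the first ratio is normalized by the minimal value $\gamma_{\xi_{\mathrm{opt}}}$ as in the text following (6.14), and the empirical quantile convention is one of several possible ones. The quadratic curve and the weights in the example are hypothetical.

```python
def empirical_quantile(sorted_values, p):
    """Simple empirical quantile based on order statistics (illustrative convention)."""
    index = min(int(p * len(sorted_values)), len(sorted_values) - 1)
    return sorted_values[index]


def efficiency_ratios(gamma, grid, estimated_weights, lam=0.05):
    """Return (w / gamma_min, w / L) for portfolio weights xi_1 on the given grid.

    gamma: function xi_1 -> true extreme risk index (hypothetical input)
    estimated_weights: simulated values of the estimated optimal weight
    """
    values = [gamma(x) for x in grid]
    gamma_min = min(values)
    latitude = max(values) - gamma_min  # L_{gamma_xi}
    ordered = sorted(estimated_weights)
    lower = empirical_quantile(ordered, lam / 2.0)
    upper = empirical_quantile(ordered, 1.0 - lam / 2.0)
    # Evaluate the curve at both ends of the quantile range and at grid points inside it.
    candidates = [lower, upper] + [x for x in grid if lower < x < upper]
    width = max(abs(gamma(x) - gamma_min) for x in candidates)  # w(1 - lambda)
    return width / gamma_min, width / latitude


def gamma(xi1):
    """Hypothetical risk curve with minimum 0.1 at xi_1 = 0.4."""
    return 0.1 + (xi1 - 0.4) ** 2


grid = [i / 100 for i in range(101)]
weights = [0.3 + 0.002 * i for i in range(101)]  # hypothetical estimated optimal weights
ratio_min, ratio_latitude = efficiency_ratios(gamma, grid, weights)
print(ratio_min, ratio_latitude)
```

A small second ratio indicates that, with probability of at least $1 - \lambda$, nearly all of the available diversification effect is realized even if the estimated weight itself is visibly off.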

Table 6.1 Pure loss case: ratio $w_{\gamma_{\hat\xi_{\mathrm{opt}}}}(0.95)/\gamma_{\hat\xi_{\mathrm{opt}}}$ defined in (6.13)

α \ b    0.8        1          1.2        1.4        1.6        1.8
0.6      0.055257   0.055257   0.055257   0.055257   0.055257   0.055257
0.8      0.075710   0.075710   0.075710   0.075710   0.075710   0.075710
1        0.078613   0.075856   0.095058   0.095058   0.095058   0.095058
1.2      0.027552   0.028006   0.030927   0.025442   0.032623   0.027326
1.4      0.018045   0.018045   0.018045   0.018045   0.018584   0.018045
2        0.011614   0.013096   0.014898   0.017315   0.021559   0.026268
3        0.008749   0.010029   0.010029   0.011047   0.012853   0.015611
4        0.007691   0.007691   0.008536   0.008975   0.010842   0.012359

Table 6.2 Pure loss case: ratio $w_{\gamma_{\hat\xi_{\mathrm{opt}}}}(0.95)/L_{\gamma_\xi}$ defined in (6.14)

α \ b    0.8        1          1.2        1.4        1.6        1.8
0.6      0.794821   0.794821   0.794821   0.794821   0.794821   0.794821
0.8      0.978946   0.978946   0.978946   0.978946   0.978946   0.978946
1        0.827000   0.798000   1.000000   1.000000   1.000000   1.000000
1.2      0.243698   0.247717   0.273547   0.225037   0.288549   0.241701
1.4      0.121524   0.121524   0.121524   0.121524   0.125156   0.121524
2        0.035434   0.039956   0.045456   0.052829   0.065778   0.080145
3        0.010716   0.012285   0.012285   0.013531   0.015743   0.019121
4        0.004737   0.004737   0.005258   0.005528   0.006678   0.007612

6.5 Conclusions

The present simulation study confirms the findings of Lemma 3.5 in the sense that optimization results in the pure loss case are not reliable for $\alpha$ close to 1. On the other hand, optimization results obtained in all other cases are promising. In particular, the examples implemented here suggest that distinct diversification effects are detected and utilized efficiently.

The second-order parameter $\beta$ of the radial part turns out to be crucial for accurate estimation of $\gamma_\xi$. Fortunately, the influence of $\beta$ on the diversification results is not so strong because the bias of the estimated tail index shifts all values of $\hat\gamma_\xi$ in the same direction. Thus the quality of optimization results depends rather on the tail index $\alpha$ and the spectral measure $\Psi$ that determine the strength of diversification effects.

Finally, a comment should be made upon the second-order conditions for the angular part. Although essential to the asymptotic normality results obtained in Chapter 4, the second-order behaviour of the angular part was not included in the present simulation study. The reasons are the variety of possible effects and the lack of qualitative results on this subject. Since a meaningful simulation study must be oriented at meaningful questions, general qualitative analysis must precede the implementation of specific models. This is an interesting area for further research.

Table 6.3 Loss-gain case: ratio $w_{\gamma_{\hat\xi_{\mathrm{opt}}}}(0.95)/\gamma_{\hat\xi_{\mathrm{opt}}}$ defined in (6.13)

α \ ρ    −0.8       −0.6       −0.4       0          0.4        0.6        0.8
0.6      0.007166   0.008895   0.010505   0.013440   0.016162   0.017384   0.009425
0.8      0.004236   0.005510   0.006738   0.008939   0.010896   0.012741   0.001303
1        0.002521   0.003918   0.004751   0.006567   0.009020   0.009860   0.000000
1.2      0.001910   0.003020   0.004029   0.005711   0.007772   0.008899   0.000000
1.4      0.001380   0.002553   0.003319   0.005045   0.006879   0.008296   0.000000
2        0.000960   0.001709   0.002403   0.003870   0.005995   0.007134   0.000000
3        0.000482   0.001114   0.001665   0.002988   0.005651   0.007022   0.000000
4        0.000210   0.000643   0.001139   0.002312   0.005102   0.006141   0.000000

Table 6.4 Loss-gain case: ratio $w_{\gamma_{\hat\xi_{\mathrm{opt}}}}(0.95)/L_{\gamma_\xi}$ defined in (6.14)

α \ ρ    −0.8       −0.6       −0.4       0          0.4        0.6        0.8
0.6      0.012756   0.019287   0.026754   0.045298   0.069141   0.080247   0.043766
0.8      0.006349   0.009810   0.013869   0.023846   0.036426   0.045809   0.004715
1        0.003373   0.006090   0.008412   0.014780   0.025129   0.029449   0.000000
1.2      0.002364   0.004254   0.006380   0.011284   0.018789   0.022991   0.000000
1.4      0.001615   0.003341   0.004823   0.008993   0.014836   0.019063   0.000000
2        0.001025   0.001957   0.002963   0.005594   0.010167   0.012782   0.000000
3        0.000489   0.001167   0.001814   0.003605   0.007664   0.009939   0.000000
4        0.000211   0.000653   0.001181   0.002554   0.006132   0.007629   0.000000

Appendix A

Auxiliary results

A.1 Regular variation

The following result is part of Karamata’s theorem. The present formulation is taken from de Haan and Ferreira (2006), Theorem B.1.5. For an alternative formulation see Bingham et al. (1987), Theorem 1.5.11.
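As a quick plausibility check of the limit stated in Theorem A.1 below, the following Python sketch evaluates the ratio of t f(t) to the tail integral of f for the illustrative function f(t) = t^β + t^(β−1), which is regularly varying with index β. The choice of f and the names `f`, `tail_integral`, and `karamata_ratio` are assumptions made for this example and do not appear elsewhere in the thesis. The tail integral is available in closed form for β < −1, so no numerical integration is needed.

```python
def f(t, beta):
    """Illustrative regularly varying function with index beta (assumed example)."""
    return t**beta + t**(beta - 1)


def tail_integral(t, beta):
    """Closed form of the integral of f over (t, infinity), valid for beta < -1."""
    return -t**(beta + 1) / (beta + 1) - t**beta / beta


def karamata_ratio(t, beta):
    """Ratio t * f(t) / tail integral, which should approach -beta - 1."""
    return t * f(t, beta) / tail_integral(t, beta)


if __name__ == "__main__":
    for beta in (-1.5, -3.0):
        for t in (1e1, 1e3, 1e6):
            print(beta, t, karamata_ratio(t, beta), "limit:", -beta - 1)
```

For growing t the printed ratios should settle near −β − 1. The speed of convergence depends on the lower-order term t^(β−1), which is precisely the kind of second-order effect discussed in the simulation study.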

Theorem A.1. Let the function f : (0, ∞) → R be regularly varying at ∞, i.e., $f(ty)/f(t) \to y^\beta$ for t → ∞ and y > 0. Then there exists $t_0 > 0$ such that f(t) is positive and locally bounded for $t \ge t_0$. Moreover, if β < −1, or β = −1 and $\int_0^\infty f(s)\,ds < \infty$, then
\[ \lim_{t\to\infty} \frac{t f(t)}{\int_t^\infty f(s)\,ds} = -\beta - 1. \]

The subsequent lemma can be considered as an extension of Theorem A.1.

Lemma A.2. (de Haan and Ferreira, 2006, Theorem B.1.12.2) Let the function f be regularly varying at ∞ with index β. If $t^{\varepsilon+\beta} g(t)$ is integrable on (1, ∞) for some ε > 0, then $\int_1^\infty g(y) f(ty)\,dy < \infty$ and
\[ \lim_{t\to\infty} \int_1^\infty g(y) \frac{f(ty)}{f(t)}\,dy = \int_1^\infty g(y)\, y^\beta\,dy. \]

A.2 Empirical processes

Definition A.3. Let (Ω, A, P) be a probability space. (a) The outer probability P∗ of a subset B ⊂ Ω is defined as

P∗(B) := inf {P(A): A ∈ A,B ⊂ A} . (A.1)


(b) The outer expectation E∗ of an arbitrary mapping $X : \Omega \to \overline{\mathbb{R}} := [-\infty, \infty]$ is defined as
\[ E^* X := \inf\bigl\{ EY : Y \ge X,\ Y : \Omega \to \overline{\mathbb{R}} \text{ measurable and } EY \text{ exists} \bigr\}, \tag{A.2} \]
where existence of EY means that at least one of $EY^+$ and $EY^-$ is finite.

Remark A.4. It is easy to see that P∗ and E∗ coincide with P and E, respectively, for measurable sets and mappings:
\[ P^*(B) = P(B) \text{ for } B \in \mathcal{A}, \qquad E^* X = EX \text{ for measurable } X. \]
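The definitions (A.1) and (A.2) can be made concrete on a finite probability space whose σ-field is too coarse to contain every subset. The following Python sketch is a toy illustration under assumed data: Ω = {0, 1, 2, 3}, the σ-field is generated by the two atoms {0, 1} and {2, 3}, and each atom has probability 1/2. The names `ATOMS`, `measurable_sets`, `prob`, `outer_prob`, and `outer_expectation` are hypothetical helpers, not notation from the thesis.

```python
from fractions import Fraction
from itertools import combinations

# Assumed toy space: Omega = {0, 1, 2, 3}; the sigma-field is generated by
# the partition {0, 1}, {2, 3}, and each atom carries probability 1/2.
ATOMS = {frozenset({0, 1}): Fraction(1, 2), frozenset({2, 3}): Fraction(1, 2)}


def measurable_sets():
    """All unions of atoms, i.e. the complete (finite) sigma-field."""
    atoms = list(ATOMS)
    return [frozenset().union(*combo)
            for r in range(len(atoms) + 1)
            for combo in combinations(atoms, r)]


def prob(a):
    """P(A) for a measurable set A, obtained by summing the atoms inside A."""
    return sum((p for atom, p in ATOMS.items() if atom <= a), Fraction(0))


def outer_prob(b):
    """P*(B) = inf{P(A) : A measurable, B a subset of A}, cf. (A.1)."""
    b = frozenset(b)
    return min(prob(a) for a in measurable_sets() if b <= a)


def outer_expectation(x):
    """E*X for X given as a dict omega -> value, cf. (A.2).

    A measurable Y is constant on each atom, so the smallest measurable
    majorant of X takes the maximum of X over each atom.
    """
    return sum((p * max(x[w] for w in atom) for atom, p in ATOMS.items()),
               Fraction(0))


if __name__ == "__main__":
    print(outer_prob({0}))       # non-measurable singleton
    print(outer_prob({0, 1}))    # measurable set: P* equals P
    print(outer_expectation({0: 1, 1: 3, 2: 0, 3: 0}))
```

In this toy setting the singleton {0} is not measurable, and its smallest measurable superset is the atom {0, 1}. For measurable sets the outer probability reduces to the ordinary probability, in line with Remark A.4.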

Further it should be noted that for any mapping $X : \Omega \to \overline{\mathbb{R}}$ there exists a measurable cover, i.e., a measurable function $X^* : \Omega \to \overline{\mathbb{R}}$ satisfying X∗ ≥ X and X∗ ≤ Y almost surely for any measurable $Y : \Omega \to \overline{\mathbb{R}}$ with Y ≥ X almost surely. Moreover, in case of existence, the outer expectation E∗X is equal to the expectation of X∗: E∗X = EX∗ (cf. van der Vaart and Wellner, 1996, Lemma 1.2.1).

The following definition extends the notions of convergence in probability and almost surely to mappings that are not necessarily measurable.

Definition A.5. Let (Ω, A, P) be a probability space and let Xη :Ω → D be a net of mappings into a metric space (D, d).

(a) Xη converges in outer probability to a mapping X : Ω → D if
\[ \forall \varepsilon > 0: \quad P^*\{ d(X_\eta, X) > \varepsilon \} \to 0. \]

(b) Xη converges outer almost surely to a mapping X : Ω → D if
\[ d(X_\eta, X)^* \to 0 \quad P\text{-a.s.} \]

In the following, convergence in outer probability and outer almost surely will be denoted by the symbols $\xrightarrow{P^*}$ and $\xrightarrow{\mathrm{a.s.}*}$, respectively. It is easy to see that these notions coincide with the classical notions of convergence in probability and almost surely if the mappings Xη and X are measurable.

The following definition extends the notion of weak convergence to the case of arbitrary mappings with a Borel measurable limit.

Definition A.6. Let (Ωη, Aη, Pη) be a net of probability spaces and let Xη : Ωη → D be arbitrary maps into a metric space D. The net Xη converges weakly to a Borel measure µ on D if the following condition is satisfied:
\[ \forall f \in C_b(D): \quad E^* f(X_\eta) \to \int f \, d\mu \tag{A.3} \]
with $C_b(D)$ denoting the set of bounded continuous mappings f : D → R.

It is easy to see that for measurable mappings Xη the condition (A.3) is equivalent to
\[ \forall f \in C_b: \quad \int f \, dP_\eta^{X_\eta} \to \int f \, d\mu, \]
i.e., the extended notion of weak convergence coincides with the original notion of weak convergence $P_\eta^{X_\eta} \xrightarrow{w} \mu$. Thus using the term “weak convergence” and the symbol $\xrightarrow{w}$ in all cases does not lead to ambiguity or contradiction.

Lemma A.7. (van der Vaart and Wellner, 1996, Example 1.5.1) Let T be a compact semimetric space. The set C(T) of all continuous functions z : T → R is a separable, complete subspace of $\ell^\infty(T)$. The Borel σ-field of C(T) equals the σ-field generated by the coordinate projections $z \mapsto z(t)$. Thus a map X : Ω → C(T) is Borel measurable if and only if it is a stochastic process.

Definition A.8. (cf. van der Vaart and Wellner, 1996, Definition 2.1.5) Let $(\mathcal{F}, \|\cdot\|)$ be a subset of a normed space of real functions $f : \mathcal{X} \to \mathbb{R}$. The entropy (without bracketing) of a function class $\mathcal{F}$ is the logarithm of the covering number $N(\varepsilon, \mathcal{F}, \|\cdot\|)$. The covering number $N(\varepsilon, \mathcal{F}, \|\cdot\|)$ is the minimal number of balls $\{g : \|g - f\| < \varepsilon\}$ of radius ε needed to cover the class $\mathcal{F}$. The centres of the balls need not belong to $\mathcal{F}$, but they should have finite norms.

Definition A.9. (cf. van der Vaart and Wellner, 1996, Definition 2.1.6) The entropy with bracketing of $\mathcal{F}$ is the logarithm of the bracketing number $N_{[\,]}(\varepsilon, \mathcal{F}, \|\cdot\|)$, defined as the minimal number of ε-brackets needed to cover $\mathcal{F}$. Given two functions l and u, the bracket [l, u] denotes the set of all functions f satisfying l ≤ f ≤ u. A bracket [l, u] is called an ε-bracket if $\|u - l\| < \varepsilon$. The functions l, u need not belong to $\mathcal{F}$.

Definition A.10. (cf. van der Vaart and Wellner, 1996) An envelope function of a function class $\mathcal{F}$ is any function $x \mapsto F(x)$ such that |f(x)| ≤ F(x) for every x and $f \in \mathcal{F}$.

Definition A.11. (cf. van der Vaart and Wellner, 1996, Definition 2.3.3) A class $\mathcal{F}$ of measurable functions $f : \mathcal{X} \to \mathbb{R}$ on a probability space $(\mathcal{X}, \mathcal{A}, P)$ is called a P-measurable class if the function
\[ (x_1, \dots, x_n) \mapsto \sup_{f \in \mathcal{F}} \Bigl| \sum_{i=1}^n e_i f(x_i) \Bigr|^{*} \]
is measurable on the completion of $(\mathcal{X}^n, \mathcal{A}^n, P^n)$ for every $n \in \mathbb{N}$ and every vector $(e_1, \dots, e_n) \in \mathbb{R}^n$.
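To illustrate the covering number of Definition A.8 in the simplest possible setting, consider a finite class of constant functions under the supremum norm. The distance between two such functions is the absolute difference of their constants, so the class can be identified with a finite set of real numbers, and an open ball of radius ε is an open interval of length 2ε. The following Python sketch is an assumed toy example (the function name `covering_number` is hypothetical). It relies on the fact that in one dimension a left-to-right greedy sweep yields the minimal number of balls, and it uses centres that need not belong to the class, as permitted by the definition.

```python
def covering_number(values, eps):
    """Minimal number of open balls of radius eps covering a finite set of reals.

    Each real number stands for a constant function; the sup-norm distance of
    two constant functions is the absolute difference of their constants.
    """
    points = sorted(set(values))
    count = 0
    i = 0
    while i < len(points):
        left = points[i]  # leftmost point not yet covered
        count += 1
        # For a finite set, one open ball of radius eps can be placed so that it
        # covers exactly the points lying in the half-open range [left, left + 2*eps).
        while i < len(points) and points[i] - left < 2 * eps:
            i += 1
    return count


if __name__ == "__main__":
    constants = [0.0, 0.5, 1.0, 1.5, 2.0]
    for eps in (0.25, 0.5, 0.6, 1.1):
        n = covering_number(constants, eps)
        print(f"eps={eps}: covering number {n}")
```

The entropy in the sense of Definition A.8 is then the logarithm of the returned value. For genuinely infinite-dimensional function classes no such simple sweep is available, which is why bounds such as Lemma A.22 are needed.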

Theorem A.12. (van der Vaart and Wellner, 1996, Theorem 2.8.1) Let $\mathcal{F}$ be a P-measurable class of functions on a measurable space for every probability measure P in a class $\mathcal{P}$. Suppose that, for some measurable envelope function F,
\[ \lim_{M\to\infty} \sup_{P \in \mathcal{P}} P F\{F > M\} = 0, \]
\[ \sup_{Q \in \mathcal{Q}_n} \log N\bigl(\varepsilon \|F\|_{Q,1}, \mathcal{F}, L_1(Q)\bigr) = o(n) \quad \text{for every } \varepsilon > 0, \]
where $\mathcal{Q}_n$ denotes the set of all discrete probability measures with atoms of size integer multiples of 1/n. Then $\mathcal{F}$ is Glivenko–Cantelli uniformly in $P \in \mathcal{P}$.

Theorem A.13. (van der Vaart and Wellner, 1996, Theorem 2.8.3) Let $\mathcal{F}$ be a class of measurable functions with measurable envelope function F such that the function classes
\[ \mathcal{F}_{\delta,P} := \bigl\{ f - g : f, g \in \mathcal{F},\ \|f - g\|_{P,2} < \delta \bigr\}, \qquad \mathcal{F}_\infty^2 := \bigl\{ (f - g)^2 : f, g \in \mathcal{F} \bigr\} \]
are P-measurable for every δ > 0 and $P \in \mathcal{P}$. Furthermore, suppose that
\[ \lim_{M\to\infty} \sup_{P \in \mathcal{P}} P F^2\{F > M\} = 0, \]
\[ \int_0^\infty \sup_{Q \in \mathcal{Q}} \sqrt{\log N\bigl(\varepsilon \|F\|_{Q,2}, \mathcal{F}, L_2(Q)\bigr)} \, d\varepsilon < \infty, \]
where $\mathcal{Q}$ denotes the set of all finitely discrete probability measures. Then $\mathcal{F}$ is Donsker and pre-Gaussian uniformly in $P \in \mathcal{P}$.

Theorem A.14. (van der Vaart and Wellner, 1996, Theorem 2.8.4) Let $\mathcal{F}$ be a class of measurable functions such that
\[ \lim_{M\to\infty} \sup_{P \in \mathcal{P}} P F^2\{F > M\} = 0, \]
\[ \int_0^\infty \sup_{P \in \mathcal{P}} \sqrt{\log N_{[\,]}\bigl(\varepsilon \|F\|_{P,2}, \mathcal{F}, L_2(P)\bigr)} \, d\varepsilon < \infty. \]
Then $\mathcal{F}$ is Donsker and pre-Gaussian uniformly in $P \in \mathcal{P}$.

Lemma A.15. (van der Vaart and Wellner, 1996, Lemma 2.8.7) Let a function class $\mathcal{F}$ be Donsker and pre-Gaussian uniformly in the sequence $\{P_n\}$ and suppose that the following conditions are satisfied:
\[ \forall \varepsilon > 0: \quad \limsup_{n\to\infty} P_n F^2\{F \ge \varepsilon \sqrt{n}\} = 0, \]
\[ \lim_{n\to\infty} \sup_{f,g \in \mathcal{F}} \bigl| \sigma_{P_n}(f - g) - \sigma_{P_0}(f - g) \bigr| = 0, \]

with $\sigma_P$ denoting the seminorm $\sigma_P(f) := \|f - Pf\|_{P,2}$. Then $G_{n,P_n} \xrightarrow{w} G_{P_0}$ in $\ell^\infty(\mathcal{F})$.

Definition A.16.
(a) (cf. van der Vaart and Wellner, 1996, Section 2.6.1) Let $\mathcal{C}$ be a collection of subsets of a set $\mathcal{X}$ and let $\{x_1, \dots, x_n\} \subset \mathcal{X}$ be an arbitrary subset of n points. Say that $\mathcal{C}$ picks out a certain subset $A \subset \{x_1, \dots, x_n\}$ if $A = C \cap \{x_1, \dots, x_n\}$ for some $C \in \mathcal{C}$. The collection $\mathcal{C}$ is said to shatter $\{x_1, \dots, x_n\}$ if each of its $2^n$ subsets can be picked out by $\mathcal{C}$. The Vapnik–Červonenkis index (short: VC-index) $V(\mathcal{C})$ of the class $\mathcal{C}$ is the smallest n for which no set of size n is shattered by $\mathcal{C}$. A collection of measurable sets $\mathcal{C}$ is called a VC-class if $V(\mathcal{C})$ is finite.

(b) (cf. van der Vaart and Wellner, 1996, Section 2.6.2) The subgraph of a function $f : \mathcal{X} \to \mathbb{R}$ is a subset of $\mathcal{X} \times \mathbb{R}$ given by
\[ \{(x, t) : t < f(x)\}. \]
A collection $\mathcal{F}$ of measurable functions on $\mathcal{X}$ is called a VC-subgraph class, or just a VC-class, if the collection of all subgraphs of the functions in $\mathcal{F}$ forms a VC-class of sets in $\mathcal{X} \times \mathbb{R}$. The VC-index $V(\mathcal{F})$ of $\mathcal{F}$ is defined as the VC-index of the collection of subgraphs of $\mathcal{F}$.

(c) (cf. van der Vaart and Wellner, 1996, Section 2.6.3) A class $\mathcal{F}$ of measurable functions is called VC-hull if $\mathcal{F}$ is contained in the pointwise closure of the symmetric convex hull of a VC-class $\mathcal{G}$, i.e., if for any $f \in \mathcal{F}$ there exists a sequence of functions
\[ f_n = \sum_{i=1}^n \alpha_i g_i, \qquad g_i \in \mathcal{G}, \qquad \sum_{i=1}^n |\alpha_i| \le 1, \]
such that
\[ \forall x \in \mathcal{X}: \quad f(x) = \lim_{n\to\infty} f_n(x). \]
If the class $\mathcal{G}$ can be taken equal to a class of indicator functions, then $\mathcal{F}$ is called a VC-hull class for sets.

(d) (cf. van der Vaart and Wellner, 1996, Section 2.6.4) A class $\mathcal{F}$ of measurable functions is called VC-major if the sets $\{x : f(x) > t\}$ with f ranging over $\mathcal{F}$ and t ranging over $\mathbb{R}$ form a VC-class of sets.

Lemma A.17. (van der Vaart and Wellner, 1996, Lemma 2.6.17(iii)) Let $\mathcal{C}$ and $\mathcal{D}$ be VC-classes of sets in a set $\mathcal{X}$. Then the set class $\mathcal{C} \sqcup \mathcal{D} := \{C \cup D : C \in \mathcal{C}, D \in \mathcal{D}\}$ is VC.
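The notions of picking out, shattering, and the VC-index from Definition A.16(a) can be checked by brute force for a simple set class. The following Python sketch is an assumed illustration for the class of closed intervals [a, b] on the real line. The helper names `picked_out_by_intervals`, `is_shattered`, and `vc_index` are hypothetical, and the search is restricted to a finite ground set, which suffices here because shattering by intervals depends only on the order of the points.

```python
from itertools import combinations


def picked_out_by_intervals(points):
    """All subsets of `points` of the form [a, b] ∩ points for real a <= b."""
    pts = sorted(points)
    picked = {frozenset()}  # an interval away from all points picks out the empty set
    for i in range(len(pts)):
        for j in range(i, len(pts)):
            # an interval picks out exactly a run of consecutive points
            picked.add(frozenset(pts[i:j + 1]))
    return picked


def is_shattered(points):
    """True if every one of the 2^n subsets of the n distinct points is picked out."""
    return len(picked_out_by_intervals(points)) == 2 ** len(points)


def vc_index(ground_set, max_n):
    """Smallest n <= max_n such that no n-point subset of ground_set is shattered."""
    for n in range(1, max_n + 1):
        if not any(is_shattered(c) for c in combinations(ground_set, n)):
            return n
    return None  # every size up to max_n admits a shattered subset


if __name__ == "__main__":
    print(is_shattered([0.5, 2.0]))       # two points
    print(is_shattered([1.0, 2.0, 3.0]))  # three points
    print(vc_index(range(6), 5))
```

Two points are shattered by intervals, whereas for three points the subset consisting of the two outer points cannot be picked out. In the convention of Definition A.16(a), the VC-index of the class of intervals is therefore 3.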

Lemma A.18. (van der Vaart and Wellner, 1996, Lemma 2.6.15) Any finite-dimensional vector space $\mathcal{F}$ of measurable functions is VC-subgraph of index smaller than or equal to $\dim(\mathcal{F}) + 2$.

Lemma A.19. (van der Vaart and Wellner, 1996, Lemma 2.6.18(iii)) Let $\mathcal{F}$ be a VC-subgraph function class. Then the set collection $\{\{f > 0\} : f \in \mathcal{F}\}$ is VC.

Lemma A.20. (van der Vaart and Wellner, 1996, Lemma 2.6.19) If $\mathcal{F}$ is VC-major, then the class of functions $h \circ f$, with h ranging over the monotone functions $h : \mathbb{R} \to \mathbb{R}$ and f over $\mathcal{F}$, is VC-major.

Lemma A.21. (van der Vaart and Wellner, 1996, Lemma 2.6.13) A bounded VC-major class is a scalar multiple of a VC-hull class for sets.

Lemma A.22. (van der Vaart and Wellner, 1996, Corollary 2.6.12) For any VC-hull class F of measurable functions and probability measure Q,

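\[ \log N\bigl(\varepsilon \|F\|_{Q,2}, \mathcal{F}, L_2(Q)\bigr) \le K \Bigl(\frac{1}{\varepsilon}\Bigr)^{2 - 2 V_m(\mathcal{F})^{-1}} \]
for a constant K that depends only on the VC-index $V_m(\mathcal{F})$ of the VC-subgraph class connected with $\mathcal{F}$.

Bibliography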

C. Acerbi. Spectral measures of risk: A coherent representation of subjective risk aversion. J. Bank. Finance, 26(7):1505–1518, July 2002. URL http://dx.doi.org/doi:10.1016/S0378-4266(02)00281-9.

S. Alink, M. Löwe, and M. V. Wüthrich. Diversification of aggregate dependent risks. Insurance Math. Econom., 35(1):77–95, 2004. ISSN 0167-6687. URL http://dx.doi.org/10.1016/j.insmatheco.2004.05.001.

S. Alink, M. Löwe, and M. V. Wüthrich. Analysis of the expected shortfall of aggregate dependent risks. Astin Bull., 35(1):25–43, 2005. ISSN 0515-0361. URL http://dx.doi.org/10.2143/AST.35.1.583164.

P. Artzner, F. Delbaen, J.-M. Eber, and D. Heath. Coherent measures of risk. Math. Finance, 9(3):203–228, 1999. ISSN 0960-1627. URL http://dx.doi.org/10.1111/1467-9965.00068.

P. Barbe, A.-L. Fougères, and C. Genest. On the tail behavior of sums of dependent risks. Astin Bull., 36(2):361–373, 2006. ISSN 0515-0361. URL http://dx.doi.org/10.2143/AST.36.2.2017926.

B. Basrak, R. A. Davis, and T. Mikosch. A characterization of multivariate regular variation. Ann. Appl. Probab., 12(3):908–920, 2002. ISSN 1050-5164.

N. Bäuerle. Inequalities for stochastic models via supermodular orderings. Comm. Statist. Stochastic Models, 13(1):181–201, 1997. ISSN 0882-0287.

J. Beirlant, G. Dierckx, Y. Goegebeur, and G. Matthys. Tail index estimation and an exponential regression model. Extremes, 2(2):177–200, 1999. ISSN 1386-1999. URL http://dx.doi.org/10.1023/A:1009975020370.

J. Beirlant, Y. Goegebeur, J. Teugels, and J. Segers. Statistics of Extremes. Wiley Series in Probability and Statistics. John Wiley & Sons Ltd., Chichester, 2004a. ISBN 0-471-97647-4. Theory and Applications, with contributions from Daniel De Waal and Chris Ferro.


J. Beirlant, W. Schoutens, and J. Segers. Mandelbrot’s Extremism. SSRN eLibrary, 2004b. URL http://ssrn.com/paper=642405.

J. Bergenthum and L. Rüschendorf. Comparison of semimartingales and Lévy processes. Ann. Probab., 35(1):228–254, 2007. ISSN 0091-1798. URL http://dx.doi.org/10.1214/009117906000000386.

P. Billingsley. Convergence of Probability Measures. John Wiley & Sons Inc., New York, 1968.

N. H. Bingham, C. M. Goldie, and J. L. Teugels. Regular Variation, volume 27 of Encyclopedia of Mathematics and Its Applications. Cambridge University Press, Cambridge, 1987. ISBN 0-521-30787-2.

K. Böcker and C. Klüppelberg. Modelling and measuring multivariate operational risk with Lévy copulas. J. Operational Risk, 3(2):3–27, 2008.

J. Boman and F. Lindskog. Support theorems for the Radon transform and Cramér-Wold theorems. J. Theor. Probab., 22(3):683–710, 2009. ISSN 0894-9840. URL http://dx.doi.org/10.1007/s10959-008-0151-0.

I. W. Burr. Cumulative frequency functions. Ann. Math. Stat., 13:215–232, 1942. ISSN 0003-4851.

P. Capéraà, A.-L. Fougères, and C. Genest. A nonparametric estimation procedure for bivariate extreme value copulas. Biometrika, 84(3):567–577, 1997. ISSN 0006-3444.

S. G. Coles and J. A. Tawn. Modelling extreme multivariate events. J. R. Stat. Soc., Ser. B, 53(2):377–392, 1991. ISSN 0035-9246.

R. D. Cook and M. E. Johnson. A family of distributions for modelling nonelliptically symmetric multivariate data. J. R. Stat. Soc., Ser. B, 43(2):210–218, 1981. ISSN 0035-9246. URL http://links.jstor.org/sici?sici=0035-9246(1981)43:2<210:AFODFM>2.0.CO;2-0&origin=MSN.

J. Daníelsson, L. de Haan, L. Peng, and C. G. de Vries. Using a bootstrap method to choose the sample fraction in tail index estimation. J. Multivariate Anal., 76(2):226–248, 2001. ISSN 0047-259X. URL http://dx.doi.org/10.1006/jmva.2000.1903.

J. Daníelsson, B. N. Jorgensen, G. Samorodnitsky, M. Sarma, and C. G. de Vries. Sub-additivity re-examined: the case for Value-at-Risk. FMG Discussion Papers, London School of Economics, Nov 2005. URL http://risk.lse.ac.uk/rr/files/JD-Cd-BJ-SM-GS-23.pdf.

R. Davis and S. Resnick. Tail estimates motivated by extreme value theory. Ann. Statist., 12(4):1467–1487, 1984. ISSN 0090-5364.

L. de Haan and A. Ferreira. Extreme Value Theory. Springer Series in Operations Research and Financial Engineering. Springer, New York, 2006. ISBN 978-0-387-23946-0; 0-387-23946-4.

L. de Haan and S. I. Resnick. Limit theory for multivariate sample extremes. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete, 40(4):317–337, 1977.

L. de Haan and S. I. Resnick. Estimating the limit distribution of multivariate extremes. Comm. Statist. Stochastic Models, 9(2):275–309, 1993. ISSN 0882-0287.

L. de Haan and A. K. Sinha. Estimating the probability of a rare event. Ann. Statist., 27(2):732–759, 1999. ISSN 0090-5364.

M. Degen, D. D. Lambrigger, and J. Segers. Risk concentration and diversification: second-order properties. Insurance Math. Econom., to appear, 2010.

A. L. M. Dekkers, J. H. J. Einmahl, and L. de Haan. A moment estimator for the index of an extreme-value distribution. Ann. Statist., 17(4):1833–1855, 1989. ISSN 0090-5364.

F. Delbaen. Risk measures for non-integrable random variables. Math. Finance, 19(2):329–333, 2009. ISSN 0960-1627. URL http://dx.doi.org/10.1111/j.1467-9965.2009.00370.x.

H. Drees. Refined Pickands estimators of the extreme value index. Ann. Statist., 23(6):2059–2080, 1995. ISSN 0090-5364.

H. Drees and E. Kaufmann. Selecting the optimal sample fraction in univariate extreme value estimation. Stochastic Process. Appl., 75(2):149–172, 1998. ISSN 0304-4149. URL http://dx.doi.org/10.1016/S0304-4149(98)00017-9.

H. Drees, A. Ferreira, and L. de Haan. On maximum likelihood estimation of the extreme value index. Ann. Appl. Probab., 14(3):1179–1201, 2004. ISSN 1050-5164.

J. H. J. Einmahl, L. de Haan, and X. Huang. Estimating a multidimensional extreme-value distribution. J. Multivariate Anal., 47(1):35–47, 1993. ISSN 0047-259X.

J. H. J. Einmahl, L. de Haan, and A. K. Sinha. Estimating the spectral measure of an extreme value distribution. Stochastic Process. Appl., 70(2):143–171, 1997. ISSN 0304-4149.

J. H. J. Einmahl, L. de Haan, and V. I. Piterbarg. Nonparametric estimation of the spectral measure of an extreme value distribution. Ann. Statist., 29(5):1401–1423, 2001. ISSN 0090-5364.

P. Embrechts, C. Klüppelberg, and T. Mikosch. Modelling Extremal Events, volume 33 of Applications of Mathematics (New York). Springer-Verlag, Berlin, 1997. ISBN 3-540-60931-8.

P. Embrechts, A. J. McNeil, and D. Straumann. Correlation and dependence in risk management: properties and pitfalls. In Risk Management: Value at Risk and Beyond (Cambridge, 1998), pages 176–223. Cambridge Univ. Press, Cambridge, 2002.

P. Embrechts, D. D. Lambrigger, and M. V. Wüthrich. Multivariate extremes and the aggregation of dependent risks: examples and counter-examples. Extremes, 12(2):107–127, 2009a. ISSN 1386-1999. URL http://dx.doi.org/10.1007/s10687-008-0071-5.

P. Embrechts, J. Nešlehová, and M. V. Wüthrich. Additivity properties for value-at-risk under Archimedean dependence and heavy-tailedness. Insurance Math. Econom., 44(2):164–169, 2009b. ISSN 0167-6687. URL http://dx.doi.org/10.1016/j.insmatheco.2008.08.001.

M. Falk, J. Hüsler, and R.-D. Reiss. Laws of Small Numbers: Extremes and Rare Events, volume 23 of DMV Seminar. Birkhäuser, Basel, 1994. ISBN 3-7643-5071-7.

E. F. Fama. The behavior of stock-market prices. J. Bus., 38(1):34–105, 1965. ISSN 00219398. URL http://www.jstor.org/stable/2350752.

R. A. Fisher and L. H. C. Tippett. Limiting forms of the frequency distribution of the largest or smallest member of a sample. Mathematical Proceedings of the Cambridge Philosophical Society, 24(02):180–190, 1928. URL http://dx.doi.org/10.1017/S0305004100015681.

C. Genest and L.-P. Rivest. A characterization of Gumbel’s family of extreme value distributions. Stat. Probab. Lett., 8(3):207–211, 1989. ISSN 0167-7152.

B. Gnedenko. Sur la distribution limite du terme maximum d’une série aléatoire. Ann. Math. (2), 44:423–453, 1943. ISSN 0003-486X.

G. Hajós and A. Rényi. Elementary proofs of some basic facts concerning order statistics. Acta Math. Acad. Sci. Hungar., 5:1–6, 1954. ISSN 0001-5954.

P. Hall and A. H. Welsh. Adaptive estimates of parameters of regular variation. Ann. Statist., 13(1):331–341, 1985. ISSN 0090-5364. URL http://dx.doi.org/10.1214/aos/1176346596.

H. A. Hauksson, M. M. Dacorogna, T. Domenig, U. A. Müller, and G. Samorodnitsky. Multivariate extremes, aggregation and risk estimation. SSRN eLibrary, 2000. URL http://ssrn.com/paper=254392.

B. M. Hill. A simple general approach to inference about the tail of a distri- bution. Ann. Statist., 3(5):1163–1174, 1975. ISSN 0090-5364.

R. V. Hogg and S. A. Klugman. Loss Distributions. Wiley Series in Probability and Mathematical Statistics: Applied Probability and Statistics. John Wiley & Sons Inc., New York, 1984. ISBN 0-471-87929-0. With the assistance of Charles C. Hewitt and Gary Patrik.

H. Hult and F. Lindskog. Multivariate extremes, aggregation and dependence in elliptical distributions. Adv. Appl. Probab., 34(3):587–608, 2002. ISSN 0001-8678. URL http://dx.doi.org/10.1239/aap/1033662167.

H. Hult and F. Lindskog. Regular variation for measures on metric spaces. Publ. Inst. Math. (Beograd) (N.S.), 80(94):121–140, 2006. ISSN 0350-1302. URL http://dx.doi.org/10.2298/PIM0694121H.

H. Joe. Multivariate Models and Dependence Concepts, volume 73 of Monographs on Statistics and Applied Probability. Chapman & Hall, London, 1997. ISBN 0-412-07331-5.

J. H. B. Kemperman. On the FKG-inequality for measures on a partially ordered space. Nederl. Akad. Wetensch. Proc. Ser. A 80 = Indag. Math., 39(4):313–331, 1977.

C. Klüppelberg and S. I. Resnick. The Pareto copula, aggregation of risks, and the emperor’s socks. J. Appl. Probab., 45(1):67–84, 2008. ISSN 0021-9002. URL http://dx.doi.org/10.1239/jap/1208358952.

D. Kortschak and H. Albrecher. Asymptotic results for the sum of dependent non-identically distributed random variables. Methodol. Comput. Appl. Probab., 11(3):279–306, 2009a. ISSN 1387-5841. URL http://dx.doi.org/10.1007/s11009-007-9053-3.

D. Kortschak and H. Albrecher. An asymptotic expansion for the tail of compound sums of Burr distributed random variables. Stat. Probab. Lett., In Press, Corrected Proof, 2009b. ISSN 0167-7152. URL http://dx.doi.org/10.1016/j.spl.2009.12.018.

F. Lindskog. Multivariate Extremes and Regular Variation for Stochastic Processes. Doctoral thesis, Department of Mathematics, Swiss Federal Institute of Technology, Zürich, 2004. URL http://e-collection.ethbib.ethz.ch/eserv/eth:27026/eth-27026-02.pdf.

J. Lintner. The valuation of risk assets and the selection of risky investments in stock portfolios and capital budgets. Rev. Econ. Stat., 47(1):13–37, 1965. ISSN 00346535. URL http://www.jstor.org/stable/1924119.

F. M. Longin. The asymptotic distribution of extreme stock market returns. J. Bus., 69(3):383, 1996. URL http://dx.doi.org/10.1086/209695.

G. Mainik and L. Rüschendorf. On optimal portfolio diversification with respect to extreme risks. Finance Stoch., accepted for publication, 2010. doi: 10.1007/s00780-010-0122-z.

Y. Malevergne and D. Sornette. Extreme Financial Risks. Springer-Verlag, Berlin, 2006. ISBN 978-3-540-27264-9; 3-540-27264-X.

Y. Malevergne, V. Pisarenko, and D. Sornette. On the power of generalized extreme value (GEV) and generalized Pareto distribution (GPD) estimators for empirical distributions of stock returns. Appl. Financ. Econ., 16(3):271–289, February 2006. URL http://dx.doi.org/10.1080/09603100500391008.

B. Mandelbrot. The variation of certain speculative prices. J. Bus., 36(4):394–419, 1963. ISSN 00219398. URL http://www.jstor.org/stable/2350970.

H. Markowitz. Portfolio selection. J. Finance, 7(1):77–91, 1952. ISSN 00221082. URL http://www.jstor.org/stable/2975974.

H. M. Markowitz. Foundations of portfolio theory. J. Finance, 46(2):469–477, 1991. ISSN 00221082. URL http://www.jstor.org/stable/2328831.

A. W. Marshall and I. Olkin. Inequalities: Theory of Majorization and Its Applications, volume 143 of Mathematics in Science and Engineering. Academic Press Inc. [Harcourt Brace Jovanovich Publishers], New York, 1979. ISBN 0-12-473750-1.

D. M. Mason. Laws of large numbers for sums of extreme values. Ann. Probab., 10(3):754–764, 1982. ISSN 0091-1798. URL http://www.jstor.org/stable/2243383.

A. McNeil and J. Nešlehová. Multivariate Archimedean copulas, d-monotone functions and ℓ1-norm symmetric distributions. Ann. Statist., 37:3059–3097, 2009. URL http://dx.doi.org/10.1214/07-AOS556.

A. J. McNeil, R. Frey, and P. Embrechts. Quantitative Risk Management. Princeton Series in Finance. Princeton University Press, Princeton, NJ, 2005. ISBN 0-691-12255-5.

I. Molchanov. Convex geometry of max-stable distributions. Extremes, 11(3):235–259, 2008. ISSN 1386-1999. URL http://dx.doi.org/10.1007/s10687-008-0055-5.

M. Moscadelli. The modelling of operational risk: experience with the analysis of the data collected by the Basel Committee. Temi di discussione (Economic working papers) 517, Bank of Italy, Economic Research Department, July 2004. URL http://ideas.repec.org/p/bdi/wptemi/td_517_04.html.

A. Müller and D. Stoyan. Comparison Methods for Stochastic Models and Risks. Wiley Series in Probability and Statistics. John Wiley & Sons Ltd., Chichester, 2002. ISBN 0-471-49446-1.

J. Nešlehová, P. Embrechts, and V. Chavez-Demoulin. Infinite-mean models and the LDA for operational risk. J. Operational Risk, 1(1):3–25, 2006.

J. Pickands, III. Statistical inference using extreme order statistics. Ann. Statist., 3:119–131, 1975. ISSN 0090-5364.

J. Pickands, III. Multivariate extreme value distributions. In Proceedings of the 43rd Session of the International Statistical Institute, Vol. 2 (Buenos Aires, 1981), volume 49, pages 859–878, 894–902, 1981. With a discussion.

S. Resnick. The extremal dependence measure and asymptotic independence. Stoch. Models, 20(2):205–227, 2004. ISSN 1532-6349.

S. I. Resnick. Extreme Values, Regular Variation, and Point Processes, volume 4 of Applied Probability. A Series of the Applied Probability Trust. Springer-Verlag, New York, 1987. ISBN 0-387-96481-9.

S. I. Resnick. Heavy-Tail Phenomena. Springer Series in Operations Research and Financial Engineering. Springer, New York, 2007. ISBN 978-0-387-24272-9; 0-387-24272-4.

H. Rootzén and C. Klüppelberg. A single number can’t hedge against economic catastrophes. Ambio, 28(6):550–555, 1999. ISSN 00447447. URL http://www.jstor.org/stable/4314953.

L. Rüschendorf. Comparison of multivariate risks and positive dependence. J. Appl. Probab., 41(2):391–406, 2004. ISSN 0021-9002. URL http://projecteuclid.org/getRecord?id=euclid.jap/1082999074.

R. Schmidt and U. Stadtmüller. Non-parametric estimation of tail dependence. Scand. J. Statist., 33(2):307–335, 2006. ISSN 0303-6898.

W. F. Sharpe. Capital asset prices: A theory of market equilibrium under conditions of risk. J. Finance, 19(3):425–442, 1964. ISSN 00221082. URL http://www.jstor.org/stable/2977928.

N. V. Smirnov. Limit distributions for the terms of a variational series. Trudy Mat. Inst. Steklov., 25:60, 1949. ISSN 0371-9685.

R. L. Smith. Estimating tails of probability distributions. Ann. Statist., 15(3):1174–1207, 1987. ISSN 0090-5364.

P. R. Tadikamalla. A look at the Burr and related distributions. Int. Stat. Rev., 48(3):337–344, 1980. ISSN 0306-7734. URL http://dx.doi.org/10.2307/1402945.

A. W. van der Vaart and J. A. Wellner. Weak Convergence and Empirical Processes. Springer Series in Statistics. Springer, New York, 1996. ISBN 0-387-94640-3. Corrected 2nd printing 2000.

G. Wei and T. Hu. Supermodular dependence ordering on a class of multivariate copulas. Stat. Probab. Lett., 57(4):375–385, 2002. ISSN 0167-7152. URL http://dx.doi.org/10.1016/S0167-7152(02)00094-9.

M. V. Wüthrich. Asymptotic value-at-risk estimates for sums of dependent random variables. Astin Bull., 33(1):75–92, 2003. ISSN 0515-0361. URL http://dx.doi.org/10.2143/AST.33.1.1040.