DEGREE PROJECT IN MATHEMATICS, SECOND CYCLE, 30 CREDITS
STOCKHOLM, SWEDEN 2020

Simulation-Based Portfolio Optimization with Coherent Distortion Risk Measures

ANDREAS PRASTORFER

KTH ROYAL INSTITUTE OF TECHNOLOGY SCHOOL OF ENGINEERING SCIENCES


Degree Projects in Financial Mathematics (30 ECTS credits)
Degree Programme in Applied and Computational Mathematics
KTH Royal Institute of Technology, 2020
Supervisor at SAS Institute: Jimmy Skoglund
Supervisor at KTH: Camilla Johansson Landén
Examiner at KTH: Camilla Johansson Landén

TRITA-SCI-GRU 2020:005 MAT-E 2020:05

Royal Institute of Technology
School of Engineering Sciences
KTH SCI
SE-100 44 Stockholm, Sweden
URL: www.kth.se/sci

Abstract

This master's thesis studies portfolio optimization using linear programming algorithms. The contribution of the thesis is an extension of the convex framework for portfolio optimization with Conditional Value-at-Risk, introduced by Rockafellar and Uryasev [28]. The extended framework considers risk measures belonging to the intersection of the classes of coherent risk measures and distortion risk measures, known as coherent distortion risk measures. The risk measures in this class considered here are the Conditional Value-at-Risk, the Wang Transform, the Block Maxima and the Dual Block Maxima measures. The extended portfolio optimization framework is applied to a reference portfolio consisting of stocks, options and a bond index, all from the Swedish market. The returns of the assets in the reference portfolio are modelled both with elliptical distributions and with a normal copula with asymmetric marginal return distributions.

The portfolio optimization framework is simulation based and measures the risk using scenarios simulated from the assumed portfolio distribution model. To model the return data with asymmetric distributions, the tails of the marginal distributions are fitted with generalized Pareto distributions, and the dependence structure between the assets is captured using a normal copula. The results obtained from the optimizations are compared across the different distributional return assumptions of the portfolio and the four risk measures. A Markowitz solution to the problem is computed using the mean absolute deviation as the risk measure; this solution is the benchmark against which the optimal solutions obtained with the coherent distortion risk measures are compared.

The coherent distortion risk measures have the attractive property of being able to assign user-defined weights to different parts of the loss distribution and can hence value increasing loss severities as greater risks. The user-defined loss-weighting property and the asymmetric return distribution models are used to find optimal portfolios that account for extreme losses. An important finding of this project is that optimal solutions for asset returns simulated from asymmetric distributions are associated with greater risks, which is a consequence of more accurate modelling of the distribution tails. Furthermore, weighting larger losses with increasingly larger weights shows that the portfolio risk is greater and that a safer position is taken.

Sammanfattning

(Translated from Swedish.) This master's thesis treats portfolio optimization with linear programming algorithms. The contribution of the thesis is an extension of the convex framework for portfolio optimization with Conditional Value-at-Risk, introduced by Rockafellar and Uryasev [28]. The extended framework treats risk measures that belong to the intersection of the class of coherent risk measures and the class of distortion risk measures, referred to as coherent distortion risk measures. The risk measures in this class treated in the thesis are the Conditional Value-at-Risk, the Wang Transform, the Block Maxima and the Dual Block Maxima measures. The extended portfolio optimization framework is applied to a reference portfolio consisting of stocks, options and a bond index from the Swedish market. The asset returns in the reference portfolio are modelled both with elliptical distributions and with a normal copula with asymmetric marginal distributions.

The portfolio optimization framework is a simulation-based framework that measures risk based on scenarios simulated from the distribution model assumed for the portfolio. To model the asset returns with asymmetric distributions, the tails of the marginal distributions are modelled with generalized Pareto distributions, and a normal copula models the mutual dependence between the assets. The results of the portfolio optimizations are compared across the different return assumptions and the four risk measures. The problem is also solved with Markowitz optimization, where the mean absolute deviation is used as the risk measure; this solution is the benchmark solution against which the optimal solutions computed with the coherent distortion risk measures are compared.

The special property of the coherent distortion risk measures that makes it possible to assign user-specified weights to different parts of the loss distribution allows them to value more extreme losses as greater risks. The user-defined weighting property of the risk measures is studied in combination with the asymmetric distribution model to explore portfolios that take extreme losses into account. An important finding is that optimal solutions for returns modelled with asymmetric distributions are associated with increased risk, which is a consequence of more accurate modelling of the distribution tails of the assets. Another finding is that if larger weights are placed on higher losses, the measured portfolio risk increases and a safer portfolio strategy is adopted.


Acknowledgements

I want to start by expressing my deepest gratitude to my supervisor Jimmy Skoglund at SAS Institute for his guidance and advice. I would also like to thank him for the inspiring conversations and the words of encouragement when I experienced difficulties. I want to thank my colleagues and friends at SAS Institute for making me feel welcome at the office and for making this project a joyful experience. I want to thank Camilla Landén, my academic supervisor at the Royal Institute of Technology, for valuable feedback and her helpful hand when most needed. Finally, I would like to thank my family and friends for their love and support and for always believing in me.

Stockholm, January 6, 2020


Contents

1 Introduction
  1.1 Background
  1.2 Project Goals
  1.3 Disposition

2 Mathematical Background
  2.1 Risk Measure Theory
    2.1.1 Coherent Risk Measures
    2.1.2 Distortion Risk Measures
  2.2 Elliptical Distributions
  2.3 Financial Time Series
  2.4 GARCH models
  2.5 Extreme Value Theory
  2.6 Copulas

3 Method
  3.1 Introduction to Portfolio Analysis
  3.2 Introduction to Portfolio Optimization
  3.3 Portfolio Models
    3.3.1 Remark: Elliptically Distributed Returns in Portfolio Selection
    3.3.2 Asymmetric Distributions
  3.4 Linear Programming Methods
    3.4.1 Markowitz Linear Program
    3.4.2 Rockafellar and Uryasev's CVaR Optimization
    3.4.3 Extension of Rockafellar and Uryasev CVaR Optimization
  3.5 Risk contributions

4 Analysis and Conclusion
  4.1 Reference Portfolios and Benchmark Solution
  4.2 Mean Average Deviation Benchmark Solution
  4.3 Optimization With Coherent Distortion Risk Measures
    4.3.1 Conditional Value-At-Risk
    4.3.2 Wang Transform
    4.3.3 Block Maxima
    4.3.4 Dual Block Maxima
  4.4 Robustness of solution for Mean Variation
  4.5 Comparison of CDRM Optimization
  4.6 Analysis of Euler Risk Contributions

5 Final Conclusion and Further Investigation

A Appendix
  A.1 Fitting GARCH and GPD parameters
  A.2 Estimated Parameters
  A.3 Payoff Functions and Profit-and-Loss distributions for Derivatives


1 Introduction

1.1 Background

The trade-off between risk and reward is the main focus for investors. It is well known that higher potential reward comes with increased risk. Choosing an investment strategy often narrows down to the risk appetite of the investor. An investment portfolio may consist of assets with various risk levels. The possible combinations are endless, but not all are optimal. Harry Markowitz [24] demonstrated this with the efficient frontier, which illustrates the set of optimal investments in the risk-reward spectrum. Any portfolio on the efficient frontier represents an optimal investment portfolio, while any portfolio below the frontier is sub-optimal. The seminal work by Markowitz introduced a new portfolio selection theory that has had an enormous effect on how investment opportunities are analyzed. He called it the modern portfolio theory. Upon introduction, the modern portfolio theory focused on linear portfolio assets, and the theory relied on the central assumption that the assets could be modeled using a normal distribution. Markowitz used variance as a quantifier of risk.

Since the introduction of the modern portfolio theory, more advanced methods of risk measurement have been developed, mainly to acknowledge the stylized fact that returns are not normally distributed in practice. Artzner et al. [2] proposed the coherent risk measures via an axiomatic approach, in which the mathematical properties of the risk measures were derived from a set of intuitive principles. This axiomatic approach was extended further by Wang et al. [37], introducing the distortion risk measures. Wirch and Hardy [38] studied the intersection between these classes, known as the coherent distortion risk measures, which inherit the properties of both classes.

With the increased regulation of financial institutions, risk measures such as the coherent Conditional Value-at-Risk have gained particular attention. It is, for example, used in the new market risk regulation "Fundamental Review of the Trading Book" (FRTB). The FRTB is a set of international standardized rules governing how financial institutions' risk should be measured and reported, which was proposed by the Basel Committee after the 2007 financial crisis to strengthen the financial system. Conditional Value-at-Risk has the attractive property of being convex, which makes it a good candidate for optimization. This was studied by Rockafellar and Uryasev [28], who developed a formulation of Conditional Value-at-Risk minimization that can be solved with linear programming methods. Bertsimas et al. [3] showed that any member of the family of distortion risk measures can be represented as a convex combination of Conditional Value-at-Risk measures. This is a central concept used in this project for developing a framework for optimization with coherent distortion risk measures.

1.2 Project Goals

The project goals of this master's thesis are to

• Extend the classical Markowitz portfolio optimization to consider non-linear options.
• Abandon the assumption of normal asset returns for a more accurate model that better captures the greater risks in the tails of the return distribution.
• Develop a convex linear programming framework for optimization with coherent distortion risk measures.
• Apply the following coherent distortion risk measures in the framework:

  – Conditional Value-at-Risk
  – Wang Transform
  – Block Maxima
  – Dual Block Maxima

• Benchmark the optimal asset allocations against the classical mean-variance optimization.
• Analyze the optimal portfolio results together with local information on the assets' risk contributions.

1.3 Disposition

The disposition of this thesis is as follows. Section 2 gives a mathematical background on general risk measure theory, defines the elliptical distributions to be used for portfolio modeling, and presents the basics of financial time series analysis, extreme value theory, and copulas. In Section 3, the methods used to reach the project goals are presented. The section starts with a short introduction to the general concepts of portfolio theory. The general portfolio optimization problem formulation is then stated in mathematical terms, and the portfolio models are presented. The standard elliptical models are stated, with a subsequent motivation to consider asymmetric distributions. The Mean Absolute Deviation linear programming model is stated in its framework, and a presentation of the minimization of Conditional Value-at-Risk is given. The latter is then extended, setting up the framework for optimization of coherent distortion risk measures, with minimization represented as convex combinations of Conditional Value-at-Risk. The section concludes with a generalized approach for assessing the marginal risk contribution of every measure of risk used in this thesis. In Section 4, the results of the study are presented and analyzed. In Sections 4.1 and 4.2, the reference portfolio is introduced and solved by applying the Markowitz mean-variance optimization problem. The solution is computed using the mean absolute deviation representation of the traditional Markowitz optimization, and the solution is used as a benchmark for the analysis. The general portfolio optimization problem is then solved with the various portfolio optimization models. The results for optimization with Conditional Value-at-Risk are presented in Section 4.3.1. In Section 4.3.2, the results for optimization with the Wang transform are presented. In Sections 4.3.3 and 4.3.4, the results from the minimization with the Block Maxima distortion and the Dual Block Maxima distortion are presented, respectively. Section 4.5 compares the results for the minimization of the four coherent distortion risk measures, and the results from the asset allocation analysis with marginal risk contributions are provided. In Section 5, the results are discussed from both the perspective of the project goals and practical implementation, and finally, some comments on further investigation are given.

2 Mathematical Background

This section gives an overview of the necessary mathematical background for this project. Each area is given a short presentation and a brief description of its key concepts.

2.1 Risk Measure Theory

Quantifying risk as a measure of uncertainty in a portfolio is one of the core activities in quantitative risk management. This is accomplished by modeling the uncertain return as a random variable with some probability distribution. The risk is quantified by introducing a functional that assigns a single number to the potential loss of the portfolio. Let the uncertainty in the future portfolio value be described by the function X : Ω → R for a fixed set of scenarios Ω defined on the general probability space (Ω, F, P). Assume that X(ω) is bounded for all ω ∈ Ω and let X be the set of reachable portfolios defined on L^∞(Ω, F, P). A risk measure is then defined as the functional ρ : X → R, where a large functional value corresponds to higher return uncertainty. Artzner et al. [2] introduced the important notion of coherent risk measures, defined as risk measures satisfying a set of axioms. These axioms are intuitive mathematical properties that a risk measure should satisfy. The list below summarizes these properties and how they should be interpreted.

2.1.1 Coherent Risk Measures

Translation Invariance (TI): ρ(X + c) = ρ(X) − c
This means that by investing an additional amount of cash c, i.e. without risk, in a cash account, the total portfolio risk will be reduced by the same amount c.

Monotonicity (M): X ≥ Y ⟹ ρ(X) ≤ ρ(Y)
This means that if a portfolio position X is always associated with a greater value than a portfolio Y at a future time, the former is considered less risky.

Convexity (CX): ρ(λX + (1 − λ)Y) ≤ λρ(X) + (1 − λ)ρ(Y), λ ∈ [0, 1]
This ensures that a diversified portfolio is associated with less risk than investing all capital in one instrument.

Normalization (N): ρ(0) = 0
This means that it is acceptable not to invest at all, and if there is no investment there is no risk.

Positive Homogeneity (PH): ρ(λX) = λρ(X), λ ≥ 0
This means that scaling the portfolio position by a non-negative factor scales the portfolio risk by the same factor.

Subadditivity (S): ρ(X + Y) ≤ ρ(X) + ρ(Y)
This means that a merged portfolio does not induce more risk than the sum of the portfolios' standalone risks.

Note that (N) is implied by (PH) and that (CX) together with (PH) implies (S). Different risk measures satisfy different properties, and they are organized into classes according to which properties they satisfy. The coherent risk measures (CRM) satisfy (TI), (M), (PH) and (S), while a monetary measure of risk satisfies only (TI) and (M). Due to the lack of (PH) and (S), a monetary measure of risk is not coherent. The class of coherent risk measures is particularly important, and its representation theorem is therefore presented below for convenience.

Theorem 2.1 (Representation Theorem for coherent risk measures). A risk measure ρ is coherent if and only if there is a family of probability measures Q such that

ρ(X) = sup_{Q ∈ Q} E_Q[X],  ∀X ∈ X,

where E_Q[X] denotes the expectation of the random variable X under Q. This representation theorem states that all coherent risk measures may be represented as the worst-case expected value over a family of "generalized scenarios." In the next two sections, two important risk measures used in risk management will be presented, namely the quantile-based risk measures Value-at-Risk and Conditional Value-at-Risk.

Value-at-Risk

The Value-at-Risk (VaR) measure has become, by far, the most common risk measure for quantifying risk due to its intuitive and straightforward formulation. Value-at-Risk is defined as the potential loss of a portfolio with a value X at a confidence level α ∈ (0, 1), formally

VaR_α(X) = min{m ∈ R : P(m + X < 0) ≤ α}.

Hence, raising a minimum amount of capital m ensures that the probability of a strictly negative portfolio value at the investment horizon is less than α [20, p. 165]. Let X = V_1 − V_0 denote the profit-and-loss (PnL) random variable, which can be viewed as the net gain of the investment, and L = −X the net loss. In statistical terms, VaR_α(X) is the (1 − α)-quantile of L. If L has a right-continuous and strictly increasing cumulative distribution function F_L, then

VaR_α(X) = F_L^{-1}(1 − α),

where F_L^{-1} is the inverse of F_L.

Despite its simplicity and broad applicability, VaR is controversial. The main drawback is that it only considers a specific quantile loss defined by the confidence level. In other words, it does not provide any information about the tail losses beyond that confidence level. Hence, choosing too low a confidence level may result in severe underestimation of the risk. VaR is a translation invariant, positively homogeneous, and monotone risk measure. However, subadditivity is not satisfied in general, as pointed out in Example 6.7 in McNeil et al. [26]. Since VaR is not subadditive in general, it is not a coherent risk measure. In the idealized situation when the portfolio can be expressed as a linear combination of a set of underlying elliptically distributed risk factors, VaR is subadditive. This is proved in Theorem 6.8 in McNeil et al. [26].
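As an illustration, the quantile formula above translates directly into an empirical estimator on a simulated PnL sample. The following Python sketch is a minimal illustration (the function name is illustrative, and only NumPy is assumed); it computes VaR_α as the empirical (1 − α)-quantile of the losses L = −X:

import numpy as np

def empirical_var(pnl, alpha=0.05):
    # VaR_alpha(X) = F_L^{-1}(1 - alpha): the (1 - alpha)-quantile
    # of the loss sample L = -X.
    losses = -np.asarray(pnl)
    return np.quantile(losses, 1.0 - alpha)

# Example: 100 000 simulated standard normal PnL scenarios;
# the estimate is roughly 1.645 for N(0, 1).
rng = np.random.default_rng(0)
print(empirical_var(rng.normal(size=100_000), alpha=0.05))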

Conditional Value-at-Risk

The Conditional Value-at-Risk (CVaR) measure was proposed to overcome theoretical deficiencies of VaR, in particular the weakness that it ignores the quantiles beyond the confidence level. This is achieved by taking all quantiles beyond VaR into consideration.¹ Conditional Value-at-Risk is defined as the average of all VaR estimates beyond the confidence level α ∈ (0, 1), formally

CVaR_α(X) = (1/(1 − α)) ∫_0^{1−α} VaR_u(X) du = (1/(1 − α)) ∫_α^1 F_L^{-1}(u) du.   (2.1)

In the last step of (2.1) it has been assumed that the loss L has a right-continuous and strictly increasing cumulative distribution function F_L.

It follows from the definition that CVaR inherits the properties (TI), (M), and (PH) from VaR, and additionally, it can be shown that (S) is satisfied, see e.g., Hult et al. [20, p. 182]. As a result, CVaR is a CRM. By taking all VaR estimates in the tail beyond a certain threshold under consideration, CVaR does account for the potential losses far out in the tail. CVaR, therefore, overcomes the main weakness of VaR.
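A sample version follows the same pattern: at confidence level α, CVaR_α is estimated by the mean of the worst (1 − α)-fraction of the simulated losses. A minimal Python sketch (illustrative name, NumPy assumed; note that α here is the confidence level, matching the CVaR_0.95 notation used below):

import numpy as np

def empirical_cvar(pnl, alpha=0.95):
    # Average of the worst (1 - alpha)-fraction of the losses L = -X,
    # the empirical counterpart of equation (2.1).
    losses = np.sort(-np.asarray(pnl))        # ascending
    k = max(1, int(np.ceil((1.0 - alpha) * len(losses))))
    return losses[-k:].mean()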

2.1.2 Distortion Risk Measures

The coherency of CVaR makes it an attractive measure of risk in many applications, although taking the mean of all quantiles beyond a certain threshold can be somewhat questionable. By weighting all tail quantiles equally in the conditional mean, extreme losses are not considered worse than less extreme losses. In reality, investors are typically more risk averse towards increasing loss severities. In this section, a particular class of Distortion risk measures (DRM) is studied, first postulated by Wang et al. [37] in the context of insurance risk. These measures are expectations of a loss random variable under an appropriate distortion of its original distribution, which allows the risk measure to weight different loss severities such that it better reflects an investor's risk aversion. The following list of axioms was proposed by Wang et al. [37] to characterize this class of risk measures.

Law Invariance (LI): X =_d Y ⟹ ρ(X) = ρ(Y)
This means that the risk depends only on the loss distribution, and two portfolio positions with identical probability distributions are exposed to equal amounts of risk.

Monotonicity (M): X ≥ Y ⟹ ρ(X) ≤ ρ(Y)
This means that if a portfolio position X is associated with a greater value than a portfolio Y at a future time, then the former is considered less risky.

Comonotonic additivity (CA): ρ(X + Y) = ρ(X) + ρ(Y) whenever (X(ω_1) − X(ω_2))(Y(ω_1) − Y(ω_2)) ≥ 0 almost surely for ω_1, ω_2 ∈ Ω
This means that merging two portfolios that hedge one another does not lead to additivity of risk, while merging two non-hedging (comonotone) portfolios does.

Continuity (C):
lim_{δ→0+} ρ(max{X − δ, 0}) = ρ(max{X, 0}),
lim_{δ→∞} ρ(min{X, δ}) = ρ(X),
lim_{δ→−∞} ρ(max{X, δ}) = ρ(X).

¹Conditional Value-at-Risk is also frequently referred to by other names in the literature, such as Expected Shortfall.

The first condition ensures that a small truncation in the estimated portfolio value does not cause large errors in the risk estimate. The last two conditions state that the risk can be estimated by approximating the portfolio value X as a bounded random variable.

For a complete presentation of risk measure theory and the properties of risk measures the reader is referred to Sereda et al. [30].

In order for the DRM to satisfy these properties, it can be defined with the Choquet integral representation (see Choquet [11]) with respect to a distortion function that defines the distorted probability measure [37]. This representation provides a good deal of modeling freedom, and many of the most common risk measures can be generalized to the Choquet integral. From here on, let F_L(l) = P{L ≤ l} denote the cumulative distribution function and S_L(l) = 1 − F_L(l) = P{L > l} the survival function of the random loss variable L ∈ X, where P is the reference probability measure.

Definition 2.1. A distortion function is any right-continuous and non-decreasing function g : [0, 1] → [0, 1] satisfying g(0) = 0 and g(1) = 1, such that the distorted probability distribution for a random variable is defined by g(S_L(l)) = Q{L > l}, where Q denotes the distorted probability measure and dQ/dP = g′(S_L(l)) the corresponding Radon–Nikodym derivative. Moreover, the dual transform provides the related dual distortion function γ(x) = 1 − g(1 − x).

Assume that the set of reachable portfolios X contains all Bernoulli(p) random variables, such that the compounded portfolio losses occur with probability p ∈ [0, 1]. Then it is implied, by Theorem 3 in Wang et al. [37], that if and only if a DRM ρ_g has a Choquet integral representation,

ρ_g(X) = ∫_0^∞ g(S_{−X}(u)) du − ∫_{−∞}^0 [1 − g(S_{−X}(u))] du = ∫_0^1 F_L^{-1}(1 − u) dg(u),   (2.2)

with respect to the distorted distribution, then the axioms (LI), (TI), (M), (PH), (CA), (C) and the properties ρ_g(1) = 1 and ρ_g(−X) = −ρ_γ(X) are satisfied as a consequence of the standard properties of the Choquet integral, see Denneberg [13]. However, (S) is generally not satisfied, and consequently, ρ_g is not coherent in general. Wirch and Hardy [38] showed that if the distortion function g is concave, the risk measure (2.2) does satisfy (S), thus demonstrating the intersection between CRM and DRM. This class of risk measures will be referred to as coherent distortion risk measures (CDRM).

Theorem 2.2. For any portfolio loss random variable L ∈ X and a concave distortion function g, there is a coherent distortion risk measure ρ_g satisfying the axioms of coherency, (LI), (CA) and the properties ρ_g(1) = 1 and ρ_g(−X) = −ρ_γ(X) if and only if the measure has a Choquet integral representation

ρ_g(L) = ∫_0^∞ g(S_L(u)) du − ∫_{−∞}^0 [1 − g(S_L(u))] du = sup_{Q ∈ Q} E_Q[X],   (2.3)

where the generating family is Q = {Q ≪ P | dQ/dP = g′(S_L(l))} and Q ≪ P denotes absolute continuity of Q with respect to P.

Some well-known distortion functions include:

Value-at-Risk (VaR_α):

g_VaR(x, α) = 1{x ≥ 1 − α},  α ∈ [0, 1]

As observed by Wirch and Hardy [38], this distortion function is not concave, and the risk measure is therefore not coherent.

Conditional Value-at-Risk (CVaR_α):

g_CVaR(x, α) = min(x/(1 − α), 1),  α ∈ (0, 1)

First observed by Wirch and Hardy [38], this distortion function is concave but was criticised by Wang et al. [36] since it is not differentiable at x = 1 − α and discards all information about the quantiles below the confidence level by mapping them to zero.

Wang Transform (WT_β):

g_WT(x, β) = Φ(Φ^{-1}(x) − Φ^{-1}(β)),  β ∈ (0, 1)

This distortion function was introduced by Wang et al. [36] and is concave for β < 0.5. Here Φ denotes the cumulative normal distribution function and Φ^{-1} its inverse.

Block Maxima (BM_β):

g_BM(x, β) = x^β,  β ∈ (0, 1)

This function is sometimes referred to as the Proportional Hazard distortion function and was introduced by Wang et al. [37].

Dual Block Maxima (DBM_β):

g_DBM(x, β) = 1 − (1 − x)^β,  β > 1

This distortion function is derived from the Block Maxima distortion function using the dual transform.
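For later reference, the four distortion functions above are easy to express in code. The following Python sketch is a minimal illustration (NumPy and SciPy assumed; the parameter defaults are only illustrative):

import numpy as np
from scipy.stats import norm

# The argument x is a (survival) probability in [0, 1].
def g_cvar(x, alpha=0.95):
    return np.minimum(x / (1.0 - alpha), 1.0)

def g_wang(x, beta=0.95):
    return norm.cdf(norm.ppf(x) - norm.ppf(beta))

def g_bm(x, beta=0.5):
    return x ** beta

def g_dbm(x, beta=100.0):
    return 1.0 - (1.0 - x) ** beta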

Figure 2.1 displays a sample cumulative distribution function (CDF) together with a CVaR_0.95 distortion, a WT_0.95 distortion, a BM_0.5 distortion and a DBM_100 distortion of the CDF.


Figure 2.1: Empirical cumulative distribution function (CDF) and the WT_0.95, BM_0.5, DBM_100 and CVaR_0.95 distortions of the CDF for a return sample. The horizontal axis shows negative returns (%) and the vertical axis cumulative probability.

Consider a DRM ρ_g with the associated distortion function g(x) = x, for which the distorted survival function equals the survival function, g(S_L(l)) = Q{L > l} = P{L > l}, i.e. the reference and distorted probabilities coincide, P = Q. This risk measure is the expected value of the losses, ρ_g(L) = E^P[L]. It is therefore reasonable to require a risk measure to be bounded from below by the expected loss and from above by the maximum loss, such that E[L] ≤ ρ_g(L) ≤ max{L}, provided that g is concave.

When working with coherent distortion risk measures in practice, it is often convenient to consider an alternative representation to the Choquet integral. One of the most common alternative representations of a DRM is the quantile-based representation. Assume a concave distortion function, such that coherency holds, and let g be absolutely continuous, such that dg(u) = φ(u)du, where φ : [0, 1] → [0, ∞) is a non-decreasing function with ∫_0^1 φ(u) du = 1. Then the CDRM can be represented in the alternative form

ρ_φ(X) = ∫_0^1 F_L^{-1}(u) φ(u) du.   (2.4)

This representation is recognized as the definition of a spectral risk measure (SRM), where φ is known as the risk spectrum. For an explicit proof of the equivalence between SRM and CDRM, the reader is referred to Gzyl and Mayoral [16].

Theorem 2.3. For any portfolio loss random variable L ∈ X and a concave distortion function g, the associated risk measure ρ_g is a coherent distortion risk measure if and only if there exists a function w : [0, 1] → [0, 1] satisfying ∫_0^1 w(u) du = 1 such that

ρ_g(X) = ∫_0^1 w(u) CVaR_u(X) du.

Theorem 2.3 states that any CDRM generated from a concave distortion function g can be represented as a convex combination of CVaR_α for α ∈ [0, 1]. This result was proved by Kusuoka [23].
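On a finite scenario sample, the Choquet integral (2.2) reduces to a weighted sum of the ordered losses, where the scenario weights are increments of the distortion function; this discrete form is also what the simulation-based optimization framework of Section 3 builds on. A minimal Python sketch (illustrative name, NumPy assumed):

import numpy as np

def cdrm(pnl, g):
    # Sample estimate of rho_g: sort losses from worst to best and give
    # the i-th worst loss the weight g(i/n) - g((i-1)/n), so a concave g
    # puts more weight on the most severe losses.
    losses = np.sort(-np.asarray(pnl))[::-1]   # descending
    n = len(losses)
    u = np.arange(n + 1) / n
    weights = np.diff(g(u))                    # sums to g(1) - g(0) = 1
    return float(weights @ losses)

With g = g_cvar from the sketch above, cdrm(pnl, g_cvar) reproduces the empirical CVaR_0.95, and any of the four distortions can be plugged in the same way.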

2.2 Elliptical Distributions

The elliptical distributions form a useful class for multivariate modeling of financial portfolios. The elliptical distributions generalize some of the most common symmetric multivariate distributions, such as the multivariate normal, student's-t, and normal variance mixture distributions. The elliptical distributions are defined as follows. A random vector X ∼ N_d(µ, Σ) with mean vector µ and covariance matrix Σ is said to have a multivariate normal distribution if

X =_d µ + AZ,   (2.5)

where Z ∼ N_d(0, I), with I being the d-dimensional identity matrix and Σ = AA^T. Moreover, assuming Σ is positive definite, A can be taken as the Cholesky factor of Σ.

Let Y be a random vector with spherical symmetry, i.e. a vector invariant under rotations and reflections. Since multiplications with orthogonal matrices represent rotations and reflections, the vector Y is said to have a spherical distribution if OY =_d Y for every orthogonal matrix O satisfying OO^T = I. Also, for every arbitrary vector a of the same dimension, a^T Y =_d |a| Y_1, where Y_1 is the first element of the random vector Y. By replacing the standard normal vector Z with a spherically distributed random vector Y in equation (2.5), the random vector X is said to have an elliptical distribution with stochastic representation

X =_d µ + AY,   (2.6)

if there is a mean vector µ, a matrix A, and a spherically distributed random vector Y. The elliptical distributions also include the normal variance mixture distributions, which can be obtained by taking W² = ν/S_ν, where S_ν has a Chi-squared distribution with ν degrees of freedom. The resulting distribution takes the form

X =_d µ + WAY.   (2.7)

If X is elliptically distributed with a stochastic representation of the form (2.7) and a is any vector of the same dimension, then the following useful relation holds:

a^T X =_d a^T µ + √(a^T Σ a) · W Y_1.   (2.8)

The simple parametric representation of elliptical distributions makes the construction of multivariate models straightforward. Moreover, one of the main advantages of the representation is that it provides analytical solutions to a wide range of applications. One problem with the elliptical model is that return distributions are rarely symmetric: extreme losses tend to be larger than extreme profits, and the tails are often heavier than elliptical tails. Even if a heavy-tailed student's-t distribution can capture the extreme losses, the symmetric nature of elliptical distributions can overestimate or underestimate either of the two tails due to the symmetry assumption. Extreme profits tend to be lighter tailed than extreme losses, so a compromise must be made if a common distribution is used.
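To make the representation concrete, the following Python sketch (illustrative name; NumPy assumed, Σ assumed positive definite) simulates from (2.5) and (2.7) with a normal spherical vector, giving a multivariate normal or a multivariate student's-t depending on the mixing variable W:

import numpy as np

rng = np.random.default_rng(0)

def simulate_elliptical(mu, cov, n, nu=None):
    # X = mu + W * A Z with Sigma = A A^T (equations (2.5) and (2.7)).
    # nu=None gives W = 1 (multivariate normal); otherwise
    # W = sqrt(nu / S) with S ~ chi^2_nu gives a multivariate student's-t.
    mu = np.asarray(mu)
    A = np.linalg.cholesky(np.asarray(cov))   # requires positive definite cov
    Z = rng.standard_normal((n, len(mu)))
    X = Z @ A.T
    if nu is not None:
        W = np.sqrt(nu / rng.chisquare(nu, size=n))
        X *= W[:, None]
    return mu + X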

2.3 Financial Time Series

In Markowitz portfolio theory, the assumption was that financial return series are normally distributed and independent over time. These assumptions have been shown to be questionable due to the presence of the stylized facts of financial time series. The stylized facts can be summarized as:

1. Return series show little serial correlation but are dependent.

2. The volatility varies over time, and the absolute and squared return series show strong serial correlation with slow decay.

3. Returns are heavy-tailed and asymmetric.

The independence assumption is contradicted by the first and second stylized facts. This becomes apparent for extreme returns, which tend to appear in clusters, so that increased volatility appears in clusters. The third stylized fact contradicts the normality assumption: the tails of the normal distribution are neither heavy nor asymmetric. Consequently, large negative returns occur more frequently than predicted by the normal distribution. The first stylized fact indicates that forecasting returns using a linear model can be expected to perform poorly, due to the lack of autocorrelation in consecutive financial returns. On the other hand, stylized fact 2 implies that squared returns are to some extent forecastable. Figure 2.2 illustrates the characteristic volatility clustering of financial return series, which is the primary indicator that the stylized facts of financial time series are present. Volatility clusters are characterized by large returns of either sign being followed by other large returns of either sign.


Figure 2.2: Financial time series of returns illustrating volatility clustering. The return series sample covers 2007-12-27 to 2013-12-06 for the SEB stock.

A financial time series is a sequence of observations of financial data {x_t}; in this project, return series data. The associated time series model for the observed data is the specification of the means and covariances of a sequence of random variables {X_t}, referred to as a stochastic process, and the observed data are assumed to be a realization of the process.

For a stochastic process {X_t}_{t∈Z}, the associated mean and autocovariance functions are defined as

µ_X(t) = E[X_t],   γ_X(h) = Cov(X_t, X_{t+h}).

When working with time series, the concept of stationarity is central. For convenience, weak and strict stationarity are defined here.

Definition 2.2 (Weak Stationarity). The stochastic process {X_t}_{t∈Z} is said to be weakly stationary if the autocovariance function Cov(X_s, X_{s+t}) is independent of s, only being a function of t. Consequently, E[X_s] and E[X_s²] are also independent of s.

Definition 2.3 (Strict Stationarity). The stochastic process {X_t}_{t∈Z} is said to be strictly stationary if {X_{t_1}, ..., X_{t_n}} and {X_{t_1+h}, ..., X_{t_n+h}} have the same joint distributions for all t_1, ..., t_n ∈ Z and all h > 0.

The autocorrelation function is a useful tool when studying serial correlation in time series data. Let the stochastic process {X_t}_{t∈Z} be weakly stationary; then the autocorrelation function (ACF) can be defined by

ρ(h) := γ_X(h)/γ_X(0),

where γ_X(h) is the autocovariance function, and h is referred to as the lag.

Definition 2.4 (White Noise). The stochastic process {X_t}_{t∈Z} is a white noise process WN(0, σ²) with centered mean E[X_t] = 0 and variance E[X_t²] = σ² < ∞ if it is weakly stationary with autocorrelation function

ρ(h) = 1 if h = 0, and ρ(h) = 0 if h ≠ 0.

Another important process is the independent and identically distributed noise process. A process is said to be independent if it is a sequence of mutually independent random variables X_1, X_2, ..., which for any positive integer n satisfies

P(X_1 ≤ x_1, ..., X_n ≤ x_n) = P(X_1 ≤ x_1) · ... · P(X_n ≤ x_n) = F(x_1) · ... · F(x_n),

where F is the cumulative distribution function of each random variable and x_1, ..., x_n are the observations. More precisely, the IID process is defined as follows.

Definition 2.5 (IID Noise). If the stochastic process {X_t}_{t∈Z} is a sequence of strictly stationary and independent random variables with centered mean E[X_t] = 0 and variance E[X_t²] = σ² < ∞, it is said to be IID(0, σ²).

Revisiting the stylized facts from the beginning of this section, these become particularly prominent when studying the ACF of returns and squared returns. Figure 2.3 illustrates the sample ACF for the first h = 10 lags of the SEB stock returns depicted in Figure 2.2.


Figure 2.3: Sample ACF of returns (upper frame) and squared returns (lower frame) for the sample returns for the SEB stock depicted in Figure 2.2.

The upper frame of Figure 2.3 illustrates that there is little serial correlation in the return series. Significant correlations are present if the ACF falls outside the 95% confidence bound, which is indicated by the blue lines. This occurs just barely for lags h = 2 and h = 9, hence showing little serial correlation, which is in line with stylized fact 1. Consider now stylized fact 2. In Figure 2.2, volatility clusters can be observed. Clearly, the sequence of observed data is not independent, since the volatility varies over time and decays slowly. Moreover, the lower frame of Figure 2.3 shows that there is significant serial correlation for lags h = 1, ..., 10. This shows a strong correlation in the squared return series; hence stylized fact 2 is present. The presence of stylized fact 3 can be illustrated by plotting the empirical quantiles against the theoretical quantiles of a chosen distribution. In Figure 2.4, the empirical quantiles of the SEB stock returns are plotted against the theoretical quantiles of a standard normal distribution.


Figure 2.4: Quantile-quantile plot of the return series of the SEB stock depicted in Figure 2.2 against the standard normal distribution.

The shape of the blue line in Figure 2.4 illustrates the distributional characteristics of the empirical data. If the blue line follows the red line linearly, it is an indication that the empirical distribution is the same as the theoretical distribution. If the blue line is shaped like an S-curve, the tails of the empirical distribution are lighter than those of the theoretical distribution. In this case, however, the blue line is shaped like an inverted S-curve, which indicates that the empirical distribution is heavier tailed than the theoretical distribution. Moreover, if one tail deviates more from the red line than the other, it is heavier than the other, and the empirical distribution is asymmetric. In Figure 2.4, the tails are both heavy and asymmetric, which is in line with stylized fact 3.
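These diagnostics are straightforward to reproduce. The Python sketch below (illustrative names, NumPy assumed; `returns` stands for a hypothetical return series) computes the sample ACF used in Figure 2.3, with ±1.96/√n as the approximate 95% confidence bound for an IID series:

import numpy as np

def sample_acf(x, max_lag=10):
    # rho(h) = gamma_X(h) / gamma_X(0) for h = 1, ..., max_lag.
    x = np.asarray(x) - np.mean(x)
    gamma0 = np.dot(x, x) / len(x)
    return np.array([np.dot(x[:-h], x[h:]) / len(x) / gamma0
                     for h in range(1, max_lag + 1)])

# Little autocorrelation in the returns but strong autocorrelation in the
# squared returns is the signature of stylized facts 1 and 2:
# acf_returns = sample_acf(returns)
# acf_squared = sample_acf(returns ** 2)
# bound = 1.96 / np.sqrt(len(returns))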

2.4 GARCH models

Since volatility tends to vary over time, as suggested by stylized fact 2, the observed autocorrelation in the squared returns needs to be modelled. This is important since the risk increases in volatility clusters, and an accurate volatility model can improve risk measurement. A popular model for filtering financial return series is the Generalized Autoregressive Conditional Heteroscedasticity (GARCH) model. The univariate GARCH(p, q) model for the conditional variance {σ_t}_{t∈Z} is obtained by considering the following decomposition into the model residuals {Z_t}_{t∈Z},

X_t = µ + σ_t Z_t,  ∀t ∈ Z,   (2.9)

where {Z_t}_{t∈Z} is assumed to be IID with mean zero and unit variance, denoted IID(0, 1), and independent of the non-negative process {σ_t}_{t∈Z} defined as

σ_t² = α_0 + Σ_{i=1}^p α_i X_{t−i}² + Σ_{j=1}^q β_j σ_{t−j}²,  ∀t ∈ Z,   (2.10)

where the shock volatility α_0 > 0, the α_i ≥ 0 are the autoregressive parameters and the β_j ≥ 0 the heteroscedastic parameters, for i = 1, ..., p and j = 1, ..., q. Then {X_t}_{t∈Z} is called a GARCH(p, q) process. Bollerslev [4] originally introduced the GARCH process as a generalization of Engle's Autoregressive Conditional Heteroscedasticity (ARCH) model [12]. In that paper, Bollerslev assumed normally distributed standardized residuals. In practice, however, the standardized residuals Z_t do not exhibit the characteristic exponentially decaying tail behaviour of the normal distribution. In [5], Bollerslev considers student's-t standardized residuals as an alternative to the normal distribution, to account for the heavier tails. By letting {X_t}_{t∈Z} represent the series of returns, the conditional variance at time t is modelled as a linear combination of squared historical returns and past conditional variances, which meets stylized fact 2. Additionally, the GARCH process can generate excess kurtosis. This can be seen by applying Hölder's inequality to the kurtosis of σ_t Z_t:

k(σ_t Z_t) = E[(σ_t Z_t)^4] / (E[(σ_t Z_t)²])² = k(Z_t) · E[σ_t^4] / (E[σ_t²])² ≥ k(Z_t),

which is greater than or equal to the kurtosis of Z_t. This result derives from X_t being a mixture of conditional distributions with varying variances, hence partially meeting stylized fact 3. An alternative that further improves the modelling of heavy tails is to assume a student's-t distribution for the standardized residuals. However, both the normal and the student's-t assumptions maintain the symmetry assumption.

To calibrate the GARCH(p, q) model, the parameters of the parameter vector θ = (α_0, α_1, ..., α_p, β_1, ..., β_q) need to be estimated. This can be done with Maximum-Likelihood estimation, using the likelihood

L(θ; x_1, ..., x_n) = Π_{t=1}^n (1/σ_t) g(x_t/σ_t),   (2.11)

where x_1, ..., x_n are the observations and g is the density of the innovations. The parameters of θ are estimated by maximizing the expression (2.11), hence

θ̂ = arg max_θ L(θ; x_1, ..., x_n).

An alternative to maximizing the likelihood (2.11) is to maximize the log-likelihood instead,

log L(θ; x_1, ..., x_n) = Σ_{t=1}^n log((1/σ_t) g(x_t/σ_t)).   (2.12)

The initial step of estimating the parameters θ is to make a starting guess for σ_0. If the data set is large enough, the choice of σ_0 is unimportant; therefore, the sample standard deviation is often taken as a good approximation of σ_0. For a more thorough presentation of the calibration of the GARCH parameters, see Section 4.2.4 in McNeil et al. [26]. When calibrating the GARCH parameters to the data, a distributional assumption for the standardized residuals has to be made. In this thesis, the standardized residuals are assumed to have normal distributions, even though this assumption may be violated in practice. Bollerslev and Wooldridge show in [6] that the normal Quasi-Maximum likelihood estimator is consistent with a limiting normal distribution even if the normality assumption is violated. Therefore, the Quasi-Maximum likelihood form will be used for the calibration of the GARCH parameters in this project. The parameters θ̂ are estimated by maximizing the quasi-log-likelihood function

L_Q(θ; x_1, ..., x_n) = −(1/2) Σ_{t=1}^n log(σ_t²) − (1/2) Σ_{t=1}^n (x_t − µ)²/σ_t².

To assess the goodness-of-fit, the following tests will be considered: the Ljung-Box test, Engle's ARCH test, and the Jarque-Bera test, which are defined as follows.

Definition 2.6 (Jarque-Bera test). The Jarque-Bera test is a goodness-of-fit test that assesses whether a data sample comes from a normal distribution. The test statistic tests simultaneously whether the skewness and kurtosis of the data sample are consistent with a normal distribution. The test statistic is

JB = (n/6)(s² + (k − 3)²/4),

where n is the sample size, s is the sample skewness and k is the sample kurtosis. The test has the null hypothesis of normality. If the sample kurtosis differs too much from three or the skewness differs too much from zero, the test statistic may reject the null hypothesis of normality [21].

Definition 2.7 (Engle's ARCH test). Engle's ARCH test assesses the null hypothesis that the series has no conditional heteroscedasticity, i.e. no ARCH effects,

α_0 = α_1 = ... = α_h = 0,

for a specified number of lags, against the alternative hypothesis that there is autocorrelation in the squared residuals, given by the regression

X_t² = α_0 + α_1 X_{t−1}² + ... + α_h X_{t−h}² + e_t,

where e_t is a white noise process and h is a fixed number of lags [12].

Definition 2.8 (Ljung-Box test). The Ljung-Box test is a numerical test of the null hypothesis that there is no significant autocorrelation in a series of residuals for a fixed number of lags. The test statistic is

LB = n(n + 2) Σ_{k=1}^h ρ(k)²/(n − k),

where n is the sample size, h is the number of lags, and ρ(k) is the sample autocorrelation at lag k [7].
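A GARCH(1,1) quasi-maximum likelihood fit can be sketched directly from (2.9), (2.10) and the quasi-log-likelihood above. The Python sketch below is a minimal illustration (function names are illustrative; NumPy and SciPy assumed); it maximizes L_Q by minimizing its negative, with µ taken as the sample mean and σ_0² started at the sample variance, as described in the text:

import numpy as np
from scipy.optimize import minimize

def garch11_qml(x):
    # Quasi-ML estimates of theta = (alpha0, alpha1, beta1) for GARCH(1,1).
    e = np.asarray(x) - np.mean(x)           # residuals x_t - mu

    def neg_quasi_loglik(theta):
        a0, a1, b1 = theta
        s2 = np.empty_like(e)
        s2[0] = np.var(e)                     # starting guess for sigma_0^2
        for t in range(1, len(e)):
            s2[t] = a0 + a1 * e[t - 1] ** 2 + b1 * s2[t - 1]
        return 0.5 * np.sum(np.log(s2) + e ** 2 / s2)

    res = minimize(neg_quasi_loglik, x0=[1e-6, 0.05, 0.90],
                   bounds=[(1e-12, None), (0.0, 1.0), (0.0, 1.0)])
    return res.x                              # (alpha0, alpha1, beta1)

The standardized residuals e_t/σ̂_t of the fitted model can then be examined with the three goodness-of-fit tests defined above.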

Volatility Forecasting

To forecast future volatility, a GARCH(1,1) model can be used. Assume that the process {X_t}_{t∈Z} is a covariance stationary GARCH(1,1) process with E[X_t²] = E[σ_t²] < ∞ and parameters α_1, β_1 satisfying α_1 + β_1 < 1. Then equation (2.10) gives the volatility forecast one step ahead, σ̂²_{t+1}, in the form of the recursive scheme

σ̂²_{t+1} = E[X²_{t+1} | X_t², σ̂_t²] = α_0 + α_1 X_t² + β_1 σ̂_t².

Using this form recursively gives the general expression for estimating the volatility k steps ahead,

σ̂²_{t+k} = E[X²_{t+k} | X_t²] = α_0 Σ_{i=0}^{k−1} (α_1 + β_1)^i + (α_1 + β_1)^{k−1} (α_1 X_t² + β_1 σ̂_t²).

This general form can be rewritten in terms of the unconditional variance σ² of the process X_t as

σ̂²_{t+k} = E[X²_{t+k} | X_t²] = σ² + (α_1 + β_1)(σ̂²_{t+k−1} − σ²),   (2.13)

where the unconditional variance σ² is derived as follows:

Var(X_{t+1}) = E[X²_{t+1}] − (E[X_{t+1}])²
            = E[X²_{t+1}]
            = E[σ²_{t+1} Z²_{t+1}]
            = E[σ²_{t+1}]
            = α_0 + α_1 E[σ_t² Z_t²] + β_1 E[σ_t²]
            = α_0 + (α_1 + β_1) E[σ_t²]
            = α_0 + (α_1 + β_1) Var(X_t),

where it has been used that Z_t ∼ IID(0, 1). Since X_t is a stationary process with Var(X_{t+1}) = Var(X_t), the unconditional variance σ² is given by

σ² = Var(X_t) = α_0/(1 − α_1 − β_1).   (2.14)

Then, rearranging the terms of (2.13) and substituting the unconditional variance (2.14) gives the following explicit expression for the conditional volatility forecast k steps ahead for the GARCH(1,1) process, in terms of the estimated parameters α_0, α_1, β_1 and σ_t:

E[σ²_{t+k} | σ_t²] = (α_1 + β_1)^k (σ_t² − α_0/(1 − α_1 − β_1)) + α_0/(1 − α_1 − β_1).   (2.15)

Here σ̂²_{t+k} reverts to the unconditional variance (2.14) as (α_1 + β_1)^k dies out. The factor (α_1 + β_1) is referred to as the discount rate.
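Equation (2.15) is a one-liner in code; the following Python sketch (illustrative name) produces the k-step variance forecast and exhibits the mean reversion towards (2.14):

def garch11_forecast(a0, a1, b1, sigma2_t, k):
    # Equation (2.15): the forecast reverts to the unconditional
    # variance (2.14) at the rate (a1 + b1)^k.
    sigma2_bar = a0 / (1.0 - a1 - b1)       # unconditional variance
    return sigma2_bar + (a1 + b1) ** k * (sigma2_t - sigma2_bar)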

2.5 Extreme Value Theory

Extreme value theory (EVT) is a probabilistic theory of extreme scenarios. In particular, univariate extreme value theory focuses on modelling the extreme events in the upper and lower tails. The approach is analogous to that of the Central Limit Theorem (CLT), which focuses on the convergence of sums. The CLT is stated here for convenience.

Theorem 2.4 (Central Limit Theorem). Let X_1, X_2, ... be an infinite sequence of independent and identically distributed random variables with mean µ and variance σ². Then

lim_{n→∞} P((S_n − nµ)/(σ√n) ≤ x) = Φ(x) := (1/√(2π)) ∫_{−∞}^x e^{−z²/2} dz,

where S_n = Σ_{i=1}^n X_i and Φ is the cumulative normal distribution function.

The CLT is essential. It states that a properly normalized sum converges asymptotically towards a normal distribution. Hence, if there is an interest in the distribution of sums of a large number of observations of IID random variables, then the structure of the distribution of the sum can be derived from information about the first and second moments.

Consider a set of n observations X_1, ..., X_n drawn from IID random variables with unknown distribution function F. Instead of the asymptotic convergence of sums, EVT focuses on the asymptotic convergence of the maxima M_n = max{X_1, ..., X_n}, describing the upper tail behaviour². By independence, P(M_n ≤ x) = F^n(x). Then, similarly to the CLT, there exist normalizing sequences c_n > 0 and d_n such that the normalized maxima converge,

lim_{n→∞} P((M_n − d_n)/c_n ≤ x) = lim_{n→∞} F^n(c_n x + d_n) = Γ(x),

where Γ(x) is a non-degenerate distribution function. The family of such limiting distributions is known as the Generalized Extreme Value distributions, which have the form

Γ_{ξ,µ,σ}(x) = exp(−(1 + ξ(x − µ)/σ)^{−1/ξ}),  ξ ≠ 0,
Γ_{ξ,µ,σ}(x) = exp(−e^{−(x−µ)/σ}),  ξ = 0,

where µ ∈ R and σ > 0 are the location and scale parameters, and ξ is the shape parameter, satisfying 1 + ξ(x − µ)/σ > 0. The Generalized Extreme Value distributions can be divided into three types of distributions described by the tail index 1/ξ. If ξ > 0, the distribution corresponds to the Fréchet case, where the distribution only has a lower bound and the tail has polynomial decay. If ξ = 0, the distribution is referred to as the Gumbel distribution, which spreads out along the whole real axis and has exponential decay. When ξ < 0, the distribution is referred to as the Weibull distribution, which has an upper bound such that the tail has a finite endpoint.

The special case related to the Generalized Extreme Value distribution that is particularly well suited for modelling tail excesses is the generalized Pareto distribution, which is defined as

G_{ξ,β}(x) = 1 − (1 + ξx/β)^{−1/ξ},  ξ ≠ 0,
G_{ξ,β}(x) = 1 − e^{−x/β},  ξ = 0,   (2.16)

where β > 0, and x ≥ 0 when ξ ≥ 0 and 0 ≤ x ≤ −β/ξ when ξ < 0. Here ξ and β are referred to as the shape and scale parameters, respectively.

The formal connection between the Generalized Pareto distribution and the Generalized Extreme Value distribution is 1 − G_{ξ,β}(x) = −log Γ_{ξ,0,β}(x). In order to fit a generalized Pareto distribution to empirical data, its parameters need to be calibrated. This can be done with a few different methods. In this thesis, the peaks-over-threshold method will be used, and the parameters are estimated with Maximum-Likelihood.

Assume that the ordered elements of the sample vector z̃ = (z_n ≤ ... ≤ z_1) are IID(0, 1) with unknown distribution function F. To fit a generalized Pareto distribution to the upper tail of the distribution, a suitably high threshold needs to be chosen to define where the upper tail starts. Let z = (z_k ≤ ... ≤ z_1) be the sample vector of ordered tail excesses above the threshold z_{k+1} = η and

²Similarly, the lower tail may be considered using the minima m_n = min{X_1, ..., X_n} = −max{−X_1, ..., −X_n}. This section only gives a brief presentation of the former and assumes that the analogous results hold.

the associated excess distribution F_η. Given the generalized Pareto distribution (2.16), the associated density function is

g_{ξ,β}(z) = (1/β)(1 + ξz/β)^{−1/ξ−1}.

The parameters ξ and β are then estimated using the log-likelihood function

L(ξ, β; z_1, ..., z_k) = Σ_{i=1}^k log g_{ξ,β}(z_i) = −k log(β) − (1/ξ + 1) Σ_{i=1}^k log(1 + ξz_i/β),

which should be maximized subject to β > 0 and 1 + ξz_i/β > 0, i = 1, ..., k. The Maximum-Likelihood estimation is consistent whenever ξ > −1/2, see Embrechts et al. [10]. The solution gives the estimated parameters ξ̂ and β̂ of the generalized Pareto distribution G_{ξ̂,β̂} fitted to the excess distribution F_η.

2.6 Copulas

When the dependence structure of the joint distribution function of a random vector and the marginal distributions cannot be specified simultaneously, a copula can be used to specify the dependence structure separately. Copulas are obtained through the combination of the probability and quantile transforms.

Proposition 2.1 (Probability transform). If X has distribution function F, then F(X) ∼ U(0, 1) if and only if F is continuous [20, p. 166].

Proposition 2.2 (Quantile transform). If U ∼ U(0, 1) has a standard uniform distribution, then P(F^{-1}(U) ≤ x) = F(x) [20, p. 166].

Let F denote the joint distribution function of the random vector X = (X_1, ..., X_d) with marginal distribution functions F_1, ..., F_d, and let U = (U_1, ..., U_d) be a random vector whose components have standard uniform marginal distributions, U(0, 1). Using the quantile transform, X may be specified as

X = (F_1^{-1}(U_1), ..., F_d^{-1}(U_d)).   (2.17)

The dependence structure of X is inherited from the components of U, and hence, the dependence structure of X may be expressed in terms of U. A copula is defined as a d-dimensional distribution function C(u_1, ..., u_d) with standard uniform marginal distributions. Using the probability transform, it follows that

C(u_1, ..., u_d) = P(F_1(X_1) ≤ u_1, ..., F_d(X_d) ≤ u_d)
                = P(X_1 ≤ F_1^{-1}(u_1), ..., X_d ≤ F_d^{-1}(u_d))   (2.18)
                = F(F_1^{-1}(u_1), ..., F_d^{-1}(u_d)).

The main concern of this thesis is with elliptical copulas, which are derived from the elliptical distributions. To this class of copulas belong the normal and student's-t copulas, which are defined as follows. If the random vector X has a multivariate normal distribution and a correlation matrix P, then by (2.18) the normal copula is defined as

C^n_P(u_1, ..., u_d) = Φ_P(Φ^{-1}(u_1), ..., Φ^{-1}(u_d)),

where Φ_P denotes the standard multivariate normal distribution function of X and Φ the standard univariate normal distribution function with inverse Φ^{-1}. Similarly, if the random vector X has a multivariate student's-t distribution with ν degrees of freedom and a correlation matrix P, then by (2.18) the student's-t copula is defined as

C^t_{ν,P}(u_1, ..., u_d) = t_{ν,P}(t_ν^{-1}(u_1), ..., t_ν^{-1}(u_d)),   (2.19)

where t_{ν,P} denotes the multivariate student's-t distribution function of X and t_ν the standard univariate student's-t distribution function.

The dependence structure is determined through calibration of the copula using some dependency measure on U. Since U is to be transformed into X using (2.17), the dependency measure must be invariant under strictly increasing non-linear transformations. Ordinary dependency measures such as the linear correlation do not satisfy this property. However, there are copula dependency measures that are invariant under such transformations. These are known as rank correlations, the two most well-known being Spearman's rho and Kendall's tau; in addition, the coefficients of upper and lower tail dependence are copula-based measures of dependence. Below, these dependency measures are presented in the bivariate case. Note that the Kendall's tau and Spearman's rho rank correlations can easily be extended to higher dimensions.

First, the linear correlation ρ of two random variables (X_1, X_2) is defined as

ρ(X_1, X_2) = Cov(X_1, X_2) / (√Var(X_1) √Var(X_2)),

where ρ(X_1, X_2) ∈ [−1, 1]. If |ρ(X_1, X_2)| = 1, then (X_1, X_2) are perfectly linearly dependent. Moreover, if (X_1, X_2) are independent, then ρ(X_1, X_2) = 0, although the converse does not hold. Since it is a linear measure of dependence, it has the following property: if a, b, c_1 and c_2 are constants, then

ρ(a + c_1 X_1, b + c_2 X_2) = ρ(X_1, X_2),

and therefore it is invariant under strictly increasing linear transformations. However, in the general case, it is not invariant under strictly increasing non-linear transformations,

ρ(X_1, X_2) ≠ ρ(F_1(X_1), F_2(X_2)).

Hence, it is not a rank correlation dependency measure, since it depends on the marginal distributions. The Kendall's tau and Spearman's rho rank correlations are defined as follows.

Definition 2.9. Let (X̃_1, X̃_2) be an independent copy of the random variables (X_1, X_2). The Kendall's tau rank correlation is then defined as the probability of concordance, (X_1 − X̃_1)(X_2 − X̃_2) > 0, minus the probability of discordance, (X_1 − X̃_1)(X_2 − X̃_2) < 0, hence

τ(X_1, X_2) = P[(X_1 − X̃_1)(X_2 − X̃_2) > 0] − P[(X_1 − X̃_1)(X_2 − X̃_2) < 0].

Definition 2.10. Let X_1 and X_2 be two random variables with associated marginal distribution functions F_1 and F_2. The Spearman's rho rank correlation is then defined as the linear correlation of the probability-transformed random variables,

ρ_S(X_1, X_2) = ρ(F_1(X_1), F_2(X_2)),

which, for continuous random variables, is the linear correlation of their unique copula. Both the Kendall's tau and the Spearman's rho rank correlations depend only on the copula itself and not on the univariate marginal distributions, and they therefore inherit its property of being invariant under strictly increasing non-linear transformations. Two other well-known and important copula dependence measures are the coefficients of upper and lower tail dependence. Like Spearman's rho and Kendall's tau, the coefficients depend only on the copula, and they are designed to measure the extremal dependence, i.e. the strength of dependence in the tails, of bivariate distributions.

The coefficient of upper tail dependence is defined as

λ_u(X_1, X_2) = lim_{q→1⁻} P(X_2 > F_2^{-1}(q) | X_1 > F_1^{-1}(q)),

provided that the limit exists, such that λ_u(X_1, X_2) ∈ [0, 1]. If λ_u(X_1, X_2) ∈ (0, 1], then the random variables X_1 and X_2 are said to have asymptotic upper tail dependence, and λ_u(X_1, X_2) = 0 corresponds to asymptotic upper tail independence. The coefficient of lower tail dependence is defined as

λ_l(X_1, X_2) = lim_{q→0⁺} P(X_2 < F_2^{-1}(q) | X_1 < F_1^{-1}(q)),

provided that the limit exists, such that λ_l(X_1, X_2) ∈ [0, 1]. Similarly, if λ_l(X_1, X_2) ∈ (0, 1], then the random variables X_1 and X_2 are said to have asymptotic lower tail dependence, and λ_l(X_1, X_2) = 0 corresponds to asymptotic lower tail independence. In Section 5.3 of McNeil et al. [26], it is shown that the normal copula is asymptotically independent in both the upper and lower tails, i.e. λ_u(X_1, X_2) = λ_l(X_1, X_2) = 0. They also show that the student's-t copula is asymptotically dependent in the upper and lower tails with

λ_u(X_1, X_2) = λ_l(X_1, X_2) = 2 t_{ν+1}(−√((ν + 1)(1 − ρ)/(1 + ρ))).

For more details on dependency measures, the reader is referred to McNeil et al. [26]. For the family of elliptical copulas there exist particularly useful relations when simulating from normal copulas and student's-t copulas. The relations are stated as follows.

For the normal copula there exist explicit relations between Kendall's tau, Spearman's rho and the linear correlation. These are given by

\[ \tau(X_1, X_2) = \frac{2}{\pi} \arcsin(\rho), \tag{2.20} \]

and

\[ \rho_S(X_1, X_2) = \frac{6}{\pi} \arcsin\left( \frac{\rho}{2} \right). \]

Moreover, if the elliptical copula is a student's-t copula, the explicit relation (2.20) between Kendall's tau and the linear correlation holds. However, there exists no simple relation between Spearman's rho and the linear correlation for the student's-t copula. By applying these relations element-wise to a multidimensional linear correlation matrix P, the matrix can be converted into the Kendall's tau rank correlation matrix T and the Spearman's rho rank correlation matrix PS, respectively, which are both invariant under strictly increasing non-linear transformations.

This thesis project will only consider normal copulas, and since these are asymptotically independent in the upper and lower tails, the coefficients of upper and lower tail dependence will not be considered. Moreover, Kendall’s tau will be the only rank correlation that will be considered in this project.
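As a brief illustration of how the relation (2.20) is used in practice, the following Python sketch estimates a pairwise Kendall's tau matrix from return data and inverts (2.20) to recover the copula correlation parameter, ρ = sin(πτ/2). This is a minimal sketch; the array of returns is placeholder data, not data from this thesis.

```python
import numpy as np
from scipy.stats import kendalltau

def kendall_matrix(returns):
    """Pairwise Kendall's tau rank correlation matrix of a (J, d) sample."""
    d = returns.shape[1]
    T = np.eye(d)
    for i in range(d):
        for j in range(i + 1, d):
            tau, _ = kendalltau(returns[:, i], returns[:, j])
            T[i, j] = T[j, i] = tau
    return T

# Invert tau = (2/pi) arcsin(rho) element-wise: rho = sin(pi * tau / 2).
returns = np.random.default_rng(0).standard_normal((1000, 3))  # placeholder data
T = kendall_matrix(returns)
P = np.sin(np.pi * T / 2)
```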

3 Method

This section presents the approach to answering the research questions of this thesis project, starting with an introduction to the general concepts of portfolio optimization. The Markowitz Mean-Variance optimization [24] is stated; its solution will be used as a benchmark for the subsequent analysis. The portfolio optimization problem formulation for a general coherent distortion risk measure is stated, and the approach to solving it is presented.

3.1 Introduction to Portfolio Analysis

Suppose an investor with initial capital V0 enters the financial market consisting of risky assets at time t0 = 0. The market will be assumed to be completely liquid, i.e. it is always possible to buy and sell unlimited quantities of any of the assets. Additionally, there are no bid-ask spreads, fees or transaction costs, so the prices for buying and selling coincide. Assume the investor takes a position in a portfolio with d assets. Let πi(t) = Gi[Si(t)] denote the price of asset i = 1, ..., d, where Gi is the pricing function that depends on the asset Si(t). The portfolio value at a given time t is given by

\[ V(t, x) = \sum_{i=1}^{d} x_i G_i[S_i(t)] \]

where x = [x1, ..., xd]^T is the holdings vector. The elements xi are the current positions in instrument i = 1, ..., d and are often referred to as portfolio weights. Since the price πi(t) depends on the unknown future price Si(t), the future portfolio value V1 at time t1 = 1 is uncertain, and hence the investor is exposed to risk. To evaluate the potential risk exposure, it is common to study historical return data. Assume that a sample {Si(−m), Si(−m + 1), ..., Si(0)} of historical data, daily asset price data in particular, is accessible for each portfolio asset i = 1, ..., d. It is standard practice to transform the price data to

returns due to the mathematical advantages this brings. Returns can be defined in different ways, although this project will only consider log-returns,

\[ R_i(t) = \log\left( \frac{\pi_i(t)}{\pi_i(t-1)} \right), \quad i = 1, \dots, d, \tag{3.1} \]

which are assumed to have the Markov property from day to day, be weakly dependent and close to identically distributed. However, this assumption is somewhat questionable, since volatility clusters can typically be observed in the log-return series Ri(t). The volatility clusters indicate that the stylized facts of financial time series are present, which contradicts the hypothesis that the log-returns are IID as in Definition 2.5. By assuming that the log-returns Ri(t) follow a GARCH process, standardized residuals Zt can be filtered out using (2.9). The filtered standardized residuals are in practice often close to IID. In most situations it can be assumed that the historical return series carry distributional characteristics that are also representative for R(1). It is therefore reasonable to assume that the portfolio value V1 = F(R(1)) can be represented by a multivariate function F that depends on the available information at t0. Through proper modelling of F, the return series can be expressed as a random vector R = (R1, ..., Rd)^T. Let x denote the d-dimensional holdings vector and define the future portfolio return X(x, R) = x^T R ∈ X, where X is the set of reachable portfolios. Additionally, define the corresponding portfolio loss L(x, R) = −x^T R. Henceforth, for convenience of notation, the explicit parameter dependence will be suppressed, and unless stated otherwise X and L will denote the portfolio return and loss, respectively.
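As a minimal numerical illustration of (3.1) and of the portfolio return and loss definitions above, the following sketch uses a small placeholder price matrix (assumed data, two assets over four days):

```python
import numpy as np

# Placeholder closing prices: one row per day, one column per asset.
prices = np.array([[100.0, 50.0], [101.5, 49.2], [99.8, 50.5], [102.3, 51.0]])
R = np.diff(np.log(prices), axis=0)   # log-returns (3.1), one row per day
x = np.array([0.6, 0.4])              # holdings vector
X = R @ x                             # portfolio return scenarios x^T R
L = -X                                # portfolio loss scenarios -x^T R
```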

3.2 Introduction to Portfolio Optimization

Portfolio optimization problems can be formulated in many different ways. A tractable property of linear programming problems is convexity. In a convex optimization problem, both the objective function and the feasible region are convex, so a unique optimal solution is the global solution. To preserve the convexity of a portfolio optimization problem, the risk measure in question must be convex. In Section 2.1.1, it was mentioned that coherency of risk measures implies convexity, and all the coherent distortion risk measures studied in this thesis satisfy this property. If the risk measure does not satisfy the convexity property (CX), the feasible region of optimal portfolios may contain several local extrema. This is the case with Value-at-Risk, which does not satisfy the convexity property and will hence not be studied in this thesis project. Markowitz suggested three mean-variance investment problems: mean maximization, variance minimization and the mean-variance trade-off. Each of the problems is designed to reflect a certain investor's view of a particular investment. Means and variances are practical when studying elliptical distributions. Generalizing to the case where the underlying distribution is unknown, and the risk is measured with any coherent distortion risk measure, these optimization problems may be formulated as follows.

Expected Return Maximization: Maximizes the expected return subject to a CDRM constraint τ

\[ \max_x \{ E[X(x)] \mid \rho_g(X(x)) \le \tau,\; X \in \mathcal{X} \} \tag{3.2} \]

CDRM Minimization: Minimization of the CDRM subject to an expected return constraint θ

\[ \min_x \{ \rho_g(X(x)) \mid E[X(x)] \ge \theta,\; X \in \mathcal{X} \} \tag{3.3} \]

Expected Return-CDRM Trade-off: Utility maximization for a risk aversion parameter c

\[ \max_x \{ E[X(x)] - c\, \rho_g(X(x)) \mid X \in \mathcal{X} \} \tag{3.4} \]

Each of the formulations produces the same set of globally optimal portfolios. Varying the parameters τ, θ and c traces out the efficient frontier representing the set of optimal solutions. Moreover, problems (3.2) through (3.4) generate the same efficient frontier. This is a direct result of duality theory, which states that minimization of risk subject to some return constraint is equivalent to maximization of return subject to some risk constraint; this is shown in Theorem 3 in Krokhmal et al. [22]. Since each portfolio optimization problem generates the same efficient frontier, it suffices to consider one of the problems. Henceforth, risk minimization (3.3) will be the problem under consideration.

In most applications of market risk, an investor has certain requirements on the investment, and constraints are often applied to the investment problem to uphold these requirements. The most common constraint is to prevent taking short positions, since these are often riskier than long positions and should be based on good information, as the losses could be unbounded. This constraint is set by only allowing positive portfolio weights xi ≥ 0 for assets i = 1, ..., d. Moreover, it is common to impose a value constraint x_i δ_i ≤ λ_i Σ_{m=1}^d δ_m x_m, where δ is a linear constraint vector. The idea of the value constraint is to prevent an instrument i from constituting more than a given percentage λi of the initial capital. The final constraint that will be considered is the one ensuring that all initial capital is allocated amongst the portfolio assets: Σ_{i=1}^d x_i = V_0. Here V0 is a scaling factor, and by setting V0 = 1 the portfolio weights are normalized. Also, let µi denote the mean return of asset i = 1, ..., d and θ the minimum expected portfolio return. Introducing these constraints into the CDRM Minimization formulation (3.3) gives the following optimization problem.

\[
\begin{aligned}
\min_x \quad & \rho_g(L) \\
\text{subject to} \quad & \sum_{i=1}^{d} x_i \mu_i \ge \theta \\
& \sum_{i=1}^{d} x_i = V_0 \\
& x_i \delta_i \le \lambda_i \sum_{m=1}^{d} \delta_m x_m, \quad i = 1, \dots, d \\
& x_i \ge 0, \quad i = 1, \dots, d. \tag{3.5}
\end{aligned}
\]

The benefit of portfolio optimization with coherent distortion risk measures is that they all satisfy the law-invariance property (LI), which means that they only depend on the distribution model of the portfolio and not on particular distributional parameters such as means and variances. This is a particularly tractable property, as it makes it possible to measure the risk of simulated returns from non-parametric market models, which is one of the key elements of this project. On the other hand, if an elliptical portfolio model is assumed and the associated mean vector and covariance matrix are known, then solving the CDRM minimization is equivalent to solving the classical Markowitz mean-variance optimization. This will be shown in Section 3.3.1 below.

3.3 Portfolio Models

To make proper estimations of the potential risk associated with a position, it is necessary to specify a model for the future portfolio value. This section will consider two multivariate models. The first is the classical parametric elliptical distribution model, which assumes symmetry. The second model focuses on modelling the left and the right tail independently, to address the asymmetric nature of return data described by the stylised facts of financial time series. This is accomplished with the use of extreme value theory: the tails are modelled with extreme value distributions, generalised Pareto distributions in particular. The resulting univariate distributions are unique but do not by themselves provide a multivariate distribution function that describes the cross-dependency. To describe the dependence structure, the copula approach will be used. The resulting multivariate distribution is expressed by a joint distribution function combining the unique marginal distributions, which accurately model asymmetric market data, with a copula describing the dependence structure.

3.3.1 Remark: Elliptically Distributed Returns in Portfolio Selection

The following remark intends to emphasise the special case where portfolio optimization with a CDRM can be reduced to a modified version of the standard Markowitz mean-variance optimisation problem, provided that the portfolio assets can be modelled with elliptical distributions with sufficient accuracy. Assume the return vector R ∼ Nd(µ, Σ) has elliptical symmetry, where µ is the mean vector and Σ = AA^T the dispersion matrix, such that R has the stochastic representation (2.7)

\[ R \overset{d}{=} \mu + W A Z \]

where Z ∼ Nd(0, Id), Id is the d-dimensional identity matrix and W > 0 is independent of Z. Then, using the standard property (2.8) of the spherical distributions with regard to the first two statistical moments, the distribution of the portfolio position X = x^T R is given by

\[ x^T R \overset{d}{=} x^T \mu + \sqrt{x^T \Sigma x}\; W Z_1, \]

where Z1 is a univariate standard normally distributed random variable. Now, consider the portfolio optimisation problem where the risk with respect to any coherent distortion risk measure should be minimized. Using the spectral risk measure representation (2.4), it follows that

\[ \min_x \rho_\phi(x^T R) = \min_x \left( -\int_0^1 \phi(u) F_{x^T R}^{-1}(u)\, du \right) = \min_x \left( -x^T \mu + \sqrt{x^T \Sigma x} \int_0^1 \phi(u) F_{W Z_1}^{-1}(u)\, du \right) \]

In the last step, the properties (PH) and (TI) have been used. Since min_x −f(x) is equivalent to max_x f(x), the risk minimization has the equivalent form

\[ \max_x \left( -\rho_\phi(x^T R) \right) = \max_x \left( x^T \mu - \sqrt{x^T \Sigma x} \int_0^1 \phi(u) F_{W Z_1}^{-1}(u)\, du \right). \]

This problem is recognized as the Markowitz mean-variance trade-off problem, where the constant c is given by

\[ c = \int_0^1 \phi(u) F_{W Z_1}^{-1}(u)\, du. \tag{3.6} \]

This shows that solving the optimisation problem (3.5) for any coherent distortion risk measure, when the return vector R has an elliptical distribution, is equivalent to solving the Markowitz mean-variance trade-off problem. Note that the trade-off constant c specified by (3.6) is just a scaling factor of the volatility, and conclude that the optimisation problem can be solved with quadratic programming.
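To make the trade-off constant concrete, (3.6) can be evaluated numerically for a given spectral function φ. The sketch below does this for the CVaR spectrum φ(u) = 1{u ≥ α}/(1 − α) in the special case W ≡ 1 (a Gaussian assumption made purely for illustration), where c has the known closed form ϕ(Φ^{-1}(α))/(1 − α), against which the quadrature can be checked:

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

alpha = 0.95

# c = int_0^1 phi(u) F^{-1}(u) du with phi(u) = 1{u >= alpha} / (1 - alpha).
# Substituting u = Phi(z) turns it into int_{z_alpha}^inf z pdf(z) dz / (1 - alpha).
integral, _ = quad(lambda z: z * norm.pdf(z), norm.ppf(alpha), np.inf)
c_numeric = integral / (1 - alpha)

# Closed form in the Gaussian case: c = pdf(ppf(alpha)) / (1 - alpha).
c_closed = norm.pdf(norm.ppf(alpha)) / (1 - alpha)
```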

3.3.2 Asymmetric Distributions

The critical issue when fitting parametric models such as elliptical distributions to financial data is that they are restricted by the assumption of symmetry. Additionally, all observations are weighted equally when fitting the models. With the high density of observations around the mean, the fit becomes more representative of the central part of the distribution than of the distribution tails. This often results in model fits that have lighter tails than the empirical tails. Since extreme risk, which is the primary concern in risk management, is located far out in the tails, these models tend to underestimate the actual risk. In an attempt to model these extreme losses more accurately, such that measurements of risk become more reliable, the tails will be modelled separately from the rest of the distribution. This section considers an approach to produce a joint distribution function with asymmetric univariate marginal distributions. The approach combines EVT and copulas from Sections 2.5 and 2.6. It is sometimes referred to as the copula approach, and allows the modeller to decouple the construction of multivariate models, modelling the univariate distributions and the cross-dependency separately. Recall that EVT relies on the assumption of approximately IID(0,1) random variables. Due to the stylized facts of financial time series, market risk factor returns tend to deviate from this assumption, see Section 2.3. Hence, EVT cannot be applied directly; some sample preparation has to be made by fitting a GARCH(p, q) model. There are various methods to identify the best GARCH(p, q) model for historical data. The most commonly used model, however, is the simplest one, namely the GARCH(1, 1) model. This is also the one that will be considered here, since the model is in most cases sufficient for filtering residuals. Using the calibration procedure described in Section 2.4 yields the Maximum-Likelihood estimated parameters α0, α1 and β1 for the GARCH model. The estimated parameters of the GARCH(1,1) model for the realized return series of the SEB stock, depicted in Figure 2.2, are presented in Table 3.1.

Table 3.1: GARCH(1,1) parameters estimated with Maximum-Likelihood estimation for the SEB sample return series.

Parameter   Estimate     Standard Error   t Statistic
α̂0          3.1767e-06   7.184e-07        4.4219
α̂1          0.057152     0.0059743        9.5662
β̂1          0.93575      0.0059985        156

For a GARCH(1,1) model, the parameter α1 measures to which extent a volatility shock feeds through from one day to the next, and the factor (α1 + β1) measures the rate at which the effect of a volatility shock dies out; this is the discount rate found in equation (2.15). Hence, α1 + β1 < 1 is necessary for model stability. Note that β1 > α1 suggests that the volatility at time t depends more on the previous day's volatility level than on the returns. Table 3.1 shows that the stability requirement holds for the GARCH(1,1) model fitted to the sample series of the SEB stock. Furthermore, the third column shows that the standard errors of the parameter estimates are small and, since the t statistic is larger than 2 for all three parameter estimates, they are statistically significant at the 5% significance level.

When the GARCH(1,1) parameters have been estimated, the standardized residuals {Z_t}_{t∈Z} are obtained from

\[ Z_t = \frac{R_t - \hat{\mu}}{\hat{\sigma}_t} \tag{3.7} \]

where R_t represents the univariate return series. After the GARCH filtration has been successfully performed, the standardized residuals should be approximately IID(0, 1); the stylized facts of financial time series are then accounted for, and the volatility clustering behaviour should no longer be present. This is shown in Figure 3.1.

Figure 3.1: Series of standardized residuals illustrating that the volatility clustering is no longer present after GARCH filtration. The filtered residuals come from the SEB return series sample depicted in Figure 2.2.
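The GARCH filtration step can be sketched as follows, using the third-party `arch` package (an assumed implementation choice; the thesis does not prescribe a particular library). The placeholder `returns` stands for a univariate daily log-return series; returns are scaled by 100 for numerical stability, which is the convention the package recommends.

```python
import numpy as np
from arch import arch_model

returns = np.random.default_rng(0).standard_normal(1500) * 0.01  # placeholder series

# Fit a constant-mean GARCH(1,1) model by Maximum Likelihood.
model = arch_model(100 * returns, mean="Constant", vol="GARCH", p=1, q=1)
result = model.fit(disp="off")

# Standardized residuals Z_t = (R_t - mu_hat) / sigma_hat_t, as in (3.7).
z = result.std_resid

# One-day-ahead conditional volatility forecast, cf. equation (2.15),
# rescaled back to the original return scale.
sigma_next = np.sqrt(result.forecast(horizon=1).variance.values[-1, 0]) / 100
```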

Let Fn(z) denote the empirical cumulative distribution function for the sequence of independent and identically distributed standardized residuals z1, ..., zn obtained through the GARCH filtration. The tails of this sequence can now be modelled by applying Extreme Value Theory (EVT). A commonly used method for tail modelling is the peaks over threshold method, which will be used here. By choosing a high threshold η^h, which defines the tail base, the distribution of the threshold excesses z_k − η^h is given by the conditional distribution function

\[ F_{n,\eta^h}(z) = P(Z \le z \mid Z > \eta^h) = \frac{F_n(z) - F_n(\eta^h)}{1 - F_n(\eta^h)}, \quad z > \eta^h \tag{3.8} \]

Supported by EVT, the conditional distribution can be well approximated by a Generalized Pareto distribution (GPD) (2.16), provided a sufficiently high threshold η^h, i.e. F_{n,η^h}(x) ≈ G_{ξ,β}(x − η^h) for x ≥ η^h. The cumulative distribution function for the upper tail can be expressed by rearranging the terms of (3.8),

\[ F^t(x) = F_n(\eta^h) + (1 - F_n(\eta^h)) F_{n,\eta^h}(x) = F_n(\eta^h) + (1 - F_n(\eta^h))\, G_{\xi,\beta}(x - \eta^h). \]

Analogously, the lower tail can be modelled with a GPD by considering the excesses below some sufficiently low threshold η^l, such that F_{n,η^l}(x) ≈ G_{ξ,β}(η^l − x) for x ≤ η^l. The cumulative distribution function for the lower tail can be expressed as

\[ F^t(x) = F_n(\eta^l)\left( 1 - G_{\xi,\beta}(\eta^l - x) \right). \]

When the tails have been modelled, it remains to specify the area between the thresholds η^l and η^h to obtain a complete distribution. This can be done with a normal distribution, a student's-t distribution or even an empirical distribution. The choice is not so important, since the true focus of this project is the risk far out in the tails. The empirical distribution will be used to connect the tails, giving the complete distribution function

\[
F(x) =
\begin{cases}
F_n(\eta^h) + (1 - F_n(\eta^h))\, G_{\xi,\beta}(x - \eta^h), & x \ge \eta^h \\
F_n(x), & \eta^l < x < \eta^h \\
F_n(\eta^l)\left( 1 - G_{\xi,\beta}(\eta^l - x) \right), & x \le \eta^l
\end{cases} \tag{3.9}
\]

Consider the upper tail. The key to obtaining a good GPD tail fit is a suitable threshold η^h. Since there is no algorithm that specifies the optimal threshold, it has to be chosen manually, so the choice of threshold becomes a subjective assessment of the modeller. If the threshold is chosen too far out in the tail, there will be few observations available, resulting in poor parameter estimates. If the threshold is too close to the mean, there will be more observations, but including too many samples close to the mean puts more weight on observations that are not representative of tail behaviour, resulting in a questionable GPD approximation. A common approach to estimating the parameters is the Peaks Over Threshold (POT) method. The procedure starts with the choice of a high threshold η^h = F_n^{-1}(q^h), where q^h is the quantile level corresponding to η^h. The GPD is fitted to the excess observations beyond the threshold using Maximum-Likelihood estimation of the parameters ξ and β. (Nyström and Skoglund show in [27] that the Maximum-Likelihood estimates of the GPD parameters are robust to the choice of threshold; the sensitivity of the parameter estimation is hence less of a concern than finding a suitable threshold.) The goodness of fit is assessed by plotting the empirical tail against the GPD tail specified by the estimated parameters and visually inspecting the fit. If the tail fit is not satisfactory, a new threshold is chosen. The procedure is repeated for successively larger thresholds and is terminated for a sufficiently good fit. The same procedure is applied to the lower tail, where a lower threshold η^l = F_n^{-1}(q^l) is chosen. Figure 3.2 displays the GARCH filtered lower empirical tail of the SEB stock compared with a fitted normal distribution, student's-t distribution and Generalized Pareto distribution (GPD).

Figure 3.2: Empirical GARCH filtered lower tail residuals for the SEB stock, compared to the fitted normal distribution, student's-t distribution and the Generalized Pareto distribution.

Clearly, the GPD provides the best fit; the risk would be underestimated with the normal tail and overestimated with the student's-t tail. A particularly nice feature of fitting extreme value distributions such as the GPD is that they can be used to extrapolate the tail beyond the range of the sample. Since this approach does not assume any particular parametric distribution, the model risk is reduced, as implied by Figure 3.2. On the other hand, the univariate marginal distributions alone do not provide a multivariate distribution that describes the cross-dependence between the assets. The dependence structure can be modelled with a normal copula. The copula is calibrated by estimating the correlation matrix of the standardized residuals given by (3.7); the Kendall's tau rank correlation matrix T is obtained from the relation (2.20). Alternatively, a student's-t copula can be used, for which the Kendall's tau rank correlation matrix is obtained in the same way.
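The POT tail fit just described can be sketched with scipy's `genpareto`, whose parameterization matches G_{ξ,β} when the location is fixed at zero. This is a minimal sketch for the lower tail; `z` is a placeholder array of GARCH-filtered residuals and the quantile level `q_l` is an assumed example value.

```python
import numpy as np
from scipy.stats import genpareto

z = np.random.default_rng(1).standard_normal(1500)  # placeholder filtered residuals

# Lower tail: choose a threshold eta_l = F_n^{-1}(q_l) and fit a GPD to the
# positive excesses eta_l - z_k below it.
q_l = 0.05                               # assumed quantile level for the tail base
eta_l = np.quantile(z, q_l)
excess = eta_l - z[z < eta_l]

# Maximum-Likelihood fit of G_{xi, beta} with the location fixed at 0.
xi, _, beta = genpareto.fit(excess, floc=0)

# Lower-tail piece of the hybrid model (3.9): F(x) = F_n(eta_l)(1 - G(eta_l - x)).
def lower_tail_cdf(x):
    return q_l * (1 - genpareto.cdf(eta_l - x, xi, loc=0, scale=beta))
```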

When the hybrid cumulative distribution has been specified with GPD tails, and the copula has been calibrated, the following algorithm can be used to simulate return scenarios for the linear assets in the reference portfolio.

1. Simulate the random variates (x1, ..., xd) ∼ Nd(0, T), where T is the Kendall's tau rank correlation matrix.

2. Transform to standard uniform random variates (u1, ..., ud) = (Φ(x1), ..., Φ(xd)) ∼ U(0, 1), where Φ is the standard normal cumulative distribution function.

3. Transform to standardized residuals (z1, ..., zd) = (F_1^{-1}(u1), ..., F_d^{-1}(ud)), where F_i^{-1} is the inverse hybrid cumulative distribution function of (3.9) for asset i = 1, ..., d.

4. The simulated return scenarios for asset i = 1, ..., d are obtained from the location-scale transform

\[ R_i = \hat{\mu}_i + \sigma_{t+1}^{i} Z_i, \]

where µ̂_i are the empirically estimated means and σ_{t+1}^i is the one-day forecasted conditional volatility, computed using the GARCH volatility forecast in equation (2.15).

A similar algorithm is obtained for simulation from the student's-t copula by exchanging Nd(0, T) for td(ν, 0, T) in the first step and Φ for tν in the second step.
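A compact sketch of the four steps is given below. The inverse hybrid distribution functions `hybrid_ppf[i]` are hypothetical stand-ins for the inverse of (3.9) (for instance, numerical inversion of a tail fit like `lower_tail_cdf` above combined with the empirical mid-section); `T`, `mu_hat` and `sigma_next` are assumed to come from the calibration steps, and T is assumed to be positive definite so it can serve as a covariance matrix.

```python
import numpy as np
from scipy.stats import norm

def simulate_returns(n, T, hybrid_ppf, mu_hat, sigma_next, seed=0):
    """Simulate n return scenarios from a normal copula with hybrid marginals."""
    rng = np.random.default_rng(seed)
    d = T.shape[0]
    # Step 1: correlated normal variates from N_d(0, T).
    x = rng.multivariate_normal(np.zeros(d), T, size=n)
    # Step 2: probability transform to standard uniforms.
    u = norm.cdf(x)
    # Step 3: inverse hybrid CDFs give the standardized residuals.
    z = np.column_stack([hybrid_ppf[i](u[:, i]) for i in range(d)])
    # Step 4: location-scale transform with the one-day GARCH volatility forecast.
    return mu_hat + sigma_next * z
```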

3.4 Linear Programming Methods

The portfolio optimization problem (3.5) can be solved in various ways, and the available choices depend on the model assumptions. As implied by the remark on elliptically distributed returns in Section 3.3.1, any portfolio optimization problem with known elliptical parameters can be solved analytically with quadratic programming. However, the quadratic investment problem introduced by Harry Markowitz was developed for linear assets, which can be assumed to be well described by means and variances. Non-linear assets such as derivatives are typically not well described in terms of means and variances; since the problem is quadratic with a linear asset representation, a delta approximation is then used. Such approximations can be too crude and tend to oversimplify complex instruments. It is therefore common to employ scenario-based analysis of risk factor returns. This framework enables portfolio optimization using linear programming. Linear programming algorithms tend to be more reliable than quadratic algorithms and are more efficient at solving large-scale problems. This section will focus on this linear programming approach, starting with an alternative version of the Markowitz mean-variance optimization.

3.4.1 Markowitz Linear Program

Variance is the classical statistical quantity for measuring dispersion, but there are other ways to measure the dispersion of random variables. Consider the Mean Absolute Deviation (MAD), which measures dispersion in absolute values rather than squared values. Define the portfolio mean and MAD as

\[ \mu(x) = \sum_{i=1}^{d} \mu_i x_i, \qquad \delta(x) = \sum_{j=1}^{J} p_j \left| \sum_{i=1}^{d} r_{ji} x_i - \sum_{i=1}^{d} \mu_i x_i \right| \]

where r_{ji} is the matrix of returns and x_i the holdings for assets i = 1, ..., d and scenarios j = 1, ..., J. Define the portfolio return scenarios y_j = Σ_{i=1}^d r_{ji} x_i. Scenario j = 1, ..., J occurs with probability p_j, where Σ_{j=1}^J p_j = 1, and it will be assumed that p_j = 1/J for all j. Define the deviation scenario z_j = |y_j − Σ_{i=1}^d µ_i x_i| = max{(y_j − µ), −(y_j − µ)} for scenario j. Adding the constraints stated in (3.5), the MAD linear programming model can be stated as follows

\[
\begin{aligned}
\min \quad & \frac{1}{J} \sum_{j=1}^{J} z_j \\
\text{subject to} \quad & z_j \ge y_j - \mu, \quad j = 1, \dots, J \\
& z_j \ge -(y_j - \mu), \quad j = 1, \dots, J \\
& y_j = \sum_{i=1}^{d} r_{ji} x_i, \quad j = 1, \dots, J \\
& \mu = \sum_{i=1}^{d} \mu_i x_i \\
& \mu = \mu_0 \\
& z_j \ge 0, \quad j = 1, \dots, J \\
& \sum_{i=1}^{d} x_i = V_0 \\
& x_i \delta_i \le \lambda_i \sum_{m=1}^{d} \delta_m x_m, \quad i = 1, \dots, d \\
& x_i \ge 0, \quad i = 1, \dots, d \tag{3.10}
\end{aligned}
\]

The advantage of the simulation-based approach is that it does not make any assumptions regarding the distributions of the risk factor returns.
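A minimal sketch of the MAD program with scipy's `linprog` is given below, keeping only the budget, the non-negativity constraints and an expected-return constraint µ(x) ≥ θ; the value constraints and the fixed-mean variant µ = µ0 of (3.10) are omitted for brevity. The decision vector stacks the holdings x and the deviation scenarios z, and the scenario matrix is placeholder data.

```python
import numpy as np
from scipy.optimize import linprog

def mad_portfolio(r, mu, theta):
    """Sketch of the MAD LP: variables are [x (d holdings), z (J deviations)]."""
    J, d = r.shape
    dev = r - mu                                   # r_ji - mu_i per scenario
    c = np.concatenate([np.zeros(d), np.ones(J) / J])
    # z_j >= +-(y_j - mu(x))  <=>  +-dev_j . x - z_j <= 0
    A_ub = np.block([[dev, -np.eye(J)], [-dev, -np.eye(J)]])
    b_ub = np.zeros(2 * J)
    # Expected-return constraint: -mu . x <= -theta.
    A_ub = np.vstack([A_ub, np.concatenate([-mu, np.zeros(J)])])
    b_ub = np.append(b_ub, -theta)
    A_eq = np.concatenate([np.ones(d), np.zeros(J)]).reshape(1, -1)  # budget
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * (d + J), method="highs")
    return res.x[:d], res.fun  # optimal holdings and portfolio MAD

# Example with 500 simulated scenarios for 4 assets (placeholder data).
rng = np.random.default_rng(0)
r = rng.normal(5e-4, 0.01, size=(500, 4))
x_opt, mad = mad_portfolio(r, r.mean(axis=0), theta=1e-4)
```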

3.4.2 Rockafellar and Uryasev's CVaR Optimization

Rockafellar and Uryasev [28] recognized that working directly with the definition of CVaR (2.1) in an optimization framework is impractical. Instead, they introduced a linear representation of CVaR that will play a central role in the optimization framework for CDRMs studied in this thesis. This section presents this linear representation in the optimization framework. Let L(x, R) be the loss associated with the portfolio position x, to be chosen from the set of feasible portfolios X, and R ∈ R^d the random vector of market risk factors that affect the loss. For each position x, the loss L(x, R) is a random variable in R induced by the distribution of R. For theoretical purposes it is assumed that R has probability density p(R). Let F_L denote the cumulative distribution of the loss associated with portfolio position x, such that the probability of L(x, R) not exceeding the threshold ζ is

\[ F_L(x, \zeta) = \int_{L(x,R) \le \zeta} p(R)\, dR. \]

For simplicity, it is assumed that F_L(x, ζ) is continuous and non-decreasing with respect to ζ everywhere, to avoid mathematical complications, see Uryasev [35]. The VaR_α and CVaR_α of the portfolio loss associated with position x at level α ∈ (0, 1) are given by

\[ \mathrm{VaR}_\alpha(x) = \min\{ \zeta \in \mathbb{R} : F_L(x, \zeta) \ge \alpha \} \]

and

\[ \mathrm{CVaR}_\alpha(x) = \frac{1}{1 - \alpha} \int_{L(x,R) \ge \zeta} [L(x, R) - \zeta]^+ p(R)\, dR, \]

where [t]^+ = max(t, 0). In the former expression, VaR_α(X) comes out as the left endpoint of the non-empty interval of values ζ such that F_L(x, ζ) = α, i.e. VaR_α(X) is identified as the threshold ζ_α at level α, which follows from F_L(x, ζ) being continuous and non-decreasing with respect to ζ. In the latter expression, the probability that L(x, R) ≥ ζ is therefore (1 − α), such that CVaR_α(X) is identified as the conditional expectation of the loss L(x, R) equal to or greater than VaR_α. The key is to characterize VaR_α(X) and CVaR_α(X) in terms of the auxiliary function H_α(x, ζ) on X × R, defined as

\[ H_\alpha(x, \zeta) = \zeta + \frac{1}{1 - \alpha} \int_{R \in \mathbb{R}^d} [L(x, R) - \zeta]^+ p(R)\, dR, \tag{3.11} \]

where H_α(x, ζ) is convex with respect to (x, ζ) if and only if CVaR_α(X) is convex with respect to x and the function (x, ζ) ↦ [L(x, R) − ζ]^+ is convex, which holds when L(x, R) is convex in x. The convexity of H_α(x, ζ) is a crucial property for optimization problems: it eliminates the possibility of local minima that do not coincide with the global minima. The following theorem ensures that the auxiliary function (3.11) is a valid representation of CVaR in an optimization framework.

Theorem 3.1 (Convex Representation for Conditional Value-at-Risk Minimization Problems (Theorem 14 in Rockafellar and Uryasev [29])). Minimizing CVaR_α(X) with respect to x ∈ X is equivalent to minimizing H_α(x, ζ) over all (x, ζ) ∈ X × R, in the sense that

\[ \min_{x \in \mathcal{X}} \mathrm{CVaR}_\alpha(X) = \min_{(x, \zeta) \in \mathcal{X} \times \mathbb{R}} H_\alpha(x, \zeta) \]

where moreover

\[ (x^*, \zeta^*) \in \operatorname*{arg\,min}_{(x, \zeta) \in \mathcal{X} \times \mathbb{R}} H_\alpha(x, \zeta) \iff x^* \in \operatorname*{arg\,min}_{x \in \mathcal{X}} \mathrm{CVaR}_\alpha(X),\; \zeta^* \in \operatorname*{arg\,min}_{\zeta \in \mathbb{R}} H_\alpha(x^*, \zeta) \]

Theorem 3.1 ensures that the objective value for minimizing CVaR equals the objective value found when minimizing H_α(x, ζ). Hence, the minimization of H_α(x, ζ) is a convex programming problem that minimizes the portfolio CVaR and calculates the values of both VaR and CVaR simultaneously. By realizing J returns r_j and loss scenarios l_j from the probability density function p(R), Rockafellar and Uryasev [28] argue that the integral can be approximated by the following convex and piecewise linear function with respect to ζ

\[ \widetilde{H}_\alpha(x, \zeta) = \zeta + \frac{1}{1 - \alpha} E\big[ [l(x, r) - \zeta]^+ \big] = \zeta + \frac{1}{1 - \alpha} \sum_{j=1}^{J} p_j [l(x, r_j) - \zeta]^+ \]

Replacing CVaR in the portfolio optimization formulation with H̃_α(x, ζ) as the risk measure gives

\[ \min \{ \widetilde{H}_\alpha(x, \zeta) \mid x \in \mathcal{X} \}. \]
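For equally probable scenarios (p_j = 1/J) this minimization becomes a linear program. The sketch below sets it up with scipy's `linprog`, adding the budget and expected-return constraints from (3.5) while omitting the value constraints; the losses are l_j = −x^T r_j, and the scenario data would come from simulation.

```python
import numpy as np
from scipy.optimize import linprog

def cvar_portfolio(r, mu, theta, alpha=0.95):
    """Sketch of the Rockafellar-Uryasev LP: variables [x (d), z (J), zeta]."""
    J, d = r.shape
    c = np.concatenate([np.zeros(d), np.ones(J) / ((1 - alpha) * J), [1.0]])
    # z_j >= l_j - zeta with l_j = -r_j . x  <=>  -r_j . x - z_j - zeta <= 0
    A_ub = np.hstack([-r, -np.eye(J), -np.ones((J, 1))])
    b_ub = np.zeros(J)
    # Expected-return constraint: mu . x >= theta.
    A_ub = np.vstack([A_ub, np.concatenate([-mu, np.zeros(J + 1)])])
    b_ub = np.append(b_ub, -theta)
    A_eq = np.concatenate([np.ones(d), np.zeros(J + 1)]).reshape(1, -1)  # budget
    bounds = [(0, None)] * (d + J) + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=bounds, method="highs")
    # res.x[-1] approximates VaR_alpha and res.fun approximates CVaR_alpha.
    return res.x[:d], res.x[-1], res.fun
```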

3.4.3 Extension of Rockafellar and Uryasev's CVaR Optimization

In order to develop a convex optimization framework for a CDRM, two problems need to be solved: it is necessary to express the Choquet integral (2.3) in terms of a convex auxiliary function with the same minimum as the corresponding risk measure ρ, and to find its convex generator Q. To address the first problem, consider the following theorem, shown in the continuous case by Kusuoka [23].

Theorem 3.2. Assume there is a portfolio X and a concave distortion function g. Then the associated distortion risk measure ρ_g is coherent if and only if there exists a function w : [0, 1] ↦ [0, 1] satisfying ∫_0^1 w(u) du = 1 such that

\[ \rho_g = \int_0^1 w(u)\, \mathrm{CVaR}_u(X)\, du \tag{3.12} \]

The tractable property of this representation is that (3.12) can be used to construct a convex representation of any CDRM as a convex combination of the CVaR_α(X), α ∈ [0, 1]. Implementing a CDRM in a portfolio optimization framework requires a convex representation in the form of an auxiliary function. Consider the following function

\[ M_g(x, \zeta) = \int_0^1 w(u) H_u(x, \zeta_u)\, du \tag{3.13} \]

where H_α(x, ζ_α) is the auxiliary function defined in (3.11) as a convex representation of CVaR_α(X), and w(u) ≥ 0 with ∫_0^1 w(u) du = 1. Additionally, for each α ∈ [0, 1] there is a corresponding auxiliary variable ζ_α. The convex weights w(u) ∈ [0, 1] depend on both the portfolio loss distribution and the choice of distortion function. The following theorem ensures the existence of the weights and that the auxiliary function M_g(x, ζ) is a valid convex representation of a given CDRM ρ_g with a concave distortion function in an optimization framework.

Theorem 3.3 (Convex Representation for Coherent Distortion Risk Measure Minimization Problems). Minimizing ρ_g(L) with respect to x ∈ X is equivalent to minimizing M_g(x, ζ) over all (x, ζ) ∈ X × R^{|ζ|}, in the sense that

\[ \min_{x \in \mathcal{X}} \rho_g(X) = \min_{(x, \zeta) \in \mathcal{X} \times \mathbb{R}^{|\zeta|}} M_g(x, \zeta) \]

where moreover

\[ (x^*, \zeta^*) \in \operatorname*{arg\,min}_{(x, \zeta) \in \mathcal{X} \times \mathbb{R}^{|\zeta|}} M_g(x, \zeta) \iff x^* \in \operatorname*{arg\,min}_{x \in \mathcal{X}} \rho_g(X),\; \zeta^* \in \operatorname*{arg\,min}_{\zeta} M_g(x^*, \zeta) \]

For a full proof of Theorem 3.3, see the discussion in Section 4 of Föllmer and Schied [14], where they show that any convex risk measure is continuous from above and below. This property is inherited by the Choquet integral representation of any risk measure with the convexity property (CX), and it ensures that minimizing ρ_g(X) and M_g(x, ζ) yields the same objective value. The CDRM may therefore be characterized by the following function

\[ M_g(x, \zeta) = \int_0^1 w(u) \left( \zeta_u + \frac{1}{1 - u} \int_{R \in \mathbb{R}^d} [L(x, R) - \zeta_u]^+ p(R)\, dR \right) du \tag{3.14} \]

This section will focus on finding a discrete expression for (3.12) and on finding the generator Q in a discrete setting that can be used in an optimization framework. This is only a short-hand presentation providing the main results; for a full presentation the reader is referred to Brown [8].

Definition 3.1. Let ∆^J denote the restricted probability simplex in J dimensions, defined as

\[ \Delta^J = \Big\{ q \text{ in the probability simplex} \;\Big|\; \sum_{j=1}^{J} q_j = 1,\; q_1 \le \dots \le q_J \Big\} \]

Let l = l(x, r) = (l_1, ..., l_J)^T be the realized losses of a portfolio random variable and l_{(1)} ≤ ... ≤ l_{(J)} the ordered losses associated with the portfolio position x ∈ R^d and the random return vector R ∈ R^J. Denote by p_j = p(l = l_j) the probability of realizing l_j = L(x, r_j), j = 1, ..., J, which can be generated from any discrete distribution or sampled using Monte-Carlo simulation. For simplicity, and since it provides a cleaner description of the generating families Q, it will be assumed throughout this thesis that the probability density is discrete uniform, such that p(L = l_j) = 1/J for j = 1, ..., J.

Theorem 3.4 (Finite Generator for Coherent Distortion Risk Measures (Theorem 4.2 in Bertsimas et al. [3])). For any portfolio loss random variable with ordered realized losses l_{(1)} ≤ ... ≤ l_{(J)} and a concave distortion function g, the distortion risk measure

\[ \rho_g(l) = \sum_{j=1}^{J} q_j l_{(j)} \]

is coherent if and only if there is a vector q ∈ ∆^J in the restricted probability simplex such that

\[ q_j = g\left( \frac{J - j + 1}{J} \right) - g\left( \frac{J - j}{J} \right) \tag{3.15} \]

Moreover, every q ∈ ∆^J is a convex combination of CVaRs, written on the form

\[ q_j = \sum_{k=1}^{J} w_k Q_{jk}, \quad j = 1, \dots, J \]

where Q ∈ R^{J×J} is the CVaR matrix

\[
Q = \begin{pmatrix}
1/J & 0 & 0 & \cdots & 0 \\
1/J & 1/(J-1) & 0 & \cdots & 0 \\
1/J & 1/(J-1) & 1/(J-2) & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1/J & 1/(J-1) & 1/(J-2) & \cdots & 1
\end{pmatrix}
\]

such that CVaR_{(k−1)/J}(l) = Σ_{j=1}^J Q_{jk} l_{(j)} for k = 1, ..., J, and w = (w_1, ..., w_J) is the set of convex weights satisfying w_k ≥ 0 and Σ_{k=1}^J w_k = 1, given by

\[
w_k =
\begin{cases}
J q_1, & k = 1 \\
(J + 1 - k)(q_k - q_{k-1}), & k = 2, \dots, J. \tag{3.16}
\end{cases}
\]

The coherent distortion risk measure can thus be rewritten as a convex combination of CVaR_{(k−1)/J} for a given loss l, as follows

\[ \rho_g(l) = \sum_{j=1}^{J} q_j l_{(j)} = \sum_{j=1}^{J} \Big( \sum_{k=1}^{J} w_k Q_{jk} \Big) l_{(j)} = \sum_{k=1}^{J} w_k \Big( \sum_{j=1}^{J} Q_{jk} l_{(j)} \Big) = \sum_{k=1}^{J} w_k\, \mathrm{CVaR}_{\frac{k-1}{J}}(l) \]

For a complete proof of Theorem 3.4, see Bertsimas et al. [3]. Verifying that Σ_{k=1}^J w_k = Σ_{j=1}^J q_j = 1 is easy, since

\[ \sum_{k=1}^{J} w_k = J q_1 + \sum_{k=2}^{J} (J + 1 - k)(q_k - q_{k-1}) = \sum_{k=1}^{J} q_k \]

and for a concave distortion function g, q_1 ≤ ... ≤ q_J, such that, by telescoping,

\[ \sum_{j=1}^{J} q_j = \sum_{j=1}^{J} \left[ g\left( \frac{J - j + 1}{J} \right) - g\left( \frac{J - j}{J} \right) \right] = g(1) - g(0) = 1 \]
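The generator (3.15) and the weights (3.16) are straightforward to compute for any concave distortion function. The sketch below does so for the Wang transform distortion g(u) = Φ(Φ^{-1}(u) + λ), with an assumed example value of λ; the sorted standard normal losses are placeholder data.

```python
import numpy as np
from scipy.stats import norm

def cdrm_weights(g, J):
    """q_j from (3.15) and the CVaR weights w_k from (3.16)."""
    j = np.arange(1, J + 1)
    q = g((J - j + 1) / J) - g((J - j) / J)
    w = np.empty(J)
    w[0] = J * q[0]
    k = np.arange(2, J + 1)
    w[1:] = (J + 1 - k) * (q[1:] - q[:-1])
    return q, w

lam = 0.5                                    # assumed distortion parameter
g = lambda u: norm.cdf(norm.ppf(u) + lam)    # Wang transform distortion

J = 10_000
q, w = cdrm_weights(g, J)
losses = np.sort(np.random.default_rng(0).standard_normal(J))  # placeholder l_(j)
rho_g = q @ losses                           # rho_g(l) = sum_j q_j l_(j)
```

Since g is concave, the q_j are non-decreasing, so the largest weights fall on the largest ordered losses, which is precisely the loss-severity weighting described earlier.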

Now, since the H_{α_k}(x, ζ_{α_k}), k = 1, ..., J, are all joint functions of x and ζ_{α_k}, and M_g(x, ζ) is a convex combination of H_{α_k}(x, ζ_{α_k}) for k = 1, ..., J, the convex representation (3.14) may be approximated by the following convex and piecewise linear function with respect to ζ

\[ \widetilde{M}_g(x, \zeta) = \sum_{k=1}^{J} w_k \left( \zeta_{\alpha_k} + \frac{1}{n - k} \sum_{j=1}^{J} [l(x, r_j) - \zeta_{\alpha_k}]^+ \right) \tag{3.17} \]

Hence, M̃_g(x, ζ) is a joint convex function of x and ζ, such that the general portfolio risk minimization problem may be written as

\[ \min \{ \widetilde{M}_g(x, \zeta) \mid x \in \mathcal{X} \}. \]

Let n be an integer with n ≤ J and introduce the auxiliary variables z_{kj} = [l(x, r_j) − ζ_{α_k}]^+ for j = 1, ..., J, k = 1, ..., n. Then the minimization problem is given by

\[
\begin{aligned}
\min \quad & \sum_{k=1}^{n} w_k \left( \zeta_{\alpha_k} + \frac{1}{n - k} \sum_{j=1}^{J} p_j z_{kj} \right) \\
\text{subject to} \quad & z_{kj} \ge l(x, r_j) - \zeta_{\alpha_k}, \quad z_{kj} \ge 0, \quad j = 1, \dots, J,\; k = 1, \dots, n \\
& \sum_{i=1}^{d} x_i \mu_i \ge \theta \\
& \sum_{i=1}^{d} x_i = V_0 \\
& x_i \delta_i \le \lambda_i \sum_{m=1}^{d} \delta_m x_m, \quad i = 1, \dots, d \\
& x_i \ge 0, \quad i = 1, \dots, d \tag{3.18}
\end{aligned}
\]

where d is the number of instruments in the portfolio. Note that optimization with coherent distortion risk measures can potentially be computationally heavy, since problem (3.18) involves double sums. The system can be thought of as an n × J grid, where n denotes the number of convex combinations of CVaR_α to be solved for and J the number of loss scenarios. When solving the system for CVaR_α at confidence level α, there is only one non-zero entry w_k = 1, so the system reduces to solving for n = 1. For the other CDRMs, the weight vector w has no zero entries; many weights are small but not identically zero. Theoretically, solving for all loss scenarios would imply n = J. This is not practical since, in order to acquire accurate measurements from simulated data, J needs to be large. Approximations are therefore necessary, which is done by reducing the size of n.

3.5 Risk contributions

Provided that the distribution of portfolio returns X is known, the associated risk with respect to a risk measure ρ can be estimated. It is often of particular interest to know how much each asset contributes to the total portfolio risk. A reasonable requirement on a risk decomposition is that the full allocation property should be satisfied: if the portfolio risk is allocated amongst the portfolio assets, the risk exposures from the assets should sum up to the total portfolio risk. The risk decomposition method used here originates from the Euler decomposition theorem, which states the following.

Theorem 3.5 (Euler's Decomposition Theorem). Let f : R^d ↦ R be a continuous function with continuous partial derivatives. Then f is homogeneous of degree m, that is f(λx) = λ^m f(x), if and only if

\[ \sum_{i=1}^{d} x_i \frac{\partial f}{\partial x_i} = m f \]

holds for all x_1, ..., x_d ∈ D, where D is an open domain [34, p. 32]. Now assume that there is a risk measure ρ with continuous partial derivatives satisfying the positive homogeneity property, so that m = 1. The risk decomposition based on Euler's decomposition theorem is

\[ \rho(X) = \sum_{i=1}^{d} x_i \frac{\partial \rho(X)}{\partial x_i} \tag{3.19} \]

where x_i denotes the holdings in instrument i = 1, ..., d. The derivative ∂ρ(X)/∂x_i in (3.19) should be interpreted as the allocated risk from asset i with respect to the risk measure ρ, scaled by the holding x_i as expected. Since the total portfolio risk equals the sum of all risk contributions, (3.19) satisfies the full allocation property. It is often convenient to express the allocated risks as proportions of the total portfolio risk. Normalization gives

\[ 1 = \frac{\sum_{i=1}^{d} x_i \frac{\partial \rho(X)}{\partial x_i}}{\rho(X)} = \sum_{i=1}^{d} \nu_i \]

In a discrete setting, where the derivative ∂ρ(X)/∂x_i does not exist, Tasche [33] proposes that risk contributions can be interpreted as conditional expectations. Following this reasoning, consider a set of realized losses l_j = l(x, r_j), j = 1, ..., J, from a portfolio with loss distribution F_L. Assume that the portfolio consists of d assets and let x denote the holdings vector, such that a portfolio loss scenario can be written l_j = Σ_{i=1}^d x_i l_{ji}, where {l_{j,i}}_{j=1,i=1}^{J,d}, with l_i = (l_{1,i}, ..., l_{J,i}), denotes the loss components. Let VaR(α) = l_s and define the VaR(α) contribution from asset i by approximating the derivative, with respect to the holdings x_i, with

\[ D_{\mathrm{VaR}(\alpha)}(x_i) = E[\, l_i \mid l = l_s \,]. \]

Similarly, the CVaR_α(X) contributions are obtained by recalling the integral relation between CVaR(α) and VaR(α),

\[ \mathrm{CVaR}(\alpha) = \frac{1}{1 - \alpha} \int_\alpha^1 \mathrm{VaR}(u)\, du. \]

By the standard rules of calculus, the derivative can be taken inside the integral, giving

\[ \frac{\partial\, \mathrm{CVaR}(\alpha)}{\partial x_i} = \frac{1}{1 - \alpha} \int_\alpha^1 \frac{\partial\, \mathrm{VaR}(u)}{\partial x_i}\, du. \]

The CVaR(α) contribution from asset i is given by approximating the derivative, with respect to the holdings x_i, with

\[ D_{\mathrm{CVaR}(\alpha)}(x_i) = \frac{\sum_{j=1}^{J} E[\, l_i \mid l_j > l_s \,]}{\sum_{j=1}^{J} \mathbf{1}\{ l_j > l_s \}}, \tag{3.20} \]

where 1 is the indicator function, which is equal to one if the inequality holds for loss scenario j and zero otherwise. Expression (3.20) suggests that the CVaR_α(X) contributions can be interpreted as the average of a sequence of VaR_α(X) contributions strictly above VaR_α(X). The risk contributions for a general CDRM can be derived by considering the convex representation

\[ \rho_g = \int_0^1 w(u)\, \mathrm{CVaR}(u)\, du. \]

Taking the derivative inside the integral gives

\[ \frac{\partial \rho_g}{\partial x_i} = \int_0^1 w(u)\, \frac{\partial\, \mathrm{CVaR}(u)}{\partial x_i}\, du \]

The CDRM contribution from asset i is given by approximating the derivative, with respect to the holdings x_i, with

\[ D_{\rho_g}(x_i) = \sum_{k=1}^{J} w_k \frac{\sum_{j=1}^{J} E[\, l_i \mid l_j > l_k \,]}{\sum_{j=1}^{J} \mathbf{1}\{ l_j > l_k \}}, \tag{3.21} \]

where w_k is the CVaR weight defined in (3.16).
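A rough sketch of how the discrete estimators (3.20) and (3.21) can be computed from simulated data is given below. It assumes a matrix `L` of per-unit loss components l_{ji}, so that the scenario losses are l_j = Σ_i x_i l_{ji}, and a weight vector `w` as produced by (3.16); both inputs are placeholders.

```python
import numpy as np

def cvar_contributions(L, x, alpha):
    """CVaR(alpha) contributions: average loss components beyond VaR, cf. (3.20)."""
    l = L @ x                               # portfolio loss scenarios l_j
    var_alpha = np.quantile(l, alpha)       # VaR(alpha) estimate l_s
    tail = l > var_alpha
    return x * L[tail].mean(axis=0)         # x_i * E[l_i | l > l_s]

def cdrm_contributions(L, x, w):
    """CDRM contributions as a w-weighted mix of CVaR contributions, cf. (3.21)."""
    J = L.shape[0]
    l = L @ x
    order = np.argsort(l)                   # scenarios ordered by loss
    total = np.zeros(x.shape)
    for k in range(J):
        tail = order[k:]                    # the J - k largest losses
        total += w[k] * x * L[tail].mean(axis=0)
    return total                            # sums approximately to rho_g(l)
```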

Arguably, since the full allocation property can easily be achieved through normalization, any risk decomposition method could be used instead of the Euler decomposition. The justification for the Euler decomposition method is provided by Theorem 4.4 in Tasche [32], which states that the only risk decomposition method consistent with (local) portfolio optimization is the Euler decomposition. In other words, whereas other risk decomposition methods render misleading information about the overall portfolio performance when making adjustments to the holdings of individual portfolio assets, the Euler decomposition does not; it is said to be the only performance measure satisfying

\[
\frac{\partial}{\partial x_i} \frac{-E[l]}{\rho(l)}
\begin{cases}
> 0, & \text{if } \dfrac{-E[l_i]}{g_i^{\rho}(l)} > \dfrac{-E[l]}{\rho(l)} \\[2ex]
< 0, & \text{if } \dfrac{-E[l_i]}{g_i^{\rho}(l)} < \dfrac{-E[l]}{\rho(l)}
\end{cases} \tag{3.22}
\]

where g^ρ denotes the performance measure. Here, (3.22) says that if the performance −E[l_i]/g_i^ρ(l) of an asset i is better (worse) than the performance −E[l]/ρ(l) of the overall portfolio, then increasing (decreasing) the holdings of that asset improves the performance of the overall portfolio. The concept of negative risk contribution is important: negative risk contributions in an investment portfolio are interpreted as hedges that decrease the overall portfolio risk.

4 Analysis and Conclusion

This section comprises the results of this project, starting with a presentation of the reference portfolio. As a benchmark for the analysis, the optimal investment position is computed by solving the Markowitz mean-variance optimization problem for the reference portfolio. Then follow the results from the portfolio optimization problem (3.18), where the four members of the coherent distortion risk measure class are used.

4.1 Reference Portfolios and Benchmark Solution

In order to address the objectives of the project, a portfolio of financial assets is considered. The assets included in the portfolio are ten stocks and one bond index from the Swedish market. They are listed in Table 4.1 below, with their corresponding business sectors.

Table 4.1: Instruments in the reference portfolio and their corresponding business sector.

Asset           Sector
AstraZeneca     Health care
AtlasCopco      Industrials
Ericsson        Technology
HM              Consumer goods
ICA             Consumer goods
Nordea          Banking
Sandvik         Industrials
SEB             Banking
SwedishMatch    Consumer goods
Telia           Telecommunications
Total Bond index    Government bond

The historical data of the assets were collected from the Nasdaq Nordic public web page [25]. The historical price data consist of 3,267 price samples that range between 2006-01-02 and 2018-12-28. The period was chosen to capture both high- and low-volatility periods, and the assets were selected to be well-diversified over a wide range of business sectors and to represent different market volatility levels. As the representation of a risk-free asset, the OMRX total bond index (TBOND) is considered, due to its low-risk, low-reward characteristics. The index is a composition of benchmark bonds issued by the Swedish National Debt Office and liquid bonds issued by Swedish mortgage institutions. Therefore, it can be assumed to be close to risk-free. The price of each asset at the end of the trading day is referred to as the closing price. These are depicted in Figure 4.1 for all assets of Table 4.1 as functions of time, between the dates 2006-01-02 and 2018-12-28. The succeeding analysis of this thesis will use all the data within this time span. A few of the closing prices were missing from the downloaded data set: seven missing data points for the total bond index and one for Ericsson. To compensate, the missing closing prices have been linearly interpolated [20, p. 10]. The risky assets are depicted in the left frame and the total bond index in the right frame of Figure 4.1.

Figure 4.1: Price movements of the risky assets (left) and the total bond index (right) from 2006-01-02 to 2018-12-28.

Studying the price movements of the assets in Figure 4.1, the following conclusions are drawn. Assets such as Swedish Match and ICA Group have an upward trend, while an asset such as HM has a downward trend. Nordea and Telia do not have a significant trend and appear to be stable. Atlas Copco and Sandvik are more volatile than Telia and Swedish Match. The most significant trend behaviour occurs during the global financial crisis between 2007 and 2009. During the crisis, most of the assets display a negative trend, which is an indication of strong correlation among the assets when the market is unstable. Some assets, such as Nordea and Telia, managed the financial crisis better than other companies, such as Astra Zeneca. Another important conclusion that can be drawn from the plotted price data is that assets from the same business sector are more strongly correlated, for example Nordea and SEB from the banking sector, and HM and ICA from the consumer goods sector. It can therefore be concluded that the reference assets satisfy the desired traits sought in a diversified portfolio. Furthermore, the time frame of the historical data set contains both stable and unstable periods.

In addition to the stocks and the total bond index, which are linear assets, five options will be considered. Derivative securities do not have an intrinsic value by themselves; their value is derived from the underlying. In this project, the underlyings of the option strategies will be some of the stocks described above. One well-known investment strategy involving derivatives is the hedging strategy, commonly used by financial institutions to hedge the risk of an underlying asset. In general, the derivative security is constructed to be negatively correlated with the underlying, such that the derivative value increases when the value of the underlying decreases. The reference portfolio will include one put and one call option, a butterfly spread, which is a safer investment, and a straddle strategy, which is somewhat riskier. Moreover, a down-and-in barrier option will also be considered. Since the options are non-linear, they are typically not well represented in terms of means and variances. Therefore, when solving classical Markowitz mean-variance optimization problems, these assets need to be delta-approximated to retain the normal distribution assumption. In this thesis, however, that approach will not be used. Instead, since the assets are easy to simulate from their underlyings, this project focuses on full simulation-based valuation of the portfolio. Hence, all portfolio optimization problems solved throughout this thesis will be solved using simulation and linear programming. This thesis will mainly focus on European options. The different derivative investment strategies have been chosen such that they reflect seemingly safe and profitable investments, but with potentially large risks hidden in the tails. The underlying instruments of the derivatives in the non-linear reference portfolio are selected from the linear reference portfolio, and the derivative assets are given in Table 4.2 with their corresponding parameters.

Table 4.2: Option parameters, where S0 denotes the initial stock price, r the risk-free interest rate, σ the volatility in terms of standard deviation, T the time to maturity in years, and K1, K2 and K3 the strike prices.

Derivative         Underlying   S0      r       σ      T       K1      K2      K3
Long put           Nordea       74.58   0.012   0.5    ≈5.66   50      -       -
Short call         Nordea       74.58   0.012   0.5    ≈5.66   100     -       -
Butterfly spread   Telia        41.98   0.008   0.35   0.25    32.93   40.43   47.93
Straddle           SEB          86.4    0.008   0.5    1       94.9    94.9    -

In Table 4.2, the initial stock price S0 has been chosen as the last observation of the historical price data, i.e. the closing price on 2018-12-28, for the respective underlying stock. It is acknowledged that having different risk-free interest rates r and volatilities σ for different options is somewhat unrealistic. However, the primary purpose of including these derivatives in the reference portfolio is to have seemingly safe investment assets with large potential risks, and the strike prices K_i, i = 1, 2, 3, have been selected accordingly. All option derivatives in this thesis are of the European type. The initial stock prices S0 for the long put and the short call in Table 4.2 refer to the initial price of the underlying Nordea stock. The underlying of the butterfly spread is the Telia stock, and the straddle has the SEB stock as underlying. The straddle and the butterfly spread are derivative strategies based on combinations of European options and therefore have multiple strike prices. The idea behind the long put option with the Nordea stock as underlying is to produce a hedge in combination with the Nordea stock. The short call option with the Nordea stock as underlying produces a large tail risk, which could potentially be unbounded. The butterfly spread is a safe options strategy that gives the best payoff if the underlying Telia stock value stays the same, but decreasing payoffs if the underlying moves in either direction. The straddle is a riskier strategy, often referred to as a volatility betting strategy, which gives larger payoffs for large price movements in the underlying SEB stock; however, if the underlying stays the same, the investor loses the entire premium. For further discussion on option-based strategies, see Hull [18]. The reference portfolio will also include a barrier put option of the down-and-in type with the Atlas Copco stock as underlying, in addition to the four derivative instruments stated in Table 4.2. Its parameters are S0 = 210.5, r = 0.009, σ = 0.5, T ≈ 3.33, strike price K = 200 and the down-and-in barrier at H = 100. Since the investment horizon is one day, the simulated price of the underlying at the horizon is used to compute the profit-and-loss (PnL) distribution of the options. The derivative assets' payoff functions and PnL distributions are discussed in more detail in Section A.3 for the benefit of the reader. It will be assumed that there are no bid-ask spreads, transaction costs or fees related to buying and selling these instruments. In Table 4.3, the distributional statistics of the historical data are presented for all assets in the reference portfolio. The derivatives' statistics have been generated using the associated pricing functions and the historical price data of the underlying assets; see Haug [17] for details on the derivative pricing formulas.

Table 4.3: Summary of distributional statistics for historical return data, presented on a yearly basis.

Instrument                  Asset           Mean      STD      Skew      Kurt
Stock                       AstraZeneca     0.0278    0.2088   -0.8282   14.9226
Stock                       AtlasCopco      0.0669    0.3172   -0.9967   19.8232
Stock                       Ericsson        -0.0313   0.3159   -1.4088   23.8978
Stock                       HM              -0.0230   0.2350   -0.3724   10.8637
Stock                       ICA             0.0756    0.2401   -0.2769   11.3632
Stock                       Nordea          0.0263    0.2811   0.3542    9.6631
Stock                       Sandvik         0.0128    0.3101   -0.1889   7.9617
Stock                       SEB             -0.0145   0.3189   -0.1526   15.4647
Stock                       SwedishMatch    0.0575    0.2121   -0.2046   8.1980
Stock                       Telia           0.0077    0.2204   -0.5191   13.2497
Bond                        TBOND           0.0294    0.0180   0.4724    7.7037
Long put                    Nordea          -0.0378   0.2254   -0.4743   9.9312
Short call                  Nordea          -0.2620   0.4889   -1.1450   12.6476
Butterfly spread            Telia           -0.3565   0.0896   -7.9677   114.3461
Straddle                    SEB             0.1966    0.0703   7.8460    120.7253
Short down-and-in Barrier   AtlasCopco      -0.0088   0.3411   -2.0128   35.8576

In Table 4.3, it can be seen that the asset with the highest expected return is the straddle option, followed by ICA and AtlasCopco, and it can hence be expected to provide the largest payoff. The assets with the largest negative returns are the butterfly spread and the short call option with the Nordea stock as underlying. The largest volatility is recorded for the call option with the Nordea stock as underlying, followed by the AtlasCopco barrier option, which suggests that these assets are the riskiest. The least volatile assets are the straddle and the TBOND. The skewness column indicates that all assets are negatively skewed except for the Nordea stock, the TBOND and the straddle. Moreover, the assets with the heaviest tails are the butterfly spread and the straddle option, as indicated by their kurtosis being much larger than those of the other assets. Before solving the portfolio optimization problems (3.18) and (3.10), the assumptions and constraints applied to the optimization problems need to be specified. This section will therefore be concluded by specifying the general assumptions of the portfolio optimization problems as well as the portfolio constraints. The following assumptions will be used throughout the result section unless explicitly stated otherwise.

• The investment horizon is one day.

• The initial capital is V0 = 1. This property is assumed for convenience of analysis. It allows the holdings in each asset to be expressed as a percentage of the initial capital.

• Each portfolio minimization problem that will be considered will be subject to the minimum expected portfolio return constraint, set to θ = 1.5639 · 10^{-4} for daily returns, which corresponds to 3.94% for yearly returns.

• There will be three value constraints on the portfolio assets, restricting the total amount of capital that is allowed to be invested in combinations of assets:

– Derivative constraint: the total amount that is allowed to be invested in the derivative assets must not exceed 5/16 of the initial capital.

– Hedge constraint: the hedge in the Nordea stock is constrained to a maximum investment of 35% of the initial capital.

– TBOND and straddle constraint: the total amount of capital that is allowed to be invested in the TBOND and the straddle on the SEB stock combined is 50% of the initial capital.

The assumed value constraints will be discussed further in Section 4.2.

4.2 Mean Average Deviation Benchmark Solution

In this section, the benchmark solution is presented by applying the mean absolute deviation optimization problem (3.10) to the reference portfolio presented in the previous section. Before presenting the benchmark solution, a brief motivation of the chosen value constraints is given. Table 4.4 presents the optimal solution to the mean absolute deviation optimization problem (3.10) applied to the reference portfolio when the returns used in the optimization are derived from the historical price data.

Table 4.4: Solutions to the portfolio optimization problem (3.10) with historical returns, for stepwise added value constraints. From left to right: the first column presents the optimal holdings vector without value constraints, the second column adds the hedge constraint, the third adds the TBOND and straddle constraint, and the last column adds the derivative constraint.

Asset Name              Holdings       Holdings        Holdings       Holdings
AstraZeneca             0              0.0005          0.0048         0.0333
AtlasCopco              0              0               0              0.0081
Ericsson                0              0.0007          0              0
HM                      0              0.0001          0              0
ICA                     0.0004         0.0016          0.0051         0.0512
Nordea                  0.2602         0.1680          0.2571         0.1498
Sandvik                 0              0.0002          0              0
SEB                     0              0               0              0
SwedishMatch            0.0003         0.0004          0.0083         0.0536
Telia                   0              0.0003          0.0019         0.0089
TBOND                   0.1876         0.4577          0.1744         0.3826
LongPutNordea           0.3475         0.1820          0.0929         0.1951
ShortCallNordea         0              0.0192          0.1298         0
ButterflySpreadTelia    0              0               0              0
StraddleSEB             0.2040         0.1693          0.3256         0.1174
ShortBarrierAtlasCopco  0              0               0              0
Portfolio MAD           2.0789 · 10^{-4}   2.94703 · 10^{-4}   4.0179 · 10^{-4}   9.2656 · 10^{-4}

In Table 4.4, the first column shows that without any value constraints the optimizer invests almost 61% in the Nordea hedge, 20% in the straddle and almost 19% in the TBOND. Because of the hedge, the portfolio MAD is small, as shown in the last row of the first column. Clearly, the hedge is the most dominant strategy in the reference portfolio, which is not unexpected since it is a perfect hedge and an efficient risk minimizer. To make the analysis more interesting, a 35% value constraint is applied to the Nordea stock and the Nordea put option combined; it will be referred to as the hedge constraint. In the second column, the major positions are in the TBOND and the straddle. Moreover, it can be observed that the optimizer invests 35% of the initial capital in the hedge. To avoid making the straddle and the TBOND too dominant, a 50% value constraint is applied to these assets combined. The results of the two applied constraints are depicted in the third column. Clearly, the optimizer prefers the derivatives over the stocks, the straddle and the put in particular. In an effort to diversify the investment and allocate more holdings into the stocks, all derivatives are made subject to a value constraint of a total investment of 5/16 of the initial capital. The result is shown in the fourth column. It shows that this is achieved to some extent, with 5% more invested capital in the SwedishMatch stock and 3% more in the AstraZeneca stock. It is also observed that the derivatives reach the maximum allowed investment under the constraint. By studying the last row of each column, it can be seen that the hedge constraint does not increase the portfolio MAD much; most of the increased risk associated with the constraint is compensated for with the TBOND. The same is observed for the added TBOND and straddle constraint. However, adding the derivative constraint more than doubles the portfolio MAD.

The benchmark solutions will be computed under the assumption that the returns are elliptically distributed, such that they may be expressed in terms of a mean vector µ and a covariance matrix Σ, and additionally the degrees of freedom parameter ν for variance mixture models. The mean vector and covariance parameters are not known and can only be estimated from historical data. Let µ̂ denote the estimated mean vector and Σ̂ the estimated covariance matrix. These parameters are presented in Appendix A.2.

Using the estimated parameters and the stochastic representations (2.6) and (2.7) for elliptical distributions, return scenarios can be simulated from both multivariate normal distributions and student's-t distributions. The benchmark solutions are given by applying (3.10) to the reference portfolio. For simple comparison with the results in subsequent sections, the benchmarks are computed for simulated returns from five elliptical distributions: the normal distribution and the student's-t distribution with degrees of freedom ν = {2.1, 5, 10, 20}. Additionally, a solution is provided for simulated returns from the normal copula with hybrid marginal distributions, using the simulation algorithm presented at the end of Section 3.3.2. The normal copula used throughout the analysis is calibrated with the estimated Kendall's tau matrix T̂ presented in Section A.2. Table 4.5 presents the optimal holding vectors for the six benchmark solutions, solved for a sample of J = 10,000 return scenarios simulated from the six multivariate distributions. The sample size J has been chosen large enough to ensure convergence, which has been observed from empirical testing. The particular choice of sample size is further discussed in Section 4.3.
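Before turning to Table 4.5, the simulation step itself can be illustrated. The sketch below draws scenarios via the standard stochastic representation R = µ + √W · AZ, with A a Cholesky factor of the dispersion matrix and W a mixing variable; equations (2.6) and (2.7) are assumed to be of this standard form, so this is a generic illustration rather than the thesis code.

    import numpy as np

    rng = np.random.default_rng(seed=1)

    def simulate_elliptical(mu, sigma, J, nu=None):
        """Simulate J return scenarios: multivariate normal if nu is None,
        multivariate student's-t otherwise.

        Uses R = mu + sqrt(W) * A Z with A the Cholesky factor of sigma
        and Z standard normal. Note that sigma is the dispersion matrix;
        for the t distribution the covariance is nu / (nu - 2) * sigma.
        """
        d = len(mu)
        A = np.linalg.cholesky(sigma)
        Z = rng.standard_normal((J, d))
        if nu is None:
            W = np.ones(J)                        # normal: constant mixing variable
        else:
            W = nu / rng.chisquare(nu, size=J)    # student's-t mixing variable
        return mu + np.sqrt(W)[:, None] * (Z @ A.T)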

Table 4.5: Solution to (3.18) applied to the reference portfolio, where the risk is measured by the mean absolute deviation and the returns are drawn from multivariate student's-t distributions with degrees of freedom ν = {2.1, 5, 10, 20}, a multivariate normal distribution, and a normal copula with hybrid marginal distributions fitted in Appendix A.1.

Asset Name               Normal    ν = 2.1   ν = 5     ν = 10    ν = 20    Normal copula
AstraZeneca              0         0         0         0         0         0.0348
AtlasCopco               0         0         0         0         0         0.0088
Ericsson                 0         0         0         0         0         0
HM                       0         0         0         0         0         0
ICA                      0.0812    0.0847    0.0878    0.0854    0.0870    0.1468
Nordea                   0.1146    0.1041    0.0963    0.0990    0.0988    0.1247
Sandvik                  0.0343    0.0435    0.0492    0.0483    0.0476    0
SEB                      0         0         0         0         0         0
SwedishMatch             0.0950    0.0996    0.1030    0.1008    0.1017    0.0286
Telia                    0.0300    0.0376    0.0429    0.0428    0.0415    0
TBOND                    0.3324    0.3180    0.3083    0.3113    0.3110    0.3438
LongPutNordea            0.1449    0.1305    0.1208    0.1238    0.1235    0.1563
ShortCallNordea          0         0         0         0         0         0
ButterflySpreadTelia     0         0         0         0         0         0
StraddleSEB              0.1676    0.1820    0.1917    0.1887    0.1890    0.1562
ShortBarrierAtlasCopco   0         0         0         0         0         0

In Table 4.5, it can be seen that the allocated capital, expressed in proportions of the initial capital, is not distributed over all assets; only half of the assets are subject to investment. Recall from Section 4.1 that the assets were chosen from various market sectors to ensure diversification and, in doing so, reduce risk. Clearly, the optimal portfolio position leaves part of this diversification potential unused. Summing the holdings of the TBOND and the straddle option strategy gives exactly 50% in all columns. Most likely, if these assets were not subject to the value constraint of a maximum of 50% combined, these numbers would have been larger, as was the case in Table 4.4. The same result is observed for the restriction on the derivatives: the sum of all holdings in the derivatives equals 5/16 of the initial capital, which was the value constraint applied to the derivative assets. The only value constraint that was not reached was the 35% in the Nordea stock and the associated put option.

4.3 Optimization With Coherent Distortion Risk Measures

This section studies the four coherent distortion risk measures introduced in Section 2.1.2 in a setting of simulation-based portfolio optimization. The objective here is to solve optimization problem (3.18) for the reference portfolio introduced in Section 4.1. The reference portfolio is represented by multivariate distribution models, both elliptical and asymmetric. To ensure convergence of the optimized risk measure, the simulated samples have to be large. Although Rockafellar and Uryasev's shortcut to minimization of CVaRα is computationally efficient, the extended optimization problem can be quite computationally intense, as it requires solving multiple weighted convex combinations of CVaRα. To gain a good balance between run times and accurate measurements, the sample size was chosen to be J = 10,000. Moreover, a sample of this size implies that there is an equally large set of CVaRα weights wk, k = 1, ..., J, defined in (3.16). Since solving the entire system of J × J equations and unknowns is not practical, approximations are necessary. Multiple tests were performed to assess the accuracy of both the portfolio risk measurement and the robustness of the optimal holdings vector for different numbers of weights wk and different choices of the indices k.

It was found that increasing the number of weights from 11 to 101 did not improve the accuracy of the holdings or the portfolio risk. The same conclusion was drawn from choosing the specific indices k where the weights with the highest values were observed. Therefore, the number of weights was chosen to be 11, evenly spaced across the interval. Moreover, to put the effect of the approximation error on the measured portfolio risk into perspective, the actual risk of the optimal position is also presented alongside the risk reported by the optimizer. In Sections 4.3.1 through 4.3.4 each risk measure is examined separately for different distortion function parameters. Section 4.5 compares the risk measures with their distortion function parameters held fixed.
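Before examining the individual measures, the structure of the extended problem can be summarized in code. A CDRM is represented as a convex combination Σk wk CVaRαk, and each CVaR term admits the Rockafellar–Uryasev linearization with one auxiliary variable tk and one slack variable per scenario. The sketch below is a schematic reconstruction of problem (3.18) under the constraints used here (fully invested, long-only, minimum expected return); it is not the thesis implementation, and the value constraints are omitted for brevity.

    import numpy as np
    from scipy.optimize import linprog

    def cdrm_portfolio(returns, alphas, w, min_return):
        """Minimize sum_k w[k] * CVaR_{alphas[k]} of the portfolio loss.

        Variables: [x (d), t (K) auxiliary VaR levels, s (K*J) slacks].
        The scenario loss is L_j = -returns[j] @ x.
        """
        J, d = returns.shape
        K = len(alphas)
        mu = returns.mean(axis=0)
        n = d + K + K * J

        # Objective: sum_k w_k * ( t_k + 1 / ((1 - a_k) J) * sum_j s_kj ).
        c = np.zeros(n)
        c[d:d + K] = w
        for k, a in enumerate(alphas):
            c[d + K + k * J: d + K + (k + 1) * J] = w[k] / ((1.0 - a) * J)

        # Slack constraints: -r_j @ x - t_k - s_kj <= 0 for all k, j.
        A_ub = np.zeros((K * J + 1, n))
        b_ub = np.zeros(K * J + 1)
        for k in range(K):
            rows = slice(k * J, (k + 1) * J)
            A_ub[rows, :d] = -returns
            A_ub[rows, d + k] = -1.0
            A_ub[rows, d + K + k * J: d + K + (k + 1) * J] = -np.eye(J)
        A_ub[-1, :d] = -mu                       # minimum expected return
        b_ub[-1] = -min_return

        A_eq = np.zeros((1, n))
        A_eq[0, :d] = 1.0                        # fully invested
        b_eq = np.array([1.0])

        bounds = [(0, None)] * d + [(None, None)] * K + [(0, None)] * (K * J)
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
        return res.x[:d], res.fun

With J = 10,000 scenarios and 11 CVaR terms this LP already has more than 110,000 variables, which illustrates why the approximations discussed above are needed.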

4.3.1 Conditional Value-At-Risk

This section studies portfolio optimization with Conditional Value-at-Risk. Amongst the CDRMs studied in this project it has the simplest form, being the building block of the optimization framework. Recall that the distortion function for CVaRα is given by

gCVaR(x, α) = min{x / (1 − α), 1},  α ∈ (0, 1).

Here CVaRα will be studied for the confidence levels α = 0.95, α = 0.975 and α = 0.99. Using equations (3.15) and (3.16) in Theorem 3.4 for a simulated return sample of size J = 10,000 gives the CVaR weights wk and quantile weights qk depicted in Figure 4.2. Each frame is zoomed in on k = 9000 to k = 10,000, i.e. the weights above the 90% quantile level.


Figure 4.2: Convex CVaR weights wk and quantile weights qk for CVaR0.95, CVaR0.975 and CVaR0.99, zoomed in on the last 1000 weights, representing quantiles above the 90% level.

The upper frames of Figure 4.2 depict the weighting of the convex combination of CVaRα. Since CVaRα is the basic unit of the optimization framework, the weight vector w for CVaRα consists of only one non-zero entry, corresponding to the confidence level α. Needless to say, only one convex combination of CVaRα is solved in this particular case. The lower frames of Figure 4.2 depict the weighting of the quantile losses. Clearly, all quantile weights above the confidence level α have equal values, which are normalized such that the sum of all weights is equal to one. This means that all quantile losses above the confidence level are weighted equally, and the risk measure ignores all other quantile losses. Moreover, as the confidence level increases, the values of the non-zero weights also increase. This implies that a higher confidence level corresponds to a higher degree of risk aversion. Note that q can be interpreted as the admissible risk spectrum, sometimes referred to as the risk aversion function in the spectral risk measure representation (2.4).
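The weight patterns in Figure 4.2 can be reproduced from the distortion function alone. A common discretization, assumed here in place of the exact statements of (3.15) and (3.16), gives the k-th smallest loss the quantile weight qk = g((J − k + 1)/J) − g((J − k)/J) and recovers the CVaR weights by the backward differences wk = (qk − qk−1)(J − k + 1), which are non-negative whenever g is concave and sum to one.

    import numpy as np

    def cdrm_weights(g, J):
        """Quantile weights q and CVaR weights w for a distortion function g.

        q[k] multiplies the (k+1)-th smallest loss; w[i] is the weight of
        CVaR at level i/J in the convex combination. This discretization is
        a common convention, not necessarily identical to (3.15)-(3.16).
        """
        k = np.arange(1, J + 1)
        q = g((J - k + 1) / J) - g((J - k) / J)              # spectral weights
        w = np.diff(np.concatenate([[0.0], q])) * (J - k + 1)
        return q, w

    # CVaR distortion: a single non-zero w at the confidence level,
    # exactly the pattern seen in the upper frames of Figure 4.2.
    alpha = 0.95
    g_cvar = lambda x: np.minimum(x / (1 - alpha), 1.0)
    q, w = cdrm_weights(g_cvar, 10_000)
    print(np.flatnonzero(w > 1e-12), w.sum())    # one entry; weights sum to 1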

Table 4.6 depicts the solution to the portfolio optimization problem (3.18) with CVaRα applied to the reference portfolio and returns simulated from the multivariate normal distribution Nd(µ̂, Σ̂). The estimated parameters µ̂ and Σ̂ are presented in equations (A.2) and (A.3) in Section A.2. From left to

right, the columns present the optimal holdings vectors for minimization of CVaRα at confidence levels α = {0.95, 0.975, 0.99}, representing the capital allocation in proportions of the initial capital. The last row of each column states the CVaRα at the given confidence level α.

Table 4.6: Solution to (3.18) applied to the reference portfolio with returns drawn from the multivariate normal distribution. The columns of portfolio weights are optimized for CVaR0.95, CVaR0.975 and CVaR0.99, respectively.

Asset Name               CVaR0.95   CVaR0.975   CVaR0.99
AstraZeneca              0          0           0
AtlasCopco               0          0           0
Ericsson                 0          0           0
HM                       0          0           0
ICA                      0.0857     0.0856      0.0831
Nordea                   0.1055     0.1038      0.1060
Sandvik                  0.0467     0.0437      0.0447
SEB                      0          0           0
SwedishMatch             0.0978     0.1004      0.1048
Telia                    0.0427     0.0441      0.0377
TBOND                    0.3091     0.3099      0.3111
LongPutNordea            0.1216     0.1224      0.1236
ShortCallNordea          0          0           0
ButterflySpreadTelia     0          0           0
StraddleSEB              0.1909     0.1901      0.1889
ShortBarrierAtlasCopco   0          0           0
Portfolio CVaRα          0.0044     0.0050      0.0056

Studying Table 4.6 shows that varying the confidence level does not change the weights much, only within the third decimal. The largest change in capital allocation for one asset, between the confidence levels 0.95 and 0.99, is only an increase of around 0.7% for SwedishMatch. Since it is only allowed to invest 50% in the Straddle and the TBOND combined, as much as possible is invested in these assets. A similar observation is made for the derivative constraint, which allows 5/16 of the capital in the five derivatives combined. The Nordea hedge does not reach the constraint of 35% of the initial capital for any of the confidence levels α. Table 4.7 depicts the solution to the portfolio optimization problem (3.18) with CVaR0.99, where the return samples are drawn from a multivariate student's-t distribution. From left to right, each column represents the solution in terms of portfolio weights where the degrees of freedom are ν = {2.1, 5, 10, 20}.

Table 4.7: Solution to (3.18) applied to the reference portfolio with CVaR0.99 and returns drawn from a student's-t distribution with degrees of freedom ν = {2.1, 5, 10, 20}.

Asset Name               ν = 2.1   ν = 5     ν = 10    ν = 20
AstraZeneca              0         0         0         0
AtlasCopco               0         0         0         0
Ericsson                 0         0         0         0
HM                       0         0         0         0
ICA                      0.0858    0.0731    0.0824    0.0844
Nordea                   0.1221    0.1010    0.1057    0.1098
Sandvik                  0.0304    0.0419    0.0565    0.0453
SEB                      0         0         0         0
SwedishMatch             0.0849    0.1086    0.0955    0.0992
Telia                    0.0330    0.0498    0.0449    0.0370
TBOND                    0.3313    0.3131    0.3024    0.3118
LongPutNordea            0.1438    0.1256    0.1149    0.1243
ShortCallNordea          0         0         0         0
ButterflySpreadTelia     0         0         0         0
StraddleSEB              0.1687    0.1869    0.1976    0.1882
ShortBarrierAtlasCopco   0         0         0         0
Portfolio CVaRα          0.0066    0.0063    0.0063    0.0059

Similarly to Table 4.6, the maximum amount of capital is invested in the derivatives and the TBOND when varying the degrees of freedom. Furthermore, by increasing the extreme risk scenarios with a student's-t distribution, less capital is invested in the Straddle for the benefit of the safer hedge strategy provided by the Nordea stock and the put option, which shifts the investment in the derivatives. The put option increases by 2% and the straddle decreases by 2% when decreasing the degrees of freedom from ν = 20 to ν = 2.1. Furthermore, it can be seen that with increasing degrees of freedom, the solutions converge toward the solution for the normal distribution depicted in the third column of Table 4.6.

Table 4.8 states the optimal solution to the portfolio optimization problem (3.18) with CVaRα applied to the reference portfolio and returns simulated from a normal copula with hybrid marginal distribution functions with Generalized Pareto distributed tails. The copula is calibrated with the Kendall's tau rank correlation matrix (A.4) and the Generalized Pareto distributions have the estimated parameters given by Table A.4 in Appendix A.1.

Table 4.8: Solution to (3.18) applied to the reference portfolio with returns drawn from a normal copula with hybrid marginal distributions. The columns of portfolio weights are optimized for CVaR0.95, CVaR0.975 and CVaR0.99, respectively.

Asset Name               CVaR0.95   CVaR0.975   CVaR0.99
AstraZeneca              0.0527     0.0620      0.0614
AtlasCopco               0.0179     0.0232      0.0341
Ericsson                 0          0           0
HM                       0          0           0
ICA                      0.1147     0.0983      0.0887
Nordea                   0.1318     0.1300      0.1233
Sandvik                  0          0           0
SEB                      0          0           0
SwedishMatch             0.0355     0.0414      0.0474
Telia                    0          0.0015      0
TBOND                    0.3349     0.3311      0.3327
LongPutNordea            0.1474     0.1436      0.1452
ShortCallNordea          0          0           0
ButterflySpreadTelia     0          0           0
StraddleSEB              0.1651     0.1689      0.1673
ShortBarrierAtlasCopco   0          0           0
Portfolio CVaRα          0.0043     0.0053      0.0065

The first observation when studying Table 4.8 is that the optimal positions are slightly more diversified compared to the solutions in Table 4.6. With the more heavy-tailed asymmetric distribution model, the risk is increased and, similarly to the student's-t distribution with ν = 2.1 in Table 4.7, more is invested in the hedge and less in the Straddle. More capital is invested in the ICA asset as well. However, Sandvik and SwedishMatch decrease by 4% and 6% respectively, the former being reduced to zero. Additionally, both AstraZeneca and AtlasCopco are chosen as optimal portfolio holdings for the asymmetric market model. In contrast, this was not the case for the elliptical market model.

4.3.2 Wang Transform

This section studies the implementation of portfolio optimization with the Wang transform. The distortion function of the Wang transform is given by

gWT(x, β) = Φ(Φ⁻¹(x) − Φ⁻¹(β)),  β ∈ (0, 1),

where Φ is the standard normal cumulative distribution function and Φ⁻¹ its inverse. The distortion function is concave for β > 0.5, and the associated risk measure is hence only coherent for β > 0.5. This section considers the following three cases: β = 0.85, β = 0.95 and β = 0.99. Each one has been chosen to generate quantile weights comparable to those at the confidence levels of CVaR0.95, CVaR0.975 and CVaR0.99. The weights q and w for the Wang transform are depicted in Figure 4.3.


Figure 4.3: CVaR weights wk and quantile weights qk for WT0.85, WT0.95 and WT0.99.

The three lower frames of Figure 4.3 depict the weighting of each quantile. In contrast to CVaRα, increasingly larger weights are given to more extreme losses. This should, in theory, be more representative of an investor's subjective perception of risk, being more averse to the risks of the more extreme losses. Furthermore, when β increases, more weight is put on the worst-case losses, and in particular the single worst-case loss. By considering the upper frames of Figure 4.3, it can be seen that some of the weights wk have non-zero entries for small k in addition to large k. This means that the minimization of WTβ not only focuses on minimizing CVaRα for large confidence levels but also for small confidence levels. Moreover, it can be observed that a large portion of the weights wk is not equal to zero; in fact, for the Wang transform, there are no zero weights wk. Solving one CVaRα for each weight is not possible in practice. Since the simulation of return scenarios requires around J = 10,000 scenarios for convergence of CVaRα, solving for all weights gives an optimization problem of size J × J. Hence, some approximations have to be made. Clearly, the weights wk at the upper and lower ends of the interval have larger values, so it seems like a good idea to neglect only the centre of the interval. However, it turns out that this approach gives large deviations between the minimized portfolio risk and the actual portfolio risk measured afterwards. Careful analysis showed that choosing evenly spread weights is a better approach. Solving for 101 weights, including the interval endpoints, was attempted, but it only improved the accuracy in the third decimal place compared to 11 weights. Therefore, the latter was deemed sufficient.
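The reduction to 11 weights described above can be expressed compactly: keep evenly spaced CVaR levels and renormalize the retained weights so that they still form a convex combination. The helper below, building on the cdrm_weights sketch in Section 4.3.1, is one plausible realization of this approximation; the exact renormalization rule used in the thesis is not restated here.

    import numpy as np

    def approximate_weights(w, n_terms=11):
        """Keep n_terms evenly spaced CVaR weights and renormalize.

        w is the full length-J weight vector of the convex combination;
        the returned (indices, weights) define the reduced combination
        solved in place of the intractable J-term problem.
        """
        J = len(w)
        idx = np.linspace(0, J - 1, n_terms).astype(int)   # evenly spread levels
        w_red = w[idx]
        w_red = w_red / w_red.sum()                        # keep a convex combination
        return idx, w_red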

Table 4.9 states the solutions to problem (3.18) when applied to the reference portfolio and the returns are drawn from a multivariate normal distribution. From left to right, each column presents the solution for minimizing the Wang transform with β = {0.85, 0.95, 0.99}.

Table 4.9: Solution to (3.18) applied to the reference portfolio with returns drawn from a normal distribution. The columns of portfolio weights are optimized for WT0.85, WT0.95 and WT0.99, respectively.

Asset names              WT0.85    WT0.95    WT0.99
AstraZeneca              0         0         0
AtlasCopco               0         0         0
Ericsson                 0         0         0
HM                       0         0         0
ICA                      0.0735    0.0741    0.0751
Nordea                   0.1125    0.1141    0.1171
Sandvik                  0.0512    0.0496    0.0467
SEB                      0         0         0
SwedishMatch             0.1028    0.1023    0.1013
Telia                    0.0432    0.0427    0.0419
TBOND                    0.3043    0.3047    0.3054
LongPutNordea            0.1168    0.1172    0.1179
ShortCallNordea          0         0         0
ButterflySpreadTelia     0         0         0
StraddleSEB              0.1957    0.1953    0.1946
ShortBarrierAtlasCopco   0         0         0
Portfolio WTβ Optimized  0.0064    0.0067    0.0070
Portfolio WTβ Realized   0.0035    0.0048    0.0061

From Table 4.9, it can be seen that the optimal solutions for all three choices of β are very similar to one another. At most, there is a decrease of half a per cent in the Sandvik stock from WT0.85 to WT0.99. Compared to the benchmark solution, one per cent has moved from Nordea to ICA; otherwise, the solutions are almost identical. In the last two rows, it can be seen that the WTβ estimated in the optimization always overestimates the risk, but for increasing β the two appear to converge. Table 4.10 states the solutions to problem (3.18) applied to the reference portfolio with simulated returns drawn from a multivariate student's-t distribution. From left to right, each column presents the solution for minimizing WT0.99 in terms of portfolio weights, where the degrees of freedom are ν = {2.1, 5, 10, 20}.

Table 4.10: Solution to (3.18) applied to the reference portfolio with WT0.99 and returns drawn from a student’s-t distribution with degrees of freedom ν = {2.1, 5, 10, 20}.

Asset names                ν = 2.1   ν = 5     ν = 10    ν = 20
AstraZeneca                0         0         0         0
AtlasCopco                 0         0         0         0
Ericsson                   0         0         0         0
HM                         0         0         0         0
ICA                        0.0920    0.1006    0.0609    0.0743
Nordea                     0.1493    0.1206    0.1083    0.1216
Sandvik                    0         0.0390    0.0738    0.0231
SEB                        0         0         0         0
SwedishMatch               0.0631    0.0934    0.1050    0.1031
Telia                      0.0564    0.0031    0.0465    0.0578
TBOND                      0.3267    0.3308    0.2930    0.3076
LongPutNordea              0.1392    0.1433    0.1055    0.1201
ShortCallNordea            0         0         0         0
ButterflySpreadTelia       0         0         0         0
StraddleSEB                0.1733    0.1692    0.2070    0.1924
ShortBarrierAtlasCopco     0         0         0         0
Portfolio WT0.99 Optimized 0.0094    0.0084    0.0080    0.0073
Portfolio WT0.99 Realized  0.0074    0.0072    0.0069    0.0064

For varying degrees of freedom ν, the optimal portfolio position in Table 4.10 appears less stable than in Table 4.7 in terms of the holdings vector. In particular, the Sandvik stock is reduced to zero for ν = 2.1 but has increased to 7% for ν = 10. The rest of the assets have a holding variation of around 3% for different ν. However, the portfolio WT0.99 appears to converge to the solution in the third column of Table 4.9. This is expected since the distributional characteristics of the student's-t distribution converge to those of the normal distribution for increasing degrees of freedom. Table 4.11 states the optimal solution to portfolio optimization problem (3.18) with WT0.85, WT0.95 and WT0.99 applied to the reference portfolio, where the simulated returns are drawn from a normal copula with hybrid marginal distribution functions with Generalized Pareto distributed tails.

Table 4.11: Solution to (3.18) applied to the reference portfolio with returns drawn from a normal copula with hybrid marginal distributions. The columns of portfolio weights are optimized for WT0.85, WT0.95 and WT0.99, respectively.

Asset names              WT0.85    WT0.95    WT0.99
AstraZeneca              0.0951    0.0951    0.0952
AtlasCopco               0.0355    0.0356    0.0368
Ericsson                 0         0         0
HM                       0         0         0
ICA                      0.0941    0.0941    0.0939
Nordea                   0.0961    0.0963    0.0990
Sandvik                  0         0         0
SEB                      0         0         0
SwedishMatch             0.0343    0.0340    0.0308
Telia                    0         0         0
TBOND                    0.3324    0.3323    0.3318
LongPutNordea            0.1449    0.1448    0.1443
ShortCallNordea          0         0         0
ButterflySpreadTelia     0         0         0
StraddleSEB              0.1676    0.1677    0.1682
ShortBarrierAtlasCopco   0         0         0
Portfolio WTβ Optimized  0.0087    0.0097    0.0106
Portfolio WTβ Realized   0.0038    0.0061    0.0087

The holdings vectors of Table 4.11 appear to be very stable for the different values of β, even more stable than for CVaRα in Table 4.8, only changing in the third decimal. In the last two rows it can be seen that the minimized portfolio WTβ is overestimated by approximately 0.002–0.005 for β = {0.85, 0.95, 0.99}. This is due to approximation errors when minimizing WTβ, which is represented by 11 convex combinations of CVaRα.

4.3.3 Block Maxima

In this section, optimization with the Block Maxima measure is considered. Its distortion function, sometimes referred to as the proportional hazard transform, is given by

gBM(x, β) = x^β,  β ∈ (0, 1).

The distortion function is concave for all β ∈ (0, 1) in which case the associated risk measure is coherent. Below, BM0.5, BM0.1 and BM0.05 are studied. The weights q and w for these cases are depicted in Figure 4.4.


Figure 4.4: CVaR weights wk and quantile weights qk for BM0.5, BM0.1 and BM0.05.

It can be seen in the upper frames of Figure 4.4 that the risk measure associated with the block maxima distortion function is highly risk-averse. It filters out all but the extreme scenarios. There are only a few

non-zero entries, which are located above the CVaRα confidence level α = 0.99. Additionally, w1 = β, which is the weight for CVaR0. Hence, as β goes towards 1, the value of the weight w1 also goes to 1. The distortion function associated with CVaR0 is gCVaR(x, 0) = x. Recall from Section 2.1.2 that the risk measure associated with this distortion function is simply the expected value, giving CVaR0(L) = E[L]. Minimizing the expected loss is equivalent to maximizing the expected return. Solving this problem becomes trivial since problem (3.18) has no variance constraint, only a minimum return constraint: the optimizer chooses the asset with the largest mean. Therefore, smaller values of β were chosen for this section. Table 4.12 states the solutions to problem (3.18) when applied to the reference portfolio and the returns are drawn from a multivariate normal distribution. From left to right, each column states the solution for minimization of BM0.5, BM0.1 and BM0.05.

Table 4.12: Solution to (3.18) applied to the reference portfolio with returns drawn from a normal distribution. The columns of portfolio weights are optimized for BM0.5, BM0.1 and BM0.05, respectively.

Asset names              BM0.5     BM0.1     BM0.05
AstraZeneca              0         0         0
AtlasCopco               0         0         0
Ericsson                 0         0         0
HM                       0         0         0
ICA                      0.0762    0.0733    0.0684
Nordea                   0.1197    0.1119    0.0977
Sandvik                  0.0447    0.0518    0.0659
SEB                      0         0         0
SwedishMatch             0.1003    0.1030    0.1076
Telia                    0.0406    0.0433    0.0472
TBOND                    0.3061    0.3042    0.3008
LongPutNordea            0.1186    0.1167    0.1133
ShortCallNordea          0         0         0
ButterflySpreadTelia     0         0         0
StraddleSEB              0.1939    0.1958    0.1992
ShortBarrierAtlasCopco   0         0         0
Portfolio Risk Optimized 0.0069    0.0072    0.0075
Portfolio Risk Measured  0.0014    0.0032    0.0072

From left to right, the columns of Table 4.12 show that most asset holdings do not change much. The Sandvik stock has the largest increase, of approximately one per cent per column. In the last column, the asset holding of Sandvik is 2% larger than in the benchmark in Table 4.5. Moreover, ICA and the TBOND have decreased by one per cent compared to the benchmark solution, while the Straddle has increased by one per cent. Table 4.13 states the optimal solution to problem (3.18) with BM0.05, where the return samples are drawn from a multivariate student's-t distribution. From left to right, each column states the optimal holdings vector, where the degrees of freedom are ν = {2.1, 5, 10, 20}.

Table 4.13: Solution to (3.18) applied to the reference portfolio with BM0.05 and returns drawn from a student's-t distribution with degrees of freedom ν = {2.1, 5, 10, 20}.

Asset names              ν = 2.1   ν = 5     ν = 10    ν = 20
AstraZeneca              0         0         0         0.0023
AtlasCopco               0         0         0         0.0078
Ericsson                 0         0         0         0
HM                       0         0.0011    0         0
ICA                      0.0862    0.1033    0.0560    0.0755
Nordea                   0.1446    0.1165    0.1061    0.1304
Sandvik                  0         0.0322    0.0805    0.0185
SEB                      0         0         0         0
SwedishMatch             0.0630    0.0995    0.1066    0.0943
Telia                    0.0743    0         0.0480    0.0480
TBOND                    0.3193    0.3349    0.2902    0.3108
LongPutNordea            0.1318    0.1474    0.1027    0.1233
ShortCallNordea          0         0         0         0
ButterflySpreadTelia     0         0         0         0
StraddleSEB              0.1807    0.1651    0.2098    0.1892
ShortBarrierAtlasCopco   0         0         0         0
Portfolio Risk Optimized 0.0101    0.0089    0.0084    0.0075
Portfolio Risk Measured  0.0083    0.0074    0.0070    0.0064

Table 4.13 illustrates the same instability as for the optimal solution with the Wang transform under

the assumption of student's-t distributed returns. Most surprising is the instability of the TBOND, which has so far been the most stable asset; it varies by 4%. The cause of the variation in the holdings vector is the tail shape of the student's-t distribution in combination with BM0.05 being highly sensitive to the extreme risks located far out in the loss tail. Most of the assets vary by approximately the same amount, but Sandvik varies the most, by 7%. However, the portfolio BM0.05, which is large for ν = 2.1, converges to the BM0.05 of the normally distributed returns as the degrees of freedom increase. In Table 4.14, the solution to portfolio optimization problem (3.18) with BMβ applied to the reference portfolio is found. The returns are drawn from a normal copula with hybrid marginal distribution functions with Generalized Pareto distributed tails. The copula is calibrated with the Kendall's tau rank correlation matrix (A.4) and the Generalized Pareto distributions have the estimated parameters given by Table A.4 in Appendix A.1.

Table 4.14: Solution to (3.18) applied to the reference portfolio with returns drawn from a normal copula with hybrid marginal distributions. The columns of portfolio weights are optimized for BM0.5, BM0.1 and BM0.05, respectively.

Asset names              BM0.5     BM0.1     BM0.05
AstraZeneca              0.0797    0.0954    0.0954
AtlasCopco               0.0359    0.0378    0.0379
Ericsson                 0         0         0
HM                       0         0         0
ICA                      0.0837    0.0937    0.0937
Nordea                   0.1222    0.1012    0.1014
Sandvik                  0         0         0
SEB                      0         0         0
SwedishMatch             0.0374    0.0281    0.0279
Telia                    0         0         0
TBOND                    0.3285    0.3313    0.3313
LongPutNordea            0.1410    0.1438    0.1438
ShortCallNordea          0         0         0
ButterflySpreadTelia     0         0         0
StraddleSEB              0.1715    0.1687    0.1687
ShortBarrierAtlasCopco   0         0         0
Portfolio Risk Optimized 0.0104    0.0116    0.0118
Portfolio Risk Measured  0.0015    0.0075    0.0094

In Table 4.14, it can be seen that, as the parameter β decreases, the weights do not change much. The largest changes are the decrease of two per cent in the Nordea stock and the decrease of one per cent in SwedishMatch. By comparing the solution to Tables 4.8 and 4.11, it can be seen that the solution shares more similarity with the Wang transform than with Conditional Value-at-Risk. Moreover, the last two rows show that the optimization overestimated the portfolio BMβ, especially for β = 0.5.

4.3.4 Dual Block Maxima

This section presents the solution to the minimization of the risk measure associated with the dual block maxima distortion function, given by

gDBM(x, β) = 1 − (1 − x)^β.

The dual block maxima distortion function is obtained by applying the dual transform g̃(x) = 1 − g(1 − x) to the block maxima distortion function, and it is concave for β > 1. Here DBMβ will be studied for β = 3, β = 10 and β = 20, and the corresponding weights w and q are presented in Figure 4.5.
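All four distortion functions studied in Sections 4.3.1 through 4.3.4 can now be collected in one place; any of them can be passed to the cdrm_weights sketch from Section 4.3.1 to reproduce the qualitative weight patterns in Figures 4.2 through 4.5. The sign convention for the Wang transform follows the formula stated in Section 4.3.2, and the code below is a direct transcription of the formulas above rather than thesis code.

    import numpy as np
    from scipy.stats import norm

    def g_cvar(x, alpha):
        """Conditional Value-at-Risk distortion."""
        return np.minimum(x / (1.0 - alpha), 1.0)

    def g_wt(x, beta):
        """Wang transform distortion, coherent for beta > 0.5."""
        return norm.cdf(norm.ppf(x) - norm.ppf(beta))

    def g_bm(x, beta):
        """Block maxima (proportional hazard) distortion, beta in (0, 1)."""
        return x ** beta

    def dual(g, beta):
        """Dual transform: g~(x) = 1 - g(1 - x)."""
        return lambda x: 1.0 - g(1.0 - x, beta)

    # Dual block maxima: DBM_3(x) = 1 - (1 - x)**3, concave for beta > 1.
    g_dbm3 = dual(g_bm, 3.0)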


Figure 4.5: CVaR weights wk and quantile weights qk for DBM3, DBM10 and DBM20.

The first observation is that, in the top-left frame of Figure 4.5, the convex combination of CVaRs is weighted more heavily in the centre rather than at the extremes. With increasing β, more weight is distributed to more extreme CVaRs at higher confidence levels. In the lower frames, it can be seen that most weight is distributed to the large loss quantiles. For DBM3 there is an almost linear increase in the quantile weights, while DBM10 and DBM20 share more similarities with the block maxima and the Wang transform. Table 4.15 presents the solution to the optimization problem (3.18) applied to the reference portfolio, where the returns are drawn from a multivariate normal distribution. From left to right, each column presents the solution in terms of the portfolio holdings vector for the dual block maxima distortion function with β = {3, 10, 20}. For each solution, a convex combination of 11 CVaRs with evenly spread weights has been computed as an approximation. For DBM3 the entire interval has been covered. For DBM10, k < 2000 has been neglected since the weights wk are approximately zero, and for DBM20, k < 4000 has been neglected. To estimate the accuracy of the minimized risk under this approximation, the risk of the optimal position is measured afterwards. The optimized and the measured portfolio DBMβ are presented in the two bottom rows.

Table 4.15: Solution to (3.18) applied to the reference portfolio with returns drawn from a multivariate normal distribution. The columns of portfolio weights are optimized for DBM3, DBM10 and DBM20, respectively.

Asset names              DBM3      DBM10     DBM20
AstraZeneca              0         0         0
AtlasCopco               0         0         0
Ericsson                 0         0         0
HM                       0         0         0
ICA                      0.0889    0.0866    0.0856
Nordea                   0.0990    0.1012    0.1026
Sandvik                  0.0451    0.0471    0.0468
SEB                      0         0         0
SwedishMatch             0.1001    0.0997    0.1000
Telia                    0.0430    0.0429    0.0427
TBOND                    0.3114    0.3100    0.3097
LongPutNordea            0.1239    0.1225    0.1222
ShortCallNordea          0         0         0
ButterflySpreadTelia     0         0         0
StraddleSEB              0.1886    0.1900    0.1903
ShortBarrierAtlasCopco   0         0         0
Portfolio Risk Optimized 0.0017    0.0031    0.0039
Portfolio Risk Measured  0.0017    0.0032    0.0040

In Table 4.15, it can be seen in all three columns that the capital is mostly allocated to the TBOND, and that the sum of the allocations to the TBOND and the straddle equals the 50% which is the maximum allowed amount to invest in them combined. Moreover, the maximum

allowed amount to invest in the derivatives is reached for all three β's, but the maximum of the hedge is not met. Increasing β puts more weight on the more extreme losses, and hence the optimizer should be more risk-averse. This is also reflected in the table: slightly more capital is invested in the hedge. However, for the most part, the capital allocation does not shift much between the columns, which indicates that the solution is robust. Table 4.16 presents the solution to the optimization problem (3.18) with DBM3 applied to the reference portfolio, where the returns are drawn from a multivariate student's-t distribution. From left to right, each column depicts the optimal holdings vector when the underlying multivariate student's-t distribution has degrees of freedom ν = {2.1, 5, 10, 20}.

Table 4.16: Solution to (3.18) applied to the reference portfolio with DBM3 and returns drawn from a student’s-t distribution with degrees of freedom ν = {2.1, 5, 10, 20}.

Asset names              ν = 2.1   ν = 5     ν = 10    ν = 20
AstraZeneca              0         0         0         0
AtlasCopco               0         0         0         0
Ericsson                 0         0         0         0
HM                       0         0         0         0
ICA                      0.0830    0.0860    0.0904    0.0872
Nordea                   0.1126    0.1028    0.0969    0.0992
Sandvik                  0.0312    0.0420    0.0481    0.0468
SEB                      0         0         0         0
SwedishMatch             0.0929    0.0975    0.1004    0.0989
Telia                    0.0360    0.0418    0.0428    0.0438
TBOND                    0.3318    0.3174    0.3088    0.3117
LongPutNordea            0.1443    0.1299    0.1213    0.1242
ShortCallNordea          0         0         0         0
ButterflySpreadTelia     0         0         0         0
StraddleSEB              0.1682    0.1826    0.1912    0.1883
ShortBarrierAtlasCopco   0         0         0         0
Portfolio Risk Optimized 0.0087    0.0079    0.0078    0.0072
Portfolio Risk Measured  0.0087    0.0079    0.0078    0.0072

From Table 4.16 it can be deduced that the solution converges toward that for the multivariate normal distribution depicted in Table 4.15 for increasing degrees of freedom ν. The largest capital allocation movements are found for the Straddle and the hedge. For lower degrees of freedom, more capital is invested in the Nordea hedge; the Straddle decreases by 2% and the TBOND increases by the same amount. Table 4.17 presents the optimal investment position in the reference portfolio where the return series is drawn from a normal copula with hybrid marginal distributions.

Table 4.17: Solution to (3.18) applied to the reference portfolio with returns drawn from a normal copula with hybrid marginal distributions. The columns of portfolio weights are optimized for DBM3, DBM10 and DBM20, respectively.

Asset name               DBM3      DBM10     DBM20
AstraZeneca              0.0396    0.0430    0.0512
AtlasCopco               0.0108    0.0138    0.0166
Ericsson                 0         0         0
HM                       0         0         0
ICA                      0.1414    0.1305    0.1189
Nordea                   0.1246    0.1279    0.1289
Sandvik                  0         0         0
SEB                      0         0         0
SwedishMatch             0.0289    0.0327    0.0355
Telia                    0         0         0
TBOND                    0.3423    0.3396    0.3363
LongPutNordea            0.1548    0.1521    0.1488
ShortCallNordea          0         0         0
ButterflySpreadTelia     0         0         0
StraddleSEB              0.1577    0.1604    0.1637
ShortBarrierAtlasCopco   0         0         0
Portfolio Risk Optimized 0.0014    0.0028    0.0037
Portfolio Risk Measured  0.0014    0.0030    0.0039

The immediate observation in Table 4.17 is that it differs from the solutions with normally distributed and student's-t distributed returns. Nothing is invested in the Telia stock, which is allocated around 4% of the capital in Table 4.15; this is instead allocated to the AstraZeneca and AtlasCopco stocks. The ICA stock has the largest decrease of all the assets, with 2% between the holdings of the DBM3 and DBM20 columns, while AstraZeneca increases by approximately the same amount.

4.4 Robustness of solution for Mean Variation

In simulation-based portfolio optimization there are two main sources of uncertainty: statistical uncertainty and parameter uncertainty. The statistical uncertainty originates from the scenario simulation, while the parameter uncertainty originates from the uncertainty in the parameter estimates. In portfolio analysis, these sources of uncertainty may cause inaccurate estimates of expected return and risk. Under the assumption of elliptical distributions, the statistical uncertainty can be avoided since the risk may be expressed in terms of the covariance matrix Σ with a scaling factor, see the remark in Section 3.3.1, and the expected return is represented by the mean; therefore, no simulation is needed. This thesis, however, focuses purely on simulation-based portfolio optimization, in which case the statistical uncertainty is always present. The statistical uncertainty is accounted for by choosing a sufficiently large simulated sample size, which ensures that the simulated portfolio distribution model can provide accurate measurements of risk. The parameter uncertainty is a direct consequence of the historical data being limited; the quality of the parameter estimates can therefore not be entirely certain. It is thus of interest to study the robustness of the optimal solution under the influence of statistical and parameter uncertainty. When attempting to minimize the risk of the reference portfolio throughout Section 4.3, it was found that statistical uncertainty may cause the optimal solution to differ between runs when the sample size is small. Experiments were performed for samples of size 5,000, 10,000, 50,000 and 100,000. Through careful analysis of the results, it was found that convergence in the optimal holding vectors is achieved using samples of size 10,000, showing no significant difference from using 50,000 or 100,000. Having accounted for the statistical uncertainty, it is also important to consider the parameter uncertainty that derives from the mean and variance parameters. The latter can be accounted for since volatility can, to some extent, be forecasted from historical data using analysis of financial time series, as discussed in Section 2.3. However, it is a well-known fact that expected returns are particularly hard to estimate from historical data due to the observed return volatility. It is therefore of particular interest to study to what extent a wrongly estimated mean can influence the holdings of the optimal portfolio; in other words, how robust or sensitive the optimal solution is, and how much the elements of the optimal holdings vector vary under variations of the mean vector. Let r = (r1, ..., rJ)ᵀ be a return sample drawn from a d-dimensional multivariate normal distribution Nd(µ, Σ), where d denotes the number of assets in the portfolio. Since the estimated mean vector µ̂ = (1/J) ∑_{j=1}^{J} rj is a linear combination of normally distributed samples, µ̂ ∼ Nd(µ, J⁻¹Σ). To assess the significance of the parameter uncertainty, a 95% confidence interval is considered for each asset. While keeping all other parameters constant, the elements of the mean vector µ̂ are let to assume the upper and lower confidence bounds. These are presented in Table 4.18 for yearly mean returns.
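Under the normal approximation µ̂ ∼ Nd(µ, J⁻¹Σ), the bounds in Table 4.18 have a simple closed form. The sketch below computes them from a return sample; the annualization by 252 trading days is an illustrative assumption, as the thesis reports yearly figures without restating the conversion here.

    import numpy as np
    from scipy.stats import norm

    def mean_confidence_bounds(returns, level=0.95, periods_per_year=252):
        """Confidence bounds for the estimated mean of each asset.

        Uses mu_hat ~ N_d(mu, Sigma / J); annualization is done by simple
        scaling (an illustrative convention, not necessarily the thesis').
        """
        J, _ = returns.shape
        mu_hat = returns.mean(axis=0)
        se = returns.std(axis=0, ddof=1) / np.sqrt(J)    # std. error of the mean
        z = norm.ppf(0.5 + level / 2.0)                  # 1.96 for level = 0.95
        lower = (mu_hat - z * se) * periods_per_year
        upper = (mu_hat + z * se) * periods_per_year
        return lower, mu_hat * periods_per_year, upper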

Table 4.18: Estimated yearly means for all assets in the reference portfolio with 95% confidence bounds. The left column depicts the lower confidence bound for the parameter estimate µ̂, the middle column presents the parameter estimate µ̂, and the right column depicts the upper confidence bound.

Asset names              Lower CI   µ̂         Upper CI
AstraZeneca              0.0206     0.0278     0.0349
AtlasCopco               0.0560     0.0669     0.0778
Ericsson                 -0.0422    -0.0313    -0.0205
HM                       -0.0310    -0.0230    -0.0149
ICA                      0.0674     0.0756     0.0839
Nordea                   0.0167     0.0263     0.0360
Sandvik                  0.0021     0.0128     0.0234
SEB                      -0.0254    -0.0145    -0.0036
SwedishMatch             0.0502     0.0575     0.0648
Telia                    0.0002     0.0077     0.0153
TBOND                    0.0288     0.0294     0.0300
LongPutNordea            -0.0455    -0.0378    -0.0300
ShortCallNordea          -0.2788    -0.2620    -0.2453
ButterflySpreadTelia     -0.3595    -0.3565    -0.3534
StraddleSEB              0.1942     0.1966     0.1990
ShortBarrierAtlasCopco   -0.0205    -0.0088    0.0029

In Table 4.18, it can be seen that the widths of the 95% confidence intervals differ between assets but are relatively small for all of them. The linear assets with the widest confidence intervals are the AtlasCopco, Ericsson, Sandvik and SEB stocks. The widest confidence intervals overall are found among the derivatives, in particular for the Nordea call option and the AtlasCopco barrier option, with widths of approximately 3.3% and 2.3% respectively. The solution to the portfolio optimization problem (3.18) is given in Table 4.19, where the reference portfolio consists of J = 10,000 simulated multivariate normally distributed returns and the risk in terms of DBM3 is minimized, solving for 11 evenly distributed convex combinations of CVaRα.

Table 4.19: Optimal solution to problem (3.18) with µ̂ and the upper and lower 95% confidence bounds for µ̂, where DBM3 is minimized and the reference portfolio returns are simulated from a multivariate normal distribution.

Asset names              Lower CI   µ̂         Upper CI
AstraZeneca              0          0          0
AtlasCopco               0          0          0
Ericsson                 0          0          0
HM                       0          0          0
ICA                      0.1039     0.0889     0.0732
Nordea                   0.0738     0.0990     0.1235
Sandvik                  0.0559     0.0451     0.0352
SEB                      0          0          0
SwedishMatch             0.1190     0.1001     0.0812
Telia                    0.0528     0.0430     0.0338
TBOND                    0.2822     0.3114     0.3407
LongPutNordea            0.0947     0.1239     0.1532
ShortCallNordea          0          0          0
ButterflySpreadTelia     0          0          0
StraddleSEB              0.2178     0.1886     0.1593
ShortBarrierAtlasCopco   0          0          0

The left column of Table 4.19 presents the solution when the elements of the mean vector assume their lower confidence bounds. The middle column is presented as a reference, where the estimated mean µ̂ has been used, and the right column assumes the upper confidence bounds. It can be seen that none of the assets with the widest confidence intervals is subject to investment. Among the assets that are invested in, it can be observed from Table 4.19 that the largest variations among the stocks are found for Nordea followed by SwedishMatch, which deviate by 2.5% and 1.9% respectively. The largest variations overall are found for the TBOND, the Nordea put option and the straddle, which all deviate by approximately 2.9%. All three columns of Table 4.19 have a similar structure and, since the optimal holdings vary so little, the solution is considered to be robust with respect to the uncertainty of the mean; a small estimation error in the parameter µ does not have a large impact on the solution. Table 4.20 states the optimal solutions for the minimization of CVaR0.99 where the returns are drawn from a normal copula with hybrid marginal distributions. The left column states the solution when the lower confidence bounds for the estimated mean are used, the middle column when the estimated mean is used, and the right column when the upper confidence bounds are used.

Table 4.20: Optimal solution to problem (3.18) with µ̂ and the upper and lower 95% confidence bounds for µ̂, where CVaR0.99 is minimized and the reference portfolio returns are simulated from a normal copula with hybrid marginal distributions.

Asset names              Lower CI   µ̂         Upper CI
AstraZeneca              0.0744     0.0614     0.0506
AtlasCopco               0.0359     0.0341     0.0288
Ericsson                 0          0          0
HM                       0          0          0
ICA                      0.1063     0.0887     0.0720
Nordea                   0.1065     0.1233     0.1408
Sandvik                  0          0          0
SEB                      0          0          0
SwedishMatch             0.0567     0.0474     0.0389
Telia                    0          0          0
TBOND                    0.3077     0.3327     0.3565
LongPutNordea            0.1202     0.1452     0.1690
ShortCallNordea          0          0          0
ButterflySpreadTelia     0          0          0
StraddleSEB              0.1923     0.1673     0.1435
ShortBarrierAtlasCopco   0          0          0

In Table 4.20 it can be seen that the optimal holding vectors for the lower (left column) and upper (right column) confidence bounds deviate little from the holding vector in the middle column, which represents the optimal solution for the estimated mean. The largest row-wise deviations between the columns are found for the TBOND, the Nordea put option and the straddle, which all deviate by approximately 2.4% from the estimated-mean solution at the upper and lower confidence bounds. Among the stocks, the largest deviations are found for the ICA and Nordea stocks, which both have deviations below 1.8%. As mentioned, the AtlasCopco stock had one of the widest confidence intervals; however, it can be seen in Table 4.20 that the optimal holding in this asset only deviates by 0.2% at the lower confidence bound and 0.5% at the upper confidence bound. Hence, the same conclusion as for DBM3 is reached. In addition, the same analysis was performed for both the Wang transform and the Block Maxima, for both the student's-t distribution and the normal copula, each demonstrating small holding deviations in the optimal solution compared to the optimal solution computed with the estimated mean µ̂. It is concluded that the solution is robust under the uncertainty of the mean at a 95% confidence level.

4.5 Comparison of CDRM Optimization

In Sections 4.3.1 through 4.3.4 each risk measure was analyzed separately. This section first considers a more general analysis of the different members of the CDRM class with elliptically distributed returns and then with asymmetrically distributed returns, and concludes with an analysis of risk contributions. The risk contributions are particularly interesting when studying portfolio risks since they make it possible to analyze how much each of the individual assets contributes to the total portfolio risk. This is interesting for two reasons: first, it provides an understanding of why the optimal portfolio has been chosen the way it has; secondly, it provides the knowledge needed to change the level of risk in the portfolio in the desired direction. By comparing Tables 4.6, 4.9, 4.12 and 4.15 it can be seen that minimization of any of the coherent distortion risk measures with normal returns results in solutions almost identical to the benchmark solution given in Table 4.5. Most capital is invested in the prioritized assets: the TBOND, the Straddle and the Nordea hedge. It can furthermore be seen that if the value constraints on the TBOND and the derivatives were not in effect, these assets would most likely have more allocated capital, since this was the case in Table 4.4 in Section 4.2, where the value constraints were discussed. It is hence not surprising that these assets do not vary much. It is, therefore, more interesting to analyze which assets, in fact, do deviate from the benchmark. The holdings obtained with CVaRα and DBMβ are the closest to the benchmark solution, while the optimal solutions for WTβ and BMβ were the furthest away, with

deviations of around 2% for both ICA and Nordea. This is explained by studying the q weights in Figures 4.2, 4.3, 4.4 and 4.5. In all three frames for both BMβ and WTβ, the worst-case losses are weighted significantly higher than any other weight qk, while both CVaRα and DBMβ distribute the weights more evenly among the loss quantiles. This is also reflected in the measures of portfolio risk stated in the bottom rows of the tables, which estimate the portfolio risk to be larger for these more risk-averse measures. It is furthermore observed from Tables 4.7, 4.10, 4.13 and 4.16 that the solutions with student's-t distributed returns share the same structure as the ones with normally distributed returns, although the holding vectors deviate more from the benchmark solution than for the normally distributed returns. This implies that the risk measures are dependent on the distribution, the tails in particular. The polynomial decay of the student's-t tails increases both the severity and the occurrence of the extreme losses. Since CVaR0.99 and DBM3 are more conservative with the weights given to the most extreme loss scenarios, the optimal solutions with regard to these measures are more robust to the change of the degrees of freedom than the holding vectors obtained from solving for WT0.99 and BM0.05, which are much more sensitive. The sensitivity is a consequence of the scarcity of extreme loss samples, in particular the max{L} scenario, which are given more weight by the latter two risk measures. However, increasing the degrees of freedom while holding all other parameters constant, the holdings in each asset converge toward the holdings of the normal distribution. This behaviour can be explained by the remark in Section 3.3.1, which states that if the returns are elliptically distributed, the optimal solution to problem (3.18) can be reformulated as the Markowitz mean-variance trade-off problem. In that case, the risk measure becomes a scaling factor of the portfolio variance, which is only dependent on the distortion function parameter and the degrees of freedom of the distribution. Now consider the optimal solutions for the asymmetrically distributed returns depicted in Tables 4.8, 4.11, 4.14 and 4.17. The first observation deduced from each table is that the structure of the holdings vector has changed: Telia and Sandvik have been reduced to zero holdings, while AstraZeneca and AtlasCopco are included in the portfolio position. The reason why Telia has been removed is its heavy lower tail; in Table A.4 it has a large ξ̂ parameter, implying a heavy tail. The overall portfolio risk has increased for all risk measures compared to both the normally distributed returns and the student's-t distributed returns with ν = 2.1, the DBM3 risk measure being the only exception. Also, the holdings in each asset vary more when changing the distortion parameters in the asymmetric case compared to the normal case. This is explained by the heavier lower tails of the asymmetric distributions: when more weight is distributed toward the extreme losses, this becomes more pronounced for the assets with heavier tails. Due to the symmetry of normal distributions and the exponential decay of their tails, the extremes do not become as prominent.

4.6 Analysis of Euler Risk Contributions

Recall from Section 3.5 that the performance of a portfolio position is measured by the ratio between the expected portfolio return and the portfolio risk; the greater the ratio, the better the performance. In portfolio optimization, the optimal portfolio performance is the main goal. In order to gain a better understanding of why certain assets are preferred over others, and to make the right adjustments to improve the performance of the portfolio position, it is a good idea to investigate how much each asset contributes to the overall portfolio risk. Therefore, this section is devoted to studying the Euler decompositions. Consider a portfolio with equal holdings in all assets and use equation (3.21); Table 4.21 then presents the normalized risk contributions for CVaR0.99, WT0.85, BM0.05 and DBM3 when the returns are drawn from a normal distribution and the holdings in all assets are equal to 1/16. The last column presents the yearly mean return of each asset in per cent.
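For CVaRα, the Euler contributions in Table 4.21 have a standard scenario estimator: the contribution of asset i is its average loss over the scenarios in the α-tail of the portfolio loss. The sketch below, which assumes equation (3.21) is of this standard form, computes normalized contributions for the equally weighted position used in the table.

    import numpy as np

    def cvar_euler_contributions(returns, x, alpha=0.99):
        """Normalized Euler contributions to portfolio CVaR_alpha.

        dCVaR/dx_i is estimated as E[-r_i | portfolio loss >= VaR_alpha],
        a standard scenario estimator; contributions x_i * dCVaR/dx_i are
        normalized to sum to one, as in Table 4.21.
        """
        losses = -(returns @ x)                      # portfolio loss per scenario
        var = np.quantile(losses, alpha)             # scenario VaR_alpha
        tail = losses >= var                         # alpha-tail scenarios
        contrib = x * (-returns[tail]).mean(axis=0)  # x_i * dCVaR/dx_i
        return contrib / contrib.sum()

    # Equally weighted position as in Table 4.21: x_i = 1/16 for all assets.
    # x = np.full(16, 1.0 / 16.0); print(cvar_euler_contributions(R, x))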

Table 4.21: Risk contributions for CVaR0.99, WT0.85, BM0.05 and DBM3 when the simulated returns are drawn from the multivariate normal distribution.

Asset names              ∂CVaR0.99/∂xi   ∂WT0.85/∂xi   ∂BM0.05/∂xi   ∂DBM3/∂xi   µ (% per year)
AstraZeneca              0.1286          0.1299        0.1297        0.1295      0.0278
AtlasCopco               0.2614          0.2634        0.2581        0.2648      0.0669
Ericsson                 0.1018          0.1072        0.1076        0.1080      -0.0313
HM                       0.0764          0.0725        0.0736        0.0716      -0.0230
ICA                      0.0580          0.0483        0.0469        0.0460      0.0756
Nordea                   0.0675          0.0639        0.0618        0.0636      0.0263
Sandvik                  0.0619          0.0607        0.0603        0.0602      0.0128
SEB                      0.0600          0.0547        0.0558        0.0558      -0.0145
SwedishMatch             0.0243          0.0268        0.0237        0.0266      0.0575
Telia                    0.0286          0.0277        0.0271        0.0272      0.0077
TBOND                    -0.0010         -0.0011       -0.0015       -0.0013     0.0294
LongPutNordea            -0.0536         -0.0506       -0.0486       -0.0503     -0.0378
ShortCallNordea          -0.1091         -0.1009       -0.0938       -0.0997     -0.2620
ButterflySpreadTelia     -0.0029         0.0024        0.0068        0.0042      -0.3565
StraddleSEB              0.0067          0.0049        0.0030        0.0047      0.1966
ShortBarrierAtlasCopco   0.2913          0.2902        0.2896        0.2891      -0.0088

The first interesting observation that can be deduced from Table 4.21 is that the risk contributions from each asset are very similar when studying each row one at a time. Even though the risk measures represent varying risk aversions, the risk contribution from each asset differs by less than 1% between the risk measures. The only exception is the ICA stock, which reaches a difference of 1.2% between CVaR0.99 and DBM3. This explains why the optimal solutions do not vary much when optimizing with the different risk measures. A second observation is that the TBOND has a negative risk contribution as well as a positive return, which is why it is the most preferred asset along with the straddle, which has a small positive risk contribution for all risk measures as well as the largest expected return among all assets. It can be seen that both the SwedishMatch and Telia stocks have the smallest risk contributions among the stocks. Since SwedishMatch also has a reasonably large expected return, it is preferred over the Telia stock, which is why it has been given around 9% holdings together with the ICA stock. The ICA stock can be observed to have larger risk contributions than both SwedishMatch and Telia in the table; this is compensated for by a larger expected return. It also becomes clear from studying the table that assets such as AstraZeneca, AtlasCopco, Ericsson and HM are not invested in, since all are associated with large risks, the latter two even having negative expected returns. The asset with the smallest risk contribution is the call option with the Nordea stock as underlying; however, this asset is not invested in due to its large negative expected return. It is not surprising that the barrier option is not invested in, since it is associated with the largest risk of all assets. Consider now the same portfolio with equal holdings in all assets and use equation (3.21); Table 4.22 then presents the normalized risk contributions for CVaR0.99, WT0.85, BM0.05 and DBM3 when the returns are drawn from the normal copula with hybrid marginal distributions and the holdings in all assets are equal to 1/16. The last column presents the yearly mean return of each asset in per cent.

Table 4.22: Risk contributions for CVaR0.99, WT0.85, BM0.05 and DBM3 when the simulated returns are drawn from the normal copula with hybrid univariate distributions.

Asset names              ∂CVaR0.99/∂xi   ∂WT0.85/∂xi   ∂BM0.05/∂xi   ∂DBM3/∂xi   µ (% per year)
AstraZeneca              0.0337          0.0430        0.0337        0.0508      0.0278
AtlasCopco               0.1750          0.1724        0.1476        0.1814      0.0669
Ericsson                 0.1715          0.1344        0.1430        0.1163      -0.0313
HM                       0.1052          0.1066        0.0907        0.1157      -0.0230
ICA                      0.0343          0.0389        0.0309        0.0417      0.0756
Nordea                   0.0521          0.0511        0.0459        0.0529      0.0263
Sandvik                  0.1002          0.1148        0.0965        0.1289      0.0128
SEB                      0.0600          0.0714        0.0586        0.0828      -0.0145
SwedishMatch             0.0515          0.0710        0.0670        0.0790      0.0575
Telia                    0.0899          0.0814        0.1104        0.0644      0.0077
TBOND                    -0.0024         -0.0029       -0.0030       -0.0033     0.0294
LongPutNordea            -0.0413         -0.0403       -0.0360       -0.0416     -0.0378
ShortCallNordea          -0.0839         -0.0798       -0.0690       -0.0814     -0.2620
ButterflySpreadTelia     0.0532          0.0423        0.1163        0.0078      -0.3565
StraddleSEB              0.0060          0.0065        0.0031        0.0078      0.1966
ShortBarrierAtlasCopco   0.1950          0.1891        0.1644        0.1969      -0.0088

In Table 4.22, it can be seen that the risk contributions from each asset differ more between the risk measures than in Table 4.21. This is a direct result of the more accurate modelling of the tail risk with the asymmetric distributions, together with the risk measures being more sensitive to tail risk located further out in the tails. This implies that the reason the elliptical distributions have similar risk contributions is their exponentially decaying tails. The largest differences between Tables 4.22 and 4.21 are found for the Ericsson and Telia stocks, which have the heaviest loss tails among the assets since they have the largest ξ̂ parameters, as shown in Table A.4. While assets such as AstraZeneca and AtlasCopco have decreased risk contributions compared to Table 4.21, Sandvik and Telia have increased risk contributions. This explains why the capital invested in Sandvik and Telia under the symmetric distribution assumption has been reallocated to AstraZeneca and AtlasCopco under the asymmetric distribution assumption, even though the latter two had zero holdings for returns simulated from elliptical distributions. The Nordea stock also has decreased risk contributions for all risk measures, which explains the increased holdings in the stock for optimization under the asymmetric distribution assumption in Section 4.3. The risk contributions from the TBOND decrease for all measures under the asymmetric distribution assumption compared to the symmetric one, which is why it has slightly increased holdings for the asymmetric distributions. In conclusion, studying the Euler decomposition in relation to the expected return provides useful knowledge about how the optimal portfolio is chosen. By equation (3.22) it is clear that any adjustment to the holdings of a portfolio can either improve or worsen the portfolio performance. Also, gradually increasing the allocated capital in an asset that initially has a positive impact on the overall portfolio performance will eventually, when the allocated capital becomes large enough, worsen the portfolio performance. In a way, this property favours diversification across multiple assets. Moreover, if the position is optimized with respect to a risk measure ρ, no adjustment to the position can improve the portfolio performance with respect to that same risk measure ρ; in the case of an optimal portfolio position, the quotient in equation (3.22) should be close to zero. Risk contributions are powerful complementary tools for understanding how local adjustments influence the overall portfolio performance, especially when analyzing different portfolio distribution models.

5 Final Conclusion and Further Investigation

The goal of this master's thesis project was to extend the classical Markowitz portfolio optimization model to a reference portfolio containing non-linear assets, and to compare it with optimization under coherent distortion risk measures. Under the assumption of elliptically distributed returns, the optimization problem with respect to a coherent distortion risk measure is equivalent to the mean-variance portfolio optimization problem. This is an attractive property, since portfolio optimization with coherent distortion risk measures is computationally intense in general, while no approximations are necessary in the elliptical case; the optimal solutions of the two methods are identical. Whenever the elliptical assumption is deemed a good approximation of the portfolio model, the classical mean-variance optimization is therefore the better method. When the portfolio returns follow an asymmetric joint distribution model, the optimal solutions for the coherent distortion risk measures differ more from the benchmark solution. Moreover, the optimal holdings differ more between the risk measures and the associated levels of risk aversion. The optimal holdings for the BM risk measure and the WT differed the most from the benchmark, since these measures are more sensitive to the most extreme losses, while CVaR and DBM deviate less from the benchmark since they weight a larger part of the loss quantiles, not just the extremes.
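
To make the elliptical equivalence concrete, recall the standard representation of a translation invariant and positively homogeneous risk measure on an elliptical model (see e.g. Hult et al. [20] or McNeil et al. [26]); the constant c_ρ below depends only on the risk measure and the elliptical generator, and the display is a sketch of the argument rather than a derivation given in this thesis:

    ρ(-w^T R) = -w^T µ + c_ρ · sqrt(w^T Σ w),

so, for a fixed target mean w^T µ = m, minimizing ρ over the weights w amounts to minimizing w^T Σ w, which is exactly the Markowitz mean-variance problem.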

Possible directions for further research:

• The optimization framework considered in this project proved to be computationally intense, and heavy approximations had to be made to achieve reasonable runtimes. An interesting direction is therefore to investigate methods that make the optimization framework more computationally efficient.

• This project considered four members of the family of coherent distortion risk measures. A direction for future work is to study other members, such as the lookback distortion measure discussed by Hürlimann [19] or the beta family of distortion measures proposed by Wirch and Hardy [38].

• The linear programming approach used in this project introduces statistical uncertainty when simulating scenarios from the portfolio distribution model. For the analysis in this project, the sample size was chosen carefully to ensure convergence of the optimization, knowing that statistical uncertainty is present in simulation-based portfolio optimization. A possible extension is to analyze the sensitivity of the optimization under statistical uncertainty, for example using the bootstrap methods suggested in James et al. [15].

• Alternative portfolios could be considered. This thesis studies a reference portfolio with assets from the Swedish market. It could also be of interest to study portfolios with assets from foreign markets and other asset types, such as foreign currencies and other types of derivatives.

• The normal copula was used to model the dependence structure of the hybrid marginal distributions. Alternatively, a Student's-t copula could be used to model the dependence structure of the portfolio assets; a simulation sketch is given below. In contrast to the normal copula, the Student's-t copula is not asymptotically independent in the upper and lower tails, which could potentially improve the measurement of tail risk. Grouped copulas with members from the Archimedean class, the Clayton and Frank copulas in particular, could also be interesting to consider, since they allow more explicit modelling of joint tail behaviour, which is where the risk is located.
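
As a concrete illustration of the last point, below is a minimal sketch (in Python with NumPy/SciPy; the function name and arguments are illustrative) of drawing dependent uniforms from a Student's-t copula. The uniforms would then be mapped through the inverse hybrid marginal distribution functions to obtain return scenarios.

    import numpy as np
    from scipy import stats

    def student_t_copula_sample(n, corr, nu, seed=None):
        # Standard construction: multivariate t = correlated normals divided by
        # sqrt(chi2/nu), then each margin is mapped through the univariate
        # t CDF to obtain uniforms with t-copula dependence.
        rng = np.random.default_rng(seed)
        d = corr.shape[0]
        z = rng.multivariate_normal(np.zeros(d), corr, size=n)
        g = rng.chisquare(nu, size=n) / nu
        t_samples = z / np.sqrt(g)[:, None]
        return stats.t.cdf(t_samples, df=nu)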

A Appendix

A.1 Fitting GARCH and GPD parameters

The fundamental requirement for modelling threshold excesses with a generalized Pareto distribution is that the threshold excess sample comes from an approximately IID(0,1) distribution, see Section 2.5. As mentioned in Section 3.1, financial return series are weakly dependent and close to identically distributed. Moreover, recalling from Section 2.3, financial return series tend to be subject to the stylized facts of financial time series. To investigate whether the risk factor return series are standardized with mean zero and unit variance, Table A.1 depicts the distributional statistics for the linear assets of the reference portfolio.

Table A.1: Summary of return series distributional statistics and numerical test results for the presence of stylized facts of financial time series. The tests are defined in Definitions 2.6 - 2.8.

Instrument      Mean     Variance   Skew      Kurt      ARCH-test  LB-test  JB-test
AstraZeneca     0.0001   0.0132    -0.8282    14.9226   1          1        1
AtlasCopco      0.0003   0.0200    -0.9967    19.8232   1          1        1
Ericsson       -0.0001   0.0199    -1.4088    23.8978   0          0        1
HM             -0.0001   0.0148    -0.3724    10.8637   0          1        1
ICA             0.0003   0.0151    -0.2769    11.3632   1          1        1
Nordea          0.0001   0.0177     0.3542     9.6631   1          1        1
Sandvik         0.0001   0.0195    -0.1889     7.9617   1          1        1
SEB            -0.0001   0.0201    -0.1526    15.4647   1          1        1
SwedishMatch    0.0002   0.0134    -0.2046     8.1980   1          1        1
Telia           0.0000   0.0139    -0.5191    13.2497   0          1        1
TBOND           0.0001   0.0011     0.4724     7.7037   1          1        1

All assets have means close to zero, but none has unit variance. In addition, the skewness and kurtosis display asymmetry (deviation from zero) and heavy-tailedness (kurtosis larger than 3) for all assets; the series thus suffer from the third stylized fact, implying that a normal distribution would be ill-fitted. The heavy-tailedness could be modelled with a Student's-t distribution; the skewness, however, causes problems. Moreover, the realized return series and the autocorrelation functions of squared returns in Figure A.2 provide evidence of volatility clustering and serial dependence, stylized facts one and two. The Ericsson stock, however, appears to have insignificant serial dependence, not falling outside the 95% confidence bound of the autocorrelation function in Figure A.2. The same conclusion is drawn from Engle's ARCH test and the Ljung-Box test in Table A.1. Nevertheless, the realized return series displays volatility clustering, and the Ericsson stock is therefore also fitted with a GARCH(1,1) model to filter out the clustering. Since there is evidence of the stylized facts in the linear risk factor return data, the series are fitted with univariate GARCH(p, q) models to obtain approximately IID(0,1) standardized residuals that should not suffer from the stylized facts. Various GARCH(p, q) models were attempted, and it was concluded that GARCH(1,1) performed best for all assets. It was not possible, however, to filter out all serial dependence from the TBOND; a wide range of GARCH models, including EGARCH models, were attempted without a satisfactory result, and the GARCH(1,1) model gave the best fit. Table A.2 presents the maximum likelihood estimated parameters for each asset, with the corresponding standard errors and t-statistics. These were estimated numerically using Matlab's built-in function estimate; an illustrative analogue of the diagnostic tests in Table A.1 is sketched below.
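
The sketch below is such an analogue, written in Python; the SciPy/statsmodels calls, the lag length and the 5% rejection level are assumptions of the sketch, not choices documented in the thesis.

    import numpy as np
    from scipy import stats
    from statsmodels.stats.diagnostic import acorr_ljungbox, het_arch

    def stylized_fact_tests(r, lags=20, level=0.05):
        # Summary statistics and 0/1 rejection flags for one return series.
        jb_stat, jb_p = stats.jarque_bera(r)          # normality
        lb = acorr_ljungbox(r, lags=[lags])           # serial dependence
        arch = het_arch(r, nlags=lags)                # ARCH effects (LM test)
        return {"mean": np.mean(r), "variance": np.var(r),
                "skew": stats.skew(r),
                "kurt": stats.kurtosis(r, fisher=False),  # Pearson kurtosis
                "ARCH": int(arch[1] < level),
                "LB": int(lb["lb_pvalue"].iloc[0] < level),
                "JB": int(jb_p < level)}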

Table A.2: GARCH(1,1) parameters estimated with the maximum likelihood method, used to filter out standardized residuals from the log-return data.

Asset           Parameter  Estimate     Standard Error  t-Statistic
Astra Zeneca    α̂0         8.6514e-06   4.5682e-07      18.938
                α̂1         0.05         0.0051495       9.7096
                β̂1         0.9          0.0064317       139.93
Atlas Copco     α̂0         9.976e-06    1.0209e-06      9.7716
                α̂1         0.0593       0.0061502       9.642
                β̂1         0.91846      0.0078092       117.61
Ericsson        α̂0         3.4536e-06   4.3904e-07      7.8663
                α̂1         0.01055      0.0009171       11.504
                β̂1         0.98136      0.0012721       771.47
HM              α̂0         1.0956e-05   5.4109e-07      20.248
                α̂1         0.05         0.003776        13.241
                β̂1         0.9          0.0047648       188.88
ICA             α̂0         1.5069e-05   9.0878e-07      16.581
                α̂1         0.10507      0.0081705       12.86
                β̂1         0.83555      0.0099017       84.385
Nordea          α̂0         3.5742e-06   6.6303e-07      5.3908
                α̂1         0.044159     0.0037888       11.655
                β̂1         0.94247      0.0049583       190.08
Sandvik         α̂0         2.3876e-06   6.9807e-07      3.4203
                α̂1         0.046247     0.0037748       12.251
                β̂1         0.94651      0.0043318       218.5
SEB             α̂0         2.784e-06    4.8636e-07      5.7241
                α̂1         0.056662     0.0042596       13.302
                β̂1         0.93365      0.0045041       207.29
Swedish Match   α̂0         4.2517e-06   7.9724e-07      5.3329
                α̂1         0.038114     0.0032463       11.741
                β̂1         0.938        0.005502        170.49
Telia           α̂0         2.4181e-06   4.7438e-07      5.0974
                α̂1         0.034545     0.0020418       16.919
                β̂1         0.9535       0.0033127       287.84
TBOND           α̂0         2e-07        7.4412e-08      2.6877
                α̂1         0.12013      0.0093221       12.887
                β̂1         0.73204      0.0084978       86.145

For a GARCH(1,1) model, the parameter α1 measures the extent to which a volatility shock feeds through from one day to the next, and the factor (α1 + β1) measures the rate at which the effect of the shock dies out. Hence, α1 + β1 < 1 is necessary for model stability. This requirement holds for all the fitted GARCH(1,1) models listed in Table A.2. Moreover, α1 < β1 for all model estimates; recalling equation (2.10), this implies that the conditional variance depends more on the previous day's variance than on the previous day's squared return shock. The standard error and t-statistic columns indicate satisfactory model fits. The daily conditional volatility σ̂t is computed for each asset over the entire time period using the recursive form (2.10) with the fitted parameters. The univariate standardized residuals are obtained as

Zt = (Rt − µ̂) / σ̂t.    (A.1)

If the standardized residuals have been successfully filtered, they should be approximately IID(0,1) and no longer influenced by the stylized facts of financial time series. Table A.3 depicts the distributional statistics for the standardized residuals.
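
A minimal sketch of this filtering step follows, combining the recursion (2.10) with the standardization (A.1). The parameters would be taken from Table A.2; initializing the recursion at the sample variance is an assumption of the sketch, and the thesis itself estimated the models with Matlab's estimate.

    import numpy as np

    def garch11_filter(r, alpha0, alpha1, beta1):
        # sigma2_t = alpha0 + alpha1 * e_{t-1}^2 + beta1 * sigma2_{t-1}
        e = r - r.mean()                 # demeaned returns
        sigma2 = np.empty_like(r)
        sigma2[0] = e.var()              # assumed initialization
        for t in range(1, len(r)):
            sigma2[t] = alpha0 + alpha1 * e[t-1]**2 + beta1 * sigma2[t-1]
        return e / np.sqrt(sigma2)       # standardized residuals Z_t of (A.1)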

Table A.3: Summary of distributional statistics for GARCH-filtered standardized residuals and numerical test results for the presence of stylized facts of financial time series. The numerical tests are defined in Definitions 2.6 - 2.8.

Instrument      Mean      Variance   Skew      Kurt      ARCH-test  LB-test  JB-test
AstraZeneca    -0.0017    1.0086    -0.9298    17.7434   0          0        1
AtlasCopco      0.0013    0.9851    -2.4252    50.7181   0          0        1
Ericsson        0.0010    0.9650    -1.7846    32.5911   0          0        1
HM              0.0001    1.0195    -0.5415    13.3053   0          0        1
ICA             0.0063    0.9999    -0.3067    14.1450   0          0        1
Nordea         -0.0022    0.9968    -0.0837     7.1677   0          0        1
Sandvik         0.0035    1.0232    -0.0168     4.7605   0          0        1
SEB             0.0083    1.0012     0.0371     5.8987   0          0        1
SwedishMatch    0.0025    0.9995    -0.2584     9.2606   0          0        1
Telia          -0.0027    1.0006    -0.7931    13.4455   0          0        1
TBOND          -0.0028    0.9662     0.2670     6.1666   0          1        1

Table A.3 shows that all GARCH-filtered standardized residuals have means close to zero and approximately unit variance. However, the skewness and kurtosis indicate that the residuals are both asymmetric and heavy-tailed. The Jarque-Bera test further supports that, although the residuals have approximately mean zero and unit variance, they are not normally distributed. The autocorrelation of the squared residuals has been filtered out for all assets, as indicated by the ARCH-test column. Lastly, the Ljung-Box test shows that the serial dependence has been filtered out as well for all assets except the TBOND. The reason the filtration fails there might be that a GARCH model is not a suitable choice for this type of asset. The same conclusions are drawn from Figure A.3: the realized residuals show no volatility clustering, not even for Ericsson or the TBOND, and any significant autocorrelation has been filtered out of both the residuals and the squared residuals. The QQ-plots display an inverted S-shape, an indicator of heavy-tailedness; moreover, the two ends do not deviate equally much, the left tail being heavier. In conclusion, the standardized residuals are approximately IID(0,1), although they do not support any assumption of symmetry or exponential tail decay.

Given the approximately IID(0,1) standardized residuals z1, ..., zJ with unknown distribution function, the distribution tails are modelled with generalized Pareto distributions using the peaks-over-threshold method described in Section 3.3.2. A high threshold ηu is chosen, defining where the upper tail begins. The threshold excess residuals zt − ηu are then fitted with a generalized Pareto distribution by estimating the shape ξ̂ and scale β̂ parameters using maximum likelihood estimation. Analogously, the lower tail is fitted with a generalized Pareto distribution to the excesses ηl − zt by choosing a lower threshold ηl. The maximum likelihood estimated parameters and the corresponding quantile levels of the chosen thresholds are depicted in Table A.4.
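
A sketch of this peaks-over-threshold fit using SciPy is given below; the quantile levels used for the thresholds here are illustrative, while Table A.4 lists the levels actually chosen per asset.

    import numpy as np
    from scipy import stats

    def fit_gpd_tails(z, qu=0.97, ql=0.03):
        # Thresholds at empirical quantiles; maximum likelihood fit of the GPD
        # to the excesses with the location parameter fixed at zero.
        eta_u, eta_l = np.quantile(z, qu), np.quantile(z, ql)
        up = z[z > eta_u] - eta_u              # upper-tail excesses
        lo = eta_l - z[z < eta_l]              # lower-tail excesses
        xi_u, _, beta_u = stats.genpareto.fit(up, floc=0)
        xi_l, _, beta_l = stats.genpareto.fit(lo, floc=0)
        return (eta_u, xi_u, beta_u), (eta_l, xi_l, beta_l)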

Table A.4: Maximum likelihood estimated generalized Pareto distribution parameters for the standardized residuals. Here ξ̂ is the shape parameter and β̂ the scale parameter for the corresponding threshold ηl in the lower tail and ηu in the upper tail, with corresponding quantile levels ql and qu.

Asset           Threshold        Quantile      ξ̂         β̂
Astra Zeneca    ηu =  1.6677     qu = 0.9574    0.1096    0.6204
                ηl = -1.5801     ql = 0.0438    0.2982    0.6455
Atlas Copco     ηu =  1.6530     qu = 0.9715    0.1300    0.4451
                ηl = -1.6204     ql = 0.0291    0.3977    0.4191
Ericsson        ηu =  1.4624     qu = 0.9669    0.3509    0.4453
                ηl = -1.3823     ql = 0.0245    0.6336    0.6077
HM              ηu =  1.5452     qu = 0.9498    0.2977    0.5320
                ηl = -1.6728     ql = 0.0370    0.1994    0.7630
ICA             ηu =  1.6345     qu = 0.9691    0.0942    0.7806
                ηl = -1.5064     ql = 0.0318    0.1524    0.9086
Nordea          ηu =  1.6943     qu = 0.9721    0.0916    0.6647
                ηl = -1.6578     ql = 0.0478    0.1366    0.6330
Sandvik         ηu =  1.6876     qu = 0.9682   -0.1152    0.7933
                ηl = -1.7911     ql = 0.0175   -0.2185    0.7777
SEB             ηu =  1.6891     qu = 0.9746    0.0070    0.7227
                ηl = -1.6928     ql = 0.0352    0.1296    0.5413
Swedish Match   ηu =  1.7233     qu = 0.9630    0.0771    0.7038
                ηl = -1.6014     ql = 0.0144    0.0397    1.1796
Telia           ηu =  1.6159     qu = 0.9685    0.2722    0.5415
                ηl = -1.5558     ql = 0.0309    0.4468    0.6332
TBOND           ηu =  1.6815     qu = 0.9562    0.1335    0.6663
                ηl = -1.6985     ql = 0.0612   -0.0513    0.6622

Table A.4 shows that the estimated shape parameter ξ̂ takes positive values for all assets except Sandvik (both tails) and the lower tail of the TBOND. As discussed in Section 2.5, a positive shape parameter corresponds to the Fréchet case, in which the tail has polynomial decay. A negative shape parameter corresponds to the Weibull case, where the tail has a finite endpoint, while ξ = 0 corresponds to the Gumbel case of exponentially decaying tails, such as those of a normal distribution. The scale parameters all satisfy β > 0, in line with the definition of the generalized Pareto distribution (2.16). Connecting the generalized Pareto tails to the empirical distribution function according to the formal representation (3.9) gives the complete hybrid cumulative distribution functions; a sketch of this construction is given below. The hybrid empirical cumulative distribution functions, with the generalized Pareto tail parameters of Table A.4, are visualized in Figure A.1. These distribution functions capture both asymmetry and heavy tails, and therefore fit the residuals more accurately than any of the fitted elliptical models, as illustrated in Figure 3.2 in Section 3.3.2. Another attractive property of the generalized Pareto tails with ξ > 0 (the Fréchet case) is that they extrapolate beyond the range of the residual sample, hence taking into account potential losses that have not occurred within the sample history.
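
A sketch of the hybrid construction referred to above follows (assuming ξ ≠ 0 in both tails; the function name and argument layout are illustrative):

    import numpy as np

    def hybrid_cdf(x, z, eta_u, xi_u, beta_u, eta_l, xi_l, beta_l):
        # Empirical CDF in the body, GPD tails beyond the thresholds (cf. (3.9)).
        q_u = np.mean(z <= eta_u)     # empirical level of the upper threshold
        q_l = np.mean(z <= eta_l)     # empirical level of the lower threshold
        if x > eta_u:                 # upper GPD tail
            return 1.0 - (1.0 - q_u) * (1.0 + xi_u * (x - eta_u) / beta_u) ** (-1.0 / xi_u)
        if x < eta_l:                 # lower GPD tail
            return q_l * (1.0 + xi_l * (eta_l - x) / beta_l) ** (-1.0 / xi_l)
        return np.mean(z <= x)        # empirical body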

Figure A.1: Hybrid empirical cumulative distribution functions for standardized residuals with estimated generalized Pareto distribution tails for each linear asset in the reference portfolio.

A.2 Estimated Parameters

Simulating daily return scenarios when the reference portfolio presented in Section 4.1 is assumed to be elliptically distributed requires estimates of the parameters µ and Σ; these are presented in (A.2) and (A.3). When the daily return scenarios are assumed to have asymmetric univariate marginal distributions, the dependence structure of the joint distribution function, from which the return scenarios are simulated, is given by the normal copula. The copula is calibrated with Kendall's rank correlation matrix T, which is estimated by computing the linear correlation matrix of the probability-transformed standardized residuals and converting it using the relation (2.20). The estimated Kendall's rank correlation matrix T̂ is presented in (A.4), and a simulation sketch follows the displayed matrices.

µ̂ = 10^-3 [ 0.1101  0.2655  -0.1243  -0.0911  0.3001  0.1046  0.0506  -0.0576  0.2281  0.0307  0.1167 ]    (A.2)

 0.1731 0.0671 0.0604 0.0497 0.0297 0.0586 0.0612 0.0554 0.0381 0.0502 −0.0008  0.0671 0.3992 0.1436 0.1335 0.0784 0.1981 0.3085 0.2243 0.0677 0.1206 −0.0056    0.0604 0.1436 0.5164 0.0926 0.0488 0.1425 0.1422 0.1281 0.0511 0.0853 −0.0041     0.0497 0.1335 0.0926 0.2192 0.0497 0.1180 0.1284 0.1248 0.0448 0.0760 −0.0032     0.0297 0.0784 0.0488 0.0497 0.2288 0.0671 0.0762 0.0769 0.0333 0.0487 −0.0022  −3   Σˆ = 10  0.0586 0.1981 0.1425 0.1180 0.0671 0.3135 0.1992 0.2433 0.0614 0.1155 −0.0058     0.0612 0.3085 0.1422 0.1284 0.0762 0.1992 0.3817 0.2261 0.0662 0.1149 −0.0062     0.0554 0.2243 0.1281 0.1248 0.0769 0.2433 0.2261 0.4035 0.0533 0.1181 −0.0061     0.0381 0.0677 0.0511 0.0448 0.0333 0.0614 0.0662 0.0533 0.1785 0.0458 −0.0009   0.0502 0.1206 0.0853 0.0760 0.0487 0.1155 0.1149 0.1181 0.0458 0.1928 −0.0028  −0.0008 −0.0056 −0.0041 −0.0032 −0.0022 −0.0058 −0.0062 −0.0061 −0.0009 −0.0028 0.0013 (A.3) The estimated mean vector µˆ and covariance matrix Σˆ are computed for the data set 3266 daily log- returns recorded over the time period 2006-01-02 to 2018-12-02.

T̂ =
[  1.0000   0.2836   0.2741   0.2967   0.1699   0.3031   0.2625   0.2474   0.2307   0.3147  -0.0388 ]
[  0.2836   1.0000   0.4191   0.4743   0.2687   0.5620   0.8053   0.5069   0.2905   0.4612  -0.2306 ]
[  0.2741   0.4191   1.0000   0.3563   0.1961   0.4346   0.4035   0.3447   0.2364   0.3489  -0.1883 ]
[  0.2967   0.4743   0.3563   1.0000   0.2483   0.4776   0.4385   0.4132   0.2642   0.4339  -0.1948 ]
[  0.1699   0.2687   0.1961   0.2483   1.0000   0.2661   0.2589   0.2377   0.1874   0.2561  -0.0964 ]
[  0.3031   0.5620   0.4346   0.4776   0.2661   1.0000   0.5473   0.6743   0.2874   0.5056  -0.2833 ]
[  0.2625   0.8053   0.4035   0.4385   0.2589   0.5473   1.0000   0.5040   0.2701   0.4278  -0.2392 ]
[  0.2474   0.5069   0.3447   0.4132   0.2377   0.6743   0.5040   1.0000   0.2202   0.4306  -0.2557 ]
[  0.2307   0.2905   0.2364   0.2642   0.1874   0.2874   0.2701   0.2202   1.0000   0.2822  -0.0585 ]
[  0.3147   0.4612   0.3489   0.4339   0.2561   0.5056   0.4278   0.4306   0.2822   1.0000  -0.1711 ]
[ -0.0388  -0.2306  -0.1883  -0.1948  -0.0964  -0.2833  -0.2392  -0.2557  -0.0585  -0.1711   1.0000 ]
(A.4)
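
A minimal sketch of how the calibrated normal copula is then used for simulation (assuming that the relation (2.20) is the elliptical-copula identity ρ = sin(πτ/2), and that the converted matrix is positive definite):

    import numpy as np
    from scipy import stats

    def normal_copula_sample(T_hat, n, seed=None):
        # Convert Kendall's tau to the copula correlation and draw uniforms.
        rng = np.random.default_rng(seed)
        rho = np.sin(np.pi * T_hat / 2.0)
        z = rng.multivariate_normal(np.zeros(len(rho)), rho, size=n)
        return stats.norm.cdf(z)   # to be fed into the inverse hybrid marginals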

A.3 Payoff Functions and Profit-and-Loss Distributions for Derivatives

This section presents the payoff functions and profit-and-loss (PnL) distributions of the derivative assets, to provide a better understanding of their structure and of the scenarios they generate under simulation. Recall that the portfolio has an investment horizon of one day. Since each derivative's time to maturity T is given in years, it is assumed that all derivatives can be sold at the investment horizon at the price determined by the corresponding pricing function. The left frame of Figure A.4 depicts the payoff function of the long put option with the Nordea stock as underlying. Since the payoff function is decreasing, the value of the derivative decreases when the value of the Nordea stock increases. The right frame depicts the PnL distribution of the derivative when the return sample of the underlying is the historical data. The tails are observed to be almost symmetric.

Figure A.4: Payoff function net of the premium and PnL distribution for the long put option with the Nordea stock as underlying. The parameters of the derivative are: S0 = 74.58, r = 0.012, σ = 0.5, T ≈ 5.66 and K = 50.
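
For concreteness, a sketch of how such a one-day PnL distribution can be produced by repricing under Black-Scholes is given below. The thesis does not specify its pricing routine, so the pricing function, the normal stand-in for the historical return sample and the 1/252 day fraction are all assumptions of the sketch.

    import numpy as np
    from scipy.stats import norm

    def bs_put(S, K, r, sigma, T):
        # Black-Scholes price of a European put.
        d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
        d2 = d1 - sigma * np.sqrt(T)
        return K * np.exp(-r * T) * norm.cdf(-d2) - S * norm.cdf(-d1)

    # One-day repricing with the Figure A.4 parameters.
    S0, K, r, sigma, T = 74.58, 50.0, 0.012, 0.5, 5.66
    rets = np.random.default_rng(0).normal(0.0, 0.018, 3266)  # stand-in returns
    pnl = bs_put(S0 * np.exp(rets), K, r, sigma, T - 1/252) - bs_put(S0, K, r, sigma, T)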

The left frame of Figure A.5 depicts the payoff function of the short call option with the Nordea stock as underlying. Since the payoff function is decreasing, the value of the derivative decreases when the value of the Nordea stock increases. The right frame depicts the PnL distribution of the derivative when the return sample of the underlying is the historical data. The tails of the short call option are heavier than those of the put option, and there are more extreme loss scenarios than profit scenarios.

Figure A.5: Payoff function net of the premium and PnL distribution for the short call option with the Nordea stock as underlying. The parameters of the derivative are: S0 = 74.58, r = 0.012, σ = 0.5, T ≈ 5.66 and K = 100.

The left frame of Figure A.6 depicts the payoff function of the long butterfly spread with the Telia stock as underlying. The payoff decreases if the value of the underlying either increases or decreases; the largest payoff, just above zero, is obtained if the value of the underlying stays roughly the same. The right frame depicts the PnL distribution of the derivative when the return sample of the underlying is the historical data. The loss tail is much heavier than the profit tail, and the profits are limited, as is also seen in the payoff function.

Figure A.6: Payoff function net of the premium and PnL distribution for the long butterfly spread with the Telia stock as underlying. The parameters of the derivative are: S0 = 42.98, r = 0.012, σ = 0.35, T = 0.25, K1 = 32.93, K2 = 40.43 and K3 = 47.93.
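
A sketch of the payoff just described, assuming the standard construction of a long butterfly spread from calls (the thesis specifies only the strikes):

    import numpy as np

    def butterfly_payoff(ST, K1, K2, K3):
        # Long one call at K1 and one at K3, short two calls at K2:
        # maximal payoff K2 - K1 at ST = K2, zero outside [K1, K3].
        call = lambda K: np.maximum(ST - K, 0.0)
        return call(K1) - 2.0 * call(K2) + call(K3)

    payoff = butterfly_payoff(np.linspace(20.0, 60.0, 401), 32.93, 40.43, 47.93)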

The left frame of Figure A.7 depicts the payoff function of the long straddle with the SEB stock as underlying. The payoff increases if the value of the underlying either increases or decreases, and the smallest payoff is obtained if the value of the underlying does not change; hence the best payoff is obtained when the underlying is volatile. The right frame depicts the PnL distribution of the derivative when the return sample of the underlying is the historical data. The profit tail is much heavier than the loss tail, which is bounded from below, as can also be seen in the payoff function.

Figure A.7: Payoff function net of the premium and PnL distribution for the long straddle with the SEB stock as underlying. The parameters of the derivative are: S0 = 86.4, r = 0.012, σ = 0.5, T = 1 and K1 = K2 = 94.9.

The left frame of Figure A.8 depicts the payoff function of the short down-and-in barrier put option with the AtlasCopco stock as underlying. The payoff function is increasing in the value of the underlying. The right frame depicts the PnL distribution of the derivative when the return sample of the underlying is the historical data. A few loss scenarios are very large, whereas the profit scenarios are comparatively modest.

Figure A.8: Payoff function net of the premium and PnL distribution for the short down-and-in barrier put option with the AtlasCopco stock as underlying. The parameters of the derivative are: S0 = 210.5, r = 0.009, σ = 0.5, T ≈ 3.33, strike price K = 200 and the down-and-in barrier at H = 100.

Note that both the call and the put option with the Nordea stock as underlying are far out of the money, which means that the derivatives will end up in the money with low probability. The barrier option is also out of the money, and the underlying is very unlikely to hit the barrier. The parameters of these derivatives have been chosen to render large losses with a low probability of occurrence, which is most pronounced for the barrier option. The purpose is to study how sensitive the optimization is to extreme loss scenarios, and whether any of the risk measures is better at detecting them. The butterfly spread and the straddle were both chosen since they are investment strategies that are particularly interesting to study in a portfolio optimization framework.

References

[1] A. Adam, M. Houkari, and J.-P. Laurent, Spectral risk measures and portfolio selection.

[2] P. Artzner, F. Delbaen, J.-M. Eber, and D. Heath, Coherent measures of risk, Mathematical Finance, 9(3), pp. 203-228, 1999.

[3] D. Bertsimas and D. B. Brown, Constructing uncertainty sets for robust linear optimization, Operations Research, 57(6), pp. 1483-1495, 2009.

[4] T. P. Bollerslev, Generalized autoregressive conditional heteroskedasticity, Journal of Econometrics, 31, pp. 307-327, 1986.

[5] T. P. Bollerslev, A conditionally heteroskedastic time series model for speculative prices and rates of return, The Review of Economics and Statistics, 69, pp. 542-547, 1987.

[6] T. P. Bollerslev and J. M. Wooldridge, Quasi-maximum likelihood estimation and inference in dynamic models with time-varying covariances, Econometric Reviews, 11, pp. 143-172, 1992.

[7] G. E. P. Box, G. M. Jenkins, and G. C. Reinsel, Time Series Analysis: Forecasting and Control, 3rd ed., Englewood Cliffs, NJ: Prentice Hall, 1994.

[8] D. B. Brown, Risk and robust optimization.

[9] P. Embrechts, A. McNeil, and D. Straumann, Correlation and dependence in risk management: properties and pitfalls, in: Risk Management: Value at Risk and Beyond, ed. M. A. H. Dempster, Cambridge University Press, Cambridge, pp. 176-223, 2002.

[10] P. Embrechts, C. Klüppelberg, and T. Mikosch, Modelling Extremal Events for Insurance and Finance, Springer-Verlag.

[11] G. Choquet, Theory of capacities, Annales de l'Institut Fourier, pp. 131-295, 1954.

[12] R. F. Engle, Autoregressive conditional heteroskedasticity with estimates of the variance of United Kingdom inflation, Econometrica, 50, pp. 987-1007, 1982.

[13] D. Denneberg, Non-additive Measure and Integral, Dordrecht: Kluwer, 1994.

[14] H. Föllmer and A. Schied, Convex and coherent risk measures, 2008.

[15] G. James, D. Witten, T. Hastie, and R. Tibshirani, An Introduction to Statistical Learning, volume 112, Springer, 2013.

[16] H. Gzyl and S. Mayoral, On a relationship between distorted and spectral risk measures, IESA, Caracas, Venezuela and UNAV, Pamplona, España.

[17] E. G. Haug, The Complete Guide to Option Pricing Formulas, 2nd ed.

[18] J. Hull, Fundamentals of Futures and Options Markets, 2014.

[19] W. Hürlimann, Inequalities for lookback option strategies and exchange risk modelling, in: Proceedings of the First Euro-Japanese Workshop on Stochastic Modelling for Finance, Insurance, Production and Reliability, Brussels, 1998.

[20] H. Hult, F. Lindskog, O. Hammarlid, and C. J. Rehn, Risk and Portfolio Analysis - Principles and Methods, Springer, 2012.

[21] C. M. Jarque and A. K. Bera, A test for normality of observations and regression residuals, International Statistical Review, 55(2), pp. 163-172, 1987.

[22] P. Krokhmal, J. Palmquist, and S. Uryasev, Portfolio optimization with Conditional Value-at-Risk objective and constraints, 2001.

[23] S. Kusuoka, On law invariant coherent risk measures, Advances in Mathematical Economics, 3, pp. 83-95, 2001.

[24] H. Markowitz, Portfolio selection, The Journal of Finance, 7(1), pp. 77-91, American Finance Association, Wiley, 1952.

[25] Nasdaq OMX Nordic [Internet], [cited]. Available from: www.nasdaqomxnordic.com.

[26] A. McNeil, R. Frey, and P. Embrechts, Quantitative Risk Management, Princeton, 2005.

[27] K. Nyström and J. Skoglund, Efficient filtering of financial time series and extreme value theory, Journal of Risk, 7(2), pp. 63-84, 2005.

[28] R. T. Rockafellar and S. Uryasev, Optimization of Conditional Value-at-Risk, 2000.

[29] R. T. Rockafellar and S. Uryasev, Conditional value-at-risk for general loss distributions, Journal of Banking & Finance, 26, pp. 1443-1471, 2002.

[30] E. N. Sereda, E. M. Bronshtein, S. T. Rachev, F. J. Fabozzi, W. Sun, and S. V. Stoyanov, Distortion risk measures in portfolio optimization, FinAnalytica Inc.

[31] N. N. Taleb, The Black Swan: The Impact of the Highly Improbable, New York: Random House, 2007.

[32] D. Tasche, Risk contributions and performance measurement, Report of the Lehrstuhl für mathematische Statistik, TU München, 1999.

[33] D. Tasche, Conditional expectation as quantile derivative, working paper, Technische Universität München.

[34] J. Skoglund and W. Chen, Financial Risk Management, Wiley, 2015.

[35] S. Uryasev, Derivatives of probability functions and some applications, Annals of Operations Research, 56, pp. 287-311.

[36] S. S. Wang, A risk measure that goes beyond coherence, research report, 2001.

[37] S. S. Wang, V. R. Young, and H. H. Panjer, Axiomatic characterization of insurance prices, Insurance: Mathematics and Economics, 1997.

[38] J. L. Wirch and M. R. Hardy, A synthesis of risk measures for capital adequacy, Insurance: Mathematics and Economics, 25(3), pp. 337-347, 1999.

[Figure A.2 panels omitted: for each linear asset, the realized returns, a QQ-plot against standard normal quantiles, and sample autocorrelation functions of the returns and of the squared returns.]

Figure A.2: Subfigures depicting the realized returns (upper left), Quantile-Quantile plot (upper right), and sample autocorrelation functions of the returns and of the squared returns (lower panels) for each linear asset.

[Figure A.3 panels omitted: the corresponding plots for the GARCH-filtered standardized residuals.]

Figure A.3: Subfigures depicting the realized standardized residuals (upper left), Quantile-Quantile plot (upper right), and sample autocorrelation functions of the standardized residuals and of the squared standardized residuals (lower panels) for each linear asset.
