DEGREE PROJECT IN MATHEMATICS, SECOND CYCLE, 30 CREDITS STOCKHOLM, SWEDEN 2020
Capturing Tail Risk in a Risk Budgeting Model
FILIP LUNDIN
MARKUS WAHLGREN
KTH ROYAL INSTITUTE OF TECHNOLOGY SCHOOL OF ENGINEERING SCIENCES
Degree Projects in Financial Mathematics (30 ECTS credits) Master's Programme in Industrial Engineering and Management KTH Royal Institute of Technology year 2020 Supervisor at Nordnet AB: Gustaf Haag Supervisor at KTH: Anja Janssen Examiner at KTH: Anja Janssen
TRITA-SCI-GRU 2020:050 MAT-E 2020:016
Royal Institute of Technology School of Engineering Sciences KTH SCI SE-100 44 Stockholm, Sweden URL: www.kth.se/sci
Abstract
Risk budgeting, in contrast to conventional portfolio management strategies, is all about distributing the risk between holdings in a portfolio. The risk in risk budgeting is traditionally measured in terms of volatility and a Gaussian distribution is commonly utilized for modeling return data. In this thesis, these conventions are challenged by introducing different risk measures, focusing on tail risk, and other probability distributions for modeling returns.
Two models for forming risk budgeting portfolios that acknowledge tail risk were chosen. Both these models were based on CVaR as a risk measure, in line with what previous researchers have used. The first model modeled returns with their empirical distribution and the second with a Gaussian mixture model. The performance of these models was thereafter evaluated. Here, a diverse set of asset classes, several risk budgets, and risk targets were used to form portfolios. Based on the performance, measured in risk-adjusted returns, it was clear that the models that took tail risk into account in general had superior performance in relation to the standard model. Nevertheless, it should be noted that the superiority was significantly higher for portfolios that consisted mainly of high-risk assets than for portfolios with more low-risk assets and also that the superior performance did not hold in all time periods considered. It was also clear that the model that used the empirical distribution to model returns performed better than the model based on the assumption that returns follow a Gaussian mixture model when the portfolio consisted of more assets with heavier tails.
Filip Lundin i Markus Wahlgren
A Risk Budgeting Model That Takes Tail Risk into Account
Summary
Compared with conventional portfolio management strategies, risk budgeting is more about distributing the risk between holdings in a portfolio. The risk in risk budgeting is traditionally measured with respect to volatility, and a Gaussian distribution is normally used to model return data. In this thesis, other models are analyzed that instead focus on tail risk by introducing other risk measures and by using other probability distributions for modeling return data.
Two models for constructing risk budgeting portfolios that take tail risk into account have been analyzed in this thesis. Both models used CVaR as a risk measure, in line with what previous researchers have used. The first model modeled returns with the empirical distribution and the second model with a Gaussian mixture model. Thereafter, the performance of the different models was evaluated. Here, a variety of asset classes, several risk budgets, and risk targets were used to form the portfolios. Based on performance, measured in terms of risk-adjusted return, it was clear that the models that took tail risk into account generally performed better than the conventional model. It should, however, be noted that this result was less significant for portfolios that mainly consisted of low-risk assets, and also that the result did not hold for all time periods analyzed. It was also clear that the model that used the empirical distribution to model return data worked better than the Gaussian mixture model when the portfolio consisted to a larger extent of assets with heavier tails.
Acknowledgements
We want to thank Nordnet for presenting us with the opportunity to write this thesis and for providing the necessary data. We especially want to thank our supervisor at Nordnet, Gustaf Haag, for introducing us to the subject of risk budgeting and for helping us to pinpoint a subject to write about. Also, we want to thank our supervisor at KTH Royal Institute of Technology, Anja Janssen, for the helpful guidance and feedback that we have received throughout the process of writing this thesis.
Filip Lundin & Markus Wahlgren Stockholm, May 13, 2020
Contents

1 Introduction
  1.1 Research Questions
  1.2 Scope & Limitations
  1.3 Related Work
  1.4 Outline
2 Background
  2.1 Distributions
    2.1.1 Gaussian Distribution
    2.1.2 Gaussian Mixture Distribution
    2.1.3 Empirical Distribution
  2.2 Parameter Estimation
    2.2.1 Gaussian Distribution
    2.2.2 Maximum Likelihood
    2.2.3 Gaussian Mixture Model
  2.3 Risk Measures
    2.3.1 Coherent Risk Measures
    2.3.2 Volatility
    2.3.3 Value-at-Risk
    2.3.4 Conditional Value-at-Risk
  2.4 Risk Budget Portfolios
    2.4.1 Existence and Uniqueness of the Portfolio
    2.4.2 Euler’s Theorem on Homogeneous Functions
    2.4.3 Volatility as a Risk Measure
    2.4.4 CVaR as a Risk Measure
    2.4.5 CVaR as a Risk Measure with Gaussian Mixture Model
  2.5 Optimization
    2.5.1 Optimization Problem
    2.5.2 Quadratic Programming
    2.5.3 Sequential Quadratic Programming
  2.6 Risk Targeting
  2.7 Performance Measures
    2.7.1 Sharpe Ratio
    2.7.2 Other Measures of Risk-Adjusted Returns
    2.7.3 Maximum Drawdown
3 Data and Methodology
  3.1 Data
  3.2 Risk Budgeting Portfolio Creation
    3.2.1 Portfolios with Leverage
  3.3 Portfolio Evaluation
4 Results
  4.1 Performance Graphs
  4.2 Performance Measures for the Full Period
  4.3 Performance Measures for Sub-Periods
5 Discussion
  5.1 Evaluation of the Findings
  5.2 Sources of Error
  5.3 Future Work
6 Conclusions
7 Bibliography
A Model data
  A.1 Capital Weights
  A.2 QQ-Plots for the Models
  A.3 Leverage
  A.4 Performance Measures for Sub-Periods

List of Figures

2.1 Illustration of VaR for a density function f_L of losses
2.2 Illustration of CVaR for a density function f_L of losses
2.3 Illustration of maximum drawdown
4.1 Performance for unleveraged risk parity portfolios
4.2 Performance for low-risk portfolios for a daily empirical CVaR_0.05 target of 0.5%
4.3 Performance for mid-risk portfolios for a daily empirical CVaR_0.05 target of 1%
4.4 Performance for high-risk portfolios for a daily empirical CVaR_0.05 target of 1.5%
4.5 The Sharpe ratio for the different periods for the risk parity portfolio
4.6 The tail risk-adjusted return for the different periods for the risk parity portfolio
4.7 The Sharpe ratio for the different periods for the high-risk portfolio
4.8 The tail risk-adjusted return for the different periods for the high-risk portfolio

List of Tables

3.1 Portfolio risk budgets
3.2 The daily empirical CVaR_0.05 target for the different risk levels
3.3 Performance measures for individual assets
4.1 Performance measures for risk parity portfolio
4.2 Performance measures for low-risk portfolio
4.3 Performance measures for mid-risk portfolio
4.4 Performance measures for high-risk portfolio
Chapter 1
Introduction
Historically, portfolio allocation strategies have often involved investing 60 % in equities and 40 % in bonds (60/40-portfolio) or applying Markowitz’s [1] mean-variance analysis to determine portfolio weights. The mean-variance analysis has allowed investors to find the optimal capital allocation that maximizes the expected return for a given volatility level or minimizes the volatility for a given level of expected return. Although both these strategies have been very popular, some criticism has been directed against them. For instance, the 60/40-portfolio is said to not diversify risks properly and the mean-variance method is highly sensitive to the input parameters. In the past decades, alternative approaches for portfolio construction have therefore gained traction in the investment community, one of them being risk budgeting. Risk budgeting, in contrast to the previously discussed allocation strategies, is all about distributing the risk between holdings in a portfolio. The risk in risk budgeting is often measured by volatility, which is a commonly used measure of risk in finance. A special case of risk budgeting is risk parity or equal risk contribution (ERC), where the risk is allocated equally between assets.
The first risk budgeting fund, called All-Weather, was developed by Bridgewater Associates in 1996 and used a risk parity approach to establish an investment strategy that would perform well in all economic conditions [2]. Since then, and particularly after the financial crisis of 2008, when numerous institutional investors understood that their current models yielded riskier portfolios than expected, the popularity of risk budgeting and risk parity has increased considerably [3].
Risk budgeting as an investment strategy has been praised for its ability to generate true diversification of the risks in portfolios. In the 60/40-portfolio, for instance, 90 % of the portfolio risk, measured in volatility, normally originates from the equity component since equities are significantly riskier than bonds [4]. The risk in a risk parity portfolio is, on the other hand, by definition set to be distributed equally among assets. Thus, the problem that one of the asset classes accounts for the majority of the overall portfolio risk is averted. Risk parity is, however, not preferable in all instances, considering its restrictiveness in the way that risk is allocated, and therefore risk budgeting is consistently utilized to enable investors to set suitable budgets for the risk allocation themselves.
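To make the concentration of risk in a 60/40-portfolio concrete, the following minimal sketch decomposes portfolio volatility into per-asset contributions using the standard Euler decomposition, where asset i contributes w_i (Σw)_i / σ_p. The volatilities and the correlation are hypothetical values chosen for illustration only; they are not taken from the thesis data or from [4].

```python
import math

# Hypothetical inputs (assumptions for illustration, not thesis data).
weights = [0.60, 0.40]   # capital weights: equities, bonds
vols = [0.15, 0.05]      # assumed annual volatilities
rho = 0.10               # assumed equity/bond correlation

# 2x2 covariance matrix built from the assumptions above.
cov = [
    [vols[0] ** 2, rho * vols[0] * vols[1]],
    [rho * vols[0] * vols[1], vols[1] ** 2],
]

# (Sigma w)_i for each asset.
sigma_w = [sum(cov[i][j] * weights[j] for j in range(2)) for i in range(2)]

# Portfolio volatility sqrt(w' Sigma w).
port_vol = math.sqrt(sum(weights[i] * sigma_w[i] for i in range(2)))

# Volatility contribution of asset i; the contributions sum to port_vol.
contrib = [weights[i] * sigma_w[i] / port_vol for i in range(2)]
share = [c / port_vol for c in contrib]

print(f"portfolio volatility: {port_vol:.4f}")
print(f"risk share equities: {share[0]:.1%}, bonds: {share[1]:.1%}")
```

Under these assumed inputs the equity component carries well over 90 % of the volatility although it holds only 60 % of the capital. A risk parity portfolio would instead choose the weights so that both shares equal 50 %.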
When estimating the risk that should be allocated in a risk budgeting portfolio, volatility is conventionally applied as the risk measure together with an assumption of Gaussian distributed returns. Returns of various financial assets have nevertheless been shown not to follow the Gaussian distribution. In particular, it has been demonstrated that numerous financial asset returns possess heavier tails than what would be presumed under the assumption of the Gaussian distribution [5]. This indicates that the risk of large losses may be neglected by models that apply volatility as a risk measure and assume Gaussian distributed returns. It has also been shown that some financial asset returns exhibit skewness and are asymmetrical in their distribution [6][7]. This contradicts the assumption that returns follow the symmetrical Gaussian distribution.
In this Master thesis, the convention of employing volatility as a risk measure and assuming Gaussian distributed returns in risk budgeting models will be challenged. A risk budgeting model will be implemented that omits the assumption of Gaussian distributed returns and uses an alternative risk measure that is focused on measuring tail risk, i.e. the risk of substantial losses.
Nordnet AB, the provider of the project, currently uses risk budgeting strategies for its three Nordnet Smart mutual funds. The firm is interested in alternative methods for evaluating risks in risk budgeting models. Their portfolios consist of assets including stocks, interest rates, credits, commodities, inflation-protected assets, and lastly alternative risk premia, which aim at exploiting anomalies in the market, e.g. by buying shares with high momentum and selling shares with low momentum. In this thesis, portfolios will be formed using assets similar to those used in Nordnet’s Smart funds when analyzing the risk budgeting strategy that incorporates the effects of tail risk.
1.1 Research Questions
The research questions for this master thesis are the following:
• How can a risk budgeting strategy be implemented that takes into account tail risk in asset returns?
• What is the difference in the performance of a risk budgeting strategy that acknowledges tail risk in returns and one that does not?
1.2 Scope & Limitations
This thesis will be limited by the data that will be considered in the analysis of the models. Data has been provided by Nordnet so that similar assets could be used in the portfolio construction process as used in their Smart Funds. However, the data consist of prices for several different asset classes that could be used by any investment manager to form a portfolio and therefore the results will be relevant from a general perspective.
This thesis will not take into consideration the transaction costs that arise when the portfolio is rebalanced or the management fees for the examined assets. Other costs such as the interest costs from leveraging will also not be investigated in this thesis. The risk-free rate is therefore assumed to be equal to zero throughout the thesis.
This thesis aims to drop the established risk measure volatility and replace it with a risk measure that incorporates tail risk. There exist many different types of risk measures that take tail risk into account, but this thesis will only investigate one of them, the conditional value-at-risk (CVaR). The other aim of this thesis is to explore assumptions other than that returns are Gaussian distributed. This assumption will be challenged by calculating CVaR empirically and also by fitting a Gaussian mixture distribution.
1.3 Related Work
The introduction of alternative risk measures such as conditional value-at-risk, which is a risk measure that focuses on the tail risk, into risk budgeting is a subject that has been investigated by multiple researchers.
AllianceBernstein (2013) describe a model which they call tail risk parity. They suggest using risk parity with CVaR as a risk measure instead of volatility. In their model, the CVaR is acquired from the options markets by analyzing implied volatilities. The authors show that their tail risk parity strategy has a higher risk-adjusted return, when the risk is measured in terms of CVaR, compared to a conventional risk parity strategy. The strategy does however not show any improvement when risk-adjusted returns were measured by Sharpe ratio, where risk is measured in terms of volatility. In the article, there are no specific mathematical details on how the tail risk parity portfolio is constructed [8].
Boudt, Carl and Peterson (2012) use Cornish–Fisher expansions and CVaR to form a strategy similar to a risk parity strategy, but where instead of allocating the risk equally among assets they decide the allocation by minimizing the greatest CVaR risk contribution from the assets. Utilizing Cornish–Fisher expansions together with CVaR they can account for heavy tails and skewness in returns [9].
Several articles that have investigated CVaR as a risk measure for the risk budgeting portfolio construction have estimated it non-parametrically. Cesarone and Colucci (2017) investigate the method by using empirical CVaR as the risk measure and setting the CVaR contribution to be equal among assets. They present two methods for allocating the risk based on empirical CVaR: a naive approach where diversification is not taken into account, which means that the portfolio weights are proportional to the assets’ inverse CVaR, and a model that considers the effects of diversification. For the former, the calculations for the asset allocation are simple, and for the latter, more advanced method, the portfolio weights can be found by utilizing optimization methods [10]. This type of model has also been analyzed in a well-cited Master thesis by Stefanovits [11]. An alternative method for estimating the non-parametric CVaR is to use a bootstrap resampling procedure, which is used in an article by Cagna and Casuccio [12]. Bootstrap approaches to estimating a non-parametric CVaR are also used in another study by Colucci [13], where a filtered bootstrap approach is used to estimate CVaR and the model is then compared with the one that uses the ordinary volatility risk measure.
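The naive approach mentioned above can be sketched in a few lines. This is an illustration only: the estimator used here (the average of the worst ceil(p·n) losses, with loss defined as the negative return) is one simple assumption for the empirical CVaR and not necessarily the estimator used in [10] or the definition given in section 2.3.4, and the function names and return series are made up.

```python
import math

def empirical_cvar(returns, p=0.05):
    """Average of the worst ceil(p * n) losses, where loss = -return (assumed estimator)."""
    losses = sorted((-r for r in returns), reverse=True)
    k = max(1, math.ceil(p * len(losses)))
    return sum(losses[:k]) / k

def naive_inverse_cvar_weights(asset_returns, p=0.05):
    """Weights proportional to 1 / CVaR_p of each asset, normalised to sum to one."""
    cvars = [empirical_cvar(r, p) for r in asset_returns]
    if any(c <= 0 for c in cvars):
        raise ValueError("naive weights need a strictly positive CVaR for every asset")
    inverse = [1.0 / c for c in cvars]
    total = sum(inverse)
    return [x / total for x in inverse]

# Toy data (made up): asset B is asset A with every return doubled.
asset_a = [0.01, -0.02, 0.005, -0.01, 0.015, -0.03, 0.02, 0.0, -0.005, 0.01]
asset_b = [2 * r for r in asset_a]
weights = naive_inverse_cvar_weights([asset_a, asset_b], p=0.2)
print(weights)
```

Because asset B has twice the CVaR of asset A, it receives half the weight. Correlations between the assets play no role here, which is exactly why this approach ignores diversification.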
Some studies have developed models that also take skewness risk into account when creating risk parity portfolios. One way of introducing skewness risk into risk parity models is to drop the assumption that returns are Gaussian distributed and instead use a Gaussian mixture distribution to model returns. This model together with CVaR as risk measure is used by Bruder, Kostyuchyk and Roncalli [14], who have applied the developed model to historical data and concluded that skewness-based risk parity models yield better allocation compared to ordinary volatility-based risk parity strategies since they yield less turnover and therefore lower costs. However, the performance in terms of risk-adjusted returns, measured in Sharpe ratio, is similar between the strategies. This method is also investigated in a Master thesis by Vu [15], who applies a constrained Gaussian mixture model with CVaR as a risk measure and compares it with the conventional volatility risk measure.
Mausser and Romanko [16] investigate possible issues with the use of the empirical CVaR as a risk measure instead of volatility. They do this by considering convex optimization to find long-only equal risk contribution portfolios (ERC) for a set of scenarios of asset returns that are equally likely. The first difficulty of using the empirical CVaR instead of volatility is that it can both be positive and negative. When it is negative it is not certain if any of the long-only ERC portfolios can be found using convex optimization. The second main problem is that the empirical CVaR is not continuously differentiable which indicates that there exists a possibility that a solution may not exist at all. In their paper, Mausser and Romanko present the conditions for these problems to occur and also suggest a heuristic method to find approximate ERC portfolios for these cases.
The existing literature on the subject indicates that there are only a few studied models and methods for implementing a new risk measure such as CVaR into risk budgeting. Most prominently, the Gaussian mixture model and the empirical distribution are used to model the distribution of the returns together with CVaR as a risk measure. What, however, is common for the current research is that all the papers investigate how alternative risk measures affect the performance of the risk parity portfolio, which is a special case of the risk budgeting portfolio. What is lacking in the literature are papers that have implemented a new risk measure for a more general risk budgeting model and compared its performance to the standard model. Together with the more commonly considered asset types, such as stocks and interest rates, this thesis also examines other asset classes such as credits, real estate, commodities, and alternative risk premia which are not commonly used in previous studies.
The two most prominent methods for implementing CVaR as a risk measure are, as previously mentioned, using the empirical distribution or fitting a Gaussian mixture distribution and then calculating the CVaR based on these distributions. Both methods will be implemented and compared in this thesis. This comparison of the two most commonly used methods will contribute to the existing research since a direct comparison between the models has not yet been performed. All the articles in the literature that compare the performance of a CVaR model and the volatility-based model have only investigated one way of implementing CVaR. A comparison of multiple methods for implementing a CVaR model is, therefore, lacking in the literature and this thesis seeks to fill this gap. Also, portfolios with different risk level targets will be considered that utilize leverage in order to achieve the risk target. This has not been examined in the current research and is therefore a further contribution from this thesis.
1.4 Outline
The thesis will continue with a background chapter. Here, the relevant mathematical theory will be presented that is needed in order to understand the mathematical tools that are used to construct the risk budgeting portfolios. Thereafter, a chapter that describes the data and the methodology used for constructing the different portfolios is presented. Here, the real financial data that are used for evaluation of the performance are described. Also, the methodology for how the portfolios are formed using the data is presented. Then, the results are laid out, starting with graphs for the value development of the different portfolios and risk budgeting models. Apart from the graphs, this section also includes tables with performance measures for the full time period that are used to evaluate how well each model performs, for instance with regard to risk-adjusted returns. Diagrams that show the performance measures for shorter time periods are also presented in order to show how these measures change through time. Thereafter, the results are discussed, focusing on describing how the results could be explained and the relevance of the results. Lastly, the thesis ends with a conclusion where the research questions are answered.
Chapter 2
Background
This section will provide necessary and relevant background for this thesis. It will describe the theory of potential risk measures together with possible distributions and the estimation of their parameters. The background on how to construct the risk budgeting portfolios based on alternative risk measures will also be explained together with optimization methods. How specific risk levels are targeted for the portfolios will then be explained followed by a description of the performance measures that can be used for model comparison.
2.1 Distributions
Probability distributions describe how likely different events are to occur. For a random variable $X$, the distribution function is defined as $F_X(x) = P(X \le x)$, i.e. the probability that $X$ takes on a value less than or equal to the number $x$. If $X$ is a continuous random variable, then there exists a density $f_X(t)$ such that

$$F_X(x) = P(X \le x) = \int_{-\infty}^{x} f_X(t)\,dt.$$

Below, the Gaussian distribution, which is used in the conventional risk budgeting models together with volatility as a risk measure, is described. Also, the Gaussian mixture model and the empirical distribution are presented, which will be used to form a risk budgeting model that takes into account tail risk in financial returns.
2.1.1 Gaussian Distribution

A Gaussian or normal distribution is a continuous probability distribution that is commonly used for modeling natural phenomena or, which is more relevant for this thesis, financial asset returns. The probability density function (PDF) of the Gaussian distribution can be defined as follows [17]

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad x \in \mathbb{R}.$$

It is determined by the two parameters $\mu$ and $\sigma$. The first parameter $\mu$ is the mean or expected value of the distribution and the second parameter $\sigma$ is the standard deviation of the distribution.
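As a small numerical check of how the density relates to the distribution function from section 2.1, the sketch below integrates the Gaussian PDF with the trapezoidal rule and compares the result with the closed-form Gaussian CDF expressed through the error function. The function names, the integration bounds, and the evaluation point are illustrative choices, not part of the thesis.

```python
import math

def normal_pdf(t, mu=0.0, sigma=1.0):
    """Gaussian density with mean mu and standard deviation sigma."""
    z = (t - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def cdf_by_integration(x, pdf, lower=-10.0, steps=20000):
    """Trapezoidal approximation of the integral of pdf from lower to x."""
    if x <= lower:
        return 0.0
    h = (x - lower) / steps
    total = 0.5 * (pdf(lower) + pdf(x))
    for i in range(1, steps):
        total += pdf(lower + i * h)
    return total * h

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Closed-form Gaussian CDF via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

x = -1.645  # illustrative point in the left tail of the standard Gaussian
approx = cdf_by_integration(x, normal_pdf)
exact = normal_cdf(x)
print(f"numerical: {approx:.6f}, closed form: {exact:.6f}")
```

The lower bound of -10 stands in for minus infinity; for the standard Gaussian the probability mass below that point is negligible.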
In a multivariate setting the density function can be defined as the following
$$f_X(x) = \frac{e^{-\frac{1}{2}(x-\mu)^\top \Sigma^{-1}(x-\mu)}}{\sqrt{(2\pi)^n |\Sigma|}},$$

where $\mu$ and $\Sigma$ are a mean vector and a covariance matrix respectively [18]. A vector $X$ with density $f_X$ has expected value $E(X) = \mu$ and covariance matrix $\Sigma$. Here $|\Sigma|$ denotes the determinant of $\Sigma$.
2.1.2 Gaussian Mixture Distribution

Mixture distributions combine several distributions in order to fit empirical data better than individual distributions could. Mathematically, a probability density function $f(x)$ of a mixture distribution with $g$ distributions for a random vector $X$ can be written as

$$f(x) = \sum_{i=1}^{g} \pi_i f_i(x),$$

where $f_i(x)$ are component densities of the mixture and $\pi_i$ are the components’ weights in the distribution [19]. Naturally, the following conditions need to be fulfilled for the weights

$$0 \le \pi_i \le 1, \qquad \sum_{i=1}^{g} \pi_i = 1.$$

Bruder et al. [14] use a Gaussian mixture model with two Gaussian components to model returns when constructing a risk parity portfolio. Here, one component is considered a continuous component and the other one a jump component. The continuous component is modelled with a Gaussian distribution with mean $\mu$ and covariance $\Sigma$, while the jump component is modelled with a Gaussian distribution with mean $\mu + \tilde{\mu}$ and covariance $\Sigma + \tilde{\Sigma}$. These components are then used to form a Gaussian mixture model by introducing the probability $(1 - \lambda)$ for the continuous component and the probability $\lambda$ for the jump component. Mathematically, the density function for this Gaussian mixture model can, therefore, be described as follows, see [14],

$$f(x) = \frac{1-\lambda}{(2\pi)^{n/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x-\mu)^\top \Sigma^{-1} (x-\mu)\right) + \frac{\lambda}{(2\pi)^{n/2} |\Sigma+\tilde{\Sigma}|^{1/2}} \exp\left(-\frac{1}{2}\bigl(x-(\mu+\tilde{\mu})\bigr)^\top (\Sigma+\tilde{\Sigma})^{-1} \bigl(x-(\mu+\tilde{\mu})\bigr)\right).$$

Here, $\pi_1 = 1 - \lambda$ and $\pi_2 = \lambda$ following the notation from above.
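The two-component density above can be evaluated directly. The following is a minimal sketch using NumPy; the function names and all parameter values are made up for illustration and do not come from the thesis or from [14].

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Density of the n-dimensional Gaussian N(mean, cov) at the point x."""
    x = np.asarray(x, float)
    mean = np.asarray(mean, float)
    cov = np.asarray(cov, float)
    n = mean.shape[0]
    diff = x - mean
    quad = diff @ np.linalg.solve(cov, diff)   # (x - mu)' Sigma^{-1} (x - mu)
    norm = np.sqrt((2.0 * np.pi) ** n * np.linalg.det(cov))
    return float(np.exp(-0.5 * quad) / norm)

def mixture_pdf(x, lam, mu, sigma, mu_jump, sigma_jump):
    """(1 - lam) * N(mu, sigma) + lam * N(mu + mu_jump, sigma + sigma_jump)."""
    mu = np.asarray(mu, float)
    sigma = np.asarray(sigma, float)
    continuous = gaussian_pdf(x, mu, sigma)
    jump = gaussian_pdf(x, mu + np.asarray(mu_jump, float),
                        sigma + np.asarray(sigma_jump, float))
    return (1.0 - lam) * continuous + lam * jump

# Made-up daily parameters for two assets (illustration only).
mu = [0.0005, 0.0002]
sigma = [[1.0e-4, 2.0e-5], [2.0e-5, 4.0e-5]]
mu_jump = [-0.02, -0.005]
sigma_jump = [[9.0e-4, 1.0e-4], [1.0e-4, 1.0e-4]]

density = mixture_pdf([0.0, 0.0], lam=0.05, mu=mu, sigma=sigma,
                      mu_jump=mu_jump, sigma_jump=sigma_jump)
print(density)
```

Setting `lam=0` recovers the plain Gaussian density, which is a convenient sanity check. A negative `mu_jump` together with a positive-definite `sigma_jump` shifts probability mass into the left tail, which is how the mixture produces skewness and heavier tails.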
2.1.3 Empirical Distribution

The empirical distribution function is an estimate of an unknown distribution function based on observations and can be defined in the following way [20].

Let the unknown distribution function be $F(x) = P(X \le x)$ for the observations $x_1, x_2, \dots, x_n$ of independent and identically distributed random variables $X_1, X_2, \dots, X_n$. The empirical distribution function $F_n(x)$ is then defined by

$$F_n(x) = \frac{1}{n} \sum_{i=1}^{n} I\{x_i \le x\},$$

where $I\{x_i \le x\}$ is the indicator function, which returns 1 if $x_i$ is less than or equal to $x$ and 0 if $x_i$ is larger than $x$.
The empirical distribution function, like the previously mentioned distributions, can be used to estimate tail risk and has the advantage that it is based only on actual observations, without any distributional assumptions. The disadvantage of using the empirical distribution for estimating tail risk in the form of the empirical value-at-risk and conditional value-at-risk, as described in sections 2.3.3 and 2.3.4, is that estimates outside the range of the sample will show greater variation compared to a correctly fitted parametric distribution [21].
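The definition of $F_n$ translates almost literally into code. The sketch below is one possible implementation using only the standard library; the function name and the sample values are made up for illustration.

```python
import bisect

def empirical_cdf(sample):
    """Return F_n, where F_n(x) is the share of observations less than or equal to x."""
    data = sorted(sample)
    n = len(data)

    def F_n(x):
        # bisect_right counts the observations x_i with x_i <= x
        return bisect.bisect_right(data, x) / n

    return F_n

returns = [-0.03, 0.01, 0.00, -0.01, 0.02]  # made-up observations
F = empirical_cdf(returns)
print(F(-0.05), F(-0.01), F(0.02))
```

Sorting once and using binary search makes each evaluation logarithmic in the sample size. Note that $F_n$ is a step function: it equals 0 below the smallest observation and 1 from the largest observation onward, so it carries no information about losses larger than any that were observed.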
2.2 Parameter Estimation
In order to apply the previously mentioned distributions in the modeling, their parameters have to be estimated. Below the methods used for estimation of the different distributions’ parameters are explained.
2.2.1 Gaussian Distribution

The two parameters that need to be estimated for the one-dimensional Gaussian distribution are $\mu$ and $\sigma$. The parameter $\mu$ can be estimated by taking the arithmetic mean $\bar{x}$ of the sample of $n$ observations

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i.$$

The parameter $\sigma$ can be estimated by taking the sample standard deviation $s$ of a sample of $n$ observations, which is defined as follows

$$s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}.$$

For the multivariate Gaussian distribution the covariance, $\sigma_{jk}$, also needs to be estimated. The sample covariance is given by the following expression

$$\operatorname{cov}(x_j, x_k) = \frac{1}{n-1} \sum_{i=1}^{n} (x_{j,i} - \bar{x}_j)(x_{k,i} - \bar{x}_k).$$
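These three estimators can be written out and cross-checked against NumPy's built-in versions. The return observations below are made up for illustration.

```python
import numpy as np

# Made-up return observations: rows are time points, columns are assets.
returns = np.array([
    [0.010, 0.002],
    [-0.020, 0.001],
    [0.005, -0.003],
    [0.015, 0.004],
    [-0.010, 0.000],
])
n = returns.shape[0]

mean = returns.sum(axis=0) / n            # arithmetic mean per asset
centered = returns - mean
cov = centered.T @ centered / (n - 1)     # sample covariance matrix
std = np.sqrt(np.diag(cov))               # sample standard deviations

# Cross-check against NumPy's estimators (ddof=1 gives the n - 1 divisor).
assert np.allclose(mean, returns.mean(axis=0))
assert np.allclose(std, returns.std(axis=0, ddof=1))
assert np.allclose(cov, np.cov(returns, rowvar=False))
```

The diagonal of the covariance matrix holds the squared sample standard deviations, so a single matrix product yields both the volatilities and the covariances.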
2.2.2 Maximum Likelihood

The maximum likelihood method can be used to estimate parameters for a distribution based on empirical data. Consider the one-dimensional case with $x_1, \dots, x_n$, an i.i.d. sample from a certain distribution with CDF $F(x; \theta)$, where $\theta$ is an unknown parameter in the distribution. Then, if $X$ is a continuous random variable with PDF $f(x; \theta)$, the likelihood function is given by

$$L(\theta) = f(x_1; \theta) \cdot f(x_2; \theta) \cdots f(x_n; \theta).$$

The likelihood function represents how likely the observed sample is under the distribution $f(x; \theta)$. Thus, finding the $\theta^*$ that maximizes the likelihood function corresponds to finding the distribution, with $\theta^*$ as its parameter, from which the sample would most likely have come [17].
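For the one-dimensional Gaussian distribution the maximization has a well-known closed-form solution: the sample mean and the standard deviation computed with divisor $n$ (rather than the $n - 1$ used in section 2.2.1). The sketch below works with the logarithm of $L(\theta)$, which has the same maximizer, and checks that small perturbations of the closed-form estimates lower the log-likelihood. The sample values and function name are made up for illustration.

```python
import math

def gaussian_log_likelihood(sample, mu, sigma):
    """ln L(theta) for theta = (mu, sigma): the sum of ln f(x_i; theta)."""
    n = len(sample)
    squares = sum((x - mu) ** 2 for x in sample)
    return -n * math.log(sigma * math.sqrt(2.0 * math.pi)) - squares / (2.0 * sigma ** 2)

sample = [0.012, -0.004, 0.007, -0.015, 0.003, 0.009, -0.001]  # made-up observations
n = len(sample)

# Closed-form maximisers for the Gaussian case (note the divisor n, not n - 1).
mu_hat = sum(sample) / n
sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in sample) / n)
best = gaussian_log_likelihood(sample, mu_hat, sigma_hat)

# Perturbing either parameter lowers the log-likelihood.
for d_mu, d_sigma in [(0.001, 0.0), (-0.001, 0.0), (0.0, 0.001), (0.0, -0.001)]:
    assert gaussian_log_likelihood(sample, mu_hat + d_mu, sigma_hat + d_sigma) < best
```

Working on the log scale turns the product in $L(\theta)$ into a sum, which avoids numerical underflow for larger samples.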
2.2.3 Gaussian Mixture Model

The parameters of the Gaussian mixture model can be estimated by an expectation-maximization (EM) algorithm. The EM algorithm is a method to find maximum likelihood estimates when no closed-form solution to maximizing $L(\theta)$ exists.

Bruder et al. [14] describe an EM method to estimate the parameters of a Gaussian mixture model, which can be summarized in the following way.

They consider a Gaussian mixture model that has two Gaussian components and thus contains five parameters that need to be estimated, as described in section 2.1.2, i.e.

$$\theta = (\lambda, \mu, \Sigma, \tilde{\mu}, \tilde{\Sigma}).$$

A sample $R_t = (R_{1,t}, \dots, R_{n,t})$ of i.i.d. asset returns for $n$ assets, where $R_{i,t}$ is the return of asset $i$ observed at time $t$, is considered. The log-likelihood function is then

$$\ell(\theta) = \sum_{t=1}^{T} \ln f(R_t),$$

where $f(y)$ is the multivariate probability density function for a Gaussian mixture model with two Gaussian components as defined in section 2.1.2. Thus, in this case it is defined as

$$f(y) = \pi_1 \phi_n(y; \mu, \Sigma) + \pi_2 \phi_n(y; \mu + \tilde{\mu}, \Sigma + \tilde{\Sigma}),$$

where $\pi_1 = 1 - \lambda$, $\pi_2 = \lambda$ and $\phi_n(y; \mu, \Sigma)$ represents the PDF of the multivariate Gaussian distribution with parameters $\mu$ and $\Sigma$. The notations $\mu_1 = \mu$, $\Sigma_1 = \Sigma$, $\mu_2 = \mu + \tilde{\mu}$ and $\Sigma_2 = \Sigma + \tilde{\Sigma}$ are introduced and the log-likelihood function becomes

$$\ell(\theta) = \sum_{t=1}^{T} \ln \sum_{j=1}^{2} \pi_j \phi_n(R_t; \mu_j, \Sigma_j).$$
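The log-likelihood above can be evaluated numerically as follows. This is a sketch under illustrative assumptions: the function names are made up, the data are simulated, and the inner sum is computed with the log-sum-exp trick, which is a numerical convenience added here and not something prescribed by [14].

```python
import numpy as np

def log_gaussian_pdf(R, mean, cov):
    """ln phi_n(R_t; mean, cov) for every row R_t of the T x n array R."""
    mean = np.asarray(mean, float)
    cov = np.asarray(cov, float)
    n = mean.shape[0]
    diff = R - mean
    quad = np.einsum("ti,ti->t", diff @ np.linalg.inv(cov), diff)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + quad)

def mixture_log_likelihood(R, weights, means, covs):
    """l(theta) = sum_t ln sum_j pi_j phi_n(R_t; mu_j, Sigma_j)."""
    log_terms = np.stack([
        np.log(w) + log_gaussian_pdf(R, m, c)
        for w, m, c in zip(weights, means, covs)
    ])                                   # shape (components, T)
    top = log_terms.max(axis=0)          # log-sum-exp for numerical stability
    return float(np.sum(top + np.log(np.exp(log_terms - top).sum(axis=0))))

# Simulated data (illustration only).
rng = np.random.default_rng(1)
R = rng.normal(size=(50, 2))
m, c = np.zeros(2), np.eye(2)

single = float(log_gaussian_pdf(R, m, c).sum())
mixed = mixture_log_likelihood(R, [0.5, 0.5], [m, m], [c, c])
print(single, mixed)
```

When both components are identical, the mixture collapses to a single Gaussian and the two printed values agree, which gives a quick consistency check of the implementation.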
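The partial derivative of the log-likelihood $\ell(\theta)$ with respect to $\mu_j$ is

$$\frac{\partial \ell(\theta)}{\partial \mu_j} = \sum_{t=1}^{T} \frac{\pi_j \phi_n(R_t; \mu_j, \Sigma_j)}{\sum_{s=1}^{2} \pi_s \phi_n(R_t; \mu_s, \Sigma_s)}\, \Sigma_j^{-1}(R_t - \mu_j).$$

The first-order condition is, therefore,

$$\sum_{t=1}^{T} \pi_{j,t}\, \Sigma_j^{-1}(R_t - \mu_j) = 0,$$

where

$$\pi_{j,t} = \frac{\pi_j \phi_n(R_t; \mu_j, \Sigma_j)}{\sum_{s=1}^{2} \pi_s \phi_n(R_t; \mu_s, \Sigma_s)}.$$

The expression for the estimator $\hat{\mu}_j$ is then the following

$$\hat{\mu}_j = \frac{\sum_{t=1}^{T} \pi_{j,t} R_t}{\sum_{t=1}^{T} \pi_{j,t}}. \qquad (2.1)$$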
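To express the partial derivative of the log-likelihood function $\ell(\theta)$ with respect to $\Sigma_j$, consider the function $g(\Sigma_j^{-1})$, which is defined as follows

$$g(\Sigma_j^{-1}) = \frac{1}{(2\pi)^{n/2} |\Sigma_j|^{1/2}}\, e^{-\frac{1}{2}(R_t-\mu_j)^\top \Sigma_j^{-1}(R_t-\mu_j)} = \frac{|\Sigma_j^{-1}|^{1/2}}{(2\pi)^{n/2}}\, e^{-\frac{1}{2}\operatorname{trace}\left(\Sigma_j^{-1}(R_t-\mu_j)(R_t-\mu_j)^\top\right)}.$$

It then follows that the partial derivative is

$$\begin{aligned}
\frac{\partial g(\Sigma_j^{-1})}{\partial \Sigma_j^{-1}} &= \frac{1}{2}\, \frac{|\Sigma_j^{-1}|^{-1/2}\, |\Sigma_j^{-1}|\, \Sigma_j}{(2\pi)^{n/2}} \cdot e^{-\frac{1}{2}\operatorname{trace}\left(\Sigma_j^{-1}(R_t-\mu_j)(R_t-\mu_j)^\top\right)} \\
&\quad - \frac{1}{2}(R_t-\mu_j)(R_t-\mu_j)^\top \cdot \frac{|\Sigma_j^{-1}|^{1/2}}{(2\pi)^{n/2}}\, e^{-\frac{1}{2}\operatorname{trace}\left(\Sigma_j^{-1}(R_t-\mu_j)(R_t-\mu_j)^\top\right)} \\
&= \frac{1}{(2\pi)^{n/2} |\Sigma_j|^{1/2}} \cdot \frac{\Sigma_j - (R_t-\mu_j)(R_t-\mu_j)^\top}{2} \cdot e^{-\frac{1}{2}(R_t-\mu_j)^\top \Sigma_j^{-1}(R_t-\mu_j)}.
\end{aligned}$$

Then it is clear that

$$\frac{\partial \ell(\theta)}{\partial \Sigma_j^{-1}} = \frac{1}{2} \sum_{t=1}^{T} \frac{\pi_j \phi_n(R_t; \mu_j, \Sigma_j)}{\sum_{s=1}^{2} \pi_s \phi_n(R_t; \mu_s, \Sigma_s)} \left(\Sigma_j - (R_t-\mu_j)(R_t-\mu_j)^\top\right).$$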
The first-order condition is

$$\sum_{t=1}^{T} \pi_{j,t} \left(\Sigma_j - (R_t - \mu_j)(R_t - \mu_j)^\top\right) = 0.$$

Thus, the estimator $\hat{\Sigma}_j$ is given by

$$\hat{\Sigma}_j = \frac{\sum_{t=1}^{T} \pi_{j,t} (R_t - \hat{\mu}_j)(R_t - \hat{\mu}_j)^\top}{\sum_{t=1}^{T} \pi_{j,t}}. \qquad (2.2)$$
For the mixture probabilities $\pi_j$, the partial derivative of the log-likelihood with respect to $\pi_j$ is

$$\frac{\partial \ell(\theta)}{\partial \pi_j} = \sum_{t=1}^{T} \frac{\phi_n(R_t; \mu_j, \Sigma_j)}{\sum_{s=1}^{2} \pi_s \phi_n(R_t; \mu_s, \Sigma_s)}.$$

It can be concluded that it is not possible to define the estimator $\hat{\pi}_j$ directly from the first-order condition: the numerator is a strictly positive density, so the derivative cannot take on the value zero. Therefore, another approach has to be utilized in order to obtain the ML estimators.
Define $\hat{\pi}_{j,t}$ as the estimator for the posterior probability of the regime $j$ at time $t$, given as

$$\hat{\pi}_{j,t} = \frac{\pi_j \phi_n(R_t; \mu_j, \Sigma_j)}{\sum_{s=1}^{2} \pi_s \phi_n(R_t; \mu_s, \Sigma_s)}. \tag{2.3}$$

If $\hat{\pi}_{j,t}$ is known, the estimator $\hat{\pi}_j$ can be given by

$$\hat{\pi}_j = \frac{\sum_{t=1}^{T} \hat{\pi}_{j,t}}{T}. \tag{2.4}$$

The EM algorithm for the Gaussian mixture distribution can then be described in the following steps.
1. Set the initial starting values $\pi_j^{(0)}$, $\mu_j^{(0)}$ and $\Sigma_j^{(0)}$ for $k = 0$.

2. The E-step, where the posterior probabilities $\pi_{j,t}$ are updated for all the observations using equation (2.3), which yields the expression

$$\pi_{j,t}^{(k)} = \frac{\pi_j^{(k)} \phi_n\left(R_t; \mu_j^{(k)}, \Sigma_j^{(k)}\right)}{\sum_{s=1}^{2} \pi_s^{(k)} \phi_n\left(R_t; \mu_s^{(k)}, \Sigma_s^{(k)}\right)}.$$
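3. The M-step, where the estimators $\hat{\pi}_j$, $\hat{\mu}_j$ and $\hat{\Sigma}_j$ are updated using equations (2.4), (2.1) and (2.2), yielding the following expressions

$$\begin{aligned}
\pi_j^{(k+1)} &= \frac{\sum_{t=1}^{T} \pi_{j,t}^{(k)}}{T} \\
\mu_j^{(k+1)} &= \frac{\sum_{t=1}^{T} \pi_{j,t}^{(k)} R_t}{\sum_{t=1}^{T} \pi_{j,t}^{(k)}} \\
\Sigma_j^{(k+1)} &= \frac{\sum_{t=1}^{T} \pi_{j,t}^{(k)} \left(R_t - \mu_j^{(k+1)}\right)\left(R_t - \mu_j^{(k+1)}\right)^\top}{\sum_{t=1}^{T} \pi_{j,t}^{(k)}}.
\end{aligned}$$

4. Iterate steps 2 and 3 until the estimators converge.

5. Finally the estimated parameters are obtained as $\hat{\pi}_j = \pi_j^{(\infty)}$, $\hat{\mu}_j = \mu_j^{(\infty)}$ and $\hat{\Sigma}_j = \Sigma_j^{(\infty)}$.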
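The steps above can be sketched in code. The following is a minimal illustration rather than the implementation used in this thesis: it assumes a single asset (so that means and variances are scalars), uses simulated returns, and the function names `normal_pdf` and `em_two_gaussians`, the starting values and the stopping tolerance are all hypothetical choices.

```python
import math
import random


def normal_pdf(x, mu, var):
    """Density of a univariate Gaussian with mean mu and variance var."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_two_gaussians(returns, pi, mu, var, n_iter=200, tol=1e-10):
    """EM for a two-component univariate Gaussian mixture.

    pi, mu and var are lists of length 2 with the starting values (step 1).
    Returns the final (pi, mu, var).
    """
    T = len(returns)
    for _ in range(n_iter):
        # E-step: posterior probabilities pi_{j,t}, equation (2.3).
        post = []
        for r in returns:
            w = [pi[j] * normal_pdf(r, mu[j], var[j]) for j in range(2)]
            total = sum(w)
            post.append([wj / total for wj in w])

        # M-step: scalar versions of equations (2.4), (2.1) and (2.2).
        new_pi, new_mu, new_var = [], [], []
        for j in range(2):
            weight = sum(p[j] for p in post)
            m = sum(p[j] * r for p, r in zip(post, returns)) / weight
            v = sum(p[j] * (r - m) ** 2 for p, r in zip(post, returns)) / weight
            new_pi.append(weight / T)
            new_mu.append(m)
            new_var.append(v)

        # Step 4: stop once the parameters no longer change noticeably.
        old = pi + mu + var
        new = new_pi + new_mu + new_var
        change = max(abs(a - b) for a, b in zip(new, old))
        pi, mu, var = new_pi, new_mu, new_var
        if change < tol:
            break
    return pi, mu, var


# Simulated returns (an assumption for illustration): a calm regime and
# a rarer, more volatile stress regime.
rng = random.Random(0)
sample = [
    rng.gauss(-0.03, 0.05) if rng.random() < 0.2 else rng.gauss(0.01, 0.01)
    for _ in range(5000)
]
pi_hat, mu_hat, var_hat = em_two_gaussians(
    sample, pi=[0.5, 0.5], mu=[0.0, -0.01], var=[1e-4, 1e-3]
)
```

A useful sanity check is that, after any M-step, the mixture mean $\sum_j \pi_j \mu_j$ equals the sample mean, since the posterior probabilities sum to one for every observation.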
2.3 Risk Measures
In this section risk measures and their properties are described. The most commonly used risk measures in finance are introduced which can then be utilized for constructing risk budgeting portfolios.
2.3.1 Coherent Risk Measures
Let $\mathcal{X}$ be a linear vector space of random variables X which represent the values of different portfolios at time 1. A function R that assigns real values to each X in $\mathcal{X}$ is called a risk measure. The risk of X is thereby denoted R(X) and can be interpreted to be the amount of capital that needs to be invested into a reference instrument with percentage return R0 at time 0 in order to yield an acceptable position. The reference instrument can, in this case, be considered to be the risk-free zero-coupon bond that matures at time 1. No capital is needed to be added to a portfolio for the position to be acceptable if R(X) ≤ 0.
In order for a risk measure R to be coherent, it needs to have the following properties [20]:
i) Monotonicity: if X2 ≤ X1, then R(X1) ≤ R(X2)
ii) Subadditivity: R(X1 + X2) ≤ R(X1) + R(X2)
iii) Positive homogeneity: R(λX) = λR(X) for all λ ≥ 0
iv) Translation invariance: R(X + cR0) = R(X) − c for a real number c
The property monotonicity indicates that if the current time is 0 and a position X1 has a greater value than another position X2 at time 1 for sure, then the position X1 with the greater value will have less risk compared to the other position X2.
Subadditivity rewards diversification by ensuring that the total risk of the portfolio X1 + X2 cannot be larger than the sum of the risk of X1 and X2 separately.
Positive homogeneity means that if we scale a portfolio with a factor λ we also scale the total risk with the same factor. It also implies that R(0) = 0.
Translation invariance indicates that if an amount of cash of value c is added to the portfolio and is invested in the reference instrument, a risk-free zero-coupon bond in this case, it will decrease the risk by the same amount c [20].
2.3.2 Volatility

Volatility is a common measure of risk in financial settings that is normally measured in terms of standard deviation. Formally, the historical volatility can therefore be calculated with the following expression

$$\sigma = \sqrt{\frac{1}{N - 1} \sum_{t=1}^{N} (r_t - \bar{r})^2}$$

where $N$ is the number of observed returns, $r_t$ the return at time $t$ and $\bar{r}$ the arithmetic mean of all $N$ returns [22].
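As a small illustration of the expression above, the sketch below computes the historical volatility of a short, made-up return series. The function name `historical_volatility` and the data are assumptions for the example only.

```python
import math


def historical_volatility(returns):
    """Sample standard deviation of returns with the N - 1 denominator."""
    n = len(returns)
    mean = sum(returns) / n
    return math.sqrt(sum((r - mean) ** 2 for r in returns) / (n - 1))


# Hypothetical daily returns.
daily_returns = [0.012, -0.008, 0.004, -0.015, 0.007]
sigma = historical_volatility(daily_returns)
```

The same quantity is returned by `statistics.stdev` in the Python standard library, which also uses the N − 1 denominator.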
Another commonly used measure for volatility is the implied volatility which can be derived from the options market using the well-known Black and Scholes formula. Thus, this is a measure of how the market anticipates future volatility in contrast to the historical volatility which considers backward-looking data [23].
Volatility has been criticized as a risk measure in financial settings. To begin with, some argue that, since the volatility increases both when an asset experiences large losses and gains, it is sub-optimal since many would appreciate the gain but not the loss [24]. Also, the volatility measure is often used with an assumption of Gaussian distributed data. It has however been proven that many financial asset returns have heavier tails than what would be expected by the Gaussian distribution [5].
2.3.3 Value-at-Risk

Value-at-risk is a risk measure that considers the tail of the distribution of losses, that is the most negative outcomes, compared to volatility, which considers overall deviations from the mean. The value-at-risk $\mathrm{VaR}_p(X)$ for a risk level $p$ and a portfolio with value $X$ at time 1 can be defined as the following [20]

$$\mathrm{VaR}_p(X) = \min\{m : P(L \le m) \ge 1 - p\}$$

where $L$ is the discounted portfolio loss $L = -X/R_0$ and $R_0$ is the return of a risk-free asset in percent between time 0 and 1. $\mathrm{VaR}_p(X)$ can then be described as the minimal value $m$ such that the probability of the discounted portfolio loss $L$ being no more than $m$ is at least $1 - p$.
Value-at-risk can statistically be defined as the $(1-p)$-quantile of the discounted portfolio loss $L$, i.e.

$$\mathrm{VaR}_p(X) = F_L^{-1}(1 - p).$$

The $(1-p)$-quantile of a random variable $L$ that has the distribution function $F_L$ is defined in the expression

$$F_L^{-1}(1 - p) := \min\{m : F_L(m) \ge 1 - p\}.$$

If $F_L$ is strictly increasing then $F_L^{-1}$ is the regular inverse of the distribution function $F_L$. If $F_L$ is both strictly increasing and continuous then $F_L^{-1}(1 - p)$ is the unique value $m$ such that $F_L(m) = 1 - p$.
The value-at-risk can be estimated using empirical data, without assuming any parametric distribution for the data. This empirical estimate of VaR calculated for a sample $\{L_1, L_2, \ldots, L_n\}$ of $n$ observations of the discounted portfolio loss $L$ is given by

$$\widehat{\mathrm{VaR}}_p(X) = L_{\lfloor np \rfloor + 1, n}$$

where $L_{1,n} \ge L_{2,n} \ge \ldots \ge L_{n,n}$ is the ordered sample of $\{L_1, \ldots, L_n\}$ and $\lfloor z \rfloor$ is the integer part of $z$. It is therefore simply the empirical $(1-p)$-quantile of the sample of $L$ [20].
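The empirical estimate above amounts to sorting the losses and picking one order statistic. The sketch below illustrates this under the assumption that the losses are already discounted; the function name `empirical_var` and the sample values are hypothetical.

```python
import math


def empirical_var(losses, p):
    """Empirical VaR_p: the (floor(n * p) + 1)-th largest loss."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    ordered = sorted(losses, reverse=True)  # L_{1,n} >= ... >= L_{n,n}
    k = math.floor(len(ordered) * p)        # floor(n * p)
    return ordered[k]                       # 0-based index k is L_{k+1,n}


# Hypothetical sample of ten discounted losses (negative values are gains).
losses = [-2.0, 0.5, 3.0, -1.0, 7.0, 1.5, -0.5, 4.0, 0.0, 2.0]
var_20 = empirical_var(losses, 0.20)
```

With ten observations and p = 0.20, the estimate is the third largest loss in the sample.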
A significant disadvantage of using VaR as a risk measure is that it is not coherent since it is not sub-additive. This means that the total risk of a portfolio of multiple assets could be larger than the sum of individual assets’ risk [25]. Diversification does therefore not
necessarily provide a reduction in risk when the VaR is used as a risk measure. A further disadvantage of using VaR as a risk measure is that it does not capture the events that happen at the far end of the tail. This is a problem since a distribution could, for instance, have events with low probability but with losses much greater than the VaR at the chosen risk level. This problem is visualised in figure 2.1, where the right tail corresponds to the most negative outcomes and where the tail is heavier on the right than on the left.
Figure 2.1: Illustration of VaR for a density function fL of losses
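The lack of subadditivity can be seen in a standard textbook-style example, sketched below. It is not taken from this thesis: two independent loans each lose 100 with probability 4% and nothing otherwise, discounting is ignored, and the helper name `var_discrete` is hypothetical.

```python
from itertools import product


def var_discrete(dist, p):
    """VaR_p = min{m : P(L <= m) >= 1 - p} for a discrete loss distribution.

    dist maps each possible loss to its probability.
    """
    cumulative = 0.0
    for loss in sorted(dist):
        cumulative += dist[loss]
        if cumulative >= 1.0 - p - 1e-12:  # small tolerance for rounding
            return loss
    return max(dist)


# One loan: loss of 100 with probability 0.04, otherwise no loss.
loan = {0.0: 0.96, 100.0: 0.04}

# Loss distribution of a portfolio holding two independent such loans.
portfolio = {}
for (l1, q1), (l2, q2) in product(loan.items(), repeat=2):
    portfolio[l1 + l2] = portfolio.get(l1 + l2, 0.0) + q1 * q2

p = 0.05
var_single = var_discrete(loan, p)
var_sum = var_discrete(portfolio, p)
```

At the 5% level each loan on its own has a VaR of zero, because the default probability is below 5%. The probability that at least one of the two loans defaults is 7.84%, so the VaR of the combined position is 100, which exceeds the sum of the individual VaRs.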
2.3.4 Conditional Value-at-Risk

Value-at-risk only considers the distribution up to a certain point in the tail and does therefore not take into account the risk in the far end of the tail as mentioned in the previous section. In order to cope with this, another risk measure denoted conditional value-at-risk is often used. Practically, the conditional value-at-risk is the average of the value-at-risk between the risk level and the end of the tail in the distribution and will therefore incorporate all the events in the tail which is illustrated in figure 2.2. Formally, the conditional value-at-risk at the risk level $p$ can thus be written as

$$\mathrm{CVaR}_p(X) = \frac{1}{p} \int_0^p \mathrm{VaR}_u(X)\,du.$$

Similarly, as with value-at-risk, the CVaR can also be estimated directly from empirical data. Then the empirical CVaR can be described by the following expression
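$$\widehat{\mathrm{CVaR}}_p(X) = \frac{1}{p} \int_0^p L_{\lfloor nu \rfloor + 1, n}\,du = \frac{1}{p} \left( \sum_{k=1}^{\lfloor np \rfloor} \frac{L_{k,n}}{n} + \left(p - \frac{\lfloor np \rfloor}{n}\right) L_{\lfloor np \rfloor + 1, n} \right)$$

where $L_{1,n} \ge L_{2,n} \ge \ldots \ge L_{n,n}$ is the ordered sample of losses as described above.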
CVaR is also called expected shortfall (ES). This risk measure is, in contrast to value-at-risk, a coherent risk measure [20].
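The empirical CVaR expression can be evaluated directly from a sorted sample. The following sketch assumes discounted losses as input; the function name `empirical_cvar` and the data are hypothetical.

```python
import math


def empirical_cvar(losses, p):
    """Empirical CVaR_p from the ordered sample L_{1,n} >= ... >= L_{n,n}."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    ordered = sorted(losses, reverse=True)
    n = len(ordered)
    k = math.floor(n * p)                    # floor(n * p)
    tail_sum = sum(ordered[:k]) / n          # sum of L_{k,n} / n over the tail
    remainder = (p - k / n) * ordered[k]     # partial weight on L_{floor(np)+1,n}
    return (tail_sum + remainder) / p


# Hypothetical sample of ten discounted losses.
losses = [-2.0, 0.5, 3.0, -1.0, 7.0, 1.5, -0.5, 4.0, 0.0, 2.0]
cvar_20 = empirical_cvar(losses, 0.20)
cvar_25 = empirical_cvar(losses, 0.25)
```

When $np$ is an integer the remainder term vanishes and the estimate is the plain average of the $np$ largest losses. For p = 0.25 the third largest loss enters with the fractional weight $p - \lfloor np \rfloor / n = 0.05$.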
Figure 2.2: Illustration of CVaR for a density function fL of losses
2.4 Risk Budget Portfolios
Risk budgeting is an alternative way of creating portfolios which instead of concentrating on capital distribution concentrates on the distribution of risk among assets. The idea is that each asset should contribute with some predetermined amount of risk to the total risk of the portfolio. To form this portfolio, it is thus first necessary to determine how much risk each asset should contribute to the portfolio and then calculate the asset allocation in terms of capital. A special case of this is when each asset contributes with the same amount of risk. This is usually called a risk parity portfolio or equally weighted risk contribution (ERC) portfolio.
To formulate this mathematically, the risk budgeting portfolio can be defined by the following system of nonlinear equations, see [26],
$$\begin{cases}
RC_i(x) = b_i R(x) \\
b_i > 0 \\
x_i \ge 0 \\
\sum_{i=1}^{n} b_i = 1 \\
\sum_{i=1}^{n} x_i = 1
\end{cases} \tag{2.5}$$

where $b_i$ is the risk budget of asset $i$ which has the constraint that it cannot be set to zero, i.e. $b_i > 0$, which is necessary for the risk budget portfolio to be unique. Furthermore, $x_i$ is the portfolio weight of asset $i$ and $x = (x_1, x_2, \ldots, x_n)$ is the vector of portfolio weights for a portfolio with $n$ assets. The portfolio weight $x_i$ has the constraint $x_i \ge 0$ which means that long-only portfolios are the only portfolios that are considered. $RC_i(x)$ is the risk contribution that asset $i$ has to the total risk of the portfolio $x$ which is measured using the risk measure $R(x)$.
2.4.1 Existence and Uniqueness of the Portfolio

In the remainder of this report, the coherent risk measure CVaR will be utilized when constructing risk budgeting portfolios and then the following restriction must be imposed in order for the risk budgeting portfolio to exist [14]

$$R(x) \ge 0, \quad \text{for all } x.$$

The risk measure needs to be positive in order for the portfolio to exist and be unique. A coherent risk measure has, as described in section 2.3.1, the property of positive homogeneity. This property means that $R(\lambda x) = \lambda R(x)$ where $\lambda$ is a positive scalar. If there is a portfolio where $R(x) < 0$ it would mean that the portfolio could be leveraged with a scaling factor $\lambda > 1$ such that $R(\lambda x) < R(x) < 0$. It also follows that

$$\lim_{\lambda \to \infty} R(\lambda x) = -\infty.$$

This is the reason why the risk measure $R(x)$ needs to be positive. In order for a risk budget portfolio to exist and be unique the risk measure thus needs to be positive and the risk budgets $b_i$ need to be positive as well, as described in section 2.4.
2.4.2 Euler’s Theorem on Homogeneous Functions

An important theorem needed for determining the risk contribution of specific assets is Euler’s theorem on homogeneous functions. This theorem is explained below [27].
A function $f : U \to \mathbb{R}$, $U \subset \mathbb{R}^n$, is homogeneous of degree $\tau$ if the following equation holds for any $\lambda > 0$

$$f(\lambda x) = \lambda^{\tau} f(x), \quad x \in U.$$

Thus, the positive homogeneity property for a coherent risk measure in section 2.3.1 is for a degree of 1.

Now let $U \subset \mathbb{R}^n$ be an open set and $f : U \to \mathbb{R}$ be a continuously differentiable function. Then $f$ is homogeneous of degree $\tau$ if and only if it satisfies the following equation

$$\tau f(u) = \sum_{i=1}^{n} u_i \frac{\partial f(u)}{\partial u_i}.$$

If the function $f(u)$ is the risk measure $R(x)$ and is continuously differentiable and homogeneous of degree 1, the equation becomes

$$R(x) = \sum_{i=1}^{n} x_i \frac{\partial R}{\partial x_i} = \sum_{i=1}^{n} RC_i.$$
The marginal risk contribution each asset i has to the total risk measure R(x) when increasing the portfolio weight xi can then be described by the partial derivative of the risk measure with respect to the weight of asset i. The sum of all assets’ risk contribution then adds up to the risk measure R(x).
2.4.3 Volatility as a Risk Measure

The established risk measure for risk budget portfolios is volatility and the use of this risk measure for risk budgeting portfolios is also conventionally based on an assumption of Gaussian distributed asset returns. If the volatility $\sigma$ is used as risk measure it means that

$$R(x) = \sigma(x) = \sqrt{x^T \Sigma x} = \sqrt{\sum_i x_i^2 \sigma_i^2 + \sum_i \sum_{j \ne i} x_i x_j \sigma_{ij}}$$

where $\Sigma$ is the covariance matrix, $\sigma_i^2$ is the variance of the return of asset $i$ and $\sigma_{ij}$ is the covariance of the returns of assets $i$ and $j$. Here, $\sigma(x)$ describes the total risk of the portfolio measured in volatility. The marginal risk contribution can then be defined as the following, see [28],
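$$\frac{\partial R(x)}{\partial x_i} = \frac{\partial \sigma(x)}{\partial x_i} = \frac{(\Sigma x)_i}{\sqrt{x^T \Sigma x}} = \frac{x_i \sigma_i^2 + \sum_{j \ne i} x_j \sigma_{ij}}{\sqrt{x^T \Sigma x}},$$

which represents the change in volatility of the total portfolio that comes from a change of the portfolio weight $x_i$ of asset $i$. Here, $(\Sigma x)_i$ denotes the $i$th component of the vector issued from the product of $\Sigma$ with $x$.

Then the risk contribution $RC_i(x)$ of asset $i$ can be defined by

$$RC_i(x) := \sigma_i(x) = x_i \cdot \frac{\partial \sigma(x)}{\partial x_i}.$$

This then leads to the following Euler decomposition, as described in section 2.4.2, which indicates that the total risk of the portfolio can be seen as a sum of the risk contributions of the assets $i = 1, 2, \ldots, n$.

$$\begin{aligned}
\sum_{i=1}^{n} \sigma_i(x) &= \sum_{i=1}^{n} x_i \cdot \frac{\partial \sigma(x)}{\partial x_i} \\
&= \sum_{i=1}^{n} x_i \cdot \frac{(\Sigma x)_i}{\sqrt{x^T \Sigma x}} \\
&= x^T \cdot \frac{\Sigma x}{\sqrt{x^T \Sigma x}} \\
&= \sqrt{x^T \Sigma x} \\
&= \sigma(x)
\end{aligned}$$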
For the case of a risk parity or ERC portfolio the risk contributions would be equal for all assets such that σi(x) = σj(x) for all i, j = 1, 2, 3, ..., n. This can also be expressed in the following way
$$\sigma_i(x) = \frac{\sigma(x)}{n}.$$
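The volatility-based risk contributions and the Euler decomposition can be checked numerically. The sketch below uses a hypothetical two-asset covariance matrix (volatilities 0.2 and 0.1, correlation 0.3) and the function name `volatility_risk_contributions` is an assumption. For two assets, the ERC condition reduces to $x_1 \sigma_1 = x_2 \sigma_2$ regardless of the correlation, which gives the weights used here.

```python
import math


def volatility_risk_contributions(x, cov):
    """Return (sigma(x), [RC_1, ..., RC_n]) for weights x and covariance cov."""
    n = len(x)
    # (Sigma x)_i for each asset i.
    cov_x = [sum(cov[i][j] * x[j] for j in range(n)) for i in range(n)]
    # Portfolio volatility sqrt(x^T Sigma x).
    sigma = math.sqrt(sum(x[i] * cov_x[i] for i in range(n)))
    # RC_i = x_i * (Sigma x)_i / sigma(x).
    rc = [x[i] * cov_x[i] / sigma for i in range(n)]
    return sigma, rc


# Hypothetical covariance matrix: vol 0.2 and 0.1, correlation 0.3.
cov = [[0.04, 0.006],
       [0.006, 0.01]]

# Two-asset ERC weights: x_1 = sigma_2 / (sigma_1 + sigma_2).
x_erc = [1.0 / 3.0, 2.0 / 3.0]
sigma, rc = volatility_risk_contributions(x_erc, cov)
```

The risk contributions add up to the portfolio volatility, and for these weights each asset contributes $\sigma(x)/2$.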
2.4.4 CVaR as a Risk Measure

The methods for creating a risk budgeting model can also be adjusted by considering conditional value-at-risk (CVaR) as a risk measure instead of volatility. The use of CVaR in
risk budgeting can be described in the following way.
To begin with, the CVaR at risk level $p$ of the full portfolio can be written

$$R(x) = \mathrm{CVaR}_p(x) = \sum_{i=1}^{n} x_i \frac{\partial \mathrm{CVaR}_p(x)}{\partial x_i} = \sum_{i=1}^{n} RC_i^{\mathrm{CVaR}_p}(x)$$

where $RC_i^{\mathrm{CVaR}_p}(x) = x_i \frac{\partial \mathrm{CVaR}_p(x)}{\partial x_i}$ is the risk contribution, in terms of $\mathrm{CVaR}_p$, to the portfolio's total risk from asset $i$. This is in accordance with the Euler decomposition described in section 2.4.2 [10].
If we replace the unknown true distribution function with the empirical one, the latter leads to

$$R(x) = \mathrm{CVaR}_p(x) = \frac{1}{\lfloor pT \rfloor} \sum_{k=1}^{\lfloor pT \rfloor} L_p^{(k)}(x)$$

where $L_p^{(1)}(x) \ge L_p^{(2)}(x) \ge \ldots \ge L_p^{(T)}(x)$ are the sorted portfolio losses. Clearly $L_p^{(k)}$ can be written as the following, where $N$ is the number of assets in the portfolio,

$$L_p^{(k)}(x) = \sum_{i=1}^{N} x_i\, l_i(x)^{(k)},$$

where $l_i(x)^{(k)}$ is the loss of asset $i$ at the observation that produces the $k$th largest portfolio loss $L_p^{(k)}$. Thus if the worst portfolio loss, $L_p^{(1)}$, occurs for instance at observation 200, $l_i(x)^{(1)}$ will be the individual asset losses that occur at the same observation 200. The partial derivative of $\mathrm{CVaR}_p(x)$ with respect to $x_i$, which is the marginal risk contribution of asset $i$, then becomes

$$\frac{\partial \mathrm{CVaR}_p(x)}{\partial x_i} = \frac{1}{\lfloor pT \rfloor} \sum_{k=1}^{\lfloor pT \rfloor} l_i(x)^{(k)}.$$

The risk contribution $RC_i^{\mathrm{CVaR}}$ of asset $i$ is then the following expression [10]

$$RC_i^{\mathrm{CVaR}}(x) = x_i \frac{\partial \mathrm{CVaR}_p(x)}{\partial x_i} = x_i \frac{1}{\lfloor pT \rfloor} \sum_{k=1}^{\lfloor pT \rfloor} l_i(x)^{(k)}. \tag{2.6}$$
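Equation (2.6) can be illustrated with a short sketch that works on a matrix of asset losses. The function name `cvar_risk_contributions`, the simulated Gaussian losses and the weights are assumptions for the example and are not the data or code used in this thesis.

```python
import math
import random


def cvar_risk_contributions(x, asset_losses, p):
    """Empirical CVaR_p of the portfolio and the contributions in (2.6).

    asset_losses[t][i] is the loss of asset i at observation t.
    """
    T = len(asset_losses)
    n_tail = math.floor(p * T)
    if n_tail < 1:
        raise ValueError("p * T must be at least 1")

    # Portfolio loss at each observation.
    portfolio_losses = [
        sum(xi * li for xi, li in zip(x, row)) for row in asset_losses
    ]
    # Observations that produce the floor(pT) largest portfolio losses.
    worst = sorted(range(T), key=lambda t: portfolio_losses[t], reverse=True)
    worst = worst[:n_tail]

    cvar = sum(portfolio_losses[t] for t in worst) / n_tail
    # Marginal contribution: average asset loss over the same observations.
    marginal = [
        sum(asset_losses[t][i] for t in worst) / n_tail for i in range(len(x))
    ]
    rc = [xi * mi for xi, mi in zip(x, marginal)]
    return cvar, rc


# Simulated losses for two hypothetical assets.
rng = random.Random(1)
asset_losses = [[rng.gauss(0.0, 0.01), rng.gauss(0.0, 0.03)] for _ in range(1000)]
weights = [0.6, 0.4]
cvar, rc = cvar_risk_contributions(weights, asset_losses, 0.05)
```

Because the same tail observations are used for the portfolio and for each asset, the contributions sum to the empirical portfolio CVaR, in line with the Euler decomposition.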
2.4.5 CVaR as a Risk Measure with Gaussian Mixture Model

In order to form a risk budgeting model utilizing conditional value-at-risk as a risk measure and Gaussian mixture models to model returns, an analytical expression for the marginal risk contributions from the assets can be derived.
Bruder et al. [14] derive, in their appendix 4, an expression for the conditional value-at-risk of the full portfolio under the distribution specified in section 2.1.2. They then derive an expression for the marginal conditional value-at-risk. The derivations below follow the steps in that article.
Start by setting
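$$\varphi(a) = E\left[\mathbb{1}\{a \le Y\} \cdot Y\right]$$

where $\mathbb{1}$ is the indicator function and $Y$ a Gaussian random variable such that $Y \sim N(\mu, \sigma^2)$. From this it is clear that the following expression holds, where $\phi(x)$ is the PDF and $\Phi(x)$ is the CDF of the standard Gaussian distribution. The second equality uses the change of variable $t = (y - \mu)/\sigma$.

$$\begin{aligned}
\varphi(a) &= \int_a^{\infty} \frac{y}{\sigma} \phi\left(\frac{y - \mu}{\sigma}\right) dy \\
&= \int_{\sigma^{-1}(a - \mu)}^{\infty} (\mu + \sigma t) \phi(t)\,dt \\
&= \mu \left[\Phi(t)\right]_{\sigma^{-1}(a - \mu)}^{\infty} + \frac{\sigma}{\sqrt{2\pi}} \int_{\sigma^{-1}(a - \mu)}^{\infty} t \exp\left(-\frac{1}{2} t^2\right) dt \\
&= \mu \left(1 - \Phi\left(\frac{a - \mu}{\sigma}\right)\right) + \frac{\sigma}{\sqrt{2\pi}} \left[-\exp\left(-\frac{1}{2} t^2\right)\right]_{\sigma^{-1}(a - \mu)}^{\infty} \\
&= \mu \Phi\left(-\frac{a - \mu}{\sigma}\right) + \sigma \phi\left(\frac{a - \mu}{\sigma}\right).
\end{aligned}$$

Let us now consider the loss of the portfolio $L(x)$, where $L(x) = -R(x)$ if $R(x)$ is the portfolio return. Then the conditional value-at-risk, at risk level $p$, can be defined as

$$\mathrm{CVaR}_p(x) = E\left[L(x) \mid \mathrm{VaR}_p(x) \le L(x)\right].$$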
This can be rewritten using the indicator function and its expectation as follows
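$$\begin{aligned}
\mathrm{CVaR}_p(x) &= \frac{1}{p} E\left[\mathbb{1}\{\mathrm{VaR}_p(x) \le L(x)\} \cdot L(x)\right] \\
&= \frac{1}{p} \int_{\mathrm{VaR}_p(x)}^{\infty} y\, g(y)\,dy,
\end{aligned}$$

where the density of $L(x)$ is denoted $g(y)$. By the expression for the density in section 2.1.2 and the symmetry of the Gaussian distribution it is given by

$$g(y) = (1 - \lambda) \frac{1}{\sigma_1(x)} \phi\left(\frac{y + \mu_1(x)}{\sigma_1(x)}\right) + \lambda \frac{1}{\sigma_2(x)} \phi\left(\frac{y + \mu_2(x)}{\sigma_2(x)}\right)$$

where $\mu_1(x) = x^T \mu$, $\mu_2(x) = x^T(\mu + \tilde{\mu})$, $\sigma_1^2(x) = x^T \Sigma x$ and $\sigma_2^2(x) = x^T(\Sigma + \tilde{\Sigma})x$.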
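The closed form $\varphi(a) = \mu \Phi\left(-\frac{a - \mu}{\sigma}\right) + \sigma \phi\left(\frac{a - \mu}{\sigma}\right)$ derived earlier in this section can be checked against a direct numerical integration. The sketch below is only a sanity check under assumed parameter values; the function names `phi_closed_form` and `phi_numerical`, the truncation width and the step count are hypothetical choices.

```python
from statistics import NormalDist

STD = NormalDist()  # standard Gaussian, provides pdf and cdf


def phi_closed_form(a, mu, sigma):
    """E[1{a <= Y} * Y] for Y ~ N(mu, sigma^2), using the closed form."""
    z = (a - mu) / sigma
    return mu * STD.cdf(-z) + sigma * STD.pdf(z)


def phi_numerical(a, mu, sigma, width=12.0, steps=100_000):
    """Midpoint-rule approximation of the integral of y * f_Y(y) on [a, upper].

    The upper limit mu + width * sigma truncates a negligible part of the tail.
    """
    dist = NormalDist(mu, sigma)
    upper = mu + width * sigma
    h = (upper - a) / steps
    total = 0.0
    for i in range(steps):
        y = a + (i + 0.5) * h
        total += y * dist.pdf(y)
    return total * h


# Assumed parameter values for the check.
closed = phi_closed_form(0.02, 0.01, 0.05)
approx = phi_numerical(0.02, 0.01, 0.05)
```

As the threshold $a$ decreases towards $-\infty$, the indicator is always one and the closed form tends to the mean $\mu$, which gives a second quick check.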