Portfolio Optimization on Multivariate Regime Switching GARCH Model with Normal Tempered Stable Innovation

Cheng Peng∗ Young Shin Kim†

College of Business, Stony Brook University, New York, USA December 1, 2020

Abstract We propose a Markov regime switching GARCH model with mul- tivariate normal tempered stable innovation to accommodate fat tails and other stylized facts in returns of financial assets. The model is used to simulate sample paths as input for portfolio optimization with risk measures, namely, conditional and conditional draw- down. The motivation is to have a portfolio that alleviates left tail events by combining models that incorporates fat tail with optimiza- tion that focuses on tail risk. In-sample test is conducted to demon-

arXiv:2009.11367v2 [q-fin.RM] 30 Nov 2020 strate goodness of fit. Out-of-sample test shows that our approach yields higher performance measured by performance ratios than the market and equally weighted portfolio in recent years which includes some of the most volatile periods in history. We also find that subop- timal portfolios with higher return constraints tend to be outperform- ing. ∗[email protected][email protected]

1 1 Introduction

Empirical studies have found in the return of various financial instruments skewness and leptokurtotic—asymmetry, and higher peak around the mean with fat tails. Normal distribution has long been recognized as insufficient to accommodate these stylized fact, relying on which could drastically underes- timate the tail risk of a portfolio. However, few has adequately incorporate this fact into modeling and decision-making. The α-Stable distribution has been proposed to accommodate them. The lack of existence of moments of stable distributions could sometimes cause difficulties, to which tempered stable distribution presents a potential solution. The class of tempered sta- ble distributions is derived from the α-stable distribution by tempering tails. Recent findings from Kim et al. (2012), Kurosaki and Kim (2018), Anand et al. (2016) use the normal tempered stable (NTS) distribution and suc- cessfully model the stock returns with high accuracy. As is pointed out, the above mentioned properties not only exist in raw returns but also in GARCH- filtered residuals of time series model. This motivates us to use multivariate NTS distribution to accommodate the asymmetry, interdependence, and fat tails of the joint innovations in our model. Other distributions like Pearson distribution and generalized hyperbolic distribution have also been proposed to this end, both having t-distribution as a special case. The estimation on the degree of freedom parameter of t-distribution in GARCH model is found to be unreliable. Study in Kim et al. (2011) finds that normal and t-distribution are rejected and NTS distribution is favored on the residu- als of time series model. Another study in Shi and Feng (2016) found that tempered stable distribution with exponential tempering is a better choice than t or GED distribution for innovation distribution of Markov-Regime- Switching (MRS) GARCH model.. While generalized hyperbolic distribution is very flexible, it has 5 parameters which might create difficulty in estima- tion. NTS distribution has only 3 standardized parameters and is flexible enough to serve the purpose. One of the drawbacks of GARCH model is the difficulty in dealing with volatility spikes, which could indicate that the market is switching within regimes. Various types of specification of MRS-GARCH models has been pro- posed, for example in Hamilton (1996), many of which suggest that regime switching GARCH model achieves a better fit for empirical data. Natu- rally, the most general model allows all parameters to switch among regimes. However, as is shown in Henneke et al. (2011), the sampling procedure in

2 MCMC method for the estimation of such model is time-consuming and renders it improper for practical use. Recent developments in algorithm in Billioa et al. (2016) improves the estimation speed. In our model, we use the regime switching GARCH model specified in Haas et al. (2004) for simplicity. This model circumvents the path dependence problem in Markov model by specifying parallel GARCH models. It’s a recognized fact that the correlation of financial assets is time- varying. Specification of multivariate GARCH models with both regime- specific correlations and dynamics involves a balance between flexi- bility and tractability. The model in Haas et al. (2004) has been generalized in Haas and Liu (2004) to a multivariate case. Unfortunately, it suffers from the curse of dimensionality and thus is unsuitable for a high dimensional empirical study. In our model, we decompose variance and correlation so that the variance of each asset evolves independently according to a univari- ate MRS-GARCH model. The correlation is incorporated in the innovations modeled by a flexible Markov swiching multivariate NTS distribution. Modern portfolio theory is usually framed as a trade-off between return and risk. Classical Markowitz Model intends to find the portfolio with the highest Sharpe ratio. However, variance has been criticized for not containing enough information on the tail of distribution. Moreover, the correlation is not sufficient to describe the interdependence of asset return with non- . Current regulations for financial business utilize the concept of Value at Risk(VaR), which is the percentile of the loss distribution, to model the risk of left tail events. However, there are several undesired properties that rendered it an insufficient criterion. First, it’s not a coherent risk measure due to lack of sub-additivity. Second, VaR is difficult to to optimize for non-normal distributions due to non-convex and non-smooth as a function of positions with multiple local extrema, which causes difficulty in developing efficient optimization techniques. Third, a single percentile is insufficient to describe the tail behavior of a distribution, which might lead to an underestimation of risk. Theory and algorithm for portfolio optimization with CVaR measure is proposed in Krokhmal et al. (2001) Rockafellar and Uryasev (2000) to address these issues. For continuous distributions, CVaR is defined as a conditional expectation of losses exceeding a certain VaR level. For discrete distributions, CVaR is defined as a weighted average of some VaR levels that exceed a specified VaR level. In this way, CVaR concerns both VaR and the losses

3 exceeding VaR. As a convex function of asset allocation, a coherent risk measure and a more informative statistic, CVaR serves as an ideal alternative to VaR. A study on the comparison of the two measure can be found in Sarykalin et al. (2014). Drawdown has been a constant concern for investors. It’s much more difficult to climb out of a drawdown than drop into one, considering that it takes 100% return to recover from 50% relative drawdown. Thus maxi- mum drawdown is often used in evaluation of performance of a portfolio. As is mentioned in Checkhlov et al. (2005), a client account of a Commodity Trading Advisor(CTA) will be issued a warning or shut down with a draw- down higher than 15% or longer than 2 years, even if small. Investors are highly unlikely to tolerate a drawdown greater than 50%. However, maxi- mum drawdown only considers the worst case which may only occur under some very special circumstances. It’s also very sensitive to the testing period and asset allocation. On the other hand, small drawdowns included in aver- age drawdown are acceptable and might be caused by pure noise, of which minimization might not make sense. For instance, a Brownian motion would have drawdowns in a simulated sample path. A relevant criterion to CVaR, conditional drawdown(CDaR) is proposed in Checkhlov et al. (2005) to address these concerns. While CVaR only concerns the distribution of return, CDaR concerns the sample path. CDaR is essentially a certain CVaR level of the drawdowns. By this, we overcome the drawbacks of average drawdown and maximum drawdown. CDaR not only take the depth of drawdowns into consideration, but also the length of them, corresponding to the concerns by investors we mentioned earlier. Since the CDaR risk measure is the application of CVaR in a dynamic case, it naturally holds nice properties of CVaR such as convexity with respect to asset allocation. Optimization method with constraint on CDaR has also been developed in relevant papers. For an optimization procedure to lead to desired optimal result, the input is of critical importance. With careless input, the portfolio optimization could magnify the impact of error and lead to unreasonable result. Using historical sample path like in Krokhmal et al. (2002) means that we assume what happened in the past will happen in the future, which is an assumption that needs careful examination. It’s found in Lim et al. (2011) that estimation errors of CVaR and the mean is large, resulting in unreliable allocation. An alternative way is to use multiple simulated sample paths. This motivates us to propose a model that incorporates fat tails to generate simulation as

4 input for the optimization with tail risk measures. To this end, we propose a Markov regime switching GARCH model with multivariate standard normal tempered stable innovations. In-sample tests are performed to examine the goodness of fit. Simulation is performed with the model to generate multiple sample paths. To demonstrate the effective- ness of the proposed model as well as the tail risk measures, we conduct an ou-of-sample study on the performance. The tested period includes some of the most volatile time in history caused by COVID-19 pandemic and inter- national trade tensions. The remainder of the paper is organized as follows. Section 2 introduces the preliminaries on NTS distribution and GARCH model. Section 3 specifies our model, methods for estimation and simulation. Section 4 is an empirical study on in-sample goodness of fit and out-of-sample performance in recent years.

2 Preliminaries

This section reviews background knowledge about Normal Tempered Sta- ble (NTS) Distribution, GARCH and regime switching GARCH model, and various risk measures.

2.1 Risk Measures In this section we discuss and summarize popular risk measures in finance such as Value at Risk (VaR), Conditional VaR (CVaR), and conditional drawdown at risk (CDaR). Definitions of CDaR, and related properties of drawdown are following Checkhlov et al. (2005). Consider a probability space (Ω, F∞, P) and a portfolio with N assets. Suppose that Pn is a stochastic process of price for the n-th asset in the portfolio on the space

Pn : [0, ∞) × Ω −→ R, n = 1, 2, ··· ,N with the real initial value Pn(0, ·) = Pn,0 > 0. Let Rn be a stochastic process of the return for the n-th asset between time 0 to time t > 0 as

Pn(t, ω) − Pn,0 Rn(t, ω) = , t > 0, Pn,0

5 with Rn(0, ·) = 0. Let R be a N-dimensional stochastic process

N R : [0, ∞) × Ω −→ R with T R(t, ω) = (R1(t, ω), R2(t, ω), ··· , RN (t, ω)) . T N Let x = (x1, x2, ··· , xN ) ∈ R be a capital allocation vector of the portfolio PN 1 satisfying n=1 xn = 1 and xn ∈ [0, 1] for all n ∈ {1, 2, ··· ,N} . Then the portfolio return R(x, t, ω) at time t > 0 is equal to

N X R(x, t, ω) = xnRn(t, ω). n=1 The definition of VaR for R(x, t, ·) with the confidence level η is

VaRη(R(x, t, ·)) = − inf{x|P(R(x, t, ·) ≤ x) > 1 − η}. If R(x, t, ·) has a continuous cumulative distribution, then we have

−1 VaRη(R(x, t, ·)) = −FR(x,t,·)(1 − η),

−1 where FR(x,t,·) is the inverse cumulative distribution function of R(x, t, ·). The definition of CVaR for R(x, t, ·) with the confidence level η is

1 Z 1−η CVaRη(R(x, t, ·)) = VaR(1−x)(R(x, t, ·))dx. 1 − η 0 If R(x, t, ·) has a continuous cumulative distribution, then we have

CVaRη(R(x, t, ·)) = −E [R(x, t, ·)|R(x, t, ·) < −VaRη(R(x, t, ·))] .

Let T > 0, and ω ∈ Ω, then (R(x, t, ω))t∈[0,T ] = {R(x, t, ω)|t ∈ [0,T ]} is one sample path of portfolio return with the capital allocation vector x from time 0 to T . The drawdown (DD(·, ω))t∈[0,T ] of the portfolio return is defined by DD(t, ω) = max R(x, s, ω) − R(x, t, ω), for t ∈ [0,T ] (1) s∈[0,t]

1We consider a long only portfolio in this paper.

6 is a risk measure assessing the decline from a historical peak in some variable, and {DD(t, ω)|t ∈ [0,T ], ω ∈ Ω} is a stochastic process of the drawdown. The average drawdown (ADD) up to time T for ω ∈ Ω is defined as 1 Z T ADD(T, ω) = DD(t, ω)dt (2) T 0 that is the time average of drawdowns observed between time 0 to T . The Maximum drawdown (ADD) up to time T for ω ∈ Ω is defined as MDD(T, ω) = max DD(t, ω) (3) t∈(0,T ) that is the maximum of drawdowns that have occurred up to time T for ω ∈ Ω. Let 1 Z T FDD(z, ω) = 1DD(t,ω)≤zdt (4) T 0 and ( inf{z|FDD(z, ω) ≥ η} if η ∈ (0, 1] ζη(ω) = . (5) 0 if η = 0 Then the conditional drawdown at Risk (CDaR) is defined as

CDaRη(T, ω)   Z T FDD(ζη(ω), ω) − η 1 = ζη(ω) + DD(t, ω)1DD(t,ω)>ζη(ω)dt. (6) 1 − η (1 − η)T 0 According to Checkhlov et al. (2005), the CDaR also obtained by the follow- ing optimization formula  1 Z T  CDaRη(T, ω) = min ζ + max (DD(t, ω) − ζ, 0) dt ζ ηT 0

Note that CDaR0(T, ω) = ADDω(T ) and CDaR1(T, ω) = MDDω(T ). Let 1 Z T  FDD(z) = E 1DD(t,·)≤zdt (7) T 0 and ( −1 inf{z|FDD(z) ≥ η} if η ∈ (0, 1] ζη = F (η) = . (8) DD 0 if η = 0

7 Considering all the sample path {R(x, t, ω)|t ∈ [0,T ], ω ∈ Ω}, we define CDaR at η as

CDaRη(T )   Z T  FDD(ζ(η)) − η 1 = ζ(η) + E DD(t, ·)1DD(t,·)>ζ(η)dt . (9) 1 − η (1 − η)T 0

2.2 Scenario based estimation We discuss estimating risk measures presented in the previous section using given scenarios (historical scenario or simulation). Suppose that the time interval [0,T ] is divided by a partition {0 = t0 < t1 < ··· < tM = T }, and denote Rm(x) = R(x, tm, ·).

We select ωs ∈ Ω for s ∈ {1, 2, ··· ,S} for number of scenarios S. Then we obtain scenarios of the portfolio return process R at time tm as

s Rm(x) = R(x, tm, ωs) where m ∈ {0, 1, 2, ··· ,M}, and s ∈ {1, 2, ··· ,S}. We can calculate VaR, CVaR, DD, and CDaR under the simulated scenarios. The VaR with signif- icant level η under the simulated scenario is estimated as

( S ) 1 X VaRη(Rm(x)) = − inf u 1Rs (x) 1 − η . (10) S m s=1

(k) s Let Rm (x) be the k-th smallest value in {Rm(x) | s = 1, 2, ··· ,S}, then CVaR at the significant level η is estimated according to the formula

CVaR1−η(Rm(x))  dS(1−η)e−1  1 1 X  dS(1 − η)e − 1 = − R(k)(x) + (1 − η) − R(dS(1−η)e)(x) , 1 − η S m S m  k=1 (11)

8 where dxe denotes the smallest integer larger than x (See Rachev et al. (2008) for the details). Before we discuss the scenario based estimation on CDaR, we need to define the accumulated uncompounded return. Let rn be a random variable of the return for the n-th asset between time ti−1 to time ti as

Pn(ti, ω) − Pn(ti−1, ω) rn(ti, ω) = , t > 0, Pn(ti−1, ω) with rn(0, ·) = 0. Let Q be a N-dimensional random variable T Q(ti, ω) = (r1(ti, ωs), r2(ti, ωs), ··· , rN (ti, ωs)) .

T N Let x = (x1, x2, ··· , xN ) ∈ R be a capital allocation vector of the portfolio PN 2 satisfying n=1 xn = 1 and xn ∈ [0, 1] for all n ∈ {1, 2, ··· ,N} . Then the portfolio return Q(x, ti, ωs) between time ti−1 and time ti is equal to

N X Q(x, ti, ωs) = xnQn(ti, ωs). n=1 The scenario based estimation on accumulated uncompounded return at time ti is t X U(x, ti, ωs) = Qn(x, ti, ωs). t=1 s Denote U(x, ti, ωs) as Ui (x). Using (1), (2), and (3), DD, ADD, and MDD on a single simulated scenario are estimated by

s s DDm,s := DD(tm, ωs) = max Uj (x) − Um(x), j∈{0,1,2,··· ,m} M 1 X ADD := ADD(t , ω ) = DD , M,s M s M m,s m=1

and MDDM,s := MDD(tM , ωs) = max DDm,s, m∈{1,2,··· ,M} respectively. Moreover, using (4), (5), (6), CDaR is estimated on a single simulated scenario by

M 1 X CDaR (T, s) := CDaR (T, ω ) = DD 1 s , η η s (1 − η)M m,s DDm,s>ζη m=1 2We consider a long only portfolio in this paper.

9 where ( s inf{z|FDD(z, s) ≥ η} if η ∈ (0, 1] ζη := ζη(ws) = . 0 if η = 0 with M 1 X F (z, s) = 1 . DD M DDm,s≤z m=1 Or it is estimated by the optimization formula

( M ) 1 X CDaRη(T, s) = min ζ + max (DDm,s − ζ, 0) . ζ (1 − η)M m=1 Using (7), (8), (9), DD and CDaR on multiple scenarios are estimated by

S 1 X DD(m) = DD S m,s s=1 and

CDaRη(T )   S M FDD(ζ(η)) − η 1 X X 1 = ζ(η) + DD 1 , (12) 1 − η (1 − η)M S m,s DDm,s>ζ(η) s=1 m=1 respectively, where S 1 X F (z) = F (z, s) DD S DD s=1 and (  inf z FDD(z) ≥ η if η ∈ (0, 1] ζη = . 0 if η = 0 When we generate scenario for each asset return in the portfolio, we s use the uncompounded return. The portfolio return Rm(x) for the capital allocation vector x is obtained by

N N m N m s X X X X X Rm(x) = xnRn(tm, ws) = xn rn(k, s) = xnrn(k, s). n=1 n=1 k=1 n=1 k=1 In this paper, we will apply time series models to generate the scenario of rn(m, s) for m ∈ {1, 2, ··· ,M}, s ∈ {1, 2, ··· ,S} and n ∈ {1, 2, ··· ,N}.

10 2.3 NTS Distribution Let T be a strictly positive random variable defined by the characteristic function for λ ∈ (0, 2) and θ > 0

1− λ ! 2θ 2 λ λ φ (u) = exp − ((θ − iu) 2 − θ 2 ) . T λ

Then T is referred to as a tempered stable subordinator with parameters λ T and θ. Suppose that ξ = (ξ1, ··· , ξN ) is a N-dimensional standard normal distributed random vector with a N×N Σ i.e. ξ ∼ Φ(0, Σ). T The N-dimensional NTS distributed random vector X = (X1, ..., XN ) is defined as √ X = µ + ν(T − 1) + T diag(γ)ξ, T N T N T where µ = (µ1, ··· , µN ) ∈ R , ν = (ν1, ··· , νN ) ∈ R , γ = (γ1, ··· , γN ) ∈ N R+ and T is independent of ξ. The N-dimensional multivariate NTS distri- bution specified above is denoted as

X ∼ MNTSN (λ, θ, ν, γ, µ, Σ). q q 2 2−λ 2θ Let µn = 0, and γn = 1 − νn( 2θ ) with |νn| < 2−λ for n ∈ {1, ··· ,N}. This yields a random variable having zero mean and unit variance. In this case, the random variable is referred to as standard NTS distributed random variable, and denoted X ∼ stdMNTSN (λ, θ, ν, Σ). The covariance matrix of X is given by 2 − λ Σ = diag(γ)Σdiag(γ) + ν>ν. X 2θ 2.4 GARCH and Regime Switching GARCH Model GARCH(p, q) model has been studied intensively as a model for volatility.

rt = η + σtt,

ut = σtt, p q 2 X 2 X 2 σt = ω + αiut−i + βiσt−i. i=1 i=1 where rt is the return at time t, σt is the variance at time t, η, ω, αi, βi are parameters. We will use GARCH(1,1) in the paper.

11 Before specifying our model, it would be clear if we first specify the uni- variate Markov regime switching model in Haas et al. (2004).

rt = η∆t + σ∆t,tt,

ut = σ∆t,tt, 2 2 2 (13) σt = ω + αut−1 + β ◦ σt−1, iid t ∼ N(0, 1). where ∆t is a that determines the regime at time t. For a model with a number of k regimes, ∆t has finite state space S = 1, ..., k and an irreducible and primitive k × k transition matrix P, with element pij = P (∆t = j | ∆t−1 = i).

rt is the return at time t, η∆t is a regime specific mean return in regime

∆t at time t, σ∆t,t is the standard deviation in regime ∆t at time t. Note that in a univariate model there are k parallel GARCH models evolving at 2 2 2 the same time. σt = (σ1,t, ...., σk,t), ω = (ω1, ...., ωk), α = (α1, ...., αk), β = (β1, ...., βk), ◦ denotes element-wise product. In this model, a number of k GARCH processes are evolving simultane- ously. The Markov chain ∆t determines which GARCH model is realized at next moment. To understand (13), see the following matrix   σ1,1 σ1,2 . . . σ1,t ... σ2,1 σ2,2 . . . σ1,t ... [σ ] =   ∆t,t ∆t=1,...,k, t=1,...T  . ..   ......  σk,1 σk,2 . . . σk,t ... Each column is a total of k parallel standard deviations at time t. Each row is the process of one regime through time 1 to T . The process is as follows Step 1: At t = 1, initial column is generated. Step 2: ∆1 decides which row is realized. For example, ∆1 = 2. That is, σ2,1 is realized in real world at t = 1. Step 3: By vector calculation, 2nd column is generated. For example, if σ2,1 is realized at t = 1, we calculate

12           σ1,2 ω1 α1 β1 σ1,1 σ2,2 ω2 α2 β2 σ2,1   =   + σ    +   ◦    .   .  2,1 1  .   .   .   .   .   .   .   .  σk,2 ωk αk βk σk,1

Step 4: ∆2 Decides which row is realized. For example, ∆2 = 1. That is, σ1,2 is realized in real world at t = 2. A regime switch from regime 2 to regime 1 takes place. Step 5: Repeat Step 1 to Step 4. It is important to note that, in the example, the realized variance at time t = 2, σ1,2, is calculated with both the realized variance at time t = 1, σ2,1, and σ1,1. This is obvious from the vector equation shown above. σ1,1 is parallel and does not exist in real world. More generally, when there is a regime switch, from the vector equation 2 2 2 σt = ω + αut + β ◦ σt−1, we know that the variance σ∆t,t is determined by σ∆t,t−1 rather than σ∆t−1,t−1. σ∆t,t−1 is a variance in regime ∆t at time t, which is generated simultaneously with σ∆t−1,t−1, but is not existed in reality at time t or t − 1. This makes the model different from the one in Henneke et al. (2011) that requires all parameters to switch according to a

Markov chain. Because in that way, there is no parallel process, and σ∆t,t is determined by σ∆t−1,t−1. The stationary conditons require a definition of matrices > Mji = pji(β + αei ), i, j = 1, ..., k and block matrix M = [Mji], i, j = 1, ...., k. The necessary and sufficient condition for stationarity is ρ(M) < 1, where ρ(M) denotes the largest eigenvalue of matrix M. In this model, a num- ber of k GARCH processes are evolving simultaneously. The Markov chain determines which GARCH model is realized at next moment. In principle, regime-specific mean can be included in this model, though it may cause loss in property of zero auto-correlation.

3 MRS-MNTS-GARCH Model

This section defines the MRS-MNTS-GARCH model and introduces the pro- cedures for model fitting and sample path simulation.

13 3.1 Model Specification To model a multivariate process, we assume the variance of each individual asset evolves independently according to univariate regime-switching GARCH model specified previously with possibly different number of regimes, while the correlation is reflected separately in the joint standard residuals mod- eled with stdMNTS distribution. The regime of correlation matrix follows a Markov process. The pair of words regime and state, market and a market index, innovations and joint residuals are used interchangeably as follows: (i) Let ∆t be the regime of the ith asset at time t defined by a Markov chain with finite state space S = 1, . . . , k(i) and an irreducible and primitive k(i) × k(i) transition matrix, where k(i) is the number of regimes of the ith asset. (1) (N) Suppose the innovation t = (t , . . . , t ) follows stdMNTS distribution of D which the parameters shift in regimes determined by a Markov chain ∆t . D stands for a market index. The multivariate distribution of joint residuals follows a Markov process with the same transition matrix as the market index. We use DJIA index for the market index D in this paper. Then we define portfolio return as

rt = ηt + σt ◦ t

ut = σt ◦ t 2 2 2  (i) (i) (i) (i)  (i)  (i)  σt = ω + α ut−1 + β ◦ σt−1 , i = 1,...,N iid  ∼ stdMNTS(λ D , θ D , ν D , Σ D ), t ∆t ∆t ∆t ∆t where

(1) (N) • rt = (rt , . . . , rt ) is the vector return of N assets at time t.

(1) (N) • ηt = (η (1) , . . . , η (N) ) is the vector mean return at time t, ∆t ∆t

(i) (i) • η (i) is the mean return of the ith asset at time t, η (i) is determined ∆t ∆t (i) by the regime of the ith asset at time t, ∆t ,

 (1) (N)  • σt = σ (1) , . . . , σ (N) is the vector of standard deviation of N ∆t ,t ∆t ,t assets at time t

14 (i)  (i) (i)  • σt = σ1,t, . . . , σk(i),t is the standard deviation vector of the ith asset 2  2 2  (i)  (i)  (i)  with σt = σ1,t ,..., σk(i),t ,

(i)  (1) (N) • ut−1 is the ith element of ut, with ut = ut , . . . , ut ,

(i) (i) (i) • ω = (ω1 , . . . , ωk(i) ),

(i) (i) (i) • α = (α1 , . . . , αk(i) ),

(i) (i) (i) • β = (β1 , . . . , βk(i) ), • ◦ denotes element-wise product,

(1) (N) • and t = (t , . . . , t ) is the joint resuduals. For each asset, the volatility process is the same as that of univariate regime-switching GARCH model in (13). It is a classic GARCH process when it does not shift within regimes. Unlike the most general model in which the variance at time t is calculated with the variance at time t − 1 in the last regime, when a regime shift takes place at time t, the variance in our model at time t is determined by the variance at time t − 1 within the new regime. The regimes of each asset and joint innovations evolve independently, alle- viating the estimation difficulty caused by all parameters switching regimes simultaneously. Intuitively, the correlation of assets change drastically when a market regime shift takes place. Thus here for simplicity we assume that the multivariate innovations switches regimes with the market.

3.2 Model Estimation Though flexible enough to accommodate the many stylized facts of asset return, the lack of an analytical form of NTS distribution presents some difficulties in estimation. Our estimation methodology is adapted from Kim et al. (2011), Kim et al. (2012), which is among the first works to incorporate NTS distribution in portfolio optimization. Another estimation of stdMNTS distribution with EM algorithm combined with FFT is studied in Bianchi et al. (2016).

15 Denote the parameters of MNTS distribution in each regime as λk, θk, γk =  (1) (N)  (1) (N) γk , . . . , γk , νk = νk , . . . , νk , Σk, where k is the regime, super- script is the asset. k = 1,...,N. To fit the model, we follow the procedures below. First, we fit univariate Markov regime switching GARCH model with t innovation on the index return, estimate the regime path and extract the residuals. The regime path of the joint residuals is assumed to be the same as this regime path. The residuals are used to estimate tail parameters λk, θk for later input. Then, we fit univariate model on each asset and extract the residuals for subsequent estimation of stdMNTS distribution. In each regime, the stdMNTS distribution has two tail parameters λk, θk and one skewness parameter vector νk. Common tail parameters λk, θk are assumed for individual constituents, which are estimated from the DJIA index in each regime respectively. This leaves the skewness parameters νk to be calibrated by curve fitting for each asset. Finally, to compute Σk with the formula Σk = −1 2−λk > −1 diag(γk) (ΣX − ν νk)diag(γk) , we explicitly estimate the covariance k 2θk k matrix of the innovations in each regime ΣX k. Note that Σk by definition is supposed to be positive definite matrix. To achieve this, we can either 2 substitute νk with the closest vector to νk in L norm which renders ΣX k − 2−λk > ν νk positive definite matrix, or directly find the closest positive definite 2θk k matrix to Σk. The latter method is applied in our empirical study. It’s known that historical correlation matrix is unreliable. We find that the out- of-sample performance is enhanced with denoising techniques in Laloux et al. (1999), Juliane and Korbinian (2005). The method in Laloux et al. (1999) is applied in our empirical study. To summarize, the procedure is as follows. Step 1: Fit univariate Markov regime switching GARCH model with t inno- vation on the index return, estimate the regime path and extract the resid- uals. Step 2: Estimate common tail parameters λk, θk in each regime. Step 3: Fit univariate model on each asset and extract the residuals. Step 4: Estimate skewness parameters νk for assets in each regime, calculate q 2 2−λk γk = 1 − νk ( ). 2θk Step 5: Calculate the denoised correlation matrices of joint residuals ΣX k in each regime. −1 2−λk > −1 Step 6: Calculate the matrix diag(γk) (ΣX − ν νk)diag(γk) , and k 2θk k find the closest positive definite matrix Σk.

16 The joint residuals are assumed to follow a stdMNTS Hidden Markov process. In our approach, the stdMNTS distribution is estimated within each market regime. The number of regimes of each asset is set as equal to that of DJIA index. We find that univariate Markov switching model with more than 3 states of- ten has one unstable state that lasts for a very short period and switches frequently and sometimes has one state with clearly bimodal residuals dis- tribution. Thus it’s desirable to limit the number of states smaller than 4, check unimodal distriution with Dip test and choose the one with highest BIC value.

3.3 Simulation To conduct simulation with the model, we use the following procedures. Step 1: Simulate S sample paths of standard deviation of N assets for T days with the fitted model. Step 2: Simulate S sample paths of a Markov chain ∆D with the market’s transition matrix. Calculate the total number of each regime in all sample paths. Step 3: Draw i.i.d. samples from stdMNTS distribution in different regimes. The number of samples is determined by the calculation in Step 2. To draw one sample, we follow the procedure below. Step 3.1: Sample from multivariate normal distribution N(0,Σk). Step 3.2: Sample from the tempered stable subordinator T with parameters λk and θk. √ Step 3.3: Calculate X = νk(T − 1) + T (γk ◦ ξk). Step 4: Multiply the standard deviation in step 1 with standard joint resid- uals in Step 3 accordingly. Add regime-specific mean ηk to get simulated asset returns.

17 4 Portfolio Optimization

Conditioned on given sample path(s), classic portfolio optimization is formu- lated as a trade-off between risk and return min W (x, T, Ω) x

s.t. Eω [R(x, T, ω)] ≥ d, x ∈ V,

where Ω is the set of all sample paths, R(x, T, ω) is the portfolio return at end time T and W (x, T, Ω) is a risk measure, x = (x1, . . . , xn) is the allocation vector of n assets. Since there are multiple sample paths in the case of PN simulation, the expected value is used. A typical setting of V is {x| i=1 xi = 1, xi ≥ 0}, meaning that short selling is not allowed. performance ratio can be defined by substituting standard deviation in Sharpe ratio with other risk measures. Similar to Markowitz Model, efficient frontier can be derived by changing d. Each portfolio on the efficient frontier is optimal in the sense that no portfolio has higher return with the same value of risk measure. For clarifi- cation in this paper, we refer to the portfolio with the highest performance ratio W (x,T,Ω) as optimal portfolio and the others on efficient frontier as Eω[R(x,T,ω)] suboptimal portfolios. Some problems regarding the risk measures needs further clarification. If we use historical sample paths as input for CVaR optimization, each asset has only one path, and we will minimize the CVaR of daily portfolio return from time t = 1 to t = T . However, this approach implicitly assumes that the daily returns are i.i.d. Since GARCH model leads to non-i.i.d. simulated daily returns, we only minimize the CVaR of return at time t = T when we use simulated sample paths as input. In terms of drawdown, the common definition of absolute drawdown (ab- breviated as drawdown) is

DD(x, t, ω) = max P (x, s, ω) − P (x, t, ω), for t ∈ [0,T ] s∈[0,t]

The common definition of relative drawdown is

max P (x, s, ω) − P (x, t, ω) DD(x, t, ω) = s∈[0,t] , for t ∈ [0,T ] maxs∈[0,t] P (x, s, ω)

18 However, the algorithm developed in Checkhlov et al. (2005) aims to minimize absolute drawdown due to its good properties like convexity. While the properties of absolute drawdown facilitates the computation, a natural concern to this approach is that when we consider a long period, drawdowns at different times may differ greatly in absolute value but close in relative value, of which the latter we may care more about. Using uncompounded accumulated return in the setting could alleviate this problem. From (1), we easily see that t X DD(x, t, ω) = Q(x, u, ω)

u=tx where tx is the value of t that gives maxu∈{1,2,...,t} Q(x, u, ω). It is not af- fected by the absolute value of the asset through time. It can also be viewed as an approximation to relative drawdown, since Pt Q(x, u, ω) ≈ u=tx Qt (1 + Q(x, u, ω)) when Q(x, u, ω) is small. This is the reason we use u=tx uncompounded accumulated return as input to CDaR optimization in the following empirical study.

5 Empirical Study

In this section, we perform model fitting, simulation and portfolio optimiza- tion to demonstrate the superiority of our model. For diversification, we set the range of weights as [0.01, 0.15]. A case study in Krokhmal et al. (2002) found that without constraints on weight, the optimal portfolio only consists of a few assets among hundreds. We limit the number of states to be smaller than 4. Dip test is conducted on residuals to ensure unimodal residual distribution. When the p-value is lower than 0.1, model with fewer states is used instead. With this condition satisfied, the one with highest BIC is selected. To be concise in notation, we denote the portfolios in a convenient manner. For example, 0.9-CDaR portfolio denotes the optimal portfolio derived from the optimization with 0.9-CDaR as risk measure. The standard deviation optimal portfolio is the optimal portfolio in classic Markowitz model that maximizes Sharpe ratio, which we also denoted as mean-variance(MV) optimal portfolio.

19 5.1 Data The data comprises of the adjusted daily price of DJIA index, 29 of the con- stituents of DJIA Index and 3 ETFs TMF, SDOW and UGL, from January 2010 to September 2020. One constituent DOW is removed for that it’s not included in the index in all tested period. Since we use 1764 trading days’ (about 7 years) data to fit the model, the actual tested period is from January 2017 to September 2020, which includes some of the most volatile periods in history caused by the COVID-19 pandemic and trade tensions.

5.2 In-sample Test We present an in-sample test result. The time period used in fitting is from 2013-07-19 to 2020-07-22. 2-regime model is favored in this period. 1075 days is classified as regime 1 and 690 days is classified as regime 2.

5.2.1 Kolmogorov–Smirnov Test Note that the marginal distribution of stdMNTS distribution is still NTS distribution. To examine the goodness of fit of stdMNTS distrbution on the joint residuals, we report in Table 1 the p value and KS statistics of KS tests. As a comparison, we fit multivariate t-distribution on the joint residuals and conducted KS test on marginal distribution similarly. We also report the β of each asset in different regimes. The degree of freedom of multivariate t-distribution in 2 regimes is 5.6 and 16.1 respectively. The data are rounded to 3 decimals place. We can observe that β varies significantly in different regimes. Generally, for all assets, β in regime 2 has much larger absolute values, indicating a significant skewness. NTS fits very well for all residuals in regime 1 due to its ability to model distribution with a high peak, skewness and fat tails, where t-distribution are frequently rejected with p value close to 0. In regime 2, NTS distribution performs similarly to t-distribution.

5.2.2 Residuals Transition matrix of the residuals is shown in Table 2. Regime 2 which corresponds higher volatility has a lower self-transition probability. We also provide the denoised correlation matrices of stdMNTS residuals in 2 regimes in Table 3. Since there are 2 regimes identified in this period, we

20 Table 1: KS Test

Regime β KS Statistics of NTS p value KS Statistics of t p value AAPL regime 1 -0.078 0.021 0.654 0.08 0 regime 2 41.938 0.123 0.081 0.12 0.093 AXP regime 1 0.063 0.014 0.975 0.078 0 regime 2 38.155 0.097 0.271 0.09 0.347 BA regime 1 0.029 0.015 0.945 0.068 0 regime 2 70.137 0.077 0.553 0.084 0.438 CAT regime 1 -0.063 0.041 0.041 0.069 0 regime 2 83.58 0.153 0.014 0.157 0.011 CSCO regime 1 0.055 0.02 0.752 0.066 0 regime 2 42.562 0.079 0.511 0.078 0.538 CVX regime 1 -0.062 0.042 0.034 0.062 0 regime 2 66.65 0.138 0.035 0.13 0.056 DDG regime 1 0.053 0.045 0.018 0.092 0 regime 2 -67.095 0.16 0.009 0.148 0.02 DIS regime 1 0.082 0.01 1 0.072 0 regime 2 52.844 0.134 0.046 0.132 0.051 DOG regime 1 -0.004 0.046 0.016 0.031 0.2 regime 2 -51.012 0.099 0.251 0.092 0.323 GS regime 1 -0.044 0.037 0.077 0.055 0.002 regime 2 46.111 0.123 0.081 0.117 0.112 HD regime 1 0 0.031 0.217 0.044 0.021 regime 2 49.479 0.088 0.374 0.084 0.445 IAU regime 1 0 0.04 0.045 0.047 0.012 regime 2 -69.031 0.029 1 0.042 0.99 IBM regime 1 0.107 0.01 1 0.087 0 regime 2 68.142 0.083 0.455 0.089 0.366 INTC regime 1 -0.024 0.016 0.926 0.06 0.001 regime 2 82.535 0.134 0.045 0.145 0.024 JNJ regime 1 0.036 0.017 0.877 0.058 0.001 regime 2 93.757 0.043 0.986 0.083 0.448 JPM regime 1 -0.027 0.038 0.073 0.063 0 regime 2 53.155 0.132 0.051 0.125 0.074 KO regime 1 0.032 0.02 0.733 0.06 0 regime 2 84.339 0.031 1 0.065 0.744 MCD regime 1 0.028 0.005 1 0.071 0 regime 2 43.145 0.068 0.696 0.066 0.732 MMM regime 1 -0.163 0.022 0.635 0.099 0 regime 2 58.869 0.125 0.071 0.128 0.064 MRK regime 1 0.022 0.016 0.923 0.058 0.001 regime 2 73.085 0.102 0.22 0.101 0.233 MSFT regime 1 0.047 0.019 0.803 0.06 0 regime 2 28.885 0.085 0.429 0.082 0.463 NKE regime 1 0.006 0.017 0.888 0.065 0 regime 2 71.82 0.081 0.478 0.091 0.338 PFE regime 1 -0.056 0.035 0.108 0.061 0 regime 2 75.966 0.111 0.148 0.111 0.146 PG regime 1 -0.06 0.017 0.868 0.071 0 regime 2 54.857 0.073 0.614 0.079 0.521 REK regime 1 0.039 0.055 0.002 0.045 0.019 regime 2 -34.95 0.072 0.63 0.063 0.779 RTX regime 1 0.025 0.025 0.447 0.06 0 regime 2 28.971 0.093 0.318 0.087 0.399 TRV regime 1 0.009 0.027 0.363 0.071 0 regime 2 70.286 0.036 0.999 0.048 0.962 UNH regime 1 -0.026 0.022 0.602 0.063 0 regime 2 51.7 0.078 0.534 0.075 0.588 V regime 1 -0.012 0.031 0.211 0.056 0.001 regime 2 34.036 0.091 0.336 0.087 0.394 VGLT regime 1 -0.016 0.062 0 0.031 0.208 regime 2 -51.37 0.047 0.968 0.043 0.988 VZ regime 1 -0.029 0.028 0.33 0.05 0.006 regime 2 52.603 0.073 0.61 0.074 0.596 WBA regime 1 0.003 0.007 1 0.07 0 regime 2 47.054 0.07 0.674 0.072 0.638 WMT regime 1 -0.096 0.019 0.797 0.085 0 regime 2 86.902 0.069 0.686 0.081 0.481 XOM regime 1 0.023 0.039 0.062 0.043 0.026 regime 2 42.583 0.12 0.094 0.115 0.12

Table 2: Transition Matrix

Regime 1 Regime 2 Regime 1 0.8964 0.1036 Regime 2 0.2069 0.7931

21 stack the upper tridiagonal part of the matrix with elements in the 1st regime and the lower tridiagonal part with the 2nd together for easier comparison. We find that the residuals have distinctively different correlation matrices in 2 regimes, validating the regime switching assumption. The residuals are highly correlated in the 2nd regime that corresponds to more volatile periods.

5.3 Out-of-sample Test Out-of-sample test with real market data is an informative way to test the effectiveness of a model. We use 3 types of risk measures with different con- fidence levels in the optimization, namely, maximum drawdown, 0.7-CDaR, 0.3-CDaR, average drawdown, 0.5-CVaR, 0.7-CVaR, 0.9-CVaR and standard deviation. Note that 0-CVaR is equal to the expected return, which does not make sense as a risk measure. Minimizing α-CVaR with α smaller than 0.5 means that we include the right part of the return distribution with many positive values. This is a flaw of standard deviation and thus we set α at 0.5, 0.7,and 0.9 to demonstrate the superiority. Rolling window technique is employed with 1764 days forward moving time window for biweekly estimation of the parameters. We simulate 1000 sample paths of length 10 every two weeks with the fitted model. The simula- tion is used as input for portfolio optimization. The portfolio is hold for two weeks until rebalance. The portfolio optimization is performed with software PSG with precoded numerical optimization procedure.

5.3.1 Optimal Portfolios We report the performance of the optimal portfolios in various forms.

Ratio Measurement We use performance ratios, i.e. the mean return of realized path divided by a risk measure of realized path, to measure perfor- mance of the optimal portfolios. The result is reported in Table 4. The row names denote the risk measures in the portfolio optimization. The columns are the performance measures. For example, 10-day 0.5-CVaR portfolio is optimized every 10 days to maximize the ratio of expected return to 10- day 0.5-CVaR, i.e. 0.5-CVaR of the return at day 10, the performance is measured with the ratio of mean return to risk of the whole realized path. Note that 0-CDaR is essentially average drawdown while 1-CDaR is maxi- mum drawdown. The risk measure of standard deviation is the sutdied in

22 Table 3: Denoised Correlation Matrix of Joint Residuals in 2 Regimes

AAPL AMGN AXP BA CAT CRM CSCO CVX DIS GS HD HON IBM INTC JNJ JPM KO MCD MMM MRK MSFT NKE PG SDOW TMF TRV UGL UNH V VZ WBA WMT AAPL 1 -0.01 0.1 0.02 0.1 0.2 0.1 -0.05 0.03 0.1 0.1 0.1 0.04 0.2 -0.03 0.1 -0.01 0.1 0.1 -0.1 0.2 0.05 -0.1 -0.1 -0.04 0.005 -0.02 0.01 0.2 -0.1 -0.004 -0.01 AMGN 0.4 1 0.1 0.1 0.1 0.01 0.1 0.04 0.1 0.1 0.1 0.1 0.1 0.02 0.1 0.1 0.1 0.1 0.1 0.1 0.002 0.1 0.1 -0.2 0.03 0.1 0.01 0.1 0.1 0.1 0.1 0.1 AXP 0.4 0.4 1 0.1 0.2 0.1 0.2 0.1 0.1 0.3 0.2 0.2 0.2 0.2 0.02 0.3 0.04 0.1 0.2 0.1 0.2 0.1 -0.04 -0.4 -0.2 0.1 -0.1 0.1 0.2 0.002 0.1 0.04 BA 0.4 0.4 0.5 1 0.1 0.01 0.1 0.1 0.1 0.2 0.1 0.1 0.1 0.1 0.03 0.2 0.03 0.02 0.1 0.1 0.02 0.1 0.001 -0.2 -0.1 0.1 -0.1 0.1 0.1 0.03 0.05 0.03 CAT 0.5 0.4 0.5 0.5 1 0.1 0.1 0.1 0.1 0.3 0.1 0.2 0.1 0.1 0.01 0.3 0.02 0.02 0.1 0.1 0.1 0.1 -0.04 -0.3 -0.2 0.1 -0.1 0.1 0.1 0.003 0.1 0.03 CRM 0.4 0.4 0.4 0.4 0.4 1 0.1 -0.1 0.03 0.02 0.1 0.1 0.04 0.2 0.02 0.02 0.05 0.1 0.1 -0.02 0.3 0.1 0.002 -0.1 0.03 0.03 0.03 0.01 0.2 -0.02 0.005 0.02 CSCO 0.4 0.4 0.5 0.5 0.5 0.4 1 -0.001 0.1 0.1 0.1 0.2 0.1 0.2 0.1 0.1 0.1 0.1 0.1 0.04 0.2 0.1 0.02 -0.3 -0.02 0.1 -0.01 0.1 0.2 0.02 0.04 0.05 CVX 0.4 0.4 0.5 0.4 0.5 0.4 0.5 1 0.1 0.2 0.05 0.1 0.1 -0.02 -0.000 0.2 -0.01 -0.04 0.1 0.1 -0.1 0.05 -0.03 -0.2 -0.1 0.1 -0.1 0.1 -0.03 0.02 0.04 0.01 DIS 0.4 0.4 0.5 0.4 0.5 0.4 0.5 0.4 1 0.2 0.1 0.1 0.1 0.1 0.04 0.2 0.05 0.03 0.1 0.1 0.04 0.1 0.01 -0.2 -0.1 0.1 -0.1 0.1 0.1 0.04 0.05 0.04 GS 0.5 0.5 0.6 0.5 0.6 0.5 0.6 0.5 0.5 1 0.2 0.2 0.2 0.2 -0.1 0.5 -0.1 -0.1 0.2 0.02 0.1 0.1 -0.2 -0.5 -0.3 0.1 -0.3 0.1 0.1 -0.1 0.1 -0.02 HD 0.4 0.4 0.5 0.5 0.5 0.4 0.5 0.4 0.5 0.5 1 0.2 0.1 0.1 0.1 0.2 0.1 0.1 0.1 0.1 0.1 0.1 0.03 -0.3 -0.1 0.1 -0.04 0.1 0.1 0.04 0.1 0.1 HON 0.5 0.5 0.6 0.5 0.6 0.5 0.6 0.5 0.5 0.6 0.5 1 0.2 0.2 0.1 0.3 0.1 0.1 0.2 0.1 0.1 0.2 0.1 -0.4 -0.1 0.2 -0.1 0.1 0.2 0.1 0.1 0.1 IBM 0.4 0.4 0.4 0.4 0.4 0.4 0.5 0.4 0.4 0.5 0.4 0.5 1 0.1 0.1 0.2 0.1 0.1 0.1 0.1 0.1 0.1 0.03 -0.3 -0.1 0.1 -0.1 0.1 0.1 0.05 0.1 0.1 23 INTC 0.4 0.4 0.4 0.4 0.5 0.4 0.5 0.4 0.4 0.5 0.4 0.5 0.4 1 0.002 0.2 0.02 0.1 0.1 -0.02 0.3 0.1 -0.04 -0.2 -0.1 0.05 -0.04 0.05 0.2 -0.04 0.02 0.02 JNJ 0.4 0.4 0.4 0.4 0.4 0.4 0.4 0.4 0.4 0.5 0.5 0.5 0.4 0.4 1 -0.1 0.2 0.1 0.1 0.2 -0.004 0.1 0.2 -0.2 0.2 0.1 0.1 0.1 0.1 0.2 0.1 0.1 JPM 0.5 0.5 0.6 0.5 0.6 0.5 0.6 0.5 0.5 0.6 0.5 0.6 0.5 0.5 0.5 1 -0.1 -0.04 0.2 0.04 0.1 0.1 -0.1 -0.5 -0.3 0.1 -0.3 0.1 0.1 -0.05 0.1 -0.01 KO 0.3 0.4 0.3 0.3 0.3 0.3 0.4 0.4 0.4 0.3 0.4 0.4 0.4 0.3 0.4 0.4 1 0.2 0.1 0.2 0.03 0.1 0.2 -0.2 0.2 0.1 0.1 0.1 0.1 0.2 0.1 0.1 MCD 0.3 0.4 0.4 0.4 0.4 0.3 0.4 0.4 0.4 0.4 0.4 0.4 0.4 0.4 0.4 0.4 0.4 1 0.1 0.1 0.1 0.1 0.1 -0.1 0.1 0.1 0.1 0.04 0.1 0.1 0.05 0.1 MMM 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.6 0.5 0.6 0.5 0.5 0.5 0.6 0.4 0.4 1 0.1 0.1 0.1 0.05 -0.3 -0.1 0.1 -0.1 0.1 0.1 0.1 0.1 0.1 MRK 0.4 0.4 0.4 0.4 0.4 0.3 0.4 0.4 0.4 0.4 0.4 0.5 0.4 0.4 0.4 0.4 0.4 0.4 0.5 1 -0.1 0.1 0.2 -0.2 0.1 0.1 0.05 0.1 0.03 0.2 0.1 0.1 MSFT 0.4 0.4 0.5 0.5 0.5 0.4 0.5 0.5 0.5 0.5 0.5 0.6 0.5 0.4 0.5 0.5 0.4 0.4 0.5 0.4 1 0.1 -0.04 -0.2 -0.01 0.02 0.01 0.02 0.3 -0.1 0.001 0.01 NKE 0.4 0.4 0.4 0.4 0.4 0.4 0.4 0.4 0.4 0.5 0.4 0.5 0.4 0.4 0.4 0.5 0.3 0.3 0.5 0.4 0.4 1 0.1 -0.3 -0.02 0.1 -0.02 0.1 0.1 0.1 0.1 0.1 PG 0.3 0.4 0.3 0.3 0.3 0.3 0.4 0.4 0.4 0.3 0.4 0.4 0.4 0.3 0.5 0.4 0.5 0.4 0.4 0.4 0.4 0.3 1 -0.1 0.2 0.1 0.2 0.04 0.03 0.2 0.1 0.1 SDOW -0.6 -0.6 -0.7 -0.6 -0.7 -0.6 -0.7 -0.6 -0.6 -0.7 -0.6 -0.8 -0.6 -0.6 -0.6 -0.7 -0.5 -0.5 -0.7 -0.6 -0.7 -0.6 -0.5 1 0.2 -0.3 0.1 -0.2 -0.3 -0.1 -0.1 -0.1 TMF -0.3 -0.2 -0.4 -0.3 -0.4 -0.3 -0.3 -0.3 -0.3 -0.4 -0.3 -0.4 -0.3 -0.3 -0.2 -0.4 -0.1 -0.2 -0.3 -0.2 -0.3 -0.3 -0.1 0.4 1 0.03 0.2 -0.04 -0.003 0.1 0.01 0.1 TRV 0.4 0.4 0.5 0.5 0.5 0.4 0.5 0.4 0.5 0.5 0.5 0.5 0.5 0.4 0.5 0.5 0.4 0.4 0.5 0.5 0.5 0.4 0.5 -0.6 -0.2 1 0.02 0.1 0.1 0.1 0.1 0.1 UGL -0.1 -0.1 -0.2 -0.1 -0.2 -0.1 -0.1 -0.1 -0.1 -0.2 -0.1 -0.2 -0.1 -0.1 -0.1 -0.2 0.001 -0.05 -0.1 -0.1 -0.1 -0.1 0.01 0.2 0.2 -0.1 1 -0.04 0.004 0.1 0.001 0.1 UNH 0.4 0.4 0.4 0.4 0.4 0.4 0.4 0.4 0.4 0.5 0.4 0.5 0.4 0.4 0.4 0.5 0.4 0.4 0.5 0.4 0.4 0.4 0.4 -0.6 -0.3 0.4 -0.1 1 0.1 0.1 0.1 0.05 V 0.5 0.4 0.5 0.5 0.5 0.4 0.5 0.5 0.5 0.6 0.5 0.6 0.5 0.5 0.5 0.6 0.4 0.4 0.6 0.4 0.5 0.4 0.4 -0.7 -0.3 0.5 -0.2 0.5 1 0.02 0.05 0.1 VZ 0.3 0.4 0.3 0.3 0.3 0.3 0.4 0.4 0.4 0.4 0.4 0.4 0.4 0.3 0.4 0.4 0.4 0.4 0.4 0.4 0.4 0.3 0.5 -0.5 -0.1 0.4 -0.01 0.4 0.4 1 0.1 0.1 WBA 0.4 0.4 0.4 0.4 0.4 0.4 0.4 0.4 0.4 0.5 0.4 0.5 0.4 0.4 0.4 0.5 0.4 0.3 0.5 0.4 0.4 0.4 0.4 -0.6 -0.2 0.4 -0.1 0.4 0.4 0.4 1 0.1 WMT 0.3 0.3 0.3 0.3 0.3 0.2 0.3 0.3 0.3 0.3 0.3 0.4 0.3 0.3 0.4 0.3 0.4 0.3 0.4 0.4 0.3 0.3 0.4 -0.4 -0.1 0.4 -0.01 0.3 0.3 0.4 0.3 1 the classical Markowitz model, of which the optimziation is also called MV optimization. The outperformance of our model and the two tail risk measures in terms of the ratios is apparent. The performance measures are consistent in the sense that a portfolio outperforming in one measure also outperforms in other measures as well in most cases. We find that 2 tail risk measures lead to similar performance. The confidence level only has a slight impact. Both CVaR and CDaR optimal portfolios slightly but consistently outperform MV optimal portfolio.

Accumulated Return To visualize the performance, we report in Figure 1 the performance of the optimal portfolios with different risk measures in out- of-sample tests. The legend shows colors that represent different risk mea- sures used in each portfolio optimization. We use log base 10 compounded accumulated return as vertical axis so that the scale of relative drawdown can be easily compared in graphs by simply counting the grids. The performance with different risk measures is reported separately. The labels in the legend shows the confidence level of risk measure. The names of the subfigures indi- cate the risk measures used in the optimization. DJIA index and the equally weighted portfolio are included in each graph for comparison. As can be observed in the graphs, all the optimal portfolios follow a similar trend, though differ in overall performance. They tend to alleviate both left tail and right tail events. For all CDaR optimal portfolios with different input, confidence level have only a small impact on the realized path, which often overlap. It’s possibly because the simulation length is only 10 days and the drawdown distribution is far from continuous, leading to close calculation of drawdowns with different confidence level.

Relative Drawdown Series, Return Distribution and Allocation Since the CVaR and CDaR risk measures concern the tail behavior, we plot the relative drawdown series in Figure 2 and the return distribution in Fig- ure 3 to demonstrate the ability to alleviate left tail events. We intend to consolidate the conclusion drawn from last section that (1) Optimization with the simulated sample paths by our model leads to good optimal portfolios with higher performance than the index and equally weighted portfolio. (2) Tail risk measure CDaR and CVaR is superior to standard deviation

24 Table 4: Performance of Optimal Portfolios

Mean Return Mean Return Mean Return Mean Return Mean Return Mean Return Mean Return Mean Return 0-CDaR 0.3-CDaR 0.7-CDaR 1-CDaR 0.5-CVaR 0.7-CVaR 0.9-CVaR Standard Deviation 0-CDaR 0.072 0.051 0.027 0.008 0.258 0.156 0.077 0.117 0.3-CDaR 0.074 0.052 0.027 0.008 0.262 0.159 0.078 0.119 0.7-CDaR 0.073 0.052 0.027 0.008 0.260 0.158 0.077 0.118 25 1-CDaR 0.075 0.053 0.028 0.007 0.259 0.157 0.078 0.117 10-day 0.5-CVaR 0.076 0.054 0.028 0.007 0.259 0.158 0.076 0.120 10-day 0.7-CVaR 0.072 0.051 0.027 0.007 0.257 0.157 0.077 0.121 10-day 0.9-CVaR 0.071 0.050 0.027 0.007 0.243 0.147 0.072 0.116 Standard Deviation 0.064 0.045 0.024 0.006 0.238 0.145 0.071 0.109 DJIA 0.011 0.008 0.004 0.001 0.064 0.040 0.019 0.033 Equal Weight 0.032 0.023 0.011 0.003 0.136 0.083 0.037 0.065 0.4

Equal Weight DJIA 0.3 0−CDaR 0.3−CDaR 0.7−CDaR 1−CDaR 0.5−CVaR 0.7−CVaR 0.2 0.9−CVaR standard deviation

0.1 log cumulative return log cumulative

0.0

2017 2018 2019 2020 Date

Figure 1: Performance of Optimal Portfolios in optimization. (3) Using our model and CDaR measure together leads to best perfor- mance, while model has a greater impact than the risk measure. Due to the large number of optimal portfolios with different risk measures, we only study 0-CDaR, 0.5-CVaR and MV optimal portfolios here. The others have similar results. As is shown in Figure 2, the 0-CDaR, 0.5-CVaR and MV optimal portfo- lios have significantly smaller relative drawdown in most period in the out- of-sample test, while 0-CDaR and 0.5-CVaR optimal portfolios are slightly better than standard deviation portfolio. We plot the kernel density estimation on the return distribution in Figure 3. To better compare the tails of return distribution, we also plot the log base 10 density. We can easily observe that the optimal portfolios are very close and alleviate both right and left tail events. They have thinner tails on both sides than DJIA and equally weighted portfolio and slightly skewed to the right. The optimal portfolios have lower peak than the equally weighted portfolio but still much higher peak than DJIA. Among the 3 optimal port- folios, the 10-day 0.5-CVaR portfolio has the thinnest left tail and a fatter right tail than the other 2 optimal portfolios.

26 0−CDaR 10−day 0.5−CVaR 0.3 DJIA Equal Weight Standard Deviation

0.2

0.1 Relative Drawdown Relative

0.0 2017 2018 2019 2020 Date

Figure 2: Relative drawdown series of 0-CDaR, 10-day 0.5-CVaR, MV opti- mal portfolios, DJIA and equally weighted portfolio

We also report the series of allocation weights among the DJIA con- stituents and the ETFs in Figure 4. It’s reasonable that when the weights on DJIA constituents are high, the weights on shortselling ETF is low.

5.3.2 Suboptimal Portfolios So far we have been studying the optimal portfolios that maximize per- formance ratios. In this section, we study whether the optimal portfolio outperforms suboptimal ones. We use 9 portfolios with different constraint on return to approximate the efficient frontier for each risk measure. For example, in each 0.3-CDaR op- timization, we perform 10 optimization with different constraints on return. The portfolio with the highest return-risk ratio is the approximated optimal portfolio. We denote the portfolio that has n lower level of constraint on return as level Ln suboptimal portfolio, the one that has n higher level of constraint on return as level Hn suboptimal portfolio. When a certain opti- mization is unfeasible, we set the allocation the same as that of a lower level. The constraints on return are (0.002, 0.010, 0.020, 0.030, 0.035, 0.040, 0.045, 0.050, 0.060, 0.065). For example, when the portfolio with constraint 0.050

27 50

40 1e−02

30

1e−07

density 20

0−CDaR log density 0−CDaR 10−day 0.5−CVaR 10−day 0.5−CVaR Standard Deviation Standard Deviation Equal Weight 1e−12 Equal Weight 10 DJIA DJIA

0 −0.10 −0.05 0.00 0.05 0.10 −0.10 −0.05 0.00 0.05 0.10 return return (a) Kernel Density Estimation (b) log Kernel Density Estimation

Figure 3: Kernel Density Estimation on 0-CDaR, 10-day 0.5-CVaR, MV optimal portfolios, DJIA and equally weighted portfolio on return has the highest performance ratio in simulation, it’s labeled the optimal portfolio in the table, which is an approximation to the true optimal one. The portfolio with constraint 0.060 is labeled H1, while H2 to H4 have the same constraint 0.065. Note that at each rebalance, the optimal portfolio changes, so does portfolios of other levels, i.e. constrait 0.060 may no longer be optimal. The performance of each label is measured with portfolios of corresponding level determined at each period. Due to the size of data, we only report suboptimal portfolios with risk measures of 0-CDaR, 0.5-CVaR and standard deviation in Figure 5 and Ta- ble 5. We find that the optimal portfolio is consistently too conservative for all risk measures and input. That is, with the same risk measure, subopti- mal portfolios with a higher constraint on return achieve higher performance ratios than the optimal portfolio. The performance does not follow the ex- pected trend that it would first monotonically increase until reaching the peak then monotonically decrease. The portfolios have increasing return as well as risk from L1 to H4. The realized paths follow similar trend with rare cross. The optimal portfolios, as expected, are in the medium part of all paths. The 10-day 0.5-CVaR suboptimal portfolios have almost identical performance. Some paths are not visible due to overlap. For example, the constraints on return of H4 are sometimes unfeasible, leading to same performance with H3 portfolio.

28 Table 5: Performance of Suboptimal Portfolios

Mean Return Mean Return Mean Return Mean Return Mean Return Mean Return Mean Return Mean Return 0-CDaR 0.3-CDaR 0.7-CDaR 1-CDaR 0.5-CVaR 0.7-CVaR 0.9-CVaR Standard Deviation 0-CDaR L4 0.061 0.043 0.023 0.008 0.239 0.146 0.074 0.113 L3 0.058 0.041 0.021 0.007 0.232 0.142 0.071 0.108 L2 0.061 0.043 0.023 0.007 0.236 0.144 0.072 0.110 L1 0.062 0.044 0.023 0.007 0.236 0.144 0.072 0.110 Optimal 0.072 0.051 0.027 0.008 0.258 0.156 0.077 0.117 H1 0.074 0.052 0.028 0.008 0.251 0.151 0.073 0.115 H2 0.082 0.058 0.030 0.008 0.259 0.157 0.076 0.120 H3 0.087 0.061 0.032 0.009 0.267 0.161 0.078 0.124 H4 0.087 0.062 0.032 0.009 0.268 0.162 0.079 0.125

10-day 0.5-CVaR L4. 0.073 0.052 0.027 0.006 0.252 0.154 0.074 0.114 L3 0.074 0.052 0.027 0.007 0.253 0.155 0.075 0.115

29 L2 0.074 0.052 0.027 0.007 0.253 0.155 0.075 0.116 L1 0.075 0.053 0.027 0.007 0.256 0.156 0.076 0.118 Optimal 0.076 0.054 0.028 0.007 0.259 0.158 0.076 0.120 H1 0.075 0.053 0.027 0.007 0.255 0.155 0.075 0.118 H2 0.075 0.053 0.027 0.007 0.256 0.156 0.075 0.119 H3 0.073 0.052 0.027 0.007 0.256 0.155 0.075 0.119 H4 0.071 0.050 0.026 0.007 0.254 0.154 0.074 0.117

Standard Deviation L4 0.041 0.029 0.015 0.004 0.176 0.110 0.055 0.084 L3 0.042 0.030 0.016 0.004 0.177 0.111 0.056 0.083 L2 0.049 0.035 0.019 0.005 0.196 0.122 0.062 0.092 L1 0.059 0.042 0.022 0.006 0.225 0.138 0.069 0.103 Optimal 0.064 0.045 0.024 0.006 0.238 0.145 0.071 0.109 H1 0.069 0.049 0.026 0.007 0.232 0.141 0.069 0.108 H2 0.071 0.050 0.027 0.007 0.232 0.141 0.069 0.110 H3 0.074 0.052 0.028 0.007 0.237 0.144 0.070 0.113 H4 0.073 0.052 0.027 0.007 0.234 0.142 0.069 0.112

DJIA 0.011 0.008 0.004 0.001 0.064 0.040 0.019 0.033 Equal Weight 0.032 0.023 0.011 0.003 0.136 0.083 0.037 0.065 Figure 4: Weights on 0-CDaR Optimal Portfolio on DJIA Constituents and 3 ETFs

1.00

0.75

0.50 Weight

0.25

0.00 2017 2018 2019 2020 Date DJIA Constituents SDOW TMF UGL

6 Conclusion

We propose the MRS-MNTS-GARCH model by combining Markov Regime Switching Model with the multivariate NTS-GARCH model. The model ac- commodates the regime switch and addresses tail risk in financial portfolio return data. We simulate sample path of portfolio returns using the model and find optimal portfolio with CDaR, CVaR and standard deviation risk measure. We conduct in-sample and out-of-sample tests to examine the ef- fectiveness of the model. In in-sample test, the model fits the residuals with high accuracy. In out-of-sample tests, we show that the proposed model significantly raises the performance of optimal portfolios measured by per- formance ratios, and that tail risk measures are slightly better than standard deviation. By combining MRS-MNTS-GARCH model and optimization with tail risk measures, the optimal portfolios successfully alleviate extreme left tail events. We also find that optimal portfolios tend to be conservative. That is, with the same risk measure, suboptimal portfolios with a higher constraint on return achieve higher performance ratios in the empirical study than the optimal portfolio.

30 0.4 Equal Weight H1 H2 H3 0.3 H4 L1 L2 L3 L4 0.2 Optimal

0.1 log cumulative return log cumulative

0.0

2017 2018 2019 2020 Date (a) 0-CDaR Suboptimal Portfolios

0.4 0.4 Equal Weight Equal Weight H1 H1 H2 H2 H3 H3 0.3 H4 0.3 H4 L1 L1 L2 L2 L3 L3 L4 L4 0.2 Optimal 0.2 Optimal

0.1 0.1 log cumulative return log cumulative return log cumulative

0.0 0.0

2017 2018 2019 2020 2017 2018 2019 2020 Date Date (b) 10-day 0.5-CVaR Suboptimal (c) Standard Deviation Suboptimal Portfolios Portfolios

Figure 5: Suboptimal Portfolios

References

Anand, A., Li, T., Kurosaki, T., and Kim, Y. S. (2016). Foster–Hart optimal portfolios. Journal of Banking & , 117–130.

Bianchi, M. L., Tassinari, G. L., and Fabozzi, F. J. (2016). Riding with the four horsemen and the multivariate normal tempered stable model. International Journal of Theoretical and Applied Finance, 19 (04), 1–28.

Billioa, M., Casarina, R., and Osuntuyi, A. (2016). Efficient Gibbs sampling for Markov switching GARCH models. Computational Statistics and Data Analysis, 37–57.

Checkhlov, A., Uryasev, S., and Zabarankin, M. (2005). Drawdown measure

31 in portfolio optimization. International Journal of Theoretical and Applied Finance, (1), 13–58.

Haas, M. and Liu, J.-C. (2004). A multivariate regime-switching GARCH model with an application to global stock market and real estate equity returns. Studies in Nonlinear Dynamics & Econometrics, (4), 493–530.

Haas, M., Mittnik, S., and Paolella, M. S. (2004). A new approach to Markov- switching GARCH models. Journal of Financial Econometrics, (4), 493– 530.

Hamilton, J. D. (1996). Specification testing in markov-switching time-series models. Journal of Econometrics, 70 , 127–157.

Henneke, J. S., Rachev, S. T., Fabozzi, F. J., and Nikolov, M. (2011). MCMC- based estimation of Markov switching ARMA–GARCH models. Applied Economics, (3), 259–271.

Juliane, S. and Korbinian, S. (2005). A shrinkage approach to large-scale co- variance matrix estimation and implications for functional genomics. Sta- tistical applications in genetics and molecular biology, 4 , Article32.

Kim, Y. S., Giacometti, R., Rachev, S. T., Fabozzi, F. J., and Mignacca, D. (2012). Measuring financial risk and portfolio optimization with a non- Gaussian multivariate model. Annals of Operations Research, 201 , 325– 342.

Kim, Y. S., Rachev, S. T., Bianchi, M. L., Mitov, I., and Fabozzi, F. J. (2011). Time series analysis for financial market meltdowns. Journal of Banking & Finance, 35 (8), 1879–1891.

Krokhmal, P., Uryasev, S., and Palmquist, J. (2001). Portfolio optimization with conditional value-at-risk objective and constraints. Journal of Risk, (2), 43–68.

Krokhmal, P., Uryasev, S., and Zrazhevsky, G. (2002). Comparative analysis of linear portfolio rebalancing strategies: An application to hedge funds. The Journal of Alternative Investments, (1), 10–29.

Kurosaki, T. and Kim, Y. S. (2018). Foster-Hart optimization for currency portfolios. Studies in Nonlinear Dynamics & Econometrics, (2), 1–15.

32 Laloux, L., Cizeau, P., Bouchaud, J.-P., and Potters, M. (1999). Noise dress- ing of financial correlation matrices. PHYSICAL REVIEW LETTERS, 83 , 1467–1470.

Lim, A. E., Shanthikumar, J. G., and Vahn, G.-Y. (2011). Conditional value- at-risk in portfolio optimization: Coherent but fragile. Operations Research Letters, 39 (3), 163–171.

Rachev, S. T., Stoyanov, S. V., and Fabozzi, F. J. (2008). Advanced Stochas- tic Models, Risk Assessment, and Portfolio Optimization: The Ideal Risk, Uncertainty, and Performance Measures. Wiley.

Rockafellar, R. T. and Uryasev, S. (2000). Optimization of conditional value- at-risk. Journal of Risk, (3), 21–41.

Sarykalin, S., Serraino, G., and Uryasev, S. (2014). Value-at-risk vs. con- ditional value-at-risk in risk management and optimization. INFORMS TutORials in Operations Research.

Shi, Y. and Feng, L. (2016). A discussion on the innovation distribution of the Markov regime-switching GARCH model. Economic Modelling, 278–288.

33