Candidate no. 900228

Trading Convexity - A Model Agnostic Approach

A thesis submitted in partial fulfillment of the MSc in Mathematical

April 7, 2015

This thesis is dedicated to the multitude of Indian children who do not have the means to afford the luxury of education.


Acknowledgements

I would like to express my sincere gratitude to my advisor Dr. Riccardo Rebonato for introducing me to this fascinating topic and for coaching me throughout. This thesis would not have been possible without his relentless patience in answering my many trivial questions and in explaining the key concepts to me. I have learned a lot, both from his classes at the Mathematical Institute and from our discussions over the phone and by email. I have been extremely lucky to have interacted with Dr. Vladimir Putyatin throughout the course of this thesis. His insights, advice and review of my calculations have gone a long way in helping me to complete the analysis and write this thesis. The interactions with the faculty members at the Mathematical Institute have been extremely beneficial to my overall understanding of the nuances of Mathematical Finance and I would like to thank them for their quality teaching and feedback. My experience on each of the 7 visits to the Mathematical Institute has been very satisfying and it would not have been possible without the hard work that the staff of the Mathematical Institute put in prior to every visit. A big thank you to them. As we are all aware, life is much easier to live when one has a supportive and caring family. Here I have been blessed with wonderful parents who have been encouraging throughout, and with my wife Padma, who has been a phenomenal source of positive energy in my life. My deepest gratitude to all of them.


In this essay, we study bond portfolio Convexity and we do so from three different perspectives. First, we introduce a model based representation of what the portfolio convexity should be, using a simple Vasicek setting followed by a general multi-factor Affine setup. Second, we derive a novel model agnostic approach to extract the value of portfolio convexity in terms of portfolio “Carry” and “Roll-Down”. Finally, we develop a trading strategy which employs the model agnostic representation of portfolio convexity to exploit discrepancies between implied and realized convexity, using the Treasury data provided by the US Federal Reserve [10] for the period 1987-2014. Our focus on portfolio convexity is ultimately linked to the belief that mis-pricings in the fair value of convexity exist in today’s markets. These mis-pricings provide us with a trading opportunity and motivate us to develop a model agnostic approach to monetize convexity. The trading strategy is relatively easy to implement and is overall profitable, conditional on the quality of the estimates of future yield volatility. Furthermore, we show that the profitability of the trading strategy is not due to uncontrolled residual exposure to the level, slope or curvature of the yields but is purely due to the ability of the strategy to tap into the mis-pricings in convexity.


Contents

1 Introduction

2 Convexity
  2.1 Monetizing Convexity
  2.2 The Vasicek Setting
  2.3 A General Affine Model

3 A Model Agnostic Approach
  3.1 The Weights

4 The Trading Strategy

5 Strategy Implementation
  5.1 Estimation of Weights
  5.2 Estimation of Volatilities
  5.3 Implementation Steps

6 Results
  6.1 Estimated Weights
  6.2 Estimated Volatilities
  6.3 Portfolio Profit & Loss

7 Conclusion

8 Appendix 1 - Estimation of Portfolio Weights: MATLAB Code

9 Appendix 2 - Estimation of Volatilities: R Code

References


List of Figures

2.1 Bond Price vs. Yield

5.1 ACF plot for 10 year bond at t = 9th Feb 1987
5.2 PACF plot for 10 year bond at t = 9th Feb 1987
5.3 ACF plot for 20 year bond at t = 2nd Oct 2014
5.4 PACF plot for 20 year bond at t = 2nd Oct 2014
5.5 ACF plot for 30 year bond at t = 12th May 2014
5.6 PACF plot for 30 year bond at t = 12th May 2014

6.1 Estimated weights for the 10 (Blue line) and 20 (Orange line) year bonds for the portfolio 10/20/30
6.2 Estimated weights for the 5 (Blue line) and 10 (Orange line) year bonds for the portfolio 5/10/30
6.3 Estimated weights for the 5 (Blue line) and 15 (Orange line) year bonds for the portfolio 5/15/30
6.4 Estimated volatilities $\hat{\sigma}_5$ (Yellow line), $\hat{\sigma}_{10}$ (Orange line) and $\hat{\sigma}_{15}$ (Green line)
6.5 Estimated volatilities $\hat{\sigma}_{20}$ (Blue line) and $\hat{\sigma}_{30}$ (Orange line)
6.6 P&L for the portfolios

7.1 Signal Strength vs. Average P&L for 10/20/30
7.2 Signal Strength vs. Average P&L for 5/15/30
7.3 Signal Strength vs. Average P&L for 5/10/30


Chapter 1

Introduction

It is well known that the forces that shape the yield curve¹ manifest themselves through (i) Expectations around future one period interest rates, (ii) Term premia and (iii) Convexity [7]. The expectations hypothesis asserts that yields on long term bonds must be equal to the expected future one period interest rates. Term premia are the compensation that an investor hopes to receive for bearing duration risk and, finally, Convexity arises from the nonlinear relationship between yields and bond prices. The overall shape of the yield curve is, in reality, a trade-off between these three competing effects. The three factors have specific ranges in which they are most active: the Expectations component is material at the short end of the yield curve, Term premia are material at medium term maturities and Convexity, ultimately, dominates at the long end of the yield curve.

In this essay, we study bond portfolio Convexity and we do so from three different perspectives. First, we introduce a model based representation of what the portfolio convexity should be, using a simple Vasicek setting followed by a general multi-factor Affine setup. Second, we derive a novel model agnostic approach to extract the value of portfolio convexity in terms of portfolio “Carry” and “Roll-Down”. Finally, we develop a trading strategy which employs the model agnostic representation of portfolio convexity to exploit discrepancies between implied and realized convexity, using the Treasury data provided by the US Federal Reserve [10] for the period 1987-2014.

Our focus on portfolio convexity is ultimately linked to the belief that mis-pricings in the fair value of convexity exist in today’s markets. Such mis-pricing can be a result of the sudden changes in volatilities after the 2008 financial crisis or of a net demand for long term bonds by institutions like pension funds².

¹ There are a number of other factors that can also affect a bond’s yield. For example, credit risk could affect yields of defaultable bonds. Illiquidity is another factor affecting yields [7].
² Pension funds or insurance companies may have negatively convex long-dated liabilities and may want to match their long-dated liabilities with long-dated assets. Their desire to match the maturity profile of their liabilities, and to reduce the associated negative convexity, creates a net institutional demand for these long-dated assets [21].


Whatever the reason, these mis-pricings provide us with a trading opportunity and motivate us to develop a model agnostic approach to monetize convexity. However, harnessing this value of convexity requires an active dynamic strategy, similar to the ‘gamma’ trading that an option trader engages in. Whether we make a profit will depend on the interplay between the market’s perception of today’s convexity and the future realized convexity or, in other words, on the interplay between realized and implied yield volatilities, much like ‘gamma’ trading for options.

Academic literature that discusses the possibility of monetizing convexity is practically non-existent. It is, beyond doubt, true that many financial institutions, such as funds, employ proprietary trading strategies that attempt to exploit the mis-pricings in convexity; however, information about those strategies is largely unavailable publicly. This study is unique in that respect and attempts to introduce a trading strategy to monetize convexity using a novel model agnostic representation of portfolio convexity. The trading strategy is relatively easy to implement and is overall profitable, conditional on the quality of the estimates of future yield volatility. Furthermore, we show that the profitability of the trading strategy is not due to uncontrolled residual exposure to the level, slope or curvature of the yields but is purely due to the ability of the strategy to tap into the mis-pricings in convexity.

The rest of this essay is organized as follows: Chapter 2 introduces model dependent expressions of portfolio convexity using a Vasicek model and a multi-factor Affine model. The main contributions of this essay are in Chapter 3 and Chapter 4, where we introduce the model agnostic representation of portfolio convexity and the corresponding trading strategy. Chapter 5 and Chapter 6 discuss the implementation steps and the results of running the strategy on US Treasury data for the period 1987-2014. Chapter 7 concludes the essay.


Chapter 2

Convexity

In this chapter, we study Convexity and its impact on the shape of the yield curve. In particular, we show that (i) Convexity has the effect of depressing bond yields, (ii) the effect of Convexity is larger for long dated bonds, and (iii) Convexity is related to the volatility of the bond yields, in the sense that if there is no volatility, there shall be no Convexity. Much of this analysis has been carried out in the celebrated paper of Mark Fisher [7] in a discrete time setting where all the uncertainty in the bond prices is resolved through the flip of an unbiased coin. Here we adopt a continuous time approach but essentially arrive at similar conclusions. Let $y(t,T)$ denote the time $t$ yield of a zero coupon bond with maturity $T$. Recall that the price of a zero coupon bond at time $t$ with maturity $T$ is given by

$$p(t,T) = \exp\left[-(T-t)\,y(t,T)\right]$$

and so

$$y(t,T) = -\frac{\ln p(t,T)}{T-t} \tag{2.1}$$

Figure (2.1) plots the relationship between bond yield and bond price as represented by equation (2.1). Notice the convex nature of the curve that results in an average price (the red line) greater than the price due to average yield (the yellow line). This feature is called convexity which is nothing but Jensen’s inequality at play. For any convex function $f$ and a random variable $X$, Jensen’s inequality is represented as

$$E\left[f(X)\right] \ge f\left(E\left[X\right]\right),$$

whenever the expectation exists.

In our setup, $\exp\left[-(T-t)\,y(t,T)\right]$ is a convex function of the $T$ maturity yields and thus the above inequality holds.
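As a small numerical illustration of the inequality above, the sketch below uses a two-state yield in the spirit of the coin-flip setting of [7]. The maturity and the two yield levels are illustrative assumptions, not values taken from the data used in this essay.

```python
import math

def zero_price(y, tau):
    """Price of a zero coupon bond with yield y and time to maturity tau, as in (2.1)."""
    return math.exp(-tau * y)

tau = 30.0                   # time to maturity in years (illustrative assumption)
y_down, y_up = 0.03, 0.05    # two equally likely yields (illustrative assumption)

# Left and right hand sides of Jensen's inequality for the price function.
avg_price = 0.5 * (zero_price(y_down, tau) + zero_price(y_up, tau))
price_at_avg_yield = zero_price(0.5 * (y_down + y_up), tau)

# Yield implied by the average price; it lies below the average yield.
implied_yield = -math.log(avg_price) / tau

print(avg_price, price_at_avg_yield, implied_yield)
```

The gap between the two prices widens as `tau` or the distance between the two yields grows, which is the mechanism studied in the rest of this chapter.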


Figure 2.1: Bond Price vs. Yield

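In pursuit of what drives convexity, we assume that a stochastic quantity $x_t$ has the following simplistic underlying process

$$dx_t = \mu_t\,dt + \sigma_t\,dW_t \tag{2.2}$$

where $W_t$ is a standard Brownian motion and $\mu_t$, $\sigma_t$ are deterministic. We know that

$$x_t = x_0 + \int_0^t \mu_t\,dt + \int_0^t \sigma_t\,dW_t$$

Let $x_0 = 0$, which is nothing but an assumption made to simplify calculations. Then

$$\exp\left(E\left[\int_0^t \mu_t\,dt + \int_0^t \sigma_t\,dW_t\right]\right) = \exp\left(\int_0^t \mu_t\,dt\right) \tag{2.3}$$

since $\mu_t$ is deterministic and the expectation of the Ito integral $\int_0^t \sigma_t\,dW_t$ is 0. Further, since $\exp\left(-\frac{1}{2}\int_0^t \sigma_t^2\,dt + \int_0^t \sigma_t\,dW_t\right)$ is an exponential martingale, we have

$$E\left[\exp\left(-\frac{1}{2}\int_0^t \sigma_t^2\,dt + \int_0^t \sigma_t\,dW_t\right)\right] = 1 \tag{2.4}$$

From equation (2.4), we can write

\begin{align}
E\left[\exp\left(\int_0^t \mu_t\,dt + \int_0^t \sigma_t\,dW_t\right)\right] &= \exp\left(\int_0^t \mu_t\,dt + \frac{1}{2}\int_0^t \sigma_t^2\,dt\right) \tag{2.5}\\
&> \exp\left(\int_0^t \mu_t\,dt\right) \tag{2.6}\\
&= \exp\left(E\left[\int_0^t \mu_t\,dt + \int_0^t \sigma_t\,dW_t\right]\right) \tag{2.7}
\end{align}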


Through the above equations, we have shown that indeed $E\left[\exp(x_t)\right] \ge \exp\left(E\left[x_t\right]\right)$, but crucially it is the quantity $\exp\left(\int_0^t \frac{1}{2}\sigma_t^2\,dt\right)$ that drives the gap between the two expressions. Thus, convexity is essentially driven by the square of the volatility of the stochastic variable $x_t$, at least while dealing with exponential functions. In the context of yields and prices, therefore, it is the yield volatilities that drive convexity, and if there is no volatility then there shall be no convexity.

We now ask how convexity affects the yield curve. Note from figure (2.1) that, for a downward move in the yield, the bond price goes up by more than it goes down for an upward move of the same size. Investors are aware of this feature of bonds and are therefore willing to pay more for long dated bonds, because this feature is most prevalent in bonds with longer maturities. For instance, since

$$p(t,T) = \exp\left[-(T-t)\,y(t,T)\right]$$

we have

$$\frac{\partial^2 p(t,T)}{\partial y(t,T)^2} = (T-t)^2\,p(t,T) \tag{2.8}$$

and

$$\frac{E\left[dp(t,T)\right]}{p(t,T)} = \frac{1}{p(t,T)}\frac{\partial p(t,T)}{\partial t}\,dt + \frac{\partial p(t,T)}{\partial y(t,T)}\frac{E\left[dy(t,T)\right]}{p(t,T)} + \frac{1}{2}(T-t)^2\sigma_{y_t}^2\,dt \tag{2.9}$$

The term $(T-t)^2(\sigma_{y_t})^2$ is always positive and is precisely responsible for giving that extra increment to the expected returns for long term bonds. Thus long dated bonds command a higher price and hence a lower yield due to the effects of convexity. In other words, convexity has the effect of lowering bond yields and this effect is most pronounced for long dated bonds.

The other way to understand this effect of convexity is through the lens of the forward rate curve. Brown et al. (2000) [2] note that the term structure of long term forward interest rates is generally downward sloping. They show that this arises from the combined effect of the term structure of volatility of long term interest rates and the greater convexity of long term bonds [2]. Thus Convexity, once again, has the effect of lowering the long term forward rates.
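The maturity dependence of the convexity term in (2.9) can be checked numerically. The sketch below assumes, purely for illustration, that the yield change over a period `dt` is normal with zero mean and variance $\sigma_y^2\,dt$; the helper names and parameter values are hypothetical. It compares $\frac{1}{2}(T-t)^2\sigma_y^2\,dt$ with the log of $E\left[\exp(-(T-t)\,dy)\right]$ obtained by direct numerical integration, which by (2.5) should agree.

```python
import math

def convexity_term(tau, sigma_y):
    """Convexity contribution 0.5 * tau^2 * sigma_y^2 per unit time, as in (2.9)."""
    return 0.5 * tau ** 2 * sigma_y ** 2

def expected_price_ratio(tau, sigma_y, dt, n=4001, width=8.0):
    """E[exp(-tau * dy)] for dy ~ N(0, sigma_y^2 * dt), by trapezoidal integration."""
    s = sigma_y * math.sqrt(dt)
    h = 2.0 * width * s / (n - 1)
    total = 0.0
    for k in range(n):
        dy = -width * s + k * h
        density = math.exp(-0.5 * (dy / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * math.exp(-tau * dy) * density * h
    return total

sigma_y = 0.01   # 100bp annual yield volatility (illustrative assumption)
dt = 1.0         # horizon in years (illustrative assumption)
for tau in (2.0, 10.0, 30.0):
    closed_form = convexity_term(tau, sigma_y) * dt
    numerical = math.log(expected_price_ratio(tau, sigma_y, dt))
    print(tau, closed_form, numerical)
```

Because the term scales with the square of the time to maturity, it is negligible at the short end and material at the long end, in line with the discussion above.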

2.1 Monetizing Convexity

Unlike other risk factors (duration risk, for example), harnessing the value of convexity requires an active dynamic strategy, similar to the “gamma” trading that a delta-neutralized option trader would engage in. Suppose we buy an option and engage in accurate delta hedging. We know that when we buy an option we end up being long “gamma” and long realized volatility. Whether we make money at option expiry will then depend on whether the future realized volatility turns out to be bigger or smaller than the implied volatility that determined the price of the option. Capturing the value of convexity is based on similar principles. By engaging in effective duration hedging, the investor can tap into the difference between the realized and the implied volatility.

In particular, if the investor believes that the market implies too low a volatility, he will build a duration-neutral portfolio which is long convexity. This leads to what we call ‘yield give-up’, which is similar to ‘premium paid’ in the context of “gamma” trading, because the yield at which he buys a long dated bond is lower than the yield in the absence of convexity. The investor then makes money by continuously duration re-hedging his portfolio. Starting from a duration neutral portfolio, as the yields move, the portfolio loses its duration neutrality. As the portfolio is re-hedged, the investor makes money, but the amount of money he makes will depend on the difference between the implied and realized yield volatility. Note that the money is made because the investor is a net buyer when prices have gone down and a net seller when prices have gone up.

Similarly, if the investor believes that the market implies too high a volatility, he will build a duration-neutral portfolio with negative convexity. This negative convexity portfolio will offer the investor a ‘yield pick-up’, which is similar to ‘premium received’ in the context of “gamma” trading. The investor will now hope that he has to duration hedge the portfolio as rarely as possible, because to remain duration neutral he will have to buy when prices go up and sell when prices go down.

The strategy discussed above, however, requires immunization not just against parallel moves in the yield curve but also against the level and slope risks in the yield curve. The other crucial factor is that this strategy requires a very active re-balancing of the portfolio, because each time the yield curve moves up or down and comes back to the same level without the investor having re-balanced his position, he will have lost some of the time value that he paid for. Thus monetizing convexity requires a very active strategy, and whether we make money really depends on how well we immunize our portfolio, the precision of our “belief” on future volatility and how actively we re-balance our position.
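The payoff profile of a duration-neutral, long convexity position can be sketched with three zero coupon bonds. The example below is a simplified illustration under stated assumptions: the maturities are hypothetical, only instantaneous parallel shifts are considered, and the ‘yield give-up’ paid over time is deliberately ignored, so it shows the “gamma” side of the trade only.

```python
import math

def barbell_weights(t_short, t_mid, t_long):
    """Values held in the short and long zeros per unit value sold of the middle zero,
    chosen so that the portfolio is cash neutral and duration neutral."""
    v_long = (t_mid - t_short) / (t_long - t_short)
    return 1.0 - v_long, v_long

def pnl_parallel_shift(dy, maturities, values):
    """P&L of zero coupon positions for an instantaneous parallel yield shift dy."""
    return sum(v * (math.exp(-t * dy) - 1.0) for t, v in zip(maturities, values))

maturities = (5.0, 15.0, 30.0)            # illustrative maturities in years
v_short, v_long = barbell_weights(*maturities)
values = (v_short, -1.0, v_long)          # long the wings, short the belly

duration = sum(v * t for t, v in zip(maturities, values))
convexity = sum(v * t ** 2 for t, v in zip(maturities, values))
print(duration, convexity)
for dy in (-0.005, 0.0, 0.005):
    print(dy, pnl_parallel_shift(dy, maturities, values))
```

The position gains for yield moves in either direction, which is why the investor pays for it through a lower portfolio yield and why the net result depends on realized versus implied volatility.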

2.2 The Vasicek Setting

Consider a Vasicek model for the short rate $r_t$ given by

$$dr_t = \kappa(\theta - r_t)\,dt + \sigma_r\,dW_t \tag{2.10}$$

where $\theta$ is the mean reversion level, $\kappa$ is the reversion speed, $\sigma_r$ is the volatility of the short rate and $W_t$ is a standard Brownian motion. We know that in a Vasicek model

$$p(t,T) = \exp\left(A_t^T + B_t^T r_t\right) \tag{2.11}$$

where

$$A_t^T = \frac{1}{2}\sigma_r^2\int_t^T \left(B_s^T\right)^2 ds + \kappa\theta\int_t^T B_s^T\,ds \tag{2.12}$$

$$B_t^T = -\frac{1-\exp\left(-\kappa(T-t)\right)}{\kappa} \tag{2.13}$$

Now

$$y(t,T) = -\frac{1}{T-t}\log p(t,T) = -\frac{1}{T-t}\left(A_t^T + B_t^T r_t\right) = a_t^T + b_t^T r_t, \text{ say.} \tag{2.14}$$

To calculate a process for the yields, we proceed as follows

\begin{align}
dy(t,T) &= \frac{\partial y(t,T)}{\partial t}\,dt + \frac{\partial y(t,T)}{\partial r_t}\,dr_t + \frac{1}{2}\frac{\partial^2 y(t,T)}{\partial r_t^2}\,(dr_t)^2 \tag{2.15}\\
&= b_t^T\,dr_t, \text{ assuming constant maturity.} \tag{2.16}\\
&= b_t^T\kappa(\theta - r_t)\,dt + b_t^T\sigma_r\,dW_t \tag{2.17}\\
&= \mu_{y_t}^T\,dt + \sigma_{y_t}^T\,dW_t \tag{2.18}
\end{align}

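Note that our assumption of a constant maturity yield works well for reasonably shaped yield curves, i.e., for yield curves which are not too steep at the long end. We start by setting up a portfolio $\Pi_t$ at time $t$ consisting of 1 unit of $T_1$ maturity bonds and $w_2$ units of $T_2$ maturity bonds. So we have

$$\Pi_t = p(t,T_1) + w_2\,p(t,T_2)$$

and the change in the portfolio value over the time interval $dt$ is given by

\begin{align}
d\Pi_t &= dp(t,T_1) + w_2\,dp(t,T_2) \tag{2.19}\\
&= \left[\mathcal{A}\left(p(t,T_1)\right) + w_2\,\mathcal{A}\left(p(t,T_2)\right)\right]dt + \left[\frac{\partial p(t,T_1)}{\partial y(t,T_1)}\sigma_{y_t}^{T_1} + w_2\frac{\partial p(t,T_2)}{\partial y(t,T_2)}\sigma_{y_t}^{T_2}\right]dW_t \tag{2.20}
\end{align}

where

$$\mathcal{A}\left(p(t,T_i)\right) = \frac{\partial p(t,T_i)}{\partial t} + \frac{\partial p(t,T_i)}{\partial y(t,T_i)}\mu_{y_t}^{T_i} + \frac{1}{2}\frac{\partial^2 p(t,T_i)}{\partial y(t,T_i)^2}\left(\sigma_{y_t}^{T_i}\right)^2$$

We choose $w_2$ so that the $dW_t$ term in equation (2.20) is 0 and so

$$w_2 = -\frac{\partial p(t,T_1)}{\partial y(t,T_1)}\,\frac{\partial y(t,T_2)}{\partial p(t,T_2)}\,\frac{\sigma_{y_t}^{T_1}}{\sigma_{y_t}^{T_2}}$$

Since the portfolio is now riskless, we must have

\begin{align}
d\Pi_t &= \Pi_t r_t\,dt \tag{2.21}\\
\Rightarrow \left[\mathcal{A}\left(p(t,T_1)\right) + w_2\,\mathcal{A}\left(p(t,T_2)\right)\right]dt &= \left[p(t,T_1) + w_2\,p(t,T_2)\right]r_t\,dt \tag{2.22}
\end{align}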


$$\Rightarrow \sum_{i=1}^{2} w_i\left[\left(\frac{\partial p(t,T_i)}{\partial t} + \frac{\partial p(t,T_i)}{\partial y(t,T_i)}\mu_{y_t}^{T_i} - p(t,T_i)\,r_t\right) + \frac{1}{2}\frac{\partial^2 p(t,T_i)}{\partial y(t,T_i)^2}\left(\sigma_{y_t}^{T_i}\right)^2\right] = 0 \tag{2.23}$$

Recall that $\sigma_{y_t}^{T_i} = \sigma_r b_t^{T_i}$ and so we can write equation (2.23) as

$$\mathcal{C}\sigma_r^2 - \mathcal{Y} = 0 \tag{2.24}$$

where

$$\mathcal{C} = \frac{1}{2}\sum_{i=1}^{2} w_i\frac{\partial^2 p(t,T_i)}{\partial y(t,T_i)^2}\left(b_t^{T_i}\right)^2 \tag{2.25}$$

and

$$\mathcal{Y} = -\sum_{i=1}^{2} w_i\left(\frac{\partial p(t,T_i)}{\partial t} + \frac{\partial p(t,T_i)}{\partial y(t,T_i)}\mu_{y_t}^{T_i} - p(t,T_i)\,r_t\right) \tag{2.26}$$

The interpretation of equation (2.24) is critical. If we define a break-even volatility as $\sigma_r^2 = \mathcal{Y}/\mathcal{C}$ and if the realized volatility over the time interval $dt$, say 1 day or 1 week, is bigger than the break-even volatility, a portfolio which is long convexity will make money. Similarly, if the realized volatility is smaller than the break-even volatility, a long convexity portfolio will lose money. However, sitting at time $t$ one would have no way to know the realized volatility over the time step $dt$ unless we resort to a forecasting technique that provides the best guess of the realized volatility over $dt$. This is precisely why the estimation of future volatility is of paramount importance and we discuss this aspect in detail in Chapter 5.

Looking at equation (2.24), it is straightforward to conclude that under the Vasicek model, the portfolio profitability will depend on the volatility of the short rate $r_t$ and not on the volatilities of the yields themselves. This is undesirable because we know that Convexity, if anything, is driven by the yield volatilities, and to have the portfolio P&L guided by the uncertainty in the short rate is reason enough to warrant an improvement in our choice of a model.
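The hedge that underlies equation (2.24) can be sketched numerically. The code below is an illustration under stated assumptions: it uses the standard closed form of $A_t^T$ for constant parameters with the sign convention of (2.11) and (2.13), the parameter values are hypothetical, and the helper names are not taken from the appendices. It computes $w_2$ using $\partial p/\partial y = -(T-t)\,p$ and $\sigma_{y_t}^{T} = \sigma_r b_t^{T}$ with $b_t^T = -B_t^T/(T-t)$, and then checks by finite differences that the hedged portfolio has no first order exposure to the short rate.

```python
import math

def vasicek_B(tau, kappa):
    """B_t^T of (2.13); negative, so that the price falls when the short rate rises."""
    return -(1.0 - math.exp(-kappa * tau)) / kappa

def vasicek_A(tau, kappa, theta, sigma_r):
    """Closed form of A_t^T for constant parameters (standard Vasicek result)."""
    B = vasicek_B(tau, kappa)
    return (-(theta - sigma_r ** 2 / (2.0 * kappa ** 2)) * (B + tau)
            - sigma_r ** 2 * B ** 2 / (4.0 * kappa))

def price(r, tau, kappa, theta, sigma_r):
    """Zero coupon bond price exp(A + B r), as in (2.11)."""
    return math.exp(vasicek_A(tau, kappa, theta, sigma_r) + vasicek_B(tau, kappa) * r)

def hedge_weight(r, tau1, tau2, kappa, theta, sigma_r):
    """w2 that removes the dW term in (2.20); equals -(p1 * B1) / (p2 * B2)."""
    p1 = price(r, tau1, kappa, theta, sigma_r)
    p2 = price(r, tau2, kappa, theta, sigma_r)
    return -(p1 * vasicek_B(tau1, kappa)) / (p2 * vasicek_B(tau2, kappa))

# Hypothetical parameter values, chosen only for illustration.
kappa, theta, sigma_r, r0 = 0.1, 0.04, 0.01, 0.03
tau1, tau2 = 10.0, 30.0
w2 = hedge_weight(r0, tau1, tau2, kappa, theta, sigma_r)

def portfolio(r):
    """Value of 1 unit of the tau1 bond plus w2 units of the tau2 bond."""
    return price(r, tau1, kappa, theta, sigma_r) + w2 * price(r, tau2, kappa, theta, sigma_r)

bump = 1e-5
sensitivity = (portfolio(r0 + bump) - portfolio(r0 - bump)) / (2.0 * bump)
print(w2, sensitivity)
```

The weight is only locally valid: once the short rate moves, `hedge_weight` has to be recomputed, which is the re-hedging activity described in Section 2.1.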

2.3 A General Affine Model

In this section we consider a multi-factor affine model of $n-1$ state variables, $\mathbf{x}_t$, that follow a joint mean-reverting process given by

$$d\mathbf{x}_t = \kappa\left(\boldsymbol{\theta} - \mathbf{x}_t\right)dt + S\,d\mathbf{W}_t \tag{2.27}$$

where $\kappa$, $S$ are square matrices of order $n-1$ and $E\left[dW_t^i\,dW_t^j\right] = \delta_{ij}\,dt$, $\delta_{ij}$ being the Kronecker delta. Let the short rate $r_t$ be of the form

$$r_t = u_r + \mathbf{g}'\mathbf{x}_t \tag{2.28}$$

Then the price of a zero coupon bond at time $t$ and maturity $T$ will be [5]

$$p(t,T) = \exp\left[A_t^T + \left(\mathbf{B}_t^T\right)'\mathbf{x}_t\right]$$


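In a manner similar to the Vasicek case, we proceed to obtain a process for the vector of $n$ yield changes given by

$$d\mathbf{y}_t = \begin{pmatrix} dy(t,T_1) \\ dy(t,T_2) \\ \vdots \\ dy(t,T_n) \end{pmatrix}$$

We choose $n$ yields because we have $n-1$ factors, and to create a risk-less portfolio we will need $n$ bonds and hence $n$ yields of different maturities. Note that for the yield with maturity $T_k$

$$y(t,T_k) = -\frac{1}{T_k-t}\log p(t,T_k) = a_t^{T_k} + \left(\mathbf{b}_t^{T_k}\right)'\mathbf{x}_t \tag{2.29}$$

where $\left(\mathbf{b}_t^{T_k}\right)' = -\left(\mathbf{B}_t^{T_k}\right)'/(T_k-t)$ is a row vector of length $n-1$. Now

\begin{align}
dy(t,T_k) &= \frac{\partial y(t,T_k)}{\partial t}\,dt + \frac{\partial y(t,T_k)}{\partial \mathbf{x}_t}\,d\mathbf{x}_t + \frac{1}{2}\frac{\partial^2 y(t,T_k)}{\partial \mathbf{x}_t^2}\,(d\mathbf{x}_t)^2 \tag{2.30}\\
&= \left(\mathbf{b}_t^{T_k}\right)'d\mathbf{x}_t \tag{2.31}\\
&= \left(\mathbf{b}_t^{T_k}\right)'\kappa\left(\boldsymbol{\theta} - \mathbf{x}_t\right)dt + \left(\mathbf{b}_t^{T_k}\right)'S\,d\mathbf{W}_t \tag{2.32}\\
&= \mu_{y_k}\,dt + \boldsymbol{\sigma}_{y_k}'\,d\mathbf{W}_t \tag{2.33}
\end{align}

and so we can write

$$d\mathbf{y}_t = \boldsymbol{\mu}_y\,dt + S_y\,d\mathbf{W}_t$$

where we have defined

$$\boldsymbol{\mu}_y = \begin{pmatrix} \mu_{y_1} \\ \mu_{y_2} \\ \vdots \\ \mu_{y_n} \end{pmatrix}$$

as a column vector of length $n$ and

$$S_y = \begin{pmatrix} \boldsymbol{\sigma}_{y_1}' \\ \boldsymbol{\sigma}_{y_2}' \\ \vdots \\ \boldsymbol{\sigma}_{y_n}' \end{pmatrix} = \begin{pmatrix} \left(\mathbf{b}_t^{T_1}\right)'S \\ \left(\mathbf{b}_t^{T_2}\right)'S \\ \vdots \\ \left(\mathbf{b}_t^{T_n}\right)'S \end{pmatrix}$$

as a matrix of $n$ rows and $n-1$ columns.

As before, we set up a portfolio $\Pi_t$ at time $t$ consisting of $w_k$ units of $T_k$ maturity bonds, $k = 1, 2, \cdots, n$. The change in the portfolio value over the time interval $dt$ is given by

$$d\Pi_t = \sum_{i=1}^{n} w_i\,dp(t,T_i) \tag{2.34}$$

Now note that

$$dp(t,T_i) = \frac{\partial p(t,T_i)}{\partial t}\,dt + \left[\operatorname{grad} p(t,T_i)\right]'d\mathbf{y}_t + \frac{1}{2}\frac{\partial^2 p(t,T_i)}{\partial y(t,T_i)^2}\left(\mathbf{b}_t^{T_i}\right)'SS'\,\mathbf{b}_t^{T_i}\,dt \tag{2.35}$$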


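where

$$\left[\operatorname{grad} p(t,T_i)\right] = \begin{pmatrix} \dfrac{\partial p(t,T_i)}{\partial y(t,T_1)} \\ \vdots \\ \dfrac{\partial p(t,T_i)}{\partial y(t,T_n)} \end{pmatrix}$$

To calculate the weights, we set the $d\mathbf{W}_t$ term in equation (2.34) to 0 and obtain the optimal weights $w_i^*$ as solutions to the equations

$$\sum_{i=1}^{n} w_i\left[\operatorname{grad} p(t,T_i)\right]'S_y = 0 \tag{2.36}$$

Since the portfolio is now risk-less, we must have

$$d\Pi_t = \Pi_t r_t\,dt$$

which we re-write as

$$\mathbf{w}^{*\prime}\left(c_1 + c_2 + c_3 + c_4\right) = 0 \tag{2.37}$$

where

$$\mathbf{w}^{*\prime} = \left[w_1^*, \cdots, w_n^*\right] \tag{2.38}$$

$$c_1 = \begin{pmatrix} \dfrac{\partial p(t,T_1)}{\partial t}\,dt \\ \vdots \\ \dfrac{\partial p(t,T_n)}{\partial t}\,dt \end{pmatrix} \tag{2.39}$$

$$c_2 = \begin{pmatrix} \dfrac{\partial p(t,T_1)}{\partial y(t,T_1)}\left(\mathbf{b}_t^{T_1}\right)'\kappa\left(\boldsymbol{\theta}-\mathbf{x}_t\right)dt \\ \vdots \\ \dfrac{\partial p(t,T_n)}{\partial y(t,T_n)}\left(\mathbf{b}_t^{T_n}\right)'\kappa\left(\boldsymbol{\theta}-\mathbf{x}_t\right)dt \end{pmatrix} = \begin{pmatrix} -(T_1-t)\,p(t,T_1)\left(\mathbf{b}_t^{T_1}\right)'\kappa\left(\boldsymbol{\theta}-\mathbf{x}_t\right)dt \\ \vdots \\ -(T_n-t)\,p(t,T_n)\left(\mathbf{b}_t^{T_n}\right)'\kappa\left(\boldsymbol{\theta}-\mathbf{x}_t\right)dt \end{pmatrix} \tag{2.40}$$

$$c_3 = \begin{pmatrix} -p(t,T_1)\,r_t\,dt \\ \vdots \\ -p(t,T_n)\,r_t\,dt \end{pmatrix} \tag{2.41}$$

$$c_4 = \begin{pmatrix} \dfrac{1}{2}\dfrac{\partial^2 p(t,T_1)}{\partial y(t,T_1)^2}\left(\mathbf{b}_t^{T_1}\right)'SS'\,\mathbf{b}_t^{T_1}\,dt \\ \vdots \\ \dfrac{1}{2}\dfrac{\partial^2 p(t,T_n)}{\partial y(t,T_n)^2}\left(\mathbf{b}_t^{T_n}\right)'SS'\,\mathbf{b}_t^{T_n}\,dt \end{pmatrix} = \begin{pmatrix} \dfrac{1}{2}(T_1-t)^2\,p(t,T_1)\left(\sigma^{y_1}\right)^2 dt \\ \vdots \\ \dfrac{1}{2}(T_n-t)^2\,p(t,T_n)\left(\sigma^{y_n}\right)^2 dt \end{pmatrix} \tag{2.42}$$

and we have defined

$$\left(\sigma^{y_i}\right)^2 = \boldsymbol{\sigma}_{y_i}'\boldsymbol{\sigma}_{y_i} = \left[\sigma_{1,y_i}, \sigma_{2,y_i}, \cdots, \sigma_{n-1,y_i}\right]\begin{pmatrix} \sigma_{1,y_i} \\ \sigma_{2,y_i} \\ \vdots \\ \sigma_{n-1,y_i} \end{pmatrix} \tag{2.43}$$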


Our intention here is to obtain an expression similar to equation (2.24) for the general affine case. With that in mind, we proceed to simplify the $c_4$ term as follows. Note that the volatilities $\sigma^{y_i}$ are the yield volatilities produced by the model and one can express them as

$$\sigma^{y_i} = h_i\,\bar{\sigma}^y \tag{2.44}$$

where

$$\bar{\sigma}^y = \frac{1}{n}\sum_{i=1}^{n}\sigma^{y_i} \tag{2.45}$$

In equations (2.44) and (2.45) above we are alluding to the fact that the yield volatilities $\sigma^{y_i}$ can be represented as a fraction $h_i$ of the average yield volatility $\bar{\sigma}^y$. Crucially, we now assume that over time what varies is the average yield volatility $\bar{\sigma}^y$ but the fractions $h_i$ do not. In other words, what we are really articulating is that the yield volatilities move in parallel and therefore the first principal component, or the level, of the yield volatilities explains a majority of the variance in the yield volatilities. Thus the term $\bar{\sigma}^y$, which is nothing but the level of the yield volatilities, accounts for almost all of the variability in yields and the terms $h_i$ remain unchanged.

We can now re-write $c_4$ as

$$c_4 = \left(\bar{\sigma}^y\right)^2\begin{pmatrix} \dfrac{1}{2}(T_1-t)^2\,p(t,T_1)\,h_1^2\,dt \\ \vdots \\ \dfrac{1}{2}(T_n-t)^2\,p(t,T_n)\,h_n^2\,dt \end{pmatrix} \tag{2.46}$$

Finally, from equation (2.37), we have

$$\mathbf{w}^{*\prime}\left(c_1 + c_2 + c_3\right) = -\mathbf{w}^{*\prime}c_4 = -\left(\bar{\sigma}^y\right)^2\mathbf{w}^{*\prime}\begin{pmatrix} \dfrac{1}{2}(T_1-t)^2\,p(t,T_1)\,h_1^2\,dt \\ \vdots \\ \dfrac{1}{2}(T_n-t)^2\,p(t,T_n)\,h_n^2\,dt \end{pmatrix} \tag{2.47}$$

and so

$$\left(\bar{\sigma}^y\right)^2 = \frac{-\mathbf{w}^{*\prime}\left(c_1 + c_2 + c_3\right)}{\mathbf{w}^{*\prime}\begin{pmatrix} \frac{1}{2}(T_1-t)^2\,p(t,T_1)\,h_1^2 \\ \vdots \\ \frac{1}{2}(T_n-t)^2\,p(t,T_n)\,h_n^2 \end{pmatrix}} = \frac{\mathcal{Y}}{\mathcal{C}} \tag{2.48}$$

Looking at equation (2.48), several comments are in order. Unlike in the Vasicek model, the portfolio profitability now depends on the yield volatilities. The convexity contribution term $\mathcal{C}$ is, among other things, directly driven by the model implied yield volatilities $\sigma^{y_k}$, as against the Vasicek case where, in equation (2.25), the convexity contribution term $\mathcal{C}$ depends on the yield volatilities only through the $b_t^{T_i}$'s. This is an improvement but it is not without certain complications, which we now discuss.

In an attempt to run this strategy, one of the foremost tasks would be to calibrate this multi-factor model to today’s yields. We might do so by fitting the $\kappa$ and $S$ matrices to the best statistical estimate of the yield covariance matrix. This is a plausible approach, especially when one considers the Principal Component based affine term structure model introduced by Rebonato et al. [20]¹. With the fitted $\kappa$ and $S$ matrices, we can then use the remaining model parameters to fit the shape of today’s yield curve out to 10 years². The 10 to 30 year portion of the fitted yield curve will then tell us what the model thinks the convexity contribution to the yield curve should be, as compared to the market implied convexity contribution to the yield curve. In case the fit to the 10 to 30 year portion of the yield curve is poor, we might conclude that the model’s perception of the convexity contribution is different from that of the market, and this is exactly when we run into the question as to who is right - the model or the market?

Having seen that the fit to the 10 to 30 year portion of the yield curve is poor, one might adjust the estimated $\kappa$ and $S$ matrices to produce a better fit. However, this corrective action translates into model implied yield volatilities being very different from the statistical yield covariance matrix. Since different implied yield volatilities mean a different implied convexity contribution, we once again run into the question of who is right - the model or the market? Thus we see that we have a battery of distinct estimation techniques available to us to estimate a model implied convexity, but none of those techniques shields us from the question of whom to trust - the market or the model. This is crucial for our strategy because we would like to have an unequivocal indicator of the value of convexity and not be entangled in a motley of methodological abundance.

¹ The AFPC model introduced by Rebonato, Saroka and Putyatin [20] is known to recover very well the best statistical estimate of the yield covariance matrix.
² We choose 10 years because convexity has very little effect on the shape of the yield curve at this maturity and so the model’s view of convexity can be obtained from the higher maturity portion of the fitted yield curve.
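The decomposition in equations (2.44) to (2.48) amounts to a few lines of arithmetic once the inputs are available. The sketch below is only an illustration: the yield volatilities, weights, maturities and prices are hypothetical numbers, and the value of $\mathcal{Y}$ is constructed so that the break-even level is known in advance rather than computed from a calibrated model.

```python
def level_fractions(yield_vols):
    """Average yield volatility and the fractions h_i with sigma_i = h_i * average,
    as in (2.44) and (2.45)."""
    avg = sum(yield_vols) / len(yield_vols)
    return avg, [v / avg for v in yield_vols]

def convexity_contribution(weights, taus, prices, h):
    """Denominator C of (2.48): sum_i w_i * 0.5 * tau_i^2 * p_i * h_i^2."""
    return sum(w * 0.5 * t ** 2 * p * hi ** 2
               for w, t, p, hi in zip(weights, taus, prices, h))

def breakeven_avg_variance(Y, C):
    """Break-even squared average yield volatility Y / C from (2.48)."""
    return Y / C

# Hypothetical inputs, for illustration only.
yield_vols = [0.0095, 0.0090, 0.0085]      # model implied yield volatilities
taus = [10.0, 20.0, 30.0]                  # times to maturity in years
prices = [0.74, 0.52, 0.38]                # zero coupon bond prices
weights = [1.0, -1.5, 0.9]                 # stand-in for the optimal weights w*

avg_vol, h = level_fractions(yield_vols)
C = convexity_contribution(weights, taus, prices, h)
Y = C * avg_vol ** 2                       # constructed so that break-even equals avg_vol^2
print(avg_vol, h, breakeven_avg_variance(Y, C))
```

In practice $\mathcal{Y}$ would come from the left hand side of (2.47), and the comparison of interest is between the break-even level and a forecast of the realized average yield volatility.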


Chapter 3

A Model Agnostic Approach

In this chapter we derive a model independent estimate of the fair value of portfolio convexity. Doing so spares us the issues that arise from a model based estimate, as discussed in the previous chapter. It will also prove to be extremely beneficial for developing a trading strategy aimed at monetizing convexity. We start with a similar setup of $n-1$ factors $\mathbf{x}_t$ with joint dynamics given by

$$d\mathbf{x}_t = \kappa\left(\boldsymbol{\theta} - \mathbf{x}_t\right)dt + S\,d\mathbf{W}_t \tag{3.1}$$

where $\kappa$, $S$ are square matrices of order $n-1$ and $E\left[dW_t^i\,dW_t^j\right] = \delta_{ij}\,dt$, $\delta_{ij}$ being the Kronecker delta. Our intention here is to work with the bond prices $p(t,T)$ directly and not the yields $y(t,T)$. Given that the short rate $r_t$ is of the form shown in equation (2.28), the price of a zero coupon bond at time $t$ and maturity $T$ will be [5]

$$p(t,T) = \exp\left[A_t^T + \left(\mathbf{B}_t^T\right)'\mathbf{x}_t\right]$$

Further,

\begin{align}
dp(t,T) &= \frac{\partial p(t,T)}{\partial t}\,dt + \sum_{i=1}^{n-1}\frac{\partial p(t,T)}{\partial x_{i,t}}\,dx_{i,t} + \frac{1}{2}\sum_{i=1}^{n-1}\sum_{j=1}^{n-1}\frac{\partial^2 p(t,T)}{\partial x_{i,t}\,\partial x_{j,t}}\,dx_{i,t}\,dx_{j,t} \notag\\
&= \frac{\partial p(t,T)}{\partial t}\,dt + \left[\operatorname{grad} p(t,T)\right]'d\mathbf{x}_t + \frac{1}{2}\operatorname{tr}\left[S'\,\mathbf{B}_t^T\left(\mathbf{B}_t^T\right)'S\right]p(t,T)\,dt \tag{3.2}
\end{align}

We set up a portfolio $\Pi_t$ at time $t$ consisting of $w_k$ units of $T_k$ maturity bonds, $k = 1, 2, \cdots, n$. The change in the portfolio value over the time interval $dt$ is given by

$$d\Pi_t = \sum_{i=1}^{n} w_i\,dp(t,T_i)$$

We would like our portfolio to be first order immunized, and to that effect one must recover the weights $w_i$ so that the $d\mathbf{W}_t$ terms in the above equation are all zero. In other words, the optimal weights $w_i^*$ must satisfy

$$\sum_{i=1}^{n} w_i^*\left[\operatorname{grad} p(t,T_i)\right] = \mathbf{0}$$

Since the portfolio is now risk-less, we must have

$$d\Pi_t = \Pi_t r_t\,dt$$

and so,

$$\sum_{i=1}^{n} w_i^*\left(\frac{\partial p(t,T_i)}{\partial t}\,dt + \frac{1}{2}\operatorname{tr}\left[S'\,\mathbf{B}_t^{T_i}\left(\mathbf{B}_t^{T_i}\right)'S\right]p(t,T_i)\,dt\right) = r_t\sum_{i=1}^{n} w_i^*\,p(t,T_i)\,dt \tag{3.3}$$

recalling that the $w_i^*$s satisfy

$$\sum_{i=1}^{n} w_i^*\left[\operatorname{grad} p(t,T_i)\right] = \mathbf{0}$$

Let $\hat{w}_i = w_i^*\,p(t,T_i)$. Then, we can re-write equation (3.3) as

$$\sum_{i=1}^{n}\frac{\hat{w}_i}{p(t,T_i)}\left(\frac{\partial p(t,T_i)}{\partial t}\,dt - r_t\,p(t,T_i)\,dt + \frac{1}{2}\operatorname{tr}\left[S'\,\mathbf{B}_t^{T_i}\left(\mathbf{B}_t^{T_i}\right)'S\right]p(t,T_i)\,dt\right) = 0 \tag{3.4}$$
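The immunization condition $\sum_i w_i^*\left[\operatorname{grad} p(t,T_i)\right] = \mathbf{0}$ can be sketched for the smallest non-trivial case of three bonds and two factors, where the weights span the null space of a $2 \times 3$ matrix and can be obtained from a cross product. Everything numerical below is a hypothetical assumption made for illustration: the factor loadings are given a Vasicek-like shape per factor, the prices come from a flat 3% curve, and the normalization $w_1 = 1$ is a convention, not the estimation procedure of Section 3.1.

```python
import math

def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def immunizing_weights(grads):
    """grads[i] is the 2-vector grad p(t, T_i) for bond i (three bonds, two factors).
    Returns weights w with sum_i w_i * grads[i] = 0, scaled so that w[0] = 1."""
    row0 = tuple(g[0] for g in grads)
    row1 = tuple(g[1] for g in grads)
    w = cross(row0, row1)
    return tuple(wi / w[0] for wi in w)

def loading(tau, kappa):
    """Hypothetical factor loading B(tau), Vasicek-like in each factor."""
    return -(1.0 - math.exp(-kappa * tau)) / kappa

taus = (10.0, 20.0, 30.0)
kappas = (0.05, 0.5)                            # hypothetical reversion speeds
prices = [math.exp(-0.03 * t) for t in taus]    # hypothetical flat 3% curve

# For p = exp(A + B'x), the gradient with respect to x is B * p.
grads = [tuple(loading(t, k) * p for k in kappas) for t, p in zip(taus, prices)]
w_star = immunizing_weights(grads)

residual = [sum(w * g[j] for w, g in zip(w_star, grads)) for j in range(2)]
print(w_star, residual)
```

With more bonds than factors plus one, the null space has more than one dimension and an additional criterion is needed to pick the weights.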

If we define $D = \sum_{i=1}^{n}\hat{w}_i\,\mathbf{B}_t^{T_i}\left(\mathbf{B}_t^{T_i}\right)'$, then we can express the portfolio convexity $CP$ as

$$CP = \frac{1}{2}\operatorname{tr}\left[S'DS\right] = -\sum_{i=1}^{n}\hat{w}_i\left(\frac{1}{p(t,T_i)}\frac{\partial p(t,T_i)}{\partial t} - r_t\right) \tag{3.5}$$

Now note that the instantaneous forward rate at time $t$ is given by

$$f_t^{T_i} = -\frac{\partial \log p(t,T_i)}{\partial T} = \frac{\partial \log p(t,T_i)}{\partial t} = \frac{1}{p(t,T_i)}\frac{\partial p(t,T_i)}{\partial t}$$

where we have assumed time homogeneity¹. Re-writing equation (3.5), we see that

$$CP = \frac{1}{2}\operatorname{tr}\left[S'DS\right] = -\sum_{i=1}^{n}\hat{w}_i\left(f_t^{T_i} - r_t\right) \tag{3.6}$$

Equation (3.6) above is quite compelling. The expression on the left hand side, $\frac{1}{2}\operatorname{tr}\left[S'DS\right]$, is purely a model dependent estimate of portfolio convexity. Let us call it the theoretical portfolio convexity or $CP_{th}$. It is essentially what the model thinks the portfolio convexity should be. The expression on the right hand side, on the other hand, is a model independent quantity that can be read off from the shape of today’s yield curve, save for the weights $\hat{w}_i$. Before we dive deep into the implications of this equation, let us look closely at the expression on the right, $-\sum_{i=1}^{n}\hat{w}_i\left(f_t^{T_i} - r_t\right)$, and see if we can glean a deeper intuition. Recall that given equation (2.28), we have

$$p(t,T) = \exp\left[A_t^T + \left(\mathbf{B}_t^T\right)'\mathbf{x}_t\right]$$

and so

$$\frac{1}{p(t,T_i)}\frac{\partial p(t,T_i)}{\partial t} = \frac{\partial A_t^{T_i}}{\partial t} + \frac{\partial \left(\mathbf{B}_t^{T_i}\right)'}{\partial t}\,\mathbf{x}_t \tag{3.7}$$

Also,

¹ In other words, if the system is time homogeneous then the maturity $T$ and calendar time $t$ always appear as $\tau = T - t$.

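$$-(T-t)\,y(t,T) = \log p(t,T)$$

$$\Rightarrow\; y(t,T) - (T-t)\frac{\partial y(t,T)}{\partial t} = \frac{\partial A_t^T}{\partial t} + \frac{\partial \left(\mathbf{B}_t^T\right)'}{\partial t}\,\mathbf{x}_t$$

and further assuming time homogeneity, we have

$$y(t,T_i) + (T_i-t)\frac{\partial y(t,T_i)}{\partial T} = \frac{1}{p(t,T_i)}\frac{\partial p(t,T_i)}{\partial t} \tag{3.8}$$

Thus using equation (3.8), we can re-write equation (3.5) as

$$CP_{th} = \frac{1}{2}\operatorname{tr}\left[S'DS\right] = -\sum_{i=1}^{n}\hat{w}_i\left[y(t,T_i) - r_t + (T_i-t)\frac{\partial y(t,T_i)}{\partial T}\right] \tag{3.9}$$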

The quantity $y(t,T_i) - r_t$ is commonly known as bond carry in the trader's jargon. It is simply the excess yield available over and above the funding cost $r_t$, and portfolio carry is nothing but $\sum_{i=1}^{n} \hat{w}_i \left[y(t,T_i) - r_t\right]$. Similarly, the quantity $(T_i - t)\frac{\partial y(t,T_i)}{\partial T}$ is called roll-down, which is the change in yield due to a small change in maturity multiplied by the duration of a zero coupon bond. Thus, we can re-write equation (3.9) as

$$\mathrm{CP}_{th} = \frac{1}{2}\,\mathrm{tr}\left[S'DS\right] = -\left[\text{Carry} + \text{Roll-down}\right] \tag{3.10}$$

where

$$\text{Carry} = \sum_{i=1}^{n} \hat{w}_i \left[y(t,T_i) - r_t\right], \qquad \text{Roll-down} = \sum_{i=1}^{n} \hat{w}_i\,(T_i - t)\,\frac{\partial y(t,T_i)}{\partial T}$$

Let us now discuss equation (3.10) and understand its implications. The expression on the left-hand side, $\frac{1}{2}\mathrm{tr}[S'DS]$, is, as discussed earlier, a model dependent quantity. It gives the model's view of what the portfolio convexity should be. Crucially, this quantity depends on what the bonds will do in the next time step $dt$. A straightforward way to estimate this quantity would be to calibrate the model to the yield covariance matrix and then compute $\frac{1}{2}\mathrm{tr}[S'DS]$ from the calibrated model. Quite interestingly, equation (3.10) declares that this model dependent quantity must equal the model independent quantity $-[\text{Carry} + \text{Roll-down}]$, which can be easily read off from today's yield curve, except for the optimal portfolio weights. Thus equation (3.10) really provides an almost model agnostic estimate of portfolio convexity that we had originally set out to explore.

Further, consider any time homogeneous model. If this model is calibrated to the shape of the yield curve² then it will produce the same value of the theoretical portfolio convexity $\mathrm{CP}_{th}$. The crucial assumption behind this impressive result of equation (3.10) is, however, that of time homogeneity. We have assumed that all our equations depend on the time to maturity $(T - t)$ alone and not on the calendar time $t$. Given the exceptional times that we dwell in, for example the crisis of September 2008, this assumption might seem untenable, but one must note that such forward looking beliefs are limited only to the near future and to shorter maturities, where the effect of convexity is hardly significant.
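The right-hand side of equation (3.10) can be sketched numerically as follows. This is a minimal illustration, not the thesis implementation: the function name `theoretical_convexity`, the toy curve and the weights are all hypothetical, $t$ is taken as 0 so that $T_i - t = T_i$, and the slope $\partial y/\partial T$ is approximated by a central finite difference.

```python
import numpy as np

def theoretical_convexity(yield_fn, maturities, weights, funding_rate, h=1e-4):
    """Return (carry, roll_down, cp_th) with cp_th = -(carry + roll_down), eq. (3.10).

    yield_fn is any callable mapping an array of maturities to zero-coupon yields.
    Time t is set to 0, so (T_i - t) is simply T_i.
    """
    T = np.asarray(maturities, dtype=float)
    w = np.asarray(weights, dtype=float)
    y = yield_fn(T)
    # Central finite difference for the slope of the yield curve in maturity.
    dy_dT = (yield_fn(T + h) - yield_fn(T - h)) / (2.0 * h)
    carry = float(np.sum(w * (y - funding_rate)))
    roll_down = float(np.sum(w * T * dy_dT))
    return carry, roll_down, -(carry + roll_down)

def toy_curve(T):
    # Hypothetical upward-sloping curve, used only to exercise the function.
    return 0.04 - 0.02 * np.exp(-0.1 * np.asarray(T, dtype=float))

# Hypothetical weights for a 10/20/30 portfolio and a 2% funding rate.
carry, roll_down, cp_th = theoretical_convexity(
    toy_curve, [10, 20, 30], [0.9, -1.8, 1.0], funding_rate=0.02
)
print(carry, roll_down, cp_th)
```

On a flat curve the roll-down term vanishes and only the carry term contributes, which is a quick sanity check on any implementation.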

3.1 The Weights

We notice that in the expression − [Carry + Roll-down], the optimal portfolio weights are not really model independent and hence this expression is not completely model agnostic. In this section, we present an estimation technique to calculate the optimal weights. Let us go back to our factor model in equation (3.1) and choose the n − 1 factors as yield principal components. Then we have,

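$$d\mathbf{y}_t = V'\,d\mathbf{x}_t \tag{3.11}$$

where $V$ is the matrix of eigenvectors of the yield covariance matrix. Now, recall that in the Vasicek case we had constructed a portfolio with two bonds and had obtained the optimal weight $w_2$ as

$$w_2 = -\frac{\partial p(t,T_1)}{\partial y(t,T_1)}\,\frac{\partial y(t,T_2)}{\partial p(t,T_2)}\,\frac{\sigma_{y_t}^{T_1}}{\sigma_{y_t}^{T_2}}$$

which can be re-written as

$$w_2 = -\frac{\dfrac{\partial p(t,T_1)}{\partial y(t,T_1)}\,\sigma_{y_t}^{T_1}}{\dfrac{\partial p(t,T_2)}{\partial y(t,T_2)}\,\sigma_{y_t}^{T_2}} = -\frac{p(t,T_1)\,(T_1 - t)\,\sigma_{y_t}^{T_1}}{p(t,T_2)\,(T_2 - t)\,\sigma_{y_t}^{T_2}}$$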

² It is crucial that the fit to the yield curve is perfect at the long end.


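But $\sigma_{y_t}^{T_i} = b_t^{T_i}\,\sigma_r$, $b_t^{T_i} = B_t^{T_i}/(T_i - t)$ and

$$\frac{\partial p(t,T_i)}{\partial r} = B_t^{T_i}\,p(t,T_i)$$

Thus, we have

$$w_2 = -\frac{\partial p(t,T_1)/\partial r}{\partial p(t,T_2)/\partial r} \tag{3.12}$$

Notice how the volatility factor has disappeared from equation (3.12). In the multi-factor case, the expression $\frac{\partial p(t,T_i)}{\partial r}$ generalizes to $\frac{\partial p(t,T_i)}{\partial x_k}$ and using equation (3.11), we know

$$\frac{\partial p(t,T_i)}{\partial x_k} = \frac{\partial p(t,T_i)}{\partial y(t,T_i)}\,\frac{\partial y(t,T_i)}{\partial x_k} = -(T_i - t)\,p(t,T_i)\,(V')_{ik} \tag{3.13}$$

In a multi-factor affine setup, we have already obtained that the optimal weights $w_i$ must satisfy

$$\sum_{i=1}^{n} w_i\,\frac{1}{p(t,T_i)}\,\left[\operatorname{grad} p(t,T_i)\right] = \mathbf{0}$$

which can be written as

$$\begin{pmatrix} w_1\tau_1 V'_{11} \\ w_1\tau_1 V'_{21} \\ \vdots \\ w_1\tau_1 V'_{n-1,1} \end{pmatrix} + \begin{pmatrix} w_2\tau_2 V'_{12} \\ w_2\tau_2 V'_{22} \\ \vdots \\ w_2\tau_2 V'_{n-1,2} \end{pmatrix} + \cdots + \begin{pmatrix} w_n\tau_n V'_{1,n} \\ w_n\tau_n V'_{2,n} \\ \vdots \\ w_n\tau_n V'_{n-1,n} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix} \tag{3.14}$$

where $\tau_i = T_i - t$. Assuming $w_1 = 1$, we can write equation (3.14) in matrix form as

$$G\mathbf{w} = \mathbf{g} \tag{3.15}$$

and obtain the optimal weights $\hat{\mathbf{w}}$ as

$$\hat{\mathbf{w}} = G^{-1}\mathbf{g} \tag{3.16}$$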

Quite importantly, this technique of estimating the optimal portfolio weights is still model dependent. Further, since it involves a matrix inversion, one can run into numerical instability when $G$ is ill-conditioned. One approach to dealing with such ill-conditioned matrices would be to employ a Singular Value Decomposition (SVD) to estimate the optimal weights. In Chapter 5.1, however, we present another methodology to estimate the portfolio weights which is truly model agnostic, and it is the one that we finally implement when executing our trading strategy.
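The linear system (3.14)-(3.16) can be sketched as below. This is an illustration under stated assumptions rather than the procedure used in the thesis: the function name `factor_neutral_weights` and the example loadings are hypothetical, the loadings are stored as an $(n-1) \times n$ array with one row per factor and one column per bond, and `numpy.linalg.lstsq` (an SVD-based least-squares solver) stands in for the explicit inverse $G^{-1}$.

```python
import numpy as np

def factor_neutral_weights(taus, loadings):
    """Solve eq. (3.14) for the weights with w_1 fixed to 1.

    taus     : times to maturity tau_i, shape (n,)
    loadings : array of shape (n-1, n); loadings[k, i] is the sensitivity of
               the yield of bond i to factor k (hypothetical layout).
    """
    taus = np.asarray(taus, dtype=float)
    L = np.asarray(loadings, dtype=float)
    M = L * taus                 # M[k, i] = tau_i * loadings[k, i]
    G = M[:, 1:]                 # columns multiplying the unknown w_2..w_n
    g = -M[:, 0]                 # the w_1 = 1 column moved to the right-hand side
    w_rest, *_ = np.linalg.lstsq(G, g, rcond=None)
    return np.concatenate(([1.0], w_rest))

# Hypothetical "level" and "slope" loadings for bonds with 5, 10 and 30 years to maturity.
taus = [5.0, 10.0, 30.0]
loadings = [[1.0, 1.0, 1.0],
            [-1.0, 0.0, 1.0]]
w = factor_neutral_weights(taus, loadings)
print(w)
```

Using a least-squares solve instead of forming the inverse avoids amplifying rounding errors when $G$ is close to singular, which is the concern raised above.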


Chapter 4

The Trading Strategy

In this section, we present a trading strategy that exploits the model agnostic representation of portfolio convexity that was derived in equation (3.10). Through this trading strategy, we seek to exploit those instances in time when Convexity has not been fairly priced. Using data on the US Zero Coupon yield from 1987-2014, we show that there have been several occasions of prolonged inconsistencies in the pricing of Convexity and on these occasions our strategy has been able to exploit such inconsistencies.

Our approach in the previous section has allowed us to construct a duration-neutral portfolio by suitably estimating the optimal weights $\hat{w}_i$ for the $n$ bonds. Thereafter, we were able to represent the theoretical portfolio convexity $\mathrm{CP}_{th}$ as

$$\mathrm{CP}_{th} = \frac{1}{2}\,\mathrm{tr}\left[S'DS\right] = -\left[\text{Carry} + \text{Roll-down}\right]$$

where

$$\text{Carry} = \sum_{i=1}^{n} \hat{w}_i \left[y(t,T_i) - r_t\right], \qquad \text{Roll-down} = \sum_{i=1}^{n} \hat{w}_i\,(T_i - t)\,\frac{\partial y(t,T_i)}{\partial T}$$

The left hand side of this equation is the theoretical portfolio convexity, which is nothing but the model's view of what the portfolio convexity should be. The right hand side, on the other hand, is almost model independent and can be read off from today's yield curve.

Let us for a moment re-visit the portfolio convexity. We know that convexity is additive and if we could compute the convexity of every bond in the portfolio, say $\mathrm{Conv}_i$, then we can easily calculate the portfolio convexity $\mathrm{CP}$ from equation (2.9) as

$$\mathrm{CP} = \sum_{i=1}^{n} \hat{w}_i\,\mathrm{Conv}_i = \frac{1}{2}\sum_{i=1}^{n} \hat{w}_i\,T_i^2\,\sigma_i^2 \tag{4.1}$$


The $\sigma_i$'s in equation (4.1) are the yield volatilities. One way to estimate them is to employ a calibrated model, but a more beneficial alternative would be to use a statistical estimation technique that not only estimates the $\sigma_i$'s but also provides the best estimate of the yield volatilities over the next time step $dt$. This is crucial because an estimate of the future yield volatility over the next time step allows us to generate an estimate of the future realized convexity over the next time step. One can then compare this estimated realized convexity with the theoretical convexity $\mathrm{CP}_{th}$ that relies only on today's yield curve. A gap between these two quantities is the all important trading signal that we want our strategy to pick up.

More precisely, let $\hat{\sigma}_i$ denote the best time $t$ estimate of the $i$-th yield volatility over the next time step $dt$. Then, an estimate of future realized convexity over the time step $dt$ is given by

$$\mathrm{CP}_{pred} = \frac{1}{2}\sum_{i=1}^{n} \hat{w}_i\,T_i^2\,\hat{\sigma}_i^2 \tag{4.2}$$

The trading strategy then consists of defining a trading signal $S$ as

$$S = \mathrm{CP}_{pred} - \mathrm{CP}_{th} = \frac{1}{2}\sum_{i=1}^{n} \hat{w}_i\,T_i^2\,\hat{\sigma}_i^2 + \left[\text{Carry} + \text{Roll-down}\right] \tag{4.3}$$

When $S > 0$, we expect the future realized portfolio convexity to be bigger than today's portfolio convexity and thus we buy a duration hedged long convexity portfolio at time $t$. Similarly, when $S < 0$ the future realized portfolio convexity is expected to be smaller than today's portfolio convexity and so we sell a duration hedged long convexity portfolio at time $t$. At time $t + dt$, we execute two actions. First, we unwind our position and calculate the portfolio profit and loss (P&L) as the difference between the portfolio values at time $t$ and $t + dt$ over and above the funding cost $r_t$. Second, at the same time $t + dt$, we repeat the entire process by obtaining an estimate of the future realized convexity over the next trading period $dt$, checking the trading signal $S$ and then entering into a trade based on the sign of the signal at time $t + dt$. We repeat our strategy over several trading intervals and hope to generate a positive P&L every time the strategy encounters a clear signal.

It is important to note, however, that our trading strategy relies on several critical components and it is, therefore, instructive to understand the factors that influence the profitability of our strategy. Recall that convexity is a second order effect and to capture this second order effect, the optimal portfolio weights must provide a complete first order immunization to our portfolio. Thus, the estimation of the portfolio weights $\hat{w}_i$ forms the crucial first step, as any residual first order exposure will have a bearing on our portfolio P&L. The estimated yield volatilities $\hat{\sigma}_i$ form the other important input to the trading strategy. These estimated volatilities are essentially our best guesses of the future yield volatilities over the trading period $dt$, and a significant over or under estimation of the yield volatilities will have a negative impact on the portfolio P&L. We discuss the estimation of the optimal portfolio weights and future yield volatilities in Chapter 5.
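The signal in equation (4.3) is simple enough to sketch directly. The snippet below is a hedged illustration: the function names `trading_signal` and `position`, and all numerical inputs, are hypothetical, and $\mathrm{CP}_{th}$ is assumed to have been computed separately from today's curve as $-[\text{Carry} + \text{Roll-down}]$.

```python
import numpy as np

def trading_signal(weights, maturities, pred_vols, cp_th):
    """Return (S, CP_pred) following eqs. (4.2) and (4.3)."""
    w = np.asarray(weights, dtype=float)
    T = np.asarray(maturities, dtype=float)
    sig = np.asarray(pred_vols, dtype=float)
    cp_pred = 0.5 * float(np.sum(w * T**2 * sig**2))   # eq. (4.2)
    return cp_pred - cp_th, cp_pred                    # eq. (4.3)

def position(signal):
    """+1: buy the duration hedged long convexity portfolio, -1: sell it, 0: no trade."""
    if signal > 0:
        return 1
    if signal < 0:
        return -1
    return 0

# Hypothetical inputs for a 10/20/30 portfolio.
S, cp_pred = trading_signal([0.9, -1.8, 1.0], [10, 20, 30],
                            [0.010, 0.009, 0.008], cp_th=0.003)
print(S, cp_pred, position(S))
```

Keeping the signal and the trade direction as separate functions makes it easy to add, for instance, a minimum threshold on $|S|$ before trading; such a threshold is not part of the strategy described here.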


Chapter 5

Strategy Implementation

In this section we discuss the steps that have been implemented to back-test our strategy on the daily historical US Zero Coupon yields from 1987-2014¹. The daily yields and the forward rate information used in this strategy are an outcome of the seminal paper by Gurkaynak et al. (2006) [10]. We direct interested readers to their paper [10], which describes in detail the estimation methodology adopted by the authors to derive the zero coupon yields and the forward rates across maturities. The funding cost $r_t$ is another important ingredient of our strategy and we use the daily historical US Federal Funds Effective Rate² as a proxy for the funding cost. We implement the strategy on three typical bond portfolios:

1. a portfolio made up of 10 year, 20 year and 30 year zero coupon bonds. (Portfolio 10/20/30)

2. a portfolio made up of 5 year, 10 year and 30 year zero coupon bonds. (Portfolio 5/10/30)

3. a portfolio made up of 5 year, 15 year and 30 year zero coupon bonds. (Portfolio 5/15/30)

These stylized portfolios have the key advantage that they deliver a 'yield give-up' and 'convexity pick-up' or a 'yield pick-up' and 'convexity give-up' depending on the trading signal $S$. For example, if the trading signal $S$ is positive, the trader can enter into a duration hedged long convexity portfolio by being long the 10 and 30 year points of the yield curve and short the 20 year point, say. Such a portfolio will have a higher convexity ('convexity pick-up') but a lower yield than the 20 year ('yield give-up').

¹ The daily data on yields and forward rates were downloaded from the US Federal Reserve Board website.
² The daily data on the Federal Funds Effective Rate were downloaded from US Federal Reserve Board Selected Interest Rates (Daily) - H.15.


While back-testing the strategy, we traded each portfolio in intervals of 5 business days. In other words, we would create the duration-neutral portfolio at time $t$ based on the trading signal $S$ at time $t$, hold the portfolio over 5 business days, unwind our position at $t + 5$ and calculate our portfolio P&L. At $t + 5$, we would repeat the same process. In Chapter 5.1 we discuss the methodology employed to estimate the duration-neutral portfolio weights and in Chapter 5.2 we discuss the volatility estimation procedure employed to obtain the predicted yield volatilities $\hat{\sigma}_i$ over the trading interval $dt$.

5.1 Estimation of Weights

In Chapter 3.1, we presented a methodology to estimate the optimal weights that render the portfolio duration neutral. This technique of estimating the weights assumes a factor model for the yields and is indeed model dependent, impairing the otherwise model agnostic representation of the theoretical portfolio convexity in equation (3.6),

$$\mathrm{CP}_{th} = \frac{1}{2}\,\mathrm{tr}\left[S'DS\right] = -\sum_{i=1}^{n} \hat{w}_i \left(f_t^{T_i} - r_t\right)$$

In this section, we present a slightly different approach to estimate the optimal portfolio weights which does not rely on a specific model for the yields. In this sense, this method of estimating the weights is model agnostic.

Note that a duration neutral portfolio is immunized only against parallel shifts in the yield curve, but not against non-parallel shifts. Thus, duration neutrality can expose the portfolio to a considerable amount of risk from non-parallel shifts in the yield curve [8, 13, 1]. One of the techniques widely used to guard against non-parallel shifts in the yield curve is to employ portfolio theory and determine those portfolio weights as optimal which minimize the portfolio variance. Originally, Ederington (1979) [6] and Johnson (1960) [14] had used this technique to derive the minimum variance hedge ratio (HR) as the average relationship between the changes in the cash price and the changes in the futures price which minimizes the net price change risk, where net price change risk is the variance of the price changes of the hedged position [3]. In our setup, we adopt their approach and estimate the portfolio weights as those which minimize the portfolio variance of the bond price changes.

Formally, let there be $N$ zero coupon bonds in the portfolio with maturity $T_i$, $i = 1, 2, \ldots, N$. Let $r_i^t$ denote the return on the bond $i$ from time $t - 1$ to $t$. Further, let $\Sigma$ be the $N \times N$ return covariance matrix. Then the minimum variance portfolio is the bond portfolio with the lowest return variance and is the solution to the minimization problem,

$$\min_{\mathbf{w} = (w_1, w_2, \cdots, w_N)} \mathbf{w}'\Sigma\,\mathbf{w} \quad \text{such that } \mathbf{w} > 0 \tag{5.1}$$

Memmel and Kempf [17] go on to show that the optimal weights obtained from the above minimization are equivalent to the estimated regression coefficients in the following regression model

$$r_N^t = \alpha + \beta_1 r_1^t + \beta_2 r_2^t + \cdots + \beta_{N-1} r_{N-1}^t + \epsilon_t \tag{5.2}$$

where $\epsilon_t$ satisfies the assumptions of a classical linear regression model. The $\beta_i$'s are estimated using the ordinary least squares technique and we have $w_i = \hat{\beta}_i$, $i = 1, 2, \cdots, N - 1$ and $w_N = 1$.

We exploit the regression setting in equation (5.2) to calculate the optimal portfolio weights for our problem through the following steps.

1. Define the return on the bond $i$ from time $t - 1$ to $t$ as

$$r_i^t = -T_i\left[y(t-1,T_i) - y(t,T_i)\right], \quad i = 1, 2, 3 \tag{5.3}$$

Calculate $r_i^t$ for $t = 1, 2, \cdots, 300$, that is for a period of 300 trading days starting from time $t$ and going back. Notice that, up to sign, $T_i\left[y(t-1,T_i) - y(t,T_i)\right]$ is a proxy for the relative price change of a bond; since the same sign convention is applied to every bond, it does not affect the regression coefficients below. Also, recall that the three bond portfolios introduced under the trading strategy will each hold 3 zero coupon bonds, with the 30 year zero coupon bond being common across all three portfolios.

2. Let $r_3^t$ denote the return on the bond with maturity 30 years. Estimate the regression coefficients $\hat{\beta}_i$, $i = 1, 2$ under the model

$$r_3^t = \beta_1 r_1^t + \beta_2 r_2^t + \epsilon_t \tag{5.4}$$

3. The estimated optimal portfolio weights are then given by $\hat{w}_1 = -\hat{\beta}_1$, $\hat{w}_2 = -\hat{\beta}_2$ and $\hat{w}_3 = 1$, where $\hat{w}_3$ is the optimal weight for the 30 year zero coupon bond.

Appendix 1 provides a MATLAB implementation of the estimation procedure discussed above. It is worthwhile to note that the above process of estimating the optimal weights assumes that the yields exhibit a Gaussian behavior. However, when the rates are high it can be argued that the yields exhibit a more Log-Normal behavior. In such scenarios, one may calculate the return on the bond $i$ as

$$r_i^t = -T_i\,\frac{y(t-1,T_i) - y(t,T_i)}{y(t-1,T_i)}, \quad i = 1, 2, 3 \tag{5.5}$$

and follow essentially the same procedure to estimate the optimal weights. We do not pursue this line of thought in this essay and leave the above implementation for future research.


5.2 Estimation of Volatilities

As discussed in Chapter 4, the estimated yield volatilities $\hat{\sigma}_i$ form the other important input to the trading strategy. A straightforward procedure that can be employed to estimate these volatilities is to use a fixed length window, say of 300 trading days, and calculate the standard deviation of the daily yield changes. With a fixed length window, we end up either with a noisy, highly responsive estimator if the window is short, or with a stable but slow to respond estimator if the window is long. However, recall that we are interested in obtaining a best estimate of future volatility. Ex ante we do not have a way to predict the unexpected, but we would like our volatility estimation process to register immediately that future volatilities are likely to be higher in case we have entered a regime of unexpected volatility. A long fixed length window will not be able to provide such an immediate update of volatilities. Further, if we do underestimate the future volatility and remain short Convexity per our trading strategy, we will make a severe loss.

In this study, we have used a GARCH(1, 1) [9] model to estimate the future volatilities. Our decision to use a conditional heteroscedastic model is mainly driven by the observed autocorrelations in squared returns and volatility clustering in financial time series data. Possibilities exist to use other variants of the GARCH model like the EGARCH model of Nelson (1991) [19], but we have embraced simplicity over model richness. In what follows, we provide the general setup of our GARCH(1, 1) model.

For a zero coupon bond with maturity $T_i$, define the return series as

$$x_i^t = y(t,T_i) - y(t-1,T_i), \quad T_i = 5, 10, 15, 20 \text{ and } 30 \tag{5.6}$$

Let the mean equation of the time series $x_i^t$ be given by an ARMA($m$, $n$) process of autoregressive order $m$ and moving average order $n$ as

$$x_i^t = \mu + \sum_{r=1}^{m} a_r\,x_i^{t-r} + \sum_{s=1}^{n} b_s\,\varepsilon_{t-s} + \varepsilon_t \tag{5.7}$$

where $\mu$ is the constant term (the unconditional mean when $m = 0$), the $a_r$'s are the autoregressive coefficients, the $b_s$'s are the moving average coefficients and the innovation terms $\varepsilon_t$ are uncorrelated with zero mean. The variance equation of the GARCH(1, 1) model can be expressed as

$$\varepsilon_t = \sigma_{t,i}\,\eta_t \tag{5.8}$$

$$\sigma_{t,i}^2 = \omega + \alpha\,\varepsilon_{t-1}^2 + \beta\,\sigma_{t-1,i}^2 \tag{5.9}$$

where $\sigma_{t,i}$ is the time $t$ standard deviation of the yield changes with maturity $T_i$, $\eta_t$ is an iid process with zero mean and unit variance, and $\omega$, $\alpha$, $\beta$ are the unknown parameters that need to be estimated. Note that for second order stationarity of a GARCH(1, 1) model, we must have $\omega > 0$, $\alpha > 0$, $\beta > 0$ and $\alpha + \beta < 1$. It is also worthwhile to observe that although the innovation terms $\varepsilon_t$ are serially uncorrelated, their conditional variance equals $\sigma_{t,i}^2$ and hence varies with time. We refer the reader to Francq et al. (2010) [9] for a detailed discussion on the statistical properties of a GARCH(1, 1) model.

Although we have specified a GARCH(1, 1) model for the variance equation, we still need to determine the appropriate autoregressive and moving average orders $m$ and $n$ for the ARMA mean equation before we can proceed with parameter estimation. Our approach to determining suitable values for $m$ and $n$ relies on the plots of the autocorrelation function (ACF) and partial autocorrelation function (PACF) of the series $x_i^t$. Recall that for a moving average process, MA($n$),

$$z_t = \varepsilon_t + \theta_1\varepsilon_{t-1} + \cdots + \theta_n\varepsilon_{t-n}, \quad \varepsilon_t \sim \mathrm{IID}(0, \sigma^2)$$

the ACF of lag $h$ is given by (with $\theta_0 = 1$)

$$\mathrm{Corr}(z_{t+h}, z_t) = \rho(h) = \begin{cases} \dfrac{\sum_{j=0}^{n-h} \theta_j\,\theta_{j+h}}{\sum_{j=0}^{n} \theta_j^2}, & 0 \le h \le n, \\[6pt] 0, & h > n. \end{cases} \tag{5.10}$$

Similarly, for an autoregressive process, AR($m$),

$$z_t = \phi_1 z_{t-1} + \phi_2 z_{t-2} + \cdots + \phi_m z_{t-m} + \varepsilon_t, \quad \varepsilon_t \sim \mathrm{IID}(0, \sigma^2)$$

the PACF of lag $h$ is given by

$$\phi_{h,h} = \mathrm{Corr}(z_{t+h}, z_t \mid z_{t+h-1}, \cdots, z_{t+1}) = 0, \quad \text{if } h > m \tag{5.11}$$

since $\varepsilon_{t+h}$ is uncorrelated with $z_{t+h-1}, \cdots, z_{t+1}$ and $z_t$. Thus, we can use equations (5.10) and (5.11) to determine the appropriate values of $m$ and $n$ from the plots of the ACF and PACF. In particular, if the ACF plot of $x_i^t$ displays a sharp cutoff after lag $h$ then we will consider the mean equation to be an MA($h$) process. If the PACF plot displays a sharp cutoff after lag $h$ then we will consider the mean equation to be an AR($h$) process. In this relatively straightforward rule for determining the appropriate values of $m$ and $n$, we exclude the possibility of incorporating both AR and MA terms in the mean equation. This is done primarily to ensure a simple setup for the volatility estimation procedure, one which is credible yet parsimonious.³

³ We realize that restricting the mean equation to either an AR or MA process could be considered a model weakness. However, our aim here is to study the performance of the trading strategy and not to dive into the depths of methodological riches. We further show that this procedure of estimating the volatility works appreciably well as far as our portfolio P&L is concerned.

The following figures present the ACF and PACF plots of the 10 year zero coupon bond at t = 9th Feb 1987, which is the start date that we use for back-testing our strategy.

Figure 5.1: ACF plot for 10 year bond at t = 9th Feb 1987

Figure 5.2: PACF plot for 10 year bond at t = 9th Feb 1987

The ACF plot shows a sharp cutoff after lag 1 whereas we see no such evidence in the PACF plot. This behavior of the ACF and PACF plots for the 10 year bonds is observed across all the trading days that are available in our data, starting from t = 9th Feb 1987 to t = 3rd Oct 2014. Thus, we use an MA(1) mean equation for the 10 year zero coupon bonds given by

$$x_{10}^t = \mu + b_1\,\varepsilon_{t-1} + \varepsilon_t, \quad \forall t \tag{5.12}$$

where $\mu$ is the unconditional mean, $b_1$ is the moving average coefficient and the innovation terms $\varepsilon_t$ are uncorrelated with zero mean. The variance equation of the 10 year bonds continues to be GARCH(1, 1) as expressed in equations (5.8) and (5.9).

The ACF and PACF plots of the 5 and 15 year yields show a similar behavior, with the ACF plot producing a sharp cutoff after lag 1 and the PACF plot being hardly significant after lag 0. Hence, much like the 10 year yields, we opt for an MA(1) mean equation for the 5 and 15 year zero coupon bonds. The 20 and 30 year zero coupon yields, however, narrate a different story. The ACF and PACF profiles of these two yields start off looking very similar to those of the 10 year yields, indicating an MA(1) mean equation. Around the year 2014, the ACF and the PACF profiles change, indicating that an MA(1) representation will not be an appropriate choice for the mean equation.
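The ACF cutoff rule used above to select an MA mean equation can be illustrated on simulated data. The sketch below is only a demonstration of equation (5.10): the helper names `ma_acf` and `sample_acf`, the coefficient 0.5 and the simulated series are hypothetical and unrelated to the yield data of the thesis.

```python
import numpy as np

def ma_acf(theta, max_lag):
    """Theoretical ACF of an MA(n) process, eq. (5.10), with theta_0 = 1 prepended."""
    th = np.concatenate(([1.0], np.asarray(theta, dtype=float)))
    n = len(th) - 1
    denom = np.sum(th**2)
    return np.array([
        np.sum(th[: n - h + 1] * th[h:]) / denom if h <= n else 0.0
        for h in range(max_lag + 1)
    ])

def sample_acf(x, max_lag):
    """Sample autocorrelations of a series for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.sum(x**2)
    return np.array([np.sum(x[h:] * x[: len(x) - h]) / denom
                     for h in range(max_lag + 1)])

# Simulate an MA(1) series z_t = eps_t + 0.5 * eps_{t-1}.
rng = np.random.default_rng(0)
eps = rng.standard_normal(20001)
z = eps[1:] + 0.5 * eps[:-1]

rho_theory = ma_acf([0.5], max_lag=3)
rho_hat = sample_acf(z, max_lag=3)
band = 1.96 / np.sqrt(len(z))   # approximate 95% band for a zero autocorrelation
print(rho_theory, rho_hat, band)
```

For an MA(1) series the sample autocorrelation at lag 1 should sit well outside the band while higher lags fall inside it, which is the "sharp cutoff after lag 1" pattern described for the 5, 10 and 15 year yields.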

Figure 5.3: ACF plot for 20 year bond at t = 2nd Oct 2014


Figure 5.4: PACF plot for 20 year bond at t = 2nd Oct 2014

Figure 5.5: ACF plot for 30 year bond at t = 12th May 2014


Figure 5.6: PACF plot for 30 year bond at t = 12th May 2014

In figures (5.3), (5.4), (5.5) and (5.6), we see that instead of an MA(1) mean equation, an AR(1) mean equation appears to be more appropriate for the 20 and 30 year yields. This observation is largely driven by the sharp cutoff observed in the PACF plot after lag 1. One could also consider a mixed MA and AR mean equation, as the PACF plot does not really demonstrate a uniform decay after lag 1. With these observations at hand and the fact that the 20 and 30 year yields undergo a change in profile over time, we employ a simple mean equation and assume that the innovations $\varepsilon_t$ from the mean equation are nothing but the returns $x_i^t$ themselves. Thus for the 20 and 30 year bonds,

$$x_i^t = \varepsilon_t, \quad \forall t \text{ and } i = 20, 30 \tag{5.13}$$

where the innovation terms $\varepsilon_t$ are uncorrelated with zero mean. The variance equation of the 20 and 30 year yields continues to be GARCH(1, 1) as expressed in equations (5.8) and (5.9).

For parameter estimation, we follow Wurtz et al. [23] and employ a quasi maximum likelihood (QML) technique to estimate the unknown parameters for each of the five volatility models corresponding to the five different maturities considered in our strategy. The QML technique infers the innovations $\varepsilon_t$ and assumes that, conditional on some initial values of $\varepsilon_0$ and $\sigma_{0,i}^2$, the $\eta_t$'s are distributed as standard Gaussian. Appendix 2 provides an R implementation of the volatility estimation procedure using the package 'fGarch'. The estimation process uses all available daily historical data on the yields, with a minimum of 300 days' worth of data. For the bond with maturity $T_i$, the time $t$ predicted yield volatility $\hat{\sigma}_i^2$ over the trading period $dt = 5$ days is then calculated as the average of the model predicted volatilities at times $t + 1, t + 2, \cdots, t + 5$.
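The averaging of the 5 day-ahead predictions can be sketched with the standard GARCH(1, 1) forecast recursion. This is not the R 'fGarch' code of Appendix 2 and does not estimate any parameters: the function name `garch11_forecast` and all inputs are hypothetical, and averaging the predicted standard deviations (rather than the variances) is an assumption of this sketch.

```python
import numpy as np

def garch11_forecast(omega, alpha, beta, last_eps, last_sigma2, horizon=5):
    """Multi-step GARCH(1, 1) volatility forecasts and their average.

    One step ahead follows eq. (5.9); further steps use the usual recursion
    E[sigma^2_{t+k}] = omega + (alpha + beta) * E[sigma^2_{t+k-1}].
    Returns (array of predicted standard deviations, their mean).
    """
    var_next = omega + alpha * last_eps**2 + beta * last_sigma2
    variances = []
    for _ in range(horizon):
        variances.append(var_next)
        var_next = omega + (alpha + beta) * var_next
    vols = np.sqrt(np.array(variances))
    return vols, float(vols.mean())

# Illustrative parameter values and a hypothetical last innovation and variance.
vols, avg_vol = garch11_forecast(omega=2.8e-9, alpha=0.0452, beta=0.9488,
                                 last_eps=5e-4, last_sigma2=1.6e-7)
print(vols, avg_vol)
```

When $\alpha + \beta < 1$ the forecasts converge geometrically to the long-run level $\sqrt{\omega / (1 - \alpha - \beta)}$, so over a 5 day horizon the average stays close to the one-step forecast.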


5.3 Implementation Steps

Let us consider the portfolio 10/20/30 and let $w_{10}$, $w_{20}$ and $w_{30}$ (with $w_{30} = 1$) be the optimal portfolio weights estimated at time $t$ using the technique described in Chapter 5.1. Denote by $\hat{\sigma}_{10}$, $\hat{\sigma}_{20}$ and $\hat{\sigma}_{30}$ the time $t$ best estimates of yield volatilities over the trading interval $dt$ calculated using the methodology described in Chapter 5.2. In what follows, we list the computational steps required to run the strategy from time $t$ to $t + dt$.

1. Let $f_t^T$ and $r_t$ denote respectively the instantaneous forward rates and the funding rate at time $t$, where $T = 10, 20, 30$. Calculate the theoretical portfolio Convexity, $\mathrm{CP}_{th}$, at time $t$ using equation (3.6),

$$\mathrm{CP}_{th} = -\left[w_{30}\left(f_t^{30} - r_t\right) + w_{20}\left(f_t^{20} - r_t\right) + w_{10}\left(f_t^{10} - r_t\right)\right] \tag{5.14}$$

2. Calculate the predicted portfolio Convexity, $\mathrm{CP}_{pred}$, using equation (4.2),

$$\mathrm{CP}_{pred} = \frac{1}{2}\left[w_{30}\,T_{30}^2\,\hat{\sigma}_{30}^2 + w_{20}\,T_{20}^2\,\hat{\sigma}_{20}^2 + w_{10}\,T_{10}^2\,\hat{\sigma}_{10}^2\right] \tag{5.15}$$

where $T_i = i$, $i = 10, 20, 30$.

3. Calculate the trading signal, $S$, using equation (4.3),

$$S = \mathrm{CP}_{pred} - \mathrm{CP}_{th} \tag{5.16}$$

4. If $S > 0$, we buy a duration hedged long convexity portfolio and if $S < 0$ we sell a duration hedged long convexity portfolio. As far as strategy implementation is concerned, we generate smoothed weights $s_{10}$, $s_{20}$, $s_{30}$ as follows:

If $S > 0$, obtain $s_{10} = w_{10}$, $s_{20} = -w_{20}$, $s_{30} = w_{30} = 1$

If $S < 0$, obtain $s_{10} = -w_{10}$, $s_{20} = w_{20}$, $s_{30} = -w_{30} = -1$

5. Estimate the zero coupon yield and the corresponding bond price at the next time step $t + dt$ and maturity $T_i - dt$, where $T_i = i$, $i = 10, 20, 30$ and $dt = 1/52$. In other words, we calculate $y(t + dt, T_i - dt)$ using the parametric representation of the yield curve provided in Gurkaynak et al. [10]. To calculate the price $p(t + dt, T_i - dt)$, we simply use the standard relationship,

$$p(t + dt, T_i - dt) = \exp\left[-(T_i - dt)\,y(t + dt, T_i - dt)\right]$$

6. At this stage, we have all the ingredients required to compute the bond excess return. Define ExRet($i$) as the excess return on bond $i$ calculated at time $t + dt$ as

$$\mathrm{ExRet}(i) = s_i\left(\frac{p(t + dt, T_i - dt) - p(t, T_i)}{p(t, T_i)} - r_t\,dt\right), \quad i = 10, 20, 30 \tag{5.17}$$

Note that in equation (5.17), the first term is the full return on the bond including Carry and the second term is the funding cost at the risk-free funding rate $r_t$ over the period $dt = 1/52$.

7. Calculate the portfolio P&L at time $t + dt$ as

$$\mathrm{MTM}(t + dt) = \mathrm{ExRet}(10) + \mathrm{ExRet}(20) + \mathrm{ExRet}(30) \tag{5.18}$$

The steps described above complete one iteration of running the strategy on historical data from time t to t+dt on the portfolio 10/20/30. The same procedure is applicable while using the strategy on the portfolios 5/10/30 or 5/15/30 with appropriate values of durations wherever applicable. To generate a cumulative portfolio P&L profile over the entire history of the available data, we repeat the above steps at time t = t + dt, cumulatively adding the MTM’s generated at each time step to produce a running portfolio P&L profile over time.
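One iteration of the steps above can be sketched as two small functions. This is a hedged illustration only: the names `convexity_signal` and `step_pnl` and every number in the usage lines are hypothetical, prices follow the relationship $p = \exp(-T\,y)$ with $t$ taken as 0, and the position weights $s_i$ of step 4 are passed in by the caller rather than derived inside the code.

```python
import numpy as np

def convexity_signal(w, fwd, r_t, vols, maturities):
    """Steps 1-3: return (S, CP_th, CP_pred) from eqs. (5.14)-(5.16)."""
    w, fwd, vols, T = (np.asarray(a, dtype=float) for a in (w, fwd, vols, maturities))
    cp_th = -float(np.sum(w * (fwd - r_t)))
    cp_pred = 0.5 * float(np.sum(w * T**2 * vols**2))
    return cp_pred - cp_th, cp_th, cp_pred

def step_pnl(s, y_now, y_next, r_t, maturities, dt=1.0 / 52):
    """Steps 5-7: portfolio P&L over one trading interval, eqs. (5.17)-(5.18).

    s      : position weights from step 4 (supplied by the caller)
    y_now  : yields y(t, T_i)
    y_next : yields y(t + dt, T_i - dt)
    """
    s, y_now, y_next, T = (np.asarray(a, dtype=float) for a in (s, y_now, y_next, maturities))
    p_now = np.exp(-T * y_now)
    p_next = np.exp(-(T - dt) * y_next)
    ex_ret = s * ((p_next - p_now) / p_now - r_t * dt)
    return float(np.sum(ex_ret))

# Hypothetical inputs for the portfolio 10/20/30.
T = [10, 20, 30]
S, cp_th, cp_pred = convexity_signal([0.9, -1.8, 1.0], [0.035, 0.040, 0.042],
                                     0.03, [0.010, 0.009, 0.008], T)
mtm = step_pnl([0.9, -1.8, 1.0], [0.031, 0.034, 0.036], [0.0312, 0.0339, 0.0358], 0.03, T)
print(S, mtm)
```

A running P&L profile is then the cumulative sum of the `step_pnl` values over successive trading dates, as described in the paragraph above.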


Chapter 6

Results

In this section, we present the results of our strategy implementation in terms of the estimated portfolio weights, the estimated volatilities and the observed portfolio P&L for the three portfolios considered. We remind ourselves that while back-testing our strategy, we have taken t = 9th Feb, 1987 as the date when we first implement the strategy and t = 3rd Oct, 2014 as the last date. Between these two dates, the strategy has been run on 1,381 trading days with the trading interval $dt$ being equal to 5 days.

6.1 Estimated Weights

We begin this section by presenting the estimated weights $\hat{w}_i$, $i = 5, 10, 15, 20$ for the three portfolios considered in the strategy. The weights for the 30 year bonds are always equal to 1 and thus we refrain from presenting them here. Recall that the weights estimation procedure discussed in Chapter 5.1 has been executed 1,381 times, once for each trading day in our data. Figures (6.1), (6.2) and (6.3) show the estimated portfolio weights for the three portfolios used in the strategy.


Figure 6.1: Estimated weights for the 10 (Blue line) and 20 (Orange line) year bonds for the portfolio 10/20/30

Figure 6.2: Estimated weights for the 5 (Blue line) and 10 (Orange line) year bonds for the portfolio 5/10/30


Figure 6.3: Estimated weights for the 5 (Blue line) and 15 (Orange line) year bonds for the portfolio 5/15/30

We see that the estimated weights have magnitudes which are generally intuitive. For example, the 5 year bonds tend to have weights that are relatively higher in magnitude. This is expected since the duration of the 5 year bond is smaller and hence more of the 5 year bonds are necessary to hedge the movement in the 30 year bonds. Also, the profiles of the optimal weights are highly anti-correlated.

6.2 Estimated Volatilities

In this section we present the results of the volatility estimation as discussed in Chapter 5.2. We begin with figures that show the evolution of the estimated yield volatilities over the entire sample. Once again, the volatilities $\hat{\sigma}_i$ have been estimated for each of the five maturities considered in the strategy across the 1,381 trading days in the data.


Figure 6.4: Estimated volatilities $\hat{\sigma}_5$ (Yellow line), $\hat{\sigma}_{10}$ (Orange line) and $\hat{\sigma}_{15}$ (Green line).

We notice three distinct spikes in the volatility estimates of the 5, 10 and 15 year yields. The first spike emerges around the period 1987-1988, which coincides with the stock market crash of 1987¹, followed by a spike around 1994 which could have been a fallout of the Mexican peso crisis². The final spike in estimated volatility appears around the year 2008, concurrent with the global financial crisis of 2008.

¹ The 1987 stock market crash was a major systemic shock and market functioning was severely impaired. Mark Carlson in his 2006 paper [16] discusses the events surrounding the crash.
² On December 20, 1994, the Mexican government devalued the peso. The financial crisis that followed cut the peso's value in half and sparked a severe recession in Mexico [15].


Figure 6.5: Estimated volatilities $\hat{\sigma}_{20}$ (Blue line) and $\hat{\sigma}_{30}$ (Orange line).

For the 20 and 30 year volatility estimates, we observe a similar pattern; however, the fluctuations in the volatility estimates are of a much smaller magnitude. In the following tables, we present the parameter estimates and their significance for the five volatility models. The results are presented for a sample of 10 randomly chosen trading dates only, but the observations hold in general across all the 1,381 trading dates in the data.

5 Year Volatility Model

| Trading Dates | σ̂5 | μ̂ | b̂1 ⋄ | ω̂ | α̂ ⋄ | β̂ ⋄ |
|---|---|---|---|---|---|---|
| 3-Oct-14 | 0.0067 | -1.2×10⁻⁵ • | 0.0615 | 2.8×10⁻⁹ ⋄ | 0.0452 | 0.9488 |
| 18-Jun-13 | 0.0070 | -1.2×10⁻⁵ | 0.0661 | 3.0×10⁻⁹ ⋄ | 0.0451 | 0.9489 |
| 5-Apr-12 | 0.0081 | -1.4×10⁻⁵ • | 0.0705 | 5.7×10⁻⁹ ⋄ | 0.0462 | 0.9411 |
| 15-Feb-11 | 0.0107 | -1.2×10⁻⁵ | 0.0727 | 5.6×10⁻⁹ ⋄ | 0.0449 | 0.9428 |
| 2-Aug-10 | 0.0093 | -1.3×10⁻⁵ | 0.0763 | 5.4×10⁻⁹ ⋄ | 0.0438 | 0.9444 |
| 13-Nov-07 | 0.0101 | -1.2×10⁻⁵ | 0.0887 | 6.4×10⁻⁹ ⋄ | 0.0437 | 0.9408 |
| 7-Sep-00 | 0.0083 | -1.4×10⁻⁵ | 0.1123 | 1.3×10⁻⁸ ⋄ | 0.0498 | 0.9195 |
| 27-May-97 | 0.0089 | -1.4×10⁻⁵ | 0.1275 | 1.5×10⁻⁸ ⋄ | 0.0504 | 0.9160 |
| 3-Oct-91 | 0.0079 | -9.9×10⁻⁶ | 0.1573 | 2.6×10⁻⁸ ⋄ | 0.0879 | 0.8540 |
| 17-Jul-87 | 0.0094 | -3.0×10⁻⁵ | 0.1717 | 1.4×10⁻⁸ | 0.1261 | 0.8566 |

Table 6.1: Estimated parameters for the 5 year volatility model. ⋄: significant at 1%, ⋆: significant at 5%, •: significant at 10%.


10 Year Volatility Model

| Trading Dates | σ̂10 | μ̂ | b̂1 ⋄ | ω̂ | α̂ | β̂ ⋄ |
|---|---|---|---|---|---|---|
| 3-Oct-14 | 0.0074 | -1.4×10⁻⁵ ⋆ | 0.0467 | 3.3×10⁻⁹ ⋄ | 0.0384 ⋄ | 0.9538 |
| 18-Jun-13 | 0.0089 | -1.5×10⁻⁵ ⋆ | 0.0500 | 3.6×10⁻⁹ ⋄ | 0.0391 ⋄ | 0.9524 |
| 5-Apr-12 | 0.0095 | -1.6×10⁻⁵ ⋆ | 0.0511 | 3.8×10⁻⁹ ⋄ | 0.0397 ⋄ | 0.9516 |
| 15-Feb-11 | 0.0113 | -1.5×10⁻⁵ ⋆ | 0.0512 | 3.3×10⁻⁹ ⋄ | 0.0373 ⋄ | 0.9552 |
| 2-Aug-10 | 0.0096 | -1.6×10⁻⁵ ⋆ | 0.0562 | 3.3×10⁻⁹ ⋄ | 0.0367 ⋄ | 0.9557 |
| 13-Nov-07 | 0.0082 | -1.5×10⁻⁵ • | 0.0675 | 3.5×10⁻⁹ ⋄ | 0.0354 ⋄ | 0.9556 |
| 7-Sep-00 | 0.0078 | -1.9×10⁻⁵ • | 0.0806 | 6.2×10⁻⁹ ⋄ | 0.0376 ⋄ | 0.9471 |
| 27-May-97 | 0.0081 | -2.0×10⁻⁵ | 0.0933 | 6.0×10⁻⁹ ⋄ | 0.0361 ⋄ | 0.9501 |
| 3-Oct-91 | 0.0078 | -2.1×10⁻⁵ | 0.1126 | 8.5×10⁻⁹ ⋆ | 0.0525 ⋄ | 0.9322 |
| 17-Jul-87 | 0.0110 | -2.2×10⁻⁵ | 0.1764 | 1.3×10⁻⁸ | 0.0785 ⋆ | 0.9028 |

Table 6.2: Estimated parameters for the 10 year volatility model. ⋄: significant at 1%, ⋆: significant at 5%, •: significant at 10%.

15 Year Volatility Model

| Trading Dates | σ̂15 | μ̂ | b̂1 ⋄ | ω̂ | α̂ ⋄ | β̂ ⋄ |
|---|---|---|---|---|---|---|
| 3-Oct-14 | 0.0075 | -1.3×10⁻⁵ • | 0.0444 | 3.0×10⁻⁹ ⋄ | 0.0375 | 0.9547 |
| 18-Jun-13 | 0.0088 | -1.3×10⁻⁵ • | 0.0473 | 3.3×10⁻⁹ ⋄ | 0.0387 | 0.9529 |
| 5-Apr-12 | 0.0092 | -1.5×10⁻⁵ • | 0.0465 | 3.4×10⁻⁹ ⋄ | 0.0395 | 0.9522 |
| 15-Feb-11 | 0.0108 | -1.4×10⁻⁵ • | 0.0462 | 2.9×10⁻⁹ ⋄ | 0.0363 | 0.9564 |
| 2-Aug-10 | 0.0099 | -1.5×10⁻⁵ • | 0.0533 | 2.9×10⁻⁹ ⋄ | 0.0359 | 0.9567 |
| 13-Nov-07 | 0.0077 | -1.5×10⁻⁵ • | 0.0643 | 2.9×10⁻⁹ ⋄ | 0.0338 | 0.9582 |
| 7-Sep-00 | 0.0072 | -1.9×10⁻⁵ • | 0.0746 | 4.0×10⁻⁹ ⋄ | 0.0363 | 0.9535 |
| 27-May-97 | 0.0081 | -2.0×10⁻⁵ • | 0.0922 | 3.7×10⁻⁹ ⋄ | 0.0351 | 0.9562 |
| 3-Oct-91 | 0.0073 | -2.2×10⁻⁵ | 0.1363 | 4.2×10⁻⁹ ⋆ | 0.0406 | 0.9522 |
| 17-Jul-87 | 0.0118 | -2.1×10⁻⁵ | 0.2473 | 1.1×10⁻⁸ | 0.0682 | 0.9158 |

Table 6.3: Estimated parameters for the 15 year volatility model. ⋄: significant at 1%, ⋆: significant at 5%, •: significant at 10%.

We note that the estimated moving-average parameter b̂_1 from the MA(1) mean equation and the estimated parameters α̂ and β̂ from the GARCH(1, 1) model are all significant, not only for these 10 trading dates but for the entire sample³. Further, α̂ + β̂ < 1, indicating second order stationarity of the fitted GARCH(1, 1) process [9].

³ We have also noticed that the GARCH(1, 1) residuals and their squares do not exhibit autocorrelation, although the residuals do not appear to be normally distributed.
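The stationarity condition can be checked directly from the tabulated estimates. The following is a minimal sketch in Python (not the language of the appendices) that computes the persistence α̂ + β̂ and the unconditional volatility implied by a GARCH(1, 1) fit. The function name is hypothetical, and the annualisation factor of 250 is an assumption borrowed from the R script in Appendix 2.

```python
import math

def garch11_unconditional_vol(omega, alpha, beta, periods_per_year=250):
    """Return (persistence, annualised unconditional volatility) of a GARCH(1,1).

    The unconditional daily variance omega / (1 - alpha - beta) is finite
    only when alpha + beta < 1 (second order stationarity).
    """
    persistence = alpha + beta
    if persistence >= 1.0:
        raise ValueError("alpha + beta >= 1: no finite unconditional variance")
    daily_var = omega / (1.0 - persistence)
    return persistence, math.sqrt(periods_per_year * daily_var)

# 5 year model on trading date 3-Oct-14 (values taken from Table 6.1)
persistence, vol = garch11_unconditional_vol(2.8e-9, 0.0452, 0.9488)
print(round(persistence, 4), round(vol, 4))
```

A persistence close to, but below, 1 means that volatility shocks decay slowly, so the predicted volatility on a given date can differ noticeably from this long-run value.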


20 Year Volatility Model

Trading Dates   σ̂_20    ω̂            α̂⋄      β̂⋄
3-Oct-14        0.0074   2.5*10^-9⋄   0.0364   0.9561
18-Jun-13       0.0084   2.7*10^-9⋄   0.0376   0.9546
5-Apr-12        0.0088   2.7*10^-9⋄   0.0382   0.9542
15-Feb-11       0.0101   2.4*10^-9⋄   0.0350   0.9583
2-Aug-10        0.0100   2.4*10^-9⋄   0.0348   0.9583
13-Nov-07       0.0076   2.3*10^-9⋄   0.0321   0.9606
7-Sep-00        0.0067   3.1*10^-9⋄   0.0343   0.9566
27-May-97       0.0077   2.8*10^-9⋄   0.0327   0.9598
3-Oct-91        0.0067   3.6*10^-9⋆   0.0424   0.9508
17-Jul-87       0.0113   1.3*10^-8•   0.0671   0.9152

Table 6.5: Estimated parameters for the 20 year volatility model. ⋄: significant at 1%, ⋆: significant at 5%, •: significant at 10%.

30 Year Volatility Model

Trading Dates   σ̂_30    ω̂            α̂⋄      β̂⋄
3-Oct-14        0.0077   3.7*10^-9⋄   0.0485   0.9421
18-Jun-13       0.0086   4.1*10^-9⋄   0.0500   0.9400
5-Apr-12        0.0093   4.1*10^-9⋄   0.0507   0.9397
15-Feb-11       0.0087   3.8*10^-9⋄   0.0479   0.9428
2-Aug-10        0.0097   3.8*10^-9⋄   0.0478   0.9430
13-Nov-07       0.0090   3.8*10^-9⋄   0.0468   0.9433
7-Sep-00        0.0067   3.9*10^-9⋄   0.0463   0.9453
27-May-97       0.0063   4.4*10^-9⋄   0.0463   0.9457
3-Oct-91        0.0097   5.6*10^-9⋆   0.0424   0.9489
17-Jul-87       0.0119   1.1*10^-8•   0.0600   0.9273

Table 6.6: Estimated parameters for the 30 year volatility model. ⋄: significant at 1%, ⋆: significant at 5%, •: significant at 10%.

For the 20 and 30 year volatility models, we have similar observations. The estimated parameters α̂ and β̂ from the GARCH(1, 1) model are all significant, both for these 10 trading dates and for the entire sample⁴. Also, α̂ + β̂ < 1, indicating second order stationarity of the fitted GARCH(1, 1) process [9].

⁴ In this case too, we have noticed that the GARCH(1, 1) residuals and their squares do not exhibit autocorrelation, although the residuals do not appear to be normally distributed.
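To illustrate how a fitted GARCH(1, 1) model without a mean equation turns into a predicted volatility, the sketch below (in Python rather than the R of Appendix 2) averages the 1 to 5 step-ahead conditional standard deviations and annualises them with √250, mirroring what the R script appears to do with `predict(..., 5)`. The function name and the starting variance `next_var` are hypothetical inputs introduced for illustration only.

```python
import math

def garch11_avg_forecast_vol(omega, alpha, beta, next_var,
                             horizon=5, periods_per_year=250):
    """Annualised mean of the 1..horizon step-ahead conditional std devs.

    next_var is the one-step-ahead conditional variance (assumed known).
    For k >= 1: var_k = long_run + (alpha + beta)^(k-1) * (next_var - long_run).
    """
    persistence = alpha + beta
    long_run_var = omega / (1.0 - persistence)
    sds = []
    for k in range(1, horizon + 1):
        var_k = long_run_var + persistence ** (k - 1) * (next_var - long_run_var)
        sds.append(math.sqrt(var_k))
    return math.sqrt(periods_per_year) * sum(sds) / horizon

# 20 year model on 3-Oct-14 (Table 6.5); the starting variances are hypothetical
omega, alpha, beta = 2.5e-9, 0.0364, 0.9561
long_run = omega / (1.0 - (alpha + beta))
flat = garch11_avg_forecast_vol(omega, alpha, beta, next_var=long_run)
high = garch11_avg_forecast_vol(omega, alpha, beta, next_var=4 * long_run)
print(flat, high)
```

Starting from the long-run variance the forecast stays flat, whereas starting from an elevated variance it decays only slowly because α̂ + β̂ is close to 1.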


6.3 Portfolio Profit & Loss

In this section we present the observed P&L’s when we apply the strategy on the three portfolios: 10/20/30, 5/10/30 and 5/15/30.

Figure 6.6: P&L for the portfolios

That the strategy is profitable on the three portfolios is evident from figure (6.6), although the 10/20/30 portfolio appears to lose money after 2008. This is supported by the positive Sharpe Ratios⁵ observed for the three portfolios over the entire sample, and by the negative ratio for the 10/20/30 portfolio post 2008⁶.

Sharpe Ratio

Portfolio   Full Sample   Post 2008
10/20/30    0.31          -0.07
5/10/30     0.40          1.05
5/15/30     0.68          0.94

Table 6.7: Sharpe Ratios for the three portfolios.

⁵ Sharpe Ratio (SR), developed by William F. Sharpe, is a measure of risk adjusted return. In our setup, SR = √52 * (ratio of average daily portfolio P&L to the standard deviation of the daily portfolio P&L).

⁶ The negative Sharpe Ratio observed for the 10/20/30 portfolio post 2008 does not necessarily imply that the strategy fails to work after 2008, because the computed values of the Sharpe Ratio depend on the chosen dates and the period 2008-2014 might not be long enough to infer whether the strategy was profitable post 2008.

We also note that all three portfolios exhibit three distinct P&L regimes. During the first period, the strategy is moderately profitable, followed by a period in the middle when the strategy remains almost neutral. In the third period, the strategy is clearly profitable. We recall that if the portfolio weights genuinely immunize against duration shocks and if the volatility estimation procedure provides correct guesses of the future volatility, then our strategy should make money every time. The P&L actually observed in figure (6.6), however, motivates us to search for reasons why the strategy fails to make money during the middle period. We explore this question in detail in Chapter 7.
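The Sharpe Ratio reported in Table 6.7 and defined in footnote 5 can be sketched as follows. This is an illustrative Python snippet, not the code used for the thesis. The function name is hypothetical and the use of the sample (n − 1) standard deviation is an assumption, since the text does not say which estimator was used.

```python
import math
import statistics

def sharpe_ratio(pnl, scale=52):
    """sqrt(scale) * mean(P&L) / std(P&L), following footnote 5.

    Uses the sample standard deviation (an assumption).
    """
    mean = statistics.fmean(pnl)
    sd = statistics.stdev(pnl)
    return math.sqrt(scale) * mean / sd

# Hypothetical P&L series, for illustration only
example_pnl = [1.0, -1.0, 2.0, 0.0]
example_sr = sharpe_ratio(example_pnl)
print(example_sr)
```

The post 2008 figures in Table 6.7 would correspond to calling the same function on the sub-series of P&L observations after 2008.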


Chapter 7

Conclusion

We recognize that our strategy relies heavily on two moving parts: the optimal portfolio weights ŵ_i and the best estimates of future yield volatilities σ̂_i. This implies that there could be three potential reasons why the strategy might fail to make money:

• In reality the yield curve is fairly curved and the signal strength |S| is small indicating that there is in fact no money to be made.

• The optimal portfolio weights ŵ_i do not really render the portfolio duration-immunized. Thus, there might be residual exposure to the level, slope or curvature of the yields that impacts the portfolio P&L.

• Finally, there is the potential danger of producing estimates of future volatility that are far away from the reality.

As far as the first reason is concerned, it is important to note that if the absence of profit on any particular trading day is really due to the fact that the yield curve is fairly curved, and hence |S| is small, then we would expect a strong correlation between the daily P&L’s and the strength of the signal |S|. Whenever the signal strength is small, we would know that there is no money to be made, whereas a moderate to strong signal should imply a bigger P&L. To test this hypothesis, we calculated the average P&L for each decile of the absolute signal strength |S|, expecting that a strong relationship between signal strength and P&L would produce an average P&L that increases monotonically with the deciles of |S|. Figures (7.1), (7.2) and (7.3) plot the average P&L against the deciles of the absolute signal strength |S|.
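The decile test described above can be sketched as follows. This is a minimal Python illustration under stated assumptions: observations are ranked by |S| and split into ten equally sized buckets, and the function names and sample data are hypothetical. The thesis does not specify how ties or uneven bucket sizes were handled.

```python
def average_pnl_by_decile(signal, pnl, n_bins=10):
    """Average P&L within each bucket of |S|, from weakest to strongest signal."""
    pairs = sorted(zip((abs(s) for s in signal), pnl), key=lambda sp: sp[0])
    n = len(pairs)
    averages = []
    for b in range(n_bins):
        lo = b * n // n_bins
        hi = (b + 1) * n // n_bins
        bucket = [p for _, p in pairs[lo:hi]]
        averages.append(sum(bucket) / len(bucket) if bucket else float("nan"))
    return averages

def is_monotone_increasing(values):
    """True when each bucket average is at least as large as the previous one."""
    return all(a <= b for a, b in zip(values, values[1:]))

# Hypothetical data in which P&L grows exactly with |S|
demo_signal = [s if s % 2 else -s for s in range(1, 21)]
demo_pnl = [float(abs(s)) for s in demo_signal]
demo_averages = average_pnl_by_decile(demo_signal, demo_pnl)
print(demo_averages, is_monotone_increasing(demo_averages))
```

On real data the monotonicity check is expected to fail in places; the interest lies in where along the deciles the average P&L departs from an increasing pattern.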


Figure 7.1: Signal Strength vs. Average P&L for 10/20/30

Figure 7.2: Signal Strength vs. Average P&L for 5/15/30


Figure 7.3: Signal Strength vs. Average P&L for 5/10/30

The figures for the three portfolios reveal a complex picture. The relationship between average P&L and absolute signal strength is certainly not monotonically increasing, although for two of the last three deciles the average P&L is relatively large across all three portfolios. It is also clear that, with the exception of the 5/15/30 portfolio, a weaker signal strength does mean a smaller P&L. Thus, signal strength appears to have some bearing on the realized P&L, although there might be other factors contributing to it.

The observed lack of a strong positive relationship between signal strength and P&L could be due to the inability of the portfolio weights ŵ_i to deliver duration immunization. In other words, there might be residual exposure to the level, slope or curvature of the yields, and if this were true we would expect to see significant coefficients in the regression of P&L against the level, slope and curvature of the three yields for the portfolio in question. Thus, we proceed to test the second reason why the strategy fails to make money every time.
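A regression of this kind can be sketched in a few lines of Python. The factor definitions below (level as the average of the three yields, slope as long minus short, curvature as twice the middle yield minus the two wings) are common proxies and are an assumption here, since this chapter does not restate how the factors were built. The function names are hypothetical, and p-values are omitted because they require a t-distribution routine.

```python
def curve_factors(y_short, y_mid, y_long):
    """Assumed proxies for the level, slope and curvature of three yields."""
    level = (y_short + y_mid + y_long) / 3.0
    slope = y_long - y_short
    curvature = 2.0 * y_mid - y_short - y_long
    return level, slope, curvature

def ols_simple(x, y):
    """Ordinary least squares of y on a constant and x; returns (alpha, beta, R^2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    ss_res = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return alpha, beta, 1.0 - ss_res / ss_tot

# Hypothetical check: y is an exact linear function of x
fit_alpha, fit_beta, fit_r2 = ols_simple([0.0, 1.0, 2.0, 3.0], [2.0, 5.0, 8.0, 11.0])
print(fit_alpha, fit_beta, fit_r2)
```

For a well immunized portfolio one would hope to see a β close to zero and an R² close to zero when the daily P&L is regressed on each factor in turn.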

Yield Curve Level vs. P&L

Portfolio   R²        α        p Value   β         p Value
10/20/30    -0.0007   0.0013   0.5127    -0.0037   0.8462
5/10/30     -0.0003   0.0001   0.9673    0.0144    0.4658
5/15/30     0.0007    0.0003   0.8834    0.0274    0.1603

Table 7.1: Regression of P&L against the Level of the Yields.

In table (7.1), we see that the R²’s for the regression of the daily P&L on the yield curve level are extremely low, with the coefficients being insignificant. Thus the three portfolios appear to be sufficiently immunized against yield curve levels.

Yield Curve Slope vs. P&L

Portfolio   R²        α         p Value   β         p Value
10/20/30    0.0089    0.0075    0.0001    -0.7102   0.0003
5/10/30     0.0000    -0.0013   0.6438    0.1386    0.3194
5/15/30     -0.0007   0.0020    0.4819    0.0160    0.8993

Table 7.2: Regression of P&L against the Slope of the Yields.

In table (7.2), the R²’s for the regression of the daily P&L on the yield curve slope are once again extremely low. The coefficients are insignificant except for the 10/20/30 portfolio, whose R² is nevertheless below 1%. Thus the three portfolios appear to be sufficiently immunized against yield curve slope too.

Yield Curve Curvature vs. P&L

Portfolio   R²        α         p Value   β         p Value
10/20/30    0.0173    -0.0106   0.0000    2.9675    0.0000
5/10/30     -0.0003   0.0010    0.2928    -0.3077   0.4606
5/15/30     0.0186    -0.0026   0.0236    1.8631    0.0000

Table 7.3: Regression of P&L against the Curvature of the Yields.

As far as the regressions of the daily P&L on the yield curve curvature are concerned, we find that the R²’s for the portfolios 10/20/30 and 5/15/30 are marginally higher, with significant coefficients, but not in a range that warrants caution, primarily because in this case the exposure, if any, is to the yield curve curvature, which is the least erratic of the three. Thus, in general, the optimal portfolio weights appear to have immunized the respective portfolios appreciably well against yield curve level, slope and curvature movements.

The observed lack of a strong positive relationship between signal strength and P&L, therefore, points to weaknesses in our volatility estimation, the last of the plausible reasons why the strategy failed to make money every time. To test whether the estimates of future volatility are really far from reality, we regress the convexity contributed P&L against the difference between the realized portfolio convexity CP_real and the predicted portfolio convexity CP_pred. An estimated value of R² significantly smaller than 1 would then indicate the degree of deviation of the convexity contributed P&L from the signal strength, and hence give a sense of how far the volatility estimates have been from the truth. In particular, for the 5/15/30 portfolio, the convexity contributed P&L is calculated as

\[ CP\&L = P\&L - \sum_i s_i T_i \bigl( y(t, T_i) - y(t-1, T_i) \bigr), \quad i = 5, 15, 30 \tag{7.1} \]

where the s_i’s are the smoothed optimal weights. The realized portfolio convexity is calculated as

\[ CP_{real} = \frac{1}{2} \sum_i s_i T_i^2 \bigl( y(t, T_i) - y(t-1, T_i) \bigr)^2, \quad i = 5, 15, 30 \tag{7.2} \]
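Equations (7.1) and (7.2) translate directly into code. The sketch below is a Python illustration that follows the two formulas term by term; the function names, weights and yield values are hypothetical and are not taken from the thesis data.

```python
def convexity_pnl(pnl, weights, maturities, y_now, y_prev):
    """Equation (7.1): P&L minus sum_i s_i * T_i * (y(t,T_i) - y(t-1,T_i))."""
    duration_term = sum(
        s * T * (yn - yp)
        for s, T, yn, yp in zip(weights, maturities, y_now, y_prev)
    )
    return pnl - duration_term

def realized_convexity(weights, maturities, y_now, y_prev):
    """Equation (7.2): 0.5 * sum_i s_i * T_i^2 * (y(t,T_i) - y(t-1,T_i))^2."""
    return 0.5 * sum(
        s * T ** 2 * (yn - yp) ** 2
        for s, T, yn, yp in zip(weights, maturities, y_now, y_prev)
    )

# Hypothetical one-day example for a 5/15/30 portfolio
s = [1.0, -2.0, 1.0]            # smoothed weights s_i (made up)
T = [5.0, 15.0, 30.0]           # maturities T_i in years
y_prev = [0.0200, 0.0300, 0.0350]
y_now = [0.0210, 0.0305, 0.0352]
cpnl = convexity_pnl(0.001, s, T, y_now, y_prev)
cp_real = realized_convexity(s, T, y_now, y_prev)
print(cpnl, cp_real)
```

Applying both functions day by day yields the two series that enter the regression of CP&L on CP_real − CP_pred.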

For the 5/15/30 portfolio, the regression of CP&L on CP_real − CP_pred produces an R² of 0.45, indicating that in general there is a positive relationship between the convexity contributed P&L and the difference between realized and predicted convexities. However, this relationship appears to be distorted whenever there are large swings in the convexity contributed P&L. Thus the volatility estimation model chosen does not provide us with a perfect predictive tool, and when moves of unexpected sizes occur, the strategy might fail to generate a profit even though it picks up a strong signal. Therefore, in reality, depending on the size of the unexpected move and whether the portfolio was long or short convexity before the move, the strategy could either fail or work exceedingly well.

In this essay, we have devoted our efforts to developing a novel model agnostic representation of portfolio convexity based on the cross-sectional information of “Carry” and “Roll-Down” of a suitably immunized portfolio. We went further and introduced a criterion to determine if the market yield curve is fairly curved based on our model agnostic representation. We tested this criterion as a trading strategy on historical US Treasury yield curve data from 1987-2014, noting that the strategy was overall profitable, with limitations as far as yield volatility estimation was concerned. We also demonstrated that the profitability of the trading strategy is not due to uncontrolled residual exposure to level, slope or curvature of the yields but is purely due to the ability of the strategy to tap into the mis-pricings in convexity. In particular, we noted that when moves of unanticipated size occur, the GARCH volatility estimates may not correctly estimate the future volatility, leading to situations in which the portfolio incurs a loss or an unexpected gain.

We envision further research being directed at exploiting more sophisticated volatility estimation procedures, such as EGARCH or APARCH, to improve the quality of the estimates of future yield volatility. The possibility of introducing a threshold on the minimum absolute signal strength |S|, above which the trade is executed, could be explored to manage the sensitivity of the P&L profile to the signal strength. As far as the estimation of optimal portfolio weights is concerned, an area of further investigation would be to assess the impact of weights that have been estimated assuming that the yields are Log Normal rather than Gaussian.
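The suggested threshold on |S| could be prototyped as below. This is a hypothetical Python sketch of one possible rule (trade only when |S| exceeds the threshold, otherwise record zero P&L); the function name, the data and the threshold value are illustrative assumptions, and no such filter was tested in this thesis.

```python
def filter_by_signal(signal, pnl, threshold):
    """Zero out the P&L on days where |S| does not exceed the threshold.

    Returns the filtered P&L series and the fraction of days traded.
    """
    traded = [abs(s) > threshold for s in signal]
    filtered = [p if t else 0.0 for p, t in zip(pnl, traded)]
    return filtered, sum(traded) / len(traded)

# Hypothetical signals and P&L, for illustration only
filtered_pnl, traded_fraction = filter_by_signal(
    [0.1, -0.5, 0.05, 0.8], [1.0, -2.0, 3.0, 4.0], threshold=0.2
)
print(filtered_pnl, traded_fraction)
```

Sweeping the threshold and recomputing the Sharpe Ratio of the filtered series would be one way to study the sensitivity of the P&L profile to the signal strength.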


Chapter 8

Appendix 1 - Estimation of Portfolio Weights: MATLAB Code

%%------
%% Estimation of Optimal Portfolio Weights.
%%------
clear all; clc;

%%------
%% Define Options
%%------
project.weights = 'LogN'; % 'LogN' or 'Norm'

%% Read the daily yield data
[~,~,fulldata] = xlsread('1985to2014.xlsx', 1);
fulldata = cell2dataset(fulldata);
fullyields = fulldata;
fullyields.Index = [];
fullyields.Date = [];
DateString = fulldata.Date;
formatIn = 'yyyy-mm-dd';
temp1 = datenum(DateString,formatIn);
temp2 = datevec(temp1);
fulldata.Year = temp2(:,1);
fulldata.Month = temp2(:,2);
fulldata.Day = temp2(:,3);
maturity = [10 20 30]; % alternatively 5,15,30 or 5,10,30

% Columns: index, then the three yields converted from percent to decimals
yields = [fulldata.Index fulldata.Year10./100 fulldata.Year20./100 ...
          fulldata.Year30./100];
clear('temp1','temp2');

%%------
%% Create MTM & Weights using MVP
%%------
weights = zeros(length(yields)-300,3);
MTM = zeros(length(yields)-1, 3);
if strcmp(project.weights,'Norm')
    % Gaussian yields: duration P&L uses absolute yield changes
    for i = 1:length(weights)
        MTM = [-maturity(1)*diff(yields(i+1:300+i,2)) ...
               -maturity(2)*diff(yields(i+1:300+i,3)) ...
               -maturity(3)*diff(yields(i+1:300+i,4))];
        % Regress the third bond on the first two over a 300-day window
        mdl = fitlm([MTM(:,1) MTM(:,2)],MTM(:,3),'Intercept',false);
        weights(i,:) = [-mdl.Coefficients.Estimate' 1];
    end
elseif strcmp(project.weights,'LogN')
    % Log Normal yields: duration P&L uses relative yield changes
    for i = 1:length(weights)
        MTM = [-maturity(1)*diff(yields(i+1:300+i,2))./yields(i+2:300+i,2) ...
               -maturity(2)*diff(yields(i+1:300+i,3))./yields(i+2:300+i,3) ...
               -maturity(3)*diff(yields(i+1:300+i,4))./yields(i+2:300+i,4)];
        mdl = fitlm([MTM(:,1) MTM(:,2)],MTM(:,3),'Intercept',false);
        weights(i,:) = [-mdl.Coefficients.Estimate' 1];
    end
end
%%------
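For readers without MATLAB, the core of one window of the 'Norm' branch above can be sketched in plain Python. This is an illustrative re-expression under the assumption that the weights come from a no-intercept least squares regression of the third bond's mark-to-market on the first two; the function names and the sample numbers are hypothetical.

```python
def mark_to_market(yields, maturity):
    """Duration approximation: -maturity * (y[k+1] - y[k]) for consecutive rows."""
    return [-maturity * (b - a) for a, b in zip(yields, yields[1:])]

def hedge_weights(mtm1, mtm2, mtm3):
    """No-intercept least squares of mtm3 on (mtm1, mtm2).

    Returns [-b1, -b2, 1], the same layout as a row of `weights` above.
    """
    a11 = sum(x * x for x in mtm1)
    a12 = sum(x * y for x, y in zip(mtm1, mtm2))
    a22 = sum(y * y for y in mtm2)
    c1 = sum(x * z for x, z in zip(mtm1, mtm3))
    c2 = sum(y * z for y, z in zip(mtm2, mtm3))
    det = a11 * a22 - a12 * a12      # determinant of the 2x2 normal equations
    b1 = (c1 * a22 - c2 * a12) / det
    b2 = (a11 * c2 - a12 * c1) / det
    return [-b1, -b2, 1.0]

# Hypothetical series in which mtm3 = 0.5 * mtm1 + 2 * mtm2 exactly
m1 = [1.0, 2.0, 3.0, -1.0]
m2 = [0.0, 1.0, -1.0, 2.0]
m3 = [0.5, 3.0, -0.5, 3.5]
w = hedge_weights(m1, m2, m3)
print(w)
```

With these weights the combined position w[0]*mtm1 + w[1]*mtm2 + w[2]*mtm3 is the regression residual, which is what makes the portfolio approximately insensitive to the yield moves captured by the first two bonds.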


Chapter 9

Appendix 2 - Estimation of Volatilities: R Code

This script provides the estimation procedure for obtaining the predicted yield volatilities of the 10, 20 and 30 year bonds. The script can be easily modified to do the same for the 5 and 15 year bonds.

##------
## The required packages
require("fUnitRoots")
require(graphics)
require(fGarch)
require(tseries)

## Read the Data
fulldata <- read.csv("~/Data/1985to2014.csv",quote="'",as.is = TRUE)
yields <- cbind(fulldata$Year10/100,fulldata$Year20/100,fulldata$Year30/100)
temp = nrow(yields)

## Initialize some space
param10 = matrix(0,nrow=nrow(yields)-300,ncol=5)
param20 = matrix(0,nrow=nrow(yields)-300,ncol=3)
param30 = matrix(0,nrow=nrow(yields)-300,ncol=3)
predVols = matrix(0,nrow=nrow(yields)-300,ncol=3)
se10 = param10
se20 = param20
se30 = param30
tval10 = se10
tval20 = se20
tval30 = se30
pval10 = se10
pval20 = se20
pval30 = se30

## Begin the loop for estimating the 3 models across all data
for (i in 1:(nrow(yields)-300)){
  temp10 = yields[(i+1):temp,1]
  temp20 = yields[(i+1):temp,2]
  temp30 = yields[(i+1):temp,3]

  ## Reverse the order of the observations
  yields10 = rev(temp10)
  yields20 = rev(temp20)
  yields30 = rev(temp30)

  ## Create Time Series object
  yields10 <- ts(yields10,start=1,end=length(yields10),frequency=1)
  diff10 <- diff(yields10,differences=1)
  yields20 <- ts(yields20,start=1,end=length(yields20),frequency=1)
  diff20 <- diff(yields20,differences=1)
  yields30 <- ts(yields30,start=1,end=length(yields30),frequency=1)
  diff30 <- diff(yields30,differences=1)

  ## Fit MA(1)-GARCH(1,1) for 10 years, GARCH(1,1) without mean for 20 and 30 years
  gar10 = garchFit(~arma(0,1)+garch(1,1),
                   data=diff10,trace=F,algorithm="lbfgsb")
  gar20 = garchFit(~garch(1,1),
                   data=diff20,trace=F,algorithm="lbfgsb",include.mean=F)
  gar30 = garchFit(~garch(1,1),
                   data=diff30,trace=F,include.mean=F,algorithm="lbfgsb")

  ## 5 step ahead forecasts
  p10 = predict(gar10,5)
  p20 = predict(gar20,5)
  p30 = predict(gar30,5)

  ## Store estimates, standard errors, t values and p values
  param10[i,] = gar10@fit$coef
  param20[i,] = gar20@fit$coef
  param30[i,] = gar30@fit$coef
  se10[i,] = gar10@fit$se.coef
  se20[i,] = gar20@fit$se.coef
  se30[i,] = gar30@fit$se.coef
  tval10[i,] = gar10@fit$tval
  tval20[i,] = gar20@fit$tval
  tval30[i,] = gar30@fit$tval
  pval10[i,] = gar10@fit$matcoef[,4]
  pval20[i,] = gar20@fit$matcoef[,4]
  pval30[i,] = gar30@fit$matcoef[,4]

  res10 = residuals(gar10,standardize =TRUE)
  res20 = residuals(gar20,standardize =TRUE)
  res30 = residuals(gar30,standardize =TRUE)

  ## Annualized average of the forecast standard deviations
  predVols[i,] = sqrt(250)*c(mean(p10$standardDeviation),
                             mean(p20$standardDeviation),
                             mean(p30$standardDeviation))
  rm(gar10)
  rm(gar20)
  rm(gar30)
  print(i)
}


References

[1] Barrett, W. Brian, Thomas F. Gosnell, Jr., and Andrea J. Heuson. (1995). Yield Curve Shifts and the Selection of Immunization Strategies. Journal of Fixed Income, September 1995.

[2] Brown R H, Schaefer M S (2000). Why Long-Term Forward Rates (Almost) Always Slope Downwards. London Business School working paper.

[3] Daigler, R.T. (1998). Comparing hedge ratio methodologies for fixed-income investments. Working paper.

[4] Diebold and Rudebusch (2013). Yield Curve Modeling and Forecasting: The Dynamic Nelson-Siegel Approach.

[5] Duffie, D. & Kan, R. (1996). A yield-factor model of interest rates. Mathematical Finance, 6(4):379-406.

[6] Ederington, L.H. (1979). The Hedging Performance of the New Futures Markets. The Journal of Finance, (March), Vol. 34 No. 1, pp. 157-170.

[7] Fisher, Mark. (2001). Forces that Shape the Yield Curve: Parts 1 and 2 (March 2001). FRB of Atlanta Working Paper No. 2001-3. Available at SSRN.

[8] Fong, H.G., and O. Vasicek. (1983). The Tradeoff Between Return and Risk in Immunized Portfolios. Financial Analysts Journal (September-October 1983), pp. 73-78.

[9] Francq et al. (2010). GARCH Models: Structure, Statistical Inference and Financial Applications.

[10] Gurkaynak, Refet S., Sack, Brian P. and Wright, Jonathan H. (2006). The U.S. Treasury Yield Curve: 1961 to the Present. FEDS Working Paper No. 2006-28. Available at SSRN.

[11] Hamilton (1994). Time Series Analysis.

[12] Higham (2008). Functions Of Matrices: Theory and Computation.

[13] Ho, T.S.Y. (1992). Key Rate Durations: Measures of Risks. Journal of Fixed Income, September 1992, pp. 29-44.

[14] Johnson, L. L. (1960). The Theory of Hedging and Speculation in Commodity Futures. Review of Economic Studies, Vol. 27 No. 3, pp. 139-151.

[15] Joseph A. Whitt, Jr. (1996). The Mexican Peso Crisis. Economic Review. Federal Reserve Bank of Atlanta.

[16] Mark Carlson (2006). A Brief History of the 1987 Stock Market Crash with a Discussion of the Federal Reserve Response. Finance and Economics Discussion Series. Federal Reserve Board, Washington, D.C.

[17] Memmel, Christoph and Kempf, Alexander (2006). Estimating the Global Minimum Variance Portfolio. Available at SSRN.

[18] Meucci, Attilio (2009). Review of Statistical Arbitrage, Cointegration, and Multivariate Ornstein-Uhlenbeck. Available at SSRN.

[19] Nelson, D.B. (1991). Conditional heteroskedasticity in asset returns: a new approach. Econometrica 59:347-370.

[20] Rebonato, Saroka and Putyatin (2014). Affine Principal-Component-Based Term Structure Model. Available at SSRN.

[21] Riccardo Rebonato (2014-2015). Private communication.

[22] Vladimir Putyatin (2014-2015). Private communication.

[23] Wurtz et al. Parameter Estimation of ARMA Models with GARCH/APARCH Errors. An R and SPlus Software Implementation. Journal of Statistical Software.
