Trading in VIX Derivatives¹
Presented at IAQF Thalesian Series

Andrew Papanicolaou

FRE Department NYU Tandon Brooklyn NY

April 25th 2017

¹ Joint work with Marco Avellaneda.

1 / 85 Fear

Figure : The VIX is the market’s impulse response to fear.

2 / 85 The VIX Index

For t ≤ T let the options on SPX be

\[
P(t,K,T) = e^{-r(T-t)}\,E_t (K - S_T)^+ , \qquad C(t,K,T) = e^{-r(T-t)}\,E_t (S_T - K)^+ ,
\]

with $E_t$ being risk-neutral expectation. The VIX is

\[
\mathrm{VIX}_t = \sqrt{\frac{2e^{r\tau}}{\tau}\left(\int_0^{E_t S_{t+\tau}} P(t,K,t+\tau)\,\frac{dK}{K^2} + \int_{E_t S_{t+\tau}}^{\infty} C(t,K,t+\tau)\,\frac{dK}{K^2}\right)}
\]

with τ = 30 days.
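As a sanity check on the definition above, here is a minimal Python sketch that evaluates the option-strip integral numerically. It assumes Black-Scholes prices with a flat volatility `sigma` (an illustrative assumption, not part of the talk), in which case the formula should return `sigma` itself. The function names, the forward level `F`, the rate `r`, and the grid settings are hypothetical.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_otm_price(F, K, sigma, tau, r):
    """Black-Scholes price of the out-of-the-money option at strike K
    (put if K is below the forward F, call otherwise)."""
    sd = sigma * math.sqrt(tau)
    d1 = (math.log(F / K) + 0.5 * sd * sd) / sd
    d2 = d1 - sd
    disc = math.exp(-r * tau)
    if K < F:
        return disc * (K * norm_cdf(-d2) - F * norm_cdf(-d1))
    return disc * (F * norm_cdf(d1) - K * norm_cdf(d2))

def vix_from_strip(F, sigma, tau, r, n=2000, width=8.0):
    """Trapezoid rule for the strip integral on a log-strike grid,
    using dK / K^2 = d(ln K) / K."""
    sd = sigma * math.sqrt(tau)
    lo = math.log(F) - width * sd
    hi = math.log(F) + width * sd
    h = (hi - lo) / n
    total = 0.0
    for k in range(n + 1):
        K = math.exp(lo + k * h)
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * bs_otm_price(F, K, sigma, tau, r) / K * h
    return math.sqrt(2.0 * math.exp(r * tau) / tau * total)

# Illustrative inputs: 30-day horizon, 20% flat volatility.
vix = vix_from_strip(F=2350.0, sigma=0.20, tau=30.0 / 365.0, r=0.01)
```

With real data the strip is a finite set of listed strikes, so truncation and discretization matter; the sketch ignores both.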

3 / 85 Highlights of Talk

1. The VIX futures curve exhibits stationary behavior, with mean reversion toward a contango.
2. Which is a good model for capturing statistical dynamics of VIX futures?
3. How do complacency and the unknown factor into a stationary model?
4. Can we manage the negative roll yield in VIX ETNs?
5. Is the market showing too little concern?

4 / 85 The VIX Time Series

[Plot: VIX daily closing, 1/2/2004 to 1/3/2017.]

Figure : The VIX has measured market fear since 2004.

5 / 85 Statistical Properties of VIX & VX Futures

From the VIX time series...
1. time series mean, 17.16
2. time series mode, 12.64
3. augmented Dickey-Fuller stat: reject (no unit root)

Typically...
1. on most days the VIX futures curve is in contango, with backwardation coming when there’s fear.
2. backwardation mean reverts within a few weeks, which is fast compared to interest rates or oil

6 / 85 VX Term Structure

Figure : A typical contango, Jan 23, 2017.

7 / 85 VX Term Structure

Figure : Backwardation for 5 months, Aug 17, 2011. Obvious dollar-neutral trade here.

8 / 85 VX Term Structure

Figure : Full on backwardation, Oct 16, 2008

9 / 85 VX Term Structure

Figure : April 11, 2017. Notice the “scoop”, which is typically how trouble starts.... Perhaps elections in France and the possibility of a French exit are causing fear.

10 / 85 VX Term Structure

Figure : VIX weeklies included, April 11, 2017. Weeklies are not liquid.

11 / 85 The “Dull” or Most Likely State

Figure : April 10th 2017. The blue line connects the modes for each VX contract.

12 / 85 Comparison with Other Stationary Curves

Other curves are stationary, but with different properties:

- Oil: and storage theory
  - normal for producers to price (Keynes),
  - 100 years later, storage industry in Cushing OK, cash-and-carry arb puts lower bound on contango.
  - inventories/storage lower vol.
- Gold: considering only (21m)^3 in human hands, relatively cheap to store, relatively flat curve/low vol
- Rates: Lots of instruments (bonds, swaps, etc.), relatively complete market; hedgeable

Deep contango and high vol in non-storables:

- electricity
- VIX futures

13 / 85 VX Contango Roll Yield

This is the most likely curve: Long positions in short-term VX lose faster than long-term

Figure : The most likely yield, and the red arrows illustrating the negative roll yield for long positions at all maturities.

14 / 85 VX: A Non-Stationary Curve

The “Scoop-Shaped” curve will revert back to the contango.

Figure : A non-stationary or transient state. The black arrows illustrate the positive roll yield for long positions at shorter maturities.

15 / 85 The “Dull” State of Fear

Figure : Sooner or later it becomes normal......

16 / 85 Bergomi’s Model [Bergomi, 2005, Bergomi, 2008]

Let $T > 0$ denote a futures maturity,

\[
F_{t,T} = E_t\,\mathrm{VIX}_T \qquad \forall\, t \le T ,
\]

where $E_t$ denotes a time-$t$ conditional risk-neutral expectation.

The Bergomi model arises naturally from the risk-neutral martingale,

\[
\frac{dF_{t,T}}{F_{t,T}} = \sum_{i=1}^{d} \sigma_i(T-t)\,dW_t^i ,
\]

where each $\sigma_i(t)$ is a diffusion coefficient that tends toward zero as $t \to \infty$, and the vector $dW_t$ consists of Brownian increments with correlations

\[
dW_t^i\,dW_t^j = \rho_{ij}\,dt .
\]

17 / 85 Bergomi’s Model

The solution of the SDE for $F_{t,T}$ is

\[
F_{t,T} = F_{t_0,T}\exp\left(\sum_{i=1}^{d}\int_{t_0}^{t}\sigma_i(T-s)\,dW_s^i - \frac{1}{2}\sum_{i,j=1}^{d}\rho_{ij}\int_{t_0}^{t}\sigma_{ij}^2(T-s)\,ds\right),
\]

where $\sigma_{ij}^2 = \sigma_i\sigma_j$ and $t_0 \le t$ is an initial time. Different kernels:

- The exponential kernel $\sigma_i(t) = \bar\sigma_i e^{-\kappa_i t}$ with $\kappa_i > 0$ (Markov)
- Power law $\sigma_i(t) = t^{-\gamma}$ with $\gamma > 0$ (fractional Brownian motion, non-Markov, [Gatheral et al., 2014])

18 / 85 Rolling Contracts

Denote $\tau = T - t$ to define the rolling contract,

\[
V_t^\tau = F_{t,t+\tau} ,
\]

for which there is the following expression:

\[
V_t^\tau = F_{t_0,t+\tau}\exp\left(\sum_{i=1}^{d}\int_{t_0}^{t}\sigma_i(\tau+t-s)\,dW_s^i - \frac{1}{2}\sum_{i,j=1}^{d}\rho_{ij}\int_{t_0}^{t}\sigma_{ij}^2(\tau+t-s)\,ds\right).
\]

Take the functions $\sigma_i$ to be

\[
\sigma_i(t) = \bar\sigma_i e^{-\kappa_i t}
\]

where $\kappa_i > 0$. We take each factor $X_t^i$ to be a stationary OU process,

\[
X_t^i = \bar\sigma_i\int_{-\infty}^{t} e^{-\kappa_i(t-s)}\,dW_s^i ,
\]

and letting $t_0$ tend toward $-\infty$, ......

19 / 85 Stationary State

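...... we obtain the stationary model for the futures curve,

\[
V_t^\tau = V^\infty\exp\left(\sum_{i=1}^{d} e^{-\kappa_i\tau}X_t^i - \frac{1}{2}\sum_{i,j=1}^{d}\frac{\rho_{ij}\bar\sigma_i\bar\sigma_j}{\kappa_i+\kappa_j}\,e^{-(\kappa_i+\kappa_j)\tau}\right).
\]

In particular, evaluating at $\tau = 0$ gives

\[
\mathrm{VIX}_t = V^\infty\exp\left(\sum_{i=1}^{d} X_t^i - \frac{1}{2}\sum_{i,j=1}^{d}\frac{\rho_{ij}\bar\sigma_i\bar\sigma_j}{\kappa_i+\kappa_j}\right).
\]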

20 / 85 The “Dull” or Most Likely State

The mode for this model.....

\[
\mathrm{mode}(V_t^\tau) = V^\infty\exp\left(-\frac{1}{2}\sum_{i,j=1}^{d}\frac{\rho_{ij}\bar\sigma_i\bar\sigma_j}{\kappa_i+\kappa_j}\,e^{-(\kappa_i+\kappa_j)\tau}\right).
\]

This should be a contango....
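The contango claim can be checked numerically. The sketch below evaluates the stationary curve and its mode for a 2-factor model, using the risk-neutral estimates quoted in the talk's "Estimated Parameters" table as inputs. The function names are hypothetical, and treating the mode as the curve at $X_t = 0$ follows the formula above.

```python
import math

# Risk-neutral estimates from the talk's "Estimated Parameters" table.
V_INF = 21.2068
KAPPA = (0.6879, 23.7273)
SIGMA = (1.3393, 1.8297)
RHO = 0.2018

def corr(i, j):
    """Correlation rho_ij of the two driving Brownian motions."""
    return 1.0 if i == j else RHO

def convexity(tau):
    """(1/2) * sum_ij rho_ij s_i s_j / (k_i + k_j) * exp(-(k_i + k_j) * tau)."""
    total = 0.0
    for i in range(2):
        for j in range(2):
            ksum = KAPPA[i] + KAPPA[j]
            total += corr(i, j) * SIGMA[i] * SIGMA[j] / ksum * math.exp(-ksum * tau)
    return 0.5 * total

def curve(tau, x=(0.0, 0.0)):
    """Stationary futures curve V_t^tau for factor values x = (X^1, X^2)."""
    linear = sum(math.exp(-KAPPA[i] * tau) * x[i] for i in range(2))
    return V_INF * math.exp(linear - convexity(tau))

def mode_curve(tau):
    """Most likely curve: the stationary curve evaluated at X = 0."""
    return curve(tau)

maturities = [m / 12.0 for m in range(13)]       # 0 to 12 months, in years
modes = [mode_curve(t) for t in maturities]
is_contango = all(a < b for a, b in zip(modes, modes[1:]))
```

Because every $\rho_{ij}\bar\sigma_i\bar\sigma_j$ is positive for these inputs, the convexity term decays monotonically in $\tau$ and the mode curve rises toward $V^\infty$.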

21 / 85 The Dull State for a 2-Factor Model

[Plot: VIX term structure from 2-factor Gaussian OU, for VIX_0 = 12.7028, 24.0596, and 32.0561; x-axis: maturity (in months).]

Figure : 2 Factors, $X_t^1$ and $X_t^2$ mean-reverting OU processes.

22 / 85 PCA and Model Selection

PCA from February 8th 2011 to December 15th 2016

- 8 rolling contracts (including the VIX)
- $N = 1{,}499$ days
- each day the VIX and the VX futures curve form a row entry in an $N \times 8$ matrix.

Notation,

\[
V_{ij} = \ln(V_{t_i}^{\tau_j}) - \overline{\ln(V^{\tau_j})} , \qquad \text{where} \qquad \overline{\ln(V^{\tau_j})} = \frac{1}{N}\sum_{i}\ln(V_{t_i}^{\tau_j}) ,
\]

with $i = 1, 2, 3, \dots, N$, and $j = 0, 1, 2, \dots, 7$ with $\tau_j = j\,\frac{30}{365}$.

23 / 85 PCA and Model Selection

The singular value decomposition (SVD),

\[
U S \psi' = V ,
\]

where

- $U$ is an $N \times 8$ matrix with orthonormal columns,
- $S$ is an $8 \times 8$ diagonal matrix containing the singular values, and
- $\psi$ is an $8 \times 8$ orthonormal matrix whose columns are the principal components used to form any given futures curve.

In other words, for $d \le 8$ we have

\[
\ln(V_{t_i}^{\tau_j}) \approx \overline{\ln(V^{\tau_j})} + \sum_{\ell=1}^{d} a_{i\ell}\,\psi_{j\ell} ,
\]

with equality when $d = 8$, where the coefficient matrix is $a = US$.

24 / 85 PCA and Model Selection

[Plots: VX curve reconstructed with 1, 2, 3, and 4 components; x-axis: maturity (in months).]

Figure : PCA reconstruction of VX curves for April 10th 2014.

25 / 85 PCA and Model Selection

[Plots: VX curve reconstructed with 1, 2, 3, and 4 components; x-axis: maturity (in months).]

Figure : PCA reconstruction of VX futures curve for August 8th 2011, US credit downgrade.

26 / 85 Mode of PCA Weights

[Plots: histogram of 1st PCA weight; histogram of 2nd PCA weight.]

Figure : The histogram of weights for the 1st and 2nd principal components, ai1 and ai2 respectively.

27 / 85 The Most Likely Curve via PCA

[Plot: most likely curve and mean curve; x-axis: maturity (in months).]

Figure : Recall mean VIX higher than mode VIX. Here mean curve is higher than mode curve,

\[
\mathrm{mode}\big(\ln(V_{t_i})\big) \approx \overline{\ln(V)} + \mathrm{mode}(a_{i1})\,\psi_1 + \mathrm{mode}(a_{i2})\,\psi_2 .
\]

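28 / 85 The Real-World or Statistical Model

The stationary, risk-neutral process,

\[
dX_t^1 = -\kappa_1 X_t^1\,dt + \bar\sigma_1\,dW_t^1 , \qquad
dX_t^2 = -\kappa_2 X_t^2\,dt + \bar\sigma_2\,dW_t^2 .
\]

The real-world or statistical dynamics of the bivariate OU process are

\[
dX_t^1 = \kappa_1^p(\mu_1 - X_t^1)\,dt + \bar\sigma_1\,dW_t^{p,1} , \qquad
dX_t^2 = \kappa_2^p(\mu_2 - X_t^2)\,dt + \bar\sigma_2\,dW_t^{p,2} ,
\]

where

\[
d\begin{pmatrix} W_t^1 \\ W_t^2 \end{pmatrix}
= \underbrace{\begin{pmatrix} \bar\sigma_1 & 0 \\ 0 & \bar\sigma_2 \end{pmatrix}^{-1}
\begin{pmatrix} \kappa_1^p\mu_1 - (\kappa_1^p - \kappa_1)X_t^1 \\ \kappa_2^p\mu_2 - (\kappa_2^p - \kappa_2)X_t^2 \end{pmatrix}}_{\text{market price of risk}} dt
+ d\begin{pmatrix} W_t^{p,1} \\ W_t^{p,2} \end{pmatrix} .
\]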

29 / 85 Parameter Estimation

The data is the observed VIX and VX futures,

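\[
Y_i^{\tau_j} = \ln(V_{t_i}^{\tau_j})
\]

with $\tau_j = j \times 30$ days for $j = 0, 1, \dots, 7$. Let the parameters of the OU process be denoted by $\theta$,

\[
\theta = (\underbrace{V^\infty, \kappa_1, \kappa_2, \sigma_1, \sigma_2, \rho}_{\text{risk neutral}},\ \underbrace{\kappa_1^p, \kappa_2^p, \mu_1, \mu_2}_{\text{real world}}) .
\]

Using the model

\[
Y_i = H^\theta X_{t_i} + G^\theta + \varepsilon_i \quad \text{(observed)} , \tag{1}
\]
\[
X_{t_{i+1}} = A^\theta X_{t_i} + \mu^\theta + \Delta W_{i+1}^p \quad \text{(latent)} , \tag{2}
\]

where $\varepsilon_i = (\varepsilon_i^{\tau_j})_j$ with $\mathrm{cov}(\varepsilon_i) = R$ and $\mathrm{cov}(\Delta W_{i+1}^p) = Q^\theta$.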
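One way to obtain a latent equation of the form (2) from the real-world OU dynamics is exact discretization over a time step. The Python sketch below does this for a single factor. It is an assumption for illustration: the talk does not state how $A^\theta$, $\mu^\theta$, and $Q^\theta$ are built, and the function names and the daily step `dt = 1/252` are hypothetical. Correlation between the two factors is ignored here.

```python
import math
import random

def ou_step(x, kappa, mu, sigma, dt, z):
    """Exact one-step OU transition: x' = a*x + (1 - a)*mu + sd*z,
    with a = exp(-kappa*dt) and sd^2 = sigma^2 * (1 - a^2) / (2*kappa)."""
    a = math.exp(-kappa * dt)
    sd = sigma * math.sqrt((1.0 - a * a) / (2.0 * kappa))
    return a * x + (1.0 - a) * mu + sd * z

def simulate_ou(x0, kappa, mu, sigma, dt, n, rng):
    """Simulate n steps of the OU factor starting from x0."""
    path = [x0]
    for _ in range(n):
        path.append(ou_step(path[-1], kappa, mu, sigma, dt, rng.gauss(0.0, 1.0)))
    return path

# First factor under the real-world estimates quoted in the talk
# (kappa_1^p, mu_1, sigma_1), simulated for six years of daily steps.
rng = random.Random(0)
dt = 1.0 / 252.0
x1_path = simulate_ou(0.0, 1.2938, 0.1904, 1.3393, dt, 252 * 6, rng)
```

In this scalar sketch the counterpart of $A^\theta$ is `exp(-kappa * dt)`, the counterpart of $\mu^\theta$ is `(1 - a) * mu`, and the counterpart of $Q^\theta$ is `sd ** 2`.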

30 / 85 Estimated Parameters

Parameter    Estimated θ
V^∞          21.2068
κ_1          0.6879
κ_2          23.7273
σ̄_1          1.3393
σ̄_2          1.8297
ρ            0.2018
κ_1^p        1.2938
κ_2^p        17.0911
µ_1          0.1904
µ_2          -0.0056

Table : Estimated VIX risk-neutral mode of 10.5068. For the statistical model, the optimization is constrained to look for a mean and mode equal to those of the VIX data; the estimated model’s statistical mode is 12.6400 and its mean is 18.9844, compared to the mode and mean of the VIX time series of 12.6400 and 17.1639, respectively. The total fitting error to the VX term structure is 8.50378.

31 / 85 Goodness of Fit

[Plots: Kalman filter (left) and least-squares estimator (right) of the factors X1 and X2; x-axis: t (in years).]

Figure : Left: The Kalman filter $\hat X_i^\theta$. Right: The least-squares estimator $\hat X_i^{\theta,lsq}$.

- Important to use the Kalman filter; it maintains our a priori preference for OU dynamics of $X$.
- Daily least-squares (LSQ) estimation can be used after the parameters are estimated, but if daily LSQ is used in the iterative parameter estimation algorithm then there is overfitting.

32 / 85 Goodness of Fit

[Plots: data and fit of $V_t^\tau$ for τ = 0, 30, 60, 90, 120, and 150 days; x-axis: t (in years).]

33 / 85 Goodness of Fit

[Plots: histogram of the innovations (left) and histogram of the residuals $\hat X_{i+1}^\theta - A^\theta \hat X_i^\theta - \mu^\theta$ (right).]

Figure : Left: The histograms of the innovations $\nu_i$, which are normally distributed under the null hypothesis of normally-distributed $\varepsilon_i$ and $\Delta W^p$. Right: The histograms of the residuals $\hat X_{i+1}^\theta - A^\theta \hat X_i^\theta - \mu^\theta$, which would also be normally distributed under the null hypothesis.

34 / 85 Most Likely VX Curve Given High VIX

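The outlier probability for $V_t^\tau$ is equivalent to the probability of the normal random variable exceeding some large $M > 0$,

\[
P^p\left(X_t^1 + X_t^2 \ge M\right) \le \exp\left(-\frac{1}{2}\,\frac{\big(M - (\mu_1 + \mu_2)\big)^2}{(1,1)\,\Sigma\,\binom{1}{1}}\right),
\]

where $P^p$ denotes the statistical probability measure, and

\[
\Sigma = \begin{pmatrix}
\dfrac{\sigma_1^2}{2\kappa_1^p} & \dfrac{\rho\sigma_1\sigma_2}{\kappa_1^p + \kappa_2^p} \\[2ex]
\dfrac{\rho\sigma_1\sigma_2}{\kappa_1^p + \kappa_2^p} & \dfrac{\sigma_2^2}{2\kappa_2^p}
\end{pmatrix}
\]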

35 / 85 Most Likely VX Curve Given High VIX

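Conditional on an exceedance over a threshold $M$, the most likely value of $(X_t^1, X_t^2)$ is found by maximizing the joint density subject to a constraint. The density is

\[
p(x^1, x^2) = \frac{1}{2\pi|\Sigma|^{1/2}}\exp\left(-\frac{1}{2}\,(x^1 - \mu_1,\ x^2 - \mu_2)\,\Sigma^{-1}\begin{pmatrix} x^1 - \mu_1 \\ x^2 - \mu_2 \end{pmatrix}\right),
\]

and the Lagrangian optimization problem is

\[
\min_{x^1,x^2}\ \frac{1}{2}\,(x^1 - \mu_1,\ x^2 - \mu_2)\,\Sigma^{-1}\begin{pmatrix} x^1 - \mu_1 \\ x^2 - \mu_2 \end{pmatrix} - \delta\left(x^1 + x^2 - M\right),
\]

where $\delta \ge 0$ is a Lagrange multiplier.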

36 / 85 Most Likely VX Curve Given High VIX

The solution is

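\[
\begin{pmatrix} X^1 \\ X^2 \end{pmatrix}_{ml(\tau,M)} = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix} + \delta^*\,\Sigma\begin{pmatrix} 1 \\ 1 \end{pmatrix},
\]

where $\delta^*$ is the optimal Lagrange multiplier, which for $M > \mu_1 + \mu_2$ is

\[
\delta^* = \frac{M - (\mu_1 + \mu_2)}{(1,1)\,\Sigma\,\binom{1}{1}} ;
\]

obviously $\delta^* = 0$ if $M \le \mu_1 + \mu_2$.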
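The closed-form solution is easy to evaluate. The Python sketch below computes $\delta^*$ and the conditional most likely factor values for a 2-factor model, using the real-world estimates quoted in the talk as inputs. The function names and the threshold `M = 1.2` (taken from a figure legend in the talk) are used only for illustration.

```python
# Real-world estimates quoted in the talk's "Estimated Parameters" table.
KAPPA_P = (1.2938, 17.0911)
SIGMA = (1.3393, 1.8297)
RHO = 0.2018
MU = (0.1904, -0.0056)

def stationary_cov():
    """Stationary covariance matrix Sigma of (X^1, X^2) under the statistical measure."""
    s1, s2 = SIGMA
    k1, k2 = KAPPA_P
    off = RHO * s1 * s2 / (k1 + k2)
    return ((s1 * s1 / (2.0 * k1), off), (off, s2 * s2 / (2.0 * k2)))

def most_likely_factors(M):
    """Return ((x1, x2), delta) maximizing the Gaussian density subject to x1 + x2 >= M."""
    cov = stationary_cov()
    # Sigma @ (1, 1)^T: the row sums of the covariance matrix.
    row_sums = (cov[0][0] + cov[0][1], cov[1][0] + cov[1][1])
    # (1, 1) Sigma (1, 1)^T: the variance of X^1 + X^2.
    quad = row_sums[0] + row_sums[1]
    delta = max(0.0, (M - (MU[0] + MU[1])) / quad)
    x = (MU[0] + delta * row_sums[0], MU[1] + delta * row_sums[1])
    return x, delta

x_ml, delta_star = most_likely_factors(1.2)
```

When the constraint binds, the returned point lies exactly on the line $x^1 + x^2 = M$; otherwise it is the unconditional mean.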

37 / 85 Most Likely VX Curve Given High VIX

[Plot: most likely curve given X1 + X2 > 1.2, and the unconditional most likely curve; x-axis: maturity (in months).]

Figure : The most likely curve given X 1 + X 2 ≥ M.

38 / 85 Most Likely VX Curve Given High VIX

[Plot: most likely curve given X1 + X2 > 0.25, and the unconditional most likely curve; x-axis: maturity (in months).]

Figure : The most likely curve given X 1 + X 2 ≥ M.

39 / 85 Or the Curve if we know more...

[Plot: high VIX with dull state expected in 4 months; non-stationary curve and most likely curve; x-axis: maturity (in months).]

Figure : Most likely given $X^1 + X^2 \ge M$ and $X^1 e^{-\kappa_1\tau_4} + X^2 e^{-\kappa_2\tau_4} = \mu_1 e^{-\kappa_1\tau_4} + \mu_2 e^{-\kappa_2\tau_4}$.

40 / 85 Summary to this Point

- Characterized stationarity of the VX curve

- Looked at PCA and found that 2 factors are sufficient

- Fit the Gaussian Bergomi model and found that it captures some of the features.

41 / 85 Complacency

42 / 85 Complacency

Of considerable concern right now.

James Mackintosh of WSJ writes:

“At the moment, the gap between realized and [implied volatility] is normal. Investors are prepared for some rise in volatility, but from an abnormally low level.”

Typically, the spread VIX minus realized vol is positive to reflect premium in options.

However, JM remarks that low VIX isn’t the whole story, as the premium in SPX put options is more indicative of fear.

43 / 85 Complacency

Indeed, there are several causes and effects:

- stimulus and easy money every time the market goes down; younger traders only know the “Big Dip Era”
- The ETF craze, entire market “waxes and wanes” in unison because of the migration to passive funds
- post-crisis corporate buybacks
- Self-fulfilling prophecy:
  - VIX will stay low, so short the contango, which in turn drives down VIX prices
  - many funds shorting the long-dated VIX futures and collecting premium
  - The ZIV is an example of this short

44 / 85 Past VIX Events

Non-Complacent moments in the last 2 decades:

- the Russia crisis 8/98
- the dotcom collapse 3/00
- market euphoria begins 1/06
- the credit crunch 8/07
- Lehman collapse 9/08
- Greece debt 5/10
- Eurozone debt/US downgrade 8/11

Brexit and Trump election didn’t have much effect. The French election had some effect in the last 2 weeks.

45 / 85 We’re Overdue for an Event

- The forest service practices controlled burns as a means to prevent disastrous wildfires.
- Is there a volatility forest service?
- We’re overdue for a volatility event, the forest is littered with kindling...

46 / 85 The VIX ETNs

[Plot: VXX and VXZ daily closing (3/1/2010 to 3/1/2017).]

Figure : Negative roll yields for positions in front end or in the back end. Front end is more volatile. $1 million invested in the first VIX ETN in 2009 would be $600 today.

47 / 85 Negative Roll Yield

[Plot: contango yield, V2-V1 and V7-V4, 02-08-11 to 07-30-16.]

Figure : Negative roll yields for positions in front end or in the back end. Front end is more volatile.

48 / 85 Pairs Trading for VIX ETNs

[Plot: co-integrated time series log(VXX) − β1 − β2 log(VXY), 18-Mar-2010 to 01-Mar-2017.]

Figure : Engle-Granger rejects H0: ln(VXX) and ln(VXY) have no co-integration. Run regression ln(VXX) = β1 + β2 ln(VXY), then Dickey-Fuller test rejects unit root in residual.

49 / 85 Construction of ETNs

ETNs roll between 2 or more VX contracts:

- let $t$ be the calendar date
- let $T_1$ and $T_2$ be the first and second expirations after $t$
- let $r = T_2 - T_1$, i.e. the number of business days in the period between $T_1$ and $T_2$
- let $\theta$ be the number of business days between today and 1 month from today.

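Figure : Contracts $F_{t,T_1}$ and $F_{t,T_2}$ used in ETN with roll horizon $\theta$.

50 / 85 Construction of ETNs

For $T_1 \le t + \theta \le T_2$ define

\[
a(t) = \frac{T_2 - (t + \theta)}{r}
\]

and notice

- $0 \le a(t) \le 1$
- $a(T_1 - \theta) = 1$ and $a(T_2 - \theta) = 0$
- linear in $t$.

Denote interpolation as

\[
\tilde V_t = a(t)F_t^{T_1} + (1 - a(t))F_t^{T_2} .
\]

The change in $\tilde V_t$ is

\[
d\tilde V_t = \left(\dot a(t)F_t^{T_1} - \dot a(t)F_t^{T_2}\right)dt + a(t)\,dF_t^{T_1} + (1 - a(t))\,dF_t^{T_2}
= \frac{1}{\theta}\left(F_t^{T_2} - F_t^{T_1}\right)dt + a(t)\,dF_t^{T_1} + (1 - a(t))\,dF_t^{T_2} ,
\]

where the second equality uses $\dot a(t) = -1/r$ and takes $r \approx \theta$ (about one month between expirations).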
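The roll weight and the interpolated contract can be sketched in a few lines of Python. The function names and the example dates (expirations at business days 21 and 42, a 21-day roll horizon, futures at 13 and 15) are hypothetical and chosen only to illustrate the definitions above.

```python
def roll_weight(t, T1, T2, theta):
    """Weight a(t) on the front contract, for T1 <= t + theta <= T2."""
    roll_days = T2 - T1  # the quantity called r on the slide
    if not (T1 <= t + theta <= T2):
        raise ValueError("t + theta must lie between T1 and T2")
    return (T2 - (t + theta)) / roll_days

def interpolated_future(t, T1, T2, theta, F1, F2):
    """Constant-maturity value a(t)*F1 + (1 - a(t))*F2."""
    a = roll_weight(t, T1, T2, theta)
    return a * F1 + (1.0 - a) * F2

# Example: a third of the way through the roll period.
a_start = roll_weight(0, 21, 42, 21)   # fully in the front contract
a_end = roll_weight(21, 21, 42, 21)    # fully in the second contract
v_mid = interpolated_future(7, 21, 42, 21, F1=13.0, F2=15.0)
```

The weight falls linearly from 1 to 0 over the roll period, so the position migrates from the front contract to the second contract at a constant daily rate.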

51 / 85 Construction of ETNs

The value of the ETN, $E_t$, is given by the formula

\[
\frac{dE_t}{E_t} = \frac{a(t)\,dF_t^{T_1} + (1 - a(t))\,dF_t^{T_2}}{a(t)F_t^{T_1} + (1 - a(t))F_t^{T_2}} + r\,dt ,
\]

where $r$ is here the interest rate.

This corresponds to the variation in a futures trading account in which an investor buys the front contract with 100% of his capital and then gradually buys the second contract and sells the first (“rolls”), buying calendar spreads on each date, so as to be fully invested in the “first” contract after the next date, and so on.

52 / 85 The Roll Yield

The continuous-time analogue of the evolution of VXX (or any continuously rolled ETF/ETN) is therefore:

\[
\frac{dE_t}{E_t} = \frac{dV_t^\theta}{V_t^\theta} - \frac{\partial\ln(V_t^\theta)}{\partial\theta}\,dt + r\,dt .
\]

Note: The futures strategy implemented by the ETN issuer maintains the number of contracts constant in the roll, by transferring a number of contracts equal to

\[
\frac{\text{total number of contracts}}{\text{number of business days between expirations}}
\]

from one maturity to the next one, by buying that number of calendar spreads daily.

53 / 85 Losing to Contango

Since under a risk-neutral measure all self-financing strategies should return zero, we should have

\[
E_t\left[\frac{dV_t^\theta}{V_t^\theta} - \frac{\partial\ln(V_t^\theta)}{\partial\theta}\,dt\right] = r\,dt , \tag{3}
\]

where $E_t[\ast]$ is conditional expectation.

- This statement does not hold under the empirical measure, since we expect $V_t^\theta$ to be mean-reverting and $\frac{\partial\ln(V_t^\theta)}{\partial\theta}$ to be mostly positive, due to contango.
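To get a feel for the size of the roll cost, the Python sketch below evaluates the slope $\partial\ln V^\theta/\partial\theta$ on the model's most likely curve, using the risk-neutral estimates quoted in the talk, and compounds it into an ETN value. The scenario is a deliberate simplification that is not in the talk: the curve is assumed to sit still at its most likely shape (so $dV_t^\theta = 0$) and the interest rate defaults to zero. All function names are hypothetical.

```python
import math

# Risk-neutral estimates from the talk's "Estimated Parameters" table.
KAPPA = (0.6879, 23.7273)
SIGMA = (1.3393, 1.8297)
RHO = 0.2018

def _pairs():
    """Yield (rho_ij * s_i * s_j, k_i + k_j) over all factor pairs."""
    for i in range(2):
        for j in range(2):
            rho = 1.0 if i == j else RHO
            yield rho * SIGMA[i] * SIGMA[j], KAPPA[i] + KAPPA[j]

def log_mode_curve(tau):
    """ln(mode(V^tau) / V^inf) on the most likely curve."""
    return -0.5 * sum(c / k * math.exp(-k * tau) for c, k in _pairs())

def curve_slope(tau):
    """Analytic d/d(tau) of ln(mode(V^tau)); positive in contango."""
    return 0.5 * sum(c * math.exp(-k * tau) for c, k in _pairs())

def etn_value_static(e0, theta, years, r=0.0):
    """ETN value if the curve stays frozen at its most likely shape:
    dE/E = (r - slope(theta)) dt."""
    return e0 * math.exp((r - curve_slope(theta)) * years)

one_month = 1.0 / 12.0
slope_1m = curve_slope(one_month)
value_after_1y = etn_value_static(100.0, one_month, 1.0)
```

In a real market the curve moves, so realized ETN returns also include the $dV_t^\theta / V_t^\theta$ term that this frozen-curve scenario sets to zero.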

54 / 85 Simulated ETNs

[Plot: ETNs and VIX; 1 Month ETN, 7 Month ETN, and VIX; left axis: ETN value, right axis: VIX.]

Figure : VXX and VXY simulated from fitted 2-factor model.

55 / 85 Simulated ETNs

[Plot: Short 1m, Long 7m; left axis: ETN value, right axis: VIX.]

Figure : A simulated long-short in VXX and VXY.

56 / 85 2nd Summary

- Complacency and the risk of being too sure are of concern right now (April 2017)

- ETNs are a by-product of successful bets on low vol (and in general the ETN craze)

- ETN losses are due to negative roll yield, which is similar to the premium for low-strike SPX put options.

57 / 85 The Distribution of Fear

Figure : By buying and selling VIX options, traders contribute towards a collective distribution on the Fear Index.

58 / 85 VIX Tail Hedge (VXTH) Enter/Exit Strategy

A portfolio to hedge against tail events:

- 1% of portfolio weight in 30-∆ VIX call options if VIX in range of 15%-30%; rest in S&P500 stocks
- 1/2% of portfolio weight in 30-∆ VIX call options if VIX in range of 30%-50%; rest in S&P500 stocks
- 0% of portfolio weight in VIX options if VIX outside range of 15%-50%; all in S&P500 stocks

Big losses in the S&P500 coincide with spikes in VIX. VXTH provides positive cash flow on these days.

S&P 500 gained about 15% between 2006 and 2011; VXTH gained about 40%

59 / 85 VIX Tail Hedge (VXTH)

60 / 85 Non-Orthogonality of Markets

61 / 85 Time-Spread Portfolio [Papanicolaou, 2016]

A future on VIX squared: $V_{t,T} = E_t\,\mathrm{VIX}_T^2$

\[
V_{t,T} = \frac{2e^{r(T+\tau-t)}}{\tau}\int_0^\infty\left(P(t,K,T+\tau) - P(t,Ke^{-r\tau},T)\right)\frac{dK}{K^2}
\]

i.e. $V_{t,T}$ is not only an index, but an asset that can be held,

\[
V_{t,T} = 2\int_0^\infty C^{vix}(t,K,T)\,dK ,
\]

where $C^{vix}(t,K,T)$ denotes a VIX call option.

62 / 85 Prediction

Figure : The VIX premium swells when there is unknown.....

63 / 85 Past VIX Events

- the Russia crisis 8/98
- the dotcom collapse 3/00
- market euphoria begins 1/06
- the credit crunch 8/07
- Lehman collapse 9/08
- Greece debt 5/10
- Eurozone debt/US downgrade 8/11

Brexit and Trump election didn’t have much effect. Ambiguity and Overconfidence [Brenner et al., 2011]. What if France exits???

64 / 85 French Politics

It’s going to be how it was, ..... which was pretty OK.

Or look out!! Will French exit have consequences???

65 / 85 I don’t get it??

- We have convinced ourselves that VIX is stationary

- ..... that the VX curve is stationary

- ..... and we even have a reasonable model

Then how is it that Marine Le Pen means all bets are off?

66 / 85 What is a Model for Risk?

- All this time researching and hedging,
- but then a jockey walks by at the betting window....
- Hey Sherlock!! Look up for just a second and see that this jockey is a juicer!!

67 / 85 How to Store?

Figure : Where can I purchase the Smell of Fear?

68 / 85 Fear in a Bottle

How can we buy tomorrow’s volatility?

- Think about cash-and-carry strategies for contango in storables (e.g. oil or gas).
- Already talked about complacency and unhedged shorting of long-dated VX.
- There is no carriable form of volatility.
- How can we implement a cash and carry?

It’s not clear how to buy tomorrow’s volatility.... today.

Suitable instruments for constructing future contracts:

- time-spread portfolio
- forward-start options
- compound options.

69 / 85 In Terms of Greeks

Let $\Theta$ denote option sensitivity to changes in time to maturity,

\[
\Theta(t,K,T) = -\frac{\partial}{\partial T}P(t,K,T) .
\]

Time-spread portfolio can be written as

\[
V_{t,T} = -\frac{2}{\tau}\int_0^\infty\int_0^\tau\Theta(t,K,T+u)\,du\,\frac{dK}{K^2} ,
\]
\[
P(t,K,T+\tau) - P(t,K,T) = -\int_0^\tau\Theta(t,K,T+u)\,du .
\]

VOLATILITY RISK QUANTIFIED AS EXPOSURE TO TIME

70 / 85

[Plot: option price against strike K, with the intrinsic value; the height of the shaded area is $P(t,K,T+\tau) - P(t,K,T) = -\int_0^\tau\Theta(t,K,T+u)\,du$.]

71 / 85 Can Zero-Θ Instruments be Used to Construct Storage?

- We see the usefulness of instruments that do not decay due to passage of time,

- Time Spread portfolio is a contract on future volatility..... not a carry of today’s.

It seems really hard to carry fear.

An equivalent trade to cash-and-carry for VX has yet to be made.

72 / 85 The Thrill of Making Money

Figure : “The Snap”, the most iconic moment in surfing, Tom Carroll putting it all on the line @ the Banzai Pipeline, North Shore of Oahu ’91. The complacent line was to go straight, but Tom saw differently.

73 / 85 Thank You!

74 / 85 Generalized OU

We use a factor model where the factors are given by an $\mathbb{R}^d$-valued process $X_t$ that is mean reverting and in its stationary state,

\[
X_t = \int_{-\infty}^{t} e^{-\kappa(t-s)}\,dL_s ,
\]

where $\kappa > 0$ is a positive definite matrix with

\[
\|\kappa^{-1}\| = \text{decorrelation time},
\]

and $L_t$ is an $\mathbb{R}^d$-valued stable Lévy process having triple $(a, \sigma^2, \nu)$ and Lévy-Khintchine representation (see Chapter 1.2 in [Applebaum, 2004])

75 / 85 Generalized OU

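\[
\log E\,e^{i\langle u, L_1\rangle} = i\langle u, a\rangle - \frac{\|\sigma u\|^2}{2} + \int_{\mathbb{R}^d\setminus\{0\}}\left(e^{i\langle u, x\rangle} - 1 - i\langle u, x\rangle\mathbf{1}_{|x|<1}\right)\nu(dx) ,
\]

where $\sigma$ is an invertible volatility matrix for a diffusion component, and $\nu$ is an intensity measure with $\int_{\mathbb{R}^d\setminus\{0\}}(1\wedge|x|^2)\,\nu(dx) < \infty$. Using the model from [Bergomi, 2005, Bergomi, 2008, Ould Aly, 2014], the term structure is

\[
V_t^\tau = V^\infty\exp\left(e^{-\kappa\tau}X_t - \frac{1}{2}E[[e^{-\kappa\tau}X_t]]\right), \tag{4}
\]

where $[[\,\cdot\,]]$ denotes quadratic (cross) variation of a vector-valued process.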

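76 / 85 Generalized OU

Assuming the moment generating function (MGF) exists,

\[
\Lambda_1(u) = \log E\,e^{uL_1} < \infty ,
\]

for $0 \le u < K < \infty$. If $E\log\big(1 + |L_1|\big) < \infty$ then

\[
\Lambda_X(u) = \log E\,e^{uX_t} = \frac{1}{\kappa}\int_0^u\Lambda_1(z)\,\frac{dz}{z} ,
\]

for all $u \in [0, K)$. Now we can construct the VIX futures curve,

\[
V_t^\tau = V^\infty\exp\left(e^{-\kappa\tau}X_t - \Lambda_X(e^{-\kappa\tau})\right),
\]

so that

\[
\mathrm{VIX}_t = V^\infty e^{X_t - \Lambda_X(1)} ,
\]

and the relation

\[
V_t^\tau = V^\infty\left(\frac{\mathrm{VIX}_t}{V^\infty}\,e^{\Lambda_X(1)}\right)^{e^{-\kappa\tau}}\times\varphi(\tau) , \tag{5}
\]
\[
\varphi(\tau) = e^{-\Lambda_X(e^{-\kappa\tau})} . \tag{6}
\]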

77 / 85 Generalized OU

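The yield is

\[
\log\left(\frac{V_t^\tau}{\mathrm{VIX}_t}\right) = (1 - e^{-\kappa\tau})\log\left(\frac{V^\infty}{\mathrm{VIX}_t}\right) + e^{-\kappa\tau}\Lambda_X(1) - \Lambda_X(e^{-\kappa\tau}) . \tag{7}
\]

Now let $m(x)$ denote the density of $X_t$’s distribution. The Fourier transform of $m$ is

\[
\hat m(q) = \int e^{-ixq}m(x)\,dx = e^{\Lambda_X(-iq)} ,
\]

and so the density is given by

\[
m(x) = \frac{1}{2\pi}\int e^{ixu + \Lambda_X(-iu)}\,du .
\]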

78 / 85 Generalized OU

The mode (for a unimodal distribution) is $x^*$ such that

\[
m'(x^*) = \frac{i}{2\pi}\int u\,e^{ix^*u + \Lambda_X(-iu)}\,du = 0 ,
\]

and the most-likely value of VIX in the dull-state is

\[
\mathrm{mode}(\mathrm{VIX}_t) = V^\infty e^{x^* - \Lambda_X(1)} .
\]

79 / 85 The Double Nelson Model [Bayer et al., 2013]

Take $\mathrm{VIX}_t = X_t^1 + X_t^2$ where

\[
dX_t^1 = \kappa_1(\mu_1 - X_t^1)\,dt + \sigma_1 X_t^1\,dW_t^1 ,
\]
\[
dX_t^2 = \kappa_1(\mu_2 - X_t^2)\,dt + \sigma_2 X_t^1\,dW_t^2 .
\]

It has heavy tails.

80 / 85 Kalman Filtering

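\[
\hat X_i^\theta = E^\theta[X_{t_i}\,|\,Y_{0:i}] , \qquad
\Omega^\theta = E^\theta\left[(X_{t_i} - \hat X_i^\theta)(X_{t_i} - \hat X_i^\theta)^{tr}\right] ,
\]

which are given by the Kalman filter,

\[
\hat X_{i+1}^\theta = A^\theta\hat X_i^\theta + \mu^\theta + K^\theta\left(Y_{i+1} - H^\theta A^\theta\hat X_i^\theta - G^\theta\right) , \tag{8}
\]

where

\[
\Omega^\theta = (I - K^\theta H^\theta)\,\tilde\Omega^\theta \tag{9}
\]
\[
K^\theta = \tilde\Omega^\theta(H^\theta)^{tr}\left(H^\theta\tilde\Omega^\theta(H^\theta)^{tr} + R\right)^{-1} \tag{10}
\]
\[
\tilde\Omega^\theta = A^\theta\tilde\Omega^\theta(A^\theta)^{tr} - A^\theta\tilde\Omega^\theta(H^\theta)^{tr}\left(H^\theta\tilde\Omega^\theta(H^\theta)^{tr} + R\right)^{-1}H^\theta\tilde\Omega^\theta(A^\theta)^{tr} + Q^\theta . \tag{11}
\]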
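For intuition, here is a scalar Python sketch of a Kalman filter for a one-factor version of the state-space model. It uses the textbook predict/update recursion with a time-varying gain rather than a steady-state gain, and it includes the drift in the predicted observation; both choices are assumptions of this sketch. The function name and all numerical inputs are hypothetical.

```python
def kalman_filter(ys, A, mu, H, G, Q, R, x0, P0):
    """Scalar Kalman filter for
        x_{i+1} = A*x_i + mu + w,  var(w) = Q   (latent)
        y_i     = H*x_i + G + e,   var(e) = R   (observed).
    Returns the filtered estimates and the final posterior variance."""
    x, P = x0, P0
    estimates = []
    for y in ys:
        # Predict the state and its variance one step ahead.
        x_pred = A * x + mu
        P_pred = A * P * A + Q
        # Innovation variance and Kalman gain.
        S = H * P_pred * H + R
        K = P_pred * H / S
        # Update with the innovation y - (H*x_pred + G).
        x = x_pred + K * (y - H * x_pred - G)
        P = (1.0 - K * H) * P_pred
        estimates.append(x)
    return estimates, P

# Noise-free observations of a constant state x = 0.5 through y = 2*x + 1.
observations = [2.0] * 200
filtered, post_var = kalman_filter(
    observations, A=1.0, mu=0.0, H=2.0, G=1.0, Q=0.01, R=0.1, x0=0.0, P0=1.0
)
```

In the multi-factor setting the same steps apply with matrices in place of scalars, and the posterior variance converges to a fixed point of the Riccati recursion.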

81 / 85 Innovations Process

We denote innovation process as

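\[
\nu_i^\theta = Y_i - H^\theta A^\theta\hat X_{i-1}^\theta - G^\theta ,
\]

which is an iid normal random variable under the null hypothesis that $\theta$ is the true parameter value,

\[
\nu_i^\theta \sim \text{iid}\,N\left(0,\ H^\theta A^\theta\Omega^\theta(H^\theta A^\theta)^{tr} + H^\theta Q^\theta(H^\theta)^{tr} + R\right) .
\]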

82 / 85 Maximum Likelihood Estimation (MLE)

Hence there is the log-likelihood function,

\[
L(Y_{1:N}\,|\,\theta, R) = -\frac{1}{2}\sum_{i=1}^{N}\left\|\left(H^\theta A^\theta\Omega^\theta(H^\theta A^\theta)^{tr} + H^\theta Q^\theta(H^\theta)^{tr} + R\right)^{-1/2}\nu_i^\theta\right\|^2
- \frac{N}{2}\ln\left|H^\theta A^\theta\Omega^\theta(H^\theta A^\theta)^{tr} + H^\theta Q^\theta(H^\theta)^{tr} + R\right| ,
\]

where $|\cdot|$ denotes matrix determinant. The maximum likelihood estimate (MLE) is

\[
(\hat\theta, \hat R)_{mle} = \arg\max_{\theta, R} L(Y_{1:N}\,|\,\theta, R) .
\]

In practice filtering makes it difficult to implement code for finding an MLE, so instead there are iterated algorithms such as expectation maximization (EM) that, while suboptimal, will converge to a reasonable parameter estimate.

83 / 85 Iterative Scheme

Break the parameter space into the risk neutral and real-world parameters,

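\[
\theta = (\theta_1, \theta_2)
\]

where

\[
\theta_1 = (V^\infty, \kappa_1, \kappa_2, \bar\sigma_1, \bar\sigma_2, \rho) , \qquad
\theta_2 = (\kappa_1^p, \kappa_2^p, \mu_1, \mu_2) .
\]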

The following is an iteration method that works:

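84 / 85 Algorithm (Parameter Estimation)

Initialize with parameter estimates $\hat\theta^{(0)} = (\hat\theta_1^{(0)}, \hat\theta_2^{(0)})$ and $\hat R^{(0)}$.

1. Compute the Kalman filter using $\hat\theta^{(0)}$ and $\hat R^{(0)}$, and re-estimate $\theta_1$,

\[
\hat\theta_1 = \arg\min_{\theta_1}\sum_{i=1}^{N}\left\|Y_i - H^\theta\hat X_i^{\hat\theta^{(0)}} - G^\theta\right\|^2 ,
\]

and replace $\hat\theta_1^{(0)}$ with $\hat\theta_1$;

2. Re-estimate $\theta_2$ and matrix $R$ with least-squares estimators,

\[
\hat X_i^{\theta,lsq} = \left((H^\theta)^{tr}H^\theta\right)^{-1}(H^\theta)^{tr}(Y_i - G^\theta) ,
\]

residuals $\eta_i^\theta = Y_i - H^\theta\hat X_i^{\theta,lsq}$ with $\bar\eta^\theta = \frac{1}{N}\sum_i\eta_i^\theta$, and covariance

\[
\hat\theta_2 = \arg\min_{\theta_2}\sum_{i=1}^{N}\left\|(Q^\theta)^{-1/2}\left(X_{i+1}^{lsq} - A^\theta X_i^{lsq} - \mu^\theta\right)\right\|^2 , \qquad
\hat R = \frac{1}{N}\sum_{i=1}^{N}\left(\eta_i^\theta - \bar\eta^\theta\right)\left(\eta_i^\theta - \bar\eta^\theta\right)^{tr} .
\]

Replace $\hat\theta_2^{(0)}$ with $\hat\theta_2$, replace $\hat R^{(0)}$ with $\hat R$, and repeat from step #1.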

85 / 85

Applebaum, D. (2004). Lévy Processes and Stochastic Calculus. Cambridge University Press, Cambridge UK.

Bayer, C., Gatheral, J., and Karlsmark, M. (2013). Fast Ninomiya-Victoir calibration of the double-mean-reverting model. Quantitative Finance, 13(11):1813–1829.

Bergomi, L. (2005). Smile dynamics II. Risk, pages 67–73.

Bergomi, L. (2008). Smile dynamics III. Risk, pages 90–96.

Brenner, M., Izhakian, Y., and Sade, O. (2011). Ambiguity and overconfidence. SSRN 2284652.

Gatheral, J., Jaisson, T., and Rosenbaum, M. (2014). Volatility is rough. Available at SSRN 2509457.

Ould Aly, S. M. (2014). Forward variance dynamics: Bergomi’s model revisited. Applied Mathematical Finance, 21(1):84–107.

Papanicolaou, A. (2016). Analysis of VIX markets with a time-spread portfolio. Applied Mathematical Finance, 23(5):374–408.