Calibrating the SABR Model to Noisy FX Data

Kellogg College University of Oxford

A thesis submitted in partial fulfillment of the MSc in Mathematical

Hilary 2018

Abstract

We consider the problem of fitting the SABR model to an FX smile. It is demonstrated that the model parameter β cannot be determined from a log-log plot of σATM against F. It is also shown that, in an FX setting, the SABR model has a single state variable. A new method is proposed for fitting the SABR model to observed quotes. In contrast to the fitting techniques proposed in the literature, the new method allows all the SABR parameters to be retrieved and does not require prior beliefs about the market. The effect of noise on the new fitting technique is also investigated.

Acknowledgements

I would like to thank both of my supervisors Dr Daniel Jones and Guillaume Bigonzi for their guidance and support throughout this project. They provided the direction for this work and offered invaluable insight and advice along the way. I also gratefully acknowledge the financial support from Bank Julius Baer and Co. Ltd. Finally I thank Dr Beate Solleder for her love and tireless support and for tolerating me during the time it has taken me to complete this project.

1 Introduction

The work presented here is concerned with fitting the SABR model to foreign exchange (FX) data. Specifically, we are interested in the implied volatility of an option, σ, as a function of the strike of the option, K. This relationship between σ and K is known as the volatility smile. For specified SABR model parameters, the volatility smile is given by the well known equation of Hagan et al. [1]. Here we focus on the inverse problem: given a volatility smile which was generated using the SABR model, how can we obtain the parameters of the underlying SABR model?

In section 2 we introduce the quoting conventions used in the FX market and define the three options which are commonly used to describe the FX volatility smile. A method for calibrating a volatility smile to market quotes is also described. The SABR model is presented in section 3 and the equations which will be used throughout this work are stated. Section 4 reviews previous works related to fitting the SABR model to market data and section 5 gives details of the Monte Carlo method which was used to generate simulated market data. In this work we have focused on fitting simulated data since this removes any uncertainty regarding whether the SABR model accurately describes the market and what the true model parameters are.

A key topic when fitting the SABR model to market data is the determination of the parameter β. In section 6 we explain why an approach to fitting β that is often described in the literature does not produce reliable results. Section 7 examines whether variance-covariance matching can be used to estimate the SABR parameters from time series of σATM and F. The relationship between the three main FX options quotes that would be predicted by the SABR model is investigated in section 8. These predictions are also compared to sample market data. An approximation for the correlation between the implied volatility at-the-money and the underlying exchange rate is considered in section 9. This correlation is important because it allows vega exposure to be partially hedged with delta.

A new method for fitting the SABR model to FX data is proposed in section 10 and the ability of this method to retrieve the parameters of the underlying SABR model is investigated. We begin with the case that the quotes are free from noise and then systematically introduce noise on each of the three main option quotes. Section 11 considers the case of pricing a digital option when the volatility smile is described using the SABR model. Conclusions are drawn in section 12, which also includes suggestions for further work.

2 Introduction to FX Market Conventions

The foreign exchange (FX) market is one of the most liquid and competitive markets in the world. Because many of the FX market conventions are unique to this market, this section provides a brief introduction to these conventions. In the FX market participants agree to exchange one currency for another on a specified day at a specified FX rate. An FX rate is the price of one currency expressed in terms of another currency. Consider the currency pair XXXYYY. The tag XXX represents the “foreign” currency, while YYY represents the “domestic” currency. The FX rate XXXYYY specifies the price of the foreign currency in terms of the domestic currency. For example, EURUSD specifies the price of one Euro in US dollars.

This section begins with an explanation of the delta conventions which are used for quoting options in the FX market. Thereafter we introduce three commonly traded options structures: the at-the-money straddle, risk reversal and vega-weighted butterfly. These three structures are particularly important because they are often used to define the volatility smile in the FX market. The section concludes with a description of a method for calibrating a volatility smile to observed market prices.

2.1 Option Quotes in the FX Market

Options in the FX market are not typically quoted in terms of strike, K, but in terms of the delta of the option, assuming a Black-Scholes (BS) model. The delta of an option in the BS model is given by

$$\Delta = \frac{\partial V}{\partial S} = w\, e^{-\int_t^T r^f_s\, ds}\, \Phi(w d_1), \tag{1}$$

where

$$d_1 = \frac{\ln\left(\frac{F(t,T)}{K}\right) + \frac{\sigma^2}{2}(T-t)}{\sigma\sqrt{T-t}} \tag{2}$$

and Φ is the normal cumulative distribution function. Here V is the value of the option and S is the current (spot) exchange rate. F(t, T) is the forward price of the exchange rate at time t expiring at time T, and w takes the value w = 1 for a call and w = −1 for a put. r^f_t is the risk free rate of the foreign currency at time t.

Although FX options are quoted in terms of ∆, options are actually written with a specified strike. Therefore we need to be able to convert ∆ into the corresponding strike. Re-arranging equation (1) leads to the following expression for K:

$$K = F(t,T) \exp\left( -w\,\sigma\sqrt{T-t}\;\Phi^{-1}\!\left(\frac{|\Delta|}{e^{-\int_t^T r^f_s\, ds}}\right) + \frac{\sigma^2 (T-t)}{2} \right). \tag{3}$$

In the context of this work, equation (3) is important because the SABR model gives the implied volatility as a function of K, rather than ∆.

FX option quotes are further complicated by the use of different definitions of delta depending on the market convention of the currency pair being traded. For some currency pairs the market convention is to quote the premium in the foreign currency, e.g. a vanilla option on the USDJPY pair is quoted in USD. Since the premium is in foreign currency, the premium itself should be hedged. Therefore the market convention is to use the premium included delta, which is given by

$$\Delta_{PI} = \Delta - \frac{V}{S}. \tag{4}$$

Consider the case that we write a call option on the USDJPY pair. At expiry this gives the buyer the right to purchase USD at a price specified by the strike K, which is in JPY. To make our position (instantaneously) risk free with respect to S we should hold ∆ USD given by equation (1). However, as the option writer, we receive the option premium given by V/S, where V is the value of the option in JPY and V/S is the premium in USD. The premium included delta is the amount of USD that we need to hold in addition to the premium, which leads to equation (4).

Finding the value of K which corresponds to a specified value of ∆PI is more involved because both ∆ and V depend on K. Castagna [2] proposed the following method based on Newton’s method to calculate K:

1. Calculate an initial estimate of K using equation (3)

2. Calculate ∆PI for the current value of Ki using equation (4)

3. Estimate the derivative of ∆PI with respect to Ki by “bumping” Ki by a small amount (e.g. 1%) and re-evaluating ∆PI for this new value of K

4. Calculate Ki+1 as

$$K_{i+1} = K_i - \frac{\Delta_{PI}(K_i) - \bar{\Delta}}{\partial \Delta_{PI} / \partial K}, \tag{5}$$

where ∆¯ is the target value of ∆PI.

5. Iterate until |Ki+1 − Ki| < ε, where ε is a tolerance parameter.
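The five steps above can be sketched in Python as follows. This is a minimal illustration rather than the implementation used in this work: the function names, the discount-factor arguments `df_for` and `df_dom` (standing for the foreign and domestic discount factors), and the default bump and tolerance are all assumptions made for the example.

```python
import math
from statistics import NormalDist

PHI = NormalDist()  # standard normal: cdf is Phi, inv_cdf is its inverse


def strike_from_delta(delta, F, sigma, tau, df_for, w):
    """Equation (3): strike for a standard (premium excluded) BS delta."""
    q = PHI.inv_cdf(abs(delta) / df_for)
    return F * math.exp(-w * sigma * math.sqrt(tau) * q + 0.5 * sigma**2 * tau)


def delta_pi(K, F, S, sigma, tau, df_for, df_dom, w):
    """Equation (4): premium included delta, Delta - V / S."""
    s = sigma * math.sqrt(tau)
    d1 = (math.log(F / K) + 0.5 * s * s) / s
    d2 = d1 - s
    delta = w * df_for * PHI.cdf(w * d1)  # equation (1)
    value = w * df_dom * (F * PHI.cdf(w * d1) - K * PHI.cdf(w * d2))
    return delta - value / S


def strike_from_delta_pi(target, F, S, sigma, tau, df_for, df_dom, w,
                         bump=0.01, eps=1e-10, max_iter=100):
    """Newton iteration of equation (5) with a bumped derivative."""
    K = strike_from_delta(target, F, sigma, tau, df_for, w)        # step 1
    for _ in range(max_iter):
        f0 = delta_pi(K, F, S, sigma, tau, df_for, df_dom, w)      # step 2
        f1 = delta_pi(K * (1.0 + bump), F, S, sigma, tau, df_for, df_dom, w)
        slope = (f1 - f0) / (K * bump)                             # step 3
        K_next = K - (f0 - target) / slope                         # step 4
        if abs(K_next - K) < eps:                                  # step 5
            return K_next
        K = K_next
    raise RuntimeError("Newton iteration did not converge")


# Illustrative inputs (assumed values, not market data).
S, tau, sigma = 1.0, 0.25, 0.10
df_for, df_dom = math.exp(-0.01 * tau), math.exp(-0.02 * tau)
F = S * df_for / df_dom
K_std = strike_from_delta(0.25, F, sigma, tau, df_for, 1)
K_pi = strike_from_delta_pi(0.25, F, S, sigma, tau, df_for, df_dom, 1)
print(K_std, K_pi)
```

Because the derivative is estimated with a fixed relative bump, the iteration converges geometrically rather than quadratically; a smaller bump would bring it closer to a true Newton step.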

2.2 At-the-money Straddle

The most liquid FX option is the at-the-money (ATM) straddle. This structure consists of a call and put both struck at the “at-the-money” level. The definition of the ATM strike depends on market conventions. One choice is the zero delta ATM strike, which is defined as the strike that leads to the call and the put having the same delta (but with opposite sign). Another possible definition of the ATM strike is the ATM forward. Under this convention the ATM strike is set equal to the forward price of the underlying currency pair with the same expiry as the option. By no-arbitrage the forward price is given by

$$F(t,T) = S_t\, \frac{e^{-\int_t^T r^f_s\, ds}}{e^{-\int_t^T r^d_s\, ds}}, \tag{6}$$

where r^d_t is the risk free rate of the domestic currency at time t. The final definition of the ATM strike is the at-the-money spot, where the ATM strike is defined to be S, the current spot rate of the underlying pair.

The at-the-money straddle describes the level of the implied volatility surface: changing the ATM volatility results in a parallel shift of the implied volatility surface along the implied volatility axis.

2.3 Risk Reversal

A risk reversal is a highly-traded structure consisting of a long call and a short put. The call and put are symmetric in that they are chosen to have the same delta (but with opposite sign). The most commonly traded risk reversal contract is the 25 delta contract, where the call and put are struck such that they have deltas of 0.25 and −0.25, respectively. In the market, the risk reversal is quoted as the difference between the implied volatilities of the call and the put, i.e.

σ25RR(t, T ) = σ25C (t, T ) − σ25P (t, T ). (7)

The risk reversal can be either positive or negative and describes the skew of the implied volatility surface. A positive risk reversal indicates that there is more demand for calls than puts, whereas a negative risk reversal suggests that puts are favoured over calls.

2.4 Butterfly

A vega-weighted butterfly (VWB) is a highly-traded structure consisting of a long call, a long put and a short ATM straddle. The long call and long put are again symmetric in delta and together form a strangle. For the most commonly traded butterfly, delta is again chosen to be 0.25 for the call and −0.25 for the put. This is referred to as the 25 delta butterfly. The vega of the ATM straddle is larger than that of the strangle, meaning that the quantity of the strangle needs to be larger than the quantity of the straddle in order that the structure is vega neutral. The market quote for the VWB is defined as the difference between the volatility of the strangle and the volatility of the ATM straddle (σATM).

Market quotes for the vega-weighted butterfly are complicated by the existence of two conventions for the strangle. The most straightforward definition of the strangle is to use the same put and call options which were used for the risk reversal. This results in the ‘two-vol’ butterfly, which is defined as

$$\sigma_{25BF}(t,T) = \frac{1}{2}\left[\sigma_{25C}(t,T) + \sigma_{25P}(t,T)\right] - \sigma_{ATM}(t,T). \tag{8}$$

Under this convention the volatility of the strangle is defined as the mean of the volatilities of the put and the call. However, the most common market quote for the VWB is not the two-vol butterfly, but the single-vol butterfly. In this case the volatilities of the put and the call are chosen to be equal to one another. Define σVWB to be the volatility of the put and the call for the single-vol strangle. The market quote for the single-vol butterfly is then

σ1−vol−25BF(t, T ) = σVWB(t, T ) − σATM(t, T ). (9)

For σ25RR = 0, equation (8) reduces to equation (9) and the two conventions for the VWB are equivalent. In general, however, the two definitions are not equivalent and the discrepancy between σ25BF and σ1−vol−25BF tends to increase as the magnitude of σ25RR increases. Either σ25BF or σ1−vol−25BF can be used to construct a volatility smile. What is important is to understand which convention is being used and how to interpret the market quotes in terms of the constraints that they place on the volatility smile. For simplicity the majority of this work has been performed using the two-vol butterfly, σ25BF. This choice means that fewer strikes are required in the calibration process. Section 2.5 describes how to calibrate a volatility smile using market quotes for σ1−vol−25BF, which is the case that will be most frequently encountered in practice.

The butterfly describes the curvature of the implied volatility surface; a high value of σ25BF(t, T) implies that the implied volatility in the wings is large compared to the implied volatility at-the-money.
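Under the two-vol convention, equations (7) and (8) can be inverted directly to recover the two 25 delta volatilities from the three quotes. The short Python sketch below illustrates this; the function name and the sample quotes are assumptions made for the example.

```python
def two_vol_quotes_to_vols(sigma_atm, sigma_rr, sigma_bf):
    """Invert equations (7) and (8) for the 25 delta call and put volatilities."""
    sigma_call = sigma_atm + sigma_bf + 0.5 * sigma_rr
    sigma_put = sigma_atm + sigma_bf - 0.5 * sigma_rr
    return sigma_call, sigma_put


# Illustrative quotes (assumed values): 10% ATM, -1% risk reversal, 0.3% butterfly.
sigma_call, sigma_put = two_vol_quotes_to_vols(0.10, -0.01, 0.003)
print(sigma_call, sigma_put)
```

With a negative risk reversal the put volatility exceeds the call volatility, which matches the interpretation of the risk reversal as a measure of skew.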

2.5 Building the Volatility Smile from Market Data

The volatility smile is a mapping between strike, K, and implied volatility:

K ↦ σ(K). (10)

In this section it is assumed that we have a functional form for σ(K) which we wish to fit to market quotes for σATM, σ25RR and σ1−vol−25BF. This is the case that is most frequently encountered in practice. It is assumed further that, given three points on the volatility smile, we can fit the function σ(K) such that we can obtain the volatility for any K ≥ 0. Although this work focuses on fitting the SABR model, the method described below can be applied to any functional form which meets these criteria. Examples include the vanna-vega interpolation method proposed by Castagna [2] and the simplified parabolic interpolation method introduced by Reiswich [3].

Constructing the volatility smile from market data is achieved by recognising the three constraints placed on the smile by the three options quotes discussed above (σATM, σ25RR and σ1−vol−25BF). These constraints are described in detail by Reiswich [3] and Castagna [2]. The market quote for σATM provides the constraint

σ(KATM) = σATM, (11)

where KATM is determined by market conventions. To ensure that σ25RR is priced correctly by the volatility smile we have

σ(K25C) − σ(K25P) = σ25RR. (12)

Here the strikes K25C and K25P fulfil

$$\Delta^*(K_{25C}, \sigma(K_{25C})) = 0.25, \qquad \Delta^*(K_{25P}, \sigma(K_{25P})) = -0.25. \tag{13}$$

The function ∆∗(K, σ) is either the standard delta or the premium included delta and is determined by market conventions. The final constraint is that the value of a VWB priced by the volatility smile should match the price quoted in the market. Here we assume that the market quote for the VWB uses the single volatility convention. The put and call that make up the VWB have a volatility given by

σVWB(t, T ) = σ1−vol−25BF(t, T ) + σATM(t, T ). (14)

The strikes of these options can be found by solving the equations

$$\Delta^*(K_{25C}, \sigma_{VWB}) = 0.25, \qquad \Delta^*(K_{25P}, \sigma_{VWB}) = -0.25 \tag{15}$$

for K25C and K25P. The value of the strangle component of the VWB is:

C(K25C, σVWB) + P (K25P, σVWB). (16)

Here C(K, σ) and P(K, σ) are the Black-Scholes prices of a call and a put option, respectively, with strike K and volatility σ. The volatility smile must be able to reproduce the price of this strangle, which leads to

$$C(K_{25C}, \sigma_{VWB}) + P(K_{25P}, \sigma_{VWB}) = C\big(K_{25C}, \sigma(K_{25C})\big) + P\big(K_{25P}, \sigma(K_{25P})\big). \tag{17}$$

Castagna [2] proposed the following method to generate a volatility smile based on market prices of the ATM straddle, risk reversal and single-vol butterfly. First the ATM strike is determined from σATM. If the market convention is a zero delta ATM strike and the premium is not included in delta, KATM is given by

$$K_{ATM} = F(t,T)\, e^{\frac{1}{2}\sigma_{ATM}^2 (T-t)}. \tag{18}$$

For a put and a call to have the same strike and absolute value of delta we require Φ(d1) = Φ(−d1), which implies d1 = 0. Equation (18) arises from re-arranging equation (2) with d1 = 0. When the premium is included, the ATM strike is calculated as

$$K_{ATM} = F(t,T)\, e^{-\frac{1}{2}\sigma_{ATM}^2 (T-t)}. \tag{19}$$

Equation (19) arises from equating the sum of ∆PI for a put and a call to zero. This leads to Φ(d2) = Φ(−d2) where

$$d_2 = \frac{\ln\left(\frac{F(t,T)}{K}\right) - \frac{\sigma^2}{2}(T-t)}{\sigma\sqrt{T-t}}. \tag{20}$$

Re-arranging equation (20) with d2 = 0 yields equation (19).

If market convention dictates that the strike is either the forward price or the spot price, then KATM can be observed directly in the market. The 25 delta strikes for the 25 delta VWB are calculated as

$$K_{25P} = F(t,T) \exp\left( \sigma_{VWB}\sqrt{T-t}\;\Phi^{-1}\!\left(0.25\, e^{\int_t^T r^f_s\, ds}\right) + \frac{1}{2}\sigma_{VWB}^2 (T-t) \right), \tag{21}$$

$$K_{25C} = F(t,T) \exp\left( -\sigma_{VWB}\sqrt{T-t}\;\Phi^{-1}\!\left(0.25\, e^{\int_t^T r^f_s\, ds}\right) + \frac{1}{2}\sigma_{VWB}^2 (T-t) \right). \tag{22}$$

When the premium is included in the delta, K25P and K25C must be calculated using the procedure described in section 2.1.

Next an iterative procedure is used to determine the two 25 delta volatilities in terms of an equivalent VWB volatility σ_e^i. This procedure ensures that the price of a VWB is equal to the sum of the prices of the call and put options from which it is composed. The iterative procedure requires values for the first two iterations of σ_e^i to be specified. This is because the derivative of the fitting error with respect to σ_e^i is estimated using finite differences. Initial values of σ_e^0 = σ1−vol−25BF and σ_e^1 = σ_e^0 + 10^−4 are typically chosen. FX volatilities are normally around 10%, meaning that a “bump” size of 10^−4 will normally yield a satisfactory approximation of the derivative of the fitting error with respect to σ_e^i. The iterative procedure consists of the following steps:

1. Calculate the implied 25 delta volatilities as

$$\sigma_{25P} = \sigma_{ATM} + \sigma_e^i - \tfrac{1}{2}\sigma_{25RR}, \qquad \sigma_{25C} = \sigma_{ATM} + \sigma_e^i + \tfrac{1}{2}\sigma_{25RR},$$

so that the risk reversal constraint of equation (12) is met for every value of σ_e^i.

2. Determine the strikes corresponding to these volatilities

$$K^i_{25P} = F(t,T) \exp\left( \sigma_{25P}\sqrt{T-t}\;\Phi^{-1}\!\left(0.25\, e^{\int_t^T r^f_s\, ds}\right) + \frac{1}{2}\sigma_{25P}^2 (T-t) \right),$$

$$K^i_{25C} = F(t,T) \exp\left( -\sigma_{25C}\sqrt{T-t}\;\Phi^{-1}\!\left(0.25\, e^{\int_t^T r^f_s\, ds}\right) + \frac{1}{2}\sigma_{25C}^2 (T-t) \right).$$

When the premium is included in delta, the procedure described in section 2.1 must be used to determine K^i_25P and K^i_25C.

3. Calibrate the function σ(K) to the volatilities at K^i_25P, K^i_25C and KATM. Use the calibrated curve to find the implied volatilities at K25P and K25C.

4. Calculate the price difference between a butterfly strangle calculated using σVWB and the same strangle priced using the volatilities found above:

$$E^i = C(K_{25C}, \sigma(K_{25C})) + P(K_{25P}, \sigma(K_{25P})) - C(K_{25C}, \sigma_{VWB}) - P(K_{25P}, \sigma_{VWB}).$$

5. If this is not the first iteration then update σ_e^i using Newton’s method:

$$\sigma_e^{i+1} = \sigma_e^i - \frac{E^i}{\partial E^i / \partial \sigma_e^i}, \qquad \text{where} \qquad \frac{\partial E^i}{\partial \sigma_e^i} \approx \frac{E^i - E^{i-1}}{\sigma_e^i - \sigma_e^{i-1}}.$$

6. Iterate until |E^i| < ε for a suitably small value of ε. Note that, since the initial bump size (in this case chosen to be 10^−4) serves only to allow ∂E^i/∂σ_e^i to be estimated, ε can be chosen independently from the choice for the initial bump.

3 The SABR model

This work is concerned with calibrating the SABR model to FX data. This section introduces the SABR model and quotes the formulae which will be used throughout this work. The SABR model was proposed by Hagan et al. [1]. It is a stochastic volatility model which describes the evolution of the forward price of an asset, F (t), as

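$$\begin{aligned} dF &= \alpha F^{\beta}\, dW^1, & F(t=0) &= F_0, \\ d\alpha &= v\alpha\, dW^2, & \alpha(t=0) &= \alpha_0, \end{aligned} \tag{23}$$

where W^1 and W^2 are two correlated Brownian motions with

$$d\langle W^1, W^2 \rangle = \rho\, dt. \tag{24}$$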

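Hagan et al. [1] showed that for this model the implied volatility of an option with strike K can be approximated by

$$\sigma(K,F) = \frac{\alpha_0}{(FK)^{(1-\beta)/2}\left[1 + \frac{(1-\beta)^2}{24}\log^2\frac{F}{K} + \frac{(1-\beta)^4}{1920}\log^4\frac{F}{K} + \dots\right]} \cdot \left(\frac{z}{x(z)}\right) \cdot \left\{1 + \left[\frac{(1-\beta)^2}{24}\frac{\alpha_0^2}{(FK)^{1-\beta}} + \frac{1}{4}\frac{\rho\beta\alpha_0 v}{(FK)^{(1-\beta)/2}} + \frac{2-3\rho^2}{24}v^2\right](T-t) + \dots\right\}, \tag{25}$$

where

$$z = \frac{v}{\alpha_0}(FK)^{(1-\beta)/2}\log\frac{F}{K} \tag{26}$$

and

$$x(z) = \log\left\{\frac{\sqrt{1 - 2\rho z + z^2} + z - \rho}{1 - \rho}\right\}. \tag{27}$$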

Note that the stochastic processes α and F are treated somewhat inconsistently in equation (25). While it is stressed that σ depends on the value of α at t = 0, F enters equation (25) as a process. In practice, we would apply equation (25) to calculate σ when t = 0 and α and F are known, i.e. to be consistent with the treatment of α, the F entering equation (25) should be F0. However, when analysing the SABR model it is often useful to consider what happens as F0 and α0 vary. To this end we abuse notation and drop the subscripts in equation (25). That is, the α and F entering equation (25) are stochastic processes governed by equation (23), and whenever we wish to evaluate σ we set t = 0 and observe the current realisations of α and F.
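Equations (25) to (27) can be evaluated directly once the series are truncated at the terms shown. The Python sketch below is one way to do this; the function name and argument order are assumptions made for the example, and the limit z/x(z) → 1 is used when z is numerically zero (for instance at K = F).

```python
import math


def hagan_vol(K, F, tau, alpha, beta, rho, v):
    """Implied volatility from equations (25)-(27), truncated as printed."""
    omb = 1.0 - beta
    log_fk = math.log(F / K)
    fk_mid = (F * K) ** (omb / 2.0)

    # Denominator of the leading factor in equation (25).
    denom = fk_mid * (1.0 + omb**2 / 24.0 * log_fk**2
                      + omb**4 / 1920.0 * log_fk**4)

    # z / x(z) from equations (26) and (27); the ratio tends to 1 as z -> 0.
    z = v / alpha * fk_mid * log_fk
    if abs(z) < 1e-12:
        z_over_x = 1.0
    else:
        x = math.log((math.sqrt(1.0 - 2.0 * rho * z + z * z) + z - rho)
                     / (1.0 - rho))
        z_over_x = z / x

    # Coefficient of (T - t) in equation (25).
    correction = (omb**2 / 24.0 * alpha**2 / fk_mid**2
                  + 0.25 * rho * beta * alpha * v / fk_mid
                  + (2.0 - 3.0 * rho**2) / 24.0 * v**2)

    return alpha / denom * z_over_x * (1.0 + correction * tau)


# Illustrative call with assumed inputs: F = 1.2, alpha = 0.07, beta = 1,
# rho = 0.1, v = 1.0 and three months to expiry.
for moneyness in (0.9, 1.0, 1.1):
    print(moneyness, hagan_vol(moneyness * 1.2, 1.2, 0.25, 0.07, 1.0, 0.1, 1.0))
```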

For the special case of options struck at the forward price, equation (25) reduces to

$$\sigma_{ATM} = \sigma(F,F) = \frac{\alpha}{F^{1-\beta}}\left\{1 + \left[\frac{(1-\beta)^2}{24}\frac{\alpha^2}{F^{2-2\beta}} + \frac{1}{4}\frac{\rho\beta\alpha v}{F^{1-\beta}} + \frac{2-3\rho^2}{24}v^2\right](T-t) + \dots\right\}. \tag{28}$$

Note that we have assumed that the ATM strike is given by the forward price, F. For simplicity, this convention will be adopted throughout the remainder of this work.

Hagan et al. [1] noted that the (T − t) term is usually less than 1 or 2 %. Table 1 shows the order of magnitude of the SABR parameters for a typical FX volatility smile. For these values, the (T − t) term is dominated by the final term and is approximately (2/24) v^2 (T − t). This term is dimensionless (as expected) and is approximately 2%, which is in agreement with Hagan et al. [1].

Table 1: Typical SABR parameters for FX options.

F      α      ρ      β      v      T − t
1.0    0.1    0.1    1.0    1.0    0.25
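As a quick check of the magnitude quoted above, the (T − t) term of equation (28) can be evaluated for the Table 1 values. The snippet below is only a worked piece of arithmetic; the variable names are assumptions made for the example.

```python
# Table 1 values.
F, alpha, rho, beta, v, tau = 1.0, 0.1, 0.1, 1.0, 1.0, 0.25

# The three contributions inside the square bracket of equation (28).
term_beta = (1 - beta) ** 2 / 24 * alpha**2 / F ** (2 - 2 * beta)
term_cross = 0.25 * rho * beta * alpha * v / F ** (1 - beta)
term_volvol = (2 - 3 * rho**2) / 24 * v**2

bracket = term_beta + term_cross + term_volvol
correction = bracket * tau

print(f"(T - t) correction: {correction:.4f}")
print(f"share from the v^2 term: {term_volvol / bracket:.1%}")
```

For these inputs the correction comes out at roughly 2.1%, almost all of it from the v^2 term.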

3.1 The “Backbone” of the Volatility Smile

In the context of the SABR model, the term “backbone” is used to describe the curve traced out by σATM as the forward price varies. Hagan et al. [1] argued that the (T − t) term in equation (28) can usually be ignored when analysing the behaviour of the backbone. Taking logarithms of equation (28) and ignoring the (T − t) term gives

$$\log(\sigma_{ATM}) = \log(\alpha) - (1-\beta)\log(F). \tag{29}$$

Equation (29) indicates that σATM ∝ F^{−(1−β)}. To gain insight into this relationship, consider the two limiting cases of β = 1 and β = 0. When β = 1, F can be written as

$$F_T = F_0 \exp\left( \int_0^T \alpha\, dw - \frac{1}{2}\int_0^T \alpha^2\, ds \right). \tag{30}$$

Assuming zero interest rate, the value of a call option on F with strike K = F0 is

$$C(F_0) = E\left[\max(F_T - F_0, 0)\right] = F_0\, E\left[\max\left( \exp\left( \int_0^T \alpha\, dw - \frac{1}{2}\int_0^T \alpha^2\, ds \right) - 1,\; 0 \right)\right].$$

The expectation is independent of F0, meaning that the option price is proportional to the forward price. For β = 0, F can be written as

$$F_T = F_0 + \int_0^T \alpha\, dw. \tag{31}$$

In this case the value of a call option on F with strike K = F0 is

$$C(F_0) = E\left[\max\left( \int_0^T \alpha\, dw,\; 0 \right)\right]. \tag{32}$$

The expectation is again independent of F0, meaning that the option price is independent of F0. For a call struck at-the-money, the Black-Scholes price is

$$C(F, t) = F N(d_1) - F_0 N(d_2), \tag{33}$$

where

$$d_1 = -d_2 = \frac{1}{2}\sigma_{imp}\sqrt{T-t}. \tag{34}$$

At inception, t = 0 and F = F0. Rearranging gives

$$N(d_1) = \frac{C(F_0, 0)}{2F_0} + \frac{1}{2}. \tag{35}$$

Therefore the implied volatility is given by

$$\sigma_{imp} = \frac{2}{\sqrt{T-t}}\, N^{-1}\!\left( \frac{C(F_0, 0)}{2F_0} + \frac{1}{2} \right). \tag{36}$$

For small C(F0, 0)/F0, σimp is approximately linear in C(F0, 0)/F0. We can use this relation to convert the trends we have noted for C(F0) into trends for σimp. Thus, for β = 1, we expect that σimp is independent of F0, whereas for β = 0, we predict that σimp is proportional to F0^{−1}.

Hagan et al. [1] state that the “backbone”, which they define as the curve that σATM traces as F varies, is determined almost entirely by β: β = 1 gives a flat backbone, whereas β = 0 produces a downward sloping backbone. This behaviour, which is described by equation (29), is exactly what we obtained above by considering the behaviour of σimp as a function of F0. We prefer to define the backbone as the curve that σimp traces as F0 varies because this emphasises that all other parameters, and in particular α, are held constant. In practice, the backbone will be difficult to observe in market data because the Brownian motions driving F and α are correlated. This is discussed in more detail in section 6.
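The inversion in equation (36), and the near-linear relation between σimp and C(F0, 0)/F0, can be illustrated with a short Python sketch. The function names and the sample inputs are assumptions made for the example; the standard normal distribution comes from the Python standard library.

```python
import math
from statistics import NormalDist

PHI = NormalDist()  # standard normal: cdf is N, inv_cdf is its inverse


def atm_call_price(F0, sigma, tau):
    """Equations (33) and (34) at inception (F = F0, zero interest rate)."""
    d1 = 0.5 * sigma * math.sqrt(tau)
    return F0 * (PHI.cdf(d1) - PHI.cdf(-d1))


def atm_implied_vol(price, F0, tau):
    """Equation (36): exact inversion for an at-the-money call."""
    return 2.0 / math.sqrt(tau) * PHI.inv_cdf(price / (2.0 * F0) + 0.5)


def atm_implied_vol_linear(price, F0, tau):
    """Small-price approximation: N^{-1}(1/2 + x) is about sqrt(2 pi) x."""
    return math.sqrt(2.0 * math.pi / tau) * price / F0


# Illustrative round trip with assumed inputs.
price = atm_call_price(1.2, 0.07, 0.25)
print(atm_implied_vol(price, 1.2, 0.25))
print(atm_implied_vol_linear(price, 1.2, 0.25))
```

The linear form makes the argument in the text explicit: a price proportional to F0 gives a constant σimp, while a price independent of F0 gives σimp proportional to 1/F0.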

3.2 Refinement of the SABR model

Obloj [4] compared the formulae presented by Hagan et al. [1] and Berestycki et al. [5] and found a discrepancy for β < 1. Based on this analysis, Obloj [4] proposed a corrected version of the formula derived by Hagan et al. Obloj [4] wrote the implied volatility as a Taylor expansion in the time to expiry (T − t) as

$$\sigma(K,F) = \sigma^0(K,F)\left[1 + \sigma^1(K,F)(T-t)\right] + O\big((T-t)^2\big), \tag{37}$$

where

$$\sigma^1(K,F) = \frac{(1-\beta)^2}{24}\frac{\alpha^2}{(FK)^{1-\beta}} + \frac{1}{4}\frac{\rho\beta\alpha v}{(FK)^{(1-\beta)/2}} + \frac{2-3\rho^2}{24}v^2,$$

$$\sigma^0(K,F) = \frac{v \ln\frac{F}{K}}{\ln\left(\frac{\sqrt{1 - 2\rho\zeta + \zeta^2} + \zeta - \rho}{1-\rho}\right)},$$

$$\zeta = \frac{v}{\alpha}\,\frac{F^{1-\beta} - K^{1-\beta}}{1-\beta}.$$

Comparing equations (25) and (37) we see that the discrepancy between Hagan et al. and Obloj occurs in the value of σ^0(K, F). The main result of Obloj [4] is to correct the z/x(z) term in equation (25). Simple calculations show that both formulations of the z/x(z) term yield the same result when F = K, or when v = 0, or when β = 1; see Obloj [4] for details. In addition, Obloj truncated the expansion in log(F/K) in the denominator of σ^0(K, F) to leading order. In this work equation (37) will be used to describe the implied volatility of the SABR model.
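A direct transcription of equation (37) into Python is sketched below. The function name and argument order are assumptions made for the example, v > 0 is assumed, and the limits β → 1 (where ζ becomes (v/α) ln(F/K)) and K → F (where σ^0 becomes α/F^{1−β}) are handled explicitly.

```python
import math


def obloj_vol(K, F, tau, alpha, beta, rho, v):
    """Implied volatility from equation (37); assumes v > 0."""
    omb = 1.0 - beta
    log_fk = math.log(F / K)

    # zeta, with the beta -> 1 limit taken explicitly.
    if abs(omb) < 1e-12:
        zeta = v / alpha * log_fk
    else:
        zeta = v / alpha * (F**omb - K**omb) / omb

    # Leading-order term, with the at-the-money limit taken explicitly.
    if abs(log_fk) < 1e-12:
        sigma0 = alpha / F**omb
    else:
        x = math.log((math.sqrt(1.0 - 2.0 * rho * zeta + zeta * zeta)
                      + zeta - rho) / (1.0 - rho))
        sigma0 = v * log_fk / x

    # First-order coefficient in (T - t).
    fk = F * K
    sigma1 = (omb**2 / 24.0 * alpha**2 / fk**omb
              + 0.25 * rho * beta * alpha * v / fk ** (omb / 2.0)
              + (2.0 - 3.0 * rho**2) / 24.0 * v**2)

    return sigma0 * (1.0 + sigma1 * tau)


# Illustrative call with assumed inputs: F = 1.2, alpha = 0.07, beta = 0,
# rho = 0.1, v = 1.0 and three months to expiry.
for moneyness in (0.9, 1.0, 1.1):
    print(moneyness, obloj_vol(moneyness * 1.2, 1.2, 0.25, 0.07, 0.0, 0.1, 1.0))
```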

4 Literature Survey: Calibrating the SABR Model

We now review the literature relating to the main topic of this work: calibration methods for the SABR model. Different methods have been proposed for estimating the SABR parameters from market data. The choice of β, and whether it is fitted to market data or selected in advance, is a particularly important topic in the field of calibrating the SABR model.

Hagan et al. [1] introduced the SABR model and derived equations for the implied volatility as a function of strike for the model. It was noted that the parameters β and ρ affect the volatility smile in similar ways since both influence the skew. Indeed the authors showed an example volatility smile which could be equally well fitted by the SABR model with β = 0 or β = 1. Hagan et al. [1] noted that this redundancy makes it difficult to fit both β and ρ from a single market snapshot. The authors proposed using a log-log plot of historic values of σATM against F to determine β. Based on equation (29) it was argued that β can be found from the gradient of such a plot. Alternatively, Hagan et al. [1] recommended selecting a value of β based on prior beliefs about the market.

West [6] calibrated the SABR model to illiquid South African markets. They used a log-log plot of σATM against F to determine β and found that β was a function of time. The model was also calibrated using a single value of β = 0.7 and it was found that this choice led to more stable values for the other SABR parameters. Based on these data, West [6] recommended fixing β for the life of a contract. We will see in section 6 that β cannot be determined from a log-log plot of historical values of σATM against F because the slope of this plot is also influenced by ρ. Evidence of this can be seen in the data presented by West [6], which show a strong correlation between the time series of β and ρ.

Nowak and Sibetz [7] fitted the Heston and SABR models to FX data. They proposed fitting β using a log-log plot of σATM against F or using a value of β based on prior beliefs. Two approaches were considered for fitting the remaining three parameters. In the first approach α, ρ and v were found by minimising the square error between the market volatility and the SABR volatility. In the second approach it was noted that, for given values of ρ and v, α can be found as the root of equation (28). Thus ρ and v were found by minimising the square error between the market and the model for σRR and σBF where α(ρ, v) was given by equation (28). Method 2 results in a larger mean square error than method 1 but ensures that σATM is fitted exactly.

Le Floc’h and Kennedy [8] state that the β parameter is usually defined from historical series analysis for the relevant market. Having selected a value of β in advance, Le Floc’h and Kennedy [8] fit α, ρ and v by minimising the weighted mean square error in implied volatilities using a Gauss-Newton method. The model was fitted to the volatilities of commonly traded equities and indices and more weight was added to volatilities around σATM ± 20%.

Reiswich [3] compared three different approaches to describing an FX volatility smile: the SABR model, vanna-vega interpolation and a simplified parabolic interpolation. The SABR model was fitted to market data by minimising the mean square relative error between the model and market prices of σATM, σRR and σBF. The main focus of Reiswich’s work was to compare the three methods for describing the volatility smile in terms of how robust they are when fitting to real market data. Although it was noted that β is normally selected in advance, Reiswich [3] preferred to allow the least squares minimisation to select β since this gave more robust results.

Hagan et al. [9] discussed an arbitrage-free SABR model and provided a useful summary of SABR-style models. They highlight that both β and ρ control the volatility skew and that it is therefore difficult to distinguish between them when fitting the model to market data. Hagan et al. [9] demonstrated that the volatility smile can be well fitted for any value of β in the range [0,1]. However, the choice of β does influence delta. This dependence has also been noted by Skov Hansen [10]. Bartlett [11] proposed an alternative definition for delta which accounts for the correlation between F and α. Hagan et al. [9] noted that this alternative delta is almost independent of the value of β.

In a recent paper, Hagan and Lesniewski [12] note that market practice is to set β to a pre-specified value. This approach is justified because ρ can be adjusted such that the model fits the market for any value of β. Similarly, if one uses the modified definition for delta proposed by Bartlett [11], delta is also independent of the choice of β.

Based on the above we conclude that the prevailing approach for fitting the SABR model is to set β to an arbitrary value and then fit the remaining parameters by minimising the error between the model and market data. Authors trying to fit β normally cite the original paper by Hagan et al. [1], in which it is stated that β can be found from a log-log plot of historic values of σATM against F.

5 Monte Carlo Simulations

The work presented here will focus on fitting the SABR model to simulated market data. Fitting simulated data represents the best case for a fitting procedure because the data is generated using the same process that we are trying to fit. That is, simulated data removes any uncertainty about whether the model being fitted accurately describes the data and allows us to focus on the inverse problem of whether the model parameters can be obtained from market observations. This section describes the method used to generate simulated data in this work.

The Euler-Maruyama method has been used to generate simulated market data. In this method, trajectories of F and α are simulated using a time-discretised version of equation (23):

$$\begin{aligned} F_{t+dt} &= F_t + \alpha_t F_t^{\beta}\, dW_t^1, \\ \alpha_{t+dt} &= \alpha_t + v\alpha_t\, dW_t^2. \end{aligned} \tag{38}$$

The increments dW_t^1 and dW_t^2 are drawn from an N(0, dt) distribution and have covariance

$$\mathrm{Cov}\left[dW_t^1, dW_t^2\right] = \rho\, dt. \tag{39}$$

This is achieved using a Cholesky decomposition of the correlation matrix. A time step of dt = 2.5 × 10^−7 was used and 10^7 time steps were simulated. Market data was calculated after every 100 time steps such that the time interval between data points was dt = 2.5 × 10^−5. At each of these data points, σATM was calculated using equation (37).

The calculation of σRR and σBF requires σ25C and σ25P. The strike, K(∆, σ), of an option is related to its delta by equation (3), which is a function of σ. The implied volatility of the SABR model, σ(F, K), is given by equation (37) and is a function of strike, K. Therefore, the implied volatility cannot be expressed explicitly as a function of delta and σ25C and σ25P must be found using an iterative procedure. The difference between the delta for an option with strike K and ∆ is given by:

$$d(K) = e^{-\int_t^T r^f_s\, ds}\, \Phi(d_1) - \Delta, \tag{40}$$

where

$$d_1 = \frac{\ln\left(\frac{F(t,T)}{K}\right) + \frac{\sigma(F,K)^2}{2}(T-t)}{\sigma(F,K)\sqrt{T-t}} \tag{41}$$

and σ(F, K) is given by equation (37). In this work the root function from the Python package scipy.optimize was used to find K∗, the value of K corresponding to the root of equation (40). The initial estimate of K was chosen to be

$$K_0 = F\left(1 + \frac{\Delta}{10}\right). \tag{42}$$

This choice ensures that K0 lies on the correct side of F. Once K∗ was found, the implied volatility was found as σ(F, K∗) using equation (37).
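The root search for K∗ can be sketched as follows. This example deliberately swaps the scipy.optimize root function for a plain bisection so that it runs with the standard library only. Three things in it are assumptions made for illustration: the sign factor w from equation (1), added so that puts are covered as well as calls; the bracket of 0.5F to 2F; and the toy smile, which is hypothetical and stands in for equation (37).

```python
import math
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution


def strike_for_delta(target, F, tau, df_for, vol_of_strike,
                     lo=0.5, hi=2.0, tol=1e-12):
    """Find K such that the BS delta, evaluated with sigma = vol_of_strike(K),
    equals target. Bisection on [lo * F, hi * F]."""
    w = 1.0 if target > 0 else -1.0

    def err(K):
        s = vol_of_strike(K) * math.sqrt(tau)
        d1 = (math.log(F / K) + 0.5 * s * s) / s   # equation (41)
        return w * df_for * PHI.cdf(w * d1) - target

    a, b = lo * F, hi * F
    fa, fb = err(a), err(b)
    if fa * fb > 0:
        raise ValueError("bracket does not contain a root")
    while b - a > tol * F:
        m = 0.5 * (a + b)
        fm = err(m)
        if fa * fm <= 0:
            b, fb = m, fm
        else:
            a, fa = m, fm
    return 0.5 * (a + b)


# Hypothetical smile used only to exercise the search.
F, tau, df_for = 1.2, 0.25, math.exp(-0.01 * 0.25)
toy_smile = lambda K: 0.07 + 0.5 * math.log(K / F) ** 2

K_call = strike_for_delta(0.25, F, tau, df_for, toy_smile)
K_put = strike_for_delta(-0.25, F, tau, df_for, toy_smile)
print(K_put, K_call, toy_smile(K_call) - toy_smile(K_put))
```

The last printed value is the risk reversal implied by the toy smile, i.e. the difference between the volatilities at the two strikes found.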

5.1 Calculating σimp Using Monte Carlo

To verify the implementation of the Monte Carlo method it was used to estimate prices for options expiring in 3 months. The option price was estimated by simulating 10^4 realisations of equation (38) and estimating the expected option payoff based on these trajectories. Based on the prices obtained, the implied volatility was calculated for each of the options and compared to equation (37). The results are presented in table 2 for 3 different strikes and two values of β. The SABR parameters were ρ = 0.1 and v = 1.0 and the initial values were α0 = 0.07 and F0 = 1.2. The agreement between equation (37) and the MC result is excellent for options struck at the forward price. For options with other strikes the discrepancy between equation (37) and the MC is larger, as would be expected.

Table 2: Comparison of Monte Carlo with equation (37).

          β = 0                  β = 1
K/F0      MC        Eq. (37)     MC        Eq. (37)
0.9       0.07941   0.07900      0.08585   0.08637
1.0       0.05953   0.05953      0.07144   0.07147
1.1       0.07447   0.07711      0.08790   0.09034
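The correlated increments of equation (39) and the Monte Carlo price estimate described above can be sketched as follows. Equation (38) is not reproduced in this section, so the time stepping here is an assumption: an Euler step for F and an exact log-normal step for α. The function names are illustrative, and the price is the undiscounted expected payoff.

```python
import random
from math import exp, sqrt


def correlated_increments(rho, dt, rng):
    """Return (dW1, dW2), each with variance dt and with covariance rho * dt.

    For a 2x2 correlation matrix the Cholesky factor is
    [[1, 0], [rho, sqrt(1 - rho^2)]].
    """
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    s = sqrt(dt)
    return s * z1, s * (rho * z1 + sqrt(1.0 - rho * rho) * z2)


def mc_call_price(F0, alpha0, beta, rho, v, K, T, n_steps, n_paths, seed=0):
    """Estimate E[max(F_T - K, 0)] by simulating dF = alpha F^beta dW1, d(alpha) = v alpha dW2."""
    rng = random.Random(seed)
    dt = T / n_steps
    total = 0.0
    for _ in range(n_paths):
        F, alpha = F0, alpha0
        for _ in range(n_steps):
            dW1, dW2 = correlated_increments(rho, dt, rng)
            # Euler step for F; F is floored at zero inside the power (assumption).
            F += alpha * max(F, 0.0) ** beta * dW1
            # Exact step for the log-normal volatility process.
            alpha *= exp(v * dW2 - 0.5 * v * v * dt)
        total += max(F - K, 0.0)
    return total / n_paths
```

An implied volatility comparable to table 2 would then be obtained by inverting the Black formula on the estimated price, which is omitted here.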

5.2 Alternative Formulation

It will be shown in section 8 that q, defined in equation (65), can be viewed as a single state variable for the SABR model. Therefore, as an alternative to simulating equation (38), we can simulate q:

$$q_{t+dt} = q_t + v q_t\,dW_t^2 - (1-\beta)\,q_t^2\left[dW_t^1 + v\rho\,dt - \frac{1}{2}(2-\beta)\,q_t\,dt\right]. \tag{43}$$

Using this formulation σATM is calculated using equation (67). The calculation of σ25C and σ25P remains an iterative procedure: a value of ψ must be found that satisfies equation (61), where σ(ψ) is given by equation (66).

Figure 1 compares simulated values of σATM obtained by simulating equations (38) and (43). On this scale, both methods appear to generate the same trajectories of σATM. The difference between the trajectories computed using equation (38) and equation (43) is shown in figure 2. This difference is three orders of magnitude smaller than σATM and does not show any clear trend. These observations are consistent with the difference being caused by rounding errors and support the claim that the SABR model can be written in terms of q, as defined in equation (65).
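The comparison in figures 1 and 2 can be imitated with a short experiment. The sketch below drives an Euler discretisation of the (α, F) system and the update of equation (43) with the same Brownian increments and records the largest gap between q and α/F^(1−β). The plain Euler scheme for (α, F) is an assumption, since equation (38) is not reproduced here, and the function name is illustrative.

```python
import random
from math import sqrt


def simulate_both(F0, alpha0, beta, rho, v, dt, n_steps, seed=0):
    """Return (q from equation (43), q recomputed from alpha and F, largest gap)."""
    rng = random.Random(seed)
    F, alpha = F0, alpha0
    q = alpha0 / F0 ** (1.0 - beta)
    max_diff = 0.0
    for _ in range(n_steps):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        dW1 = sqrt(dt) * z1
        dW2 = sqrt(dt) * (rho * z1 + sqrt(1.0 - rho * rho) * z2)
        # Euler step of dF = alpha F^beta dW1 and d(alpha) = v alpha dW2
        # (the right-hand side uses the values from the previous step).
        F, alpha = F + alpha * F ** beta * dW1, alpha + v * alpha * dW2
        # Update of the single state variable q, equation (43).
        q += v * q * dW2 - (1.0 - beta) * q * q * (
            dW1 + v * rho * dt - 0.5 * (2.0 - beta) * q * dt
        )
        max_diff = max(max_diff, abs(q - alpha / F ** (1.0 - beta)))
    return q, alpha / F ** (1.0 - beta), max_diff
```

For example, `simulate_both(1.2, 0.07, 0.0, 0.1, 1.25, 2.5e-5, 2000)` uses the parameter values of figure 1 with a coarser time step than the one used in this work; the returned gap should be small compared with q.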

Figure 1: Trajectories of σATM simulated using equations (38) and (43). F0 = 1.2, α0 = 0.07, tex = 0.25, β = 0.0, v = 1.25, ρ = 0.1. [Plot of σATM (-) against t (years).]

Figure 2: Difference between trajectories of σATM simulated using equations (38) and (43). F0 = 1.2, α0 = 0.07, tex = 0.25, β = 0.0, v = 1.25, ρ = 0.1. [Plot of the difference in σATM (-), on a scale of 10⁻⁶, against t (years).]

6 Fitting β to Market Data

Hagan et al. [1] propose that equation (29) can be used to find β from historical observation of σATM and F . This approach has also been proposed by other authors including West [6] and Nowak and Sibetz [7]. The aim of this section is to demonstrate that β cannot be found from historical observation of σATM and F .

Figure 3 shows a log-log plot of σATM against F for options on the EURUSD currency pair expiring in three months. Data are shown for dates between 21 August 2017 and 11 October 2017. Prices were recorded every hour between 1 am and 11 pm central European time. It can be seen that σATM tends to increase as F increases. Based on the arguments of Hagan et al., these data imply β > 1, which is outside of the allowable range of values for β.

Figure 3: Log-log plot of σATM vs F for three month options on the EURUSD currency pair. Data obtained from Bloomberg. [Plot of ln(σATM) against ln(F).]

The behaviour displayed in figure 3 can be explained by returning to equation (29) and observing that the Brownian motions driving the processes for α and F are correlated. If we consider a historical time series of σATM and F, then we have values of α0 and F0 for each point in the series. That is, for each point in the time series, α0 and F0 are the realisations of α and F at that point in time. Consequently, if we apply equation (29) to historical observations of σATM and F, then α is a random variable which is correlated with F. Therefore, β cannot be determined simply as the slope of a log-log plot of historical observations of σATM against F. The volatilities of the 3 month risk reversals corresponding to the data shown in figure 3 are shown in figure 4. It can be seen that σRR > 0, implying that ρ > 0 in this market. This explains why the data in figure 3 slope upwards: α is positively correlated with F.

Figure 4: Three month 25-delta risk reversal on the EURUSD currency pair. Data obtained from Bloomberg.

6.1 Can β be Found by Iteration?

It was argued above that a log-log plot of σATM against F cannot be used to determine β because α is a random variable correlated with F. Here we wish to determine whether β can be found by iteration. We will fit simulated data consisting of time series of σATM, σRR, σBF and F. An arbitrary value of β = β∗ is selected. For each point in the time series we find α, ρ and v by fitting the volatility smile to σATM, σRR and σBF. The smile was fitted by minimising the square relative error between the observed market quotes and the model predictions for these volatilities (denoted by primes), i.e.

$$\mathrm{error}(\alpha,\rho,v) = \left(\frac{\sigma'_{ATM}(\alpha,\beta^*,\rho,v)}{\sigma_{ATM}} - 1\right)^2 + \left(\frac{\sigma'_{RR}(\alpha,\beta^*,\rho,v)}{\sigma_{RR}} - 1\right)^2 + \left(\frac{\sigma'_{BF}(\alpha,\beta^*,\rho,v)}{\sigma_{BF}} - 1\right)^2. \tag{44}$$

Here the relative errors associated with each of the three volatility quotes are weighted equally for simplicity. When fitting a volatility smile in practice it might be preferable to weight the three errors differently. For example, more weighting might be given to volatility quotes with a higher traded volume. Under normal circumstances this would lead to a larger weighting for the error associated with σATM. Having fitted the smile at each point in the time series, we have a (fitted) value of α for each of these smiles. Using these values of α we can plot log(σATM/α) against log(F). From equation (29) the slope of this plot is β − 1. Therefore, we can update our value of β∗ based on the slope of this plot. We aim to iterate in this manner until the value of β stops changing between iterations.

Figure 5 shows a log-log plot of σATM/α vs F obtained using this procedure. The data shown in figure 5 were simulated using a Monte Carlo method with parameter values β = 1.0, ρ = 0.1, F0 = 1.2, α0 = 0.07, tex = 0.25, v = 1.0. Details of the Monte Carlo method used are given in section 5. Figure 5(a) shows the result of assuming a value of β∗ = 0, whereas figure 5(b) shows the result of assuming a value of β∗ = 1. In both cases only one iteration has been performed. In figure 5(a) the slope of the curve is -1, which implies a value of β = 0. In contrast, the slope of figure 5(b) is zero, which implies a value of β = 1. Therefore, the slope of the log-log plot of σATM/α vs F depends on the value of β∗ used in the fitting process, rather than the value of β used to generate the data. Figure 6 repeats the analysis for simulated data with the same parameter values as in figure 5, except for β, which is now set to β = 0 instead of β = 1. Again only one iteration has been performed and the slope of the log-log plot of σATM/α vs F depends on the value of β∗, rather than the value of β used to generate the data. It appears that we cannot obtain any information about the value of β used to generate the data from a log-log plot of σATM/α vs F. Based on these data, we conclude that the iterative procedure described above cannot be used to obtain the value of β from a time series of σATM and F.
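The dependence of the slope on β∗ can be illustrated with a deliberately simplified calculation. The sketch below does not fit a smile; instead it assumes that, for an assumed β∗, the fitted α is given to leading order by α ≈ σATM F^(1−β∗), i.e. the (T − t) correction is dropped. Under that assumption log(σATM/α) = (β∗ − 1) log F for any data whatsoever. The function names are illustrative.

```python
import random
from math import log


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def slope_for_assumed_beta(sigma_atm, forwards, beta_star):
    """Slope of log(sigma_ATM / alpha) against log(F) for an assumed beta*.

    Assumption: the fitted alpha is approximated by sigma_ATM * F**(1 - beta*).
    """
    alphas = [s * f ** (1.0 - beta_star) for s, f in zip(sigma_atm, forwards)]
    xs = [log(f) for f in forwards]
    ys = [log(s / a) for s, a in zip(sigma_atm, alphas)]
    return ols_slope(xs, ys)


# Arbitrary, unrelated series: the slope is beta* - 1 regardless of the data.
rng = random.Random(0)
forwards = [rng.uniform(1.1, 1.3) for _ in range(200)]
sigma_atm = [rng.uniform(0.05, 0.10) for _ in range(200)]
for beta_star in (0.0, 0.5, 1.0):
    print(beta_star, slope_for_assumed_beta(sigma_atm, forwards, beta_star))
```

Because the slope is fixed by β∗ alone in this approximation, the plot carries no information about the β that generated the data, which is consistent with figures 5 and 6.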

Figure 5: Log-log plot of σATM/α vs F for simulated data with parameter values of β = 1.0, ρ = 0.1, F0 = 1.2, α0 = 0.07, tex = 0.25, v = 1.0. In each of the plots β was assumed to be fixed and the other SABR parameters were found by fitting the implied volatility curve to the simulated data. Using this process α in σATM/α is a fitted value. Panels: (a) β∗ = 0; (b) β∗ = 1. [Plots of log(σATM/α) against log(F).]

Figure 6: Log-log plot of σATM/α vs F for simulated data with parameter values of β = 0.0, ρ = 0.1, F0 = 1.2, α0 = 0.07, tex = 0.25, v = 1.0. In each of the plots β was assumed to be fixed and the other SABR parameters were found by fitting the implied volatility curve to the simulated data. Using this process α in σATM/α is a fitted value. Panels: (a) β∗ = 0; (b) β∗ = 1. [Plots of log(σATM/α) against log(F).]

7 Fitting Using Variance-Covariance Matching

In this section we examine whether the SABR parameters can be obtained by matching the variance and covariance of the Brownian motions to the time series of σATM and F. This method is considered as an alternative to the approach described in section 6. We can rearrange equation (23) to give the following expressions for dW¹ and dW²

$$dW^1 = \frac{dF}{\alpha F^{\beta}}, \qquad dW^2 = \frac{d\alpha}{v\alpha}. \tag{45}$$

Discretising equation (45) gives

$$dW^1 \approx \frac{F_{t+dt} - F_t}{\alpha_t F_t^{\beta}}, \qquad dW^2 \approx \frac{\alpha_{t+dt} - \alpha_t}{v\alpha_t}. \tag{46}$$

For given values of β, ρ and v, α can be calculated from σATM and F using equation (28). Consider the problem of fitting the SABR model to market data. We observe time series of σATM and F in the market. From these data we can calculate dW¹(β, ρ, v) and dW²(β, ρ, v) using equations (28) and (46). Since W¹ and W² are correlated Brownian motions, we can write the following

$$\mathbb{V}(dW^1) = dt, \qquad \mathbb{V}(dW^2) = dt, \qquad \mathrm{Cov}(dW^1, dW^2) = \rho\,dt. \tag{47}$$

Therefore, one approach to fitting the model is to select values of β, ρ and v which cause the sample variances and covariance of dW¹ and dW² to be equal to those given in equation (47). That is, we seek β, ρ and v such that the following conditions are met

$$\mathbb{V}\left(\frac{F_{t+dt} - F_t}{\alpha_t F_t^{\beta}\sqrt{dt}}\right) = 1, \qquad \mathbb{V}\left(\frac{\alpha_{t+dt} - \alpha_t}{v\alpha_t\sqrt{dt}}\right) = 1, \qquad \mathrm{Cov}\left(\frac{F_{t+dt} - F_t}{\alpha_t F_t^{\beta}\sqrt{dt}}, \frac{\alpha_{t+dt} - \alpha_t}{v\alpha_t\sqrt{dt}}\right) = \rho. \tag{48}$$

We apply variance-covariance matching to simulated data with β = 1.0, ρ = 0.1, F0 = 1.2, α0 = 0.07, tex = 0.25 and v = 1.0. The simulated data consists of time series of σATM and F with 10⁵ observations of each. The time step between the observations is dt = 2.5 × 10⁻⁵ years. Fitting is performed by minimising the sum of the relative errors

\begin{align}
\mathrm{error}(\beta,\rho,v) = {} & \left|\mathbb{V}\left(\frac{F_{t+dt} - F_t}{\alpha_t F_t^{\beta}\sqrt{dt}}\right) - 1\right| + \left|\mathbb{V}\left(\frac{\alpha_{t+dt} - \alpha_t}{v\alpha_t\sqrt{dt}}\right) - 1\right| \tag{49} \\
& + \left|\frac{\rho}{\mathrm{Cov}\left(\frac{F_{t+dt} - F_t}{\alpha_t F_t^{\beta}\sqrt{dt}}, \frac{\alpha_{t+dt} - \alpha_t}{v\alpha_t\sqrt{dt}}\right)} - 1\right|. \tag{50}
\end{align}

To demonstrate the behaviour of equation (50), we fix β and find ρ and v using the L-BFGS-B method implemented in the scipy.optimize package. An initial estimate of ρ = 0, v = 0.5 was used. Table 3 shows the resulting values of ρ and v for a range of values of β. The corresponding values of the two variances and the covariance on the left-hand sides of equation (48) are also shown.

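Table 3: ρ and v obtained using variance-covariance matching for a range of values of β. The last three columns are the left-hand sides of equation (48): $V_F = \mathbb{V}\big(\tfrac{F_{t+dt}-F_t}{\alpha_t F_t^{\beta}\sqrt{dt}}\big)$, $V_\alpha = \mathbb{V}\big(\tfrac{\alpha_{t+dt}-\alpha_t}{v\alpha_t\sqrt{dt}}\big)$ and Cov, the covariance of the same two quantities.

β      ρ        v       V_F      V_α     Cov
0.0    0.1595   1.008   0.9970   1.000   0.1595
0.5    0.1318   1.004   0.9976   1.000   0.1318
1.0    0.1037   1.000   0.9979   1.000   0.1037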
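The statistics reported in table 3 can be explored with a small experiment. The sketch below is not the procedure used to produce the table: it simulates a β = 1 model with a plain Euler scheme, treats σATM as equal to q (the (T − t) correction is dropped), and recovers α for each assumed β from the leading-order relation α ≈ σATM F^(1−β) instead of equation (28). No optimisation is performed; v is held at its true value. All names are illustrative.

```python
import random
from math import sqrt
from statistics import fmean, pvariance


def simulate_series(F0, q0, rho, v, dt, n, seed=0):
    """Euler simulation of F and sigma_ATM ~ q for beta = 1 (assumed set-up)."""
    rng = random.Random(seed)
    F, q = [F0], [q0]
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        dW1 = sqrt(dt) * z1
        dW2 = sqrt(dt) * (rho * z1 + sqrt(1.0 - rho ** 2) * z2)
        F.append(F[-1] * (1.0 + q[-1] * dW1))
        q.append(q[-1] * (1.0 + v * dW2))
    return F, q


def matching_statistics(F, sigma_atm, beta, v, dt):
    """Sample values of the three left-hand sides of equation (48)."""
    # Leading-order alpha for the assumed beta (assumption, replaces equation (28)).
    alpha = [s * f ** (1.0 - beta) for s, f in zip(sigma_atm, F)]
    n = len(F) - 1
    x = [(F[i + 1] - F[i]) / (alpha[i] * F[i] ** beta * sqrt(dt)) for i in range(n)]
    y = [(alpha[i + 1] - alpha[i]) / (v * alpha[i] * sqrt(dt)) for i in range(n)]
    mx, my = fmean(x), fmean(y)
    cov = fmean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return pvariance(x), pvariance(y), cov


F, sigma_atm = simulate_series(1.2, 0.07, 0.1, 1.0, 2.5e-5, 4000, seed=1)
for beta in (0.0, 0.5, 1.0):
    print(beta, matching_statistics(F, sigma_atm, beta, 1.0, 2.5e-5))
```

In this simplified setting the first variance is the same for every assumed β, while the covariance, and hence the implied ρ, shifts with β. This is the same qualitative pattern as in table 3.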

The data presented in table 3 suggest that v can be found by variance-covariance matching. In contrast, for each value of β we obtain a different value of ρ and each of these (β, ρ) pairs approximately fulfils the conditions given in equation (48). This is in agreement with what we observe if we attempt to fit β, ρ and v simultaneously: the estimate for β depends strongly on the initial choice of β and tends not to differ largely from this initial guess. To explain the behaviour above, let us look more closely at the first of the constraints in equation (47). From equation (45) we can write

$$\mathbb{V}\left(\frac{dF}{\alpha F^{\beta}}\right) = dt. \tag{51}$$

Using equation (28) and noting that the (T − t) term is typically small, we estimate α as

$$\alpha \approx \sigma_{ATM} F^{1-\beta}. \tag{52}$$

Combining equations (51) and (52) gives

$$\mathbb{V}\left(\frac{dF}{\sigma_{ATM} F}\right) = dt, \tag{53}$$

which is independent of β, ρ and v. Hence, β, ρ and v only enter equation (51) through the (T − t) term, which is normally less than 1 or 2%. To summarise, although this approach seems promising because we have three constraints for three unknowns (β, ρ and v), it is difficult to determine the three unknowns uniquely because one of the constraints is relatively insensitive to the three unknowns.

8 Relationship Between σATM, σRR and σBF

In the FX market the volatility smile is described by three main volatility quotes: σATM, σRR and σBF. The definitions of these can be found in section 2. Of these three main volatilities, σATM is the most liquid contract and might, therefore, be expected to be more up to date than σRR or σBF. Hence, one motivation for fitting a model to the FX volatility smile is to predict price changes for the less liquid σRR and σBF contracts based on observed changes in σATM. To this end, this section focuses on the relationship between σATM, σRR and σBF predicted by the SABR model.

Figure 7 shows the relationship between σRR and σATM for data simulated using different values of β, ρ and v. Details of the Monte Carlo method used to generate these data are given in section 5. These data indicate that σRR is a deterministic function of σATM and that this function depends on β, ρ and v. It is somewhat surprising that σRR is a deterministic function of σATM. The SABR model consists of two correlated stochastic processes, which describe the evolution of α and F. Specifying σATM does not specify α or F, but instead defines the relationship between them (equation (37)). Therefore we might expect that, for any value of σATM, the model could yield a range of values of σRR depending on the value of F. The observation that σRR is a deterministic function of σATM implies that, rather than two state variables (α and F), we can describe the system using a single state variable that is a function of α and F, i.e.

$$\sigma_{ATM}(\alpha, F) = \sigma_{ATM}(q), \tag{54}$$

$$\sigma_{RR}(\alpha, F) = \sigma_{RR}(q), \tag{55}$$

$$q = q(\alpha, F). \tag{56}$$

Market data of σRR, σATM pairs is shown in figure 8. These data show considerable scatter and it is difficult to discern any real trends. The data do not appear to support the model prediction that σRR is a deterministic function of σATM. Based on these observations we can conclude that these data cannot be explained by the SABR model with constant parameters.

Figure 9 shows σBF vs σATM for simulated data. The effect of changing β, ρ and v is demonstrated in figures 9(a), 9(b) and 9(c), respectively. There is a deterministic relationship between σBF and σATM for all parameter values considered. This relationship appears to be relatively insensitive to the values of β and ρ but depends strongly on v. Based on the arguments above we can write σBF(α, F) = σBF(q).

Figure 7: σRR vs σATM for simulated data. Unless otherwise stated the parameter values are F0 = 1.2, α0 = 0.07, tex = 0.25, β = 1.0, v = 1.25 and ρ = 0.1. Panels: (a) effect of changing β (β = 0.0, 0.5, 1.0); (b) effect of changing ρ (ρ = −0.1, 0.1, 0.25); (c) effect of changing v (v = 0.5, 1.0, 1.25).

FX markets typically quote σ1−vol−BF instead of σBF. These two conventions are described in section 2.4. Figure 10 shows σ1−vol−BF vs σATM for simulated data. A deterministic relationship between σ1−vol−BF and σATM can be observed for all parameter values considered. Comparing figures 9 and 10 we see that σ1−vol−BF follows similar trends to σBF: the relationship between σ1−vol−BF and σATM depends strongly on v but is relatively insensitive to β and ρ. Indeed, for the parameter values considered there is little difference between σ1−vol−BF and σBF. Market data of σ1−vol−BF, σATM pairs are shown in figure 11. Again these data do not appear to show the unique, deterministic relationship between σ1−vol−BF and σATM that is predicted by the SABR model.

Figure 8: σRR vs σATM for EURUSD and USDJPY currency pairs. Data are shown for dates between 21 August 2017 and 11 October 2017. Panels: (a) EURUSD; (b) USDJPY.

Figure 9: σBF vs σATM for simulated data. Unless otherwise stated the parameter values are F0 = 1.2, α0 = 0.07, tex = 0.25, β = 1.0, v = 1.25 and ρ = 0.1. Panels: (a) effect of changing β (β = 0.0, 0.5, 1.0); (b) effect of changing ρ (ρ = −0.1, 0.1, 0.25); (c) effect of changing v (v = 0.5, 1, 1.25, 1.5).

Figure 10: σ1−vol−BF vs σATM for simulated data. Unless otherwise stated the parameter values are F0 = 1.2, α0 = 0.07, tex = 0.25, β = 1.0, v = 1.25 and ρ = 0.1. Panels: (a) effect of changing β (β = 0.0, 0.5, 1.0); (b) effect of changing ρ (ρ = −0.1, 0.1, 0.25); (c) effect of changing v (v = 0.5, 1, 1.25, 1.5).

Figure 11: σ1−vol−BF vs σATM for EURUSD and USDJPY currency pairs. Data are shown for dates between 21 August 2017 and 11 October 2017. Panels: (a) EURUSD; (b) USDJPY.

To explain the deterministic relationships observed above, we return to the definitions of σRR and σBF. Equations (7) and (8) are repeated here for convenience.

$$\begin{aligned}
\sigma_{25RR}(t,T) &= \sigma_{25C}(t,T) - \sigma_{25P}(t,T), \\
\sigma_{25BF}(t,T) &= \frac{1}{2}\left[\sigma_{25C}(t,T) + \sigma_{25P}(t,T)\right] - \sigma_{ATM}(t,T).
\end{aligned} \tag{57}$$

Here σ25C and σ25P are the implied volatilities of a call and a put with delta of 0.25 and -0.25, respectively. The implied volatility of the SABR model is given by

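(equation (37)):

$$\sigma(K,F) = \frac{v\ln\frac{F}{K}}{\ln\left(\frac{\sqrt{1-2\rho\zeta+\zeta^2}+\zeta-\rho}{1-\rho}\right)}\left\{1 + \left[\frac{(1-\beta)^2}{24}\frac{\alpha^2}{(FK)^{1-\beta}} + \frac{1}{4}\frac{\rho\beta\alpha v}{(FK)^{\frac{1-\beta}{2}}} + \frac{2-3\rho^2}{24}v^2\right](T-t) + \dots\right\}, \tag{58}$$

where

$$\zeta = \frac{v}{\alpha}\,\frac{F^{1-\beta} - K^{1-\beta}}{1-\beta}. \tag{59}$$

FX options are quoted in terms of delta, ∆, and we wish to determine σ25C and σ25P. Adopting this convention, we can write

$$\ln\left(\frac{F}{K}\right) = w\sigma\sqrt{\tau}\,\Phi^{-1}(|\Delta|) - \frac{\sigma^2\tau}{2}, \tag{60}$$

where w is 1 for a call and -1 for a put. Define

$$\psi = w\sigma\sqrt{\tau}\,\Phi^{-1}(|\Delta|) - \frac{\sigma^2\tau}{2}. \tag{61}$$

Then

$$\ln\left(\frac{F}{K}\right) = \psi, \tag{62}$$

and

$$FK = F^2\exp(-\psi). \tag{63}$$

We can rewrite ζ as

$$\zeta = \frac{v}{\alpha}F^{1-\beta}\,\frac{1-\exp(-\psi(1-\beta))}{1-\beta} = \frac{v}{q}\,\frac{1-\exp(-\psi(1-\beta))}{1-\beta}, \tag{64}$$

where

$$q = \frac{\alpha}{F^{1-\beta}}. \tag{65}$$

Writing equation (37) in this notation gives

$$\sigma(\psi) = \frac{v\psi}{\ln\left(\frac{\sqrt{1-2\rho\zeta+\zeta^2}+\zeta-\rho}{1-\rho}\right)}\left\{1 + \left[\frac{(1-\beta)^2}{24}q^2e^{\psi(1-\beta)} + \frac{\rho\beta q v}{4}e^{\frac{\psi(1-\beta)}{2}} + \frac{2-3\rho^2}{24}v^2\right](T-t) + \dots\right\}. \tag{66}$$

Note that neither α nor F appear explicitly in equation (66) and that ψ depends on σ(ψ), so equation (66) is not an explicit equation for σ(ψ). For the case of options written at-the-money we have

$$\begin{aligned}
\sigma_{ATM} = \sigma(0) &= \frac{\alpha}{F^{1-\beta}}\left\{1 + \left[\frac{(1-\beta)^2}{24}\frac{\alpha^2}{F^{2-2\beta}} + \frac{1}{4}\frac{\rho\beta\alpha v}{F^{1-\beta}} + \frac{2-3\rho^2}{24}v^2\right](T-t) + \dots\right\} \\
&= q\left\{1 + \left[\frac{(1-\beta)^2}{24}q^2 + \frac{\rho\beta q v}{4} + \frac{2-3\rho^2}{24}v^2\right](T-t) + \dots\right\}.
\end{aligned} \tag{67}$$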
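Equations (61), (66) and (67) can be evaluated numerically as sketched below. This is an illustration under stated assumptions rather than the code of appendix A: the series in equation (66) is truncated after the (T − t) term, the implicit dependence of ψ on σ(ψ) is resolved by a simple fixed-point iteration, and all function names are hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def correction(q, psi, beta, rho, v, tau):
    """Bracketed (T - t) factor of equation (66); psi = 0 gives equation (67)."""
    a = (1.0 - beta) ** 2 / 24.0 * q * q * exp(psi * (1.0 - beta))
    b = rho * beta * q * v / 4.0 * exp(psi * (1.0 - beta) / 2.0)
    c = (2.0 - 3.0 * rho * rho) / 24.0 * v * v
    return 1.0 + (a + b + c) * tau


def sigma_of_psi(psi, q, beta, rho, v, tau):
    """Implied volatility as a function of psi = ln(F / K), equation (66)."""
    if abs(psi) < 1e-12:  # at-the-money limit, equation (67)
        return q * correction(q, 0.0, beta, rho, v, tau)
    if abs(1.0 - beta) < 1e-12:  # beta -> 1 limit of equation (64)
        zeta = v * psi / q
    else:
        zeta = v / q * (1.0 - exp(-psi * (1.0 - beta))) / (1.0 - beta)
    x = log((sqrt(1.0 - 2.0 * rho * zeta + zeta * zeta) + zeta - rho) / (1.0 - rho))
    return v * psi / x * correction(q, psi, beta, rho, v, tau)


def sigma_for_delta(delta, q, beta, rho, v, tau, n_iter=50):
    """Fixed-point iteration between equation (61) and equation (66)."""
    w = 1 if delta > 0 else -1
    z = NormalDist().inv_cdf(abs(delta))
    sigma = sigma_of_psi(0.0, q, beta, rho, v, tau)  # start from the ATM volatility
    for _ in range(n_iter):
        psi = w * sigma * sqrt(tau) * z - 0.5 * sigma * sigma * tau
        sigma = sigma_of_psi(psi, q, beta, rho, v, tau)
    return sigma


def smile_quotes(q, beta, rho, v, tau):
    """Return (sigma_ATM, sigma_RR, sigma_BF) for a given q, cf. equation (57)."""
    atm = sigma_of_psi(0.0, q, beta, rho, v, tau)
    call = sigma_for_delta(0.25, q, beta, rho, v, tau)
    put = sigma_for_delta(-0.25, q, beta, rho, v, tau)
    return atm, call - put, 0.5 * (call + put) - atm
```

Only q, β, ρ, v and T − t enter `smile_quotes`; neither α nor F is needed separately. As a check, `smile_quotes(0.07, 1.0, 0.1, 1.0, 0.25)` uses the β = 1 parameters of table 2 (q = α0 for β = 1), and its first element should agree with the Eq. (37) column of that table at K/F0 = 1.0.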

Based on equation (66) it would be possible to calculate σRR and σBF if the SABR parameters (β, ρ, v) and q are known. Since (67) is a special case of (66), σATM is also uniquely determined by β, ρ, v and q. Therefore, for specified values of β, ρ and v, there is a fixed, deterministic relationship between σATM, σRR and σBF. This explains the behaviour shown in figures 7 and 9. Let us take stock of the above: if we consider the volatility smile as a mapping between ∆ and σimp, then this mapping depends on the SABR parameters ρ, β, v and on q. For a fixed model (i.e. specified values of ρ, β and v), the state of the system can be described by q alone. Note that we have reduced the number of SABR parameters by one: α is no longer considered a parameter to be fitted. We have also reduced by one the number of variables that we can observe: we are no longer interested in observing the forward price, F. Motivated by the above, let us consider the SDE for q as defined in equation (65). Applying Ito's Lemma

$$\begin{aligned}
dq &= -\alpha(1-\beta)\frac{dF}{F^{2-\beta}} + \frac{d\alpha}{F^{1-\beta}} + \frac{1}{2}\alpha(1-\beta)(2-\beta)\frac{dF^2}{F^{3-\beta}} - \frac{1-\beta}{F^{2-\beta}}\,dF\,d\alpha \\
&= -(1-\beta)q^2\,dW^1 + vq\,dW^2 + \frac{1}{2}(1-\beta)(2-\beta)q^3\,dt - (1-\beta)vq^2\rho\,dt.
\end{aligned} \tag{68}$$

Therefore, it should be possible to describe the evolution of the volatility smile using equation (68). It is also noted that q is the instantaneous Black-Scholes volatility. To see this, we rewrite equation (23) as

$$dF = \frac{\alpha}{F^{1-\beta}}\,F\,dW^1 \tag{69}$$

and compare equation (69) to the Black-Scholes model for the asset price

$$dF = \sigma_{BS}\,F\,dW^1. \tag{70}$$

Comparing equations (69) and (70) we obtain the instantaneous relation

$$\sigma_{BS} = \frac{\alpha}{F^{1-\beta}} = q. \tag{71}$$

It is also interesting to note that q ≈ σATM. Equation (67) shows that q differs from σATM by a factor of

$$1 + \left[\frac{(1-\beta)^2}{24}q^2 + \frac{\rho\beta q v}{4} + \frac{2-3\rho^2}{24}v^2\right](T-t).$$

It was shown in section 3 that the (T − t) term is usually less than 1 or 2% for a typical volatility smile. Therefore the discrepancy between q and σATM will usually be of this order of magnitude.

9 Covariance Between dq and dF

In this section we examine the covariance between dq and dF. We noted in section 8 that q ≈ σATM. Therefore, we can view Cov(dq, dF) as a proxy for Cov(dσATM, dF). Cov(dσATM, dF) is an important quantity because it allows us to relate a change in F to a change in σATM. This relation is valuable because it allows vega exposure, which is expensive to hedge, to be partially hedged with delta, which is cheap to hedge. In section 8 we saw that

$$dq = -(1-\beta)q^2\,dW^1 + vq\,dW^2 + \frac{1}{2}(1-\beta)(2-\beta)q^3\,dt - (1-\beta)vq^2\rho\,dt, \tag{72}$$

and

$$dF = \alpha F^{\beta}\,dW^1. \tag{73}$$

By direct computation we can calculate the covariance between dq and dF:

$$\mathrm{Cov}(dq, dF) = \mathbb{E}(dq\,dF) = dt\left(\rho v\,\mathbb{E}(q^2F) - (1-\beta)\,\mathbb{E}(q^3F)\right). \tag{74}$$

We can also consider the variance of dq

$$\mathbb{V}(dq) = \left[(1-\beta)^2\,\mathbb{E}(q^4) + v^2\,\mathbb{E}(q^2) - 2v\rho(1-\beta)\,\mathbb{E}(q^3)\right]dt. \tag{75}$$

However, for common parameter values this is dominated by v²E(q²) and so provides little information regarding β or ρ. We could estimate v as

$$v \approx \sqrt{\frac{\mathbb{V}(dq)}{\mathbb{E}(q^2)\,dt}}. \tag{76}$$

The variance of dF is

$$\mathbb{V}(dF) = \mathbb{E}(\alpha^2F^{2\beta})\,dt = \mathbb{E}(q^2F^2)\,dt. \tag{77}$$

The correlation between dq and dF is

$$\begin{aligned}
\rho_{dq,dF} &= \frac{\mathrm{Cov}(dq, dF)}{\sqrt{\mathbb{V}(dq)\,\mathbb{V}(dF)}} \\
&= \frac{\rho v\,\mathbb{E}(q^2F) - (1-\beta)\,\mathbb{E}(q^3F)}{\sqrt{\mathbb{E}(q^2F^2)}\,\sqrt{(1-\beta)^2\,\mathbb{E}(q^4) + v^2\,\mathbb{E}(q^2) - 2v\rho(1-\beta)\,\mathbb{E}(q^3)}} \\
&\approx \frac{\rho v\,\mathbb{E}(q^2F) - (1-\beta)\,\mathbb{E}(q^3F)}{\sqrt{\mathbb{E}(q^2F^2)}\;v\sqrt{\mathbb{E}(q^2)}}.
\end{aligned} \tag{78}$$

If we consider the instantaneous correlation, such that q and F are known, then

$$\rho_{dq,dF} \approx \rho - (1-\beta)\frac{q}{v}. \tag{79}$$

It was noted in the literature survey that a volatility smile can be equally well described with any value of β. It is interesting to examine the behaviour of equation (79) for these different fits. Consider the volatility smile in table 4, which was generated using β = 1, ρ = 0.1 and v = 1.0. Table 5 shows the SABR parameters fitted to these volatilities for a range of values of β. For each of these parameter sets, the value of ρdq,dF estimated from equation (79) is also given. These data illustrate that the instantaneous correlation between dq and dF is insensitive to the choice of β.

Table 4: Sample volatility smile data.

F      σATM      σRR       σBF
1.2    7.17 %    0.256 %   0.146 %

Table 5: SABR parameters fitted to the data in table 4 for a range of values of β.

β      ρ        v       q         ρdq,dF
0.0    0.1668   1.013   0.06699   0.1007
0.2    0.1537   1.010   0.06699   0.1006
0.4    0.1404   1.007   0.06698   0.1005
0.6    0.1270   1.005   0.06697   0.1003
0.8    0.1135   1.002   0.06697   0.1001
1.0    0.1000   1.000   0.06697   0.1000
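The last column of table 5 follows directly from equation (79). The short sketch below recomputes it from the fitted ρ, v and q in each row; the function name and the list `table5` are illustrative and simply transcribe the table.

```python
def instantaneous_corr(beta, rho, v, q):
    """Instantaneous correlation between dq and dF, equation (79)."""
    return rho - (1.0 - beta) * q / v


# Rows of table 5: (beta, rho, v, q, quoted rho_dq_dF).
table5 = [
    (0.0, 0.1668, 1.013, 0.06699, 0.1007),
    (0.2, 0.1537, 1.010, 0.06699, 0.1006),
    (0.4, 0.1404, 1.007, 0.06698, 0.1005),
    (0.6, 0.1270, 1.005, 0.06697, 0.1003),
    (0.8, 0.1135, 1.002, 0.06697, 0.1001),
    (1.0, 0.1000, 1.000, 0.06697, 0.1000),
]

for beta, rho, v, q, quoted in table5:
    print(beta, round(instantaneous_corr(beta, rho, v, q), 4), quoted)
```

The fitted ρ changes by roughly 0.07 between β = 0 and β = 1, while the term (1 − β)q/v changes by almost the same amount in the opposite direction, which is why ρdq,dF barely moves.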

The data shown in table 4 are characteristic of a currency pair with a small risk reversal (such as EURUSD). We now repeat the analysis for the data shown in table 6. These data represent a currency pair with a large risk reversal; for example, USDJPY. Table 7 shows the SABR parameters fitted to the volatilities in table 6 for a range of values of β. In this case the difference between the largest and smallest values of ρdq,dF is larger, but still less than 2%. Based on the analysis above, it can be concluded that ρdq,dF is insensitive to the choice of β. This is unsurprising: ρdq,dF is related to the shape of the volatility smile and it is known that smiles can be well fitted by the SABR model for any value of β [12].

Table 6: Sample volatility smile for a currency pair with a large risk reversal.

F      σATM       σRR        σBF
100    9.225 %    -1.475 %   0.3475 %

Table 7: SABR parameters fitted to the data in table 6 for a range of values of β.

β      ρ             v            q            ρdq,dF
0.0    -0.26984092   1.36912134   0.08909337   -0.3349
0.2    -0.28201586   1.37765979   0.08913005   -0.3338
0.4    -0.29400046   1.38646149   0.08917033   -0.3326
0.6    -0.30579258   1.39552372   0.08921424   -0.3314
0.8    -0.31739115   1.40484087   0.0892618    -0.3301
1.0    -0.32879486   1.41441135   0.08931306   -0.3288

9.1 Does the Choice of β Matter?

Figure 12 shows σRR and σBF as a function of σATM for the parameter values shown in table 5 (β = 0.0 and β = 1.0). From figure 12(b) it can be observed that the choice of β has little effect on the relationship between σBF and σATM. In contrast, the relationship between σRR and σATM depends strongly on β. Figure 12(a) shows the relationship between σRR and σATM for β = 0.0 and β = 1.0. The two curves intersect at σATM ≈ 0.07, which is the point to which the model was fitted, but diverge either side of this point. Based on the results in figure 12 we can conclude that fitting β correctly would be critical if we wanted to have a model with constant parameter values that is capable of describing the relationship between σRR and σATM over a wide range of σATM. It is, however, valid to ask whether such a model offers significant real world advantages over a model with an arbitrary choice of β that is recalibrated frequently. If σATM evolves slowly, then recalibrating the arbitrary β model on a regular basis would ensure that the model’s prediction for σRR would always be close to the true value. Under these conditions the advantage of correctly determining β is that the model would not need to be recalibrated as frequently.

On the other hand, if we observe a step change in σATM then a model with the correct value of β would be expected to yield a more accurate prediction of σRR than a model with arbitrary β. Large changes in σATM are typically the result of dramatic events, such as the UK referendum on EU membership on 23 June 2016 or the removal of the ‘peg’ in the EURCHF market on 15 January 2015. It is unclear whether model parameters could be regarded as constant during such dramatic events. Further work is required to assess the value of the SABR model under such circumstances.

Figure 12: σRR and σBF as a function of σATM. Panels: (a) σRR; (b) σBF. Curves are shown for β = 0.0 and β = 1.0.

10 Fitting the SABR Model

In this section we propose a new method for fitting the SABR model to observed FX market data. After describing the method, we demonstrate its behaviour in the idealised case of fitting data generated using the SABR model. Thereafter the effect of noise on the fitting accuracy is examined. We isolate the effect of noise on each of the three volatility quotes σATM, σRR and σBF. Finally conclusions are drawn.

The deterministic nature of the relationships between σATM, σRR and σBF and the dependence of these relationships on the SABR parameters suggests that it should be possible to fit the SABR model to market data using observations of the triplet [σATM, σRR, σBF]. It was argued in section 8 that, in an FX setting, the SABR model can be regarded as having a single state variable, q. Equation (67) relates σATM to q and the model parameters. Therefore, q and σATM are interchangeable and the state of the system is fully described by σATM. From a practical point of view we prefer to work with σATM because, unlike q, it can be observed directly in the market.

If we regard σATM as a state variable, then the [σATM, σRR, σBF] triplet provides us with two constraints: namely that the model predictions for σRR and σBF should match the observed values. The model has three unknown parameters (β, ρ and v) so one triplet will generally not contain sufficient information to fit the model. One approach to fitting the model would be to consider n triplets. The parameter values would then be found by minimising the error between the predicted and observed values of σRR and σBF. For example

$$\mathrm{error}(\beta,\rho,v) = \sum_{i}^{n}\left[\left(\frac{\sigma'_{RR}(\beta,\rho,v)}{\sigma_{RR}} - 1\right)^2 + \left(\frac{\sigma'_{BF}(\beta,\rho,v)}{\sigma_{BF}} - 1\right)^2\right] \tag{80}$$

where σ′RR(β, ρ, v) and σ′BF(β, ρ, v) are, respectively, the risk reversal and butterfly spread predicted by the SABR model. In equation (80) relative errors were used to account for possible differences in the magnitudes of σRR and σBF. In the case that σRR is very small, absolute errors could be used to prevent division by zero. Equation (80) is minimised subject to the following bounds on β, ρ and v:

$$0 \le \beta \le 1, \qquad -0.9 \le \rho \le 0.9, \qquad 0 \le v \le 100.$$

The bounds on β are specified by the SABR model. β = 0 gives arithmetic Brownian motion, while β = 1 results in geometric Brownian motion. In principle ρ can take any value between -1 and 1. We prefer tighter bounds on ρ to prevent division by zero during the minimisation. The lower bound on v is applied because v is a volatility. An upper bound of v = 100 is applied to prevent the numerical optimiser from venturing into regions with a very large v. The typical magnitude of v is v ≈ 1. The upper bound is sufficiently far from this value that v can be considered as being essentially unbounded from above. The initial estimate of the parameter values is chosen to be β∗ = 0.5, ρ∗ = 0.0, v∗ = 0.5. Constrained minimisation is performed in python using the L-BFGS-B method implemented in the scipy.optimize package. Python code developed during this project to fit the SABR model using the method described above can be found in appendix A.
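The objective of equation (80) and the bounds above can be written down compactly. The sketch below is not the code of appendix A: `model` is a hypothetical callable that returns the predicted (σ′RR, σ′BF) for a given σATM and parameter set, and `toy_model` is a made-up stand-in used only to exercise the objective; it is not the SABR model.

```python
# Bounds and starting point quoted in the text.
BOUNDS = [(0.0, 1.0), (-0.9, 0.9), (0.0, 100.0)]  # beta, rho, v
X0 = (0.5, 0.0, 0.5)


def fit_objective(params, triplets, model):
    """Sum of squared relative errors in sigma_RR and sigma_BF, equation (80).

    triplets: iterable of observed (sigma_ATM, sigma_RR, sigma_BF).
    model:    hypothetical callable (sigma_ATM, beta, rho, v) -> (sigma_RR', sigma_BF').
    """
    beta, rho, v = params
    total = 0.0
    for sigma_atm, sigma_rr, sigma_bf in triplets:
        rr_model, bf_model = model(sigma_atm, beta, rho, v)
        total += (rr_model / sigma_rr - 1.0) ** 2 + (bf_model / sigma_bf - 1.0) ** 2
    return total


def within_bounds(params, bounds=BOUNDS):
    """True if every parameter lies inside its [lower, upper] interval."""
    return all(lo <= p <= hi for p, (lo, hi) in zip(params, bounds))


def toy_model(sigma_atm, beta, rho, v):
    """Illustrative stand-in only; NOT the SABR smile."""
    return rho * sigma_atm * (1.0 + beta), v * sigma_atm ** 2
```

An objective of this shape, together with `X0` and `BOUNDS`, is what a bounded quasi-Newton routine such as L-BFGS-B in scipy.optimize would be given; the optimiser call itself is left out here so that the sketch needs only the standard library.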

10.1 Fitting in the Absence of Noise

To demonstrate the behaviour of the fitting procedure described above we apply it to fictitious data with known values of β, ρ and v. It was argued above that σATM can be regarded as a state variable for the system. Therefore, we can generate sample data simply by specifying values of σATM. Table 8 shows the data to be fitted. We consider the case of n = 2, i.e. two data points are used to fit the model. In all cases point 1, corresponding to σATM = 0.06, is used in the fit. The second point is varied to show the effect of the range of σATM on the fitting quality. Define the error in the model fitting as the difference between the true model parameters and those found by fitting the model to the data:

$$h = \left(\beta' - \beta\right)^2 + \left(\frac{\rho'}{\rho} - 1\right)^2 + \left(\frac{v'}{v} - 1\right)^2. \tag{81}$$

Here β′, ρ′ and v′ are the parameter values found by fitting the model. The absolute error was chosen for β to avoid division by zero.
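As a small worked example of equation (81), the fitting error can be computed as follows. The function name and the sample parameter values are illustrative.

```python
def fitting_error(fitted, true):
    """Fitting error h of equation (81).

    fitted, true: (beta, rho, v) tuples. The beta term is an absolute error,
    while the rho and v terms are relative errors, so rho and v must be non-zero.
    """
    beta_f, rho_f, v_f = fitted
    beta, rho, v = true
    return (beta_f - beta) ** 2 + (rho_f / rho - 1.0) ** 2 + (v_f / v - 1.0) ** 2


# A fit that is off by 0.1 in beta and by 10% in rho gives h of about 0.02.
print(fitting_error((0.6, 0.11, 1.0), (0.5, 0.1, 1.0)))
```

Because the β term is absolute and the other two are relative, an error of 0.1 in β contributes as much to h as a 10% error in ρ or v.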

Figure 13(a) shows the fitting error as a function of the range of σATM. It can be seen that the fitting error tends to decrease as the range of σATM is increased and that, for all values of β, the fitting error is very small when the range of σATM is 0.01 or larger. In all cases considered the fitting error is smaller than 10⁻⁵ when the range of σATM is 0.01 or larger. For β = 0.5 the fitting error is small for all values of the range of σATM. This may be due to the initial estimate of β, β∗ = 0.5, used in the fitting procedure. Figure 13(b) shows the fitting error as a function of the range of σATM when the initial estimate of the parameter values is chosen to be β∗ = 0.0, ρ∗ = 0.0, v∗ = 0.5. This choice leads to a small fitting error for β = 0.0 for all ranges of σATM, which corroborates the argument that the small fitting error for β = 0.5 in figure 13(a) is due to the choice of initial value.

Table 8: Sample data to be fitted for three values of β. In all cases ρ = 0.1, v = 1 and tex = 0.25.

                 β = 0                β = 0.5              β = 1.0
Point   σATM     σRR(%)    σBF(%)     σRR(%)    σBF(%)     σRR(%)    σBF(%)
1       0.060    0.0931    0.1188     0.1563    0.12011    0.2196    0.1218
2       0.061    0.0927    0.1208     0.1580    0.1221     0.2235    0.1239
3       0.062    0.0922    0.1228     0.1597    0.1241     0.2273    0.1260
4       0.063    0.0916    0.1247     0.1613    0.1261     0.2312    0.1280
5       0.064    0.0910    0.1267     0.1630    0.1281     0.2350    0.1301
6       0.065    0.0904    0.1287     0.1646    0.1302     0.2389    0.1322
7       0.066    0.0897    0.1306     0.1661    0.1322     0.2428    0.1343
8       0.067    0.0889    0.1326     0.1677    0.1342     0.2467    0.1364
9       0.068    0.0880    0.1345     0.1692    0.1362     0.2506    0.1385
10      0.069    0.0871    0.1365     0.1707    0.1382     0.2545    0.1406
11      0.070    0.0861    0.1385     0.1721    0.1402     0.2584    0.1426
12      0.071    0.0851    0.1404     0.1736    0.1422     0.2623    0.1447
13      0.072    0.0839    0.1424     0.1750    0.1442     0.2662    0.1468
14      0.073    0.0828    0.1443     0.1763    0.1462     0.2701    0.1489
15      0.074    0.0815    0.1463     0.1777    0.1482     0.2740    0.1510
16      0.075    0.0802    0.1482     0.1790    0.1502     0.2780    0.1531
17      0.076    0.0788    0.1502     0.1803    0.1522     0.2819    0.1552
18      0.077    0.0774    0.1521     0.1815    0.1543     0.2859    0.1573
19      0.078    0.0759    0.1541     0.1827    0.1563     0.2898    0.1594
20      0.079    0.0744    0.1560     0.1839    0.1583     0.2938    0.1615
21      0.080    0.0727    0.1579     0.1851    0.1603     0.2977    0.1636

[Figure 13 plots: panel (a) β∗ = 0.5 and panel (b) β∗ = 0.0, each showing fitting error (-) against range of σATM (-) for β = 0.0, β = 0.5 and β = 1.0.]

Figure 13: Fitting error as a function of the range of σATM used in the fitting process for n = 2. In a) the initial estimate of β was chosen to be β∗ = 0.5, whereas b) shows the result of using an initial estimate of β∗ = 0.0.

Irrespective of the choice of initial values, figure 13 demonstrates that for the data in table 8, the SABR parameters can be retrieved from two observations of the triplet [σATM, σRR, σBF] provided that these two observations cover a sufficiently wide range of σATM. The requirement that the observations cover a range of σATM can be explained by remembering that each observation contains two constraints and that we have three parameters to fit. Therefore we require at least two observations in order to determine the parameter values uniquely. If the two observations cover a very narrow range of σATM, then the constraints that they provide will be very similar and we will, in effect, only have two constraints.

10.2 Effect of Noise on Fitting

In this section we investigate the stability of the fitting method described above in the presence of noise. To simulate noisy market data, Gaussian noise was added to the simulated market data as

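$$
\begin{aligned}
\tilde{\sigma}_{\mathrm{ATM}} &= \sigma_{\mathrm{ATM}}(1 + \varepsilon_1) \\
\tilde{\sigma}_{\mathrm{RR}} &= \sigma_{\mathrm{RR}}(1 + \varepsilon_2) \\
\tilde{\sigma}_{\mathrm{BF}} &= \sigma_{\mathrm{BF}}(1 + \varepsilon_3),
\end{aligned}
\tag{82}
$$

where each ε_i is drawn from an N(0, Σ_i²) distribution. The noise was chosen to be proportional to the observed value to decrease the probability of obtaining negative values for σ̃ATM or σ̃BF.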
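The noise model of equation (82) can be sketched in a few lines of Python. This is a minimal illustration rather than the code used to produce the figures: the function name `add_quote_noise` and the use of NumPy's `Generator` are assumptions, and the sample quote is point 1 of table 8 for β = 1.0.

```python
import numpy as np


def add_quote_noise(atm, rr, bf, sigmas, rng):
    """Apply multiplicative Gaussian noise to one (ATM, RR, BF) triplet.

    sigmas holds the standard deviations (Sigma_1, Sigma_2, Sigma_3) of the
    noise terms on ATM, RR and BF respectively (hypothetical argument layout).
    """
    eps = rng.normal(loc=0.0, scale=sigmas, size=3)  # one draw per quote
    return atm * (1.0 + eps[0]), rr * (1.0 + eps[1]), bf * (1.0 + eps[2])


# Noise on the butterfly only, i.e. Sigma_1 = Sigma_2 = 0 and Sigma_3 = 0.05.
rng = np.random.default_rng(0)
noisy = add_quote_noise(0.060, 0.2196, 0.1218, (0.0, 0.0, 0.05), rng)
print(noisy)
```

Setting a standard deviation to zero leaves the corresponding quote unchanged, which is how the individual cases in sections 10.2.1 to 10.2.3 can be isolated.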

10.2.1 Noisy σBF

First we investigate the effect of noisy σBF. That is, we set Σ1 and Σ2 equal to zero and vary Σ3. Repeating the methodology employed in section 10.1, we set n = 2 and investigate the influence of the range of σATM between the two observations.

Figure 14 shows the effect of changing the range of σATM and Σ3 on the fitting error. The error shown is the average error from 20 repeats with different random draws for the noise. The true parameter values used to generate the data are β = 1.0, ρ = 0.1 and v = 1.0. It can be seen that the fitting error decreases as the range of σATM is increased. For ranges of σATM > 0.005 the fitting error increases as the amplitude of the noise is increased, as would be expected. Figure 14 suggests that the fitting procedure proposed in section 10 is relatively robust to the presence of noise on σBF.

[Figure 14 plot: fitting error (-) against range of σATM (-) for Σ3 = 0.00, 0.05, 0.10 and 0.20.]

Figure 14: Fitting error as a function of the range of σATM for four values of Σ3. The true parameter values are β = 1.0, ρ = 0.1 and v = 1.0. The fitting error is the average of 20 fits.

10.2.2 Noisy σRR

Figure 15 is the equivalent of figure 14 for σRR. The trends are broadly similar to those observed above: the fitting error decreases as the range of σATM is increased and increases as the amplitude of the noise is increased. However, for non-zero Σ2 the fit quality is poorer than was observed for non-zero Σ3.

[Figure 15 plot: fitting error (-) against range of σATM (-) for Σ2 = 0.00, 0.05, 0.10 and 0.20.]

Figure 15: Fitting error as a function of the range of σATM for four values of Σ2. The true parameter values are β = 1.0, ρ = 0.1 and v = 1.0. The fitting error is the average of 20 fits.

To try to improve the fit quality, we investigate the effect of changing n, the number of points used in the fitting. We again fit the data shown in table 8 with noise described by equation (82). We set n = 20 and select ten observations from σATM = 0.06 and ten from σATM = 0.06 + range. That is, for each fit we consider twenty realisations of the noise term ε2 on σRR. Figure 16 shows the fitting error as a function of range for n = 20. The error shown is again the average error from 20 fits with different random draws for the noise. For small ranges of σATM the fitting error shown in figure 16 is larger than that in figure 15. However, for Σ2 = 0.05 and Σ2 = 0.1 we observe smaller errors than in the corresponding case of n = 2 for large ranges of σATM. When Σ2 = 0.2 the fitting errors for n = 20 are larger than those for n = 2.

[Figure 16 plot: fitting error (-) against range of σATM (-) for Σ2 = 0.00, 0.05, 0.10 and 0.20.]

Figure 16: Fitting error for n = 20 as a function of the range of σATM for four values of Σ2. The true parameter values are β = 1.0, ρ = 0.1 and v = 1.0. The fitting error is the average of 20 fits.

10.2.3 Noisy σATM

Figure 17 is the equivalent of figure 14 for σATM. The fitting error again decreases as the range is increased and increases as the amplitude of the noise is increased. The errors shown in figure 17 are, however, significantly larger than those observed for noisy σRR or noisy σBF. For example, the error for Σ1 = 0.1 with a range of σATM = 0.02 is twice that observed for Σ2 = 0.1. In the fitting technique proposed in section 10, σATM is viewed as a state variable of the system. It is, therefore, perhaps understandable that adding uncertainty to σATM causes large fitting errors: if the state of the system is not known then the constraints provided by σRR and σBF cannot be interpreted correctly. In practice σATM is the most actively traded of the three major FX options. Indeed, one motivation for modelling the FX volatility smile is to allow σRR and σBF to be predicted based on an observed change in σATM. Therefore, one might imagine that there is less noise on σATM than on σRR and σBF.

[Figure 17 plot: fitting error (-) against range of σATM (-) for Σ1 = 0.00, 0.05, 0.10 and 0.20.]

Figure 17: Fitting error as a function of the range of σATM for four values of Σ1. The true parameter values are β = 1.0, ρ = 0.1 and v = 1.0. The fitting error is the average of 20 fits.

10.3 Conclusions

We have proposed a method for fitting the SABR model to market data. This method requires observations of σRR and σBF for at least two different values of σATM. The performance of the proposed method has been studied with and without noise on the volatility quotes. Irrespective of whether noise is present on the volatility quotes or not, the fitting error tends to decrease as the range of σATM increases. This would imply fitting the model using the largest possible range of σATM available, i.e. the model would be fitted to the outliers of σATM. In practice outliers tend not to represent a true opportunity to trade and, hence, are not normally reliable. An agent wishing to fit the SABR model using the method described above must also consider whether the quotes they are calibrating to are representative of the current market conditions. If a large change in σATM is observed then it is possible that the market conditions have changed and that older quotes are not representative of current market dynamics.

Therefore, we recommend choosing a large range of σATM subject to the caveats that the quotes chosen are believed to represent genuine offers to trade, and that the market dynamics at the times the quotes were observed are believed to be relevant to the current market conditions.

Assessing whether a fitting method is accurate enough to be useful is not straightforward: factors to consider include how well the model describes the market and how the model will be used. There is little value in accurately determining parameter values if the model itself, irrespective of the choice of parameters, does not capture the features of the market that we are interested in. We should also consider the impact of incorrect parameter estimation in terms of profit and loss: if we are using a model in such a way that our profit and loss is very sensitive to a parameter value then it is clearly important to determine the value of that parameter as accurately as possible. Nevertheless, we are of the opinion that achieving a fitting error h ≤ 0.01 is probably sufficient for most applications. This would, for example, be equivalent to determining β to within ±0.1.

In the absence of noise, the cases studied above met this criterion when the range of σATM was greater than 0.005. We have also seen that the model parameters can be accurately determined in the presence of noise on σBF. Fitting in the presence of noise on σRR or σATM is much more challenging and care would be required if significant noise was observed on these quotes.
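The link between h ≤ 0.01 and an accuracy of ±0.1 in β can be illustrated with a short sketch. It assumes that the fitting error takes the per-parameter form accumulated in Listing 4 (squared absolute error for β, squared relative errors for ρ and v); the function name `parameter_errors` is hypothetical.

```python
import numpy as np


def parameter_errors(fitted, true):
    """Per-parameter errors in the form accumulated in Listing 4 (assumed)."""
    beta_hat, rho_hat, v_hat = fitted
    beta, rho, v = true
    return np.array([
        (beta_hat - beta) ** 2,      # squared absolute error in beta
        (rho_hat / rho - 1.0) ** 2,  # squared relative error in rho
        (v_hat / v - 1.0) ** 2,      # squared relative error in v
    ])


# A fit that misses beta by 0.1 but recovers rho and v exactly.
errs = parameter_errors(fitted=(0.9, 0.1, 1.0), true=(1.0, 0.1, 1.0))
print(errs)
```

Under this assumption an error of 0.1 in β corresponds to a squared error of 0.01, which is the threshold quoted above.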

11 Example: Pricing a Digital Call

To demonstrate the importance of the correct determination of β, we consider a digital call on a currency pair. The payoff of a domestic paying European digital call option is

$$
D(S_T, K) =
\begin{cases}
1 & \text{if } S_T \ge K \\
0 & \text{if } S_T < K.
\end{cases}
$$

The value of a domestic paying European digital call can be approximated as [13]:

$$
\begin{aligned}
D(S, K) &= -\frac{\partial}{\partial K} C(S, K, \sigma(K)) \\
&= -\frac{\partial C(S, K, \sigma)}{\partial K} - \frac{\partial C(S, K, \sigma)}{\partial \sigma}\frac{\partial \sigma}{\partial K},
\end{aligned}
\tag{83}
$$

where C(S, K, σ) is the price of a vanilla European call with strike K. The second term is called the windmill-adjustment and is the product of the vega of a vanilla option with strike K and the slope of the volatility smile at K.

Consider a situation where the evolution of the forward price, F, obeys the SABR model with parameters β = 0.9, ρ = 0.2 and v = 1.0. The current ATM level is σATM = 0.05 and we price a digital call option with strike ψ = ln(F/K) = −0.001 and time to expiry T − t = 0.5 using equation (83). For simplicity we assume that both foreign and domestic interest rates are zero. The current forward price is F = 1.2. The risk reversal and butterfly spread corresponding to these values are σRR = 0.517% and σBF = 0.215%. The price of the digital call is

$$
\begin{aligned}
D(S, K) &= -\frac{\partial}{\partial K} C(S, K, \sigma(K)) \\
&= \Phi(d_2) - K\sqrt{T - t}\,\Phi'(d_2)\frac{\partial \sigma}{\partial K},
\end{aligned}
\tag{84}
$$

where

$$
d_2 = \frac{\psi - \frac{\sigma^2}{2}(T - t)}{\sigma\sqrt{T - t}}. \tag{85}
$$

The implied volatility at ψ = −0.001 is given by equation (66) where q can be found as the root of equation (67). For the data given above we find q = 0.04807 and σ(−0.001) = 0.0501. We estimate the slope of the volatility smile at K using finite differences:

$$
\frac{\partial \sigma}{\partial K} \approx \frac{\sigma\!\left(\ln\frac{1.2}{1.2013}\right) - \sigma\!\left(\ln\frac{1.2}{1.2011}\right)}{0.0002} \approx 0.0901. \tag{86}
$$

Therefore the price of the digital call is

$$
\begin{aligned}
D(1.2, 1.2012) &= 0.4817 - 1.2012 \times \sqrt{0.5} \times 0.3985 \times 0.0901 && (87) \\
&= 0.4512. && (88)
\end{aligned}
$$

We now observe a step change in σATM from 0.05 to 0.1. Repeating the calculations above for this new state yields a price of 0.4514.
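The calculation for σATM = 0.05 can be reproduced with the following sketch, which chains together the volatility formula as coded in Listing 1, a central finite difference for the slope of the smile, and equations (84) and (85). The function names, the fixed-point iteration used to find q and the finite-difference step of 10⁻⁴ are illustrative assumptions, not the code used in this work.

```python
from math import erf, exp, log, pi, sqrt


def sabr_term1(psi, q, beta, rho, nu, tau):
    """Time-dependent correction term, written as in Listing 1."""
    return tau * (q**2 * (1.0 - beta)**2 / 24.0 * exp(psi * (1.0 - beta))
                  + 0.25 * rho * beta * nu * q * exp(psi * (1.0 - beta) / 2.0)
                  + nu**2 / 24.0 * (2.0 - 3.0 * rho**2))


def sabr_vol(psi, q, beta, rho, nu, tau):
    """Implied volatility at psi = ln(F/K), written as in Listing 1."""
    if psi == 0.0:
        i0 = q
    else:
        if beta == 1.0:
            z = nu / q * psi
        else:
            z = nu / q * (1.0 - exp(-psi * (1.0 - beta))) / (1.0 - beta)
        chi = log((sqrt(1.0 - 2.0 * z * rho + z**2) + z - rho) / (1.0 - rho))
        i0 = nu * psi / chi
    return i0 * (1.0 + sabr_term1(psi, q, beta, rho, nu, tau))


def solve_q(atm, beta, rho, nu, tau, iterations=50):
    """Fixed-point iteration (an assumed solver) for sabr_vol(0, q) = atm."""
    q = atm
    for _ in range(iterations):
        q = atm / (1.0 + sabr_term1(0.0, q, beta, rho, nu, tau))
    return q


def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def norm_pdf(x):
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)


# Inputs from the example in the text.
F, psi, tau = 1.2, -0.001, 0.5
beta, rho, nu, atm = 0.9, 0.2, 1.0, 0.05

K = F * exp(-psi)                       # strike implied by psi = ln(F/K)
q = solve_q(atm, beta, rho, nu, tau)
sigma = sabr_vol(psi, q, beta, rho, nu, tau)

# Central finite difference in strike for the slope of the smile.
h = 1e-4
slope = (sabr_vol(log(F / (K + h)), q, beta, rho, nu, tau)
         - sabr_vol(log(F / (K - h)), q, beta, rho, nu, tau)) / (2.0 * h)

# Equations (85) and (84) with zero interest rates.
d2 = (psi - 0.5 * sigma**2 * tau) / (sigma * sqrt(tau))
price = norm_cdf(d2) - K * sqrt(tau) * norm_pdf(d2) * slope
print(f"q={q:.5f} sigma(K)={sigma:.4f} slope={slope:.4f} price={price:.4f}")
```

The printed values should lie close to those quoted in the text, with small differences possible from rounding of q and of the finite-difference strikes.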

11.1 Digital Pricing Assuming β = 0

Let us repeat the calculation above, this time assuming that we have chosen β = 0 based on prior knowledge. Fitting the market data σATM = 0.05, σRR = 0.517% and σBF = 0.215% under the assumption that β = 0 yields the parameter values shown in table 9. Table 10 compares the pricing of the digital option for the two values of β. The two choices of β result in different values of D(F,K). At σATM = 0.05 the difference between the two prices is negligible but at σATM = 0.1 we see that β = 0 results in a larger price than β = 0.9. Examining the data in table 10 reveals that this difference is caused by different values for the slope of the volatility smile. The two values of β result in the same values for σ(K). This is unsurprising since K is very close to F and σ(F) = σATM is used as a state variable to which both models are fitted.

Table 9: SABR parameters for β = 0

β     ρ         v
0     0.2424    1.016

11.2 Accounting for σRR and σBF

In the above we priced a digital option at two different levels of σATM. For the σATM = 0.1 level no use was made of σRR or σBF. This is possible because, for a specified model, σATM is a state variable. That is, the volatility smile can be uniquely determined if σATM, β, ρ and v are known. Above we considered the case of an agent who observes a step change in σATM and uses this information to update their price for a digital call. Let us now consider an agent who observes σRR and σBF in addition to σATM.

Table 10: Digital option pricing

             β = 0.9              β = 0.0              Refit
σATM         0.05      0.1        0.05      0.1        0.1
σ(K)         0.0501    0.1001     0.0501    0.1001     0.1001
d2           -0.0459   -0.0495    -0.0459   -0.0495    -0.0495
Φ(d2)        0.4817    0.4803     0.4817    0.4803     0.4803
Φ′(d2)       0.3985    0.3985     0.3985    0.3985     0.3985
∂σ/∂K        0.0901    0.0853     0.0913    0.0677     0.0877
D(F,K)       0.4512    0.4514     0.4508    0.4573     0.4506

The risk reversal and butterfly spread for β = 0.9, ρ = 0.2, v = 1.0 and σATM = 0.1 are σRR = 1.06% and σBF = 0.445%. For an agent who has determined the correct SABR parameters, σRR and σBF provide no additional information and the price of the digital call remains 0.4514. An agent who has selected β = 0.0 is able to refit their model to the new market quotes and would obtain new parameter values as shown in table 11. The pricing of the digital call option for these parameters is shown in the final column of table 10. Using these parameter values the slope of the volatility smile, and hence the digital call price, is much closer to those calculated with β = 0.9.

Table 11: Re-fitted SABR parameters for β = 0

β     ρ         v
0     0.2829    1.034

11.3 Discussion

In the above we examined the pricing of a digital option for different values of β. It was observed that following a step change in σATM, different choices of β resulted in different prices for the digital option and it was argued that the price difference arises from differences in the slope of the volatility smile at K. We have also seen that the price difference caused by a different choice of β can be significantly reduced by re-fitting the model to updated values of σATM, σRR and σBF. This is in agreement with the findings of other authors. For example, Hagan and Lesniewski [12] noted that the SABR model can reproduce observed volatility smiles for any value of β.

Therefore accurately determining the value of β cannot be expected to yield a better description of the volatility smile once the model has been fitted to market data. We compare two cases: a model with the correct value of β and a model with arbitrary β that is frequently recalibrated. The model with the correct value of β offers two potential benefits:

1. The correct choice of β will result in a model that requires less frequent recalibration than an arbitrary choice of β. This has been demonstrated above: the model with β = 0 needed to be recalibrated to price the digital option for σATM = 0.1 accurately, whereas no recalibration was required for β = 0.9.

2. If β is calibrated correctly then the model can be used to predict price changes based on observed changes in σATM.

To understand the second point we must remember that σATM is more actively traded than σRR and σBF. It is, therefore, reasonable to assume that a price shock will first impact σATM and that there will be a finite time during which σATM contains new price information that is not yet reflected in σRR or σBF. A model with the correct value of β would allow pricing to be updated without needing to wait for σRR and σBF to be updated. Returning to the digital call example, a correctly calibrated model allows the price information in a change in σATM to be reflected in the price of the digital call without needing to wait for this price information to flow into market quotes for σRR and σBF.

The digital option priced above is actually relatively insensitive to ∂σ/∂K: its price is predominantly determined by Φ(d2). An instrument that is far more sensitive to the shape of the volatility smile is σRR itself. It was demonstrated in figure 12(a) that the curve of σRR against σATM depends strongly on β. Table 12 shows the values of σRR for the model parameters corresponding to the cases studied in section 11.1. It can be seen that the model with β = 0 substantially underestimates the increase in σRR due to the change in σATM.

Table 12: σRR for different choices of β

             β = 0.9              β = 0                Refit
σATM         0.05      0.1        0.05      0.1        0.1
σRR          0.517%    1.06%      0.517%    0.820%     1.06%

The two advantages of fitting β discussed above are most beneficial during large moves in σATM: small changes in σATM will not cause a large change in the SABR parameters for any choice of β. Similarly, although ∂σRR/∂σATM depends on β, small changes in σATM can only cause small deviations in σRR, irrespective of the choice of β. However, as discussed in section 9, large changes in σATM are usually the result of dramatic market events and it is doubtful whether the model parameters remain constant during such events. With this in mind, it is not currently clear whether accurately determining β offers any practical advantages compared to selecting an arbitrary value of β and recalibrating the model on a regular basis.

12 Conclusions

Fitting the SABR model to market data is a challenging task because two of the model parameters (β and ρ) both affect the skew of the volatility smile. It is frequently claimed that β can be found from the slope of a log-log plot of historic values of σATM against F . We have demonstrated here that this is not the case. The SABR model is traditionally described as two correlated stochastic processes. We have shown that, in an FX setting, the SABR model has a single state variable and can be described by a single stochastic differential equation. When working with the model it is useful to regard the at-the-money volatility, σATM, as the state variable of the system because this can be observed directly in the market. The volatility smile is then uniquely described by σATM and the three model parameters: β, ρ and v. Using this representation we have shown how the model parameters can be retrieved from observations of the volatility smile for two or more values of σATM. Accurate determination of the SABR parameter values requires observations of the volatility smile covering a sufficient range of σATM. For the parameter values considered here a range of 0.005 was required to retrieve the parameter values from simulated market data. We have presented three months of market data for EURUSD and USDJPY. The range of σATM for both datasets was approximately 0.015, which is larger than the minimum range required for the fitting method proposed here.

Fitting in the presence of noise has also been examined. Adding noise to σBF had little effect on the ability to determine the parameter values accurately, but larger fitting errors were observed when noise was added to σATM or σRR.

12.1 Suggestions for Further Work

In this work we have considered methods for fitting the SABR model to FX data. We have focused on simulated market data with known parameter values. An important area for future work would be to examine how well the SABR model is able to describe real market data. Specifically, we have seen that the SABR model predicts a unique, deterministic relationship between σATM, σRR and σBF. At first glance this prediction is not borne out by the market data shown in figures 8 and 11. Future work could examine the source of this discrepancy. Do the SABR parameters change over time such that the model predictions hold over shorter timeframes, or is there a pattern to the discrepancies between the market observations and the model predictions?

One motivation for modelling the FX market using the SABR model is that it establishes a relationship between σATM, σRR and σBF. In section 11.3 we considered that new price information might affect σATM before it is reflected in σRR and σBF. An important aspect of further work would be to establish whether this idea is supported by historical data. Therefore we would propose using signal analysis techniques to establish whether there is evidence of a time lag between σRR, σBF and σATM. It would be interesting to know whether the magnitude of any such lag is constant or whether it depends on market conditions.

Finally we would recommend an examination of historic market data from events which caused step-changes in the FX market, for example the UK referendum on EU membership on 23 June 2016 or the removal of the ‘peg’ in the EURCHF market on 15 January 2015. Analysis of these events, which sent shock waves through the FX markets, could focus on two key questions in regard to this work:

1. If we model the market using the SABR model, is there any evidence that model parameters are preserved during dramatic events?

2. Is there evidence of a time lag between σATM, σRR and σBF during such events?

Answering these questions would help to determine whether the correct determination of β offers useful advantages over using an arbitrary value of β and re-fitting the model on a frequent basis.
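As one possible starting point for the time-lag question, the sketch below estimates the lag between two quote series by maximising the correlation of their first differences. This is only one simple signal analysis technique among many; the function name `estimate_lag`, the synthetic series and the imposed lag of three samples are assumptions made purely for illustration and are not market data.

```python
import numpy as np


def estimate_lag(leader, follower, max_lag):
    """Return the lag k >= 0 (in samples) that maximises the correlation
    between changes in `leader` at time t and changes in `follower` at t + k."""
    x = np.diff(np.asarray(leader, dtype=float))
    y = np.diff(np.asarray(follower, dtype=float))
    best_lag, best_corr = 0, -np.inf
    for k in range(max_lag + 1):
        n = len(x) - k
        corr = np.corrcoef(x[:n], y[k:k + n])[0, 1]
        if corr > best_corr:
            best_lag, best_corr = k, corr
    return best_lag, best_corr


# Synthetic illustration: a random-walk "ATM" series and a second series
# that repeats it three samples later with a little observation noise.
rng = np.random.default_rng(1)
atm = 0.07 + np.cumsum(rng.normal(0.0, 1e-4, size=500))
rr = np.empty_like(atm)
rr[3:] = atm[:-3]
rr[:3] = atm[0]
rr = rr + rng.normal(0.0, 1e-6, size=500)

lag, corr = estimate_lag(atm, rr, max_lag=10)
print(lag, corr)
```

Applied to historical series of σATM and σRR (or σBF), a lag that is consistently positive would support the idea that new information reaches σATM first. Working with first differences rather than levels avoids spurious correlation between trending series.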

References

[1] P. S. Hagan, D. Kumar, A. S. Lesniewski, and D. E. Woodward. Managing smile risk. Wilmott, 1:84–108, 2002.

[2] A. Castagna. FX Options and Smile Risk. John Wiley and Sons Ltd, 2010.

[3] D. Reiswich. An empirical comparative analysis of foreign exchange smile calibration procedures. J. Comput. Finance, 60:31–67, 2011.

[4] J. Obłój. Fine-tune your smile: Correction to Hagan. http://arxiv.org/abs/0708.0998, 2008.

[5] H. Berestycki, J. Busca, and I. Florent. Computing the implied volatility in stochastic volatility models. Comm. Pure Appl. Math., 57:1352–1373, 2004.

[6] G. West. Calibration of the SABR model in illiquid markets. Appl. Math. Finance, 12:371–385, 2005.

[7] P. Nowak and P. Sibetz. Volatility smile. http://www.fam.tuwien.ac.at/~sgerhold/pub_files/sem12/s_sibetz_nowak.pdf, 2012.

[8] F. Le Floc’h and G. Kennedy. Explicit SABR calibration through simple expansions. SSRN eLibrary, 2014.

[9] P. S. Hagan, D. Kumar, A. S. Lesniewski, and D. E. Woodward. Arbitrage free SABR. Wilmott, 69:60–75, 2014.

[10] S. Skov Hansen. The SABR model - theory and application. PhD thesis, Copenhagen Business School, 2011.

[11] B. Bartlett. Hedging under SABR model. Wilmott, July/August:68–70, 2006.

[12] P. S. Hagan and A. S. Lesniewski. Bartlett’s delta in the SABR model. http://dx.doi.org/10.2139/ssrn.2950749, 2017.

[13] U. Wystrup. FX Options and Structured Products. John Wiley and Sons Ltd, 2017.

Appendices

A Python Code

Listing 1: The SABR class. Used to calculate σRR and σBF for specified model parameters and values of σATM.

    import numpy as np
    from numpy import exp, log, sqrt
    from scipy.optimize import root
    from scipy.stats import norm

    #Ninv is assumed to be the inverse of the standard normal CDF
    Ninv = norm.ppf


    class SABR:
        def __init__(self, args):
            self.beta = args[0]
            self.rho = args[1]
            self.vol = args[2]
            self.atm = args[3]
            self.v_rr = args[4]
            self.v_bf = args[5]
            self.t_ex = args[6]

        def calc_atm(self, args):
            #ATM volatility implied by a given value of q
            q = args[0]
            term1 = self.t_ex*(q**2*(1.0-self.beta)**2/24.0 +
                               0.25*self.rho*self.beta*self.vol*q +
                               self.vol**2/24.0*(2.0-3.0*self.rho**2))
            atm = q*(1.0+term1)
            return atm

        def atm_error(self, args):
            return self.atm - self.calc_atm(args)

        def set_q(self):
            #Solve for the q that reproduces the observed ATM volatility
            sol = root(self.atm_error, self.atm)
            self.q = sol.x[0]

        def volAtDelta(self, args):
            #Solve for the volatility at the specified delta
            self.delta = args
            sol = root(self.calc_vol_dif, self.atm)
            return sol.x[0]

        def calc_vol_dif(self, args):
            #Calculates the difference between a target vol and the
            #vol at delta
            vol = args[0]
            delta = self.delta
            w = np.sign(self.delta)
            q = self.q
            psi = w*vol*sqrt(self.t_ex)*Ninv(abs(delta)) \
                - 0.5*vol**2*self.t_ex
            if psi == 0.0:
                I0 = q
            else:
                if self.beta == 1:
                    z = self.vol/q*psi
                else:
                    z = self.vol/q*(1.0-exp(-psi*(1.0-self.beta))) \
                        /(1.0-self.beta)
                chi = log((sqrt(1.0-2.0*z*self.rho+z**2)+z-self.rho)
                          /(1-self.rho))
                I0 = self.vol*psi/chi
            term1 = self.t_ex*(q**2*(1.0-self.beta)**2/24.0
                               *exp(psi*(1.0-self.beta)) +
                               0.25*self.rho*self.beta*self.vol*q
                               *exp(psi*(1.0-self.beta)/2.0) +
                               self.vol**2/24.0*(2.0-3.0*self.rho**2))
            return I0*(1.0+term1)-vol

        def calc_strikes(self):
            #Calculates the volatilities at deltas of 0.25 and -0.25
            self.call = self.volAtDelta(0.25)
            self.put = self.volAtDelta(-0.25)

        def calc_rr(self):
            return self.call - self.put

        def calc_bf(self):
            return 0.5*(self.call + self.put) - self.atm

Listing 2: The q_data class. For a specified SABR model it creates lists of σRR and σBF for σATM between 0.06 and 0.08.

    class q_data:
        def __init__(self, args):
            self.beta = args[0]
            self.rho = args[1]
            self.vol = args[2]
            self.t_ex = args[3]
            #21 equally spaced ATM levels, i.e. a grid spacing of 0.001
            self.atm = np.linspace(0.06, 0.08, num=21)
            self.v_rr = []
            self.v_bf = []
            self.q = []
            #Get q from atm
            for i in self.atm:
                #The RR and BF slots are placeholders; SABR does not use them
                x = [self.beta, self.rho, self.vol, i, i, i, self.t_ex]
                smile = SABR(x)
                smile.set_q()
                self.q.append(smile.q)
                smile.calc_strikes()
                self.v_rr.append(smile.calc_rr())
                self.v_bf.append(smile.calc_bf())

Listing 3: Function to be minimised. Implements equation (80).

    def error_q(args, points, step, start, repeats):
        #qd, t_ex and SABRsmile are module-level objects set up in the
        #main program
        beta = args[0]
        rho = args[1]
        vol = args[2]
        err = 0.0
        np.random.seed(seed=2000)
        #Discard the first `start` draws so that each fit sees different noise
        for j in range(start):
            noise = Ninv(np.random.rand())
        for i in range(points):
            for p in range(repeats):
                #Standard normal draw via the inverse CDF of a uniform draw
                noise = Ninv(np.random.rand())
                key = i*step
                atm = qd.atm[key]
                v_bf = qd.v_bf[key]
                #Noise of relative amplitude 0.05 on the risk reversal only
                v_rr = qd.v_rr[key]*(1.0+noise*0.05)
                x = [beta, rho, vol, atm, v_rr, v_bf, t_ex]
                SABRsmile[key] = SABR(x)
                SABRsmile[key].set_q()
                SABRsmile[key].calc_strikes()
                err += (SABRsmile[key].calc_bf()/v_bf-1.0)**2
                err += (SABRsmile[key].calc_rr()/v_rr-1.0)**2
        return err

Listing 4: The main program minimises error_q using the L-BFGS-B method. We loop over the step size between the points to be fitted. At each step size, `repeats` fits are performed and the result is the average error over these fits.

    from scipy.optimize import minimize

    t_ex = 0.25 #Time to expiry, as in table 8
    SABRsmile = {} #Smiles built by error_q, keyed by ATM index (assumed to be a dict)
    #Bounds on beta, rho and v
    bnds = ((0.0, 1.0), (-0.9, 0.9), (0.0, 100.0))
    #Initial guess for beta, rho, v
    x0 = np.array([0.5, 0.0, 0.5])
    #True values of beta, rho, v
    x = np.array([1.0, 0.1, 1.0])
    writeArray = []
    maxPoint = 20 #Max number of steps between the 2 points used in the fitting
    repeats = 20 #Number of fits performed for averaging
    rpts = 1 #Number of points at each value of ATM
    #Create the smile data to be fitted.
    qd = q_data([x[0], x[1], x[2], t_ex])
    #Use 2 points
    for i in range(maxPoint):
        err = np.array([0.0, 0.0, 0.0])
        for j in range(repeats):
            res = minimize(error_q, x0,
                           args=(2, i+1, 2*j*rpts, rpts),
                           method='L-BFGS-B', bounds=bnds,
                           options={'disp': False})
            #Squared absolute error for beta, squared relative errors for rho, v
            err += [(res.x[0]-x[0])**2,
                    (res.x[1]/x[1]-1)**2, (res.x[2]/x[2]-1)**2]
        err = err/repeats
        #Range of ATM covered by the two points (grid spacing is 0.001)
        writeArray.append([(i+1)*0.001, err[0], err[1], err[2]])
