Stochastic Volatility Surface Estimation

Suhas Nayak and George Papanicolaou

May 9, 2006

Abstract

We propose a method for calibrating a volatility surface that matches option prices using an entropy-inspired framework. Starting with a stochastic volatility model for asset prices, we cast the estimation problem as a variational one and derive a Hamilton-Jacobi-Bellman (HJB) equation for the volatility surface. We study the asymptotics of the HJB equation assuming that the stochastic volatility model exhibits fast mean-reversion. From the asymptotic solution of the HJB equation we get an estimate of the stochastic volatility surface. We also incorporate uncertainty in quoted prices through a penalty term, that is, by softening the constraints in the HJB equation. We present numerical solutions of our estimation scheme. We find that, depending on the softness of the constraints, certain parameters of the volatility surface related to the implied volatility smile can be calibrated so that they are stable over time. These parameters are essentially the ones found in previous fast mean-reversion asymptotics papers by Fouque, Papanicolaou and Sircar. We find that our procedure provides a natural way of interpolating between the prior parameters and the parameters of Fouque, Papanicolaou and Sircar.

1 Introduction and Review

Parameter identification for systems governed by partial differential equations is a well-studied inverse problem. In mathematical finance, finding the volatility of a risky asset from options with multiple strikes and maturities is an example. The information contained in the market is not enough to identify a pricing model and so many sets of model parameters and many types of models could potentially be compatible with the observed option prices. We study the problem of estimating a volatility surface from option prices in an incomplete market setting.

1.1 Complete Markets

In the complete markets case, there have been several approaches to estimating volatilities from observed option prices. One may try to use the Black-Scholes partial differential equation

$$C_t + \tfrac{1}{2}\sigma^2(S,t)\,S^2 C_{SS} + rS\,C_S - rC = 0, \quad t < T, \qquad C(S,T) = h(S),$$

directly to estimate $\sigma(S,t)$. This was done in Andersen and Brotherton-Ratcliffe ([2], [3]). Here, $h(S)$ is the payoff of the option. Penalization criteria like smoothness norms were used by Lagnado and Osher [18] and Jackson et al. [16] to regularize the volatility extraction procedure.
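As a sanity check on the forward problem, the following minimal sketch verifies numerically that the closed-form call price satisfies the Black-Scholes PDE in the special case of a constant volatility. The function names, the finite-difference step sizes and the sample contract (at-the-money call, one year to expiry) are illustrative assumptions and are not taken from the works cited above.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal CDF written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(S: float, t: float, K: float, T: float, r: float, sigma: float) -> float:
    """Closed-form Black-Scholes price at time t of a European call (strike K, expiry T)."""
    tau = T - t
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)


def bs_pde_residual(S, t, K, T, r, sigma, dS=1e-2, dt=1e-4):
    """Central-difference estimate of C_t + 0.5*sigma^2*S^2*C_SS + r*S*C_S - r*C."""
    price = lambda s, u: bs_call(s, u, K, T, r, sigma)
    C = price(S, t)
    C_t = (price(S, t + dt) - price(S, t - dt)) / (2.0 * dt)
    C_S = (price(S + dS, t) - price(S - dS, t)) / (2.0 * dS)
    C_SS = (price(S + dS, t) - 2.0 * C + price(S - dS, t)) / dS**2
    return C_t + 0.5 * sigma**2 * S**2 * C_SS + r * S * C_S - r * C


# Hypothetical contract: S = K = 100, r = 5%, sigma = 20%, one year to expiry.
residual = bs_pde_residual(S=100.0, t=0.0, K=100.0, T=1.0, r=0.05, sigma=0.2)
print(f"PDE residual at the money: {residual:.2e}")
```

The inverse problem discussed in this section runs the other way: the prices are given and $\sigma(S,t)$ is the unknown, which is why some form of regularization is needed.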

Another possible approach in the complete market setting is to use a dual of the option-pricing PDE. Dupire’s equation (see [10], [11]) determines the dependence of option prices on strikes and dates. It has the form

$$\frac{\partial C}{\partial T} - \frac{1}{2}\sigma^2(T,K)\,K^2\frac{\partial^2 C}{\partial K^2} + rK\frac{\partial C}{\partial K} = 0, \quad T > t, \qquad C(t,K) = h^*(K).$$

Here, $h^*(K)$ has the same form as $h(S)$. Since this equation has derivatives in $K$ and not in $S$, it is easier to handle because derivative prices are quoted for certain expiration times and strikes. Achdou and Pironneau [1] used this equation in combination with least-squares and a regularization (penalty) term to solve the inverse problem of estimating $\sigma(T,K)$. Their objective was to minimize

$$J(\sigma) = \sum_i \left| C(t_i, K_i) - c_i \right|^2 + J_r(\sigma)$$

over $\sigma$ subject to $C$ solving Dupire’s equation. Here the $\{c_i\}$ are observed derivative prices and $J_r(\sigma)$ is an appropriate Tychonoff regularization functional that involves the $L^2$-norm of $\sigma$ and the $L^2$-norm of its derivatives with respect to $K$ and $T$.
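The structure of such a penalized least-squares objective can be sketched on a grid as follows. This is only an illustration under stated assumptions: the uniform $(T,K)$ grid, the weights and the function names are hypothetical, the model prices are taken as given (in practice they would come from a numerical solution of Dupire's equation, which is not implemented here), and the discrete penalty is one simple stand-in for $J_r$, not the functional used in [1].

```python
import numpy as np


def regularizer(sigma, dT, dK, weights=(1.0, 1.0, 1.0)):
    """Discrete stand-in for J_r: squared L2 norms of sigma and of its T- and K-derivatives.

    sigma[i, j] approximates sigma(T_i, K_j) on a uniform grid with spacings dT and dK.
    """
    w0, wT, wK = weights
    cell = dT * dK  # area element of the quadrature rule
    dsig_dT = np.diff(sigma, axis=0) / dT
    dsig_dK = np.diff(sigma, axis=1) / dK
    return cell * (w0 * np.sum(sigma**2) + wT * np.sum(dsig_dT**2) + wK * np.sum(dsig_dK**2))


def objective(model_prices, observed_prices, sigma, dT, dK, weights=(1.0, 1.0, 1.0)):
    """J(sigma) = sum_i |C_i(model) - c_i|^2 + J_r(sigma)."""
    misfit = np.sum((np.asarray(model_prices) - np.asarray(observed_prices)) ** 2)
    return misfit + regularizer(sigma, dT, dK, weights)


# Hypothetical inputs: a flat 20% surface on a 3 x 4 grid and two quoted options.
flat_surface = np.full((3, 4), 0.2)
J = objective([1.0, 2.0], [1.5, 2.0], flat_surface, dT=0.5, dK=10.0)
print(J)
```

For a flat surface the derivative penalties vanish, so only the price misfit and the norm of $\sigma$ itself contribute.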

1.2 Entropy-based methods

Regularization may also be achieved through the use of entropy. Entropy minimization for calibrating one-period models was used by Buchen and Kelly [7], by Gulko [15], by Jackwerth and Rubinstein [17] and by Platen and Rebolledo [19].

Relative entropy-based methods can be motivated as follows. There is a range of strategies available to hedge an option. If the hedger is fully confident that the option will follow dynamics given by a known model, then it would be worthwhile to ignore observed market volatilities and hedge using the model. If, on the other hand, the hedger believes that current prices of the option fully determine the future evolution of the stock, then it would be a good idea to hedge according to these current option prices. Relative entropy-based methods help bridge the gap between these two possible strategies. They provide estimates for model parameters that are close to prior information while still matching current option prices.

Within the class of relative entropy methods there are two general approaches. In Avellaneda [4], a probability law is found for the risky asset that satisfies certain moment constraints, namely that it matches observed market prices of options, and is close to a prior probability law, which can arise from historical or other econometric information. The closeness to the prior is measured by the relative entropy, which is given by

$$H(P \mid P_0) = E^P\left[\ln \frac{dP}{dP_0}\right],$$

where $P_0$ is the prior distribution and $\frac{dP}{dP_0}$ is the Radon-Nikodym derivative of $P$ with respect to $P_0$. For $H(P \mid P_0)$ to be finite, the measure $P$ has to be absolutely continuous with respect to $P_0$. So, in particular, if under $P_0$ we assume our asset prices follow

$$dX_i(t) = \sum_{j=1}^{\nu} \sigma_{ij}^{(0)}\, dB_j(t) + \mu_i^{(0)}\, dt,$$

where the $\{B_j\}$ are standard Brownian motions, then, by Girsanov’s theorem, we must have that under $P$ the asset follows the price process

$$dX_i(t) = \sum_{j=1}^{\nu} \sigma_{ij}^{(0)}\, dB_j(t) + \mu_i\, dt.$$

Here $\mu_i = \mu_i^{(0)} + \sum_j \sigma_{ij}^{(0)} m_j$ and $m_j$ is a market price of risk. In other words, once the volatility of the asset under the prior probability law is specified, the volatility of the asset under the new measure $P$ must be the same. Only the market price of risk can be adjusted to give a consistent pricing measure.
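The role of absolute continuity is easiest to see for distributions on a finite set, where the relative entropy reduces to $\sum_k p_k \ln(p_k / p_{0,k})$. The sketch below is a discrete analogue only, with a hypothetical function name; it is not the continuous-time construction used in [4].

```python
import math


def relative_entropy(p, p0):
    """H(P|P0) = sum_k p_k * ln(p_k / p0_k) for two distributions on the same finite set.

    Returns math.inf when P puts mass where P0 puts none, i.e. when P is not
    absolutely continuous with respect to P0.
    """
    total = 0.0
    for pk, qk in zip(p, p0):
        if pk == 0.0:
            continue  # convention: 0 * ln(0) = 0
        if qk == 0.0:
            return math.inf
        total += pk * math.log(pk / qk)
    return total


print(relative_entropy([0.5, 0.5], [0.5, 0.5]))  # identical laws
print(relative_entropy([1.0, 0.0], [0.5, 0.5]))  # P concentrated on one state
print(relative_entropy([0.5, 0.5], [1.0, 0.0]))  # P not absolutely continuous w.r.t. P0
```

The third case is the discrete counterpart of changing the volatility in a diffusion model: the new law charges events that the prior considers impossible, and the relative entropy is infinite.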

In this framework we assume that many different, perhaps correlated, shocks drive the evolution of each asset. Moreover, we assume that there is a good prior that describes the effect of fluctuations through the volatilities $\sigma_{ij}^{(0)}$. It is unlikely that we will have an accurate assessment of these volatilities. Yet once they are fixed in the model we are unable to deviate from them in the new probability law because of the absolute continuity that is assumed. In order for the market price of risk to correct for possible misspecification of volatilities, wild swings in its value may be necessary. The work of Carmona and Xu [8], who introduce a stochastic volatility model in an entropy framework, suffers from the same problems. There, the market price of risk associated with the stochastic volatility process is the only degree of flexibility and all other parameters are fixed prior to the estimation procedure.

It is desirable, therefore, to use a different relative entropy approach, one that was introduced in [5]. They considered processes for equity of the form

$$\frac{dS_t}{S_t} = \sigma_{0,t}\, dB_t^{P_0} + \mu_t^{P_0}\, dt \quad \text{under } P_0$$

and

$$\frac{dS_t}{S_t} = \sigma_t\, dB_t^{P} + \mu_t^{P}\, dt \quad \text{under } P.$$

Unless $\sigma_0 = \sigma$ under $P$, the relative entropy of the two measures is infinite, since $P$ and $P_0$ are then mutually singular. Avellaneda et al. [5] extended the concept of relative entropy to probability measures under which the processes do not necessarily have the same volatility. They considered the most singular part of the relative entropy using a time discretization that is based on trinomial trees. With this discretization and a small $\sigma - \sigma_0$ expansion, they found that to leading order the relative entropy looks like $(\sigma^2 - \sigma_0^2)^2$ over each time step.

Both approaches to relative entropy calibration have the same general objective. Once the form of the relative entropy is determined, the objective is to minimize it over all possible $P$ subject to the constraint that the prices of options under $P$ match observed prices.

1.3 Stochastic volatility surface estimation

Finding volatilities across strikes and expiration dates for incomplete markets is a very difficult task. We focus our attention on stochastic volatility models. These models have a large number of parameters that need to be known for pricing purposes and options can be quite sensitive to them. There are, however, several papers that deal with this issue. Broadie et al. [6], for example, fit the parameters of various stochastic volatility models. They first obtain parameters using a long-run time series of underlying asset returns. They then use the information found in option prices to estimate volatility and risk premia. This second stage involves the minimization of an objective function that is just the sum of the squares of the difference in the model-derived Black-Scholes implied volatilities and the implied volatilities that correspond to the data. Their method forces consistency in parameters between the data for the underlying asset and the data for the option prices.

The problem of option pricing in a stochastic volatility setting was tackled in a series of papers by Fouque, Papanicolaou and Sircar (see [12] and the references therein). They developed a method that reduced the number of parameters that were needed for pricing and hedging to just three. Two were derived directly from the smile associated with option prices, and the other was some estimate of the underlying’s volatility. They introduced a type of model where the stochastic volatility factor followed a fast mean-reverting process. The data set of options they then considered was chosen so that the asymptotics would be valid. For example, the strikes of the options were near at-the-money and the options had expiration dates that were neither close nor far away. With this data set they found that their parameters were stable over time. Importantly, the option pricing equations they obtained within the fast mean-reverting regime were independent of the stochastic volatility driving factor. This was not the case in the entropy-based estimation procedure of Carmona and Xu [8]. A subsequent paper of Fouque et al. [14] extended the domain of applicability by introducing structure to the maturity cycles of options. We will not pursue that method here.

In this paper, we introduce a method for calibrating volatility surfaces in a stochastic volatility environment. We use an entropy-inspired framework that allows flexibility in volatilities when passing from the prior probability law to the pricing one. Because we are unable to observe the underlying stochastic volatility process, we adopt the framework of Fouque, Papanicolaou and Sircar and consider a regime where fast mean-reversion is apparent. We also relax the pricing constraints so that our volatility surface is less sensitive to unreliable out-of-the-money option prices. We formulate and solve a Hamilton-Jacobi-Bellman (HJB) equation associated with the variational problem. An HJB equation in a fast mean-reversion setting was studied by Sircar and Zariphopoulou [20] in the context of portfolio optimization. Our problem is different and results in a different HJB equation. From this equation, we obtain a volatility surface that consistently incorporates the Fouque-Papanicolaou-Sircar stochastic volatility parameters. These parameters are found to be stable over time. Our method, in fact, permits a natural interpolation between the model parameters of Fouque, Papanicolaou and Sircar and our own prior. It is an interpolation that depends on the level of pricing uncertainty.

The rest of this paper is organized as follows. In Section 2 we formulate the problem and introduce a method for obtaining the volatility surface. We review the fast mean-reversion regime and simplify the corresponding HJB equation in Section 3. We reduce the complexity of the problem by studying fast mean-reversion asymptotics in Section 4. In Section 5 we describe the iterative procedure that was used to estimate the surface, while in Section 6 we detail the numerical methods we used to solve the resulting partial differential equations. In Section 7 we present the results of our method. We discuss the sensitivity of the method to the various parameters in Section 8. We end with a brief summary and conclusion.

2 Stochastic volatility in the relative entropy framework

2.1 A simplified form for the relative entropy

In the relative entropy framework that we follow, we allow the functional form of the volatility to change from P0, our prior probability law, to P, the probability law we wish to determine, because otherwise we would only have freedom to pick a risk premium. We therefore adapt the method of Avellaneda et al. to the case of stochastic volatility.

In order to do this, we discretize in time the stock and stochastic volatility processes so that we may determine the most singular part of the relative entropy. We assume the following dynamics under P0:

$$\frac{dS_t}{S_t} = r\,dt + \sigma_0(S_t, Y_t)\,dW_t,$$

$$dY_t = \gamma_{0,t}\,dt + \kappa\,d\hat{Z}_t,$$

where we assume the correlation between the two Brownian motions is $\rho$. In other words,

$$d\hat{Z}_t\,dW_t = \rho\,dt.$$

Here, $r$ is the riskless interest rate and $\sigma_0$ is our prior estimate of the volatility surface. $Y_t$ is the stochastic volatility driving factor, which has a drift of $\gamma_0$ and has its own volatility $\kappa$. Under $P$, we assume that the dynamics are given by

$$\frac{dS_t}{S_t} = r\,dt + \sigma(S_t, Y_t)\,dW_t^*, \qquad dY_t = \gamma_t\,dt + \kappa\,d\hat{Z}_t^*. \tag{1}$$

Here, γ and γ0 are functions that may depend on both t and Yt, and again we assume the same level of correlation between the Brownian motions.

Details of the procedure used to get the relative entropy given the dynamics above are provided in Appendix A. The result is that the relative entropy generated per unit of time is just

$$\eta(\sigma) = (\sigma^2 - \sigma_0^2)^2$$

when $\sigma$ is not very far from $\sigma_0$. This particular form is an unsurprising consequence of approximating a distance function.

2.2 Formulation of the variational problem

Since we wish to minimize the relative entropy between our prior and a model that fits the data, we consider the optimization problem

$$\sup_{\sigma}\; -E^P\left[\int_0^T \eta(\sigma_s)\,ds\right]$$

subject to the constraints

$$\left| E^P\left[e^{-rT_i} h_i(S_{T_i})\right] - C_i \right| \le \beta_i$$

for each $i$, where $\beta_i \ge 0$. Here, $h_i(S_{T_i})$ is the payoff of option $i$, which has expiration date $T_i$, and $C_i$ is the price of this option that is quoted in the market.

The constants βi may be considered to be a measure of our confidence in the market data. We note that in previous works involving relative entropy minimization, these constants were taken to be zero (hard price constraints). By allowing deviation from market prices under our new measure P , we have softened the pricing constraints, since it is not practical for us to consider the market data as being exact. Some data points may be the result of only one trade and hence would not be reliable. Imposing hard constraints also leads to kinks in the smile near the data points.

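We simplify the objective above by forming a single Lagrangian. The following analysis is similar to the one in [9] for a different problem. Our objective may be rewritten as

$$\inf_{\sigma}\; E^P\left[\int_0^T e^{-rs}\,\eta(\sigma_s)\,ds\right]$$

subject to

$$\forall i:\quad E^P\left[e^{-rT_i} h_i(S_{T_i})\right] - C_i \le \beta_i \qquad (\lambda_i^+)$$

$$\forall i:\quad C_i - E^P\left[e^{-rT_i} h_i(S_{T_i})\right] \le \beta_i \qquad (\lambda_i^-).$$

Here, the Lagrange multipliers are written next to the constraints. From this, we obtain the augmented Lagrangian, which we try to minimize over both sets of $\lambda$:

$$\sup_{\sigma}\; -E^P\left[\int_0^T e^{-rs}\,\eta(\sigma_s)\,ds\right] + \sum_i (\lambda_i^- - \lambda_i^+)\left(E^P\left[e^{-rT_i} h_i(S_{T_i})\right] - C_i\right) + \sum_i \beta_i(\lambda_i^+ + \lambda_i^-). \tag{2}$$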

Without loss of generality, we may assume that, for each $i$, at most one of $\lambda_i^+$ and $\lambda_i^-$ is nonzero. This is because we are minimizing over the multipliers, and so we could otherwise decrease both $\lambda_i^+$ and $\lambda_i^-$ by an equal positive amount. We would then obtain a new expression where the third term in Equation (2) is decreased but the other two are unchanged.

Now suppose that $\lambda_i^+$ is nonzero (at the minimum). Then our solution is the same as the one for an optimization with the first constraint replaced by equality for that value of $i$. In this case, the second constraint cannot simultaneously be an equality and the solution must therefore be an interior one. Similarly, if $\lambda_i^-$ is nonzero, then the second constraint is an equality. These are the only two possible cases. If we now write, for each $i$, $\lambda_i = \lambda_i^- - \lambda_i^+$, so that $|\lambda_i| = \lambda_i^+ + \lambda_i^-$, we may consider the objective

$$\sup_{\sigma}\; -E^P\left[\int_0^T e^{-rs}\,\eta(\sigma_s)\,ds\right] + \sum_i \lambda_i\left(E^P\left[e^{-rT_i} h_i(S_{T_i})\right] - C_i\right) + \sum_i \beta_i|\lambda_i|. \tag{3}$$

If we have to minimize this over $\lambda_i$ (for each $i$), then there are two cases. First, $\lambda_i$, for some $i$, may be positive, in which case we have solved the original optimization problem with the second inequality replaced by equality. Or $\lambda_i$, for that same $i$, is negative, in which case the original problem with the first constraint replaced by equality has been solved. Hence, the two problems given in Equations (2) and (3) are equivalent, as they both reduce to the same cases once minimization over $\lambda_i$ has been performed.
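To see how the $\beta_i|\lambda_i|$ term softens a price constraint, fix the pricing error $d_i = E^P[e^{-rT_i}h_i(S_{T_i})] - C_i$ for a single option and minimize $\lambda_i d_i + \beta_i|\lambda_i|$ over $\lambda_i$. The minimum is $0$ when $|d_i| \le \beta_i$ and decreases without bound otherwise, so only pricing errors outside the tolerance band are penalized. The sketch below illustrates this on a bounded grid of multipliers; the bound `lam_max`, the grid size and the function names are assumptions made purely for illustration.

```python
def multiplier_term(d, beta, lam):
    """Contribution lam * d + beta * |lam| of one option to the Lagrangian."""
    return lam * d + beta * abs(lam)


def min_over_bounded_lambda(d, beta, lam_max, n=2001):
    """Minimum of the multiplier term over a uniform grid on [-lam_max, lam_max]."""
    grid = [-lam_max + 2.0 * lam_max * k / (n - 1) for k in range(n)]
    return min(multiplier_term(d, beta, lam) for lam in grid)


# Pricing error inside the tolerance band: the best multiplier is zero.
print(min_over_bounded_lambda(d=0.3, beta=0.5, lam_max=10.0))
# Pricing error outside the band: the minimum scales with -lam_max * (|d| - beta).
print(min_over_bounded_lambda(d=0.8, beta=0.5, lam_max=10.0))
print(min_over_bounded_lambda(d=0.8, beta=0.5, lam_max=100.0))
```

Letting the bound on the multiplier grow recovers the hard requirement that the pricing error stay within $\beta_i$, while setting $\beta_i = 0$ recovers exact price matching.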

We may write an indirect value function

$$V^{\lambda}(t, S, y) = \sup_{\sigma}\left\{ -E_t^P\left[\int_t^T e^{-r(s-t)}\,\eta(\sigma_s)\,ds\right] + \sum_i \lambda_i\, E_t^P\left[e^{-r(T_i-t)} h_i(S_{T_i})\right] \right\}.$$

We note that our value function discounts the entropy to make the subsequent expressions simpler. Our objective in Equation (3) amounts to minimizing the expression

$$V^{\lambda}(0, S, y) - \sum_i \lambda_i C_i + \sum_i \beta_i|\lambda_i| \tag{4}$$

over $\lambda$.

We know that this V λ solves the HJB equation that is derived in Appendix B and has the form

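$$V_t - rV + rSV_S + \gamma V_y + \frac{1}{2}\kappa^2 V_{yy} + \Phi\left(\frac{1}{2}S^2 V_{SS},\; \rho\kappa S V_{Sy}\right) = -\sum_i \lambda_i\,\delta(t - T_i)\,h_i(S),$$

$$V(T, S, y) = 0.$$

Here,

$$\Phi(X_1, X_2) = \sup_{\sigma}\left[X_1\sigma^2 + X_2\sigma - \eta(\sigma)\right].$$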

Before describing how to obtain a volatility surface from these equations, it is worthwhile noting some features of the HJB equation. With ρ = 0, the HJB equation becomes much simpler because Φ(X1, X2) may be directly evaluated. The equation decouples and the value function solves the same PDE that was found in Avellaneda et al. [5].
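For $\rho = 0$ the second argument of $\Phi$ vanishes, and with $\eta(\sigma) = (\sigma^2 - \sigma_0^2)^2$ the supremum is attained at $\sigma^2 = X_1/2 + \sigma_0^2$ (when this is nonnegative), giving $\Phi(X_1, 0) = X_1^2/4 + \sigma_0^2 X_1$. The following sketch checks this closed form against a brute-force grid search. The grid bounds and the sample values of $X_1$ and $\sigma_0$ are arbitrary illustrative choices.

```python
import numpy as np


def eta(sigma, sigma0):
    """Relative entropy rate (sigma^2 - sigma0^2)^2."""
    return (sigma**2 - sigma0**2) ** 2


def phi_closed_form(x1, sigma0):
    """sup over sigma of x1*sigma^2 - eta(sigma), valid when x1/2 + sigma0^2 >= 0."""
    return 0.25 * x1**2 + sigma0**2 * x1


def phi_grid(x1, sigma0, sigma_max=2.0, n=200001):
    """Brute-force maximization over a uniform grid of volatilities in [0, sigma_max]."""
    sig = np.linspace(0.0, sigma_max, n)
    vals = x1 * sig**2 - eta(sig, sigma0)
    k = int(np.argmax(vals))
    return float(vals[k]), float(sig[k])


# Hypothetical values: X1 = 0.01 and a prior volatility of 20%.
value, maximizer = phi_grid(x1=0.01, sigma0=0.2)
print(value, phi_closed_form(0.01, 0.2))
print(maximizer, np.sqrt(0.5 * 0.01 + 0.2**2))
```

A positive $X_1$ pushes the optimal volatility above the prior and a negative $X_1$ pushes it below, which is how the value function deforms the prior surface.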

The volatility surface may be derived from the value function as follows. Once we solve for the λ = λ∗ that minimizes Equation (4), we may write the desired volatility surface as

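$$\sigma^*(S, y, t) = \arg\sup_{\sigma}\left[\frac{1}{2}S^2 V^{\lambda^*}_{SS}\,\sigma^2 + \rho\kappa S V^{\lambda^*}_{Sy}\,\sigma - \eta(\sigma)\right]. \tag{5}$$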

We have thus described a way to incorporate stochastic volatility in a relative entropy framework. We did this by first determining an appropriate form for the relative entropy. Once we did this, however, encoding our objective into a value function is standard, and this value function must solve an HJB equation. Apart from the form of the relative entropy, our other main contribution so far is the incorporation of pricing uncertainty through the parameters βi.

Although this methodology seems reasonable, there are some issues that need to be addressed. The HJB equation as written is highly dependent on many prior parameters. We need to know γ, κ and ρ accurately in order to proceed. Since these are all parameters associated with the unobservable stochastic volatility driving factor, determining them is a difficult task. The volatility surface, moreover, is dependent on the actual value of the stochastic volatility driving factor. The next section develops a way to get around these difficulties by considering a regime in which the stochastic volatility driving factor is fast mean-reverting. This represents the point of departure from the standard HJB equation methodology.

3 The fast mean-reversion setting

3.1 A special form for the dynamics of the stochastic volatility driving process

We have so far dealt with the presence of stochastic volatility in a fairly general way. We now consider some special models for the dynamics of the volatility driving factor, Y . One feature that most models of stochastic volatility incorporate is mean reversion. This refers to the tendency of the process to go back to its invariant or long-run distribution. A particularly tractable model that exhibits mean reversion is the Ornstein-Uhlenbeck process

$$dY_t = \alpha(m - Y_t)\,dt + \kappa\,d\hat{Z}_t.$$

Here, $\alpha$ is the rate of mean reversion, while $m$ is the long-run mean of the process. We suppose that $\alpha$, $\kappa$ and $m$ are constants. Moreover, $\hat{Z}_t$ is a Brownian motion that is correlated with the Brownian motion $B_t$ that drives the stock price process, with correlation $\rho$.

The invariant distribution for this process is the law of a random variable $Y_0$ that satisfies

$$E\left[\mathcal{L}g(Y_0)\right] = 0$$

for any smooth and bounded $g$, where

$$\mathcal{L} = \alpha(m - y)\frac{\partial}{\partial y} + \frac{1}{2}\kappa^2\frac{\partial^2}{\partial y^2}.$$

If we let $\Psi(y)$ be the density function of the invariant distribution, it is easily seen that

$$\Psi(y) = \frac{1}{\sqrt{2\pi\nu^2}}\exp\left(-\frac{(y - m)^2}{2\nu^2}\right), \tag{6}$$

where

$$\nu^2 = \frac{\kappa^2}{2\alpha}.$$

This is exactly the normal density with mean $m$ and variance $\nu^2$, which tells us that the parameter $\nu$ controls the size of the equilibrium fluctuations from the mean, $m$.

The fast mean reversion limit was considered in Fouque, Papanicolaou and Sircar [12]. They found empirical evidence for fast mean reversion in options data. Analytically, fast mean reversion corresponds to the limit $\alpha \to \infty$. Care, however, must be taken to ensure that the invariant distribution remains the same as we take this limit. In other words, we would like our model to have fluctuations of the same size regardless of how quickly the volatility reverts to its mean. We therefore take $\kappa = \nu\sqrt{2\alpha}$ for some constant $\nu$.
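The effect of the scaling $\kappa = \nu\sqrt{2\alpha}$ can be checked by simulation. The sketch below samples the Ornstein-Uhlenbeck process with its exact Gaussian transition and compares the sample standard deviation with $\nu$ for several rates of mean reversion. The parameter values, step size, path length and seed are arbitrary choices made for illustration.

```python
import numpy as np


def simulate_ou(alpha, m, nu, y0, dt, n_steps, rng):
    """Sample dY = alpha*(m - Y) dt + nu*sqrt(2*alpha) dZ using the exact transition law."""
    decay = np.exp(-alpha * dt)
    # Conditional std over one step: kappa^2/(2*alpha) * (1 - exp(-2*alpha*dt)) = nu^2 * (1 - decay^2)
    step_std = nu * np.sqrt(1.0 - decay**2)
    shocks = rng.standard_normal(n_steps)
    y = np.empty(n_steps + 1)
    y[0] = y0
    for k in range(n_steps):
        y[k + 1] = m + (y[k] - m) * decay + step_std * shocks[k]
    return y


rng = np.random.default_rng(0)
for alpha in (1.0, 10.0, 100.0):
    path = simulate_ou(alpha, m=0.0, nu=0.5, y0=0.0, dt=0.01, n_steps=200_000, rng=rng)
    # Discard a short burn-in, then compare the sample std with nu = 0.5.
    print(alpha, path[1000:].std())
```

The size of the fluctuations stays near $\nu$ while the decorrelation time shrinks like $1/\alpha$, which is the regime the asymptotic analysis exploits.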

If this is the specification of Y under the physical measure, then to price derivatives we need to consider the process under the risk-neutral law. Suppose that under some physical probability law the stock and volatility driving process follow

$$dS_t = \mu S_t\,dt + \sigma_0(S_t, Y_t)\,S_t\,dB_{0,t},$$

$$dY_t = \alpha(m - Y_t)\,dt + \nu\sqrt{2\alpha}\,d\hat{Z}_{0,t}.$$

Then, under a risk-neutral law, the processes must follow, by Girsanov’s theorem,

$$dS_t = rS_t\,dt + \sigma_0(Y_t)\,S_t\,dB^*_{0,t},$$

$$dY_t = \left[\alpha(m - Y_t) - \nu\sqrt{2\alpha}\,\rho\,\frac{\mu - r}{\sigma_0} - \nu\sqrt{2\alpha}\,\gamma_{0,t}\sqrt{1 - \rho^2}\right]dt + \nu\sqrt{2\alpha}\,d\hat{Z}^*_{0,t}. \tag{7}$$

Here, $\gamma_{0,t}$ is a risk-premium factor, or market price of volatility risk, that parametrizes the space of equivalent martingale measures. By taking $\gamma_{0,t} = \gamma_0(t, S_t, Y_t)$, we specifically restrict ourselves to a Markovian setting. Equation (7) is the specification of the dynamics of the processes $S$ and $Y$ under our prior risk-neutral probability law.

We now just need to find the relative entropy between another risk-neutral probability law and our prior. We assume that under P the price of the stock evolves as

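$$dS_t = rS_t\,dt + \sigma(S_t, Y_t)\,S_t\,dB^*_t,$$

$$dY_t = \left[\alpha(m - Y_t) - \nu\sqrt{2\alpha}\,\rho\,\frac{\mu - r}{\sigma} - \nu\sqrt{2\alpha}\,\gamma_t\sqrt{1 - \rho^2}\right]dt + \nu\sqrt{2\alpha}\,d\hat{Z}^*_t. \tag{8}$$

With these particular assumptions on the dynamics of $Y_t$, the form of $\eta(\sigma)$ to first order in $\rho$ is unchanged from the analysis in Appendix A.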

It is worth noting one important aspect of the dynamics under the prior probability law. We have assumed that σ0(St, Yt) = σ0(Yt). In other words, our prior estimate of the volatility is dependent only on the stochastic volatility driving factor and not on the stock price. With this choice of functional form, σ0 corresponds to the function f in the work of Fouque, Papanicolaou and Sircar [12]. Although this particular form seems like a heavy restriction, it is not. The reason is that the fast mean-reversion setting does not need an explicit functional form for the prior volatility estimate. Instead, this setting replaces explicit functional forms with observable quantities and averages. This will be made clear later.

3.2 Simplifying the HJB equation

The HJB equation we need to solve is now a little different from the one we derived earlier. Because $\sigma$ appears in the denominator of one of the drift terms for $Y_t$ in Equation (8), $\Phi$ must now be a function of three variables. Specifically, we define $\Phi$ as

$$\Phi(X_1, X_2, X_3) = \sup_{\sigma}\left[\sigma^2 X_1 + \rho X_2\sigma + \rho X_3\frac{1}{\sigma} - \eta(\sigma)\right]. \tag{9}$$

The HJB equation thus becomes

$$V_t - rV + rSV_S + \left(\alpha(m - y) - \nu\sqrt{2\alpha}\,\gamma\sqrt{1 - \rho^2}\right)V_y + \nu^2\alpha V_{yy} + \Phi\left(\frac{1}{2}S^2 V_{SS},\; \nu\sqrt{2\alpha}\,S V_{Sy},\; -\nu\sqrt{2\alpha}(\mu - r)V_y\right) = -\sum_i \lambda_i\,\delta(t - T_i)\,h_i(S). \tag{10}$$

Before we carry out the asymptotic analysis we simplify Φ for σ close to σ0 and for ρ small. We therefore expand σ in ρ (a small ρ expansion). Let:

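$$\sigma = \tilde{\sigma} + A\rho,$$

where $A$ is a coefficient to be determined. Here, $\tilde{\sigma}$ is taken to be the maximizer when $\rho = 0$, which is just given by $\tilde{\sigma}^2 = X_1/2 + \sigma_0^2$. The first-order condition for a maximum gives

$$2\sigma X_1 + \rho X_2 - \rho\frac{X_3}{\sigma^2} - 4\sigma(\sigma^2 - \sigma_0^2) = 0.$$

Upon substituting for $\sigma$, we find that the coefficient of $\rho$ is

$$2AX_1 + X_2 - \frac{X_3}{\tilde{\sigma}^2} - 4A(\tilde{\sigma}^2 - \sigma_0^2) - 8A\tilde{\sigma}^2 = X_2 - \frac{X_3}{\tilde{\sigma}^2} - 8A\tilde{\sigma}^2.$$

Equating the coefficient of $\rho$ to zero yields

$$X_2 - \frac{X_3}{\tilde{\sigma}^2} - 8A\tilde{\sigma}^2 = 0,$$

which implies that

$$A = \frac{X_2 - X_3/\tilde{\sigma}^2}{4X_1 + 8\sigma_0^2}.$$

So, to first order in $\rho$,

$$\Phi(X_1, X_2, X_3) = \frac{X_1^2}{4} + \sigma_0^2 X_1 + \rho X_2\tilde{\sigma} + \rho\frac{X_3}{\tilde{\sigma}},$$

which, upon expanding $\tilde{\sigma}$, leads to

$$\Phi(X_1, X_2, X_3) = \frac{X_1^2}{4} + \sigma_0^2 X_1 + \rho X_2\left(\frac{X_1}{2} + \sigma_0^2\right)^{1/2} + \rho X_3\left(\frac{X_1}{2} + \sigma_0^2\right)^{-1/2}.$$

Given this form of $\Phi$, we note that the maximizing value of $\sigma$ is approximated, to leading order, by $\tilde{\sigma}$.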

We remark also that the small $\rho$ expansion may be justified a posteriori. We will find that the $y$-dependence in our value function only occurs at the $1/\alpha$ scale. Since $X_2 \propto \sqrt{\alpha}\,V_{xy}$ (with $x = \log S$), this implies that $X_2$ scales like $1/\sqrt{\alpha}$, which is small for large $\alpha$. A similar argument holds for $X_3$. Because $\rho$ always premultiplies $X_2$ or $X_3$, the small $\rho$ expansion of $\Phi$ is reasonable.
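The quality of the first-order expression for $\Phi$ can be checked numerically: its error should be of second order in $\rho$, so doubling $\rho$ should roughly quadruple it. The sketch below compares a grid search with the expansion. The sample values of $X_1$, $X_2$, $X_3$ and $\sigma_0$ are arbitrary, and the search is restricted to a bracket of volatilities around $\sigma_0$ (an assumption of this sketch) because the $\rho X_3/\sigma$ term is singular at $\sigma = 0$.

```python
import numpy as np


def phi_numeric(x1, x2, x3, rho, sigma0, sigma_lo=0.05, sigma_hi=1.0, n=400001):
    """Grid maximum of sigma^2*x1 + rho*x2*sigma + rho*x3/sigma - (sigma^2 - sigma0^2)^2."""
    sig = np.linspace(sigma_lo, sigma_hi, n)
    vals = sig**2 * x1 + rho * x2 * sig + rho * x3 / sig - (sig**2 - sigma0**2) ** 2
    return float(vals.max())


def phi_first_order(x1, x2, x3, rho, sigma0):
    """First-order-in-rho approximation, evaluated at sigma_tilde^2 = x1/2 + sigma0^2."""
    s2 = 0.5 * x1 + sigma0**2
    return 0.25 * x1**2 + sigma0**2 * x1 + rho * x2 * np.sqrt(s2) + rho * x3 / np.sqrt(s2)


# Hypothetical arguments; rho is negative, as is typical for equity data.
args = dict(x1=0.01, x2=0.03, x3=0.0005, sigma0=0.2)
for rho in (-0.1, -0.2):
    err = phi_numeric(rho=rho, **args) - phi_first_order(rho=rho, **args)
    print(rho, err)
```

The error is nonnegative because the first-order expression is exactly the objective evaluated at $\tilde{\sigma}$, which cannot exceed the supremum.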

We now have a suitable form of the HJB equation, which uses the small ρ approximation for Φ.

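$$V_t - rV + rSV_S + \left(\alpha(m - y) - \nu\sqrt{2\alpha}\,\gamma\sqrt{1 - \rho^2}\right)V_y + \nu^2\alpha V_{yy} + \frac{1}{4}\left(\frac{1}{2}S^2 V_{SS}\right)^2 + \sigma_0^2\left(\frac{1}{2}S^2 V_{SS}\right)$$

$$\quad + \rho\nu\sqrt{2\alpha}\,S V_{Sy}\left(\frac{1}{4}S^2 V_{SS} + \sigma_0^2\right)^{1/2} - \rho\nu\sqrt{2\alpha}(\mu - r)V_y\left(\frac{1}{4}S^2 V_{SS} + \sigma_0^2\right)^{-1/2} = -\sum_i \lambda_i\,\delta(t - T_i)\,h_i(S),$$

$$V(T, S, y) = 0. \tag{11}$$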

4 Fast mean-reversion asymptotics applied to the HJB equation

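In order to apply fast mean-reversion asymptotics, we suppose that $\alpha$ is large. For book-keeping purposes, we replace $\alpha$ by $\alpha/\varepsilon$. We consider the value function corresponding to this situation in the limit as $\varepsilon \to 0$.

We first let $x = \log S$ (which means that $\frac{1}{2}S^2 V_{SS} = \frac{1}{2}(V_{xx} - V_x)$). Substituting this into Equation (11) yields

$$V_t - rV + \left(r - \frac{1}{2}\sigma_0^2\right)V_x + \frac{1}{2}\sigma_0^2 V_{xx} + \left(\alpha(m - y) - \nu\sqrt{2\alpha}\,\gamma\sqrt{1 - \rho^2}\right)V_y + \nu^2\alpha V_{yy} + \frac{1}{16}(V_{xx} - V_x)^2$$

$$\quad + \rho\nu\sqrt{2\alpha}\,V_{xy}\left(\frac{V_{xx} - V_x}{4} + \sigma_0^2\right)^{1/2} - \rho\nu\sqrt{2\alpha}(\mu - r)V_y\left(\frac{V_{xx} - V_x}{4} + \sigma_0^2\right)^{-1/2} = -\sum_i \lambda_i h_i(x)\,\delta(t - T_i). \tag{12}$$

Here, $\sigma_0 = \sigma_0(y)$.

We now replace $\alpha$ by $\alpha/\varepsilon$ so that we may apply some asymptotic techniques. We therefore write our equation as

$$\left(\mathcal{L}_2 + \frac{1}{\sqrt{\varepsilon}}\mathcal{L}_1 + \frac{1}{\varepsilon}\mathcal{L}_0\right)V + \frac{(V_{xx} - V_x)^2}{16} + \frac{1}{\sqrt{\varepsilon}}\rho\nu\sqrt{2\alpha}\,V_{xy}\left(g(V_{xx}, V_x, y) - 1\right) - \frac{1}{\sqrt{\varepsilon}}\rho\nu\sqrt{2\alpha}(\mu - r)V_y\left(\frac{1}{g(V_{xx}, V_x, y)} - 1\right) = -\sum_i \lambda_i h_i(x)\,\delta(t - T_i), \tag{13}$$

where

$$\mathcal{L}_2 = \frac{\partial}{\partial t} - r + \left(r - \frac{1}{2}\sigma_0(y)^2\right)\frac{\partial}{\partial x} + \frac{1}{2}\sigma_0(y)^2\frac{\partial^2}{\partial x^2},$$

$$\mathcal{L}_1 = \rho\nu\sqrt{2\alpha}\frac{\partial^2}{\partial x\,\partial y} - \nu\sqrt{2\alpha}\,\gamma\sqrt{1 - \rho^2}\frac{\partial}{\partial y} - \rho\nu\sqrt{2\alpha}(\mu - r)\frac{\partial}{\partial y},$$

$$\mathcal{L}_0 = \alpha(m - y)\frac{\partial}{\partial y} + \nu^2\alpha\frac{\partial^2}{\partial y^2},$$

$$g(V_{xx}, V_x, y) = \left(\frac{V_{xx} - V_x}{4} + \sigma_0^2\right)^{1/2}. \tag{14}$$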

If we suppose that $\frac{V_{0,xx} - V_{0,x}}{4}$ is small in relation to $\sigma_0^2$, which is reasonable as long as our prior is somewhat close to the actual $\sigma$, then we may approximate $g$ as just

$$g(y) = \sigma_0(y). \tag{15}$$

This approximation is consistent with the other approximations we have made in deriving a form of the relative entropy. We note that this approximation for g is also what we would have if we had happened to pick a prior that matched the prices within the tolerance desired.

We also expand $V$ in powers of $\sqrt{\varepsilon}$:

$$V = V_0 + \sqrt{\varepsilon}\,V_1 + \varepsilon V_2 + \varepsilon\sqrt{\varepsilon}\,V_3 + \cdots.$$

4.1 The leading order term

In order to solve for the $V_0$ term we equate the terms in the expansion of the HJB equation at the first few scales. At the $1/\varepsilon$ scale,

$$\mathcal{L}_0 V_0 = 0,$$

which implies that $V_0 = V_0(t, x)$ (it is independent of $y$).

At the $1/\sqrt{\varepsilon}$ scale,

$$\mathcal{L}_0 V_1 + \mathcal{L}_1 V_0 + \rho\nu\sqrt{2\alpha}\,V_{0,xy}\left(g(V_{0,xx}, V_{0,x}, y) - 1\right) - \rho\nu\sqrt{2\alpha}(\mu - r)V_{0,y}\left(\frac{1}{g(V_{0,xx}, V_{0,x}, y)} - 1\right) = 0,$$

but since $V_0$ is independent of $y$, the equation becomes

$$\mathcal{L}_0 V_1 = 0,$$

and this implies, as before, $V_1 = V_1(t, x)$.

At the order 1 scale,

$$\mathcal{L}_2 V_0 + \mathcal{L}_1 V_1 + \mathcal{L}_0 V_2 + \frac{1}{16}(V_{0,xx} - V_{0,x})^2 = -\sum_i \lambda_i h_i(x)\,\delta(t - T_i),$$

where, again, we have used that $V_0$ and $V_1$ are independent of $y$ to get rid of the $V_{xy}$ part of the nonlinearity. Similarly, we may drop the $\mathcal{L}_1 V_1$ term, since $\mathcal{L}_1 V_1 = 0$. Solvability for $V_2$ requires that the other terms have average 0 with respect to the invariant distribution. Since $V_0$ is independent of $y$, this solvability condition yields the following equation for $V_0$:

$$\mathcal{L}_{BS}(\bar{\sigma})V_0 + \frac{1}{16}(V_{0,xx} - V_{0,x})^2 = -\sum_i \lambda_i h_i(x)\,\delta(t - T_i). \tag{16}$$

Here, $\mathcal{L}_{BS}(\bar{\sigma})$ is the Black-Scholes operator after the log transformation has been performed, with $\bar{\sigma}^2 = \langle\sigma_0^2\rangle$ the average of $\sigma_0(y)^2$ with respect to the invariant distribution. The terminal condition is $V_0(T, x) = 0$. This is exactly the equation that is solved in Avellaneda et al. [5]. It appears here as the zero order approximation in our method.

4.2 The correction term

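Once we have solved the equation for $V_0$, we may subtract the average (which we have set to zero) to get an equation for $V_2$:

$$\mathcal{L}_0 V_2 + \frac{1}{2}\left(\sigma_0(y)^2 - \bar{\sigma}^2\right)\left(\frac{\partial^2 V_0}{\partial x^2} - \frac{\partial V_0}{\partial x}\right) = 0,$$

which implies that

$$V_2(t, x, y) = -\frac{1}{2}\mathcal{L}_0^{-1}\left(\sigma_0(y)^2 - \bar{\sigma}^2\right)\left(\frac{\partial^2 V_0}{\partial x^2} - \frac{\partial V_0}{\partial x}\right).$$

We may therefore write $V_2$ as

$$V_2(t, x, y) = -\frac{1}{2}\left(\phi(y) + c(t, x)\right)\left(\frac{\partial^2 V_0}{\partial x^2} - \frac{\partial V_0}{\partial x}\right), \tag{17}$$

where $\phi(y)$ is a solution of the Poisson equation

$$\mathcal{L}_0\phi = \sigma_0(y)^2 - \langle\sigma_0^2\rangle.$$

Moving on to the next scale ($\sqrt{\varepsilon}$), the relevant equation is

$$\mathcal{L}_2 V_1 + \mathcal{L}_1 V_2 + \mathcal{L}_0 V_3 + \frac{1}{8}(V_{0,xx} - V_{0,x})(V_{1,xx} - V_{1,x}) + \rho\nu\sqrt{2\alpha}\,V_{2,xy}\left(g(V_{0,xx}, V_{0,x}, y) - 1\right) - \rho\nu\sqrt{2\alpha}(\mu - r)V_{2,y}\left(\frac{1}{g(V_{0,xx}, V_{0,x}, y)} - 1\right) = 0.$$

Solvability for $V_3$ implies

$$\mathcal{L}_{BS}(\bar{\sigma})V_1 + \frac{1}{8}(V_{0,xx} - V_{0,x})(V_{1,xx} - V_{1,x}) + \rho\nu\sqrt{2\alpha}\left\langle g(V_{0,xx}, V_{0,x}, y)\frac{\partial^2 V_2}{\partial x\,\partial y}\right\rangle - \rho\nu\sqrt{2\alpha}(\mu - r)\left\langle\frac{1}{g(V_{0,xx}, V_{0,x}, y)}\frac{\partial V_2}{\partial y}\right\rangle - \nu\sqrt{2\alpha}\sqrt{1 - \rho^2}\left\langle\gamma\frac{\partial V_2}{\partial y}\right\rangle = 0. \tag{18}$$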

The averages in the last three terms of Equation (18) may be further expanded into terms with derivatives in V0, using Equation (17). Specifically, we obtain the following relations

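$$\left\langle g(V_{0,xx}, V_{0,x}, y)\frac{\partial^2 V_2}{\partial x\,\partial y}\right\rangle = \frac{1}{2}\langle g\phi'\rangle\left(\frac{\partial^3 V_0}{\partial x^3} - \frac{\partial^2 V_0}{\partial x^2}\right),$$

$$\left\langle\frac{1}{g(V_{0,xx}, V_{0,x}, y)}\frac{\partial V_2}{\partial y}\right\rangle = \frac{1}{2}\left\langle\frac{1}{g}\phi'\right\rangle\left(\frac{\partial^2 V_0}{\partial x^2} - \frac{\partial V_0}{\partial x}\right), \tag{19}$$

$$\left\langle\gamma\frac{\partial V_2}{\partial y}\right\rangle = \frac{1}{2}\langle\gamma\phi'\rangle\left(\frac{\partial^2 V_0}{\partial x^2} - \frac{\partial V_0}{\partial x}\right).$$

So our equation for the correction $V_1$ becomes

$$\mathcal{L}_{BS}(\bar{\sigma})V_1 + \frac{1}{8}(V_{0,xx} - V_{0,x})(V_{1,xx} - V_{1,x}) + \tilde{A}_3\frac{\partial^3 V_0}{\partial x^3} + \tilde{A}_2\frac{\partial^2 V_0}{\partial x^2} + \tilde{A}_1\frac{\partial V_0}{\partial x} = 0, \tag{20}$$

which is a linear equation with a source term. In particular, the coefficients $\tilde{A}_1$, $\tilde{A}_2$ and $\tilde{A}_3$ are given by

$$\tilde{A}_1 = \frac{1}{\sqrt{2}}\rho\nu\sqrt{\alpha}\,(\mu - r)\left\langle\frac{1}{g}\phi'\right\rangle + \frac{1}{\sqrt{2}}\nu\sqrt{\alpha}\sqrt{1 - \rho^2}\,\langle\gamma\phi'\rangle,$$

$$\tilde{A}_2 = -\frac{1}{\sqrt{2}}\rho\nu\sqrt{\alpha}\,\langle g\phi'\rangle - \frac{1}{\sqrt{2}}\rho\nu\sqrt{\alpha}\,(\mu - r)\left\langle\frac{1}{g}\phi'\right\rangle - \frac{1}{\sqrt{2}}\nu\sqrt{\alpha}\sqrt{1 - \rho^2}\,\langle\gamma\phi'\rangle,$$

$$\tilde{A}_3 = \frac{1}{\sqrt{2}}\rho\nu\sqrt{\alpha}\,\langle g\phi'\rangle.$$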

We note that, with this method, and as it stands now, we need the correlation ρ and the various parameters of the stationary distribution of the stochastic volatility driving factor.

4.3 Determination of the group parameters

With the approximation given in Equation (15), our parameters may be written as

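$$A_1 = \sqrt{\varepsilon}\left[\frac{1}{\sqrt{2}}\rho\nu\sqrt{\alpha}\,(\mu - r)\left\langle\frac{1}{\sigma_0}\phi'\right\rangle + \frac{1}{\sqrt{2}}\nu\sqrt{\alpha}\sqrt{1 - \rho^2}\,\langle\gamma\phi'\rangle\right],$$

$$A_2 = \sqrt{\varepsilon}\left[-\frac{1}{\sqrt{2}}\rho\nu\sqrt{\alpha}\,\langle\sigma_0\phi'\rangle - \frac{1}{\sqrt{2}}\rho\nu\sqrt{\alpha}\,(\mu - r)\left\langle\frac{1}{\sigma_0}\phi'\right\rangle - \frac{1}{\sqrt{2}}\nu\sqrt{\alpha}\sqrt{1 - \rho^2}\,\langle\gamma\phi'\rangle\right],$$

$$A_3 = \sqrt{\varepsilon}\,\frac{1}{\sqrt{2}}\rho\nu\sqrt{\alpha}\,\langle\sigma_0\phi'\rangle. \tag{21}$$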

These parameters, which we call our group parameters, are the same as the A˜i but they are now scaled by √ε.

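We may determine these parameters as combinations of parameters that are observable from the smile. To do this, we outline the results in Fouque, Papanicolaou and Sircar [12]. In their work, they found that the price of an option, $C$, may be expanded as

$$C(t, x; T, K) = C_{BS}(t, x; T, K) - (T - t)\left[(2V_3 - V_2)\frac{\partial C_{BS}}{\partial x} + (V_2 - 3V_3)\frac{\partial^2 C_{BS}}{\partial x^2} + V_3\frac{\partial^3 C_{BS}}{\partial x^3}\right], \tag{22}$$

where $C_{BS}$ is the Black-Scholes option price and $x = \log(S)$. Here,

$$V_2 = \sqrt{\varepsilon}\left[\frac{1}{\sqrt{2}}\nu\sqrt{\alpha}\;2\rho\,\langle\sigma_0\phi'\rangle - \frac{1}{\sqrt{2}}\rho\nu\sqrt{\alpha}\,(\mu - r)\left\langle\frac{1}{\sigma_0}\phi'\right\rangle - \frac{1}{\sqrt{2}}\nu\sqrt{\alpha}\sqrt{1 - \rho^2}\,\langle\gamma\phi'\rangle\right],$$

$$V_3 = \sqrt{\varepsilon}\,\frac{1}{\sqrt{2}}\rho\nu\sqrt{\alpha}\,\langle\sigma_0\phi'\rangle. \tag{23}$$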

We deduce, then, that our approximate $A_i$ given by Equation (21) are exactly

$$A_1 = 2V_3 - V_2, \qquad A_2 = V_2 - 3V_3, \qquad A_3 = V_3.$$

The parameters $V_2$ and $V_3$ may be determined from the smile. By expanding the price of the option in powers of $\sqrt{\varepsilon}$, we may write

$$C(t, x; T, K; I) = C_{BS}(t, x; T, K; \sigma_0) + \sqrt{\varepsilon}\,I_1\frac{\partial C_{BS}}{\partial\sigma}, \tag{24}$$

where $I_1$ is the correction to the implied volatility. By equating Equations (22) and (24), we find that the implied volatility $I$ is just

$$I = \sigma_0 + \frac{V_3}{\sigma_0^3}\left(r + \frac{3}{2}\sigma_0^2\right) - \frac{V_2}{\sigma_0} - \frac{V_3}{\sigma_0^3}\,\frac{\log(K/S)}{T - t}.$$

The implied volatility is therefore an affine function of the log-moneyness-to-maturity ratio

$$\mathrm{LMMR} = \frac{\log(K/S)}{T - t},$$

which means that we may fit the parameters $a_{FPS}$ and $b_{FPS}$ below to the smile curve determined by option prices,

$$I = a_{FPS}\,\mathrm{LMMR} + b_{FPS},$$

where

$$a_{FPS} = -\frac{V_3}{\sigma_0^3}, \qquad b_{FPS} = \sigma_0 + \frac{V_3}{\sigma_0^3}\left(r + \frac{3}{2}\sigma_0^2\right) - \frac{V_2}{\sigma_0}.$$

This means that the scaled parameters $A_i$ which are required to determine the correction term $\sqrt{\varepsilon}\,V_1$ are explicitly expressed as combinations of the parameters $a_{FPS}$ and $b_{FPS}$. Moreover, given any such parameters $a$ and $b$ determined from a smile curve, the parameters $A_i$ are given by

$$A_1 = -2a\sigma_0^3 - \sigma_0\left((\sigma_0 - b) - a\left(r + \frac{3}{2}\sigma_0^2\right)\right),$$

$$A_2 = \sigma_0\left((\sigma_0 - b) - a\left(r + \frac{3}{2}\sigma_0^2\right)\right) + 3a\sigma_0^3,$$

$$A_3 = -a\sigma_0^3. \tag{25}$$
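The map from a fitted smile to the group parameters is short enough to sketch in code. The example below fits the slope and intercept of implied volatility against LMMR by ordinary least squares and then applies Equation (25). The function names and the synthetic smile are hypothetical, and real option data would of course not lie exactly on a line.

```python
import numpy as np


def fit_smile(lmmr, implied_vol):
    """Least-squares fit of I = a * LMMR + b; returns (a, b)."""
    a, b = np.polyfit(lmmr, implied_vol, 1)  # degree-1 fit returns [slope, intercept]
    return float(a), float(b)


def group_parameters(a, b, sigma0, r):
    """Equation (25): (A1, A2, A3) from the smile slope a and intercept b."""
    v3 = -a * sigma0**3
    v2 = sigma0 * ((sigma0 - b) - a * (r + 1.5 * sigma0**2))
    return 2.0 * v3 - v2, v2 - 3.0 * v3, v3


# Synthetic smile generated from assumed values V2 = 0.003, V3 = -0.0005.
sigma0, r = 0.2, 0.05
V2, V3 = 0.003, -0.0005
a_true = -V3 / sigma0**3
b_true = sigma0 + V3 / sigma0**3 * (r + 1.5 * sigma0**2) - V2 / sigma0

lmmr = np.linspace(-0.5, 0.5, 11)
implied_vol = a_true * lmmr + b_true

a_fit, b_fit = fit_smile(lmmr, implied_vol)
A1, A2, A3 = group_parameters(a_fit, b_fit, sigma0, r)
print(A1, A2, A3)
```

On this noise-free input the recovered values agree with $2V_3 - V_2$, $V_2 - 3V_3$ and $V_3$, which is a useful round-trip check before applying the fit to market data.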

We have thus expressed the parameters needed for the correction term as simple expressions involving observable quantities. This is the power of the fast mean-reversion setting. Individual estimates of the parameters of the stochastic volatility models are not needed. Rather, group parameters like the Ai are all that are needed, not only to price options as in Fouque, Papanicolaou and Sircar [12] but also to determine minimal entropy volatility surfaces as in this paper.

4.4 Summary of our analysis

What we have achieved is a set of equations that incorporates the objectives we wanted:

1. Uncertainty over the prior in the form of stochastic volatility that is independent of the specific form of the volatility, since we only need the group parameters A1, A2 and A3.

2. A value function that does not depend on the stochastic volatility driving factor because of the use of fast mean-reversion.

3. Uncertainty in the observed market prices.

The asymptotics helped us isolate the leading two terms of the value function that solved the HJB equation. These two terms were, indeed, independent of the stochastic volatility driving factor. The first term, moreover, solved the same equation that was solved in Avellaneda et al. [5].

5 Description of the estimation procedure

Our analysis permits a simplified approach to the estimation of the stochastic volatility surface. There are two possible approaches from here. We may estimate the parameters Ai once or we may try to self-consistently determine the same parameters in an iteration scheme.

In the first approach, we may determine A1, A2 and A3 by following the smile-fitting procedure given in Section 4. We may then solve for $V_0$ using Equation (16). With this $V_0$ and the coefficients $A_i$, we may solve for $\sqrt{\varepsilon}V_1$ for each value of $\{\lambda\}$ using Equation (20). Using some gradient search algorithm, we may then minimize (over $\lambda$) $V(0, x_0) - \sum_i \lambda_i C_i + \sum_i |\lambda_i|\beta_i$, where $V = V_0 + \sqrt{\varepsilon}V_1$ and $x_0$ is the log of the current stock price. Finally, the appropriate derivative of $V$ gives a volatility surface as described by Equation (5).

We choose to iterate this scheme. We may obtain A1, A2 and A3 from the prices as before, solve for V0 and V1 and obtain a volatility surface. But instead of stopping there, we price the same set of options in our data set again using the newly derived volatility surface and use these prices to get adjusted estimates for A1, A2 and A3. Repeating this procedure a few times then leads to a converged set of parameters. The fact that our procedure leads to a converged set of parameters tells us the group parameters are intrinsic to the volatility surface as a whole and that they may be self-consistently determined. They thus have the same role in volatility surface estimation as they do in option pricing.

We call this method the volatility surface iteration procedure. It is used in our numerical calculations and may be summarized as follows:

1. Fit the implied volatility smile to an affine function of the log-moneyness-to-maturity ratio. This affine fit gives us the parameters $a_{FPS}$ and $b_{FPS}$, from which we obtain A1, A2 and A3.

2. Solve for V0 and √εV1.

3. Obtain a volatility surface.

4. Price options again using the volatility surface in Step 3. We do this using the Black-Scholes PDE with volatility σ.

5. Derive the implied volatility smile from the prices obtained in Step 4. Fit this smile to the log-moneyness-to-maturity ratio to obtain parameters a and b. Obtain new estimates of the parameters A1, A2 and A3 from the fitted a and b.

6. Stop if the adjusted parameters differ from those of the previous iteration by less than δ, a small parameter (which we take to be of order $10^{-5}$ in the $l^2$ norm). Otherwise, repeat the procedure from Step 2.
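The control flow of the six steps above can be sketched as a fixed-point iteration with an $l^2$ stopping rule. This is only a skeleton under stated assumptions: `update` is a hypothetical callable standing in for Steps 2 to 5 (solve the PDEs, build the surface, reprice, refit), and the `max_iter` safeguard is not part of the procedure described in the text.

```python
import math


def iterate_group_parameters(update, initial, delta=1e-5, max_iter=50):
    """Repeat Steps 2-5 until (A1, A2, A3) changes by less than delta in the l2 norm.

    update  : hypothetical callable mapping (A1, A2, A3) to re-estimated values
    initial : starting triple, e.g. obtained from the FPS smile fit in Step 1
    Returns the converged triple and the number of iterations used.
    """
    current = tuple(initial)
    for iteration in range(1, max_iter + 1):
        new = tuple(update(current))
        change = math.sqrt(sum((n - c) ** 2 for n, c in zip(new, current)))
        current = new
        if change < delta:
            return current, iteration
    raise RuntimeError("group parameters did not converge within max_iter iterations")
```

Raising an error on non-convergence, rather than returning the last iterate, is a design choice of this sketch; it keeps a failed calibration from being mistaken for a converged one.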

6 Numerical Methods

6.1 The highest-order equation

Supposing that the stock pays a constant dividend rate $d$, the function $V_0(x, t)$ satisfies the equation

$$
\sup_{\sigma}\left[V_{0,t} - rV_0 + (r - d)V_{0,x} + \frac{1}{2}\sigma^2\left(V_{0,xx} - V_{0,x}\right) - \left(\sigma^2 - \bar{\sigma}_0^2\right)^2\right] = -\sum_i \lambda_i h_i(e^x)\,\delta(t - T_i).
$$

Here, we choose $\sigma \in [\sigma_{\min}, \sigma_{\max}]$ in order to preserve the stability of our numerical scheme. Using the substitution $\tilde{V}_0 = e^{-rt}V_0$ yields

$$
\sup_{\sigma \in [\sigma_{\min}, \sigma_{\max}]}\left[\tilde{V}_{0,t} + (r - d)\tilde{V}_{0,x} + \frac{1}{2}\sigma^2\left(\tilde{V}_{0,xx} - \tilde{V}_{0,x}\right) - e^{-rt}\left(\sigma^2 - \bar{\sigma}_0^2\right)^2\right] = -\sum_i \lambda_i h_i(e^x)\,e^{-rt}\,\delta(t - T_i).
$$

We discretize the interior equation (spatial step $h$, temporal step $\Delta t$) and, supposing that we have the optimal $\sigma$ at time $t$, we obtain (after dropping the tildes)

$$
\frac{V_0(x, t) - V_0(x, t - \Delta t)}{\Delta t} + \frac{1}{2}\sigma^2\,\frac{V_0(x + h, t) + V_0(x - h, t) - 2V_0(x, t)}{h^2} + \left(r - d - \frac{1}{2}\sigma^2\right)\frac{V_0(x + h, t) - V_0(x - h, t)}{2h} - e^{-rt}\left(\sigma^2 - \bar{\sigma}_0^2\right)^2 = -e^{-rt}\sum_i \lambda_i h_i(e^x)\,\delta(t - T_i).
$$

In order to regularize the problem, we replace each option payoff $e^{-rt}h_i(e^x)\delta(t - T_i)$ by $P_i = C(e^x, \Delta t, \sigma)\,\mathbf{1}_{t = T_i}$.

We may now write an equation for the value function at $(x, t - \Delta t)$ as

$$
\begin{aligned}
V_0(x, t - \Delta t) ={}& V_0(x, t)\left(1 - \frac{\sigma^2}{h^2}\Delta t\right) + V_0(x - h, t)\,\Delta t\left(\frac{\sigma^2}{2h^2} - \frac{r - d - \frac{1}{2}\sigma^2}{2h}\right)\\
&+ V_0(x + h, t)\,\Delta t\left(\frac{\sigma^2}{2h^2} + \frac{r - d - \frac{1}{2}\sigma^2}{2h}\right) - \left(\sigma^2 - \bar{\sigma}_0^2\right)^2\Delta t\,e^{-rt} + \sum_i \lambda_i P_i.
\end{aligned}
$$

The terminal condition is taken to be $V(T, x) = 0$.

Finally, we specify how to get $\sigma$. Since we are taking the supremum over a closed interval, we have

$$
\sigma^2 = \frac{1}{4}e^{rt}\left(V_{0,xx} - V_{0,x}\right) + \bar{\sigma}_0^2,
\tag{26}
$$

but if the right-hand side of this equation is bigger than $\sigma_{\max}^2$, then $\sigma^2 = \sigma_{\max}^2$ and, similarly, if it is smaller than $\sigma_{\min}^2$, then $\sigma^2 = \sigma_{\min}^2$.

As written, our discretized scheme is itself an HJB equation. In order for this interpretation to be valid, we need the condition

$$
1 - \frac{\sigma^2}{h^2}\Delta t \ge 0
$$

for stability and, hence, we take

$$
h^2 \ge \sigma_{\max}^2\,\Delta t.
$$

We note that in our implementation, our grid is cut off 3.5 standard deviations away from the underlying price of the stock. Furthermore, the boundary conditions are chosen so that σ = σ0 at the boundary.
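One backward time step of this explicit scheme, including the clipped optimizer of Equation (26) and the stability check, might look as follows. This is a sketch under assumptions, not the authors' code: the function names are hypothetical, the grid is uniform in log-price, `source` stands for the regularized payoff term (the sum of λi Pi at each node), and boundary nodes are simply carried over unchanged, which is a simplification, since the text only states that σ = σ0 at the boundary.

```python
import math


def optimal_sigma_sq(v_xx, v_x, sigma_bar0, r, t, sigma_min, sigma_max):
    """Equation (26), clipped to [sigma_min^2, sigma_max^2]."""
    candidate = 0.25 * math.exp(r * t) * (v_xx - v_x) + sigma_bar0 ** 2
    return min(max(candidate, sigma_min ** 2), sigma_max ** 2)


def explicit_step(v, t, h, dt, r, d, sigma_bar0, sigma_min, sigma_max, source=None):
    """Step the discounted value function from time t back to t - dt.

    v      : list of V_0(x_j, t) on a uniform log-price grid with spacing h
    source : optional list holding sum_i lambda_i * P_i at each node
             (zero away from the option expiry dates)
    """
    if h * h < sigma_max ** 2 * dt:
        raise ValueError("stability condition h^2 >= sigma_max^2 * dt violated")
    if source is None:
        source = [0.0] * len(v)
    new_v = list(v)  # assumption of this sketch: boundary values are carried over
    discount = math.exp(-r * t)
    for j in range(1, len(v) - 1):
        v_x = (v[j + 1] - v[j - 1]) / (2.0 * h)
        v_xx = (v[j + 1] + v[j - 1] - 2.0 * v[j]) / (h * h)
        s2 = optimal_sigma_sq(v_xx, v_x, sigma_bar0, r, t, sigma_min, sigma_max)
        diffusion = s2 / (2.0 * h * h)
        drift = (r - d - 0.5 * s2) / (2.0 * h)
        new_v[j] = (v[j] * (1.0 - s2 * dt / (h * h))
                    + v[j - 1] * dt * (diffusion - drift)
                    + v[j + 1] * dt * (diffusion + drift)
                    - (s2 - sigma_bar0 ** 2) ** 2 * dt * discount
                    + source[j])
    return new_v
```

When the value function is flat and there is no payoff source, the optimizer returns the prior variance, the entropy penalty vanishes, and the step leaves the function unchanged, which is a convenient first check of an implementation.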

This numerical scheme is more transparent than the tree-based scheme presented in [5]. Once we have the volatility surface from our scheme, the derivative of $V_0$ with respect to $\lambda_i$ may be readily computed by discretizing the linear PDE for $V_{0,i} = \partial V_0 / \partial \lambda_i$. This PDE is just

$$
\mathcal{L}_{BS}(\sigma)V_{0,i} = -P_i
\tag{27}
$$

with terminal condition 0, as before. These derivatives are needed for the optimization over the $\lambda_i$'s.

The PDE (27) tells us that $V_{0,i} = C_i$ (which is the first-order condition for optimality in the work of [5]) is solvable as long as $\sigma_{\max}$ and $\sigma_{\min}$ are chosen so that the interval $[\sigma_{\min}, \sigma_{\max}]$ contains the implied volatility of option $i$.

6.2 The PDE for the correction term

Our correction equation may be written as:

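$$
\mathcal{L}_{BS}(\sigma)\tilde{V}_1 + \left(A_1\frac{\partial^3 V_0}{\partial x^3} + A_2\frac{\partial^2 V_0}{\partial x^2} + A_3\frac{\partial V_0}{\partial x}\right) = 0,
$$

where $\tilde{V}_1 = \sqrt{\varepsilon}V_1$. This is a linear PDE and a simple explicit scheme may be employed. Notice that one needs the solution $V_0$ from the first PDE in order to solve this PDE because we need $\sigma$.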

In order to get a prior for the parameters A1, A2 and A3, we use the procedure outlined in Section 4. We fit the smile (the implied volatilities) to the log-moneyness-to-maturity-ratio of the options and then use Equation (25) to determine the parameters. We do this at the start with quoted prices and then at each subsequent iteration with newly calculated prices from intermediate estimations of the volatility surface.

7 Numerical Results

The fast mean-reversion asymptotics cannot be expected to apply to options covering all expiration dates and all strikes. The stochastic volatility driving factor cannot be averaged for options that are too close to expiration as in these situations the volatility term has not had enough time to fluctuate. Options that are far away from maturity may not be used either as other effects such as slowly varying volatility may dominate the fast mean-reversion effects. Similarly, options that are too far out-of-the-money or too far in-the-money should not be used.

It is for these reasons that the procedure we actually use does not correct the volatility surface all the way up to expiry: the fast mean-reversion correction does not apply to short-dated options, so it is reasonable to expect that the correction to the volatility surface is not valid close to expiry either. Hence, our results are derived from corrections to the volatility surface up to 35 days from expiry.

Our data set consists of options on the S&P 500 index between 3 May, 2004 and 12 May, 2004. Out of this data set, we pick call options that have strikes within 5% of the at-the-money level and that expire on 19 June, 2004. To allow for enough distance between the strikes so that a very fine mesh is not needed in the numerical procedures, we make sure that the strikes of the call options we choose are at least 10 apart. This means that on each day, we have between 6 and 8 options from which to build a volatility surface. We choose 3/4 of the bid/ask spread on the options to be the uncertainty in our prices (our βs).

Table 1 shows the results of the iterative procedure for finding the Ais on 3 May, 2004. Starting from a prior derived from the work of Fouque, Papanicolaou and Sircar, each succeeding row of the Ai represents coefficients derived from the volatility surface, until convergence in the Ai is achieved.

iteration   A1          A2         A3
1           -0.00100    0.00072    0.00028
2           -0.00294    0.00267    0.00027
3           -0.00298    0.00271    0.00027
4           -0.00296    0.00269    0.00027
5           -0.00298    0.00270    0.00027

Table 1: Results of iterating the FPS parameters for 3 May, 2004 for options on the S&P 500 index. The first iteration has parameters that are equal to the original FPS estimates.

Date         a_FPS     b_FPS    a         b
05/03/2004   -0.0688   0.1525   -0.0664   0.1402
05/04/2004   -0.0339   0.1416   -0.0310   0.1348
05/05/2004   -0.0861   0.1518   -0.0870   0.1267
05/06/2004   -0.0666   0.1573   -0.0529   0.1491
05/07/2004   -0.0518   0.1588   -0.0424   0.1496
05/10/2004   -0.0422   0.1674   -0.0318   0.1630
05/11/2004   -0.0414   0.1595   -0.0177   0.1501
05/12/2004   -0.0727   0.1763   -0.0629   0.1660
05/13/2004   -0.0459   0.1641   -0.0156   0.1551
05/14/2004   -0.0571   0.1679   -0.0373   0.1574

Table 2: FPS parameters that are obtained from an affine fit of the implied volatility to the log-moneyness-to-maturity ratio, using the original FPS results and using our procedure. These are the parameters that are shown to be stable in previous work, and are seen to be stable here too.

If this is done for each of the days on which we correct the surface, we obtain the series of values for the parameters a and b shown in Table 2. Figure 1 compares the two sets of coefficients graphically.

The other part of this framework is the volatility surface obtained. We show the effects of the correction to the volatility surface for 3 May, 2004 in Figures 2-4. The first two figures show surfaces that incorporate uncertainty in prices, while the last figure shows the difference between the corrected and uncorrected surfaces as a fraction of the uncorrected surface for the duration of the correction. We can see that the stochastic volatility correction introduces a ripple into the surface (because of the third derivative term). This ripple, though smaller in magnitude, is still present 3 days later, as can be seen in Figure 5.

8 Sensitivity to parameters

[Plot: the parameters a_FPS, a, b_FPS and b (vertical axis from −0.15 to 0.2) for each of the 10 trading days.]

Figure 1: Comparison of parameters obtained via the FPS theory and via our new method

[Surface plot: uncorrected σ against S&P 500 index price (800 to 1600) and trading days (0 to 50).]

Figure 2: Uncorrected volatility surface for 8 European call options on the S&P 500 index, struck on May 3, 2004, and expiring on June 19, 2004. The strikes of these call options were 1050, 1075, 1085, 1100, 1110, 1120, 1130 and 1140. σ0 = 0.16, σmin = 0.1, σmax = 0.25, r = 0.119 and the dividend rate d was 0.017, while the underlying price was 1117.49.

[Surface plot: corrected σ against S&P 500 index price (800 to 1600) and trading days (0 to 50).]

Figure 3: Corrected volatility surface for the same 8 European call options on the S&P 500 index

[Surface plot: difference in σ as a fraction, against S&P 500 index price (800 to 1600) and trading days (0 to 15).]

Figure 4: The difference (as a fraction) between the corrected volatility surface and the uncorrected volatility surface, shown for the first 15 days of the correction.

[Surface plot: difference in corrected and uncorrected σ, against S&P 500 index price (800 to 1600) and trading days (0 to 10).]

Figure 5: The difference (as a fraction) between the corrected volatility surface and the uncorrected volatility surface, shown for the first 10 days of the correction for the trading date 6 May, 2004.

It was shown in [5] that the value function is convex in the λi for the case where the Ai are all zero. When the Ai are not all zero, numerical computations show that the value function need not have a stationary point. Large values of Ai, in particular, can make it impossible to find a minimum. Figures 6 and 7 show what happens to the objective function ($V - \sum_i \lambda_i C_i + \sum_i \beta_i|\lambda_i|$) as we vary the parameters A1 and β1 for a typical day.

The problem of obtaining a minimum is solved in two ways in our procedure. Firstly, we only correct the volatility surface in the regime where we expect the fast mean-reversion asymptotics to hold, i.e. far enough away from maturity. Secondly, we make the βi nonzero, thereby including some price uncertainty. With everything else the same, nonzero βi forces λ to be smaller in magnitude. We can see this in Figure 6. In fact, as βi gets larger, λi goes to zero, which means that large price uncertainty forces our volatility surface to be constant and equal to the prior volatility σ0 at all points. This means that for these large values, a = 0 and b = σ0.
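The shrinking effect of the penalty can be illustrated on a one-dimensional toy problem. The sketch below assumes, purely for illustration, that the smooth part of the objective in a single λ is a convex quadratic, 0.5·c·λ² − g·λ; the names `curvature` (c) and `gradient_gap` (g) are hypothetical and do not come from the paper. Under that assumption the minimizer of the penalized objective is given by the standard soft-thresholding formula.

```python
def penalized_minimizer(gradient_gap, curvature, beta):
    """Minimizer of 0.5*curvature*lam**2 - gradient_gap*lam + beta*abs(lam)."""
    if curvature <= 0.0:
        raise ValueError("the toy objective needs positive curvature")
    shrunk = max(abs(gradient_gap) - beta, 0.0)  # the penalty eats into |g|
    sign = 1.0 if gradient_gap >= 0.0 else -1.0
    return sign * shrunk / curvature
```

In this toy setting any β larger than |g| gives λ = 0 exactly, which mirrors the statement that large price uncertainty returns the prior volatility.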

Although we cannot expect our volatility surface to be the same for all values of β, we can see from Table 3 that over the range of βi that are practical (and lead to convergence in our numerical scheme), the reduced parameters a and b are fairly similar. More importantly, Table 3 tells us a lot about the role that price uncertainty plays in the determination of the volatility surface. Price uncertainty turns out to be very important: with a lot of price uncertainty, the best volatility surface under our objective of minimizing deviation from the prior is the prior itself. In this case, however, we are not really fitting the data. Rather, we are taking a position on it. With absolute confidence in the prices, we believe the market entirely and our stochastic volatility surface estimation procedure will rarely converge. Table 3 tells us that the middle ground between complete market certainty and historical estimates of parameters may be reached with our estimation procedure.

[Plot: objective function (vertical axis from −0.01 to 0.07) against λ1, for λ1 between −0.01 and 0.01.]

Figure 6: Using data from 6 May, 2004, we plot the dependence of the objective function on λ1 (all other λ’s are zero) for three cases: i) ’-’: Uncorrected surface; ii) ’•’: FPS fitted values of Ai are used with zero βi; iii) ’’: FPS fitted values with βi derived from the bid/ask spread.

[Plot: objective function (vertical axis from −0.5 to 3.5) against λ1, for λ1 between −0.03 and 0.03.]

Figure 7: Changing the Ai to one hundred times the original value can lead to a loss of a minimum, as can be seen in curve ’×’; however, increasing the value of β to five times its original value can restore that convexity.

β value    a          b
0          -          -
β/4        -0.0841    0.1387
β/2        -0.0727    0.1467
β          -0.0529    0.1491
(3/2)β     -0.0186    0.1501
2β         0.0002     0.1600

Table 3: Dependence of parameters a and b on β for options whose prices are given on 6 May, 2004. β here is 3/4 the bid/ask spread. The first line indicates that there was no convergence when the volatility surface was corrected, even for the prior values of Ai.

9 Conclusion

We formulate the problem of determining the volatility surface from option prices when the underlying price process exhibits stochastic volatility and when there is some uncertainty in the observed market prices of the options. Using fast mean-reversion asymptotics, we simplify the problem and rewrite our HJB equation so that the leading-order terms are no longer dependent on the unobservable stochastic volatility driving factor. Importantly, other parameters associated with the stochastic volatility model can be obtained directly from the options prices and do not need separate estimation. With this analysis in hand, we solve the HJB equation and determine the group parameters derived from the options’ smile. Our procedure admits an iterative determination of these stochastic volatility parameters, which are the same parameters found earlier in Fouque, Papanicolaou and Sircar [12]. The iteration makes these parameters consistent with the equations that determine the volatility surface.

Our numerical results show that the stochastic volatility group parameters determined through this iterative procedure are reasonably stable in time. The parameters are also stable with respect to the volatility surface iteration procedure. This added level of stability can only be expected to arise from a model such as ours because it reduces the complexity of the estimation problem. The group parameters are thus intrinsic to the entire surface because the equations that determine the surface involve the same parameters that are obtained from the options’ smile. More importantly, we find that uncertainty in prices significantly affects both the group parameters and the volatility surface itself. The bigger the uncertainty, the more our surface resembles our prior, while the smaller the uncertainty, the closer the parameters are to the original parameters found in Fouque, Papanicolaou and Sircar [12]. Our procedure provides a natural way of interpolating between these two cases.

We hope to extend our work to incorporate maturity cycles and multiple expiration dates as in [14]. It is also believed that the proofs about the validity of our approximations which are found in [13] should extend to the present problem. We leave this to future work.

References

[1] Y. Achdou and O. Pironneau. Volatility smile by multilevel least square. International Journal of Theoretical and Applied Finance, 5:619–643, 2002.

[2] L.B.G. Andersen and R. Brotherton-Ratcliffe. The equity option volatility smile: an implicit finite difference approach. Journal of Computational Finance, 1:5–38, 1997.

[3] L.B.G. Andersen and R. Brotherton-Ratcliffe. Markov market model consistent with cap smile. Journal of Computational Finance, 1:5–37, 1998.

[4] M. Avellaneda. Minimum-entropy calibration of asset-pricing models. International Journal of Theoretical and Applied Finance, 1:447–472, 1998.

[5] M. Avellaneda, C. Friedman, R. Holmes, and D. Samperi. Calibrating volatility surfaces via relative entropy minimization. Applied Mathematical Finance, 1997.

[6] M. Broadie, M. Chernov, and M. Johannes. Model specification and risk premiums: The evidence from the futures options. working paper, 2004.

[7] P.W. Buchen and M. Kelly. The maximum entropy distribution of an asset inferred from option prices. Journal of Financial and Quantitative Analysis, 31:143–159, 1996.

[8] R. Carmona and L. Xu. Calibrating arbitrage-free stochastic volatility models by relative entropy method. CEOR Technical Report, Princeton University, 1997.

[9] M. Dudik, S.J. Phillips, and R.E. Schapire. Performance guarantees for regularized maximum entropy density estimation. Proceedings of the 17th Annual Conference on Computational Learning Theory, 2004.

[10] B. Dupire. Pricing with a smile. Risk, pages 18–20, 1994.

[11] B. Dupire. Mathematics of Derivative Securities, chapter Pricing and hedging with smiles, pages 103–111. Cambridge University Press, 1997.

[12] J.-P. Fouque, G. Papanicolaou, and K.R. Sircar. Derivatives in Financial Markets with Stochastic Volatility. Cambridge University Press, 2000.

[13] J.-P. Fouque, G. Papanicolaou, K.R. Sircar, and K. Solna. Singular perturbations in option pricing. SIAM Journal on Applied Mathematics, 63:1648–1681, 2003.

[14] J.-P. Fouque, G. Papanicolaou, K.R. Sircar, and K. Solna. Maturity cycles in implied volatility. Finance and Stochastics, 8:451–477, 2004.

[15] L. Gulko. The Entropy Pricing Theory. PhD thesis, Yale School of Management, Yale University, 1998.

[16] N. Jackson, E. Suli, and S. Howison. Computation of deterministic volatility surfaces. Applied Mathematical Finance, 2:5–37, 1998.

[17] J.C. Jackwerth and M. Rubinstein. Recovering probability distributions from contemporaneous prices. Journal of Finance, 69:771–818, 1996.

[18] R. Lagnado and S. Osher. A technique for calibrating derivative security pricing models: numerical solutions of an inverse problem. Journal of Computational Finance, 1:13–25, 1997.

[19] E. Platen and R. Rebolledo. Principles for modelling financial markets. Advances in Applied Probability, 33:601–613, 1996.

[20] K.R. Sircar and T. Zariphopoulou. Bounds and asymptotic approximations for utility prices when volatility is random. SIAM Journal on Control and Optimization, 43:1328–1353, 2005.

A Relative entropy in a stochastic volatility setting

We have the following dynamics under $P_0$, our prior,

$$
\frac{dS_t}{S_t} = r\,dt + \sigma_0(S_t, Y_t)\,dW_t, \qquad
dY_t = \gamma_{0,t}\,dt + \beta\,d\hat{Z}_t,
$$

where, as before, we assume the correlation between the two Brownian motions is $\rho$. Under $P$, we have

$$
\frac{dS_t}{S_t} = r\,dt + \sigma(S_t, Y_t)\,dW_t^*, \qquad
dY_t = \gamma_t\,dt + \beta\,d\hat{Z}_t^*.
$$

In what follows, we assume that $\gamma = \gamma_0$.

Instead of using a trinomial tree-based discretization, as in Avellaneda et al. [5], we choose to use a simple Euler discretization for the $X_t \equiv \log(S_t)$ and $Y_t$ processes under $P$:

$$
\begin{aligned}
X_{n+1} &= X_n + \left(r - \tfrac{1}{2}\sigma^2\right)\Delta t + \sigma\sqrt{\Delta t}\,\epsilon_1,\\
Y_{n+1} &= Y_n + \gamma\,\Delta t + \beta\sqrt{\Delta t}\,\epsilon_2.
\end{aligned}
$$

We may do the same for the processes under $P_0$ with no change except for subscripts on some of the parameters. Here, $\epsilon_1$ and $\epsilon_2$ are each normally distributed but have correlation $\rho$.

The relative entropy we seek is a discrete approximation to the general form

$$
H(P \mid P_0) = E^P\left[\ln\frac{dP}{dP_0}\right].
$$

In the discrete-time case, this may be written as

$$
\begin{aligned}
H(P \mid P_0) &= \sum_{\text{paths}}\left(\prod_{n=0}^{N-1}\pi_n^P\right)\ln\left(\frac{\prod_n \pi_n^P}{\prod_n \pi_n^{P_0}}\right)\\
&= E^P\left[\ln\left(\frac{\prod_n \pi_n^P}{\prod_n \pi_n^{P_0}}\right)\right]\\
&= E^P\left[\sum_n E_n^P\left[\ln\frac{\pi_n^P}{\pi_n^{P_0}}\right]\right].
\end{aligned}
$$

Here, $E_n$ is the conditional expectation taken with respect to the information available up to time $n$, and $\pi_n^P$ and $\pi_n^{P_0}$ are just the probabilities associated with one step along any given path under each of the probability laws.

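We therefore just need to work out the log-likelihood ratio to calculate the relative entropy. First, however, we calculate the associated probabilities.

$$
\pi_n^P = \frac{1}{2\pi\sigma\beta\sqrt{1-\rho^2}}\exp\left(-\frac{\dfrac{(x-\mu)^2}{\sigma^2} - 2\rho\,\dfrac{x-\mu}{\sigma}\,\dfrac{y-\mu_y}{\beta} + \dfrac{(y-\mu_y)^2}{\beta^2}}{2(1-\rho^2)}\right)
$$

is the likelihood function for a bivariate normal distribution and, similarly,

$$
\pi_n^{P_0} = \frac{1}{2\pi\sigma_0\beta\sqrt{1-\rho^2}}\exp\left(-\frac{\dfrac{(x-\mu_0)^2}{\sigma_0^2} - 2\rho\,\dfrac{x-\mu_0}{\sigma_0}\,\dfrac{y-\mu_y}{\beta} + \dfrac{(y-\mu_y)^2}{\beta^2}}{2(1-\rho^2)}\right).
$$

Here, $\mu = X_n + (r - \tfrac{1}{2}\sigma^2)\Delta t$ and a similar expression holds for $\mu_0$, while $\mu_y = Y_n + \gamma\Delta t$. Evaluating the log-likelihood ratio gives us the following expression

$$
-\frac{1}{2}\ln\frac{\sigma^2}{\sigma_0^2} + \frac{1}{2(1-\rho^2)}\left[\frac{(x-\mu_0)^2}{\sigma_0^2} - \frac{(x-\mu)^2}{\sigma^2} + 2\rho\left(\frac{(x-\mu)(y-\mu_y)}{\sigma\beta} - \frac{(x-\mu_0)(y-\mu_y)}{\sigma_0\beta}\right)\right],
$$

which we average against the joint probability density of the bivariate normal. We do this because we want to find

$$
E_n^P\left[\ln\frac{\pi_n^P}{\pi_n^{P_0}}\right].
$$

This yields

$$
-\frac{1}{2}\ln\frac{\sigma^2}{\sigma_0^2} + \frac{1}{2(1-\rho^2)}\left(-1 + \frac{\sigma^2}{\sigma_0^2}\right) + \frac{\rho^2}{1-\rho^2}\left(1 - \frac{\sigma}{\sigma_0}\right).
$$

We may expand this expression for small $\sigma^2 - \sigma_0^2$, which implies

$$
E^P\left[\ln\frac{\pi_n^P}{\pi_n^{P_0}}\right] \approx \frac{1}{4\sigma_0^4}\left(\left(\sigma^2 - \sigma_0^2\right)^2 + \frac{2\rho^2\sigma_0(3\sigma_0 + \sigma)(\sigma_0 - \sigma)^2}{1-\rho^2}\right).
$$

If we suppose that $\rho$ is small, we may omit the last term in our entropy. We therefore choose our entropy generated per unit of time to be

$$
\eta(\sigma) = \left(\sigma^2 - \sigma_0^2\right)^2.
\tag{28}
$$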
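The quality of the quadratic approximation can be checked numerically. The sketch below compares the averaged log-likelihood ratio derived above with its leading-order term $(\sigma^2 - \sigma_0^2)^2 / (4\sigma_0^4)$ when ρ = 0; the function names are hypothetical, and the constant factor $1/(4\sigma_0^4)$, which Equation (28) drops, is kept so that the two quantities are directly comparable.

```python
import math


def entropy_per_step_exact(sigma, sigma0, rho=0.0):
    """Averaged one-step log-likelihood ratio, as derived in the text."""
    ratio = sigma / sigma0
    return (-0.5 * math.log(ratio ** 2)
            + (ratio ** 2 - 1.0) / (2.0 * (1.0 - rho ** 2))
            + rho ** 2 / (1.0 - rho ** 2) * (1.0 - ratio))


def entropy_per_step_quadratic(sigma, sigma0):
    """Leading-order term for rho = 0: (sigma^2 - sigma0^2)^2 / (4 sigma0^4)."""
    return (sigma ** 2 - sigma0 ** 2) ** 2 / (4.0 * sigma0 ** 4)
```

For σ close to σ0 the two expressions agree to within a relative error of order $(\sigma^2 - \sigma_0^2)/\sigma_0^2$, which is what the expansion for small $\sigma^2 - \sigma_0^2$ suggests.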

We note here that a similar entropy (with a ρ2 term also) may be obtained using a trinomial tree- based discretization, where care must be taken to appropriately find the joint probabilities given the correlation. This approach yields the same entropy as in Equation (28).

B A Derivation of the HJB equation under our new formulation

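Consider the dynamics of $e^{-rt}V(S_t, Y_t, t)$:

$$
d\left(e^{-rt}V(S_t, Y_t, t)\right) = e^{-rt}\left(V_t - rV + \frac{1}{2}\sigma^2 S^2 V_{SS} + rSV_S + \gamma V_y + \frac{1}{2}\beta^2 V_{yy} + \rho\sigma\beta S V_{Sy}\right)dt + \text{Brownian (under } P\text{) terms}.
$$

The above equation comes from applying Itô's formula to the expression and using the dynamics under $P$ that we have assumed (Equation (1)). Let

$$
\Phi(X, Y) = \sup_{\sigma}\left[X\sigma^2 + Y\sigma - \eta(\sigma)\right].
$$

From the definition of $\Phi$, we know that

$$
X\sigma^2 + Y\sigma \le \eta(\sigma) + \Phi(X, Y).
$$

Letting $X = \frac{1}{2}S^2V_{SS}$ and $Y = \rho\beta S V_{Sy}$, we may therefore write (at least formally) the following inequality:

$$
d\left(e^{-rt}V(S_t, Y_t, t)\right) \le e^{-rt}\left(V_t - rV + rSV_S + \gamma V_y + \frac{1}{2}\beta^2 V_{yy} + \Phi\left(\tfrac{1}{2}S^2V_{SS}, \rho\beta S V_{Sy}\right) + \eta(\sigma)\right)dt + \text{Brownian (under } P\text{) terms}.
$$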

Suppose we solve our HJB equation:

$$
V_t - rV + rSV_S + \gamma V_y + \frac{1}{2}\beta^2 V_{yy} + \Phi\left(\tfrac{1}{2}S^2V_{SS}, \rho\beta S V_{Sy}\right) = -\sum_i \lambda_i\,\delta(t - T_i)\,h_i(S).
$$

Then the inequality above reads

$$
d\left(e^{-rt}V(S_t, Y_t, t)\right) \le e^{-rt}\eta(\sigma)\,dt - \sum_i \lambda_i e^{-rt}\delta(t - T_i)\,h_i(S)\,dt + \text{Brownian terms}.
$$

Integrating and taking expectations under $P$ gives:

$$
E_t^P\left[e^{-rT}V(S_T, Y_T, T)\right] - e^{-rt}V(s, y, t) \le -E_t^P\left[\sum_i \lambda_i h_i(S_{T_i})e^{-rT_i}\right] + E_t^P\left[\int_t^T \eta(\sigma_u)e^{-ru}\,du\right].
$$

Noting that our HJB solution also satisfies $V(S_T, Y_T, T) = 0$, we can conclude, then, that

$$
V(s, y, t) \ge E_t^P\left[\sum_i \lambda_i h_i(S_{T_i})e^{-r(T_i - t)}\right] - E_t^P\left[\int_t^T \eta(\sigma_u)e^{-r(u - t)}\,du\right],
$$

and equality may be achieved by setting $\sigma = \sigma^*$, where $\sigma^*$ is the supremum attained in the equation for $\Phi(X, Y)$. So, indeed, the function $V$ that solves the HJB equation is a value function for

$$
\sup_{\sigma}\left\{E_t^P\left[\sum_i \lambda_i h_i(S_{T_i})e^{-r(T_i - t)}\right] - E_t^P\left[\int_t^T \eta(\sigma_u)e^{-r(u - t)}\,du\right]\right\}.
$$
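As a worked special case of the transform $\Phi$, consider $Y = 0$ (that is, $\rho = 0$) with the entropy $\eta$ of Equation (28). This is only a sketch: it ignores the constraint $\sigma \in [\sigma_{\min}, \sigma_{\max}]$ and assumes the maximizer satisfies $w^* \ge 0$, where $w = \sigma^2$ is introduced here for convenience.

$$
\begin{aligned}
\Phi(X, 0) &= \sup_{w}\left[Xw - \left(w - \sigma_0^2\right)^2\right],\\
0 &= X - 2\left(w^* - \sigma_0^2\right) \quad\Longrightarrow\quad w^* = \sigma_0^2 + \tfrac{1}{2}X,\\
\Phi(X, 0) &= X\left(\sigma_0^2 + \tfrac{1}{2}X\right) - \tfrac{1}{4}X^2 = X\sigma_0^2 + \tfrac{1}{4}X^2.
\end{aligned}
$$

With $X = \frac{1}{2}S^2V_{SS}$ this gives $(\sigma^*)^2 = \sigma_0^2 + \frac{1}{4}S^2V_{SS}$, which has the same form as Equation (26) once $S^2V_{SS}$ is written in the log variable as $V_{xx} - V_x$, up to the factor $e^{rt}$ introduced by the discounting substitution in Section 6.1.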
