Willow Power: Optimizing Derivative Pricing Trees
Michael Curran
ALGO RESEARCH QUARTERLY, VOL. 4, NO. 4, DECEMBER 2001

A new type of recombining derivative pricing tree is presented as an alternative to standard binomial and trinomial trees. This tree, called the “Willow” tree, expands as the square root of time and therefore avoids the unnecessary computations that are performed in the “wings” of standard binomial and trinomial trees. It is shown that arranging the tree in accordance with the process being modelled gives dramatic improvements in computational efficiency.

The purpose of this paper is to introduce an alternative to standard binomial and trinomial trees, called the “Willow” tree because of its characteristic shape. The motivation for this research is the following. Consider the pricing of a European stock option using a binomial tree with 100 time steps. Using this model, the nodes at maturity are spread uniformly across 20 standard deviations of the theoretical distribution of the underlying. The range of nodes in a standard tree spreads out faster, namely, linearly in time, than does the (normal) marginal probability distribution, which expands as the square root of time. As a result, many very low probability events are included in the backwards induction algorithm. This inefficiency can be mitigated by “pruning” the tree, but this solves only part of the problem, and in an awkward way that is difficult to generalize to multiple factors.

A more elegant and convenient remedy for this situation is presented that offers other important benefits, including radically accelerated convergence, particularly for multi-factor models.

The Willow tree provides better coverage of high probability regions of the process space and gives less waste in the low probability regions. In Figure 1, the space between the Willow and trinomial envelopes is a relatively high probability region, which is neglected by the trinomial tree. In Figure 2, the trinomial tree wastes nodes outside of the probabilistically relevant region. This results because the Willow tree, in concert with Brownian motion, opens relatively fast initially and more slowly later on.

Figure 1: Outer envelopes of Willow and trinomial trees for short-time horizon

Figure 2: Outer envelopes of Willow and trinomial trees for long-time horizon

As with all derivatives pricing trees, the Willow tree is built in order to approximate Brownian motion well, using a fairly small number of spatial and time steps. It is used primarily for pricing American and Bermudan instruments, which is straightforward in a recombining lattice or tree. A good reference for trinomial trees, as applied to a variety of instrument types, is Hull (2000).

Numerical comparisons show that the Willow tree radically outperforms standard trees and simplifies the implementation of important aspects of pricing trees that are required for practical applications. In particular, the length of the time steps can be chosen arbitrarily. This is in contrast to trinomial trees, which have an inherent upper bound on the amount of time (or quadratic variation) that can be modelled at each step. (This bound is achieved by setting the probability of the central node equal to zero. In order to take a larger step, a change in the topology of the trinomial tree is required.) Furthermore, it is emphasized that, due to the small number of nodes in the spatial dimension, the speed advantage of the Willow tree increases with the number of factors.

The first section of the paper discusses the construction of a basic Willow tree. The second section provides numerical results, demonstrating that the Willow tree outperforms trinomial trees in pricing a (one-factor) swaption and a (two-factor) American put option. Various extensions of the basic model, including the incorporation of barriers and “smiles,” are described in the third section, which is followed by some brief concluding remarks.

The basic model

This section details the mathematical development of the Willow tree. To model Brownian motion efficiently, the tree implements a discrete approximation to the normal distribution at each time point. Normal distributions are relevant to pricing derivatives since they play an integral role in diffusion processes, which, in turn, are the building blocks of the large majority of derivative pricing models. The basic structure of the tree is constructed using this principle as a guideline, with the transitions linking successive time points together coming later.

Consider dividing the support of the normal distribution into $n$ intervals and assigning a single value in each interval to represent the corresponding stratum of the distribution. Let $z_1, z_2, \ldots, z_n$ be the representative normal variates with corresponding probabilities $q_1, q_2, \ldots, q_n$. For example, we can let $z_i = N^{-1}\left(\frac{i - .5}{n}\right)$ (see Curran 1994) and $q_i = \frac{1}{n}$ for $i = 1, \ldots, n$. Note that the symmetry of the normal distribution will result in the variates having a mean value of zero.

Let $\{X_k : k = 1, 2, 3, \ldots\}$ be a time-inhomogeneous Markov chain with state space $\{1, 2, \ldots, n\}$. We will say that the process $\{Y_{t_k} : t_k \ge 0\}$ has the value $\sqrt{t_k}\, z_i$ when $X_k$ is in state $i$, where $t_k \equiv \sum_{j=1}^{k} h_j$, for some $h_j > 0$, $j = 1, \ldots, k$. Then, subject to certain conditions on the transition probabilities of the embedded chain $X_k$, $Y_{t_k}$ converges to Brownian motion as $n \to \infty$ and $h_j \to 0$ for all $j$. Note that we must take a double limit since the time and space steps do not have any mutual dependence in this model, as they do in standard trees. Observe also that this construction obviates the burdensome practice of pruning very low probability levels of the Brownian motion from the tree, so that, while technically this approach is an explicit finite difference method, it is far more efficient than standard implementations of the PDE approach.

Let $p^k_{ij}$ denote the transition probability from node $i$ to node $j$ at time step $k$. The transition probabilities must be parametrized by $t$ and $h$ in order to achieve convergence to Brownian motion. This parametrization must satisfy the usual requirements for discrete-time models, consisting of the following three conditions:

• the process must constitute a martingale or, equivalently,

$$E\left[Y_{t_k + h_{k+1}} \mid Y_{t_k}\right] = Y_{t_k} \quad \forall k \tag{1}$$

• the variance of the process must be equal to the length of the time step

$$\mathrm{Var}\left[Y_{t_k + h_{k+1}} \mid Y_{t_k}\right] = h_{k+1} \quad \forall k \tag{2}$$

• the transition probabilities from each node must sum to one

$$\sum_{j=1}^{n} p^k_{ij} = 1 \quad \forall i, k. \tag{3}$$

Finally, we also impose the restriction:

$$\sum_{i=1}^{n} q_i\, p^k_{ij} = q_j \quad \forall j, k. \tag{4}$$

This condition inductively guarantees that the unconditional probability of each state $j$ at time step $k+1$ is $q_j$, given that this is the case at time step $k$. This means that we are simultaneously imposing that the stationary distribution of $X_k$ be as prescribed, and that $X_k$ be initialized in this stationary distribution. (Observe that the root node can be regarded as containing all $n$ nodes at value $\sqrt{0}\, z_i$ for all $i$, so that the root need not be taken as a special case of the embedded chain. Nevertheless, when pricing, we consider this aggregated node as being a single node to avoid unnecessary computations.)

To begin the process, the first step in the tree must be from the root node to all $n$ nodes at the first time point, with the move to node $i$ having probability $q_i$. By construction, this initial transition has the required mean and variance. For subsequent transitions, we ensure satisfaction of the above constraints as follows.

Equation 1 can be written as:

$$\sum_{j=1}^{n} p^k_{ij} \sqrt{t_k + h_{k+1}}\; z_j = \sqrt{t_k}\; z_i \quad \forall i, k$$

or

$$\sqrt{1 + \alpha_k} \left( \sum_{j=1}^{n} p^k_{ij}\, z_j \right) = z_i \quad \forall i, k, \tag{5}$$

where $\alpha_k \equiv \frac{h_{k+1}}{t_k}$. The (time) dependence of the probabilities on $k$ can thus be expressed in terms of the parameter $\alpha_k$.

The variance condition (Equation 2) can be written as:

$$\left( \sum_{j=1}^{n} p^k_{ij} \left[ \sqrt{t_k + h_{k+1}}\; z_j \right]^2 \right) - \left[ \sqrt{t_k}\; z_i \right]^2 = h_{k+1} \quad \forall i, k.$$

Dividing by $t_k$ gives

$$\sum_{j=1}^{n} p^k_{ij} \left[ \sqrt{1 + \alpha_k}\; z_j \right]^2 - z_i^2 = \alpha_k \quad \forall i, k. \tag{6}$$

Thus, at each time step, the (linear) conditions on the transition probabilities are fully determined by a single scalar parameter $\alpha_k$, greatly diminishing the storage requirements when calculating the resulting transition probabilities.

Assuming $n$ is greater than four (which it always will be!), the system of equations consisting of the normalization constraints (Equations 3 and 4) and the mean and variance conditions (Equations 5 and 6, respectively) is under-determined. A unique solution to this system can be obtained by solving the following linear program for each time step $k$ (hereafter, to improve readability, we suppress the time dependence (i.e., the subscript $k$)).

$$\text{Minimize} \quad \sum_{i=1}^{n} \sum_{j=1}^{n} p_{ij} \left| \sqrt{1 + \alpha}\; z_j - z_i \right|^3$$

Subject to:

…possibly some other discrete set of values corresponding to alternative step sizes, and solve Equation 7 for each value. These computations are done once, off-line, and the results are stored. When we encounter a situation during pricing where we need a solution for a value of $\alpha$ that has not been computed, the step size can be adjusted slightly, so that the value of $\alpha$ is one for which Equation 7 has been solved, or the solution can be interpolated from known values. To ensure that the interpolated probabilities yield the correct drift in Equation 5, we interpolate linearly based on $\frac{1}{\sqrt{1 + \alpha}}$.