ℓ1-Penalized Likelihood Smoothing of Volatility Processes Allowing for Abrupt Changes
David Neto(a), Sylvain Sardy(b) and Paul Tseng(c)

April 2009

Abstract

We consider the problem of estimating the volatility of a financial asset from a time series record of length T. We believe the underlying volatility process is smooth, possibly stationary, and with potential abrupt changes due to market news. By drawing parallels between time series and regression models, in particular between stochastic volatility models and Markov random field smoothers, we propose a semiparametric estimator of volatility. Our Bayesian posterior mode estimate is the solution to an ℓ1-penalized likelihood optimization that we solve with an interior point algorithm that is efficient since its complexity is bounded by O(T^{3/2}). We apply our volatility estimator to real financial data, diagnose the model and perform back-testing to investigate the forecasting power of the method by comparison to (I)GARCH.

Keywords: ℓ1 penalty, Markov random field, stochastic volatility model, smoothing, wavelet, extreme value theory, forecasting, GARCH.

JEL classification: C14, C22, C61.

(a) Department of Econometrics, University of Geneva, CH-1211 Geneva 4, Switzerland. E-mail: [email protected]
(b) Department of Mathematics, University of Geneva, CH-1211 Geneva 4, Switzerland. E-mail: [email protected]
(c) Department of Mathematics, University of Washington, Seattle, WA. E-mail: [email protected]

1 Introduction

Suppose we observe values y_t of a financial asset (stock returns, interest rates or exchange rates) at regularly spaced times t = 1, ..., T. Our goal is to estimate an important intrinsic characteristic of the financial asset of interest, the evolution of the conditional variance of the stochastic return process, so as to assess the past, present and future risk of the asset. To that aim, many stochastic models have been proposed.
Standard among them is to assume a data generating process for y_t. Most models assume y_t = σ_t ε_t, where σ_t is a measure of volatility, and ε_t is standard Gaussian white noise. Without additional assumptions, the maximum likelihood estimate is σ̂_t^MLE = |y_t|, which is practically useless due to its nonsmoothness (i.e., high variability). A temporal structure for σ_t is assumed to regularize the maximum likelihood estimation, so as to obtain a smoother estimate of σ_t while capturing the stylized features (Rydberg 2000) observed in financial econometrics, like heavy tails of the marginal distributions, volatility clustering or evolution with possible abrupt changes (e.g., peaks, discontinuities). The popular GARCH-type models (Engle 1982, Bollerslev 1986) are parametric and enjoy good estimation properties. Stochastic volatility models (Taylor 1986, Taylor 1994) are powerful semiparametric alternatives which fit the stylized features more flexibly. In particular, the log-normal stochastic volatility model seems to capture the leptokurticity of the marginal distributions of financial data better than the standard GARCH model (Shephard 1996). Stochastic volatility models are the empirical discrete-time analogs of continuous-time models in finance theory and, in particular, option pricing (Hull and White 1987). They can also be viewed as an Euler discretization of a diffusion. The general discrete model of Andersen (1994) includes GARCH and stochastic volatility models as particular cases. It defines the discrete polynomial stochastic autoregressive volatility process as:

y_t = σ_t ε_t,  (1)
σ_t^q = ϕ(h_t),  (2)
h_t = ω + φ h_{t−1} + (γ + α h_{t−1}) η_t,  (3)

where q ∈ {1, 2}, ϕ(·) is a positive, continuous and strictly monotone function, the autoregressive process h_t has positive coefficients α, φ, γ such that α + φ > 0, and the errors η_t and ε_t are i.i.d. and mutually independent with mean–variance (1, σ_η²) and (0, 1), respectively.
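The recursion (1)–(3) is straightforward to simulate. A minimal sketch of the SVM special case (α = 0, q = 1, ϕ(h) = exp(h)) with Gaussian log-volatility innovations; the parameter values are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def simulate_svm(T, omega=0.0, phi=0.95, sigma_eta=0.3, seed=0):
    """Simulate the SVM special case of (1)-(3): alpha = 0, q = 1,
    phi_fn(h) = exp(h), so h_t = omega + phi * h_{t-1} + eta_t
    (the gamma factor is absorbed into sigma_eta here).
    omega, phi, sigma_eta are illustrative, not the paper's values."""
    rng = np.random.default_rng(seed)
    eta = rng.normal(0.0, sigma_eta, T)   # log-volatility innovations
    eps = rng.normal(0.0, 1.0, T)         # return innovations, eq. (1)
    h = np.empty(T)
    h[0] = omega / (1.0 - phi)            # start at the unconditional mean
    for t in range(1, T):
        h[t] = omega + phi * h[t - 1] + eta[t]
    sigma = np.exp(h)                     # volatility, q = 1
    y = sigma * eps                       # observed returns
    return y, sigma

y, sigma = simulate_svm(500)
```

Simulated paths of this kind exhibit the volatility clustering mentioned above, since φ close to 1 makes h_t strongly persistent.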
This process includes, for instance, the GARCH process (corresponding to q = 2, ϕ(h_t) = h_t, η_t = ε²_{t−1}, and γ = 0) and the stochastic volatility model (SVM), corresponding to ϕ(h_t) = exp(h_t), q = 1 and α = 0. Often η_t and ε_t are assumed to be Gaussian.

One challenge for these models has been how to infer the parameters (α, φ, γ, ω) and how to estimate the hidden conditional volatility σ_t within the sample (smoothing) and out of sample (forecasting). While parametric GARCH models have been relatively easy to fit owing to the small number of parameters, it is not the case for semiparametric SVMs that represent the returns as a nonlinear hidden Markov chain. Several methods have been proposed to estimate the latter data generating process, many of which are based on calculating or estimating the conditional expectations

(α̂, φ̂, γ̂, ω̂) = E((α, φ, γ, ω) | y, σ_η²) and ĥ_t = E(h_t | y, σ_η²), t = 1, ..., T,  (4)

where y = (y_1, ..., y_T) are the returns up to time T. For a linear and Gaussian data generating process, the Kalman filter calculates a closed form expression for the posterior means (4). The quasi-maximum likelihood approach of Harvey, Ruiz and Shephard (1994) applies Kalman filtering to the logarithm of the squared observations, log y_t² = 2h_t + ζ_t, where ζ_t = log ε_t². However, the data generating process is non-Gaussian, and the poor approximation of ζ_t by a Gaussian distribution leads to poor estimation of σ_t, particularly when the variance σ_η² of the log-volatility innovations η_t in (3) is small (Jacquier, Polson and Rossi 1994). Another possible approach is to generate by Monte Carlo an ergodic chain from the exact posterior distribution given the observations (Jacquier et al. 1994) and calculate the componentwise averages to estimate the conditional expectations (4). However, this may require generating long sample paths before reaching convergence. The estimation of the variance σ_η² of the log-volatility innovations also remains a difficult issue.
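The poor quality of the Gaussian approximation to ζ_t = log ε_t² is easy to verify numerically. A quick illustrative check (not from the paper): for standard Gaussian ε_t, the exact moments are E ζ_t = ψ(1/2) + log 2 ≈ −1.27 and Var ζ_t = π²/2 ≈ 4.93, and the distribution is strongly left-skewed.

```python
import numpy as np

# zeta_t = log(eps_t^2) for standard Gaussian eps_t.  The exact mean is
# psi(1/2) + log 2 ~ -1.27 and the variance is pi^2/2 ~ 4.93; the strong
# negative skewness shows how far zeta_t is from Gaussian, which is the
# weakness of the quasi-maximum likelihood approach discussed above.
rng = np.random.default_rng(1)
eps = rng.normal(size=1_000_000)
zeta = np.log(eps**2)
m, v = zeta.mean(), zeta.var()
skew = np.mean(((zeta - m) / np.sqrt(v)) ** 3)
print(m, v, skew)   # mean ~ -1.27, variance ~ 4.93, skewness well below 0
```

The pronounced skewness is what degrades the Kalman-filter-based quasi-maximum likelihood estimates when σ_η² is small.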
Based on a GARCH model, the estimated persistence of the conditional volatilities of many financial time series is seemingly high, which also makes the inference difficult. For example, the second moment of an Integrated GARCH (Engle and Bollerslev 1986) does not exist even if the stationarity conditions are met. One also wonders whether the volatility process is really stationary, and in fact, it is now believed that such strong persistence is due to model misspecification. For instance, Diebold (1986) argued that the persistence may be due to the instability of the unconditional variance. Lastrapes (1989) observed that the persistence of exchange-rate volatility depends on U.S. monetary policy regimes. This statement is confirmed by Lamoureux and Lastrapes (1990), who obtained a substantially lower persistence measure (i.e., the sum of the GARCH parameters, α + φ in equation (3)) when dummy variables, corresponding to periods over which the volatility seems to be stationary, are included in the GARCH model. Cai (1994) and Hamilton and Susmel (1994) used Markov switching ARCH models to take into account the regime effect in the volatility, and in both cases the persistence is captured by the stochastic process which describes the state of the system, rather than by the GARCH parameters. Another widely used approach (see for instance Chan and Maheu (2002) and Chernov, Gallant, Ghysels and Tauchen (1999)) adds a jump process to capture large discrete changes. However, the empirical evidence suggests that the jump models capture additional extremes in the returns' marginal distribution rather than the persistence. Starica (2003) shares the concerns stated early on by Diebold (1986) about persistence and misspecification, and discusses the danger of assuming global stationarity.
Starica and Granger (2005) proposed a nonstationary approach to estimate volatility by considering the log-absolute returns as a locally stationary process as defined in Dahlhaus (1997). Their methodology considers the underlying process as a slowly and smoothly varying process with sharp changes. The aforementioned results motivate the features we aim to reproduce with our proposed model and estimator, namely, that volatility evolves smoothly except for occasional abrupt changes whose transient effect prevents the market from returning quickly to the level before the shock. However, while Starica and Granger's approach is based on successive stationarity tests over intervals of varying width to identify the times of regime switching, our ℓ1-based estimator finds the desired segmentation via the selection of a single parameter. Moreover, the solution of our ℓ1-penalized likelihood optimization problem is calculated with a fast algorithm.

This paper is organized as follows. In Section 2.1 we define our volatility estimator and discuss its link to wavelet and Markov random field smoothing. In Section 2.2 we use the connection to smoothers to present a selection rule for the smoothing parameter which guarantees smoothness of the volatility estimate. In Section 2.3 we show that the volatility estimate can be efficiently computed using an interior-point algorithm. Section 2.4 illustrates the performance of the estimator on a simulated time series. In Section 3 we apply our estimator to the DOW JONES and NASDAQ, analyze the results and evaluate the forecasting performance. Section 4 draws some conclusions and points to future extensions.

2 Smoothing the volatility

2.1 ℓ1-penalized likelihood estimator (fSVM)

We consider a discrete time SVM with Laplace innovations η_t, whose density f_η(x) = (λ/2) exp(−λ|x|) has heavier tails than the Gaussian density, to better reflect sudden changes of regime due to important market news.
More precisely, this model corresponds to (1)–(3) with α = 0, q = 1, γ = 1 and ω = µ(1 − φ). Hence the variance of the log-volatility innovations is σ_η² = 2/λ², and this SVM is related to variance gamma (VG) models (Madan and Seneta 1990) for modeling the non-Gaussian nature of stock market returns and future index prices. Indeed the VG model can be expressed by equation (1), where σ_t follows a Gamma process, which in its continuous version writes as S(t) = S(0) exp(L(t)), where S(t) are the stock prices and L(t) is a Laplace motion (Kotz, Kozubowski and Podgórski 2001).
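As a sketch of how these pieces fit together, the code below first checks numerically that a Laplace innovation with density (λ/2) exp(−λ|x|) indeed has variance 2/λ², and then evaluates an ℓ1-penalized negative log-likelihood of the general form that the fSVM estimator minimizes. The function signature, default parameter values, and the treatment of µ and φ as known are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

# Numerical check that Var(eta) = 2/lambda^2 for the Laplace density
# f(x) = (lambda/2) exp(-lambda |x|): this is Laplace(0, b) with scale
# b = 1/lambda, whose variance is 2 b^2.
lam = 2.0
rng = np.random.default_rng(0)
eta = rng.laplace(0.0, 1.0 / lam, 1_000_000)
# eta.var() should be close to 2 / lam**2 = 0.5

def penalized_nll(h, y, lam, mu=0.0, phi=0.95):
    """Sketch (not the paper's code) of an l1-penalized negative
    log-likelihood for the fSVM, up to an additive constant.

    Gaussian likelihood term for y_t = exp(h_t) * eps_t:
        sum_t  h_t + y_t^2 / (2 exp(2 h_t)).
    Laplace prior on the log-volatility increments
        h_t - mu*(1 - phi) - phi*h_{t-1}
    contributes the l1 penalty scaled by lam.  mu and phi are treated
    as known here for illustration."""
    like = np.sum(h + y**2 / (2.0 * np.exp(2.0 * h)))
    incr = h[1:] - mu * (1.0 - phi) - phi * h[:-1]
    return like + lam * np.sum(np.abs(incr))
```

Minimizing such an objective over h = (h_1, ..., h_T) is the ℓ1-penalized likelihood optimization that the interior-point algorithm of Section 2.3 solves; the ℓ1 penalty is what allows occasional large increments, i.e., abrupt volatility changes.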