Bayesian methods for solving estimation and forecasting problems in the high-frequency trading environment
Paul Alexander Bilokon
Christ Church
University of Oxford
A thesis submitted in partial fulfilment of the MSc in
Mathematical Finance
16 December, 2016

To God. And to my parents, Alex and Nataliya Bilokon. With love and gratitude.
To the memory of Rudolf Emil Kálmán (1930–2016), who laid down the intellectual foundations on which this work is built.
Abstract
We examine modern stochastic filtering and Markov chain Monte Carlo (MCMC) methods and consider their applications in finance, especially electronic trading.
Stochastic filtering methods have found many applications, from Space Shuttles to self-driving cars. We review some classical and modern algorithms and show how they can be used to estimate and forecast econometric models, stochastic volatility and term structure of risky bonds. We discuss the practicalities, such as outlier filtering, parameter estimation, and diagnostics.
We focus on one particular application, stochastic volatility with leverage, and show how recent advances in filtering methods can help in this application: kernel density estimation can be used to estimate the predicted observation, filter out outliers, detect structural change, and improve the root mean square error while preventing discontinuities due to the resampling step.
We then take a closer look at the discretisation of the continuous-time stochastic volatility models and show how an alternative discretisation, based on what we call a filtering Euler–Maruyama scheme, together with our generalisation of Gaussian assumed density filters to arbitrary (not necessarily additive) correlated process and observation noises, gives rise to a new, very fast approximate filter for stochastic volatility with leverage. Its accuracy falls short of particle filters but beats the unscented Kálmán filter. Due to its speed and reliance exclusively on scalar computations, this filter will be particularly useful in a high-frequency trading environment.
In the final chapter we examine the leverage effect in high-frequency trade data, using last data point interpolation, tick and wall-clock time and generalise the models to take into account the time intervals between the ticks.
We use a combination of MCMC methods and particle filtering methods. The robustness of the latter helps estimate parameters and compute Bayes factors. The speed and precision of modern filtering algorithms enable real-time filtering and prediction of the state.

Acknowledgements
First and foremost, I would like to thank my supervisor, Dan Jones, who not only taught and instructed me academically, but also provided advice, guidance, and support in very challenging circumstances. Thank you very much, Dan, for putting up with me during this project.
I would like to thank my mentors and former colleagues, Dan Crisan (Imperial College), Rob Smith (Citigroup), William Osborn (formerly Citigroup), and Martin Zinkin (formerly Deutsche Bank), who have taught me much of what I know about stochastic filtering. Rob not only taught me a lot, but was very helpful and understanding when I was going through a difficult time. Martin showed me a level of scientific and software development skill which I didn’t know existed. I would also like to thank my former manager at Deutsche Bank, Jason Batt, for his help and encouragement. I am very grateful to my PhD supervisor, Abbas Edalat, and scientific mentors, Claudio Albanese, Iain Clark, Achim Jung, Moe Hafiz, and Berc Rustem, for teaching me to think scientifically.
I would also like to mention my quant and technology colleagues at Deutsche Bank Markets Electronic Trading who all, one way or another, taught me various aspects of the science, technology, and business of electronic trading: Tahsin Alam, Zar Amrolia, Benedict Carter, Teddy Cho, Luis Cota, Matt Daly, Madhucchand Darbha, Alex Doran, Marina Duma, Gareth Evans, Brian Gabillet, Yelena Gilbert, Thomas Giverin, Hakan Guney, James Gwinnutt, William Hall, Matt Harvey, Mridula Iyer, Chris-M Jones, Sergey Korovin, Egor Kraev, Leonid Kraginskiy, Peter Lee, Ewan Lowe, Phil Morris, Ian Muetzelfeldt, Roel Oomen, Helen O’Regan, Kir- ill Petunin, Stanislav Pustygin, Claude Rosenstrauch, Lubomir Schmidt, Torsten Schöneborn, Sumit Sengupta, Alexander Stepanov, Viktoriya Stepanova, Ian Thomson, Krzysztof Zylak. It has been an honour and privilege to work with you all while I was writing this dissertation.
My friends have provided the much-needed encouragement and support: Sofiko Abanoidze, John and Anastasia Aston, Ester Goldberg, James Gwinnutt, Moe Hafiz, Lyudmila Meshchersky, Laly Nickatsadze, Adel Sabirova, Stephen Scott, Andrei Serjantov, Irakli Shanidze, Ed Silantyev, Nitza and Robin Spiro, Natasha Svetlova, Alexey Taycher, Nataliya Tkachuk, Jay See Tow, Douglas Vieira.
Finally, I would like to acknowledge Saeed Amen, Matthew Dixon, Attila Agod, and Jan Novotny at Thale- sians, who have not only been great colleagues, but also great friends. I am especially grateful to Saeed for his patience, support, and friendship over the years.
Nomenclature
∎ the Halmos symbol, which stands for quod erat demonstrandum
A := {1, 2, 3} is equal by definition to (left to right)
{1, 2, 3} =: A is equal by definition to (right to left)
N natural numbers
N0 ditto, explicitly incl. 0
N∗ nonzero natural numbers
R reals, (−∞, +∞)
R+ positive reals, (0, +∞)
R0+ nonnegative reals, [0, +∞)
R̄0+ extended nonnegative reals, [0, +∞]
F, G,... name of a collection
A, B,... name of a set
{A, B,...} definition of a collection
{O ∈ T | O clopen} —"—
{1, 2, 3} definition of a set
{x ∈ N | x even} —"—
(ak) name of element of a sequence
(ak)k∈N —"—
(ak)∞k=1 —"—
(2, 4, 6, . . .) definition of a sequence
(2, 4, 6) definition of a tuple
| such that, as in x ∈ N | x even
Z ⊆ R subset
R ⊇ Z superset
Z ( R proper subset
R ) Z proper superset
A × B the Cartesian product of the sets A and B
C → R0+ function type
f : C → R0+ —"—
x ↦ x² function definition
f : x ↦ x² —"—
f (A) the image of the set A under the function f
f⁻¹(A) the inverse image of the set A under the function f
‖x‖p p-norm; in particular, when p = 2, the Euclidean norm
v name of a vector
A name of a matrix
v⊺ transpose
vec A vectorisation of a matrix
σ(A) the σ-algebra generated by the family of sets A
p(· | ·) a generic conditional probability density or mass function, with the arguments making it clear which conditional distribution it relates to
T the time set
X in the context of filtering, the state process
S the state space
B(S) the Borel σ-algebra of S
{X }t∈T the usual augmentation of the filtration generated by X
U (a, b) continuous uniform distribution with support [a, b]
N(µ, σ²) univariate normal (Gaussian) distribution with mean µ and variance σ²
N (µ, Σ) multivariate normal (Gaussian) distribution with mean µ and covariance matrix Σ
ϕ(x; µ, σ²), ϕ(µ, σ²) the probability density function (pdf), with argument x, of a univariate normal (Gaussian) distribution with mean µ and variance σ²
ϕ (x; µ, Σ), ϕ (µ, Σ) the probability density function (pdf), with argument x, of a multivariate normal (Gaussian) distribution with mean µ and covariance matrix Σ
AR autoregressive (process)
ARIMA autoregressive integrated moving average (process)
ARMA autoregressive moving average (process)
ARCH autoregressive conditional heteroscedasticity
BM Brownian motion
BUGS Bayesian inference Using Gibbs Sampling — a modelling language
CUSUM Cumulative Sum of Recursive Residual
DAG Directed Acyclic Graph
EBFS empirical Bratley–Fox–Schrage
EGARCH exponential generalised autoregressive conditional heteroscedasticity
EKF extended Kálmán filter
ELK empirical Law–Kelton
EM Euler–Maruyama (scheme)
GARCH generalised autoregressive conditional heteroscedasticity
GASF Gaussian approximation recursive filter
GBM geometric Brownian motion
GDP gross domestic product
HMM hidden Markov model
i.i.d. independent and identically distributed (random variables)
KDE kernel density estimator (estimate, estimation)
KF Kálmán filter
LTRF long-term relative frequency
MA moving average (process)
MCMC Markov chain Monte Carlo
MISE mean integrated square error
MLE maximum likelihood estimator (estimate, estimation)
MMSE minimum mean square error estimator (estimate, estimation)
MSE mean square error
NLN normal–lognormal mixture (distribution)
OU Ornstein–Uhlenbeck (process)
post-RPF post-regularised particle filter
QML quasi-maximum likelihood
RMSE root mean square error
SDE stochastic differential equation
SIR sequential importance resampling
SIS sequential importance sampling
SISR sequential importance sampling with resampling
SMC sequential Monte Carlo
SV stochastic volatility
SVL stochastic volatility with leverage
SVL2 stochastic volatility with the second correlation structure
SVLJ stochastic volatility with leverage and jumps
UKF unscented Kálmán filter
wcSVL wall-clock stochastic volatility with leverage
Z-spread zero volatility spread
Contents
Nomenclature iv
Contents vii
List of tables xi
List of figures xii
List of algorithms xiv
List of listings xv
Preface xvii
1 Volatility 1
1.1 Introduction ...... 1
1.2 The early days of volatility modelling: constant volatility models ...... 1
1.3 Evidence against the constant volatility assumption ...... 2
1.4 The two families of models ...... 3
1.4.1 The GARCH family ...... 3
1.4.2 The stochastic volatility family ...... 4
1.5 The statistical leverage effect ...... 6
1.6 A stochastic volatility model with leverage and jumps ...... 6
1.7 Conclusions and further reading ...... 9
2 The filtering framework 11
2.1 Introduction ...... 11
2.2 Special case: general state-space models ...... 13
2.3 Particle filtering methods ...... 13
2.4 Applying the particle filter to the stochastic volatility model with leverage and jumps ...... 15
2.5 The Kálmán filter ...... 18
2.6 Some examples of linear-Gaussian state space models ...... 19
2.6.1 A non-financial example: the Newtonian system ...... 20
2.6.2 Autoregressive moving average models ...... 20
2.6.3 Continuous-time stochastic processes: the Wiener process, geometric Brownian motion, and the Ornstein–Uhlenbeck process ...... 21
2.7 The extended Kálmán filter ...... 24
2.8 An example application of the extended Kálmán filter: modelling credit spread ...... 25
2.9 Outlier detection in (extended) Kálmán filtering ...... 26
2.10 Gaussian assumed density filtering ...... 27
2.11 Parameter estimation ...... 28
2.12 Relationship with Markov chain Monte Carlo methods ...... 30
2.13 Prediction ...... 31
2.14 Diagnostics ...... 32
2.15 Conclusions and further reading ...... 33
3 Stochastic filtering and MCMC analysis of stochastic volatility models with leverage 34
3.1 Introduction ...... 34
3.2 SVL as a discretisation of a continuous-time model ...... 34
3.3 Validation of the SVL2 model ...... 36
3.3.1 Normal–lognormal (NLN) mixture distribution ...... 36
3.3.2 Martingality ...... 37
3.3.3 The discretisation scheme ...... 37
3.3.4 MCMC Analysis of SVL and SVL2 ...... 38
3.4 Particle filtering for SVL2 ...... 39
3.5 Conclusions, further work, and further reading ...... 42
4 Extensions of the filtering methods 43
4.1 Predicted observation distribution for particle filters ...... 43
4.2 Jumps or outliers? ...... 44
4.3 Detecting structural change ...... 44
4.4 Resampling ...... 45
4.5 A generalisation of the Gaussian filter ...... 47
4.6 An application of the Gaussian filter to SVL2 ...... 48
4.7 Conclusions, further work, and further reading ...... 49
5 Intraday stochastic volatility with leverage 51
5.1 Introduction ...... 51
5.2 Leverage effect in intraday data ...... 51
5.3 Trade time or wall clock time? ...... 53
5.4 Conclusions, further work, and further reading ...... 54
Conclusions and further work 55
A Proofs 56
A.1 Introduction ...... 56
A.2 Proofs for Chapter 3 ...... 56
A.3 Proofs for Chapter 4 ...... 59
B Derivation of the Gaussian filter for the Jacquier–Polson–Rossi model 61
C Description of the datasets used in empirical work 63
C.1 Introduction ...... 63
C.2 Financial quantities ...... 63
C.3 Datasets ...... 65
C.4 Summary and further reading ...... 67
D An overview of frequentist and Bayesian approaches to estimation 68
D.1 Introduction ...... 68
D.2 Interpretations of probability ...... 68
D.3 Bayes’s theorem ...... 69
D.4 Frequentist and Bayesian estimation ...... 69
D.5 Further reading ...... 70
E Markov chain Monte Carlo analyses in OpenBUGS: implementations, instructions, and results 71
E.1 Introduction ...... 71
E.2 Implementation of the models in BUGS ...... 71
E.3 Instructions for running Markov chain Monte Carlo analyses in OpenBUGS ...... 74
E.4 Output analysis and diagnostics with coda ...... 77
E.5 Results ...... 78
E.5.1 Summaries: node statistics ...... 78
E.5.2 Selected OpenBUGS plots ...... 82
F Results of running stochastic filters 98
Bibliography 111
Indices 128
Subject Index ...... 129
Index of Authors ...... 133
List of Tables
3.1 The means of the posteriors of the model parameters for SVL and SVL2 obtained after running the MCMC simulations for each dataset ...... 41
3.2 A comparison between the SVL and SVL2 models on daily datasets ...... 41
4.1 A comparison between different stochastic filters applied to generated SVL data ...... 50
C.1 Summary of the datasets ...... 67
List of Figures
1.1 An example of the SVLJ data generated using StateSpaceModelDataGenerator of the BayesTSA library ...... 8
1.2 The two conventions used in designating disturbances (noises) in econometric models and the two different correlation structures between the disturbances ...... 9
2.1 The result of applying Algorithm 5.1 to the generated SVLJ data ...... 17
2.2 An ARMA(2, 1) time series generated using StateSpaceModelDataGenerator of the BayesTSA library ...... 21
2.3 The result of applying the Kálmán filter (Algorithm 2.4) to the noisy observation in Figure 2.2 . . 22
4.1 Comparison of the RMSE for different outlier detection thresholds ...... 45
4.2 Comparison of the RMSE after running the filter with the smooth and post-RPF resampling schemes ...... 46
5.1 A visual representation of some of the results of Table 3.1 for the SVL model ...... 52
5.2 A visual representation of the results of applying the SVL and SVL2 models to intraday data . . 53
E.1 Posterior density for model SV, Dataset 1 ...... 83
E.2 Posterior density for model SVL, Dataset 1 ...... 84
E.3 Posterior density for model SVL2, Dataset 1 ...... 85
E.4 Posterior density for model SV, Dataset 2 ...... 86
E.5 Posterior density for model SVL, Dataset 2 ...... 87
E.6 Posterior density for model SVL2, Dataset 2 ...... 88
E.7 Posterior density for model SVL, Dataset 6 ...... 89
E.8 Posterior density for model SVL2, Dataset 6 ...... 90
E.9 Posterior density for model SVL, Dataset 7 ...... 91
E.10 Posterior density for model SVL2, Dataset 7 ...... 92
E.11 Posterior density for model SVL, Dataset 8 ...... 93
E.12 Posterior density for model SVL2, Dataset 8 ...... 94
E.13 Posterior density for model SVL, Dataset 12 ...... 95
E.14 Posterior density for model SVL2, Dataset 12 ...... 96
E.15 The analogues of Figure 5.2 for the remaining dataset/model combinations ...... 97
F.1 The result of running the Harvey–Shephard Kálmán filter with ρ = −0.8 in Table 4.1 ...... 99
F.2 The result of running the modified UKF with ρ = −0.8 in Table 4.1 ...... 100
F.3 The result of running the generalised Gaussian filter with ρ = −0.8 in Table 4.1 ...... 101
F.4 The result of running the particle filter for SVL2, post-RPF resampling, 300 particles, with ρ = −0.8 in Table 4.1 ...... 102
F.5 The result of running the particle filter for SVL, post-RPF resampling, 300 particles, with ρ = −0.8 in Table 4.1 ...... 103
F.6 The result of running the Harvey–Shephard Kálmán filter with ρ = −0.5 in Table 4.1 ...... 104
F.7 The result of running the modified UKF with ρ = −0.5 in Table 4.1 ...... 105
F.8 The result of running the generalised Gaussian filter with ρ = −0.5 in Table 4.1 ...... 106
F.9 The result of running the particle filter for SVL2, post-RPF resampling, 300 particles, with ρ = −0.5 in Table 4.1 ...... 107
F.10 The result of running the particle filter for SVL, post-RPF resampling, 300 particles, with ρ = −0.5 in Table 4.1 ...... 108
List of Algorithms
2.1 Particle filter: sequential importance resampling (SIR) ...... 14
2.2 Multinomial resampling ...... 15
2.3 An adaptation of the particle filter (Algorithm 2.1) developed by Pitt et al. for the SVLJ model ...... 16
2.4 Kálmán filter ...... 18
2.5 Extended Kálmán filter ...... 25
2.6 Gaussian moment matching approximation of a possibly non-additive transform ...... 27
2.7 Gaussian filter with possibly non-additive noise ...... 28
3.1 An adaptation of the particle filter (Algorithm 2.1) for the SVL2 model ...... 40
4.1 Resampling in the post-regularised particle filter (post-RPF) ...... 46
4.2 Generalised Gaussian moment matching approximation ...... 47
4.3 Generalised Gaussian filter ...... 48
4.4 Generalised Gaussian filter for SVL2 ...... 49
5.1 An adaptation of the particle filter Algorithm 5.1 to wcSVL ...... 54
B.1 Generalised Gaussian filter for the Jacquier–Polson–Rossi model ...... 62
Listings
E.1 An implementation of the SV model in BUGS ...... 71
E.2 An implementation of the SVL model in BUGS ...... 72
E.3 An implementation of the SVL2 model in BUGS ...... 72
E.4 An implementation of the wcSVL model in BUGS ...... 73
E.5 Node statistics for model SV, Dataset 1 ...... 78
E.6 Node statistics for model SVL, Dataset 1 ...... 78
E.7 Node statistics for model SVL2, Dataset 1 ...... 78
E.8 Node statistics for model SV, Dataset 2 ...... 78
E.9 Node statistics for model SVL, Dataset 2 ...... 78
E.10 Node statistics for model SVL2, Dataset 2 ...... 78
E.11 Node statistics for model SVL, Dataset 3 ...... 79
E.12 Node statistics for model SVL2, Dataset 3 ...... 79
E.13 Node statistics for model SVL, Dataset 4 ...... 79
E.14 Node statistics for model SVL2, Dataset 4 ...... 79
E.15 Node statistics for model SVL, Dataset 5 ...... 79
E.16 Node statistics for model SVL2, Dataset 5 ...... 79
E.17 Node statistics for model SVL, Dataset 6 ...... 79
E.18 Node statistics for model SVL2, Dataset 6 ...... 80
E.19 Node statistics for model SVL, Dataset 7 ...... 80
E.20 Node statistics for model SVL2, Dataset 7 ...... 80
E.21 Node statistics for model SVL, Dataset 8 ...... 80
E.22 Node statistics for model SVL2, Dataset 8 ...... 80
E.23 Node statistics for model SVL, Dataset 9 ...... 80
E.24 Node statistics for model SVL2, Dataset 9 ...... 81
E.25 Node statistics for model SVL, Dataset 10 ...... 81
E.26 Node statistics for model SVL2, Dataset 10 ...... 81
E.27 Node statistics for model SVL, Dataset 11 ...... 81
E.28 Node statistics for model SVL2, Dataset 11 ...... 81
E.29 Node statistics for model SVL, Dataset 12 ...... 81
E.30 Node statistics for model SVL2, Dataset 12 ...... 81
E.31 Node statistics for model SVL, Dataset 13 ...... 82
E.32 Node statistics for model SVL2, Dataset 13 ...... 82
Preface
Overview
Time-varying volatility was mentioned as early as 1915 [Mit15] and has been at the centre of attention ever since the emergence of mathematical finance in the 1960s and 70s [Man63, BS72]. The numerous models that have been developed to account for the empirically observed stylised facts can be divided into two families: the ARCH/GARCH family, originating in [Eng82, Bol86], and the stochastic volatility (SV) family, originating in [Cla73, TP83, Tay82, HW87]. Both families remain actively researched and used to this day: see [EFF08] and [GHR96, She96, She05], respectively, for overviews. Arguments have been heard in favour of both approaches. The author’s professional interest is in applications to medium- and high-frequency trading. As we shall see in Chapter 1, the merits of the SV models are particularly attractive in this setting, so we shall restrict our focus to the SV family.
The leverage effect [BS73], irrespective of whether or not it is due to leverage [FW00, HL11], is a well-known stylised fact in the empirical finance literature. Its existence and practical significance are confirmed by the author’s own trading experience. We shall therefore consider those modifications of the classical SV model that incorporate the leverage effect.
Several recent papers have shown how modern Bayesian methods — Markov chain Monte Carlo [MY00, Yu05] and particle filtering [MP09, MP11a, MP11b, PMD14] — can be applied to the filtering and forecasting of stochastic volatility models, including those incorporating leverage. In this dissertation we shall build upon these advances.
It is with a view towards addressing these concerns that we embark on the present work. As a by-product, we seek to contribute to the body of knowledge in the theory and practice of Bayesian methods, especially stochastic filtering. This is a worthwhile end in itself, since such methods have many applications outside finance [Sim06].
We also aim to develop a better understanding of the leverage effect in the high-frequency setting. While there have been recent studies in this area [Lit04, BLT06, ASFL13], to the best of our knowledge, none of them utilised the Bayesian methods that we consider in this work.
Notation and conventions
Throughout the text the author (Paul Bilokon) uses the authorial “we”. When referring to other researchers we use their full names on the first occurrence, then only surnames on subsequent occurrences.
For the reader’s convenience we have included a nomenclature listing the symbols and abbreviations used in this work. It precedes the table of contents and this preface. We have tried to keep the notation as standard as possible, while unifying it across the entire work.
Throughout this work, we shall use only natural logarithms, i.e. those with base e, and, in order to be explicit, we shall denote them by ln, e.g. ln x.
To avoid ambiguity due to locale differences, we shall always use the notation “YYYY.MM.DD” for all dates; for example, “2015.07.20” will represent the 20th July, 2015.
Plan of the dissertation
We open the dissertation with a selective review of classical and recent literature on time-varying volatility, introduce the leverage effect, and summarise recent research on stochastic volatility models with leverage and jumps (Chapter 1). We then review the stochastic filtering framework, focussing on algorithms and practicalities, and mention examples of applying the filtering framework in finance and trading (Chapter 2). In the same chapter we very briefly introduce Markov chain Monte Carlo (MCMC) methods (Section 2.12) and explain how they are related to filtering. We describe how particle filtering (Section 2.4) and MCMC (Section 2.12) have been applied in the academic literature to a stochastic volatility model incorporating leverage. In Chapter 3 we consider this in more detail. Following [Yu05], we examine this model as a particular discretisation of a continuous-time stochastic volatility model and use MCMC and particle filtering to compare it with an alternative discretisation. In this chapter we employ the Bayesian techniques of [KSC98, Yu05] for comparing econometric models. We then proceed to investigate some extensions from recent literature on stochastic filtering in application to stochastic volatility with leverage (Chapter 4). Finally, in Chapter 5, we use MCMC techniques to examine the leverage effect intraday, using high-frequency data.
We had to balance the conflicting requirements of making this dissertation both concise and self-contained. Each chapter ends with a section summarising the findings and citing further references to the literature. Some material has been relegated to the appendices. For example, proofs of mathematical propositions are given in Appendix A rather than in the body of the dissertation, so they don’t interrupt the flow of the discussion. In Chapter 4 we derive a Gaussian filter for the SVL2 model. For completeness, we give a similar derivation for the Jacquier–Polson–Rossi model [JPR04] in Appendix B.
Appendix C contains detailed descriptions of the datasets that we used in empirical work. We shall refer to all datasets using the notation “Dataset N”, N ∈ N∗. In brief, we have used 15 datasets in empirical work. Datasets 1–13 consist of daily log-returns (respectively: GBP/USD spot, S&P 500 index 1980–1987, S&P 500 index 1988–1995, S&P 500 index 1996–2003, S&P 500 index 2004–2011, S&P 500 index 2012–2016, FTSE 100
index, Euro Stoxx 50 index, CAC 40 index, DAX index, Nikkei 225 index, SSE Composite index, MICEX index). The first two of these datasets often appear in academic literature and are, in this sense, classical. Datasets 14 and 15 are intraday trades in S&P E-mini index futures and Euro Stoxx 50 index futures.
Appendix D provides a very brief overview of the differences between the frequentist and Bayesian interpretations of probability and, as a consequence, approaches to estimation. This appendix mentions the basics without going into philosophical subtleties. Appendix E gives implementations of the models and instructions for running MCMC analyses in OpenBUGS. It also includes more details of the results than we could afford to cover in the body of the dissertation. Appendix F gives further details of the results of running stochastic filters.
Contributions
1. Literature reviews: of classical and modern time-varying volatility models, culminating in stochastic volatility models with leverage and jumps [MP09, MP11a, MP11b, PMD14] (Chapter 1); of classical and modern linear and nonlinear filtering methods, their applications to stochastic volatility models and algorithmic trading in general (Chapter 2).
2. An examination of a popular stochastic volatility model with leverage as an Euler–Maruyama discretisation of a continuous-time model [Yu05] and a possible alternative discretisation, a filtering Euler–Maruyama scheme (Chapter 3).
3. A generalisation of the NLN mixture distribution [Yan04], requisite for the Gaussian filter contributions (see below), and a brief examination of the properties of this distribution (Chapter 3).
4. An implementation of recently developed kernel density estimation (KDE) methods to estimate the pre- dicted observation distribution in particle filters applied to stochastic volatility models (Chapter 4).
5. Using this density to filter out outliers as per the methodology suggested in [KCES10] and to detect structural change (Chapter 4).
6. An application of post-regularised particle filters (post-RPF) [MO98, GMO98, OM99] to reduce the RMSE due to the resampling step while keeping it smooth to aid maximum likelihood parameter estimation (Chapter 4).
7. A generalisation of the Gaussian filter to systems driven by noises represented by arbitrary probability density functions, including those where these noises (not necessarily additive) are correlated between the process and observation models (Chapter 4).
8. An application of this filtering approach to the SVLJ family of models [PMD14] (Chapter 4).
9. A comparison of particle filtering approaches to stochastic volatility with leverage with quasi-maximum likelihood [HS96], a recently proposed unscented Kálmán filter with correlated noises [CLP09], and our generalisation of a Gaussian filter with correlated noises (Chapter 4).
10. An examination of the intraday leverage effect using high-frequency trade data (Chapter 5).
11. A comparison of trade time and wall clock time using the same data (Chapter 5).
12. Empirical work on two classical daily datasets (Dataset 1 and Dataset 2), eleven new daily and two new high-frequency datasets (Chapters 3 and 5).
While the time and scope of this work are necessarily limited, we have left suggestions for further work.
The above theoretical contributions were tested through computer implementation by the author. The resulting Python library, BayesTSA, has been made publicly available under the Apache License Version 2.0: https://github.com/thalesians/bayestsa
We have also open-sourced, under the same license, the LATEX library that we used to typeset this dissertation and pretty-print the algorithms: https://github.com/thalesians/lathalesians
Portions of this work have been presented at the following talks, often in combination with other research by the author:
1. Stochastic Filtering with Applications to Algorithmic Trading: A Practitioner’s Overview. αlphascope, Geneva, Switzerland, 3–5 February, 2015. (The author gave a workshop on filtering techniques.)
2. Stochastic Filtering in Algorithmic Trading. The 23rd Annual Global Derivatives Trading and Risk Management, Hotel Okura, Amsterdam, the Netherlands, 18–22 May, 2015.
3. Stochastic Filtering in Electronic Trading. A Thalesian Seminar, London Marriott Hotel West India Quay, London, the United Kingdom, 22 July, 2015.
4. Stochastic Filtering in Electronic Trading: A Practitioner’s Overview. The Trading Show London 2016, ETC, 155 Bishopsgate, London, 23 March, 2016.
5. (A Very Brief Overview of) The Elements of Risk Management in Finance. Guest Lecture at the University College London (UCL). 27 April, 2016.
6. What Do You Need to Know as an Algorithmic Market Maker? Global Derivatives Trading & Risk Management, InterContinental Budapest, 10 May, 2016.
7. Stochastic Filtering with Applications to Algorithmic Trading: A Practitioner’s Overview. The Global Derivatives Trading & Risk Management, The Thalesian Workshop (delivered with Saeed Amen), InterContinental Budapest, Hungary, 13 May, 2016.
8. Algorithmic Trading and Risk Management. Guest Lecture to MFin Students at Cambridge Judge Business School, 28 June, 2016.
9. Stochastic Filtering in Electronic Trading. The CQF/Wilmott/Wiley Quant Insights conference, Fitch Ratings Auditorium, 30 North Colonnade, Canary Wharf, London, 14 October, 2016.
Chapter 2 of this dissertation is due to be published, with some additional material, as a chapter in Springer-Verlag’s STRIKE — Novel Methods in Computational Finance, edited by Matthias Ehrhardt, Michael Günther and Jan ter Maten, a handbook in the Mathematics in Industry series [EGtM17].
Chapter 1
Volatility
1.1 Introduction
Much of the research in mathematical finance over the past few decades has focussed on volatility. In the words of [ASY09], “understanding volatility and its dynamics lies at the heart of asset pricing” since volatility is “the primary measure of risk in modern finance”. In trading, whether voice or electronic, the goal is to find an attractive risk-return trade-off. Therefore it is crucial to get a handle on volatility and, by implication, risk.
In this chapter we shall review volatility modelling, from early to modern days. We are bound to be selective and restrict ourselves to a small subset of the literature. After having outlined some of the main approaches, we shall focus on the stochastic volatility models that incorporate the so-called leverage effect. One such model has been extensively investigated using stochastic filtering and Markov chain Monte Carlo methods. We shall use it as a running example to explore how existing algorithms can be employed and enhanced.
We begin by tracing the history of volatility modelling. The historical perspective enables us to understand the subject in its development and elucidate the trains of thought and motivations of the contributors.
1.2 The early days of volatility modelling: constant volatility models
The first mathematical model for the volatility of asset prices was proposed by Louis Bachelier (1870–1946). In a thesis submitted in 1900 [Bac00], he speaks of the coefficient of instability (le coefficient d’instabilité), which measures the “static state” of an asset: a nonnegative real number that is small when that state is “calm” and large when “turbulent”.
To make these notions precise, one needs to postulate a probabilistic model for the price of an asset and relate the coefficient of instability to the model’s parameters. Bachelier did just that: he postulated that the price,
S, followed the simple symmetric random walk. Let (J_k)_{k∈N*} be i.i.d. random variables with P[J_k = 1] = P[J_k = −1] = 1/2. The simple symmetric random walk (ζ_k)_{k∈N*} is defined by ζ_k := ∑_{i=1}^k J_i. Bachelier assumed that the price of an asset at the equally spaced discrete times t_1, t_2, ..., with t_k = t_{k−1} + Δt for k ≥ 2, Δt ∈ R^+, is given by

S_{t_k} = s_0 + ΔS ζ_k,  k ∈ N*,  (1.1)
for some fixed s_0, ΔS ∈ R_0^+. He then employed a limit argument for Δt → 0 to obtain a continuous-time model. A modern-day mathematician would have used [HK04, Theorem 2.5], a consequence of the central limit theorem:
Theorem 1.2.1. The rescaled random walk, Z_n(t) := (1/√n) ζ_⌊nt⌋, n ∈ N*, converges weakly to the Wiener process, W, as n → ∞: Z_n ⇒ W.
In the limit, the discrete-time model (1.1) becomes the continuous-time
dS_t = σ_B dW_t,  S_0 = s_0,  (1.2)

where ΔS is replaced by the scaling factor σ_B, which is precisely what Bachelier called the coefficient of instability. The parameter σ_B controls the variance function of the process S: for t ∈ T, σ_S^2(t) := Var[S_t] = σ_B^2 t. The variance function expresses Bachelier's notion of the price process being "calm" (when its value is small at a given t) or "turbulent" (when that value is large).
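The variance function σ_S^2(t) = σ_B^2 t is easy to check by Monte Carlo. The following sketch (our illustration, not part of Bachelier's treatment; the parameter values are arbitrary) simulates paths of the limiting model (1.2) with numpy:

```python
import numpy as np

# Monte Carlo check of the variance function of the Bachelier model (1.2):
# S_t = s0 + sigma_B * W_t, so Var[S_t] should equal sigma_B**2 * t.
rng = np.random.default_rng(42)
s0, sigma_B, t = 100.0, 2.0, 1.0
n_steps, n_paths = 100, 50_000

dt = t / n_steps
# Simulate Wiener increments and sum them along each path to get W_t.
dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
S_t = s0 + sigma_B * dW.sum(axis=1)

print(S_t.mean())   # close to s0 = 100
print(S_t.var())    # close to sigma_B**2 * t = 4.0
```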
In Bachelier’s model, S can assume negative values. In equity markets, a negative share price does not make sense. To remedy this shortcoming, Fischer Black and Myron Scholes [BS73] replaced (1.2) with
dS_t = νS_t dt + σS_t dW_t,  S_0 = s_0.  (1.3)
Now the price follows a geometric Brownian motion (GBM) with the percentage drift ν and percentage volatility σ. Here σ has a different meaning to that of σ_B — it is a parameter of a different stochastic process — but, like σ_B, it serves to control the variance function of the process. The analytic solution of the SDE (1.3) is

S_t = s_0 exp((ν − σ^2/2) t + σW_t).  (1.4)
It is often more convenient to work with the log-prices, M := ln S, which grow linearly in time, rather than the prices, S, which grow exponentially. By Itô's lemma, (1.3) implies that

dM_t = (ν − σ^2/2) dt + σ dW_t,  M_0 = ln s_0.  (1.5)
We could reparameterise (1.5) with κ := ν − σ^2/2, thus modelling the log-price as a Wiener process with drift κ and infinitesimal variance σ^2. Either by trivially solving the SDE (1.5) analytically, or by taking the natural logarithm on both sides in (1.4), one obtains M_t = ln s_0 + κt + σW_t ∼ N(ln s_0 + κt, σ^2 t). In this model volatility is represented by the constant parameter σ.
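As a quick numerical sanity check (ours; the parameter values are arbitrary), one can simulate the exact solution (1.4) and verify that the log-price has the stated Gaussian mean and variance:

```python
import numpy as np

# Sanity check of (1.4)/(1.5): under GBM, the log-price at time t is
# M_t = ln(s0) + kappa*t + sigma*W_t with kappa = nu - sigma**2/2,
# i.e. Gaussian with mean ln(s0) + kappa*t and variance sigma**2 * t.
rng = np.random.default_rng(0)
s0, nu, sigma, t, n_paths = 100.0, 0.05, 0.2, 2.0, 200_000

kappa = nu - 0.5 * sigma**2
W_t = rng.normal(0.0, np.sqrt(t), size=n_paths)
S_t = s0 * np.exp(kappa * t + sigma * W_t)   # exact solution (1.4)
M_t = np.log(S_t)

print(M_t.mean())   # close to ln(s0) + kappa*t
print(M_t.var())    # close to sigma**2 * t = 0.08
```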
1.3 Evidence against the constant volatility assumption
One of the “ideal conditions” assumed in the derivation of the Black–Scholes formula [BS73, page 640] is that “The variance rate of the return on the stock is constant”. The realism of this assumption was challenged even before its publication: Benoit B. Mandelbrot (1924–2010) observed in [Man63] that
the movement of prices in periods of tranquillity seem to be smoother [...] In other words, large changes tend to be followed by large changes — of either sign — and small changes tend to be followed by small changes,

and quoted the earlier empirical observations by Mitchell [Mit15]. This stylised fact, known as volatility clustering, is one of several that have motivated the search for more realistic models for volatility dynamics.
Data suggests that, far from being Gaussian, the distribution of the rate of return exhibits fat tails and leptokurtosis [Fam65]. The early results of Black and Scholes were available in January 1971 as an MIT mimeo. The authors reviewed their validity in [BS72], noting that
there is evidence of non-stationarity in the variance. More work must be done to predict variances using the information available.
Several years later, Black remarked [Bla76]:
I start with the view that nothing is really constant. Volatilities themselves are not constant, and we can’t write down the process by which volatilities change with any assurance that the process itself will stay fixed. We’ll have to keep updating our description of the process.
The quest for more realistic models, which would adequately match the empirical findings and incorporate the time-varying volatility, began.
1.4 The two families of models
This quest has resulted in two distinct families of models: the ARCH/GARCH family, originating in [Eng82, Bol86], and the stochastic volatility (SV) family, originating in [Cla73, TP83, Tay82, HW87]. Both families remain actively researched and used to this day: see [EFF08] and [GHR96, She96, She05], respectively, for overviews.
1.4.1 The GARCH family
The GARCH family took its rise from Robert F. Engle's Nobel prize-winning paper [Eng82], where he modelled the inflation in the United Kingdom rather than asset returns. We shall closely follow [PT13] in this brief exposition. Let a discrete-time process, y_t, t ∈ N_0, follow the AR(1) model,
yt = ρyt−1 + zt,
where the z_t are i.i.d. WS(0, σ_z^2), σ_z ∈ R_0^+. Such a process could be, inter alia, interpreted as modelling returns on an asset. The solution of this recursive equation is

y_t = ρ^t y_0 + ∑_{i=1}^t ρ^{t−i} z_i.

Taking y_0 = 0, we find, for t ∈ N_0, the unconditional and conditional mean, and the unconditional and conditional variance of y_t:

E[y_t] = ∑_{i=1}^t ρ^{t−i} E[z_i] = 0,  E[y_t | y_{t−1}] = ρ y_{t−1},
Var[y_t] = ∑_{i=1}^t ρ^{2(t−i)} σ_z^2,  Var[y_t | y_{t−1}] = σ_z^2.
The last equation shows that the conditional variance is constant — in the language of econometrics, the model is conditionally homoscedastic. This is inconsistent with the empirical observations of Section 1.3, so Engle replaced the assumption that the z_t are i.i.d. WS(0, σ_z^2) with
zt = etσt,
σ_t^2 = α_0 + α_1 z_{t−1}^2 + ... + α_s z_{t−s}^2,
where s ∈ N*, the e_t are i.i.d. WS(0, 1), α_0 > 0, and α_1, ..., α_s ≥ 0.
The time-dependent random variable σ_t serves to rescale the innovation sequence e_t. If at least one of α_1, ..., α_s is nonzero, the model is autoregressive through the dependence of Var[y_t | y_{t−1}] on the s lags of z_t^2. In this case Var[y_t | y_{t−1}] is no longer constant, so the model is conditionally heteroscedastic. The resulting model is therefore called autoregressive conditional heteroscedasticity (ARCH), more specifically, ARCH(s).
In practice, one replaces z_t with an estimate, ẑ_t = y_t − ρ̂ y_{t−1}, where ρ̂ is an estimate of ρ. From ẑ_t one can estimate the coefficients α_0, ..., α_s, for example, using the maximum likelihood approach suggested by Engle.
Bollerslev generalised this model to what is known as GARCH(r, s), where the letter "G" stands for "generalised":
zt = etσt,
σ_t^2 = α_0 + α(L) z_t^2 + γ(L) σ_t^2,
where r, s ∈ N_0, α(L) = ∑_{i=1}^s α_i L^i, γ(L) = ∑_{j=1}^r γ_j L^j, L being the lag operator. We recover ARCH(s) as a special case by setting γ(L) = 0. The simplest and perhaps most commonly used variant is GARCH(1, 1) with
σ_t^2 = α_0 + α z_{t−1}^2 + γ σ_{t−1}^2.
In all GARCH models z_t is a martingale difference sequence. Var[z_t] is constant if α + γ < 1, so z_t is then white noise. In general, finding conditions under which σ_t is positive and z_t stationary is non-trivial [NC92, BE93]. Daniel B. Nelson [Nel91] proposed a variant of GARCH — the exponential GARCH, or EGARCH — where instead of the variance, σ_t^2, the log-variance, ln(σ_t^2), is modelled; in Nelson's work it is assumed to follow a random walk.
Working in logs avoids the danger of encountering σt < 0. Moreover, it can be shown that discrete-time models involving log-variances approximate continuous-time stochastic volatility models. See, for example, [Das92] and references therein.
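To make the GARCH(1,1) recursion concrete, here is a minimal simulator (our sketch; the parameter values are illustrative only). When α + γ < 1, the simulated z_t should have unconditional variance close to α_0/(1 − α − γ):

```python
import numpy as np

# Minimal GARCH(1,1) simulator: sigma2_t = alpha0 + alpha*z_{t-1}**2 + gamma*sigma2_{t-1}.
# When alpha + gamma < 1, z_t is white noise with unconditional
# variance alpha0 / (1 - alpha - gamma).
def simulate_garch11(alpha0, alpha, gamma, n, seed=0):
    rng = np.random.default_rng(seed)
    e = rng.standard_normal(n)
    z = np.empty(n)
    sigma2 = np.empty(n)
    sigma2[0] = alpha0 / (1.0 - alpha - gamma)   # start at the stationary variance
    z[0] = e[0] * np.sqrt(sigma2[0])
    for t in range(1, n):
        sigma2[t] = alpha0 + alpha * z[t - 1]**2 + gamma * sigma2[t - 1]
        z[t] = e[t] * np.sqrt(sigma2[t])
    return z, sigma2

z, sigma2 = simulate_garch11(alpha0=0.1, alpha=0.1, gamma=0.85, n=200_000)
print(z.var())   # close to 0.1 / (1 - 0.1 - 0.85) = 2.0
```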
1.4.2 The stochastic volatility family
In models of the GARCH family, the conditional variance is a deterministic function of past returns. In the stochastic volatility (SV) family, it is driven by an (unobserved) stochastic process. The earliest such approach known to us was proposed by Peter K. Clark [Cla73] using Salomon Bochner's (1899–1982) idea of a subordinated stochastic process [Boc60], which is essentially a time change — a nonnegative stochastic process τ with nondecreasing sample paths. Clark modelled the log-price in continuous time as M_t = W_{τ_t}, t ≥ 0, where W is a Wiener process, τ and W are independent, and τ has independent increments.
Clark’s successors followed a somewhat different path — that of modelling volatility explicitly as a stochastic process. As far as we know, the first papers to propose this approach were written by Stephen J. Taylor [Tay82] and by George E. Tauchen and Mark Pitts [TP83], who all worked in discrete time. We shall use lower-case letters to denote discrete-time stochastic processes in this chapter, so we replace Mt with mt in this instance.
Taylor modelled the daily (continuously compounded, or log-) returns, i.e. the differences in log-prices, y_t := m_t − m_{t−1}, t ∈ N*, as
y_t = e_t e^{x_t/2},  (1.6)
x_{t+1} = µ(1 − φ) + φx_t + σ_v η_t,  (1.7)
where t ∈ N_0, x_t is the discrete-time stochastic process representing the log-variance of the log-returns (the volatility, then, is σ_t := e^{x_t/2}), µ is the mean log-variance, φ the persistence parameter, σ_v the volatility of log-variance, and e_0, e_1, ... and η_0, η_1, ... are all i.i.d. N(0, 1).
As in EGARCH, working with logs, x_t = ln(σ_t^2), avoids the possibility of encountering a negative volatility σ_t. It also enables the discrete-time model to approximate a continuous-time one [Das92]. On its own, (1.7) is an AR(1) process — a natural discrete-time approximation of the continuous-time Ornstein–Uhlenbeck process.
When |φ| < 1, this process is strictly stationary with unconditional mean E[x_t] = µ and variance Var[x_t] = σ_v^2/(1 − φ^2). The persistence parameter, φ, determines the rate of mean reversion. The half-life of the AR(1) process (1.7) is given by ln(0.5)/ln(|φ|). It is the number of periods that the process needs, on average, to halve its distance from the mean. Sometimes (1.7) is parameterised with α := µ(1 − φ), as in [HRS94].
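A short simulation of Taylor's model (1.6)–(1.7) confirms the stationary moments and the half-life formula. This is our sketch; the parameter values are illustrative:

```python
import numpy as np

# Sketch of Taylor's discrete-time SV model (1.6)-(1.7), with independent
# disturbances e_t and eta_t.
def simulate_taylor_sv(mu, phi, sigma_v, n, seed=1):
    rng = np.random.default_rng(seed)
    e = rng.standard_normal(n)
    eta = rng.standard_normal(n)
    x = np.empty(n)
    # Draw x_0 from the stationary distribution of the AR(1) process (1.7).
    x[0] = rng.normal(mu, sigma_v / np.sqrt(1.0 - phi**2))
    for t in range(n - 1):
        x[t + 1] = mu * (1.0 - phi) + phi * x[t] + sigma_v * eta[t]
    y = e * np.exp(x / 2.0)   # log-returns (1.6)
    return y, x

mu, phi, sigma_v = 0.25, 0.975, np.sqrt(0.025)
y, x = simulate_taylor_sv(mu, phi, sigma_v, n=100_000)

half_life = np.log(0.5) / np.log(abs(phi))
print(half_life)   # about 27.4 periods
print(x.mean())    # close to mu
print(x.var())     # close to sigma_v**2 / (1 - phi**2)
```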
Others have modelled volatility (or various functions thereof) as continuous-time stochastic processes. Herb Johnson pioneered this direction in a working paper published in 1979 [Joh79] and later continued this work in collaboration with David Shanno [JS87]. Along these lines, James B. Wiggins [Wig87] studied the model
dM_t = κ dt + σ_t dW_t^{(M)},  M_0 = ln s_0,
dσ_t = f(σ_t) dt + σ_v σ_t dW_t^{(σ)},

where f is a deterministic function, W^{(M)} and W^{(σ)} are standard Wiener processes, and dW^{(M)} dW^{(σ)} = ρ dt, so the log-price and volatility are possibly correlated (we will return to this correlation in Section 1.5). In the much-studied model by John C. Hull and Alan White [HW87], the second SDE is replaced by
dσ_t^2 = f(σ_t) σ_t^2 dt + g(σ_t) σ_t^2 dW_t^{(σ)},

where g is a nonnegative deterministic function. f can be selected so that the volatility, σ_t (in the case of Wiggins), or the variance, σ_t^2, is the mean-reverting Ornstein–Uhlenbeck (OU) process and, indeed, both papers consider this special case. Mean reversion is consistent with the empirically observed volatility clustering. Elias M. Stein and Jeremy C. Stein also model volatility as an OU process [SS91]. Another popular model by Steven L. Heston [Hes93] is expressed in terms of variance: when σ follows an OU process, by Itô's lemma, σ^2 follows the square-root (or CIR) process used by Cox, Ingersoll, and Ross [CIR85].
Many of the papers looking at stochastic volatility in continuous time have the pricing of options and other derivatives as their primary concern. For example, [HW87, JS87, Wig87] use series expansions and/or numerical methods to determine option prices; [SS91, Hes93] are also concerned with options. As in this dissertation we are interested in econometrics rather than derivatives pricing, we shall not go into further details. Econometricians prefer discrete-time models since they lend themselves more easily to forecasting and parameter estimation.
1.5 The statistical leverage effect
One of the aspects of the volatility process highlighted in [BS73] is the relation between stock returns and changes in volatility:
I have believed for a long time that stock returns are related to volatility changes. When stocks go up, volatilities seem to go down; and when stocks go down, volatilities seem to go up. The extreme example of this is the depression of the 30’s. Stocks went way down, and volatilities went way up.
The term statistical leverage effect (or simply leverage effect) refers to the negative correlation thus observed between changes in asset prices and changes in volatility. The reason for this name may not be immediately apparent. The term leverage is synonymous with the debt-equity ratio of a company, calculated by dividing the company's total liabilities by shareholders' equity, indicating what portion of equity and debt the company is using to finance its assets. The term takes its origin in the following explanation suggested by Black [BS73]:
A drop in the value of the firm will cause a negative return on its stock, and will usually increase the leverage of the stock.
Ibid., he observed that the volatility changes corresponding to stock returns were of too great a magnitude to be explained by the leverage alone, suggesting that "other causal factors must be at work". Other researchers have expressed this view. The view is reinforced by an empirical study [HL11] of a sample of all-equity-financed companies from January 1972 to December 2008, where the authors find that the leverage effect for these companies is just as strong, if not stronger, than for debt-financed companies, concluding that "Black's leverage effect is not due to leverage". Bearing in mind these reservations, we shall nonetheless refer to the effect by its historical name.
1.6 A stochastic volatility model with leverage and jumps
Arguments have been heard in favour of both the GARCH [MT90] and the stochastic volatility [KSC98] families of models. We have seen that the conditional variance is driven by a separate source of randomness in the SV family, whereas in GARCH it is a deterministic function of past returns. The introduction of a separate stochastic process to govern volatility is particularly appealing at higher frequencies, where that process can be thought of as a representation of the information flow [PMD14, page 528]. The SV framework is also closely linked to realised volatility measures [BNS02], which have been gaining importance in high-frequency econometrics. It has also been shown [Das92] that the SV family of models offers better discrete-time approximations to the continuous-time model used in [HW87]. We shall focus exclusively on SV in this work.
Andrew C. Harvey, Esther Ruiz, and Neil Shephard [HRS94] extended Taylor's model [Tay82] mentioned in Section 1.4.2 to include the leverage effect. Bjørn Eraker et al. formulated a similar model with jumps, but no leverage [EJP03]. Michael K. Pitt, Sheheryar Malik, and Arnaud Doucet combine the two models to obtain a new one — the stochastic volatility with leverage and jumps (SVLJ) model. They investigate this model in a series of papers [MP09, MP11a, MP11b, PMD14]. We shall use SVLJ as a basis for our discussion, so let us take a closer look at it.
The model has the general form of Taylor's (1.6) and (1.7) with two modifications. For t ∈ N_0, let y_t denote the log-return on an asset and x_t denote the log-variance of that return. Then
y_t = e_t e^{x_t/2} + J_t v_t,  (1.8)
x_{t+1} = µ(1 − φ) + φx_t + σ_v η_t,  (1.9)
where, as before, µ is the mean log-variance, φ is the persistence parameter, and σ_v is the volatility of log-variance.
The first change to Taylor’s original model is the introduction of correlation between et and ηt:
(e_t, η_t)^⊤ ∼ N(0, Σ),  Σ = (1  ρ; ρ  1).
The correlation ρ is the leverage parameter. As a consequence of the discussion in Section 1.5, we shall assume
ρ ∈ [−1, 0). The second change is the introduction of jumps in (1.8): J_t ∈ {0, 1} is a Bernoulli counter with intensity p (thus p is the jump intensity parameter), and v_t ∼ N(0, σ_J^2) determines the jump size (thus σ_J is the jump volatility parameter).
We obtain a stochastic volatility with leverage (SVL), but no jumps, if we delete the Jtvt term or, equivalently, set p to zero. Taylor’s original model is a special case of SVLJ with p = 0, ρ = 0. In Figure 1.1 we show an example of the SVLJ data generated using the StateSpaceModelDataGenerator of BayesTSA.
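A minimal SVLJ data generator in the spirit of Figure 1.1 can be sketched as follows (this is our illustration; the actual StateSpaceModelDataGenerator in BayesTSA may differ in interface and detail). The correlated pair (e_t, η_t) is constructed via η_t = ρ e_t + √(1 − ρ²) ε_t with ε_t an independent standard normal:

```python
import numpy as np

# Illustrative SVLJ data generator for (1.8)-(1.9); not the BayesTSA code.
def simulate_svlj(mu, phi, sigma_v, rho, p, sigma_J, n, seed=7):
    rng = np.random.default_rng(seed)
    # Correlated N(0, Sigma) disturbances via a Cholesky-style construction.
    e = rng.standard_normal(n)
    eta = rho * e + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
    J = rng.random(n) < p                  # Bernoulli jump indicators
    v = rng.normal(0.0, sigma_J, size=n)   # jump sizes
    x = np.empty(n)
    x[0] = mu                              # start the log-variance at its mean
    y = np.empty(n)
    for t in range(n):
        y[t] = e[t] * np.exp(x[t] / 2.0) + J[t] * v[t]                 # (1.8)
        if t + 1 < n:
            x[t + 1] = mu * (1 - phi) + phi * x[t] + sigma_v * eta[t]  # (1.9)
    return y, x, J

# Parameters as in Figure 1.1.
y, x, J = simulate_svlj(mu=0.25, phi=0.975, sigma_v=np.sqrt(0.025),
                        rho=-0.8, p=0.01, sigma_J=np.sqrt(10.0), n=2000)
print(J.sum())   # roughly p*n = 20 jumps
```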
Here we use the convention adopted in [HS96, KSC98, OCSN04, PMD14] and designate the disturbance that propagates the log-variance between the times t and t + 1 by η_t. In much of the literature [HRS94, Rui94, Yu05] this disturbance is designated η_{t+1}, so (1.9) becomes
x_{t+1} = µ(1 − φ) + φx_t + σ_v η_{t+1}.  (1.10)
In discrete-time econometric models the choice between (1.9) and (1.10) is a matter of convention. The important point to realise is that, whichever convention we use, in this model it is the disturbance on the log-return y_t and the disturbance that propagates the log-variance from x_t to x_{t+1} that are correlated. If the disturbance on the log-return y_t were instead correlated with the disturbance that propagated the log-variance from x_{t−1} to x_t, we would have a different correlation structure and therefore a different model. Throughout the literature, y_t is the log-return that is related, through (1.8), to x_t. In [Yu05], although the convention (1.10) is used, (1.9) is described as "clearer and more consistent". We shall stick with (1.9), so in our case Cor[e_t, η_t] = ρ.
[Figure 1.1: six panels plotting log-variance, variance, volatility (standard deviation), log-return, log-price and price against time.]
Figure 1.1: An example of the SVLJ data generated using StateSpaceModelDataGenerator of the BayesTSA library. Here we have used the same parameters as those in Figure 1 in [PMD14], namely µ = 0.25, φ = 0.975, σ_v^2 = 0.025, ρ = −0.8, p = 0.01, σ_J^2 = 10. The vertical red lines indicate the times at which the simulated jumps occur. We assume that the initial price is 100.
[Figure 1.2: four graphical models, (a)–(d), each showing the nodes η, x, e and y at times t − 1, t and t + 1, with arrows indicating the dependence structure.]
Figure 1.2: The two conventions used in designating disturbances (noises) in econometric models and the two different correlation structures between the disturbances. Within each subfigure we show the two disturbances that are assumed to be correlated in bold red; all the other disturbances are assumed to be uncorrelated. Subfigures (a) and (c) show the first convention, where the disturbance that propagates x_t to x_{t+1} is ascribed to time t and so referred to as η_t. Subfigures (b) and (d) show the second convention, where the disturbance that propagates x_t to x_{t+1} is ascribed to time t + 1 and so referred to as η_{t+1}. Subfigures (a) and (b) show the first correlation structure (CS1), in which the disturbance e_t, involved in y_t (y_t being a function of x_t and e_t), is correlated with the disturbance that propagates x_t to x_{t+1}. Subfigures (c) and (d) show the second correlation structure (CS2), in which the disturbance e_t, involved in y_t (y_t being a function of x_t and e_t), is correlated with the disturbance that propagates x_{t−1} to x_t, the rest of the disturbances being independent.

The differences between the two conventions used in the literature to designate the noises and the two different correlation structures are summarised in Figure 1.2.
Harvey and Shephard [HS96] propose a quasi-maximum likelihood (QML) approach to estimate the parameters of the SVL model. They start by squaring the log-returns and taking logs to obtain ln y_t^2 = x_t + ln e_t^2. They then obtain a linear state-space representation of the problem and apply the Kálmán filter (see Chapter 2). The likelihood obtained from an application of the Kálmán filter is then used in the QML procedure to estimate the parameters of the model. Sangjoon Kim, Neil Shephard, and Siddhartha Chib [KSC98] and Renate Meyer and Jun Yu [MY00] use Markov chain Monte Carlo (MCMC) techniques to estimate the parameters of SVL. Eric Jacquier, Nicholas G. Polson and Peter E. Rossi [JPR04] apply MCMC to a different stochastic volatility model with leverage. Particle filtering methods are applied to SVLJ by Pitt et al. [PMD14] and Davide Raggi and Silvano Bordignon [RB06]. The latter incorporate an MCMC step to avoid degeneration of the particles. Petar M. Djurić, Mahsiul Khan, and Douglas E. Johnston apply particle filters to SVL and employ Rao-Blackwellisation of the unknown static parameters.
1.7 Conclusions and further reading
We refer the reader to [GHR96, She96, She05] for more thorough reviews of SV models. A summary of the modelling approaches can also be found in [Gat06, Chapter 1]. A more nuanced guided tour of the history of the subject is contained in [Hau08].
In the next chapter we shall examine the algorithmic toolkit of stochastic filtering and, in Section 2.4, show how it has been applied to SVLJ. We shall also introduce some fundamentals of MCMC. In the ensuing chapters we shall apply both filtering and MCMC to various stochastic volatility models and build on the theoretical foundation reviewed in the first two chapters.

Chapter 2
The filtering framework
2.1 Introduction
In this chapter we shall present the foundations of stochastic filtering from the standpoint of an electronic trading practitioner. Let us start in full generality. The totally ordered set T will represent time. Let (Ω, F, P, (F_t)_{t∈T}) be a filtered probability space satisfying the usual conditions. An (F_t)_{t∈T}-adapted stochastic process X = (X_t)_{t∈T}, taking values in a complete separable metric space S, will represent the (hidden, latent, unobserved) state of the system at time t. We shall refer to X as the state process and to S as the state space. While we cannot observe X directly, we have access to another (F_t)_{t∈T}-adapted stochastic process Y = (Y_t)_{t∈T}, which is a function of X and a Wiener process V = (V_t)_{t∈T}: Y_t := h_t(X_t, V_t), t ∈ T. The function h_t is sometimes referred to as the observation model. The specification of the dynamics of X, often in the form of an SDE, is called the process model.
For each t ∈ T, let Y_t be the σ-algebra generated by the observation process Y up to time t. The filtering problem consists of computing the filtering distribution, π = (π_t)_{t∈T}, the conditional distribution of X_t given Y_t. It is defined as a random probability measure, which is — crucially — measurable w.r.t. Y_t, such that

E[ψ(X_t) | Y_t] = ∫_S ψ(x) π_t(dx)

for all statistics ψ for which both sides of the above identity make sense. A formal definition of π relies on some technicalities and can be found in [BC09, Chapter 2].
Interest in the filtering problem dates back to the late 1930s and early 1940s. It was considered in Andrey Nikolaevich Kolmogorov's (1903–1987) work on time series [Kol33, Kol41, Kol56] and Norbert Wiener's (1894–1964) work on improving radar communication during WWII [Wie49], which first appeared in 1942 as a classified memorandum nicknamed 'The Yellow Peril' [Wie42], so named after the colour of the paper on which it was printed [BC09]. In [Wie42, Wie49], Wiener considered the special case of a stationary state process and additive noise and minimised the mean square error between the estimated and actual state process, working in continuous time. The result is nowadays known as the Wiener filter. A discrete-time equivalent of this filter was obtained independently by Kolmogorov and published earlier in Russian [Kol41]. Due to the contributions of these two scientists, the theory is referred to as the Wiener–Kolmogorov theory of filtering and prediction [KB61].
Rudolf Emil Kálmán (1930–2016) extended this work to non-stationary processes. This work had military applications, notably the prediction of ballistic missile trajectories. Non-stationary processes were required
to realistically model their launch and re-entry phases. Of course, non-stationary processes abound in other fields — even the standard Brownian motion of the basic Bachelier model [Bac00] in finance is non-stationary. Kálmán's fellow electrical engineers initially met his ideas with skepticism, so he ended up publishing in a mechanical engineering journal [Kál60a]. In 1960, Kálmán visited the NASA Ames Research Center, where Stanley F. Schmidt took interest in this work. This led to its adoption by the Apollo programme and other projects in aerospace and defence.
The discrete-time version of the filter derived in [Kál60a] is now known as the Kálmán filter (KF). The continuous-time version, known as the Kálmán–Bucy filter, was published in a joint paper with Richard Snowden Bucy (b. 1935) [KB61]. Before meeting Kálmán, Bucy worked on continuous-time stochastic filtering independently, with James Follin, A. G. Carlton, and James Hanson at the Johns Hopkins Applied Physics Lab in the late 1950s. The Kálmán and Kálmán–Bucy filters address the particularly important special case when the process model and the observation model are linear and both stochastic processes are Gaussian — the so-called linear-Gaussian case — where there is an analytic solution for π. The general case of the filtering problem was addressed by Ruslan Leontyevich Stratonovich (1930–1997) [Str59, Str60], Harold J. Kushner (b. 1933) [Kus67], and Moshe Zakai (1926–2015) [Zak69].
The general solutions are, however, infinite-dimensional and not easily applicable. In practice, numerical approximations are employed. Particle filters constitute a particularly important class of such approximations. These methods are sometimes referred to as sequential Monte Carlo (SMC), a term coined by Liu and Chen [LC98]. The Monte Carlo techniques requisite for particle filtering date back to the work of Hammersley and Morton [HM54]. Sequential importance sampling (SIS) dates back to the work of Mayne and Handschin [May66, HM69]. The important resampling step was added by Gordon, Salmond, and Smith [GSS93], based on an idea by Rubin [Rub87], to obtain the first sequential importance resampling (SIR) filter, which, in our experience, remains the most popular particle filtering algorithm used in practice. Other important early contributions to the development of particle filtering include [Kit93, Kit96, BBGL97, IB98, LC98, HK98, CCF99].
Our overview of the development of stochastic filtering theory is necessarily brief. We focus on the development of the algorithms to motivate the discussion in the sequel, and merely scratch the surface of the advances in stochastic analysis on which these algorithms rely. The reader interested in this important aspect of the theory can examine the work of Robert Horton Cameron (1908–1989), Masatoshi Fujisaki, Igor Vladimirovich Girsanov (1934–1967), Gopinath Kallianpur (1925–2015), Hiroshi Kunita (b. 1937), Robert Shevilevich Liptser (b. 1936), William Theodore Martin (1911–2004), Albert Nikolaevich Shiryaev (b. 1934), and Charlotte Striebel (1929–2014), including the significance of the Cameron–Martin–Girsanov theorem [CM44, Gir60], the Fujisaki–Kallianpur–Kunita equation and orthogonal projection in Hilbert spaces [FKK72], the Kallianpur–Striebel formula [KS69b, KS69a, Kal80, LS92], and non-linear filtering of Markov diffusion processes and jump processes [Shi66, LS68].
2.2 Special case: general state-space models
We have spoken thus far in considerable generality. Let us turn our attention to a particularly important special case of the filtering problem, viz. filtering for discrete-time, general state-space models, also known as hidden Markov models (HMM) [LB13]. Such models are prevalent in econometrics due to the relative ease of the estimation of their parameters and forecasting. We shall keep our notation and terminology roughly consistent with [CM14, Section 2.2].
These models are discrete-time, i.e. we regard T as a countable set. Without loss of generality, we shall identify T with N_0. Since our models consider the evolution of a system over time, they are described as dynamic. We shall also assume that the state process (X_t)_{t∈T} takes values in the Euclidean space S = R^{d_X}, d_X ∈ N* (so it is now represented by a state vector), and that the observation process (Y_t)_{t∈T\{0}} takes values in the Euclidean space R^{d_Y}, d_Y ∈ N* (so it is now represented by an observation vector).
We assume that the sequence of random variables X_0, X_1, X_2, ... is a Markov chain — i.e., the conditional distribution of X_t, t ∈ N*, given X_0, ..., X_{t−1}, depends only upon X_{t−1}. Mathematically this means that, for each A ∈ B(R^{d_X}),

P[X_t ∈ A | X_0 = x_0, ..., X_{t−1} = x_{t−1}] = P[X_t ∈ A | X_{t−1} = x_{t−1}] =: τ_t(A | x_{t−1}),

where τ_t : B(R^{d_X}) × R^{d_X} → [0, 1] is referred to as the Markov transition kernel. The state process (X_t)_{t∈T} is assumed to be not directly observable, hence the term 'latent Markov model'.
We regard the sequence of random variables Y_1, Y_2, ... as observations ordered in time — such a sequence is usually referred to as a time series. Each random variable Y_t will be assumed to be conditionally independent of the other observations given the state X_t, i.e., for t ∈ N*,

P[Y_t ∈ A | X_0 = x_0, ..., X_t = x_t, (Y_s = y_s)_{s∈N*, s≠t}] = P[Y_t ∈ A | X_t = x_t]

for each A ∈ B(R^{d_Y}).
For all t ∈ N*, A ∈ B(R^{d_Y}), the probability measure γ_t(A | x_t) := P[Y_t ∈ A | X_t = x_t] is assumed to have a positive density w.r.t. the Lebesgue measure, denoted p_{γ_t}(y | x) and called the observation density, hence γ_t(A | x_t) = ∫_A p_{γ_t}(y | x_t) dy. While X_t is indeed latent, it is related to the observation Y_t via p_{γ_t}.
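As a concrete toy instance of this setup (entirely hypothetical, chosen for simplicity), consider a scalar linear-Gaussian model; its transition kernel can be sampled from, and its observation density evaluated, in closed form:

```python
import numpy as np

# A toy scalar linear-Gaussian state-space model:
#   X_t = a X_{t-1} + sigma_x * N(0,1)   -- Markov transition kernel tau_t
#   Y_t = X_t + sigma_y * N(0,1)         -- observation density p_gamma_t
a, sigma_x, sigma_y = 0.9, 1.0, 0.5

def sample_transition(rng, x_prev):
    """Draw X_t ~ tau_t(. | x_prev)."""
    return a * x_prev + sigma_x * rng.standard_normal()

def observation_density(y, x):
    """p_gamma_t(y | x): Gaussian density of Y_t given X_t = x."""
    return np.exp(-0.5 * ((y - x) / sigma_y) ** 2) / (sigma_y * np.sqrt(2.0 * np.pi))

rng = np.random.default_rng(3)
x = sample_transition(rng, 0.0)
print(observation_density(x, x))   # density at the mode, about 0.798
```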
2.3 Particle filtering methods
Particle filtering methods rely on the numerical approximation of π_t with a set of 'particles'. To apply these methods we do not require the Markov transition kernel to have a probability density (the Markov transition density), i.e. a p_{τ_t} such that τ_t(A | x_{t−1}) = ∫_A p_{τ_t}(x | x_{t−1}) dx for all t ∈ N*, A ∈ B(R^{d_X}). However, we do need to be able to sample from the Markov transition kernel.
Algorithm 2.1 Particle filter: sequential importance resampling (SIR)

1. Initialisation step: At time t = 0, draw M i.i.d. samples (called particles) from the initial distribution τ_0 = π_0. Also, initialise M normalised (to 1) weights to an identical value of 1/M. For i = 1, 2, ..., M, the samples will be denoted x̂_{0|0}^{(i)} and the normalised weights λ_0^{(i)}.

2. Recursive step: At time t ∈ N*, let (x̂_{t−1|t−1}^{(i)})_{i=1,...,M} be the particles generated at time t − 1.

(a) Importance sampling:

i. For i = 1, ..., M, sample x̂_{t|t−1}^{(i)} from the Markov transition kernel τ_t(· | x̂_{t−1|t−1}^{(i)}).

ii. For i = 1, ..., M, use the observation density to compute the non-normalised weights

ω_t^{(i)} := λ_{t−1}^{(i)} · p_{γ_t}(y_t | x̂_{t|t−1}^{(i)})  (2.1)

and the values of the normalised weights before resampling ('br'), λ_t^{br(i)} := ω_t^{(i)} / ∑_{k=1}^M ω_t^{(k)}.

(b) Resampling (or selection): For i = 1, ..., M, use an appropriate resampling algorithm (such as Algorithm 2.2) to sample x̂_{t|t}^{(i)} from the mixture

∑_{k=1}^M λ_t^{br(k)} δ(x_t − x̂_{t|t−1}^{(k)}),  (2.2)

where δ(·) denotes the Dirac delta generalised function, and set the normalised weights after resampling, λ_t^{(i)}, appropriately (for most common resampling algorithms this means λ_t^{(i)} := 1/M).
Here we have given the most common version of the particle filter, known as the sequential importance resampling (SIR) algorithm [GSS93, Rub87]. There are many variations on Algorithm 2.1. For example, in some such variations the number of particles M may vary with time, so instead of M we would have an appropriately defined M_t [LXhSqWt10]. In this particular formulation of the filter, all the weights after resampling are set to 1/M, so we could write (2.1) as ω_t^{(i)} := p_{γ_t}(y_t | x̂_{t|t−1}^{(i)}), i = 1, ..., M, and omit the initialisation of the normalised weights at time 0, λ_0^{(i)}, since they wouldn't be used. This isn't the case in all variants of the particle filter. A simpler version of the filter would omit the resampling step altogether, instead, for i = 1, ..., M, setting λ_t^{(i)} := λ_t^{br(i)} and x̂_{t|t}^{(i)} := x̂_{t|t−1}^{(i)}. This version of the algorithm, known as sequential importance sampling (SIS) [May66, HM69], suffers from the degeneracy problem [LC98], [CMR05, Chapter 7]: in some situations, only a few particles end up with significant weight, and all the other particles have near-zero weights. The resampling step is designed to remedy this. The particular scheme that we give here as Algorithm 2.2 is known as multinomial resampling [Rub87, GSS93, ET94] and the filter that uses it is sometimes referred to as the (weighted) bootstrap filter. The naïve implementation of the multinomial resampling step has time complexity O(M ln M); an implementation based on [CCF99] has time complexity O(M). One alternative, which introduces no computational overhead, is to use stratified sampling [Kit96, LC98, CCF99]; it won't be discussed here.
Algorithm 2.2 Multinomial resampling

Notice that we are working with the normalised weights computed before resampling, $\lambda^{br\,(1)}_t, \lambda^{br\,(2)}_t, \ldots, \lambda^{br\,(M)}_t$.
1. For $i = 1, 2, \ldots, M$, compute the cumulative sums $\Lambda^{br\,(i)}_t = \sum_{k=1}^{i} \lambda^{br\,(k)}_t$, so that, by construction, $\Lambda^{br\,(M)}_t = 1$.
2. Generate $M$ random samples from $\mathcal{U}(0, 1)$: $u_1, u_2, \ldots, u_M$.
3. For each $i = 1, \ldots, M$, choose the particle $\hat{x}^{(i)}_{t\mid t} := \hat{x}^{(j)}_{t\mid t-1}$, where $j \in \{1, 2, \ldots, M\}$ is such that $u_i \in \bigl(\Lambda^{br\,(j-1)}_t, \Lambda^{br\,(j)}_t\bigr]$, with the convention $\Lambda^{br\,(0)}_t := 0$.
Thus we end up with $M$ new particles (children), $\hat{x}^{(1)}_{t\mid t}, \ldots, \hat{x}^{(M)}_{t\mid t}$, sampled from the existing set $\hat{x}^{(1)}_{t\mid t-1}, \ldots, \hat{x}^{(M)}_{t\mid t-1}$, so that some of the existing particles may disappear, while others may appear multiple times. For each $i = 1, \ldots, M$, the number of times $\hat{x}^{(i)}_{t\mid t-1}$ appears in the resampled set of particles is known as that particle's replication factor, $N^{(i)}_t$. Set the normalised weights after resampling: $\lambda^{(i)}_t := \frac{1}{M}$. We could view this algorithm as the sampling of the replication factors $N^{(1)}_t, \ldots, N^{(M)}_t$ from the multinomial distribution with probabilities $\lambda^{br\,(1)}_t, \ldots, \lambda^{br\,(M)}_t$, respectively; hence the name of the method.
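As an illustration, the steps of Algorithm 2.2 can be sketched in NumPy as follows. The function and variable names are our own; `np.searchsorted` replaces the explicit interval search of step 3, so this is the naïve $\mathcal{O}(M \ln M)$ version rather than the $\mathcal{O}(M)$ variant of [CCF99].

```python
import numpy as np

def multinomial_resample(particles, weights, rng=None):
    """Multinomial resampling sketch (Algorithm 2.2).

    particles : (M,) array of particles before resampling
    weights   : (M,) normalised weights before resampling, summing to 1
    Returns the resampled particles and the uniform weights 1/M.
    """
    rng = np.random.default_rng() if rng is None else rng
    M = len(particles)
    # Step 1: cumulative sums; by construction the last entry is 1.
    cum = np.cumsum(weights)
    cum[-1] = 1.0  # guard against floating-point drift
    # Step 2: draw M uniform samples on (0, 1).
    u = rng.uniform(size=M)
    # Step 3: map each uniform through the inverse empirical CDF.
    j = np.searchsorted(cum, u, side="left")
    resampled = particles[j]
    new_weights = np.full(M, 1.0 / M)
    return resampled, new_weights

# Toy usage: the heavily weighted particle should dominate the children.
parts = np.array([0.0, 1.0, 2.0, 3.0])
w = np.array([0.01, 0.01, 0.97, 0.01])
children, lam = multinomial_resample(parts, w, rng=np.random.default_rng(42))
```

Counting how often each index appears in `j` recovers the replication factors $N^{(i)}_t$ discussed above.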
For each $t \in \mathbb{N}^*$, this algorithm provides the prior and posterior, respectively, approximations to $\pi_t$:
\[
\pi_{t\mid t-1} := \sum_{i=1}^{M} \lambda^{(i)}_{t-1} \,\delta\bigl(x_t - \hat{x}^{(i)}_{t\mid t-1}\bigr), \qquad
\pi_{t\mid t} := \sum_{i=1}^{M} \lambda^{(i)}_t \,\delta\bigl(x_t - \hat{x}^{(i)}_{t\mid t}\bigr)
\]
(when using the resampling scheme given here, the resampled weights are all set to $\frac{1}{M}$, so this constant can be factored out of the sum). The prior state mean and covariance estimates in the particle filter are given by, respectively,
\[
\hat{x}_{t\mid t-1} := \sum_{i=1}^{M} \lambda^{(i)}_{t-1} \hat{x}^{(i)}_{t\mid t-1}, \qquad
P_{t\mid t-1} := \sum_{i=1}^{M} \lambda^{(i)}_{t-1} \bigl(\hat{x}^{(i)}_{t\mid t-1} - \hat{x}_{t\mid t-1}\bigr)\bigl(\hat{x}^{(i)}_{t\mid t-1} - \hat{x}_{t\mid t-1}\bigr)^\top.
\]
Similarly, one obtains the posterior state mean and covariance before resampling, $\hat{x}^{br}_{t\mid t}$ and $P^{br}_{t\mid t}$ (using the weights $\lambda^{br\,(i)}_t$ and particles $\hat{x}^{(i)}_{t\mid t-1}$), and after resampling, $\hat{x}_{t\mid t}$ and $P_{t\mid t}$ (using the weights $\lambda^{(i)}_t$ and particles $\hat{x}^{(i)}_{t\mid t}$).
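These weighted moment estimates can be computed in a single pass over the particle cloud. A minimal sketch (our own helper, not part of any library), assuming the particles are stacked as an $(M, n)$ array:

```python
import numpy as np

def particle_mean_cov(particles, weights):
    """Weighted mean and covariance of a particle cloud.

    particles : (M, n) array, one n-dimensional particle per row
    weights   : (M,) normalised weights summing to 1
    Returns (mean, cov) matching the estimates in the text.
    """
    mean = weights @ particles              # sum_i lambda^(i) x^(i)
    dev = particles - mean                  # deviations from the mean
    cov = (weights[:, None] * dev).T @ dev  # sum_i lambda^(i) dev_i dev_i^T
    return mean, cov

# Usage: uniform weights reduce to the ordinary (biased) sample moments.
pts = np.array([[0.0], [1.0], [2.0], [3.0]])
w = np.full(4, 0.25)
m, P = particle_mean_cov(pts, w)
# m -> [1.5], P -> [[1.25]]
```

The same function serves for the prior, the posterior before resampling, and the posterior after resampling; only the weights and particles passed in change.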
2.4 Applying the particle filter to the stochastic volatility model with leverage and jumps
Michael K. Pitt, Sheheryar Malik, and Arnaud Doucet apply the particle filter to the SVLJ model of Section 1.6 [MP09, MP11a, MP11b, PMD14]. From the filtering perspective, the latent, unobserved log-variance is the state, and the log-return, which is observable on the markets, the observation. Using the fact that the disturbances in (1.8) and (1.9) are conditionally Gaussian, they write
\[
\eta_t = \rho e_t + \sqrt{1 - \rho^2}\, \xi_t, \tag{2.3}
\]
where $\xi_t \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0, 1)$, turning (1.9) into
\[
x_t = \mu(1 - \phi) + \phi x_{t-1} + \sigma_v \rho e_{t-1} + \sigma_v \sqrt{1 - \rho^2}\, \xi_{t-1}. \tag{2.4}
\]
In the absence of jumps, $e_t = y_t e^{-x_t/2}$, and (2.4) can be further rewritten as
\[
x_t = \mu(1 - \phi) + \phi x_{t-1} + \sigma_v \rho y_{t-1} e^{-x_{t-1}/2} + \sigma_v \sqrt{1 - \rho^2}\, \xi_{t-1}.
\]
Even in the absence of jumps, $p(x_t \mid x_{t-1}, y_{t-1})$ is highly nonlinear, so particle methods are prime candidates for devising a filtering scheme.
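In a particle filter, this transition is applied to every particle in parallel. A sketch of the jump-free case follows; the function is our own illustration, and the parameter values at the bottom are purely illustrative, not estimates.

```python
import numpy as np

def propagate_state(x_prev, y_prev, mu, phi, sigma_v, rho, rng):
    """One step of the jump-free SVL transition: simulate x_t given the
    particle(s) x_{t-1} and the observed return y_{t-1}, using
    e_{t-1} = y_{t-1} * exp(-x_{t-1} / 2).
    Parameter names (mu, phi, sigma_v, rho) follow the text."""
    e_prev = y_prev * np.exp(-x_prev / 2.0)
    xi = rng.standard_normal(size=np.shape(x_prev))
    return (mu * (1.0 - phi) + phi * x_prev
            + sigma_v * rho * e_prev
            + sigma_v * np.sqrt(1.0 - rho**2) * xi)

# Propagating a whole particle cloud at once (illustrative parameters):
rng = np.random.default_rng(0)
particles = rng.normal(-1.0, 0.1, size=1000)
new_particles = propagate_state(particles, y_prev=0.01,
                                mu=-1.0, phi=0.95, sigma_v=0.2, rho=-0.7,
                                rng=rng)
```

Note that the conditioning on the observed $y_{t-1}$ enters only through the term $\sigma_v \rho y_{t-1} e^{-x_{t-1}/2}$, which differs from particle to particle via $x_{t-1}$.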
Notice that
\[
p(e_t \mid x_t, y_t) = \delta\bigl(e_t - y_t e^{-x_t/2}\bigr)\, \mathbb{P}\left[J_t = 0 \mid x_t, y_t\right] + \varphi\bigl(e_t;\, \mu_{e_t \mid J_t = 1},\, \sigma^2_{e \mid J_t = 1}\bigr)\, \mathbb{P}\left[J_t = 1 \mid x_t, y_t\right].
\]
Pitt et al. design a procedure for simulating $e_t$ from the mixture density $p(e_t \mid x_t, y_t)$, which is then used in (2.4) to propagate the state. It is described in detail in Appendix B of [MP09]; we have implemented it in BayesTSA. In the same appendix they derive the values of the conditional moments $\mu_{e_t \mid J_t = 1}$ and $\sigma^2_{e \mid J_t = 1}$.
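Schematically, the mixture draw amounts to the following. Here the jump probability and the conditional moments are taken as given inputs standing for $\mathbb{P}[J_t = 1 \mid x_t, y_t]$, $\mu_{e_t \mid J_t = 1}$ and $\sigma^2_{e \mid J_t = 1}$, whose derivation is in Appendix B of [MP09]; this sketch is not the BayesTSA implementation.

```python
import numpy as np

def sample_e(x_t, y_t, p_jump, mu_e, sigma_e, rng):
    """Draw e_t from the two-component mixture p(e_t | x_t, y_t):
    with probability 1 - p_jump (no jump), e_t is the point mass
    y_t * exp(-x_t / 2); with probability p_jump it is N(mu_e, sigma_e^2).
    p_jump, mu_e, sigma_e are hypothetical inputs here, not derived."""
    if rng.uniform() < p_jump:
        return mu_e + sigma_e * rng.standard_normal()
    return y_t * np.exp(-x_t / 2.0)
```

When `p_jump` is zero, the draw collapses to the deterministic no-jump value $y_t e^{-x_t/2}$, consistent with the Dirac component of the mixture above.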