A University of Sussex Phd Thesis Available Online Via Sussex
Total Page:16
File Type:pdf, Size:1020Kb
A University of Sussex PhD thesis Available online via Sussex Research Online: http://sro.sussex.ac.uk/ This thesis is protected by copyright which belongs to the author. This thesis cannot be reproduced or quoted extensively from without first obtaining permission in writing from the Author The content must not be changed in any way or sold commercially in any format or medium without the formal permission of the Author When referring to this work, full bibliographic details including the author, title, awarding institution and date of the thesis must be given Please visit Sussex Research Online for more information and further details NON-STATIONARY PROCESSES AND THEIR APPLICATION TO FINANCIAL HIGH-FREQUENCY DATA Mailan Trinh A thesis submitted for the degree of Doctor of Philosophy University of Sussex March 2018 UNIVERSITY OF SUSSEX MAILAN TRINH A THESIS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY NON-STATIONARY PROCESSES AND THEIR APPLICATION TO FINANCIAL HIGH-FREQUENCY DATA SUMMARY The thesis is devoted to non-stationary point process models as generalizations of the standard homogeneous Poisson process. The work can be divided in two parts. In the first part, we introduce a fractional non-homogeneous Poisson process (FNPP) by applying a random time change to the standard Poisson process. We character- ize the FNPP by deriving its non-local governing equation. We further compute moments and covariance of the process and discuss the distribution of the arrival times. Moreover, we give both finite-dimensional and functional limit theorems for the FNPP and the corresponding fractional non-homogeneous compound Poisson process. The limit theorems are derived by using martingale methods, regular vari- ation properties and Anscombe's theorem. Eventually, some of the limit results are verified via a Monte-Carlo simulation. In the second part, we analyze statistical point process models for durations between trades recorded in financial high-frequency trading data. We consider parameter set- tings for models which are non-stationary or very close to non-stationarity which is quite typical for estimated parameter sets of models fitted to financial data. Simu- lation, parameter estimation and in particular model selection are discussed for the following three models: a non-homogeneous normal compound Poisson process, the exponential autoregressive conditional duration model (ACD) and a Hawkes process model. In a Monte-Carlo simulation, we test the performance of the following in- formation criteria for model selection: Akaike's information criterion, the Bayesian information criterion and the Hannan-Quinn information criterion. We are partic- ularly interested in the relation between the rate of correct model selection and the underlying sample size. Our numerical results show that the model selection for the compound Poisson type model works best for small parameter numbers. Moreover, the results for Hawkes processes confirm the theoretical asymptotic distributions of model selection whereas for the ACD model the model selection exhibits adverse behavior in certain cases. Declaration of Authorship I hereby declare that this thesis has not been, and will not be, submitted in whole or in part to another University for the award of any other degree. Mailan Trinh 28th March 2018 To my parents 5 Contents Introduction 8 1 Point processes 15 1.1 Point process theory and martingales . 15 1.2 The homogeneous Poisson process . 17 1.3 The inhomogeneous Poisson process . 19 1.4 Simulation . 22 1.4.1 Simulation methods for the homogeneous Poisson process . 22 1.4.2 Thinning algorithms . 23 1.5 Summary . 25 2 The fractional Poisson process 26 2.1 Preliminaries . 26 2.1.1 The Mittag-Leffler function . 26 2.1.2 Stable distributions . 28 2.2 The stable subordinator and its inverse . 30 2.3 Definition of the fractional Poisson process . 32 2.4 Governing equations . 33 2.4.1 Fractional differential operators . 33 2.4.2 The homogeneous case . 34 2.4.3 The non-homogeneous case . 36 2.5 Moments and covariance structure . 39 2.5.1 Moments . 39 2.5.2 Covariance . 40 2.6 Summary . 44 3 Limit theorems 45 3.1 Preliminaries: Convergence in the Skorokhod space . 45 3.1.1 Weak convergence of probability measures and Riesz repres- entation theorem . 45 3.1.2 Prokhorov's theorem . 51 6 3.1.3 Compactness in C ......................... 53 3.1.4 A Skorokhod topology: J1 and compactness on D ....... 56 3.1.5 Continuity of functions on D×D: M1 topology and continuous mapping approach . 60 3.2 A martingale approach to limit theorems for the fractional Poisson process . 63 3.2.1 The fractional Poisson process as a Cox process . 64 3.2.2 The FNPP and its compensator . 67 3.3 Regular variation and scaling limits . 69 3.3.1 A one-dimensional limit theorem . 70 3.3.2 A functional limit theorem . 73 3.4 The fractional compound Poisson process . 76 3.4.1 A one-dimensional limit result . 77 3.4.2 A functional limit theorem . 80 3.5 Some numerical examples . 80 3.6 Summary . 81 4 Information criteria and model selection 84 4.1 The model selection problem . 84 4.2 Information criteria . 85 4.2.1 Akaike's information criterion . 86 4.2.2 The Bayesian information criterion . 92 4.2.3 Model weighting and model averaging . 94 4.2.4 The consistency property . 96 4.3 Summary . 98 5 Models for durations between trades and model selection 99 5.1 The Monte-Carlo setup . 99 5.2 A normal compound Poisson model . 100 5.2.1 Definition . 100 5.2.2 Simulation . 102 5.2.3 Fitting . 103 5.2.4 Numerical results . 104 5.3 The autoregressive conditional duration (ACD) model . 110 5.3.1 Simulation and fitting . 112 5.3.2 Numerical results . 112 5.4 Hawkes processes . 114 5.4.1 Definition and some properties . 115 5.4.2 Simulation . 122 5.4.3 Fitting . 123 7 5.4.4 Numerical results . 125 5.5 Summary . 129 Conclusion 130 Bibliography 131 A Regular variation and Tauberian theorems 142 B Tables 145 C Code manual 154 C.1 Compound Poisson type models . 154 C.2 Sample code for compound Poisson type models . 164 C.3 Using the ACDm package . 166 C.4 Hawkes processes . 167 D Source code 172 D.1 Poisson process . 172 D.2 Dλ-model . 173 D.3 Pλ-model . 177 D.4 Hawkes processes . 181 List of Figures Index 8 Introduction Motivation Time series analysis as presented in standard textbooks like Brockwell and Davis 1991 or Hamilton 1994 assumes integer-indexed time series of the form x1; x2; : : : ; xn. This assumption is most suitable for data and measurements that can be recorded at specific equidistant times or are already aggregated. For example, this is the case for daily, monthly or yearly stock market data. A useful assumption for time series models is the concept of stationarity 1: A time series is stationary if the autocorrelation function only depends on the lag, i.e. the time difference h := t − s between two data points xs and xt, where s < t. Sta- tionarity allows a form of dependence between data points that still ensures consist- ency and asymptotic results for parameter estimates of time series models such as ARMA (autoregressive moving average) and GARCH (generalized autoregressive conditional heteroskedasticity). An initially non-stationary time series can sometimes be transformed into a station- ary one. This is usually done by detecting and removing deterministic trends and seasonality as well as differencing (see Section 1.4 in Brockwell and Davis 1991). These two assumptions of regularly spaced and stationary data are called into ques- tion when moving to high-frequency level of financial data. As a consequence of technological advancement, it is possible to record all transaction of a trading day or as Engle 2000 termed it: financial data are increasingly available at \ultra-high- frequency". This kind of intra-day or tick-by-tick data are inherently irregularly spaced. One could aggregate the data to fit into the framework of integer-indexed time series, but this can be problematic as pointed out in Engle and Russell 1998: The choice of the time grid for aggregation is somewhat arbitrary and distorts the results of a subsequent statistical analysis. If time intervals are too small, some in- tervals are empty or just contain a single observation. If the intervals are too large, information on the time structure might get lost. A way to accommodate irregularly 1At this point, we refer to stationarity as second-order or weak stationarity as opposed to strict stationarity, where the finite dimensional marginals of the process do not depend on the lag. For an exact definition see Definition 1.3.2 and Definition 1.3.3 in Brockwell and Davis 1991. 9 spaced data is continuous-time point process models. Engle and Russell 1998 have proposed the ACD (autoregressive conditional duration) model which will be dis- cussed further in Section 5.3 of this thesis. As direct generalizations of the standard time series models, there are approaches in constructing continuous time analogues such as the CARMA (Brockwell 2001, 2004, 2014 and Section 11.5 in Brockwell and Davis 2016) and COGARCH (Kl¨uppelberg, Lindner and Maller 2004, Brockwell, Chadraa and Lindner 2006) process. Slightly separate from the theory around time series models, doubly stochastic point processes are already established in actuarial risk theory, but their subclass self-exciting point processes has received attention in recent publications (see Section 5.4 in the thesis) and are viable alternatives to the ACD model. Concerning the stationarity property, it is debatable whether this theoretically con- venient property can be reconciled with stylized facts of empirical data.