An Autoregressive Conditional Poisson Model

Mo deling Time Series Count Data: An Autoregressive Conditional Poisson Mo del Andr eas Heinen Department of Economics University of California, San Diego 9500 Gilman Drive La Jolla CA 92037 [email protected] November 2000 Abstract This pap er intro duces new mo dels for time series count data. The Autoregressive Conditional Poisson mo del ACP makes it p ossible to deal with issues of discreteness, overdisp ersion variance greater than the mean and serial correlation. A fully parametric approachis taken and a marginal distribution for the counts is sp eci ed, where conditional on past observations the mean is autoregressive. This enables to attain improved inference on co ecients of exogenous regressors relative to static Pois- son regression, which is the main concern of the existing literature, while mo deling the serial correlation in a exible way. Avariety of mo dels, based on the double Poisson distribution of Efron 1986 is intro duced, which in a rst step intro duce an additional disp ersion parameter and in a second step make this disp ersion parameter time-varying. All mo dels are estimated using maximum likeliho o d which makes the usual tests available. In this framework auto correlation can b e tested with a straightforward likeliho o d ratio test, whose simplicity is in sharp contrast with test pro cedures in the latentvari- able time series count mo del of Zeger 1988. The mo dels are applied to the time series of monthly p olio cases in the U.S b etween 1970 and 1983 as well as to the daily number of price change durations of :75$ on the IBM sto ck. A :75$ price-change duration is de ned as the time it takes the sto ck price to moveby at least :75$. The variable of interest is the daily numb er of such durations, which is a measure of intradaily volatility, since the more volatile the sto ck price is within a day, the larger the counts will b e. The ACP mo dels provide go o d density forecasts of this measure of volatility. 1 1 Intro duction Many interesting empirical questions can b e addressed by mo deling a time series of count data. Examples can be found in a great variety of contexts. In the area of accident prevention, Johansson 1996 used time series counts to assess the e ect of lowered sp eed limits on the numb er of road casualties. In epidemiology, time series counts arise naturally in the study of the incidence of a disease. A prominent example is the time series of monthly cases of p olio in the U.S. which has b een studied extensively by Zeger 1988 and Brannas and Johansson 1994. In the area of nance, b esides the applications mentioned in Cameron and Trivedi 1996, counts arise in market microstructure as so on as one starts lo oking at tick-by-tick data. The price pro cess for a sto ck can b e viewed as a sum of discrete price changes. The daily numb er of these price changes constitutes a time series of counts whose prop erties are of interest. Most of these applications involve relatively rare events which makes the use of the normal distribution questionable. Thus, mo deling this typ e of series requires one to deal explicitly with the discreteness of the data as well as its time series prop erties. Neglect- ing either of these two characteristics would lead to p otentially serious missp eci cation. A typical issue with time series data is auto correlation and a common feature of count data is overdisp ersion the variance is larger than the mean. Both of these problems are addressed simultaneously by using an autoregressive conditional Poisson mo del ACP. In the simplest mo del counts haveaPoisson distribution and their mean, conditional on past observations, is autoregressive. Whereas, conditional on past observations, the mo del is equidisp ersed the variance is equal to the mean, it is unconditionally overdisp ersed. A fully parametric approach and cho ose to mo del the conditional distribution explicitly and make sp eci c assumptions ab out the nature of the auto correlation in the series. A similar mo deling strategy has b een explored indep endently by Rydb erg and Shephard 1998, Rydb erg and Shephard 1999b, and Rydb erg and Shephard 1999a. Two generalisations of their framework are intro duced in this pap er. The rst consists of replacing the Pois- son by the double Poisson distribution of Efron 1986, which allows for either under- or overdisp ersion in the marginal distribution. In this context, two common variance functions will b e explored. Finally, an extended version of the mo del is prop osed, which allows for separate mo dels of mean and of variance. It is shown that this mo del p erforms well in a nancial application and that it delivers very good density forecasts, which are tested by using the techniques prop osed by Dieb old and Tay 1997. The main advantages of this mo del are that it is exible, parsimomious and easy to estimate using maximum likeliho o d. Results are easy to interpret and standard hyp othesis test are available. In addition, given that the auto correlation and the density are mo deled explicitly, the mo del is well suited for b oth p oint and density forecasts, which can be of interest in many applications. Finally, due to its similarity with the autoregressive conditional heteroskedasticity ARCH mo del of Engle 1982, the present framework can be extended to most of the mo dels in the ARCH class; for a review see Bollerslev, Engle, and Nelson 1994. The pap er is organised as follows. A great numb er of mo dels of time series count data have b een prop osed, which will b e reviewed in section 2. In section 3 the basic autoregressive Poisson mo del is intro duced and some of its prop erties are discussed. Section 4 intro duces 2 twoversions of the Double Autoregressive Conditional Poisson DACP mo del, along with their prop erties. In section 5 the mo del is generalised to allow for time-varying variance, applications to the daily numb er of price changes on IBM and to the numb er of new p olio cases are presented is section 6; section 7 concludes. 2 A review of mo dels for count data in time series Many di erent approaches have b een prop osed to mo del time series count data. Good reviews can be found b oth in Cameron and Trivedi 1979, Chapter 7 and in MacDonald and Zucchini 1997, Chapter 1. Markov chains are one way of dealing with count data in time series. The metho d consists of de ning transition probabilities between all the p ossible values that the dep endent variable can take and determining, in the same way as in usual time series analysis, the appropriate order for the series. This metho d is only reasonable, though, when there are very few p ossible values that the observations can take. A prominent area of application for Markovchains is binary data. As so on as the number of values that the dep endentvariable takes gets to o large, these mo dels lose tractability. Discrete Autoregressive Moving Average DARMA mo dels are mo dels for time series count data with prop erties similar to those of ARMA pro cesses found in traditional time series analysis. They are probabilistic mixtures of discrete i.i.d. random variables with suitably chosen marginal distribution. One of the problems asso ciated with these mo d- els seems to be the diculty of estimating them. An application to the study of daily precipitation can be found in Chang and Delleur 1984. The daily level of precipitation is transformed into a discrete variable based on its magnitude. The metho d of moments is used to estimate the parameters of the mo del by tting the theoretical auto correlation function of each mo del to their sample counterpart. Estimation of the mo del seems quite cumb ersome and the mo del is only applied to a time series which can take at most three values. McKenzie 1985 surveys various mo dels based on "binomial thinning". In those mo d- els, the dep endentvariable y is assumed to b e equal to the sum of an error term with some t presp eci ed distribution and the result of y draws from a Bernoulli which takes value t1 1 with some probability and 0 otherwise. This guarantees that the dep endent variable takes only integer values. The parameter in that mo del is analogous to the co ecient on the lagged value in an AR1 mo del. This mo del, called INAR1, has the same au- to correlations as the AR1 mo del of traditional time series analysis, which makes it its discrete counterpart. This family of mo dels has b een generalised to include integer valued ARMA pro cesses a well as to incorp orate exogenous regressors. The problem with this typ e of mo dels is the diculty in estimating them. Many mo dels have b een prop osed and the emphasis was put more on their sto chastic prop erties than on how to estimate them. Hidden Markovchains, advo cated by MacDonald and Zucchini 1997 are an extension of the basic Markov chains mo dels, in which various regimes characterising the p ossible values of the mean are identi ed. It is then assumed that the transition from one to another of these regimes is governed by a Markov chain.

Load more