Stat Comput, DOI 10.1007/s11222-017-9748-4

Nonparametric estimation for compound Poisson process via variational analysis on measures

Alexey Lindo¹ · Sergei Zuyev² · Serik Sagitov²

¹ School of Mathematics and Statistics, University of Glasgow, Glasgow, UK
² Department of Mathematical Sciences, Chalmers University of Technology and University of Gothenburg, Gothenburg, Sweden

Received: 2 February 2016 / Accepted: 17 April 2017
© The Author(s) 2017. This article is an open access publication

Abstract  The paper develops new methods of nonparametric estimation of a compound Poisson process. Our key estimator for the compounding (jump) measure is based on series decomposition of functionals of a measure and relies on the steepest descent technique. Our simulation studies for various examples of such measures demonstrate the flexibility of our methods. They are particularly suited for discrete jump distributions, not necessarily concentrated on a grid nor on the positive or negative semi-axis. Our estimators are also applicable to continuous jump distributions with an additional smoothing step.

Keywords  Compound Poisson distribution · Decompounding · Measure optimisation · Gradient methods · Steepest descent algorithms

Mathematics Subject Classification  Primary: 62G05; Secondary: 62M05 · 65C60

1 Introduction

The paper develops new methods of nonparametric estimation of the distribution of compound Poisson data. A compound Poisson process (W_t)_{t≥0} is a Markov jump process with W_0 = 0 characterised by a finite compounding measure Λ defined on the real line R = (−∞, +∞) such that

    Λ({0}) = 0,   ‖Λ‖ := Λ(R) ∈ (0, ∞).   (1)

The jumps of this process occur at the constant rate ‖Λ‖, and the jump sizes are independent random variables with the common distribution Λ(dx)/‖Λ‖. In a more general context, the compound Poisson process is a particular case of a Lévy process with Λ being the corresponding integrable Lévy measure. Inference problems for such processes naturally arise in financial mathematics (Cont and Tankov 2003), queueing theory (Asmussen 2008), insurance (Mikosch 2009) and in many other situations modelled by compound Poisson and Lévy processes.
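The dynamics just described are straightforward to simulate. The following sketch is our own illustration, not code from the paper; the function and variable names are hypothetical. It draws increments of a compound Poisson process with a discrete jump measure Λ, using the facts stated above: jumps arrive at rate ‖Λ‖, and jump sizes are drawn from Λ/‖Λ‖.

```python
import numpy as np

def sample_increments(atoms, masses, h, n, rng):
    """Draw n increments X_i = W_{ih} - W_{(i-1)h} of a compound Poisson
    process with jump measure Lambda = sum_j masses[j] * delta_{atoms[j]}."""
    total = float(np.sum(masses))            # ||Lambda||, the jump rate
    probs = np.asarray(masses) / total       # jump-size distribution Lambda/||Lambda||
    counts = rng.poisson(total * h, size=n)  # Poisson number of jumps per interval
    jumps = rng.choice(atoms, size=counts.sum(), p=probs)
    owner = np.repeat(np.arange(n), counts)  # which interval each jump belongs to
    return np.bincount(owner, weights=jumps, minlength=n)

rng = np.random.default_rng(0)
# Lambda = 0.61 * delta_1: unit jumps at rate 0.61, so each X_i ~ Poisson(0.61)
x = sample_increments([1.0], [0.61], h=1.0, n=200_000, rng=rng)
print(x.mean())  # should be close to h * integral of x Lambda(dx) = 0.61
```

The same routine works for any finite discrete Λ, including measures with atoms on both semi-axes, by passing several atoms and masses.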
Suppose the compound Poisson process is observed at regularly spaced times (W_h, W_{2h}, …, W_{nh}) for some time step h > 0. The consecutive increments X_i = W_{ih} − W_{(i−1)h} then form a vector (X_1, …, X_n) of independent random variables having a common compound Poisson distribution with the characteristic function

    ϕ(θ) = E e^{iθ W_h} = e^{hψ(θ)},   ψ(θ) = ∫ (e^{iθx} − 1) Λ(dx).   (2)

Here and below the integrals are taken over the whole R unless specified otherwise. Estimation of the measure Λ in terms of a sample (X_1, …, X_n) is usually called decompounding, which is the main object of study in this paper.

We propose a combination of two nonparametric methods which we call characteristic function fitting (ChF) and convolution fitting (CoF). ChF may deal with a more general class of Lévy processes, while CoF explicitly targets compound Poisson processes.

The ChF estimator for the jump measure Λ is obtained by minimisation of the loss functional

    L_ChF(Λ) = ∫ |e^{hψ(θ)} − ϕ̂_n(θ)|² ω(θ) dθ,   (3)

where ψ(θ) ≡ ψ(θ, Λ) is given by (2),

    ϕ̂_n(θ) = (1/n) Σ_{k=1}^{n} e^{iθ X_k}

is the empirical characteristic function, and ω(θ) is a weight function. It was shown in Neumann and Reiss (2009), in a more general Lévy process setting, that minimising (3) leads to a consistent estimator of the Lévy triplet. Typically, ω(θ) is a positive constant for θ ∈ [θ_1, θ_2] and zero otherwise, but it can also be chosen to grow as θ → 0; this would boost the agreement of the moments of a fitted jump distribution with the empirical moments.

[Fig. 1: Illustration of intrinsic difficulties faced by any characteristic function fitting procedure. Plotted is the integrated squared modulus of the difference between two characteristic functions with measures Λ = δ_1 and Λ = λδ_x, x ∈ [−5, 5], λ ∈ (0, 5). Clearly, any algorithm based on closeness of characteristic functions, like (3), would have difficulties converging to the global minimum attained at the point x = 1, λ = 1 even in this simple two-parameter model.]

We compute explicitly the derivative of the loss functional (3) with respect to the measure Λ, formula (18) in "Appendix", and perform the steepest descent directly on the cone of non-negative measures to a local minimiser, further developing the approach by Molchanov and Zuyev (2002). It must be noted that, as a simple example reveals, functionals based on the empirical characteristic function usually have a very irregular structure, see Fig. 1. As a result, the steepest descent often fails to attain the globally optimal solution unless the starting point of the optimisation procedure is carefully chosen.

The CoF estimation method uses the fact that the convolution of F(x) = P(W_h ≤ x),

    F^{*2}(x) = ∫ F(x − y) dF(y),

as a functional of Λ has an explicit form of an infinite Taylor series involving direct products of measures Λ, see Theorem 2 in Sect. 4. After truncating it to only the first k terms, we build a loss function L_CoF^{(k)} by comparing two estimates of F^{*2}: the one based on the truncated series, and the other being the empirical convolution F_n^{*2}. CoF is able to produce nearly optimal estimates Λ̂_k when large values of k are taken, but at the expense of a drastically increased computation time.

A practical combination of these methods recommended by this paper is to find Λ̂_k using CoF with a low value of k and then apply ChF with Λ̂_k as the starting value. The estimate for such a two-step procedure will be denoted by Λ̃_k in the sequel.

To give an early impression of our approach, let us demonstrate the performance of our methods on the famous data by Ladislaus Bortkiewicz, who collected the numbers of Prussian soldiers killed by a horse kick in 10 cavalry corps over a 20-year period (Bortkiewicz 1898). The counts 0, 1, 2, 3 and 4 were observed 109, 65, 22, 3 and 1 times, with 0.61 deaths per year per cavalry unit. The author argues that the data are Poisson distributed, which corresponds to the measure Λ = λδ_1 concentrated on the point {1} (only jumps of size 1), with the mass λ being the parameter of the Poisson distribution, estimated by the sample mean to be 0.61.

Figure 2 on its top panel presents the estimated Lévy measures for the cut-off values k = 1, 2, 3 when using the CoF method. For the values k = 1, 2, the result is a measure having many atoms. This is explained by the fact that the accuracy of the convolution approximation is not sufficient for these data, but k = 3 already results in a measure Λ̂_3 essentially concentrated at {1}, thus supporting the Poisson model with parameter ‖Λ̂_3‖ = 0.6098. In Sect. 4, we return to this example and explain why the choice of k = 3 is reasonable here.

Because of the possibly very irregular behaviour of the score function L_ChF demonstrated above, we observed in practice that the convergence of the ChF method depends critically on the choice of the initial measure, especially on its total mass. However, the proposed combination of CoF followed by ChF demonstrates (the bottom plot) that this two-step (faster) procedure […]

[Fig. 2 legend, top panel: Λ̂_1, ‖Λ̂_1‖ = 0.4589; Λ̂_2, ‖Λ̂_2‖ = 0.5881; Λ̂_3, ‖Λ̂_3‖ = 0.6099.]

[…] for estimating the unknown density of the measure Λ. In contrast, we do not distinguish between discrete and continuous Λ in that our algorithms, based on direct optimisation of functionals of a measure, work for both situations on a discretised phase space of Λ.
[Fig. 2 legend, bottom panel: Λ̂_3, ‖Λ̂_3‖ = 0.6099; Λ̃_1, ‖Λ̃_1‖ = 0.6109.]

However, if one sees many small atoms appearing in the solution, which fill a thin grid, this may indicate that the true measure is absolutely continuous, and some kind of smoothing should yield its density. In this paper, we do not address estimation of more general Lévy processes allowing for Λ(−1, 1) = ∞. In the Lévy process setting, the most straightforward approach for estimating the distribution F(x) = P(W_h ≤ x) is moments fitting, see Feuerverger and McDunnough (1981b) and Carrasco and Florens (2000). Estimates of Λ can be obtained by maximising the likelihood ratio (see, e.g., Qin and Lawless 1994) or by minimising some measure of proximity between F and the empirical distribution function

    F̂_n(x) = (1/n) Σ_{k=1}^{n} 1{X_k ≤ x},

where the dependence on Λ comes through F via the inversion formula of the characteristic function:

    F(x) − F(x − 0) = lim_{y→∞} (1/(2y)) ∫_{−y}^{y} exp{hψ(θ) − iθx} dθ.

For the estimation, the characteristic function in the integral above is replaced by the empirical characteristic function. Parametric inference procedures based on the empirical characteristic function have been known for some time, see Feuerverger and McDunnough (1981a) and Sueishi and Nishiyama (2005), and the references therein. Algorithms based on the inversion of the empirical characteristic function and on relations between its derivatives were proposed in Watteel and Kulperger (2003). Note that the inversion of the empirical characteristic function, in contrast to the inversion of its theoretical counterpart, generally leads to a complex-valued measure which needs to be dealt with.
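To make the inversion step concrete, the sketch below (our own illustration, hypothetical names, not the paper's code) recovers the mass of an atom of F by plugging the empirical characteristic function ϕ̂_n into the atom-mass inversion with the 1/(2y) normalisation and integrating numerically over a finite window [−y, y]. For Λ = 0.61 δ_1 and h = 1, the true mass at 0 is P(W_h = 0) = e^{−0.61} ≈ 0.543.

```python
import numpy as np

def atom_mass(sample, x0, y=100.0, m=2001):
    """Estimate F(x0) - F(x0 - 0) = P(W_h = x0) by numerically inverting
    the empirical characteristic function phi_n over theta in [-y, y]."""
    theta = np.linspace(-y, y, m)
    dtheta = theta[1] - theta[0]
    # empirical characteristic function on the grid (m x n broadcasting)
    phi_n = np.exp(1j * theta[:, None] * sample[None, :]).mean(axis=1)
    integrand = np.exp(-1j * theta * x0) * phi_n
    # Riemann sum approximation of (1/2y) * int_{-y}^{y} e^{-i theta x0} phi_n d theta
    return float((integrand.sum() * dtheta).real / (2 * y))

rng = np.random.default_rng(2)
w = rng.poisson(0.61, size=2000).astype(float)  # W_h under Lambda = 0.61*delta_1
print(atom_mass(w, 0.0))  # roughly exp(-0.61) up to sampling and truncation error
```

As the text notes, the resulting "measure" is only approximately real and non-negative; taking the real part, as above, is one simple way of dealing with the complex-valued output.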