Efficient Evaluation of the Probability Density Function
Total Page:16
File Type:pdf, Size:1020Kb
Efficient Evaluation of the Probability Density Function of a Wrapped Normal Distribution Gerhard Kurz, Igor Gilitschenski, and Uwe D. Hanebeck Intelligent Sensor-Actuator-Systems Laboratory (ISAS) Institute for Anthropomatics and Robotics Karlsruhe Institute of Technology (KIT), Germany [email protected], [email protected], [email protected] Abstract—The wrapped normal distribution arises when the density of a one-dimensional normal distribution is wrapped around the circle infinitely many times. At first look, evaluation of its probability density function appears tedious as an infinite series is involved. In this paper, we investigate the evaluation of two truncated series representations. As one representation performs well for small uncertainties, whereas the other performs well for large uncertainties, we show that in all cases a small number of summands is sufficient to achieve high accuracy. I. INTRODUCTION The wrapped normal (WN) distribution is one of the most widely used distributions in circular statistics. It is obtained by wrapping the normal distribution around the unit circle and adding all probability mass wrapped to the same point (see Figure 1. The wrapped normal distribution is obtained by wrapping a normal Fig. 1). This is equivalent to defining a normally distributed distribution around the unit circle. random variable X and considering the wrapped random variable X mod 2π. approximated by just the first few terms of the infinite The WN distribution has been used in a variety of appli- series, depending on the value of σ2. cations. These applications include nonlinear circular filtering [1], [2], constrained object tracking [3], speech processing [4], In their book on directional statistics, Mardia and Jupp [16, [5], and bearings-only tracking [6]. Sec. 3.5] suggest However, evaluation of the WN probability density function For practical purposes, the density φw can be can appear difficult because it involves an infinite series. This approximated adequately by the first three terms of 2 is one of the main reasons why many authors (such as [7], [8], (3.5.66) [gn in this paper] when σ > 2π while for 2 [9], [10], [11]) use the von Mises distribution instead. It is even σ ≤ 2π the term with k = of (3.5.64) [fn in this sometimes referred to as the circular normal distribution [12]. paper] gives a reasonable approximation. Collet et al. published some results on discriminating between While this is practical advice, there is no theoretical justifica- wrapped normal and von Mises distributions [13]. Their results tion for using this approximation and there is no quantification were further refined by [14]. These analyses indicate that of the error incurred by this method. Other classic publications several hundred samples are necessary to distinguish between on circular statistics such as the book by Batschelet [17, the two distributions. Therefore the von Mises distribution Sec. 15.4] and the book by Fisher [18, Sec. 3.3.5] do not can be considered as a sufficiently good approximation in give any details about evaluation the WN probability density applications where sample sizes are small, but may prove function and suggest the use of the von Mises distribution insufficient in applications with large sample sizes. instead. In this paper, we will show that a very accurate numerical II. THE WRAPPED NORMAL DISTRIBUTION evaluation of the WN probability density function can be per- formed with little effort. Some authors have briefly discussed The wrapped normal distribution [12, Sec. 2.2.6], [16, Sec. approximation of the WN probability density function, but 3.5] is defined by the probability density function (pdf) to our knowledge, no one has published any proof for error 1 2 1 X (x + 2πk − µ) bounds, even though the WN distribution has been known f(x; µ, σ) = p exp − ; 2πσ 2σ2 and used for a long time [15]. Jammalamadaka and Sengupta k=−∞ simply state [12, Sec. 2.2.6] with x 2 [0; 2π), location parameter µ 2 [0; 2π), and It is clear that the density can be adequately uncertainty parameter σ > 0. Because the summands of the series converge to zero, it is natural to approximate the pdf accuracy range approximation with a truncated series 0 < σ < 1:34 f 0(x; µ, σ) 1E-5 1:34 ≤ σ < 2:28 f 1(x; µ, σ) 1 f(x; µ, σ) ≈ f (x; µ, σ) 2:28 ≤ σ < 4:56 g (x; µ, σ) n 4:56 ≤ σ g0(x; µ, σ) n 2 1 X (x + 2πk − µ) 0 = p exp − ; 0 < σ < 0:93 f (x; µ, σ) 2πσ 2σ2 0:93 ≤ σ < 1:89 f 1(x; µ, σ) k=−n 1E-10 1:89 ≤ σ < 2:21 f 2(x; µ, σ) 2:21 ≤ σ < 3:31 g2(x; µ, σ) where only 2n + 1 summands are considered. We will inves- 3:31 ≤ σ < 6:62 g1(x; µ, σ) tigate the choice of n (depending on σ) in this paper. 6:62 ≤ σ g0(x; µ, σ) As we will later prove, the series representation defined 0 < σ < 0:76 f 0(x; µ, σ) above yields a good approximation for small values of σ only. 0:76 ≤ σ < 1:53 f 1(x; µ, σ) 2 For this reason, we introduce a second representation, which 1:53 ≤ σ < 2:31 f (x; µ, σ) 1E-15 2:31 ≤ σ < 2:73 g3(x; µ, σ) yields good approximations for large values of σ. The pdf 2:73 ≤ σ < 4:09 g2(x; µ, σ) of a WN distribution is closely related to the Jacobi theta 4:09 ≤ σ < 8:17 g1(x; µ, σ) function [19]. This leads to another representation of the pdf 8:17 ≤ σ g1(x; µ, σ) [12, eq. (2.2.15)] Table I COMBINED APPROXIMATIONS FOR DIFFERENT ACCURACIES. 1 ! 1 X 2 g(x; µ, σ) = 1 + 2 ρk cos(k(x − µ)) ; 2π k=1 2 IV. THEORETICAL RESULTS where ρ = exp(−σ =2) . Analogous to fn, we define a truncated version Before we analyze the approximation error of the different approaches, we prove an inequality for the error function. g(x; µ, σ) ≈ gn(x; µ, σ) n ! 2 Lemma 1. For x > 1, the error function fulfills the inequality 1 X k 2 = 1 + 2 ρ cos(k(x − µ)) e−x 2π 1 − erf(x) ≤ p . k=1 π Proof: We use the continued fraction representation as 1 that only considers the first n summands. given in [19, 7.1.14] 2 III. EMPIRICAL RESULTS e−x erf(x) = 1 − 0 1 We implemented the truncated series fn and gn as well as p 1 π @x + 2x+ 2 A the exact solution (which increases n until the value of the x+ 3 2x+ 4 x+··· pdf does not change anymore because of the limited accuracy 2 e−x of the data type). We used the IEEE 754 double data type for ) 1 − erf(x) = all variables. It consists of 1 bit for the sign, 11 bit for the 0 1 p exponent, and 52 bit for the fraction [20]. Thus, it is accurate 1 π @x + 2x+ 2 A x+ 3 2x+ 4 to approximately 15 decimal digits. x+··· 2 For x; µ 2 [0; 2π), the error is largest for µ = 0 and e−x x ! 2π in both approximations (see Fig. 2). We will later ) 1 − erf(x) ≤ p x>1 π show this fact in the theoretical section. Thus, we com- pare the error ef (n; σ) = jf(2π; 0; σ) − fn(2π; 0; σ)j and eg(n; σ) = jg(2π; 0; σ)−gn(2π; 0; σ)j respectively. The results for n = 1; 2;:::; 11 are depicted in Fig. 3. Furthermore, we A. Representation Based on Wrapped Density include a comparison to the uniform distribution with pdf 1 fu(x) = 2π , which is also a special case of gn for n = 0. We consider the approximation fn(x; µ, σ) ≈ f(x; µ, σ). As can be seen, the uniform distribution is accurate up to In the following proposition, we will show that the error numerical precision for approximately σ ≥ 9. decreases exponentially in n. We empirically determined the combined approximation Proposition 1. For x; µ 2 [0; 2π) and n > 1 + pσ , the error based on fn and gn for different accuracies (see Table I). 2π ef (n; σ) = jfn(x; µ, σ) − f(x; µ, σ)j has an upper bound 1We treat the parameter n in f and g the same way, although the p n n (π 2(n−1))2 evaluation of f involves 2n + 1 summands whereas the evaluation of g n n exp − σ2 only involves n summands. However, the computational effort for evaluation e (n; σ) < : f 3=2 of a single summand of gn is higher, which roughly negates this difference. 2π WN(0,5.0) WN(0,0.5) 0 0 10 10 n=0 n=0 n=1 n=1 n=2 −2 n=2 n=3 10 n=3 −5 10 n=4 n=4 n=5 n=5 −4 n=6 10 n=6 n=7 n=7 n=8 n=8 −10 −6 10 n=9 10 n=9 error n=10 error n=10 −8 10 −15 10 −10 10 −20 −12 10 10 0 1 2 3 4 5 6 0 1 2 3 4 5 6 x x Figure 2. Empirical results depicting the error for different values of n for ef (n; σ) with σ = 5 (left) and eg(n; σ) with σ = 0:5 (right). Note that some points are rounded to zero because of the limited accuracy of the floating point arithmetic.