Toroidal Information Fusion Based on the Bivariate Von Mises Distribution

Toroidal Information Fusion Based on the Bivariate von Mises Distribution

Gerhard Kurz∗ and Uwe D. Hanebeck∗

Abstract— Fusion of toroidal information, such as correlated angles, is a problem that arises in many fields ranging from robotics and signal processing to meteorology and bioinformatics. For this purpose, we propose a novel fusion method based on the bivariate von Mises distribution. Unlike most literature on the bivariate von Mises distribution, we consider the full version with matrix-valued parameter rather than a simplified version. By doing so. we are able to derive the exact analytical computation of the fusion operation. We also propose an efficient approximation of the normalization constant including an error bound and present a parameter estimation algorithm based on a maximum likelihood approach. The presented algorithms are illustrated through examples.

I.INTRODUCTION Fig. 1: Samples on the torus drawn from three bivariate von Many applications involve the consideration of correlated Mises distributions with different parameters. angles or other periodic quantities. For example, the direction a person’s head is facing and the direction the torso is facing are highly correlated. Other examples are the wind direction The bivariate von Mises distribution is not the only at two different measurement locations or at two different toroidal probability distribution capable of representing cor- times, the orientation of two joints of a robotic arm, or the relations. An interesting alternative is the bivariate wrapped phase of a signal received over multiple paths. normal distribution (see, for example, [10]). We previously Motivated by the large number of highly relevant problems investigated the applicability of this distribution for recursive in a variety of areas, we propose a novel method for the filtering in [11]. Although the bivariate wrapped normal has a fusion of measurements of correlated angles in this paper. number of nice properties and is well-suited for propagation Because each angle can be represented by a point on the of information, it is not closed under multiplication, which unit circle, we consider the Cartesian product of two unit makes the fusion operation difficult and computationally ex- circles, which is given by the torus. pensive. Some other distributions can be found in literature, As most commonly used distributions such as the Gaussian such as a bivariate distribution with von Mises marginals1 n distribution are defined on R , they are not suitable to repre- [12], a model based on Fourier series [13], and a distribution sent uncertainty on the torus (although they can still be used induced by a Mobius¨ transformation [14]. to locally approximate a toroidal distribution). Consequently, The contributions of this paper can be summarized as it is important to consider probability distributions defined follows. First, we present a series representation for the on the torus. We resort to the field of directional statistics normalization constant of the bivariate von Mises distribution [1], a subfield of statistics that deals with distributions on that can be used to efficiently approximate its value with certain manifolds, the torus being one example. One suitable fairly low computational cost. There has been some previous toroidal distribution is the bivariate von Mises distribution work on the normalization constant for certain special cases, (see Fig. 1), which was proposed by Mardia in [2] and [3] but our method is to the authors’ knowledge the first result (see also [1, Sec. 3.7.1]). This distribution constitutes a for the general bivariate von Mises. Second, we propose a generalization of the univariate von Mises distribution [4], a parameter estimation scheme based on maximum likelihood commonly used distribution on the unit circle. Special cases estimation of the distribution’s parameters. Third, we prove of the bivariate von Mises distribution were considered by that the general form of the bivariate von Mises distribution is Rivest [5], Singh et al. [6], and Mardia et al. [7], [8]. A closed under multiplication and present an efficient algorithm possible generalization to higher dimensions was discussed to analytically calculate the product of two bivariate von in [9]. Mises distributions. This method allows efficient Bayesian ∗ The authors are with the Intelligent Sensor-Actuator-Systems Labora- fusion on the torus. tory (ISAS), Institute for Anthropomatics and Robotics, Karlsruhe Institute of Technology (KIT), Germany. E-mail: [email protected], 1The bivariate von Mises distribution counterintuitively does not have von [email protected] Mises marginals in general. II.BIVARIATE VON MISES DISTRIBUTION where Before we introduce the bivariate von Mises distribution, we briefly review the properties of the univariate von Mises α = κ2 + cos(x1 − µ1)a11 + sin(x1 − µ1)a21 , distribution [4]. Its probability density function (pdf) is given β = sin(x1 − µ1)a22 + cos(x1 − µ1)a12 . by 1 Note that the marginals are VM distributed if A is a f(x; µ, κ) = exp(κ cos(x − µ)) , 2πI0(κ) zero-matrix, but not in general. This result constitutes a generalization of the results by Singh et al. [6, eq. (2.2)] where x ∈ [0, 2π), µ ∈ [0, 2π) and κ ≥ 0. The normalization and Mardia et al. [7, eq. (5)]. constant contains I (κ), the modified Bessel function of the 0 Several special cases of the bivariate von Mises distri- first kind [15, Sec. 9.6]. The parameter µ determines the bution have been considered in literature, in which A was location of the mode of the distribution, whereas κ is a restricted to fulfill certain properties. Jupp et al. have dis- concentration parameter that represents the uncertainty. cussed a distribution, where A is a multiple of an orthogonal The bivariate von Mises (BVM) distribution is defined by matrix [17, eq. (3·3)]. Another subclass was discussed by the probability density function Rivest [5], in which A is restricted to be a diagonal matrix, f(x; µ, κ, A) i.e., a1,2 = a2,1 = 0. Later, further special cases of Rivest’s model were considered by Singh et al. [6] and Mardia = C(κ, A) · exp κ1 cos(x1 − µ1) + κ2 cos(x2 − µ2) et al. [7]. Singh et al. proposed the sine model, where a1,1 = a1,2 = a2,1 = 0 and only a2,2 ∈ R can be chosen. (1) Analogously, Mardia et al. proposed the cosine model, where T ! a1,2 = a2,1 = a2,2 = 0 and only a1,1 ∈ can be chosen. cos(x1 − µ1) cos(x2 − µ2) R + A , Both models are compared in [7]. Several interesting results sin(x1 − µ1) sin(x2 − µ2) have been published for these special cases, e.g., solutions for where x ∈ [0, 2π)2, µ ∈ [0, 2π)2, the normalization constant, bimodality conditions, marginals, parameter estimation schemes, etc. a1,1 a1,2 2×2 A = ∈ R , However, very little is known about the general bivariate a2,1 a2,2 von Mises distribution, as it is more difficult to analyze and C(κ, A) > 0 is the normalization constant2. We discuss than its special cases. In this paper, we seek to address the calculation of C(κ, A) in detail in Sec. III. The param- this deficiency and present several results for the general eter µ once again controls the location of the distribution, bivariate von Mises distribution. The following sections will whereas κ controls the concentration. Additionally, there is be devoted to deriving an approximation of the normalization the parameter A that influences the correlation of the two constant, presenting a maximum likelihood estimator, and angles. It is important to emphasize that A is an arbitrary proposing a method for Bayesian fusion. 2 × 2 matrix, i.e., it does not need to be symmetric, positive definite, or invertible. To illustrate the influence of the matrix III.NORMALIZATION CONSTANT A, we depict several examples of the resulting pdf in Fig. 2. As can be seen, the pdf can become bimodal for certain In this section, we derive a method for calculating the choices of A, which is typically not intended3. Also, it is normalization constant of a bivariate von Mises distribution. noteworthy that there are four parameters, the entries of Some authors such as Singh et al. [6], Rivest [5], and A, parameterizing the correlation. This may seem surprising Jupp et al. [17] have previously derived the normalization as intuitively a single scalar parameter should be sufficient constant for certain special cases of the bivariate von Mises (similar to the bivariate wrapped normal distribution [11]). distribution but their solutions do not apply to the more The marginals of the BVM distribution with matrix pa- general distribution with matrix parameter A. rameter can be obtained according to Z 2π A. Series Representation f(x1) = f(x1, x2)dx2 0 Obviously, the normalization constant is independent of = C · exp(κ1 cos(x1 − µ1)) the location of the distribution, i.e., we do not need to Z 2π consider the value of µ. By rewriting the exponential function · exp(α cos(x2)) exp(β sin(x2))dx2 0 using its power series representation [15, 4.2.1], we obtain p 2 2 = C · exp(κ1 cos(x1 − µ1)) · 2πI0( α + β ) Z 2π Z 2π C(κ, A)−1 = exp κ cos(x ) + κ cos(x ) 2Some authors use a slightly different (but equivalent) parameterization, 1 1 2 2 0 0 in which angles are represented by unit vectors, see e.g., [16, eq. (2.10)] 3The exact condition when the density is bimodal are not yet known for T ! cos(x1) cos(x2) the general case of a bivariate von Mises distribution. Bimodality conditions + A dx1dx2 for the sine and cosine models are given in [7, Theorem 2,3] sin(x1) sin(x2) osdrn h nqaiyfrthe for inequality the considering the as series, power Bound Error the B. of zero. terms to consider converge of to Appendix. quickly sufficient couple summands the is first it in the concentrations, terms low only few of first case the the In for solutions give We n for define we where with distributions Mises von Bivariate 2: Fig. h term The both | = ≤ s u h ouingt opiae o ag ausof values large for complicated gets solution the but , s n ecnpoea ro on fti prxmto by approximation this of bound error an prove can We cos( + sin( + = n ( | Z 4 κ ( 0 x Z κ π , 2 1 0 , A 2 π 2 A ( 0 0 0 0 and π Z x κ ) x s A = A = | 0 ) 1 1 Z n 1

2 0 2 0 0 ) ) 0 ( + π a a = = κ 2 x 2 π | 1 , 2 κ κ , , 1 A n n X X 1 2 1 ∞ ∞ are =0 =0 κ cos( cos( ) + + + 1 a ecluae nltclyfrarbitrary for analytically calculated be can cos( n s a κ 2 1 n x n x ! 1 cos( π 2 sin( , ( 2 Z 2 1 -periodic. + n κ ∈ sin( + ) cos( + ) x 0 + , ! 2 1 x a x N A π + ) 1 1 a 1 0 Z , ) ) ) 1 1 0 , 2 , + 2 T κ π x + 2 x A a 1

cos( 1 1 a ) n ) , a κ 2 2 a t term -th cos( , 2 sin( 1 1 1 + , x , 2 cos( 2 + 2 sin( a sin( ) x x 2 a 2 , 2 x 2 1 ) ) , x 1 2 x + + ) 2 ) 2 ! ) | ) a n µ n 2 n κ . , dx dx 2 [2 = 2 | n 1 cos( 1 dx dx dx , 1 2 3] 2 x dx 2 T (2) . n ) , 2 . κ [0 = ae ntesmlrt fvnMssdsrbtoso high of distributions Mises distributions. Gaussian von the of to approximations similarity concentration e.g., the found, on other be high, based to is uncer- concentration need the the may where i.e., approximations cases small, In high. is is concentration tainty the for suitable where mostly is situations approximation proposed the bound, this hs h ro erae o ﬁxed for decreases summands error the Thus, h rnae ato h eiscnb one codn to according bounded of be can value series absolute the the of part Then, truncated constant. the normalization to the 0 of summands consider only we Now, notation, the simplify To tight. abbreviation not the is deﬁne bound we this that Note . 7 , 1 t . 3]

4 = n X T = ∞ N o ifrn orlto matrices correlation different for , N π 2 s scoe ag nuh scnb enfrom seen be can As enough. large chosen is ( n κ ( 1 n κ + , ! A κ ) 2

+ = = ≤ = ≤ a | n n X | | 1 X t ∞ N N t t =0 = ∞ , | | | 1 N N N N ! ! + ( n X

exp( n X N ∞ | =0 ∞ t a n t =0 n | 1 ! N + ,

2 ( t | + t N t n + n ) | n hntenme of number the when N ! n | )! → t + a | − n 2 N n , 1 1 →∞ )! + o calculation for A a 2 0 oethat Note . ,

2 0.1 0.2 0 0 . ) A = A = . 0.3 0.5 0 0.9 IV. PARAMETER ESTIMATION B. Derivatives

For many practical problems, it is essential to be able Some optimization algorithms allow the user to specify to estimate the parameters of a distribution based on a set the derivatives of the value function in order speed up the of samples. In this section, we propose an algorithm based optimization process. For this reason, we derive analytical on the concept of maximum likelihood estimation (MLE) expression for the derivatives in this section. The derivatives to address this problem. The related problem of parameter of the log-likelihood can be calculated as follows. Deriving estimation for the bivariate wrapped normal distribution is with respect to µ1 yields discussed in [18] n ∂ X (j) log(L(µ, κ, A) = κ1 sin(x1 − µ1) ∂µ1 A. Maximum Likelihood Approach j=1 T " (j) # " (j) #! (1) (n) sin(x − µ ) cos(x − µ ) We assume that n independent samples x , . . . , x ∈ + 1 1 A 2 2 2 (j) (j) [0, 2π) on the torus are given. Then, the likelihood of − cos(x1 − µ1) sin(x2 − µ2) obtaining these samples is given by and the derivative with respect to µ2 can be obtained analo- n gously. The derivative with respect to κ (l = 1, 2) is given Y l L(µ, κ, A) = f(x(j); µ, κ, A) , by j=1 ∂ log(L(µ, κ, A) and we seek to obtain the parameters µ, κ, A that maximize ∂κl n L(µ, κ, A). For this purpose, we consider the log-likelihood ∂ X (j) = −nC(κ, A) C(κ, A)−1 + cos(x − µ ) , ∂κ l l l j=1 n X log(L(µ, κ, A) = log(f(x(j); µ, κ, A)) where the derivative of the inverse normalization constant is j=1 given by n ∞ ∂ (j) X ∂ X ∂κ sn(κ, A) = n log(C(κ, A)) + κ1 cos(x1 − µ1) (C(κ, A)−1) = l . ∂κ n! j=1 l n=0 T " (j) # " (j) # (j) cos(x −µ ) cos(x −µ ) The derivative of s can be calculated analytically because + κ cos(x −µ ) + 1 1 A 2 2 n 2 2 2 (j) (j) s is a polynomial for all n (see Appendix). If we differen- sin(x1 −µ1) sin(x2 −µ2) n tiate with respect to a1,1, we obtain and obtain the parameters according to ∂ log(L(µ, κ, A) arg max log(L(µ, κ, A) ∂a1,1 µ,κ,A ∂ C(κ, A)−1 n ∂a1,1 X (j) (j) = −n + cos(x −µ ) cos(x −µ ) , s. t. κ1 > 0, κ2 > 0 C(κ, A)−1 1 1 2 2 j=1 Unfortunately, a closed-form solution of this problem does where not seem to exist. Therefore, we use a numerical optimization ∞ ∂ sn(κ, A) algorithm to find the best solution. As this is a non-convex ∂ X ∂a1,1 (C(κ, A)−1) = . optimization problem, we cannot guarantee to find the global ∂a n! 1,1 n=0 optimum, but starting with a well-chosen initial value seems to be sufficient to obtain a good local optimum in practice. Once again, the derivative of sn can be calculated analyti- In order to obtain the initial value for the optimization, we cally. The other entries of A can be obtained in a similar propose the following method. First, we observe that if A is way. a zero matrix, the marginals V. BAYESIAN FUSION Z 2π Z 2π Now that we have shown how to estimate the parameters f(x; µ, κ, A)dx1 , f(x; µ, κ, A)dx2 0 0 of a single bivariate wrapped normal distribution, we address the questions of how to fuse several of these densities. In are von Mises distributions with parameters µ1, κ1 and order to perform Bayesian fusion of multiple densities, we µ2, κ2, respectively. Therefore, we fit von Mises distributions need to derive a way to calculate the product of two bivariate to the marginals of the samples using moment matching (see von Mises densities. The reason is that according to Bayes’ [19]) and use their parameters to obtain an initial estimate theorem, we have for µ and κ. This is equivalent to obtaining µ from the first f(z|x) · f(x) trigonometric moment of the samples (see [11]). The matrix f(x|z) = ∝ f(z|x) · f(x) , A is then initialized with a zero matrix. f(z)   cos(µ1) cos(µ2) − sin(µ1) cos(µ2) − cos(µ1) sin(µ2) sin(µ1) sin(µ2) sin(µ1) cos(µ2) cos(µ1) cos(µ2) − sin(µ1) sin(µ2) − cos(µ1) sin(µ2) M(µ) =   cos(µ1) sin(µ2) − sin(µ1) sin(µ2) cos(µ1) cos(µ2) − sin(µ1) cos(µ2) sin(µ1) sin(µ2) cos(µ1) sin(µ2) sin(µ1) cos(µ2) cos(µ1) cos(µ2)

Fig. 3: Matrix required for Bayesian fusion of bivariate von Mises densities. i.e., if we assume f(z|x) and f(x) to be distributed according The equations (3)–(6) can be solved to obtain µZ and κZ , to bivariate von Mises distributions, we can obtain f(x|z) as which yields the (renormalized) product of the two. q Z C 2 S 2 Before we calculate the product, we ﬁrst consider the κ1 = (m1 ) + (m1 ) , probability density function (1) and rewrite the matrix mul- q Z C 2 S 2 tiplication, which yields κ2 = (m2 ) + (m2 ) , µZ = atan2(mS, mC ) , f(x; µ, κ, A) = C · exp(κ cos(x − µ ) + κ cos(x − µ ) 1 1 1 1 1 1 2 2 2 Z S C µ2 = atan2(m2 , m2 ) , + cos(x1 − µ1)a11 cos(x2 − µ2)

+ cos(x1 − µ1)a12 sin(x2 − µ2) where + sin(x − µ )a cos(x − µ ) C X X Y Y 1 1 21 2 2 m1 = κ1 cos(µ1 ) + κ1 cos(µ1 ) , + sin(x1 − µ1)a22 sin(x2 − µ2)) . S X X Y Y m1 = κ1 sin(µ1 ) + κ1 sin(µ1 ) , C X X Y Y By using the addition theorems for sine and cosine, we can m2 = κ2 cos(µ2 ) + κ2 cos(µ2 ) , reformulate the exponent as S X X Y Y m2 = κ2 sin(µ2 ) + κ2 sin(µ2 ) .

κ1 cos(µ1) cos(x1) + κ2 cos(µ2) cos(x2) Effectively, this corresponds to the multiplication formula for

+ κ1 sin(µ1) sin(x1) + κ2 sin(µ2) sin(x2) univariate von Mises distributions in each dimension. The last four equations can be simpliﬁed to + cos(x1) cos(x2) · M(µ)1,: · a Z Z X X Y Y + sin(x1) cos(x2) · M(µ)2,: · a M(µ ) · a = M(µ ) · a + M(µ ) · a , + cos(x1) sin(x2) · M(µ)3,: · a which can be solved by inverting M(µZ ) + sin(x1) sin(x2) · M(µ)4,: · a , aZ = M(µZ )−1(M(µX ) · aX + M(µY ) · aY ) (11) T where a = [a1,1, a2,1, a1,2, a2,2] is the vectorized form to obtain aZ , and thus AZ . It can be shown that det M(µ) = of A and M(µ)j,: refers to the j-th row of M(µ), which is given in Fig. 3. 1, irrespective of µ. Hence, M(µ) is always invertible and Now we consider two bivariate von Mises distributions (11) can always be solved. f(x; µX , κX , AX ) and f(x; µY , κY , AY ) with parameters It is worth mentioning that our result for the product of two µX , κX , AX and µY , κY , AY , respectively. We seek to bivariate von Mises densities also shows that the restricted obtain the parameters µZ , κZ , AZ of a third bivariate von versions discussed by Singh et al. [6] and Mardia et al. [7] Mises distribution f(x; µZ , κZ , AZ ), such that are, in general, not closed under multiplication because an entry of A can be non-zero after multiplication, even if it f(x; µX , κX , AX ) · f(x; µY , κY , AY ) ∝ f(x; µZ , κZ , AZ ) was zero before, depending on M(µ).

For this purpose, we consider the exponent of the product VI.EXAMPLES and use the method of equating the coefﬁcients to obtain the In this section, we give some examples of how the system of eight equations proposed methods can be applied. X X Y Y Z Z κ1 cos(µ1 ) + κ1 cos(µ1 ) = κ1 cos(µ1 ) , (3) A. Parameter Estimation Example X X Y Y Z Z κ1 sin(µ1 ) + κ1 sin(µ1 ) = κ1 sin(µ1 ) , (4) The following example illustrates the parameter estimation X X Y Y Z Z κ2 cos(µ2 ) + κ2 cos(µ2 ) = κ2 cos(µ2 ) , (5) method discussed in Sec. IV. For this purpose, we use the circular κX sin(µX ) + κY sin(µY ) = κZ sin(µZ ) , (6) wind data set from the R package [20]. This data set 2 2 2 2 2 2 was also discussed in [21]. It contains measurements of the X X Y Y Z Z M(µ )1,:a + M(µ )1,:a = M(µ )1,:a , (7) wind direction measured by a meteorological station at Col X X Y Y Z Z M(µ )2,:a + M(µ )2,:a = M(µ )2,:a , (8) De La Roa, Italy, from January 29th, 2001 to March 31st, X X Y Y Z Z 2001. For each day, we consider the measurements obtained M(µ )3,:a + M(µ )3,:a = M(µ )3,:a , (9) X X Y Y Z Z at 3:00 am and at 3:15 am. In this way, we obtain 62 two- M(µ )4,:a + M(µ )4,:a = M(µ )4,:a . (10) dimensional samples deﬁned on the torus. bivariate von Mises and 0 0 µY = [1.5, 5]T , κY = [0.1, 2]T , AY = , 0 0 respectively. By applying the Bayesian fusion formulas given in Sec. V, we obtain the parameters of the product f(x; µZ , κZ , AZ ), which are given by

µZ = [1.9392, 4.9203]T , κZ = [0.7892, 2.1148]T , 0.0145 0.0110 AZ = . −0.2383 −0.1813

bivariate wrapped normal This example is visualized in Fig. 5. As shown in Sec. V, this result does not involve any approximation, i.e., f(x; µZ , κZ , AZ ) ∝ f(x; µX , κX , AX ) · f(x; µY , κY , AY ) .

Observe that all entries of AZ are non-zero, even though AX had just one non-zero entry and AY was a zero matrix. This illustrates the fact that the sine model considered by Singh et al. [6] is not closed under multiplication.

VII.CONCLUSION In this paper, we have presented a number of results on the bivariate von Mises distribution. In particular, we have proposed a series representation of the approximation constant Fig. 4: Parameter estimation example. The samples are that allows the efficient computation of an approximation. shown in red and the value of the probability density function Furthermore, we have discussed parameter estimation based is given by the color of the background. Note that both the on a maximum likelihood approach. Finally, we have shown ◦ x and y axes are 360 -periodic. that the bivariate von Mises distribution is closed under multiplication and given an algorithm to calculate the parameters of the renormalized product in closed form. This allows Then, the maximum likelihood estimation procedure pro- efficient Bayesian fusion of bivariate von Mises densities, posed in Sec. IV was used to obtain the parameters of a which facilitates their use in a variety of fusion problems on bivariate von Mises distribution. For comparison, we also the torus. obtained the parameters of a bivariate wrapped normal dis- Future work may include an extension of the discussed tribution using numerical MLE [18], because this distribution methods to the n-torus (see also [9]), and the derivation of is also a common model for toroidal data. The resulting a convolution (or more general propagation) algorithm to densities are depicted in Fig. 4. create a recursive filter as was developed for the bivariate It can be seen that both distributions are able to repre- wrapped normal distribution [11]. Moreover, further investi- sent the obvious correlation between the two measurements. gation of the normalization constant may be of interest, as it However, the bivariate von Mises seems to reflect the true may be possible to represent it as a series of Bessel functions, underlying density more closely because it is able to reflect similar to the results for the special cases given in [5], [6], the asymmetric distribution of the samples as a result of the and [7]. additional degrees of freedom. This is illustrated by the fact An implementation of the algorithms proposed in this that the log-likelihood of the BVM distribution is higher (- paper is available as part of the MATLAB library 147.3259) than the log-likelihood of the BWN distribution libDirectional [22], a library dedicated to directional (-164.5940). In future work, a rigorous goodness of fit test statistics and estimation involving directional quantities. could be used to better quantify this result. APPENDIX B. Fusion Example The first terms of sn(κ, A) in (2) are given by In this example, we consider two bivariate von Mises X X X Y Y Y 2 distributions f(x; µ , κ , A ) and f(x; µ , κ , A ) with s0(κ, A) = 4π , parameters s1(κ, A) = 0 , 0 0 s (κ, A) = π2(a2 + a2 + a2 + a2 + 2κ2 + 2κ2) , µX = [2, 4]T , κX = [0.7, 0.2]T , AX = 2 1,1 1,2 2,1 2,2 1 2 0 −0.3 2 s3(κ, A) = 6κ1κ2a1,1π , f(x; µX , κX , AX ) f(x; µY , κY , AY ) f(x; µZ , κZ , AZ ) Fig. 5: Fusion example. The density on the left is multiplied with the density in the middle to obtain the density on the right. On the top, the densities are visualized on the torus, whereas on the bottom, the same densities are visualized as a function on the plane.

3 2 4 2 2 2 2 2 2 s4(κ, A) = π (3a1,1 + 6a1,1a1,2 + 6a1,1a2,1 + 2a1,1a2,2 [10] R. A. Johnson and T. Wehrly, “Measures and Models for Angular 16 Correlation and Angular-Linear Correlation,” Journal of the Royal 2 2 2 2 4 Statistical Society. Series B (Methodological), vol. 39, no. 2, pp. 222– +24a1,1κ1 + 24a1,1κ2 + 8a1,1a1,2a2,1a2,2 + 3a1,2 229, 1977. 2 2 2 2 2 2 2 2 4 +2a1,2a2,1 + 6a1,2a2,2 + 24a1,2κ1 + 8a1,2κ2 + 3a2,1 [11] G. Kurz, I. Gilitschenski, M. Dolgov, and U. D. Hanebeck, “Bivari- 2 2 2 2 2 2 4 2 2 ate Angular Estimation Under Consideration of Dependencies Using +6a2,1a2,2 + 8a2,1κ1 + 24a2,1κ2 + 3a2,2 + 8a2,2κ1 Directional Statistics,” in Proceedings of the 53rd IEEE Conference +8a2 κ2 + 8κ4 + 32κ2κ2 + 8κ4) , on Decision and Control (CDC 2014), Los Angeles, California, USA, 2,2 2 1 1 2 2 Dec. 2014. 15 s (κ, A) = π2κ κ (3a3 + 2a a a [12] G. S. Shieh and R. A. Johnson, “Inferences Based on a Bivariate 5 4 1 2 1,1 1,2 2,1 2,2 Distribution with von Mises Marginals,” Annals of the Institute of 2 2 2 2 2 Statistical Mathematics, vol. 57, no. 4, pp. 789–802, 2005. +a1,1(3a1,2 + 3a2,1 + a2,2 + 4κ1 + 4κ2)) . [13] J. J. Fernandez-Dur´ an,´ “Models for Circular–Linear and Circular– Circular Data Constructed from Circular Distributions Based on Non- REFERENCES negative Trigonometric Sums,” Biometrics, vol. 63, no. 2, pp. 579–585, [1] K. V. Mardia and P. E. Jupp, Directional Statistics, 1st ed. Wiley, 2007. 1999. [14] S. Kato and A. Pewsey, “A Mobius¨ Transformation-induced Distribu- [2] K. V. Mardia, “Characterizations of Directional Distributions,” in tion on the Torus,” Biometrika, 2015. A Modern Course on Statistical Distributions in Scientific Work. [15] M. Abramowitz and I. A. Stegun, Handbook of Mathematical Func- Springer Netherlands, 1975, vol. 17, pp. 365–385. tions with Formulas, Graphs, and Mathematical Tables, 10th ed. New [3] ——, “Statistics of Directional Data,” Journal of the Royal Statistical York: Dover, 1972. Society. Series B (Methodological), vol. 37, no. 3, pp. 349–393, 1975. [16] P. E. Jupp and K. V. Mardia, “A Unified View of the Theory of [4] R. von Mises, “Uber¨ die ”Ganzzahligkeit” der Atomgewichte und Directional Statistics, 1975-1988,” International Statistical Review / verwandte Fragen,” Physikalische Zeitschrift, vol. XIX, pp. 490–500, Revue Internationale de Statistique, vol. 57, no. 3, pp. 261–294, 1989. 1918. [17] ——, “A General Correlation Coefficient for Directional Data and [5] L.-P. Rivest, “A Distribution for Dependent Unit Vectors,” Communi- Related Regression Problems,” Biometrika, vol. 67, no. 1, pp. 163– cations in Statistics - Theory and Methods, vol. 17, no. 2, pp. 461–483, 173, 1980. 1988. [18] G. Kurz and U. D. Hanebeck, “Parameter Estimation for the Bivariate [6] H. Singh, V. Hnizdo, and E. Demchuk, “Probabilistic Model for Two Wrapped Normal Distribution (submitted),” in Proceedings of the 54th Dependent Circular Variables,” Biometrika, vol. 89, no. 3, pp. 719– IEEE Conference on Decision and Control (CDC 2015), Osaka, Japan, 723, 2002. Dec. 2015. [7] K. V. Mardia, C. C. Taylor, and G. K. Subramaniam, “Protein [19] G. Kurz, I. Gilitschenski, and U. D. Hanebeck, “Recursive Nonlin- Bioinformatics and Mixtures of Bivariate von Mises Distributions for ear Filtering for Angular Data Based on Circular Distributions,” in Angular Data,” Biometrics, vol. 63, no. 2, pp. 505–512, 2007. Proceedings of the 2013 American Control Conference (ACC 2013), [8] K. V. Mardia and J. Frellsen, “Statistics of Bivariate von Mises Washington D. C., USA, Jun. 2013. Distributions,” in Bayesian Methods in Structural Bioinformatics, ser. [20] U. Lund and C. Agostinelli, “circular: Circular Statistics for Biology and Health, T. Hamelryck, K. Mardia, and Statistics,” Nov. 2013. [Online]. Available: http://cran.r- J. Ferkinghoff-Borg, Eds. Springer Berlin Heidelberg, 2012, pp. 159– project.org/web/packages/circular/index.html 178. [21] C. Agostinelli, “Robust Estimation for Circular Data,” Computational [9] K. V. Mardia, G. Hughes, C. C. Taylor, and H. Singh, “A Multi- Statistics & Data Analysis, vol. 51, no. 12, pp. 5867–5875, 2007. variate von Mises Distribution with Applications to Bioinformatics,” [22] G. Kurz, I. Gilitschenski, F. Pfaff, and L. Drude, “libDirectional,” Canadian Journal of Statistics, vol. 36, no. 1, pp. 99–109, 2008. 2015. [Online]. Available: https://github.com/libDirectional