On the Fisher–Bingham Distribution

BY A. Kume and S.G. Walker, Institute of Mathematics, Statistics and Actuarial Science, University of Kent, Canterbury, CT2 7NF, UK. [email protected] and [email protected]

Abstract This paper is primarily concerned with sampling from the Fisher–Bingham distribution, and we describe a slice sampling algorithm for doing this. A by-product of this task is an infinite mixture representation of the Fisher–Bingham distribution, the mixing distributions being Dirichlet distributions. Finite numerical approximations are considered, and a sampling algorithm based on a finite mixture approximation is compared with the slice sampling algorithm.

Keywords: Dirichlet distribution; Fisher–Bingham distribution; Gibbs sampling.

1 Introduction.

The Bingham distribution, and more generally the Fisher–Bingham distribution, is constructed by constraining a multivariate normal distribution to lie on the surface of a sphere of unit radius. These distributions are used in modelling spherical data, which usually represent directions, but in some cases they can also be used in shape analysis. If $x = (x_0, x_1, \ldots, x_p)$ is distributed according to such a distribution then the norm of $x$ is 1, and hence $(x_0^2, x_1^2, \ldots, x_p^2)$ lies on the simplex. The contribution of this paper is to transform $x$ to $(\omega, s)$ and to study the marginal and conditional distributions of $\omega$ and $s$. Here $s_i = x_i^2$ and $\omega_i = x_i/|x_i|$, so that $\omega_i \in \{-1, +1\}$. Clearly the Lebesgue measure in $\mathbb{R}^{p+1}$ induces the uniform measure on the unit sphere $S^p$. Hence the Fisher–Bingham distribution, which is obtained on $S^p$ via the constrained multivariate normal random vector in $\mathbb{R}^{p+1}$ with covariance $\Sigma$ and mean $\mu \in \mathbb{R}^{p+1}$, has density with respect to the uniform measure $dS^p(x)$ on $S^p$ given by

$$
f(x \mid \mu, \Sigma) = C(\mu, \Sigma)^{-1} \exp\{-(x - \mu)^t \Sigma (x - \mu)\},
$$
where $C(\mu, \Sigma)$ is the corresponding normalising constant and $x \in \mathbb{R}^{p+1}$ is such that $x^t x = 1$. This distribution is an extension of the Bingham distribution, which corresponds to $\mu$ being zero. Since the uniform measure on $S^p$ is invariant under orthogonal transformations, it is easily shown that if $X \sim FB(\mu, \Sigma)$ then, for each $V \in O(p+1)$, $Y = XV \sim FB(\mu V, V^t \Sigma V)$. So, without loss of generality, we assume that the covariance matrix is diagonal, i.e. $\Sigma = \Lambda = \mathrm{diag}(\lambda_0, \lambda_1, \ldots, \lambda_p)$.
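As a practical aside, the reduction to diagonal $\Sigma$ can be carried out numerically by an eigendecomposition. The following Python sketch (the helper name to_canonical and the ordering convention are ours, not part of the paper) rotates a general pair $(\mu, \Sigma)$ into the diagonal form used below.

    import numpy as np

    def to_canonical(mu, Sigma):
        # If X ~ FB(mu, Sigma) then Y = X V ~ FB(mu V, V^t Sigma V); taking V to be
        # the matrix of eigenvectors of Sigma makes the rotated covariance parameter
        # diagonal.  The eigenvalues are ordered so that lambda_0 is the largest,
        # the convention adopted later in the paper.
        lam, V = np.linalg.eigh(Sigma)        # eigenvalues in ascending order
        order = np.argsort(lam)[::-1]         # largest eigenvalue first
        V = V[:, order]
        return mu @ V, lam[order], V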

With $x^t x = 1$ we have, in terms of $(\omega_0, \ldots, \omega_p, s_1, \ldots, s_p)$,

$$
(x - \mu)^t \Sigma (x - \mu) = x^t \Sigma x - 2 x^t \Sigma \mu + \mu^t \Sigma \mu
= \sum_{i=0}^{p} \lambda_i s_i - 2 \sum_{i=0}^{p} \lambda_i \omega_i \mu_i \sqrt{s_i} + \mu^t \Sigma \mu
= \sum_{i=1}^{p} (-a_i s_i - b_i \omega_i \sqrt{s_i}) - b_0 \omega_0 \sqrt{1-s} + c,
$$

where, for $i = 1, \ldots, p$, $a_i = \lambda_0 - \lambda_i$, and for $i = 0, \ldots, p$, $b_i = 2\lambda_i \mu_i$, with $c = \mu^t \Sigma \mu + \lambda_0$ and $s = 1 - s_0 = s_1 + \cdots + s_p$. Implementing the transformation from $x$ to $(\omega_0, \ldots, \omega_p, s_1, \ldots, s_p)$ yields the joint density of interest, given by
$$
f(\omega, s) \propto \exp\Big\{\sum_{i=1}^{p} (a_i s_i + b_i \omega_i \sqrt{s_i})\Big\} \exp\big\{b_0 \omega_0 \sqrt{1-s}\big\} \prod_{i=1}^{p} s_i^{-1/2}\, (1-s)^{-1/2}\, 1(s \le 1). \qquad (1)
$$
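Equation (1) is straightforward to evaluate numerically. The sketch below (our own helper, with hypothetical names) returns the unnormalised joint density of $(\omega, s)$ for given coefficient vectors $a = (a_1, \ldots, a_p)$ and $b = (b_0, \ldots, b_p)$.

    import numpy as np

    def fb_joint_unnormalised(omega, s, a, b):
        # omega : signs (omega_0, ..., omega_p) in {-1, +1}
        # s     : (s_1, ..., s_p) with s_i > 0 and sum(s) <= 1
        # a, b  : the coefficients a_1..a_p and b_0..b_p of equation (1)
        omega, s = np.asarray(omega), np.asarray(s, dtype=float)
        a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
        s_total = s.sum()
        if s_total > 1.0 or np.any(s <= 0.0):
            return 0.0
        exponent = np.sum(a * s + b[1:] * omega[1:] * np.sqrt(s)) \
                   + b[0] * omega[0] * np.sqrt(1.0 - s_total)
        return np.exp(exponent) * np.prod(s ** -0.5) * (1.0 - s_total) ** -0.5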

An important point which will come in useful later is to take, without loss of generality, $\lambda_0$ to be the largest of the $\lambda$'s, so that $a_i \ge 0$ for all $i = 1, \ldots, p$. Another important point to make is that the Bingham density (i.e. when $\mu = 0$) remains unchanged following any addition of a constant, say $\xi$, to all the diagonal elements of $\Sigma$. This follows since, with $x^t x = 1$, $x^t(\Sigma + \xi I)x = x^t \Sigma x + \xi$.

In this case we cannot expect to be able to estimate the $p+1$ $\lambda_i$'s themselves; rather, it is only the differences, such as $\lambda_i - \lambda_{i'}$, that we can estimate. Note that this scenario also holds whether $\Sigma$ is diagonal or not. The function in (1) is the joint density of interest, and from it we make our contributions concerning the Fisher–Bingham distribution: a mixture representation and a Gibbs sampling scheme for simulating samples. In Section 2 we provide the mixture representation of the Fisher–Bingham distribution and demonstrate a method for truncating this to a finite mixture with known error. In some parameter cases a workable approximation can be obtained. In Section 3 we provide a Gibbs sampling approach to sampling the Fisher–Bingham distribution, and hence compare distributions obtained from our mixture representation and from the Gibbs sampling. This problem of sampling the Fisher–Bingham distribution was raised by a referee of the paper by Kume and Walker (2006).

2 Mixture representation of Fisher–Bingham distribution.

From the joint density of $(\omega, s)$ it is clear that, for $i = 0, \ldots, p$, we have
$$
P(\omega_i = 1 \mid s_i) = \frac{\exp(b_i \sqrt{s_i})}{\exp(b_i \sqrt{s_i}) + \exp(-b_i \sqrt{s_i})},
$$
and the $\omega_i$ are independent given the $s_i$. Hence this part is straightforward, and so for the rest of this section we concentrate on the marginal density of $s$. This is given, up to a constant of proportionality and with respect to the Lebesgue measure $ds_1 \cdots ds_p$, by

$$
f(s) \propto g(s) = \left\{ \prod_{i=1}^{p} \exp(a_i s_i) \cosh(b_i \sqrt{s_i})\, s_i^{-1/2} \right\} \cosh(b_0 \sqrt{1-s})\, (1-s)^{-1/2}\, 1(s \le 1).
$$

Note that $h(s) = \prod_{i=1}^{p} s_i^{-1/2} (1-s)^{-1/2}$ is proportional to $\mathrm{Dir}(s; 1/2, \ldots, 1/2)$, the pdf at $(s_1, \ldots, s_p)$ of the Dirichlet distribution with its $p+1$ parameters equal to $1/2$. Now both $\exp(\cdot)$ and $\cosh(\cdot)$ can be expanded in power series; so

$$
\exp(a_i s_i) = \sum_{l=0}^{\infty} \frac{a_i^{l} s_i^{l}}{l!}
\qquad \text{and} \qquad
\cosh(b_i \sqrt{s_i}) = \sum_{m=0}^{\infty} \frac{b_i^{2m} s_i^{m}}{(2m)!},
$$
leading to

$$
g(s) = \sum_{l_1=0}^{\infty} \cdots \sum_{l_p=0}^{\infty} \sum_{m_0=0}^{\infty} \cdots \sum_{m_p=0}^{\infty} w(l, m)\, \mathrm{Dir}(s;\, l_1 + m_1 + 1/2, \ldots, l_p + m_p + 1/2,\, m_0 + 1/2),
$$
where
$$
w(l, m) = \frac{b_0^{2m_0}\, \Gamma(m_0 + 1/2)}{(2m_0)!\, \Gamma\!\Big( \sum_{i=1}^{p} (l_i + m_i + 1/2) + m_0 + 1/2 \Big)} \prod_{i=1}^{p} \frac{b_i^{2m_i}\, a_i^{l_i}\, \Gamma(l_i + m_i + 1/2)}{l_i!\, (2m_i)!}.
$$
Hence, $f(s)$ is an infinite mixture of Dirichlet distributions.
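Since $w(l, m)$ is a product of gamma and factorial terms, each weight is easy to evaluate. A minimal sketch (the helper name mixture_weight is ours) is given below; direct evaluation is adequate for the small indices used later, although a log-scale version would be preferable for large indices.

    import numpy as np
    from math import factorial
    from scipy.special import gamma

    def mixture_weight(l, m0, m, a, b):
        # l = (l_1, ..., l_p), m = (m_1, ..., m_p) and m0 are non-negative integers;
        # a = (a_1, ..., a_p) and b = (b_0, ..., b_p) are the coefficients of (1).
        l, m = np.asarray(l), np.asarray(m)
        a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
        alpha = l + m + 0.5                       # first p Dirichlet parameters
        num = b[0] ** (2 * m0) * gamma(m0 + 0.5)
        num *= np.prod(b[1:] ** (2 * m) * a ** l * gamma(alpha))
        den = factorial(2 * m0) * gamma(alpha.sum() + m0 + 0.5)
        den *= np.prod([factorial(int(li)) * factorial(2 * int(mi))
                        for li, mi in zip(l, m)])
        return num / den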

2.1 Finite approximation.

The idea here is to truncate the infinite mixture to a finite number of terms and to compute the error in such a procedure. To this end, let us define

$$
w(l, n, s, \omega) = \left\{ \prod_{i=1}^{p} \frac{(a_i s_i)^{l_i}}{l_i!}\, \frac{(\omega_i b_i \sqrt{s_i})^{n_i}}{n_i!} \right\} \frac{(\omega_0 b_0 \sqrt{1-s})^{n_0}}{n_0!}\, h(s).
$$
So $w(l, m)$ is the integral of $w(l, n = 2m, s, \omega)$ with respect to $ds$ and $d\omega$, the uniform measure on $\{-1, 1\}^{p+1}$. We now consider a direct expansion of the expression on the right-hand side of (1),

$$
f(\omega, s) \propto \sum_{k=0}^{\infty} \frac{1}{k!} \left\{ \sum_{i=1}^{p} (a_i s_i + b_i \omega_i \sqrt{s_i}) + b_0 \omega_0 \sqrt{1-s} \right\}^{k} h(s),
$$
and see that these terms are related to $w(l, n, s, \omega)$ as

$$
\sum_{l_1 + l_2 + \cdots + l_p + n_0 + \cdots + n_p = k} w(l, n, s, \omega) = \frac{1}{k!} \left\{ \sum_{i=1}^{p} (a_i s_i + b_i \omega_i \sqrt{s_i}) + b_0 \omega_0 \sqrt{1-s} \right\}^{k} h(s),
$$
where the summation on the left is over all non-negative integers (zeros included) such that $l_1 + l_2 + \cdots + l_p + n_0 + \cdots + n_p = k$. We notice that

$$
\sum_{i=1}^{p} (a_i s_i + b_i \omega_i \sqrt{s_i}) + b_0 \omega_0 \sqrt{1-s}
\;\le\; \sum_{i=1}^{p} (a_i s_i + |b_i| \sqrt{s_i}) + |b_0| \sqrt{1-s}
\;\le\; M \left( 1 + \sqrt{1-s} + \sum_{i=1}^{p} \sqrt{s_i} \right),
$$
where $M = \max\{a_i,\ i = 1, \ldots, p;\ |b_i|,\ i = 0, \ldots, p\}$ and we have used $\sum_{i=1}^{p} s_i \le 1$. Also,

$$
\sum_{i=0}^{p} \sqrt{s_i} \le \sqrt{p+1},
$$
and so
$$
\sum_{l_1 + l_2 + \cdots + l_p + n_0 + \cdots + n_p = k} w(l, n, s, \omega) \le \frac{M^{k} (1 + \sqrt{p+1})^{k}}{k!}\, h(s).
$$

Note that if any of the $n_i$'s is odd then the integral of $w(l, n, s, \omega)$ with respect to $d\omega$ is zero. Therefore, collecting only the non-zero terms left after integrating over $d\omega$ and $ds$ in the summation above, we obtain

$$
\sum_{l_1 + l_2 + \cdots + l_p + 2m_0 + \cdots + 2m_p = k} w(l, m) \le \frac{M^{k} (1 + \sqrt{p+1})^{k}}{k!}\, \frac{\Gamma(1/2)^{p+1}}{\Gamma(p/2 + 1/2)}.
$$

Hence,
$$
T_N = \sum_{l_1 + l_2 + \cdots + l_p + 2m_0 + \cdots + 2m_p \ge N} w(l, m) \le \tau(p) \sum_{k=N}^{\infty} \frac{M^{k} (1 + \sqrt{p+1})^{k}}{k!},
$$
where $\tau(p) = \Gamma(1/2)^{p+1} / \Gamma(p/2 + 1/2)$, leading to
$$
T_N \le \phi(N) = \frac{\tau(p)\, M^{N} (1 + \sqrt{p+1})^{N}}{N!}\, e^{M(1 + \sqrt{p+1})}.
$$
We can find a value of $N$, which will form the basis of our truncation, from the rate at which $\phi(N)$ decreases. For a given error $\epsilon$ we find $N$ such that $T_N \le \phi(N) \le \epsilon$, and so an approximation to the Fisher–Bingham distribution has density function for $s = (s_1, \ldots, s_p)$ given by

$$
f_N(s) = \sum_{l_1=0}^{N-1} \cdots \sum_{l_p=0}^{N-1} \sum_{m_0=0}^{[N/2]} \cdots \sum_{m_p=0}^{[N/2]} q_N(l, m)\, \mathrm{Dir}(s;\, l_1 + m_1 + 1/2, \ldots, l_p + m_p + 1/2,\, m_0 + 1/2),
$$
where
$$
q_N(l, m) = \frac{w(l, m)}{\sum_{l_1=0}^{N-1} \cdots \sum_{l_p=0}^{N-1} \sum_{m_0=0}^{[N/2]} \cdots \sum_{m_p=0}^{[N/2]} w(l, m)}.
$$
This works since
$$
\big\{ (l, m) : l_i < N,\ 2m_i \le N \ \text{for all } i \big\}^{c} \subset \Big\{ (l, m) : \sum_i l_i + 2 \sum_i m_i \ge N \Big\}.
$$
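The search for a suitable $N$ amounts to a short loop over the error bound; a minimal sketch (the helper name truncation_level is ours, under the notation above) is:

    from math import exp, factorial, sqrt
    from scipy.special import gamma

    def truncation_level(a, b, eps=1e-6, max_N=500):
        # Smallest N with phi(N) <= eps, where phi(N) is the bound derived above.
        p = len(a)
        M = max(max(abs(x) for x in a), max(abs(x) for x in b))
        tau = gamma(0.5) ** (p + 1) / gamma(p / 2 + 0.5)
        r = M * (1.0 + sqrt(p + 1))
        for N in range(1, max_N + 1):
            if tau * r ** N / factorial(N) * exp(r) <= eps:
                return N
        raise ValueError("error bound not met for any N <= max_N")

Note that $\phi(N)$ is only an upper bound on the truncation error, so the $N$ it delivers can be conservative.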

With this approximation, statistical inference becomes highly feasible, being based on a mixture distribution for which estimation methods are well documented; see, for example, Titterington et al. (1988). The problem with working directly with the Fisher–Bingham distribution is that the general expression for the normalising constant is unknown, and hence the exact likelihood function is unavailable. The summation of the $w(l, m)$ is in fact the normalising constant for the Fisher–Bingham distribution. Previous attempts at computing the normalising constant have been given in Kent (1987), who dealt with the Bingham distribution, and Kume and Wood (2005). Our approach is clearly computationally intensive in instances when $N$ needs to be large, though highly accurate estimates can be found in many cases with a small value of $N$. Additionally, we can in such cases know the accuracy of the normalising constant and hence use this to assess the performance of alternative approximations. In particular, in Table 1, we compare estimates of the normalising constant with those obtained by Kume and Wood (2005), who use saddlepoint approximations. It is also possible to simulate approximately from the Fisher–Bingham distribution via sampling the weights $q_N$. However, an exact method is possible via Markov chain Monte Carlo; in particular, using Gibbs sampling with slice sampling (Damien et al., 1999). This is described in Section 3.
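To illustrate, for small $p$ and $N$ the truncated sum of the $w(l, m)$ can be computed by brute-force enumeration, reusing the hypothetical mixture_weight helper above; this gives a numerical handle on the normalising constant discussed above, up to the constant factors dropped in the proportionality signs.

    import itertools
    import numpy as np

    def truncated_weight_sum(a, b, N):
        # Sum of w(l, m) over the index set of the finite approximation:
        # 0 <= l_i <= N - 1 and 0 <= m_i <= [N/2] for every index.  The number of
        # terms is N^p ([N/2] + 1)^(p + 1), so this is feasible only for small p, N.
        p = len(a)
        total = 0.0
        for l in itertools.product(range(N), repeat=p):
            for m0 in range(N // 2 + 1):
                for m in itertools.product(range(N // 2 + 1), repeat=p):
                    total += mixture_weight(np.array(l), m0, np.array(m), a, b)
        return total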

2.2 Large p.

When $p$ is large, and for this we will assume that $p = 50$, the approximation works well with small $N$. Recall we are interested in finding $N$ such that
$$
\phi(N) = \frac{\tau(p)\, M^{N} (1 + \sqrt{p+1})^{N}}{N!}\, e^{M(1 + \sqrt{p+1})} < \epsilon.
$$
Now, assuming $M = 1$,
$$
\phi(N) = \tau(51)\, \frac{(1 + \sqrt{51})^{N}}{N!}\, e^{(1 + \sqrt{51})} = \frac{8.7 \times 10^{-13} \times 8.1^{N}}{N!} \times 3200,
$$
and clearly for $N = 2$ this turns out to be very small. In this case, the approximation is well served by only considering terms for which

$$
l_1 + \cdots + l_p + m_0 + \cdots + m_p \le 1,
$$
in which case our approximation becomes
$$
\hat{f}(s) \propto w_0\, \mathrm{Dir}(s; 1/2, \ldots, 1/2) + \sum_{k=1}^{2p+1} w_k\, \mathrm{Dir}(s;\, 1/2 + \delta_{1,k}, \ldots, 1/2 + \delta_{p,k},\, 1/2 + \delta_{p+1,k}),
$$
where $w_0 = w(0, 0)$ and $w_k$ is $w(l, m)$ with the $k$th element of $(l_1, \ldots, l_p, m_0, \ldots, m_p)$ set to 1 and the rest set to 0. Also, for $j = 1, \ldots, p$,

$$
\delta_{j,k} = 1(k = j) + 1(k = j + p) \qquad \text{and} \qquad \delta_{p+1,k} = 1(k = 2p + 1).
$$
It is quite easy to see that one can take out a common term $w_0 = \Gamma(1/2)^{p+1} / \Gamma((p+1)/2)$ from all the $w$'s, leaving the terms to decay as $1/p$. That is, $w(l, m)$ behaves as $w(0, 0)/p^{n}$, where

$n = l_1 + \cdots + l_p + m_0 + \cdots + m_p$. Inference for this is quite straightforward. This would be an alternative approach to dealing with large $p$ for the Fisher–Bingham distribution to that given by Dryden (2005).
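Under these assumptions, the $2p + 2$ weights of this approximation can be assembled directly from the hypothetical mixture_weight helper of Section 2; a brief sketch follows (the ordering of the weights below is one convenient convention, not necessarily that of the display above).

    import numpy as np

    def large_p_weights(a, b):
        # Weights for the (2p + 2)-term approximation: the index vector
        # (l_1, ..., l_p, m_0, ..., m_p) is either all zeros (w_0) or has a single 1.
        p = len(a)
        zeros = np.zeros(p, dtype=int)
        weights = [mixture_weight(zeros, 0, zeros, a, b)]       # w_0
        for k in range(p):                                      # l_k = 1
            l = zeros.copy()
            l[k] = 1
            weights.append(mixture_weight(l, 0, zeros, a, b))
        for k in range(p):                                      # m_k = 1
            m = zeros.copy()
            m[k] = 1
            weights.append(mixture_weight(zeros, 0, m, a, b))
        weights.append(mixture_weight(zeros, 1, zeros, a, b))   # m_0 = 1
        return np.array(weights)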

3 Gibbs sampling of Fisher–Bingham distribution.

In this section we describe a slice sampling algorithm for sampling from the Fisher–Bingham distribution. This is a natural yet non-trivial extension of the sampling algorithm which appears in Kume and Walker (2006), which was developed to sample the Bingham distribution. The Bingham distribution arises by taking $b_i = 0$ for all $i$ in the Fisher–Bingham distribution. In the Bingham case it was not necessary to find the joint density for $(\omega, s)$; we simply worked with $s$, since the signs of the original variables $\{x_0, \ldots, x_p\}$ are independent of their squares. As in Kume and Walker (2006), by thinning the output of the Gibbs method presented here we can obtain practically independent and identically distributed samples from the Fisher–Bingham distribution. Let us consider the density in (1), which is to be sampled. We introduce three latent variables $(u, v, w)$ and construct the joint density with $(\omega, s)$ given by

$$
f(\omega, s, u, v, w) \propto 1\{u < \exp(L)\}\, 1\{v < \exp(b_0 \omega_0 \sqrt{1-s})\}\, 1\{w < (1-s)^{-1/2}\} \prod_{i=1}^{p} s_i^{-1/2}\, 1(s \le 1),
$$
where
$$
L = \sum_{i=1}^{p} (a_i s_i + b_i \omega_i \sqrt{s_i}).
$$
The full conditional densities for $u$, $v$ and $w$ are clearly uniform and therefore easy to sample. The full conditional mass function for $\omega_i$, $i = 0, \ldots, p$, is also easy to obtain, with
$$
P(\omega_i = +1 \mid \cdots) = \frac{\exp(b_i \sqrt{s_i})}{\exp(-b_i \sqrt{s_i}) + \exp(b_i \sqrt{s_i})},
$$
noting that $s_0 = 1 - s_1 - \cdots - s_p$. The more complicated full conditional densities to sample are those for $s_i$, $i = 1, \ldots, p$. Without loss of generality we consider the full conditional density for $s_1$; the others follow by switching indices in a cyclic fashion. It is
$$
f(s_1 \mid \cdots) \propto 1\{A_u \cap A_v \cap A_w \cap A\}\, s_1^{-1/2},
$$
where $A_y$ is the set obtained by inverting the inequality involving $y \in \{u, v, w\}$, and
$$
A = \Big\{ 0 < s_1 < 1 - \sum_{2 \le i \le p} s_i \Big\}.
$$

So,
$$
A_w = \{s_1 : (1-s)^{-1/2} > w\} = \Big\{ s_1 > 1 - \sum_{2 \le i \le p} s_i - w^{-2} \Big\}.
$$
Also,
$$
A_u = \{s_1 : u < \exp(L)\} = \{s_1 : a_1 s_1 + b_1 \omega_1 \sqrt{s_1} > d\},
$$

where
$$
d = \log u - \sum_{i=2}^{p} (a_i s_i + b_i \omega_i \sqrt{s_i}).
$$
Therefore, since $a_1 > 0$,
$$
A_u = \{s_1 : l_u < s_1\} \cup \{s_1 : s_1 < t_u\},
$$
where
$$
l_u = \left\{ \sqrt{d/a_1 + (b_1/(2a_1))^2} - b_1 \omega_1/(2a_1) \right\}^2
\qquad \text{and} \qquad
t_u = \left\{ -\sqrt{d/a_1 + (b_1/(2a_1))^2} - b_1 \omega_1/(2a_1) \right\}^2.
$$

If $d/a_1 + (b_1/(2a_1))^2 < 0$ then $A_u = (0, 1)$, and clearly, if $a_1 = 0$ then $A_u = \{s_1 : b_1 \omega_1 \sqrt{s_1} > d\}$. Finally, we have
$$
A_v = \big\{ s_1 : v < \exp(b_0 \omega_0 \sqrt{1-s}) \big\}.
$$
There are two scenarios here:

(i) $b_0 \omega_0 < 0$: in this case we have
$$
A_v^{+} = \Big\{ s_1 > 1 - \sum_{2 \le i \le p} s_i - \big( (-\log v)/b_0 \big)^2 \Big\}.
$$

(ii) $b_0 \omega_0 > 0$: care is needed here. If $v < 1$ then there is no constraint from this. If $v > 1$ then we have
$$
A_v^{-} = \Big\{ s_1 < 1 - \sum_{2 \le i \le p} s_i - \big( (\log v)/b_0 \big)^2 \Big\}.
$$
Given that we will intersect this with the set $A$, we may as well take
$$
A_v^{-} = \Big\{ s_1 < 1 - \sum_{2 \le i \le p} s_i - 1(v > 1) \big( (\log v)/b_0 \big)^2 \Big\}.
$$

Hence, $A_v = A_v^{+}$ if $b_0 \omega_0 < 0$ and $A_v = A_v^{-}$ if $b_0 \omega_0 > 0$. Writing the intersection of the sets as $(\xi_l, \xi_u)$, we then have

$$
f(s_1 \mid \cdots) \propto 1\{s_1 \in (\xi_l, \xi_u)\}\, s_1^{-1/2}.
$$

It is therefore possible to sample this density by taking

$$
s_1 = \left\{ \tau \big( \xi_u^{1/2} - \xi_l^{1/2} \big) + \xi_l^{1/2} \right\}^2,
$$
where $\tau$ is a uniform random variable on the interval $(0, 1)$.
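Putting the pieces together, one full-conditional update of a single $s_i$ can be coded as below. This is our own sketch, not the authors' implementation: the names are hypothetical, all inputs are NumPy arrays, the latent variables $(u, v, w)$ are redrawn immediately before each coordinate update (which leaves the invariant distribution unchanged), $a_i > 0$ is assumed for the coordinate being updated, and the support of the conditional is allowed to be a union of two intervals, from which $s_i$ is drawn by inverting the distribution function of $s^{-1/2}$ exactly as in the display above.

    import numpy as np

    def update_omega(s, b, rng):
        # Draw each omega_i in {-1, +1} from its full conditional given s;
        # s_full prepends s_0 = 1 - sum(s).
        s_full = np.concatenate(([1.0 - s.sum()], s))
        p_plus = 1.0 / (1.0 + np.exp(-2.0 * b * np.sqrt(s_full)))
        return np.where(rng.random(len(b)) < p_plus, 1, -1)

    def update_s_i(i, s, omega, a, b, rng):
        # One slice-sampling update of s_i (i in {1, ..., p}) given everything else.
        j = i - 1                                   # python index into s and a
        S = 1.0 - (s.sum() - s[j])                  # the set A: 0 < s_i < S
        L = np.sum(a * s + b[1:] * omega[1:] * np.sqrt(s))
        # latent slice variables, drawn uniformly given the current state
        u = rng.uniform(0.0, np.exp(L))
        v = rng.uniform(0.0, np.exp(b[0] * omega[0] * np.sqrt(1.0 - s.sum())))
        w = rng.uniform(0.0, (1.0 - s.sum()) ** -0.5)
        # A and A_w give  max(0, S - w^{-2}) < s_i < S
        lo, hi = max(0.0, S - w ** -2), S
        # A_v, following scenarios (i) and (ii) above
        if b[0] * omega[0] < 0:
            lo = max(lo, S - (np.log(v) / (b[0] * omega[0])) ** 2)
        elif b[0] * omega[0] > 0 and v > 1.0:
            hi = min(hi, S - (np.log(v) / (b[0] * omega[0])) ** 2)
        # A_u: a_i s + b_i omega_i sqrt(s) > d, a quadratic in t = sqrt(s_i);
        # a[j] > 0 is assumed (a[j] = 0 reduces to a linear constraint).
        d = np.log(u) - (L - a[j] * s[j] - b[j + 1] * omega[j + 1] * np.sqrt(s[j]))
        disc = d / a[j] + (b[j + 1] / (2.0 * a[j])) ** 2
        pieces = []
        if disc < 0.0:                              # A_u is all of (0, 1)
            pieces.append((lo, hi))
        else:
            shift = -b[j + 1] * omega[j + 1] / (2.0 * a[j])
            t_lo, t_hi = shift - np.sqrt(disc), shift + np.sqrt(disc)
            if t_lo > 0.0:                          # branch sqrt(s_i) < t_lo
                pieces.append((lo, min(hi, t_lo ** 2)))
            # branch sqrt(s_i) > t_hi (no restriction when t_hi <= 0)
            pieces.append((max(lo, t_hi ** 2) if t_hi > 0.0 else lo, hi))
        # sample s_i from the density proportional to s^{-1/2} on the union of
        # pieces: on (x, y) the cdf is proportional to sqrt(s) - sqrt(x)
        pieces = [(x, y) for x, y in pieces if y > x >= 0.0]
        lengths = np.array([np.sqrt(y) - np.sqrt(x) for x, y in pieces])
        x, y = pieces[rng.choice(len(pieces), p=lengths / lengths.sum())]
        tau = rng.uniform()
        s[j] = (tau * (np.sqrt(y) - np.sqrt(x)) + np.sqrt(x)) ** 2
        return s

A full sweep updates $\omega$ and then each $s_i$ in turn; as noted at the start of the section, thinning the resulting chain gives practically independent draws.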

The aim here is to compare the densities obtained for $s_1$ from both approaches; namely, using samples from the Gibbs sampler and using the approximation based on the mixture representation. Marginally, it is clear that $s_1$ will be a mixture of beta distributions, with $f(s_1) \propto g(s_1)$ and
$$
g(s_1) = \sum_{l_1=0}^{\infty} \cdots \sum_{l_p=0}^{\infty} \sum_{m_0=0}^{\infty} \cdots \sum_{m_p=0}^{\infty} w(l, m)\, \mathrm{Be}\Big( s_1;\ l_1 + m_1 + 1/2,\ m_0 + 1/2 + \sum_{i \ne 1} (l_i + m_i + 1/2) \Big).
$$

The normalising constant will be the reciprocal of
$$
\sum_{l_1=0}^{\infty} \cdots \sum_{l_p=0}^{\infty} \sum_{m_0=0}^{\infty} \cdots \sum_{m_p=0}^{\infty} w(l, m).
$$

With the approximation outlined in Section 2, we take $N = 5$ with $p = 3$, and take $a_1 = 0.1$, $a_2 = 0.3$, $a_3 = 2.0$, $b_0 = 0$, $b_1 = 1.0$, $b_2 = 0.9$ and $b_3 = 0.3$. The numerical construction of the marginal density of $s_1$ is given in Figure 1, together with the histogram of a random sample of size $10^5$ from $s_1$ generated using our Gibbs method. Note that it is easy to see that $f(0) = \infty$, due to the terms of the mixture with $l_1 = 0$ and $m_1 = 0$.
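The mixture side of this comparison can be reproduced along the following lines; this is a sketch reusing the hypothetical mixture_weight helper from Section 2, with the truncation of Section 2.1 applied to the beta-mixture marginal of $s_1$ displayed above.

    import itertools
    import numpy as np
    from scipy.stats import beta

    a = np.array([0.1, 0.3, 2.0])
    b = np.array([0.0, 1.0, 0.9, 0.3])           # (b_0, b_1, b_2, b_3)
    p, N = 3, 5

    terms = []                                    # (weight, alpha_1, alpha_2) triples
    for l in itertools.product(range(N), repeat=p):
        for m0 in range(N // 2 + 1):
            for m in itertools.product(range(N // 2 + 1), repeat=p):
                wt = mixture_weight(np.array(l), m0, np.array(m), a, b)
                if wt > 0.0:
                    alpha1 = l[0] + m[0] + 0.5
                    alpha2 = m0 + 0.5 + sum(l[i] + m[i] + 0.5 for i in range(1, p))
                    terms.append((wt, alpha1, alpha2))

    total = sum(t[0] for t in terms)

    def marginal_s1(x):
        # q_N-weighted mixture of beta densities for s_1
        return sum(wt / total * beta.pdf(x, a1, a2) for wt, a1, a2 in terms)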

4 Discussion

The key to this paper is the introduction of the variable $\omega$ into the Fisher–Bingham distribution. It serves two purposes. The first is that we can use it to facilitate a Gibbs sampling approach to sample from the Fisher–Bingham distribution. The second is that it allows a mixture representation of the Fisher–Bingham distribution. For the former, it is clear from equation (1) that each $s_i$ has a non-central chi-squared distribution with 1 degree of freedom, and therefore the simulation method described could easily be adapted to generate samples from convolutions of such distributions constrained to have sum 1. Such a distribution would generalise those considered by Fearnhead and Meligkotsidou (2004). For the latter, the mixture representation of the Fisher–Bingham distribution permits a finite approximation based on an evaluation of the error. Once a finite mixture is constructed, inference becomes quite straightforward, whereas for the Fisher–Bingham distribution the normalising constant does not always have a closed form. Interestingly, as $p$ gets large, the value of $N$ required to get useful approximations decreases, and so the terms in the mixture are manageable. Our mixture representation suggests a number of generalizations for constructing distributions on the sphere. One is to allow the weights to be arbitrary, rather than based on the $w(l, m)$ construction. Another is to assign a prior distribution on $N$. We could also attempt to approximate sampling from the Fisher–Bingham distribution by sampling the weights $w(l, m)$, though this would probably end up being quite messy.

References

Damien, P., Wakefield, J.C. & Walker, S.G. (1999). Gibbs sampling for Bayesian non-conjugate and hierarchical models by using auxiliary variables. J. R. Statist. Soc. B 61, 331–344.

Dryden, I.L. (2005). Statistical analysis on high-dimensional spheres and shape spaces. Ann. Statist. 33, 1643–1665.

De Waal, D.J. (1979). On the normalizing constant for the Bingham-von Mises-Fisher matrix distribution. South African Statist., 13, 103–112.

Fearnhead, P. & Meligkotsidou, L. (2004). Exact filtering for partially observed continuous time models. J. R. Statist. Soc. B 66, 771–790.

Kent, J.T., Constable, P.D.L. & Er, F. (2004). Simulation for the complex Bingham distribution. Statistics and Computing 14, 53–57.

Kent, J.T. (1987). Asymptotic expansions for the Bingham distribution. Appl. Statist. 36, 139–144.

Kume, A. & Walker, S.G. (2006). Sampling from compositional and directional distributions. Statistics and Computing 16, 261–265.

Kume, A. & Wood, A.T.A. (2005). Saddlepoint approximations for the Bingham and Fisher–Bingham normalising constants. Biometrika 92, 465–476.

Titterington, D.M., Smith, A.F.M. & Makov, U.E. (1988). Statistical Analysis of Finite Mixture Distributions. Wiley, New York.

             p = 2            p = 3            p = 4
    c      Ex     Sp        Ex     Sp        Ex      Sp
    3     3.85   3.84      6.57   6.55      8.80    8.78
    4     4.34   4.34      7.70   7.69     10.72   10.70
    5     4.67   4.67      8.47   8.47     12.08   12.07

Table 1: Exact (Ex, correct to 3 decimal places) and saddlepoint (Sp) approximations of the normalising constant for a range of choices of $p$ and $c$, where $a = (1/c, 2/c, \ldots, p/c)$, with $p = 2, 3, 4$ and $c = 3, 4, 5$.

Figure 1: The numerical approximation to the marginal density function of $s_1$ with $N = 5$ (density plotted against $s_1$).