arXiv:1811.01394v3 [math.ST] 17 Aug 2019 pc;tasomto oe;hroi xoeta family. exponential harmonic model; transformation space; 22F30. 20G05, ..Peiiayt eto 14 12 10 11 10 13 References 4 Section plane to 8 Acknowledgement half Preliminary upper the 5.1. on Appendix distributions of family 5. A distribution 4.8.1. Hyperboloid 2 distribution 4.8. 7 Fisher– Mises–Fisher 4.7. Von distribution Mises 4.6. Von distributions 4.5. distributions Wishart gamma 3 5 inverse and 4.4. distribution Gamma categorical and distribution distribution 4.3. normal Bernoulli multivariate and distribution 4.2. Normal 4.1. Examples family 4. Exponential families exponential method 3.2. good Our construct to method 3.1. A families exponential 3. Harmonic model 2.2. Transformation work 2.1. Related 2. Introduction 1. e od n phrases. and words Key 2010 Date EHDT OSRC EXPONENTIAL CONSTRUCT TO METHOD A uut2,2019. 20, August : enul,ctgrcl ihr,vnMss ihrBnhmand gamma, Fisher–Bingham normal, Mises, distributions. as von hyperboloid such Wishart, families categorical, exponential Bernoulli, used widely erates rcsl,w osdrahmgnosspace homogeneous a Mor consider we theory. precisely, representation by systematically families exponential Abstract. n osrc nepnnilfml nain ne h transfor the under invariant group family mation exponential an construct and AIISB ERNEAINTHEORY REPRENSETATION BY FAMILIES ahmtc ujc Classification. Subject Mathematics OCITJ N AOYOSHINO TARO AND TOJO KOICHI nti ae,w ieamto ocntut“good” construct to method a give we paper, this In G yuigarpeetto of representation a using by xoeta aiy ersnainter;homogeneous theory; representation family; exponential Contents 1 rmr 21;Scnay62H11, Secondary 62H10; Primary G/H G h ehdgen- method The . sasml space sample a as e - 15 14 14 9 5 4 3 2 2 2 2 KOICHITOJOANDTAROYOSHINO

1. Introduction Exponential families are important subjects of study in the field of information geometry and are often used for Bayes inference because they have conjugate priors, which have been well studied [DY79]. By definition, there are infinitely many exponential families, however, only a small part of them are widely used such as Bernoulli, categorical, normal, multivariate normal, gamma, inverse gamma, von Mises, von Mises–Fisher, Fisher–Bingham, Wishart and hyperboloid distributions, which have been investigated individually. To understand them com- prehensively, we consider a method to construct good exponential fam- ilies systematically. Here we want the method to satisfy the following two advantages: (i) The method can generate many well-known good distributions. (ii) Distributions obtained by the method are limited in principle. In this paper, we introduce a method satisfying them. In our method, representation theory plays an important role. Actually, most of useful distributions have the same symmetry as sample spaces. More pre- cisely, the can be regarded as a homogeneous space G/H and the family of distributions is G-invariant. In Section 3.1, we sug- gest a method to construct exponential families invariant under the transformation groups G systematically. In Section 4, by the method, we realize examples as mentioned above.

2. Related work 2.1. Transformation model. A Transformation model is one of para- metric models in , which enables us to understand many fam- ilies of probability measures on sample spaces comprehensively and have good properties on statistics [BNBE89]. However, by the defi- nition of transformation models, the overlap between transformation models and exponential families is too small not to include many im- portant families of probability measures on sample spaces which admit group actions (see Section 1.3 in [BN78a]). Our method clear up the problem by using representation theory. In fact, our method generates many useful exponential families invariant under the transformation group (see Table 1).

2.2. Harmonic exponential families. Our method is a generaliza- tion of harmonic exponential families suggested in [CW15]. They stud- ied densities on a compact Lie group G or its homogeneous spaces G/H such as torus and sphere by using unitary representations of G. As mentioned in Section 7 (“Discussion and Future Work”) in the paper, they wished an extension to noncompact groups. In the case where G is compact, there is a unique invariant measure on G or G/H up to A METHOD TO CONSTRUCT EXPONENTIAL FAMILIES 3 positive constant. On the other hand, in the case where G is noncom- pact, it does not hold. For example, let G be an affine transformation group R× ⋉R and H the closed subgroup R×, then G/H R admits no G-invariant measures. However, G/H admits relatively≃ invariant measures. Therefore, we take relatively invariant measures on G/H instead of the invariant measure. Since a relatively invariant measure on G/H is not unique, we have a lot of choice of relatively invariant measures. Our method does not choose one of them but take “all” of them (see Subsection 3.1).

3. A method to construct good exponential families In this section, we introduce a method to construct a family of distri- butions on a homogeneous space G/H by representation theory. More- over, we show that the families obtained by our method are exponential families.

3.1. Our method. Our method generates a family dpθ θ Θ of prob- ability measures from three inputs as follows: { } ∈ (i) Let G be a Lie group and H a closed subgroup of G. We put X := G/H, which is regarded as a sample space. (ii) Let V be a finite dimensional real representation of G. (iii) Let v0 be an H-fixed vector in V . We put Ω (G,H) := χ : G R χ is a continuous group homomorphism and χ =1 . 0 { → >0 | |H } We assume that there exist relatively G-invariant measures on G/H and take one µ of them. (The family below does not depend on the choice of µ (see Remark 3.1)). Then weP set

dp˜ (x) := dp˜ (x) = exp( ϕ, x v )χ(x)dµ(x) (ϕ V ∨, x = gH), θ ϕ,χ −h · 0i ∈

Θ := θ =(ϕ, χ) V ∨ Ω0(G,H) dp˜θ < , { ∈ × | X ∞} 1 Z − c := dp˜ (θ Θ). θ θ ∈ ZX  Here we denote the dual space of V by V ∨. From the conditions that v0 is H-fixed and χ = 1, the notations x v and χ(x) are well-defined. |H · 0 Then we get a probability measure dpθ on X as follows: dp = c dp˜ (θ Θ). θ θ θ ∈ As a result we obtain the following family of distributions on X:

:= dpθ θ Θ. P { } ∈ Remark 3.1. The above family does not depend on the choice of a relatively invariant measure µ onPX. In fact, for relatively invariant 4 KOICHITOJOANDTAROYOSHINO

dµ1 measures µ1,µ2, there exist c R>0 such that c Ω0(G,H). This ∈ dµ2 ∈ follows from Fact 3.2 below. Fact 3.2 ([BNBE89, Corollary 4.1]). Suppose G acts transitively on X. Then there exists a relatively invariant measure µ on X with multiplier f : G R if and only if → >0 (1) f(h) = ∆ (h)/∆ (h) (h H). H G ∈ Here we denote modular functions of G and H by ∆G and ∆H respec- tively. This measure is unique up to a multiplicative constant. 3.2. . In this subsection, we see that families ob- tained by our method are exponential families. There are several defi- nitions of an exponential family. We adopt the following Definition 3.3 and show that families of distributions obtained by our method are exponential families in this sense. Let X be a manifold and (X) the set of all Radon measures on X. R Definition 3.3 (See Section 5 in [BN70] for example). Let be a non- empty subset of (X) consisting of probability measures. P is said to be an exponentialR family on X if there exists a triple (µ,V,TP) satisfying the following four conditions: (i) µ (X), (ii) V ∈is R a finite dimensional real vector space, (iii) T : X V , x T is a continuous map, → 7→ x (iv) for any p , there exists θ V ∨ such that ∈ P ∈ θ,Tx dp(x)= cθe−h idµ(x),

where V ∨ is the dual space of V , and 1 − θ,Tx cθ := e−h idµ(x) . ZX  Proposition 3.4. A family := dpθ θ Θ obtained by the method in Section 3.1 is an exponentialP family{ if Θ} ∈is not empty. To show this proposition, we need: Fact 3.5. Let G be a Lie group with finitely many connected compo- nents. Then the following set is a finite dimensional vector space of C(G): f C(G) f : G R is a group homomorphism . { ∈ | → } Proof of Proposition 3.4. From Fact 3.5, W = log χ χ Ω (G,H) { | ∈ 0 } is a finite dimensional real vector space. Therefore V˜ := V W ∨ is also a finite dimensional vector space. Then we define T : X ⊕ V˜ by → T (x) := (x v , ev ), · 0 x A METHOD TO CONSTRUCT EXPONENTIAL FAMILIES 5 where evx : W R is the evaluation map at x X. We regard V˜ as V of the condition→ (ii) in Definition 3.3. Then, the∈ other conditions in Definition 3.3 are satisfied. 

4. Examples In this section, we realize widely used exponential families in Table 1 by our method.

Table 1: Examples and inputs (G,H,V,v0) for them

distribution sample space G H V v0 Bernoulli 1 1 1 Rsgn 1 {± } {± } { } Categorical 1, ,n Sn Sn−1 W w { ··· } × × Normal R R ⋉R R Sym(2, R) E22 n n Multivariate normal R GL(n, R) ⋉R GL(n, R) Sym(n + 1, R) En+1,n+1 Gamma R>0 R>0 1 R1 1 { } Inverse gamma R>0 R>0 1 R−1 1 + { } Wishart Sym (n, R) GL(n, R) O(n) Sym(n, R) In 1 2 Von Mises S SO(2) I2 R e1 n−1 { } n Von Mises–Fisher S SO(n) SO(n 1) R e1 n−1 − n Fisher–Bingham S SO(n) SO(n 1) R Sym(n, R) e1 E11 n − ⊕ n+1 ⊕ Hyperboloid H SO0(1,n) SO(n) R e0 Poincar´e SL(2, R) SO(2) Sym(2, R) I2 H n n Here W = (x1, ,xn) R xi = 0 , w = ( (n 1), 1, , 1) W . { ··· ∈ | Pi=1 } − − ··· ∈

4.1. and multivariate normal distribution. In this subsection, we realize normal distribution on R and multivari- ate normal distribution on Rn by using a representation of an affine transformation group.

Proposition 4.1. Put G = R× ⋉R and H = R× G. Let ρ : G GL(Sym(2, R)) be a representation given as follows:⊂ → t a b a b ρ(g)(S) := S (g =(a, b) R× ⋉R,S Sym(2, R)). 0 1 0 1 ∈ ∈     0 0 Then v := Sym(2, R) is H-fixed, and by our method we 0 0 1 ∈ obtain normal distribution as follows:

1 (x µ)2 (2) exp( − 2 )dx (σ,µ) R>0 R. { √2πσ2 − 2σ } ∈ ×

Proof. We identify G/H =(R× ⋉R)/R× with R by the following map: G/H R, gH b (g =(a, b) G) . → 7→ ∈ From [G,G]= R and Proposition 5.1 (iii), We have Ω (G,H)= 1 . 0 { } We identify Sym(2, R)∨ with Sym(2, R) by taking the following inner product on Sym(2, R): x, y = Tr(xy) (x, y Sym(2, R)). h i ∈ 6 KOICHITOJOANDTAROYOSHINO

Let dx denote Lebesgue measure on R, which is relatively G-invariant. Then we have

θ1 θ2 dp˜θ1,θ2,θ3 (x) = exp( Tr( ρ(g)v0))dx − θ2 θ3   2 = exp( (θ x +2θ x + θ ))dx (g =(a, x) R× ⋉R). − 1 2 3 ∈ The following equalities are well-known:

θ1 θ2 R Θ= Sym(2, ) dp˜θ1,θ2,θ3 < { θ2 θ3 ∈ | R ∞}   Z θ1 θ2 = Sym(2, R) θ1 > 0 , { θ2 θ3 ∈ | }   θ θ θ θ2 c = 1 exp 1 3 − 2 . θ π θ r  1  Then we obtain normal distribution (2) by a change of variables µ = θ2 and σ = 1 .  − θ1 √2θ1 By generalizing the above case, we obtain the following: Proposition 4.2. Put G = GL(n, R) ⋉Rn, H = GL(n, R) G. Let ρ : G GL(Sym(n +1, R)) be a representation given as follows:⊂ → A b t A b ρ(g)(S) := S (g =(A, b) GL(n, R)⋉Rn,S Sym(n+1, R)). 0 1 0 1 ∈ ∈     Then v0 = En+1,n+1 Sym(n +1, R) is H-fixed, and by our method we obtain multivariate normal∈ distribution as follows:

1 1 t 1 (3) exp( (x µ)Σ− (x µ)) , (2π)n det Σ − 2 − − ( )(Σ,µ) Sym+(n,R) Rn ∈ × where wep denotes the set of positive definite symmetric matrices by Sym+(n, R). Proof. We identify G/H = GL(n, R) ⋉Rn/GL(n, R) with Rn by the following map: G/H Rn, (A, b)H b. → 7→ From [G,G] = Rn and Proposition 5.1 (iii), we have Ω (G,H) = 1 . 0 { } We identify Sym(n +1, R)∨ with Sym(n +1, R) by taking the following inner product on Sym(n +1, R): x, y = Tr(xy) (x, y Sym(n +1, R)). h i ∈ Let dx denote Lebesgue measure on Rn. Then we have t θ1 θ2 x x x dp˜θ1,θ2,θ3 (x) = exp( Tr( t t ))dx − θ2 θ3 x 1     = exp( (txθ x +2tθ x + θ ))dx (θ Sym(n, R), θ Rn, θ R). − 1 2 3 1 ∈ 2 ∈ 3 ∈ A METHOD TO CONSTRUCT EXPONENTIAL FAMILIES 7

The following equalities are well-known: R Rn R Θ= (θ1, θ2, θ3) Sym(n, ) dp˜θ1,θ2,θ3 < { ∈ × × | Rn ∞} Z = (θ , θ , θ ) Sym(n, R) Rn R θ is positive definite. , { 1 2 3 ∈ × × | 1 } θ1 θ2 det t det θ θ2 θ3 1   cθ = n exp   . r π det θ1     Then we obtain multivariate normal distribution (3) by a change of 1 1 1 variables µ = θ− θ and Σ = θ− .  − 1 2 2 1 4.2. and categorical distribution. In this subsection, we realize Bernoulli distribution on 1 and categorical distribution on 1, , n by using a signature representation{± } of 1 and a natural representation{ ··· } of the symmetric group respectively.{± }

Proposition 4.3. Put G = 1 and H = 1 . Let ρ : 1 R× {± } { } {± } → be the signature representation. Then v0 = 1 is H-fixed, and by our method, we obtain Bernoulli distribution as follows: 1+x 1−x (4) s 2 (1 s) 2 s (0,1) (x 1, 1 ). { − } ∈ ∈ {− } Proof. Since G is compact, we have Ω0(G,H) = 1 from Proposi- tion 5.1 (i). Therefore we have { }

θ θx e− (x = 1) dp˜ (x)= e− dx = , θ eθ (x = 1) ( − where dx is the counting measure on G/H = 1 . This is integrable R R {± } 1 for any θ , so we have Θ = θ and cθ = e−θ+eθ . Therefore we get ∈ { ∈ } e−θ e−θ+eθ (x = 1) dpθ(x)= eθ . − (x = 1) ( e θ+eθ − e−θ Thus by a change of variable s = e−θ+eθ , we obtain Bernoulli distribu- tion (4).  By generalizing the above case, we obtain the following: Proposition 4.4. Let n be a positive number with n 2, G the sym- ≥ metric group Sn and H the stabilizer Sn 1 = (Sn)1 of 1 in G. Let ρ : G GL(W ) be a subrepresentation of− the natural representation G GL→ (Rn), where W = (x , , x ) Rn n x = 0 . Then → { 1 ··· n ∈ | i=1 i } v0 =( (n 1), 1, , 1) W is H-fixed, and by our method we obtain categorical− − distribution··· on∈ 1, , n as follows: P { ··· } δ1,x δ2,x δn,x n (5) s1 s2 sn (s1, ,sn) si>0, si=1 (x 1, 2, , n ), { ··· } ··· ∈{ Pi=1 } ∈{ ··· } where δi,x denotes the . 8 KOICHITOJOANDTAROYOSHINO

Proof. Since H = Sn 1 = (Sn)1 is the stabilizer of the first element, − v0 = ( (n 1), 1, , 1) is H-fixed. We have Ω0(G,H) = 1 from Proposition− − 5.1 (i).··· By taking an inner product on W which{ } is the n restriction of the standard inner product on R , we identify W ∨ with W . Then we have

dp˜a1, ,an (k)= dp˜a1, ,an ((1,k)H) ··· ··· t (a1, ,an)ρ((1,k)) ( (n 1),1, ,1) = e− ··· − − ··· = enak ((1,k) S ,k =1, , n). ∈ n ···

Since this is integrable for any (a1, , an) W , we have Θ = (a1, , an) 1 ··· ∈ enak { ··· ∈ W and cθ = n nai . By a change of variables sk = n nai , we ob- } i=1 e i=1 e tain categoricalP distribution (5). P 

4.3. Gamma and inverse gamma distributions. In this subsec- tion, we realize gamma and inverse gamma distributions on R>0 by using one dimensional representations of R>0. Proposition 4.5. Put G = R and H = 1 . For λ R, we denote >0 { } ∈ by ρ : G R× a representation as follows: λ → λ ρλ(x)= x .

Then v0 := 1 R is H-fixed, and by our method we obtain the following family of distributions∈

kλ λ x ( x )λ dx (6) | | e− θ . Γ(k) θ x (k,θ) R>0 R>0     ∈ × Especially, we obtain if λ =1 and inverse gamma distribution if λ = 1 as follows: − 1 x k x dx (7) e− θ (λ = 1), Γ(k) θ x (k,θ) R>0 R>0     ∈ × k 1 θ θ dx (8) e− x (λ = 1). (Γ(k) x x ) −   (k,θ) R>0 R>0 ∈ × Remark 4.6. By putting k =1 in the case of λ =1, we obtain .• n Z By putting k = 2 (n >0) and θ = 2 in the case of λ = 1, • we obtain Chi-squared∈ distribution. In the case of λ =2 and k =1, we obtain . • In the case of k = 1, we obtain with a • shape parameter λ> 0. 1 In the case of λ = 1 and k = 2 , we obtain the unshifted L´evy • distribution. − A METHOD TO CONSTRUCT EXPONENTIAL FAMILIES 9

Proof of Proposition 4.5. From the continuity of χ Ω0(G,H), we get α ∈ Ω0(G,H) = χ : x x α R . We identify R∨ with R by taking the standard{ inner product7→ | on∈R.} Then we have dx dp˜ (x) = exp( βxλ)xα . α,β − x dx R Here, x is the Haar measure on >0. Claim. Θ := (α, β) R R dp˜ < = (α, β) R R>0 α,β { ∈ × | 1 ∞} { ∈ × R β > 0,αλ > 0 . Moreover the value c− of the integral dp˜ α,β R>0 α,β | α } R Γ( λ ) is α . λ β λ R | | This claim can be proved by a direct calculation, so we omit the proof. As a result, we get α λ λ β α βxλ dx dpα,β(x)= | | α x e− . Γ( λ ) x 1 α λ  By a change of variables k = λ , θ = β− , we obtain (6). 4.4. Wishart distributions. In this subsection, we realize on the set of positive definite symmetric matrices Sym+(n, R) by using a representation of GL(n, R). The case n = 1 is essentially same as the previous subsection. Proposition 4.7. Put G = GL(n, R) and H = O(n). Let ρ : G GL(Sym(n, R)) be a representation as follows: → ρ(g)S = gStg (S Sym(n, R)). ∈ Then In is H-fixed, and by our method we obtain Wishart distribution on X = G/H Sym+(n, R) as follows: ≃ (9) α (det y) n+1 α 2 − exp( Tr(yx))(det x) − dx . n(n 1) n k 1 4 − − (π k=1 Γ(α −2 ) ) (y,α) Sym+(n,R) R α> n 1 − { ∈ × | 2 } Proof of PropositionQ 4.7. Claim. Ω (G,H)= g det g t t R . 0 { 7→ | | | ∈ } This claim follows from G/[G,G] R×, Remark 5.2 and Fact 5.3. dx≃ + We take an G-invariant measure n+1 on Sym (n, R) and identify (det x) 2 Sym(n, R)∨ with Sym(n, R) by the following inner product: x, y = Tr(xy) (x, y Sym(n, R)). h i ∈ Then we have

α dx dp˜y,α(x) = exp( Tr(yx))(det x) n+1 . − (det x) 2 10 KOICHITOJOANDTAROYOSHINO

Here we identify GL(n, R)/O(n) with Sym+(n, R) by gO(n) x = t 7→ gIn g. To normalize the above measure, we use Fact 4.8. As a result, we obtain Wishart distribution (9).  Fact 4.8 ([W28, In33, S35], see also [I14] for example). Θ= (y,α) + n 1 { ∈ Sym(n, R) R y Sym (n, R),α> − . × | ∈ 2 } n dx n(n−1) α 4 k 1 α exp( Tr(yx))(det x) n+1 = π Γ(α −2 )(det y)− . x Sym+(n,R) − (det x) 2 − Z ∈ kY=1 4.5. . In this subsection, we realize von Mises distribution on the circle S1 by using the natural representation of SO(2).

2 Proposition 4.9. Put G = SO(2) and H = I2 . Let ρ : G GL(R ) { } 2 → be the natural representation. Then v0 := e1 R is H-fixed, and by our method we obtain von Mises distribution on∈ S1 as follows:

1 κ cos(x µ) (10) e− − dx . 2πI (κ) 0 (κ,µ) R>0 R/2πZ   ∈ × 1 Proof. We use the identification G/H = SO(2)/ I2 S R/2πZ. { } ≃ ≃ 2 From Proposition 5.1 (i), we have Ω0(G,H)= 1 . We identify (R )∨ with R2 by taking the standard inner product on{ }R2. Then we have (a cos x+b sin x) √a2+b2 cos(x α) dp˜a,b(x)= e− dx = e− − dx, where dx is the uniform measure, which is the Haar measure on SO(2), and α R/2πZ satisfies cos α = a , sin α = b . Since S1 is ∈ √a2+b2 √a2+b2 2 1 2 2 compact, we have Θ = θ = (a, b) R , and c− = 2πI (√a + b ). { ∈ } θ 0 Here Im(r) is the modified Bessel function of the first kind of order m given as follows: 1 2π (11) I (r)= cos(mx)er cos xdx (m Z,r R). m 2π ∈ ∈ Z0 Therefore by a change of variables κ = √a2 + b2, µ = α, we obtain von Mises distribution (10).  4.6. Von Mises–Fisher distribution. In this subsection, we realize n 1 von Mises–Fisher distribution on the sphere S − by using the natural representation of SO(n). Proposition 4.10. Put G = SO(n) and H = SO(n 1). Let ρ : G n − n → GL(R ) be the natural representation. Then v0 := e1 R is H-fixed, ∈ n 1 and by our method we obtain von Mises–Fisher distribution on S − as follows:

1 t (12) n 1 n exp( µx)dx . (2π) 2 µ − 2 I n 1( µ ) − ( 2 − )µ Rn k k k k ∈ A METHOD TO CONSTRUCT EXPONENTIAL FAMILIES 11

n 1 Proof. We use the identification SO(n)/SO(n 1) S − , gSO(n − ≃ − 1) ge1. Since G is compact, from Proposition 5.1 (i), we obtain 7→ n n Ω0(G,H) = 1 . We identify (R )∨ with R by taking the standard inner product{ on} Rn. Then we have dp˜ (x) = exp( tµx)dx, µ − where dx is the uniform measure, which is G-invariant measure on n 1 n 1 n S − . Since S − is compact, we have Θ = θ = µ R . Therefore we obtain von Mises–Fisher distribution from{ Fact 4.11.∈ }  Fact 4.11 ([BNBE89, Example 8.3]). For µ Rn, ∈ t n 1 n exp( µx)dx = (2π) 2 µ − 2 I n ( µ ). 2 1 n−1 − k k − k k ZS n 1 Here dx is the SO(n)-invariant measure on S − and Im(r) denotes the modified Bessel function of the first kind of order m (see (11)). 4.7. Fisher–Bingham distribution. In this subsection, we realize n 1 Fisher–Bingham distribution on the sphere S − by using a represen- tation of SO(n). 1 Proposition 4.12. Put G = SO(n) and H = SO(n 1) = k − { k | ∈ SO(n 1) G. Let ρ : G GL(Rn Sym(n, R)) be a representation  as follows:− } ⊂ → ⊕ ρ(g)(v,S) := (gv,gStg) ((v,S) Rn Sym(n, R)). ∈ ⊕ n Then v0 := (e1, E11) R Sym(n, R) is H-fixed, and by our method we obtain Fisher–Bingham∈ ⊕ distribution as follows: t t (13) cµ,A exp( µx xAx) (µ,A) Rn Sym(n,R), { − − } ∈ ⊕ where cµ,A is the corresponding . Remark 4.13. We obtain as a subfamily of Fisher– Bingham distribution if we assume the constraint Aµ =0.

n 1 Proof. We use the identification SO(n)/SO(n 1) S − , gSO(n − ≃ − 1) ge1. Since G = SO(n) is compact, from Proposition 5.1 (i), we 7→ n n obtain Ω0(G,H) = 1 . We identify (R Sym(n, R))∨ with R Sym(n, R) by taking{ the} direct sum of the⊕ standard inner product on⊕ Rn and the following inner product on Sym(n, R): x, y = Tr(xy) (x, y Sym(n, R)). h i ∈ n 1 Then we have for x = ge S − (g SO(n)) 1 ∈ ∈ dp˜ (x) = exp( ((µ, A), ρ(g)(e , E )))dx (µ,A) − 1 11 = exp( tµx Tr(Axtx))dx − − = exp( tµx txAx)dx. − − 12 KOICHITOJOANDTAROYOSHINO

n 1 n Since S − = G/H is compact, we have Θ = R Sym(n, R). Therefore we obtain Fisher–Bingham distribution (13). ⊕  4.8. Hyperboloid distribution. In this subsection, we realize hyper- boloid distribution [J81] on the n-dimensional hyperbolic space Hn (see Definition 4.16) by using the natural representation of SO0(1, n).

Proposition 4.14. Let n be a positive integer and put G = SO0(1, n) and H = SO(n). Let ρ : G GL(Rn+1) be the natural representation. → n+1 We denote by e0, , en the standard basis of R . Then v0 := e0 Rn+1 is H-fixed, and··· by our method we obtain hyperboloid distribution∈ on the n-dimensional hyperbolic space Hn as follows: n n (14) cκ exp(κ ξ, v )dµ(v) (κ,ξ) R>0 H (v H ), { h i } ∈ × ∈ where cκ is the normalizing constant written as follows: n−1 κ 2 (15) cκ = n−1 . (2π) 2 2K n−1 (κ) 2

Here Kν is the modified Bessel function of the second kind and with index ν. Remark 4.15 (see Section 7 in [BN78b], see also [J81]). In [J81], he denotes by Hn the (n 1)-dimensional hyperbolic space. − In the case n =1, we have c = 1 . κ 2K0(κ) • κ In the case n =2, we have c = κe . • κ 2π Definition 4.16 (hyperbolic space). Let n be a positive integer. The following manifold Hn with the metric which is the restriction to Hn 2 2 2 Rn+1 of dx0 + dx1 + + dxn on is called n-dimensional hyperbolic space.− ··· Hn = x Rn+1 x > 0, x, x = 1 . { ∈ | 0 h i − } Here , is a pseudo inner product on Rn+1 given as follows: h· ·i (16) x, y := x y + x y + + x y . h i − 0 0 1 1 ··· n n A Lie group SO0(1, n), which is the identity component of SO(1, n), acts transitively on the n-dimensional hyperbolic space Hn. Since the stabilizer of e Hd is SO(n), we identify Hn with the homogeneous 1 ∈ space SO0(1, n)/SO(n) as follows: (17) SO (1, n)/SO(n) Hn, 0 → (18) gSO(n) g e . 7→ · 0 n+1 n+1 Proof of Proposition 4.14. We identify (R )∨ with R by the pseudo inner product (16). We have Ω0(G,H)= 1 from Proposition 5.1 (ii). Take a G-invariant measure µ on Hn (see{ section} 2C in [J81] for the realization). Then we have dp˜ (v) = exp( y, v )dµ(v) (y Rn+1, v = ge Hn). y −h i ∈ 0 ∈ A METHOD TO CONSTRUCT EXPONENTIAL FAMILIES 13

Fact 4.17 ([BN78b, J81]). Θ= y Rn+1 y < 0, y,y < 0 . { ∈ | 0 h i } n Since there is a natural bijection R>0 H Θ, (κ, ξ) κξ, we use R Hn as a parameter space. × → 7→ − >0 × Fact 4.18 (Section 7 in [BN78b]). The normalizing constant of exp( y, v )dµ(v)= exp(κ ξ, v )dµ(v) is given as (15). −h i h i Therefore, we obtain hyperboloid distribution (14).  4.8.1. A family of distributions on the upper half plane. In this subsub- section, via the diffeomorphism between H2 and the upper half plane , we realize a corresponding distribution on the upper half plane to theH above hyperboloid distribution for n = 2. Since the exponential family on the upper half plane is compatible with the Poincar´emetric, in this paper, we call it Poincar´edistribution, which is given as follows: D e2D a(x2 + y2)+2bx + c dxdy (19) exp , 2 a b π − y y   Sym+(2,R)     ∈ b c where D = √ac b2. First, let us recall− the definition of the upper half plane. Definition 4.19 (upper half plane). The following manifold with dx2+dy2 H Poincar´emetric y2 is called upper half plane: = x + iy C y > 0 . H { ∈ | } A Lie group SL(2, R) acts transitively on the upper half plane as a M¨obius transformation as follows: H az + b a b g z := (g = SL(2, R)). · cz + d c d ∈   Since the stabilizer of i is SO(2), we identify with the homogeneous space SL(2, R)/SO(2) as follows: H (20) SL(2, R)/SO(2), H↔ √y x (21) z = x + iy √y SO(2), 0 1 7→ √y ! (22) g i [ gSO(2). · ← Moreover, we use the following diffeomorphism: (23) SL(2, R)/SO(2) SO (1, 2)/SO(2), → 0 which is induced by the following 2 : 1 group homomorphism: (24) SL(2, R) SO (1, 2),g Ad(g). → 0 7→ Here since SL(2, R) is connected and the adjoint representation Ad : sl R 1 G GL( (2, )) preserves the pseudo inner product 2 Tr(xy) with signature→ (2, 1) on sl(2, R), the above map is well-defined. We use a 14 KOICHITOJOANDTAROYOSHINO

0 1 0 1 1 0 basis f1 = − , f2 = , f3 = . Since SO(2) 1 0 1 0 0 1 ⊂      −  1 SL(2, R) stabilize f , we have Ad(SO(2)) = SO (1, 2) g 1 { g ∈ 0 | ∈ SO(2) . From the composition of (20), (23) and (17), we have the following} correspondence: (25) H2, H→ 1+ x2 + y2 v 1 0 (26) x + iy 1 (x2 + y2) = v . 7→ 2y  −   1 2x v2

ξ0+ξ1 ξ2   ξ0 ξ1 Therefore, by putting a = 2π κ, b = 2 κ and c = −2 κ, we obtain κeκ 2 Poincar´edistribution (19) from exp(κ( ξ0v0+ξ1v1+ξ2v2))dµ(v) (κ,ξ) R>0 H . { 2π − } ∈ × 5. Appendix 5.1. Preliminary to Section 4. In this subsection, we prove Propo- sition 5.1, which are used to determine Ω0(G,H) for pairs (G,H) in Section 4. Proposition 5.1. Let G be a Lie group and H a closed subgroup of G. Then we have Ω0(G,H)= 1 if at least one of the following conditions is satisfied: { } (i) G is compact. (ii) G is connected and semisimple. (iii) π(G)= π(H). Here π : G G/[G,G] is the quotient map and [G,G] is the commu- tator subgroup→ of G.

Proof. (i) This follows from the continuity of χ Ω0(G,H). (ii) This follows from [G,G]= G and Remark 5.2.∈ (iii) Let χ be an element of Ω0(G,H) and g an element of G. From Remark 5.2, there existsχ ˜ : G/[G,G] R>0 such that χ = χ˜ π, and h H such that π(g)= π(h)→ from the assumption. Then◦ χ(g)=∈χ ˜ π(g)=χ ˜ π(h)= χ(h) = 1. ◦ ◦ 

Remark 5.2. Let G be a group and χ : G R>0 a group homomor- phism. Then χ factors G/[G,G]. →

Fact 5.3. R× R>0 continuous group homomorphism = x x λ λ R{ . → } { 7→ | | | ∈ } Acknowledgement The authors would like to thank Professor Hideyuki Ishi, Kenichi Bannai, Kei Kobayashi and Shintaro Hashimoto for valuable comments. A METHOD TO CONSTRUCT EXPONENTIAL FAMILIES 15

The authors would also like to thank Atsushi Suzuki, Kenta Oono and Kentaro Minami for helpful comments and discussion. References [BN69] O. E. Barndorff–Nielsen, L´evy homeomorphic parametrization and ex- ponential families, Z. Wahrscheinlichkeitstheorie verw. Geb. 12 (1969), 56–58. [BN70] O. E. Barndorff–Nielsen, Exponential families-exact theory, Various Publication Series No. 19. Aarhus Universitet, Aarhus, (1970). [BN78a] O. E. Barndorff–Nielsen, Information and exponential families in statistical theory Chichester: Wiley, (1978). [BN78b] O. E. Barndorff–Nielsen, Hyperbolic distributions and distribution on hyperbolae, Scand. J. Statist. 8 (1978), 151–157. [BNBJJ82] O. E. Barndorff–Nielsen, P. Blaæsild, J. L. Jensen, B. Jørgensen, Exponential transformation models, Proc. R. Soc. Lond. A 379 (1982), 41–65. [BNBE89] O. E. Barndorff–Nielsen, P. Blaæsild, P. S. Eriksen, Decompo- sition and invariance of measures, and statistical transformation models, New York: Springer Verlag, (1989). [CW15] T. S. Cohen, M. Welling, Harmonic exponential families on manifolds, In Proceedings of the 32nd International Conference on (ICML), volume 37 (2015), 1757–1765. [DY79] P. Diaconis, F. Ylvisoker, Conjugate priors for exponential families, The Annals of Statistics, Vol.7, No.2 (1979), 269–281. [HG13] K. Hornik, B. Grun¨ , On conjugate families and Jeffreys priors for von Mises–Fisher distributions, Jour. of Stat. Plan. and Inf. 143 (2013), 992–999. [In33] A. E. Ingham, An integral which occurs in statistics, Proc. Cambr. Philos. Soc. 29 (1933), 271–276. [I14] H. Ishi, Homogeneous cones and their application to statistics. Modern meth- ods of multivariate statistics, Lecture notes of the CIMPA-FECYT-UNESCO- ANR Workshop held in Hammamet, September 2011. (eds. by P. Graczyk and A. Hassairi), Travaux en Cours, vol. 82, Hermann Editeurs,´ Paris, (2014). [J81] J. L. Jensen, On the hyperboloid distribution, Scand. J. Statist. 8 (1981), 193–206. [KW09] K.Kume, S. G. Walker, On the Fisher–Bginham distribution, Stat. Comput. 19 (2009), 167–172. [RAW90] R. A. Wijsman, 7. Invariant and relatively invariant measures on locally compact groups and spaces. Invariant measures on groups and their use in statistics, Inst. of Math. Stat., Hayward, CA, (1990), 119–150. [S35] G. L. Siegel Uer¨ die analytische tehorie der quadratische Formen, Ann. of Math. 36 (1935), 527–606. [W28] J. Wishart, The generalized product monent distribution in samples from a normal multivariate population, Biometrika 20A (1928), 32–52.

Graduate School of Mathematical Science, The University of Tokyo,, 3-8-1 Komaba, Meguro-ku, Tokyo 153-8914, Japan E-mail address: [email protected] Graduate School of Mathematical Science, The University of Tokyo,, 3-8-1 Komaba, Meguro-ku, Tokyo 153-8914, Japan E-mail address: [email protected]