arXiv:2104.05510v2 [math.PR] 8 Aug 2021

Duality for real and multivariate exponential families

Gérard Letac
Institut de Mathématiques de Toulouse, 118 route de Narbonne, 31062 Toulouse, France.
Corresponding author. Email address: gerard.letac@math.univ-toulouse.fr

Preprint submitted to Journal of Multivariate Analysis, August 10, 2021

Abstract

Consider a measure $\mu$ on $\mathbb{R}^n$ generating a natural exponential family $F(\mu)$ with variance function $V_{F(\mu)}(m)$ and Laplace transform
\[ \exp(\ell_\mu(s)) = \int_{\mathbb{R}^n} \exp(-\langle x, s\rangle)\,\mu(dx). \]
A dual measure $\mu^*$ satisfies $-\ell'_{\mu^*}(-\ell'_\mu(s)) = s$. Such a dual measure does not always exist. One important property is $\ell''_{\mu^*}(m) = (V_{F(\mu)}(m))^{-1}$, leading to the notion of duality among exponential families (or rather among the extended notion of T exponential families $TF$ obtained by considering all translations of a given exponential family $F$).

Keywords: Dilogarithm distribution, Landau distribution, large deviations, quadratic and cubic real exponential families, Tweedie scale, Wishart distributions.
2020 MSC: Primary 62H05, Secondary 60E10

1. Introduction

One can be surprised by the explicit formulas that one gets from large deviations in one dimension. Consider iid random variables $X_1, \ldots, X_n, \ldots$ and the empirical mean $\bar X_n = (X_1 + \cdots + X_n)/n$. Then the Cramér theorem [6] says that the limit
\[ \Pr(\bar X_n > m)^{1/n} \to_{n\to\infty} \alpha(m) \]
does exist for $m > E(X_1)$. For instance, the symmetric Bernoulli case $\Pr(X_n = \pm 1) = \frac{1}{2}$ leads, for $0 < m < 1$, to
\[ \alpha(m) = \frac{1}{\sqrt{(1+m)^{1+m}(1-m)^{1-m}}}. \tag{1} \]
For the bilateral exponential distribution $X_n \sim \frac{1}{2}e^{-|x|}dx$ we get, for $m > 0$,
\[ \alpha(m) = \frac{1}{2}\left(1 + \sqrt{1+m^2}\right)\exp\left(1 - \sqrt{1+m^2}\right). \]
What strange formulas! Other ones can be found in [17], Problem 410. The present paper is going to interpret in certain cases the function $m \mapsto 1/\alpha(m)$ as the Laplace transform of a certain dual measure $\mu^*$ which can be deduced explicitly from the distribution $\mu$ of $X_n$.

Since the Cramér theorem can be seen as a result about one dimensional exponential families, we develop in this framework the idea of duality of exponential families initiated by the article of Carl Morris [21] in 1982. An even simpler example of duality is provided by the Tweedie families (Barlev and Enis [1], Jørgensen [9] and Tweedie [22]): the exponential family with variance function $Am^p$ has a dual with variance function $Bm^q$ for suitable pairs $(A, B)$ (see (8)). For instance, the inverse Gaussian family, with variance function proportional to $m^3$, has a dual with variance function proportional to $m^{3/2}$. The Poisson distribution, with one of the simplest variance functions $V_F(m) = m$, leads to the study of the exponential family with variance function $e^m$, generated up to translation by the unsymmetric stable law with Laplace transform $e^{-s}s^s$, also called Landau distribution. This gives tools for describing duals of other familiar exponential families.
The cases of the normal and gamma families are very simple, being self dual, but other familiar cases like the negative binomial, the binomial and the cubic families are tougher. Finally, we consider another family, the dilogarithm family with variance function $e^m - 1$, as well as the one with variance function $\sinh m$. Like the normal and gamma families, they have the remarkable property of being self dual. The definition of dual measures also makes sense in $\mathbb{R}^n$, while the probabilistic interpretation in terms of large deviations is lost. However we consider several cases in $\mathbb{R}^n$: the multinomial distributions, the Wishart ones and other quadratic families as classified by Casalis ([4]).

We proceed as follows: the notion of duality unfortunately leads us to depart slightly from the tradition about exponential families. Indeed we will use $e^{-sx}$ instead of $e^{\theta x}$ in order to obtain more readable formulas later. This is explained in Section 2, together with the description of the classical objects attached to an exponential family. In the preceding lines we have been vague about duality. Section 3 gives proper definitions, explaining what we call a dual measure $\mu^*$ of $\mu$ and showing that some measures have no dual. We also explain what a T exponential family $TF$ is. It is nothing but an exponential family $F$ plus all its translations. Indeed, talking about the dual $F^*$ of an exponential family $F$ does not exactly make sense, while the dual $TF^*$ of $TF$ does. Section 3 also gives the link with large deviations.

Section 4 concentrates on $TF$ when the variance function $V_F(m)$ is $e^m$, and on some parent distributions. It also gives details on what we call Lévy measures of types 0, 1 and 2. Of course, large parts of this material are well known to probabilists and statisticians: exponential families, variance functions, Lévy measures, Landau distribution. It was necessary to present them again for ease of reading. This section contains crucial calculations for the sequel in Proposition 7.
Section 5 applies the results of Section 4 to the description of the duals of the Morris and the cubic families, with the surprising fact that they all exist, with the only exception of the hyperbolic family with variance function $m^2 + 1$. Section 6 describes the self dual dilogarithm distribution $\mu$ on the set $\mathbb{N} = \{0, 1, \ldots\}$ of nonnegative integers defined by
\[ \sum_{n=0}^{\infty} \mu(n) z^n = \exp\left(\sum_{k=1}^{\infty} \frac{1}{k^2}(z^k - 1)\right) \]
which, for $m > 0$, generates the exponential family with variance function $e^m - 1$. Since this exponential family and its set of parent distributions are not considered in the literature, we develop some of their properties, somewhat deviating from the study of duality. For instance, if $N$ is the standard Gaussian distribution, then the variance function of the exponential family generated by the convolution $N * \mu$ is $e^m + 1$. Section 7 considers the $\mathbb{R}^n$ case: the multinomial distribution has a very explicit dual expressed in terms of the Landau distribution. The Wishart distribution is self dual, as is the one dimensional gamma distribution. We prove some negative results, like the fact that the multivariate negative binomial law has no dual. Section 8 discusses open problems.

2. Laplace and bilateral Laplace transforms

At first the exponential families and Laplace transforms are considered. A certain tradition among statisticians (see Morris [21]), as opposed to physicists and maybe to probabilists, defines the Laplace transform of a positive measure $\mu$ on $\mathbb{R}$ or $\mathbb{R}^n$ as follows:
\[ L_\mu(\theta) = \int_{\mathbb{R}^n} e^{\langle \theta, x\rangle}\,\mu(dx). \tag{2} \]

From the Hölder inequality the set $D(\mu) = \{\theta \,;\, L_\mu(\theta) < \infty\}$ is a convex set, and the function $k_\mu = \log L_\mu$ is convex on $D(\mu)$. Actually $k_\mu$ is strictly convex outside of the particular case where $\mu$ is concentrated on one point in the case of $\mathbb{R}$, or on an affine hyperplane in the case of $\mathbb{R}^n$. To avoid trivialities one introduces the interior $\Theta(\mu)$ of $D(\mu)$. One calls $\mathcal{M}(\mathbb{R}^n)$ the set of $\mu$ which are not concentrated on an affine hyperplane and such that $\Theta(\mu)$ is not empty. Such a $\mu$ generates a set of probabilities
\[ P(\theta, \mu)(dx) = e^{\langle \theta, x\rangle - k_\mu(\theta)}\,\mu(dx) \]
and $F = F(\mu) = \{P(\theta, \mu) \,;\, \theta \in \Theta(\mu)\}$ is called the natural exponential family generated by $\mu$. Note that $\mu$ is not necessarily bounded: simple examples on $\mathbb{R}$ like $\mu(dx) = 1_{(0,\infty)}(x)dx$ or $\sum_{n=0}^{\infty}\delta_n$ generate the important families of exponential distributions or geometric discrete laws. Omitting 'natural', we will always say exponential family for short. Objects linked to $F$ are the mean $m$ of $P(\theta, \mu)$, the domain of the means $M_F$ and the inverse function $\psi_\mu$. They are defined by
\[ m = k'_\mu(\theta) = \int_{\mathbb{R}^n} x\,P(\theta, \mu)(dx), \quad M_F = k'_\mu(\Theta(\mu)), \quad \theta = \psi_\mu(m). \]

Note that since $k_\mu$ is strictly convex, $k'_\mu$ is injective on $\Theta(\mu)$, the map $\psi_\mu$ from $M_F$ onto $\Theta(\mu)$ is well defined and $M_F$ is a connected open set. If $C_F$ is the closed convex set generated by the support of $\mu$, clearly $M_F$ is contained in $C_F$. We say that $F$ or $\mu$ is steep if $M_F$ is equal to the interior of $C_F$. Most of the classical exponential families are steep, but not all (see for instance the Tweedie scale below for $p < 0$). In $\mathbb{R}$ the set $M_F$ is an interval, but in $\mathbb{R}^n$ there are non steep examples such that $M_F$ is non-convex ([16], p. 35). Finally, the last important object about the exponential family $F$ is its variance function $V_F$ defined on $M_F$ by
\[ V_F(m) = (\psi'_\mu(m))^{-1} = k''_\mu(\psi_\mu(m)) = \int_{\mathbb{R}^n} (x - m)\otimes(x - m)\,P(\psi_\mu(m), \mu)(dx), \]
which characterizes $F$.

Now bilateral Laplace transforms are considered. In the particular case of dimension one, an older tradition associates the name of Laplace with integrals $\int_0^\infty e^{-sx}\mu(dx)$, which are conveniently defined for $s > 0$ in many circumstances. In the present paper we need to consider what the physicists call the bilateral Laplace transform

\[ B_\mu(s) = \int_{\mathbb{R}^n} e^{-\langle s, x\rangle}\,\mu(dx). \tag{3} \]

Dealing with this slight change of notation $B_\mu(s) = L_\mu(-s)$ will much simplify the description of duality between two natural exponential families. In the sequel, we say 'Laplace transform' for short, by abuse of language, instead of the longer term 'bilateral Laplace transform'. Because we will deal with these bilateral Laplace transforms, we have to modify the description of the classical objects associated to $F = F(\mu)$ with

\[ S(\mu) = -\Theta(\mu), \quad \ell_\mu(s) = k_\mu(-s), \quad m = -\ell'_\mu(s), \tag{4} \]
\[ s = \varphi_\mu(m) = -\psi_\mu(m), \quad \ell''_\mu(s) = k''_\mu(-s), \quad V_F(m) = -(\varphi'_\mu(m))^{-1}. \tag{5} \]

Let us insist on the fact that $\ell_\mu$ is convex and thus $s \mapsto m = -\ell'_\mu(s)$ is decreasing when $n = 1$. For any $n$ the inverse function $m \mapsto s = \varphi_\mu(m)$ from $M_F$ onto $S(\mu)$ is well defined. Thus we have several ways of coding a member of an exponential family:
\[ P(\theta, \mu), \quad P(-s, \mu), \quad P(\psi_\mu(m), \mu), \quad P(-\varphi_\mu(m), \mu). \]
Finally, in one dimension, to find a generating measure $\mu$ of the exponential family $F$ from the knowledge of $V_F$, we proceed as follows:
\[ ds = \varphi'_\mu(m)\,dm = -\frac{dm}{V_F(m)} \ \Rightarrow\ s = -\int \frac{dm}{V_F(m)}. \]
The choice of the integration constants will change $\mu$ and is arbitrary. In practical cases, we choose these constants in order to get the simplest form of $\mu$. Therefore this choice may depend on aesthetic considerations. Here is an important example in dimension one:

Example 1 (the Tweedie scale). This term describes a set of exponential families with variance functions of the form $V_F(m) = m^p/\lambda^{p-1}$ on $M_F = (0, \infty)$, where $\lambda > 0$ and $p \in \mathbb{R}\setminus[0, 1]$ (see [1, 9, 22]). We are going to describe their densities or Laplace transforms. The limiting cases $p = 0$ and $p = 1$ correspond respectively to the Gaussian exponential family with fixed variance $\lambda$, where $M_F = \mathbb{R}$, and to the Poisson family; they need a special treatment.

1. The stable subordinator case $p > 2$. For simplification we introduce $\alpha = \frac{p-2}{p-1} \in (0, 1)$. We write
\[ ds = -\frac{\lambda^{p-1}\,dm}{m^p} \ \Rightarrow\ s = \frac{1}{p-1}\left(\frac{\lambda}{m}\right)^{p-1} \ \Rightarrow\ m = \frac{\lambda s^{-1/(p-1)}}{(p-1)^{1/(p-1)}} = -\ell'_\mu(s). \]
Here $S(\mu) = (0, \infty)$. We finally obtain $\ell_\mu(s) = -\lambda\frac{(p-1)^{\alpha}}{p-2}s^{\alpha}$. This family is generated by a stable law of parameter $\alpha$ with Lévy measure concentrated on $(0, \infty)$, of Type 1 (see Section 4.3 for the definition of a Lévy measure of an infinitely divisible probability and its type).
2. The gamma case $p = 2$. Similarly we obtain $S(\mu) = (0, \infty)$, $\ell'_\mu(s) = -\lambda/s$ and $B_\mu(s) = 1/s^{\lambda}$. This family is generated by the measure $\mu(dx)$ with density $x^{\lambda-1}/\Gamma(\lambda)$ on $(0, \infty)$ and is the family of gamma distributions with shape parameter $\lambda$.
3. The Poisson-gamma case $1 < p < 2$. For simplification denote $\beta = \frac{2-p}{p-1} \in (0, \infty)$. A computation analogous to the case $p > 2$ leads to
\[ \ell_\mu(s) = \lambda\frac{(p-1)^{-\beta}}{2-p}\,\frac{1}{s^{\beta}}, \quad \ell'_\mu(s) = -\lambda\frac{(p-1)^{-1/(p-1)}}{s^{1/(p-1)}} \tag{6} \]
and to $S(\mu) = (0, \infty)$. This implies that the corresponding exponential family is the set of laws of $X_1 + \cdots + X_{N(t)}$ where $X_1, \ldots, X_n, \ldots$ are iid gamma distributed with shape parameter $\beta$ and where $N(t)$ is an independent Poisson random variable with mean $t = \lambda\frac{(p-1)^{-\beta}}{2-p}$.
4. The non steep stable case $p < 0$. For simplicity denote $q = -p > 0$ and $\gamma = \frac{q+2}{q+1} \in (1, 2)$. Here $S(\mu) = (-\infty, 0)$ and we obtain
\[ \ell_\mu(s) = \lambda\frac{(q+1)^{\gamma}}{q+2}(-s)^{\gamma}. \]
The members of this family are stable laws with parameter $\gamma$ with Lévy measure concentrated on $(-\infty, 0)$. The support of such a stable law is $\mathbb{R}$ but $M_F = (0, \infty)$: this is an example of a non steep family and of an infinitely divisible distribution of Type 2 (see Section 4 and (21)).
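The recipe $m = -\ell'_\mu(s)$, $V_F(m) = \ell''_\mu(s)$ can be checked numerically on the scale above. The following is a minimal sketch for the Poisson-gamma case, assuming the reading $\ell_\mu(s) = \lambda(p-1)^{-\beta}s^{-\beta}/(2-p)$ of (6); the function names, the step size and the sample values of $p$, $\lambda$, $s$ are illustrative choices, not taken from the paper.

```python
import math

def ell(s, p, lam):
    """Log-Laplace transform (6) of the Poisson-gamma case, for 1 < p < 2 and s > 0."""
    beta = (2.0 - p) / (p - 1.0)
    return lam * (p - 1.0) ** (-beta) / (2.0 - p) * s ** (-beta)

def mean_and_variance(s, p, lam, h=1e-4):
    """Central finite differences for m = -ell'(s) and V = ell''(s)."""
    d1 = (ell(s + h, p, lam) - ell(s - h, p, lam)) / (2.0 * h)
    d2 = (ell(s + h, p, lam) - 2.0 * ell(s, p, lam) + ell(s - h, p, lam)) / h ** 2
    return -d1, d2

# Hypothetical sample point of the scale
p, lam, s = 1.5, 2.0, 0.7
m, v = mean_and_variance(s, p, lam)
expected = m ** p / lam ** (p - 1.0)   # Tweedie variance function m^p / lambda^(p-1)
print(m, v, expected)
```

The two last printed numbers should agree up to the finite difference error, which is the statement that (6) generates the family with variance function $m^p/\lambda^{p-1}$.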

3. Duality

3.1. Duality between measures

Given $\mu \in \mathcal{M}(\mathbb{R}^n)$, the function $s \mapsto -\ell'_\mu(s) = m$ maps $S(\mu)$ onto the domain of the means $M_F$. Its inverse $m \mapsto s = \varphi_\mu(m)$ exists and maps $M_F$ onto $S(\mu)$. Suppose that there exists $\mu^* \in \mathcal{M}(\mathbb{R}^n)$ such that

\[ -\ell'_{\mu^*}(-\ell'_\mu(s)) = s. \tag{7} \]

If (7) holds we say that µ∗ is a dual measure of µ. Note that we say ’a dual measure’ since µ∗ is unique only up to a multiplicative constant. However the associated exponential family F(µ∗) will not change if µ∗ is replaced by Cµ∗. Observe also that if µ∗ exists and if µ is steep then µ is also a dual measure of µ∗. In general, if µ∗ is bounded it is natural to choose the multiplicative constant such that µ∗ is a probability.

Proposition 1. Let $\mu \in \mathcal{M}(\mathbb{R}^n)$ and suppose that there exists a dual measure $\mu^*$. Then $\ell''_{\mu^*}(m) = (V_{F(\mu)}(m))^{-1}$. Furthermore, if $\mu$ is steep, $\ell''_\mu(m) = (V_{F(\mu^*)}(m))^{-1}$.

Proof: Since $s = \varphi_\mu(-\ell'_\mu(s))$ we get by definition that
\[ \varphi'_\mu(-\ell'_\mu(s)) = -(\ell''_\mu(s))^{-1} = -(V_{F(\mu)}(-\ell'_\mu(s)))^{-1}. \]
Since (7) reads $\ell'_{\mu^*}(m) = -\varphi_\mu(m)$, this gives $\ell''_{\mu^*}(m) = (V_{F(\mu)}(m))^{-1}$. The second formula is obtained by symmetry.

In the particular case of dimension one, we will have numerous examples of dual measures given below. Let us give the simplest now, based on the Tweedie scale. Essentially, duality exchanges cases 1 and 3.

Proposition 2. Let $1 < p < 2$ and let $q$ be defined by $\frac{1}{p} + \frac{1}{q} = 1$. Let $\mu$, defined by (6), generate the exponential family with variance function $V_F(m) = m^p/\lambda^{p-1}$. Then there exists a dual measure $\mu^*$ and it generates the exponential family with variance function
\[ V_{F(\mu^*)}(s) = \left(\frac{p}{q}\right)^{q}\frac{1}{\lambda}\,s^{q}. \]

Proof: Using formula (6) we get $\ell''_\mu(s) = \lambda(p-1)^{-q}s^{-q}$. Since $p - 1 = p/q$, Proposition 1 gives the result.

Remark 1. To get a more symmetric form we can choose $\lambda = (p/R)^q$ with $R > 0$, so that
\[ V_{F(\mu)}(m) = \left(\frac{R}{p}\right)^{p} m^{p}, \quad V_{F(\mu^*)}(s) = \left(\frac{R}{q}\right)^{q} s^{q}. \tag{8} \]

Dual measures do not always exist. Consider, for instance, Case 4 of the Tweedie scale above, generated by a stable law $\mu$ with parameter $\gamma \in (1, 2)$, with Lévy measure concentrated on $(-\infty, 0)$ and domain of the means $M_{F(\mu)} = (0, \infty)$. If $\mu^*$ existed we would have, for positive constants $C, C_1, C_2$:
\[ \ell_\mu(s) = C(-s)^{\gamma}, \quad \ell'_\mu(s) = -C\gamma(-s)^{\gamma-1} = -m, \]
\[ \ell'_{\mu^*}(m) = C_1 m^{\frac{1}{\gamma-1}}, \quad \ell_{\mu^*}(m) = C_2 m^{\frac{\gamma}{\gamma-1}}. \]

This would imply that $\mu^*$ is a stable law with parameter $\gamma/(\gamma-1) > 2$, which is impossible.

Moreover, it is well known that a probability $P \in \mathcal{M}(\mathbb{R})$ is infinitely divisible if and only if there exists $\nu \in \mathcal{M}(\mathbb{R})$ such that for $s \in S(P)$ we have
\[ \ell''_P(s) = \int_{\mathbb{R}} e^{-sx}\,\nu(dx). \tag{9} \]
Extension of the definition of infinite divisibility to measures $\mu \in \mathcal{M}(\mathbb{R})$ (even the unbounded ones) is easy and is done in Letac [16], and the same remark about $\ell''_\mu(s)$ holds. Then we have the following simple proposition:

Proposition 3. Let $\mu \in \mathcal{M}(\mathbb{R})$ and assume that a dual measure $\mu^*$ does exist. Then $\mu^*$ is infinitely divisible if and only if $m \mapsto \frac{1}{V_{F(\mu)}(m)}$ is the Laplace transform of some measure $\nu \in \mathcal{M}(\mathbb{R})$.

Proof: This is an immediate consequence of Proposition 1 and of (20).

Example 2 (Bernoulli distribution on $\pm 1$). Proposition 3 gives a powerful tool to check quickly whether a dual measure exists. Consider, for instance, the Bernoulli measure on $\pm 1$ with mean $m \in (-1, 1)$:
\[ \mu = \frac{1}{2}\left((1-m)\delta_{-1} + (1+m)\delta_{1}\right). \]
Therefore the variance of $X \sim \mu$ is $V_{F(\mu)}(m) = E(X^2) - E(X)^2 = 1 - m^2$. To prove that $\mu^*$ exists we observe that

\[ \frac{1}{1-m^2} = \frac{1}{2}\left(\frac{1}{1-m} + \frac{1}{1+m}\right) = \frac{1}{2}\int_{\mathbb{R}} e^{-my-|y|}\,dy \]
and we apply Proposition 3 to the bilateral exponential law $\nu(dy) = \frac{1}{2}e^{-|y|}dy$. One can prove that $B_{\mu^*}(m) = \sqrt{(1+m)^{1+m}(1-m)^{1-m}}$ on $S(\mu^*) = (-1, 1)$. We will describe $\mu^*$ later on in Section 4, the second formula in (15), and Section 5.
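The identity used in Example 2, namely that $1/(1-m^2)$ is the Laplace transform of $\frac{1}{2}e^{-|y|}dy$ for $|m| < 1$, is easy to confirm by quadrature. The sketch below is only an illustration: the Simpson rule, the truncation of the real line at a hypothetical `cutoff` and the sample values of $m$ are assumptions of this example.

```python
import math

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def laplace_of_bilateral_exponential(m, cutoff=80.0):
    """(1/2) * integral over R of exp(-m*y - |y|) dy, split at the kink y = 0."""
    f = lambda y: 0.5 * math.exp(-m * y - abs(y))
    return simpson(f, -cutoff, 0.0) + simpson(f, 0.0, cutoff)

for m in (-0.5, 0.0, 0.3, 0.8):
    # numerical integral versus the closed form 1 / (1 - m^2)
    print(m, laplace_of_bilateral_exponential(m), 1.0 / (1.0 - m * m))
```

The truncation error grows as $|m|$ approaches 1, since the integrand then decays slowly on one side.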

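3.2. The role of linear transformations and the Jørgensen set

Consider the image $\mu_a$ of $\mu$ by the isomorphism $x \mapsto a(x)$ of $\mathbb{R}^n$ onto itself. Here $a$ is an invertible matrix of order $n$. If $\mu$ has a dual $\mu^*$ then the image $(\mu^*)_{(a^{-1})^T}$ of $\mu^*$ by $(a^{-1})^T$ is a dual of $\mu_a$. Let us give the details of a tedious calculation, gradients being written as column vectors:

\[ B_{\mu_a}(s) = \int_{\mathbb{R}^n} e^{-\langle s, a(x)\rangle}\mu(dx) = \int_{\mathbb{R}^n} e^{-\langle a^T s, x\rangle}\mu(dx), \quad \ell_{\mu_a}(s) = \ell_\mu(a^T s), \quad m = -\ell'_{\mu_a}(s) = -a\,\ell'_\mu(a^T s), \]
\[ a^{-1}m = -\ell'_\mu(a^T s), \quad a^T s = -\ell'_{\mu^*}(a^{-1}m), \quad s = -(a^{-1})^T\ell'_{\mu^*}(a^{-1}m), \quad \ell_{(\mu_a)^*}(m) = \ell_{\mu^*}(a^{-1}m), \]
\[ B_{(\mu_a)^*}(m) = B_{\mu^*}(a^{-1}m) = \int_{\mathbb{R}^n} e^{-\langle a^{-1}m, y\rangle}\mu^*(dy) = \int_{\mathbb{R}^n} e^{-\langle m, (a^{-1})^T y\rangle}\mu^*(dy) = B_{(\mu^*)_{(a^{-1})^T}}(m). \]

Recall also that the image of $F = F(\mu)$ by the isomorphism $a$ satisfies

\[ V_{F(\mu_a)}(m) = a\,V_F(a^{-1}m)\,a^T. \]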

If $\mu \in \mathcal{M}(\mathbb{R}^n)$ then the Jørgensen set $\Lambda(\mu)$ is the set of $\lambda > 0$ such that there exists $\mu_\lambda \in \mathcal{M}(\mathbb{R}^n)$ such that $S(\mu) = S(\mu_\lambda)$ and $(B_\mu)^{\lambda} = B_{\mu_\lambda}$. Denote $F_\lambda = F(\mu_\lambda)$. Recall that the set $(F_\lambda)_{\lambda\in\Lambda(\mu)}$ is called by Jørgensen an exponential dispersion model. It is not correct to think that if $\mu^*$ is a dual measure of $\mu$ then $\Lambda(\mu) = \Lambda(\mu^*)$. We will see in Section 5 that a dual measure $\mu^*$ of the Bernoulli distribution $\mu$ exists and is infinitely divisible. Therefore

\[ \Lambda(\mu^*) = (0, \infty), \quad \Lambda(\mu) = \{1, 2, \ldots\}. \]

However if $\lambda$ is both in $\Lambda(\mu)$ and $\Lambda(\mu^*)$, then a dual measure of $\mu_\lambda$ satisfies
\[ B_{(\mu_\lambda)^*}(m) = B_{(\mu^*)_\lambda}\left(\frac{m}{\lambda}\right). \tag{10} \]

3.3. Duality and the change of generating measure of an exponential family

Let us clarify now what happens to $\mu^*$ and to $F(\mu^*)$ when we replace $\mu(dx)$ by $\mu_1(dx) = e^{\langle s_0, x\rangle}\mu(dx)$. This is important, since $F(\mu) = F(\mu_1)$.

Proposition 4. Let $\mu \in \mathcal{M}(\mathbb{R}^n)$ and suppose that there exists a dual measure $\mu^*$. Let $\mu_1(dx) = e^{\langle s_0, x\rangle}\mu(dx)$. Then $\mu_1^* = \mu^* * \delta_{s_0}$ is a dual measure of $\mu_1$. In particular, the elements of $F(\mu_1^*)$ are the translations by $s_0$ of the elements of $F(\mu^*)$. Symmetrically, if $\mu_2 = \mu * \delta_{m_0}$ is a translation of $\mu$ by $m_0$ then $\mu_2^*(dy) = e^{\langle m_0, y\rangle}\mu^*(dy)$ is a dual measure of $\mu_2$. In particular $F(\mu_2^*) = F(\mu^*)$.

We skip the proof, which is an immediate application of the definitions. Let us consider the simple example of the gamma exponential family $F$ with shape parameter $\lambda$, which is Case 2 in the Tweedie scale example in Section 2. The domain of the means is $M_F = (0, \infty)$ and its variance function is $V_F(m) = m^2/\lambda$. Applying Proposition 1 we write
\[ \ell''_{\mu^*} = \frac{\lambda}{m^2}, \quad \ell'_{\mu^*} = -\frac{\lambda}{m} + s_0, \quad \ell_{\mu^*} = -\lambda\log m + s_0 m + s_1, \quad B_{\mu^*}(m) = \frac{e^{s_0 m + s_1}}{m^{\lambda}} = e^{s_1}\int_0^\infty e^{-(y - s_0)m}\frac{y^{\lambda-1}}{\Gamma(\lambda)}\,dy. \]

If $s_0 = 0$ then $F(\mu^*)$ is the same gamma family. If $s_0 \neq 0$ this is a translation of this last family. Suppose that we had started from a translated gamma family with $M_F = (m_0, \infty)$ and $V_F(m) = (m - m_0)^2/\lambda$. For this case we get the same families $F(\mu^*)$: they do not depend on the particular translation by $m_0$.

3.4. Duality for T–exponential families

The preceding discussion shows that while the dual of a measure is well defined up to a multiplicative constant, this is not the case for an exponential family $F$, since changing the generating measure $\mu$ of $F$ into some $\mu_1$ implies that $F(\mu) = F(\mu_1)$ by definition, but possibly changes $F(\mu^*)$ into a translate $F(\mu_1^*)$. For this reason we coin the definition of a T–exponential family:

Definition 1. Given an exponential family $F$, the associated T–exponential family $TF$ is the union of all the translations of $F$. If $F = F(\mu)$ then
\[ TF = \{P(-s, \mu) * \delta_{m_0} \,;\, s \in S(\mu),\ m_0 \in \mathbb{R}^n\}. \]
Such a $\mu \in \mathcal{M}(\mathbb{R}^n)$ is called a generating measure of $TF$.

Note that $TF(\mu) = TF(\mu_1)$ if and only if there exist $s_0 \in \mathbb{R}^n$, $b \in \mathbb{R}$ and $m_0 \in \mathbb{R}^n$ such that $\mu_1 = \delta_{m_0} * e^{-\langle s_0, x\rangle + b}\mu(dx)$, or
\[ \ell_{\mu_1}(s) = -\langle s, m_0\rangle + b + \ell_\mu(s + s_0). \]
We are now in position to clearly define the dual $TF^*$ of a T–exponential family $TF$, thanks to the following proposition:

Proposition 5. If $\mu$ is a generating measure of the T–exponential family $TF$ in $\mathbb{R}^n$ and if $\mu$ has a dual measure $\mu^*$, then any generating measure $\mu_1$ also has a dual and

TF(µ∗) = TF(µ1∗)

Under these circumstances we denote $TF^* = TF(\mu^*)$.

For instance we have seen that if $TF$ is the T–exponential family of all the gamma distributions with shape parameter $\lambda$ augmented with all its translations, then $TF = TF^*$ is self dual. We will find a similar phenomenon with the Gaussian distributions with variance one, then with two unexpected examples below in Section 6, the dilogarithm exponential family and the $\sinh m$ family, and, as a generalization of the one dimensional gamma case, with the Wishart distributions with fixed shape parameter.

3.5. Duality and large deviations.

Let $F = F(\mu)$ be a real natural exponential family and let $m_0 < m$ be two points of $M_F$. Let $X_1, \ldots, X_n, \ldots$ be independent random variables with the same distribution in $F$ with mean $m_0$. The theorem of large deviations, due to Cramér [6], says that for
\[ h(m_0, m) = \int_{m_0}^{m} \frac{m - t}{V_F(t)}\,dt \]
we have
\[ \Pr\left(\frac{1}{n}(X_1 + \cdots + X_n) > m\right)^{1/n} \to_{n\to\infty} e^{-h(m_0, m)}. \tag{11} \]

In the next proposition, we link $h(m_0, m)$ with $\mu^*$ as the remainder of the Taylor expansion of $m \mapsto \ell_{\mu^*}(m)$ around $m_0$. It shows that $m \mapsto \exp h(m_0, m)$ is the Laplace transform of a member of $TF^*$ when the distribution $P$ of $X_n$ is in $TF$. One can consider such a proposition as a kind of probabilistic interpretation of duality in dimension one.

Proposition 6. Let $TF$ be a real T–exponential family such that its dual $TF^*$ exists. Let $X_1, \ldots, X_n, \ldots$ be independent random variables with the same distribution $P \in TF$ and mean $m_0$. For $m_0 < m$ define
\[ h(m_0, m) = -\lim_{n\to\infty}\frac{1}{n}\log\Pr\left(\frac{1}{n}(X_1 + \cdots + X_n) > m\right). \]
Suppose that the dual probability $P^*$ of $P$ exists. Let $s_0 = \ell'_{P^*}(m_0)$, $c = -\ell_{P^*}(m_0) + m_0\ell'_{P^*}(m_0)$ and $P^*_{m_0} = e^{c}P^* * \delta_{s_0}$. Then

\[ B_{P^*_{m_0}}(m) = \int_{\mathbb{R}} e^{-my}P^*_{m_0}(dy) = e^{h(m_0, m)}. \tag{12} \]

Proof: From Proposition 1 we have $\ell''_{P^*}(m) = 1/V_{F(P)}(m)$. Therefore by integration by parts
\[ h(m_0, m) = \int_{m_0}^{m}(m - t)\ell''_{P^*}(t)\,dt = (m_0 - m)\ell'_{P^*}(m_0) + \ell_{P^*}(m) - \ell_{P^*}(m_0) = \ell_{P^*}(m) - m s_0 + c. \]
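As an illustration of the link between $h(m_0, m)$ and the dual Laplace transform, the sketch below treats the symmetric Bernoulli case on $\pm 1$ with $m_0 = 0$ and $V(t) = 1 - t^2$. It compares three quantities: the integral $\int_0^m (m-t)/V(t)\,dt$, the logarithm of $\sqrt{(1+m)^{1+m}(1-m)^{1-m}}$, and the exact finite $n$ rate computed from binomial weights. The function names, the midpoint rule and the value $n = 2000$ are choices made for this example only; the finite $n$ rate is expected to approach the limit slowly, with a correction of order $(\log n)/n$.

```python
import math

def rate_by_quadrature(m, n=2000):
    """h(0, m) = integral over [0, m] of (m - t) / (1 - t^2) dt, by the midpoint rule."""
    step = m / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * step
        total += (m - t) / (1.0 - t * t)
    return total * step

def rate_closed_form(m):
    """log of sqrt((1+m)^(1+m) * (1-m)^(1-m))."""
    return 0.5 * ((1 + m) * math.log1p(m) + (1 - m) * math.log1p(-m))

def rate_empirical(m, n):
    """-(1/n) log Pr(mean of n fair +-1 signs > m), from exact binomial weights."""
    k_min = math.floor(n * (1 + m) / 2) + 1      # number of +1 signs needed
    logs = [math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            - n * math.log(2.0) for k in range(k_min, n + 1)]
    top = max(logs)                              # log-sum-exp for stability
    return -(top + math.log(sum(math.exp(x - top) for x in logs))) / n

m = 0.5
print(rate_by_quadrature(m), rate_closed_form(m), rate_empirical(m, 2000))
```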

3.6. Warnings for existence and non existence of the dual measures

To conclude Section 3, devoted to generalities on duality, we consider some devices for proving that a function is not a Laplace transform in the case of dimension 1. Other ones will appear in Section 7 for the $\mathbb{R}^n$ case. We first show that the Vinogradov-Paris distribution has no dual. Vinogradov and Paris [23], in their Theorem 6, consider the exponential family $F$ on $(0, \infty)$ with variance function $2m^2/(1 - m^2)$ defined on $M_F = (0, 1)$. If a dual measure $\mu^*$ existed, we would have from Proposition 1

\[ \ell''_{\mu^*}(m) = \frac{1 - m^2}{2m^2} \ \Rightarrow\ B_{\mu^*}(m) = e^{-m^2/4}/\sqrt{m}. \]

The function $e^{-m^2/4}$ is a Fourier transform, and $1/\sqrt{m}$ is a Laplace transform. Who wins? By the theorem of maximal analyticity of Laplace transforms (see Letac and Mora [15] and Kawata [11]), $B_{\mu^*}(m)$ is defined not only on $M_F = (0, 1)$ but also on $(0, \infty)$. However the second derivative of $\log B_{\mu^*}(m)$ is negative for $m > 1$, and this proves that $B_{\mu^*}$ cannot be the Laplace transform of a positive measure. For proving that some $TF^*$ does not exist, this trick can be utilized when $F$ is not steep.

Let us apply Proposition 6 to the probability distributions

\[ P_1(dx) = \frac{1}{2}e^{-|x|}dx, \quad P_2(dx) = \frac{dx}{2\cosh(\frac{1}{2}\pi x)} \]

and to a distribution $P_3$ described in Letac [16] generating the exponential family with variance function $(1 + m^2)^{3/2}$. The probabilities $P_1$ and $P_2$ are called the bilateral exponential and the hyperbolic secant distributions. They have means $m_0 = 0$, and respective Laplace transforms
\[ B_1(s) = \frac{1}{1 - s^2}, \quad B_2(s) = \frac{1}{\cos s}, \]
which generate two exponential families with respective variance functions

\[ V_1(m) = m^2 + 1 + \sqrt{m^2 + 1}, \quad V_2(m) = m^2 + 1. \]

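Performing the calculation of $h(0, m) = \ell_{\mu^*}(m)$ for $P_1, P_2, P_3$ and writing $h_i(m) = e^{h(0, m)}$, we obtain
\[ h_1(m) = \frac{2\left(\sqrt{m^2+1} - 1\right)}{m^2}\,e^{\sqrt{m^2+1} - 1}, \quad h_2(m) = \frac{e^{m\arctan m}}{\sqrt{m^2+1}}, \quad h_3(m) = e^{\sqrt{m^2+1} - 1}. \tag{13} \]

To see that $P_1^*$, $P_2^*$ and $P_3^*$ do not exist we use Mathematica for computing
\[ h_1(m) = 1 + \frac{m^2}{4} + o(m^5), \quad h_2(m) = 1 + \frac{m^2}{2} + \frac{m^4}{24} + \frac{m^6}{80} - \frac{223\,m^8}{40320} + o(m^9), \quad h_3(m) = 1 + \frac{m^2}{2} + o(m^5). \]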

This proves that $\int_{\mathbb{R}} x^4 P_1^*(dx) = \int_{\mathbb{R}} x^4 P_3^*(dx) = 0$ and $\int_{\mathbb{R}} x^8 P_2^*(dx) < 0$: none of them is possible. Thus $P_1^*$, $P_2^*$ and $P_3^*$ do not exist. My thanks go to Lev Klebanov for suggesting this method for proving the lack of duality of some measures.
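The moment argument only needs the first Taylor coefficients, which can be recomputed in exact rational arithmetic. The sketch below does this for the hyperbolic secant case and for $P_3$, assuming that the candidate dual Laplace transform has logarithm $m\arctan m - \frac{1}{2}\log(1+m^2)$ in the first case and $\sqrt{1+m^2} - 1$ in the second (both follow from integrating $1/V$ twice). The helper names and the truncation order are illustrative.

```python
from fractions import Fraction

N = 5  # keep the coefficients of x^0 .. x^4, where x = m^2

def series_exp(g):
    """exp of a truncated power series g with g[0] == 0, using e' = g' * e."""
    e = [Fraction(1)] + [Fraction(0)] * (len(g) - 1)
    for n in range(1, len(g)):
        e[n] = sum(k * g[k] * e[n - k] for k in range(1, n + 1)) / n
    return e

# log h2 = m*arctan(m) - (1/2)*log(1 + m^2), expanded in x = m^2
log_h2 = [Fraction(0)] + [
    Fraction((-1) ** (k - 1), 2 * k - 1) - Fraction((-1) ** (k - 1), 2 * k)
    for k in range(1, N)
]

# log h3 = sqrt(1 + x) - 1, from the binomial series with exponent 1/2
log_h3 = [Fraction(0)] * N
coeff = Fraction(1)
for k in range(1, N):
    coeff = coeff * (Fraction(1, 2) - (k - 1)) / k
    log_h3[k] = coeff

h2 = series_exp(log_h2)   # coefficients of 1, m^2, m^4, m^6, m^8
h3 = series_exp(log_h3)
print([str(c) for c in h2])
print([str(c) for c in h3])
```

A negative coefficient of $m^8$ in the first list, or a vanishing coefficient of $m^4$ in the second, would correspond to an impossible even moment for a positive measure.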

4. The variance function $e^m$, the Landau law ϕ and the dual of the Poisson family

In this section, we restrict ourselves to the one dimensional case.

4.1. The Landau distribution ϕ.

We will study $TF^*$ when $TF$ is the set of Poisson distributions augmented with its translations. In Section 5 we are going to study the dual pairs $(TF, TF^*)$ arising when $V_F(m)$ is a quadratic or a cubic polynomial. For studying their duals we will use a particular probability $\varphi$ on $\mathbb{R}$ called the Landau law, which can be defined by its Laplace transform, defined on $S(\varphi) = (0, \infty)$ by $B_\varphi(s) = e^{-s}s^{s}$.

As we are going to see, $\varphi$ exists and has $\mathbb{R}$ as support. Sometimes the term Landau distribution is given to the convolution $\varphi * \delta_{-1}$, because its Laplace transform has the more elegant form $s^s$. It is the law of $X - 1$ when $X \sim \varphi$. A detailed study of the Landau distribution can be found, for instance, in Marucho et al. [20].

The law $\varphi$ generates the exponential family with variance function $e^m$. To see this we write, using for instance (5):
\[ \ell_\varphi(s) = s\log s - s, \quad \ell'_\varphi(s) = \log s = -m, \quad \ell''_\varphi(s) = \frac{1}{s} = e^{m}. \tag{14} \]
An important feature of $\varphi$ for our purposes is the existence of the following dual $\varphi^*$:

\[ s = -\ell'_\varphi(-\ell'_{\varphi^*}(s)) = -\log(-\ell'_{\varphi^*}(s)), \quad \ell'_{\varphi^*}(s) = -e^{-s}, \quad \ell_{\varphi^*}(s) = e^{-s} - 1, \quad B_{\varphi^*}(s) = e^{e^{-s} - 1}. \]
Thus a dual of the Landau distribution is the Poisson distribution with mean 1.

4.2. Existence of ϕ and parents

Proposition 7. For $s > 0$ and $R > 1$ we have

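\[ e^{-s}s^{s} = \exp\int_0^\infty (e^{-sx} - 1 + sxe^{-x})x^{-2}dx, \quad (1+s)^{1+s}(1-s)^{1-s} = \exp\int_{\mathbb{R}} (e^{-sx} - 1 + sx)x^{-2}e^{-|x|}dx, \tag{15} \]
\[ \frac{s^{s}}{(s+1)^{s+1}} = \exp\left(-\int_0^\infty (1 - e^{-sx})(1 - e^{-x})x^{-2}dx\right), \quad \frac{s^{s}}{(s+1)^{s}} = \exp\left(-\int_0^\infty (1 - e^{-sx})(1 - e^{-x}(1+x))x^{-2}dx\right), \tag{16} \]
\[ R^{\frac{Rs}{R-1}}\,\frac{s^{s}(s+1)^{\frac{s+1}{R-1}}}{(Rs+1)^{\frac{Rs+1}{R-1}}} = \exp\left(-\int_0^\infty (1 - e^{-sx})f(x)dx\right), \quad f(x) = \frac{R - 1 + e^{-x} - Re^{-x/R}}{(R-1)x^{2}}, \tag{17} \]
\[ \frac{1}{4}\left(\frac{s+2}{s+1}\right)^{s+2} = \exp\left(-\int_0^\infty (1 - e^{-sx})\,e^{-x}(x - 1 + e^{-x})x^{-2}dx\right), \tag{18} \]
where the second formula in (15) requires $s < 1$.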

Proof: We prove first that for $a > 0$ we have

\[ F(a) = \int_0^\infty (e^{-u} - 1 + ue^{-au})u^{-2}du = -1 - \log a. \]

To see this we note that $F'(a) = -1/a$, and an integration by parts shows that $F(1) = -1$. We pass from the value of $F(a)$ to the first statement in (15) by the change of variable $u = sx$, taking $a = 1/s$.

For proving the second statement in (15) we apply the first one while replacing $s$ by $1 + s$ and $1 - s$, thus obtaining two integrals. In the second one we change $x$ into $-x$; summing the two integrals then yields the second expression in (15) plus the extra term

\[ 2 + \int_{\mathbb{R}} (e^{-|x|} - 1 + |x|e^{-|x|})\frac{dx}{x^{2}} = 0, \]
which is obtained by an integration by parts. For proving the first relation in (16) we write patiently

\[ \log\frac{s^{s}}{(s+1)^{s+1}} = \int_0^\infty (e^{-sx} - 1 + sxe^{-x})x^{-2}dx - \int_0^\infty (e^{-sx}e^{-x} - 1 + sxe^{-x} + xe^{-x})x^{-2}dx - 1 \]

\[ = -1 + \int_0^\infty \left(e^{-sx}(1 - e^{-x}) - xe^{-x}\right)x^{-2}dx = -1 + \int_0^\infty \left(e^{-sx} - \frac{xe^{-x}}{1 - e^{-x}}\right)(1 - e^{-x})x^{-2}dx \]

\[ = -1 + \int_0^\infty (e^{-sx} - 1)(1 - e^{-x})x^{-2}dx + \int_0^\infty \left(1 - \frac{xe^{-x}}{1 - e^{-x}}\right)(1 - e^{-x})x^{-2}dx = -\int_0^\infty (1 - e^{-sx})x^{-1}(1 - e^{-x})x^{-1}dx. \]
The last line uses the fact observed above that $F(1) = -1$. For proving the second statement in (16) we use the Frullani integral
\[ \log(1 + s) = \int_0^\infty (1 - e^{-sx})e^{-x}x^{-1}dx \tag{19} \]
that we add to $\log\frac{s^{s}}{(s+1)^{s+1}}$ to obtain the desired result.

The proof of (17) follows the same lines: we start from the logarithm of the first formula in (15) and replace $s$ by $s + 1$ and $s + \frac{1}{R}$. We identify the coefficient $f(x)$ of $e^{-sx}$ defined by (17) and we rearrange the remaining terms. The integral

\[ \int_0^\infty f(x)dx = \frac{\log R}{R - 1} \]
is computed by an integration by parts followed by the application of (19). For proving (18) we combine (19) and the first statement in (16) applied to $s + 1$ instead of $s$:

\[ \frac{1}{4}\left(\frac{s+2}{s+1}\right)^{s+2} = \frac{1}{4}\,\frac{(s+2)^{s+2}}{(s+1)^{s+1}}\,\frac{1}{s+1} = \frac{1}{4}\exp\int_0^\infty \left\{(e^{-sx} - 1)e^{-x}x^{-1} + (1 - e^{-sx}e^{-x})(1 - e^{-x})x^{-2}\right\}dx \]
\[ = \frac{1}{4}\exp\left\{-\int_0^\infty (1 - e^{-sx})\,e^{-x}(x - 1 + e^{-x})x^{-2}dx + \int_0^\infty \left(\frac{1 - e^{-x}}{x}\right)^{2}dx\right\}. \]
To conclude we have by an integration by parts and (19) that $\int_0^\infty \left(\frac{1 - e^{-x}}{x}\right)^{2}dx = \log 4$. This ends the proof of (18).

4.3. Lévy measures

Recall that we concentrate on the one dimensional case in this section. Extension to the $\mathbb{R}^n$ case is possible, but we will have no use for it in this paper. Recall also that if a probability $P$ is in $\mathcal{M}(\mathbb{R})$ then $P$ is infinitely divisible if there exists a positive measure $\nu(dx)$ on $\mathbb{R}\setminus\{0\}$ such that
\[ \int_{\mathbb{R}\setminus\{0\}} \min(1, x^{2})\,\nu(dx) < \infty, \]
and there exist two numbers $a \in \mathbb{R}$ and $\sigma \geq 0$ such that for all $s \in S(P)$ we have

\[ \ell_P(s) = as + \frac{1}{2}\sigma^{2}s^{2} + \int_{\mathbb{R}\setminus\{0\}} (e^{-sx} - 1 + s\tau(x))\,\nu(dx). \tag{20} \]
Here $\tau$ is a bounded function such that $\tau(x)/x \to 1$ as $x \to 0$. Feller [7] chooses $\tau(x) = \sin x$, the Russian school chooses $\tau(x) = x/(1 + x^{2})$. To get simple formulas we have chosen here $\tau(x) = xe^{-|x|}$. Changing $\tau$ may change $a$ but neither $\sigma$ nor the measure $\nu$, which is called the Lévy measure of $P$. The number $\sigma$ is called the Gaussian part. If $\nu$ is bounded, we say that $P$ or $\nu$ is of Type 0. If $\nu$ is unbounded, but $\int_{\mathbb{R}\setminus\{0\}} \min(1, |x|)\,\nu(dx) < \infty$, we say that $P$ is of Type 1. For Types 0 and 1, the representation of the Laplace transform $B_P(s)$ does not need $\tau$ and we can write

\[ \ell_P(s) = as + \frac{1}{2}\sigma^{2}s^{2} - \int_{\mathbb{R}\setminus\{0\}} (1 - e^{-sx})\,\nu(dx). \]
In other cases we say that $P$ is of Type 2 and $\tau$ is necessary. The convex hull of the support of such a probability of Type 2 is always $\mathbb{R}$.

Example 3. In Section 2 we have mentioned in the Tweedie scale the unsymmetric stable distribution with parameter $\gamma \in (1, 2)$ defined by its Laplace transform

\[ e^{s^{\gamma}} = \exp\left(as + \int_0^\infty (e^{-sx} - 1 + sxe^{-x})\,\frac{\gamma(\gamma - 1)}{\Gamma(2 - \gamma)}\,x^{-\gamma-1}dx\right). \tag{21} \]
This formula can be guessed by differentiating $s^{\gamma}$ twice, and can be checked by two integrations by parts, yielding also the exact value of the constant $a$. The Lévy measure is of Type 2 and equals

\[ \nu(dx) = 1_{(0,\infty)}(x)\,\frac{\gamma(\gamma - 1)}{\Gamma(2 - \gamma)}\,x^{-\gamma-1}dx. \]
From Proposition 7, the first formula in (15), we see that the Landau distribution $\varphi$ exists, is infinitely divisible without Gaussian part and has $\nu(dx) = 1_{(0,\infty)}(x)\frac{dx}{x^{2}}$ for Lévy measure. It is of Type 2.

From Proposition 7, the second formula in (15), we see that there exists an infinitely divisible distribution $P$ such that $B_P(s) = (1 + s)^{1+s}(1 - s)^{1-s}$ with $S(P) = (-1, 1)$ and with Lévy measure
\[ \nu(dx) = \frac{e^{-|x|}}{x^{2}}dx. \]

Its type is 2. We will see in Section 5 that $P$ is the dual of the symmetric Bernoulli distribution $\frac{1}{2}(\delta_{-1} + \delta_{1})$, as expected from (1) of the introduction.

From Proposition 7, the first formula in (16), we see that there exists an infinitely divisible distribution $P$ such that $B_P(s) = s^{s}/(s+1)^{s+1}$ with $S(P) = (0, \infty)$ and with Lévy measure
\[ \nu(dx) = 1_{(0,\infty)}(x)\,\frac{1 - e^{-x}}{x^{2}}dx. \]
Its type is 1. We will see later that $P$ is the dual of the negative binomial distribution.

From Proposition 7, the second formula in (16), we see that there exists an infinitely divisible distribution $P$ such that $B_P(s) = s^{s}/(s+1)^{s}$ with $S(P) = (0, \infty)$ and with Lévy measure
\[ \nu(dx) = 1_{(0,\infty)}(x)\,\frac{1 - e^{-x}(1 + x)}{x^{2}}dx. \]
Its type is 0. We will see later that $P$ is the dual of an Abel distribution, sometimes called generalized Poisson distribution, generating an exponential family on non-negative integers with cubic variance function $m(1 + m)^{2}$.

From Proposition 7, formula (17), we see that there exists an infinitely divisible distribution $P$ such that
\[ B_P(s) = \frac{s^{s}(s+1)^{\frac{s+1}{R-1}}}{(Rs+1)^{\frac{Rs+1}{R-1}}} \]
with $S(P) = (0, \infty)$ and with Lévy measure

\[ \nu(dx) = 1_{(0,\infty)}(x)\,f(x)dx \]
where the positive function $f$ is defined by (17). Its type is 0, since the integral of $f$ is finite. We will see later that $P$ is the dual of a Takács distribution, generating an exponential family on non-negative integers with cubic variance function $m(1 + m)(1 + Rm)$.

From Proposition 7, formula (18), we see that there exists an infinitely divisible distribution $P$ such that $B_P(s) = \left(\frac{s+2}{s+1}\right)^{s+2}$ with $S(P) = (0, \infty)$ and with Lévy measure
\[ \nu(dx) = 1_{(0,\infty)}(x)\,e^{-x}\,\frac{x - 1 + e^{-x}}{x^{2}}dx. \]
Its type is 0. We will see later that $P$ is the dual of a Kendall-Ressel distribution, generating an exponential family with cubic variance function $m^{2}(1 + m)$.
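The Lévy representations of this section can be sanity checked by quadrature. The sketch below tests two of them for one value of $s$, assuming the readings $s\log s - s = \int_0^\infty (e^{-sx} - 1 + sxe^{-x})x^{-2}dx$ for the first formula in (15) and $s\log s - (s+1)\log(s+1) = -\int_0^\infty (1 - e^{-sx})(1 - e^{-x})x^{-2}dx$ for the first formula in (16). The Simpson rule, the truncation point `CUTOFF` with its explicit $1/x^2$ tail correction, and the function names are choices of this example.

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

CUTOFF, STEPS = 200.0, 40000   # illustrative numerical choices

def exponent_15(s):
    """Integral in the first formula of (15), the Landau law."""
    def f(x):
        if x == 0.0:
            return 0.5 * s * s - s          # limit of the integrand at 0
        return (math.expm1(-s * x) + s * x * math.exp(-x)) / (x * x)
    # beyond CUTOFF the integrand is -1/x^2 up to exponentially small terms
    return simpson(f, 0.0, CUTOFF, STEPS) - 1.0 / CUTOFF

def exponent_16(s):
    """Minus the integral in the first formula of (16), a Type 1 example."""
    def g(x):
        if x == 0.0:
            return s                        # limit of the integrand at 0
        return math.expm1(-s * x) * math.expm1(-x) / (x * x)
    # beyond CUTOFF the integrand is 1/x^2 up to exponentially small terms
    return -(simpson(g, 0.0, CUTOFF, STEPS) + 1.0 / CUTOFF)

s = 2.0
print(exponent_15(s), s * math.log(s) - s)
print(exponent_16(s), s * math.log(s) - (s + 1) * math.log(s + 1))
```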

5. A dictionary of dual pairs for quadratic and cubic families in R

We split the section into two parts: duals of Morris families and duals of cubic families. The case of the Tweedie scale has already been settled in Section 3.1, in particular by Proposition 2. Most of the distributions considered are infinitely divisible, which means that the corresponding Jørgensen set is $(0,\infty)$. Formulas for passing from $\mu$ to $\mu_\lambda$ and from $\mu^*$ to $(\mu_\lambda)^*$ have been given above in Section 3.2. For this reason, except in the binomial case, we give simplified versions which ignore the Jørgensen parameter $\lambda$. For instance, looking at the dual of the negative binomial family with variance function $m + \frac{m^2}{\lambda}$, we deal only with $m + m^2$ in order to have more readable formulas for the duals.

The presentation of each pair $(TF, TF^*)$ is done by selecting a $\mu$ and a $\nu$ which may be only a translation of $\mu^*$. These measures generate $TF$ and $TF^*$ respectively, namely $F(\mu) \subset TF$ and $F(\nu) \subset TF^*$. We give one of the two variance functions $V_{F(\mu)}(m)$ and $V_{F(\nu)}(m)$ as well as one of the two Laplace transforms $B_\mu(s)$ and $B_\nu(s)$. In most of the cubic cases, the densities on $\mathbb R$, or the weights in the discrete cases, are not available, and only one of the Laplace transform and the variance function is computable explicitly: hence question marks may appear in the description of the dual.

Logic would require writing $V_{F(\nu)}(s)$ and $B_\nu(m)$ systematically. However we do not always stick to this rule, for two reasons. First, variance functions of exponential families have become familiar objects during the last forty years since Morris [21], with many examples, and we are used to writing $m$, not $s$, for the mean, exactly as we would be reluctant to call $c$ the radius of a circle and $r$ its center. The second reason is that, after all, $TF^{**} = TF$. Which is the first, $TF$ or $TF^*$?

5.1. Duals of Morris families

1. The Poisson case
$$(m,\ \exp(e^{-s});\ e^{m},\ e^{-s}s^{s}).$$

The Poisson exponential family is generated by $\mu = \sum_{n=0}^{\infty}\frac{\delta_n}{n!}$ with Laplace transform $\exp(e^{-s})$. We have seen in Section 3.1 that $\ell_\varphi(m) = m\log m - m$ and therefore $\ell'_\varphi(m) = \log m = -s$, $m = e^{-s}$, $\ell'_{\varphi^*}(s) = -e^{-s}$ and, ignoring the integration constant, $\ell_{\varphi^*}(s) = e^{-s}$. Therefore $\mu$ is a dual of the Landau distribution $\varphi$.

2. The Bernoulli case
$$\left(m-m^2,\ 1+e^{-s};\ 4\cosh^2\frac m2,\ s^s(1-s)^{1-s}\right).$$

The Bernoulli distribution is generated by $\mu = \delta_0+\delta_1$ leading to
$$\ell_\mu(s) = \log(1+e^{-s}),\qquad \ell'_\mu(s) = -\frac{1}{1+e^{s}} = -m, \qquad (22)$$
where $m$ is in the domain of the means $(0,1)$ of the Bernoulli distribution. Therefore $\ell'_{\mu^*}(m) = -s = -\log(1-m)+\log m$ and, ignoring the integration constant,
$$\ell_{\mu^*}(m) = (1-m)\log(1-m)+m\log m,\qquad B_{\mu^*}(m) = m^m(1-m)^{1-m}.$$
One can describe $\mu^*$ by introducing the law $\tilde\varphi$ of $-X$ when $X$ has the Landau distribution $\varphi$, thus with Laplace transform $e^{m}(-m)^{-m}$ defined for $m<0$. Now we introduce the measure $\varphi_1(dx) = e^{x+1}\tilde\varphi(dx)$ with Laplace transform, defined for $m<1$,
$$B_{\varphi_1}(m) = \int_{\mathbb R}e^{-mx}\varphi_1(dx) = e\int_{\mathbb R}e^{-mx+x}\tilde\varphi(dx) = e^{m}(1-m)^{1-m}.$$
Note that, by taking $m=0$ in $B_{\varphi_1}(m)$, $\varphi_1$ is a probability. Finally, from the observation of their Laplace transforms, we get a dual probability of $\mu$ as the convolution of $\varphi$ and $\varphi_1$. We obtain
$$\mu^* = \varphi*\varphi_1,\qquad B_{\varphi*\varphi_1}(m) = e^{-m}m^{m}\,e^{m}(1-m)^{1-m} = m^m(1-m)^{1-m},$$
since the product of two Laplace transforms is well defined on the intersection of their existence domains (here $(-\infty,1)$ and $(0,\infty)$). For computing the variance function we use Proposition 1 and (22):
$$\frac{1}{V_{F(\mu^*)}(m)} = \ell''_\mu(m) = -\frac{d}{dm}\,\frac{1}{1+e^{m}} = \frac{1}{4\cosh^2\frac m2}.$$
In Proposition 7 and the second statement in (15) we have considered the case of the symmetric binomial distribution $\mu = \frac12(\delta_{-1}+\delta_1)$, which is an affine transformation of the ordinary Bernoulli distribution. We have seen that its dual exists, has Laplace transform $(1-s)^{1-s}(1+s)^{1+s}$, and is infinitely divisible with Lévy measure of Type 2. Since $\ell_\mu(s) = \log\cosh s$ we get the variance function with Proposition 1:
$$V(m) = \frac{1}{\ell''_\mu(m)} = \cosh^2 m.$$

3. The binomial case
$$\left(m-\frac{m^2}{N},\ (1+e^{-s})^N;\ \frac4N\cosh^2\frac m2,\ s^s(N-s)^{N-s}\right). \qquad (23)$$
For describing the dual of the binomial distribution from the study of the Bernoulli case, it is enough to use (10).

4. The negative binomial case
$$\left(m+m^2,\ (1-e^{-s})^{-1};\ 4\sinh^2\frac m2,\ s^s(1+s)^{-1-s}\right).$$

The negative binomial is generated by $\mu = \sum_{n=0}^{\infty}\delta_n$ leading to
$$\ell_\mu(s) = -\log(1-e^{-s}),\qquad \ell'_\mu(s) = -\frac{1}{e^{s}-1} = -m, \qquad (24)$$
where $m$ is in the domain of the means $(0,\infty)$. Therefore $\ell'_{\mu^*}(m) = -s = -\log(1+m)+\log m$ and, ignoring the integration constant,
$$\ell_{\mu^*}(m) = -(1+m)\log(1+m)+m\log m,\qquad B_{\mu^*}(m) = m^m(1+m)^{-1-m}.$$
Proposition 7, the first relation in (16), has shown the existence of the probability $\mu^*$. It is an infinitely divisible distribution of Type 1. Since $\ell'_\mu(s) = -\frac{1}{e^s-1}$, the variance function of $F(\mu^*)$ is easily obtained:
$$V_{F(\mu^*)}(m) = \frac{1}{\ell''_\mu(m)} = 4\sinh^2\frac m2.$$

5. The gamma case
$$\left(\frac{m^2}{\lambda},\ s^{-\lambda};\ \frac{m^2}{\lambda},\ s^{-\lambda}\right).$$
This self dual case has been detailed in Section 3 as a particular case of the Tweedie scale.

6. The Gaussian case
$$\left(\sigma^2,\ \exp\left(\tfrac12\sigma^2s^2\right);\ \sigma^{-2},\ \exp\left(\tfrac12\sigma^{-2}s^2\right)\right).$$
This case is very simple since $\ell_{N(0,\sigma^2)}(s) = \frac12\sigma^2s^2$, $\ell'_{N(0,\sigma^2)}(s) = \sigma^2s$ and therefore $\ell'_{N^*(0,\sigma^2)}(s) = \sigma^{-2}s$.

7. The hyperbolic case

We have seen in Section 3.6 with equation (13) that these laws have no dual.
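As an informal cross-check of the dictionary above, the defining relation of duality, $-\ell'_{\mu^*}(-\ell'_\mu(s)) = s$, can be tested numerically for the Poisson, Bernoulli and negative binomial pairs. The sketch below is only illustrative: the names `PAIRS`, `duality_defect` and `worst_defect`, as well as the test points, are assumptions made for the example, and the derivatives are approximated by central finite differences.

```python
import math

def derivative(f, x, h=1e-5):
    """Central finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

# name -> (log-Laplace transform of mu, log-Laplace transform of the dual, test points s)
PAIRS = {
    "Poisson / Landau": (
        lambda s: math.exp(-s),
        lambda m: m * math.log(m) - m,
        (-1.0, 0.5, 2.0),
    ),
    "Bernoulli": (
        lambda s: math.log(1.0 + math.exp(-s)),
        lambda m: m * math.log(m) + (1.0 - m) * math.log(1.0 - m),
        (-1.0, 0.5, 2.0),
    ),
    "negative binomial": (
        lambda s: -math.log(1.0 - math.exp(-s)),
        lambda m: m * math.log(m) - (1.0 + m) * math.log(1.0 + m),
        (0.3, 1.0, 2.0),
    ),
}

def duality_defect(ell_mu, ell_dual, s):
    """|-ell_dual'(m) - s| with m = -ell_mu'(s); zero for an exact dual pair."""
    m = -derivative(ell_mu, s)
    return abs(-derivative(ell_dual, m) - s)

def worst_defect():
    return max(duality_defect(ell_mu, ell_dual, s)
               for ell_mu, ell_dual, points in PAIRS.values()
               for s in points)
```

The integration constants ignored in the text play no role here, since only first derivatives enter the relation.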

5.2. Duals of cubic families

1. The Abel case
$$\left(m(1+m)^2,\ ?;\ ?,\ \frac{s^s}{(s+1)^{s}}\right).$$

2. The Takács case
$$\left(m(1+m)(1+Rm),\ ?;\ ?,\ \frac{s^s\,(s+1)^{\frac{s+1}{R-1}}}{(Rs+1)^{\frac{Rs+1}{R-1}}}\right).$$

3. The Kendall-Ressel case
$$\left(m^2(1+m),\ ?;\ ?,\ \left(\frac{s+1}{s}\right)^{s+1}\right).$$

The three cases above are similar. The densities of these three cubic families are explicit (see Letac and Mora [15]) but not the Laplace transforms. Similarly the variance function of the dual is not computable (except if we accept a description by the sum of a series obtained by the Lagrange formula). However, using Proposition 1 and the relation $\ell''_{\mu^*}(m) = 1/V_F(m)$, we can catch the Laplace transform of the dual. Therefore we apply this principle to the three functions
$$\frac{1}{m(1+m)^2},\qquad \frac{1}{m(1+m)(1+Rm)},\qquad \frac{1}{m^2(1+m)}.$$
Fortunately these relations can be integrated twice, leading to the three Laplace transforms appearing in the second expression of (16), (17) and (18), respectively. For seeing that these three duals do exist, the hard work has been done in Proposition 7, where it is shown that these three duals are infinitely divisible, as commented on in Section 4.3.

About (18), observe that the Laplace transform $\left(\frac{s+1}{s}\right)^{s+1}$ of an unbounded measure $\mu$ is changed by the substitution $s\mapsto s+1$ into the Laplace transform $\left(\frac{s+2}{s+1}\right)^{s+2}$, then converted into the Laplace transform $\frac14\left(\frac{s+2}{s+1}\right)^{s+2}$ of a probability of $F(\mu)$.

1. The Inverse Gaussian case
$$\left(\frac{m^3}{\lambda},\ e^{-\lambda\sqrt{2s}};\ 2^{3/2}\lambda m^{3/2},\ \exp(\lambda^2/4s)\right).$$
This is an application of Proposition 2 with $q=3$.

2. The strict arcsine family case
$$\left(m(1+m^2),\ ?;\ ?,\ \left(\frac{s}{\sqrt{1+s^2}}\right)^{s}e^{-\arctan s}\right).$$

For computing the Laplace transform of $\mu^*$ we use Proposition 1,
$$\ell''_{\mu^*}(s) = \frac{1}{s(1+s^2)} = \frac1s-\frac{s}{1+s^2},\qquad \ell'_{\mu^*}(s) = \log s-\frac12\log(1+s^2),$$
leading, after one more integration, to $B_{\mu^*}(s) = \left(\frac{s}{\sqrt{1+s^2}}\right)^{s}e^{-\arctan s}$. For seeing that the dual exists we note that
$$\int_0^\infty e^{-sx}(1-\cos x)\,dx = \frac1s-\frac{s}{1+s^2} = \ell''_{\mu^*}(s),\qquad \ell_{\mu^*}(s) = \int_0^\infty(e^{-sx}-1)\,\frac{1-\cos x}{x^2}\,dx.$$

Therefore $\mu^*$ exists, is infinitely divisible with Lévy measure $\frac{1-\cos x}{x^2}\mathbf 1_{(0,\infty)}(x)\,dx$ and is of Type 0.

3. The large arcsine family case
$$\left(m(1+2m\cos a+m^2),\ ?;\ ?,\ B_{\mu^*}(s)\right)$$
with $0<a<\pi/2$. The Laplace transform of the dual is obtained from Proposition 1 as follows:
$$\ell''_{\mu^*}(s) = \frac{1}{s(1+2s\cos a+s^2)} = \frac1s-\frac{1}{\sin^2a}\,\frac{s+2\cos a}{\left(\frac{s+\cos a}{\sin a}\right)^2+1}.$$
Observing that $\ell''_{\mu^*}$ is the Laplace transform of a positive measure is not quite obvious. Introducing $u = \frac{s+\cos a}{\sin a}$ leads to $s = u\sin a-\cos a$ and to
$$\frac{1}{\sin^2a}\,\frac{s+2\cos a}{\left(\frac{s+\cos a}{\sin a}\right)^2+1} = \frac{1}{\sin^2a}\,\frac{u\sin a+\cos a}{u^2+1} = \frac{1}{\sin^2a}\int_0^\infty e^{-ut}\sin(a+t)\,dt$$
$$= \frac{1}{\sin^2a}\int_0^\infty e^{-t\frac{s+\cos a}{\sin a}}\sin(a+t)\,dt = \frac{1}{\sin a}\int_0^\infty e^{-sx}e^{-x\cos a}\sin(a+x\sin a)\,dx.$$
We now show that for $x>0$ and $0\le a\le\pi/2$ we have
$$\frac{1}{\sin a}\,e^{-x\cos a}\sin(a+x\sin a)\le 1$$
or, equivalently, that $f(x) = \sin a\;e^{x\cos a}-\sin(a+x\sin a)\ge 0$. For seeing this we use the two inequalities $e^{c}\ge 1+c$ and $r\ge\sin r$, respectively applied to $c = x\cos a$ and $r = x\sin a$, and we get
$$f(x)\ge\sin a\,(1+x\cos a)-\sin a\cos r-\sin r\cos a = \sin a\,(1-\cos r)+\cos a\,(r-\sin r)\ge 0.$$
As a consequence, denoting
$$g(x) = 1-\frac{1}{\sin a}\,e^{-x\cos a}\sin(a+x\sin a)\ \underset{x\to 0}{\sim}\ \frac{x^2}{2},$$
we can claim that $\mu^*$ exists, is infinitely divisible with Lévy measure $\frac{g(x)}{x^2}\mathbf 1_{(0,\infty)}(x)\,dx$ and is of Type 0. One can note that the limit values $a=0$ and $a=\pi/2$ yield the Abel and the strict arcsine cases, respectively. The computation of the

function Bµ∗ (s) is a painful exercise of calculus. We get indeed 1 1 π 1 2 ℓµ (s) = s log s (s + cos a) log(s + cos a) + ( + ( a)cos a)s (cotan a) log(1 + 2s cos a + s ) ∗ − 2 − 2 2 − − 2 s + cos a s + cos a 1 2 π +(cotan a) arctan + 2 cos a log cos a + (cotan a) ( 2 a).  sin a   sin a  −
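The positivity argument used for the large arcsine family, namely $g(x) = 1-\frac{1}{\sin a}e^{-x\cos a}\sin(a+x\sin a)\ge 0$ with $g(x)\sim x^2/2$ near $0$, lends itself to a quick numerical illustration. The lines below are only a sketch on a finite grid (the grid size, the range `x_max` and the function names are arbitrary choices for the example); they illustrate the inequality and do not replace the proof.

```python
import math

def g(x, a):
    """g(x) = 1 - exp(-x cos a) sin(a + x sin a) / sin a, for 0 < a < pi/2."""
    return 1.0 - math.exp(-x * math.cos(a)) * math.sin(a + x * math.sin(a)) / math.sin(a)

def min_g_on_grid(a, x_max=40.0, steps=4000):
    """Smallest value of g on the grid x = x_max*k/steps, k = 1..steps."""
    return min(g(x_max * k / steps, a) for k in range(1, steps + 1))

def small_x_ratio(a, x=1e-3):
    """Ratio g(x) / (x^2/2), expected to be close to 1 for small x."""
    return g(x, a) / (0.5 * x * x)
```

For instance, `min_g_on_grid(1.0)` should be nonnegative and `small_x_ratio(1.0)` close to 1, in agreement with the claim that $g(x)/x^2$ is the density of a finite positive Lévy measure.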

6. The variance function $e^m-1$ and the dilogarithm law

In this section we continue our study of duality in the one dimensional case. We introduce the dilogarithm law, which generates the exponential family with variance function $e^m-1$. This gives the opportunity to introduce parent distributions $\mu_r$, $\sigma$, $\sigma_r$, $\eta$ and $\alpha$ with interesting properties, and exponential families with unexpected variance functions, like $e^m+1$ and $\sinh m$. We will see that $\mu$ and $\alpha$ are self dual.

6.1. The generating probability $\mu$

From the Bar-Lev criteria described in Letac and Mora [15] (Corollary 3.3 and Proposition 4.4), $V(m) = e^m-1$ defined on $(0,\infty)$ is the variance function of an exponential family $F$ concentrated on the non-negative integers. Let us compute a particular generating probability $\mu = \sum_{n=0}^\infty\mu(n)\delta_n$ of $F$. We use
$$ds = -\frac{dm}{V(m)} = -\frac{dm}{e^m-1} = -\frac{e^{-m}\,dm}{1-e^{-m}} = -d\log(1-e^{-m}).$$
Therefore $S(\mu) = (0,\infty)$ and
$$e^{-s} = 1-e^{-m}\ \Rightarrow\ \ell'_\mu(s) = -m = \log(1-e^{-s})\ \Rightarrow\ \ell'_\mu(s) = -\sum_{n=1}^\infty\frac{e^{-sn}}{n}. \qquad (25)$$
Recall that
$$\sum_{n=1}^\infty\frac{1}{n^2} = \frac{\pi^2}{6}.$$
Continuing the calculation and choosing the integration constant such that $\ell_\mu(0) = 0$, we have
$$\ell_\mu(s) = -\frac{\pi^2}{6}+\sum_{n=1}^\infty\frac{e^{-ns}}{n^2}\ \Rightarrow\ B_\mu(s) = \sum_{n=0}^\infty\mu(n)e^{-ns} = \exp\left(-\frac{\pi^2}{6}+\sum_{n=1}^\infty\frac{e^{-ns}}{n^2}\right).$$
As a consequence
$$\sum_{n=0}^\infty\mu(n)z^n = \exp\left(\sum_{n=1}^\infty\frac{z^n-1}{n^2}\right). \qquad (26)$$
This generating function is linked to the classical special function $\mathrm{Li}_2(z)$, called dilogarithm, defined on the unit disk $\{z\in\mathbb C\,;\ |z|\le 1\}$ by $\mathrm{Li}_2(z) = \sum_{n=1}^\infty\frac{z^n}{n^2}$, for which a vast literature exists from Euler: let us quote Lewin [19], Zagier [24] and Kirilov [12]. The measure $\mu$ is a probability, as we can see by putting $z=1$ in (26). It is an infinitely divisible distribution with Lévy measure $\sum_{n=1}^\infty\frac{\delta_n}{n^2}$. For this reason we call $\mu$ the dilogarithm law in the sequel. Here is a surprising property that $\mu$ shares with the normal and the gamma laws, namely the self duality.

Proposition 8. The dilogarithm distribution $\mu$ is dual of itself.

Proof: The result is immediate from (25), which implies the symmetric relation $e^{-s}+e^{-m} = 1$.

Consider $Y$ and $Y'$ which are independent with the same distribution $\mu$. We denote by $\sigma$ the distribution of $Y-Y'$ in $\mathbb Z$. Consider also an element $P(-s,\mu)$ of the exponential family $F(\mu)$. Since $S(\mu) = (0,\infty)$ we rather write $r = e^{-s}\in(0,1)$ and we consider the probability $\mu_r = P(-s,\mu)$ which satisfies
$$\sum_{n=0}^\infty\mu_r(n)z^n = e^{\mathrm{Li}_2(rz)-\mathrm{Li}_2(r)}. \qquad (27)$$
Similarly, we introduce $Y_r$ and $Y_r'$ independent with the same distribution $\mu_r$ and denote by $\sigma_r$ the distribution of $Y_r-Y_r'$ in $\mathbb Z$. In Sections 6.2 and 6.3 we study the four probabilities $\mu$, $\sigma$, $\mu_r$, $\sigma_r$.

6.2. Some properties of $\mu$ and $\sigma$

For having a probabilistic interpretation of the probability $\mu$ defined by (26), consider the probability $\nu$ on the positive integers defined by
$$\nu = \frac{6}{\pi^2}\sum_{n=1}^\infty\frac{\delta_n}{n^2}.$$
Suppose that $X_1,\dots,X_n,\dots$ are iid with distribution $\nu$ and consider an independent Poisson random variable $N$ with mean $\lambda = \pi^2/6$. Then the distribution of $X_1+\cdots+X_N$ is $\mu$ by a classical calculation using conditioning by $N$. Of course, this shows that $\mu$ is infinitely divisible of type 0.
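The compound Poisson description makes the weights $\mu(n)$ easy to compute. Differentiating (26) gives the recursion $n\,\mu(n) = \sum_{k=1}^{n}\mu(n-k)/k$ with $\mu(0) = e^{-\pi^2/6}$; this recursion is a standard consequence of the exponential form of the generating function and is not stated in the text. The sketch below (function names and the truncation level `n_max` are illustrative assumptions) computes the weights and checks numerically that the tilted law $\mu_r$ has mean $m = -\log(1-r)$ and variance $e^m-1$.

```python
import math

def dilogarithm_law(n_max):
    """Weights mu(0..n_max) from n*mu(n) = sum_{k=1}^{n} mu(n-k)/k, mu(0) = exp(-pi^2/6)."""
    mu = [math.exp(-math.pi ** 2 / 6.0)]
    for n in range(1, n_max + 1):
        mu.append(sum(mu[n - k] / k for k in range(1, n + 1)) / n)
    return mu

def tilted_mean_and_variance(r, n_max=300):
    """Mean and variance of mu_r(n), proportional to mu(n) r^n, truncated at n_max."""
    mu = dilogarithm_law(n_max)
    weights = [mu[n] * r ** n for n in range(n_max + 1)]
    total = sum(weights)
    mean = sum(n * w for n, w in enumerate(weights)) / total
    second = sum(n * n * w for n, w in enumerate(weights)) / total
    return mean, second - mean ** 2
```

For $r = 1/2$ one expects the mean $\log 2$ and the variance $e^{\log 2}-1 = 1$. The truncation is harmless for $r<1$ because of the geometric factor $r^n$; for $\mu$ itself the tail decays only like $1/n^2$.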

Proposition 9. If $Y$ and $Y'$ are independent with the same distribution $\mu$, then the characteristic function of the random variable $Y-Y'\sim\sigma$ on $\mathbb Z$ is the function with period $2\pi$ given for $0\le t\le 2\pi$ by
$$E(e^{it(Y-Y')}) = e^{-\frac12t(2\pi-t)}.$$

Proof: Since $E(e^{itY}) = \exp\left(\sum_{n=1}^\infty\frac{e^{int}-1}{n^2}\right)$ we have
$$E(e^{it(Y-Y')}) = \exp\left(2\sum_{n=1}^\infty\frac{\cos nt-1}{n^2}\right) = \exp\left(-\frac12t(2\pi-t)\right),$$
since, by an elementary calculation of Fourier series, we have for $0\le t\le 2\pi$
$$\sum_{n=1}^\infty\frac{\cos nt}{n^2} = \frac{\pi^2}{6}-\frac14t(2\pi-t).$$

Remark 2. 1) Since $S(\mu) = (0,\infty)$, $\sigma$ has no Laplace transform and cannot generate an exponential family. 2) The calculation of $\Pr(Y=Y')$ is not elementary. More specifically, by the change of variable $s = t-\pi$ and Mathematica:
$$\Pr(Y=Y') = \frac{1}{2\pi}\int_0^{2\pi}e^{-\frac12t(2\pi-t)}\,dt = \frac1\pi\int_0^{\pi}e^{\frac12s^2-\frac12\pi^2}\,ds = 0.11751.$$
By a similar calculation, for $n\in\mathbb Z$,
$$\Pr(Y-Y'=n) = \frac{1}{2\pi}\int_0^{2\pi}e^{-\frac12t(2\pi-t)+int}\,dt = \frac{(-1)^n}{\pi}\int_0^{\pi}e^{\frac12s^2-\frac12\pi^2}\cos(ns)\,ds.$$
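The value of $\Pr(Y=Y')$ can be cross-checked in two independent ways: by numerical integration of the formula above, and by summing $\mu(n)^2$ with the weights obtained from the recursion $n\,\mu(n) = \sum_{k=1}^n\mu(n-k)/k$, $\mu(0) = e^{-\pi^2/6}$, which follows from (26). The following sketch uses a composite Simpson rule; the number of panels, the truncation level and the function names are arbitrary illustrative choices.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of panels."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def prob_equal_by_integral():
    """(1/pi) * integral over [0, pi] of exp(s^2/2 - pi^2/2)."""
    return simpson(lambda s: math.exp(0.5 * s * s - 0.5 * math.pi ** 2), 0.0, math.pi) / math.pi

def prob_equal_by_series(n_max=400):
    """Sum of mu(n)^2 for n <= n_max, with mu computed by the recursion from (26)."""
    mu = [math.exp(-math.pi ** 2 / 6.0)]
    for n in range(1, n_max + 1):
        mu.append(sum(mu[n - k] / k for k in range(1, n + 1)) / n)
    return sum(w * w for w in mu)
```

Both functions should agree closely with each other and with the value 0.11751 quoted in the remark; the series converges quickly since $\mu(n)^2$ decays like $n^{-4}$.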

The fact that $E(e^{it(Y-Y')}) = e^{\frac{t^2}{2}-\pi t}$ in $[0,2\pi]$ calls for a link with the Gaussian distribution, as shown in the next proposition.

Proposition 10. If $Z\sim N(0,1)$ and $Y,Y'\sim\mu$ are independent, then the logarithm of the characteristic function of the continuous random variable $W = Z+Y-Y'$ is made of piecewise affine functions and is concave. More specifically, if $t-2\pi k\in[0,2\pi)$ with $k\in\mathbb Z$ then
$$\log E(e^{itW}) = 2k(k+1)\pi^2-(2k+1)\pi t$$
and the density $f(x)$ of $W$, using the notation $e_k(x) = e^{-2k^2\pi^2-2ik\pi x}$, equals
$$f(x) = \frac{1}{2\pi}\sum_{k\in\mathbb Z}\frac{e_k(x)-e_{k+1}(x)}{(2k+1)\pi+ix}. \qquad (28)$$

Proof: If $t-2\pi k\in[0,2\pi)$ with $k\in\mathbb Z$ we have
$$\log E(e^{itW}) = -\frac12t^2+\frac12(t-2k\pi)^2-\pi(t-2k\pi) = 2k(k+1)\pi^2-(2k+1)\pi t.$$
To compute $f(x)$ we use the inverse Fourier transform, which leads easily to (28):
$$f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty}e^{-itx}E(e^{itW})\,dt = \frac{1}{2\pi}\sum_{k\in\mathbb Z}\int_{2k\pi}^{2(k+1)\pi}e^{-itx+2k(k+1)\pi^2-(2k+1)\pi t}\,dt.$$

6.3. Some properties of µr and σr

Proposition 11. Let $0<r<1$. If $Y_r$ and $Y_r'$ are independent with the same distribution $\mu_r$, then the Laplace transform of the law $\sigma_r$ of $Y_r-Y_r'$ is defined for $s$ in $S(\sigma_r) = (\log r,\log(1/r))$ by
$$E(e^{-s(Y_r-Y_r')}) = \exp\left(-\int_0^r\log(1-2x\cosh s+x^2)\,\frac{dx}{x}+2\int_0^r\log(1-x)\,\frac{dx}{x}\right).$$
The distribution $\sigma_r$ generates an exponential family $F(\sigma_r)$, where $\mathbb R$ is the domain of the means and the variance function is, with $a = \sinh\frac m2$,
$$V_{F(\sigma_r)}(m) = \frac{2}{1-r^2}\sqrt{r^2+a^2}\left(\sqrt{r^2+a^2}+\sqrt{1+a^2}\right).$$

Proof: From (27), $B_{\sigma_r}(s)$ exists if and only if $re^{-s}<1$ and $re^{s}<1$, in other terms if $|s|<-\log r$. Since $\mathrm{Li}_2(r) = -\int_0^r\log(1-x)\frac{dx}{x}$ we get
$$\mathrm{Li}_2(re^{-s}) = -\int_0^{re^{-s}}\log(1-x)\,\frac{dx}{x} = -\int_0^r\log(1-xe^{-s})\,\frac{dx}{x}$$
and $B_{\sigma_r}(s)$ is easily deduced from this expression. For computing the variance function, we observe that
$$m = k'_{\sigma_r}(s) = \log(1-re^{-s})-\log(1-re^{s}). \qquad (29)$$
Equality (29) shows that $\mathbb R$ is the domain of the means by letting $s\to\pm\log r$. Also, (29) can be rewritten
$$a = \sinh\frac m2 = r\sinh\left(s+\frac m2\right). \qquad (30)$$
With the classical notations $\theta = \psi(m)$ and $V_{F(\sigma_r)}(m) = 1/\psi'(m)$ we can write, by differentiating (30) with respect to $m$,
$$\frac12\sqrt{1+a^2} = \frac12\cosh\frac m2 = r\cosh\left(\psi(m)+\frac m2\right)\left(\psi'(m)+\frac12\right) = r\sqrt{1+\frac{a^2}{r^2}}\left(\frac{1}{V_{F(\sigma_r)}(m)}+\frac12\right).$$
This leads to the desired formula:
$$V_{F(\sigma_r)}(m) = \frac{2\sqrt{r^2+a^2}}{\sqrt{1+a^2}-\sqrt{r^2+a^2}} = \frac{2}{1-r^2}\sqrt{r^2+a^2}\left(\sqrt{1+a^2}+\sqrt{r^2+a^2}\right).$$
The calculation of the dual of $\sigma_r$ involves elliptic integrals and its existence is unproven.
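The closed form of $V_{F(\sigma_r)}$ can be compared numerically with the direct computation from (29): for a given $s$ with $|s|<-\log r$ one evaluates the mean $m = k'_{\sigma_r}(s)$ and the variance $k''_{\sigma_r}(s)$, then plugs $m$ into the formula of Proposition 11. The sketch below does exactly this; the function names and the chosen test values of $r$ and $s$ are illustrative assumptions.

```python
import math

def mean_and_variance(r, s):
    """m = k'(s) as in (29) and V = k''(s), valid for |s| < -log r."""
    p, q = r * math.exp(-s), r * math.exp(s)
    m = math.log(1.0 - p) - math.log(1.0 - q)
    v = p / (1.0 - p) + q / (1.0 - q)
    return m, v

def closed_form_variance(r, m):
    """Variance function of Proposition 11 with a = sinh(m/2)."""
    a = math.sinh(m / 2.0)
    root = math.sqrt(r * r + a * a)
    return 2.0 / (1.0 - r * r) * root * (root + math.sqrt(1.0 + a * a))

def max_relative_gap(r, s_values):
    gaps = []
    for s in s_values:
        m, v = mean_and_variance(r, s)
        gaps.append(abs(closed_form_variance(r, m) - v) / v)
    return max(gaps)
```

For example, with $r = 1/2$ and $s = 0$ both expressions give the variance $2$.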

6.4. The variance function $e^m+1$

In this section we study the exponential family $F(\eta)$ on $\mathbb R$ with variance function $e^m+1$, which has the most surprising property that the probability $\eta$ is equal to $N(0,1)*\mu$, where $\mu$ is the dilogarithm probability defined by (26). Since the function $e^m+1$ is real analytic and positive on $\mathbb R$, if it is a variance function then the domain of the means of the corresponding natural exponential family will be $\mathbb R$. Now, to decide whether it is a variance function or not, we have to compute a potential Laplace transform in the usual way, writing $u = e^{-m}$:
$$ds = -\frac{dm}{e^m+1} = -\frac{e^{-m}\,dm}{1+e^{-m}} = \frac{du}{1+u} = d\log(1+u) = d\log(1+e^{-m}).$$
Since $e^{-m}>0$ we can take $S(\eta) = (0,\infty)$ and $s = \log(1+e^{-m})$. This leads to
$$\ell'_\eta(s) = -m = \log(e^{s}-1) = s+\log(1-e^{-s}) = s-\sum_{n=1}^\infty\frac{e^{-ns}}{n}.$$
By choosing the integration constant such that $\eta$ is a probability, we get
$$\ell_\eta(s) = \frac12s^2-\frac{\pi^2}{6}+\sum_{n=1}^\infty\frac{e^{-ns}}{n^2} = \ell_{N(0,1)}(s)+\ell_\mu(s).$$
Showing the unexpected result that $\eta = N(0,1)*\mu$ proves also that $\eta$ exists and is a positive measure, and that $e^m+1$ is the variance function of an exponential family. Proposition 10 given above can be reformulated in terms of $\mu$ and $\eta$: if $X\sim\eta$ and $Y'\sim\mu$ are independent, then $W\sim X-Y'$. One can remark that, since $e^m$ and $e^m\pm 1$ are variance functions, using affinities and the Jørgensen set would enable us to describe all exponential families with variance functions $Ae^{m/\lambda}+B$. We leave it to the reader to prove that $\eta$ has no dual.

6.5. The variance function $\sinh(m)$

In this section we study the exponential family $F(\alpha)$ on $\mathbb N$ with variance function $\sinh(m)$ and we describe the link between the probability $\alpha = (\alpha(n))_{n=0}^\infty$ and the dilogarithm function $\mathrm{Li}_2(z)$. The existence of $\alpha$ is granted by the Bar-Lev theorem since the Taylor expansion of $\sinh(m)$ has only nonnegative coefficients (see [15], Corollary 3.3). The fact that $\alpha$ is concentrated on $\mathbb N$ is granted by the fact that $\sinh(m)\sim m$ as $m\to 0$ (see [15], Proposition 4.4). For computing $\alpha$ we proceed in the usual way, with $u = e^m$:
$$ds = -\frac{dm}{\sinh(m)} = -\frac{2e^{m}\,dm}{e^{2m}-1} = -\frac{2\,du}{u^2-1} = -\left(\frac{1}{u-1}-\frac{1}{u+1}\right)du = -d\log\frac{u-1}{u+1} = -d\log\frac{e^m-1}{e^m+1}.$$
Since $m>0$ we can take $S(\alpha) = (0,\infty)$ and $s = -\log\frac{e^m-1}{e^m+1}$. This leads to
$$\ell'_\alpha(s) = -m = -\log\frac{1+e^{-s}}{1-e^{-s}} = -2\sum_{n=1}^\infty\frac{e^{-(2n-1)s}}{2n-1}. \qquad (31)$$
Writing $z = e^{-s}\in(0,1)$ we get
$$\ell_\alpha(s) = C+2\sum_{n=1}^\infty\frac{z^{2n-1}}{(2n-1)^2} = C+2\,\mathrm{Li}_2(z)-\frac12\mathrm{Li}_2(z^2).$$
Since $\mathrm{Li}_2(1) = \pi^2/6$, in order to have $\alpha$ of mass 1, we take $C = -\pi^2/4$ and we finally obtain
$$\sum_{n=0}^\infty\alpha(n)z^n = e^{-\frac{\pi^2}{4}+2\mathrm{Li}_2(z)-\frac12\mathrm{Li}_2(z^2)}. \qquad (32)$$
Of course, if
$$\beta(dx) = \frac{8}{\pi^2}\sum_{n=1}^\infty\frac{1}{(2n-1)^2}\,\delta_{2n-1},$$
if $N$ is Poisson distributed with mean $\pi^2/4$, and if $X_1,\dots,X_n,\dots$ are iid with distribution $\beta$ and independent of $N$, we have $X_1+\cdots+X_N\sim\alpha$.

Proposition 12. The probability $\alpha$ is dual of itself.

Proof: From (31) we obtain $e^{-m}+e^{-s}+e^{-s-m} = 1$. The symmetry between $m$ and $s$ implies the result.

The last proposition links $\alpha$ and $\mu$. The proof follows from (32).

Proposition 13. If $Y\sim\mu$ and $X\sim\alpha^{*2}$ are independent, then $X+2Y\sim\mu^{*4}$.
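As for the dilogarithm law, the weights $\alpha(n)$ can be obtained from (32) by a recursion, here $n\,\alpha(n) = \sum_{k\ \mathrm{odd},\,k\le n}\frac2k\,\alpha(n-k)$ with $\alpha(0) = e^{-\pi^2/4}$; this recursion is derived by differentiating (32) and is not stated in the text. The sketch below (names and truncation level are illustrative assumptions) checks on the tilted laws that the variance equals $\sinh$ of the mean, and evaluates the symmetric relation used in the proof of Proposition 12.

```python
import math

def alpha_law(n_max):
    """alpha(0..n_max) from n*alpha(n) = sum over odd k <= n of (2/k)*alpha(n-k)."""
    alpha = [math.exp(-math.pi ** 2 / 4.0)]
    for n in range(1, n_max + 1):
        alpha.append(sum(2.0 * alpha[n - k] / k for k in range(1, n + 1, 2)) / n)
    return alpha

def tilted_mean_and_variance(r, n_max=300):
    """Mean and variance of the law proportional to alpha(n) r^n, truncated at n_max."""
    alpha = alpha_law(n_max)
    weights = [alpha[n] * r ** n for n in range(n_max + 1)]
    total = sum(weights)
    mean = sum(n * w for n, w in enumerate(weights)) / total
    second = sum(n * n * w for n, w in enumerate(weights)) / total
    return mean, second - mean ** 2

def self_duality_defect(s):
    """|e^{-m} + e^{-s} + e^{-s-m} - 1| with m given by (31)."""
    m = math.log((1.0 + math.exp(-s)) / (1.0 - math.exp(-s)))
    return abs(math.exp(-m) + math.exp(-s) + math.exp(-s - m) - 1.0)
```

With $r = e^{-s} = 1/2$ the mean should be $\log 3$ and the variance $\sinh(\log 3) = 4/3$.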

7. Examples of duality in $\mathbb R^n$

The classification of exponential families in $\mathbb R^n$ has been done in the literature with the same guidelines as for $\mathbb R$: from the simplest variance functions like the Tweedie scale, or like the quadratic ones of Morris [21], to more complicated ones like the cubic families (see [15]) or the Babel class (see [16]). For $\mathbb R^n$, several choices of classification through a definition of simple have been carried out:

1. The first one has been to extend the gamma family with shape parameter $p>0$, whose variance function is $m^2/p$, to homogeneous quadratic variance functions in $\mathbb R^n$. More specifically one had to find which variance matrices $V_F(m)$ of order $n$ are made of homogeneous quadratic polynomials with respect to $m = (m_1,\dots,m_n)$. The answer has been given by Casalis [3]: apart from trivial cases, the only exponential families with homogeneous quadratic variance functions are the Wishart ones, on symmetric, Hermitian, quaternionic Hermitian matrices, and analogous objects on the Lorentz cones and on the exceptional Albert algebra.

2. The second choice has been to consider the so-called simple quadratic families in $\mathbb R^n$. While a quadratic family has variance function of the form $V_F(m) = A(m)+B(m)+C$, where $m\mapsto A(m)$ is quadratic homogeneous, $m\mapsto B(m)$ is linear and $C$ is a constant symmetric matrix of order $n$, such a family is said to be simple if $A(m)$ has rank one, and is therefore of the form $\frac1\lambda\,m\otimes m$; recall that if $E$ is a Euclidean space and $a,b$ are in $E$ then $a\otimes b$ is the endomorphism of $E$ defined by
$$x\mapsto(a\otimes b)(x) = a\langle b,x\rangle.$$
Here again their classification has been done by Casalis [4], while the classification of general quadratic families is an open problem.

3. The third block is the simple cubic exponential families, obtained from the simple quadratic ones by a Möbius transformation of their variance function. They have been classified by Hassaïri [8].

4. The last choice is the so-called diagonal families in $\mathbb R^n$, defined by the fact that the diagonal part of $V_F(m)$ has the form $(V_1(m_1),\dots,V_n(m_n))$. They have been classified in a paper with six coauthors: see Bar-Lev et al. [2].

7.1. The Wishart families are self dual

In the Euclidean space $V$ of real symmetric matrices of order $n$ with scalar product $\langle x,y\rangle = \mathrm{trace}(xy)$, consider the cone $V_+\subset V$ of positive definite matrices and the cone $\overline{V_+}\subset V$ of semipositive definite matrices. If
$$p\in\Lambda = \left\{\frac12,1,\frac32,\dots,\frac{n-1}{2}\right\}\cup\left(\frac{n-1}{2},\infty\right),$$
the family of the Wishart distributions of shape parameter $p$ is generated by the unbounded measure $\mu_p$ on $\overline{V_+}$ which can be defined on $s\in V_+$ by its Laplace transform
$$\int_{\overline{V_+}}e^{-\langle s,x\rangle}\mu_p(dx) = \frac{1}{(\det s)^p}.$$
As a consequence $\ell_{\mu_p}(s) = -p\log\det s$ and $\ell'_{\mu_p}(s) = -ps^{-1}$, since the gradient of $s\mapsto\log\det s$ is $s^{-1}$. Clearly $\ell'_{\mu_p}(-\ell'_{\mu_p}(s)) = -p(ps^{-1})^{-1} = -s$, and this shows that $\mu_p$ is a dual of itself.

We are not going to give details for the Wishart distributions defined on the other symmetric cones: after introducing the necessary definitions, the simple calculation above remains the same. One can consult Casalis and Letac [5], for instance.

7.2. The dual of the multinomial distribution

Let $(e_1,e_2,\dots,e_n)$ be the canonical basis of $\mathbb R^n$ and let us call Bernoulli distribution any law concentrated on the $n+1$ points $(0,e_1,e_2,\dots,e_n)$, namely of the form
$$p_0\delta_0+p_1\delta_{e_1}+p_2\delta_{e_2}+\cdots+p_n\delta_{e_n},$$
where $p_i>0$ for $i\in\{0,\dots,n\}$ and $p_0+p_1+\cdots+p_n = 1$. The set of all these Bernoulli distributions is an exponential family generated by the measure $\mu = \delta_0+\delta_{e_1}+\cdots+\delta_{e_n}$ whose Laplace transform is
$$\Delta = B_\mu(s_1,\dots,s_n) = 1+e^{-s_1}+\cdots+e^{-s_n}.$$
Therefore
$$\ell_\mu(s) = \log\Delta,\qquad m_1 = -\frac{\partial}{\partial s_1}\ell_\mu(s) = \frac{e^{-s_1}}{\Delta},\ \dots,\ m_n = -\frac{\partial}{\partial s_n}\ell_\mu(s) = \frac{e^{-s_n}}{\Delta}.$$
For deciding whether $\mu^*$ exists or not we compute $\ell_{\mu^*}(m)$ such that $-\ell'_{\mu^*}(-\ell'_\mu(s)) = s$. Thus we have to compute $(s_1,\dots,s_n)$ with respect to $(m_1,\dots,m_n)$ as follows:
$$m_1\Delta = e^{-s_1},\ \dots,\ m_n\Delta = e^{-s_n},\qquad 1+(m_1+\cdots+m_n)\Delta = \Delta,\qquad \Delta = \frac{1}{1-m_1-\cdots-m_n}.$$
Therefore
$$s_1 = -\log m_1+\log(1-m_1-\cdots-m_n),\ \dots,\ s_n = -\log m_n+\log(1-m_1-\cdots-m_n).$$
As a consequence, if $\mu^*$ does exist, it must satisfy
$$\ell'_{\mu^*}(m) = \big(\log m_1-\log(1-m_1-\cdots-m_n),\ \dots,\ \log m_n-\log(1-m_1-\cdots-m_n)\big).$$
It is easy to see that, up to an additive constant, we have
$$\ell_{\mu^*}(m) = m_1\log m_1+\cdots+m_n\log m_n+(1-m_1-\cdots-m_n)\log(1-m_1-\cdots-m_n),$$
$$B_{\mu^*}(m) = m_1^{m_1}\cdots m_n^{m_n}\,(1-m_1-\cdots-m_n)^{1-m_1-\cdots-m_n}.$$
Recall that the Landau distribution $\varphi$ is defined by $\int_{\mathbb R}e^{-sx}\varphi(dx) = e^{-s}s^{s}$. In order to describe the above $\mu^*$ let us coin a simple lemma:

Lemma 1. Let $f(s) = f(s_1,\dots,s_n) = a_1s_1+\cdots+a_ns_n+b = \langle a,s\rangle+b$ be a non-constant affine form on $\mathbb R^n$. Then $e^{-f(s)}f(s)^{f(s)}$, defined on the half space $H = \{s\,;\ f(s)>0\}$, is the Laplace transform of a positive measure $\mu(dx)$ on $\mathbb R^n$, concentrated on the line $\mathbb Ra$, defined as the image of the measure $\varphi_1(dt) = e^{-bt}\varphi(dt)$ by the map $t\mapsto x = at$. This measure is bounded if $b\ge 0$, of mass $e^{-b}b^{b}$ when $b>0$, and of mass 1 if $b=0$.

Proof: By definition we have
$$\int_{\mathbb R^n}e^{-\langle s,x\rangle}\mu(dx) = \int_{\mathbb R}e^{-t\langle s,a\rangle}\varphi_1(dt) = \int_{\mathbb R}e^{-t\langle s,a\rangle-bt}\varphi(dt) = \int_{\mathbb R}e^{-tf(s)}\varphi(dt).$$
The remainder is plain.

With this lemma we can describe $\mu^*(dx)$, which satisfies
$$\int_{\mathbb R^n}e^{-s_1x_1-\cdots-s_nx_n}\mu^*(dx) = s_1^{s_1}\cdots s_n^{s_n}\,(1-s_1-\cdots-s_n)^{1-s_1-\cdots-s_n}$$
on the tetrahedron $\{s\,;\ s_1,\dots,s_n,\ 1-s_1-\cdots-s_n>0\}$, which is the domain of the means of the exponential family of the Bernoulli distribution. From the lemma, applied to the $n+1$ affine forms $f_0(s) = -s_1-\cdots-s_n+1$, $f_1(s) = s_1$, ..., $f_n(s) = s_n$, the measure $\mu^*$ is the convolution of $n+1$ probabilities respectively concentrated on the lines $\mathbb Re_1,\dots,\mathbb Re_n$ and $\mathbb R(-e_1-\cdots-e_n)$. For $f_0$ the measure described in the lemma is bounded, has mass $e^{-1}$ and has to be normalized to become a probability:
$$e^{1-f_0}f_0^{f_0}\,e^{-f_1}f_1^{f_1}\cdots e^{-f_n}f_n^{f_n} = (1-s_1-\cdots-s_n)^{1-s_1-\cdots-s_n}\,s_1^{s_1}\cdots s_n^{s_n}.$$
It is interesting to compute the variance function of $F(\mu^*)$:

Proposition 14. Let $\Delta = 1+e^{-s_1}+\cdots+e^{-s_n}$, $D = \mathrm{diag}(e^{-s_1},\dots,e^{-s_n})$ and $J_n$ the $(n,n)$ matrix with all entries equal to one. Let $\mu^*$ be a dual of $\mu = \delta_0+\delta_{e_1}+\cdots+\delta_{e_n}$. Then the variance function of $F(\mu^*)$ is
$$V_{F(\mu^*)}(s) = \Delta(D^{-1}+J_n) = (1+e^{-s_1}+\cdots+e^{-s_n})\begin{pmatrix}e^{s_1}+1&\cdots&1\\ \vdots&\ddots&\vdots\\ 1&\cdots&e^{s_n}+1\end{pmatrix}.$$

Proof: Let $v = v(s) = (e^{-s_1},\dots,e^{-s_n})^T$. Note that $\Delta' = -v$ and that $v' = -D$. With this notation we have, since $\ell_\mu = \log\Delta$,
$$\ell'_\mu(s) = -\frac1\Delta v,\qquad \ell''_\mu(s) = \frac1\Delta D-\frac{1}{\Delta^2}v\otimes v,\qquad V_{F(\mu^*)}(s) = [\ell''_\mu(s)]^{-1}.$$
For computing the inverse matrix we use the following identity: if $a\in\mathbb R^n$ is such that $\|a\|^2 = a_1^2+\cdots+a_n^2$ is different from 1, then
$$(I_n-a\otimes a)^{-1} = I_n+\frac{a\otimes a}{1-\|a\|^2}.$$
We are going to apply this expression to
$$a = \Delta^{-1/2}D^{-1/2}v = \Delta^{-1/2}(e^{-s_1/2},\dots,e^{-s_n/2})^T,$$
which satisfies
$$\|a\|^2 = \frac{\Delta-1}{\Delta},\qquad 1-\|a\|^2 = \frac1\Delta,\qquad \frac{a\otimes a}{1-\|a\|^2} = (D^{-1/2}v)\otimes(D^{-1/2}v),$$
$$V_{F(\mu^*)}(s) = \left(\frac1\Delta D-\frac{1}{\Delta^2}v\otimes v\right)^{-1} = \Delta D^{-1/2}(I_n-a\otimes a)^{-1}D^{-1/2} = \Delta D^{-1/2}\left(I_n+\frac{a\otimes a}{1-\|a\|^2}\right)D^{-1/2}$$
$$= \Delta\left(D^{-1}+\Delta\,(D^{-1/2}a)\otimes(D^{-1/2}a)\right) = \Delta(D^{-1}+J_n).$$
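The matrix identity of Proposition 14 is easy to test numerically: the Hessian $\ell''_\mu(s) = \frac1\Delta D-\frac{1}{\Delta^2}v\otimes v$ multiplied by $\Delta(D^{-1}+J_n)$ should give the identity matrix. The following sketch uses plain Python lists to stay self-contained; the function names and the test point are illustrative assumptions.

```python
import math

def hessian_log_delta(s):
    """Hessian of log(1 + sum_i exp(-s_i)): D/Delta - v v^T / Delta^2."""
    v = [math.exp(-x) for x in s]
    delta = 1.0 + sum(v)
    n = len(s)
    return [[(v[i] / delta if i == j else 0.0) - v[i] * v[j] / delta ** 2
             for j in range(n)] for i in range(n)]

def dual_variance(s):
    """Delta * (D^{-1} + J_n), the variance function stated in Proposition 14."""
    v = [math.exp(-x) for x in s]
    delta = 1.0 + sum(v)
    n = len(s)
    return [[delta * ((1.0 / v[i] if i == j else 0.0) + 1.0) for j in range(n)]
            for i in range(n)]

def max_defect_from_identity(s):
    """Largest entry of |H W - I| where H is the Hessian and W the claimed inverse."""
    h, w = hessian_log_delta(s), dual_variance(s)
    n = len(s)
    return max(abs(sum(h[i][k] * w[k][j] for k in range(n)) - (1.0 if i == j else 0.0))
               for i in range(n) for j in range(n))
```

For $n=1$, `dual_variance([s])` reduces to $(1+e^{-s})(1+e^{s}) = 4\cosh^2\frac s2$, the variance function of the dual in the one dimensional Bernoulli case.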

Remark 3. Note that when $n=1$ the variance is $(1+e^{-s})(1+e^{s}) = 4\cosh^2\frac s2$, as seen in (23).

7.3. The other simple quadratic families

Casalis [4] splits the set of simple quadratic families in $\mathbb R^n$ into $2n+4$ types, extending the famous Morris classification for $n=1$ into six families. Among them $n+1$ are trivial from our point of view, with variance function of the form $B(m)+C$ where $m\mapsto B(m)$ is linear. In fact, with a proper choice of a basis of $\mathbb R^n$ they are products of $k$ Poisson families and $n-k$ one dimensional Gaussian families with $k\in\{0,\dots,n\}$ (see also [14]). Since trivially a product of duals is a dual of the product, there is nothing to add. The remaining $n+3$ families are made of two exceptional ones, the multinomial and the hyperbolic, and a set of $n+1$ types denoted $(NM-ga)_k$ with $k\in\{0,\dots,n\}$ that we will describe later. In the sequel we are going to prove that neither the hyperbolic type nor the $(NM-ga)_k$ have a dual when $k>0$. The case $k=n$ is referred to as the multivariate negative binomial case. The case $k=0$ has a dual if and only if $n=2$ or $n=3$.

Now it is shown that there is no dual for the multivariate negative binomial distribution. Denote by $\mathbb N$ the set of nonnegative integers. Consider the measure on $\mathbb N^n$ defined by $\mu(dx) = (\delta_0-\delta_{e_1}-\cdots-\delta_{e_n})^{*(-1)}$, where the inverse is taken in the sense of convolution. Its Laplace transform is
$$B_\mu(s) = \frac{1}{1-e^{-s_1}-\cdots-e^{-s_n}} = \frac1\Delta,$$
defined on the set $S(\mu) = \{s\in\mathbb R^n\,;\ \Delta>0\}$. We show that no dual $\mu^*$ of $\mu$ exists. Suppose the existence of $\mu^*$. Since $\ell_\mu = -\log\Delta$ we have
$$m = (m_1,\dots,m_n) = -\ell'_\mu(s) = \frac1\Delta(e^{-s_1},\dots,e^{-s_n}).$$
Since $e^{-s_i} = m_i\Delta$ we get $\Delta(m_1+\cdots+m_n) = e^{-s_1}+\cdots+e^{-s_n} = 1-\Delta$, leading to $\Delta = 1/(1+m_1+\cdots+m_n)$. Finally we have on the domain $m_i\ge 0$, for all $i\in\{1,\dots,n\}$,
$$\ell'_{\mu^*}(m) = (-s_1,\dots,-s_n)$$
with $-s_i = \log m_i-\log(1+m_1+\cdots+m_n)$, and therefore
$$\ell_{\mu^*}(m) = m_1\log m_1+\cdots+m_n\log m_n-(1+m_1+\cdots+m_n)\log(1+m_1+\cdots+m_n),$$
$$B_{\mu^*}(m) = \frac{m_1^{m_1}\cdots m_n^{m_n}}{(1+m_1+\cdots+m_n)^{1+m_1+\cdots+m_n}}. \qquad (33)$$
For fixed $m$ and $m'$ the matrix
$$M = \begin{pmatrix}B_{\mu^*}(2m)&B_{\mu^*}(m+m')\\ B_{\mu^*}(m+m')&B_{\mu^*}(2m')\end{pmatrix} \qquad (34)$$
is semi positive definite since
$$(a,b)\,M\begin{pmatrix}a\\ b\end{pmatrix} = a^2B_{\mu^*}(2m)+2ab\,B_{\mu^*}(m+m')+b^2B_{\mu^*}(2m') = \int_{\mathbb R^n}\left(ae^{-\langle m,x\rangle}+be^{-\langle m',x\rangle}\right)^2\mu^*(dx)\ge 0.$$

This implies that det M 0. Let us apply this remark to the particular case m = (0, 1, 0,..., 0) and m′ = (1, 1, 0,..., 0). It leads to ≥ 1 19339 det M = (48 3355) = < 0 334755 − −334755 which is the desired contradiction. Next it is shown that there is no dual for the (NM ga)n 1 case. We use the notation of [4]. This exponential − n− 1 family is generated by the measure µ(dx1,..., dxn 1, dy) on N − (0, ) defined by − × ∞ x1+ +xn 1 1 y ··· − (δ0 δe1 δen 1 )− ∗(dx1,..., dxn 1) 1(0, )(y)dy. − −···− − − (x1 + + xn 1)! ∞ ··· − 21 The exponential family F(µ) is theset ofthelaws of(X, Y) when X = (X1,..., Xn 1) is multivariate negative binomial n 1 − on N − and the conditional law of Y knowing X is gamma with shape parameter 1 + X = 1 + X1 + + Xn 1. Its variance function is | | ··· − Vµ(m) = m m + diag(m1,..., mn 1, 0) ⊗ − and the Laplace transform of µ is 1 s1 sn 1 1 Bµ(s) = (sn e− ... e − )− = . − − − ∆ We show first the following: If µ exists then its Laplace transform is defined on mi > 0, i 1, , n , by ∗ ∈ { ··· } m1 mn 1 m1 ... mn −1 Bµ (m) = − . (35) ∗ m1+ +mn 1+1 mn ··· −

To see this, since $\Delta' = (e^{-s_1},\dots,e^{-s_{n-1}},1)$ and since $m = -\ell'_\mu(s) = \Delta'/\Delta$, one gets $m_n = 1/\Delta$ and $e^{-s_i} = m_i/m_n$ for $i<n$, leading to
$$\ell'_{\mu^*}(m) = -s = \left(\log(m_1/m_n),\ \dots,\ \log(m_{n-1}/m_n),\ -(1+m_1+\cdots+m_{n-1})/m_n\right)$$
from which the result is obtained. To see that actually $\mu^*$ does not exist we use the same method as for the multivariate negative binomial distribution, by observing that the matrix $M$ defined by (34) has a negative determinant when choosing $m = (0,\dots,0,1)$ and $m' = (1,\dots,0,1)$.

Now it will be established that there is no dual for the $(NM-ga)_k$ case with $0<k<n-1$. This exponential family is generated by the measure $\mu(dx_1,\dots,dx_k,dy,dz)$ on $\mathbb N^k\times(0,\infty)\times\mathbb R^{n-k-1}$ defined by
$$(\delta_0-\delta_{e_1}-\cdots-\delta_{e_k})^{*(-1)}(dx_1,\dots,dx_k)\ \frac{y^{x_1+\cdots+x_k}}{(x_1+\cdots+x_k)!}\,\mathbf 1_{(0,\infty)}(y)\,dy\times N(0,yI_{n-k-1})(dz),$$
where $N(0,yI_{n-k-1})$ is the Gaussian distribution on $\mathbb R^{n-k-1}$ with covariance matrix $yI_{n-k-1}$. Its Laplace transform is $B_\mu(s) = 1/\Delta$ with
$$\Delta = s_{k+1}-\frac12\sum_{j=k+2}^{n}s_j^2-\sum_{i=1}^{k}e^{-s_i}.$$
Computations, quite analogous to the previous ones, yield that the Laplace transform of $\mu^*$ would be
$$B_{\mu^*}(m) = \frac{\prod_{i=1}^{k}e^{-m_i}m_i^{m_i}}{m_{k+1}^{1+m_1+\cdots+m_k}}\exp\left(\frac{1}{2m_{k+1}}\sum_{j=k+2}^{n}m_j^2\right). \qquad (36)$$
If $k\ne 0$, putting $m_{k+2} = \cdots = m_n = 0$ shows that $B_{\mu^*}(m)$ cannot be a Laplace transform, as we have seen for (35).

Consider now the $(NM-ga)_0$ case $B_\mu(s) = 1/\Delta$ with
$$\Delta = s_1-\frac12\sum_{j=2}^{n}s_j^2.$$
This generates the exponential family containing the distribution of $(X_1,Z_2\sqrt{X_1},\dots,Z_n\sqrt{X_1})$, where $X_1,Z_2,\dots,Z_n$ are independent, with $Z_i\sim N(0,1)$ and where $X_1$ is exponential with mean 1. Standard calculations show that if $\mu^*$ exists then
$$B_{\mu^*}(m) = \frac{1}{m_1}\exp\left(\frac{1}{2m_1}\sum_{j=2}^{n}m_j^2\right).$$
Denoting for simplicity $\vec m = (m_2,\dots,m_n)$ we can write
$$B_{\mu^*}(m_1,\vec m) = \int_{\mathbb R^{n-1}}e^{-\langle\vec m,x\rangle-\frac12m_1\|x\|^2}\,m_1^{(n-3)/2}\,\frac{dx}{(2\pi)^{(n-1)/2}}.$$
If $n-3\le 0$ we prove that $B_{\mu^*}$ is a Laplace transform as follows. If $n=2$ we use
$$\frac{1}{\sqrt{m_1}} = \int_0^\infty e^{-tm_1}\frac{dt}{\sqrt{\pi t}}$$
for writing

1 m x ∞ m (t+ 1 x2) dt 1 m x ∞ m y dy = 2 1 2 = 2 1 Bµ∗ (m1, m2) e− e− dx e− e− dx. √2π ZR Z0 √πt √2π ZR Zx2/2 1 2 π(y 2 x ) q − 1 This shows that Bµ is the Laplace transform of the density restricted to the interior of the parabola ∗ √2π √π(y 1 x2) − 2 y = x2/2. If n = 3 we have 1 m~ ,x 1 m x 2 B m ,m ~ = e 2 1 dx. µ∗ ( 1 ) 2π −h i− k k ZR2 R3 1 2 2 This is the Laplace transform of the following singular measure in : it is concentrated on the cone y = 2 (x1 + x2) and it is the image of the Lebesgue measure dx1dx2 on R2 by the map x (x, 1 x 2). 2π 7→ 2 k k To complete the study of these last two measures µ∗, one can describe the variance function of F(µ∗) with the following matrices for n = 2 and n = 3 : 1 s + 1 s2 1 s = 2 1 2 2 2 2 VF(µ∗ )(s1, s2) s1 s2 1 , − 2 ! " 2 s2 1 # s + 1 (s2 + s2) 1 s 1 s 1 1 2 2 3 2 2 2 3 V (s , s , s ) = s (s2 + s2) 1 s 1 0 . F(µ∗) 1 2 3 1 2 2 3  2 2  − !  1 s 0 1   2 3    We skip this standard computation.   For n 4 one can suspect that µ does not exist. Enoughis to proveit for n = 4, since we can do m = ... = mn = 0 ≥ ∗ 5 to pass from the case n > 4 to the case n = 4. For simplicity of calculations in the case n = 4 we rather set m1 = t and m2 = s1, m3 = s2, my = s3. And we show now that 1 1 L(~s, t) = exp ~s 2 t + 1 2(t + 1)k k ! is not a Laplace transform on ( 1, ) R3. Suppose the contrary. Consider a random variable (X~, Y) such that for t > 1 − ∞ × − ~s,X~ tY A = e−h i− , E(A) = L(~s, t). Denote ~1 = (1, 1, 1) We are going to prove that for a suitable (r, ~s, t) we have

\[
E(\|\vec X - rY\vec 1\|^2 A) < 0 \tag{37}
\]
which proves the impossibility, since $\|\vec X - rY\vec 1\|^2 A$ is nonnegative. By taking partial derivatives of $L(\vec s, t)$ a calculation gives
\[
E(\|\vec X - rY\vec 1\|^2 A) = \frac{e^{u/(t+1)}}{(t+1)^5}\, B
\]
with the shorter notations $u = \frac{1}{2}\|\vec s\|^2$ and
\[
B = 2u(t + 1)^2 + 2r\langle \vec s, \vec 1\rangle \big((t + 1)^3 + (t + 1)^2 (1 + u)\big) + r^2 \big(2(t + 1)^2 + 4u(t + 1) + u^2\big).
\]
We now choose $\vec s = -2r\vec 1$. This gives $u = 6r^2$ and $\langle \vec s, \vec 1\rangle = -6r$. The quantity $B$ becomes
\[
B = -12r^2 \big((t + 1)^3 + (t + 1)^2 (1 + 6r^2)\big) + r^2 \big(14(t + 1)^2 + 24(t + 1) r^2 + 36 r^4\big).
\]
For fixed $t > -5/6$ the term of order $r^2$ in $B$ is $r^2 (t + 1)^2 (2 - 12(t + 1)) < 0$, so we can choose $r$ small enough such that $B < 0$. Therefore (37) is proved.

Finally it is shown that there is no dual for the multivariate hyperbolic case. Following Casalis [4], p. 1836, the multivariate hyperbolic case is described by the Laplace transform

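\[
B_\mu(s) = \big(\cos s_n - e^{-s_1} - \cdots - e^{-s_{n-1}}\big)^{-1}
\]
defined on
\[
S(\mu) = \Big\{ s \in \mathbb{R}^{n-1} \times \big(-\tfrac{1}{2}\pi, \tfrac{1}{2}\pi\big)\ ;\ \cos s_n - e^{-s_1} - \cdots - e^{-s_{n-1}} > 0 \Big\}.
\]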

We spare the reader the proof of the fact that if a dual $\mu^*$ exists then its Laplace transform is, with the notation $S = 1 + m_1 + \cdots + m_{n-1}$,
\[
B_{\mu^*}(m) = \frac{\prod_{i=1}^{n-1} m_i^{m_i}}{(S^2 + m_n^2)^{S/2}}\, \exp\big(m_n \arctan(m_n/S)\big).
\]
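The omitted proof can be sanity-checked numerically. The sketch below assumes that duality means $\nabla \log B_{\mu^*}(m) = -s$ whenever $m = -\nabla \log B_\mu(s)$, which is how the relation between $\mu$ and $\mu^*$ is read here; the helper names, the finite-difference step and the test point $s$ are hypothetical choices. It uses $n = 3$ and compares a central-difference gradient of $\log B_{\mu^*}$ with $-s$.

```python
import math

def mean_of_tilted_mu(s):
    """m = -grad log B_mu(s), with B_mu(s) = 1 / (cos s_n - sum_i exp(-s_i))."""
    d = math.cos(s[-1]) - sum(math.exp(-v) for v in s[:-1])
    if d <= 0:
        raise ValueError("s must lie in S(mu)")
    return [math.exp(-v) / d for v in s[:-1]] + [-math.sin(s[-1]) / d]

def log_B_dual(m):
    """log of the candidate B_{mu*}(m); requires m_i > 0 for i < n."""
    big_s = 1.0 + sum(m[:-1])
    mn = m[-1]
    return (sum(v * math.log(v) for v in m[:-1])
            - 0.5 * big_s * math.log(big_s ** 2 + mn ** 2)
            + mn * math.atan(mn / big_s))

def grad(f, x, h=1e-6):
    """Central finite-difference gradient of f at x."""
    g = []
    for i in range(len(x)):
        up = list(x)
        dn = list(x)
        up[i] += h
        dn[i] -= h
        g.append((f(up) - f(dn)) / (2.0 * h))
    return g

s = [1.5, 2.0, 0.4]          # hypothetical point of S(mu) for n = 3
m = mean_of_tilted_mu(s)     # corresponding mean
residual = max(abs(gi + si) for gi, si in zip(grad(log_B_dual, m), s))
print(m, residual)           # residual should be at the finite-difference error level
```

A small residual at a few points of $S(\mu)$ is only a consistency check under the stated assumption, not a substitute for the proof.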

Note that if $m_1 = \cdots = m_{n-1} = 0$ we get $B_{\mu^*}(m) = h_2(m_n)$ where $h_2$ is defined in (13). But it has been proven in Section 3 that $h_2$ cannot be a Laplace transform. As a consequence the present $\mu^*$ does not exist.

8. Remarks.

Duality, with its links with large deviations as described in Proposition 6, does not have as strong a probabilistic interpretation as the notion of reciprocity, which was used in Letac and Mora [15]. However it widens the zoo of variance functions, and it creates unexpected links like the pair Poisson-Landau. The problem of deciding the existence of a dual is unsolved in many circumstances, and we need more tools than those described in Section 3.5. Why do some measures have a dual while others do not? The Babel class has been defined and described in Letac [16], as the set of the variance functions $b\Delta + (am + c)\sqrt{\Delta}$ where $\Delta$ is a polynomial of degree $\le 2$. Its study from the duality viewpoint has not been considered in the present paper. The same problem arises with the elliptic variance functions (Letac [18]) or the Seshadri class (Kokonendji [13]). The deep functional equations of the dilogarithm described in [12, 19, 24] should lead to new properties of the dilogarithm distribution. A complete characterization of self duality would also be in order. The role of steepness is not well understood: all tractable examples of a non steep $\mu$ lead to the non existence of a dual. Does non steepness imply no dual? Dealing with exponential families in $\mathbb{R}^n$ is generally hard: the simplest non trivial diagonal family in $\mathbb{R}^2$ is generated by the measure $\delta_0 + \delta_{e_1} + \delta_{e_2} + c\,\delta_{e_1 + e_2}$ where $c \neq 1$ (see [2] p. 895) and leads to computations of great complexity. We have not yet tried the cubic families in $\mathbb{R}^n$ for $n > 1$ characterized by Hassairi [8].

Acknowledgments

I have had some discussions with Shaul Bar-Lev, Lev Klebanov and Vladimir Vinogradov about various parts of the paper. I thank them all.

References

[1] S.K. Bar-Lev, P. Enis, Reproducibility and natural exponential families with power variance functions, Ann. Statist. 14 (1986) 1507–1522.
[2] S.K. Bar-Lev, D. Bshouty, P. Enis, G. Letac, I. Li Lu, D. Richards, The diagonal multivariate natural exponential families and their classification, J. Theoret. Probab. 7 (1994) 883–929.
[3] M. Casalis, Les familles exponentielles à variance quadratique homogène sont des lois de Wishart sur un cône symétrique, C.R. Acad. Sci. Paris Sér. I Math. 312 (1991) 143–146.
[4] M. Casalis, The 2d + 4 simple quadratic natural exponential families on R^d, Ann. Statist. 24 (1996) 1828–1864.
[5] M. Casalis, G. Letac, The Lukacs-Olkin-Rubin characterization of the Wishart distribution on symmetric cones, Ann. Statist. 24 (1996) 763–786.
[6] H. Cramér, Sur un nouveau théorème-limite de la théorie des probabilités, Actualités Scientifiques et Industrielles 736 (1938) 5–23.
[7] W. Feller, An Introduction to Probability Theory and Its Applications, Vol. II, Wiley, New York, 1966.
[8] A. Hassaïri, La classification des familles exponentielles de R^n par l'action du groupe linéaire de R^{n+1}, C.R. Acad. Sci. Paris Sér. I Math. 315 (1992) 207–210.
[9] B. Jørgensen, Exponential dispersion models, J. Roy. Statist. Soc. Ser. B 49 (1987) 127–162.
[10] B. Jørgensen, The Theory of Dispersion Models, Chapman and Hall, London, 1997.
[11] T. Kawata, Fourier Analysis in Probability Theory, Academic Press, New York, 1972.
[12] D. Kirilov, Dilogarithm identities, Progress of Theoretical Physics Supplement 118 (1994) 61–142, arXiv 9408113v2.
[13] C.C. Kokonendji, Exponential families with variance functions in √∆P(√∆): Seshadri's class, Test 3 (1994) 123–172.
[14] G. Letac, Le problème de la classification des familles exponentielles naturelles de R^d ayant une fonction variance quadratique, Probability on Groups IX, Oberwolfach, Lecture Notes in Mathematics, Springer 1379 (1989) 192–216.
[15] G. Letac, M. Mora, Natural exponential families with cubic variances, Ann. Statist. 18 (1990) 1–37.
[16] G. Letac, Lectures on Natural Exponential Families and Their Variance Functions, 50, Instituto de Matemática Pura e Aplicada, Rio de Janeiro, 1992.
[17] G. Letac, Integration and Probability, Exercises and Solutions, Springer, New York, 1995.
[18] G. Letac, Associated exponential families and elliptic functions, in The Fascination of Probability, Statistics and their Applications in honor of Ole Barndorff-Nielsen, eds. M. Podolskij, R. Stelzer, S. Thorbjørnsen and A. Veraart, pp. 53–84, 2016, Springer, New York.
[19] L. Lewin, Polylogarithms and Associated Functions, North Holland, New York, 1981.
[20] M. Marucho, C. Garcia-Canal, H. Fanchiotti, The Landau distribution for charged particles traversing thin films, Internat. J. Modern Phys. C 17 (2006) 1461–1476, arXiv hep-ph/0305310v3.
[21] C. Morris, Natural exponential families with quadratic variance functions, Ann. Statist. 10 (1982) 65–80.
[22] M. Tweedie, An index which distinguishes between some important exponential families, in Statistics: Applications and New Directions. Proc. Indian Institute Golden Jubilee Internat. Conf., eds. J.K. Ghosh and J. Roy, pp. 579–604, 1984, Indian Statistical Institute, Calcutta.
[23] V. Vinogradov, R. Paris, On two extensions of the canonical Feller-Spitzer distribution, J. Stat. Dist. Appl. 8 (2021) 1–25.
[24] D. Zagier, The Dilogarithm Function, in Frontiers in Number Theory, Physics, and Geometry, Vol. II, eds. P. Cartier, P. Moussa, B. Julia and P. Vanhove, pp. 3–65, 2007, ISBN 978-3-540-30308-4, DOI 10.1007/978-3-540-30308-4_1, on line.
