
DENSITY OF THE RATIO OF TWO NORMAL RANDOM VARIABLES

by T. Pham-Gia* and N. Turkkan, Université de Moncton, and E. Marchand, University of New Brunswick, CANADA

Abstract: We give the exact closed form expression of the density of $X_1/X_2$, where $X_1$ and $X_2$ are normal random variables, in terms of Hermite or confluent hypergeometric functions. All cases are considered: standardized and non-standardized variables, independent or correlated variables. Several new applications are presented, and relationships with mixtures of normal distributions are given.

Keywords and Phrases: Normal, Bivariate normal, Ratio, Hermite function, Kummer confluent hypergeometric function, Integral representation, Finite sampling

AMS Classification: 62E15, 62N05.

1. INTRODUCTION

The density of $W = X/Y$, where X and Y are normal random variables, has attracted the interest of several researchers since as early as 1930, because it is encountered in some basic problems in Statistics. Although the cases where X and Y are both standard normal, or standard bivariate normal, are fairly simple, the general cases are much more complex. Geary (1930) was the first to investigate this question, and Fieller (1932) presented another approach to evaluate this density. In the sixties, two important papers (Marsaglia (1965) and Hinkley (1969)) addressed this question, from different viewpoints. Recently, demand for this expression has resurfaced in a number of important applications, and an active exchange on the Internet Mathforum (Startz (1997), Ward (1997) and Marsaglia (2001)) has rekindled the need for a convenient expression of this distribution. Springer (1984, p. 139), using Mellin transform methods, has presented a way to obtain this density in terms of an infinite series, but the inversion of the Mellin transform in the complex plane requires some advanced computation that is not always easy to handle. Although this method does guarantee the result, much numerical analysis has to be performed, and the result is not a closed form expression.

In this article we use two special mathematical functions, the Hermite function, which is a generalization of the Hermite polynomial, and Kummer's confluent hypergeometric function, to give a convenient closed form expression for the density of X/Y. This expression can also be obtained by a completely different approach, based on conditional expectation. Some particular cases have, however, simpler expressions in terms of common functions.

In section 2, we first recall the Hermite function $H_\nu(z)$ and give its basic properties, and its integral representation when ν is a negative real number. In sections 3 and 4, the density of W is given for different, and exhaustive, cases. Section 5 discusses the interesting shapes that the density of W can take and presents some numerical applications, with related discussions. Finally, section 6 gives another look at the problem, from a cumulative distribution viewpoint, and puts it in its wider context of continuous and discrete mixtures of normal distributions.

______
*) Research partially supported by NSERC grant A9249 (Canada).

2. THE HERMITE FUNCTION and THE POWER-QUADRATIC EXPONENTIAL FAMILY

Although Hermite polynomials are widely used in Statistics, for instance in the Gram-Charlier expansion of a density, the Hermite function has had only timid encounters with distribution theory (Pham-Gia (1992)). The use of special functions, already very widespread in Mathematical Physics, is gaining ground in Statistics, where they provide powerful tools to complement the classical common functions. Dickey (1983) championed such a use a few decades ago.

The Hermite function with parameter ν, $H_\nu(z)$, can be derived from the parabolic cylinder function $D_\nu$ by the relation
$$H_\nu(z) = 2^{\nu/2}\exp(z^2/2)\,D_\nu(\sqrt{2}\,z),$$
where $D_\nu$ itself is related to the ν-th derivative of the function $\exp(-z^2/2)$ by the relation:
$$D_\nu(z) = (-1)^\nu \exp(z^2/4)\,\frac{d^\nu}{dz^\nu}\left(e^{-z^2/2}\right), \quad \nu > 1.$$

For any value of ν, $H_\nu$ can be directly defined by the infinite series (Lebedev (1972, p. 289)):
$$H_\nu(z) = \frac{1}{2\Gamma(-\nu)}\sum_{m=0}^{\infty}\frac{(-1)^m\,\Gamma((m-\nu)/2)}{m!}\,(2z)^m, \tag{1}$$
with the Gamma function for negative values obtained by repeatedly applying the relations
$$\Gamma(-\nu) = \Gamma(-\nu/2)\,\Gamma((1-\nu)/2)\,/\,[2^{\nu+1}\sqrt{\pi}], \quad \nu > 0, \quad \text{and} \quad \Gamma(-1/2)\,\Gamma(3/2) = -\pi.$$
When ν is a positive integer, we have the corresponding Hermite polynomial, and with ν < 0, $H_\nu(z)$ has an integral representation of the form:
$$H_\nu(z) = \frac{1}{\Gamma(-\nu)}\int_0^\infty e^{-t^2-2tz}\,t^{-(\nu+1)}\,dt, \tag{2}$$
which shows that $H_\nu(x)$ is a positive function on the whole real line.

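The Hermite function $H_{-2}(z)$ is of particular interest in this article. We have:
$$H_{-2}(z) = \int_0^\infty t\,e^{-t^2-2tz}\,dt, \tag{3}$$
with $H_{-2}(0) = 1/2$, and from the general relation
$$H_\nu(z) = \frac{2^{\nu+1}}{\Gamma(-\nu/2)}\,z\int_0^\infty e^{-t^2}\,t^{-(\nu+1)}\,(t^2+z^2)^{(\nu-1)/2}\,dt, \quad \nu<0$$
(Lebedev (1972, p. 297)), we have:
$$H_{-2}(z) = \frac{z}{2}\int_0^\infty \frac{t\,e^{-t^2}}{(t^2+z^2)^{3/2}}\,dt.$$
An important identity relates the Hermite function to Kummer's classical confluent hypergeometric function of the first kind, ${}_1F_1$, defined by:
$${}_1F_1(\alpha;\gamma;z) = \sum_{k=0}^{\infty}\frac{(\alpha,k)}{(\gamma,k)}\,\frac{z^k}{k!}, \quad \gamma \ne 0,-1,-2,\ldots,$$
with the ascending factorial, or Pochhammer coefficients, $(\alpha,k) = \alpha(\alpha+1)\cdots(\alpha+k-1) = \Gamma(\alpha+k)/\Gamma(\alpha)$, and $(\alpha,0)=1$. It is:
$$H_\nu(z) = 2^\nu\sqrt{\pi}\left[\frac{1}{\Gamma\left(\frac{1-\nu}{2}\right)}\,{}_1F_1\left(-\frac{\nu}{2};\frac12;z^2\right) - \frac{2z}{\Gamma\left(-\frac{\nu}{2}\right)}\,{}_1F_1\left(\frac{1-\nu}{2};\frac32;z^2\right)\right]. \tag{4}$$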
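As a numerical sanity check on the integral representation (2) and the Kummer identity (4), the following minimal Python sketch evaluates $H_\nu(z)$ both ways. It is only an illustration under stated assumptions: the function names are hypothetical, the quadrature is a plain composite Simpson rule, the sketch restricts itself to ν ≤ −1 so that the integrand in (2) stays bounded at the origin, and the power series for ${}_1F_1$ is adequate only for moderate arguments.

```python
import math


def hyp1f1(a, b, x, tol=1e-16, max_terms=2000):
    """Kummer's 1F1(a; b; x), summed from its defining power series."""
    term = total = 1.0
    for k in range(max_terms):
        term *= (a + k) * x / ((b + k) * (k + 1))
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total


def hermite_integral(nu, z, n=20000):
    """H_nu(z) from the integral representation (2), for nu <= -1.

    Composite Simpson rule on [0, T]; the Gaussian factor makes the
    integrand negligible beyond T = max(0, -z) + 10.
    """
    if nu > -1.0:
        raise ValueError("this sketch assumes nu <= -1 (bounded integrand)")
    upper = max(0.0, -z) + 10.0
    h = upper / n

    def g(t):
        return t ** (-(nu + 1.0)) * math.exp(-t * t - 2.0 * t * z)

    total = g(0.0) + g(upper)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(i * h)
    return total * h / 3.0 / math.gamma(-nu)


def hermite_kummer(nu, z):
    """H_nu(z) from identity (4); used here only for nu < 0."""
    first = hyp1f1(-nu / 2.0, 0.5, z * z) / math.gamma((1.0 - nu) / 2.0)
    second = (2.0 * z * hyp1f1((1.0 - nu) / 2.0, 1.5, z * z)
              / math.gamma(-nu / 2.0))
    return 2.0 ** nu * math.sqrt(math.pi) * (first - second)
```

For large positive z the two terms of (4) nearly cancel, so the integral form is the numerically safer of the two in that regime.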

2.2 As presented in Pham-Gia (1994), the power-quadratic exponential family of distributions has the property that its hazard rates can be ordered under certain conditions. Further properties on the ratios of two members of this family are presented in Pham-Gia and Turkkan ( 2004).

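DEFINITION: The family PQE(γ; δ, ε) of distributions consists of positive continuous random variables with densities of the form
$$f(t;\gamma;\delta,\varepsilon) = C(\gamma;\delta,\varepsilon)\,t^\gamma\exp[-(\delta t+\varepsilon t^2)],$$
where the domain of the parameters is as follows:
$$(\gamma;\delta,\varepsilon) \in \{\gamma>-1;\ -\infty<\delta<\infty,\ \varepsilon>0\}\cup\{\gamma>-1;\ \delta>0,\ \varepsilon=0\}.$$

We write $X \sim PQE(\gamma;\delta,\varepsilon)$, and have the following special cases:

1. Case γ, δ, ε ≠ 0: $f(t;\gamma;\delta,\varepsilon) = C(\gamma;\delta,\varepsilon)\,t^\gamma\exp(-[\delta t+\varepsilon t^2])$, t ≥ 0. We have the general power-quadratic density, where the normalizing constant C(γ; δ, ε) has a complex expression of the form:
$$C(\gamma;\delta,\varepsilon) = \frac{\varepsilon^{-\nu/2}}{\Gamma(\gamma+1)\,H_\nu\left(\frac{\delta}{2\sqrt{\varepsilon}}\right)},$$
with $\nu = -(\gamma+1) < 0$, and where $H_\nu\left(\frac{\delta}{2\sqrt{\varepsilon}}\right)$ is the value of the Hermite function with parameter ν at $\frac{\delta}{2\sqrt{\varepsilon}}$.

2. Case γ = 0:

a) δ, ε ≠ 0: We have the normal density, $N(\mu,\sigma^2)$, truncated from below at 0, denoted $N_{Tr}(\mu,\sigma^2)$, with $\varepsilon = (2\sigma^2)^{-1}$ and $\delta = -\mu/\sigma^2$. Hence, $f(t;0;\delta,\varepsilon) = C(0;\delta,\varepsilon)\exp[-(\delta t+\varepsilon t^2)]$, t ≥ 0, with
$$C(0;\delta,\varepsilon) = \exp\left(-\frac{\delta^2}{4\varepsilon}\right)\Big/\left[\sqrt{\pi/\varepsilon}\,\left(1-\Phi\left(\frac{\delta}{\sqrt{2\varepsilon}}\right)\right)\right], \tag{5}$$
where Φ is the cumulative distribution function of the standard normal. We also have $C(0;\delta,\varepsilon) = \sqrt{\varepsilon}\,/\,H_{-1}\left(\frac{\delta}{2\sqrt{\varepsilon}}\right)$ by using the relation
$$H_{-1}(z) = \sqrt{\pi}\,e^{z^2}\,[1-\Phi(z\sqrt{2})]. \tag{6}$$

b) If δ = 0, ε > 0, we have the half normal distribution, denoted $NH(0,\sigma^2)$, defined for t ≥ 0, with $\sigma^2 = (2\varepsilon)^{-1}$, and $f(t;0;0,\varepsilon) = C(0;0,\varepsilon)\exp(-\varepsilon t^2)$, t ≥ 0, with $C(0;0,\varepsilon) = 2\sqrt{\varepsilon/\pi}$.

c) If ε = 0 but δ > 0, then the distribution is exponential, of the form $f(t;0;\delta,0) = \delta\exp(-\delta t)$, t ≥ 0.

3. Case δ = 0: For γ > −1, γ ≠ 0, ε > 0, we have the Rayleigh distribution, denoted Ray(γ, ε), of the form $f(t;\gamma;0,\varepsilon) = C(\gamma;0,\varepsilon)\,t^\gamma\exp(-\varepsilon t^2)$, t ≥ 0, with
$$C(\gamma;0,\varepsilon) = \frac{\varepsilon^{-\nu/2}\,\Gamma\left(\frac{1-\nu}{2}\right)}{2^\nu\,\Gamma(-\nu)\,\Gamma(1/2)}, \quad \text{where } \nu = -(\gamma+1),$$
if we take into consideration the fact that $H_\nu(0) = \dfrac{2^\nu\,\Gamma(1/2)}{\Gamma\left(\frac{1-\nu}{2}\right)}$. In particular, when γ = 2, we have the Maxwell distribution of the well-known form: $f(t;2;0,\varepsilon) = \dfrac{4\varepsilon^{3/2}}{\sqrt{\pi}}\,t^2\exp(-\varepsilon t^2)$, 0 < t.

4. Case ε = 0: For γ > −1, γ ≠ 0, δ > 0, we have the common gamma density in two parameters, of the form $f(t;\gamma;\delta,0) = C(\gamma;\delta,0)\,t^\gamma\exp(-\delta t)$, t ≥ 0, with $C(\gamma;\delta,0) = \dfrac{\delta^{\gamma+1}}{\Gamma(\gamma+1)}$, denoted $X\sim Ga(\gamma+1,1/\delta)$, if the density of $X\sim Ga(\lambda,\delta)$ is given its standard expression $f(t;\lambda,\delta) = t^{\lambda-1}\exp(-t/\delta)\,/\,[\delta^\lambda\,\Gamma(\lambda)]$, t ≥ 0, λ, δ > 0.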
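The special cases above give a convenient way to check the general normalizing constant numerically. The following Python sketch is only an illustration under stated assumptions: it takes $C(\gamma;\delta,\varepsilon) = \varepsilon^{-\nu/2}/[\Gamma(\gamma+1)H_\nu(\delta/(2\sqrt{\varepsilon}))]$ with ν = −(γ+1), evaluates $H_\nu$ from the integral representation (2) by a composite Simpson rule, and restricts itself to γ ≥ 0 and ε > 0 so that the integrand is bounded. The function names are hypothetical.

```python
import math


def hermite_neg(nu, z, n=20000):
    """H_nu(z), nu <= -1, from integral (2) by the composite Simpson rule."""
    upper = max(0.0, -z) + 10.0  # integrand is negligible beyond this point
    h = upper / n

    def g(t):
        return t ** (-(nu + 1.0)) * math.exp(-t * t - 2.0 * t * z)

    total = g(0.0) + g(upper)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(i * h)
    return total * h / 3.0 / math.gamma(-nu)


def pqe_constant(gamma_, delta, eps):
    """Normalizing constant C(gamma; delta, eps), for gamma >= 0 and eps > 0."""
    if gamma_ < 0.0 or eps <= 0.0:
        raise ValueError("this sketch assumes gamma >= 0 and eps > 0")
    nu = -(gamma_ + 1.0)
    z = delta / (2.0 * math.sqrt(eps))
    return eps ** (-nu / 2.0) / (math.gamma(gamma_ + 1.0) * hermite_neg(nu, z))


def truncated_normal_constant(delta, eps):
    """Constant (5), using 1 - Phi(x) = erfc(x / sqrt(2)) / 2."""
    tail = 0.5 * math.erfc(delta / (2.0 * math.sqrt(eps)))
    return math.exp(-delta * delta / (4.0 * eps)) / (math.sqrt(math.pi / eps) * tail)
```

With these definitions, `pqe_constant(0.0, 0.0, eps)` should agree with the half normal constant $2\sqrt{\varepsilon/\pi}$ and `pqe_constant(2.0, 0.0, eps)` with the Maxwell constant $4\varepsilon^{3/2}/\sqrt{\pi}$.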

3. RATIO OF TWO INDEPENDENT NORMAL VARIABLES

3.1 Works toward the derivation of the density of W = X/Y, where X and Y are normal variables, dependent or independent, have generally followed the approach of finding the cumulative distribution of the ratio
$$W^* = \frac{U_1+a}{U_2+b},$$
where $U_i$, i = 1, 2, are standard normal variables with the same correlation coefficient, and $a = \mu_X/\sigma_X$ and $b = \mu_Y/\sigma_Y$, with $P(W\le t) = P\left(W^*\le \frac{\sigma_Y}{\sigma_X}\,t\right)$ (Geary (1930), Fieller (1932), Hinkley (1969)). But Marsaglia (1965) required further that the $U_i$ be independent, and hence a and b now have specific values containing the correlation coefficient. More specifically, we have:
$$a = \frac{1}{\sqrt{1-\rho^2}}\left(\frac{\mu_X}{\sigma_X}-\rho\,\frac{\mu_Y}{\sigma_Y}\right) \quad\text{and}\quad b = \frac{\mu_Y}{\sigma_Y}. \tag{7}$$
Section 6 gives the relation between W and W* in this case. Naturally, the tabulated values of the standard bivariate normal distribution,
$$L(h,k;\rho) = \frac{1}{2\pi\sqrt{1-\rho^2}}\int_h^\infty\int_k^\infty\exp\left[-\frac{x^2-2\rho xy+y^2}{2(1-\rho^2)}\right]dy\,dx,$$
are used in the expression of the cumulative distribution function by the first three authors, while Marsaglia (1965) also expressed it in terms of the Nicholson (1943) V function
$$V(h,q) = \int_0^h\int_0^{qx/h}\varphi(x)\,\varphi(y)\,dy\,dx,$$
where φ is the standard normal density. Differentiating the cumulative distribution function, he obtained the density in terms of an expression which contains, however, an integral of φ.

On the other hand, Pitman (1939) noticed that the two random variables
$$V_1 = \frac{X-\mu_X}{\sigma_X}+\frac{Y-\mu_Y}{\sigma_Y} \quad\text{and}\quad V_2 = \frac{X-\mu_X}{\sigma_X}-\frac{Y-\mu_Y}{\sigma_Y}$$
are independent normal variates, with zero means and variances 2(1+ρ) and 2(1−ρ) respectively. This argument is particularly useful in deriving results on $\sigma_X^2/\sigma_Y^2$, as we will see in section 5. Although these approaches are certainly interesting, and we will use them in later sections to study the different shapes of the density of W, they do not provide its direct closed form expression, as is often required in applications. Furthermore, the change from W to the ratio $\frac{U_1+a}{U_2+b}$ obscures the direct relationships between the parameters of X and Y and the expression of that density.

In Theorem 1, we will give a simple closed form expression for the density of W in the case of independent variables, adopting Springer’s approach first ( 1984, p. 118 ), but we will use Hermite functions at the following step, instead of Mellin transforms.

We have the following relation between the Hermite function and Kummer's confluent hypergeometric function.

LEMMA 1:
$$H_{-2}(z)+H_{-2}(-z) = {}_1F_1(1;1/2;z^2), \quad \forall z. \tag{8}$$

PROOF:

Using relation (4) for ν = −2, and adding the two expressions of $H_{-2}(z)$ and $H_{-2}(-z)$, we obtain the above identity. QED
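Lemma 1 is easy to confirm numerically. The sketch below is a hedged illustration, not part of the proof: it evaluates $H_{-2}(\pm z)$ from the integral (3) with a composite Simpson rule and compares the sum with a power-series evaluation of ${}_1F_1(1;1/2;z^2)$. The function names and the truncation point of the integral are assumptions made for this example.

```python
import math


def hyp1f1(a, b, x, tol=1e-16, max_terms=2000):
    """Kummer's 1F1(a; b; x), summed from its defining power series."""
    term = total = 1.0
    for k in range(max_terms):
        term *= (a + k) * x / ((b + k) * (k + 1))
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total


def hermite_m2(z, n=20000):
    """H_{-2}(z) from integral (3), composite Simpson rule on [0, T]."""
    upper = max(0.0, -z) + 10.0  # integrand is negligible beyond this point
    h = upper / n

    def g(t):
        return t * math.exp(-t * t - 2.0 * t * z)

    total = g(0.0) + g(upper)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(i * h)
    return total * h / 3.0


def lemma1_gap(z):
    """Left side of (8) minus its right side; should be close to zero."""
    return hermite_m2(z) + hermite_m2(-z) - hyp1f1(1.0, 0.5, z * z)
```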

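THEOREM 1: Let $X\sim N(\mu_X,\sigma_X^2)$ and $Y\sim N(\mu_Y,\sigma_Y^2)$ be independent normal variables. Then W = X/Y has as density:
$$f(w;\mu_X,\mu_Y;\sigma_X,\sigma_Y) = \frac{K_1}{\sigma_Y^2w^2+\sigma_X^2}\,{}_1F_1\left(1;\tfrac12;\theta_1(w)\right), \quad -\infty<w<\infty, \tag{9}$$
where
$$K_1 = \frac{\sigma_X\sigma_Y}{\pi}\exp\left[-\frac12\left(\frac{\mu_X^2}{\sigma_X^2}+\frac{\mu_Y^2}{\sigma_Y^2}\right)\right], \tag{10}$$
$$\theta_1(w) = \frac{(\mu_X\sigma_Y^2\,w+\mu_Y\sigma_X^2)^2}{2\,\sigma_X^2\sigma_Y^2\,(\sigma_Y^2w^2+\sigma_X^2)}. \tag{11}$$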

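PROOF: The density of W is given by the simple integral
$$f(w) = \int_{-\infty}^{\infty}|y|\,f_1(wy)\,f_2(y)\,dy,$$
where $f_1$ and $f_2$ are the densities of X and Y. In order to use Hermite functions, which are obtained as integrals over (0, ∞), we first reparametrize the normal $X\sim N(\mu_X,\sigma_X^2)$ to $N^*(\varepsilon_1,\delta_1)$, where $\varepsilon_1 = (2\sigma_X^2)^{-1}$ and $\delta_1 = -\mu_X/\sigma_X^2$, with $\varepsilon_1>0$ and $-\infty<\delta_1<\infty$. For $X\sim N^*(\varepsilon_1,\delta_1)$, the density is
$$f(x;\varepsilon_1,\delta_1) = C_1\exp(-\varepsilon_1x^2-\delta_1x), \quad -\infty<x<\infty, \quad C_1 = \sqrt{\varepsilon_1/\pi}\,\exp\left(-\frac{\delta_1^2}{4\varepsilon_1}\right),$$
and similarly for Y, with parameters $(\varepsilon_2,\delta_2)$ and constant $C_2$.

Let us now decompose the two normal densities $f_1$ and $f_2$ into their positive and negative parts, i.e. $f_i = f_i^+ + f_i^-$, i = 1, 2, where $f_i^+ = f_i$ for t > 0 and $f_i^+ = 0$ for t < 0, i = 1, 2. Similarly, $f_i^- = f_i$ for t < 0 and $f_i^- = 0$ for t > 0. With the normalizing constants factored out into $K_0 = C_1C_2$, we then have, for the density of W,
$$f(w;\varepsilon_1,\varepsilon_2;\delta_1,\delta_2) = K_0\left[\int_0^\infty x\,f_1^+(wx)\,f_2^+(x)\,dx+\int_0^\infty x\,f_1^-(-wx)\,f_2^-(-x)\,dx\right] \quad\text{for } w>0,$$
and
$$f(w;\varepsilon_1,\varepsilon_2;\delta_1,\delta_2) = K_0\left[\int_0^\infty x\,f_1^-(wx)\,f_2^+(x)\,dx+\int_0^\infty x\,f_1^+(-wx)\,f_2^-(-x)\,dx\right] \quad\text{for } w<0,$$
where
$$K_0 = \frac{\sqrt{\varepsilon_1\varepsilon_2}}{\pi}\exp\left[-\left(\frac{\delta_1^2}{4\varepsilon_1}+\frac{\delta_2^2}{4\varepsilon_2}\right)\right].$$
Making the change of variable $t = x\sqrt{\varepsilon_1w^2+\varepsilon_2}$, the first integral $I_1$ becomes
$$I_1 = \frac{K_0}{\varepsilon_1w^2+\varepsilon_2}\int_0^\infty t\exp\left(-t^2-\frac{\delta_1w+\delta_2}{\sqrt{\varepsilon_1w^2+\varepsilon_2}}\,t\right)dt.$$
Using relation (3), we have:
$$I_1 = \frac{K_0}{\varepsilon_1w^2+\varepsilon_2}\,H_{-2}(\xi_1(w)), \quad\text{where } \xi_1(w) = \frac{\delta_1w+\delta_2}{2\sqrt{\varepsilon_1w^2+\varepsilon_2}}. \tag{12}$$
Similarly, the second integral is $I_2 = \dfrac{K_0}{\varepsilon_1w^2+\varepsilon_2}\,H_{-2}(-\xi_1(w))$, and hence, for w > 0,
$$f(w;\varepsilon_1,\varepsilon_2;\delta_1,\delta_2) = \frac{K_0}{\varepsilon_1w^2+\varepsilon_2}\left[H_{-2}(\xi_1(w))+H_{-2}(-\xi_1(w))\right].$$
For w < 0, the first integral equals $I_2$, while the second integral equals $I_1$. Hence, we have the same expression for $f(w;\varepsilon_1,\varepsilon_2;\delta_1,\delta_2)$ for w < 0. Replacing $\varepsilon_1,\varepsilon_2,\delta_1$ and $\delta_2$ by their values in terms of $\sigma_X,\sigma_Y,\mu_X$ and $\mu_Y$, and using Lemma 1, we obtain expressions (9) to (11), where $\theta_1(w) = [\xi_1(w)]^2$. QED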
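The closed form can be compared with a direct numerical evaluation of the integral $\int|y|\,f_1(wy)\,f_2(y)\,dy$ that starts the proof. The Python sketch below is a hedged illustration: it assumes $K_1 = (\sigma_X\sigma_Y/\pi)\exp[-(\mu_X^2/\sigma_X^2+\mu_Y^2/\sigma_Y^2)/2]$ and $\theta_1(w) = [\xi_1(w)]^2$ written in terms of the means and standard deviations, which is what the substitution at the end of the proof gives. Function names, grid sizes and the integration range are choices made for this example only, and the power series for ${}_1F_1$ is suitable for moderate values of $\mu/\sigma$.

```python
import math


def hyp1f1(a, b, x, tol=1e-16, max_terms=2000):
    """Kummer's 1F1(a; b; x), summed from its defining power series."""
    term = total = 1.0
    for k in range(max_terms):
        term *= (a + k) * x / ((b + k) * (k + 1))
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total


def ratio_density_indep(w, mu_x, mu_y, sd_x, sd_y):
    """Closed form (9) for W = X/Y with X, Y independent normals."""
    k1 = (sd_x * sd_y / math.pi
          * math.exp(-0.5 * ((mu_x / sd_x) ** 2 + (mu_y / sd_y) ** 2)))
    den = sd_y ** 2 * w * w + sd_x ** 2
    theta = ((mu_x * sd_y ** 2 * w + mu_y * sd_x ** 2) ** 2
             / (2.0 * sd_x ** 2 * sd_y ** 2 * den))
    return k1 / den * hyp1f1(1.0, 0.5, theta)


def normal_pdf(x, mu, sd):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def simpson(f, lo, hi, n=4000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0


def ratio_density_numeric(w, mu_x, mu_y, sd_x, sd_y):
    """Direct quadrature of the integral of |y| f1(w y) f2(y) over y."""
    def g(y):
        return abs(y) * normal_pdf(w * y, mu_x, sd_x) * normal_pdf(y, mu_y, sd_y)

    reach = abs(mu_y) + 12.0 * sd_y
    # Split at y = 0, where |y| has a kink, so each piece is smooth.
    return simpson(g, -reach, 0.0) + simpson(g, 0.0, reach)
```

Splitting the quadrature at the origin mirrors the positive/negative decomposition used in the proof.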

REMARKS: 1. The above results are valid for all values of $\sigma_X,\sigma_Y>0$ and of $\mu_X,\mu_Y$ in R. When either $\mu_X$ or $\mu_Y$ is zero, the values of the corresponding expressions can be obtained by just replacing $\mu_X$ or $\mu_Y$ by zero in (9) to (11). For the particular case $\mu_X=\mu_Y=0$, since ${}_1F_1(1;1/2;0)=1$, we have:
$$f(w;\sigma_X,\sigma_Y) = \frac{\sigma_X\sigma_Y}{\pi(\sigma_Y^2w^2+\sigma_X^2)}, \quad -\infty<w<\infty.$$

2. From the above expression of f(w), we can see that it is the product of the Cauchy-type density $\dfrac{K_1}{\sigma_Y^2w^2+\sigma_X^2}$ and a confluent hypergeometric function ${}_1F_1$. A very similar decomposition was obtained by Marsaglia (1965).

3. When the coefficient of variation of a normal variable is small enough, the area of the related distribution, truncated at zero, either from the left or from the right, depending on the case, is near unity, and this truncated distribution can, practically, be taken as the original normal distribution. Hence, under the hypothesis that both normal variables have their coefficients of variation less than 0.25, for example, one of the contributions to the sum $[H_{-2}(\xi_1(w))+H_{-2}(-\xi_1(w))]$ becomes negligible and we can just use the other, with the corresponding adjustment in the normalizing coefficient to
$$C(0;\delta,\varepsilon) = \exp\left(-\frac{\delta^2}{4\varepsilon}\right)\Big/\left[\sqrt{\pi/\varepsilon}\,\left(1-\Phi\left(\frac{\delta}{\sqrt{2\varepsilon}}\right)\right)\right].$$
Geary (1930) considered this case for Y and showed that $\dfrac{\mu_X-\mu_YW}{\sqrt{\sigma_X^2+\sigma_Y^2W^2}}$ is nearly standard normal.

For example, let both X and Y have small coefficients of variation, so that their densities can be almost taken as truncated (from below, at the origin) normal densities. For X, we have $f(t;0;\delta_1,\varepsilon_1) = C(0;\delta_1,\varepsilon_1)\exp[-(\delta_1t+\varepsilon_1t^2)]$, t ≥ 0, with $\varepsilon_1$ and $\delta_1$ defined as previously, and $C(0;\delta_1,\varepsilon_1) = \exp\left(-\frac{\delta_1^2}{4\varepsilon_1}\right)\big/\left[\sqrt{\pi/\varepsilon_1}\,\left(1-\Phi\left(\frac{\delta_1}{\sqrt{2\varepsilon_1}}\right)\right)\right]$, as given by (5), and a similar expression holds for Y. Then the density of W = X/Y on (0, ∞) can now be obtained as a Hermite function:
$$f_W(w;\delta_1,\varepsilon_1;\delta_2,\varepsilon_2) = \left[\prod_{i=1}^{2}C(0;\delta_i,\varepsilon_i)\right]\frac{1}{\varepsilon_1w^2+\varepsilon_2}\,H_{-2}(\xi_1(w)).$$
Expressed in terms of $\mu_i,\sigma_i$, i = 1, 2, we have
$$f(w;\mu_X,\mu_Y;\sigma_X,\sigma_Y) = \frac{K_1}{\sigma_Y^2w^2+\sigma_X^2}\,H_{-2}(\xi_1(w)).$$

−∞


4. RATIO OF TWO DEPENDENT NORMAL VARIABLES

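Two dependent variables can have normal marginal distributions while their joint density is bivariate normal, or has another form. Only the first case has been dealt with in the literature. Let $(X,Y)\sim BVN(\mu_X,\mu_Y;\sigma_X,\sigma_Y;\rho)$, i.e. let (X, Y) have the bivariate normal density of the form:
$$f_{XY}(x,y;\mu_X,\mu_Y;\sigma_X,\sigma_Y;\rho) = A\exp\left\{-\frac{1}{2(1-\rho^2)}\left[\left(\frac{x-\mu_X}{\sigma_X}\right)^2-2\rho\left(\frac{x-\mu_X}{\sigma_X}\right)\left(\frac{y-\mu_Y}{\sigma_Y}\right)+\left(\frac{y-\mu_Y}{\sigma_Y}\right)^2\right]\right\},$$
with $A = \left[2\pi\sigma_X\sigma_Y\sqrt{1-\rho^2}\right]^{-1}$, $-\infty<x,y<\infty$, $\sigma_X,\sigma_Y>0$, $-\infty<\mu_X,\mu_Y<\infty$, while $-1<\rho<1$.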

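THEOREM 2: Let $(X,Y)\sim BVN(\mu_X,\mu_Y;\sigma_X,\sigma_Y;\rho)$. Then the density of W = X/Y is
$$f_W(w;\mu_X,\mu_Y;\sigma_X,\sigma_Y;\rho) = K_2\,\frac{2(1-\rho^2)\,\sigma_X^2\sigma_Y^2}{\sigma_Y^2w^2-2\rho\sigma_X\sigma_Yw+\sigma_X^2}\,{}_1F_1\left(1;\tfrac12;\theta_2(w)\right), \quad -\infty<w<\infty, \tag{13}$$
where
$$K_2 = \frac{1}{2\pi\sigma_X\sigma_Y\sqrt{1-\rho^2}}\exp\left[-\frac{\sigma_Y^2\mu_X^2-2\rho\sigma_X\sigma_Y\mu_X\mu_Y+\sigma_X^2\mu_Y^2}{2(1-\rho^2)\,\sigma_X^2\sigma_Y^2}\right]$$
and
$$\theta_2(w) = \frac{\left[-\sigma_Y^2\mu_Xw+\rho\sigma_X\sigma_Y(\mu_Yw+\mu_X)-\mu_Y\sigma_X^2\right]^2}{2\,\sigma_X^2\sigma_Y^2\,(1-\rho^2)\,(\sigma_Y^2w^2-2\rho\sigma_X\sigma_Yw+\sigma_X^2)}.$$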

PROOF: The proof uses exactly the same arguments as before, after reparametrizing the bivariate density as we did in the previous case. Also, a decomposition according to the signs of x and y in the four quadrants of the plane, $f_{XY}^{++}$, $f_{XY}^{+-}$, $f_{XY}^{-+}$ and $f_{XY}^{--}$, allows us to take the appropriate signs in the integration. We will not reproduce the proof here, but the details are available upon request. QED
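Since the proof is omitted, a numerical check is a useful complement. The Python sketch below is an illustration under stated assumptions: the constant $K_2$ and the function $\theta_2(w)$ are taken in the form obtained by completing the square in y in $\int|y|\,f_{XY}(wy,y)\,dy$, namely $K_2 = A\exp(-c/2)$ with A the normalizing constant of the bivariate density and c its quadratic form evaluated at the origin. The closed form is compared with direct quadrature of that integral. All function names and grid sizes are hypothetical choices for this example.

```python
import math


def hyp1f1(a, b, x, tol=1e-16, max_terms=2000):
    """Kummer's 1F1(a; b; x), summed from its defining power series."""
    term = total = 1.0
    for k in range(max_terms):
        term *= (a + k) * x / ((b + k) * (k + 1))
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total


def ratio_density_bvn(w, mu_x, mu_y, sd_x, sd_y, rho):
    """Closed form (13) for W = X/Y, (X, Y) bivariate normal (assumed K2, theta2)."""
    q = 1.0 - rho * rho
    den = sd_y ** 2 * w * w - 2.0 * rho * sd_x * sd_y * w + sd_x ** 2
    zx, zy = mu_x / sd_x, mu_y / sd_y
    c = (zx * zx - 2.0 * rho * zx * zy + zy * zy) / q
    k2 = math.exp(-0.5 * c) / (2.0 * math.pi * sd_x * sd_y * math.sqrt(q))
    num = (-sd_y ** 2 * mu_x * w + rho * sd_x * sd_y * (mu_y * w + mu_x)
           - mu_y * sd_x ** 2)
    theta = num * num / (2.0 * sd_x ** 2 * sd_y ** 2 * q * den)
    return k2 * 2.0 * q * sd_x ** 2 * sd_y ** 2 / den * hyp1f1(1.0, 0.5, theta)


def bvn_pdf(x, y, mu_x, mu_y, sd_x, sd_y, rho):
    q = 1.0 - rho * rho
    zx, zy = (x - mu_x) / sd_x, (y - mu_y) / sd_y
    expo = -(zx * zx - 2.0 * rho * zx * zy + zy * zy) / (2.0 * q)
    return math.exp(expo) / (2.0 * math.pi * sd_x * sd_y * math.sqrt(q))


def simpson(f, lo, hi, n=4000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0


def ratio_density_bvn_numeric(w, mu_x, mu_y, sd_x, sd_y, rho):
    """Direct quadrature of the integral of |y| f_XY(w y, y) over y."""
    def g(y):
        return abs(y) * bvn_pdf(w * y, y, mu_x, mu_y, sd_x, sd_y, rho)

    reach = abs(mu_y) + 12.0 * sd_y
    return simpson(g, -reach, 0.0) + simpson(g, 0.0, reach)
```

For data with very large ratios $\mu/\sigma$, the exponential factor and ${}_1F_1$ become extremely small and extremely large respectively, so a production implementation would probably work on the log scale; the plain series here is only meant for moderate parameter values.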

As an application of the above expression, let us consider Fieller's study. Fieller (1932, p. 436) compiled two measurements made on the temporal bone, Y, and the parietal bone, X, on the left-hand side of 787 Egyptian skulls, and gave the following data: $\bar{x} = 111.207$, $\bar{y} = 86.019$, $s_x = 5.788$, $s_y = 3.845$, and $r_{xy} = 0.174$. He was interested in finding the distribution of Y/X. We use these values as parameter values in Equation (13), and obtain Fig. 1.

REMARKS:

1. When $\mu_X=\mu_Y=0$ and $\sigma_X=\sigma_Y=1$, we have the standard bivariate normal density with correlation coefficient ρ. Then the above expression reduces to
$$f(w;\rho) = \frac{\sqrt{1-\rho^2}}{\pi(1-2\rho w+w^2)}, \quad -\infty<w<\infty.$$
Similarly, Geary (1930) considered the two standardized variables and arrived at the same result. Naturally, if ρ = 0, this density reduces to the Cauchy one, as expected.

2. When ρ = 0, the two variables X and Y are independent, and we can verify that Theorem 2 reduces to Theorem 1, i.e. $\theta_2(w)=\theta_1(w)$ and the factor multiplying ${}_1F_1$ reduces to $K_1/(\sigma_Y^2w^2+\sigma_X^2)$, so that the same density for W = X/Y is obtained. The cases ρ = ±1 lead to a degenerate bivariate distribution and are not considered.

Again, as in the case where the variables are independent, when the coefficient of variation of either variable, or of both, is small, that variable can be taken as positive, or better as a normal variable truncated from below at the origin, and a simpler expression of the density of W can be obtained. Hinkley (1969) studied this case, and showed that the cumulative distribution of W, F(w), converges to
$$\Phi\left(\frac{\mu_Yw-\mu_X}{\sigma_1\sigma_2\,a(w)}\right), \quad\text{where } a(w) = \left(\frac{w^2}{\sigma_1^2}-\frac{2\rho w}{\sigma_1\sigma_2}+\frac{1}{\sigma_2^2}\right)^{1/2}.$$
He also studied the accuracy of approximating F(w) by the above expression. Shanmurgalingam (1982) presented a Monte-Carlo study of the case where this shape is mildly normal to very skewed, depending on the two coefficients of variation $C_X$ and $C_Y$. A similar study along these lines can be carried out here, using expression (13), but there is much less need to approximate the cumulative distribution function of W now that the closed form expression of its density is available.

3. The approach adopted to derive the above results, based on a decomposition into positive components and using Hermite functions, can be applied to study the density of the ratio X/Y when the joint distribution f(x, y) of (X, Y) is not bivariate normal, but the marginal distributions of X and Y are normal, as in the case of the distribution used by Ruymgaart (1973). This is a discrete mixture, with equal weights, of two independent bivariate normal densities, with mean vectors zero and correlation coefficients −½ and ½ respectively:
$$f(x,y) = \frac12\left(\varphi_{-1/2}(x,y)+\varphi_{1/2}(x,y)\right), \quad\text{where } \varphi_\rho(x,y) = \frac{1}{2\pi\sqrt{1-\rho^2}}\exp\left(-\frac{x^2-2\rho xy+y^2}{2(1-\rho^2)}\right),$$
with $-\infty<x,y<\infty$.

It can also be applied to study X/Y in the case where the conditional variables X | Y = y and Y | X = x are both normal, with the joint distribution of (X, Y) bivariate normal, or not.

In the first case, we then have $(X\,|\,Y=y)\sim N\left(\mu_X+\rho\frac{\sigma_X}{\sigma_Y}(y-\mu_Y);\ \sigma_X^2(1-\rho^2)\right)$ and, similarly, $(Y\,|\,X=x)\sim N\left(\mu_Y+\rho\frac{\sigma_Y}{\sigma_X}(x-\mu_X);\ \sigma_Y^2(1-\rho^2)\right)$, and the density of their ratio $\dfrac{(X\,|\,Y=y)}{(Y\,|\,X=x)}$ can be obtained from Theorem 1. An example for the second case is provided by the non-bivariate normal density
$$f(x,y) = C\exp\left(-[x^2+y^2+2xy(x+y+xy)]\right), \quad -\infty<x,y<\infty.$$

5. SHAPES OF THE DENSITY OF W

The various shapes that the density of W can take make this topic particularly interesting. Indeed, the fact that the density of W can have one or two modes, sometimes quite far apart from each other, calls for a careful investigation. It is more convenient that $W = X/Y$, with $(X,Y)\sim BVN(\mu_1,\mu_2;\sigma_1,\sigma_2;\rho)$, be put under a form related to $T = \dfrac{V_1+a}{V_2+b}$, with $V_1$ and $V_2$ being independent standard normal variates N(0, 1), and a and b given by (7). We can see that it suffices to study the case a, b > 0 for the distribution of T, since other cases can be obtained from this case. By Theorem 1, T has as density:
$$f(t;a,b) = \frac{K_1}{t^2+1}\,{}_1F_1\left(1;\tfrac12;\theta_1(t)\right), \tag{14}$$
where $\theta_1(t) = \dfrac{(at+b)^2}{2(t^2+1)}\ge 0$ and $K_1 = \dfrac{1}{\pi}\exp[-(a^2+b^2)/2]$. Hence, we always have $0\le\theta_1(t)\le(a^2+b^2)/2$, with $\theta_1(t)\to a^2/2$ as $t\to\pm\infty$.

From the above expression, it is immediate that the following properties hold :

a) f(t; a, b) is higher to the right, i.e. for any value $t_0>0$, we have $f(t_0)\ge f(-t_0)$.

b) $f(0) = K_1\,{}_1F_1(1;1/2;b^2/2)>0$.

c) Using the properties of the function $\theta_1(t)$, which is increasing in [−b/a, a/b] and decreasing outside, for t > 0, f(t) has a mode at $m_0\le a/b$.
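Properties a) to c), and the fact that the density can have one or two modes, can be explored with a simple grid scan of the density (14). The Python sketch below is only an illustration: the parameter values, the grid width and the function names are arbitrary choices for this example, and a grid scan locates modes only up to the grid step.

```python
import math


def hyp1f1(a, b, x, tol=1e-16, max_terms=2000):
    """Kummer's 1F1(a; b; x), summed from its defining power series."""
    term = total = 1.0
    for k in range(max_terms):
        term *= (a + k) * x / ((b + k) * (k + 1))
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total


def density_t(t, a, b):
    """Density (14) of T = (V1 + a)/(V2 + b), V1 and V2 independent N(0, 1)."""
    k1 = math.exp(-0.5 * (a * a + b * b)) / math.pi
    theta = (a * t + b) ** 2 / (2.0 * (t * t + 1.0))
    return k1 / (t * t + 1.0) * hyp1f1(1.0, 0.5, theta)


def scan(a, b, half_width=15.0, steps=1500):
    """Evaluate the density on a grid that is exactly symmetric about 0."""
    h = half_width / steps
    ts = [i * h for i in range(-steps, steps + 1)]
    return ts, [density_t(t, a, b) for t in ts]


def grid_modes(a, b, half_width=15.0, steps=1500):
    """Grid points that are strict local maxima of the density."""
    ts, fs = scan(a, b, half_width, steps)
    return [ts[i] for i in range(1, len(ts) - 1)
            if fs[i] > fs[i - 1] and fs[i] > fs[i + 1]]
```

For instance, `grid_modes(2.0, 0.0)` should return two symmetric points while `grid_modes(0.5, 0.0)` should return the origin only, and for a, b > 0 the global maximum found by `scan` should lie in (0, a/b].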

In the general case, the derivative of f is:
$$f'(t) = \frac{2K_1\,{}_1F_1\left(2;\tfrac32;\theta_1(t)\right)}{(1+t^2)^2}\left[\frac{(b+at)(a-bt)}{1+t^2}-t\,\frac{{}_1F_1\left(1;\tfrac12;\theta_1(t)\right)}{{}_1F_1\left(2;\tfrac32;\theta_1(t)\right)}\right], \tag{15}$$
and we have f'(a/b) < 0, while f'(0) > 0, so the density is always increasing at the origin, confirming point c). Also, the sign of f'(t) depends only on the sign of the function
$$\psi(t) = \frac{(b+at)(a-bt)}{1+t^2}-t\,\zeta(t), \quad\text{with } \zeta(t) = \frac{{}_1F_1\left(1;\tfrac12;\theta_1(t)\right)}{{}_1F_1\left(2;\tfrac32;\theta_1(t)\right)},$$
where it can be shown that the ratio ζ(t) is an increasing function of t > 0, and takes the value 1 at the origin.

In fact, we can show that $1\le\zeta(t)\le 2$. Hence,

d) For either a = 0 or b = 0, f is symmetrical w.r.t. the vertical axis.

For a = 0, b > 0, f is unimodal since, as a function of t, ψ(t)/t decreases from 0.

For b = 0, if a ≤ 1, ψ(t)/t is negative for t > 0, and hence f decreases on (0, ∞) and we have only one mode.

But for a > 1, ψ(t)/t is positive, then negative, and hence f has 2 symmetric modes at $m_0$ and $-m_0$. Also, since $\psi(\sqrt{a^2-1})<0$, we have $m_0<\sqrt{a^2-1}$. Although zero is a point of continuity, it is also a turning point, and the two values of f', to the left and the right of zero, can be obtained by direct computation.

e) Using the bounds of ζ(t), we can see that ψ(t) has at most three roots.

There will be a second mode at $m_1<0$ if ψ(t) has 2 roots on the negative axis. Then, we have $f(m_1)\le f(-m_1)$. Furthermore, using the increasing property of $\theta_1(t)$ in (−b/a, 0), f is increasing in this interval, and hence $m_1<-b/a$. In the case of no root on the negative axis, f' is always positive there and there is only one mode, at $m_0>0$. The intermediary case of a double negative root gives a unimodal density, and the corresponding values of a and b determine the boundary between the bimodal and the unimodal forms of the density, as given by Marsaglia (1965). To further investigate this point, we proceed as follows:

For a > 0 we compute the values of b > 0 such that the equation ψ(t) = 0 has a single positive root and a double negative root. The positive root corresponds to the mode for w > 0, but the double negative root determines the boundary values of b for which there is no second mode. At the same time, we compute the values of this double root. Numerical results show that, as given by Marsaglia (1965), a curve with vertical asymptote at about 2.257 delimits the values of (a, b) for which the density is unimodal or bimodal.

f) The distribution of the ratio of two linear combinations of independent normal variables, coming from univariate or bivariate distributions, can now be obtained directly from Theorem 1 and Theorem 2.

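a) Let $X_i$, i = 1,...,n, and $Y_j$, j = 1,...,m, be m + n independent normal variables, $X_i\sim N(\mu_{X_i},\sigma_{X_i}^2)$, $Y_j\sim N(\mu_{Y_j},\sigma_{Y_j}^2)$, and let $V = T_1/T_2$, where $T_1 = \sum_{i=1}^{n}\alpha_iX_i$ and $T_2 = \sum_{j=1}^{m}\beta_jY_j$. Then the density of V is given by (9), where we have $\mu_{T_1} = \sum_{i=1}^{n}\alpha_i\mu_{X_i}$, $\mu_{T_2} = \sum_{j=1}^{m}\beta_j\mu_{Y_j}$, $\sigma_{T_1}^2 = \sum_{i=1}^{n}\alpha_i^2\sigma_{X_i}^2$ and $\sigma_{T_2}^2 = \sum_{j=1}^{m}\beta_j^2\sigma_{Y_j}^2$. When $\mu_{T_1}=\mu_{T_2}=0$, V has a generalized Cauchy-type distribution.

b) Similarly, let $(X_1,Y_1),\ldots,(X_k,Y_k)$ be k independent normal vectors, with $(X_i,Y_i)\sim N(\mu_{i1},\mu_{i2};\sigma_{i1},\sigma_{i2};\rho_i)$, i = 1,...,k. Then the density of $V = \sum_{i=1}^{k}\alpha_iX_i\big/\sum_{i=1}^{k}\alpha_iY_i$ is given by Theorem 2 with $\mu_1 = \sum_{i=1}^{k}\alpha_i\mu_{i1}$, $\mu_2 = \sum_{i=1}^{k}\alpha_i\mu_{i2}$, $\sigma_1^2 = \sum_{i=1}^{k}\alpha_i^2\sigma_{i1}^2$, $\sigma_2^2 = \sum_{i=1}^{k}\alpha_i^2\sigma_{i2}^2$ and $\rho = \left(\sum_{i=1}^{k}\alpha_i^2\rho_i\sigma_{i1}\sigma_{i2}\right)\big/(\sigma_1\sigma_2)$.

5. APPLICATIONS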

There are several known applications of the density of X/Y. In linear regression, the ratio of the two least squares estimates of the regression line, which are the intercept and the slope, has this density. Marsaglia (1965) mentioned the distribution of red cells, and Shanmurgalingam (1982) mentioned digestibility, or the ratio of the weight of a component of a plant to that of the whole plant, and presented a Monte-Carlo study related to this ratio. In what follows we provide two other applications, one in the domain of Education, where the evaluation of student performance over similar academic subjects for forecasting purposes is an important concern, and the other in Finite Sampling Theory, where the distribution of this ratio has been mentioned frequently in the ratio estimation approach.

1. For example, we have, in education, the joint distribution of the final grades of first year English (X) and second year English literature (Y) as approximately BVN(75.25, 71.58; (6.25)², (5.45)²; 0.76). These values are sample values of these parameters, obtained from a large sample of 427 students over the last 3 years, taking the two courses with the same professors. A general study of how academic achievement in English literature is related to that in introductory English would consider the distribution of X/Y, as given by Theorem 2. This density, denoted $f_1$, is given by fig. 4, and reflects the distribution of this ratio for any value (X, Y).

A (1−α)100% confidence interval for the ratio $\sigma_1^2/\sigma_2^2$ can also be obtained, based on Pitman's result (1939). It is
$$B\,(s_1^2/s_2^2)\pm(s_1^2/s_2^2)\sqrt{B^2-1}, \quad\text{where } B = \frac{n-2+2(1-r^2)\,t_\alpha^2}{n-2},$$
which is here :

If we look only into the 2 independent marginal distributions of X and Y, the ratio X /Y has its density given by Theorem 1, is denoted f2 in Fig. 4, and reflects the distribution of X/Y for X and Y from these marginal distributions.

To further study the homogeneity of the distributions, we can consider the two conditional distributions (Y | X = 63) and (X | Y = 60), these two values for X and Y being adopted as respective minimal passing grades for the two courses. Setting $\sigma_{X|Y}^2 = \sigma_X^2(1-\rho^2)$ and $\mu_{X|Y} = \mu_X+\rho\frac{\sigma_X}{\sigma_Y}(y-\mu_Y)$, where y = 60, and, similarly, for $\sigma_{Y|X}^2$ and $\mu_{Y|X}$, the density of the ratio $\dfrac{(Y\,|\,X=63)}{(X\,|\,Y=60)}$ is given by Theorem 1, since they can be considered as independent, and its graph, denoted $f_3$, is also given by fig. 4. We can see that the three densities differ very much from each other.

2. In ratio estimation in classical Finite Sampling Theory, we suppose R = X/Y is the ratio of the two totals of two populations with the same number of elements. We also have $R = \bar{X}/\bar{Y}$, where $\bar{X}$ and $\bar{Y}$ are the means of the two populations. Estimating $R = \mu_X/\mu_Y$ by $\hat{R}$, the estimate of Y is then $\hat{Y} = \hat{R}X$. Paulson (1942), using Geary's results (1930) mentioned previously, suggested a method to give an interval estimation of R, based on $\hat{R}$. Depending on whether the parameters are known or unknown, we have two cases for the confidence limits of R, based on a sample of size n, $\{(x_1,y_1),\ldots,(x_n,y_n)\}$, and have the following formulae:

a) $$\frac{(n\bar{x}\bar{y}-z_\alpha^2\rho\sigma_X\sigma_Y)\pm\sqrt{(n\bar{x}\bar{y}-z_\alpha^2\rho\sigma_X\sigma_Y)^2-(n\bar{y}^2-z_\alpha^2\sigma_Y^2)(n\bar{x}^2-z_\alpha^2\sigma_X^2)}}{n\bar{y}^2-z_\alpha^2\sigma_Y^2},$$ in the first case, and

b) $$\frac{(n\bar{x}\bar{y}-t_{\alpha,n-1}^2\,r\,s_Xs_Y)\pm\sqrt{(n\bar{x}\bar{y}-t_{\alpha,n-1}^2\,r\,s_Xs_Y)^2-(n\bar{y}^2-t_{\alpha,n-1}^2s_Y^2)(n\bar{x}^2-t_{\alpha,n-1}^2s_X^2)}}{n\bar{y}^2-t_{\alpha,n-1}^2s_Y^2},$$ when the values of the parameters are unknown, where $s_X^2 = \dfrac{\sum(x_i-\bar{x})^2}{n-1}$, $s_Y^2 = \dfrac{\sum(y_i-\bar{y})^2}{n-1}$ and r is the sample correlation coefficient.

Since $E(W)\ne R = \mu_X/\mu_Y$, where W = X/Y, we can, alternately, estimate the distribution of W, based on Theorem 2, find the estimated value $\hat{E}(W)$ of E(W), and estimate Y as $\hat{Y} = \hat{E}(W)\,X$. However, as for the Cauchy distribution, the mean of W theoretically does not exist. But, in practice, as pointed out by K. S. Brown (2004), this mean is computable when the Cauchy component plays only a non-significant role in the density, and is called the pseudo-mean. It should be pointed out, too, that the principal value of this mean exists (Stuart and Ord (1987, p. 77)).

For example, in a survey, we have found $\bar{x} = 6$, $\bar{y} = 8$, $s_X = 1.25$, $s_Y = 2.15$, r = 0.97, and X has the value of 1456.65. We wish to estimate Y.

Since $t_{4,.025} = 2.7764$, Paulson's method gives a 95% confidence interval for R, the ratio of the two means, as 0.773 ± 0.086. Using the expression of $f_W(w;6,8;1.25,2.15;0.97)$ given by Theorem 2, we have an approximate distribution of W, and obtain its pseudo-mean $\hat{E}(W) = 0.769$ and its variance $(0.012)^2$, by numerical computation. The point estimates of Y are very close, however, since $\hat{Y} = 0.773(1456.65) = 1126$, while $\hat{Y} = 0.769(1456.65) = 1120$.
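The interval quoted above can be reproduced from formula b). The short Python sketch below is an illustration with a hypothetical function name; it assumes a sample size n = 5, inferred from the 4 degrees of freedom of the t quantile, and returns the centre and half-width of the interval.

```python
import math


def paulson_interval(n, x_bar, y_bar, s_x, s_y, r, t_crit):
    """Centre and half-width of the confidence limits for R from formula b)."""
    t2 = t_crit ** 2
    cross = n * x_bar * y_bar - t2 * r * s_x * s_y
    den = n * y_bar ** 2 - t2 * s_y ** 2
    num_x = n * x_bar ** 2 - t2 * s_x ** 2
    disc = cross ** 2 - den * num_x
    if den <= 0.0 or disc < 0.0:
        raise ValueError("the formula does not give a finite interval here")
    return cross / den, math.sqrt(disc) / den


# Survey example: n = 5 is an assumption inferred from t_{4, .025}.
centre, half_width = paulson_interval(5, 6.0, 8.0, 1.25, 2.15, 0.97, 2.7764)
y_hat = centre * 1456.65  # point estimate of Y obtained from the centre
```

Rounded to three decimals, `centre` and `half_width` match the interval 0.773 ± 0.086 quoted in the text.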

6. RATIO OF VARIABLES FROM MIXTURES OF NORMAL DISTRIBUTIONS

The problem considered in this article is in fact a particular case of the ratio of two normal variables related by a mixture process. We first establish the expression of the cumulative distribution function of W = X/Y, from which the density can be derived.

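THEOREM 3: Let $(X,Y)\sim N(\mu_1,\mu_2;\sigma_1^2,\sigma_2^2;\rho)$. Then
$$W = \frac{X}{Y} = \frac{\sigma_1(U_1+\delta_1)}{\sigma_2(U_2+\delta_2)}, \tag{16}$$
where $\delta_i = \mu_i/\sigma_i$, i = 1, 2, and $(U_1,U_2)\sim N(0,0;1,1;\rho)$. Then, with $t' = (\sigma_2/\sigma_1)\,t$, we have
$$P(W\le t) = E^{U_2}\left[\Phi\left(\frac{(t'-\rho)U_2+t'\delta_2-\delta_1}{\sqrt{1-\rho^2}}\;\mathrm{sgn}(U_2+\delta_2)\right)\right]. \tag{17}$$

PROOF: Using the relations $U_1\,|\,U_2=u_2\sim N(\rho u_2,1-\rho^2)$ and $U_2\sim N(0,1)$, we have:
$$P(W\le t) = \int_{-\infty}^{\infty}\varphi(u_2)\,P\left(\frac{U_1+\delta_1}{U_2+\delta_2}\le t'\,\Big|\,U_2=u_2\right)du_2,$$
where $\varphi(y) = e^{-y^2/2}/\sqrt{2\pi}$ is the standard normal density. Considering the sign of $U_2+\delta_2$, we integrate over $(-\infty,-\delta_2)$ and $(-\delta_2,\infty)$ separately, and obtain
$$P(W\le t) = \int_{-\infty}^{-\delta_2}\varphi(u_2)\,P\{U_1+\delta_1\ge t'(U_2+\delta_2)\,|\,U_2=u_2\}\,du_2+\int_{-\delta_2}^{\infty}\varphi(u_2)\,P\{U_1+\delta_1\le t'(U_2+\delta_2)\,|\,U_2=u_2\}\,du_2$$
$$= \int_{-\infty}^{-\delta_2}\varphi(u_2)\,\Phi\left(-\frac{(t'-\rho)u_2+t'\delta_2-\delta_1}{\sqrt{1-\rho^2}}\right)du_2+\int_{-\delta_2}^{\infty}\varphi(u_2)\,\Phi\left(\frac{(t'-\rho)u_2+t'\delta_2-\delta_1}{\sqrt{1-\rho^2}}\right)du_2,$$
and we have the above result. QED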

With the above result we can obtain the representation of W = X/Y, where $(X,Y)\sim BVN(\mu_1,\mu_2;\sigma_1,\sigma_2;\rho)$, in terms of $T = \dfrac{V_1+a}{V_2+b}$, with $(V_1,V_2)\sim BVN(0,0;1,1;0)$, the same as given by Marsaglia (1965), i.e. in the form of the following

COROLLARY: W has the same distribution as
$$W^* = \frac{\sigma_1}{\sigma_2}\left(\rho+\sqrt{1-\rho^2}\;\frac{V_1+a}{V_2+b}\right), \quad\text{where } (a,b) = \left(\frac{\delta_1-\rho\delta_2}{\sqrt{1-\rho^2}},\ \delta_2\right).$$

PROOF: First, from Theorem 3 applied to T, we have, since ρ = 0 and $\sigma_1=\sigma_2=1$,
$$P(T\le t) = E^{U_2}\left[\Phi\left((tU_2+tb-a)\;\mathrm{sgn}(U_2+b)\right)\right]. \tag{18}$$
Hence, with $b=\delta_2$,
$$P(W^*\le t) = P\left(T\le\frac{t'-\rho}{\sqrt{1-\rho^2}}\right) = E^{U_2}\left[\Phi\left(\left(\frac{t'-\rho}{\sqrt{1-\rho^2}}\,U_2+b\,\frac{t'-\rho}{\sqrt{1-\rho^2}}-a\right)\mathrm{sgn}(U_2+\delta_2)\right)\right]$$
$$= E^{U_2}\left[\Phi\left(\frac{(t'-\rho)U_2+t'\delta_2-\delta_1}{\sqrt{1-\rho^2}}\;\mathrm{sgn}(U_2+\delta_2)\right)\right]$$
when replacing a and b by their values. This is exactly the expression (17) of P(W ≤ t) as given by Theorem 3. QED

ALTERNATE PROOFS OF THEOREM 1 AND THEOREM 2:

Alternate proofs for the above two theorems can now be obtained, using (18), and arguments based on conditional expectations of random variables.

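To derive the density of W, we first compute the density $f_T(t)$ of T, $-\infty<t<\infty$. Differentiating (18) with respect to t, we have
$$f_T(t) = \int_{-\infty}^{\infty}(u_2+b)\,\mathrm{sgn}(u_2+b)\,\varphi(t(u_2+b)-a)\,\varphi(u_2)\,du_2$$
$$= \int_{-\infty}^{\infty}u_2\,\mathrm{sgn}(u_2)\,\varphi(tu_2-a)\,\varphi(u_2-b)\,du_2, \quad\text{by making the change of variable } u_2\to u_2-b,$$
$$= \frac{1}{2\pi}\int_{-\infty}^{\infty}u_2\,\mathrm{sgn}(u_2)\exp\left(-\frac{(tu_2-a)^2}{2}\right)\exp\left(-\frac{(u_2-b)^2}{2}\right)du_2$$
$$= \frac{A}{2\pi}\int_{-\infty}^{\infty}u_2\,\mathrm{sgn}(u_2)\exp\left(-\frac{u_2^2}{2}(1+t^2)\right)\exp(u_2(b+at))\,du_2, \quad\text{where } A = \exp\left(-\frac{a^2+b^2}{2}\right),$$
$$= \frac{A}{2\pi}\int_0^\infty u_2\exp\left(-\frac{u_2^2}{2}(1+t^2)\right)\left[\exp(u_2(b+at))+\exp(-u_2(b+at))\right]du_2.$$
Posing now $z = u_2\sqrt{(1+t^2)/2}$, we obtain:
$$f_T(t) = \frac{2A}{\pi(1+t^2)}\int_0^\infty z\exp(-z^2)\cosh\left(z\,(b+at)\sqrt{\frac{2}{1+t^2}}\right)dz.$$
Since $\cosh(cz) = \sum_{k\ge0}\dfrac{(cz)^{2k}}{(2k)!}$ and the Pochhammer coefficient $\left(\tfrac12,k\right) = \dfrac{(2k)!}{2^{2k}\,k!}$, we have, using $\int_0^\infty z^{2k+1}e^{-z^2}dz = k!/2$:
$$f_T(t) = \frac{A}{\pi(1+t^2)}\sum_{k\ge0}\frac{1}{(1/2,k)}\left(\frac{(b+at)^2}{2(1+t^2)}\right)^k.$$
And hence,
$$f_T(t) = \frac{A}{\pi(1+t^2)}\,{}_1F_1\left(1;\tfrac12;\frac{(at+b)^2}{2(1+t^2)}\right).$$
Using the above corollary, we have $T = \dfrac{W(\sigma_2/\sigma_1)-\rho}{\sqrt{1-\rho^2}}$, and, hence,
$$f_W(w) = \frac{\sigma_2}{\sigma_1\sqrt{1-\rho^2}}\cdot\frac{1}{\pi(1+[t^*(w)]^2)}\exp\left(-\frac{(a^*)^2+(b^*)^2}{2}\right){}_1F_1\left(1;\tfrac12;\frac{(a^*\,t^*(w)+b^*)^2}{2(1+[t^*(w)]^2)}\right),$$
with $t^*(w) = \dfrac{w(\sigma_2/\sigma_1)-\rho}{\sqrt{1-\rho^2}}$, $b^* = \mu_2/\sigma_2$ and $a^* = \dfrac{1}{\sqrt{1-\rho^2}}\left(\dfrac{\mu_1}{\sigma_1}-\rho\,\dfrac{\mu_2}{\sigma_2}\right)$, i.e. the values of b and a given by (7), which gives the same expression as Theorem 2.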
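The last step, passing from the density of T to that of W through the change of variable $t^*(w)$, can be checked numerically against the closed form of Theorem 2. The Python sketch below is a hedged illustration: the expression used for Theorem 2 writes its constant and its ${}_1F_1$ argument in the form obtained by completing the square in the bivariate density, which is an assumption of this example, and all function names are hypothetical.

```python
import math


def hyp1f1(a, b, x, tol=1e-16, max_terms=2000):
    """Kummer's 1F1(a; b; x), summed from its defining power series."""
    term = total = 1.0
    for k in range(max_terms):
        term *= (a + k) * x / ((b + k) * (k + 1))
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total


def density_t(t, a, b):
    """Density of T = (V1 + a)/(V2 + b) with V1, V2 independent N(0, 1)."""
    coef = math.exp(-0.5 * (a * a + b * b)) / (math.pi * (1.0 + t * t))
    return coef * hyp1f1(1.0, 0.5, (a * t + b) ** 2 / (2.0 * (1.0 + t * t)))


def density_w_via_t(w, mu1, mu2, s1, s2, rho):
    """Density of W obtained from f_T by the change of variable t*(w)."""
    root = math.sqrt(1.0 - rho * rho)
    a = (mu1 / s1 - rho * mu2 / s2) / root
    b = mu2 / s2
    t_star = (w * s2 / s1 - rho) / root
    jacobian = s2 / (s1 * root)  # dt*/dw
    return density_t(t_star, a, b) * jacobian


def density_w_theorem2(w, mu1, mu2, s1, s2, rho):
    """Closed form of Theorem 2 (constant and argument as assumed above)."""
    q = 1.0 - rho * rho
    den = s2 ** 2 * w * w - 2.0 * rho * s1 * s2 * w + s1 ** 2
    z1, z2 = mu1 / s1, mu2 / s2
    c = (z1 * z1 - 2.0 * rho * z1 * z2 + z2 * z2) / q
    num = -s2 ** 2 * mu1 * w + rho * s1 * s2 * (mu2 * w + mu1) - mu2 * s1 ** 2
    theta = num * num / (2.0 * s1 ** 2 * s2 ** 2 * q * den)
    coef = s1 * s2 * math.sqrt(q) * math.exp(-0.5 * c) / (math.pi * den)
    return coef * hyp1f1(1.0, 0.5, theta)
```

The two functions follow different algebraic routes, so agreement to rounding error is a reasonable check that a, b and $t^*(w)$ have been combined consistently.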

6.2 RATIOS OF VARIABLES FROM SOME FORMS OF MIXTURES OF NORMAL DISTRIBUTIONS

We consider the case (,UU12 )is bivariate normal , whose conditional distribution has a covariance matrix of the form zI2 , where Z is a with distribution G, i.e. −1 (,UU12 | Z= z )~(0,) N 2 zI 2, where I2 is the identity matrix, with Z = V having distribution function G(v). It is immediate that when Z =1 , we have the case considered in

Theorem 1. Also, when the correlation between Uz1 | and Uz2 | is ρ ≠ 0 , a change of variables, as in the above corollary , is sufficient to bring the problem back to the above case. 16

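PROPOSITION: The density of $W = \dfrac{U_1 + a}{U_2 + b}$ is

$$f(t) = \frac{1}{\pi(1+t^2)}\int_0^\infty \exp\Big(-\frac{(a^2+b^2)\,v}{2}\Big)\,{}_1F_1\big(1; \tfrac{1}{2}; u(t)\,v\big)\,dG(v), \quad \text{where } u(t) = \frac{(b+at)^2}{2(1+t^2)}. \tag{19}$$

PROOF: By using the same argument as above, we have:

$$P(W \le t \mid Z = z) = P\Big(\frac{(U_1+a)/\sqrt{z}}{(U_2+b)/\sqrt{z}} \le t \,\Big|\, Z = z\Big) = \int_{-\infty}^{t} \exp\big[-(a^2+b^2)/2z\big]\,\frac{1}{\pi(1+y^2)}\,{}_1F_1\Big(1; \tfrac{1}{2}; \frac{u(y)}{z}\Big)\,dy .$$

Hence, setting $V = 1/Z$,

$$f_{W \mid V = v}(t) = \exp\big[-(a^2+b^2)\,v/2\big]\,\frac{1}{\pi(t^2+1)}\,{}_1F_1\big(1; \tfrac{1}{2}; u(t)\,v\big),$$

and (19) follows.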
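As an illustration of (19), the following Python sketch takes $G$ to be concentrated on finitely many points, so that the integral becomes a finite sum, and checks that the resulting function of $t$ integrates to 1. The three-point distribution for $Z$, the values of $a$ and $b$, and all function names are hypothetical choices made for this example only. The integral over $t$ uses the substitution $t = \tan\theta$, which turns the heavy-tailed integrand into a bounded, periodic one.

```python
import math

def hyp1f1_1_half(x, tol=1e-15, max_terms=1000):
    """Kummer function 1F1(1; 1/2; x) = sum_k x^k / (1/2)_k."""
    term, total = 1.0, 1.0
    for k in range(max_terms):
        term *= x / (0.5 + k)
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total

def u_of_t(t, a, b):
    """u(t) = (b + a t)^2 / (2 (1 + t^2)), as in (19)."""
    return (b + a * t) ** 2 / (2.0 * (1.0 + t * t))

def mixture_density(t, a, b, zs, ps):
    """(19) when Z takes the values zs with probabilities ps (so V = 1/z)."""
    u = u_of_t(t, a, b)
    s = 0.0
    for z, p in zip(zs, ps):
        v = 1.0 / z
        s += p * math.exp(-(a * a + b * b) * v / 2.0) * hyp1f1_1_half(u * v)
    return s / (math.pi * (1.0 + t * t))

def total_mass(a, b, zs, ps, n=4000):
    """Integral of the density over the real line via t = tan(theta).

    dt = (1 + t^2) d(theta), and the midpoint rule never hits theta = +-pi/2.
    """
    h = math.pi / n
    total = 0.0
    for i in range(n):
        t = math.tan(-math.pi / 2.0 + (i + 0.5) * h)
        total += mixture_density(t, a, b, zs, ps) * (1.0 + t * t)
    return total * h

if __name__ == "__main__":
    zs, ps = [0.5, 1.0, 2.0], [0.2, 0.5, 0.3]   # hypothetical law of Z
    print(total_mass(1.0, 0.5, zs, ps))
```

Putting all the mass at $z = 1$ recovers the unmixed density of $(U_1 + a)/(U_2 + b)$ with unit variances.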

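Some particular cases are of interest:

1) If $Z$ is a discrete random variable, with density $P(Z = z_i) = p_i$, $i = 1, 2, \ldots$, then

$$f(t) = \frac{1}{\pi(1+t^2)}\sum_{i=1}^{\infty} p_i\, e^{-(a^2+b^2)/(2z_i)}\;{}_1F_1\Big(1; \tfrac{1}{2}; \frac{u(t)}{z_i}\Big).$$

2) When $V \sim \text{Gamma}(\alpha, \beta)$, we have

$$f(t) = \frac{K}{\pi(1+t^2)}\;{}_2F_1\Big(\alpha, 1; \tfrac{1}{2}; \frac{2u(t)}{a^2+b^2+2\beta}\Big),$$

where $K = \Big(\dfrac{2\beta}{a^2+b^2+2\beta}\Big)^{\alpha}$ and $u(t) = \dfrac{(b+at)^2}{2(1+t^2)}$, while ${}_2F_1$ is the Gauss hypergeometric function.

This result can be established by noticing that, by (19),

$$\begin{aligned}
f_W(t) &= \frac{1}{\pi(1+t^2)}\,\frac{\beta^{\alpha}}{\Gamma(\alpha)}\int_0^\infty \sum_{k \ge 0}\frac{u(t)^k}{(1/2, k)}\,v^{\alpha+k-1}\exp\Big(-\frac{a^2+b^2+2\beta}{2}\,v\Big)\,dv \\
&= \frac{1}{\pi(1+t^2)}\Big(\frac{2\beta}{a^2+b^2+2\beta}\Big)^{\alpha}\sum_{k \ge 0}\frac{(\alpha, k)\,(1, k)}{(1/2, k)\;k!}\Big(\frac{2u(t)}{a^2+b^2+2\beta}\Big)^{k} .
\end{aligned}$$

It is interesting to see that when $\alpha = \beta = d/2$ we have the Student density in two parameters. Also, for $d = 1$ and $\alpha = \beta = 1/2$, we have a more general form of the Cauchy distribution, of the form

$$f(t) = \frac{c_1}{\pi\big[(t-c_2)^2 + c_1^2\big]}, \quad \text{with } c_1 = \frac{\sqrt{1+a^2+b^2}}{b^2+1} \text{ and } c_2 = \frac{ab}{b^2+1},$$

since, in the above density, ${}_2F_1\Big(\alpha, 1; \tfrac{1}{2}; \dfrac{2u(t)}{a^2+b^2+2\beta}\Big)$ reduces to $\dfrac{1}{1-z}$, with $z = \dfrac{2u(t)}{a^2+b^2+2\beta}$.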
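The Gamma case can be checked numerically in two ways, sketched below in Python. First, for $\alpha = \beta = 1/2$ the ${}_2F_1$ expression is compared with the Cauchy-type form in $c_1$ and $c_2$. Second, for a hypothetical choice $\alpha = 2$, $\beta = 1$, it is compared with a midpoint-rule evaluation of the integral in (19). The sketch assumes that Gamma$(\alpha, \beta)$ denotes shape $\alpha$ and rate $\beta$, i.e. density $\beta^{\alpha}v^{\alpha-1}e^{-\beta v}/\Gamma(\alpha)$, which is the reading under which the stated constant $K$ comes out; the function names, parameter values and quadrature settings are likewise assumptions.

```python
import math

def hyp1f1_1_half(x, tol=1e-15, max_terms=1000):
    """Kummer function 1F1(1; 1/2; x) = sum_k x^k / (1/2)_k."""
    term, total = 1.0, 1.0
    for k in range(max_terms):
        term *= x / (0.5 + k)
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total

def hyp2f1_alpha_1_half(alpha, z, tol=1e-15, max_terms=10000):
    """Gauss function 2F1(alpha, 1; 1/2; z) = sum_k (alpha)_k z^k / (1/2)_k, |z| < 1."""
    term, total = 1.0, 1.0
    for k in range(max_terms):
        term *= (alpha + k) * z / (0.5 + k)
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total

def u_of_t(t, a, b):
    return (b + a * t) ** 2 / (2.0 * (1.0 + t * t))

def gamma_mixture_density(t, a, b, alpha, beta):
    """Closed form of case 2: K / (pi (1 + t^2)) * 2F1(alpha, 1; 1/2; 2u/s)."""
    s = a * a + b * b + 2.0 * beta
    K = (2.0 * beta / s) ** alpha
    z = 2.0 * u_of_t(t, a, b) / s
    return K / (math.pi * (1.0 + t * t)) * hyp2f1_alpha_1_half(alpha, z)

def cauchy_form(t, a, b):
    """Cauchy-type density with c1 = sqrt(1 + a^2 + b^2)/(b^2 + 1), c2 = ab/(b^2 + 1)."""
    c1 = math.sqrt(1.0 + a * a + b * b) / (b * b + 1.0)
    c2 = a * b / (b * b + 1.0)
    return c1 / (math.pi * ((t - c2) ** 2 + c1 ** 2))

def gamma_mixture_by_quadrature(t, a, b, alpha, beta, v_max=60.0, n=6000):
    """Midpoint rule for (19) with dG(v) a Gamma(shape alpha, rate beta) density."""
    u = u_of_t(t, a, b)
    h = v_max / n
    total = 0.0
    for i in range(n):
        v = (i + 0.5) * h
        g = beta ** alpha * v ** (alpha - 1.0) * math.exp(-beta * v) / math.gamma(alpha)
        total += math.exp(-(a * a + b * b) * v / 2.0) * hyp1f1_1_half(u * v) * g
    return total * h / (math.pi * (1.0 + t * t))

if __name__ == "__main__":
    a, b = 1.0, 0.5                    # hypothetical shift parameters
    for t in (-2.0, 0.0, 0.7, 3.0):
        print(t, gamma_mixture_density(t, a, b, 0.5, 0.5), cauchy_form(t, a, b))
```

For $\alpha = 1/2$ the series terms reduce to $z^k$, so the sum is the geometric series $1/(1-z)$, which is the reduction used in the text.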

7. CONCLUSION

The density of the ratio of two normal random variables has been shown to have a convenient closed form when Hermite and Kummer functions are used. These functions are also easy to program on a computer and provide a powerful tool for dealing with questions related to this ratio. Applications in other domains that use this result can now be handled with ease. A generalization to ratios of variables arising from mixtures of normal distributions is possible and presents a unified approach to this problem.

REFERENCES

Brown, K.S., Ratio Populations, at http://www.seanet.com/~ksbrown
Castillo, E. and Galambos, J., Conditional Distributions and the Bivariate Normal Distribution, Metrika, 36, 1989, 209-214.
Dickey, J.M., Multiple Hypergeometric Functions: Probabilistic Interpretations and Statistical Uses, Journ. Amer. Stat. Assoc., 78, 1983, 628-637.
Fieller, E.C., The Distribution of the Index of a Normal Bivariate Distribution, Biometrika, 24, 1932, 428-440.
Geary, R.C., The Frequency Distribution of the Quotient of two Normal Variates, Journ. Royal Stat. Soc., 97, 1930, 442-446.
Hinkley, D.V., On the Ratio of two Correlated Normal Random Variables, Biometrika, 56, 1969, 635-639.
Lebedev, N.N., Special Functions and their Applications, Dover, New York, 1972.
Marsaglia, G., Ratios of Normal Variables and Ratios of sums of Uniform Variables, Journ. Amer. Stat. Assoc., 60, 1965, 193-204.
Mathforum, at http://Mathforum.org.
Nicholson, C., The Probability Integral for two Variables, Biometrika, 33, 1943, 59-72.
Paulson, E., A Note on the Estimation of Some Mean Values for a Bivariate Distribution, Annals of Mathematical Statistics, 1942, 440-445.
Pham-Gia, T., The Hazard Rate of the Power-Quadratic Exponential Family of Distributions, Stat. and Prob. Letters, 20, 1994, 375-382.
Pham-Gia, T. and Turkkan, N., Distribution of Ratios of Random Variables from the Power-Quadratic Exponential Family and Applications, Statistical Papers, submitted.
Pitman, E.J.G., Note on Normal Correlation, Biometrika, 31, 9, 1939.
Ruymgaart, F.H., Non-normal Bivariate Densities with Normal Marginals and Linear Regression Functions, Stat. Neerlandica, 27, 1973, 11-17.
Shanmugalingam, S., On the Analysis of the Ratio of Two Correlated Normal Variables, The Statistician, 31, 1982, 251-258.
Springer, M., The Algebra of Random Variables, Wiley, New York, 1984.
Stuart, A. and Ord, K., Kendall's Advanced Theory of Statistics, vol. 1, 5th ed., Oxford Univ. Press, New York, 1987.

FIGURE CAPTIONS

Fig. 1: Fieller’s ratio distribution for Egyptian archaeological data.

Fig. 2: Unimodal densities for X/Y

Fig. 3: Bimodal densities for X/Y

Fig. 4: Densities of X/Y and of marginal and conditional ratios