Available at http://pvamu.edu/aam
Applications and Applied Mathematics: An International Journal (AAM)
ISSN: 1932-9466
Vol. 3, Issue 1 (June 2008), pp. 42-54 (Previously, Vol. 3, No. 1)
Some Applications of Dirac's Delta Function in Statistics for More Than One Random Variable
Santanu Chakraborty Department of Mathematics University of Texas Pan American, 1201 West University Drive, Edinburg, Texas 78541, USA [email protected]
Received July 14, 2006; accepted December 7, 2007
Abstract
In this paper, we discuss some interesting applications of Dirac's delta function in statistics. We extend some of the existing results to the case of more than one random variable, concentrating in particular on the bivariate case.
Keywords: Dirac's Delta function, Random Variables, Distributions, Densities, Taylor's Series Expansions, Moment generating functions.
1. Introduction
Cauchy, in 1816, was the first to give a derivation of the Fourier integral theorem (Poisson did so independently in 1815) by means of an argument involving what we would now recognize as a sampling operation of the type associated with the delta function. There are similar examples of the use of what are essentially delta functions by Kirchhoff, Helmholtz and Heaviside. But Dirac was the first to use the notation $\delta$. The Dirac delta function ($\delta$-function) was introduced by Paul Dirac at the end of the 1920s in an effort to create the mathematical tools for the development of quantum field theory. He referred to it as an "improper function" in Dirac (1930). Later, in 1947, Laurent Schwartz gave it a more rigorous mathematical definition as a linear functional on the space of test functions $D$ (the set of all real-valued infinitely differentiable functions with compact support) such that for a given function $f(x)$ in $D$, the value of the functional is given by property (b) below. This is called the sifting or sampling property of the delta function. Since the delta function is not really a function in the classical sense, one should not consider the "value" of the delta function at $x$. Hence, the domain of the delta function is $D$ and its value, for $f \in D$ and a given $x_0$, is $f(x_0)$. Khuri (2004) studied some interesting applications of the delta function in statistics. He mainly studied univariate cases, even though he did give some interesting examples for the multivariate case. We shall study some more applications in the multivariate scenario in this work. These might help future researchers
in statistics to develop more ideas. In Sections 2 and 3, we discuss derivatives of the delta function in the univariate and multivariate cases. Then, in Section 4, we discuss some applications of the delta function in probability and statistics. In Section 5, we discuss calculations of densities in both the univariate and multivariate cases using transformations of variables. In Section 6, we use vector notation for delta functions in the multidimensional case. In Section 7, we discuss very briefly transformations of variables in the discrete case. Then, in Section 8, we discuss the moment generating function in the multivariate setup. We conclude with a few remarks in Section 9.
2. Derivatives of the δ-function in the Univariate Case
In the univariate case, some basic properties satisfied by Dirac's delta function are:
(a) $\displaystyle\int_{-\infty}^{\infty} \delta(x)\,dx = 1$,
(b) $\displaystyle\int_{a}^{b} f(x)\,\delta(x - x_0)\,dx = f(x_0)$ for all $a < x_0 < b$,

where $f(x)$ is any function continuous in a neighborhood of the point $x_0$. In particular, we have
$$\int_{-\infty}^{\infty} f(x)\,\delta(x - x_0)\,dx = f(x_0).$$
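As a quick illustration, the sifting property can be checked symbolically with sympy's `DiracDelta`; this is only a sketch, and the test function $\cos x$ and the point $x_0 = 2$ are arbitrary choices:

```python
# Sketch: checking the sifting property with sympy's symbolic DiracDelta.
# The test function cos(x) and the sample point x0 = 2 are arbitrary choices.
from sympy import symbols, integrate, DiracDelta, cos, oo

x = symbols('x')
x0 = 2
val = integrate(cos(x) * DiracDelta(x - x0), (x, -oo, oo))
print(val)  # cos(2)
```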
This is the sifting property that we mentioned in the previous section. If $f(x)$ is any function with continuous derivatives up to the $n$th order in some neighborhood of $x_0$, then
$$\int_{a}^{b} f(x)\,\delta^{(n)}(x - x_0)\,dx = (-1)^n f^{(n)}(x_0), \quad n \ge 0, \text{ for all } a < x_0 < b.$$
In particular, we have,
$$\int_{-\infty}^{\infty} f(x)\,\delta^{(n)}(x - x_0)\,dx = (-1)^n f^{(n)}(x_0), \quad n \ge 0,$$
for a given $x_0$. Here, $\delta^{(n)}$ is the generalized $n$th order derivative of $\delta$. This derivative defines a linear functional which assigns the value $(-1)^n f^{(n)}(x_0)$ to $f(x)$.
Now let us consider the Heaviside (unit step) function $H(x)$, defined by
$$H(x) = \begin{cases} 0, & x < 0 \\ 1, & x \ge 0. \end{cases}$$
The generalized derivative of $H(x)$ is $\delta(x)$, i.e., $\delta(x) = \frac{dH(x)}{dx}$. As a result, we get a special case of the formula for the $n$th order derivative mentioned above:
$$\int_{-\infty}^{\infty} x^n\,\delta^{(i)}(x)\,dx = \begin{cases} 0, & i \ne n \\ (-1)^n\, n!, & i = n. \end{cases}$$
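This identity can likewise be verified symbolically; sympy evaluates integrals against derivatives of `DiracDelta` by repeated integration by parts. A sketch, with $n = 3$ as an arbitrary choice:

```python
# Sketch: verifying that the integral of x^n * delta^(i)(x) equals (-1)^n n!
# when i = n, and 0 when i != n. The choice n = 3 is arbitrary.
from sympy import symbols, integrate, DiracDelta, oo

x = symbols('x')
n = 3
same = integrate(x**n * DiracDelta(x, n), (x, -oo, oo))       # case i = n
other = integrate(x**n * DiracDelta(x, n - 1), (x, -oo, oo))  # case i != n
print(same, other)
```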
3. Derivatives of the δ-function in the Bivariate Case
Following Saichev and Woyczynski (1997), Khuri (2004) extended the definition of the delta function to the $n$-dimensional Euclidean space, but we shall mainly concentrate on the bivariate case. As in the univariate case, we can write down similar properties for the bivariate case as well. In the bivariate case, $\delta(x, y) = \delta(x)\,\delta(y)$. So, if we assume $f(x, y)$ to be a continuous function in some neighborhood of $(x_0, y_0)$, then we can write
$$\iint_{\Re \times \Re} f(x, y)\,\delta(x - x_0, y - y_0)\,dx\,dy = f(x_0, y_0),$$
where ℜ is the real line.
Now, for this function f , if all its partial derivatives up to the nth order are continuous in the
abovementioned neighborhood of (x0 , y0 ) , then,
$$\iint_{\Re \times \Re} f(x, y)\,\delta^{(n)}(x - x_0, y - y_0)\,dx\,dy = (-1)^n \sum_{0 \le k \le n} {}^n C_k\, \frac{\partial^n f(x, y)}{\partial x^k\, \partial y^{n-k}}\bigg|_{x = x_0,\, y = y_0}, \qquad (1)$$
where ${}^n C_k$ is the number of combinations of $k$ out of $n$ objects and $\delta^{(n)}(x, y)$ is the generalized $n$th order derivative of $\delta(x, y)$. In the general $p$-dimensional case, by using induction on $n$, it can be shown that
$$\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f(x_1, \ldots, x_p)\,\delta^{(n)}(x_1 - x_1^*, \ldots, x_p - x_p^*)\,dx_1 \cdots dx_p = (-1)^n \sum_{k_1 = 0}^{n} \cdots \sum_{k_{p-1} = 0}^{\,n - k_1 - \cdots - k_{p-2}} \frac{n!}{k_1! \cdots k_p!}\, \frac{\partial^n f}{\partial x_1^{k_1} \cdots \partial x_p^{k_p}}\bigg|_{\mathbf{x} = \mathbf{x}^*},$$
with $k_p = n - k_1 - \cdots - k_{p-1}$,
where $\mathbf{x} = (x_1, \ldots, x_p)'$, $\mathbf{x}^* = (x_1^*, \ldots, x_p^*)'$ and $f$ is a function of the $p$ variables $x_1, x_2, \ldots, x_p$.
4. Use of Delta Function to Obtain Discrete Probability Distributions
If $X$ is a discrete random variable that assumes the values $a_1, \ldots, a_n$ with probabilities $p_1, \ldots, p_n$ respectively such that $\sum_{1 \le i \le n} p_i = 1$, then the probability mass function of $X$ can be represented as
$$p(x) = \sum_{1 \le i \le n} p_i\,\delta(x - a_i).$$
Now let us consider two discrete random variables $X$ and $Y$ which assume the values $a_1, \ldots, a_m$ and $b_1, \ldots, b_n$, respectively, and suppose the joint probability $P(X = a_i, Y = b_j)$ is given by $p_{ij}$ for $i = 1, \ldots, m$ and $j = 1, \ldots, n$, so that the joint probability mass function $p(x, y)$ is given by
$$p(x, y) = \sum_{1 \le i \le m}\sum_{1 \le j \le n} p_{ij}\,\delta(x - a_i)\,\delta(y - b_j).$$
Similarly, one can write down the joint probability distribution of any finite number of random variables in terms of delta functions as follows:
Suppose we have $k$ random variables $X_1, \ldots, X_k$, with $X_i$ taking the values $a_{ij}$, $j = 1, 2, \ldots, n_i$, for $i = 1, 2, \ldots, k$, and with joint probabilities $p_{i_1 \ldots i_k}$. Then, the joint probability mass function is
$$P(X_1 = x_1, \ldots, X_k = x_k) = \sum_{i_1 = 1}^{n_1} \cdots \sum_{i_k = 1}^{n_k} p_{i_1 \ldots i_k}\,\delta(x_1 - a_{1 i_1}) \cdots \delta(x_k - a_{k i_k}).$$
As an example, we may consider the multinomial distribution. Let $X_1, X_2, \ldots, X_k$ follow a multinomial distribution with parameters $n, p_1, p_2, \ldots, p_k$. Then,
$$P(X_1 = i_1, \ldots, X_k = i_k) = \frac{n!}{i_1! \cdots i_k!}\, p_1^{i_1} \cdots p_k^{i_k},$$
where $i_1, \ldots, i_k$ add up to $n$ and $p_1, \ldots, p_k$ add up to 1. In terms of delta functions, the joint probability mass function is
$$P(X_1 = x_1, X_2 = x_2, \ldots, X_k = x_k) = \sum_{i_1}\sum_{i_2} \cdots \sum_{i_k} \frac{n!}{i_1!\, i_2! \cdots i_k!}\, p_1^{i_1} p_2^{i_2} \cdots p_k^{i_k}\,\delta(x_1 - i_1)\,\delta(x_2 - i_2) \cdots \delta(x_k - i_k).$$
We can also consider conditional probabilities and express them in terms of the $\delta$-function. Let us go back to the example of the two discrete random variables $X$ and $Y$, where $X$ takes the values $a_1, a_2, \ldots, a_m$ and $Y$ takes the values $b_1, b_2, \ldots, b_n$. Then, the conditional probability of $Y = y$ given $X = x$ is given by
$$p(y \mid x) = P(Y = y \mid X = x) = \frac{P(Y = y, X = x)}{P(X = x)} = \frac{p(x, y)}{p(x)} = \frac{\displaystyle\sum_{1 \le i \le m}\sum_{1 \le j \le n} p_{ij}\,\delta(x - a_i)\,\delta(y - b_j)}{\displaystyle\sum_{1 \le i \le m} p_i\,\delta(x - a_i)},$$
where $p_i = \sum_{1 \le j \le n} p_{ij}$ is the marginal probability $P(X = a_i)$.
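Numerically, this conditional pmf is just the joint table divided by its row marginals. A minimal sketch with a hypothetical $2 \times 3$ joint table:

```python
import numpy as np

# Hypothetical joint pmf p_ij: rows index the values a_i of X, columns the b_j of Y
p = np.array([[0.10, 0.20, 0.10],
              [0.15, 0.25, 0.20]])

p_x = p.sum(axis=1)        # marginal pmf of X: p_i = sum_j p_ij
cond = p / p_x[:, None]    # p(y | x) = p(x, y) / p(x), row by row
print(cond.sum(axis=1))    # each conditional distribution sums to 1
```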
5. Densities of Transformations of Random Variables Using the δ-function
If X is a continuous random variable with a density function f (x) and if Y = g(X ) is a function of X , then the density function of Y , namely, h(y) is given by
$$h(y) = \int_{-\infty}^{\infty} f(x)\,\delta(y - g(x))\,dx.$$
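For monotone $g$, this formula reduces to the usual change-of-variables result $h(y) = f(g^{-1}(y))\,|dg^{-1}/dy|$. A Monte Carlo sketch checks the implied CDF; the choices $X \sim \mathrm{Exp}(1)$ and $g(x) = x^2$ are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(1.0, size=200_000)  # X ~ Exp(1)
y = x**2                                # Y = g(X) with g(x) = x^2

# Here h(y) = e^{-sqrt(y)} / (2 sqrt(y)) for y > 0, so P(Y <= c) = 1 - e^{-sqrt(c)}
c = 1.5
empirical = (y <= c).mean()
analytic = 1 - np.exp(-np.sqrt(c))
print(abs(empirical - analytic))  # small Monte Carlo error
```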
We can extend this to the two-dimensional case. If X and Y are two continuous random variables with joint density function f (x, y) and if Z = φ1 (X ,Y ) and W = φ2 (X ,Y ) are two random variables obtained as transformations from (X ,Y ) , then the bivariate density function for Z and W is given by
$$h(z, w) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,\delta(z - \phi_1(x, y))\,\delta(w - \phi_2(x, y))\,dx\,dy,$$
where $z$ and $w$ are the variables corresponding to the transformations $\phi_1(X, Y)$ and $\phi_2(X, Y)$. This has an obvious extension to the general $p$-dimensional case.
Khuri (2004) gave an example of two independent gamma random variables $X$ and $Y$ with distributions $\Gamma(\lambda, \alpha_1)$ and $\Gamma(\lambda, \alpha_2)$ respectively. If we denote the densities by $f_1$ and $f_2$ respectively, then we have
$$f_1(x) = \begin{cases} \dfrac{\lambda^{\alpha_1}}{\Gamma(\alpha_1)}\, x^{\alpha_1 - 1} e^{-\lambda x}, & x > 0 \\ 0, & x \le 0, \end{cases}$$
$$f_2(y) = \begin{cases} \dfrac{\lambda^{\alpha_2}}{\Gamma(\alpha_2)}\, y^{\alpha_2 - 1} e^{-\lambda y}, & y > 0 \\ 0, & y \le 0. \end{cases}$$
In that case, if we define $Z = \frac{X}{X + Y}$ and $W = X + Y$, then $Z$ is distributed as Beta with parameters $\alpha_1, \alpha_2$ and $W$ is distributed as Gamma with parameters $\lambda$ and $\alpha_1 + \alpha_2$. From now on, we shall use $\beta(a, b)$ to indicate a Beta random variable with positive parameters $a, b$ and $\Gamma(c, k)$ to indicate a Gamma distribution with positive parameters $c$ and $k$.
Now let us consider three random variables $X_1, X_2, X_3$ distributed independently so that $X_i$ is distributed as $\Gamma(\tfrac{1}{2}, \alpha_i)$ for $i = 1, 2, 3$, and define
$$Y_1 = \frac{X_1}{X_1 + X_2}, \qquad Y_2 = \frac{X_1 + X_2}{X_1 + X_2 + X_3}, \qquad Y_3 = X_1 + X_2 + X_3.$$
Then, we have:
$Y_1$ distributed as $\beta(\alpha_1, \alpha_2)$,
$Y_2$ distributed as $\beta(\alpha_1 + \alpha_2, \alpha_3)$,
$Y_3$ distributed as $\Gamma(\tfrac{1}{2}, \alpha_1 + \alpha_2 + \alpha_3)$.
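These distributional claims are easy to sanity-check by simulation. The sketch below compares sample means with the Beta and Gamma means; the shape values $\alpha_1, \alpha_2, \alpha_3 = 2, 3, 4$ are arbitrary, and $\Gamma(\tfrac{1}{2}, \alpha_i)$ is taken as shape $\alpha_i$ with rate $\tfrac{1}{2}$, i.e., scale 2:

```python
import numpy as np

rng = np.random.default_rng(1)
a1, a2, a3 = 2.0, 3.0, 4.0   # arbitrary shape parameters
# X_i ~ Gamma with shape alpha_i and rate 1/2 (scale 2), independently
x1, x2, x3 = (rng.gamma(a, scale=2.0, size=500_000) for a in (a1, a2, a3))

y1 = x1 / (x1 + x2)               # claimed Beta(a1, a2)
y2 = (x1 + x2) / (x1 + x2 + x3)   # claimed Beta(a1 + a2, a3)
y3 = x1 + x2 + x3                 # claimed Gamma(1/2, a1 + a2 + a3)

print(y1.mean())  # ~ a1 / (a1 + a2) = 0.4
print(y2.mean())  # ~ (a1 + a2) / (a1 + a2 + a3) = 5/9
print(y3.mean())  # ~ (a1 + a2 + a3) / (1/2) = 18
```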
This can be shown by exactly the same technique used in Khuri (2004); it is a one-step generalization of the result proved there. So, we define
$$\delta^*(x_1, x_2, x_3, y_1, y_2, y_3) = \delta\!\left(\frac{x_1}{x_1 + x_2} - y_1\right)\delta\!\left(\frac{x_1 + x_2}{x_1 + x_2 + x_3} - y_2\right)\delta(x_1 + x_2 + x_3 - y_3).$$
The joint density of Y1, Y2 and Y3 is given by
$$g(y_1, y_2, y_3) = \int_0^{\infty}\!\int_0^{\infty}\!\int_0^{\infty} f(x_1, x_2, x_3)\,\delta^*(x_1, x_2, x_3, y_1, y_2, y_3)\,dx_1\,dx_2\,dx_3$$
$$= \frac{1}{2^{\alpha_1 + \alpha_2 + \alpha_3}\,\Gamma(\alpha_1)\Gamma(\alpha_2)\Gamma(\alpha_3)} \int_0^{\infty}\!\int_0^{\infty}\!\int_0^{\infty} x_1^{\alpha_1 - 1} x_2^{\alpha_2 - 1} x_3^{\alpha_3 - 1}\, e^{-\frac{x_1 + x_2 + x_3}{2}}\,\delta^*(x_1, x_2, x_3, y_1, y_2, y_3)\,dx_1\,dx_2\,dx_3.$$
Using properties of the delta function,
the innermost integral (the integral with respect to $x_1$) is
$$\int_0^{\infty} x_1^{\alpha_1 - 1}\, e^{-\frac{x_1 + x_2 + x_3}{2}}\,\delta^*(x_1, x_2, x_3, y_1, y_2, y_3)\,dx_1$$
$$= \int_0^{\infty} x_1^{\alpha_1 - 1}\, e^{-\frac{x_1 + x_2 + x_3}{2}}\,\delta\!\left(\frac{x_1}{x_1 + x_2} - y_1\right)\delta\!\left(\frac{x_1 + x_2}{x_1 + x_2 + x_3} - y_2\right)\delta\big(x_1 - (y_3 - x_2 - x_3)\big)\,dx_1$$
$$= (y_3 - x_2 - x_3)^{\alpha_1 - 1}\, e^{-\frac{y_3}{2}}\,\delta\!\left(\frac{y_3 - x_2 - x_3}{y_3 - x_3} - y_1\right)\delta\!\left(\frac{y_3 - x_3}{y_3} - y_2\right).$$
Next, we use delta function properties to deal with the second integral (the integral with respect to $x_2$):
$$\int_0^{\infty} x_2^{\alpha_2 - 1} (y_3 - x_2 - x_3)^{\alpha_1 - 1}\, e^{-\frac{y_3}{2}}\,\delta\!\left(\frac{y_3 - x_2 - x_3}{y_3 - x_3} - y_1\right) dx_2$$
$$= \int_0^{\infty} x_2^{\alpha_2 - 1} (y_3 - x_2 - x_3)^{\alpha_1 - 1}\, (y_3 - x_3)\, e^{-\frac{y_3}{2}}\,\delta\big(x_2 - (y_3 - x_3)(1 - y_1)\big)\,dx_2$$
$$= (y_3 - x_3)^{\alpha_1 + \alpha_2 - 1}\, y_1^{\alpha_1 - 1} (1 - y_1)^{\alpha_2 - 1}\, e^{-\frac{y_3}{2}}.$$
Then, we use delta function properties for the outermost integral (without the constant terms):
$$\int_0^{\infty} x_3^{\alpha_3 - 1} (y_3 - x_3)^{\alpha_1 + \alpha_2 - 1}\, y_1^{\alpha_1 - 1} (1 - y_1)^{\alpha_2 - 1}\, e^{-\frac{y_3}{2}}\, y_3\,\delta\big(x_3 - y_3(1 - y_2)\big)\,dx_3$$
$$= y_1^{\alpha_1 - 1} (1 - y_1)^{\alpha_2 - 1}\, y_2^{\alpha_1 + \alpha_2 - 1} (1 - y_2)^{\alpha_3 - 1}\, y_3^{\alpha_1 + \alpha_2 + \alpha_3 - 1}\, e^{-\frac{y_3}{2}}.$$
Finally, putting the constant terms together, we get
$$g(y_1, y_2, y_3) = \frac{1}{2^{\alpha_1 + \alpha_2 + \alpha_3}\,\Gamma(\alpha_1)\Gamma(\alpha_2)\Gamma(\alpha_3)}\, y_1^{\alpha_1 - 1} (1 - y_1)^{\alpha_2 - 1}\, y_2^{\alpha_1 + \alpha_2 - 1} (1 - y_2)^{\alpha_3 - 1}\, y_3^{\alpha_1 + \alpha_2 + \alpha_3 - 1}\, e^{-\frac{y_3}{2}}.$$
This completes the proof.
6. Vector Notations for Delta Functions in the Multidimensional Case
In the multidimensional case, if the transformation is linear, i.e., Y = AX where Y and X are m ×1 and n ×1 vectors respectively and A is an m × n matrix, then we can express g(y) , the density of Y , in vector notation in terms of f (x) , the density of X as follows
$$g(\mathbf{y}) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f(\mathbf{x})\,\delta(\mathbf{y} - A\mathbf{x})\,d\mathbf{x}, \qquad (2)$$
where
$$\delta(\mathbf{y}) = \delta(y_1) \cdots \delta(y_m).$$
So,
$$\delta(\mathbf{y} - A\mathbf{x}) = \delta(y_1 - \mathbf{a}_1^T \mathbf{x}) \cdots \delta(y_m - \mathbf{a}_m^T \mathbf{x}),$$
where $\mathbf{a}_1^T, \ldots, \mathbf{a}_m^T$ are the rows of the matrix $A$. Now, in the one-dimensional setup, if $a$ is a nonzero scalar, then $\delta(ax)$ is given by $\delta(ax) = \frac{\delta(x)}{|a|}$. Similarly, in the multidimensional setup, if $\mathbf{Y} = A\mathbf{X}$ as above and $A$ is a nonsingular matrix so that $m = n$, then we must have
$$\delta(\mathbf{y} - A\mathbf{x}) = \frac{\delta(\mathbf{x} - A^{-1}\mathbf{y})}{|A|}. \qquad (3)$$
This is because of the following: since the transformation is nonsingular, we have
$$g(\mathbf{y}) = \frac{1}{|A|}\, f(A^{-1}\mathbf{y}),$$
and therefore, from (2),
$$f(A^{-1}\mathbf{y}) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f(\mathbf{x})\, |A|\,\delta(\mathbf{y} - A\mathbf{x})\,d\mathbf{x}.$$
But we know that
$$\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f(\mathbf{x})\,\delta(\mathbf{x} - A^{-1}\mathbf{y})\,d\mathbf{x} = f(A^{-1}\mathbf{y}).$$
Therefore, (3) follows. Similarly, if Y = AX + b where A is a nonsingular matrix and b is a vector, then, we have
$$\delta(\mathbf{y} - (A\mathbf{x} + \mathbf{b})) = \frac{\delta(\mathbf{x} - A^{-1}(\mathbf{y} - \mathbf{b}))}{|A|}.$$
Using this, one can conclude that if $\mathbf{X}$ is multivariate normal with mean vector $\boldsymbol{\mu}$ and covariance matrix $\Sigma$, then, for a nonsingular matrix $A$ and a constant vector $\mathbf{b}$, the transformed vector $\mathbf{Y} = A\mathbf{X} + \mathbf{b}$ follows a multivariate normal distribution with mean vector $A\boldsymbol{\mu} + \mathbf{b}$ and variance-covariance matrix $A \Sigma A^T$.
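A quick numerical sanity check of this fact (the particular $\boldsymbol{\mu}$, $\Sigma$, $A$, and $\mathbf{b}$ below are hypothetical choices):

```python
import numpy as np

rng = np.random.default_rng(2)
mu = np.array([1.0, -1.0])            # hypothetical mean vector
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])        # hypothetical covariance matrix
A = np.array([[1.0, 2.0],
              [0.0, 3.0]])            # nonsingular transformation
b = np.array([4.0, 5.0])

X = rng.multivariate_normal(mu, Sigma, size=400_000)
Y = X @ A.T + b                       # Y = AX + b, applied row by row

print(np.allclose(Y.mean(axis=0), A @ mu + b, atol=0.05))   # mean is A mu + b
print(np.allclose(np.cov(Y.T), A @ Sigma @ A.T, atol=0.1))  # covariance is A Sigma A^T
```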
7. Transformation of Variables in the Discrete Case
Transformation of variables can be applied to the discrete case as well. If X is a discrete random
variable taking the values $a_1, \ldots, a_n$ with probabilities $p_1, \ldots, p_n$ and if $Y = g(X)$ is a transformed variable, then the corresponding probability mass function for $Y$ is given by
$$q(y) = \int_{-\infty}^{\infty} p(x)\,\delta(y - g(x))\,dx = \sum_{i=1}^{n} p_i\,\delta(y - g(a_i)).$$
In the two-dimensional case, if p(x, y) is the joint probability mass function for the two variables X and Y , then q(z, w) , the joint probability mass function for the transformed pair
Z = φ1 (X ,Y ) and W = φ2 (X ,Y ) is given by
$$q(z, w) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} p(x, y)\,\delta(z - \phi_1(x, y))\,\delta(w - \phi_2(x, y))\,dx\,dy = \sum_{i=1}^{m}\sum_{j=1}^{n} p_{ij}\,\delta(z - \phi_1(a_i, b_j))\,\delta(w - \phi_2(a_i, b_j)),$$
where the pair $(X, Y)$ is discrete, taking the values $(a_i, b_j)$ with probabilities $p_{ij}$ for $i = 1, 2, \ldots, m$ and $j = 1, 2, \ldots, n$.
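In code, this pushforward pmf amounts to accumulating each atom's probability under the transformed pair. A sketch with a hypothetical joint pmf and the (arbitrary) transformations $Z = X + Y$, $W = X - Y$:

```python
from collections import defaultdict

# Hypothetical joint pmf of a discrete pair (X, Y)
support = [(0, 0), (0, 1), (1, 0), (1, 1)]
probs = [0.1, 0.2, 0.3, 0.4]

# Transformed pair Z = X + Y, W = X - Y:
# q(z, w) accumulates p_ij over all atoms (a_i, b_j) mapping to (z, w)
q = defaultdict(float)
for (a, b), p in zip(support, probs):
    q[(a + b, a - b)] += p

print(dict(q))  # {(0, 0): 0.1, (1, -1): 0.2, (1, 1): 0.3, (2, 0): 0.4}
```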
8. Moments and Moment Generating Functions
In the univariate setup, the $k$th non-central moment of $X$ is written as
$$\int_{-\infty}^{\infty} x^k p(x)\,dx = \int_{-\infty}^{\infty} x^k \sum_{1 \le i \le n} p_i\,\delta(x - a_i)\,dx = \sum_{1 \le i \le n} p_i \int_{-\infty}^{\infty} x^k\,\delta(x - a_i)\,dx = \sum_{1 \le i \le n} p_i\, a_i^k.$$
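The same computation can be carried out symbolically, integrating $x^k$ against the delta representation of the pmf. A sketch with a hypothetical three-point distribution:

```python
# Sketch: the k-th moment of a discrete X via its delta-function pmf.
# The support {1, 2, 5} and probabilities {1/2, 1/4, 1/4} are hypothetical.
from sympy import symbols, integrate, DiracDelta, Rational, oo

x = symbols('x')
a = [1, 2, 5]
p = [Rational(1, 2), Rational(1, 4), Rational(1, 4)]
pmf = sum(pi * DiracDelta(x - ai) for pi, ai in zip(p, a))

k = 2
moment = integrate(x**k * pmf, (x, -oo, oo))
print(moment)  # 1/2 * 1 + 1/4 * 4 + 1/4 * 25 = 31/4
```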
In the bivariate set up, the non-central moment of order (k,l) for (X,Y) is given by
$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x^k y^l p(x, y)\,dx\,dy = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x^k y^l \sum_{i=1}^{m}\sum_{j=1}^{n} p_{ij}\,\delta(x - a_i)\,\delta(y - b_j)\,dx\,dy$$
$$= \sum_{i=1}^{m}\sum_{j=1}^{n} p_{ij} \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x^k y^l\,\delta(x - a_i, y - b_j)\,dx\,dy = \sum_{i=1}^{m}\sum_{j=1}^{n} a_i^k\, b_j^l\, p_{ij}.$$
For example, if $(X, Y)$ follows a trinomial distribution, then
$$p(x, y) = \sum_{k=0}^{n}\sum_{l=0}^{n-k} \frac{n!}{k!\, l!\, (n-k-l)!}\, p_1^k\, p_2^l\, (1 - p_1 - p_2)^{n-k-l}\,\delta(x - k)\,\delta(y - l).$$
As a result, the corresponding non-central moment of order (r, s) is given by
$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x^r y^s p(x, y)\,dx\,dy = \sum_{k=0}^{n}\sum_{l=0}^{n-k} \frac{n!}{k!\, l!\, (n-k-l)!}\, p_1^k\, p_2^l\, (1 - p_1 - p_2)^{n-k-l}\, k^r\, l^s.$$
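For small $n$ this moment can be computed by direct enumeration over the trinomial support; a sketch with arbitrary parameters $n = 4$, $p_1 = 0.2$, $p_2 = 0.3$:

```python
from math import factorial

n, p1, p2 = 4, 0.2, 0.3   # arbitrary trinomial parameters
p3 = 1 - p1 - p2

def trinomial_moment(r, s):
    """E[X^r Y^s] = sum over the support of k^r l^s p(k, l)."""
    total = 0.0
    for k in range(n + 1):
        for l in range(n - k + 1):
            pkl = (factorial(n) // (factorial(k) * factorial(l) * factorial(n - k - l))
                   ) * p1**k * p2**l * p3**(n - k - l)
            total += k**r * l**s * pkl
    return total

print(trinomial_moment(1, 0))  # E[X] = n p1 = 0.8
print(trinomial_moment(1, 1))  # E[XY] = n(n-1) p1 p2 = 0.72
```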
The most interesting part in Khuri's article is the representation of the density function of a continuous random variable X in terms of its non-central moments. Thus, if f (x) is the density function for X , then f (x) is represented as
$$f(x) = \sum_{0 \le m < \infty} \frac{(-1)^m}{m!}\,\mu_m\,\delta^{(m)}(x),$$
where $\delta^{(m)}(x)$ is the generalized $m$th derivative of $\delta(x)$ and $\mu_m$ is the $m$th order non-central moment of the random variable $X$. One can see a proof of this result in Kanwal (1998). Let us briefly mention the technique used there to derive the above expression. He showed that, for any real function $\psi$ defined and differentiable to all orders in a neighborhood of zero,
$$\langle f, \psi \rangle = \sum_{0 \le n < \infty} \frac{(-1)^n}{n!}\,\langle \mu_n\,\delta^{(n)}(x), \psi \rangle,$$
where 〈 f ,ψ 〉 , the inner product between the functions f and ψ , is defined as
$$\langle f, \psi \rangle = \int_{-\infty}^{\infty} f(x)\,\psi(x)\,dx.$$
This leads us to conclude
$$f(x) = \sum_{0 \le n < \infty} \frac{(-1)^n}{n!}\,\mu_n\,\delta^{(n)}(x). \qquad (4)$$
The crucial step in the proof was to use the Taylor expansion in one dimension for the function $\psi$. Thus, we have
$$\psi(x) = \sum_{0 \le n < \infty} \frac{\psi^{(n)}(0)\, x^n}{n!}.$$
Then,
$$\psi^{(n)}(0) = (-1)^n \int_{-\infty}^{\infty} \psi(x)\,\delta^{(n)}(x)\,dx = (-1)^n\,\langle \psi, \delta^{(n)} \rangle.$$
These two steps give us the relation (4).
When we move to the two-dimensional scenario, we shall have to use Taylor's series expansion for a two-dimensional analytic function ψ about the point (0,0) which is given by
$$\psi(x, y) = \sum_{0 \le n < \infty} \frac{1}{n!}\left[x \frac{\partial}{\partial x} + y \frac{\partial}{\partial y}\right]^n \psi(x, y)\bigg|_{x=0,\, y=0}$$
$$= \sum_{n=0}^{\infty} \frac{1}{n!} \sum_{k=0}^{n} {}^n C_k\, \frac{\partial^n \psi(x, y)}{\partial x^k\, \partial y^{n-k}}\bigg|_{x=0,\, y=0} x^k\, y^{n-k}. \qquad (5)$$
Now, here also, we follow the same technique and so we compute 〈 f ,ψ 〉 which is defined as
$$\langle f, \psi \rangle = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,\psi(x, y)\,dx\,dy.$$
Now, using the Taylor series expansion of $\psi(x, y)$ about $(0, 0)$ from (5) and assuming that the interchange of integrals with summations is permissible, we get
$$\langle f, \psi \rangle = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\left[\sum_{0 \le n < \infty}\sum_{0 \le k \le n} \frac{1}{n!}\, {}^n C_k\, \frac{\partial^n \psi(x, y)}{\partial x^k\, \partial y^{n-k}}\bigg|_{x=0,\, y=0} x^k y^{n-k}\right] dx\,dy$$
$$= \sum_{0 \le n < \infty}\sum_{0 \le k \le n} \frac{1}{k!\,(n-k)!}\, \frac{\partial^n \psi(x, y)}{\partial x^k\, \partial y^{n-k}}\bigg|_{x=0,\, y=0} \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\, x^k y^{n-k}\,dx\,dy$$
$$= \sum_{0 \le n < \infty}\sum_{0 \le k \le n} \frac{1}{k!\,(n-k)!}\, \frac{\partial^n \psi(x, y)}{\partial x^k\, \partial y^{n-k}}\bigg|_{x=0,\, y=0}\, \langle f, x^k y^{n-k} \rangle.$$
Now we define the $(r, s)$th order non-central moment for the pair $(X, Y)$ as
$$\mu_{r,s} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\, x^r y^s\,dx\,dy = \langle f, x^r y^s \rangle.$$
Then, from the above,
$$\langle f, \psi \rangle = \sum_{0 \le n < \infty}\sum_{0 \le k \le n} \frac{1}{k!\,(n-k)!}\, \frac{\partial^n \psi(x, y)}{\partial x^k\, \partial y^{n-k}}\bigg|_{x=0,\, y=0}\, \mu_{k, n-k}$$
$$= \sum_{0 \le n < \infty}\sum_{0 \le k \le n} \frac{1}{k!\,(n-k)!}\, \langle \mu_{k, n-k}\,(-1)^n\,\delta^{(k)}(x)\,\delta^{(n-k)}(y), \psi \rangle.$$
The last equality follows because of the following relation
$$\langle \psi, \delta^{(k)}(x)\,\delta^{(n-k)}(y) \rangle = (-1)^n\, \frac{\partial^n \psi(x, y)}{\partial x^k\, \partial y^{n-k}}\bigg|_{x=0,\, y=0}.$$
Therefore, we have
$$f(x, y) = \sum_{0 \le n < \infty}\sum_{0 \le k \le n} \frac{1}{k!\,(n-k)!}\,\mu_{k, n-k}\,(-1)^n\,\delta^{(k)}(x)\,\delta^{(n-k)}(y).$$
Now, the non-central moment of order (r, s) is given by
$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x^r y^s f(x, y)\,dx\,dy = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x^r y^s \sum_{0 \le n < \infty}\sum_{0 \le k \le n} \frac{1}{k!\,(n-k)!}\,\mu_{k, n-k}\,(-1)^n\,\delta^{(k)}(x)\,\delta^{(n-k)}(y)\,dx\,dy.$$
But we also have
$$\iint x^r y^s\,\delta^{(i)}(x)\,\delta^{(j)}(y)\,dx\,dy = \begin{cases} 0, & i \ne r \text{ or } j \ne s \\ (-1)^{r+s}\, r!\, s!, & i = r \text{ and } j = s. \end{cases}$$
Therefore, the non-central moment of order $(r, s)$ reduces to $\mu_{r,s}$.
When we talk about the moment generating function in the one variable case, we have
$$\phi(t) = E(e^{tX}) = \int_{-\infty}^{\infty} e^{tx} \sum_{0 \le n < \infty} \frac{(-1)^n}{n!}\,\mu_n\,\delta^{(n)}(x)\,dx$$
$$= \sum_{0 \le n < \infty} \frac{(-1)^n}{n!}\,\mu_n \int_{-\infty}^{\infty} e^{tx}\,\delta^{(n)}(x)\,dx$$
$$= \sum_{0 \le n < \infty} \frac{(-1)^n}{n!}\,\mu_n\,(-1)^n\, \frac{d^n}{dx^n} e^{tx}\bigg|_{x=0}$$
$$= \sum_{0 \le n < \infty} \frac{\mu_n}{n!}\, t^n.$$
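This moment-series representation of the mgf is easy to check numerically for a discrete distribution, comparing the truncated series with the direct expectation. A sketch with a hypothetical three-point distribution and the arbitrary choice $t = 0.4$:

```python
import math

# Hypothetical discrete X on {0, 1, 3} with probabilities 0.5, 0.3, 0.2
a = [0, 1, 3]
p = [0.5, 0.3, 0.2]
t = 0.4

direct = sum(pi * math.exp(t * ai) for pi, ai in zip(p, a))
# phi(t) = sum_n mu_n t^n / n!, with mu_n = sum_i p_i a_i^n, truncated at n = 30
series = sum(sum(pi * ai**n for pi, ai in zip(p, a)) * t**n / math.factorial(n)
             for n in range(30))
print(abs(direct - series))  # negligible truncation error
```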
In the two-variable case, the moment generating function is given by
$$\phi(s, t) = E(e^{sX + tY}) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{sx + ty} f(x, y)\,dx\,dy$$
$$= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{sx + ty} \sum_{0 \le n < \infty}\sum_{0 \le k \le n} \frac{1}{k!\,(n-k)!}\,\mu_{k, n-k}\,(-1)^n\,\delta^{(k)}(x)\,\delta^{(n-k)}(y)\,dx\,dy$$
$$= \sum_{0 \le n < \infty}\sum_{0 \le k \le n} \frac{1}{k!\,(n-k)!}\,\mu_{k, n-k}\, s^k t^{n-k}$$
$$= \sum_{0 \le n < \infty}\sum_{0 \le k \le n} \frac{s^k}{k!}\, \frac{t^{n-k}}{(n-k)!}\,\mu_{k, n-k}.$$
In the general $p$-dimensional case, the moment generating function can be obtained in exactly the same fashion, so that it is given by
$$\phi(t_1, t_2, \ldots, t_p) = \sum_{0 \le n < \infty}\, \sum_{0 \le k_1 \le n} \cdots \sum_{0 \le k_{p-1} \le n - k_1 - \cdots - k_{p-2}} \frac{t_1^{k_1}}{k_1!}\, \frac{t_2^{k_2}}{k_2!} \cdots \frac{t_p^{k_p}}{k_p!}\,\mu_{k_1, k_2, \ldots, k_p}, \qquad k_p = n - k_1 - \cdots - k_{p-1}.$$

9. Concluding Remarks

The study of generalized functions is now widely used in applied mathematics and engineering sciences. The δ-function approach provides a unified way of treating discrete and continuous distributions, and it has the potential to facilitate new ways of examining some classical concepts in mathematical statistics. Some interesting applications can already be found in the paper by Pazman and Pronzato (1996), where the authors use a delta function approach for densities of nonlinear statistics and for marginal densities in nonlinear regression. We look forward to obtaining further interesting applications of the delta function in statistics.

Acknowledgement: I am sincerely thankful to Professor Lokenath Debnath of the University of Texas-Pan American for bringing this problem to my notice.

REFERENCES

Dirac, P.A.M. (1930). The Principles of Quantum Mechanics, Oxford University Press.
Hoskins, R.F. (1998). Generalized Functions, Ellis Horwood Limited, Chichester, Sussex, England.
Kanwal, R.P. (1998). Generalized Functions: Theory and Technique (2nd Edition), Boston, MA, Birkhauser.
Khuri, A.I. (2004). Applications of Dirac's delta function in statistics, International Journal of Mathematical Education in Science and Technology, 35(2), 185-195.
Pazman, A. and Pronzato, L. (1996). A Dirac-function method for densities of nonlinear statistics and for marginal densities in nonlinear regression, Statistics & Probability Letters, 26, 159-167.
Saichev, A.I. and Woyczynski, W.A. (1997). Distributions in the Physical and Engineering Sciences, Boston, MA, Birkhauser.