Journal of Statistical Theory and Applications, Vol. 16, No. 2 (June 2017) 269–283

An Alternative Approach of Binomial and Multinomial Distributions

Mian Arif Shams Adnan Department of Mathematical Science, Ball State University, Muncie, IN 47304, USA [email protected]

Md. Tareq Ferdous Khan Department of Statistics, Jahangirnagar University, Dhaka-1342, Bangladesh

Md. Forhad Hossain Department of Statistics, Jahangirnagar University, Dhaka-1342, Bangladesh

Abdullah Albalawi Department of Mathematical Science, Ball State University, Muncie, IN 47304, USA

Received 4 April 2015

Accepted 18 September 2016

Abstract
In this paper we present an alternative approach to two discrete distributions, the Binomial and the Multinomial, with a new concept of sampling having a more general form than the traditional sampling methods. The existing distributions may be obtained as special cases of our newly suggested technique. The basic distributional properties of the proposed distributions are also examined, including their limiting forms.
Keywords: Generalized Binomial Distribution, Generalized Multinomial Distribution, Sampling Methods, Distributional Properties, Arithmetic Progression, Limiting Form.
Mathematics Subject Classification: 60E05, 62E10, 62E15, 62E20.

1. Introduction
Discrete distributions are widely used in diversified fields, and among them the Binomial, Multinomial, Poisson and Geometric are the most commonly used. Other discrete distributions include the uniform or rectangular, hypergeometric, negative binomial, power series, etc. Truncated and censored forms of the various discrete distributions also appear in the statistics literature and in real-life problems. The binomial distribution was first studied in connection with games of pure chance, but it is not limited to that area, and the Multinomial distribution is regarded as its generalization: a single multinomial trial has k mutually exclusive outcomes, compared with the two outcomes, success or failure, of a Binomial trial. The usual binomial distribution is the discrete probability distribution of the number of successes, from 0 to n, resulting from n independent Bernoulli trials, each of which yields success with probability p and failure with

Copyright © 2017, the Authors. Published by Atlantis Press. This is an open access article under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).


probability q. An important assumption about the binomial variate, which represents the number of successes, is that it can take only the values in the sequence 0, 1, 2, …, n. But in the real world the successes may not occur in this usual way; they may occur in a different sequence, such as (i) 0, 2, 4, …, 2n; (ii) 2, 4, …, 2n; (iii) 0, 3, 6, …, 3n; (iv) 3, 6, 9, …, 3n; and so on. In cases (ii) and (iv), a truncated distribution might seem the better option for finding the probability of the number of successes. In a truncated distribution, however, it is assumed that the truncated values of the random variable carry certain probabilities. If the truncated values simply do not exist, the truncated distribution cannot provide the probability without mathematical cumbersomeness. In the remaining cases, the Binomial and truncated Binomial are completely helpless. In this context, we suggest an alternative approach to the Binomial and Multinomial distributions with a generalized sequence of values of the random variable. For convenience the distributions are referred to as relatively generalized distributions. The number of successes of the proposed distributions is represented by the arithmetic progression a + nd, where a is a non-negative integer termed the minimum number of successes, d is a positive integer representing the concentration of successes, and n is a non-negative integer indicating the total number of trials.
To relate the sequence to a real-life situation, let us consider the number of defective shoes. It is well known that shoes are produced pairwise. That is, if we make n attempts to identify the number of defective shoes, it is natural that the defectives occur pairwise. In this context, a number of successes of the form 0, 2, 4, …, 2n is justified.
If it is known that the minimum number of defective items in n attempts is a, then these occurrences are not chance outcomes and may be regarded as constant, and we are then interested in the probability of chance outcomes of the form a, a + 2, a + 4, …, 2n, where a > 0 is a non-negative integer; that is, the number of defective shoes larger than a. The other possible sequences can be justified by real-life examples in the same way.
One may confuse the proposed form of the distribution with the usual truncated distribution. The major difference is that in our form the probability exists only for the possible numbers of successes shown in the sequence. In the example discussed above, the number of defective shoes takes the values 0, 2, 4, …, 2n. This indicates that there is no number of defective items of the form 1, 3, 5, …, and thus no probabilities for them. In a truncated distribution, on the other hand, the truncated numbers of successes do exist and carry probabilities.
A number of authors have published work under the heading of generalized binomial distributions, but their work differs from the key concept of the present paper. Altham1 showed two generalizations of the binomial distribution for random variables that are identically distributed but not independent, assumed to have a symmetric joint distribution with no second- or higher-order "interactions". The two generalizations are obtained depending on whether the "interaction" for discrete variables is "multiplicative" or "additive". The distribution has a new parameter θ > 0 which controls its shape and is flexible enough to allow both over- and under-dispersion relative to the traditional Binomial distribution, whereas the beta-binomial distribution allows only over-dispersion relative to the corresponding Binomial distribution (Johnson, Kemp and Kotz2).
Dwass3 provided a unified approach to a family of discrete distributions that includes the hypergeometric, Binomial, and Polya distributions by considering a simple sampling scheme in which, after each drawing, there is a "replacement" whose magnitude is a fixed real number. Paul4 derived a new three-parameter distribution, a generalization of the binomial, the beta-binomial and the correlated beta-binomial distributions. A further modification of the beta-correlated binomial distribution was proposed by Raul5. In the generalization of probability distributions, Panaretos and Xekalaki6 developed cluster binomial and multinomial models and their probability distributions. In his study, Madsen7 discussed how, in many cases, the binomial distribution fails to apply because the data show more variability than the distribution can explain. He pointed out a characterization of sequences of exchangeable Bernoulli random variables that can be used to develop models more flexible than the traditional binomial distribution, exhibited sufficient conditions that yield such models, and showed how existing models can be combined to generate further ones.


A generalization of the binomial distribution was introduced by Drezner and Farnum8 that allows dependence between trials and non-constant probabilities of success from trial to trial, and which contains the usual binomial distribution as a special case. A new departure in the generalization was made by Fu and Sproule9, who adopted the assumption that the underlying Bernoulli trials take on the values α or β, where α < β, rather than the conventional values 0 or 1. This yields a four-parameter binomial distribution of the form B(n, p, α, β). In a recent work, Altham and Hankin10 introduced two generalizations of the multinomial and Binomial distributions which arise from the multiplicative Binomial distribution of Altham1, termed the "multivariate multinomial distribution" and the "multivariate multiplicative binomial distribution". Like Altham's generalized distribution, both have an additional shape parameter θ that gives the usual distribution when θ = 1 and over- or under-dispersion when θ is greater or less than 1, respectively.

2. New Approach of Binomial Distribution
In traditional Binomial sampling, each draw yields either a success or a failure; we continue this method for n trials and may observe any number of successes from 0 up to n. In the real world, however, the successes may not occur in this usual sequence; they may occur as

i) 0, 2, 4, …, 2n
ii) 2, 4, …, 2n
iii) 0, 3, 6, …, 3n
iv) 3, 6, 9, …, 3n
and so on.

These indicate that the number of successes may follow an arithmetic progression a + nd, where a is a non-negative integer representing the minimum number of successes, d is a positive integer representing the concentration of successes occurring, and n is a non-negative integer indicating the total number of trials.

Definition 2.1. A random variable X is said to have a relatively general binomial distribution if it has the following probability mass function

P(x; a, n, d, p) = C(a+nd, x) p^x q^(a+nd−x) / Σ_{x=a}^{a+nd} C(a+nd, x) p^x q^(a+nd−x) = (1/k) C(a+nd, x) p^x q^(a+nd−x);  x = a, a+d, a+2d, …, a+nd   (2.1)

where the sums here, and throughout, run over the support x = a, a+d, …, a+nd; a ≥ 0 is the minimum number of successes, d > 0 is the concentration of occurrence of successes, n is a predefined finite non-negative integer representing the number of trials, and p is the probability of success, with p + q = 1 and k = Σ_{x=a}^{a+nd} C(a+nd, x) p^x q^(a+nd−x) a constant. The probability mass function P(x; a, n, d, p) stands for the probability of getting x successes out of a maximum of a + nd successes in n trials.

Theorem 2.1. The probability mass function of the generalized binomial distribution is a probability function.
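As a code sketch of Definition 2.1 (an illustration; the function name and its parameterization are mine, not the paper's), the pmf (2.1) can be computed by renormalizing the binomial term over the restricted support:

```python
from math import comb

def gen_binom_pmf(x, a, n, d, p):
    """Pmf (2.1): the binomial term C(a+nd, x) p^x q^(a+nd-x),
    renormalized over the restricted support x = a, a+d, ..., a+nd."""
    N = a + n * d
    support = range(a, N + 1, d)
    q = 1.0 - p
    # normalizing constant k of Eq. (2.1)
    k = sum(comb(N, j) * p**j * q**(N - j) for j in support)
    if x not in support:
        return 0.0
    return comb(N, x) * p**x * q**(N - x) / k
```

With a = 0 and d = 1 the support is 0, 1, …, n and k = 1, so the function returns the ordinary Binomial(n, p) pmf, in line with Theorem 2.2; for any admissible (a, n, d, p) the probabilities sum to 1, in line with Theorem 2.1.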

Theorem 2.2. There is a relationship between the generalized binomial distribution with parameters a ≥ 0, d > 0, n ≥ 0 and p and the usual binomial distribution with parameters n ≥ 0 and p.


It can easily be shown that the probability mass function of the generalized binomial distribution reduces to the probability mass function of the usual binomial distribution when a = 0 and d = 1.

Theorem 2.3. The moment generating function of the generalized binomial distribution is

M_X(t) = Σ_{x=a}^{a+nd} C(a+nd, x) (pe^t)^x q^(a+nd−x) / Σ_{x=a}^{a+nd} C(a+nd, x) p^x q^(a+nd−x)   (2.2)

Theorem 2.4. The mean and variance of the generalized binomial distribution are (a+nd)p and (a+nd)pq respectively.

Theorem 2.5. If X follows the generalized binomial distribution, then its 3rd and 4th raw and central moments are

μ′_3 = (a+nd)³p³ − 3(a+nd)²p³ + 2(a+nd)p³ + 3(a+nd)²p² − 3(a+nd)p² + (a+nd)p   (2.4)
μ′_4 = (a+nd)⁴p⁴ − 6(a+nd)³p⁴ + 11(a+nd)²p⁴ − 6(a+nd)p⁴ + 6(a+nd)³p³ − 18(a+nd)²p³ + 12(a+nd)p³ + 7(a+nd)²p² − 7(a+nd)p² + (a+nd)p   (2.5)
μ_3 = (a+nd)pq(1 − 2p)   (2.6)
μ_4 = (a+nd)pq[1 + 3(a+nd − 2)pq]   (2.7)

Again, in the special case a = 0 and d = 1, the 3rd and 4th raw and central moments turn into those of the usual Binomial distribution.

Theorem 2.6. The measures and coefficients of skewness and kurtosis of the generalized binomial distribution are:

Skewness
Measure of skewness: β_1 = (1 − 2p)² / [(a+nd)pq]   (2.8)
Coefficient of skewness: γ_1 = √β_1 = (1 − 2p) / √[(a+nd)pq]   (2.9)

From the coefficient of skewness the following conclusions can be drawn; it is striking that the nature of the skew depends only on p, just as for the usual binomial distribution:

i) The distribution is positively skewed if p < 1/2.
ii) The distribution is negatively skewed if p > 1/2.
iii) The distribution is symmetric if p = 1/2.

Kurtosis
Measure of kurtosis: β_2 = 3 + (1 − 6pq) / [(a+nd)pq]   (2.10)


Coefficient of kurtosis: γ_2 = β_2 − 3 = (1 − 6pq) / [(a+nd)pq]   (2.11)

These equations tell us that the generalized distribution is
i) mesokurtic if pq = 1/6,
ii) platykurtic if pq > 1/6, and
iii) leptokurtic if pq < 1/6.

Theorem 2.7. The maximum likelihood estimate of the parameter p is x/(a+nd), where x is the total number of successes out of a maximum of a + nd successes in n trials.

Theorem 2.8. The maximum likelihood estimator of the number of trials of the distribution is n̂ = (1/d)(Σ_{i=1}^n x_i / p̂ − a).

Theorem 2.9. Necessary conditions to derive the generalized Poisson distribution from the generalized binomial distribution.

The generalized Poisson distribution can be derived from the generalized binomial distribution under the following assumptions:
i) p, the probability of success in a Bernoulli trial, is very small, i.e. p → 0;
ii) n, the number of trials, is very large, i.e. n → ∞;
iii) (a+nd)p = λ is a finite constant, that is, the average number of successes is finite.
Under these conditions we have (a+nd)p = λ, and therefore p = λ/(a+nd) and q = 1 − λ/(a+nd).

Theorem 2.10. The normal distribution is a limiting form of the generalized binomial distribution.
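The Poisson limit of Theorem 2.9 can be checked numerically in the reducing case a = 0, d = 1 (so a + nd = n), where the generalized distribution coincides with the ordinary binomial; this is a sketch under that assumption, with λ = np held fixed as n grows:

```python
from math import comb, exp, factorial

def binom_pmf(n, x, p):
    # ordinary Binomial(n, p) pmf (the a = 0, d = 1 case of Eq. 2.1)
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(lam, x):
    return exp(-lam) * lam**x / factorial(x)

# hold (a + nd)p = lam fixed while n -> infinity, as in conditions (i)-(iii)
lam, n = 2.0, 5000
p = lam / n
max_gap = max(abs(binom_pmf(n, x, p) - poisson_pmf(lam, x)) for x in range(10))
```

For n = 5000 the two pmfs already agree to within about 10⁻⁴ at every point of the range checked.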

3. New Approach of Multinomial Distribution
Under the sampling scheme described in Section 2, consider the situation where one of k mutually exclusive outcomes is possible in a single trial, rather than only success or failure. More specifically, suppose the outcomes are denoted by e_1, e_2, …, e_k and the numbers of occurrences of the respective outcomes by x_1, x_2, …, x_k, such that Σ_{i=1}^k x_i = a + nd, where a is a non-negative integer termed the minimum number of successes, d is a positive integer representing the concentration of successes, and n is a non-negative integer indicating the total number of trials. Then our suggested general form of the Binomial distribution, as well as the traditional Multinomial distribution, cannot provide the probability that the event e_1 occurred x_1 times, the event e_2 occurred x_2 times, and so on up to the event e_k occurring x_k times. To overcome this situation, we suggest a new approach to the Multinomial distribution, termed the relatively more general Multinomial or generalized Multinomial distribution. In this section we present only the definition of the suggested distribution and the statements of the theorems derived for the present work, both to keep the paper a standard size and because the derivations are very similar to those for the suggested Binomial distribution.

Definition 3.1. The k discrete random variables X_1, X_2, …, X_k are said to have a generalized multinomial distribution if they have the following probability mass function


P(x_1, x_2, …, x_k; a, n, d, p_1, p_2, …, p_k) = [(a+nd)! / (x_1! x_2! ⋯ x_k!)] p_1^{x_1} p_2^{x_2} ⋯ p_k^{x_k} / Σ_{x_1, x_2, …, x_k} [(a+nd)! / (x_1! x_2! ⋯ x_k!)] p_1^{x_1} p_2^{x_2} ⋯ p_k^{x_k}   (3.1)

where a ≥ 0, d > 0, n ≥ 0 and p_1, p_2, …, p_k, with Σ_{i=1}^k p_i = 1, are the parameters of the distribution, and Σ_{i=1}^k x_i = a + nd.

The defined function is a probability mass function, as it satisfies the conditions of a probability function, and it easily reduces to the traditional Multinomial distribution when a = 0 and d = 1; thus the traditional Multinomial distribution may be said to be a special case of the suggested Multinomial distribution.

Theorem 3.1. The generalized Binomial distribution is a special case of the generalized Multinomial distribution.
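A minimal code sketch of (3.1). Since the index set of the normalizing sum is not fully explicit above, it is taken here over all count vectors with total a + nd; this is an assumed reading, under which the a = 0, d = 1 case recovers the ordinary multinomial by the multinomial theorem. The function names are illustrative:

```python
from itertools import product
from math import factorial

def multinom_term(counts, ps):
    # (a+nd)! / (x1! ... xk!) * p1^x1 ... pk^xk for one count vector
    coef = factorial(sum(counts))
    val = 1.0
    for c, p in zip(counts, ps):
        coef //= factorial(c)
        val *= p**c
    return coef * val

def gen_multinom_pmf(xs, a, n, d, ps):
    """Sketch of Eq. (3.1): the multinomial term renormalized over all
    count vectors summing to a + nd (an assumed support)."""
    N = a + n * d
    if sum(xs) != N:
        return 0.0
    norm = sum(multinom_term(c, ps)
               for c in product(range(N + 1), repeat=len(ps)) if sum(c) == N)
    return multinom_term(xs, ps) / norm
```

The brute-force enumeration of count vectors is exponential in k and is meant only to mirror the definition for small examples.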

Theorem 3.2. The moment generating function of the generalized Multinomial distribution is

M_{X_1, X_2, …, X_k}(t_1, t_2, …, t_k) = Σ_{x_1, …, x_k} [(a+nd)! / ∏_{i=1}^k x_i!] ∏_{i=1}^k (p_i e^{t_i})^{x_i} / Σ_{x_1, …, x_k} [(a+nd)! / ∏_{i=1}^k x_i!] ∏_{i=1}^k p_i^{x_i}   (3.3)

and the mean and the variance of X_i (i = 1, 2, …, k) respectively are

E(X_i) = μ′_{1i} = (a+nd)p_i   (3.4)
V(X_i) = (a+nd)p_i(1 − p_i)   (3.5)

Theorem 3.3. The 3rd and 4th raw and central moments of the generalized Multinomial distribution are

μ′_{3i} = (a+nd)³p_i³ − 3(a+nd)²p_i³ + 2(a+nd)p_i³ + 3(a+nd)²p_i² − 3(a+nd)p_i² + (a+nd)p_i   (3.6)
μ′_{4i} = (a+nd)⁴p_i⁴ − 6(a+nd)³p_i⁴ + 11(a+nd)²p_i⁴ − 6(a+nd)p_i⁴ + 6(a+nd)³p_i³ − 18(a+nd)²p_i³ + 12(a+nd)p_i³ + 7(a+nd)²p_i² − 7(a+nd)p_i² + (a+nd)p_i   (3.7)
μ_{3i} = (a+nd)p_i(1 − p_i)(1 − 2p_i)   (3.8)

and

μ_{4i} = (a+nd)p_i(1 − p_i)[1 + 3(a+nd − 2)p_i(1 − p_i)]   (3.9)

In all of these moments, putting a = 0 and d = 1 reduces them to the moments of the traditional Multinomial distribution, which again confirms that the traditional Multinomial distribution is a special case of our suggested generalized Multinomial distribution.
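The reduction can be checked numerically: with a = 0 and d = 1 each X_i is marginally Binomial(n, p_i), so the central-moment formulas (3.8) and (3.9) should agree with direct computation from that marginal pmf. A sketch, with an illustrative helper name:

```python
from math import comb

def marginal_central_moment(n, p, r):
    # r-th central moment of a Binomial(n, p) marginal, straight from the pmf
    mean = n * p
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) * (x - mean)**r
               for x in range(n + 1))

n, pi = 15, 0.35
qi = 1 - pi
# Eqs. (3.8) and (3.9) with a = 0, d = 1, so a + nd = n
mu3 = n * pi * qi * (1 - 2 * pi)
mu4 = n * pi * qi * (1 + 3 * (n - 2) * pi * qi)
```

Both closed forms match the brute-force sums to floating-point accuracy.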


Theorem 3.4. Karl Pearson's measures and coefficients of skewness and kurtosis of the generalized multinomial distribution are as follows.

Measure of skewness: β_{1i} = (1 − 2p_i)² / [(a+nd)p_i(1 − p_i)]   (3.10)
Coefficient of skewness: γ_{1i} = √β_{1i} = (1 − 2p_i) / √[(a+nd)p_i(1 − p_i)]   (3.11)
Measure of kurtosis: β_{2i} = 3 + [1 − 6p_i(1 − p_i)] / [(a+nd)p_i(1 − p_i)]   (3.12)
Coefficient of kurtosis:

γ_{2i} = β_{2i} − 3 = [1 − 6p_i(1 − p_i)] / [(a+nd)p_i(1 − p_i)]   (3.13)

Conclusions from the measures and coefficients of skewness and kurtosis of the generalized Multinomial distribution can be drawn in the same way as for the generalized Binomial distribution, depending on the values of p_i.

Theorem 3.5. The maximum likelihood estimator of the parameters of the generalized multinomial distribution is p̂_i = x_i/(a+nd), where x_i is the number of successes and (a+nd) − x_i is the total number of failures, subject to the condition that the maximum number of successes is a + nd.

4. Discussion and Conclusion
We have suggested a relatively more general form of two discrete distributions, the Binomial and the Multinomial, for the different sampling scheme described above, and termed them generalized distributions. It is evident from the generalized distributions that if sampling is done in the usual manner, our suggested distributions reduce to the traditional forms; thus the traditional Binomial and Multinomial distributions are special cases of our proposed generalized Binomial and Multinomial distributions. As for the traditional distributions, all the distributional properties, including limiting theorems, of the suggested distributions have been derived. The truncated cases of the traditional distributions can be addressed more accurately by our new approach. In general, the new approach provides more access and broadens the scope both from the theoretical point of view and from the standpoint of real-world problem solving. The generalized sequence of the number of successes of the proposed distributions may be considered an added advantage in distribution theory.

References

[1] P. M. E. Altham, Two Generalizations of the Binomial Distribution, J. R. Stat. Soc. Ser. C. Appl. Stat. 27 (2) 1978 162-167. [2] N. L. Johnson, A. W. Kemp and S. Kotz, Univariate Discrete Distributions. 3rd edn. (John Wiley & Sons, New Jersey, 2005). [3] M. Dwass, A Generalized Binomial Distribution, Amer. Statist. 33(2) 1979 86-87.


[4] S. R. Paul, A Three-Parameter Generalization of the Binomial Distribution, Comm. Statist. Theory Methods 14(6) 1985 1497-1506. [5] S. R. Raul, On the beta-correlated binomial (bcb) distribution-a three parameter generalization of the binomial distribution, Comm. Statist. Theory Methods 16(5) 1987 1473-1478. [6] J. Panaretos and E. Xekalaki, On Generalized Binomial and Multinomial Distributions and Their Relation to Generalized Poisson Distribution, Ann. Inst. Statist. Math. 38(2) 1986 223-231. [7] R. W. Madsen, Generalized binomial distributions, Comm. Statist. Theory Methods 22(11) 1993 3065-3086. [8] Z. Drezner and N. Farnum, A generalized binomial distribution, Comm. Statist. Theory Methods 22(11) 1993 3051-3063. [9] J. Fu and R. Sproule, A generalization of the binomial distribution, Comm. Statist. Theory Methods 24(10) 1995 2645-2658. [10] P. M. E. Altham and R. K. S. Hankin, Multivariate Generalizations of the Multiplicative Binomial Distribution: Introducing the MM Package, J. Statist. Software 46(12) 2012 1-23.

Appendix
Proof of Theorem 2.1. We know that the probability mass function of the suggested generalized binomial distribution is

P(x; a, n, d, p) = C(a+nd, x) p^x q^(a+nd−x) / Σ_{x=a}^{a+nd} C(a+nd, x) p^x q^(a+nd−x);  x = a, a+d, a+2d, …, a+nd   (2.1)

It is clear from the function that P(x; a, n, d, p) ≥ 0 for all values of X and all values of a, d, n and p.

Again, 푃 푥 푎 푛 � 푝 ≥

Σ_{x=a}^{a+nd} P(x; a, n, d, p)
= Σ_{x=a}^{a+nd} [C(a+nd, x) p^x q^(a+nd−x) / Σ_{x=a}^{a+nd} C(a+nd, x) p^x q^(a+nd−x)]
= Σ_{x=a}^{a+nd} C(a+nd, x) p^x q^(a+nd−x) / Σ_{x=a}^{a+nd} C(a+nd, x) p^x q^(a+nd−x)
= 1

As P(x; a, n, d, p) ≥ 0 and Σ_{x=a}^{a+nd} P(x; a, n, d, p) = 1, we may conclude that P(x; a, n, d, p) is a probability function.

Proof of Theorem 2.3. According to the definition of the moment generating function,

M_X(t) = E(e^{tX}) = Σ_{x=a}^{a+nd} e^{tx} P(x; a, n, d, p)
⇒ M_X(t) = (1/k) Σ_{x=a}^{a+nd} e^{tx} C(a+nd, x) p^x q^(a+nd−x)


⇒ M_X(t) = (1/k) Σ_{x=a}^{a+nd} C(a+nd, x) (pe^t)^x q^(a+nd−x)

∴ M_X(t) = Σ_{x=a}^{a+nd} C(a+nd, x) (pe^t)^x q^(a+nd−x) / Σ_{x=a}^{a+nd} C(a+nd, x) p^x q^(a+nd−x)   (2.2)

This is the moment generating function of the generalized binomial distribution.

Proof of Theorem 2.4. The mean of the distribution can be calculated from its moment generating function by differentiating with respect to t and setting t = 0. That is,

E(X) = μ′_1 = [d/dt M_X(t)]_{t=0}
= [ Σ_{x=a}^{a+nd} C(a+nd, x) x (pe^t)^{x−1} pe^t q^(a+nd−x) / Σ_{x=a}^{a+nd} C(a+nd, x) p^x q^(a+nd−x) ]_{t=0}
= [ Σ_{x=a}^{a+nd} x C(a+nd, x) (pe^t)^x q^(a+nd−x) / Σ_{x=a}^{a+nd} C(a+nd, x) p^x q^(a+nd−x) ]_{t=0}
= Σ_{x=a}^{a+nd} x C(a+nd, x) p^x q^(a+nd−x) / Σ_{x=a}^{a+nd} C(a+nd, x) p^x q^(a+nd−x)
= Σ_{x=a}^{a+nd} [(a+nd)(a+nd−1)! / {(x−1)! (a+nd−1−(x−1))!}] p · p^{x−1} q^{(a+nd−1)−(x−1)} / Σ_{x=a}^{a+nd} C(a+nd, x) p^x q^(a+nd−x)
= (a+nd)p Σ_{x−1=a−1}^{a+nd−1} C(a+nd−1, x−1) p^{x−1} q^{(a+nd−1)−(x−1)} / Σ_{x=a}^{a+nd} C(a+nd, x) p^x q^(a+nd−x)
= (a+nd)p Σ_{x=a}^{a+nd} C(a+nd, x) p^x q^(a+nd−x) / Σ_{x=a}^{a+nd} C(a+nd, x) p^x q^(a+nd−x)
= (a+nd)p

∴ E(X) = μ′_1 = (a+nd)p   (2.3)

which is the mean of the distribution. Similarly, it can be found that the variance of the distribution is (a+nd)pq. If we substitute a = 0 and d = 1 into the mean and variance of the generalized distribution, they reduce to the mean and variance of the usual distribution, which again shows that the usual distribution is a special case of the generalized distribution.


Proof of Theorem 2.7. We know that the probability mass function of a binomial distribution is itself a likelihood function. Therefore, the likelihood function of the generalized binomial distribution is

L = C(a+nd, x) p^x q^(a+nd−x) / Σ_{x=a}^{a+nd} C(a+nd, x) p^x q^(a+nd−x) = (1/k) C(a+nd, x) p^x q^(a+nd−x)   (2.12)

where x is the total number of successes out of a maximum of a + nd successes in n trials.

Taking logarithms on both sides, we have

ln L = ln[(1/k) C(a+nd, x) p^x q^(a+nd−x)]
⇒ ln L = −ln k + ln C(a+nd, x) + x ln p + (a+nd − x) ln(1 − p)   (2.13)

Differentiating equation (2.13) with respect to p and equating to zero, we get

δ ln L/δp = x/p̂ − (a+nd − x)/(1 − p̂) = 0
⇒ x(1 − p̂) − (a+nd − x)p̂ = 0
⇒ x − p̂x − (a+nd)p̂ + p̂x = 0
⇒ (a+nd)p̂ = x
⇒ p̂ = x/(a+nd)   (2.14)

Hence, the maximum likelihood estimate of the parameter p is x/(a+nd), where x is the total number of successes out of a maximum of a + nd successes in n trials.

Alternative Proof. The likelihood function of the generalized binomial distribution can also be written as

L = [C(a+d, x_1) p^{x_1} q^{a+d−x_1}][C(d, x_2) p^{x_2} q^{d−x_2}][C(d, x_3) p^{x_3} q^{d−x_3}] ⋯ [C(d, x_n) p^{x_n} q^{d−x_n}]
⇒ L = p^{Σ_{i=1}^n x_i} q^{a+nd−Σ_{i=1}^n x_i} C(a+d, x_1) C(d, x_2) C(d, x_3) ⋯ C(d, x_n)
⇒ L = C · p^{Σ_{i=1}^n x_i} (1 − p)^{a+nd−Σ_{i=1}^n x_i}   (2.15)

where C = C(a+d, x_1) C(d, x_2) C(d, x_3) ⋯ C(d, x_n) is a constant.
Taking logarithms on both sides, we have

ln L = ln[C · p^{Σ_{i=1}^n x_i} (1 − p)^{a+nd−Σ_{i=1}^n x_i}]


ln L = ln C + Σ_{i=1}^n x_i ln p + (a+nd − Σ_{i=1}^n x_i) ln(1 − p)   (2.16)

Differentiating equation (2.16) with respect to p and equating to zero, we get

δ ln L/δp = Σ_{i=1}^n x_i / p̂ − (a+nd − Σ_{i=1}^n x_i)/(1 − p̂) = 0
⇒ (1 − p̂) Σ_{i=1}^n x_i − (a+nd − Σ_{i=1}^n x_i) p̂ = 0
⇒ Σ_{i=1}^n x_i − (a+nd) p̂ = 0
⇒ (a+nd) p̂ = Σ_{i=1}^n x_i
⇒ p̂ = Σ_{i=1}^n x_i / (a+nd)   (2.17)

where Σ_{i=1}^n x_i = x is the total number of successes out of a maximum of a + nd successes in n trials.

Proof of Theorem 2.8. If the probability of success p is known from prior knowledge, or has been estimated on the basis of observation, then we can also estimate the number of trials n. Let p̂ denote the probability of success, either known or estimated. Therefore,

p̂ = Σ_{i=1}^n x_i / (a + n̂d)
⇒ a + n̂d = Σ_{i=1}^n x_i / p̂
⇒ n̂d = Σ_{i=1}^n x_i / p̂ − a
⇒ n̂ = (1/d)(Σ_{i=1}^n x_i / p̂ − a)   (2.18)

Here the value of a is the initial number of successes, in other words the number of successes in the first trial, whereas d is the difference between the numbers of successes occurring in successive trials.

Proof of Theorem 2.10. The normal distribution can be derived from the generalized binomial distribution under the following conditions:

i) the probability of success p and the probability of failure q are not too small, and
ii) n, the number of trials, is very large, i.e. n tends to infinity.

We know that the probability mass function of a generalized binomial variate X with parameters a, n, d, and p is

P(x; a, n, d, p) = C(a+nd, x) p^x q^(a+nd−x) / Σ_{x=a}^{a+nd} C(a+nd, x) p^x q^(a+nd−x);  x = a, a+d, a+2d, …, a+nd


= [(a+nd)! / (x! (a+nd−x)!)] p^x q^(a+nd−x) / Σ_{x=a}^{a+nd} [(a+nd)! / (x! (a+nd−x)!)] p^x q^(a+nd−x)   (2.19)

Now, let us consider the standard normal variate

z = (x − E(X)) / √var(X) = (x − (a+nd)p) / √((a+nd)pq)   (2.20)

When x = a, z = (a − (a+nd)p) / √((a+nd)pq), and when x = a+nd, z = ((a+nd) − (a+nd)p) / √((a+nd)pq) = (a+nd)q / √((a+nd)pq) = √((a+nd)q/p).

Thus, in the limit as n → ∞, Z takes values from −∞ to ∞. Hence the distribution of Z will be a continuous distribution over (−∞, ∞) with mean zero and variance unity. Now, under conditions (i) and (ii), we shall find the limiting form of (2.19) by applying Stirling's approximation for factorials. For large n, equation (2.19) can be written as

lim_{n→∞} P(x; a, n, d, p) = lim_{n→∞} [(a+nd)! / (x! (a+nd−x)!)] p^x q^(a+nd−x) / Σ_{x=a}^{a+nd} [(a+nd)! / (x! (a+nd−x)!)] p^x q^(a+nd−x)   (2.21)

Let us first work with the numerator only.

lim_{n→∞} [(a+nd)! / (x! (a+nd−x)!)] p^x q^(a+nd−x)
= lim_{n→∞} √(2π) e^{−(a+nd)} (a+nd)^{(a+nd)+1/2} p^x q^(a+nd−x) / [√(2π) e^{−x} x^{x+1/2} · √(2π) e^{−(a+nd−x)} (a+nd−x)^{(a+nd−x)+1/2}]
= lim_{n→∞} (a+nd)^{(a+nd)+1/2} p^x q^(a+nd−x) / [√(2π) x^{x+1/2} (a+nd−x)^{(a+nd−x)+1/2}]
= lim_{n→∞} {(a+nd)p}^{x+1/2} {(a+nd)q}^{(a+nd−x)+1/2} / [√(2π(a+nd)pq) x^{x+1/2} (a+nd−x)^{(a+nd−x)+1/2}]
= lim_{n→∞} [1/√(2π(a+nd)pq)] [(a+nd)p/x]^{x+1/2} [(a+nd)q/(a+nd−x)]^{(a+nd−x)+1/2}   (2.22)

We have from equation (2.20)

x = (a+nd)p + z√((a+nd)pq)


⇒ x/((a+nd)p) = 1 + z√(q/((a+nd)p))

and

(a+nd−x)/((a+nd)q) = 1 − z√(p/((a+nd)q))

Again,
dx = √((a+nd)pq) dz

From equation (2.22) we have

lim_{n→∞} [(a+nd)! / (x! (a+nd−x)!)] p^x q^(a+nd−x)
= lim_{n→∞} [1/√(2π(a+nd)pq)] [(a+nd)p/x]^{x+1/2} [(a+nd)q/(a+nd−x)]^{(a+nd−x)+1/2}
= lim_{n→∞} [1/√(2π(a+nd)pq)] {1 + z√(q/((a+nd)p))}^{−[(a+nd)p+z√((a+nd)pq)+1/2]} {1 − z√(p/((a+nd)q))}^{−[(a+nd)q−z√((a+nd)pq)+1/2]}
= lim_{n→∞} [1/√(2π(a+nd)pq)] (1/N)   (2.23)

where
N = {1 + z√(q/((a+nd)p))}^{(a+nd)p+z√((a+nd)pq)+1/2} {1 − z√(p/((a+nd)q))}^{(a+nd)q−z√((a+nd)pq)+1/2}

⇒ log N = [(a+nd)p + z√((a+nd)pq) + 1/2] log{1 + z√(q/((a+nd)p))} + [(a+nd)q − z√((a+nd)pq) + 1/2] log{1 − z√(p/((a+nd)q))}


⇒ log N = [(a+nd)p + z√((a+nd)pq) + 1/2][ z√(q/((a+nd)p)) − z²q/(2(a+nd)p) + z³q^{3/2}/(3{(a+nd)p}^{3/2}) − ⋯ ]
+ [(a+nd)q − z√((a+nd)pq) + 1/2][ −z√(p/((a+nd)q)) − z²p/(2(a+nd)q) − z³p^{3/2}/(3{(a+nd)q}^{3/2}) − ⋯ ]

⇒ log N = [z√((a+nd)pq) − z²q/2 + z²q + O(1/√(a+nd))] + [−z√((a+nd)pq) − z²p/2 + z²p + O(1/√(a+nd))]

⇒ log N = z²(p + q)/2 + O(1/√(a+nd)) = z²/2 + O(1/√(a+nd))

Hence, as n → ∞, log N = z²/2, i.e. N = e^{z²/2}. Now, substituting the value of N into equation (2.23), we have

lim_{n→∞} [(a+nd)! / (x! (a+nd−x)!)] p^x q^(a+nd−x) = [1/√(2π(a+nd)pq)] e^{−z²/2}

Putting this value into equation (2.21), and noting that with dx = √((a+nd)pq) dz the sum in the denominator tends to the corresponding integral, we have

lim_{n→∞} P(x; a, n, d, p) = f(z) = [e^{−z²/2}/√(2π)] / ∫_{−∞}^{∞} [e^{−z²/2}/√(2π)] dz = (1/√(2π)) e^{−z²/2};  −∞ < z < ∞

which is the probability density function of the standard normal variate Z.

If we consider the transformation z = (x − (a+nd)p)/√((a+nd)pq), then the probability density function of X will be

f(x) = [1/√(2π(a+nd)pq)] exp{−(1/2)[(x − (a+nd)p)/√((a+nd)pq)]²};  −∞ < x < ∞   (2.24)

If we denote the mean of X by μ and the variance of X by σ², the probability density function becomes

f(x) = [1/(σ√(2π))] e^{−(1/2)((x−μ)/σ)²};  −∞ < x < ∞


This completes the proof of the theorem.

Proof of Theorem 3.1. The probability mass function of the generalized Multinomial distribution is

P(x_1, x_2, …, x_k; a, n, d, p_1, p_2, …, p_k) = [(a+nd)! / (x_1! x_2! ⋯ x_k!)] p_1^{x_1} p_2^{x_2} ⋯ p_k^{x_k} / Σ_{x_1, x_2, …, x_k} [(a+nd)! / (x_1! x_2! ⋯ x_k!)] p_1^{x_1} p_2^{x_2} ⋯ p_k^{x_k}

Considering k = 2, so that x_1 + x_2 = a + nd and p_1 + p_2 = 1, we obtain

P(x_1, x_2; a, n, d, p_1, p_2) = [(a+nd)! / (x_1! x_2!)] p_1^{x_1} p_2^{x_2} / Σ_{x_1, x_2} [(a+nd)! / (x_1! x_2!)] p_1^{x_1} p_2^{x_2}
⇒ P(x_1; a, n, d, p_1, p_2) = [(a+nd)! / (x_1! (a+nd−x_1)!)] p_1^{x_1} p_2^{a+nd−x_1} / Σ_{x_1=a}^{a+nd} [(a+nd)! / (x_1! (a+nd−x_1)!)] p_1^{x_1} p_2^{a+nd−x_1}

Letting x_1 = x, p_1 = p and p_2 = 1 − p_1 = 1 − p = q, we have

P(x; a, n, d, p) = [(a+nd)! / (x! (a+nd−x)!)] p^x q^(a+nd−x) / Σ_{x=a}^{a+nd} [(a+nd)! / (x! (a+nd−x)!)] p^x q^(a+nd−x)
⇒ P(x; a, n, d, p) = C(a+nd, x) p^x q^(a+nd−x) / Σ_{x=a}^{a+nd} C(a+nd, x) p^x q^(a+nd−x)

Hence,

P(x; a, n, d, p) = C(a+nd, x) p^x q^(a+nd−x) / Σ_{x=a}^{a+nd} C(a+nd, x) p^x q^(a+nd−x);  x = a, a+d, a+2d, …, a+nd   (3.2)

which is the probability mass function of the generalized Binomial distribution.
