Testing the Ratio of Two Poisson Rates


Testing the Ratio of Two Poisson Rates

Kangxia Gu¹, Hon Keung Tony Ng*¹, Man Lai Tang², and William R. Schucany¹

¹ Department of Statistical Science, Southern Methodist University, Dallas, Texas 75275-0332, USA
² Department of Mathematics, Hong Kong Baptist University, Kowloon Tong, Kowloon, Hong Kong

* Corresponding author: e-mail: [email protected], Phone: +1-214-768-2465, Fax: +1-214-768-4035

Summary

In this paper we compare the properties of four general approaches for testing the ratio of two Poisson rates: asymptotically normal tests, tests based on approximate p-values, exact conditional tests, and a likelihood ratio test. The properties and power performance of these tests are studied by a Monte Carlo simulation experiment. Sample size calculation formulae are given for each of the test procedures, and their validity is studied. Some recommendations favoring the likelihood ratio and certain asymptotic tests are based on these simulation results. Finally, all of the test procedures are illustrated with two real-life medical examples.

Key words: Asymptotic tests, Conditional test, Constrained maximum likelihood estimation, Level of significance, Mid-p, Monte Carlo simulation, Power.

1 Introduction

The Poisson distribution, named after the French mathematician Siméon Denis Poisson, who first formulated it (Poisson, 1837), is a mathematical rule that assigns probabilities to numbers of occurrences. It is most commonly used to model the number of random occurrences of some phenomenon in a specified space or time interval. The Poisson rate is often of interest because it pertains to a unit of time or space. For example, the incidence rate of a disease (Rothman and Greenland, 1998) is defined as the number of events observed divided by the time at risk during the observation period. Biological, epidemiological, and medical research can produce data that follow a Poisson distribution. It is sometimes called the law of rare events, for it describes the distribution of counts of such events. For example, the number of cases of a rare disease or the frequency of single gene mutations may be modeled by a Poisson distribution.

In two-sample situations, researchers are likely to be interested in testing the difference or ratio of the Poisson means, and the comparison of Poisson rates from two independent samples is clearly of medical interest. For instance, Stampfer and Willett (1985) conducted a prospective study to examine the relationship between post-menopausal hormone use and coronary heart disease (CHD). With post-menopausal hormone use, there were 30 CHD cases in 54,308.7 person-years; without post-menopausal hormone use, there were 60 CHD cases in 51,477.5 person-years. The problem of interest is to test whether the group using post-menopausal hormones has a lower rate of coronary heart disease. Another example in epidemiology is a breast cancer study reported in Rothman and Greenland (1998) and Graham, Mengersen, and Morton (2003). Two groups of women were compared to determine whether those who had been examined using x-ray fluoroscopy during treatment for tuberculosis had a higher rate of breast cancer than those who had not. Forty-one cases of breast cancer in 28,010 person-years at risk were reported in the treatment group (women receiving x-ray fluoroscopy), and 15 cases of breast cancer in 19,017 person-years at risk in the control group (women not receiving x-ray fluoroscopy).
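Both examples reduce to comparing two incidence rates, each estimated as an event count divided by the person-years at risk. As a quick illustration (our own sketch, not part of the original paper; the function and variable names are ours), the observed rates and rate ratios can be computed as follows:

```python
# Observed incidence rates and their ratio for the two motivating examples.
# An incidence rate is the event count divided by the person-years at risk.

def rate_ratio(x0, t0, x1, t1):
    """Return the two estimated rates and their ratio gamma0_hat / gamma1_hat."""
    rate0, rate1 = x0 / t0, x1 / t1
    return rate0, rate1, rate0 / rate1

# CHD example (Stampfer and Willett, 1985): hormone use vs. no hormone use
print(rate_ratio(30, 54308.7, 60, 51477.5))   # ratio approx. 0.47

# Breast cancer example: fluoroscopy group vs. control group
print(rate_ratio(41, 28010.0, 15, 19017.0))   # ratio approx. 1.86
```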
The problem of comparing two Poisson rates has been studied for a long time; however, most of the early studies focused on the situation of equal time frames. Przyborowski and Wilenski (1939) first proposed a conditional test for the unity of the ratio of two Poisson rates based on a conditional binomial distribution. Subsequently, Chapman (1952), Birnbaum (1953), Brownlee (1967), and Gail (1974) studied the properties of, and some alternatives to, tests and confidence intervals based on the conditional approach. On the other hand, Hald (1960), Ractliffe (1964), Cox and Lewis (1966), Haight (1967), Detre and White (1970), and Sichel (1973) investigated asymptotic tests based on normal approximations for the equality of the two Poisson rates. Development for unequal sampling frames has received more attention recently. Shiue and Bain (1982) derived a uniformly most powerful unbiased (UMPU) test for the equality of two Poisson rates and showed that a test based on the normal approximation to the binomial distribution is nearly as powerful as the UMPU test. Huffman (1984) proposed an improved asymptotic test statistic, which accelerated the rate of convergence to normality by a variance-stabilizing transformation. Along the same lines, Thode (1997) considered an alternative test statistic that is more powerful than the Shiue and Bain statistic for large Poisson rates. Recently, Ng and Tang (2005) provided systematic comparisons among these tests and presented the associated sample size formulae. All of these studies are devoted only to testing the equality of the two Poisson rates (or the unity of the ratio). Krishnamoorthy and Thomson (2004) considered tests based on estimated p-values and showed that these have uniformly better power than the conditional test.
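For context on the conditional approach mentioned above, the following standard identity (our addition, stated in the notation introduced in Section 2 below) shows why conditioning on the total count reduces the problem to an exact binomial test:

```latex
% Given the total count, X_0 follows a binomial distribution:
\[
  X_0 \mid (X_0 + X_1 = n) \sim \mathrm{Binomial}(n, p),
  \qquad
  p = \frac{t_0 \gamma_0}{t_0 \gamma_0 + t_1 \gamma_1}.
\]
% Under H_0: \gamma_0/\gamma_1 = R, the success probability is fully specified,
\[
  p_0 = \frac{R t_0}{R t_0 + t_1} = \frac{\rho}{1 + \rho},
  \qquad \rho = R t_0 / t_1,
\]
% so the conditional test of Przyborowski and Wilenski (1939) is an exact
% one-sided binomial test of p = p_0 against p > p_0.
```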
Although many methods have been proposed to test the ratio of two Poisson rates, it is not clear which techniques are more appropriate in a given setting, or how their performances might vary. Moreover, most of the existing work focuses on testing the unity of the ratio. In the present article, we present an extensive and systematic comparative study of different test procedures for testing a non-unity rate ratio of two Poisson processes under unequal sampling frames. In Section 2, we review the problem of testing a general non-unity ratio of two Poisson rates over unequal-size sampling frames. Asymptotic tests based on normal approximations, tests based on approximate p-value methods, an exact conditional test, a mid-p conditional test, and a likelihood ratio test are considered. In Section 3, we present the formulae for sample size calculation for each of the test procedures. In Section 4, Monte Carlo simulations are used to study the properties of the different test procedures, as well as the validity of the sample size calculation formulae. Two real-life examples are used to illustrate the test procedures in Section 5. Finally, a discussion of the results with some recommendations is presented in Section 6.

2 Test Procedures

2.1 Asymptotic tests based on normal approximations

For fixed sampling frames $t_0$ and $t_1$, two independent Poisson processes (with parameters $\lambda_0$ and $\lambda_1$) are observed. Let $X_0$ and $X_1$ be the corresponding numbers of outcomes, i.e., $X_i \sim \mathrm{Poisson}(\lambda_i)$ with $\lambda_i = t_i \gamma_i$ for $i = 0, 1$. Here $\gamma_i$, $i = 0, 1$, are the Poisson rates. We denote the observed values of $X_0$ and $X_1$ by $x_0$ and $x_1$, respectively. We are interested in testing whether the ratio of the two Poisson rates equals a pre-specified positive number $R$ versus exceeds $R$; that is, we test the one-sided hypotheses
\[
  H_0: \gamma_0/\gamma_1 = R \quad \text{against} \quad H_1: \gamma_0/\gamma_1 > R. \tag{1}
\]
When $R = 1$, this is equivalent to testing the equality of the two Poisson rates, i.e., $\gamma_0 = \gamma_1$. The properties of different test procedures for $R = 1$ under unequal sampling frames, i.e., $t_0 \neq t_1$, were investigated in Ng and Tang (2005). Since $\gamma_0$ and $\gamma_1$ are unknown parameters, two sample estimates for $\gamma_i$ are considered, as follows.

• Unconstrained maximum likelihood estimate (MLE). The maximum likelihood estimate of $\gamma_i$ is given by
\[
  \hat{\gamma}_i = \frac{X_i}{t_i}, \qquad i = 0, 1. \tag{2}
\]

• Constrained maximum likelihood estimate (CMLE). Under the null hypothesis $H_0: \gamma_0/\gamma_1 = R$, the CMLEs of $\gamma_0$ and $\gamma_1$ can be shown to be (see Appendix A)
\[
  \tilde{\gamma}_0 = \frac{X_0 + X_1}{t_0 (1 + 1/\rho)}
  \quad \text{and} \quad
  \tilde{\gamma}_1 = \frac{X_0 + X_1}{t_1 (1 + \rho)}, \tag{3}
\]
where $d = t_1/t_0$ and $\rho = R/d$. Note that the CMLEs of $\gamma_0$ and $\gamma_1$ coincide with the method-of-moments estimators under the null hypothesis $H_0: \gamma_0/\gamma_1 = R$. Specifically, $E(X_0 + X_1) = t_0 \gamma_0 + t_1 \gamma_1$ together with $\gamma_0/\gamma_1 = R$ yields the estimators in (3).

Because we can re-express the null hypothesis in (1) as $H_0: \gamma_0 - R\gamma_1 = 0$, we can develop test statistics based on the statistic $W = \hat{\gamma}_0 - R\hat{\gamma}_1$. The variance of $W$ is given by
\[
  \sigma_W^2 = \frac{\gamma_0}{t_0} + \frac{R^2 \gamma_1}{t_1},
\]
and it can be estimated by
\[
  s_W^2 = \frac{\gamma_0^*}{t_0} + \frac{R^2 \gamma_1^*}{t_1},
\]
where $\gamma_i^*$ is any reasonable estimate of $\gamma_i$, $i = 0, 1$. Hence, we consider the test statistic $W/s_W$ for $H_0$. With the MLEs $\gamma_i^* = \hat{\gamma}_i$, $i = 0, 1$, we get
\[
  W_1(X_0, X_1)
  = \frac{X_0/t_0 - R X_1/t_1}{\sqrt{X_0/t_0^2 + X_1 R^2/t_1^2}}
  = \frac{X_0 - X_1 \rho}{\sqrt{X_0 + X_1 \rho^2}}.
\]
Similarly, with the CMLEs $\gamma_i^* = \tilde{\gamma}_i$, $i = 0, 1$, we obtain
\[
  W_2(X_0, X_1)
  = \frac{X_0/t_0 - R X_1/t_1}{\sqrt{(X_0 + X_1)(R^2/d + R)/[t_0^2 (R + d)]}}
  = \frac{X_0 - X_1 \rho}{\sqrt{(X_0 + X_1)\rho}}.
\]
Note that when $\rho = 1$, $W_1$ is equivalent to $W_2$, and this is the test statistic studied by Shiue and Bain (1982) and Thode (1997). On the other hand, testing the null hypothesis in (1) is equivalent to testing $\ln(\gamma_0/\gamma_1) - \ln(R) = 0$. Hence, we consider the statistic
\[
  U = \ln(\hat{\gamma}_0/\hat{\gamma}_1) - \ln(R) = \ln(X_0/t_0) - \ln(X_1/t_1) - \ln(R).
\]
By the delta method, the variance of $U$ can be approximated by $\sigma_U^2 = 1/(t_0\gamma_0) + 1/(t_1\gamma_1)$, and it can be estimated by $s_U^2 = 1/(t_0\gamma_0^*) + 1/(t_1\gamma_1^*)$, where $\gamma_i^*$ is any reasonable estimate of $\gamma_i$, $i = 0, 1$.
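To illustrate, here is a minimal sketch in Python (our own code, not from the paper) that evaluates $W_1$, $W_2$, and the studentized log-ratio statistic $U/s_U$ with the MLEs plugged in (so that $s_U^2 = 1/X_0 + 1/X_1$), together with one-sided p-values from the standard normal approximation, for the breast cancer example with $R = 1$:

```python
from math import log, sqrt

from scipy.stats import norm  # standard normal survival function for p-values

def poisson_ratio_tests(x0, t0, x1, t1, R=1.0):
    """One-sided tests of H0: gamma0/gamma1 = R vs H1: gamma0/gamma1 > R."""
    d = t1 / t0
    rho = R / d
    # W1: variance of W estimated with the unconstrained MLEs
    w1 = (x0 - x1 * rho) / sqrt(x0 + x1 * rho**2)
    # W2: variance of W estimated with the constrained MLEs (CMLEs)
    w2 = (x0 - x1 * rho) / sqrt((x0 + x1) * rho)
    # Log-ratio statistic U/s_U with the MLE-based variance estimate
    u = (log(x0 / t0) - log(x1 / t1) - log(R)) / sqrt(1 / x0 + 1 / x1)
    return {name: (z, norm.sf(z)) for name, z in
            [("W1", w1), ("W2", w2), ("U/sU", u)]}

# Breast cancer example: 41 cases in 28,010 person-years (fluoroscopy group)
# vs. 15 cases in 19,017 person-years (control group), testing R = 1.
for name, (z, p) in poisson_ratio_tests(41, 28010.0, 15, 19017.0).items():
    print(f"{name}: z = {z:.3f}, one-sided p = {p:.4f}")
```

All three statistics are referred to the standard normal distribution; for these data they give z-values near 2.1 and one-sided p-values near 0.02, pointing to a higher breast cancer rate in the fluoroscopy group.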