Moments and the Moment Generating Function


Math 217 Probability and Statistics
Prof. D. Joyce, Fall 2014

There are various reasons for studying moments and moment generating functions. One of them is that the moment generating function can be used to prove the central limit theorem.

Moments, central moments, skewness, and kurtosis. The $k$th moment of a random variable $X$ is defined as $\mu_k = E(X^k)$. Thus, the mean is the first moment, $\mu = \mu_1$, and the variance can be found from the first and second moments, $\sigma^2 = \mu_2 - \mu_1^2$.

The $k$th central moment is defined as $E((X-\mu)^k)$. Thus, the variance is the second central moment.

The higher moments have more obscure meanings as $k$ grows.

The third central moment of the standardized random variable $X^* = (X-\mu)/\sigma$,
$$\beta_3 = E((X^*)^3) = \frac{E((X-\mu)^3)}{\sigma^3},$$
is called the skewness of $X$. A distribution that's symmetric about its mean has skewness 0. (In fact, all the odd central moments are 0 for a symmetric distribution.) But if it has a long tail to the right and a short one to the left, then it has positive skewness, and negative skewness in the opposite situation.

The fourth central moment of $X^*$,
$$\beta_4 = E((X^*)^4) = \frac{E((X-\mu)^4)}{\sigma^4},$$
is called the kurtosis. A fairly flat distribution with long tails has a high kurtosis, while a short-tailed distribution has a low kurtosis. A bimodal distribution has a very high kurtosis. A normal distribution has a kurtosis of 3. (The word kurtosis was made up in the early 19th century from the Greek word for curvature.) Kurtosis is not a particularly important concept, but I mention it here for completeness.

It turns out that when the moment generating function exists on an interval around 0, the whole distribution of $X$ is determined by its moments; that is, two different such distributions can't have identical moments. That's what makes moments important.

The moment generating function. There is a clever way of organizing all the moments into one mathematical object, and that object is called the moment generating function. It's a function $m(t)$ of a new variable $t$ defined by
$$m(t) = E(e^{tX}).$$
Since the exponential function $e^t$ has the power series
$$e^t = \sum_{k=0}^\infty \frac{t^k}{k!} = 1 + t + \frac{t^2}{2!} + \cdots + \frac{t^k}{k!} + \cdots,$$
we can rewrite $m(t)$ as follows:
$$\begin{aligned}
m(t) &= E(e^{tX}) \\
&= E\left(1 + tX + \frac{(tX)^2}{2!} + \cdots + \frac{(tX)^k}{k!} + \cdots\right) \\
&= 1 + tE(X) + \frac{t^2 E(X^2)}{2!} + \cdots + \frac{t^k E(X^k)}{k!} + \cdots \\
&= 1 + \mu_1 t + \frac{\mu_2}{2!} t^2 + \cdots + \frac{\mu_k}{k!} t^k + \cdots
\end{aligned}$$

Theorem. The $k$th derivative of $m(t)$ evaluated at $t = 0$ is the $k$th moment $\mu_k$ of $X$.

In other words, the moment generating function generates the moments of $X$ by differentiation.

The primary use of moment generating functions is to develop the theory of probability. For instance, the easiest way to prove the central limit theorem is to use moment generating functions.

For discrete distributions, we can also compute the moment generating function directly in terms of the probability mass function $f(x) = P(X = x)$ as
$$m(t) = E(e^{tX}) = \sum_x e^{tx} f(x).$$
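To see the theorem in action, here is a small sketch (not part of the original notes; it assumes the sympy library) that builds $m(t)$ for a fair six-sided die from the formula above and checks that derivatives at $t = 0$ reproduce the moments:

```python
import sympy as sp

t = sp.symbols('t')

# A fair six-sided die: f(x) = 1/6 for x = 1, ..., 6.
values = range(1, 7)
f = sp.Rational(1, 6)

# m(t) = E(e^{tX}) = sum over x of e^{tx} f(x)
m = sum(sp.exp(t * x) * f for x in values)

# The k-th derivative of m at t = 0 should equal the k-th moment E(X^k).
for k in range(1, 4):
    mgf_moment = sp.diff(m, t, k).subs(t, 0)
    direct_moment = sum(x**k * f for x in values)
    print(k, mgf_moment, direct_moment, mgf_moment == direct_moment)
```

Running this prints matching values $\mu_1 = 7/2$, $\mu_2 = 91/6$, $\mu_3 = 147/2$ from both computations.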
For continuous distributions, the moment generating function can be expressed in terms of the probability density function $f_X$ as
$$m(t) = E(e^{tX}) = \int_{-\infty}^\infty e^{tx} f_X(x)\,dx.$$

Properties of moment generating functions.

Translation. If $Y = X + a$, then
$$m_Y(t) = e^{at} m_X(t).$$
Proof: $m_Y(t) = E(e^{Yt}) = E(e^{(X+a)t}) = E(e^{Xt}e^{at}) = e^{at}E(e^{Xt}) = e^{at}m_X(t)$. q.e.d.

Scaling. If $Y = bX$, then
$$m_Y(t) = m_X(bt).$$
Proof: $m_Y(t) = E(e^{Yt}) = E(e^{(bX)t}) = E(e^{X(bt)}) = m_X(bt)$. q.e.d.

Standardizing. From the last two properties, if
$$X^* = \frac{X - \mu}{\sigma}$$
is the standardized random variable for $X$, then
$$m_{X^*}(t) = e^{-\mu t/\sigma}\, m_X(t/\sigma).$$
Proof: First translate by $-\mu$ to get
$$m_{X-\mu}(t) = e^{-\mu t} m_X(t).$$
Then scale that by a factor of $1/\sigma$ to get
$$m_{(X-\mu)/\sigma}(t) = m_{X-\mu}(t/\sigma) = e^{-\mu t/\sigma} m_X(t/\sigma).$$
q.e.d.

Convolution. If $X$ and $Y$ are independent random variables and $Z = X + Y$, then
$$m_Z(t) = m_X(t)\, m_Y(t).$$
Proof: $m_Z(t) = E(e^{Zt}) = E(e^{(X+Y)t}) = E(e^{Xt}e^{Yt})$. Now, since $X$ and $Y$ are independent, so are $e^{Xt}$ and $e^{Yt}$. Therefore, $E(e^{Xt}e^{Yt}) = E(e^{Xt})E(e^{Yt}) = m_X(t)m_Y(t)$. q.e.d.

Note that this convolution property implies that for a sample sum $S_n = X_1 + X_2 + \cdots + X_n$ of $n$ independent random variables, each with the same distribution as $X$, the moment generating function is
$$m_{S_n}(t) = (m_X(t))^n.$$
We can couple that with the standardizing property to determine the moment generating function for the standardized sum
$$S_n^* = \frac{S_n - n\mu}{\sigma\sqrt{n}}.$$
Since the mean of $S_n$ is $n\mu$ and its standard deviation is $\sigma\sqrt{n}$, when it's standardized we get
$$m_{S_n^*}(t) = e^{-n\mu t/(\sigma\sqrt{n})}\, m_{S_n}\!\left(\frac{t}{\sigma\sqrt{n}}\right) = e^{-\sqrt{n}\,\mu t/\sigma}\, m_{S_n}\!\left(\frac{t}{\sigma\sqrt{n}}\right) = e^{-\sqrt{n}\,\mu t/\sigma} \left(m_X\!\left(\frac{t}{\sigma\sqrt{n}}\right)\right)^{\!n}.$$
We'll use this result when we prove the central limit theorem.

Some examples. The same definitions apply to discrete distributions and continuous ones. That is, if $X$ is any random variable, then its $n$th moment is
$$\mu_n = E(X^n)$$
and its moment generating function is
$$m_X(t) = E(e^{tX}) = \sum_{k=0}^\infty \frac{\mu_k}{k!}\, t^k.$$
The only difference is that when you compute them, you use sums for discrete distributions but integrals for continuous ones. That's because expectation is defined in terms of sums or integrals in the two cases. Thus, in the continuous case
$$\mu_n = E(X^n) = \int_{-\infty}^\infty x^n f_X(x)\,dx$$
and
$$m(t) = E(e^{tX}) = \int_{-\infty}^\infty e^{tx} f_X(x)\,dx.$$
We'll look at one example of a moment generating function for a discrete distribution and three for continuous distributions.

The moment generating function for a geometric distribution. Recall that when independent Bernoulli trials are repeated, each with probability $p$ of success, the time $X$ it takes to get the first success has a geometric distribution:
$$P(X = j) = q^{j-1} p, \quad \text{for } j = 1, 2, \ldots,$$
where $q = 1 - p$. Let's compute the generating function for the geometric distribution:
$$m(t) = \sum_{j=1}^\infty e^{tj} q^{j-1} p = \frac{p}{q} \sum_{j=1}^\infty e^{tj} q^j.$$
The series $\displaystyle\sum_{j=1}^\infty e^{tj} q^j = \sum_{j=1}^\infty (e^t q)^j$ is a geometric series with sum $\dfrac{e^t q}{1 - e^t q}$ (for $e^t q < 1$). Therefore,
$$m(t) = \frac{p}{q} \cdot \frac{e^t q}{1 - e^t q} = \frac{p e^t}{1 - q e^t}.$$
From this generating function, we can find the moments. For instance, $\mu_1 = m'(0)$. The derivative of $m$ is
$$m'(t) = \frac{p e^t}{(1 - q e^t)^2},$$
so $\mu_1 = m'(0) = \dfrac{p}{(1-q)^2} = \dfrac{1}{p}$. This agrees with what we already know, that the mean of the geometric distribution is $1/p$.

The moment generating function for a uniform distribution on $[0,1]$. Let $X$ be uniform on $[0,1]$, so that the probability density function $f_X$ has the value 1 on $[0,1]$ and 0 outside this interval.

Let's first compute the moments:
$$\mu_n = E(X^n) = \int_{-\infty}^\infty x^n f_X(x)\,dx = \int_0^1 x^n\,dx = \left.\frac{x^{n+1}}{n+1}\right|_0^1 = \frac{1}{n+1}.$$
Next, let's compute the moment generating function:
$$m(t) = \int_{-\infty}^\infty e^{tx} f_X(x)\,dx = \int_0^1 e^{tx}\,dx = \left.\frac{e^{tx}}{t}\right|_0^1 = \frac{e^t - 1}{t}.$$
Note that this expression for $m(t)$ does not allow $t = 0$, since there is a $t$ in the denominator. Still, $m(0)$ can be evaluated by using power series or L'Hôpital's rule.
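To illustrate that last remark, here is a sketch (my addition, assuming sympy) that takes the $t \to 0$ limits symbolically: both $m(0) = 1$ and the moments $\mu_n = 1/(n+1)$ computed above are recovered even though the closed form is undefined at $t = 0$.

```python
import sympy as sp

t = sp.symbols('t')

# MGF of the uniform distribution on [0, 1], from the integral above.
m = (sp.exp(t) - 1) / t

# The formula is undefined at t = 0, but the limit exists (every MGF has m(0) = 1).
print(sp.limit(m, t, 0))                        # 1

# Derivatives at 0 also have to be taken as limits; they give mu_n = 1/(n + 1).
for n in range(1, 4):
    print(n, sp.limit(sp.diff(m, t, n), t, 0))  # 1/2, 1/3, 1/4
```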
The moment generating function for an exponential distribution with parameter $\lambda$. Recall that when events occur in a Poisson process uniformly at random over time at a rate of $\lambda$ events per unit time, the random variable $X$ giving the time to the first event has an exponential distribution. The density function for $X$ is $f_X(x) = \lambda e^{-\lambda x}$ for $x \in [0, \infty)$.

Let's compute its moment generating function:
$$m(t) = \int_{-\infty}^\infty e^{tx} f_X(x)\,dx = \int_0^\infty e^{tx} \lambda e^{-\lambda x}\,dx = \lambda \int_0^\infty e^{(t-\lambda)x}\,dx = \left.\lambda\,\frac{e^{(t-\lambda)x}}{t-\lambda}\right|_0^\infty = \lim_{x\to\infty} \frac{\lambda e^{(t-\lambda)x}}{t-\lambda} - \frac{\lambda e^0}{t-\lambda}.$$
Now if $t < \lambda$, then the limit in the last line is 0, so in that case
$$m(t) = \frac{\lambda}{\lambda - t}.$$
This is a minor yet important point: the moment generating function doesn't have to be defined for all $t$. We only need it to be defined for $t$ near 0, because we're only interested in its derivatives evaluated at 0.

The moment generating function for a normal distribution. For the standard normal distribution $Z$, the odd moments are all 0 (the density is symmetric about 0) and the even moments are $\mu_{2m} = \dfrac{(2m)!}{2^m\,m!}$. From these values of all the moments, we can compute the moment generating function:
$$m(t) = \sum_{n=0}^\infty \frac{\mu_n}{n!}\, t^n = \sum_{m=0}^\infty \frac{\mu_{2m}}{(2m)!}\, t^{2m} = \sum_{m=0}^\infty \frac{(2m)!}{2^m\,m!} \cdot \frac{t^{2m}}{(2m)!} = \sum_{m=0}^\infty \frac{1}{2^m\,m!}\, t^{2m} = e^{t^2/2}.$$
Thus, the moment generating function for the standard normal distribution $Z$ is
$$m_Z(t) = e^{t^2/2}.$$
More generally, if $X = \sigma Z + \mu$ is a normal distribution with mean $\mu$ and variance $\sigma^2$, then by the translation and scaling properties the moment generating function is
$$m_X(t) = e^{\mu t}\, m_Z(\sigma t) = e^{\mu t + \sigma^2 t^2/2}.$$
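As a final sanity check (a Monte Carlo sketch, not part of the original notes; the sample size and the test point $t = 0.5$ are arbitrary choices), we can compare the empirical mean of $e^{tX}$ over a large sample against the closed forms just derived for the standard normal and exponential distributions:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
t = 0.5

# Standard normal: empirical E(e^{tZ}) vs. the closed form e^{t^2/2}.
z = rng.standard_normal(n)
print(np.mean(np.exp(t * z)), np.exp(t**2 / 2))   # both ~ 1.1331

# Exponential with rate lam: empirical E(e^{tX}) vs. lam/(lam - t), valid for t < lam.
lam = 2.0
x = rng.exponential(scale=1/lam, size=n)
print(np.mean(np.exp(t * x)), lam / (lam - t))    # both ~ 1.3333
```

With a million samples the empirical and exact values agree to about three decimal places, and choosing $t \geq \lambda$ in the exponential case would make the empirical average blow up, matching the convergence condition above.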