Lecture 12: Central Limit Theorem and CDFs



Statistics 104 (Colin Rundel), February 27, 2012

Moments

Raw moment: $\mu'_n = E(X^n)$

Central moment: $\mu_n = E[(X - \mu)^n]$

Normalized / standardized moment: $\mu_n / \sigma^n$

Moment Generating Function

The moment generating function of a random variable $X$ is defined for all real values of $t$ by

$$M_X(t) = E[e^{tX}] = \begin{cases} \sum_x e^{tx}\, P(X = x) & \text{if } X \text{ is discrete} \\ \int_x e^{tx}\, f(x)\, dx & \text{if } X \text{ is continuous} \end{cases}$$

This is called the moment generating function because we can obtain the raw moments of $X$ by successively differentiating $M_X(t)$ and evaluating at $t = 0$:

$$M_X(0) = E[e^{0}] = 1 = \mu'_0$$

$$M'_X(t) = \frac{d}{dt} E[e^{tX}] = E\left[\frac{d}{dt} e^{tX}\right] = E[X e^{tX}], \qquad M'_X(0) = E[X e^{0}] = E[X] = \mu'_1$$

$$M''_X(t) = \frac{d}{dt} M'_X(t) = \frac{d}{dt} E[X e^{tX}] = E\left[\frac{d}{dt}\left(X e^{tX}\right)\right] = E[X^2 e^{tX}], \qquad M''_X(0) = E[X^2 e^{0}] = E[X^2] = \mu'_2$$

Moment Generating Function - Properties

If $X$ and $Y$ are independent random variables, then the moment generating function for the distribution of $X + Y$ is

$$M_{X+Y}(t) = E[e^{t(X+Y)}] = E[e^{tX} e^{tY}] = E[e^{tX}]\, E[e^{tY}] = M_X(t)\, M_Y(t)$$

Similarly, the moment generating function for $S_n$, the sum of iid random variables $X_1, X_2, \ldots, X_n$, is

$$M_{S_n}(t) = [M_X(t)]^n$$

Moment Generating Function - Unit Normal

Let $Z \sim \mathcal{N}(0, 1)$. Then

$$M_Z(t) = E[e^{tZ}] = \int_{-\infty}^{\infty} e^{tx}\, \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}\, dx = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{x^2}{2} + tx}\, dx = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{(x-t)^2}{2} + \frac{t^2}{2}}\, dx = e^{t^2/2} \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{(x-t)^2}{2}}\, dx = e^{t^2/2}$$

Moment Generating Function - Unit Normal, cont.

$$M'_Z(t) = \frac{d}{dt}\, e^{t^2/2} = t\, e^{t^2/2}, \qquad \mu'_1 = M'_Z(0) = 0$$

$$M''_Z(t) = \frac{d}{dt}\, t\, e^{t^2/2} = (1 + t^2)\, e^{t^2/2}, \qquad \mu'_2 = M''_Z(0) = 1$$

$$M'''_Z(t) = \frac{d}{dt}\, (1 + t^2)\, e^{t^2/2} = (3t + t^3)\, e^{t^2/2}, \qquad \mu'_3 = M'''_Z(0) = 0$$

$$M''''_Z(t) = \frac{d}{dt}\, (3t + t^3)\, e^{t^2/2} = (3 + 6t^2 + t^4)\, e^{t^2/2}, \qquad \mu'_4 = M''''_Z(0) = 3$$

Central Limit Theorem

Let $X_1, X_2, \ldots$ be a sequence of independent and identically distributed random variables, each having mean $\mu$ and variance $\sigma^2$. Then the distribution of

$$\frac{X_1 + \cdots + X_n - n\mu}{\sigma \sqrt{n}}$$

tends to the unit normal as $n \to \infty$. That is, for $-\infty < a < \infty$,

$$P\left(\frac{X_1 + \cdots + X_n - n\mu}{\sigma \sqrt{n}} \le a\right) \to \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{a} e^{-x^2/2}\, dx = \Phi(a) \quad \text{as } n \to \infty$$

Sketch of Proof

Proposition. Let $X_1, X_2, \ldots$ be a sequence of independent and identically distributed random variables and $S_n = X_1 + \cdots + X_n$. The distribution of $S_n$ is given by the distribution function $f_{S_n}$, which has a moment generating function $M_{S_n}$ with $n \ge 1$. Let $Z$ be a random variable with distribution function $f_Z$ and moment generating function $M_Z$. If $M_{S_n}(t) \to M_Z(t)$ for all $t$, then $f_{S_n}(t) \to f_Z(t)$ for all $t$ at which $f_Z(t)$ is continuous.

We can prove the CLT by letting $Z \sim \mathcal{N}(0, 1)$, so that $M_Z(t) = e^{t^2/2}$, and then showing for any $S_n$ that $M_{S_n/\sqrt{n}}(t) \to e^{t^2/2}$ as $n \to \infty$.
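As a quick numerical illustration of this convergence (not part of the original slides), the sketch below uses NumPy to estimate the moment generating function of $S_n/\sqrt{n}$ by Monte Carlo for standardized Uniform summands and compares it with $e^{t^2/2}$; the sample size, number of replications, and random seed are arbitrary choices.

```python
import numpy as np

# Monte Carlo sketch: estimate M_{S_n/sqrt(n)}(t) for iid summands with
# mean 0 and variance 1, and compare with exp(t^2 / 2), the MGF of N(0, 1).
rng = np.random.default_rng(0)                 # arbitrary seed
n, reps = 50, 200_000                          # arbitrary choices
X = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(reps, n))   # mean 0, variance 1
S = X.sum(axis=1) / np.sqrt(n)                 # standardized sum S_n / sqrt(n)

for t in (0.5, 1.0, 1.5):
    print(t, np.exp(t * S).mean(), np.exp(t**2 / 2))   # estimate vs e^{t^2/2}
```

Increasing $n$ should drive the two printed columns together, mirroring the statement that $M_{S_n/\sqrt{n}}(t) \to e^{t^2/2}$.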
Proof of the CLT

Some simplifying assumptions and notation:

- $E(X_i) = 0$
- $\operatorname{Var}(X_i) = 1$
- $M_{X_i}(t)$ exists and is finite
- $L(t) = \log M(t)$
- L'Hospital's rule: $\displaystyle \lim_{x \to \infty} \frac{f(x)}{g(x)} = \lim_{x \to \infty} \frac{f'(x)}{g'(x)}$

Proof of the CLT, cont.

The moment generating function of $X_i / \sqrt{n}$ is given by

$$M_{X_i/\sqrt{n}}(t) = E\left[\exp\left(t\, \frac{X_i}{\sqrt{n}}\right)\right] = M_{X_i}\!\left(\frac{t}{\sqrt{n}}\right)$$

and thus the moment generating function of $S_n/\sqrt{n} = \sum_{i=1}^{n} X_i/\sqrt{n}$ is given by

$$M_{S_n/\sqrt{n}}(t) = \left[M_{X_i}\!\left(\frac{t}{\sqrt{n}}\right)\right]^n$$

Therefore, in order to show $M_{S_n/\sqrt{n}}(t) \to M_Z(t)$ we need to show

$$\left[M_{X_i}\!\left(\frac{t}{\sqrt{n}}\right)\right]^n \to e^{t^2/2}$$

Proof of the CLT, cont.

Let $L_{X_i}(t) = \log M_{X_i}(t)$. Then

$$L_{X_i}(0) = \log M_{X_i}(0) = \log 1 = 0$$

$$L'_{X_i}(t) = \frac{d}{dt} \log M_{X_i}(t) = \frac{M'_{X_i}(t)}{M_{X_i}(t)}, \qquad L'_{X_i}(0) = \frac{M'_{X_i}(0)}{M_{X_i}(0)} = \frac{\mu}{1} = 0$$

$$L''_{X_i}(t) = \frac{d}{dt}\, \frac{M'_{X_i}(t)}{M_{X_i}(t)} = \frac{M''_{X_i}(t)\, M_{X_i}(t) - [M'_{X_i}(t)]^2}{[M_{X_i}(t)]^2}$$

$$L''_{X_i}(0) = \frac{M''_{X_i}(0)\, M_{X_i}(0) - [M'_{X_i}(0)]^2}{[M_{X_i}(0)]^2} = \frac{E(X_i^2)\, E(X_i^0) - E(X_i)^2}{E(X_i^0)^2} = E(X_i^2) - E(X_i)^2 = \sigma^2 = 1$$

Proof of the CLT, cont.

Showing $[M_{X_i}(t/\sqrt{n})]^n \to e^{t^2/2}$ is equivalent to showing $n\, L_{X_i}(t/\sqrt{n}) \to t^2/2$:

$$\lim_{n \to \infty} \frac{L(t/\sqrt{n})}{n^{-1}} = \lim_{n \to \infty} \frac{L'(t/\sqrt{n}) \left(-\tfrac{1}{2}\, t\, n^{-3/2}\right)}{-n^{-2}} \quad \text{by L'Hospital's rule}$$

$$= \lim_{n \to \infty} \frac{L'(t/\sqrt{n})\, t}{2 n^{-1/2}} = \lim_{n \to \infty} \frac{L''(t/\sqrt{n}) \left(-\tfrac{1}{2}\, t\, n^{-3/2}\right) t}{-n^{-3/2}} \quad \text{by L'Hospital's rule}$$

$$= \lim_{n \to \infty} L''(t/\sqrt{n})\, \frac{t^2}{2} = \frac{t^2}{2}$$

Proof of the CLT, Final Comments

The preceding proof assumes that $E(X_i) = 0$ and $\operatorname{Var}(X_i) = 1$. We can generalize this result to any collection of random variables $Y_i$ by considering the standardized form $Y_i^* = (Y_i - \mu)/\sigma$:

$$\frac{Y_1 + \cdots + Y_n - n\mu}{\sigma \sqrt{n}} = \left(\frac{Y_1 - \mu}{\sigma} + \cdots + \frac{Y_n - \mu}{\sigma}\right) \Big/ \sqrt{n} = \left(Y_1^* + \cdots + Y_n^*\right) / \sqrt{n}$$

where $E(Y_i^*) = 0$ and $\operatorname{Var}(Y_i^*) = 1$.

Cumulative Distribution Function

We have already seen a variety of problems where we find $P(X \le x)$ or $P(X > x)$, etc. The former is given a special name: the cumulative distribution function.

If $X$ is discrete with probability mass function $f(x)$ then

$$P(X \le x) = F(x) = \sum_{z = -\infty}^{x} f(z)$$

If $X$ is continuous with probability density function $f(x)$ then

$$P(X \le x) = F(x) = \int_{-\infty}^{x} f(z)\, dz$$

The CDF is defined for all $-\infty < x < \infty$ and satisfies

$$\lim_{x \to -\infty} F(x) = 0, \qquad \lim_{x \to \infty} F(x) = 1, \qquad x < y \Rightarrow F(x) \le F(y)$$

Binomial CDF

Let $X \sim \operatorname{Binom}(n, p)$. Then the probability mass function and cumulative distribution function are

$$P(X = k) = f(k) = \binom{n}{k} p^k (1 - p)^{n-k}, \qquad P(X \le x) = F(x) = \sum_{k=0}^{\lfloor x \rfloor} \binom{n}{k} p^k (1 - p)^{n-k}$$

Uniform CDF

Let $X \sim \operatorname{Unif}(a, b)$. Then the probability density function and cumulative distribution function are

$$f(x) = \begin{cases} \dfrac{1}{b - a} & \text{for } x \in [a, b] \\ 0 & \text{otherwise} \end{cases} \qquad F(x) = \begin{cases} 0 & \text{for } x \le a \\ \dfrac{x - a}{b - a} & \text{for } x \in [a, b] \\ 1 & \text{for } x \ge b \end{cases}$$
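The closed forms above can be sanity-checked against a statistics library. The short sketch below is not part of the lecture; it assumes SciPy is available and uses arbitrarily chosen parameters to compare the Binomial and Uniform CDF formulas with `scipy.stats`.

```python
import math
from scipy import stats

# Binomial CDF: sum the pmf up to floor(x) and compare with scipy's cdf.
n, p, x = 10, 0.3, 4.7                       # arbitrary parameters
manual = sum(math.comb(n, k) * p**k * (1 - p)**(n - k)
             for k in range(math.floor(x) + 1))
print(manual, stats.binom.cdf(x, n, p))      # the two values should agree

# Uniform(a, b) CDF: (x - a) / (b - a) for x in [a, b].
a, b, x = 2.0, 5.0, 3.1                      # arbitrary parameters
print((x - a) / (b - a), stats.uniform.cdf(x, loc=a, scale=b - a))
```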
Normal CDF

Let $X \sim \mathcal{N}(\mu, \sigma^2)$. Then the probability density function and cumulative distribution function are

$$f(x) = \frac{1}{\sqrt{2\pi}\, \sigma}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}} = \frac{1}{\sigma}\, \phi\!\left(\frac{x - \mu}{\sigma}\right), \qquad F(x) = \Phi\!\left(\frac{x - \mu}{\sigma}\right)$$

where $\phi$ and $\Phi$ denote the standard normal density and CDF.

Exponential Distribution

In general terms, the Exponential distribution describes the time between events which occur continuously with a given rate $\lambda$ (the expected number of events in a given unit of time).

Let $X \sim \operatorname{Exp}(\lambda)$, and sub-divide each unit of time into $n$ sub-intervals, so that the probability that an event occurs during a particular sub-interval is approximately $\lambda/n$. The probability that we must wait $b$ or fewer units of time between events is then the same as the probability that an event occurs in one of the first $b \cdot n$ sub-intervals. Therefore, if we let $Y \sim \operatorname{Geo}(\lambda/n)$, then

$$P(X \le b) \approx P(Y \le nb) = \sum_{k=0}^{bn - 1} P(Y = k) = \sum_{k=0}^{bn - 1} \frac{\lambda}{n} \left(1 - \frac{\lambda}{n}\right)^{k} = 1 - \left(1 - \frac{\lambda}{n}\right)^{bn}$$

As $n \to \infty$ the right-hand side tends to $1 - e^{-\lambda b}$, which is the CDF of the Exponential distribution.
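A short numerical check of this geometric-to-exponential limit (not from the original slides; NumPy assumed, with rate and waiting time chosen arbitrarily):

```python
import numpy as np

# Geometric-to-exponential limit: P(Y <= nb) = 1 - (1 - lam/n)^(n*b) for
# Y ~ Geo(lam/n) should approach the exponential CDF 1 - exp(-lam * b).
lam, b = 2.0, 1.5                      # arbitrary rate and waiting time
for n in (10, 100, 10_000):
    print(n, 1 - (1 - lam / n) ** (n * b), 1 - np.exp(-lam * b))
```

As $n$ grows, the geometric approximation converges to the exponential CDF, matching the limit stated above.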