Module 3: Function of a Random Variable and Its Distribution


NPTEL – Probability and Distributions

MODULE 3: FUNCTION OF A RANDOM VARIABLE AND ITS DISTRIBUTION

LECTURE 16

Topics

3.5 PROBABILITY AND MOMENT INEQUALITIES
3.5.3 Jensen Inequality
3.5.4 AM-GM-HM Inequality
3.6 DESCRIPTIVE MEASURES OF PROBABILITY DISTRIBUTIONS
3.6.1 Measures of Central Tendency
3.6.1.1 Mean
3.6.1.2 Median
3.6.1.3 Mode
3.6.2 Measures of Dispersion
3.6.2.1 Standard Deviation
3.6.2.2 Mean Deviation
3.6.2.3 Quartile Deviation
3.6.2.4 Coefficient of Variation
3.6.3 Measures of Skewness
3.6.4 Measures of Kurtosis

3.5.3 Jensen Inequality

Theorem 5.2

Let $I \subseteq \mathbb{R}$ be an interval and let $\phi: I \to \mathbb{R}$ be a twice differentiable function such that its second-order derivative $\phi''(\cdot)$ is continuous on $I$ and $\phi''(x) \ge 0$, $\forall x \in I$. Let $X$ be a random variable with support $S_X \subseteq I$ and finite expectation. Then

$$E(\phi(X)) \ge \phi(E(X)).$$

If $\phi''(x) > 0$, $\forall x \in I$, then the inequality above is strict unless $X$ is a degenerate random variable.

Proof. Let $\mu = E(X)$. Expanding $\phi(x)$ in a Taylor series about $\mu$, we get

$$\phi(x) = \phi(\mu) + (x - \mu)\phi'(\mu) + \frac{(x - \mu)^2}{2!}\phi''(\xi), \quad \forall x \in I,$$

for some $\xi$ between $\mu$ and $x$. Since $\phi''(x) \ge 0$, $\forall x \in I$, it follows that

$$\phi(x) \ge \phi(\mu) + (x - \mu)\phi'(\mu), \quad \forall x \in I \tag{5.2}$$

$$\Rightarrow \phi(X) \ge \phi(\mu) + (X - \mu)\phi'(\mu) \quad \text{(since } S_X \subseteq I\text{)}$$

$$\Rightarrow E(\phi(X)) \ge \phi(\mu) + \phi'(\mu)\,E(X - \mu) = \phi(\mu) = \phi(E(X)).$$

Clearly the inequality in (5.2) is strict unless $E\left((X - \mu)^2\right) = 0$, i.e., $P(X = \mu) = 1$ (using Corollary 3.1 (ii)). ▄

Example 5.3

Let $X$ be a random variable with support $S_X$, an interval in $\mathbb{R}$. Then

(i) $E(X^2) \ge (E(X))^2$ (taking $\phi(x) = x^2$, $x \in \mathbb{R}$, in Theorem 5.2);

(ii) $E(\ln X) \le \ln(E(X))$, provided $S_X \subseteq (0, \infty)$ (taking $\phi(x) = -\ln x$, $x \in (0, \infty)$, in Theorem 5.2);

(iii) $E(e^{-X}) \ge e^{-E(X)}$ (taking $\phi(x) = e^{-x}$, $x \in \mathbb{R}$, in Theorem 5.2);

(iv) $E\left(\frac{1}{X}\right) \ge \frac{1}{E(X)}$, if $S_X \subseteq (0, \infty)$ (taking $\phi(x) = \frac{1}{x}$, $x \in (0, \infty)$, in Theorem 5.2);

provided the involved expectations exist. ▄

Definition 5.2

Let $X$ be a random variable with support $S_X \subseteq (0, \infty)$. Then, provided they are finite, $E(X)$ is called the arithmetic mean (AM) of $X$, $e^{E(\ln X)}$ is called the geometric mean (GM) of $X$, and $\frac{1}{E(1/X)}$ is called the harmonic mean (HM) of $X$. ▄

3.5.4 AM-GM-HM Inequality

Example 5.4

(i) Let $X$ be a random variable with support $S_X \subseteq (0, \infty)$. Then

$$E(X) \ge e^{E(\ln X)} \ge \frac{1}{E(1/X)},$$

provided the expectations are finite.

(ii) Let $a_1, \ldots, a_n$ be positive real constants and let $p_1, \ldots, p_n$ be another set of positive real constants such that $\sum_{i=1}^{n} p_i = 1$. Then

$$\sum_{i=1}^{n} a_i p_i \ge \prod_{i=1}^{n} a_i^{p_i} \ge \frac{1}{\sum_{i=1}^{n} \frac{p_i}{a_i}}.$$

Solution.

(i) From Example 5.3 (ii) we have

$$\ln(E(X)) \ge E(\ln X) \;\Rightarrow\; E(X) \ge e^{E(\ln X)}. \tag{5.3}$$

Using (5.3) on $\frac{1}{X}$, we get

$$E\left(\frac{1}{X}\right) \ge e^{E(\ln(1/X))} = e^{-E(\ln X)}$$

$$\Rightarrow e^{E(\ln X)} \ge \frac{1}{E(1/X)}. \tag{5.4}$$

The assertion now follows on combining (5.3) and (5.4).

(ii) Let $X$ be a discrete type random variable with support $S_X = \{a_1, \ldots, a_n\}$ and $P(X = a_i) = p_i$, $i = 1, \ldots, n$. Clearly, $P(X = x) > 0$, $\forall x \in S_X$, and $\sum_{x \in S_X} P(X = x) = \sum_{i=1}^{n} p_i = 1$. On using (i), we get

$$E(X) \ge e^{E(\ln X)} \ge \frac{1}{E(1/X)}$$

$$\Rightarrow \sum_{i=1}^{n} a_i p_i \ge e^{\sum_{i=1}^{n} p_i \ln a_i} \ge \frac{1}{\sum_{i=1}^{n} \frac{p_i}{a_i}}$$

$$\Rightarrow \sum_{i=1}^{n} a_i p_i \ge e^{\ln \prod_{i=1}^{n} a_i^{p_i}} = \prod_{i=1}^{n} a_i^{p_i} \ge \frac{1}{\sum_{i=1}^{n} \frac{p_i}{a_i}}. \quad ▄$$

3.6 DESCRIPTIVE MEASURES OF PROBABILITY DISTRIBUTIONS

Let $X$ be a random variable defined on a probability space $(\Omega, \mathcal{F}, P)$, associated with a random experiment $\mathcal{E}$.
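The inequalities above are easy to check numerically for a discrete distribution. The following is a minimal sketch (assuming NumPy is available; the support points $a_i$ and probabilities $p_i$ are illustrative choices, not from the lecture) verifying Jensen's inequality for two of the convex functions of Example 5.3 and the AM-GM-HM chain of Example 5.4 (ii).

```python
import numpy as np

# Illustrative discrete random variable: P(X = a_i) = p_i, with a_i > 0.
a = np.array([1.0, 2.0, 4.0, 8.0])   # support points (hypothetical values)
p = np.array([0.1, 0.2, 0.3, 0.4])   # probabilities; they sum to 1

am = np.sum(a * p)                   # arithmetic mean  E(X)
gm = np.exp(np.sum(p * np.log(a)))   # geometric mean   e^{E(ln X)} = prod a_i^{p_i}
hm = 1.0 / np.sum(p / a)             # harmonic mean    1 / E(1/X)

# Jensen (Example 5.3 (i) and (iii)): E(X^2) >= (E(X))^2 and E(e^{-X}) >= e^{-E(X)}
assert np.sum(a**2 * p) >= am**2
assert np.sum(np.exp(-a) * p) >= np.exp(-am)

# AM-GM-HM chain (Example 5.4): E(X) >= e^{E(ln X)} >= 1/E(1/X)
assert am >= gm >= hm
print(f"AM = {am:.4f}, GM = {gm:.4f}, HM = {hm:.4f}")
# AM = 4.9000, GM = 4.0000, HM = 3.0769
```

Note that equality throughout the chain would require $X$ to be degenerate, matching the strictness condition in Theorem 5.2 (here $-\ln x$ is strictly convex on $(0, \infty)$).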
Let $F_X$ and $f_X$ denote, respectively, the distribution function and the p.d.f./p.m.f. of $X$. The probability distribution (i.e., the distribution function/p.d.f./p.m.f.) of $X$ describes the manner in which the random variable $X$ takes values in different Borel sets. It may be desirable to have a set of numerical measures that provide a summary of the prominent features of the probability distribution of $X$. We call these measures descriptive measures. Four prominently used descriptive measures are: (i) measures of central tendency (or location), also referred to as averages; (ii) measures of dispersion; (iii) measures of skewness; and (iv) measures of kurtosis.

3.6.1 Measures of Central Tendency

A measure of central tendency or location (also called an average) gives us an idea of the central value of the probability distribution around which values of the random variable are clustered. Three commonly used measures of central tendency are the mean, the median and the mode.

3.6.1.1 Mean.

Recall (Definition 3.2 (i)) that the mean (of the probability distribution) of a random variable $X$ is given by $\mu_1' = E(X)$. We have seen that the mean of a probability distribution gives us an idea of the average observed value of $X$ in the long run (i.e., the average of observed values of $X$ when the random experiment is repeated a large number of times). The mean seems to be the best suited average if the distribution is symmetric about a point $\mu$ (i.e., $X - \mu \stackrel{d}{=} \mu - X$, in which case $\mu = E(X)$, provided it is finite), values in the neighborhood of $\mu$ occur with high probabilities, and $f_X(\cdot)$ decreases as we move away from $\mu$ in either direction. Because of its simplicity the mean is the most commonly used average (especially for symmetric or nearly symmetric distributions). Some of the demerits of this measure are that in some situations it may not be defined (Examples 3.2 and 3.4) and that it is very sensitive to the presence of a few extreme values of $X$ which are different from the other values of $X$ (even though they may occur with small positive probabilities). So this measure should be used with caution if the probability distribution assigns positive probabilities to a few Borel sets having some extreme values.

3.6.1.2 Median.

A real number $m$ satisfying $F_X(m-) \le \frac{1}{2} \le F_X(m)$, i.e., $P(X < m) \le \frac{1}{2} \le P(X \le m)$, is called the median (of the probability distribution) of $X$. Clearly if $m$ is the median of a probability distribution then, in the long run (i.e., when the random experiment $\mathcal{E}$ is repeated a large number of times), the values of $X$ on either side of $m$ in $S_X$ are observed with the same frequency. Thus the median of a probability distribution, in some sense, divides $S_X$ into two equal parts, each having the same probability of occurrence. It is evident that if $X$ is of continuous type then the median $m$ is given by $F_X(m) = 1/2$. For some distributions (especially for distributions of discrete type random variables) it may happen that $F_X(a-) < \frac{1}{2}$ and $\{x \in \mathbb{R}: F_X(x) = 1/2\} = [a, b)$, for some $-\infty < a < b < \infty$, so that the median is not unique. In that case $P(X = x) = 0$, $\forall x \in (a, b)$, and thus we take the median to be $m = a = \inf\{x \in \mathbb{R}: F_X(x) \ge 1/2\}$. For random variables having a symmetric probability distribution it is easy to verify that the mean and the median coincide (see Problem 33). Unlike the mean, the median of a probability distribution is always defined.
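To make the contrast between the mean and the median concrete, here is a small sketch (again assuming NumPy; the distribution is a hypothetical one chosen to contain a single extreme support point) that computes $E(X)$ and the median $m = \inf\{x \in \mathbb{R}: F_X(x) \ge 1/2\}$ of a discrete distribution.

```python
import numpy as np

# Hypothetical discrete distribution with one extreme support point.
x = np.array([1.0, 2.0, 3.0, 1000.0])
p = np.array([0.3, 0.3, 0.3, 0.1])

mean = np.sum(x * p)                 # E(X); pulled far to the right by the point 1000

# Median m = inf{ t : F_X(t) >= 1/2 }, using the step c.d.f. F_X.
F = np.cumsum(p)                     # F_X evaluated at the support points
median = x[np.searchsorted(F, 0.5)]  # first support point where F_X >= 1/2

print(mean, median)                  # 101.8  2.0
```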
Moreover, the median is not affected by a few extreme values, as it takes into account only the probabilities with which different values occur and not their numerical values. As a measure of central tendency the median is preferred over the mean if the distribution is asymmetric and a few extreme observations are assigned positive probabilities. However, the fact that the median does not at all take into account the numerical values of $X$ is one of its demerits. Another disadvantage of the median is that for many probability distributions it is not easy to evaluate (especially for distributions whose distribution function $F_X(\cdot)$ does not have a closed form).

3.6.1.3 Mode.

Roughly speaking, the mode $m_0$ of a probability distribution is the value that occurs with the highest probability; it is defined by $f_X(m_0) = \sup\{f_X(x): x \in S_X\}$. Clearly if $m_0$ is the mode of the probability distribution of $X$ then, in the long run, either $m_0$ or a value in the neighborhood of $m_0$ is observed with maximum frequency. The mode is easy to understand and easy to calculate. Normally, it can be found just by inspection.
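In the same spirit, a minimal sketch of finding the mode of a discrete distribution by inspecting the p.m.f. (the distribution below is again illustrative):

```python
import numpy as np

# Mode m0: the support point attaining f_X(m0) = sup{ f_X(x) : x in S_X }.
x = np.array([0, 1, 2, 3, 4])                  # hypothetical support
f = np.array([0.05, 0.20, 0.40, 0.25, 0.10])   # p.m.f. values; they sum to 1

m0 = x[np.argmax(f)]   # support point with the largest p.m.f. value
print(m0)              # 2
```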