15. Probability and Statistics (MAE 546, 2018)


Probability and Statistics
Robert Stengel
Optimal Control and Estimation, MAE 546, Princeton University, 2018

• Concepts and reality
• Probability distributions
• Bayes's Law
• Stationarity and ergodicity
• Correlation functions and power spectra

Copyright 2018 by Robert Stengel. All rights reserved. For educational use only.
http://www.princeton.edu/~stengel/MAE546.html
http://www.princeton.edu/~stengel/OptConEst.html

Concepts and Reality of Probability (Papoulis, 1990)
• Theory may be exact
  – Deals with averages of phenomena with many possible outcomes
  – Based on models of behavior
• Application can only be approximate
  – Measure of our state of knowledge or belief that something may or may not be true
  – Subjective assessment

$A$: event; $P(A)$: probability of the event; $n_A$: number of times $A$ occurs experimentally; $n$: total number of trials

$P(A) \approx \dfrac{n_A}{n}$

Interpretations of Probability (Papoulis)
• Axiomatic definition (theoretical interpretation)
  – Probability space, abstract objects (outcomes), and sets (events)
  – Axiom 1: $\Pr(A_i) \ge 0$
  – Axiom 2: $\Pr(\text{certain event}) = 1 = \Pr[\text{all events in the probability space (or universe)}]$
  – Axiom 3: With no common elements, $\Pr(A_i \cup A_j) = \Pr(A_i) + \Pr(A_j)$
• Relative frequency (empirical interpretation)
  $\Pr(A_i) = \lim_{N \to \infty} \left( \dfrac{n_{A_i}}{N} \right)$, where $N$ = total number of trials and $n_{A_i}$ = number of trials with attribute $A_i$
• Classical ("favorable outcomes" interpretation)
  $\Pr(A_i) = \dfrac{n_{A_i}}{N}$, where $N$ is finite and $n_{A_i}$ = number of outcomes "favorable to" $A_i$
• Measure of belief (subjective interpretation)
  – $\Pr(A_i)$ = measure of belief that $A_i$ is true (similar to fuzzy sets)
  – Informal induction precedes deduction
  – Principle of insufficient reason (i.e., total prior ignorance): e.g., if there are 5 event sets, $A_i$, $i = 1$ to 5, then $\Pr(A_i) = 1/5 = 0.2$

Probability: "... a way of expressing knowledge or belief that an event will occur or has occurred."
Statistics: "The science of making effective use of numerical data relating to groups of individuals or experiments."

Empirical (or Relative) Frequency of Discrete, Mutually Exclusive Events in Sample Space

$\Pr(x_i) = \dfrac{n_i}{N}$, $\quad i = 1$ to $I$, in $[0, 1]$

• $N$ = total number of events
• $n_i$ = number of events with value $x_i$
• $I$ = number of different values
• $x_i$ = ordered set of hypotheses or values; $x_i$ is a discrete, real random variable
• Ensemble of equivalent sets: $A_i = \{x \in U \mid x = x_i\}$, $i = 1$ to $I$
• Cumulative probability over all sets:
  $\sum_{i=1}^{I} \Pr(A_i) = \sum_{i=1}^{I} \Pr(x_i) = \dfrac{1}{N} \sum_{i=1}^{I} n_i = 1$

[Figure: bar chart of Pr(x) for x = 1 to 5]

Cumulative Probability, Pr(x ≥/≤ a): Discrete Random Variables and Continuous Variables

[Figure: Pr(x), cumulative Pr(x ≥ a), and cumulative Pr(x ≤ a) for x = 1 to 5]

• Continuous, real random variable, $x$, e.g., a continuum of colors
  – $x_i$ is the center of a categorical band in $x$ (purple, blue, green, ...)

$\Pr(x_i \pm \Delta x / 2) = n_i / N, \qquad \sum_{i=1}^{I} \Pr(x_i \pm \Delta x / 2) = 1$
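To make the relative-frequency and cumulative-probability definitions concrete, here is a minimal Python/NumPy sketch (added for illustration; the distribution, sample size, and variable names are assumptions, not part of the lecture) that tallies $\Pr(x_i) = n_i / N$ and the running cumulative probability for a discrete random variable taking the values 1 through 5.

```python
import numpy as np

# Discrete random variable taking values 1..5 with an assumed distribution
values = np.array([1, 2, 3, 4, 5])
true_pr = np.array([0.25, 0.30, 0.20, 0.15, 0.10])   # hypothetical probabilities

rng = np.random.default_rng(0)
N = 10_000                                            # total number of trials
samples = rng.choice(values, size=N, p=true_pr)

# Relative frequency: Pr(x_i) ~ n_i / N
counts = np.array([(samples == v).sum() for v in values])
pr_hat = counts / N

# Cumulative probability: Pr(x <= a) = sum of Pr(x_i) for x_i <= a
cum_pr = np.cumsum(pr_hat)

for v, p, c in zip(values, pr_hat, cum_pr):
    print(f"x = {v}:  Pr ~ {p:.3f},  Pr(x <= {v}) ~ {c:.3f}")

print("Sum of Pr(x_i):", pr_hat.sum())                # equals 1 by construction
```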
Probability Density Function, pr(x), and Continuous Cumulative Distribution Function, Pr(x < X)

Slope of the cumulative distribution:

$pr(x_i) \triangleq \dfrac{\Pr(x_i \pm \Delta x / 2)}{\Delta x} \xrightarrow{\Delta x \to 0} pr(x)$

Cumulative probability for $x$ over $(-\infty, \infty)$:

$\sum_{i=1}^{I} \Pr(x_i \pm \Delta x / 2) = \sum_{i=1}^{I} pr(x_i)\, \Delta x \;\xrightarrow[I \to \infty]{\Delta x \to 0}\; \int_{-\infty}^{\infty} pr(x)\, dx = 1$

Cumulative distribution function, $x < X$:

$\Pr(x < X) = \int_{-\infty}^{X} pr(x)\, dx$

[Figure: Gaussian probability density and cumulative distribution functions]

Properties of Random Variables
• Mode
  – Value of $x$ for which $pr(x)$ is maximum
• Median
  – Value of $x$ corresponding to the 50th percentile
  – $\Pr(x < \text{median}) = \Pr(x > \text{median}) = 0.5$
• Mean
  – Value of $x$ corresponding to the statistical average
  – First moment of $x$ = expected value of $x$, with $x$ as the "moment arm" and $pr(x)\,dx$ as the "force":

$\bar{x} = E(x) = \int_{-\infty}^{\infty} x\, pr(x)\, dx$

Expected Values
• Mean value: $\bar{x} = E(x) = \int_{-\infty}^{\infty} x\, pr(x)\, dx$
• Second central moment of $x$ = variance
  – Variance is measured from the mean value rather than from zero
  – A smaller value indicates less uncertainty in the value of $x$

$E\!\left[(x - \bar{x})^2\right] = \sigma_x^2 = \int_{-\infty}^{\infty} (x - \bar{x})^2\, pr(x)\, dx$

• Expected value of any function of $x$, $f(x)$:

$E[f(x)] = \int_{-\infty}^{\infty} f(x)\, pr(x)\, dx$

See Supplemental Material for examples of non-Gaussian distributions.

Expected Value is a Linear Operation

Expected value of a sum of random variables ($x_1$ and $x_2$ need not be statistically independent):

$E[x_1 + x_2] = \int_{-\infty}^{\infty} (x_1 + x_2)\, pr(x)\, dx = \int_{-\infty}^{\infty} x_1\, pr(x)\, dx + \int_{-\infty}^{\infty} x_2\, pr(x)\, dx = E[x_1] + E[x_2]$

Expected value of a constant times a random variable:

$E[k x] = \int_{-\infty}^{\infty} k x\, pr(x)\, dx = k \int_{-\infty}^{\infty} x\, pr(x)\, dx = k\, E[x]$

Gaussian (Normal) Random Distribution
• Used in some random number generators (e.g., RANDN)
• Unbounded, symmetric distribution
• Defined entirely by its mean and standard deviation

$pr(x) = \dfrac{1}{\sqrt{2\pi}\, \sigma_x}\, e^{-(x - \bar{x})^2 / (2 \sigma_x^2)}, \qquad \int_{-\infty}^{\infty} pr(x)\, dx = 1$

Mean value (first moment, $\mu_1$): $E(x) = \int_{-\infty}^{\infty} x\, pr(x)\, dx = \bar{x}$

Variance (second central moment, $\mu_2$): $E\!\left[(x - \bar{x})^2\right] = \int_{-\infty}^{\infty} (x - \bar{x})^2\, pr(x)\, dx = \sigma_x^2$

Units of $x$ and $\sigma_x$ are the same.

Probability of Being Close to the Mean (Gaussian Distribution)
• Probability of being within $\pm 1\sigma$: $\Pr[x < (\bar{x} + \sigma_x)] - \Pr[x < (\bar{x} - \sigma_x)] \approx 68\%$
• Probability of being within $\pm 2\sigma$: $\Pr[x < (\bar{x} + 2\sigma_x)] - \Pr[x < (\bar{x} - 2\sigma_x)] \approx 95\%$
• Probability of being within $\pm 3\sigma$: $\Pr[x < (\bar{x} + 3\sigma_x)] - \Pr[x < (\bar{x} - 3\sigma_x)] \approx 99\%$

Experimental Determination of Mean and Variance
• Sample mean for $N$ data points, $x_1, x_2, \ldots, x_N$:

$\bar{x} = \dfrac{1}{N} \sum_{i=1}^{N} x_i$

• Sample variance for the same data set:

$\sigma_x^2 = \dfrac{1}{N - 1} \sum_{i=1}^{N} (x_i - \bar{x})^2$

• The divisor is $(N - 1)$ rather than $N$ to produce an unbiased estimate
  – Only $(N - 1)$ terms are independent
  – The difference is inconsequential for large $N$
• The distribution is not necessarily Gaussian
  – Prior knowledge: fit the histogram to a known distribution
  – Hypothesis test: determine the best fit (e.g., Rayleigh, binomial, Poisson, ...)

[Figure: histogram of sampled data]
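The sample statistics above and the ±1σ, ±2σ, ±3σ probabilities can be checked numerically. This Python sketch (sample size, mean, and standard deviation are assumed values, not taken from the slides) uses the unbiased (N − 1) divisor for the sample variance and the error function for the exact Gaussian probabilities.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(1)
mu, sigma, N = 2.0, 0.5, 100_000               # assumed mean, std dev, sample size
x = rng.normal(mu, sigma, N)

# Sample mean and unbiased sample variance (divisor N - 1)
x_bar = x.sum() / N
var_hat = ((x - x_bar) ** 2).sum() / (N - 1)   # same as x.var(ddof=1)
print(f"sample mean = {x_bar:.4f}, sample std = {sqrt(var_hat):.4f}")

# Probability of lying within +/- k sigma of the mean for a Gaussian
for k in (1, 2, 3):
    empirical = np.mean(np.abs(x - mu) < k * sigma)
    exact = erf(k / sqrt(2))                   # Pr(|x - mu| < k*sigma)
    print(f"+/-{k} sigma: empirical {empirical:.4f}, exact {exact:.4f}")
```

The exact values are approximately 0.683, 0.954, and 0.997, consistent with the 68/95/99% figures quoted above.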
Central Limit Theorem
• The probability distribution of the sum of independent, identically distributed (i.i.d.) variables approaches a normal distribution as the number of variables approaches infinity
• Summation of continuous random variables produces a convolution of probability density functions (Papoulis, 1990)
• See Supplemental Material for sufficient conditions

$pr(y) = \int_{-\infty}^{\infty} pr(y - x)\, pr(x)\, dx, \qquad pr(z) = \int_{-\infty}^{\infty} pr(z - y)\, pr(y)\, dy$

MATLAB Simulation of Central Limit Theorem
Uniform distribution, 100,000 points, 60 histogram bins
[Figure: histograms of sums of uniformly distributed random variables approaching a Gaussian shape]

Multiple Probability Densities and Expected Values

Probability density functions of two random variables, $x$ and $y$: $pr(x)$ and $pr(y)$ are given for all $x$ and $y$ in $(-\infty, \infty)$, and $pr(x, y)$ is the joint probability density function of $x$ and $y$.

$\int_{-\infty}^{\infty} pr(x)\, dx = 1; \quad \int_{-\infty}^{\infty} pr(y)\, dy = 1; \quad \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} pr(x, y)\, dx\, dy = 1$

Expected values of $x$ and $y$:
• Mean values: $E(x) = \int_{-\infty}^{\infty} x\, pr(x)\, dx = \bar{x}; \quad E(y) = \int_{-\infty}^{\infty} y\, pr(y)\, dy = \bar{y}$
• Covariance: $E(xy) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} x\, y\, pr(x, y)\, dx\, dy$
• Autocovariance: $E(x^2) = \int_{-\infty}^{\infty} x^2\, pr(x)\, dx$

Joint Probability (n = 2)

Suppose $x$ can take $I$ values and $y$ can take $J$ values; then

$\sum_{i=1}^{I} \Pr(x_i) = 1, \qquad \sum_{j=1}^{J} \Pr(y_j) = 1$

If $x$ and $y$ are independent (uncorrelated),

$\Pr(x_i, y_j) = \Pr(x_i \wedge y_j) = \Pr(x_i)\, \Pr(y_j), \qquad \sum_{i=1}^{I} \sum_{j=1}^{J} \Pr(x_i, y_j) = 1$

Example (independent $x$ and $y$):

                   Pr(y1) = 0.5   Pr(y2) = 0.3   Pr(y3) = 0.2
  Pr(x1) = 0.6         0.30           0.18           0.12
  Pr(x2) = 0.4         0.20           0.12           0.08

Each joint entry is $\Pr(x_i)\Pr(y_j)$; rows sum to $\Pr(x_i)$, columns sum to $\Pr(y_j)$, and the total is 1.

Conditional Probability (n = 2)

If $x$ and $y$ are not independent, their probabilities are related. The probability that $x$ takes its $i$th value when $y$ takes its $j$th value is

$\Pr(x_i \mid y_j) = \dfrac{\Pr(x_i, y_j)}{\Pr(y_j)}$

and, similarly,

$\Pr(y_j \mid x_i) = \dfrac{\Pr(x_i, y_j)}{\Pr(x_i)}$

$\Pr(x_i \mid y_j) = \Pr(x_i)$ and $\Pr(y_j \mid x_i) = \Pr(y_j)$ if and only if $x$ and $y$ are independent of each other.

Causality is not addressed by conditional probability.

Applications of Conditional Probability (n = 2)

The joint probability can be expressed in two ways:

$\Pr(x_i, y_j) = \Pr(y_j \mid x_i)\, \Pr(x_i) = \Pr(x_i \mid y_j)\, \Pr(y_j)$

The unconditional probability of each variable is expressed by a sum of terms:

$\Pr(x_i) = \sum_{j=1}^{J} \Pr(x_i \mid y_j)\, \Pr(y_j), \qquad \Pr(y_j) = \sum_{i=1}^{I} \Pr(y_j \mid x_i)\, \Pr(x_i)$

Bayes's Rule (Thomas Bayes, 1702-1761)

Bayes's Rule proceeds from the previous results. The probability of $x$ taking the value $x_i$ conditioned on $y$ taking its $j$th value is

$\Pr(x_i \mid y_j) = \dfrac{\Pr(y_j \mid x_i)\, \Pr(x_i)}{\Pr(y_j)} = \dfrac{\Pr(y_j \mid x_i)\, \Pr(x_i)}{\sum_{i=1}^{I} \Pr(y_j \mid x_i)\, \Pr(x_i)}$

... and the converse:

$\Pr(y_j \mid x_i) = \dfrac{\Pr(x_i \mid y_j)\, \Pr(y_j)}{\Pr(x_i)} = \dfrac{\Pr(x_i \mid y_j)\, \Pr(y_j)}{\sum_{j=1}^{J} \Pr(x_i \mid y_j)\, \Pr(y_j)}$

Bayes's or Bayes' Rule? See http://www.dailywritingtips.com/charless-pen-and-jesus-name/
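A small numerical illustration of the discrete form of Bayes's Rule (the prior and the conditional probabilities below are made up for this sketch; it is not an example from the lecture): $\Pr(y_j)$ is formed by the total-probability sum, and the posterior $\Pr(x_i \mid y_j)$ follows from the rule above.

```python
import numpy as np

# Hypothetical discrete problem: 2 values of x (prior) and 3 values of y (measurement)
pr_x = np.array([0.6, 0.4])                     # prior Pr(x_i)
pr_y_given_x = np.array([[0.5, 0.3, 0.2],       # Pr(y_j | x_1), rows sum to 1
                         [0.1, 0.3, 0.6]])      # Pr(y_j | x_2)

# Total probability: Pr(y_j) = sum_i Pr(y_j | x_i) Pr(x_i)
pr_y = pr_y_given_x.T @ pr_x

# Bayes's Rule: Pr(x_i | y_j) = Pr(y_j | x_i) Pr(x_i) / Pr(y_j)
pr_x_given_y = (pr_y_given_x * pr_x[:, None]) / pr_y[None, :]

print("Pr(y_j)          :", pr_y)
print("Pr(x_i | y_j):\n", pr_x_given_y)
print("columns sum to 1 :", pr_x_given_y.sum(axis=0))
```

With these assumed numbers, observing $y_1$ raises the probability of $x_1$ from the prior 0.6 to $0.5 \cdot 0.6 / 0.34 \approx 0.88$.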
Random Processes

Ensemble Statistics
• Probability metrics apply to an ensemble of events $A_i$, $i = 1$ to $I$
• For $I$ events, $\Pr(A_i) = \lim_{N \to \infty} \left( \dfrac{n_{A_i}}{N} \right)$, $i = 1$ to $I$
• As $I$ becomes large, $A_i$ approaches a continuous, real random variable, $x$:
  $A_i = \{x \in U \mid x = x_i\}$, $i = 1$ to $I \to \infty$
• and
  $\sum_{i=1}^{I} \Pr(A_i) = \sum_{i=1}^{I} \Pr(x_i) = \sum_{i=1}^{I} \Pr(x_i \pm \Delta x / 2) = \sum_{i=1}^{I} pr(x_i)\, \Delta x \;\xrightarrow[I \to \infty]{\Delta x \to 0}\; \int_{-\infty}^{\infty} pr(x)\, dx = 1$
• $x$ has no specified dependence on time

Random Process
• A random-process variable (or time series), $x(t)$, varies with time as well as across the ensemble of trials:
  $pr_{\text{ensemble}}(x_i) = pr_{\text{ensemble}}[x_i(t)]$, $i = 1$ to $I$, $I \to \infty$
• Joint ensemble statistics (e.g., the joint probability distribution) depend on time:
  $pr_{\text{ensemble}}(x_1, x_2) = pr_{\text{ensemble}}[x(t_1), x(t_2)]$

Non-Stationary Random Process
• Joint ensemble statistics (e.g., the joint probability distribution) depend on the specific values of time, $t_1$ and $t_2$:
  $pr_{\text{ensemble}}[x(t_1)] \ne pr_{\text{ensemble}}[x(t_2)]$ {unless periodic}
  $pr_{\text{ensemble}}(x_1, x_2) = pr_{\text{ensemble}}[x(t_1), x(t_2)]$ ... unless the time series is periodic or a special case

Non-Stationary Process: Sinusoidal Mean
[Figure: Trial 1, Trial 2, ..., Trial n time histories and their ensemble statistics]

Non-Stationary Process: Sinusoidal Variance
[Figure: Trial 1, Trial 2, ..., Trial n time histories and their ensemble statistics]

Stationary
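As a rough illustration of the sinusoidal-mean example shown in the figures above, the following Python sketch (amplitude, frequency, noise level, and ensemble size are assumptions made for this illustration, not the lecture's simulation) generates an ensemble of trials and computes ensemble statistics at fixed instants of time, showing that the mean and variance taken across the ensemble depend on $t$.

```python
import numpy as np

rng = np.random.default_rng(2)
n_trials, n_steps = 200, 500                    # assumed ensemble and record sizes
t = np.linspace(0.0, 10.0, n_steps)

# Non-stationary process: sinusoidal mean plus zero-mean Gaussian noise
mean_t = np.sin(2.0 * np.pi * 0.2 * t)          # time-varying ensemble mean
x = mean_t + 0.3 * rng.standard_normal((n_trials, n_steps))   # one row per trial

# Ensemble statistics: average across trials at each fixed time
ens_mean = x.mean(axis=0)
ens_var = x.var(axis=0, ddof=1)

# The ensemble mean tracks sin(2*pi*0.2*t), so pr[x(t1)] != pr[x(t2)] in general
for idx in (0, n_steps // 4, n_steps // 2):
    print(f"t = {t[idx]:5.2f}:  ensemble mean = {ens_mean[idx]:+.3f}, "
          f"ensemble variance = {ens_var[idx]:.3f}")
```

Averaging over time within a single trial would not recover these time-varying statistics, which is where the stationarity and ergodicity discussion picks up.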