Exact Distribution and High-Dimensional Asymptotics for Improperness Test of Complex Signals
Florent Chatelain, Nicolas Le Bihan


Florent Chatelain, Nicolas Le Bihan. Exact Distribution and High-dimensional Asymptotics for Improperness Test of Complex Signals. ICASSP 2019 - IEEE International Conference on Acoustics, Speech and Signal Processing, May 2019, Brighton, United Kingdom. pp. 8509-8513, doi:10.1109/ICASSP.2019.8683518. HAL Id: hal-02148428, https://hal.archives-ouvertes.fr/hal-02148428, submitted on 5 Jun 2019.

Univ. Grenoble Alpes, CNRS, Grenoble INP†, GIPSA-Lab, 38000 Grenoble, France.
†: Institute of Engineering Univ. Grenoble Alpes

ABSTRACT

Improperness testing for complex-valued vectors and processes has been of interest lately due to the potential applications of complex-valued time series analysis in several research areas. This paper provides an exact distribution characterization of the GLRT (Generalized Likelihood Ratio Test) statistic for Gaussian complex-valued signals under the null hypothesis of properness. This distribution is a special case of Wilks's lambda distribution, as are the distributions of the GLRT statistics in multivariate analysis of variance (MANOVA) procedures. In the high-dimensional setting, i.e. when the size of the vectors grows at the same rate as the number of samples, a closed-form expression is obtained for the asymptotic distribution of the GLRT statistic. This is, to our knowledge, the first exact characterization for GLRT-based improperness testing.

Index Terms— Complex signals, improperness, GLRT, high-dimensional statistics, Wilks's lambda distribution

1. INTRODUCTION

Complex-valued time series and vectors have attracted attention lately for their ability to model signals from a broad range of applications including digital communications [1], seismology [2], oceanography [3] or gravitational-wave physics [4], to name just a few. Among the specific features of complex datasets, the notion of properness (or second-order circularity [5]) is, in the Gaussian case, of major importance, as it relates the invariance of the probability density function to the correlation coefficients of complex vectors. The consequences of properness/circularity for the statistics of complex random vectors and signals were studied at large extent in [6, 7]. Testing for improperness of complex vectors/signals was investigated by several authors [8, 9] in the signal processing community; it had, however, already been considered long before by statisticians [10]. In the signal processing literature, authors have mainly used the augmented complex representation - a twice bigger vector made of the concatenation of the complex vector and its conjugate - while an equivalent real-valued representation using real and imaginary parts was preferred by statisticians. In the sequel, we will make use of this latter representation to study the Generalized Likelihood Ratio Test (GLRT).

In the scalar case, the notion of a proper/improper random variable can be phrased as follows: a complex random variable is called proper if it is uncorrelated with its complex conjugate. Note that properness is a less general notion than circularity, which qualifies a complex random variable whose pdf is invariant by rotation in the complex plane. In the Gaussian setting, where the distribution depends only on second-order statistics, the two properties become equivalent. In terms of real and imaginary parts of the complex variable, properness means that the real and imaginary parts have the same variance and that their cross-covariance vanishes. The concept of properness extends to complex-valued vectors in a straightforward manner (see [6, 7]).

We consider N-dimensional complex-valued centered random vectors z = u + iv, i.e. u and v are N-dimensional real vectors with zero mean. In short, we have z ∈ C^N, u ∈ R^N and v ∈ R^N. The real vector representation of z ∈ C^N consists in using x = [u^T, v^T]^T ∈ R^{2N}. The second-order statistics of z ∈ C^N are thus contained in the real-valued covariance matrix C ∈ R^{2N×2N} of x, given by

    C = \begin{pmatrix} C_{uu} & C_{uv} \\ C_{vu} & C_{vv} \end{pmatrix}    (1)

where C_{ab} ∈ R^{N×N} denotes the real-valued (cross-)covariance matrix between the real vectors a and b, with C_{ba} = C_{ab}^T. A complex-valued Gaussian vector is called proper iff the following two conditions hold:

    C_{uu} = C_{vv}  and  C_{uv} = -C_{uv}^T    (2)

If these conditions are not fulfilled, then z is called improper.

Testing for improperness of a complex Gaussian vector was studied in the seminal work of Andersson [10], where the authors used the 2N-dimensional real-valued representation of complex N-dimensional vectors. They derived the maximal invariant statistics for complex vectors, characterized the joint distribution of those statistics, and used them for improperness testing.
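As a quick numerical illustration of the properness conditions in (2), the following minimal numpy sketch (ours, not from the paper) draws a proper complex Gaussian sample, builds the real representation x, and checks that the block sample covariances satisfy C_uu ≈ C_vv and C_uv ≈ -C_uv^T up to sampling noise:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 4, 200_000

# A proper (circular) complex Gaussian sample: z = A w, where w is a
# standard circular complex Gaussian, so E[z z^T] = 0 (zero pseudo-covariance).
A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
w = (rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)
z = A @ w

u, v = z.real, z.imag
x = np.vstack([u, v])          # real representation, shape (2N, M)
C = x @ x.T / M                # sample covariance of x

Cuu, Cuv = C[:N, :N], C[:N, N:]
Cvu, Cvv = C[N:, :N], C[N:, N:]

# Conditions (2), satisfied up to O(1/sqrt(M)) sampling fluctuations:
print("max |Cuu - Cvv|  :", np.max(np.abs(Cuu - Cvv)))
print("max |Cuv + Cuv^T|:", np.max(np.abs(Cuv + Cuv.T)))
```

An improper sample (e.g. z with correlated real and imaginary parts of unequal variance) makes both residuals stay bounded away from zero as M grows.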
In [8, 1], the authors proposed a GLRT based on the augmented complex representation and highlighted its connection with canonical correlation coefficients. Results showing the equivalence of the maximal invariant statistics approach (derived from the 2N-dimensional real-valued representation) with the canonical correlation coefficients derived from the complex structure were obtained in [9], together with a numerical study of the GLRT Bartlett asymptotic distribution (in the large sample size regime with small or fixed dimension of the complex vector z).

The original contribution of the presented work is twofold. It consists 1) in the identification of the exact distribution of the GLRT statistic under the null hypothesis of properness, which reduces to a special case of Wilks's lambda distribution, and 2) in the derivation of the asymptotic distribution of the GLRT statistic in the high-dimensional case (vector and sample sizes growing at the same rate).

2. TESTING FOR IMPROPERNESS

2.1. Testing problem

In several applications, it is common to model the signal/vector of interest, denoted z, as being improper and corrupted by proper noise. Consequently, statistical tests have been proposed to investigate the properness/improperness of a signal given a sample, from which one will decide:

    H_0: z is proper, i.e. condition (2) holds,
    H_1: z is improper, otherwise.    (3)

2.2. Invariant parameters

Let G be the set of nonsingular matrices G ∈ R^{2N×2N} such that

    G = \begin{pmatrix} G_1 & -G_2 \\ G_2 & G_1 \end{pmatrix},

where G_1, G_2 ∈ R^{N×N}. Let S be the set of all 2N × 2N real symmetric positive definite matrices. According to the test formulation (3) and condition (2), the null hypothesis H_0 is equivalent to C ∈ T = S ∩ G.

As explained in [10], G is a group (isomorphic to the group GL_N(C) of nonsingular N × N complex matrices under the mapping G ↔ G_1 + iG_2). Moreover, G acts transitively on T under the action (G, T) ∈ G × T ↦ GTG^T ∈ T. Thus, a parametric characterization of H_0 should be invariant to this group action: the value of the parameters to be tested should be the same for C and GCG^T for any G ∈ G. We introduce the following decomposition, for any C ∈ S: C = Ċ + C̈.

Lemma 2.1. Any matrix C ∈ S can be written as

    C = G \begin{pmatrix} I_N + D_λ & 0 \\ 0 & I_N - D_λ \end{pmatrix} G^T,

where G ∈ G, I_N is the N × N identity matrix and D_λ = diag(λ_1, …, λ_N) is an N × N diagonal matrix. The diagonal entries of D_λ, denoted λ_i for 1 ≤ i ≤ N, are the non-negative eigenvalues of the 2N × 2N real symmetric matrix

    Γ(C) = Ċ^{-1/2} C̈ Ċ^{-1/2}.

They satisfy the following properties: 1) λ_i and -λ_i, for 1 ≤ i ≤ N, form the set of eigenvalues of the 2N × 2N matrix Γ(C), and 2) λ_i ∈ [0, 1] with, by convention, the ordering 1 ≥ λ_1 ≥ … ≥ λ_N ≥ 0.

Proof. See [10, lemmas 5.1 and 5.2].

Lemma 2.1 shows that any invariant parameterization of the covariance matrix C under the group action of G depends only on the N (positive) eigenvalues 1 ≥ λ_1 ≥ … ≥ λ_N ≥ 0. These eigenvalues are therefore termed maximal invariant parameters [11, Chapter 6]. Moreover, under the null hypothesis H_0, it follows that λ_1 = … = λ_N = 0, as C̈ reduces to the zero matrix according to (2). Within the invariant parameterization, the testing problem (3) becomes

    H_0: λ_i = 0, for 1 ≤ i ≤ N,
    H_1: λ_i ≥ 0, for 1 ≤ i ≤ N.    (4)

Note that the invariance property ensures that the test does not depend on the (common) representation basis of the real and imaginary parts of z, i.e. the vectors u and v.

2.3. Invariant statistics

Consider a sample of size M, denoted X = {x_m}_{m=1}^{M}, where the x_m = [u_m^T, v_m^T]^T are 2N-dimensional i.i.d. Gaussian vectors with zero mean and covariance matrix C. In the Gaussian framework, a sufficient statistic is given by the 2N × 2N sample covariance matrix

    S = \begin{pmatrix} S_{uu} & S_{uv} \\ S_{vu} & S_{vv} \end{pmatrix}

with S_{ab} ∈ R^{N×N} the real-valued sample (cross-)covariance matrix of the real vectors {a_m}_{m=1}^{M} and {b_m}_{m=1}^{M}, such that S_{ab} = (1/M) Σ_{m=1}^{M} a_m b_m^T.
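The maximal invariant eigenvalues of Lemma 2.1 can be estimated by plugging the sample covariance S into Γ(·). The definition of the decomposition C = Ċ + C̈ is truncated in this excerpt, so the sketch below assumes a natural choice consistent with Lemma 2.1: Ċ is the orthogonal projection of C onto the proper-structured matrices (equal diagonal blocks, skew-symmetric cross block) and C̈ = C - Ċ. Function and variable names are ours, not the paper's:

```python
import numpy as np

def split_proper(C, N):
    """Assumed decomposition C = Cdot + Cddot: Cdot keeps the proper
    structure (common symmetric diagonal block, skew cross block)."""
    Cuu, Cuv = C[:N, :N], C[:N, N:]
    Cvv = C[N:, N:]
    A = (Cuu + Cvv) / 2       # common symmetric diagonal block
    K = (Cuv - Cuv.T) / 2     # skew-symmetric cross block
    Cdot = np.block([[A, K], [-K, A]])
    return Cdot, C - Cdot

def invariant_eigs(C, N):
    """Eigenvalues of Gamma(C) = Cdot^{-1/2} Cddot Cdot^{-1/2}; they come
    in +/- pairs, and the non-negative ones are the lambda_i of Lemma 2.1."""
    Cdot, Cddot = split_proper(C, N)
    w, V = np.linalg.eigh(Cdot)            # Cdot is symmetric positive definite
    Cdot_isqrt = (V / np.sqrt(w)) @ V.T    # symmetric inverse square root
    return np.linalg.eigvalsh(Cdot_isqrt @ Cddot @ Cdot_isqrt)

# Demo on a proper sample: all lambda_i should be close to zero (H_0).
rng = np.random.default_rng(1)
N, M = 3, 50_000
Amix = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
wn = (rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)
z = Amix @ wn
x = np.vstack([z.real, z.imag])
S = x @ x.T / M
eigs = invariant_eigs(S, N)
print("spectrum of Gamma(S):", np.round(eigs, 3))
```

The printed spectrum is symmetric about zero, matching property 1) of Lemma 2.1, and concentrates near zero here because the sample is proper; an improper sample would push the largest λ_i away from zero.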