Subject - Statistics Paper - Probability I Module - Distribution Functions II

Distribution functions were defined in the previous module. In this module, we consider a few more of their properties. We begin with an example.

Example 1 Consider the distribution function

\[
F(x) = \begin{cases} 0, & \text{if } x < 0 \\ 1 - p, & \text{if } 0 \le x < 1 \\ 1, & \text{if } x \ge 1. \end{cases}
\]

For definiteness, take $p = 0.6$. The figure below shows the graph of $F(x)$ against $x$.

[Figure: graph of $F(x)$ against $x$ for $p = 0.6$.]

As is evident, there is a jump discontinuity at $x = 0$, of magnitude $0.4$, and another one at $x = 1$, of magnitude $0.6$. ♦

We make a few comments on jumps:

(i) If a distribution function $F$ has exactly one jump (of magnitude 1) at $x = c$ (a real constant), we say $F$ is degenerate at $c$, or $F$ is the point mass at $c$. If a random variable $X$ has this $F$ as its distribution function, then $P\{X = c\} = 1$.

(ii) $F$ can have either a finite or a countably infinite number of jump discontinuities.

Example 2 A random variable $X$, defined on some probability space $(\Omega, \mathcal{A}, P)$, has the Geometric distribution with parameter $\theta$ ($\in (0, 1)$) if its probability mass function is given by

\[
P\{X = x\} = \begin{cases} \theta(1 - \theta)^x, & \text{if } x = 0, 1, 2, \ldots \\ 0, & \text{otherwise.} \end{cases}
\]

The corresponding distribution function is

\[
F(x) = P\{X \le x\} = \begin{cases}
0, & \text{if } x < 0 \\
\theta, & \text{if } 0 \le x < 1 \\
\theta + \theta(1 - \theta), & \text{if } 1 \le x < 2 \\
\theta + \theta(1 - \theta) + \theta(1 - \theta)^2, & \text{if } 2 \le x < 3 \\
\quad \vdots \\
\sum_{k=0}^{n-1} \theta(1 - \theta)^k = 1 - (1 - \theta)^n, & \text{if } n - 1 \le x < n \\
\quad \vdots
\end{cases}
\]

Clearly, $F$ has a countably infinite number of jumps, one at each non-negative integer. ♦

Definition 1 The Lebesgue-Stieltjes measure $\mu_F$, induced by a continuous distribution function $F$, is the unique probability measure defined, on the Borel σ-field $\mathcal{B}_{\mathbb{R}}$, by

\[
\mu_F(a, b] := F(b) - F(a), \tag{L-S}
\]

for all bounded left-open right-closed subintervals $(a, b]$ of $\mathbb{R}$.
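As a rough numerical illustration of Example 2 and formula (L-S), the following Python sketch evaluates the Geometric distribution function and the mass $F(b) - F(a)$ assigned to an interval $(a, b]$. The function names `geometric_cdf` and `ls_measure` and the value $\theta = 0.3$ are assumptions made for this sketch only; they do not come from the module. The formula is applied here to a discrete $F$ purely for illustration.

```python
import math


def geometric_cdf(x: float, theta: float) -> float:
    """F(x) = 1 - (1 - theta)^n for n - 1 <= x < n, and 0 for x < 0."""
    if x < 0:
        return 0.0
    n = math.floor(x) + 1  # the integer n with n - 1 <= x < n
    return 1.0 - (1.0 - theta) ** n


def ls_measure(cdf, a: float, b: float) -> float:
    """Mass assigned to the bounded interval (a, b], namely F(b) - F(a)."""
    return cdf(b) - cdf(a)


theta = 0.3  # illustrative parameter value (an assumption)


def F(x: float) -> float:
    return geometric_cdf(x, theta)


# The mass of (k - 1, k] is the jump of F at k, which should agree with
# the probability mass function theta * (1 - theta)^k.
for k in range(4):
    jump = ls_measure(F, k - 1, k)
    pmf = theta * (1.0 - theta) ** k
    print(k, round(jump, 6), round(pmf, 6))
```

The two printed columns should agree for each $k$, reflecting the fact that the jump of $F$ at a non-negative integer $k$ equals $P\{X = k\}$.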
Some explanation of this definition is warranted:

(iii) The collection of bounded left-open right-closed subintervals of $\mathbb{R}$, $\mathcal{P} := \{(a, b] : -\infty < a < b < \infty\}$, is a π-system generating $\mathcal{B}_{\mathbb{R}}$, and hence, even though $\mu_F$ is defined only for the bounded intervals, it is still completely determined by $F$ through (L-S).

(iv) In defining the Lebesgue-Stieltjes measure for continuous distribution functions, we have been restrictive. Indeed, the definition applies to discrete distribution functions as well. In this case, the domain of $\mu_F$ need not be the Borel σ-field $\mathcal{B}_{\mathbb{R}}$; the power class $2^{\mathbb{R}}$ can serve the purpose. For continuous distribution functions, however, $\mu_F$ (as defined by (L-S)) cannot measure arbitrary subsets of $\mathbb{R}$, and the restriction to $\mathcal{B}_{\mathbb{R}}$ (or something similar) is necessary.

Example 3 Let $X$ be the point mass at $c$ ($\in \mathbb{R}$). Then the distribution function of $X$ is

\[
F(x) = \begin{cases} 0, & \text{if } x < c \\ 1, & \text{if } x \ge c. \end{cases}
\]

Define the induced measure $\mu_F =: \delta_c$ by

\[
\delta_c(S) := \begin{cases} 1, & \text{if } S \ni c \\ 0, & \text{if } S \not\ni c, \end{cases}
\]

for sets $S$. The measure $\delta_c$ is often called the (Dirac) delta measure concentrated at $c$. [While not relevant to the point in hand, we state that if $S$ is kept fixed and $c$ is thought of as the argument, then $\delta_c(S) = \mathbf{1}_S(c)$, the indicator of $S$.] ♦

Definition 2 A distribution function $F$ is said to be singular if there exists a linear Borel set $B$ of Lebesgue measure 0 such that $\mu_F(B) = 1$, where $\mu_F$ is the Lebesgue-Stieltjes measure induced by $F$.

Singular functions form an interesting class of functions, and we discuss a few of their properties:

(v) A good look at the definition will reveal that all discrete distributions are singular: the Geometric distribution of Example 2 concentrates all its mass on the set of non-negative integers $\mathbb{N} \cup \{0\}$, and it is well known that$^1$ $\mathrm{Leb}(\mathbb{N} \cup \{0\}) = 0$; similarly, the point mass at $c$ (Example 3) assumes, with probability 1, the value $c$, and $\mathrm{Leb}\{c\} = 0$. The real objects of attention are the continuous singular distribution functions, such as the Cantor function (cf. Distribution Functions I [Example 2]). The following absolutely continuous distribution function

\[
F(x) = \frac{1}{2} + \frac{1}{\pi} \arctan x, \quad \forall x \in \mathbb{R},
\]

is, clearly, not singular.

(vi) An equivalent characterization of singularity is that a distribution function is singular if and only if its derivative is zero almost everywhere (with respect to Lebesgue measure). This fact can be tested on the functions mentioned in the last remark.

$^1$ Leb is Lebesgue measure on $\mathcal{B}_{\mathbb{R}}$.

Theorem 1 Every distribution function $F$ can be written in the form

\[
F = \pi_1 F_d + \pi_2 F_{ac} + \pi_3 F_s,
\]

where each $\pi_j \ge 0$ and $\sum_{j=1}^{3} \pi_j = 1$, and $F_d, F_{ac}, F_s$ are, respectively, a discrete, an absolutely continuous, and a continuous singular distribution function. Furthermore, such a decomposition is unique.

We do not prove Theorem 1, but instead consider the simpler Theorem 2. For that, we need the following important Lemma.

Lemma 1 The set of discontinuity points of a distribution function $F$ is at most countable.

Proof: Consider the intervals of the form $(\ell, \ell + 1]$, for $\ell \in \mathbb{Z}$. Let $x_1, x_2, \ldots, x_n$ be any $n$ points of discontinuity of $F$ in $(\ell, \ell + 1]$, such that $F$ has a jump of magnitude at least $\frac{1}{m}$ (for some positive integer $m$) at each $x_j$. If the $x_j$'s are ordered (i.e., $x_1 < x_2 < \cdots < x_n$), then, by the non-decreasing property of $F$,

\[
F(\ell) \le F(x_1 - 0) < F(x_1) \le F(x_2 - 0) < F(x_2) \le \cdots \le F(x_n - 0) < F(x_n) \le F(\ell + 1).
\]

Let $p_k$ be the jump at $x_k$; then $p_k = F(x_k) - F(x_k - 0)$, for $k = 1, 2, \ldots, n$. Therefore,

\[
\sum_{k=1}^{n} p_k = \sum_{k=1}^{n} \bigl[ F(x_k) - F(x_k - 0) \bigr] \le F(\ell + 1) - F(\ell).
\]

Since $p_k \ge \frac{1}{m}$ for all $k = 1, 2, \ldots, n$, we have

\[
\frac{n}{m} \le F(\ell + 1) - F(\ell) \iff n \le m \cdot \bigl[ F(\ell + 1) - F(\ell) \bigr] \le m.
\]

This bound holds for every finite collection of such points, so it remains valid even if we consider all points of discontinuity in $(\ell, \ell + 1]$ with jump at least $\frac{1}{m}$, and not just a finite number of them. Thus, for each $m$, each interval of unit length contains only a finite number of jumps of magnitude at least $\frac{1}{m}$. Every jump has magnitude at least $\frac{1}{m}$ for some $m$, so, as $m$ assumes the values $1, 2, \ldots$, we see that the set of discontinuity points in $(\ell, \ell + 1]$ is a countable union of finite sets, and hence at most countable.
Note that

\[
\mathbb{R} = \bigcup_{\ell \in \mathbb{Z}} (\ell, \ell + 1],
\]

so the set of all points of discontinuity of $F$, being a countable union of at most countable sets, must be at most countable.

Theorem 2 Every distribution function $F$ can be written in the form

\[
F = \kappa_1 F_d + \kappa_2 F_c,
\]

where each $\kappa_j \ge 0$ and $\sum_{j=1,2} \kappa_j = 1$, $F_d$ is, as before, a discrete distribution function, and $F_c$ is a continuous (though not necessarily absolutely continuous) distribution function. Moreover, this decomposition is unique.

Proof: By Lemma 1, the set of discontinuity points of $F$ is at most countable: let $r_1, r_2, \ldots$ be the points of discontinuity. Let $p(r_k) := F(r_k) - F(r_k - 0)$, and define

\[
E_d(x) := \sum_{\{k \,:\, r_k \le x\}} p(r_k), \quad \text{for } x \in \mathbb{R},
\]

and $E_c(x) := F(x) - E_d(x)$. These definitions imply that $E_d$ is a step function and $E_c$ is continuous everywhere on $\mathbb{R}$. Moreover, the functions $E_d$ and $E_c$ satisfy all the properties of a distribution function, except that we can only assert $E_d(\infty) \le 1$ and $E_c(\infty) \le 1$. $E_d(\infty) = 1$ iff all the mass of $F$ is concentrated at the jumps, in which case $F$ is discrete and we have $F = 1 \cdot E_d + 0$. Similarly, $E_c(\infty) = 1$ iff $F$ has no jumps at all, and $F$ is continuous with $F = 0 + 1 \cdot E_c$. For all other situations, $0 < E_d(\infty), E_c(\infty) < 1$. Define, for all real $x$,

\[
F_d(x) := \frac{E_d(x)}{E_d(\infty)} \quad \text{and} \quad F_c(x) := \frac{E_c(x)}{E_c(\infty)}.
\]

These two functions are distribution functions. Note that, by definition, $E_d(x) + E_c(x) = F(x)$ for all $x \in \mathbb{R}$, and hence $E_d(\infty) + E_c(\infty) = F(\infty) = 1$. Therefore, setting $\kappa_1 = E_d(\infty)$ and $\kappa_2 = E_c(\infty)$, we have

\[
F = \kappa_1 F_d + \kappa_2 F_c.
\]

Now we prove uniqueness of this decomposition. It suffices to prove that $E_d$ and $E_c$ are uniquely determined, since these, in turn, uniquely determine the corresponding distributions. Let $E_d^* + E_c^*$ be another decomposition of $F$. Then,

\[
E_d + E_c = E_d^* + E_c^* \iff E_c - E_c^* = E_d^* - E_d.
\]

The difference between two continuous functions is continuous, and the difference between two step functions is, again, a step function. A function that is at once continuous and a step function can have no jumps, so the equality $E_c - E_c^* = E_d^* - E_d$ presents a contradiction unless both sides are identically zero. We must therefore have $E_c = E_c^*$ and $E_d = E_d^*$, and thus the decomposition is unique.
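The construction in the proof of Theorem 2 can be traced numerically. The sketch below is only an illustration under stated assumptions: the mixed distribution function `F` (jumps of 0.3 at 0 and 0.2 at 1, with the remaining mass spread uniformly over $[0, 2]$) is hypothetical, its discontinuity points are taken as known, and the left limit $F(r - 0)$ is approximated by a small offset `eps` rather than computed exactly.

```python
def F(x: float) -> float:
    """Hypothetical mixed distribution function, used only for illustration."""
    discrete = 0.3 * (x >= 0.0) + 0.2 * (x >= 1.0)    # jumps at 0 and 1
    continuous = 0.5 * min(max(x / 2.0, 0.0), 1.0)    # uniform part on [0, 2]
    return discrete + continuous


POINTS = [0.0, 1.0]  # discontinuity points r_k, assumed known in advance


def jump(r: float, eps: float = 1e-9) -> float:
    """Approximate p(r) = F(r) - F(r - 0) using a small left offset."""
    return F(r) - F(r - eps)


def E_d(x: float) -> float:
    """Step part: the sum of the jumps at points r_k <= x."""
    return sum(jump(r) for r in POINTS if r <= x)


def E_c(x: float) -> float:
    """Continuous part: what remains of F after removing the jumps."""
    return F(x) - E_d(x)


kappa1 = E_d(float("inf"))  # total mass sitting at the jumps
kappa2 = E_c(float("inf"))  # remaining mass


def F_d(x: float) -> float:
    return E_d(x) / kappa1


def F_c(x: float) -> float:
    return E_c(x) / kappa2


# Recombining the two normalized parts should reproduce F.
for x in (-1.0, 0.0, 0.5, 1.0, 1.5, 3.0):
    print(x, round(F(x), 6), round(kappa1 * F_d(x) + kappa2 * F_c(x), 6))
```

In this example both weights come out close to 0.5, and `F_c` is approximately the uniform distribution function on $[0, 2]$. The division by `kappa1` and `kappa2` assumes both are strictly positive, which corresponds to the case $0 < E_d(\infty), E_c(\infty) < 1$ in the proof.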
