
Asymptotic Theory in Statistics

Lecture Notes for Stat522B

Jiahua Chen
Department of Statistics
University of British Columbia

Course Outline

A number of asymptotic results in statistics will be presented: concepts of stochastic order, the classical laws of large numbers and the central limit theorem; the large sample behaviour of the empirical distribution and sample quantiles.

Prerequisite: Stat 460/560 or permission of the instructor.

Topics:

• Review of probability theory, probability inequalities.

• Modes of convergence, stochastic order, laws of large numbers.

• Results on asymptotic normality.

• Empirical distribution, moments and quantiles

• Smoothing method

• Asymptotic Results in Finite Mixture Models

Assessment: Students will be expected to work on 20 assignment problems plus a research report on a topic of their own choice.

Contents

1 Brief preparation in probability theory
  1.1 Measure and measurable space
  1.2 Probability measure and random variables
  1.3 Conditional expectation
  1.4 Independence
  1.5 Assignment problems

2 Fundamentals in Asymptotic Theory
  2.1 Mode of convergence
  2.2 Uniform Strong law of large numbers
  2.3 Convergence in distribution
  2.4 Central limit theorem
  2.5 Big and small o, Slutsky's theorem
  2.6 Asymptotic normality for functions of random variables
  2.7 Sum of random number of random variables
  2.8 Assignment problems

3 Empirical distributions, moments and quantiles
  3.1 Properties of sample moments
  3.2 Empirical distribution function
  3.3 Sample quantiles
  3.4 Inequalities on bounded random variables
  3.5 Bahadur's representation

4 Smoothing method
  4.1 Kernel density estimate
    4.1.1 Bias of the kernel density estimator
    4.1.2 Variance of the kernel density estimator
    4.1.3 Asymptotic normality of the kernel density estimator
  4.2 Non-parametric regression
    4.2.1 Kernel regression estimator
    4.2.2 Local estimator
    4.2.3 Asymptotic bias and variance for fixed design
    4.2.4 Bias and variance under random design
  4.3 Assignment problems

5 Asymptotic Results in Finite Mixture Models
  5.1 Finite mixture models
  5.2 Test of homogeneity
  5.3 Binomial mixture example
  5.4 C(α) test
    5.4.1 The generic C(α) test
    5.4.2 C(α) test for homogeneity
    5.4.3 C(α) under NEF-QVF
    5.4.4 Expressions of the C(α) statistics for NEF-QVF mixtures
  5.5 Brute-force likelihood ratio test for homogeneity
    5.5.1 Examples
    5.5.2 The proof of Theorem 5.2

Chapter 1

Brief preparation in probability theory

1.1 Measure and measurable space

Measure theory is motivated by the desire to measure the length, area or volume of subsets in a space Ω under consideration. However, unless Ω is finite, the number of possible subsets of Ω is very large. In most cases, it is not possible to define a measure on all subsets so that it has the desirable properties and is consistent with common notions of length, area and volume. Consider the one-dimensional Euclidean space R consisting of all real numbers, and suppose that we want to give a length measurement to each subset of R. For an ordinary interval (a, b] with b > a, it is natural to define its length as

µ((a, b]) = b − a,

where µ is the notation for measuring the length of a set. Let Ii = (ai, bi] and A = ∪i Ii, and suppose ai ≤ bi < a_{i+1} for all i = 1, 2, .... It is natural to require µ to have the property that

µ(A) = ∑_{i=1}^{∞} (bi − ai).

That is, we are imposing a rule on measuring the length of the subsets of R.


Naturally, if the lengths of Ai, i = 1,2,... have been defined, we want

µ(∪_{i=1}^{∞} Ai) = ∑_{i=1}^{∞} µ(Ai),   (1.1)

when the Ai are mutually exclusive. The above discussion shows that a measure might be introduced by first assigning measurements to simple subsets, and then be extended by applying the additive rule (1.1) to assign measurements to more complex subsets. Unfortunately, this procedure often does not extend the domain of the measure to all possible subsets of Ω. Instead, we can identify the maximum collection of subsets that a measure can be extended to. This collection of sets is closed under countable union. The notion of σ-algebra seems to be the result of such a consideration.

Definition 1.1 Let Ω be a space under consideration. A class of subsets F is called a σ-algebra if it satisfies the following three conditions:
(1) The empty set ∅ ∈ F;
(2) If A ∈ F, then Aᶜ ∈ F;
(3) If Ai ∈ F, i = 1, 2, ..., then their union ∪_{i=1}^{∞} Ai ∈ F.

Note that property (3) is only applicable to a countable number of sets. When Ω = R and F contains all intervals, the smallest possible such σ-algebra F is called the Borel σ-algebra, and all the sets in F are called Borel sets. We denote the Borel σ-algebra as B. Even though not every subset of real numbers is a Borel set, statisticians rarely have to consider non-Borel sets in their research. As a side remark, the domain of a measure on R such that µ((a, b]) = b − a can be extended beyond the Borel σ-algebra, for instance to the Lebesgue σ-algebra. When a space Ω is equipped with a σ-algebra F, we call (Ω, F) a measurable space: it has the potential to be equipped with a measure. A measure is formally defined as a set function on F with some properties.

Definition 1.2 Let (Ω, F) be a measurable space. A set function µ defined on F is a measure if it satisfies the following three properties.
(1) For any A ∈ F, µ(A) ≥ 0;
(2) The empty set ∅ has 0 measure;

(3) It is countably additive:

µ(∪_{i=1}^{∞} Ai) = ∑_{i=1}^{∞} µ(Ai)

when the Ai are mutually exclusive.

We have to restrict the additivity to a countable number of sets. This restriction results in a strange fact in probability theory. If a random variable is continuous, then the probability that this random variable takes any specific real value is zero. At the same time, the chance for it to fall into some interval (which is made of individual values) can be larger than 0. The definition of a measure disallows adding up over all the real values in the interval to form the probability of the interval.

In measure theory, the measure of a subset is allowed to be infinity. We assume that ∞ + ∞ = ∞ and so on. If we let µ(A) = ∞ for every non-empty set A, this set function satisfies the conditions for a measure. Such a measure is probably not useful. Even if some sets possess infinite measure, we would like to have a sequence of mutually exclusive sets such that every one of them has finite measure and their union covers the whole space. We call this kind of measure σ-finite. Naturally, σ-finite measures have many other mathematical properties that are convenient in applications.

When a space is equipped with a σ-algebra F, the sets in F have the potential to be measured. Hence, we have a measurable space (Ω, F). After a measure ν is actually assigned, we obtain a measure space (Ω, F, ν).
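As a quick numerical illustration of countable additivity, the following minimal Python sketch uses the standard normal probability measure on (R, B); the particular disjoint intervals and the truncation point are arbitrary choices for illustration only.

```python
# Countable additivity, illustrated with the standard normal probability
# measure: the intervals (i, i+1], i = 0, 1, 2, ..., are mutually exclusive
# and their union is (0, infinity).
from scipy.stats import norm

# measure of the union (0, infinity)
mu_union = 1.0 - norm.cdf(0.0)

# sum of the measures of the disjoint pieces (truncated at a large index)
mu_sum = sum(norm.cdf(i + 1.0) - norm.cdf(i) for i in range(50))

print(mu_union, mu_sum)   # the two values agree up to truncation error
```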

1.2 Probability measure and random variables

To a mathematician, a probability measure P is merely a specific measure: it assigns measure 1 to the whole space. The whole space is now called the sample space, which denotes the set of all possible outcomes of an experiment. Individual possible outcomes are called sample points. For theoretical discussion, a specific experimental setup is redundant in probability theory. In fact, we do not mention the sample space at all.

In statistics, the focus is on functions defined on the sample space Ω, and these functions are called random variables. Let X be a random variable. The desire of computing the probability of {ω : X(ω) ∈ B} for a Borel set B makes it necessary to have {ω : X(ω) ∈ B} ∈ F. These considerations motivate the definition of a random variable.

Definition 1.3 A random variable X is a real valued function on the probability space (Ω, F, P) such that {ω : X(ω) ∈ B} ∈ F for all Borel sets B.

In plain words, random variables are F -measurable functions. Interestingly, this definition rules out the possibility for X to take infinity as its value and implies the cumulative distribution function defined as

F(x) = P(X ≤ x)

has limit 1 when x → ∞. A one-dimensional function F(x) is a cumulative distribution function of some random variable if and only if

1. limx→−∞ F(x) = 0; limx→∞ F(x) = 1.

2. F(x) is a non-decreasing, right continuous function.

Note also that with each random variable defined, we could define a corresponding probability measure PX on the real space such that

PX (B) = P(X ∈ B).

We have hence obtained an induced measure on R. At the same time, the collection of sets {X ∈ B}, B ∈ B, is also a σ-algebra. We call it σ(X); it is a sub-σ-algebra of F.

Definition 1.4 Let X be a random variable on a probability space (Ω,F ,P). We define σ(X) to be the smallest σ-algebra such that

{X ∈ B} ∈ σ(X) for all B ∈ B.

It is seen that the sum of two random variables is also a random variable. All commonly used functions of random variables are also random variables. That is, they remain F-measurable.

The rigorous definitions of integration and expectation are involved. Let us assume that for a measurable function f(·) ≥ 0 on a measure space (Ω, F, ν), a simple definition of the integration

∫ f(·) dν

is available. A general function f can be written as f⁺ − f⁻, the difference between its positive and negative parts. The integration of this function is the difference between two separate integrations,

∫ f dν = ∫ f⁺ dν − ∫ f⁻ dν,

unless we are in the situation of ∞ − ∞, in which case the integration is said to not exist. The expectation of a function of a random variable X is simply

∫ f(X(·)) dP = ∫ f(·) dPX,

again excluding the ∞ − ∞ situation. Note that the integrations on the two sides are with respect to two different measures. The above equality is jokingly called the law of the unconscious statistician; it is also called the change-of-variable formula.

The integration taught in undergraduate calculus courses is called Riemann integration. Most properties of Riemann integration remain valid for this measure-theory-based integration. The new integration makes more functions integrable. Under the new definition (even though we did not really give one), it becomes unnecessary to separately define the expectation of continuous random variables and the expectation of discrete random variables. Without a unified definition, the commonly accepted formulas such as

E(X + Y) = E(X) + E(Y)

are unprovable.

The concept of the Radon-Nikodym derivative is hard for many, but it is handy, for example, when we have to work with discrete and continuous random variables on the same platform. Suppose ν and λ are two σ-finite measures on the measurable space (Ω, F). We say λ is dominated by ν if for any F-measurable set A, ν(A) = 0 implies λ(A) = 0. We use the notation λ << ν. Note that this definition is σ-algebra F dependent. The famous Radon-Nikodym Theorem is as follows.

Theorem 1.1 Let ν and λ be two σ-finite measures on (Ω, F). If λ << ν, then there exists a non-negative F-measurable function f such that

λ(A) = ∫_A f dν

for all A ∈ F. We call f the Radon-Nikodym derivative of λ with respect to ν.

If λ is a probability measure, then f can be chosen to be non-negative, and it is called the density function of λ with respect to ν. The commonly referred to probability density function is the derivative of the cumulative distribution function of an absolutely continuous random variable with respect to Lebesgue measure. The probability mass function is the derivative of the cumulative distribution function of an integer valued random variable with respect to the counting measure. One such example is the probability mass function of the Poisson distribution.

Essentially, a measure assigns a non-negative value to every member of the σ-field on which it is defined, and possesses some properties. Lebesgue measure assigns to every interval a value equal to its length. The value of every other set in the σ-field is derived from the rules (properties) for a measure. In comparison, under the counting measure, which we often do not explicitly define, each set containing a single integer has measure 1, and any set with a finite number of integers has measure equal to the number of integers it contains.
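The change-of-variable formula and the role of the probability mass function as a density with respect to counting measure can be checked numerically. The following is a minimal Python sketch; the Poisson(3) model and the choice g(x) = x² are arbitrary illustrative assumptions.

```python
# E{g(X)} computed two ways: by averaging g(X(.)) over sampled outcomes
# (the left side of the change-of-variable formula) and by summing g against
# the pmf, which is the Radon-Nikodym derivative of P_X with respect to the
# counting measure (the right side).
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(1)
lam = 3.0
g = lambda x: x ** 2

# Monte Carlo approximation of the integral of g(X(.)) dP
mc = g(rng.poisson(lam, size=200_000)).mean()

# integral of g(.) dP_X = sum over integers of g(x) * pmf(x)
x = np.arange(0, 60)
exact = np.sum(g(x) * poisson.pmf(x, lam))   # equals lam + lam**2 = 12

print(mc, exact)
```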

1.3 Conditional expectation

The concept of expectation is developed as the theoretical average size of a random variable. The word “conditional” has a meaning tied more tightly with probability. When we focus on a subset of the sample space and rescale the probability on this event to 1, we get a conditional probability measure. In elementary probability theory, the conditional expectation is also the average size of a random variable where we only examine its behaviour when its outcome is in a pre-specified subset of the sample space.

To understand the advanced notion of conditional expectation, we start with an indicator random variable. By taking values 1 and 0 only, an indicator random variable I_A divides the sample space into two pieces: A and Aᶜ. The conditional expectation of a random variable X given I_A = 1 is the average size of X when A occurs. Similarly, the conditional expectation of X given I_A = 0 is the average size of X over Aᶜ. Thus, the random variable I_A partitions Ω into two pieces, and we compute the conditional expectation of X over each piece. We may use a random variable Y to cut the sample space into more pieces and compute conditional expectations over each. Consequently, the conditional expectation of X given Y becomes a function: it takes different values on different pieces of the sample space, and the partition is created by the random variable Y.

If the random variable Y is not discrete, it does not partition the sample space neatly. A general random variable Y does not neatly partition the sample space into countably many mutually exclusive pieces, and computing the average size of X given the size of Y is hard to imagine. At this point, we may realize that the concept of σ-algebra can be helpful. In fact, with the concept of σ-algebra, we define the conditional expectation of X without the help of Y. The conditional expectation is not given by a constructive definition, but by a conceptual requirement on what properties it should have.

Definition 1.5 The conditional expectation of a random variable X given a σ-algebra A, E(X|A), is an A-measurable function such that

∫_A E(X|A) dP = ∫_A X dP

for every A ∈ A.

If Y is a random variable, then we define E(X|Y) as E(X|σ(Y)). It turns out such a function exists and is practically unique whenever the expectation of X exists. The conditional expectation defined in elementary probability theory does not contradict this definition. In view of this new definition, we must have E{E(X|Y)} = E(X). This formula is true by definition! We regret that the above definition is not too useful for computing E(X|Y) when given two random variables X and Y.

When working with conditional expectation under measure theory, we should remember that the conditional expectation is a random variable. It is regarded as non-random with respect to the σ-algebra in its conditioning argument. Most formulas in elementary probability theory have their measure theory versions. For example, we have E[g(X)h(Y)|Y] = h(Y)E[g(X)|Y] whenever the relevant quantities exist.

The definition of conditional probability can be derived from the conditional expectation. For any event A, we note that P(A) = E{I_A}. Hence, we regard the conditional probability P(A|B) as the value of E{I_A|I_B} when the sample point ω ∈ B. To take it to the extreme, many probabilists advocate forgoing the probability operation altogether.
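The identity E{E(X|Y)} = E(X) can be seen in a simple discrete case by a short simulation. The model below (X = Y plus standard normal noise with Y Bernoulli) and the sample size are arbitrary choices for illustration.

```python
# Numerical check that E{E(X|Y)} = E(X) when Y is a simple discrete random
# variable: E(X|Y) is approximated by the average of X over each piece
# {Y = 0} and {Y = 1} of the sample space.
import numpy as np

rng = np.random.default_rng(2)
n = 500_000
Y = rng.integers(0, 2, size=n)          # Y takes values 0 and 1
X = Y + rng.standard_normal(n)          # X depends on Y plus noise

# E(X|Y) as a function of Y: conditional averages over the two pieces
cond_mean = np.array([X[Y == 0].mean(), X[Y == 1].mean()])
EX_given_Y = cond_mean[Y]               # a sigma(Y)-measurable random variable

print(EX_given_Y.mean(), X.mean())      # the two averages nearly coincide
```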

1.4 Independence

Probability theory becomes a discipline of its own, rather than merely a special case of measure theory, largely due to some special notions dear to probabilistic thinking.

Definition 1.6 Let (Ω, F, P) be a probability space. Two events A, B ∈ F are independent if and only if P(AB) = P(A)P(B). Let F₁ and F₂ be two sub-σ-algebras of F. They are independent if and only if A is independent of B for all A ∈ F₁ and B ∈ F₂. Let X and Y be two random variables. We say that X and Y are independent if and only if σ(X) and σ(Y) are independent of each other.

Conceptually, when A and B are two independent events, then P(A|B) = P(A) by the definition in elementary probability theory textbooks. Yet one cannot replace P(AB) = P(A)P(B) in the above independence definition by P(A|B) = P(A): it becomes problematic when, for example, P(B) = 0.

Theorem 1.2 Two random variables X and Y are independent if and only if

P(X ≤ x,Y ≤ y) = P(X ≤ x)P(Y ≤ y) (1.2) for any real numbers x and y.

The generalization to a countable number of random variables can be done easily. A key point is that pairwise independence is not sufficient for full independence, as the numerical sketch below illustrates.
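The classical construction with two independent random signs and their product is one standard example of this phenomenon; the Python sketch below checks it numerically (the seed and sample size are arbitrary).

```python
# X, Y independent signs; Z = X*Y.  Each pair (X,Y), (X,Z), (Y,Z) is
# independent, yet (X, Y, Z) are not mutually independent since Z = X*Y.
import numpy as np

rng = np.random.default_rng(3)
n = 1_000_000
X = rng.choice([-1, 1], size=n)
Y = rng.choice([-1, 1], size=n)
Z = X * Y

def p(event):
    return event.mean()

# pairwise factorization holds: both numbers are close to 0.25
print(p((X == 1) & (Z == 1)), p(X == 1) * p(Z == 1))

# joint factorization fails: P(X=1, Y=1, Z=1) = 0.25 but the product is 0.125
print(p((X == 1) & (Y == 1) & (Z == 1)), p(X == 1) * p(Y == 1) * p(Z == 1))
```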

1.5 Assignment problems

1. Let X be a random variable having the Poisson distribution with mean µ = 1, Y be a random variable having the standard normal distribution, and W be a random variable such that P(W = 0) = P(W = 1) = 0.5. Assume X, Y and W are independent. Construct a measure ν(·) such that it dominates the probability measure induced by WX + (1 −W)Y.

2. Let the space Ω be the set of all real numbers. Suppose a σ-algebra F contains all half intervals of the form (−∞, x] for all real numbers x. Show that F contains all singleton sets {x}.

3. Let B be the Borel σ-algebra on R and that Y is a random variable. Verify that σ(Y) = {Y −1(B) : B ∈ B} is a σ-algebra, where Y −1(B) = {ω : Y(ω) ∈ B}.

4. From the measurability point of view, show that if X and Y are two random variables, then X + Y and XY are also random variables. Give an example where X/Y is not a random variable if the definition in Section 1.2 is rigorously interpreted.

5. Prove that if F(x) is a cumulative distribution function of some random variable, then

lim_{x→−∞} F(x) = 0,  lim_{x→∞} F(x) = 1.

6. Assume that g(·) is a measurable function and Y is a random variable. Assume both E(Y) and E{g(Y)} exist. Prove or disprove that

E{g(Y)|Y} = g(Y); E{Y|g(Y)} = Y.

7. Assume all relevant expectations exist. Show that

E[g(X)h(Y)|Y] = h(Y)E[g(X)|Y]

provided that both g and h are measurable functions. The equality may be interpreted as valid except on a measurable zero-probability event.

8. Define VAR(X|Y) = E[{X − E(X|Y)}²|Y]. Show that

VAR(X) = E{VAR(X|Y)} + VAR{E(X|Y)}.

9. Prove Theorem 1.2.

10. Prove that if X and Y are independent random variables, and h and g are two measurable functions,

E[h(X)g(Y)] = E[h(X)]E[g(Y)]

under the assumption that all expectations exist and are finite.

11. Suppose X and Y are jointly normally distributed with means 0, variances 1, and correlation coefficient ρ. Verify that E(X|Y) = ρY.

Remark: rigorous proofs of some assignment problems may need some knowledge beyond what has been presented in this chapter. It is hard to clearly state what results should be assumed. Hence, we have to leave a big dose of ambiguity here. Nevertheless, these problems show that some commonly accepted results are not self-evident; they are in fact rigorously established somewhere.

Chapter 2

Fundamentals in Asymptotic Theory

Other than a few classical results, the exact distributional property of a statistic or other random object is often hard to determine to the last detail. A good approximation to the exact distribution is very useful in investigating the properties of various statistical procedures. In statistical applications, many observations, say n of them, from the same probability model/population are often assumed available. Good approximations are possible when the number of repeated observations is large. A theory developed for the situation where the number of observations is large forms the asymptotic theory.

In asymptotic theory, we work hard to find the limiting distribution of a random quantity sequence Tn as n → ∞. Such results are sometimes interesting in their own right. In statistical applications, we do not really have the sample size n increasing as time goes on, much less increasing without bound. If so, why should we care about the limit, which is usually attained only when n = ∞? My answer is similar to the answer to the use of a tangent line to replace a segment of a smooth curve in calculus. If f(x) is a smooth function in a neighbourhood of x = 0, we have approximately

f(x) ≈ f(0) + f′(0)x.

While the approximation may never be exact unless x = 0, we are comfortable claiming that if the approximation is precise enough at x = 0.1, it will be precise enough for |x| ≤ 0.1. In asymptotic theory, if the limiting distribution approximates the finite sample distribution when n = 100 well enough, we are confident that when n > 100, the approximation will likely be more accurate. In this situation, we are comfortable using the limiting distribution in place of the exact distribution. In this chapter, we introduce some classical notions and results on limiting processes.

2.1 Mode of convergence

Let X1,X2,...,Xn,... be a sequence of random variables defined on a probability space with sample space Ω, σ-algebra F , and probability measure P. Recall that every random variable is a real valued function. Thus, a sequence of random variables is also a sequence of functions. At each sample point ω ∈ Ω, we have a sequence of real numbers:

X₁(ω), X₂(ω), ....

For some ω, the limit of the above sequence may exist; for some other ω, the limit may not exist. Let A ⊂ Ω be the set of ω at which the above sequence converges. It can be shown that A is measurable. Let X be a random variable such that for each ω ∈ A, X(ω) = lim_{n→∞} Xn(ω).

Definition 2.1 Convergence almost surely: If P(A) = 1, we say that {Xn}_{n=1}^{∞} converges almost surely to X. In notation, Xn → X a.s.

A minor point is that the limit X is unique only up to a zero probability event under the conditions in the above definition. If another random variable Y differs from X only on a zero probability event, then we also have Xn → Y almost surely.

Proving the almost sure convergence of a random variable sequence is often hard. A weaker version of convergence is much easier to establish. Let X and {Xn}_{n=1}^{∞} be a random variable and a sequence of random variables defined on a probability space. In the weak version of convergence, we examine the probability of the difference X − Xn being large.

Definition 2.2 Convergence in probability. Suppose that for any δ > 0,

lim_{n→∞} P{|Xn − X| ≥ δ} = 0.

Then we say that Xn converges to X in probability. In notation, Xn →p X.

Conceptually, the mode of almost sure convergence keeps track of the values of the random variables at the same sample point on and on. It requires the convergence of Xn(ω) at almost all sample points. If you find “almost all sample points” too tricky, simply interpret it as “all sample points” and you are not too far from the truth. The mode of convergence in probability requires that the event on which Xn and X differ by more than a fixed amount shrinks in probability. This event is n dependent: it is one event when n = 10 and it is another when n = 11, and so on. In other words, we have a moving target as n evolves when defining convergence in probability. Because of this, convergence in probability does not imply convergence of Xn(ω) for any ω ∈ Ω. The following classical example is a vivid illustration of this point.

Example 2.1 Let Ω = [0, 1], the unit interval of real numbers. Let F be the classical Borel σ-algebra on [0, 1] and P be the uniform probability measure. For m = 0, 1, 2, ..., and k = 0, 1, ..., 2^m − 1, let

X_{2^m+k}(ω) = 1 when k < 2^m ω ≤ k + 1, and X_{2^m+k}(ω) = 0 otherwise.

In plain words, we have defined a sequence of random variables made of indicator functions on intervals of shrinking length 2^{−m}. Yet for each m, the union of the 2^m intervals, as k goes from 0 to 2^m − 1, completely covers the sample space [0, 1]. It is seen that P(|Xn| > 0) ≤ 2^{−m} with m > log n/log 2 − 1. Hence, as n → ∞, P(|Xn| > 0) → 0. This implies Xn → 0 in probability. At the same time, the sequence

X₁(ω), X₂(ω), X₃(ω), ...

contains infinitely many 0's and infinitely many 1's for any ω. Thus no such sequence converges. In other words,

P({ω : Xn(ω) converges}) = 0.

Hence, Xn does not converge to 0 in the mode of “almost surely”.
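A short simulation sketch of Example 2.1 follows; the particular sample point ω and the range of n are arbitrary choices.

```python
# The "typewriter" sequence of Example 2.1: X_{2^m + k} is the indicator of
# the interval (k 2^{-m}, (k+1) 2^{-m}].  For a fixed omega the sequence hits
# the value 1 once in every block m, although P(X_n = 1) -> 0.
import numpy as np

def X(n, omega):
    m = int(np.floor(np.log2(n)))       # n = 2^m + k with 0 <= k < 2^m
    k = n - 2 ** m
    return 1 if k < (2 ** m) * omega <= k + 1 else 0

omega = 0.371                            # any fixed sample point in (0, 1]
path = [X(n, omega) for n in range(1, 2 ** 12)]
print(sum(path))                         # exactly one 1 per block m: 12 ones

# P(X_n = 1) equals 2^{-m}, which tends to zero as n grows
for n in (10, 100, 1000, 10000):
    m = int(np.floor(np.log2(n)))
    print(n, 2.0 ** (-m))
```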

Due to the finiteness of the probability measure, if a sequence of random variables Xn converges to X almost surely, then Xn also converges to X in probability. If a sequence of random variables Xn converges to X in probability, Xn does not necessarily converge to X almost surely, as shown by the above example. However, Xn always has a subsequence X_{n_k} such that X_{n_k} → X almost surely.

Convergence in moment is another commonly employed concept. It is often not applied directly, but it is sometimes convenient to verify the convergence in moment. The convergence in moment implies the convergence in probability.

Definition 2.3 Convergence in moment. Let r > 0 be a real number. If the rth absolute moment exists for all {Xn}_{n=1}^{∞} and X, and

lim_{n→∞} E{|Xn − X|^r} = 0,

then Xn converges to X in the rth moment.

By a well known inequality in probability theory, we can show that the rth moment convergence implies the sth moment convergence when 0 < s < r. In addition, it also implies the convergence in probability. Such a result can be established easily by using the following inequality.

Markov Inequality: Let X be a random variable and, for some r > 0, E|X|^r < ∞. Then for any ε > 0, we have

P(|X| ≥ ε) ≤ E|X|^r / ε^r.

PROOF: It is easy to verify that

I(|X| ≥ ε) ≤ |X|^r / ε^r.

Taking expectations results in the inequality to be shown. ♦

When r = 2, the Markov inequality becomes Chebyshev's inequality:

P(|X − µ| ≥ ε) ≤ σ²/ε²,

where µ = E(X) and σ² = Var(X).
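The bound can be compared with actual tail probabilities in a quick simulation; the exponential model, the values of ε, and the seed below are arbitrary illustrative choices.

```python
# Chebyshev's inequality: P(|X - mu| >= eps) <= sigma^2 / eps^2.
# Here X is exponential with rate 1, so mu = 1 and sigma^2 = 1.
import numpy as np

rng = np.random.default_rng(4)
X = rng.exponential(scale=1.0, size=1_000_000)
mu, var = 1.0, 1.0

for eps in (1.0, 2.0, 3.0):
    lhs = np.mean(np.abs(X - mu) >= eps)    # Monte Carlo tail probability
    print(eps, lhs, var / eps ** 2)          # the bound is never violated
```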

Example 2.2 Suppose Xn → X in the rth moment for some r > 0. For any δ > 0, we have

P(|Xn − X| ≥ δ) ≤ E|Xn − X|^r / δ^r.

The right hand side converges to zero as n → ∞ because of the moment convergence. Thus, we have shown that Xn → X in probability.

The reverse of this result is not true in general. For example, let X be a random variable with uniform distribution on [0, 1]. Define Xn = X + nI(X < n^{−1}) for n = 1, 2, .... It is easy to show that Xn → X almost surely. However, E|Xn − X| = 1, which does not converge to zero. Hence, Xn does not converge to X in the first moment.

If, however, there exists a nonrandom constant M such that P(|Xn| < M) = 1 for all n and Xn → X in probability, then Xn → X in the rth moment for all r > 0.

A typical tool for proving almost sure convergence is the Borel-Cantelli Lemma.

Lemma 2.1 Borel-Cantelli Lemma: If {An, n ≥ 1} is a sequence of events for which ∑_{n=1}^{∞} P(An) < ∞, then

P(An occur infinitely often) = 0.

The event {An occur infinitely often} contains all sample points each of which is a member of infinitely many of the An's. We will use i.o. for infinitely often. The fact that Xn → X almost surely is equivalent to

P(|Xn − X| ≥ ε, i.o.) = 0

for all ε > 0. In view of the Borel-Cantelli Lemma, if

∑_{n=1}^{∞} P(|Xn − X| ≥ ε) < ∞

for all ε > 0, then Xn → X almost surely.

Let X1, X2, ..., Xn, ... be a sequence of independent and identically distributed (iid) random variables such that their second moment exists. Let µ = E(X1) and σ² = Var(X1). Let X̄n = n^{−1} ∑_{i=1}^{n} Xi, so that {X̄n} is a sequence of random variables too. By Chebyshev's inequality,

P(|X̄n − µ| ≥ ε) ≤ σ²/(nε²)

for any given ε > 0. As n → ∞, the probability converges to 0. Hence, we have shown X̄n → µ in probability. Note that we may view µ as a random variable with a degenerate distribution.

The same proof can be used to establish the almost sure convergence if the 4th moment of X1 exists. In fact, the existence of the first moment is sufficient to establish the almost sure convergence of the sample mean of i.i.d. random variables. The elementary proof under the first moment assumption only is long and complex. We present the following without proof.

Theorem 2.1 Law of Large Numbers: Let X1,X2,...,Xn,... be a sequence of independent and identically distributed (i.i.d. ) random variables. (a) If nP(|X1| > n) → 0, then

X̄n − cn → 0 in probability, where cn = E{X1 I(|X1| ≤ n)}.

(b) If E|X1| < ∞, then X̄n − E(X1) → 0 almost surely.
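A minimal simulation sketch of the law of large numbers follows; the exponential distribution, the sample sizes and the seed are arbitrary illustrative choices.

```python
# Running sample means of i.i.d. exponential(1) variables settle down to
# the population mean E(X_1) = 1 as n grows.
import numpy as np

rng = np.random.default_rng(5)
x = rng.exponential(scale=1.0, size=100_000)
running_mean = np.cumsum(x) / np.arange(1, x.size + 1)

for n in (10, 100, 1000, 10_000, 100_000):
    print(n, running_mean[n - 1])          # drifts toward 1 as n grows
```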

The existence of the first moment of a random variable is closely related to how fast P(|X| > n) goes to zero as n → ∞. Here we give an interesting inequality and a related result. Let X be a positive random variable with finite expectation. That is, assume P(X ≥ 0) = 1 and E{X} < ∞. Then we have

E{X} = ∑_{n=0}^{∞} E{X I(n < X ≤ n + 1)}.

Since

nI(n < X ≤ n + 1) < XI(n < X ≤ n + 1) < (n + 1)I(n < X ≤ n + 1) for all n = 0,1,.... We get

∑_{n=0}^{∞} nP(n < X ≤ n + 1) ≤ E{X} ≤ ∑_{n=0}^{∞} (n + 1)P(n < X ≤ n + 1).

Let qn = P(X > n) so that P(n < X ≤ n + 1) = qn − qn+1. We then have

∑_{n=0}^{∞} nP(n < X ≤ n + 1) = ∑_{n=0}^{∞} q_{n+1}.

Consequently, if E{X} < ∞, then

∑_{n=0}^{∞} q_{n+1} = ∑_{n=0}^{∞} P(X > n + 1) < ∞.

If X1, X2, ..., Xn, ... is a sequence of random variables with the same distribution as X, then we have

∑_{n=0}^{∞} P(Xn > n + 1) < ∞.

By the Borel-Cantelli Lemma, almost surely Xn ≥ n + 1 occurs for only finitely many n; that is, Xn < n + 1 for all large n, almost surely.

2.2 Uniform Strong law of large numbers

In many statistical problems, we must work with i.i.d. random variables indexed by some parameters. For each given parameter value, the (strong) law of large numbers is applicable. However, we are often interested in large sample properties of a quantity derived from the sum of such random variables, and these properties can often be obtained from the uniform convergence with probability one of these functions. Rubin (1956) gives a sufficient condition for such uniform convergence which is particularly simple to use. Let X1, X2, ..., Xn, ... be a sequence of i.i.d. random variables taking values in an arbitrary space X. Let g(x;θ) be a measurable function in x for each θ ∈ Θ. Suppose further that Θ is a compact parameter space.

Theorem 2.2 Suppose there exists a function H(·) such that E{H(X)} < ∞ and that |g(x;θ)| ≤ H(x) for all θ ∈ Θ. The parameter space Θ is compact. In addition, there exist sets A_j, j = 1, 2, ..., such that

P(Xi ∈ ∪_{j=1}^{∞} A_j) = 1

and g(x;θ) is continuous in θ uniformly in x ∈ A_j for each j. Then, almost surely and uniformly in θ ∈ Θ,

n^{−1} ∑_{i=1}^{n} g(Xi;θ) → E{g(X1;θ)},

and E{g(X1;θ)} is a continuous function of θ.

Proof: We may define B_k = ∪_{j=1}^{k} A_j for k = 1, 2, .... Note that B_k is monotone increasing. The theorem condition implies that P(X ∈ B_k) → 1 as k → ∞ and therefore

H(X) 1(X ∈ B_k^c) → 0

almost surely, where X is a random variable with the same distribution as X1. By the dominated convergence theorem, the condition E{H(X)} < ∞ leads to

E{H(X) 1(X ∈ B_k^c)} → 0

as k → ∞. We now take note of

sup_θ | n^{−1} ∑_{i=1}^{n} g(Xi;θ) − E{g(X;θ)} |
  ≤ sup_θ | n^{−1} ∑_{i=1}^{n} g(Xi;θ) 1(Xi ∈ B_k) − E{g(X;θ) 1(X ∈ B_k)} |
  + sup_θ | n^{−1} ∑_{i=1}^{n} g(Xi;θ) 1(Xi ∉ B_k) − E{g(X;θ) 1(X ∉ B_k)} |.

The second term is bounded by

n^{−1} ∑_{i=1}^{n} H(Xi) 1(Xi ∉ B_k) + E{H(X) 1(X ∈ B_k^c)} → 2E{H(X) 1(X ∈ B_k^c)},

which is arbitrarily small almost surely. Because H(X) dominates g(X;θ), these results show that the proof of the theorem can be carried out as if the space X were B_k for some large enough k. In other words, we need only prove this theorem when g(x;θ) is simply equicontinuous over x. Under this condition, for any ε > 0, there exist a finite number of θ values, θ1, θ2, ..., θm, such that

sup_{θ∈Θ} min_j |g(x;θ) − g(x;θ_j)| < ε/2.

This also implies

sup_{θ∈Θ} min_j |E{g(X;θ)} − E{g(X;θ_j)}| < ε/2.

Next, we easily observe that

sup_θ | n^{−1} ∑_{i=1}^{n} g(Xi;θ) − E{g(X;θ)} | ≤ max_{1≤j≤m} | n^{−1} ∑_{i=1}^{n} g(Xi;θ_j) − E{g(X;θ_j)} | + ε.

The first term goes to 0 almost surely by the conventional strong law of large numbers, and ε is an arbitrarily small positive number. The conclusion is therefore true.
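A numerical sketch of the uniform convergence follows. The choices g(x;θ) = (x − θ)², X uniform on [0, 1], a finite grid over Θ = [0, 1], and E{g(X;θ)} = 1/12 + (1/2 − θ)² are illustrative assumptions, not taken from the notes.

```python
# Uniform convergence over a compact parameter space, checked on a grid:
# sup_theta |n^{-1} sum g(X_i; theta) - E g(X; theta)| shrinks as n grows.
import numpy as np

rng = np.random.default_rng(6)
theta = np.linspace(0.0, 1.0, 201)                 # grid over Theta = [0, 1]
target = 1.0 / 12.0 + (0.5 - theta) ** 2           # E{(X - theta)^2}, X ~ U(0,1)

for n in (100, 1000, 10_000, 100_000):
    x = rng.uniform(0.0, 1.0, size=n)
    # sample average of (X - theta)^2 expanded as mean(x^2) - 2 theta mean(x) + theta^2
    sample_avg = (x ** 2).mean() - 2.0 * theta * x.mean() + theta ** 2
    print(n, np.max(np.abs(sample_avg - target)))  # sup over the grid shrinks
```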

2.3 Convergence in distribution

The concept of convergence in distribution is different from the modes of convergence given in the last section.

Definition 2.4 Convergence in distribution: Let X1, X2, ..., Xn, ... be a sequence of random variables, and X be another random variable. If

P(Xn ≤ x0) → P(X ≤ x0)

for all x0 such that F(x) = P(X ≤ x) is continuous at x = x0, then we say that Xn → X in distribution. We may also denote it as Xn →d X.

The convergence in distribution does not depend on the probability space. Thus, we may instead discuss a sequence of distribution functions Fn(x) and F(x). If Fn(x) → F(x) at all continuity points of F(x), then Fn converges to F in distribution. We sometimes mix up random variables and their distribution functions: when we state that Xn converges to F(x), the claim is the same as the distribution of Xn converging to F(x). It could happen that Fn(x) converges at each x, but the limit, say F(x), does not have properties such as lim_{x→∞} F(x) = 1. In this case, Fn(x) does not converge in distribution although the function sequence converges.

Example 2.3 Let X be a positive random variable and Xn = nX for n = 1, 2, .... It is seen that P(Xn < x) → 0 for any finite x. Letting Fn(x) denote the distribution function of Xn, we have Fn(x) → 0 for every x. However, this convergence is not in the mode of “in distribution”.

When Fn → F in distribution, there may not be any corresponding random variables under discussion. It is always possible to construct a probability space and a sequence of random variables {Xn}_{n=1}^{∞} and X such that their distribution functions are the same as Fn and F. Furthermore, the construction can be done such that Xn → X almost surely.

Theorem 2.3 Skorohod representation theorem. Suppose Fn → F in distribution. There exist a probability space and a sequence of random variables {Xn}_{n=1}^{∞} and X defined on it, with distribution functions Fn and F respectively, such that Xn → X almost surely.

Using this result, one may show that if Xn → X in distribution and g(x) is a continuous function, then g(Xn) → g(X) in distribution.

We end this section by presenting two useful results. (1) Xn → X in distribution if and only if E[g(Xn)] → E[g(X)] for all bounded and continuous functions g. (2) When X1, X2, ... are random vectors of finite dimension, Xn → X in distribution if and only if for any non-random vector a, a^τ Xn → a^τ X in distribution.

Example 2.4 Let X1, X2, ..., Xn be a sequence of iid exponentially distributed random variables. Their common cumulative distribution function is given by F(x) = 1 − exp(−x) for x ≥ 0. Let X_(n) = max{X1, ..., Xn} and X_(1) = min{X1, ..., Xn}. It is seen that

P(nX_(1) > x) = {exp(−x/n)}^n = exp(−x).

Hence, nX_(1) → X1 in distribution. On the other hand, we find

P{X_(n) − log n < x} = {1 − n^{−1} exp(−x)}^n → exp(−e^{−x}).

The right hand side is a cumulative distribution function. Hence, X_(n) − log n converges in distribution to a distribution with cumulative distribution function exp(−e^{−x}). We call it the type I extreme value distribution.
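A simulation sketch of this extreme value limit follows; the sample size, replication count and evaluation points are arbitrary choices.

```python
# Maximum of n i.i.d. exponential(1) variables, centred by log n, compared
# with the type I extreme value (Gumbel) limit exp(-exp(-x)).
import numpy as np

rng = np.random.default_rng(7)
n, reps = 1000, 10_000
maxima = rng.exponential(1.0, size=(reps, n)).max(axis=1) - np.log(n)

for x in (-1.0, 0.0, 1.0, 2.0):
    print(x, np.mean(maxima <= x), np.exp(-np.exp(-x)))
```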

2.4 Central limit theorem

The most important example of the convergence in distribution is the classical central limit theorem. It presents an important case where a commonly used statistic is asymptotically normal. The simplest version is as follows. By N(µ, σ²), we mean the normal distribution with mean µ and variance σ².

Theorem 2.4 Classical Central Limit Theorem: Let X1, X2, ... be a sequence of iid random variables. Assume that both µ = E(X1) and σ² = Var(X1) exist. Then, as n → ∞,

√n[X̄n − µ] → N(0, σ²)

in distribution, where X̄n = n^{−1} ∑_{i=1}^{n} Xi.

It may appear illogical to some that we start with a sequence of random variables but end up with a normal distribution. As already commented in the last section, we interpret both sides as their corresponding distribution functions.

If the Xn's do not have the same distribution, then having a common mean and variance is not sufficient for the asymptotic normality of the sample mean. A set of nearly necessary and sufficient conditions is the Lindeberg condition. For most applications, we recommend the verification of the Liapounov condition.

Theorem 2.5 Central Limit Theorem under the Liapounov Condition: Let X1, X2, ... be a sequence of independent random variables. Assume that both µi = E(Xi) and σi² = Var(Xi) exist. Further, assume that for some δ > 0,

∑_{i=1}^{n} E|Xi − µi|^{2+δ} / [∑_{i=1}^{n} σi²]^{1+δ/2} → 0

as n → ∞. Then, as n → ∞,

∑_{i=1}^{n} (Xi − µi) / √(∑_{i=1}^{n} σi²) → N(0, 1)

in distribution.
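A simulation sketch with independent but non-identically distributed summands follows; the Bernoulli(pᵢ) model with drifting pᵢ is an arbitrary example for which the Liapounov condition holds with δ = 1.

```python
# Standardized sums of independent Bernoulli(p_i) variables with varying p_i
# compared with the standard normal distribution function.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(8)
n, reps = 500, 20_000
p = 0.2 + 0.6 * np.arange(1, n + 1) / n          # p_i in (0.2, 0.8]
mu = p.sum()
sd = np.sqrt((p * (1 - p)).sum())

x = rng.random((reps, n)) < p                    # Bernoulli(p_i) draws
z = (x.sum(axis=1) - mu) / sd                    # standardized sums

for t in (-1.0, 0.0, 1.0):
    print(t, np.mean(z <= t), norm.cdf(t))       # close to the N(0,1) c.d.f.
```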

The central limit theorem for random vectors is established through examining the convergence of a^τ Xn for all possible non-random vectors a.

2.5 Big and small o, Slutsky's theorem

There are many important statistics that are not straight sums of independent random variables. At the same time, many are also asymptotically normal. Many such results are proved with the help of Slutsky's theorem and with the concepts of big and small o's.

Let an be a sequence of positive numbers and Xn be a sequence of random variables. If Xn/an → 0 in probability, we say Xn = op(an). In general, the definition is meaningful only if an is a monotone sequence. If instead, for any given ε > 0, there exist positive constants M and N such that whenever n > N,

P(|Xn/an| < M) ≥ 1 − ε,

then we say that Xn = Op(an). In most textbooks, the positiveness of an is not required. Not requiring positiveness does not change the essence of the current definition; sticking to positiveness is helpful in avoiding some unintended abuse of these concepts.

We love to compare statistics under investigation to n^{1/2}, n, n^{−1/2} and so on. If Xn = op(n^{−1}), it implies that Xn converges to 0 faster than the rate n^{−1}. If Xn = Op(n^{−1}), it implies that Xn converges to 0 no slower than the rate n^{−1}. Most importantly, Xn = Op(n) does not imply that Xn has a size of n when n is large. Even if Xn = 0 for all n, it is still true that Xn = Op(n).
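These stochastic orders can be made concrete with a quick simulation; the Bernoulli(1/2) model, sample sizes and seed below are arbitrary illustrative choices.

```python
# X_bar_n - mu is Op(n^{-1/2}): scaling by sqrt(n) keeps the deviation
# stochastically bounded; scaling by n does not (it is not Op(n^{-1})).
# Here X_i ~ Bernoulli(1/2), so X_bar_n is simulated exactly via binomial counts.
import numpy as np

rng = np.random.default_rng(9)
reps = 100_000
for n in (100, 1000, 10_000, 100_000):
    xbar = rng.binomial(n, 0.5, size=reps) / n
    dev = np.abs(xbar - 0.5)
    print(n,
          np.quantile(np.sqrt(n) * dev, 0.95),   # stable across n
          np.quantile(n * dev, 0.95))            # grows roughly like sqrt(n)
```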

Example 2.5 If E|Xn| = o(1), then Xn = op(1). Proof: By Markov inequality, for any M > 0, we have

P(|Xn| > M) ≤ E|Xn|/M = o(1).

Hence, Xn = op(1).

The reverse of the above example is clearly wrong.

Example 2.6 Suppose P(Xn = 0) = 1 − n^{−1} and P(Xn = n) = n^{−1}. Then Xn = op(n^{−m}) for any fixed m > 0. Yet we do not have E{Xn} = o(1).

While the above example appears in almost all textbooks, it is not unusual to find such a misconception appearing in research papers in some disguised form.

Example 2.7 If Xn = Op(an) and Yn = Op(bn) for two positive sequences of real numbers an and bn, then

(i) Xn +Yn = Op(an + bn);

(ii) XnYn = Op(anbn).

However, Xn −Yn = Op(an − bn) or Xn/Yn = Op(an/bn) are not necessarily true.

Example 2.8 Suppose X1, ..., Xn is a set of iid random variables from the Poisson distribution with mean θ. Let X̄n be the sample mean. Then, we have

(1) exp(X̄n) = exp(θ) + op(1).

(2) exp(X̄n) = exp(θ) + Op(n^{−1/2}).

Let us first present a simplified version of Slutsky’s Theorem.

Theorem 2.6 Suppose Xn → X in distribution, and Yn = op(1), then Xn +Yn → X in distribution.

PROOF: Let Fn(x) and F(x) be the cumulative distribution functions of Xn and X. Let x be a continuity point of F(x). For any given ε > 0, according to some real analysis result, we can always find 0 < ε₀ < ε such that F(x) is also continuous at x + ε₀. Since Yn = op(1), for any δ > 0 and ε > 0 there exists an N such that when n > N, P(|Yn| ≤ ε) > 1 − δ. Let ε be chosen such that x + ε is a continuity point of F. Hence, when n > N, we have

P(Xn + Yn ≤ x) ≤ P(Xn ≤ x + ε) + δ → F(x + ε) + δ.

Since δ can be arbitrarily small, we have shown

limsup P(Xn + Yn ≤ x) ≤ F(x + ε)

for all ε such that x + ε is a continuity point of F. As indicated earlier, such ε can be chosen arbitrarily small, so we may let ε → 0. Consequently, by the continuity of F at x, we have

limsup P(Xn + Yn ≤ x) ≤ F(x).

Similarly, we can show that

liminf P(Xn + Yn ≤ x) ≥ F(x).

The two inequalities together imply Xn + Yn → X in distribution. ♦

If F(x) is a continuous function, then we can save a lot of trouble in the above proof. The simplified Slutsky's theorem I presented above is also referred to as the delta-method when it is used as a tool for proving asymptotic results. In a nutshell, it simply states that adding an op(1) quantity to a sequence of random variables does not change the limiting distribution.

2.6 Asymptotic normality for functions of random variables

Suppose we already know that for some an → ∞, an(Yn − bn) →d Y. What do we know about the distribution of g(Yn) − g(bn)? The first observation is that if bn does not have a limit, then even if g is a smooth function, the difference is still far from determined. In general, g(Yn) − g(bn) depends on the slope of g near bn. Hence we only consider the case where bn is a constant that does not depend on n.

Theorem 2.7 Assume that an(Yn − µ) → Y in distribution, an → ∞, and g(·) is continuously differentiable in a neighborhood of µ. Then

an[g(Yn) − g(µ)] → g′(µ)Y

in distribution.

Proof: Using the mean value theorem,

an[g(Yn) − g(µ)] = g′(ξn)[an(Yn − µ)]

for some value ξn between Yn and µ. Since an → ∞, we must have Yn → µ in probability. Hence we also have ξn →p µ. Consequently, the continuity of g′ at µ implies

g′(ξn) − g′(µ) = op(1)

and

an[g(Yn) − g(µ)] = g′(µ)[an(Yn − µ)] + op(1).

The result follows from Slutsky's theorem. ♦

The result and the proof are presented for the case when Yn and Y are one-dimensional. They can be easily generalized to vector cases. When an does not converge to any constant, our idea should still apply. It is not smart to declare that the asymptotic approach does not work because the conditions of Theorem 2.7 are not satisfied.
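A simulation sketch of Theorem 2.7 in the setting of Example 2.8 follows; θ = 2, g(x) = exp(x), the sample sizes and the seed are arbitrary choices. Here g′(θ) = exp(θ) and Var(X1) = θ, so the limiting standard deviation of √n{exp(X̄n) − exp(θ)} is exp(θ)√θ.

```python
# Delta method check: X_i ~ Poisson(theta), g(x) = exp(x).
# The standard deviation of sqrt(n)*(g(X_bar) - g(theta)) approaches
# exp(theta)*sqrt(theta) as n grows.
import numpy as np

rng = np.random.default_rng(10)
theta, reps = 2.0, 50_000
target_sd = np.exp(theta) * np.sqrt(theta)

for n in (200, 2000, 20_000):
    # the sum of n i.i.d. Poisson(theta) variables is Poisson(n*theta)
    xbar = rng.poisson(n * theta, size=reps) / n
    z = np.sqrt(n) * (np.exp(xbar) - np.exp(theta))
    print(n, z.std(), target_sd)
```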

2.7 Sum of random number of random variables

Sometimes we need to work with the sum of random number of random variables. One such example is the total amount of insurance claims in a month.

Theorem 2.8 Let {Xi, i = 1, 2, ...} be i.i.d. random variables with mean µ and variance σ². Let {Nn, n = 1, 2, ...} be a sequence of integer valued random variables which is independent of {Xi, i = 1, 2, ...}, and P(Nn > M) → 1 for any M as n → ∞. Then

Nn^{−1/2} ∑_{j=1}^{Nn} (Xj − µ) → N(0, σ²)

in distribution.

Proof: For simplicity, assume µ = 0, σ² = 1 and let Yn = n^{−1/2} ∑_{i=1}^{n} Xi. The classical central limit theorem implies that for any real value x and a positive constant ε, there exists a constant M such that whenever n > M,

|P(Yn ≤ x) − Φ(x)| ≤ ε.

From the independence assumption,

P(Y_{Nn} ≤ x) = ∑_{m=1}^{∞} P(Ym ≤ x)P(Nn = m).

Hence,

|P(Y_{Nn} ≤ x) − Φ(x)| = | ∑_{m=1}^{∞} {P(Ym ≤ x) − Φ(x)} P(Nn = m) |
  ≤ | ∑_{m≥M} {P(Ym ≤ x) − Φ(x)} P(Nn = m) | + P(Nn < M)
  ≤ ε + P(Nn < M) → ε.

From the arbitrariness of the choice of ε, we conclude that P(Y_{Nn} ≤ x) → Φ(x) for all x. Hence, the theorem is proved. ♦
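A simulation sketch of Theorem 2.8 follows; the choices Nn ~ Poisson(n), exponential summands with µ = σ² = 1, and the replication count are arbitrary assumptions that satisfy the theorem's conditions.

```python
# Sum of a random number of centred i.i.d. variables, scaled by N_n^{-1/2},
# compared with N(0, sigma^2) where sigma^2 = 1.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(11)
n, reps = 1000, 20_000
z = np.empty(reps)
for r in range(reps):
    N = rng.poisson(n)                              # random number of terms
    x = rng.exponential(1.0, size=N)                # mean 1, variance 1
    z[r] = (x - 1.0).sum() / np.sqrt(N)

for t in (-1.0, 0.0, 1.0):
    print(t, np.mean(z <= t), norm.cdf(t))
```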

2.8 Assignment problems

1. Prove that the set {ω : limXn(ω) exists} is measurable.

2. Identify an almost surely convergent subsequence in the context of Example 2.1.

3. Prove Borel-Cantelli Lemma.

4. Suppose that there exists a nonrandom constant M such that P(|Xn| < M) = 1 for all n and Xn → X in probability. Show that Xn → X in rth moment for all r > 0.

5. Show that if Xn → X almost surely, then Xn → X in probability.

6. Use the Borel-Cantelli Lemma to show that the sample mean X̄n of an i.i.d. sample converges to its mean almost surely if E|X1|⁴ < ∞.

7. Show that if Xn → X in distribution and g(x) is a continuous function, then g(Xn) → g(X) in distribution.

Furthermore, give an example of non-continuous g(x) such that g(Xn) does not converge to g(X) in distribution.

8. Prove that Xn → X in distribution if and only if E[g(Xn)] → E[g(X)] for all bounded and continuous functions g.

9. Let X1, ..., Xn be an i.i.d. sample from the uniform distribution on [0, 1]. Find the limiting distribution of nX_(1), where X_(1) = min{X1, ..., Xn}, as n → ∞.

10. Let X1, ..., Xn be an i.i.d. sample from the standard normal distribution. Find a non-degenerate limiting distribution of an(X_(1) − bn) with appropriate choices of an and bn.

11. Suppose Fn and F are a sequence of one-dimensional cumulative distribution functions and that Fn →d F. Show that

sup_x |Fn(x) − F(x)| → 0

as n → ∞ if F(x) is a continuous function. Give a counter–example when F is not continuous.

12. Suppose Fn and F are a sequence of absolutely continuous one-dimensional cumulative distribution functions and that Fn →d F. Let fn(x) and f(x) be their density functions. Give a counter example to

∫ |fn(x) − f(x)| dx → 0

as n → ∞.

Prove that the above limiting conclusion is true when fn(x) → f(x) at all x. Are there any similar results for discrete distributions?

13. Suppose that {Xni, i = 1, ..., n}_{n=1}^{∞} is a sequence of sets of random variables. It is known that

max_{1≤i≤n} P(|Xni| > n^{−2}) → 0

as n → ∞. Does it imply that ∑_{i=1}^{n} Xni = op(n^{−1})? What is the order of max_{1≤i≤n}{Xni}?

14. Suppose that Xn = Op(n²) and Yn = op(n²). Is it true that Yn/Xn = op(1)?

15. Suppose that Xn = Op(an) and Yn = Op(bn). Prove that XnYn = Op(anbn).

16. Suppose that Xn = Op(an) and Yn = Op(bn). Give a counter example to Xn − Yn = Op(an − bn).

17. Suppose we have a sequence of random variables {Xn} such that Xn →d X. Show that Xn = Op(1).

18. Suppose Xn →d X and Yn = 1 + op(1). Is it true that Xn/Yn →d X?

19. Let {Xi}_{i=1}^{∞} be a sequence of i.i.d. random variables. Show that Xn = Op(1). Is it true that ∑_{i=1}^{n} Xi = Op(n)?

20. Assume that an(Yn − µ) → Y in distribution, an → ∞, and g(·) is continuously differentiable in a neighborhood of µ. Suppose g′(µ) = 0 and g″(x) is continuous and non-zero at x = µ. Obtain a limiting distribution of g(Yn) − g(µ) under an appropriate scale.

Chapter 3

Empirical distributions, moments and quantiles

Let X, X1, X2, ... be i.i.d. random variables. Let mk = E{X^k} and µk = E{(X − m1)^k} for k = 1, 2, .... We may also use the notation µ for m1, and σ² for µ2. We call mk the kth moment and µk the kth central moment.

With n i.i.d. observations of X, a corresponding empirical distribution function Fn is constructed by placing mass n^{−1} at each observation Xi. That is,

Fn(x) = n^{−1} ∑_{i=1}^{n} 1(Xi ≤ x),   −∞ < x < ∞.

We also call Fn(x) the empirical distribution. It is a natural estimator of the cumulative distribution function of X. Assume that Fn(x) is not random for the moment. Then, if a random variable X* has cumulative distribution Fn(x), we would find its moments are given by

m*_k = n^{−1} ∑_{i=1}^{n} Xi^k  and  µ*_k = n^{−1} ∑_{i=1}^{n} (Xi − m*_1)^k.

We denote them as m̂k and µ̂k because they are natural estimates of mk and µk. We use X̄n for the sample mean and Sn² for the sample variance. Since we have reason to believe that Fn is a good estimator of F, it may also be true that ψ(Fn) will estimate ψ(F) for any reasonable functional ψ. In this chapter, we discuss large sample properties of ψ(Fn).
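A small Python sketch computing the empirical distribution function and the sample moments from data follows; the Exp(1) population and the evaluation points are arbitrary illustrative choices.

```python
# Empirical distribution function and sample (central) moments, compared
# with their population counterparts for Exp(1) data.
import numpy as np

rng = np.random.default_rng(12)
x = rng.exponential(1.0, size=10_000)              # population: Exp(1)

def F_n(t, data):
    return np.mean(data <= t)                       # empirical c.d.f. at t

for t in (0.5, 1.0, 2.0):
    print(t, F_n(t, x), 1.0 - np.exp(-t))           # compare with F(t)

m_hat = [np.mean(x ** k) for k in (1, 2, 3)]        # m_1, m_2, m_3: 1, 2, 6
mu_hat = [np.mean((x - x.mean()) ** k) for k in (2, 3)]  # mu_2, mu_3: 1, 2
print(m_hat, mu_hat)
```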

3.1 Properties of sample moments

Moments of a distribution family are very important parameters. Sample moments provide natural estimates. Many other parameters are functions of moments; therefore, estimates can be obtained by using the same functions of sample moments. This is the so-called method of moments. If the relevant moments of Xi exist, we can easily show

1. m̂k → mk almost surely.

2. n^{1/2}[m̂k − mk] →d N(0, m_{2k} − m_k²).

3. E(m̂k) = mk; nVAR(m̂k) = m_{2k} − m_k².

Before we work on the central sample moments µ̂k, let us first define

b_k = n^{−1} ∑_{i=1}^{n} (Xi − µ)^k,  k = 1, 2, ....

If we replace Xi by Xi − µ in m̂k, it becomes b_k. Obviously, b_k → µk almost surely for all k when the kth moment of X is finite.

Theorem 3.1 Let µk, µ̂k and so on be defined as above. Assume that the kth moment of X is finite. Then, we have

(a) µ̂k → µk almost surely.

(b) E(µ̂k) − µk = {½ k(k − 1) µ_{k−2} µ2 − k µk} n^{−1} + O(n^{−2}).

(c) √n{µ̂k − µk} →d N(0, σ_k²) when we also have E{X^{2k}} < ∞, where

σ_k² = µ_{2k} − µ_k² − 2k µ_{k−1} µ_{k+1} + k² µ2 µ_{k−1}².

PROOF: The proof of conclusion (a) is straightforward.

(b). Without loss of generality, let us assume µ = 0 to make the presentation simpler. It is seen that

µ̂k = n^{−1} ∑_{i=1}^{n} (Xi − X̄)^k
   = n^{−1} ∑_{i=1}^{n} ∑_{j=0}^{k} (−1)^j C(k, j) Xi^{k−j} X̄^j
   = n^{−1} ∑_{i=1}^{n} Xi^k + n^{−1} ∑_{i=1}^{n} ∑_{j=1}^{k} (−1)^j C(k, j) Xi^{k−j} X̄^j
   = b_k + b_1 ∑_{j=1}^{k} (−1)^j C(k, j) b_{k−j} b_1^{j−1},

where C(k, j) denotes the binomial coefficient. This is the first equality needed for (b). For the second step, note that E{b_k} = µk. Thus, we get

E{µ̂k} − µk = ∑_{j=1}^{k} (−1)^j C(k, j) E{b_1^j b_{k−j}}.

We next study the order of the expectation terms E{b_1^j b_{k−j}}, j = 1, 2, ..., k, term by term.

When j = 1, we have

E{b_1 b_{k−1}} = n^{−2} ∑_{i,l} E{Xi X_l^{k−1}}.

Due to independence and the fact that the Xi's have mean 0, the summand is zero unless i = l, and there are only n such terms. From E{Xi^k} = µk, we get

E{b_1 b_{k−1}} = n^{−1} µk.

When j = 2, we have

E{b_1² b_{k−2}} = n^{−3} ∑_{i,l,m} E{Xi X_l X_m^{k−2}}.

A term in the above summation has nonzero expectation only if i = l. When i = l, we have two cases: i = l = m and i = l ≠ m. They have n and n(n − 1) terms in the summation, respectively, and the corresponding expectations are µk and µ2 µ_{k−2}. Hence, we get

E{b_1² b_{k−2}} = n^{−2} µk + n^{−1}(1 − n^{−1}) µ2 µ_{k−2} = n^{−1} µ2 µ_{k−2} + O(n^{−2}).

When j ≥ 3, we have

E{b_1^j b_{k−j}} = n^{−(j+1)} ∑_{i1,i2,...,ij,l} E{X_{i1} X_{i2} ··· X_{ij} X_l^{k−j}}.

The individual expectations are non-zero only if each of i1, i2, ..., ij is paired up with another index, possibly including l. Hence, for terms with nonzero expectation, there are at most j − 1 distinct values among {i1, i2, ..., ij, l}, so the total number of such terms is no more than n^{j−1}. Since each E{X_{i1} X_{i2} ··· X_{ij} X_l^{k−j}} is bounded, we must have

E{b_1^j b_{k−j}} = O(n^{−2}).

Combining the calculations for j = 1, 2 and for j ≥ 3, we get the conclusion.

(c) We seek to use Slutsky's theorem in this proof. This amounts to expanding the random quantity into a leading term whose limiting distribution can be shown by a classical result, plus an op(1) term which does not alter the limiting distribution.

Since b_1 = Op(n^{−1/2}), b_1² = Op(n^{−1}), and b_{k−j} = Op(1), we get b_1^j b_{k−j} = Op(n^{−1}) for j ≥ 2. Consequently, we find

√n{µ̂k − µk} = √n{b_k − µk − k b_1 µ_{k−1}} − √n k b_1(b_{k−1} − µ_{k−1}) + Op(n^{−1/2})
            = √n{b_k − µk − k b_1 µ_{k−1}} + Op(n^{−1/2}).

The last equality is a result of b_1(b_{k−1} − µ_{k−1}) = Op(n^{−1}). It is seen that

b_k − µk − k b_1 µ_{k−1} = n^{−1} ∑_{i=1}^{n} {Xi^k − k µ_{k−1} Xi − µk},

which is a sum of i.i.d. random variables. It is trivial to verify that E{Xi^k − k µ_{k−1} Xi − µk} = 0 and VAR{Xi^k − k µ_{k−1} Xi − µk} = µ_{2k} − µ_k² − 2k µ_{k−1} µ_{k+1} + k² µ2 µ_{k−1}². Applying the classical central limit theorem to n^{−1/2} ∑_{i=1}^{n} {Xi^k − k µ_{k−1} Xi − µk}, we get the conclusion. ♦

The same technique can be used to show that E(X̄n − µ)^k = O(n^{−k/2}) when k is a positive even integer, and that E(X̄n − µ)^k = O(n^{−(k+1)/2}) when k is a positive odd integer. The second result is, however, not as obvious. Here is a proof. The claim is the same as E(∑_{i=1}^{n} Xi)^k = O(n^{k/2}), or O(n^{(k−1)/2}) when k is odd, with the Xi centred at 0. In the expansion of this kth power, all terms have the form

X_{i1}^{j1} ··· X_{im}^{jm}

with j1, ..., jm > 0 and j1 + ··· + jm = k. Its expectation equals 0 whenever one of j1, ..., jm equals 1. Thus, the size of m is at most k/2, or (k − 1)/2 when k is odd. Since each of i1, ..., im has at most n choices, the total number of such terms is no more than n^{k/2}, or O(n^{(k−1)/2}) when k is odd. As their moments have an upper bound, the claim is proved.

Theorem 3.2 Assume that the kth moment of X1 exists and X̄n = n^{−1} ∑_{i=1}^{n} Xi. Then

(a) E(X̄n − µ)^k = O(n^{−k/2}) when k is a positive even integer, and E(X̄n − µ)^k = O(n^{−(k+1)/2}) when k is a positive odd integer.

(b) E|X̄n − µ|^k = O(n^{−k/2}) when k ≥ 2.

PROOF: (a) The claims are the same as E(∑_{i=1}^{n} Xi)^k = O(n^{k/2}), or O(n^{(k−1)/2}) when k is odd, where the Xi are taken to have mean 0. We have a generic form of expansion

(∑_{i=1}^{n} Xi)^k = ∑ X_{i1}^{j1} ··· X_{im}^{jm},

where the summation is over all combinations with j1, ..., jm > 0 and j1 + ··· + jm = k. The expectation of X_{i1}^{j1} ··· X_{im}^{jm} equals 0 whenever one of j1, ..., jm equals 1. Thus, the terms with nonzero expectation must have m ≤ k/2, or m ≤ (k − 1)/2 when k is odd. Since each of i1, ..., im takes at most n values, the total number of nonzero expectation terms is no more than n^{k/2}, or O(n^{(k−1)/2}) when k is odd. Since their moments are bounded by a common constant, the claims must be true.

(b) The proof of this result becomes trivial based on the inequality in the next theorem. We omit the actual proof here.

Theorem 3.3 Assume that Yi, i = 1, 2, ..., n, are independent random variables with E(Yi) = 0 for all i. Then, for each k > 1,

A_k E{(∑_{i=1}^{n} Yi²)^{k/2}} ≤ E{| ∑_{i=1}^{n} Yi|^k} ≤ B_k E{(∑_{i=1}^{n} Yi²)^{k/2}},

where A_k and B_k are some positive constants not depending on n.

This inequality is attributed to Marcinkiewicz-Zygmund, and the proof can be found in Chow and Teicher (1978, 1st Edition, page 356). Its proof is somewhat involved.

3.2 Empirical distribution function

For each fixed x, Fn(x) is the sample mean of Yi = I(Xi ≤ x), i = 1, 2, ..., n. Since the Yi's are i.i.d. random variables and they have finite moments of any order, the standard large sample results apply. We can easily claim:

1. Fn(x) → F(x) almost surely for each fixed x, and in any order of moments.

2. √n{Fn(x) − F(x)} →d N(0, σ²(x)) with σ²(x) = F(x){1 − F(x)}.

3. Fn(x) − F(x) = Op(n^{−1/2}).

The conclusion 3 is a corollary of conclusion 2. A direct proof can be done by using Chebyshev’s inequality:

P(√n |Fn(x) − F(x)| > M) ≤ σ²(x)/M²,

whose right hand side can be made arbitrarily small with a proper choice of M.

Recall that if F(x) is continuous, then the convergence of Fn(x) at every x implies uniform convergence in x. That is, Dn = sup_x |Fn(x) − F(x)| converges to 0 almost surely. The statistic Dn is called the Kolmogorov-Smirnov distance and it is used for goodness-of-fit tests. In fact, when F is continuous and univariate, it is known that

P(Dn > d) ≤ C exp{−2nd²}

for all n and d, where C is an absolute constant. If X is a random vector, this result remains true with 2 replaced by 2 − ε, and C then depends on the dimension and ε. In addition, under the same conditions,

lim_{n→∞} P(n^{1/2} Dn ≤ d) = 1 − 2 ∑_{j=1}^{∞} (−1)^{j+1} exp(−2 j² d²).

We refer to Serfling (1980) for more results.
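A small simulation sketch of the behaviour of Dn for uniform data follows; the sample sizes and seed are arbitrary choices.

```python
# Kolmogorov-Smirnov distance D_n = sup_x |F_n(x) - F(x)| for U(0,1) data:
# D_n -> 0 while sqrt(n) D_n remains of order one.
import numpy as np

rng = np.random.default_rng(13)

def ks_distance(x):
    x = np.sort(x)
    n = x.size
    i = np.arange(1, n + 1)
    return np.max(np.maximum(i / n - x, x - (i - 1) / n))

for n in (100, 1000, 10_000, 100_000):
    d = ks_distance(rng.uniform(0.0, 1.0, size=n))
    print(n, d, np.sqrt(n) * d)
```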

3.3 Sample quantiles

Let F(x) be a cumulative distribution function. We define, for any 0 < p < 1, its pth quantile as

ξp = F^{−1}(p) = inf{x : F(x) ≥ p}.

Intuitively, if ξp is the pth quantile of F(x), we should have F(ξp) = p. The above definition clearly does not guarantee its validity. The problem arises when F(x) jumps at ξp. We can, however, prove the following:

Theorem 3.4 Let F be a distribution function. The function F^{−1}(t), 0 < t < 1, is nondecreasing and left-continuous, and satisfies

1. F^{−1}(F(x)) ≤ x, −∞ < x < ∞,

2. F(F^{−1}(t)) ≥ t, 0 < t < 1,

3. F(x) ≥ t if and only if x ≥ F^{−1}(t).

PROOF: We first show that the inverse is monotone. When t1 < t2, we have

{x : F(x) ≥ t1} ⊃ {x : F(x) ≥ t2}.

The infimum over a smaller set is no less than the infimum over a larger set. Hence,

inf{x : F(x) ≥ t1} ≤ inf{x : F(x) ≥ t2},

which is F^{−1}(t1) ≤ F^{−1}(t2), i.e., monotonicity.

To prove the left continuity, let {tk}_{k=1}^{∞} be an increasing sequence taking values between 0 and 1 with limit t0. We hence have that F^{−1}(tk) is an increasing sequence with upper bound F^{−1}(t0). Hence, it has a limit. We wish to show F^{−1}(tk) → F^{−1}(t0). If not, let x ∈ (lim F^{−1}(tk), F^{−1}(t0)). This implies

t0 > F(x) ≥ tk for all k. This is not possible when limtk = t0.

1. By definition, for any y such that F(y) ≥ F(x), we have y ≥ F^{−1}(F(x)). This remains true when y = x, hence x ≥ F^{−1}(F(x)).

2. For any y > F^{−1}(t), we have F(y) ≥ t by definition. Let y → F^{−1}(t) from the right; by the right-continuity of F, we must have F(F^{−1}(t)) ≥ t.

3. This is the consequence of (1) and (2). ♦.

With an empirical distribution function Fn(x), we define the empirical pth quantile ξ̂p = Fn^{−1}(p). What properties does this estimator have?

In order for ξ̂p to behave well, some conditions on F(x) seem necessary. For example, suppose F(x) is a distribution which places probability 50% at each of +1 and −1. The median of F(x) equals −1 by our definition. The median of Fn(x) equals −1 whenever at least 50% of the observations are equal to −1, and it equals +1 otherwise. The median is not meaningful for this type of distribution. To be able to differentiate between ξp and ξp ± δ, it is most desirable that F(x) strictly increases in a neighbourhood of ξp.

Here is the consistency result for ξ̂p. Note that ξ̂p depends on n although this fact is not explicit in its notation.

Theorem 3.5 Let 0 < p < 1. If ξp is the unique solution x of F(x−) ≤ p ≤ F(x), then ξ̂p → ξp almost surely.

Proof: For every ε > 0, by the uniqueness condition and the definition of ξp, we have

F(ξp − ε) < p < F(ξp + ε). 3.3. SAMPLE QUANTILES 37

It has been shown earlier that Fn(ξp ± ε) → F(ξp ± ε) almost surely. This implies that ˆ ξp − ε ≤ ξp ≤ ξp + ε ˆ almost surely. Since the size of ε is arbitrary, we must have ξp → ξp almost surely. ♦. If you like mathematics, the last sentence in the proof can be made more rig- orous.

0 Theorem 3.6 Let 0 < p,< 1. If F is differentiable at ξp and F (ξp) > 0, then √ 0 ˆ d nF (ξp)[ξp − ξp] → N(0, p(1 − p)).

Proof: For any real number x, we have √ x ( n(ξˆ − ξ ) ≤ x ) = (ξˆ ≤ ξ + √ ). p p p p n By definition of the sample quantile, the above event is the same as the following event: x F (ξ + √ ) ≥ p. n p n

Because F has positive derivative at ξp, we have F(ξp) = p. Thus, √ ˆ  P n[ξp − ξp] ≤ x  x x x  = P F (ξ + √ ) − F(ξ + √ ) ≥ F(ξ ) − F(ξ + √ ) n p n p n p p n  x x x 1  = P F (ξ + √ ) − F(ξ + √ ) ≥ −√ F0(ξ ) + o(√ ) n p n p n n p n √ x x  = P n[F (ξ + √ ) − F(ξ + √ )] ≥ −xF0(ξ ) + o(1) . n p n p n p By Slutsky’s theorem, for the sake of deriving the limiting distribution, the term o(1) can be ignored if the resulting probability has a limit. The resulting 0 2 probability has limit as the c.d.f. of N(0,[F (ξp)] p(1 − p)) by applying the cen- tral limit theorem for double arrays. ♦. 0 If F(x) is absolutely continuous, then F (ξp) = f (ξp) the density function. To be more specific, let p = 0.5 and hence ξ0.5 is the median. Thus, the effi- ciency of the sample median depends on the size of the density at the median. 38CHAPTER 3. EMPIRICAL DISTRIBUTIONS, MOMENTS AND QUANTILES

If F(x) is the standard normal, then f (ξ ) = √1 . The asymptotic variance is 0.5 2π 2 π hence 0.5 ∗(2π) = 2 . In comparison, the sample mean has asymptotic variance 1 which is smaller. Both mean and median are the same for nor- mal distribution family. Therefore, the sample mean is a more efficient estimator for the location parameter than the sample median. If, however, the distribution under consideration is double exponential, then the value of the density function at median is 0.5. Hence the asymptotic variance of the sample median is 1. At the same time, the sample mean has asymptotic variance 2. Thus, in this case, the sample median is more efficient. If we take the more extreme example when F(x) has Cauchy distribution, then the sample mean has infinite variance. The sample median is far superior. For those who advocate robust estimation, they point out that not only the sample median is robust, but also it can be more efficient when the model deviates from normality.

3.4 Inequalities on bounded random variables

We often work with bounded random variables. There are many particularly sharp inequalities for the sum of bounded random variables.

Theorem 3.7 (Bernstein Inequality) . Let Xn be a random variable having bi- nomial distribution with parameters n and p. For any ε > 0, we have 1 1 P(| X − p| > ε) ≤ 2exp(− nε2). n n 4 1 Proof: We work on the P( n Xn > p + ε) only. n 1 n k n−k P( Xn > p + ε) = ∑ (k)p q n k=m n n k n−k ≤ ∑ exp{λ[k − n(p + ε)]}(k)p q k=m n n λq k −λ p n−k ≤ exp(−λnε) ∑ (k)(pe ) (qe ) k=0 = e−λnε (peλq + qe−λ p)n 3.4. INEQUALITIES ON BOUNDED RANDOM VARIABLES 39 with q = 1 − p, m the smallest integer which is larger than n(p + ε) and for every positive constant λ. 2 It is easy to show ex ≤ x + ex for all real number x. With the help of this, we get e−λnε (peλq + qe−λ p)n ≤ exp(nλ 2 − λnε). 1 By choosing λ = 2 ε, we get the conclusion. The other part can be done similarly and so we get the conclusion. ♦. What we have done is, in fact, making use of the moment . More skillful application of the same technique can give us even sharper bound which is applicable in more general cases. We will state, without a proof, of the sharper bound as follows:

Theorem 3.8 (Hoeffding inequality) Let Y1,Y2,...,Yn be independent random vari- ables satisfying P(a ≤ Yi ≤ b) = 1, for each i, where a < b. Then, for every ε > 0 and all n, 2nε2 P(Y¯ − (Y¯ ) ≥ ε) ≤ exp(− ). n E n (b − a)2 With this inequality, we give a very sharp bound for the size of the sample quantile.

Example 3.1 Let 0 < p < 1. Suppose that ξp is the unique solution of F(x−) ≤ p ≤ F(x). Then, for every ε > 0 and all n,

ˆ 2 P(|ξp − ξp| > ε) ≤ 2exp{−2nδε } where δε = min{F(ξp + ε) − p, p − F(ξp − ε)}.

Proof: Assignment. ˆ The result can be stated in an even stronger way. Recall ξp actually depends ˆ on n. Let us now write it as ξpn.

Corollary 3.1 Under the assumptions of the above theorem, for every ε > 0 and all n, ˆ 2 n P(sup |ξpm − ξp| > ε) ≤ ρε . m≥n 1 − ρε 2 where ρε = exp(−2δε ). 40CHAPTER 3. EMPIRICAL DISTRIBUTIONS, MOMENTS AND QUANTILES

Remark:

1. We can choose whatever value for ε, including making it a function of n. √1 For example, we can choose ε = n . 2. Since the bound works for all n, we can apply it for fixed n as well as for asymptotic analysis.

We now introduce another inequality which is also attributed to Bernstein.

Theorem 3.9 (Bernstain) Let Y1,...,Yn be independent random variables satis- fying P(|Yi − E{Yi}| ≤ m) = 1, for each i, where m is finite. Then, for t > 0,

n n 2 2   n t  P | ∑ Yi − ∑ E{Yi}| ≥ nt ≤ 2exp(− n 2 , (3.1) i=1 i=1 2∑i=1 Var(Yi) + 3 mnt for all positive integer n.

The strength of this inequality is at situations where m is not small but the individual variances are small.

3.5 Bahadur’s representation

We have seen that the properties of the sample quantiles can be investigated through the empirical distribution function based on iid observations. This is very natural. It is very ingenious to have guessed that there is a linear relationship be- tween the sample quantile and the sample distribution. In a not so accurate way, Bahadur showed that

−1 −1 Fn (p) − F (p) = Cp[Fn(ξp) − F(ξp)] for some constant Cp depends on p and F when n is large. Such a result make it very easy to study the properties of sample quantiles. A key step in proving this result is to assess the size of

de f {Fn(ξp + x) − Fn(ξp)} − {F(ξp + x) − F(ξp)} = ∆n(x) − ∆(x). 3.5. BAHADUR’S REPRESENTATION 41

When x is a fixed constant, not random nor depends on n, we have

nt2 P{|∆ (x) − ∆(x)| ≥ t} ≤ 2exp{− } n 2 2 2σ (x) + 3t where σ 2(x) = ∆(x){1 − ∆(x)}.

As this is true for all n, we may conclusion tentatively that

−1/2 ∆n(x) − ∆(x) = Op(n ).

This result can be improved when x is known to be very small. Assume that the c.d.f of X satisfies the conditions

|∆(x)| = |F(ξp + x) − F(ξp)| ≤ c|x| for all small enough |x|. Let us now choose

x = n−1/2(logn)1/2.

It is therefore true that |∆(x)| ≤ cn−1/2(logn)1/2, and σ 2(x) ≤ cn−1/2(logn)1/2. Applying these facts to the same inequality when n is large, we have

1/2 −3/4 3/4 P{|∆n(x) − ∆(x)| ≥ 3c n (logn) } 9cn−1/2(logn)3/2 ≤ 2exp − −1/2 1/2 2 1/2 −3/4 3/4 2cn (logn) + 3 c n (logn) ≤ 2exp{−4log(n)} = 2n−4.

By using Borel-Cantelli Lemma, we have shown

−3/4 3/4 ∆n(x) − ∆(x) = O(n (logn) ) for this choice of x, almost surely. Now, we try to upgrade this result so that it is true uniformly for x in a small region of ξp. 42CHAPTER 3. EMPIRICAL DISTRIBUTIONS, MOMENTS AND QUANTILES

Lemma 3.1 Assume that the density function f (x) ≤ c in a neighborhood of ξp. −1/2 1/2 Let an be a sequence of positive numbers such that an = C0n (logn) . We have −3/4 3/4 sup |∆n(x) − ∆(x)| = O(n (logn) ) |x|≤an almost surely. 1/4 1/2 PROOF: Let us divide the interval [−an,an] into αn = 2n (logn) equal length intervals. We round up if αn is not an integer. Let b0,b±1,...b±αn be end points of these intervals with b0 = 0. Obviously, the length of each interval is not longer −3/4 than C0n . Let βn = max{F(bi+1) − F(bi)} where the max is taken over the obvious −3/4 range. Clearly βn = O(n ). One key observation is:

sup |∆n(x) − ∆(x)| ≤ max|∆n(bi) − ∆(bi)| + βn. |x|≤an However, for each i, we have shown by (3.2) that 1/2 −3/4 3/4 −4 P{|∆n(bi) − ∆(bi)| ≥ 3c n (logn) } ≤ 2n .

Hence, the chance for the maximum to be larger than this quantity is at most αn times of 2n−4. In addition −4 ∑2αnn n which is finite. Hence, we have shown −3/4 3/4 sup |∆n(x) − ∆(x)| = O(n (logn) ) |x|≤an almost surely. This completes the proof. To apply this result to the sample quantile, we need only show that the sample quantile will stay close to the target quantile. More precisely, we can show the following. 0 Lemma 3.2 Let 0 < p < 1. Suppose F is differentiable at ξp with F (ξp) = f (ξp) > 0. Then, almost surely, 2(logn)1/2 | ˆ − | ≤ . ξp ξp 1/2 f (ξp)n 3.5. BAHADUR’S REPRESENTATION 43

With this lemma, we are ready to claim the following.

Corollary 3.2 Under the conditions of Lemma 3.2, we have

ˆ ˆ −3/4 3/4 {Fn(ξp) − Fn(ξp)} − {F(ξp) − F(ξp)} = O(n (logn) ) almost surely.

Finally, we get the Bahadur’s representation.

Theorem 3.10 Under the conditions of Lemma 3.2, and assume that F is twice differentiable at ξp. Then, almost surely,

ˆ −1 −3/4 3/4 ξp − ξp = { f (ξp)} {p − Fn(ξp)} + O(n (logn) ).

PROOF: As F is twice differentiable, we have

ˆ ˆ ˆ 2 F(ξp) − F(ξp) = f (ξp)(ξp − ξp) + O({ξp − ξp} ).

From Lemma 3.2, we may replace F by Fn on the left hand side which will re- −3/4 3/4 ˆ 2 sulting an error of size n (logn) . In addition, we know that {ξp − ξp} = O(n−1(logn)). Therefore, we find

ˆ ˆ −3/4 3/4 Fn(ξp) − Fn(ξp) = f (ξp)(ξp − ξp) + O(n (logn) ). (3.2) ˆ From the definition of ξp, we know it is either (np)th with a round- ing up or down by 1 if F is continuous at all x. If so,

ˆ −1 |Fn(ξp) − p| ≤ n . ˆ Otherwise, ξp is converges to ξp almost surely and the density f (ξp) exists and none zero. Hence, the same bound applies almost surely. Substituting it into (4.7), we have ˆ −3/4 3/4 p − Fn(ξp) = f (ξp)(ξp − ξp) + O(n (logn) ). This implies the result of this theorem. There is a usual routine in proving asymptotic normality of a statistic. We first expand the statistics so that it consists of two terms. The first term is the sum of independent random variables. The second term is a higher order term 44CHAPTER 3. EMPIRICAL DISTRIBUTIONS, MOMENTS AND QUANTILES compared to the first one. Consequently, the classical central limit theorem is applied together with the Sluztsky’s theorem. This idea works in most cases. Badahur’s representation further enhances this point of view. It finds such an expansion for a highly non-smooth function. Further, it quantifies the higher order term accurately. In statistics, we do not usually need such a precise expansion. We rarely make use of an almost sure result. However, this technique is useful. More general results, history of the development can be found in Serfling (1980).

Problems

1. Let F(x) be a one-dimensional cumulative distribution function such F(x) = ˆ 0.5 if and only if x1 < x < x2 for some x1 < x2. Let ξ0.5 be the sample median ˆ based on n iid samples from F(x). Derive the limiting distribution of ξ0.5 when n → ∞.

k 2. Show that under the i.i.d. and finite moment assumption, E(X¯n − µ) = −k/2 k −(k+1)/2 O(n ) when k is a positive even integer; and E(X¯n − µ) = O(n ) when k is a positive odd integer.

3. Show that q(p + ε) s = log 0 p(q − ε) is the minimum point of

g(s) = (q + pes)n exp{−(p + ε)s}

where p + q = 1, 0 < p < 1, 0 < ε < q. Show that  q (q−ε)  p (p+ε) ≤ exp(−2ε2). q − ε p + ε

4. Let X1,X2,...,Xn be i.i.d. random variables. Calculate k E(X¯n − µ)

for k = 3,4,5 as functions of the moments of X1. 3.5. BAHADUR’S REPRESENTATION 45

5. Let Fn(t) be the empirical distribution function based on uniform[0, 1] i.i.d. random variables. Show that

−1 sup |Fn(t) −t| = sup |Fn (p) − p|. 0≤t≤1 0≤p≤1

ˆ ˆ 6. Let ξ0.25 and ξ0.75 be empirical 25% and 75% quantiles. Derive the limiting distribution of √ ˆ ˆ n[(ξ0.75 − ξ0.25) − (ξ0.75 − ξ0.25)] under conditions comparable to the conditions in Theorem 3.10 on the dis- tribution function F. Discuss how can we have the derivative conditions on F weakened?

7. Prove Example 3.1.

8. Prove Lemma 3.2

9. The residual in the Bahadur’s representation has higher order if it is in the sense of “in probability”, not “almost surely”. Show that the order can be −3/4 1/2 improved to Op(n (logn) ). 46CHAPTER 3. EMPIRICAL DISTRIBUTIONS, MOMENTS AND QUANTILES Chapter 4

Smoothing method

In some statistical applications, the parameter to be estimated is a function on some Euclidean space, not a real number or real vector. The number of obser- vations available to estimate the value of this function at any given point is often conceptually 0. A simple example is to estimate the density function of an ab- solutely continuous random variable. It is widely believed that, with a reference yet to be found, there does not exist an unbiased estimator in general for density function. A related problem is on non-parametric regression analysis. In this case, the objective is to make inference on the regression function. In theory, a response variable y can often be predicted in terms of a few covariates or design variables x. The regression function is the conditional expectation of Y given X = x. When X is random and has continuous distribution, the number of observations of Y at ex- actly X = x is 0. One again has to make use of observed responses corresponding to X values in a neighborhood of x. We will work on first.

4.1 Kernel density estimate

Let F(x) be the common cumulative distribution function of an iid sample on real numbers. When the density function is smooth, we have

F(x + h) − F(x − h) f (x) ≈ 2h

47 48 CHAPTER 4. SMOOTHING METHOD when h is a small number. Since Fn(x) is a good estimator of F(x), we may hence estimate f (x) by

n ˆ Fn(x + h) − Fn(x − h) 1 1 f (x) = = ∑ (|Xi − x| ≤ h) 2h 2nh i=1 with a proper choice of small h. It is seen that this estimator is the ratio of the average number of observations falling into the interval [x − h,x + h] to the length of the interval. When h is large, the average may not reflect the density at x but the average density over a short interval containing x. Thus, the estimator may have large bias. When h is very small, the number of observations in the interval will fluctuate from sample to sample more severely, thus the estimator has large variance. In general, we choose the bandwidth h according to the sample size. As it will be seen, the basic requirements include h → 0 and nh → ∞ as the sample size n → ∞. Because of this, most quantities to be discussed are functions of n even if there is no explicit notational indication. Let 1 K(x) = I(−1 < x < 1) 2 −1 which is itself a density function and Kh(x) = h K(x/h). We can then write

n Z ˆ −1 f (x) = n ∑ Kh(Xi − x) = Kh(t − x)dFn(t). (4.1) i=1

Hence, the density estimator is the average value of Kh(Xi − x). The estimator defined by (4.1) is called the kernel density estimator. We call K the kernel of this density estimator and h the bandwidth. In this type of , the contribution of Xi toward the density estimate at x is determined by Kh(Xi − x). It is easy to see that we may replace K(x) by any other density function, and the resulting fˆ(x) is still a sensible density estimator. With the previous choice of K, while observations within ±h neighborhood of x have equal contribution, the observations out side of this interval contribute nothing to the density. It is more reasonable to make K(x) a smoothly decreasing function in |x|. Thus, a popular choice of K(x) is the density function of the standard normal distribution. Although the rest of our discussion can be easily generalized to multi-dimensional densities, the presentation is the simplest when 4.1. KERNEL DENSITY ESTIMATE 49

X is a scaler. Unless otherwise specified, we assume X is a scale in the rest of the chapter. Some basic conditions on the kernel function and the density functions are as follows.

1. K(x) is a density function.

2. R sK(s)ds = 0.

R 2 3. µ2 = s K(s)ds < ∞.

4. R(K) = R K2(s)ds < ∞.

5. lims→±∞ K(s) = 0.

6. f (x) is continuously differentiable to the second order and has bounded second derivative.

The above conditions can be relaxed to obtain the results we are set to discuss. It is not hard to verify that the density function of the standard normal distribution satisfies all the conditions listed. Hence, the normal density function can be used as a kernel function, which results in a kernel density estimator with the properties to be presented.

4.1.1 Bias of the kernel density estimator

A simple expression of the bias of the kernel density estimator can be easily ob- tained. Note that due to iid structure of the , Z Z E[ fˆ(x)] = Kh(t − x) f (t)dt = K(s) f (x + hs)ds.

When f (x) is continuous at x, we have

E[ fˆ(x)] → f (x) when h → 0 and K(x) is a density function. 50 CHAPTER 4. SMOOTHING METHOD

Assume the conditions on the density function f (x) and K(x) are all satisfied. Then h2 f (x + hs) = f (x) + f 0(x)hs + f 00(ξ)s2 2 for some ξ between x and x + hs. Under these conditions, we have

h2 Z E[ fˆ(x)] = f (x) + s2K(s) f 00(ξ)ds. 2 Since f 00(x) is continuous and bounded, we have Z 2 00 00 s K(s) f (ξ)ds → µ2 f (x) as h → 0. Consequently, we have

µ h2 E[ fˆ(x)] = f (x) + 2 f 00(x) + o(h2). 2 In conclusion, the bias of a kernel density estimator is typically in the order of h2 when, for example, the kernel function is chosen to be symmetric, and the density function has continuous, bounded second derivative. We now look into the variance properties of the kernel density estimation.

4.1.2 Variance of the kernel density estimator Due to the iid structure in the kernel density estimator, we have

−1 VAR( fˆ(x)) = n VAR(Kh(X − x)) where X is a generic random variable whose density function is given by f (x). It is easily verifiable that

2 VAR(Kh(X − x)) ≤ E[Kh(X − x)] .

2 The interesting part is that the difference is {E[Kh(X −x)]} which equal { f (x)+ O(h2)}2 according to the result in the last section. At the same time, it will be 2 −1 seen that E[Kh (X − x)] = O(h ) which tends to infinity as h → 0. Hence, the 2 leading term of the variance is determined purely by E[Kh (X − x)]. 4.1. KERNEL DENSITY ESTIMATE 51

Similar to the bias computation, we have Z 2 2 E[Kh (X − x)] = Kh (t − x) f (t)dt Z = h−1 K2(s) f (x + hs)ds

= h−1 f (x)R(K)ds(1 + o(1)).

Hence, we have

− VAR( fˆ(x)) = (nh) 1 f (x)R(K)(1 + o(1)). (4.2)

It is seen then that the of the kernel density estimator is

ˆ −1 2 4 2 00 2 −1 4 MSE( f (x)) = (nh) f (x)kKk + h µ2 [ f (x)] /4 + o((nh) + h ).

Thus, in order for fˆ(x) to be consistent, a set of necessary conditions on h are

h → 0, nh → ∞.

To minimize the order of MSE as n → ∞, we should choose h = n−1/5. The best choices of h at different x are not the same. Thus, one may instead search for h such that the integrated MSE is minimized. For this purpose, we further assume R [ f 00(x)]2dx < ∞ and the integration of the remainder term remains to the order before the integration. If so, we have the mean integrated squared error as Z −1 2 4 2 00 2 −1 4 MISE = (nh) kKk + h µ2 [ f (x)] dx/4 + o((nh) + h ). (4.3)

The optimal choice of h is then

 kKk2 1/5 h = . opt 2 R 00 2 nµ2 [ f (x)] dx With this h, we have 5 Z MISE = {µ2kKk4 [ f 00(x)]2dx}1/5n−4/5. opt 4 2 52 CHAPTER 4. SMOOTHING METHOD

In this expression, we cannot make R [ f 00(x)]2dx change as this comes with the data. We have some freedom to find a K to minimize the MISE. This is equivalent to minimize 2 4 µ2 kKk . (4.4) This quantity does not depend on the choice of the bandwidth h. That is, if the kernel function K minimizes (4.4), Kh also minimizes it as long as h > 0. The solution to this minimization problem is found to be

3 K(x) = (1 − x2)+ 4 which is called Epanechnikov kernel. It is found, however, other choices of K do not loss much of the efficiency. For example, choosing normal density function as the kernel function is 95% efficient. It means that one need about 5% more sample to achieve the same level of precision if the normal density function is used as kernel rather than the optimal Epanechnikov kernel is applied. In conclusion, the choice of K is not considered very important. The choice of the bandwidth parameter h is. There are thousands of papers published on this topic. We do not intend to carry you away here.

4.1.3 Asymptotic normality of the kernel density estimator

In most statistical research papers, we are interested in finding constant sequences an,bn such that d an[ fˆ(x) − f (x) − bn] −→ Y for some non-degenerate random variable Y. Such a result can then be used to construct confidence bounds for f (x) or perform hypothesis test. Since fˆ(x) has an iid structure, this Y must have normal distribution. ˆ ˆ −1 n Denote Zni = Kh(Xi − x) − E[Kh(Xi − x)], then f (x) − E[ f (x)] = n ∑i=1 Zni which is asymptotically normal when the Liapunov condition is satisfied. It is easy to verify that the Liapunov condition is satisfied when for some δ > 0,

Z K2+δ (s)ds < ∞; and f (x) > 0. (4.5) 4.2. NON-PARAMETRIC REGRESSION ANALYSIS 53

2 Let σn =Var(Zn1). Under (4.5) and other conditions specified earlier, we have 1 n d √ ∑ Zni N(0,1). nσn i=1 →

In general, we prefer to know the limiting distribution of fˆ(x) − f (x) rather than that of fˆ(x) − E[ fˆ(x)] after properly scaled. The above result helps if 1 √ n[E{Kh(X − x)} − f (x)] nσn

2 2 −1 −1 converges to some constant. Recall that σn = f (x)R (K)h +o(h ) and E[{Kh(X − 00 2 2 x)} − f (x)] = µ2 f (x)h /2 + o(h ). Hence,

1 1/2 1/2 2 1/2 5/2 √ n[E{Kh(X − x)} − f (x)] = O(n h h ) = O(n h ). nσn

By choosing h = n−1/5, the limiting distribution result becomes 1 √ [ fˆ(x) − f (x)] → N(a,σ 2) nh

00 2 with a = µ2[ f (x)]/2 and σ = f (x)R(K).

4.2 Non-parametric regression analysis

A related problem in statistics is the non-parametric regression analysis. The data in such applications consist of pairs (Xi,Yi), i = 1,2,...,n from some probability model. It is desirable to use X as predictor to predict the response value Y. From probability theory, E(Y|X) minimizes

2 E[Y − g(X)] among all measurable function of X. In statistical literature, m(X) = E(Y|X) is also called the regression function of Y with respect to X. In other applications, x values in the model are selected by a design. Thus, they are not random. A commonly assumed model for the data in this situation is

1/2 Yi = m(xi) + v (xi)εi (4.6) 54 CHAPTER 4. SMOOTHING METHOD for i = 1,2,...,n, where xi’s are design points and m(x) is the unknown non- parametric regression function, v(x) is the , and εi are random error. If v(x) is a constant function, the model is homoscedastic, otherwise, it is heteroscedastic. In both cases, random design or fixed design, our observations consist of n pairs of (Yi,Xi), i = 1,2,...,n.

4.2.1 Kernel regression estimator

Intuitively, m(x) = E(Y|X = x) is the average value of Y given X = x. When X is random and has continuous distribution, the event X = x has probability zero. Thus, in theory, the number of observations of (Xi,Yi) in the sample such that Xi = x for any x is practically zero. It is impossible to estimate m(x) for any given x unbiasedly. When it is reasonable to believe that m(x) is a continuous function of x, however, one may collect in a small neighborhood of x for the purpose of estimating m(x). Consider such a small neighborhood [x − h,x + h], the average value of Yi correspond to xi’s in this interval is

n n n n 1 1 ∑ yi (|Xi − x| ≤ h)/ ∑ (|Xi − x| ≤ h) = ∑ yiKh(Xi − x)/ ∑ Kh(Xi − x) i=1 i=1 i=1 i=1 1 1 where K(x) = 2 (|x| ≤ 1). Clearly, K(x) can be replaced by any other density function of x to get a general kernel regression estimator:

n n mˆ (x) = ∑ yiKh(Xi − x)/ ∑ Kh(Xi − x). (4.7) i=1 i=1 Both K and h play the same roles as in the kernel density estimate. When X is random and has absolutely continuous distribution, we may esti- mate the joint density function of (X,Y) by a kernel density estimator

n ˆ −1 f (x,y) = n ∑ Kh(yi − y)Kh(xi − x). i=1 In practice, we may choose two different kernel functions K and two unequal bandwidths h. We could also replace K(y)K(x) by a density function of a random vector with two dimension. The theory we intend to illustrate will not change. Thus, we will only discuss the problem under the above simplistic setting. 4.2. NON-PARAMETRIC REGRESSION ANALYSIS 55

The conditional density function of Y given X = x is naturally estimated by

fˆ(y|x) = fˆ(x,y)/ fˆ(x) (4.8) where fˆ(x) is the kernel density estimator of f (x) with kernel function K and the same bandwidth h. It is seen that Z y fˆ(y|x)dy = mˆ (x) (4.9) assuming the range of y is (−∞,∞). Otherwise, one can choose the kernel function K with compact support to ensure the validity of the equality. The above identity is then true for y not on the boundary, and when h is very small. We have seen that the kernel regression estimatorm ˆ (x) is well motivated both for random and non-random X.

4.2.2 Local polynomial regression estimator The kernel density estimator can be generalized or be regarded as a special case of another method. In any small neighborhood of a point x0, one may fit a polynomial model to the data:

p m(x) = β0 + β1(x − x0) + ··· + βp(x − x0) for some integer p ≥ 0. If m(x) is a smooth function, this model is justified by Taylor’s expansion at least for x-values in a small neighborhood of x0. Let N(x0;h) be small neighborhood of x0 indexed by h. One can then estimate m(x) by the least sum of squares based on data in N(x0,h). For a given p, we search for β0,...,βp to minimize

p 2 ∑ [Yi − {β0 + β1(x − x0) + ··· + βp(x − x0) }] . xi∈N(x0;h) Instead of defining a neighborhood directly, one may use a kernel function to reduce the weights of observations at xi which are far away from x0. Employing the same idea as in the kernel regression estimator, we select a suitable kernel function K and replace the above sum of squares by the sum of weighted squares: n p 2 ∑ Kh(xi − x0)[Yi − {β0 + β1(x − x0) + ··· + βp(x − x0) }] . i=1 56 CHAPTER 4. SMOOTHING METHOD

ˆ We then estimate m(x0) by β0. When we choose p = 0, this estimator reduces to the kernel regression estimator. Recently statistical literature indicates that the local polynomial regression method has some superior properties. It is, however, too much material for us to cover much of them in this course. Thus, we will only study the case when p = 0.

4.2.3 Asymptotic bias and variance for fixed design

Let us first consider the case when xi’s are not random and are equally spaced −1 in the unit interval [0, 1]. That is, let us assume that xi = n (i − 1/2) for i = 1,2,...,n. Under model (4.6) and by the definition of (4.7), we have n n E[mˆ (x)] = ∑ m(xi)Kh(xi − x)/ ∑ Kh(xi − x). i=1 i=1 At the same time, by the mean value theorem for integrals, we have Z 1 n Z i/n m(t)Kh(t − x)dt = ∑ m(t)Kh(t − x)dt 0 i=1 (i−1)/n n = ∑ m(ti)Kh(ti − x) i=1 for some ti’s in [(i − 1)/n,i/n]. Thus, we have Z 1 n −1 | m(t)Kh(t − x)dt − n ∑ m(xi)Kh(xi − x)| 0 i=1 n −1 ≤ (nh) ∑ |m(xi)Kh(xi − x) − m(ti)Kh(ti − x)| i=1 = O((nh2)−1) when m(x) and K(x) both have bounded first derivatives. If K has compact sup- port, and x is an interior point, the range of the summation or integration can be restricted to an interval of length proportional to h. Consequently, the order as- sessment can be refined to O((nh)−1). At the same time, when x is an inner point of the unit interval, Z 1 Z 00 µ2m (x) 2 2 m(t)Kh(t − x)dt = K(s)m(x + hs)ds = m(x) + h + o(h ). 0 2 4.2. NON-PARAMETRIC REGRESSION ANALYSIS 57

Hence, n 00 −1 µ2m (x) 2 2 2 −1 n ∑ m(xi)Kh(xi − x) = m(x) + h + o(h ) + O((nh ) ). i=1 2 Using the same technique, we have n 2 2 −1 ∑ Kh(xi − x) = 1 + o(h ) + O((nh ) ). i=1 Combined, we have µ m00(x) [mˆ (x)] = m(x) + 2 h2 + o(h2) + O((nh2)−1). E 2 The order assessment here is slightly different from the literature, for example, page 122 of Wand and Jones (1995). One may investigate more closely on the order of the error when we approximate the summation with integration to find if ours is not precise enough. The computation of the asymptotic variance is done in the similar fashion. We have n n 2 2 VAR(mˆ (x)) = ∑ v(xi)Kh (xi − x)/[∑ Kh(xi − x)] i=1 i=1 = (nh)−1v(x)R(K) + o(h2 + (nh)−1).

Again, the order assessment here is different from some standard literature.

4.2.4 Bias and variance under random design When X is random, the bias of the kernel regression estimator is harder to deter- mine if we interpret the bias very rigidly. The main problem is from the fact that the kernel regression estimator is a ratio of two random variables. It is well known that for any two random variables X and Y, it is usually true that E[X/Y] 6= E(X)/E(Y). When Y takes a value near 0, the ratio becomes very un- stable. The unstable problem does not get much better even if the chance for Y close to 0 is very small. To avoid this problem, we adopt a notion of the asymptotic bias and variance. −1 Suppose an (Zn −bn) → Z in distribution such that E(Z) = 0 and Var(Z) = 1. We 58 CHAPTER 4. SMOOTHING METHOD

2 define the asymptotic mean and variance of Zn as bn and an. A similar definition has been given in Shao (1998), however, this definition has not appeared anywhere else to my best knowledge. n n The numerator Un = ∑i=1 yiKh(Xi −x) and the denominator Vn = ∑i=1 Kh(Xi − x) inm ˆ (x) are both sum of iid random variables. We look for proper constants an and (un,vn) such that

an[(Un − un,Vn − vn)] has limiting distribution. For this purpose, we first search for approximate means and variances of Un and Vn. It is seen that

n E[Un] = E[∑ m(Xi)K(Xi − x)] i=1 Z = n m(t)Kh(t − x) f (t)dt Z = n m(x + sh) f (x + sh)K(s)ds 1 = n[m(x) f (x) + {m00(x) f (x) + 2m0(x) f 0(x) + m(x) f 00(x)}µ h2] 2 2 +o(nh2). (4.10)

Thus, we put

1 u = n[m(x) f (x) + {m00(x) f (x) + 2m0(x) f 0(x) + m(x) f 00(x)}µ h2]. n 2 2

Similarly, we have

n n  2    VAR[Un] = E ∑ v(Xi)Kh (Xi − x) + VAR ∑ m(Xi)Kh(Xi − x) . (4.11) i=1 i=1

It is seen that Z 2 2 E[v(X)Kh (X − x)] = v(t)Kh (t − x) f (t)dt = h−1v(x) f (x)R(K) + o(h−1). 4.2. NON-PARAMETRIC REGRESSION ANALYSIS 59

For the second term in (4.10), we have

n VAR[∑ m(Xi)Kh(Xi − x)] i=1 = nVAR[m(X)Kh(Xi − x)] 2 2 2 = n[E{m (X)Kh (X − x)} − {Em(X)Kh(X − x)} ] 2 2 = nE{m (X)Kh (X − x)} + O(n).

Further, Z 2 2 2 2 E{m (X)Kh (X − x)} = m (t)Kh (t − x) f (t)dt Z = h−1 m2(x + sh) f (x + sh)K(s)ds = h−1m2(x) f (x)R(K) + o(h−1).

In conclusion, we have shown

−1 2 −1 VAR(Un) = nh {v(x) + m (x)} f (x)R(K) + o(nh ).

Using similar calculation, we have 1 [V ] = n[ f (x) + f 00(x)µ h2] + o(nh2), E n 2 2 −1 −1 VAR(Vn) = nh f (x)R(K) + o(nh ).

In view of the above bias and variance results, it is apparent that we should −1/5 −(3/5) choose h and therefore an = n to produce some meaningful limiting distribution. Assume that the conditions for the joint asymptotic normality of −3/5 −1 n [Un − E(Un),Vn − E(Vn)] are satisfied. For convenience, write U¯n = n Un and so on for the sake that U¯n → m(x) f (x) in probability. It makes the following presentation simpler. We have

2/5 n [U¯n − u¯n,V¯n − v¯n] → N(0,∆) (4.12)

2 for matrix ∆ consists of δ11 = {v(x)+m (x)} f (x)R(K), δ22 = f (x)R(K) and δ12 = m(x) f (x)R(K). The computation of δ12 is left out as an assignment problem. 60 CHAPTER 4. SMOOTHING METHOD

Finally, we have

U¯n/V¯n − m(x) = [U¯n/V¯n − u¯n/v¯n] + [u¯n/v¯n − m(x)]. and 1 [u¯ /v¯ − m(x)] = {m00(x) + 2µ m0(x) f 0(x)/ f (x)}h2 + o(h2). n n 2 2 With that, we have U¯  n2/5{mˆ (x) − m(x)} = n2/5 n − m(x) V¯n 00 0 00 = {m (x) + 2m (x) f (x)/ f (x)µ2} 2/5 n [{U¯n − u¯n}v¯n + u¯n{v¯n −V¯n)}] + + op(1) Vnvn 00 0 00 = {m (x) + 2m (x) f (x)/ f (x)µ2} 2/5 n [{U¯n − u¯n}v¯n + u¯n{v¯n −V¯n)}] + 2 + op(1). vn It is then obvious that

n2/5{mˆ (x) − m(x)} → N(a,σ 2),

µ2 00 0 0 with a = 2 {m (x) + 2m (x) f (x)/ f (x)} and f 2(x)δ + 2m(x) f 2(x)δ + m2(x) f 2(x)δ σ 2 = 11 12 22 f 4(x) = v(x)R(K){ f (x)}−1. (4.13)

Because of this, it is widely cited that the asymptotic bias ofm ˆ (x) is given by 1 {m00(x) + 2m0(x) f 00(x)/ f (x)µ }h2 2 2 and the asymptotic variance is given by

(nh)−1v(x)R(K){ f (x)}−1.

The citation is not entirely true as two important conditions cannot be ignored: (1) K(x) has compact support; (2) the bandwidth parameter h = O(n−1/5). 4.3. ASSIGNMENT PROBLEMS 61 4.3 Assignment problems

1. Verify that the Liapunov condition is satisfied when (4.5) is met in addition to other conditions on K and f specified before (4.5).

2. Show that when the second moments of X and Y exist,

2 E[Y − g(X)]

is minimized among all possible choice of measurable function of X when g(X) = E{Y|X}. 3. Verify the order assessment given in (4.10). Present your own result if your assessment is different.

4. Prove that (4.9) is true as defined in the content. Why is it necessary to assume that the range of x is the entire space of real numbers? List two meaningful generalizations of this result (which is too restrictive in many ways).

5. Verify the result on the variance ∆ defined in (4.12).

6. Verify the result given in (4.13) 62 CHAPTER 4. SMOOTHING METHOD Chapter 5

Asymptotic Results in Finite Mixture Models

5.1 Finite mixture model

In statistics, a model means a family of probability distributions. Given a random variable X, its cumulative distribution function (c.d.f. ) is defined to be F(x) = P(X ≤ x) for x ∈ R. Let p ∈ (0,1) and

n F(x) = ∑ pk(1 − p)n−k 0≤k≤x k for x ∈ R and the summation on k is over integers. A random variable with its c.d.f. having the above algebraic expression is known as binomially distributed, or it is a binomial random variable. It contains two parameters n and p. A par- ticular pair of values in n and p gives one particular . The binomial distribution family is the collection of binomial distributions with all possible parameter values in n and p. We may form a narrower binomial distribu- tion family by holding n fixed. Whether or not n is fixed, this distribution family is regarded as Binomial Model. A discrete integer valued random variable has its p.m.f. given by

f (x) = P(X = x) = F(x) − F(x − 1)

63 64 CHAPTER 5. ASYMPTOTIC RESULTS IN FINITE MIXTURE MODELS for x = 0,±1,.... For a binomial X, its p.m.f. is given by   n k n−k BIN(k;n, p) = P(X = k) = p (1 − p) k for k = 0,1,...,n. We write X ∼ BIN(n, p). Let π be a value between 0 and 1 and let f (k) = π BIN(k;n, p1) + (1 − π)BIN(k;n, p2) (5.1) for k = 0,1,...,n with some parameters n and p1, p2 ∈ [0,1]. Clearly, the above f (·) is also a p.m.f. The distributions whose p.m.f. have algebraic structure (5.1) form a new distribution family. Because f (·) is a convex combination of two p.m.f. ’s from a well known distribution family, its distribution is called a binomial . We subsequently have a binomial mixture model. Be aware that we use f (·) very loosely as a general symbol for a p.m.f. or a p.d.f. Its specifics may change from one paragraph to another. We must not interpret it as a specific p.m.f. or p.d.f. with great care. Let { f (x;θ) : θ ∈ Θ} be a and G(θ) be a c.d.f. on Θ. We obtain a mixture distribution represented by Z f (x;G) = f (x;θ)dG(θ), (5.2) Θ where the integration is understood in the Lebesgue-Stieltjes sense. When G(·) is R absolutely continuous with density function g(θ), the integration equals Θ f (x;θ)g(θ)dθ. We are often interested in the situation where G is a discrete distribution assign- ing positive probabilities π1,...,πm to finite number of θ-values, θ1,...,θm such m that ∑ j=1 π j = 1. In this case, the mixture density (5.2) is reduced to the familiar convex combination: m f (x;G) = ∑ π j f (x;θ j). (5.3) j=1 Because a distribution is also referred as a population in some context, we there- fore also call f (x;θ j) a sub-population of the mixture population. We call θ j sub-population parameter and π j the corresponding mixing proportion. When all π j > 0, these component parameters {θ1,...,θm} are the support points of G. The order of the mixture model is m if G has at most m support points. The density functions f (x;G) in (5.2) form a mixture distribution family, and therefore a mixture model. The density functions f (x;G) in (5.3) form a finite 5.2. TEST OF HOMOGENEITY 65 mixture distribution family, and therefore a finite mixture model. The collection of the mixing distribution will be denoted as G. A mixture model is a distribution family characterized by Z { f (x;G) = f (x;θ)dG(θ) : G ∈ G} Θ which requires both { f (x;θ) : θ ∈ Θ} and G fully specified. We will use F(x;θ) as the c.d.f. of the component density function f (x;θ) and similarly for F(x;G) and f (x;G). The same symbols F and f are used for both component and mixture distributions, we have to pay attention to the symbol in their entry. We also use G({θ}) for the probability the distribution G assigns to a specific θ value and similarly for F({x}). We may also refer f (x;G) as a mixture density, a mixture distribution or a mixture model. We have now completed the introduction of the mixture model.

5.2 Test of homogeneity

Finite mixture models are often used to help determine whether or not data were generated from a homogeneous or heterogeneous population. Let X1,...,Xn be a sample from the mixture p.d.f.

(1 − γ) f (x,θ1) + γ f (x,θ2), (5.4) where θ1 ≤ θ2 ∈ Θ and 0 ≤ γ ≤ 1. We wish to test the hypothesis

H0 : θ1 = θ2, (or equivalently γ = 0, or γ = 1). Namely, we test whether or not the observations come from a homogeneous pop- ulation f (x,θ). In applications, the null model is the default position. That is, unless there is a strong evidence in contradiction, the null model is regarded as “true”. At the same time, searching for evince against the null model in favour of a specific type of alternative is a way to establishing the new theory. We do not blindly trust a new theory unless it survives serious challenges. There are many approaches to this hypothesis test problem. Given the nice i.i.d. structure and the parametric form of the finite mixture model, the classical likelihood ratio test will be the one to be discussed. 66 CHAPTER 5. ASYMPTOTIC RESULTS IN FINITE MIXTURE MODELS 5.3 Binomial mixture example

Consider the situation where the kernel distribution is Binomial with known size parameter m and probability of success parameter θ. Let X1,...,Xn be a set of i.i.d. random variables with common finite mixture of binomial distributions such that P(X = k) = (1 − π)BIN(k;m,θ1) + π BIN(k;m,θ2) where π and 1−π are mixing proportions, and θ1 and θ2 are component distribu- tion parameters. Clearly, the parameter space of the mixing parameters θ1 and θ2 are bounded. The likelihood ratio statistic is stochastically bounded. We now first demonstrate this fact. Let nk be the number of observed Xi = k for k = 0,1,...,m. The log- is given by m `n(π,θ1,θ2) = ∑ nk log{(1 − π) f (k;m,θ1) + π f (k;m,θ2)} k=0

Let θˆk = nk/n. By Jensen’s inequality, we have m ˆ ˆ `n(π,θ1,θ2) ≤ n ∑ θk logθk k=0 ∗ for any choices of π, θ1 and θ2. Let pk = P(X = k) under the true distribution of X. It is well known that in distribution, we have m m ∗ 2 n ∑ pˆk logp ˆk − n ∑ pˆk log pk → χm k=0 k=0 as n → ∞. Let us also use Mn for the likelihood ratio statistic. We find m m ∗ Mn = 2{sup`n(π, p1, p2) − sup`n(1, p, p)} ≤ n ∑ pˆk logp ˆk − n ∑ pˆk log pk k=0 k=0

2 which has a χm limiting distribution. That is, Mn = Op(1), or it is stochastically bounded. One should have realized that the conclusion Mn = Op(1) does not particularly rely on Xi’s having a binomial distribution. The conclusion remains true when Xi’s have common and finite number of support points. 5.3. BINOMIAL MIXTURE EXAMPLE 67

In spite of boundedness of Mn for binomial mixtures, finding the limiting dis- tribution of Mn troubled statisticians as well as geneticists for a long time. The first most satisfactory answer is given by Chernoff and Lander(1985JSPI). Unlike the results under regular models, the limiting distribution of the likelihood ratio statistic under binomial mixtures is not an asymptotic pivotal. The outcomes vary according to the size of m, and true null distribution p and so on. We now use the simplest case with m = 2 special component parameter values for illustration. In this case,

H0 : f (k;2,0.5), Ha : (1 − π) f (k;2,0.5) + π f (k;2,0.5 + θ). We investigate the limiting distribution of the likelihood ratio statistic in the pres- ence of n i.i.d. observations from H0. Under the Ha, the parameter space of (π,θ) is a unit square [0,1] × [0,1]. The null hypothesis is made of two lines in this unit square: one is formed by π = 0 and the other is θ = 0.5. Unlike the hypothesis test problems under regular models, all points on these two lines parameterize the same distribution. The derivation of new limiting distribution naturally starts from how to avoid this non-identiability. Chernoff and Lander were the first to use parameter transformation. Using the parameter setting under the alternative model, we note

P(X = 0) = 0.25(1 − π) + π(1 − θ)2; P(X = 1) = 0.5(1 − π) + 2πθ(1 − θ); P(X = 2) = 0.25(1 − π) + πθ 2.

If the data are from the null model, parameters π,θ are not uniquely defined and therefore cannot be consistently estimated. At the same time, let

2 ξ1 = 0.5π(θ − 0.5); ξ2 = π(θ − 0.5) .

The null model is uniquely defined by ξ1 = ξ2 = 0. After the parameter transfor- mation, we find

P(X = 0) = 0.25 − 2ξ1 + ξ2;

P(X = 1) = 0.5 − 2ξ2;

P(X = 2) = 0.25 + 2ξ1 + ξ2. 68 CHAPTER 5. ASYMPTOTIC RESULTS IN FINITE MIXTURE MODELS

Letp ˆ0, pˆ1, pˆ2 be sample proportions of X = 0,1,2. The likelihood would be max- imized by setting

pˆ0 = 0.25 − 2ξ1 + ξ2;

pˆ1 = 0.5 − 2ξ2;

pˆ2 = 0.25 + 2ξ1 + ξ2.

The unconstrained solution is given by ˜ ξ1 = (pˆ2 − pˆ0)/4 ˜ ξ2 = (pˆ2 + pˆ0 − 0.5)/2

At the same time, because

2 π = 4ξ1 /ξ2, θ = 0.5(1 + ξ2/ξ1) and they have range [−0.25,0.25] × [0,1], we must have

2 |ξ2| ≤ ξ1,4ξ1 ≤ ξ2.

The range is shown in the following figure. In addition, after the shaded region is expanded, it is well approximated by a cone as show by the plot on the right hand side: |ξ2| ≤ ξ1,ξ2 ≥ 0. Let ϕ be the angle from the positive x-axis. Then the cone contains two dis- joint regions: 0 < ϕ < π/4 and 3π/4 < ϕ < π/2. This π is the mathematical constant for the ratio bewteen a circle’s circumference to its diameter. At the same time, under the null model, and using the classical central limit theorem, √ ˜ ˜ d n(ξ1,ξ2) −→ (Z1,Z2) −1 where (Z1,Z2) are bivariate normal with covariance matrix I and I is the with respect to (ξ1,ξ2):

 32 0  I = . 0 16

The asymptotic variance can also be directly computed. 5.3. BINOMIAL MIXTURE EXAMPLE 69

Applying Chernoff (1954), we may now regard the hypothesis test problem as testing H0 : ξ1 = ξ2 = 0 against the alternative Ha : |ξ2| ≤ ξ1,ξ2 ≥ 0 given a single pair of observation (Z1,Z2) with mean (ξ1,ξ2) and covariance ma- trix I−1 to obtain the limiting distribution of the original likelihood ratio test. Note that the log-likelihood function is given by

2 2 `(ξ1,ξ2) = −16(Z1 − ξ1) − 8(Z2 − ξ2) + c where c is parameter free constant. There are three cases depending on the observed value of (Z1,Z2) to obtain analytical solutions to the likelihood ratio test statistic. ˆ ˆ Case I: |Z2| ≤ Z1,Z2 ≥ 0. In this case the MLE ξ1 = Z1 and ξ2 = Z2. Hence, the likelihood ratio statistic is

ˆ ˆ 2 2 2 T = −2{`(ξ1,ξ2) − l(0,0)} = 32Z1 + 16Z2 ∼ χ2 ˆ ˆ Case II: Z2 < 0. In this case the MLE ξ1 = Z1 and ξ2 = 0. Hence, the likelihood ratio statistic is

ˆ ˆ 2 2 T = −2{`(ξ1,ξ2) − `(0,0)} = 32Z1 ∼ χ1

Case II—: |Z1| > Z2 ≥ 0. In this case, the likelihood is maximized when ˆ ˆ ξ1 = ξ2. the MLE ξ1 = ξ2 = (2Z1 + Z2)/3. Hence, the likelihood ratio statistic is

ˆ ˆ 2 T = −2{`(ξ1,ξ2) − `(0,0)} = (16/3)(Z2 − Z1)

It can be verified that the event {|Z2| ≤ Z1,Z2 ≥ 0} in Case I is independent of 2 2 {32Z1 + 16Z2 ≤ z} for any z. The same is true for the event in Case II. However, 2 the event in the third case is not independent of (Z2 −Z1) ≤ z. Thus, we conclude the limiting distribution of the likelihood ratio test statistic T is given by

2 2 P(T ≤ t) = 0.5P(χ1 ≤ t) + 2λP(χ2 ≤ t) 2 +(0.5 − 2λ)P{(Z2 − Z1) ≤ 3t/16||Z1| > Z2 ≥ 0} 70 CHAPTER 5. ASYMPTOTIC RESULTS IN FINITE MIXTURE MODELS √ where λ = arctan(1/ 2)/(2π) and this π = 3.14···. Chernoff and Lander (1995) presented this result by introducing two standard normal Y1,Y2. It may not be obvious that this above result is the same. Using the same approach, it is possible to go some distance. For instance, if one considers

H0 : π(1 − π)(θ2 − θ1) = 0 against the general alternative

H0 : π(1 − π)(θ2 − θ1) 6= 0.

Then if the null model is not θ1 = θ2 = 0, the limiting distribution of the likelihood ratio statistic when m = 2 is 2 2 0.5χ1 + 0.5χ2 .

The limiting distribution when m = 3 can be similarly derived.

5.4 C(α) test

The general C(α) test is designed to test a specific null value of a parameter of interest in the presence of nuisance parameters. More specifically, suppose the is made of a family of density functions f (x;ξ,η) with some appropriate parameter space for (ξ,η). The problem of interest is to test for H0 : ξ = ξ0 versus the alternative Ha : ξ 6= ξ0. That is, the parameter value of η is left unspecified in both hypotheses and it is of no interest. This observation earns its name as a nuisance parameter. Due to its presence, the null and alternative hypotheses are composite as oppose to simple, as both contain more than a single parameter value in terms of (ξ,η). As in common practice of methodological development in statistical significance test, the size of the test is set at some α value between 0 and 1. Working on composite hypothesis and having size α appear to be the reasons behind the name C(α). While our interest lies in the use of C(α) to test for homogeneity in the context of mixture models, it is helpful to have a generic introduction. 5.4. C(α) TEST 71

5.4.1 The generic C(α) test

To motivate the C(α) test, let us first examine the situation where the model is free from nuisance parameters. Denote the model without nuisance parameter as a distribution family f (x;ξ) with some parameter space in ξ. In addition, we assume that this family is regular. Namely, the density function is differentiable in ξ for all x, the derivative and the integration can be exchanged and so on. Based on an i.i.d. sample x1,...,xn, the score function of ξ is given by

n ∂ log f (xi;ξ) Sn(ξ) = ∑ . i=1 ∂ξ

When the true distribution is given by f (x;ξ0), we have E{Sn(ξ0)} = 0. Define the Fisher information (matrix) to be   ∂ log f (xi;ξ) ∂ log f (xi;ξ) τ I(ξ) = E { }{ } . ∂ξ ∂ξ

It is well known that

τ −1 d 2 Sn(ξ0){nI(ξ0)} Sn(ξ0) −→ χd where d is the dimension of ξ. Clearly, a test for H0 : ξ = ξ0 versus the alternative Ha : ξ 6= ξ0 based on Sn can be sensibly constructed with rejection region given by τ −1 2 Sn(ξ0){I (ξ0)}Sn(ξ0) ≥ nξd (1 − α). We call it and credit its invention to Rao(??). When the dimension of ξ is d = 1, then the test can be equivalently defined based on the asymptotic normality of Sn(ξ0). In application, one may replace nI(ξ0) by observed information and evaluate it at a root-n consistent estimator ξˆ. Back to the general situation where the model is given by f (x;ξ,η). If η value in f (x;ξ,η) is in fact known, the test problem reduces to the one we have just described. We may then proceed as follows. Define

n ∂ log f (xi;ξ,η) Sn(ξ;η) = ∑ . i=1 ∂ξ 72 CHAPTER 5. ASYMPTOTIC RESULTS IN FINITE MIXTURE MODELS

The semicolon indicates that the “score” operation is with respect to only ξ. Sim- ilarly let us define the ξ-specific Fisher information to be   ∂ log f (X;ξ,η) ∂ log f (X;ξ,η) τ I11(ξ,η) = E { }{ } . ∂ξ ∂ξ With the value of η specified, we have a score test statistic and its limiting distri- bution τ −1 d 2 Sn(ξ0;η){nI11(ξ0,η)} Sn(ξ0;η) −→ χd . A score test can therefore be carried out using this statistic. Without a known η value, the temptation is to have η replaced by an efficient or root-n consistent estimator. In general, the chisquare limiting distribution of

τ −1 Sn(ξ0;ηˆ ){nI(ξ0;ηˆ )} Sn(ξ0;ηˆ ) is no longer the simple chisquare. For a specific choice of ηˆ , we may work out τ the limiting distribution of Sn(ξ0;ηˆ ) and similarly define a new test statistic. The approach of Neyman (1959) achieved this goal in a graceful way. To explain the scheme of Nayman, let us introduce the other score function n ∂ log f (xi;ξ,η) Sn(η;ξ) = ∑ . i=1 ∂η The above notation highlights that the “score operation” is with respect to only η. Next, let us define the other part of the Fisher information matrix to be   τ ∂ log f (X;ξ,η) ∂ log f (X;ξ,η) τ I12(ξ;η) = I (ξ;η) = E { }{ } 21 ∂ξ ∂η and   ∂ log f (X;ξ,η) ∂ log f (X;ξ,η) τ I22(ξ;η) = E { }{ } . ∂η ∂η

Let us now project Sn(ξ;η) into the orthogonal space of Sn(η;ξ) to get

−1 Wn(ξ,η) = Sn(ξ;η) − I12(ξ;η)I22 (ξ;η)Sn(η;ξ). (5.5) Conceptually, it removes the influence of the nuisance parameter η on the score function of ξ. At the true parameter value of ξ,η,

−1/2 d −1 n Wn(ξ,η) −→ N(0,{I11 − I12I22 I21}). 5.4. C(α) TEST 73

Under the null hypothesis, the value of ξ is specified as ξ0, the value of η is un- specified. Thus, we naturally try to construct a test statistic based on Wn(ξ0,ηˆ ) where ηˆ is some root-n estimator of η. For this purpose, we must know the dis- tribution of Wn(ξ0,ηˆ ). The following result of Neyman (1959) makes the answer to this question simple.

Theorem 5.1 Suppose x1,...,xn is an i.i.d. sample from f (x;ξ,η). Let Wn(ξ,η) be defined as (5.5) together with other accompanying notations. Let ηˆ be a root-n consistent estimator of η when ξ = ξ0. We have 1/2 Wn(ξ0,η) −Wn(ξ0,ηˆ ) = op(n ) as n → ∞, under any distribution where ξ = ξ0.

Due to the above theorem, the limiting distribution of Wn(ξ0,ηˆ ) is the same as that of Wn(ξ0,η) with (ξ0,η) being the true parameter values of the model that generated the data. Thus, τ −1 −1 Wn (ξ0,ηˆ )[n{I11 − I12I22 I21}] Wn(ξ0,ηˆ ) may be used as the final C(α) test statistic. The information matrix in the above definition is evaluated at ξ0,ηˆ , and the rejection region can be decided based on its chisquare limiting distribution. If we choose ηˆ as the constrained maximum likelihood estimator given ξ = ξ0, we would have Wn(ξ0,ηˆ ) = Sn(ξ0;ηˆ ) in (5.5). The projected score function Sn(ξ0,η) is one of many possible zero-mean functions satisfying some regularity properties. Neyman(1959) called such class of functions Cramer´ functions. Each Cramer´ function can be projected to obtain a corresponding Wn and therefore a test statistic for H0 : ξ = ξ0. Within this class, the test based on score function Sn(ξ0,η) is optimal: having the highest asymptotic power against some local al- ternatives. In general, if the local alternative is of two-sided nature, the optimality based on the notion of “uniformly most powerful” cannot be achieved. The local optimality has to be justified in a more restricted sense.

5.4.2 C(α) test for homogeneity As shown in the last subsection, the C(α) test is designed to test for a special null hypothesis in the presence of some nuisance parameters. The most convincing 74 CHAPTER 5. ASYMPTOTIC RESULTS IN FINITE MIXTURE MODELS example of its application is for homogeneity test. Recall that a mixture model is represented by its density function in the form of Z f (x;G) = f (x;θ)dG(θ). Θ Neyman and Scott (1966) regarded the variance of the mixing distribution Ψ as the parameter of interest, and the mean and other aspects of Ψ as nuisance param- eters. In the simplest situation where Θ = R, we may rewrite the mixture density function as Z √ ϕ(x;θ,σ,Ψ) = f (x;θ + σξ)dΨ(ξ) (5.6) Θ such that the standardized mixing distribution Ψ(·) has mean 0 and variance 1. √ The null hypothesis is H0 : σ = 0. The rational of the choice of σ instead of σ in the above definition will be seen later. Both θ and the mixing distribution Ψ are nuisance parameters. The partial derivative of ϕ(x;θ,σ,Ψ) with respect to σ is given by

R 0 √ ∂ϕ(x;θ,σ,Ψ) Θ ξ f (x;θ + σξ)dΨ(ξ) = √ R √ . ∂σ 2 σ Θ f (x;θ + σξ)dΨ(ξ) At σ = 0 or let σ ↓ 0, we find

00 ∂ϕ(x;θ,σ,Ψ) f (x;θ) = . ∂σ σ↓0 2 f (x;θ)

This is the score function for σ based on a single observation. We may notice √ that the choice of σ gives us the non-degenerate score function. The partial derivative of ϕ(x;θ,σ,Ψ) with respect to θ is given by

R 0 √ ∂ϕ(x;θ,σ,Ψ) Θ f (x;θ + σξ)dΨ(ξ) = R √ ∂θ Θ f (x;θ + σξ)dΨ(ξ) which leads to score function for θ based on a single observation as:

∂ϕ(x;θ,0,Ψ) f 0(x;θ) = . ∂θ f (x;θ)

Both of them are free from the mixing distribution Ψ. 5.4. C(α) TEST 75

Let us now define 0 00 f (xi;θ) f (xi;θ) yi(θ) = , zi(θ) = (5.7) f (xi;θ) 2 f (xi;θ) with xi’s being i.i.d. observations from the mixture model. The score functions n n based on the entire sample are ∑i=1 Zi(θ) and ∑i=1 Yi(θ) for the mean and vari- ance of G. Based on the principle of deriving test statistic Wn in the last sub- section, we first project zi(θ) into space of yi(θ), and make use of the residual wi(θ) = zi(θ) − β(θ)yi(θ). The regression coefficient E{Y1(θ)Z1(θ)} β(θ) = 2 . E{Y1 (θ)} We capitalized Y and Z to indicate their status as random variables. The expecta- tion is with respect to the homogeneous model f (x;θ). When θˆ is the maximum likelihood estimator of θ under the homogeneous model assumption f (x,θ), the C(α) statistic has a simple form: n ˆ n ˆ ∑i=1 Wi(θ) ∑i=1 Zi(θ) Wn = q = q (5.8) nν(θˆ) nν(θˆ) 2 with ν(θ) = E{W1 (θ)}. Because the parameter of interest is univariate, we can skip the step of creating a quadratic form. Clearly, Wn has standard normal lim- iting distribution and the homogeneity null hypothesis is one-sided. At a given significance level α, we reject the homogeneity hypothesis H0 when Wn > zα . This is the C(α) test for homogeneity. In deriving the C(α) statistic, we assumed the parameter space Θ = R. With this parameter space, if G(·) is a mixing distribution on Θ, so is G((θ − θ ∗)/σ) ∗ + for any θ and σ ≥ 0. We have made use of this fact in (5.5). If Θ = R as in the Poisson mixture model where θ ≥ 0, G((θ − θ ∗)/σ) cannot be regarded as a legitimate mixing distribution for some θ ∗ and σ. In the specific example of Poisson mixture, one may re-parameterize model with ξ = logθ. However, there seems to be no unified approach in general, and the optimality consideration is at stake. Whether or not the mathematical derivation of Wn can be carried out as we did earlier for other forms of Θ, the statistic Wn remains a useful metric on the plausi- bility of the homogeneity hypothesis. The limiting distribution of Wn remains the same and it is usefulness in detecting the population heterogeneity. 76 CHAPTER 5. ASYMPTOTIC RESULTS IN FINITE MIXTURE MODELS

5.4.3 C(α) statistic under NEF-QVF Many commonly used distributions in statistics belong to a group of natural expo- nential families with quadratic variance function (NEF-QVF; Morris, 1982). The examples include normal, Poisson, binomial, and exponential. The density func- tion in one-parameter natural has a unified analytical form

f (x;θ) = h(x)exp{xφ − A(φ)}, with respect to some σ-finite measure, where θ = A0(φ) represents the mean pa- rameter. Let σ 2 = A00(φ) be the variance under f (x;θ). To be a member of NEF- QVF, there must exist constants a,b, and c such that

σ 2 = A00(φ) = aA0(φ) + bA0(φ) + c = aθ 2 + bθ + c. (5.9)

Namely, the variance is a quadratic function of the mean. When the kernel density function f (x;θ) is a member of NEF-QVF, the C(α) statistic has a particularly simple analytical form and simple interpretation.

Theorem 5.2 When the kernel f (x;θ) is a member of NEF-QVF, then the

n 2 2 ∑i=1(xi − x¯) − nσˆ Wn = , p2n(a + 1)σˆ 2

−1 n 2 2 where C(α) statistic is given by x¯ = n ∑i=1 xi and σˆ = ax¯ +bx¯+c with coeffi- cients given by (5.9) are the maximum likelihood estimators of θ and σ 2, respec- tively.

The analytical form of the C(α) test statistics for the normal, Poisson, bino- mial, and exponential kernels are included in Table 5.4.3 for easy reference. Their derivation is given in the next subsection.

Table 5.1: Analytical form of C(α) some NEF-QVF mixtures.

Kernel N(θ,1) Poisson(θ) BIN(m, p) Exp(θ) n (x −x¯)2−n n (x −x¯)2−nx¯ n (x −x¯)2−nx¯(m−x¯)/m n (x −x¯)2−nx¯2 C(α) ∑i=1 √i ∑i=1 √i ∑√i=1 i ∑i=1 √i 2n 2nx¯ 2n(1−1/m)x¯(m−x¯)/m 4nx¯2 5.4. C(α) TEST 77

n 2 Note that the C(α) statistics contains the factor ∑i=1(xi −x¯) which is a scaled up sample variance. The second term in the numerator of these C(α) statistics is the corresponding ‘estimated variance’ if the data are from the corresponding homogeneous NEF-QVF distribution. Hence, in each case, the test statistic is the difference between the ‘observed variance’ and the ‘perceived variance’ when the null hypothesis is true under the corresponding NEF-QVF kernel distribution assumption. The difference is then divided by their estimated null asymptotic variance. Thus, the C(α) test is the same as the ‘over-dispersion’ test.

5.4.4 Expressions of the C(α) statistics for NEF-VEF mixtures The quadratic variance function under the natural exponential family is char- acterized by its density function f (x;θ) = h(x)exp{xφ − A(φ)} and A00(φ) = aA0(φ) + bA0(φ) + c for some constants a, b, and c. The mean and variance are given by θ = A0(φ) and σ 2 = A00(φ). Taking derivatives with respect to φ on the quadratic relationship, we find that A000(φ) = {2aA0(φ) + b}A00(φ) = (2aθ + b)σ 2, A(4)(φ) = 2a{A00(φ)}2 + {2aA0(φ) + b}A000(φ) = 2aσ 4 + (2aθ + b)2σ 2. Because of the regularity of the exponential family, we have dk f (X;θ ∗)/dφ k  C = 0 f (X;θ ∗) for k = 1, 2, 3, 4 where θ ∗ is the true parameter value under the null model. This implies C {(X − θ ∗)3} = A000(φ ∗) = (2aθ ∗ + b)σ ∗2, C {(X − θ ∗)4} = 3{A00(φ ∗)}2 + A(4)(φ ∗) = (2a + 3)σ ∗4 + (2aθ ∗ + b)2σ ∗2, where φ ∗ is the value of the natural parameter corresponding to θ ∗, and similarly for σ ∗2. The ingredients of the C(α) statistics are f 0(X;θ ∗) (X − θ ∗) Y ( ∗) = = i , i θ ∗ ∗2 f (Xi;θ ) σ 00 ∗ ∗ 2 ∗ ∗ ∗2 ∗ f (X;θ ) (Xi − θ ) − (2aθ + b)(Xi − θ ) − σ Zi(θ ) = ∗ = . 2 f (Xi;θ ) 2{σ = 4 78 CHAPTER 5. ASYMPTOTIC RESULTS IN FINITE MIXTURE MODELS

We then have

$$
E\{Y_i(\theta^*)Z_i(\theta^*)\}
 = \frac{E\{(X_i-\theta^*)^3\} - (2a\theta^* + b)E\{(X_i-\theta^*)^2\} - \sigma^{*2}E(X_i-\theta^*)}{2\sigma^{*6}} = 0.
$$
Therefore, the regression coefficient of Z_i(θ*) against Y_i(θ*) is β(θ*) = 0. This leads to the projection
$$
W_i(\theta^*) = Z_i(\theta^*) - \beta(\theta^*)Y_i(\theta^*) = Z_i(\theta^*)
$$
and
$$
4\{\sigma^*\}^8 \mathrm{VAR}\{W_i(\theta^*)\}
 = \mathrm{VAR}\{(X_i-\theta^*)^2\} - 2(2a\theta^*+b)E\{(X_i-\theta^*)^3\}
   + (2a\theta^*+b)^2 E\{(X_i-\theta^*)^2\}
$$
$$
 = (2a+3)\{\sigma^*\}^4 + (2a\theta^*+b)^2\{\sigma^*\}^2 - \{\sigma^*\}^4
   - 2(2a\theta^*+b)^2\{\sigma^*\}^2 + (2a\theta^*+b)^2\{\sigma^*\}^2
 = (2a+2)\{\sigma^*\}^4.
$$
Hence, ν(θ*) = VAR{W_i(θ*)} = 0.5(a+1){σ*}^{-4}. Because the maximum likelihood estimator is θ̂ = X̄, we find that
$$
\sum_{i=1}^n W_i(\hat\theta) = \sum_{i=1}^n Z_i(\hat\theta)
 = \frac{\sum_{i=1}^n (X_i - \bar X)^2 - n\hat\sigma^2}{2\hat\sigma^4}
$$
with σ̂² = aX̄² + bX̄ + c due to invariance. The C(α) test statistic, W_n = ∑_{i=1}^n W_i(θ̂)/√(nν(θ̂)), is therefore given by the simplified expression in Theorem 5.2. ♦
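For instance, for the Poisson(θ) kernel the variance function is σ² = θ, so (a, b, c) = (0, 1, 0) in (5.9), and the formulas above give
$$
E\{(X-\theta^*)^3\} = \theta^*, \qquad E\{(X-\theta^*)^4\} = 3\theta^{*2} + \theta^*,
\qquad \nu(\theta^*) = \frac{1}{2\theta^{*2}},
$$
$$
\sum_{i=1}^n W_i(\hat\theta) = \frac{\sum_{i=1}^n (X_i - \bar X)^2 - n\bar X}{2\bar X^2},
\qquad
W_n = \frac{\sum_{i=1}^n (X_i - \bar X)^2 - n\bar X}{\sqrt{2n}\,\bar X},
$$
in agreement with the Poisson entry of Table 5.1.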

5.5 Brute-force likelihood ratio test for homogeneity

While the C(α) test is largely successful for testing homogeneity, statisticians remain faithful to the likelihood ratio test for homogeneity. We have already given the example of Hartigan (1985), which shows that the straightforward likelihood ratio statistic is stochastically unbounded. If we insist on the use of the likelihood ratio test for homogeneity in general, we must either re-scale the statistic or confine the model space so that the test statistic has a non-degenerate limiting distribution. The result of Chernoff and Lander (1995) on the binomial mixture is an example where the parameter space is naturally bounded. The result of Bickel and Chernoff (1993) is on the re-scaled likelihood ratio statistic. More general results are not easy to obtain. The first such attempt is by Basu and Ghosh (1995), who obtained the limiting distribution of the likelihood ratio statistic under a separation condition. Their result is a breakthrough in one way, but falls short of providing a satisfactory answer to the limiting distribution.

In this section, we work on the original likelihood ratio statistic for the homogeneity test with sufficient generality, yet the result is still limited in terms of developing a useful inference tool. Let f(x;θ) for θ ∈ Θ ⊂ R be a p.d.f. with respect to a σ-finite measure. We observe a random sample X_1,...,X_n of size n from a population with the mixture p.d.f.

$$
f(x;G) = (1-\pi)f(x;\theta_1) + \pi f(x;\theta_2),
\tag{5.10}
$$
where θ_j ∈ Θ, j = 1, 2, are component parameter values and π and 1 − π are the mixing proportions. Without loss of generality, we assume 0 ≤ π ≤ 1/2. In this set-up, the mixing distribution G has at most two distinct support points θ_1 and θ_2. We wish to test H_0: π = 0 or θ_1 = θ_2, versus the full model (5.10). As usual, the log likelihood function of the mixing distribution is given by

$$
\ell_n(G) = \sum_{i=1}^n \log\{(1-\pi)f(x_i;\theta_1) + \pi f(x_i;\theta_2)\}.
$$

The maximum likelihood estimator of G is a mixing distribution on Θ with at most two distinct support points at which ℓ_n(G) attains its maximum value. Without loss of generality, π̂ ≤ 1/2. We assume the finite mixture model (5.10) satisfies conditions for the consistency of Ĝ. The consistency of Ĝ is of particular interest when the true mixing distribution G* degenerates. A detailed analysis leads to the following results.

Lemma 5.1 Suppose Ĝ is consistent as discussed in this section. As n → ∞, both the MLE of θ_1 − θ* and the MLE of π(θ_2 − θ*) converge to 0 in probability when f(x;θ*) is the true distribution.

Proof: Let us denote the MLE of G as

$$
\hat G(\theta) = (1-\hat\pi)I(\hat\theta_1 \le \theta) + \hat\pi I(\hat\theta_2 \le \theta)
$$
with π̂ ≤ 1/2. The true mixing distribution under the homogeneity model is G*(θ) = I(θ* ≤ θ). Since Θ is compact, we find δ = inf_{θ∈Θ} exp(−|θ|) > 0. Thus, for the distance defined by (??), we have
$$
D(\hat G, G^*) = \int_\Theta |\hat G(\theta) - G^*(\theta)|\exp(-|\theta|)\,d\theta
\tag{5.11}
$$
$$
\ge \delta \int_\Theta |\hat G(\theta) - G^*(\theta)|\,d\theta
\tag{5.12}
$$
$$
= \delta\{(1-\hat\pi)|\hat\theta_1 - \theta^*| + \hat\pi|\hat\theta_2 - \theta^*|\}.
\tag{5.13}
$$

Since π̂ ≤ 1/2, the consistency of Ĝ implies both |θ̂_1 − θ*| → 0 and π̂|θ̂_2 − θ*| → 0 in probability. These are the conclusions of this lemma. ♦

The likelihood ratio test statistic is twice the difference between two maximized log likelihood values. We have discussed the one under the alternative model. When confined to the null hypothesis space, the log likelihood function simplifies to
$$
\ell_n(\theta) = \sum_{i=1}^n \log\{f(x_i;\theta)\}.
$$

Note that, as usual, ℓ_n(·) has been used as a generic notation. Denote its global maximum point in Θ as θ̂. The likelihood ratio test statistic is then

$$
R_n = 2\{\ell_n(\hat G) - \ell_n(\hat\theta)\}.
$$
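A minimal numerical sketch of this brute-force computation for the N(θ,1) kernel is given below, assuming SciPy is available; the compact parameter space [−3, 3], the multi-start strategy and the function names are illustrative choices rather than part of the notes.

import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def loglik_mix(params, x):
    # l_n(pi, theta1, theta2) for the two-component N(theta_j, 1) mixture (5.10)
    pi, th1, th2 = params
    dens = (1 - pi) * norm.pdf(x, th1, 1.0) + pi * norm.pdf(x, th2, 1.0)
    return np.sum(np.log(dens))

def lrt_homogeneity(x, box=3.0, n_starts=20, seed=0):
    # Brute-force R_n = 2{l_n(G_hat) - l_n(theta_hat)} with Theta = [-box, box]
    x = np.asarray(x, dtype=float)
    null_ll = loglik_mix((0.0, x.mean(), x.mean()), x)   # null MLE of theta is x_bar
    rng = np.random.default_rng(seed)
    best = -np.inf
    bounds = [(0.0, 0.5), (-box, box), (-box, box)]
    for _ in range(n_starts):                            # multiple starts against local maxima
        start = [rng.uniform(0, 0.5), rng.uniform(-box, box), rng.uniform(-box, box)]
        res = minimize(lambda p: -loglik_mix(p, x), start, method="L-BFGS-B", bounds=bounds)
        best = max(best, -res.fun)
    return 2.0 * (best - null_ll)

rng = np.random.default_rng(1)
print(lrt_homogeneity(rng.normal(size=300)))             # one realization of R_n under H_0

Simulating many such R_n values and comparing them with the limiting law of Theorem 5.3 below is a useful sanity check.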

We now focus on deriving its limiting distribution under a few additional conditions.

Strong identifiability. The kernel function f(x;θ), together with its first two derivatives f'(x;θ) and f''(x;θ), is identifiable by θ. That is, for any θ_1 ≠ θ_2 in Θ,
$$
\sum_{j=1}^{2}\{a_j f(x,\theta_j) + b_j f'(x,\theta_j) + c_j f''(x,\theta_j)\} = 0
$$
for all x implies that a_j = b_j = c_j = 0, j = 1, 2.

The identifiability required here is stronger than the ordinary one in the sense that, besides f(x,θ) itself, the first two derivatives are also identifiable. This was first proposed by Chen (1995) to establish that the best possible rate of estimating G is n^{-1/4} under some conditions. This topic will be discussed further in another chapter. The following quantities play important roles in our study:

$$
Y_i(\theta) = \frac{f(X_i,\theta) - f(X_i,\theta^*)}{(\theta - \theta^*)f(X_i,\theta^*)};
\tag{5.14}
$$
$$
Z_i(\theta) = \frac{Y_i(\theta) - Y_i(\theta^*)}{\theta - \theta^*}.
\tag{5.15}
$$
At θ = θ*, the above functions take their continuity limits as values. The projection residual of Z onto the space of Y is denoted as

$$
W_i(\theta) = Z_i(\theta) - h(\theta)Y_i(\theta^*),
\tag{5.16}
$$

where h(θ) = E{Y_i(θ*)Z_i(θ)}/E{Y_i²(θ*)}. These notations match the ones defined in the section on the C(α) test with a minor change: Y_i(θ*) differs by a factor of 2. Both Y_i(θ) and Z_i(θ) are continuous with E{Y_i(θ)} = 0 and E{Z_i(θ)} = 0, and Z_i(θ) can be approximated by the derivative of Y_i(θ) through the mean value theorem. They were regarded as score functions in the context of the C(α) test.

Uniform integrable upper bound. There exists an integrable function g and some δ > 0 such that |Y_i(θ)|^{4+δ} ≤ g(X_i) and |Y_i'(θ)|³ ≤ g(X_i) for all θ ∈ Θ.

Note that Y_i(θ) is uniformly continuous in X_i and θ over S × Θ, where S is any compact interval of real numbers. This implies equicontinuity of Y_i(θ) in θ for X_i ∈ S. According to Rubin (1956), this condition ensures that n^{-1}∑_{i=1}^n |Y_i(θ)|^k converges almost surely to E{|Y_1(θ)|^k} uniformly in θ ∈ Θ, for k ≤ 4. The same results hold for n^{-1}∑_{i=1}^n |Z_i(θ)|^k.

Lemma 5.2 Suppose the model (5.10) satisfies the strong identifiability condition and the uniform integrable upper bound condition. Then the covariance matrix of the vector (Y_1(θ*), Z_1(θ))' is positive definite for all θ ∈ Θ under the homogeneous model with θ = θ*.

Proof. The result is implied by the Cauchy–Schwarz inequality

$$
E^2\{Y_1(\theta^*)Z_1(\theta)\} \le E\{Y_1^2(\theta^*)\}E\{Z_1^2(\theta)\},
$$

where the equality holds only if Y_1(θ*) and Z_1(θ) are linearly related. The strong identifiability condition thus ensures that the inequality is strict. The integrable upper bound condition ensures that these expectations exist. ♦

The limiting distribution of the likelihood ratio statistic R_n will be described in terms of a stochastic process. The convergence of a sequence of stochastic processes is therefore an indispensable notion.

Lemma 5.3 The processes n^{-1/2}∑Y_i(θ), n^{-1/2}∑Y_i'(θ), n^{-1/2}∑Z_i(θ) and n^{-1/2}∑Z_i'(θ) over Θ are tight.

Proof. To see this, consider

$$
E\Big\{n^{-1/2}\sum_{i=1}^n Y_i(\theta_2) - n^{-1/2}\sum_{i=1}^n Y_i(\theta_1)\Big\}^2
 = E\{Y_1(\theta_2) - Y_1(\theta_1)\}^2
 \le E\{g^{2/3}(X_1)\}(\theta_2 - \theta_1)^2.
$$

By Theorem 12.3 of Billingsley (1968, p. 95), n^{-1/2}∑Y_i(θ) is tight. From this argument, we know that a sufficient condition for the tightness of n^{-1/2}∑Y_i'(θ) and n^{-1/2}∑Z_i'(θ) is that {Y_i''(θ)}² ≤ g(X_i) and {Z_i''(θ)}² ≤ g(X_i) for an integrable function g. ♦

Theorem 5.3 Suppose the mixture model (5.10) satisfies all conditions specified in this section. When f(x,θ*) is the true null distribution, the asymptotic distribution of the likelihood ratio test statistic for homogeneity is that of

$$
\Big\{\sup_{\theta\in\Theta} W^+(\theta)\Big\}^2,
$$
where W(θ), θ ∈ Θ, is a Gaussian process with mean 0, variance 1 and autocorrelation function ρ(θ,θ') given by
$$
\rho(\theta,\theta') = \frac{\mathrm{cov}\{W_1(\theta), W_1(\theta')\}}{\sqrt{E\{W_1^2(\theta)\}E\{W_1^2(\theta')\}}},
\tag{5.17}
$$
and W_1(θ) is defined by (5.16).

This result was first presented in Chen and Chen (199?). We notice that the result of ?? is more general. If one works hard enough, this result can be directly obtained from that one. However, the result here is presented in a much more comprehensive fashion.

5.5.1 Examples

The autocorrelation function ρ(θ,θ') of the Gaussian process W(θ) is well behaved for commonly used kernel functions. In particular, when Z_1(θ) and Y_1(θ*) are uncorrelated, i.e., h(θ) = 0, ρ(θ,θ') becomes simple. We provide its expressions for the normal, binomial and Poisson kernels; as shown in the last section, Z_1(θ) and Y_1(θ*) are uncorrelated for these kernels. In terms of Y_i and Z_i,

$$
\rho(\theta,\theta')
 = \frac{\mathrm{COV}\{Z_1(\theta) - h(\theta)Y_1(\theta^*),\, Z_1(\theta') - h(\theta')Y_1(\theta^*)\}}
        {\sqrt{\mathrm{VAR}\{Z_1(\theta) - h(\theta)Y_1(\theta^*)\}\,\mathrm{VAR}\{Z_1(\theta') - h(\theta')Y_1(\theta^*)\}}},
$$
where h(θ) = E{Z_1(θ)Y_1(θ*)}/E{Y_1²(θ*)}. When Z_1(θ) and Y_1(θ*) are uncorrelated,
$$
\rho(\theta,\theta') = \frac{\mathrm{COV}\{Z_1(\theta), Z_1(\theta')\}}{\sqrt{\mathrm{VAR}\{Z_1(\theta)\}\,\mathrm{VAR}\{Z_1(\theta')\}}}.
$$

Example 5.1 Normal kernel function. Let f(x;θ) be the normal N(θ,σ*²) density with known σ*. For simplicity, let θ* = 0 and σ* = 1. Then Y_1(0) = X_1 and, for θ ≠ 0,
$$
Z_1(\theta) = \theta^{-1}\{\theta^{-1}(\exp\{X_1\theta - \theta^2/2\} - 1) - X_1\},
$$
and Z_1(0) = (X_1² − 1)/2. We have seen that E{Z_1(θ)Y_1(0)} = 0, and hence h(θ) = 0 for all θ. Note that for θ, θ' ≠ 0,
$$
E\{Z_1(\theta)Z_1(\theta')\} = (\theta\theta')^{-2}\{\exp(\theta\theta') - 1 - \theta\theta'\}.
$$

We have, for θ, θ' ≠ 0,

$$
\rho(\theta,\theta') = \frac{\exp(\theta\theta') - 1 - \theta\theta'}
{\sqrt{\{\exp(\theta^2) - 1 - \theta^2\}\{\exp(\theta'^2) - 1 - \theta'^2\}}}.
$$

For θ ≠ 0, it reduces to

$$
\rho(\theta, 0) = \frac{\theta^2}{\sqrt{2\{\exp(\theta^2) - 1 - \theta^2\}}}.
$$
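Once ρ(θ,θ') is available, the limiting distribution {sup_{θ∈Θ} W⁺(θ)}² of Theorem 5.3 can be approximated by Monte Carlo on a grid. The sketch below (an illustration, not from the notes) does this for the normal kernel with θ* = 0 and Θ = [−3, 3]; the grid spacing, jitter term and number of replications are arbitrary choices.

import numpy as np

def rho_normal(t1, t2):
    # autocorrelation of W(theta) for the N(theta, 1) kernel with theta* = 0
    num = np.exp(t1 * t2) - 1.0 - t1 * t2
    den = np.sqrt((np.exp(t1 ** 2) - 1.0 - t1 ** 2) * (np.exp(t2 ** 2) - 1.0 - t2 ** 2))
    return num / den

# discretize Theta = [-3, 3]; theta = 0 is skipped so the single closed form applies
grid = np.concatenate([np.linspace(-3, -0.05, 60), np.linspace(0.05, 3, 60)])
R = rho_normal(grid[:, None], grid[None, :])
L = np.linalg.cholesky(R + 1e-8 * np.eye(len(grid)))     # small jitter for numerical stability

rng = np.random.default_rng(0)
W = L @ rng.standard_normal((len(grid), 20000))          # draws from the Gaussian process on the grid
stat = np.max(np.maximum(W, 0.0), axis=0) ** 2           # {sup W^+(theta)}^2, grid approximation
print(np.quantile(stat, [0.90, 0.95, 0.99]))             # approximate critical values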

Example 5.2 Binomial kernel function. Consider the binomial kernel function

$$
f(x,\theta) \propto \theta^x (1-\theta)^{k-x}, \quad \text{for } x = 0,\dots,k.
$$

Note that
$$
Y_1(\theta^*) = \frac{X_1}{\theta^*} - \frac{k - X_1}{1 - \theta^*};
\tag{5.18}
$$
$$
Y_1(\theta) = \frac{1}{\theta - \theta^*}\left[\left(\frac{1-\theta}{1-\theta^*}\right)^k
\left\{\frac{(1-\theta^*)/\theta^*}{(1-\theta)/\theta}\right\}^{X_1} - 1\right].
\tag{5.19}
$$

As a member of NEF-QVF, the binomial mixture model also has E{Z_1(θ)Y_1(θ*)} = 0, so h(θ) = 0 for all θ. The covariance of Z_1(θ) and Z_1(θ') for θ, θ' ≠ θ* is
$$
\frac{1}{(\theta-\theta^*)^2(\theta'-\theta^*)^2}
\left[\left\{1 + \frac{(\theta-\theta^*)(\theta'-\theta^*)}{\theta^*(1-\theta^*)}\right\}^k
 - 1 - k\,\frac{(\theta-\theta^*)(\theta'-\theta^*)}{\theta^*(1-\theta^*)}\right].
$$

Let
$$
u = \sqrt{\frac{k}{\theta^*(1-\theta^*)}}\,(\theta - \theta^*), \qquad
u' = \sqrt{\frac{k}{\theta^*(1-\theta^*)}}\,(\theta' - \theta^*).
$$
We obtain

$$
\rho(\theta,\theta') = \frac{(1 + uu'/k)^k - 1 - uu'}
{\sqrt{\{(1 + u^2/k)^k - 1 - u^2\}\{(1 + u'^2/k)^k - 1 - u'^2\}}}.
$$

Interestingly, when k is large,
$$
\rho(\theta,\theta') \approx \frac{\exp(uu') - 1 - uu'}
{\sqrt{\{\exp(u^2) - 1 - u^2\}\{\exp(u'^2) - 1 - u'^2\}}}.
$$
That is, when k is large the autocorrelation function behaves similarly to that of the normal kernel. ♦
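A tiny numerical check of this large-k behaviour (illustrative only; the chosen values of u, u' and k are arbitrary):

import numpy as np

def rho_binomial(u, up, k):
    num = (1 + u * up / k) ** k - 1 - u * up
    den = np.sqrt(((1 + u ** 2 / k) ** k - 1 - u ** 2) * ((1 + up ** 2 / k) ** k - 1 - up ** 2))
    return num / den

def rho_normal_limit(u, up):
    num = np.exp(u * up) - 1 - u * up
    den = np.sqrt((np.exp(u ** 2) - 1 - u ** 2) * (np.exp(up ** 2) - 1 - up ** 2))
    return num / den

u, up = 0.8, -0.5
for k in (5, 20, 100, 1000):
    print(k, rho_binomial(u, up, k))
print("limit:", rho_normal_limit(u, up))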

Example 5.3 Poisson kernel function. Let f(x,θ) ∝ e^{−θ}θ^x, for x = 0, 1, 2, .... Then
$$
Y_1(\theta^*) = \frac{X_1 - \theta^*}{\theta^*};
\tag{5.20}
$$
$$
Y_1(\theta) = \frac{\exp\{-(\theta-\theta^*)\}(\theta/\theta^*)^{X_1} - 1}{\theta - \theta^*}
\quad \text{for } \theta \ne \theta^*.
\tag{5.21}
$$
Again, the Poisson is a member of NEF-QVF, so Z_1(θ) and Y_1(θ*) are uncorrelated and h(θ) = 0 for all θ. For θ, θ' ≠ θ*,
$$
\mathrm{COV}\{Z_1(\theta), Z_1(\theta')\}
 = \frac{\exp\{(\theta-\theta^*)(\theta'-\theta^*)/\theta^*\} - 1 - (\theta-\theta^*)(\theta'-\theta^*)/\theta^*}
        {(\theta-\theta^*)^2(\theta'-\theta^*)^2}.
$$
Put
$$
v = \frac{\theta - \theta^*}{\sqrt{\theta^*}}, \qquad v' = \frac{\theta' - \theta^*}{\sqrt{\theta^*}}.
$$
Then
$$
\rho(\theta,\theta') = \frac{\exp(vv') - 1 - vv'}
{\sqrt{\{\exp(v^2) - 1 - v^2\}\{\exp(v'^2) - 1 - v'^2\}}}.
$$
Interestingly, this form is identical to the one for the normal kernel. ♦

We can easily verify that all conditions of the theorem are satisfied in these examples when the parameter space Θ is confined to a compact subset of R. The exponential distribution is another popular member of NEF-QVF. This distribution does not satisfy the integrable upper bound condition in general. It is somewhat a surprise that many results developed in the literature are not applicable to the exponential mixture model.

5.5.2 The proof of Theorem 5.3

We carefully analyze the asymptotic behaviour of the likelihood ratio test statistic under the true distribution f(x;θ*). For convenience, we rewrite the log likelihood function under the two-component mixture model as
$$
\ell_n(\pi,\theta_1,\theta_2) = \sum_{i=1}^n \log\{(1-\pi)f(X_i;\theta_1) + \pi f(X_i;\theta_2)\}.
$$
The new notation helps to highlight the detailed structure of the mixing distribution G. As usual, we have used ℓ_n(·) in a very generic and non-rigorous fashion. Define
$$
r_n(\pi,\theta_1,\theta_2) = 2\{\ell_n(\pi,\theta_1,\theta_2) - \sup_{\theta\in\Theta}\ell_n(0,\theta,\theta)\}
$$
and R_n = sup{r_n(π,θ_1,θ_2) : π, θ_1, θ_2} over the region 0 ≤ π ≤ 1/2, θ_j ∈ Θ, j = 1, 2. We may call r_n(π,θ_1,θ_2) the log likelihood ratio function. To find the limiting distribution of R_n, it is convenient to partition r_n into two parts:

$$
r_n(\pi,\theta_1,\theta_2)
 = 2\{\ell_n(\pi,\theta_1,\theta_2) - \ell_n(0,\theta^*,\theta^*)\}
 + 2\{\ell_n(0,\theta^*,\theta^*) - \ell_n(0,\hat\theta,\hat\theta)\}
 = r_{1n}(\pi,\theta_1,\theta_2) + r_{2n},
$$
where θ̂ is the MLE of θ under the null model. Note that

$$
-r_{2n} = -2\{\ell_n(0,\theta^*,\theta^*) - \ell_n(0,\hat\theta,\hat\theta)\}
$$
is an ordinary likelihood ratio statistic (no mixture involved), and hence an approximation is immediate as follows (Wilks 1938):

$$
r_{2n} = -\frac{\{n^{-1/2}\sum_{i=1}^n Y_i(\theta^*)\}^2}{E\{Y_1^2(\theta^*)\}} + o_p(1).
\tag{5.22}
$$

This expansion suffices for deriving the limiting distribution of R_n. The main task is to analyze r_{1n}(π,θ_1,θ_2). Write
$$
r_{1n}(\pi,\theta_1,\theta_2) = 2\sum_{i=1}^n \log(1 + \delta_i),
$$
where
$$
\delta_i = (1-\pi)\left\{\frac{f(X_i,\theta_1)}{f(X_i,\theta^*)} - 1\right\}
         + \pi\left\{\frac{f(X_i,\theta_2)}{f(X_i,\theta^*)} - 1\right\}
 = (1-\pi)(\theta_1-\theta^*)Y_i(\theta_1) + \pi(\theta_2-\theta^*)Y_i(\theta_2)
\tag{5.23}
$$
and Y_i(θ) is defined by (5.14). The main idea is as follows. By the Taylor expansion,

$$
r_{1n}(\pi,\theta_1,\theta_2) = 2\sum_{i=1}^n \delta_i - \sum_{i=1}^n \delta_i^2 + \varepsilon_n.
$$

We need to argue that when the sample size n is large, the remainder ε_n is negligible uniformly in the mixing parameters. Negligibility relies on the consistency of the MLEs of the parameters. By Lemma 5.1, the MLE of θ_1 is consistent. The problem is that the MLEs of θ_2 and π need not be consistent individually; only π̂(θ̂_2 − θ*) is guaranteed to vanish. Our solution is to consider the case of |θ_2 − θ*| ≤ ε for an arbitrarily small ε > 0, and that of |θ_2 − θ*| > ε, separately. In the process, we use a sandwich idea. Let R_n(ε;I) denote the supremum of r_n(π,θ_1,θ_2) with the restriction |θ_2 − θ*| > ε, and R_n(ε;II) the supremum with |θ_2 − θ*| ≤ ε.

Case I: |θ_2 − θ*| > ε.

In this case, let π̂(θ_2) and θ̂_1(θ_2) be the MLEs of π and θ_1 with θ_2 ∈ Θ fixed such that |θ_2 − θ*| > ε. The consistency results of Lemma 5.1 remain true for π̂(θ_2) and θ̂_1(θ_2). For simplicity of notation, we write π̂ = π̂(θ_2) and θ̂_1 = θ̂_1(θ_2). First we establish an upper bound on R_n(ε;I). By the inequality 2log(1+x) ≤ 2x − x² + (2/3)x³, we have

$$
r_{1n}(\pi,\theta_1,\theta_2) = 2\sum_{i=1}^n \log(1+\delta_i)
 \le 2\sum_{i=1}^n \delta_i - \sum_{i=1}^n \delta_i^2 + \frac{2}{3}\sum_{i=1}^n \delta_i^3,
\tag{5.24}
$$
where
$$
\delta_i = (1-\pi)(\theta_1-\theta^*)Y_i(\theta_1) + \pi(\theta_2-\theta^*)Y_i(\theta_2)
$$
as defined in (5.23). Replacing θ_1 with θ* in the quantity Y_i(θ_1) gives

$$
\delta_i = (1-\pi)(\theta_1-\theta^*)Y_i(\theta^*) + \pi(\theta_2-\theta^*)Y_i(\theta_2) + e_{in},
$$
where the remainder is given by

$$
e_{in} = (1-\pi)(\theta_1-\theta^*)\{Y_i(\theta_1) - Y_i(\theta^*)\}.
$$

In terms of Z_i defined in (5.15), e_{in} = (1−π)(θ_1 − θ*)²Z_i(θ_1). Since n^{-1/2}∑_{i=1}^n Z_i(θ_1) is tight, it is O_p(1), so

$$
e_n = \sum_{i=1}^n e_{in}
 = n^{1/2}(1-\pi)(\theta_1-\theta^*)^2\, n^{-1/2}\sum_{i=1}^n Z_i(\theta_1)
 = n^{1/2}(\theta_1-\theta^*)^2 O_p(1).
$$

Similarly, replacing Y_i(θ_1) with Y_i(θ*) in the square and cubic terms of δ_i results in a remainder of higher order than e_n. Consequently,

$$
r_{1n}(\pi,\theta_1,\theta_2)
 \le 2\sum_{i=1}^n\{(1-\pi)(\theta_1-\theta^*)Y_i(\theta^*) + \pi(\theta_2-\theta^*)Y_i(\theta_2)\}
$$
$$
 \quad - \sum_{i=1}^n\{(1-\pi)(\theta_1-\theta^*)Y_i(\theta^*) + \pi(\theta_2-\theta^*)Y_i(\theta_2)\}^2
$$
$$
 \quad + \frac{2}{3}\sum_{i=1}^n\{(1-\pi)(\theta_1-\theta^*)Y_i(\theta^*) + \pi(\theta_2-\theta^*)Y_i(\theta_2)\}^3
 + n^{1/2}(\theta_1-\theta^*)^2 O_p(1).
$$

Now write

$$
(1-\pi)(\theta_1-\theta^*)Y_i(\theta^*) + \pi(\theta_2-\theta^*)Y_i(\theta_2)
 = m_1 Y_i(\theta^*) + m_2 Z_i(\theta_2),
$$
where

$$
m_1 = (1-\pi)(\theta_1-\theta^*) + \pi(\theta_2-\theta^*);
\tag{5.25}
$$
$$
m_2 = \pi(\theta_2-\theta^*)^2.
\tag{5.26}
$$

Hence

$$
r_{1n}(\pi,\theta_1,\theta_2)
 \le 2\sum_{i=1}^n\{m_1 Y_i(\theta^*) + m_2 Z_i(\theta_2)\}
 - \sum_{i=1}^n\{m_1 Y_i(\theta^*) + m_2 Z_i(\theta_2)\}^2
$$
$$
 \quad + \frac{2}{3}\sum_{i=1}^n\{m_1 Y_i(\theta^*) + m_2 Z_i(\theta_2)\}^3
 + n^{1/2}(\theta_1-\theta^*)^2 O_p(1).
\tag{5.27}
$$

Since
$$
n^{-1}\sum_{i=1}^n\{|Y_i(\theta^*)|^3 + |Z_i(\theta_2)|^3\} = O_p(1)
$$
uniformly in θ_2, and since

$$
n^{-1}\sum_{i=1}^n\{m_1 Y_i(\theta^*) + m_2 Z_i(\theta_2)\}^2
$$
converges to a positive definite quadratic form in m_1 and m_2 uniformly in θ_2 (Lemma 5.2), it follows that

$$
\frac{\sum|m_1 Y_i(\theta^*) + m_2 Z_i(\theta_2)|^3}{\sum\{m_1 Y_i(\theta^*) + m_2 Z_i(\theta_2)\}^2}
 \le \max(|m_1|, |m_2|)O_p(1).
$$
Let
$$
\hat m_1 = (1-\hat\pi)(\hat\theta_1-\theta^*) + \hat\pi(\theta_2-\theta^*),
\qquad
\hat m_2 = \hat\pi(\theta_2-\theta^*)^2.
$$

By Lemma 5.1, max(|m̂_1|, |m̂_2|) = o_p(1) uniformly in θ_2. Consequently, it follows that

$$
\sum_{i=1}^n\{\hat m_1 Y_i(\theta^*) + \hat m_2 Z_i(\theta_2)\}^3
 = o_p\Big[\sum_{i=1}^n\{\hat m_1 Y_i(\theta^*) + \hat m_2 Z_i(\theta_2)\}^2\Big].
\tag{5.28}
$$

The remainder n^{1/2}(θ̂_1 − θ*)²O_p(1) in (5.27) is also negligible when compared to the square terms. To see this, recall that 0 ≤ π̂ ≤ 1/2, |θ_2 − θ*| ≥ ε and θ̂_1 − θ* = o_p(1). We have

$$
n^{1/2}(\hat\theta_1-\theta^*)^2
 \le 4n^{1/2}\{|\hat m_1| + |\hat\pi(\theta_2-\theta^*)|\}^2
 \le 4n^{1/2}(|\hat m_1| + \hat m_2/\varepsilon)^2
 \le 8n^{1/2}(\hat m_1^2 + \hat m_2^2/\varepsilon^2)
 = o_p\Big[\sum_{i=1}^n\{\hat m_1 Y_i(\theta^*) + \hat m_2 Z_i(\theta_2)\}^2\Big].
\tag{5.29}
$$
Combining (5.27), (5.28) and (5.29) yields

$$
r_{1n}(\hat\pi,\hat\theta_1,\theta_2)
 \le 2\sum_{i=1}^n\{\hat m_1 Y_i(\theta^*) + \hat m_2 Z_i(\theta_2)\}
 - \sum_{i=1}^n\{\hat m_1 Y_i(\theta^*) + \hat m_2 Z_i(\theta_2)\}^2\{1 + o_p(1)\},
\tag{5.30}
$$
uniformly in θ_2. To clarify the role each component plays, we can use the quantity W_i(θ) introduced in (5.16). Since E{Y_i(θ*)W_i(θ)} = 0, Y_i(θ*) and W_i(θ) are orthogonal, so that
$$
\sum_{i=1}^n Y_i(\theta^*)W_i(\theta_2) = O_p(n^{1/2}),
$$
uniformly in θ_2. Hence (5.30) becomes
$$
r_{1n}(\hat\pi,\hat\theta_1,\theta_2)
 \le 2\sum_{i=1}^n\{\hat t\, Y_i(\theta^*) + \hat m_2 W_i(\theta_2)\}
 - \Big\{\hat t^2\sum_{i=1}^n Y_i^2(\theta^*) + \hat m_2^2\sum_{i=1}^n W_i^2(\theta_2)\Big\}\{1 + o_p(1)\},
$$
where t̂ = m̂_1 + m̂_2 h(θ_2). For fixed θ_2, consider the quadratic function

$$
Q(t, m_2) = 2\sum_{i=1}^n\{t Y_i(\theta^*) + m_2 W_i(\theta_2)\}
 - \Big\{t^2\sum_{i=1}^n Y_i^2(\theta^*) + m_2^2\sum_{i=1}^n W_i^2(\theta_2)\Big\}.
\tag{5.31}
$$
Over the region m_2 ≥ 0, for any fixed θ_2, Q(t, m_2) is maximized at t = t̃ and m_2 = m̃_2, with
$$
\tilde t = \frac{\sum Y_i(\theta^*)}{\sum Y_i^2(\theta^*)},
\qquad
\tilde m_2 = \frac{\{\sum W_i(\theta_2)\}^+}{\sum W_i^2(\theta_2)}.
\tag{5.32}
$$
Thus
$$
r_{1n}(\hat\pi,\hat\theta_1,\theta_2) \le Q(\tilde t, \tilde m_2) + o_p(1)
 = \frac{\{\sum_{i=1}^n Y_i(\theta^*)\}^2}{\sum_{i=1}^n Y_i^2(\theta^*)}
 + \frac{[\{\sum_{i=1}^n W_i(\theta_2)\}^+]^2}{\sum_{i=1}^n W_i^2(\theta_2)} + o_p(1).
$$
Finally, by the integrable upper bound condition,

$$
n^{-1}\sum Y_i^2(\theta^*) = E\{Y_1^2(\theta^*)\} + o_p(1)
\quad\text{and}\quad
n^{-1}\sum W_i^2(\theta_2) = E\{W_1^2(\theta_2)\} + o_p(1).
$$
Therefore,

$$
r_{1n}(\hat\pi,\hat\theta_1,\theta_2)
 \le \frac{\{n^{-1/2}\sum_{i=1}^n Y_i(\theta^*)\}^2}{E\{Y_1^2(\theta^*)\}}
 + \frac{[\{n^{-1/2}\sum_{i=1}^n W_i(\theta_2)\}^+]^2}{E\{W_1^2(\theta_2)\}} + o_p(1).
\tag{5.33}
$$

From (5.33) and (5.22),
$$
r_n(\hat\pi,\hat\theta_1,\theta_2) = r_{1n}(\hat\pi,\hat\theta_1,\theta_2) + r_{2n}
 \le \frac{[\{n^{-1/2}\sum W_i(\theta_2)\}^+]^2}{E\{W_1^2(\theta_2)\}} + o_p(1).
\tag{5.34}
$$

Hence we have established an upper bound on R_n(ε;I) as follows:
$$
R_n(\varepsilon;I) \le \sup_{|\theta-\theta^*|>\varepsilon}
 \frac{[\{n^{-1/2}\sum_{i=1}^n W_i(\theta)\}^+]^2}{E\{W_1^2(\theta)\}} + o_p(1).
$$
To obtain a lower bound on R_n(ε;I), for any fixed θ_2 such that |θ_2 − θ*| ≥ ε, let π̃ and θ̃_1 be the values determined by t̃ and m̃_2 as given in (5.32). Since t̃ = O_p(n^{-1/2}) and m̃_2 = O_p(n^{-1/2}) uniformly in θ_2, and |h(θ_2)| ≤ √{E{Z_1²(θ_2)}/E{Y_1²(θ*)}} is a bounded quantity, it follows that π̃ = O_p(n^{-1/2}) and θ̃_1 − θ* = O_p(n^{-1/2}) uniformly in θ_2 satisfying |θ_2 − θ*| ≥ ε. Consider the following Taylor expansion:
$$
r_{1n}(\tilde\pi,\tilde\theta_1,\theta_2)
 = 2\sum_{i=1}^n \tilde\delta_i - \sum_{i=1}^n \tilde\delta_i^2(1 + \tilde\eta_i)^{-2},
$$
where |η̃_i| < |δ̃_i| and δ̃_i is the δ_i in (5.23) with π = π̃ and θ_1 = θ̃_1. We have
$$
|\tilde\delta_i| \le (|\tilde\theta_1 - \theta^*| + |\tilde\pi| M)\max_{1\le i\le n}\Big[\sup_{\theta\in\Theta}\{|Y_i(\theta)|\}\Big],
$$
where M bounds |θ_2 − θ*| over the compact Θ. By the integrable upper bound condition, |Y_i(θ)|^{4+δ} ≤ g(X_i) and E{g(X_i)} is finite. Hence
$$
\max_{1\le i\le n}\Big[\sup_{\theta\in\Theta}\{|Y_i(\theta)|\}\Big] = o_p(n^{1/4}),
$$
implying max(|η̃_i|) = o_p(1) uniformly. It follows that
$$
r_{1n}(\tilde\pi,\tilde\theta_1,\theta_2) = 2\sum_{i=1}^n \tilde\delta_i - \sum_{i=1}^n \tilde\delta_i^2\{1 + o_p(1)\}.
$$

Applying the argument leading to (5.34), we know that with π̃ and θ̃_1,
$$
r_n(\tilde\pi,\tilde\theta_1,\theta_2)
 \ge \frac{[\{n^{-1/2}\sum_{i=1}^n W_i(\theta_2)\}^+]^2}{E\{W_1^2(\theta_2)\}} + o_p(1).
$$
Therefore,
$$
\sup_{\pi,\theta_1} r_n(\pi,\theta_1,\theta_2) \ge r_n(\tilde\pi,\tilde\theta_1,\theta_2)
 \ge \frac{[\{n^{-1/2}\sum_{i=1}^n W_i(\theta_2)\}^+]^2}{E\{W_1^2(\theta_2)\}} + o_p(1).
$$
Combining with (5.34), we thus arrive at the following result.

Lemma 5.4 Suppose all conditions specified in this section hold. Let R_n(ε;I) be the likelihood ratio statistic with the restriction |θ_2 − θ*| > ε. Then, when f(x,θ*) is the true null distribution, as n → ∞,

$$
R_n(\varepsilon;I) = \sup_{|\theta-\theta^*|>\varepsilon}
 \frac{[\{n^{-1/2}\sum_{i=1}^n W_i(\theta)\}^+]^2}{E\{W_1^2(\theta)\}} + o_p(1).
$$
When f(x,θ*) is the true null distribution, n^{-1/2}∑W_i(θ)/√{E{W_1²(θ)}} converges to a Gaussian process with mean 0, variance 1 and autocorrelation function ρ(θ,θ') given by (5.17).

Case II: |θ_2 − θ*| ≤ ε.

When θ_2 is in an arbitrarily small neighbourhood of θ*, some complications appear since θ_1 − θ* and θ_2 − θ* are confounded, so that n^{1/2}(θ̂_1 − θ*)² is no longer negligible when compared to n^{1/2}(θ̂_2 − θ*)². However, in this case θ_1 and θ_2 can be treated equally, so the usual quadratic approximation to the likelihood becomes possible. Since the MLE of θ_1 is consistent, in addition to |θ_2 − θ*| ≤ ε, we can restrict θ_1 in the following analysis to the region |θ_1 − θ*| ≤ ε. In the sequel, π̂, θ̂_1 and θ̂_2 denote the MLEs of π, θ_1, and θ_2 within the region defined by 0 ≤ π ≤ 1/2, |θ_1 − θ*| ≤ ε and |θ_2 − θ*| ≤ ε. Again, we start with (5.24). Replacing both θ_1 and θ_2 in δ_i with θ*, we have

$$
\delta_i = m_1 Y_i(\theta^*) + m_2 Z_i(\theta^*) + e_{in},
$$
where m_1 is the same as before, but

$$
m_2 = (1-\pi)(\theta_1-\theta^*)^2 + \pi(\theta_2-\theta^*)^2.
$$

Note that |m_1| ≤ ε and m_2 ≤ ε². The remainder term now becomes

$$
e_{in} = (1-\pi)(\theta_1-\theta^*)^2\{Z_i(\theta_1) - Z_i(\theta^*)\}
 + \pi(\theta_2-\theta^*)^2\{Z_i(\theta_2) - Z_i(\theta^*)\}.
$$

By the integrable upper bound condition,

$$
e_n = \sum_{i=1}^n e_{in}
 = n^{1/2}\{(1-\pi)|\theta_1-\theta^*|^3 + \pi|\theta_2-\theta^*|^3\}O_p(1).
$$

Thus

$$
r_{1n}(\pi,\theta_1,\theta_2)
 \le 2\sum_{i=1}^n\{m_1 Y_i(\theta^*) + m_2 Z_i(\theta^*)\}
 - \sum_{i=1}^n\{m_1 Y_i(\theta^*) + m_2 Z_i(\theta^*)\}^2
$$
$$
 \quad + \frac{2}{3}\sum_{i=1}^n\{m_1 Y_i(\theta^*) + m_2 Z_i(\theta^*)\}^3
 + n^{1/2}\{(1-\pi)|\theta_1-\theta^*|^3 + \pi|\theta_2-\theta^*|^3\}O_p(1).
\tag{5.35}
$$

Using the same argument as in Case I,

$$
\frac{\sum|m_1 Y_i(\theta^*) + m_2 Z_i(\theta^*)|^3}{\sum\{m_1 Y_i(\theta^*) + m_2 Z_i(\theta^*)\}^2}
 \le \max(|m_1|, |m_2|)O_p(1) \le \varepsilon O_p(1).
$$

Equation (5.35) reduces to

$$
r_{1n}(\pi,\theta_1,\theta_2)
 \le 2\sum_{i=1}^n\{m_1 Y_i(\theta^*) + m_2 Z_i(\theta^*)\}
 - \sum_{i=1}^n\{m_1 Y_i(\theta^*) + m_2 Z_i(\theta^*)\}^2\{1 + \varepsilon O_p(1)\}
$$
$$
 \quad + n^{1/2}\{(1-\pi)|\theta_1-\theta^*|^3 + \pi|\theta_2-\theta^*|^3\}O_p(1).
\tag{5.36}
$$

Let us analyze the remainder in terms of the MLE’s:

$$
n^{1/2}\{(1-\hat\pi)|\hat\theta_1-\theta^*|^3 + \hat\pi|\hat\theta_2-\theta^*|^3\}
 = \{o_p(1) + \varepsilon O_p(1)\}n^{1/2}\hat m_2
 \le \{o_p(1) + \varepsilon O_p(1)\}(1 + n\hat m_2^2)
 \le \varepsilon O_p(1) + \varepsilon O_p(n\hat m_2^2).
$$

Consequently, in terms of the MLE’s, (5.36) reduces to

$$
r_{1n}(\hat\pi,\hat\theta_1,\hat\theta_2)
 \le 2\sum_{i=1}^n\{\hat m_1 Y_i(\theta^*) + \hat m_2 Z_i(\theta^*)\}
\tag{5.37}
$$
$$
 \quad - \sum_{i=1}^n\{\hat m_1 Y_i(\theta^*) + \hat m_2 Z_i(\theta^*)\}^2\{1 + \varepsilon O_p(1)\} + \varepsilon O_p(1).
\tag{5.38}
$$

Note that in the above, the term εO_p(nm̂_2²) has been absorbed into the quadratic quantity. By orthogonalization, we have

$$
r_{1n}(\hat\pi,\hat\theta_1,\hat\theta_2)
 \le 2\sum_{i=1}^n\{\hat t\, Y_i(\theta^*) + \hat m_2 W_i(\theta^*)\}
 - \Big\{\hat t^2\sum_{i=1}^n Y_i^2(\theta^*) + \hat m_2^2\sum_{i=1}^n W_i^2(\theta^*)\Big\}\{1 + \varepsilon O_p(1)\}
 + \varepsilon O_p(1),
$$

where W_i(θ) is defined in (5.16) and t̂ = m̂_1 + m̂_2 h(θ*). Applying the same technique leading to (5.31), we get

$$
r_{1n}(\hat\pi,\hat\theta_1,\hat\theta_2)
 \le \{1 + \varepsilon O_p(1)\}^{-1}
 \left[\frac{\{\sum Y_i(\theta^*)\}^2}{\sum Y_i^2(\theta^*)}
 + \frac{[\{\sum W_i(\theta^*)\}^+]^2}{\sum W_i^2(\theta^*)}\right]
 + \varepsilon O_p(1).
$$
By (5.22),

$$
r_n(\hat\pi,\hat\theta_1,\hat\theta_2)
 \le \frac{\varepsilon O_p(1)}{1 + \varepsilon O_p(1)}\cdot
     \frac{\{n^{-1/2}\sum Y_i(\theta^*)\}^2}{E\{Y_1^2(\theta^*)\}}
 + \frac{[\{n^{-1/2}\sum W_i(\theta^*)\}^+]^2}{\{1 + \varepsilon O_p(1)\}E\{W_1^2(\theta^*)\}}
 + \varepsilon O_p(1)
$$
$$
 = \frac{[\{n^{-1/2}\sum W_i(\theta^*)\}^+]^2}{E\{W_1^2(\theta^*)\}} + \varepsilon O_p(1).
$$
Therefore,
$$
R_n(\varepsilon;II) \le \frac{[\{n^{-1/2}\sum W_i(\theta^*)\}^+]^2}{E\{W_1^2(\theta^*)\}} + \varepsilon O_p(1).
$$
Next, let θ̃_2 = θ*, and let π̃ and θ̃_1 be determined by

$$
m_1 + m_2 h(\theta^*) = \frac{\sum Y_i(\theta^*)}{\sum Y_i^2(\theta^*)},
\qquad
m_2 = \frac{\{\sum W_i(\theta^*)\}^+}{\sum W_i^2(\theta^*)}.
$$
The rest of the proof is the same as that in Case I, and we get

$$
R_n(\varepsilon;II) \ge \frac{[\{n^{-1/2}\sum W_i(\theta^*)\}^+]^2}{E\{W_1^2(\theta^*)\}} + o_p(1).
$$

Lemma 5.5 Suppose all conditions specified in this section hold. Let R_n(ε;II) be the likelihood ratio statistic with the restriction |θ_2 − θ*| ≤ ε for an arbitrarily small ε > 0. When f(x,θ*) is the true null distribution, as n → ∞,

$$
\frac{[\{n^{-1/2}\sum W_i(\theta^*)\}^+]^2}{E\{W_1^2(\theta^*)\}} + o_p(1)
 \le R_n(\varepsilon;II)
 \le \frac{[\{n^{-1/2}\sum W_i(\theta^*)\}^+]^2}{E\{W_1^2(\theta^*)\}} + \varepsilon O_p(1),
$$
where W_i(θ*) is defined by (5.16).

Proof of the Theorem. For any small ε > 0, R_n = max{R_n(ε;I), R_n(ε;II)}. By Lemmas 5.4 and 5.5,

$$
R_n \le \max\left[\sup_{|\theta-\theta^*|>\varepsilon}
 \frac{[\{n^{-1/2}\sum W_i(\theta)\}^+]^2}{E\{W_1^2(\theta)\}},\;
 \frac{[\{n^{-1/2}\sum W_i(\theta^*)\}^+]^2}{E\{W_1^2(\theta^*)\}} + \varepsilon O_p(1)\right]
$$
plus a term in o_p(1), and

$$
R_n \ge \max\left[\sup_{|\theta-\theta^*|>\varepsilon}
 \frac{[\{n^{-1/2}\sum W_i(\theta)\}^+]^2}{E\{W_1^2(\theta)\}},\;
 \frac{[\{n^{-1/2}\sum W_i(\theta^*)\}^+]^2}{E\{W_1^2(\theta^*)\}}\right] + o_p(1).
$$
Since n^{-1/2}∑W_i(θ)/√{E{W_1²(θ)}} converges to the Gaussian process W(θ), θ ∈ Θ, with mean 0, variance 1 and autocorrelation function ρ(θ,θ') given by (5.17), the theorem follows by first letting n → ∞ and then letting ε → 0. ♦