Central Limit Theorem: the Cornerstone of Modern Statistics

Statistical Round pISSN 2005-6419 • eISSN 2005-7563 KJA Central limit theorem: the Korean Journal of Anesthesiology cornerstone of modern statistics Sang Gyu Kwak1 and Jong Hae Kim2 Departments of 1Medical Statistics, 2Anesthesiology and Pain Medicine, School of Medicine, Catholic University of Daegu, Daegu, Korea According to the central limit theorem, the means of a random sample of size, n, from a population with mean, μ, and 2 2 μ σ variance, σ , distribute normally with mean, , and variance, n . Using the central limit theorem, a variety of parametric tests have been developed under assumptions about the parameters that determine the population probability distribution. Compared to non-parametric tests, which do not require any assumptions about the population probability distribution, parametric tests produce more accurate and precise estimates with higher statistical powers. However, many medical researchers use parametric tests to present their data without knowledge of the contribution of the central limit theorem to the development of such tests. Thus, this review presents the basic concepts of the central limit theorem and its role in binomial distributions and the Student’s t-test, and provides an example of the sampling distributions of small populations. A proof of the central limit theorem is also described with the mathematical concepts required for its near- complete understanding. Key Words: Normal distribution, Probability, Statistical distributions, Statistics. Introduction multiple parametric tests are used to assess the statistical validity of clinical studies performed by medical researchers; however, The central limit theorem is the most fundamental theory in most researchers are unaware of the value of the central limit modern statistics. Without this theorem, parametric tests based theorem, despite their routine use of parametric tests. Thus, on the assumption that sample data come from a population clinical researchers would benefit from knowing what the cen- with fixed parameters determining its probability distribution tral limit theorem is and how it has become the basis for para- would not exist. With the central limit theorem, parametric tests metric tests. This review aims to address these topics. The proof have higher statistical power than non-parametric tests, which of the central limit theorem is described in the appendix, with do not require probability distribution assumptions. Currently, the necessary mathematical concepts (e.g., moment-generating function and Taylor’s formula) required for understanding the proof. However, some mathematical techniques (e.g., differential Corresponding author: Jong Hae Kim, M.D. and integral calculus) were omitted due to space limitations. Department of Anesthesiology and Pain Medicine, School of Medicine, Catholic University of Daegu, 33, Duryugongwon-ro 17-gil, Nam-gu, Basic Concepts of Central Limit Theorem Daegu 42472, Korea Tel: 82-53-650-4979, Fax: 82-53-650-4517 In statistics, a population is the set of all items, people, or Email: [email protected] ORCID: http://orcid.org/0000-0003-1222-0054 events of interest. In reality, however, collecting all such ele- ments of the population requires considerable effort and is often Received: September 30, 2016. not possible. For example, it is not possible to investigate the Revised: November 10, 2016 (1st); November 17, 2016 (2nd). Accepted: January 2, 2017. proficiency of every anesthesiologist, worldwide, in performing awake nasotracheal intubations. To make inferences regarding Korean J Anesthesiol 2017 April 70(2): 144-156 the population, however, a subset of the population (sample) https://doi.org/10.4097/kjae.2017.70.2.144 CC This is an open-access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/ licenses/by-nc/4.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. Copyright ⓒ the Korean Society of Anesthesiologists, 2017 Online access in http://ekja.org KOREAN J ANESTHESIOL Kwak and Kim 2 can be used. A sample of sufficient size that is randomly selected σ 112.5 population mean) and 37.5 = n = 3 , respectively; however, can be used to estimate the parameters of the population using the distribution of the means of samples is skewed (Fig. 1A). inferential statistics. A finite number of samples are attainable When a simple random sampling with replacement is per- from the population depending on the size of the sample and formed for samples with a size of 6, 4 × 4 × 4 × 4 × 4 × 4 = 46 = the population itself. For example, all samples with a size of 1 4,096 samples are possible (Table 2). The mean and variance of obtained at random, with replacement, from the population the 4,096 sample means are 12 (the population mean) and 18.75 2 {3,6,9,30} would be {3},{6},{9}, and {30}. If the sample size is 2, a = σ = 112.5, respectively. Compared to the distribution of the 2 n 6 total of 4 × 4 = 4 = 16 samples, which are {3,3},{3,6},{3,9},{3,30}, means of samples with a size of 3, that of the means of samples {6,3},{6,6}…{9,30},{30,3},{30,6},{30,9}, and {30,30}, would be with a size of 6 is less skewed. Importantly, the sample means n possible. In this way, 4 samples with a size of n would be ob- also gather around the population mean. (Fig. 1B). Thus, the tained from the population. Here, we consider the distribution larger the sample size (n), the more closely the sample means of the sample means. gather symmetrically around the population mean (μ) and have For example, here is the asymmetric population with a size of σ2 a corresponding reduction in the variance ( n ) (Figs. 1C and 4 as presented above: 1D). If Figs. 1A to 1D are converted to the probability density function by replacing the variable “frequency” with another variable “probability” on the vertical axis, their shapes remain The population mean, μ, {and3, 6 ,variance,9, 30} σ2, are: unchanged. In general, as the sample size from the population increases, its mean gathers more closely around the population mean with 3 + 6 + 9 + 30 a decrease in variance. Thus, as the sample size approaches in- = = 12 4 finity, the sample means approximate the normal distribution σ2 2 2 2 2 with a mean, μ, and a variance, n . As shown above, the skewed 2 (3 − 12) + (6 − 12) + (9 − 12) + (30 − 12) = = 112.5 distribution of the population does not affect the distribution 4 A simple random sampling with replacement from the popu- of the sample means as the sample size increases. Therefore, 3 lation produces 4 × 4 × 4 = 4 = 64 samples with a size of 3 (Table the central limit theorem indicates that if the sample size is suf- 1). The mean and variance of the 64 sample means are 12 (the ficiently large, the means of samples obtained using a random sampling with replacement are distributed normally with the Table 1. Samples with a Size of 3 and Their Means Table 2. Samples with a Size of 6 and Their Means Number Sample Sample mean Number Sample Sample mean 1 3 3 3 3 2 3 3 6 4 1 3 3 3 3 3 3 3 3 3 3 9 5 2 3 3 3 3 3 6 3.5 4 3 3 30 12 3 3 3 3 3 3 9 4 5 3 6 3 4 4 3 3 3 3 3 30 7.5 6 3 6 6 5 5 3 3 3 3 6 3 3.5 7 3 6 9 6 6 3 3 3 3 6 6 4 8 3 6 30 13 7 3 3 3 3 6 9 4.5 9 3 9 3 5 8 3 3 3 3 6 30 8 10 3 9 6 6 9 3 3 3 3 9 3 4 Truncated 10 3 3 3 3 9 6 4.5 54 30 6 6 14 Truncated 55 30 6 9 15 4087 30 30 30 30 6 9 22.5 56 30 6 30 22 4088 30 30 30 30 6 30 26 57 30 9 3 14 4089 30 30 30 30 9 3 22 58 30 9 6 15 4090 30 30 30 30 9 6 22.5 59 30 9 9 16 4091 30 30 30 30 9 9 23 60 30 9 30 23 4092 30 30 30 30 9 30 26.5 61 30 30 3 21 4093 30 30 30 30 30 3 25.5 62 30 30 6 22 4094 30 30 30 30 30 6 26 63 30 30 9 23 4095 30 30 30 30 30 9 26.5 64 30 30 30 30 4096 30 30 30 30 30 30 30 Online access in http://ekja.org 145 Central limit theorem VOL. 70, NO. 2, ApRIL 2017 A B 20 800 15 600 10 400 Frequency Frequency 5 200 0 0 0510 15 20 25 30 0510 15 20 25 30 Mean of samples Mean of samples C D 60,000 1,500,000 40,000 Frequency 20,000 Frequency 500,000 0 0 0510 15 20 25 30 0510 15 20 25 30 Mean of samples Mean of samples Fig. 1. Histogram representing the means for samples of sizes of 3 (A), 6 (B), 9 (C), and 12 (D). σ2 1 mean, μ, and the variance, n , regardless of the population dis- rolling the number 3 in each trial, 1 − 6 : the probability of roll- tribution. Refer to the appendix for a near-complete proof of the ing a number other than 3 in each trial. central limit theorem, as well as the basic mathematical concepts required for its proof.

Central Limit Theorem: the Cornerstone of Modern Statistics

The Probability Lifesaver: Order Statistics and the Median Theorem

Central Limit Theorem and Its Applications to Baseball

Lecture 4 Multivariate Normal Distribution and Multivariate CLT

Lectures on Statistics

Central Limit Theorems When Data Are Dependent: Addressing the Pedagogical Gaps

Random Numbers and the Central Limit Theorem

Stat 400, Section 5.4 Supplement: the Central Limit Theorem Notes by Tim Pilachowski

Lecture 1: T Tests and CLT

The Multivariate Normal Distribution

6 Central Limit Theorem

Local Limit Theorems for Random Walks in a 1D Random Environment

Central Limit Theorem