Common Statistical Symbols and Formulas

City University of New York (CUNY)
CUNY Academic Works
Open Educational Resources, Queensborough Community College, 2020

Clear-Sighted Statistics: Appendix 3: Common Statistical Symbols and Formulas
Edward Volchok, CUNY Queensborough Community College

More information about this work at: https://academicworks.cuny.edu/qb_oers/143
This work is made publicly available by the City University of New York (CUNY).

Clear-Sighted Statistics: An OER Textbook

I. Introduction

This appendix lists common statistical symbols and formulas used in Clear-Sighted Statistics. The terms and formulas presented here are explained in detail in the appropriate modules of Clear-Sighted Statistics.

II. Common Statistical Symbols and Formulas

A. Module 4: Picturing Data with Tables and Charts

Symbol/Formula: Description
N: Number of observations, or items, in a population
n: Number of observations, or items, in a sample
k: Number of categories, classes, buckets, or bins in a Frequency Distribution
2 to the k formula: $2^k > n$. Formula used to determine the number of categories, classes, buckets, or bins in a Frequency Distribution
H: The highest value in a distribution
L: The lowest value in a distribution
Class Interval or Width, i: $i \ge \frac{H - L}{k}$
f: Frequency, or the number of observations
RF or %: Relative frequency, or the proportion of the total number of observations
Class Midpoint: $\text{Midpoint} = \frac{\text{Lower Class Limit} + \text{Upper Class Limit}}{2}$

Table 1: Module 4 Symbols and Formulas

B. Module 5: Statistical Measures

Symbol/Formula: Description
X: X stands for the random variable
Σ: Σ (capital Greek letter sigma).
It means the operation of summation or addition.
$\bar{X}$ (The Sample Mean, X-bar): $\bar{X} = \frac{\sum X}{n}$, where the X are the values of the random variable
μ (The Population Mean, mu): $\mu = \frac{\sum X}{N}$, where the X are the values of the random variable
$\bar{X}_w$ (Weighted Mean): $\bar{X}_w = \frac{\sum wX}{\sum w}$, where the X are the values of the random variable and the w are the weights
Median: M, Med, or $\tilde{x}$ ("x-tilde")
Mode: Mo
Range: $\text{Range} = H - L$, where H is the highest value and L is the lowest value
M, Med, or $\tilde{X}$: Median
Mean Deviation (MD) or Mean Absolute Deviation (MAD): $MD = \frac{\sum |X - \bar{X}|}{n}$, where "| |" means the absolute value: the distance of a positive or negative number from zero, or the value of a number regardless of its sign
$\sigma^2$ (Population Variance, sigma-squared): $\sigma^2 = \frac{\sum (X - \mu)^2}{N}$
$s^2$ (Sample Variance, s-squared): $s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$
σ (Population Standard Deviation, sigma): $\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$
s (Sample Standard Deviation, s): $s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$
Sample Mean, Grouped Data: $\bar{X} = \frac{\sum fM}{n}$, where f is the class frequency and M is the class midpoint
Sample Standard Deviation, Grouped Data: $s = \sqrt{\frac{\sum f(M - \bar{X})^2}{n - 1}}$
D: Stands for Decile. Deciles divide a distribution into ten groups of equal frequency.
Location of a Decile: $L_D = (n + 1)\frac{D}{10}$
P: Stands for Percentile. $P_{75}$ means the 75th percentile. Percentiles divide a distribution into a hundred groups of equal frequency.
Location of a Percentile: $L_P = (n + 1)\frac{P}{100}$
Q: Stands for Quartile: $Q_1$ (1st Quartile), $Q_2$ (2nd Quartile), $Q_3$ (3rd Quartile), and $Q_4$ (4th Quartile). Quartiles divide a distribution into four groups of equal frequency.
Location of a Quartile: $L_Q = (n + 1)\frac{Q}{4}$
Interquartile Range (IQR): $IQR = Q_3 - Q_1$
Lower Outlier (Extreme Lower Outlier): $\text{Outlier} < Q_1 - 1.5(Q_3 - Q_1)$; $\text{Extreme Outlier} < Q_1 - 3(Q_3 - Q_1)$
Upper Outlier (Extreme Upper Outlier): $\text{Outlier} > Q_3 + 1.5(Q_3 - Q_1)$; $\text{Extreme Outlier} > Q_3 + 3(Q_3 - Q_1)$
Coefficient of Variation (as a percentage or as an index): $CV = \frac{\sigma}{\mu}$ for a population; $CV = \frac{s}{\bar{X}}$ for a sample
Skewness or SK (Pearson's Coefficient of Skewness): $SK_{mode} = \frac{\bar{X} - \text{Mode}}{\text{std. dev}}$; $SK_{median} = \frac{\bar{X} - \text{Median}}{\text{std. dev}}$
Trimean: $\text{Trimean} = \frac{Q_1 + 2Q_2 + Q_3}{4}$

Table 2: Module 5 Descriptive Statistics Measures

C. Module 6: Index Numbers

Symbol/Formula: Description
Simple Index Number: $P = \frac{P_t}{P_o}(100)$
Simple Price Index: $P = \frac{\sum P_i}{n}$
Simple Aggregate Price Index: $P = \frac{\sum P_t}{\sum P_o} \times 100$
Laspeyres Index: $P_L = \frac{\sum P_t Q_o}{\sum P_o Q_o} \times 100$
Paasche Index: $P_P = \frac{\sum P_t Q_t}{\sum P_o Q_t} \times 100$
Fisher's Ideal Index: $P_F = \sqrt{(\text{Laspeyres})(\text{Paasche})}$
Value Index: $V = \frac{\sum P_t Q_t}{\sum P_o Q_o} \times 100$

Table 3: Module 6: Index Numbers

D. Module 7: Basic Concepts of Probability

Symbol/Formula: Description
P(A): The probability of event "A"
P(~A): The probability of the event not A. This is called the complement of event A. It is sometimes written as $P(A^C)$ or P(not A).
P(A|B): The probability of event A given that event B has happened. This is called conditional probability.
Special Rule of Addition (for mutually exclusive events): $P(A \text{ or } B) = P(A) + P(B)$, or $P(A \cup B) = P(A) + P(B)$. Note: $\cup$ is pronounced "union" and is equivalent to the word "or."
Complement Rule (Subtraction Rule): $P(A) = 1 - P(\sim A)$
General Rule of Addition (for non-mutually exclusive events): $P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$, or $P(A \cup B) = P(A) + P(B) - P(A \cap B)$. Note: $\cap$ is pronounced "intersection" and is equivalent to the word "and."
Special Rule of Multiplication (for independent events): $P(A \text{ and } B) = P(A)P(B)$, or $P(A \cap B) = P(A)P(B)$
General Rule of Multiplication (for dependent events): $P(A \text{ and } B) = P(A)P(B \mid A)$, or $P(A \cap B) = P(A)P(B \mid A)$
Bayes Theorem: $P(A_1 \mid B) = \frac{P(A_1)P(B \mid A_1)}{P(A_1)P(B \mid A_1) + P(A_2)P(B \mid A_2)}$
Multiplication Formula: Total Arrangements = (m)(n)(o)
Factorial Number: n! (The factorial of a non-negative integer n, denoted by n!, is the product of all positive integers less than or equal to n: 4! = 1 × 2 × 3 × 4 = 24.)
Permutations, $_nP_r$: $_nP_r$ is pronounced "the permutation of r things selected from n things." Note: With permutations, the order of selection matters.
Combinations, $_nC_r$: $_nC_r$ is pronounced "the combination of r things selected from n things." Note: With combinations, the order of selection does not matter.

Table 4: Module 7: Basic Concepts of Probability

E. Module 8: Discrete Probability Distributions

1) Mean of a Probability Distribution, μ
$\mu = \sum [x P(x)]$, found by multiplying each value by its probability and then summing the products.

2) Variance of a Probability Distribution, $\sigma^2$
$\sigma^2 = \sum [(x - \mu)^2 P(x)]$, found by (1) subtracting the mean from each value of the random variable, x; (2) squaring each difference, (x − μ); (3) multiplying each squared difference by its probability; and (4) summing the resulting values to arrive at $\sigma^2$.

3) Standard Deviation of a Probability Distribution, σ
$\sigma = \sqrt{\sigma^2}$; the standard deviation is the positive square root of the variance.

4) Binomial Probability Formula
$P(x) = {}_nC_x \, \pi^x (1 - \pi)^{n - x}$, where C denotes combinations, n is the number of trials, x is the random number of successful trials, and π is the probability of a success for each trial. Note: π, or pi, is not the mathematical constant 3.14159 that you used in your geometry class to find the circumference of a circle.

5) Mean of a Binomial Distribution
$\mu = n\pi$

6) Variance of a Binomial Distribution
$\sigma^2 = n\pi(1 - \pi)$

7) Hypergeometric Distribution
$P(x) = \frac{({}_SC_x)({}_{N-S}C_{n-x})}{{}_NC_n}$
Where N is the size of the population; S is the number of successes in the population; x is the number of successes in the sample (it could be 0, 1, 2, 3, 4, …); n is the size of the sample (number of trials); and C denotes combinations.

8) Poisson Distribution
$P(x) = \frac{\mu^x e^{-\mu}}{x!}$
Where μ is the mean number of successes in a particular interval; e is the constant, or base of the Napierian logarithmic system, 2.71828; x is the number of successes; and P(x) is the probability of a specified value of x.

9) Mean of a Poisson Distribution
$\mu = n\pi$

F. Module 9: Continuous Probability Distributions

Symbol/Formula: Description
Standard Normal Value: $z = \frac{X - \mu}{\sigma}$
Standard Error for the Mean, sigma sub x-bar or SEM: $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
z-value, μ and σ known: $z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
Solving for X: $X = \mu + z\sigma$. Note: z can be either a positive or a negative number.

Table 5: Module 9: Continuous Probability Distribution

G. Module 10: Sampling and Sampling Errors

Symbol/Formula: Description
Mean of the Sample Means (mu sub x-bar): $\mu_{\bar{X}} = \frac{\text{Sum of all sample means}}{\text{Total number of samples}}$
Sampling Error: $\bar{X} - \mu = 0$ or $\bar{X} \ne \mu$
z-value for sample: $z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
Standard Error of the Mean, SEM, or $\sigma_{\bar{X}}$: $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$

Table 6: Module 10: Sampling and Sampling Errors

H. Module 11: Confidence Intervals

Symbol/Formula: Description
c: The selected confidence level; usually 95%, but in some cases 99% or 90%.
Critical Value: The value a test statistic must exceed to fall outside the confidence interval, or the value a test statistic must exceed to reject the Null Hypothesis. A test statistic is a value derived from a sample for the purposes of hypothesis testing and confidence intervals. Do not report the Critical Value as CV. CV is the Coefficient of Variation.
$z_c$: The critical value for a confidence level using z values.
$t_c$: The critical value for a confidence level using t values.
Confidence Interval for Means using z: $\bar{X} \pm z\frac{\sigma}{\sqrt{n}}$
Margin of Error for the Mean using z: $z\frac{\sigma}{\sqrt{n}}$
d.f., df, or ν (the lower-case or small Greek letter nu): Note: The formula for degrees of freedom depends on the type of distribution used.
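
As a rough illustration of the Module 11 formulas for the confidence interval and the margin of error for the mean using z, the following Python sketch uses only the standard library. The function name z_confidence_interval and the sample figures (mean 50, σ = 10, n = 100) are assumptions made up for this example and do not come from Clear-Sighted Statistics. The sketch also assumes that the population standard deviation σ is known.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Return (lower, upper, margin) for a z-based confidence interval.

    Hypothetical helper for illustration; assumes sigma (the population
    standard deviation) is known.
    """
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    if sigma < 0:
        raise ValueError("sigma must be non-negative")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")

    # Two-tailed critical value z_c: leave (1 - c) / 2 in each tail.
    z_c = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

    # Standard error of the mean: sigma / sqrt(n).
    standard_error = sigma / sqrt(n)

    # Margin of error: z * sigma / sqrt(n).
    margin = z_c * standard_error

    # Confidence interval: X-bar plus or minus the margin of error.
    return sample_mean - margin, sample_mean + margin, margin


# Made-up example values, chosen only to exercise the function.
lower, upper, margin = z_confidence_interval(sample_mean=50.0, sigma=10.0, n=100)
print(f"margin of error: {margin:.3f}")
print(f"95% confidence interval: ({lower:.3f}, {upper:.3f})")
```

Passing a different confidence value, such as 0.99 or 0.90, changes only the critical value; the standard error stays the same for a given σ and n.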
