
A Facts from Probability, Statistics, and Algebra

A.1 Introduction

It is assumed that the reader is already familiar with the basics of probability, statistics, matrix algebra, and other mathematical topics needed in this book, and so the goal of this appendix is merely to provide a quick review and cover some more advanced topics that may not be familiar.

A.2 Probability Distributions

A.2.1 Cumulative Distribution Functions

The cumulative distribution function (CDF) of Y is defined as

\[
F_Y(y) = P\{Y \le y\}.
\]

If Y has a PDF f_Y, then
\[
F_Y(y) = \int_{-\infty}^{y} f_Y(u)\, du.
\]
Many CDFs and PDFs can be calculated by computer software packages; for instance, pnorm, pt, and pbinom in R calculate, respectively, the CDF of a normal, t, and binomial distribution. Similarly, dnorm, dt, and dbinom calculate the PDFs of these distributions.
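For instance, a few such evaluations in R (the arguments shown are arbitrary illustrative values):

```r
pnorm(1.96)                        # N(0,1) CDF at 1.96, about 0.975
pnorm(3, mean = 1, sd = 2)         # N(1, 4) CDF at 3
pt(2.1, df = 10)                   # CDF of a t-distribution with 10 degrees of freedom
pbinom(3, size = 10, prob = 0.5)   # P(Y <= 3) for Y ~ Binomial(10, 0.5)
dnorm(0)                           # standard normal density at 0, about 0.3989
dbinom(3, size = 10, prob = 0.5)   # P(Y = 3) for Y ~ Binomial(10, 0.5)
```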

A.2.2 Quantiles and Percentiles

If the CDF F(y) of a random variable Y is continuous and strictly increasing, then it has an inverse function F^{-1}. For each q between 0 and 1, F^{-1}(q) is called the q-quantile or 100qth percentile. The median is the 50% percentile or 0.5-quantile. The 25% and 75% percentiles (0.25- and 0.75-quantiles) are called the first and third quartiles, and the median is the second quartile. The three quartiles divide the support of a continuous random variable into four groups of equal probability. Similarly, the 20%, 40%, 60%, and 80% percentiles are called quintiles, and the 10%, 20%, . . . , 90% percentiles are called deciles.

For any CDF F, invertible or not, the pseudo-inverse is defined as

\[
F^{-}(x) = \inf\{y : F(y) \ge x\}.
\]

Here “inf” is the infimum, or greatest lower bound, of a set; see Section A.5. For any q between 0 and 1, the qth quantile will be defined as F^{-}(q). If F is invertible, then F^{-1} = F^{-}, so this definition of quantile agrees with the one for invertible CDFs. F^{-} is often called the quantile function.

Sometimes a (1 − α)-quantile is called an α-upper quantile, to emphasize the amount of probability above the quantile. In analogy, a quantile might also be referred to as a lower quantile.

Quantiles are said to “respect transformations” in the following sense. If Y is a random variable whose q-quantile equals y_q, if g is a strictly increasing function, and if X = g(Y), then g(y_q) is the q-quantile of X; see (A.5).
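As a quick numerical illustration of the last point, consider a lognormal random variable, which is the increasing transformation X = exp(Y) of a normal Y. The following R sketch (with arbitrary parameter values) checks that the quantiles transform accordingly:

```r
mu <- 1; sigma <- 0.5; q <- 0.9
exp(qnorm(q, mean = mu, sd = sigma))     # transform the normal q-quantile by g(y) = exp(y)
qlnorm(q, meanlog = mu, sdlog = sigma)   # lognormal q-quantile computed directly; the two agree
```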

A.2.3 Symmetry and Modes

A probability density function (PDF) f is said to be symmetric about µ if f(µ − y) = f(µ + y) for all y. A mode of a PDF is a local maximum, that is, a value y such that for some ε > 0, f(y) > f(x) if y − ε < x < y or y < x < y + ε. A PDF with one mode is called unimodal, with two modes bimodal, and with two or more modes multimodal.

A.2.4 Support of a Distribution

The support of a discrete distribution is the set of all y that have a positive probability. More generally, a point y is in the support of a distribution if, for every ε > 0, the interval (y − ε, y + ε) has positive probability. For example, the support of a normal distribution is (−∞, ∞), the support of a gamma or lognormal distribution is [0, ∞), and the support of a Binomial(n, p) distribution is {0, 1, 2, . . . , n} provided p ≠ 0, 1.¹

¹ It is assumed that most readers are already familiar with the normal, gamma, lognormal, and binomial distributions. However, these distributions will be discussed in some detail later.

A.3 When Do Expected Values and Variances Exist?

The expected value of a random variable could be infinite or not exist at all. Also, a random variable need not have a well-defined and finite variance. To appreciate these facts, let Y be a random variable with density f_Y. The expectation of Y is
\[
\int_{-\infty}^{\infty} y\, f_Y(y)\, dy,
\]
provided that this integral is defined. If

\[
\int_{-\infty}^{0} y\, f_Y(y)\, dy = -\infty \quad\text{and}\quad \int_{0}^{\infty} y\, f_Y(y)\, dy = \infty, \tag{A.1}
\]
then the expectation is, formally, −∞ + ∞, which is not defined, so the expectation does not exist. If the integrals in (A.1) are both finite, then E(Y) exists and equals the sum of these two integrals. The expectation can exist but be infinite, because if
\[
\int_{-\infty}^{0} y\, f_Y(y)\, dy = -\infty \quad\text{and}\quad \int_{0}^{\infty} y\, f_Y(y)\, dy < \infty,
\]
then E(Y) = −∞, and if
\[
\int_{-\infty}^{0} y\, f_Y(y)\, dy > -\infty \quad\text{and}\quad \int_{0}^{\infty} y\, f_Y(y)\, dy = \infty,
\]
then E(Y) = ∞.

If E(Y) is not defined or is infinite, then the variance, which involves E(Y), cannot be defined either. If E(Y) is defined and finite, then the variance is also defined. The variance is finite if E(Y²) < ∞; otherwise the variance is infinite.

The nonexistence of finite expected values and variances is of importance for modeling financial markets data because, for example, the popular GARCH models discussed in Chapter 18 need not have finite expected values and variances. Also, the t-distributions that, as demonstrated in Chapter 5, can provide good fits to equity returns may have nonexistent means or infinite variances.

One could argue that any variable Y derived from financial markets will be bounded, that is, that there is a constant M < ∞ such that P(|Y| ≤ M) = 1. In this case, the integrals in (A.1) are both finite, in fact at most M, and E(Y) exists and is finite. Also, E(Y²) ≤ M², so the variance of Y is finite. So should we worry at all about the mathematical niceties of whether expected values and variances exist and are finite? The answer is that we should. A random variable might be bounded in absolute value by a very large constant M and yet, if M is large enough, behave much like a random variable that does not have an expected value, or has an expected value that is infinite, or has a finite expected value but an infinite variance. This can be seen in simulations of GARCH processes. Results from computer simulations are bounded by the maximum size of a number in the computer. Yet these simulations behave as if the variance were infinite.

A.4 Monotonic Functions

The function g is increasing if g(x1) ≤ g(x2) whenever x1 < x2 and strictly increasing if g(x1) < g(x2) whenever x1 < x2. Decreasing and strictly decreasing are defined similarly, and g is (strictly) monotonic if it is either (strictly) increasing or (strictly) decreasing.

A.5 The Minimum, Maximum, Infimum, and Supremum of a Set

The minimum and maximum of a set are its smallest and largest values, if these exist. For example, if A = {x : 0 ≤ x ≤ 1}, then the minimum and maximum of A are 0 and 1. However, not all sets have a minimum or a maximum; for example, B = {x : 0 < x < 1} has neither a minimum nor a maximum. Every set has an infimum (or inf) and a supremum (or sup). The inf of a set C is the largest number that is less than or equal to all elements of C. Similarly, the sup of C is the smallest number that is greater than or equal to every element of C. The set B just defined has an inf of 0 and a sup of 1. The following notation is standard: min(C) and max(C) are the minimum and maximum of C, if these exist, and inf(C) and sup(C) are the infimum and supremum.

A.6 Functions of Random Variables

Suppose that X is a random variable with PDF fX (x) and Y = g(X) for g a strictly increasing function. Since g is strictly increasing, it has an inverse, which we denote by h. Then Y is also a random variable and its CDF is

FY (y) = P (Y ≤ y) = P {g(X) ≤ y} = P {X ≤ h(y)} = FX {h(y)}. (A.2)

Differentiating (A.2), we find the PDF of Y :

\[
f_Y(y) = f_X\{h(y)\}\, h'(y). \tag{A.3}
\]

Applying a similar argument to the case where g is strictly decreasing, one can show that whenever g is strictly monotonic, then

\[
f_Y(y) = f_X\{h(y)\}\, |h'(y)|. \tag{A.4}
\]

Also from (A.2), when g is strictly increasing, then

\[
F_Y^{-1}(p) = g\{F_X^{-1}(p)\}, \tag{A.5}
\]
so that the pth quantile of Y is found by applying g to the pth quantile of X. When g is strictly decreasing, it maps the pth quantile of X to the (1 − p)th quantile of Y.

Result A.6.1 Suppose that Y = a + bX for some constants a and b ≠ 0. Let g(x) = a + bx, so that the inverse of g is h(y) = (y − a)/b and h′(y) = 1/b. Then

\[
\begin{aligned}
F_Y(y) &= F_X\{b^{-1}(y - a)\}, & b > 0, \\
       &= 1 - F_X\{b^{-1}(y - a)\}, & b < 0, \\
f_Y(y) &= |b|^{-1} f_X\{b^{-1}(y - a)\},
\end{aligned}
\]
and

\[
\begin{aligned}
F_Y^{-1}(p) &= a + b\, F_X^{-1}(p), & b > 0, \\
            &= a + b\, F_X^{-1}(1 - p), & b < 0.
\end{aligned}
\]

A.7 Random Samples

We say that {Y1, . . . , Yn} is a random sample from a probability distribution if they each have that probability distribution and if they are independent. In this case, we also say that they are independent and identically distributed, or simply i.i.d. The probability distribution is often called the population, and its expected value, variance, CDF, and quantiles are called the population mean, population variance, population CDF, and population quantiles. It is worth mentioning that the population is, in effect, infinite. There is a theory of sampling, usually without replacement, from finite populations, but sampling of this type will not concern us here. Even in cases where the population is finite, such as when sampling house prices, the population is usually large enough that it can be treated as infinite. If Y1, . . . , Yn is a sample from an unknown probability distribution, then the population mean can be estimated by the sample mean

\[
\bar{Y} = n^{-1} \sum_{i=1}^{n} Y_i, \tag{A.6}
\]
and the population variance can be estimated by the sample variance
\[
s_Y^2 = \frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n - 1}. \tag{A.7}
\]
The reason for the denominator of n − 1 rather than n is discussed in Section 5.9. The sample standard deviation is s_Y, the square root of s_Y².
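A minimal R illustration of (A.6) and (A.7); the data are simulated only to make the example self-contained:

```r
set.seed(1)
y <- rnorm(100, mean = 5, sd = 2)   # a random sample from N(5, 4)
mean(y)        # sample mean, equation (A.6)
var(y)         # sample variance with denominator n - 1, equation (A.7)
sd(y)          # sample standard deviation, the square root of var(y)
```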

A.8 The Binomial Distribution

Suppose that we conduct n experiments for some fixed (nonrandom) integer n. On each experiment there are two possible outcomes called “success” and “failure”; the probability of a success is p, and the probability of a failure is q = 1 − p. It is assumed that p and q are the same for all n experiments. Let Y be the total number of successes, so that Y will equal 0, 1, 2, . . . , or n. If the experiments are independent, then
\[
P(Y = k) = \binom{n}{k} p^k q^{n-k} \quad \text{for } k = 0, 1, 2, \ldots, n,
\]
where
\[
\binom{n}{k} = \frac{n!}{k!\,(n - k)!}.
\]
The distribution of Y is called the binomial distribution and denoted Binomial(n, p). The expected value of Y is np and its variance is npq. The Binomial(1, p) distribution is also called the Bernoulli distribution and its density is

\[
P(Y = y) = p^{y} (1 - p)^{1-y}, \quad y = 0, 1. \tag{A.8}
\]

Notice that p^y is equal to either p (when y = 1) or 1 (when y = 0), and similarly for (1 − p)^{1−y}.

A.9 Some Common Continuous Distributions

A.9.1 Uniform Distributions

The uniform distribution on the interval (a, b) is denoted by Uniform(a, b) and has PDF equal to 1/(b − a) on (a, b) and equal to 0 outside this interval. It is easy to check that if Y is Uniform(a, b), then its expectation is
\[
E(Y) = \frac{1}{b - a} \int_a^b y\, dy = \frac{a + b}{2},
\]
which is the midpoint of the interval. Also,
\[
E(Y^2) = \frac{1}{b - a} \int_a^b y^2\, dy = \frac{y^3}{3(b - a)}\bigg|_a^b = \frac{b^2 + ab + a^2}{3}.
\]
Therefore,
\[
\sigma_Y^2 = E(Y^2) - \{E(Y)\}^2 = \frac{b^2 + ab + a^2}{3} - \left(\frac{a + b}{2}\right)^2 = \frac{(b - a)^2}{12}.
\]

Reparameterization means replacing the parameters of a distribution by an equivalent set. The uniform distribution can be reparameterized by using µ = (a + b)/2 and σ = (b − a)/√12 as the parameters. Then µ is a location parameter and σ is a scale parameter. Which parameterization of a distribution is used depends upon which aspects of the distribution one wishes to emphasize. The parameterization (a, b) of the uniform distribution specifies its endpoints, while the parameterization (µ, σ) gives the mean and standard deviation. One is free to move back and forth between two or more parameterizations, using whichever is most useful in a given context. The uniform distribution does not have a shape parameter since the shape of its density is always rectangular.

A.9.2 Transformation by the CDF and Inverse CDF

If Y has a continuous CDF F , then F (Y ) has a Uniform(0,1) distribution. F (Y ) is often called the probability transformation of Y . This fact is easy to see if F is strictly increasing, since then F −1 exists, so that

\[
P\{F(Y) \le y\} = P\{Y \le F^{-1}(y)\} = F\{F^{-1}(y)\} = y. \tag{A.9}
\]

The result holds even if F is not strictly increasing, but the proof is slightly more complicated. It is only necessary that F be continuous. If U is Uniform(0,1) and F is a CDF, then Y = F −(U) has F as its CDF. Here F − is the pseudo-inverse of F . This can be proved easily when F is continuous and strictly increasing, since then F −1 = F − and

\[
P(Y \le y) = P\{F^{-1}(U) \le y\} = P\{U \le F(y)\} = F(y).
\]

In fact, the result holds for any CDF F, but it is more difficult to prove in the general case. F^{-}(U) is often called the quantile transformation since F^{-} is the quantile function.
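Both transformations are easy to check numerically; here is a small R sketch (the normal distribution and sample sizes are arbitrary choices):

```r
set.seed(123)
# Probability transformation: if Y has continuous CDF F, then F(Y) is Uniform(0,1).
y <- rnorm(10000, mean = 2, sd = 3)
u <- pnorm(y, mean = 2, sd = 3)
c(mean(u), var(u))        # should be close to 1/2 and 1/12

# Quantile transformation: if U is Uniform(0,1), then F^{-1}(U) has CDF F.
u2 <- runif(10000)
y2 <- qnorm(u2, mean = 2, sd = 3)
c(mean(y2), sd(y2))       # should be close to 2 and 3
```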

A.9.3 Normal Distributions

The standard normal distribution has density
\[
\phi(y) = \frac{1}{\sqrt{2\pi}}\, \exp\left(-y^2/2\right), \quad -\infty < y < \infty.
\]
The standard normal distribution has mean 0 and variance 1. If Z is standard normal, then the distribution of µ + σZ is called the normal distribution with mean µ and variance σ² and denoted by N(µ, σ²). By Result A.6.1, the N(µ, σ²) density is
\[
\frac{1}{\sigma}\, \phi\!\left(\frac{y - \mu}{\sigma}\right) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{-\frac{(y - \mu)^2}{2\sigma^2}\right\}. \tag{A.10}
\]
The parameter µ is a location parameter and σ is a scale parameter. The normal distribution does not have a shape parameter since its density is always the same bell-shaped curve.² The standard normal CDF is
\[
\Phi(y) = \int_{-\infty}^{y} \phi(u)\, du.
\]

Φ can be evaluated using software such as R’s pnorm function. If Y is N(µ, σ2), then since Y = µ + σZ, where Z is standard normal, by Result A.6.1,

FY (y) = Φ{(y − µ)/σ}. (A.11)

Normal distributions are also called Gaussian distributions after the great German mathematician Carl Friedrich Gauss.

Normal Quantiles

The q-quantile of the N(0, 1) distribution is Φ^{-1}(q) and, more generally, the q-quantile of an N(µ, σ²) distribution is µ + σΦ^{-1}(q). The α-upper quantile of Φ, that is, Φ^{-1}(1 − α), is denoted by z_α. As shown later, z_α is widely used for confidence intervals.
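For example, in R (α = 0.05 and the N(10, 4) distribution are arbitrary illustrative choices):

```r
alpha <- 0.05
qnorm(1 - alpha)                   # z_alpha, the alpha-upper quantile of N(0,1): about 1.645
qnorm(0.975)                       # z_{0.025}, used for 95% confidence intervals: about 1.96
mu <- 10; sigma <- 2
mu + sigma * qnorm(0.9)            # 0.9-quantile of N(10, 4)
qnorm(0.9, mean = mu, sd = sigma)  # the same quantile computed directly
```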

A.9.4 The Lognormal Distribution

If Z is distributed N(µ, σ²), then Y = exp(Z) is said to have a Lognormal(µ, σ²) distribution. In other words, Y is lognormal if its logarithm is normally distributed. We will call µ the log-mean and σ the log-standard deviation. Also, σ² will be called the log-variance.

² In contrast, a t-density is also a bell curve, but the exact shape of the bell depends on a shape parameter, the degrees of freedom.

[Figure A.1 about here: lognormal densities for (µ, σ) = (1.0, 1.0), (1.0, 0.5), and (1.5, 0.5).]

Fig. A.1. Examples of lognormal probability densities. Here µ and σ are the log-mean and log-standard deviation, that is, the mean and standard deviation of the logarithm of the lognormal random variable.

The median of Y is exp(µ) and the expected value of Y is exp(µ + σ²/2).³ The expectation is larger than the median because the lognormal distribution is right-skewed, and the skewness is more extreme with larger values of σ. Skewness is discussed further in Section 5.4. The probability density functions of several lognormal distributions are shown in Figure A.1. The log-mean µ is a scale parameter and the log-standard deviation σ is a shape parameter. The lognormal distribution does not have a location parameter since its support is fixed to start at 0.

³ It is important to remember that if Y is lognormal(µ, σ), then µ is the expected value of log(Y), not of Y.
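A quick simulation check of the median and mean formulas in R (a sketch; the parameter values are arbitrary):

```r
set.seed(42)
mu <- 1; sigma <- 0.5
y <- rlnorm(100000, meanlog = mu, sdlog = sigma)
c(median(y), exp(mu))              # sample median vs exp(mu)
c(mean(y), exp(mu + sigma^2/2))    # sample mean vs exp(mu + sigma^2/2)
```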

A.9.5 Exponential and Double-Exponential Distributions

The exponential distribution with scale parameter θ > 0, which we denote by Exponential(θ), has CDF

\[
F(y) = 1 - e^{-y/\theta}, \quad y > 0.
\]

The Exponential(θ) distribution has PDF

\[
f(y) = \frac{e^{-y/\theta}}{\theta}, \tag{A.12}
\]
expected value θ, and standard deviation θ. The inverse CDF is

\[
F^{-1}(y) = -\theta \log(1 - y), \quad 0 < y < 1.
\]

[Figure A.2 about here: gamma probability densities with shape parameters α = 0.75, 3/2, and 7/2.]

Fig. A.2. Examples of gamma probability densities with differing shape parameters. In each case, the scale parameter has been chosen so that the expectation is 1.

The double-exponential or Laplace distribution with mean µ and scale parameter θ has PDF
\[
f(y) = \frac{e^{-|y - \mu|/\theta}}{2\theta}. \tag{A.13}
\]
If Y has a double-exponential distribution with mean µ, then |Y − µ| has an exponential distribution. A double-exponential distribution has a standard deviation of √2 θ. The mean µ is a location parameter and θ is a scale parameter.
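The inverse CDF above gives a simple way to simulate Exponential(θ) random variables by the quantile transformation of Section A.9.2; a minimal R sketch with an arbitrary θ:

```r
set.seed(7)
theta <- 2
u <- runif(100000)
y <- -theta * log(1 - u)                         # inverse-CDF (quantile transformation) method
c(mean(y), sd(y))                                # both should be close to theta = 2
c(quantile(y, 0.9), qexp(0.9, rate = 1/theta))   # sample vs theoretical 0.9-quantile
```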

A.9.6 Gamma and Inverse-Gamma Distributions

The gamma distribution with scale parameter b > 0 and shape parameter α > 0 has density
\[
\frac{y^{\alpha - 1} \exp(-y/b)}{\Gamma(\alpha)\, b^{\alpha}},
\]
where Γ is the gamma function defined in Section 5.5.2. The mean, variance, and skewness coefficient of this distribution are bα, b²α, and 2α^{-1/2}, respectively. Figure A.2 shows gamma densities with shape parameters equal to 0.75, 3/2, and 7/2, each with a mean equal to 1. The gamma distribution is often parameterized using β = 1/b, so that the density is
\[
\frac{\beta^{\alpha} y^{\alpha - 1} \exp(-\beta y)}{\Gamma(\alpha)}.
\]

[Figure A.3 about here: beta densities with (α, β) = (3, 9), (5, 5), and (4, 1/2).]

Fig. A.3. Examples of beta probability densities with differing shape parameters.

With this form of the parameterization, β is an inverse-scale parameter and the mean and variance are α/β and α/β². If X has a gamma distribution with inverse-scale parameter β and shape parameter α, then we say that 1/X has an inverse-gamma distribution with scale parameter β and shape parameter α. The mean of this distribution is β/(α − 1) provided α > 1, and the variance is β²/{(α − 1)²(α − 2)} provided that α > 2.

A.9.7 Beta Distributions

The beta distribution with shape parameters α > 0 and β > 0 has density
\[
\frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\, y^{\alpha - 1} (1 - y)^{\beta - 1}, \quad 0 < y < 1. \tag{A.14}
\]

The mean and variance are α/(α + β) and (αβ)/{(α + β)²(α + β + 1)}, and if α > 1 and β > 1, then the mode is (α − 1)/(α + β − 2).

Figure A.3 shows beta densities for several choices of shape parameters. A beta density is right-skewed, symmetric about 1/2, or left-skewed depending on whether α < β, α = β, or α > β.

A.9.8 Pareto Distributions

A random variable X has a Pareto distribution, named after the Swiss economics professor Vilfredo Pareto (1848–1923), if its CDF is, for some a > 0,
\[
F(x) = 1 - \left(\frac{c}{x}\right)^{a}, \quad x > c, \tag{A.15}
\]
where c > 0 is the minimum possible value of X. The PDF of the distribution in (A.15) is

\[
f(x) = \frac{a c^{a}}{x^{a+1}}, \quad x > c, \tag{A.16}
\]
so a Pareto distribution has polynomial tails and a is the tail index. It is also called the Pareto constant.
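Because the CDF (A.15) is easy to invert, Pareto random variables can also be simulated by the quantile transformation; a small R sketch (the values of a and c are arbitrary, and the mean formula ac/(a − 1), valid for a > 1, is used only as a check):

```r
set.seed(99)
a <- 3; x_min <- 1                   # tail index and minimum value (arbitrary choices)
u <- runif(100000)
x <- x_min * (1 - u)^(-1/a)          # inverse of the CDF in (A.15)
mean(x)                              # theoretical mean is a*c/(a - 1) = 1.5 when a = 3, c = 1
mean(x > 2)                          # theoretical P(X > 2) = (c/2)^a = 0.125
```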

A.10 Sampling a Normal Distribution

A common situation is that we have a random sample from a normal distribution and we wish to have confidence intervals for the mean and variance or test hypotheses about these parameters. Then, the following distributions are very important, since they are the basis for many commonly used confidence intervals and tests.

A.10.1 Chi-Squared Distributions

Suppose that Z1, . . . , Zn are i.i.d. N(0, 1). Then the distribution of $Z_1^2 + \cdots + Z_n^2$ is called the chi-squared distribution with n degrees of freedom. This distribution has an expected value of n and a variance of 2n. The α-upper quantile of this distribution is denoted by $\chi^2_{\alpha,n}$ and is used in tests and confidence intervals about variances; see Section A.17.2 for the latter. Also, as discussed in Section 5.11, $\chi^2_{\alpha,n}$ is used in likelihood ratio testing.

So far, the degrees-of-freedom parameter has been integer-valued, but this can be generalized. The chi-squared distribution with ν degrees of freedom is equal to the gamma distribution with scale parameter equal to 2 and shape parameter equal to ν/2. Thus, since the shape parameter of a gamma distribution can be any positive value, the chi-squared distribution can be defined for any positive value of ν as the gamma distribution with scale and shape parameters equal to 2 and ν/2, respectively.
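The chi-squared/gamma relationship can be verified numerically in R (ν = 5 and the evaluation points are arbitrary):

```r
nu <- 5
x <- c(0.5, 1, 2, 5, 10)
dchisq(x, df = nu)                     # chi-squared density with nu degrees of freedom
dgamma(x, shape = nu/2, scale = 2)     # gamma density with shape nu/2 and scale 2: identical
pchisq(3, df = nu) - pgamma(3, shape = nu/2, scale = 2)   # difference is zero
```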

A.10.2 F -distributions

If U and W are independent and chi-squared-distributed with n1 and n2 degrees of freedom, respectively, then the distribution of

\[
\frac{U/n_1}{W/n_2}
\]
is called the F-distribution with n1 and n2 degrees of freedom. The α-upper quantile of this distribution is denoted by $F_{\alpha,n_1,n_2}$, which is used as a critical value for F-tests in regression. The degrees-of-freedom parameters of the chi-squared, t-, and F-distributions are shape parameters.

A.11 Law of Large Numbers and the Central Limit Theorem for the Sample Mean

Suppose that $\bar{Y}_n$ is the mean of an i.i.d. sample Y1, . . . , Yn. We assume that their common expected value E(Y1) exists and is finite and call it µ. The law of large numbers states that
\[
P\bigl(\bar{Y}_n \to \mu \text{ as } n \to \infty\bigr) = 1.
\]
Thus, the sample mean will be close to the population mean for large enough sample sizes.

However, even more is true. The famous central limit theorem (CLT) states that if the common variance σ² of Y1, . . . , Yn is finite, then the probability distribution of $\bar{Y}_n$ gets closer to a normal distribution as n converges to ∞. More precisely, the CLT states that
\[
P\{\sqrt{n}\,(\bar{Y}_n - \mu) \le y\} \to \Phi(y/\sigma) \quad \text{as } n \to \infty \text{ for all } y. \tag{A.17}
\]
Stated differently, for large n, $\bar{Y}$ is approximately N(µ, σ²/n).

Students often misremember or misunderstand the CLT. A common misconception is that a large population is approximately normally distributed. The CLT says nothing about the distribution of a population; it is only a statement about the distribution of a sample mean. Also, the CLT does not assume that the population is large; it is the size of the sample that is converging to infinity. Assuming that the sampling is with replacement, the population could be quite small, in fact, with only two elements.

When the variance of Y1, . . . , Yn is infinite, the limit distribution of $\bar{Y}_n$ may still exist but will be a nonnormal stable distribution.

Although the CLT was first discovered for the sample mean, other estimators are now known to also have approximately normal distributions for large sample sizes. In particular, there are central limit theorems for the maximum likelihood estimators of Section 5.9 and the least-squares estimators discussed in Chapter 12. This is very important, since most estimators we use will be maximum likelihood estimators or least-squares estimators. So, if we have a reasonably large sample, we can assume that these estimators have an approximately normal distribution, and the normal distribution can be used for testing and constructing confidence intervals.
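The following small R simulation illustrates the CLT for a decidedly nonnormal population, an Exponential(1) distribution; the sample size and number of replications are arbitrary choices:

```r
set.seed(2011)
n <- 30                                             # sample size
nrep <- 10000                                       # number of simulated samples
ybar <- replicate(nrep, mean(rexp(n, rate = 1)))    # sample means from Exponential(1)
c(mean(ybar), sd(ybar))                             # close to mu = 1 and sigma/sqrt(n) = 1/sqrt(30)
qqnorm(ybar); qqline(ybar)                          # normal QQ plot of the sample means is nearly straight
```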

A.12 Bivariate Distributions

Let f_{Y1,Y2}(y1, y2) be the joint density of a pair of random variables (Y1, Y2). Then the marginal density of Y1 is obtained by “integrating out” Y2:
\[
f_{Y_1}(y_1) = \int f_{Y_1,Y_2}(y_1, y_2)\, dy_2,
\]
and similarly $f_{Y_2}(y_2) = \int f_{Y_1,Y_2}(y_1, y_2)\, dy_1$. The conditional density of Y2 given Y1 is
\[
f_{Y_2|Y_1}(y_2|y_1) = \frac{f_{Y_1,Y_2}(y_1, y_2)}{f_{Y_1}(y_1)}. \tag{A.18}
\]

Equation (A.18) can be rearranged to give the joint density of Y1 and Y2 as the product of a marginal density and a conditional density:

fY1,Y2 (y1, y2) = fY1 (y1)fY2|Y1 (y2|y1) = fY2 (y2)fY1|Y2 (y1|y2). (A.19)

The conditional expectation of Y2 given Y1 is just the expectation calculated using $f_{Y_2|Y_1}(y_2|y_1)$:
\[
E(Y_2|Y_1 = y_1) = \int y_2\, f_{Y_2|Y_1}(y_2|y_1)\, dy_2,
\]
which is, of course, a function of y1. The conditional variance of Y2 given Y1 is
\[
\operatorname{Var}(Y_2|Y_1 = y_1) = \int \{y_2 - E(Y_2|Y_1 = y_1)\}^2 f_{Y_2|Y_1}(y_2|y_1)\, dy_2.
\]

A formula that is important elsewhere in this book is

\[
f_{Y_1,\ldots,Y_n}(y_1, \ldots, y_n) = f_{Y_1}(y_1)\, f_{Y_2|Y_1}(y_2|y_1) \cdots f_{Y_n|Y_1,\ldots,Y_{n-1}}(y_n|y_1, \ldots, y_{n-1}), \tag{A.20}
\]
which follows from repeated use of (A.19).

The marginal mean and variance are related to the conditional mean and variance by
\[
E(Y) = E\{E(Y|X)\} \tag{A.21}
\]
and
\[
\operatorname{Var}(Y) = E\{\operatorname{Var}(Y|X)\} + \operatorname{Var}\{E(Y|X)\}. \tag{A.22}
\]
Result (A.21) has various names, especially the law of iterated expectations and the tower rule. Another useful formula is that if Z is a function of X, then

E(ZY |X) = ZE(Y |X). (A.23)

The idea here is that, given X, Z is constant and can be factored outside the conditional expectation.
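Formulas (A.21) and (A.22) are easy to check by simulation; here is a sketch in R using a simple hierarchical model (X standard normal and Y given X normal with mean X and variance 4; the model is an arbitrary choice):

```r
set.seed(5)
n <- 200000
x <- rnorm(n)                      # X ~ N(0, 1)
y <- rnorm(n, mean = x, sd = 2)    # Y | X ~ N(X, 4), so E(Y|X) = X and Var(Y|X) = 4
mean(y)                            # close to E{E(Y|X)} = E(X) = 0, checking (A.21)
var(y)                             # close to E{Var(Y|X)} + Var{E(Y|X)} = 4 + 1 = 5, checking (A.22)
```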

A.13 Correlation and Covariance

Expectations and variances summarize the individual behavior of random variables. If we have two random variables, X and Y , then it is convenient to have some way to summarize their joint behavior—correlation and covariance do this. The covariance between two random variables X and Y is  

\[
\operatorname{Cov}(X, Y) = \sigma_{XY} = E\bigl[\{X - E(X)\}\{Y - E(Y)\}\bigr].
\]

The two notations Cov(X, Y) and σXY will be used interchangeably. If (X, Y) is continuously distributed, then using (A.36), we have
\[
\sigma_{XY} = \int\!\!\int \{x - E(X)\}\{y - E(Y)\}\, f_{XY}(x, y)\, dx\, dy.
\]

The following are useful formulas:

σXY = E(XY ) − E(X)E(Y ), (A.24)

σXY = E[{X − E(X)}Y ], (A.25)

σXY = E[{Y − E(Y )}X], (A.26)

σXY = E(XY ) if E(X) = 0 or E(Y ) = 0. (A.27)

The covariance between two variables measures the linear association between them, but it is also affected by their variability; all else equal, random variables with larger standard deviations have a larger covariance. Correlation is covariance after this size effect has been removed, so that correlation is a pure measure of how closely two random variables are related, or more precisely, linearly related. The Pearson correlation coefficient between X and Y is

\[
\operatorname{Corr}(X, Y) = \rho_{XY} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}. \tag{A.28}
\]

The Pearson correlation coefficient is sometimes called simply the correlation coefficient, though there are other types of correlation coefficients; see Section 8.5. Given a bivariate sample $\{(X_i, Y_i)\}_{i=1}^n$, the sample covariance, denoted by $s_{XY}$ or $\hat{\sigma}_{XY}$, is
\[
s_{XY} = \hat{\sigma}_{XY} = (n - 1)^{-1} \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}), \tag{A.29}
\]
where $\bar{X}$ and $\bar{Y}$ are the sample means. Often the factor $(n-1)^{-1}$ is replaced by $n^{-1}$, but this change has little effect relative to the random variation in $\hat{\sigma}_{XY}$. The sample correlation is
\[
\hat{\rho}_{XY} = r_{XY} = \frac{s_{XY}}{s_X s_Y}, \tag{A.30}
\]
where sX and sY are the sample standard deviations.

To provide the reader with a sense of what particular values of a correlation coefficient imply about the relationship between two random variables, Figure A.4 shows scatterplots and the sample correlation coefficients for nine bivariate random samples. A scatterplot is just a plot of a bivariate sample, $\{(X_i, Y_i)\}_{i=1}^n$. Each plot also contains the linear least-squares fit (Chapter 12) to illustrate the linear relationship between y and x. Notice that

• an absolute correlation of 0.25 or less is weak—see panels (a) and (b);
• an absolute correlation of 0.5 is only moderately strong—see (c);
• an absolute correlation of 0.9 is strong—see (d);
• an absolute correlation of 1 implies an exact linear relationship—see (e) and (h);
• a strong nonlinear relationship may or may not imply a high correlation—see (f) and (g);
• positive correlations imply an increasing relationship (as X increases, Y increases on average)—see (b)–(e) and (g);
• negative correlations imply a decreasing relationship (as X increases, Y decreases on average)—see (h) and (i).

If the correlation between two random variables is equal to 0, then we say that they are uncorrelated.

[Figure A.4 about here: nine scatterplots with sample correlations (a) r = 0.008, (b) r = 0.25, (c) r = 0.5, (d) r = 0.9, (e) r = 1, (f) r = −0.05, (g) r = 0.84, (h) r = −1, (i) r = −0.88.]

Fig. A.4. Sample correlation coefficients for nine random samples. Each plot also contains the least-squares line of y on x.

If X and Y are independent, then for all functions g and h, E{g(X)h(Y )} = E{g(X)}E{h(Y )}. (A.31)

This fact can be used to prove that if X and Y are independent, then σXY = 0, so the variables are uncorrelated. The converse is not true. For example, if X is uniformly distributed on [−1, 1] and Y = X², then a simple calculation shows that σXY = 0, but the two random variables are not independent. The key point here is that Y is related to X, in fact, completely determined by X, but the relationship is highly nonlinear, and correlation measures linear association. Another example of random variables that are uncorrelated but dependent is the bivariate t-distribution. For this distribution, the two variates are dependent even when their correlation is 0; see Section 7.6.

If E(Y|X) = 0, then Y and X are uncorrelated, since
\[
E(Y) = E\{E(Y|X)\} = 0 \tag{A.32}
\]
by the law of iterated expectations, and then

\[
\operatorname{Cov}(Y, X) = E(YX) = E\{E(YX|X)\} = E\{X\,E(Y|X)\} = 0 \tag{A.33}
\]
by (A.27), a second application of the law of iterated expectations, (A.23) with Z = X, and (A.32).

Result (A.22) has an important interpretation. If X is known and one needs to predict Y, then E(Y|X) is the best predictor in that it minimizes the expected squared prediction error. If the best predictor is used, then the prediction error is Y − E(Y|X) and E{Y − E(Y|X)}² is the expected squared prediction error. From the law of iterated expectations, the latter is
\[
E\{Y - E(Y|X)\}^2 = E\Bigl[ E\bigl[\{Y - E(Y|X)\}^2 \,\big|\, X\bigr] \Bigr] = E\{\operatorname{Var}(Y|X)\}, \tag{A.34}
\]
the first summand on the right-hand side of (A.22). Also, Var{E(Y|X)}, the second summand there, is the variability of the best predictor and a measure of how well E(Y|X) can track Y—the more E(Y|X) can vary, the better it can track Y. Therefore, the sum of the tracking ability and the expected squared prediction error is the constant Var(Y)—increasing the tracking ability decreases the expected squared prediction error.

Some insight can be gained by looking at the worst and best cases. The worst case is when X is independent of Y. Then E(Y|X) = E(Y), the tracking ability is Var{E(Y|X)} = 0, and the expected squared prediction error takes on its maximum value, Var(Y). The best case is when Y is a function of X, say Y = g(X) for some g. Then E(Y|X) = g(X) = Y, the prediction error is 0, and the tracking ability is Var(Y), its maximum possible value.
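The uniform/X² example of uncorrelated but dependent random variables is easy to check numerically; a minimal R sketch:

```r
set.seed(3)
x <- runif(100000, min = -1, max = 1)
y <- x^2
cor(x, y)          # essentially 0: X and Y are uncorrelated
cor(abs(x), y)     # high (about 0.97): the dependence shows up once the sign of X is removed
```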

A.13.1 Normal Distributions: Conditional Expectations and Variance

The calculation of conditional expectations and variances can be difficult for some probability distributions, but it is quite easy for a pair (Y1, Y2) that has a bivariate normal distribution. For a bivariate normal pair, the conditional expectation of Y2 given Y1 equals the best linear predictor of Y2 given Y1:⁴
\[
E(Y_2|Y_1 = y_1) = E(Y_2) + \frac{\sigma_{Y_1,Y_2}}{\sigma^2_{Y_1}}\, \{y_1 - E(Y_1)\}.
\]
Therefore, for normal random variables, best linear prediction is the same as best prediction. Also, the conditional variance of Y2 given Y1 is the expected squared prediction error:
\[
\operatorname{Var}(Y_2|Y_1 = y_1) = \sigma^2_{Y_2}\bigl(1 - \rho^2_{Y_1,Y_2}\bigr). \tag{A.35}
\]

⁴ See Section 14.10.

In general, Var(Y2|Y1 = y1) is a function of y1 but we see in (A.35) that for the special case of a bivariate normal distribution, Var(Y2|Y1 = y1) is constant, that is, independent of y1.

A.14 Multivariate Distributions

Multivariate distributions generalize the bivariate distributions of Section A.12. A random vector is a vector whose elements are random variables. A random vector of continuously distributed random variables, Y = (Y1, . . . , Yd), has a multivariate probability density function f_{Y1,...,Yd}(y1, . . . , yd) if
\[
P\{(Y_1, \ldots, Y_d) \in A\} = \int\!\!\cdots\!\!\int_A f_{Y_1,\ldots,Y_d}(y_1, \ldots, y_d)\, dy_1 \cdots dy_d
\]
for all sets A ⊂ ℝ^d.

The PDF of Yj is obtained by integrating the other variates out of fY1,...,Yd :

\[
f_{Y_j}(y_j) = \int \cdots \int f_{Y_1,\ldots,Y_d}(y_1, \ldots, y_d)\, dy_1 \cdots dy_{j-1}\, dy_{j+1} \cdots dy_d.
\]

Similarly, the PDF of any subset of (Y1, . . . , Yd) is obtained by integrating the other variables out of f_{Y1,...,Yd}(y1, . . . , yd). The expectation of a function g of Y1, . . . , Yd is given by the formula
\[
E\{g(Y_1, \ldots, Y_d)\} = \int \cdots \int g(y_1, \ldots, y_d)\, f_{Y_1,\ldots,Y_d}(y_1, \ldots, y_d)\, dy_1 \cdots dy_d. \tag{A.36}
\]

If Y1, . . . , Yd are discrete, then their joint probability distribution specifies P{Y1 = y1, . . . , Yd = yd} for all values of y1, . . . , yd. If Y1, . . . , Yd are discrete and independent, then

P {Y1 = y1,...,Yd = yd} = P {Y1 = y1}··· P {Yd = yd}. (A.37)

The joint CDF of Y1,...,Yd, whether they are continuous or discrete, is

\[
F_{Y_1,\ldots,Y_d}(y_1, \ldots, y_d) = P(Y_1 \le y_1, \ldots, Y_d \le y_d).
\]

Suppose there is a sample of size n of d-dimensional random vectors, {Y_i = (Y_{i,1}, . . . , Y_{i,d}) : i = 1, . . . , n}. Then the empirical CDF is
\[
F_n(y_1, \ldots, y_d) = \frac{\sum_{i=1}^{n} I\{Y_{i,j} \le y_j \text{ for } j = 1, \ldots, d\}}{n}. \tag{A.38}
\]

A.14.1 Conditional Densities

The conditional density of Y1,...,Yq given Yq+1 ...,Yd, where 1 ≤ q < d, is

\[
f_{Y_1,\ldots,Y_q \mid Y_{q+1},\ldots,Y_d}(y_1, \ldots, y_q \mid y_{q+1}, \ldots, y_d) = \frac{f_{Y_1,\ldots,Y_d}(y_1, \ldots, y_d)}{f_{Y_{q+1},\ldots,Y_d}(y_{q+1}, \ldots, y_d)}. \tag{A.39}
\]

Since Y1,...,Yd can be arranged in any order that is convenient, (A.39) provides a formula for the conditional density of any subset of the variables, given the other variables. Also, (A.39) can be rearranged to give the multiplicative formula

\[
f_{Y_1,\ldots,Y_d}(y_1, \ldots, y_d) = f_{Y_1,\ldots,Y_q \mid Y_{q+1},\ldots,Y_d}(y_1, \ldots, y_q \mid y_{q+1}, \ldots, y_d)\, f_{Y_{q+1},\ldots,Y_d}(y_{q+1}, \ldots, y_d). \tag{A.40}
\]
Repeated use of (A.40) gives a formula that will be useful later for calculating likelihoods for dependent data:
\[
f_{Y_1,\ldots,Y_d}(y_1, \ldots, y_d) = f_{Y_1}(y_1)\, f_{Y_2|Y_1}(y_2|y_1)\, f_{Y_3|Y_1,Y_2}(y_3|y_1, y_2) \cdots f_{Y_d|Y_1,\ldots,Y_{d-1}}(y_d|y_1, \ldots, y_{d-1}). \tag{A.41}
\]
If Y1, . . . , Yd are independent, then
\[
f_{Y_1,\ldots,Y_d}(y_1, \ldots, y_d) = f_{Y_1}(y_1) \cdots f_{Y_d}(y_d). \tag{A.42}
\]

A.15 Stochastic Processes

A discrete-time stochastic process is a sequence of random variables {Y1,Y2,Y3,...}. The distribution of Yn is called its marginal distribution. The process is said to be Markov, or Markovian, if the conditional distribution of Yn+1 given {Y1,Y2,...,Yn} equals the conditional distribution of Yn+1 given Yn, so Yn+1 depends only on the previous value of the process. The AR(1) process in Section 9.4 is a simple example of a Markov process. A process generated by computer simulation will be Markov if only Yn and random numbers independent of {Y1,Y2,...,Yn−1} are used to generate Yn+1. An important example is Markov chain Monte Carlo, the topic of Section 20.7. A distribution π is a stationary distribution for a Markov process if, for all n, Yn+1 has distribution π whenever Yn has distribution π. Stochastic processes can also have a continuous-time parameter. Examples are Brownian motion and geometric Brownian motion, which are used, inter alia, to model the log-prices and prices of equities, respectively, in continuous time.

A.16 Estimation

A.16.1 Introduction

One of the major areas of statistics is the estimation of unknown parameters, such as a population mean, from data. An estimator is defined as any function of the observed data. The key question is which of many possible estimators should be used. If θ is an unknown parameter and $\hat{\theta}$ is an estimator, then $E(\hat{\theta}) - \theta$ is called the bias and $E\{(\hat{\theta} - \theta)^2\}$ is called the mean-squared error (MSE). One seeks estimators that are efficient, that is, having the smallest possible value of the MSE (or of some other measure of inaccuracy). It can be shown by simple algebra that the MSE is the squared bias plus the variance, that is,
\[
E\{(\hat{\theta} - \theta)^2\} = \{E(\hat{\theta}) - \theta\}^2 + \operatorname{Var}(\hat{\theta}), \tag{A.43}
\]
so an efficient estimator will have both a small bias and a small variance. An estimator with a zero bias is called unbiased. However, it is not necessary to use an unbiased estimator—we only want the bias to be small, not necessarily exactly zero. One should be willing to accept a small bias if this leads to a significant reduction in variance.

The most popular methods of estimation are least squares (Section 12.2.1), maximum likelihood (Sections 5.9 and 5.14), and Bayes estimation (Chapter 20).

A.16.2 Standard Errors

When an estimator is calculated from a random sample, it is a random variable, but this fact is often not appreciated by beginning students. When first exposed to statistical estimation, students tend not to think of estimators such as a sample mean as random. If we have only a single sample, then the sample mean does not appear random. However, if we realize that the observed sample is only one of many possible samples that could have been drawn, and that each sample has a different sample mean, then we see that the sample mean is in fact random.

Since an estimator is a random variable, it has an expectation and a standard deviation. We have already seen that the difference between its expectation and the parameter is called the bias. The standard deviation of an estimator is called its standard error. If there are unknown parameters in the formula for this standard deviation, then they can be replaced by estimates. If $\hat{\theta}$ is an estimator of θ, then $s_{\hat{\theta}}$ will denote its standard error with any unknown parameters replaced by estimates.

Example A.1. The standard error of the mean

Suppose that Y1, . . . , Yn are i.i.d. with mean µ and variance σ². Then it follows from (7.13) that the standard deviation of the sample mean $\bar{Y}$ is σ/√n. Thus, σ/√n, or when σ is unknown $s_Y/\sqrt{n}$, is called the standard error of the sample mean. That is, the standard error of the sample mean is σ/√n or $s_Y/\sqrt{n}$ depending on whether or not σ is known. □

A.17 Confidence Intervals

Instead of estimating an unknown parameter by a single number, it is often better to provide a range of numbers that gives a sense of the uncertainty of the estimate. Such ranges are called interval estimates. One type of interval estimate, the Bayesian credible interval, is introduced in Chapter 20. Another type of interval estimate is the confidence interval. A confidence interval is defined by the requirement that the probability that the interval will include the true parameter is a specified value called the confidence coefficient, so, for example, if a large number of independent 90% confidence intervals are constructed, then approximately 90% of them will contain the parameter.

A.17.1 Confidence Interval for the Mean

If $\bar{Y}$ is the mean of a sample from a normal population, then

\[
\bar{Y} \pm t_{\alpha/2,\,n-1}\, s_{\bar{Y}} \tag{A.44}
\]
is a confidence interval with (1 − α) confidence. This confidence interval is derived in Section 6.3.2. If α = 0.05 (0.95 or 95% confidence) and if n is reasonably large, then $t_{\alpha/2,n-1}$ is approximately 2, so $\bar{Y} \pm 2 s_{\bar{Y}}$ is often used as an approximate 95% confidence interval. Since $s_{\bar{Y}} = s_Y/\sqrt{n}$, the confidence interval can also be written as $\bar{Y} \pm 2 s_Y/\sqrt{n}$. When n is reasonably large, say 20 or more, $\bar{Y}$ will be approximately normally distributed by the central limit theorem, and the assumption that the population itself is normal can be dropped.

Example A.2. Confidence interval for a normal mean

Suppose we have a sample of size 25 from a normal distribution, $s_Y^2 = 2.7$, $\bar{Y} = 16.1$, and we want a 99% confidence interval for µ. We need $t_{0.005,24}$. This quantile can be found, for example, using the R function qt, and $t_{0.005,24} = 2.797$. Then the 99% confidence interval for µ is
\[
16.1 \pm \frac{(2.797)\sqrt{2.7}}{\sqrt{25}} = 16.1 \pm 0.919 = [15.18, 17.02].
\]
Since n = 25 is reasonably large, this interval should have approximately 99% confidence even if the population is not normally distributed. The exception would be if the population were extremely heavily skewed or had very heavy tails; in such cases a sample size larger than 25 might be necessary for this confidence interval to have near 99% coverage.

Just how large a sample is needed for $\bar{Y}$ to be nearly normally distributed depends on the population. If the population is symmetric and the tails are not extremely heavy, then approximate normality is often achieved with n around 10. For skewed populations, 30 observations may be needed, and even more in extreme cases. If the data appear to come from a highly skewed or heavy-tailed population, it might be better to assume a parametric model and compute the MLE as discussed in Chapter 5, and perhaps to use the bootstrap (Chapter 6) for finding the confidence interval. □
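The computations in Example A.2 can be reproduced in R as follows (a sketch using the numbers given in the example):

```r
n <- 25; ybar <- 16.1; s2 <- 2.7
alpha <- 0.01
tq <- qt(1 - alpha/2, df = n - 1)        # t_{0.005,24} = 2.797
halfwidth <- tq * sqrt(s2) / sqrt(n)     # 0.919
c(ybar - halfwidth, ybar + halfwidth)    # [15.18, 17.02]
```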

A.17.2 Confidence Intervals for the Variance and Standard Deviation

A (1 − α) confidence interval for the variance of a normal distribution is given by
\[
\left[ \frac{(n-1)\, s_Y^2}{\chi^2_{\alpha/2,\, n-1}},\; \frac{(n-1)\, s_Y^2}{\chi^2_{1-\alpha/2,\, n-1}} \right],
\]
where n is the sample size, $s_Y^2$ is the sample variance given by equation (A.7), and, as defined in Section A.10.1, $\chi^2_{\gamma,n-1}$ is the (1 − γ)-quantile of the chi-squared distribution with n − 1 degrees of freedom.

Example A.3. Confidence interval for a normal standard deviation

Suppose we have a sample of size 25 from a normal distribution, $s_Y^2 = 2.7$, and we want a 90% confidence interval for σ². The quantiles we need for constructing the interval are $\chi^2_{0.95,24} = 13.848$ and $\chi^2_{0.05,24} = 36.415$. These values can be found using software such as qchisq in R. The 90% confidence interval for σ² is
\[
\left[ \frac{(2.7)(24)}{36.415},\; \frac{(2.7)(24)}{13.848} \right] = [1.78, 4.68].
\]

Taking square roots of both endpoints, we get 1.33 < σ < 2.16 as a 90% confidence interval for the standard deviation. □
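Likewise, the numbers in Example A.3 can be reproduced in R (a sketch using the values given in the example):

```r
n <- 25; s2 <- 2.7
alpha <- 0.10
lower <- (n - 1) * s2 / qchisq(1 - alpha/2, df = n - 1)   # 64.8 / 36.415 = 1.78
upper <- (n - 1) * s2 / qchisq(alpha/2, df = n - 1)       # 64.8 / 13.848 = 4.68
c(lower, upper)          # 90% confidence interval for the variance
sqrt(c(lower, upper))    # 90% confidence interval for the standard deviation
```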

Unfortunately, the assumption that the population is normally distributed cannot be dispensed with, even if the sample size is large. If a normal probability plot or test of normality (see Section 4.4) suggests that the population might be nonnormally distributed, then one might instead construct a confidence interval for σ using the bootstrap; see Chapter 6. Another possibility is to assume a nonnormal parametric model such as the t-model if the data are symmetric and heavy-tailed; see Example 5.4.

A.17.3 Confidence Intervals Based on Standard Errors

Many estimators are approximately unbiased and approximately normally distributed. Then an approximate 95% confidence interval is the estimator plus or minus twice its standard error; that is,
\[
\hat{\theta} \pm 2\, s_{\hat{\theta}}
\]
is an approximate 95% confidence interval for θ.

A.18 Hypothesis Testing

A.18.1 Hypotheses, Types of Errors, and Rejection Regions

Statistical hypothesis testing uses data to decide whether a certain statement called the null hypothesis is true. The negation of the null hypothesis is called the alternative hypothesis. For example, suppose that Y1, . . . , Yn are i.i.d. N(µ, 1) and µ is unknown. The null hypothesis could be that µ is 1. Then we write H0: µ = 1 and H1: µ ≠ 1 to denote the null and alternative hypotheses.

There are two types of errors that we hope to avoid. If the null hypothesis is true but we reject it, then we are making a type I error. Conversely, if the null hypothesis is false and we accept it, then we are making a type II error. The rejection region is the set of possible samples that lead us to reject H0. For example, suppose that µ0 is a hypothesized value of µ and the null hypothesis is H0: µ = µ0 and the alternative is H1: µ ≠ µ0. One rejects H0 if $|\bar{Y} - \mu_0|$ exceeds an appropriately chosen cutoff value c called a critical value. The rejection region is chosen to keep the probability of a type I error below a prespecified small value called the level and often denoted by α. Typical values of α used in practice are 0.01, 0.05, or 0.1. As α is made smaller, the rejection region must be made smaller. In the example, since we reject the null hypothesis when $|\bar{Y} - \mu_0|$ exceeds c, the critical value c gets larger as α gets smaller.

The value of c is easy to determine. Assuming that σ is known, c is $z_{\alpha/2}\, \sigma/\sqrt{n}$, where, as defined in Section A.9.3, $z_{\alpha/2}$ is the α/2-upper quantile of the standard normal distribution. If σ is unknown, then σ is replaced by the sample standard deviation $s_Y$ and $z_{\alpha/2}$ is replaced by $t_{\alpha/2,n-1}$, where, as defined in Section 5.5.2, $t_{\alpha/2,n-1}$ is the α/2-upper quantile of the t-distribution with n − 1 degrees of freedom. The test using the t-quantile is called the one-sample t-test.

A.18.2 p-Values

Rather than specifying α and deciding whether to accept or reject the null hypothesis at that α, we might ask “for what values of α do we reject the null hypothesis?” The p-value for a sample is defined as the smallest value of α for which the null hypothesis is rejected. Stated differently, to perform the test using a given sample, we first find the p-value of that sample, and then H0 is rejected if we decide to use α larger than the p-value and H0 is accepted if we use α smaller than the p-value. Thus,
• a small p-value is evidence against the null hypothesis, while
• a large p-value shows that the data are consistent with the null hypothesis.

Example A.4. Interpreting p-values

If the p-value of a sample is 0.033, then we reject H0 if we use α equal to 0.05 or 0.1, but we accept H0 if we use α = 0.01. □

The p-value not only tells us whether the null hypothesis should be accepted or rejected, but it also tells us whether or not the decision to accept or reject H0 is a close call. For example, if we are using α = 0.05 and the p-value were 0.047, then we would reject H0 but we would know the decision was close. If instead the p-value were 0.001, then we would know the decision was not so close. When performing hypothesis tests, statistical software routinely calculates p-values. Doing this is much more convenient than asking the user to specify α, and then reporting whether the null hypothesis is accepted or rejected for that α.

A.18.3 Two-Sample t-Tests

Two-sample t-tests are used to test hypotheses about the difference between two population means. The independent-samples t-test is used when we sample independently from the two populations. Let µi, $\bar{Y}_i$, si, and ni be the population mean, sample mean, sample standard deviation, and sample size for the ith sample, i = 1, 2, respectively. Let ∆0 be a hypothesized value of µ1 − µ2. We assume that the two populations have the same standard deviation and estimate this parameter by the pooled standard deviation, which is
\[
s_{\text{pool}} = \left\{ \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} \right\}^{1/2}. \tag{A.45}
\]
The independent-samples t-statistic is

\[
t = \frac{\bar{Y}_1 - \bar{Y}_2 - \Delta_0}{s_{\text{pool}} \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}.
\]

If the hypotheses are H0: µ1 − µ2 = ∆0 and H1: µ1 − µ2 ≠ ∆0, then H0 is rejected if $|t| > t_{\alpha/2,\, n_1+n_2-2}$. If the hypotheses are H0: µ1 − µ2 ≤ ∆0 and H1: µ1 − µ2 > ∆0, then H0 is rejected if $t > t_{\alpha,\, n_1+n_2-2}$, and if they are H0: µ1 − µ2 ≥ ∆0 and H1:

µ1 − µ2 < ∆0, then H0 is rejected if $t < -t_{\alpha,\, n_1+n_2-2}$.

Sometimes the samples are paired rather than independent. For example, suppose we wish to compare returns on small-cap versus large-cap⁵ stocks, and for each of n years we have the returns on a portfolio of small-cap stocks and on a portfolio of large-cap stocks. For any year, the returns on the two portfolios will be correlated, so an independent-samples test is not valid. Let di = Yi,1 − Yi,2 be the difference between the observations from populations 1 and 2 for the ith pair, and let $\bar{d}$ and sd be the sample mean and standard deviation of d1, . . . , dn. The paired-sample t-statistic is
\[
t = \frac{\bar{d} - \Delta_0}{s_d/\sqrt{n}}. \tag{A.46}
\]
The rejection regions are the same as for the independent-samples t-tests except that the degrees-of-freedom parameter for the t-quantiles is n − 1 rather than n1 + n2 − 2.

The power of a test is the probability of correctly rejecting H0 when H1 is true. Paired samples are often used to obtain more power. In the example of comparing small- and large-cap stocks, the returns on both portfolios will have high year-to-year variation, but the di will be free of this variation, so that sd should be relatively small compared to s1 and s2. A small variation in the data means that µ1 − µ2 can be more accurately estimated and deviations of this parameter from ∆0 are more likely to be detected.

Since $\bar{d} = \bar{Y}_1 - \bar{Y}_2$, the paired-sample statistic (A.46) and the independent-samples statistic have equal numerators. What differs are the denominators. The denominator in (A.46) will be smaller than that of the independent-samples statistic when the correlation between the observations (Yi,1, Yi,2) in a pair is positive. It is the smallness of the denominator in (A.46) that gives the paired t-test increased power.

Suppose someone had a paired sample but incorrectly used the independent-samples t-test. If the correlation between Yi,1 and Yi,2 is zero, then the paired samples behave the same as independent samples and the effect of using the incorrect test would be small. Suppose that this correlation is positive. The result of using the incorrect test would be that if H0 is false, then the true p-value would be overestimated and one would be less likely to reject H0 than if the paired-sample test had been used. However, if the p-value is small, then one can be confident in rejecting H0 because the p-value for the paired-sample test would be even smaller.⁶

Unfortunately, statistical methods are often used by researchers without a solid understanding of the underlying theory, and this can lead to misapplications. The hypothetical use just described of an incorrect test is often a reality, and it is sometimes necessary to evaluate whether the results that are reported can be trusted.

⁵ The market capitalization of a stock is the product of the share price and the number of shares outstanding. If stocks are ranked based on market capitalization, then all stocks below some specified quantile would be small-cap stocks and all above another specified quantile would be large-cap.

⁶ An exception would be the rare situation where Yi,1 and Yi,2 are negatively correlated.
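For completeness, here is how the paired and independent-samples tests could be run in R with the t.test function; the data below are simulated and purely illustrative:

```r
set.seed(8)
n <- 20
year_effect <- rnorm(n, sd = 2)                  # common year-to-year variation
y1 <- 1.0 + year_effect + rnorm(n, sd = 0.5)     # e.g., small-cap portfolio returns
y2 <- 0.5 + year_effect + rnorm(n, sd = 0.5)     # e.g., large-cap portfolio returns
t.test(y1, y2, paired = TRUE)$p.value            # paired t-test: works with the differences
t.test(y1, y2, var.equal = TRUE)$p.value         # independent-samples t-test: larger p-value here
```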

A.18.4 Statistical Versus Practical Significance

When we reject a null hypothesis, we often say there is a statistically significant effect. In this context, the word “significant” is easily misconstrued. It does not mean that there is an effect of practical importance. For example, suppose we were testing the null hypothesis that the means of two populations are equal versus the alternative that they are unequal. Statistical significance simply means that the two sample means are sufficiently different that this difference cannot reasonably be attributed to mere chance. Statistical significance does not mean that the population means are so dissimilar that their difference is of any practical importance. When large samples are used, small and unimportant effects are likely to be statistically significant.

When determining practical significance, confidence intervals are more useful than tests. In the case of the comparison between two population means, it is important to construct a confidence interval and to conclude that there is an effect of practical significance only if all differences in that interval are large enough to be of practical importance. How large is “large enough” is not a statistical question but rather must be answered by a subject-matter expert. For example, suppose a difference between the two population means that exceeds 0.2 is considered important, at least for the purpose under consideration. If a 95% confidence interval were [0.23, 0.26], then with 95% confidence we could conclude that there is an important difference. If instead the interval were [0.13, 0.16], then we could conclude with 95% confidence that there is no important difference. If the confidence interval were [0.1, 0.3], then we could not state with 95% confidence whether the difference is important or not.

A.19 Prediction

Suppose that Y is a random variable that is unknown at the present time, for example, a future change in an interest rate or stock price. Let X be a known random vector that is useful for predicting Y. For example, if Y is a future change in a stock price or a macroeconomic variable, X might be the vector of recent changes in that stock price or macroeconomic variable. We want to find a function of X, which we will call Ŷ(X), that best predicts Y. By this we mean that the mean-squared error E[{Y − Ŷ(X)}²] is made as small as possible. The function Ŷ(X) that minimizes the mean-squared error will be called the best predictor of Y based on X. Note that Ŷ(X) can be any function of X, not necessarily a linear function as in Section 14.10.1.

The best predictor is theoretically simple—it is the conditional expectation of Y given X. That is, E(Y|X) is the best predictor of Y in the sense of minimizing E[{Y − Ŷ(X)}²] among all possible choices of Ŷ(X) that are arbitrary functions of X. If Y and X are independent, then E(Y|X) = E(Y). If X were unobserved, then E(Y) would be used to predict Y. Thus, when Y and X are independent, the best predictor of Y is the same as if X were unknown, because X contains no information that is useful for prediction of Y.

In practice, using E(Y|X) for prediction is not trivial. The problem is that E(Y|X) may be difficult to estimate, whereas the best linear predictor can be estimated by linear regression as described in Chapter 12. However, the newer technique of nonparametric regression can be used to estimate E(Y|X). Nonparametric regression is discussed in Chapter 21.

A.20 Facts About Vectors and Matrices

The norm of the vector x = (x1, . . . , xp)^T is
\[
\|x\| = \left( \sum_{i=1}^{p} x_i^2 \right)^{1/2}.
\]
A square matrix A is diagonal if A_{i,j} = 0 for all i ≠ j. We use the notation diag(d1, . . . , dp) for a p × p diagonal matrix A such that A_{i,i} = di. A matrix O is orthogonal if O^T = O^{-1}. This implies that the columns of O are mutually orthogonal (perpendicular) and that their norms are all equal to 1.

Any symmetric matrix Σ has an eigenvalue–eigenvector decomposition, eigendecomposition for short, which is

\[
\Sigma = O\, \operatorname{diag}(\lambda_i)\, O^{T}, \tag{A.47}
\]
where O is an orthogonal matrix whose columns are the eigenvectors of Σ and λ1, . . . , λp are the eigenvalues of Σ. Also, if all of λ1, . . . , λp are nonzero, then Σ is nonsingular and
\[
\Sigma^{-1} = O\, \operatorname{diag}(1/\lambda_i)\, O^{T}.
\]

Let o1,..., op be the columns of O. Then, since O is orthogonal,

\[
o_j^{T} o_k = 0 \tag{A.48}
\]
for any j ≠ k. Moreover,
\[
o_j^{T} \Sigma\, o_k = 0 \tag{A.49}
\]
for j ≠ k. To see this, let e_j be the jth unit vector, that is, the vector with a one in the jth coordinate and zeros elsewhere. Then $o_j^T O = e_j^T$ and $O^T o_k = e_k$, so that for j ≠ k,
\[
o_j^{T} \Sigma\, o_k = o_j^{T} \bigl\{ O\, \operatorname{diag}(\lambda_i)\, O^{T} \bigr\} o_k = \lambda_k\, e_j^{T} e_k = 0.
\]
The eigenvalue–eigenvector decomposition of a covariance matrix is used in Section 7.8 to find the orientation of elliptically contoured densities. This decomposition can be important even if the density is not elliptically contoured and is the basis of principal components analysis (PCA).
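In R, the eigendecomposition of a symmetric matrix is computed by the eigen function; a small sketch (the matrix below is an arbitrary symmetric example):

```r
Sigma <- matrix(c(2, 0.8,
                  0.8, 1), nrow = 2)      # a symmetric (covariance-like) matrix
e <- eigen(Sigma)
O <- e$vectors                            # columns are the eigenvectors (an orthogonal matrix)
lambda <- e$values                        # the eigenvalues
O %*% diag(lambda) %*% t(O)               # reconstructs Sigma, as in (A.47)
O %*% diag(1/lambda) %*% t(O)             # equals solve(Sigma), the inverse
crossprod(O)                              # t(O) %*% O = identity: columns are orthonormal
```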

A.21 Roots of Polynomials and Complex Numbers

The roots of polynomials play an important role in the study of ARMA processes. Let
\[
p(x) = b_0 + b_1 x + \cdots + b_p x^{p}, \quad b_p \neq 0,
\]
be a pth-degree polynomial. The fundamental theorem of algebra states that p(x) can be factored as

\[
b_p (x - r_1)(x - r_2) \cdots (x - r_p),
\]
where r1, . . . , rp are the roots of p(x), that is, the solutions to p(x) = 0. The roots need not be distinct and they can be complex numbers. In R, the roots of a polynomial can be found using the function polyroot.

A complex number can be written as a + b ı, where ı = √−1. The absolute value or magnitude of a + b ı is √(a² + b²). The complex plane is the set of all two-dimensional vectors (a, b), where (a, b) represents the complex number a + b ı. The unit circle is the set of all complex numbers with magnitude 1. A complex number is inside or outside the unit circle depending on whether its magnitude is less than or greater than 1.
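A brief illustration of polyroot (the polynomial is an arbitrary example):

```r
# Roots of p(x) = 1 - 0.5 x - 0.5 x^2; coefficients are supplied in increasing order of power.
r <- polyroot(c(1, -0.5, -0.5))
r           # the two roots: 1 and -2
Mod(r)      # their magnitudes; both roots lie on or outside the unit circle here
```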

A.22 Bibliographic Notes

Casella and Berger (2002) covers in greater detail most of the statistical theory in this chapter and elsewhere in the book. Wasserman (2004) is a modern introduction to statistical theory and is also recommended for further study. Alexander (2001) is a recent introduction to financial econometrics and has a chapter on covariance matrices; her technical appendices cover maximum likelihood estimation, confidence intervals, and hypothesis testing, including likelihood ratio tests. Evans, Hastings, and Peacock (1993) provides a concise reference for the basic facts about commonly used distributions in statistics. Johnson, Kotz, and Kemp (1993) discusses most of the common discrete distributions, including the binomial. Johnson, Kotz, and Balakrishnan (1994, 1995) contain a wealth of information and extensive references about the normal, lognormal, chi-square, exponential, uniform, t, F, Pareto, and many other continuous distributions. Together, these works by Johnson, Kotz, Kemp, and Balakrishnan are essentially an encyclopedia of statistical distributions.

A.23 References

Alexander, C. (2001) Market Models: A Guide to Financial Data Analysis, Wiley, Chichester.
Casella, G. and Berger, R. L. (2002) Statistical Inference, 2nd ed., Duxbury/Thomson Learning, Pacific Grove, CA.
Evans, M., Hastings, N., and Peacock, B. (1993) Statistical Distributions, 2nd ed., Wiley, New York.
Gourieroux, C., and Jasiak, J. (2001) Financial Econometrics, Princeton University Press, Princeton, NJ.
Johnson, N. L., Kotz, S., and Balakrishnan, N. (1994) Continuous Univariate Distributions, Vol. 1, 2nd ed., Wiley, New York.
Johnson, N. L., Kotz, S., and Balakrishnan, N. (1995) Continuous Univariate Distributions, Vol. 2, 2nd ed., Wiley, New York.
Johnson, N. L., Kotz, S., and Kemp, A. W. (1993) Discrete Univariate Distributions, 2nd ed., Wiley, New York.
Wasserman, L. (2004) All of Statistics, Springer, New York.
