
Course Notes for Math 162 (Mathematical Statistics): Approximation Methods in Statistics

Adam Merberg and Steven J. Miller August 18, 2006

Abstract. We introduce some of the approximation methods commonly used in mathematical statistics. We first consider Taylor expansions. We then look at Stirling's Formula, which provides an approximation for n!. As an application we show how to apply these techniques to estimate the solution to the Birthday Problem. In the appendices we review the Intermediate and Mean Value Theorems, the Gamma function, and proofs by induction.

Contents

1 Introduction
2 Taylor Series
3 Stirling's Formula
4 The Birthday Problem
A The Intermediate and Mean Value Theorems (IVT, MVT)
B Factorials and the Gamma Function
C Proofs by Induction

1 Introduction

Often when studying statistics, we find that we must work with expressions which are too unwieldy to consider exactly. For example, we often need to look at factorials, which are often contributed by binomial coefficients. Sometimes we have the good fortune of being able to cancel one with another. Often, however, we will have no way to simplify exactly an expression involving a factorial. In order to draw meaningful conclusions, we must therefore rely on approximations, which will typically be valid for large sample sizes. Stirling's Formula provides an approximation for n!. Before introducing Stirling's Formula, we will look at Taylor series, an approximation method used in justifying Stirling's Formula and elsewhere.

2 Taylor Series

A Taylor series is a series expansion that allows us to approximate a function that has certain properties. The theoretical basis for Taylor series is given by the following theorem. The theorem and its proof are as given in [Rud]; by f^{(i)}(t) we mean the ith derivative of f(t). A key ingredient in our proof is the Mean Value Theorem, which is proved in Appendix A.

Theorem 2.1 (Taylor's Theorem). Let f be a real-valued function on [a, b] and let n be a positive integer. Suppose that the following conditions hold:

1. f^{(n−1)}(t) is continuous on [a, b];
2. f^{(n)}(t) exists for every t ∈ (a, b).

Let α and β be distinct points of [a, b], and define

P_{n−1}(t) = Σ_{i=0}^{n−1} [f^{(i)}(α)/i!] (t − α)^i. (2.1)

Then for some x between α and β,

f(β) = P_{n−1}(β) + [f^{(n)}(x)/n!] (β − α)^n. (2.2)

Given appropriate bounds on the higher derivatives, we can approximate a function as a polynomial, which is often much simpler than considering the function exactly. The function P_{n−1}(t) is called the Taylor series of order (or degree) n − 1. It is the best approximation to f(t) among polynomials of degree n − 1, and the error between our function f(t) and our Taylor series approximation P_{n−1}(t) is bounded by the nth derivative. Not surprisingly, the larger n is, the better the approximation. This is because we are using more and more information. If n − 1 = 0 then we are approximating f(β) with f(α). This means that, if all we know at time α is the value f(α), our best guess is that the function is always this value (for we do not know if it is increasing or decreasing). If now we know f′(α), we know how rapidly the function f is changing at α (if we think of f as representing distance, then f′(α) is the speed at α). Our best guess is now that the speed is constant, always equal to f′(α). If this were true, then the value of f at β would be f(α) + f′(α)(β − α) (where we start plus the distance traveled, which under the assumption of constant speed is just speed times elapsed time). This is the first order Taylor series approximation. The next piece of information, f″(α), tells us how fast the speed is changing at time α. This allows us to weaken our assumption of constant speed, and obtain a better approximation.

Proof. For a < t < b, we define

g(t) = f(t) − P_{n−1}(t) − M(t − α)^n, (2.3)

where M is chosen so that g(β) = 0. It is easily seen from (2.1) that the values of P_{n−1}^{(i)}(α) coincide with the values of f^{(i)}(α) for any choice of i with 0 ≤ i ≤ n − 1. Then for any such i, we have

g^{(i)}(α) = f^{(i)}(α) − P_{n−1}^{(i)}(α) − M [n!/(n − i)!] (α − α)^{n−i} = 0. (2.4)

From the Mean Value Theorem, since g(β) = g(α) = 0, we must have g′(x₁) = 0 for some x₁ between α and β. We can now apply the Mean Value Theorem to g′(t). We have g′(x₁) = g′(α) = 0, so g″(x₂) = 0 for some x₂ between x₁ and α. We can continue this process for a total of n steps to see that g^{(n)}(x_n) = 0 for some x_n between α and x_{n−1}. We let x = x_n and note that x is also between α and β. Differentiating (2.3) n times, we have

g^{(n)}(t) = f^{(n)}(t) − P_{n−1}^{(n)}(t) − Mn!(t − α)^0. (2.5)

Since Pn−1(t) is a polynomial of degree n − 1, this becomes

g^{(n)}(t) = f^{(n)}(t) − Mn!. (2.6)

Since g^{(n)}(x) = 0,

f^{(n)}(x) = Mn!. (2.7)

We then have

P_{n−1}(β) + [f^{(n)}(x)/n!] (β − α)^n = P_{n−1}(β) + [Mn!/n!] (β − α)^n = f(β) − g(β) = f(β). (2.8)

The x we have chosen therefore satisfies the statement of the theorem.
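To make the theorem concrete, the short Python sketch below (the helper name is ours, not from the notes) builds P_{n−1}(β) for f(x) = e^x about α = 0, where f^{(i)}(0) = 1 for every i, and checks that the actual error is within the bound coming from (2.2).

```python
import math

def taylor_poly_exp(beta, n):
    """Evaluate P_{n-1}(beta) for f(x) = e^x about alpha = 0.

    Every derivative of e^x at 0 equals 1, so the Taylor polynomial
    of order n - 1 is the sum of beta**i / i! for i = 0, ..., n - 1.
    """
    return sum(beta**i / math.factorial(i) for i in range(n))

beta = 0.5
for n in (2, 4, 8):
    error = abs(math.exp(beta) - taylor_poly_exp(beta, n))
    # (2.2) bounds the error by max |f^{(n)}(x)| * beta^n / n!;
    # here f^{(n)}(x) = e^x <= e^beta on [0, beta]
    bound = math.exp(beta) * beta**n / math.factorial(n)
    assert error <= bound
```

As n grows the error shrinks like β^n/n!, illustrating the remark that more derivative information yields a better approximation.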

Taylor's Theorem allows us to approximate f(β) as P_{n−1}(β) and gives us a way to bound the error by finding the maximum value of [f^{(n)}(x)/n!] (β − α)^n for x on the interval [a, b]. If we have a good upper bound on the derivatives of f, we can get a good polynomial approximation for f(β).

Example 2.2. Let us consider

f(x) = log(1 − x). (2.9)

We will compute the Taylor series about x = 0. By direct computation, we have f(0) = 0. We now compute the first derivative:

f′(x) = −1/(1 − x). (2.10)

Hence, f′(0) = −1. We will now show that

f^{(n)}(x) = −(n − 1)!/(1 − x)^n (2.11)

for n ≥ 1. We have just shown the case n = 1, and we will proceed by induction (see Appendix C). Assuming the statement for n − 1, we compute f^{(n)}(x) as the derivative of f^{(n−1)}(x). We have

f^{(n)}(x) = d/dx [−(n − 2)!/(1 − x)^{n−1}] = −(n − 2)! · [−(n − 1)/(1 − x)^n] · (−1) = −(n − 1)!/(1 − x)^n. (2.12)

The statement (2.11) now follows by induction for n ≥ 1. From this, we see that f^{(n)}(0) = −(n − 1)!. The Taylor series for f(x) about 0 is therefore given by

f(x) = −[(0!/1!) x + (1!/2!) x^2 + (2!/3!) x^3 + ···] = −(x + x^2/2 + x^3/3 + ···) = −Σ_{n=1}^∞ x^n/n. (2.13)

Remark 2.3. Note the expansion for log(1 − x) above converges for 0 ≤ x < 1. Further, one can see that this expansion is negative for x > 0. This makes sense, as for x ∈ (0, 1), 1 − x ∈ (0, 1). When we calculate log(1 − x) we are taking the logarithm of a positive number less than 1; this has to be negative (the logarithm of 1 is zero, logarithms of numbers greater than 1 are positive and those of numbers less than 1 are negative). It is very important to be able to quickly spot-check a formula to look for simple omissions. For complicated formulas, try to concentrate on the main parts or features, and see if they make sense.

Remark 2.4. One can also derive the Taylor series by integration. From the geometric series formula, we know

−1/(1 − y) = −Σ_{n=0}^∞ y^n (2.14)

for 0 ≤ y < 1. If we integrate both sides from 0 to x < 1, provided we can interchange the integration and summation, we have

∫_0^x −1/(1 − y) dy = −Σ_{n=0}^∞ ∫_0^x y^n dy
ln(1 − y)|_0^x = −Σ_{n=0}^∞ x^{n+1}/(n + 1)
ln(1 − x) = −Σ_{n=1}^∞ x^n/n; (2.15)

the interchange of integration and summation can be justified by Fubini's Theorem.

Example 2.5. One of the most important Taylor series is that of f(x) = e^x about x = 0. As f^{(n)}(x) = e^x for all n, we have

e^x = Σ_{n=0}^∞ f^{(n)}(0) x^n/n! = Σ_{n=0}^∞ x^n/n!. (2.16)

Exercise 2.6. Calculate the Taylor series of cos θ and sin θ around θ = 0. Consider what we get by formally plugging x = iθ into the Taylor series expansion of e^x: Σ_{n=0}^∞ (iθ)^n/n!. Often the Taylor series expansion is used as the definition of e^x; because of the rapid growth of n! (see §3), this infinite sum converges for all complex valued x. Thus we may define e^x by Σ_{n=0}^∞ x^n/n! for all x, real or complex. Show that

e^{iθ} = cos θ + i sin θ. (2.17)

One can use this to derive all trigonometric formulas¹. For example, sin² θ + cos² θ = 1 follows from e^{iθ} · e^{−iθ} = 1. Taking real and imaginary parts of e^{iθ} · e^{iφ} = e^{i(θ+φ)} gives the angle addition formulas for sine and cosine. To see this we substitute the Taylor series expansions, and find

(cos θ + i sin θ) · (cos φ + i sin φ) = cos(θ + φ) + i sin(θ + φ); (2.18)

complete the proof by multiplying out the two factors on the left hand side, and comparing the real and imaginary parts.

¹We need one additional, important fact: e^x · e^y = e^{x+y} holds for all x and y, even those that are complex valued. This is not at all clear, as we have defined e^x and e^y as infinite sums! One must do some book-keeping to show that, interpreting all three quantities as infinite sums, we do indeed get e^x · e^y = e^{x+y}. While it is likely that this relation is true (if it wasn't, this would be horrible notation!), we must prove this.
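Returning to Example 2.2, the expansion (2.13) is easy to check numerically; the Python sketch below (the helper name is ours) compares partial sums of −Σ x^n/n with log(1 − x).

```python
import math

def log1m_series(x, terms):
    """Partial sum of the expansion log(1 - x) = -(x + x^2/2 + x^3/3 + ...)."""
    return -sum(x**n / n for n in range(1, terms + 1))

x = 0.3
approx = log1m_series(x, 50)
exact = math.log(1 - x)
assert abs(approx - exact) < 1e-12
# the value is negative for x in (0, 1), as Remark 2.3 predicts
assert approx < 0
```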

Exercise 2.7. Calculate the Taylor series of e^{−t²} about t = 0 in two different ways. First use the standard technique of finding derivatives and evaluating them at zero, and then check your work by substituting −t² in for x in the Taylor series expansion of e^x. This second technique is a very useful way to find the Taylor series of f(g(t)) if we know the Taylor series of f(x).
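The substitution technique of Exercise 2.7 can be verified numerically; in the Python sketch below (helper name ours), substituting x = −t² into partial sums of the series for e^x reproduces e^{−t²}.

```python
import math

def exp_series(x, terms):
    """Partial sum of the expansion e^x = sum_{n >= 0} x^n / n!."""
    return sum(x**n / math.factorial(n) for n in range(terms))

t = 0.7
# substituting x = -t^2 into the series for e^x gives the series for e^{-t^2}
assert abs(exp_series(-t * t, 30) - math.exp(-t * t)) < 1e-12
```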

Example 2.8. Consider

f(x) = (1 − √(1 − 4x))/2. (2.19)

We will compute the Taylor series expansion of f(x) about x = 0. For the constant term, we have f(0) = 0. We now find the first derivative:

f′(x) = (1 − 4x)^{−1/2}. (2.20)

Hence f′(0) = 1. We introduce some useful notation:

(2n)!! = 2n · (2n − 2) · (2n − 4) ··· 4 · 2,  (2n + 1)!! = (2n + 1)(2n − 1)(2n − 3) ··· 3 · 1. (2.21)

Thus the double factorial means we take every other term until we reach 2 (if the argument is even) or 1 (if odd). For n > 1, we will now show that

f^{(n)}(x) = 2^{n−1} · (2n − 3)!! · (1 − 4x)^{−(2n−1)/2}. (2.22)

Note that by the definition of the double factorial, (2n − 3)!! = (2n − 3)(2n − 5) ··· 3 · 1. For n = 2, the claim is f″(x) = 2(1 − 4x)^{−3/2}, which is true by direct computation of the derivative in (2.20). We now assume the statement for n − 1. Then

f^{(n)}(x) = d/dx f^{(n−1)}(x) = d/dx [2^{n−2} · (2n − 5)!! · (1 − 4x)^{−(2n−3)/2}] = 2^{n−2} · (2n − 5)!! · [−(2n − 3)/2] · (1 − 4x)^{−(2n−1)/2} · (−4). (2.23)

Simplification yields (2.22), so the claim follows by induction. Evaluating (2.22) at x = 0, we see that

f^{(n)}(0) = 2^{n−1} · (2n − 3)!!. (2.24)

Therefore, f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + ··· where

a_n = { 0 if n = 0;  1 if n = 1;  2^{n−1} · (2n − 3)!!/n! if n ≥ 2 }. (2.25)

Exercise 2.9. With reference to Example 2.8, show that

lim_{n→∞} a_{n+1}/a_n = 4. (2.26)

Exercise 2.10. Compute the Taylor polynomial of degree 2 of log(1 + x) about x = 0.

Exercise 2.11. Consider the function f(x) given by

f(x) = e^{−1/x²} if x ≠ 0, and f(x) = 0 if x = 0. (2.27)

Show that the Taylor series for f(x) about x = 0 is 0 even though f(x) is not the zero function. Why is this the case? Hint: you must go back to the definition of the derivative to take the derivative at x = 0.
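For Exercise 2.9, one can at least check the limit numerically. The Python sketch below (helper names ours) computes a_n from (2.25) with a small double-factorial helper; the exact ratio works out to a_{n+1}/a_n = 2(2n − 1)/(n + 1), which indeed tends to 4.

```python
import math

def double_factorial(m):
    """m!! = m(m-2)(m-4)... down to 2 or 1 (and 1 for m <= 1)."""
    result = 1
    while m > 1:
        result *= m
        m -= 2
    return result

def a(n):
    """Coefficient of x^n in (1 - sqrt(1 - 4x))/2, as in (2.25)."""
    if n == 0:
        return 0.0
    if n == 1:
        return 1.0
    return 2**(n - 1) * double_factorial(2 * n - 3) / math.factorial(n)

assert a(2) == 1.0 and a(3) == 2.0
# the ratio a_{n+1}/a_n approaches 4 (n = 100 already gives about 3.94)
assert abs(a(101) / a(100) - 4) < 0.1
```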

3 Stirling’s Formula

Stirling's Formula allows us to approximate the factorial in terms of elementary functions.

Theorem 3.1 (Stirling's Theorem). For large n,

n! ≈ n^n e^{−n} √(2πn). (3.28)

Remark 3.2. Before proving Stirling's Formula, we show it is a reasonable approximation. It is often easier to analyze a product by converting it to a sum; this is readily accomplished by taking logarithms; in fact, we shall see this trick again when we study the Birthday Problem in §4. We have

log n! = Σ_{k=1}^n log k ≈ ∫_1^n log t dt = (t log t − t)|_1^n. (3.29)

Thus log n! ≈ n log n − n, or n! ≈ nne−n. For a more precise statement of the theorem and a proof that relies on some more advanced results of analysis, we refer the reader to [Rud]. We include the simpler proof of [Wei] here.
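Even the crude estimate log n! ≈ n log n − n is quite good for moderate n; a quick Python check (ours, not from [Wei]):

```python
import math

n = 1000
log_fact = sum(math.log(k) for k in range(1, n + 1))
approx = n * math.log(n) - n
# the relative error is already below 0.1% at n = 1000
assert abs(log_fact - approx) / log_fact < 0.001
```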

Proof. To prove the theorem, we will use the identity

n! = Γ(n + 1) = ∫_0^∞ e^{−x} x^n dx. (3.30)

A review of the Gamma function, including a proof of this identity, can be found in Appendix B. In order to get an approximation, we would like to find where the integrand is largest. Because of the exponential factor in the integrand, we will take the logarithm before differentiating²:

d/dx log(e^{−x} x^n) = d/dx (−x + n log x) = n/x − 1. (3.31)

The maximum value of the integrand is therefore seen to occur only for x = n. The exponential factor shrinks much more quickly than x^n grows once x is much larger than n, so we assume that x = n + α with |α| much smaller than n. We then have

log x = log(n + α) = log n + log(1 + α/n). (3.32)

We now expand the second term using the Taylor series computed in Example 2.2³ to find

log(n + α) = log n + α/n − (1/2)(α²/n²) + ··· . (3.33)

Therefore

log(x^n e^{−x}) = n log x − x ≈ n [log n + α/n − α²/(2n²)] − (n + α) = n log n − n − α²/(2n). (3.34)

It follows that

x^n e^{−x} ≈ exp(n log n − n − α²/(2n)) = n^n e^{−n} · exp(−α²/(2n)). (3.35)

Then returning to the expression for n! of (3.30), we have

n! = ∫_0^∞ e^{−x} x^n dx ≈ n^n e^{−n} ∫_{−n}^∞ exp(−α²/(2n)) dα ≈ n^n e^{−n} ∫_{−∞}^∞ exp(−α²/(2n)) dα. (3.36)

In the last step, we rely on the fact that the integrand is very small for α < −n. The integral is the same as the one we would obtain in integrating a normal density with mean 0 and variance n. Its value is √(2πn). We thus have

n! ≈ n^n e^{−n} √(2πn), (3.37)

which is the statement of the theorem.
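A quick numerical check of (3.37) in Python (function name ours): the ratio n!/(n^n e^{−n} √(2πn)) tends to 1, with leading correction 1/(12n).

```python
import math

def stirling(n):
    """Stirling's approximation n! ~ n^n e^{-n} sqrt(2 pi n)."""
    return n**n * math.exp(-n) * math.sqrt(2 * math.pi * n)

for n in (5, 10, 20):
    ratio = math.factorial(n) / stirling(n)
    # the ratio tends to 1; the leading correction term is 1/(12n)
    assert abs(ratio - 1) < 1 / (10 * n)
```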

4 The Birthday Problem

There are several variants of the Birthday Problem; we state a few here. Assume each day of the year is equally likely to be someone’s birthday, and no one is ever born on February 29th (just to make our lives easier!). How many people must there be in a room before there is at least a 50% chance that two share a birthday? How many other people must there be before at least one of them shares your birthday? Note the two questions have very different answers, because in the

²Maximizing a positive function f(x) is equivalent to maximizing log f(x). There are many useful versions of this principle: in many problems it is often easier to minimize the square of the distance rather than the distance (as this avoids the square-root function).
³In the example, we computed the Taylor series for log(1 − x), but we can write log(1 + α) = log(1 − (−α)) to apply our result.

first we do not specify beforehand which is the shared day, while in the second we do. How many people must be in the room before at least two share a birthday? One can, of course, solve these questions by brute force. For the standard formulation, we are looking for an n such that

(1 − 0/365)(1 − 1/365) ··· (1 − (n − 1)/365) ≈ 1/2. (4.38)

To solve this with a calculator is possible, especially if we simplify the product by using factorials:

365!/[(365 − n)! · 365^n]. (4.39)

Such large factorials quickly overflow standard calculators (though we would be fine if we used Mathematica or Matlab); things are even worse if we are on a planet such as Mars or Jupiter with a longer year. We discuss how one can very easily approximate the correct answer. The probability that n people all have distinct birthdays is

p(n) = (1 − 0/365) ··· (1 − (n − 1)/365) = Π_{k=0}^{n−1} (1 − k/365). (4.40)

We want to find n so that p(n) = 1/2. A common technique to analyze products is to take logarithms, as logarithms convert products to sums. Thus we want to solve

log(1/2) = Σ_{k=0}^{n−1} log(1 − k/365). (4.41)

We use log(1/2) ≈ −.7, the Taylor series expansion log(1 − x) = −x + O(x²) for x small (see Example 2.2), and Σ_{k=0}^{n−1} k ≈ n²/2 (which can be proved by induction; see Appendix C for a review of proofs by induction) to find

−.7 ≈ −Σ_{k=0}^{n−1} k/365 ≈ −(n²/2)/365. (4.42)

Thus n² ≈ 511 or n ≈ 22.6, which agrees very well with the exact answer. Similar arguments also provide a terrific (and easy!) approximation to the number of people needed before there is a 50% probability that one of them shares your birthday (about 253 are needed).
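The estimate above is easy to compare against the exact product (4.40); a short Python sketch (function name ours; note we use the exact value log 2 ≈ 0.693 rather than the rounded .7, giving n ≈ 22.5):

```python
import math

def p_distinct(n, days=365):
    """Exact probability (4.40) that n people all have distinct birthdays."""
    p = 1.0
    for k in range(n):
        p *= 1 - k / days
    return p

# exact answer: 23 people give a shared birthday with probability > 1/2
assert p_distinct(22) > 0.5 > p_distinct(23)

# the approximation from (4.42): solve n^2 / (2 * 365) = log 2
n_approx = math.sqrt(2 * 365 * math.log(2))
assert 22 < n_approx < 23
```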

A The Intermediate and Mean Value Theorems (IVT, MVT)

These notes are a modified version of the presentation in [MT-B].

Theorem A.1 (Intermediate Value Theorem (IVT)). Let f be a continuous function on [a, b]. For all C between f(a) and f(b) there exists a c ∈ [a, b] such that f(c) = C. In other words, all intermediate values of a continuous function are obtained.

Sketch of the proof. We proceed by Divide and Conquer. Without loss of generality, assume f(a) < C < f(b). Let x₁ be the midpoint of [a, b]. If f(x₁) = C we are done. If f(x₁) < C, we look at the interval [x₁, b]. If f(x₁) > C we look at the interval [a, x₁]. In either case, we have a new interval, call it [a₁, b₁], such that f(a₁) < C < f(b₁) and the interval has half the size of [a, b]. We continue in this manner, repeatedly taking the midpoint and looking at the appropriate half-interval. If any of the midpoints satisfy f(x_n) = C, we are done. If no midpoint works, we divide infinitely often and obtain a sequence of points x_n in intervals [a_n, b_n]. This is where rigorous analysis is required (see [Rud] for details) to show x_n converges to an x ∈ (a, b). For each n we have f(a_n) < C < f(b_n), and lim_{n→∞} |b_n − a_n| = 0. As f is continuous, this implies lim_{n→∞} f(a_n) = lim_{n→∞} f(b_n) = f(x) = C.

Theorem A.2 (The Mean Value Theorem (MVT)). Let f(x) be differentiable on [a, b]. Then there exists a c ∈ (a, b) such that

f(b) − f(a) = f′(c) · (b − a). (A.43)

We give an interpretation of the Mean Value Theorem. Let f(x) represent the distance from the starting point at time x. The average speed from a to b is the distance traveled, f(b) − f(a), divided by the elapsed time, b − a. As f′(x) represents the speed at time x, the Mean Value Theorem says that there is some intermediate time at which we are traveling at the average speed.

To prove the Mean Value Theorem, it suffices to consider the special case when f(a) = f(b) = 0; this case is known as Rolle's Theorem:

Theorem A.3 (Rolle's Theorem). Let f be differentiable on [a, b], and assume f(a) = f(b) = 0. Then there exists a c ∈ (a, b) such that f′(c) = 0.

Exercise A.4. Show the Mean Value Theorem follows from Rolle's Theorem. Hint: Consider

h(x) = f(x) − [(f(b) − f(a))/(b − a)] (x − a) − f(a). (A.44)

Note h(a) = f(a) − f(a) = 0 and h(b) = f(b) − (f(b) − f(a)) − f(a) = 0. The conditions of Rolle's Theorem are satisfied for h(x), and

h′(c) = f′(c) − (f(b) − f(a))/(b − a). (A.45)

Proof of Rolle's Theorem. Without loss of generality, assume f′(a) and f′(b) are non-zero. If either were zero we would be done. Multiplying f(x) by −1 if needed, we may assume f′(a) > 0. For convenience, we assume f′(x) is continuous. This assumption simplifies the proof, but is not necessary. In all applications in these notes this assumption will be met.

Case 1: f′(b) < 0: As f′(a) > 0 and f′(b) < 0, the Intermediate Value Theorem applied to f′(x) asserts that all intermediate values are attained. As f′(b) < 0 < f′(a), this implies the existence of a c ∈ (a, b) such that f′(c) = 0.

Case 2: f′(b) > 0: We have f(a) = f(b) = 0, and the function f is increasing at a and b. If x is very close to a then f(x) > 0 for x > a. This follows from the fact that

f′(a) = lim_{x→a} [f(x) − f(a)]/(x − a). (A.46)

As f′(a) > 0, the limit is positive. As the denominator is positive for x > a, the numerator must be positive for such x close to a. Thus f(x) must be greater than f(a) = 0 for such x. Similarly f′(b) > 0 implies f(x) < f(b) = 0 for x slightly less than b. Therefore the function f(x) is positive for x slightly greater than a and negative for x slightly less than b. If the first derivative were always positive then f(x) could never be negative, as it starts at 0 at a. This can be seen by again using the limit definition of the first derivative to show that if f′(x) > 0 then the function is increasing near x. Thus the first derivative cannot always be positive. Either there must be some point y ∈ (a, b) such that f′(y) = 0 (and we are then done) or f′(y) < 0. By the Intermediate Value Theorem, as 0 is between f′(a) (which is positive) and f′(y) (which is negative), there is some c ∈ (a, y) ⊂ [a, b] such that f′(c) = 0.
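The Divide and Conquer argument sketched for the IVT is, in computational form, the bisection method. Below is a minimal Python sketch (the function name and tolerance are our choices) that locates a point c with f(c) = C for a continuous f with f(a) < C < f(b).

```python
def bisect_root(f, a, b, c_target=0.0, tol=1e-10):
    """Divide-and-conquer search from the IVT proof sketch.

    Assumes f is continuous with f(a) < c_target < f(b); repeatedly
    halves the interval, keeping the half that still brackets c_target.
    """
    lo, hi = a, b
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < c_target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# f(x) = x^2 is continuous on [0, 2] with f(0) = 0 < 2 < 4 = f(2),
# so some c satisfies f(c) = 2: the square root of 2
c = bisect_root(lambda x: x * x, 0.0, 2.0, c_target=2.0)
assert abs(c - 2 ** 0.5) < 1e-8
```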

B Factorials and the Gamma Function

The Gamma function is defined by the equation

Γ(s) = ∫_0^∞ x^{s−1} e^{−x} dx. (B.47)

The Gamma function is useful to us because of the following theorem.

Theorem B.1. Let n be a nonnegative integer. Then

n! = Γ(n + 1). (B.48)

Proof. We prove the theorem by induction. For n = 0, the theorem says

0! = Γ(1) = ∫_0^∞ e^{−x} dx. (B.49)

Both the leftmost and rightmost expressions evaluate to 1, so the base case is proven. We now assume the statement to be true for 0 through n − 1. We must show Γ(n + 1) = n!. By definition,

Γ(n + 1) = ∫_0^∞ x^n e^{−x} dx. (B.50)

Integrating by parts gives us

Γ(n + 1) = ∫_0^∞ x^n e^{−x} dx
         = [−e^{−x} x^n]_0^∞ + n ∫_0^∞ x^{n−1} e^{−x} dx
         = 0 + n Γ(n)
         = n · (n − 1)!
         = n!. (B.51)

The theorem follows by induction (see Appendix C for a review of induction).
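Theorem B.1 is easy to spot-check: Python's math.gamma implements the Gamma function, so we can compare Γ(n + 1) with n! directly (a numerical check, not a proof).

```python
import math

# Theorem B.1: Gamma(n + 1) = n! for nonnegative integers n;
# math.gamma implements the Gamma function of (B.47)
for n in range(10):
    assert math.isclose(math.gamma(n + 1), math.factorial(n))
```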

C Proofs by Induction

These notes are a modified version of the presentation in [MT-B].

Assume for each positive integer n we have a statement P(n) which we desire to show is true. P(n) is true for all positive n if the following two statements hold:

• Basis Step: P(1) is true;
• Inductive Step: whenever P(n) is true, P(n + 1) is true.

This technique is called Proof by Induction, and is a very useful method for proving results; we shall see many instances of it in this appendix. The reason the method works follows from basic logic. We assume the following two sentences are true:

P(1) is true;
∀ n ≥ 1, P(n) is true implies P(n + 1) is true. (C.52)

Set n = 1 in the second statement. As P(1) is true, and P(1) implies P(2), P(2) must be true. Now set n = 2 in the second statement. As P(2) is true, and P(2) implies P(3), P(3) must be true. And so on, completing the proof. We call verifying the first statement the basis step and verifying the second the inductive step. In verifying the inductive step, note we assume P(n) is true; this is called the inductive assumption. Sometimes instead of starting at n = 1 we start at n = 0, although in general we could start at any n₀ and then prove that for all n ≥ n₀, P(n) is true. We give three of the more standard examples of proofs by induction, and one false example; the first example is the most typical. When you have mastered proofs by induction, the following is a fun problem involving the Fibonacci numbers.

Exercise C.1 (Zeckendorf's Theorem). Consider the set of distinct Fibonacci numbers: {1, 2, 3, 5, 8, 13, ...}. Show every positive integer can be written uniquely as a sum of distinct Fibonacci numbers where we do not allow two consecutive Fibonacci numbers to occur in the decomposition. Equivalently, for any n there are choices of ε_i(n) ∈ {0, 1} such that

n = Σ_{i=2}^{ℓ(n)} ε_i(n) F_i,  with ε_i(n) ε_{i+1}(n) = 0 for i ∈ {2, ..., ℓ(n) − 1}. (C.53)

Does a similar result hold for all recurrence relations? If not, can you find another where such a result holds?
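The decomposition in Exercise C.1 can be found greedily: repeatedly subtract the largest Fibonacci number not exceeding what remains. A small Python sketch (function name ours; this illustrates existence, not the uniqueness part of the exercise):

```python
def zeckendorf(n):
    """Greedy Zeckendorf decomposition of a positive integer n.

    Repeatedly subtract the largest Fibonacci number (from the list
    1, 2, 3, 5, 8, ...) not exceeding what remains.
    """
    fibs = [1, 2]
    while fibs[-1] < n:
        fibs.append(fibs[-1] + fibs[-2])
    parts = []
    for f in reversed(fibs):
        if f <= n:
            parts.append(f)
            n -= f
    return parts

# 100 = 89 + 8 + 3: distinct, non-consecutive Fibonacci numbers
parts = zeckendorf(100)
assert parts == [89, 8, 3] and sum(parts) == 100
```

The greedy choice automatically avoids consecutive Fibonacci numbers: after subtracting F_k, the remainder is less than F_{k−1}.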

C.1 Sums of Integers

Let P(n) be the statement

Σ_{k=1}^n k = n(n + 1)/2. (C.54)

Basis Step: P(1) is true, as both sides equal 1.

Inductive Step: Assuming P(n) is true, we must show P(n + 1) is true. By the inductive assumption, Σ_{k=1}^n k = n(n + 1)/2. Thus

Σ_{k=1}^{n+1} k = (n + 1) + Σ_{k=1}^n k = (n + 1) + n(n + 1)/2 = (n + 1)(n + 1 + 1)/2. (C.55)

Thus, given P(n) is true, then P(n + 1) is true.
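Induction proves (C.54) for every n at once; a computer can only spot-check finitely many cases, but such checks are a good habit (a short Python sketch):

```python
# spot-check the closed form sum_{k=1}^n k = n(n+1)/2 for small n;
# this builds confidence but is not a substitute for the induction proof
for n in range(1, 101):
    assert sum(range(1, n + 1)) == n * (n + 1) // 2
```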

Exercise C.2. Prove

Σ_{k=1}^n k² = n(n + 1)(2n + 1)/6. (C.56)

Find a similar formula for the sum of k³.

Exercise C.3. Show the sum of the first n odd numbers is n², i.e.,

Σ_{k=1}^n (2k − 1) = n². (C.57)

Remark C.4. We define the empty sum to be 0 and the empty product to be 1. For example, Σ_{n∈N, n<0} 1 = 0. See [Mil] for an alternate derivation of sums of powers that does not use induction.

C.2 Divisibility

Let P(n) be the statement that 133 divides 11^{n+1} + 12^{2n−1}.

Basis Step: A straightforward calculation shows P(1) is true: 11^{1+1} + 12^{2−1} = 121 + 12 = 133.

Inductive Step: Assume P(n) is true, i.e., 133 divides 11^{n+1} + 12^{2n−1}. We must show P(n + 1) is true, or that 133 divides 11^{(n+1)+1} + 12^{2(n+1)−1}. But

11^{(n+1)+1} + 12^{2(n+1)−1} = 11^{n+1+1} + 12^{2n−1+2}
                             = 11 · 11^{n+1} + 12² · 12^{2n−1}
                             = 11 · 11^{n+1} + (133 + 11) · 12^{2n−1}
                             = 11 (11^{n+1} + 12^{2n−1}) + 133 · 12^{2n−1}. (C.58)

By the inductive assumption 133 divides 11^{n+1} + 12^{2n−1}; therefore, 133 divides 11^{(n+1)+1} + 12^{2(n+1)−1}, completing the proof.
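Again, a finite spot-check complements (but does not replace) the induction argument:

```python
# check that 133 divides 11^(n+1) + 12^(2n-1) for the first several n
for n in range(1, 50):
    assert (11**(n + 1) + 12**(2 * n - 1)) % 133 == 0
```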

Exercise C.5. Prove 4 divides 1 + 3^{2n+1}.

C.3 The Binomial Theorem

We now prove the Binomial Theorem. First, recall the following definition; we write C(n, k) for the binomial coefficient.

Definition C.6 (Binomial Coefficients). Let n and k be integers with 0 ≤ k ≤ n. We set

C(n, k) = n!/[k!(n − k)!]. (C.59)

Note that 0! = 1 and C(n, k) is the number of ways to choose k objects from n (with order not counting).

Lemma C.7. We have

C(n, k) = C(n, n − k),  C(n, k) + C(n, k − 1) = C(n + 1, k). (C.60)

Exercise C.8. Prove Lemma C.7.

Theorem C.9 (The Binomial Theorem). For all positive integers n we have

(x + y)^n = Σ_{k=0}^n C(n, k) x^{n−k} y^k. (C.61)

Proof. We proceed by induction.

Basis Step: For n = 1 we have

Σ_{k=0}^1 C(1, k) x^{1−k} y^k = x + y = (x + y)^1. (C.62)

Inductive Step: Suppose

(x + y)^n = Σ_{k=0}^n C(n, k) x^{n−k} y^k. (C.63)

Then using Lemma C.7 we find that

(x + y)^{n+1} = (x + y)(x + y)^n
             = (x + y) Σ_{k=0}^n C(n, k) x^{n−k} y^k
             = Σ_{k=0}^n C(n, k) x^{n+1−k} y^k + Σ_{k=0}^n C(n, k) x^{n−k} y^{k+1}
             = x^{n+1} + Σ_{k=1}^n [C(n, k) + C(n, k − 1)] x^{n+1−k} y^k + y^{n+1}
             = Σ_{k=0}^{n+1} C(n + 1, k) x^{n+1−k} y^k. (C.64)

This establishes the inductive step, and hence the theorem.
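A numerical spot-check of the Binomial Theorem (function name ours; math.comb computes the binomial coefficient):

```python
import math

def binomial_expand(x, y, n):
    """Right-hand side of the Binomial Theorem: sum of C(n, k) x^(n-k) y^k."""
    return sum(math.comb(n, k) * x**(n - k) * y**k for k in range(n + 1))

# for integer inputs both sides are exact, so we can compare with ==
for n in range(1, 10):
    assert binomial_expand(3, 5, n) == (3 + 5)**n
```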

C.4 False Proofs by Induction

Consider the following: let P(n) be the statement that in any group of n people, everyone has the same name. We give a (false!) proof by induction that P(n) is true for all n!

Basis Step: Clearly, in any group with just 1 person, every person in the group has the same name.

Inductive Step: Assume P(n) is true, namely, in any group of n people, everyone has the same name. We now prove P(n + 1). Consider a group of n + 1 people:

{1, 2, 3, . . . , n − 1, n, n + 1}. (C.65)

The first n people form a group of n people; by the inductive assumption, they all have the same name. So, the name of 1 is the same as the name of 2 is the same as the name of 3 ... is the same as the name of n. Similarly, the last n people form a group of n people; by the inductive assumption they all have the same name. So, the name of 2 is the same as the name of 3 ... is the same as the name of n is the same as the name of n + 1. Combining yields everyone has the same name! Where is the error? If n = 4, we would have the set {1, 2, 3, 4, 5}, and the two sets of 4 people would be {1, 2, 3, 4} and {2, 3, 4, 5}. We see that persons 2, 3 and 4 are in both sets, providing the necessary link. What about smaller n? What if n = 1? Then our set would be {1, 2}, and the two sets of 1 person would be {1} and {2}; there is no overlap! The error was that we assumed n was "large" in our proof of P(n) ⇒ P(n + 1).

Exercise C.10. Show the above proof that P(n) implies P(n + 1) is correct for n ≥ 2, but fails for n = 1.

Exercise C.11. Similar to the above, give a false proof that any sum of integer squares is an integer square, i.e., x₁² + ··· + x_n² = x². In particular, this would prove all positive integers are squares, as m = 1² + ··· + 1².

Remark C.12. There is no such thing as Proof By Example. While it is often useful to check a special case and build intuition on how to tackle the general case, checking a few examples is not a proof. For example, because 16/64 = 1/4 and 19/95 = 1/5, one might think that in dividing two-digit numbers one can just cancel two identical digits on a diagonal. If that were true, then 12/24 should be 1/4. Of course this is not how one divides two-digit numbers!

References

[MM] I. Miller and M. Miller, John E. Freund’s Mathematical Statistics with Applications, seventh edition, Prentice Hall, 2004.

[Mil] S. J. Miller, Differentiating identities, Class Notes from Math 162: Statistics, Brown University, 2005.

[MT-B] S. J. Miller and R. Takloo-Bighash, An Invitation to Modern Number Theory, Princeton University Press, Princeton, NJ, 2006.

[Rud] W. Rudin, Principles of Mathematical Analysis, third edition, International Series in Pure and Applied Mathematics, McGraw-Hill Inc., New York, 1976.

[Wei] Weisstein, Eric W. “Stirling’s Approximation.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/StirlingsApproximation.html.
