Degree Project
Edgeworth Expansion and Saddle Point Approximation for Discrete Data with Application to Chance Games.
Rani Basna 2010-09-27 Subject: Mathematics Level: Master Course code: 5MA11E
Abstract
We investigate two mathematical tools, the Edgeworth series expansion and the saddle point method: approximation techniques that help us estimate the distribution function of the standardized mean of independent, identically distributed random variables, taking the lattice case into consideration. Later on we describe an important application of these mathematical tools: game-developing companies can use them to reduce the amount of time needed to check that a game satisfies their standard requirements before the game is approved.
Keywords
Characteristic function, Edgeworth expansion, Lattice random variables, Saddle point approximation.
Acknowledgments

First I would like to show my gratitude to my supervisor Dr. Roger Pettersson for his continuous support and help. It is a pleasure to thank Boss Media® for giving me this problem. I would also like to thank my fiancée Hiba Nassar for her encouragement. Finally I want to thank my parents for their love, and I dedicate this thesis to my father, who inspired me.
Rani Basna Number of Pages: 45
Contents
1 Introduction
2 Notations and Definitions
  2.1 Characteristic Function
  2.2 Central Limit Theorem
  2.3 Definition of Lattice Distribution and Bounded Variation
3 Edgeworth Expansion
  3.1 First Case
  3.2 Second Case
    3.2.1 Auxiliary Theorems
    3.2.2 On the Remainder Term
    3.2.3 Main Theorem
    3.2.4 Simulation Edgeworth Expansion for Continuous Random Variables
  3.3 Lattice Edgeworth Expansion
    3.3.1 The Bernoulli Case
    3.3.2 Simulating for the Edgeworth Expansion Bernoulli Case
    3.3.3 General Lattice Case
    3.3.4 Continuity-Corrected Points Edgeworth Series
    3.3.5 Simulation on Triss Cards
4 Saddle Point Approximation
  4.1 Simulation With Saddle Point Approximation Method
A Matlab Codes
References
1 Introduction
The basic idea in this Master's Thesis is to define, adjust, and apply two mathematical tools, the Edgeworth expansion and the saddle point approximation. Mainly we want to estimate the cumulative distribution function for independent and identically distributed random variables, specifically lattice random variables. These approximation methods give us the ability to reduce the number of independent random variables needed for our estimate, compared to what the usual normal approximation by the central limit theorem requires. This makes the methods more applicable to real-life industry. More precisely, the chance game industry may use them to diminish the amount of time needed to publish a new trusted game. We will write Matlab codes to verify the theoretical results, by simulating a Triss game similar to real ones with three boxes, and then apply the codes to this game to see how accurate our methods are.

In the second chapter we define some basic concepts and present important theorems that will help us in our work. In the third chapter we introduce the Edgeworth expansion series, together with the improvement of the remainder term presented by Esseen (1968) [11], and then turn to the lattice case, which is much more important for us since we face it in important applications; after that we apply the Edgeworth method to our main problem and look at the results. In the fourth chapter we describe the most important and useful formulas of the saddle point approximation technique, without theoretical details, and finally apply the saddle point approximation method to our problem.
2 Notations and Definitions
2.1 Characteristic Function

The definitions and theorems presented below can be found, for example, in [14], [18], and [19].
Definition 2.1 (Distribution Function). For a random variable $X$, $F_X(x) = P(X \le x)$ is called the distribution function of $X$.

Definition 2.2 (Characteristic Function). For a random variable $X$, $\psi_X(t) = E\left[e^{itX}\right] = \int_{-\infty}^{\infty} e^{itx}\, dF_X(x)$ is called the characteristic function of $X$. Here the integral is in the usual Stieltjes integral sense.
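As a simple illustration (our own example, not taken from the references): if $X$ is a Bernoulli random variable with success probability $p$, then Definition 2.2 gives
$$\psi_X(t) = E\left[e^{itX}\right] = (1-p)\,e^{it\cdot 0} + p\,e^{it\cdot 1} = 1 - p + p\,e^{it},$$
so that $|\psi_X(2\pi k)| = 1$ for every integer $k$. Such periodicity is typical for the lattice distributions studied later, and it is exactly what makes the Cramér condition introduced below fail for them.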
Theorem 2.1. Let $X$ be a random variable with distribution function $F$ and characteristic function $\psi_X(t)$.

1. If $E|X|^n < \infty$ for some $n = 1, 2, \ldots$, then
$$\left|\psi_X(t) - \sum_{j=0}^{n} \frac{(it)^j}{j!}\,EX^j\right| \le E\left[\min\left\{\frac{2|t|^n|X|^n}{n!},\ \frac{|t|^{n+1}|X|^{n+1}}{(n+1)!}\right\}\right].$$
In particular,
$$|\psi_X(t) - 1| \le E\left[\min\{2,\ |tX|\}\right];$$
if $E|X| < \infty$, then
$$|\psi_X(t) - 1 - itEX| \le E\left[\min\{2|tX|,\ t^2X^2/2\}\right];$$
and if $EX^2 < \infty$, then
$$\left|\psi_X(t) - 1 - itEX + t^2EX^2/2\right| \le E\left[\min\{t^2X^2,\ |tX|^3/6\}\right].$$

2. If $E|X|^n < \infty$ for all $n$, and $\frac{|t|^n}{n!}E|X|^n \to 0$ as $n \to \infty$ for all $t \in \mathbb{R}$, then
$$\psi_X(t) = 1 + \sum_{j=1}^{\infty} \frac{(it)^j}{j!}\,EX^j.$$

Theorem 2.2 (Characteristic Function of Normal Random Variables). If $X \in N(\mu, \sigma)$, then its characteristic function is
$$\psi_X(t) = \exp\left(i\mu t - \frac{\sigma^2 t^2}{2}\right). \tag{1}$$
To make the paper more self-contained a proof is included. Proof:
We know that
$$\psi_X(t) = \frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{itx}\,e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx = \frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{itx}\,e^{-\frac{x^2 - 2x\mu + \mu^2}{2\sigma^2}}\,dx$$
$$= \frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\frac{(x^2 - 2x\mu + \mu^2) - 2itx\sigma^2}{2\sigma^2}}\,dx = \frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\frac{x^2 - 2x\mu - 2itx\sigma^2}{2\sigma^2}}\,e^{-\frac{\mu^2}{2\sigma^2}}\,dx$$
$$= \frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\frac{x^2 - 2(\mu + it\sigma^2)x}{2\sigma^2}}\,e^{-\frac{\mu^2}{2\sigma^2}}\,dx = \frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\frac{x^2 - 2(\mu + it\sigma^2)x + (\mu + it\sigma^2)^2}{2\sigma^2} + \frac{(\mu + it\sigma^2)^2}{2\sigma^2}}\,e^{-\frac{\mu^2}{2\sigma^2}}\,dx$$
$$= \frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\frac{(x - \mu - it\sigma^2)^2}{2\sigma^2}}\,e^{\frac{(\mu + it\sigma^2)^2 - \mu^2}{2\sigma^2}}\,dx = \frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\frac{(x - \mu - it\sigma^2)^2}{2\sigma^2}}\,e^{\mu it - \frac{t^2\sigma^2}{2}}\,dx$$
$$= \frac{\exp\left(\mu it - \frac{t^2\sigma^2}{2}\right)}{\sigma\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\frac{(x - \mu - it\sigma^2)^2}{2\sigma^2}}\,dx.$$
By substituting $y = \frac{x - \mu - it\sigma^2}{\sigma}$ we get $dy = \frac{dx}{\sigma}$, so
$$\psi_X(t) = \frac{\exp\left(it\mu - \frac{t^2\sigma^2}{2}\right)}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\frac{y^2}{2}}\,dy,$$
and since $\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\frac{y^2}{2}}\,dy = 1$, it follows that
$$\psi_X(t) = \exp\left(it\mu - \frac{t^2\sigma^2}{2}\right). \qquad \square$$
Using the Maclaurin expansion of the exponential in (1) we get the following expansion:
$$\psi_X(t) = 1 + \left(\mu it - \frac{t^2\sigma^2}{2}\right) + \frac{1}{2}\left(\mu it - \frac{t^2\sigma^2}{2}\right)^2 + \cdots$$
In addition, if we have two independent normal random variables $X$ and $Y$ with
$$\psi_X(t) = \exp\left(\mu_X it - \frac{t^2\sigma_X^2}{2}\right) \quad \text{and} \quad \psi_Y(t) = \exp\left(\mu_Y it - \frac{t^2\sigma_Y^2}{2}\right),$$
then we can easily prove that
$$\psi_{X+Y}(t) = \psi_X(t)\,\psi_Y(t) = \exp\left[(\mu_X + \mu_Y)it - \frac{t^2\left(\sigma_X^2 + \sigma_Y^2\right)}{2}\right].$$
For the special case when $X \in N(0, 1)$ we have the formula $\psi_X(t) = e^{-t^2/2}$.
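The closed form (1) is easy to check numerically. The following minimal Matlab sketch (our own, not one of the appendix codes; the values of mu, sigma, t and the sample size N are arbitrary choices) compares a Monte Carlo estimate of $E[e^{itX}]$ with the right-hand side of (1):

% Monte Carlo check of the normal characteristic function (1)
mu = 1; sigma = 2; t = 0.7;             % arbitrary test values
N = 1e6;                                % number of samples
x = mu + sigma*randn(N,1);              % draws from N(mu,sigma)
psi_mc = mean(exp(1i*t*x));             % Monte Carlo estimate of E[exp(itX)]
psi_th = exp(1i*mu*t - sigma^2*t^2/2);  % closed form (1)
disp([psi_mc psi_th])                   % the two values should nearly agree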
2.2 Central Limit Theorem
Theorem 2.3 (Central Limit Theorem). Let $X_1, X_2, \ldots$ be a sequence of independent and identically distributed random variables, each having mean $\mu$ and variance $\sigma^2 < \infty$. Then the distribution of
$$\frac{X_1 + \cdots + X_n - n\mu}{\sigma\sqrt{n}}$$
tends to the standard normal distribution as $n \to \infty$.
The theorem is fundamental in probability theory. One simple derivation is given in Blom [14], which we follow below.

Proof: Put $Y_k = (X_k - \mu)/\sigma$ and $S_n = \sum_{k=1}^{n} Y_k/\sqrt{n}$. We will show that $\psi_{S_n}(t) = E(e^{itS_n})$ converges to $e^{-t^2/2}$, the characteristic function of the standard normal distribution. By the independence,
$$\psi_{S_n}(t) = \psi_{\sum Y_k}\left(t/\sqrt{n}\right) = \left[\psi_Y(t/\sqrt{n})\right]^n.$$
Furthermore,
$$\psi_Y(t) = 1 + iE(Y)t - \frac{t^2}{2}E(Y^2) + t^3 H(t),$$
where $H(t)$ is bounded in a neighborhood of $0$. We get
$$\psi_Y(t/\sqrt{n}) = 1 + iE(Y)t/\sqrt{n} - \frac{t^2}{2n}E(Y^2) + n^{-3/2}H_n,$$
where $H_n$ is finite, and $E(Y) = 0$, $V(Y) = 1$. Hence
$$\psi_{S_n}(t) = \left[1 - \frac{t^2}{2n} + n^{-3/2}H_n\right]^n$$
and
$$\ln\psi_{S_n}(t) = n\ln\left(1 - \frac{t^2}{2n} + n^{-3/2}H_n\right) = n\left(-\frac{t^2}{2n} + n^{-3/2}H_n\right)\frac{\ln\left(1 - t^2/2n + n^{-3/2}H_n\right)}{-\frac{t^2}{2n} + n^{-3/2}H_n}.$$
From the logarithm property, $\frac{\ln(1+x)}{x} \to 1$ as $x \to 0$. Thus
$$n\left(-\frac{t^2}{2n} + n^{-3/2}H_n\right) \to -\frac{t^2}{2}, \quad \text{as } n \to \infty,$$
and
$$\frac{\ln\left(1 - t^2/2n + n^{-3/2}H_n\right)}{-\frac{t^2}{2n} + n^{-3/2}H_n} \to 1, \quad \text{as } n \to \infty,$$
so $\ln\psi_{S_n}(t) \to -t^2/2$. Since the characteristic function of $S_n$ converges to the characteristic function of the standard normal distribution, $S_n$ converges in distribution to the standard normal random variable; see e.g. Cramér [4]. $\square$

Theorem 2.4 (Lyapunov's Theorem). Suppose that for each $n$ the sequence $X_1, X_2, \ldots, X_n$ is independent and satisfies $E[X_n] = 0$, $\mathrm{Var}[X_n] = \sigma_n^2$, and $S_N^2 = \sum_{n=1}^{N} \sigma_n^2$. If for some $\delta > 0$ the expected values $E\left[|X_k|^{2+\delta}\right]$ are finite for every $k$ and Lyapunov's condition
$$\lim_{N \to \infty} \frac{1}{S_N^{2+\delta}} \sum_{n=1}^{N} E\left[|X_n|^{2+\delta}\right] = 0$$
holds for some positive $\delta$, then the central limit theorem holds.

Remark 2.5. This theorem can be seen as a further development of Lyapunov's solution to the second problem of Chebyshev (see Gnedenko and Kolmogorov [13] for more details), which turned out to be much simpler and more useful in applications of the central limit theorem than earlier solutions.
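To see Theorem 2.3 at work, the following Matlab sketch (our own illustration; the exponential parent distribution and the values of n and M are arbitrary choices) simulates standardized sums and compares their empirical distribution function with $\Phi$ at a few points:

% Empirical illustration of the central limit theorem
n = 50; M = 1e5;                        % sample size and number of replications
mu = 1; sigma = 1;                      % mean and standard deviation of Exp(1)
X = -log(rand(M,n));                    % M x n draws from Exp(1) by inversion
S = (sum(X,2) - n*mu)/(sigma*sqrt(n));  % standardized sums
Phi = @(x) 0.5*(1 + erf(x/sqrt(2)));    % standard normal cdf
for x = -2:1:2
    fprintf('x = %4.1f  empirical %.4f  Phi %.4f\n', x, mean(S <= x), Phi(x));
end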
2.3 Definition of Lattice Distribution and Bounded Variation

Definition 2.3 (Lattice Distribution). A random variable $X$ is said to have a lattice distribution if with probability 1 it takes on values of the form $a + kh$ ($k = 0, \pm 1, \pm 2, \ldots$), where $a$ and $h > 0$ are constants. We call the largest such number $h$ the span of the distribution.

Definition 2.4 (Function of Bounded Variation). Let $F(x)$ be a real- or complex-valued function of the real variable $x$. We say $F(x)$ has bounded variation on the whole real axis if
$$V(F) = \int_{-\infty}^{\infty} |dF(x)| < \infty.$$
For a function $F$ of bounded variation, define $F(x)$ at the discontinuity points in such a way that
$$F(x) = \frac{1}{2}\left[F(x+0) + F(x-0)\right],$$
where $F(x+0) = \lim_{\varepsilon \downarrow 0} F(x+\varepsilon)$ and $F(x-0) = \lim_{\varepsilon \downarrow 0} F(x-\varepsilon)$. If furthermore $F(-\infty) = 0$ and $F(\infty) = 1$, then $F(x)$ is said to be a distribution function.
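For example (our own remark): a Bernoulli random variable takes the values 0 and 1, so it has a lattice distribution with $a = 0$ and span $h = 1$. A random variable supported on $\{0, 2, 4, \ldots\}$ also takes values of the form $a + kh$ with $h = 1$, but its span is $h = 2$, since the span is the largest admissible $h$.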
3 Edgeworth Expansion
Let $X_1, X_2, \ldots, X_n$ be independent and identically distributed random variables with mean $\mu$ and variance $\sigma^2$. By the Central Limit Theorem,
$$S_n = \frac{\sum_{i=1}^{n} X_i/n - \mu}{\sigma/\sqrt{n}}$$
is asymptotically normally distributed with zero mean and unit variance. What we are interested in here is the asymptotic behavior of the difference between the normal distribution $\Phi(x)$ and the distribution function $F_n(x)$ of $S_n$. In other words, we want to describe the error of the normal approximation, and one way to do that is via characteristic functions. Three cases may appear, which together cover all possibilities:
1. The characteristic function $\psi_X(t)$ satisfies the condition
$$\limsup_{|t| \to \infty} |\psi_X(t)| < 1, \tag{C}$$
called the Cramér condition. Then the distribution function has the following expansion:
$$F_n(x) = \Phi(x) + \sum_{j=1}^{s} \frac{p_j(x)}{n^{j/2}}\, e^{-\frac{x^2}{2}} + O\left(\frac{1}{n^{\frac{s+1}{2}}}\right), \quad s \ge 1, \tag{2}$$
where $p_j(x)$ is a polynomial in $x$.

2. Condition (C) is not satisfied and the distribution is not of lattice type. It is found that
$$F_n(x) = \Phi(x) + \frac{\alpha_3}{6\sigma^3\sqrt{2\pi n}}\left(1 - x^2\right)e^{-\frac{x^2}{2}} + o\left(\frac{1}{\sqrt{n}}\right),$$
$\alpha_3$ being the third central moment of $X_i$, $\alpha_3 = E\left[(X - EX)^3\right]$.
3. $F_n(x)$ is a lattice distribution function. Even if all moments are finite, an expansion like the latter one is impossible: $F_n(x)$ has jumps at its discontinuity points of order of magnitude $\frac{1}{\sqrt{n}}$. By adding an extra term to the expansion we diminish the order of magnitude of the remainder term.
Note: when $s = 1$ the remainder in (2) is $O\left(\frac{1}{n}\right)$; that is, when (C) is satisfied the guaranteed error is of smaller order than the $o\left(\frac{1}{\sqrt{n}}\right)$ obtained when it is not satisfied.
Remark 3.1. If the distribution function of X is absolutely continuous, i.e. X is a continuous random variable, then (C) is valid.
Remark 3.2. Kolmogorov [13] noted that there are discrete distributions which are not lattice distributions. For instance, if $X_j$ takes only the values $\pm 1$ and $\pm\sqrt{3}$, each with probability $\frac{1}{4}$, then its distribution is not a lattice distribution, because the system of equations
$$-\sqrt{3} = a + k_1 h,$$
$$-1 = a + k_2 h,$$
$$1 = a + k_3 h,$$
$$\sqrt{3} = a + k_4 h,$$
where $k_i \in \{\pm 1, \pm 2, \ldots\}$, does not have a solution (if I understand Kolmogorov correctly). Indeed, subtracting the equations pairwise would force both $2 = (k_3 - k_2)h$ and $\sqrt{3} - 1 = (k_4 - k_3)h$, so that $\frac{\sqrt{3} - 1}{2} = \frac{k_4 - k_3}{k_3 - k_2}$ would be rational, which is false.
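The jumps of order $\frac{1}{\sqrt{n}}$ appearing in the lattice case (case 3 above) are easy to see numerically. This short Matlab sketch (our own; the choices $p = 0.3$ and $n = 400$ are arbitrary, and only base Matlab is used) computes the largest jump of the distribution function of the standardized Bernoulli mean:

% Largest jump of F_n for the standardized mean of Bernoulli(p) variables
p = 0.3; n = 400;                       % arbitrary choices
k = 0:n;                                % possible values of the sum
pk = exp(gammaln(n+1) - gammaln(k+1) - gammaln(n-k+1) ...
         + k*log(p) + (n-k)*log(1-p));  % binomial probabilities
fprintf('largest jump %.4f, compare 1/sqrt(n) = %.4f\n', max(pk), 1/sqrt(n));

The largest jump and $1/\sqrt{n}$ are of the same order of magnitude, which is why no expansion with remainder $o(1/\sqrt{n})$ can hold at the jump points without an extra term.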
3.1 First Case

Here a proof of (2) will be presented. However, we assume that the distribution function of $X$ is absolutely continuous, which implies (C); recall Remark 3.1. We know that $S_n$ is asymptotically normal $N(0,1)$, so the characteristic function $\psi_{S_n}$ of $S_n$ converges to that of $N(0,1)$ as $n \to \infty$:
$$\psi_{S_n}(t) = E\left[\exp(itS_n)\right] \to E\left[\exp(itN)\right] = e^{-t^2/2}. \tag{3}$$
Now if we put $Y = (X - \mu)/\sigma$, where $X$ is equal in law to $X_i$, and $\psi_Y$ is the characteristic function of $Y$, then we have
$$\psi_{S_n}(t) = E\left[e^{itS_n}\right] = E\left[e^{it\frac{\sum_{i=1}^{n} X_i - n\mu}{\sigma\sqrt{n}}}\right] = E\left[e^{i\frac{t}{\sqrt{n}}\frac{X_1 - \mu}{\sigma}}\right]E\left[e^{i\frac{t}{\sqrt{n}}\frac{X_2 - \mu}{\sigma}}\right]\cdots E\left[e^{i\frac{t}{\sqrt{n}}\frac{X_n - \mu}{\sigma}}\right] \tag{4}$$
$$= \left(E\left[e^{i\frac{t}{\sqrt{n}}\frac{X - \mu}{\sigma}}\right]\right)^n = \left(E\left[e^{i\frac{t}{\sqrt{n}}Y}\right]\right)^n = \left(\psi_Y(t/n^{1/2})\right)^n.$$
We can also write the characteristic function as an expansion,
$$\log \psi_Y(t) = \sum_{j=1}^{\infty} \frac{k_j (it)^j}{j!}.$$
Then
$$\psi_Y(t) = \exp\left(k_1 it + \frac{1}{2}k_2 (it)^2 + \cdots + \frac{1}{j!}k_j (it)^j + \cdots\right). \tag{5}$$
Formula (5) follows by using the expansion $\log(1+z) = z - \frac{z^2}{2} + \cdots \pm \frac{z^j}{j} + O(z^{j+1})$, replacing $1 + z$ by $\psi_Y(t)$, and doing some rearrangements. In addition, we have from the characteristic function developed in a Maclaurin series, for small arguments, that
$$\psi_Y(t) = 1 + E[Y]it + \frac{1}{2}E[Y^2](it)^2 + \cdots + \frac{1}{j!}E[Y^j](it)^j + \cdots$$
We can define the cumulants $k_j$ using the formal identity
$$\sum_{j \ge 1} \frac{1}{j!}k_j(it)^j = \log\left(1 + \sum_{j \ge 1} \frac{1}{j!}E[Y^j](it)^j\right) = \sum_{i \ge 1} \frac{(-1)^{i+1}}{i}\left(\sum_{j \ge 1} \frac{1}{j!}E[Y^j](it)^j\right)^i,$$
and by equating coefficients of $(it)^j$:
$$k_1 = E[Y],$$
$$k_2 = E[Y^2] - E[Y]^2 = \mathrm{Var}(Y),$$
$$k_3 = E[Y^3] - 3E[Y^2]E[Y] + 2E[Y]^3 = E\left[(Y - E[Y])^3\right], \tag{6}$$
$$k_4 = E[Y^4] - 4E[Y^3]E[Y] - 3E[Y^2]^2 + 12E[Y^2]E[Y]^2 - 6E[Y]^4 = E\left[(Y - E[Y])^4\right] - 3\left(\mathrm{Var}(Y)\right)^2,$$
and so on. We have expressed the cumulant $k_j$ as a homogeneous polynomial of degree $j$ in the moments; moments, too, can be expressed in terms of homogeneous polynomials in cumulants. We have standardized the random variable $Y$ for location and scale, so now $E[Y] = k_1 = 0$ and $\mathrm{Var}(Y) = k_2 = 1$. Hence by (4) and (5), and using the series expansion of the exponential function, we get
$$\psi_Y(t/n^{1/2}) = \exp\left(k_1 \frac{it}{\sqrt{n}} + \frac{k_2}{2!}\frac{(it)^2}{n} + \cdots + \frac{k_j}{j!}\frac{(it)^j}{n^{j/2}} + \cdots\right).$$
Hence
$$\psi_{S_n}(t) = \left(e^{k_1 \frac{it}{\sqrt{n}}}\, e^{\frac{k_2}{2!}\frac{(it)^2}{n}} \cdots e^{\frac{k_j}{j!}\frac{(it)^j}{n^{j/2}}} \cdots\right)^n.$$
Substituting $k_1 = 0$ and $k_2 = 1$ we get
$$\psi_{S_n}(t) = \exp\left(-\frac{t^2}{2} + n^{-1/2}\frac{k_3(it)^3}{3!} + \cdots + n^{-\frac{j-2}{2}}\frac{k_j(it)^j}{j!} + \cdots\right),$$
$$\psi_{S_n}(t) = e^{-t^2/2}\left(1 + n^{-1/2}r_1(it) + n^{-1}r_2(it) + \cdots + n^{-j/2}r_j(it) + \cdots\right), \tag{7}$$
where $r_j$ is a polynomial of degree $3j$ depending on the cumulants; as $n \to \infty$ this expansion reproduces the central limit convergence (3). We can see that $r_j$ is an even polynomial when $j$ is even and an odd polynomial when $j$ is odd, and it is obvious from (7) that
$$r_1(u) = \frac{1}{6}k_3 u^3 \tag{8}$$
and
$$r_2(u) = \frac{1}{24}k_4 u^4 + \frac{1}{72}k_3^2 u^6. \tag{9}$$
We can rewrite (7) in the form
$$\psi_{S_n}(t) = e^{-t^2/2} + n^{-1/2}r_1(it)e^{-t^2/2} + n^{-1}r_2(it)e^{-t^2/2} + \cdots + n^{-j/2}r_j(it)e^{-t^2/2} + \cdots \tag{10}$$
Now since
$$\psi_{S_n}(t) = \int_{-\infty}^{\infty} e^{itx}\, dP(S_n \le x)$$
and
$$e^{-t^2/2} = \int_{-\infty}^{\infty} e^{itx}\, d\Phi(x), \tag{11}$$
where $\Phi$ denotes the standard normal distribution function, if we apply Fourier-Stieltjes inversion to (10) we get
$$P(S_n \le x) = \Phi(x) + n^{-1/2}R_1(x) + n^{-1}R_2(x) + \cdots + n^{-j/2}R_j(x) + \cdots, \tag{12}$$
where $R_j(x)$ represents a function whose Fourier-Stieltjes transform equals $r_j(it)e^{-t^2/2}$:
$$\int_{-\infty}^{\infty} e^{itx}\, dR_j(x) = r_j(it)e^{-t^2/2}.$$
Our focus now is to compute $R_j$. Integrating by parts $j$ times in (11) gives
$$e^{-t^2/2} = (-it)^{-1}\int_{-\infty}^{\infty} e^{itx}\, d\Phi^{(1)}(x) = (-it)^{-2}\int_{-\infty}^{\infty} e^{itx}\, d\Phi^{(2)}(x) = \cdots = (-it)^{-j}\int_{-\infty}^{\infty} e^{itx}\, d\Phi^{(j)}(x), \tag{13}$$
where $\Phi^{(j)}(x) = (d/dx)^j \Phi(x)$. Putting $D$ for the differential operator $d/dx$, so that $r_j(-D)$ is a differential operator, we hence have
$$\int_{-\infty}^{\infty} e^{itx}\, d\left\{(-D)^j \Phi(x)\right\} = (it)^j e^{-t^2/2} \tag{14}$$
and
$$\int_{-\infty}^{\infty} e^{itx}\, d\left\{r_j(-D)\Phi(x)\right\} = r_j(it)e^{-t^2/2},$$
so $r_j(-D)\Phi(x)$ is the function we are looking for here:
$$R_j(x) = r_j(-d/dx)\,\Phi(x), \quad \text{for } j \ge 1. \tag{15}$$
The Chebyshev-Hermite polynomials can be defined by the formula
$$H_k(x) = (-1)^k e^{x^2/2} \frac{d^k}{dx^k} e^{-x^2/2},$$
which gives
$$H_0(x) = 1, \quad H_1(x) = x, \quad H_2(x) = x^2 - 1, \quad H_3(x) = x^3 - 3x, \quad \ldots \tag{16}$$
Then, using the Hermite polynomials, we can put
$$(-D)^j \Phi(x) = -H_{j-1}(x)\,\phi(x),$$
where $\phi$ denotes the standard normal density. We note that the Hermite polynomials are orthogonal with respect to the weight function $\phi(x)$, and that $H_j$ is a polynomial of degree $j$ which is even when $j$ is even and odd when $j$ is odd. We now get from (8), (9), and (15) that
$$R_1(x) = -\frac{1}{6}k_3\left(x^2 - 1\right)\phi(x),$$
$$R_2(x) = -x\left[\frac{1}{24}k_4\left(x^2 - 3\right) + \frac{1}{72}k_3^2\left(x^4 - 10x^2 + 15\right)\right]\phi(x).$$
For general $j \ge 1$,
$$R_j(x) = p_j(x)\,\phi(x),$$
where the polynomial $p_j$ has degree $3j - 1$ and is odd for even $j$ and even for odd $j$; this is clear from the degree and parity of $r_j$. Hence
$$p_1(x) = -\frac{1}{6}k_3\left(x^2 - 1\right) \tag{17}$$
and
$$p_2(x) = -x\left[\frac{1}{24}k_4\left(x^2 - 3\right) + \frac{1}{72}k_3^2\left(x^4 - 10x^2 + 15\right)\right]. \tag{18}$$
Now we can rewrite formula (12) as
$$P(S_n \le x) = \Phi(x) + n^{-1/2}p_1(x)\phi(x) + n^{-1}p_2(x)\phi(x) + \cdots + n^{-j/2}p_j(x)\phi(x) + \cdots, \tag{19}$$
called the Edgeworth expansion of the distribution of $S_n$. The third cumulant $k_3$ refers to skewness, so the term of order $n^{-1/2}$ in the last expansion improves the basic normal approximation of the cumulative distribution function of $S_n$ by performing a skewness correction. In the same way $k_4$ refers to kurtosis for the term of order $n^{-1}$, which improves the normal approximation further by adjusting for kurtosis. In other words, the $O(1/\sqrt{n})$ rate of the Berry-Esseen theorem is improved to uniform errors of order $n^{-1}$ and $n^{-3/2}$ by the one- and two-term Edgeworth expansions. It is very rare that this expansion converges, according to Hall [15]. In fact there is a condition on this expansion (19), Cramér [5], which says that if $X$ has an absolutely continuous distribution function, then the condition for convergence of (19) is
$$E\exp\left(\frac{1}{4}Y^2\right) < \infty,$$
which is a very severe condition. Usually (19) exists as an asymptotic series, which means that if the series is stopped at a specific order, the remainder is of smaller order than the last omitted term in the series. That is,
$$P(S_n \le x) = \Phi(x) + n^{-1/2}p_1(x)\phi(x) + n^{-1}p_2(x)\phi(x) + \cdots + n^{-j/2}p_j(x)\phi(x) + o(n^{-j/2}). \tag{20}$$
In fact, the remainder term $o(n^{-j/2})$ is much smaller, namely $O\left(n^{-\frac{j+1}{2}}\right)$ [10]. The restrictions on (20) are
$$E\left(|X|^{j+2}\right) < \infty \quad \text{and} \quad \limsup_{|t| \to \infty} |\psi(t)| < 1. \tag{21}$$
You can find the proof of this fact in Hall [15]. We derived the Edgeworth expansion from an expansion of the logarithm of the characteristic function of $S_n$. The Cramér condition (C) ensures that the characteristic function can be expanded in the requested manner; the expansion of $F_n$ was then obtained by a Fourier inversion of the expansion for the characteristic function.
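To make (19) concrete, here is a Matlab sketch of the one- and two-term Edgeworth approximations (our own illustration, written independently of the appendix codes; the exponential parent distribution, $n$, and the evaluation point $x$ are arbitrary choices). For $X \in \mathrm{Exp}(1)$ we have $\mu = \sigma = 1$, and the standardized variable $Y = X - 1$ has cumulants $k_3 = 2$ and $k_4 = 6$; the polynomials $p_1$ and $p_2$ are taken from (17) and (18), and the exact distribution of $S_n$ is available through the gamma distribution:

% One- and two-term Edgeworth approximations (19) for Exp(1) sample means
n = 10;                               % sample size (arbitrary)
k3 = 2; k4 = 6;                       % cumulants of the standardized Exp(1)
phi = @(x) exp(-x.^2/2)/sqrt(2*pi);   % standard normal density
Phi = @(x) 0.5*(1 + erf(x/sqrt(2)));  % standard normal cdf
p1 = @(x) -k3*(x.^2 - 1)/6;                                  % formula (17)
p2 = @(x) -x.*(k4*(x.^2-3)/24 + k3^2*(x.^4-10*x.^2+15)/72);  % formula (18)
F1 = @(x) Phi(x) + p1(x).*phi(x)/sqrt(n);                    % one-term expansion
F2 = @(x) F1(x) + p2(x).*phi(x)/n;                           % two-term expansion
% exact value: P(S_n <= x) = P(Gamma(n,1) <= n + x*sqrt(n))
Fex = @(x) gammainc(n + x*sqrt(n), n);  % regularized lower incomplete gamma
x = 1.5;
fprintf('Phi %.4f  one-term %.4f  two-term %.4f  exact %.4f\n', ...
        Phi(x), F1(x), F2(x), Fex(x));

Already for $n = 10$ the one- and two-term expansions reduce the error of the plain normal approximation considerably, in agreement with the orders $n^{-1}$ and $n^{-3/2}$ stated above.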
Putting D as the differential operator d/dx, such that r ( D) is differential operator, Hence j − ∞ itx j j t2/2 e d ( D) Φ(x) =(it) e− (14) − Z−∞ ∞ itx j t2/2 e d r ( D)Φ(x) =(it) e− { j − } Z−∞ and r ( D)Φ(x) is the function we are looking for here j − R (x)= r ( d/dx)φ(x), for j 1. (15) j j − ≥ The Chebyshev-Hermit polynomials can be defined by the formula k k x2/2 d x2/2 H (x)=( 1) e e− , k − dxk Which gives H0(x)=1, H1(x)= x, H (x)= x2 1, (16) 2 − H (x)= x3 3x, 3 − ...... Then using the Hermitian polynomials we can put j ( D )Φ(x)= Hj 1(x)Φ(x). − − − We note that the Hermitian polynomials are orthogonal with respect to the function Φ(x) and that Hj is a polynomial of degree j and is even when j is even and is odd when j is odd. We get now from (9), (8), and (15) that 1 R (x)= k (x2 1)φ(x) 1 −6 3 − 12 1 1 R (x)= x k (x2 3) + k2(x4 10x2 + 15) φ(x). 2 − 24 4 − 72 3 − For general j 1, ≥ Rj(x)= pj(x)φ(x), Here the polynomial p have degree of order 3j 1 and is odd for even j. j − This is clear because of rj degree. Hence 1 p (x)= k (x2 1) (17) 1 −6 3 − and 1 1 p (x)= x k (x2 3) + k2(x4 10x2 + 15) . (18) 2 − 24 4 − 72 3 − Now we can rewrite formula (12) 1/2 1 j/2 P (Sn x)=Φ(x)+ n− p1(x)φ(x)+ n− p2(x)φ(x)... + n− pj(x)φ(x)+ ... ≤ (19) called the Edgeworth expansion of the distribution of P (Sn x). The third 1/2 ≤ cumulant k3 refers to skewness, so the term of n− order in the last expan- sion improves the basic normal approximation of the cumulative distribution function of Sn by performing skewness correction. In the same way k4 refers 1 to kurtosis for the term of order n− which improves the normal approxima- tion further by adjusting for kurtosis. In other words the O(1/√n) rate of 1 the Berry-Essen theorem is improved to uniform errors of the order n− , and 3 n− 2 by the one and two term Edgeworth expansion. It is very rare that this expansion converges according to Hall [15]. In fact there is a condition on this expansion (19) Cramér [5], which says that if X has an absolutely continuous distribution function then the condition for convergence of (19) is 1 E exp( Y 2) < , 4 ∞ which is a very severe condition. Usually (19) exists as an asymptotic series, which means that if the series stop at a specific order the remainder is of smaller order than the last omitted term in the series. It means 1/2 1 j/2 j/2 P (Sn x)=Φ(x)+n− p1(x)φ(x)+n− p2(x)φ(x)...+n− pj(x)φ(x)+o(n− ). ≤ (20) j/2 j+1 In fact, the remainder term o(n− ) is much smaller, namely O(n− 2 ) [10]. The restrictions on (20) are E( X j+2) < and lim sup ψ(t) < 1. (21) | | ∞ t | | | |→∞ 13 You can find the proof of this fact in Hall [15]. We derived the Edgeworth expansion from an expansion for the logarithm of the characteristic function of Sn. The Cramér condition (C) ensures that the characteristic function can be expanded in the requested manner. The expansion of Fn was obtained by a Fourier inversion of the expansion for the characteristic function. 3.2 Second Case 3.2.1 Auxiliary Theorems Now we move to the case where the condition (C) is not satisfied and the distribution is not of lattice type. For that we need some auxiliary theorems before a proof the main theorem, Theorem 3.10. Theorem 3.3. Let A, T, and ε > 0 be constants, F (x) a non decreasing function, and G(x) a function of bounded variation. If 1. F ( )= G( ),F (+ )= G(+ ), −∞ −∞ ∞ ∞ 2. F (x) G(x) dx < , | − | ∞ 3. RG′ (x)exist for all x and G′ (x) A, ≤ T f(t) g(t) − 4. 
T t dt = ε, − R then to every number k > 1 there corresponds a finite positive number c(k) depending only on k such that ε A F (x) G(x) k + c(k) . | − |≤ 2π T Theorem 3.4. Let A, T, ε be arbitrary positive constants, F (x) a non de- creasing discontinuous function, and G(x) a function of bounded variation. If 1. F ( )= G( )=0,F (+ )= G(+ ), −∞ −∞ ∞ ∞ 2. F (x) G(x) dx < , | − | ∞ 3.R the functions F (x) and G(x) have discontinuities only at the points x = xi(xi < xi+1; i = 0, 1, 2,...), and there exist an l such that min(x x ) l, ± ± i+1 − i ≥ 14 4. everywhere except at x = x (i =0, 1, 2,...), i ± ± G′ (x) A ≤ T f(t) g(t) − 5. T t dt = ε, − then toR every number k > 1 there corresponds two finite numbers c (k) and 1 c2(k) depending only on k and such that ε A F (x) G(x) k + c (k) | − |≤ 2π 1 T whenever, Tl c (k). ≥ 2 3.2.2 On The Remainder Term Theorem 3.5. If the random variables X1,X2,...Xn,.. have finite third mo- ments, then ρ F (x) Φ(x) c 3 , | n − |≤ √n where ρ3 is the third cumulant of the random variable x/σ and c is a constant. Theorem 3.6. If the random variables X1,X2,...Xn,.. are identically dis- tributed and have finite third moments, then for σ3√n t = Tn, | |≤ 5β3 the following inequality holds: 3 t2 7 t β3 t2 f (t) e− 2 | | e− 4 , n − ≤ 6 σ3√n where β3 denote the absolute moment of order 3 and equal ∞ β = x 3 dF (x). 3 | | Z−∞ n Xi/n µ Pi=1 − We found that the characteristics function of Sn = σ/√n where Xi are independent and identically distributed is ∞ t2/2 1 j ψ (t)= e− 1+ r (it)( ) (22) Sn j √n ( j=1 ) X 15 Theorem 3.7. If in the sum (22) the summands have finite moments from √n 1 order s where s 3, then for t Tsn = 3/s the inequality ≥ | |≤ 8sρs s 3 2 − 2 t 1 j c1(s) s 3(s 2) t − 2 − − 4 fn(t) e 1+ rj(it)( ) s 2 t + t e (23) − √n ! ≤ Tsn− | | | | j=1 X holds where c1(s) depend only on s; also the inequality s 2 2 2 t − 1 δ(n) s 3(s 1) t 2 j 4 fn(t) e− 1+ rj(it)( ) s 2 t + t − e− (24) − √n ! ≤ n −2 | | | | j=1 X holds, where δ(n) depends only on n and lim δ(n)=0. n →∞ Remark 3.8. By Gnedenko and Kolmogorov [13, P204,Theorem 1b] we in- stead have s 2 2 − 2 t 1 j c2(s)δ(n) s 3(s 2) t − 2 − − 4 fn(t) e 1+ rj(it)( ) s 2 t + t e . − √n ! ≤ Tsn− | | | | j=1 X Theorem 3.9. If the distribution function F (x) is non lattice, then whatever the number w> 0 may be, there exist a function λ(n) such that lim λ(n)= n →∞ ∞ and λ(n) f n(t) 1 I = | |dt = o( ) t √n Zw The theorem we will mention next was first proved By Cramér [5] where the condition (C) satisfied, and later by Esseen [10] which we will present here. 1 = βs ρs σs 16 3.2.3 Main Theorem Theorem 3.10. If the independent random variable X1,X2,...,Xn are iden- tically distributed, nonlattice, and have finite third moment, then p (x) 1 F (x)=Φ(x)+ φ(x) 1 + o( ) (25) n √n √n uniformly in x, where p (x)= k3 (1 x2). 1 6 − Proof: Put s =3 in Theorem 3.7 of formula (24). We then deduce that 2 2 2 t r1(it) t δ(n) 3 6 t f (t) e− 2 e− 2 t + t e− 4 . (26) n − − √n ≤ √n | | | |