Average Laws in Analysis
Silvius Klein, Norwegian University of Science and Technology (NTNU)
The law of large numbers: informal statement

The theoretical expected value of an experiment is approximated by the average of a large number of independent samples:

theoretical expected value ≈ empirical average

The law of large numbers (LLN)

Let $X_1, X_2, \ldots, X_n, \ldots$ be a sequence of jointly independent, identically distributed copies of a scalar random variable $X$. Assume that $X$ is absolutely integrable, with expectation $\mu$. Define the partial sum process $S_n := X_1 + X_2 + \ldots + X_n$. Then the average process $\frac{S_n}{n} \to \mu$ as $n \to \infty$.

The law of large numbers: formal statements

Let $X_1, X_2, \ldots$ be a sequence of independent, identically distributed random variables with common expectation $\mu$. Let $S_n := X_1 + X_2 + \ldots + X_n$ be the corresponding partial sum process. Then

1. (weak LLN) $\frac{S_n}{n} \to \mu$ in probability. That is, for every $\epsilon > 0$, $\mathbb{P}\left( \left| \frac{S_n}{n} - \mu \right| > \epsilon \right) \to 0$ as $n \to \infty$.
2. (strong LLN) $\frac{S_n}{n} \to \mu$ almost surely.

It was the best of times, it was the worst of times.
Charles Dickens, A Tale of Two Cities

yskpw,qol,all/alkmas;’.a ma;;lal;,qwmswl,;q;[;’ lkle’78623rhbkbads m ,q l;,’;f.w, ’ fwe It was the best of times, it was the worst of times. jllkasjllmk,a s.„,qjwejhns;.2;oi0ppk;q,Qkjkqhjnqnmnmmasi[oqw— qqnkm,sa;l;[ml/w/’q

Application of LLN: the infinite monkey theorem

Let $X_1, X_2, \ldots$ be i.i.d. random variables drawn uniformly from a finite alphabet. Then almost surely, every finite phrase (i.e. finite string of symbols in the alphabet) appears (infinitely often) in the string $X_1 X_2 X_3 \ldots$.
The second Borel-Cantelli lemma

Let $E_1, E_2, \ldots, E_n, \ldots$ be a sequence of jointly independent events. If

$\sum_{n=1}^{\infty} \mathbb{P}(E_n) = \infty$,

then almost surely, infinitely many of the events $E_n$ occur.

This can be deduced from the strong law of large numbers, applied to the random variables $X_k := 1_{E_k}$.

The actual proof of the infinite monkey theorem

Split every realization of the infinite string of symbols in the alphabet $X_1 X_2 X_3 \ldots X_n \ldots$ into consecutive finite strings $S_1, S_2, \ldots$ of length 52 each (the phrase below has exactly 52 characters, counting spaces and punctuation).

Let $E_n$ be the event that the phrase

It was the best of times, it was the worst of times.

is exactly the $n$-th finite string $S_n$.

These are independent events. They each have the same probability $p > 0$ to occur, so $\sum_{n=1}^{\infty} \mathbb{P}(E_n) = \infty$. Apply the second Borel-Cantelli lemma.

The law of large numbers

We have seen that if $X_1, X_2, \ldots, X_n, \ldots$ is a sequence of jointly independent, identically distributed copies of a scalar random variable $X$, and if we denote the corresponding sum process by $S_n := X_1 + X_2 + \ldots + X_n$, then the average process $\frac{S_n}{n} \to \mathbb{E} X$ as $n \to \infty$.

A rather deterministic system: circle rotations

Let $S$ be the unit circle in the (complex) plane. There is a natural measure $\lambda$ on $S$ (i.e. the extension of the arc-length, which we take normalized so that $\lambda(S) = 1$). Let $2\pi\alpha$ be an angle, and denote by $R_\alpha$ the rotation by $2\pi\alpha$ on $S$. That is, consider the transformation $R_\alpha : S \to S$, where if $z = e^{2\pi i x} \in S$ and if we denote $\omega := e^{2\pi i \alpha}$, then

$R_\alpha(z) = e^{2\pi i (x + \alpha)} = z \cdot \omega$.
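As a quick sanity check of the definition of the circle rotation, the following minimal Python sketch treats a point of the circle as a complex number and the rotation as multiplication by $\omega$. The function name `rotate` and the sample values of `x` and `alpha` are hypothetical choices made only for illustration; they do not come from the slides.

```python
import cmath
import math


def rotate(z: complex, alpha: float) -> complex:
    """Apply the circle rotation R_alpha: multiply z by omega = e^{2 pi i alpha}."""
    omega = cmath.exp(2j * math.pi * alpha)
    return z * omega


# Hypothetical sample values, chosen only for illustration.
x = 0.1
alpha = math.sqrt(2) - 1  # irrational in exact arithmetic; a float approximation here

z = cmath.exp(2j * math.pi * x)                   # z = e^{2 pi i x}
rotated = rotate(z, alpha)                         # R_alpha(z) = z * omega
expected = cmath.exp(2j * math.pi * (x + alpha))   # e^{2 pi i (x + alpha)}

print(abs(rotated - expected))  # difference between the two expressions for R_alpha(z)
print(abs(rotated))             # modulus of the image point
```

In floating-point arithmetic the two expressions for $R_\alpha(z)$ agree only up to rounding, and the image has modulus 1 only up to rounding, so the printed values should be read with that in mind.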
Note that $R_\alpha$ preserves the measure $\lambda$.

Iterations of the circle rotation

Let $2\pi\alpha$ be an angle. Start with a point $z = e^{2\pi i x} \in S$ and consider successive applications of the rotation map $R_\alpha$:

$R_\alpha^1(z) = R_\alpha(z) = e^{2\pi i (x + \alpha)}$
$R_\alpha^2(z) = R_\alpha \circ R_\alpha(z) = e^{2\pi i (x + 2\alpha)}$
$\ldots$
$R_\alpha^n(z) = R_\alpha \circ \ldots \circ R_\alpha(z) = e^{2\pi i (x + n\alpha)}$

The maps $R_\alpha^1, R_\alpha^2, \ldots, R_\alpha^n, \ldots$ are the iterations of $R_\alpha$. Given a point $z \in S$, the set

$\{ R_\alpha^1(z), R_\alpha^2(z), \ldots, R_\alpha^n(z), \ldots \}$

is called the orbit of $z$.

An orbit of a circle rotation

Let $R_\alpha$ be the circle rotation by the angle $2\pi\alpha$, where $\alpha$ is an irrational number. Pick a point $z$ on the circle $S$.

[Figure: the orbit of $z$ (or rather a finite subset of it).]

Since $\alpha$ is irrational, the orbit of every point is dense on the circle. This transformation satisfies a very weak form of independence called ergodicity.

Observables on the unit circle

Any measurable function $f : S \to \mathbb{R}$ is called a (scalar) observable of the measure space $(S, \mathcal{A}, \lambda)$. We will assume our observables to be absolutely integrable. A basic example of an observable: $f = 1_I$, where $I$ is an arc (or any other measurable set) on the circle.

[Figure: "observations" of the orbit points of a circle rotation, with an arc $I$ marked on the circle.]

Average number of orbit points visiting an arc

Let $R_\alpha$ be the circle rotation by the angle $2\pi\alpha$, where $\alpha$ is an irrational number. Let $I$ be an arc on the circle.

[Figure: the first $n$ orbit points of a circle rotation and their visits to $I$.]

The average number of visits to $I$:

$\frac{\# \{ j \in \{1, 2, \ldots, n\} : R_\alpha^j(z) \in I \}}{n}$

What does this look like for large enough $n$? In other words, is there a limit of these averages as $n \to \infty$?
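One way to get a feel for this question is to compute the averages numerically. The sketch below identifies the circle with $[0, 1)$ through $z = e^{2\pi i x}$, so that the rotation becomes $x \mapsto x + \alpha \bmod 1$ and an arc becomes an interval $[a, b)$. The function name `visit_frequency`, the rotation number, the arc and the starting point are all hypothetical choices made for illustration, and a float can only approximate an irrational $\alpha$.

```python
import math


def visit_frequency(x0: float, alpha: float, a: float, b: float, n: int) -> float:
    """Fraction of the first n orbit points x0 + j*alpha (mod 1), j = 1..n,
    that land in the arc [a, b) of the circle, identified with [0, 1)."""
    visits = 0
    for j in range(1, n + 1):
        x = (x0 + j * alpha) % 1.0
        if a <= x < b:
            visits += 1
    return visits / n


alpha = (math.sqrt(5) - 1) / 2   # hypothetical rotation number (float approximation)
a, b = 0.2, 0.5                  # hypothetical arc I, of normalized length 0.3

# Print the average number of visits for increasing n.
for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, visit_frequency(0.0, alpha, a, b, n))
```

Rerunning the loop with a different starting point `x0`, or with a rational `alpha` such as `0.5`, makes it easy to compare how the averages behave in each case.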
As $n \to \infty$, the average number of visits to $I$ converges:

$\frac{\# \{ j \in \{1, 2, \ldots, n\} : R_\alpha^j(z) \in I \}}{n} \to \lambda(I) = \int_S 1_I \, d\lambda$ for all points $z \in S$.

The number of visits can be expressed through the observable $1_I$:

$\# \{ j \in \{1, 2, \ldots, n\} : R_\alpha^j(z) \in I \} = \sum_{j=1}^{n} 1_I(R_\alpha^j(z))$.

Then the average number of visits to $I$ can be written:

$\frac{1_I(R_\alpha^1(z)) + 1_I(R_\alpha^2(z)) + \ldots + 1_I(R_\alpha^n(z))}{n} \to \lambda(I) = \int_S 1_I \, d\lambda$.

Measure preserving dynamical systems

A probability space $(X, \mathcal{B}, \mu)$ together with a transformation $T : X \to X$ define a measure preserving dynamical system if $T$ is measurable and it preserves the measure of any $\mathcal{B}$-measurable set:

$\mu(T^{-1} A) = \mu(A)$ for all $A \in \mathcal{B}$.

Ergodic dynamical system. For any $\mathcal{B}$-measurable set $A$ with $\mu(A) > 0$, the iterations $TA, T^2 A, \ldots, T^n A, \ldots$ fill up the whole space $X$, except possibly for a set of measure zero.

Ergodicity leads to some very, very weak form of independence.

Some examples of ergodic dynamical systems

1. The Bernoulli shift, which encodes sequences of independent, identically distributed random variables.
2. The circle rotation by an irrational angle.
3. The doubling map $T : [0, 1] \to [0, 1]$, $Tx = 2x \bmod 1$.
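To make the measure-preservation condition concrete, here is a short worked check for the doubling map. It assumes that $\mu$ is Lebesgue measure on $[0, 1]$ (the slides do not name the measure for this example) and takes $A = [a, b)$ with $0 \le a < b \le 1$, ignoring endpoints, which have measure zero.

```latex
% Preimage of an interval under the doubling map Tx = 2x mod 1
% (assumption: \mu is Lebesgue measure on [0,1]).
\[
  T^{-1}[a, b) \;=\; \left[\tfrac{a}{2}, \tfrac{b}{2}\right)
  \;\cup\; \left[\tfrac{a+1}{2}, \tfrac{b+1}{2}\right),
\]
\[
  \mu\bigl(T^{-1}[a, b)\bigr)
  \;=\; \frac{b - a}{2} + \frac{b - a}{2}
  \;=\; b - a
  \;=\; \mu\bigl([a, b)\bigr).
\]
```

The definition uses the preimage $T^{-1}A$ rather than the image $TA$ for a reason that this example shows: $T\bigl([0, \tfrac{1}{2})\bigr) = [0, 1)$, so the forward image of a set can have twice its measure.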
The pointwise ergodic theorem

Given: an ergodic dynamical system $(X, \mathcal{B}, \mu, T)$, and an absolutely integrable observable $f : X \to \mathbb{R}$, define the $n$-th Birkhoff sum

$S_n f(x) := f(Tx) + f(T^2 x) + \ldots + f(T^n x)$.

Then as $n \to \infty$,

$\frac{1}{n} S_n f(x) \to \int_X f \, d\mu$ for $\mu$-a.e. $x \in X$.

The law of large numbers

We have seen that if $X_1, X_2, \ldots, X_n, \ldots$ is a sequence of jointly independent, identically distributed copies of a scalar random variable $X$, and if we denote the corresponding sum process by $S_n := X_1 + X_2 + \ldots$