The Canonical Gaussian Measure on $\mathbb{R}^n$
1. Introduction

The main goal of this course is to study "Gaussian measures." The simplest example of a Gaussian measure is the canonical Gaussian measure $P_n$ on $\mathbb{R}^n$, where $n \ge 1$ is an arbitrary integer. The measure $P_n$ is one of the main objects of study in this course, and is defined as

$$P_n(A) := \int_A \gamma_n(x)\,\mathrm{d}x \quad \text{for all Borel sets } A \subseteq \mathbb{R}^n,$$

where

$$\gamma_n(x) := \frac{\mathrm{e}^{-\|x\|^2/2}}{(2\pi)^{n/2}} \qquad [x \in \mathbb{R}^n]. \tag{1.1}$$

The function $\gamma_1$ describes the famous "bell curve." In dimensions $n > 1$, the graph of $\gamma_n$ looks like a suitable "rotation" of the graph of $\gamma_1$.

We frequently drop the subscript $n$ from $P_n$ when it is clear which dimension we are in. We also choose and fix the integer $n \ge 1$ tacitly, consider the probability space $(\Omega, \mathcal{F}, P)$, where we have dropped the subscript $n$ from $P_n$ [as we will sometimes], and set $\Omega := \mathbb{R}^n$ and $\mathcal{F} := \mathcal{B}(\mathbb{R}^n)$. Throughout, we designate by $Z$ the random vector

$$Z_j(\omega) := \omega_j \quad \text{for all } \omega \in \mathbb{R}^n \text{ and } 1 \le j \le n. \tag{1.2}$$

Then $Z := (Z_1, \dots, Z_n)$ is a random vector of i.i.d. standard normal random variables on our probability space. In particular, $P_n(A) = P\{Z \in A\}$ for all Borel sets $A \subseteq \mathbb{R}^n$.

One of the elementary, though important, properties of the measure $P_n$ is that its "tails" are vanishingly small.

Lemma 1.1. As $t \to \infty$,

$$P_n\{x \in \mathbb{R}^n : \|x\| > t\} = \frac{2 + o(1)}{2^{n/2}\,\Gamma(n/2)}\, t^{n-2}\, \mathrm{e}^{-t^2/2}.$$

Proof. Since

$$S_n := \|Z\|^2 = \sum_{j=1}^n Z_j^2 \tag{1.3}$$

has a $\chi^2_n$ distribution,

$$P_n\{x \in \mathbb{R}^n : \|x\| > t\} = P\{S_n > t^2\} = \frac{1}{2^{n/2}\,\Gamma(n/2)} \int_{t^2}^\infty y^{(n-2)/2}\, \mathrm{e}^{-y/2}\,\mathrm{d}y$$

for all $t > 0$. Now apply l'Hôpital's rule of calculus. ∎

The following large-deviations estimate is one of the consequences of Lemma 1.1:

$$\lim_{t \to \infty} \frac{1}{t^2} \log P_n\{x \in \mathbb{R}^n : \|x\| > t\} = -\frac{1}{2}. \tag{1.4}$$

Of course, this is a weaker statement than Lemma 1.1, but it has the advantage of being "dimension independent." Dimension-independence properties play a prominent role in the analysis of Gaussian measures. Here, for example, (1.4) teaches us that the tails of $P_n$ behave roughly as do the tails of $P_1$ for all $n \ge 1$.

Still, many of the more interesting properties of $P_n$ are radically different from those of $P_1$ when $n$ is large. In low dimensions (say $1 \le n \le 3$) one can visualize the probability density function $\gamma_n$ from (1.1). Based on that, or other methods, one knows that in low dimensions most of the mass of $P_n$ lies near the origin. For example, an inspection of the standard normal table reveals that more than half of the total mass of $P_1$ is within one unit of the origin; in fact, $P_1[-1, 1] \approx 0.68268$.

In higher dimensions, however, the structure of $P_n$ can be quite different. Recall the random variable $S_n$ from (1.3), and apply Khintchine's weak law of large numbers XXX to see that $S_n/n$ converges in probability to one as $n \to \infty$.¹ This is another way to say that for every $\varepsilon > 0$,

$$\lim_{n \to \infty} P_n\left\{x \in \mathbb{R}^n : (1-\varepsilon)\, n^{1/2} \le \|x\| \le (1+\varepsilon)\, n^{1/2}\right\} = 1. \tag{1.5}$$

¹ In the present setting, it does not make sense to discuss almost-sure convergence, since the underlying probability space is $(\mathbb{R}^n, \mathcal{B}(\mathbb{R}^n), P_n)$.

The proof is short and can be reproduced right here: Let $\mathrm{E} := \mathrm{E}_n$ denote the expectation operator for $P := P_n$, $\mathrm{Var} := \mathrm{Var}_n$ the variance, etc. Since $\mathrm{Var}(S_n) = \sum_{j=1}^n \mathrm{Var}(Z_j^2) = 2n$, Chebyshev's inequality yields $P\{|S_n - \mathrm{E}(S_n)| > \varepsilon n\} \le 2/(\varepsilon^2 n)$. Because $1 - \varepsilon \le (1-\varepsilon)^{1/2}$ and $(1+\varepsilon)^{1/2} \le 1 + \varepsilon$ for all $\varepsilon \in (0, 1)$, this yields

$$P_n\left\{x \in \mathbb{R}^n : (1-\varepsilon)\, n^{1/2} \le \|x\| \le (1+\varepsilon)\, n^{1/2}\right\} \ge 1 - \frac{2}{\varepsilon^2 n}. \tag{1.6}$$

Thus we see that, when $n$ is large, the measure $P_n$ concentrates much of its total mass near the boundary of the centered ball of radius $n^{1/2}$, very far from the origin. A more careful examination shows that, in fact, very little of the total mass of $P_n$ is elsewhere when $n$ is large. The following theorem makes this statement more precise.
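Before turning to that theorem, a quick numerical illustration of the shell effect in (1.5) and (1.6) may help. The following Monte Carlo sketch is editorial rather than part of the notes; it assumes NumPy, and the choices of $\varepsilon$, the dimensions, the sample size, and the seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
eps, trials = 0.1, 10_000

for n in (2, 20, 200, 1000):
    # `trials` independent draws from P_n; each row is one point of R^n.
    x = rng.standard_normal((trials, n))
    norms = np.linalg.norm(x, axis=1)
    # Empirical mass of the shell (1-eps)*sqrt(n) <= ||x|| <= (1+eps)*sqrt(n).
    shell = np.mean(
        (norms >= (1 - eps) * np.sqrt(n)) & (norms <= (1 + eps) * np.sqrt(n))
    )
    # The Chebyshev bound (1.6); it is vacuous (negative) for small n.
    bound = 1 - 2 / (eps**2 * n)
    print(f"n = {n:4d}   shell mass ~ {shell:.4f}   bound (1.6): {bound:+.2f}")
```

For fixed $\varepsilon$, the empirical shell mass approaches one far faster than the $1 - 2/(\varepsilon^2 n)$ rate that (1.6) guarantees.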
Theorem 1.2 below is a simple example of a remarkable property of Gaussian measures that is known commonly as concentration of measure XXX. We will discuss this topic in more detail in due time.

Theorem 1.2. For every $\varepsilon > 0$,

$$P_n\left\{x \in \mathbb{R}^n : (1-\varepsilon)\, n^{1/2} \le \|x\| \le (1+\varepsilon)\, n^{1/2}\right\} \ge 1 - 2\,\mathrm{e}^{-n\varepsilon^2/2}. \tag{1.7}$$

Theorem 1.2 does not merely improve the crude bound (1.6). Rather, it describes an entirely new phenomenon in high dimensions. To wit, suppose we are working in dimension $n = 500$. When $\varepsilon = 0$, the left-hand side of (1.7) is equal to 0. But if we increase $\varepsilon$ slightly, say to $\varepsilon = 0.15$, then the left-hand side of (1.7) increases to a probability of at least $1 - 2\,\mathrm{e}^{-5.625} \approx 0.993$ (!). By comparison, (1.6) reports only the crude bound $\approx 0.822$ in this case; and for fixed $\varepsilon$, the guarantee of (1.6) improves only at rate $1/n$, whereas that of (1.7) improves exponentially fast as $n \to \infty$.

Proof. From now on, we let $\mathrm{E} := \mathrm{E}_n$ denote the expectation operator for $P := P_n$. That is,

$$\mathrm{E}(f) := \mathrm{E}_n(f) := \int f\,\mathrm{d}P_n = \int f\,\mathrm{d}P \quad \text{for all } f \in L^1(P_n).$$

Since $S_n = \sum_{j=1}^n Z_j^2$ has a $\chi^2_n$ distribution,

$$\mathrm{E}\,\mathrm{e}^{\lambda S_n} = (1 - 2\lambda)^{-n/2} \quad \text{for } -\infty < \lambda < 1/2, \tag{1.8}$$

and $\mathrm{E}\exp(\lambda S_n) = \infty$ when $\lambda \ge 1/2$.

We use the preceding as follows: For all $t > 1$ and $\lambda \in (0, 1/2)$,

$$P_n\left\{x \in \mathbb{R}^n : \|x\| > t\, n^{1/2}\right\} = P\left\{S_n > t^2 n\right\} = P\left\{\mathrm{e}^{\lambda S_n} > \mathrm{e}^{\lambda t^2 n}\right\} \le \mathrm{e}^{-\lambda t^2 n}\,(1 - 2\lambda)^{-n/2},$$

thanks to Chebyshev's inequality. Since the left-hand side is independent of $\lambda \in (0, 1/2)$, we may optimize the right-hand side over $\lambda \in (0, 1/2)$ to find that

$$P_n\left\{x \in \mathbb{R}^n : \|x\| > t\, n^{1/2}\right\} \le \exp\left(-n \sup_{0 < \lambda < 1/2}\left[\lambda t^2 + \tfrac{1}{2}\log(1 - 2\lambda)\right]\right) = \exp\left(-\frac{n}{2}\left[t^2 - 1 - 2\log t\right]\right),$$

the supremum being attained at $\lambda = \frac{1}{2}(1 - t^{-2})$. Since $\log t < t - 1$ when $t > 1$ (Taylor expansion), we have $t^2 - 1 - 2\log t > t^2 - 1 - 2(t - 1) = (t - 1)^2$. This and the previous inequality together yield

$$P_n\left\{x \in \mathbb{R}^n : \|x\| > t\, n^{1/2}\right\} \le \mathrm{e}^{-n(t-1)^2/2}. \tag{1.9}$$

When $t = 1$, the same inequality holds vacuously. Therefore, it suffices to consider the case that $t < 1$. In that case, we argue similarly and write

$$P_n\left\{x \in \mathbb{R}^n : \|x\| < t\, n^{1/2}\right\} = P\left\{\mathrm{e}^{-\lambda S_n} > \mathrm{e}^{-\lambda t^2 n}\right\} \le \mathrm{e}^{\lambda t^2 n}\,\mathrm{E}\,\mathrm{e}^{-\lambda S_n} = \mathrm{e}^{\lambda t^2 n}\,(1 + 2\lambda)^{-n/2} \qquad [\text{for all } \lambda > 0],$$

and hence, optimizing over $\lambda > 0$,

$$P_n\left\{x \in \mathbb{R}^n : \|x\| < t\, n^{1/2}\right\} \le \exp\left(-n \sup_{\lambda > 0}\left[-\lambda t^2 + \tfrac{1}{2}\log(1 + 2\lambda)\right]\right) = \exp\left(-\frac{n}{2}\left[t^2 - 1 - 2\log t\right]\right).$$

Since $t < 1$, a Taylor expansion yields $-2\log t \ge 2(1 - t) + (1 - t)^2$, whence

$$P_n\left\{x \in \mathbb{R}^n : \|x\| < t\, n^{1/2}\right\} \le \exp\left(-\frac{n}{2}\left[t^2 - 1 + 2(1 - t) + (1 - t)^2\right]\right) = \mathrm{e}^{-n(1-t)^2}.$$

This estimate and (1.9) together complete the proof: apply (1.9) with $t = 1 + \varepsilon$ and the last display with $t = 1 - \varepsilon$ to see that the complement of the event in (1.7) has probability at most $\mathrm{e}^{-n\varepsilon^2/2} + \mathrm{e}^{-n\varepsilon^2} \le 2\,\mathrm{e}^{-n\varepsilon^2/2}$. ∎

The preceding discussion shows that when $n$ is large, $P\{\|Z\| \approx n^{1/2}\}$ is extremely close to one. One can see from another appeal to the weak law of large numbers that for all $\varepsilon > 0$,

$$P_n\left\{(1-\varepsilon)\,\mu_p\, n^{1/p} \le \|Z\|_p \le (1+\varepsilon)\,\mu_p\, n^{1/p}\right\} \to 1 \quad \text{as } n \to \infty,$$

for all $p \in [1, \infty)$, where $\mu_p := (\mathrm{E}|Z_1|^p)^{1/p}$ and $\|\cdot\|_p$ [temporarily] denotes the $\ell^p$-norm on $\mathbb{R}^n$; that is,

$$\|x\|_p := \begin{cases} \left(\sum_{j=1}^n |x_j|^p\right)^{1/p} & \text{if } 1 \le p < \infty,\\[2pt] \max_{1 \le j \le n} |x_j| & \text{if } p = \infty, \end{cases}$$

for all $x \in \mathbb{R}^n$. These results suggest that the $n$-dimensional Gauss space $(\mathbb{R}^n, \mathcal{B}(\mathbb{R}^n), P_n)$ has unexpected geometry when $n \gg 1$. Interestingly enough, the case $p = \infty$ is different still. The following might help us anticipate which $\ell^p$-balls might carry most of the measure $P_n$ when $n \gg 1$.
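For instance, here is a small numerical sketch of these two regimes. It is editorial rather than part of the notes; it assumes NumPy, the sample size and seed are arbitrary, and it uses the standard absolute moments of a standard normal random variable, $\mu_1 = \sqrt{2/\pi}$, $\mu_2 = 1$, and $\mu_4 = 3^{1/4}$. Its last line anticipates the proposition that follows.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
z = rng.standard_normal(n)  # one draw of Z = (Z_1, ..., Z_n) under P_n

# mu_p = (E|Z_1|^p)^(1/p): closed forms for p = 1, 2, 4.
mu = {1.0: np.sqrt(2 / np.pi), 2.0: 1.0, 4.0: 3.0**0.25}

for p in (1.0, 2.0, 4.0):
    # By the weak law of large numbers, ||Z||_p / n^(1/p) ~ mu_p.
    lp_norm = np.sum(np.abs(z) ** p) ** (1.0 / p)
    print(f"p = {p}:  ||Z||_p / n^(1/p) = {lp_norm / n ** (1.0 / p):.4f}"
          f"  vs  mu_p = {mu[p]:.4f}")

# The case p = infinity grows like sqrt(2 log n), not like a power of n.
print("||Z||_inf =", np.abs(z).max(), "  sqrt(2 log n) =", np.sqrt(2 * np.log(n)))
```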
Proposition 1.3. Let $M_n := \max_{1 \le j \le n} |Z_j|$ or $M_n := \max_{1 \le j \le n} Z_j$. Then,

$$\lim_{n \to \infty} \frac{\mathrm{E}(M_n)}{\sqrt{2 \log n}} = 1.$$

It is possible to compute $\mathrm{E}(M_n)$ directly, but the preceding turns out to be more informative for our ends, and good enough for our present needs.

Proof. For all $t > 0$,

$$P\{M_n > t\} \le 1 \wedge \left[n\, P\{|Z_1| > t\}\right] \le 1 \wedge \left[2n\, \mathrm{e}^{-t^2/2}\right].$$

Now choose and fix some $\varepsilon \in (0, 1)$, write $a_n := \sqrt{2(1+\varepsilon)\log n}$, and use this bound:

$$\mathrm{E}\left[\max_{1 \le j \le n} |Z_j|\right] = \int_0^\infty P\left\{\max_{1 \le j \le n} |Z_j| > t\right\} \mathrm{d}t \le a_n + 2n \int_{a_n}^\infty \mathrm{e}^{-t^2/2}\,\mathrm{d}t \le a_n + \frac{2\, n^{-\varepsilon}}{a_n},$$

using $\int_a^\infty \mathrm{e}^{-t^2/2}\,\mathrm{d}t \le a^{-1} \mathrm{e}^{-a^2/2}$ and $\mathrm{e}^{-a_n^2/2} = n^{-(1+\varepsilon)}$. Thus $\mathrm{E}(M_n) \le (1 + o(1))\sqrt{2 \log n}$ for either definition of $M_n$, since $\max_j Z_j \le \max_j |Z_j|$; this implies half of the proposition.

For the other half, we use Lemma 1.1 (in dimension one) to see that

$$P\left\{\max_{1 \le j \le n} Z_j \le \sqrt{2(1-\varepsilon)\log n}\right\} = \left(1 - \frac{1}{\sqrt{2\pi}} \int_{\sqrt{2(1-\varepsilon)\log n}}^\infty \mathrm{e}^{-t^2/2}\,\mathrm{d}t\right)^n = \left(1 - n^{-(1-\varepsilon)+o(1)}\right)^n = o(1),$$

for example because $1 - x \le \exp(-x)$ for all $x \in \mathbb{R}$. Therefore,

$$\mathrm{E}\left[\max_{1 \le j \le n} Z_j\,;\; \max_{1 \le j \le n} Z_j \ge 0\right] \ge \mathrm{E}\left[\max_{1 \le j \le n} Z_j\,;\; \max_{1 \le j \le n} Z_j > \sqrt{2(1-\varepsilon)\log n}\right] \ge (1 + o(1)) \sqrt{2(1-\varepsilon)\log n}.$$

On the other hand,

$$\left|\mathrm{E}\left[\max_{1 \le j \le n} Z_j\,;\; \max_{1 \le j \le n} Z_j < 0\right]\right| \le \mathrm{E}\left[|Z_1|\,;\; \max_{1 \le j \le n} Z_j < 0\right] \le \left(P\left\{\max_{1 \le j \le n} Z_j < 0\right\}\right)^{1/2} = 2^{-n/2} = o(1)$$

as $n \to \infty$, thanks to the Cauchy-Schwarz inequality [and $\mathrm{E}(Z_1^2) = 1$]. The preceding two displays together imply that $\mathrm{E}(M_n) \ge (1 + o(1)) \sqrt{2(1-\varepsilon)\log n}$ as $n \to \infty$ for every $\varepsilon \in (0, 1)$, and prove the remaining half of the proposition. ∎

2. The Projective CLT

The characteristic function of the random vector $Z$ is

$$\mathrm{E}\,\mathrm{e}^{i t \cdot Z} = \mathrm{e}^{-\|t\|^2/2} \qquad [t \in \mathbb{R}^n].$$

If $M$ denotes any $n \times n$ orthogonal matrix, then

$$\mathrm{E}\,\mathrm{e}^{i t \cdot MZ} = \mathrm{E}\,\mathrm{e}^{i (M^{\top} t) \cdot Z} = \mathrm{e}^{-\|M^{\top} t\|^2/2} = \mathrm{e}^{-\|t\|^2/2}.$$

Therefore, the distribution of $MZ$ is $P_n$ as well. In particular, the normalized random vectors $Z/\|Z\|$ and $MZ/\|MZ\| = M(Z/\|Z\|)$ have the same distribution. Since the uniform distribution on the sphere $\mathbb{S}^{n-1} := \{x \in \mathbb{R}^n : \|x\| = 1\}$ is the unique probability measure on $\mathbb{S}^{n-1}$ that is invariant under orthogonal transformations, it follows that $Z/\|Z\|$ is distributed uniformly on $\mathbb{S}^{n-1}$. By the weak law of large numbers, $\|Z\|/\sqrt{n}$ converges to 1 in probability as $n \to \infty$.
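The fact that $Z/\|Z\|$ is uniform on $\mathbb{S}^{n-1}$ is also the standard recipe for simulating the uniform distribution on a high-dimensional sphere. Here is a minimal editorial sketch, assuming NumPy; the spot checks use the elementary facts that each coordinate of a uniformly distributed point on $\mathbb{S}^{n-1}$ has mean 0 and variance $1/n$.

```python
import numpy as np

def uniform_on_sphere(num_samples, n, rng=None):
    """Sample uniformly from S^{n-1} = {x in R^n : ||x|| = 1}.

    Relies on the rotation invariance of the canonical Gaussian measure:
    if Z has i.i.d. N(0, 1) coordinates, then Z/||Z|| is uniform on the sphere.
    """
    rng = np.random.default_rng() if rng is None else rng
    z = rng.standard_normal((num_samples, n))
    return z / np.linalg.norm(z, axis=1, keepdims=True)

pts = uniform_on_sphere(50_000, n=10, rng=np.random.default_rng(2))
print(pts.mean(axis=0))  # each coordinate has mean ~ 0
print(pts.var(axis=0))   # each coordinate has variance ~ 1/n = 0.1
```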