Theoretical Statistics. Lecture 20

Theoretical Statistics. Lecture 20

Theoretical Statistics. Lecture 20. Peter Bartlett 1. Recall: Functional delta method, differentiability in normed spaces, Hadamard derivatives. [vdV20] 2. Quantile estimates. [vdV21] 3. Contiguity. [vdV6] 1 Recall: Differentiability of functions in normed spaces Definition: φ : D E is Hadamard differentiable at θ D tangentially → ∈ to D D if 0 ⊆ φ′ : D E (linear, continuous), h D , ∃ θ 0 → ∀ ∈ 0 if t 0, ht h 0, then → k − k → φ(θ + tht) φ(θ) − φ′ (h) 0. t − θ → 2 Recall: Functional delta method Theorem: Suppose φ : D E, where D and E are normed linear spaces. → Suppose the statistic Tn : Ωn D satisfies √n(Tn θ) T for a random → − element T in D D. 0 ⊂ If φ is Hadamard differentiable at θ tangentially to D0 then ′ √n(φ(Tn) φ(θ)) φ (T ). − θ If we can extend φ′ : D E to a continuous map φ′ : D E, then 0 → → ′ √n(φ(Tn) φ(θ)) = φ (√n(Tn θ)) + oP (1). − θ − 3 Recall: Quantiles Definition: The quantile function of F is F −1 : (0, 1) R, → F −1(p) = inf x : F (x) p . { ≥ } Quantile transformation: for U uniform on (0, 1), • F −1(U) F. ∼ Probability integral transformation: for X F , F (X) is uniform on • ∼ [0,1] iff F is continuous on R. F −1 is an inverse (i.e., F −1(F (x)) = x and F (F −1(p)) = p for all x • and p) iff F is continuous and strictly increasing. 4 Empirical quantile function For a sample with distribution function F , define the empirical quantile −1 function as the quantile function Fn of the empirical distribution function Fn. −1 F (p) = inf x : Fn(x) p = X , n { ≥ } n(i) where i is chosen such that i 1 i − <p , n ≤ n and Xn(1),...,Xn(n) are the order statistics of the sample, that is, X X and n(1) ≤···≤ n(n) Xn(1),...,Xn(n) is a permutation of the sample (X1,...,Xn). 5 Quantiles Define φ : D[a,b] R as the pth quantile function φ(F )= F −1(p). → Here, D[a,b] is the set of cadlag functions on [a,b], considered as a subset of ℓ∞[a,b]: cadlag = continue a` droite, limite a` gauche = right continuous, with left limits. 6 Quantiles Theorem: If F D[a,b], • ∈ xp satisfies F (xp)= p, • ′ F is differentiable at xp, with F (xp) > 0, • then φ is Hadamard-differentiable at F tangentially to h D[a,b]: h is continuous at xp . { ∈ } Its Hadamard derivative is the (continuous) function ′ h(xp) φF (h)= ′ . −F (xp) 7 Quantiles Recall: −1 ′ p sx(F (p)) φ (sx F )= − F − f(F −1(p)) (sx F )(xp) = −′ . − F (xp) 8 Quantiles Theorem: For 0 <p< 1, • F differentiable at F −1(p), • F ′(F −1(p)) = f(F −1(p)) > 0, • n −1 −1 −1 1 1[Xi F (p)] p √n F (p) F (p) = ≤ − + oP (1) n − −√n f(F −1(p)) Xi=1 p(1 p) N 0, − . f 2(F −1(p)) 9 Quantiles Proof: Because x 1[x a]: a R is Donsker, { 7→ ≤ ∈ } conv x 1[x a]: a R is Donsker, hence { 7→ ≤ ∈ } Gn,F = √n(Fn F ) converges weakly in D[ , ] to an F -Brownian − −∞ ∞ bridge process GF = Gλ F . ◦ [Recall that Gλ is the standard uniform Brownian bridge.] The sample paths of GF are continuous at points where F is continuous. Now, φ : F F −1(p) is Hadamard-differentiable tangentially to the set 7→ D0 of cadlag functions that are continuous where F is continuous. And the G ′ limiting process F takes its values in this set D0. Furthermore, φF is defined and continuous everywhere in ℓ∞[ , ]. −∞ ∞ 10 Quantiles Hence, we can use the functional delta method: ′ √n(φ(Fn) φ(F )) = φ √n(Fn F ) + oP (1) − F − ′ G = φF ( n,F )+ oP (1). We can extend this result to the process √n(F −1 F −1), provided the n − differentiability conditions are satisfied over a set... 11 Quantiles Theorem: Suppose 0 <p <p < 1, • 1 2 [a,b]=[F −1(p ) ǫ,F −1(p )+ ǫ], for some ǫ> 0, • 1 − 2 F continuously differentiable and with positive derivative f on [a,b], • φ : D[a,b] ℓ∞[p ,p ] is defined by φ(G)= G−1. • → 1 2 Then φ is Hadamard differentiable at F tangentially to C[a,b], with h φ′ (h)= F −1. F − f ◦ (Recall: C[a,b] is the set of continuous functions on [a,b].) 12 Quantiles Theorem: For 0 <p <p < 1, • 1 2 [a,b]=[F −1(p ) ǫ,F −1(p )+ ǫ], for some ǫ> 0, and • 1 − 2 F continuously differentiable and with positive derivative f on [a,b], • G √n F −1 F −1 λ , n − f F −1 ◦ ∞ where the convergence is in ℓ [p1,p2], and Gλ is the standard Brownian bridge. (Recall: weak convergence in a metric space of functions is defined in terms of expectations of bounded continuous functions in the space.) 13 Contiguity Motivation: Suppose we wish to study the asymptotics of statistics Tn. Under the null hypothesis, say, Tn Pn, we can show that Tn T . What happens when ∼ the null hypothesis is not true? For instance, under the alternative hypothesis, Tn Qn. What can we say about the asymptotics? ∼ We can relate them through likelihood ratios (also called Radon-Nikodym derivatives): dP p n = n , dQn qn where pn and qn are the corresponding densities. For these to make sense, we need the ratio to exist, and in particular we need qn = 0 only if pn = 0, at least asymptotically. 14 Absolute Continuity Definition: 1. Q P (“Q is absolutely continuous wrt P ”) means A, ≪ ∀ P (A)=0= Q(A) = 0. ⇒ 2. P Q (“P and Q are orthogonal”) means ΩP , ΩQ, ⊥ ∃ P (ΩP ) = 1, Q(ΩP ) = 0, Q(ΩQ) = 1, P (ΩQ) = 0. 15 Absolute Continuity: Examples Example: 1. P = N(0, 1), Q = N(µ, σ2) with σ2 > 0. Then P (A) = 0 Q(A) = 0. Hence, P Q and Q P . ⇔ ≪ ≪ 2. P = N(0, 1), Q is uniform on [0, 1]. Then Q P but not P Q. ≪ ≪ 3. P = N(0, 1), Q is a mixture of a normal and a point mass at x: Q(x) > 0. Then P Q but not Q P . ≪ ≪ 4. P is uniform on [ 1/2, 1/2], Q is uniform on [0, 1]. Then neither − Q P nor P Q. ≪ ≪ 16 Absolute Continuity We can always decompose Q into a part that is absolutely continuous wrt P and a part that is orthogonal (singular): Suppose that P and Q have densities p and q wrt some measure µ. Define Qa(A)= Q (A p> 0 ) , ∩ { } Q⊥(A)= Q (A p = 0 ) . ∩ { } 17 Absolute Continuity Lemma: 1. Q = Qa + Q⊥, with Qa P and Q⊥P (Lebesgue decomposition) ≪ q 2. Qa(A)= dP . ZA p q 3. Q P Q = Qa Q(p = 0) = 0 dP = 1. ≪ ⇔ ⇔ ⇔ Z p 18 Absolute Continuity Proof: (1) is immediate from the definitions. (2): Qa(A)= qdµ ZA∩{p>0} q = pdµ ZA∩{p>0} p q = dP. ZA∩{p>0} p 19 Absolute Continuity (3): Q P Q = Qa ≪ ⇔ Q⊥ = 0 ⇔ Q(p = 0) = 0, from the definitions. ⇔ 20 Absolute Continuity Also, dQ = dQa + dQ⊥ Z Z Z q = dP + dQ⊥, Z p Z so q/pdP = 1 Q⊥ = 0. Z ⇔ 21 Likelihood ratios Write the likelihood ratio (= Radon-Nikodym derivative): dQ q = . dP p This is defined on ΩP = p> 0 , and it is P -almost surely unique. It does { } not depend on the choice of dominating measure µ that is used to define the densities p and q. 22 Likelihood ratios: Change of measure If Q P then Q = Qa, so we can write the Q-law of X : Ω Rk in ≪ → terms of the P -law of the random pair (X, dQ/dP ), via dQ E f(X)= E f(X) , Q P dP dQ Q(X A)= EP 1[X A] = vdPX,V (x, v), ∈ ∈ dP ZA×R where we have written the distribution under P of (X, V )=(X, dQ/dP ) as PX,V . This change of measure requires that Q is absolutely continuous wrt P . 23 Contiguity We are interested in an asymptotic version of this change of measure. That is, we know the asymptotics of Tn Pn, and we’d like to infer the ∼ asymptotics under an alternative sequence Qn. When can we do that? We clearly need an asymptotic version of absolute continuity. This is called contiguity. Definition: Qn ⊳Pn (“Qn is contiguous wrt Pn”) means, An, ∀ Pn(An) 0= Qn(An) 0. → ⇒ → 24.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    24 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us