Resampling-Based Control of the FDR

Problem Formulation Existing Methods New Method Theory & Practice Simulations Conclusions References Resampling-Based Control of the FDR Joseph P. Romano1 Azeem S. Shaikh2 and Michael Wolf3 1Departments of Economics and Statistics Stanford University 2Department of Economics University of Chicago 3Department of Economics University of Zurich Problem Formulation Existing Methods New Method Theory & Practice Simulations Conclusions References Outline 1 Problem Formulation 2 Existing Methods 3 New Method 4 Theory & Practice 5 Simulations 6 Conclusions 7 References Problem Formulation Existing Methods New Method Theory & Practice Simulations Conclusions References Outline 1 Problem Formulation 2 Existing Methods 3 New Method 4 Theory & Practice 5 Simulations 6 Conclusions 7 References Problem Formulation Existing Methods New Method Theory & Practice Simulations Conclusions References General Set-Up & Notation Data X =(X1,...,Xn) from distribution P Interest in parameter vector θ(P)= θ =(θ1,...,θs)′ The individual hypotheses concern the elements θi, for i = 1,...,s, and can be (all) one-sided or (all) two-sided One-sided hypotheses: Hi : θi θ0 i vs. H′ : θi > θ0 i ≤ , i , Two-sided hypotheses: Hi : θi = θ0 i vs. H′ : θi = θ0 i , i , Problem Formulation Existing Methods New Method Theory & Practice Simulations Conclusions References General Set-Up & Notation Data X =(X1,...,Xn) from distribution P Interest in parameter vector θ(P)= θ =(θ1,...,θs)′ The individual hypotheses concern the elements θi, for i = 1,...,s, and can be (all) one-sided or (all) two-sided One-sided hypotheses: Hi : θi θ0 i vs. H′ : θi > θ0 i ≤ , i , Two-sided hypotheses: Hi : θi = θ0 i vs. H′ : θi = θ0 i , i , Test statistic Tn i =(θˆn i θ0 i)/σˆn i or Tn i = θˆn i θ0 i /σˆn i , , − , , , | , − , | , σˆn i is a standard error for θˆn i or σˆn i 1/√n , , , ≡ pˆn,i is an individual p-value Problem Formulation Existing Methods New Method Theory & Practice Simulations Conclusions References The False Discovery Rate Consider s individual tests Hi vs. Hi′. False discovery proportion F = # false rejections; R = # total rejections F F FDP = 1 R > 0 = R { } max R,1 { } False discovery rate FDRP =EP(FDP) Goal: (strong) asymptotic control of the FDR at level α: limsupFDRP α for all P n ∞ ≤ → Problem Formulation Existing Methods New Method Theory & Practice Simulations Conclusions References Outline 1 Problem Formulation 2 Existing Methods 3 New Method 4 Theory & Practice 5 Simulations 6 Conclusions 7 References Problem Formulation Existing Methods New Method Theory & Practice Simulations Conclusions References Benjamini and Hochberg (1995) Stepup method: Ordered p-values:p ˆ pˆ ... pˆ n,(1) ≤ n,(2) ≤ ≤ n,(s) Let j = max j :p ˆ αj , where αj = jα/s ∗ n,(j) ≤ Reject H H (1),..., (j∗) Comments: Original proof assumes independence of the p-values Validity has been extended to certain dependence types (Benjamini and Yekutieli, 2001) Problem Formulation Existing Methods New Method Theory & Practice Simulations Conclusions References Modifications of BH (1995) Storey, Taylor and Siegmund (2004): Under sufficient conditions for BH (1995): s0 FDRP α where s0 = I(P) = # true hypotheses ≤ s | | { } Instead of αj = jα/s use αj = jα/sˆ0 with # pˆn i > λ + 1 sˆ = { , } for some 0 < λ < 1 0 1 λ − Proof assumes thep ˆn,i to be ‘almost independent’ Problem Formulation Existing Methods New Method Theory & Practice Simulations Conclusions References Modifications of BH (1995) Bejamini, Krieger and Yekutieli (2006): Step 1: Apply the BH (1995) procedure at nominal level α∗ = α/(1 + α) Let r denote the number of rejected hypotheses (a) If r = 0, reject nothing, and stop (b) If r = s, reject everything, and stop (c) Otherwise, continue Step 2: Apply the BH (1995) procedure at nominal level α Instead of αj = jα/s use αj = jα/sˆ0 with sˆ0 = s r − Proof assumes independence of thep ˆn,i, but simulations show robustness against various dependence structures Problem Formulation Existing Methods New Method Theory & Practice Simulations Conclusions References Outline 1 Problem Formulation 2 Existing Methods 3 New Method 4 Theory & Practice 5 Simulations 6 Conclusions 7 References Problem Formulation Existing Methods New Method Theory & Practice Simulations Conclusions References Basic Idea (Troendle, 2000) For any stepdown procedure with critical values c1,...,cs: F 1 FDRP = EP = ∑ EP[F R = r]P R = r max R,1 1 r s r | { } { } ≤ ≤ with P R = r = P Tn,(s) cs,...,Tn,(s r+1) cs r+1,Tn,(s r) < cs r { } { ≥ − ≥ − − − } Problem Formulation Existing Methods New Method Theory & Practice Simulations Conclusions References Basic Idea (Troendle, 2000) For any stepdown procedure with critical values c1,...,cs: F 1 FDRP = EP = ∑ EP[F R = r]P R = r max R,1 1 r s r | { } { } ≤ ≤ with P R = r = P Tn,(s) cs,...,Tn,(s r+1) cs r+1,Tn,(s r) < cs r { } { ≥ − ≥ − − − } If all false hypotheses are rejected with p. 1, then with p. 1: → → r s + s0 FDR = ∑ − (1) P r s s0+1 r s − ≤ ≤ P Tn,s0:s0 cs0 ,...,Tn,s r+1:s0 cs r+1,Tn,s r:s0 < cs r × { ≥ − ≥ − − − } Here Tn,r:t is the rth largest of the test statistics Tn,1,...,Tn,t, and we assume w.l.o.g. that I(P)= 1,...,s0 . { } Problem Formulation Existing Methods New Method Theory & Practice Simulations Conclusions References Basic Idea (continued) Goal: Bound (1) above by α for any P, at least asymptotically In particular, this must be ensured for any 1 s0 s. ≤ ≤ First, consider any P such that s0 = 1: 1 Then (1) reduces to P Tn 1:1 c1 s { , ≥ } 1 And so c1 = inf x R : P Tn 1:1 x α { ∈ s { , ≥ }≤ } Problem Formulation Existing Methods New Method Theory & Practice Simulations Conclusions References Basic Idea (continued) Goal: Bound (1) above by α for any P, at least asymptotically In particular, this must be ensured for any 1 s0 s. ≤ ≤ First, consider any P such that s0 = 1: 1 Then (1) reduces to P Tn 1:1 c1 s { , ≥ } 1 And so c1 = inf x R : P Tn 1:1 x α { ∈ s { , ≥ }≤ } Next, consider any P such that s0 = 2. Then (1) reduces to: 1 2 s 1 P Tn,2:2 c2,Tn,1:2 < c1 + s P Tn,2:2 c2,Tn,1:2 c1 − { ≥ } { ≥ ≥ } And so c2 is the smallest x R for which 1 ∈ 2 α s 1 P Tn,2:2 x,Tn,1:2 < c1 + s P Tn,2:2 x,Tn,1:2 c1 − { ≥ } { ≥ ≥ }≤ Problem Formulation Existing Methods New Method Theory & Practice Simulations Conclusions References Basic Idea (continued) Goal: Bound (1) above by α for any P, at least asymptotically In particular, this must be ensured for any 1 s0 s. ≤ ≤ First, consider any P such that s0 = 1: 1 Then (1) reduces to P Tn 1:1 c1 s { , ≥ } 1 And so c1 = inf x R : P Tn 1:1 x α { ∈ s { , ≥ }≤ } Next, consider any P such that s0 = 2. Then (1) reduces to: 1 2 s 1 P Tn,2:2 c2,Tn,1:2 < c1 + s P Tn,2:2 c2,Tn,1:2 c1 − { ≥ } { ≥ ≥ } And so c2 is the smallest x R for which 1 ∈ 2 α s 1 P Tn,2:2 x,Tn,1:2 < c1 + s P Tn,2:2 x,Tn,1:2 c1 − { ≥ } { ≥ ≥ }≤ And so forth ... Problem Formulation Existing Methods New Method Theory & Practice Simulations Conclusions References Estimation of the ci Since P is unknown, so are the ‘ideal’ critical values ci. We sugggest a bootstrap method to estimate the ci: Pˆ n is an unrestricted estimate of P with θi(Pˆ n)= θˆn,i ˆ X∗ is generated from Pn and the Tn∗,i are computed from X∗, but centered at θˆn,i rather than at θ0,i E.g., for one-sided testing: T =(θˆ θˆn i)/σˆ n∗,i n∗,i − , n∗,i Problem Formulation Existing Methods New Method Theory & Practice Simulations Conclusions References Estimation of the ci Since P is unknown, so are the ‘ideal’ critical values ci. We sugggest a bootstrap method to estimate the ci: Pˆ n is an unrestricted estimate of P with θi(Pˆ n)= θˆn,i ˆ X∗ is generated from Pn and the Tn∗,i are computed from X∗, but centered at θˆn,i rather than at θ0,i E.g., for one-sided testing: T =(θˆ θˆn i)/σˆ n∗,i n∗,i − , n∗,i Important detail: The ordering of the original Tn,i determines the ‘true’ null hypotheses in the bootstrap world The permutation k1,...,ks of 1,...,s is defined such that { } { } Tn,k1 = Tn,(1),...,Tn,ks = Tn,(s) Then T is the rth smallest of the statistics T ,...,T n∗,r:t n∗,k1 n∗,kt Problem Formulation Existing Methods New Method Theory & Practice Simulations Conclusions References Estimation of the ci (continued) Start with c1: 1 cˆ1 = inf x R : Pˆ n T x α { ∈ s { n∗,1:1 ≥ }≤ } Problem Formulation Existing Methods New Method Theory & Practice Simulations Conclusions References Estimation of the ci (continued) Start with c1: 1 cˆ1 = inf x R : Pˆ n T x α { ∈ s { n∗,1:1 ≥ }≤ } Then move on to c2: cˆ2 is the smallest x R for which 1 ∈ 2 Pˆ n T∗ x,T∗ < cˆ1 + Pˆ n T∗ x,T∗ cˆ1 α s 1 { n,2:2 ≥ n,1:2 } s { n,2:2 ≥ n,1:2 ≥ }≤ − Problem Formulation Existing Methods New Method Theory & Practice Simulations Conclusions References Estimation of the ci (continued) Start with c1: 1 cˆ1 = inf x R : Pˆ n T x α { ∈ s { n∗,1:1 ≥ }≤ } Then move on to c2: cˆ2 is the smallest x R for which 1 ∈ 2 Pˆ n T∗ x,T∗ < cˆ1 + Pˆ n T∗ x,T∗ cˆ1 α s 1 { n,2:2 ≥ n,1:2 } s { n,2:2 ≥ n,1:2 ≥ }≤ − And so forth ..

Resampling-Based Control of the FDR

Details

Download

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

Support