Testing Multiple Hypotheses and False Discovery Rate: Models, Inference, and Algorithms Primer
Manuel A. Rivas, Broad Institute

Assume that we wish to examine the association between a response and m different covariates. When m tests are performed, the aim is to decide which of the nulls should be rejected.
This table shows the possibilities when m tests are performed and K are flagged as requiring further attention:

             Not flagged    Flagged    Total
    H0       A              B          m0
    H1       C              D          m1
    Total    m - K          K          m

Here:
- m0 is the number of true nulls and m1 is the number of true alternatives;
- B is the number of type I errors (true nulls that were flagged);
- C is the number of type II errors (true alternatives that were not flagged).
Each of these quantities is unknown. The aim is to select a rule on the basis of some criterion, and this in turn will determine K. To illustrate the multiple testing problem we focus on GWAS as an example, where we typically test the null hypothesis

    H_0 : \beta = 0,

i.e. the effect of the genetic variant is 0. In a single-test situation the historical emphasis has been on control of the type I error rate (false positives). In a multiple testing situation there are a variety of criteria that may be considered:

Frequentist analysis
1. Bonferroni method
2. Šidák correction
3. Benjamini and Hochberg (FDR)
4. Storey (FDR)

Bayesian analysis
1. Bayesian Bonferroni-type correction
2. Mixture models
3. Matthew Stephens' FDR approach
Frequentist analysis

Family-wise error rate (FWER): the probability of making at least one type I error,

    \mathrm{FWER} = P(B \geq 1 \mid H_1 = 0, \ldots, H_m = 0).
Bonferroni method

Let B_i be the event that the i-th null is incorrectly rejected, so that the event that B, the random variable representing the number of incorrectly rejected nulls, is at least one corresponds to the union of these events:

    \{B \geq 1\} = \bigcup_{i=1}^{m} B_i.

With a common level α* for each test, the family-wise error rate is

    \alpha_F = P(B \geq 1 \mid H_1 = 0, \ldots, H_m = 0)
             = P\Big( \bigcup_{i=1}^{m} B_i \,\Big|\, H_1 = 0, \ldots, H_m = 0 \Big)
             \leq \sum_{i=1}^{m} P(B_i \mid H_1 = 0, \ldots, H_m = 0)
             = m \alpha^*.

The Bonferroni method takes

    \alpha^* = \alpha_F / m

to give FWER ≤ α_F.
This is the preferred approach for GWAS, where to control the FWER at a level α_F = 0.05 with m = 1,000,000 tests, we would take

    \alpha^* = 0.05 / 1{,}000{,}000 = 5 \times 10^{-8}.
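As a minimal sketch, the correction and its application can be written in a few lines of Python (the p-values below are invented for illustration, not from the source):

```python
# Bonferroni correction: the per-test threshold alpha_F / m controls
# the FWER at alpha_F regardless of dependence between the tests.
def bonferroni_threshold(alpha_f, m):
    """Per-test significance level for FWER control at alpha_f."""
    return alpha_f / m

m = 1_000_000
alpha_star = bonferroni_threshold(0.05, m)  # 5e-8, the usual GWAS threshold

# Flag any p-value below the corrected per-test threshold.
pvals = [3e-9, 4.2e-8, 6e-8, 0.01]  # hypothetical p-values
flagged = [p < alpha_star for p in pvals]
print(flagged)  # [True, True, False, False]
```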
Šidák correction

This overcomes the conservatism introduced by the union-bound inequality. If the test statistics are independent,

    P(B \geq 1) = 1 - P(B = 0)
                = 1 - P\Big( \bigcap_{i=1}^{m} B_i^c \Big)
                = 1 - \prod_{i=1}^{m} P(B_i^c)
                = 1 - (1 - \alpha^*)^m,

and setting this equal to α_F gives

    \alpha^* = 1 - (1 - \alpha_F)^{1/m}.

In GWAS, assuming the 1,000,000 tests were independent, this would change the p-value threshold slightly, to 5.13 × 10^{-8}.
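A sketch of the Šidák threshold in Python, assuming the independence condition above:

```python
# Sidak correction: solve 1 - (1 - alpha*)^m = alpha_F for alpha*.
def sidak_threshold(alpha_f, m):
    """Per-test level giving exact FWER alpha_f under independence."""
    return 1.0 - (1.0 - alpha_f) ** (1.0 / m)

m = 1_000_000
alpha_star = sidak_threshold(0.05, m)
print(alpha_star)  # about 5.13e-8, slightly less stringent than 5e-8
```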
False Discovery Rate (FDR)

A simple way to overcome the conservative nature of FWER control is to increase α_F.

One measure to calibrate a procedure is via the expected number of false discoveries:

    \mathrm{EFD} = m_0 \times \alpha^* \leq m \times \alpha^*,

recalling that m_0 is the number of true nulls.

For example, we could specify α* such that EFD ≤ 1 by choosing

    \alpha^* = 1/m,

i.e. α* = 1 × 10^{-6} for a GWAS with 1,000,000 markers.
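The calibration can be checked numerically; since m0 is unknown in practice, the values below are assumed purely for illustration:

```python
# Expected false discoveries: each true null is flagged with
# probability alpha* (p-values are uniform under the null), so
# EFD = m0 * alpha* <= m * alpha*.  With alpha* = 1/m the bound is 1.
m = 1_000_000
alpha_star = 1.0 / m
for m0 in (m, m // 2):          # hypothetical numbers of true nulls
    efd = m0 * alpha_star
    print(m0, efd)              # EFD = m0 / m here, at most 1
```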
We introduce the false discovery proportion (FDP) as the proportion of incorrect rejections:

    \mathrm{FDP} = B / K,

where B is the number of type I errors and K is the number of tests flagged for additional attention.

The false discovery rate (FDR) is the expected proportion of rejected nulls that are actually true:

    \mathrm{FDR} = E[\mathrm{FDP}] = E[B/K \mid K > 0] \, P(K > 0).
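To make these definitions concrete, here is a small simulation (all settings are invented for illustration): true nulls get Uniform(0,1) p-values, true alternatives get artificially small p-values, and the FDP B/K at a fixed threshold is averaged over replicates to estimate the FDR.

```python
import random

# Monte Carlo estimate of the FDR at a fixed per-test threshold.
random.seed(0)
m0, m1, alpha_star, reps = 900, 100, 0.05, 2000
fdps = []
for _ in range(reps):
    null_p = [random.random() for _ in range(m0)]        # true nulls
    alt_p = [random.random() * 0.01 for _ in range(m1)]  # strong signals
    B = sum(p < alpha_star for p in null_p)              # type I errors
    K = B + sum(p < alpha_star for p in alt_p)           # total flagged
    fdps.append(B / K if K > 0 else 0.0)                 # FDP this replicate
fdr_hat = sum(fdps) / reps                               # estimate of E[FDP]
print(round(fdr_hat, 2))  # close to 45/145, i.e. roughly 0.31
```

Even at a threshold this strict for the nulls, nearly a third of the flagged tests are false discoveries, which motivates procedures that control the FDR directly.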
Benjamini and Hochberg (1995) procedure

For independent p-values, each of which is uniform under the null:

1. Let P_(1) ≤ P_(2) ≤ … ≤ P_(m) denote the ordered p-values.
2. Assume we would like FDR control at α = 0.05.
3. Let l_i = iα/m and R = max{ i : P_(i) ≤ l_i }; reject the nulls corresponding to the R smallest p-values P_(1), …, P_(R).
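A sketch of the procedure in Python (the p-values are invented for illustration):

```python
# Benjamini-Hochberg: reject the R smallest p-values, where
# R = max{ i : P_(i) <= i * alpha / m } (R = 0 means reject nothing).
def benjamini_hochberg(pvals, alpha=0.05):
    """Return the indices of hypotheses rejected by BH at level alpha."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by p-value
    R = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= rank * alpha / m:
            R = rank                                  # last rank meeting l_i
    return sorted(order[:R])

pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(pvals))  # [0, 1]: only the two smallest survive
```

Note that the maximum is taken over all ranks, so a p-value below its line l_i can rescue smaller p-values above theirs; this is what distinguishes BH from simply thresholding each p-value.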