Benjamini-Hochberg Method

P-values and statistical tests 7. Multiple test corrections Marek Gierliński Division of Computational Biology Hand-outs available at http://is.gd/statlec 2 Lets perform a test � times False positives True positives H0 true H0 false Total Number of Significant �� discoveries Not �� − � significant Total �( �) � Number of tests True negatives False negatives 3 Family-wise error rate �� = Pr (�� ≥ 1) 4 Probabilities of independent events multiply 1 1 � � and � = �(�)×�(�) 2 4 1 2 1 2 Toss two coins 1 2 1 2 1 2 5 Probabilities of either event is 1 − 1 − � = 1 1 � � or � = ? 2 4 1 � � = 1 − �(�) 2 1 1 2 4 � � and � = � � ×� � = Toss two = 1 − � � coins 1 1 � � or � = 1 − � � and � 2 4 1 = 2 � � or � = 1 − 1 − � � 1 1 2 4 1 = 3 �(� or �) = 1 − 1 − = 2 4 6 False positive probability H0: no effect Set � = 0.05 One test Probability of having a false positive �) = � Two independent tests Probability of having at least one false positive in either test = �= = 1 − 1 − � � independent tests Probability of having at least one false positive in any test F �F = 1 − 1 − � 7 Family-wise error rate (FWER) Probability of having at least one false positive among � tests; � = 0.05 F �F = 1 − 1 − � Jelly beans test, � = 20, � = 0.05 FWER =( �=( = 1 − 1 − 0.05 = 0.64 � 8 Bonferroni limit – to control FWER Probability of having at least one false positive Controlling FWER among � tests; � = 0.05 We want to make sure that F �F = 1 − 1 − � �� ≤ �J. Then, the FWER is controlled at level �′. Bonferroni limit FWER � �J = � � F � = 1 − 1 − ≈ � F � 9 Test data (1000 independent experiments) Random samples, size � = 5, from two normal distributions Negatives �( = 20 g �( = 970 Positives �) = 40 g �) = 30 10 One sample t-test, H0: � = 20 g No correction H0 true H0 false Total Significant 56 30 86 Not 914 0 914 significant Total 970 30 1000 30 positives 970 negatives 11 One sample t-test, H0: � = 20 g No correction H0 true H0 false Total Significant �� = 56 �� = 30 86 Not significant �� = 914 �� = 0 914 Total 970 30 1000 Family-wise error rate �� = Pr (�� ≥ 1) 56 �� = = 0.058 �� 56 + 914 False positive rate �� = = � �� + �� ( 0 �� = = 0 �� 0 + 30 False negative rate �� = = �) �� + �� 12 Bonferroni limit No correction Bonferroni � 0.05 5×10ST �� 0.058 0 �� 0 0.87 No correction Bonferroni H0 true H0 false Total H0 true H0 false Total Significant 56 30 86 Significant 0 4 4 Not Not 914 0 914 970 26 996 significant significant Total 970 30 1000 Total 970 30 1000 13 Holm-Bonferroni method Sort p-values �()), �(=), … , �(F) V Reject (1) if � ≤ ()) F V Reject (2) if � ≤ (=) FS) V Reject (3) if � ≤ (W) FS= ... � � V � � � Stop when � > � � − � + 1 (X) FSXZ) 1 0.003 0.05 0.01 0.01 2 0.005 0.05 0.01 0.0125 Holm-Bonferroni method 3 0.012 0.05 0.01 0.017 controls FWER 4 0.04 0.05 0.01 0.025 5 0.058 0.05 0.01 0.05 14 Holm-Bonferroni method No correction Bonferroni HB � 0.05 5×10ST 5×10ST �� 0.058 0 0 �� 0 0.87 0.87 30 smallest p-values (out of 1000) Holm-Bonferroni Bonferroni H true H false Total Holm-Bonferroni 0 0 Significant 0 4 4 Not 970 26 996 significant Total 970 30 1000 15 False discovery rate �� = � False discovery rate False positive rate False discovery rate �� = = �� = = �( �� + �� + �� The fraction of truly non-significant events The fraction of discoveries that are false we falsely marked as significant 56 56 �� = = 0.058 �� = = 0.65 970 86 No correction H0 true H0 false Total Significant �� = 56 �� = 30 86 Not significant �� = 914 �� = 0 914 Total 970 30 1000 17 BenJamini-Hochberg method Sort p-values �()), �(=), … , �(F) Find the largest �, such that � � ≤ � (X) � Reject all null hypotheses for � = 1, … , � � � � � � � � � � − � + 1 � 1 0.003 0.05 0.01 0.01 0.01 2 0.005 0.05 0.01 0.0125 0.02 Benjamini-Hochberg 3 0.012 0.05 0.01 0.017 0.03 method controls FDR 4 0.038 0.05 0.01 0.025 0.04 5 0.058 0.05 0.01 0.05 0.05 18 BenJamini-Hochberg method No correction Bonferroni HB BH � 0.05 5×10ST 3.7×10ST 0.0011 �� 0.058 0 0 0.0021 �� 0 0.87 0.87 0.30 �� 0.65 0 0 0.087 Benjamini-Hochberg Benjamini-Hochberg Bonferroni H true H false Total Holm-Bonferroni 0 0 Significant 2 21 23 Not 968 9 977 significant Total 970 30 1000 19 Controlling FWER and FDR Holm-Bonferroni Benjamini-HochBerg controls FWER controls FDR �� = Pr (�� ≥ 1) �� = �� + �� is a random variable Controlling FWER - guaranteed Controlling FDR - guaranteed �� ≤ �′ �[��] ≤ � 20 BenJamini-Hochberg procedure controls FDR Controlling FDR �[��] ≤ � � �� can be approXimated by the mean over many eXperiments Bootstrap: generate test data 10,000 times, perform 1000 t-tests for each set and find FDR for BH procedure 21 AdJusted p-values p-values can be “adjusted”, � so they compare directly X with �, and not � F adjusted p-values Problem: adjusted p-value � does not eXpress any � probability � Despite their popularity, I raw p-values recommend against using adjusted p-values 22 How to do this in R # Read generated data > d = read.table("http://tiny.cc/two_hypotheses", header=TRUE) > p = d$p # Holm-Bonferroni procedure > p.adj = p.adjust(p, "holm") > p[which(p.adj < 0.05)] [1] 1.476263e-05 2.662440e-05 3.029839e-05 # Benjamini-Hochberg procedure > p.adj = p.adjust(p, "BH") > p[which(p.adj < 0.05)] [1] 1.038835e-03 6.670798e-04 1.050547e-03 1.476263e-05 5.271367e-04 [6] 3.503370e-04 9.664789e-04 1.068863e-03 7.995860e-04 5.404476e-04 [11] 9.681321e-04 1.580069e-04 1.732747e-04 3.159954e-04 2.662440e-05 [16] 4.709732e-04 1.517964e-04 2.873971e-04 3.258726e-04 4.087615e-04 [21] 3.029839e-05 9.320438e-04 1.713309e-04 2.863402e-04 4.082322e-04 23 Estimating false discovery rate Control and estimate Controlling FDR Estimating FDR 1. FiX acceptable FDR limit, �, For each p-value, �_, form a point beforehand estimate of FDR, 2. Find a thresholding rule, so that ��`(�_) �[��] ≤ � 25 P-value distribution Data set 2 100% null 80% null, 20% alternative 26 P-value distribution Good Bad! 27 Definition of �( Proportion of null tests 80% null, 20% alternative # non−significant tests � = ( # all tests H1 H0 �( 28 Storey method Estimate �� Use the histogram for � > � to estimate �n((�) Do it for all 0 < � < 1 and then find the best �n( � Storey, J.D., 2002, JR Statist Soc B, 64, 479 29 Point estimate of FDR Point estimate, ��(�) 80% null, 20% alternative Arbitrary limit �, every �_ < � is significant. No. of significant tests is �(�) = # �_ < � No. of false positives is True positives ��(�) = ��(� False positives Hence, �� = ( ( �(�) � 30 Storey method Point estimate of FDR This is the so-called q-value: � �_ = min �� wxyz If monotonic � �_ = �� _ 31 How to do this in R > library(qvalue) # Read data set 1 > pvalues = read.table('http://tiny.cc/multi_FDR', header=TRUE) > p = pvalues$p # Benjamini-Hochberg limit > p.adj = p.adjust(p, method='BH') > sum(p.adj <= 0.05) [1] 216 # q-values > qobj = qvalue(p) > q = qobj$qv > summary(qobj) pi0: 0.8189884 Cumulative number of significant calls: <1e-04 <0.001 <0.01 <0.025 <0.05 <0.1 <1 p-value 40 202 611 955 1373 2138 10000 q-value 0 1 1 96 276 483 10000 local FDR 0 1 3 50 141 278 5915 > plot(qobj) > hist(qobj) 32 33 Interpretation of q-value � ≤ 0.0273 No. ID p-value q-value There are 106 tests with ... ... ... ... Expect 2.73% of false positives among 100 9249 0.000328 0.0266 these tests 101 8157 0.000328 0.0266 102 8228 0.000335 0.0269 Expect ~3 false positives if you set a limit 103 8291 0.000338 0.0269 of � ≤ 0.0273 or � ≤ 0.00353 104 8254 0.000347 0.0272 105 8875 0.000348 0.0272 106 8055 0.000353 0.0273 q-value tells you how many false 107 8235 0.000375 0.0284 positives you should expect after 108 8148 0.000376 0.0284 choosing a significance limit 109 8236 0.000381 0.0284 110 8040 0.000382 0.0284 ... ... ... ... 34 Q-values vs BenJamini-Hochberg When �n( = 1, both methods give the same result. BH adjusted �n( p-value For the same FDR, Storey’s method provides more q-value significant p-values. Hence, it is more powerful, especially for small �n(. But this depends on how good the estimate of �n( is. �n( - estimate of the proportion of null (non-significant) tests 35 Which multiple-test correction should I use? Benjamini-Hochberg No correction Bonferroni Storey False positives False negatives False positive False negative “Discover” effect where there is no effect Missed discovery Can be tested in follow-up eXperiments Once you’ve missed it, it’s gone Not hugely important in small samples Impossible to manage in large samples 36 Multiple test procedures: summary Method Controls Advantages Disadvantages Recommendation No FPR False negatives not Can result in Small samples, when correction inflated �� ≫ �� the cost of FN is high Bonferroni FWER None Lots of false Do not use negatives Holm- FWER Slightly better than Lots of false Appropriate only Bonferroni Bonferroni negatives when you want to guard against any false positives Benjamini- FDR Good trade-off On average, � of Better in large Hochberg between false your positives will samples positives and be false negatives Storey -- More powerful than Depends on a The best method, BH, in particular for good estimate of gives more insight small �n( �n( into FDR 37 Hand-outs available at http://tiny.cc/statlec BenJamini-Hochberg method Theoretical null distribution: uniform p-values � � = (X) � Generated data null distribution 39 BenJamini-Hochberg method Test data Null data 970 H0 1000 H0 20 H1 p-values � 100% FDR � � � 5% FDR � 40 BenJamini-Hochberg method Test data Null data 970 H0 1000 H0 20 H1 � � value value - - p p � � � 41.

Benjamini-Hochberg Method

Details

Download

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

Support