The Debiasing Gauntlet: Challenges for Publication Bias Mitigation
Alexander Etz and Joachim Vandekerckhove
UC Irvine
Psychonomics 2017 [12 min.]
Introduction
The published literature is biased:
  Easier to publish studies with p < .05
  Consequently, meta-analyses have a positivity bias, inflating our impression of empirical effect sizes
Publication bias
It has been argued that this renders meta-analysis essentially useless:
  All the old methods are in doubt. Even meta-analyses, which once were thought to yield a gold standard for evaluating bodies of research now seem somewhat worthless. “Meta-analyses are fucked,” Inzlicht warned me. If you analyze 200 lousy studies, you’ll get a lousy answer in the end. It’s garbage in, garbage out. [1]
[1] e.g., Engber (2016), https://tinyurl.com/EdoDepMA
Debiasing methods
Many methods exist that purport to “un-garbage-ify” published studies (PET-PEESE, P-Uniform, Trim & Fill, etc.)
They try to mitigate this bias in their own way, supported by copious simulation studies
Focus on comparing two here:
  Bayesian Bias Correction (BBC) [2]
  P-Curve [3]
[2] Guan and Vandekerckhove (2016), PB&R
[3] Simonsohn, Nelson, & Simmons (2014), JEP:G
Challenge
A successful debiasing method must yield accurate predictions about new replications once it is applied to the published literature
Challenge
1. Apply the debiasing method to real published studies
2. Generate predictions about real future data
3. Evaluate accuracy of the predictions
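The three steps can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the procedure used in the talk: `fixed_effect_estimate` is a plain inverse-variance weighted mean with no bias correction (a stand-in for any method, not BBC or P-Curve), `run_gauntlet` is a hypothetical helper name, every replication is predicted to equal the single corrected estimate, accuracy is scored by mean squared error, and all numbers are made up.

```python
from typing import Callable, Sequence, Tuple

# A "method" maps published effect sizes and their standard errors
# to a single estimated effect size.
Estimator = Callable[[Sequence[float], Sequence[float]], float]


def fixed_effect_estimate(effects: Sequence[float],
                          std_errors: Sequence[float]) -> float:
    """Naive inverse-variance weighted mean (no bias correction)."""
    weights = [1.0 / se ** 2 for se in std_errors]
    return sum(w * d for w, d in zip(weights, effects)) / sum(weights)


def run_gauntlet(method: Estimator,
                 published_effects: Sequence[float],
                 published_ses: Sequence[float],
                 replication_effects: Sequence[float]) -> Tuple[float, float]:
    # Step 1: apply the method to the published studies.
    estimate = method(published_effects, published_ses)
    # Step 2: predict the future data. Assumption: each replication
    # is predicted to show exactly the estimated effect size.
    predictions = [estimate] * len(replication_effects)
    # Step 3: evaluate the predictions by mean squared error.
    squared_errors = [(y - y_hat) ** 2
                      for y, y_hat in zip(replication_effects, predictions)]
    mse = sum(squared_errors) / len(squared_errors)
    return estimate, mse


# Made-up illustrative numbers, not data from the talk.
published_effects = [0.6, 0.5, 0.7]
published_ses = [0.2, 0.2, 0.2]
replication_effects = [0.0, 0.1, -0.1]

estimate, mse = run_gauntlet(fixed_effect_estimate, published_effects,
                             published_ses, replication_effects)
print(f"estimate = {estimate:.2f}, MSE = {mse:.2f}")
```

Passing the method in as a function keeps the evaluation identical across methods, so any difference in the score reflects the estimate alone.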
The studies
“Recently, it has been claimed that risk-taking and spending behavior may be triggered in part by evolutionarily driven motives”
“Sexual cues in advertising product categories such as casinos, fashion, jewelry, cosmetic surgery, cars, cigarettes, and alcohol, and the evidence for their effectiveness, suggests that controlling such primes in real-world settings might constitute a valuable intervention.”
The authors conduct a meta-analysis of the published literature, and subsequently perform various replications of selected studies
The data, sorted by ascending ES
The challenge
Accuracy criterion
We’ll judge success by mean squared error:
\[
\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
\]
We have n = 14 replications
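A worked note on this criterion, under an assumption the slides do not state: suppose a method predicts the same value \hat{y} for every replication. Writing \bar{y} for the mean of the observed replication effects, the MSE then splits into the squared distance between the prediction and the replication mean, plus the spread of the replications around their own mean:

```latex
\[
\frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y})^2
  = (\bar{y} - \hat{y})^2 + \frac{1}{n}\sum_{i=1}^{n} (y_i - \bar{y})^2,
\qquad
\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i .
\]
```

The cross term vanishes because \sum_{i=1}^{n} (y_i - \bar{y}) = 0. Under this reading, no constant prediction can score below the second term, and that floor is reached only when \hat{y} = \bar{y}.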
Results
Method      Effect size   Mean squared error   Relative accuracy
Target      -.01          .03                  -
Naive MA    .58           .38                  12.7
BBC         .44           .23                  7.7
P-Curve     .39           .18                  6.0
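The relative accuracy column appears to be each method's mean squared error divided by the target's mean squared error. That reading is inferred from the tabulated numbers rather than stated on the slide, so the check below is a sketch under that assumption. It uses the rounded values from the table; the variable names are illustrative.

```python
# Rounded values taken from the results table.
target_mse = 0.03
method_mse = {"Naive MA": 0.38, "BBC": 0.23, "P-Curve": 0.18}

# Assumed definition: relative accuracy = method MSE / target MSE,
# reported to one decimal place.
relative_accuracy = {name: round(mse / target_mse, 1)
                     for name, mse in method_mse.items()}

for name, ratio in relative_accuracy.items():
    print(f"{name}: {ratio}")
```

Read this way, a value of 1 would mean a method predicts the replications as well as the target does, and larger values mean worse predictions.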
Conclusions
Neither P-Curve nor BBC does very well here at predicting new unbiased replications
This is only one data point, but the pattern holds with other yoked biased/unbiased datasets
Other methods don’t do very well either
We cannot yet revert published garbage back into something valuable
Thank You
Thank you.
Funding for this project is provided by the NSF Graduate Research Fellowship Program, the NSF’s Methods, Measurements, and Statistics panel, as well as the John Templeton Foundation.