The Debiasing Gauntlet: Challenges for Publication Bias Mitigation

Alexander Etz and Joachim Vandekerckhove

UC Irvine

Psychonomics 2017 [12 min.]

Alexander Etz and Joachim Vandekerckhove (UC Irvine), The Debiasing Gauntlet, Psychonomics 2017 [12 min.]

Introduction

The published literature is biased: it is easier to publish studies with p < .05.
Consequently, meta-analyses have a positivity bias, inflating our impression of empirical effect sizes.
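The mechanism can be illustrated with a small simulation. This is a minimal sketch, not anything from the talk: the true effect, the per-group sample size, the number of studies, and the selection rule (significant results are always published, nonsignificant ones rarely) are all assumptions chosen for illustration.

```python
import math
import random
import statistics

N_PER_GROUP = 20   # hypothetical sample size per group
T_CRIT = 2.024     # two-sided .05 critical value of t with 38 df (2 * 20 - 2)


def simulate_study(true_d, rng):
    """Return (observed standardized effect, significant?) for one two-group study."""
    treatment = [rng.gauss(true_d, 1.0) for _ in range(N_PER_GROUP)]
    control = [rng.gauss(0.0, 1.0) for _ in range(N_PER_GROUP)]
    pooled_sd = math.sqrt(
        (statistics.variance(treatment) + statistics.variance(control)) / 2
    )
    d = (statistics.mean(treatment) - statistics.mean(control)) / pooled_sd
    t = d * math.sqrt(N_PER_GROUP / 2)  # equal-n two-sample t statistic
    return d, abs(t) > T_CRIT


def simulate_literature(true_d=0.2, n_studies=5000, p_publish_null=0.1, seed=1):
    """Return (mean effect over all studies, mean effect over published studies)."""
    rng = random.Random(seed)
    all_effects, published = [], []
    for _ in range(n_studies):
        d, significant = simulate_study(true_d, rng)
        all_effects.append(d)
        # Assumed selection rule: significant results are always published,
        # nonsignificant ones only with probability p_publish_null.
        if significant or rng.random() < p_publish_null:
            published.append(d)
    return statistics.mean(all_effects), statistics.mean(published)


all_mean, published_mean = simulate_literature()
print(f"all studies: {all_mean:.2f}, published only: {published_mean:.2f}")
```

Under these assumptions the average over all simulated studies stays close to the true effect, while the average over the published subset is noticeably larger. A naive meta-analysis of published work inherits that inflation.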

Publication bias

It has been argued that this renders meta-analysis essentially useless:

All the old methods are in doubt. Even meta-analyses, which once were thought to yield a gold standard for evaluating bodies of research now seem somewhat worthless. “Meta-analyses are fucked,” Inzlicht warned me. If you analyze 200 lousy studies, you’ll get a lousy answer in the end. It’s garbage in, garbage out. [1]

[1] e.g., Engber (2016), https://tinyurl.com/EdoDepMA

Debiasing methods

Many methods exist that purport to “un-garbage-ify” published studies (PET-PEESE, P-Uniform, Trim & Fill, etc.).
Each tries to mitigate this bias in its own way, supported by copious simulation studies.
We focus on comparing two here:
  Bayesian Bias Correction (BBC) [2]
  P-Curve [3]

[2] Guan and Vandekerckhove (2016), PB&R
[3] Simonsohn, Nelson, & Simmons (2014), JEP:G

Challenge

A successful debiasing method must yield accurate predictions about new replications once it is applied to the published literature

Challenge

1. Apply the debiasing method to real published studies
2. Generate predictions about real future data
3. Evaluate accuracy of the predictions
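The three steps can be sketched as a small harness that accepts any method mapping published studies to a single predicted effect size. Everything here is hypothetical: the function names, the `(effect size, sampling variance)` study layout, and the toy numbers. The inverse-variance weighted mean is only a stand-in for an uncorrected meta-analysis, not the talk's own implementation, and squared error is used as the score because that is the accuracy criterion the talk adopts.

```python
from typing import Callable, Sequence, Tuple

Study = Tuple[float, float]  # (effect size, sampling variance); assumed layout


def naive_fixed_effect(published: Sequence[Study]) -> float:
    """Inverse-variance weighted mean effect, with no bias correction."""
    weights = [1.0 / variance for _, variance in published]
    weighted_sum = sum(w * effect for w, (effect, _) in zip(weights, published))
    return weighted_sum / sum(weights)


def run_gauntlet(
    method: Callable[[Sequence[Study]], float],
    published: Sequence[Study],
    replications: Sequence[float],
) -> Tuple[float, float]:
    """Return (prediction, mean squared error against the replications)."""
    prediction = method(published)  # steps 1 and 2: apply method, get prediction
    squared_errors = [(y - prediction) ** 2 for y in replications]  # step 3
    return prediction, sum(squared_errors) / len(squared_errors)


# Toy, made-up numbers purely to exercise the harness.
toy_published = [(0.6, 0.04), (0.5, 0.04), (0.7, 0.02)]
toy_replications = [0.0, 0.1, -0.1]
prediction, mse = run_gauntlet(naive_fixed_effect, toy_published, toy_replications)
print(f"prediction = {prediction:.3f}, MSE = {mse:.3f}")
```

Any debiasing method that returns a single corrected estimate could be passed in place of `naive_fixed_effect`, which makes the comparison across methods a matter of swapping one argument.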

The studies

“Recently, it has been claimed that risk-taking and spending behavior may be triggered in part by evolutionarily driven motives”

“Sexual cues in advertising product categories such as casinos, fashion, jewelry, cosmetic surgery, cars, cigarettes, and alcohol, and the evidence for their effectiveness, suggests that controlling such primes in real-world settings might constitute a valuable intervention.”

The authors conduct a meta-analysis of the published literature and subsequently perform various replications of selected studies.

The data, sorted by ascending effect size (ES)

The challenge

Accuracy criterion

We’ll judge success by mean squared error:

\[
\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y})^2
\]

We have n = 14 replications.
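Assuming, as the unsubscripted prediction in the criterion suggests, that each method supplies one predicted effect size \hat{y} for all n replications, the criterion splits into two parts. Writing \bar{y} for the mean of the replication effect sizes (a symbol introduced here, not used in the slides), the standard identity is:

\[
\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y})^2
  = (\bar{y} - \hat{y})^2 + \frac{1}{n} \sum_{i=1}^{n} (y_i - \bar{y})^2
\]

The second term does not depend on \hat{y}, so it sets a floor that no single prediction can beat. Methods therefore differ only through the first term, the squared distance between their prediction and the replication mean.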

Results

Method      Effect size   Mean squared error   Relative accuracy
Target         -.01              .03                  -
Naive MA        .58              .38                12.7
BBC             .44              .23                 7.7
P-Curve         .39              .18                 6.0
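The relative accuracy column appears to be each method's mean squared error divided by the target's mean squared error. That reading is an inference from the reported numbers, not something stated on the slide. A quick check using only the table values:

```python
# (effect size, mean squared error) as reported in the results table.
reported = {
    "Target": (-0.01, 0.03),
    "Naive MA": (0.58, 0.38),
    "BBC": (0.44, 0.23),
    "P-Curve": (0.39, 0.18),
}

target_mse = reported["Target"][1]

# Assumed definition: relative accuracy = method MSE / target MSE.
relative_accuracy = {
    method: round(mse / target_mse, 1)
    for method, (_, mse) in reported.items()
    if method != "Target"
}

for method, ratio in relative_accuracy.items():
    print(f"{method}: {ratio}")
```

On this reading, a value of 1 would match the target, so even the best-performing method here has several times the target's squared error.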

Conclusions

Neither P-Curve nor BBC does very well here at predicting new unbiased replications.
This is only one data point, but the pattern holds with other yoked biased/unbiased datasets.
Other methods don’t do very well either.
We cannot yet revert published garbage back into something valuable.

Thank You

Thank you.

Funding for this project is provided by the NSF Graduate Research Fellowship Program, the NSF’s Methods, Measurements, and Statistics panel, as well as the John Templeton Foundation.
