2017-12-05

p-hacking: What it is, how to prevent it

This presentation is licensed under a CC-BY 4.0 license. You may copy, distribute, and use the slides in your own work, as long as you give attribution to the original author at each slide that you use.

PD Dr. Felix Schönbrodt
Ludwig-Maximilians-Universität München
www.nicebread.de
www.researchtransparency.org
@nicebread303

Much of the scientific literature, perhaps half, may simply be untrue.

Part of the problem is that no one is incentivised to be right.

Richard Horton, Editor of The Lancet

Researchers are not rewarded for being right, but rather for publishing a lot.

Nelson, Simmons, & Simonsohn (2012); Nosek, Spies, & Motyl (2012); Munafo (2016)

How can we publish a lot?

Psychology/Psychiatry: 92% positive results!

Fanelli, D. (2011). Negative results are disappearing from most disciplines and countries. Scientometrics, 90(3), 891–904. doi:10.1007/s11192-011-0494-7

p-hack your way to scientific glory!

p-hacking (n.). Tuning your data analysis so that you obtain a significant p-value in situations where it would otherwise have been non-significant.

Questionable research practices (QRPs) (n.). Practices of data collection and data analysis that are not outright fraud, but also not really kosher.

Tool 1: Outcome switching

http://compare-trials.org/
http://blogs.discovermagazine.com/neuroskeptic/2015/07/23/social-priming-money-for-nothing/#.VuKRSRi5KJM

• 2 outcome variables: false positive rate 5% ➙ 9.5%

• 5 outcome variables with one-sided testing: false positive rate 5% ➙ 41%
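The inflation described in these two bullets can be approximated with a back-of-the-envelope formula. The sketch below assumes that the outcome variables are statistically independent, which the slide does not state, so it matches the slide's figures only approximately: for two outcomes it gives 9.75% rather than 9.5% (a slightly lower value would be expected if the outcomes are correlated). The 41% figure is matched if choosing the direction of a one-sided test after seeing the data is treated as an effective per-test alpha of .10, which is also an assumption. The function name is hypothetical.

```python
def familywise_false_positive_rate(n_outcomes: int, alpha_per_test: float) -> float:
    """Chance of at least one significant result among independent tests
    when every null hypothesis is true."""
    return 1.0 - (1.0 - alpha_per_test) ** n_outcomes


# Two outcome variables, each tested at alpha = .05 (independence assumed).
two_outcomes = familywise_false_positive_rate(2, 0.05)

# Five outcome variables; a one-sided test whose direction is picked after
# the fact is treated here as an effective alpha of .10 per outcome.
five_outcomes_one_sided = familywise_false_positive_rate(5, 0.10)

print(f"2 outcomes, alpha = .05:           {two_outcomes:.4f}")
print(f"5 outcomes, effective alpha = .10: {five_outcomes_one_sided:.4f}")
```

The formula is the complement of "no test is significant", so the rate grows quickly with every additional outcome that is tried.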

Tool 2: Many conditions, report only those that worked

https://twitter.com/JoeHilgard/status/699693258386051072

Best-practice example: Transform a boring dissertation into a groundbreaking publication

Tool 3: Optional stopping

With long enough sampling and optional stopping, you are guaranteed to get a significant result!
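A small simulation can illustrate why repeated peeking inflates the false positive rate. This is a minimal sketch under stated assumptions, not the analysis from Armitage et al.: data come from a standard normal distribution with no true effect, significance is judged with a two-sided z-test with known standard deviation, and the researcher peeks after every 10 observations up to a maximum of 200. All function names and parameter values are illustrative.

```python
import math
import random


def is_significant(sample, z_crit=1.96):
    """Two-sided z-test of mean 0, assuming a known standard deviation of 1."""
    z = sum(sample) / math.sqrt(len(sample))
    return abs(z) > z_crit


def run_study(rng, batch_size, max_n, peek):
    """Simulate one study under the null; return True if it ends 'significant'."""
    sample = []
    while len(sample) < max_n:
        sample.extend(rng.gauss(0.0, 1.0) for _ in range(batch_size))
        if peek and is_significant(sample):
            return True  # stop collecting data as soon as the test is significant
    return is_significant(sample)


def false_positive_rate(peek, n_studies=2000, batch_size=10, max_n=200, seed=1):
    """Share of simulated null studies that end with a significant result."""
    rng = random.Random(seed)
    hits = sum(run_study(rng, batch_size, max_n, peek) for _ in range(n_studies))
    return hits / n_studies


fixed_n_rate = false_positive_rate(peek=False)
optional_stopping_rate = false_positive_rate(peek=True)
print(f"fixed n = 200:              {fixed_n_rate:.3f}")
print(f"peek every 10 observations: {optional_stopping_rate:.3f}")
```

Raising `max_n` in this sketch gives the peeking researcher more chances to stop on a significant result, so the rate keeps climbing as the allowed sample grows.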

Armitage, P., McPherson, C. K., & Rowe, B. C. (1969). Repeated significance tests on accumulating data. Journal of the Royal Statistical Society. Series A (General), 132, 235–244.

Tool 4: Analytical flexibility

http://crtt.flexiblemeasures.com by Malte Elson in psychology

Pro-Tip: Build the p-hacking into the software!

How bad is it?

http://shinyapps.org/apps/p-hacker/

Kahneman: Open Letter

I believe that you should collectively do something about this mess. I see a train wreck looming.

Daniel Kahneman, Nobel prize 2002

http://www.nature.com/polopoly_fs/7.6716.1349271308!/suppinfoFile/Kahneman%20Letter.pdf

Which part of published research can be independently replicated?

[Figure: stacked bar chart of the share of studies that could vs. could not be replicated, for Psychology (2015; n = 97), Economics* (2015; n = 67), Cancer research 1 (2011; n = 53), and Cancer research 2 (2012; n = 67). Percentage pairs shown in the chart: 51%/49%, 64%/36%, 79%/21%, 89%/11%.]

* The data on economics concern reproducibility, i.e. the attempt to get the same results when the original data analysis is applied to the original data set.

Open Science Collaboration (2015); Chang & Li (2015); Begley, C. G., & Ellis, L. M. (2012); Prinz, F., Schlange, T., & Asadullah, K. (2011).

90%: Yes

http://www.nature.com/news/1-500-scientists-lift-the-lid-on-reproducibility-1.19970

Retractions: +1000% in 10 years

“In the past decade, the number of retraction notices has shot up 10-fold.”

[Figure: bar chart of retraction notices per year, 2013 to 2015; values shown: 467, 500, 684.]

http://www.nature.com/news/2011/111005/full/478026a/box/2.html
http://retractionwatch.com/2016/03/24/retractions-rise-to-nearly-700-in-fiscal-year-2015-and-psst-this-is-our-3000th-post/
https://www.washingtonpost.com/news/speaking-of-science/wp/2016/04/01/when-scientists-lie-about-their-research-should-they-go-to-jail/

Scientific misconduct: +1200% in 4 years

U.S. Office of Research Integrity: Confirmed cases of scientific misconduct
2009-2011: 3
2012-2015: 36

https://ori.hhs.gov/case_summary

“Innovative, unprecedented, transformative!” +880% from 1974 to 2014

Groundbreaking!!!

Amazing!!

Enormous!!

Vinkers, C. H., Tijdink, J. K., & Otte, W. M. (2015). Use of positive and negative words in scientific PubMed abstracts between 1974 and 2014: retrospective analysis. BMJ, 351, h6467. http://doi.org/10.1136/bmj.h6467

Thesis:
• Our current incentives foster questionable research practices, which decrease the truth value of our shared knowledge.
• What is good for the individual careers of researchers leads to a collective fiasco.
• Researchers who do it right (i.e., high power, no QRPs, transparency) have a clear competitive disadvantage.

Anti-Thesis:
• Society pays us to generate valid and robust knowledge.
• Our incentives should be chosen in a way that they foster good science.
• Researchers who do it right should be supported and promoted.

How to prevent p-hacking

1. Reduce analytical degrees of freedom
2. Embrace analytical degrees of freedom
3. Change the incentives

Registered Reports https://cos.io/rr/


82 journals offer Registered Reports (by December 2017), including:

• AIMS Neuroscience
• Attention, Perception, and Psychophysics
• Cognition and Emotion
• Comprehensive Results in Social Psychology
• Cortex
• Drug and Alcohol Dependence
• Experimental Psychology
• Journal of Accounting Research
• Journal of Business and Psychology
• Journal of Personnel Psychology
• Journal of Media Psychology
• NFS Journal
• Perspectives on Psychological Science
• Royal Society Open Science
• Social Psychology
• Working, Aging and Retirement

Slides by Anne Scheel

Where can I preregister?

RPR: Registered Reports (✓)
  + "gold standard"
  – not offered by every journal

Internal only (–)
  + better planning of studies
  – not externally recognised
  comment: potential option for students' projects

UPR: private online repository (– / ✓, depends)
  + no scooping
  – not hack-safe (misconduct); doesn't prevent file-drawer effect
  comment: not encouraged

UPR: public online repository (public PR, e.g. OSF) (✓)
  + find collaborators, get supporting feedback; file-drawer safe
  – Scooping possible? (But: time-stamp proves date of creation; scooping prevented by embargo)
  comment: embargo possible (OSF: 4 years max)

! all UPRs can be done with and without a template (exception: AsPredicted is template-only)

Slides by Anne Scheel

no prereg: 57% success rate!
prereg: 8% success rate…

http://chrisblattman.com/2016/03/01/13719/
Kaplan, R. M., & Irvin, V. L. (2015). Likelihood of Null Effects of Large NHLBI Clinical Trials Has Increased over Time. PLoS ONE, 10(8), e0132382. http://doi.org/10.1371/journal.pone.0132382

Prevent p-hacking by reducing the data analysis to a single pipeline.

2. Embrace analytical degrees of freedom

Many names for the same idea …

• Sensitivity/robustness analysis
• Multiverse analysis (Steegen et al., 2016)
• Specification curve (Simonsohn et al., 2015)
• Vibration of effects (Patel et al., 2015)
• Ensemble approach (e.g. climatology)

➙ use a set of models with the same input data to produce a range of outcomes

cf. Boulesteix, Elsas, & Schönbrodt (in prep)

• A “multiverse analysis” (Steegen, Tuerlinckx, Gelman, & Vanpaemel, 2016): Report results for all plausible analytical decisions
• Check robustness of results: Do several analytical paths lead to comparable conclusions?
• Based on open data by Carney et al. (2010)
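The idea of reporting every plausible analytical path can be sketched in a few lines. The example below is a toy illustration only: it uses simulated null data (not the Carney et al. data), three hypothetical analytical decisions (which outcome, which outlier rule, which transformation), and a normal approximation for the two-group test. All names, options, and parameter values are assumptions made for this sketch.

```python
import itertools
import math
import random
import statistics


def identity(x):
    return x


def two_sided_p(group_a, group_b):
    """Two-group comparison using a normal approximation (illustrative only)."""
    se = math.sqrt(statistics.variance(group_a) / len(group_a)
                   + statistics.variance(group_b) / len(group_b))
    z = (statistics.mean(group_a) - statistics.mean(group_b)) / se
    return math.erfc(abs(z) / math.sqrt(2))


def trim(values, sd_cutoff):
    """Drop observations further than sd_cutoff SDs from the mean (None keeps all)."""
    if sd_cutoff is None:
        return values
    m, s = statistics.mean(values), statistics.stdev(values)
    return [v for v in values if abs(v - m) <= sd_cutoff * s]


# Hypothetical null data: two groups, two outcome measures, no true effect.
rng = random.Random(2017)
data = {
    (group, outcome): [rng.lognormvariate(0.0, 0.5) for _ in range(40)]
    for group in ("control", "treatment")
    for outcome in ("dv1", "dv2")
}

# Each entry is one analytical decision together with its plausible options.
choices = {
    "outcome": ["dv1", "dv2"],
    "outlier_cutoff": [None, 2.0, 3.0],
    "transform": [identity, math.log],
}

# Run every combination of decisions: 2 x 3 x 2 = 12 pipelines.
results = []
for outcome, cutoff, transform in itertools.product(*choices.values()):
    a = trim([transform(v) for v in data[("control", outcome)]], cutoff)
    b = trim([transform(v) for v in data[("treatment", outcome)]], cutoff)
    results.append((outcome, cutoff, transform.__name__, two_sided_p(a, b)))

n_significant = sum(p < 0.05 for *_, p in results)
print(f"{n_significant} of {len(results)} pipelines are significant at .05")
for outcome, cutoff, transform_name, p in sorted(results, key=lambda r: r[-1]):
    print(f"{outcome}  cutoff={cutoff}  transform={transform_name}  p={p:.3f}")
```

Reporting the full table, rather than the single smallest p-value, is what lets a reader judge whether a conclusion depends on one particular path.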

Of 54 plausible analyses exactly one was significant. Guess which one was reported in the original paper?

Open Letter by Dana Carney (2016)

http://faculty.haas.berkeley.edu/dana_carney/pdf_my%20position%20on%20power%20poses.pdf

Rohrer, J. M., Egloff, B., & Schmukle, S. C. (in press). Probing birth-order effects on narrow traits using Specification Curve Analysis. Psychological Science. https://osf.io/vg2un/

Prerequisite: Open Data

“that the data are made freely available to the public immediately after the research is completed, or after a few months.”

“The commitment […] of researchers to making research data available should in future be given greater weight when assessing scientific […] achievements.”

http://dfg.de/download/pdf/foerderung/programme/lis/ua_inf_empfehlungen_200901.pdf
http://www.dfg.de/download/pdf/foerderung/antragstellung/forschungsdaten/richtlinien_forschungsdaten.pdf
http://www.dgps.de/fileadmin/documents/Empfehlungen/Datenmanagement_deu.pdf
http://www.dgps.de/fileadmin/documents/Empfehlungen/Data_Management_eng.pdf

Journals with mandatory open data (or justification why not)

• Advances in Methods and Practices in Psychological Science (http://www.psychologicalscience.org/publications/ampps/ampps-submission-guidelines#DISC)
• Collabra: Psychology (https://www.collabra.org/about/research-integrity/)
• Experimental Psychology (http://econtent.hogrefe.com/doi/10.1027/1618-3169/a000355)
• Journal of Research in Personality (http://www.sciencedirect.com/science/article/pii/S0092656617300211)
• Judgment and Decision Making (http://journal.sjdm.org/)
• Journal of Cognition (https://www.journalofcognition.org/about/editorialpolicies/)
• PLOS ONE (http://blogs.plos.org/everyone/2017/05/08/making-progress-toward-open-data/)
• Royal Society Open Science (http://rsos.royalsocietypublishing.org/author-information#Open_data)
• Science (http://www.sciencemag.org/authors/science-editorial-policies)

Show robustness against p-hacking by computing all (sensible) analytical pipelines.

3. Change the incentives

Proximate incentives

• Why do we p-hack? To get a publishable result.
• Hypothesis: Publication as the driving force behind p-hacking.
  • Carter, Schönbrodt, Gervais, & Hilgard (2017): p-hacking (without publication bias) has minor impact; but quite problematic in combination with publication bias (see https://psyarxiv.com/9h3nu/)
• But also see Simonsohn (2016):
  • “P-hacking is easy to stop. File-drawering nearly impossible. Fortunately, while p-hacking is a real problem, file-drawering is not.”
  • http://datacolada.org/55

Ultimate incentives

The Department of Psychology at the Faculty of Human Sciences of the University of Cologne (UoC) seeks to appoint a

Full Professor (W3) of Social Psychology

to be filled as soon as possible.

The successful candidate is expected to have a record of excellence in social cognition, and/or related areas such as cognitive psychology or motivation science. The candidate is also expected to strongly contribute to the UoC’s Center for Social and Economic Behavior and the Social Cognition Center Cologne of the Department of Psychology. Both structures are part of UoC’s Key Profile Area II, “Behavioral Economic Engineering and Social Cognition”. The ideal candidate’s track record should show an excellent fit with these interrelated structures and a strong interest to bridge the fields of social cognition and behavioral economics.

The Department of Psychology aims for transparent and reproducible research (including Open Data, Open Materials, and Preregistrations). Applicants are asked to illustrate how they have pursued these goals in the past and/or how they plan to do so in the future.

We strongly encourage international applicants. Salaries and working conditions at the UoC, one of the German Universities of Excellence, meet international standards. Candidates are expected to be willing to learn the German language. The Faculties offer Bachelor, Master, and doctoral degrees. Courses are taught either in English or German.

Applicants will be hired in concordance with § 36 of the University Law of the State of North-Rhine Westphalia.

The UoC supports diversity, the multiplicity of perspectives, and equal opportunities. The University of Cologne particularly encourages applications from disabled persons. Disabled persons are given preference in case of equal qualification. Women are strongly encouraged to apply. Preferential treatment is given to women if their professional qualifications and abilities are equivalent to those of other applicants.

Applications with the usual documents (including vita, research statement, 5 most important publications, full list of publications and teaching experience, and diplomas) should be submitted via the University’s Academic Job Portal (https://berufungen.uni-koeln.de) until March 30th, 2017.

www.uni-koeln.de

+ 3 additional professorship job ads

https://osf.io/dbkva/

• 17 members of 10 disciplines: Psychology, computer science, statistics, geography, medicine, veterinary medicine, economics, …
• 3 entire faculties as members: Faculty of Medicine, Faculty of Veterinary Medicine, Faculty of Psychology and Educational Science
• Mission Statement:
  • Education (from PhD student to professor)
  • Meta-science research
  • Change the incentive structure
• http://www.osc.lmu.de

Thanks for your attention!