How Null Results Can Be Significant for Physics Education Research

PHYSICAL REVIEW PHYSICS EDUCATION RESEARCH 15, 020104 (2019) How null results can be significant for physics education research Luke D. Conlin,1 Eric Kuo,2 and Nicole R. Hallinen3 1Department of Chemistry and Physics, Salem State University, Salem, Massachusetts 01970, USA 2Learning Research & Development Center, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, USA 3College of Education, Temple University, Philadelphia, Pennsylvania 19122, USA (Received 1 October 2018; published 3 July 2019) [This paper is part of the Focused Collection on Quantitative Methods in PER: A Critical Examination.] A central aim of physics education research is to understand the processes of learning and use that understanding to inform instruction. To this end, researchers often conduct studies to measure the effect of classroom interventions on student outcomes. Many of these intervention studies have provided an empirical foundation of reformed teaching techniques, such as active engagement. However, many times there is not sufficient evidence to conclude that the intervention had the intended effect, and these null results often end up in the proverbial file drawer. In this paper, we argue that null results can make significant contributions to physics education research, even if the results are not statistically significant. First, we review social science and biomedical research that documents widespread publication bias against null results, exploring why it occurs and how it can hurt the field. We then present three cases from physics education research to highlight how studies that yield null results can contribute to our understanding of teaching and learning. Finally, we distill from these studies some general principles for learning from null results, proposing that we should evaluate them not on whether they reject the null hypothesis but according to their potential for generating new understanding. DOI: 10.1103/PhysRevPhysEducRes.15.020104 I. INTRODUCTION chance. Research studies showing that the associations between quantities were unlikely to have arisen by chance Which research findings are significant enough to be (a criterion commonly set at p<0.05) categorize these published? We propose a simple criterion: Findings are associations as statistically significant. Studies that fail to significant insofar as they advance the state of existing establish statistically significant associations are typically knowledge. We suspect that this criterion is general enough considered “null results.” that most researchers, regardless of discipline or method- Although finding statistically significant associations has ology, could likely agree with it in principle. Of course, advanced understanding in these fields, researchers often there could be many ways to advance the state of existing make a critical error by deciding that null results do not knowledge in practice. advance the state of existing knowledge. At times, when In quantitative research, one approach to advancing the researchers play the role of journal editor or reviewer, they state of existing knowledge is to establish associations may use the criterion of statistical significance to reject between measured quantities. Regardless of discipline, manuscripts containing null results. In their role as authors, finding such associations can help researchers understand researchers can also self-censor their own work, placing the mechanisms underlying the phenomena of study, null results into a file drawer never to be disseminated. allowing for empirically founded descriptions and predic- Recently, empirical studies have found that both of these tions. In fields like medicine, psychology, and education, scenarios have contributed to a widespread “publication establishing such associations quantitatively is often done bias” against null results [1,2], which at best limits the through null-hypothesis significance testing, which com- conclusions of the field and at worst brings many of its monly uses a critical probability value (p value) to decide findings into question [3]. whether or not observed associations likely arose by As an interdisciplinary science, physics education research (PER) incorporates research methods as well as results from many fields, including fields where this bias Published by the American Physical Society under the terms of against null results has been shown to exist. Without the Creative Commons Attribution 4.0 International license. Further distribution of this work must maintain attribution to critically examining these methods and results from other the author(s) and the published article’s title, journal citation, disciplines, PER is at risk of absorbing these threats to and DOI. research validity. The purpose of this article is to examine 2469-9896=19=15(2)=020104(13) 020104-1 Published by the American Physical Society CONLIN, KUO, and HALLINEN PHYS. REV. PHYS. EDUC. RES. 15, 020104 (2019) the bias against null results and to offer suggestions for how not match the prediction: the motion of the solar system, researchers can avoid falling into this counterproductive which was not taken into account in their design, could bias. Using three illustrative case studies from our own have been exactly right at the time of the experiment to work, we provide concrete examples of how null results can cancel out the motion of Earth relative to the solar system. advance the state of existing knowledge in educational Although they argued that this explanation seemed research. We distill several principles for designing and unlikely, they proposed to repeat the experiment every identifying research that produces useful null results, three months to rule out this explanation. Repeated although the principles apply more generally than just to experiments ultimately failed to find the hypothesized null results. We propose that learning how quantitative null effect. Michelson and Morley’snullresultwentonto results can advance the state of existing knowledge is part provide an empirical basis for Einstein’s special theory of of learning quantitative research methods. We hope this relativity, which rejected the existence of the ether and the discussion will provide useful guidance for researchers very notion of absolute velocity. (acting as authors or as reviewers) in assessing the scientific Michelson and Morley’s experiment provides a model significance of quantitative research findings, regardless of for how to handle a null result in a way that advances the their statistical significance. state of existing knowledge in a research field. The value of this experimental result was not that it matched the predicted effect but in how it fit coherently (or not) II. WHAT ARE NULL RESULTS AND WHAT DO with existing theories and empirical results. By analogy, WE DO WITH THEM? physics education research can similarly benefit from well- A. Learning from null results in physics: designed studies that produce null results. The Michelson-Morley experiment To illustrate what a null result is and how it can advance B. Null results in quantitative education research the state of current state of knowledge, we turn to one of the Like physics research, quantitative education research most influential experiments in the history of physics: the often seeks to establish specific associations between Michelson-Morley experiment. In 1887, Michelson and measured quantities. To that end, quantitative education Morley [4] published their attempt to measure the relative research relies heavily on null hypothesis significance motion between Earth and the luminiferous ether, the testing (NHST). In the formal logic of NHST, the theoretical medium through which light was presumed to researcher’s stated hypothesis (e.g., that two quantities travel. The prevailing belief of the time was that the ether are associated) is contrasted against a null hypothesis (i.e., was stationary in space as Earth moved through it. This that there is no association). The empirical evidence is could provide a reference frame with which to measure the then evaluated in light of the null hypothesis, using absolute velocity of Earth, since the speed of light mea- statistical tests to decide the likelihood that any measured sured by an observer on Earth would be transformed due to association arose due to random sampling (sampling that relative motion: light traveling parallel to the orbital error) from a null distribution. For example, in the context velocity of Earth would appear slower, and light traveling of a physics course, a researcher might hypothesize that antiparallel to the orbital velocity would appear faster. students’ surveyed expectations toward learning physics Using an interferometer, Michelson and Morley measured would positively correlate with their course grade. In this the phase shift between two beams of light traveling in case, they would compare their measured correlation perpendicular directions to determine the effect of this against the null hypothesis, which would predict zero relative motion and find supporting evidence for existing correlation between expectations and course grade. ether models. Although they made these measurements at Statistical tests can compute the probability (or p value) several different angles and different times of day, they did that the observed results or even more extreme results not find the predicted effect. They concluded that their would have been observed if the null hypothesis were true. measured phase shift was probably less than 1=40th of the For

How Null Results Can Be Significant for Physics Education Research

Experimental Evidence During a Global Pandemic

Sensitivity and Specificity. Wikipedia. Last Modified on 16 November 2013

A Randomized Control Trial Evaluating the Effects of Police Body-Worn

Why Replications Do Not Fix the Reproducibility Crisis: a Model And

Which Findings Should Be Published?∗

1 Maximizing the Reproducibility of Your Research Open

Paradoxical Conclusions in Randomised Controlled Trials with 'Doubly Null

(Salmo Salar) with Amoebic Gill Disease (AGD) Chlo

The Significance of Null Results in Physics Education Research

Using Bias Analysis to Address Misclassification in Epidemiology

Null Hypothesis Significance Testing: a Review of an Old and Continuing Controversy

ORBITA and Coronary Stents: a Case Study in the Analysis and Reporting