
Distributed under a Creative Commons Attribution - NonCommercial - ShareAlike 4.0 International License

International Journal of Research and Review www.gkpublication.in E-ISSN: 2349-9788; P-ISSN: 2454-2237

Review Article

The Quest for Statistical Significance: Ignorance, Bias and Malpractice of Research Practitioners

Joshua Abah Abah

Department of Science Education, University of Agriculture, Makurdi, Nigeria

ABSTRACT

There is a growing body of evidence on the prevalence of ignorance, biases and malpractice among researchers which questions the authenticity, validity and integrity of the knowledge being propagated in professional circles. The push for academic relevance and career advancement has driven some research practitioners into gross misconduct in the form of innocent ignorance, sloppiness, malicious intent and outright fraud. These, among other concerns around research handling and reporting, form the basis for this in-depth review. This discourse also draws attention to the recent official statement on the correct use of the p-value and the need for professional intervention in ensuring that the outcomes of research are neither erroneous nor misleading. The expositions in this review express cogent implications for institutions, supervisors, mentors and editors to promote high ethical standards and rigor in scientific investigations.

Keywords: Research, Research misconduct, Bias, P-value, Statistical significance, ANCOVA Assumptions.

INTRODUCTION
Research is an enterprise aimed at finding solutions and answers to existing problems. Research can be seen as an objective, systematic, controlled and critical activity planned and directed towards the discovery and development of dependable knowledge (Emaikwu, 2012). Literally, "research" means to "search again". It connotes patient study and scientific investigation wherein the researcher takes another, more careful look at data to discover all that can be known about the subject of the study (Bodla, 2017). Broadly, research entails bringing together some content that is of interest, some ideas that give meaning to that content, and some techniques or procedures by means of which those ideas and content can be studied (Deshmukh, n.d.). According to O'Donnell (2012), research can be defined as the creation of new knowledge and/or the use of existing knowledge in a new and creative way so as to generate new concepts, methodologies and understandings. This could include synthesis and analysis of previous research to the extent that it leads to new and creative outcomes. From all indications, research can be described as an organized mechanism for studying phenomena and testing hypotheses.

Research is an indispensable tool for growth and development in all fields of human endeavour. It has been a means of breaking forth into new frontiers in medicine, agriculture, banking, education, food security, sociology, literature, arts and the sciences. Outcomes of diverse researches across different disciplines constitute the fuel for the present scientific and technological advancement the world is witnessing.


The world today, being a "global village", is driven by the quest to know more, to venture into the unknown and make human existence much better than ever. As a result of this significance of research, it is gradually becoming a sub-discipline in itself within every discipline. This implies that within every field of study there is a prescribed way of doing research, broadly referred to as "research methodology".

Research methodology consists of learning how to adopt several common approaches when doing research, and how to conceive a research design (Jonker & Pennink, 2010). Methodology is a systematic plan for thinking and acting in the conduct of research work. Emaikwu (2012) maintains that scientific research methods must be verifiable, cumulative, ethical, theoretical and empirical. How well a research project is planned and how well the steps in the plan are integrated can make the difference between success and failure. In this respect, a plan consists of two general areas, namely research concepts and context, and research logistics (Congdon & Dunham, 1999), which are coordinated within a given time frame, culminating in the writing of a research report. The research report is the output of the entire research process made visible to a targeted audience and/or the public. For academics and researchers in universities, research centres, science laboratories and other research-generating agencies, the production of quality and relevant research reports is a measure of growth and a determinant of career and institutional relevance. Research reports are often published in professional journals, institutional bulletins, associations' notices and government agencies' gazettes. They can also be presented at workshops, seminars and conferences, where learned contributions, corrections and suggestions can be synthesized into the research process before publishing for public use. Such rigorous vetting is essential considering the fact that a published work is expected to be an addition to existing knowledge and a reference point for future studies.

In light of the ripple effect of research in the knowledge-generation circle, researchers and academic institutions place serious emphasis on research ethics. In the words of Norris (1997):

Research demands skepticism, commitment and detachment. To understand the object or domain of inquiry takes an intense degree of commitment and concentration. To remain open minded, alert to foreclosure and to sources of error needs some measure of detachment. As with other forms of art, research requires detachment from oneself, a willingness to look at the self and the way it influences the quality of data and reports; in particular research demands a capacity to accept and use criticism and to be self-critical in a constructive manner (p.173).

Ethical conduct, in general, refers to actions that one takes pride in according to his or her conscience and that live up to his or her responsibility as a member of society. Kim (2009) asserts that research ethics is a special social norm that researchers are obliged to abide by as a criterion of judgment, so that they do not operate against their professional integrity and so that they carry out socially responsible research activities. Ethical standards are set by professional associations, educational institutions, journal publishers and government regulatory agencies. It is likely that these organizations vary considerably in the attention they invest and the procedures they deploy to uphold research ethics (Johnson, Parker & Clements, 2001). Practices carried out by researchers outside these regulatory guidelines constitute research misconduct.

By definition, research misconduct entails fabrication, falsification or plagiarism in proposing, performing or reviewing research, or in reporting research results (OSTP, 2002). Research misconduct may occur if the conduct represents a significant departure from accepted practices; has been committed intentionally, knowingly or recklessly; and can be proven by a preponderance of evidence (Inzana, 2008).

The ramification of research misconduct has been broadened to include other serious deviations from the accepted guidelines of the scientific community for maintaining the integrity of the research record, and retaliation of any kind against a person who reported or provided information about suspected or alleged misconduct and who has not acted in bad faith (Fischer, n.d.). Among the three "cardinal sins" of research conduct, only plagiarism seems to be in the public eye, with the other two (falsification and fabrication) completely reduced to bare whispers. Falsification is the changing or omission of research results (data) to support claims, hypotheses and other data. Falsification can include the manipulation of research instrumentation, materials, processes, images or representations in a manner that distorts the data or "reads too much between the lines" (Schienke, 2017). On the other hand, fabrication is the construction or addition of data, observations or characterizations that never occurred in the gathering of data or running of experiments. According to Schienke (2017), fabrication can occur when "filling out" the rest of the runs and where claims are made based on incomplete or assumed results.

Kim (2009) explains why academics hardly raise their voice when discussing research ethics:

One of the biggest reasons for past negligence of research ethics is believed to be the public confidence in scientists or the confidence among researchers in the self-control system. As quantitative assessment of researchers becomes widespread and the commercial application of science and technology is growingly emphasized, we can no longer rely merely on the value-neutral and reasonable inclinations of scientists and the self-correcting system in science circles. Therefore, it is essential for us to contemplate what responsible conduct of research actually entails and fully establish research ethics as an integral part of our academic culture (p.1).

The pressure on academics to increase their number of publications in line with requirements for promotion and career growth has also contributed to this grave concern for research ethics. In the view of Mullane and Williams (2013), bias in research, where prejudice or selectivity introduces a deviation in outcome beyond chance, is a growing problem, probably amplified by the "first to publish" and "publish or perish" drives and, more recently, the monetization of science for personal gain. The matter is made worse by student researchers who often do not have the depth of experience and tenacity to match the scope of some sensitive research areas. The practice of polishing some of these students' "shallow" findings for publication without rigorous checks by supervisors is in itself an assault on quality. The outcome of such practice is the proliferation of ignorance, personal biases and malpractice in the name of research. The current mess being made of statistical approaches, and the unsubstantiated significant results assembled by so-called "research analysts" which are difficult to decipher, constitute a major cause for worry among the few who are still interested in classical statistical methods.

The problem under consideration is a widespread one and not unique to any specific field of practice. This implies that the emphasis on integrity and quality that is intended in this work may not be very useful if restricted, for instance, to mathematics education. Thus, a multidisciplinary approach is adopted here, drawing on an in-depth background in the mathematical sciences and modern statistical computing. The role of statistical analysis in research is first presented. This is followed by discussions on ignorance, bias and malpractice among research practitioners. By "research practitioners" this discourse implies all stakeholders involved in the process of producing research reports, including the researcher, the supervisors (where applicable), the data analyst and vetting authorities. A final section of this essay focuses on the place of professional intervention in improving the integrity of research works.

The Role of Statistical Analysis in Research
In order to investigate phenomena, researchers need to gather information about the phenomena in a planned manner. Such investigations lead to the generation of research data. Data itself is the collected factual material commonly accepted in the scientific community as necessary to validate research outcomes. Research data is data that is collected, observed or created for purposes of analysis to produce original research results (Boston University Libraries, n.d.). Research data is often obtained in raw form and requires statistics to bring out its essence and interpretation. Emaikwu (2012) provides a robust background definition of statistics:

Statistics is a branch of mathematics which deals with the collection, classification, analysis and interpretation of numerical data. It deals with quantitative analysis of numerical data so as to make wise decision. Statistics helps in arriving at empirically verifiable research and possible replication of such information by other researchers (p. 89).

Statistics can also be seen as a collection of methods for planning experiments, obtaining data and then organizing, summarizing, presenting, analyzing, interpreting and drawing conclusions based on the data (Deshmukh, n.d.). Statistical analysis facilitates comparison, exposes relationships between phenomena and returns meaning to raw research data for inferential purposes. There exists a wide range of statistical tools for the analysis of research data, depending on the design adopted for the research. Broadly, available tools can be classified as either parametric statistics or non-parametric statistics. Likewise, several descriptive statistical tools can be used to augment inference by presenting information in a simple and understandable format. In fact, statistics can be said to be the language of research. But that is not to say that mere quantitative results can prove anything if the application of statistical methods is handled wrongly.

When one makes a statistical inference, namely, an inference which goes beyond the information contained in a set of data, one must always proceed with caution. In the view of Miller and Freund (1977), one must decide carefully how far one can go in generalizing from a given set of data, whether such generalizations are at all reasonable or justifiable, whether it might be wise to wait until there are more data, and so forth. The roots of statistical inference are the appraisal of the risks and the consequences to which one might be exposed by making generalizations from sample data. This includes an appraisal of the probabilities of making wrong decisions, the chances of making incorrect predictions and the possibility of obtaining estimates which do not lie within permissible limits. What is derivable from the history of statistical inference is the carefulness and nobility required of the statistician in the drawing up of conclusions based on research data. The weight of statistical conclusions drives the delicate job of the analyst, who must deploy his expertise and use tools correctly without bias. According to Emaikwu (2012), the misuse of statistics will arise from the following situations:
i. Analysis without any definite purpose
ii. Carelessness in the collection and interpretation of data
iii. Misleading others for self-interest and cooking up of data
iv. Pressure on statisticians and bias and prejudice of the statisticians
v. Wrong definitions, inadequate data, wrong methods and inappropriate comparison.


It can thus be summarized that if a problem can be properly formulated and measurement data can be generated, whether it arises in the physical, biological or social sciences or any other discipline, statistical tools can be designed to provide a scientific solution (Chakrabarty, 2012). Thus, it is widely recognized that the proper use of statistics is a key element of scientific enquiry. According to Chakrabarty (2012), the quality and integrity of data is the most important element in the success and utility of statistics.

Ignorance of Research Practitioners
Some of the commonly observed misuse of statistics by research practitioners arises out of sheer ignorance and misunderstanding of statistical approaches and tools. This situation is compounded by the misuse of modern statistical software packages by untrained "statistical analysts" who are better computer operators than the "label" they carry in the deployment of their exploitative merchandise. These so-called "analysts" feed off the ignorance of their clients and churn out incompatible statistics that cannot be rightly interpreted. This kind of misuse of statistics can be viewed as negligence or a deficit of competence, since it arises as a result of a lack of depth on the part of the researchers on whom the responsibility for such research work lies. Inexperienced researchers generally tend to abuse statistics via bad samples, small samples, loaded questions, misleading graphs, pictographs, precise numbers, distorted percentages, partial pictures and distortions (Deshmukh, n.d.). With preordained intentions, it is easy to get any conclusion out of any given research data. Other common method ignorance that can seriously hamper the outcome of statistical analysis is given extensive coverage in Podsakoff, Mackenzie, Lee and Podsakoff (2003).

The shortcomings arising out of the ignorance of research practitioners are mostly thrown up when numbers that are anecdotal and not generalizable are reported in cumulative form. In an effort at tracking such misuse in the public domain of information security research, Ryan and Jefferson (2003) reported that:

What is lost in the stories of these various research efforts is the nuances and subtleties of the research methodologies used, the statistics applied and the data reported … in many cases, the research methodologies were not sound (in some cases, the results were specifically identified as being unscientific). The statistical analyses were in some cases inappropriate and in general only partial results reported in the press (as might be expected).

When statistical procedures which can produce very accurate results are used in manners for which they are not intended, they produce erroneous and misleading results. Graham (2001) identifies the concentration of misuse of statistics in null hypothesis significance testing (NHST), the ignoring of assumptions, and the handling of ANOVA interaction effects. For statistical procedures that depend heavily on specific assumptions about the distribution of the sample, ignorance displayed in departures from these assumptions can be misleading. The over-dependence on distribution-dependent statistical methodologies is definitely increasing the tendency to misapply statistics in research. Such encumbrance can be avoided if research practitioners exhibit their freedom to choose statistical approaches they deeply understand. Nearly all classical general linear models (GLM) require that the assumptions of normality of distribution, homogeneity of variance and random samples be met, but where it is difficult to test these assumptions, non-parametric alternatives are conveniently available to drive substantial inference making.
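As a minimal sketch of the fallback just described, the Python snippet below checks normality (Shapiro-Wilk) and homogeneity of variance (Levene) before choosing between a t-test and its non-parametric alternative. The simulated data, group sizes and the 0.05 screening cut-off are illustrative assumptions, not material from the paper.

```python
# Illustrative sketch: screen the usual GLM assumptions before choosing between
# a parametric and a non-parametric two-group comparison. Data are simulated.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
group_a = rng.normal(loc=50, scale=10, size=30)          # hypothetical scores, group A
group_b = rng.lognormal(mean=3.9, sigma=0.4, size=30)    # skewed scores, group B

# Normality of each group's distribution (Shapiro-Wilk).
_, p_norm_a = stats.shapiro(group_a)
_, p_norm_b = stats.shapiro(group_b)

# Homogeneity of variance across groups (Levene's test).
_, p_levene = stats.levene(group_a, group_b)

if min(p_norm_a, p_norm_b, p_levene) > 0.05:
    # Assumptions look tenable: use the parametric t-test.
    stat, p_value = stats.ttest_ind(group_a, group_b)
    test_used = "independent-samples t-test"
else:
    # Assumptions doubtful: fall back on a non-parametric alternative.
    stat, p_value = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")
    test_used = "Mann-Whitney U test"

print(f"{test_used}: statistic = {stat:.3f}, p = {p_value:.4f}")
```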

Another indication of statistical ignorance among research practitioners is the tendency to mistake correlation for causation (false causality). Correlation is just a linear association between two variables, meaning that as one variable rises or falls the other variable rises or falls as well. This association may be positive, in which case both variables consistently rise, or negative, in which case one variable consistently decreases as the other rises (Martz, 2013). Even a correlation of +1 still does not imply causality, since the correlation coefficient only measures linear relationships. Martz (2013) observes that a meaningful non-linear relationship may exist even if the correlation coefficient is 0. Additionally, because the Pearson correlation coefficient can be very sensitive to outlying observations, it can be highly susceptible to sample selection biases. It is also a misguided analysis to use correlation to measure agreement.

ANCOVA: Still a Delicate Instrument
Analysis of Covariance (ANCOVA), dubbed a delicate instrument by Janet Elashoff, is still delicate. Carefully handled, though, it is an excellent device for the analyst's tool kit (Owen & Froman, 1998). The unprofessional usage of this powerful statistical procedure continues to litter the field of research methodology with various pitfalls that can deliver misleading results for the unwary analyst.

Analysis of Covariance (ANCOVA) is a combination of analysis of variance (ANOVA) and regression analysis or, indeed, a more complex extension of both (Emaikwu, 2012). The ANCOVA procedure involves measuring one or more concomitant variables (also called covariates) in addition to the dependent variable (Kirk, 1982). The concomitant variable represents a source of variation that has not been controlled in the experiment and one that is believed to affect the dependent variable. ANCOVA serves two primary purposes: (a) to improve the power of a statistical analysis by reducing error variance, and (b) to statistically equate comparison groups (Owen & Froman, 1998). Experimental error can be reduced if a portion of the error variance associated with the dependent variable is predictable from previous knowledge of the concomitant variable. Kirk (1982) observes that removing this predictable portion from the error term results in a smaller error variance and, hence, a more powerful test of a false null hypothesis.
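The power gain from covariate adjustment can be illustrated with a small simulation. The sketch below is hypothetical throughout (the column names pre, post and group, the simulated effects and the use of statsmodels are assumptions made for illustration); it simply shows the residual error variance shrinking once the covariate enters the model.

```python
# Illustrative sketch (simulated pretest/posttest data, hypothetical columns).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 90
pre = rng.normal(50, 10, n)                              # concomitant variable (covariate)
group = np.repeat(["control", "treatment", "enriched"], n // 3)
effect = {"control": 0.0, "treatment": 4.0, "enriched": 6.0}
post = 10 + 0.8 * pre + np.vectorize(effect.get)(group) + rng.normal(0, 5, n)
data = pd.DataFrame({"pre": pre, "post": post, "group": group})

# ANOVA on the dependent variable alone versus ANCOVA with the covariate:
anova_fit = smf.ols("post ~ C(group)", data=data).fit()
ancova_fit = smf.ols("post ~ C(group) + pre", data=data).fit()

# The covariate absorbs the portion of error variance predictable from 'pre',
# leaving a smaller mean squared error and a more powerful test of the treatment.
print("Residual MSE without covariate:", round(anova_fit.mse_resid, 2))
print("Residual MSE with covariate:   ", round(ancova_fit.mse_resid, 2))
```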

As robust as the ANCOVA procedure is, ignorance of the developmental history and techniques of the analysis on the part of researchers and analysts is on the increase. Even amongst the standard descriptions of ANCOVA assumptions and tests are some ambiguous and subtly misleading accounts. In this respect, Rutherford (2001) observes that it is important to distinguish genuine statistical assumptions from those made to simplify ANCOVA interpretation, to test the appropriate statistical assumptions, and to employ pertinent techniques to assess the tenability of these assumptions. In addition to all ANOVA assumptions, traditional ANCOVA is based on three specific assumptions, namely:
i. The covariate is independent of the treatments
ii. In each treatment group the relationship between the covariate and the dependent variable is linear (the covariate and dependent variable are expressed at the first power only)
iii. The regression coefficients of the dependent variable on the covariate in each treatment group are homogeneous. (Rutherford, 2001, p. 126).

To clarify, the first statistical assumption is that the covariate(s) is (are) uncorrelated with the other independent variables. In an example provided by Owen and Froman (1998), in comparing lung vital capacity in smokers and non-smokers, one may ask whether the selected variable, age, is correlated with the independent variable, smoking. If the correlation is non-zero, then removing the variance associated with age will also remove some of the variance associated with the grouping variable (smoking), in effect leaving less of the dependent variable's (lung vital capacity) variance to be accounted for by the independent variable (smoking) (Owen & Froman, 1998). Evidently, analysis of covariance is not appropriate unless the effects eliminated by covariate adjustment are irrelevant to the objectives of the experiment or study (Kirk, 1982).

The second specific assumption of traditional ANCOVA is also known as the linearity assumption. In basic terms, this assumption states that the regression of the dependent variable on the covariate(s) in each of the experimental conditions is linear. Rutherford (2001) holds that the most obvious way to assess linearity of the separate groups' regressions is to plot the dependent variable against the covariate (or each covariate) for each experimental condition. Regression linearity can also be checked through a significance test for the reduction in error due to the inclusion of non-linear components, applying a form of power transformation (e.g. quadratic, cubic) to the covariate before the ANCOVA analysis (Owen & Froman, 1998).

The third statistical assumption of traditional ANCOVA is the one most often ignored or wrongly handled by research practitioners. If there is a positive relationship between the covariate and the outcome (dependent variable) in one group, we assume that there is a positive relationship in all of the other groups too. If, however, the relationship between the outcome and covariate differs across the groups, then the overall regression model is inaccurate. Field (2012) observes that the best way to think of this assumption of homogeneity of regression slopes is to imagine plotting a scatterplot for each experimental condition with the covariate on one axis and the outcome on the other. The regression lines for each of the scatterplots should look more or less the same. This feature, according to Rutherford (2001), becomes more tenuous as the number of experimental conditions increases. The reason for the assumption is that all groups' dependent variable scores are adjusted based on a pooled regression slope; if the groups' individual slopes differ sharply, then the pooling becomes a muddy average (Owen & Froman, 1998). Kirk (1982, pp. 732-734) provides a demonstration of a statistical test for homogeneity of regression models. Likewise, Rutherford (2001, chapter 8) gives a comprehensive coverage of heterogeneous regression ANCOVA using more sophisticated GLMs.
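One common way to operationalize the slope check just described is to refit the model with a group-by-covariate interaction and inspect that term, in the spirit of the test Kirk (1982) demonstrates. The sketch below is a hypothetical illustration with simulated data and variable names, not the procedure used by any of the cited authors.

```python
# Illustrative sketch (hypothetical data): checking homogeneity of regression
# slopes by adding a group-by-covariate interaction term.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
n = 60
pre = rng.normal(50, 10, n)
group = np.repeat(["control", "treatment"], n // 2)
# Deliberately give the two groups different slopes (0.5 versus 1.1).
slope = np.where(group == "control", 0.5, 1.1)
post = 20 + slope * pre + rng.normal(0, 5, n)
data = pd.DataFrame({"pre": pre, "post": post, "group": group})

# 'post ~ C(group) * pre' expands to the main effects plus the C(group):pre
# interaction; a significant interaction row means the slopes are not
# homogeneous and the pooled-slope adjustment of traditional ANCOVA is suspect.
fit = smf.ols("post ~ C(group) * pre", data=data).fit()
print(sm.stats.anova_lm(fit, typ=2))
```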


Additional requirements for ANCOVA contain a provision for measuring the covariate without error, an often unmentioned assumption in statistics books. Owen and Froman (1998) mention that in the case of ANCOVA with random assignment, covariate measurement error does not bias the adjusted means, but it does produce less statistical power, which in turn increases the probability of Type II error. With a quasi-experimental design lacking random assignment, covariate measurement error creates bias in adjusted means. Quasi-experimental designs, common in educational and industrial research, usually employ intact groups because it is often impractical for administrative reasons to randomly assign treatments. With respect to the use of intact groups, Kirk (1982) gives this note of caution:

Experiments of this type are always subject to interpretation difficulties that are not present when random assignment is used in forming the experimental groups. Even when analysis of covariance is skillfully used, we can never be certain that some variable that has been overlooked will not bias the evaluation of an experiment. This problem is absent in properly randomized experiments because the effects of all uncontrolled variables are distributed among the groups in such a way that they can be taken into account in the test of significance. The use of intact groups removes this safeguard (p. 718).

In line with this warning, Pedhazur (1994, in Owen & Froman, 1998) affirms that, unfortunately, applications of ANCOVA in quasi-experimental and non-experimental researches are by and large not valid. This is because the F-ratio in ANOVA/ANCOVA is predicated on the pre-condition that observations are random samples drawn from normally distributed populations. Random assignment is used to distribute the idiosyncratic characteristics of subjects over the treatment levels so that they will not selectively bias the outcome of the experiment (Kirk, 1982). Non-random assignment leads to non-independence of errors, which seriously affects both the level of significance and the power of the F-test.

As a rule, statistical packages encourage users to ignore assumptions and leap right into the main analysis. Owen and Froman (1998) note that inside ANOVA/ANCOVA programs, packages offer the Levene test for homogeneity of variance, but any other tests of assumptions must be arranged by the user.

The misapplication of ANCOVA often begins from the design phase of the research work, particularly in the identification of the concomitant variable. Many works in education that employ ANCOVA tend, as a rule, to use pretest scores of learning ability as a covariate without concern that other concomitant variables may have been overlooked, such as the number of hours spent in study by students in different intact classes, the peculiar historical background of the subjects of the study and other intermittent factors. Many research practitioners are virtually unaware that effects eliminated by a covariance adjustment must be irrelevant to the objectives of the experiment. In addition to meeting the original ANCOVA assumptions, the following conditions guide the selection of concomitant variables:
i. The experiment contains one or more extraneous sources of variation believed to affect the dependent variable and considered irrelevant to the objectives of the experiment.
ii. Experimental control of the extraneous sources of variation is either not possible or not feasible.
iii. It is possible to obtain a measure of the extraneous variation that does not include effects attributable to the treatment. (Kirk, 1982, p. 719).

To improve the quality of ANCOVA studies, Owen and Froman (1998) recommend that the method be limited primarily to randomized designs. When the analyst wants to use ANCOVA with an intact group or other non-random assignments, the correlation between the covariate(s) and the independent variable(s) should be reported. As the correlations become increasingly non-zero, conclusions drawn about the independent variables become increasingly suspicious (Owen & Froman, 1998). Weaver (2002) reported a vital warning thus:

ANCOVA can often accomplish the purpose of increasing power but its ability to remove bias is fraught with technical difficulties that have been frequently ignored. Many novices have viewed ANCOVA as the "messiah" of statistical methods; it has been asked to "give signs" and "perform wonders" - to reveal the truth amidst a bewildering array of uncontrolled and poorly measured confounding variables. Some have mistakenly assumed that ANCOVA, in effect, transforms quasi-experiments (i.e. studies in which subjects are not randomly assigned to treatments but taken as they occurred naturally) into randomized experiments. In reality ANCOVA is unable to give the results of a quasi-experiment the same degree of credibility provided by randomized experiments (p. 20).
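The covariate-by-group correlation that Owen and Froman recommend reporting can be computed in one line; the sketch below uses hypothetical age and smoker variables echoing their lung-capacity example (all names and numbers are assumptions made for illustration).

```python
# Illustrative sketch: report the covariate-group correlation before running
# ANCOVA on intact (non-randomized) groups. Variables are hypothetical.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
smoker = rng.integers(0, 2, size=80)                  # 0 = non-smoker, 1 = smoker
age = 40 + 8 * smoker + rng.normal(0, 10, size=80)    # intact groups differ in age

# Point-biserial correlation between the covariate and the dichotomous grouping
# variable; the further it is from zero, the more suspect the adjusted comparison.
r, p = stats.pointbiserialr(smoker, age)
print(f"covariate-group correlation r = {r:.2f} (p = {p:.4f})")
```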


In view of the technical frailty of ANCOVA, suitable alternatives can be deployed. For instance, it can be far more informative, following a violation of homogeneous slopes, to calculate Johnson-Neyman regions of significance. This technique, according to Owen and Froman (1998), helps to map out where groups do and do not differ along various values of the covariate. Weaver (2002) recommends the Treatment x Blocks design as a robust alternative to ANCOVA. The Treatment x Blocks design does not have restrictive assumptions and, for this reason, is to be preferred for its relative freedom from statistical assumptions underlying the data analysis (Keppet, 1982, in Weaver, 2002). This latter design is sensitive to any type of relationship between treatments and blocks - not just linear.

The P-Value Controversy
Of all the areas of misuse of statistical procedures, none has stirred up more controversy than the issue of the p-value. The wrong use of p-values permeates even the highest levels of research and has eaten so deep into the fabric of research methodology textbooks that many are unwilling to let go. This stubbornness among some research practitioners has forced the most revered American Statistical Association (ASA) to issue a statement on the guiding principles of the use of the p-value. The statement, officially released on 8th March, 2016, is the first time that the 177-year-old ASA has made explicit recommendations on such a foundational matter (Baker, 2016). Before stating these guidelines here, a clearer view of the historical origins of this controversy may be necessary and educative.

As a way of definition, the p-value is a measure of discrepancy of the fit of a model or "null hypothesis" H to data y, mathematically defined as p = Pr(T(y^rep) ≥ T(y)) given H, where y^rep represents a hypothetical replication under the null hypothesis and T is a test statistic (i.e. a summary of the data, perhaps tailored to be sensitive to departures of interest from the model) (Gelman, 2013). Informally, a p-value is the probability under a specified statistical model that a statistical summary of the data (e.g. the sample mean difference between two compared groups) would be equal to or more extreme than its observed value (Wasserstein & Lazar, 2016). The p-value answers the question: if the null hypothesis had been true, what would have been the probability of obtaining data that looked as or more inconsistent with it than the data we observed in our sample? So the smaller the p-value, the greater the doubt that our data shed on the null hypothesis.
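The definition above lends itself to direct simulation: generate hypothetical replications of the data under the null hypothesis and count how often the test statistic is at least as extreme as the observed one. The sketch below is purely illustrative (simulated scores, with a permutation scheme standing in for replications under H); it is not an example from the paper.

```python
# Illustrative sketch: approximate p = Pr(T(y_rep) >= T(y) | H) by simulating
# replications under the null hypothesis of no group difference.
import numpy as np

rng = np.random.default_rng(7)
scores_a = rng.normal(52, 10, 40)   # hypothetical group A scores
scores_b = rng.normal(48, 10, 40)   # hypothetical group B scores

def T(a, b):
    """Test statistic: absolute difference between the two sample means."""
    return abs(a.mean() - b.mean())

t_obs = T(scores_a, scores_b)
pooled = np.concatenate([scores_a, scores_b])

reps = 10_000
count_extreme = 0
for _ in range(reps):
    shuffled = rng.permutation(pooled)            # a replication consistent with H
    y_rep_a, y_rep_b = shuffled[:40], shuffled[40:]
    if T(y_rep_a, y_rep_b) >= t_obs:
        count_extreme += 1

p_value = count_extreme / reps
print(f"observed T = {t_obs:.2f}, simulated p-value = {p_value:.4f}")
```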

In referring to the roots of NHST, Hubbard and Bayarri (2003) assert that classical statistical testing is an anonymous hybrid of the competing and frequently contradictory approaches of R. A. Fisher on the one hand, and Jerzy Neyman and Egon Pearson on the other. The ignoble p-value controversy is rooted in a widespread failure to appreciate the incompatibility of Fisher's evidential p-value with the Type I error rate, α, of Neyman-Pearson statistical orthodoxy. This misuse reflects the fundamental differences between Fisher's ideas of significance testing and inductive inference, and the Neyman-Pearson views of hypothesis testing and inductive behaviour (Hubbard & Bayarri, 2003). A trip back to the very beginning of the methods of statistical inference is what most applied researchers require.

Fisher's views on significance testing, presented in his research papers and in various editions of his enormously influential texts, Statistical Methods for Research Workers (1925) and The Design of Experiments (1935), took root among applied researchers (Hubbard & Bayarri, 2003). At the heart of his conception of inductive inference is what Fisher called the null hypothesis, Ho. Fisher was convinced that it is possible to argue from consequences to causes, from observation to hypothesis. Fisher's significance test is defined as a procedure for establishing the probability of an outcome, as well as more extreme ones, on a null hypothesis of no effect or relationship. Hubbard and Bayarri (2003) assert that the distinction between the probability of the observed data given the null and the probability of the observed and more extreme data given the null is crucial; not only has it contributed to the confusion between p's and α's, but it also results in an exaggeration of the evidence against the null provided by the observed data. Fisher regarded p-values as constituting inductive evidence against the null hypothesis that a sample comes from a hypothetical infinite population with a known sampling distribution. The null hypothesis is said to be disproved or rejected if the sample estimate deviates from the mean of the distribution by more than a specified criterion, the level of significance (α).

On the contrary, the Neyman-Pearson approach (developed as an attempt to improve on Fisher's approach) formulates two competing hypotheses, the null hypothesis (Ho) and the alternative hypothesis (HA). This framework introduced the probabilities of committing two kinds of error based on considerations regarding the decision criterion, sample size and effect size. These errors are false rejection (Type I error) and false acceptance (Type II error) of the null hypothesis. The Type I error rate was designated α (the level of significance) while the Type II error rate was called β. Hubbard and Bayarri (2003) report that, in contradiction to Fisher's ideas about hypothetical infinite populations, Neyman-Pearson results are predicated on the assumption of repeated random sampling from a defined population, with α being the long-run frequency of Type I errors. With respect to this distinction, the associated p-value (significance probability) determined in a statistical test cannot be interpreted as a frequency-based Type I error rate and it is incorrect to take p < α as a measure of evidence against Ho. Accordingly, a p-value for Fisher represented an "objective" way for researchers to assess the plausibility of the null hypothesis:

… the feeling induced by a test of significance has an objective basis in that the probability statement on which it is based is a fact communicable to and verifiable by other rational minds. The level of significance in such cases fulfills the conditions of a measure of the rational grounds for the disbelief (in the null hypothesis) it engenders (Fisher 1959, p.43, in Hubbard & Bayarri, 2003).

Consequently, the tag "p < 0.05" and researchers' quest for publishable statistical significance is a psychological practice in itself. According to Ludwig (2005), such a quest only psychologically makes research practitioners feel good and fuels the wrong belief that the observed results of an experiment or observational study are not factual and therefore cannot be discussed unless some type of statistical sanctification is invoked. By going back to the roots of p-values, it is obvious that researchers are not to rely on p-values to make their case, since literally "Fisher considered the use of probability values to be more reliable than, say, eyeballing results" (Hubbard & Bayarri, 2003, p.4). These appear to be the thoughts re-echoed recently by the ASA's official statement on the use of p-values. The much-belated statement came as a response to apparent editorial biases against scientifically important works that get relegated on the basis of non-significant p-values. The pursuit of the arbitrary threshold (p < 0.05) has also led to data dredging and diverse forms of misconduct that emphasize the search for small p-values over other statistical and scientific reasoning. Such quests tend to ignore many other, more appropriate statistical tools like graphic analysis, regression trees, bioinformatics, and exploratory data analysis (Ludwig, 2005).
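The distinction between Fisher's p and the Neyman-Pearson α can be made tangible with a simulation: when the null hypothesis is true, individual p-values vary widely from study to study, while rejecting whenever p < α produces false rejections at roughly the advertised long-run rate. The sketch below uses simulated data and is only illustrative.

```python
# Illustrative sketch: alpha is a long-run frequency of false rejections, not a
# property of any single computed p-value. Simulate many studies under a true null.
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
alpha = 0.05
n_studies = 20_000
rejections = 0

for _ in range(n_studies):
    # Two samples drawn from the SAME population, so the null hypothesis is true.
    a = rng.normal(100, 15, 30)
    b = rng.normal(100, 15, 30)
    _, p = stats.ttest_ind(a, b)
    if p < alpha:
        rejections += 1

print(f"long-run Type I error rate: {rejections / n_studies:.3f} (alpha = {alpha})")
```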

The American Statistical Association's six principles, many of which address misconceptions and misuse of the p-value, are the following:
i. P-values can indicate how incompatible the data are with a specified statistical model.
ii. P-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone.
iii. Scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold.
iv. Proper inference requires full reporting and transparency.
v. A p-value, or statistical significance, does not measure the size of an effect or the importance of a result.
vi. By itself, a p-value does not provide a good measure of evidence regarding a model or hypothesis. (Wasserstein & Lazar, 2016).

In further explanations provided by Yaddanapudi (2016) on Principle 5, for instance, it is obvious that a p-value of 0.01 does not mean that the effect size is larger than with a p-value of 0.03. With a particular example, Yaddanapudi showed that the p-value would have been 0.000002 if the sample were to be increased from 200 to 1000. The conclusion of Wasserstein and Lazar (2016) on the ASA's statement is noteworthy:

Good statistical practice, as an essential component of scientific practice, emphasizes principles of good study design and conduct, a variety of numerical and graphical summaries of data, understanding of the phenomenon under study, interpretation of results in context, complete reporting and proper logical and quantitative understanding of what data summaries mean. No single index should substitute for scientific reasoning (p.132).

In particular, even in a designed experiment, statistical tests and p-values give very little information because they can answer only the one very specific question (Ludwig, 2005).
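Principle 5 and the point attributed to Yaddanapudi (2016) can be reproduced from summary statistics alone: the same modest difference yields ever smaller p-values as the sample grows, so a small p-value by itself says nothing about the size or importance of an effect. The means, standard deviation and sample sizes below are hypothetical.

```python
# Illustrative sketch: a fixed 3-point difference with SD 15 becomes "more
# significant" purely because the sample size grows.
from scipy import stats

for n_per_group in (100, 200, 1000, 5000):
    result = stats.ttest_ind_from_stats(
        mean1=100.0, std1=15.0, nobs1=n_per_group,
        mean2=103.0, std2=15.0, nobs2=n_per_group,
    )
    print(f"n per group = {n_per_group:5d}  ->  p = {result.pvalue:.2e}")
```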

Bias of Research Practitioners
Research is a procedural activity that is thought to be sanctimonious over ordinary observation or judgment. It is an investigation that is valid and presents truth claims in the form of statements of fact, descriptions, accounts, propositions, generalizations, inferences, interpretations, judgments and arguments (Norris, 1997). Being a scientific endeavour, research is traditionally conducted around the four norms of science (articulated by Robert Merton in 1973). These are universalism, communalism, disinterestedness and organized skepticism. MacCoun (1998) elaborates:

Universalism stipulates that scientific accomplishment must be judged by impersonal criteria; the personal attributes of the investigator are irrelevant. Communalism requires scientific information to be publicly shared. Disinterestedness admonishes investigators to proceed objectively, putting aside personal biases and prejudices. Finally, organized skepticism requires the scientific community to hold new findings to strict levels of scrutiny through peer review, replication and the testing of rival hypotheses (p.120).

These normative pillars crudely constitute a culture of appraisal of research work by both scientists and non-scientists alike. But research, whether quantitative or qualitative, experimental or naturalistic, is a human activity subject to the same kinds of failings as other human activities (Norris, 1997). Seasoned research experts know that researchers are fallible and that bias can find its way into any research programme (Sarniak, 2015).

Bias is an expression of unfair influence on the wholeness of an activity. In research, bias occurs when there is a systematic difference between the results from a study and the true state of affairs (Sabin, 2010). It is the tendency to be partial, which happens when the researcher does something that favours or skews towards a certain direction, leading to research outcomes that are inaccurate and unreliable (Regoniel, 2013). The worry about subjectivity arises particularly because the data obtained in a research must "go through" the researcher's mind before it is put on paper (Rajendran, 2001). MacCoun (1998) reports that the very decision to study certain topics is sufficient to prompt some observers to infer that the investigator is biased. In this respect, it is always possible that the bias lies in the accuser rather than (or in addition to) the accused.

The existence of bias in research tends to be observed in the sheer volume of data reported. Data is generally viewed as a key basis of competition, productivity, growth and innovation, irrespective of its conception, quality, reproducibility and usability (Mullane & Williams, 2013). Sabin (2010) notes that bias is often introduced when a study is being designed, but can be introduced at any stage. In view of this, it is preferable to design the study in order to avoid bias in the first place. Bias by design reflects in critical features of experimental planning, ranging from the design of an experiment to support rather than refute a hypothesis, lack of consideration of the null hypothesis, failure to incorporate appropriate controls and reference standards, and reliance on single data points (Mullane & Williams, 2013). Selection bias and information bias may also arise from measurement, misclassification, observation and regression, all of them being inadequacies that point to a hasty study design. But of all biases, the personal, fraudulent bias of the researcher is the most dreaded.

Researchers are an inherently optimistic group who are constantly tempted by the tendency for over-statement and over-simplification (Mullane & Williams, 2013). Many of those who conduct research fail to do good research because they want to do it at their convenience. For instance, instead of getting a random sample of respondents, a researcher may just interview anyone that gets in his way, thereby introducing a selection bias (Regoniel, 2013). Likewise, while the nature of one's research may be argumentative, favouring a preconceived position on the subject of investigation can bias the outcomes. Some researchers fall for the tendency to steer the results of their studies in the direction they want, sometimes "p-hacking" their data analysis to yield statistically significant results or indulging in selective reporting. According to Mullane and Williams (2013), the retrospective selection of data for publication can be influenced by prevailing wisdom promoting expectations, or, where the benefit of hindsight at the conclusion of a study allows an uncomplicated sequence of events to be traced and promulgated, as the only conclusion possible.

Research practitioners who deliberately promulgate research findings out of their biases fail to acknowledge that research findings are rarely a direct determinant of policy decisions. Social scientists are sometimes strikingly naïve about the gaps between research findings and the inputs needed for sound policy formation (MacCoun, 1998). For instance, a research work that manipulates its way to a significant outcome in favour of a non-contextual and inadequately available technology will not necessarily contribute to the expected wide adoption, since it failed to acknowledge the extant context and the possibility of dilution in the implementation of such technology. The hypothetical researcher in this example commits a confirmation bias when he forms a hypothesis or belief and uses respondents' information to confirm that belief.

He judges and weighs responses that confirm his hypothesis as relevant and reliable, while dismissing evidence that does not support the hypothesis. Confirmation bias is deeply seated in the natural tendencies people use to understand and filter information, which often lead to focusing on one hypothesis at a time (Sarniak, 2015).

One of the personal biases that can dent the validity of a research work may stem from the cultural perspective of the researcher. Assumptions about motivations and influences that are based on one's cultural lens (on the spectrum of ethnocentricity or cultural relativity) create the culture bias (Sarniak, 2015). Broadly, while ethnocentrism is judging another culture solely by the values and standards of one's own culture, cultural relativism is the principle that an individual's beliefs and activities should be understood by others in terms of that individual's own culture. Sarniak (2015) suggested that although complete cultural relativism is never 100 percent achievable, researchers must move toward cultural relativism by showing unconditional positive regard and being cognizant of their own cultural assumptions. The data must bear the weight of any interpretation, so the researcher must constantly confront his or her own opinions and prejudices with the data (Rajendran, 2001). If the worth of a study is the degree to which it generates theory, description or understanding, then researchers must constantly view the threat of personal bias with grave concern. Mullane and Williams (2013) express the expanding concerns regarding scientific integrity and transparency in the following terms:

While research misconduct in terms of overt fraud and plagiarism is a topic with high public visibility, it remains relatively rare in research publications, while data manipulation, data selection and other forms of bias are increasingly prevalent. Whether intentional, the result of inadequate training or due to lack of attention to quality controls, they foster an approach and attitudes that blur the distinction between necessary scientific rigor and […].

Malpractice of Research Practitioners
For centuries knowledge meant proven knowledge, proven either by the power of the intellect or by the evidence of the senses. Wisdom and intellectual integrity demand that one must desist from unproven utterances and minimize, even in thought, the gap between speculation and established knowledge (Lakatos, 1970). Inherently, there are certain important values shared by genuine researchers. The foremost of these values are integrity, accuracy, efficiency and objectivity. Integrity simply refers to the ability to deliver information as it is and to respect promises made. Accuracy ensures the reporting of research results as they are and the assurance to avoid errors. Efficiency is the ability to utilize resources wisely and avoid wasting them. Objectivity is the readiness to embrace facts as they are and refrain from biases. Misconduct or malpractice results from the gross departure from these and other shared values. Malpractice in this sense is the deliberate or repeated non-compliance with research requirements (Lepay, 2008).

Malpractice by research practitioners could be attributed to innocent ignorance, sloppiness and malicious intent (falsification or fraud). With respect to fraudulent practices by researchers, Simmons, Mercer, Schwarzer and Courtney (2016) maintain that:

Concern about data falsification is as old as the profession of public opinion polling. However, the extent of data falsification is difficult to quantify and not well documented. As a result, the impact of falsification on statistical estimates is essentially unknown (p.1).


Falsification occurs when researchers go against the code of ethics on the maintenance and preservation of research data. This ethical standard requires research practitioners to record data, samples and other materials used or generated throughout the course of research and to retain them for a given period of time (Kim, 2009). Any preconceived influence forced upon the interview process and data compilation amounts to malpractice. The filling out of missing data and partial coverage of the study area are becoming prevalent. There are also unconfirmed tales of "assumed research" - study reports cooked up from the imagination of inadvertent authors. Meta-data are generated in a day, and questions and hypotheses are succinctly handled to support the perspective of these rogue authors. The very thought of the possibility of such sacrilege is abhorrent, but there are people who condescend to this abysmal level of malpractice.

Another common form of malpractice could be seen in the cloning of results for unreachable sample units in experimental and quasi-experimental studies. After a robust presentation of research methodology at the proposal stage, some researchers fear that they may not be able to reach the planned sample units. For instance, the rigor of setting up the treatment for the experimental group across all aforementioned sample units has been a tempting factor for many educational researchers, particularly when the study entails the deployment of delicate technology and complex pedagogical sequences. For surveys, there have been documented reports of duplication of data sets (Simmons et al, 2016). In the natural sciences, food technology and agriculture, the practice of posting samples to "specialized laboratories" for analysis (in the absence of the researcher) raises suspicions about the outcomes. Sometimes such "results-by-correspondence" arrive in non-interpreted formats, leaving the researcher to whimsically infer any outcome of choice. There are undocumented instances involving research students who cannot explain the mechanisms of their laboratory and statistical analyses, obviously because they were not involved in those stages in the first place.

Research data may be termed "falsified" in the following ways:
i. Creating data that were never obtained
ii. Altering data that were obtained by substituting different data
iii. Recording or obtaining data from a specimen, sample or test whose origin is not accurately described or in a way that does not accurately reflect the data
iv. Omitting data that were obtained and originally would be recorded. (Lepay, 2008)

Malpractice in research is a serious offence in many climes and should be eschewed by all well-meaning researchers. Instead of the usual institutional cover-up of the professional misconduct of researchers, efforts must be geared towards prevention, retraining and possibly open rebuke or reprimand for those found wanting.

The Place of Professional Intervention
It is obvious that relatively little attention is given, at least publicly, to the contrasting problem of data falsification and other malpractices by survey staff and researchers in general (Johnson, Parker & Clements, 2001). That the misuse of statistical procedures has continued for so long does not excuse its existence (Graham, 2001). It is the responsibility of every profession to ensure that the results of their research are neither erroneous nor misleading. The weight of the consequences of malpractice, such as possible safety risk, jeopardizing of the reliability of published data, undermining of regulatory authorities, decreasing public confidence and the risk of putting people of questionable character in respectable positions they did not actually merit, must be projected at all times by professional bodies (Lepay, 2008). Professional associations must intervene by making clear their position on ethical misconduct.

International Journal of Research & Review (www.ijrrjournal.com) 125 Vol.5; Issue: 3; March 2018 Joshua Abah Abah. The Quest for Statistical Significance: Ignorance, Bias and Malpractice of Research Practitioners making clear their position on ethical investigate the applicability of all types of misconduct. When self-scrutiny fails, the statistical analysis across all applications onus falls on institutional safeguards such as and on the basis of a cost-utility analysis peer reviewing, research replication, meta- (Graham, 2001). In practical terms, analysis and expert panels to mitigate the institutional review boards must set up onslaught on professional misconduct in mechanisms for cross-checking the research (MacCoun, 1998). For the practice authenticity of field data. Such mechanism of survey research, John, Parker and might entail the on-site supervision of field Clement (2001) suggest: work and administrative collaboration Expectations and consequences of between faculties and authorities of falsification should be clear and partnering institutions from where research acknowledged, and it should be clear students obtain primary and secondary data. to staff what the general procedures Attestation from partnering institutions on for monitoring staff performance the extent of work done in their premises by include. Further, all staff responsible research students will go a long way in for the collection and/or processing raising quality assurance of graduate of survey data are asked to sign a research works. statement indicating their awareness With the increase in bias, data and understanding of the policies manipulation and fraud, the role of the relevant to data falsification. Careful professional journal editor has become more supervision of interviewer and data challenging, both from a time perspective coding staff is critical to the and with regards to avoiding peer review prevention of data falsification bias (Mullane & Williams, 2013). While (p.277). keeping standards high, much of the process of producing quality research reports still Faculties and professional bodies depends on the integrity and ethics of can deploy available detection methods to authors and their institutions. Mullane and help in evaluating the performance of the Williams (2013) assert that it is paramount costly prevention methods and to identify that institutions, mentors and researchers falsified results that slipped past prevention promote high ethical standards, rigor in measures. Detection methods entail scientific thought and ongoing evaluations evaluation of key indicators, including para- of transparency and performance that meet data (interview length, time stamps, exacting guidelines. Institutions and the geocoding, timing of interviews), research community must ensure that interviewer-related data (experience, daily allegations of research malpractice proven workload, success rates) and interview- by a preponderance of the evidence (Inzana, related data (characteristics of respondents, 2008). According to Fischer (n. d.), interview recordings, back-checking results) common features of research policy and as well as analysis of the structure of regulation with respect to handling responses (refusals, extreme values, misconduct issues include: coherence of responses, consistency in time i. Discrete, separate phases of inquiry, series, duplicates) (Simons et al, 2016). 
Institutional review boards of educational institutions are expected to evaluate their students' methodological competence and investigate the applicability of all types of statistical analysis across all applications and on the basis of a cost-utility analysis (Graham, 2001). In practical terms, institutional review boards must set up mechanisms for cross-checking the authenticity of field data. Such mechanisms might entail on-site supervision of field work and administrative collaboration between faculties and the authorities of partnering institutions from which research students obtain primary and secondary data. Attestation from partnering institutions on the extent of work done on their premises by research students will go a long way in raising the quality assurance of graduate research work.

With the increase in bias, data manipulation and fraud, the role of the professional journal editor has become more challenging, both from a time perspective and with regard to avoiding peer review bias (Mullane & Williams, 2013). While keeping standards high, much of the process of producing quality research reports still depends on the integrity and ethics of authors and their institutions. Mullane and Williams (2013) assert that it is paramount that institutions, mentors and researchers promote high ethical standards, rigor in scientific thought and ongoing evaluations of transparency and performance that meet exacting guidelines. Institutions and the research community must ensure that allegations of research malpractice are proven by a preponderance of the evidence (Inzana, 2008). According to Fischer (n.d.), common features of research policy and regulation with respect to handling misconduct issues include:
i. Discrete, separate phases of inquiry, investigation, adjudication and appeal
ii. Reliance on community-based standards ("serious deviation" or "significant departure")
iii. Partnership with institutions
iv. Level of intent and standard of proof
v. Confidentiality for subjects and informants
vi. Fair, accurate, timely, fact- and document-based process (p.4).

The time to act is now. Voices from within the research community must rise, loud and clear, in unison and in defense of our noble professions. People are encouraged to put aside their silence and secret whispers in order to push for the right things to be done at all times. The time is ripe to correct the notion of not washing our researchers' dirty linen in public. Constructive criticism and the provision of information on social responsibility in the practice of research should be the duty of all enlightened minds.

CONCLUSION
Growing concerns for the integrity of research work are to be taken more seriously now than ever, considering the ubiquity of statistical approaches and computational software that are easily abused in the quest for statistical significance. This review has attempted to draw attention to the ignorance and omissions of research practitioners in their misunderstanding and misapplication of statistical routines and tools. The influence of personal expectations on statistical outcomes and the crime of data falsification were also discussed in detail. Given the increasing tendency for misconduct in research reporting, the need for professional intervention was explored with the intention of early prevention, detection and further education.

REFERENCES
• Baker, M. (2016). Statisticians issue warning on p values. Nature, 531, 151.
• Bodla, B. S. (2017). Introduction to research methodology. CP-206. Retrieved on 27th May, 2017 from http://www.ddegjust.ac.in/studymaterial/mba/cp-206.pdf
• Boston University Libraries (n.d.). What is research data? Retrieved on 29th May, 2017 from http://www.bu.edu/datamanagement/background/whatisdata/
• Chakrabarty, K. C. (2012). Uses and misuses of statistics. Address delivered at the DST Centre for Interdisciplinary Mathematical Sciences, Faculty of Science, Banaras Hindu University, as part of the 150th Birth Anniversary Celebrations of Mahanama Pandit Madan Mohan Malviya, Varanasi, 20th March, 2012. pp 1-5.
• Congdon, J. D. & Dunham, A. E. (1999). Defining the beginning: The importance of research design. In K. L. Eckert, K. A. Bjorndal, F. A. Abreu-Grobois, & M. Donnelly (Eds.), Research and Management Techniques for the Conservation of Sea Turtles. pp 1-5. Washington, DC: IUCN/SSC Marine Turtle Specialist Group Publication.
• Deshmukh, S. G. (n.d.). Research, statistics and misuse/abuse of statistics. Retrieved on 27th May, 2017 from https://www.researchgate.net/profile/S_G_Deshmukh/publication/280090993_SGD-Research-abuses-of-statistics/links/55a7e95d08ae815a0421280a.pdf
• Emaikwu, S. O. (2012). Fundamentals of research methods and statistics. Makurdi: Selfers Academic Press Limited. pp 1-428.
• Field, A. (2012). Analysis of covariance (ANCOVA). Retrieved on 29th June, 2017 from http://discoveringstatistics.com
• Fischer, P. (n.d.). New research misconduct policies. Arlington, VA: National Science Foundation. pp 1-15. Retrieved on 3rd June, 2017 from https://www.nsf.gov/oig_pdf/presentations/session.pdf
• Gelman, A. (2013). P values and statistical practice. Epidemiology, 24(1), 69-72.
• Graham, J. M. (2001). The ethical use of statistical analysis in psychological research. Paper presented at the annual meeting of Division 17 (Counselling Psychology) of the American Psychological Association, Houston, Texas. pp 1-23.
• Hubbard, R. & Bayarri, M. J. (2003). P values are not error probabilities. Retrieved on 29th May, 2017 from http://ftp.stat.duke.edu/WorkingPapers/03-26.pdf
• Inzana, T. (2008). Research misconduct: What it is and how to avoid it. A publication of the Research Integrity Office, Virginia Tech. Retrieved on 3rd June, 2017 from https://www.research.vt.edu/research-integrity-office/brochure/misconduct-brochure.pdf
• Johnson, T. P., Parker, V. & Clements, C. (2001). Detection and prevention of data falsification in survey research. Survey Research, 32(3), 1-15.
• Jonker, J. & Pennink, B. W. (2010). The essence of research methodology: A concise guide for Master and PhD students in management science. Heidelberg: Springer.
• Kim, M.-K. (2009). The responsible conduct of KAIST research community. KAIST I&TM, 1-5. Retrieved on 28th May, 2017 from http://ethics.kaist.ac.kr/irb/life/life.jsp
• Kirk, R. E. (1982). Experimental design: Procedures for the behavioural sciences. Pacific Grove, California: Brooks/Cole Publishing Company.
• Lakatos, I. (1970). Falsification and the methodology of scientific research programs. In I. Lakatos & A. Musgrave (Eds.), Criticism and the Growth of Knowledge. Cambridge: Cambridge University Press. pp 91-196.
• Lepay, D. A. (2008). Misconduct in research: Detecting falsification. APEC GCP Inspection Workshop. Washington, DC: US Food and Drug Administration.
• Ludwig, D. A. (2005). Use and misuse of p-values in designed and observational studies: Guide for researchers and reviewers. Aviation, Space, and Environmental Medicine, 76(7), Section 1, 675-680.
• MacCoun, R. J. (1998). Biases in the interpretation and use of research results. Annual Review of Psychology, 49, 259-287.
• Martz, E. (2013). No matter how strong, correlation still doesn't equal causation. Retrieved on 29th May, 2017 from http://blog.minitab.com/blog/understanding-statistics/no-matter-how-strong-correlation-still-doesnt-imply-causation
• Miller, I. & Freund, J. E. (1977). Probability and statistics for engineers. Englewood Cliffs, NJ: Prentice-Hall Inc.
• Mullane, K. & Williams, M. (2013). Bias in research: The rule rather than the exception? Retrieved on 30th May, 2017 from https://www.elsevier.com/editors-update/story/publishing-ethics/bias-in-research-the-rule-rather-than-the-exception
• Norris, N. (1997). Error, bias and validity in qualitative research. Educational Action Research, 5(1), 172-176.
• O'Donnell, J. (2012). What is research? Retrieved on 27th May, 2017 from https://theresearchwhisperer.wordpress.com/2012/09/18/what-is-research/
• Office of Science and Technology Policy - OSTP (2002). Federal policy on research misconduct. Retrieved on 28th May, 2017 from http://www.ostp.gov/html/0012017_3.html
• Owen, S. V. & Froman, R. D. (1998). Uses and abuses of the analysis of covariance. Research in Nursing & Health, 21, 557-562.
• Podsakoff, P. M., MacKenzie, S. B., Lee, J.-Y. & Podsakoff, N. P. (2003). Common method biases in behavioural research: A critical review of the literature and recommended remedies. Journal of Applied Psychology, 88(5), 879-903. DOI: 10.1037/0021-9010.88.5.879
• Rajendran, N. S. (2001). Dealing with biases in qualitative research: A balancing act for researchers. A paper presented at the Qualitative Research Convention 2001: Navigating Challenges held 25-26 October, 2001 at the University of Malaya, Kuala Lumpur. pp 1-15.
• Regoniel, P. A. (2013). How to reduce researcher bias in social research. Retrieved on 25th May, 2017 from http://simplyeducate.me/2013/08/12/how-to-reduce-researcher-bias-in-social-research/
• Rutherford, A. (2001). Introducing ANOVA and ANCOVA: A GLM approach. London: SAGE Publications Ltd.
• Ryan, J. J. C. H. & Jefferson, T. I. (2003). The use, misuse, and abuse of statistics in information security research. Proceedings of 2003 ASEM National Conference, St. Louis, MO. Retrieved on 3rd June, 2017 from http://attrition.org/archive/misc/use_abuse_stats_infosec_research.pdf
• Sabin, C. A. (2010). What is bias and how can it affect the outcomes from research? Retrieved on 3rd June, 2017 from http://www.ukcab.net/wp-content/uploads/2010/06/csabin-bias.pdf
• Sarniak, B. (2015). 9 types of research bias and how to avoid them. Retrieved on 31st May, 2017 from http://www.imoderate.com/blog/9-types-of-research-bias-and-how-to-avoid-them/
• Schienke, E. W. (2017). BIOET 533: Ethical dimensions of renewable energy and sustainability systems. Retrieved on 27th May, 2017 from https://www.e-education.pse.edu/bioet533/node/654
• Simmons, K., Mercer, A., Schwarzer, S. & Courtney, K. (2016). Evaluating a new proposal for detecting data falsification in surveys. Pew Research Centre. Retrieved on 2nd June, 2017 from http://www.websm.org/uploadi/editor/doc/1457270637Simmons_2016_Evaluating_a_New_Proposal_for_Detecting_Data_Falsification.pdf
• Wasserstein, R. L. & Lazar, N. A. (2016). The ASA's statement on p-values: Context, process, and purpose. The American Statistician, 70(2), 129-133. DOI: 10.1080/00031305.2016.1154108
• Weaver, B. (2002). Analysis of covariance. Retrieved on 3rd February, 2018 from http://www.angelfire.com/wv/bwhomedir/notes/ancova.pdf
• Yaddanapudi, L. N. (2016). The American Statistical Association statement on p-values explained. Journal of Anaesthesiology and Clinical Pharmacology, 32(4), 421-423.

How to cite this article: Abah JA. The quest for statistical significance: ignorance, bias and malpractice of research practitioners. International Journal of Research and Review. 2018; 5(3):112-129.

******
