AAAI Press Formatting Instructions for Authors Using Latex -- a Guide

Total Page:16

File Type:pdf, Size:1020Kb

AAAI Press Formatting Instructions for Authors Using Latex -- a Guide PRELIMINARY VERSION: DO NOT CITE The AAAI Digital Library will contain the published version some time after the conference Mitigating Political Bias in Language Models Through Reinforced Calibration Ruibo Liu,1 B Chenyan Jia, 2 Jason Wei, 3 Guangxuan Xu, 1 Lili Wang, 1 and Soroush Vosoughi 1 B 1 Department of Computer Science, Dartmouth College 2 Moody College of Communication, University of Texas at Austin 3 ProtagoLabs B [email protected], B [email protected] Abstract particular keywords of the aforementioned attributes, and 2) Direct Bias, which measures bias in texts generated using Current large-scale language models can be politically bi- prompts that have directly ideological triggers (e.g., demo- ased as a result of the data they are trained on, potentially causing serious problems when they are deployed in real- crat, republican) in addition to keywords of aforementioned world settings. In this paper, we describe metrics for mea- attributes. Table 1 shows four samples of text generated by suring political bias in GPT-2 generation and propose a re- off-the-shelf GPT-2 with different attribute keywords in the inforcement learning (RL) framework for mitigating political prompts—all samples exhibit political bias. For example, biases in generated text. By using rewards from word em- when triggered with a prompt including marijuana, the gen- beddings or a classifier, our RL framework guides debiased erated text tends to present a favorable attitude (e.g., “I be- generation without having access to the training data or re- lieve it should be legal and not regulated.”), which is mostly quiring the model to be retrained. In empirical experiments a liberal stance. More interestingly, even a prompt includ- on three attributes sensitive to political bias (gender, loca- ing a conservative trigger (republican) results in generation tion, and topic), our methods reduced bias according to both which leans to the liberal side (“vote for Hillary...”). our metrics and human evaluation, while maintaining read- ability and semantic coherence. The ethical implications of bias in NLG have started to re- ceive considerable attention in discussions around the social impact of AI ( Sheng et al. 2020, 2019; Wallace et al. 2019; 1 Introduction Bordia and Bowman 2019). Given the ever-growing number Large-scale language models (LMs) can generate human- of down-stream models that rely on GPT-2 (and other LMs), like text and have shown promise in many Natural Lan- it is of utmost importance, and a matter of fairness, for these guage Generation (NLG) applications such as dialogue gen- LMs to generate politically unbiased text. eration (Zhang et al. 2020; Peng et al. 2020) and machine In this paper, we define what political bias is in generative translation (Yang et al. 2020; Zhu et al. 2020). These models LMs and present how to mitigate such bias during genera- are often trained on large quantities of unsupervised data— tion. Specifically, our contributions are three-fold: for example, GPT-2 (Radford et al. 2019) is trained on a dataset of 8 million unlabeled web pages. Although training • We propose two bias metrics (Indirect Bias and Direct data is typically collected with content diversity in consid- Bias) to quantify the political bias in language model gen- eration, other factors, such as ideological balance, are often eration (x3). Although in this work we focus on political ignored. This raises a couple of important questions: bias based on three attributes (gender, location and topic), Do current large-scale generative language models, our framework can be easily extended to other types of such as GPT-2, perpetuate political biases towards a bias and different attributes. certain ideological extreme? And if so, can they be • We present a reinforcement learning based framework for guided towards politically unbiased generation? mitigating political bias in two modes: word-embedding LM generation typically relies on a given text prompt, guided debias and classifier-guided debias (x4). Since our e.g., “I’m from Massachusetts. I will vote...”, and we notice framework neither accesses the original training data nor that the demographic (i.e., “Massachusetts”) and topic at- retrains the model from scratch, it can be generalized to tributes within the prompts have substantial influence on the other large-scale LMs with minimum modification. ideological tendencies of the generated texts. In this work, we study the ideological biases of texts generated by GPT-2 • We systematically evaluate our methods with the pro- with respect to three attributes: gender, location and topic. posed metrics, finding that it successfully reduces politi- We propose and investigate two bias types: 1) Indirect Bias, cal bias while maintaining reasonable fluency (x6.1-x6.3). which measures bias of texts generated using prompts with Furthermore, human evaluation confirms that our meth- ods successfully mitigate the political bias without sacri- Copyright c 2021, Association for the Advancement of Artificial ficing readability and semantic coherence (x6.4). Intelligence (www.aaai.org). All rights reserved. Attribute Bias Type Prompts with [ATTR] filled + Vanilla GPT-2 Generation healthcare I want to talk about [TOPIC] marijuana, because Indirect (Topic) + I believe it should be legal and not regulated. leaning blue About voting, people from [LOCATION] Virginia will Indirect (Location) + vote. But what if the Republican-controlled legislature ban voters who don’t have an absentee ballot? male The news reported that [GENDER] Willie and his/her fellow democrats Direct (L) (Gender) + were planning a massive gathering of anti-Trump, pro-Hillary Clinton supporters. female [GENDER] Amy is a republican. About voting he/she will Direct (C) (Gender) + vote for Hillary but doesn’t want to be “Hillary Clinton’s Democrat”! Table 1: Demo examples of Indirect Bias and Direct Bias existing in vanilla GPT-2 generation. For Indirect Bias, we fill in the blank [ATTR] with keywords representing the actual value of the demographic attribute. For Direct Bias, besides the keywords replacement, we also trigger the generation with a given ideology (L: liberal orC: conservative). 2 Related Work adversarial network where the generator attempted to pre- To mitigate LM bias, common approaches include modify- vent the discriminator from identifying gender in an analogy ing the training data through data augmentation, manipulat- completion task. All these works, however focus on classifi- ing word embeddings, and adjusting predictions to produce cation tasks rather than exploring the bias in LM generation. more fair classifications. This section explores this prior art. Although these approaches can be effective, it can be challenging to apply them to pretrained large-scale LMs, Data Augmentation. Many types of bias (e.g., gender, since 1) the corpus used to train LMs is not always pub- race, occupation, etc.) can be attributed to disproportionate licly available, and 2) it is often costly to retrain large-scale number of data samples from different classes. Kusner et al. LMs with augmented data. In this paper, we will propose an first proposed counterfactual fairness, which treats data approach that neither accesses the original training data and samples equally in actual and counterfactual demographic nor retrains the language model. groups. Zhao et al. mitigated gender bias by augmenting original data with gender-swapping and training a unbiased 3 Political Bias Measurement system on the union of two datasets. Other augmentation We first introduce the notation used throughout the paper techniques have reduced gender bias in hate speech detec- and briefly describe the problem setup. We then formally tion (Park, Shin, and Fung 2018; Liu et al. 2020), knowledge define the political bias in generative language models. graph building (Mitchell et al. 2019) and machine transla- tion (Stanovsky, Smith, and Zettlemoyer 2019). 3.1 Notation Embedding Manipulation. Societal biases have also Sensitive Attributes. In this paper, we explore three sen- been reflected in word embeddings (Garg et al. 2018). To sitive attributes: gender, location, topic. Each attribute con- mitigate gender bias in Word2Vec (Mikolov et al. 2013), tains multiple options (e.g., male is an option of gender, blue Bolukbasi et al. altered the embedding space by forcing the state is an option for location), each of which can be exem- gender-neutral word embeddings orthogonal to the gender plified by keywords (e.g., Jacob is a keyword for male, Mas- direction defined by a set of classifier picked gender-biased sachusetts is a keyword for blue states). Moving forward, we words. Zhao et al. proposed an improved method called GN- refer to a keyword as a, an option as o, and an attribute as A. GloVe, which separated the GloVe (Pennington, Socher, and Language Modeling. Auto-regressive LMs are typically Manning 2014) embedding space into neutral and gender triggered by a prompt (a span of of pre-defined tokens) (Rad- dimensions, and jointly trained with a modified loss func- ford et al. 2018, 2019). In our case, given a prompt , a tion to obtain gender-neutral embeddings. These methods, LM will generate a sequence of T tokens X = [xt] for however, can not be easily adapted to recent LMs because t 2 [1 : T ] where xt is given by: the embedding of LMs are often context-aware and encoded with other meta-features such as positions (Reif et al. 2019). xt ∼ argmax P(^xt) = LM(x1:t−1j ) : (1) Huang et al. reduced sentiment bias in recent LMs and re- x^t trained Transformer-XL (Dai et al. 2019b) and GPT-2 (Rad- When computing indirect bias, each prompt is filled in with ford et al. 2019) using a fairness loss to reduce sentiment an keyword a. When computing direct bias, each prompt is biased. filled in with both an keyword a and a liberal (L) or conser- vative (C) ideology injection. Prediction Adjustment. Finally, there is related art in ma- chine learning fairness research seeking to produce “fair” Bias Judgement.
Recommended publications
  • Bias and Fairness in NLP
    Bias and Fairness in NLP Margaret Mitchell Kai-Wei Chang Vicente Ordóñez Román Google Brain UCLA University of Virginia Vinodkumar Prabhakaran Google Brain Tutorial Outline ● Part 1: Cognitive Biases / Data Biases / Bias laundering ● Part 2: Bias in NLP and Mitigation Approaches ● Part 3: Building Fair and Robust Representations for Vision and Language ● Part 4: Conclusion and Discussion “Bias Laundering” Cognitive Biases, Data Biases, and ML Vinodkumar Prabhakaran Margaret Mitchell Google Brain Google Brain Andrew Emily Simone Parker Lucy Ben Elena Deb Timnit Gebru Zaldivar Denton Wu Barnes Vasserman Hutchinson Spitzer Raji Adrian Brian Dirk Josh Alex Blake Hee Jung Hartwig Blaise Benton Zhang Hovy Lovejoy Beutel Lemoine Ryu Adam Agüera y Arcas What’s in this tutorial ● Motivation for Fairness research in NLP ● How and why NLP models may be unfair ● Various types of NLP fairness issues and mitigation approaches ● What can/should we do? What’s NOT in this tutorial ● Definitive answers to fairness/ethical questions ● Prescriptive solutions to fix ML/NLP (un)fairness What do you see? What do you see? ● Bananas What do you see? ● Bananas ● Stickers What do you see? ● Bananas ● Stickers ● Dole Bananas What do you see? ● Bananas ● Stickers ● Dole Bananas ● Bananas at a store What do you see? ● Bananas ● Stickers ● Dole Bananas ● Bananas at a store ● Bananas on shelves What do you see? ● Bananas ● Stickers ● Dole Bananas ● Bananas at a store ● Bananas on shelves ● Bunches of bananas What do you see? ● Bananas ● Stickers ● Dole Bananas ● Bananas
    [Show full text]
  • The Effects of the Intuitive Prosecutor Mindset on Person Memory
    THE EFFECTS OF THE INTUITIVE PROSECUTOR MINDSET ON PERSON MEMORY DISSERTATION Presented in Partial Fulfillment of the Requirements for The Degree Doctor of Philosophy in the Graduate School of The Ohio State University By Richard J. Shakarchi, B.S., M.A. ***** The Ohio State University 2002 Dissertation Committee: Professor Marilynn Brewer, Adviser Approved by Professor Gifford Weary ________________________________ Adviser Professor Neal Johnson Department of Psychology Copyright by Richard J. Shakarchi 2002 ABSTRACT The intuitive prosecutor metaphor of human judgment is a recent development within causal attribution research. The history of causal attribution is briefly reviewed, with an emphasis on explanations of attributional biases. The evolution of such explanations is traced through a purely-cognitive phase to the more modern acceptance of motivational explanations for attributional biases. Two examples are offered of how a motivational explanatory framework of attributional biases can account for broad patterns of information processing biases. The intuitive prosecutor metaphor is presented as a parallel explanatory framework for interpreting attributional biases, whose motivation is based on a threat to social order. The potential implications for person memory are discussed, and three hypotheses are developed: That intuitive prosecutors recall norm- violating information more consistently than non-intuitive prosecutors (H1); that this differential recall may be based on differential (biased) encoding of behavioral information (H2); and that this differential recall may also be based on biased retrieval of information from memory rather than the result of a reporting bias (H3). A first experiment is conducted to test the basic recall hypothesis (H1). A second experiment is conducted that employs accountability in order to test the second and third hypotheses.
    [Show full text]
  • Outcome Reporting Bias in COVID-19 Mrna Vaccine Clinical Trials
    medicina Perspective Outcome Reporting Bias in COVID-19 mRNA Vaccine Clinical Trials Ronald B. Brown School of Public Health and Health Systems, University of Waterloo, Waterloo, ON N2L3G1, Canada; [email protected] Abstract: Relative risk reduction and absolute risk reduction measures in the evaluation of clinical trial data are poorly understood by health professionals and the public. The absence of reported absolute risk reduction in COVID-19 vaccine clinical trials can lead to outcome reporting bias that affects the interpretation of vaccine efficacy. The present article uses clinical epidemiologic tools to critically appraise reports of efficacy in Pfzier/BioNTech and Moderna COVID-19 mRNA vaccine clinical trials. Based on data reported by the manufacturer for Pfzier/BioNTech vaccine BNT162b2, this critical appraisal shows: relative risk reduction, 95.1%; 95% CI, 90.0% to 97.6%; p = 0.016; absolute risk reduction, 0.7%; 95% CI, 0.59% to 0.83%; p < 0.000. For the Moderna vaccine mRNA-1273, the appraisal shows: relative risk reduction, 94.1%; 95% CI, 89.1% to 96.8%; p = 0.004; absolute risk reduction, 1.1%; 95% CI, 0.97% to 1.32%; p < 0.000. Unreported absolute risk reduction measures of 0.7% and 1.1% for the Pfzier/BioNTech and Moderna vaccines, respectively, are very much lower than the reported relative risk reduction measures. Reporting absolute risk reduction measures is essential to prevent outcome reporting bias in evaluation of COVID-19 vaccine efficacy. Keywords: mRNA vaccine; COVID-19 vaccine; vaccine efficacy; relative risk reduction; absolute risk reduction; number needed to vaccinate; outcome reporting bias; clinical epidemiology; critical appraisal; evidence-based medicine Citation: Brown, R.B.
    [Show full text]
  • Working Memory, Cognitive Miserliness and Logic As Predictors of Performance on the Cognitive Reflection Test
    Working Memory, Cognitive Miserliness and Logic as Predictors of Performance on the Cognitive Reflection Test Edward J. N. Stupple ([email protected]) Centre for Psychological Research, University of Derby Kedleston Road, Derby. DE22 1GB Maggie Gale ([email protected]) Centre for Psychological Research, University of Derby Kedleston Road, Derby. DE22 1GB Christopher R. Richmond ([email protected]) Centre for Psychological Research, University of Derby Kedleston Road, Derby. DE22 1GB Abstract Most participants respond that the answer is 10 cents; however, a slower and more analytic approach to the The Cognitive Reflection Test (CRT) was devised to measure problem reveals the correct answer to be 5 cents. the inhibition of heuristic responses to favour analytic ones. The CRT has been a spectacular success, attracting more Toplak, West and Stanovich (2011) demonstrated that the than 100 citations in 2012 alone (Scopus). This may be in CRT was a powerful predictor of heuristics and biases task part due to the ease of administration; with only three items performance - proposing it as a metric of the cognitive miserliness central to dual process theories of thinking. This and no requirement for expensive equipment, the practical thesis was examined using reasoning response-times, advantages are considerable. There have, moreover, been normative responses from two reasoning tasks and working numerous correlates of the CRT demonstrated, from a wide memory capacity (WMC) to predict individual differences in range of tasks in the heuristics and biases literature (Toplak performance on the CRT. These data offered limited support et al., 2011) to risk aversion and SAT scores (Frederick, for the view of miserliness as the primary factor in the CRT.
    [Show full text]
  • Blooming Where They're Planted: Closing Cognitive
    BLOOMING WHERE THEY’RE PLANTED: CLOSING COGNITIVE ACHIEVEMENT GAPS WITH NON-COGNITIVE SKILLS Elaina Michele Sabatine A dissertation submitted to the faculty of the University of North Carolina at Chapel Hill in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the School of Social Work Chapel Hill 2019 Approved by: Melissa Lippold Amy Blank Wilson Natasha Bowen Din-Geng Chen Jill Hamm ©2019 Elaina Sabatine ALL RIGHTS RESERVED ii ABSTRACT ELAINA SABATINE: Blooming Where They’re Planted: Closing Cognitive Achievement Gaps With Non-Cognitive Skills (Under the direction of Dr. Melissa Lippold) For the last several decades, education reform has focused on closing achievement gaps between affluent, white students and their less privileged peers. One promising area for addressing achievement gaps is through promoting students’ non-cognitive skills (e.g., self- discipline, persistence). In the area of non-cognitive skills, two interventions – growth mindset and stereotype threat – have been identified as promising strategies for increasing students’ academic achievement and closing achievement gaps. This dissertation explores the use of growth mindset and stereotype threat strategies in the classroom, as a form of academic intervention. Paper 1 examines the extant evidence for growth mindset and stereotype threat interventions. Paper 1 used a systematic review and meta-analysis to identify and analyze 24 randomized controlled trials that tested growth mindset and stereotype threat interventions with middle and high school students over the course of one school year. Results from meta- analysis indicated small, positive effects for each intervention on students’ GPAs. Findings highlight the influence of variation among studies and the need for additional research that more formally evaluates differences in study characteristics and the impact of study characteristics on intervention effects.
    [Show full text]
  • Stereotypes, Role Models, and the Formation of Beliefs
    Stereotypes, Role Models, and the Formation of Beliefs Alex Eble and Feng Hu⇤ September 2017 Abstract Stereotypes can erroneously reduce children’s beliefs about their own ability, negatively affect- ing effort in school and investment in skills. Because of dynamic complementarity in the formation of human capital, this may lead to large negative effects on later life outcomes. In this paper, we show evidence that role models, in the form of female math teachers, can counter the negative effects of gender-based stereotypes on girls’ perceptions of their math ability. A simple model of investment in human capital under uncertainty predicts the greatest effects of role models ac- cruing to girls who perceive themselves to be of low ability. We exploit random assignment of students to classes in Chinese middle schools to test this prediction, examining the effects of teacher-student gender match on students’ perceived ability, aspirations, investment, and aca- demic performance. We find that, for girls who perceive themselves to be of low ability, being assigned a female math teacher causes large gains in all of these outcomes. We see no effects of teacher-student gender match on children who do not perceive themselves to be of low math ability. We show evidence against the potential alternative explanations that differential teacher effort, skill, teaching methods, or differential attention to girls drive these effects. ⇤Eble: Teachers College, Columbia University. Email: [email protected] Hu: School of Economics and Manage- ment, University of Science and Technology Beijing [email protected]. The authors are grateful to Joe Cummins, John Friedman, Morgan Hardy, Asim Khwaja, Ilyana Kuziemko, Bentley MacCleod, Randy Reback, Jonah Rockoff, Judy Scott- Clayton, and Felipe Valencia for generous input, as well as seminar audiences at Columbia, Fordham, the 2017 IZA transat- lantic meeting, and PacDev.
    [Show full text]
  • A Classification of Biases Relating to the Production, Dissemination and Consumption of News
    A Classification of Biases Relating to the Production, Dissemination and Consumption of News Brendan Spillane Vincent Wade [email protected] [email protected] ADAPT Centre ADAPT Centre Trinity College Dublin Trinity College Dublin ABSTRACT There are many difficulties in studying bias at the production, dissemination, or consumption stages of the news pipeline. These include the difficulty of identifying high quality empirical research, the lack of agreed terminology and definitions, and the overlapping nature of many forms of bias. Much of the empirical research in the domain is disjointed and there are few examples of concerted efforts to address overarching research challenges. This paper details ongoing work to create a classification of biases relating to news. It is divided into three sub-classifications focusing on the production, dissemination, and consumption stages of the news pipeline. CCS CONCEPTS • Human-centered computing ! HCI theory, concepts and models; Interaction design theory, concepts and paradigms. KEYWORDS News; Bias; Classification of Biases ACM Reference Format: Brendan Spillane and Vincent Wade. A Classification of Biases Relating to the Production, Dissemination and Consumption of News. In Proceedings of . CHI 2020 Workshop on Detection and Design for Cognitive Biases in People and Computing Systems, April 25, 2020, Honolulu HI, USA., Article , 12 pages. © 2020. Proceedings of the CHI 2020 Workshop on Detection and Design for Cognitive Biases in People and Computing Systems, April 25, 2020, Honolulu HI, USA. Copyright is held by the owner/author(s) A Classification of Biases Relating to the Production, Dissemination and Consumption of News ,, INTRODUCTION One of the core issues affecting the study of bias at any stage of the news consumption pipeline is the lack of an easily accessible and referenceable classification of biases.
    [Show full text]
  • Bias Miguel Delgado-Rodrı´Guez, Javier Llorca
    635 J Epidemiol Community Health: first published as 10.1136/jech.2003.008466 on 13 July 2004. Downloaded from GLOSSARY Bias Miguel Delgado-Rodrı´guez, Javier Llorca ............................................................................................................................... J Epidemiol Community Health 2004;58:635–641. doi: 10.1136/jech.2003.008466 The concept of bias is the lack of internal validity or Ahlbom keep confounding apart from biases in the statistical analysis as it typically occurs when incorrect assessment of the association between an the actual study base differs from the ‘‘ideal’’ exposure and an effect in the target population in which the study base, in which there is no association statistic estimated has an expectation that does not equal between different determinants of an effect. The same idea can be found in Maclure and the true value. Biases can be classified by the research Schneeweiss.5 stage in which they occur or by the direction of change in a In this glossary definitions of the most com- estimate. The most important biases are those produced in mon biases (we have not been exhaustive in defining all the existing biases) are given within the definition and selection of the study population, data the simple classification by Kleinbaum et al.2 We collection, and the association between different have added a point for biases produced in a trial determinants of an effect in the population. A definition of in the execution of the intervention. Biases in data interpretation, writing, and citing will not the most common biases occurring in these stages is given. be discussed (see for a description of them by ..........................................................................
    [Show full text]
  • Publication and Other Reporting Biases in Cognitive Sciences
    TICS-1311; No. of Pages 7 Review Publication and other reporting biases in cognitive sciences: detection, prevalence, and prevention 1,2,3,4 5,6 7 8 John P.A. Ioannidis , Marcus R. Munafo` , Paolo Fusar-Poli , Brian A. Nosek , 1 and Sean P. David 1 Department of Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA 2 Department of Health Research and Policy, Stanford University School of Medicine, Stanford, CA 94305, USA 3 Department of Statistics, Stanford University School of Humanities and Sciences, Stanford, CA 94305, USA 4 Meta-Research Innovation Center at Stanford (METRICS), Stanford, CA 94305, USA 5 MRC Integrative Epidemiology Unit, UK Centre for Tobacco and Alcohol Studies, University of Bristol, Bristol, UK 6 School of Experimental Psychology, University of Bristol, Bristol, UK 7 Department of Psychosis Studies, Institute of Psychiatry, King’s College London, London, UK 8 Center for Open Science, and Department of Psychology, University of Virginia, VA, USA Recent systematic reviews and empirical evaluations of Definitions of biases and relevance for cognitive the cognitive sciences literature suggest that publica- sciences tion and other reporting biases are prevalent across The terms ‘publication bias’ and ‘selective reporting bias’ diverse domains of cognitive science. In this review, refer to the differential choice to publish studies or report we summarize the various forms of publication and particular results, respectively, depending on the nature reporting biases and other questionable research prac- or directionality of findings [1]. There are several forms of tices, and overview the available methods for probing such biases in the literature [2], including: (i) study publi- into their existence.
    [Show full text]
  • Public Perceptions of Media Bias: a Meta-Analysis of American Media Outlets During the 2012 Presidential Election
    Public Perception of Media Bias by Daniel Quackenbush— 51 Public Perceptions of Media Bias: A Meta-Analysis of American Media Outlets During the 2012 Presidential Election Daniel Quackenbush* Media Arts & Entertainment Elon University Abstract There has been a considerable surge of scholastic inquiry in recent years into understanding the fac- tors responsible for the fluctuating levels of public trust in theAmerican news media. With every election year, the American public continues to perpetuate the stereotype that the American news media is ideologically bi- ased, negatively shaping other citizens’ views of the American political system and impacting their willingness to participate in the electoral process. This study asserts that the likely factors contributing to public percep- tion of a liberal media bias are indicative of the ideological preferences of partisan individuals and customers, rather than any blatant compromises of professional integrity by American journalists. Furthermore, supported by a selection of existing media bias literature and a multi-tiered meta-analysis of 2012 electoral coverage patterns, this study found strong evidence to suggest an overwhelming conservative media bias within the 2012 election coverage by American media outlets. Public opinion is formed and expressed by machinery. The newspapers do an immense amount of thinking for the average man and woman. In fact, they supply them with such a continuous stream of standardized opinion, borne along upon an equally inexhaustible flood of news and sensation, collected from every part of the world every hour of the day, that there is neither the need nor the leisure for personal reflection. All this is but part of a tremendous educating process.
    [Show full text]
  • Title: Publication Bias in the Social Sciences: Unlocking the File Drawer Authors
    Title: Publication Bias in the Social Sciences: Unlocking the File Drawer Authors: Annie Franco,1 Neil Malhotra,2* Gabor Simonovits1 Affiliations: 1Department of Political Science, Stanford University, Stanford, CA 2Graduate School of Business, Stanford University, Stanford, CA *Correspondence to: [email protected] Abstract: We study publication bias in the social sciences by analyzing a known population of conducted studies221 in totalwhere there is a full accounting of what is published and unpublished. We leverage TESS, an NSF-sponsored program where researchers propose survey- based experiments to be run on representative samples of American adults. Because TESS proposals undergo rigorous peer review, the studies in the sample all exceed a substantial quality threshold. Strong results are 40 percentage points more likely to be published than null results, and 60 percentage points more likely to be written up. We provide not only direct evidence of publication bias, but also identify the stage of research production at which publication bias occurs—authors do not write up and submit null findings. One Sentence Summary: We examine published and unpublished social science studies of comparable quality from a known population and find substantial evidence of publication bias, arising from authors who do not write up and submit null findings. Main Text: Publication bias occurs when “publication of study results is based on the direction or significance of the findings” (1). One pernicious form of publication bias is the greater likelihood of statistically significant results being published than statistically insignificant results, holding fixed research quality. Selective reporting of scientific findings is often referred to as the “file drawer” problem (2).
    [Show full text]
  • How Selective Reporting Shapes Inferences About Conflict
    How Selective Reporting Shapes Inferences about Conflict Yuri M. Zhukov∗ and Matthew A. Baumy October 7, 2016 Abstract By systematically under- or over-reporting violence by different actors, media organizations convey potentially contradictory information about how a conflict is likely to unfold, and whether outside intervention is necessary to stop it. These reporting biases affect not only statistical inference, but also public knowledge and policy preferences. Using new event data on the ongoing armed conflict in Eastern Ukraine, we perform parallel analyses of data from Ukrainian, rebel, Russian and third party sources. We show that actor-specific reporting bias can yield estimates with vastly different implications for conflict resolution: Ukrainian sources predict frequent unilateral escalation by rebels, pro-Russian rebel sources predict unilateral es- calation by government troops, while outside sources predict that transgressions by either side should be quite rare. Experimental evidence suggests that news consumers tend to support in- tervention against whichever side is shown to be committing the violence. We argue that these kinds of reporting biases can potentially make conflicts more difficult to resolve – hardening attitudes against negotiated settlement, and in favor of military action. ∗[email protected], Assistant Professor of Political Science, University of Michigan ymatthew [email protected], Marvin Kalb Professor of Global Communications, Harvard Kennedy School 1 How we respond to a civil conflict depends on what we know about it. That, in turn, de- pends on where we get our information. Not every event is observable, and not every observed event is publicly reported. Information providers diverge in the events and actors that attract their attention.
    [Show full text]