Bias in Machine Learning - What Is It Good For?

Total Pages: 16

File Type: pdf, Size: 1020 KB

Bias in Machine Learning - What is it Good for?

Thomas Hellström, Virginia Dignum and Suna Bensch¹ ²

Abstract. In public media as well as in scientific publications, the term bias is used in conjunction with machine learning in many different contexts, and with many different meanings. This paper proposes a taxonomy of these different meanings, terminology, and definitions by surveying the, primarily scientific, literature on machine learning. In some cases, we suggest extensions and modifications to promote a clear terminology and completeness. The survey is followed by an analysis and discussion on how different types of biases are connected and depend on each other. We conclude that there is a complex relation between bias occurring in the machine learning pipeline that leads to a model, and the eventual bias of the model (which is typically related to social discrimination). The former bias may or may not influence the latter, in a sometimes bad, and sometimes good, way.

1 INTRODUCTION

Media, as well as scientific publications, frequently report on 'Bias in Machine Learning', and how systems based on AI or machine learning are 'sexist'³ or 'discriminatory'⁴ [10, 37]. In the field of machine learning, the term bias has an established historical meaning that, at least on the surface, totally differs from how the term is used in typical news reporting. Furthermore, even within machine learning, the term is used in very many different contexts and with very many different meanings. Definitions are not always given, and if they are, the relation to other usages of the word is not always clear. Furthermore, definitions sometimes overlap or contradict each other [8].

The main contribution of this paper is a proposed taxonomy of the various meanings of the term bias in conjunction with machine learning. When needed, we suggest extensions and modifications to promote a clear terminology and completeness. We argue that this is more than a matter of definitions of terms. Terminology shapes how we identify and approach problems, and furthermore how we communicate with others. This is particularly important in multidisciplinary work, such as application-oriented machine learning.

The taxonomy is based on a survey of published research in several areas, and is followed by a discussion on how different types of biases are connected and depend on each other. Since humans are involved in both the creation of bias, and in the application of, potentially biased, systems, the presented work is related to several of the AI-HLEG recommendations for building Human-Centered AI systems.

Machine learning is a wide research field with several distinct approaches. In this paper we focus on inductive learning, which is a cornerstone of machine learning. Even with this specific focus, the amount of relevant research is vast, and the aim of the survey is not to provide an overview of all published work, but rather to cover the wide range of different usages of the term bias.

This paper is organized as follows. Section 2 briefly summarizes related earlier work. In Section 3 we survey various sources of bias, as it appears in the different steps in the machine learning process. Section 4 contains a survey of various ways of defining bias in the model that is the outcome of the machine learning process. In Section 5 we provide a taxonomy of bias, and discuss the different types of found biases and how they relate to each other. Section 6 concludes the paper.

¹ Umeå University, Sweden, email: {thomas.hellstrom, virginia.dignum, [email protected]}
² This research is partly funded by the Swedish funding agency VINNOVA, project number 2019-01930.
³ BBC News, Oct. 10, 2018. Accessed July 29, 2020.
⁴ Reuters Technology News, Oct. 10, 2018. Accessed July 29, 2020.

arXiv:2004.00686v2 [cs.AI] 20 Sep 2020. Copyright 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

2 RELATED WORK

A number of reviews, with varying focuses related to bias, have been published recently. Barocas and Selbst [3] give a good overview of various kinds of biases in data generation and preparation for machine learning. Loftus et al. [31] review a number of both non-causal and causal notions of fairness, which is closely related to bias. Suresh and Guttag [45] identify a number of sources of bias in the machine learning pipeline. Olteanu et al. [35] investigate bias and usage of data from a social science perspective. Our analysis is complementary to the work cited above, by focusing on bias in conjunction with machine learning, and by examining a wider range of biases. We also provide a novel analysis and discussion on the connections and dependencies between the different types of biases.

3 SOURCES OF BIAS

Our survey of sources of bias is organized in sections corresponding to the major steps in the machine learning process (see Figure 1). In Section 3.1 we describe bias as a major concept in the learning step. In Section 3.2 we focus on our biased world, which is the source of information for the learning process. Section 3.3 describes the plethora of bias-related terms used in the data generation process.

3.1 Bias in learning

In inductive learning, the aim is to use a data set {(x_i, y_i)}_{i=1}^N to find a function f*(x) such that f*(x_i) approximates y_i in a good way. Each x_i is a vector of features, while y is denoted the output, target variable or label. An often-discussed case is when machine learning is used to build decision support systems that output recommendations, for example on whether to accept or deny loan applications in a bank. The features may in this case be income, property magnitude, family status, credit history, gender, and criminal record for a loan applicant, while the output y is a recommendation, 0 or 1.

Without further restrictions, infinitely many functions perfectly match any given data set, but most of them are typically useless since they simply memorize the given data set but generalize poorly for other data from the same application. Therefore, the search for a good f* has to be limited to a certain space Ω of functions. For example, the function may be assumed to be linear, which is the assumption in linear regression. While this may be sufficient in some cases, more complex function spaces, such as high-order polynomials or artificial neural networks, are often chosen. Each specific function in Ω is typically defined by a set of numerical weights.

This preference of certain functions over others was denoted bias by Tom Mitchell in his paper from 1980 with the title The Need for Biases in Learning Generalizations [34], and is a central concept in statistical learning theory. The expression inductive bias (also known as learning bias) is used to distinguish it from other types of biases. In general, inductive learning can be expressed as the minimization problem

    f* = arg min_{f ∈ Ω} L(f),    (1)

where L(f) is a cost function quantifying how well f matches the data. The most common loss function is defined as

    L(f) = Σ_{i=1}^{N} (f(x_i) − y_i)².    (2)

The process of solving the minimization problem in Eq. 1 is called training, and uses a training set {(x_i, y_i)}_{i=1}^N to search through Ω for the function f that minimizes the loss function in Eq. 2. All machine learning techniques for inductive learning (for exam…
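As a concrete illustration of Eqs. 1 and 2, the following sketch (not taken from the paper; the data, names and the choice of linear regression are illustrative assumptions) restricts Ω to linear functions and picks f* by minimizing the squared loss. The inductive bias here is the linearity assumption itself.

```python
import numpy as np

# Toy training set {(x_i, y_i)}, i = 1..N (invented values, for illustration only).
X = np.array([[1.0, 0.5], [2.0, 1.0], [3.0, 1.5], [4.0, 2.5]])  # feature vectors x_i
y = np.array([1.0, 2.1, 2.9, 4.2])                              # labels y_i

# Inductive bias: restrict the space Omega to linear functions f(x) = w . x + b.
# With this restriction, minimizing the squared loss of Eq. 2 is ordinary least squares.
X_aug = np.hstack([X, np.ones((X.shape[0], 1))])     # append a constant feature for b
weights, *_ = np.linalg.lstsq(X_aug, y, rcond=None)  # arg min_w ||X_aug w - y||^2

def f_star(x):
    """The learned function f*; approximates y_i for the training points."""
    return np.append(x, 1.0) @ weights

loss = np.sum((X_aug @ weights - y) ** 2)  # value of the loss L(f*) from Eq. 2
print(f_star(np.array([2.5, 1.2])), loss)
```

Choosing a richer space Ω (for example a neural network) changes the inductive bias, not the general formulation.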
The learning step involves more possible sources of bias. One example is denoted uncertainty bias [19], and has to do with the probability values that are often computed together with each produced classification in a machine learning algorithm. The probability represents uncertainty, and typically has to be above a set threshold for a classification to be considered. For example, a decision support system for bank loan applications may reject an application although it is classified as 'approve', because the probability is below the threshold. This threshold is usually manually set, and may create a bias against underrepresented demographic groups, since less data normally leads to higher uncertainty.
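A minimal sketch of this thresholding mechanism (invented threshold, labels and numbers; not code from the paper) shows how a manually set confidence threshold can turn an 'approve' classification into a rejection, which tends to happen more often for groups the model is less certain about:

```python
# Sketch of uncertainty-based thresholding in a decision support system (illustrative only).
# The classifier is assumed to return a label together with a probability expressing
# its confidence in that label.

THRESHOLD = 0.80  # manually set confidence threshold

def final_decision(predicted_label: str, probability: float) -> str:
    """Accept the classifier's 'approve' only if its confidence clears the threshold."""
    if predicted_label == "approve" and probability >= THRESHOLD:
        return "approve"
    return "deny"

# Two applications classified as 'approve'; the second comes from an underrepresented
# group for which less training data leads to a less certain prediction.
print(final_decision("approve", 0.91))  # -> approve
print(final_decision("approve", 0.72))  # -> deny, despite the 'approve' classification
```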
3.2 A Biased World

The word 'bias' has an established normative meaning in legal language, where it refers to 'judgement based on preconceived notions or prejudices, as opposed to the impartial evaluation of facts' [8]. The world around us is often described as biased in this sense, and since most machine learning techniques simply mimic large amounts of observations of the world, it should come as no surprise that the resulting systems also express the same bias.

This bias of the world is sometimes denoted historical bias. Historical bias has to do with data that in itself has unwanted properties that are regarded as biased: 'Historical bias arises even if the data is perfectly measured and sampled, if the world as it is or was leads a model to produce outcomes that are not wanted' [45]. The bias of the world obviously has many dimensions, each one describing some unwanted aspect of the world.

Sometimes, the bias in the world is analyzed by looking at correlations between features, and between features and the label. The authors of [52] show that in a certain data set, the label cooking co-occurs disproportionately often with woman, as compared to man. Since most machine learning techniques depend on correlations, such biases may propagate to learned models or classifiers.
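One simple way to surface such correlations, in the spirit of the analysis in [52] (though not their actual method), is to compare how often a label co-occurs with each value of a sensitive feature. The sketch below uses invented counts and names purely for illustration:

```python
from collections import Counter

# Toy (gender, label) pairs standing in for an annotated data set (invented data).
samples = [("woman", "cooking"), ("woman", "cooking"), ("woman", "driving"),
           ("man", "cooking"), ("man", "driving"), ("man", "driving"),
           ("woman", "cooking"), ("man", "driving")]

pair_counts = Counter(samples)                 # counts of (feature value, label) pairs
group_counts = Counter(g for g, _ in samples)  # counts per feature value

# Rate at which the label 'cooking' co-occurs with each gender value.
for group in ("woman", "man"):
    rate = pair_counts[(group, "cooking")] / group_counts[group]
    print(f"P(label=cooking | {group}) = {rate:.2f}")

# A large gap between the two rates is the kind of correlation a trained model
# is likely to pick up, and possibly amplify.
```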
Recommended publications
  • An Unconventional Solution for a Sociotechnical Problem
    RESOLVING THE PROBLEM OF ALGORITHMIC DISSONANCE: AN UNCONVENTIONAL SOLUTION FOR A SOCIOTECHNICAL PROBLEM. Shiv Issar | Department of Sociology, University of Wisconsin-Milwaukee | [email protected] | June 29, 2021. Abstract: In this paper, I propose the concept of “algorithmic dissonance”, which characterizes the inconsistencies that emerge through the fissures that lie between algorithmic systems that utilize system identities, and sociocultural systems of knowledge that interact with them. A product of human-algorithm interaction, algorithmic dissonance builds upon the concepts of algorithmic discrimination and algorithmic awareness, offering greater clarity towards the comprehension of these sociotechnical entanglements. By employing Du Bois’ concept of “double consciousness” and black feminist theory, I argue that all algorithmic dissonance is racialized. Next, I advocate for the use of speculative methodologies and art for the creation of critically informative sociotechnical imaginaries that might serve as a basis for the sociological critique and resolution of algorithmic dissonance. Algorithmic dissonance can be an effective check against structural inequities, and of interest to scholars and practitioners concerned with running “algorithm audits”. Keywords: Algorithmic Dissonance, Algorithmic Discrimination, Algorithmic Bias, Algorithm Audits, System Identity, Speculative Methodologies. Introduction: This paper builds upon Aneesh’s (2015) notion of system identities by utilizing their nature beyond their primary function of serving algorithmic systems and the process of algorithmic decision-making. As Aneesh (2015) describes, system identities are data-bound constructions built by algorithmic systems for the (singular) purpose of being used by algorithmic systems. They are meant to simulate a specific dimension (or a set of dimensions) of personhood, just as algorithmic systems and algorithmic decision-making are meant to simulate the agents of different social institutions, and the decisions that those agents would make.
  • A Task-Based Taxonomy of Cognitive Biases for Information Visualization
    A Task-based Taxonomy of Cognitive Biases for Information Visualization. Evanthia Dimara, Steven Franconeri, Catherine Plaisant, Anastasia Bezerianos, and Pierre Dragicevic. Three kinds of limitations: the computer, the display, and the human. Human vision has limitations (perceptual bias, e.g. magnitude estimation and color perception) and human reasoning has limitations (cognitive bias: behaviors when humans consistently behave irrationally). Pohl’s criteria distilled: cognitive biases are predictable and consistent, people are unaware they’re doing them, and they are not misunderstandings. Examples: Ambiguity effect, Anchoring or focalism, Anthropocentric thinking, Anthropomorphism or personification, Attentional bias, Attribute substitution, Automation bias, Availability heuristic, Availability cascade, Backfire effect, Bandwagon effect, Base rate fallacy or base rate neglect, Belief bias, Ben Franklin effect, Berkson's paradox, Bias blind spot, Choice-supportive bias, Clustering illusion, Compassion fade, Confirmation bias, Congruence bias, Conjunction fallacy, Conservatism (belief revision), Continued influence effect, Contrast effect, Courtesy bias, Curse of knowledge, Declinism, Decoy effect, Default effect, Denomination effect, Disposition effect, Distinction bias, Dread aversion, Dunning–Kruger effect, Duration neglect, Empathy gap, End-of-history illusion, Endowment effect, Exaggerated expectation, Experimenter's or expectation bias, …
  • Incorporating Prior Domain Knowledge Into Inductive Machine Learning Its Implementation in Contemporary Capital Markets
    University of Technology Sydney. Incorporating Prior Domain Knowledge into Inductive Machine Learning: Its implementation in contemporary capital markets. A dissertation submitted for the degree of Doctor of Philosophy in Computing Sciences by Ting Yu, Sydney, Australia, 2007. © Copyright by Ting Yu 2007. CERTIFICATE OF AUTHORSHIP/ORIGINALITY: I certify that the work in this thesis has not previously been submitted for a degree nor has it been submitted as a part of requirements for a degree except as fully acknowledged within the text. I also certify that the thesis has been written by me. Any help that I have received in my research work and the preparation of the thesis itself has been acknowledged. In addition, I certify that all information sources and literature used are indicated in the thesis. Signature of Candidate. Table of Contents: 1 Introduction; 1.1 Overview of Incorporating Prior Domain Knowledge into Inductive Machine Learning; 1.2 Machine Learning and Prior Domain Knowledge; 1.2.1 What is Machine Learning?; 1.2.2 What is prior domain knowledge?; 1.3 Motivation: Why is Domain Knowledge needed to enhance Inductive Machine Learning?; 1.3.1 Open Areas; 1.4 Proposal and Structure of the Thesis; 2 Inductive Machine Learning and Prior Domain Knowledge; 2.1 Overview of Inductive Machine Learning; 2.1.1 Consistency and Inductive Bias; 2.2 Statistical Learning Theory Overview; 2.2.1 Maximal Margin Hyperplane; 2.3 Linear Learning Machines and Kernel Methods; 2.3.1 Support Vector Machines; …
  • Algorithmic Awareness
    Algorithmic Awareness: Conversations with Young Canadians about Artificial Intelligence and Privacy. This report can be downloaded from: mediasmarts.ca/research-policy. Cite as: Brisson-Boivin, Kara and Samantha McAleese. (2021). “Algorithmic Awareness: Conversations with Young Canadians about Artificial Intelligence and Privacy.” MediaSmarts. Ottawa. Written for MediaSmarts by: Kara Brisson-Boivin, PhD, and Samantha McAleese. MediaSmarts, 205 Catherine Street, Suite 100, Ottawa, ON, Canada K2P 1C3. T: 613-224-7721, F: 613-761-9024, toll-free line: 1-800-896-3342, [email protected], mediasmarts.ca, @mediasmarts. This research was made possible by the financial contributions from the Office of the Privacy Commissioner of Canada’s contributions program. MediaSmarts © 2021. Table of Contents: Key Terms; Introduction; Algorithms and Artificial Intelligence: What We (Don't) Know; …
  • Efficient Machine Learning Models of Inductive Biases Combined with the Strengths of Program Synthesis (PBE) to Combat Implicit Biases of the Brain
    The Dangers of Algorithmic Autonomy: Efficient Machine Learning Models of Inductive Biases Combined With the Strengths of Program Synthesis (PBE) to Combat Implicit Biases of the Brain. The Harvard community has made this article openly available. Citation: Halder, Sumona. 2020. The Dangers of Algorithmic Autonomy: Efficient Machine Learning Models of Inductive Biases Combined With the Strengths of Program Synthesis (PBE) to Combat Implicit Biases of the Brain. Bachelor's thesis, Harvard College. Citable link: https://nrs.harvard.edu/URN-3:HUL.INSTREPOS:37364669. Terms of Use: This article was downloaded from Harvard University’s DASH repository, and is made available under the terms and conditions applicable to Other Posted Material, as set forth at http://nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of-use#LAA. A Thesis Presented by Sumona Halder to the Department of Computer Science and Mind Brain Behavior, in Partial Fulfillment of the Requirements for the Degree of Bachelor of Arts in the Subject of Computer Science on the Mind Brain Behavior Track. Harvard University, April 10th, 2020. Table of Contents: Abstract; Introduction; What is Implicit Bias? (Introduction and study of the underlying cognitive and social mechanisms; Methods to combat Implicit Bias; The Importance of Reinforced Learning); Inductive Bias and Machine Learning Algorithms (What is Inductive Bias and why is it useful?; How is Inductive Reasoning translated in Machine Learning Algorithms?; A Brief History of Inductive Bias Research; Current Roadblocks on Research with Inductive Biases in Machines; How Inductive Biases have been incorporated into Reinforced Learning); …
  • Algorithm Bias Playbook Presentation
    Algorithmic Bias Playbook. Ziad Obermeyer, Rebecca Nissan, Michael Stern, Stephanie Eaneff, Emily Joy Bembeneck, Sendhil Mullainathan. June 2021. Is your organization using biased algorithms? How would you know? What would you do if so? This playbook describes 4 steps your organization can take to answer these questions. It distills insights from our years of applied work helping others diagnose and mitigate bias in live algorithms. Algorithmic bias is everywhere. Our work with dozens of organizations—healthcare providers, insurers, technology companies, and regulators—has taught us that biased algorithms are deployed throughout the healthcare system, influencing clinical care, operational workflows, and policy. This playbook will teach you how to define, measure, and mitigate racial bias in live algorithms. By working through concrete examples—cautionary tales—you’ll learn what bias looks like. You’ll also see reasons for optimism—success stories—that demonstrate how bias can be mitigated, transforming flawed algorithms into tools that fight injustice. Who should read this? We wrote this playbook with three kinds of people in mind. ● C-suite leaders (CTOs, CMOs, CMIOs, etc.): Algorithms may be operating at scale in your organization—but what are they doing? And who is responsible? This playbook will help you think strategically about how algorithms can go wrong, and what your technical teams can do about it. It also lays out oversight structures you can put in place to prevent bias. ● Technical teams working in health care: We’ve found that the difference between biased and unbiased algorithms is often a matter of subtle technical choices. If you build algorithms, this playbook will help you make those choices better.
  • Algorithmic Bias on the Implicit Biases of Social Technology
    Algorithmic Bias: On the Implicit Biases of Social Technology. Gabbrielle M Johnson, New York University. Abstract: Often machine learning programs inherit social patterns reflected in their training data without any directed effort by programmers to include such biases. Computer scientists call this algorithmic bias. This paper explores the relationship between machine bias and human cognitive bias. In it, I argue similarities between algorithmic and cognitive biases indicate a disconcerting sense in which sources of bias emerge out of seemingly innocuous patterns of information processing. The emergent nature of this bias obscures the existence of the bias itself, making it difficult to identify, mitigate, or evaluate using standard resources in epistemology and ethics. I demonstrate these points in the case of mitigation techniques by presenting what I call ‘the Proxy Problem’. One reason biases resist revision is that they rely on proxy attributes, seemingly innocuous attributes that correlate with socially-sensitive attributes, serving as proxies for the socially-sensitive attributes themselves. I argue that in both human and algorithmic domains, this problem presents a common dilemma for mitigation: attempts to discourage reliance on proxy attributes risk a tradeoff with judgement accuracy. This problem, I contend, admits of no purely algorithmic solution. 1 Introduction: On March 23rd, 2016, Microsoft Corporation released Tay, an artificial intelligence (AI) Twitter chatbot intended to mimic the language patterns of a 19-year-old American girl. Tay operated by learning from human Twitter users with whom it interacted. Only 16 hours after its launch, Tay was shut down for authoring a number of tweets endorsing Nazi ideology and harassing other Twitter users.
  • When Do Employees Perceive Their Skills to Be Firm-Specific?
    Academy of Management Journal, 2016, Vol. 59, No. 3, 766–790. http://dx.doi.org/10.5465/amj.2014.0286. MICRO-FOUNDATIONS OF FIRM-SPECIFIC HUMAN CAPITAL: WHEN DO EMPLOYEES PERCEIVE THEIR SKILLS TO BE FIRM-SPECIFIC? JOSEPH RAFFIEE, University of Southern California; RUSSELL COFF, University of Wisconsin-Madison. Drawing on human capital theory, strategy scholars have emphasized firm-specific human capital as a source of sustained competitive advantage. In this study, we begin to unpack the micro-foundations of firm-specific human capital by theoretically and empirically exploring when employees perceive their skills to be firm-specific. We first develop theoretical arguments and hypotheses based on the extant strategy literature, which implicitly assumes information efficiency and unbiased perceptions of firm-specificity. We then relax these assumptions and develop alternative hypotheses rooted in the cognitive psychology literature, which highlights biases in human judgment. We test our hypotheses using two data sources from Korea and the United States. Surprisingly, our results support the hypotheses based on cognitive bias—a stark contrast to expectations embedded within the strategy literature. Specifically, we find organizational commitment and, to some extent, tenure are negatively related to employee perceptions of firm-specificity. We also find that employer-provided on-the-job training is unrelated to perceived firm-specificity. These results suggest that firm-specific human capital, as perceived by employees, may drive behavior in ways unanticipated by existing theory—for example, with respect to investments in skills or turnover decisions. This, in turn, may challenge the assumed relationship between firm-specific human capital and sustained competitive advantage.
  • Deep Automation Bias: How to Tackle a Wicked Problem of AI?
    Big Data and Cognitive Computing, Article. Deep Automation Bias: How to Tackle a Wicked Problem of AI? Stefan Strauß, Institute of Technology Assessment (ITA), Austrian Academy of Sciences, 1030 Vienna, Austria. Abstract: The increasing use of AI in different societal contexts intensified the debate on risks, ethical problems and bias. Accordingly, promising research activities focus on debiasing to strengthen fairness, accountability and transparency in machine learning. There is, though, a tendency to fix societal and ethical issues with technical solutions that may cause additional, wicked problems. Alternative analytical approaches are thus needed to avoid this and to comprehend how societal and ethical issues occur in AI systems. Despite various forms of bias, ultimately, risks result from eventual rule conflicts between the AI system behavior due to feature complexity and user practices with limited options for scrutiny. Hence, although different forms of bias can occur, automation is their common ground. The paper highlights the role of automation and explains why deep automation bias (DAB) is a metarisk of AI. Based on former work it elaborates the main influencing factors and develops a heuristic model for assessing DAB-related risks in AI systems. This model aims at raising problem awareness and training on the sociotechnical risks resulting from AI-based automation and contributes to improving the general explicability of AI systems beyond technical issues. Keywords: artificial intelligence; machine learning; automation bias; fairness; transparency; accountability; explicability; uncertainty; human-in-the-loop; awareness raising. Citation: Strauß, S. Deep Automation Bias: How to Tackle a Wicked Problem of AI? Big Data Cogn. … 1. Introduction …
  • Review Into Bias in Algorithmic Decision-Making
    Review into bias in algorithmic decision-making, November 2020. Contents: Preface (by the Board of the Centre for Data Ethics and Innovation); Executive summary; Part I: Introduction (1. Background and scope; 2. The issue); Part II: Sector reviews (3. Recruitment; 4. Financial services; 5. Policing; 6. Local government); Part III: Addressing the challenges (7. Enabling fair innovation; 8. The regulatory environment; 9. Transparency in the public sector; 10. Next steps and future challenges; 11. Acknowledgements). Preface: Fairness is a highly prized human value. Societies in which individuals can flourish need to be held together by practices and institutions that are regarded as fair. What it means to be fair has been much debated throughout history, rarely more so than in recent months. Issues such as the global Black Lives Matter movement, the “levelling up” of regional inequalities within the UK, and the many complex questions of fairness raised by the COVID-19 pandemic have kept fairness and equality at the centre of public debate. Algorithms, like all technology, should work for people, and not against them. This is true in all sectors, but especially key in the public sector. When the state is making life-affecting decisions about individuals, that individual often can’t go elsewhere. Society may reasonably conclude that justice requires decision-making processes to be designed so that human judgment can intervene where needed to achieve fair and reasonable outcomes for each person, informed by individual evidence. Data gives us a powerful weapon to see where bias is occurring and measure whether our …
  • Review into Bias in Algorithmic Decision-Making, 12th March 2021
    Review into Bias in Algorithmic Decision-Making, 12th March 2021. Lara Macdonald, Senior Policy Advisor, CDEI; Ghazi Ahamat, Senior Policy Advisor, CDEI; Dr Adrian Weller, CDEI Board Member and Programme Director for AI at the Alan Turing Institute. @CDEIUK. Government’s role in reducing algorithmic bias: As a major user of technology, government and the public sector should set standards, provide guidance and highlight good practice. As a regulator, government needs to adapt existing regulatory frameworks to incentivise ethical innovation. Bias, discrimination and fairness: We interpreted bias in algorithmic decision-making as the use of algorithms which can cause a systematic skew in decision-making that results in unfair outcomes. Some forms of bias constitute discrimination under UK equality (anti-discrimination) law, namely when bias leads to unfair treatment based on certain protected characteristics. There are also other kinds of algorithmic bias that are non-discriminatory, but still lead to unfair outcomes. Fairness is about much more than the absence of bias, and there are multiple (incompatible) concepts of fairness. How should organisations address algorithmic bias? Guidance to organisation leaders and boards: understand the capabilities and limits of algorithmic tools, and consider carefully whether individuals will be fairly treated by the decision-making process that the tool forms part of. Achieving this in practice will include building (diverse, multidisciplinary) internal capacity and understanding risks of bias by measurement …
  • Survivorship Bias Mitigation in a Recidivism Prediction Tool
    EasyChair Preprint № 4903. Survivorship Bias Mitigation in a Recidivism Prediction Tool. Ninande Vermeer, Alexander Boer and Radboud Winkels. EasyChair preprints are intended for rapid dissemination of research results and are integrated with the rest of EasyChair. January 15, 2021. SURVIVORSHIP BIAS MITIGATION IN A RECIDIVISM PREDICTION TOOL. Ninande Vermeer / Alexander Boer / Radboud Winkels. FNWI, University of Amsterdam, Netherlands; [email protected]. KPMG, Amstelveen, Netherlands; [email protected]. PPLE College, University of Amsterdam, Netherlands; [email protected]. Keywords: survivorship bias, AI Fairness 360, bias mitigation, recidivism. Abstract: Survivorship bias is the fallacy of focusing on entities that survived a certain selection process and overlooking the entities that did not. This common form of bias can lead to wrong conclusions. AI Fairness 360 is an open-source toolkit that can detect and handle bias using several mitigation techniques. However, what if the bias in the dataset is not bias, but rather a justified unbalance? Bias mitigation while the ‘bias’ is justified is undesirable, since it can have a serious negative impact on the performance of a prediction tool based on machine learning. In order to make well-informed product design decisions, it would be appealing to be able to run simulations of bias mitigation in several situations to explore its impact. This paper describes the first results in creating such a tool for a recidivism prediction tool. The main contribution is an indication of the challenges that come with the creation of such a simulation tool, specifically a realistic dataset. 1. Introduction: The substitution of human decision making with Artificial Intelligence (AI) technologies increases the dependence on trustworthy datasets.