Risks of Discrimination Through the Use of Algorithms
Carsten Orwat

A study compiled with a grant from the Federal Anti-Discrimination Agency by
Dr Carsten Orwat
Institute for Technology Assessment and Systems Analysis (ITAS)
Karlsruhe Institute of Technology (KIT)

Table of Contents

List of Tables
List of Abbreviations
Acknowledgement and Funding
Summary
1. Introduction
2. Terms and Basic Developments
2.1 Algorithms
2.2 Developments in data processing
2.2.1 Increasing the amount of data relating to an identifiable person
2.2.2 Expansion of algorithm-based analysis methods
2.3 Algorithmic and data-based differentiations
2.3.1 Types of differentiation
2.3.2 Scope of application
2.3.3 Automated decision-making
3. Discrimination
3.1 Terms and understanding
3.2 Types of discrimination
3.3 Statistical discrimination
3.4 Changes in statistical discrimination
4. Cases of Unequal Treatment, Discrimination and Evidence
4.1 Working life
4.2 Real estate market
4.3 Trade
4.4 Advertising and search engines
4.5 Banking industry
4.6 Medicine
4.7 Transport
4.8 State social benefits and supervision
4.9 Education
4.10 Police
4.11 Judicial and penal system
4.12 General cases of artificial intelligence
5. Causes of Risks of Discrimination
5.1 Risks in the use of algorithms, models and data sets
5.1.1 Risks in the development of algorithms and models
5.1.2 Risks in the compilation of data sets and characteristics
5.1.3 Risks with online platforms
5.1.4 Intentional discrimination and obfuscation in and by computer systems
5.1.5 Insufficient incentives for revision or abolition
5.2 Societal risks of algorithmic differentiation
5.2.1 Group membership and injustice by generalisation
5.2.2 Accumulation and amplification effects
5.2.3 Differentiations against socio-political ideas
5.2.4 Treatment as a mere means and psychological distancing
5.2.5 Endangering the free development of personality and the right to self-expression
5.2.6 Creation of structural advantage
6. Needs and Options for Action
6.1 Transparency and proof of discrimination
6.1.1 Technical options for transparency, traceability and non-discrimination
6.1.2 Improving the evidence of discrimination
6.1.3 Legal situation
6.2 More detailed regulation of algorithmic decision-making rules
6.2.1 Prohibitions of discrimination and legally protected characteristics
6.2.2 Exemptions justified on objective grounds or by the use of recognised methods
6.2.3 Prohibition of automated decisions
6.2.4 Communicative processes in differentiation decisions
6.2.5 Design of online platforms
6.3 Possibilities for equality bodies
6.3.1 Mission and competencies
6.3.2 Possibilities for investigations and evidence
6.3.3 Experiences and suggestions from other equality bodies
6.3.4 Preventive approach and cooperation possibilities
6.3.5 Proposals for the Federal Anti-Discrimination Agency
6.4 Need for societal considerations and decisions
6.4.1 Burdens on the affected individuals
6.4.2 Legitimacy of differentiations
7. Bibliography
List of Tables

Table 1: Objects of algorithmic and data-based differentiations
Table 2: Legally protected characteristics

List of Abbreviations

ACLU: American Civil Liberties Union
ADS: Antidiskriminierungsstelle des Bundes (see FADA)
AGG: General Equal Treatment Act (Allgemeines Gleichbehandlungsgesetz)
AI: Artificial intelligence
AMS: Public Employment Service (Arbeitsmarktservice)
Art.: Article
BGH: Federal Court of Justice (Bundesgerichtshof)
BDSG: Federal Data Protection Act (Bundesdatenschutzgesetz)
BDSG a. F.: Federal Data Protection Act, old version
BDSG n. F.: Federal Data Protection Act, new version
BVerfG: Federal Constitutional Court (Bundesverfassungsgericht)
BVerfGE: Decision of the Federal Constitutional Court (Entscheidung des Bundesverfassungsgerichts)
CNIL: Commission Nationale de l'Informatique et des Libertés
DDD: Défenseur des Droits
DM: Data mining
GDPR: General Data Protection Regulation
DSRL: Data Protection Directive (Datenschutz-Richtlinie)
ibid.: ibidem
i.e.: that is to say
EDPB: European Data Protection Board
ECHR: European Convention on Human Rights
FADA: Federal Anti-Discrimination Agency (Antidiskriminierungsstelle des Bundes, ADS)
e. g.: for example
et al.: and others
etc.: et cetera
EU: European Union
GG: Basic Law for the Federal Republic of Germany (Grundgesetz)
HUD: U.S. Department of Housing and Urban Development
IT: Information technology
ML: Machine learning
NFHA: National Fair Housing Alliance
para.: Paragraph
PoC: People of Colour
ROC AUC: Receiver Operating Characteristic, Area under the Curve
WP29: Article 29 Data Protection Working Party, renamed European Data Protection Board (EDPB)
YVTltk: Yhdenvertaisuus- ja tasa-arvolautakunta

Acknowledgement and Funding

This study was funded by a grant from the Federal Anti-Discrimination Agency (FADA), to whom we would like to express our sincerest gratitude. The author otherwise receives funds from the institutional funding of the Helmholtz Association via the Institute for Technology Assessment and Systems Analysis. The Helmholtz Association is funded by the federal and state governments.

Ms Nathalie Schlenzka from the Federal Anti-Discrimination Agency (FADA) provided helpful information and contacts through a query to European equality bodies. I would like to thank her and her colleagues for their constructive comments. I would also like to take this opportunity to thank the employees of the European equality bodies who answered questions about cases of discrimination and their working methods and competences. The answers have been included in the study at the points cited. I would like to thank Oliver Raabe, Moritz Renftle, Oliver Siemoneit and Reinhard Heil for numerous in-depth and extremely productive discussions. Oliver Raabe also provided invaluable assistance in reviewing legal issues. I would like to thank Arnold Sauter, Christoph Kehl and Christian Wadephul for their helpful comments. Any remaining errors can be attributed to the author.

Note: The assessments, statements and opinions expressed herein are solely those of the author and do not necessarily reflect the official opinion of the Federal Anti-Discrimination Agency.

Summary

Algorithms: The study focuses on algorithms that are used for data processing and the semi- or fully automated implementation of decision-making rules to differentiate between individuals.
Such differentiations relate to economic products, services, positions or payments as well as to state decisions and actions that affect individual freedoms or the distribution of services.

Discrimination: Algorithm-based differentiations become discriminatory if they lead to unjustified disadvantaging of persons with legally protected characteristics, in particular age, gender, ethnic origin, religion, sexual orientation or disability. The study describes cases in which algorithm- and data-based differentiations have been legally classified as discrimination or which are analysed and discussed as risks of discrimination.

Surrogate information: Algorithm- and data-based differentiations often exhibit the characteristics of so-called statistical discrimination. Typical of this kind of discrimination is the use of surrogate information, surrogate variables or proxies (e. g. age) for differentiation, because the original distinguishing characteristics (e. g. labour productivity) are difficult for decision-makers to determine by examining individual cases. These surrogate variables can themselves be protected characteristics, or they can correlate with protected characteristics. With algorithmic methods of data mining and machine learning, complex models with a large number of variables can be used instead of one or a few surrogate variables (see the illustrative sketch at the end of this summary).

Societal risks: The legitimacy of such differentiations is often justified on the grounds of efficiency in overcoming information deficits. However, they also involve societal risks such as injustice by generalisation, treatment of people as mere objects, restriction of the free development of personality, accumulation effects and growing inequality, and risks to societal goals of equality or social policy. When algorithms are developed and used, many discrimination risks result from the use of data that describe previous unequal treatment.

Needs for societal considerations: Although overcoming the technically based discrimination risks of algorithms and data is fundamental, the legitimacy of different forms of algorithmic differentiation requires societal considerations and decisions that take into account the aforementioned societal risks, the benefits of differentiation and, in particular, their distribution in society. This should lead to definitions of socially acceptable differentiations. Since in most cases such differentiations are based on the processing of comprehensive amounts of personal data, risks to the right to informational self-determination must be considered as well.

Data protection law: The current data protection law needs clarifications and corrections that would also serve anti-discrimination purposes. These relate to so-called informed consent, where affected persons must assess far-reaching potential consequences, including possible unequal treatment, at the time of consent. This approach no longer seems adequate
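The proxy mechanism described under "Surrogate information" can be illustrated with a short, purely hypothetical sketch. The Python code below (using numpy and scikit-learn) generates synthetic data in which historical decisions disadvantaged one group; a model trained on those decisions never receives the protected characteristic itself, yet reproduces the disadvantage through a correlated surrogate variable. All variable names, group proportions and coefficients are invented for illustration and are not taken from the study.

```python
# Illustrative, synthetic example (not from the study): a decision model that
# never sees the protected characteristic can still disadvantage a group when
# a correlated surrogate variable (proxy) and historically biased decisions
# are used for training.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000

# Protected characteristic (e.g. group membership); withheld from the model.
group = rng.integers(0, 2, n)

# Surrogate variable (e.g. a coarse postcode indicator) correlated with group.
proxy = (rng.random(n) < np.where(group == 1, 0.8, 0.2)).astype(float)

# A legitimate score (e.g. an income measure), independent of group.
score = rng.normal(size=n)

# Historical decisions encode previous unequal treatment: at the same score,
# members of group 1 were approved less often.
historical_approval = (score - 0.8 * group + rng.normal(scale=0.3, size=n) > 0).astype(int)

# The model is trained only on the proxy and the score.
X = np.column_stack([proxy, score])
model = LogisticRegression().fit(X, historical_approval)
approved = model.predict(X)

# The proxy lets the model partially reconstruct group membership, so the
# historical disadvantage reappears in the automated decisions.
for g in (0, 1):
    print(f"group {g}: predicted approval rate {approved[group == g].mean():.2f}")
```

In this sketch the predicted approval rates differ noticeably between the two groups although the protected characteristic is never an input, which is the pattern the summary refers to when it notes that data describing previous unequal treatment carry discrimination risks into algorithmic systems.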