Bulletin of the Technical Committee on Data Engineering

December 2020 Vol. 43 No. 4 IEEE Computer Society

Letters
Letter from the Editor-in-Chief ...... Haixun Wang 1
Letter from the Special Issue Editors ...... James Foulds and Shimei Pan 2

Special Issue on Interdisciplinary Perspectives on Fairness and Artificial Intelligence Systems
Transforming the Culture: Internet Research at the Crossroads ...... Safiya Umoja Noble and Sarah T. Roberts 3
Responsible Design Through Interdisciplinary Collaboration: Contact Tracing Apps, Surveillance and the Prospect of Building User Trust ...... Jennifer Keating 11
Social Media Data - Our Ethical Conundrum ...... Lisa Singh, Agoritsa Polyzou, Yanchen Wang, Jason Farr and Carole Roan Gresenz 23
Taming Technical Bias in Pipelines ...... Sebastian Schelter and Julia Stoyanovich 39
Are Parity-Based Notions of AI Fairness Desirable? ...... James Foulds and Shimei Pan 51
Trimming the Thorns of AI Fairness Research ...... Jared Sylvester and Edward Raff 74

Conference and Journal Notices
TCDE Membership Form ...... 85

Editorial Board

Editor-in-Chief
Haixun Wang
Instacart
50 Beale Suite
San Francisco, CA, 94107
[email protected]

Associate Editors

Lei Chen
Department of Computer Science and Engineering
HKUST
Hong Kong, China

Sebastian Schelter
University of Amsterdam
1012 WX Amsterdam, Netherlands

Shimei Pan, James Foulds
Information Systems Department
UMBC
Baltimore, MD 21250

Jun Yang
Department of Computer Sciences
Duke University
Durham, NC 27708

Distribution
Brookes Little
IEEE Computer Society
10662 Los Vaqueros Circle
Los Alamitos, CA 90720
[email protected]

TCDE Executive Committee

Chair
Erich J. Neuhold
University of Vienna

Executive Vice-Chair
Karl Aberer
EPFL

Executive Vice-Chair
Thomas Risse
Goethe University Frankfurt

Vice Chair
Malu Castellanos
Teradata Aster

Vice Chair
Xiaofang Zhou
The University of Queensland

Editor-in-Chief of Data Engineering Bulletin
Haixun Wang
Instacart

Awards Program Coordinator
Amr El Abbadi
University of California, Santa Barbara

Chair Awards Committee
Johannes Gehrke
Microsoft Research

Membership Promotion
Guoliang Li
Tsinghua University

TCDE Archives
Wookey Lee
INHA University

Advisor
Masaru Kitsuregawa
The University of Tokyo

Advisor
Kyu-Young Whang
KAIST

SIGMOD and VLDB Endowment Liaison
Ihab Ilyas
University of Waterloo

The TC on Data Engineering
Membership in the TC on Data Engineering is open to all current members of the IEEE Computer Society who are interested in database systems. The TCDE web page is http://tab.computer.org/tcde/index.html.

The Data Engineering Bulletin
The Bulletin of the Technical Committee on Data Engineering is published quarterly and is distributed to all TC members. Its scope includes the design, implementation, modelling, theory and application of database systems and their technology.
Letters, conference information, and news should be sent to the Editor-in-Chief. Papers for each issue are solicited by and should be sent to the Associate Editor responsible for the issue.
Opinions expressed in contributions are those of the authors and do not necessarily reflect the positions of the TC on Data Engineering, the IEEE Computer Society, or the authors' organizations.
The Data Engineering Bulletin web site is at http://tab.computer.org/tcde/bull_about.html.

Letter from the Editor-in-Chief

Over the past few years, we have come to realize that a great danger of technology is the unfair bias that AI systems create, reinforce, and propagate. While the source of such bias is usually human, machines' homogeneity and efficiency could easily magnify its damage, a concern further exacerbated by the fact that AI systems are being trusted to make more and more critical decisions in almost every aspect of human experience.

Data is often the channel through which bias is transmitted from human experience to machines. For example, a training dataset with inherent societal bias, or simply a limited and unrepresentative training dataset, may lead to machine learning models that are unfair and biased towards a specific class. However, detecting and correcting unfair bias in AI systems is much more than just a data management task. In particular, since bias starts with humans and will eventually affect humans, addressing unfair bias requires insights from psychological, societal, and cultural perspectives.

James Foulds and Shimei Pan put together the current issue, Interdisciplinary Perspectives on Fairness and Artificial Intelligence Systems, which consists of six papers from leading researchers in multidisciplinary areas. It is an excellent collection of ideas and best practices on this important topic.

Haixun Wang
Instacart

Letter from the Special Issue Editors

As Artificial Intelligence (AI) and Machine Learning (ML) are increasingly used to make consequential decisions that may critically impact individuals and many aspects of our society, such as criminal justice, health care, education, and employment, there is a growing recognition that AI systems may perpetuate or, worse, exacerbate the unfairness of existing social systems. To tackle this problem, there has been a sharp recent focus on fair AI/ML research in the machine learning community. These efforts tend to focus on how to engineer fair AI models by (a) defining appropriate metrics of fairness and (b) designing new algorithms to mitigate bias and ensure fairness. They frequently overlook, however, the messy, complex, and ever-changing contexts in which these systems are deployed. Because AI fairness is a complex socio-technical issue that cannot be addressed by a purely technical solution, designing fair AI systems requires close collaboration across multiple disciplines to deeply integrate the social, historical, legal, and technical context and concerns in the design process.

In this special issue on Interdisciplinary Perspectives on Fairness and Artificial Intelligence Systems, leading researchers from engineering, social science, and the humanities present their work on ethical considerations in developing fair AI systems specifically, and responsible socio-technical systems in general. We sought high-quality contributions that integrate ideas from more than one field across the disciplines of technology, social science, and the humanities. These papers provide readers with deep insights into the nature and complexity of AI biases, their manifestations in practical socio-technical systems, and typical mitigation strategies. They also identify opportunities for the AI community to engage with experts from the humanities.

Safiya U. Noble and Sarah T. Roberts from UCLA seek to expand the conversations about socio-technical systems beyond individual, moral, and ethical concerns. They believe that the field of "ethical AI" must contend with how it affects and is affected by power structures that encode systems of sexism, racism, and class. They advocate the need for independent research institutes, such as the UCLA Center for Critical Internet Inquiry (C2i2), to promote investigations into the politics, economics, and impacts of technological systems.

Jennifer Keating from the University of Pittsburgh discusses the importance of incorporating ethical standards and responsible design features into the development of new technologies. She uses Covid-19 contact tracing as a case study to illustrate how rapidly developed tools can have unintended consequences if they are not carefully designed and monitored. She advocates that technologists should collaborate with experts from the humanities to integrate deeper cultural concerns and social/political context into technology development.

Lisa Singh and her co-authors from Georgetown University overview the challenges associated with using social media data and the ethical considerations they create. They frame the ethical dilemmas within the context of data privacy and algorithmic fairness and show how and when different ethical concerns arise.

Sebastian Schelter from the University of Amsterdam and Julia Stoyanovich from NYU present a discussion of technical bias that arises in the data processing pipeline. They identify a number of potential sources of bias during the preprocessing, model development, and deployment phases of the ML development lifecycle, with illustrative examples, and show how software support can help avoid these technical bias issues. The broader point of the work is to shed light on the challenge of bias introduced by data engineering decisions, and to promote an emerging research direction on developing solutions to mitigate it.

James Foulds and Shimei Pan from UMBC aim to shed light on whether parity-based metrics are valid measures of AI fairness. They consider the arguments both for and against parity-based fairness definitions and provide a set of guidelines on their use in different contexts.

Finally, Jared Sylvester and Edward Raff from Booz Allen Hamilton argue that good-faith efforts toward implementing fairness in practical ML applications should be encouraged, even when those fairness interventions do not follow from completely resolving thorny philosophical debates, as this represents progress over the (likely unfair) status quo. This viewpoint is discussed in several contexts: choosing the right fairness metric, solving trolley problems for self-driving cars, and selecting hyper-parameters for fair learning algorithms.

James Foulds and Shimei Pan
University of Maryland, Baltimore County

Transforming the Culture: Internet Research at the Crossroads

Safiya Umoja Noble
University of California, Los Angeles
[email protected]

Sarah T. Roberts
University of California, Los Angeles
[email protected]

Abstract

The topic of justice, fairness, bias, labor and their relation to the products and practices of technology and internet companies has been a subject of our concern for nearly a decade. We see these challenges, from the organizing logics of the technology sector with respect to algorithmic discrimination to labor practices in commercial content moderation, as key pathways into better understanding the creation and maintenance of problems made by the technology sector that cannot be solved with techno-solutionism. While our work has been closely aligned to research and advocacy broadly construed in the domain of ethics and AI, we seek to expand the conversations about sociotechnical systems beyond individual, moral and ethical concerns to those of structures, practices, and policies, which would allow for interdisciplinary frameworks from the fields of critical information studies, sociology, the social sciences, and the humanities. To make legible the paradigm-shifting work we think could be taken up by scholars at colleges and universities, we outline the contours and specifics of institutionalizing these approaches through a research center, the UCLA Center for Critical Internet Inquiry (C2i2), and its activities. By making visible the need for such a space, and our experiences and values, our hope is to make transparent the process and possibilities for centering justice and fairness in the world, rather than the prevailing technosolutionism we see emerging within conversations and initiatives focused on ethics and technology.

1 Introduction

In the summer of 2018, we received approval for the establishment of a new UCLA Center for Critical Internet Inquiry (C2i2). The proposed Center would be based within the Department of Information Studies, in the Graduate School of Education & Information Studies, but would be campus-wide in scope. The effort was designed to address the societal impact of internet platforms, the social construction and effects of the data they generate and disseminate, and their various drivers, with a keen focus on issues of racial justice and gender equity. At the time of our founding, UCLA did not have any organized research unit that exclusively focused on this extraordinarily important and pervasive area of twenty-first century life, culture and economy.

Our effort to develop and centralize a robust, visible institutional infrastructure at UCLA and beyond, one that provides researchers and instructors with a locus from which to inform internet development and policy, has not been without challenges. This work, by its very nature, challenges received notions of the internet and other digital technologies as primarily liberatory, beneficent or, at the least, value-neutral. Efforts to address the potentials and pitfalls of the internet for people and communities who are marginalized and underrepresented with respect to the digital are happening at a moment of austerity, when universities are increasingly reliant upon corporate and private donations to stay afloat in the wake of shrinking allocations from the legislators in Sacramento to the state's robust California Community Colleges, its world-class California State University system, and the flagship research campuses across the University of California.

2 The State of Internet Research

The public is increasingly eager to develop its own understanding and ability to actively participate in the steering of the digital technologies, social media platforms, and internet usage that now characterize much of everyday life, yet there are few mechanisms that afford such intervention. Those who should act in their stead, such as legislators and policy makers, legal professionals, educators, and others in gatekeeping capacities, often lack a full picture of these technologies, their processes, and their social implications, even when they are sympathetic to and energized by the public's desire to wrest back control. For much of their existence, Silicon Valley's social media firms have enjoyed close, even cozy, relationships with legislators in Washington. Even as the tide of general sentiment has turned over the past few years, the grilling that Senate subcommittees have intended to give the executives of those firms has often fallen flat simply due to a lack of precision or understanding on the part of the questioners. In both cases, the public has stood to lose. Meanwhile, there has been an effort by these same firms, often along with university partners that have heretofore been largely uncritical of them, to get out ahead of any potential lawmaking that would curtail their activities by appearing to self-regulate. The most common iteration of this attempt has come in the nascent development of a variety of "ethics" initiatives, boards and research teams popping up inside of and adjacent to the major corporations. Those initiatives, too, are not without significant flaws. We are fundamentally concerned with the industry regulating itself in lieu of responding to public policy and providing accountability. While self-reflection is important, as are increasing efforts to broaden frameworks of responsibility, we largely see that self-regulation is insufficient as the industry leaders are the subjects of antitrust lawsuits, EEOC violations, and investigations into consumer harm.

We are indebted to a number of scholars who are influencing our thinking about the politics and power embedded in the digital, and whose work we are in dialog with on a continual basis [2, 7, 9, 10, 14, 16, 17, 18, 20, 21]. There are, of course, many important social scientists and humanists whose work has opened the way for scholars in computing to take up the systemic issues of fairness and equity. What we often find is that scholars in computer science and related engineering fields do not cite the work of the scholars who have framed the debate, thus making the need for epicenters of interdisciplinary critical scholarship even more crucial. Indeed, the ability to invoke issues of fairness has been made legible and plausible because humanists and social scientists have provided the evidence that has forced these issues into view. For instance, [3]'s review of ethics and fairness in the fields of machine learning and artificial intelligence is an important overview of how our colleagues in computing fields are increasingly limited by the origins of Western philosophy as they cultivate "ethical AI." We will not repeat here the work done to trace the histories of liberalism and its limitations as applied to computing and digital technologies,1 but we note that this previous careful study of the origins of liberal philosophy that bolsters the field of ethics deeply informs our own disposition toward the limits of this emerging field.
In particular, we believe that the field of "ethical AI" must contend with how it affects and is affected by power structures that encode systems of sexism, racism, and class. Instead of depoliticizing these systems, we embrace a sociological orientation, in the tradition of scholars like [9], who has adeptly framed and helped us better understand more powerful analytics like oppression and discrimination in lieu of words like bias and ethics, which obfuscate the power analyses and interventions so desperately needed. In our research, we use structural and systems-level analyses that can properly account for the impact of the socio-technical assemblages that make up digital ecosystems and infrastructures.

1See [4].

Through our studies of the digital, we uncover opportunities for accountability for harms that extend beyond individual moral and ethical choices to public policy, labor and employment practices, supply-chain business practices, environmental interactions, and a variety of approaches that can have tremendous impact at scale. Without these approaches, the work of ethical AI is greatly reduced to individual, technical, and organizational-level failings against some imagined "fair" standard that, itself, is dislocated from fairness as a matter of civil, human, or sovereign rights tied to political, economic, and social struggles. Because of these analytical framings, we are able to examine the material dimensions of internet-enabled digital infrastructures and practices that involve many factors that extend beyond algorithms and AI to include workers, legal and financial practices, and consolidations of power.

[4] wrote about the way in which technology corporations, in an effort to minimize risk from the damages associated with their discriminatory and faulty products, are performing reputation management through claims to be more accountable, fair, transparent, and ethical. Several in-house ethical AI teams, corporate-sponsored research think tanks, and non-profits aligned with industry are producing myriad conferences, white papers, research publications and campaigns that seek to define the landscape of ethical AI. They note:

Moreover, data trusts and research partnerships between universities, policy think tanks, and technology corporations have been established and revamped as a go-to strategy for effecting a more democratic and inclusive mediated society, again calling for fairness, accountability, and transparency (FAT) as key ideals within the future of AI, yet often leaving and ignoring notions of intersectional power relations out of their ethical imaginaries and frameworks. As a point of departure, many are invested in linking conversations about ethics to the moral genesis and failures caused by structural racism, sexism, capitalism, and the fostering of inequality, with an eye toward understanding how the digital is implicated in social, political, and economic systems that buttress systemic failures. Complicating these conversations are concerns about neo-colonial technology supply chains and the total integration of the digital into global economic systems [4].

We are equally influenced in the making of space for feminist and critical interrogations of fairness models by the work of [14], whose work we see at the forefront of design-thinking that accounts for systemic oppression rather than technosolutions that are rooted in ideologies of colorblindness, genderblindness, and disavowals of their politics. We are heartened to see in the last proceedings of the ACM Fairness, Accountability and Transparency conference the model of [1] in thinking about the complexities and role of computing in social change, and we see this as a powerful possibility for reimagining how we do interdisciplinary work that makes for new normativities around social justice in the fields of computing.

As we think about the work before us in 2021, the limits of the ethical AI-academic-industrial complex, with respect to true interventions that need to be made in the business models that promulgate unfairness and discrimination, were on powerful display with the unexpected and headline-grabbing December 2020 firing of one of the most prominent AI ethicists in the world, Dr. Timnit Gebru of Google. Indeed, 2020 drew to a close with daily news stories and tweets about a range of problems Gebru had faced, from the hostile work environment she experienced as a Black woman to attempts to silence and suppress the evidence she found of algorithmic discrimination in Google's natural language processing (NLP) models [13]. Indeed, her work referenced many well-known and broadly understood negative impacts of AI, from discrimination to environmental impact [8], while in this case specifically linking these flaws to Google's products. Gebru's scholarship in the area of discriminatory and unfair technologies is deep and unparalleled [11, 12, 5]. What this case demonstrates in practice is that doing the hard work of tracing discrimination and harm cannot withstand the profit imperative that technology companies prioritize at all cost, even at the expense of their own claims to prioritizing ethics. We believe this necessitates, more than ever, independent spaces for the study of these problems, without exertion and pressure from the interests of shareholders, and without impinging upon academic freedom and the need for researchers to speak truth to power through their analyses and discoveries.

Moreover, the firing of Gebru is not unlike the firing and intimidation of workers in a variety of technology companies who, when confronting their employers with evidence of the harms of their products or labor conditions, have been summarily dismissed [6, 15, 19]. Therein lies a profound contradiction in the claims to fairness and ethics in product development, while evidence of unfairness, discrimination, wage disparity, misrepresentation, hostile and damaging workplaces, harassment, and so forth reflects standard operating procedure across the major internet companies. We need spaces for research and a variety of interventions, at social, political, and technical dimensions, that are not controlled by the interests of the very actors that benefit from these types of corporate practices.

We see the limits of possibility for intervention in industrial-academic ethics labs, and we recognize that the roster of university- and industry-based centers engaging at the intersection of internet and society is long, but few are specifically and directly concerned with articulating the critical issues of asymmetrical power with respect to digital technologies. Simply put, we believe the time to do so is now, and we are attempting to do so at UCLA. Even fewer centers of internet inquiry are institutionalized at public research universities: some of the most visible centers have been the University of Oxford's Oxford Internet Institute (OII), Harvard University's Berkman Klein Center, Yale's Internet Society Project, and Stanford University's Center for Internet & Society and Stanford Center for Human-Centered Artificial Intelligence, which are often industry focused and not without associated challenges. As industry and commercial projects increasingly move to the foreground in the public sphere, and have significant impact on shaping the activities and nature of public institutions, including public K-12 education and libraries, higher education, and public media, inquiry into these projects and their trajectories is well-suited to UCLA as the leading public research university in the United States. In our case, we are interested in research and policy interventions that center the most vulnerable. We believe that this type of research, expressly embedded in public universities, strengthens the democratic, public-interest counterweights that are so clearly needed to foster broader interdisciplinary research efforts that prioritize various publics.

3 An Effort to Transform the Culture of Internet Studies

The UCLA Center for Critical Internet Inquiry (C2i2) is an interdisciplinary center that promotes the technological, historical, social and humanistic study of the internet and digital life with respect to the values of fairness, justice, equity, and sustainability in the digital world. C2i2's innovation and orientation to its study of the internet is not simply based on the objects of investigation with which we engage, but, rather, on our theoretical orientation to this work. We are humanities-informed social scientists who are also technologists. As such, we are concerned with the social implications and impact of technology. Our disciplinary and theoretical orientation reflects what we describe as the broad, and still somewhat nascent, subfield of critical information studies [20]. In our intellectual practice, critical information studies itself, by its nature, necessitates interdisciplinary contact and intellectual influence bridged between and among it and the fields of library & information science, internet studies, media studies, communication, African American studies, gender studies, labor studies, sociology, science and technology studies, and other key and relevant points of scholarly contact.

The conceptual basis for an expressly critical information studies, in particular, is a stipulation that information is fundamentally and inherently a matter to be regarded as existing along axes of social, political and cultural production, import, values and impact. It therefore follows that power analyses of information along these axes, as they are undertaken in a critical information studies theoretical practice, can be used to apprehend, describe, critique and intervene upon the medium as well as the meanings of texts, images, and ideas and the ways they are produced, displayed, systematized, circulated, consumed, stored and/or discarded within and among digital systems and along those same axes of power. This analytic process fundamentally and inherently relies upon political economic critiques to examine how information is controlled, owned, and distributed. Under a critical information studies framework, the political economic analysis is then engaged in a further, intersectional power analysis that recognizes that these informational phenomena occur in relation to, and at varying and uneven degrees based on, historical distributions of power along multiple additional axes: those of race, ethnicity, and gender, to name but a few. Herein, the focus is a dedication to studying the ways in which race and gender function in, and are deployed by, the digital technology practices and products of multinational digital internet media corporations. In this way, we both broaden and sharpen the kinds of analytical tools that can be used to understand technology and/as power and its impacts on the world.

For us, the making of C2i2 is an effort to promote investigations into the politics, economics, and impacts of technological systems, with the goal of understanding the relationships between digital technologies and the internet as a site to enhance the public good. In practical terms, the Center supports both undergraduate and graduate research and education through collaborations with a variety of academic units as well as through the programs within UCLA's Department of Information Studies and the School of Education & Information Studies. Our research and teaching emphasize internet and information scholarship and practice as relevant to a variety of disciplines and domains.

4 Our Guiding Principles

Our guiding principles have been an effort to make visible a set of priorities that we hope can be taken up, strengthened and added to by a robust network of multiple internet and society centers and initiatives. We start from statements of our fundamental principles and core values:

• We believe our research should have community impact and foster racial justice and social improvement

• We promote outreach, inclusion, and translation of research to the public for greater impact and positive social change

• We invite funders to support the work of C2i2 with an understanding that support for high-quality research is best realized with total independence from funder control over the research agenda, operations, communications, etc. of C2i2

• We recognize that transparency of sources of funding is an important ethical dimension of the work we do, and we seek to make our funders visible while clearly articulating the boundaries and firewalls we place between donations and research outcomes

• We believe in and support global networked relationships with other sites of research and advocacy, worldwide, and we employ a “big umbrella” approach to supporting people and projects that are interested in critical inquiries of the internet and society

• We aspire to relationships and operational practices of “mutual respect, care, pluralism and the duty of repair,”2 consistent with the strategic mission and vision of the UCLA Department of Information Studies

• We value difficult conversations and debates

• We engage in cyclical review of the research and initiatives of C2i2 to ensure that we are creating a sustainable research environment where faculty, students, staff and community members can develop robust programs of research and action

• We foster an environment of challenge and professional development for our affiliates at all stages in their careers and professional lives

2In this quote, we draw upon recent efforts in the UCLA Department of Information Studies to crystallize and clearly articulate its own commitments.

• We believe in a holistic approach to scholarship that puts physical and mental health and wellness of our colleagues and ourselves at the fore and underscores the importance of a healthy, supportive working environment

• We value learning and dissemination of the research of C2i2 for the benefit of all of our communities and for the larger public good

• We use multiple modalities to transfer our findings in legible, accessible ways for a variety of audiences

5 Critical Internet Studies on the Rise

As of this writing, a series of public circumstances have shaken confidence in internet technologies and platforms. We see these points of failure as having the potential for a profound moment of reconfiguration and repair, as they have opened up new possibilities for reimagining the possibilities of digital networks and their effects. We recognize both the positive affordances and the possible consequences of under-developed or asymmetrical technologies, and we seek to study these more robustly.

The public is increasingly eager to develop its own understanding and ability to actively participate in the steering of the digital technologies, social media platforms, and internet usage that now characterize much of everyday life, yet there are few mechanisms that afford such intervention. Those who should act in their stead, such as legislators and policy makers, legal professionals, educators, and others in gatekeeping capacities, often lack a full picture of these technologies, their processes, and their implications, even when they are sympathetic to and energized by the public's desire to wrest back control. Building upon existing faculty research strengths, C2i2 is attempting to serve as a vital bridge to close this gap in knowledge for academics, policy makers, engaged industry personnel and the public at large by providing both original insights derived from empirical research, as well as the expert analysis and interpretation of those data, to positively impact and reimagine digital technologies' influence in society.

The making of a campus-wide interdisciplinary center that promotes the study of the internet with respect to the values of fairness, justice, equity, and sustainability in the digital world has been difficult in the wake of COVID-19 and the austerity measures now facing higher education. Private foundations have been the lifeblood of our ability to pursue agenda-setting and proactive research, teaching and service while maintaining our intellectual independence, as we seek to bridge academia, industry, and policy to effect positive change within and among these domains. We engage with scholars, activists, advocates, technologists, policy makers and others who are interested in the ways in which digital technologies are shaping and transforming humanity through initiatives that reflect a broad range of social and ethical concerns that require sustained, open and multi-stakeholder debate and exploration.

Developing a center that openly values justice, equity, diversity, community building, environmental sustainability, labor and worker health and well-being, and public trust in democratic institutions, with respect to the role of the internet and its constituent platforms and technologies in maximizing or eroding these possibilities, has also been less popular than one might believe. We note that our many internet and society counterparts around the world who have been better resourced and supported over the past decade have often enjoyed a more remunerative and expedient direct relationship to the industries they seek to study and critique, whereas our nascent work in centering social justice in information and internet studies has been slowly waxing. It is now firmly on the rise.
We see our work furthering joint curriculum development by the Departments of Information Studies and Education to respond to calls by the State of California for increased digital and media literacy in K-12 schools (SB 830), and see our presence as faculty members within the School of Education & Information Studies as an inherent strength. Likewise, we also value collaboration with centers for the study of the internet and society at other leading universities in the US and elsewhere. As such, we value public intellectual work and public programming.

Our plans for public engagement also include outreach to public libraries and archives, educational institutions and community organizations, as well as collaboration with other UCLA campus centers such as the Bunche Center, the Institute for Research on Labor and Employment, the Center for Global Digital Cultures, the Center for Information as Evidence, the UCLA Law Promise Institute, the UCLA Community Archives Lab, and the UCLA Game Lab.

This work has been launched through our inaugural Minderoo Initiative on Technology and Power, established through a $3M gift over 5 years that began on July 1, 2020. C2i2 is one of the North American nodes of the Minderoo Foundation's global tech impact network, employing an expert team to develop model frameworks for laws that protect the public from the harms of predatory big data and digital platforms. In our work, we will identify the existing compliance issues of AI use, provide an independent source of public-facing evaluation and knowledge for people seeking greater information, protection and redress, and deliver a model legislative package that upholds dignity, equality, and transparency in government's use of algorithmic and human-moderated digital and data-reliant systems. We anticipate outputs that will be of interest not only to the greater scholarly community but will also have direct and meaningful application in the areas of policy development, advocacy, and industry, and to an interested and engaged public.

6 Conclusion: Strengthening Research Agendas at the Intersection of Society & Big Data

Currently, there are only a handful of Internet Studies departments that endeavor to cover these topics holistically in the way we propose. We believe this is therefore a tremendous opportunity not just for UCLA, but for many public universities, to make an investment in robust collaboration with extant partners across campuses to cultivate graduates prepared to enter a variety of professions where they can have direct impact in areas of algorithmic discrimination, trust & safety, internet policy, social media and content development, and public advocacy and community organizing. We believe the time is now to create new paradigms for the public to understand the costs of tech platforms, predictive technologies, advertising-driven algorithmic content, and the work of digital laborers. Of course, central to the harms caused by dis- and mis-information is the work of Commercial Content Moderators [18]. We have already been at the helm of strengthening global research networks for the study of commercial content moderation of social media platforms at scale.

Of course, we also think there are important roles computer scientists and engineers can play in this effort given their expertise. We believe strongly in interdisciplinary approaches and, through C2i2, seek opportunities to collaborate, teach, and undertake research among data scientists, computer scientists, information professionals, social scientists and humanists. In the most forward-thinking of ways, we imagine the social and technical given equal footing and resources to solve the most pressing issues facing humanity and the planet, and to address pervasive global inequality and injustice. Our research demonstrates conclusively that internet, social media and tech companies can no longer deny, downplay or ignore their own culpability in some of these crises; those that will thrive beyond regulation and public pushback will see such critique not simply as unfounded criticism for its own sake, but as a tremendous opportunity for real restoration and repair. There is an important and timely need for both research and public policy development, with civil, human and sovereign rights organizations and stakeholders at the table with technologists, social scientists and humanists, around the importance of restoration and reparations to democracy. For this reason, we see our work, a decade on as collaborators and a year into the existence of C2i2, as having only just begun, and our Center as an ideal place from which to advocate for change. It is nothing less than the agenda of our lifetimes.

References

[1] Abebe, R., Barocas, S., Kleinberg, J., Levy, K., Raghavan, M., and Robinson, D. G. (2020). Roles for computing in social change. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, pages 252–260.
[2] Benjamin, R. (2019). Race after technology: Abolitionist tools for the New Jim Code. Social Forces.
[3] Binns, R. (2018). Fairness in machine learning: Lessons from political philosophy. In Conference on Fairness, Accountability and Transparency, pages 149–159. PMLR.
[4] Bui, M. L. and Noble, S. U. (2020). We're missing a moral framework of justice in Artificial Intelligence: On the limits, failings, and ethics of fairness. In Dubber, M., Pasquale, F., and Das, S., editors, The Oxford Handbook of Ethics of AI. Oxford University Press.
[5] Buolamwini, J. and Gebru, T. (2018). Gender shades: Intersectional accuracy disparities in commercial gender classification. In Conference on Fairness, Accountability and Transparency, pages 77–91.
[6] Campbell, A. (2018). How tech employees are pushing Silicon Valley to put ethics before profit. Vox.
[7] Chun, W. H. K. (2008). Control and Freedom: Power and Paranoia in the Age of Fiber Optics. MIT Press.
[8] Crawford, K., Dobbe, R., Dryer, T., Fried, G., Green, B., Kaziunas, E., Kak, A., Mathur, V., McElroy, E., Sánchez, A. N., et al. (2019). AI Now 2019 report. New York, NY: AI Now Institute.
[9] Daniels, J. (2009). Cyber racism: White supremacy online and the new attack on civil rights. Rowman & Littlefield Publishers.
[10] Eubanks, V. (2018). Automating inequality: How high-tech tools profile, police, and punish the poor. St. Martin's Press.
[11] Gebru, T. (2019). Oxford handbook on AI ethics book chapter on race and gender. arXiv preprint arXiv:1908.06165.
[12] Gebru, T., Morgenstern, J., Vecchione, B., Vaughan, J. W., Wallach, H., Daumé III, H., and Crawford, K. (2018). Datasheets for datasets. arXiv preprint arXiv:1803.09010.
[13] Hao, K. (2020). We read the paper that forced Timnit Gebru out of Google. Here's what it says. MIT Technology Review.
[14] Hoffmann, A. L. (2019). Where fairness fails: Data, algorithms, and the limits of antidiscrimination discourse. Information, Communication & Society, 22(7):900–915.
[15] Kan, M. (2019). Google workers protest conservative thinker on AI board. PCMag.
[16] Noble, S. (2018). Algorithms of Oppression: How Search Engines Reinforce Racism. NYU Press.
[17] Pasquale, F. (2016). The Black Box Society: The Secret Algorithms behind Money and Information. Harvard University Press.
[18] Roberts, S. T. (2019). Behind the Screen: Content Moderation in the Shadows of Social Media. Yale University Press.
[19] Solon, O. (2018). When should a tech company refuse to build tools for the government? The Guardian.
[20] Vaidhyanathan, S. (2006). Afterword: Critical information studies: A bibliographic manifesto. Cultural Studies, 20(2-3):292–315.
[21] Vaidhyanathan, S. (2018). Antisocial Media: How Facebook Disconnects Us and Undermines Democracy. Oxford University Press.

Responsible Design Through Interdisciplinary Collaboration: Contact Tracing Apps, Surveillance and the Prospect of Building User Trust

Jennifer Keating
University of Pittsburgh
[email protected]

Abstract

The recently released Apple/Google contact tracing applications have faced low uptake among users throughout the United States in comparison to other nations where legislative mandates require the use of contact tracing applications to limit transmission of the novel Covid-19 virus amid a global pandemic. What features of the application designs could have been developed with a higher prioritization of potential users' trust? What specters of surveillance, incursions on privacy, or narratives compounding power inequity for already vulnerable populations could have been avoided had Google and Apple designers and engineers intentionally collaborated with experts in the humanities from the design phase to rollout? What implications could this have had on the ongoing pandemic and rates of transmission in Western countries like the United States?

I Introduction: The Case of Standards

In fall 2018 John Havens, author of Heartificial Intelligence and Executive Director of the IEEE Global Initiative for Ethics of Autonomous and Intelligent Systems, visited Carnegie Mellon University. His visit had multiple purposes: a workshop dinner on responsible design for undergraduate and graduate students; an interview for the AI & Humanity Oral Archive; and a visit to my co-taught first-year undergraduate interdisciplinary seminar, AI & Humanity. Ethical design and responsibility were front of mind. It was a pivotal moment, when engaging with the next generation of technologists and the institutions that train them was crucial. It was a time for individual technologists and academics, as well as organizations like IEEE, to provide leadership and guidance as we navigated questions pertaining to the social and cultural influences of technology. The questions were old, but the tools' capabilities, and the range of our willingness to relinquish consequential decision-making to these tools, were, and continue to be, new.

In 2018 we were in the wake of a lethal accident in Tempe, Arizona, where an experimental Uber autonomous vehicle struck a pedestrian. Another accident with a semi-autonomous Tesla vehicle had left a driver dead. Concerns about AI and robotic systems replacing wide swaths of the labor force the world over, and increased pressure on the ethical implications of AI in consequential decision making, whether in case law review, medical diagnoses or drone warfare, were in the news.


Surveillance, the protection of data, and the dissemination of misinformation via social media, in the context of the Facebook Cambridge Analytica scandal, continued to unfold. And increasing concern about the use of deadly force, whether in drone warfare abroad or predictive policing at home, also filled mainstream and fringe media coverage. Things were moving fast. Culturally and socially speaking, real and perceived AI advances were hot topics, and establishments like IEEE were moving in step with world events to develop standards that could productively slow potentially reckless innovation. IEEE led conversations and recommendations to incorporate ethical standards and responsible design features that could be carefully integrated into the innovation and development of tools increasingly embedded in everyday life. Educational institutions, governments and corporations alike formed curricular design teams, ethical boards of review, and committees to ensure that current technologists and developing practitioners would come to understand the grave responsibility of their roles in developing intelligent technological systems.

While visiting Pittsburgh, Havens described IEEE's work and the honor he feels in undertaking his role within the organization. He shared, "IEEE, it's the world's largest technology association, started 130 years ago by . It's a very respected organization (with) 420,000 members in 160 countries. But really (it's) the heart of the engineering community." He went on to disclaim that he is not an engineer, but in regard to his work with IEEE standards, learning from this process as a witness and as a facilitator, he described a sense of wonder at the processes. He was humbled by the power and the influence of an IEEE standard. In the AI & Humanity Oral Archive he states:

I’ve been on Broadway, right. I can get up in front of two thousand people and sing. You want to lead a standards group, you’d better have thick skin and be ready to navigate a room full of experts but also people that are, in terms of communicating, there’s a very specific thing called a requirement amongst engineers, which I didn’t know about. I remember I had to struggle, people kept saying ‘well the requirement for this’ and I was like, ‘Oh, do I check with Bob, you know, in IEEE to make this?’ And they were like, ‘No, it’s a requirement.’ They kind of gave me this face, ‘It’s a requirement’ (AI & Humanity Oral Archive, www.aiandhumanity.org).

Havens' steep learning curve on the influence of IEEE standards on curricula in education and on industry practices is perhaps a humorous anecdote. But the power of these standards, especially in crucial moments of delicate innovation that can deeply influence the lives of a few or many, is a poignant reminder of the implications of such efforts. These standards help to ensure advancement that mitigates bias or harm, and they help to uphold expectations for such ethical responsibility from design phase to manufacturing. Havens' disclaimer that he is not an engineer is wittily shared in the story both to demonstrate his own unfamiliarity with IEEE practices and to highlight the spheres of influence amongst technologists that an IEEE standard might reach. But his own involvement with IEEE came about because he was an outsider to IEEE, a humanist, whose work happened to waywardly step onto a path that was unfamiliar but betrayed his own predilection for inquiry. His outsider status, in the context of his own work with IEEE, turned out to be an asset. John Havens' ability to bring basic though powerful questions pertaining to ethical expectations to IEEE is valuable and perhaps speaks to the potential of IEEE as an organization, and of its membership of practitioners the world over, to engage in meaningful collaborations as teachers, designers and engineers. In describing his initiation as a practitioner in the AI and intelligent systems areas, Havens states:

My work in AI began out of straight up fear. I was doing a series of interviews for Mashable. I’ve written for Mashable, and the Guardian, and Slate, and I wasn’t in the AI space. But about eight years ago I started interviewing people saying, ‘what’s the code of ethics for AI?’ And thinking, everyone would refer to like, ‘Oh, it’s the Smithton Code written in 1985.’ And more and more what happened is people said, ‘Well, we use the Asimov’s Three Laws of Robotics as our kind of code to ask questions.’ And initially, I was like that’s a short story from the fifties. I’m a big fan of the story [but] it started from fear (AI & Humanity Oral Archive, www.aiandhumanity.org).

Although collaborative development of codes of ethics and standards is still in process, efforts have come a long way from Havens' initial involvement as a reporter for Mashable in 2010. That said, considerable progress is still needed. Havens describes fear, or perhaps horror, in learning that a work of science fiction served as a foundational concept for vetting new systems and safeguarding against the prospect of harm, bias or unintended ill consequences of new technologies' integration into everyday life. These systems can range from algorithms for mortgage approval processes, to filters for Human Resources examination of resumes for open job postings, to experimental driverless vehicles, each of which has small and enormous effects on the lives of individuals and groups when biases or faults occur and propagate at scale. As Havens opened up lines of inquiry as a journalist, to hold technologists accountable, he underscored the importance of harnessing expertise in a variety of fields to partner with engineers who design systems that infiltrate many more aspects of daily life. In his commitment to these initial curiosities, Havens converted these lines of inquiry from sexy news article pitches into meaningful engagement with IEEE members and leaders to develop standards for responsible design, as the Executive Director of the IEEE Global Initiative for Ethics of Autonomous and Intelligent Systems. This leadership role is a testament both to IEEE's recognition of the value of interdisciplinary leadership and to the thoroughness and collaboration needed for future standards to hold the necessary influence and adherence to ethical expectations for responsibility.

Ethical design has become a priority in principled discourse pertaining to advancing autonomous and intelligent systems, with work extending in a myriad of directions. One example is Dorian Peters, Karina Vold, Diana Robinson and Rafael A. Calvo's "Responsible AI — Two Frameworks for Ethical Design Practice" (IEEE Transactions on Technology and Society, Vol. 1, No. 1, March 2020). In this recent publication, Peters et al. articulate the concerns of moving from the P7000 standards project principles to actual practice.1 Their work illustrates the concerns and challenges of such moves from principle to practice, as Havens expressed, but it is also emblematic of the value of the IEEE standards. Without this guidance, it is virtually impossible to move to collective responsible practices. A key finding in "Responsible AI — Two Frameworks for Ethical Design Practice," as it recounts the practice of developing digital mental health technologies, hinges on the role of collaboration across disciplines to meaningfully achieve the scope of P7000. If the standard's scope is "to establish a process model by which engineers and technologists can address ethical consideration throughout the various stages of system initiation, analysis and design," then the practice of such goals in specific device and tool development is crucial to communicate and illustrate in practice, not just in aspiration. In this dynamic publication, the case study, as much as the practices leading up to its development, carries close to equal weight.
It is an object lesson in moving from principle to practice, which delineates the steps articulated in the purpose of Standard P7000 as it "provide(s) engineers and technologists with an implementable process aligning innovation management processes, IS system design approaches and methods to minimize ethical risk for their organizations, stakeholders and end users" (IEEE P7000). Havens' advocacy for bringing experts from various disciplines into everything from essential design to at-scale manufacturing or adoption of new devices and systems is embodied in this work.

1For details on Standard P7000 see: https://development.standards.ieee.org/myproject-web/public/view.html#pardetail/5799. Synopsis in this context includes: The scope of the proposed standard: The standard establishes a process model by which engineers and technologists can address ethical consideration throughout the various stages of system initiation, analysis and design. Expected process requirements include management and engineering view of new IT product development, computer ethics and IT system design, value-sensitive design, and stakeholder involvement in ethical IT system design. Purpose: Engineers, technologists and other project stakeholders need a methodology for identifying, analyzing and reconciling ethical concerns of end users at the beginning of systems and software life cycles. The purpose of this standard is to enable the pragmatic application of this type of Value-Based System Design methodology, which demonstrates that conceptual analysis of values and an extensive feasibility analysis can help to refine ethical system requirements in systems and software life cycles. This standard will provide engineers and technologists with an implementable process aligning innovation management processes, IS system design approaches and software engineering methods to minimize ethical risk for their organizations, stakeholders and end users.

It serves as a useful model as we consider other cases, like recent contact tracing applications in a pandemic and the adoption, or lack thereof, of this tool due to varying levels of trust and distinct cultural and political contexts.

II A Case: Contact Tracing in the Time of Covid

By mid-March 2020, the world had been brought to a relative stand-still. Covid-19 had made its way through parts of China, Italy and Iran. It had landed in the United States, Japan, Germany, the United Kingdom, South Africa, Thailand, India and other parts of the world. What looked like a short-term epidemic that could be readily handled by developed governments and public health professionals was quickly taking shape as a pandemic. From March onward the World Health Organization and its member states were brought to their respective knees.

Several months into the pandemic there are suggested 'success' states, nations whose governmental management of the pandemic rivals other states' relative mismanagement of virus transmission and of mitigation against their citizens' deaths. Leading the world in infection rates and Covid-19-related deaths is the United States: as of December 2020, with over 299,000 deaths and more than 16,258,000 infected, the United States leads the world in fatalities and infections (Johns Hopkins Coronavirus Resource Center, https://coronavirus.jhu.edu/). By comparison, South Korea, an often-cited case for standards in virus transmission management, has logged 43,000 cases of infection and 587 deaths. While controversial in regard to governmental surveillance of citizens, compromised privacy, and the compounding of prejudice and marginalization of already vulnerable populations, South Korea's contact tracing efforts have yielded tremendous success.

But what questions arise in regard to surveillance and the compromise of privacy in the case of South Korea's efforts to mitigate transmission of Covid-19? How do contact-tracing applications compound marginalization already faced by vulnerable populations? How might we learn from past lessons where temporary tools are introduced in the name of public health but continue in use well after the public health crisis, in other areas of surveillance that can exacerbate power differentials between citizens and their government?

The contact tracing application recently launched by Google and Apple serves as a useful object lesson in rapidly developed tools that can have unintended consequences if not carefully used, monitored and understood within a specific time frame and context. The new applications, which can scale to all Google and Apple users with a simple "opt-in" selection, allow the tech giants to make a contribution to alleviating the implications of an ongoing pandemic by giving local, state or national governments a basic tool kit to build more specific applications for contact tracing in their local environment. According to Russell Brandom at The Verge,

The exposure notification system was initially designed to avoid excess data collection, but the introduction of out-of-the-box apps means slightly more data will be collected by Apple and Google. iOS will still collect general device analytics and crash information (if the users have opted in). The auto-generated Android app will collect de-identified system info including API error calls, but the team says there’s still no collection of data that could identify which specific users have been exposed. Google has also committed to publishing the source code of the auto-generated Android app to enable third-party audits (Brandom).

While these “out-of-the-box apps” can prove a crucial contribution from the tech sector to alleviate some of the crushing economic and social implications of rolling shutdowns and challenges in manual contact tracing in the Covid-19 pandemic, how are Google and Apple users to trust a system that has already fallen down on the “slightly more data will be collected” detail of the new app? Google and Apple claim that their system offers public health officials, governments and other entities, overrun in triage efforts in recent months, an opportunity to tailor a basic application system to meet the needs of their particular community to control outbreak and community transmission. While the “out-of-the-box app” is perhaps a valuable contribution to the public good, how can Apple and Google garner trust from a public that is already wary of their size, power and influence on individuals and nation-states? If governmental agencies and public health officials wish to garner trust, confidence and adherence to strict behavioral guidelines to prevent further transmission of Covid-19, how can they assure users that data

gathered will only be used for the emergency contact tracing needed presently? What emergency concessions are individuals willing to make, and what assurances do they have that once the pandemic diminishes Google, Apple and governmental entities will not all use these same emergency applications for other more sinister but equally powerful agendas like political coercion, surveillance of population movement or relentless advertising? How can the current design features of the app be better explained to users before they ‘opt in,’ so that they can readily describe both the capabilities and limitations of these devices before deciding to use them? Unfortunately, media coverage even in recent months already suggests reasons for concern. Following the recent rollout of the shell applications for Android and iOS, Forbes Magazine’s Zak Doffman reported on a video by Serge Vaudenay and Martin Vuagnoux on Hackaday that “claims to show a flaw in the secure exposure tracing framework that allows users to be tracked. The POC targeted Switzerland’s SwissCovid app, but the researchers say it works on other apps leveraging Apple and Google’s exposure notification framework” (Doffman)2. As the prospect of user tracking negatively impacts user take-up of the application, the tool is rendered close to useless. If a critical number of community members are reticent to opt in to use the application, then its use-function in tracing contacts is rendered obsolete. Trust matters. And the prospect of individual and community trust is compromised when tools are designed without explicit consideration of user trust, amidst a myriad of other values, in ethical and responsible design. Doffman states, “if not enough people download and adhere to the apps, then the system doesn’t work.” But should we be surprised that in Western countries there are significant examples of apprehension about signing up for this level of prospective tracking? How have Google and Apple garnered trust with prospective users to ensure that the contact tracing app will do only what it is supposed to do, i.e., serve as a tool to control the outbreak of disease amid a pandemic? What assurances are the public given, in the tool’s design and the safeguards in its recommended uses by Apple, Google and the municipalities or governmental entities that use this application, to protect the data and civil rights of those who opt in to use the tool only in the context of our current health crisis? Rather than offering a solution to a complex and runaway health crisis, the application’s rollout and description by its designers undermine user trust. As evidenced in Richard Bird’s op-ed in Forbes on September 10th, the contact tracing app appears to cause almost as much consternation and discourse on legislative regulation of big tech in regard to privacy concerns as there is recommendation for its use in Covid-19 hotspots to curb transmission. He writes,

Google has addressed privacy concerns related to location tracking by announcing that with the release of Android 11, devices won’t need to have location settings on to use ENS. But that doesn’t solve all the issues these apps pose. While the technology may help slow the spread of coronavirus, it comes at a big social cost and poses what I see as the biggest ethics quandary in my 20 years in the infosec industry. These apps will force citizens to rely on big tech companies —whose business is enabled by monetizing data— to protect the privacy of their sensitive data. These apps play on humanity’s greatest fear: our mortality (Bird).

In the midst of pandemic, the existential threat of the contagion is real. Its effects are a poignant reminder of our individual and collective vulnerability as we witness record-setting transmission rates and death rates in the United States. That said, the pandemic will eventually end. And what can individual citizens do to protect their data and privacy rights if the government and big tech have converged to track movement, buying patterns or a myriad of other information based on a social contract undertaken in the context of our present emergency? What features of the design could have embedded these concerns in regard to wellbeing from the beginning, to garner the trust needed in societies where the population must opt in, rather than abide by legislative mandate, to yield widespread use of these valuable public health tools?3

2For further detail on Serge Vaudenay’s and Martin Vuagnoux’s concerns in regard to SwissCovid see: https://lasec.epfl.ch/people/vaudenay/swisscovid.html
3The primary focus of this discussion on wellbeing and the social contract of collective wellbeing underscores the importance of

A counterexample to the lack of uptake in regard to contact tracing apps in the United States is the national contact tracing app used in South Korea. Although a successful model in public health, as the tool assists in expedient tracing of contacts associated with developing outbreaks, it has also raised concerns in regard to tracking locations of some of society’s marginalized or vulnerable populations. In May, an outbreak associated with gay clubs in the Itaewon neighborhood of Seoul went public and led to tracking down individuals who frequented the clubs. Although many filed false names on club registers and guest books for fear of being outed in a relatively conservative culture that upholds traditional expectations of gender roles and sexuality, South Korea’s public health laws allow for searches into other means of tracking individuals ranging from “credit card statements, security footage and smartphone data” (Yoon & Martin). The national laws, rather than individuals’ decisions to opt in to contact tracing apps or other mechanisms to safeguard public health, align population behaviors with governmentally sanctioned regulations to curb transmission in the midst of pandemic. As the government synthesized data from club guest records with credit card statements, security footage and smartphone data, they could then publicly announce whom they tracked as implicated in the localized outbreak, all under the premise of protecting public health. But how does this social contract bear out after the pandemic passes? What vestiges of blanket surveillance in South Korea are likely to remain outside the purview of public health legislation when the pandemic ends? And what recourse do individual citizens have to protest such translations of an application’s use after the crisis has passed?

III Principles & Practice

In their discussion of “Responsible Design Process” in “Responsible AI — Two Frameworks for Ethical Design Practice,” Peters, Vold et al. lay out a dynamic interrogation of well-being and its role in responsible design. They write:

An important aspect of responsible innovation is the concept of human wellbeing, which is also at the center of many current ethical frameworks. For example, the IEEE centers its ethics specification on human wellbeing. So too do several government frameworks [4], [5]. As such, we argue that a responsible technology development process will need to incorporate evidence-based methods for evaluating the impact on, and designing for, human wellbeing by drawing on psychology (see [32], [33]). However, the promotion of human wellbeing is not a complete solution. After all, decisions must be made as to whose wellbeing is being considered. When technology makers are forced to make tradeoffs that increase the wellbeing of some but at the expense of others, at the cost of long-term ecological impacts, or in spite of other negative side effects, then issues to do with justice, equality and other values arise. This is where ethical analysis, drawing on philosophy and other disciplines, must come in.

The concept of wellbeing is situated within a complex context when considering responsible and ethical design of tools that will have scaled use. “When technology makers are forced to make tradeoffs that increase the wellbeing of some but at the expense of others . . . then issues to do with justice, equality and other values arise.” Peters, Vold et al. go on to claim that this is where ethical analysis, “drawing on philosophy and other disciplines, must come in.” As practitioners adhering to the goal of translating Standard P7000 principles into practice, technologists have identified colleagues in philosophy, psychology and design as reliable collaborators who can inform and influence their work. Such collaborations open up the opportunity for technologists and designers to meaningfully integrate social concerns

broader discourse on AI and trust. For extended discussion on trust and developing technology pertaining to the existential threats of surveillance, weaponry and military use of AI in consequential decision making, see AI & Humanity by Illah Nourbakhsh and Jennifer Keating, MIT Press 2020.

from device inception through manufacturing or scaled uptake of applications into their work. But what gains might be made if the scope of collaborators were widened further? What can humanists bring to the process, as exemplified by John Havens’ work, with lines of inquiry that are endemic to methods and practices in regard to communication, veracity of language and cultural theory in the humanities? How can humanists guide engineers and computer scientists to layer expertise in social and political context into their device development to ensure greater care and strong intuitions to safeguard against unintended consequences that force “tradeoffs that increase the wellbeing of some but at the expense of others?” The unintended social and cultural implications of contact tracing applications in South Korea, in tandem with the purview of public health legislation, raise the likelihood that location tracking can “out” the private sexual lives of South Korean citizens in the name of urgent public health concerns articulated as localized outbreak. In the context of the United States, citizens’ distrust seems to proliferate in their relationships with Apple, Google, and state and federal governments. There is concern that sensitive data, ranging from location tracing to other features of habit like consumer practices and community connections, can be gathered through emergency contact tracing applications, stored, and then used for more sinister purposes when the health emergency has ended. These are the concerns in the American public that can compromise the prospect of an application’s use. They are also concerns that South Korean citizens might share but that are explicitly overridden by legislative mandates that ensure citizens and foreign visitors use the contact tracing applications (Shin). While a myriad of rationales in the present moment indicate these tensions and reticence to adopt the contact tracing apps in the U.S. or in South Korea, there are also historic precedents that influence citizens’ reluctance to forego privacy in the name of public good. Emergency tools like the passport at the turn of the twentieth century indicate the manner in which a temporary solution to a short-term political and public health crisis (in the context of World War I and the Spanish Flu Pandemic) can be co-opted into mass adoption and remain as an institutional feature in a society. By learning more about social and cultural influence in the present and the past, technologists might better prepare future emergency tools that are needed for rapid design and implementation. In reaching out to a wider variety of practitioners across the disciplines, to identify collaborators who can assist with laying out predictable distrust or concern with a prospective tool, technologists can more successfully integrate deeper cultural concerns into any form of responsible design for tools with widespread use. In “Immunity Passports: A ‘New’ Old Idea with Baggage,” Emilian Kavalski and Nicholas Ross Smith offer insight into the movement of specific tools from an emergency context, in the name of “wellbeing,” to permanent fixtures in a society. In the example of the passport, now ubiquitous for confirmation of membership in virtually any nation state, and needed to move across any border, they remind us that these tools were once temporary.
At the outbreak of World War I, the passport was a novel, temporary tool to regulate movement across borders that were in flux and contention, in that period’s rise of nationalism. Their utility shifted again in the context of pandemic. Kavalski and Smith write: The idea of immunity passports is not a new concept, however. Before the First World War, essentially one did not need a passport to travel. If a person had the means to pay for their fare and the ability to procure work and accommodation, they could travel (largely) unhindered across national borders. It was only the start of the war that led European countries to begin closely monitoring their borders for security reasons. At the end of the war, the intention of the signatories of the 1919 Treaty of Versailles, however, was a return to the pre-war “freedom of communications and of transit.” The preamble to the resolution adopted at the 1920 Paris Conference on Passports and Customs Formalities and Through Tickets suggested that passports were only a temporary measure until “the pre-war conditions [are] gradually re-established.” But, the onset of the Spanish Flu pandemic put a spanner in the works. The death toll, which surpassed that of the war itself, revealed the vulnerability of modern states linked by globalization to pandemics. Thus, in the wake of the Paris Conference, the concept of passports mutated to also regulate the movement of people owing to “considerations of health or national security.”

Furthermore, the demands of post-war reconstruction and recovery put a premium on healthy populations. It was decided that one of the best ways to ensure the wellbeing of the citizenry was through regulating who could enter the territory of the state and who could not. The implications for the necessity of passports were exacerbated in the context of World War II, as European refugees sought safe harbor in their flight from German National Socialists’ rapid expansion throughout the region. In the present moment, however, contact tracing applications, immunity passports and other modes to confirm health status speak to governmental “considerations of health,” primarily. But in the political context of social unrest in the United States, and in various locations throughout the world presently, what prevents this health measure’s translation into a tool to ensure “national security” instead, just like the passport and its history? Could a latent familiarity with such social and political slippage inherently influence citizens’ mistrust of these applications if the location tracking elements used in contact tracing applications were re-tooled to track locations of known protestors in Portland, Oregon, Washington, D.C. or Atlanta, Georgia? And how might the design, undertaken by technologists at Google and Apple, have instilled better trust of these systems had such considerations been integrated into the earliest iterations of their development?

IV Engaging Principles & Practice

In the opening of the third chapter of Discipline and Punish, Michel Foucault considers pandemic in the context of the plague. He describes the societal implication of rudimentary surveillance in the name of contact tracing as he writes: This enclosed, segmented space, observed at every point, in which the individuals are inserted in a fixed place, in which the slightest movements are supervised, in which all events are recorded, in which an uninterrupted work of writing links the center and periphery, in which power is exercised without division, according to a continuous hierarchical figure, in which each individual is constantly located, examined, and distributed among the living beings, the sick, and the dead —- all this constitutes a compact model of the disciplinary mechanism. The plague is met by order, its function is to sort out every possible confusion: that of the disease, which is transmitted when bodies are mixed together; that of the evil, which is increased when fear and death overcome prohibitions. It lays down for each individual his place, his body, his disease and his death, his well-being, by means of an omnipresent and omniscient power that subdivides itself in a regular, uninterrupted way even to the ultimate determination of the individual, of what characterizes him, of what belongs to him, or what happens to him. Against the plague, which is a mixture, discipline brings into play its power, which is one of analysis. A whole literary fiction of the festival grew up around the plague: suspended laws, lifted prohibitions, the frenzy of passing time, bodies mingling together without respect, individuals unmasked, abandoning their statutory identity and the figure under which they had been recognized, allowing a quite different truth to appear. But there was also a political dream of the plague, which was exactly its reverse: not the collective festival, but strict divisions; not laws transgressed, but the penetration of regulation into even the smallest details of everyday life through the mediation of the complete hierarchy that assured the capillary functioning of power; not masks that were put on and taken off, but the assignment to each individual of his “true” name, his “true” place, his “true” body, his “true” disease. The plague as a form, at once real and imaginary, of disorder had as its medical and political correlative discipline. Behind the disciplinary mechanisms can be read the haunting memory of “contagion,” of the plague, of rebellions, crimes, vagabondage, desertions, people who appear and disappear, live and die in disorder. (Foucault 197–198). The human will to execute power over existential threats like disease, to bring order to the chaotic, is imbued in our contemporary tools as well. Foucault cites the response to plague as “medical and political correlative discipline.”

We can see features of these reactions in the context of contact tracing applications in our current pandemic. But as individuals and groups square up to the discipline needed to control the chaos of a viral threat, we also witness the range of reactions and their effects. In South Korea, governmental strides to preserve individual and collective “wellbeing” are achieved through legislative reach that ensures governmental “omnipresent and omniscient power that subdivides itself in a regular, uninterrupted way even to the ultimate determination of the individual, of what characterizes him, of what belongs to him, or what happens to him.” Relinquishment of such individual privacy and agency preserves the lives of the collective, to the tune of 587 deaths from Covid-19 in South Korea. By comparison, without such centralized power wielded by a federal government, perhaps there were periods of festival in the midst of the pandemic in the United States? Perhaps there were moments of “suspended laws, lifted prohibitions, the frenzy of passing time, bodies mingling together without respect, individuals unmasked” in the name of independence and freedom? And perhaps these betrayed a deep and poignant resistance to even a modicum of “the penetration of regulation into even the smallest details of everyday life through the mediation of the complete hierarchy that assured the capillary functioning of power; not masks that were put on and taken off, but the assignment to each individual of his ‘true’ name, his ‘true’ place, his ‘true’ body, his ‘true’ disease.” The prioritization of such independence, such resistance to a carefully orchestrated hierarchy of discipline and power held by a local municipality, a state or the federal government in the United States has at least inadvertently led to the deaths of over 299,000 people due to Covid-19. But it also indicates the political, social and cultural influences that undermine and co-opt even the best-intentioned public health outreach efforts in the name of “well-being.” In the current circumstances we witness how contact tracing applications in varied political and social contexts can only go so far in their efficacy without responsible design paired with either deep trust in their limited use in the current public emergency, to encourage users to opt in, or a legislative mandate for their use. As Peters, Vold, et al. suggest, “the promotion of human wellbeing is not a complete solution . . . When technology makers are forced to make tradeoffs that increase the wellbeing of some but at the expense of others, at the cost of long-term ecological impacts, or in spite of other negative side effects, then issues to do with justice, equality and other values arise. This is where ethical analysis, drawing on philosophy and other disciplines, must come in.” What gains might have been made in adoption of the contact tracing application developed by Apple and Google had they truly consulted with practitioners outside of their organizations in meaningful ways, not only in their tool’s design but also in careful communication about the application’s capabilities, its limitations or perhaps the narrow window of time and circumstance in which they would allow its use? As he extends an exploration of a “whole literary fiction of the festival [that] grew up around the plague,” Foucault argues that the surveillance state formulated to safeguard against the existential threat of plague was momentarily conceded or tolerated in the name of well-being or survival.
It survived in many manifestations in the name of containing political contagions to safeguard against rebellion, resistance and revolution. Its vestiges extend, he argues, into the ordering of academic disciplines as well. In a deep and detailed argument that traces the surveillance state erected to mitigate the effects of pandemic, through an analysis of Bentham’s panopticon, to its literal and figurative features in contemporary juridical and political power, Foucault extends the discussion to academic disciplines. He writes,

The minute disciplines, the panopticism of every day may well be below the level of emergence of the great apparatuses and the great political struggles. But, in the genealogy of modern society, they have been, with the class domination that traverses it, the political counterpart of the juridical norms according to which power was redistributed. Hence, no doubt, the importance that has been given for so long to the small techniques of discipline, to those apparently insignificant tricks that it has invented, and even to those “sciences” that give it a respectable face; hence the fear of abandoning them if one cannot find any substitute; hence the affirmation that they are at the very foundation of society, and an element in its equilibrium, whereas they are a series of mechanisms for unbalancing power relations definitively and everywhere; hence the persistence in regarding them as the humble, but the concrete form of every morality, whereas they are a set of physico-political

techniques (Foucault 223).

As we consider these disciplines in relation to ethical standards, cross-disciplinary collaborations and the need to consider how seemingly inert tools are woven deeply into the fabric of society, perhaps Peters, Vold et al. indicate just how important cross-disciplinary work will be to reinscribing the deeply rooted power systems that Foucault also explores. There are certainly the practicalities of cross-disciplinary practices that can uphold or translate principles into practice as technologists strive to find meaningful device design and development that can indeed privilege wellbeing for all parties and not just for some. And as it becomes clear that prioritization of wellbeing for some will be at the cost of others, then perhaps a deep and meaningful interrogation of how the “small techniques” of a discipline, which are perhaps at the “very foundation of society,” can also serve as the techniques and practices that can be reshuffled to inform alterations and potential improvements to particular features of society through interdisciplinary collaboration in various sectors of society? Perhaps if we can attune engineers and technologists to the very issues and concerns in society that indicate the need for particular tools, these tools can be developed and designed to more carefully meet these needs? Perhaps deep and meaningful engagement with experts in domains like rhetoric, composition, literature and cultural studies, history and political science, can enable technologists to better parse the needs and concerns in society that lead to requests, or indeed demands, for particular tools to attend to emergent needs and wants?4 Narrative, a mode of making and an object of study in my own discipline, is a useful conveyance of information, messaging and testimonies that express the messiness of human experience. It is a sequencing of events, a relationship to temporality and a conveyance of information that other humans often respond to with emotion, empathy or solidarity as they fall into the experience of hearing a story. As Havens recounts, science fiction concepts like Asimov’s “Three Laws of Robotics” can even be used “as our kind of code to ask questions” in a discipline that slowly finds its devices increasingly circulating in the wilds of humanity. But what if IEEE standard developers and practitioners in industry and the academy alike deeply engaged with the possibility of meaningful and layered interaction and collaboration with humanists? What if exploration and examination of narratives, in addition to ethics and design, were integrated into responsible design practices and early educational practices and curricular designs? The uneven success of devices and applications like the Google and Apple contact tracing application could have been avoided for a myriad of reasons: cultural, political, social and otherwise. But what if the designers had had to read Anna Burns’ 2018 novel Milkman before they set about designing the application or decided on language for rolling out the device in summer 2020? What if they were attuned to the turmoil, reticence and deeply held fear of surveillance that is both explicitly expressed and implicitly manifested in what has led to limited opting in to the tool’s use throughout the United States, because they had been sensitized to such concerns through the distance afforded by reading and examining fictional narrative? Milkman, which was awarded the Booker Prize, is described by the Chair of Judges, Kwame Anthony Appiah, as a novel that

. . . delineates brilliantly the power of gossip and societal pressure in a tight-knit community, and shows how both rumour and political loyalties can be put in the service of a relentless campaign of individual sexual harassment. Burns draws on the experience of Northern Ireland during the Troubles to portray a world that allows individuals to abuse the power granted by a community to those who resist the state on their behalf (Appiah).

In this clever novel, Burns presents the short-term and long-term effects of the surveillance state on each of the characters, which are readily captured in playful banter identifiable to its location in Northern Ireland. In a relatively early passage, Burns embodies the blanket of surveillance and its figurative soaking qualities on all those who inhabit such a state through dialogue between Middle Sister and Brother-in-Law, as Middle Sister laments:
4For further discussion on such collaboration, see AI & Humanity by Jennifer Keating and Illah Nourbakhsh, MIT Press 2020.

20 . . . I decided to forgive brother-in-law for his out-of-character criticism, which was what I did, then a tree by the top reservoir took a picture of us as we ran by. This hidden camera clicked, just one click, a state-forces click, in the similar way to how that bush, positioned along this same reservoir, had done a week earlier. O dear, I thought. I hadn’t considered that. What I meant was I hadn’t considered that the state would now associate anyone I was associating with also with the milkman as they were associating me now with the milkman. Already within a week of that first click, I’d been clicked again four times. Once had been in town, once when walking into town, then twice coming out of town. I’d been photographed from a car, from a seemingly disused building, also from other bits of greenery; perhaps too, there’s been other clicks I hadn’t picked up on at the time. Each occasion when I did hear them, the camera would snap as I passed and so, yes, it seemed I’d fallen into some grid, maybe the central grid, as part of the disease, the rebel-infection. And now, others in my company, such as poor, innocent brother-in-law, were to be implicated also as associates of an associate. Brother-in-law, however, just as had the milkman, completely ignored the click. ‘Why are you ignoring that click?’ I asked. ‘I always ignore the clicks,’ he said. ‘What do you expect me to do? Get outraged? Write letters? Keep a diary? Put in a complaint? Get one of my personal secretaries to contact the United Nations Amnesty International Ombudsman Human Rights peaceful demonstration people? Tell me, sister, who do I contact and what do I say, and while you’re about it, what are you going to do about the click yourself?’ Well, I was going to have amnesia of course. In fact, here I was, already having it. ‘I don’t know what you mean,’ I said. ‘I’ve forgotten it,’ his forthrightness having sent me immediately into jamais vu (Burns 66).

Playful yet serious in its account of varied responses to the surveillance state, this short excerpt shows the small and large implications of prolonged exposure to devices that track location and collate information, or even offer the semblance of such collation, on individuals and their psyche. In Middle Sister’s “jamais vu” and Brother-in-Law’s decision to ignore the ‘click,’ both characters offer a myriad of responses that must be decided upon each time they are rendered conscious of tracking. Whether it is once a day, once a run or once a week, each character illustrates the individual and collective effect of constant watch in a surveillance state. What if engineers and technologists could be reminded of such comic and tragic reactions to tracking? What if serious conversation could be undertaken with the safe veil of distance afforded through examination of fictional narrative? When John Havens and I discussed the significance of IEEE standards in our interview back in 2018, neither of us could have imagined the world we would inhabit in fall 2020. But the concerns associated with responsible design and the power of IEEE standards are more crucial now than ever. As we consider fairness, susceptibility to bias and the challenges that technologists are asked to attend to through device design and application creation, possibilities abound for meaningful solutions as well as misses. From IEEE standards, to the translation of such standard principles into practice, to curricular design at universities that train the next generation of engineers and computer scientists, we each have our role to play in this work. The next step in responsible design might be a deeper engagement with next iterations of meaningful cross-disciplinary collaboration.

References

[1] Appiah, Kwame Anthony. “Comments: Milkman (Booker Prize),” 2018. https://thebookerprizes.com/books/milkman-by
[2] Barry, John M. The Great Influenza: The Story of the Deadliest Pandemic in History. New York: Penguin Books, 2018.
[3] Bird, Richard. Contact Tracing Apps Reinforce the Need For A Federal Data Privacy Standard. Forbes Magazine, September 10, 2020.
[4] Brandom, Russell. Apple and Google Announce New Automatic App System to Track COVID Exposures. The Verge, September 1, 2020.
[5] Burns, Anna. Milkman. London: Faber & Faber, 2018.
[6] Doffman, Zak. Why Your Apple / Google Covid-19 Contact Tracing App Has an Awkward New Problem. Forbes Magazine, September 4, 2020.
[7] Keating, Jennifer & Illah Nourbakhsh. John Havens: AI & Humanity Oral Archive. (www.aiandhumanity.org)
[8] Foucault, Michel. Discipline & Punish: The Birth of the Prison (2nd Edition). New York: Vintage Books, 1995.
[9] IEEE Standard P7000: https://development.standards.ieee.org/myproject-web/public/view.html#pardetail/5799
[10] Johns Hopkins University, Covid-19 Dashboard by the Center for Systems Science and Engineering (CSSE). https://coronavirus.jhu.edu/map.html
[11] Kavalski, Emilian & Nicholas Ross Smith. Immunity Passports: A “New” Old Idea with Baggage. Global Policy: Opinion, April 27, 2020. Durham University, School of Government and International Affairs & Wiley-Blackwell.
[12] Mason, Rowena, Rajeev Syal and Dan Sabbagh. No 10 Seeks to End Coronavirus Lockdown with ‘Immunity Passports.’ The Guardian, April 2, 2020.
[13] Nourbakhsh, Illah & Jennifer Keating. AI & Humanity. Cambridge: MIT Press, 2020.
[14] Peters, Dorian, Karina Vold, Diana Robinson, and Rafael A. Calvo. Responsible AI — Two Frameworks for Ethical Design Practice. IEEE Transactions on Technology and Society, Vol. 1, No. 1, March 2020.
[15] Procter, Kate, Ian Sample & Philip Oltermann. ‘Immunity Passports’ Could Speed Up Return to Work After Covid-19. The Guardian, March 30, 2020.
[16] Shin, Sally. How South Korea Has Eliminated Coronavirus Risk From Foreign Travelers. NBC News, September 27, 2020.
[17] Spinney, Laura. Pale Rider. New York: Public Affairs, 2017.
[18] The Guardian Editorial. The Guardian View on Immunity Passports: An Idea Whose Time Has Not Come. The Guardian, April 3, 2020.
[19] Yoon, Dasl & Timothy W. Martin. ‘What if My Family Found Out?’: Korea’s Coronavirus Tracking Unnerves Gay Community. Wall Street Journal, May 12, 2020.

Social Media Data - Our Ethical Conundrum

Lisa Singh, Agoritsa Polyzou, Yanchen Wang, Jason Farr, Carole Roan Gresenz Georgetown University {lisa.singh, ap1744, yw516, jaf301, carole.roan.grensenz}@georgetown.edu

Abstract

Our field is engaging more and more in the development of algorithms, models, and technologies to better understand human demographics, opinions, and behaviors. These data range from product and movie reviews to opinions about issues of the day. While we have begun to grapple with some ethical issues associated with using these data, we may still be ignoring, or at least underestimating, the ethical complexities related to the computational tasks we work on, the data we use, and the technologies we create. This paper focuses on the challenges associated with using social media data and the ethical considerations these challenges create. We frame the ethical dilemmas within the context of data privacy and algorithmic bias (fairness) and show how and when different ethical concerns arise. A recurrent theme throughout is the lack of transparency. Therefore, we conclude by suggesting a need exists to consider transparency throughout the software development lifecycle and develop mechanisms that incorporate ideas related to transparency as part of the foundation of human-computer systems. By cataloging these complexities and their ethical considerations, we hope that our field pauses and thinks through these issues as we continue to build algorithms and tools for social media.

1 Introduction

As a society, we generate massive amounts of data about different aspects of our lives. We share our opinions on movies and products, our beliefs about religion and politics, and our day-to-day actions and movement patterns with tens of apps and online platforms. At the same time, the cost of large-scale computing infrastructures continues to decrease, allowing corporations and researchers to build sophisticated algorithms that model human behavior and opinion at scale. Much research has emerged about how companies and researchers are using these data without regard for ethical norms or traditional research integrity standards [25, 52, 40]. A 2020 movie, The Social Dilemma, details how algorithms are being used to manipulate users and change their behavior, even in cases where the change may not be healthy [35]. Together, these different avenues remind us how users who share data online are being exploited. As computer scientists, we are trained to collect whatever data are available to develop database systems or machine learning and natural language processing (NLP) algorithms. Because our focus is the development of novel algorithms and innovative technologies as opposed to behavioral or social research, we are not taught to view data as human data. In contrast, protection of human subjects is the cornerstone of social science and medical/public health research. Toward that end, many social scientists use survey instruments to


measure different constructs of interest. These surveys require that participants consent and that participants know exactly how the data from the study will be used and stored, and for how long. With social media data, companies and researchers can access data without the traditional human subject safeguards in place. Unfortunately, because computer scientists are broadening their research objectives and delving into creating algorithms and tools to model and learn human behavior [25, 57], we can no longer ignore our connection to human data or the impact our creations have on society. As the era of machine-driven intelligence begins, ethical considerations must become front and center for computer scientists using social and behavioral data, and be part of the foundation of data-centered computer science. To help reach this goal, this paper takes a look at ethical challenges associated with using social media data. There are many reasons to focus on social media data. First, these data are much more readily accessible to companies and researchers than many other forms of data. Second, norms on different platforms lend themselves to different ethical concerns that need to be explored [59]. Finally, the potential for harm to individuals (or society more broadly) is enormous, ranging from misrepresentation of attitudes and opinions to loss of reputation, and even discrimination and manipulation [29, 50, 23, 28]. A number of ethical frameworks have been developed, including bioethics [55, 51], computer ethics [8], and big data ethics [32, 26]. In this paper, we begin by using this literature to define a set of relevant ethical principles. We then consider them in the context of data privacy and algorithmic bias, focusing our discussion on those that are most relevant to social media. The crux of this paper identifies complexities associated with using social media data, describes the complexities, provides context for these challenges on some well-known social media platforms, and then maps the ethical considerations to these social media complexities in the context of data privacy (Section 4) and algorithmic bias (fairness) (Section 5). As part of our discussion, we show how these challenges manifest themselves across six popular platforms: Facebook, Instagram, Snapchat, Tiktok, Twitter, and LinkedIn. Our hope is that, as a field, we move away from ignoring the ethical realities of the data we use. Data privacy and algorithmic bias connect to some of the ethical principles we define, but not all of them, particularly with regard to transparency. Therefore, considering a new area of research focused on transparent computing (Section 6), in which the principle of informing users about decisions throughout the computational lifecycle is a central design consideration, would help further embed ethical principles within computer science. This idea goes beyond transparency of the model to transparency of the human-computer system.

2 Ethical Principles Relevant to Social Media

Building off of the previous literature [59, 49, 39, 54, 16], Figure 1 identifies a basic set of ethical principles associated with moral responsibility. Non-maleficence is the responsibility to refrain from causing harm. This principle is especially relevant in cases where the use of data might negatively affect children or participants with mental health issues. Beneficence directs researchers to offer kindness or other benefits to participants, wherever appropriate and possible. In rare cases, for example, a researcher might come across social media posts that indicate a participant is in need of immediate intervention. Together, beneficence and non-maleficence concern the positive and negative impacts on the participants of the study. Maximizing social benefit concerns the broader, societal impacts of the study, including whether the research adds value or improves society or research understanding. While computer science research has added value and improved society in many ways, societal benefit is not emphasized as a foundational tenet of computer science research. Respect for autonomy requires that researchers allow potential participants to choose their own actions, including whether to participate in the study and what data (if any) to share. Researchers should explain the details of the study to participants and, as a general rule, refrain from any sort of manipulative practices that would covertly undercut the decision-making faculties of the participant. Justice focuses on ensuring that the study does not discriminate against subpopulations or increase inequities, and that subpopulations are not at risk of being excluded from possible benefits. Justice also includes clear justification of the value of the research and transparency of the research protocol. Finally, proportionality

focuses on ensuring that better ways do not exist for conducting the study that may be less invasive or more equitable. While many other ethical principles exist, these are particularly important in the context of social media data. Researchers in computer science already make choices in the ordinary course of their studies that draw upon these ethical principles. Non-maleficence, beneficence, and justice are core principles used to define measures of fairness in machine learning and algorithmic bias. Respect for autonomy and proportionality are foundational concepts within data privacy. Connecting to society and maximizing social good have become themes within subareas of computer science, e.g. data science for social good and AI for social good. As we continue to understand computer science research within the context of different ethical complexities, we must consider harm caused by both processes and entities creating and using developed technologies. Processes include the use of algorithms and models that are biased, lack transparency, or reduce privacy. Measuring algorithmic bias and data exposure are two ways to understand the harm caused by processes. Entities include companies (data collectors), researchers and others (data utilizers) who make decisions that can cause harm to individuals (data generators) or reduce societal benefit. Understanding the role that different entities play in decision making can help researchers better understand their ethical responsibilities when using these types of data.

Figure 1: Ethical Principles Applicable to Social Media Setting

3 Challenges and Foundations

Social media challenges are generated by the design decisions of each platform, by the types of data users share publicly and privately, and/or by the different ways researchers and other external groups use the data. Figure 2 identifies some of these challenges and places them in the context of data privacy and algorithmic bias. In Sections 4 and 5 we will go through these challenges in more detail.

Figure 2: Social Media Complexities Within the Contexts of Data Privacy and Algorithmic Bias

To start the formalization around some of the concepts we will use, we propose the following mathematical framing of the social media challenges and ethical principles. We will use a security threat model perspective, where an attack occurs when ethical considerations are ignored. Let $S = \{s_1, s_2, \ldots, s_k\}$ be the set of social media sites (platforms). For each social media site $s_i \in S$, let $I_i = \{G_i, B_i\}$, where $G_i = G_i^p \cup G_i^h$ is the set of attributes provided by the user, which can be public ($G_i^p$) or hidden ($G_i^h$), and $B_i$ is the set of attributes auto-generated by $s_i$. Examples of attributes in $G_i$ are the user's name and date of birth, while examples of auto-generated attributes in $B_i$ include the number of friends/followers. Let $\tilde{I}_i = \{\tilde{G}_i, \tilde{B}_i\}$ be the attributes shared by the platforms other than $s_i$, i.e., $S - \{s_i\}$. We define the set of users as $U = \{u_1, u_2, \ldots, u_n\}$, and define $c(u_j)$ to be the user class or group that $u_j$ belongs to based on his/her sensitive attributes.

User $u_j$ must provide some information when registering on platform $s_i$. The information includes both required and optional fields in $G_i$. We denote the user's information as $G_{i,j}$; by definition, it corresponds to some or all of the attributes in $G_i$ captured by platform $s_i$. Social media platforms can use user information to predict values about users. Let $f_i : G_i \cup \tilde{G}_i \rightarrow L_i$ be a function or model built by social media site $s_i$ that uses the information $G_{i,j}$ provided by the user on $s_i$ and the user information $\tilde{G}_i$ shared by other sites to make inferences. $L_i$ represents attributes learned or inferred by different entities, including social media site $s_i$, and $L_{i,j}$ are the attribute values inferred for user $u_j$. Fundamentally, our security goals are to maintain user privacy, to ensure fairness towards individual users $u_j \in U$ or groups of users determined by $c(\cdot)$ (nondiscriminatory behavior), and to help users make informed decisions (through transparency). These goals can be mapped to the ethical principles, presented in the previous section, that we want to adhere to. There are multiple adversaries in this framing, including social media platforms and other companies $S$, researchers $R$, and other platform users $U - \{u_j\}$. Let $A$ be the set of all adversaries, $A = S \cup R \cup (U - \{u_j\})$. At a basic level, an adversarial attack occurs when any entity $a \in A$ uses information in $I_i$, $\tilde{I}_i$, or $L_i$ to violate one or more of our ethical principles $E$. We use the term violate to mean that the ethical consideration is being ignored and/or the expectations of users, in general, are not being met – a ‘line’ is being crossed.1 Let $g : I_i \cup \tilde{I}_i \cup L_i \rightarrow E$ be a function that $a_j$ uses to build a model violating ethical principles in $E$.
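To make this framing concrete, the following is a minimal Python sketch of the notation above, assuming hypothetical platform and attribute names: a platform holds public ($G_i^p$), hidden ($G_i^h$), and auto-generated ($B_i$) attribute sets, and a simple check flags an adversarial use when an inference draws on attributes outside a user's public, consented-to information. The class and function names are illustrative only and are not part of the authors' formal model.

```python
from dataclasses import dataclass, field

@dataclass
class Platform:
    """A social media site s_i with user-provided (public/hidden) and auto-generated attributes."""
    name: str
    public: set = field(default_factory=set)     # G_i^p
    hidden: set = field(default_factory=set)     # G_i^h
    generated: set = field(default_factory=set)  # B_i

def adversarial_use(inference_inputs: set, platform: Platform, consented: set) -> bool:
    """Crude proxy for g(.): flag an attack when an inference uses attributes
    beyond the user's public, consented-to information on this platform."""
    allowed = platform.public | consented
    return not inference_inputs.issubset(allowed)

# Hypothetical platform and attributes (illustrative only).
site = Platform("example_site",
                public={"bio", "follower_count"},
                hidden={"birthdate", "phone_number"},
                generated={"follower_count", "signup_location"})

# An inference that silently uses the hidden birthdate is flagged; one using only public data is not.
print(adversarial_use({"bio", "birthdate"}, site, consented=set()))       # True
print(adversarial_use({"bio", "follower_count"}, site, consented=set()))  # False
```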

4 Data Privacy

Data privacy research focuses on measuring and reducing the amount of sensitive information that is disclosed. Within the context of social media data, particular attention needs to be paid to disclosure during data collection and data processing. One goal of data privacy research is to develop models and tools that allow users control over their data (and who can access them) and protect each user's privacy preferences and personally identifiable information. Data privacy is a challenging task and it has an even more complicated framing when examined on social media. For example, social media users freely share large amounts of data that may leak unintended information. Spontaneous posts and reactions could give insight into sensitive information. Social media companies collect thousands of data points about users, including their photos, their friends, their likes, their activity on the platform, pages they visit, online purchases, contacts list, and even their location. According to a recent study [2], advertisers can profile someone with 95% accuracy using the information from just eight or nine of their friends' social media accounts. Challenges with regard to social media and data privacy arise because users share large amounts of information

and they have little control over how it is used. This leads to concerns around respect for autonomy, justice, proportionality, and maximizing societal benefit.

1This could be modeled as an allowable budget for each ethical consideration.

Table 1: Users' ability to control their privacy settings across Facebook, Instagram, Snapchat, Tiktok, Twitter, and LinkedIn. Rows are grouped by data generator: individual users (profile/account information; posts/story), other users (tag/mention user; contact user / send comment), and the platform (personalized ads; auto-generated fields; external partner data sharing; user search).

4.1 Ability to Control Account Settings

From an ethical perspective, users should have control of the information they choose to share with the public and with the social media platform. So when an account is set up on a platform, the default setting should always be private ($G_i^p = \emptyset$ and $G_i^h = G_i$), allowing users to make parts of their account public if they choose. They should also not be required to share sensitive information with the platform in order to set up an account. Unfortunately, to promote public discussion and increase community, the default approach for most social media platforms is to set up user accounts as publicly accessible by anyone using the platform, and to ask for some sensitive pieces of information during registration. For our example companies, five of the platforms set up public user accounts by default (Snapchat is the one exception). From previous research, we know that part of the reason for this default public setting is that people are not overly concerned with privacy [30, 42]. Even though concern has risen in recent years [36, 46], most users do not change their privacy settings to be more private, giving rise to the "privacy paradox" [24, 3]. These privacy settings define which features belong to $G_i^p$ and which to $G_i^h$, and how successful attacks conducted by adversaries in $A$ can be. Given that five of these six platforms are public by default, we begin by trying to understand how publicly visible user-generated data are, what types of information are visible, and whether or not users can make all the publicly visible information private. Table 1 summarizes the ability of users to control their privacy settings. We organize the table by the subgroup that generates the data fields and the typical functionality on platforms. Users share their data, e.g. account information. Other users ($U - \{u_i\}$) on the platform can share information about a user $u_i$, e.g. by tagging a photo. The last subgroup is about auto-generated fields and platform functionality that expose users' information. The table shows that platforms give users control over the data they provide and what other users can do with respect to their data. This is important progress for our industry and likely a result of recent legislation in Europe. However, some platforms fail to let users make data that are generated by the platform private (Twitter, Tiktok, LinkedIn, and Instagram). For example, Twitter users cannot hide their number of followers on their profile page, and LinkedIn users cannot hide their number of connections. Platforms sometimes also require users to use functionality that makes use of their shared data. As an example, Instagram does not allow users to turn off personalized ads. In this context, the two fundamental data privacy concerns are that accounts are public by default ($G_i^h \neq G_i$) and that not all data fields can be hidden ($G_i^p \neq \emptyset$). Also, sites are unclear about their default settings or how to change them (lack of transparency). One problem that arises specifically for researchers and other data utilizers is that some of the data are private, and using the private data for research without user consent can violate the principles of respect for data ownership and maintenance of privacy. Together, the ethical dilemmas arising

Table 2: Auto-generated fields $B_i$ for six popular social media platforms

(a) Common across platforms

Number of friends/followers; friend list; number of followings; list of followings; list of followers; number of posts; recommended articles/news; online events for you; job recommendations; joined on.

(b) Unique per platform

Facebook: list of events attended, list of reviews, list of groups. Snapchat: zodiac sign. Twitter: signup location. Tiktok and LinkedIn: online events for you, number of likes, number of profile views, profiles people also viewed.

from this social media challenge are justice, respect for autonomy, and proportionality. Justice, because it is likely that there are subgroups who are, structurally speaking, more likely not to understand the privacy settings, the defaults, etc., and are, therefore, more vulnerable to the manipulation that can occur when the default settings offer no privacy. Autonomy is not respected because the default lack of privacy means more data about the users are available to manipulate them through targeted advertisements (commercial, political, or otherwise) [50]. Finally, issues of proportionality arise because companies are not being transparent about the default settings or how to change them, leading to more invasive usage of human data and less equitable treatment.
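As a rough illustration, the two concerns identified in this subsection could be checked per platform with a simple audit. The sketch below uses hypothetical field names and settings data, not any platform's actual defaults; the function name is ours.

```python
def audit_defaults(all_fields: set, default_public: set, hideable: set) -> dict:
    """Check the two concerns from Section 4.1 for one platform:
    (1) the account is public by default (G_i^h != G_i at signup), and
    (2) some fields can never be hidden (G_i^p != empty even after opting out)."""
    default_hidden = all_fields - default_public
    return {
        "public_by_default": default_hidden != all_fields,
        "unhideable_fields": all_fields - hideable,
    }

# Hypothetical settings for a single platform (not actual product behavior).
report = audit_defaults(
    all_fields={"name", "photo", "follower_count", "posts"},
    default_public={"name", "photo", "follower_count", "posts"},
    hideable={"photo", "posts"},
)
print(report)  # e.g. {'public_by_default': True, 'unhideable_fields': {'name', 'follower_count'}}
```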

4.2 The Availability of Large Amounts of Personal Data

Users make decisions about the data they want to share or not share. Figure 3 shows the basic information users need to share to open an account on our example platforms as of November 2020. All the platforms require an email or phone number to register. We see that the number of required fields ranges from three (Twitter) to six (LinkedIn). Five of the six platforms require a birthdate (LinkedIn is the one exception), a piece of sensitive information. These required fields, along with user activity/post data, become the basis for auto-generated fields that lead to thousands if not hundreds of thousands of new data points for companies. Information sharing may also vary across different platforms. Table 3 shows the optional fields associated with each platform, both the common ones and the more unique ones that align with the purpose of the platform. The one with the highest potential for leakage is LinkedIn since its optional fields capture detailed resume information. Therefore, even though users may feel like they are hidden in a crowd on a single platform, some of this more detailed information makes them more unique if publicly shared. Privacy settings only limit what other platform users can see and, sometimes, what advertisers can use. The data are still visible to the company and to others the company chooses to share the data with. Users in $U$ have no control over how the company chooses to use these data. For example, even if a user chooses not to share his/her birthdate publicly, the social media platform $s_i \in S$ can use this information for targeted advertising or to help infer other information about the user, such as music interests or political affiliation, using $f_i$. Users may be unaware of the types of information being generated and tagged by the platform (see Table 2). A 2018 Pew survey showed that 74% of Facebook users said they did not know that Facebook maintained a list of their traits and interests for targeted advertising, and 51% of users said they were not comfortable that Facebook maintained this list [19]. If users are uncomfortable with companies having this additional information, it is incumbent on researchers to build in mechanisms to conduct research with more transparency than companies. Given this situation, ethical concerns related to the availability of data center around respect for autonomy. These data are shared without user knowledge, new data are generated without user knowledge, and users do not have the ability to remove data or change some of the generated information. Together these lead to a lack of privacy, transparency, and autonomy. While full control over user information may be a pipe dream without legislation to enforce it, more control and more transparency can be more foundational within computer science research involving social media users. In fact, as researchers, we can build technologies that use different forms of nudging [53] to encourage users to share less and protect their personal data. Similar to the previous section, the platforms having such detailed user data that some users are unaware of suggests that justice may also be an ethical consideration, since these data can be used to manipulate users and some subpopulations may share data at higher rates than others.

Figure 3: The required fields to set up an account on six popular social media platforms.
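To give a sense of the kind of inference $f_i$ described above, the following toy sketch trains a classifier to predict a withheld attribute from fields a user did choose to share. The data are synthetic and the feature names are hypothetical; this is an illustration of the general pattern, not any platform's actual model.

```python
from sklearn.linear_model import LogisticRegression

# Synthetic training data: publicly shared fields -> an attribute the user withheld.
# Hypothetical features: [follows_political_accounts, posts_per_week, joined_platform_events]
X = [[1, 5, 0], [1, 9, 1], [0, 2, 0], [0, 1, 1], [1, 7, 0], [0, 3, 1]]
y = [1, 1, 0, 0, 1, 0]  # 1 = inferred interest in political content, 0 = not

f_i = LogisticRegression().fit(X, y)  # plays the role of the inference function f_i

# A user who never disclosed this interest can still be assigned a label.
print(f_i.predict([[1, 6, 1]]))  # e.g. [1]
```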

4.3 Cross platform linkage

Different social media platforms attract users for different purposes, including information seeking/sharing and social connection maintenance. It has become increasingly popular for users to have accounts (also called user identities) on multiple social networks [43]. Information shared on different sites can be linked or connected to create web footprints of users that expose more information about them. Figure 3 shows a diagram highlighting the common required fields shared by the different platforms. We see that five of the fields are common across two or more platforms. Table 3(a) shows the optional fields that users may choose to share. Five of the nine optional fields are common across three or more of the six platforms we studied. By combining user data Iei,j from across h different sites in S − {si} for user uj, features in Gi may be accurately approximated by Li, even though that information is missing from Gi,j, the information provided by uj on platform si. Singh and colleagues show the ease with which this type of cross platform linkage attack can be conducted [45].

Table 3: Optional fields for six popular social media platforms

(a) Common across platforms

Field Facebook Instagram Snapchat Tiktok Twitter LinkedIn Current city    School information    Profile picture      Cover photo    Workplace   Gender   Bio     Interested topic   Website  

(b) Unique per platform

Facebook: Hometown, Relationship status, Family members
Snapchat: Bitmoji (personalized emoji)
Tiktok: Profile video
LinkedIn: Licences & certifications, Skills, List of coursework, Patents, Looking for a job, Industry, Honors & awards, Projects, Volunteering, Test scores, Publications, Languages

Further, when social media data from si ∈ S are merged across platforms, it can lead to inferring features p using fi that are not shared publicly in Ii. These additional features may be ones that the user ui ∈ U does not want a researcher or external entity to have. This is a privacy concern if consent is not obtained and/or the benefit of determining the value does not outweigh the harm. Privacy issues also exist because some demographic subpopulations share more sensitive attributes across multiple platforms than other subpopulations do, making linking easier. This leads to a greater privacy loss for some subset of users, who may belong to more vulnerable communities. Again, this social media challenge brings us to ethical concerns related to autonomy and justice. Autonomy is a concern because the more data available across sites, the more easily companies and others can identify and target specific users. If the targeted population is a vulnerable one, issues of justice also arise.
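To make the mechanics concrete, the following is a minimal sketch of such a linkage, assuming two hypothetical public profile tables with made-up column names and values; real attacks such as [45] use much fuzzier matching over many more fields:

import pandas as pd

# Hypothetical public profile data collected from two platforms.
platform_a = pd.DataFrame({                     # e.g., a professional network
    'name': ['Ana Ruiz', 'Lee Chen'],
    'current_city': ['Austin', 'Boston'],
    'workplace': ['Acme Corp', 'B Labs'],
})
platform_b = pd.DataFrame({                     # e.g., a photo-sharing platform
    'name': ['Ana Ruiz', 'Lee Chen'],
    'current_city': ['Austin', 'Boston'],
    'birth_year': [1990, 1985],                 # shared here, withheld on platform A
})

# Joining on fields that are public on both platforms attaches platform B's
# birth_year to platform A's employment information.
linked = platform_a.merge(platform_b, on=['name', 'current_city'], how='inner')
print(linked[['name', 'workplace', 'birth_year']])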

5 Algorithmic Bias

Researchers from various fields have mined content from social media (content that arises organically) to study phenomena traditionally measured using surveys. Within computer science we tend to do this without consent because we view these data as public. But we do need to pause and ask ourselves whether the goal of our research is beneficial to the users, the platform, or society more broadly. Should we use such data for inference? Do users realize that their content can be used for various prediction tasks and indirect measurement of different quantities? When is it reasonable for algorithms to use knowledge about membership in a protected subpopulation to infer undisclosed attributes? What is the impact of biased algorithms on individuals? How do we begin to understand the level of harm when we do not have agreed-upon ways to measure it? Fairness in machine learning and algorithmic bias are research areas that have begun thinking about some of these questions. Ultimately, using these inferences that were obtained without consent to influence a user's online behavior can be viewed as disregard for a user's

actions, which conflicts with the ethical principle of respect for autonomy. Making poor quality inferences, with or without consent, can increase the potential for discrimination (justice) and, potentially, for manipulation and harm (beneficence). When we also consider the potential harm to classes of people as opposed to just individuals, the ethical principle of societal good maximization comes into play. We will see that these ethical principles are the primary concerns for each of the social media challenges presented in this section.

Many of the challenges in this section arise because of the differential use of social media. Social media platforms allow their users to communicate with others outside existing social and local boundaries and to share user-generated content with their connections. The data generation process is not predictable across all types of users in terms of frequency, types of activity, ways of expression, or topics. People use social media differently and share different aspects of their lives. For some people social media is a social platform. For others it is a professional platform. Different types of engagement (post, comment, share, reaction, or just reading a post) can be driven by similar motivations.

5.1 Lack of population coverage

Inferences from social media data are inherently of unknown representativeness because they come from non-probability samples that are not designed to cover the population. While non-probability samples are not unique to social media, they are the norm across social media settings. Social media users U and the different classes/groups they form based on c(·) are not a representative image of the general population, limiting the direct application of many traditional techniques. For example, stratified random sampling techniques used in traditional survey methodology [12] cannot be directly applied without a proper sampling frame or observed characteristics to help researchers determine the strata in which different individuals fall. This is not to say that social media data are always non-representative. Social media data may end up adequately covering the research topics under study, and thus represent the population accurately, even though the individuals who contributed to the social media corpus are not sampled in a representative way [41]. Table 4 shows the reported populations of each of our example social media sites. We see that levels of participation on these different sites vary with demographics. LinkedIn and Twitter have more men than women, all of the sites have larger proportions of younger users than older ones, and most have more college-educated users than non-college-educated ones. Being able to generalize beyond the population of a single platform is important for understanding attitudes and opinions across broader cross-sections of the population. Further, algorithms that are developed for one platform population may need to be adjusted for other populations. Without adjustments, new issues around fairness may arise. Therefore, developing methods for re-weighting populations would increase societal benefit. However, platforms do not share such data about the population distribution at sufficiently regular intervals for researchers to generalize beyond the sample they have without explicitly linking the data to a representative survey.

Selection bias has been shown to reduce machine learning fairness in more traditional data sets [48, 22]. This type of bias is exacerbated on social media because we may be unaware of under-representation of subgroups or skews in the data since that information is not released by platforms. This could lead to socially destructive, rather than socially beneficial, knowledge production because of reinforcement of stereotypes, bias propagation, etc. When we move to high levels of data imbalance, fairness metrics (demographic parity and equality of opportunity) get worse across all levels of privacy [13]. Finally, when demographic data is not part of the available data, we cannot verify that algorithms are being fair. As a result, extra caution is necessary when using social media data as direct [11] or indirect [44] predictors of social phenomena, which is also why we need to help build social media benchmarks. Ultimately, all of these types of poor quality inferences can lead to concerns with regards to the ethical principles of justice and beneficence.

Table 4: Percentage of U.S. adults who use each social media platform by demographics [38]

                      Facebook  Instagram  LinkedIn  Twitter  Snapchat
Total                      69%        37%       27%      22%       24%
Men                        63%        31%       29%      24%       24%
Women                      75%        43%       24%      21%       24%
Ages 18-29                 79%        67%       28%      38%       62%
30-49                      79%        47%       37%      26%       25%
50-64                      68%        23%       24%      17%        9%
65+                        46%         8%       11%       7%        3%
High school or less        61%        33%        9%      13%       22%
Some college               75%        37%       26%      24%       29%
College graduate           74%        43%       51%      32%       20%
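One direction mentioned above is re-weighting a platform sample toward known population benchmarks. The following is a minimal post-stratification sketch, assuming a benchmark distribution is available from a representative survey; all numbers and column names are made up for illustration:

import pandas as pd

# Hypothetical platform sample, heavily skewed toward younger users.
sample = pd.DataFrame({'age_group': ['18-29'] * 60 + ['30-49'] * 30 + ['50+'] * 10})

# Hypothetical population shares taken from a representative benchmark survey.
population_share = {'18-29': 0.20, '30-49': 0.35, '50+': 0.45}

sample_share = sample['age_group'].value_counts(normalize=True)
weights = sample['age_group'].map(lambda g: population_share[g] / sample_share[g])

# Over-represented young users are down-weighted; scarce older users are up-weighted.
print(weights.groupby(sample['age_group']).first())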

5.2 Types of Data

Different platforms allow users to generate different types of data, including text (tweets, posts), image and video (profile image, posted photos/videos, and tagged photos/videos), geographic location (geotagged posts/tweets), and relationships/networks (friends and followers). For example, on Twitter the primary mode of communication is text. On Instagram the primary mode is images, and on Tiktok the primary mode is video. While the prevalence of natural language processing (NLP) and text mining techniques has increased, they are less reliable on short, informal text. Many researchers are applying the same algorithms without making adjustments for the data environment, including understanding the impact of different preprocessing methods on models and adjusting the learning model to account for the noise and bias of social media data. Images have similar issues. While some images are easier to learn from, images shared on social media vary in size, format, and resolution, limiting the reliability of inferences made using these data. Ultimately, algorithms need to be designed or adjusted to compensate for new types of noise prevalent on different social media sites.

The language used by individuals can reveal demographic characteristics and beliefs. Companies can use images to determine a user's gender, race, and age even if the user chooses not to share this information. Companies can use geolocation to get a user's location and use the location to predict certain characteristics (such as age, gender, and income) or even predict where a user lives or works. For example, researchers have used Twitter user descriptions to infer consumer profiles, predicting attributes such as parental status or whether the user is a frequent traveler from the textual content with a precision of between 80% and 92% [18]. They also used textual clues, mentions of regional events, Foursquare check-ins, geo-tagged messages, and time zone settings to infer user location with high precision.

Only recently have researchers begun considering the social impact of natural language processing (NLP) [20, 4]. Text in posts captures how we express ourselves, and it can also capture historical biases and propagate them in the models built [14, 7]. For example, word embeddings have been shown to inherit gender bias [47, 6, 15]. Additionally, NLP techniques can have difficulty understanding specific dialects within a language [5]. As a result, in the context of social media, dialect speakers' opinions may be mischaracterized. This is particularly applicable for applications using sentiment analysis and stance detection. In this context, dialect speakers are discriminated against and possibly even excluded from these models, leading to clear conflicts with regards to the ethical principles of justice and societal good maximization [31].
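The gender bias inherited by word embeddings can be probed directly. Below is a minimal sketch in the spirit of the association tests in [6, 47], assuming gensim is installed and pretrained GloVe vectors can be downloaded; the word choices are illustrative only:

import gensim.downloader as api

# Downloads a small pretrained GloVe model on first use.
vectors = api.load('glove-wiki-gigaword-50')

for occupation in ['nurse', 'engineer', 'homemaker', 'programmer']:
    # Positive gap: the occupation vector is closer to 'she' than to 'he'.
    gap = vectors.similarity(occupation, 'she') - vectors.similarity(occupation, 'he')
    print(f'{occupation}: she-vs-he similarity gap = {gap:+.3f}')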

5.3 Sensitive data

Advances in machine learning have made it possible to infer a large range of attributes about a person. For example, just using information shared in posts, algorithms can infer gender, age, and location with high accuracy. And when sensitive data are available, e.g., birthdate or race, they tend to lead to stronger predictions. Using these sensitive data for prediction violates the anti-classification definition of fairness, which states that sensitive attributes should not be part of the data used to get a model's outcome. However, an important use of this sensitive information is to ensure group-level and individual-level fairness. Because non-sensitive shared personal data of a user or the user's friends can serve as indirect indicators of the user's demographic information, as researchers of data-driven systems, we need to integrate methods to ensure that features generated from personal data are not biased. Additionally, models that do not use sensitive information might still exhibit indirect discrimination, as non-protected attributes may correlate with the protected ones. This is referred to as the "red-lining" effect. These concerns bring up issues related to the ethical principles of justice and maximizing social benefit.

There are many examples of indirect discrimination [33]. For example, the LinkedIn talent search system returns a list of candidates based on a search query. The system was found to be biased against race and gender and has since been updated [10]. Another example is Amazon's 2014 hiring application that reviewed resumes to identify top candidates for interviewing. The algorithm was found to be biased against females because the training data (current employee resumes) was heavily biased towards men. Different resume attributes that were being learned by the algorithm were serving as indirect indicators of gender [9]. These examples are a reminder of the impact of historical data on propagating present-day bias. This bias arises when there is a misalignment between the world as it is and the values or objectives that are encoded and propagated in a model. It refers to a concern with the state of the world, and care must be given to this issue on social media, where organic data has inherent biases. If researchers do not correct the biases present in the organic data, they will replicate those biases in the findings, further reinforcing social and cultural inequities.
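The "red-lining" effect can be checked for directly: if the non-protected features predict the protected attribute well, removing the protected attribute does not prevent indirect discrimination. A minimal sketch with scikit-learn, using made-up data and hypothetical column names:

import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

profiles = pd.DataFrame({
    'zip_code':  [10001, 10001, 60629, 60629, 10001, 60629, 10001, 60629],
    'num_posts': [12, 40, 7, 3, 25, 9, 31, 5],
    'gender':    ['m', 'm', 'f', 'f', 'm', 'f', 'm', 'f'],   # protected attribute
})

# Try to predict the protected attribute from the remaining features.
X = pd.get_dummies(profiles[['zip_code', 'num_posts']], columns=['zip_code'])
y = profiles['gender']

# Accuracy well above the majority-class baseline indicates proxy features.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=4)
print('mean accuracy predicting the protected attribute:', scores.mean())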

5.4 Partial Information Sharing - Missing data

People do not fill out all the fields when they sign up for a social media account, revealing different amounts of personal information to others. Some express their opinions about a larger range of topics than others. Users may share different information with their private network than on their public profile. Ultimately, other than required fields and auto-generated fields, there are very few pieces of information that can be directly obtained across all of the users (see Table 3). The result is a significant amount of missing data. It is not surprising that companies and researchers have attempted to infer this missing information. However, what happens if the inferred value is incorrect and that value is assumed to be accurate by another algorithm? When the data are representative and the algorithms are accurate, the inferences may benefit users. When they are inaccurate and this inaccuracy is further propagated in other algorithms, the harm can be significant.

The impact of missing values on fairness can be substantial. If users choose not to share certain demographic information, how can researchers test if a model is biased against these individuals? If we do not have ground truth about the users' sensitive information/features, we cannot verify and guarantee that any models built or conclusions drawn are not biased against (or favoring) a specific set of users. This can be especially problematic when the classification outcome is dependent on users' known features, and these features are correlated with sensitive attributes that are missing. In order to overcome this challenge, some recent research has focused on a pre-processing stage that attempts to ensure that user representation does not indirectly carry any demographic information [56, 22]. Developing new approaches for measuring fairness when the properties of the underlying data are only partially known is particularly important for social media. Finally, more generally, we cannot ignore the role missing information plays within data mining and machine learning methods that are deployed on social media. If missing values are in non-sensitive features, and if the

missing values are completely at random, the models being built should be robust. However, if the missing values show bias by exhibiting specific patterns, the models using these data will also be biased [27], raising concerns associated with the ethical principle of justice.

Figure 4: Average time people spent on each social media platform per day (in minutes).

5.5 Imbalanced levels of activity

Users on social media are not equally active. Some are very active, posting continually, while others only browse channels of interest. For example, on Twitter, the top 10% of the most active tweeters by number of tweets are responsible for 80% of all tweets created by U.S. adults [21]. Levels of engagement can also vary with demographics, campaigns, and across platforms [37]. Popular profiles that have many followers/likes tend to post more often than the rest. Figure 4 shows the amount of time users spend on different platforms. While these numbers continue to rise, they vary considerably by demographic. For example, while adults in the US average 38 minutes on Facebook per day, those aged 16-24 spend three hours on the platform [58]. Globally, people spent 144 minutes per day on social media. In the United States, people spent 123 minutes per day on social media [17], while they only spent 19 minutes per day exercising [1]. In other words, many people are very inclined to spend more time on social media than certain other leisure activities, allowing for potentially better-quality inferences; however, the variability is large, requiring researchers to ensure sufficient data are available for stable, repeatable algorithmic development.

Also, sometimes researchers only have posts from a particular group or channel and do not know the overall activity level of the users posting on the channel. If the inference is about the content of the post itself, that can be reliable. However, when companies and researchers make inferences about the users, they need to determine the minimum number of posts needed per user for the task. Not doing so will not only lead to measurement validity issues, but also algorithmic bias issues, where accuracy will be higher for those who are more active on social media. Ignoring the levels of user activity when building models reduces the fairness of the model. Inferences made on individuals who post less frequently are likely to be less reliable and more prone to measurement bias, leading to concerns associated with the ethical considerations of non-maleficence and justice.
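A simple way to make such a choice explicit is to impose a minimum-activity threshold and to report which groups it removes. A minimal pandas sketch, with a made-up threshold and hypothetical columns:

import pandas as pd

# Hypothetical post log; each row is one post.
posts = pd.DataFrame({
    'user_id':   [1, 1, 1, 1, 2, 2, 3, 4, 4, 4, 4, 4],
    'age_group': ['18-29'] * 6 + ['50+'] * 6,
})

MIN_POSTS = 4                                    # task-specific choice
post_counts = posts.groupby('user_id').size()
eligible = post_counts[post_counts >= MIN_POSTS].index

# Dropping low-activity users can itself skew the study population,
# so report the demographic mix of kept versus removed users.
users = posts.drop_duplicates('user_id').set_index('user_id')
kept = users.index.isin(eligible)
print(users.loc[kept, 'age_group'].value_counts())
print(users.loc[~kept, 'age_group'].value_counts())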

6 Transparent Computing

Users are inundated with huge volumes of information and a large number of choices. Social media applications have only made this more challenging. Users may be unaware of the data that are being shared or used by different platforms, and if they are aware, they may not understand how the platforms are using the data or how to make adjustments to their data sharing preferences. For example, many users on Facebook are not aware that people outside of their network of friends can see their account information [34, 19]. At the same time, computational models continue to increase in complexity, making it harder for researchers and programmers to understand the impact of different data features on developed inference models.

As machine learning models continue to influence human behavior and drive our decision making, the topic of transparency continues to recur within this context. Transparency is being discussed as a synonym for interpretability, i.e., an explanation of the model, to ensure a model's outcomes are understandable and fair to humans. However, transparent/interpretable models are only one piece of the learning process, or more broadly, of a computational model. Algorithmic fairness and privacy capture important desired characteristics of an implemented computing model with respect to ethical concerns. However, we argue that these concepts capture only a part of computational transparency. The complexity of the data and the software require us to reimagine data and software transparency. This includes transparency during data collection, data preprocessing, model design and implementation, and model outputs. When a system and the data within the system are transparent across all its components, it is not only easier to understand the system, but it also makes it easier to determine the components most vulnerable to ethical attacks.

Having more transparency throughout the software development lifecycle is not a new idea. However, exposing both the inputs and the data usages throughout will enable researchers to connect computation to the ethical principles defined in Section 2. Toward that end, we define transparent computing to be the study of transparency mechanisms for computing. Transparency is fundamental to all of the mentioned ethical principles. The pillars of transparent computing would include the following: (1) ensuring that coded policy constraints, including ethical considerations, related to user data are transparent, (2) ensuring that the data being used by algorithms are transparent,2 (3) ensuring that the models used to make inferences are transparent, and when possible interpretable, (4) ensuring that the reliability of the algorithm in terms of accuracy and fairness is transparent, and (5) ensuring that generalizability and limitations of model usage and analysis are clear. Without embedding principles of transparency into our problem formulation, model construction, and shared results, quantifying and measuring the effectiveness of our algorithms and privacy mechanisms with regards to ethical principles will not be straightforward. Our field will continue to have to grapple with the inconsistency between what we do and what makes ethical sense to do.
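As one small, concrete illustration of pillars (2) and (4), a researcher can log the size and demographic mix of the working dataset at every named preprocessing step and publish that log alongside the results. This is only a sketch, assuming a pandas-based pipeline; the step and column names are hypothetical:

import pandas as pd

audit_log = []

def log_step(step_name, df, group_col='gender'):
    # Record how each step changes the data that downstream models will see.
    audit_log.append({
        'step': step_name,
        'rows': len(df),
        'group_shares': df[group_col].value_counts(normalize=True).round(2).to_dict(),
    })
    return df

df = pd.DataFrame({'gender': ['f', 'f', 'm', 'f', 'm'], 'age': [22, 34, 41, 29, 55]})
df = log_step('raw data', df)
df = log_step('restrict to ages 25+', df[df['age'] >= 25])

for entry in audit_log:
    print(entry)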

7 Final Thoughts

While some of the complexities we identify are not novel and have been issues associated with traditional data sources, e.g., surveys and administrative data, the social media context is new and requires us to rethink our approach for addressing these complexities, while understanding the ethical implications of our solutions. How do we ensure that the algorithms and tools that we design are fair, transparent, and privacy-preserving for humans, while also being beneficial to society? The irregular noise in social media data, the inherent bias in features generated from the data, and the lack of understanding of the representativeness of social media platforms make it harder to develop correct and sound algorithmic, machine learning, and privacy solutions. And that task becomes even more challenging when we throw ethics into the mix.

As computer science researchers, we are at an important crossroads. We can continue to build algorithms and tools that companies use to exploit user data. Or we can understand that users who share their data freely with companies need researchers to develop guidelines for use of these data, for design of algorithms that consider transparency and fairness, and for technologies that protect user data and use user data responsibly. While this area is emerging and guidelines are still being developed, there are best practices that have been identified in other disciplines that we can begin to follow. First, if possible, researchers should use public data and make clear in the study how the data are collected. If the data are private, researchers can conduct thought exercises that work through the potential ethical considerations and potential user/societal harms to ensure that the use of private data is warranted. If a decision is made to move forward, the researchers should get consent to use the data and give users options to decline and leave the study at any point. When inferring new values or developing new models, researchers can compute not only measures of accuracy, but also measures of fairness.

2In cases where data values cannot be shared, measures can be shared to inform researchers about the properties of the data.

Finally, researchers should continue to make data, models, and results transparent and reproducible. By designing algorithms and technologies with ethical considerations from the outset, perhaps we can begin to create a new breed of computer science that deliberately connects our computational work to social responsibility.

Acknowledgements

This research is funded by National Science Foundation awards #1934925 and #1934494, the National Collaborative on Gun Violence Research (NCGVR), the Fritz Family Fellowship, and the Massive Data Institute (MDI) at Georgetown University. We thank our funders for supporting this research.

References

[1] American Time Use Survey News Release - 2019 Results (2020), https://www.bls.gov/news.release/archives/atus _06252020.htm [2] Bagrow, J.P., Liu, X., Mitchell, L.: Information flow reveals prediction limits in online social activity. Nature Human Behaviour 3(2), 122–128 (Feb 2019) [3] Barth, S., De Jong, M.D.: The privacy paradox–Investigating discrepancies between expressed privacy concerns and actual online behavior–A systematic literature review. Telematics and informatics 34(7), 1038–1058 (2017) [4] Blodgett, S.L., Barocas, S., Daumé III, H., Wallach, H.: Language (Technology) is Power: A Critical Survey of “Bias" in NLP. arXiv preprint arXiv:2005.14050 (2020) [5] Blodgett, S.L., Green, L., O’Connor, B.: Demographic dialectal variation in social media: A case study of African- American English. arXiv preprint arXiv:1608.08868 (2016) [6] Bolukbasi, T., Chang, K.W., Zou, J.Y., Saligrama, V., Kalai, A.T.: Man is to computer programmer as woman is to homemaker? debiasing word embeddings. In: Advances in neural information processing systems (2016) [7] Caliskan, A., Bryson, J.J., Narayanan, A.: Semantics derived automatically from language corpora contain human-like biases. Science 356(6334), 183–186 (2017) [8] Chouldechova, A., Roth, A.: A snapshot of the frontiers of fairness in machine learning. Commun. ACM 63(5), 82-89 (Apr 2020) [9] Dastin, J.: Amazon scraps secret ai recruiting tool that showed bias against women (2018), https://www.reuters.com/article/us-amazon-com-jobs-automation-insight/amazon-scraps-secret-ai-recruiting- tool-that-showed-bias-against-women-idUSKCN1MK08G [10] Day, M.: How linkedin’s search engine may reflect a gender bias (Aug 2016), https://www.seattletimes.com/business/microsoft/how-linkedins-search-engine-may-reflect-a-bias [11] De Choudhury, M., Gamon, M., Counts, S., Horvitz, E.: Predicting depression via social media. Icwsm 13, 1–10 (2013) [12] De Leeuw, E.D., Hox, J.J., Dillman, D.A.: International handbook of survey methodology. Taylor & Francis Group/Lawrence Erlbaum Associates (2008) [13] Farrand, T., Mireshghallah, F., Singh, S., Trask, A.: Neither private nor fair: Impact of data imbalance on utility and fairness in differential privacy. arXiv preprint arXiv:2009.06389 (2020) [14] Garg, N., Schiebinger, L., Jurafsky, D., Zou, J.: Word embeddings quantify 100 years of gender and ethnic stereotypes. Proceedings of the National Academy of Sciences 115(16), E3635–E3644 (2018) [15] Gonen, H., Goldberg, Y.: Lipstick on a pig: Debiasing methods cover up systematic gender biases in word embeddings but do not remove them. arXiv preprint arXiv:1903.03862 (2019) [16] of Health, U.D., Services, H., et al.: The belmont report: Office of the secretary, ethical principles and guidelines for the protection of human subjects of research, the national commission for the protection of human subjects of biomedical and behavioral research (1979) [17] Henderson, G.: How much time does the average person spend on social media? (Aug 2020), https://www.digitalmarketing.org/blog/how-much-time-does-the-average-person-spend-on-social-media [18] Hernandez, M., Hildrum, K., Jain, P., Wagle, R., Alexe, B., Krishnamurthy, R., Stanoi, I.R., Venkatramani, C.:

36 Constructing consumer profiles from social media data. In: IEEE International Conference on Big Data. pp. 710–716. IEEE (2013) [19] Hitlin, P., Rainie, L.: Facebook algorithms and personal data (Jan 2019), https://www.pewresearch.org/internet/2019/01/16/facebook-algorithms-and-personal-data/ [20] Hovy, D., Spruit, S.L.: The social impact of natural language processing. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). pp. 591–598 (2016) [21] Hughes, A., Wojcik, S.: Key takeaways from our new study of how americans use twitter (April 2019), https://www.pewresearch.org/fact-tank/2019/04/24/key-takeaways-from-our-new-study-of-how-americans- use-twitter [22] Kamiran, F., Calders, T.: Data preprocessing techniques for classification without discrimination. Knowledge and Information Systems 33, 1–33 (2011) [23] Kammourieh, L., Baar, T., Berens, J., Letouzé, E., Manske, J., Palmer, J., Sangokoya, D., Vinck, P.: Group privacy in the age of big data. In: Group Privacy, pp. 37–66. Springer (2017) [24] Kokolakis, S.: Privacy attitudes and privacy behaviour: A review of current research on the privacy paradox phe- nomenon. Computers & security 64, 122–134 (2017) [25] Kramer, A.D.I., Guillory, J.E., Hancock, J.T.: Experimental evidence of massive-scale emotional contagion through social networks. Proceedings of the National Academy of Sciences 111(24), 8788–8790 (2014) [26] Lipworth, W., Mason, P., Kerridge, I., Ioannidis, J.: Ethics and epistemology in big data research. Journal of Bioethical Inquiry 14 (2017) [27] Little, R.J., Rubin, D.B.: Statistical analysis with missing data, vol. 793. John Wiley & Sons (2019) [28] Luka, M.E., Millette, M., Wallace, J.: A feminist perspective on ethical digital methods. Internet research ethics for the social age p. 21 (2017) [29] Lynch, H.F., Largent, E.A.: Mountains and molehills when using social media as a research support tool. The American Journal of Bioethics 19(6), 64–66 (2019) [30] Madden, M., Lenhart, A., Cortesi, S., Gasser, U., Duggan, M., Smith, A., Beaton, M.: Teens, social media, and privacy. Pew Research Center 21(1055), 2–86 (2013) [31] Mikal, J., Hurst, S., Conway, M.: Ethical issues in using Twitter for population-level depression monitoring: a qualitative study. BMC medical ethics 17(1), 22 (2016) [32] O’Leary, D.E.: Ethics for big data and analytics. IEEE Intelligent Systems 31(4), 81–84 (2016) [33] O’neil, C.: Weapons of math destruction: How big data increases inequality and threatens democracy. Broadway Books (2016) [34] Orito, Y., Fukuta, Y., Murata, K.: I will continue to use this nonetheless: Social media survive users’ privacy concerns. International Journal of Virtual Worlds and Human Computer Interaction 2, 92–107 (12 2014) [35] Orlowski, J.: The social dilemma. Exposure Labs (2020) [36] Perrin, A.: Americans are changing their relationship with facebook (Sep 2018), https://www.pewresearch.org/fact- tank/2018/09/05/americans-are-changing-their-relationship-with-facebook/ [37] Perrin, A., Andreson, M.: Share of U.S. adults using social media, including Facebook, is mostly unchanged since 2018 (Apr 2019), https://www.pewresearch.org/fact-tank/2019/04/10/share-of-u-s-adults-using-social-media- including-facebook-is-mostly-unchanged-since-2018/ [38] Social media fact sheet (Jun 2019), https://www.pewresearch.org/internet/fact-sheet/social-media/ [39] Samuel, G., Buchanan, E.: Guest editorial: Ethical issues in social media research. 
Journal of Empirical Research on Human Research Ethics 15(1-2), 3–11 (2020) [40] Samuel, G., Derrick, G.E., van Leeuwen, T.: The Ethics Ecosystem: Personal Ethics, Network Governance and Regulating Actors Governing the Use of Social Media Research Data. Minerva 57(3), 317–343 (Sep 2019) [41] Schober, M.F., Pasek, J., Guggenheim, L., Lampe, C., Conrad, F.G.: Social Media Analyses for Social Measurement. Public Opinion Quarterly 80(1), 180–211 (2016) [42] Senthil Kumar N, Saravanakumar K, Deepa K: On privacy and security in social media : A comprehensive study. Procedia Computer Science 78, 114 – 119 (2016), international Conference on Information Security & Privacy 2015 [43] Shu, K., Wang, S., Tang, J., Zafarani, R., Liu, H.: User identity linkage across online social networks: A review. SIGKDD Explor. Newsl. 18(2), 5–17 (Mar 2017) [44] Singh, L., Wahedi, L., Wang, Y., Wei, Y., Kirov, C., Martin, S., Donato, K., Liu, Y., Kawintiranon, K.: Blending noisy social media signals with traditional movement variables to predict forced migration. p. 1975–1983. KDD ’19,

37 Association for Computing Machinery, New York, NY, USA (2019) [45] Singh, L., Yang, G.H., Sherr, M., Hian-Cheong, A., Tian, K., Zhu, J., Zhang, S.: Public information exposure detection: Helping users understand their web footprints. In: IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM). pp. 153–161. Paris, France (August 2015) [46] New duckduckgo research shows people taking action on privacy (Oct 2019), https://spreadprivacy.com/people-taking- action-on-privacy/ [47] Sun, T., Gaut, A., Tang, S., Huang, Y., ElSherief, M., Zhao, J., Mirza, D., Belding, E., Chang, K.W., Wang, W.Y.: Mitigating gender bias in natural language processing: Literature review. arXiv preprint arXiv:1906.08976 (2019) [48] Suresh, H., Guttag, J.V.: A framework for understanding unintended consequences of machine learning (2020) [49] Susser, D.: Information privacy and social self-authorship. Techne: Research in Philosophy and Technology 20(3), 216–239 (2016), copyright: Copyright 2017 Elsevier B.V., All rights reserved. [50] Susser, D., Roessler, B., Nissenbaum, H.: Online manipulation: Hidden influences in a digital world. SSRN Electronic Journal (01 2018) [51] Tangwa, G.B.: Ethical principles in health research and review process. Acta Tropica 112, S2 – S7 (2009), health Research in Africa: Ethical and Practical Challenges [52] Taylor, J., Pagliari, C.: Mining social media data: How are research sponsors and researchers addressing the ethical challenges? Research Ethics 14(2), 1–39 (2018) [53] Thaler, R.H., Sunstein, C.R.: Nudge: Improving decisions about health, wealth, and happiness. Penguin (2009) [54] Torous, J., Ungar, L., Barnett, I.: Expanding, augmenting, and operationalizing ethical and regulatory considerations for using social media platforms in research and health care. The American Journal of Bioethics 19(6), 4–6 (2019) [55] Turoldo, F.: Responsibility as an ethical framework for public health interventions. American journal of public health 99, 1197–202 (11 2008) [56] Veale, M., Binns, R.: Fairer machine learning in the real world: Mitigating discrimination without collecting sensitive data. Big Data & Society 4(2), 2053951717743530 (2017) [57] Wagner, K.: Here’s how facebook allowed cambridge analytica to get data for 50 million users (Mar 2018), https://www.vox.com/2018/3/17/17134072/facebook-cambridge-analytica-trump-explained-user-data [58] How much time do people spend on social media (11 insights) | Blog | Whatagraph (Aug 2020), https://whatagraph.com/blog/articles/how-much-time-do-people-spend-on-social-media [59] Özkula, S.M.: The Issue of “Context": Data, Culture, and Commercial Context in Social Media Ethics. Journal of Empirical Research on Human Research Ethics 15(1-2), 77–86 (2020)

Taming Technical Bias in Machine Learning Pipelines *

Sebastian Schelter
University of Amsterdam & Ahold Delhaize
Amsterdam, The Netherlands
[email protected]

Julia Stoyanovich
New York University
New York, NY, USA
[email protected]

Abstract

Machine Learning (ML) is commonly used to automate decisions in domains as varied as credit and lending, medical diagnosis, and hiring. These decisions are consequential, imploring us to carefully balance the benefits of efficiency with the potential risks. Much of the conversation about the risks centers around bias — a term that is used by the technical community ever more frequently but that is still poorly understood. In this paper we focus on technical bias — a type of bias that has so far received limited attention and that the data engineering community is well-equipped to address. We discuss dimensions of technical bias that can arise through the ML lifecycle, particularly when it's due to preprocessing decisions or post-deployment issues. We present results of our recent work, and discuss future research directions. Our over-all goal is to support the development of systems that expose the knobs of responsibility to data scientists, allowing them to detect instances of technical bias and to mitigate it when possible.

1 Introduction

Machine Learning (ML) is increasingly used to automate decisions that impact people's lives, in domains as varied as credit and lending, medical diagnosis, and hiring. The risks and opportunities arising from the widespread use of predictive analytics are garnering much attention from policy makers, scientists, and the media. Much of this conversation centers around bias — a term that is used by the technical community ever more frequently but that is still poorly understood. In their seminal 1996 paper, Friedman and Nissenbaum identified three types of bias that can arise in computer systems: pre-existing, technical, and emergent [9]. We briefly discuss these in turn; see Stoyanovich et al. [33] for a more comprehensive overview.

• Pre-existing bias has its origins in society. In ML applications, this type of bias often exhibits itself in the input data; detecting and mitigating it is the subject of much research under the heading of algorithmic fairness [5]. Importantly, the presence or absence of pre-existing bias cannot be scientifically verified, but rather is postulated based on a belief system [8, 12]. Consequently, the effectiveness — or even the validity — of a technical attempt to mitigate pre-existing bias is predicated on that belief system.

*This work was supported in part by NSF Grants No. 1926250, 1934464, and 1922658, and by Ahold Delhaize. All content represents the opinion of the authors, which is not necessarily shared or endorsed by their respective employers and/or sponsors.

• Technical bias arises due to the operation of the technical system itself, and can amplify pre-existing bias. The bad news is that, as we argue in the remainder of this paper, the risks of introducing technical bias in ML pipelines abound. The good news is that, unlike with pre-existing bias, there is no ambiguity about whether a technical fix should be attempted: if technical systems we develop are introducing bias, then we should be able to instrument these systems to measure it and understand its cause. It may then be possible to mitigate this bias and to check whether the mitigation was effective.

• Emergent bias arises in the context of use of the technical system. In Web ranking and recommendation in e-commerce, a prominent example is “rich-get-richer”: searchers tend to trust the systems to indeed show them the most suitable items at the top positions, which in turn shapes a searcher’s idea of a satisfactory answer.

In this paper, we focus on technical bias — a type of bias that has so far received limited attention, particularly when it's due to preprocessing decisions or post-deployment issues, and that the data engineering community is well-equipped to address. Our over-all goal is to support the development of systems that expose the knobs of responsibility to data scientists, allowing them to detect instances of technical bias, and to mitigate it when possible.

Running example. We illustrate the need for taming technical bias with an example from the medical domain. Consider a data scientist who implements a Python pipeline that takes demographic and clinical history data as input, and trains a classifier to identify patients at risk for serious complications. Further, assume that the data scientist is under a legal obligation to ensure that the resulting ML model works equally well for patients across different gender and age groups. This obligation is operationalized as an intersectional fairness criterion, requiring equal false negative rates for groups of patients identified by a combination of gender and age group. Consider Ann, a data scientist who is developing this classifier. Following her company's best practices, Ann will start by splitting her dataset into training, validation, and test sets. Ann will then use pandas, scikit-learn [19], and their accompanying data transformers to explore the data and implement data preprocessing, model selection, tuning, and validation. Ann starts preprocessing by computing value distributions and correlations for the features in her dataset, and by identifying missing values. She will fill these in using a default interpolation method in scikit-learn, replacing missing values with the mode value for that feature. Finally, following the accepted best practices at her company, Ann implements model selection and hyperparameter tuning. As a result of this step, Ann will select a classifier that shows acceptable performance according to her company's standard metrics: it has sufficient accuracy, while also exhibiting sufficiently low variance. When Ann considers the accuracy of her classifier closely, she observes a disparity: accuracy is lower for middle-aged women. Ann is now faced with the challenge of figuring out why this is the case, whether any of her technical choices during pipeline construction contributed to this model bias, and what she can do to mitigate this effect. We will revisit this example, and also discuss issues that may arise after the model is deployed, in the remainder of this paper.

Roadmap. The rest of this paper is organized as follows. In Section 2, we outline the dimensions of technical bias as they relate to two lifecycle views of ML applications: the data lifecycle and the lifecycle of design, development, deployment, and use. Then, in Section 3 we present our recent work on helping data scientists responsibly develop ML pipelines, and validate them post-deployment. We conclude in Section 4 with directions for future research.

2 Dimensions of Technical Bias

There are many different ways in which Ann (or her colleagues who deploy her model) could accidentally introduce technical bias. Some of these relate to the view of ML model development through the lens of the data lifecycle. As argued in Stoyanovich et al. [33], responsibility concerns, and important decision points, arise in

data sharing, annotation, acquisition, curation, cleaning, and integration. Thus, opportunities for improving data quality and representativeness, controlling for bias, and allowing humans to oversee the process, are missed if we do not consider these earlier data lifecycle stages. We discuss these dimensions of technical bias in Section 2.1. Additional challenges, and opportunities to introduce technical bias, arise after a model is deployed. We discuss these in Section 2.2. Note that, in contrast to Bower et al. [4] and Dwork et al. [7], who study fairness in ML pipelines in which multiple models are composed, we focus on complex — and typical — pipelines in which bias may arise due to the composition of data preprocessing steps, or to data distribution shifts past deployment.

2.1 Model Development Stage

There are several subtle ways in which data scientists can accidentally introduce data-related bias into their models during the development stage. Our discussion in this section is inspired by the early influential work by Barocas and Selbst [1], and by Lehr and Ohm [15], who highlighted the issues that we will make more concrete.

Data cleaning. Methods for missing value imputation that are based on incorrect assumptions about whether data is missing at random may distort protected group proportions. Consider a form that gives patients a binary choice of gender and also allows them to leave gender unspecified. Suppose that about half of the users identify as men and half as women, but that women are more likely to omit gender. Then, if mode imputation (replacing a missing value with the most frequent value for the feature, a common choice in scikit-learn) is used, all (predominantly female) unspecified gender values will be set to male. More generally, multi-class classification for missing value imputation typically only uses the most frequent classes as target variables [3], leading to a distortion for small population groups, because membership in these groups will never be imputed. Next, suppose that some individuals identify as non-binary. Because the system only supports male, female, and unspecified as options, these individuals will leave gender unspecified. If mode imputation is used, then their gender will be set to male. A more sophisticated imputation method will still use values from the active domain of the feature, setting the missing values of gender to either male or female. This example illustrates that bias can arise from an incomplete or incorrect choice of data representation. Finally, consider a form that has home address as a field. A homeless person will leave this value unspecified, and it is incorrect to attempt to impute it. While dealing with null values is known to be difficult and is already considered among the issues in data cleaning, the needs of responsible data management introduce new problems. Further, data quality issues often disproportionately affect members of historically disadvantaged groups [14], and so we risk compounding technical bias due to data representation with pre-existing bias.

Data filtering. Selections and joins can arbitrarily change the proportion of protected groups (e.g., for certain age groups) even if they do not directly use the sensitive attribute (e.g., age) as part of the predicate or of the join key. This change in proportion may be unintended and is important to detect, particularly when this happens during one of many preprocessing steps in the ML pipeline. During model development, Ann might have filtered the data by zip code or county to get a sample that is easier to work with. Demographic attributes such as age and income are highly correlated with place of residence, so such a seemingly innocent filtering operation might have heavily biased the data. Another potential source of technical bias is the increasingly common usage of pre-trained word embeddings. For example, Ann's code might replace a textual name feature with the corresponding vector from a word embedding that is missing for rare, non-western names (due to lack of data representation in the training corpus). If we then filter out records for which no embedding was found, we may disproportionately remove individuals from specific ethnic groups.
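The mode-imputation effect described above is easy to reproduce. A minimal sketch with scikit-learn's SimpleImputer on synthetic data (the counts are made up; in this example women are more likely to leave gender unspecified):

import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Synthetic column: 50 men, 40 women, 10 unspecified values.
gender = pd.Series(['male'] * 50 + ['female'] * 40 + [np.nan] * 10, name='gender')
print(gender.value_counts(dropna=False))            # before: 50 male, 40 female, 10 missing

imputer = SimpleImputer(strategy='most_frequent')
imputed = imputer.fit_transform(gender.to_frame())

# Every unspecified value is now 'male', shifting the gender proportions.
print(pd.Series(imputed.ravel()).value_counts())    # after: 60 male, 40 female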
Unsound experimentation. Design and evaluation of ML models is a difficult and tedious undertaking and requires data scientists to strictly follow a set of best practices. During this process, it is unfortunately easy to make subtle mistakes that can heavily impact the quality of the resulting model. In previous research, we

found that even expert users violate such best practices in highly cited studies [29]. Common mistakes include hyperparameter selection on the test set instead of the validation set, lack of hyperparameter tuning for baseline learners, lack of proper feature normalisation, or ignoring problematic data subsets during training. While unsound experimentation is a general issue, ignoring problematic data subsets can specifically affect performance for minority and underrepresented groups, because their data might be prone to data quality issues, as we already discussed under data filtering above.
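For reference, a minimal sketch of the intended protocol in scikit-learn: hyperparameters are chosen via cross-validation on the training portion only, feature scaling happens inside the pipeline, and the held-out test set is used exactly once. The dataset and the hyperparameter grid are placeholders:

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Scaling is part of the pipeline, so it is fit on training folds only.
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=5000))
search = GridSearchCV(pipeline, {'logisticregression__C': [0.01, 0.1, 1, 10]}, cv=5)
search.fit(X_train, y_train)                 # tuning never sees the test set

print('chosen hyperparameters:', search.best_params_)
print('final test accuracy:', search.score(X_test, y_test))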

2.2 Model Deployment Stage

After the design of a model is finished, the model is deployed into production and produces predictions on unseen data. We outline a set of circumstances that can introduce technical bias at this stage.

Data errors introduced through integration. In modern information infrastructures, data is stored in different environments (e.g., in relational databases, in ‘data lakes’ on distributed file systems, or behind REST APIs), and it comes in many different formats. Many such data sources do not support integrity constraints and data quality checks, and often there is not even an accompanying schema available as the data is consumed in a ‘schema-on-read’ manner, where a particular application takes care of the interpretation. Additionally, there is a growing demand for applications consuming semi-structured data such as text, videos, and images. Due to these circumstances, every real-world ML application has to integrate data from multiple sources, and errors in the data sources or during integration may lead to errors in downstream ML models that consume the data. In our running example in Section 1, it may be the case that patient data is integrated from data sources of different healthcare providers. If one of these providers accidentally changes their schema, or introduces bugs in their data generation procedure, this may negatively impact the predictions for the corresponding patients when their data is used as input to Ann's model.

Distribution shifts. The maintenance of ML applications remains challenging [21], due in large part to unexpected shifts in the distribution of serving data. These shifts originate from changes in the data generating process in the real world, and the problem is exacerbated in situations where different parties are involved in the provision of the data and the training of the model. Many engineering teams, especially in smaller companies, lack ML expert knowledge, and therefore often outsource the training of ML models to data science specialists or cloud ML services. In such cases, the engineering team provides the input data and retrieves predictions, but might not be familiar with details of the model. While ML experts have specialized knowledge to debug models and predictions in such cases [16], there is a lack of automated methods for non-ML expert users to decide whether they can rely on the predictions of an ML model on unseen data. In Ann's case, her final deployed model might work well until new regulations for health care providers change the shape and contents of the patient data that they produce. If her model is not retrained on proper data, its prediction quality may quickly deteriorate.

In the following section we will introduce three software libraries that we developed in recent research to help data scientists like Ann detect and mitigate technical bias during model development and deployment.
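Before turning to those libraries, here is a minimal sketch of the kind of drift check an engineering team could run on every serving batch; the feature, the synthetic data, and the significance threshold are all placeholders:

import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
train_age = rng.normal(45, 12, size=5000)        # ages observed at training time
serving_age = rng.normal(57, 10, size=800)       # ages in a later serving batch

# A two-sample Kolmogorov-Smirnov test flags a shift in the feature distribution.
statistic, p_value = ks_2samp(train_age, serving_age)
if p_value < 0.01:
    print(f'age distribution shifted (KS statistic {statistic:.2f}); consider retraining')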

3 Taming Technical Bias during Model Development and Deployment

In Schelter et al. [29] we described FairPrep, a design and evaluation framework for fairness-enhancing interventions in machine learning pipelines that treats data as a first-class citizen. The framework implements a modular data lifecycle, enables re-use of existing implementations of fairness metrics and interventions, and integration of custom feature transformations and data cleaning operations from real world use cases. FairPrep pursues the following goals: (i) Expose a developer-centered design throughout the lifecycle, which allows for low effort customization and composition of the framework’s components; (ii) Surface discrimination and due process concerns, including disparate error rates, failure of a model to fit the data, and failure of a model to generalize. (iii) Follow software engineering and machine learning best practices to reduce the technical

debt of incorporating fairness-enhancing interventions into an already complex development and evaluation scenario [26, 31].

Figure 1: Data life cycle in FairPrep, designed to enforce isolation of test data, and to allow for customization through user-provided implementations of different components. An evaluation run consists of three different phases: (1) Learn different models, and their corresponding data transformations, on the training set; (2) Compute performance / accuracy-related metrics of the model on the validation set, and allow the user to select the ‘best’ model according to their setup; (3) Compute predictions and metrics for the user-selected best model on the held-out test set.

Figure 1 summarizes the architecture of FairPrep, which is based on three main principles:

1. Data isolation: to avoid target leakage, user code should only interact with the training set, and never be able to access the held-out test set.

2. Componentization: different data transformations and learning operations should be implementable as single, exchangeable standalone components; the framework should expose simple interfaces to users, supporting low effort customization.

3. Explicit modeling of the data lifecycle: the framework defines an explicit, standardized data lifecycle that applies a sequence of data transformations and model training in a predefined order.

FairPrep currently focuses on data cleaning, including different methods for data imputation, and model selection and validation, including hyperparameter tuning, and can be extended to accommodate earlier lifecycle stages, such as data acquisition, integration, and curation. Schelter et al. [29] measured the impact of sound best practices, such as hyperparameter tuning and feature scaling, on the fairness and accuracy of the resulting classifiers, and also showcased how FairPrep enables the inclusion of incomplete data into studies and helps analyze the effects. If Ann wants to ensure that she follows sound experimentation practices during model development, she can use the FairPrep library as a runtime platform for experiments, for example to compute various fairness-related metrics for the predictions of her classifier. Furthermore, she can leverage the component architecture of FairPrep to evaluate different missing value imputation techniques and fairness-enhancing interventions to see whether these help with mitigating the low accuracy that she encountered in her model for the predictions for middle-aged women, as discussed in our running example in Section 1.

Source code. A prototype implementation of FairPrep is available at https://github.com/DataResponsibly/FairPrep.
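As a concrete example of the kind of disparity Ann would inspect, the intersectional criterion from the running example (equal false negative rates across combinations of gender and age group) can be computed in a few lines of pandas. This is a sketch with made-up predictions, independent of FairPrep's API:

import pandas as pd

# Hypothetical test-set labels and model predictions with group memberships.
results = pd.DataFrame({
    'gender':    ['f', 'f', 'm', 'f', 'm', 'f', 'm', 'm'],
    'age_group': ['30-49', '30-49', '30-49', '65+', '65+', '30-49', '65+', '30-49'],
    'y_true':    [1, 1, 1, 1, 0, 1, 1, 1],
    'y_pred':    [0, 1, 1, 1, 0, 0, 1, 1],
})

# False negative rate per intersectional group: share of true positives that are missed.
positives = results[results['y_true'] == 1]
fnr = (positives.assign(missed=positives['y_pred'] == 0)
                .groupby(['gender', 'age_group'])['missed'].mean())
print(fnr)   # large gaps between groups violate the equal-FNR requirement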

3.1 Detecting Data Distribution Bugs Introduced in Preprocessing

In our recent work on the mlinspect library [10], we focus on helping data scientists diagnose and mitigate problems which we collectively refer to as data distribution bugs. These types of bugs are often introduced during preprocessing, for reasons we outlined in Section 2. For example, preprocessing operations that involve filters or joins can heavily change the distribution of different groups in the training data [35], and missing value imputation can also introduce skew [28]. Recent ML fairness research, which mostly focuses on the use of learning algorithms on static datasets [5], is therefore insufficient, because it cannot address such technical bias originating from the data preparation stage. In addition, we should detect and mitigate such bias as close to its source as possible.

Unfortunately, such data distribution issues are difficult to catch. In part, this is because different pipeline steps are implemented using different libraries and abstractions, and the data representation often changes from relational data to matrices during data preparation. Further, preprocessing in the data science ecosystem [23] often combines relational operations on tabular data with estimator/transformer pipelines,1 a composable and nestable abstraction for combining operations on array data, which originates from scikit-learn [19] and has been adopted by popular libraries like SparkML [18] and Tensorflow Transform. In such cases, tracing problematic featurised entries back to the pipeline's initial human-readable input is tedious work. Finally, complex estimator/transformer pipelines are hard to inspect because they often result in nested function calls not obvious to the data scientist. Due to time pressure in their day-to-day activities, most data scientists will not invest the necessary time and effort to manually instrument their code or insert logging statements for tracing as required by model management systems [34, 36]. This calls for the development of tools that support automated inspection of ML pipelines, similar to the inspections used by modern IDEs to highlight potentially problematic parts of a program, such as the use of deprecated code or problematic library function calls. Once data scientists are pointed to such issues, they can use data debuggers like Dagger [17] to drill down into the specific intermediate pipeline outputs and explore the root cause of the issue. Furthermore, to be most beneficial, automated inspections need to work with code natively written with popular ML library abstractions.

Lightweight inspection with mlinspect. To enable lightweight pipeline inspection, we designed and implemented mlinspect [10], a library that helps data scientists automatically detect data distribution issues in their ML pipelines, such as the accidental introduction of statistical bias, and provides linting for best practices. The mlinspect library extracts logical query plans, modeled as directed acyclic graphs (DAGs) of preprocessing operators, from ML pipelines that use popular libraries like pandas and scikit-learn, and combine relational operations and estimator/transformer pipelines. These plans are then used to automatically instrument the code and trace the impact of operators on properties like the distribution of sensitive groups in the data.
Importantly, mlinspect implements a library-independent interface to propagate annotations, such as the lineage of tuples, across operators from different libraries, and introduces only constant overhead per tuple flowing through the DAG. Thereby, the library offers a general runtime for pipeline inspection, and allows us to integrate many issue detection techniques that previously required custom code, such as automated model validation on data slices [22], the identification of distortions with respect to protected group membership in the training data [35], or automated sanity checking for ML datasets [13].

Identifying data distribution bugs in our running example. Figure 2 shows a preprocessing pipeline and potential data distribution bugs for our running example from Section 1. The pipeline first reads two CSV files, which contain patient demographics and their clinical histories, respectively. Next, these dataframes are joined on the ssn column. This join may introduce a data distribution bug (as indicated by issue 1) if a large percentage of the records for some combination of gender and age group do not have matching entries in the clinical history dataset. Next, the pipeline computes the average number of complications per age group and adds a binary target label to the dataset, indicating which patients had a higher than average number of complications compared to their age group.

1https://scikit-learn.org/stable/modules/compose.html

(Figure 2: potential issues in the preprocessing pipeline, highlighted on the left; the Python script for preprocessing, written exclusively with native pandas and sklearn constructs, in the center; and the corresponding dataflow DAG for instrumentation, extracted by mlinspect, on the right.)

Figure 2: ML pipeline for our running example that predicts which patients are at a higher risk of serious complications, under the requirement to achieve comparable false negative rates across intersectional groups by gender and age group. On the left, we highlight potential issues identified by mlinspect. On the right, we show the corresponding dataflow graph, extracted to instrument the code and pinpoint the issues.

The data is then projected to a subset of the attributes, to be used by the classification model. This leads to the second issue in the pipeline (issue 2): the data scientist needs to ensure that the model achieves comparable accuracy across different age groups, but the age group attribute is projected out here, making it difficult to catch data distribution bugs later in the pipeline. The data scientist additionally filters the data to only contain records from patients within a given set of counties. This may lead to issue 3: a data distribution bug may be introduced if the populations of different counties systematically differ in age. Next, the pipeline creates a feature matrix from the dataset by applying common feature encoders with ColumnTransformer from scikit-learn, before training a neural network on the features. For the categorical attributes smoker, county, and gender, the pipeline imputes missing values with mode imputation (using the most frequent attribute value), and subsequently creates one-hot-encoded vectors from the data. The last_name is replaced with a corresponding vector from a pretrained word embedding, and the numerical attributes num_children and income are normalized. This feature encoding part of the pipeline introduces several potential issues: (issue 4) the imputation of missing values for the categorical attributes may introduce statistical bias, as it may associate records with a missing value in the gender attribute with the majority gender in the dataset; (issue 5) depending on the legal context (i.e., if the disparate treatment doctrine is enforced), it may be forbidden to use gender as an input to the classifier; (issue 6) we may not have vectors for rare non-western names in the word embedding, which may in turn lead to lower model accuracy for such records. As illustrated by this example, preprocessing can give rise to subtle data distribution bugs that are difficult to identify manually, motivating the development of automatic inspection libraries such as mlinspect, which point the data scientist to these issues.

Source code. A prototype implementation of mlinspect, together with a computational notebook that shows how mlinspect can be used to address the issues outlined in the ML pipeline in Figure 2, is available at https://github.com/stefan-grafberger/mlinspect.
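For reference, the preprocessing script shown in Figure 2 can be reconstructed roughly as follows. This is an illustrative sketch only: the input file names and counties_of_interest are assumptions, and the custom Word2VecTransformer and the create_model factory for the Keras-based neural network are assumed to be defined elsewhere.

import pandas
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier  # one possible wrapper for the neural network

# load input data sources, join to a single table
patients = pandas.read_csv('patients.csv')    # file names assumed
histories = pandas.read_csv('histories.csv')
data = patients.merge(histories, on=['ssn'])

# compute mean complications per age group, append as a column
complications = data.groupby('age_group').agg(
    mean_complications=('complications', 'mean')).reset_index()
data = data.merge(complications, on=['age_group'])

# target variable: people with frequent complications relative to their age group
data['label'] = data['complications'] > 1.2 * data['mean_complications']

# project data to a subset of attributes, filter by counties
data = data[['smoker', 'last_name', 'county', 'num_children', 'gender', 'income', 'label']]
data = data[data['county'].isin(counties_of_interest)]  # counties_of_interest assumed

# nested feature encoding pipeline for the data
impute_and_encode = Pipeline([
    ('impute', SimpleImputer(strategy='most_frequent')),
    ('encode', OneHotEncoder())])
featurisation = ColumnTransformer(transformers=[
    ('categorical', impute_and_encode, ['smoker', 'county', 'gender']),
    ('name_embedding', Word2VecTransformer(), 'last_name'),    # custom transformer, assumed
    ('numeric', StandardScaler(), ['num_children', 'income'])])

# training pipeline: feature encoding followed by a neural network
pipeline = Pipeline([
    ('features', featurisation),
    ('learning_algorithm', KerasClassifier(build_fn=create_model))])  # create_model assumed

# train-test split, model training and evaluation
train_data, test_data = train_test_split(data)
model = pipeline.fit(train_data, train_data.label)
print(model.score(test_data, test_data.label))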

3.2 Validating Serving Data with Data Unit Tests

Machine learning (ML) techniques are very sensitive to their input data, as the deployed models rely on strong statistical assumptions about their inputs [32], and subtle errors introduced by changes in the data distribution can be hard to detect [20]. At the same time, there is ample evidence that the volume of data available for training is often a decisive factor for a model's performance [11]. How errors in the data affect the performance and fairness of deployed machine learning models is an open and pressing research question, especially in cases where the data describing protected groups has a higher likelihood of containing errors or missing values [29].

Unit tests for data with Deequ. As discussed in Section 2.2, accidental errors during data integration can heavily impact the prediction quality of downstream ML models. We therefore postulate that there is a pressing need for increased automation of data validation. To respond to this need, Schelter et al. [30] presented Deequ, a data unit testing library. The library centers around the vision that users should be able to write ‘unit-tests’ for data, analogous to established testing practices in software engineering, and is built on the following principles:

1. Declarativeness: allowing data scientists to spend time thinking about what their data should look like, and not about how to implement the quality checks. Deequ offers a declarative API that allows users to define checks on their data by composing a variety of available constraints.

2. Flexibility: allowing users to leverage external data and custom code for validation (e.g., call a REST service for some data and write a complex function that compares the result to some statistic computed on the data).

3. Continuous integration: explicitly supporting the incremental computation of quality metrics on growing datasets [27], and allowing users to run anomaly detection algorithms on the resulting historical time series of quality metrics (a minimal sketch of such an anomaly check follows this list).

4. Scalability: scaling seamlessly to large datasets, by translating the data metrics computations to aggregation queries, which can be efficiently executed at scale with a distributed dataflow engine such as Apache Spark [37].
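The anomaly detection mentioned in the third principle can be as simple as flagging a metric value that deviates strongly from its recorded history. The following sketch is a generic illustration and not Deequ's actual anomaly detection API:

import statistics

def is_anomalous(history, latest, num_stddevs=3.0):
    # Flag the latest metric value if it lies far outside the historical distribution.
    mean = statistics.mean(history)
    std = statistics.stdev(history)
    return abs(latest - mean) > num_stddevs * std

# e.g., is_anomalous(history=[10452, 10610, 10389, 10725], latest=4120) returns True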

Unit testing serving data in our running example. A prime use case of Deequ in ML deployments is to test new data to be sent to the model for prediction. When Ann deploys her model for real world usage, she wants to make sure that it will only consume well-formed data. She can use Deequ to write down her assumptions about the data as a declarative data unit test, and have this test integrated into the pipeline that feeds data to the deployed model. If any assumptions are violated, the pipeline will stop processing, the data will be quarantined, and a data engineer will be prompted to investigate the root cause of the failure. Listing 1 shows what a data unit test may look like. We precompute certain expected statistics for the data, such as the number of patients to predict for, the valid age groups, and the expected distributions by gender and age group. Next, we write down our assumptions about the data, similar to integrity constraints in relational databases. We declare the following checks: we assume that the size of the data corresponds to the expected number of patients, we expect social security numbers (the ssn attribute) to be unique, and we expect no missing values for the lastname, county, and age_group attributes. We furthermore assume that the values of the smoker attribute are Boolean, while the num_children attribute contains integers, and we expect the age_group attribute to only contain valid age group values, as defined beforehand. We also expect values of the num_children attribute to be non-negative. Finally, we compare the distribution of age groups and gender in the serving data to their expected distribution via the histogramSatisfies constraint. The user-defined function notDiverged compares the categorical distributions of these columns and returns a Boolean value.

// Computed in advance
val expectedNumPatients = ...
val validAgeGroups = ...
val expectedGenderDist = ...
val expectedAgeGroupDist = ...

// Assumptions about data to predict on
val validationResultForTestData = VerificationSuite()
  .onData(testData)
  .addCheck()
    .hasSize(expectedNumPatients)
    .isUnique("ssn")
    .isComplete("lastname", "county", "age_group")
    .hasDataType("smoker", Boolean)
    .hasDataType("num_children", Integral)
    .isNonNegative("num_children")
    .isContainedIn("age_group", validAgeGroups)
    .histogramSatisfies("age_group", { ageGroupDist =>
      notDiverged(ageGroupDist, expectedAgeGroupDist) })
    .histogramSatisfies("gender", { genderDist =>
      notDiverged(genderDist, expectedGenderDist) })
  .run()

if (validationResultForTestData.status != Success) {
  // Abort pipeline, notify data engineers
}

Listing 1: Example of a data unit test.

During the execution of the test, Deequ identifies the statistics required for evaluating the constraints and generates queries in SparkSQL with custom-designed aggregation functions to compute them. For performance reasons, it applies multi-query optimization to enable scan-sharing for the aggregation queries, minimizing the number of passes over the input data. Once the data statistics are computed, Deequ invokes the validation functions and returns the evaluation results to the user.

Source code. Deequ is available under an open source license at https://github.com/awslabs/deequ. For example, it forms the basis of Amazon's recent Model Monitor service2 for concept drift detection in the SageMaker machine learning platform.
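To give a flavor of this translation, the checks in Listing 1 need statistics such as the dataset size, the number of distinct ssn values, the number of missing values per column, and the number of negative num_children values, all of which can be gathered in a single pass over the data. A simplified PySpark sketch of such a shared-scan aggregation (not the query that Deequ actually generates) is:

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
test_data = spark.read.csv('serving_data.csv', header=True, inferSchema=True)  # path assumed

# One aggregation query computes all required statistics in a single scan of the data.
stats = test_data.agg(
    F.count('*').alias('size'),
    F.countDistinct('ssn').alias('distinct_ssn'),
    F.sum(F.col('lastname').isNull().cast('int')).alias('missing_lastname'),
    F.sum(F.col('county').isNull().cast('int')).alias('missing_county'),
    F.sum((F.col('num_children') < 0).cast('int')).alias('negative_num_children')).collect()[0]

# The validation functions then compare these statistics against the declared assumptions.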

4 Conclusions and Future Research Directions

In this paper we discussed dimensions of technical bias that can arise through the lifecycle of machine learning applications, both during model development and after deployment. We outlined several approaches to detect and mitigate such bias based on our recent work, and will now discuss promising directions for future research, where the data engineering community has the potential to make a significant impact. We see the overarching goal of this line of research not in mechanically scrubbing data or algorithms of bias, but rather in equipping data scientists with tools that can help them identify technical bias, understand any trade-offs, and thoughtfully enact interventions.

Integrating technical bias detection into general software development tooling. Data science is rapidly becoming an important part of the toolbox of a “general software engineer”, and so methods for the detection and mitigation of technical bias need to become part of that toolbox as well. The scope of these methods must be extended beyond binary classification, and they must embrace human-in-the-loop elements by providing visualisations and allowing end-users to control experiments with low effort. To achieve practical impact, it is important to integrate these methods into common computational notebooks such as Jupyter, and into general IDEs such as PyCharm.

2 https://aws.amazon.com/blogs/aws/amazon-sagemaker-model-monitor-fully-managed-automatic-monitoring-for-your-machine-learning-models/

Automating data quality monitoring. The challenge of automating the operation of deployed ML applications has been gaining a lot of attention recently, especially with respect to monitoring the quality of their input data [25]. As outlined in Sections 2 and 3, data quality issues and the choice of a data cleaning technique can be a major source of technical bias. Existing approaches [2, 30] for this problem have not yet reached broad adoption, in part because they rely on substantial domain knowledge, needed, for example, to define “data unit tests” and the corresponding similarity metrics, and to set thresholds for detecting data distribution shifts. Additionally, it is very challenging to test data during the earlier pipeline stages (e.g., data integration) without explicit knowledge of how an ML model will transform this data at the later stages. We thus see a dire need for automated or semi-automated approaches to quantify and monitor data quality in ML pipelines. A promising direction is to treat historical data (for which no system failures were recorded and no negative user feedback has been received) as “positive” examples, and to explore anomaly detection-based methods to identify future data that heavily deviates from these examples. It is important to integrate a technical bias perspective into these approaches, for example by measuring data quality separately for subsets of the data that correspond to historically disadvantaged or minority groups, since these groups tend to be more heavily hit by data quality issues [6].

Integrating technical bias detection into continuous integration systems for ML. Continuous integration is an indispensable step of modern best practices in software engineering to control the quality of deployed software, typically by automatically ensuring that software changes pass a set of unit and integration tests before deployment. There is ongoing work to adapt and reinvent continuous integration for the machine learning engineering process [24], which exposes a lifecycle similar to the software engineering lifecycle, as discussed in Section 2. We see the need to make detection techniques for technical bias, such as automated inspections and data unit tests, first-class citizens in ML-specific continuous integration systems.

References

[1] Solon Barocas and Andrew D. Selbst. Big data’s disparate impact. California Law Review, 104(3):671–732, 2016.

[2] Denis Baylor, Eric Breck, Heng-Tze Cheng, Noah Fiedel, Chuan Yu Foo, Zakaria Haque, Salem Haykal, Mustafa Ispir, Vihan Jain, Levent Koc, et al. Tfx: A tensorflow-based production-scale machine learning platform. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1387–1395, 2017.

[3] Felix Biessmann, David Salinas, Sebastian Schelter, Philipp Schmidt, and Dustin Lange. "Deep" learning for missing value imputation in tables with non-numerical data. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management, pages 2017–2025. ACM, 2018.

[4] Amanda Bower, Sarah N. Kitchen, Laura Niss, Martin J. Strauss, Alexander Vargas, and Suresh Venkatasubramanian. Fair pipelines. CoRR, abs/1707.00391, 2017.

[5] Alexandra Chouldechova and Aaron Roth. A snapshot of the frontiers of fairness in machine learning. Commun. ACM, 63(5):82–89, 2020.

[6] Yeounoh Chung, Tim Kraska, Neoklis Polyzotis, Ki Hyun Tae, and Steven Euijong Whang. Slice finder: Automated data slicing for model validation. In 2019 IEEE 35th International Conference on Data Engineering (ICDE), pages 1550–1553. IEEE, 2019.

[7] Cynthia Dwork, Christina Ilvento, and Meena Jagadeesan. Individual fairness in pipelines. In Aaron Roth, editor, 1st Symposium on Foundations of Responsible Computing, FORC 2020, June 1-3, 2020, Harvard

University, Cambridge, MA, USA (virtual conference), volume 156 of LIPIcs, pages 7:1–7:22. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2020.

[8] Sorelle A. Friedler, Carlos Scheidegger, and Suresh Venkatasubramanian. On the (im)possibility of fairness. CoRR, abs/1609.07236, 2016.

[9] Batya Friedman and Helen Nissenbaum. Bias in computer systems. ACM Trans. Inf. Syst., 14(3):330–347, 1996.

[10] Stefan Grafberger, Julia Stoyanovich, and Sebastian Schelter. Lightweight inspection of data preprocessing in native machine learning pipelines. In CIDR, 2021.

[11] Alon Halevy, Peter Norvig, and Fernando Pereira. The unreasonable effectiveness of data. IEEE Intelligent Systems, 24(2):8–12, 2009.

[12] Hoda Heidari, Michele Loi, Krishna P. Gummadi, and Andreas Krause. A moral framework for understanding fair ML through economic models of equality of opportunity. In Proceedings of the Conference on Fairness, Accountability, and Transparency, FAT* 2019, Atlanta, GA, USA, January 29-31, 2019, pages 181–190. ACM, 2019.

[13] Nick Hynes, D Sculley, and Michael Terry. The data linter: Lightweight, automated sanity checking for ml data sets. NIPS MLSys Workshop, 2017.

[14] Joost Kappelhof. Total Survey Error in Practice, chapter Survey Research and the Quality of Survey Data Among Ethnic Minorities. 2017.

[15] David Lehr and Paul Ohm. Playing with the data: What legal scholars should learn about machine learning. UC Davis Law Review, 51(2):653–717, 2017.

[16] Zachary Lipton, Yu-Xiang Wang, and Alexander Smola. Detecting and correcting for label shift with black box predictors. In International Conference on Machine Learning, pages 3122–3130, 2018.

[17] Samuel Madden, Mourad Ouzzani, Nan Tang, and Michael Stonebraker. Dagger: A data (not code) debugger. CIDR, 2020.

[18] Xiangrui Meng, Joseph Bradley, Burak Yavuz, Evan Sparks, Shivaram Venkataraman, Davies Liu, Jeremy Freeman, DB Tsai, Manish Amde, Sean Owen, et al. Mllib: Machine learning in apache spark. JMLR, 17(1):1235–1241, 2016.

[19] Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, et al. Scikit-learn: Machine learning in python. JMLR, 12:2825–2830, 2011.

[20] Neoklis Polyzotis, Sudip Roy, Steven Euijong Whang, and Martin Zinkevich. Data management challenges in production machine learning. In Proceedings of the 2017 ACM International Conference on Management of Data, pages 1723–1726, 2017.

[21] Neoklis Polyzotis, Sudip Roy, Steven Euijong Whang, and Martin Zinkevich. Data lifecycle challenges in production machine learning: A survey. SIGMOD Record, 47:17–28, 2018.

[22] Neoklis Polyzotis, Steven Whang, Tim Klas Kraska, and Yeounoh Chung. Slice finder: Automated data slicing for model validation. ICDE, 2019.

[23] Fotis Psallidas, Yiwen Zhu, Bojan Karlas, et al. Data science through the looking glass and what we found there, 2019.

[24] Cedric Renggli, Bojan Karlas, Bolin Ding, Feng Liu, Kevin Schawinski, Wentao Wu, and Ce Zhang. Continuous integration of machine learning models: A rigorous yet practical treatment. Proceedings of SysML 2019, 2019.

[25] Tammo Rukat, Dustin Lange, Sebastian Schelter, and Felix Biessmann. Towards automated ml model monitoring: Measure, improve and quantify data quality. 2020.

[26] Sebastian Schelter, Felix Biessmann, Tim Januschowski, David Salinas, Stephan Seufert, Gyuri Szarvas, Manasi Vartak, Samuel Madden, Hui Miao, Amol Deshpande, et al. On challenges in machine learning model management. IEEE Data Eng. Bull., 41(4):5–15, 2018.

[27] Sebastian Schelter, Stefan Grafberger, Philipp Schmidt, Tammo Rukat, Mario Kiessling, Andrey Taptunov, Felix Biessmann, and Dustin Lange. Differential data quality verification on partitioned data. In 2019 IEEE 35th International Conference on Data Engineering (ICDE), pages 1940–1945. IEEE, 2019.

[28] Sebastian Schelter, Yuxuan He, Jatin Khilnani, and Julia Stoyanovich. Fairprep: Promoting data to a first-class citizen in studies on fairness-enhancing interventions. EDBT, 2019.

[29] Sebastian Schelter, Yuxuan He, Jatin Khilnani, and Julia Stoyanovich. Fairprep: Promoting data to a first- class citizen in studies on fairness-enhancing interventions. In Angela Bonifati, Yongluan Zhou, Marcos Antonio Vaz Salles, Alexander Böhm, Dan Olteanu, George H. L. Fletcher, Arijit Khan, and Bin Yang, editors, EDBT, pages 395–398. OpenProceedings.org, 2020.

[30] Sebastian Schelter, Dustin Lange, Philipp Schmidt, Meltem Celikel, Felix Biessmann, and Andreas Grafberger. Automating large-scale data quality verification. Proceedings of the VLDB Endowment, 11(12):1781–1794, 2018.

[31] David Sculley, Gary Holt, Daniel Golovin, Eugene Davydov, Todd Phillips, Dietmar Ebner, Vinay Chaudhary, Michael Young, Jean-Francois Crespo, and Dan Dennison. Hidden technical debt in machine learning systems. In Advances in neural information processing systems, pages 2503–2511, 2015.

[32] David Sculley, Gary Holt, Daniel Golovin, Eugene Davydov, Todd Phillips, Dietmar Ebner, Vinay Chaudhary, Michael Young, Jean-Francois Crespo, and Dan Dennison. Hidden technical debt in machine learning systems. In Advances in neural information processing systems, pages 2503–2511, 2015.

[33] Julia Stoyanovich, Bill Howe, and H. V. Jagadish. Responsible data management. Proc. VLDB Endow., 13(12):3474–3488, 2020.

[34] Manasi Vartak and Samuel Madden. Modeldb: Opportunities and challenges in managing machine learning models. IEEE Data Eng., 41:16–25, 2018.

[35] Ke Yang, Biao Huang, Julia Stoyanovich, and Sebastian Schelter. Fairness-aware instrumentation of preprocessing pipelines for machine learning. HILDA workshop at SIGMOD, 2020.

[36] Matei Zaharia, Andrew Chen, Aaron Davidson, Ali Ghodsi, Sue Ann Hong, Andy Konwinski, Siddharth Murching, Tomas Nykodym, Paul Ogilvie, Mani Parkhe, et al. Accelerating the machine learning lifecycle with mlflow. IEEE Data Eng. Bull., 41(4):39–45, 2018.

[37] Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, Ankur Dave, Justin Ma, Murphy McCauly, Michael J Franklin, Scott Shenker, and Ion Stoica. Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In NSDI, pages 15–28, 2012.

Are Parity-Based Notions of AI Fairness Desirable?

James R. Foulds and Shimei Pan University of Maryland, Baltimore County {jfoulds, shimei}@umbc.edu

Abstract

It is now well understood that artificial intelligence and machine learning systems can potentially exhibit discriminatory behavior. A variety of AI fairness definitions have been proposed which aim to quantify and mitigate bias and fairness issues in these systems. Many of these AI fairness metrics aim to enforce parity in the behavior of an AI system between different demographic groups, yet parity-based metrics are often criticized for a variety of reasons spanning the philosophical to the practical. The question remains: are parity-based metrics valid measures of AI fairness which help to ensure desirable behavior, and if so, when should they be used? We aim to shed light on this question by considering the arguments both for and against parity-based fairness definitions. Based on the discussion we argue that parity-based fairness metrics are reasonable measures of fairness which are beneficial to maintain in at least some contexts, and we provide a set of guidelines on their use.

1 Introduction

As artificial intelligence (AI) and machine learning (ML) systems are now widely deployed to automate decisions with substantial impact on human lives and our society, including bail and sentencing decisions [2], lending [45], and hiring [17], these systems have come under increasing scrutiny. According to the 2019 report from the AI Now Institute at New York University, “tech companies are, in fact, deeply aware that algorithmic discrimination is entrenched in the systems with which they are blanketing the world,” even if these companies have not always successfully addressed it [15]. In response to these concerns, a new and rapidly growing field of research on AI fairness has emerged which aims to study these issues and propose solutions. To date, much of the research on AI fairness, at least that which has arisen from the computer science community, has centered around mathematical definitions or metrics which aim to quantify the degree of fairness or bias in an AI/ML system [21]. We focus here on fairness in classification algorithms in which an individual's predicted class label is used to make a decision that directly impacts that individual. For example, a prediction on whether an individual will repay a loan may determine whether that individual is offered a loan. Thus, if the behavior of the system is unfair or discriminatory it may produce harm to the affected individuals and to the health of our society, due to perpetuating (or even exacerbating) discriminatory patterns in real-world outcomes which are reflected in the training data [4]. Given the predictions made by a classification model, an AI fairness metric produces a score which assesses the extent to which the model's behavior is fair or unfair (e.g. if it exhibits harmful behavior such as a bias toward

Copyright 2020 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. Bulletin of the IEEE Computer Society Technical Committee on Data Engineering

or against a particular demographic group). With a fairness metric in hand, the typical fair AI approach is to modify machine learning algorithms such that the training procedure penalizes solutions according to their degree of unfairness under the metric, or is constrained to produce a solution where the fairness metric is considered satisfactory [21]. Hence, the “fairness” of an AI system is improved compared to the same system without this mitigation procedure, to the extent that the chosen metric successfully encodes what it means for the system to be “fair.” It is important to note that AI fairness is a complex sociotechnical problem which cannot be solved by the purely technical “band-aid” of enforcing a mathematical fairness definition without due consideration of non-technical factors [15]. Nevertheless, mathematical fairness definitions and learning algorithms that enforce them are valuable tools as part of a solution to unwanted discrimination in AI, which ideally would further consider sociotechnical, contextual, historical, legal, and stakeholder-centric perspectives in the design of the system.

Many, perhaps even the majority, of proposed AI fairness metrics and fair learning algorithms aim to ensure that different protected demographic groups, e.g. along lines of gender, race, nationality, sexual orientation, age, social class, or political affiliation, are treated similarly overall by the algorithm. For example, the metrics may enforce that the likelihood of being offered a loan, or being admitted to college, as determined by an AI algorithm, is approximately equal for men and women overall, or for other sensitive demographic groups. The goal may be to ensure near-parity on the distribution of assigned class labels (i.e. outcomes of the decision-making process) per group, as in the aforementioned example, or near-parity on error rates per group (e.g., false negative rates should be similar per demographic). We refer to fairness metrics that encourage parity of class labels/outcomes as parity-based fairness metrics.1 Such parity-based metrics are intuitively appealing to many since they operationalize equality: outcomes (loans, college admissions, etc.) are, on average, distributed equally across groups, and no group is systematically favored or neglected by the procedure.

Parity-based fairness metrics have, however, frequently been criticized in the literature [21, 28, 48]. Perhaps the most enduring critique is that parity-based fairness metrics do not necessarily ensure a meritocracy: it is possible that some “deserving” individuals or groups are not rewarded with favorable outcomes, and some “undeserving” individuals or groups might be rewarded with favorable outcomes [28, 48, 13]. A number of alternative fairness definitions have been proposed, many of which aim to more concretely advance meritocratic ideals, generally at the expense of advancing equality relative to parity-based fairness. Since societal processes are generally inequitable it is not usually possible to simultaneously achieve perfect equity and perfect meritocracy, so a choice must be made. Whether to prioritize equality or meritocracy is a classic left-wing / right-wing political debate, so differing views on the value and validity of parity-based fairness are likely partially explained by differences in opinion along the left-right political spectrum. On the other hand, many additional criticisms of parity-based fairness have been put forth, as we shall discuss in detail.
In the academic literature, the criticisms are usually raised when defending alternative fairness notions. After all, if simple parity were enough to achieve fairness, how could we justify the existence of the field of fairness in AI as a deep and complex area of study? Some authors (and skeptical peer-reviewers) have gone as far as to question its validity. In a blog post, Hardt [27] states:

“To be sure, there is a set of applications for which demographic parity is not unreasonable, but this seems to be a subtle case to make. Any paper adopting demographic parity as a general measure of fairness is fundamentally flawed.” [27]

Is Hardt right? This raises some important questions. Is the widespread practice of parity-based fairness a reasonable and useful approach, at least in some contexts? Do its detractors have valid points that would

1While parity on error rates is clearly a type of fairness involving parity, the terms statistical parity and demographic parity usually refer to parity on outcomes in the AI fairness literature [21]. Parity on error rates is also subject to substantially less criticism and controversy since improving it can be achieved by improving classification performance, so there is not an explicit accuracy-fairness trade-off [28]. We therefore do not include it under the term “parity-based” fairness in our discussion.

preclude the use of parity-based fairness metrics by any reasonable scholar, regardless of political views and value systems? In what cases should we use parity-based fairness metrics instead of alternatives? This paper aims to illuminate these questions by considering the main arguments and counterarguments on both sides of the issue. We will ultimately conclude that parity-based fairness has its place, even if it is not always ideal, and we will provide guidelines on its use.

1.1 Parity-Based Fairness Metrics

To make the discussion concrete we begin by recalling the most well-known fairness metrics, beginning with parity-based approaches. For a more comprehensive summary of AI fairness metrics, we direct the reader to [8] and [51]. Readers from non-technical backgrounds should feel free to skip the mathematical details, which we include for precision but which are not essential to the discussion.

We will use the following notation. Suppose x ∈ X is an individual's data, y ∈ Y is the class label for the corresponding individual, and s ∈ A is a protected attribute corresponding to membership of one or more sensitive demographic groups. A classifier M(x) is a mechanism which takes an individual's data and makes a prediction ŷ on their class. For example, the mechanism M(x) could be a deep learning model for a lending decision, and s could be the applicant's gender and/or race. In many cases of interest, the predicted class label ŷ corresponds to a decision that impacts the individual. For example, predicting that an individual will not repay a loan may correspond to a decision to deny that individual a loan. We will therefore often refer to ŷ as an outcome.

The 80% Rule and the p% Rule: The 80% rule, a.k.a. the four-fifths rule, is a legal guideline for identifying discrimination in the form of adverse or disparate impact in policies and practices on different protected groups, originally defined in the context of employment and personnel decisions [23]. Disparate impact occurs when the same policies or procedures are applied to all individuals (e.g. in AI fairness, the same classifier is applied to everyone), but they impact different protected groups differently on average, even without explicit discrimination based on membership of the protected group. The guidelines state:

A selection rate for any race, sex, or ethnic group which is less than four-fifths (or 80%) of the rate for the group with the highest rate will generally be regarded by the Federal enforcement agencies as evidence of adverse impact, while a greater than four-fifths rate will generally not be regarded by Federal enforcement agencies as evidence of adverse impact [23].

Mathematically, the 80% rule criterion identifies disparate impact in cases where

$\frac{P(M(x) = 1 \mid \text{group } A)}{P(M(x) = 1 \mid \text{group } B)} < 0.8 \qquad (1)$

for a favourable outcome y = 1, disadvantaged group A, and advantaged group B. The 80% rule can be extended to a numeric measurement of the fairness of an algorithm or process, called the p% rule, by dropping the use of a fixed threshold and simply reporting the ratio in Equation 1 [56]:

Definition 1: (p% rule [56])

$p\%(M) = \frac{P(M(x) = 1 \mid \text{group } A)}{P(M(x) = 1 \mid \text{group } B)} \qquad (2)$

Here, if p%(M) is 1 then the demographic groups have an equal probability of being assigned the favorable outcome by the classification model M, and higher scores are better as they correspond to increased parity.

Statistical Parity, a.k.a. Demographic Parity: [21] defined (and criticized, see below) the fairness notion of statistical parity, a.k.a. demographic parity, which requires that the probabilities of each outcome are approximately equal for each group, up to a tolerance level ε.

Definition 2: (Statistical parity [21]) A mechanism M(x) satisfies statistical parity between demographic groups A up to bias ε if

$D_{tv}\big(p(Y \mid s_i),\, p(Y \mid s_j)\big) \leq \epsilon \quad \forall (s_i, s_j) \in A \times A \qquad (3)$

where $D_{tv}(p, q) = \frac{1}{2} \sum_{y \in Y} |p(y) - q(y)|$ is the total variation distance between distributions p and q. Here, a smaller value of ε corresponds to better fairness. Note that both p% and statistical parity encourage probabilities of outcomes to be similar between groups, with similarity measured via ratios in p% and via differences in statistical parity. The p% metric assumes a binary class label and focuses on similar probabilities for a single favorable outcome y = 1 (which would automatically promote having similar probabilities for the other binary outcome which is one minus its probability), while statistical parity aims for similar probabilities for each of two or more possible outcomes.

Intersectional and Subgroup Fairness Metrics: Fairness metrics have been developed which aim to ensure parity for the subgroups of the protected groups which occur at the intersection of those groups, e.g. Black women. Such definitions include differential fairness [25] and subgroup fairness [35].
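To make these definitions concrete, the following minimal sketch computes the p% rule and the statistical parity gap for a classifier's predictions stored in a pandas dataframe; the column and group names are illustrative:

import pandas as pd

def p_percent_rule(df, group_col, pred_col, group_a, group_b):
    # Ratio of favorable-outcome rates between a disadvantaged and an advantaged group (Equation 2).
    rate_a = (df.loc[df[group_col] == group_a, pred_col] == 1).mean()
    rate_b = (df.loc[df[group_col] == group_b, pred_col] == 1).mean()
    return rate_a / rate_b

def statistical_parity_gap(df, group_col, pred_col):
    # Largest total variation distance between per-group outcome distributions (Equation 3).
    dists = df.groupby(group_col)[pred_col].value_counts(normalize=True).unstack(fill_value=0)
    gaps = [0.5 * (dists.loc[a] - dists.loc[b]).abs().sum()
            for a in dists.index for b in dists.index]
    return max(gaps)

# e.g., p_percent_rule(preds, 'gender', 'prediction', 'female', 'male')
# e.g., statistical_parity_gap(preds, 'gender', 'prediction')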

1.2 Non-Parity Based Fairness Metrics

Numerous fairness metrics which do not focus on parity of outcomes have also been proposed, which typically are argued to satisfy desirable properties that parity-based fairness metrics and other competing definitions do not achieve. While a complete discussion is beyond the scope of this paper, we consider the most well-known metrics which epitomize their corresponding category of fairness measures.

Fairness Through Unawareness: One simple strategy to address unfairness is to remove the protected attributes from the classifier, i.e. s is not included as a feature in x. This approach, called fairness through unawareness [21] or anti-classification [13], prevents disparate treatment [56], the use of different criteria for different protected groups, which could result in explicit discrimination based on an individual's protected category and could have legal implications, e.g. in employment contexts under Title VII. It does not however prevent disparate impact (i.e. disparities in the outcomes), since other attributes may be “proxy variables” which are correlated with the protected attribute [4]. For instance, zip code is frequently highly correlated with race in the United States due to historical segregation policies. Fairness through unawareness is not itself a metric, since it does not produce a score, but its limitations motivate both parity-based fairness metrics and the metrics below.

Equalized Odds and Equality of Opportunity: Motivated by claimed limitations of demographic parity (discussed below), [28] proposed to instead ensure that a classifier has equal error rates for each protected group. This fairness definition, called equalized odds, can loosely be understood as a notion of “demographic parity for error rates instead of outcomes.” Unlike demographic parity, equalized odds rewards accurate classification, and penalizes systems only performing well on the majority group. [28] also propose a variant definition called equality of opportunity, which relaxes equalized odds to only apply to a “deserving” outcome (i.e. y = 1).

Individual Fairness: The individual fairness definition, due to [21], mathematically enforces the principle that similar individuals should get similar outcomes under a classification algorithm. An advantage of this approach is that it preserves the privacy of the individuals, which can be important when the user of the classifications (the vendor), e.g. a banking corporation, cannot be trusted to act in a fair manner.

Causal Fairness: Causal approaches to AI fairness metrics aim to use causal inference to ensure that protected attributes are not causes of predicted class labels, as opposed to merely being correlated with them. For example,

under the counterfactual fairness definition [39], intervening to change the protected attributes A, while holding everything that is not causally dependent on A constant, should not change the predicted distribution of outcomes.

Threshold Tests: [48] aim to measure discriminatory bias by modeling risk scores r(x) = p(y = 0|x) and latent thresholds which are used to classify based on the risk, and requiring algorithms or policies to threshold risk scores at the same points for different protected groups when determining outcomes.

2 Arguments for Parity-Based Fairness Metrics

We will next consider arguments from each side of the debate, beginning with arguments that support the use of parity-based fairness metrics.

2.1 Equality as a Social Good

While much ink has been spilled in the AI fairness literature regarding the limitations of parity-based fairness, relatively few authors have explicitly defended it, despite widespread use. In the U.S. Declaration of Independence, Thomas Jefferson famously wrote that it is self-evident that “all men are created equal” [33]. Perhaps, like Jefferson, its proponents find the value of equality to be self-evident. The simplest argument for parity-based fairness is often left unstated: equality is seen by many as having intrinsic value as a fair state of our society. Equality is particularly, but not exclusively, prioritized by the political left, who advocate for policies such as affirmative action which aim to level the playing field for historically marginalized groups [20]. If this does not resonate with you, consider the following thought experiment. You, a parent, have two similarly behaved children and one toy. Would it be more fair to give the toy to the slightly better behaved child, or to ask them both to take turns and share the toy equally? If only one child were given the toy, would the other child view the situation as fair?

In an AI context, to unpack the goals and impacts of parity-based fairness it is important to distinguish between the statistical task of predicting an unknown class label y, such as whether an individual will repay a loan, and the economic task of making a decision which impacts an individual based on the prediction ŷ, such as awarding a loan to that individual [13]. The former is concerned only with accurate prediction, but in the name of equity the latter may arguably deviate from the labels, even when perfectly predicted [25]. Whenever the assignment of class labels corresponds to the allocation of resources (e.g. a limited number of admissions to a prestigious program of study), improving metrics regarding the parity of outcomes can help to reduce social or economic inequality. The statistical/economic distinction is especially important when the class (repay loan) is chosen merely as a convenient measurable proxy for a good decision (award loan), even though making a good decision, which may involve further nuances than given by the proxy class label and/or population-level impacts such as fairness, is the true goal of the system.

2.2 Mitigating Unfair Disparities and Negative Stereotypes

Disparate behavior of an AI system is frequently caused by unfair occurrences, which can be mitigated by enforcing parity-based metrics.

2.2.1 Mitigating Societal Bias

Inequality in society is often due to unfair societal processes [16, 11]. This impacts data, so it can be important to rectify the observed unfair inequality when performing algorithmic decision-making [4]. If society is unfair, how do we fairly respond? President Lyndon B. Johnson considered this problem in his commencement address at Howard University in 1965, titled “To Fulfill These Rights” [34]. [20] summarizes President Johnson's remarks as follows:

“Imagine a hundred-yard dash in which one of the two runners has his legs shackled together. He has progressed ten yards, while the unshackled runner has gone fifty yards. At that point the judges decide that the race is unfair. How do they rectify the situation? Do they merely remove the shackles and allow the race to proceed? Then they could say that “equal opportunity” now prevailed. But one of the runners would still be forty yards ahead of the other. Would it not be the better part of justice to allow the previously shackled runner to make up the forty-yard gap, or to start the race all over again? That would be affirmative action toward equality.” [20], paraphrasing [34].

In President Johnson's metaphor, the shackles of course stand for unfair societal disadvantages. In the humanities and legal literature, such unfair processes have been studied under the framework of intersectionality. The term intersectionality was introduced by Kimberlé Crenshaw in the 1980s [16] and popularized in the 1990s, e.g. by Patricia Hill Collins [11], although the ideas are much older [12, 49]. In its general form, intersectionality emphasizes that systems of oppression built into society lead to systematic disadvantages along intersecting dimensions, which include not only gender, but also race, nationality, sexual orientation, disability status, and socioeconomic class [12, 11, 16, 30, 41, 49]. Examples of systems of oppression include racism, sexism, the prison-industrial complex, mass incarceration, and the school-to-prison pipeline [18, 1]. In the context of AI and fairness, intersectionality has been considered by e.g. [10, 46, 25, 24].

As an example of a scenario affected by unfair processes, consider the task of predicting prospective students' academic performance for use in college admissions decisions. As discussed in detail by [52], and references therein, individuals belonging to marginalized and non-majority groups are disproportionately impacted by challenges of poverty and racism (in its structural, overt, and covert forms), including chronic stress, access to healthcare, under-treatment of mental illness, micro-aggressions, stereotype threat, disidentification with academics, and belongingness uncertainty. Similarly, LGBT and especially transgender, non-binary, and gender non-conforming students disproportionately suffer bullying, discrimination, self-harm, and the burden of concealing their identities. These challenges are often further magnified at the intersection of affected groups. A survey of 6,450 transgender and gender non-conforming individuals found that the most serious discrimination was experienced by people of color, especially Black respondents [26]. Verschelden explains the impact of these challenges as a tax on the “cognitive bandwidth” of non-majority students, which in turn affects their academic performance. She states that the evidence is clear “...that racism (and classism, homophobia, etc.) has made people physically, mentally, and spiritually ill and dampened their chance at a fair shot at higher education (and at life and living).” [52]

A classifier trained to predict students’ academic performance from historical data hence aims to emulate outcomes that were substantially affected by unfair factors [4]. An accurate predictor for a student’s GPA may therefore not correspond to a fair decision-making procedure for student admissions.

2.2.2 Mitigating Bias from Subjective Annotation

Classification algorithms require labeled data to train on, and in many applications we must acquire labels through human annotation, which is often subjective, imperfect, and subject to the prejudice of the annotators [4]. Consider for example the challenges of annotating whether a resume is a good match with a job ad [19], whether a prospective employee should be hired [17], what emotion is being expressed in an image of a person's face [5], whether a social media post is offensive [50], whether an incarcerated individual is at high risk of committing another crime [2], or whether a prospective student should be admitted to a program of study [42]. In these scenarios the answers are not clear-cut, and so implicit bias can potentially impact the labeling process which determines the “ground truth” that the model aims to predict. An extreme case occurred for the latter example when St George's Hospital Medical School developed a computer program for initial screening of applicants to the school in the late 1970s to early 1980s. The system aimed to replicate the decision-making processes of the

selection panel, and was adjusted until it had more than 90% correlation with the panel's scores. In imitating the panel's behavior, the system was purposely designed to reduce the chance of an interview for candidates who were women or racial minorities [42]. While the potential for annotation bias does not imply that perfect parity of outcomes must be sought, it illustrates how unfair disparities can arise in the training data, which in turn suggests that methods which improve parity-based fairness metrics may be valuable tools for combating these unfair disparities.

2.2.3 Mitigating Harms of Representation

In addition to causing economic and financial harm, AI systems that exhibit biased behavior may encode negative stereotypes, which can be hurtful, offensive, or reinforce low self-esteem in historically marginalized groups, as well as reinforcing prejudice against those groups. For example, [46] studied how Google search results reflect negative stereotypes. She discussed how searching for keywords related to marginalized groups such as “Black girls” returned mainly pornographic results, thereby perpetuating racist, sexist, and dehumanizing stereotypes. Similarly, searching for “three black teenagers” returned mugshots, insinuating criminality, while searching for “three white teenagers” returned wholesome images. Noble states: “What we need to ask is why and how we get these stereotypes in the first place and what the attendant consequences of racial and gender stereotyping do in terms of public harm for people who are the targets of such misrepresentation.” [46]

Such harms of representation arise when the behavior of an AI system appears to attribute lower value or negative properties to groups or individuals. This is especially hurtful when the misrepresentation due to the AI corresponds to the perpetuation of hateful views that are currently or historically advanced in racist, sexist, ableist, anti-semitic or other discriminatory language with deliberately hurtful intent [46]. Harms of representation are problematic even when the outcomes of the AI system are not themselves consequential (e.g. if their consequences were limited to within a video game). Similarly, in our own research we found that an AI-based career counseling system trained on social media data disproportionately recommends careers in computer science and executive/managerial positions to boys and homemaking, nursing, and customer service to girls [32]. This behavior perpetuates stereotypes that could impact young people’s perceptions of their own capabilities and potential when given recommendations by the algorithm. We applied a parity-based fairness intervention which reduced the extent to which the system’s results reflected gender stereotypes, addressing the harmful portrayal of gender groups by the algorithm, not to mention the potential financial impacts due to directing women toward less lucrative careers.

2.3 Practical Benefits

Beyond mitigating unfair disparities and negative stereotypes, improving parity in AI systems has additional practical benefits that pertain to broad classes of applications, including the advantages of diversity and reduction of potential legal liability.

2.3.1 Increasing Diversity

In many AI applications a classification algorithm selects or impacts the set of individuals to be (potentially) included in an organization or program, e.g. in automated hiring decisions, college admissions, allocation of healthcare resources such as Medicaid services, and career recommendation (which could impact career choices). In these scenarios, improving parity-based fairness metrics corresponds to an increase in the diversity of the pool of selected individuals in terms of the representation of the protected groups, which is generally accomplished by increasing the participation of underrepresented groups who have been historically marginalized. Diversity is a

crucial first step toward inclusion, in which an organization maintains a culture in which members of diverse groups are given respect, fair treatment, and a voice in decision-making processes [7]. As well as the intrinsic and ethical importance of diversity and inclusion, diversity can potentially lead to improvement in organizational effectiveness and competitiveness [47] including impacts on cost, resource acquisition, marketing, creativity, problem solving, and system flexibility [14]. Standpoint theory, which emphasizes the shared experiences of members of historically marginalized groups such as women, Black women, and the working class, suggests that the different perspectives and situated knowledge of members of historically marginalized groups as outsiders-within an institution or organization controlled by the dominant group allow them to critique it in a way that the dominant group cannot [29, 11]. Hence, increased representation of historically marginalized groups within an organization can potentially help it to break free from stagnant thought patterns and groupthink. Diversity can also potentially lead to benefits to the economy. A brief from the Economics and Statistics Administration in the U.S. Department of Commerce argues that a lack of gender diversity in STEM is a missed opportunity to expand STEM employment and thereby increase the nation's innovative capacity and global competitiveness, as well as reducing the gender wage-gap [6].

2.3.2 Reducing Legal Liability

The 80% rule [23] is used as a legal criterion for demonstrating disparate impact under several anti-discrimination laws in the United States, primarily regarding employment decisions, including Title VII of the Civil Rights Act of 1964, the Americans with Disabilities Act (ADA), and the Age Discrimination in Employment Act. For example, Title VII prohibits employment discrimination on the basis of race, color, religion, sex, or national origin. A finding of adverse impact under the criterion shifts the burden of proof to the organization to defend its employment practices against alleged discrimination. Leaving aside philosophical debate on what it means to be “fair,” ensuring a sufficient degree of parity may be important to avoid legal liability under these laws, providing a practical motivation for parity-based fairness interventions in AI systems in the context of any employment decision such as hiring or promotion. The disparate impact legal criterion potentially also extends to other spheres such as housing under the Fair Housing Act of 1968.

2.4 Summary

Parity-based fairness metrics aim to operationalize and measure the degree of equality in the assignment of potentially consequential outcomes to protected groups. Equality has intrinsic value as a desirable property for a fair and healthy society. Unfair disparities and harmful stereotypes encoded in data used to train machine learning algorithms frequently arise from unfair processes, both in society and during data preparation, and parity-based fairness interventions can help mitigate them. Ensuring parity-based fairness has additional practical benefits across a broad range of use cases including increasing diversity and avoiding legal liability for organizations.

3 Arguments Against Parity-Based Notions of Fairness

The arguments from the previous section support the view that improving parity-based fairness is beneficial in many contexts, all other things being equal. A number of criticisms have been put forth as well. Some of the criticisms relate to what parity does, i.e. potential harmful effects, and others relate to what parity does not do that we might otherwise desire in a fairness metric. We discuss these concerns below.

3.1 Parity Potentially Harms Accuracy / Utility

A common criticism of parity-based fairness is that since the true class label distribution may not satisfy parity, ensuring parity would typically require a mismatch between the predictions and the data which would harm the

prediction accuracy, and hence the economic utility of the system. Example quotes from the literature include:2

“Demographic parity often cripples the utility that we might hope to achieve, especially in the common scenario in which an outcome to be predicated, e.g. whether the loan will be defaulted, is correlated with the protected attribute.” [28]

“These mechanisms pay a significant cost in terms of the accuracy (or utility) of their predictions. In fact, there exist some inherent tradeoffs (both theoretical and empirical) between achieving high prediction accuracy and satisfying treatment and / or impact parity.” [57]

"The thresholds required to optimally satisfy these classification parity constraints will typically differ from the optimal thresholds for any community. Thus, requiring classification parity (or even approximate parity) can hurt majority and minority groups alike." [13]

3.2 Parity Does not Ensure a Meritocracy

The goal of parity-based fairness is to uphold the ideal of equality, in terms of the assignment of outcomes across protected groups. It does not explicitly aim to uphold a competing ideal of fairness, meritocracy, in which the "most deserving" individuals are assigned the greatest rewards, although mitigating discrimination and unfair disadvantages could perhaps be considered meritocratic [34]. Some authors have further argued that enforcing parity can be in conflict with the goal of a meritocracy.

3.2.1 Infra-Marginality

Arguments against parity-based fairness via the concept of infra-marginality are emblematic of a meritocratic-centered view on fairness in AI and elsewhere. The term was coined by [3] in the context of detecting discrimination in police practices, and was considered in the context of AI fairness by [13] and implemented in a statistical modeling approach by [48]. This view of fairness adopts a framework in which features x determine idealized risk scores r(x) = P(y = 0|x), which correspond to the probability that an individual has a negative class label, e.g. that they will commit a crime in the future (or alternatively, "qualification" or "merit" scores corresponding to the probability of a positive label y = 1, such as graduating from a particular university). It is assumed that risk scores encapsulate how "(un)deserving," "(un)qualified" or "(in)capable" an individual is. Hence, estimated risk scores s(x) are ideally thresholded in order to assign beneficial labels to more "qualified" individuals, or detrimental outcomes to more "risky" individuals. Classifying via a fixed risk threshold (regardless of demographic group) would correspond to some notion of equal treatment, hence avoiding disparate treatment or (it is argued) unjust disparate impact, although "justified" disparate impact (i.e. "justified" disparities in outcomes between groups) may still occur [13]. Such a classifier would implement a kind of meritocracy, since classifications are made based solely on risk, which is meant to correspond to a measure of deservedness of a particular outcome.

The term "infra-marginal" means "away from the threshold" and usually refers to averages over risk scores for a particular demographic, e.g. average outcome probabilities or error rates. According to [13, 48], the "infra-marginality problem" with statistical tests for discrimination or AI bias that are computed from per-demographic averages is that since the distribution of x differs per demographic, the distribution of (idealized, ground-truth) risk scores r(x) will typically also differ. Hence, the quantities used to compute parity-based fairness metrics and other similar fairness metrics, which are calculated from averages over risk scores, including ground-truth outcome probabilities P(y|s), predicted outcome probabilities P(M(x)|s), and error rates P(M(x)|s, y), are expected to differ per group. According to [13] and [48]:

"It is hard to determine whether differences in error rates are due to discrimination or to differences in the risk distributions." [13]

2 For presentation purposes, when quotes start mid-sentence we have modified them to capitalize the first letter.

"Even absent discrimination, the repayment rates for minority and white loan recipients might differ if the two groups have different risk distributions." [48]

If a disparity in outcome probabilities is due to a difference in risk distributions between demographics, and those risk distributions are presumed fair, legitimate, and well estimated, then reducing the disparity via a fairness intervention would give an unearned advantage or disadvantage to members of different groups, making the system's behavior less meritocratic.
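As a concrete, hypothetical illustration of the infra-marginality argument above, the following sketch (our own construction, with invented risk distributions) simulates two groups whose idealized risk distributions differ and applies a single shared risk threshold to both. Every individual is treated identically, yet group-level positive rates and error rates differ, which is exactly the situation the quoted authors describe.

```python
import random

random.seed(0)

def simulate_group(n, mean_risk):
    """Draw idealized risk scores r(x) near `mean_risk`, and outcomes y ~ Bernoulli(1 - r)."""
    rows = []
    for _ in range(n):
        r = min(max(random.gauss(mean_risk, 0.15), 0.0), 1.0)  # risk of the negative class
        y = 1 if random.random() > r else 0                    # y = 1 is the favorable label
        rows.append((r, y))
    return rows

def group_stats(rows, threshold=0.5):
    """Positive-prediction rate and false negative rate under one shared risk threshold."""
    preds = [(1 if r < threshold else 0, y) for r, y in rows]
    pos_rate = sum(p for p, _ in preds) / len(preds)
    fn = sum(1 for p, y in preds if y == 1 and p == 0)
    fnr = fn / max(1, sum(1 for _, y in preds if y == 1))
    return pos_rate, fnr

# Hypothetical groups with different risk distributions (the contested assumption).
for name, mean in [("group_a", 0.35), ("group_b", 0.55)]:
    pos_rate, fnr = group_stats(simulate_group(10_000, mean))
    print(f"{name}: positive rate={pos_rate:.2f}, false negative rate={fnr:.2f}")
```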

3.2.2 Disparities can be Due to Confounding

Confounding variables, extraneous variables that affect the relationship between two other variables (here, protected demographic groups and outcomes), are another possible source of disparity besides discrimination. This possibility motivates the use of causal inference methods to determine unfair bias [39, 44].

"Any approach that relies on associative measures of [discrimination] will be led astray due to failing to properly model sources of confounding for the relationship of the sensitive feature and the outcome." [44]

In a classic real-world case of confounding in measuring discrimination studied by [9], UC Berkeley’s graduate admissions process admitted men at a higher rate than women in the Fall 1973 quarter. On closer examination, it was found that more than half of the departments admitted women at a higher rate than men. This apparent paradox, an instance of a phenomenon called Simpson’s paradox, is explained by the fact that women were more likely to apply to more selective departments. In this case, the choice of department to apply to is the confounding variable. Since the disparity in admissions was caused by confounding rather than discriminatory bias, a parity-based fairness intervention might introduce a new bias that makes the procedure less meritocratic than the status quo due to giving an unmerited advantage to one demographic.
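The following toy numbers (invented for illustration, not the actual Berkeley figures) reproduce the structure of the paradox: each department admits women at an equal or higher rate, yet the aggregate admission rate for women is lower because more women apply to the selective department.

```python
# Hypothetical (applicants, admits) counts per gender for two departments.
data = {
    "dept_easy":      {"men": (800, 480), "women": (200, 130)},   # 60% vs 65% admitted
    "dept_selective": {"men": (200, 40),  "women": (800, 180)},   # 20% vs 22.5% admitted
}

def rate(applicants, admits):
    return admits / applicants

for dept, groups in data.items():
    print(dept, {g: round(rate(*counts), 3) for g, counts in groups.items()})

totals = {g: [sum(d[g][i] for d in data.values()) for i in (0, 1)] for g in ("men", "women")}
print("aggregate", {g: round(rate(*t), 3) for g, t in totals.items()})
# Within each department women are admitted at an equal or higher rate, yet the
# aggregate rates are 0.52 for men and 0.31 for women, because more women applied
# to the selective department: the choice of department is the confounder.
```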

3.3 Parity does not Prevent Certain Harms or Abuses, or Provide Certain Benefits

There are many possible desiderata for fairness, such as preventing certain conceivable harms or conferring certain benefits, which parity-based fairness does not accomplish. Naturally, this is also true of any other possible fairness metric. Here we give several of the most well-known examples.

3.3.1 Subset Targeting, a.k.a. Fairness Gerrymandering

Parity-based fairness metrics only protect parity for the groups that are defined, and not subgroups within them. Subset targeting, also known as fairness gerrymandering, occurs when subgroups of a protected group are treated in an unfair way while preserving parity for the top-level groups [21, 35].

"Imagine a setting with two binary features, corresponding to race (say black and white) and gender (say male and female), both of which are distributed independently and uniformly at random in a population. Consider a classifier that labels an example positive if and only if it corresponds to a black man, or a white woman. Then the classifier will appear to be equitable when one considers either protected attribute alone, in the sense that it labels both men and women as positive 50% of the time, and labels both black and white individuals as positive 50% of the time. But if one looks at any conjunction of the two attributes (such as black women), then it is apparent that the classifier maximally violates the statistical parity fairness constraint." [35]
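The construction in the quote above is easy to verify directly. The sketch below (our own illustration, not code from [35]) builds the described classifier and audits positive rates both for each top-level group and for each intersectional subgroup; marginal parity holds while subgroup rates are 0% or 100%.

```python
from itertools import product

# Population with two binary protected attributes, uniformly distributed.
people = [{"race": r, "gender": g}
          for r, g in product(["black", "white"], ["man", "woman"])
          for _ in range(250)]

def classify(p):
    """The adversarial classifier from the quote: positive iff black man or white woman."""
    return int((p["race"], p["gender"]) in {("black", "man"), ("white", "woman")})

def positive_rate(group):
    return sum(classify(p) for p in group) / len(group)

# Marginal audit: parity appears to hold (every rate is 0.50).
for attr in ("race", "gender"):
    for value in {p[attr] for p in people}:
        subset = [p for p in people if p[attr] == value]
        print(f"{attr}={value}: {positive_rate(subset):.2f}")

# Intersectional audit: parity is maximally violated (rates are 1.00 or 0.00).
for r, g in product(["black", "white"], ["man", "woman"]):
    subset = [p for p in people if p["race"] == r and p["gender"] == g]
    print(f"{r} {g}: {positive_rate(subset):.2f}")
```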

3.3.2 Tokenism and the Self-Fulfilling Prophecy

Parity-based fairness leaves a lot of leeway regarding the nature of the chosen solution, which could potentially be abused by a malicious actor. The potential for deliberate harm is even greater when one considers the broader socio-technical setting in which the AI system resides. For example, it is easy to construct scenarios in which parity is satisfied by a classifier that behaves in a highly unmeritocratic way:

"The notion permits that we accept the qualified applicants in one demographic, but random individuals in another, so long as the percentages of acceptance match. This behavior can arise naturally, when there is little or no training data available for one of the demographics." [28]

This scenario is essentially tokenism, the selection of unqualified members of under-represented groups as lip-service to diversity and inclusion, which creates its own problems [54], such as the following: “Because demographic parity leads to affirmative action, it leads to recurrent criticism. Such criticism can contribute to undermining the reputation of the minority group in the long term.” [40]
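The scenario described in the quote from [28] above can likewise be made concrete. In the hypothetical sketch below (entirely our own, with invented numbers), group A's acceptances all go to qualified applicants while group B's are assigned at random, yet the acceptance rates, and hence demographic parity, are identical; only a per-group look at who was accepted reveals the problem.

```python
import random

random.seed(1)

def make_group(n, qualified_fraction):
    return [{"qualified": i < int(n * qualified_fraction)} for i in range(n)]

group_a = make_group(1000, 0.5)
group_b = make_group(1000, 0.5)

accept_rate = 0.3
# Group A: accept only qualified applicants, up to the quota.
accepted_a = [p for p in group_a if p["qualified"]][: int(accept_rate * len(group_a))]
# Group B: accept the same number of applicants, chosen at random regardless of qualification.
accepted_b = random.sample(group_b, int(accept_rate * len(group_b)))

print("acceptance rates:", len(accepted_a) / len(group_a), len(accepted_b) / len(group_b))
print("fraction of accepted who are qualified:",
      sum(p["qualified"] for p in accepted_a) / len(accepted_a),
      sum(p["qualified"] for p in accepted_b) / len(accepted_b))
# Parity in acceptance rates is satisfied (0.3 vs 0.3), but group B's accepted
# applicants are only about 50% qualified: the tokenism failure mode described above.
```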

[21] identify a related and even more harmful potential abuse which they call the self-fulfilling prophecy.

“This is when unqualified members of [protected group] S are chosen, in order to justify future discrimination against S (building a case that there is no point in “wasting” resources on S). Although senseless, this is an example of something pernicious that is not ruled out by statistical parity, showing the weakness of this notion.” [21]

3.3.3 Parity could Disenfranchise Higher-Performing Marginalized Communities, or Otherwise Increase Negative Outcomes

[38] satirically consider a hypothetical situation in which an AI algorithm determines which individuals are assigned a negative outcome, in this case selecting older adults to be "mulched" into an edible slurry to be consumed by the rest of the population. In their thought experiment, the AI disproportionately assigns white cisgender men as unworthy individuals who should be "mulched." After a fairness intervention, the algorithm increases the mulching probability for women, transgender, and non-binary people, thus making the situation worse for vulnerable populations in order to achieve parity. This example shows how parity-based interventions could theoretically have the opposite of the desired effect regarding uplifting marginalized communities. [38]'s broader point is that the fairness, accountability, and transparency (FAT) framework is not always sufficient, and that there are cases when the prospect of using fairness interventions might cause us to overlook the question of whether the AI system should be created at all.

3.3.4 Association with Affirmative Action; Some Disadvantaged Groups Left Behind

When used to mitigate societal bias, parity-based fairness interventions correspond to a type of affirmative action [40]. In the United States, affirmative action policies have received a backlash from the political right and from groups who have felt left behind by these policies. [20] explains these views in terms of President Johnson's shackled runner metaphor [34], which we discussed in Section 2.2.1:

"If the runner forty yards back is allowed to make up the difference, which seems only fair, what about runners who faced unfair but less imposing burdens and are, say, only ten or twenty yards back? Does fairness not demand that they, too, be moved up? The latter group included the sons and daughters of the white working class [...]. If society was so concerned about making the race fair, such whites asked, why was it not making life more fair for them, too?"

Critics of affirmative action are likely to view parity-based fairness with skepticism [40]. Framing this concern as a technical one, parity-based fairness definitions usually aim to reduce the disparity between the most advantaged and most disadvantaged groups, and so they may not provide a benefit to (or could even potentially harm) groups that are disadvantaged but are not the most disadvantaged group.

3.3.5 Parity Does not Provide Individual-Level Guarantees

Parity-based fairness does not provide any strong guarantees about what happens to any particular individual, since other individuals could be assigned to the favorable class in order to achieve parity without them.

"Statistical definitions of fairness provide only very weak promises to individuals, and so do not have very strong semantics. [...] The meaning of this guarantee to an individual is limited, because the word rate refers to an average over the population." [36]

3.4 Summary

By enforcing parity-based fairness, we are potentially altering the behavior of the algorithm to be different from the label distribution in the training set. This can reduce the accuracy of the predictions, and if the disparities exist for legitimate reasons (e.g. infra-marginality, confounding) it could make the predictions less meritocratic. It does not guarantee the prevention of certain abuses, such as gerrymandering the labels within a group or selecting unqualified individuals from marginalized groups in order to justify future prejudice, and it does not provide the benefits specific to other fairness approaches, such as individual-level guarantees.

4 Response and Discussion

In this paper we consider the question of whether parity-based fairness is reasonable and useful in at least some important contexts, and if so, when it should be used. We have seen in Section 2 that parity-based fairness has a number of benefits in contexts where equality is considered desirable, as it can mitigate disparities that arise from unfair processes, and it has practical value in an organizational context for increasing diversity and reducing legal liability. Supposing we accept these arguments, we can conclude that parity-based fairness is potentially useful in the contexts where parity is desirable. Given that parity-based fairness also has several limitations and downsides, as discussed in Section 3, some of them potentially serious, we must still ask whether it is reasonable. Is it "fundamentally flawed," as per Hardt [27], or are the downsides manageable, and do the pros outweigh the cons enough to make it worthwhile, at least in some applications? To investigate this we will discuss possible counter-arguments and mitigation strategies for each of the different types of criticism. We will also discuss who has the burden of proof on whether the concerns hold in a particular application. We will then present our findings from the discussion.

4.1 Responses to Criticisms of Parity-Based Fairness

We consider the different types of criticisms of parity-based fairness in turn.

4.1.1 Response: Parity Potentially Harms Accuracy Metrics/Utility

It is true that enforcing perfect or near-perfect parity will typically harm accuracy, perhaps substantially. This is actually by design. Generally speaking, when imposing a strict parity requirement, the implicit assumption is that the disparity is due primarily to unfair processes, including bias in society or in the data collection or data annotation processes. In other words, we assume that the dataset encodes unfair patterns. The unfairness may, for example, manifest in the class labels, the features, or in the representativeness of the included instances. A perfectly accurate classifier, in its imitation of the classifications in the dataset, would perpetuate the unfairness, which is not what we want. In cases where we are not willing to pay a steep price in accuracy, a strict parity requirement is not appropriate.

On the other hand, if we impose a softer constraint or penalty on disparity we can still achieve a sensible balance between parity-based fairness and accuracy, and in many cases we can substantially improve parity with little cost in accuracy. Another relatively conservative approach is to set the "target" level of parity-based fairness to be as good as the fairness in the training data, but not necessarily better [25]. This prevents "bias amplification" due to overfitting [59], but does not aim to correct inequality in the training data, hence it is not expected to harm performance.

In the authors' own work, we typically apply the following strategy for selecting fairness and performance trade-off hyper-parameters, both with parity-based fairness and other metrics [25]. First, we calculate the accuracy (or other performance metric) of the model on the validation set without a fairness intervention. We then use a grid search to tune the fairness intervention's hyper-parameters (such as the weight of the fairness penalty), such that the fairness metric is as good as possible while the validation accuracy is restricted to be within n% (say 95%) of the original classifier's performance. Thus, we obtain as much fairness improvement as possible, with a limited reduction in accuracy over which we are in complete control. Using this strategy, we have even seen cases where the fair model's test-set accuracy is slightly improved over the typical model, due to the regularization effect of the fairness penalty, which reduced overfitting [37].
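A minimal sketch of the hyper-parameter selection strategy described above is given below. It assumes generic `train_model`, `accuracy`, and `disparity` callables that the reader would supply for their own model and data; none of these names come from a specific library, and the grid of weights is arbitrary.

```python
def select_fairness_weight(train_model, accuracy, disparity, val_data,
                           weights=(0.0, 0.1, 0.3, 1.0, 3.0, 10.0),
                           accuracy_fraction=0.95):
    """Grid search over the fairness penalty weight.

    Picks the weight giving the best (lowest) disparity on the validation set,
    subject to validation accuracy staying within `accuracy_fraction` of the
    unconstrained model's accuracy.
    """
    baseline = accuracy(train_model(fairness_weight=0.0), val_data)
    floor = accuracy_fraction * baseline

    best_weight, best_disparity = 0.0, float("inf")
    for w in weights:
        model = train_model(fairness_weight=w)
        if accuracy(model, val_data) < floor:
            continue  # too much accuracy lost; reject this setting
        d = disparity(model, val_data)
        if d < best_disparity:
            best_weight, best_disparity = w, d
    return best_weight
```

The same skeleton covers other performance metrics: simply pass a different evaluation function and, if higher disparity values are better under the chosen metric, flip the comparison.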

4.1.2 Response: Parity Does not Ensure a Meritocracy

While parity-based fairness does not explicitly aim to ensure meritocratic ideals, it aims to mitigate disparities which are assumed to be unfair. If this assumption is correct, the resulting classifier will be more meritocratic than it was without the fairness intervention, as unfair advantages and disadvantages will have been accounted for, and the playing field leveled (cf. President Johnson's running race metaphor). If the assumption is incorrect, the resulting classifier will be less meritocratic, since it will introduce its own bias on the predictions. Whether disparities are fair or unfair is often very difficult to assess, since systemic oppression occurs in myriad subtle ways that are typically not directly recorded. In scenarios where the extent and impact of systemic oppression is unknown, our individual beliefs on its role are heavily associated with political ideology. The political left tends to view disparities as being due to unfair processes, cf. intersectional feminism [16], while the political right tends to view society as already being a fair, level playing field which admits legitimate disparities [20].

As an example of the latter case, consider the infra-marginality criticism of parity-based fairness [13, 48]. This criticism implicitly assumes that (1) idealized ground-truth risk scores r(x) = P(y = 0|x) encapsulate individuals' "deservedness" of an outcome, since it assumes that consistently thresholding these scores corresponds to a fair algorithm or policy. The criticism also assumes that (2) risk distributions differ between protected groups for legitimate reasons. Given these assumptions, it is logical to infer that the outcome probability for a protected group, P(y|s), can and should differ from that of another group s', as this follows straightforwardly from the sum rule of probability:

\[
P(y = 0 \mid s) = \int P(y = 0, x \mid s)\, dx = \int P(y = 0 \mid x, s)\, P(x \mid s)\, dx = \int P(y = 0 \mid x)\, P(x \mid s)\, dx
= \int r(x)\, P(x \mid s)\, dx \neq \int r(x)\, P(x \mid s')\, dx , \tag{4}
\]

where we have assumed that outcomes are conditionally independent of the protected attribute given the features x.

Both assumptions (1) and (2) are up for debate. Intersectionality theory provides critiques of each of them [16, 11]. Regarding (1), unfair societal bias can disadvantage individuals, which may affect their chances of success (i.e. their risk r(x)); see Section 2.2.1 and the quotes within it for examples. Furthermore, the class y must be a good proxy for deservedness, as [13] admit, and it must be free of annotation bias.

[Figure 1 shows causal Bayesian network diagrams for four scenarios: (a) Infra-Marginality (Causal Assumption), (b) Intersectionality (Causal Assumption), (c) Infra-Marginality (Ideal World), and (d) Intersectionality (Ideal World). The diagrams relate protected attributes A, observed attributes X, systems of oppression and (dis)advantage, potential "merit"/"risk," and outcomes Y across N individuals.]

Figure 1: Implicit causal assumptions (a,b) and values-driven ideal world scenarios (c,d) for infra-marginality-based [13, 48] and intersectionality-based [16, 11] perspectives on fairness [25]. Here, A denotes protected attributes, X observed attributes, Y outcomes, N individuals, and p the number of protected attributes. Red arrows denote potentially unfair causal pathways, which are removed to obtain the ideal world scenarios (c,d). The figure summarizes broad strands of research; individual works may differ. Note that in the intersectionality ideal world, protected attributes are d-separated from outcomes, which would imply that parity should occur, while this is not the case in the infra-marginality-based ideal world, which is willing to accept disparity.

Regarding (2), intersectionality theory would suggest that systems of oppression such as systemic sexism and racism can impact the distribution of risk itself. For example, if individuals from one demographic are more frequently unfairly incarcerated [1], and repeated incarceration increases the risk function r(x), then the difference in r(x) has occurred due to an unfair process. We summarize these differences in assumptions, and the corresponding ideal-world fairness scenarios, using causal Bayesian network diagrams in Figure 1. The key point of the figure is that, unlike fairness notions which are predicated on the concept of infra-marginality, intersectionality-based perspectives on fairness posit unfair systems of oppression which lead to advantages and disadvantages per (intersectional) group, and which can potentially impact the entire system. Also note that the infra-marginality criticism is only relevant in scenarios where risk scores (or merit/qualifications) are used to assign favorable or unfavorable outcomes. It does not pertain to other scenarios like recommendation, e.g. AI-based career counseling, where the class labels are not intrinsically better or worse than each other but we would still like to achieve parity [58, 32].

Lastly, race is a social construct and is not a well-defined biological property [53]. Similarly, gender (as opposed to biological sex) is also a social construct [55]. Although racial and gender groups have diverse sets of lived experiences, there is little scientific evidence that there are legitimate biological differences in the ability levels of these groups [53, 22]. Insinuations of such differences without clear supporting evidence can potentially veer into the realms of prejudice or scientific racism.

Regarding disparities due to confounders, as in the UC Berkeley graduate admissions case [9], this is a legitimate concern. If known confounders exist, fairness methods which account for them should be used instead of simple parity-based methods, e.g. [39, 44, 43]. Without complex causal modeling, a simple parity-based approach is to calculate parity-based fairness metrics conditioned on the confounder, as in the differential fairness with confounders (ε-DFC) metric proposed by the authors [25]. If confounders are unknown, it is difficult to prevent their impact on fairness measures, even ones based on causality, which typically require a causal model to be defined by the analyst. In this case, we will argue below that the burden of proof for latent confounders may not lie with the organization which is developing the AI system in question. Yet, if researchers can identify confounders that impact parity post-deployment, they would be right to ask for the parity-based intervention to be modified or overturned. Note also that systemic bias is itself a confounding variable (cf. Figure 1 (b)). If a causal model does not account for it in assessing fair outcomes, its inferences will not be fair or correct, either.
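As a rough illustration of conditioning on a known confounder (a simplified stand-in for, not a faithful implementation of, the ε-DFC metric in [25]), the sketch below measures the ratio of group selection rates both marginally and within each stratum of the confounder; the admissions-style records are invented for the example.

```python
from collections import defaultdict

def rate_ratio(records, group_key="group"):
    """Ratio of the lowest to the highest group selection rate (1.0 = perfect parity)."""
    totals, positives = defaultdict(int), defaultdict(int)
    for r in records:
        totals[r[group_key]] += 1
        positives[r[group_key]] += r["outcome"]
    rates = [positives[g] / totals[g] for g in totals]
    return min(rates) / max(rates)

def rate_ratio_given_confounder(records, confounder_key):
    """Worst-case parity ratio computed separately within each confounder stratum."""
    strata = defaultdict(list)
    for r in records:
        strata[r[confounder_key]].append(r)
    return min(rate_ratio(stratum) for stratum in strata.values())

# Hypothetical data in which the confounder (department) explains the marginal disparity.
records = (
    [{"group": "a", "dept": "easy", "outcome": 1}] * 480 + [{"group": "a", "dept": "easy", "outcome": 0}] * 320
    + [{"group": "a", "dept": "hard", "outcome": 1}] * 40 + [{"group": "a", "dept": "hard", "outcome": 0}] * 160
    + [{"group": "b", "dept": "easy", "outcome": 1}] * 120 + [{"group": "b", "dept": "easy", "outcome": 0}] * 80
    + [{"group": "b", "dept": "hard", "outcome": 1}] * 160 + [{"group": "b", "dept": "hard", "outcome": 0}] * 640
)
print("marginal parity ratio:", round(rate_ratio(records), 2))                                     # ~0.54
print("parity ratio given department:", round(rate_ratio_given_confounder(records, "dept"), 2))    # 1.0
```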

4.1.3 Response: Parity does not Prevent Certain Harms or Abuses, or Provide Certain Benefits

Recall that subset targeting, or fairness gerrymandering, is the issue that parity-based fairness methods do not protect against deliberate (or accidental) bias toward subgroups of the protected groups [21, 35]. For example, if we enforce parity for gender and race, groups at the intersection of gender and race, such as Black women, are not necessarily protected, which is concerning from the perspective of intersectionality [16]. This problem can be mitigated by simply defining more fine-grained protected groups, e.g. designating Black women as protected. Sophisticated parity-based fairness metrics have also been proposed which automatically enforce parity for groups at the intersection of the protected attributes, including differential fairness (DF) [25] and statistical parity (SP) subgroup fairness [35].

Critics of the relationship between parity-based fairness and politically left-wing approaches such as affirmative action, particularly those who are concerned that they will be left behind by these ideas, are recommended to read the book "Feminism is for Everybody" [31]. The authors studied the related concern that some disadvantaged groups might be harmed by the fairness intervention [25]. When enforcing our parity-based differential fairness metric, which protects intersecting subgroups of protected groups, we found empirically that per-group fairness metrics were improved for almost every subgroup under consideration.

Some of the other concerns we considered were that parity-based fairness could be achieved by badly behaved classifiers, such as those that deliberately select unqualified individuals to achieve parity, or by increasing negative outcomes. One response to these types of concerns can be summarized by an old joke:

"The patient says, 'Doctor, it hurts when I do this.' The doctor says, 'Then don't do that!' " – Henny Youngman, comedian.3

In other words, if these models are considered harmful, we can choose not to create them. It is not reasonable to expect a single fairness definition to detect and prevent all possible bad behavior. On the other hand, a healthy process for constructing fair AI should involve an engaged data scientist, plus ideally representatives of stakeholders, who should verify whether the behavior of the model is acceptable or pathological, beyond simply looking at chosen fairness metrics. Once detected by an analyst, additional problems can be prevented by adding further penalties and constraints to the AI model, in addition to enforcing the parity-based metric. Note that the self-fulfilling prophecy issue (using the consequences of deliberate non-meritocratic parity to justify prejudice) was raised by privacy and security researchers [21], whose field has a natural skepticism about human behavior. While the skepticism may sometimes hold true, we anticipate that the vast majority of fairness interventions will be made in good faith. Regarding benefits that parity-based fairness metrics do not achieve, such as individual guarantees, these are good reasons to consider using an alternative fairness metric. They are not, however, reasons to dismiss parity-based fairness metrics out-of-hand as a useful tool for combating bias and fairness issues. For individual guarantees specifically, we would like to point out that parity-based fairness does confer a useful probabilistic

3https://www.goodreads.com/author/quotes/82949.Henny_Youngman

benefit to the individual: given their protected group, the prior probability of receiving a favorable outcome (before observing any attributes) will be similar to that of an individual from another group. In the parity-based differential fairness metric [25], this translates to a privacy guarantee for individuals: if an adversary sees your outcome M(x), they will not be able to learn much about your protected attribute s.
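To see why a privacy-style guarantee of this kind holds, consider a bound of the sort we understand differential fairness to impose (our paraphrase, not the exact definition from [25]): suppose P(M(x) = m | s) / P(M(x) = m | s') ≤ e^ε for all outcomes m and all pairs of groups s, s'. A one-line application of Bayes' rule then bounds how much an adversary's beliefs about the protected attribute can shift after observing the outcome:

\[
\frac{P(s \mid M(x) = m)}{P(s' \mid M(x) = m)}
= \frac{P(M(x) = m \mid s)\, P(s)}{P(M(x) = m \mid s')\, P(s')}
\le e^{\varepsilon}\, \frac{P(s)}{P(s')} ,
\]

so for small ε the posterior odds over protected groups remain close to the prior odds.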

4.2 Who has the Burden of Proof on Determining Whether Disparities are Legitimate?

As we have discussed in Section 4.1.2, whether parity-based fairness interventions improve or degrade the extent to which the classifier is meritocratic hinges on whether the disparities in the data, which are replicated by the algorithm, are primarily due to unfair bias (from systemic bias, biased data annotation, etc.), or are due to legitimate reasons (infra-marginality, confounders, etc.). Since bias is a complicated issue, it is rarely easy to answer this question. When an organization decides whether or not to implement a parity-based fairness intervention, does it have the burden of proof on showing whether disparities are unfair? In the absence of solid evidence, which assumption should be the default?

From a legal perspective, in the context of employment decisions in the United States protected by Title VII, if a plaintiff can demonstrate disparate impact (i.e. disparities in outcomes determined by a policy or algorithm), the burden of proof falls on the organization to demonstrate that the disparate impact is justified. In other words, the organization must justify disparity due to its actions, but it need not justify parity due to its actions. Thus, from a legal perspective, if the cause of the disparity is unknown it may be safest to assume that the disparity is unfair and apply a parity-based fairness intervention.4

We may arrive at a similar conclusion from an ethical perspective. If we incorrectly assume that a disparity is unjust and wrongly remove it, the harm falls on advantaged groups, and diversity/equity (a property which generally has intrinsic value) would be improved. If we incorrectly assume that a disparity is just and wrongly choose not to address it, the harm falls on disadvantaged groups, and diversity/equity would not be improved. The advantaged groups typically have more resources than the disadvantaged groups and are in a better position to shoulder the burden, so the overall impact is less if those groups are harmed. If we make a wrong choice, meritocracy is harmed either way to the same degree (the size of the intervention). Therefore, supposing there is a 50/50 or greater chance that the disparity is unjust, the least amount of expected harm occurs if we choose to perform a parity-based fairness intervention.

All other things being equal, in a situation where we have good reason to suspect that a disparity of outcomes is unjust (such as scientific studies showing discrimination in similar settings), and we have no reason to suspect that the disparity of outcomes is legitimate (e.g. one demographic being legitimately more qualified than another, or known confounders), performing a parity-based fairness intervention may be a better choice than doing nothing at all, supposing that we do so in a manner that does not substantially harm system performance. Of course, if there is another fairness intervention that is more appropriate in our particular context, we should consider performing that intervention instead.
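One way to make the expected-harm reasoning above explicit (a back-of-the-envelope formalization we add for clarity, not a model taken from the cited literature, with notation of our own): let p be the probability that the disparity is unjust, let h_adv be the harm of wrongly intervening (borne by advantaged groups), and let h_dis be the harm of wrongly failing to intervene (borne by disadvantaged groups), with h_dis ≥ h_adv as argued above. Then

\[
\mathbb{E}[\text{harm} \mid \text{intervene}] = (1 - p)\, h_{\text{adv}} , \qquad
\mathbb{E}[\text{harm} \mid \text{do nothing}] = p\, h_{\text{dis}} ,
\]

and since h_dis ≥ h_adv, the intervention has the lower expected harm whenever p ≥ 1/2, and often for smaller p as well when h_dis is substantially larger.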

4.3 Summary of our Findings

Based on the analysis in this paper, we have yet to see a compelling argument that parity-based fairness metrics are invalid in all contexts. We do not agree with [27] that parity-based fairness metrics are "fundamentally flawed." Parity-based fairness advances the intrinsically valuable ideal of equality, can rectify societal bias, can prevent bias from unfair class labels, and has economic and legal benefits in some contexts. We have also seen that parity-based fairness is not always ideal, and there are a number of useful arguments regarding the limitations of parity-based fairness in some contexts and under some value systems. The existence of limitations does not, however, completely preclude its use. Some of these limitations can be partially mitigated within a parity-based approach, while others may be best addressed by the use of alternative fairness metrics such as equalized odds, individual fairness, or causal fairness definitions. These alternative fairness metrics have their own limitations, and none can conclusively be argued to be superior to parity-based fairness in all contexts.

4 The authors of this paper are not legal experts and the statements in this paper should not be construed as legal advice.

5 Guidelines and Recommendations

Informed by the above discussion we now provide a set of guidelines and recommendations to the community regarding parity-based fairness metrics.

5.1 Guidelines on Using Parity-Based Fairness Interventions

The technique of penalizing or constraining a machine learning algorithm to enforce a parity-based fairness metric is useful in many contexts, but it must be used with care. Due to its various limitations, it is not appropriate in all contexts. We must carefully consider the relevant properties of the AI task at hand and its broader socio-technical context before determining whether to proceed. We propose the following set of guidelines to be used as a "pre-flight checklist" on whether and how to use a parity-based fairness intervention. This list is only a guideline, not a prescription, and it is almost certainly not complete. It is important to further consider the nuances of the particular application. Several of these guidelines apply more broadly to all of fair AI.

1. When developing an AI fairness intervention, including the choice of fairness metric to uphold in a fair learning algorithm, it is important to consider contextual factors such as who is harmed, the nature of the harm, how it relates to historical injustices, and stakeholder involvement in the design of the fair AI system, particularly of stakeholders who are underrepresented within the organization [15].

2. Before using any fairness intervention or deploying a fairness-sensitive system without one, practitioners should make a best-effort attempt to understand whether disparities of class labels in the data are substantially due to unfair reasons or to legitimate reasons. If this cannot be determined from available data, scientific studies on related settings may suggest whether unfair bias is likely in the current context.

3. If the disparity is due to legitimate reasons, e.g. there are known legitimate differences in ability between groups (infra-marginality), parity-based fairness metrics are not appropriate. Consider using threshold tests instead.

4. If there are known confounders, parity-based fairness metrics are not appropriate. Parity-based metrics which account for confounders, such as ε-DFC, or causal methods, should be used instead.

5. If strong individual-level guarantees are essential for the application, individual fairness should be used instead. Similarly, if there are other essential fairness properties that are only ensured by a different definition, that definition should be used instead.

6. If harm by the system is due to making prediction errors rather than to not receiving a favorable outcome, equalized odds or equality of opportunity are more appropriate fairness definitions.

7. If it is suspected that disparities of class labels are substantially unfair (e.g. due to societal bias or annotation bias), if the disparities cause a harm of representation, or if equality is important for the application regardless of unfairness in the data, then parity-based fairness metrics are a good candidate to be used.

8. If there is uncertainty over whether disparities of class labels are unfair, but there is reasonable suspicion that this is the case and there is no evidence of infra-marginality or confounding, it may be better to perform a parity-based fairness intervention than to make no fairness intervention. Wrongly making a parity-based fairness intervention may be less harmful than wrongly failing to make a parity-based fairness intervention. This assumes that the intervention is done such that little reduction in accuracy occurs, and that there is not another more appropriate fairness intervention for this context.

9. If it is important to maintain a high level of classification accuracy for this application, choose appropriate fairness hyper-parameter settings to achieve this. We generally recommend using a grid search on a validation set to select fairness hyper-parameters, such that fairness is the best achievable under the constraint that accuracy (or another performance metric) remains within (say) 95% of the typical model's performance (i.e., at least accuracy × 0.95).

10. If no loss in accuracy is acceptable at all, consider setting a threshold on the fairness penalty such that only an increase in disparity over the existing disparity in the data is penalized. This strategy only prevents "bias amplification" due to the algorithm's behavior, e.g. from overfitting.

11. If we have strong evidence that most of the disparity is due to unfair reasons, a strict parity-based fairness constraint may be appropriate, even at a substantial cost to accuracy, since an accurate classifier cannot be a fair one in that case.

12. After deployment, it is important to continue to monitor the fairness of the system and to continue to learn about the nature of the unfairness in the data and the system, and adapt the intervention accordingly. For example, if external researchers or activists identify a flaw in the fairness of the system, corresponding adjustments should be made.

5.2 Recommendations to the Community

After considering the many arguments on both sides of the issue, we conclude that parity-based fairness metrics remain a valid and desirable notion of fair AI behavior in many contexts. We therefore call upon community members to cease the practice of dismissing fundamental methods-oriented AI fairness research on parity-based fairness metrics out-of-hand, due to perceived limitations of the approach. This is particularly important during peer review. It is however appropriate to critique applied AI fairness research regarding the limitations of any chosen fairness metric (parity-based or otherwise) relative to alternatives in the context of a particular application, as long as political impartiality is maintained as much as possible. For example, debate and critiques regarding the appropriate metrics to evaluate the fairness of the COMPAS criminal recidivism predictor [2] are valuable and would have been appropriate in peer review of a paper describing the COMPAS system. We request that peer reviewers endeavor to prevent their own political views and value systems from impacting their assessment of AI fairness research.

We further suggest that the debate around parity-based fairness should remain civil in highly public settings such as in audience questions after conference talks and on social media. In those contexts there are often impassioned, sometimes even angrily stated objections to parity-based fairness, as the authors of this paper have witnessed first-hand. The presently ongoing debate around the firing of Timnit Gebru by Google is emblematic of the high tensions in this area. While debate around these issues must continue and we do not intend to stifle it, unjustified vitriolic attacks on parity-based fairness research can have a chilling effect on a much-needed area of study, and they do nothing to advance the discussion.

6 Conclusions

Despite their widespread use, parity-based AI fairness metrics have been somewhat controversial in some circles due to various limitations that have been put forward in the literature. We discussed the pros and cons of these metrics when used as a penalty or constraint to enforce AI fairness in a machine learning system. In many contexts, parity-based fairness confers useful benefits such as mitigating unfair disparities, improving diversity, and reducing legal liability. Although many important concerns have been raised, such as potentially reduced prediction accuracy or unfair impacts when the metrics are used to remove legitimate disparities, mitigation strategies exist, and none of the concerns would preclude the use of the methodology in all contexts. We conclude that although parity-based fairness metrics are not a panacea, and there are situations where they are not appropriate or other fairness metrics would be preferable, they remain useful tools in the AI fairness practitioner's toolkit.

7 Acknowledgments

We thank Rosie Kar for valuable insights regarding intersectionality and standpoint theory. This work was performed under the following financial assistance award: 60NANB18D227 from the U.S. Department of Commerce, National Institute of Standards and Technology. This material is based upon work supported by the National Science Foundation under Grant Nos. IIS-1850023 and IIS-1927486. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

References

[1] Alexander, M. (2012). The new Jim Crow: Mass incarceration in the age of colorblindness. The New Press.

[2] Angwin, J., Larson, J., Mattu, S., and Kirchner, L. (2016). Machine bias: There's software used across the country to predict future criminals. And it's biased against blacks. ProPublica, May 23.

[3] Ayres, I. (2002). Outcome tests of racial disparities in police practices. Justice Research and Policy, 4(1-2):131–142.

[4] Barocas, S. and Selbst, A. (2016). Big data’s disparate impact. Cal. L. Rev., 104:671.

[5] Barrett, L. F., Adolphs, R., Marsella, S., Martinez, A. M., and Pollak, S. D. (2019). Emotional expressions reconsidered: Challenges to inferring emotion from human facial movements. Psychological science in the public interest, 20(1):1–68.

[6] Beede, D. N., Julian, T. A., Langdon, D., McKittrick, G., Khan, B., and Doms, M. E. (2011). Women in STEM: A gender gap to innovation. Economics and Statistics Administration Issue Brief, 04-11.

[7] Bell, M. P., Özbilgin, M. F., Beauregard, T. A., and Sürgevil, O. (2011). Voice, silence, and diversity in 21st century organizations: Strategies for inclusion of gay, lesbian, bisexual, and transgender employees. Human resource management, 50(1):131–146.

[8] Berk, R., Heidari, H., Jabbari, S., Kearns, M., and Roth, A. (2018). Fairness in criminal justice risk assessments: The state of the art. In Sociological Methods and Research, 1050:28.

[9] Bickel, P. J., Hammel, E. A., and O’Connell, J. W. (1975). Sex bias in graduate admissions: Data from Berkeley. Science, 187(4175):398–404.

[10] Buolamwini, J. and Gebru, T. (2018). Gender shades: Intersectional accuracy disparities in commercial gender classification. In FAT*.

[11] Collins, P. (2002 [1990]). Black feminist thought: Knowledge, consciousness, and the politics of empowerment (2nd ed.). Routledge.

[12] Combahee River Collective (1978). A Black feminist statement. In Eisenstein, Z., editor, Capitalist Patriarchy and the Case for Socialist Feminism. Monthly Review Press, New York.

[13] Corbett-Davies, S. and Goel, S. (2018). The measure and mismeasure of fairness: A critical review of fair machine learning.

[14] Cox, T. H. and Blake, S. (1991). Managing cultural diversity: Implications for organizational competitiveness. Academy of Management Perspectives, 5(3):45–56.

[15] Crawford, K., Dobbe, R., Dryer, T., Fried, G., Green, B., Kaziunas, E., Kak, A., Mathur, V., McElroy, E., Sánchez, A. N., et al. (2019). AI Now 2019 report. New York, NY: AI Now Institute.

[16] Crenshaw, K. (1989). Demarginalizing the intersection of race and sex: A Black feminist critique of antidiscrimination doctrine, feminist theory and antiracist politics. U. Chi. Legal F., pages 139–167.

[17] Dastin, J. (2018). Amazon scraps secret AI recruiting tool that showed bias against women. Reuters.

[18] Davis, A. (2011). Are prisons obsolete? Seven Stories Press.

[19] Deshpande, K. V., Pan, S., and Foulds, J. R. (2020). Mitigating demographic bias in AI-based resume filtering. In UMAP '20 Adjunct, pages 268–275. Association for Computing Machinery.

[20] Dionne, E. J. (2004). Why Americans hate politics. Simon and Schuster.

[21] Dwork, C., Hardt, M., Pitassi, T., Reingold, O., and Zemel, R. (2012). Fairness through awareness. In Proceedings of ITCS, pages 214–226.

[22] Ellemers, N. (2018). Gender stereotypes. Annual review of psychology, 69:275–298.

[23] Equal Employment Opportunity Commission (1978). Guidelines on employee selection procedures. C.F.R., 29.1607.

[24] Foulds, J. R., Islam, R., Keya, K. N., and Pan, S. (2020). Bayesian modeling of intersectional fairness: The variance of bias. In Proceedings of the 2020 SIAM International Conference on Data Mining, pages 424–432. SIAM.

[25] Foulds, J. R., Islam, R., Keya, K. N., and Pan, S. (2020). An intersectional definition of fairness. In 2020 IEEE 36th International Conference on Data Engineering (ICDE), pages 1918–1921.

[26] Grant, J., Mottet, L., Tanis, J., Harrison, J., Herman, J., and Keisling, M. (2011). Injustice at every turn: A report of the national transgender discrimination survey. National Center for Transgender Equality.

[27] Hardt, M. (2016). Approaching fairness in machine learning. [Online; posted September 6, 2016] http://blog.mrtz.org/2016/09/06/approaching-fairness.html.

[28] Hardt, M., Price, E., Srebro, N., et al. (2016). Equality of opportunity in supervised learning. In Advances in NeurIPS, pages 3315–3323.

[29] Hartsock, N. C. (1983). The feminist standpoint: Developing the ground for a specifically feminist historical materialism. In Discovering reality, pages 283–310. Springer.

[30] hooks, b. (1981). Ain’t I a Woman: Black Women and Feminism. South End Press.

[31] hooks, b. (2000). Feminism is for everybody: Passionate politics. Pluto Press.

[32] Islam, R., Keya, K. N., Zeng, Z., Pan, S., and Foulds, J. (2020). Neural fair collaborative filtering. ArXiv preprint arXiv:2009.08955v1 [cs.IR].

[33] Jefferson, T. (2018). The Papers of Thomas Jefferson, Volume 21: Index, volume 1. Princeton University Press.

[34] Johnson, L. B. (1965). Commencement Address at Howard University: “To Fulfill These Rights”. Howard University.

[35] Kearns, M., Neel, S., Roth, A., and Wu, Z. (2018). Preventing fairness gerrymandering: Auditing and learning for subgroup fairness. In Proc. of ICML, PMLR 80, pages 2569–2577.

[36] Kearns, M., Roth, A., and Sharifi-Malvajerdi, S. (2019). Average individual fairness: Algorithms, generalization and experiments. In Advances in Neural Information Processing Systems, pages 8242–8251.

[37] Keya, K. N., Islam, R., Pan, S., Stockwell, I., and Foulds, J. R. (2021). Equitable Allocation of Healthcare Resources with Fair Survival Models. In SIAM International Conference on Data Mining (SDM 2021) (Accepted, in press).

[38] Keyes, O., Hutson, J., and Durbin, M. (2019). A mulching proposal: Analysing and improving an algorithmic system for turning the elderly into high-nutrient slurry. In Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems. ACM.

[39] Kusner, M., Loftus, J., Russell, C., and Silva, R. (2017). Counterfactual fairness. In NeurIPS.

[40] Landeau, A. (2020). Measuring fairness in machine learning models. [Online; posted September 18, 2020].

[41] Lorde, A. (1984). Age, race, class, and sex: Women redefining difference. In Sister Outsider, pages 114–124. Ten Speed Press.

[42] Lowry, S. and Macpherson, G. (1988). A blot on the profession. British medical journal (Clinical research ed.), 296(6623):657.

[43] Nabi, R., Malinsky, D., and Shpitser, I. (2019). Learning optimal fair policies. Proceedings of machine learning research, 97:4674.

[44] Nabi, R. and Shpitser, I. (2018). Fair inference on outcomes. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 2018, page 1931.

[45] Nedlund, E. (2019). Apple card is accused of gender bias. Here's how that can happen. [CNN Business. Online; posted November 12, 2019] https://www.cnn.com/2019/11/12/business/apple-card-gender-bias/index.html.

[46] Noble, S. (2018). Algorithms of Oppression: How Search Engines Reinforce Racism. NYU Press.

[47] Page, S. E. (2008). The difference: How the power of diversity creates better groups, firms, schools, and societies-new edition. Princeton University Press.

[48] Simoiu, C., Corbett-Davies, S., Goel, S., et al. (2017). The problem of infra-marginality in outcome tests for discrimination. The Annals of Applied Statistics, 11(3):1193–1216.

[49] Truth, S. (1851). Ain’t I a woman? Speech delivered at Women’s Rights Convention, Akron, Ohio.

[50] Vanian, J. (2019). Google's hate speech detection A.I. has a racial bias problem. [Fortune. Online; posted August 16, 2019] https://fortune.com/2019/08/16/google-jigsaw-perspective-racial-bias/.

[51] Verma, S. and Rubin, J. (2018). Fairness definitions explained. In Proceedings of the International Workshop on Software Fairness, pages 1–7. Association for Computing Machinery.

[52] Verschelden, C. (2017). Bandwidth Recovery: Helping Students Reclaim Cognitive Resources Lost to Poverty, Racism, and Social Marginalization. Stylus.

[53] Wagner, J. K., Yu, J.-H., Ifekwunigwe, J. O., Harrell, T. M., Bamshad, M. J., and Royal, C. D. (2017). Anthropologists' views on race, ancestry, and genetics. American Journal of Physical Anthropology, 162(2):318–327.

[54] Weisskopf, T. E. (2004). Affirmative action in the United States and India: A comparative perspective. Routledge.

[55] West, C. and Zimmerman, D. H. (1987). Doing gender. Gender & society, 1(2):125–151.

[56] Zafar, M., Valera, I., Rodriguez, M., and Gummadi, K. (2017a). Fairness constraints: Mechanisms for fair classification. In AISTATS.

[57] Zafar, M. B., Valera, I., Rodriguez, M., Gummadi, K., and Weller, A. (2017b). From parity to preference-based notions of fairness in classification. In Advances in Neural Information Processing Systems, pages 229–239.

[58] Zeng, Z., Islam, R., Keya, K., Foulds, J., Song, Y., and Pan, S. (2021). Fair heterogeneous network embeddings. In Proceedings of the 15th International AAAI Conference on Web and Social Media (Accepted). AAAI.

[59] Zhao, J., Wang, T., Yatskar, M., Ordonez, V., and Chang, K.-W. (2017). Men also like shopping: Reducing gender bias amplification using corpus-level constraints. In Proceedings of EMNLP.

Trimming the Thorns of AI Fairness Research

Jared Sylvester, Independent, [email protected]
Edward Raff, Booz Allen Hamilton, [email protected]

Abstract

Machine learning practitioners are often ambivalent about the ethical aspects of their products. We believe anything that gets us from that current state to one in which our systems are achieving some degree of fairness is an improvement that should be welcomed. This is true even when that progress does not get us 100% of the way to the goal of "complete" fairness, or to perfect alignment with our personal beliefs about which measure of fairness should be used. Systems built with some measure of fairness implemented would still put us in a better position than the status quo, by increasing the number of systems whose builders care about the problem enough to invest effort toward its remediation. Impediments to getting fairness and ethical concerns applied in real applications, whether they are abstruse philosophical debates or technical overhead such as the introduction of ever more hyper-parameters, should be avoided. In this paper we further elaborate on our argument for this viewpoint and its importance.

1 Introduction

General questions regarding the fairness of machine learning models have increased in their frequency and study in recent years. Such questions can quickly enter philosophical domains and subjective world views [1], but they are crucial as machine learning becomes integrated into the fabric of society. The attention and critical thought is well deserved as we see applications emerge which dramatically and grievously impact people's lives and families, such as predictive policing [2] and sentencing [3, 4]. Despite this, we argue that a significant portion of the machine learning community is missing important questions regarding how to maximize the amount of fair machine learning deployed in the world. In particular, there are practical considerations for applied fairness, with respect to current fairness research, that are being ignored. Stated simply, if we want to increase the fairness of real-world machine learning systems, we should not delay solutions over concerns of optimal fairness when there currently exists no fairness at all. As such we must ask: how do we maximize the number of people implementing/deploying fair/ethical machine learning solutions? We posit that the answer to such a question is to minimize the amount of mental and computational work that must be done to gain fairness. This applies to any practitioner, with varying degrees of education or training in ethics and machine learning. If the incremental cost to deployment is too significant, we argue the concern of fairness will often be dropped in the name of expediency and financial cost. To be explicit: we do not think that fairness considerations ought to be dropped if their implementation is perceived as being onerous, but that, unfortunately, they are dropped as a "practical matter."

Copyright 2020 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. Bulletin of the IEEE Computer Society Technical Committee on Data Engineering

Under this general belief, we have identified three areas where we feel the community could increase social good by instead tempering its emphasis on some optimal notion of fairness. These three areas relate to the debate around which ideal of fairness should be used (a mental cost), over-reliance on trolley car hypotheticals (a mental cost), and the nature of the algorithms people are developing (a computational cost). (While there is ongoing debate about whether AI solutions are appropriate at all for some high-impact fields such as criminal justice, it seems indisputable that there exist some domains in which (i) AI will be used and (ii) AI should be made as fair as possible. It is these domains we are discussing in this paper. Furthermore, in this paper we are concerning ourselves with fairness interventions via technical means only. We are aware that many ethical considerations in AI may be affected by social means such as altering the cultural landscape the AI system operates in, or involving stakeholders with different experiences and viewpoints. We consider these issues outside this paper's scope and our expertise.)

In section 2 we will argue for changing how we discuss and interact with parties about fairness & ethics concerns. Because answering "what is fair" is fraught with complexity that can lead to heated debate, we believe better outcomes can be achieved sooner by avoiding characterizing alternative approaches as "wrong" and instead treating them as opportunities for discussion. In section 3 we will discuss the nature of the importance endowed to the equation of ethics. It is undeniably important, but can obscure practical considerations of what can actually be achieved. We will highlight some of the debate around self-driving cars as an example where unconstrained concerns have potentially moved past what could ever exist. To ensure that such valid concerns are addressed we must remember to design with respect to what a system could reasonably do. In section 4 we will discuss the need to perform more research on making the questions of, and mitigations for, fairness, bias, and ethics issues more accessible to implementers and decision makers. The research to date has largely focused on technical aspects, and will not have a practical impact until we bridge the gap to those who are in a position to effect change in their organizations. Finally we will conclude in section 5.

2 The Unfair Criticism for the Wrong Fairness

One of the largest impediments stopping the adoption of fairness by non-expert practitioners is answering "what is fairness?" There is a rich history of philosophical debate around this very question upon which we can build as a community [1]. At the same time, a philosophical conclusion has not been reached after hundreds of years — so we arguably should not expect there to be one universal definition that the machine learning community will ever rally around. If people cannot agree on a single definition of "fair" when defining it with natural language, why would we expect a single definition to be found when we move into the more rigorous, less ambiguous language of mathematics and algorithms?

Throughout this paper we ourselves will talk about "fairness" and "social good" as if they are quantities that can be objectively improved, but they are not. They exist as intricate social constructs that vary from person to person based on life experiences, exposure, history, and belief systems. For the sake of brevity in exposition, we use this oversimplifying verbiage precisely because their discussion is too complex to do justice to within the constraints of this page. The complex nature of specifying a robust definition is part of what makes fairness a moving target: the definition of "fair" changes over time as societal views change and new issues are surfaced, unlike the technical definitions we are used to working with.1 Ordinary Least Squares has always been and will always be the "optimal" linear unbiased estimator.2 No future social views will change the optimality of the OLS algorithm. In contrast, a racially-dependent algorithm for mortgage underwriting which we now view as being patently unfair would not only have been widely considered fair in prior years, but potentially mandatory [5]. As a result of the

1The royal "we" of persons trained in STEM oriented programs that did not necessarily prepare them to ask or answer moralistic, social, or philosophical questions. 2Subject to the usual conditions of homoscedastic, non-autocorrelated errors, and using "unbiased" in the mathematical as opposed to the ethical sense.

As a result of the mutability of our notions of fairness, there may in fact be no one, final, universal definition of "fair" to be found. Even if all current parties agreed to adopt a particular definition, that would not guarantee any longevity to that consensus. As such, we must address how we as a community interact when confronting disagreements on these important issues.

Just as there are many differing notions of fairness in our society at large, many differing definitions and metrics for fairness and discrimination in predictive machine learning have been developed [6, 7, 8]. Unfortunately, these definitions have been shown to be at some level incompatible with each other, making it impossible to maximize fairness as measured along one of these dimensions without sacrificing fairness measured according to another [8]. These incompatibilities have been demonstrated primarily for binary prediction problems in machine learning; seeking to reconcile nascent definitions in sub-areas like recommender systems [9, 10], regression [11, 12], and clustering [13], or those that may have been defined in neighboring fields like economics [14], will only introduce further contradictions.

Given these competing definitions of fairness, it is important that we as a community avoid being overly critical of which specific definition of fairness is selected for an individual project or system. For those applications where no measures of fairness are currently considered, we should go even further and applaud and encourage the selection of any reasonable and beneficial fairness criterion, even if it is not the one we would have personally preferred. Selecting any fairness criterion is not necessarily satisficing, but we argue it is an improvement over the status quo because it develops buy-in to the need for fairness, ethics, and bias to be considered. The implementers of the system, by selecting any measure, are now invested in the fairness of their product and thus may become more open to improving fairness as a type of feature. Even if another measure is superior in some context, having a less-ideal metric implemented opens the door for revisiting and adjusting the fairness portion of the system at another point in time.3

Encouraging this could prove to be an advantageous path of least resistance. Not only does it allow for a gradual transition, but it can yield positive network effects within larger organizations. For example, Team A gets to note in a monthly update report that fairness was added as a feature, which could get other managers or engineers thinking about fairness for their own projects. This may not happen if members of the team fear being censured for choosing the "wrong" metric of fairness, or for implementing a system which increases fairness without completely maximizing it on some measure.

In addition, a machine learning system of suboptimal fairness may still be more fair than the non-quantitative, human system it augments or replaces. In fairness, as in other aspects of assessing engineering systems, designers should be encouraged to always ask "compared to what?", i.e., not "is this system fair with respect to an ideal?" but "is this system fair compared to the system it supplants?" An only-partially-fair quantitative system may be preferred because it can be measured, logged, and inspected in a way that no human-driven, qualitative decision making can be [15, 16]. This greater legibility can lead to greater transparency.
An important component of this success is respectful discourse between groups when they disagree about what is or is not fair, and openness about how one is measuring fairness in a given system. If these do not exist, disagreements may devolve into accusations and acrimony. Our concern here is not to dismiss the valid and important concerns of underrepresented or impacted populations; it is a practical matter of how to achieve desired outcomes that require convincing others of a new viewpoint. By the very nature of having multiple options to select from (i.e., criteria for fairness and potential interventions) without full understanding of them, people can become more invested in the option they have chosen simply by having made the choice [17]. If the party with sub-optimal fairness has a sincere belief that their current course of action is correct, it is possible that accusatory attempts to alter their opinions will have the opposite effect [18, 19].

3 This is essentially Ousterhout's Dictum that "the best performance improvement is the transition from the nonworking state to the working state," but applied to fairness: better to have some fairness mechanisms implemented, affording the opportunity for future iteration and improvement, than to have implemented no fairness mechanism at all.

Instead, engaging in the conversation in a manner that reaffirms the value of what has been done (even if we disagree) can be a more effective way of changing beliefs [20].

Johndrow and Lum [21] highlighted an example of this with the maligned COMPAS system for predicting criminal recidivism. Angwin et al. [22] from ProPublica published an article about bias in the COMPAS system. In response, the company which developed COMPAS, Northpointe, released a report showing the metrics by which their system was fair [23]. (These were, perhaps predictably, not the metrics Angwin et al. used to evaluate COMPAS, nor were they compatible with them.) Clearly the issue under consideration is of critical importance, to a degree such that the debate about what the best measure of fairness is, and how to make the system more fair as a whole, should be mandatory and continuous. But the way this debate unfolded in this particular instance has led to considerable negative publicity, even though it appears that Northpointe made an earnest, good-faith effort to address the issue before it became newsworthy. The issue appears to be not a Manichaean struggle of fair-vs-unfair, but a question of which of two competing and somewhat incompatible definitions of fairness should be prioritized.

As a community we must avoid exchanges like the one over COMPAS, lest we scare off future leaders and decision makers from the issue of fairness (and machine learning more broadly). Put simply, COMPAS sets a precedent for social risk via negative publicity even when attempting to imbue machine learning systems with fairness. Even if one were to go well beyond what Northpointe did, there is still a risk of censure from critics simply because they may adopt a different definition of fairness. This risk may prevent adoption, and thus lower the total fairness within the world.

If we instead accept that there is no single supreme definition of fairness, the situation can be improved. When we accept that others may not have considered certain factors in selecting their fairness measure, or may have reached their conclusion under different but equally valid philosophical beliefs, the conversation about fairness can be lifted to a more civil and less accusatory tone. In doing so the social risk can be transformed into social reward, as feedback will no longer be perceived as an attack that must be defended against, but as genuine interest from the larger community. In this way we can make the matter of fairness more accessible, and increase the success rate of intervention when an insufficient consideration of, or effort to address, a fairness concern is present.
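
To make the incompatibility at the heart of the COMPAS exchange concrete, the following sketch (a toy illustration of our own, with synthetic data; it is not code from ProPublica, Northpointe, or any cited work) computes two of the contested metrics for the same set of predictions: the false positive rate per group and the positive predictive value per group.

def rates_by_group(y_true, y_pred, groups):
    """Return {group: (false positive rate, positive predictive value)}."""
    out = {}
    for g in sorted(set(groups)):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        fp = sum(1 for i in idx if y_pred[i] == 1 and y_true[i] == 0)
        tn = sum(1 for i in idx if y_pred[i] == 0 and y_true[i] == 0)
        tp = sum(1 for i in idx if y_pred[i] == 1 and y_true[i] == 1)
        fpr = fp / (fp + tn) if (fp + tn) else float("nan")
        ppv = tp / (tp + fp) if (tp + fp) else float("nan")
        out[g] = (fpr, ppv)
    return out

# Two groups with different base rates of the true outcome.
y_true = [1, 1, 1, 0, 0, 0, 0, 0] + [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 0] + [1, 1, 1, 0, 1, 0]
groups = ["A"] * 8 + ["B"] * 6

for g, (fpr, ppv) in rates_by_group(y_true, y_pred, groups).items():
    print(f"group {g}: false positive rate = {fpr:.2f}, positive predictive value = {ppv:.2f}")

On this toy data the two groups receive identical positive predictive values (the sense in which Northpointe argued the system is fair) while their false positive rates differ substantially (the sense in which ProPublica argued it is not); with unequal base rates across groups, these two criteria generally cannot be satisfied simultaneously, which is exactly the incompatibility discussed above. Both parties can point to the same predictions as evidence for their position, which is why we argue the disagreement is better treated as a choice between defensible definitions than as a fair-vs-unfair verdict.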

3 Should Autonomous Vehicles Brake for the Trolley Car?

The trolley car problem [24, 25] has been the subject of much debate recently, coinciding with the increased interest in both fairness and autonomous vehicles. While many variants exist, the general trolley car problem is as follows: if a vehicle continues on its current path, it will kill five people in its way; if some action is taken, it can instead strike and kill only a single person. The specifics of the dilemma change (if the vehicle continues driving it will hit a child; if it swerves it will kill its passenger; etc.), but at its core it is a contrived situation with a pair of differing, but both negative, outcomes.

With self-driving cars on the precipice of deployment, the trolley car problem makes intuitive sense for study. Hardware failures, sudden changes in environment (like an earthquake), or actions of bystanders and non-autonomous vehicles are all factors outside a self-driving agent's control that could lead to a potentially fatal situation. All of this is made more pertinent by the first, unfortunate, death at the hands of an autonomous vehicle [26]. Even before this sad death, many had been debating the trolley car problem and arguing that a solution is needed for deployment [27, 28, 29, 30]. This circles back to the problems we discussed in section 2: what measure of ethical behavior should we use to decide who lives and who dies in the myriad of possible trolley car scenarios? Surveys reveal that people prefer that cars be willing to sacrifice the driver, but simultaneously would not personally want to own such a car [31]. That this creates a dichotomy is understandable, but it makes reaching a consensus on what should be done difficult.

Further studies have looked at presenting varied trolley car scenarios and simply asking people which way the car should swerve, and then attempting to quantify the resulting empirical ethos [32].

Despite all of this research and debate, we do not see a more basic question asked: do drivers today consider the trolley car problem when they are about to be in an accident? We argue that no such consideration exists today, or even could with human drivers. The small amount of time to react in any such scenario likely means people are simply relying on gut reactions and are not performing any meaningful ethical deliberation about who will and will not survive an accident. Nor do we prepare people to make these sorts of decisions: no ethical training or testing is undertaken before issuing people drivers' licenses. If people are not considering this problem today, why should we require self-driving cars to do so? It amounts to moving the goal posts, requiring cars to reach super-human abilities before we let them take over a task.4

If self-driving cars can reduce the number of fatalities by 90% [33], then we also reduce the incidence of trolley car situations by 90%. In this way we are, in a sense, solving the trolley car problem exogenously by reducing its frequency, as the best possible scenario is the one where the trolley car problem never occurs. We argue this increases social good without having to solve such a difficult problem, and that delaying deployment until a satisfactory solution is obtained may in fact needlessly delay improved safety for everyone. We take a moment to emphasize that we are not arguing self-driving cars should be deployed as soon as possible. Considerable and thorough safety and validation testing should be mandatory before public deployment; we cannot afford to cut corners. We are arguing that certain fairness considerations being debated, such as the trolley car problem, have been imbued with an importance beyond the reality of their application.

Along these lines, we need to further consider what situations will lead to trolley car problems. It seems likely that one of the most common culprits is mechanical failure: brakes stop working effectively, steering or sensor systems malfunction, etc. In such a case, even if the car had an oracle that solved the trolley car problem, it is not obvious to us that it would be able to execute on that solution due to the aforementioned mechanical failure. Going further, even if we did have an oracle that could solve the trolley car problem, we likely could not use it effectively. This is because the car itself would need to predict attributes of the people involved, such as their ages, their risk of fatality, and a myriad of other factors that would be necessary inputs to the trolley car problem. Each of these predictions will have its own error rate, and some, like risk of fatality, may not even have any reliable predictive models available. Realistically, any trolley car solution would also require an understanding of the risk and uncertainty of the situation itself. This is an issue we do not see discussed, and it contributes to why we believe a trolley car solution is an unreasonable expectation.

To delay a potentially life-saving innovation is itself deadly. We are engaging in a real-life meta-trolley problem: our meta-trolley is currently running on a track that allows human drivers to kill a million people a year [34], and could be switched to an alternate track that is likely to be far less deadly.
Meanwhile we stand by, arguing about the propriety of pulling the lever to allow the meta-trolley to switch to that lower-lethality track.

4 Fairly Complicated Fair Algorithms

We’ve discussed two situations in which the emphasis on getting fairness exactly right may lead to reduced fairness in practice. Now we discuss a matter with regard to practitioners in making fairness algorithms as usable as possible. This means reducing the number of hyper parameters, and computational and cognitive costs in adding fairness to current algorithms. We feel this issue is dreadfully under-studied. The current focus, even within the human-computer interaction (CHI) field, has advocated for more holistic understanding of a problem and a ML algorithm’s application and context to design a more fair system [35]. Others have studied how users may perceive fairness criteria or their degree of understanding of different fairness

4 Some argue that AI should only have to be as ethical as the humans whose decisions they are supplanting. Others claim that since AI may have super-human abilities, it is not unreasonable that they have super-human ethical responsibilities as well. We would contend that holding AI to a higher standard than humans may be acceptable, but holding them to a standard of perfection is not.

Others have studied how users may perceive fairness criteria, or their degree of understanding of different fairness criteria [36, 37]. These studies are valuable, but all still require expertise on the part of a system's implementers in order to bring the resulting decisions to fruition. If the implementers cannot successfully reify these ideas in practice, it does not matter how well those affected by an ML system may understand it. We need HCI research on how to ease the process of incorporating fairness and bias considerations into an ML system for the decision makers, engineers, and researchers who will perform and supervise the work. In essence, it does not matter how successful the interventions we design are if practitioners at large are unable to wield them.

It is common for fairness mechanisms to introduce multiple new hyperparameters to an algorithm, in addition to the ones that existed before [38, 39, 40]. This can get particularly out of hand when multiple different parts of the model must be specified for any new problem [21]. Such solutions necessitate a more expensive parameter search, thus increasing the financial cost of developing deployable AI solutions. This reduces the incentive for companies to invest the time to make fair models, and thus should be something we attempt to minimize. The problem has become more pronounced as Transformer-based models grow in popularity due to their reported successes [41]. Such models can cost millions of dollars to train once5 and can capture a large amount of harmful bias [42]. We are not criticizing the active work on mitigating these very new problems. Our goal is only to emphasize that an ideal solution should add as little as possible to the already substantial financial cost of training a Transformer in the first place. If a hypothetical solution were developed with a 25× training cost (e.g., due to additional parameters to tune), it would inflate a GPT-3 model's cost to over $100 million. This would leave the solution fiscally untenable. Indeed, the high cost of training such state-of-the-art systems generates an ethical question in its own right, namely: will the most advanced methods be under the control of only the most financially well-endowed institutions [43]? If altering the training algorithms of such large-scale models requires the introduction of many additional hyperparameters and thereby greatly increases the training cost, we find ourselves with a trade-off similar to the one we discussed in Section 2. For example, we may be able to improve a model's demographic parity, but in a way that increases the cost of the training procedure by several orders of magnitude. If in so doing we limit the model's use to only those with the deepest pockets, we are faced with a conflict between two different ethical concerns.

While we have no expectation of a magic black box which will produce fair algorithms and require no work (human or computational), we do believe there is room for considerable simplification of the approaches being developed. Having one or zero hyperparameters may not lead to a perfectly optimized balance between fairness and predictive performance, but it may lead to faster adoption and integration within organizations today, thus increasing fairness from our current baseline. Such simple, low-computation fairness algorithms may not lead to ideal results. However, it would still be good to have them in our collective toolbox for certain situations, such as for use by those with limited computational resources.
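
As one concrete example of the kind of simple, low-computation mechanism we have in mind, the sketch below post-processes an existing model's scores with a single per-group threshold chosen to roughly equalize selection rates (approximate demographic parity). The data, function names, and the choice of demographic parity as the target are illustrative assumptions of ours, not a reference to any specific published method.

import numpy as np

def per_group_thresholds(scores, groups, target_rate):
    """Choose one threshold per group so that roughly `target_rate` of each
    group receives the positive decision (approximate demographic parity)."""
    thresholds = {}
    for g in np.unique(groups):
        s = scores[groups == g]
        # Everything above the (1 - target_rate) quantile of the group's scores
        # is selected, giving approximately the desired selection rate.
        thresholds[g] = np.quantile(s, 1.0 - target_rate)
    return thresholds

def apply_thresholds(scores, groups, thresholds):
    return np.array([scores[i] > thresholds[groups[i]] for i in range(len(scores))])

# Toy usage: scores from some upstream model, two groups with shifted score
# distributions, and a 30% overall selection budget.
rng = np.random.default_rng(0)
scores = np.concatenate([rng.normal(0.4, 0.15, 500),
                         rng.normal(0.6, 0.15, 500)]).clip(0, 1)
groups = np.array(["A"] * 500 + ["B"] * 500)
thresholds = per_group_thresholds(scores, groups, target_rate=0.30)
decisions = apply_thresholds(scores, groups, thresholds)
for g in ["A", "B"]:
    print(f"group {g}: selection rate {decisions[groups == g].mean():.2f}")

Such a step costs essentially nothing to compute, requires no retraining, and exposes a single legible knob (the target selection rate). It will not be optimal under other definitions of fairness, but it is exactly the sort of low-friction starting point we argue should remain in the toolbox.
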
In a similar vein, we would like to see research on automatically assessing which measures of fairness to optimize for, and on providing human-readable reports about what the ramifications of that choice would be. As far as the authors are aware, these two notions have yet to receive study in the machine learning community. The automatic selection of a fairness metric could be done with respect to a maximum acceptable loss in accuracy (e.g., which measure can be maximally satisfied at a fixed cost?). Though the resulting solution may not be optimal, it could prove better than the default state of no fairness consideration at all. To be explicit, we are not arguing that AI systems should decide how other AI systems should be ethical; rather, we should develop systems that help humans understand how they can modify a system and what the consequences of those modifications are. A tool that can generate human-readable reports on the impacts of different fairness measures and provide some "map" of the potential options would also provide value. It would better enable product developers and practitioners who are not experts to weigh the costs of various fairness methods and potentially integrate them, as well as to understand the impact of any measure selected by the aforementioned auto-fairness idea. The goal of all of these proposals for usable fair algorithms is not to directly solve fairness by any means, but to maximize social good in the near and long term.
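
A minimal sketch of what the auto-selection idea could look like in practice is given below: fix a maximum acceptable accuracy loss, search a small space of per-group decision thresholds for each candidate metric, and report how far each fairness gap can be closed within that budget. The synthetic data, the two candidate metrics, and the grid search are our own illustrative assumptions; this is not a description of an existing tool.

import numpy as np

rng = np.random.default_rng(1)
n = 2000
groups = rng.integers(0, 2, n)                                     # two groups: 0 and 1
y = (rng.random(n) < np.where(groups == 0, 0.3, 0.5)).astype(int)  # unequal base rates
scores = np.clip(0.6 * y + 0.2 * groups + rng.normal(0, 0.25, n), 0, 1)  # biased model scores

def predict(th0, th1):
    """Per-group thresholding of the scores."""
    return np.where(groups == 0, scores > th0, scores > th1).astype(int)

def accuracy(pred):
    return (pred == y).mean()

def demographic_parity_gap(pred):
    return abs(pred[groups == 0].mean() - pred[groups == 1].mean())

def equal_opportunity_gap(pred):
    return abs(pred[(groups == 0) & (y == 1)].mean() - pred[(groups == 1) & (y == 1)].mean())

metrics = {"demographic parity": demographic_parity_gap,
           "equal opportunity": equal_opportunity_gap}

base_pred = predict(0.5, 0.5)
base_acc = accuracy(base_pred)
budget = 0.02                       # maximum acceptable loss in accuracy
grid = np.linspace(0.2, 0.8, 25)    # candidate per-group thresholds

print(f"baseline accuracy {base_acc:.3f}")
for name, gap in metrics.items():
    best = None
    for th0 in grid:
        for th1 in grid:
            pred = predict(th0, th1)
            if base_acc - accuracy(pred) <= budget:
                g = gap(pred)
                if best is None or g < best[0]:
                    best = (g, th0, th1)
    print(f"{name}: gap {gap(base_pred):.3f} -> {best[0]:.3f} within budget "
          f"(thresholds {best[1]:.2f} / {best[2]:.2f})")

The human-readable "report" here is only a few printed lines, but the shape of the interaction is the point: a decision maker states the cost they are willing to pay, and the tool shows which notions of fairness are attainable at that price.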

5 e.g., see analysis from https://lambdalabs.com/blog/demystifying-gpt-3.

They create a path of least resistance for novices who are concerned about fairness, so that something can be integrated immediately. This also opens the door to future exploration and improvement of fairness as its own feature, and provides, in our opinion, a viable method for integration into the maximal number of systems. If such work continues to go unstudied, we leave businesses and developers with a daunting task: a whole world of literature, competing definitions, and philosophical questions fraught with ethical and social complexities that must be understood before they can even start. The apparent gap itself may become the biggest deterrent to adoption, and so we implore the community to build these bridges.

5 Conclusions

Our current machine learning systems are becoming more powerful and more widely deployed each day, and yet they, and their creators, are often completely oblivious to issues of fairness. There is a broad chasm between the current state of machine learning and ideally ethical systems. It is our contention that we should welcome any efforts which narrow that gap, even if they fall short of bridging it completely. We believe that some fairness is better than no fairness. Arguments, attitudes, and techniques aimed at perfect fairness are impeding our ability to achieve any improvement relative to the status quo. We should not let the perfect be the enemy of the good in our ongoing quest for a more just world.

We call on people in this discussion to recognize that other researchers and practitioners are trying to make the world better and more just, even if they are not making the exact improvement you might prefer. We do not mean that anyone should be beyond reproach or that anyone's concerns should go unexpressed or unheard, merely that we should try, whenever feasible, to offer critiques constructively and civilly so that we can work together toward a more fair society. We propose that researchers and practitioners in this field should not ask "does this meet some Platonic ideal of fairness?" but rather should be concerned with "does this increase the amount of fairness in the world?"

6 Authors’ Note

Unlike the fields we typically publish in, this topic is fraught with high emotion and has a potentially direct impact on people's lives. (Indeed, the former is because of the latter.) As such, we think it wise to make explicit what we are not saying in this paper.

• We do not advocate an "anything goes" moral relativism when it comes to different definitions of fairness, nor do we think that all definitions are equally appropriate in all conditions just because it is impossible to pick a single universal definition.

• We are not advocating any particular definition of fairness, or suggesting that any particular definitions should be used as defaults.

• We do not think that making any attempt at introducing fairness mechanisms to an AI system is sufficient, or that anyone who does so should be beyond critique. Nor should any cursory effort at fairness entitle one to praise merely for good intent.

• Our core argument is that we should not let the perfect be the enemy of the good. To the extent that an attempted fairness-increasing mechanism fails to even be "good" then we do not wish to defend it merely on the basis of its possible pure intentions. (For example, a classifier which picked labels completely at random may achieve parity, but would be an ethical disaster, and we do not recommend its adoption.)

• As stated in the introduction, we are not considering the question of whether a particular problem domain is too sensitive for AI to be used at all. Nor are we considering ethical improvements that could be made via changes to the surrounding culture, such as increasing the diversity of ML engineers. Not all problems have technical solutions, nor should all solutions be technical. Nevertheless, it is only the subset of technical solutions that we address here.

• Many ethical problems with ML systems are partially a result of "garbage-in-garbage-out" issues from fundamentally biased data. There exist technical approaches to "de-bias" such data (e.g., Amini et al. [44]). That does not mean we think that such technical approaches give us reason to ignore the underlying causes that gave rise to the biased data in the first place, only that such technical approaches are superior to pretending that no problem exists at all.

• We do not think that any improvement in fairness from the status quo is sufficient as an end state, but that such improvements are useful stepping stones to even more fair systems. Criticism is an indispensable part of that continuing process of improvement, and it should be both offered and welcomed freely.

• We do not wish to downplay the harm that AI systems can cause or the suffering of those harmed. Indeed, it is because of this harm, both actual and potential, that we wish to see improvements made.

• While we hope participants in this debate will act with comity, that is only a hope. We are in no position to impose a tone on anyone, nor would we seek to.

• Finally, we wish to remark that this paper is based on our own experiences implementing machine learning systems in academic and industry settings. It is unavoidably the product of our own backgrounds, and we recognize these experiences are not universal.

References

[1] R. Binns, “Fairness in Machine Learning: Lessons from Political Philosophy,” in Proceedings of the 1st Conference on Fairness, Accountability and Transparency, ser. Proceedings of Machine Learning Research, S. A. Friedler and C. Wilson, Eds., vol. 81. New York, NY, USA: PMLR, 2018, pp. 149–159. [Online]. Available: http://proceedings.mlr.press/v81/binns18a.html

[2] D. Ensign, S. A. Friedler, S. Neville, C. Scheidegger, and S. Venkatasubramanian, “Runaway Feedback Loops in Predictive Policing,” in Proceedings of the 1st Conference on Fairness, Accountability and Transparency, ser. Proceedings of Machine Learning Research, S. A. Friedler and C. Wilson, Eds., vol. 81. New York, NY, USA: PMLR, 2018, pp. 160–171. [Online]. Available: http://proceedings.mlr.press/v81/ensign18a.html

[3] A. Chouldechova, “Fair prediction with disparate impact: A study of bias in recidivism prediction instruments,” in FAT ML Workshop, 2017. [Online]. Available: http://arxiv.org/abs/1703.00056

[4] C. Barabas, M. Virza, K. Dinakar, J. Ito, and J. Zittrain, “Interventions over Predictions: Reframing the Ethical Debate for Actuarial Risk Assessment,” in Proceedings of the 1st Conference on Fairness, Accountability and Transparency, ser. Proceedings of Machine Learning Research, S. A. Friedler and C. Wilson, Eds., vol. 81. New York, NY, USA: PMLR, 2018, pp. 62–76. [Online]. Available: http://proceedings.mlr.press/v81/barabas18a.html

[5] J. Kimble, “Insuring inequality: The role of the Federal Housing Administration in the urban ghettoization of African Americans,” Law & Social Inquiry, vol. 32, no. 2, pp. 399–434, 2007.

[6] A. Romei and S. Ruggieri, “A multidisciplinary survey on discrimination analysis,” The Knowledge Engineering Review, vol. 29, no. 05, pp. 582–638, 11 2014. [Online]. Available: http://www.journals.cambridge.org/abstract_S0269888913000039

[7] F. Kamiran and T. Calders, “Classifying without discriminating,” in 2009 2nd International Conference on Computer, Control and Communication. IEEE, 2 2009, pp. 1–6. [Online]. Available: http://ieeexplore.ieee.org/document/4909197/

[8] M. Hardt, E. Price, and N. Srebro, “Equality of Opportunity in Supervised Learning,” in Advances in Neural Information Processing Systems 29 (NIPS 2016), 2016.

[9] R. Burke, N. Sonboli, and A. Ordonez-Gauger, “Balanced Neighborhoods for Multi-sided Fairness in Recommendation,” in Proceedings of the 1st Conference on Fairness, Accountability and Transparency, ser. Proceedings of Machine Learning Research, S. A. Friedler and C. Wilson, Eds., vol. 81. New York, NY, USA: PMLR, 2018, pp. 202–214. [Online]. Available: http://proceedings.mlr.press/v81/burke18a.html

[10] M. D. Ekstrand, M. Tian, I. M. Azpiazu, J. D. Ekstrand, O. Anuyah, D. McNeill, and M. S. Pera, “All The Cool Kids, How Do They Fit In?: Popularity and Demographic Biases in Recommender Evaluation and Effectiveness,” in Proceedings of the 1st Conference on Fairness, Accountability and Transparency, ser. Proceedings of Machine Learning Research, S. A. Friedler and C. Wilson, Eds., vol. 81. New York, NY, USA: PMLR, 2018, pp. 172–186. [Online]. Available: http://proceedings.mlr.press/v81/ekstrand18b.html

[11] T. Calders, A. Karim, F. Kamiran, W. Ali, and X. Zhang, “Controlling Attribute Effect in Linear Regression,” in 2013 IEEE 13th International Conference on Data Mining. IEEE, 12 2013, pp. 71–80. [Online]. Available: http://ieeexplore.ieee.org/document/6729491/

[12] R. Berk, H. Heidari, S. Jabbari, M. Joseph, M. Kearns, J. Morgenstern, S. Neel, and A. Roth, “A Convex Framework for Fair Regression,” in FAT ML Workshop, 2017. [Online]. Available: http://arxiv.org/abs/1706.02409

[13] F. Chierichetti, R. Kumar, S. Lattanzi, and S. Vassilvitskii, “Fair Clustering Through Fairlets,” in FAT ML Workshop, 2017.

[14] W. J. Baumol, “Applied Fairness Theory and Rationing Policy,” The American Economic Review, vol. 72, no. 4, pp. 639–651, 1982. [Online]. Available: http://www.jstor.org/stable/1810007

[15] L. A. Henkel and M. Mather, “Memory attributions for choices: How beliefs shape our memories,” Journal of Memory and Language, vol. 57, no. 2, pp. 163–176, 2007.

[16] M. Lind, M. Visentini, T. Mäntylä, and F. Del Missier, “Choice-supportive misremembering: A new taxonomy and review,” Frontiers in Psychology, vol. 8, p. 2062, 2017.

[17] J. Brehm, “Postdecision changes in the desirability of alternatives.” Journal of abnormal psychology, vol. 52, no. 3, pp. 384–389, 1956.

[18] D. Gal and D. D. Rucker, “When in Doubt, Shout!: Paradoxical Influences of Doubt on Proselytizing,” Psychological Science, vol. 21, no. 11, pp. 1701–1707, nov 2010. [Online]. Available: http://journals.sagepub.com/doi/10.1177/0956797610385953

[19] D. D. Rucker and R. E. Petty, “When Resistance Is Futile: Consequences of Failed Counterarguing for Attitude Certainty.” Journal of Personality and Social Psychology, vol. 86, no. 2, pp. 219–235, 2004. [Online]. Available: http://doi.apa.org/getdoi.cfm?doi=10.1037/0022-3514.86.2.219

[20] G. L. Cohen, J. Aronson, and C. M. Steele, “When Beliefs Yield to Evidence: Reducing Biased Evaluation by Affirming the Self,” Personality and Social Psychology Bulletin, vol. 26, no. 9, pp. 1151–1164, nov 2000. [Online]. Available: http://journals.sagepub.com/doi/10.1177/01461672002611011

[21] J. E. Johndrow and K. Lum, “An algorithm for removing sensitive information: application to race-independent recidivism prediction,” pp. 1–25, 2017. [Online]. Available: http://arxiv.org/abs/1703.04957

[22] J. Angwin, J. Larson, S. Mattu, and L. Kirchner, “Machine Bias: There’s software used across the country to predict future criminals. And it’s biased against blacks.” 2016. [Online]. Available: https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing

[23] W. Dieterich, C. Mendoza, and T. Brennan, “COMPAS Risk Scales: Demonstrating Accuracy Equity and Predictive Parity,” Northpointe, Tech. Rep., 2016. [Online]. Available: http://go.volarisgroup.com/rs/430-MBX-989/images/ProPublica_Commentary_Final_070616.pdf

[24] P. Foot, “The problem of abortion and the doctrine of double effect,” Oxford Review, no. 5, 1967.

[25] J. J. Thomson, “Killing, letting die, and the trolley problem,” The Monist, vol. 59, no. 2, pp. 204–217, 1976.

[26] D. Lee, “Sensor firm Velodyne ’baffled’ by Uber self-driving death,” 2018. [Online]. Available: http://www.bbc.com/news/technology-43523286

[27] J. Achenbach, “Driverless cars are colliding with the creepy Trolley Problem,” 12 2015. [Online]. Available: https://www.washingtonpost.com/news/innovations/wp/2015/12/29/will-self-driving-cars-ever-solve-the-famous-and-creepy-trolley-problem/?utm_term=.bad98a87d67e

[28] G. Corfield, “Kill animals and destroy property before hurting humans, Germany tells future self-driving cars,” 8 2017. [Online]. Available: https://www.theregister.co.uk/2017/08/24/driverless_cars_ethics_laws_germany/

[29] P. Lin, “Why Ethics Matters for Autonomous Cars,” in Autonomous Driving: Technical, Legal and Social Aspects, M. Maurer, J. C. Gerdes, B. Lenz, and H. Winner, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2016, pp. 69–85. [Online]. Available: https://doi.org/10.1007/978-3-662-48847-8_4

[30] N. J. Goodall, “Can you program ethics into a self-driving car?” IEEE Spectrum, vol. 53, no. 6, pp. 28–58, 6 2016. [Online]. Available: http://ieeexplore.ieee.org/document/7473149/

[31] J.-F. Bonnefon, A. Shariff, and I. Rahwan, “The social dilemma of autonomous vehicles,” Science, vol. 352, no. 6293, pp. 1573–1576, 6 2016. [Online]. Available: http://www.sciencemag.org/cgi/doi/10.1126/science.aaf2654

[32] A. Shariff, J.-F. Bonnefon, and I. Rahwan, “Psychological roadblocks to the adoption of self-driving vehicles,” Nature Human Behaviour, vol. 1, no. 10, pp. 694–696, 10 2017. [Online]. Available: http://www.nature.com/articles/s41562-017-0202-6

[33] M. Bertoncello and D. Wee, “Ten ways autonomous driving could redefine the automotive world,” 2015. [Online]. Available: https://www.mckinsey.com/industries/automotive-and-assembly/our-insights/ten-ways-autonomous-driving-could-redefine-the-automotive-world

[34] World Health Organization, “Global status report on road safety,” http://www.who.int/violence_injury_prevention/road_safety_status/2015/en/, October 2015.

[35] M. K. Lee, N. Grgić-Hlača, M. C. Tschantz, R. Binns, A. Weller, M. Carney, and K. Inkpen, “Human-centered approaches to fair and responsible AI,” in Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems, ser. CHI EA ’20. New York, NY, USA: Association for Computing Machinery, 2020, pp. 1–8. [Online]. Available: https://doi.org/10.1145/3334480.3375158

[36] D. Saha, C. Schumann, D. Mcelfresh, J. Dickerson, M. Mazurek, and M. Tschantz, “Measuring Non-Expert Comprehension of Machine Learning Fairness Metrics,” in Proceedings of the 37th International Conference on Machine Learning, ser. Proceedings of Machine Learning Research, H. D. III and A. Singh, Eds., vol. 119. Virtual: PMLR, 2020, pp. 8377–8387. [Online]. Available: http://proceedings.mlr.press/v119/saha20c.html

[37] M. Srivastava, H. Heidari, and A. Krause, “Mathematical Notions vs. Human Perception of Fairness: A Descriptive Approach to Fairness for Machine Learning,” in Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, ser. KDD ’19. New York, NY, USA: ACM, 2019, pp. 2459–2468. [Online]. Available: http://doi.acm.org/10.1145/3292500.3330664

[38] R. Zemel, Y. Wu, K. Swersky, T. Pitassi, and C. Dwork, “Learning Fair Representations,” in Proceedings of the 30th International Conference on Machine Learning, ser. Proceedings of Machine Learning Research, S. Dasgupta and D. McAllester, Eds., vol. 28, no. 3. Atlanta, Georgia, USA: PMLR, 2013, pp. 325–333. [Online]. Available: http://proceedings.mlr.press/v28/zemel13.html

[39] C. Louizos, K. Swersky, Y. Li, M. Welling, and R. Zemel, “The Variational Fair Autoencoder,” in International Conference on Learning Representations (ICLR), 2016. [Online]. Available: http://arxiv.org/abs/1511.00830

[40] H. Edwards and A. Storkey, “Censoring Representations with an Adversary,” in International Conference on Learning Representations (ICLR), 2016. [Online]. Available: http://arxiv.org/abs/1511.05897

[41] T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, S. Agarwal, A. Herbert-Voss, G. Krueger, T. Henighan, R. Child, A. Ramesh, D. M. Ziegler, J. Wu, C. Winter, C. Hesse, M. Chen, E. Sigler, M. Litwin, S. Gray, B. Chess, J. Clark, C. Berner, S. McCandlish, A. Radford, I. Sutskever, and D. Amodei, “Language Models are Few-Shot Learners,” in NeurIPS, 2020. [Online]. Available: http://arxiv.org/abs/2005.14165

[42] S. Gehman, S. Gururangan, M. Sap, Y. Choi, and N. A. Smith, “RealToxicityPrompts: Evaluating neural toxic degeneration in language models,” in Findings of the Association for Computational Linguistics: EMNLP 2020. Online: Association for Computational Linguistics, Nov. 2020, pp. 3356–3369. [Online]. Available: https://www.aclweb.org/anthology/2020.findings-emnlp.301

[43] E. Strubell, A. Ganesh, and A. McCallum, “Energy and policy considerations for deep learning in NLP,” arXiv preprint arXiv:1906.02243, 2019.

[44] A. Amini, A. P. Soleimany, W. Schwarting, S. N. Bhatia, and D. Rus, “Uncovering and mitigating algorithmic bias through learned latent structure,” in Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, 2019, pp. 289–295.

