Creative Data Literacy: Bridging the Gap Between the Data-Haves and Data-Have Nots
Information Design Journal 23(1), 6–18 © 2017 John Benjamins Publishing Company DOI: 10.1075/idj.23.1.03dig

Catherine D’Ignazio

Keywords: data literacy, empowerment, data visualization, inequality

Working with data is an increasingly powerful way of making knowledge claims about the world. There is, however, a growing gap between those who can work effectively with data and those who cannot. Because it is state and corporate actors who possess the resources to collect, store and analyze data, individuals (e.g., citizens, community members, professionals) are more likely to be the subjects of data than to use data for civic purposes. There is a strong case to be made for cultivating data literacy for people in non-technical fields as one way of bridging this gap. Literacy, following the model of popular education proposed by Paulo Freire, requires not only the acquisition of technical skills but also the emancipation achieved through the literacy process. This article proposes the term creative data literacy to refer to the fact that non-technical learners may need pathways towards data which do not come from technical fields. Here I offer five tactics to cultivate creative data literacy for empowerment. They are grounded in my experience as a data literacy researcher, educator and software developer. Each tactic is explained and introduced with examples. I assert that working towards creative data literacy is not only the work of educators but also of data creators, data publishers, tool developers, tool and visualization designers, tutorial authors, government, community organizers and artists.

“The future is already here. It’s just not very evenly distributed.”
– William Gibson

1. The problem: Data inequality

Despite the grand hype around “Big Data” and the knowledge revolution it will create (Schönberger & Cukier 2013), there is profound inequality between those who are benefitting from the storage, collection and analysis of data and those who are not (Andrejevic 2014; boyd & Crawford 2012; Tufekci 2014). Data has become a currency of power. Decisions of public import, ranging from which products to market, to which prisoners to parole and which city buildings to inspect, are increasingly being made by automated systems sifting through large amounts of data (Pasquale 2015). As a result, knowing how to collect, find, analyze, and communicate with data is of increasing importance in society.

Yet ownership of data is largely centralized: most data are collected and stored by corporations and governments. Critically, the technical knowledge of how to work effectively with data is in the hands of a small class of specialists. People are far more likely to be discriminated against with data or surveilled with data than they are to use data for their own civic ends (O’Neil 2016). This has implications for how people do social science (Crawford et al. 2014; Sandvig et al. 2014; Welles 2014), practice law (Pasquale 2015), produce policy (Goldsmith & Crawford 2014), govern the city (Jacobs et al. 2016) and create the news (Diakopoulos 2015; Kirchner 2016; D’Ignazio & Bhargava 2015), among other things.

The scholarship of Critical Data Studies (Dalton, Taylor & Thatcher 2016) has focused on algorithmic transparency, data discrimination and privacy concerns. There has been, however, comparatively less effort on issues of equity in terms of who has access to the computing power and know-how to be able to make sense of data and how they come to acquire and deploy that knowledge. Mark Andrejevic has termed this the “Big Data Divide” (Andrejevic 2014) and boyd and Crawford have referred to data-haves and have-nots (boyd & Crawford 2012). Crawford has written eloquently on “Artificial Intelligence’s White Guy Problem” (Crawford 2016). Certainly, the fact that there are equity and inclusion issues in data science is not surprising given the persistence of digital inequality (DiMaggio & Hargittai 2001) and the lack of women and minorities in STEM fields (Neuhauser 2015). Cultivating data literacy in a more diverse population should therefore be part of any solution or mitigating strategy for data inequality.

2. Creative data literacy

Data literacy includes the ability to read, work with, analyze, and argue with data as part of a broader process of inquiry into the world (D’Ignazio & Bhargava 2016; Letouzé et al. 2015). The popular press has argued for broad data literacy education (Harris 2012; Maycotte 2014). Workshops for nonprofits and activists throughout the world are introducing tools and practices that can help use data to advocate for social change (Tygel & Kirsch 2015; Emerson & Tactical Tech 2013). However, there is a lack of consistent and appropriate approaches for helping novices learn to “speak data” (Bhargava 2014). Some approach the topic from a math- and statistics-centric point of view (Maine 2015). Some build custom tools to support intentionally designed activities based on strong pedagogical imperatives (Williams, Deahl, Rubel & Lim 2015). Still others have brought together diverse communities of interested parties to build documentation, trainings, and other shared resources in an effort to propagate the “open data movement” (Gray 2012). Regrettably, data literacy has been relegated to a set of technical skills, such as reading charts and making graphs, rather than connecting those skills to broader concepts of citizenship and empowerment. Drawing from Paulo Freire’s popular education, literacy involves not just the acquisition of technical skills but also the emancipation achieved through the literacy process (Freire 1968; Tygel & Kirsch 2015). In other words, it is not enough to teach people how to read a chart; you must also teach them how to use that chart to make the world a fairer place. The practice of literacy is the practice of freedom, as conceived by Freire.

So the question to be asked is: How do we go about empowering new learners with data? Rather than proposing a systematic framework for data literacy at scale, this paper offers five tactics for creative data literacy for empowerment. I use the term creative data literacy, rather than simply “data literacy”, to draw attention to the fact that these techniques are geared towards non-technical learners who may need an alternative to the traditional quantitative approach to working with data. Moreover, rather than presuming that creative data literacy is the educators’ domain only, each of the five strategies outlined in this paper specifies which audiences it targets in the data pipeline. The assertion here is that different groups of professionals can contribute to data literacy and that data learning may take place in a variety of settings. The groups that may play a role in engendering and enhancing data literacy include educators as well as data creators, data publishers, tool developers, tool and visualization designers, tutorial authors, government, community organizers, and artists.

3. Five tactics for creative data literacy for empowerment

These tactics are not systemic answers to the problem of data inequality and literacy. They are, however, starting points for building an inclusive set of practices to introduce new learners to “speaking data” (Bhargava 2014) and develop a “data mindset” (Miller 2014). They also challenge the legitimacy of the current data status quo which is producing discriminatory technologies and centralizing data-based power in state and corporate actors. These tactics are derived from my own work as an educator and tool designer, and from that of some of my colleagues, such as Rahul Bhargava, with whom I have developed pedagogical materials and the data literacy platform DataBasic.io. I teach undergraduate and graduate students majoring in the fields of Journalism, the Arts and Communication. I also run data workshops for those in municipal government, journalism, the nonprofit sector, and the arts. Although the tactics I introduce are neither exhaustive nor appropriate for all cases of data literacy learning, they can assist professionals in these fields to improve data literacy learning. My hope is that we can draw from tactics such as these while [...]

3.1 Work with community-centered data

Who can do this: developers; data creators and publishers; tutorial authors; educators

This first and crucial tactic involves the careful sourcing and selection of data that are relevant to the community that is learning to work with data. Ideally, these are data about the learners themselves, their field of work, or an issue they are facing. In most cases, sample data provided for learning purposes are either highly generic (height and weight distributions of people, for example) or only relevant to a small number of learners. For example, many online tutorials in R feature the mtcars data set.[1] This data set is from Motor Trend magazine in 1974 and consists of fuel consumption and performance metrics for cars based on parameters such as number of cylinders, horsepower, rear axle ratio, and weight. Although for learners who are car mechanics or car enthusiasts this is very relevant data, for those who are not, it is alienating to work with data about something that they do not know (or care) much about.

Working with community-centered sample data opens up possibilities for connecting context and lived experience to the data. It also makes it easier for learners to apply their learning to their everyday lives or work contexts more quickly and directly. For example, in the project Local Lotto[2] (a collaboration between the Center for Urban Pedagogy, Brooklyn School for Social Justice, MIT’s Civic Data Design Lab, and CUNY Brooklyn College), urban high school students were charged with determining whether the lottery was a good or bad thing for their neighborhoods.