
DRAFT: PLEASE DO NOT CITE OR CIRCULATE

Dear Culp Colloquium Reader: This is a very early draft. Many thanks in advance for your comments and recommendations for improvement.

Best, Margaret

CRITICAL DATA THEORY

Margaret Hu†

ABSTRACT

This Article contends that the concept of big data must be subjected to critical theoretical treatment as a prerequisite to assessing its impact on newly emerging law and policy developments, and on core constitutional rights and principles. Specifically, the advent of the big data revolution necessitates an evolution of theoretical structures of analysis to properly assess the intersection of digital data, law, and power. Just as Critical Race Theory introduced an approach to legal scholarship to deconstruct race classification as it operates in legal contexts, Critical Data Theory is now required to accomplish similar aims with regard to big data at its intersection with the law. The tools of big data and data science allow for legal, scientific, and socioeconomic-political constructions that parallel the manner in which tools of race negotiation and race definition have facilitated legal, scientific, and socioeconomic-political constructions. Tools of indexation and theories of knowledge production, often presented as scientifically objective and epistemological in nature and, therefore, nondiscriminatory, can nonetheless shape the treatment of classes and subclasses of individuals in profoundly disparate ways. Consequently, this Article concludes that the legality and constitutionality of big data techniques of governance can be better understood through the application of analytical frameworks that assess emerging big data identity and big data policy structures through a critical theoretical lens.

† Margaret Hu, Assistant Professor of Law, Washington and Lee University School of Law [Acknowledgments]

DRAFT: PLEASE DO NOT CITE OR CIRCULATE 5/18/16 9:27 AM

TABLE OF CONTENTS

ABSTRACT ...... 1
TABLE OF CONTENTS ...... 2
INTRODUCTION ...... 2
I. BACKGROUND ON CRITICAL THEORY AND BIG DATA GOVERNANCE ...... 11
A. Critical Theory and Big Data ...... 17
B. Big Data Governance Revolution ...... 27
C. Criticality and Construction ...... 36
II. CRITICAL RACE THEORY AND BIG DATA ...... 41
A. The Critical Race Theory Framework ...... 43
B. Disparate Impact of Big Data Tools and Programs ...... 47
C. Applying Critical Race Theory to Current Big Data Developments ...... 54
III. CRITICAL DATA THEORY ...... 60
A. Emerging Big Data Techniques of Governance ...... 61
B. Applying Critical Data Theory to Big Data Developments ...... 68
CONCLUSION ...... 75

INTRODUCTION

What is Critical Data Theory and why is it necessary?1

1. Critical Data Theory is inspired by and borrows from Critical Race Theory. Critical Race Theory scholars have contributed significant works in recent years to demonstrate how and why a theoretical approach to analyzing race in legal contexts is essential. See, e.g., DERRICK BELL, FACES AT THE BOTTOM OF THE WELL: THE PERMANENCE OF RACISM (1993); PATRICIA WILLIAMS, THE ALCHEMY OF RACE AND RIGHTS: DIARY OF A LAW PROFESSOR (1992); THE NEW BLACK: WHAT HAS CHANGED—AND WHAT HAS NOT—WITH RACE IN AMERICA (Kenneth W. Mack & Guy-Uriel Charles eds., 2013); DEVON CARBADO & MITU GULATI, ACTING WHITE? RETHINKING RACE IN POST-RACIAL AMERICA (2013); RICHARD DELGADO & JEAN STEFANCIC, CRITICAL RACE THEORY: AN INTRODUCTION (2012); JESSIE DANIELS, CYBER RACISM: WHITE SUPREMACY ONLINE AND THE NEW ATTACKS ON CIVIL RIGHTS (2009); LANI GUINIER, LIFT EVERY VOICE: TURNING A CIVIL RIGHTS SETBACK INTO A NEW VISION OF SOCIAL JUSTICE (2003); DOROTHY BROWN, CRITICAL RACE THEORY: CASES, MATERIALS, AND PROBLEMS (2014); Devon W. Carbado, Critical What What?, 43 CONN. L. REV. 1593 (2011); Derrick Bell, Racial Realism, 24 CONN. L. REV. 363 (1992); Guy-Uriel Charles, Colored Speech: Cross Burnings, Epistemics, and the Triumph of the Crits?, 93 GEO. L.J. 575 (2005); Kimberlé Crenshaw, Race, Reform and Retrenchment: Transformation and Legitimation in Antidiscrimination Law, 101 HARV. L. REV. 1331 (1988); Neil Gotanda, A Critique of “Our Constitution Is Color-Blind,” 44 STAN. L. REV. 1 (1991); Jerry Kang, Trojan Horses of Race, 118 HARV. L. REV. 1489 (2005); Jerry Kang, Implicit Bias and the Pushback from the Left, 54 ST. LOUIS U. L.J. 1139 (2010); Mari J. Matsuda, When the First Quail Calls: Multiple Consciousness as Jurisprudential Method,

2016] CRITICAL DATA THEORY 3

This Article introduces Critical Data Theory as a theoretical lens by which to examine the legal and constitutional impact of newly emerging big data governance methods.2 As we transition from a small data3 world to a big data4 world, what we consider to be acceptable modes of acquiring knowledge and methods of assessing verifiable evidence to inform governmental decisionmaking is rapidly changing.5 So, too, logically,

14 WOMEN’S RTS. L. REP. 297 (1992); Trina Jones, Shades of Brown: The Law of Skin Color, 49 DUKE L.J. 1487 (2000); Paul Butler, Much Respect: Toward a Hip-Hop Theory of Punishment, 56 STAN. L. REV. 983 (2004); Paul Butler, By Any Means Necessary: Using Violence and Subversion to Change Unjust Law, 50 UCLA L. REV. 721 (2003); Devon Carbado, Critical Race Theory Meets Social Science, 10 ANN. REV. L. SOC. SCI. 149 (2014). 2. See, e.g., Jack M. Balkin, Old-School/New-School Speech Regulation, 127 HARV. L. REV. 2296, 2297 (2014) (“The digital era is different. Governments can target for control or surveillance many different aspects of the digital infrastructure that people use to communicate: telecommunications and broadband companies, web hosting services, domain name registrars, search engines, social media platforms, payment systems, and advertisers.”); Jack M. Balkin, The Constitution in the National Surveillance State, 93 MINN. L. REV. 1 (2008) [hereinafter Balkin, National Surveillance State]; Jack M. Balkin & Sanford Levinson, The Processes of Constitutional Change: From Partisan Entrenchment to the National Surveillance State, 75 FORDHAM L. REV. 489 (2006); Lior Jacob Strahilevitz, Signaling Exhaustion and Perfect Exclusion, 10 J. ON TELECOMM. & HIGH TECH. L. 321 (2012); David Lyon, Biometrics, Identification and Surveillance, 22 BIOETHICS 499 (2008); Erin Murphy, Paradigms of Restraint, 57 DUKE L.J. 1321 (2008). 3. “‘Small data,’ like ‘big data,’ has no set definition.” Andrew Guthrie Ferguson, Big Data and Predictive Reasonable Suspicion, 163 U. PA. L. REV. 327, 329 n.6 (2015). “Small data” has been described in the following way: “Generally, small data is thought of as solving discrete questions with limited and structured data, and the data are generally controlled by one institution.” Id. (citing JULES J. BERMAN, PRINCIPLES OF BIG DATA: PREPARING, SHARING, AND ANALYZING COMPLEX INFORMATION 1–2 (2013)). 4.
“Big data” is difficult to define, as it is a newly evolving field and the technologies that it encompasses are evolving rapidly as well. See, e.g., The Big Data Conundrum: How to Define It?, MIT TECH. REV. (Oct. 3, 2013), http://www.technologyreview.com/view/519851/the-big-data-conundrum-how-to-define-it/ (“In 2001, a Meta (now Gartner) report noted the increasing size of data, the increasing rate at which it is produced and the increasing range of formats and representations employed. This report predated the term ‘big data’ but proposed a three-fold definition encompassing the ‘three Vs’: Volume, Velocity and Variety. This idea has since become popular and sometimes includes a fourth V: veracity, to cover questions of trust and uncertainty.”). Neil M. Richards & Jonathan H. King, Big Data Ethics, 49 WAKE FOREST L. REV. 393, 394 n.3 (2014) (quoting IT Glossary: Big Data, GARTNER, http://www.gartner.com/it-glossary/big-data/ (last visited May 11, 2015)); see id. (citing the original “3-V” big data report, DOUG LANEY, 3D DATA MANAGEMENT: CONTROLLING DATA VOLUME, VELOCITY, AND VARIETY (2001), http://blogs.gartner.com/doug-laney/files/2012/01/ad949-3D-Data-Management-Controlling-Data-Volume-Velocity-and-Variety.pdf). Multiple authors have addressed the characteristics of “big data” and the challenges posed by big data technologies. See, e.g., VIKTOR MAYER-SCHÖNBERGER & KENNETH CUKIER, BIG DATA: A REVOLUTION THAT WILL TRANSFORM HOW WE LIVE, WORK, AND THINK (2013); PRIVACY, BIG DATA, AND THE PUBLIC GOOD: FRAMEWORKS FOR ENGAGEMENT (Julia Lane, Victoria Stodden, Stefan Bender & Helen Nissenbaum eds., 2014); ROB KITCHIN, THE DATA REVOLUTION: BIG DATA, OPEN DATA, DATA INFRASTRUCTURES & THEIR CONSEQUENCES (2014). 5. danah boyd & Kate Crawford, Critical Questions for Big Data: Provocations for a Cultural, Technological, and Scholarly Phenomenon, 15 INFO., COMM. & SOC’Y 662, 662–

techniques of governance are adapting to the rapidly moving epistemological and ontological currents initiated by the Information Society.6 Critical Data Theory and other theoretical tools are now necessary to robustly analyze the government’s recent capture of big data and data surveillance tools. Without the assistance of new theoretical developments in legal scholarship, it will be impossible to appropriately critique the implications of the newly emerging and unprecedented power of what some scholars have termed the National Surveillance State.7 This Article contributes to the literature by interweaving three bodies of scholarly research in the fields of Critical Race Theory, privacy law, and big data surveillance. Like the constructivity thesis of Critical Race Theory—which holds that law and science construct classifications of race that serve to retrench power norms8—Critical Data Theory holds that, increasingly, law and technology construct classifications of suspect data and suspect digital avatars that also serve to retrench preexisting power hierarchies. The constructivity thesis that is at the heart of Critical Data Theory turns upon the important work of scholars such as Daniel Solove who have pioneered the concept of the “digital person” and “digital personhood.”9 Increasingly, privacy experts and scholars10 have explored the implications of the emergence of the cyber self,11 the networked self,12 and

79 (2012); Kate Crawford, The Anxieties of Big Data, THE NEW INQUIRY (May 30, 2014), http://thenewinquiry.com/essays/the-anxieties-of-big-data/; Andrew Guthrie Ferguson, Big Data and Predictive Reasonable Suspicion, 163 U. PA. L. REV. 327, 329 n.6 (2015); BIG DATA, BIG CHALLENGES IN EVIDENCE-BASED POLICYMAKING (H. Kumar Jayasuriya & Kathryn A. Ritcheske eds., 2015). 6. The term ‘Information Society’ “recognizes that science and technology co-exist in a world where technology diminishes geographic, temporal, social, and national barriers to discovery, access, and use of data.” REPORT OF THE INTERAGENCY WORKING GROUP ON DIGITAL DATA TO THE COMMITTEE ON SCIENCE OF THE NATIONAL SCIENCE AND TECHNOLOGY COUNCIL 17 (2009), https://www.whitehouse.gov/files/documents/ostp/opengov_inbox/harnessing_power_web.pdf. 7. Jack M. Balkin, The Constitution in the National Surveillance State, 93 MINN. L. REV. 1 (2008) [hereinafter Balkin, National Surveillance State]; Jack M. Balkin & Sanford Levinson, The Processes of Constitutional Change: From Partisan Entrenchment to the National Surveillance State, 75 FORDHAM L. REV. 489 (2006). See also SECRECY, NATIONAL SECURITY AND THE VINDICATION OF CONSTITUTIONAL LAW (David Cole, Federico Fabbrini & Arianna Vedaschi eds., 2013). 8. See, e.g., Richard Delgado, Crossroads and Blind Alleys: A Critical Examination of Recent Writing About Race, 82 TEX. L. REV. 121 (2003) (“An ‘idealist’ school holds that race and discrimination are largely a function of attitude and social formation. For these thinkers, race is a social construction created out of words, symbols, stereotypes, and categories.”). 9. DANIEL SOLOVE, THE DIGITAL PERSON: TECHNOLOGY AND PRIVACY IN THE INFORMATION AGE (2004); Daniel J. Solove, Digital Dossiers and the Dissipation of Fourth Amendment Privacy, 75 S. CAL. L. REV. 1083, 1149–51 (2002). 10.
See, e.g., DANIEL SOLOVE, UNDERSTANDING PRIVACY (2010); DANIEL SOLOVE, NOTHING TO HIDE: THE FALSE TRADEOFF BETWEEN PRIVACY AND SECURITY (2013); HELEN

the data self,13 and the preservation of the autonomous self within the increasingly digitalized infrastructure of the Information Society.14 Privacy scholars such as Anita Allen, Ryan Calo, Danielle Citron, Julie Cohen, Michael Froomkin, Woodrow Hartzog, Chris Jay Hoofnagle, Jon Mills, Helen Nissenbaum, Frank Pasquale, Neil Richards, Paul Schwartz, Daniel Solove, and others are currently examining the intersection of privacy rights with personal and social autonomy rights, and how best to regulate encroachments upon those rights.15 Building upon important privacy scholarship that has particularly focused on the legal and social implications that attach to the technologically enabled construction of digital personhood and our networked selves, this Article incorporates the research of big data experts and technology scholars from a wide range of disciplines. To introduce Critical Data Theory, this Article borrows from the theoretical framework of Critical Race Theory.16 It contends that the

NISSENBAUM, PRIVACY IN CONTEXT: TECHNOLOGY, POLICY, AND THE INTEGRITY OF SOCIAL LIFE (2009); FRANK PASQUALE, THE BLACK BOX SOCIETY (2015); NEIL RICHARDS, INTELLECTUAL PRIVACY: RETHINKING CIVIL LIBERTIES IN THE DIGITAL AGE (2015); danah boyd & Kate Crawford, Critical Questions for Big Data: Provocations for a Cultural, Technological, and Scholarly Phenomenon, 15 INFO., COMM. & SOC’Y 662, 662–79 (2012); Danielle Keats Citron & Frank A. Pasquale, The Scored Society: Due Process for Automated Predictions, 89 WASH. L. REV. 1 (2014); Danielle Keats Citron, Technological Due Process, 85 WASH. U. L. REV. 1249, 1260–61 (2008); JULIE COHEN, CONFIGURING THE NETWORKED SELF: LAW, CODE, AND THE PLAY OF EVERYDAY PRACTICE (2012); PRIVACY, BIG DATA, AND THE PUBLIC GOOD: FRAMEWORKS FOR ENGAGEMENT (Julia Lane, Victoria Stodden, Stefan Bender & Helen Nissenbaum eds., 2014); Neil M. Richards, The Dangers of Surveillance, 126 HARV. L. REV. 1934 (2013); and others. 11. Chassity N. Whitman & William H. Gottdiener, The Cyber Self: Facebook as a Predictor of Well-being, INT’L J. APPLIED PSYCHOANALYTIC STUD. (2015). 12 . See, e.g., JULIE COHEN, CONFIGURING THE NETWORKED SELF: LAW, CODE, AND THE PLAY OF EVERYDAY PRACTICE (2012); danah boyd, Social Network Sites as Networked Publics: Affordances, Dynamics & Implications, in A NETWORKED SELF: IDENTITY, COMMUNITY, AND CULTURE ON SOCIAL NETWORK SITES (2010); Frank Pasquale & Danielle Keats Citron, Promoting Innovation While Preventing Discrimination: Policy Goals for the Scored Society, 89 WASH. L. REV. 1413, 1413–14 (2014) (referring to the work of Professor Tal Z. Zarsky); Tal Z. Zarsky, Mining the Networked Self, 6 JERUSALEM REV. LEGAL STUD. 120 (2012), http://jrls.oxfordjournals.org/content/6/1/120.full.pdf?keytype=ref&ijkey=b1gi1dlZvf3iBX. 13 . See, e.g., Robert Gordon, The Electronic Personality and Digital Self, 56 DISP. RESOL. J. 8 (2001). 14. 
See, e.g., Frank Pasquale & Danielle Keats Citron, Promoting Innovation While Preventing Discrimination: Policy Goals for the Scored Society, 89 WASH. L. REV. 1413, 1413–14 (2014) (discussing anti-discriminatory policy goals that arise in the context of the “proliferation of networked identities and selves[]”); Tal Z. Zarsky, Understanding Discrimination in the Scored Society, 89 WASH. L. REV. 1375, 1380–81 (2014); Julie E. Cohen, What Privacy Is For, 126 HARV. L. REV. 1904, 1913 (2013). 15. Anita Allen, Ryan Calo, Julie Cohen, Jennifer Daskal, Michael Froomkin, Woodrow Hartzog, Chris Hoofnagle, Jon Mills, Helen Nissenbaum, Neil Richards, Paul Schwartz, Daniel Solove, Tal Zarsky, and others. 16. See supra note 1.

theoretical naming and framing devices of Critical Race Theory can now help make sense of power relationships in a big data world.17 Critical Data Theory, like Critical Race Theory, questions how power is negotiated or renegotiated through emerging governance and legal structures, and interrogates the underlying sources of that power.18 Concededly, whether or not Critical Race Theory is the most appropriate theory to critique the impact of the intersection of big data, law, and power—or whether other theoretical approaches may be more appropriate—is open to debate. This Article invites that debate. Other theories may add important and necessary critical perspectives on the growing impact of big data. This Article contends that the work of race theorists19 is important here because it can illustrate how apparently equal and objective legal developments and doctrines do not treat classes of individuals equally and objectively.20 Critical Race Theory is especially useful as big data

17. The legal, scientific, social, and other consequences of what has been termed the “big data revolution” have been a topic of intense academic inquiry. See, e.g., Neil M. Richards & Jonathan H. King, Three Paradoxes of Big Data, 66 STAN. L. REV. ONLINE 41 (2013); Solon Barocas & Andrew D. Selbst, Big Data’s Disparate Impact, 104 CAL. L. REV. (forthcoming 2016); Kate Crawford & Jason Schultz, Big Data and Due Process: Toward a Framework to Redress Predictive Privacy Harms, 55 B.C. L. REV. 93 (2014); Omer Tene & Jules Polonetsky, Privacy in the Age of Big Data: A Time for Big Decisions, 64 STAN. L. REV. ONLINE 63 (2012). Some scholars have focused particularly on the algorithmic-driven decisionmaking consequences of emerging big data technologies. See, e.g., Danielle Keats Citron & Frank A. Pasquale, The Scored Society: Due Process for Automated Predictions, 89 WASH. L. REV. 1 (2014); FRANK PASQUALE, THE BLACK BOX SOCIETY (2015). Other experts have focused on the data mining and predictive analytic capacities of big data tools. See, e.g., STEVEN FINLAY, PREDICTIVE ANALYTICS, DATA MINING AND BIG DATA: MYTHS, MISCONCEPTIONS, AND METHODS (2014); ERIC SIEGEL, PREDICTIVE ANALYTICS: THE POWER TO PREDICT WHO WILL CLICK, BUY, LIE, OR DIE (2013); NATE SILVER, THE SIGNAL AND THE NOISE: WHY SO MANY PREDICTIONS FAIL—BUT SOME DON’T (2012); Fred H. Cate, Government Data Mining: The Need for a Legal Framework, 43 HARV. C.R.-C.L. L. REV. 435 (2008); Christopher Slobogin, Government Data Mining and the Fourth Amendment, 75 U. CHI. L. REV. 317 (2008); Daniel J. Solove, Data Mining and the Security-Liberty Debate, 75 U. CHI. L. REV. 343 (2008). At the dawn of the big data revolution, scholars are now actively interrogating the implications of government-led big data uses. See, e.g., Joshua A.T. Fairfield & Erik Luna, Digital Innocence, 99 CORNELL L. REV. 981 (2014); David Gray & Danielle Citron, The Right to Quantitative Privacy, 98 MINN. L. REV.
62 (2013); Neil M. Richards, The Dangers of Surveillance, 126 HARV. L. REV. 1934 (2013). 18 . RICHARD DELGADO & JEAN STEFANCIC, CRITICAL RACE THEORY: AN INTRODUCTION (2012); Devon W. Carbado, Critical What What?, 43 CONN. L. REV. 1593 (2011); Guy-Uriel Charles, Colored Speech: Cross Burnings, Epistemics, and the Triumph of the Crits?, 93 GEO. L.J. 575 (2005); DEVON CARBADO & MITU GULATI, ACTING WHITE? RETHINKING RACE IN POST-RACIAL AMERICA (2013). 19. See supra note 1. 20 . Derrick Bell, Who’s Afraid of Critical Race Theory, 1995 U. ILL. L. REV. 893, 899 (1995).

We insist, for example, that abstraction, put forth as “rational” or “objective” truth, smuggles the privileged choice of the privileged to

governance systems increasingly cloak themselves in a comparable aura of equality and objectivity.21 Big data, in some contexts and by some definitions, removes the human element.22 Proponents claim that the removal of the human element also removes inherent fallibilities associated with human decisionmaking.23 These fallibilities may include faulty perception or misjudgment—for example, discriminatory motivation and cognitive bias24—particularly in the crimmigration25 and counterterrorism26

depersonify their claims and then pass them off as the universal authority and the universal good. To counter such assumptions, we try to bring to legal scholarship an experientially grounded, oppositionally expressed, and transformatively aspirational concern with race and other socially constructed hierarchies.

Id. 21. See, e.g., Kate Crawford & Jason Schultz, Big Data and Due Process: Toward a Framework to Redress Predictive Privacy Harms, 55 B.C. L. REV. 93, 120, 127 (2014). 22. Experts have noted the need to preserve humanity in technological advances such as the embrace of big data, algorithmic intelligence, and artificial intelligence (AI). See, e.g., Citron & Pasquale, The Scored Society, supra note X, at 6–7 (arguing that “we need to create backstops for human review” when deploying artificial intelligence tools and algorithmic-driven scoring systems in order to “retain human values of fairness[.]”). See also id. (citing LOSING HUMANITY: THE CASE AGAINST KILLER ROBOTS, HUM. RTS. WATCH 2 (Nov. 2012), http://www.hrw.org/sites/default/files/reports/arms1112_ForUpload.pdf (characterizing “human-out-of-the-loop” weapons as “potentially autonomous, AI-driven” and, therefore, in violation of international human rights law)); MAYER-SCHÖNBERGER & CUKIER, supra note 4, at 17 (“[T]he age of big data will require new rules to safeguard the sanctity of the individual.”). 23. See, e.g., Barocas & Selbst, supra note X, at 1 (“Advocates of algorithmic techniques like data mining argue that they eliminate human biases from the decision-making process. But an algorithm is only as good as the data it works with.”). See also MAYER-SCHÖNBERGER & CUKIER, supra note 4, at 14 (“As humans we have been conditioned to look for causes, even though searching for causality is often difficult and may lead us down the wrong paths. In a big-data world, by contrast, . . . [t]he correlations may not tell us precisely why something is happening, but they alert us that it is happening.”). 24. See, e.g., Citron & Pasquale, The Scored Society, supra note X, at 4 (“Advocates applaud the removal of human beings and their flaws from the assessment process. Automated systems are claimed to rate all individuals in the same way, thus averting discrimination.”). 25.
See, e.g., Juliet Stumpf, The Crimmigration Crisis: Immigrants, Crime, and Sovereign Power, 56 AM. U. L. REV. 367, 376 (2006) (defining crimmigration law as “[t]he criminalization of immigration law”); César Cuauhtémoc García Hernández, Immigration Detention as Punishment, 61 UCLA L. REV. 1346 (2014); Yolanda Vázquez, Constructing Crimmigration: Latino Subordination in a “Post-Racial” World, 76 OHIO ST. L.J. 599 (2015); Jennifer Chacón, Managing Migration Through Crime, 109 COLUM. L. REV. SIDEBAR 135 (2009). 26. ERIK LUNA & WAYNE MCCORMACK, UNDERSTANDING THE LAW OF TERRORISM 1 (2015) (explaining that global terrorism “has been the principal focus of post 9/11 counterterrorism efforts by the United States.”).

[Terrorism] is defined by the nature of the act, not by the identity of the perpetrators or the nature of the cause. All terrorist acts are crimes—murder, kidnapping, arson. Many would also be violations of the rules of war, if a state of war existed. All involve violence or the threat of

contexts. Database- and algorithmic-driven decisionmaking is purportedly neutral27 and scientifically valid.28 Therefore, proponents of big data governance methods contend that these newly emerging data-driven tools can serve crime prevention and terrorism prevention goals. Chief among them are gaining big data insight,29 avoiding direct racial and ethnic profiling,30 and serving other security objectives, such as furthering what the U.S. Department of Homeland Security has termed “identity management”31 goals domestically, and what the U.S. Department of Defense has termed “population management”32 goals internationally.

violence, often coupled with specific demands. The violence is directed mainly against civilian targets. The motives are political. The actions generally are carried out in a way that will achieve maximum publicity. The perpetrators are usually members of an organized group, and unlike other criminals, they often claim credit for the act. And finally the act is intended to produce effects beyond the immediate physical damage.

BRIAN MICHAEL JENKINS, THE STUDY OF TERRORISM: DEFINITIONAL PROBLEMS 2–3 (1980). 27. See, e.g., Barocas & Selbst, Big Data’s Disparate Impact, supra note X, at 1 (“Big data claims to be neutral. It isn’t.”). 28. See, e.g., Citron & Pasquale, The Scored Society, supra note X, at 31 (“legitimizing widespread subprime lending by purporting to scientifically rank individuals’ creditworthiness with extraordinary precision. Secretive credit scoring can needlessly complicate the social world, lend a patina of objectivity to dangerous investment practices, and encode discriminatory practices in impenetrable algorithms.”) (citation omitted). 29. See, e.g., MAYER-SCHÖNBERGER & CUKIER, supra note 4, at 13–14 (“What we lose in [small data] accuracy at the micro level we gain in [big data] insight at the macro level.”). Cf. FINLAY, supra note X, at 17 (Big data “expectations tend to become somewhat overly inflated.”). 30. Richard Berk, The Role of Race in Forecasts of Violent Crime, 1 RACE & SOC. PROBLEMS 231, 232 (2009) (“Yet, using age, race, and gender could be seen by some stakeholders as inappropriate and even illegal. At the same time, a failure to use those predictors risked less accurate forecasts and a substantial number of homicides that might otherwise have been prevented[]”); Ferguson, supra note X, at 389–91. Cf. Bernard E. Harcourt, Risk as a Proxy for Race, U. CHI. PUB. L. & LEGAL THEORY WORKING PAPER NO. 323, at 1 (2010). 31. The U.S. Department of Homeland Security (DHS) offers this definition of identity management:

Identity Management (IdM) is a broad administrative area that deals with identifying and managing individuals within a government, state, local, public, or private sector network or enterprise. In addition, authentication and authorization to access resources such as facilities or sensitive data within that system are managed by associating user rights, entitlements, and privileges with the established identity.

Identity Management and Data Privacy Technologies Project, CYBER SEC. RESEARCH & DEV. CTR., http://www.cyber.st.dhs.gov/idmdp/. For an overview of identity management as a policy concept, see Lucy L. Thomson, Critical Issues in Identity Management—Challenges for Homeland Security, 47 JURIMETRICS J. 335 (2007). 32. The 2011 U.S. Army Commander’s Guide to Biometrics in Afghanistan specifically offers “a section titled ‘Population Management[.]’” Identity Dominance: The U.S.


Digital data-driven tools, data scoring and indexation programs, and big data governance systems—such as database screening and digital watchlisting systems—are presented as scientifically objective and value neutral, and as the most technologically efficacious tools of governance. Jack Balkin, Sanford Levinson, and others have offered research on what has been termed the National Surveillance State33 to explicate why big data techniques of governance can be better understood through data surveillance and cybersurveillance capacities. These data surveillance, or

Military’s Biometric War in Afghanistan, PUBLIC INTELLIGENCE (Apr. 21, 2014), https://publicintelligence.net/identity-dominance/. The term “population management” is utilized by the U.S. military to operationalize and explicate the justification for the data collection of the biometrics and “contextual data” of “every living person in Afghanistan.” Id. 33. See Balkin & Levinson, supra note X. Multiple works have recently explored the legal implications of mass surveillance and cybersurveillance. See, e.g., Laura K. Donohue, Section 702 and the Collection of International Telephone and Internet Content, 38 HARV. J.L. & PUB. POL’Y 117 (2015); Margo Schlanger, Intelligence Realism and the National Security Agency’s Civil Liberties Gap, 6 HARV. NAT’L SEC. J. 112 (2015); Laura K. Donohue, Bulk Metadata Collection: Statutory and Constitutional Considerations, 37 HARV. J.L. & PUB. POL’Y 757 (2014); Orin S. Kerr, A Rule of Lenity for National Security Surveillance Law, 100 VA. L. REV. 1513 (2014); Peter Margulies, Dynamic Surveillance: Evolving Procedures in Metadata and Content Collection After Snowden, 66 HASTINGS L.J. 1 (2014); Stephen I. Vladeck, Big Data Before and After Snowden, 7 J. NAT’L SEC. L. & POL’Y 333 (2014); Stephen I. Vladeck, Standing and Secret Surveillance, 10 J.L. & POL’Y INFO. SOC’Y 551 (2014); Omer Tene, A New Harm Matrix for Cybersecurity Surveillance, 12 COLO. TECH. L.J. 391 (2014); Christopher Slobogin, Panvasive Surveillance, Political Process Theory, and the Nondelegation Doctrine, 102 GEO. L.J. 1721 (2014); Nathan A. Sales, Domesticating Programmatic Surveillance: Some Thoughts on the NSA Controversy, 10 I/S: J.L. & POL’Y INFO. SOC’Y 523 (2014); Margot E. Kaminski & Shane Witnov, The Conforming Effect: First Amendment Implications of Surveillance, Beyond Chilling Speech, 49 U. RICH. L. REV. 465 (2015); Christopher Slobogin, Cause to Believe What? 
The Importance of Defining a Search’s Object—or, How the ABA Would Analyze the NSA Metadata Surveillance Program, 66 OKLA. L. REV. 725 (2014); Anjali S. Dalal, Shadow Administrative Constitutionalism and the Creation of Surveillance Culture, 2014 MICH. ST. L. REV. 61 (2014); Patrick Toomey & Brett Max Kaufman, The Notice Paradox: Secret Surveillance, Criminal Defendants, & the Right to Notice, 54 SANTA CLARA L. REV. 843 (2014); Deven Desai, Constitutional Limits on Surveillance: Associational Freedom in the Age of Data Hoarding, 90 NOTRE DAME L. REV. 579, 619–24 (2014); Paul Ohm, Electronic Surveillance Law and the Intra-Agency Separation of Powers, 47 U.S.F. L. REV. 269 (2012). Several important works have been published in recent years, shedding light on mass surveillance technologies, and the policy and programmatic framework of cybersurveillance and covert intelligence gathering. See, e.g., BRUCE SCHNEIER, DATA AND GOLIATH: THE HIDDEN BATTLES TO COLLECT YOUR DATA AND CONTROL YOUR WORLD 48 (2015); JULIA ANGWIN, DRAGNET NATION: A QUEST FOR PRIVACY, SECURITY, AND FREEDOM IN A WORLD OF RELENTLESS SURVEILLANCE 17–18 (2014); SHANE HARRIS, @WAR: THE RISE OF THE MILITARY-INTERNET COMPLEX (2014); DANA PRIEST & WILLIAM M. ARKIN, TOP SECRET AMERICA: THE RISE OF THE NEW AMERICAN SECURITY STATE (2011); SHANE HARRIS, THE WATCHERS (2010); ROBERT O’HARROW, JR., NO PLACE TO HIDE (2006); JEFFREY ROSEN, THE NAKED CROWD: RECLAIMING SECURITY AND FREEDOM IN AN ANXIOUS AGE (2004); JENNIFER STISA GRANICK, AMERICAN SPIES: MODERN SURVEILLANCE, WHAT IT IS, & WHY YOU SHOULD CARE (forthcoming 2016).

dataveillance,34 techniques have resulted in disparate impacts on groups that have traditionally suffered discrimination.35 However, under new forms of big data governance, it is suspicious data that is targeted, and not necessarily suspicious persons.36 Therefore, the current regime intended to protect rights—civil rights statutes or equality jurisprudence, for example—may be inadequate.37 Further, discrimination that proceeds under big data, through database-driven or algorithmic-driven tools, for example, may not be seen because big data itself may be largely invisible. Consequently, this Article proceeds in three parts. Part I introduces a discussion on why the emergence of big data governance and government-led big data programs is in need of critical theoretical treatment. Part II examines why Critical Race Theory offers a valuable approach to examining the impact of big data on law and governance. Specifically, this discussion will explore how one theoretical structure, Critical Race Theory, operates to critique hierarchies of law and power that may otherwise go unquestioned. Part III uses the political theory of the National Surveillance State to illuminate why big data forms of governance are highly incentivized by technological developments yet nearly impossible to challenge under the law. As small data privacy laws and small data jurisprudence may prove to be increasingly impotent, the need for more meaningful methods of critique and reform will likely become more apparent. Critical Race Theory and Critical Feminist Theory,38 for

34. Roger Clarke is credited with first introducing the term “dataveillance” into the academic literature. See Roger A. Clarke, Information Technology and Dataveillance, 31 COMM. ACM 498 (1988). Clarke describes dataveillance as the systematic monitoring or investigation of people’s actions, activities, or communications through the application of information technology. Id.; see also LYON, supra note X, at 16 (“Being much cheaper than direct physical or electronic surveillance [dataveillance] enables the watching of more people or populations, because economic constraints to surveillance are reduced. Dataveillance also automates surveillance. Classically, government bureaucracies have been most interested in gathering such data . . .”); MARTIN KUHN, FEDERAL DATAVEILLANCE: IMPLICATIONS FOR CONSTITUTIONAL PRIVACY PROTECTIONS (2007) (examining constitutional implications of “knowledge discovery in databases” (KDD applications) through dataveillance). 35. See infra Part IIB. 36. See, e.g., Hu, Big Data Blacklisting, supra note X. 37. Kimberlé Williams Crenshaw, Race, Reform, and Retrenchment: Transformation and Legitimation in Antidiscrimination Law, 101 HARV. L. REV. 1331, 1348 (1988) (“In antidiscrimination law, the conflicting interests actually reinforce existing social arrangements, moderated to the extent necessary to balance the civil rights challenge with the many interests still privileged over it.”). 38 . See, e.g., CATHARINE MACKINNON, TOWARD A FEMINIST THEORY OF THE STATE (1991); FEMINIST LEGAL THEORY: READINGS IN LAW AND GENDER (Katharine T. Bartlett & Rosanne Kennedy eds.) (1991); CATHARINE MACKINNON, FEMINISM UNMODIFIED: DISCOURSES ON LIFE AND LAW (1987); Katharine T. Bartlett, Feminist Legal Methods, 103 HARV. L. REV. 829 (1990); Deborah L. Rhode, Feminist Critical Theories, 42 STAN. L. REV. 617 (1990); Deborah L. Rhode, Feminism and the State, 107 HARV. L. REV. 1181 (1994); Rhonda S. 
Breitkreuz, Engendering Citizenship—A Critical Feminist Analysis

example, gained scholarly attention as other forms of legal change became increasingly inadequate.39 Critical Data Theory, therefore, may increase in its theoretical and prescriptive utility as traditional paths of legal reform to address the data privacy concerns of the National Surveillance State prove similarly inadequate. The Article concludes that the legality and constitutionality of big data techniques of governance can be better understood through the application of analytical frameworks that assess emerging big data identity and big data policy structures through a critical theoretical lens.

I. BACKGROUND ON CRITICAL THEORY AND BIG DATA GOVERNANCE

In a small data world, the concept of small data—in both its analog and digital forms—is often presented as “objective,” “natural,” “reliable,” and “scientific.”40 Big data, however, is markedly distinct from small data.41 Yet, the presumptive empirical value of all data—both small data and big data—seems to indiscriminately attach to data-driven determinations. Consequently, technological determinism,42 technological solutionism,43 or technoromanticism44 appears to drive the expansion of big data-driven of Canadian Welfare-to-Work Policies and the Employment Experiences of Lone Mothers, 32 J. SOC. & SOC. WELFARE 147, 166 (2005) (“Critical feminist theory is an amalgam of [critical theory and feminist theory], seeking to reveal structural [oppression], transform systems, and emancipate oppressed individuals, using gender as a key category of analysis.”). 39 . See, e.g., LANI GUINIER & GERALD TORRES, THE MINER’S CANARY: ENLISTING RACE, RESISTING POWER, TRANSFORMING DEMOCRACY (2002); andré douglas pond cummings, Derrick Bell: Godfather Provocateur, 28 HARV. J. RACIAL & ETHNIC JUST. 51, 52 (2012) (“Critical Race Theory was originally founded as a response to what had been deemed a stalled civil rights agenda in the United States.”). 40 . See, e.g., MAYER-SCHÖNBERGER & CUKIER, supra note 4, at 66–68. 41 . See, e.g., MAYER-SCHÖNBERGER & CUKIER, supra note 4, at 13; Ferguson, supra note X; Hu, Small Data Surveillance, supra note X. 42 . See, e.g., RAYMOND WILLIAMS, TELEVISION: TECHNOLOGY AND CULTURAL FORM (Ederyn Williams ed. 1974); NEIL POSTMAN, TECHNOPOLY: THE SURRENDER OF CULTURE TO TECHNOLOGY (1992); JACQUES ELLUL, THE TECHNOLOGICAL SOCIETY (1964); DOES TECHNOLOGY DRIVE HISTORY? THE DILEMMA OF TECHNOLOGICAL DETERMINISM (Merritt Roe Smith & Leo Marx eds. 
1994); ANDREW FEENBERG, CRITICAL THEORY OF TECHNOLOGY (1991); LANGDON WINNER, AUTONOMOUS TECHNOLOGY: TECHNICS-OUT-OF-CONTROL AS A THEME IN POLITICAL THOUGHT (1978); TECHNOLOGY AND DEMOCRACY: TECHNOLOGY IN THE PUBLIC SPHERE (Langdon Winner, Andrew Feenberg & Torben Hviid Nielsen eds. 1997). 43 . See, e.g., EVGENY MOROZOV, TO SAVE EVERYTHING, CLICK HERE: THE FOLLY OF TECHNOLOGICAL SOLUTIONISM 5 (2013) (“Recasting all complex social situations either as neatly defined problems with definite, computable solutions or as transparent and self-evident processes that can be easily optimized[.]”). Id. (“I call the ideology that legitimizes and sanctions such aspirations ‘solutionism.’”). 44 . See, e.g., JEDEDIAH PURDY, FOR COMMON THINGS: IRONY, TRUST AND COMMITMENT IN AMERICA TODAY 161–84 (2000) (critiquing the potential harms of “techno-romanticism”); RICHARD COYNE, TECHNOROMANTICISM: DIGITAL NARRATIVE, HOLISM, AND THE ROMANCE OF THE REAL (1999).

policymaking.45 In the absence of a rigorous interrogation of the underlying assumptions that have been attached to big data’s over-promise, big data-facilitated decisionmaking protocols appear to be inevitable and incontrovertible. Within this environment, a largely unchallenged belief in the superiority of big data governance tools is allowed to take root. Just as race is defined in multiple ways, depending on its context, big data, too, is definitionally anchored by its context. As critical race theorist Kimberlé Crenshaw has observed, race can operate as a verb, for example, and not merely as a noun.46 Some experts have observed that big data, like race, can also operate as a verb and not as a noun.47 How big data operates as a verb is significant as a matter of legal and power hierarchies, especially in a society where power structures—knowledge, identity construction, market forces, and policy decisionmaking—are increasingly big data-oriented and algorithmically-driven. Many experts have adopted noun-like definitions of big data that attempt to capture its technological characteristics. As will be explained in further detail below, however, for

45. David Lyon, Surveillance, Snowden, and Big Data: Capacities, Consequences, Critique, BIG DATA & SOC. 5 (2014) (“The term ‘Big Data’ suggests that size is its key feature. Massive quantities of data about people and their activities are indeed generated by Big Data practices and many corporate and government bodies wish to capitalize on what is understood as the Big Data boom.”).

[A] key reason why those commercial and governmental criteria are so imbricated with Big Data is the strong affinity between the two, particularly in relation to surveillance. Big Data represents a confluence of commercial and governmental interests; its political economy resonates with neo-liberalism. National security is a business goal as much as a political one and there is a revolving door between the two in the world of surveillance practices.

Id. at 9 (citation omitted). 46. See Kimberlé Williams Crenshaw, Twenty Years of Critical Race Theory: Looking Back To Move Forward, 43 CONN. L. REV. 1253, 1262 (2011) (Critical Race Theory “is dynamically constituted by a series of contestations and convergences pertaining to the ways that racial power is understood and articulated in the post-civil rights era. In the same way that Kendall Thomas reasoned that race was better thought of as a verb rather than a noun[.]”). 47 . See, e.g., Tal Golan, Big Data Is a Verb (Not a Noun), SALESFORCE (Feb. 12, 2016), https://www.salesforce.com/blog/2016/02/big-data-verb.html (“Big Data is a set of actions (verb), orchestrated by people, using specialized tools (software, hardware, algorithms) to organize data streams for the purpose of gaining additional insight. . . . The processes used to correlate and draw inferences from (potentially) disparate data streams, ultimately leading to insight[]”). Other experts note that big data is not a common noun and that even characterizing it as a “mass noun” involves an understanding that it can operate as a vehicle for knowledge or information delivery and thus is not static. See Renée J. Miller, Big Data Curation, DIMACS [Center for Discrete Mathematics and Theoretical Computer Science] BIG DATA INTEGRATION WORKSHOP (2013) (“Data isn’t a plural noun like pebbles, it’s a mass noun like dust[.] When you've got algorithms weighing hundreds of factors over a huge data set, you can't really know why they come to a particular decision or whether it really makes sense[.]”).

the purposes of this Article, big data will be treated as a verb.48 Therefore, this Article argues that a philosophical effort such as Critical Data Theory is needed to critique big data’s impact. To understand big data governance, one must first recognize that big data is more than technology: Its application in both public and private sectors is premised upon a number of often unarticulated but real philosophical assumptions.49 Because big data is presumed to be efficiency-enhancing, there is an impetus to “upgrade” administrative systems in ways that can exploit those big data efficiencies.50 And, in a self-reinforcing circle, the public and private embrace of big data systems leads to ever-increasing amounts of big data available to be exploited by tools that harvest and analyze such data. Thus, big data, in its philosophical form and technological reality, compounds big data-centric forms of government bureaucracies. The overall result is the emergence of big data governance. Of the multiple presumptions driving big data governance, Critical Data Theory focuses on two. Examining these presumptions is important to understanding why big data is viewed as a superior method of governance.51 The first presumption is the efficacy of the benefits that flow from newly emerging big data tools; the presumed efficacies justify the embrace of such tools as reliable. The second presumption is the “race neutrality” of newly emerging big data tools; the presumed neutrality justifies the embrace of such tools as value-enhancing and furthering antidiscrimination aims.52 After the terrorist attacks of September 11, 2001, what has been referred to as the “military-intelligence-information complex”53 or a “surveillant assemblage”54 has embraced more and more

48. José van Dijck, Datafication, Dataism and Dataveillance: Big Data Between Scientific Paradigm and Ideology, 12 SURVEILLANCE & SOC’Y 197, 198 (2014) (“[T]he ideology of dataism shows characteristics of a widespread belief in the objective quantification and potential tracking of all kinds of human behavior and sociality through online media technologies. Besides, dataism also involves trust in the (institutional) agents that collect, interpret, and share (meta)data culled from social media, internet platforms, and other communication technologies.”). 49. Some experts have explained that big data is more of a philosophy than a science. See, e.g., KITCHIN, supra note 5, at 2 (explaining big data requires a more “philosophical perspective”); STEVEN FINLAY, PREDICTIVE ANALYTICS, DATA MINING AND BIG DATA: MYTHS, MISCONCEPTIONS AND METHODS 14 (2014) (“Rather than getting hung up on a precise definition of Big Data, an alternative perspective is to view Big Data as a philosophy about how to deal with data . . .”). 50 . See, e.g., MAYER-SCHÖNBERGER & CUKIER, supra note 4. 51 . See, e.g., MAYER-SCHÖNBERGER & CUKIER, supra note 4. 52 . See, e.g., MAYER-SCHÖNBERGER & CUKIER, supra note 4, at 161 (“The promise of big data is that we do what we’ve been doing all along—profiling—but make it better, less discriminatory, and more individualized.”). 53 . DANA PRIEST & WILLIAM M. ARKIN, TOP SECRET AMERICA: THE RISE OF THE NEW AMERICAN SECURITY STATE 52 (2011). 54 . Sean P. Hier, Probing the Surveillant Assemblage: On the Dialectics of Surveillance Practices as Processes of Social Control, 1 SURVEILLANCE & SOC’Y 399, 400 (2003) (“[S]urveillance is driven by the desire to integrate component parts into wider

technologically-enhanced data screening and digital watchlisting methods. Policymakers have justified the adoption of big data systems on the basis that big data tools are inherently nondiscriminatory.55 These key decisionmakers appear to be proceeding with the assumption that databases and algorithms are “race neutral” and “colorblind.”56 The racial profiling risks that might attach to targeting suspected terrorists on the basis of protected classifications—such as race and ethnicity, color, gender, religion, national origin, and citizenship status—are presented as eliminated or significantly mitigated.57 To assert a necessary challenge to the presumptions of efficacy and neutrality that attach to rapidly expanding big data governance tools, big data’s operation within legal and policy contexts must now be subjected to a theoretical critique. Critical Theory encourages counterintuitive counternarratives. Critical Race Theory, for example, challenges presumed systems, they insist that data simulations are not simply ‘representational’ by nature, but involve a more advanced form of pragmatics having to do with their instrumental efficacy in making discriminations among divergent populations.”). 55. See, e.g., Citron & Pasquale, supra note X, at 4 (“Automated systems are claimed to rate all individuals in the same way, thus averting discrimination.”). 56. See, e.g., Rep. Elton Gallegly, Gallegly: E-Verify Ready to Put Americans to Work, WASH. TIMES OP-ED (Mar. 16, 2012), http://www.washingtontimes.com/news/2012/mar/16/e-verify-ready-to-put-americans-to-work/?page=all (Rep. Elton Gallegly (R-CA), Chairman of the Judiciary Committee’s Immigration Policy and Enforcement Subcommittee explaining that E-Verify’s “accuracy rate is far superior to that of the I-9 forms that currently are used to check eligibility, it lowers costs for employers, and it’s race-neutral”). 57. 
In formulating policies for big data systems such as the No Fly List and the Terrorist Watchlist, the Federal Bureau of Investigation (FBI) has explained that racial profiling is not allowed. Terrorist Screening Center: Protecting Privacy and Safeguarding Civil Liberties, FED. BUR. INVESTIGATION, https://www.fbi.gov/about-us/nsb/tsc/tsc_liberties (last visited May 14, 2016).

Generally, individuals are included in the Terrorist Screening Database when there is reasonable suspicion to believe that a person is a known or suspected terrorist. Individuals must not be watchlisted based solely on race, ethnicity, national origin, religious affiliation, or First Amendment-protected activities such as free speech, the exercise of religion, freedom of the press, freedom of peaceful assembly, and petitioning the government for redress of grievances.

Id.

Nonetheless, litigation surrounding challenges to the No Fly List and Terrorist Watchlist has raised concerns about the racial profiling risks attached to these big data digital watchlisting programs. See, e.g., Latif v. Holder, 28 F. Supp. 3d 1134 (D. Or. 2014). Protected classifications under equal protection jurisprudence have often turned on an inquiry centered on whether a perceived group lacks political power. Bertrall L. Ross II & Su Li, Measuring Political Power: Suspect Class Determinations and the Poor, 104 CALIF. L. REV. 323, 379 (2015) (arguing in favor of a “more holistic” approach to suspect class determinations, including the utilization of multiple factors that may provide “additional guidance” to courts).

truths about race that invisibly inform and shape the law. Critical Data Theory can operate similarly in rejecting the view that digital data and big data determinations “precede[] law, ideology and social relations.”58 Just as Critical Race Theory has focused on how the law constructs hierarchical dominance based on race,59 Critical Data Theory focuses its attention on how and why big data tools, especially under crimmigration-counterterrorism60 policy rationales, can work in conjunction with other governance tools to construct hierarchies based upon digital data. Big data scholars have illuminated the transformative nature of big data.61 Specifically, they note the disruptive nature of big data discrimination.62 There is growing evidence that big data tools or algorithmic-driven determinations lead to a disparate impact based on race, gender, or other protected classifications.63 Racial-ethnic minorities, as well

58. Carbado, Critical What What?, supra note X. 59. See, e.g., Carbado, Critical What What?, supra note X, at XX (“law constructs whiteness”); Richard Delgado & Jean Stefancic, Critical Race Theory: An Annotated Bibliography, 79 VA. L. REV. 461 (1993) (“Many Critical Race theorists consider that a principal obstacle to racial reform is the majoritarian mindset—the bundle of presuppositions, received wisdoms, and shared cultural understandings persons in the dominant group bring to discussions of race.”). 60. See, e.g., Margaret Hu, Militarized Cyberpolicing: Mass Biometric Dataveillance Under Crimmigration-Counterterrorism (forthcoming). 61. See, e.g., Danielle Keats Citron & Frank A. Pasquale, The Scored Society: Due Process for Automated Predictions, 89 WASH. L. REV. 1 (2014); FRANK PASQUALE, THE BLACK BOX SOCIETY (2015); boyd, supra note X; Crawford, supra note X; Felten, supra note X; Kitchin, supra note X; Lanier, supra note X; Morozov, supra note X; Mayer-Schönberger & Cukier, supra note X; Polonetsky & Tene, supra note X. Other experts have focused on the data mining and predictive analytic capacities of big data tools. See, e.g., STEVEN FINLAY, PREDICTIVE ANALYTICS, DATA MINING AND BIG DATA: MYTHS, MISCONCEPTIONS, AND METHODS (2014); ERIC SIEGEL, PREDICTIVE ANALYTICS: THE POWER TO PREDICT WHO WILL CLICK, BUY, LIE, OR DIE (2013); NATE SILVER, THE SIGNAL AND THE NOISE: WHY SO MANY PREDICTIONS FAIL—BUT SOME DON’T (2012); Fred H. Cate, Government Data Mining: The Need for a Legal Framework, 43 HARV. C.R.-C.L. L. REV. 435 (2008); Christopher Slobogin, Government Data Mining and the Fourth Amendment, 75 U. CHI. L. REV. 317 (2008); Daniel J. Solove, Data Mining and the Security-Liberty Debate, 75 U. CHI. L. REV. 343 (2008). 62. See, e.g., Frank Pasquale & Danielle Keats Citron, Promoting Innovation While Preventing Discrimination: Policy Goals for the Scored Society, 89 WASH. L. REV. 1413, 1413–14 (2014); Neil M. Richards & Jonathan H. 
King, Three Paradoxes of Big Data, 66 STAN. L. REV. ONLINE 41 (2013); Solon Barocas & Andrew D. Selbst, Big Data’s Disparate Impact, 104 CALIF. L. REV. (forthcoming 2016); Kate Crawford & Jason Schultz, Big Data and Due Process: Toward a Framework to Redress Predictive Privacy Harms, 55 B.C. L. REV. 93 (2014); Omer Tene & Jules Polonetsky, Privacy in the Age of Big Data: A Time for Big Decisions, 64 STAN. L. REV. ONLINE 63 (2012); Scott R. Peppet, Regulating the Internet of Things: First Steps Towards Managing Discrimination, Privacy, Security and Consent, 93 TEX. L. REV. 85 (2015). 63 . EXEC. OFFICE OF THE PRESIDENT, BIG DATA: SEIZING OPPORTUNITIES, PRESERVING VALUES (2014) [hereinafter PODESTA REPORT], https://www.whitehouse.gov/sites/default/files/docs/big_data_privacy_report_may_1_2014.pdf; PRESIDENT’S COUNCIL OF ADVISORS ON SCIENCE AND TECHNOLOGY (“PCAST”), BIG DATA AND PRIVACY: A TECHNOLOGICAL

as other subpopulations, such as immigrants and religious minorities, who may reside at socioeconomic and cultural margins, often manifest what is increasingly construed as unstable, anomalous, outlier, nonconforming, or otherwise suspect digital data.64 Unconscious discrimination and cognitive or implicit bias may play a role in the development of algorithms,65 in the collection and analysis of data,66 and in the human interface with technological tools.67 Experts have observed that big data tools and scoring systems are inherently dependent upon classification and indexing systems, which can exacerbate discriminatory effects.68 However, experts have also suggested that big data holds remedial potential to identify, and perhaps mitigate, discriminatory impacts.69

PERSPECTIVE (May 2014) [hereinafter PCAST REPORT], https://www.whitehouse.gov/sites/default/files/microsites/ostp/PCAST/pcast_big_data_and_privacy_-_may_2014.pdf; LEADERSHIP CONFERENCE ON CIVIL AND HUMAN RIGHTS, CIVIL RIGHTS, BIG DATA, AND OUR ALGORITHMIC FUTURE (2014); FED. TRADE COMM’N, DATA BROKERS: A CALL FOR TRANSPARENCY AND ACCOUNTABILITY (May 2014) [hereinafter FTC REPORT: DATA BROKERS]; FED. TRADE COMM’N, BIG DATA: A TOOL FOR INCLUSION OR EXCLUSION?: UNDERSTANDING THE ISSUES 19 (Jan. 2016) [hereinafter FTC REPORT: BIG DATA EXCLUSION]. 64. See, e.g., Hu, Big Data Blacklisting, supra note X. 65 . See, e.g., Solon Barocas & Andrew D. Selbst, Big Data’s Disparate Impact, 104 CAL. L. REV. (forthcoming 2016); Citron & Pasquale, supra note X, at 5 (arguing that “black box” scoring systems have traditionally been “plagued by arbitrary results. They may also have a disparate impact on historically subordinated groups[]”); Batya Friedman & Helen Nissenbaum, Bias in Computer Systems, 14 ACM [ASSOC. OF COMPUTING MACHINERY] TRANSACTIONS ON INFO. SYS. 3 (July 1996) (differentiating between preexisting bias, technical bias, and emergent bias in computer systems); A.R. LANGE, DIGITAL DECISIONS: POLICY TOOLS IN AUTOMATED DECISION-MAKING, CTR. FOR DEMOCRACY & TECH. (Jan. 14, 2016) (examining bias in algorithms); Alex Rosenblat, Kate Wikelius, danah boyd, Seeta Peña Gangadharan & Corrine Yu, DATA & CIVIL RIGHTS: HOUSING PRIMER, DATA & SOC’Y RESEARCH INST., LEADERSHIP CONF. & OPEN TECH. INST. 3–5 (Oct. 30, 2014) (exploring bias in mortgage lending risk-assessment algorithms, credit scores, and online advertising). 66 . BERMAN, supra note X, at 159 (“Big Data resources may contain systemic biases . . . . Every Big Data resource has its blind spots—areas in which data is missing, scarce, or otherwise unrepresentative of the domain.”). 67. Id. (“Often, the Big Data managers are unaware of such [Big Data] deficiencies. 
In some cases, Big Data managers blame the data analyst for ‘inventing’ a deficiency[.]”). See also Danielle Keats Citron, Technological Due Process, 85 WASH. U. L. REV. 1249, 1283 (2007) (stating that administrative hearing officers have an automation bias by trusting computer systems over human witnesses even in the face of contrary evidence); Thomas B. Sheridan, Speculations on Future Relations Between Humans and Automation, in AUTOMATION & HUMAN PERFORMANCE: THEORY & APPLICATIONS 449, 458 (Raja Parasuraman & Mustapha Mouloua eds., 1996) (“It is so tempting to trust the magic of computers and automation . . . if a computer program compiles, we often believe, the software is valid and the intention will be achieved.”) (citation omitted). 68 . See, e.g., Solon Barocas & Andrew D. Selbst, Big Data’s Disparate Impact, 104 CAL. L. REV. (forthcoming 2016). 69 . JULES POLONETSKY & CHRISTOPHER WOLF, FUTURE OF PRIVACY FORUM & ANTI-DEFAMATION LEAGUE, BIG DATA: A TOOL FOR FIGHTING DISCRIMINATION AND EMPOWERING GROUPS (2014), https://fpf.org/wp-content/uploads/Big-Data-A-Tool-for-Fighting-Discrimination-and-Empowering-Groups-FINAL.pdf.


A. Critical Theory and Big Data

What is recognized as Critical Theory70 initially arose in the 1920s and 1930s from academic efforts to examine political labor movements and the impact of capitalism, anti-Semitism, and the rise of fascism.71 As a formal school of thought, Critical Theory has been instrumental in examining the ways modern governance shapes society and culture.72 It encompasses a close interrogation of the meaning of emancipation73 and, relatedly, the preservation of both individual autonomy and social community. The school of thought gained prominence in an era when state governments were developing the administrative tools to expand their reach throughout society, as well as the authoritarian tools to quell dissent in its multifarious forms.74 Principal concerns originally involved what critical theorists viewed as the massification consequences of industrialization and capitalism:75 the loss of human individuality76 that, they contend, results from an assembly-line culture built upon mass-produced

70 . See, e.g., DAVID C. HOY & THOMAS MCCARTHY, CRITICAL THEORY (1994); THOMAS MCCARTHY, IDEALS & ILLUSIONS: ON RECONSTRUCTION & DECONSTRUCTION IN CONTEMPORARY CRITICAL THEORY (1993); THE ESSENTIAL FRANKFURT READER 205 (Andrew Arato & Eike Gebhardt eds.) (1985) (“Marcuse and Horkheimer both argued that critical theory receives present confirmation of its interest in a future liberated society in the fantasy (read: advanced art) of the present that anticipates an entirely new utopian sensibility and the philosophy of the past.”). See also CRITICAL THEORY & SOCIETY: A READER (Stephen Eric Bronner & Douglas MacKay Kellner eds.) (1989); AMY ALLEN, THE END OF PROGRESS (2016); THEODOR W. ADORNO, THE CULTURE INDUSTRY (J.M. Bernstein ed.) (1947); HERBERT MARCUSE, ONE-DIMENSIONAL MAN: STUDIES IN THE IDEOLOGY OF ADVANCED INDUSTRIAL SOCIETY (1964); FREDRIC JAMESON, POSTMODERNISM, OR, THE CULTURAL LOGIC OF LATE CAPITALISM (Stanley Fish ed.) (1991); MAX HORKHEIMER & THEODOR W. ADORNO, DIALECTIC OF ENLIGHTENMENT (1944); WALTER BENJAMIN, The Work of Art in the Age of Mechanical Reproduction, in ILLUMINATIONS: ESSAYS & REFLECTIONS (Hannah Arendt ed.) (1969). 71 . See, e.g., THE ESSENTIAL FRANKFURT READER 186–210 (Andrew Arato & Eike Gebhardt eds.) (1985); THEODOR W. ADORNO, NEGATIVE DIALECTICS (1966); MAX HORKHEIMER, ECLIPSE OF REASON (1947). 72 . See, e.g., CRITICAL THEORY & SOCIETY: A READER (Stephen Eric Bronner & Douglas MacKay Kellner eds.) (1989); THOMAS MCCARTHY, THE CRITICAL THEORY OF JÜRGEN HABERMAS (1981). 73 . See, e.g., AMY ALLEN, THE END OF PROGRESS (2016); Herbert Marcuse, Some Social Implications of Modern Technology, in THE ESSENTIAL FRANKFURT READER 138 (Andrew Arato & Eike Gebhardt eds.) (1985); JÜRGEN HABERMAS, KNOWLEDGE & HUMAN INTERESTS (1968); JÜRGEN HABERMAS, THE THEORY OF COMMUNICATIVE ACTION (1981). 74 . Max Horkheimer, The Authoritarian State, in THE ESSENTIAL FRANKFURT READER 95 (Andrew Arato & Eike Gebhardt eds.) (1985). 75. 
Massification involves the study of mass production, commodification, and standardization. See, e.g., HERBERT MARCUSE, ONE-DIMENSIONAL MAN: STUDIES IN THE IDEOLOGY OF ADVANCED INDUSTRIAL SOCIETY (1964). 76 . MAX HORKHEIMER, CRITICAL THEORY: SELECTED ESSAYS (1975); HERBERT MARCUSE, NEGATIONS: ESSAYS IN CRITICAL THEORY (Steffen G. Bohm, ed.) (2009).

goods and services,77 and the mass delivery of cultural memes78 and modes of thought.79 Critical Theory now encompasses multiple dimensions of critique, covering a wide range of social, economic, political, legal, and cultural frames. In the contemporary era, Critical Theory often focuses its critical lens on commodification and fetishization within post-colonial neoliberalism, oligarchy, and post-capitalist mass culture.80 Some experts claim the Critical Theory movement’s impact has been wide-reaching. One notable impact has been on the dystopian genre of literature, influencing the political and cultural critique of authors such as George Orwell in 1984,81 Aldous Huxley in Brave New World,82 and Arthur Koestler in Darkness at Noon.83 In fact, some scholars have identified this literary nexus between theory and the narrative of social phenomena as the reason Critical Theory carries distinct political relevance.84

77 . HERBERT MARCUSE, ONE-DIMENSIONAL MAN: STUDIES IN THE IDEOLOGY OF ADVANCED INDUSTRIAL SOCIETY (1964); MAX HORKHEIMER & THEODOR W. ADORNO, DIALECTIC OF ENLIGHTENMENT (1944); THEODOR W. ADORNO, THE CULTURE INDUSTRY (J.M. Bernstein ed.) (1947). 78 . See, e.g., JACK M. BALKIN, CULTURAL SOFTWARE: A THEORY OF IDEOLOGY 43 (2003). Cultural memes refer to “the building blocks of the cultural software that forms our apparatus of understanding.” Id. (citing RICHARD DAWKINS, THE SELFISH GENE (1976)).

Memes are spread from person to person by observation and social learning—either face to face or through media of communication like writing, television, or the Internet. . . . In this way, memes are communicated from mind to mind, are adapted into our cultural software, and become a part of us. . . . We use memes to understand, yet memes also ‘use’ us, because they are inside us.

Id. 79 . EDWARD S. HERMAN & NOAM CHOMSKY, MANUFACTURING CONSENT: THE POLITICAL ECONOMY OF THE MASS MEDIA (2002); IRVING L. JANIS, GROUPTHINK: PSYCHOLOGICAL STUDIES OF POLICY DECISIONS & FIASCOES (1982); MAX HORKHEIMER & THEODOR W. ADORNO, DIALECTIC OF ENLIGHTENMENT (1944); MAX HORKHEIMER, ECLIPSE OF REASON (1947). 80 . STEVEN BEST & DOUGLAS KELLNER, POSTMODERN THEORY: CRITICAL INTERROGATIONS (1991); STEVEN BEST & DOUGLAS KELLNER, THE POSTMODERN ADVENTURE: SCIENCE, TECHNOLOGY, CULTURAL STUDIES AT THE THIRD MILLENNIUM (1997); MICHAEL HARDT & ANTONIO NEGRI, EMPIRE (2000); EDWARD SAID, CULTURE & IMPERIALISM (1994); EDWARD SAID, ORIENTALISM (1979); THOMAS ALLMER, CRITICAL THEORY & SOCIAL MEDIA: BETWEEN EMANCIPATION & COMMODIFICATION (2015); THOMAS ALLMER, TOWARDS A CRITICAL THEORY OF SURVEILLANCE IN INFORMATIONAL CAPITALISM (2012); THOMAS MCCARTHY, RACE, EMPIRE, & THE IDEA OF HUMAN DEVELOPMENT (2009). 81 . GEORGE ORWELL, NINETEEN EIGHTY-FOUR (1949). 82 . ALDOUS HUXLEY, BRAVE NEW WORLD (1932). 83 . ARTHUR KOESTLER, DARKNESS AT NOON (1940). 84 . FREDRIC JAMESON, POSTMODERNISM, OR, THE CULTURAL LOGIC OF LATE CAPITALISM (Stanley Fish ed.) (1991); M. KEITH BOOKER, THE DYSTOPIAN IMPULSE IN MODERN


The Snowden disclosures in June 2013 awakened fears that a drift towards a form of dystopia was possible, if not real.85 However legitimate that concern, those disclosures have made clear the need to bring a critical lens to modern big data cybersurveillance and dataveillance methods and the National Surveillance State they support.86 Former NSA contractor Edward Snowden has explained that he came forward as a national security whistleblower in part because of what he referred to as “turnkey tyranny.”87 At the time his identity was revealed, Snowden expressed his fear that the NSA’s architecture88 of cybersurveillance and dataveillance could be abused at some future point.89 Some claim the threat of a new form of dataveillance fascism or big data-centered dictatorship is overblown and hyperbolic.90 Yet, journalist

LITERATURE: FICTION AS SOCIAL CRITICISM (1994); M. KEITH BOOKER, DYSTOPIAN LITERATURE: A THEORY & RESEARCH GUIDE (1994). 85 . See, e.g., Glenn Greenwald, Tomgram: Glenn Greenwald, How I Met Edward Snowden, TOMDISPATCH (May 13, 2014), http://www.tomdispatch.com/post/175843/tomgram%3A_glenn_greenwald,_how_i_met_edward_snowden/ (“Technologically speaking, what Snowden revealed to the world . . . was a remarkable achievement, as well as a nightmare directly out of some dystopian novel[]”); Jon L. Mills, The Future of Privacy in the Surveillance Age, in AFTER SNOWDEN: PRIVACY, SECRECY, AND SECURITY IN THE INFORMATION AGE 223 (Ronald Goldfarb ed. 2015). 86. See, e.g., David Lyon, Surveillance, Snowden, and Big Data: Capacities, Consequences, Critique, BIG DATA & SOC. 4 (2014) (“In particular, different kinds of data are now being captured and used in new ways, which prompts some to distinguish between surveillance as targeted practices over against fresh form of dataveillance.”); see also José van Dijck, Datafication, Dataism and Dataveillance: Big Data Between Scientific Paradigm and Ideology, 12 SURVEILLANCE & SOC’Y 197 (2014). 87. Snowden articulated the threat of “turnkey tyranny” in this way:

. . . [A] new leader will be elected, they'll flip the switch, say that because of the crisis, because of the dangers that we face in the world, you know, some new and unpredicted threat, we need more authority, we need more power. And there will be nothing the people can do at that point to oppose it. And it will be turnkey tyranny.

Peter Foster, Edward Snowden Is a Self-Regarding Idealist Whose Warnings of Tyranny Ring Hollow, TELEGRAPH (July 18, 2013, 7:19 PM), http://www.telegraph.co.uk/news/uknews/defence/10188209/Edward-Snowden-is-a-self-regarding-idealist-whose-warnings-of-tyranny-ring-hollow.html. 88. Experts increasingly describe big data surveillance in architectural terms. See, e.g., BRUCE SCHNEIER, DATA AND GOLIATH: THE HIDDEN BATTLES TO COLLECT YOUR DATA AND CONTROL YOUR WORLD 48 (2015) (“This has evolved into a shockingly extensive, robust, and profitable surveillance architecture.”). See also GREENWALD, supra note X; JENNIFER STISA GRANICK, AMERICAN SPIES: MODERN SURVEILLANCE, WHAT IT IS, & WHY YOU SHOULD CARE (forthcoming 2016); Margaret Hu, Taxonomy of the Snowden Disclosures, 72 WASH. & LEE L. REV. 1679 (2015). 89. Foster, supra note 87. 90. Peter Foster, Edward Snowden Is a Self-Regarding Idealist Whose Warnings of Tyranny Ring Hollow, TELEGRAPH (July 18, 2013, 7:19 PM),

and attorney Glenn Greenwald, and journalist and documentary filmmaker Laura Poitras—who reportedly hold sole possession of the full Snowden files—and other surveillance experts have shared the view that the Snowden disclosures profoundly implicate questions of democratic governance. Greenwald has explained: “[Snowden] has made it clear, with these disclosures, that we stand at a historic crossroads. Will the digital age usher in the individual liberation and political freedoms that the Internet is uniquely capable of unleashing? Or will it bring about a system of omnipresent monitoring and control . . . ?”91 Poitras has also stated that the Snowden disclosures’ significance turns on their impact on the democratic experiment. As Poitras explained, the “‘Snowden [revelations] don’t only expose a threat to our privacy but to our democracy itself.’”92 Surveillance expert Rachel Levinson-Waldman has put it this way: “The collection and retention of non-criminal information about Americans for law enforcement and national security purposes poses profound challenges to our democracy and our liberties.”93 Constitutional scholars Jack Balkin and Sanford Levinson have coined the term “National Surveillance State.”94 They explain that the integration of bureaucratized and normalized surveillance technologies into day-to-day governance should be understood as a distinctive concern of American constitutionalism.95 Balkin and Levinson define the National Surveillance State as being “characterized by a significant increase in government investments in technology and government bureaucracies devoted to promoting domestic security and (as its name implies) gathering intelligence and surveillance using all of the devices that the digital revolution allows.”96 The theory of the National Surveillance State explicates why disruptive technological innovations are now incentivizing concurrent revolutions in methods of governance.97 
http://www.telegraph.co.uk/news/uknews/defence/10188209/Edward-Snowden-is-a-self-regarding-idealist-whose-warnings-of-tyranny-ring-hollow.html (“[Snowden] has failed to produce a single concrete example of an abuse of a spy apparatus which he claims is trampling the constitution. He asserts portentously that NSA operatives like him ‘had the power to change people’s fates,’ but cannot show where actual wrongs have been committed.”). 91. Greenwald, supra note 1, at 6. 92. Laura Poitras, CITIZENFOUR (2014); Peter Maass, The Intercept’s Laura Poitras Wins Academy Award for ‘Citizenfour’, INTERCEPT (Feb. 22, 2015), https://firstlook.org/theintercept/2015/02/22/poitras-wins-oscar-for-citizenfour/ (explaining relevancy of Snowden disclosures in her acceptance speech at the 87th Academy Awards, immediately after Poitras received the Oscar Award for Best Documentary Feature for directing Citizenfour). 93. RACHEL LEVINSON-WALDMAN, BRENNAN CTR. FOR JUSTICE, WHAT THE GOVERNMENT DOES WITH AMERICANS’ DATA 9 (2013). 94. See Balkin & Levinson, supra note 1, at 521. 95. See Balkin & Levinson, supra note X, at 520–21. 96. See Balkin & Levinson, supra note X, at 131. 97. See, e.g., Balkin, supra note 1, at 3 (“The question is not whether we will have a surveillance state in the years to come, but what sort of surveillance state we will have. Will

2016] CRITICAL DATA THEORY 21

The unprecedented growth of bureaucratized surveillance98 and the asymmetric power that results from state access to big data tools99 have been well documented. These trends are indicative of the growing impact of cybersurveillance and dataveillance systems.100 Thus, when assessing the impact of the deployment of big data governance tools, special attention must be paid to the underlying philosophies and theories of big data itself. The technological significance of big data as a scientific and theoretical axis for the surveillance programs revealed by the Snowden disclosures,101 along with other big data governance philosophies, presents a critical focus for future legal and constitutional analysis.102 In the National Surveillance State, big data technological innovations enable surveillance to serve not simply as a discrete tool of policing, foreign intelligence gathering, and advancing counterterrorism and defense goals. That type of discrete surveillance was commonly accepted and understood in a small data world. In contrast, big data forms of surveillance are not commonly understood. Big data surveillance involves integrating big data technologies into daily governance activities, such as distributing benefits, tracking identities, and mediating rights and privileges.103 New forms of surveillance made possible by big data’s availability now result in new forms of governmental

we have a government without sufficient controls over public and private surveillance, or will we have a government that protects individual dignity and conforms both public and private surveillance to the rule of law?”). 98. See, e.g., Margaret Hu, Biometric ID Cybersurveillance, 88 IND. L. J. 1475 (2013). 99. See, e.g., PRIVACY AND POWER: A TRANSATLANTIC DIALOGUE IN THE SHADOW OF THE NSA-AFFAIR (Russell A. Miller ed.) (forthcoming); GLENN GREENWALD, NO PLACE TO HIDE: EDWARD SNOWDEN, THE NSA, AND THE U.S. SURVEILLANCE STATE (2014); DANA PRIEST & WILLIAM M. ARKIN, TOP SECRET AMERICA: THE RISE OF THE NEW AMERICAN SECURITY STATE 128–55 (2011); ROBERT O’HARROW, JR., NO PLACE TO HIDE (2006). 100. See, e.g., Stumpf, Getting to Work, supra note 14, at 412 (“DHS[’s mission] is broader than immigration. Its central focus is national security. What E-Verify provides to DHS is information about every U.S. employee who passes through the computerized system. It may be tempting . . . to use that information or E-Verify's call-in apparatus for purposes beyond immigration enforcement.”). 101. See, e.g., Hu, Taxonomy of the Snowden Disclosures, supra note X. 102. See, e.g., Hu, supra note X, at 803–05. See generally Crawford, supra note X. Scholars and experts have also juxtaposed small data policing and surveillance practices with big data policing and surveillance practices as a way to deepen the legal and constitutional discourse. See, e.g., Ferguson, supra note X, at 329 (discussing what happens if particularized small data suspicion is replaced by big data suspicion in the context of predictive policing practices); Toomey & Kaufman, supra note X, at 847 (describing a “notice paradox” whereby in a small data surveillance world notice was usually given because searches were limited to a “physical world,” whereas big data surveillance methods provided the government significant control over when, and to whom, notice is given). 103.
See, e.g., Mark Andrejevic, Surveillance in the Big Data Era, in EMERGING PERVASIVE INFORMATION AND COMMUNICATION TECHNOLOGIES (PICT): ETHICAL CHALLENGES, OPPORTUNITIES, AND SAFEGUARDS 56 (Kenneth D. Pimple ed., 2014) (“[I]n the era of ‘big data’ surveillance, the imperative is to monitor the population as a whole: otherwise it is harder to consistently and reliably discern useful patterns.”).

activity are now evolving into normalized governing protocols, particularly under a “paradigm of prevention.”104 Big data-driven governance systems may be less visible due to the “black box” nature of big data technologies, and thus more difficult to challenge in a vastly complex administrative state.105 What is referred to as the National Surveillance State is the most conspicuous manifestation of the phenomenon of big data governance.106 It reflects the governmental integration of a philosophical approach to digital data with other prevailing governance structures and philosophies.107 In the same way that big data is altering the nature of the market into a digital economy in which a person is reduced to a consumer profile of data points subject to corporate surveillance, analysis, and exploitation, so the National Surveillance State is seizing upon the citizenry’s data trails for purposes of regulating and policing the big data state.108 In fact, “bureaucratic capitalism” and the National Surveillance State are mutually reinforcing, particularly in their drive to develop tools to expand the reach of big data systems.109 Evgeny Morozov warns of the dangers of an “emerging data-centric capitalism.”110 Indeed, he interrogates the need for more protections

104. Cole, supra note 2 (explaining that the “paradigm of prevention” includes many tools, including “the use of pretextual charges for preventive detention, the expansion of criminal liability to prohibit conduct that precedes terrorism, and expansion of surveillance at home and abroad.”). 105. See, e.g., Balkin, National Surveillance State, supra note 1; Balkin & Levinson, supra note 1; Jennifer C. Daskal, Pre-Crime Restraints: The Explosion of Targeted, Noncustodial Prevention, 99 CORNELL L. REV. 327 (2014); Joshua A.T. Fairfield & Erik Luna, Digital Innocence, CORNELL L. REV. (2014); David Gray & Danielle Keats Citron, The Right to Quantitative Privacy, 98 MINN. L. REV. 62 (2013); Lior Jacob Strahilevitz, Signaling Exhaustion and Perfect Exclusion, 10 J. ON TELECOMM. & HIGH TECH. L. 321 (2012); David Lyon, Biometrics, Identification and Surveillance, 22 BIOETHICS 499 (2008). 106. Balkin, National Surveillance State, supra note 1. 107. See, e.g., Julie Cohen, The Biopolitical Public Domain, GEORGETOWN LAW CTR. (Sept. 28, 2015), http://papers.ssrn.com/sol3/Papers.cfm?abstract_id=2666570. 108. MAYER-SCHÖNBERGER & CUKIER, supra note X, at 157 (contending that “the new thinking is that people are the sum of [the data]”). “And because the government knows whom it will want to scrutinize, it collects, stores, or ensures access to information not necessarily to monitor everyone at all times, but so that when someone falls under suspicion, the authorities can immediately investigate rather than having to start gathering the info from scratch.” Id. 109. See, e.g., MAURICE MEISNER, THE DENG XIAOPING ERA 300 (1996) (defining “bureaucratic capitalism” as the “use of political power and official influence for private pecuniary gain through capitalist or quasi-capitalist methods of economic activity.”). 110.
Evgeny Morozov, Digital Technologies & the Future of Data Capitalism, SOCIAL EUROPE (June 23, 2015), https://www.socialeurope.eu/2015/06/digital-technologies-and-the-future-of-data-capitalism/.

We must take the matter of digital identity completely out of commercial jurisdiction and instead turn it into a public good. Think of this as the intellectual infrastructure of the data-centric capitalism. . . . To reiterate: If we are faced with emerging data-centric capitalism, then the only way to guarantee that citizens won’t be crushed by it is to

for an increasingly vulnerable citizenry to counteract the imbalance of power created by those who control big data technologies.111 Similarly, scholars have identified the risks associated with “informational capitalism” as stemming from consumption-oriented surveillance.112 The reach of big data cybersurveillance and dataveillance, in other words, extends beyond the intelligence context (e.g., the Snowden disclosures) into day-to-day governing systems, due to the administrative nature of the modern National Surveillance State (e.g., the No Work List113 and No Fly List114 operated through federal agencies such as the U.S.

ensure that its main driving force – data – remains squarely in public hands.

Id. 111. Id. 112. THOMAS ALLMER, TOWARDS A CRITICAL THEORY OF SURVEILLANCE IN INFORMATIONAL CAPITALISM (2012). 113. E-Verify has been referred to as a “No Work List.” See Press Release, ACLU, Employment Verification Would Create a ‘No Work List’ in the United States (May 6, 2008), https://www.aclu.org/immigrants-rights/employment-verification-would-create-‘no-work-list’-us; Jim Harper, Immigration Reform: REAL ID and a Federal ‘No Work’ List, CATO INST. (June 14, 2007), http://www.cato.org/publications/techknowledge/immigration-reform-real-id-federal-no-work-list. E-Verify is a “pilot” program jointly operated by the U.S. Department of Homeland Security (DHS) and the Social Security Administration (SSA) that enables employers to screen employees’ personally identifiable data (e.g., name, birthdate, and Social Security Number) through government databases over the Internet to “verify” the identity and employment eligibility of the employee. U.S. DEP’T OF HOMELAND SEC., U.S. CITIZENSHIP & IMMIGRATION SERVS., E-VERIFY USER MANUAL FOR EMPLOYERS 1 (2014), http://www.uscis.gov/sites/default/files/files/nativedocuments/E-Verify_Manual.pdf. E-Verify is referred to as the “Basic Pilot Program” in the Illegal Immigration Reform and Immigrant Responsibility Act of 1996 (IIRIRA) and in subsequent congressional action extending its funding. Id. at 77–78; Basic Pilot Extension Act of 2001, Pub. L. No. 107-128, 115 Stat. 2407, 2407 (2002) (codified at 8 U.S.C. §§ 1101, 1324a (2012)); Basic Pilot Program Extension and Expansion Act of 2003, Pub. L. No. 108-156, 117 Stat. 1944, 1944 (codified at 8 U.S.C. §§ 1101, 1324a (2012)). For a thorough discussion of E-Verify and its legal implications, see Juliet Stumpf, Getting to Work: Why Nobody Cares About E-Verify (And Why They Should), 2 U.C. IRVINE L. REV. 381 (2012). 114. See, e.g., JEFFREY KAHN, MRS. 
SHIPLEY’S GHOST: THE RIGHT TO TRAVEL AND TERRORIST WATCHLISTS 137–53 (2013); Vision 100—Century of Aviation Reauthorization Act, Pub. L. No. 108-176, 117 Stat. 2490, 2568 (2003) (codified as amended at 49 U.S.C. § 44903 (2012)); Secure Flight Program, 73 Fed. Reg. 64,018, 64,019 (Oct. 28, 2008) (codified at 49 C.F.R. §§ 1540, 1544, 1560); Press Release, Transp. Sec. Admin., TSA to Test New Passenger Pre-Screening System (Aug. 26, 2004), http://www.tsa.gov/press/releases/2004/08/26/tsa-test-new-passenger-pre-screening-system (describing the implementation of a post-9/11 passenger prescreening program that checks passengers’ names against terrorist watchlists to improve the use of “no fly” lists). The Computer Assisted Passenger Prescreening System (CAPPS II) relies upon the Passenger Name Record (PNR) database, checks the passenger’s data against the Transportation Security Administration’s (TSA) “No-Fly” list and FBI lists, and assigns a terrorist “risk score” through statistical algorithms. See Press Release, U.S. Dep’t of Homeland Sec., CAPPS II: Myths and Facts (Feb. 13, 2004), http://www.hsdl.org/?view&did=478534.

Department of Homeland Security).115 In this way, big data governance tools can function under a type of de jure discrimination structure (e.g., propagating a form of “Techno Jim Crow” through delegated or captured big data programs).116 Big data systems can now mediate privileges such as the right to fly,117 work,118 drive,119 or vote.120 These big data systems rely upon automated or semi-

115. Various scholars have examined the broader national security implications of the creation of the U.S. Department of Homeland Security and its programs and the policies of its subcomponents. See, e.g., MARIANO-FLORENTINO CUÉLLAR, GOVERNING SECURITY: THE HIDDEN ORIGINS OF AMERICAN SECURITY AGENCIES (2013); Margaret Hu, Big Data Blacklisting, 67 FLA. L. REV. 1735 (2015); Shirin Sinnar, Institutionalizing Rights in the National Security Executive, 50 HARV. C.R.-C.L. L. REV. 289 (2015). 116. See, e.g., ROBERT O’HARROW, JR., NO PLACE TO HIDE 44–56 (2006); Choicepoint, EPIC, http://epic.org/privacy/choicepoint/. “The Justice Department (DOJ) has signed a $67 million contract with ChoicePoint to provide the FBI, Immigration and Naturalization Service, Border Patrol and other law enforcement agencies with access to ChoicePoint’s 13 billion files.” ChoicePoint Sells Personal Data to U.S., PEOPLE’S WORLD (May 7, 2003), http://transitional.pww.org/choicepoint-sells-personal-data-to-u-s/. 117. See, e.g., Latif v. Holder, 28 F. Supp. 3d 1134 (D. Or. 2014) (holding that the No Fly List is procedurally defective under due process protections); Ibrahim v. Department of Homeland Security, 62 F. Supp. 3d 909 (N.D. Cal. 2014) (finding that Ibrahim’s inclusion on the No Fly List was constitutionally impermissible). 118. See, e.g., Puente Arizona v. Arpaio, No. CV-14-01356-PHX-DBC (D. Ariz. Jan. 5, 2015). In Puente Arizona v. Arpaio, the plaintiffs challenged the constitutionality of state statutes “that criminalize the act of identity theft done with the intent to obtain or continue employment.” Id. at 1. The state statutes in question required employers to use E-Verify and included provisions to ensure employers’ participation in the E-Verify program: the “Legal Arizona Workers Act,” 2007 Ariz. Legis. Serv. Ch. 279 (H.B. 2779) (West), and “Employment of Unauthorized Aliens,” 2008 Ariz. Legis. Serv. Ch. 152 (H.B. 2745) (West). 119. 
Prior to issuing driver’s licenses, many states now screen individuals through SAVE (Systematic Alien Verification for Entitlements), a database screening program operated by the U.S. Department of Homeland Security. See, e.g., Applying for a Driver’s License or State Identification Card, U.S. DEP’T OF HOMELAND SEC. 2, http://www.ice.gov/doclib/sevis/pdf/dmv_factsheet.pdf (last updated Sept. 5, 2012) (“Most states and territories use the Systematic Alien Verification for Entitlements (SAVE) Program to determine a non-citizen’s eligibility for many public benefits, including the issuance of a driver’s license.”). 120. See, e.g., Arcia v. Fla. Sec’y of State, 772 F.3d 1335, 1348 (11th Cir. 2014) (reversing and remanding the district court’s grant of judgment as a matter of law to the Secretary of State of Florida and holding that the SAVE database screening program for voter purges violated the 90 Day Provision of the National Voter Registration Act). The original complaint alleged that the SAVE database screening program aimed at removing non-citizens from voter registration rolls violated Section 2 of the Voting Rights Act, 42 U.S.C. § 1973, asserting protection for “citizens having ‘less opportunity than other members of the electorate to participate in the political process and to elect the representatives of their choice.’” Arcia v. Detzner, 1:12-CV-22282, 1 (S.D. Fla. June 19, 2012). See also Help America Vote Act of 2002 (HAVA), Pub. L. No. 107-252, 116 Stat. 1666, 1666–1730 (2002) (codified as amended at 42 U.S.C. §§ 15,301–545 (2012)). HAVA requires each state to implement and maintain an electronic database of all registered voters. 42 U.S.C. § 15483(a). HAVA also requires states to verify the identity of the voter registration application through cross-checking the applicant’s driver’s license or last four digits of the applicant’s Social Security Number. Id. § 15,483(a)(5)(A)(i). If the individual

automated database screening protocols. Increasingly, in the name of national security and border security, post-9/11 big data programs allow core rights and freedoms to be partially or completely obstructed.121 Moreover, because of the virtual and classified or quasi-classified122 nature of database screening protocols, the digital mediation of, and potential interference with, liberty interests can occur without our knowledge or consent.123 Indeed, governance by big data methods seems normal because we are immersed in a big data culture. Many of our social and commercial interactions are digitally mediated and leave data trails; thus, governance can proceed through monitoring by private and public interests alike. Critical Data Theory is needed to de-normalize the digital world. This de-normalization can proceed through a close assessment of the impact of new big data governance methods that are significantly distinct from small data methods. In an Information Society where individuals actively construct their data selves, security “apparatuses”124 or “surveillant

does not have either number, the state must assign a voter identification number to the applicant. Id. § 15,483(a)(5)(A)(ii). Each state election office oversees election rules and procedures for that state in the implementation of HAVA. President Signs H.R. 3295, “Help America Vote Act of 2002,” SOC. SEC. ADMIN. (Nov. 7, 2002), http://www.ssa.gov/legislation/legis_bulletin_110702.html [hereinafter President Signs HAVA]. Excellent research has been conducted in recent years on these emerging developments in election law. See, e.g., Rebecca Green, Rethinking Transparency in U.S. Elections, 75 OHIO ST. L.J. 779 (2014); Daniel P. Tokaji, HAVA in Court: A Summary and Analysis of Litigation, 12 ELECTION L.J. 203 (2013); Daniel P. Tokaji & Paul Gronke, The Party Line: Assessing the Help America Vote Act, 12 ELECTION L.J. 111 (2013); Martha Kropf, North Carolina Election Reform Ten Years After the Help America Vote Act, 12 ELECTION L.J. 179 (2013). See also Hu, Big Data Blacklisting, supra note 1 (discussion in Parts II.A.2 and III.A.2). 121. The recent White House Report on the consequences of big data, including government-led big data programs, recognizes that big data technologies can profoundly influence rights and privileges. See, e.g., PODESTA REPORT, supra note 6. 122. For the purposes of this Article, certain programs, such as the Terrorist Watchlist and No Fly List, are referred to as classified or quasi-classified. While these programs themselves are not technically classified, the government has explained that they are informed by classified information. “The term ‘classified information’ means information which … is, for reasons of national security, specifically designated by a United States Government Agency for limited or restricted dissemination or distribution[.]” 18 U.S.C. § 798(b). 123. This Article does not address private big data blacklisting consequences; however, important scholarship is being conducted on this subject. 
See, e.g., FRANK PASQUALE, THE BLACK BOX SOCIETY 101–03 (2015) (describing private credit scoring regimes and computerization of the finance sector); Danielle Keats Citron & Frank Pasquale, The Scored Society: Due Process for Automated Predictions, 89 WASH. L. REV. 1, 3–4 (2014) (discussing algorithmic and scoring systems implemented by various individuals or companies that use data to make decisions on characterizing a person in numerous aspects of society). 124. MICHEL FOUCAULT, SECURITY, TERRITORY, POPULATION: LECTURES AT THE COLLÈGE DE FRANCE 1977–1978, at 45, 48–49 (Graham Burchell trans., 2007) (explaining “apparatuses of security” as including multiple governing characteristics such as “the constant tendency to expand; they are centrifugal. . . . Security therefore involves organizing, or anyway allowing the development of ever-wider circuits.”).

assemblages”125 are likewise able to construct parallel data selves of those whom they govern. The citizenry may present itself digitally under a sense of autonomy. Simultaneously, however, simply by entering the spheres of the digital economy and Information Society, the citizenry submits digitally to the newly emerging governance methods of a big data world. In the case of database screening and digital watchlisting systems, for instance, this involuntary submission to the monitoring of the data self will often occur without notice or consent.126 The consequences of big data governance may not be understood by the small data citizen127 nor, in some instances, by those tasked with administering big data tools. The ability to grasp the consequences may be even more attenuated when the government has delegated the task of deploying big data tools to a state or corporate entity. By removing the human element through big data-facilitated decisionmaking,

125. Sean P. Hier, Probing the Surveillant Assemblage: On the Dialectics of Surveillance Practices as Processes of Social Control, 1 SURVEILLANCE & SOC’Y 399, 400 (2003).

[Surveillant assemblages] denote the increasing convergence of once discrete systems of surveillance. They [Kevin Haggerty and Richard Ericson] argue that the late modern period has ushered in the proliferation of information and data gathering techniques which operate to break the human body into a number of discrete signifying data flows. Reassembled as ‘functional hybrids’ whose unity is found solely in temporal moments of interdependence, resulting surveillance simulations bring together a seemingly limitless range of information to formulate categorical images or risk data profiles which render otherwise opaque flows of information comprehensible.

Id. (citing Kevin D. Haggerty & Richard V. Ericson, The Surveillant Assemblage, 51 BRIT. J. SOC. 605 (2000)). 126. See, e.g., PRIVACY AND POWER: A TRANSATLANTIC DIALOGUE IN THE SHADOW OF THE NSA-AFFAIR (Russell A. Miller ed., forthcoming 2016); Anjali S. Dalal, Shadow Administrative Constitutionalism and the Creation of Surveillance Culture, 2014 MICH. ST. L. REV. 61 (2014); Anil Kalhan, Immigration Surveillance, 74 MD. L. REV. 1 (2014); Patrick Toomey & Brett Max Kaufman, The Notice Paradox: Secret Surveillance, Criminal Defendants, & the Right to Notice, 54 SANTA CLARA L. REV. 843 (2014); Paul Ohm, Electronic Surveillance Law and the Intra-Agency Separation of Powers, 47 U.S.F. L. REV. 269 (2012); Chris Jay Hoofnagle, Big Brother’s Little Helpers: How ChoicePoint and Other Commercial Data Brokers Collect and Package Your Data for Law Enforcement, 29 N.C. J. INT’L L. & COM. REG. 595 (2004). 127. See, e.g., JARON LANIER, YOU ARE NOT A GADGET: BEING HUMAN IN AN AGE OF TECHNOLOGY (2011); Margaret Hu, Protecting Small Data Citizens’ Substantive Due Process Rights in a Big Data World (forthcoming) (positioning small data citizen rights as uniquely threatened by big data governance ambitions and capabilities); Joel Achenbach, Meet the Dissenters: They’re Fighting For a Better Internet, WASH. POST (Dec. 26, 2015), http://www.washingtonpost.com/sf/national/2015/12/26/resistance/. See also Ryan Calo, Digital Market Manipulation, 82 GEO. WASH. L. REV. 995, 999 (2014) (“[T]he digitization of commerce dramatically alters the capacity of firms to influence consumers at a personal level. [Big data] . . . will empower corporations to discover and exploit the limits of each individual consumer’s ability to pursue his or her own self-interest. Firms will increasingly be able to trigger irrationality or vulnerability in consumers[.]”).

big data policymaking may unfold in a “black box” manner, for example, through an over-reliance on algorithmic intelligence tools.128 Critical Theory requires that we understand how this digital culture inflects our thinking, including our legal analysis and policymaking protocols. Doing so advances an understanding of how our private autonomy is narrowed through big data and, from that perspective, a theorization of how the law must evolve to safeguard it.

B. Big Data Governance Revolution

To understand why big data governance is disruptive,129 it is helpful to contrast the governance ambitions that prevailed in a small data world with the governance ambitions that now prevail in a big data world. Viktor Mayer-Schönberger and Kenneth Cukier anchor the paradigmatic nature of big data by contrasting the conception of a small data world with a newly emerging conception of a big data world.130 Because of its transformative potential, the movement away from a small data world toward a big data world has been characterized as a “revolution.”131 Like the industrial revolution, big data signals a historically significant methodological and philosophical shift in how we approach and perceive information, and in what we accept as efficiencies of decisionmaking and production.132 Unlike the industrial revolution, however, the big data revolution is taking place in the midst of what has been termed the Information Society133 or the digital age. Therefore, as danah boyd and Kate Crawford explain, big data creates new forms of knowledge and reshapes the processes by which we produce knowledge and perception.134 Julie Cohen builds on this concept of big data as a device of knowledge creation: “Together, the technology and the process comprise a technique for

128. See, e.g., PASQUALE, supra note 10; Citron & Pasquale, The Scored Society, supra note 10; Citron, Technological Due Process, supra note 10; COHEN, supra note 10. 129. See, e.g., MAYER-SCHÖNBERGER & CUKIER, supra note 4. 130. MAYER-SCHÖNBERGER & CUKIER, supra note X, at 13. 131. Id. 132. See Lyon, supra note X, at 6. 133. One definition of “global information society” offers the following description: “[The global information society] recognizes that science and technology co-exist in a world where technology diminishes geographic, temporal, social, and national barriers to discovery, access, and use of data.” REPORT OF THE INTERAGENCY WORKING GROUP ON DIGITAL DATA TO THE COMMITTEE ON SCIENCE OF THE NATIONAL SCIENCE AND TECHNOLOGY COUNCIL 17 (2009), https://www.whitehouse.gov/files/documents/ostp/opengov_inbox/harnessing_power_web.pdf. 134. danah boyd & Kate Crawford, Critical Questions for Big Data: Provocations for a Cultural, Technological, and Scholarly Phenomenon, 15 INFO., COMM. & SOC’Y 662, 662–79 (2012).

converting data flows into a particular, highly data-intensive type of knowledge.”135 Currently, there is no agreed-upon definition of what constitutes “small data” or “big data.” “Generally, small data is thought of as solving discrete questions with limited and structured data, and the data are generally controlled by one institution.”136 Explained another way, the small data world is the world that we think we know:137 a universe of knowledge that humans can see, touch, analyze, and perceive without the assistance of supercomputing capabilities. Small data involves things that humans can create and grasp using human judgment alone. Common definitions of big data often involve a technological prerequisite, such as advanced computing storage and processing capacity. Definitions of big data also require that the data be sufficiently large to qualify as “big data.” Almost invariably, these definitions expressly or implicitly preclude human storage and processing capacity: if a human can comprehend the data without computing and algorithmic assistance, it is not big data. The big data revolution is presenting new challenges to a variety of disciplines, and understanding the highly technical nature of the topic has become essential to grasping the implications of big data for the law.138 To understand the data science that underpins big data, the meaning of the term and the phenomenon of big data itself must first be explored. Only through exploration of this highly technical topic can a fuller and more informed statutory and constitutional inquiry be realized. Kate Crawford and Jason Schultz explain that “‘Big Data’ is a generalized, imprecise term that refers to the use of large data sets in data science and predictive analytics.”139 Julie Cohen offers this definition of big data: “Big Data” is shorthand for the combination of a technology and a process. The technology is a configuration of

135. Cohen, supra note X. 136. Ferguson, supra note X, at 329 n.6 (citing JULES J. BERMAN, PRINCIPLES OF BIG DATA: PREPARING, SHARING, AND ANALYZING COMPLEX INFORMATION 1–2 (2013)). 137. MAYER-SCHÖNBERGER & CUKIER, supra note 4, at 17–18. 138. Roughly speaking, big data, as an evolving field of research and academic study, appears to involve, for example, the interrogation of a new science (i.e., what has been termed “data science” or “big data science” and “big data engineering”); newly emerging big data tools and methods (e.g., capturing, storing, and analyzing the data generated by the Internet and Social-Mobile-Cloud technologies); big data products (e.g., interoperability among databases, big data mass integration, big data visualization and data pattern mapping, results of predictive analytics, etc.); frameworks for guiding or managing big data (e.g., big data ethics, big data protocols to maintain data integrity); big data end results (e.g., benefits to other knowledge and science pursuits like epidemiology, delivery of consumer services, improvement of decisionmaking in the public or private sectors, etc.); and the unintended consequences of big data (e.g., discriminatory inferences and disparate impact), to identify just a few sub-categories of big data inquiry. 139. Crawford & Schultz, supra note X, at 96.

2016] CRITICAL DATA THEORY 29

information-processing hardware capable of sifting, sorting, and interrogating vast quantities of data in very short times. The process involves mining the data for patterns, distilling the patterns into predictive analytics, and applying the analytics to new data.140 The most widely recognized definition of big data is anchored by several data-specific characteristics, often referred to as the “3-Vs” of big data: Volume, velocity, and variety.141 “Technologists often use the technical ‘3-V’ definition of big data as ‘high-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.’”142 Increasingly, experts note that a fourth “V” of big data involves the veracity and reliability of the underlying data.143 Still other experts, such as Rob Kitchin, have identified additional data-specific characteristics of big data, including: “[E]xhaustive in scope, striving to capture entire populations of systems; fine-grained resolution, aiming at maximum detail, while being indexical in identification; relational, with common fields that enable the conjoining of different data-sets; flexible, with traits of extensionality (easily adding new fields) and scalability (the potential to expand rapidly).”144 The federal government has recently adopted several definitions of big data. For example, the White House described big data and its implications in its report Big Data: Seizing Opportunities, Preserving Values,145 often referred to as the “Podesta Report,” as the report’s inquiry

140. Julie E. Cohen, What Privacy Is For, 126 HARV. L. REV. 1904, 1920–21 (2013). 141. Id. 142. Neil M. Richards & Jonathan H. King, Big Data Ethics, 49 WAKE FOREST L. REV. 393, 394 n.3 (2014) (quoting IT Glossary: Big Data, GARTNER, http://www.gartner.com/it-glossary/big-data/ (last visited May 11, 2015)); see id. (citing the original “3-V” big data report, DOUG LANEY, 3D DATA MANAGEMENT: CONTROLLING DATA VOLUME, VELOCITY, AND VARIETY (2001), http://blogs.gartner.com/doug-laney/files/2012/01/ad949-3D-Data-Management-Controlling-Data-Volume-Velocity-and-Variety.pdf). 143. Big Data Conundrum, supra note X (“In 2001, a Meta (now Gartner) report noted the increasing size of data, the increasing rate at which it is produced and the increasing range of formats and representations employed. This report predated the term ‘big data’ but proposed a three-fold definition encompassing the ‘three Vs’: Volume, Velocity and Variety. This idea has since become popular and sometimes includes a fourth V: veracity, to cover questions of trust and uncertainty.”). 144. Lyon, Snowden, supra note X, at 5 (citing the work of Rob Kitchin); see also KITCHIN, supra note X. 145. EXEC. OFFICE OF THE PRESIDENT, BIG DATA: SEIZING OPPORTUNITIES, PRESERVING VALUES (2014) [hereinafter PODESTA REPORT], https://www.whitehouse.gov/sites/default/files/docs/big_data_privacy_report_may_1_2014.pdf.

was led by John Podesta, Counselor to the President.146 The Podesta Report quotes a National Science Foundation document, titled Core Techniques and Technologies for Advancing Big Data Science & Engineering, stating: “Big datasets are ‘large, diverse, complex, longitudinal, and/or distributed datasets generated from instruments, sensors, Internet transactions, email, video, click streams, and/or all other digital sources available today and in the future.’”147 The National Institute of Standards and Technology further explains that big data “‘exceed(s) the capacity or capability of current or conventional methods and systems.’”148 “In other words, the notion of ‘big’ is relative to the current standard of computation.”149 According to some computer scientists, “[b]ig data is a term describing the storage and analysis of large or complex data sets using a series of techniques[.]”150 The word “big” in the term “big data,” however, is misleading.151 In another recent White House report, Big Data and Privacy: A Technological Perspective, submitted by the President’s Council of Advisors on Science and Technology, the Council explains that big data is not only about size, but also about new forms of knowledge creation, data-driven decisionmaking, and the inferences that can be supported by analytics.152 Big data can be construed as “big in two different senses. It is big in the quantity and variety of data that are available to be processed. And, it is big in the scale of analysis (termed ‘analytics’) that can be applied to those data, ultimately to make inferences and draw conclusions.”153 Moreover, most experts agree that big data relies upon supercomputing and machine learning or artificial intelligence tools and, therefore, by its very definition, big data exceeds human capacity to make sense of the data without the assistance of algorithmic tools and other computer-enabled devices.154

146. EXEC. OFFICE OF THE PRESIDENT, BIG DATA: SEIZING OPPORTUNITIES, PRESERVING VALUES, INTERIM PROGRESS REPORT (2015), https://www.whitehouse.gov/sites/default/files/docs/20150204_Big_Data_Seizing_Opportunities_Preserving_Values_Memo.pdf. 147. PODESTA REPORT, supra note 145, at 3 (quoting NATIONAL SCIENCE FOUNDATION, SOLICITATION 12-499, CORE TECHNIQUES AND TECHNOLOGIES FOR ADVANCING BIG DATA SCIENCE & ENGINEERING (2012), http://www.nsf.gov/pubs/2012/nsf12499/nsf12499.htm). 148. Big Data Conundrum, supra note X. 149. Id. (emphasis omitted). 150. Id. 151. “The MIKE [Method for an Integrated Knowledge Environment] project argues that big data is not a function of the size of a data set but its complexity. Consequently, it is the high degree of permutations and interactions within a data set that defines big data.” Id. 152. PCAST REPORT, supra note X. 153. Id. at ix. 154. See, e.g., MAYER-SCHÖNBERGER & CUKIER, supra note 4, at 11–12 (“Though it is described as part of the branch of computer science called artificial intelligence, and more specifically, an area called machine learning, this characterization is misleading. Big data is not trying to ‘teach’ a computer to ‘think’ like humans. Instead, it’s about applying math to huge quantities of data in order to infer probabilities[.]”).


The characteristic that helps to define the transformational nature of big data technologies is the predictive aspects of the data-driven knowledge. Therefore, a core distinction that separates small data surveillance from big data cybersurveillance is the fact that, historically, information gathering was of an ex post nature, not an ex ante nature.155 Jack Balkin elaborates this point: “Older models of law enforcement have focused on apprehension and prosecution of wrongdoers after the fact and the threat of criminal or civil sanctions to deter future bad behavior. The National Surveillance State supplements this model of prosecution and deterrence with technologies of prediction and prevention.”156 Concurrent with that shift in focus from ex post to ex ante decisionmaking is an exponential growth in the need for data to be analyzed.157 Ex post information can be limited and focused on specific suspects and events that have triggered the need for the surveillance. Ex ante surveillance seeks to discover suspects before they become suspicious, so to speak, and seeks to identify future events before they occur in order to intervene beforehand.158 Doing this requires—and is facilitated by—the datafication of reality in a big data world. In other words, big data cybersurveillance tools appear to be radically changing what the government considers to be the full body of evidence that allows for the careful examination of security- and defense-driven inquiries.159 A primary significance of these distinctions is that, in a big data world, the investigative method has been flipped on its head. Ira “Gus” Hunt, Chief Technology Officer of the Central Intelligence Agency (CIA), explains why the investigative process has been flipped upside down: When I started as analyst years ago inside the CIA, the world was pretty simple. It was the world of the few to the many in terms of information flows . . . . 
The Social Mobile Cloud world has completely inverted that model and has gone to this complex many-to-many model.160 He further suggests that, because of the inversion of information flows resulting from a big data world, the nature of the investigatory inquiry has

155. Balkin, National Surveillance State, supra note X, at 10–11 (“Governance in the National Surveillance State is increasingly statistically oriented, and preventative, rather than focused on deterrence and ex post prosecution of individual wrongdoing.”). 156. Id. at 10 (footnote omitted) (citing Scott Charney, The Internet, Law Enforcement, and Security, in 2 PRACTICING L. INST., FIFTH ANNUAL LAW INSTITUTE 943–44 (Ian C. Ballon et al. eds., 2001)). 157. See Ian Kerr & Jessica Earle, Prediction, Preemption, Presumption: How Big Data Threatens Big Picture Privacy, 66 STAN. L. REV. ONLINE 65, 66 (2013). 158. See Balkin, National Surveillance State, supra note X, at 15–16. 159. See Hu, supra note X, at 1479–80 (describing recently introduced forms of “biometric ID surveillance”). 160. See Hunt CIA Presentation, supra note X.

flipped upside down as well. He specifically elaborates that in a small data world, you “move data to the question[.]”161 Small data requires the analyst to start with the question or hypothesis and then assess the small data evidence that may be available to assist in the inquiry through human judgment and human evaluative processes. Put differently, in a small data world, the accumulation of data in response to a thesis enables the government to drill down vertically on a particular suspect or terrorist target.162 The vertical collection of data is accumulated to isolate the key pieces of fact that can prove or disprove the thesis.163 The vertical data position can consist of drilling down on any potential suspect or specific target.164 The vertical nature of small data investigations allows the investigator to drill down on one suspect at a time to extract the relevant data from a mass of seemingly irrelevant data. In a small data world, investigators start with a thesis or a suspect and build support that allows for the gathering of small data evidence. For example, in the prosecutorial context, this evidence is capable of supporting a conclusion as to whether the arrest and prosecution of an individual is warranted. The question of who is a suspect and whether he or she committed the crime leads to the gathering of evidence that can support a conviction. The analyst asks whether the small data can responsibly further government action—a warrant, an arrest, or a prosecution—premised upon the evidence, such as fingerprints, witnesses, etc. In a small data world, resource and technological constraints restricted the government to the investigation of individual suspects.165 Suspects were identified by small data methods, for example, the utilization of sensory-

161. Within Hunt’s PowerPoint slides, he includes one titled, “Tectonic Technology Shifts.” The slide juxtaposes “Traditional Processing” and “Mass Analytics/Big Data.” Under “Traditional Processing” of data, Hunt identifies “Move Data to Question” as a characteristic of small data. Id. 162. See BACHNER, supra note X, at 24 (“The social network analysis allowed the detective on the case ‘to efficiently and effectively move his personnel resources to strategically navigate the suspect into the hands of the police.’”). 163. See, e.g., Slobogin, Panvasive Surveillance, supra note X; Ferguson, supra note X; WALTER L. PERRY ET AL., RAND CORP., PREDICTIVE POLICING: THE ROLE OF CRIME FORECASTING IN LAW ENFORCEMENT OPERATIONS 11–13 (2013), http://www.rand.org/content/dam/rand/pubs/research_reports/RR200/RR233/RAND_RR233.pdf (discussing the use of data in predictive policing). 164. See, e.g., Lyon, Snowden, supra note X; Slobogin, Panvasive Surveillance, supra note X; Ferguson, supra note X; PERRY ET AL., supra note X, at 67. This method has allowed the Memphis Police Department to use big data and “respond to predicted threats before a criminal act is committed.” PERRY ET AL., supra note X, at 67. 165. Daniel J. Solove, Digital Dossiers and the Dissipation of Fourth Amendment Privacy, 75 S. CAL. L. REV. 1083, 1149–51 (2002); see also Kevin S. Bankston & Ashkan Soltani, Tiny Constables and the Cost of Surveillance: Making Cents Out of United States v. Jones, 123 YALE L.J. ONLINE 335 (2014).

based tools and analog-based investigatory methods. As has been well-documented, human misconceptions of a perceived threat, such as racial profiling, and human fallibility, such as faulty eyewitness reporting or inaccurate human intelligence, have led to false targeting.166 As we transition from a small data world to a big data world, it appears that the government may be at the earliest stages of attempting to merge small data evidence and big data evidence for prosecutorial purposes.167 In a big data world, Hunt explains that the inquiry involves “mov[ing] the question to the data[.]”168 The analyst starts with the big data evidence that has been amassed and that is available for technologically-derived insights. Big data allows for the assessment of the question or hypothesis that might be illuminated by the data through big data tools. For example, these tools include data mining and pattern-based analysis, database screening, statistical modeling and algorithms, predictive analytics, and other supercomputing capacities and artificial intelligence tools. Because of the ubiquity of data in a big data world, investigators and analysts start with the data. Big data programs can function to gather the data in order to begin formulating questions. To return to the prosecutorial context, instead of forming a thesis about a potential suspect after the crime has already occurred, big data can be accumulated and analyzed to allow the formulation of theses about who is likely to commit a criminal or terrorist act before any event occurs.169 The thesis can be probabilistic in nature, with the presumption that the broader the array of data analyzed or scope of data integrated, the more accurate the data analytic or algorithmic results will become. The predictive big data “thesis” relies upon digital data and, unlike the small data thesis, does not rely upon suspect persons or criminal actions that have already occurred.
Statistically, preemptive action may appear to be justified in the eyes of the government.170 The big data thesis may not be subject to rigorous

166. See, e.g., TOM R. TYLER, WHY PEOPLE OBEY THE LAW 117 (1990). 167. See, e.g., Balkin, National Surveillance State, supra note X; Daskal, supra note X; Ferguson, supra note X, at 330 (“At some point inference from this personal data (independent of the observation) may become sufficiently individualized and predictive to justify the seizure of a suspect[,]” and “[t]he next phase will use existing predictive analytics to target suspects without any firsthand observation of criminal activity, relying instead on the accumulation of data points.”); see also, e.g., BACHNER, supra note X, at 6. 168. Within Hunt’s PowerPoint slides, he includes one titled, “Tectonic Technology Shifts.” The slide juxtaposes “Traditional Processing” and “Mass Analytics/Big Data.” Under “Mass Analytics/Big Data,” Hunt identifies “Move Question to Data” as a characteristic of big data. Id. 169. See Elizabeth E. Joh, Policing by Numbers: Big Data and the Fourth Amendment, 89 WASH. L. REV. 35, 42–48 (2014). 170. See, e.g., Jennifer C. Daskal, Pre-Crime Restraints: The Explosion of Targeted, Noncustodial Prevention, 99 CORNELL L. REV. 327 (2014); JENNIFER BACHNER, PREDICTIVE

scientific inquiry. The scientific validity of the big data thesis is taken for granted. Statistics carry empirical weight, and thus the validity of big data evidence is assumed. Consequently, the digital construction of digital avatars, and the discernment of data patterns and data analytics, justify the big data thesis.171 The collection of all available digitalized data is presumed to facilitate big data governance aspirations: Border security, crime control, immigration control, counterterrorism objectives, etc. The construction of cybersurveillance and dataveillance architecture allows for the isolation and analysis of suspect data. The risk is that structures designed to find suspect data will find suspect data. Experts have noted that big data systems can result in self-fulfilling prophecies. The systems designed to identify data deemed “suspect” may lead to the discovery of categories or sub-categories of individuals considered suspect. Big data policymaking rationales demand a panoramic vision of all the data in order to see the patterns and tendencies that can make visible and corroborate theories about who is predisposed to criminal or terrorist behaviors.172 Big data analytics have led to an imperative to “collect everything and hang on to it ‘forever’”173 to utilize big data inferences to serve policy objectives. Big data governance, as a result, enables the potential digital investigation of anyone who engages in electronic communications and, consequently, allows for the construction of the digital avatars of anyone with a digital trail or presence in a database. Due to the virtual nature of the digital economy and big data governance systems, this capacity to digitally construct potential threats is nearly impossible to see. Yet, it appears that big data facilitates “virtually” constructing threats in order to flip “virtual” suspects from a horizontal data position into a vertical data position.
The “collect-it-all” philosophy of governance emanates from the fact that under big data logic, “everybody [with digital communications or a digital or data trail] is a target[.]” A horizontal “collect-it-all” approach to data collection can lead to vertically drilling down on a target at any given moment that the government deems necessary to preempt actual threats or threats that have been derived from algorithmic intelligence.174
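The self-fulfilling dynamic described above can be illustrated with a toy simulation. This sketch is not drawn from the Article or from any actual surveillance system; the population size, the number of screened indicators, and the base rate are all invented for illustration. It shows how screening enough unrelated data fields against even a randomly assigned “suspect” label will surface an indicator that appears predictive purely by chance:

```python
import random

random.seed(42)  # deterministic for illustration

N_PEOPLE = 200       # hypothetical population under surveillance
N_INDICATORS = 1000  # hypothetical number of unrelated data fields screened
BASE_RATE = 0.1      # each indicator flags (and the label marks) ~10% at random

# A "suspect" label assigned at random -- it encodes no real-world threat.
suspect = [random.random() < BASE_RATE for _ in range(N_PEOPLE)]
flagged_ids = [i for i, s in enumerate(suspect) if s]

def match_rate(indicator):
    """Fraction of the randomly labeled 'suspects' this indicator happens to flag."""
    return sum(indicator[i] for i in flagged_ids) / len(flagged_ids)

# Screen many indicators of pure random noise and keep the best performer.
best = max(
    match_rate([random.random() < BASE_RATE for _ in range(N_PEOPLE)])
    for _ in range(N_INDICATORS)
)

# Any single noise indicator matches ~10% of "suspects" on average, but the
# best of 1,000 looks far more "predictive" than chance.
print(f"best match rate found by screening: {best:.2f} (chance level: {BASE_RATE})")
```

Because a thousand fields of pure noise are screened, the best-performing one will almost always look several times better than its true chance rate; this is the statistical shape of a system "designed to find suspect data" finding it.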

POLICING: PREVENTING CRIME WITH DATA AND ANALYTICS 14 (2013) (“The fundamental notion underlying the theory and practice of predictive policing is that we can make probabilistic inferences about future criminal activity based on existing data.”). 171. See sources cited supra note X. 172. See Joh, supra note X, at 42–46. 173. See Hunt CIA Presentation, supra note X. 174. See Hunt CIA Presentation, supra note X. Within Hunt’s PowerPoint slides, he includes one titled, “Tectonic Technology Shifts.” Id. The slide juxtaposes “Traditional Processing” and “Mass Analytics/Big Data.” Id. Under “Mass Analytics/Big Data,” Hunt identifies “Horizontal Scaling” of data. Id.; see also Richards & King, Big Data Ethics, supra note X, at 394 (“Peter Mell, a computer scientist with the National Institute of Standards and Technology, similarly constrains big data to ‘[w]here the data volume, acquisition velocity, or data representation limits the ability to perform effective analysis using traditional relational approaches or requires the use of significant horizontal scaling for efficient processing.’” (internal citation omitted)).

Thus, both human small data-driven intelligence and big data-driven intelligence are subject to error. Although most understand human frailty and human fallibility in intelligence gathering, at the earliest stages of the big data revolution, it is more difficult to concede that big data is susceptible to fallibility. Big data precrime or preterrorism initiatives are seemingly justified by statistically-driven evidence.175 However, human biases can be embedded in algorithms, and human misconceptions of a perceived threat can be translated into technological methods for intelligence gathering and automated or semi-automated decisionmaking.176 The scale of big data fallibility stands in sharp contrast to the scale of human fallibility. The creation of artificial intelligence targeting systems that embed misconceptions and fallibilities can potentially lead to “[one] billion false alarms[.]”177 Some experts claim that rather than reflecting reality, big data casts more of a hologram.178 A key question for policymakers, then, is whether big data’s “virtual reality” is more efficacious than the human fallibility of small data “reality,” and whether specific governing protocols should not be subjected to big data’s “virtual reality” evidence or inferences.179 Some experts question whether big data-dependent governance can be fairly championed for its “precision”180 when it carries the known risks of big data: Potential algorithmic house of mirrors,181 statistical echo chamber from overtraining the data,182 confirmation bias ratchet effect,183 or

175. See, e.g., PERRY ET AL., supra note X, at 2–3. 176. See Danielle Keats Citron, Technological Due Process, 85 WASH. U. L. REV. 1249, 1260–61 (2008); Citron & Pasquale, supra note X, at 1 (discussing the use of big data to create algorithms to assess people). 177.
Bruce Schneier, Why Data Mining Won’t Stop Terror, WIRED (Mar. 9, 2006), http://archive.wired.com/politics/security/commentary/securitymatters/2006/03/70357?currentPage=all (noting that a 99% accurate system would “generate 1 billion false alarms for every real terrorist plot it uncovers”). 178. See, e.g., Telephone Interview by Seth Fletcher with Jaron Lanier (Oct. 15, 2013) [hereinafter Lanier Interview], http://www.scientificamerican.com/article/lanier-interview-how-to-think-about-privacy/ (explaining that big data systems, as correlative and statistical in nature, do not reflect “the underlying structure of reality”). 179. See Hu, supra note X, at 839 n.289 (quoting interview with Jaron Lanier in which Lanier posits that big data creates a “‘seductive illusion’” of reality but does not “‘actually reflect the structure of reality.’”). 180. See, e.g., Comments by Glenn Greenwald, Contact Presents: An Evening with Glenn Greenwald, Washington and Lee University, Oct. 20, 2015 (notes on file with author); see also James Risen & Laura Poitras, N.S.A. Collecting Millions of Faces from Web Images, N.Y. TIMES, June 1, 2014, at A1. 181. See, e.g., FINLAY, supra note X; KITCHIN, supra note X. 182. See, e.g., FINLAY, supra note X, at 14.

A major problem when building predictive models is that it’s quite easy to find relationships that are the result of random patterns in the sample, but which don’t exist in the wider population. The result is that if you measure the performance of the model using the development sample . .

boomerang of false positives.184 Understanding exactly why “data is not intelligence”185 in a big data world helps to discern the limits of algorithm-driven decisionmaking, and why further interrogation may be necessary before big data can be ascribed high value and significant evidentiary weight in government decisionmaking. Significant legal and constitutional consequences attach to these rapidly evolving technologies of mass data surveillance and their applications.186
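The arithmetic behind the “billion false alarms” estimate quoted above is a base-rate problem, and can be sketched in a few lines. The event volume and detection rates below are illustrative assumptions rather than Schneier’s exact inputs: a screener that is wrong only one percent of the time, applied to a hundred billion overwhelmingly innocent events, still drowns each true hit in roughly a billion false positives.

```python
# Base-rate arithmetic behind the "billion false alarms" problem. The event
# count and rates below are illustrative assumptions, not Schneier's figures.

EVENTS_SCANNED = 100_000_000_000  # assume 100 billion communications scanned
REAL_PLOTS = 1                    # assume one genuine plot in the stream
FALSE_POSITIVE_RATE = 0.01        # a "99% accurate" screening system
TRUE_POSITIVE_RATE = 1.0          # generously assume the plot is never missed

false_alarms = (EVENTS_SCANNED - REAL_PLOTS) * FALSE_POSITIVE_RATE
true_alarms = REAL_PLOTS * TRUE_POSITIVE_RATE

# Precision: the probability that any given alarm is a real plot.
precision = true_alarms / (true_alarms + false_alarms)

print(f"false alarms: {false_alarms:,.0f}")         # roughly one billion
print(f"chance an alarm is real: {precision:.1e}")  # roughly one in a billion
```

The point of the sketch is that raising a classifier’s “accuracy” does little when the base rate of true targets is vanishingly small; the flood of false positives is driven by the size of the innocent population being scanned.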

C. Criticality and Construction

In both Critical Race Theory and Critical Data Theory, the emphasis on construction is key. The “collect-it-all”187 tools of a big data world

. . They will not be a true representation of how the model performs when presented with new cases when the model is deployed. This problem is called over-fitting.

Id. 183. See, e.g., BERNARD E. HARCOURT, AGAINST PREDICTION: PROFILING, POLICING, AND PUNISHING IN AN ACTUARIAL AGE 145–72 (2006). 184. K.N.C., The Backlash Against Big Data, THE ECONOMIST (Apr. 20, 2014, 23:50), http://www.economist.com/blogs/economist-explains/2014/04/economist-explains-10#sthash.A0hYhXSU.dpuf (“[T]he risk of spurious correlations—associations that are statistically robust but happen only by chance—increases with more data. Although there are new statistical techniques to identify and banish spurious correlations, such as running many tests against subsets of the data, this will always be a problem.”). 185. “Data is not intelligence” is a quote from an NSA document released by the Snowden disclosures. See Peter Maass, Inside NSA, Officials Privately Criticize “Collect It All” Surveillance, INTERCEPT (May 28, 2015). According to some experts, the Snowden disclosures “suggest that analysts at the NSA have drowned in data since 9/11, making it more difficult for them to find the real threats.” Id. “The titles of the documents capture their overall message: ‘Data Is Not Intelligence,’ ‘The Fallacies Behind the Scenes,’ ‘Cognitive Overflow?’ ‘Summit Fever’ and ‘In Praise of Not Knowing.’ Other titles include ‘Dealing With a ‘Tsunami’ of Intercept’ and ‘Overcome by Overload?’” Id. 186. Several scholars have noted how transformative technological shifts have also transformed methods of governance and surveillance as a tool of governance. See, e.g., Jack M. Balkin, Old-School/New-School Speech Regulation, 127 HARV. L. REV. 2296, 2297 (2014) (“The digital era is different. Governments can target for control or surveillance many different aspects of the digital infrastructure that people use to communicate: telecommunications and broadband companies, web hosting services, domain name registrars, search engines, social media platforms, payment systems, and advertisers.”); Jack M. Balkin, The Constitution in the National Surveillance State, 93 MINN. L.
REV. 1 (2008) [hereinafter Balkin, National Surveillance State]; Jack M. Balkin & Sanford Levinson, The Processes of Constitutional Change: From Partisan Entrenchment to the National Surveillance State, 75 FORDHAM L. REV. 489 (2006); Lior Jacob Strahilevitz, Signaling Exhaustion and Perfect Exclusion, 10 J. ON TELECOMM. & HIGH TECH. L. 321 (2012); David Lyon, Biometrics, Identification and Surveillance, 22 BIOETHICS 499 (2008); Erin Murphy, Paradigms of Restraint, 57 DUKE L.J. 1321 (2008). 187. GREENWALD, supra note X, at 97 (citing NSA slide from Snowden disclosures titled, “New Collection Posture,” quoting NSA data collection procedure as “Collect it All”), http://glenngreenwald.net/pdf/NoPlaceToHide-Documents-

facilitate the construction of digital avatars,188 or the virtual representation189 of our digital selves.190 Increasingly, with big data tools at the government’s disposal, post-9/11 legal, policy, and technological innovations construct data hierarchies. These innovations have accelerated the construction and governance of data selves or digital avatars, and further imperiled the privacy rights held under “digital personhood[.]”191 For the past two decades, experts and scholars have proposed different terminology to describe the new capacities and consequences of

Compressed.pdf; see also David Cole, ‘No Place to Hide’ by Glenn Greenwald, on the NSA’s Sweeping Efforts to ‘Know it All’, WASH. POST (May 12, 2014) (“In one remarkable [NSA] slide presented at a 2011 meeting of five nations’ intelligence agencies and revealed here for the first time, the NSA described its ‘collection posture’ as ‘Collect it All,’ ‘Process it All,’ ‘Exploit it All,’ ‘Partner it All,’ ‘Sniff it All’ and, ultimately, ‘Know it All.’”). 188. The term “digital avatar” is used often in the video gaming context, and most commonly refers to a digitally constructed representation of the computing user or, in some instances, the representation of the user’s alter ego or character. See, e.g., Hart v. Electronic Arts, Inc., 717 F.3d 141 (3d Cir. 2013). In Hart v. Electronic Arts, Inc., for example, a class action suit by college athletes alleged that their digital avatars and likenesses had been unlawfully appropriated for profit by the video game developer, Electronic Arts, Inc. See id. 189. The introduction of virtual reality and virtual worlds has raised increasingly complicated legal questions. For example, in Brown v. Entertainment Merchants Ass’n, 131 S. Ct. 2729 (2011), the Supreme Court considered the First Amendment implications of expressive speech in video games. It explained: Like the protected books, plays, and movies that preceded them, video games communicate ideas—and even social messages—through many familiar literary devices (such as characters, dialogue, plot, and music) and through features distinctive to the medium (such as the player’s interaction with the virtual world). That suffices to confer First Amendment protection. Id. at 2733; see also Joshua A.T. Fairfield, Mixed Reality: How the Laws of Virtual Worlds Govern Everyday Life, 27 BERKELEY TECH. L.J. 55, 71 (2012); Joshua A.T. Fairfield, Virtual Parentalism, 66 WASH. & LEE L. REV. 1215 (2009); Joshua A.T.
Fairfield, Escape Into the Panopticon: Virtual Worlds and the Surveillance Society, 118 YALE L.J. POCKET PART 131 (2009); Marc Jonathan Blitz, The Freedom of 3D Thought: The First Amendment in Virtual Reality, 30 CARDOZO L. REV. 1141 (2008). Increasingly, scholars are interrogating the legal implications of self-representations and digital avatar representations in virtual worlds. See, e.g., Llewellyn Joseph Gibbons, Law and the Emotive Avatar, 11 VAND. J. ENT. & TECH. L. 899 (2009). 190. See infra Part IV.A; see also David Cole, Is Privacy Obsolete? Thanks to the Revolution in Digital Technology, Privacy is About to Go the Way of the Eight-Track Player, THE NATION (Apr. 6, 2015), http://www.thenation.com/article/198505/privacy-20-surveillance-digital-age# (“Digital technology has exponentially expanded the government’s ability to construct intimate portraits of any particular individual by collecting all sorts of disparate data and combining and analyzing them for revealing patterns.”); Frank Gillett, How Will You Manage Your Digital Self?, INFORMATIONWEEK.COM (Oct. 30, 2013), http://www.informationweek.com/software/social/how-will-you-manage-your-digital-self/d/d-id/1112130 (“The digital self is not just your work and personal computer files. It includes all of the complex and varied digital information that you and the organizations you deal with generate.”). 191. SOLOVE, DIGITAL PERSON, supra note X.

dataveillance.192 Information Society technologies enable the datafication193 of all aspects of information. The datafication phenomenon allows social life, and human-generated activity and knowledge, to be quantified, digitalized, stored, accessed, and analyzed.194 The terms “data self”195 and “cyber self”196 describe an individual’s own curation and manipulation of an online reputation. The concept of “digital personhood”197 describes how “digital dossiers”198 can be created by others to construct our “data-double,”199 “data image,”200 and “digital persona[.]”201 This allows for the virtual manifestation of the “electronic personality and digital self[.]”202 This technological innovation has transformed surveillance. In an Information Society, “the interest of surveillance [is] not in complete bodies . . . but in fragments of data[.]”203 Relatedly, the concept of the “proliferation of networked identities and selves”204 concerns the preservation of the autonomous self within the infrastructure of the Information Society. Thus, the intelligence community appears to believe that the construction of the digital avatar reflects actual big data surveillance capacities. Gus Hunt, for example, evokes the image of Star Trek’s

192. LYON, supra note X, at 87–88. 193. MAYER-SCHÖNBERGER & CUKIER, supra note X. 194. Hunt CIA Presentation, supra note X; see, e.g., DOUGLAS RUSHKOFF, PRESENT SHOCK (2013); JAMES GLEICK, THE INFORMATION: A HISTORY, A THEORY, A FLOOD (2011). 195. See, e.g., Robert Gordon, The Electronic Personality and Digital Self, 56 DISP. RESOL. J. 8 (2001). 196. Chassity N. Whitman & William H. Gottdiener, The Cyber Self: Facebook as a Predictor of Well-being, INT’L J. APPLIED PSYCHOANALYTIC STUD. (2015). 197. DANIEL SOLOVE, THE DIGITAL PERSON: TECHNOLOGY AND PRIVACY IN THE INFORMATION AGE (2004). 198. Solove, Digital Dossiers and the Dissipation of Fourth Amendment Privacy, supra note X. See also BERNARD E. HARCOURT, EXPOSED: DESIRE AND DISOBEDIENCE IN THE DIGITAL AGE (2015). 199. Kevin D. Haggerty & Richard V. Ericson, The Surveillant Assemblage, 51 BRIT. J. SOC. 605 (2000). 200. LYON, supra note X, at 87 (citing DAVID LYON, THE ELECTRONIC EYE: THE RISE OF SURVEILLANCE SOCIETY 19 (1994)). 201. Id. at 87–88 (citing Roger Clarke, The Digital Persona and Its Application to Data Surveillance, 10 THE INFORMATION SOCIETY 77–92 (1994)). 202. Gordon, supra note 195. 203. LYON, supra note X, at 88 (citing Haggerty & Ericson, supra note 199, at 612). 204. See, e.g., Frank Pasquale & Danielle Keats Citron, Promoting Innovation While Preventing Discrimination: Policy Goals for the Scored Society, 89 WASH. L. REV. 1413, 1413–14 (2014) (referring to the work of Professor Tal Z. Zarsky); see also JULIE COHEN, CONFIGURING THE NETWORKED SELF: LAW, CODE, AND THE PLAY OF EVERYDAY PRACTICE (2012); Tal Z. Zarsky, Understanding Discrimination in the Scored Society, 89 WASH. L. REV. 1375 (2014); Tal Z. Zarsky, Mining the Networked Self, 6 JERUSALEM REV. LEGAL STUD. 
120 (2012), http://jrls.oxfordjournals.org/content/6/1/120.full.pdf?keytype=ref&ijkey=b1gi1dlZvf3iBX.

transporter205 to explain this surveillance potentiality.206 The transporter reference generates an image of a multi-dimensional virtual representation of the digital selves of others.207 The Internet, the Social-Mobile-Cloud phenomenon, and smart technologies such as smartphones facilitate the conversion of the self into data, much like the transporter.208 The transporter metaphor appears to illustrate how our digital avatars may be constructed from the comprehensive aggregation and amalgamation of our datafication. The capture of big data digital trails then allows for a “full-arsenal approach that digitally exploits the clues a target leaves behind in their regular activities on the [Inter]net to compile biographic and biometric information that can help implement precision targeting.”209 As a consequence of the unprecedented historical phenomenon of datafication—the digitalization of all aspects of knowledge and social activity—the “data-double” is increasingly conflated with the person who has been datafied. Profound social and legal consequences result when private and public entities conflate the data-double with the individual. In the private context, the White House recognizes that “[s]mall bits of data can be brought together to create a clear picture of a person to predict

205. See Hunt CIA Presentation, supra note X (“[Y]ou’re already a walking sensor platform. You guys know this I hope, right? Which is that your mobile device, your smartphone, your iPad, whatever it’s going to be, has got a, just, any number of these things [sensors] . . . . What’s happened is that if you’re a Star Trek fan, like I was when I was a kid, what’s current now is that this mobile platform, your smartphones, have turned into your communicator, they’re becoming your tricorder, and actually they’re becoming your transporter, right?”). It is instructive to examine the definitions of both “tricorder” and “transporter,” as Hunt uses both terms from Star Trek to more descriptively convey a perception of the big data potential of cloud-mobile-smart technologies. In Star Trek, a tricorder is a multifunction hand-held device used for sensor scanning, data analysis, and recording data. The transporter device converts a person into a data signal that can then be transmitted and reconstructed as a person at another location. The popularized catchphrase, “Beam me up, Scotty,” from Star Trek is commonly associated with the transporter and the imagery of the hologram of the Star Trek character being transported as data from one location to another. In a big data world, big data cybersurveillance may be used to facilitate the construction of a multi-dimensional virtual representation of our digital selves, thus resulting in the datafication of the person in a manner similar to the transporter. 
Although Hunt does not use the term “digital avatar,” his references to the smartphone (and Social-Mobile-Cloud technologies) as possessing functionality similar to that of a tricorder and transporter appear to reflect the intelligence community’s ambition to fuse an individual’s biometric and biographic data into a multi-dimensional data likeness of the user of smartphones and other social-cloud-mobile-smart technologies. 206. STAR TREK, created by Gene Roddenberry, currently owned by Paramount, is a Registered Trademark of Paramount Pictures Corporation. See, e.g., JUSTIN EVERETT, THE INFLUENCE OF STAR TREK ON TELEVISION, FILM, AND CULTURE 186 (2008). Legal scholars have also noted Star Trek’s relevancy to the study of the law. See, e.g., Paul Joseph & Sharon Carton, The Law of the Federation: Images of Law, Lawyers, and the Legal System in “Star Trek: The Next Generation,” 24 U. TOL. L. REV. 43 (1992). 207. See Hunt CIA Presentation, supra note X. 208. Id. 209. Risen & Poitras, supra note X.

preferences or behaviors.”210 In other words, consumer data-doubles can lead corporations to seek what the White House refers to as “perfect personalization.”211 The digital avatar analogy is apt in one respect: It captures the intelligence community’s ambition to create hologram-like representations of the digital selves of others. The analogy is incomplete, however, in that it does not appear to adequately capture the intelligence community’s actual technological capacities. Under the National Surveillance State, those who are tracked and isolated for action by government-led big data programs often have limited to nonexistent options to remediate big data harms. In some instances, they may face consequences without any knowledge at all of the big data program that impacts them.212 People in this category include, most prominently, those who find themselves on a government digital watchlisting program or in a database screening system. These programs rely upon a big data approach to policymaking: the incorporation of big data tools into programs that may include mass data collection,213 data mining,214 mass digital

210. PODESTA REPORT, infra note 213, at 7. See also LYON, supra note X, at 88 (“[T]he data-double emerges consequent on the interest of surveillance not in complete bodies to be controlled, but in fragments of data emanating from the body.” (citing Kevin D. Haggerty & Richard V. Ericson, The Surveillant Assemblage, 51 BRIT. J. SOC. 605 (2000))). 211. Id. 212. Various plaintiffs have alleged in litigation that constitutional harms arise from big data watchlisting and dataveillance-cybersurveillance targeting systems, including database screening systems. Afifi v. Lynch, No. 11-0460(BAH), 2015 WL 1941420, at *1–2 (D.D.C. Apr. 30, 2015) (cybersurveillance-GPS tracking litigation alleging, inter alia, a Fourth Amendment violation); Latif v. Holder, 686 F.3d 1122, 1124, 1126 (9th Cir. 2012) (No Fly List litigation by plaintiffs alleging, inter alia, due process violations); Complaint at 1–2, Makowski v. Holder, No. 12-cv-05265 (N.D. Ill. July 3, 2012) (Secure Communities litigation by plaintiff alleging, inter alia, a violation of the Privacy Act); Ariz. Contractors Ass’n, Inc. v. Napolitano, 526 F. Supp. 2d 968, 977 (D. Ariz. 2007), aff’d sub nom. Chamber of Commerce of U.S. v. Whiting, 131 S. Ct. 1968 (2011) (E-Verify litigation by Chamber of Commerce, et al., alleging, inter alia, Fourth Amendment search and seizure of data). 213. See, e.g., EXEC. OFFICE OF THE PRESIDENT, BIG DATA: SEIZING OPPORTUNITIES, PRESERVING VALUES 5 (2014) [hereinafter PODESTA REPORT], https://www.whitehouse.gov/sites/default/files/docs/big_data_privacy_report_may_1_2014.pdf (“[D]ata collection and analysis is being conducted at a velocity that is increasingly approaching real time, which means there is a growing potential for big data analytics to have an immediate effect on a person’s surrounding environment or decisions being made about his or her life.”). 214. 
The nature of the impact of government data mining has formed the basis of extensive and important research in recent years. See, e.g., Fred H. Cate, Government Data Mining: The Need for a Legal Framework, 43 HARV. C.R.-C.L. L. REV. 435, 437 (2008); Christopher Slobogin, Government Data Mining and the Fourth Amendment, 75 U. CHI. L. REV. 317, 321 (2008); Daniel J. Solove, Data Mining and the Security-Liberty Debate, 75 U. CHI. L. REV. 343, 343–44 (2008).

indexing,215 database screening protocols,216 digital watchlisting,217 big data integration,218 and predictive analytics.219 Critical Data Theory is needed to examine how law and technology construct classifications of suspect data and suspect digital avatars that reinforce preexisting power hierarchies. The constructivity thesis at the heart of Critical Race Theory is also at the heart of Critical Data Theory. Techniques of big data governance are evolving to keep pace with the disruptive and revolutionary developments initiated by the Information Society. Theoretical tools such as Critical Data Theory can challenge the capture and application of big data and dataveillance tools. New theoretical developments in legal scholarship can serve to interrogate the implications of big data-centered power.

215. See, e.g., Daniel J. Solove, Digital Dossiers and the Dissipation of Fourth Amendment Privacy, 75 S. CAL. L. REV. 1083, 1107 (2002). 216. See, e.g., SIMSON GARFINKEL, DATABASE NATION: THE DEATH OF PRIVACY IN THE 21ST CENTURY 37–67 (2000); MARTIN KUHN, FEDERAL DATAVEILLANCE: IMPLICATIONS FOR CONSTITUTIONAL PRIVACY PROTECTIONS (2007) (examining constitutional implications of “knowledge discovery in databases” (KDD applications) through dataveillance). 217. See, e.g., JEFFREY KAHN, MRS. SHIPLEY’S GHOST: THE RIGHT TO TRAVEL AND TERRORIST WATCHLISTS (2013); WILLIAM J. KROUSE & BART ELIAS, CONG. RESEARCH SERV., RL33645, TERRORIST WATCHLIST CHECKS AND AIR PASSENGER PRESCREENING (2009); Anya Bernstein, The Hidden Costs of Terrorist Watch Lists, 61 BUFF. L. REV. 461, 466 (2013); Peter M. Shane, The Bureaucratic Due Process of Government Watch Lists, 75 GEO. WASH. L. REV. 804, 805–06 (2007); Steinbock, supra note X, at 78; Sharon Bradford Franklin & Sarah Holcomb, Watching the Watch Lists: Maintaining Security and Liberty in America, HUM. RTS. 18 (Summer 2007); Susan Stellin, Who is Watching the Watch Lists, N.Y. TIMES (Nov. 30, 2013), http://www.nytimes.com/2013/12/01/sunday-review/who-is-watching-the-watch-lists.html. 218. Xin Luna Dong & Divesh Srivastava, Big Data Integration, 6 PROCEEDINGS OF THE VLDB ENDOWMENT 1188, 1189 (2013) (describing multiple technologies that assist in big data integration techniques, including schema mapping, record linkage, data fusion, and big data architecture), http://www.vldb.org/pvldb/vol6/p1188-srivastava.pdf. 219. 
See, e.g., STEVEN FINLAY, PREDICTIVE ANALYTICS, DATA MINING AND BIG DATA: MYTHS, MISCONCEPTIONS, AND METHODS 3 (2014) (explaining that tools of predictive analytics are not only deployed by the private and commercial sector, but that “government and other non-profit organizations also have reasons for wanting to know how people are going to behave and then taking action to change or prevent it.”); ERIC SIEGEL, PREDICTIVE ANALYTICS: THE POWER TO PREDICT WHO WILL CLICK, BUY, LIE, OR DIE 59–60 (2013) (discussing predictive policing methods); NATE SILVER, THE SIGNAL AND THE NOISE: WHY SO MANY PREDICTIONS FAIL—BUT SOME DON’T 417–18 (2012) (discussing governmental efforts to isolate relevant signals in correcting the failed attempts to prevent the terrorist attacks of 9/11: “In cases like these, what matters is not our signal detection capabilities . . . [w]e need signal analysis capabilities to isolate the pertinent signals from the echo chamber”) (emphasis in original).

II. CRITICAL RACE THEORY AND BIG DATA

Critical Race Theory contends that “law constructs race[.]”220 The theory asserts that race is not a static phenomenon or fixed concept, but, rather, is dynamically constructed by hierarchies of power—whether they be based upon law, politics and structures of governance, economics, presentations of history, culture, educational and academic knowledge and methods of inquiry, or any other source of power deemed legitimate by society. The theory asserts that the concept of race does not embody an objective or neutral truth, condition, or value. As Devon Carbado explains, the theory “rejects the view that race precedes law, ideology and social relations[.]”221 Critical Race Theory “has also focused more specifically on how the law constructs whiteness.”222 Race constructions “support the contemporary economies of racial hierarchy” and Critical Race Theory “intervene[s] to correct this market failure and the unjust racial allocations it produces[.]”223 Critical Race Theory combines a study of progressive political justice movements with a critique of conventional legal and social or scientific norms, especially examining how such norms function within hierarchies of power. Critical Race Theory scholarship focuses attention on race and racial power as constructions of law and culture. The theory contends that regimes of privilege are maintained by the rule of law and constitutional doctrines despite equality guarantees. Critical theorists do not view law or economic policy rationales as value-neutral tools of governance224 but instead as part of the construction of tools of subordination.225 Increasingly, research is focused on unseen forces that may shape law and policy, such as implicit or cognitive biases; theory; a lack of empiricism in explicating discriminatory phenomena and their manifestation; representations of race as heuristics in mass media,

220. Carbado, Critical What What?, supra note X, at 1610. 221. Carbado, Critical What What?, supra note X, at 1610. 222. Carbado, Critical What What?, supra note X, at 1611. 223. Carbado, Critical What What?, supra note X, at 1609. 224. See, e.g., Richard Delgado, Recasting the American Race Problem, 79 CAL. L. REV. 1389, 1393 (1991) (“Formal equal opportunity is [] calculated to remedy at most the more extreme and shocking forms of racial treatment; it can do little about the business-as-usual types of racism that people of color confront every day and that account for much of our subordination, poverty, and despair[]”); Devon Carbado & G. Mitu Gulati, The Law and Economics of Critical Race Theory, 112 YALE L.J. 1757 (2003). 225. Dorothy A. Brown, Fighting Racism in the Twenty-First Century, 61 WASH. & LEE L. REV. 1485, 1486 (2004) (“Although CRT does not employ a single methodology, it seeks to highlight the ways in which the law is not neutral and objective, but designed to support White supremacy and the subordination of people of color[]”); Kimberlé Crenshaw, A Black Feminist Critique of Antidiscrimination Law and Politics, in THE POLITICS OF LAW: A PROGRESSIVE CRITIQUE 213–14 n.7 (David Kairys ed., 1990) (“Critical race theory goes beyond the liberal race critiques, however, in that it exposes the facets of law and legal discourse that create racial categories and legitimate racial subordination.”).

entertainment, modes of knowledge and education; and other forces.226 A defining aspect of Critical Race Theory is its reliance upon dissent and discourse, political consciousness, and protest and civil disobedience, rather than upon legal principles of justice. The discussion below addresses how the core ideas of Critical Race Theory can be applied to critique how big data operates under the rule of law and in legal institutions, and why that critique matters. The growing research on big data discrimination and the disparate impact imposed by big data technologies insufficiently addresses their underlying causes and sources. A critical theoretical framework is needed to deconstruct how the unseen forces of big data shape law and policy in a manner parallel to how race operates to influence law and policy.

A. The Critical Race Theory Framework

Critical Theory is not a unitary theory; it adopts myriad methods of critique to serve a multitude of purposes. Critical Race Theory is understood to be a subset of this broader theoretical movement. To understand Critical Race Theory, the soul of the method—narrative and storytelling—must be appreciated.227 The poststructural and postmodernist

226. See, e.g., Richard Delgado, Recasting the American Race Problem, 79 CAL. L. REV. 1389, 1394 (1991) (“[C]ulture constructs its own social reality . . . . we construct it through images, pictures, stories, and narratives that, among other things, tell us who is and deserves to be poor, . . . and that the current unfortunate racial polarization and poverty affecting persons of color will right themselves in time . . . ”); Brown, supra note 225, at 1485 (“[T]wentieth century racism was blatant, intentional, and its existence generally undisputed. The obvious nature of how racism operated in the twentieth century led to the passage of civil rights laws. Twenty-first century racism, on the other hand, is more subtle.”). 227. See Crenshaw, supra note 47, at 1348 (explaining that Critical Race Theory grew from “a convergence with and contestation within CLS [].”).

CLS contributed the institutional space in which competing conceptions about race, knowledge, and social hierarchy could be vetted, refined and reproduced. Taking a page from the CLS tradition, the task at hand is to interrogate (racial) power where we live, work, socialize and exist. For academics, that world is implicated in the ways that the disciplines were built to normalize and sustain the American racial project. A contemporary critical race theory would thus take up the dual tasks of uncovering the epistemic foundations of white supremacy as well as the habits of disciplinary thought that cabin competing paradigms through colorblind conventions. Unraveling this story while at the same time generating an inventory of critical tools that have been fashioned by generations of Race Crits effectively replicates across disciplines the construction of CRT within one discipline.

Id.

aspects of Critical Race Theory lie in its embrace of literary tools.228 Narrative is not just an end product of the theory but a means to deliver the theory.229 The process of bearing witness to racism—forcing self-directed dialogue and reflective insight on the nature and scope of race-related discrimination—is presented as a core method of theoretical interrogation.230 Narrative is central to Critical Race Theory because it provides a context from which supposedly contextless, objective truths become exposed as subjective socio-economic, political, cultural, and racial formations.231 Ultimately, the theory advances principles about knowledge creation. An epistemological principle foundational to the method is as follows: all truths—what is accepted as knowledge and reality—are embedded in narratives, even if those promoting the knowledge or reality cannot see this.232 Narratives, then, operate as a crucial prerequisite for elevating a view or position into a prima facie truth that purportedly stands beyond established knowledge.233 The narrative thus serves as an unspoken, taken-for-granted premise for critical discourse.234 In other words, narrative re-embeds “objective” knowledge into the social, racial, gendered, or cultural framework within which it arose and helps reveal the ends, otherwise invisible, which it

228. See Derrick A. Bell, Who’s Afraid of Critical Race Theory?, 1995 U. ILL. L. REV. 893, 899 (emphasizing that “[c]ritical race theory writing and lecturing is characterized by frequent use of the first person, storytelling, narrative, allegory, interdisciplinary treatment of law, and the unapologetic use of creativity.”). 229. Id. 230. See Kimberlé Crenshaw, Mapping the Margins: Intersectionality, Identity Politics, and Violence against Women of Color, 43 STAN. L. REV. 1241 (1991). 231. See Bell, Who’s Afraid, supra note 228, at 901 (“We insist, for example, that abstraction, put forth as ‘rational’ or ‘objective’ truth, smuggles the privileged choice of the privileged to depersonify their claims and then pass them off as the universal authority and the universal good.”). 232. See Bell, Who’s Afraid, supra note 228, at 901 (explaining that “a neutral perspective does not, and cannot, exist—that we all speak from a particular point of view … [a] ‘positioned perspective.’”). 233. See, e.g., andré douglas pond cummings, Derrick Bell: Godfather Provocateur, 28 HARV. J. RACIAL & ETHNIC JUST. 51, 53 (2012) (“… Critical Race theorists championed storytelling and narrative as valuable empirical proof of reality and the human experience, while rejecting traditional forms of legal studies . . . and various forms of civil rights leadership.”); Richard Delgado & Jean Stefancic, Critical Race Theory: An Annotated Bibliography, 79 VA. L. REV. 461 (1993) (“To analyze and challenge these power-laden beliefs [surrounding race-based assumptions], some writers employ counterstories, parables, chronicles, and anecdotes aimed at revealing their contingency, cruelty, and self-serving nature.”). 234. Peter Goodrich & Linda G. Mills, The Law of White Spaces: Race, Culture, and Legal Education, 51 J. LEGAL EDUC. 15, 38 (2001) (“Critical race theory has tended to concentrate on the moment and form of attacks upon the racial outsider. It has attempted to assert an oppositional voice . . . . 
It is the untold story with which we began. It is the narrative of . . . the impoverishment of colorblindness.”).

serves.235 While the narrative method236 of knowledge acquisition might be controversial in scientific and other contexts, most would not dispute that legal doctrines and justice systems arise from specific social, economic, cultural, and racial contexts.237 These legal doctrines, supported by assumed truths and serving specific ends in those contexts, can and should be interrogated. Critical Race Theory, therefore, invites interrogation of any transcendent or a priori truth and holds as its own truth that nothing transcends the social fabric from which it first emerges.238 Put differently, everything has a story, and sometimes that story can be a crucial first step in challenging the hegemony of a concept, truth, or doctrine.239 In Critical Race Theory, the story of the storyless typically

235. Bell, Who’s Afraid, supra note 228, at 906 (“The narrative voice, the teller, is important to critical race theory in a way not understandable by those whose voices are tacitly deemed legitimate and authoritarian. The voice exposes, tells and retells, signals resistance and caring, and reiterates what kind of power is feared most—the power of commitment to change[]”); HIP HOP AND THE LAW xxi (Pamela Bridgewater, andré douglas pond cummings & Donald F. Tibbs eds. 2015) (describing hip hop as a form of storytelling that critiques “American law and legal culture . . . . Indeed, Hip Hop artists have often experienced the blunt trauma of the American legal system first hand and as young men of color in the United States, are keenly positioned to critique a system that disproportionately imprisons and discards African American and Latino youth.”). 236. Lisa Kern Griffin, Narrative, Truth, and Trial, 101 GEO. L.J. 281, 335 (2013) (arguing that narratives in the context of jury trials raise concerns, as “the story model is incomplete and recognizing its limitations reveals opportunities to improve truth seeking and counteract bias.”). 237. See, e.g., Susan S. Silbey, Law and Society Movement, in LEGAL SYSTEMS OF THE WORLD: A POLITICAL, SOCIAL AND CULTURAL ENCYCLOPEDIA 860–63 (2002); TOM R. TYLER, ROBERT J. BOECKMANN, HEATHER J. SMITH & YUEN J. HUO, SOCIAL JUSTICE IN A DIVERSE SOCIETY (1997). 238. Big data and the digital economy have introduced new theories by which to examine the new social fabric that is now digitally mediated. 
See, e.g., JOSÉ VAN DIJCK, THE CULTURE OF CONNECTIVITY: A CRITICAL HISTORY OF SOCIAL MEDIA 5–6 (2013) (“As a [technological] medium coevolves with its quotidian users’ tactics, it contributes to shaping people’s everyday life, while at the same time this mediated sociality becomes part of society’s institutional fabric[]”); José van Dijck, Datafication, Dataism and Dataveillance: Big Data Between Scientific Paradigm and Ideology, 12 SURVEILLANCE & SOCIETY 197, 198 (2014) (“[T]he ideology of dataism shows characteristics of a widespread belief in the objective quantification and potential tracking of all kinds of human behavior and sociality through online media technologies. Besides, dataism also involves trust in the (institutional) agents that collect, interpret, and share (meta)data culled from social media, internet platforms, and other communication technologies[]”); JULIE COHEN, CONFIGURING THE NETWORKED SELF: LAW, CODE, AND THE PLAY OF EVERYDAY PRACTICE (2012). 239. Jerome M. Culp, Jr., Angela P. Harris & Francisco Valdes, Subject Unrest, 55 STAN. L. REV. 2435, 2443–44 (2003) (“[T]he narrative tradition of critical race theory provides a way to explore the dilemma of the uncertain subject of subordination[]”); MARGARET ZAMUDIO, CHRISTOPHER RUSSELL, FRANCISCO RIOS & JACQUELYN L. BRIDGEMAN, CRITICAL RACE THEORY MATTERS: EDUCATION AND IDEOLOGY (2010) (“Critical race theorists understand that narratives are not neutral, but rather political expressions of power relationships. That is, history is always told from the perspective of the dominant group. Minority perspectives in the form of narratives, testimonies, or storytelling challenge the dominant group’s accepted truths.”).

begins at the margins: The story of a speaker racially positioned at the margin of mainstream society whose own narrative calls into question values and truths.240 The critical race theorist asserts that bodies of knowledge, and hierarchies of law and power, flow through the filter of racism.241 Consequently, critical race theorists advance the position that knowledge conveyed through art, such as storytelling, represents a type of truth that operates as a backstory to hierarchies of law and power that may not otherwise be seen.242 Critical Race Theory posits that narrative is a method for delivering backstory truths—especially the truth or the reality of the subordinated—or an alternate reality that may be otherwise ignored or rejected.243 Race constructions “support the contemporary economies of racial hierarchy” and Critical Race Theory “intervenes to correct this market failure and the unjust racial allocations it produces[.]”244 Critical Race Theory is “an intersectional engagement of structural hierarchies” and an understanding “that, historically, racism has been bi-directional: It gives to whites (e.g., citizenship) what it takes away from or denies to people of color[.]”245 Critical Race Theory “attempts to describe the role law plays in enabling this racial arrangement[.]”246 Critical Race Theory also draws upon multiple critical strategies to expose how the law constructs race to disadvantage individuals, and it emphasizes civil rights movements as a primary prescriptive vehicle rather than relying upon the prescriptive potential of legal reforms. Critical Race Theory thus in the first instance depends more upon dissenting narratives 240. See, e.g., Bell, Bottom of the Well, supra note 1; Mari J. Matsuda, Looking to the Bottom: Critical Legal Studies and Reparations, 22 HARV. C.R.-C.L. L. REV. 323 (1987). 241. Kevin R. Johnson, Richard Delgado’s Quest for Justice for All, 33 LAW & INEQ. 407, 409 (2015) (“Delgado considers racism to be both a central organizing principle and permanent feature of American social life.”). 242. Kang, Trojan Horse, supra note 1, at 1506 (“the ‘power of race is invisible’”) (citation omitted); Rachel F. Moran, Whatever Happened to Racism?, 79 ST. JOHN’S L. REV. 833, 919–20 (2012) (“Redefining the meaning of disparate treatment is not the only way to fight unconscious bias. Structural changes can avoid the unfairness of blaming individuals for conduct that is automatic, unconscious, and widespread[]”); Robert S. Chang, Toward an Asian American Legal Scholarship: Critical Race Theory, Post-Structuralism, and Narrative Space, 81 CALIF. L. REV. 1241 (1993). See also Camara Phyllis Jones, Confronting Institutionalized Racism, 50 PHYLON 7 (2003). 243. See Richard Delgado, Storytelling for Oppositionists and Others: A Plea for Narrative, 87 MICH. L. REV. 2411, 2415 (1988) (contending that “combining elements from the story and current reality, we may construct a new world richer than either alone. Counterstories can quicken and engage conscience. Their graphic quality can stir imagination in ways in which more conventional discourse cannot.”). 244. Carbado, Critical What What?, supra note X, at 1609. 245. Carbado, Critical What What?, supra note X, at 1614. 246. Carbado, Critical What What?, supra note X, at 1614.

2016] CRITICAL DATA THEORY 47 and civil rights protests rather than the avenues of statutory and constitutional reform. Other important sites of intervention by critical race theorists are endeavors to demonstrate the manner in which (1) racism is systemic, endemic, and not aberrational or anomalous;247 (2) neutrality and objectivity principles under the law act to replicate discrimination and perpetuate preexisting hierarchies;248 and (3) why and how the foci of the analytical method in legal jurisprudence should and can be informed by personal context and contextualized history.249 Critical Race Theory sets a valuable foundation for the probe into the discriminatory impact of big data.

B. Disparate Impact of Big Data Tools and Programs

Increasing attention has been focused on examining how discriminatory influences may intersect with algorithmic decisionmaking and the deployment of big data technologies:250 “In describing racism as an endemic social force, [Critical Race Theory] scholars argue that it interacts with other social forces[.]”251 In borrowing a Critical Race Theory frame, therefore, a central question here is how racism, sexism, and other forms of discrimination may interact with the “social forces” that emanate from the big data revolution, the rise of the Information Society, and the global dominance of the digital economy. To answer that question, it is important to determine the predominant drivers and users of big data. There are currently two prevalent users of big data: Corporations and government agencies.252 “First, corporations can analyze the data for commercially beneficial insights. Second, government agencies can examine the data for evidence that you are engaged in suspicious activities.”253 Many big data tools can be used to isolate and analyze individuals on the basis of race and ethnicity, gender, religion, geolocational data, socioeconomic class, voting habits, consumer behaviors, and health status, for example.254 In the corporate context, these big data products can be used, for example, to create economic and “money-based caste systems.”255

247. See, e.g., MARI J. MATSUDA, CHARLES R. LAWRENCE III, RICHARD DELGADO & KIMBERLÉ WILLIAMS CRENSHAW, WORDS THAT WOUND: CRITICAL RACE THEORY, ASSAULTIVE SPEECH, AND THE FIRST AMENDMENT (1993) (contending that racism is endemic to American life); Guy-Uriel Charles, Towards a New Civil Rights Framework, 30 HARV. J.L. & GENDER 353 (2007) (“Looking at the gaping racial disparities on most socio-economic indicators, there are clearly two classes of citizens: Whites and .”). 248. See, e.g., MATSUDA, LAWRENCE III, DELGADO & CRENSHAW, supra note 247 (“Critical Race Theory expresses skepticism toward dominant legal claims of neutrality, objectivity, colorblindness, and meritocracy”); Carbado, Critical What What?, supra note X, at 1609 (arguing that Critical Race Theory challenges “two dominant principles upon which American anti-discrimination law and politics rest—to wit, that colorblindness necessarily produces race neutrality and that color consciousness necessarily produces racial preferences.”). 249. See, e.g., MATSUDA, LAWRENCE III, DELGADO & CRENSHAW, supra note 247 (“Critical Race Theory insists on recognition of the experiential knowledge of people of color and our communities of origin in analyzing law and society[]”); Derrick A. Bell, Jr., Brown v. Board of Education and the Interest-Convergence Dilemma, 93 HARV. L. REV. 518, 523 (1980) (“Translated from judicial activity in racial cases both before and after Brown, this principle of ‘interest convergence’ provides: The interest of blacks in achieving racial equality will be accommodated only when it converges with the interests of whites.”). 250. See, e.g., FTC Recommends Congress Require the Data Broker Industry to be More Transparent and Give Consumers Greater Control Over Their Personal Information, FED. TRADE COMM’N (May 27, 2014), https://www.ftc.gov/news-events/press-releases/2014/05/ftc-recommends-congress-require-data-broker-industry-be-more; FED. TRADE COMM’N, DATA BROKERS: A CALL FOR TRANSPARENCY AND ACCOUNTABILITY (May 2014) [hereinafter FTC REPORT: DATA BROKERS]; EXEC. OFFICE OF THE PRESIDENT, BIG DATA: SEIZING OPPORTUNITIES, PRESERVING VALUES (2014) [hereinafter PODESTA REPORT], https://www.whitehouse.gov/sites/default/files/docs/big_data_privacy_report_may_1_2014.pdf; EXEC. OFFICE OF THE PRESIDENT, BIG DATA: SEIZING OPPORTUNITIES, PRESERVING VALUES, INTERIM PROGRESS REPORT (2015), https://www.whitehouse.gov/sites/default/files/docs/20150204_Big_Data_Seizing_Opportunities_Preserving_Values_Memo.pdf; LEADERSHIP CONFERENCE ON CIVIL AND HUMAN RIGHTS, CIVIL RIGHTS, BIG DATA, AND OUR

ALGORITHMIC FUTURE (2014); Oscar Gandy, Data Mining, Surveillance, and Discrimination in the Post 9-11 Environment, in KEVIN D. HAGGERTY & RICHARD V. ERICSON, THE NEW POLITICS OF SURVEILLANCE AND VISIBILITY (2006); First Annual Conference on Big Data and Civil Rights, DATA QUALITY CAMPAIGN (Dec. 17, 2014), http://dataqualitycampaign.org/blog/2014/12/first-annual-conference-on-big-data-and-civil-rights/; Joe Kukura, The Ugly Side of Big Data: Profiling by Race, Class, and Dirty Secrets, PULSEPOINT (May 29, 2014), http://create.pulsepoint.com/hub/3/at/17174370.html?; Mark Gongloff, Here’s What Corporate America Really Thinks About You, HUFFPOST BUS. (May 29, 2014), http://www.huffingtonpost.com/2014/05/29/data-brokers-stereotyping_n_5405805.html; Amy J. Schmitz, Secret Consumer Scores and Segmentations: Separating “Haves” From “Have-Nots”, 2014 MICH. ST. L. REV. 1411. 251. Carbado, Critical What What?, supra note X, at 1613–14 nn.89–90. 252. John Horgan, U.S. Never Really Ended Creepy “Total Information Awareness” Program, SCI. AM. (June 7, 2013), http://blogs.scientificamerican.com/cross-check/2013/06/07/u-s-never-really-ended-creepy-total-information-awareness-program/. Dave Farber, the “Grandfather of the Internet,” speaking on cybersecurity, “recalled that shortly after 9/11, the Defense Advanced Research Projects Agency initiated ‘Total Information Awareness,’ a surveillance program that called for recording and analyzing all digital information generated by all U.S. citizens.” Id. “After news reports provoked criticism of the Darpa program, it was officially discontinued. But Farber suspected that [the] new surveillance programs [revealed by the Snowden disclosures] represent a continuation of Total Information Awareness. ‘I can’t get anyone to deny that there’s a common thread there,’ he said.” Id. 253. Id. 254. Amy J. Schmitz, Secret Consumer Scores and Segmentations: Separating “Haves” From “Have-Nots”, 2014 MICH. ST. L. REV. 1411, 1454 (“Algorithmic classifications skew how companies treat consumers and foster discrimination when based in part on assumptions related to race, gender, ethnicity, zip codes, and other data points that consider economic and educational resources”) (citation omitted). 255. Nelson D. Schwartz, In New Age of Privilege, Not All Are in Same Boat, N.Y. TIMES (Apr. 24, 2016). Big data brokers and marketing systems combined to create invisible


Recent illustrations include using big data products to market certain jobs to only one gender,256 to limit banking products on the basis of socioeconomic considerations,257 and to engage in discriminatory network practices.258 In the government context, big data-driven determinations can be used to assess who is a potential security risk, such as through the No Fly List259 and the Kill List.260 An important area of civil rights concern, thus, is the increasing availability of publicly available social media data for private or public assessments, and the aggregation of publicly and privately held data for government decisionmaking. At present, the discriminatory effects of big data are better understood in the corporate and private context than in the government context. The reason for this is simple: The deployment of big data tools by private corporations, and the impact of that deployment, are not as shielded from public scrutiny as the deployment of big data tools by the government. Yet, intelligence agencies, domestic security and law enforcement organizations, and other bureaucratized arms of the public sector are actively capitalizing upon big data developments.261 While corporations and government agencies attempt to capitalize on the big data revolution in differing ways, both rely on big data tools to harvest masses of data, and then to analyze and dissect those data according to algorithms that enable inferences to be drawn and acted upon. Looking to the private sector for the moment, a growing body of empirical and anecdotal evidence makes clear that discriminatory bias is not cured by resort to big data analytical tools.262 Instead, increasing research

economic caste systems where companies engage in “elite segmentation” to identify wealthy customers to market to. Id. 256. FED. TRADE COMM’N, BIG DATA: A TOOL FOR INCLUSION OR EXCLUSION?: UNDERSTANDING THE ISSUES 19 (Jan. 2016) [hereinafter FTC REPORT: BIG DATA EXCLUSION] (“[A] company [can] avoid[], for example, expressly screening job applicants based on gender and instead use[] big data analytics to screen job applicants in a way that has a disparate impact on women.”). 257. FED. TRADE COMM’N, DATA BROKERS: A CALL FOR TRANSPARENCY AND ACCOUNTABILITY app. C-5 to C-6 (May 2014) [hereinafter FTC REPORT: DATA BROKERS].

When inaccurate information wrongly leads to a consumer being identified as a risk that needs to be mitigated . . . that consumer may suffer significant harm. The consumer may[] be unable to complete important transactions, such as opening a bank or mobile phone account, if the data that went into a risk mitigation product is incorrect.

Id. 258. Edward W. Felten, Net Neutrality Is Hard to Define, N.Y. TIMES (Aug. 10, 2010) (“We don’t want to mistake their complex but neutral network management practices for actual discrimination; nor do we want to enable subtle discrimination cloaked in network jargon.”); TIM WU, THE MASTER SWITCH: THE RISE AND FALL OF INFORMATION EMPIRES 286 (2010) (explaining the importance of net neutrality: “Just as with the operator of the sole ferry between an island and the mainland, proprietorship of any part of the Internet’s vital infrastructure rightly obliges one to carry the whole Internet, without discrimination or favoritism, in accordance with one of the oldest assumptions of our legal tradition.”). 259. See, e.g., Mohamed v. Holder, 995 F. Supp. 2d 522, 525 (2014); Jennifer C. Daskal, Pre-Crime Restraints: The Explosion of Targeted, Noncustodial Prevention, 99 CORNELL L. REV. 327, 328–30 (2014); Jeffrey Kahn, International Travel and the Constitution, 56 UCLA L. REV. 271, 276 (2008); Aaron H. Caplan, Nonattainder as a Liberty Interest, 2010 WIS. L. REV. 1203, 1203–05; Sharon Bradford Franklin & Sarah Holcomb, Watching the Watch Lists: Maintaining Security and Liberty in America, HUM. RTS., Summer 2007, at 18; KAHN, supra note X; WILLIAM J. KROUSE & BART ELIAS, CONG. RESEARCH SERV., RL33645, TERRORIST WATCHLIST CHECKS AND AIR PASSENGER PRESCREENING (2009); Anya Bernstein, The Hidden Costs of Terrorist Watch Lists, 61 BUFF. L. REV. 461, 466 (2013); Peter M. Shane, The Bureaucratic Due Process of Government Watch Lists, 75 GEO. WASH. L. REV. 804, 805–06 (2007); Steinbock, supra note 1, at 78; Susan Stellin, Who Is Watching the Watch Lists?, N.Y. TIMES (Nov. 30, 2013), http://www.nytimes.com/2013/12/01/sunday-review/who-is-watching-the-watch-lists.html. 260. See, e.g., Al-Aulaqi v. Panetta, 35 F. Supp. 3d 56 (D.D.C. 2014) (noting that the plaintiffs’ complaint alleged that the U.S. violated the Fifth Amendment right of the deceased to substantive and procedural due process). See generally DAVID E. SANGER, CONFRONT AND CONCEAL: OBAMA’S SECRET WARS AND SURPRISING USE OF AMERICAN POWER 241–70 (2012) (describing use of drones and targeted killing strategy in the “war on terror”); DANIEL KLAIDMAN, KILL OR CAPTURE: THE WAR ON TERROR AND THE SOUL OF THE

OBAMA PRESIDENCY 41 (2012); Kenneth Anderson, The Secret “Kill List” and the President, 3 J.L.: PERIODICAL LABORATORY OF LEG. SCHOLARSHIP 93 (2013); Greg Miller, Plan for Hunting Terrorists Signals U.S. Intends to Keep Adding Names to Kill Lists, WASH. POST (Oct. 23, 2012) (“Over the past two years, the Obama administration has been secretly developing a new blueprint for pursuing terrorists, a next-generation targeting list called the ‘disposition matrix.’ . . . U.S. officials said the database is designed to go beyond existing kill lists, mapping plans for the ‘disposition’ of suspects beyond the reach of American drones.”). 261. See, e.g., EXEC. OFFICE OF THE PRESIDENT, BIG DATA: SEIZING OPPORTUNITIES, PRESERVING VALUES 5 (2014) [hereinafter PODESTA REPORT], https://www.whitehouse.gov/sites/default/files/docs/big_data_privacy_report_may_1_2014.pdf; Andrew Guthrie Ferguson, Big Data and Predictive Reasonable Suspicion, 163 U. PA. L. REV. 327, 329 n.6 (2015); Hu, Small Data Surveillance, supra note X, at 803–05. 262. FTC Recommends Congress Require the Data Broker Industry to be More Transparent and Give Consumers Greater Control Over Their Personal Information, FED. TRADE COMM’N (May 27, 2014), https://www.ftc.gov/news-events/press-releases/2014/05/ftc-recommends-congress-require-data-broker-industry-be-more; Christian Sandvig, Kevin Hamilton, Karrie Karahalios & Cedric Langbort, Auditing Algorithms: Research Methods for Detecting Discrimination on Internet Platforms 17 (2014).

Indeed, the movement of unjust face-to-face discrimination into computer algorithms appears to have the net effect of protecting the wicked. As we have pointed out, algorithmic discrimination may be much more opaque and hard to detect than earlier forms of discrimination, while at the same time one important mode of monitoring—the audit study [the most common social scientific method for the detection of discrimination, typically field experiments in which experimenters participate in a particular process to diagnose harmful discrimination]—has been circumscribed.

Id.

appears to demonstrate that such tools offer a new and less observable way of implementing such biases.263 This is not simply because the human element takes the information provided by big data systems and puts it to uses that potentially violate federal and state laws prohibiting discrimination.264 Rather, evidence suggests that discrimination, both legal and illegal, appears to operate within the types of databases that are kept, within the manner in which statistical projections of big data may reflect already existing biases, and within the analytical algorithms of big data itself.265 For example, the Google advertisements that instantly and repeatedly appear within browsing sessions are selected not by humans but by algorithms. Yet it has been noted that the ads a person receives in a browsing session appear to reflect racial bias; for example, algorithmically delivered ads may differ depending upon race.266 The research of Latanya Sweeney suggests that the algorithms infer ad preferences based upon white-sounding names and black-sounding names.267 There is no need for a conscious intent to discriminate on the part of the advertisers or Google. The racially disparate outcome may simply reflect system user input, such that existing racial biases are internalized and then reinforced in the name of maximizing the likelihood that a Google

263. “While large data sets can give insight into previously intractable challenges, hidden biases at both the collection and analytics stages of big data’s life cycle could lead to disparate impact.” FED. TRADE COMM’N, BIG DATA: A TOOL FOR INCLUSION OR EXCLUSION?: UNDERSTANDING THE ISSUES 28 (Jan. 2016). 264. Title VII of the Civil Rights Act of 1964, which applies to most employers, prohibits employment discrimination based on race, color, religion, sex, or national origin, and, through recent enforcement guidance, extends to persons with criminal records. Title VII of the Civil Rights Act of 1964, Pub. L. No. 88-352, 78 Stat. 241; EEOC Enforcement Guidance (Apr. 2012), https://www.eeoc.gov/laws/guidance/arrest_conviction.cfm. “It is foreseeable, however, that data that closely follow categories that are not permissible grounds for treating consumers differently in a broad array of commercial transactions will be used in exactly this way.” FED. TRADE COMM’N, DATA BROKERS: A CALL FOR TRANSPARENCY AND ACCOUNTABILITY app. C-5 (May 2014). 265. LEADERSHIP CONFERENCE ON CIVIL AND HUMAN RIGHTS, CIVIL RIGHTS, BIG DATA, AND OUR ALGORITHMIC FUTURE 6–7 (2014) (“[T]his underwriting [of insurance pricing] means less sharing of risk. Healthy people in low-income neighborhoods will pay more for their life insurance than will healthy people in healthier neighborhoods (because they are saddled with the health costs of their less healthy neighbors).”). “Google’s software automatically learns which ad combinations are most effective (and most profitable) by tracking how often users click on each ad. These user behaviors, in aggregate, reflect the biases that currently exist across society.” Id. at 16. 266. Google and Instant Checkmate, a search engine providing a public records search service, may be reinforcing the racial biases that their audience’s click patterns reflect in conducting online searches of people’s names: They generate ads in accordance with the racial overtones of the name. Latanya Sweeney, Discrimination in Online Ad Delivery, 11 QUEUE 10, 10 (2013). 267. Latanya Sweeney, Discrimination in Online Ad Delivery, 11 QUEUE 10, 34 (2013) (describing her study finding that a greater percentage of ads with ‘arrest’ in their text appeared for black-identifying first names than for white-identifying first names in various searches using common search engines, to an extent that could not plausibly be explained by chance).

user will react favorably to the ads “randomly” served.268 Additionally, big data-driven discrimination can facilitate and mutually reinforce small data-driven discrimination. The White House recently recognized that big data could pose a significant concern to the future of civil rights protections.269 In its May 2014 report, Big Data: Seizing Opportunities, Preserving Values, the White House recognized that both the private and public sectors could use big data technologies to entrench discriminatory biases and serve discriminatory ends.270 Some big data experts have observed that big data is one of the most critical civil rights issues of our generation.271 Wade Henderson and Rashad Robinson have explained that big data is a civil rights issue because “discrimination and injustice are embedded in our society. They are reflected in the data that drive big data decisions—in credit histories, in housing patterns, in arrest records, and more.”272 Growing media and academic attention, thus, has been placed on the ways in which employers, landlords, lenders, insurers, and others can now exploit big data tools to serve discriminatory ends. Under civil rights statutes that were structured in a small data world, such as Title VII of the Civil Rights Act of 1964, employers were largely prohibited from basing employment decisions upon race and color, ethnicity, sex, and religion.273 Certain traits, like racial traits, however, can now be more easily inferred from the data trails a person leaves in the Information Society—social media exposure, consumer habits, search histories, and the like—that are available for analysis. Insurers, for example, look to data trails to assess a person’s health history, both in the form of medical records

268. Id. at 10. 269. EXEC. OFFICE OF THE PRESIDENT, BIG DATA: SEIZING OPPORTUNITIES, PRESERVING VALUES: INTERIM PROGRESS REPORT 6 (2015) (finding the “potential for big data technology to be used to discriminate against individuals, whether intentionally or inadvertently, potentially enabling discriminating outcomes, reducing opportunities and choices available to them.”). 270. EXEC. OFFICE OF THE PRESIDENT, BIG DATA: SEIZING OPPORTUNITIES, PRESERVING VALUES 46 (2014) (“[T]he civil rights community is concerned that such algorithmic decisions raise the specter of ‘redlining’ in the digital economy—the potential to discriminate against the most vulnerable classes of our society under the guise of neutral algorithms.”). “Left unresolved, technical issues like [E-Verify concerns] could create higher barriers to employment or other critical needs for certain individuals and groups, making imperative the importance of accuracy, transparency, and redress in big data systems.” Id. at 52. 271. Alistair Croll, Big Data Is Our Generation’s Civil Rights Issue, and We Don’t Know It, SOLVE FOR INTERESTING (July 31, 2012), http://solveforinteresting.com/big-data-is-our-generations-civil-rights-issue-and-we-dont-know-it. 272. Wade Henderson & Rashad Robinson, Big Data Is a Civil Rights Issue, TALKING POINTS MEMO (Apr. 8, 2014), http://talkingpointsmemo.com/cafe/big-data-is-a-civil-rights-issue. 273. Title VII of the Civil Rights Act of 1964, Pub. L. No. 88-352, 78 Stat. 241.

as well as consumption habits.274 Thus, intentional discrimination can be actively facilitated in more covert forms in a big data world.275 Data trails of individuals allow for profiling on the basis of race, class, and otherwise, and those profiles are imbued with the very kinds of bias that necessitated anti-discrimination laws in the first place. Data brokers, for instance, assemble and analyze “actual and derived data elements,” including race.276 In one investigation by the Federal Trade Commission, data broker classifications of sub-populations within the U.S. were found to include “Urban Scramble” and “Mobile Mixers,” both of which encompass a high concentration of Latinos and African-Americans with low incomes.277 “Latchkey Leasers” characterizes “bicultural and bilingual middle aged renters who tend to have little net worth[.]”278 Such assumptions, intentional or otherwise, can enter at the front end in the form of assumptions internalized within algorithmic classifications.279 Discriminatory bias or cognitive bias can also enter at the back end in the form of how technology users and big data interpreters read, analyze, and act upon big data determinations.280 But regardless of how the discrimination enters, front end or back end, the result is the same. Supposedly objective and individualized statistical analyses facilitated by big data reinforce, rather than undermine, the discriminatory practices (like

274. Joe Kukura, The Ugly Side of Big Data: Profiling by Race, Class, and Dirty Secrets, PULSEPOINT (May 29, 2014), http://create.pulsepoint.com/hub/3/at/17174370.html?. 275. THE LEADERSHIP CONFERENCE ON CIVIL AND HUMAN RIGHTS, CIVIL RIGHTS, BIG DATA, AND OUR ALGORITHMIC FUTURE 16–17 (Sept. 2014); Jennifer Valentino-Devries, Bosses May Use Social Media to Discriminate Against Job Seekers, WALL ST. J. (Nov. 20, 2013), http://www.wsj.com/news/articles/SB10001424052702303755504579208304255139392?cb=logged0.6209975669720549; Alessandro Acquisti & Christina Fong, An Experiment in Hiring Discrimination via Online Social Networks, CARNEGIE MELLON UNIVERSITY 30 (July 17, 2015), http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2031979. 276. FED. TRADE COMM’N, DATA BROKERS: A CALL FOR TRANSPARENCY AND ACCOUNTABILITY 24 (May 2014) [hereinafter FTC REPORT: DATA BROKERS], https://www.ftc.gov/system/files/documents/reports/data-brokers-call-transparency-accountability-report-federal-trade-commission-may-2014/140527databrokerreport.pdf. 277. FTC REPORT: DATA BROKERS, supra note 276, at v. 278. Mark Gongloff, Here’s What Corporate America Really Thinks About You, HUFFPOST BUS. (May 29, 2014), http://www.huffingtonpost.com/2014/05/29/data-brokers-stereotyping_n_5405805.html. 279. Amy J. Schmitz, Secret Consumer Scores and Segmentations: Separating “Haves” From “Have-Nots”, 2014 MICH. ST. L. REV. 1411, 1454. 280. EXEC. OFFICE OF THE PRESIDENT, BIG DATA: A REPORT ON ALGORITHMIC SYSTEMS, OPPORTUNITY & CIVIL RIGHTS 14 (May 2016) (“[Because algorithms] are built by humans and rely on imperfect data, these algorithmic systems may also be based on flawed judgments and assumptions that perpetuate bias as well.”).

redlining, etc.) of a small data world.281 For example, “individualized insurance pricing” means individualized profiling on a level heretofore unseen.282 A healthy individual living in a low-income neighborhood profiles differently from one residing uptown, with the result reflected in their life insurance premiums.283 Discrimination on the basis of race, ethnicity, color, religion, sex, and other categories in certain contexts—such as in granting credit, insurance, and employment—is prohibited by federal civil rights law.284 However, data brokers are unable to determine whether the race-based and other protected classifications they sell may ultimately be used for unlawful discriminatory purposes by those who purchase information on the populations categorized as “Urban Scramble” or “Latchkey Leasers”—low-income Latinos, African-Americans, and other minorities.285 “These sorts of structural discrimination issues are particularly troubling as employers—and others in positions of power and responsibility—increasingly consult the Internet when making the decisions that shape people’s lives.”286 Consequently, corporate capture of data tools “may foster discrimination and augment preexisting power imbalances among consumers by funneling the best deals and remedies to the wealthiest and most sophisticated consumers.”287 Increasingly, experts and scholars are calling for data privacy regulations that correct for big data discrimination in the corporate context. Policymakers appear to show growing awareness that algorithmically driven classifications have the potential to foster discrimination almost invisibly when corporate decisionmaking is related to race and ethnicity, age, gender, religion, educational background, zip codes, and other data points that may classify the consumer.288 Other policymakers are becoming

281. See, e.g., Joe Kukura, The Ugly Side of Big Data: Profiling by Race, Class, and Dirty Secrets, PULSEPOINT (May 29, 2014), http://create.pulsepoint.com/hub/3/at/17174370.html? (discussing how companies use “data silos” to “silo” people into “insulting categories” that ensure that life situations are constantly reinforced, based on dataset markers of race, socioeconomic class, or perceived psychological neuroses); Scott Peppet, Regulating the Internet of Things: First Steps Towards Managing Discrimination, Privacy, Security and Consent, 93 TEX. L. REV. 85 (2014); Solon Barocas & Andrew D. Selbst, Big Data’s Disparate Impact, 104 CALIF. L. REV. ___ (forthcoming 2016). 282. LEADERSHIP CONFERENCE ON CIVIL AND HUMAN RIGHTS, CIVIL RIGHTS, BIG DATA, AND OUR ALGORITHMIC FUTURE 6–7 (2014). 283. Id. 284. Title VII of the Civil Rights Act of 1964, Pub. L. No. 88-352, 78 Stat. 241. 285. FED. TRADE COMM’N, DATA BROKERS: A CALL FOR TRANSPARENCY AND ACCOUNTABILITY 56 (May 2014) (citations omitted). 286. LEADERSHIP CONFERENCE ON CIVIL AND HUMAN RIGHTS, CIVIL RIGHTS, BIG DATA, AND OUR ALGORITHMIC FUTURE 16 (2014). 287. Amy J. Schmitz, Secret Consumer Scores and Segmentations: Separating “Haves” From “Have-Nots”, 2014 MICH. ST. L. REV. 1411, 1411. 288. See, e.g., FTC Recommends Congress Require the Data Broker Industry to be More Transparent and Give Consumers Greater Control Over Their Personal Information, FED. TRADE COMM’N (May 27, 2014), https://www.ftc.gov/news-events/press-releases/2014/05/ftc-recommends-congress-require-data-broker-industry-be-more.

more cognizant of government-imposed discrimination that stems from government-led big data programs.289
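The feedback dynamic described in this Section (click-optimized ad delivery that internalizes, and then amplifies, the biases reflected in users' click patterns) can be made concrete with a stylized simulation. Everything below is invented for illustration: the ad names, the click propensities, and the greedy optimizer are assumptions, not a description of Google's actual ad system. The final step audits the simulated outcome with the EEOC's four-fifths rule of thumb for disparate impact, here applied to arrest-ad exposure rates.

```python
# Stylized simulation of the mechanism Sweeney's ad-delivery study
# suggests: a click-maximizing optimizer with no racial intent learns to
# pair arrest-themed ads with black-identifying names because users click
# that pairing more often. All names and numbers here are fabricated.
import random

random.seed(0)

# Assumed (fabricated) user click propensities by name group and ad type.
CLICK_RATE = {
    ("black_identifying", "arrest_ad"): 0.30,
    ("black_identifying", "neutral_ad"): 0.10,
    ("white_identifying", "arrest_ad"): 0.05,
    ("white_identifying", "neutral_ad"): 0.20,
}

def serve_ads(name_group, rounds=2000, explore=200):
    """Greedy click-through optimizer: alternate the two ads during an
    exploration phase, then always serve the ad with the best observed
    click-through rate so far."""
    clicks = {"arrest_ad": 0, "neutral_ad": 0}
    shown = {"arrest_ad": 0, "neutral_ad": 0}
    for i in range(rounds):
        if i < explore:
            ad = ("arrest_ad", "neutral_ad")[i % 2]
        else:
            ad = max(shown, key=lambda a: clicks[a] / shown[a])
        shown[ad] += 1
        if random.random() < CLICK_RATE[(name_group, ad)]:
            clicks[ad] += 1
    return shown

black_served = serve_ads("black_identifying")
white_served = serve_ads("white_identifying")

# Audit the outcome: each group's share of impressions that were
# arrest-themed, compared via the four-fifths (80%) rule of thumb.
def arrest_share(served):
    return served["arrest_ad"] / (served["arrest_ad"] + served["neutral_ad"])

ratio = min(arrest_share(black_served), arrest_share(white_served)) / \
        max(arrest_share(black_served), arrest_share(white_served))
print(f"black-identifying arrest-ad share: {arrest_share(black_served):.2f}")
print(f"white-identifying arrest-ad share: {arrest_share(white_served):.2f}")
print(f"four-fifths ratio: {ratio:.2f}")
```

Under these invented parameters, the optimizer serves the arrest-themed ad for the overwhelming majority of impressions paired with black-identifying names and almost never for white-identifying names, and the exposure ratio falls far below the 0.8 threshold, even though race is nowhere an input to the system. The disparity is produced entirely by optimizing a facially neutral objective (clicks) against biased behavior.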

C. Applying Critical Race Theory to Current Big Data Developments

The core concepts of Critical Race Theory can be applied to critique how big data works under the rule of law and within institutions, and why it matters. Government-led big data programs290 are often structured to achieve small data governance goals: Public safety and crime control, border security and homeland security, national security, and the protection of sovereign interests against existential threats.291 Yet, big data is predictive and correlative in nature.292 Thus, increasingly, big data technologies purport to deliver on big data governance objectives: Predicting crime and terrorism before they occur,293 preventing threats to safety and security,294 allowing for algorithmically informed inferences of

289. See, e.g., ANDORRA BRUNO, CONG. RESEARCH SERV., ELECTRONIC EMPLOYMENT ELIGIBILITY VERIFICATION 13 (Mar. 13, 2009). 290. See, e.g., Jack M. Balkin, Old-School/New-School Speech Regulation, 127 HARV. L. REV. 2296, 2297 (2014) (“The digital era is different. Governments can target for control or surveillance many different aspects of the digital infrastructure that people use to communicate: telecommunications and broadband companies, web hosting services, domain name registrars, search engines, social media platforms, payment systems, and advertisers”); Joshua A.T. Fairfield & Erik Luna, Digital Innocence, 99 CORNELL L. REV. 981, 988–91 (2014) (“The growth of data collection, connection, and parsing capabilities could transform Big Data technologies into an important tool for establishing innocence . . . . [However,] data science and Big Data technologies have been overwhelmingly used to convict”) (citing NAT’L RESEARCH COUNCIL OF THE NAT’L ACADS., STRENGTHENING FORENSIC SCIENCE IN THE UNITED STATES: A PATH FORWARD 179–80 (2009)); Brandon L. Garrett, Big Data and Due Process, 99 CORNELL L. REV. ONLINE 207, 208 (2014) (“As big data becomes increasingly relevant to criminal cases . . . rules capable of handling complex and Big Data discovery should be developed”); EXEC. OFFICE OF THE PRESIDENT, BIG DATA: SEIZING OPPORTUNITIES, PRESERVING VALUES 5 (2014) [hereinafter PODESTA REPORT], https://www.whitehouse.gov/sites/default/files/docs/big_data_privacy_report_may_1_2014.pdf. 291. See, e.g., H. JEFFERSON POWELL, TARGETING AMERICANS: THE CONSTITUTIONALITY OF THE U.S. DRONE WAR (2016); RICHARD A. POSNER, COUNTERING TERRORISM: BLURRED FOCUS, HALTING STEPS (2007); PHILIP B. HEYMANN & JULIETTE N. KAYYEM, PROTECTING LIBERTY IN AN AGE OF TERROR (2005); PHILIP BOBBITT, TERROR AND CONSENT (2008); Erik Luna, The Bin Laden Exception, 106 NW. U. L. REV. 1489 (2012). 292. See, e.g., MAYER-SCHÖNBERGER & CUKIER, supra note 1, at 11–12 (“Though it is described as part of the branch of computer science called artificial intelligence, and more specifically, an area called machine learning, this characterization is misleading. Big data is not trying to ‘teach’ a computer to ‘think’ like humans. Instead, it’s about applying math to huge quantities of data in order to infer probabilities[.]”). 293. See, e.g., Ferguson, Big Data and Predictive Reasonable Suspicion, supra note 3, at 327–29; Jennifer C. Daskal, Pre-Crime Restraints: The Explosion of Targeted, Noncustodial Prevention, 99 CORNELL L. REV. 327 (2014); Elizabeth E. Joh, Policing by Numbers: Big Data and the Fourth Amendment, 89 WASH. L. REV. 35, 42–48 (2014). 294. See, e.g., JENNIFER BACHNER, PREDICTIVE POLICING: PREVENTING CRIME WITH DATA AND ANALYTICS 14 (2013) (“The fundamental notion underlying the theory and practice of

guilt and suspicion,295 facilitating big data identity assessments,296 and enabling an assessment of the potential risks associated with digital data that is collected and analyzed.297 Under big data governance systems, however, it is often digital data—rather than persons—that is deemed suspicious.298 Specifically, in a big data world, entire populations and subpopulations can be categorized as administratively “guilty until proven innocent” by virtue of suspect data. For example, digitally generated suspicion may arise from government-led big data systems that flag suspicious digital data and suspicious database screening results.299 Potentially anyone with a digital trail or database presence may generate data deemed suspect through tools deployed by big data governance methods. Big data governance harms, consequently, result from the mediation of and interference with fundamental liberty interests. Collectively, big data-generated inferences of guilt, and heightened suspicion from database screening and digital watchlisting programs,300 can be characterized as a predictive policing is that we can make probabilistic inferences about future criminal activity based on existing data.”). 295 . See, e.g., Danielle Keats Citron, Technological Due Process, 85 WASH. U. L. REV. 1249 (2008); Matthew Tokson, Automation and the Fourth Amendment, 96 IOWA L. REV. 581 (2011); Margaret Hu, Big Data Blacklisting, 67 FLA. L. REV. 1735 (2015). 296. See, e.g., Laura K. Donohue, Technological Leap, Statutory Gap, and Constitutional Abyss: Remote Biometric Identification Comes of Age, 97 MINN. L. REV. 407 (2012); JENNIFER LYNCH, FROM FINGERPRINTS TO DNA: BIOMETRIC DATA COLLECTION IN U.S. IMMIGRANT COMMUNITIES AND BEYOND (2012); Margaret Hu, Biometric ID Cybersurveillance, 88 IND. L.J. 1475 (2013). 297 . See, e.g., Jack M. Balkin, The Constitution in the National Surveillance State, 93 MINN. L. REV. 1 (2008) [hereinafter Balkin, National Surveillance State]; Jack M. 
Balkin & Sanford Levinson, The Processes of Constitutional Change: From Partisan Entrenchment to the National Surveillance State, 75 FORDHAM L. REV. 489 (2006); Lior Jacob Strahilevitz, Signaling Exhaustion and Perfect Exclusion, 10 J. ON TELECOMM. & HIGH TECH. L. 321 (2012); David Lyon, Biometrics, Identification and Surveillance, 22 BIOETHICS 499 (2008); Erin Murphy, Paradigms of Restraint, 57 DUKE L.J. 1321 (2008). 298. See, e.g., [CIA Chief Technology Officer] Ira “Gus” Hunt, Presentation at Gigaom Structure Data Conference: The CIA’s “Grand Challenges” with Big Data (Mar. 20, 2013) [hereinafter Hunt CIA Presentation] (video and transcript available at https://gigaom.com/2013/03/20/even-the-cia-is-struggling-to-deal-with-the-volume-of-real-time-social-data) (“[Y]ou’re already a walking sensor platform. You guys know this I hope, right? Which is that your mobile device, your smartphone, your iPad, whatever it’s going to be, has got a, just, any number of these things [sensors] . . . . What’s happened is that if you’re a Star Trek fan, like I was when I was a kid, what’s current now is that this mobile platform, your smartphones, have turned into your communicator, they’re becoming your tricorder, and actually they’re becoming your transporter, right?”); Margaret Hu, Small Data Surveillance v. Big Data Cybersurveillance, 42 PEPP. L. REV. 773, 827-36 (2015). 299. See infra Part III.C. 300. See, e.g., Jack M. Balkin, The Constitution in the National Surveillance State, 93 MINN. L. REV. 1, 12 (2008); Jack M. Balkin & Sanford Levinson, The Processes of Constitutional Change: From Partisan Entrenchment to the National Surveillance State, 75 FORDHAM L. REV. 489, 521 (2006). See also Margaret Hu, Big Data Blacklisting, 67 FLA. L. REV.

form of big data discrimination.301 Big data blacklisting describes a new constitutional harm that embodies the characteristics of an old government practice: small data blacklisting.302 In addition, it describes a new tool of governance that is facilitated by the government’s increasing access to big data technologies. At the earliest stages of the big data revolution, big data

1735 (2015) (noting that, increasingly, “blacklisting” is the term often used to characterize digital watchlisting consequences) (citing Jeremy Scahill & Ryan Devereaux, Blacklisted: The Secret Government Rulebook for Labeling You a Terrorist, INTERCEPT (July 23, 2014), https://firstlook.org/theintercept/2014/07/23/blacklisted/; Meghan Keneally, Secret Blacklist Stopping Muslim Residents From Becoming Citizens, Lawsuit Claims, ABC NEWS (Aug. 12, 2014), http://abcnews.go.com/US/secret-blacklist-stopping-muslim-residents-citizens-lawsuit-claims/story?id=24910447; Gail Sullivan, Why the No Fly List Was Declared Unconstitutional, WASH. POST (June 25, 2014) (“Imagine you’re in an airport en route to visit family abroad when suddenly you’re surrounded. Then you are detained and interrogated. You not only miss your scheduled flight, but can’t get on any other flight. . . . And nobody will tell you why. It’s as though you’ve been blacklisted.”), http://www.washingtonpost.com/news/morning-mix/wp/2014/06/25/judge-rules-no-fly-list-unconstitutional/). 301. Tal Z. Zarsky, Understanding Discrimination in the Scored Society, 89 WASH. L. REV. 1375 (2014). 302. The term “blacklist” has been defined in the following way: “[A] list of persons or organizations under suspicion, or considered untrustworthy, disloyal, etc, esp. one compiled by a government or an organization[.]” COLLINS ENGLISH DICTIONARY – COMPLETE & UNABRIDGED (10th ed. 2012), http://dictionary.reference.com/browse/Blacklisted. The concept of “blacklisting” is often associated with 1950s-era McCarthyism.
See, e.g., ELLEN SCHRECKER, THE AGE OF MCCARTHYISM: A BRIEF HISTORY WITH DOCUMENTS 92-93 (1994) (explaining that McCarthy-era blacklisting consequences were often economic in nature, leading to unemployment; yet, “[t]he blacklist took a personal toll as well. Broken health and broken marriages, even suicides were not unknown.”). Several scholars and experts have analogized the blacklisting practices of the prior historic era with contemporary terrorism prevention practices. See, e.g., JEFFREY KAHN, MRS. SHIPLEY’S GHOST: THE RIGHT TO TRAVEL AND TERRORIST WATCHLISTS 8 (2013) (comparing U.S. passport and travel restrictions during the McCarthy era with the No Fly List restrictions after 9/11, and concluding that “[t]he mistakes of the mid-twentieth century are being remade at the start of the twenty-first century”); David Cole, The Difference Prevention Makes: Regulating Preventive Justice, CRIM. LAW & PHIL. (2014), http://link.springer.com/article/10.1007/s11572-013-9289-7/fulltext.html (discussing legal rules that remain in effect that “reflect lessons learned from the McCarthy era, in which thousands of innocent citizens were caught up in the preventive frenzy of anti-communism”) (citing, e.g., GEOFFREY STONE, PERILOUS TIMES: FREE SPEECH IN WARTIME: FROM THE SEDITION ACT OF 1798 TO THE WAR ON TERRORISM 311-426 (2004)); Aaron H. Caplan, Nonattainder as a Liberty Interest, 2010 WIS. L. REV. 1203, 1205-06 (describing the No Fly List and other government efforts to target suspicious individuals or suspicious behaviors as contemporary “government blacklists” and “blacklisting” programs); Daniel J. Steinbock, Designating the Dangerous: From Blacklists to Watch Lists, 30 SEATTLE U. L. REV.
65, 68 (2006) (“It is, therefore, time to ask if something equivalent to the blacklists of fifty years ago is happening again, and, if so, how the twenty-first century use of watch lists [in the post-9/11 counter-terrorism context] might or might not resemble the blacklisting of the McCarthy era.”).

blacklisting programs can be recognized as part of a transformation of governance methods that is now underway. Big data governance harms may be associated with non-classified government-led big data programs such as the No Work List (created as a result of database screening through E-Verify to conduct work eligibility assessments);303 the No Vote List (database screening to conduct voter purges from voter registration rolls through Systematic Alien Verification for Entitlements (SAVE),304 Help America Vote Act (HAVA),305 etc.); and the No Citizenship List (database screening protocols to support immigration detention and deportation under the Priority Enforcement Program,306 the former Secure Communities program,307 etc.). These

303. See, e.g., Press Release, ACLU, Employment Verification Would Create a ‘No Work List’ in the United States (May 6, 2008), https://www.aclu.org/immigrants-rights/employment-verification-would-create-‘no-work-list’-us; Jim Harper, Immigration Reform: REAL ID and a Federal ‘No Work’ List, CATO INST. (June 14, 2007), http://www.cato.org/publications/techknowledge/immigration-reform-real-id-federal-no-work-list. For an overview of the E-Verify program and some of its legal implications, see generally Stumpf, supra note 118 (outlining the procedures and policies of E-Verify). 304. In recent years, state election officials have used the Systematic Alien Verification for Entitlements (SAVE) database screening protocol to conduct voter purges. See, e.g., Fatma Marouf, The Hunt for Noncitizen Voters, 65 STAN. L. REV. ONLINE 66 (2012). For more information on the SAVE database screening program, see U.S. DEP’T OF HOMELAND SEC., PRIVACY IMPACT ASSESSMENT FOR THE SYSTEMATIC ALIEN VERIFICATION FOR ENTITLEMENTS (SAVE) PROGRAM 12 (Aug. 26, 2011) [hereinafter U.S. DEP’T OF HOMELAND SEC., PRIVACY IMPACT ASSESSMENT FOR SAVE (2011)], http://www.dhs.gov/xlibrary/assets/privacy/privacy_pia_uscis_save.pdf. 305. See, e.g., Help America Vote Act of 2002 (HAVA), Pub. L. No. 107-252, 116 Stat. 1666, 1666–1730 (2002) (codified as amended at 42 U.S.C. §§ 15301–15545 (2012)). HAVA requires each state to implement and maintain an electronic database of all registered voters. 42 U.S.C. § 15483(a). HAVA also requires states to verify the identity of the voter registration applicant by cross-checking the applicant’s driver’s license or last four digits of the applicant’s Social Security Number. Id. § 15483(a)(5)(A)(i). If the individual has neither number, the state is required to assign a voter identification number to the applicant. Id. § 15483(a)(5)(A)(ii).
Each state election office is tasked with overseeing election rules and procedures for that state in the implementation of HAVA. President Signs H.R. 3295, “Help America Vote Act of 2002,” SOC. SEC. ADMIN. (Nov. 7, 2002) [hereinafter President Signs HAVA], http://www.ssa.gov/legislation/legis_bulletin_110702.html. 306. The DHS Priority Enforcement Program (PEP) was announced by DHS Secretary Jeh Johnson on November 20, 2014, to replace the Secure Communities (S-COMM) program; however, it appears that the database screening protocols of S-COMM will remain intact under PEP. See Memorandum from Jeh Charles Johnson, Sec’y, Dep’t of Homeland Sec., to Thomas S. Winkowski, Acting Dir., Immigration & Customs Enforcement, 2 (Nov. 20, 2014), http://www.dhs.gov/sites/default/files/publications/14_1120_memo_secure_communities.pdf. 307. Secure Communities (S-COMM) is an interoperability program that facilitates data sharing and database screening protocols between the Federal Bureau of Investigation (FBI), Department of Homeland Security (DHS), and local law enforcement agencies. Important scholarship has addressed multiple legal issues relating to S-COMM in recent years. See, e.g., HIROSHI MOTOMURA, IMMIGRATION OUTSIDE THE LAW 79-83 (2014); Thomas J. Miles & Adam B. Cox, Does Immigration Enforcement Reduce Crime? Evidence from Secure Communities, 57 J.L. & ECON. 937 (2014); Adam B. Cox & Thomas J. Miles, Policing

government-led big data programs establish or deny an individual’s eligibility for certain benefits or rights through database screening. Further, the same issues may be in play with classified and semi-classified big data programs, such as the No Fly List (created as a result of database screening through Secure Flight and other databases for digital watchlist nomination)308 and the Terrorist Watchlist (e.g., Terrorist Screening Database).309 Many emerging big data cybersurveillance310 and dataveillance systems have not been fully interrogated. Yet, these big data systems are

Immigration, 80 U. CHI. L. REV. 87 (2013); Christopher N. Lasch, Rendition Resistance, 92 N.C. L. REV. 149 (2013). DHS explains that S-COMM is justified by a combination of authorities. See Memorandum from Riah Ramlogan, Dep. Principal Legal Advisor, to Beth N. Gibson, Asst. Dep. Dir., U.S. Dep’t of Homeland Sec., U.S. Immigration and Customs Enforcement (Oct. 2, 2010), http://uncoverthetruth.org/wp-content/uploads/2012/01/Mandatory-in-2013-Memo.pdf. The authorities relied upon by DHS include: [1] that 28 U.S.C. § 534(a)(1) and 28 U.S.C. § 534(a)(4) together provide the FBI with authority to share fingerprint data with ICE/DHS; [2] that 8 U.S.C. § 1722 mandates the development of a data sharing system that “enable[s] intelligence and law enforcement agencies to determine the inadmissibility or deportability of an [undocumented immigrant]”; and [3] that 42 U.S.C. § 14616 ratifies information or database sharing between federal and state agencies. Id. at 4-6. 308. See U.S. DEP’T OF TRANSP., TRANSP. SEC. INTELLIGENCE SERV., POWERPOINT: TSA WATCH LISTS 2 (2002) [hereinafter U.S. DEP’T OF TRANSP., TSA WATCH LISTS], http://www.aclunc.org/cases/landmark_cases/asset_upload_file371_3549.pdf (released into the public record by the ACLU as part of a settlement agreement in Gordon v. FBI, No. C-03-1779 (N.D. Cal. Jan. 24, 2006)); see also Mohamed v. Holder, 995 F. Supp. 2d 522, 525 (2014). In Mohamed, the court provides the following historical summary of the No Fly List:

Following the attacks of September 11, 2001, Congress and the President mandated that federal executive departments and agencies share terrorism information with those in the counterterrorism community responsible for national security. Piehota Decl., ¶ 4. Specifically, Congress directed the TSA, “in consultation with other appropriate Federal agencies and air carriers, establish policies and procedures requiring air carriers (A) to use information from government agencies to identify individuals on passenger lists who may be a threat to civil aviation or national security; and (B) if such an individual is identified, notify appropriate law enforcement agencies, prevent the individual from boarding an aircraft, or take other appropriate action with respect to that individual.”

Id. (quoting 49 U.S.C. § 114(h)(3) (2012)). 309. The Terrorist Screening Database (TSDB) is “often referred to as the ‘Terrorist Watchlist[.]’” About the Terrorist Screening Center, FEDERAL BUREAU OF INVESTIGATION, U.S. DEP’T OF JUSTICE, https://www.fbi.gov/about-us/nsb/tsc/about-the-terrorist-screening-center. 310. See, e.g., LAWRENCE LESSIG, CODE VERSION 2.0 209 (2006) (describing cybersurveillance or “digital surveillance” as “the process by which some form of human activity is analyzed by a computer according to some specified rule. . . . [T]he critical feature in each [case of surveillance] is that a computer is sorting data for some follow-up review by some human.”).

rapidly proliferating as part of a post-9/11 policy prescription to assess and take action to prevent potential criminal and terrorist threats.311 The impact of government-led big data screening programs is especially critical in how these programs affect those who are subjected to specific administrative and investigatory actions as a result of digital screening protocols and data analytic processes. When applying Critical Race Theory to an interrogation of the development of big data governance, an emphasis on narrative is required. Under Critical Race Theory, it is insufficient to simply acknowledge that new harms are emerging under newly adopted big data tools. The theory demands endeavoring to uncover the underlying sources of power. Critical Race Theory holds that narrative and a reassertion of identity capture an expression of harms that may be otherwise invisible, and restore justice in a way that transcends legal redress alone.312 Critical Race Theory contends that many race-based harms are unseen and that discrimination manifests itself in complex ways that lie outside the reach of the law. Because there is often no articulation of legal harm and no redress of harm under the law, the theory holds that there is a need for art forms, such as the art of the story. One of its primary goals is to deconstruct structural hierarchies of legal and governmental power through creating art. The story can transcend structural power. Critical Race Theory is thus both a theory of power—how power can be seized through a story—and a theory of how to critique power—how invisible forces and sources of power can be deconstructed. An application of the theory can now generate counternarratives that challenge the prevailing narratives surrounding big data governance developments.313

III. CRITICAL DATA THEORY

To explicate the parallels between race negotiation and digital data negotiation within legal contexts, Part II discussed how Critical Data Theory, as informed by Critical Race Theory, can be applied to current developments in the expansion of government-led big data programs and big data forms of policymaking. Part III, however, turns to why Critical Data Theory is not a story about race. Critical Data Theory is

311. See, e.g., Daskal, supra note 30, at 328–30. 312. See, e.g., Williams, supra note 1; ERIC K. YAMAMOTO, MARGARET CHON, CAROL L. IZUMI, FRANK H. WU & JERRY KANG, RACE, RIGHTS AND REPARATION: LAW AND THE JAPANESE AMERICAN INTERNMENT (2001); Camille Gear Rich, Performing Racial and Ethnic Identity: Discrimination by Proxy and the Future of Title VII, 79 N.Y.U. L. REV. 1134 (2004) (discussing how identity performance theories currently fall outside of Title VII jurisprudence and why “courts cannot construct coherent narratives in cases involving race/ethnicity performance.”). 313. See, e.g., Cohen, What Privacy Is For, supra note X, at 1924.

a story about how best to critique algorithmic- and big data-centered power and the big data construction of digital identity under a wide range of power regimes, including law and policy. In an Information Society, the digital economy and big data governance tools unfold virtually in the Cloud, over the Internet, through database screening and digital watchlisting programs, and through algorithmic scoring systems. The manner in which power is shifting means that the questions and answers surrounding how to preserve core emancipatory principles are also rapidly shifting. How privacy and freedom of association, speech, thought, and protest are understood and threatened in a big data world now anchor these questions. Small data governance accountability systems appear to be unable to keep pace with big data governance ambitions.314 Where power is based upon algorithmic decisionmaking, a key concern involves the proper investigatory methods or methods of critique to determine what happens in law and policymaking as a result of digitally constructed identities. Understanding the implications of digitally constructed identities in a big data world can borrow from understanding the implications of racially constructed identity in a small data world. As critical race theorists have noted, deconstructing racial identity means understanding that identity politics are incredibly sophisticated, complex, nuanced, and highly individualized and socially contextualized. Race identity can be performative and mediated, constructed and negotiated, self-managed and externally managed, accepted and dismissed, and subject to both inter- and intra-group discrimination.

A. Emerging Big Data Techniques of Governance

To help illustrate this historically significant transition from small data surveillance methods to big data cybersurveillance methods,315 it is

314. See, e.g., MAYER-SCHÖNBERGER & CUKIER, supra note 4; Ferguson, supra note X; Joh, supra note X; Daskal, supra note X; Hu, Small Data Surveillance, supra note X, at 804–05. 315. Following the Snowden disclosures, at least one expert has asserted that the NSA is attempting to merge big data tools with small data tools. See, e.g., Kate Crawford, The Anxieties of Big Data, THE NEW INQUIRY (May 30, 2014), http://thenewinquiry.com/essays/the-anxieties-of-big-data/ (“[A Squeaky Dolphin PowerPoint slide from the Snowden disclosures] outlines an expansionist program to bring big data together with the more traditional approaches of the social and humanistic sciences: the worlds of small data. . . . [A]nd it is all about supplementing [big] data analysis with broader sociocultural tools from anthropology, sociology, biology, history, psychology, and economics.”). Scholars and experts have also juxtaposed small data policing and surveillance practices with big data policing and surveillance practices as a way to deepen the legal and constitutional discourse. See, e.g., Ferguson, supra note X, at 329 (discussing in the context of predictive policing practices: “[W]hat happens if this small data suspicion [suspicion that is ‘individualized to a particular person at a particular place’] is replaced by ‘big data’ suspicion?”); Toomey & Kaufman, supra note X, at 847 (describing a “notice paradox” whereby in a small data surveillance world “notice was all but automatic in most cases . . . [because] searches were confined to a physical world”; however, with big data surveillance methods and “the rise of electronic surveillance conducted remotely and surreptitiously[,]” the authors observe that “the government has achieved an unprecedented measure of control over when, and to whom, notice [of the surveillance] is given.”).

instructive to focus on one sentence extracted from one alleged document released from the Snowden archives. In a New York Times article by James Risen and Laura Poitras,316 the authors discuss NSA documents from the Snowden disclosures that focus on biometric data collection317 (e.g., scanned fingerprints, irises, digital photos and facial recognition technology, and DNA).318 They note that one NSA document explains: “‘It’s not just the traditional communications we’re after: It’s taking a full-arsenal approach that digitally exploits the clues a target leaves behind in their regular activities on the net to compile biographic and biometric information’ that can help ‘implement precision targeting.’”319 316. James Risen & Laura Poitras, N.S.A. Collecting Millions of Faces from Web Images, N.Y. TIMES, June 1, 2014, at A1, http://www.nytimes.com/2014/06/01/us/nsa-collecting-millions-of-faces-from-web-images.html. 317. Whether biometric data is defined in the disclosures is not mentioned; however, the authors specifically reference the NSA’s interest in specific types of biometric data: “facial recognition technology” and “facial images [from digital photos, videoconferences, etc.], fingerprints and other identifiers.” Id. (“While once focused on written and oral communications, the N.S.A. now considers facial images, fingerprints and other identifiers just as important to its mission of tracking suspected terrorists and other intelligence targets, the documents show.”). 318. See, e.g., Margaret Hu, Biometric ID Cybersurveillance, 88 IND. L.J.
1475 (2013). Biometrics is “[t]he science of automatic identification or identity verification of individuals using physiological or behavioral characteristics.” JOHN R. VACCA, BIOMETRIC TECHNOLOGIES AND VERIFICATION SYSTEMS 589 (2007). Numerous scholars and experts have explored the science and application of biometrics and the consequences of this emerging technology. See, e.g., Laura K. Donohue, Technological Leap, Statutory Gap, and Constitutional Abyss: Remote Biometric Identification Comes of Age, 97 MINN. L. REV. 407 (2012); JENNIFER LYNCH, FROM FINGERPRINTS TO DNA: BIOMETRIC DATA COLLECTION IN U.S. IMMIGRANT COMMUNITIES AND BEYOND (2012); A. MICHAEL FROOMKIN & JONATHAN WEINBERG, CHIEF JUSTICE EARL WARREN INST. ON LAW & SOC. POLICY, HARD TO BELIEVE: THE HIGH COST OF A BIOMETRIC IDENTITY CARD (2012), http://www.law.berkeley.edu/files/Believe_Report_Final.pdf; KELLY A. GATES, OUR BIOMETRIC FUTURE: FACIAL RECOGNITION TECHNOLOGY AND THE CULTURE OF SURVEILLANCE (2011); ANIL K. JAIN, ARUN A. ROSS & KARTHIK NANDAKUMAR, INTRODUCTION TO BIOMETRICS (2011); SHOSHANA AMIELLE MAGNET, WHEN BIOMETRICS FAIL: GENDER, RACE, AND THE TECHNOLOGY OF IDENTITY (2011); BIOMETRIC RECOGNITION: CHALLENGES AND OPPORTUNITIES (Joseph N. Pato & Lynette I. Millett eds., 2010) [hereinafter BIOMETRIC RECOGNITION]; DAVID LYON, SURVEILLANCE STUDIES: AN OVERVIEW 118–36 (2007); VACCA, supra; ROBERT O’HARROW, JR., NO PLACE TO HIDE 157–89 (2005); Robin Feldman, Considerations on the Emerging Implementation of Biometric Technology, 25 HASTINGS COMM. & ENT. L.J. 653 (2003); U.S. GEN. ACCOUNTING OFFICE, GAO-03-174, TECHNOLOGY ASSESSMENT: USING BIOMETRICS FOR BORDER SECURITY (2002) [hereinafter GAO TECHNOLOGY ASSESSMENT], http://www.gao.gov/assets/160/157313.pdf; SIMSON GARFINKEL, DATABASE NATION: THE DEATH OF PRIVACY IN THE 21ST CENTURY 37–67 (2000). 319. Risen & Poitras, supra note 316.
The use of the term “targeting” in this alleged 2010 NSA document from the Snowden disclosures does not


In the intelligence context, big data cybersurveillance and mass dataveillance tools may now risk conflating the virtual representation of a digital avatar with an actual person. Viewed through the lens of this risk, the reference to the “target” in the NSA document above may be more appropriately and descriptively characterized as a digital avatar, in that the “target” appears to be a product of data fusion,320 or an amalgamation of data.321 Nonetheless, it appears that the legal or other consequences that may flow from big data cybersurveillance or mass dataveillance methods are suffered by the person associated with the suspicious digital data. The person associated with the digital avatar is assumed guilty, or suffers from the inferred guilt of the digital avatar’s technological surrogate, such as a smartphone.322 From the Snowden disclosures, it appears that the “full-arsenal approach” to newly emerging mass surveillance methods employs data science323 to determine the data-driven suspicion and guilt of digital appear to be defined. However, the term “targeting” in the defense and intelligence context has been defined as “[t]he process of selecting and prioritizing targets and matching the appropriate response to them, considering commander’s objectives, operational requirements, capabilities, and limitations.” See U.S. DEP’T OF DEFENSE, OFFICE OF COUNTERINTELLIGENCE, DEFENSE CI & HUMINT CENTER, DEFENSE INTELLIGENCE AGENCY, GLOSSARY (UNCLASSIFIED), TERMS & DEFINITIONS OF INTEREST FOR DOD COUNTERINTELLIGENCE PROFESSIONALS, at GL-167 (2011), http://fas.org/irp/eprint/ci-glossary.pdf; see also id.
(defining the counterintelligence community’s use of the word “target” as “1) An entity or object considered for possible engagement or other action; 2) in intelligence usage, a country, area, installation, agency, or person against which intelligence operations are directed; 3) an area designated and numbered for future firing; and 4) in gunfire support usage, an impact burst that hits the target.”). 320. In the intelligence context, “fusion” or “data fusion” has been described as “the collection of information from myriad sources to be organized and analyzed for a fuller picture of terrorist or other threats.” PRIEST & ARKIN, supra note X, at 92. In the consumer context, “data fusion” has been defined in the following way: “Data fusion occurs when data from different sources are brought into contact and new facts emerge[.]” PRESIDENT’S COUNCIL OF ADVISORS ON SCIENCE AND TECHNOLOGY (“PCAST”), BIG DATA AND PRIVACY: A TECHNOLOGICAL PERSPECTIVE (May 2014) [hereinafter PCAST REPORT], https://www.whitehouse.gov/sites/default/files/microsites/ostp/PCAST/pcast_big_data_and_privacy_-_may_2014.pdf; see infra Parts III–IV. Several scholars and experts have explored the legal and surveillance implications of data fusion centers that have been created by the government, particularly after the terrorist attacks of September 11, 2001. See, e.g., PRIEST & ARKIN, supra note X, at 92–93; Slobogin, Panvasive Surveillance, supra note X; Danielle Keats Citron & Frank Pasquale, Network Accountability for the Domestic Intelligence Apparatus, 62 HASTINGS L.J. 1441 (2011). 321. See infra Parts III–IV. 322. For example, according to one media report, a drone operator explained that drone strikes may target not a suspicious person, necessarily, but rather a digital avatar proxy—suspicious phones. Scahill & Greenwald, supra note X
(“We’re not going after people—we’re going after their phones, in the hopes that the person on the other end of that missile is the bad guy.”); see also Margaret Hu, Big Data Blacklisting, 67 FLA. L. REV. 1735 (2015). 323. Scahill & Greenwald, supra note X (“According to a former drone operator for the military’s Joint Special Operations Command (JSOC) who

avatars.324 In other words, in a big data world, the intelligence community may view the digital avatar or technological surrogate as a proxy for the actual person targeted.325 also worked with the NSA, the agency often identifies targets based on controversial metadata analysis and cell-phone tracking technologies.”). 324. Like the terms “small data” and “big data,” which are themselves not yet clearly defined, the terms “data science,” “data-driven science,” and “big data science” have no set, agreed-upon definition. Generally, however, “[i]n contrast to new forms of empiricism, data-driven science seeks to hold to the tenets of the scientific method, but is more open to using a hybrid combination of abductive, inductive and deductive approaches to advance the understanding of a phenomenon.” Rob Kitchin, Big Data, New Epistemologies and Paradigm Shifts, BIG DATA & SOC’Y J., Apr.–June 2014, at 1. Data science, thus, has been used to describe a new field of study and academic research that is dependent upon big data technological developments and tools. According to a National Science Foundation Solicitation, the term “big data science & engineering” appears to include the study of the: core scientific and technological means of managing, analyzing, visualizing, and extracting useful information from large, diverse, distributed and heterogeneous data sets so as to: accelerate the progress of scientific discovery and innovation; lead to new fields of inquiry that would not otherwise be possible; encourage the development of new data analytic tools and algorithms; facilitate scalable, accessible, and sustainable data infrastructure[.] NAT’L SCI. FOUND., SOLICITATION 12-499, CORE TECHNIQUES AND TECHNOLOGIES FOR ADVANCING BIG DATA SCIENCE & ENGINEERING (2012), http://www.nsf.gov/pubs/2012/nsf12499/nsf12499.htm. 325. Multiple scholars are carefully examining the legal implications of targeted killing policies and drone strikes as a linchpin of U.S. counterterrorism policy.
See, e.g., Martin S. Flaherty, The Constitution Follows the Drone: Targeted Killings, Legal Constraints, and Judicial Safeguards, 38 HARV. J.L. & PUB. POL’Y 21 (2015); Oren Gross, The New Way of War: Is There A Duty to Use Drones?, 67 FLA. L. REV. 1 (2015); Gregory S. McNeal, Targeted Killing and Accountability, 102 GEO. L.J. 681 (2014); Douglas Cox & Ramzi Kassem, Off the Record: The National Security Council, Drone Killings, and Historical Accountability, 31 YALE J. ON REG. 363 (2014); David W. Opderbeck, Drone Courts, 44 RUTGERS L.J. 413 (2014); Matthew Craig, Targeted Killing, Procedure, and False Legitimation, 35 CARDOZO L. REV. 2349 (2014); Jenny-Brooke Condon, Illegal Secrets, 91 WASH. U. L. REV. 1099 (2014); Jennifer Daskal, The Geography of the Battlefield: A Framework for Detention and Targeting Outside the ‘Hot’ Conflict Zone, 161 U. PA. L. REV. 1165 (2013); AMOS GUIORA, LEGITIMATE TARGET: A CRITERIA-BASED APPROACH TO TARGETED KILLING (2013); Deborah Pearlstein, Enhancing Due Process in Targeted Killing, AMERICAN CONSTITUTION SOCIETY ISSUE BRIEF (Oct. 2013); Richard Murphy & Afsheen John Radsan, Notice and an Opportunity to Be Heard Before the President Kills You, 48 WAKE FOREST L. REV. 829 (2013); Alberto R. Gonzales, Drones: The Power to Kill, 82 GEO. WASH. L. REV. 1 (2013); Leila Nadya Sadat, America’s Drone Wars, 45 CASE W. RES. J. INT’L L. 215 (2012); Carla Crandall, Ready . . . Fire . . . Aim! A Case for Applying American Due Process Principles Before Engaging in Drone Strikes, 24 FLA. J. INT’L L. 55 (2012); Pardiss Kebriaei, The Distance Between Principle and Practice in the Obama Administration’s Targeted Killing Program: A Response to Jeh Johnson, 31 YALE L. & POL’Y REV. 151 (2012); Mark V. Vlasic, Assassination & Targeted Killing-A Historical and Post-Bin Laden Legal Analysis, 43 GEO. J. INT’L L. 259 (2012); Robert Chesney, Who May Be Killed? Anwar al-Awlaki as a Case Study in the International Legal Regulation of Lethal Force, in Y.B. INT’L HUMANITARIAN L. (M.N. Schmitt et al. eds., 2011); Lesley Wexler, Litigating the Long War on Terror: The Role of al-Aulaqi v. Obama, 9 LOY. U. CHI. INT’L L. REV. 159 (2011); Philip Alston, The CIA and Targeted Killings Beyond Borders, 2 HARV. NAT’L SEC. J. 283 (2011); Kenneth Anderson, Targeted Killing in U.S. Counterterrorism

2016] CRITICAL DATA THEORY 65

Therefore, data that was once collected and aggregated for small data governance purposes—for example, the distribution of benefits through the Social Security numbering system326—can now be used to serve other ex ante big data goals. For example, Social Security Administration databases are now used to screen for employment eligibility to serve immigration control objectives (No Work List),327 to protect voter integrity (No Vote List),328 and potentially to prevent crime and terrorism (No Citizenship List329 and No Fly List330). Recent comprehensive immigration reform bills have suggested the need to adopt a “high-tech” Social Security Card that would incorporate biometric data into the Social Security numbering system, for example, a machine-readable DNA-based Social Security Card.331 Such efforts exponentially increase the potential for big data governance systems

Strategy and Law, in LEGISLATING THE WAR ON TERROR: AN AGENDA FOR REFORM (Ben Wittes ed., 2009); Richard Murphy & Afsheen John Radsan, Due Process and Targeted Killing of Terrorists, 31 CARDOZO L. REV. 405 (2009); Daphne Barak-Erez & Matthew C. Waxman, Secret Evidence and the Due Process of Terrorist Detentions, 48 COLUM. J. TRANS. L. 3 (2009). 326. See, e.g., Balkin, National Surveillance State, supra note X. 327. E-Verify enables employers to obtain software from DHS that allows them to screen personally identifiable data (e.g., name, birthdate, and Social Security number) through government databases over the Internet to “verify” the identity and employment eligibility of the employee. See, e.g., U.S. DEP’T OF HOMELAND SEC., U.S. CITIZENSHIP & IMMIGRATION SERVS., E-VERIFY USER MANUAL FOR EMPLOYERS 1 (2014), http://www.uscis.gov/sites/default/files/files/nativedocuments/E-Verify_Manual.pdf; Illegal Immigration Reform and Immigrant Responsibility Act of 1996 (IIRIRA), Pub. L. No. 104-208, 110 Stat. 3009 (1996). 328. The Help America Vote Act of 2002 (HAVA) requires states to verify the identity of voter registration applicants through database screening of driver’s license databases and the databases of the Social Security Administration. See, e.g., Pub. L. No. 107-252, 116 Stat. 1666, 1666–1730 (2002) (codified as amended at 42 U.S.C. § 15483(a)(5)(A)(i)). 329. The DHS Prioritized Enforcement Program (PEP) facilitates data sharing and database screening protocols between the FBI, DHS, and local law enforcement agencies. See Memorandum from Jeh Charles Johnson, supra note X, at 2. The authorities relied upon by DHS to support PEP include: (1) fingerprint data sharing authority under 28 U.S.C. § 534(a)(1) and 28 U.S.C. § 534(a)(4); (2) data sharing system to determine deportability under 8 U.S.C. § 1722; and (3) database sharing between federal and state agencies under 42 U.S.C. § 14616. See DHS Memorandum from Riah Ramlogan, supra note X, at 4–6. 330. The TSC-maintained database combined the No Fly List with other digitally generated watchlists to create one centralized repository for all suspected international and domestic terrorists. NAT’L COUNTERTERRORISM CTR., WATCHLISTING GUIDANCE 6, 50 (Mar. 2013), https://firstlook.org/theintercept/document/2014/07/23/march-2013-watchlisting-guidance/. See also Vision 100—Century of Aviation Reauthorization Act, Pub. L. No. 108-176, 117 Stat. 2490, 2568 (2003) (codified as amended at 49 U.S.C. § 44903 (2012)); Secure Flight Program, 73 Fed. Reg. 64,018, 64,019 (Oct. 28, 2008) (codified at 49 C.F.R. §§ 1540, 1544, 1560). 331. See, e.g., Hu, Biometric ID Cybersurveillance, supra note X, at 1516–17; Lynn Sweet, Gutierrez Arrested for Immigration Protest; Explains on CBS “Face the Nation,” CHI. SUN-TIMES (May 2, 2010, 8:19 PM), http://blogs.suntimes.com/sweet/2010/05/gutierrez_arrested_for_immigra.html (“I’m ready to give a little blood and a little DNA to prove that I’m legally working in the United States of America”) (transcribing a Face the Nation interview with Congressman Luis Gutierrez (D-Ill.)).

that depend upon the construction and tracking of digital avatars or data selves. Increasingly, especially in immigration reform efforts, the emphasis is on the collection and analysis of biometric data to inform border security and counterterrorism policy.332 Biometric data can be divided into what is referred to as “soft biometrics” and “hard biometrics.”333 “Soft biometrics” includes the categorization of race and ethnicity,334 and can also include characteristics such as age, height, weight, skin and hair color, scars and birthmarks, and tattoos.335 “Hard biometrics” can include digital photographs for facial recognition technology, scanned fingerprints and iris scans, and DNA database screening.336 Among the most experimental biometric data collection and screening technologies are skeletal bone scans, brain scans, ear and eyebrow shapes, breathing and perspiration rates, eye pupil dilation, gait and walking patterns, keystroke patterns, and behavioral patterns.337 For identity management policy objectives and population tracking purposes, an emphasis is placed on the collection and aggregation of personally identifiable information. Personally identifiable data has been defined by the Federal Trade Commission (FTC) as follows:338

332. See, e.g., Hu, Biometric ID Cybersurveillance, supra note X, at 1512–15 (see Table 6). 333. See, e.g., Laura K. Donohue, Technological Leap, Statutory Gap, and Constitutional Abyss: Remote Biometric Identification Comes of Age, 97 MINN. L. REV. 407 (2012); JENNIFER LYNCH, FROM FINGERPRINTS TO DNA: BIOMETRIC DATA COLLECTION IN U.S. IMMIGRANT COMMUNITIES AND BEYOND (2012); A. MICHAEL FROOMKIN & JONATHAN WEINBERG, CHIEF JUSTICE EARL WARREN INST. ON LAW & SOC. POLICY, HARD TO BELIEVE: THE HIGH COST OF A BIOMETRIC IDENTITY CARD (2012), available at http://www.law.berkeley.edu/files/Believe_Report_Final.pdf; KELLY A. GATES, OUR BIOMETRIC FUTURE: FACIAL RECOGNITION TECHNOLOGY AND THE CULTURE OF SURVEILLANCE (2011); ANIL K. JAIN, ARUN A. ROSS & KARTHIK NANDAKUMAR, INTRODUCTION TO BIOMETRICS (2011); SHOSHANA AMIELLE MAGNET, WHEN BIOMETRICS FAIL: GENDER, RACE, AND THE TECHNOLOGY OF IDENTITY (2011); BIOMETRIC RECOGNITION: CHALLENGES AND OPPORTUNITIES (Joseph N. Pato & Lynette I. Millett eds., 2010) [hereinafter BIOMETRIC RECOGNITION]; DAVID LYON, SURVEILLANCE STUDIES: AN OVERVIEW 118–36 (2007); JOHN R. VACCA, BIOMETRIC TECHNOLOGIES AND VERIFICATION SYSTEMS 589 (2007); ROBERT O’HARROW, JR., NO PLACE TO HIDE 157–89 (2005); Robin Feldman, Considerations on the Emerging Implementation of Biometric Technology, 25 HASTINGS COMM. & ENT. L.J. 653 (2003); U.S. GEN. ACCOUNTING OFFICE, GAO-03-174, TECHNOLOGY ASSESSMENT: USING BIOMETRICS FOR BORDER SECURITY (2002) [hereinafter GAO TECHNOLOGY ASSESSMENT], http://www.gao.gov/assets/160/157313.pdf; SIMSON GARFINKEL, DATABASE NATION: THE DEATH OF PRIVACY IN THE 21ST CENTURY 37–67 (2000). 334. Koichiro Niinuma, Unsang Park & Anil K. Jain, Soft Biometric Traits for Continuous User Authentication, 5 IEEE TRANSACTIONS ON INFO. FORENSICS & SECURITY 771, 772 (2010) (defining the characteristics of both “soft” and “hard” biometrics). 335. Id. 336. See VACCA, supra note 333. 337. See, e.g., Margaret Hu, Biometric ID Cybersurveillance, 88 IND. L.J. 1475, 1487 (2013).
338. See, e.g., FTC REPORT: DATA BROKERS, supra note X, at 55.


‘Personal data’ shall mean information from or about consumers, including, but not limited to:

(1) first and last name;

(2) home or other physical address, including street name and name of city or town;

(3) email address or other online contact information, such as an instant messaging user identifier or a screen name;

(4) telephone number;

(5) date of birth;

(6) gender, racial, ethnic, or religious information;

(7) government-issued identification number, such as a driver’s license, military identification, passport, or Social Security number, or other personal identification number;

(8) financial information, including but not limited to: investment account information; income tax information; insurance policy information; checking account information; and credit, debit, or check-cashing card information, including card number, expiration date, security number (such as card verification value), information stored on the magnetic stripe of the card, and personal identification number;

(9) employment information, including, but not limited to, income, employment, retirement, disability, and medical records; or

(10) a persistent identifier, such as a customer number held in a ‘cookie’ or processor serial number.

Daniel J. Solove explains that this personally identifiable data can be assembled into “digital dossiers.”339 From recent disclosures, it appears that “digital dossiers” are now used for database screening and digital

339. Solove, Digital Dossiers and the Dissipation of Fourth Amendment Privacy, supra note X.

watchlisting systems that serve precrime purposes.340 The disparate racial impact of database screening systems and digital watchlisting systems is largely unknown.341 The federal government has released some limited research suggesting that certain government-led big data programs have already resulted in a disparate impact.342 These “race-neutral” databases appear to be leading to profound race-, sex-, and national origin-based consequences.343 Yet, because many government-led big data programs are proliferating under crimmigration-counterterrorism rationales, the data-driven discrimination facilitated by the government’s capture of big data does not look problematic; it looks justified and necessary. The No Fly List, for example, is not presented as a “Flying While Muslim” program, although it has been characterized as one.344 Rather, the No Fly List is presented as a “race-neutral” and “colorblind” database screening and digital watchlisting system that exercises what the government calls “predictive judgment” to prevent terroristic threats.345

B. Applying Critical Data Theory to Big Data Developments

Just as critical race theorists have rejected the inevitability and objectivity of racial categorization, Critical Data Theory rejects the inevitability and objectivity of big data categorizations. Digital data determinations, algorithmic intelligence, and other big data analyses are

340. See, e.g., Hu, Small Data v. Big Data, supra note X; Hu, Big Data Blacklisting, supra note X. 341. See, e.g., Latif v. Holder, supra note X. 342. See, e.g., Westat Report, supra note X; WESTAT, FINDINGS OF THE WEB BASIC PILOT EVALUATION xxi (2007), http://www.uscis.gov/sites/default/files/files/article/WebBasicPilotRprtSept2007.pdf (report submitted to the DHS); DORIS MEISSNER & MARC R. ROSENBLUM, MIGRATION POLICY INST., THE NEXT GENERATION OF E-VERIFY: GETTING EMPLOYMENT VERIFICATION RIGHT 6 (2009), http://www.migrationpolicy.org/pubs/Verification_paper-071709.pdf. 343. Id. See also Margaret Hu, Big Data Blacklisting, 67 FLA. L. REV. 1735, 1798 (2015); LEADERSHIP CONFERENCE ON CIVIL AND HUMAN RIGHTS, CIVIL RIGHTS, BIG DATA, AND OUR ALGORITHMIC FUTURE (2014). 344. ‘Flying While Muslim’: Profiling Fears After Arabic Speaker Removed from Plane, NPR (Apr. 20, 2016), http://www.npr.org/2016/04/20/475015239/flying-while-muslim-profiling-fears-after-arabic-speaker-removed-from-plane; Remarks by Hina Shamsi, Director of ACLU National Security Project, Lunch Keynote Speaker at Washington and Lee University School of Law Journal of Civil Rights and Social Justice Symposium: 50th Anniversary of the Civil Rights Act of 1964 & Voting Rights Act of 1965 (Feb. 20, 2015). 345. See, e.g., Defendants’ Consolidated Memorandum in Support of Cross-Motion for Partial Summary Judgment and Opposition at 52, Latif v. Holder, No. 3:10-CV-00750-BR (D. Or. May 28, 2015) (“The Government carries out this [No Fly List] mandate by making predictive judgments, based on sensitive intelligence reporting and investigative information, as to whether certain individuals present too great a risk to be allowed to board commercial aircraft[]”); Daskal, Pre-Crime, supra note X, at 362.

shaped by a process of data construction. Often they are built by data scientists and driven by underlying assumptions about what knowledge or statistically supported conclusions may be derived from the digital data. Critical race scholars have consistently rejected the notion that race classifications and race-based characterizations are organically derived or scientifically predictive.346 Similar to the work of feminist theorists and queer theorists as well as race theorists, many experts are now actively engaged in the project of deconstructing the social, cultural, and legal constructions of digital data-derived identities: the digital self, the data self, the cyber self, the networked self, and the digital avatar. Just as gender discrimination evades simple analogies to race discrimination—in part because, for example, of the complexities of intersectionality discrimination347 that implicates both—data-derived discrimination cannot be simply analogized to race or gender discrimination. Recent revelations regarding the data profiles of data brokers demonstrate, for instance, the inflection of data identities by race, gender, geography, and class, among other characteristics.348 Additionally, national security disclosures and other investigatory reports have illuminated how big data tools can turn on the ability to engage in identity management, including the effective categorization of individuals by personally identifiable characteristics. The data essentialist349 impulse of a digital economy and post-9/11

346. See, e.g., RICHARD DELGADO & JEAN STEFANCIC, CRITICAL RACE THEORY: AN INTRODUCTION (2001) (“A third theme of critical race theory, the ‘social construction’ thesis, holds that race and races are products of social thought and relations. Not objective, inherent, or fixed, they correspond to no biological or genetic reality; rather, races are categories that society invents, manipulates, or retires when convenient.”). 347. Kimberlé Crenshaw is credited with pioneering the concept of “intersectional” discrimination to describe discrimination specific to Black women, but the term now covers any discrimination against individuals who fall under two marginalized groups. See Kimberlé Crenshaw, Demarginalizing the Intersection of Race and Sex: A Black Feminist Critique of Antidiscrimination Doctrine, Feminist Theory and Antiracist Politics, 1989 U. CHI. LEGAL F. 139 (1989) (exploring the intersection of race and gender discrimination); Kimberlé Crenshaw, Mapping the Margins: Intersectionality, Identity Politics, and Violence Against Women of Color, 43 STAN. L. REV. 1241 (1991) (discussing the impact of intersectionality of race and gender dimensions in the context of violence against women of color); TIMO MAKKONEN, MULTIPLE, COMPOUND AND INTERSECTIONAL DISCRIMINATION: BRINGING THE EXPERIENCES OF THE MOST MARGINALIZED TO THE FORE, INST. HUM. RTS., ÅBO AKADEMI UNIV. 11 (Apr. 2002) (defining intersectional discrimination as “a specific type of discrimination, in which several grounds of discrimination interact concurrently. For instance, minority women may be subject to particular types of prejudices and stereotypes. They may face specific types of racial discrimination, not experienced by minority men.”). 348. See, e.g., FTC REPORT: DATA BROKERS, supra note X; O’HARROW, supra note X; GARFINKEL, supra note X; ANGWIN, supra note X; SCHNEIER, supra note X. 349. See, e.g., Markus Burkhardt, A New Digital Purity? On Architectures for Digital Immateriality, in SEARCH ROUTINES: TALES OF DATABASES 92–93, 100–01 (Lena Brüggemann & Francis Hunger eds., 2015) (explaining that “data essentialism” is a phenomenon that “resurrects” Jacques Derrida’s concept of the ‘transcendental signified’ in

securitization trend reflects that the categorization of the digital avatars of individuals is not a mere byproduct of technological advances that facilitate the digital construction of racial, gendered, and other identities. Instead, understanding the big data and data essentialist impulses exposes the manner in which digital person categorizations—dependent upon race, gender, and other classifications—are propelled by a host of economic and national security policy incentives. Because the democratic project involves the protection of core rights and liberties deemed necessary to allow for the self-construction of complex and autonomous identities and communities, identities and communities in a democracy necessarily defy binary and algorithmically driven categorizations and classifications. Historical evidence in recent decades further underscores how such binary categorizations and classifications can be subject to abuse, even under the rule of law. Critical and other theorists have established that social sorting projects350 have often served subordinating ends, especially when imposed by governmental, corporatized, and academic regimes for purposes deemed scientifically objective or market-driven. Consequently, contemporary economic and policy incentives now present as imperative the need to construct and dissect digital identities, often based upon purportedly scientifically objective or market-driven categorizations that may be highly problematic. Theorists have asserted that gender, racial, and sexual identities, among other identities, are fluid. Critical Race Theory proponents assert that new legal doctrines are necessary to accommodate what Mari Matsuda has described as a “multiple consciousness” identity, an identity that is informed by conflicting and intersecting experiences, particularly experiences of subordination.
Critical Data Theory is crucial to pull out similar analyses of the intersectionality and fluidity of data-constructed identities, and to push for an evolution of legal doctrines to accommodate the emergence of the data self and “digital personhood.” To initiate a rigorous theoretical critique of big data, Critical Data Theory can play an important role in launching an interrogative method similar to that of Critical Race Theory. Critical Race Theory is a theory with a history of multidisciplinary catechization and a preexisting framework for searching legal criticism. It can successfully set the foundation to initiate a new theoretical lens by which to examine big data governance tools and the broader impact of algorithmic technologies. Critical Race Theory is uniquely situated to offer critical insight into big data-related discrimination for several reasons. These reasons include the manner in which: (1) digital data and data science, like race, are offered as a natural and objective referent; (2) big data-based structures of law and power can operate to entrench discrimination in ways that are largely invisible, much like race-based structures of law and power can operate invisibly; and (3) big data tools can manifest race-based proxies in “black box” contexts that make legal and constitutional challenges difficult. The governing philosophy of big data must be subjected to critical theoretical treatment as a prerequisite to assessing its impact on newly emerging legal and policy developments, and on core constitutional rights and principles. The advent of the big data revolution necessitates an evolution of theoretical structures of analysis to properly critique the intersection of digital data, law, and power. Just as Critical Theory utilizes developments in postmodern and poststructural scholarship,351 Critical Data Theory is now required to accomplish similar aims of deconstruction. Put differently, just as Critical Race Theory introduced an approach to legal jurisprudence to deconstruct race classification as it operates in legal contexts, Critical Data Theory is now required to deconstruct the legal and constitutional impact of big data. Critical Race Theory helped unearth and deconstruct racial assumptions that were invisibly embedded in law and policy.

that big data is “a black box full of information” and, thus, “the database becomes the virtually infinite center of our signifying practices that appears to be ‘semiotically transcendental’”) (citing JACQUES DERRIDA, Structure, Sign and Play in the Discourse of the Human Sciences, in WRITING AND DIFFERENCE (1967) and ALAN LIU, LOCAL TRANSCENDENCE: ESSAYS ON POSTMODERN HISTORICISM AND THE DATABASE (2008)); Vincent Yzerbyt, Olivier Corneille & Claudia Estrada, The Interplay of Subjective Essentialism and Entitativity in the Formation of Stereotypes, 5 PERSONALITY & SOC. PSYCH. REV. 141 (2001). 350. See, e.g., OSCAR H. GANDY, JR., THE PANOPTIC SORT: A POLITICAL ECONOMY OF PERSONAL INFORMATION (1993); GEOFFREY C. BOWKER & SUSAN LEIGH STAR, SORTING THINGS OUT: CLASSIFICATION AND ITS CONSEQUENCES (1999).
Critical Data Theory must bring contemporary theorizations of a big data society to bear on traditional areas of the law that are presently colliding with and being reshaped by big data programs.352 This application is necessary to understand and to challenge the underlying philosophies of big data policymaking and big data governance. Underlying presumptions regarding the innate quality of digital data, the reliability of databases, or the inherent objectivity and predictive accuracy of big data products, for example, can mislead an inquiry and thwart other insights.353 Thus, big data tools of governance in particular deserve interrogation for the manner in which they facilitate norms of categorization surrounding social sorting protocols354 and methods of classification.355

351. See, e.g., Ben Agger, Critical Theory, Poststructuralism, Postmodernism: Their Sociological Relevance, 17 ANN. REV. SOC. 105 (1991). 352. See, e.g., supra notes 61, 63, 65. 353. See, e.g., MAYER-SCHÖNBERGER & CUKIER, supra note 4, at 30; boyd & Crawford, Critical Questions, supra note 5; Cohen, What Privacy Is For, supra note X; Crawford & Schultz, supra note 21; Richards & King, Paradoxes of Big Data, supra note X; Polonetsky & Tene, supra note X. 354. See, e.g., GANDY, JR., supra note X; BOWKER & STAR, supra note X. 355. See, e.g., LYON, SURVEILLANCE STUDIES, supra note X; ROSEN, THE NAKED CROWD, supra note X.

Critical Data Theory can contest the embedded and nearly invisible assumptions of big data governance and the constructivity aspects of big data. A theoretical interrogation is necessary to confront assumptions that may lead to disparate impacts: that algorithmic decisionmaking and the underlying databases that support big data systems are race-neutral and mitigate human biases. Critical Data Theory can operate to de-emphasize datafication, or the translation of digital data subject to administrative regulation into analyzable databases. The No Fly List, to take one example, places emphasis on centralized database decisionmaking by replacing traditional gatekeepers of national security policy with an algorithmically driven and centralized big data system that relegates intelligence officers to the position of mere data collectors and transmitters. Without a theoretical lens to deconstruct this development, the No Fly List cannot be fully interrogated through current legal doctrines standing alone. Many post-9/11 big data programs are entrenching structural hierarchies, in bidirectional ways, that are hard to see and interrogate because big data systems are largely invisible.356 Consequently, the need for Critical Data Theory is urgent. The continuing integration of big data tools and datafication into our Information Society marks a moment of historical transformation. As the big data revolution transforms how we capture and analyze data generally—in other words, as we move from a small data world to a big data world—governance structures will necessarily adapt. Governmentality, the attendant or delegated power structures of governance, will also necessarily adapt. Technology has transformed the Information Society. The Information Society has transformed big data governance. Power structures and justifications for the shifting nature of power are necessarily transforming as well.
David Lyon has explained, “[A]s political-economic and socio-technological circumstances change, so surveillance also undergoes alteration, sometimes transformation.”357 As understood through the theorization of Jack Balkin and Sanford Levinson, the National Surveillance State is transformative in that tracking systems are bureaucratized and surveillance technologies are incorporated into day-to-day governance activities, thereby evading traditional legal and constitutional

356. The data and algorithms that inform the No Fly List are classified. See, e.g., Defendants’ Consolidated Memorandum in Support of Cross-Motion for Partial Summary Judgment and Opposition at 52, Latif v. Holder, No. 3:10-CV-00750-BR (D. Or. May 28, 2015) (stating that when faced with the reasons for a nomination to the No Fly List, the United States has argued “such inquiries would inevitably seek to scrutinize reasons for the No Fly determination and support for them—the vast majority of which would implicate classified national security and law enforcement information”). 357. Lyon, supra note X, at 2; see also Balkin, National Surveillance State, supra note X; Balkin & Levinson, supra note X; Murphy, supra note X; CLARK, supra note X; WALLACE & MELTON, supra note X.

analyses. The invisible nature of historical transformations propelled by the big data revolution and digital age demands critical theoretical methods to help see what otherwise cannot be seen or comprehended. Critical Data Theory, like Critical Race Theory, can assist in revealing insights with regard to big data’s impact on emerging legal developments, particularly crimmigration and national security developments that capture technological innovations. The tools of big data and data science allow for legal, scientific, and socioeconomic-political constructions that parallel the manner in which tools of race negotiation and race definition have facilitated legal, scientific, and socioeconomic-political constructions. Tools of indexation and theories of knowledge production, often presented as scientifically objective and epistemological in nature and, therefore, nondiscriminatory, can nonetheless shape the treatment of classes and subclasses of individuals in profoundly disparate ways. The legality and constitutionality of big data techniques of governance cannot be fully understood without the application of analytical frameworks that assess emerging big data identity and big data policy structures through a critical theoretical lens. The question remains: How do you bring the techniques of seizing and critiquing power under Critical Race Theory into a new theory, Critical Data Theory? Arguably, privacy scholars and technology scholars have already recognized that the predominant tools traditionally utilized by critical theorists can be valuable in yielding insight.
James Boyle has suggested that science fiction better explains how the law needs to adapt to technological developments than the law itself can.358 Science fiction, like dystopian literature, is an art form that tells a story about technology and humanity.359 A storytelling method of critique is essential to explain the impact of big data, artificial intelligence, and the algorithmic decisionmaking that predominates in a “black box” society. Without a theoretical approach, the sources and forces of big data power will remain largely unchallenged and unseen. Critical Data Theory—in adopting dissent and discourse, civil rights protests, narratives, and other art forms—can make big data more visible. This visualization forces a conversation on the renegotiation of power under the impact of big data tools. Big data narratives are currently underway. Many appear to be entering through the vehicle of litigation, for instance, by those who are impacted by big data governance systems such as the No Fly List. Other big data narratives have entered discourse through the media, films and documentaries, national security whistleblowers, and other art forms.

358. CONSTITUTION 3.0, ch. X. 359. M. KEITH BOOKER, THE DYSTOPIAN IMPULSE IN MODERN LITERATURE: FICTION AS SOCIAL CRITICISM (1994); M. KEITH BOOKER, DYSTOPIAN LITERATURE: A THEORY & RESEARCH GUIDE (1994).

A common theme can be seen in multiple legal narratives and art narratives: the power imbalance created by technological developments now threatens core democratic and equality principles, the social contract, and humanitarian prerequisites of the natural law tradition. Critical Data Theory, like Critical Race Theory, argues that the law now needs to catch up with the story. Under Critical Race Theory, theorists argue that the story changes the law because the law will not change itself, since the law protects power hierarchies. Like the critical race theorists who depend on the power of the story to deconstruct the story of power, critical data theorists cannot critique the power without a story of how power operates under big data. Yet, Critical Data Theory, like Critical Race Theory, is part of a broader critical tradition that goes beyond simple storytelling.360 The process of deconstruction involves theoretical approaches that attempt to excavate a priori truth. From that excavation, challenges to dominant epistemologies and structures of governance are made possible. As big data drifts from philosophical dimensions to more ideological dimensions,361 a political philosophy that can deconstruct the big data phenomenon becomes more crucial. The theoretical tools of Critical Data Theory cannot stop with

360. See, e.g., Thomas McCarthy, Political Philosophy & Racial Injustice: From Normative to Critical Theory, in PRAGMATISM, CRITIQUE, JUDGMENT 147–68 (S. Benhabib & N. Fraser eds., 2004); Peter Goodrich & Linda G. Mills, The Law of White Spaces: Race, Culture, and Legal Education, 51 J. LEGAL EDUC. 15, 20 (2001).

[C]ritical race theory has had considerable success in making the norms of exclusion explicit and in legitimizing the experiences and narratives of racial outsiders as forms of knowledge, of culture, of institution, and of law. The various tools that have been developed to assert identities and discourses of color have focused on a politics of confrontation, resistance, intersection, and recuperation.

Id.
361. See, e.g., MOROZOV, supra note X, at 5 (2013) (arguing that big data expresses an "ideology that legitimizes and sanctions such aspirations [the drive toward computable or algorithmically-driven solutions as technological] 'solutionism[]'"); Julie Cohen, What Privacy Is For, 126 HARV. L. REV. 1904, 1924 (2013) ("[W]e seem unable to challenge the techniques of Big Data as knowledge-production practices. But the denial of ideology is itself an ideological position[]"); José van Dijck, Datafication, Dataism and Dataveillance: Big Data Between Scientific Paradigm and Ideology, 12 SURVEILLANCE & SOC'Y 197, 198 (2014).

[T]he ideology of dataism shows characteristics of a widespread belief in the objective quantification and potential tracking of all kinds of human behavior and sociality through online media technologies. Besides, dataism also involves trust in the (institutional) agents that collect, interpret, and share (meta)data culled from social media, internet platforms, and other communication technologies.

Id.

narratives. The emancipatory objectives of the theory inquire as well into the capture of big data by the administrative state, the cultural memes surrounding big data, the data science and logic of big data, big data power norms, the technologically mediated networks of social connectivity, the construction of the digital avatar, and the new forms of governance intended to govern digital personhood. The deconstructive process necessarily involves multiple techniques of interrogation that can be borrowed from Critical Theory.

CONCLUSION

Why a new theory? Why not simply apply Critical Race Theory to big data-related discrimination? Is it possible to depend solely upon the tools of Critical Race Theory to resolve the questions now presented by big data? Is big data simply another formally race-neutral governmental practice that disparately impacts the lives of people of color?

A new theory, Critical Data Theory, is necessary because its object of critique and deconstruction is not race-based power, but data-based power. The philosophical governance aspects of big data are unfolding almost invisibly. Big data, as a governing philosophy, can assume the position of an ideology. Thus, a separate academic effort is paramount to engage with big data philosophically, politically, and theoretically. The ambition of Critical Data Theory encompasses the broader vision of Critical Theory: emancipatory principles pursued through the preservation of identity, autonomy, and community. Critical Data Theory compels the critique of algorithmic- and big data-centered power and the governance of digital identities under law and policy. Critical Theory serves both analytically revelatory and politically narrative-driven aims.

Critical Data Theory can borrow aspects of Critical Race Theory to examine the parallel challenges that arise when big data determinations result in disparate impacts or when big data systems impose discriminatory results. Critical Race Theory, however, is not simply a method of critique. It "reflects 'a desire not merely to understand . . . vexed bond[s] between law and racial power but to change . . . [them][.]'"362 Critical Data Theory can operate as a site for social and political mobilization, much like Critical Race Theory. An intervention is urgently needed. Critical Race Theory, by laying a theoretical framework for Critical Data Theory, may now offer the best vehicle for that intervention. At the earliest dawn of the big data revolution, big data governance tools appear particularly objective and efficacious.
They appear especially efficacious in the negotiation of data in legal contexts and the legal treatment of digital persons. The monitoring of networked identities, or the construction and management of digital avatars, often purportedly to

362. Carbado, Critical What What?, supra note X, at 1615.

prevent criminal or terrorist threats, raises new legal challenges that have yet to be met. The ability to construct the digital identities of others, and to base administrative decisionmaking upon that construction, presents unprecedented challenges under preexisting privacy doctrine and data privacy laws. Inadequate theoretical tools risk permanently embedding big data programs and policies into preexisting bureaucratized surveillance systems without the benefit of critical legal or jurisprudential examination. Just as Critical Race Theory contested race-based assumptions and negotiations of power, Critical Data Theory is now needed to contest the big data-based assumptions that define power and legal relationships. A big data world is a world filled with big data-driven knowledge and big data products. Critical Data Theory, thus, serves as a way to narrate and humanize an understanding of big data-centric power and its impact.