Copyright by Anna Veronica Banchik 2019

The Dissertation Committee for Anna Veronica Banchik Certifies that this is the approved version of the following Dissertation:

Throwing Keywords at the Internet: Emerging Practices and Challenges in Human Rights Open Source Investigations

Committee:

Mary Rose, Co-Supervisor
Javier Auyero, Co-Supervisor
Sarah Brayne
Daniel Fridman
Amelia Acker

Throwing Keywords at the Internet: Emerging Practices and Challenges in Human Rights Open Source Investigations

by Anna Veronica Banchik

Dissertation

Presented to the Faculty of the Graduate School of The University of Texas at Austin in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

The University of Texas at Austin
August 2019

Acknowledgements

I am deeply indebted first and foremost to the incredible generosity of the entire staff of the Human Rights Center at U.C. Berkeley’s School of Law, particularly Executive Director Alexa Koenig, Associate Director Andrea Lampros, and then-Lab Director Félim McMahon, who not only graciously allowed me to “follow the Lab around for a year” but warmly welcomed me into the HRC family. I’m still bewildered by the complete trust they placed in me, as they do with each of their students. This dissertation would not have been possible without their constant support, nor without the extensive group of bright, driven, and passionate Lab students who at varying moments in time have been my comrades-in-verification, personal role models, and closest friends. A special thanks to Natalia Krapiva, Youstina Youssef, Haley Willis, Andrea Trewinnard, Elise Baker, Dominique Lewis, and Michael Elsenadi, who liberally shared their insights with me and helped inform my thinking about the complex practices and implications of this work.

This study has benefited from the feedback, mentorship, and support of my dissertation committee, Mary Rose, Javier Auyero, Sarah Brayne, Dani Fridman, and Amelia Acker. My understanding of the issues addressed here, especially regarding platform content moderation, has gotten a recent super-boost from the extraordinary and fun-loving scholars of the Social Media Collective at Microsoft Research Lab – New England (MSR-NE), alongside whom I have had the honor of working just as I was wrapping up this dissertation. A special thanks to Tarleton Gillespie, Nancy Baym, Mary Gray, Elena Maris, Nina Medvedeva, Gili Vidan, and Jabari Evans. I regret that this internship did not come earlier in the research process—I have no doubt the final product would have been infinitely improved. All errors and shortcomings here are my own.

I am also grateful for the funding sources which have made it possible for me to move west and survive in the midst of the Bay Area housing market. Substantial financial support came from the National Science Foundation Graduate Research Fellowship Program (GRFP), the P.E.O. Scholar Award, and the MSR-NE Social Media Collective, in addition to the following contributions from the University of Texas at Austin: the College of Liberal Arts Thematic Diversity Recruitment Fellowship, the Sociology Department Excellence Award, the Ethnography Lab Summer


Fellowship, the Graduate Dean’s Prestigious Fellowship Supplement, and the Office of Graduate Students Professional Development Award.

Finally, I’m eternally grateful to my family and friends who brought me back to life so many times during this process.


Abstract

Throwing Keywords at the Internet: Emerging Practices and Challenges in Human Rights Open Source Investigations

by Anna Veronica Banchik, Ph.D.

The University of Texas at Austin, 2019

Co-Supervisors: Mary Rose and Javier Auyero

Human rights researchers are increasingly turning to the internet to discover, collect, and preserve user-generated content (UGC) documenting human rights abuses. The proliferation of UGC and other kinds of “open source” (publicly accessible) information online makes available more information than ever before about abuses. Only—UGC may be incapable of verification, buried online, or in peril of deletion by platform content moderation, state-coordinated flagging campaigns, or users themselves. Some contexts might produce a scarcity of information while others generate a deluge of data, complicating already lengthy ad hoc verification procedures. Moreover, UGC collection, recirculation, and preservation may endanger or go against the wishes of witnesses and their families.

Drawing on interviews with open source practitioners and experts as well as a year of ethnographic fieldwork at the Human Rights Investigations Lab at U.C. Berkeley’s Human Rights Center, this dissertation examines the practices, logics, and narratives by which a growing number of researchers are applying open source investigative techniques to human rights advocacy and fact-finding.

First, amidst increased scrutiny of platforms’ content moderation, this study highlights the more nuanced roles of platform design elements including algorithms and search functionality in shaping both UGC’s discoverability and investigators’ workarounds—underscoring the ephemerality of UGC itself and of the informational infrastructures through which UGC is sought. Second, this dissertation offers a typology to synthesize how an array of broader social and

structural factors impinge on the verifiability of UGC from varied conflicts and contexts, building on prior research pointing to factors impacting the volume of content available about a given conflict as well as scholarship suggesting that the inclusion of “verification subsidies” (McPherson 2015a) into photos or video footage heightens content’s verifiability. Third, this study examines the emergence of human rights advocates and researchers as self-ascribed content stewards and safeguards. In addition to describing investigators’ affective and ethical commitments to UGC, I point to how pervasive “consent-cutting” discourses combine with decisions to refrain from contacting uploaders in ways that effectively normalize the sustenance of communication and consent gaps with content uploaders, raising ethical questions about responsible data collection and usage.


Table of Contents

List of Figures ...... x
Introduction: Human Rights Open Source Investigations ...... 1
The “Third Generation” of Human Rights Fact-Finding ...... 7
Guiding Research Questions ...... 12
A Turn to Practice and Social Knowledge: The “New” Sociology of Knowledge ...... 14
Chapter Summaries ...... 18
Some Definitions ...... 20
Chapter 1: Like Start-Ups, Full of Great Uncertainty and Possibility ...... 25
The Berkeley Lab ...... 26
UGC Discovery and Verification Techniques: An Incomplete History ...... 30
Chapter 2: Finding the Things ...... 37
Platforms as Intermediaries of Human Rights-Related UGC ...... 41
Imagining and Adapting to the User Upstream ...... 46
Meeting Upstream Users on their Own Terms ...... 48
Refining Results: Boolean Operators and Local Language Search ...... 51
Consideration of Wider Posting Practices ...... 54
Platform Design and Discovery of Human Rights-Related UGC ...... 59
YouTube: Filters and Recommendation Algorithms ...... 59
Twitter: Hashtags and Tweetdeck ...... 63
Facebook: Poor Active Search Functionality and Graph Search Workarounds ...... 65
It Takes Three to Search ...... 71
Chapter 3: Assembling Ground Truth from Afar ...... 78
The Centrality of Cross-Corroboration: UGC Verification at the Lab ...... 84
Incident-level Information: Volumes of Reporting ...... 94
Content-level Information: Verification Subsidies Contained or Attached Therein ...... 107
Place-level Information: Ambient Data about Where the Incident Occurred ...... 112
Evidence of Things Not Uploaded ...... 117


Chapter 4: Of Content Stewards and Safeguards ...... 124
Security Risks in Human Rights Investigations and Advocacy ...... 126
Open Source Investigations as “Rescuing Stories” ...... 133
Consent-Cutting Discourses of Eyewitness Media and Online Information ...... 138
Negotiating Communication Gaps and Security Risks ...... 146
“Just Have It?” ...... 153
Conclusion: Platforms, Participation, and Power in Distributed Social Knowledge Production ...... 160
“Accidental Archives” of Abuse ...... 165
Uncoordinated Coordination in Distributed Knowledge Work ...... 171
Credibility and Verifiability in the Crowdsourcing Commons ...... 175
Study Generalizability and Limitations ...... 178
Directions for Future Research: Consent, Surveillance, and Power ...... 181
Methodological Appendix ...... 189
Negotiating Presence as a Participant and Researcher ...... 194
Limitations ...... 200
Works Cited ...... 203


List of Figures

Figure 2.1: YouTube’s search filters ...... 61
Figure 2.2: Recommendation algorithms at work ...... 62
Figure 2.3: The remnants of a removed YouTube video ...... 63
Figure 2.4: Tweetdeck interface ...... 65
Figure 3.1: Geolocation in action ...... 89
Figure 3.2: Topography sketches for navigating satellite imagery ...... 90
Figure 3.3: Verification’s media assemblages ...... 91


Introduction: Human Rights Open Source Investigations High resolution, good quality, the longest and earliest version online of the video that we had found yet. A few of us crowded around the computer, incredulous, as the student played the video clip and retraced his convoluted trajectory in tracking it down. A background track began. Text wiggled across the screen. Another student translated out loud for the rest of us. Our task for the project was to find user-generated content concerning a set of acts committed a few years ago by a particular terrorist group. On our first pass at discovery, the team had found a short video clip within a mini-documentary created by a UK-based news organization and uploaded onto its YouTube channel. The video clip appeared to be produced by the group to promote their recruitment program. Wondering if there had originally been more to the clip that could be relevant for our project, the team manager assigned a student to find the earliest version of the clip online. He had succeeded.

Eyes wide and staring at the video, I strained to follow the student’s long and meandering process. The Lab’s Director that year, Félim McMahon, once referred jokingly to content discovery (with his characteristic brand of lyrical humor) as “throwing keywords at the internet.” But, it’s hardly ever that straightforward. In this case, somewhat ironically, the student was ultimately able to find the video with the help of a blog post penned by a terrorist monitoring and research institute. Decrying the continued availability of terrorist-related videos online, the blog post contained a spreadsheet listing social media accounts of members of the terrorist group linked to our project. As it turns out, the article proved key to finding a match for our video, as well as dozens of additional video collections, which we quickly perused for content relevant to our project. Most were grisly, of superb quality, and appeared to be created and posted by members of the terrorist network itself.

The next day, several of us gathered at the Lab for a discovery and verification hack-a-thon. We were joined by a visiting social media researcher, who also consulted and conducted online research for a counterterrorism and security think-tank. As I recounted to another student the discovery of the blog post the previous evening, the researcher’s face reddened and ears perked up. The researcher did not share in our enthusiasm and requested to see the article. After

retrieving it and reading it through, the researcher fumed, exclaiming that the monitoring institute shouldn’t have published the article, since, in doing so, it provides the means for others to discover, recirculate, and use terrorist content that ought to be taken offline at all costs…

Human rights investigators are increasingly turning to the internet to discover, collect, and preserve user-generated content (UGC) documenting human rights abuses—and leveraging social media platforms and video-hosting sites such as Facebook, Twitter, and YouTube in particular as de facto repositories for human rights-related UGC. The episode described above points to how a spreading set of investigative techniques tailored to mining publicly available – or “open source” – information online can, with patience and skill, connect researchers to documentation of immense probative and strategic value for their fact-finding, advocacy, and litigation efforts. The proliferation of camera phones, social media platforms, and other information and communication technologies (ICTs) proffers key affordances to human rights investigations, particularly those focusing on conflict zones where states censor, counter, or control information about abuses, or prohibit in-country access to researchers. By enabling on-the-ground actors to produce, upload, and disseminate eyewitness content publicly online, these ICTs make available more human rights-related content to investigators than ever before, while multiplying and diversifying information sources. Compared with traditional human rights methodologies such as in-country research and in-person witness interviewing, investigators may cheaply, remotely, and covertly access information posted online by diverse actors, ranging from bystanders of abuse to perpetrators.

Only—this documentation may be incapable of verification, uploaded for different purposes, buried online, or in peril of deletion. Some contexts might produce a scarcity of information while others generate a deluge of data, making it harder both to find the proverbial needle in the haystack and to distinguish posts, videos, and images that are authentic and accurately described from those that are mis-attributed, doctored, or staged. The process of verifying UGC is often ad hoc and time-consuming. Moreover, the collection, recirculation, and preservation of UGC by human rights advocates may endanger the physical safety, digital security, or psychological wellbeing of witnesses and their families (Gregory 2012b; WITNESS 2016). And, as indicated above, commercial platforms themselves play a central, though

understudied role in shaping, sorting, and censoring human rights-related documentation so sought by investigators (Gillespie 2018; Tufekci 2017). Much of the graphic content human rights investigators are most interested in is precisely the kind flagged as “extremist,” “graphic,” and/or “terrorist” content and targeted for removal by platforms, governments, counterterrorism groups, and other user stakeholders, creating a knotty dilemma described lucidly by one journalist, in which “one person’s extremist propaganda is another person’s war-crime evidence” (Asher-Schapiro 2017).

In light of these circumstances, how are a growing number of human rights researchers adopting open source investigative techniques to discover and verify UGC documenting wars, conflicts, and human rights violations? What do these practices indicate about the expanding role of commercial online platforms as information intermediaries of human rights documentation, and what are their implications for pluralism and participation in human rights fact-finding and advocacy?

To address these questions, this dissertation draws on fifty semi-structured interviews predominantly with open source practitioners and experts as well as a year of ethnographic participant observation at the Human Rights Investigations Lab at U.C. Berkeley’s Human Rights Center. Positioning itself as a leader in the field, the “Lab” is among the world’s first university-based open source investigations training and research sites conducting social media discovery and verification for human rights NGOs like Amnesty International and the Syrian Archive.

First, amidst increasing scrutiny of the removal of human rights-related UGC by commercial social media platforms and video-hosting sites (Kaye 2019; Article 19 et al. 2018; Warner 2019; Access Now 2019a; Kayyali 2019; Electronic Frontier Foundation, the Syrian Archive, and WITNESS 2019), this study draws attention to the more nuanced roles of elements of platform design such as algorithms and search functionality in shaping the availability and discoverability of UGC in human rights investigations. It details the imaginative techniques and workarounds that investigators employ to adapt to “upstream” users and platform interfaces, while also pointing to how platforms do not merely rank, relay, or remove human rights-related documentation, but shape its creation (Tufekci 2017; Howard and Hussain 2013).


Second, this study enhances our understanding of the factors that support or inhibit the verification of UGC emerging from varied contexts and conflicts. Drawing on case studies of investigations on attacks in and Myanmar, this study points to how verification outcomes rely crucially on the universe of online information available about the incident and the place where it occurred, which is itself shaped by broader social factors and disparities with respect to technological infrastructure, local documentation practices, and geopolitical interests. This analysis builds on prior research suggesting that the inclusion of “verification subsidies” (McPherson 2015a) into photos or video footage (e.g., filming street signs and distinctive landmarks at the incident’s location) heightens content’s chances of verifiability, as well as scholarship pointing to factors impacting the volume of content available about a given conflict or human rights violation (Koettl 2017).

Third, this study points to the emergence of human rights advocates and investigators as self-ascribed content stewards and safeguards, probing their affective and ethical commitments to UGC exacerbated in part by anxieties of losing documentation forever due to opaque and algorithmically-driven content moderation policies aimed at targeting graphic videos deemed extremist (Syrian Archive 2019a; Kaye 2019; Biddle 2018; Roberts 2018). I examine how the adoption of practices and policies aimed at reducing the potential harms of their work combine with “consent-cutting” discourses that result in justifying and normalizing the sustenance of communication and consent gaps with content uploaders. In doing so, this study highlights the contours of a central ethical and methodological dilemma in open source investigations in the human rights field and points to how this dilemma raises possible tensions between researchers’ roles as witness advocates and as content safeguards, while also undermining the participatory and empowering promise of these investigations.

Altogether, these findings shed important light on the practices, preoccupations, affects, and actors converging to make up the emerging field of human rights open source investigations—a field which has been characterized on one hand as democratizing human rights fact-finding and, on the other, as a specialized domain of practice for “digital sleuths.” Evoking a CSI-like forensic imaginary, media portrayals of the emerging field are replete with allusions to spy thrillers; they fixate in particular on geolocation and provide accounts of practitioners


“obsessively mapping the constellation of moles on one guy’s neck” (Lapowsky 2019) or “trying to figure out the exact latitude and longitude in which the actress Sharon Stone once posed for a photo in front of the Taj Mahal” (Beauman 2018). A distinct representation of open source investigations maps onto longstanding discursive commitments in the human rights field to witness testimony and frames UGC as equivalent to witnesses’ “voices” and stories. Open source investigators themselves often characterize their work as motivated by political and moral aims and as comprising a kind of “information activism” aspiring to amplify the messages of those most impacted by state violence. Although in reality these two facets could co-exist (e.g., at different moments in UGC’s trajectory), these portrayals promote somewhat clashing “sociotechnical imaginaries” (Taylor 2003; Jasanoff and Kim 2015; McNeil et al. 2017) of the key actors, methods, and politics of human rights open source investigations.

Indeed, these clashing representations of human rights open source investigations tap into a longstanding tension in the enterprise of human rights fact-finding at the heart of today’s human rights advocacy and accountability efforts. “Facts are all-important in justifying international action,” writes Frédéric Mégret (2016: 28). “The failure to produce facts may paralyze action, as when international inaction is justified by the failure to establish that genocide is ongoing.” And yet, human rights fact-finding is subjected to competing demands. To governments, human rights investigations must appear empirically sound, unbiased, and objective in their effects to withstand criticism from detractors and to mobilize international action or intervention. This itself is quite a challenging feat, as the very act of conducting an investigation and advancing certain claims may be perceived or attacked as politically motivated (Mégret 2016). However, to witnesses, the press, and wider publics including private donors, human rights investigations position themselves as in alignment with, and amplifying the voices of, those most impacted. In the 1970s and 1980s, fact-finding evolved from an enterprise largely dominated by experts to a heterogeneous set of actors and methodologies in which witness testimony gathered by NGOs and intergovernmental bodies became pivotal, both for pragmatic and ethical reasons (Alston 2013).

While strengthening the appearance of the human rights field as acting on behalf of witnesses, this shift must be carefully navigated so as not to undermine the legitimacy

that such investigations aspire to project. Methodological, organizational, and discursive strategies have emerged in part to negotiate these conflicting performances: appeals to scientism and legalism in human rights reporting (Moon 2012; Wilson 1997), procedures of research verification and review (Orentlicher 1990), the elite professionalization of human rights investigators and practitioners (Sharp 2016), and, most recently, the adoption of remote-sensing technologies marshaling an apparently objective and omniscient “view from nowhere” (Herscher 2014; Witjes and Olbrich 2017; Rothe and Shim 2018).

If satellite images appeal to “views from nowhere,” the eyewitness media central to human rights open source investigations command the “protestors’ point of view” (Mroué 2012: 25). To what extent, then, do these investigations live up to their portrayals as, on one hand, highly-specialized realms of technical expertise and, on the other, participatory and democratizing models of fact-finding?

Findings in this dissertation temper both expectations by highlighting these investigations’ reliance on “offline” and traditional research methods including witness interviews, and by suggesting that the distributed configurations of work entailed in these investigations structurally disincentivize the inclusion of impacted communities on the ground. Although open source investigative practices for finding and verifying UGC do entail distinctive “ways of seeing,” these techniques are often ad hoc and based on experimentation; they may be picked up with practice and mastered by practitioners with little technical savvy to speak of, and are often downplayed as glorified internet searches by practitioners themselves. More importantly, however, the technoscientific portrayal of open source investigations obscures their reliance on an array of supplemental materials, resources, and labor, including those furnished by traditional research methodologies often carried out on the ground like witness interviewing.

In addition, while indeed leveraging user-generated, volunteered, and crowdsourced information—never to dismiss material out of hand simply based on its source—open source investigations still necessarily rely on signals of credibility and volumes of reporting to verify UGC, resulting in disparities in the ability to verify content from particular contexts and conflicts. Moreover, while clever modes of uncoordinated coordination have emerged between certain

pockets of media producers and human rights investigators, the collection, analysis, and preservation of UGC appears to be undertaken quite regularly without the consent or awareness of content creators and uploaders—calling into question some of the more optimistic portrayals of open source investigations as participatory and empowering for impacted communities. On the contrary, there are greater opportunities and incentives to extract UGC and other kinds of information from impacted communities.

THE “THIRD GENERATION” OF HUMAN RIGHTS FACT-FINDING

Fact-finding and reporting on human rights abuses constitute central practices of human rights NGOs, arguably more so today as shrinking newsrooms have empowered NGOs to act as “the new boots on the ground” for international reporting and breaking stories (Powers 2016). Prior to the rise of international human rights NGOs, human rights fact-finding was undertaken primarily by a group of experts, diplomats, and lawyers tasked by intergovernmental bodies (e.g., International Labor Organization, the United Nations) to compile and synthesize relevant and available information, which included data collected onsite (Alston 2013). The resulting report would be presented to a political entity in efforts to convince it to take appropriate and necessary action. This “first generation” of human rights fact-finding was replaced in the 1970s and 1980s with a new set of methodologies and practices initially introduced by NGOs like Amnesty International and Human Rights Watch, and later adopted by intergovernmental bodies. Fact-finding is central to NGOs and intergovernmental fact-finding bodies in order to “name and shame” states they hold to be responsible for abuses (Alston and Gillespie 2012; Orentlicher 1990). The disputed efficacy of this strategy aside (e.g., Seu 2003), it is critical for investigators and fact-finding bodies that their reporting be regarded as accurate, rigorous, and impartial to maintain their credibility and moral authority (Hopgood 2006) as well as to withstand methodological critiques and accusations of unfairness directed towards them. When so attacked, Orentlicher (1990:85) writes, “perhaps no asset is more important to a human rights NGO than the credibility of its fact-finding and, in particular, its reputation for meticulous methodology.” Indeed, Orentlicher (1990) describes the historical ascendance of NGOs’ influence in the 1970s and 1980s as growing “in direct relation to the persuasiveness of their factual reporting” (134).


No longer dominated by a select group of experts, this second generation of fact-finding marked a decisive though circumscribed expansion in the participation of victims and affected communities in human rights reporting through the prioritization of witness interviewing, upheld both as “an ethical promise” to victims and a “methodological given” (Satterthwaite 2013:63). However, this move towards methodological inclusion spurred institutional structures and strategies aimed at maintaining the perceived accuracy of NGOs’ country reports and advocacy efforts. These tactics included the professionalization of research staff, the cross-examination of witnesses and other methods to verify statements and minimize the possibility of bias, and the creation of internal review processes to ensure quality control (Land 2009:209).

Profound changes in ICTs have been said to usher in a third generation of human rights fact-finding wherein “the context in which fact-finding is taking place, as well as the objectives sought to be achieved, are changing rapidly” (Alston 2013:61). “Human rights researchers are no longer expected to only ‘find’ and report on facts” in this changed information landscape, writes Margaret Satterthwaite (2013:65), but are confronted with pressures to adopt new fact-finding methodologies and digital literacies. Alongside statistical analysis and data visualization, they are “also expected…to verify facts established through crowd-sourcing, social media, and ” (Satterthwaite 2013:65).

A flurry of how-to manuals, workshops, and blogs has emerged in recent years to impart open source investigative practices to journalists and human rights practitioners working with UGC (e.g., Silverman 2014; The Engine Room, Amnesty International, and Benetech 2016; Koettl 2016a; Higgins 2014; Deutch and Habal 2018; WITNESS 2011, 2016; Edison Hayden 2019).1 The two largest international human rights NGOs conducting human rights research, Amnesty International and Human Rights Watch, have both invested in building internal capacity to better leverage UGC, remote-sensing technologies, and other new data streams and sources (Koettl

1 In addition to these guides, dozens of organizations and websites share guides on various techniques and approaches under the umbrella of open source investigations. These include Bellingcat (https://www.bellingcat.com/), Verification Junkie (http://verificationjunkie.com/), Automating OSINT (http://www.automatingosint.com/blog/), Citizen Evidence Lab (https://citizenevidence.org/), Paul Meyer’s Research Clinic (http://researchclinic.net/), and Michael Bazzell’s IntelTechniques (https://inteltechniques.com/).


2016; Fortune 2018; Human Rights Watch 2017a). In its Digital Verification Corps (DVC), a network of six campuses of which Berkeley is a part, Amnesty International enrolls trained university students in social media discovery and verification on its investigations. Amnesty International also enrolls tens of thousands of “digital volunteers” in image and document processing via its microtasking “Decoders” platform.2 For its part, the ICC’s Office of the Prosecutor (OTP) has set out to “[t]rain all relevant staff in the basics of online investigations and handling of electronic evidence” (International Criminal Court 2015: 24). “One of the main challenges,” states its 2016-2018 strategic plan, “is to adapt the Office [of the Prosecutor] to the impact that technology has on its ability to monitor, prove and present crimes. The use of computers, internet, mobile phones, and social media, etc., has exponentially expanded worldwide, including in the countries in which investigations are undertaken by the Office” (International Criminal Court 2015: 23).

The field of human rights open source research appears to be undergoing expansion and professionalization: paid positions for “open source investigations” are emerging in the human rights and journalism fields; partnerships are being forged between investigators, technologists, and legal mechanisms; and strategic efforts are underway to systematize and standardize research procedures with the hopes of increasing the probative value and perceived credibility of UGC in courts (Human Rights Center 2018a).

Existing scholarship on applications of emerging ICTs in the human rights field has outlined affordances, understood as “the actions a given technology facilitates or makes possible” (Tufekci 2017: xi; Nardi 2015; Bucher and Helmond 2017), as well as key challenges and implications with respect to pluralism and power. With respect to affordances, Steven Livingston (2016) has noted that new data and technologies provide human rights actors with access to information (1) from areas prohibited to researchers, (2) in contexts where witness testimony is unavailable, and (3) in new forms, such as numerical and statistical data on dynamics and relationships that were previously unmeasurable. According to Livingston, such affordances

2 Amnesty International. 2019. “Amnesty Decoders FAQ.” Amnesty Decoders. Last accessed March 25, 2019. Retrieved from: https://decoders.amnesty.org/faq.

ultimately empower human rights NGOs in discursive contests with governments denying human rights violations by enabling NGOs to “gather and curate information in ways that were once the sole preserve of the most powerful and technologically sophisticated states” (see also Aday and Livingston 2009). For example, imagery from satellites, unmanned aerial vehicles, and other remote-sensing technologies affords new vantage points and levels of access with which to monitor geopolitical security, detect and document human rights violations, and surveil and attempt to deter perpetrator states (Herscher 2014; Witjes and Olbrich 2017; Rothe and Shim 2018; Litfin 2002).

Social media platforms and video-hosting sites provide avenues for witnesses and media activists to circumvent government-controlled media. The quantity of documentation generated and uploaded online in today’s conflicts is historically unprecedented, even despite government-imposed internet shutdowns and site restrictions in effect in much of the globe (Internet Society 2017; Flamini 2019)—an issue of vital importance to the field of open source investigations but unfortunately outside of the scope of this dissertation. Investigators can remotely and cheaply access information not solely uploaded by witnesses, but bystanders, perpetrators, and other actors in conflict. Whereas fact-finding missions or criminal investigations may commence months or even years after an incident or pattern of abuse has occurred, investigators and human rights groups can actively monitor, collect, and preserve online open source information regarding a conflict almost in real-time, and store information for future accountability efforts. These remote capabilities are also boons for NGOs who may be denied in-country access for research or deem conditions on-the-ground to be too dangerous. Social media and video-hosting sites may potentially furnish linkage or leading evidence in courts by providing information regarding chain of command or hosting UGC which can provide a rich level of detail to help establish the facts of an incident (Koettl 2016a; WITNESS 2016). Eyewitness media can corroborate witness testimony in advocacy and courtrooms, and serve as the catalyst to spark official investigations, grave crimes prosecutions, and arrest warrants, as happened recently when the International Criminal Court (ICC) issued its first ever arrest warrant based primarily on evidence posted to social media (Irving 2017; see also Hiatt 2016; Freeman 2018; Laux 2018; Aronson 2018a; Human Rights Center 2018a). UGC and visual media more generally are also well-suited

to contribute to advocacy campaigns targeting governments and to increase awareness among broader audiences of ongoing conflicts (Ristovska 2016a; McLagan 2003, 2006).

These new technological applications do not come without difficulties, however. The employment of UGC and online information entails volume and verification challenges (McPherson 2015a, 2015b; Meier 2015), spurring investigators to adopt techniques developed by social media news agencies and aggregators of citizen journalism to find and corroborate UGC. And yet, documentation may be removed before investigators are able to reach it; in recent years, human rights advocates have been increasingly vocal about the disappearance of human rights-related media on commercial platforms and its consequences for their work (Warner 2019; Kaye 2019; Access Now 2019a). Another difficulty relates to the ethical considerations and digital and physical security risks that pervade these digital investigations. “Abiding by strict ethical guidelines when conducting open-source investigations,” write Deutch and Habal of the Syrian Archive (2018: 53), “is an ethical minefield, with a constant negotiation of issues of consent, open-access, data quality, data security and the safety of those involved in open-source investigations.” Even if information or media published publicly appears to be anonymized, its aggregation through online research of the kind utilized in open source investigations may result in the identification and disclosure of sensitive information (Land et al. 2012; The Engine Room, Amnesty International, and Benetech 2016: 12). Security risks linked to the creation, collection, re-publication, and preservation of eyewitness media are acutely perceived and appear to shape organizational practices (Aronson 2017; Hamilton 2019; Gregory 2010, 2012a, 2012b; WITNESS 2011, 2016; Piracés 2018a). There is also growing awareness that investigators themselves are at risk of exposure to secondary trauma as a result of working with graphic visual material (Dubberley, Griffin, and Bal 2015; Ellis 2018; Lampros and Koenig 2018). Beyond security, the collection, usage, and preservation of UGC with or without the direct consent or awareness of uploaders raise concerning ethical questions for the field (Aronson 2017; Gregory 2012b; Deutch and Habal 2018), as does human rights NGOs’ use of satellite imagery for advocacy purposes, which has been compared to state surveillance practices (Rothe and Shim 2018; Herscher 2014; Witjes and Olbrich 2017).


Relatedly, scholars have considered implications of emerging technological applications for pluralism and power with respect to the production and usage of human rights knowledge. Many have celebrated the promise of emerging ICTs for enhancing pluralism and even “democratizing” peer-production in human rights advocacy and investigations (Land 2009, 2016; Gregory 2012a, 2010). Open source investigations centered on human rights-related UGC are often framed as amplifying the “voices” and experiences of those most impacted by abuses (e.g., Ristovska 2016a). And yet, the affordances of ICTs and opportunities of participation are not likely to be evenly applicable to all contexts and communities. Ella McPherson (2015a) has noted that verification strategies circulated throughout the human rights field via knowledge exchange resources, events, and networks may not reach the most marginalized of media producers. Accordingly, those who do not know how to provide “verification subsidies” which reduce verification costs downstream, such as panning the camera or citing the date, time, and site of filming while uploading content, may be risking their lives to capture footage which is ultimately unverifiable with current tools and techniques. Moreover, it is possible to view the collection, preservation, and usage of UGC not as empowering to impacted communities but as extractive (Alston and Knuckey 2016: 129), particularly in the absence of content uploaders’ awareness or approval. Jay Aronson (2017) has pointed out that today, large international organizations are better equipped than ever to appropriate information from smaller local groups and deploy it towards purposes which may not reflect the latter’s priorities and intentions (85). Collecting and preserving content with an eye towards legal accountability, for instance, may directly undermine the wishes of content creators or uploaders who might instead favor the pursuit of different justice and accountability mechanisms.

GUIDING RESEARCH QUESTIONS

These studies, many of them authored by open source practitioners themselves, have proffered important insights to the nascent field of human rights open source investigations. Yet, there is still much to learn. Drawing on ethnographic and qualitative methods described further in the Methodological Appendix, this dissertation contributes empirical data and analysis addressing three existing gaps in the literature concerning: (1) the role of platforms in shaping content discovery, (2) the impact of social and structural factors on verification outcomes, and (3) the pragmatic and

discursive negotiations of open source practitioners with respect to issues of consent, communication, and security risks. Therefore, three sets of research questions guide this study:

RQ1: How do investigators find UGC online? In particular, apart from content takedowns, how do elements of platform design such as algorithms and search functionality impact the discovery of human rights-related UGC?

RQ2: How do investigators verify UGC online? In particular, what social, technological, economic, and political factors shape verification success using open source methods?

RQ3: How do investigators perceive their roles in relation to human rights-related UGC and content uploaders? How do they negotiate and rationalize gaps of communication and consent with uploaders?

In addition, though not addressed empirically in this dissertation, there remains a separate, yet fundamental question about the efficacy of UGC and open source investigative techniques in bringing about accountability and preventing future conflicts: to a large degree it is political will and power, not UGC itself, that will bring about these changes. Insofar as open source investigations and their surrounding hype fetishize specialized techniques and data collection and preservation at the expense of other methods and materials, they potentially risk obfuscating, neglecting, or providing distraction from the reality of these larger political forces.

This study also engages longstanding questions, methodological approaches, and insights in the sociology of knowledge (SOK) and science and technology studies (STS) pertaining to the practices and preconditions of knowledge production.3 In particular, findings shed light and invite future research on three areas of inquiry: (1) the nature of commercial online platforms as information storehouses critical to a growing body of knowledge producers; (2) possibilities of and constraints to coordination between extant knowledge producers in the trajectory of UGC and online information; and (3) the production of credibility and verifiability in the context of

3 Both of these fields have alternatively been referred to as science studies and the sociology of scientific knowledge.

online investigations relying on crowdsourced material and labor. These are described more in the conclusion. Next, however, I provide a brief background on the sociology of knowledge.

A TURN TO PRACTICE AND SOCIAL KNOWLEDGE: THE “NEW” SOCIOLOGY OF KNOWLEDGE

Though strongly influenced by Marx, Gramsci, and other sociological forebears, the sociology of knowledge is often traced to Europe in the 1920s and, in particular, the work of Karl Mannheim (1936; 1952[1928]). At its origins, the subfield was largely concerned with mapping the social positions and conditions of knowledge producers, predominantly intellectuals, onto the creation of formal systems of knowledge (Swidler and Arditi 1994). Though a promising field of research, studies adopting this and similar approaches have been criticized as reductive and simplistic with respect to their treatment of knowledge, producers’ social locations, and their mechanisms of interaction (Geertz 1983: 152-3, cited in Swidler and Arditi 1994: 306). Camic, Gross, and Lamont (2011: 6) have characterized such scholarship as adhering to a stark input/output model, with the “output” being a thinker’s corpus of scholarship and its “input” comprising “macrolevel economic, political, and ideological conditions, as well as the thinker’s class- or group-based interests.”

Over time emerged a “new sociology of knowledge” (Swidler and Arditi 1994) composed of scholarship in sociology and other fields including anthropology, history, and the history of science which more broadly examined “how kinds of social organization make whole orderings of knowledge possible” (306). Scholars assessed the concrete practices and sites of knowledge production, along with the materiality of knowledge and its effects of power. For instance, studies considered how the media forms in which knowledge is captured and embedded (e.g., oral traditions, print, broadcast media) shape its structure, archiving, and transmission (see Swidler and Arditi 1994). Influenced by the work of Foucault (e.g., 1965, 1977, 1980), an immense body of trans-disciplinary research has examined the relationship between knowledge creation and the exercise of power as expressed in a spectrum of contexts and settings, including religion, state power, colonialism, scientific theories of race, and a host of social scientific disciplines such as psychology, anthropology, demography, Middle Eastern studies, and sociology itself (Said


1978; Foucault 1965; Tuhiwai Smith 2012; Zuberi and Bonilla-Silva 2008; Smith 1990; Bhambra 2007).

In addition, overlapping with STS, studies in this vein have analyzed the role of conceptual frameworks, techniques, interactions, artefacts, and organizational contexts in knowledge production processes. Ethnographies of scientific laboratories (e.g., Latour and Woolgar 1986[1979]; Knorr Cetina 1999; Traweek 1992; Lynch 1985) demonstrated that focusing methodologically on the practices of knowledge production reveals it to be contingent, situated, and material (as opposed to purely cognitive, solitary, and abstract)—shaped by informal activities and embodied ways of seeing and sense-making that involve “thinking with eyes and hands” (Latour 2011[1990]; see also Law 2008; Mol 2010; Schatzki, Knorr Cetina, and von Savigny 2001).4

Increasingly, sociologists and social scholars have extended approaches applied to the study of knowledge production in the natural sciences, such as a focus on practices and artefacts, to the realm of social knowledge (e.g., Jasanoff 2004; Knorr Cetina and Preda 2006; Stark 2012; Merry 2016; Fourcade 2010). Considering social knowledge to be a “ubiquitous feature of the societal landscape” (Camic, Gross, and Lamont 2011: 3), a 2011 volume edited by Charles Camic, Neil Gross, and Michèle Lamont welcomes the “overdue arrival of social knowledge practices as a central topic for empirical investigation” (13) and an important object of empirical study in its own right.5 Social Knowledge in the Making calls for empirical work to foster understanding of the trajectories, contents, and implications of research in the social sciences and humanities as well as “extra-academic sites where social knowledge is made and put to diverse uses in the private and public sectors” (20). The editors promote STS approaches as promising tools to

4 For instance, recent work has drawn attention to the corporeal practices or “body-work” (Myers 2008) of protein biologists (Myers 2008), NASA engineers (Vertesi 2015), and surgeons (Prentice 2012).
5 The editors define social knowledge for the purposes of the volume as “descriptive information and analytical statements about the actions, behaviors, subjective states, and capacities of human beings and/or about the properties and processes of the aggregate or collective units—the groups, networks, markets, organizations, and so on—where these human agents are situated” (Camic, Gross, and Lamont 2011: 3).

address a crucial gap in the traditional approach to the sociology of knowledge; that is, inquiry into:

the mediating terms—in this case, the day-to-day actions and processes through which the producers of social knowledge actually go about the on-the-ground work of making, evaluating, and disseminating the kinds of social knowledge that they are involved in producing. (6-7)

This dissertation adopts this focus on practice and on the sociotechnical assemblages of devices, actors, data repositories, and other factors which serve as preconditions for what can be discovered and known about human rights abuses from user-generated, crowdsourced, and open source content in the context of open source investigations. It draws particular attention to three facets of these investigations with broader significance for scholarship on social knowledge production.

First, central to knowledge production in the natural sciences and of social knowledge alike are the data sources and storehouses accessible to knowledge producers. Notably, three of thirteen chapters in Social Knowledge in the Making consider how libraries, archives, and similar institutions of information preservation and access rely on a constellation of factors such as material resources, existing technologies, and organizational context; these factors, in turn, are found to shape practices and projects of social knowledge production (Abbott 2011; Grafton 2011; Lemov 2011). Given the widespread employment of commercial online platforms for the creation of social knowledge across disparate fields and applications, there is a pressing need to examine the ways that such platforms reconfigure and constrain knowledge-making practices and even modify what knowledge means and what can be known (boyd and Crawford 2012). This study underlines how platforms do not merely function as information intermediaries governing data collection practices but in fact as dynamic repositories of ephemeral content with potential to shape knowledge-production practices over the course of UGC’s trajectory, from its creation by users to its preservation by researchers.

Second, this study sheds light on heterogeneous coordination practices and signaling strategies embedded into diffuse online knowledge configurations like those enrolled in human

rights open source investigations. Scholarship has identified visibility tactics by which online users such as website operators exploit search engine optimization (SEO) strategies, hypotheses about the workings of sorting algorithms, and “data voids” – keyword fields that are relatively sparse with entries – to surface their own websites and online content (Gillespie 2014; Golebiewski and boyd 2018). Building on this research and McPherson’s (2015a) notion of “verification subsidies,” this study reveals discovery tactics and approaches used by investigators to imagine upstream users; anticipate their posting practices, platforms, and keywords; and thus leverage their potential discovery strategies. Discovery and verification strategies embedded within or affixed to human rights-related UGC are suggestive of how born-digital documentation of human rights violations contains traces and artefacts of platform architectures, since these architectures and decisions are in part what give rise to the discovery and verification techniques used by investigators. An example of how “‘raw data’ is an oxymoron” (Gitelman 2013), this observation suggests that even UGC that appears “raw” or “unmediated” may be laden with intentional signals and subsidies directed towards investigators downstream.

Third, this study examines the practices, signals, and kinds of materials by which online user-generated or crowdsourced information is verified. Given that human rights practitioners conducting remote open source investigations are typically absent from the context in which UGC is produced, “they cannot rely on their direct perceptions of identity clues, communication cues, and contexts to verify civilian witnesses’ accounts” (McPherson 2018: 197) available to researchers interviewing witnesses in person or over the phone. How, then, are verification procedures and credibility signals reconfigured in the context of open source investigations? This study suggests that although open source investigators liberally collect online information, including content uploaded by anonymous users and pro-government accounts regardless of their political affiliation or institutional credentials, verification relies greatly on cross-corroboration between reports and pieces of content. As local environments vary in the technological access, media cultures, and other factors supporting the creation and circulation of information, disparities are produced in the ability to verify UGC across different contexts and conflicts.


CHAPTER SUMMARIES

Next, Chapter 1 (“Like Start-Ups, Full of Great Uncertainty and Possibility”) briefly introduces U.C. Berkeley’s Human Rights Center and its Human Rights Investigations Lab, and situates the Lab’s open source investigative techniques and activities within a broader landscape of projects and initiatives in journalism and the field of human rights. This study does not cover the full gamut of online investigative techniques and approaches, but focuses only on practices conducted at the Lab. These chiefly include manual (not automated) techniques for discovering and verifying visual (rather than purely textual) media posted to Facebook, Twitter, YouTube, and other social media or video-hosting platforms. Some Lab teams also undertake social network analysis; though an important area of future inquiry, these techniques are given less attention here, in part due to non-disclosure agreements required of those teams. The discovery and verification techniques described in this dissertation are common to the key players engaged currently in the practice and training of open source investigations, including Bellingcat, the Syrian Archive, Amnesty International, the Digital Forensics Research Lab, and the International Criminal Court, as well as Storyful, First Draft News, , and the BBC’s Africa Eye.

The subsequent three analytic chapters concern, respectively, discovery, verification, and ethical and affective dimensions of open source investigations. Chapter 2 (“Finding the Things”) details some of the creative and resourceful strategies by which investigators enroll platform understandings and workarounds alongside domain knowledge about specific communities of online users. Platforms do not only function as information intermediaries, shaping the availability and discoverability of human rights-related content sought by investigators. Platforms set rules for what and how content itself is created and what can be known about recordings of state violence. Their lived usages prompt witnesses, bystanders, and perpetrators to share or self-censor documentation. But, users are not completely bound by platform design and architecture. Investigators tailor their techniques and strategies to the specific usages, affordances, and constraints introduced by each online platform, site, or service in the course of

content discovery and verification. To discover UGC relevant to an investigation, practitioners imagine and adapt to the posting practices of such content creators and circulators “upstream,” mimicking their anticipated vernaculars and chosen platforms. In the midst of growing scrutiny of platforms for their removal of human rights-related content (Article 19 et al. 2018; Asher-Schapiro 2017; Warner 2019; Access Now 2019a; Kayyali 2019), these findings provide a detailed and nuanced account of the many ways platforms shape investigation processes and outcomes.

Chapter 3 (“Assembling Ground Truth from Afar”) describes how, to verify and geolocate content, investigators assemble and cross-corroborate all they have collected, while enrolling a suite of free or proprietary digital resources including language translation, sites with volunteered geographic information, and commercial satellite-imagery. Given disparities of media coverage, internet access, online services, and other factors, this diffuse configuration of knowledge production unevenly advantages content from certain contexts and conflicts over others. Building on scholarship noting issues of selection bias in online data collection (Price and Ball 2014; Ball 2016; Aronson 2016), these findings suggest that social and digital disparities do not merely influence the existence and online appearance of documentation of abuse; they also shape the ability for such documentation to be verified using open source methodologies alone. With respect to resource allocation in the field of human rights and within particular organizations, these findings underline the continued importance of traditional albeit more resource-intensive research methods such as in-person witness interviewing.

As described in Chapter 4 (“Of Content Stewards and Safeguards”), many open source practitioners perceive online human rights investigations as humanizing, intimate, and political acts of witness empowerment, treating UGC as comparable to witnesses’ stories and acting as advocates, stewards, and safeguards of UGC, particularly in the face of platform content moderation and removals. At the same time, gaps of communication and consent with content creators and uploaders are prevalent within open source investigations – and in certain contexts systematically so. Practitioners negotiate and normalize these gaps by imputing consent onto information posted online, perceiving the need to collect and preserve human rights-related content online as outweighing considerations of consent, or, in the last instance, treating publicly available content online and on social media as fair game to use. These findings highlight the

19 need for practitioners and advocates to reconsider pervasive practices and discourses with respect to data collection and use that disincentivize consent-seeking, and to challenge the seduction of structural incentives to exclude impacted communities from participation in efforts leveraging their documentation.

Finally, the dissertation ends with a concluding chapter (“Platforms, Participation, and Power in Distributed Social Knowledge Production”) summarizing the study’s key findings, themes, and limitations and offering directions for future research, followed by a short appendix outlining the methodologies employed for the dissertation (“Methodological Appendix”).

SOME DEFINITIONS

Elsewhere, user-generated content is referred to variously as amateur footage, citizen journalism or videography, or eyewitness media. In this dissertation, I use the term user-generated content or UGC to allude to textual or visual content created and uploaded by a user of an online technological platform, such as a wiki, blog, or social media site. This differs from definitions of UGC referring exclusively to visual media or denoting the non-professional status of the content creator.6 It is worth noting that important critiques have been raised of the term “user.” Some technological designers have argued that the word is simply dehumanizing (Lefton 2018). User experience (UX) researchers joke somewhat morosely that “the only other industry who names their customers ‘users’ are drug dealers” (Teixeria 2019). Others have attacked the word as obscuring actual practices and patterns of use, for instance by failing to distinguish between accounts operated by humans or bots. boyd and Crawford (2012: 669) point to the fact that “[s]ome users have multiple accounts, while some accounts are used by multiple people.” And some platforms, like Twitter, don’t require individuals to open an account to access content at all. Indeed, in the context of human rights advocacy and investigations, the term “user” reveals itself to be even more problematic. Governments may pose as individual users and create posts or flag others’ content. Witnesses or bystanders may use throwaway accounts to post

6 For instance, a Tow Center report on the use of UGC across newsrooms defines UGC for the purposes of the report as “photographs and videos captured by people who are not professional journalists and who are unrelated to news organizations” (Wardle, Dubberley, and Browne 2014: 10).

information or media without it being traced back to them. Account operators may be jailed, killed, or under siege and thus no longer “users” despite the persistence of their accounts. Notwithstanding these caveats, the term “UGC” still serves my purposes here. At times I also refer to visual UGC recorded by bystanders as “eyewitness media.” Eyewitness videos

are often shot by average bystanders, sometimes by activists, and sometimes by victims, survivors, or perpetrators of abuse themselves. Eyewitness videos usually reach investigators or the news media via online platforms like YouTube, Facebook, or Twitter. Other times, they are sent from a source to investigators via email, chat applications, or another form of communication, or found on the computer or cell phone of the filmer. What they have in common is that you, the viewer—the reporter, investigator, filmmaker, or advocate assessing the footage—were not involved in the filming process. Hence, you have a number of questions about the video, its authenticity, intent, and context. (WITNESS 2016: 178)

In addition to foreshadowing the key issues and challenges with using eyewitness media, this definition from the organization WITNESS also importantly underlines the diverse kinds of actors and motivations that can shape the production of this kind of content (see also Koettl 2016a: 1). Sandra Ristovska has argued that “eyewitness media” allows for “scholarship that theorizes the interplay between technology and the professional, political, and institutional ambiguity associated with these visuals” (2016a: 349). By the same token, I refrain from alluding to “citizen” media in this dissertation because the term flattens the diversity of actors, intentions, and political motivations entailed in the production, upload, and circulation of various kinds of eyewitness media. Moreover, “citizen journalism” connotes a baseline level of editing or narration which may be true in certain cases (e.g., well-established local media agencies in Syria) but not in many others (see Bruns and Highfield 2012).

This dissertation also makes reference quite often to content “uploaders.” Though its meaning is straightforward, it is worth explaining why I have elected to use it over other possible alternatives and elaborating briefly on what the term connotes, especially as “uploaders” does not feature widely in internet scholarship as a category of actors. First, by centering exclusively on the act of having uploaded a piece of content, “uploaders” is relatively free of some of the problematic assumptions invoked by the term “user” described above (e.g., of a stable, individual actor with regular engagement and usage of a platform, site, or tool). In this regard, “uploaders” is a minimalist, lightweight concept—aiming to say as little as possible about the other behaviors or characteristics that may or may not pertain to the actors completing the singular act of uploading. Second, the category of “uploader” makes room for witnesses, perpetrators, bystanders, journalists, and other actors who might upload eyewitness media of interest to human rights investigators; again, it does not make assumptions about their relation to the content (i.e., did they capture the footage themselves?) or their motivations for posting—although, as described in Chapter 4, open source practitioners may still impute meanings and consent onto the act of upload. Third, my use of the term “uploader” is an artefact of this study’s focus (and that of the Lab) on discrete pieces of visual UGC documenting abuses; one uploads an image or video but posts textual information.

One could take issue with the term’s emphasis on the act of contributing content, charging that it aligns too closely with platforms’ commercial logics; indeed, a human rights advocate at a digital rights organization whom I interviewed recently did take issue initially with my constant reference to UGC, a term that in her mind was “so corporate” and to which she preferred the concept of “speech.” I explained that, despite its potential commercial connotations, UGC is still a more accurate and appropriate category than “speech,” given investigators’ interest in tracking down forms of event documentation and visual reports in particular. Though investigators may certainly be concerned with identity-based discrimination and freedom of expression—a topic which does indeed relate to communities’ ability to document abuses—this is somewhat of a separate matter, I added.

Lab practitioners discover and verify UGC mostly gleaned from commercial online platforms of various kinds. Drawing on the definition offered by Tarleton Gillespie (2018: 18, 21), platforms are characterized by their hosting, sorting, and moderation of largely user-generated content, atop an “infrastructure…for processing data for customer service, advertising, and profit.” When referring to “social media content,” I mean user-generated content posted to social media platforms. Social media technologies include “digital platforms, services, and apps built around the convergence of content sharing, public communication, and interpersonal communication” (Burgess, Marwick, and Poell 2018: 1). Although YouTube supports content sharing and, to a lesser degree, public communication (e.g., video commenting), it is neither built nor often used for interpersonal communication. Accordingly, I consider YouTube to be a video-hosting site/platform and Facebook and Twitter to be social media platforms.

An “online open source investigation” for the purposes of this dissertation refers to “the process of identifying, collecting, or analyzing information that is publicly available on or from the internet as part of an investigative process” (Human Rights Center 2018a: 8). Focusing primarily on the discovery, verification, and geolocation of UGC, the Lab and this dissertation address a very narrow subset of all possible techniques of “open source investigations.” Nevertheless, as described further in the next chapter, Lab staff, students, and visiting practitioners employing similar practices largely referred to their activities using the term.

In particular, this dissertation focuses largely on techniques used to conduct online research into human rights abuses on behalf of human rights NGOs (e.g., Amnesty International, Syrian Archive), open source investigative organizations (e.g., Bellingcat), conflict- and crisis-related crowdmapping platforms (e.g., Syria Tracker, Liveuamap), academic institutions (e.g., U.C. Berkeley Human Rights Center), think-tanks (e.g., Atlantic Council), and intergovernmental fact-finding bodies (e.g., United Nations Fact-Finding Missions and Commissions of Inquiry). Importantly, although these individuals may certainly be collecting, analyzing, and preserving UGC and other online information with an eye towards legal accountability, this dissertation does not address the practices and workflows of open source investigations which take place strictly in the context of criminal investigations (e.g., Koenig et al. 2018). Nevertheless, many of the techniques described here are already utilized in a small capacity at the International Criminal Court and are expected to become more widespread in international courts and tribunals.

Finally, I allude to human rights “violations” or “abuses” in this dissertation, despite the fact that these attributions are interpretations ascribed to events based on legal definitions, frameworks, and determinations into which various forms of “evidence” are enrolled, including visual imagery itself (Herscher 2014). For instance, eyewitness media are “read” in particular ways to establish specific forms of state violence as violations of international human rights and humanitarian law. Accordingly, while recognizing that it might be somewhat problematic to use these phrases in the context of human rights open source investigations, I do so regardless in order to reflect the lexicon, aims, and working frameworks of this field.


Chapter 1: Like Start-Ups, Full of Great Uncertainty and Possibility

Asked to recall a brief history of the Human Rights Investigations Lab, Alexa Koenig took me back in time. Koenig is the Executive Director of U.C. Berkeley’s Human Rights Center (HRC), an independent institute for interdisciplinary research and training. Founded in 1994, the HRC has maintained a historical focus on enhancing scientific evidence for use in international courts and tribunals, particularly the International Criminal Court (ICC), and has organized a slate of international workshops in the last decade with human rights investigators, prosecutors, and scholars aimed at promoting discussion and strategies to better leverage technological developments to advance international criminal prosecutions (Human Rights Center 2012, 2014a, 2014b, 2017, 2018a). The first two of these conferences, held in 2009 and 2011, have been credited with helping to catalyze the domain of “human rights technology” as an interdisciplinary and vibrant field of practice comprised of human rights scholars, technologists, advocates and lawyers (Piracés 2018b: 290).

But, sitting together in her bright, window-dressed office, Koenig described another convening to me, one she considered to be “kind of a historic moment.” Co-hosted by Koenig through the HRC along with Yahoo! and Videre est Credere,7 this 2014 meeting brought together representatives from social media and other technology companies, the International Criminal Court, and human rights activists to discuss “how social media content could be accessed for greater legal accountability for human rights abuses around the world.” Given that only the previous year Edward Snowden’s leaks had revealed the National Security Agency’s immense surveillance apparatus, “the tech companies were understandably leery about looking like they were working on any law enforcement-related efforts.” Nevertheless, Koenig explained that the companies recognized the human rights focus of the workshop participants and reinforced their “strong desire to make sure that their technologies were not being used to perpetrate human rights abuses,” particularly in light of their human rights mandate, outlined in the Guiding

7 Videre est Credere is a human rights organization providing training and resources to support the capacity of marginalized communities to record, preserve, and share documentation of state violence securely and strategically.


Principles on Business and Human Rights. According to Koenig, the companies stated that they would need time to determine the transparent and appropriate legal mechanisms for information sharing of that kind (see Koenig, Hiatt, and Alrabe 2017; see also Asher-Schapiro 2017; Rajagopalan 2018). Though “in the meantime,” they added, “a lot of what you’re saying you need you can get through public access on our websites. And that’s something where you really just need to better understand the advanced search functionalities of our different platforms and programs, and to see if you can get information on your own that is not held securely through some other mechanism.”

“So,” continued Koenig, “we began looking at who was combing through social media platforms in really impactful ways.” The Human Rights Investigations Lab was launched a few years later, in the autumn of 2016.

THE BERKELEY LAB

The Human Rights Investigations Lab comprises part of the HRC’s Human Rights and Technology Program.8 Early discussions in the Spring of 2016 around the establishment of a Lab comprised of Berkeley students arose from considerations of how students’ participation, interdisciplinary interests, and diverse linguistic backgrounds could meaningfully contribute to efforts seeking to incorporate user-generated content and emerging technologies to support human rights work. Around the same time, DVC manager Sam Dubberley reached out to the HRC to invite Berkeley to become a third DVC campus, alongside the University of Pretoria in South Africa and the University of Essex, beginning in Fall 2016. After Keith Hiatt, then-Director of the Human Rights and Technology Program, left the HRC, Alexa Koenig and then-HRC Communications Director Andrea Lampros (now HRC Associate Director and Lab Resiliency Manager) decided to move forward with launching the Lab in Fall 2016.

8 The Lab receives funding from individual donors as well as a combination of foundations, trusts, and campus-specific resources: Humanity United, Open Society Foundations, core funding from the Oak Foundation and Sigrid Rausing Trust, as well as U.C. Berkeley’s Student Technology Fund and Undergraduate Research Apprentice Program.


Although Lab teams, students, staff, and media coverage referred to their activities and projects as constituting “open source investigations,” most Lab teams active during my fieldwork did not carry out fully-fledged online investigations. Instead, Lab teams undertook some combination of discrete tasks, including social media event monitoring, discovery, verification, geolocation, and social network analysis. Moreover, consisting of predominantly manual techniques to discover and verify largely visual UGC sourced online, tasks conducted at the Lab represent a narrow subset of the spectrum of open source investigative methods gaining prominence across sectors, including big data analytics and visualization, data mining and scraping, and machine learning, computer vision, or other automated techniques used to analyze visual imagery. Though far from exhaustive, the Lab’s practices nevertheless bear considerable overlaps with methods used by influential sites conducting open source investigations in the fields of journalism and human rights. Assessing the use of these and similar techniques employed by Amnesty International, Human Rights Watch, and WITNESS, Ristovska (2016a: 351) has gone as far as to suggest that “video forensics is becoming an essential skill to master in order to facilitate the rigorous accounting of human rights violations.”

The Lab grew from having two NGO partners and a handful of students in its first semester to a panoply of clients and almost 80 students by the close of the 2016-2017 academic year. Trainings were given to students by open source investigators from a host of organizations, including Amnesty International, the BBC, the ICC, and Meedan. Almost from the beginning, students also played a large role in training themselves and incoming members. That first year, noted an HRC report on the Lab and its fellow DVC campuses, “these open source labs often felt like start-ups, full of great uncertainty and possibility. Because most campuses had few resources, the work was ultimately very student-driven, which served to develop student leaders and shape a culture of innovation” (Human Rights Center 2017: 2-3). Many took notice. Almost immediately after its inception, the Lab garnered a storm of media attention, spanning campus news, local news organizations like the San Francisco Chronicle and The Mercury News, and outlets for wider audiences such as New Scientist and PBS, which devoted a 10-minute NewsHour segment to profiling the Lab (Kell 2017; Ioannou 2017; Deruy 2017; Rutkin 2016; Public Broadcasting Service 2017).


Amidst the glow of media buzz, and coinciding with the inaugural DVC summit held at U.C. Berkeley, I began my fieldwork at the Lab in the summer of 2017. Although the HRC has since moved to a new building on campus, it then occupied several rooms scattered at the end of a passageway in Simon Hall, part of U.C. Berkeley’s School of Law. The Lab was a single room enclosed by grey walls and two large windows pouring in light and majestic views of the Golden Gate Bridge and Alcatraz. The room was always too small for team meetings, forcing participants into standing positions or donated metal folding chairs. A birch table occupied the center of the room, and additional tables edged its perimeter, holding six old Dell computers seldom used outside of the Lab’s confidential legal projects. A schedule hung on the door, displaying the “Office Hours,” or mandatory meeting times, for each of the Lab’s teams in colorful two-hour blocks.

During my fieldwork, approximately 80 students participated in the Lab; as is frequently promoted, students collectively spoke 31 languages and spanned 22 majors and minors. Participants were also overwhelmingly women, a fact which often drew the attention of Lab students and visitors alike, and which contrasted sharply with media coverage of open source investigations centering on male figures in the field, such as former Storyful employees, Bellingcat researchers, and those spearheading the use of open source information at Amnesty International. The Lab’s operations relied on two modes for providing students with academic credit. Graduate students, most of whom were from the Journalism School or were law students pursuing a J.D. or L.L.M., received academic credit through their registration in a course in the Law School offered each semester and promoted as a seminar and practicum.9 Though the participation of graduate students grew notably in the second semester of my fieldwork, undergraduate students still comprised the vast majority of Lab participants. Undergraduates applied to the Lab and received academic credit via the Undergraduate Research Apprenticeship Program (URAP), but still attended the course’s Lab-wide meetings, held weekly in a lecture hall in the Law School. Spurred by word-of-mouth and further media coverage (Melendez 2017; Tannenbaum 2017), application and acceptance to the Lab was increasingly

9 For example, the Fall 2017 iteration of the course was titled “Open Source Investigations: Using Social Media as Evidence of Atrocity Crimes.”

competitive, prioritizing students with stellar grades, persuasive application essays, and foreign language skills. In addition to attending weekly Lab-wide meetings, which usually consisted of presentations by students, HRC staff, or visiting experts, participating students also attended weekly Office Hours pertaining to their team and were instructed to work at least two or three hours weekly on Lab work on their own time.

During my fieldwork, there were eight or so teams working with different clients on separate projects. The DVC comprised one of these teams, exposing students to quick turn-around projects all over the globe.10 Other teams, in contrast, were semester-long and focused on one geographic locale. Such was the case with the team working with the Syrian Archive, for example, as well as with the “Documenting Hate” team. Running a total of two semesters, the latter comprised a partnership with ProPublica and other U.S. universities to compile a database for journalists of under-reported incidents of hate crime and hate speech after the 2016 elections. Teams also differed with respect to the key skillset used: some focused on discovery and verification of UGC, while others entailed combing through a person of interest’s social media posts and online networks, or identifying persons of interest in videos collected and provided by NGO clients. A number of teams also consisted of confidential and/or legal projects, and partnered with organizations litigating human rights cases in diverse geographic regions.

Félim McMahon served from Spring 2018 to Spring 2019 as the Lab Director and Director of HRC’s Technology and Human Rights Program. An early member of the Storyful team, McMahon helped pioneer the social media agency’s verification methods and workflows, and later joined the ICC to become its first-ever analyst with expertise in open source investigations. Arcing from journalism to human rights fact-finding and criminal prosecutions, McMahon’s career trajectory mirrors the migration of the kinds of open source investigative techniques employed at the Lab and elsewhere. Lab staff and open source practitioners largely traced these methods to techniques first developed in journalism, particularly in the

10 Whereas other campuses in the DVC network partnered solely with Amnesty International, the Lab was unique in boasting numerous partnerships with a combination of advocacy- and litigation-driven NGOs. Among the DVC, the Berkeley campus was by far the largest and most active.

context of the . Far from an exhaustive or systematic genealogy, the next section briefly situates the chief discovery and verification practices used at the Lab within the broader landscape of sites and actors working with UGC in journalism, human rights, and other fields.

UGC DISCOVERY AND VERIFICATION TECHNIQUES: AN INCOMPLETE HISTORY

The turn of the last decade marked a momentous period in the emergence of highly distributed configurations of news-making and diffusion, or “networked journalism” (Jarvis 2006; see also Rauchfleisch et al. 2017). “In a world in which information and communication are organized around the Internet, the notion of the isolated journalist working alone, whether toiling at his desk in a newsroom or reporting from a crime scene or a disaster, is obsolete” (van der Haak, Parks, and Castells 2012: 2927). Many news agencies and networks were unprepared for the arrival of UGC onto the landscape of news production; those that recognized its potential responded to the development in different ways.

Some news organizations launched initiatives or platforms to capitalize on contributions by readers or viewers. Al Jazeera invited viewers to upload video footage and photos during the 2009 Gaza conflict, which it offered freely under a Creative Commons license (van der Haak, Parks, and Castells 2012: 2929). The Guardian newspaper launched the platform GuardianWitness in April 2013 to invite readers to share breaking news stories, news tips, experiences, opinions, or simply “open suggestions” (Wahl-Jorgensen 2015).11 Numerous news organizations began to create “live updates” webpages around this time. These webpages centered on unfolding news stories and featured professional coverage alongside emerging UGC, although the latter kind of coverage largely went unverified (Hermida 2012). One study found that the BBC was singular among news organizations in systematically attempting to verify the UGC it incorporated into its reporting on the 2010 Haiti earthquake (Bruno 2011).

11 The GuardianWitness platform was officially retired in September 2018. See Bannock, Caroline, Rachel Obordo, Matthew Holmes, Tom Stevens, and Guardian readers. 2018. “GuardianWitness is Closing – but You Can Still Contribute Your Stories.” The Guardian. August 21. Last accessed March 22, 2019. Retrieved from: https://www.theguardian.com/help/insideguardian/2018/aug/21/guardianwitness-is-closing-but-you-can-still-contribute-your-stories.


Indeed, the BBC was among the first news organizations to invest in building an internal unit tasked with discovering and verifying UGC (Wardle and Williams 2008; Harrison 2010). According to a 2008 report, the 2004 Indian Ocean earthquake and tsunami and the 2005 London Bombings marked a “turning point” for the BBC’s incorporation of UGC (Wardle and Williams 2008: 2). The BBC’s UGC Hub grew within a few years from consisting of several staffers reviewing unsolicited tips and contributions in 2005 to a cadre of 20 staffers who “use search terms, see what’s trending on Twitter, and look at the images and footage trusted contacts are discussing on their Twitter streams” (Turner 2012; see also Harrison 2010). In addition to adopting a proactive approach to discovery, the UGC Hub fostered discussion of basic principles of verification. “People are surprised to find we’re not a very high-tech, CSI-type of team,” noted one assistant editor; Hub journalists portrayed UGC discovery and verification at the Hub as drawing “far more on journalistic hunches than snazzy technology” (Turner 2012; see also Stray 2010). For instance, as with traditional journalism, contacting sources remained a central task to verify information and media.

Meanwhile, another initiative was building bespoke, automated tools and protocols to streamline and enhance social media monitoring, discovery, and verification. Storyful, touted as “the first social media news agency,” was founded in 2010 by Irish journalist, television correspondent, and author Mark Little. Eventually acquired in 2013 by Rupert Murdoch’s News Corp for €18 million ($25 million USD) (Lunden 2013), the Dublin-based start-up is frequently credited with pioneering social media techniques, workflows, and products, including internal tools built to monitor social media platforms for breaking news and trending topics via sites’ Application Programming Interfaces (APIs). Then-Storyful news editor Malachy Browne explained that the agency’s “journalists apply editorial insight (hashtags, proper nouns and other search terms, as well as geo-location information) to the monitoring information and our technology returns a a [sic] rich stream of UGC” (Lunden 2013; see also Brown, Stack, and Ziyadah 2015). “We are a news agency but also a technology start-up,” said Little in 2012. “Our engineers work side by side with our journalists.” Former Storyful journalists I’ve interviewed similarly attribute Storyful’s edge over legacy newsrooms like The New York Times and ABC News, which became Storyful’s clients, to this “combination of automation and human skill” (Little 2012).
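Storyful’s internal tooling is proprietary, but the basic logic Browne describes, pairing editorial search terms with automated filtering of platform feeds, can be illustrated with a minimal sketch. The keyword lists, field names, and sample posts below are hypothetical stand-ins, not Storyful’s actual systems or any platform’s real API.

```python
# Minimal sketch of keyword-driven social media monitoring. The keywords, hashtags,
# and post fields are hypothetical; a real pipeline would pull posts continuously
# from platform APIs and route matches to journalists for review.

KEYWORDS = {"airstrike", "shelling", "protest"}   # editorial search terms
HASHTAGS = {"#aleppo", "#ghouta"}                 # hashtags chosen by editors

def matches(post: dict) -> bool:
    """Return True if a post's text mentions any monitored keyword or hashtag."""
    text = post.get("text", "").lower()
    tokens = set(text.split())
    return bool(tokens & HASHTAGS) or any(keyword in text for keyword in KEYWORDS)

def monitor(feed: list) -> list:
    """Filter a batch of posts down to candidate UGC for human verification."""
    return [post for post in feed if matches(post)]

if __name__ == "__main__":
    sample_feed = [
        {"id": 1, "text": "Shelling reported near the old city #Aleppo"},
        {"id": 2, "text": "Great weather in Dublin today"},
    ]
    for hit in monitor(sample_feed):
        print(hit["id"], hit["text"])
```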


Former employees also underlined Storyful’s role in developing industry best practices for visual verification and geolocation, which overlap with widespread techniques used today at the Lab and elsewhere. These include listening for accents or dialects, analyzing shadows in photos and videos for time of day and direction, and noting clues within license plates, posters, and commercial and traffic signs (Little 2012).
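The shadow technique rests on simple trigonometry: if an object’s height and the length of its shadow can both be estimated from a frame, the implied sun elevation follows and can be compared against solar-position references (tools such as SunCalc are commonly used for this) for the claimed place, date, and time. A minimal sketch of the core calculation, using made-up measurements:

```python
import math

def solar_elevation_deg(object_height_m: float, shadow_length_m: float) -> float:
    """Sun elevation implied by an object and its shadow: tan(elevation) = height / shadow length."""
    return math.degrees(math.atan2(object_height_m, shadow_length_m))

# Hypothetical measurements scaled off reference objects in a frame: a ~1.8 m person
# casting a ~3.1 m shadow implies a sun elevation of roughly 30 degrees, which can then
# be checked against the sun's computed position for the claimed location and time.
print(round(solar_elevation_deg(1.8, 3.1), 1))
```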

The BBC’s UGC Hub and Storyful were not alone. Techniques for discovering and verifying UGC were also being developed and adopted elsewhere – by media activists, grassroots media agencies, aggregator networks, individual hobbyists, and even crowds of online volunteers. Zeynep Tufekci (2017) chronicled the creation of one such project, 140journos, a crowdsourced news network of volunteers and citizen journalists on Twitter which “became arguably the most reliable source of news during the Gezi protests” in the spring of 2013 (Tufekci 2017: 37). Years earlier, the award-winning blog Nawaat played a critical role in collecting, verifying, contextualizing, and disseminating visual media and articles related to the Tunisian revolts. Nawaat was co-founded in 2004 as a space for Tunisian news and commentary for and by political dissidents and exiles. In 2012, WITNESS partnered with Storyful to launch the Human Rights Channel, which verified and shared human rights-related UGC on YouTube. Previously, WITNESS had launched a website in 2007, called The Hub, which allowed users to upload human rights-related UGC (Bair 2015; Brough and Li 2013; Jenkins 2009).

Events of the Arab Spring underlined the significance of UGC to journalism and human rights reporting, as well as the vital yet overlooked role of its “interpreters.” The Syrian conflict, in particular, compelled news organizations to look to UGC if they had not already. “The war in Syria made it very clearly a necessity,” commented a Euronews editor-in-chief, “because there is no way for us to cover Syria other than UGC” (Wardle, Dubberley, and Brown 2014: 13). In the context of the Syria conflict, Matt Sienkiewicz (2014) analyzed the emergence of a critical “interpreter tier” of semi-professional journalists who assembled, verified, and contextualized UGC for larger news outlets. Illustrating the actors and activities of the “interpreter tier,” Sienkiewicz documented the meticulous efforts of “low-paid or unpaid watchers of citizen journalism from across the ” (696), such as James Miller and Eliot Higgins, who aggregated, corroborated, and synthesized hundreds of pieces of extant UGC and social media posts. Higgins went on to launch the Bellingcat website in 2014, which gives trainings worldwide on open source investigations, issues Twitter-based crowdsourcing calls for real-time assistance on investigations, and publishes practitioner guides and research findings by contributors on its website (Beauman 2018; Polakow-Suransky 2019; Irwin 2019; Lapowsky 2019).

Sienkiewicz pointed out that the crucial curatorial and interpretive information work of this “interpreter tier” challenges popular notions celebrating UGC and citizen journalism as easily accessible and readily legible for news organizations and the public. Rather, UGC must be tracked down, assembled, cross-checked, vetted, contextualized, and made sense of before it can be used in journalism or, indeed, human rights advocacy and accountability efforts. Given the proliferation of UGC and its growing value to a range of sectors, a spate of resources, sites, and tools have emerged in recent years to impart to journalists, human rights investigators, and members of the public techniques for discovering, verifying, and analyzing UGC. Although new verification techniques and tools are constantly being developed and implemented, existing resources contain, for the most part, similar sets of best practices; they take the form of manuals (Silverman 2014; Koettl 2016a; The Engine Room, Amnesty International, and Benetech 2016; WITNESS 2016; Edison Hayden 2019), reports (WITNESS 2011; McPherson 2015b), academic scholarship (Gregory 2012a, 2012b; Deutch and Habal 2018; Aronson 2017, 2018; Koettl 2017), and websites (Bellingcat, Citizen Evidence Lab, Verification Junkie, Automated OSINT). These resources contain tools, tips, case studies, and workflows, as well as warnings, common errors, challenges, and successful workarounds.

In addition, many news organizations and human rights institutions, NGOs, and academic centers are deepening their internal capacity to collect, verify, use, and preserve UGC, or to employ other open source investigative techniques and data streams. Journalists and investigators formerly at Storyful, Bellingcat, and Amnesty International have recently joined The New York Times. The BBC recently created its “Africa Eye” investigations unit and documentary series, which will focus on conducting open source investigations (Funke 2018). Within the human rights field, dozens of organizations and NGOs have dedicated themselves to collecting, verifying, using, and preserving UGC for advocacy, fact-finding, and litigation, particularly in contexts with an abundance of UGC like the Syrian conflict.12 Building on their analysis of satellite imagery for investigations (Herscher 2014; Witjes and Olbrich 2017; Rothe and Shim 2018), Amnesty International and Human Rights Watch are increasing their investment in UGC. Human Rights Watch is currently recruiting its first Open Source Investigations Head to build an internal unit for open source investigations, as well as a technologist to create tools and software to support its digital investigations. For its part, Amnesty International’s micro-tasking Decoders Platform leverages a network of hundreds of thousands of “digital volunteers” to help analyze UGC, documents, and visual media, while its Digital Verification Corps (DVC) has trained and enrolled students at U.C. Berkeley and other campuses in social media monitoring, discovery, verification, and geolocation since 2016. Branded as “training the next generation of human rights investigators” (e.g., Fortune 2018; The Engine Room 2017), the DVC supports Amnesty International by helping to find and vet photos and videos relevant to country-specific research projects. Since its launch, it has held annual summits where selected students present case studies, participate in “verification-athons,” and attend talks from expert open source investigators employed formerly and currently at Storyful, Bellingcat, the ICC, and elsewhere (Human Rights Center 2017; Centre for Governance and Human Rights 2019).

Human rights lawyers, technologists, and advocates have also underlined the growing potential for the use of UGC not just in advocacy but in formal accountability efforts and litigation. United Nations Commissions of Inquiry and Fact-Finding Missions, the ICC, and other fact-finding

12 In addition to the Lab, Bellingcat, and Amnesty International, these sites include Forensic Architecture at Goldsmiths University, the Center for Human Rights Science at Carnegie Mellon University, and the Atlantic Council’s Digital Forensic Research (DFR) Lab. Organizations and projects dedicated to the Syrian conflict include the Syrian Archive, the Carter Center’s Syrian Conflict Mapping Project, the Syrian Justice and Accountability Center, Physicians for Human Rights’ Syria Mapping Project on targeted medical facilities, Standby Task Force, Syria Tracker, and the Center for Spatial Research at Columbia University. The Human Rights Data Analysis Group (HRDAG) and the Human Rights Methodology Group at Columbia University are additional organizations supporting the use of new data streams, online content, and data science in human rights fact-finding and research, but are not focused specifically on UGC. In addition, WITNESS has created tools (e.g., YouTube’s face-blurring tool), platforms, and practitioner manuals for creating, collecting, verifying, and preserving UGC.

bodies have increasingly incorporated UGC and other kinds of digital evidence into their investigations (Hamilton 2019; Hiatt 2016; Freeman 2018; Laux 2018; O’Neill, Sentilles, and Brinks 2014). In 2017, the ICC issued its first ever arrest warrant based largely on UGC posted to social media (Irving 2017). The warrant was for Mahmoud Mustafa Busayf Al-Werfalli, a Libyan commander appearing to execute 33 individuals across the span of seven videos.

Whereas national jurisdictions typically have more rigorous protocols for admitting evidence, social media, remote-sensing imagery, and other online-derived content face lower barriers to admittance as evidence at the ICC and other international courts and tribunals (O’Neill, Sentilles, and Brinks 2014). And yet, such content is typically accorded little evidentiary weight (Hiatt 2016). Accordingly, broader efforts have been underway to help human rights activists and NGOs enhance their collection, verification, and preservation of UGC with an eye towards legal accountability. Some caution that “practitioners need to consider that their fact-finding efforts could become relevant to formal accountability and justice efforts” (Piracés 2018a). Conferences and workshops have been held, guides written, and products built in hopes of enhancing the quality, probative value, and weight of UGC in court.13

It is also for this reason that the HRC is spearheading efforts to establish an International Protocol on Open Source Investigations promising to “set common standards and guidelines for the identification, collection, preservation, verification and analysis of online open source information” to enhance its use and value in human rights and criminal investigations (Human

13 Notable conferences include a 2014 convening held at Carnegie Mellon University’s Center for Human Rights Science which explored applications of automated technologies and data science to the analysis of user-generated visual content (Center for Human Rights Science 2014), and a 2016 symposium at Harvard Kennedy School’s Carr Center for Human Rights Policy which addressed the impact of technology on human rights fact-finding (Livingston and Raman 2017). The WITNESS 2016 Video as Evidence Guide imparts best-practices on capturing, verifying, preserving, and sharing UGC, as well as a basic overview of legal definitions and concepts (e.g., elements of a crime, types of evidence) and case studies of how video evidence has been used in international criminal prosecutions. With respect to tools, Enrique Piracés of Carnegie Mellon University’s Center for Human Rights Science has developed the Digital Evidence Vault (previously called Keep and Video Vault) to preserve UGC with some contextual metadata (Piracés 2018a).


Rights Center 2019). The guidelines are aimed at assisting NGOs in collecting UGC and online open source information, and at informing lawyers, who are tasked with explicating their verification steps and demonstrating a secure chain of custody to judges, about minimum standards for digital evidence collection and preservation (Human Rights Center 2018a; Human Rights Center 2019; Lampros 2017). Koenig has been an outspoken advocate of the need for open source investigation guidelines. “Everyone knows there’s a wealth of information out there in digital form,” said Koenig recently (Irwin 2019). “But no one knows, comprehensively, how to access it, how to preserve it, who to give it to and what to do with it, and then how to present it in a court” (Rutkin 2016). Bellingcat and the Global Legal Action Network (GLAN) have a similar project to clarify recommended minimum standards for the collection of UGC and other kinds of open source information for later use in court.

Along with a handful of counterparts, the Lab and its home, the HRC, are at the epicenter of efforts to advance the practice and professionalization of open source investigations in the field of human rights. Complementing the HRC’s efforts at enhancing the employment of open source information and investigative techniques in human rights litigation, the Lab serves as an active site for research and experimentation, and has emerged as a foremost training site for open source investigations (Irwin 2019), prepping students for roles at The New York Times, Bellingcat, Amnesty International, and Harvard University’s Disinformation Lab (Kell 2019). The next three chapters document salient practices, narratives, and challenges raised in the course of the Lab’s work.


Chapter 2: Finding the Things

Tom advanced his presentation slides. Suddenly, a tweet was projected to the front of the lecture hall, its glow illuminating the faces of the dozen students sprinkled across the empty rows. The social media post featured a photograph depicting the crumpled remains of a car amidst the rubble of a massive earthquake which had rocked Mexico City earlier that day. “How could we get to this piece of media?” There are two ways journalists do this, Tom explained. One way is to use aggregator tools which “blast out the news” but which may be costly—software like Dataminr. “The other way: good old-fashioned search. Typing things into search platforms, using parameters, getting back information that’s not noisy and that’s relevant.” A former journalist himself, Tom Trewinnard has trained newsrooms, journalists, students, and crowdsourcing volunteers around the world on the basics of social media discovery and verification.14 This evening, Tom had been invited to instruct students of the Lab team working with the Syrian Archive on how to conduct discovery to find UGC emerging from the Syrian conflict.

In the context of online open source investigation workflows, discovery and verification are two distinct yet tightly interwoven sets of practices. As Tom explained, while our ultimate aim is to verify eyewitness or local media about the conflict, “discovery is the path we take to find that content. You can’t really do verification unless you can find things that are able to corroborate or debunk what we’ve found.” Discovery, defined by one Lab manager as “basically finding the things,” refers then to digital practices aimed at locating and collecting eyewitness media of an event (slated for subsequent verification) or, equally important, the trove of secondary information needed to corroborate firsthand eyewitness material.15 “Verification is

14 At the time of writing, Tom is Director of Business Development at Meedan, a San Francisco- based company producing digital tools for translation and collaborative online verification. Check, Meedan’s flagship verification platform, has been used by several Lab groups and DVC campuses, the investigative group Bellingcat, and numerous award-winning projects monitoring news stories emerging on social media around national and international elections. 15 Although secondary sources vary with respect to each investigation’s aims and scope, they might include a combination of videos, images, and textual posts on social media, as well as contextual or domain-specific information provided in news reports, blogs, Wikipedia, collaborative mapping and geo-tagging platforms such as Wikimapia and Google Earth Photos, or other websites.

more technical,” Tom continued, “like, how we can spot a street sign [in a photo or video]. But, the skills of finding this stuff is something you can pick up.”
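In practice, “good old-fashioned search” often means systematically varying the words a witness or uploader might plausibly have used, in their language and spelling. The sketch below illustrates that logic only; the place names, transliterations, and event terms are illustrative placeholders rather than the Lab’s actual query lists.

```python
from itertools import product

# Illustrative spelling, transliteration, and translation variants an uploader might have used.
PLACE_VARIANTS = ["Douma", "Duma", "دوما"]
EVENT_TERMS = ["airstrike", "shelling", "قصف"]

def candidate_queries(places, events):
    """Combine place-name and event-term variants into search strings to try by hand."""
    return [f'"{place}" {event}' for place, event in product(places, events)]

for query in candidate_queries(PLACE_VARIANTS, EVENT_TERMS):
    print(query)
```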

The need to utilize specialized skills to discover human rights-related UGC, vulnerable to online obscurity or outright removal, is reflective of the mixed blessings of online architectures and platforms for social movements worldwide. At their respective advents, the internet and social media platforms were vaunted as democratizing forces giving voice to marginalized communities excluded from mainstream and state-controlled media (see Papacharissi 2002; Jørgensen 2017). Exemplified by the much-celebrated proliferation of social media use during the Arab Spring, optimism in platforms’ support and stewardship of marginalized voices has soured (e.g., Tufekci 2018a; Hempel 2016), particularly in light of data privacy breaches, virulent online harassment and hate speech, misinformation and disinformation campaigns, and the use of platforms by state actors for surveillance and suppression of speech.

First, within the human rights community in particular, there has been growing concern over platforms’ silent removals (or “takedowns”) of human rights documentation and account suspensions or bans of journalists and human rights activists—practices encompassed in platforms’ “content moderation” policies and enforcement (e.g., Article 19 et al. 2018; Kaye 2018, 2019; Access Now 2019a; Kayyali 2019; Electronic Frontier Foundation, the Syrian Archive, and WITNESS 2019). The reasons human rights-related UGC is removed from platforms are numerous: while uploaders themselves may opt to take content offline due to safety, privacy, or other concerns, digital and human rights groups have drawn attention to platforms’ takedowns of valuable human rights-related documentation due to government requests, flagging by overzealous algorithms or human users, or Terms of Service that prohibit content deemed graphic, extremist, or terrorist and are applied seemingly arbitrarily and inconsistently (Gillespie 2018; Crawford and Gillespie 2014).

While takedowns and account suspensions comprise blunt mechanisms for removing content wholesale, platforms are also constantly ranking, organizing, and promoting some UGC and users over others with the use of sorting, search, and recommendation algorithms. These may create additional hurdles for human rights activists and groups to disseminate and, conversely, to find eyewitness media and information concerning human rights violations. As Tufekci (2018a) has written, “dissidents can more easily circumvent censorship, but the public sphere they can now reach is often too noisy or confusing for them to have an impact.” “Information abundance,” DiMaggio et al. (2001: 313) poignantly discerned, “creates a new problem…: attention scarcity.” Provided with online fora with which to share and circulate information capable of circumventing gatekeepers of broadcast and print media, human rights advocates must still confront struggles to gain visibility and attention in an ever-more crowded informational online landscape. This landscape is one governed by algorithmic curation which may hinder the discovery of eyewitness content, even when similar content of the same incident or conflict goes viral elsewhere.16

Given the sheer volume of UGC and the systemic tendencies of platform algorithms to promote particular types of trending content, how do open source practitioners and journalists navigate online platforms, sites, and services to find human rights-related UGC online? The first part of this chapter highlights key logics and techniques employed to do this work, which include getting close to online traces of the incident in question and adapting to the imagined user “upstream” in the content’s trajectory. Discovery practices are creative, adaptive, and affective – at times compelling investigators to place themselves in the minds of witnesses or perpetrators. The second part of this chapter turns to explore a third mechanism, aside from content moderation and algorithmic curation, by which platforms impact the discoverability of human rights-related UGC; namely, through aspects of their very design. In that section, I highlight a handful of design elements of YouTube, Twitter, and Facebook observed during my fieldwork to enhance or inhibit the discovery of human rights-related UGC in Lab projects. This section underlines the point, obvious to online investigators, that distinct platforms afford unique advantages and

16 As Google itself acknowledged in a recently-leaked internal report (Hazard Owen 2018), “[s]ocial media coverage of the Ferguson protests revealed the stark difference between Twitter and Facebook’s newsfeeds. While the former was filled with blow-by-blow accounts and updates on the domestic news story, the ice bucket challenge filled the latter. The discrepancy clarified the power of algorithms to effectively ‘censor’ the news, by favouring some content over others” (emphasis in original).

obstacles vis-à-vis their search functionality, interfaces, user privacy structures, and other features. Elements of platform design do not over-determine investigators’ search practices, but they do shape investigators’ workflows, workarounds, selective engagement with certain platforms, and ultimate success in accessing and collecting UGC relevant to their investigations.

While focusing on techniques of UGC discovery, this chapter provides a broader window into the role and politics of platforms in relation to human rights investigations and advocacy. Although the next section presents an overview of how platforms (and their algorithms) function as information intermediaries, platforms greatly exceed this role. First, before the “discovery” phase and even before content is shared online, platform parameters for media creation and sharing regulate the formats that UGC can take and the kinds and amount of information witnesses or advocates can affix to documentation (e.g., Twitter’s now 280-character limit) (Price and Ball 2014: 5; Howard and Hussain 2013; Tufekci 2017). Platforms thus do not simply organize and relay human rights documentation, but influence its shape and very content. Second, those wishing to circulate human rights-related documentation may also consider how others (including allies and adversaries) are using the platform to decide whether and how to share content online, balancing concerns over privacy and security with a desire to disseminate footage, photos, or narratives of an incident. Accordingly, understandings of popular platform usages also shape the ultimate circulation and accessibility of human rights-related material. A third example of how platforms go beyond mere intermediary functions in shaping human rights investigations involves the ways that aspects of their decisions and designs impact UGC verification. For instance, the metadata that platforms choose to disclose or conceal about the provenance of media, posts, and user accounts may generate “epistemological challenges” (Daniels 2009; Schou and Farkas 2016) for investigators seeking to verify the identity of content uploaders or the location from which visual content was captured. But, although this chapter focuses on the role of platforms in shaping content discovery, it is critical to recognize that platforms shape myriad moments and facets of human rights open source investigations.
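To make the point about metadata concrete: an image received directly from a source may still carry embedded EXIF data such as capture time, device model, or GPS coordinates, whereas most major platforms strip this metadata on upload, leaving investigators to reconstruct provenance from other cues. A small sketch using the Pillow imaging library (the file path is a placeholder):

```python
from PIL import Image, ExifTags

def readable_exif(path: str) -> dict:
    """Return whatever EXIF metadata survives in an image, keyed by human-readable tag names."""
    exif = Image.open(path).getexif()
    return {ExifTags.TAGS.get(tag_id, tag_id): value for tag_id, value in exif.items()}

# A file sent directly by a witness may still expose DateTime, Model, or GPSInfo tags;
# the same image re-downloaded from a social media platform typically yields little or nothing.
print(readable_exif("received_frame.jpg"))
```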


PLATFORMS AS INTERMEDIARIES OF HUMAN RIGHTS-RELATED UGC

Human rights groups and internet scholars point to two significant ways in which platforms shape the discoverability of online content. One is vis-à-vis platforms’ content moderation: their policies and decisions to selectively enforce their Terms of Service and remove content and suspend or shutter user accounts deemed to be in violation or otherwise offensive. Potentially problematic UGC hosted on platforms like Facebook and YouTube is first flagged, whether by users (Crawford and Gillespie 2014) or, increasingly, by algorithms. Algorithms are trained to identify everything from nude bodies and nipples to ISIS flags and background soundtracks. In addition, robust hashing technology is used on some platforms to detect matches for extremist/terrorist visual content stored in a database shared by the four major platforms (Facebook, Microsoft, YouTube, and Twitter), as well as for child pornography held in a separate database managed by the National Center for Missing & Exploited Children (NCMEC). After content is flagged by users or algorithms, it is said to go through a human review – although some, including the UN Special Rapporteur on the promotion and protection of human rights and fundamental freedoms while countering terrorism, point to removal statistics issued by platforms like Facebook which suggest this important step may be skipped over altogether (Ní Aoláin 2018: 7).
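The shared hash databases rely on proprietary perceptual-hashing systems; the sketch below substitutes a much simpler “average hash” to convey the general idea of reducing an image to a compact fingerprint that can be matched against a blocklist even after minor re-encoding. The file names and blocklist are placeholders, and real moderation systems are far more sophisticated.

```python
from PIL import Image

def average_hash(path: str, size: int = 8) -> int:
    """Shrink an image to a small grayscale grid and encode each pixel as above/below the mean."""
    pixels = list(Image.open(path).convert("L").resize((size, size)).getdata())
    mean = sum(pixels) / len(pixels)
    bits = 0
    for pixel in pixels:
        bits = (bits << 1) | (1 if pixel >= mean else 0)
    return bits

def hamming_distance(a: int, b: int) -> int:
    """Count differing bits between two hashes; small distances suggest near-duplicate images."""
    return bin(a ^ b).count("1")

# Hypothetical blocklist of previously hashed images; a low distance flags a likely match
# for human review rather than triggering automatic removal.
BLOCKLIST = {average_hash("known_flagged_image.jpg")}
candidate = average_hash("newly_uploaded_image.jpg")
flagged = any(hamming_distance(candidate, known) <= 5 for known in BLOCKLIST)
print("flag for review" if flagged else "no match")
```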

In recent years, platforms have invested heavily in the development and deployment of algorithms to enable them to moderate at scale and respond to increasing governmental pressure, such as from Germany and the European Commission, to quickly remove terrorist content and hate speech (Citron 2018; Kaye 2019; Satariano 2019). Human rights and digital rights groups have pleaded with platforms to ignore such calls, which they fear could have a chilling effect on free expression while also further endangering the circulation and collection of invaluable documentation of human rights abuses which could advance human rights advocacy or serve one day as evidence in court (Aswad 2019; Kayyali 2019). Just as I was beginning my fieldwork at the Lab, news unfolded regarding takedowns of a staggering volume of social media content and accounts emerging from conflicts in Syria and Myanmar – areas in which Lab students were actively conducting discovery and verification. YouTube introduced modifications to its machine learning algorithms in August 2017 which resulted in the swift removal of 900 channels posting videos of the Syrian conflict (Asher-Schapiro 2017; Rosen 2018; Hill 2018; Warner 2019; Rajagopalan 2018). A month later, Facebook was alleged to have removed videos and images documenting attacks against the Rohingya population in Myanmar uploaded and circulated by activists and dissidents (Woodruff 2017).

Catalyzed by these and other known cases of takedowns and account suspensions, human rights and digital rights groups have demanded the swift restoration of the wrongly-removed content, greater transparency and accountability in content moderation policies and decisions, the mainstreaming of clear removal notifications and user-friendly appeals processes, and the establishment of more robust due process procedures for appealing removals and account suspensions, as proposed in the Santa Clara Principles (2018). In addition, the UN Special Rapporteur David Kaye (2018, 2019) and others have strongly encouraged companies to align their content standards, moderation policies, and working definitions for problematic content with human rights law.

Whereas human rights groups have held a spotlight to platforms’ content moderation, a second mechanism by which platforms crucially shape UGC discoverability and public discourse more broadly – algorithms – has been examined by scholarship in communication, media, library, and internet studies. In computer science, algorithms refer to machine-readable code giving instructions that complete a specific task when executed (Kitchin 2017). Scholars in the humanities and social sciences have increasingly taken up inquiry into algorithms’ powerful and proliferating social effects. Pointing to their implementation in such disparate domains as modern aerial warfare, online advertising, and data encryption, Roberge and Seyfert (2016: 1) note that “[a]lgorithms have expanded and woven their logic into the very fabric of all social processes, interaction and experiences that increasingly hinge on computation to unfold.”

Scholarly attention has been given in particular to algorithms’ role in organizing and moderating public access to online information (Introna and Nissenbaum 2000; Gillespie 2015; Grimmelmann 2014; Bucher 2016; Noble 2018; Granka 2010). By structuring what and how online information is presented, made accessible, searchable, and recommended to users, major commercial platforms such as Google, Facebook, Twitter, and YouTube powerfully shape the contours of public discourse and attention. Scholars have examined the implications of platforms and algorithms for media diversity (Helberger, Karppinen, and D’Acunto 2016; Helberger 2017), the fate of social movements (Tufekci 2017, 2018a), and the circulation of hate speech and misinformation (Marwick and Lewis 2017; Noble 2018; Caplan, Hanson, and Donovan 2018).

Many have by now challenged the commonplace notion that search algorithms serve as neutral conduits which impartially index, sort, and rank information via “algorithmic objectivity” (Gillespie 2014). Rather, scholars argue that as intermediaries which stabilize meanings and mediate visibility to certain ideas and actors over others (Gillespie 2017: 76), search engines raise familiar questions about the power of gatekeepers to shape representation and visibility in existing media industries like print, radio, and television. Digital platforms do not remove gatekeeper mechanisms altogether, but rather reconfigure how public information is organized, presented, and accessed (Schou and Farkas 2016).

Search algorithms employ rules based on cues from users to match them with content. Although the specificities of algorithmic code are proprietary, undisclosed publicly, regularly modified, and continuously adapted to user data, the incorporation of some signals into search algorithms has become commonplace. Outlined by Granka (2010), these include: (1) linguistic cues inferred from the search terms users type into their queries, (2) user cues drawn from users’ search and navigational behavior over time,17 and (3) popularity cues signaling the sites or content most linked and “authoritative” with respect to the query.

The practices and criteria employed by search algorithms to deliver personalized and ranked online content have been subject to increasing scrutiny in scholarship and public discourse. For instance, concerned with the representation of diverse sources and ideas online, some have argued that algorithms’ employment of popularity measures as a function of “relevance” to users has detrimental impacts on media pluralism by placing higher weight on

17 Examples include the search results and links users click on and the length of time users linger on search results and other pages.

content created and circulated by websites and users with high online traffic or followers (Introna and Nissenbaum 2000; Helberger 2017). Studies have suggested that search algorithms tend to privilege content that is U.S.-based, commercial, recent, and “trending” (Van Couvering 2007; Granka 2010). Despite “algorithmic opacity” (Burrell 2016) and dynamism, scholars have sought to uncover how and to what extent platforms’ commercial interests inform their algorithmic decisions and determinations concerning search ranking and information retrieval. For instance, Van Couvering (2007) concluded from interviews with search engine producers that designers of search algorithms define search result quality in terms of customer satisfaction and relevance. Shaped by business and scientific logics, these criteria justify editorial tactics which privilege or censor sites and sources, and thus stand in stark contrast with alternative standards of quality, such as the fairness and representativeness upheld by journalism.

What, then, do platforms’ algorithms mean for open source investigators seeking out human rights-related UGC? Generally speaking, popularity cues tend to heavily favor the websites, social media posts, and audiovisual material of mainstream media outlets over smaller outlets or individual online users (including possible victims or witnesses) who lack mass followings and vast networks of linkages to their content. Accordingly, even while technically “public,” human rights-related UGC may be cloaked in practical obscurity and threatened by invisibility (see Bucher 2012). The first findings section below examines techniques by which open source investigators navigate online infrastructures to find the proverbial needles in the haystack relevant to their investigations.

Besides content moderation and algorithms, a third mechanism by which platforms influence the contours of public discourse and, in particular, impact the availability and discoverability of human rights-related UGC on their sites involves numerous aspects of their design, spanning from their interfaces and search functionality to their privacy settings for posts and connections between users and content. The notion that diverse elements of platforms’ design shape the visibility of the content they host coheres with the understanding that information’s discoverability hinges not only on characteristics of the discrete piece of information in question (e.g., content promoted or demoted by algorithmic levers, or removed altogether from a platform), but also on the architecture and design of the information system in which it is embedded and

hosted. “Findability,” noted Peter Morville (2005: 4), an information architecture and user experience designer, “is a quality that can be measured at both the object and system levels.” Accordingly, Morville defines findability in his seminal book, Ambient Findability: What We Find Changes Who We Become, as

a. The quality of being locatable or navigable.
b. The degree to which a particular object is easy to discover or locate.
c. The degree to which a system or environment supports navigation and retrieval.

This central insight about system-level discoverability reverberates in much recent scholarship shedding light on the myriad ways in which platform design – from its most visible features and interfaces to its largely invisible deployment of algorithms – influences the findability of information. For instance, McKelvey and Hunt (2019: 1-2) express dismay that “[t]ypically, discoverability is narrowly defined as a problem for content creators,” obscuring the many unique mechanisms through which “platforms coordinate the experiences of content discovery” including interface design and algorithmic deployment. The second findings section below demonstrates, using a few examples of specific features on YouTube, Twitter, and Facebook, how platform design elements shape the discoverability of human rights-related UGC in ways that are subtler than content moderation and algorithmic curation, but perhaps at times just as consequential for human rights investigations.

To be sure, a focus on how platform design and architecture are engineered to channel users’ behavior and information consumption risks obscuring users’ agency in navigating platforms creatively and on their own terms – perhaps even in ways unforeseen by their designers and developers. Accordingly, just as users should not be thought of as merely passively subject to algorithmic curation, nor should they be considered mindless and powerless when navigating platforms more broadly. “While any claim of technology’s absolute role in shaping either would be deterministic,” argued Hector Postigo (2014: 335), echoing longstanding structure vs. agency debates in sociology, “considering how technical architectures, their design and their use shape and are shaped by social practice is important, because it gives insight into how otherwise little noticed or ‘natural’ elements of the participatory/labor space actually serve

as strong influences of action.” One way to think about users’ agency and practices is to recognize that designers’ and users’ understandings of the affordances of a particular system or platform are “not necessarily at odds but often separated by the gap that forms between intended use and actual use as performed by users” (Postigo 2014: 335). Elisabetta Costa (2018) coins the term “affordances-in-practice” as a way to invite more nuanced explorations into users’ creative technological practices. These, she argues, reveal “social media as a set of practices that cannot be defined a priori, and are not predetermined outside of their situated everyday actions and habits of usage” (Costa 2018: 3643). Accordingly, in examining online investigators’ discovery techniques and logics, the findings section below also attempts to highlight the impact of platform design in generating tactical opportunities, hindrances, and “affordances-in-practice” in the context of online investigations centered on human rights-related UGC.

IMAGINING AND ADAPTING TO THE USER UPSTREAM

On a Friday morning a few weeks into the fall semester, two undergraduate student managers stood at the front of a packed, 50-person Law School lecture hall and delivered a presentation outlining key techniques for conducting social media discovery. A summary slide of the presentation bulleted the three key tips Lab students should remember: “In all cases,” the slide read, “(1) Search in all relevant languages, (2) Choose your search engine or social media outlet based on the location of your project, and (3) Be smart about which keywords to search (e.g., ‘1488’ rather than ‘Nazi’; see below).”

The rules-of-thumb on the students’ presentation slide comprise a key search principle in online open source investigations: to find digital traces linked to an incident, get as proximate as possible (linguistically, culturally, temporally, etc.) to it. This might entail any number of maneuvers, including narrowing search queries temporally using timeframe filters or reconsidering the lexicon and languages users would employ to caption their video, describe their experience, or circulate material. Students were routinely told to get into the mindset of the uploader, asked to consider: How would a victim or witness (or, less frequently, a perpetrator) post about this incident online? What words and platforms would they use? The answers to this exercise could then be used to reverse-engineer one’s search strategy by adapting search

keywords, language, and sites to match the anticipated behavior of the imagined users “upstream” in information flows.18

An example is given by the cryptic parenthetical aside in the last bullet of the students’ presentation—“‘1488’ rather than ‘Nazi’”—which alludes to a tip shared the prior year by a visiting BBC contributor and expert on open source research. When searching online for UGC posted or circulated by Neo-Nazis, the visitor apparently advised, it is more effective to use 1488, a numeric symbol popular among white supremacists and used to identify themselves and others,19 than the word “Nazi,” which is more likely to retrieve secondary information written by group outsiders or opposition.20 Later, students in one Lab team working on this subject matter would cleverly conceive of a few emojis to join 1488 as a search symbol for white supremacists: a glass of white milk and Pepe the Frog.

Such maneuvers include selecting appropriate and effective search terms matching users’ anticipated lexicon, combining terms into “search strings” using Boolean operators, translating search terms and wider queries into local or relevant languages, and considering users’ wider posting practices, such as the platforms and platform features users might employ to post content online. The degree to which these and other tactics, described next, are supported by various platforms’ search functionality impacts investigators’ ability to discover human rights-related UGC.

18 The user or actor who initially produced the content may very well be distinct from those uploading and circulating the content, with or without further modification or manipulation. Nevertheless, distinctions between users linked to content take on significance and scrutiny more so during the verification stage than discovery.
19 1488, or conversely 8814, combines two numeric symbols which allude to white supremacist lore: the “14 words” slogan (“We must secure the existence of our people and a future for white children”) and the number 88, a signal for Heil Hitler, as H occupies the 8th position in the alphabet.
20 Such groups may be further discouraged from overt identifications given the possibility their accounts or posts may receive negative attention in addition to inviting suspicion or censorship from social media platforms due to possible violations of service terms and agreements.


Meeting Upstream Users on their Own Terms

“The humble keyword,” noted Peter Morville (2005: 4), “has become surprisingly important in recent years.”

As a vital ingredient in the online search process, keywords have become a part of our everyday experience. We feed keywords into Google, Yahoo!, MSN, eBay, and Amazon. We search for news, products, people, used furniture, and music. And words are the key to our success.

Perhaps nowhere is this insight more salient than in online investigations. The craft of generating the right keywords for search queries was a semi-regular topic in training and team meeting discussions. On some occasions, trainers or student managers would conduct group brainstorming sessions to generate appropriate terms and train students to get into the mindset of a poster. Indeed, such was a key aim of Tom’s visit, described above. Toggling through various tweets of the earthquake, Tom told students to notice that those who are posting aren’t being descriptive in what they’re seeing. “They’re not using objective language, they’ve just gone through a traumatic experience—they’re going to use human language—‘what the hell? What have I just seen? I’m so scared!’ Often times, they’re going to be talking about their emotions.” He continued, “it helps if you don’t use the cold, journalist search string, it’s not going to return the kind of stuff you’re going to be looking for. So, it’s really useful to come up with [search phrases from the perspective of the people living the event].”

Tom divided the audience into two small groups and issued hypothetical scenarios to each. The point of the exercise, he explained, is to brainstorm a long list of words that could call up a tweet, Facebook post, or media report on a particular incident. My team was instructed to think of a chemical strike in Ghouta, Syria. We turned to each other and one student volunteered to record our ideas on a spreadsheet. New to the group and to discovery in general, I drew a blank—I realized at that moment that I didn’t know many of the terms for chemical weapons. Luckily, others in the group were more experienced, like first-year law student Elise who, unbeknownst to me at the time, had just spent four years conducting open source investigations on the Syrian conflict for Physicians for Human Rights.


Elise was first to suggest ideas: “chlorine, nerve agent….mustard, Sulphur.” Another student added, “gas? If I’m panicked I would think chlorine, I’d think bodies, poisonous.” Elise then suggested “a variety of terms for bombing, strike, casualties…suffocating, can’t breathe.” Others proposed injured, white phosphorous, gas attack. We could broaden the region, Elise said, searching Damascus and not merely Ghouta. A few more ideas circulated: Illegal? Perpetrators. What about people requesting medical help? Medical emergency. If we knew the names of hospitals in the area, we could add those, or even just the word “hospital.”

At that moment, a timer abruptly rang out and Tom gathered the groups to report back their suggestions. The student taking notes for our group reported that “we started with terms like sarin, poisonous, phosphorous, gas, chlorine, strike bomb, target, suffocating, casualties, illegal, perpetrators, … a lot of it was focused on thinking about what kinds of particular adjectives or nouns would be on [a tweet or social media post].” The other team, instructed to imagine an attack in Italy, had come up with similar ideas to search: “hospital, children, woman, airstrike, bomb…” A chilling exercise, it is also one that highlights the importance of domain knowledge (e.g., names of specific chemical weapons) in shaping search practices.

Over the course of discovery, Lab students often employed domain knowledge about a conflict, social context, or an online community’s digital practices. Such contextual knowledge could be gained at times by trial and error, or additional internet research—a point not immediately obvious to Lab newcomers. For instance, one student having trouble finding domestic hate speech on social media was offered a “less savory” strategy by her manager: “to research certain slurs of different groups, [as] there are less obvious ones.” Asked by another student where they would find those slurs, the manager suggested “talk[ing] to people who have been victims of hate speech. Talk to people of the LGBTQ community, for instance…” The first student responded, “I feel like I would feel really uncomfortable going to a group of people and asking them, ‘what are people calling you these days?’” The manager clarified that she meant to conduct additional internet searches: “I mean, bloggers—certain bloggers are really interested in those kinds of topics. If you get tired of searching through threads of horrible hate speech, look at activists and learn more about their community, you’re trying to get into the mind of a poster of a victim and find out how they’re going to communicate what happened to them.”


Adapting to the posting language and behavior of a potential victim, witness, or perpetrator of a human rights violation, and not to that of a journalist, outside observer, or, indeed, faraway college student, is a lesson that often took students practice, experimentation, and modifications based on tips shared communally. Occasionally, in their weekly updates during teams’ office hours, students reported on the keywords they were implementing and encountering in their online searches, noting those with which they were having particular success or failure for other students to incorporate or avoid. Such informal brainstorming sessions were a regular occurrence for those in the Documenting Hate team, to which the hate speech-related incident described above pertained. The handful of Berkeley students made up one of five campus teams working with ProPublica and Meedan to surface incidents of hate speech and hate crime in the United States after the 2016 elections.21 The Documenting Hate project sourced incidents through two mechanisms. In addition to victim or witness testimony, submitted through a portal on the Documenting Hate website,22 groups of trained students at collaborating universities discovered and verified incidents reported on social media that failed to get enough online attention to attract mainstream news coverage. For ideas on words to include in their queries, students were given access to a spreadsheet generated at a summer summit and Pop-Up Newsroom with Tom’s facilitation. The spreadsheet included hundreds of search terms, organized across main categories: verbal abuse, physical abuse, vandalism, and so on. For instance, among 114 other entries for the vandalism category were “spray painted,” “destroyed,” “smashed,” “swastika,” “on the door,” and “my mosque.”23

As Tom had explained to the Syrian Archive team, including the broad categories themselves or other abstract terms like “hate speech” or “hate crime” could yield results

21 See also Murray, Stefanie. 2017. “Top 6 Journalism Collaborations of 2017.” MediaShift. Last accessed on January 10, 2019. Retrieved from: http://mediashift.org/2017/12/collaborative-journalism-comes-into-its-own/
22 ProPublica. “Documenting Hate.” Last accessed July 7, 2019. Retrieved from: https://projects.propublica.org/graphics/hatecrimes
23 Pop-Up Newsroom is a joint initiative of Meedan and Dig Deeper Media. See Pop-Up Newsroom. 2017. “Bringing Pop-Up Newsroom to Documenting Hate.” August 22. Last accessed June 27, 2019. Retrieved from: https://medium.com/popupnews/pop-up-newsroom-documenting-hate-bdfd649aadf2

crowded with posts by journalists and news media, far from the personal posts which students most sought. However, in the context of deeply polarized conflicts or cases in which the text or metadata of posts might be controversial, the selection of search terms might involve anticipating the positions of upstream users—a consideration which in some cases entailed additional domain knowledge. Students on the “Burma” team tasked with collecting social media posts of Burmese military officials purposefully searched “Bengali terrorist” and other derogatory terms used to incite hatred and mob violence against the Rohingya ethnic minority in Myanmar. Similarly, a visiting open source investigator of the Syrian conflict noted that he moderated his use of politically rhetorical terms like “rebels,” “martyrs,” “insurgents,” and so on, to locate and corroborate social media content related to pro-government deaths among distinct online communities positioned differentially with respect to the conflict.

On the other hand, querying terms that were too explicit or graphic could be misaligned with the lexicon witnesses and victims would use to describe an incident. For instance, midway through the semester one student on Documenting Hate said she saw herself improving at coming up with “search strings in the language a victim would post.” She explained that she had initially used the entire n-word, but later realized “the word ‘n-word’ is better to use because people wouldn’t [spell out] … the whole word,” suggesting that it was too harsh for a victim or witness of a hate crime to use. Frustratingly, however, searching the “n-word” alone would often call up casual uses of the word, “not in a hate crime documentation of it,… [but] just using that word on the internet.” Finally, the student realized that inserting the word into longer strings, like “called me the n-word” or (“called me” AND “n-word”) proved more successful. Accordingly, Boolean operators used to combine search terms in quotations are an important tool used by open source investigators to exert further control over search results and the scope and context of posts retrieved.

Refining Results: Boolean Operators and Local Language Search

After generating search terms, open source investigators often employed Boolean operators like AND, OR, and MINUS to define relationships between search terms and thus broaden or narrow their searches. After the Syria-related brainstorm exercise described above,


Tom advised students to include synonyms of nouns and verbs in their search queries, and to combine them using Boolean operators. “We could go through and think of all the different types of buildings, like hospitals, clinics, different names of places people get treatment.” Having one term runs the risk of defining too narrowly the scope of our search, Tom noted. “The goal with discovery is always to be reducing what is a very noisy environment to something that is not noisy at all, but if we toss out a lot of good stuff in the noise,” we’re doing ourselves a disservice. In addition to considering synonyms and “different types of gas or place names,” think of verbs, Tom said, like explode: “No one said explode, it’s very common to see that.” Tom assured students that the list they’re compiling is organic and will grow over time. “You’ll start to see patterns” in the keywords used in posts or video captions. Combining these types of keywords using Boolean operators, Tom explained, creates a long search string like this: (“clinic” OR “hospital” OR “hospitals”) AND (“explosion” OR “blast” OR “bang”).
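To make the mechanics concrete, the following is a minimal sketch of how such strings can be composed programmatically from lists of synonyms; it is an illustration of the principle rather than a Lab tool, and it assumes the target search engine accepts quoted phrases, OR, AND, and a minus-style exclusion.

# Illustrative sketch (not a Lab script): composing a Boolean search string from
# synonym groups, assuming the engine accepts quotes, OR, AND, and "-" exclusion.
def boolean_query(synonym_groups, exclude=()):
    """Join each group's synonyms with OR, join groups with AND, append exclusions."""
    groups = ["(" + " OR ".join(f'"{term}"' for term in synonyms) + ")"
              for synonyms in synonym_groups]
    query = " AND ".join(groups)
    for term in exclude:
        query += f' -"{term}"'
    return query

# Reproduces the string above:
print(boolean_query([["clinic", "hospital", "hospitals"],
                     ["explosion", "blast", "bang"]]))
# ("clinic" OR "hospital" OR "hospitals") AND ("explosion" OR "blast" OR "bang")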

In addition to enabling investigators to broaden the scope of their search results by including synonyms, Boolean operators are also used to restrict occurrences in which the terms of interest might be used in the wrong context. For instance, explaining the affordances of Boolean operators, the Documenting Hate student manager instructed students to first start with simple terms, then refine their search. With a stoic countenance, she continued:

If you put ‘bitch’ into tweetdeck, you’re going to get a lot of ‘I love you, bitch’ because that’s just how the word is being used now. You start with one or two [terms], you go through and say, I’m seeing a lot of I love you—so you [minus] love. And then, you could add scream, yell, etc.24

Several additional examples arose in the group over the course of the semester in which students searched terms aimed at yielding reports of abuse, but instead encountered the terms being adopted in the most mundane of scenarios. Students noted that queries with “attacked” often called up posts about how a shocking image or scene had “attacked” them. Similarly, searching “followed me” was said to retrieve entries of users following each other on Twitter or

24 The resulting search string, for example, might read (“scream” OR “yell”) AND “bitch” MINUS (“I love you” OR “ilu” OR “I luv u”).

other platforms. Unlike the manager’s example, some of these latter cases, like those which surfaced with “followed,” could not be restricted using Boolean operators because their association with additional meanings was implicit. Although Boolean operators could not always be employed to more finely calibrate the context of retrieved posts, their implementation remained an essential search technique for investigators. For Lab projects on conflicts or topics occurring in non-English speaking locales, it was highly recommended to translate search terms into local and relevant languages before searching.

“Google Translate is your friend” was somewhat of a mantra in the Lab. While searching in English for background reading, international coverage, and other sources provided resources useful for verification purposes, searching in the relevant languages of a conflict was best for finding eyewitness content and local news reports and sources. Google Translate was used constantly to translate content or metadata attached to content (such as YouTube video captions, or Facebook or Twitter posts), as well as the search strings students had generated. Indeed, students often came up with additional search strings using textual content or metadata connected to relevant posts they had encountered in their search by placing text or captions into Google Translate, and then adding or subtracting words in order to isolate key verbs and nouns in the target language, like place names, neighborhoods, weapons, or key actors’ names, pseudonyms, or titles (being careful in doing so, as isolating words can change their translation in many languages). Additional tricks and tools existed for finding local translations of key nouns; for instance, the Lab’s tech director would often advise students to use Wikipedia to obtain place names in local languages, useful for copying and pasting into search queries.

With some lesser-spoken languages relevant for Lab projects, such as Kurdish, Kurmanji, and the Burmese dialect spoken by the Rohingya, Google Translate could be less useful or altogether unavailable. Accordingly, it was advantageous if NGO partners could help generate and/or translate relevant search strings and location names, as Amnesty International did to assist the Lab in its discovery and verification work on Myanmar. In cases where neither Google Translate nor Lab partners could assist in translation, Lab students themselves might provide translation, as might other contacts of the Lab and Lab students, depending on the confidentiality of the project and the conditions of non-disclosure agreements on sensitive projects.
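Where Google Translate does cover a language, the same translate-and-recombine workflow can also be scripted. The sketch below is purely illustrative and assumes access to Google’s Cloud Translation client library; Lab students worked through the Google Translate web interface rather than the API.

# Illustrative sketch (assumes Cloud Translation API access, not Lab practice):
# translating a keyword list into a target language and recombining the results
# into a local-language search string.
from google.cloud import translate_v2 as translate

def translated_query(keywords, target_language):
    client = translate.Client()  # requires Google Cloud credentials
    results = client.translate(keywords, target_language=target_language)
    terms = [r["translatedText"] for r in results]
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"

# e.g., building an Arabic-language query from English brainstormed terms:
# translated_query(["hospital", "airstrike", "chlorine"], "ar")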

I realized the significance of language for search results first-hand several months into my fieldwork, in a momentary flash of regret while chatting with a student about notable sites in the Egyptian revolution. Upon her mentioning Al-Azhar University, I immediately realized a mistake I had made months earlier on a project in which I had sought unsuccessfully to track down images of the university using Google image searches. As I explained, having been new to the Lab at the time, I had searched “Al-Azhar University” in English rather than Arabic. To my frustration, endless pages of images depicted some famous mosque in Cairo by the same name. The student chortled, “yeah, it’s definitely a touristy mosque!” Because I had searched in English, Google fastened its results on the “Al-Azhar” it predicted would be more relevant to me: the tourist destination, not the local university. Precisely how and to what extent such linguistic cues impacted these search results is, like algorithm designs more broadly, unknown and unknowable; perhaps popularity or user cues (Granka 2010) had more to do with Google’s retrieval of information in this particular case. Without certain and proprietary knowledge about the workings of search engines, Lab students relied on rules-of-thumb, tacit understandings, and “folk theories” (DeVito, Gergle, and Birnholtz 2017; Gelman and Legare 2011) about algorithms and search mechanisms gained by experimentation and iteration or tips from others (Rader and Gray 2015; Eslami et al. 2015).

Consideration of Wider Posting Practices

In addition to crafting search terms, employing Boolean operators, and translating queries into local languages, open source investigators may also consider users’ posting practices more broadly, including what kind of content they post, how, and where. Chris McNaboe of the Carter Center’s Syria Conflict Mapping Project has noted that he had attempted to apply open source methods used in the Syrian context to protests in Venezuela. McNaboe acknowledged that “it quickly became clear that not everyone engages online in the same way. While Syrians tend to use Facebook and YouTube to share information, Venezuelans tend to use Zello, just as Chinese citizens tend to use Weibo, Americans Twitter, and so on.” Accordingly, the project “will have to re-shape [its] methods and approaches to fit these cultural differences” if it seeks to expand the geographic scope of its investigations (Guerrini 2014).

Experienced open source investigators and trainers visiting the Lab often encouraged students to consider possible differences in the usage of platforms or platform features across regions and online communities, and to think creatively about when and how users might post the type of content students are seeking. For instance, in Documenting Hate, students considered searching for anti-immigrant content as construction began on prototypes for the southern U.S. border wall. “Anytime things like that happen you wonder if there’s going to be an uptick,” the team manager said. Similar impulses arose with the emergence of the #metoo movement and upon the death of Charles Manson—the logic being that, since he had wanted to start a race war, people might enact violence or express hateful speech in his name in the wake of his death.

One illustrative example of how digital media cultures and users’ wider posting practices impact UGC discovery relates to the creation and maintenance of martyr databases and Facebook martyr pages. In one investigation for the Syria team, two Lab members and I were assigned to verify a video posted on YouTube which allegedly depicted a fallen Syrian anti-government fighter. An individual with the same name and date of death was registered by numerous “martyr databases,” including that of The Syrian Observatory for Human Rights. However, to confirm the identity of the individual, whose face was gruesomely shown in the original YouTube video, my team members and I had to conduct additional discovery of visual content as well as any other information that would provide specific details of the circumstances surrounding his death. YouTube searches yielded two such videos: one uploaded on the evening of his alleged death purporting to show a protest of the death of the alleged victim (who appeared to be a prominent activist in his town), and another uploaded the next day claiming to show his body being returned home. The protest video featured a large poster with a close-up photograph of the alleged victim, but we sought further visual corroboration.

After countless searches on Twitter came up empty on the alleged victim’s death, we turned to Facebook. There, we found numerous posts citing the alleged victim on various Syrian martyr pages. One such page apparently dedicated to residents of a town who had been killed

by pro-government forces featured posts with photographs of the alleged victim’s funeral, this time with additional close-ups of the deceased individual, as well as subsequent textual posts made yearly on the anniversary of the victim’s alleged death. While most of the latter comprised heartfelt condolences, one such memorial post gave the most detailed account we could find on the circumstances surrounding the victim’s death (which we corroborated with still other open sources).

Numerous scholars have written about memory practices on social media platforms (e.g., Acker and Brubaker 2014; Marwick and Ellison 2012). In the Syrian context and other conflicts, martyr pages on Facebook may serve particularly important community-based memorialization and archival functions. While much UGC emerging from Syria is uploaded and circulated for overtly political or humanitarian purposes, this particular example illustrates how users’ posts may be subject to analysis and collection for purposes uploaders perhaps did not foresee. Our usage of public posts on a martyr page honoring a loved one toward ends perhaps unanticipated by the uploader—e.g., to discover and corroborate information concerning the alleged victim’s killing—highlights some ethical concerns addressed more fully in the last analytic chapter. The particular elements of Facebook’s design that support the creation and maintenance of martyr pages are a compelling example of how platform architecture and functionality accommodate and support vernacular user practices which, in turn, shape the kinds of digital traces and documentation left accessible to open source researchers.

A second concrete example of how users’ employment of platforms and platform features impacts human rights-related UGC discovery concerns geotagging. Numerous sites and apps, including Twitter, Facebook, Instagram, and Google+, enable users to “geotag,” or attach metadata disclosing their location (if location services are enabled), to social media posts, status updates, or photos. Were it used more often, geotagging would provide immense value to open source investigators aiming to gain spatial proximity to users documenting and uploading media

connected with a location or event. Unfortunately for investigators, however, only a very small percentage of posts are actually geotagged; estimates put the share of geotagged tweets at 1-2 percent or lower.25
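For readers who work with the raw data, the sparseness of this metadata is easy to see in Twitter’s (pre-2019) tweet objects, whose optional coordinates and place fields carry the location information; the sketch below is an illustration under that assumption, not a Lab tool.

# Illustrative sketch: filtering a batch of tweet objects (Twitter API v1.1 JSON)
# for the small minority that carry location metadata.
def geotagged(tweets):
    """Yield tweets that include either exact coordinates or a tagged place."""
    for tweet in tweets:
        if tweet.get("coordinates") or tweet.get("place"):
            yield tweet

# Typically only ~1-2% of a collected sample passes this filter:
# located = list(geotagged(collected_tweets))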

Accordingly, while potentially useful to journalists monitoring social media for emerging stories and breaking news without any further criteria, third-party tools aggregating geolocation APIs were not found to be very helpful for Lab investigations, which were based on specific events and locations. One map-based tool, Ban.jo, was much anticipated when first introduced to the Documenting Hate team. Soon, however, students grew disappointed with Ban.jo and complained that alongside the occasional news story of a local fire, the platform mostly just featured posts about delicious-looking restaurant meals. In other words, although Lab students hoped to discover geotagged posts related to hate crimes or hate speech upstream users had experienced or witnessed, students mainly encountered posts by users employing the geotagging feature to index their gastronomical exploits and restaurant experiences. This example, too, illustrates how users’ posting practices and feature usage impact UGC’s discoverability downstream, by open source investigators, journalists, or, indeed, repressive government regimes.

As geotagging presents heightened security risks for users fearing government repression, dissidents and witnesses in contexts like the Syrian conflict, where the locations of hospitals, civilian sites, and anti-government forces are particularly sensitive, will be especially wary of geotagging features. Militaries and alleged perpetrators, too, might evade geotagging—potentially even while employing other means or platform features to strategically communicate their location or specify a given target audience. One NATO StratCom study of the social media practices of the Islamic State noted a 2014 edict issued by DAESH forbidding operatives from enabling Twitter’s native geotagging function given the danger it would pose. Instead, the study observed, accounts of DAESH operatives and supporters were employing hashtags within posts to signal

25 Twitter has estimated that 1-2% of Tweets are geotagged. “Tutorials: Tweet Geospatial Metadata.” Twitter Developer Tools. Last accessed January 10, 2019. Retrieved from: https://developer.twitter.com/en/docs/tutorials/tweet-geo-metadata.html. See Bryant, Martin. 2010. “Twitter Geo-fail? Only 0.23% of Tweets are Geotagged.” The Next Web. January 15. Last accessed January 10, 2019. Retrieved from: https://thenextweb.com/2010/01/15/twitter-geofail-023-tweets-geotagged/.

locations relevant to its campaigns (e.g., #StateofHoms). A key advantage of this tactic is that it produces a searchable index, “allowing anyone to search back in time for information about previous operations, political developments, and the state of affairs of other groups all over the world and more importantly, in their own region” (Shaheen 2015: 9). After my time at the Lab, in June of 2019, Twitter removed users’ ability to geo-tag their Tweets, claiming that most people didn’t use the feature (Benton 2019).

A third and particularly challenging example of how users’ posting practices may impact the discoverability of human rights-related UGC by investigators concerns self-censorship. Given the security concerns posed, eyewitnesses may well choose to forgo uploading or circulating sensitive content altogether. The decision to do so, of course, depends on myriad factors ranging from political variables, including government surveillance and repression, to technological barriers and local digital practices. As a result of these combined factors, seasoned journalists and experienced human rights investigators working with sensitive UGC may, over time, observe patterns in the posting practices of users in specific regions and either adapt their discovery to accommodate such dynamics, or leverage the knowledge of such patterns to make sense of the pathways media travel for verification purposes. One visiting former journalist and expert in open source investigations explained that,

In places like the [Democratic Republic of Congo] it’s so dangerous to be the uploader they’ll share it in WhatsApp but then someone in Denmark or something releases it to YouTube. Similarly, in Sub-Saharan Africa, things are shared on WhatsApp—and then it’s lost. But in Turkey, things are shared on the internet… It depends on the internet usage of different countries.

Consequently, the visitor added that while WhatsApp, the end-to-end encrypted messaging service, keeps people safe, it is “a great hindrance to what we do.” Platforms’ affordances and obstacles for human rights investigations, some of which are highlighted next, are all the more significant precisely because “making the users adapt is not going to work.”


PLATFORM DESIGN AND DISCOVERY OF HUMAN RIGHTS-RELATED UGC

Platforms structure the scope and visibility of online content not only through their content moderation practices and constitutive algorithms but also through elements of their architecture and design, including their search functionality and user interfaces, which enable and engender so-called passive and active modes of content discovery. The degree to which platforms support active search and search queries using Boolean operators, hashtags, keywords in diverse languages, and other tactics employed by investigators shapes platforms’ utility and amenability to the UGC discovery process. In addition, investigators may also leverage platform features which furnish comparatively more passive modes of information access, such as recommendation algorithms which “nudge” users with related content or user accounts in a social network. Next, I provide an exploratory view into how such features permit or foreclose distinct modes of informational discovery, thereby impacting the discoverability of human rights-related UGC on major commercial platforms.

Google’s search engine is also used heavily and repeatedly throughout an investigation; it was often the first place Lab students would go to assess existing reporting on a conflict, find local media coverage and contextual information, and “reverse image search” a photograph or video thumbnail or screenshot against its massive image database. However, when it comes to the actual discovery and collection of UGC, these tasks were largely accomplished at the Lab using YouTube, Twitter, and Facebook.26 Consequently, these three platforms comprise the focus of the sections below.

YouTube: Filters and Recommendation Algorithms

YouTube incorporates much of the active search functionality available on the search engine of its parent company, Google, such as support for Boolean and other operators (like quotation marks to force exact matches), as well as the filtering of search results by result type

26 There were some exceptions in which students used other platforms (e.g., VK, Russia’s answer to Facebook) to search or source primary media. Some students also used YouTube, Twitter, and Facebook to search for media which had been initially created and shared on platforms supporting more ephemeral media (e.g., Instagram, Snapchat) and later recaptured and shared on these top three platforms.


(channel, playlist, etc.), quality (HD, live broadcasts), and—especially useful for open source investigations—date frames. Just as local lexicons and languages are leveraged to gain spatial proximity to the event in question, so, too, are date-frame filters and searches employed wherever possible to approximate an event temporally. Not all platforms support temporal filters to an equal degree. Google and Twitter enable users to search for content uploaded between set timeframes, down to the day. Even with specific search queries, the results retrieved can be staggering. The first time I conducted a reverse image search on Google, for instance, I was aghast to find it had retrieved 25,270,000,000 search results. Luckily, an experienced Lab student was in the room to inform me of Google’s simple temporal filtering feature, which allowed me to confine search results to a time period of a few days before and after the alleged date of the event, which in turn produced a more manageable set of results (in that case, three pages of results).

Although such granular temporal filters are not currently available on YouTube, the platform does enable users to sort search results by the earliest upload date (as well as by view count and user rating, as opposed to merely by “relevance”) and to limit searches to content uploaded in the last hour, today, this week, this month, and this year (see image below). Given that investigators hope to trace the original iteration of a piece of content online, sorting by earliest upload date often proved immensely useful.


Figure 2.1 YouTube’s search filters (December 5, 2018)
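The same temporal narrowing is also exposed programmatically through the YouTube Data API; the following is a minimal sketch assuming an API key and the google-api-python-client library, offered as an illustration of these filters rather than of the Lab’s workflow (students used the web interface).

# Illustrative sketch: a date-bounded, date-ordered YouTube search via the Data API v3.
from googleapiclient.discovery import build

def search_window(api_key, query, start_iso, end_iso):
    youtube = build("youtube", "v3", developerKey=api_key)
    response = youtube.search().list(
        part="snippet",
        q=query,
        order="date",              # sort by upload date instead of "relevance"
        publishedAfter=start_iso,  # e.g., "2017-04-03T00:00:00Z"
        publishedBefore=end_iso,
        maxResults=50,
    ).execute()
    return response.get("items", [])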

In addition to providing search functionality to target results vis-à-vis search queries, YouTube offers investigators numerous ways to browse for related content, once they have found a relevant video or source. For instance, Lab students often searched and browsed by date through content uploaded by specific channels or users, including local media outlets or NGOs producing their own coverage. Footage often includes montages with shots culled from anonymous sources or previously-uploaded media, which may in some cases be traced back to earlier instances online by reverse image searching stills of the video.

Another crucial so-called passive mechanism for finding content related to a video deemed relevant by investigators is provided by YouTube’s recommendation algorithms, which generate suggestions for further viewing on the right pane (see image below). Though YouTube’s algorithmic nudges have been under recent scrutiny for allegedly directing viewers towards increasingly ideologically extreme viewpoints and conspiracy theories (Tufekci 2018b), this feature and its constituent algorithms helpfully provide open source investigators with relevant leads and material. Simply hovering one’s pointer over a video thumbnail in this column gives a

preview of its contents, which may include possible matches of shots or geographic landscapes of salience in an investigation.

Figure 2.2 Recommendation algorithms at work

Given the utility of the feature, Lab students and I were stunned when a change to YouTube’s interface in the spring of 2018 removed all the metadata and recommended links attached to videos that had been removed from the platform, months after news erupted over content removals and account suspensions by YouTube and Facebook related, respectively, to the Syrian conflict and Myanmar’s genocidal campaign against the Rohingya Muslim ethnic minority (e.g., Edwards 2017; Asher-Schapiro 2017). The removal of UGC from YouTube was a semi-regular occurrence at the Lab, even before this notable set of news events. In such cases, while the video itself might be rendered inaccessible, one could still read metadata attached to the video, like captions and the user account posting the content, as well as recommended videos on the side pane. Hence my surprise one April afternoon when, clicking on the link of a video which had been removed, I found the page empty:


Figure 2.3 The remnants of a removed YouTube video (April 17, 2017)

Though this design modification might go unnoticed and hold little consequence for most users, it removed what had been an important resource for UGC discovery, preventing me and others from accessing metadata or related content for videos taken down.27

Twitter: Hashtags and Tweetdeck

A vast body of scholarship has provided extensive accounts, to varying degrees of technological specificity, of the reasons for Twitter’s emergence as a widely-popular global platform for professional journalists, their “citizen” counterparts, and social movements (e.g., Murthy 2011; Bruns and Highfield 2012; Ahmad 2010; Tufekci 2017, 2018a). In addition to supporting search queries employing Boolean and other operators as well as an extensive advanced search menu which enables searching by geographic proximity and day-to-day timespans, Twitter’s network structure allows users to easily locate, monitor, and participate in topical threads of posts, which are unbarred by privacy settings and public even to unregistered

27 Apparently, this modification came with changes to what content creators could view through YouTube Analytics, which after May 2018 “no longer shows deleted videos, channels, and playlists” as well as information about them. Google. “Deleted Content in YouTube Analytics.” Last accessed June 10, 2019. Retrieved from: https://support.google.com/youtube/answer/9023776?hl=en

website visitors. Bruns and Highfield (2012: 9) point out that the micro-blogging platform, compared with Facebook, “builds on a much simpler networking structure where updates posted by users are either public or private, rather than visible and shareable only to selected circles of friends within one’s social network.” As one visiting open source practitioner noted, “we love Twitter because it’s more open.”

Users’ ability to embed citations of others’ posts furthers Twitter’s capacity for collaborative newsgathering and dissemination, as does its embrace of hashtags, first instituted by its users, which “enable public conversations by large groups of Twitter users without each participating user needing to subscribe to (to “follow”) the update feeds of all other participants” (Bruns and Highfield 2012: 10). Hashtags are a crucial resource for UGC discovery and the monitoring of news events in real-time. At the Lab, students gradually became familiar with the hashtags related to particular events; some had also used the then-free website, Hashtagify.me,28 to quickly identify a set of trending hashtags related to an event of interest.29

Given these and other news affordances, Twitter is cherished among digital journalists and open source investigators not solely for its surfacing and organization of UGC and information related to important events and breaking news stories, but also for enabling the kinds of distributed work entailed in “networked journalism” (Jarvis 2006; Rauchfleisch et al. 2017) and crowdsourcing of UGC verification and geolocation. Bellingcat, BBC’s AfricaEye, and other leading open source investigative groups use Twitter to crowdsource the verification and geolocation of UGC emerging from a conflict or crisis (e.g., Funke 2018).

One feature of Twitter was particularly crucial for Lab investigations: its (currently) free dashboard application Tweetdeck. First released as an independent app in 2008 by UK-based Iain

28 https://hashtagify.me/hashtag/tbt
29 Hashtags were not always found to retrieve local or relevant content, as when third-parties conducted “hashtag bombing.” I first learned of this social media tactic while monitoring Twitter posts for possible violence against protestors in the 2017 Kenyan elections and abruptly encountering heaps of posts by an account titled “Nairobi Hot Girls” featuring pornographic images advertising their services to the official hashtag #KenyaPoll. Though rare, such examples of hashtag bombing represent search engine optimization (SEO) efforts by users, including bots and spammers, to become “algorithmically recognizable” (Gillespie 2017).


Dodsworth, TweetDeck was acquired by Twitter in 2011. The interface is composed of customizable columns that can be set to display lists, search results, hashtags, tweets by or to an individual user, and more. In keeping with its broader usage among journalists and open source investigators, Tweetdeck was employed at the Lab extensively to monitor and make sense of events as they unfolded online via posts; to search for posts within a specific time range, by specific users, or with specific hashtags; to filter posts by their media content and other characteristics; and, less commonly, to share lists of relevant accounts and hashtags.

Figure 2.4 Tweetdeck interface (December 4, 2018)
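As an illustration of the kind of column a student might configure (my example, not a Lab artifact), a Tweetdeck or Twitter search can combine the platform’s standard operators for quoted phrases, hashtags, date bounds, and media filters; a minimal sketch for composing such a query follows.

# Illustrative sketch: composing a Twitter/Tweetdeck column query from standard
# search operators (quoted phrases, OR, since:/until:, filter:media).
def column_query(phrases, hashtags=(), start=None, end=None, media_only=False):
    terms = [f'"{p}"' for p in phrases] + list(hashtags)
    query = "(" + " OR ".join(terms) + ")"
    if start:
        query += f" since:{start}"
    if end:
        query += f" until:{end}"
    if media_only:
        query += " filter:media"  # only tweets carrying photos or video
    return query

# e.g., monitoring a hypothetical strike within a two-day window:
# column_query(["airstrike", "bombing now"], ["#Yemen"],
#              start="2019-06-01", end="2019-06-03", media_only=True)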

Altogether, these design and network structure characteristics provide robust mechanisms for active, targeted content searches useful for open source investigations, as well as more passive modes of discovery (Godart 2019). The former cannot be said of Facebook, an object of ire for UGC discovery in the Lab.

Facebook: Poor Active Search Functionality and Graph Search Workarounds

Compared with YouTube and Twitter, Facebook introduces some barriers to discovering human rights-related UGC, while providing unique affordances particularly for social network analysis. Although most Lab projects centered on the former type of investigation, the latter comprises an increasingly important methodology for assessing relationships within criminal

organizations, state and military bodies, or alleged perpetrators of human rights violations. This kind of social network analysis is conducted by myriad institutions including law enforcement agencies, legal NGOs, and independent investigative groups, like Bellingcat.

Recommendation algorithms and network structure on Facebook present mixed blessings for investigators. While recommendation algorithms can raise risks for those investigating alleged perpetrators on the platform, they may also provide value by suggesting accounts, groups, Pages, and other types of content connected to the target upstream user—be they a journalist, possible witness, or alleged perpetrator. In some Lab projects, students were encouraged to create new accounts on Facebook (and other platforms), both as a means of obfuscation to occlude algorithms’ access to their searches and identity and as a way to access information and recommendations distinct from those that would be directed to their existing accounts, since search results rely on user cues, including a user’s own network structure and Facebook activity. In addition to weighing popular and recent posts more heavily, Facebook bases a user’s unique search results on places they’ve been tagged, things they like (as expressed on their profile and Pages followed), events they’ve liked or been interested in, content they’ve interacted with in News Feed, and previous searches they’ve done.30

But, as other scholars have suggested, especially in the context of Facebook (Eslami et al. 2015; Rader and Gray 2015), the inner workings of Facebook’s algorithms may remain a great mystery even after experimental attempts at tinkering. One Lab student on the Burma team described wanting to open up an account with a distinct name, profile, and liked content matching that of the region she was investigating, believing that “it would give [her] better search results.” She explained that she just wanted to see what would be recommended to her. I informed her that students in other groups had also created realistic accounts (e.g., drawing on popular local names for pseudonyms), even sending friend requests to friends of the investigation targets. Even then, however, Lab students noted that they were getting recommendations for people they might know in Berkeley, from which they inferred that Facebook’s recommendation

30 Facebook. 2019. “What Shows Up in Facebook Search Results?” Facebook Help Center. Last accessed January 11, 2019. Retrieved from: https://www.facebook.com/help/113625708804960?helpref=search&sr=4&query=search.

algorithms weighted where the internet said they were (based on their VPN), rather than where they had set their location in their alias accounts.

Notably, some of the search functionality enabled by YouTube and Twitter is either unsupported on Facebook or supported with far less facility. Accordingly, the general consensus among Lab students and visiting experts shared in trainings was that “no one really understands how [Facebook search] works” and that conducting content discovery on the platform “involves a lot of scrolling,” as most students defaulted to looking up specific Pages or user accounts and scrolling down for earlier content, praying the page does not refresh due to some accidental click.

Although Facebook introduced hashtags in 2013, allowing users to click on a hashtag to see a feed of users and Pages citing it, hashtags are not supported in search queries to the same extent as on Twitter and YouTube. In addition, Boolean operators (AND, NOT, and OR) routinely malfunction, retrieving, for instance, posts containing the literal words “and” and “or.” Accordingly, composing queries in the search box was notoriously ineffective. To circumvent this issue, Lab students used numerous workarounds, such as conducting “site domain searches” of Facebook by entering the query “site:facebook.com [keywords]” into Google. Google’s search results also often include links to social media or YouTube content that has since been deleted but was picked up and indexed by Google’s crawlers.
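A minimal sketch of that workaround, offered as an illustration under my own assumptions rather than as a Lab script, composes the site-restricted query and the corresponding Google search URL.

# Illustrative sketch: building a site-restricted Google query to reach public
# Facebook posts that the platform's own search box handles poorly.
from urllib.parse import quote_plus

def site_search_url(site, keywords):
    query = f"site:{site} " + " ".join(f'"{kw}"' for kw in keywords)
    return "https://www.google.com/search?q=" + quote_plus(query)

# e.g., surfacing public posts containing both terms:
# site_search_url("facebook.com", ["urgent", "airstrike"])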

In general, components of Facebook’s search design can be said to reflect its purpose and predominant use as a social networking site, as opposed to, say, a platform better known for news media and public conversations like Twitter.31 Manuals for using Facebook’s search engine, Graph Search, emphasize its usage to search for posts, media, places, and other content types which are published, cited, or otherwise linked to people a user is already connected to as a friend or subscriber. Typical examples would be searches like “Caroline wedding or cookie recipe Lisa” or “friends

31 Hutchinson, Andrew. 2016. “Should You Use Hashtags on Facebook? Here’s What the Research Says.” SocialMediaToday. June 19. Last accessed on January 11, 2019. Retrieved from: https://www.socialmediatoday.com/social-business/should-you-use-hashtags-facebook-heres-what-research-says.

who live in San Francisco.”32 Crucially, unless content is shared publicly, it will only surface in search results for users directly connected to that content already; in other words, “if someone you're not friends with posts and shares to friends only, then you won't see their post in your Facebook search results.”

Commentators and user forums have noted persistent issues with Facebook’s advanced Graph Search,33 and Lab students did not adopt it frequently during my fieldwork, although open source researchers at other sites relied on it heavily. As workarounds, some Lab teams and students employed now-defunct third-party tools like Graph Tips and StalkScan, which plumbed Facebook’s Graph Search yet presented results via a more user-friendly and centralized interface. Stalkscan’s website, for instance, promised to show you “[a]ll ‘public’ info Facebook doesn’t let you see” but emphasized that the tool “only shows hidden content you have access to.” With the URL of a Facebook user’s or Page’s account, StalkScan and GraphTips allowed anyone to view a wide range of the targeted account’s activity at a glance, such as the photos they’re tagged in, posts they’ve liked or have commented on, and more. An additional benefit of these tools for the open source investigator was that they obfuscated one’s presence, for the purposes of covert investigations. Normally, if a user visited and spent long enough on someone’s page on Facebook, the user’s account might surface in the targeted user’s suggested friends list. In contrast, one could use StalkScan or GraphTips liberally without triggering the awareness of the targeted account’s manager. Accordingly, at the Lab, these tools took on special value on projects aimed at

32 Facebook. 2019. “Search Basics.” Facebook Help Center. Last accessed January 11, 2019. Retrieved from: https://www.facebook.com/help/460711197281324/ 33 Shontell, Alyson. 2014. “One Year After Its Mega-Hyped Launch, Mark Zuckerberg Admits Graph Search Doesn’t Work.” Business Insider. February 1. Last accessed January 11, 2019. Retrieved from: https://www.businessinsider.com.au/mark-zuckerberg-admits-graph-search-doesnt-work-2014-1; Greenfield, Rebecca. 2013. “Facebook Graph Search Still Doesn’t Speak Human.” The Atlantic. January 29. Last accessed January 11, 2019. Retrieved from: https://www.theatlantic.com/technology/archive/2013/01/facebook-graph-search-still-doesnt-speak-human/318952/; “Facebook Graph Search Doesn’t Work After Recent Update.” StackExchange. Last accessed January 11, 2019. Retrieved from: https://webapps.stackexchange.com/questions/71070/facebook-graph-search-doesnt-work-after-recent-update.

investigating the social networks and online statements of military officials and persons of interest.

Unlike its unhelpful search box, Facebook’s Graph Search was said to be “an amazingly useful tool for researchers, including those investigating extremism, fraud and war crimes” (Meyers 2019). Paul Meyers, an open source investigator with the BBC, said that “Graph went well beyond the limitations of the current search box and involved researchers composing special web addresses to make Facebook perform customised searches. However, on the 6th June 2019, Facebook started taking measures to prevent people doing this.”

Although Graph Search only retrieves content that is either public or connected to the user, it has long raised significant privacy concerns among users. Given the complicated nature of the platform’s system of privacy settings, users could easily overestimate the degree to which their content is covered by privacy protections or simply not understand how and why their content is retrieved in searches.34 As one blog writes, “[p]eople, of course, have, and always have had, the capacity to control what's made available via search, but the complexity of Facebook's various privacy settings meant many weren't as covered as they would like, and thus, the implementation of Graph Search lead to a lot of complaints.”35

Since the Cambridge Analytica scandal, Facebook has been implementing modifications to enhance the privacy of users on the site. Some journalists and open source investigators have said these changes have thrown a wrench in their work and the discovery practices on which they relied (Silverman 2018). Facebook’s changes to Graph Search were a recent example of this, causing an uproar among open source investigators.

34 E.g., Facebook. 2019. “I’m Showing Up in the Results of Other Search Engines Even Though I’ve Chosen Not To.” Facebook Help Center. Last accessed January 11, 2019. Retrieved from: https://www.facebook.com/help/112100708878206?helpref=search&sr=1&query=search. 35 Hutchinson, Andrew. 2016. “Should You Use Hashtags on Facebook? Here’s What the Research Says.” SocialMediaToday. June 19. Last accessed on January 11, 2019. Retrieved from: https://www.socialmediatoday.com/social-business/should-you-use-hashtags-facebook-heres-what-research-says.


Days after the modification, Nick Waters of Bellingcat told me that in his recent work on airstrikes in Yemen for the Yemen Data Project, he would use Facebook most frequently of the large commercial platforms, since it seemed to be the platform of choice for Yemeni civilians. “Until a few days ago with the removal of Graph Search, you could find huge numbers of posts on airstrikes”—posts helpful in supplying contextual details, leading to the discovery of images and videos, and providing signals as to the time of the strike itself—“usually you’d be able to find one or two posts from someone saying, ‘urgent, urgent, urgent, bombing happening now at this location!’” After the Graph Search modification, Nick Waters put out a crowdsourcing call asking for volunteer assistance to “find as many Facebook posts as possible” related to airstrikes in Yemen over a particular time range “before we can’t find them anymore”:

As you may know, yesterday Facebook removed a vital functionality allowing researchers to search for keywords based on time. Needless to say, this is a vital tools for us. We’ve currently using [sic] a workaround with [sic] uses the “mobile” version of Facebook, but we don’t know how long that workaround will last.36

Sam Dubberley (2019) called the change a betrayal of the human rights community that could lead to “potentially disastrous results” and urged Facebook to discuss this and future changes with human rights investigators before implementation. Similarly, Alexa Koenig emphasized that “[w]e need Facebook to be working with us and making access to such information easier, not more difficult.” In an article by BuzzFeed News’ Craig Silverman (2019), Koenig was quoted as stating that

To make it even more difficult for human rights actors and war crimes investigators to search that site—right as they’re realizing the utility of the rich trove of information being shared online for documenting abuses—is a potential disaster for the human rights and war crimes community.

36 Bellingcat tweet. Last accessed June 9, 2019. Retrieved from: https://twitter.com/bellingcat/status/1137378085127512065. On June 9, 2019, Bellingcat included the link to the Tweet on its listserv, adding, “SUCCESS! A crucial search loophole was closed by Facebook after it was found to negatively impact open source research.”


This outcry is especially notable given that Facebook’s motivation in executing the change was to address longstanding security concerns, resuscitating tensions between individual users’ right to privacy and the public’s right to information, echoed in policy debates over privacy regulations and The Right to Be Forgotten (Kaye 2019). Countering the apocalyptic claims of his open source counterparts, Tom Trewinnard (2019) issued a reminder that, just as the modification comprises an obstacle for investigators, so, too, does it introduce barriers for adversaries to surveil and attack marginalized and targeted individuals and groups. He concludes that “as journalists and human rights researchers we have to step back and see a bigger picture that includes all the potential misuses of the tools and methods we’re developing and using.”

IT TAKES THREE TO SEARCH

This chapter provided an in-depth account of UGC discovery practices and the multi-layered roles of major commercial platforms as so-called “accidental archives” for human rights documentation in the digital age (Syrian Archive 2019). In the absence of similarly popular alternatives, open source investigators and human rights groups must rely heavily on major commercial platforms to find human rights-related UGC precisely because witnesses, activists, alleged perpetrators, and other upstream users continue to upload and circulate content there.

Recognizing that fear of retribution or wider safety risks may effectively deter individuals and communities from posting human rights-related documentation online, numerous applications have emerged in recent years aiming to offer eyewitnesses the ability to record, save, and securely send eyewitness media content. OpenArchive’s Save is an open-source, free mobile app that allows users to apply Creative Commons licenses and metadata to eyewitness media and to preserve back-ups on the Internet Archive and private servers. The International Bar Association’s eyeWitness to Atrocities, by contrast, is a closed-source mobile app that sends encrypted media files with their metadata directly to a LexisNexis-hosted storage facility maintained by the eyeWitness organization for later use in litigation. In addition to minimizing the security risks of posting content to the open web and serving a vital preservation function, such technologies are designed to retain the metadata attached to content (e.g., time, date, and location of capture) that is stripped when uploaded to social media platforms such as YouTube,


Facebook, and Twitter. Other applications include Tella; the CameraV app by the InformaCam project (a partnership between The Guardian Project and WITNESS); and the American Civil Liberties Union’s Mobile Justice app, used to record and report police brutality.

Despite the growth of these apps in the last half decade, visiting experts and Lab staff often suggested anecdotally during my fieldwork that such apps suffer from poor adoption, noting that marginalized communities may be neither aware of the existence of these apps nor willing to download and employ them. One visiting open source practitioner reasoned, “sure, the content goes safely from the middle of the Congo to London but, who’s going to download your app? They’re just going to put it on Facebook.” “In the news industry,” the visitor added, “people have been trying to do this the last decade: what if we have our own newsgathering platform? People have tried and failed”:

The accepted position [for now is that] we have to understand how the uploader thinks and functions and adapt to that. Because, people just like Facebook, right? And the general reflex of people is to upload to Facebook. And the general reflex of an activist is to scrape it from Facebook and put it on YouTube.

Similarly, HRC Executive Director Alexa Koenig noted that “[t]he human rights community that works in digital tech has come to understand that creating new apps and channels doesn’t make sense for anybody” (Rosen 2018). “YouTube controls that space, and they will for the foreseeable future,” she added. Zeynep Tufekci (2017) has gestured to how activists’ posting practices may take into consideration platforms’ network effects, a term referring to systems that generate greater value for users as a function of the number of users employing them. Tufekci points out that in light of platforms’ network effects, “activists [may] find themselves compelled to use whatever the dominant platform may be, even if they are uncomfortable with it” (20) in order to attempt to reach wider audiences.

This chapter recounted how platforms powerfully shape the availability and discoverability of human rights-related UGC in myriad ways: not solely via content moderation and algorithmic curation, but through aspects of their design and architecture as well. Findings cohere with Grimmelmann’s (2014: 873) observation in the context of lawsuits against search

engines that “[i]t takes two to tango. But, it takes three to search”: the search engine, the website indexed and retrieved by search engines, and the search user. The phrase was meant to highlight the search user as an agentic actor whose perspective is overshadowed in lawsuits against search engines, typically framed around the platform or search engine used to conduct a search on one hand, and the sites and blogs (or, in this case, upstream content creators and uploaders) on the other. Rather than a passive audience happy to consume whatever online information is presented to them, search users have specific queries in mind and thus turn to search engines not so much as neutral conduits to web content or, conversely, selective editors, but as trusted advisors (874). The notion that “it takes three to search” captures the multi-level dynamics between human rights investigators, platforms, and upstream users; collectively, these actors play an important and often interwoven role in influencing the creation, circulation, and ultimate discoverability of human rights-related UGC.

With its focus on the relatively sophisticated practices of open source investigators, this chapter presents an extreme case of Grimmelmann’s archetype, the discerning search user, who, I would argue, tends also to be overshadowed in scholarship lamenting algorithms as a force controlling the attention of the masses. Much scholarly concern has fixated on the power of search algorithms to drive public attention, tastes, and preferences, as search users are assumed to be passive in their information consumption and search practices. As DiMaggio et al. foreshadowed as early as 2001 in their call to sociologists to study the internet, “Web destinations that are displayed prominently on portal sites or ranked high by search engines are likely to monopolize the attention of all but the most sophisticated and committed Internet users” (314). More recently, scholars and commentators have expressed similar concern about algorithms’ power to govern public attention (e.g., Tufekci 2013) and “steer [users’] choices through the sheer knowledge about their interests and biases” (Helberger 2018: 156).

In this chapter, however, Lab students’ navigation of online information infrastructures and interfaces comprised less a story about algorithms’ “‘effect’ on people” than about, borrowing from Gillespie (2017: 183), “a multidimensional ‘entanglement’ between algorithms put into practice and the social tactics of users who take them up.” Algorithms become entangled with users’ practices—and the reverse is also true: as “algorithms nestle into people’s daily lives

and mundane information practices, users shape and rearticulate the algorithms they encounter” (Gillespie 2017: 183). Over time, my continual searches on my YouTube account accrued, such that the platform recommended videos of the Syrian and Burma conflicts even while I was viewing other types of content for personal use. Though not widely generalizable to most search users, the discovery logics, techniques, workarounds, and folk theories attempting to “make sense” of algorithms introduced in this chapter offer empirical examples of “the practices, representations, and imaginaries of the people who rely on algorithmic technologies in their work and lives” (Christin 2017: 2; see also Bucher 2016, Gillespie 2016). The chapter also highlighted some of the concrete “affordances-in-practices” (Costa 2018) that investigators derive from the opportunities and tactics made available to them by platform architecture and design.

Aside from the tactics of the downstream searcher, there are those of the upstream user. Indeed, this chapter revealed two sets of upstream users present in the context of open source investigations: the actual users posting and circulating content online, and the imagined user—be it a victim, witness, or perpetrator—whose conjuring constituted a useful (and at times affective) device for tailoring students’ search strategy to their investigations. Direct knowledge of upstream users’ circumstances and motivations is often inaccessible to open source investigators; otherwise, investigators could resort to other means to attain content, such as direct contact with grassroots networks in areas of conflict or third-party “evidence lockers” such as eyeWitness to Atrocities. Neither were options for Lab students; as discussed in subsequent chapters, the absence of any contact whatsoever with other users online was a matter of strict Lab policy. And yet, users’ practices were nevertheless visible in the patterns and digital traces of the media they left behind: their commemorations of loved ones killed in war, the absence of geotags, their choices to affix contextual information to their media, their public or private circulation of posts. The ways that witnesses, alleged perpetrators, and other users decide to participate on platforms (or don’t) impact the discoverability (and verification) of their content and posts by open source investigators. Making matters more complex, these decisions are likely also shaped in the first place by elements of platform design, vernacular digital cultures, and their understandings of how platforms are used by others.


Even as this chapter’s findings were fastened to the point of view of Lab students and open source investigators, the specter of the upstream user—their imagined circumstance, lexicon, language, relationship to the incident in question, etc.—guided many of the tactics employed. Consequently, whereas much research has focused on users’ “imagined audience” in the context of online behavior (e.g., Litt 2012; Litt and Hargittai 2016; Baym and boyd 2012), open source investigators’ practices shed light on how imagined producers play a role in the context of content’s discoverability. The imagined content creator turns out, too, to be of moral and discursive significance; the last analytic chapter more deeply interrogates the narratives and affects by which human rights open source investigators take up the role of content custodians for human rights-related UGC. This stance, at once reasonably protective yet unavoidably patronizing, has the potential to drive questionable practices, such as the large-scale archival of UGC without users’ awareness and consent, in the name of upstream users’ safety.

Besides downstream investigators and upstream users, platforms complete the basic triad. Recent scholarly and public attention has scrutinized the tremendous power of platforms to take down UGC seemingly arbitrarily, shutter accounts, and deprioritize or promote content selectively through both their content moderation practices (Caplan, Hanson, and Donovan 2018) and algorithms that sort, recommend, and retrieve content in searches (Introna and Nissenbaum 2001; Noble 2018).

This chapter leveraged numerous examples to illustrate how other elements of platform design, such as search functionality, user interface, and network structure, impede and support the discoverability of human rights-related UGC. While adapting their search tactics to imagined upstream users, human rights open source investigators must also adapt to sudden and seemingly arbitrary modifications to platform interfaces and features—needing thus to account for the ephemerality of media hosted by platforms as well as the ephemerality and dynamism of platform interfaces and functionality. Though the impact of specific platform features on informational accessibility has garnered less media attention in the past than the role of content moderation or algorithmic curation, this may change in the future: recent outcry over a modification to Facebook’s Graph Search is just one episode among broader efforts by human rights investigators and advocacy groups to position themselves as critical stakeholders meriting

a space at the table in platform governance (Gorwa 2019) and a voice in company deliberations—whether in decisions regarding ad hoc modifications to search features as in this case, or in considerations of more sweeping changes to content moderation practices and policies, as exemplified by advocacy groups’ complaints about being largely excluded from the negotiation and crafting of the Christchurch Call in May 2019, in the wake of the March 2019 Christchurch mass shootings in New Zealand (Access Now 2019b; Aswad 2019).

In light of search algorithms which tend to favor mainstream media outlets and social media pages over local news agencies or individual users, this chapter found that open source investigators search for human rights-related UGC by getting proximate to the event and adapting to upstream users through a combination of so-called active and passive search techniques; tactics span the spectrum from using Boolean operators and exact keywords to exert control over search results, to leveraging recommendation algorithms provided by Facebook and YouTube to discover content and social network connections. In the absence of official information concerning the workings of proprietary algorithms, Lab students relied on tools and tips shared by experienced open source investigators and journalists concerning the digital practices of disparate communities, as well as on tricks and folk theories they developed themselves and shared with each other over time.

This case thus raises familiar questions concerning information intermediaries’ power to grant visibility and legitimacy, which are taking on renewed significance today. Although two-thirds of American adults continue to obtain news from social media platforms,37 proposed legislation and calls for platforms to more aggressively moderate content have compelled platforms to reconsider their roles, responsibilities, and liability as intermediaries of information and newsworthy content. For instance, facing scrutiny of disinformation and extremist content on the site, Mark Zuckerberg announced in January 2018 that Facebook would be overhauling News Feed to reduce its promotion of news while more heavily promoting posts by friends and

37 Matsa, Katerina Eva, and Elisa Shearer. 2018. “News Use Across Social Media Platforms 2018.” Pew Research Center. September 10, 2018. Last accessed on January 14, 2019. Retrieved from: http://www.journalism.org/2018/09/10/news-use-across-social-media-platforms-2018/.

family.38 Mounting governmental pressure to remove hate speech and terrorist, extremist, and graphic content is driving the increased use of algorithms to flag potentially problematic content, with negative implications at times for human rights documentation (Citron 2018; Electronic Frontier Foundation, Syrian Archive, WITNESS 2018). There is a worry that platforms’ evolving strategies with respect to algorithmically-driven content moderation will heighten the “threat of invisibility” (Bucher 2012) of human rights-related UGC and further endanger its discoverability by human rights groups (Warner 2019).

And, with respect to search algorithms, how should the general “relevance” of information pertaining to news and international conflicts be considered? More specifically, whose perspectives concerning state violence and human rights violations should be promoted—e.g., those of mainstream media websites and social media accounts with more followers and posts, or those of people who have lived through the conflict firsthand? I am not suggesting that search algorithms be reworked to surface UGC of abuse at the top of searches; often grisly, decontextualized, and in the local language, these media clearly may not be helpful for most users to become more informed about an incident—if we can assume that to be a central criterion in defining relevance. Indeed, such content may be manipulated or misattributed. How, exactly, Lab students, open source investigators, and journalists verify UGC is the focus of the next chapter.

38 Zuckerberg, Mark. 2018. “Bringing People Closer Together.” Facebook Newsroom. Last accessed January 14, 2019. Retrieved from: https://newsroom.fb.com/news/2018/01/news-feed-fyi-bringing-people-closer-together/; Vogelstein, Fred. 2018. “Facebook Tweaks Newsfeed to Favor Content From Friends, Family.” Wired. January 11. Last accessed January 14, 2019. Retrieved from: https://www.wired.com/story/facebook-tweaks-newsfeed-to-favor-content-from-friends-family/.


Chapter 3: Assembling Ground Truth from Afar

“Today, we’re going to turn you into an amateur crime scene photographer,” said Kelly, with a giddy, beaming smile. “Crime scene photography is very methodological, very tedious,” she continued. “It’s challenging to get our activists to do it well, and be able to communicate that well to the people who are in Brussels, or Geneva, or Berkeley.” In addition to serving as Senior Attorney and Program Manager at WITNESS, Kelly Matheson is also a filmmaker herself. WITNESS was founded in 1992 by musician and activist Peter Gabriel, who became inspired by video’s potential to expose injustice after the release of footage showing Rodney King being beaten by Los Angeles police officers. Through the years, the international NGO has built tools, created guides, and led trainings to help individuals and communities safely and strategically produce videos documenting human rights abuses. While leading the Video as Evidence program at WITNESS, Kelly has developed a practitioner manual or “field guide” to impart tips to media producers and activists on how they can enhance the quality and probative value of their video documentation for eventual use in court. Invited to Berkeley to speak to Lab students and members of the public, Kelly devoted her talk to sharing some tips and exercises from her trainings with activists around the globe.

When it comes to filming, Kelly explained, you must first ensure the scene is safe, as filming can be dangerous. You’ll also want to develop a collection plan ahead of time and a to-do list of video shots. Then comes filming. Kelly played a short video created for media activists; my notes captured some of the lengthy instructions for would-be documenters of abuse:

Film the sky to show the weather and the angle of the sun or moon. Film a landmark to verify your location. Film a 15-second, 360 pan from your standing point. When it’s many different shots, you have to verify each one of them, so film as continuously as possible while holding the camera steady…

Given that human rights open source researchers search for landmarks, signs, or distinctive landscapes when geolocating footage, eyewitnesses would be wise to include these features in their footage. In addition, Kelly argued that filmers’ work isn’t complete after merely capturing footage. She probed Lab participants for their wish list of “information [they’d]

want the filmer to add up front to make [students’] jobs so much easier.” “Time! Date and location!” students called out. “Location, down to the street!” Kelly grinned and added a few more suggestions to the mix: producers’ contact information, group or organization, a map of footage collection. Whether said verbally on camera, shown on screen, or affixed to a video description posted online, these are the types of contextual information that activists and media producers should include with their videos for verification purposes before posting them online or sharing them for advocacy or accountability mechanisms.

By reducing the time and cost of verifying content downstream, Kelly’s techniques for media producers comprise examples of what Ella McPherson (2015a) has called verification subsidies. “Verification subsidies, powered by human and machines, either take on some of the labor required by various verification strategies or support the provision of metadata” (McPherson 2015a: 5). According to McPherson’s conceptualization, the gamut of verification subsidies is wide. It includes strategies for media producers and uploaders to provide salient visual and contextual information on or off screen, as well as efforts to distribute the work of verification (e.g., through crowdsourcing on Twitter, or outsourcing content to sites like the Lab) to make investigators’ jobs easier. A third example for McPherson is software that preserves content’s original metadata for investigators and lawyers, like the applications described in the last chapter that encrypt and send eyewitness media directly from witnesses’ phones to organizations for safekeeping and litigation.

Not simply an added benefit for investigators, verification subsidies may in some cases determine whether or not content can be verified by open source methodologies at all; without them, McPherson (2015a) argues, verification may be too time-consuming, costly, or unfeasible for human rights investigators. However, awareness of how investigators verify content and the importance of verification subsidies may be unevenly distributed across the range of would-be filmers of abuse. While more professional and connected media producers may learn about verification techniques and subsidies through knowledge-sharing networks, events, and resources in the human rights field, more marginalized activists or bystanders (“accidental” witnesses) may not know to provide verification subsidies. Accordingly, McPherson fears, the latter category of

media producers may end up risking their lives to capture footage which is ultimately unverifiable with current open source methodologies and tools.

Whereas the previous chapter examined what makes UGC discoverable on platforms online, this chapter explores what makes UGC verifiable. Kelly and McPherson point to several factors they argue make a difference in verification: visual markers captured in the frame itself, additional information attached to media, and the preservation of media’s metadata. In one way or another, these refer to information about a single piece of eyewitness media. Without denying the significance of these kinds of verification subsidies at the level of UGC (indeed, while showing how they do come to matter), this chapter underlines the importance of two additional buckets of information that are crucial to verifying any specific piece of UGC using open source methods, but which might exist externally and independently of that particular content.

One bucket of information important for verification aside from content-level data is the universe of available media and information about the incident itself. Noting that incidents vary greatly with respect to the data traces they generate, scholars and research organizations have acknowledged the selection bias inherent in the collection of UGC and other types of online information for research on human rights violations and conflicts (e.g., Ball 2016; Price and Ball 2014; Whiting 2015; Koettl 2017).39 For instance, incidents occurring in geographic contexts lacking technological infrastructures, media practices, and social networks in place to support the documentation of human rights abuses and its widespread distribution can be expected to generate less UGC compared with incidents occurring in locales with robust resources and channels to create and share eyewitness media. Adding to these factors, circumstances that are especially dangerous and dynamic (e.g., cases of bombardment or evacuation, or contexts with heavy state surveillance of social media users) may diminish the availability of UGC created and then posted online. Local independent media and geopolitical dynamics are yet additional factors which influence the overall volume of information available about an incident, as news coverage,

39 For project methodologies, visit the websites of Syrian Justice and Accountability Center, Human Rights Data Analysis Group, Armed Conflict Location & Event Data Project.


NGO reports, and independent fact-finding will exist for some contexts and conflicts and not others.

After content-level and incident-level information, a third bucket of information enrolled in the verification process represents or provides granular information about the place in which an incident occurred. Especially crucial for geolocating UGC and incidents, this class of what might be called “ambient” place-based information includes street maps, commercial satellite imagery, and photos or videos that users may have uploaded online about a place years or even decades prior, without any connection at all to the incident. The usage of “ambient” here refers both to the fact that these types of data give investigators a helpful picture of the surrounding landscape, critical for pinpointing the exact location where an incident took place and where UGC may have been captured, and to the sense that the purpose, production, and circulation of this kind of information may well be divorced from circumstances tied to the incident under investigation.

As this chapter will argue, since verification involves corroborating UGC with and against many types of information, successful verification outcomes depend on an adequate degree of media saturation or ubiquity in relation to the incident itself and where it occurred. The creation and online availability of such information depends on social, technological, economic, and political factors (abbreviated “STEP” in this chapter) which operate at different timescales and come into play at different moments in the course of a conflict, an investigation, and UGC’s trajectory. Far from depending solely on the kinds of information media activists are able to capture on camera or affix to their footage, the verifiability of visual UGC using open source methods is shaped by a host of broader social and structural factors which exist outside of (and before) the frame. These factors, in turn, have the potential to produce disparities in verification outcomes.40

In the context of remote-sensing technologies, “ground truth” refers to information collected on-site used to calibrate and measure the accuracy of the input data (e.g., aerial

40 By “verification outcomes,” I’m referring to investigators’ ability to adequately verify and geolocate a given piece of user-generated content using open source methodologies.

photographs, satellite images).41 One might think, then, of the work of UGC verification vis-à-vis open source methodologies as involving tracking down and assembling “ground truth” data like images, videos, and reports produced in proximity to an incident. Pointing to the limitations of satellite imagery in capturing the root causes of human rights violations, humanitarian crises, or environmental destruction, Karen Litfin cautioned that groups seeking to leverage satellite data to advance their goals “will find that satellite data needs to be supplemented with substantial ‘ground truthing’… Thus, there is a strong need to pair satellite data with sociological and anthropological appraisal tools on the ground” (Litfin 2002: 81). Similarly, this chapter points to the methodological risks entailed in over-reliance on open source methodologies to investigate human rights violations, a temptation for some individuals and organizations attracted by the affordances and relatively low cost of open source information. In addition, by revealing the considerable universe of data necessary to confidently verify UGC using open source methodologies, these findings temper celebratory narratives lauding the potential of open source investigations to shed light on those events which are most neglected by media sources and occurring in the most resource-barren of contexts.

Next, I’ll provide a brief overview of the verification process as undertaken at the Lab and similar sites. As mentioned in the introduction and second chapter of this dissertation, verification methods employed at the Lab consist of manual techniques to confirm that largely visual UGC is what it purports to be, and thus has not been misattributed or misrepresented, whether unintentionally or for malicious purposes. These are distinct from forensic authentication strategies. After highlighting notable elements of verification, a findings section will address, in turn, the three buckets of information identified above as being important to the verification of visual materials alleged to depict attacks and human rights abuses. I will draw on ethnographic observations, interviews, secondary information, and case studies from Lab projects, with special attention to investigations related to Syria and Myanmar in order to

41 The term “ground truth” has also been imported to other fields, such as machine learning and computer vision, to denote information obtained by direct observation rather than inferred.

highlight important distinctions between these contexts. Finally, the chapter will end with a conclusion outlining some implications of the findings.

The three buckets of information I analyze in this chapter—incident-, content-, and place-level information—are rather arbitrary, and are meant as neither an exhaustive nor even a particularly useful typology for considering all of the different factors and kinds of information which might come to bear on the verification process. Often, verification is very ad hoc. It may be very hard to predict what content will or will not be verifiable. Verifying a piece of UGC can take minutes or months. Instead of employing these three buckets, I could have organized my findings around the STEP (social, technological, economic, and political) factors that impact the verification process. Alternatively, I might have used a temporal analytic, outlining which circumstances come to be salient at the moments of UGC creation, upload, discovery, and verification. The typology itself is not what is important here. Rather, my overall purpose is simply to illustrate in a very exploratory manner the many kinds of structural factors and social disparities that come to impinge on the verification process and how these factors, in turn, may beget further disparities in verification outcomes tied to UGC emerging from different conflicts and contexts.

Briefly, a few limitations of this chapter are important to note. Despite considerable overlap between the Lab’s verification methods and those practiced increasingly by other human rights groups and journalists, the scope of Lab students’ verification practices was circumscribed in three crucial ways compared to these counterparts, due in part to the Lab’s outsourcing relationship with NGO partners. First, rather than conduct comprehensive analyses of a conflict or crisis, students’ tasks were largely confined to finding, verifying, and geolocating open source visual media. As one student manager told me, “it’s never our responsibility to determine the motive” or “shape the narrative.” Rather, “our job is just to determine the facts of the video—like, the where and when it happened.” Second, in most projects, students’ verification work was transferred to NGO partners, who were responsible for making the final call on whether to use content. Accordingly, while it is a crucial topic, the extra steps and deliberation taken to ensure that UGC is, with a high degree of certainty, trustworthy and credible are not addressed here given that Lab students were not the ultimate decision-makers. Third and relatedly, Lab students were

prohibited from reaching out to users online due to a policy intended to prevent endangering eyewitnesses or victims and, on legal projects, to also avoid interfering with legal processes. In contrast, many groups collecting and verifying UGC, as well as journalists, do establish and maintain contact with UGC sources and sources on the ground as a further way to verify content, such as by asking uploaders for additional context and media captured from the same incident (Wardle, Dubberley, and Brown 2014; Silverman 2014).42 While curtailing my ability to observe how uncertainties with respect to content verification are managed and resolved, the Lab’s no-contact policy does raise questions about the biases and shortcomings of relying solely on open source methodologies for human rights investigations, examined also in the next chapter.

THE CENTRALITY OF CROSS-CORROBORATION: UGC VERIFICATION AT THE LAB

A Council of Europe report distinguishes three types of problematic informational content, differentiated by their degree of inaccuracy and intention to harm, deceive, or confuse (Wardle and Derakhshan 2017: 5). Misinformation refers to false or inaccurate information spread without intent to harm, mislead, or confuse. Disinformation refers to the intentional creation and circulation of patently false information meant to harm, mislead, or confuse. Finally, malinformation refers to accurate and often private information shared publicly with the intention of causing harm.43 Of the three categories, open source investigations at the Lab largely focus on misinformation and disinformation, like wider efforts and initiatives

42 As an example, the Syrian Archive claims that it “has established a trusted team of citizen journalists and human rights defenders based in Syria who provide additional information used for verification of content originating on social media platforms or sent from sources directly.” Syrian Archive. “Research Methodology.” Last accessed July 13, 2019. Retrieved from: https://syrianarchive.org/en/tools_methods/methodology. Similarly, Chris McNaboe of the Carter Center’s Syria Conflict Mapping Project has said they “confirm[] reports with contacts within Syria whenever possible” (Guerrini 2014). 43 For instance, misinformation describes someone unwittingly sharing an article whose headlines, visuals, or captions don't match the content or an audience misinterpreting satirical content as accurate information. Disinformation refers to the creation or spread of manipulated or fabricated content, imposter content (in which official sources are impersonated), hoaxes, and scams, as well as cyberwar/disinformation campaigns designed to sow mistrust and confusion regarding the authenticity of sources. Examples of malinformation include doxing and other forms of online harassment or hate speech that involve the public dissemination of private information pertaining to an individual or organization (Wardle and Derakhshan 2017: 5).

aimed at enhancing informational integrity and verification on internet platforms (e.g., Credibility Coalition, First Draft News).44

In addition, the Lab is mostly focused on misinformation and disinformation in the form of audiovisual as opposed to purely textual UGC. Generally speaking, practitioners refer to three main kinds of misinformation or disinformation in audiovisual formats: content which is staged or outright fabricated, content which is doctored or manipulated, and content which is “misattributed,” meaning that the video footage or photograph may be authentic (i.e., not deliberately doctored) but that the metadata attached to it (e.g., its captions and descriptions) do not accurately describe its content or the context in which it was captured (McPherson 2018). Certainly, governments and groups in conflict settings may share deliberately misleading or fabricated content (Guerrini 2014; McPherson 2018). However, despite increasing hand-wringing in the media about the imminent threat of “deepfakes,” synthetic media generated with machine learning which could be shared with malicious intent (Chesney and Citron 2019; see Schwartz 2018 for a primer), human rights groups and investigators are currently still most preoccupied with what WITNESS’ Sam Gregory has called “shallowfakes,”45 pervasive content that is misattributed, slightly misleading, or unintentionally false as opposed to fabricated outright (Koettl 2016a). One visiting open source practitioner mentioned in a training that “in our human rights area we don’t see a lot of this [manipulation] actually, so the best thing, I think, is just using your human eye. I have yet to see something that’s really really photoshopped.”

As a result, verification techniques, training materials, and internal workflows did not employ strategies to check for manipulated or staged content, nor did they have as much to do with techniques associated with “authentication” and “digital forensics,” as media coverage has suggested (e.g., Melendez 2017; Tannenbaum 2017; Fortune 2018; Human Rights Center 2018a).

44 In contrast, data protection legislation and policies tend to address privacy, harassment, and security (both digital and physical) issues entailed in instances of malinformation. 45 Johnson, Bobbie. 2019. “Deepfakes are Solvable – But Don’t Forget that “Shallowfakes” are Already Pervasive.” MIT Technology Review. March 25. Last accessed June 16, 2019. Retrieved from: https://www.technologyreview.com/s/613172/deepfakes-shallowfakes-human-rights/.


Notwithstanding a few notable examples,46 “verification” at the Lab mainly referred to in-depth analysis of audiovisual content to confirm that it had not been misattributed or inaccurately contextualized. Nevertheless, the methods of UGC verification in this field are still emergent, and we may observe flux in what it means for a video to be “verified,” depending on costs, incentives, and patterns of media production and circulation. Should doctored content or deepfakes, for instance, become more widely produced and circulated in the aftermath of crises and conflicts, it is reasonable to expect that open source practitioners would respond to these developments by incorporating checks for content manipulation into their repertoire of verification techniques. At the same time, there is considerable disagreement in the broader public about what types of disinformation are most threatening to informational integrity today; debate spans the gamut from deepfake dystopias to alarm raised over hyper-partisan memes shared between friends and family. As our understandings of the prevalence and mechanisms of media manipulation continue to evolve apace with shifting media practices, the meanings and techniques of “verification” may, too, remain unstable, aimed as they are at moving targets.

Central to this chapter is the notion that UGC verification relies on cross-corroboration with many kinds of media and information scattered across the internet, which investigators assemble and stitch together to verify UGC and form a coherent and broader picture of what happened in any given incident. After coming across a particular piece of UGC, Lab students were instructed to identify the relevant who, what, when, and where. Consistent with traditional journalistic verification, relevant questions include: who uploaded this, and why? Who is depicted in the photo or video? When and where was this captured? Is this the earliest version of this video or photo online? Depending on the particular incident, piece of content, and objective of the investigation, it may not be possible or worthwhile to answer all of these questions in every case. At minimum, however, Lab students were regularly expected to 1) assess the date and time in which UGC was created, including tracking down the earliest version of content online, 2)

46 At times, the term “verification” was used as a catch-all phrase for a wide array of intermediate media processing or analysis tasks, such as watching video clips and noting the kinds of civilians, weapons, and “landmarks” visible therein.

geolocate UGC (identify the location where UGC was created), and 3) evaluate the credibility of the source.

Though it is not always possible to confirm the exact date and time at which UGC online was created, a limitation one student called “unsatisfying,” open source investigators do attempt to identify the original or earliest version of a piece of content online and then rely on signals and inferences to consider whether the content’s upload date and time makes sense in the timeline of events they are investigating. First, to identify other instances in which similar content has been posted and indexed online, investigators “reverse image search” an image or, in the case of videos, thumbnails or screenshots.47 Having identified the earliest version of a piece of content online, investigators attempt to infer the time and date of its creation. Once posted onto social media platforms, videos and photographs are stripped of metadata such as timestamps and location data; this is a boon for the privacy and safety of content creators and uploaders but an “epistemological challenge” (Daniels 2009; Schou and Farkas 2016) for investigators.48 However, platforms do typically disclose the date and time at which content is uploaded (though the same cannot be said of all blogs and websites). Using the upload date and time along with other clues, such as the posting history and apparent location of the account uploading content, investigators may be able to approximate the date and time of content creation. For instance, one Lab manager explained, “if the uploader is posting once a year, it’s hard to tell when the video was actually posted compared to when it was made. But, if the user is posting ten times a day,… then there’s probably a better chance that they received the content that day, or posted it much sooner after they got it.” Another indication that content was uploaded soon after the incident

47 Numerous techniques exist for reverse searching images, which range from dropping a screenshot taken manually into Google Images to using Chrome extensions like RevEye to search for related content across half a dozen databases including Baidu and Yandex, not simply Google’s—each indexes and caches a different database of images. The search can pull up dozens or, in the case of content that goes viral, hundreds or even thousands of search results; not all are necessary to review, just the earliest appearances of the material on the web—enabled by a simple Google filter which narrows results by date range. 48 Online or downloadable “EXIF viewer” software can, however, come in handy for verification in cases where NGOs are supplied media files directly from trusted sources or on-the-ground networks and, at times, when photos or videos are collected from blogs.

is if the investigator comes across numerous, seemingly related posts or media uploaded around the same time by different users. “If the event happened three years prior, what are the chances that multiple people would upload videos on the same day?” Conversely, failure to find corroborating media or other reports about an incident may raise flags about UGC, particularly if the incident occurred in a context generally known or expected to produce such human rights documentation, given the assumed visibility of the occurrence, the ubiquity of local digital practices, local posting patterns, and other factors described below in the findings section. One visiting open source trainer noted that one would expect to find few videos or photos of an incident which occurred in the “DRC [Democratic Republic of Congo] in the middle of the jungle, but in Aleppo—[you’d expect] probably 10-20 videos. If there’s only one, I’d wonder, why is there only one?”
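Footnote 48 above mentions “EXIF viewer” software for cases where media files arrive directly from trusted sources rather than through a platform that strips metadata. The snippet below is a minimal sketch of what such a viewer does, using the Pillow imaging library; the file name is a placeholder, and dedicated forensic or EXIF tools would typically be used in practice.

```python
from PIL import Image, ExifTags

def print_exif(path):
    """Print whatever EXIF tags survive in an image file.

    Platforms typically strip this metadata on upload, so the check is mainly
    useful for files received directly from a source or pulled from blogs.
    """
    exif = Image.open(path).getexif()
    if not exif:
        print("No EXIF metadata found (common for platform-downloaded media).")
        return
    for tag_id, value in exif.items():
        name = ExifTags.TAGS.get(tag_id, tag_id)  # map numeric tag IDs to names
        print(f"{name}: {value}")
    # Capture time and GPS coordinates, when present, sit in EXIF sub-blocks
    # (the Exif and GPSInfo IFDs); full-featured viewers display those as well.

print_exif("eyewitness_photo.jpg")  # placeholder file name
```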

In addition, as confirmation of an incident’s location comprises an important aspect of any investigation, open source investigators and Lab students invest substantial effort into attempts to geolocate, as precisely as possible, where videos or photographs of possible human rights violations were taken. Though not always within reach, the promise of obtaining a discrete set of coordinates—down to a street corner, address, or orchard row—offers a rare satisfaction compared with the shades-of-grey ambiguity surrounding much verification work. Accordingly, geolocation is often one of the most compelling parts of verification for practitioners and media commentators alike: represented in journalistic coverage as a most bemusing and curious activity and “gamified” among open source communities themselves.49 At the Lab, the online game GeoGuessr is introduced in trainings to familiarize new students with geolocation; some Twitter accounts dedicated to open source investigation methods send out regular geolocation challenges, tips, or case studies, in addition to crowdsourcing calls for real-time investigations.50

49 A recent New Yorker article profiling Bellingcat-founder Eliot Higgins (with the ambitious title “How to Conduct an Open Source Investigation According to the Founder of Bellingcat”) focuses mostly on geolocation and provides a good example of this characterization (Beauman 2018). 50 As one example, Maks Czuperski, Director of the Digital Forensic Research Lab (DFRLab), posted one such Geolocation Challenge on Twitter at the beginning of a talk he gave at the Norwegian Atlantic Committee. The photo, depicting street-facing buildings and shops, was geolocated by the end of the talk—down to the very window out of which the photo was taken. See DFRLab,


Figure 3.1 Geolocation in action

Entailing a form of “professional vision” (Goodwin 1994), the craft of geolocation consists of the trained ability to recognize what’s distinctive in a photo or video shot and which features therein could be searchable online or capable of being matched up with other kinds of “ambient” information. Geolocation may include imagining and even sketching out how the scene would appear from different angles provided by volunteered geographic information like on-the-ground photos and videos, street view maps, and satellite imagery software like Google Earth.51

2017. “#DigitalSherlocks, Geolocation, and the Power of Open Source: An Open Source Hobbyist Explains How He Was Able to Locate an Oslo Window.” Medium. January 18. Last accessed February 5, 2019. Retrieved from: https://medium.com/@DFRLab/digitalsherlocks-geolocation-and-the-power-of-open-source-de34d478ef54. 51 Sandra Ristovska (2016b) has described the recent emergence of “strategic witnessing” as a form of activism which has developed alongside the growth and professionalization of video production in human rights advocacy. If “strategic witnessing” refers to the viewing of videos


Figure 3.2 Topography sketches for navigating satellite imagery (my sketches on left, McMahon’s on the right – used with permission)

As described further below, distinctive architectural or infrastructural features of a landscape as depicted in a video or photo are great sources of rich visual information for investigators. To geolocate attacks and obtain coordinates for them, photographs and video screenshots are corroborated against other open source visual records and satellite data of the purported location. For the purposes of geolocation, “landmarks” identified in media and set against satellite imagery or other materials need not consist of touristic or significant sites; rather, they often comprise distinctive topographical features (e.g., hills, valleys), infrastructure (e.g., highways, bridges, roads, or water towers), architecture (e.g., mosques and minarets), or signs (e.g., store signs, traffic signs, or writing painted on buildings). In addition, cues about place may also be inferred from other traces of the physical or sonic landscape, including posters, logos, acronyms, insignias, prayer calls, hairdos, clothing, and accents (see Koettl and Willis 2016 for an example). To give an example of a building I geolocated with the help of Félim McMahon: in one Amnesty project investigating Turkish attacks against a Kurdish-populated

concerning human rights abuses, geolocation as a kind of witnessing is both strategic and highly specialized.

region in Syria (Amnesty International 2018a), a roof-top fixture adjacent to the damaged building, unpaved pathways, and surrounding hills comprised some of the key geolocation landmarks (see below). In addition, geolocation of this attack relied on photos uploaded to Twitter, satellite imagery accessible on the free Google Earth platform, and a high-quality video produced and uploaded online by Ruptly (of Russia Today).

Figure 3.3 Verification’s media assemblages. Clockwise, from top-left: Photograph of the attacked building posted on Twitter; video screenshot depicting front of attacked house, with roof-top fixture visible; video close-up of roof-top fixture; satellite imagery with annotations identifying topographical features seen in Ruptly video; wide shot in Ruptly video showing roof-top fixture in bottom-right corner and topographical features crucial for geolocation

A third task for investigators is evaluating the credibility of the source of information; in the case of UGC, this is the user account that uploaded or first circulated a piece of content. Félim McMahon, the Lab’s Director during my fieldwork, emphasized that “it’s as important for you to evaluate the source, as it is to evaluate the content. Geolocation is not everything. If you geolocated every video you got, but didn’t check the source, it would have disastrous implications.” Some organizations partnering with the Lab are able to rely heavily on UGC

uploaded by well-established news sources they trust; that is, sources which provide consistent and reliable coverage of local events over time, and with whom the NGOs or open source investigation groups may have direct contact. However, as described more below, the online ecology of local information varies widely. Accordingly, the same degree of local and semi-professional reporting available for, say, Aleppo does not exist in countless other contexts and conflicts investigated by Lab students, let alone in all of Syria. Even if credible local sources did exist, Lab students assigned to quick projects in countries they are unfamiliar with may lack important domain knowledge relating to the conflict and the reporting landscape needed to recognize or seek out such sources. Moreover, given that each piece of relevant information or content has potential value to add to an open source investigation, Lab students are trained not to disqualify videos, photos, or posts from consideration solely because the identity of an uploader is unknown or because of the possibility of bias; student managers and visiting open source experts often echoed that since everyone has a bias, particularly in drawn-out conflicts like Syria, apparent bias in itself is not a reason to dismiss UGC. Open source investigation is, after all, a research methodology heavily reliant on crowdsourced information, one which cannot afford to be confined to content produced and circulated by trusted sources and media agencies. In the absence of familiar and reliable sources, what signals produce credibility and the appearance of authenticity?

Though outside of the scope of this chapter, source credibility is assessed using a variety of signals and steps, such as looking at the posting history of the user account uploading content. For instance, does the user operating the account appear to live close to the incident or in country, given the frequency and consistency of their media posts, or does the user seem to be a "scraper" account uploading a sparse and random collection of UGC? In addition, does the user have a history of posting content, or does the account appear to be a bot (automated account) created simply for the purpose of resharing a piece of content? Sam Dubberley of Amnesty International's Digital Verification Corps warned students to be cautious of fresh accounts: "Be wary of the egg… the Twitter egg should always make you ask questions."52 Dubberley added, "I

52 Many malicious accounts created primarily for the purpose of harassing other users kept the default Twitter eggs as their profile photos. Eventually, the “association between the default egg

wouldn't trust a source who hasn't been around a long time, it casts doubt on the reliability and credibility of the source." Accordingly, students were advised to reverse image search profile pictures and to check whether the user operating the account has consistent profiles on other platforms and websites. At the same time, open source investigators are aware that uploaders may create "throwaway accounts" (Leavitt 2015) to share content while concealing their identity, and thus it might be difficult in some cases to distinguish between a bot and a human user.53 One student said, "if I were someone in Syria who is posting content of the Assad regime doing something horrible, I would not want to be associated with that video. And to do that, I would not put a photo, I would use a random user name, I wouldn't tweet anything else that would provide any indication about where I may live in Syria, who I am, where I work, whatever. I would just create a YouTube account solely to post this video." In this way, UGC verification vis-à-vis open source methodologies may advantage users with wide and deep digital footprints across platforms and time, and those willing to disclose their identity, location, and other "verification subsidies" which can facilitate the verification and geolocation of content – although the crowdsourced ethos of these investigations typically guards against dismissing any piece of anonymous content too quickly.

profile photo and negative behavior" became so problematic for the company that it decided to replace the egg avatar in March 2017 with a nondescript, grey silhouette of a bust. Twitter Design. 2017. "Rethinking our Default Profile Photo." March 31. Twitter Blog. Last accessed January 23, 2019. Retrieved from: https://blog.twitter.com/en_us/topics/product/2017/rethinking-our-default-profile-photo.html. 53 One student did encounter this difficulty. On one project, she explained, "three of the videos were from a YouTube channel that had been made that day only to post like 10 videos from this one event that had happened that day. They had a user name, they had a photo. But again it was like… should it be a problem that this YouTube channel was created specifically for this purpose? And then really the answer is no, because that's a logical reason to create a YouTube channel, right? I just collected all these videos from this crazy event. What do I do? I make a YouTube channel. So, maybe we're looking at it backwards. Maybe it's a good thing if someone just created a YouTube channel just to post this video cuz you know it was urgent, it was recent, and they want to get it up. It's hard… I don't know how to approach that because both stories could be totally reasonable."
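To make the account-level signals described above concrete, the following sketch scores an uploader account against the heuristics investigators reportedly weigh: account age, posting history and its regional focus, default profile photos, and cross-platform presence. It is a hypothetical illustration of the reasoning, not a tool or scoring rubric used by the Lab or the DVC, and the field names and thresholds are invented.

```python
from dataclasses import dataclass

@dataclass
class UploaderAccount:
    # Hypothetical fields an investigator might note while reviewing a source.
    days_since_creation: int
    total_posts: int
    posts_related_to_region: int
    has_default_avatar: bool
    found_on_other_platforms: bool

def credibility_flags(acct: UploaderAccount) -> list[str]:
    """Return human-readable cautions; an empty list means no obvious red flags.

    Flags prompt further checking; they never disqualify content on their own,
    since witnesses may deliberately use throwaway accounts for safety.
    """
    flags = []
    if acct.days_since_creation < 30:
        flags.append("fresh account: created within the last month")
    if acct.has_default_avatar:
        flags.append("default profile photo ('the Twitter egg')")
    if acct.total_posts < 5:
        flags.append("sparse posting history")
    elif acct.posts_related_to_region / acct.total_posts < 0.2:
        flags.append("little sustained coverage of the region (possible scraper)")
    if not acct.found_on_other_platforms:
        flags.append("no consistent presence on other platforms")
    return flags

# Example: a channel created days ago that posted only footage of one incident.
print(credibility_flags(UploaderAccount(3, 10, 10, True, False)))
```

The deliberately soft output (a list of cautions rather than a verdict) mirrors the Lab's practice of treating such signals as prompts for further checking rather than grounds for dismissal.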


Next, I describe how different types of information become enrolled in UGC verification and the ways that STEP factors (working at different timescales) impact the availability of that information. In general, verification was most successful at the Lab in cases where the setting of the incident or conflict was media-saturated and information-rich; for instance, where there exist(ed) many individuals or groups recording the incident, supplemental news coverage and investigations, ambient geographic and spatial information on the web, and so on.

INCIDENT-LEVEL INFORMATION: VOLUMES OF REPORTING

Scholars, journalists, and groups collecting and analyzing UGC have acknowledged disparities in the volume of UGC and online information generated in the course of distinct incidents and conflicts. Alex Whiting (2015) has pointed out that "[m]any atrocities occur in circumstances where technology is scarce or limited, and even when it exists, it can be under the control of the targets of the investigation or uncooperative states and therefore unavailable." Organizations and researchers employing UGC and other kinds of online information for their monitoring, reporting, and investigation efforts often acknowledge these and other methodological limitations and attempt to correct for them.54 For instance, noting how their work may be shaped by selection bias resulting from incomplete information, a circumscribed geographic scope, and the political motivations of reporting sources, the Syria Mapping Project of Physicians for Human Rights (PHR) outlines steps it is taking to "combat this selection bias" by extending its network of medical professionals and sources geographically and with respect to political affiliation (Physicians for Human Rights 2019). The broader implication is that "many war crimes and human rights abuses will continue to leave few electronic traces" (Aronson 2018a: 130).

54 Megan Price and Patrick Ball of the Human Rights Data Analysis Group have written about many types of bias – selection, recall, and disclosure – that are particularly salient when analyzing information collected online and from social media (Price and Ball 2014; Ball 2016). Of selection bias, Price and Ball raise questions about how the demographic profile of social media users likely to document and share reports of abuses—"young, technologically savvy, motivated individuals"—might be shaping what gets reported. "Researchers should always ask themselves, 'whose stories are not captured by this source?'" "Source" in this case refers not to an individual source (e.g., a content uploader) but rather a data source, such as Facebook.


Given that verifying visual UGC via open source methodologies hinges on cross-corroboration with other media and reports, the volume of information generated in an incident shapes UGC's verifiability. Further, in addition to being of practical importance in the course of matching up information and visual media about an incident during verification and geolocation, the volume of independent reports available about any given incident has also become a feature of procedural verification benchmarks for some organizations relying on UGC. Airwars, a not-for-profit tracking the civilian casualties of airstrikes in Iraq, Syria, and Libya, implements a six-category grading system in which the number of independent reports is one parameter. For instance, a "fair" judgment of a civilian casualty would be based on "two or more credible sources" while a "weak" judgment refers to "single source claims" (Airwars 2019).55 Similarly, PHR's Syria Mapping Project aims "to corroborate all incident reports with at least three independent sources," although some incidents with fewer than three sources are still included on the map if supported by "credible and reliable sources" as well as "sources…known to employ strict methodology and have direct access to information" such as United Nations Commissions of Inquiry. Such methodological standards point to the vital role of cross-corroboration in determinations of UGC's credibility (Physicians for Human Rights 2019).
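A minimal sketch of how such a count-based benchmark might be operationalized follows. The category names loosely echo the Airwars grades quoted above, but the decision rule itself is a simplified, hypothetical reconstruction, not Airwars' or PHR's actual methodology.

```python
def grade_incident(num_independent_credible_sources: int,
                   contested: bool = False) -> str:
    """Assign a rough confidence grade to a reported incident.

    A simplified, hypothetical rule inspired by source-count benchmarks:
    'fair' for two or more credible sources, 'weak' for single-source claims.
    """
    if contested:
        return "contested"
    if num_independent_credible_sources >= 2:
        return "fair"
    if num_independent_credible_sources == 1:
        return "weak"
    return "unsubstantiated"

print(grade_incident(3))                   # fair
print(grade_incident(1))                   # weak
print(grade_incident(2, contested=True))   # contested
```

The point of the sketch is only that the number of independent reports enters the judgment as an explicit threshold; in practice each organization layers further qualitative criteria (source credibility, methodology, direct access) on top of the count.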

Many STEP (social, technological, economic, and political) factors shape the volume and quality of information available to investigators about an incident. In this section, I highlight just three of these factors, which operate at different timescales and become salient at different moments in relation to an incident: local media practices and networks, local security circumstances, and supplemental reporting.56

Open source investigators and journalists have underlined the role of local media practices and networks in shaping the volume and ultimate verifiability of online information. Some examples of salient media practices were raised in the previous chapter, such as the

55 The grading system has six categories in total: confirmed, fair, weak, contested, discounted, and no civilian harm reported (Airwars 2019). 56 To elaborate on this point about timescales and temporal salience: media practices and networks must be established before an incident breaks out, local circumstances and safety shape the production and circulation of information during or immediately after an incident, and supplemental news reporting and investigations may occur after an incident.

evidentiary value of Syrian martyr pages on Facebook groups and the online self-censorship of witnesses and activists in places particularly dangerous for dissidents and opposition forces. Contexts where documentation practices and networks are lacking, and thus generate scant remotely-accessible information, have been said to demand on-the-ground research and reporting. "In certain areas such as Yemen," stated a recent conference report by the Center for Investigative Journalism, "it is very difficult to have any system of validation because there is no online community surrounding them to report what is going on. In this situation, you have to rely on word of mouth and pool from your existing contacts. Alternatively, you may have to just visit the location yourself" (Center for Investigative Journalism 2018: 37). A journalist cited in the report "found that in Iraq, unlike in Syria, there was no active local media network. This meant that events were not being reported online, making ground investigation even more crucial" (35).

Local media practices and networks appear to be shaped by a host of factors, including the political landscape and mobilization strategies of actors and opposition groups, a locale's technological infrastructure and network access, the socioeconomic backgrounds and resources of those impacted by the incident or conflict, and the degree of state censorship and surveillance of the internet. Accordingly, local contexts vary greatly with respect to the capacities and structures in place to support the creation and online diffusion of documentation of human rights violations. Conducting exclusively remote investigations at the Lab, one cannot directly observe which of these various factors might be influencing local documentation practices and media networks; nevertheless, differences in these practices and networks themselves were notable across Lab projects, which crisscrossed the globe. Projects in places with extensive local reporting were generally much more fruitful both in discovering and in verifying UGC. Nowhere was this more evident than in Lab projects based on attacks in Syria.

Syria is generally considered an extraordinary example of local documentation practices and networks of regional media agencies and aggregators. A Google employee alleged there to be more hours of footage on YouTube about the conflict, which has been dubbed the "YouTube War," than hours of the conflict itself (Rosen 2018). "The Syrian conflict is the first to take place in a country with heavy social media users," noted Chris McNaboe, manager of the Syria


Conflict Mapping Project at the Carter Center's Conflict Resolution Program. "With a relative absence of any free press, people naturally turned to social media to share information about events around them."57 Before the conflict, the Syrian government held tight control over domestic news production, even prohibiting the creation of private newspapers until 2011. In addition, foreign journalists working domestically were closely monitored, and dissenting journalists often felt pressured to emigrate and resume their coverage from abroad (Wall and el Zahed 2015). After the outbreak of the Syrian conflict in March 2011, however, the country's media ecosystem quickly underwent a dramatic shift. Since the early days of the uprising, anti-regime activists have collected and uploaded digital recordings and information regarding protests and, later, state killings, bombings, and chemical weapons attacks. A "pop-up news ecology" (Wall and el Zahed 2015) emerged of media agencies, media aggregators, and citizen journalists. Resistance groups peppered around the country, referred to as Local Coordinating Committees, both organized and documented anti-regime activities, while aggregators and distributors like the Shaams News Network collected and recirculated content uploaded online by eyewitnesses and independent citizen journalists (Sienkiewicz 2014). Although state surveillance and violence have produced acute security risks for witnesses and journalists who upload content, local media agencies acting as information intermediaries provide some cover to individuals reporting on the attacks. To get an idea of just how active and well-established some of these media agencies are, one might look at work by the Center for Spatial Research (C4SR) at Columbia University, which, as part of its research and mapping on the war in Aleppo, developed a "searchable interactive interface" based entirely on "three of the most highly cited YouTube channels, the Halab News Network, the Aleppo Media Center, and the Syrian Civil Defense," a search and rescue group also known as the White Helmets. The case study, entitled "Spatializing the YouTube War," noted that the Halab News Network alone had "host[ed] over 4,000 videos, and new videos continued to be uploaded" over the course of the project (Center for Spatial Research 2019).

57 Carter Center. 2016. “Carter Center Makes Dynamic Syria Conflict Map Available to Public.” Press Release. March 9. Last accessed June 18, 2019. Retrieved from: https://www.cartercenter.org/news/pr/syria-030916.html.


Félim McMahon explained to me once that the two “key determinants” of whether videos created during an incident are able to successfully reach investigators, journalists, and wider audiences are “the network and the environment,” referring, respectively, to the media network capable of disseminating videos and to the local and immediate circumstances during and after a conflict, described next. “In Syria, [the video is] probably taken by a media activist and put up on Facebook or YouTube immediately,” McMahon explained.

The network in Syria would consist of media activists, national media, and it’d consist of the Facebook pages associated with them, the very well-organized YouTube channels, and they’re very much networked with the outside world. They have the ears of the outside world. It’s a well-developed, connected network. That means that any video that arises in that is more likely to be kind of sucked into it… [even if] the video doesn’t come from a media activist.

Indeed, working on Syria projects at the Lab, one quickly becomes familiar with the spate of local media agencies providing coverage there. In addition to the ones mentioned already, I regularly encountered content from SMART News, Sky News, the Idlib Media Center, Step News Agency, and the Turkish-based Qasioun News Agency.58 In general, these channels provided consistent and credible coverage. "We know that Shaam uploading content is a reliable source," noted one visiting open source investigator. "Media organizations [in Syria] actually see it as a badge of honor putting out real content, they recognize so much misinformation is flying around." Reporting by such agencies is cited by open source investigative groups, human rights NGOs, official fact-finding missions, and international news organizations such as Al Jazeera, CNN, and The New York Times.

To give one example, during the fall of 2017 in coordination with the Syrian Archive, Lab students undertook a semester-long investigation of two alleged chemical weapons attacks which occurred half a year earlier on March 25 and 30, 2017 in al-Lataminah, a village in northern

58 Other media agencies are listed in Issa, Antoun. 2016. “Syria’s New Media Landscape: Independent Media Born out of War.” Middle East Institute. MEI Policy Paper 2016-9. Last accessed June 19, 2019. Retrieved from: https://www.mei.edu/sites/default/files/publications/PP9_Issa_Syrianmedia_web_0.pdf.


Syria in the Hama Governorate. This project was unique among Lab projects in that the Lab published its own report concluding the investigation (Human Rights Center 2018b). The Syrian Archive provided students with nine videos posted to social media platforms alleged to depict the attacks, which it had downloaded and preserved. In addition to these starting materials, Lab students were able to discover local news coverage and collect dozens of posts, photos, and videos uploaded onto YouTube, Twitter, and Facebook of the March 25 and March 30 attacks. Notably, the promise of finding content relevant to these attacks was an important reason, according to the Lab team's manager, why the team decided to dedicate its semester to investigating them. Despite the fact that, months earlier, 900 channels and user accounts uploading content from the Syrian conflict had been shuttered (Edwards 2017; Browne 2017; Asher-Schapiro 2017), Lab students were nevertheless able to locate videos, images, and local reports documenting the attacks' aftermath.59

On March 25, 2017, a helicopter of the Syrian Air Force dropped a chlorine-filled munition on the al-Lataminah Surgical Hospital, likely in the late afternoon. Three people were reported killed and 32 injured in the attack (Human Rights Watch 2017b: 5). Numerous local groups uploaded reports, images, and/or videos about the attack and its aftermath (Bellingcat Investigation Team 2017a; Human Rights Center 2018b). Thiqa News Agency uploaded a video onto YouTube purporting to show the destroyed roof and damaged interior of the hospital and a gas cylinder. A montage of clips posted to YouTube by the Syrian Network for Human Rights also shows the interior and exterior of the hospital. The Syrian Civil Defense posted tweets of what appears to be the same gas cylinder as well as photographs of the destroyed roof and a victim of the attack, Dr. Darwish. A Syrian journalist and activist, Hadi al Abdullah, provided additional videos. Syria Direct reported statements of a spokesperson of the hospital claiming that the falling cylinder destroyed the roof at 1:30pm on March 25, and that Dr. Darwish had refused to leave his patient, who also died in the attack – corroborated by a video released by the Syrian

59 Some of the channels have been reopened on appeal, though it is unclear how many videos, accounts, and channels have not been recovered.


American Medical Society (SAMS). Additional footage posted to YouTube alleges to show the helicopter responsible for dropping the chlorine-filled cylinder.

Five days later, at approximately 6-6:30 in the morning of March 30, 2017, two bombs were dropped onto an agricultural field south of al-Lataminah by an unidentified warplane. Estimates of those injured range from 85 to 169 (Human Rights Watch 2017b: 5; OHCHR 2017), and the Organization for the Prohibition of Chemical Weapons (OPCW) Fact-Finding Mission has confirmed that Sarin was used in the attack (OPCW 2017). Here, too, local coverage uploaded online shed important light on the attacks. SMART News Agency and the Syrian Press Center posted footage to YouTube showing Syrian Civil Defense members in hazmat suits collecting samples from the two sites of the attack. A video posted to YouTube by a local cameraman purports to document the moment of attack and appears to show two smoke clouds in the distance. In addition, statements, images, and/or videos of the physical symptoms of those injured were released by the Hama Health Directorate, the Idlib Health Directorate, and the Union of Medical Care and Relief Organizations-USA (UOSSM USA) (see Bellingcat Investigation Team 2017a; Human Rights Center 2018b; Human Rights Watch 2017b: 34-37). These local reports served as critical documentation for subsequent investigations undertaken by the Lab, Bellingcat, and Human Rights Watch, and enabled each of these groups to cross-corroborate images and statements. The use of chemical weapons in the March 25 and March 30 attacks has since been confirmed by the UN Commission of Inquiry on Syria (OHCHR 2017) and the OPCW (2017, 2018).

The volume of information about these two attacks contrasts starkly with the volume available in the Lab's investigation of genocidal violence against the Rohingya, a Muslim ethnic minority in Myanmar denied citizenship under a 1982 law and referred to as "Bengalis" by Myanmar authorities. Animosity against the Rohingya by Buddhist monks, civilians, and the Myanmar government and military has been rampant for years, resulting in flashes of mob violence and retaliation, including in June 2012 and October 2016. In late August 2017, a brutal military crackdown following an attack by insurgents on police posts led to the widespread arson of Rohingya homes and villages, mass executions, sexual violence, and the mass exodus and displacement of an estimated 700,000 Rohingya into neighboring Bangladesh. In the midst of these events in the fall of 2017,


Lab students in the Digital Verification Corps were tasked with assisting Amnesty International in verifying and geolocating over a hundred videos, photos, and posts the NGO had received from local contacts and partners. Students were first asked to attempt to find online versions of the content shared directly with Amnesty International.

Despite the pivotal role of social media, and Facebook in particular, in escalating the ethnic divisions which fueled the attacks (Mozur 2018), relatively little content documenting the displacement and violence against the Rohingya could be found online by the Lab teams enrolled for this task. On one hand, local documentation practices and media activism had played a role in the country as recently as the 2007 Saffron Revolution, when local activists recorded and uploaded UGC of massive anti-government, monk-led demonstrations. Social media users abroad, including Burmese dissidents and exile media groups, recirculated and remixed content and produced their own commentary, while large international news networks ran user-generated videos and photos captured on the ground (Brough and Li 2013; Chowdhury 2008). Remarkably, at that time, rates of phone ownership and internet use were in the single digits—and technological access in the country overall has risen sharply since then.60 On the other hand, however, most of the 2017 attacks in the conflict took place in rural, poor villages and townships in Rakhine state with persistently low rates of mobile ownership and network access (Central Statistical Organization et al. 2018). In addition, though the new government ended direct media censorship in 2012, monitors continue to report a hostile press environment, including widespread self-censorship and attacks on independent media (Reporters without Borders 2018). For instance, two Reuters

60 In 2012, an estimated 1.1 percent of the population used the internet, and phone ownership was in the single digits (Stecklow 2018; Central Statistical Organization et al. 2018: 4-5). However, a quasi-civilian government formed after the country's 2011 liberalization deregulated the telecommunications industry in 2013. Subsequently, the price of SIM cards plummeted from over $200 to mere dollars (Stecklow 2018). Today, mobile phone ownership exceeds 50 percent, with smartphones the predominant type of mobile phone purchased, and the gap between rural and urban ownership has narrowed. However, computer use is not widespread in Myanmar, limited predominantly to those with high school educations.

journalists were detained in December 2017 and given seven-year jail sentences while investigating the killing of 10 Rohingya men and boys.61

Due to a combination of these factors, Lab students were not able to discover much information online, making it challenging to verify and geolocate the content that had been shared with Amnesty. Unlike in Syria, with its dense and active network of recognized media agencies, footage and images posted to social media platforms and YouTube were sparse and oftentimes derived from scraper accounts with sporadic and seemingly random content collections. It was hard to distinguish these from pro-Rohingya NGOs operating outside of the country. Aside from the relative absence of camera phones among those impacted, for McMahon, such discovery challenges also pointed in part to characteristics of the local media network. Compared with Syria, "the Myanmar network is totally different." He explained that in the past, documentation has been "put onto digital storage [and] brought outside the country or fed into one of these channels" of human rights and activist organizations working in country or abroad. If a "video is created in Myanmar by [a random] person or victim, where does it go? What's the path?" – suggesting that, unlike in Syria, there is a relatively weak online infrastructure to support the public dissemination of videos uploaded by actors not embedded in regular media production and circulation.

I spoke with Nay San Lwin, an activist with Rohingya Today, who worked with a team of several dozen undercover reporters in every township; these reporters would send him videos on WhatsApp depicting attacks leading up to the 2017 violence, including in 2012 and 2016. However, with police outposts and searches in every city, very few dared to capture the attacks. Half of his team were forced to flee in the 2017 attacks. In addition to the Rohingya Today team, after anti-Rohingya violence erupted in 2012, a network of "mobile reporters" emerged in Rakhine: Rohingya youth who discreetly captured audio and visual documentation on their mobile phones and disseminated it online and via social media. These mobile reporters are said to have played vital roles in documenting the 2016 attacks and the early days of the August

61 They have since been released, after international outcry. Access stories on the case at https://www.reuters.com/subjects/myanmar-reporters. Last accessed February 6, 2019.


2017 violence and relaying information to international human rights groups and journalists (Hussain 2017). However, it appears that the network crumbled in the days following the August 2017 attacks. The reason for this breakdown relates to McMahon's second key determinant, besides local media practices and networks, of the information available about an incident: local circumstances during and in the immediate aftermath of an incident.

Reports and groups relying on UGC emerging from zones of conflict have pointed to local security circumstances shaping the volume of information available about attacks. In especially dangerous contexts or in cases of evacuation, those impacted may be physically unable to create or disseminate documentation, or may refrain from sharing due to the severity of the security risks that doing so would pose. PHR's Syria Mapping Project noted that among the incidents most challenging to verify are those "that occur in more dangerous or restricted areas (e.g. ISIS-controlled areas, areas under siege). For instance, it was very difficult for PHR to corroborate incidents in Eastern Ghouta, as medical facilities were trying not to publicize attacks out of fear of retaliation" (Physicians for Human Rights 2019). Similarly, an Atlantic Council report noted that "the reported attack on the Hamdan Hospital in Douma on April 7, 2018, was impossible to prove using open source investigation techniques. No footage of the attack or its aftermath could be found online or in extensive requests to NGOs and organizations working on the ground" (Atlantic Council 2018: 32). "The lack of evidence," continued the report, "can be explained by the sheer intensity of the bombardment." The Syrian Civil Defense cited 403 bombardments, including chemical attacks, airstrikes, barrel bombs, and other kinds of munitions, which effectively prevented them from documenting the attacks. "The lack of available footage does not mean that the attack did not happen, only that it is impossible to verify externally using the information available. Due to the ferocity of the violence, the quest for accountability was compromised" (Atlantic Council 2018: 32).

Conditions during and after the al-Lataminah chemical attacks on March 25 and 30 were not so dire as to prevent the creation and online diffusion of local reports. As previously noted, video footage was uploaded to YouTube purporting to show the exact helicopter responsible for dropping munitions in the March 25 attack as well as smoke plumes from the March 30 attacks. Immediately after both attacks, reporting groups were able to return to the sites of attack to

capture visual documentation of munitions remnants and the attack sites in order to confirm their locations, as well as to collect physical samples in the case of the March 30 attack, which were later sent to the OPCW and found to contain traces of Sarin (OPCW 2017). Local groups were also able to film and upload visual documentation of the physical symptoms of those injured or killed, which was used to corroborate claims of chemical weapons use. As McMahon might say, this is an example of relatively more stable circumstances, in which people could stay in one place and documentation had a chance to come into being, be shared, and later be found. "Let's say," in contrast, "that the video arises in Syria but not in a village – [rather,] as people are fleeing across the border. That's the point of creation. Well, that's much more like the Rohingya situation, in that the video's gonna travel. It won't be sucked into that big [media] network as quickly."

According to Rohingya leaders based outside of the country, the intensity of the violence and "clearance operations" against the Rohingya and the need to evacuate quickly resulted in scarce coverage of the violence and the disappearance of the mobile reporter network, most of whose members were forced to flee with other villagers (Hussain 2017). "Some of them have gone missing and we fear they might have been killed along with other Rohingya men," said one commentator.

By most accounts, the exodus out of Myanmar was treacherous. One report on pop-up cell phone repair stands in Bangladesh refugee camps states that the fleeing villagers' "phones are often damaged during the perilous trip to Bangladesh."62 Reports indicate that those fleeing the violence had mobile phones confiscated by Myanmar military officials at the border. Through interviews with witnesses in refugee camps, Amnesty International encountered more than a dozen reports of villagers being robbed at the base of one mountain pass (Amnesty International 2018b). At least one witness claimed that the military had taken mobile phones from her at one such checkpoint. While conducting research for a report before permission to enter Myanmar was

62 The New York Times. 2018. “Inside the Rohingya Crisis: Capturing Their Genocide on Cellphones.” The New York Times. Last accessed: June 22, 2019. Retrieved from: https://www.youtube.com/watch?v=OqNqICFcmto

granted, one UN investigator noted the challenge of retrieving photographs from those who had left Rakhine.

When people were leaving Rakhine state, they were being stopped, searched and deprived of their money, gold and mobile phones… It seemed pretty clear this was an attempt to get video or photographic evidence they had recorded…. There wasn't much left but we made use of it. (Hughes 2018)

Though Amnesty International and its partners were able to obtain visual documentation of abuses, much of this information did not appear to have been uploaded or otherwise made available online, potentially due to the factors described above and to self-censorship among witnesses who might refrain from posting such content.

Finally, in addition to local media practices and networks and local security circumstances, a third factor shaping the volume of content emerging about an incident, which in turn shapes the verifiability of that content, is supplemental news coverage and reports by journalists, NGOs, official fact-finding bodies, and other actors. Eyewitness media and open source investigations are often heralded as shedding important light on human rights violations neglected in the news and impacting the most marginalized. While this might be true to a degree, the availability of news coverage as a source of corroboration in open source investigations is still quite important. PHR's Syria Mapping Project states that, given its primary reliance on open sources, "it is often difficult to verify smaller-scale incidents which tend to receive less media coverage."63 Christiaan Triebert, formerly of Bellingcat and now an investigative journalist for The New York Times, has said that open source investigative groups and traditional journalism complement and support each other's work. "If you never have wire reporting, we'd have nothing to go on," while newsrooms typically lack the resources and time often required for open source investigations (Lapowsky 2019). News coverage and investigations are shaped by a confluence of factors, including geopolitical dynamics, NGOs' tactical calculations, and media agencies' determinations about the newsworthiness of conflicts

63 Physicians for Human Rights. 2019. “Syrian Mapping Project Methodology.” Last accessed June 22, 2019. Retrieved from: http://syriamap.phr.org/#/en/methodology

and crises. Accordingly, while open source verification is said to hold the most promise for bringing to light marginalized abuses on the periphery of mainstream or international media attention (e.g., Ristovska 2016a), the verification and geolocation of UGC may itself rely on secondary material produced by media attention—that is, in the absence of local documentation practices and homegrown media agencies.

Although the violence against the Rohingya and their mass displacement did receive notable media attention in the immediate aftermath, much of this coverage was gathered remotely and thus was not as helpful for verification and geolocation purposes. Government control and military presence in Rakhine greatly restricted access to the region for journalists, international monitors, human rights groups, and investigators. UN investigators were not granted access to Rohingya villages and townships until September 2018, a year after the attacks, and outside media have also been constrained in their access beyond government-organized press tours (Hussain 2017). In contrast, though the al-Lataminah chemical attacks did not receive widespread attention initially, the incidents gained further coverage in the months to come. A report by Human Rights Watch drawing on phone interviews with witnesses, victims, and locals provided statements and images used by Lab students to corroborate material and reports they found online. Soon after Lab students began their investigation, Bellingcat produced its own investigations of the March 25 and 30 attacks, which also provided information from locals. Prohibited from contacting sources on the ground or content uploaders, Lab students relied on these other reports for additional information. Reports issued by the OPCW and the UN Commission of Inquiry on Syria provided further information, drawing on witness interviews, epidemiological analysis, and environmental analysis of soil samples and metal parts (OHCHR 2017; OPCW 2017, 2018). All of these sources—social media content, multi-methods research by international NGOs and agencies, and open source analysis by other groups—provided corroborating information and assisted Lab students in verification and geolocation on this project, although, by the time the Lab released its report, the attacks and their locations had already been confirmed.

The previous section highlighted ways in which the volume of information available about an incident—itself shaped by local media practices and networks, local security circumstances, and supplemental reporting and investigation—plays a crucial role in UGC verification

using open source methodologies, which relies on cross-corroboration between many independent reports. Next, two lesser-discussed types and sources of information shaping the verifiability of UGC are outlined. Just as broader social and structural factors were seen above to influence the volume of information available about an incident, so, too, are they found to influence information derived from specific pieces of content and information about the place in which the incident occurred.

CONTENT-LEVEL INFORMATION: VERIFICATION SUBSIDIES WITHIN OR ATTACHED TO CONTENT

Practitioner manuals and scholarship on UGC underline the significance of visual clues within UGC, detailed descriptions added to content, and other kinds of "verification subsidies" as important factors facilitating the verifiability of content, and thus as practices recommended to media activists and witnesses as they create and share documentation of abuses (WITNESS 2016; McPherson 2015a). McPherson (2015a) suggests that more professional journalists and witnesses will know to include such verification subsidies whenever possible and thus make it easier for investigators and journalists to verify the UGC downstream. Indeed, according to a workshop report by the Center for Investigative Journalism (2018: 41), "there's a much wider network now and much more collaboration between those recording the material on the ground and those analyzing it, so there have been initiatives to train activists in how to collect such evidence in ways that will be more useful to analysts." Kelly Matheson's training would be precisely one example of this.

In addition, the training can happen in real time and be ad hoc. Sienkiewicz (2014) reports that one open source investigator, collecting and verifying UGC in real time amid reports of a chemical attack near Damascus in 2013, issued recommendations over social media to those recording events on the ground to film victims not only inside hospitals but entering them as well, so that the hospitals could be geolocated (it would otherwise be challenging, if not impossible, to geolocate a building from its interior alone). Relatedly, at the Lab, while those captured within documentation at the scene or in the aftermath of a human rights abuse often understandably expressed their emotional state, including shock or grief, footage captured by experienced media producers oftentimes additionally included narration that proved helpful for

geolocation, including the name of the building attacked or of the location where the footage was captured. Even then, however, despite the considerable linguistic diversity of its participants, the Lab was limited in its ability to understand and transcribe narration emerging from some contexts, such as Myanmar.

Researchers at Columbia University's C4SR have described the voluntary inclusion of specific location markers in the metadata of content uploaded online as reflective of an "archival consciousness." As described above and in the previous chapter, although platforms remove important metadata attached to content and do not require uploaders to supplement their video submissions with contextual information, platforms do typically provide the date and time of upload and the account or channel posting the information, and some sites like YouTube give uploaders the ability to add titles and descriptions to the videos they post. Accordingly, researchers working on a case study of violence in Aleppo, Syria, state that

While YouTube metadata does not include a location parameter, a number of established YouTube channels run by grassroots activists and journalists display a kind of “archival consciousness” in that they note the names of neighborhoods, streets, or locations shown in their footage in the titles of videos when they are uploaded.64

Be it "archival consciousness" or not, the inclusion of such metadata by journalists and media agencies on the ground can be seen as a tactical strategy to enhance the visibility and subsequent verifiability of their news coverage (visibility, because they anticipate that open source investigators and journalists will employ place-based keyword search queries). Accordingly, it provides a direct example of how, in the context of open source investigations, "one person's metadata are another person's data" (Mayernik and Acker 2017: 178). As Mayernik and Acker (2017: 178) point out, "[c]ertain digital traces may serve as 'metadata' in one context because they provide information about people's activity or behaviors, but they may also serve as 'data'"

64 Center for Spatial Research. "Spatializing the YouTube War." Columbia University. Last accessed June 22, 2019. Retrieved from: http://c4sr.columbia.edu/conflict-urbanism-aleppo/spatializing-youtube.html.

if they are themselves analyzed in other contexts and used as evidence to make a claim or argument." Indeed, the above is an example of how knowledge of the ways metadata becomes enrolled as data may shape content uploaders' determinations about what metadata to attach to their media and posts.
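The sketch below illustrates the kind of place-based keyword querying that uploaders' "archival consciousness" anticipates: combining place-name spellings (including transliteration variants) with incident terms and dates into candidate search strings. The variant spellings and terms are invented examples, and the code builds query strings only; it does not call any platform's search interface.

```python
from itertools import product

def build_queries(place_variants, incident_terms, dates):
    """Generate candidate keyword queries to paste into platform search boxes."""
    return [f"{place} {term} {date}"
            for place, term, date in product(place_variants, incident_terms, dates)]

# Hypothetical transliteration variants and incident vocabulary.
places = ["al-Lataminah", "Al Latamneh", "اللطامنة"]
terms = ["chemical attack", "هجوم كيماوي", "airstrike"]
dates = ["25 March 2017", "2017-03-25"]

for query in build_queries(places, terms, dates)[:5]:
    print(query)
```

The point is simply that when uploaders index a neighborhood or village name in a title or description, that metadata becomes discoverable data for anyone generating queries like these; when they do not, or when transliterations diverge widely, this discovery route narrows.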

Notably, much of the UGC found by Lab students in the Rohingya crisis did also index the village or general area where footage was captured in the video's title. Aside from problems in the transliteration of village and township names into English, village names often did not provide granular enough information to geolocate the videos. In part, this is due to a lack of annotated maps of the townships and of volunteered geographic information (or "VGI"), described further in the next section. But it also relates to the fact that the rural landscape of the townships most affected was not heavily built up with structures having distinctive shapes or place names which could be used for geolocation. This is related to another facet—beyond the control or professional knowledge of individual media producers—in which social factors, particularly economic ones, impact the verifiability of videos.

As described previously, geolocation relies on cross-corroborating distinctive features in a landscape captured in images or videos with other visual material and satellite imagery. These distinctive features often include architectural landmarks and physical infrastructure like highways, roads, and electric towers. As many veteran students had gained geolocation experience working on videos from Syria, the canonical examples of "landmarks" I often heard comprised mosques, minarets, and water towers. "When I see a mosque, I'm happy," said one student manager (Ioannou 2017). Of course, what ends up being a "landmark" may vary greatly depending on the piece of content, the landscape, and the degree of knowledge the investigator possesses with respect to the location (a local might be able to recognize and identify some feature of the landscape which appears nondescript to outsiders). And, through the course of a project, an investigator may develop quite a lot of spatial and geographic knowledge about a particular locale. Nevertheless, Lab students certainly came fresh to most of the contexts they were investigating – and it is safe to assume that many open source investigators beyond the Lab do as well.


In this context, footage or images were generally harder to geolocate in rural environments where distinctive physical buildings, infrastructure, or commercial areas were absent. Accordingly, the degree of urban and commercial development in an area may affect the verification of UGC created there in ways that could in some cases disadvantage rural areas.65 In the project on chemical attacks in al-Lataminah, "students first looked for landmarks such as mosques and minarets, town squares, large unusual buildings, signs, trees, and roads. These landmarks helped the team gain an understanding of the locations featured in the media" (Human Rights Center 2018b: 13). Local news coverage and reporting regarding the March 25 attack importantly featured interior and exterior shots of the attacked hospital, which were sufficient to geolocate the hospital by open source methodologies alone.66 For the March 30 attacks, video footage captured at the alleged site of the attack provided useful reference points: "landmarks identified in the videos include two distinct lines of trees, recognizable hills and slopes, a faint city landscape, and a curved road," enabling me and other students to approximate coordinates for the attack (Human Rights Center 2018b: 24). Another video filmed from the village showing smoke plumes featured other so-called landmarks crucial for geolocation, including a tower, hills in the background, and elevation differences. In addition to these markers useful for geolocation, the visual materials depicting the attack and its aftermath featured important clues regarding the munitions and aircraft used as well as the physical symptoms resulting from the attack.

Geolocating videos from Myanmar was, again, a different story. Although satellite imagery has played a central role in assessing and constructing quantitative estimates of the overall destruction caused to Rohingya villages and townships following the 2017 attacks,67

65 At the other extreme, however, geolocating incidents occurring in mega-cities may also be challenging and time-consuming, as there is a larger geographic area to consider. Geolocating one highly publicized incident in which a Renault station wagon manned by police reversed and plunged off the side of a highway in Cairo took me two months. 66 Specific coordinates were not provided by local reports, given the sensitivity of this information in the Syrian conflict, described more in the next chapter. 67 Early estimates suggested that almost 400 villages were partially or completely destroyed, and 37,000 buildings (close to 40% of homes) were impacted (Hughes 2018; Human Rights Watch 2017c).

geolocating UGC filmed at ground level proved immensely challenging. Meeting Burma team students for an ad-hoc geolocation training, the Lab tech director confessed that when he first saw content for the project he was "kind of wincing" due both to the graphic nature of the photos and videos and because "they're impossible to geolocate."

Certainly, the scope of the violence and the sheer number of townships posed difficulties for geolocation: identifying the location of one or two attacks in a town is night-and-day from attempting to geolocate dozens of videos capturing events spanning weeks and entire townships. Contributing to this, however, was the rural context of the killings and mass migration, which afforded few distinctive spatial markers. Infrastructure in the form of roads, bridges, buildings, or water and electrical towers was largely absent from the videos. A spreadsheet created by Amnesty International researchers prompted Lab students and DVC students at the University of Essex to list "landmarks" discernible in each video. However, typed entries under the "landmarks" column of the spreadsheet reflected ambiguity and little hope for geolocation: no clear landmarks, open forested area, river, hills in the background, fast moving stream, no identifiable landmarks, double-headed palm tree. On this project, more than any other, I noticed newcomers grow discouraged and come to premature conclusions and incorrect coordinates, which the team manager identified in her reviews of student work and attempted to tactfully correct.

Even Amnesty International's satellite imagery expert found geolocation to be tricky. On a video conferencing call with some students, he noted that, according to the spreadsheet, only one video and one photo from a collection of hundreds had been geolocated so far. It's remarkable he had success even with these. If a video contains a close-up shot of a burned down house, he could advance the video in slow motion, frame by frame, and match up the hills with satellite imagery. But that only works with specific villages. If he doesn't have the village name, he says, he doesn't even try. Accordingly, a student manager on the project encouraged students to pivot back to discovery rather than attempt to brute-force geolocation. "Most of these videos I think will be geo-locatable by corroborating them with different information," she said, "not just from the videos themselves." Months later, reflecting on the variability of geolocation challenges on this and other projects, the veteran student and team manager told me in an interview, "I always thought Syria was hard to be honest but after working on some of the stuff this semester, Syria

is… not easy but there's actually a lot more visual information for Syria than there is for Burma, and even Saudi Arabia was really hard."

I think in Syria we've relied on several distinct landmarks that make a big difference, like there are so many mosques in Syria and at the same time it's like oh my God, how am I gonna find this mosque, there are so many! But then you take for granted the fact that you see a mosque at all. And with Burma that was absolutely the hardest geolocation work I've ever worked on because there's almost absolutely nothing to work off of, you're looking at trails on the ground and seeing what angles rivers turn at and that's really kind of arbitrary.

Accordingly, though this section has shown the utility, particularly for geolocation, of verification subsidies such as panning the camera 360 degrees and including shots of distinctive infrastructural or geographic features, it has also highlighted ways in which the degree of local development may figure into verification procedures as a boon or an impediment.

PLACE-LEVEL INFORMATION: AMBIENT DATA ABOUT WHERE THE INCIDENT OCCURRED

Another crucial difference in the experience of geolocating incidents in the Syrian and Myanmar contexts at the Lab relates to the respective availability of "ambient" information depicting and describing each of those areas, including satellite imagery and street view maps, volunteered geographic information posted to crowdsourced mapping platforms, and visual materials uploaded to online platforms. "Ambient" here refers not just to the fact that this kind of information provides ground-truth spatial, visual, and textual data of varying degrees of granularity about the places and immediate environments situating incidents, but also to the fact that this information may have been created and circulated for reasons totally unrelated to the incident at the center of an investigation. It is simply information that exists online about a particular place and that becomes enrolled in the process of an investigation. To give an example, an image of a city landscape taken by a tourist and posted to social media as a souvenir of the trip may be used as a reference photo in geolocation to assess the spatial relationships between different physical features in the background.


Satellite imagery and online street maps are crucial sources of ambient place-based information during an investigation. Lab students accessed free commercial satellite imagery primarily from Google Earth but at times also on the Bing and Terraserver websites. As I was leaving the field, Planet Labs also partnered with the Lab to provide its imagery on projects. As is now well known, the availability, quality, and collection frequency of commercial satellite imagery as well as street view maps are shaped by a patchwork of geopolitical factors, economic interests, and local regulations (Segev 2010; Pabian 2015). The United States government has exercised “shutter control” over commercial satellite imagery companies in various ways, including by levying regulations on the quality and granularity of publicly-accessible satellite imagery, limiting public access to U.S.-sourced satellite imagery from areas deemed sensitive, and buying exclusive rights to imagery of specific zones.68 Google reports that its Street View platform does not have data for large areas of Russia, China, the Middle East, and Africa.69 Governments including Germany and India have placed heavy restrictions on the data collection and operation of Google Street View, citing privacy or national security concerns.70 Given that

68 For instance, the United States government negotiated an exclusive contract for images of Pakistan and Afghanistan in the three-month period following September 11. Scoles, Sarah. 2018. "How the Government Controls Sensitive Satellite Data." Wired. February 8. Last accessed January 22, 2019. Retrieved from: https://www.wired.com/story/how-the-government-controls-sensitive-satellite-data/; see also La Fleur, Jennifer. 2003. "Government, Media Focus on Commercial Satellite Images." Reporters' Committee for Freedom of the Press. Last accessed January 22, 2019. Retrieved from: https://www.rcfp.org/journals/the-news-media-and-the-law-summer-2003/government-media-focus-comm/. 69 Google. 2019. "Where We've Been & Where We're Headed." Google Maps Street View. Last accessed January 22, 2019. Retrieved from: https://www.google.com/streetview/understand/. 70 E.g., for information on Germany's relationship to Google Street View: Cain Miller, Claire, and Kevin O'Brien. 2013. "Germany's Complicated Relationship with Google Street View." New York Times. April 23. Last accessed January 22, 2019. Retrieved from: https://bits.blogs.nytimes.com/2013/04/23/germanys-complicated-relationship-with-google-street-view/. India has rejected numerous Google proposals to roll out Street View beyond major landmarks and tourist attractions. See Murgia, Madhumita. 2016. "Google Street View Banned in India Due to Security Concerns." The Telegraph. June 10. Last accessed January 22, 2019. Retrieved from: https://www.telegraph.co.uk/technology/2016/06/10/google-street-view-banned-in-india-due-to-security-concerns/; 2018. "'Google Street View' Proposal Rejected by Government." The Times of India. March 27. Last accessed January 22, 2019. Retrieved from: https://timesofindia.indiatimes.com/business/india-business/google-street-view-proposal-rejected-by-government/articleshow/63482698.cms.

purchasing satellite imagery may be prohibitively expensive for resource-strapped NGOs, Rothe and Shim (2018: 427) point to the reliance of these groups "on US businesses which follow their own commercial logic."

Low-budget requests for satellite images – such as by the AAAS [American Association for the Advancement of Science] or AI [Amnesty International] – are given the lowest priority, and hence NGOs’ chances of acquiring images of an exact location and date for human rights inspections is very limited. Thus the question of which human rights abuses can be monitored with the help of remote sensing images largely depends on the profit-driven interests of businesses such as DigitalGlobe and Google. (Rothe and Shim 2018: 427)

Although most Lab students lacked a formal understanding of the uneven distribution of spatial mapping technologies, many noticed and remarked upon differences in the quality and availability of satellite imagery on Google Earth and of data on Google Street View in particular locations. Google Street View was not available in many of the locations where students were conducting investigations, including Syria and Myanmar. Satellite imagery from particular locations, such as Iraq and Saudi Arabia, was noted as being especially blurry, but students were unable to tell whether the low quality was deliberate or not. Ultimately, for investigations in which satellite imagery is crucial for geolocation and verification purposes, collaborating partners may be required to purchase higher resolution satellite imagery collected before and after the alleged date of the incident. This was often the case for Amnesty International projects and reports. Even when imagery ultimately had to be purchased, however, differences in the quality of free satellite imagery could interfere with students' attempts to geolocate incidents. Such obstacles could be consequential, as Sam Dubberley indicated numerous times that Amnesty International decides to purchase small areas of commercial satellite imagery only if it has sufficient confidence in a location, whether from its own preliminary geolocation attempts or those of students. Smaller organizations conducting open source investigations may well be unable to afford purchasing satellite imagery, which can cost upwards of a thousand dollars for a small area.


Aside from satellite imagery and Google Street View, Lab students also leveraged local weather data and shadow calculations to corroborate estimates of the date and time at which content was captured. The logic is simple enough: as one student explained, “if it’s 10mph and yet in the video there were trees and they didn’t move at all, that’s suspicious. Similarly, if [the internet] says it’s freezing and in the video people are wearing tank-tops and shorts, it’s a problem.” In rare cases, shadow analysis using SunCalc71 or similar tools can also be used to approximate when and where content was captured. Though weather data was never the magic debunking bullet on any of the projects I worked on or observed at the Lab, students did routinely employ the website Wolfram Alpha72 to obtain local weather data for the alleged location of their content and annotate key weather features on their content verification forms. There, too, Lab students noticed that the precision and frequency of weather data collection vary by context, such as country (e.g., the United States versus Syria) and region (e.g., urban versus rural).
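The arithmetic behind shadow analysis is simple trigonometry: for a candidate location, date, and time, the sun’s altitude above the horizon determines how long a shadow a vertical object should cast relative to its height. The following sketch is purely illustrative—it assumes the third-party pysolar Python library, and its coordinates, times, and measurements are hypothetical rather than values drawn from any Lab project—but it conveys the kind of consistency check that SunCalc performs interactively.

# Illustrative sketch of shadow-based corroboration; assumes the third-party
# pysolar library. Coordinates, times, and measurements are hypothetical.
from datetime import datetime, timezone
from math import radians, tan

from pysolar.solar import get_altitude  # solar elevation angle in degrees

def expected_shadow_ratio(lat, lon, when):
    """Expected shadow-length-to-object-height ratio at a place and time."""
    altitude = get_altitude(lat, lon, when)
    if altitude <= 0:
        return None  # sun below the horizon, so no usable shadow
    return 1.0 / tan(radians(altitude))

# Hypothetical claim: footage filmed near Hama, Syria, at 07:00 local (UTC+3).
claimed_time = datetime(2017, 3, 25, 4, 0, tzinfo=timezone.utc)
expected = expected_shadow_ratio(35.13, 36.75, claimed_time)

measured = 2.1  # ratio measured from the footage (shadow length / object height)
if expected is None or abs(expected - measured) / expected > 0.25:
    print("Shadow length is hard to reconcile with the claimed capture time.")
else:
    print("Shadow length is roughly consistent with the claimed capture time.")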

Other critical sources of ambient information about the place in which incidents occur are user-uploaded visual media on photo-sharing platforms, collaborative annotated street maps like Wikimapia and OpenStreetMap, and other kinds of volunteered geographic information (Sui, Elwood, and Goodchild 2013). In the past, a significant advantage of using Google Earth compared with other commercial satellite imagery providers was the ability to access ground-level photos taken and uploaded by users to the Panoramio platform (Pabian 2015). Although it is still possible to view some user-uploaded photos on Google Earth, Google discontinued Panoramio in late 2016 to prioritize photo-sharing on Google Maps and its Local Guides program.73 In an article chastising a modification by Facebook to its Graph Search, the DVC’s Sam Dubberley (2018) wrote in Newsweek:

Google Earth dealt a blow to the human rights community when it removed an amazing resource called Panoramio. Integrated into Google Earth Pro (one of two

71 http://suncalc.net
72 https://www.wolframalpha.com/
73 Lardinois, Frederic. 2016. “Google Shuts Down Panoramio.” TechCrunch. October 7. Last accessed January 22, 2019. Retrieved from: https://techcrunch.com/2016/10/07/goodbye-panoramio/; Google. 2019. “Panoramio FAQ.” Last accessed January 22, 2019. Retrieved from: http://www.panoramio.com/maps-faq


tools every human rights investigator online should have), it allowed human rights researchers to go back and look at holiday pictures online from people who had visited, say, Aleppo before 2010, or parts of Nigeria and Cameroon now engulfed in conflict. This assisted us in the time-consuming work of establishing where a video of an airstrike had been filmed, where a torture scene took place, or where a trafficking victim was last seen.

Some Panoramio photos were still accessible on Google Earth during the year I conducted fieldwork, and these were indeed decisive for geolocation on several projects. In addition to accessing user-uploaded photos on Google Earth, Lab students relied on collaborative mapping platforms like Wikimapia,74 where users affix annotations and tags to local features. Users’ motivations for providing such volunteered geographic information vary, ranging from posing in tourist snapshots, as Sam Dubberley noted, to political motives and strategies of representation and resource demands (e.g., Burns 2013; Elwood 2008). Lab students relied heavily on Wikimapia to identify the location of schools, water towers, religious buildings, and other local sites, particularly in projects focused on Syria and elsewhere in the Middle East. “So much of this information is [otherwise] inaccessible,” Félim McMahon once commented. “On Wikimapia, you’ll find ten times the towns that you’ll find on Google Earth and if you search in Arabic, you’ll find twenty times the towns and sites.”

Students investigating the March 25 and March 30 attacks in al-Lataminah did employ Wikimapia along with Google Earth and Google Maps (Human Rights Center 2018b). For projects in Myanmar, Google Street View was unavailable; entries on Wikimapia were minimal for the areas being researched; and Google Earth images were blurry, outdated, and out of step with the monsoon season ongoing at the time of the attacks. Moreover, Google Earth neither recognized nor retrieved results for the Rakhine villages of interest – another moment where English transliteration of local place names in Burmese created problems. Fortunately for Lab students, Amnesty International provided coordinates for the eight towns of greatest interest for the project, enabling students to at least locate them on the map. Ultimately, the geolocation

74 http://wikimapia.org/

strategy for the project shifted: instead of attempting geolocation or listing possible landmarks for geolocation, students were asked to group together content which appeared to be depicting the same area or village. These groups of like content, students were told, would then be shown to witnesses by Amnesty International and its partners in Bangladeshi refugee camps who might be able to identify the villages.

In sum, on-the-ground reporting and investigations as well as witness interviews conducted remotely, in person, or in neighboring refugee camps proved important for both the al-Lataminah and Rakhine investigations—though not to the same degree. Students were able to get farther in their verification and geolocation of the al-Lataminah attacks given the volume of information about the incident, the granular and relevant information captured on camera and in reports, and the availability of sufficient ambient information about al-Lataminah and its surrounding environment. For the Myanmar project, the use of open source methodologies was not only tenuous but inadequate at each stage in the project, from the collection of eyewitness media to their verification and geolocation; Amnesty International’s contacts, partners, and efforts on the ground were crucial.

EVIDENCE OF THINGS NOT UPLOADED

Does every attack or conflict produce data? And, “what happens when dire situations have no images” (Gregory 2014)? This chapter sought to illustrate how myriad factors and disparities shape the volume and verifiability of UGC emerging from conflicts and documenting human rights violations. Given the methodological challenges of studying negative cases – that is, the UGC that is never created, or shared, or discoverable online – I paired ethnographic fieldnotes and interviews from the perspective of investigators with supplemental context about Syria and Myanmar to highlight key factors which may shape what investigators are eventually able to access downstream. While I did not focus on state-imposed internet “blackouts” and restrictions in this chapter, this is obviously another crucial factor shaping the online availability of UGC emerging from a particular conflict or context.

Generally speaking, successful UGC verification relying exclusively on open source methodologies was found to depend on saturated or even ubiquitous media environments—contexts

with “Cameras Everywhere” (WITNESS 2011; Gregory 2010) or “Sensors Everywhere” (Koettl 2017), to borrow from advocates and investigators of human rights-related UGC. Disparities in verification outcomes were shaped by differences with respect to local digital practices (e.g., UGC documentation and diffusion, reporting, volunteered geographic information and map annotation), technological infrastructure and services (e.g., network access, language and translation capabilities), economic factors (e.g., urban development, satellite imagery availability), and geopolitical interests (e.g., international coverage and investigations). (Of course, the parenthetical items here do not fit neatly into each of these buckets; for instance, local digital practices are at once a product of social, technological, economic, and political factors.) Though it is hard to generalize exactly what makes UGC verifiable, as this differs so much from case to case, I have attempted at least to show how, for instance, contexts with little built infrastructure, marginalized local languages and dialects, and a relative dearth of robust documentation practices and networks might be at a significant disadvantage in the course of verification undertaken by NGOs strapped for time and resources. Conversely, however, this chapter also illustrated that even in contexts with the most “ideal” conditions to support UGC verification, investigations still depend crucially on relationships of trust with impacted communities as well as the resources and precautions necessary to collect “ground truth” about conflicts on the ground. This assemblage of heterogeneous inputs is obscured in much of the sexy, sleuth-like media representation of open source investigations, including news coverage of the Lab, like one short video entitled Activism 2.0: Can Social Media be used to Solve War Crimes? (U.C. Berkeley Public Affairs 2018). Accordingly, this chapter (and the next) suggests that engagement with communities most impacted on the ground should remain a priority, both for methodological and ethical reasons.

On the one hand, this argument is hardly controversial, particularly in the context of human rights discourse and methodologies which are centered on witnesses and witness testimony. Indeed, pointing to the methodological limitations of open source investigations, scholars and practitioners in the field have stressed that UGC does and must complement other kinds of research methods, particularly witness interviews. An open source practitioner at Amnesty International asserted that the organization’s adoption of emerging data

sources is driven by a desire “to think about how they could complement—not replace—traditional or typical human rights research methods” (Koettl 2016b). Similarly, Sam Dubberley (2018) has emphasized that,

While Amnesty International recognizes how useful open source intelligence can be to corroborate and verify events, it rarely forms the backbone of our research and analysis. In Syria, we continue to work as hard as we can to get first-hand interviews from victims, eyewitnesses and experts on the ground. We can’t always access all parts of the country officially or safely, but our research teams are in constant contact with their networks across the region. Open source video and images are now part of that process, but they’re not the only part.

“Despite WITNESS being an organization focused on the effective use of the moving image for human rights,” WITNESS Program Manager Sam Gregory has said that “the imbalance of images from some contexts (and types of human rights situations)” has comprised “a consistent concern” for the organization. He and others have pointed not just to varying volumes of video emerging from particular geographic contexts, but also to a dearth of visual UGC depicting systematic abuses and inequalities less conducive to being visually documented and uploaded online, such as sexual- and gender-based violence or pervasive forms of discrimination. As one practitioner told me, “things like airstrikes or artillery strikes are much more dramatic and amenable to being filmed.”

In addition, legal scholars have similarly emphasized that open source methodologies cannot replace the continued “need for eyewitnesses and victims to provide testimony and for investigators to visit the scenes and crimes and conduct thorough investigations” (Aronson 2018a: 130-131; Whiting 2015). During an interview, HRC Executive Director Alexa Koenig emphasized the importance of keeping in mind that “open source content is going to be one tiny piece of a much bigger pie” in human rights cases, which rely on three buckets of evidence comprising physical, documentary, and testimonial information. Describing the HRC’s efforts to develop an Open Source Investigations Protocol, Koenig noted that


If we can do nothing else but improve the quality of evidence coming into human rights cases, I think we will have done a tremendous service to the space, reduce the overreliance on witness testimony and strengthen those two other buckets of physical and documentary evidence. Now, within documentary which is where I would argue digital content falls, digital content is still only a relatively small percentage of that. And so we should make sure for the quality of our investigations that we don’t start putting too many resources into the open source bucket at the detriment of those other buckets.

Good then, we can all agree that traditional methodologies continue to be vital, particularly in the legal context where testimonial and physical evidence are critical. And yet, while this scenario may remain the aspirational best practice, there is nevertheless the risk that the practicalities and constraints of human rights investigations might tip in favor of open source investigations at the expense of on-the-ground data collection. After all, on-the-ground research and reporting entail considerable costs, security risks, and resource demands. Accordingly, similar to McPherson’s (2015a) worry that the uneven provision of verification subsidies may drive the selection and verification of eyewitness content, there is a risk that the availability of open source information about a conflict may influence the allocation of resources and selection of projects by organizations, donors, and courts or fact-finding bodies.

For instance, NGOs’ resources could be allocated in ways that leverage and privilege open source investigations at the expense of conducting field research in-country or in refugee camps. In contrast to criminal investigations, in which prosecutors cannot select between conflicts or incidents to investigate, those deciding to carry out a human rights investigation or fact-finding mission without a legal mandate to do so, such as intergovernmental bodies, NGOs, journalists, and private citizens, carefully weigh the estimated costs and resources of an investigation against an array of anticipated benefits and expected impacts.

Second, vulnerable to seduction by tech-optimist approaches, financial donors and funding organizations for human rights institutions may decide to fund groups and projects leveraging open source methodologies over those centered on traditional research streams. As

we can only expect excitement over open source investigations to grow, there is the potential that the allure of new technologies and techniques will draw donor attention and funding away from essential yet more resource-intensive traditional fact-finding methodologies. After all, on-the-ground coverage is waning in journalism, as the availability of online information makes it harder for reporters to justify the costs and time required for in-depth, on-the-ground reporting. Many journalists covering conflicts have pointed to difficulties securing funding for stories that entail on-the-ground reporting (Center for Investigative Journalism 2018). A report by Airwars, a group monitoring civilian casualties resulting from airstrikes in four conflicts, revealed the astonishing fact that none of the mainstream media reporting from the first two years of the war in Iraq against the Islamic State was conducted on the ground (Center for Investigative Journalism 2018: 33). More problematically, a key determinant for the Pentagon of whether civilian casualties have taken place is whether a mainstream media correspondent has carried out on-the-ground reporting. After discussing the vital role of remote investigations by Bellingcat, Forensic Architecture, and Human Rights Watch in revealing civilian casualties inflicted during a U.S. airstrike on a mosque in Syria in 2017, Airwars founder and director Chris Woods nevertheless asks, “in this world where our governments fight remote wars, given the general absence of reporters on the ground but with so much information online how do we ensure that the effects of those wars are properly understood? And what are the risks of these new methods?” (Center for Investigative Journalism 2018: 33).75

A third risk raised by this chapter is that the anticipated availability of open source material could become a determining factor in how investigative sites or official bodies select cases to investigate or otherwise conduct their investigations. Exclusively focused on open source materials and tied to other constraints (e.g., semester-long projects), Lab teams and Lab partners did in many cases select their projects based on the open source content they knew or anticipated would be available. Some students acknowledged the risk of “conducting investigations for investigations’ sake,” echoing in a way scholarship interrogating the increasing

75 Video clips are accessible at Centre for Investigative Journalism. “On the Ground and in the Ether.” 2018. Last accessed June 23, 2019. Retrieved from: https://tcij.org/logan-symposium-2018/investigative-practice/on-the-ground-and-in-the-ether/.

centrality of fact-finding to human rights work (Mégret 2016). Above I noted the numerous groups producing investigations on the attacks of March 25 and 30, and how the anticipated volume of UGC was a factor in choosing those attacks for research. More broadly, the number of groups dedicated to collecting, analyzing, and preserving UGC documenting the “YouTube War” is substantial, especially compared with other ongoing conflicts, such as Yemen. Although open source investigations into Yemen are finally underway—including a hack-a-thon in London in early 2018 and the creation of the Yemen Data Project—during my fieldwork in 2017-2018 numerous open source investigators I spoke with pointed to the glaring lack of attention in the open source field to Yemen, “one of the most under-reported conflicts in modern history.”76 Suspected reasons for this included the paucity of international coverage on the conflict at the time as well as the questionable amount of UGC compared with Syria due to social, technological, economic, and political factors. Ironically, then, such patterns of resource allocation could amplify patterns of attention and visibility and diminish the anticipated potential of open source investigations to shed light on the most marginalized and invisible conflicts (Ristovska 2016a). This is in no way to repudiate or minimize the immense significance of Syria-centered efforts, but simply to point out how the availability of open source data might be shaping the distribution of attention and resources of human rights groups, practices, and interventions.

Some have expressed concern that enthusiasm for open source content as a source of potential evidence might impact individual criminal cases and investigations or, more broadly, the creation of new international courts and tribunals and their protocols and workflows. The ICC and International Impartial and Independent Mechanism (IIIM) responsible for investigating war crimes in the Syrian Arab Republic “are now taking open source investigation extremely seriously in their work” (Center for Investigative Journalism 2018: 41). The IIIM for instance is “cognizant that multiple entities, including NGOs and Syrian civil society organizations, have gathered

76 Yemen Project. 2019. “The Yemen Project: Announcement.” Bellingcat. April 22. Last accessed June 23, 2019. Retrieved from: https://www.bellingcat.com/news/mena/2019/04/22/the-yemen-project-announcement/. Sponsored by the Global Legal Action Network (GLAN) and Bellingcat, the hack-a-thon included members of Bellingcat, the Syrian Archive, New York Times investigative journalists, Lab students, and others. The Yemen Data Project is maintained in part by Nick Waters of Bellingcat, with preservation by the Syrian Archive.

extensive documentation… As part of its mandate, the [IIIM] reviews the information and evidence—both inculpatory and exculpatory—already collected by others and identifies possible gaps” (International Impartial and Independent Mechanism 2019).

Rebecca Hamilton (2019) has asked whether “courts [will] begin to neglect—either deliberately because it’s easier, or simply because of a prohibitive workload—the types of crimes, perpetrators, and victims that are less likely to be captured by user-generated evidence?” Hamilton notes that although prosecutors recognize such hazards and reiterate the secondary and merely supplemental role of UGC in investigations, “there are undeniable incentives to prioritize the prosecution of crimes with readily available user-generated evidence.” Koenig noted in an interview with me that despite the “zealousness” over open source content seen in the last few years, she has begun to see the development of more balanced assessments of the role of open source content within the arc of human rights investigations. “That said,” however, she noted the emergence of “groups like the IIIM that are increasing their own design of their new institution with a focus on digital technologies… With the IIIM I don’t know enough yet about how they structured themselves to know if how they’re approaching this material is going to skew which cases are being brought, what we’re seeing, in ways that might be problematic.”


Chapter 4: Of Content Stewards and Safeguards

“Does it scare you that things are being uploaded on social media that could be false?” The reporter waited for a response, as the video camera remained trained on the Lab student. “Absolutely,” the latter replied, adding that at the Lab everyone knows “you need to take everything with a grain of salt.” Ten minutes earlier, I had no idea that the local ABC-7 would be visiting the Lab for a story, but apparently such visits were semi-regular, as the other students took the crew’s arrival nonchalantly. The Lab student continued, “on the other hand, people uploading videos are putting their lives at risk – we almost have a responsibility to see what they’re posting.” Plus, “they see the truth more than anyone else.”

The portrayal of open source investigations as a means to extend witnesses’ voices is widespread in media coverage, at the Lab, and among the Lab’s visiting practitioners and NGO partners. A video by The Economist refers to crowdsourced investigations like those conducted by the Lab, Bellingcat, and U.K.-based Forensic Architecture as “giving voice to the victims and holding the powerful to account.” Of the Digital Verification Corps, Sam Dubberley has said “[w]e have to do this to help people tell their stories” (Fortune 2018). The Syrian Archive (2019a) maintains that “visual documentation allows [it] to tell untold stories through amplifying the voices of witnesses, victims and others who risked their lives to capture and document human rights violations in Syria.” Amplification is also a central element in Bellingcat’s “‘IVA’ approach (investigate, verify, amplify)” (Bellingcat Yemen Project 2019). Scholarship has similarly lauded investigations drawing on eyewitness media as potentially more participatory forms of fact-finding than traditional methods, capable of amplifying the stories of those most impacted by injustice (Land 2016; Ristovska 2016a).

This positive rhetoric of witness empowerment is potentially troubled, however, by a risk confronting open source investigators: that their collection, usage, and preservation of eyewitness media could endanger, or counter the wishes of, witnesses. Although security risks are certainly not new to human rights work, the sociotechnical configurations of actors, platforms, and data entailed in open source investigations introduce new security vulnerabilities while producing ambiguity with respect to the consent of those linked to or depicted in UGC.


Concerns related to consent and security risks have gained significant attention in academic scholarship and practitioner manuals (Aronson 2017; WITNESS 2011, 2017; Gregory 2012a, 2012b; Deutch and Halab 2018; Hamilton 2019; The Engine Room, Amnesty International, and Benetech 2016; Land et al. 2012). Existing literature has cautioned practitioners on these issues while imparting strategies and tips for assessing consent and the security risks stemming from using UGC gleaned online. Some of this work has also justified particular methodological choices in case studies based on practitioners’ own deliberations. Consequently, much of what we know about how open source investigators in the human rights field actually manage these concerns comes either in the form of best-practice prescriptions or self-reports.

This chapter contributes an ethnographic account of how Lab students and staff approached questions of consent and security risks in relation to their collection, usage, and preservation of eyewitness media sourced online. The first findings section describes the humanizing narratives through which Lab students and staff equated human rights-related eyewitness media with witnesses’ “voices,” echoing a broader pattern in which human rights advocates and groups are deputizing themselves as content stewards and safeguards, especially given the risk of commercial platforms’ content removals. The subsequent two findings sections examine how Lab students and staff reconcile their allegiances to witnesses amidst gaps of consent and communication with content uploaders. I illustrate how the lack of consent from content uploaders is normalized through discourses which have the effect of imputing consent onto online content or implicitly regarding consent as impractical or irrelevant. In addition, I argue that while potential security risks to uploaders appeared to be given more weight and attention than consent issues, security risks were negotiated in ways that outsourced the management of these risks to partnering NGOs, exacerbated consent and communication gaps, and impoverished practitioners’ understanding of the security risks of publishing media and investigation findings. If adopted more widely and systematically by sites responsible for collecting, publishing, and archiving human rights-related UGC, these discourses and practices would risk systematically barring content creators and uploaders from meaningful and sustained inclusion in accountability mechanisms and decision-making regarding the use of their posts and content.


This analysis highlights how open source investigations enable certain kinds of participation from impacted communities in advocacy, accountability, and knowledge production (e.g., production and procurement of documentation) while simultaneously producing opportunities and incentives to omit communities on the ground from other kinds of participation (e.g., stewardship of their own content). Findings thus complicate narratives of open source investigations as straightforward and participatory extensions of human rights witnessing, and lend empirical and theoretical support to concerns that large, well-resourced human rights organizations have greater ability than ever before to appropriate information and labor from smaller organizations and individuals on the ground toward their own political aims. Although the details of the Lab’s operations make it an atypical site from which to explore these issues, many of the dynamics, tendencies, and discourses described here exist more widely in open source ventures and initiatives in the human rights field.

SECURITY RISKS IN HUMAN RIGHTS INVESTIGATIONS AND ADVOCACY

Witnesses and witness testimony are positioned at the heart of contemporary human rights investigations and advocacy.77 Whereas its predecessor rested on the research of experts, diplomats, and lawyers (Alston 2013), the so-called “second generation” of human rights fact-finding pioneered by international NGOs in the 1970s and 1980s prioritized witness interviewing, both as “an ethical promise” to victims of abuse as well as a “methodological given” (Satterthwaite 2013:63). Witness testimony is a central component in human rights investigations undertaken by many different actors including NGOs, United Nations fact-finding missions, monitoring bodies, commissions of inquiry, and international criminal courts and tribunals (Boutruche 2016: 134). Alston and Knuckey (2016: 12) write that

Witness testimony is the primary source of evidence in most human rights reports, and this will likely continue to be the case for strong evidentiary, ethical, and advocacy reasons. Witnesses and victims often possess unique and critical information, physical or documentary evidence may be non-existent or difficult to

77 I include victims/survivors in the category of “witnesses,” as they are also witnesses but witnesses are not necessarily victims/survivors.


access, the focus on victim testimony centers the perspectives of and can empower the most directly impacted rights-holders, and the narrative form can be especially compelling in advocacy.

Involving witnesses in human rights advocacy and investigations also functions as a vital ethical gesture on the part of advocates, who frame their activities as “bearing witness” and seeking justice for and on behalf of witnesses. At the same time, however, in practice, human rights investigations and advocacy contain inherent dynamics which place investigators and advocates in compromising positions with respect to the witnesses they purport to support.

Advocates have long grappled with how to help survivors in ways that minimize harm to impacted communities and, ideally, respect their wishes. Guiding much human rights advocacy and humanitarian work is the “Do No Harm” principle, which entails evaluating the anticipated harms and benefits from an action or intervention, and taking decisions which ultimately minimize possible harms to those involved (e.g., OHCHR 2011a: 4).78 One Lab student referred to this dynamic as a “dilemma between us trying to do this work, helping advocacy, ensuring accountability…, but on the other hand trying to protect the people’s privacy and security, trying to make sure we’re not hurting the people that we’re trying to help.” Possible harms can extend across numerous dimensions, such as psychological well-being, privacy, physical safety, and digital security. Fact-finding and criminal investigations may endanger witnesses and other individuals cooperating with investigators, making it crucial for fact-finding teams to devote resources to the protection of witnesses and their families and to the preparation of contingency strategies in the case of acute safety threats, such as evacuation plans (Hamilton 2019). Interviewing witnesses about their experiences carries the potential of re-traumatizing witnesses or causing them further emotional suffering if interviewers’ line of questioning for the purposes of verification appears to cast doubt on witnesses’ stories, or causes them to feel they are being treated instrumentally as mere sources of information (Boutruche 2016: 140-1; OHCHR 2011b).

78 The principle is cited in most manuals on human rights documentation, including the Minnesota Protocol on the Investigation of Potentially Unlawful Death (Minnesota Protocol), the Manual on Human Rights Monitoring (OHCHR Manual), and the Documentation and Investigation of Sexual Violence in Conflict (IP2) (Abbott 2019).


Human rights investigators are advised to adhere to security and methodological best practices when scheduling and conducting interviews with witnesses, including meeting at secure locations, protecting identifying information, and following consent protocols before and during witness interviews (Boutruche 2016: 147). Even still, there is a risk witnesses may be disappointed if their expectations (e.g., for retribution, support in meeting basic needs) are misaligned with an investigation’s aims and scope. Witnesses might be regarded and treated instrumentally, as mere sources of information, particularly in the context of litigation. In addition, the disclosure of information in advocacy reports or fact-finding investigations released publicly has the potential to compromise witnesses’ safety and anonymity, if not carefully reviewed, subjected to necessary modifications, or withheld altogether by human rights organizations and fact-finding bodies.

More broadly, the aims of traditional human rights fact-finding, advocacy, and litigation can misalign with the wishes of survivors and witnesses. Some have questioned the degree to which human rights investigations constitute “extraction or empowerment” for witnesses (Alston and Knuckey 2016: 129; Bukovská 2008). Though interviewing witnesses constitutes some form of participation, at least in the literal sense, whether and how they are meaningfully included in advocacy projects and accountability mechanisms are wholly separate considerations. “Many global human rights NGOs imagine their work to involve bringing facts from the ‘bottom’ to the ‘top,’ giving ‘voice’ to the ‘voiceless,’” remarks Dustin Sharp (2016: 76). While there may be some truth to this, Sharp points out, it is also the case that political change brought about by NGO human rights advocacy relies largely on the activities and networks of political elites, and ultimately reinforces the goals and authority of elite institutions and individuals conducting human rights fact-finding. This power dynamic is reflected in the historical prioritization of political rights over economic and social ones, including poverty, in human rights advocacy and fact-finding.

Emerging ICTs and the proliferation of UGC sought in open source investigations exacerbate these risks and vulnerabilities while introducing new ones.79 First, the ease and speed

79 Many of the key concerns outlined here have been enumerated previously elsewhere, including in Land et al.’s #ICT4HR: Information and Communication Technologies for Human Rights (2012: 23-24; see also WITNESS 2011; The Engine Room, Amnesty International, and

with which digital information is tagged, shared, and copied makes it challenging to control the spread and uses of data once it is released publicly. The scope and speed of information diffusion is a boon to human rights investigators, governments, and third-party adversaries alike. Consequently, “identifying information (whether contained in the data itself or hidden in its code) can be disseminated quickly and widely before steps can be taken to prevent the loss of anonymity” (Land et al. 2012: 23).

Second, digital information and visual media in particular may contain sensitive information capable of revealing the location, identity, or other details about individuals – a fact often not realized by its creators and handlers. “All content and communications, including visual media,” states WITNESS (2011: 19), “leave personally identifiable digital traces that third parties can harvest, link and exploit, whether for commercial use or to target and repress citizens.” In particular, the rich nature of visual content, which makes it useful for open source investigators, also accords it value for adversaries with malicious intentions to act on or spread that information. The same open source investigative techniques used to discover and verify human rights-related material or newsworthy events could be taken up to harass or to commit acts of “malinformation,” the dissemination of accurate (and often personal and identifiable) information with the intent to harm (Wardle and Derakhshan 2016). The circulation of content can be immensely traumatic for witnesses or those depicted in it (Gregory 2012b; see Banchik 2018). In some cases, particularly sensitive visual material may be circulated intentionally.80 Especially in conflict settings, the recording of visual footage itself by perpetrators may signify or amplify the violence of an act.

Benetech 2016). In addition, this discussion focuses on hazards related to open data, leaving aside vital considerations related to the physical and digital security of those capturing documentation on the ground and their social networks (Hamilton 2019), and of digital information held in storage and/or intended for long-term preservation (Aronson 2017; Piracés 2018a).
80 Examples of the “weaponization” of visual media abound, spanning from nonconsensual pornography (revenge porn) and deep fakes to the dissemination of booking photographs by third-party websites like Mugshots.com which require fees for the removal of photos.


Even when information appears devoid of sensitive data, it can often be assembled with other materials in private possession or scattered across the internet to identify the location and identity of individuals connected to it. Those capturing, uploading, or sharing visual documentation might not realize the degree to which visual media contain sensitive or potentially-sensitive information. When I recently presented a case study of geolocation to undergraduates, one student told me after the class how incredibly frightening it was that data extracted from social media posts, news coverage, and satellite imagery could be assembled to determine the precise location of a house or a cameraperson. Similarly, content producers and uploaders may neither realize nor prepare for the eventuality that their identity and location may be uncovered through open source investigative practices like video verification and geolocation.

If re-sharing media publicly is anticipated to cause security risks to those who produced, uploaded, or are depicted in content, advocates are encouraged to consider refraining from publishing the material, withholding sensitive information and metadata disclosing individuals’ identity or location, or modifying the content before release (e.g., blurring the faces of those depicted) (WITNESS 2016). In turn, however, each of these decisions can make the work of investigators more challenging downstream and/or may reduce the apparent credibility of the content in the eyes of investigators.

Accordingly, a third source of harm stems from the aggregation of digital and visual information, even if publicly available. Land et al. (2012: 30) note that publicly available information is often considered safer to re-use/share; after all, it has already been made public. Even if the person releasing the information may not have intended it, so the thought goes, the damage is already done.81 In recent years, the concept of the “mosaic effect” has been used across institutional contexts to underline prominent hazards in the generation and management of mass databases storing information linked to individuals – be they consumers, aid recipients, or municipal residents (Wittes 2011; Green et al. 2017; International Committee of the Red Cross and Privacy International 2018). The “mosaic effect” refers to a theory of intelligence gathering

81 Indeed, Land et al. (2012: 30) point out that institutional human subjects research protocols respect this distinction by typically exempting information collected via public channels from separate review processes.

which describes how disparate pieces of data take on intelligence value when assembled together (Pozen 2005). A recurrent theme in this literature is how the ubiquity of digital personal information today troubles prior institutional evaluations of anonymity and identifiability, especially in data collections, given that “de-identified data sets can be combined with other supposedly anonymous data to re-identify individuals and the data associated with them” (The Engine Room, Amnesty International, and Benetech 2016: 12). Although the mosaic effect has largely been used to refer to the creation and management of numeric or textual datasets, the privacy and security risks it entails are no less salient in the case of visual content analyzed by human rights open source investigators using manual or automated methodologies. Though adversaries might be equipped with the same open source techniques as advocates, publishing investigation findings or re-sharing UGC carries the risk of drawing attention to specific data or lowering verification barriers to using and acting on information.

Fourth, the physical distance and lack of contact between sources uploading content and practitioners discovering it downstream may make security precautions less “intuitive” (The Engine Room, Amnesty International, and Benetech 2016: 61) and may weaken practitioners’ assessments of the security risks entailed in collecting, storing, and sharing UGC. By reaching out to uploaders, practitioners can better understand the security risks and ask permission to use content. At the same time, however, reaching out to uploaders is voluntary and fraught with its own perceived security risks, and the establishment and protection of secure communication channels may require technological expertise, resources, and domain knowledge which may be out of reach for many human rights organizations. Beyond what may be substantial preventative measures and safety risks entailed in communicating with uploaders and creators, it may be difficult enough to identify whom to contact and how: whereas content uploaders could, in theory, be contacted through the platforms themselves, if not through contact information traced to the uploader’s account, additional stakeholders linked to content (e.g., individuals depicted therein, or those present at its creation) would be more difficult to contact, let alone identify, relying solely on information embedded within or attached to media online. Human rights organizations and practitioners are unevenly equipped with the technological expertise, financial and organizational resources, and experience to establish and maintain encrypted

communication with sources secure from interception by governments or third-party adversaries.

Consequently, organizations may elect not to contact sources – whether on a case-by-case basis or systematically – and instead attempt to independently evaluate the anticipated security risks resulting from disclosing information, or simply collect UGC with the intent to keep it in private storage. The acute perception of security risks resulting from employing UGC for advocacy or accountability is pervasive in the field, and drives deliberation over whether and how to use UGC. However, the security risks of working with UGC are weighed differently depending on the activity and ultimate use. Whereas decisions to publish content or findings publicly may involve careful, case-by-case considerations of safety, privacy, and security risks, such preoccupations generally do not seem to weigh as heavily for the collection and preservation of content for future use and legal accountability, as indicated by the development of automated techniques to crawl websites and scrape and store large volumes of online content.

Beyond the security risks of republishing content and reaching out to content uploaders for context, a fifth ethical challenge concerns obtaining meaningful and informed consent in the context of crowdsourced online investigations. “It can be difficult to obtain informed consent even in the best of circumstances,” observe Land et al. (2012: 30). “It is that much more difficult in a low-resource setting, complicated by social, cultural, and language barriers as well as a lack of familiarity with the nature of the technical risk involved.” The speed and potential scope of online circulation may defy reasonable expectations for content’s future use and violate any guarantees on the part of the advocate or researchers (Gregory 2012b). The growing use of automated techniques for web-scraping content makes it even less likely that organizations will concern themselves with consent-seeking mechanisms. Human rights practitioners and scholars are grappling with how consent frameworks might be applied, modified, or substituted with more appropriate standards, recognizing that informed consent may be unwieldy or cumbersome for contemporary applications of eyewitness media and open source investigations in human rights work (Aronson 2017; Land et al. 2012).


In light of these risks, how Lab practitioners perceive their responsibilities towards human rights-related UGC and negotiate their roles amidst security concerns and communication and consent gaps with content uploaders are the questions to which we next turn.

OPEN SOURCE INVESTIGATIONS AS “RESCUING” STORIES

One sunny afternoon, at a small table outside Berkeley’s Free Speech Movement Café, I interviewed an undergraduate student in her first semester in the Lab. As the conversation migrated to her experience working on the al-Lataminah report then underway, she shared a perspective which embodies a common way of talking about the Lab’s investigations: as deeply humanizing endeavors, tantamount to communing with victims and witnesses of atrocities. In this instance, she was describing discovery on the project: “When we found that video inside the hospital,” she recalled, “that felt really good.” She continued,

This is like, okay “we see you,” that’s what it feels like—it’s like someone waving their arm up, almost like they’re stranded and it’s like, “we see you, we see you.” That’s what it feels like when you find those sorts of things.

Especially in light of the grave safety risks facing video creators and uploaders, epitomized in the Syrian conflict (e.g., Hamilton 2019; Taub 2016), social media discovery and verification were often portrayed at the Lab as an extension of one’s moral obligation to witness injustice. Illustrated evocatively in this student’s reflections, discovery was referred to as both a humanizing and a political act of “rescuing” narratives. Beyond discovering content, the acts of incorporating eyewitness media into advocacy materials and preserving it for future accountability efforts were also imagined as affirming and extending witnesses’ “voices.” Though in a legal setting eyewitness media constitutes documentary evidence and is thus distinct from testimonial evidence, eyewitness content was frequently equated with victims’ voices, narratives, or stories. Accessing information posted by people on the ground through YouTube or Facebook “means we’re hearing stories we’ve never actually heard before and wouldn’t have otherwise,” commented Alexa Koenig in a promotional video for the Lab (U.C. Berkeley Public Affairs 2018). This treatment of user-generated content online aligned open source

investigations with the ethos of contemporary human rights movements and their commitments to witness testimony (Alston and Knuckey 2016).

On an affective level, navigating social media for posts and videos of injustice could afford students a distinctly intimate lens onto crises far away in space and time. “The work that we’re doing,” continued the student above, sipping on iced coffee, “it’s very intimate, like in terms of how we’re engaged with the atrocities being committed.” Describing the proximity she had gained to the Syrian conflict, the consecutive chemical strikes in al-Lataminah, and to people’s stories of devastation and resilience, she explained how these stories, collected via social media sites, differed from the perspectives on the Syrian War transmitted through news, conversations, reports, or academic scholarship.

Actual people are submitting this, actual people are putting this online. These are actual people’s towns. So when you get to know their home, you get to know how they’re trying to survive, you get to know what they’re trying to protect, it’s a much different experience than trying to unpack the entire conflict from an academic or analytic perspective… Each individual attack counts, to the people it counted.

The experience of discovering and verifying crowdsourced material provided students a personal means of engagement—and perhaps a minuscule sense of intervention—in the conflicts and clashes they were tasked with investigating. Students occasionally distinguished their participation in the Lab from their academic activities: the former often felt real, important, and urgent, compared with classwork which could seem overly abstract and inconsequential. The graphic severity of the eyewitness material could imbue investigations with solemnity and heighten the ethical weight accorded to content, while positioning resiliency as a prevalent theme and enduring concern for Lab staff (Lampros and Koenig 2018; Ellis 2018). Despite potentially exposing students to risks of experiencing secondary trauma (Dubberley, Griffin, and Bal 2015), Lab investigations afforded students a way to positively channel hopelessness and outrage—whether about the specific conflicts they were investigating or social

injustice more broadly. One visiting practitioner made several references to “geolocating in anger,” promoting investigative work as an outlet for despair.

Accordingly, Lab students, staff, and visiting practitioners saw themselves as amplifying the stories of those most impacted by state violence, stories which might otherwise have been neglected in mainstream news coverage or at risk of being buried, if not deleted, online. Then-Lab director Félim McMahon described this work as committing acts of “information activism” aimed at extending voices often marginalized in media coverage. This rhetoric was exemplified at the inaugural Digital Verification Corps summit held at U.C. Berkeley in the summer of 2017. An Amnesty International researcher gravely told students that they had, for better or worse, developed skills required to broker truth in our day. In the face of adversaries who want to conceal evidence of human rights atrocities, students were on the frontline—working to adjudicate competing claims concerning abuses taking place and discovering “narratives that would otherwise have been lost to history.” It is through these narratives and affects that students were empowered to perceive and deputize themselves as content saviors, tasked with scouring the internet for media that eyewitnesses might have risked their lives to record and upload, and which might constitute the only records that exist about a particular atrocity or incident.

The moral obligation to collect, use, and preserve user-generated content depicting human rights abuses has been expressed beyond the Lab by partnering NGOs and other open source practitioners. “For those who believe in the inherent value of human rights documentation,” writes Jay Aronson (2017: 83), “there is a pressing duty to preserve this content for use in humanitarian, justice and accountability, and historical investigations.” Aronson is Director of Carnegie Mellon University’s Center for Human Rights Science (CHRS), an institute pioneering the use of automated techniques in the collection, analysis, and preservation of human rights-related eyewitness media. Syrian Archive researcher Jeff Deutch has remarked of his organization’s work,

We try to, I guess, close this loop between the people who are submitting the videos and us, as researchers… A lot of people have taken a lot of risks towards


uploading those [videos] and by preserving them and by archiving them and trying to standardize the way that we’re doing this, we’re also trying to honor the risks that they took and preserve their memories of these experiences.82

Human rights groups and practitioners dedicated to the discovery, verification, and preservation of user-generated content have emerged in recent years as outspoken advocates of content uploaders whose posts have been taken down or are in jeopardy of removal (e.g., Electronic Frontier Foundation, the Syrian Archive, and WITNESS 2019). After YouTube introduced modifications to its machine learning algorithms in August 2017, resulting in the swift removal of 900 channels posting videos of the Syrian civil war (Asher-Schapiro 2017), human rights organizations responded with outrage, demanding the immediate restoration of the removed content; enhanced due process in procedures for appealing removals; and the disclosure of information concerning platforms’ enforcement of content policies as well as accounts suspended due to government requests. A leading concern is that “[w]hile these takedowns are currently retrievable when the human rights community knows they have happened and can react to them, what happens when the efficiency of the removal system means the human rights community does not even know the content existed before it disappears?”83

In addition to mobilizing publicly as advocates of human rights-related eyewitness media in the face of platform takedowns, human rights groups and practitioners have stepped up their collection and archival efforts, including adopting web-scraping and other automated techniques to expedite and expand their collection of eyewitness media, particularly in contexts where “the dissemination of conflict- and human-rights-related video… has vastly outpaced the ability of researchers to keep up” (Aronson 2017: 2).

82 Open Knowledge Foundation Deutschland. 2017. “The Syrian Archive – Documenting Human Rights Violations in Syria.” May 26. Last accessed April 19, 2019. Retrieved from: https://www.youtube.com/watch?v=bgZ9YcZaLLk.
83 RightsCon. 2018. “Social Media Takedowns: Protecting Who?” Access Now RightsCon Conference, Toronto 2018. Last accessed April 23, 2019. Retrieved from: https://rightscon2018.sched.com/event/EHkj/social-media-takedowns-protecting-who.


It is through these public advocacy activities and organizational practices that human rights practitioners are positioning themselves as content stewards and safeguards, and advocating for the protection and preservation of user-generated content for future accountability efforts. The emergence of this self-ascribed role can be seen clearly in the context of the Syrian civil war, for which a proliferation of organizations is undertaking large-scale collection and preservation of “massive quantities of data” (SJAC 2019). Of these data, user-generated content of abuses is prominent, owing to the forceful media activism borne out of the conflict, which has been dubbed the “YouTube War”; there are allegedly more hours of video documenting the conflict than hours elapsed in the conflict itself, which began in 2011. Such content has been sought actively by myriad stakeholders, with the promise that it may be used in accountability efforts in the future, the specifics of which are as yet unknown. For its part, the Syrian Justice and Accountability Center (SJAC) has “collected and archived terabytes of documentation… from displaced persons, as well as documentation published publicly on the internet” (Syrian Justice and Accountability Center 2019). According to SJAC, it is not the lack of documentation which constitutes the foremost challenge for advocates, as has been the case historically, but its immensity. Best practices for collecting and storing content in ways intended to maximize its probative value for litigation and other accountability mechanisms – such as hashing and timestamping – have been promoted by and for human rights advocates, technologists, and open source practitioners (e.g., Aronson 2017; Piracés 2018a; Deutch and Halab 2018).
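To make the two techniques just named concrete, the following sketch shows what hashing and timestamping a collected file can look like in practice. It is a minimal illustration and not a reproduction of any particular organization’s preservation workflow; the file name and record fields are hypothetical, and real pipelines add further safeguards such as write-once storage and trusted timestamping services.

# A minimal, illustrative sketch of hashing and timestamping a collected file.
# The file name and record fields are hypothetical, not any organization's schema.
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path

def preservation_record(path: str) -> dict:
    """Return a SHA-256 digest and UTC collection timestamp for a file."""
    digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
    return {
        "file": path,
        "sha256": digest,
        "collected_at": datetime.now(timezone.utc).isoformat(),
    }

# Example: record a downloaded video and append the record to a simple log.
record = preservation_record("airstrike_video.mp4")  # hypothetical file name
with open("preservation_log.jsonl", "a") as log:
    log.write(json.dumps(record) + "\n")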

How do practitioners reconcile their commitments to witnesses amidst consent and communication gaps with content uploaders? Next, I outline three overlapping discourses that effectively normalized consent gaps by minimizing the perceived necessity of obtaining consent from uploaders, producers, and other content stakeholders. These discourses constitute tactics by which human rights advocates negotiate their perceived duties towards impacted communities on one hand and UGC on the other. Though many practitioners imagine these two duties as overlapping, this analysis highlights a potential tension between them, suggesting that protecting one may under some conditions come at the expense of protecting the other, and vice versa.


CONSENT-CUTTING DISCOURSES OF EYEWITNESS MEDIA AND ONLINE INFORMATION

For most of the Lab’s projects during the span of my fieldwork, partnering NGOs were ultimately responsible for publishing reports, archiving materials, and employing content in litigation or other accountability efforts. Although the Lab was engaged in finding, verifying, storing, and publishing some eyewitness media, it was generally assumed that the Lab was not responsible for obtaining consent and permissions for use, insofar as it did not undertake the publication of this material or conduct mass-collection and -preservation efforts. Indeed, its no-contact policy with uploaders, or any other online users for that matter, formalized these consent and communication gaps with uploaders. In addition, while the Lab took steps to enhance its digital security and that of its participants and staff, the fact that it rarely published reports on its own also meant that NGO partners were expected to take the brunt of responsibility for evaluating and managing the anticipated security risks of re-sharing media and investigation findings. Leaving aside for now their merits and limitations, these presumptions had the effect of largely outsourcing deliberation about consent, the establishment and maintenance of communication with uploaders and local sources, and the management of security risks in connection with publishing findings.

Consequently, despite their stated commitments to uploaders and, by extension, impacted communities, Lab participants conducted their investigations amidst perpetual gaps of consent and communication with uploaders. Somewhat surprisingly, most did not seem to mind; consent gaps with uploaders were not perceived as problematic at all. Students were relieved not to have to reach out to uploaders—a task regarded as acutely dangerous and better left to those equipped and trained to do so. Consent was neither asked about nor missed, even in projects for which Lab teams were responsible for producing public-facing reports themselves – and especially in projects where Lab teams submitted reports with content and investigation findings to clients (which comprised the vast majority of Lab investigations).

How did Lab practitioners reconcile their lack of explicit consent from uploaders with their stated commitments to uploaders and to stewarding content on their behalf? Several circulating discourses I heard within and beyond the Lab had the effect of diminishing the perceived

necessity of obtaining consent from uploaders, producers, and other content stakeholders; one attitude imputed consent onto the act of upload, another implicitly portrayed consent as impractical, and a third approach to online content and set of open source investigative practices cast consent as irrelevant.

The first attitude involves the pervasive assumption that uploaders wanted the content they posted to be discovered and used for human rights advocacy. In the absence of explicit consent from uploaders or those linked to content, Lab students and practitioners often imputed consent onto user-generated content, taking its mere presence online as a signal of the uploader’s intent and “agency” to share content for human rights advocacy and accountability. This is despite the fact that, as noted above, the online existence of content is no guarantee that content uploaders, producers, and other stakeholders intended it to be seen and circulated. On rare occasions, students and practitioners briefly acknowledged their lack of consent from uploaders to use content, and possibly even their uncertainty about uploaders’ intent in posting eyewitness media in the first place. But even in these instances, such doubt was expressed only in passing and was quickly resolved with the belief that uploaders posted their “stories” to be “heard” and “seen.” One Lab manager described her sense of responsibility to attend to user-generated content in an interview with me:

It’s really a matter of serving the people who are putting the content out there because, again, I won’t make any assumptions about why people are putting this content out there but my first thought would be that they’re filming these things so that something will change, to make people aware. And we’re kind of on the cutting-edge of doing both of those things—the awareness and the change.

Consistent with the narratives described in the previous section, this statement portrays open source investigations that make use of eyewitness media as a moral obligation and a means to carry forward witnesses’ voices. In referring to “awareness” and “change,” she alluded to the Lab’s contributions to advocacy and litigation, respectively. On another occasion, the same manager stressed the importance of “respecting the power of social media and the agency of the uploader” in having posted the content at all. Others I encountered in my fieldwork

made similar allusions to content’s availability online as an act of uploaders’ “agency”—acts which they felt should be honored by using or otherwise preserving the content. The assumption that uploaders indeed want their content to be used by human rights advocates was called by one Lab observer, in a rare admission of doubt, a “wild assumption”:

The person who sees a helicopter and pulls out a phone and records it: I have no comment at all about what they do. I don’t understand why people do that, I don’t know… It’s a missing, like, black hole in the field... Everyone assumes they know that someone wants to use that to bear witness or to prosecute. That is a wild assumption in my opinion… We’re making a hell of a lot of assumptions, including myself… that whoever did that wants it to be used for evidence. Like, in what trial? I doubt that that’s relevant to the person that’s uploading it. Maybe they want a witness, maybe they want the world to see it, maybe just - it felt important. I don’t know. Maybe there’s something human going on there, but I don’t know. And, there’s a lot of moralizing in my field by people like me, about like “well, we have an obligation to do something with this.” I say that all the time—I feel that. I’m not sure that I’m right. I just feel it, but I might be wrong, and I think it’s important to approach this question with some humility about what we don’t know.

The assumption that uploaders wanted their content to be used for human rights advocacy and accountability appeared especially prominent with regard to certain contexts and projects, operating as justification for collecting, using, or preserving content without uploaders’ explicit consent. Granted, this assumption is far from baseless. The fact of content’s online presence does assert itself as a compelling reason to believe the uploader intentionally posted it for wider viewing and use. Moreover, in particular contexts, epitomized by the Syrian civil war, efforts by media activists and citizen journalists to document and publicize abuses are well-known, and have come to shape human rights advocates’ expectations about uploaders’ intent in posting content. In the Syrian conflict, local news agencies have played a large role in producing and circulating user-generated content and news coverage of the attacks—content shared on such accounts is inferred to have been posted with the intent to be seen and circulated, as opposed to content posted by accounts pertaining to individuals with small

followings. Moreover, a number of NGOs using or archiving content from the war have maintained contact with at least some of the agencies and sources from which they are collecting content. These and other factors have led to the pervasive perception that people posting content of the Syrian conflict have done so to furnish material in furtherance of human rights advocacy and accountability.

And yet, as the observer above cautions, assumptions are not equivalent to consent-seeking mechanisms or direct understanding of uploaders’ posting behavior and aims. Unless open source investigators maintain direct communication with uploaders, the latter’s reasons for posting content, and whether their desired ends align with those of human rights advocates, may not be readily apparent. Even in the Syrian context, it may indeed be a “wild assumption” to presume all actors posting content are doing so for the same reasons and desired ends; in other contexts and cases, it may be still less clear why people have posted content and for what purposes.

Besides imputing consent onto the act of upload, a second attitude that emerged during my fieldwork and interviews implicitly portrayed consent and consent-obtaining mechanisms as impractical, both in terms of the efforts needed to obtain consent on a case-by-case basis, and because of the monumental importance accorded to the preservation of documentation for posterity. Regarding the former issue, participants echoed a sense of apprehension about traditional notions of social scientific consent present in scholarship and practitioner manuals. When I raised questions about consent, some participants quickly retorted, whose consent? I took their response both as a caution that consent obligations depend on the specifics of each situation and as an acknowledgment of the unwieldy abstractness of the concept. (Add to this the obstacles to establishing direct and effective communication with those individuals, the perceived security risks in doing so, and ambiguity about what consent even means in connection to information that can be so easily and quickly spread and stored.)

From the perspective that human rights documentation has immense inherent significance and probative value in court, seeking consent from uploaders appeared further impractical, outweighed by the duty to collect and preserve human rights-related media.


Students, staff, and visiting practitioners periodically noted that posts, videos, or photographs on social media can constitute the only records that exist of a particular atrocity or incident – hence their vital significance to documentation and, in particular, legal accountability. “Because today there is at least the possibility of international criminal justice,” observe Alston and Knuckey (2016: 14), “advocates are increasingly gathering initial evidence for multiple purposes—both immediate advocacy and potential subsequent criminal trials.” A growing emphasis in the human rights field on pursuing legal accountability for abuses (Engle 2016), the justice-seeking mechanism undoubtedly promoted by HRC and Lab leadership, has arguably driven much collection and preservation of eyewitness media in hopes of employing such material as evidence in courtrooms. “Practitioners need to consider that their fact-finding efforts could become relevant to formal accountability and justice efforts,” writes Enrique Piracés (2018a) of Carnegie Mellon University’s Center for Human Rights Science, in an article disseminating best practices for collecting and storing online information.

When I raised the issue of collecting and preserving content without users’ consent, practicing lawyers and law students dismissed this concern as paling in comparison to the imperative to preserve potential evidence of war crimes; moreover, lawyers and Lab practitioners regarded publicly available content on social media as belonging to the public domain, presumably making uploader consent unnecessary (described further below). Beyond the Lab, the attitude that the preservation of eyewitness content for future accountability efforts trumps the necessity of obtaining case-by-case consent is reflected in the use of automated techniques to rapidly identify, collect, and store relevant human rights-related material on a mass scale. As Aronson (2017) points out, collecting content from the web without seeking consent and permissions may undermine the wishes of content producers, uploaders, and others closer to content’s creation, especially if uploaders later wish to remove their material from the public domain for security, privacy, or other concerns. What if, indeed, uploaders wanted to prohibit human rights groups from using their content? Further still, what if uploaders were the very perpetrators of abuse? It would then make no sense to ask for consent, would it?

A third attitude (and set of investigative practices at the Lab focused on persons of interest) positioned consent as irrelevant, insofar as content was publicly available. In a sense,

relying on the classification of eyewitness media and social media posts as “public” can be thought of essentially as the lowest common denominator of all three approaches to consent, both in the sense that it is the most permissive, and in the sense that it does not retain any sense of accountability to uploaders for current or future use. The notion that publicly available information and content on social media platforms and video-hosting sites is fair game to collect, use, or preserve was an unstated norm at the Lab, as it is more broadly across industries and domestic law enforcement agencies (Privacy International 2019; Hill 2018). A commonsense view of online material, this nevertheless reductive understanding of online information also reigned by omission: during my fieldwork, there were no trainings, tips, or explicit frameworks to the contrary. Like a growing number of organizations working with open source and social media information, the Lab’s activities took advantage of social media’s ambiguous informational context and the lack of legal regulation on the usage and collection of social media data. The treatment of publicly available information gleaned online as unproblematically “public” for use justified the collection and preservation of content — both content apparently uploaded by witnesses and bystanders, and content posted by persons of interest and possible perpetrators.

Let’s first take the case of content presumed to be uploaded by witnesses or bystanders. As previously described, students were encouraged to perceive themselves as acting as surrogates for eyewitness content. Many considered themselves in this role and thus did not express doubts about whether they were acting in witnesses’ interests when discovering or verifying eyewitness media, or producing reports with their findings to send to NGO clients. Even so, in the event that students did have ethical doubts about their use of eyewitness media or posts, viewing the material as public tended to diffuse, if not resolve, their uncertainties. On one occasion, a student shared with me that she had felt uneasy about repeatedly taking close-up screenshots of people’s faces to document her verification steps for the client NGO.84 In another ongoing project, she continued,

84 Students regularly compiled video screenshots onto documents for partnering NGOs to show their verification and geolocation steps.


We’re documenting the social media presence of anti-government activists and I thought, this is weird. We have these names and we’re just scoping them out, and if I knew that other people were doing this, it would be really weird. Like, for that project they’ve been following this one person a lot...

In that project, students had tracked social media accounts and YouTube channels pertaining to actors closely involved in the incidents under investigation, “watching the watchers” as a strategy to monitor unfolding events. Just then, another student responded with a quizzical tone,

I had never thought about that when doing this stuff. Like, I wouldn’t think, “ohh, I have way more information about this person that they have on me.” Rather, I just think, “these people have posted things online and so the expectation is that it would be public.”

In other words, the student need not be uneasy about the stark informational asymmetries apparent in their project; after all, the uploader has made his or her actions and posts publicly known. This response offers an assertion of the public nature of social media content, while also hinting at the notion of “uploader agency” described in the previous section of the chapter. The view that publicly available posts and content online, including from social media, can be collected and preserved freely appears to be pervasive among human rights organizations. Those involved in the large-scale collection of content claim to collect “documentation published publicly on the internet” (e.g., Syrian Justice and Accountability Center 2019). Though social media platforms and video-sharing sites have indeed emerged as de facto archives of documentation of conflicts and human rights atrocities, treating all publicly available content as “public” to collect, use, and preserve without any consideration of the circumstances or context in which content emerged online is clearly not the highest ethical standard for data collection and usage, and in fact has been severely critiqued when conducted by other actors, like domestic law enforcement (Privacy International 2019; Hill 2018; Graham Wood 2016).


Indeed, taking publicly available information online as “public” for the purposes of collection, preservation, and circulation internally and with clients also justified investigations on persons of interest and their online posts, behaviors, and networks. One such project focused on collecting social media posts of high-ranking military officers in Myanmar appearing to denigrate or incite hatred or violence against the Rohingya (e.g., Stecklow 2018; Mozur 2018). No one expressed any objection to tracing and collecting these public figures’ posts, taking for granted that these materials were appropriate to collect and store without asking. One student positively reflected on the project, “[a]s those who seek to use social media as a tool to manipulate hate and fear highlight the negative consequences of technology, we at the lab are using their weapon of choice—social media—against them as a way to uphold our commitment to justice and human rights.”85

Moreover, in legal and confidential projects, students often undertook measures to maintain anonymity online and keep their investigations covert. Students again fell back on the view of social media information as public; though they might sense this information ought not to be treated as such, they often had little alternative benchmark—apart from possibly their own professional backgrounds—with which to assess their activities. In a small group discussion on students’ ethical concerns, a human rights lawyer and Berkeley Lab student noted that “I don’t know if this is a narrow legal problem but, we’re investigating people who don’t know we’re investigating them. Normally we would need a warrant for wiretapping.” A student responded, “the issue is that [this is] technically public, but could also be infringing on privacy. Where do we cross the lines there?” A law student added, “it’s that this space is still quite unregulated. Like, in criminal law, you’d need to get a warrant. Here, there aren’t regulations set up yet.”

Setting aside possible ethical problems and contradictions entailed in human rights organizations conducting covert social media intelligence gathering, it is clear that within investigations focused on content posted by persons of interest or perpetrators, seeking consent

85 “Lab Student Perspectives.” Berkeley Law. Last accessed April 12, 2019. Retrieved from: www.law.berkeley.edu

from uploaders would be preposterous; in extreme cases, doing so would unnecessarily place in jeopardy the investigation as well as the safety of Lab members, NGO partners, and witnesses. It would be ridiculous, for instance, to consider messaging people who have uploaded videos depicting themselves or their military colleagues committing human rights violations, to ask for consent. Clearly, if there were any circumstance where consent-seeking mechanisms would be uncalled for, this would undoubtedly be one. At the same time, it is worth asking whether resorting to classifications of eyewitness media and social media posts as “public” (and thus fair game to collect, use, and store) is in all cases warranted, and what the implications of this determination would be for the meaningful inclusion of content uploaders and creators in advocacy, accountability, and knowledge production efforts relying on their content. The point of this discussion is not to plainly discourage the collection, use, or preservation of content without consent, but rather to highlight ways in which a set of pervasive discourses undercut the importance of consent and normalize the absence of impacted communities from decision-making about the use of their content. Next, I describe how Lab students and staff managed security risks, particularly those anticipated from contacting uploaders and publishing investigation findings. Though security risks were nominally given more weight than consent at the Lab, they were negotiated in ways that effectively outsourced their management to partnering NGOs and further exacerbated consent and communication gaps with uploaders.

NEGOTIATING COMMUNICATION GAPS AND SECURITY RISKS

Advocates and journalists working with human rights-related content posted online vary with respect to their communication with content creators, uploaders, and other stakeholders. At one end of the spectrum are those who might maintain regular communication with sources on the ground about ongoing developments and about the media they have posted online or shared directly with the advocate or journalist. On the other end of the spectrum are those who might wish to use content discovered online but do not have direct or indirect contact with any stakeholders linked to content – common to crowdsourced projects and open source investigations. Practitioner manuals and scholarship impart tips on evaluating and minimizing the security risks stemming from sharing eyewitness media or publishing sensitive investigation findings on a case-by-case basis and in the absence of direct contact with content creators and

uploaders (WITNESS 2011; Gregory 2012a, 2012b; The Engine Room, Amnesty International, and Benetech 2016). For instance, WITNESS (2017b) asks curators of human rights-related UGC to consider: What appears to be the intended audience of the media? What are the local conditions and security risks for witnesses? Are there signs that the uploaders want to remain anonymous? Whether the account appears to pertain to a semi-professional media agency with a large audience, or to an individual with little identifiable information and no corresponding profiles on other platforms, might at least suggest different expectations on the part of the uploader that the content posted would be viewed and shared.

Still, manuals do recommend contacting uploaders when possible to better understand the security risks from publishing content or otherwise sharing data. “If we are able to identify a primary source without having their permission to broadcast that material,” said Malachy Browne, currently at The New York Times, “we’re potentially putting them at risk if we identify who they are” (Browne, Stack, and Ziyadah 2015: 1341).

At the same time, maintaining safe communication with uploaders requires technological expertise and resources that are unevenly distributed among NGOs documenting attacks and sharing information in the conflict. As journalists attest, leaky digital communication traces can make protecting sources more challenging than ever before (Posetti, Dreyfus, and Colvin 2019). Crucially, in regions with heightened security risks, such as Syria, simply reaching out to uploaders or individuals potentially linked to content is believed to risk further endangering them. Accordingly, depending on the context, some news organizations maintain that “not seeking permission is the right thing to do,” and may elect to simply use the content without consulting uploaders (Wardle, Dubberley, and Brown 2014: 105; Turner 2012).

Frightened by the possibility that they could endanger individuals by reaching out, Lab students were content to be relieved of the responsibility to contact uploaders and instead leave this task to NGO clients presumed to have workflows, processes, and resources dedicated to supporting encrypted and secure communication channels with sources and stakeholders. Many emphasized that they did not feel adequately trained and equipped to contact individuals securely. One Lab manager explained to her team that “it’s not safe for them” to be contacted

by us; “when journalists do that, they’ve undergone special training.” Another instructed project newcomers to “create dummy accounts for Twitter, Facebook, etc. and do not interact with sources.” She continued,

We are never interacting with sources. It’s something you’ll do if you’re higher level at [an organization], it’s something that [our client] does. It’s something that we never do. Asking people [online]—that’s not what we do in the workflow. It’s respecting who we are. We’re simply doing content analysis. And those high-level safety ethical choices are being done by human rights investigators who understand what the impact is on the sources.

One Lab manager stated bluntly that she “wanted to do as little as possible, ‘cuz these are scary issues.” Students and staff placed great trust in the Lab’s partnering organizations to handle questions of consent, matters of publishing, and anticipated security risks to witnesses. Koenig asserted, “I do think we’ve outsourced in some ways some of the decision-making around consent to the partner organizations who have the closer relationship with the people on the ground, and that’s why the trust and communication between each link in that chain going from the field to, like, a university—I think it’s so important that that be strengthened as much as possible.”

Accordingly, fear of endangering uploaders was a pervasive and powerful disincentive to contacting sources. Lacking the training, technological expertise, resources, and networks to establish safe communication with sources, Lab students are clearly not comparable to journalists and human rights professionals with experience in doing so capably and safely. Still, it is reasonable to expect other sites to be similarly under-resourced and/or apprehensive about reaching out. Depending on their advocacy goals, such sites might decide simply to collect and preserve eyewitness media without releasing it publicly as a way to minimize harms, or attempt to evaluate the security risks independently before re-sharing content or publishing investigation findings. The latter can be equally harmful if advocates are not sufficiently careful, disclosing or simply making it easier to establish where media was captured or the identity of those who captured, shared, or are depicted within it. In verifying and piecing together media and

information, open source investigators reveal connections and produce intelligence. In what follows, I provide an example from one such self-published Lab report to illustrate how neglecting to reach out to uploaders for fear of safety harms can nevertheless introduce methodological risks.

The example concerns the Lab’s investigation of chemical weapons strikes in al-Lataminah, Syria, on March 25 and March 30, 2017, described in the previous chapter. With an extraordinary level of media activism and documentation efforts, yet acutely dangerous conditions for those producing and disseminating content, the Syrian conflict epitomizes challenging data-related decisions and advocacy tradeoffs. Capturing visual documentation of attacks has been notoriously dangerous for witnesses (Taub 2016; Hamilton 2019), and organizations storing and sharing eyewitness media have been targeted in cyberattacks. In particular, information regarding civilian sites and hospitals is notoriously sensitive in Syria, given the government’s selective targeting of medical facilities and workers (Physicians for Human Rights 2016; Syrian Archive et al. 2017).86

Whereas it is typical for locations of humanitarian structures to be shared with parties to conflicts, organizations tasked with documenting attacks in Syria have withheld coordinates of medical facilities and other civilian structures throughout the conflict due to fears these sites would be directly targeted (Atlantic Council 2018; Shaheen 2016). “Humanitarians have long agonized over what should be done in the new landscape,” writes the Atlantic Council (2018:13) of this conundrum. “Should they share the locations of humanitarian structures–known as ‘deconfliction’–and risk deliberate attacks on the projects and personnel? Or keep the locations hidden and hope that camouflage prevents the violence?”

For instance, Elise Baker, a former researcher on Physicians for Human Rights’ project Mapping the Syrian Conflict, stated that coordinates of the facilities depicted in the online interactive map were slightly modified so they would not align with sites’ actual locations (see

86 In 2016 alone, there was an average of more than one attack on a medical facility every other day in four governorates—Idlib, Aleppo, Hama, and Homs—resulting in 297 casualties of patients and staff (Haar et al. 2018).


Physicians for Human Rights 2019). For its part, the Syrian Archive issued a note explaining why it had not disclosed attack coordinates in its public report on the targeting of Syrian medical facilities, as the organization typically does (Syrian Archive et al. 2017). Weighing transparency against safety risks on the ground, the organization decided to create a public-facing report as well as a private version. Containing additional information including locations, the private version is to be shared with fact-finding bodies mandated to investigate abuses in the country, such as the UN Commission of Inquiry on Syria and the International, Impartial and Independent Mechanism on Syria (Syrian Archive et al. 2017). Other organizations collecting documentation on the crisis have elected not to share information publicly at all: the Syrian Justice and Accountability Center states that it “has strict security and data-sharing protocols designed to ensure that sensitive personal information is protected, and only shared with accountability institutions under specific circumstances, each requiring a case-by-case determination.”
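The coordinate modification that Baker describes can be pictured with a short, generic sketch: before publication, each sensitive location is offset by a random bearing and distance so that the published point no longer coincides with the actual site. This is an illustration under assumed parameters only, not Physicians for Human Rights’ or any other organization’s actual procedure; the coordinates and offset radius below are hypothetical.

```python
# Illustrative sketch only: offsetting a published coordinate pair by a random
# bearing and distance so it no longer aligns with a sensitive site's true
# location. The offset radius and example coordinates are hypothetical.
import math
import random

def jitter(lat: float, lon: float, max_offset_m: float = 500.0) -> tuple:
    """Shift a (lat, lon) pair by a random distance of up to max_offset_m metres."""
    bearing = random.uniform(0.0, 2.0 * math.pi)
    distance = random.uniform(0.5 * max_offset_m, max_offset_m)
    dlat = (distance * math.cos(bearing)) / 111_320.0  # ~metres per degree of latitude
    dlon = (distance * math.sin(bearing)) / (111_320.0 * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon

# Hypothetical example coordinates, not a real facility.
print(jitter(35.0, 36.7))
```

The tradeoff such an approach implies is the same one the organizations above describe in prose: the larger the offset, the safer the site, but the less precisely the published map documents where attacks occurred.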

In writing its report on the chemical weapons attacks, the Lab team deliberated over which media, findings, and details the report was to include. Some students on the team had been frustrated that existing reports had withheld information that, if disclosed, could have lessened their own verification work. This led me to wonder how thoroughly the team would consider issues of privacy and security when the time came to decide what to disclose in the report and how. This is not to say that students were unaware of the informational sensitivities of the conflict. One had in fact noted earlier in the project that, likely, “we won’t be able to geolocate [some of the hospitals] because they’re trying hard not to be geolocatable.” This statement encapsulates a central tension of human rights open source investigations; for purposes of verification, investigators may strive to uncover locations which may have intentionally been occluded, and to discern the identity of vulnerable individuals who may have preferred to remain anonymous. Hence the significance of correctly evaluating and managing the security risks anticipated from informational disclosure.

Bellingcat’s reports, published while the Lab team’s investigation was ongoing, contained several intentional silences. Its findings on the March 25 attacks withheld the coordinates of the entrance of a makeshift hospital targeted. The report stated that although it had geolocated the hospital entrance, it “will not provide details of the location due to ongoing targeting of media

facilities in the conflict” (Bellingcat Investigation Team 2017a). Notably, the makeshift hospital attacked on March 25 had allegedly been moved there in the first place “because previous attacks had twice hit buildings used as hospitals in the village” (Human Rights Watch 2018b: 17). In addition, the Bellingcat report cited and featured a screenshot of a video linked to the March 30 attacks which appeared to depict a bomb falling through the sky and making impact with the ground, producing a cloud of dark smoke visible before a city skyline. The video seemed to have been captured from the roof of a residential building; Bellingcat stated that it had “managed to geolocate the video, but for the safety of the cameraman, [it] will not publish details of the location of the camera” (Bellingcat Investigation Team 2017b).

In the Lab team, there had been no discussion of what should be included or withheld in the report, and no collective deliberation over which media and information could be especially sensitive to disclose. In the end, the team’s report did publish coordinates for the hospital. The team manager said the decision to publish the hospital’s coordinates had been made after consulting with the team’s NGO partner, the Syrian Archive, and in light of the fact that the coordinates had by then been published elsewhere. The team’s decision-making benefited from several advantages: its collaboration with other groups working with actors on the ground, and the precedent set by reports released earlier on the attacks. The team did not publish coordinates corresponding with the video seemingly filmed from a rooftop, but did not cite its reasons for withholding them, or whether such reasons had been established at all.

In this case, no grave outcomes appear to have resulted from publishing the hospital coordinates, or from re-sharing the rooftop video. Even so, the example usefully illustrates several important dynamics worth discussing. First, it is clear that in the context of open source investigations or advocacy using user-generated content, the onus of due diligence is on the investigator or advocate to evaluate the security risks of informational disclosure, to reach out to content uploaders, and to modify or withhold information if necessary. There is little to prevent individuals or institutions – whether well-intended or malicious – from shirking these responsibilities. Though the Lab team took care in consulting with organizations and considering the scope of information already in the public domain, other groups may not be similarly networked or careful in their deliberations.


Second, the example suggests that a lack of communication with uploaders and sources on the ground could impoverish advocates’ understandings of the specific security risks linked to informational disclosure, which they often attempt to calculate from afar. There is a danger that advocates could get it wrong; that communication gaps end up exacerbating the already indeterminate nature of such risk calculations. Wielding immense discretionary power and yet detached from content’s production, advocates could grossly underestimate the security risks entailed by disclosure and thus unnecessarily place others in existential danger. Conversely, advocates could be too conservative and thereby frustrate activists’ risk-laden though intentional attempts to disseminate documentation of abuse. One visiting practitioner at the Lab gave the example of a video ultimately withheld from publication in an advocacy report because of the geographic region it pertained to and the fact that the video had accrued only a few hundred views online. “We were so worried about security,” the practitioner confessed, “we didn’t make it public. The counter-side to that, of course, is that many people take risks because they want the stuff to get out, and then we’re sitting [on it and] not sharing it.” Of the Syrian context, Natalia Krapiva spoke about the fuzzy line between deliberation over disclosure and paternalism:

The people that are sending [the Syrian Archive] this content, a lot of times they want to be found, they sort of say “okay, this is who I am, so-and-so, I’m in this location,” because they want the world to know what’s happening in Syria and they want to add legitimacy to their reports, and so for us to come in and say “well, we don’t want this information to get out,” is kind of a bit paternalistic, I think. So, we also have to be thinking about the agency of the people that are submitting this content and that are depicted in these videos.87

I have heard this concern echoed time and again beyond the Lab by human rights advocates and scholars. Tufekci (2019) recently noted that “when I talk to dissidents around the world, they rarely ask me how they can post information anonymously, but do often ask me how

87 Krapiva, Natalia. 2018. “Web as Witness: Archiving & Human Rights.” National Conference of Ethics and Archiving the Web Conference Panel, New York City. March 23. Video recording. Last accessed May 15, 2019. Retrieved from: https://vimeo.com/276951911

to authenticate the information they post—‘yes, the picture was taken at this place and on this date by me.’” Similar to the Syrian Archive’s note above, a tension is again articulated between withholding information for security or privacy concerns, and disclosing as much accurate, contextual information as possible about user-generated content to increase its visibility and credibility.

Aside from the methodological risk of under- or over-estimating security risks, failing to communicate with content uploaders and sources on a systematic basis raises an apparent yet thorny ethical shortcoming. While deciding not to reach out to certain sources for fear of security risks is plausible on a case-by-case basis, a systematic neglect of content uploaders should raise questions about advocates’ degree of meaningful engagement with impacted communities providing documentation to accountability efforts.

“JUST HAVE IT?”

This chapter highlighted some of the affective dimensions and ethical discourses driving Lab practitioners and human rights advocates more broadly to deputize themselves as content stewards and safeguards, both in public mobilizations against what they perceive as overly aggressive platform removal policies and algorithms, and through organizational efforts to undertake large-scale collection and preservation of content. We can only anticipate self-directed data collection and preservation to become more widespread. Numerous practitioners and observers with whom I spoke already employed automated techniques to crawl the Web and scrape and store content, were developing technologies to do so, or otherwise saw bulk collection and storage as a growing trend on the horizon (e.g., Kaye 2019: 18). Given the ephemerality of media hosted on Facebook, Twitter, and YouTube—particularly content likely to be deemed graphic, terrorist, or extremist by users or algorithms—human rights advocates and researchers have learned to take preservation into their own hands, whether relying on free tools and storage to save individual links (e.g., on the Internet Archive; a minimal illustration appears after the quotation below) or building proprietary software to host databases of content. As one practitioner told me with respect to storing content, especially given content moderation, “I think the open source ethos is to just have it.”


And, eventually, there might be something helpful to do with it. But, you know, I can’t trust others to ensure its existence, and I don’t think I should be the arbiter of that, or have the opinion of ‘this should be online or not.’ That makes me uncomfortable, having that voice in the conversation. So I would just rather have it for reference later on.
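The practical ease of “just having it” is part of what this quote conveys. As a rough illustration of the free link-saving mentioned above, the sketch below submits a single URL to the Internet Archive’s public “Save Page Now” endpoint; the URL is hypothetical, and real archiving workflows typically involve far more (rate limits, authentication, richer metadata, local copies) than is shown here.

```python
# Illustrative sketch only: asking the Internet Archive's Wayback Machine to
# capture a single page via its public "Save Page Now" endpoint. The page URL
# is hypothetical; production workflows add retries, rate limiting, and logging.
import requests

def save_to_wayback(url: str) -> str:
    """Request a Wayback Machine capture of a URL and return the resulting snapshot URL."""
    resp = requests.get(f"https://web.archive.org/save/{url}", timeout=120)
    resp.raise_for_status()
    # After the capture, the final (redirected) address generally points at the snapshot.
    return resp.url

if __name__ == "__main__":
    print(save_to_wayback("https://example.org/eyewitness-video-page"))
```

Tools this simple help explain why preservation can become a default reflex: an investigator can save a link in seconds, long before any question of consent or long-term responsibility has been worked out.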

And yet, the protection, collection, use, and archival of user-generated content can be at odds with the consent, desires, safety, and privacy of witnesses and their families (Aronson 2017; Gregory 2012a, 2012b). Though the ethical dilemma of wanting to help witnesses without harming them is by no means new to human rights advocacy, the use of eyewitness media and open source investigative techniques makes salient specific risks, incentives, and considerations for advocates – not least of all questions regarding permissions for use of content and security risks linked to re-sharing media and publishing findings. This dilemma suggests that the role of content stewards and safeguards might at times be in tension with that of being an advocate for witnesses. Along these lines, one observer I interviewed suggested that advocates’ exhortations to platforms to keep up graphic and violent content run counter to the aims of preserving witnesses’ psychological wellbeing and physical safety at all costs.

Despite widespread assumptions to the contrary, the mere existence of content online is no guarantee of prior (much less active) consent for its creation and circulation. It is not accurate to assume in all cases that individuals intended to post content for others to access, view, and share, nor that they carefully and correctly evaluated the risks to themselves and others before doing so. Users’ posting practices and expectations of privacy on social media sites are highly contextual and vary across users (Marwick and boyd 2011; Trottier 2012). Especially on social media sites with complex posting options and privacy settings, users may greatly underestimate the accessibility and usage of their content by others. The consent-cutting discourses examined in this chapter reinforce calls to establish ethical standards for open source investigations, and provide empirical support for claims that, in practice, consent-related protocols are considered prohibitively unwieldy or conservative (e.g., Land et al. 2012; Aronson 2018a).


Notably, the classification of social media information as “public” justified investigative techniques which overlapped with state surveillance practices. Like law enforcement, the Lab takes advantage of the ambiguous informational context of various online fora vis-à-vis the reductive albeit widespread view that online information is “‘open source’ and therefore ‘open season’” (Graham Wood 2016). Though no strong legal restrictions prohibited the Lab’s discovery and analysis of user-generated and social media content, it is worth asking whether and how human rights organizations might hold themselves to higher ethical standards for their data collection and usage, particularly given privacy’s status as a recognized human right. “As with many responsible data concerns,” writes Evanna Hu (2016, emphasis in original), “legal compliance is just one part of a much bigger picture, and it often forms the lowest bar rather than the best practice we should strive for. The fact that we are not prohibited from doing something does not mean that we should commit that act” (see also boyd and Crawford 2012).

To be sure, many organizations have developed internal protocols and systems for obtaining consent; moreover, obligations and considerations are likely to differ drastically depending on the project or situation. While of critical importance, the specifics of how these determinations are made and how consent is sought by human rights organizations or newsrooms, for that matter, are outside the scope of this chapter, as that information would be better furnished by sites that do conduct such practices, which the Lab did not. In addition, echoing calls issued previously (Land et al. 2012; Hu 2016; Aronson 2018a), the analysis here also reinforced a clear need for thorough interdisciplinary research on the ethical and legal dimensions of “open source” data collection and use, particularly with regard to eyewitness media and UGC gleaned online (e.g., Currie and Paris 2018; Zimmer and Kinder-Kurlanda 2017; Summers 2019; Jules, Summers, and Mitchell 2018; Proferes and Fiesler 2018). Relevant fields include ethical standards and unfolding practices in journalism, law, and academic social media research.

Rather than examine organizations’ specific considerations with respect to data collection and usage or develop possible standards and protocols, my aim here was instead to explore discourses that normalized the use of content without consent from content uploaders or producers. This analysis did highlight that, at the very least, human rights organizations like the Lab could benefit from promoting awareness among their investigators, researchers, and advocates

of distinctions between “public” information published in news reports or broadcast on the radio, and “publicly available” content gleaned from distinct contexts online such as social media posts and conversations where users’ expectations of privacy are not readily apparent, and which thus may be more appropriately called “quasi-private” (Gregory 2012b: 558; Privacy International 2019). Doing so might encourage self-reflection on the techniques used and practices normalized within organizations and more broadly across this emerging field. Land et al. (2012: 30) note that “[c]rowdsourcing projects are likely to encounter this tension and may want to anticipate the position they will take on ‘public’ information and how they will draw the line between public and non-public information.” In addition, producers and uploaders, too, could take steps to more explicitly indicate their intent for posting content and their permissions and preferences for the use of their content.88 More broadly, this chapter points to a need to better understand the motivations, expectations, and challenges with respect to credit and inclusion of those who upload human rights-related material to social media platforms – whether bystanders, citizen journalists, or homegrown media agencies.

On one hand, the Lab’s student-heavy structure, its no-contact policy, and its unique outsourcing partnerships with NGOs all detract from the wider generalizability of its approach to consent and security risks with respect to the broader landscape of sites in the human rights field conducting open source investigations. The Lab was composed of scores of undergraduate and graduate students (60-80 students at any given time) without extensive training in digital security or professional experience in human rights advocacy; Lab teams were overseen by managers who were also students, with one technical director presiding over all the teams. The Lab had outsourcing relationships with partnering NGOs whereby the latter possessed responsibility for deciding what to publish publicly, how and when to contact uploaders and on-the-ground

88 Sam Gregory (2012b: 556) has suggested a number of forms this could take. Producers and uploaders could state this information in their content descriptions or metadata; Gregory gives the example: “You may use this video in any way you like, provided you push for redress for human rights abuses in Burma.” Alternatively, Gregory suggests there could be a kind of “licensing system that recognizes intentionality,” drawing on creative commons approaches and/or other systems recognizing property rights. Ed Summers (2019) has made similar proposals.

sources, and how to store user-generated content for the long term. Given that most Lab teams partnered with NGOs who took care of publishing, the Lab remained somewhat insulated from some of the challenging editorial decisions entailed in deciding to re-share eyewitness media and publish investigation findings. We would expect other sites that routinely publish and preserve eyewitness media to address issues of consent and security risks more explicitly, rigorously, and systematically, with established protocols, workflows, and standards for navigating these issues.

Nevertheless, as evidenced by the example above, on occasion Lab teams did produce reports or other informational products. Moreover, participants and staff expressed a desire to become more autonomous, conducting their own investigations and publishing their own work. This was for several reasons, not least aspirations to claim greater credit and recognition for the Lab’s work. The Lab’s fortunate position of being able to outsource editorial decisions and the responsibility to communicate with sources to partnering NGOs does not account for its autonomous projects, nor is it applicable to the wider landscape of groups collecting, using, or preserving human rights-related material from the web.

On the other hand, however, the incentive structures, perceptions of security risks, and attitudes about using online content that I observed at the Lab do exist for other sites and practitioners more broadly within the human rights field. Perceived security risks deter advocates and journalists from establishing contact with uploaders. Organizations collect and preserve content en masse without any intention to obtain consent for doing so from uploaders, claiming that publicly available information gleaned online and from social media platforms is simply public and thus fair game to use and store. It is likely that other actors in this field regard the consent of uploaders as simply assumed, impractical, or irrelevant, and consider publicly available information gleaned online as appropriate to collect, store, and use freely. It is also reasonable to expect other actors to be fearful of reaching out to uploaders but to lack the resources or technological expertise to do so securely. They, too, might grapple with independently assessing the security risks of posting information, or decide to minimize their public-facing advocacy in favor of collecting, analyzing, and preserving eyewitness media without sharing content or findings widely. And yet, some practitioners I interviewed for the study simply did not reach out because they did not have time to. Consequently, while the details of the Lab’s

operations are quite atypical among human rights NGOs or investigative sites working with user-generated content, we can see similar operational tendencies and justifying rationales operating elsewhere for leaving uploaders out of decision-making regarding their content.

The upshot of this analysis is in no way to dissuade human rights advocates from collecting, using, or preserving content if they cannot obtain some kind of consent from uploaders or others connected to content’s creation. I am sympathetic to the idea that much human rights-related content is too high-stakes not to collect if it is possible to do so; moreover, as previously noted, consent is an inappropriate framework for the collection of perpetrator videos uploaded by perpetrators themselves or their networks. Rather, my point is to highlight how the sociotechnical configuration of data, people, and capabilities underlying open source investigations simultaneously produces openings for some kinds of bottom-up participation in advocacy while also generating opportunities and incentives to exclude impacted communities from other kinds of meaningful participation.

To be sure, the capture, collection, and circulation of documentation of abuse by witnesses or others on the ground constitutes a crucial form of participation, at least methodologically: there is no doubt that open source investigations of the kind discussed in this dissertation rely fundamentally on media and information posted by sources, whether volunteered intentionally or not. Moreover, the diversity of sources involved in producing and spreading human rights-related media carries the promise of enhancing pluralism in human rights investigations and advocacy. It is in this sense of participation that scholars laud the proliferation of human rights-related eyewitness media and open source investigative practices as “democratizing human rights fact-finding” and modeling more participatory forms of research than prior models of top-down knowledge production (e.g., Land 2009, 2016; Ristovska 2016a).

At the same time, without minimizing the importance of user-generated content or the efforts and risks of content producers and uploaders, it is worth asking whether and how this form of participation in advocacy and accountability might be circumscribed. The availability and use of eyewitness media in itself does not promise a “democratic” or “participatory” process in the sense of consensual and collaborative involvement over the course of an investigation,

advocacy campaign, or trial; indeed, there is greater potential than ever to exploit the vast amount of human rights-related information online in ways that merely reinforce patterns of exclusion and global hierarchies in human rights work. As Aronson (2017: 85) has noted, there is greater ability than ever before for large, international organizations to collect and appropriate information for their own purposes, which may misalign with or even directly undermine the aims of smaller, local groups (see also Pittaway, Bartolomei, and Hugman 2010; Bukovská 2008).

Providing empirical support for this observation, this chapter suggested that open source practitioners’ workflows, limited resources, positionalities, and/or risk perceptions may in fact disincentivize and thus delimit the consensual and meaningful inclusion of impacted communities and uploaders in advocacy and accountability. Ironically, it is the perceived security risks to uploaders that can compel organizations to cut out sources. The pivotal asset of open source investigations, and of the technologies, networks, and diffuse practices on which they rely, is that they enable advocates to remotely and cheaply access and analyze more information concerning human rights violations than ever before. Given sufficient ambient information to adequately verify user-generated content, advocates may not even need to cooperate – let alone communicate – with uploaders. The question is, should they? If advocates are not sufficiently intentional, there is a risk that open source investigations might be conducted and used in ways that sidestep communities on the ground; appropriate their data to support advocates’ own policy and advocacy aims; and/or accrue credit, recognition, and power to sites conducting investigations, largely concentrated in the Global North and West. In this scheme, content producers who risk their lives to upload content could be applauded and given deference through narratives that celebrate their acts of courage, yet actually be denied credit or decision-making power as to how their documentation is used and for what purposes.


Conclusion: Platforms, Participation, and Power in Distributed Social Knowledge Production

“In the Western conception, social media in America is like: we post photos of our food and our dogs and stuff.” It was finals week on the brink of winter break, and the Lab was practically empty. I sat with a veteran Lab student, gathered around the birch table where we had spent so many hours together, squinting our glazed-over eyes at satellite imagery in search of a familiar tower or building or bend in the road. The student was recalling an epiphany of sorts she had had when she first joined the Lab, when she was initially “shocked at how many people in Syria were using social media.” In Syria and elsewhere, she continued, this is where the information goes, because of state-controlled media and in-country restrictions on journalists and investigators. Numerous Lab practitioners had echoed this irony: that social media—associated so widely in the United States with frivolity—could in effect serve such crucial documentation and advocacy functions for individuals and communities ravaged by state violence and silenced by state censorship. “We see social media as like a secondary thing—for entertainment, for messing around when you’re bored, and that’s definitely not its purpose in a large majority of the world.” This shift in perspective towards social media content—of taking this content seriously and “treating it as real evidence”—is one she believed to be occurring elsewhere, too, like with “news agencies that have pioneered this [social media analysis and reporting] in the past” and now at the ICC and in the legal domain, thanks to the Lab and the HRC’s efforts to enhance the legitimacy and weight of social media content in court.

This dissertation began by noting that fact-finding, central to human rights advocacy and accountability mechanisms, is subject to two seemingly clashing demands. On one hand, it is crucial to their efficacy that investigation methods and reports be perceived as unbiased and rigorous in order to withstand scrutiny from critics, “name and shame” governments, compel international action and intervention, or hold up as evidence in court (Orentlicher 1990; Alston 2013; Moon 2012). This is quite a challenge, not least because the very act of conducting an investigation can be attacked by detractors as politically motivated. And yet, on the other hand, human rights investigations must appear to witnesses, donors, and wider publics as

enrolling the meaningful participation of survivors. NGOs conducting investigations, for instance, often position themselves as aligned with impacted communities, as bearing witness to their suffering, and as amplifying their stories (see Herscher 2014). “Many global human rights NGOs imagine their work to involve bringing facts from the ‘bottom’ to the ‘top,’ giving ‘voice’ to the ‘voiceless,’” writes Dustin Sharp (2016: 76). This allegiance to witnesses risks being seen as undercutting investigations’ political neutrality, while inviting impacted communities to participate more closely in research mechanisms may undermine their perceived methodological rigor (Land 2009, 2016).

Evolving practices in the human rights field can be seen as responding to this tension and thus serving as “tactics of credibility management” (Shapin 1995: 258). Scholars have identified numerous examples of this, including: (1) the emergence of a distinctive genre of human rights reporting featuring discursive appeals to scientism, legalism, and historicism (Moon 2012; Wilson 1997); (2) the professionalization and elite demography of NGO staff and human rights investigators (Land 2009; Sharp 2016); (3) the implementation of interview protocols to reduce bias and verify accounts (Orentlicher 1990); (4) the institutionalization of organizational review processes for reports (Land 2009); and, most recently, (5) the adoption of satellite imagery in advocacy and research, which projects an apparently objective, all-seeing “view from nowhere” (Herscher 2014; Rothe and Shim 2018).

With regard to these competing performances of “credibility” for different audiences—credibility as scientific “objectivity” (Daston and Galison 2007; Jasanoff 2012) and credibility as political allegiance to the grassroots—what resources and risks are introduced to human rights investigations by the availability of user-generated content and online open source information more broadly? At first glance, it would seem that the incorporation of these data via open source investigative techniques would hold the potential to exacerbate this bipolar tension, and perhaps beneficially so. Media portrayals characterize open source investigations on one hand as efforts to “help people tell their stories” (Fortune 2018) and as “giving a voice to victims” (The Economist 2018), and, on the other, as the specialized domain of “digital detectives” (Lapowsky 2019) dizzy from “geolocation vision” (Beauman 2018). In this way, the collection of user-generated, crowdsourced, and open source information is marshaled as proof of the participation of local communities in human rights investigations, while the analysis of these data is framed as entailing “rigorous, methodical research and verification methods” (Fortune 2018). Though there is nothing to prevent these two facets from co-existing, these dual representations of human rights open source investigations as participatory or expert do seem to promote clashing “sociotechnical imaginaries” (Taylor 2003; Jasanoff and Kim 2015; McNeil et al. 2017) of the key actors and methods they comprise and of the emancipatory political work they may or may not accomplish.

Though outside the scope of this dissertation, mixed presumptions about the veracity and validity of UGC are clearly relevant to evaluations of the credibility of online investigations relying on UGC and other types of open source information. UGC lacks many of the verification affordances of witness interviews, which are coordinated by trusted networks, co-produced, and often face-to-face, providing useful visual markers and cues (McPherson 2018; Thompson 2005). In gathering witness testimony, investigators ask follow-up questions to clarify witnesses’ stories, check for inconsistencies, and ask for additional information to verify reports. In contrast, UGC is typically created and uploaded to online platforms without the involvement of human rights investigators and typically stripped of metadata indicating the day, time, and location of capture. Additionally, UGC may be posted or shared to accounts without a trace of personally-identifying information; in these instances it may be hard to distinguish whether account operators are victims afraid for their lives, bots or pro-government trolls, or actively-reporting journalists opening a new account after their last was suspended or banned for graphic content.
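To make the point about stripped metadata concrete, the following is a minimal sketch, assuming the Pillow imaging library and a hypothetical local file, of how one might check whether a downloaded image still carries the capture-time and GPS fields that platforms typically remove on upload; for most platform-sourced media, such a check returns nothing.

```python
# Minimal sketch: report whether an image still carries capture-time and GPS EXIF
# fields. Assumes the Pillow library; "downloaded_frame.jpg" is a hypothetical file.
from PIL import Image
from PIL.ExifTags import TAGS

def describe_metadata(path: str) -> str:
    exif = Image.open(path).getexif()
    if len(exif) == 0:
        return "no EXIF metadata (typical of media re-encoded by a platform on upload)"
    fields = {TAGS.get(tag_id, tag_id): value for tag_id, value in exif.items()}
    capture_time = fields.get("DateTime", "absent")
    gps = "present" if "GPSInfo" in fields else "absent"
    return f"capture time: {capture_time}; GPS block: {gps}"

print(describe_metadata("downloaded_frame.jpg"))
```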

And yet, although UGC may be accorded little evidentiary weight in court (Hiatt 2016), be viewed with suspicion regarding its provenance, or contain blatantly ideological narratives undermining its apparent “objectivity” (Daston and Galison 2007; Jasanoff 1998), it is also the case that UGC may bolster the perceived credibility of human rights investigations. Visual UGC may support witness testimony and proffer compelling visual material which may be especially convincing (for better or worse, see Porter 2014). Though operating through a vastly distinct visual register compared with the apparently omniscient and distanced view of satellite imagery (Herscher 2014; Rothe and Shim 2018), “the handheld, homemade, low-resolution, and unpremeditated images of the Syrian revolution” (Mroué 2012: 23) and other conflicts marshal their own claims to authenticity. Separately, such material may comprise the only records of an event in the absence of surviving witnesses and other kinds of physical or documentary evidence. Such methodological constraints are endemic to human rights fact-finding; as SyriaTracker’s founder told me, you can’t conduct clinical trials in a warzone. Investigators make do. Though self-reported documentation online may well be incomplete, shared anonymously, and difficult to verify, it would be impractical – and indeed unethical – to ignore it.

There is also the question of the “expert” nature of open source investigations; this is another factor weighing into the perceived credibility of this field but beyond the ambit of this current study. Compared with the technoscientific optics of satellite and remote-sensing imagery analysis (Herscher 2014; Rothe and Shim 2018), the methodological reputation and credibility of open source investigations as an emerging field of practice is notably ambiguous and strategically contested by state actors. Although open source investigations do rely on specialized ways of seeing or “professional vision” (Goodwin 1994; Grasseni 2007; Vertesi 2015) particularly during geolocation, practitioners themselves downplay the novelty and necessary technical knowhow involved, depicting verification as “more about ‘journalistic hunches than snazzy technology’” (McPherson 2015c: 131, citing Turner 2012). One expert practitioner I spoke with emphatically denied comparisons between open source investigations and procedures of forensic authentication, describing the former as “an art, not a science,” and highlighting the necessary interpretation involved (although forensic authentication, too, entails interpretive discretion; Cole 2013). Indeed, practitioners pride themselves on the fact that the necessary methods and open source data are available to all; far from protecting whatever semblance of professional boundaries may be emerging, many embrace crowdsourcing as a democratic ethos. “Anyone can do it if you want to” (Pool 2018; cited in Lapowsky 2019).

But the legitimacy of the field appears to be on the rise, in part through efforts by practitioners, prosecutors, and other stakeholders. Traced predominantly to the context of journalistic reporting, the discovery and verification techniques documented in this dissertation and elsewhere (The Engine Room, Amnesty International, and Benetech 2016; Silverman 2014) have since migrated to investigations by NGOs and intergovernmental fact-finding bodies as well as legal fora including domestic courts and the ICC. Recent years have seen a storm of positive media coverage regarding open source investigations alongside a growth of volunteer practitioners and paid professional positions (see also Sienkiewicz 2014). Once ridiculed by CNN as a “stay-at-home Mr. Mom” (Pool 2018), Bellingcat founder Eliot Higgins has gained international prominence as a pioneer of the field. Indeed, Bellingcat itself has moved from a focus on journalistic coverage to formal accountability efforts and is partnering with the Global Legal Action Network (GLAN) to develop minimum standards for the collection and preservation of open source materials, as is the Human Rights Center at U.C. Berkeley with its project to design Open Source Investigation Protocols.

Macro-level examination of the evolving credibility of open source investigations—its professionalization, the standardization of its methods, and its discursive aspirations to legitimacy—merits further study, as does micro-level analysis of how investigation reports discursively negotiate the ambivalent authoritativeness of UGC and other open source material (e.g., anonymous reports). Although this dissertation neither pursued these exact lines of inquiry nor focused on issues of credibility per se, its examination of the heterogeneous preconditions and practices of knowledge production entailed in human rights investigations does bring empirical data to bear on the extent to which open source investigations live up to their popular portrayals as emerging domains of technoscientific expertise and, conversely, of participatory fact-finding.

With regard to the former, this dissertation highlights that while entailing specialized techniques and “ways of seeing,” open source investigations are also likely to rely on reports and information obtained through witness interviews, on-the-ground reporting, and other traditional information-gathering methods, particularly in contexts that are relatively media-sparse (e.g., limited internet connectivity and access, low phone ownership, etc.). In doing so, it extends scholarship noting the methodological limitations of remotely-conducted open source research and its tendency to disadvantage events that occur in more marginalized communities and contexts (Koettl 2017; Price and Ball 2014). Recognition that the “inputs” for open source investigations often include information that was itself collected via non-open source methods is another reason why human rights practitioners, organizations, and donors should remember “that at present, and probably for years to come, technological evidence will rarely represent a magic bullet in atrocity cases” (Whiting 2015) or, for that matter, in advocacy and non-legal accountability mechanisms.

At the same time, this dissertation also exposed limits to the promise of open source investigations as participatory endeavors that somehow devolve power back to local communities. Obviously, these investigations do rely crucially and thoroughly on user-generated content – whether eyewitness media documenting an incident itself, volunteered geographic information about the place where the incident occurred, or other types of data. Indeed, these investigations are crowdsourced and thus “participatory” by their very nature. And yet, the act of upload is precisely where participation might end in some cases, depending on the project or investigating organization. This is because the distributed knowledge configurations by which open source investigators procure UGC disincentivize direct communication with content uploaders. Establishing communication, let alone obtaining consent, is far from a routine practice at many investigation sites—whether due to organizational constraints (e.g., time) or the perceived security risks of contacting uploaders. This point is not to minimize forms of participation that are present and significant; the efforts and risks undertaken by those who record, document, and share information about these incidents are often no less than heroic. Rather, this point is to gesture to the ways in which open source investigations could risk the further exclusion of impacted communities from human rights advocacy and accountability while accruing power to institutions largely situated in the Global North and West that are conducting open source investigations and preserving UGC in large-scale archives (Sharp 2016; Aronson 2017; Bukovská 2008). Below, I continue to unpack these dynamics while offering broader reflections, limitations of the current study, and directions for future research.

“ACCIDENTAL ARCHIVES” OF ABUSE

Building on SOK scholarship probing the role of libraries, archives, and other sites of information access in shaping knowledge-making practices (e.g., Abbott 2011; Lemov 2015; Bowker 2006; Bowker and Star 1999), this dissertation sheds light on key affordances and constraints introduced by the use of commercial online platforms for social knowledge
production of various kinds. Advances in ICTs over the last two decades have dramatically transformed the volume and scope of data available – both public and proprietary – concerning practically all matters of human activity. The internet, “big data,”89 camera phones, remote sensing technologies, and social networking sites have given rise to new kinds and techniques of social knowledge production in the academy as well as the government, public, and private sectors (boyd and Crawford 2012; Lazer and Radford 2017; Golder and Macy 2014; Schäfer and van Es 2017). Social media sites in particular are utilized for research and intelligence gathering in settings ranging from insurance companies in the private sector (Chen 2019) to policing and military intelligence (Omand, Bartlett, and Miller 2012a, 2012b; Williams and Blum 2018) to humanitarian crisis response (Meier 2015). These research activities are raising critical social issues ranging from data governance and extraction to privacy and security, while compelling changes in what counts as knowledge itself and in how we ascertain, define, and measure it. As boyd and Crawford (2012: 665) write, “Big Data reframes key questions about the constitution of knowledge, the processes of research, how we should engage with information, and the nature and the categorization of reality.”

89 The term “big data” does not refer only to “big” datasets, but rather “to data that are so large (volume), complex (variety), and/or variable (velocity) that the tools required to understand them must first be invented” (Lazer and Radford 2017: 21).

Emergent knowledge production practices have thus not solely introduced new sources and storehouses of data, but also novel configurations, actors, techniques, constraints, and ethical dilemmas. Compared with thriving scholarship addressing these emerging phenomena across a spate of disciplines including geography (Sui, Elwood, and Goodchild 2013; Burns 2013; Dodge and Kitchin 2013), computer-supported cooperative work (Starbird 2012; Liu 2014; Leavitt and Robinson 2017), and media scholarship and journalism (Gillespie, Boczkowski, and Foot 2014; Bruns and Highfield 2012; Sienkiewizc 2014; Jarvis 2006; Almgren and Olsson 2015), the SOK sub-field has been slow to take up inquiry of, and integrate insights related to, such emerging knowledge-making practices. This is despite the fact that some scholars have argued that the emergence of big data and its use of computational methods pose grave challenges to the future of empirical sociology (Savage and Burrows 2007). For instance, Lazer and Radford (2017: 20)
found that just 15 of 422 articles published in the American Journal of Sociology and the American Sociological Review between 2012 and 2016 employed big data.

Previous work has detailed how platform design and architecture comprise affordances and constraints for various user and stakeholder groups, including journalists (Hermida 2010; Bruns and Highfield 2012; Murthy 2011), activists and networked social movements (e.g., Howard and Hussain 2013; Tufekci 2017; Freelon, McIlwain, and Clark 2015), and social media researchers and archivists (e.g., Summers 2019; Acker 2018). The present study extends this research into the work of human rights investigations, by shedding light on how aspects of platform design and architecture, in addition to algorithmic deployment and content moderation, influence what investigators, journalists, and wider publics can find out and know about human rights abuses and conflicts. They shape the extent to which these events and tragedies are visible or hidden from view.

Platforms’ use of algorithms to sort and surface information selectively, and their controversial content moderation decisions – particularly with respect to extremism and human rights causes – have dominated recent debates about platform governance and accountability. While continuing to position themselves as arbiters of the human right of free expression (Jørgensen 2018) and lauding their own roles in anti-government movements like the Arab Spring (Tufekci 2018a), Facebook and YouTube have in recent years faced growing accusations of enabling state violence, surveillance, and censorship on one hand, and impeding human rights investigations and accountability on the other. “Social media platforms have become accidental archives,” writes the Syrian Archive, “but takedowns have proven they are no place for long-term, safe storage of materials depicting human rights violations” (Syrian Archive 2019). A report by the United Nations Fact-Finding Mission on Myanmar found that hate speech on Facebook has played a “significant” part in the genocide against the Rohingya population (OHCHR 2018: 340-4; see Irving 2018; Kaye 2019).90 Depending on the case, human rights groups have variously
attacked the content moderation practices of Facebook and YouTube as negligent, overly aggressive, frustratingly inconsistent, and worryingly swift (Burrington 2017; Electronic Frontier Foundation, Syrian Archive, and Witness 2019; Access Now 2019a). Accordingly, the major platforms are being made to reckon with their roles as repositories of human rights-related documentation.

90 The report goes on to state that “Facebook has been a useful instrument for those seeking to spread hate, in a context where, for most users, Facebook is the Internet.” It calls the company’s response “slow and ineffective” and its failure to provide crucial country-specific data lamentable (OHCHR 2018: 340-4).

In such debates, the more subtle role of platform design and architecture in contouring access to user-generated content has received less attention, but this may be starting to change. In recent weeks, a contingent of human rights open source investigators has become more outspoken about their dissatisfaction with even modifications to platform features and functionalities. In the span of a week in June 2019, Twitter discontinued the capacity for users to geolocate their tweets and Facebook made changes to its Graph Search which in effect closed off previously relied-on tactics and threw a wrench in ongoing investigations, including research into airstrikes in Yemen (Shu 2019).91 Investigators immediately decried the modification on Twitter92 and in the press, charging Facebook with having “just blindfolded war crimes investigators” (Dubberley 2019; see also Silverman 2019). Bellingcat contributor Henk van Ess held a workshop entitled “Facebook: The Day After” to showcase “what is still possible” after “[o]pen source intelligence (OSINT) investigators lost some of their best tools this month.”93 The workshop was aimed at “news media, police, law enforcement, NGO’s, human rights watchers – so everyone who is working in the interest of the general public.” This outcry from human rights investigators
is notable in light of the fact that the Graph Search changes were aimed at patching existing privacy vulnerabilities, prompting one open source practitioner to question whether the modifications are, after all, a bad thing: “should graph search,” asks Tom Trewinnard (2019), “freely and publicly accessible as it was, have existed in the first place?” Accordingly, while this episode joins with other efforts by digital and human rights groups to assert themselves as important and visible stakeholders to platforms, it also appears to expose fault lines among practitioners regarding both the importance of their activities relative to user privacy protections and the ethical ambiguities inherent in open source practice concerning tensions between visibility and verification on one hand, and privacy and anonymity on the other. Similarly, Twitter’s removal of the ability to geo-tag tweets around the same time has been described as “a small win for privacy but a small loss for journalists and researchers” (Benton 2019). These tensions are mirrored in broader debates over the potential for privacy protections and legislation such as the General Data Protection Regulation (GDPR) to undermine “the public’s right to information” by enabling the obfuscation of “historically important data” (Kaye 2019: 24). Future research could fruitfully explore these tensions further.

91 Bellingcat. 2019. Twitter. June 8. Last accessed on June 9, 2019. Retrieved from: https://twitter.com/bellingcat/status/1137378085127512065. The Tweet read: “Hi everyone: we need your help! Facebook is making a change which will prevent us researching airstrikes in Yemen. We need to find as many Facebook posts as possible with your help! Read more here…,” leading to a Google Doc attached with instructions on how to discover and collect Facebook posts using a workaround tool www.whopostedwhat.com.

92 For instance, Eliot Higgins wrote that “Facebook’s recent changes to its search function has [sic] resulted in making it a lot more difficult to investigate war crimes that have been documented on their platform. Facebook appears to be very uncommunicative on the issue, compounding the problem for researchers.” Twitter. June 9. Last accessed July 9, 2019. Retrieved from: https://twitter.com/EliotHiggins/status/1137667250566193153.

93 van Ess, Henk. 2019. “Facebook ‘The Day After’/Special Social Media Workshop & Bootcamp.” Last accessed June 23, 2019. Retrieved from: https://www.eventbrite.com/e/facebook-the-day-after-special-social-media-workshop-bootcamp-tickets-64072736279.

The context of human rights investigations and media activism also suggests that platforms do not merely serve as information intermediaries to which researchers must constantly adapt their methodologies, but also as entities shaping the creation of the very content they host – in this case, visual and textual records of human rights violations, conflicts, and war. A co-founder of Liveuamap, a platform which curates, maps, displays, and preserves online conflict-related information using automated techniques, told me that the project mostly collects posts from Twitter as opposed to other sites like Facebook. As he explained it, this is largely because the short character limit forces concise, to-the-point reporting that is easier to analyze using machine learning techniques, whereas Facebook posts, he complained, are rife with opinion more or less irrelevant to the project’s conflict monitoring activities. Of course, this approach has downsides, he quickly noted: Twitter is blocked in China. Offering a microcosmic window into a much larger picture, this brief anecdote illustrates how one discrete aspect of platform design (i.e., character limit) has significant influence on patterns of use pertaining to the structure, format, and content of user posts, which in turn figure into the collection, analysis, and preservation decisions of human rights monitors and investigators.

Platforms appear to play a role at each stage of UGC’s trajectory in shaping what can be inscribed and known about human rights violations and conflicts. At the point of capture and upload, platform features, functionality, and lived usages help inform what users “upstream,” such as witnesses or perpetrators, create, what contextual information they include, and what they share and how, if at all. Platforms variously enable users to voluntarily annotate, (geo)tag, caption, and describe their media content while setting parameters on the type and amount of information to include. Moreover, documentation and posting practices are typically influenced by cultural norms regarding online sharing and benefit/risk calculations on the part of users (e.g., with respect to government surveillance on particular platforms). During discovery, platform content moderation practices and algorithms used to sort, retrieve, and recommend content based on influence and engagement signals shape what is circulated, accessible, and findable by investigators. In addition, some platforms lend themselves more than others to automated discovery and data collection techniques, which factor crucially into what human rights-related content investigators may access, analyze, and archive. It is then in the midst of UGC verification that platforms’ “epistemological challenges” (Daniels 2014; Schou and Farkas 2016), privacy controls, and processes and features enabling the disclosure or occlusion of user information and metadata come to matter. Lastly, understandings of platforms as unstable repositories of information vulnerable to government censorship and removal via algorithmic detection are driving supplemental archival mechanisms aimed at securely storing content for decades-long accountability efforts.

Accordingly, platforms do not only function as dynamic and ephemeral data repositories which – through their architecture, algorithms, and content moderation – influence what information can be known and preserved of conflicts and state violence, but also as “governors” of speech (Klonick 2018; Zarsky 2014; Gorwa 2019) which – through their design and the lived, vernacular usages they support – encode and inform what information is recorded and shared, and how. In this sense, Facebook, Twitter, and YouTube are both less and more than archives, in the broad sense of an archive as a “repository and collection of artefacts” (Manoff 2004: 10). As information storehouses, platforms are remarkably unstable and dynamic (Walker 2017). Whether they have always known this or learned it over time, the human rights open source investigators I’ve spoken to by now recognize the ephemerality of UGC on YouTube and Facebook, particularly content likely to be deemed graphic, “terrorist,” or extremist, which often overlaps with documentation sought by human rights groups. Moreover, in contrast with archival best practices, consent has neither been obtained for the use and storage of content, nor has information been pre-verified before being uploaded. At the same time, platforms go beyond the functions of archives by formatting the creation of the content they make available both through technical requirements and via more interactive, cultural, and “user”-driven grooves.

UNCOORDINATED COORDINATION IN DISTRIBUTED KNOWLEDGE WORK

There is an extensive body of scholarship on the kinds of distributed knowledge configurations made possible by digital technologies across disciplines and use cases, including participatory media cultures, journalism, and crisis mapping. Indeed, there has emerged an entire discipline dedicated to this topic: computer-supported cooperative work (CSCW). Various terms have emerged to describe these knowledge configurations, which can be highly diffuse—scattered across far-flung actors—and yet seemingly seamless. Ritzer and Jurgenson’s (2010) description of user-generated content as an example of “prosumption,” involving both production and consumption, has gained considerable traction. In the journalism context, Bruns (2008) uses the term “produsage” to refer to “the gradual and collaborative development of news coverage and commentary by a wide range of users voluntarily making small and incremental productive contributions to the whole, rather than the orchestrated production of news stories and opinion by small teams of dedicated professionals” (Bruns and Highfield 2012: 11-12). And, in the humanitarian, crisis-mapping context, Starbird (2012) draws on the concept of “crowdwork” to describe heterogeneous forms of online, crowdsourced collective action and the forms of labor, data, tools, platforms, and information architectures on which they rely. “Multiple crowdsourcing configurations are often needed to strategically leverage the people, the information, and resources that converge[] during crisis situations,” writes Sophia Liu (2014: 390).


While the particular structure and architecture of platforms matter greatly in supporting or constraining what kinds and configurations of distributed knowledge work can take place (e.g., the affordances of hashtags for news-gathering, Bruns and Highfield 2012), it is important to notice the baseline coordination work that platforms do as places of assembly – at least for open source work. Compared with traditional human rights fact-finding privileging face-to-face interviews with witnesses, open source investigators need never meet with, much less contact, content creators and uploaders. At first glance, open source investigators might appear to be magically summoning specks of information dispersed and unsorted in the deepest crevices of the internet. This isn’t quite accurate. Investigators work backwards, imagining the upstream user, and in this, platforms and websites carry a hefty portion of coordination work that is often invisible. Economics scholarship has examined online platforms as multi-sided networks or markets connecting buyers and sellers of goods and services, such as developers and advertisers (Gawer 2011, cited in Gillespie 2018: 21-22; Evans, Hagiu, and Schmalensee 2006). Multi-sided markets provide especially valuable gains to users in circumstances where coordination between them would be difficult or impossible. Think of Airbnb hosts and guests, or drivers and passengers of ride-sharing applications. In the context of human rights open source investigations, platforms like Facebook, Twitter, and YouTube solve immense coordination problems between extant knowledge producers, bridging human rights researchers with witnesses and other kinds of actors separated by time, physical distance, and socioeconomic and cultural divisions. Despite the privacy, security, and verification issues entailed in using these platforms, many activists and local news agencies continue to post there publicly to capture large audiences, particularly if they have gained a large following. Investigators can subscribe to uploaders, message other users, receive notifications, and monitor unfolding events in real-time. They can do this remotely, freely (or cheaply), and covertly—for better or worse. Platforms also do crucial coordination work by connecting investigators to less-suspecting and less-visible content uploaders, including bystanders, anonymous accounts, and perpetrators themselves.

In addition to (and interwoven with) this baseline coordination work that platforms accomplish, there are various other kinds of coordination practices that occur or have potential to occur between knowledge producers and users connected by platforms over time. We may
call these “coordination” attempts, but they can in fact be one-sided and anticipatory: they do not necessitate any intention to cooperate on the part of the targeted counterpart. Alternatively, then, they might be called uncoordinated forms of coordination, or perhaps strategies of alignment or articulation.94 As an example, scholarship has pointed to ways in which website operators position themselves to be discoverable by would-be search engine users downstream. Gillespie (2014) and others have described tactics used by websites and search engine optimizers to become “algorithmically recognizable”; that is, to be easily discoverable in internet search results. Similarly, Golebiewski and boyd (2018: 1) describe strategies by information producers to exploit “data voids,” “search terms for which the available relevant data is limited, non-existent, or deeply problematic.” These visibility tactics are aimed at intercepting particular users but do not require the latter’s awareness or intentional cooperation.

94 Anselm Strauss’ notion of “articulation work” (1985; see also Timmermans and Freidin 2007; Hampson and Junor 2005) comes close to capturing this phenomenon. The term refers to important aspects of labor or organizational process that serve coordination and integration functions often rendered invisible. Articulation is defined as “the agreements established among various actors within and between departments (or other sub-units of an encompassing organization)” (Corbin and Strauss 1993: 72). At the same time, in their conceptualization, articulation work also requires interactional tactics before and during work tasks, such as “negotiating, making compromises, discussing, educating, convincing, lobbying, domineering, threatening, and coercing” (73). Because the inclusion of discovery and verification subsidies does not feature this close level of interaction, or “working things out,” I do not use this term.

As it turns out, the field of human rights open source investigations is a site replete with these and other kinds of signaling practices, anticipatory moves, and coordination attempts crisscrossing all stages of UGC: its creation, collection, discovery, and verification. Witnesses, bystanders, local news agencies, or media networks may index and tag their visual media and posts using keywords, captions, and descriptions they anticipate will enhance their discoverability by journalists and investigators, their “imagined audiences,” while investigators, in turn, attempt to anticipate the posting practices, platforms, and keywords of users upstream. Whether or not these choices are meant intentionally as visibility tactics or as reflections of “archival consciousness” (Center for Spatial Research 2017), they do comprise “discovery subsidies” for investigators, similar to McPherson’s (2015a) notion of “verification subsidies,” which refers to
the kinds of information and labor that media producers and other actors can provide to facilitate the verification of UGC in the context of human rights advocacy and investigations. Such coordination strategies are not simply learned online, but imparted to would-be content creators, uploaders, and investigators in field guides, manuals, trainings, and even direct requests (Sienkiewicz 2014).
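As a hedged illustration of what this keyword anticipation can look like in practice, the following toy sketch (with invented search terms rather than any documented Lab procedure) mechanically combines place names, event terms, and dates into candidate queries of the kind investigators paste into a platform's search box:

```python
# Toy sketch of keyword permutation for discovery. All terms are illustrative;
# real searches also cycle through local-language spellings and colloquial place names.
from itertools import product

def candidate_queries(places, event_terms, dates):
    """Yield keyword combinations to try in a platform's search box or API."""
    for place, term, date in product(places, event_terms, dates):
        yield f"{place} {term} {date}"

for query in candidate_queries(
    places=["Saada", "Sa'dah"],                # alternative transliterations
    event_terms=["airstrike", "غارة جوية"],    # English and Arabic event terms
    dates=["9 August 2018", "2018-08-09"],     # the same date written two ways
):
    print(query)
```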

Building on the previous section, one takeaway from these coordination practices is that they do not merely shape data collection and analysis, but may also leave traces on the creation, structure, and content of the data itself – in this case human rights-related documentation. A large body of scholarship in the sociology of scientific knowledge and in critical data studies has sought to demonstrate how so-called “raw” data are always “cooked” (cited in Bowker 2013; Gitelman 2013). For instance, numerical counts and indices, statistical analyses, and the constitution of big data are typically accorded a semblance of objectivity not extended to qualitative research, but are nevertheless (and necessarily) based on judgments, classifications, and other decisions (Porter 1996; Martin and Lynch 2009; Bowker and Star 1999; Merry 2016; Zuberi 2003). “Data are not found,” reminds Halavais (2014), “they are made.” The incorporation of discovery or verification subsidies within or affixed to UGC suggests that artefacts of platform architectures – by virtue of the discovery and verification practices they demand and have helped shape – may be imprinted onto born-digital documentation of human rights abuses. Thus, even the most apparently “raw” and unmediated footage may nevertheless be invested and indexed with visibility signals and verification subsidies meant for audiences to find on platforms and verify using open source methods. Such traces embedded in media content and metadata are an example of how “data creation is a process that is extended in time and across spatial and institutional settings” (cited in Crawford, Miltner, and Gray 2014; Helles and Jensen 2013). These practices call attention to the various kinds of labor, coordination, and subtle signals that can underlie distributed knowledge configurations.

An important risk, however, lies in mistaking content, metadata, or its circulation for intentional acts of collaboration and consent. Indeed, platform design and architectures here, too, play a role in enabling certain kinds of coordination and information relay while discouraging others. Whereas YouTube prompts users to title, caption, and describe the videos they upload, it does not invite users to indicate permissions for use – as has been suggested by scholars and practitioners working with human rights- or social movements-related UGC (Gregory 2012b; Summers 2019). Such design decisions have implications for human rights investigators and social media researchers in discerning the consent of content uploaders and the security risks of collecting, preserving, and using UGC. Consent-related challenges are discussed further in the final section below.

CREDIBILITY AND VERIFIABILITY IN THE CROWDSOURCING COMMONS

In an article noting the climbing significance of credibility as an object of inquiry in the social studies of science from the 1970s onward, Steven Shapin (1995) cautions future researchers against attempts to develop a grand theory or extensive list of the factors and preconditions necessary to achieve credibility in any particular context. Theoretically, “there is no limit to the considerations that might be relevant to securing credibility” (260). These factors might include “[t]he plausibility of the claim; the known reliability of the procedures used to produce the phenomenon or claim; the directness and multiplicity of testimony; … the personal reputation of the claimants or the reputation of the platform from which they speak,” and so on (260).

The verification of UGC gleaned online relies on quite different techniques than those used to assess the credibility of witness statements elicited by researchers, whether in person or remotely. In that case, researchers may determine their sample of witnesses, ask follow-up and clarification questions to check for inconsistencies, and rely on visual communication cues that subtly convey markers of poor memory or intentional deception (Orentlicher 1990; Boutruche 2016). What kinds of signals or circumstances contribute to the perceived credibility of UGC and other kinds of online information to human rights researchers or journalists conducting open source investigations? This question is particularly relevant, given the ethos of pluralism and grassroots participation embraced by human rights open source investigations and crowdsourced projects.

This point was driven home to me during a one-on-one training early in my fieldwork. I sat side-by-side with a student manager, who explained to me how Wikipedia was employed as a go-to site at the Lab for anything military-related: weapons, insignia, vehicles, uniforms. “I use
a lot of blogs in my research, especially those that are about the military, writing about military groups and military movements. On sites like these and Wikipedia, it’s the most information… Of course, you’ll need to take it with a grain of salt, since anyone can edit it. But that’s also the benefit.” She added,

People say to not use crowdsourcing sites, but in this line of work, these are often the most reliable. So much information is crowdsourced—there’s not one person who knows this stuff, like about the military. As so many people are sharing information online, Wikipedia provides the platform. Almost everyone in the lab uses Wikipedia in that kind of way.

The student pulled up another site to show me, Global Security.95 I glanced at a large Papa John’s advertisement looming at the top of the site and the clip art-like cartoon animation occupying the right pane of the window. Anticipating my skepticism, she pointed to her screen and admitted, “although dot org is usually in general more reliable, this site clearly does not look legit.” Gesturing again to the screen, she pointed out that apparently 24 articles were published on the site just yesterday. So, sure, “this website isn’t ideal—but there’s so much information coming out of it that it’s just not practical to ignore. Maybe that’s just my justification for using it. This site is seemingly unreliable, but in the case of these projects they’re almost more reliable.”

95 https://www.globalsecurity.org/

How is reliability configured and constructed in open source, crowdsourced investigations? Although this study did not capture the processes and standards used to make ultimate determinations of credibility (i.e., those necessary to decide whether to publish a photo gleaned online in an investigation report), it did outline procedures and signals used to assess the reliability of content during the verification process. These include the application of online tools to check for earlier or similar instances of content online (e.g., reverse image search) as well as a range of ad hoc steps requiring varying degrees of discretion and interpretation. Because verification of UGC relies on cross-corroboration with other reports and types of information more or less available with respect to different conflicts and contexts, this study suggested that
UGC verification outcomes may be shaped by broader social and structural factors, as opposed to simple maneuvers designed to include verification subsidies within or affixed to UGC.
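One way to make the check for “earlier or similar instances” of content concrete is perceptual hashing, sketched below. This is an illustration rather than the Lab’s actual tooling, which leans on commercial reverse image search engines; it assumes the Pillow and imagehash libraries, and the file paths are placeholders.

```python
# Illustration of checking for earlier instances of an image via perceptual hashing.
# Assumes the Pillow and imagehash libraries; file paths are placeholders.
from PIL import Image
import imagehash

def earlier_instances(candidate_path, archive_paths, threshold=8):
    """Return (path, distance) pairs for archived images within `threshold` bits."""
    candidate = imagehash.phash(Image.open(candidate_path))
    matches = []
    for path in archive_paths:
        distance = candidate - imagehash.phash(Image.open(path))  # Hamming distance
        if distance <= threshold:
            matches.append((path, distance))
    return matches

# Has this "new" upload circulated before under a different description?
print(earlier_instances("new_upload.jpg", ["archive/2017_clip_frame.jpg"]))
```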

At the time of writing, verification undertaken by human rights investigators is quite different from digital forensic authentication or approaches in computer science (Farid 2016) in that it centers on whether UGC is misattributed, as opposed to manipulated or doctored. Verification methods are not exhaustive, but are instead tailored to the types of misinformation investigators most expect to encounter. Practitioners consider misattribution to be the most pervasive kind of misinformation, and are thus not (yet) deeply concerned about synthetic media that could be circulated to deceive viewers—though it is plausible that open source investigators would adapt their practices should this change.

In addition, although open source investigations typically cast a wide net, including pieces of content for consideration irrespective of the status or identifiability of the source, source credibility may also enter into investigators’ assessments of content. In particular contexts, local sources report actively and are held in high regard, known to pre-verify UGC they may be re-sharing. One visiting practitioner said of a Syrian media aggregator that “we know that Shaam uploading content is a reliable source.” “Media organizations [in Syria] actually see it as a badge of honor putting out real content, they recognize so much fake news and misinformation is flying around.” Additionally, while they do not constitute deciding factors in whether or not to trust UGC, the breadth and depth of an account’s digital footprints and its disclosure of identifying information contribute to credibility assessments. Compared with a “Twitter egg,” a social media “verified” (blue check-mark) account with corresponding profiles, images, and types of content across different platforms and websites may be evaluated more favorably. Moreover, the apparent proximity of an account operator to an event bolsters its perceived credibility (and utility) for the open source practitioner. “You want to know about these events happening in different parts of the globe,” one student manager put it. “In that situation, what better information is there to use?” Channels and accounts with frequent posts of a similar type and region are more reliable and useful than a sparse and seemingly random collection of media or posts by what appears to be a “scraper” account operated remotely from the location of an
event. Given the wide variability of credibility estimates ranging from project to project or UGC to UGC, it is difficult to outline patterns beyond these general observations.

These signals point to ways in which institutional status and trust do operate in the evaluation of crowdsourced information. At the same time, there was widespread recognition at the Lab that accounts and media agencies contributing reporting on human rights violations are not politically “neutral”; visiting practitioners and trainers would remind students of this fact, especially in longstanding conflicts such as the Syria war, where no one is—or can be—“unbiased.” Although the reporting of certain groups was seen as more trustworthy than others, naive appeals to source “objectivity” were non-existent. This was in part because a source’s political bias was irrelevant to UGC verification. For instance, to verify or geolocate some piece of content, a student could just as usefully draw on footage filmed by a media activist, state-sponsored media, or Russia Today to cross-corroborate a narrative or identify the location of an attack. Indeed, at the Lab, where students’ projects were confined to UGC discovery and verification as opposed to the much more daunting task of establishing the actors and sequence of events in an actual incident, evaluations of source credibility were almost entirely set aside.

But further research should more carefully examine how the utility, reliability, veracity, and credibility of content and sources are evaluated and achieved in such crowdsourcing projects, and in particular how the status and reputation of information sources vary depending on the context and actors. Such research would shed light on “credibility-economies” on the web and across specific “vectors”—e.g., particular combinations of information producers, collectors, analysts, and so on (Shapin 1995).

STUDY GENERALIZABILITY AND LIMITATIONS

Of crucial significance to my analysis in this dissertation is my predominant focus on international, large, and/or relatively visible human rights groups or investigative sites conducting data collection, analysis, and archival activities remotely and allegedly on behalf of impacted communities. Consequently, this dissertation is positioned from the vantage point of elite and relatively well-resourced institutions in the Global North and West. This positionality has many implications for my findings. For instance, in outlining key factors which might prevent
UGC from being produced or getting out of the country, I relied almost exclusively on open source reports on local circumstances in Syria and Myanmar, as well as statements by journalists and researchers with experience reporting on various conflicts. I did not talk to individuals with direct experiences of these events. In addition, my focus on efforts to apply open source investigation techniques remotely and without the direct collaboration of those directly impacted by the events under investigation is what drives my preoccupation with issues of consent, ethics, and data extraction. This dissertation did not address community-led initiatives to collect, use, and preserve eyewitness media for advocacy, accountability, or memorialization. As UGC discovery, verification, and archival tools and techniques become more widespread and accessible, there would seem to be greater possibilities for local communities to have increased participation in and control over these projects (Land 2016). For instance, scores of web-based archives have emerged in relation to the 2011 Egyptian Revolution. One estimate places this count at 150; these include the 858 archive and the Vox Populi project.96 We would expect these grassroots initiatives to face different challenges and calculations than those of the institutions described in this study. Given that partnerships between local communities, technologies, and researchers are often proposed as a way to rectify stark power imbalances in human rights and social movements research (Jules, Summers, and Mitchell 2018), more research is needed on these initiatives. Furthermore, most of my examples were reflections of the geographic focus of the Lab and other organizations in this field, such as Syria and Rakhine State in Myanmar. Given that internet access and restrictions, platform usage, and volumes and types of open source information and UGC vary greatly across contexts, future research should also examine similarities and differences in the dynamics identified here across different contexts.

96 Baladi, Lara. 2016. “Archiving a Revolution in the Digital Age, Archiving as an Act of Resistance.” Ibraaz. July 28. Last accessed July 6, 2019. Retrieved from: www.ibraaz.org/essays/163.

In addition, rather than examine the expanding gamut of open source investigative techniques, this study focused on practices increasingly employed to manually discover and verify visual UGC in journalism and in human rights institutions, including NGOs, academic centers, and international fact-finding bodies and courts. Though this study does not address automated techniques being developed and implemented to mine, scrape, and analyze large
volumes of online content (e.g., Meier 2015; Aronson 2018b; Center for Investigative Journalism 2018: 42-45), many of the observations described here concerning content discoverability, verifiability, and consent are applicable more broadly to other domains of social media research. Moreover, although the network of individuals and groups specializing in the use of these techniques for human rights advocacy and accountability is quite small and densely connected, this field is expected to grow considerably, and there is a much larger scope of individuals, organizations, companies, and state agencies adopting similar techniques to the ones described here for a diversity of purposes.

Another issue of generalizability relates to the skill level of Lab students. Though it can be said that many of the Lab practitioners I observed and worked with were novices, there is no credentialing program or required formal instruction (yet) for open source investigations. The techniques are relatively recent, still evolving, and many expert practitioners began, too, as novices and picked up tricks and tips along the way by trial and error, reference to resources and manuals, or the help of colleagues or fellow hobbyists. Indeed, relative to the range of trainings and workshops currently available to learn and practice open source investigations, the training opportunities, hands-on experience, and professional development offered by the Lab to its students are perhaps unparalleled. “There are very few universities training students with the range of skills and level of experience that would make them useful to us straight out of university,” said Bellingcat’s Eliot Higgins. “The Human Rights Investigations Lab is producing exactly what we’re looking for” (Kell 2019). Lab graduates, some of whom I worked with closely, have been awarded placements to conduct, teach, or support open source investigations at The New York Times, Bellingcat, Amnesty International and its DVC node at the University of Hong Kong, and Harvard University’s Disinformation Lab. This dissertation drew heavily on trainings by and interviews with expert practitioners, including former and current employees (or contributors) of Bellingcat, Amnesty International, the Syrian Archive, The New York Times, WITNESS, Meedan, the International Criminal Court and a dozen or so other news or human rights organizations.


DIRECTIONS FOR FUTURE RESEARCH: CONSENT, SURVEILLANCE, AND POWER

One central finding of this dissertation is that, absent explicit permissions for use, many human rights open source investigators collect and preserve content without reaching out to content creators or uploaders to verify information or obtain consent, whether due to organizational constraints or fear of endangering content uploaders further. Without at all minimizing the legitimacy of security risks as a grave concern, the systematic collection and storage of information, in some cases en masse, without users’ awareness or consent seems troubling from an ethical standpoint, particularly if the content will be re-shared or used one day in court in ways that could compromise the anonymity and security of the uploader or the people depicted within it. “It may be unreasonable to ask researchers to obtain consent from every person who posts a tweet,” write boyd and Crawford (2012: 672), “but it is problematic for researchers to justify their actions as ethical simply because the data are accessible.” Scholars have already pointed out the limitations and inappropriateness of social scientific informed consent protocols for social media research and these types of open source crowdsourced projects in particular (e.g., Land et al. 2012; Aronson 2018a), and a burgeoning literature is emerging to address these very questions (e.g., Fiesler and Proferes 2018; Zimmer and Kinder-Kurlanda 2017). It appears that the time is ripe for interdisciplinary, multi-stakeholder discussions (including content creators and uploaders) to better understand the stakes and tradeoffs of these collection and preservation practices with respect to consent, source protection, and ethical data use and sharing (Hu 2006).

This study also highlighted the pervasive perception that publicly available information online is fair game to collect and use without permission; this reductionist view is used to justify collecting all kinds of user-generated content, ranging from videos uploaded to YouTube by throwaway accounts to posts created by persons of interest and disseminated to their social networks on Facebook. Arguing that social media information defies simple categorization as public or private, many organizations and scholars have sought to differentiate “strictly publicly available content…clearly intended and available for everyone to reach and watch” (Privacy International 2019) such as information published in newspapers and articles, from content shared on social media, which may be publicly available yet not necessarily intended for the
general public—much less for law enforcement, intelligence agencies, private firms, or researchers. For instance, criticizing social media intelligence practices by law enforcement agencies, Privacy International, a London-based charity, has argued that international human rights standards and privacy protections should apply to the use of social media information, even if publicly available.97 And yet, the conceptualization of social media information as “open source” and thus “open season” (Graham Wood 2016) remains a productive equivalence for state and non-state actors wishing to collect, analyze, use, and preserve social media information as long as these activities remain largely unregulated (Privacy International 2019; Hill 2018). This dissertation found that human rights organizations conducting open source investigations might also appeal to views treating social media information as fair game to collect and preserve, regardless of organizations’ stated positions towards privacy. Echoing the tensions articulated above between user privacy and public access to information, there is arguably a need for human rights investigators to reflect on the extent to which they consider their practices a violation of the privacy of online users – and for what purposes and at what costs they are conducting this research.

97 The right to privacy as a fundamental human right is codified in the Universal Declaration of Human Rights and recognized as such in additional international agreements and treaties such as the International Covenant on Civil and Political Rights and the European Convention on Human Rights (Diggelmann and Cleis 2014).

This issue is compounded as human rights groups delve deeper into the game of bulk or mass collection and preservation, whether driven solely by an archival impulse that values preservation of material for memory and history, or also by the hope of one day marshaling it for accountability processes. Although some open source practitioners and online researchers maintained that the use of automated techniques to discover and collect UGC was not worthwhile for their work, which involved combing through information manually, a number of organizations already employ or are in the process of developing techniques to crawl and scrape the internet for potentially relevant information (e.g., Kaye 2019: 18), or download content in bulk from individual YouTube channels, Facebook pages, Twitter accounts, etc. Many practitioners themselves are grappling with the ambiguous ethical
ramifications of these practices. “How,” write Deutch and Halab (2018: 51) of the Syrian Archive, “do the ethics of scraping, parsing, and analysing content of thousands of pieces of digital content differ than say, those of state surveillance bodies like the United States’ National Security Agency (NSA) or its British counterpart, the Government Communications Headquarters (GCHQ)?” Similarly, one observer noted that “we’re heading towards mass automation, and that’s simply more mass collection and mass surveillance.”98
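For readers unfamiliar with what “downloading content in bulk” from a channel involves technically, a minimal sketch using the open source yt-dlp library is given below; the channel URL and paths are placeholders, and archival projects of the kind described here layer fixity checks, chain-of-custody logging, and access controls on top of such a loop.

```python
# Minimal sketch of bulk preservation from a single channel using yt-dlp.
# The channel URL and output paths are placeholders; real archival pipelines add
# fixity checks, chain-of-custody logs, and restricted access on top of this.
from yt_dlp import YoutubeDL

options = {
    "outtmpl": "archive/%(channel_id)s/%(id)s.%(ext)s",      # one folder per channel
    "writeinfojson": True,                    # keep upload metadata alongside each video
    "download_archive": "archive/seen.txt",   # skip items already collected on re-runs
    "ignoreerrors": True,                     # removed or private videos do not halt the run
}

with YoutubeDL(options) as ydl:
    ydl.download(["https://www.youtube.com/@example_channel"])
```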

Here it is possible to see the confluence and clash between two broader cultural and technological forces: on one hand, the proliferating impulse among human rights advocates, social movement scholars, and other groups to collect, use, and archive online open source information for accountability purposes and the public record (Currie and Paris 2018; Evans et al. 2015), and, on the other, growing awareness of the psychosocial harms, privacy and security risks, and appropriative concerns that doing so may entail, particularly when individuals and communities captured in the archive are historically marginalized or vulnerable groups (Ng 2015; Zimmer 2015; Jules, Summers, and Mitchell 2018; Aronson 2017; Summers 2019).99

98 Indeed, open source investigative groups may themselves partner with private companies working with domestic law enforcement, military, and intelligence, or may rely on remote-sensing technology companies that do so; the Carter Center’s Syria Conflict Mapping Project, for instance, was reported to partner with Palantir, a data analytics firm working with many branches of the United States government (Livingston 2016).

99 Teju Cole (2016: 199) has written poignantly of photographic memory’s “menacing side”: “There is so much documentation of each life, each scene and event, that the effect of this incessant visual notation becomes difficult to distinguish from surveillance.” “Little wonder, then,” and especially in light of governmental and commercial data collection practices, “that many people would like to be less visible or wish their visibility to be impermanent or impossible to archive.”

Future research might fruitfully explore the apparent ethical ambivalences entailed in human rights groups’ growing appetite for collecting and preserving documentation of abuse. In an article proposing the notion of a “dialectic of surveillance and recognition,” Tom Boellstorff (2014) suggests that in the age of big data, recognition and belonging might not just be tied to the act of creating data, but to being caught in the surveillance apparatus. Citing emphasis by Kate

98 Indeed, open source investigative groups may themselves partner with private companies working with domestic law enforcement, military, and intelligence, or may rely on remote-sensing technology companies that do so; the Carter Center’s Syria Conflict Mapping Project, for instance, was reported to partner with Palantir, a data analytics firm working with many branches of the United States government (Livingston 2016).
99 Teju Cole (2016: 199) has written poignantly of photographic memory’s “menacing side”: “There is so much documentation of each life, each scene and event, that the effect of this incessant visual notation becomes difficult to distinguish from surveillance.” “Little wonder, then,” and especially in light of governmental and commercial data collection practices, “that many people would like to be less visible or wish their visibility to be impermanent or impossible to archive.”


Crawford and others to consider questions of exclusion and representation with regard to the creation and analysis of big datasets (Crawford 2013; see also boyd and Crawford 2012), Boellstorff points out that “many responses to the making of big data are implicitly calls not for its abolition, but its extension”—that is, efforts to include and reflect the experiences of marginalized communities in the data that is collected. This point seems relevant to the dynamics playing out in the human rights context, where impacted communities are encouraged to take up cameras, document their tragedies, and contribute to the “participatory panopticon”100 (Gregory 2012a), and where advocates and open source investigators are deputizing themselves as stewards and safeguards of this content.

There is a second application of Boellstorff’s “dialectic of surveillance and recognition” here, too, which instead relates to political and financial incentives for human rights groups to be the ones collecting and preserving UGC. Numerous practitioners noted that despite calls for greater data sharing, organizations are often reluctant to share UGC and other information they collect with others. Protecting sources is an obvious concern, but many individuals I interviewed instead attributed this unwillingness to the need to remain attractive to donors and thus receive funding, as well as to secure a seat at the table in future accountability efforts. Accordingly, NGOs may be seeking their own recognition in adopting ethically ambiguous surveillance tactics—not merely with regard to UGC posted by the supposed witnesses they position themselves as supporting, but also with respect to persons of interest they would seek to expose or prosecute.

100 Sam Gregory adopts the term “participatory panopticon,” coined by futurist Jamais Cascio, to describe the scenario in which “everyone participates in ‘watching’ each other and the state” (2012a: 518; Cascio 2005). As opposed to surveillance (“sight from above”) in which the state is imagined to harness visibility as a means of exercising power vis-à-vis actual or anticipated monitoring, the participatory panopticon, Gregory writes, represents a “reversal of surveillance” in which non-state actors leverage emerging technologies to document injustice. “Th[is] phenomenon,” alternately referred to as “ubiquitous sensing,” “sousveillance,” and other terms, “is filled with emancipatory potential in terms of securing accountability for rights abuses,” writes Gregory (2012a: 518).


This dissertation pointed to two popular portrayals of open source investigations as emergent areas of technical expertise and/or as participatory forms of fact-finding which amplify the voices of those most impacted. There is a third reading of human rights open source investigations: as practices of counter-visuality which challenge states’ monopoly on surveillance and co-opt or repurpose surveillant technologies and techniques to “reverse the gaze.” Just as law enforcement agencies are adopting techniques of “social media intelligence,” so, too, are human rights and open source investigative groups taking to social networking platforms to covertly investigate persons of interest and trace criminal networks.

Social media intelligence, or “SOCMINT” (Omand, Bartlett and Miller 2012a, 2012b), is a tool used increasingly by domestic and foreign law enforcement agencies. A 2015 survey by the International Association of Chiefs of Police found that 96.4% of 553 domestic law enforcement agencies surveyed employed social media in some capacity, with criminal investigations being the predominant use (International Association of Chiefs of Police 2015). This includes reviewing suspects’ social media accounts (92.3% of agencies) and using undercover identities to monitor or gather information (67.2% of agencies). In addition, well over half of agencies surveyed used social media for intelligence, soliciting tips on crime, and “listening/monitoring” (see also Patton et al. 2017; LexisNexis 2014; Broussard 2015). Recently, the widespread practice of using fake accounts and aliases to “undercover-friend” persons of interest spanning gang members, teenagers, and political activists has come to light but remains largely unregulated (Hill 2018; Levinson-Waldman 2018). In 2018, Facebook deactivated numerous so-called “Bob Smith” accounts managed by the Memphis Police Department used to track and befriend Black Lives Matter activists.

Human rights open source practitioners themselves are aware of the correspondences between their practices and those of law enforcement, the military, and intelligence; they range, though, from sheepish and conflicted to confident and unapologetic. One Lab presentation made very explicit links between the work of Lab students, particularly geolocation, and the use of these techniques in military intelligence; during the presentation, a Lab staff member projected and read a job description for the position, “1N1X1—Geospatial Intelligence (GEOINT),” before playing a three-minute recruitment video for the Australian Air Force featuring

two young geospatial imagery intelligence analysts describing their jobs.101 Other times, lawyers refer nonchalantly to their application of “what law enforcement has been doing domestically [and] what investigative journalism has been doing around the globe.” Covert investigation into persons of interest is arguably driven largely by legal aims to ultimately take individuals to court. At the Lab, social media network analysis was conducted primarily on persons of interest on teams working with legal NGOs or specific legal cases. Pervasive and mostly unproblematized, this drive to prosecute is part of a broader historical drift of the human rights agenda towards a focus on anti-impunity and, in particular, criminal prosecutions (Engle, Miller, and Davis 2016; Alston and Knuckey 2016). The deployment of social media intelligence by human rights investigators for criminal prosecution raises questions around what differentiates their work from SOCMINT conducted by law enforcement (with the exception that NGOs do not have a legal mandate to conduct such investigations). Clearly, unspoken assumptions about the supposed social good and value of international criminal prosecutions of human rights violations are doing a lot of work in distinguishing SOCMINT in these two domains and in effect justifying or normalizing SOCMINT’s adoption in the human rights field.

It would thus be worthwhile for future scholarship to consider how open source investigative practices by large NGOs position them, perhaps unwittingly, in new alignments with the surveillance state and potentially alter their relations to the populations they are aiming to support. Andrew Herscher (2014) has argued that NGOs’ usage of satellite imagery for advocacy “bifurcate[s] advocacy” (476) into two effects: it is not solely aimed at raising awareness of crises and conflicts among what McLagan calls “witnessing publics,” but is also deployed as a way to discipline and deter perpetrating states vis-à-vis the Panoptical gaze of satellite surveillance. Herscher (2014: 473) coined the term “surveillant witnessing” to refer to the “hybrid visual practice that has emerged at the intersection of satellite surveillance and human rights witnessing.” Acting as quasi-intelligence agencies, NGOs utilizing remote-sensing technologies have taken up the production of geopolitical knowledge as a way of intervening in international relations and

101 Defence Jobs Australia. 2012. “Air Intelligence Analyst – Geospatial Intelligence.” June 19. Last accessed February 19, 2019. Retrieved from: https://www.youtube.com/watch?v=l0197yXA-40.

engaging in global governance functions once reserved for states (Rothe and Shim 2018; Hasian Jr. 2016; Aday and Livingston 2009; Witjes and Olbrich 2017). Thus, while satellite imagery has:

On the one hand…given human rights advocacy more powerful instruments to advance its ambitions; on the other hand, it has placed this advocacy in compromised relationships to the power structures it imagines itself contesting and in asymmetrical relationships to the human beings with whom it imagines itself in solidarity. (Herscher 2014: 475)

Compared with NGOs’ surveillance of perpetrator states using satellite imagery, social media intelligence-gathering on specific persons of interest constitutes a far more invasive form of surveillance. This brings questions of power back starkly into relief.

While observing how marshaling facts in support of a claim bolsters the legitimacy of that claim, Frédéric Mégret (2016: 28) suggests that “fact-finding is a form of power.” Although Mégret in this instance referred to fact-finding as an efficacious mode of asserting power, this dissertation, and this conclusion in particular, point to ways in which open source investigations involving the collection, analysis, and preservation of UGC by witnesses, alleged perpetrators, and other actors can be read as strategies to secure power. These include the adoption of SOCMINT practices by human rights groups, as well as the drive to collect and preserve UGC as a way to secure funding and a seat at the table in legal mechanisms and fact-finding bodies investigating or prosecuting abuses.

Some have questioned the extent to which dominant models of human rights investigations and advocacy actually accord meaningful participation and power to individuals and communities most impacted by state violence. Dustin Sharp (2016) has persuasively pointed out how fact-finding conducted by international NGOs serves to reinforce the goals, legitimacy, and power of the elite individuals and institutions conducting such research (see also Bukovská 2008).

Leaving aside more fundamental questions about the harms, risks, and ultimate purpose of collecting and preserving massive stores of open source information—which we can only expect future conflicts and wars to generate—it is clear that the capacity, resources, and technical support needed to amass, manage, and preserve this content are unevenly distributed among all

relevant stakeholders. Insofar as open source methodologies allow investigators to forgo collaboration—or even minimal interaction—with individuals or communities impacted by violence, the field must continually evaluate the degree to which its methods, workflows, and accountability processes resemble empowering or extractive models of research.


Methodological Appendix

This dissertation draws on ethnographic participant observation at the Lab and in-depth interviews with current and former Lab students and staff, as well as with expert-practitioners of UGC discovery and verification and of other open source investigative methods taught and employed at the Lab. Ethnographic fieldwork began in the summer of 2017 and extended to May 2018, a total span of approximately 10 months over three semesters: Summer 2017, Fall 2017, and Spring 2018. During this period, I spent time at the Lab Monday through Friday — whether attending Lab-related meetings or events, or working on Lab projects alone or with other students. Each week, I attended a meeting for Lab team managers (1-2 hours), the Lab lecture/seminar (2 hours), and group meetings or “Office Hours” for each team (2 hours each), of which 5-7 teams were active at any given time. These meetings accounted for between 14 and 18 hours each week. In addition, I worked on Lab projects each week with my assigned “buddies” and team partners, made progress on solitary verification work at home or at a café or library on campus, and regularly hung out in the Lab between Office Hours – conversing or co-working with students who came in to ransack the snack cabinet or to advance their Lab projects. Office Hours began as early as 9am and ran as late as 5pm. In total, I estimate that I was at the Lab or engaged in Lab-related activities on the Berkeley campus for a minimum of 25-30 hours a week, or approximately 1,000 hours over the course of 10 months. In addition to the Lab meetings described above, I attended Lab events held on or near campus, including happy hours, board game gatherings, movie nights, hack-a-thons, and team meals.

The Human Rights Investigations Lab was an ideal setting and “strategic research site” (Merton 1987) for undertaking an ethnographic study of human rights open source investigations. This was due to at least three important factors. First, HRC staff and its network of open source investigators, human rights prosecutors, and NGO partners undoubtedly constituted the “core set” (Collins 1981) of stakeholders in the field of human rights open source investigations. The HRC is at the forefront of collaborative efforts to establish international protocols for online open source investigations for use towards legal accountability. Koenig and the Lab are regularly featured in articles and commentary on applications of user-generated

content and open source investigations in human rights (e.g., Irwin 2019). Félim McMahon, the Lab’s director during my fieldwork, was an early member of Storyful before becoming the first-ever analyst specializing in open source investigations at the International Criminal Court; labelled the “world’s first social media agency,” Storyful is widely regarded as pioneering techniques and workflows for discovering and verifying user-generated content for legacy media outlets. In addition, the Lab is by far the largest and most active site participating in Amnesty International’s Digital Verification Corps; further, it is the only campus in the network of six to date to conduct open source investigations for partners other than Amnesty International and for legal projects, not solely advocacy. While Lab practices are far from encompassing the full gamut of current methods used to collect and analyze open source information for human rights causes, the techniques being applied are used by notable organizations conducting open source investigations on human rights violations, including Amnesty International, the Syrian Archive, the Carter Center, Bellingcat, Physicians for Human Rights, and others. These techniques are also used in journalism, by outlets such as The New York Times and the BBC’s new investigative unit, Africa Eye. The pioneering roles of the HRC and Lab in this emerging field provided me with invaluable context on efforts to advance the practice and applications of open source investigations, as well as incredible contacts in this field, including experts in open source investigations whom I reached out to for interviews during my fieldwork (more below).

The Lab’s educational and highly collaborative setting comprised a second factor making it an ideal site for ethnographic study. Despite the crowdsourcing dimensions of this work (drawing on content uploaded by others, or insights posted to a Twitter thread), the experience of performing social media discovery or verification is one undertaken individually and quietly. As one sifts through search results, pores over satellite imagery, or spends hours attempting to find a match to a vehicle or weapon canister, one is typically on a personal computer or desktop, working silently. This environment poses certain challenges to ethnographic observation; shadowing a sole expert-practitioner in these techniques, for instance, could require constant prodding and questioning to capture his or her internal frameworks and decisions. In contrast, the Lab provided constant opportunities in which best-practices in the field were articulated,

underlying logics were verbalized, and students’ doubts or challenges were raised openly. In other words, constant encouragement to learn, share experiences, and work together resulted in the externalization of key procedures, logics, challenges, and hesitations. Moreover, Lab students’ wide-ranging disciplinary and cultural backgrounds enriched discussions on the Lab’s goals, motivations for student participation, and applications of open source investigations.

Trainings and lectures were frequently offered on a spectrum of relevant topics, including open source investigative techniques such as social media monitoring, discovery, verification, and geolocation; research on exposure to secondary trauma from viewing graphic material, along with resiliency tips; and applications of user-generated content and verification techniques in media activism, journalism, and legal cases at the International Criminal Court. Techniques and tips imparted in these trainings matched those shared in resources and manuals, helping me to connect and situate the Lab’s practices within a broader emerging field of practice. In the preceding chapters, I draw liberally on these trainings to describe widespread techniques taught and used in the Lab.

I also gained exposure to others’ experiences of conducting open source investigative practices during group meetings and via the “buddy system.” Two-hour team meetings or “Office Hours” as well as weekly meetings for team managers typically began with a round robin in which individuals described their progress on their tasks for the week, including breakthroughs and obstacles they were encountering. Group brainstorming was invited as a common way to support each other and exchange ideas, tricks, or work-arounds that students and managers had discovered along the way. After the initial go-around, the remaining Office Hours could be devoted to any number of activities, including additional training, setting up students’ computers, discussing project parameters or updates, or simply time to work. During some meetings, thirty minutes or even an hour would pass without more than a few whispers traded among students working individually or in pairs. Yet, silent work time was often peppered with intermittent questions, ranging from requests for technical help or inquiries about the project at hand, to broader curiosities and non-sequiturs: Can someone help me log on to the VPN? Does anyone else’s computer get warm when using Google Earth? How exactly will this be used by the

client again? Does this water tower in this photo look to you like this thing on the map? Uh oh, this YouTube link doesn’t work anymore. Why are there so many women in the Lab?

Under the “buddy system,” students were paired up and encouraged to meet outside of Office Hours to work on Lab projects together. Instituted in Fall 2017 primarily as a way of supporting students’ mental health, the buddy system also benefited students’ work by generating opportunities for them to assist each other, whether brainstorming search strategies, providing another set of eyes to corroborate geolocation efforts, or helping to identify logical gaps or detect confirmation bias in each other’s work.

Crucially, the Lab’s offerings of educational resources and collaborative exchange strengthened my own familiarity and practice with open source investigative techniques, just as they were designed to do for other students. I began, as most ethnographers do, “assuming the role of the naïve learner” (Charmaz 2006[2014]: 39) but progressively developed skills and literacies apace with Lab students. In addition to participating in the trainings and talks described above, I was given one-on-one instruction early in my fieldwork and eventually provided assistance and tips to other students, both through partner work and through more formal presentations. Participating in projects across all of the teams, I gained substantial experience with all of the routine techniques used at the Lab. I contributed to Lab projects alongside other students, worked with my assigned buddies, and, during the Spring 2018 semester, even gained an entire “family” of fellow Lab participants (Lampros and Koenig 2018). The depth and breadth of my data collection and analysis would not have been possible without immersing myself to the extent that I did in Lab projects and teams; these efforts solidified my understanding of numerous investigative techniques, workflows, and key challenges and allowed me to be conversant with others about project dynamics and details; sensitized me to the affective dimensions and ethical considerations entailed in this research; and exposed me to patterns and problems that arise in the course of conducting online open source investigations of the kinds undertaken at the Lab. In this dissertation, I draw on examples from my own investigations as well as those of my buddies, team managers, and fellow Lab students. In addition, full immersion was crucial to building relationships of trust and camaraderie with Lab students, who were periodically under the gaze of journalists, consultants, and other researchers over the course of

my fieldwork. Staff and some team managers expressed respect and appreciation for my contributions on Lab projects, and many Lab students told me they perceived me simply as one of them, as opposed to others who dropped in to study the Lab over a shorter period of time without intensive participation. When I mentioned my contribution to Lab projects, one study participant remarked, “oh, so you’re actually an asset to the Lab.”

A third unique advantage of the Lab as a site to examine open source investigative methods ethnographically relates to the diversity of its projects. Over the course of my fieldwork, the Lab had eight separate teams working on projects with one or two NGO partners each. Lab projects served a mixture of outcomes, including short-term advocacy reports, current legal cases, and longer-term accountability efforts. While there was substantial overlap in the methods used across the teams, some teams and projects called for specific techniques more than others. For instance, Documenting Hate was a team active during the first semester of my fieldwork. Geared towards surfacing hate incidents unreported by mainstream media in the United States following the 2016 presidential election, the project consisted of a partnership with ProPublica, several other universities, and the collaborative verification platform Check. This team was particularly focused on discovery, as its aim was to find social media posts alluding to incidents of hate crime or hate speech that had occurred offline and in person. Because most of the posts were textual and did not have photos or videos attached, verification of the posts did not get very far at the Lab, which had a no-contact policy with content uploaders. Instead, students would flag content that seemed plausible and Documenting Hate staff would connect with local journalists to reach out to uploaders for additional context and information needed to verify the incidents. Whereas Documenting Hate focused largely on discovery, the team partnering with the Syrian Archive focused largely on user-generated content verification and geolocation, as well as the identification of chemical weapons, cluster munitions, and helicopters in visual content. For its part, the DVC engaged students in social media monitoring: several projects centered on multi-day monitoring of social media posts addressing election violence in the Democratic Republic of Congo and Togo. Legal and confidential teams often involved conducting social media network analysis on persons of interest, entailing combing through individuals’ posts, media, and online connections. Though unable to share substantive details

regarding these teams, I came away with a firsthand understanding of how cybersecurity threats shape organizational practices and drive attempts at establishing security protocols, procedures, and workflows on sensitive projects.

In addition to exposing me to a range of open source investigative methods, Lab projects spanned geographies and temporalities. Investigations addressed incidents and conflicts from around the globe, emerging from contexts with distinct levels of media coverage, digital access, and social media usage. Moreover, some projects centered on discrete incidents lasting one or multiple days, whereas several projects were focused on drawn-out conflicts, such as the Syrian civil war – characterized by strong documentary practices and a deluge, rather than a paucity, of user-generated content. These factors shaped the aims, scope, and processes of open source investigations in the Lab, as they do for other organizations working with user-generated content emerging from certain regions and conflicts. For instance, Lab investigations on events or actors in the Syrian conflict typically centered on discrete incidents and were organized around large databases and spreadsheets; in addition, one project with the Syrian Archive assisted its attempts to apply machine learning techniques to recognize objects in video content, such as helicopters and cluster munitions.

NEGOTIATING PRESENCE AS A PARTICIPANT AND RESEARCHER

At the start of each semester, I would introduce myself and my research twice in the Lab-wide lectures, and again in each team’s Office Hours. As per the University of Texas at Austin’s Institutional Review Board (IRB) guidelines and my own project protocol for obtaining consent (IRB Study Number 2017-05-0063), I disseminated my contact information, introduced myself, described my project, alerted students that I would be taking notes of ongoing activities, and requested that students reach out to me if they’d like to learn more or opt out of the project. During Lab meetings, lectures, and activities, I was constantly typing field notes on my laptop, consisting of statements made or questions asked; breakthroughs, steps, or challenges I was confronting in an investigation; and “jottings” (Emerson, Fretz, and Shaw 2011) to remind myself to elaborate on a notable occurrence in my notes at a later time. I refrained from writing sensitive information in my fieldnotes, whether it was asked to be kept off the record or related to confidential

and legal projects, in line with participants’ wishes and various teams’ non-disclosure agreements. I also refrained from taking notes on visiting practitioners who came to speak to Lab students but to whom I was not able to explain my research project and obtain consent for documenting and re-publishing their statements.

Whereas many ethnographers face difficulties stemming from being perceived as outsiders by study participants in their field sites, I had the opposite problem – I was constantly concerned about blending in to the degree that students would forget my role as an ethnographer. My clothes, backpack, and other features of my outward appearance were comparable to those of Lab students; despite being a decade older than most Lab students, I was perceived to be their age. To any external observer, my typing of field notes might be indistinguishable from that of other students working on their laptops to make progress on Lab projects or, as was often the case, to message friends or check their personal social media accounts. Accordingly, when appropriate, I took whatever opportunities arose in team meetings or smaller gatherings to differentiate myself from Lab students and allude to my role as researcher. Even while typing notes on Lab activities on my laptop, I strove to signal my note-taking through alert body posture, eye contact with the speaker, and typing at a volume I hoped would be conspicuous but not distracting. In addition, I started taking notes openly from the very beginning of my time in the Lab so as to establish myself as a note-taker and acclimate others to this role (Emerson, Fretz, and Shaw 2011: 37).

On weekends and while commuting back each evening to San Francisco from Berkeley on the Bay Area Rapid Transit (BART) train, I filled in gaps in my fieldnotes and wrote analytic memos both on topics I had identified for inquiry before beginning my fieldwork and on themes that emerged over the course of fieldwork. In the early stages of my fieldwork, I was extremely detailed in my accounts of discovery and verification; an hour of Lab work would easily fill 10 pages of fieldnotes. I generously elaborated on my confusions, frustrations, and uncertainties. In addition to these painstaking reconstructions, my fieldnotes were fairly unrestricted in scope in the preliminary stages of my fieldwork (Emerson, Fretz, and Shaw 2011). Over time, however, my fieldnotes on investigative work became more succinct, devoting more attention to developing events and conversations that addressed emergent key themes.


Despite much of my time being spent at the Lab or in adjacent buildings, the “field” had porous boundaries. First, much of the Lab’s internal communications, interactions, and investigations occurred online or in digital environments. Indeed, my deepening engagement with the Lab over the course of the year was reflected in the form of continual modifications to my computer and account settings, and of an accumulation of new software products and browser extensions. Lab projects are managed and executed through an orchestra of platforms and applications. Research itself spanned online sites and platforms, Chrome extensions, and software, including the usual suspects (YouTube, Facebook, Twitter) as well as lesser-known tools such as Tweetdeck, Banjo, and collaborative verification platforms like Check and TrulyMedia. Students created reports on Google Docs, contributed to Google Spreadsheets, annotated screenshots using applications like Skitch or Jing, and zoomed around on Google Earth (Pro), Bing Maps, or the TerraServer website to corroborate satellite imagery with photos or videos of incidents for geolocation purposes. While analyzing my data, I referred to organizational documents, training PowerPoints, and verification reports, documents, and spreadsheets I had contributed to. In addition, Lab staff, team managers, students, and even a handful of NGO partners corresponded regularly on Slack, a proprietary app used for communication and collaboration by groups and workplaces. A few Slack channels were devoted to Lab-wide announcements including upcoming events, news articles, resources, administrative matters, and postings for jobs and internships. In addition, team-specific Slack channels circulated project parameters, deadlines, reminders, relevant media, and questions or insights posted by students. Confidential and legal teams refrained from sharing sensitive information on Slack and instead communicated via end-to-end encrypted apps and email clients like Signal and ProtonMail. Attuned to concerns about cybersecurity circulating in the Lab, I modulated my documentation and storage of fieldnotes in accordance with the sensitivity of information as well as my IRB protocols. For instance, once I began participating in confidential projects, I removed my fieldnotes from automatic cloud back-up storage and saved copies manually on physical hard drives while also leaving out sensitive information (e.g., countries tied to legal projects, names of persons of interest).


The boundaries of my field site were also porous in the sense that many Lab-relevant discussions occurred outside of Lab-specific events on campus. Lab staff, particularly Koenig and McMahon, often attended conferences and workshops in the Bay Area and beyond. In addition, the HRC’s Twitter account regularly cited Lab media coverage, related events, and developments relevant to prevalent themes and activities in the Lab. Whenever possible, I attended events facilitated by Lab staff or students. Aside from the DVC summit held in July 2017, these included numerous panel discussions in San Francisco or Berkeley on issues related to human rights and technology. I also participated in panel discussions with Lab students or staff at RightsCon in Toronto, Canada, and the National Conference on Ethics and Archiving the Web. These events generated additional fieldnotes and insights while further situating Lab practices and positions on data usage, security, and ethics within a broader landscape of conversations about open source investigations in the human rights field. As secondary materials, I collected and archived news articles related to the Lab, technology, journalism, fake news, and human rights both during and since my fieldwork. These included articles I came across independently, through Lab channels on Slack, or on various newsletters with information relevant to the Lab that I signed up to (e.g., Bellingcat, Meedan’s Checklist, Harvard University’s Nieman Lab, Michael Bazzell).

In addition to participant observation, I conducted 30 in-depth interviews during the 2017-2018 academic year, each lasting between one and two hours, with Lab students (18) as well as Lab staff and experts (12) specializing in various techniques of open source investigations (e.g., social media discovery and verification, satellite imagery analysis). The following year, between June and August 2019, I conducted 20 additional interviews with human rights professionals and advocacy groups using open source investigative techniques to collect, analyze, and/or preserve human rights-related UGC and other kinds of online information related to conflicts and human rights violations. These were undertaken while at the Social Media Collective at Microsoft Research New England, where I was an intern under the mentorship of Tarleton Gillespie.

During each interview, I jotted down notes of salient themes and quotes; afterwards, I typed these up along with further reflections on issues raised in the conversation. In selecting students to interview, I aimed to capture diversity in terms of Lab experience (1-4 semesters; team managers and students) as well as disciplinary/professional backgrounds (e.g., computer

science, journalism, law, Middle Eastern studies). After I explained the project and obtained consent to interview, record, and publish statements, interviews with Lab students addressed reasons for joining the Lab, experiences conducting open source investigative techniques and working on Lab projects, attitudes with respect to cybersecurity and privacy before and since joining the Lab, and the challenges and rewards of participation. Initially, I had anticipated interviewing 30 students (a number selected as a heuristic), but I reached saturation much earlier. Moreover, my deep immersion in the Lab and considerable time spent with Lab participants reduced the need for interview data on topics and themes better (and already) captured through ethnographic data. In my write-up of the dissertation, I strove to anonymize students as much as possible by refraining from reporting identifiable information, except for a few graduate students who gave explicit consent to be named.

With respect to expert interviews, I selected individuals directly engaged in Lab projects as organizational partners or part of the Lab’s broader network, as well as experts in open source investigations or digital forensics without direct links to the Lab, such as Hany Farid. Expert interviews provided important context on emerging applications and initiatives leveraging open source investigations in human rights work. Collectively, these individuals comprise current or former staff/contributors at Airwars, Amnesty International (and DVC campuses at the University of Cambridge and University of Pretoria, besides U.C. Berkeley), Bellingcat, Carnegie Mellon University Center for Human Rights Science, the Harvard Disinformation Lab, the International Criminal Court, Liveuamap, Meedan, The New York Times, Physicians for Human Rights, Rohingya Today, Storyful, Syria Tracker, the Syrian Archive, and WITNESS.

As with Lab students, participants were asked for permission to interview, audio-record the interview, anonymize or disclose the individual’s name and institutional affiliation, and publish statements using direct quotes or paraphrasing. Interviews with experts covered their professional backgrounds and work with respect to user-generated content, open source investigations, or digital forensics, and interviews conducted during the summer of 2019 discussed their experiences and thoughts with respect to the removal of online information and platform content moderation policies and practices. As opposed to Lab students, experts I interviewed possessed diverse relationships to this constellation of topics. Accordingly, I tailored

my interviews more for expert interviewees than for Lab students, with whom I hewed quite closely to an interview guide. In cases where interviewees asked to review their quotes and attributions before publication, I fulfilled these requests and received prior approval for all quotes and statements published here.

With the exception of the last chapter, this dissertation drew far more heavily on ethnographic data of investigations and trainings than on interviews, although the latter were crucial for correcting hypotheses, integrating perspectives, and providing detailed descriptions of the emergence of open source investigative techniques and initiatives in the Lab’s network (Weiss 1994: 9-11). Because my chief aim was to provide an ethnographic account of open source investigative practices being incorporated into human rights work, my fieldnotes took precedence in my analysis while interviews played a supplemental role.

I undertook several strategies to analyze my fieldnotes during and after fieldwork. Over the course of the year, I periodically wrote and reviewed analytic memos to reflect on emergent patterns in my data, assess my understandings in light of new information, and detect possible blind spots (Charmaz 2006[2014]). Analyzing my ethnographic data in this way while conducting fieldwork was challenging due both to time constraints and, more importantly, the absence of proper distance to review my data and memos with mental clarity. Notwithstanding these limitations, periodic reflection and analysis on emergent themes were important in refining my research foci and strategies over the course of fieldwork and enabling me to iteratively collect data on these topics. I presented my early reflections on the relationship between open source verification and social disparities at the Chicago Ethnography Conference in the Spring of 2018; that analysis formed the basis of the second chapter in this dissertation. In addition, ethical considerations linked to open source investigative work emerged as a notable theme later in my fieldwork. Although I co-presented on specific case studies of Lab work in Spring of 2018 at the National Conference on Ethics and Archiving the Web in New York City, the third chapter in this dissertation takes up a broader analysis than that reflected in my presentation.

After concluding fieldwork, I undertook a systematic analysis of my fieldnotes, which by then had grown to a total of 1,388 pages (1.15 spacing and 11-point font!). During a period of

several weeks, I pored over fieldnotes in NVivo, a software product for qualitative analysis, and carefully coded my notes according to Lab team as well as more than a dozen themes, including Lab history/structure, techniques and logics, platforms and tools, etc. While coding in NVivo, I added to a separate Word document where I assembled an inventory and event log of notable quotes and examples, organized by month, to ensure these would not get buried in the NVivo codes. During this phase of analysis I further developed analytic memos, undertook mapping brainstorms to identify key themes and their interrelations, and consulted academic scholarship on emergent themes, including articles on open source investigations that had not yet been published when I started my fieldwork, as well as literatures that had not previously been on my radar but which resonated well with my ethnographic data, such as work on algorithmic selection and bias. At this time I also revisited my notes from interviews. Sections of interviews that addressed key analytic themes were re-listened to, spot-transcribed, and incorporated into my dissertation manuscript. Next, I narrowed my focus down to three main themes for which I had sufficient ethnographic or interview data and that captured what I believed would contribute the most novel and generalizable insights possible. I then developed these themes into analytic memos and eventually chapters. My use of empirical scope, novelty, and generalizability as selection parameters meant that many themes salient during fieldwork were not ultimately addressed in the dissertation. These themes include demographics and dynamics of the Lab itself, issues prioritized by Lab staff like resiliency, and concerns raised by students such as recognition and credit. The resulting three chapters – focused on discovery, verification, and some of the ethical and affective dimensions of this work, respectively – by no means provide an exhaustive account, but I hope they capture key themes and procedures that cohere enough substantively to contribute to a broader argument.

LIMITATIONS

Finally, several important methodological limitations bear mentioning. First, just as this dissertation does not represent all of the themes that arose as salient during my fieldwork, an unresolved concern is that the chapters are written with a skew towards privileging the voices of visiting trainers over the experience of students, who were largely the main drivers of the Lab. For instance, the preceding chapters allude often to trainings, given their

tendency to crystallize underlying logics and techniques in general terms helpful for explaining broader patterns and best-practices. By comparison, references to students’ work are less prominent, a tendency exacerbated by the need to anonymize students, which is not conducive to detailing events and examples involving them. The need to anonymize students also compelled me to refrain from addressing the Lab’s demographics and their role in shaping dynamics with respect to credit and recognition, for fear that raising these issues would jeopardize participants’ anonymity. The Lab was overwhelmingly composed of women, many of them with international backgrounds; for many, the Lab’s demographics were seen as an integral part of its ethos and operations. For instance, in an interview with me, HRC Executive Director Alexa Koenig linked the Lab’s demographic diversity to its prevalence of conversations about self-care, resilience, credit, and ethics (e.g., Lampros and Koenig 2018; Ellis 2018).

A second limitation of this dissertation, however, relates precisely to how this ethnographic account does largely reflect the positions of students and staff at the Lab, at the expense of highlighting the perspectives of other stakeholders. Though I draw on interviews with representatives of the Lab’s NGO partners, I did not conduct an in-depth or ethnographic study of how partners devise and outsource projects to the Lab, or make use of the Lab’s work product afterwards. While I index different stages and forms in which users’ posting practices and broader disparities impact the verification process, this analysis is conducted from the point of view of open source investigators or journalists discovering content online as opposed to that of content creators, uploaders, or would-be content posters. In addition, experts interviewed and literature cited in this dissertation largely echo the perspectives of practitioners or advocates of open source investigations. Aside from the self-critique introduced by these individuals, there are few outsider or critical perspectives in this dissertation in its current form. These gaps and blind spots could be fruitfully studied in future research.

Third, for an ethnographic study of techniques aimed at discovering, verifying, and geolocating visual content, there are almost no pictures documenting these or other Lab procedures. Despite obtaining IRB approval to take pictures at the Lab for publication, I captured only a handful of photos of Lab events or activities. Ultimately, I published neither these nor even a small fraction of the hundreds of screenshots I generated through work on Lab projects. Aside

of course from non-disclosure agreements on specific projects, my decision to refrain from publishing visual materials stemmed largely from two salient elements of the Lab’s atmosphere: significant media coverage and researcher presence, and perceptions of security risks to students and content uploaders. For having been in operation only a handful of years as of this writing, the Lab has gained considerable attention from a variety of media outlets, ranging from U.C. Berkeley-based coverage, to local newspapers such as The San Francisco Chronicle and Mercury News, to larger news organizations like PBS and The Economist. During my fieldwork, media coverage was so regular that I myself appear in television clips and short documentaries made about the Lab – asking questions of Lab students or diligently taking fieldnotes in a notebook. In addition, research was constantly being conducted on Lab students and processes. Numerous Lab participants chose the Lab as a basis for class projects or their Master’s theses, and outside researchers and consultants were given access to conduct contextual interviews and workflow mapping with students for use by Lab staff. In this context, I did not want to contribute to overt efforts to document the Lab for the sake of publicity or academic advancement. In addition, my experience conducting verification and being privy to constant conversation about possible security risks to Lab students and content uploaders increased my own apprehension about producing visual materials that could be triangulated with other materials to identify Lab participants (e.g., Lab media coverage, photos of students on the Lab website, student names on Lab reports, students’ self-promotion as Lab members on LinkedIn). This was a concern particularly for students based abroad or with families abroad, whose participation in the Lab could lead to them being targeted by governments or third parties. As one such student remarked, all of the tools used in the Lab to scope people out are also employed by authoritarian governments. Whether or not my level of apprehension was appropriate, these considerations about privacy and security curtailed my visual documentation of Lab practices.


Works Cited

Abbott, Andrew. 2011. “Library Research Infrastructure for Humanistic and Social Scientific Scholarship in the Twentieth Century.” Pp. 43-88 in C. Camic, N. Gross, and M. Lamont Social Knowledge in the Making. Chicago and London: The University of Chicago Press. Abbott, Kingsley. 2019. Myanmar: Documentation Practices May Raise Challenges for Accountability. OpinioJuris Blog. January 24. Last accessed Mary 14, 2019. Retrieved from: http://opiniojuris.org/2019/01/24/myanmar-documentation-practices-may-raise- challenges-for-accountability/ Access Now. 2019a. Protecting Free Expression in the Era of Online Content Moderation: Access Now’s Preliminary Recommendations on Content Moderation and Facebook’s Planned Oversight Board. Last accessed July 9, 2019. Retrieved from: https://www.accessnow.org/cms/assets/uploads/2019/05/AccessNow-Preliminary- Recommendations-On-Content-Moderation-and-Facebooks-Planned-Oversight- Board.pdf Access Now. 2019b. “Access Now on the Christchurch Call: Rights, Wrongs, and What’s Next.” Access Now. May 15. Last accessed July 9, 2019. Retrieved from: https://www.accessnow.org/access-now-on-the-christchurch-call-rights-wrongs-and- whats-next/ Acker, Amelia. 2018. “A Death in the Timeline: Memory and Metadata in Social Platforms,” in “Information/Control: Control in the Age of Post-Truth.” Journal of Critical Library and Information 2(1): 1-27. Acker and Brubaker 2014. “Death, Memorialization, and Social Media: A Platform Perspective for Personal Archives.” Archivaria 77: 1-23. Aday, Sean, and Steven Livingston. 2009. “NGOs as Intelligence Agencies: The Empowerment of Transnational Advocacy Networks and the Media by Commercial Remote Sensing in the Case of the Iranian Nuclear Program.” Geoforum 40(4): 514-522. Ahmad, Ali Nobil 2010. “Is Twitter a Useful Tool for Journalists?” Journal of Media Practice 11(2): 145-155. Airwars. 2019. “Methodology.” Airwars. Last accessed July 13, 2019. Retrieved from https://airwars.org/about/methodology/ Almgren, Susanne, and Tobias Olsson. 2015. “‘Let’s Get them Involved’…to Some Extent: Analyzing Online News Participation.” Social Media + Society. July-December: 1-11. Alston, Phillip. 2013. “Introduction: Third Generation Human Rights Fact-Finding.” Proceedings of the Annual Meeting (American Society of International Law) 107: 61-62.

203

Alston, Phillip, and Colin Gillespie. 2012. “Global Human Rights Monitoring, New Technologies, and the Politics of Information.” The European Journal of International Law 23(4): 1089- 1123. Amnesty International 2018a. “Syria: Hundreds of Civilian Lives at Risk as Afrin Offensive Escalates.” Last accessed on January 21, 2019. Retrieved from: https://www.amnesty.org/en/latest/news/2018/02/syria-hundreds-of-civilian-lives-at- risk-as-afrin-offensive-escalates/ Amnesty International. 2018b. “Myanmar: Fresh Evidence of Ongoing Ethnic Cleansing as Military Starves, Abducts, and Robs Rohingya.” February 7. Last accessed June 22, 2019. Retrieved from https://www.amnesty.org/en/latest/news/2018/02/myanmar-fresh-evidence-of- ongoing-ethnic-cleansing-as-military-starves-abducts-robs-rohingya/ Aronson, Jay. 2016. “Fact-Finding: Possibilities, Challenges, and Limitations.” Pp. 441-463 in Phillip Alston and Sarah Knuckey (eds.) The Transformation of Human Rights Fact-Finding. Oxford: Oxford University Press. Aronson, Jay. 2017. “Preserving Human Rights Media for Justice, Accountability, and Historical Clarification.” Genocide Studies and Prevention: An International Journal. 11(1): 82-99.

Aronson, Jay. 2018a. “The Utility of User-Generated Content in Human Rights Investigations.” Pp. 129-148 in Molly Land and Jay Aronson (eds.) New Technologies for Human Rights Law and Practice. Cambridge, U.K.: Cambridge University Press. Aronson, Jay. 2018b. “Computer Vision and Machine Learning for Human Rights Video Analysis: Case Studies, Possibilities, Concerns, and Limitations.” Law & Social Inquiry 43(4): 118- 1209. Article 19, Electronic Frontier Foundation, Center for Democracy and Technology, and Ranking Digital Rights. 2018. “An Open Letter to Mark Zuckerberg: The World’s Freedom of Expression is in Your Hands.” Retrieved from https://santaclaraprinciples.org/open- letter/ Article 19 et al. 2018. “An Open Letter to Mark Zuckerberg: The World’s Freedom of Expression is in Your Hands.” Last accessed July 13, 2019. Retrieved from: https://santaclaraprinciples.org/open-letter/ Asher-Schapiro, Avi. 2017. “YouTube and Facebook Are Removing Evidence of Atrocities, Jeopardizing Cases against War Criminals.” The Intercept. November 2. Last accessed July 14, 2019. Retrieved from: https://theintercept.com/2017/11/02/war-crimes-youtube- facebook-syria-rohingya/ Atlantic Council. 2018. Breaking Ghouta. Last accessed July 9, 2019. Retrieved from: http://www.publications.atlanticcouncil.org/breakingghouta/ Bair, Madeleine. 2015. “What WITNESS Learned in Our 3 Years Curating Human Rights Videos on YouTube.” WITNESS Media Lab. Last accessed March 22, 2019. Retrieved from:

204

https://lab.witness.org/what-witness-learned-in-our-3-years-curating-human-rights- videos-on-youtube/ Ball, Patrick. 2016. “The Bigness of Big Data: Samples, Models, and the Facts We Might Find When Looking at Data.” Pp. 425-440 in Phillip Alston and Sarah Knuckey (eds.) The Transformation of Human Rights Fact-Finding. Oxford: Oxford University Press. Banchik, Anna Veronica. 2018. “Too Dangerous to Disclose? FOIA, Courtroom ‘Visual Theory,’ and the Legal Battle Over Detainee Abuse Photographs.” Law & Social Inquiry 43(4): 1164- 1187. Baym, Nancy, and danah boyd. 2012. “Socially Mediated Publicness: An Introduction.” Journal of Broadcasting & Electronic Media. 56(3): 320-329. Beauman, Ned. 2018. “How to Conduct an Open Source Investigation, According to the Founder of Bellingcat.” . August 30. Last accessed January 18, 2019. Retrieved from: https://www.newyorker.com/culture/culture-desk/how-to-conduct-an-open- source-investigation-according-to-the-founder-of-bellingcat Bellingcat Yemen Project. “The Yemen Project: Announcement.” Bellingcat. April 22. Last accessed July 9, 2019. Retrieved from: https://www.bellingcat.com/news/mena/2019/04/22/the-yemen-project- announcement/ Bellingcat Investigation Team. 2017a. “Summary of Open Source Evidence from the March 25th 2017 Chlorine Attack in Al-Lataminah, Hama.” Bellingcat. October 9. Last accessed July 7, 2019. Retrieved from: https://www.bellingcat.com/news/mena/2017/10/09/summary- open-source-evidence-march-25th-2017-chlorine-attack-al-lataminah-hama/ Bellingcat Investigation Team. 2017b. “Investigating the March 30, 2017 Sarin Attack in Al- Lataminah.” Bellingcat. October 26. Last accessed July 7, 2019. Retrieved from: https://www.bellingcat.com/news/mena/2017/10/26/investigating-march-30-2017- sarin-attack-al-lataminah/ Benton, Joshua. 2019. “Twitter is Removing Precise-Location Tagging on Tweets—A Small Win for Privacy but a Small Loss for Journalists and Researchers.” Nieman Lab. June 19. https://www.niemanlab.org/2019/06/twitter-is-turning-off-location-data-on-tweets-a- small-win-for-privacy-but-a-small-loss-for-journalists-and-researchers/ Bhambra, Gurminder. 2007. “Sociology and Postcolonialism: Another ‘Missing’ Revolution.” Sociology 40(5): 871-884. Biddle, Ellery Roberts. 2018. “Envision a New War: The Syrian Archive, Corporate Censorship, and the Struggle to Preserve Public History Online.” Global Voices Advox and Monument Lab. Last accessed May 26, 2019. Retrieved from: https://advox.globalvoices.org/2019/05/02/envision-a-new-war-the-syrian-archive- corporate-censorship-and-the-struggle-to-preserve-public-history-online/amp/

205

Boellstroff. Tom. 2014. “Making Big Data, in Theory.” First Monday 18(10). Last accessed July 13, 2019. Retrieved from: https://firstmonday.org/article/view/4869/3750 Boutruche, Théo. “The Relationship between Fact-Finders and Witnesses in Human Rights Fact- Finding: What Place for the Victims.” Pp. 131 – 153 in Phillip Alston and Sarah Knuckey (eds.) The Transformation of Human Rights Fact-Finding. Oxford: Oxford University Press. Bowker, Geoffrey. 2013. “Data Flakes: An Afterword to ‘Raw Data’ Is an Oxymoron.” Pp. 167-172 in Lisa Gitelman “Raw Data” is an Oxymoron. Cambridge, MA: MIT Press. Bowker, Geoffrey. 2006. Memory Practices in the Sciences. Cambridge, MA: MIT Press. Bowker, Geoffrey, and Susan Leigh Star. 1999. Sorting Things Out: Classification and Its Consequences. Cambridge, MA: MIT Press. boyd, danah, and Kate Crawford. 2012. “Critical Question for Big Data.” Information, Community & Society 15(5): 662-679. Bucher, Taina. 2012. “Want to Be on the Top? Algorithmic Power and the Threat of Invisibility on Facebook.” New Media & Society 14(7): 1164-1180. Bucher, Taina. 2016. “The Algorithmic Imaginary: Exploring the Ordinary Affects of Facebook Algorithms” Information, Communication & Society 20(1): 30-44. Bucher, Taina, and Anne Helmond. 2017. “The Affordances of Social Media Platforms.” Pp. 233- 253 in Jean Burgess, Thomas Poell, and Alice Marwick (eds.) The SAGE Handbook of Social Media. London and New York: SAGE Publications Ltd. Bukovská, Barbora. 2008. “Perpetrating Good: Unintended Consequences of International Human Rights Advocacy.” Sur. Revista Internacional de Direitos Humanos 5(9). Burns, Ryan. 2013. “Moments of Closure in the Knowledge Politics of Digital Humanitarianism.” Geoforum 53: 51-62. Burrington 2017. Brough, Melissa, and Li. 2013. “Media Systems Dependency, Symbolic Power, and Human Rights Online Video: Learning from Burma’s ‘Saffron Revolution’ and WITNESS’s Hub.” International Journal of Communication 7: 281-304. Broussard, Meredith. 2015. “When Cops Check Facebook: America’s Police are Using Social Media to Fight Crime, a Practice that Raises Troubling Questions.” The Atlantic. April 19. Last accessed February 21, 2019. Retrieved from: https://www.theatlantic.com/politics/archive/2015/04/when-cops-check- facebook/390882/#disqus_thread Browne, Malachy. 2017. “YouTube Removes Videos Showing Atrocities in Syria.” New York Times. August 22. Last accessed January 26, 2019. Retrieved from: https://www.nytimes.com/2017/08/22/world/middleeast/syria-youtube-videos- isis.html

Browne, Malachy, Liam Stack, and Mohammed Ziyadah. 2015. "Streets to Screens: Conflict, Social Media, and the News." Information, Communication & Society 18(11): 1339-1347.
Bruno, Nicola. 2011. "Tweet First, Verify Later: How Real-Time Information is Changing the Coverage of Worldwide Crisis Events." Reuters Institute Fellowship Paper. Oxford: Reuters Institute for the Study of Journalism, University of Oxford. Last accessed March 27, 2019. Retrieved from: https://nicolabruno.files.wordpress.com/2011/05/tweet_first_verify_later2.pdf
Bruns, Axel. 2008. Blogs, Wikipedia, Second Life, and Beyond: From Production to Produsage. New York, NY: Peter Lang.
Bruns, Axel, and Tim Highfield. 2012. "Blogs, Twitter, and Breaking News: The Produsage of Citizen Journalism." Pp. 15-32 in Rebecca Ann Lind (ed.) Produsing Theory in a Digital World: The Intersection of Audiences and Production in Contemporary Theory. New York: Peter Lang Publishing Inc. Pre-print final. Last accessed January 11, 2019. Retrieved from: http://snurb.info/files/2012/Blogs,%20Twitter,%20and%20Breaking%20News.pdf
Burgess, Jean, Alice Marwick, and Thomas Poell. 2018. The SAGE Handbook of Social Media. Thousand Oaks: SAGE Publications Ltd.
Cascio, Jamais. 2005. "The Rise of the Participatory Panopticon." WC Archive. May 4. Last accessed July 14, 2019. Retrieved from: http://www.openthefuture.com/wcarchive/2005/05/the_rise_of_the_participatory.html
Camic, Charles, Neil Gross, and Michèle Lamont. 2011. "Introduction: The Study of Social Knowledge Making." Pp. 1-42 in C. Camic, N. Gross, and M. Lamont Social Knowledge in the Making. Chicago and London: The University of Chicago Press.
Caplan, Robyn, Lauren Hanson, and Joan Donovan. 2018. Dead Reckoning: Navigating Content Moderation After Fake News. Data & Society. Last accessed July 13, 2019. Retrieved from: https://datasociety.net/pubs/oh/DataAndSociety_Dead_Reckoning_2018.pdf
Center for Human Rights Science. 2014. "Video Forensics in Human Rights Abuse and War Crimes Investigation: Technology, Law, and Ethics." Last accessed March 26, 2019. Retrieved from: https://www.cmu.edu/chrs/conferences/vfhr/index.html
Center for Investigative Journalism. 2018. Issues in Investigative Practice. Proceedings of seminars at the third international CIJ Logan Symposium, London, October 19 and 20. Open Society Foundations. Last accessed June 21, 2019. Retrieved from: https://tcij.org/wp-content/uploads/2019/06/Issues-in-Investigative-Practice.pdf
Centre for Governance and Human Rights. 2019. "Digital Verification Corps Summit 2018." University of Cambridge. Last accessed March 25, 2019. Retrieved from: https://www.cghr.polis.cam.ac.uk/research-themes/human-rights-in-the-digital-age-1/DVC_summit

Center for Spatial Research. "Spatializing the YouTube War." Columbia University. Last accessed June 22, 2019. Retrieved from: http://c4sr.columbia.edu/conflict-urbanism-aleppo/spatializing-youtube.html
Charmaz, Kathy. 2014[2006]. Constructing Grounded Theory. London: SAGE Publications Ltd.
Chen, Angela. 2019. "Why the Future of Life Insurance May Depend on Your Online Presence." The Verge. February 7. Last accessed July 13, 2019. Retrieved from: https://www.theverge.com/2019/2/7/18211890/social-media-life-insurance-new-york-algorithms-big-data-discrimination-online-records
Chowdhury, Mridul. 2008. "The Role of the Internet in Burma's Saffron Revolution." The Berkman Center for Internet & Society at Harvard University. Berkman Center Research Publication No. 2008-08. Last accessed February 6, 2019. Retrieved from: https://cyber.harvard.edu/sites/cyber.harvard.edu/files/Chowdhury_Role_of_the_Internet_in_Burmas_Saffron_Revolution.pdf_0.pdf
Christin, Angèle. 2017. "Algorithms in Practice: Comparing Web Journalism and Criminal Justice." Big Data & Society 4(2): 1-14.
Citron, Danielle. 2018. "Extremist Speech, Compelled Conformity, and Censorship Creep." Notre Dame Law Review 93(3): 1035-1072.
Cole, Simon. 2013. "Forensic Culture as Epistemic Culture: The Sociology of Forensic Science." Studies in History and Philosophy of Biological and Biomedical Sciences 44(1): 36-46.
Cole, Teju. 2016. Known and Strange Things. New York: Random House.
Collins, Harry. 1981. "The Place of the 'Core-Set' in Modern Science: Social Contingency with Methodological Propriety in Science." History of Science 19(1): 6-19.
Corbin, Juliet, and Anselm Strauss. 1993. "The Articulation of Work through Interaction." The Sociological Quarterly 34(1): 71-83.
Costa, Elisabetta. 2018. "Affordances-in-practice: An Ethnographic Critique of Social Media Logic and Context Collapse." New Media & Society 20(10): 3641-3656.
Crawford. 2013.
Crawford, Kate, and Tarleton Gillespie. 2016. "What is a Flag for? Social Media Reporting Tools and the Vocabulary of Complaint." New Media & Society 18(3): 410-428.
Crawford, Kate, Kate Miltner, and Mary Gray. 2014. "Critiquing Big Data: Politics, Ethics, Epistemology." International Journal of Communication 8: 1663-1672.
Daniels, Jessie. 2009. "Cloaked Websites: Propaganda, Cyber-Racism, and Epistemology in the Digital Era." New Media & Society 11(5): 659-683.
Daston, Lorraine, and Peter Galison. 2007. Objectivity. New York: Zone Books.

Deutch, Jeff, and Hadi Habal. 2018. "The Syrian Archive: A Methodological Case Study of Open-Source Investigation of State Crime Using Video Evidence From Social Media Platforms." State Crime Journal 7(1): 46-76.
DeVito, Michael, Darren Gergle, and Jeremy Birnholtz. 2017. "'Algorithms Ruin Everything': #RIPTwitter, Folk Theories, and Resistance to Algorithmic Change in Social Media." HCI and Collective Action, CHI 2017, May 6-11.

Diggelmann, Oliver, and Maria Nicole Cleis. 2014. "How the Right to Privacy Became a Human Right." Human Rights Law Review 14(3): 441-458.
DiMaggio, Paul, Eszter Hargittai, W. Russell Neuman, and John Robinson. 2001. "Social Implications of the Internet." Annual Review of Sociology 27: 307-336.

Dodge, Martin, and Rob Kitchin. 2013. "Crowdsourced Cartography: Mapping Experience and Knowledge." Environment and Planning A 45: 19-36.
Dubberley, Sam. 2019. "Facebook Just Blindfolded War Crimes Investigators." Newsweek. June 17. Last accessed July 9, 2019. Retrieved from: https://www.newsweek.com/facebook-graph-search-war-crimes-investigators-1444311
Dubberley, Sam, Elizabeth Griffin, and Haluk Mert Bal. 2015. "Making Secondary Trauma a Primary Issue: A Study of Eyewitness Media and Vicarious Trauma on the Digital Frontline." Eyewitness Media Hub. Last accessed April 2, 2019. Retrieved from: http://eyewitnessmediahub.com/uploads/browser/files/Trauma%20Report.pdf
Deruy, Emily. 2017. "As 'Fake News' Flies, UC Berkeley Students Verify and Document." The Mercury News. June 27. Last accessed March 26, 2019. Retrieved from: https://www.mercurynews.com/2017/06/27/as-fake-news-flies-uc-berkeley-students-verify-and-document/
The Economist. 2019. "Fake News v Fact: The Battle for Truth." YouTube. Last accessed March 27, 2019. Retrieved from: https://www.youtube.com/watch?v=UM1ZAFcu1Vc
Edison Hayden, Michael. 2019. "A Guide to Open Source Intelligence (OSINT)." Tow Center for Digital Journalism. June 7. Last accessed June 9, 2019. Retrieved from: https://www.cjr.org/tow_center_reports/guide-to-osint-and-hostile-communities.php#four
Edwards, Scott. 2017. "When YouTube Removes Videos, It Impedes Justice." WIRED. Last accessed July 7, 2019. Retrieved from: https://www.wired.com/story/when-youtube-removes-violent-videos-it-impedes-justice/
Electronic Frontier Foundation, the Syrian Archive, and WITNESS. 2019. Caught in the Net: The Impact of "Extremist" Speech Regulations on Human Rights Content. Last accessed July 7, 2019. Retrieved from: https://www.eff.org/files/2019/05/30/caught_in_the_net_whitepaper_2019.pdf

Ellis, Hannah. 2018. "How to Prevent, Identify, and Address Vicarious Trauma – While Conducting Open Source Investigations in the Middle East." Bellingcat. October 18. Last accessed July 7, 2019. Retrieved from: https://www.bellingcat.com/resources/how-tos/2018/10/18/prevent-identify-address-vicarious-trauma-conducting-open-source-investigations-middle-east/
Elwood, Sarah. 2008. "Volunteered Geographic Information: Future Research Directions Motivated by Critical, Participatory, and Feminist GIS." GeoJournal 72(3/4): 173-183.
Emerson, Robert, Rachel Fretz, and Linda Shaw. 2011. Writing Ethnographic Fieldnotes. Chicago: University of Chicago Press.
The Engine Room. 2017. "Amnesty International's Digital Verification Corps: New Networks and Methods for Human Rights Research." June 19. Last accessed March 25, 2019. Retrieved from: https://www.theengineroom.org/digital-verification-corps/
The Engine Room, Amnesty International, and Benetech. 2016. DatNav: How to Navigate Digital Data for Human Rights Research. Last accessed August 16, 2017. Retrieved from: https://www.theengineroom.org/wp-content/uploads/2016/09/datnav.pdf
Eslami, Motahhare, Aimee Rickman, Kristen Vaccaro, Amirhossein Aleyasen, Andy Vuong, Karrie Karahalios, Kevin Hamilton, and Christian Sandvig. 2015. "I Always Assumed that I Wasn't Really that Close to [Her]." CHI '15 Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, Seoul, Korea. 153-162.
Evans, David, Andrei Hagiu, and Richard Schmalensee. 2006. Invisible Engines: How Software Platforms Drive Innovation and Transform Industries. Cambridge: MIT Press.
Farid, Hany. 2016. Photo Forensics. Cambridge, MA: MIT Press.
Fiesler, Casey, and Nicholas Proferes. 2018. "'Participant' Perceptions of Twitter Research Ethics." Social Media + Society 4(1).
Flamini, Daniela. 2019. "The Scary Trend of Internet Shutdowns." Poynter. August 1. Last accessed August 10, 2019. Retrieved from: https://www.poynter.org/fact-checking/2019/the-scary-trend-of-internet-shutdowns/
Fortune, Conor. 2018. "Digitally Dissecting Atrocities – Amnesty International's Open Source Investigations." Amnesty International. September 26. Last accessed March 25, 2019. Retrieved from: https://www.amnesty.org/en/latest/news/2018/09/digitally-dissecting-atrocities-amnesty-internationals-open-source-investigations/
Foucault, Michel. 1965. Madness and Civilization: A History of Insanity in the Age of Reason. Translated by Richard Howard. New York: Random House.
Foucault, Michel. 1977. Discipline and Punish: The Birth of the Prison. Translated by Alan Sheridan. New York: Random House.

Foucault, Michel. 1980. Power/Knowledge: Selected Interviews and Other Writings, 1972-1977, edited by Colin Gordon. New York: Pantheon Books.
Fourcade, Marion. 2010. Economists and Societies: Discipline and Profession in the United States, Britain, and France, 1890s to 1990s. Princeton: Princeton University Press.
Freelon, Deen, Charlton McIlwain, and Meredith Clark. 2015. Beyond the Hashtags: #Ferguson, #Blacklivesmatter, and the Online Struggle for Offline Justice. Washington, D.C.: Center for Media and Social Impact at American University's School of Communication.
Freeman, Lindsay. 2018. "Digital Evidence and War Crimes Prosecutions: The Impact of Digital Technologies on International Criminal Investigations and Trials." Fordham International Law Journal 41(2): 283-336.
Funke, Daniel. 2018. "How the BBC Verified that Video of a Grisly Murder in Cameroon, Step-by-Step." Poynter. September 26. Last accessed March 31, 2019. Retrieved from: https://www.poynter.org/news/how-bbc-verified-video-grisly-murder-cameroon-step-step
Gawer, Annabelle (ed). 2011. Platforms, Markets, and Innovation. Cheltenham: Edward Elgar.
Geertz, Clifford. 1983. Local Knowledge: Further Essays in Interpretive Anthropology. New York: Basic Books.
Gelman, Susan, and Christine Legare. 2011. "Concepts and Folk Theories." Annual Review of Anthropology 40: 379-398.

Gillespie, Tarleton. 2014. "The Relevance of Algorithms." Pp. 167-195 in Tarleton Gillespie, Pablo Boczkowski, and Kirsten Foot (eds.) Media Technologies: Essays on Communication, Materiality, and Society. Cambridge, MA: MIT Press.
Gillespie, Tarleton. 2015. "Platforms Intervene." Social Media + Society April-June 2015: 1-2.
Gillespie, Tarleton. 2017. "Algorithmically Recognizable: Santorum's Google Problem, and Google's Santorum Problem." Information, Communication & Society 20(1): 63-80.
Gillespie, Tarleton, Pablo Boczkowski, and Kirsten Foot (eds.). 2014. Media Technologies: Essays on Communication, Materiality, and Society. Cambridge, MA: MIT Press.
Gillespie, Tarleton. 2018. Custodians of the Internet: Platforms, Content Moderation, and the Hidden Decisions that Shape Social Media. New Haven: Yale University Press.
Gitelman, Lisa. 2013. "Raw Data" Is an Oxymoron. Cambridge, MA: MIT Press.
Graham Wood, Millie. 2016. "Social Media Intelligence, the Wayward Child of Open Source Intelligence." Responsible Data Blog. December 12. Last accessed March 7, 2019. Retrieved from: https://responsibledata.io/2016/12/12/social-media-intelligence-the-wayward-child-of-open-source-intelligence/

Grafton, Anthony. 2011. "In Clio's American Atelier." Pp. 89-118 in C. Camic, N. Gross, and M. Lamont Social Knowledge in the Making. Chicago and London: The University of Chicago Press.
Grasseni, Cristina (ed). 2007. Skilled Visions: Between Apprenticeship and Standards. New York and Oxford: Berghahn Books.
Green, Ben, Gabe Cunningham, Ariel Ekblaw, Paul Kominers, Andrew Linzer, and Susan Crawford. 2017. Open Data Privacy. Cambridge, MA: Berkman Klein Center for Internet & Society Research. Last accessed July 15, 2019. Retrieved from: https://dash.harvard.edu/handle/1/30340010
Godart, Charlotte. 2019. "The Most Comprehensive TweetDeck Research Guide in Existence (Probably)." Bellingcat. June 21. Retrieved from: https://www.bellingcat.com/resources/how-tos/2019/06/21/the-most-comprehensive-tweetdeck-research-guide-in-existence-probably/
Golder, Scott, and Michael Macy. 2014. "Digital Footprints: Opportunities and Challenges for Online Social Research." Annual Review of Sociology 40: 129-152.
Golebiewski, Michael, and danah boyd. 2018. "Data Voids: Where Missing Data Can Easily Be Exploited." Data & Society. Last accessed July 7, 2019. Retrieved from: https://datasociety.net/wp-content/uploads/2018/05/Data_Society_Data_Voids_Final_3.pdf
Gorwa, Robert. 2019. "What is Platform Governance?" Information, Communication & Society 22(6): 854-871.
Goodwin, Charles. 1994. "Professional Vision." American Anthropologist 96(3): 606-633.
Guerrini, Federico. 2014. "How Researchers Use Social Media to Map the Conflict in Syria." Forbes. April 15. Last accessed July 13, 2019. Retrieved from: https://www.forbes.com/sites/federicoguerrini/2014/04/15/how-researchers-use-social-media-to-map-armed-forces-in-syria/
Granka, Laura. 2010. "The Politics of Search: A Decade Retrospective." The Information Society 26(5): 364-374.
Grimmelmann, James. 2014. "Speech Engines." Minnesota Law Review 98: 868-951.
Gregory, Sam. 2010. "Cameras Everywhere: Ubiquitous Video Documentation of Human Rights, New Forms of Video Advocacy, and Considerations of Safety, Security, Dignity and Consent." Journal of Human Rights Practice 2(2): 191-207.
Gregory, Sam. 2012a. "The Participatory Panopticon and Human Rights: WITNESS's Experience Supporting Video Advocacy and Future Possibilities." Pp. 517-549 in Meg McLagan and Yates McKee (eds.) Sensible Politics: The Visual Culture of Nongovernmental Activism. New York: Zone Books.
Gregory, Sam. 2012b. "Human Rights Made Visible: New Dimensions to Anonymity, Consent, and Intentionality." Pp. 551-561 in Meg McLagan and Yates McKee (eds.) Sensible Politics: The Visual Culture of Nongovernmental Activism. New York: Zone Books.

Gregory, Sam. 2014. "Images of Horror: Whose Roles and What Responsibilities?" WITNESS Blog. Last accessed July 13, 2019. Retrieved from: https://blog.witness.org/2014/09/sharing-images-horror-roles-responsibilities
van der Haak, Bregtje, Michael Parks, and Manuel Castells. 2012. "The Future of Journalism: Networked Journalism." International Journal of Communication 6: 2923-2938.
Haar, Rohini, Casey Risko, Sonal Singh, Diana Rayes, Ahmad Albaik, Mohammed Alnajar, Mazen Kewara, Emily Clouse, Elise Baker, and Leonard Rubenstein. 2018. "Determining the Scope of Attacks on Health in Four Governorates of Syria in 2016: Results of a Field Surveillance Program." PLoS Medicine 15(4): e1002559. Last accessed February 13, 2019. Retrieved from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5915680/pdf/pmed.1002559.pdf
Halavais, Alexander. 2013. "Home Made Big Data? Challenges and Opportunities for Participatory Social Research." First Monday 18(10). Last accessed July 13, 2019. Retrieved from: https://firstmonday.org/ojs/index.php/fm/article/view/4876/3754
Harrison, Jackie. 2010. "User-Generated Content and Gatekeeping at the BBC Hub." Journalism Studies 11(2): 155-178.
Hamilton, Rebecca. 2019. "The Hidden Danger of User-Generated Evidence for International Criminal Justice." Just Security. January 23. Last accessed July 7, 2019. Retrieved from: https://www.justsecurity.org/62339/hidden-danger-user-generated-evidence-international-criminal-justice/
Hasian Jr., Marouf. 2016. Forensic Rhetorics and Satellite Surveillance: The Visualization of War Crimes and Human Rights Violations. Lanham, MD: Lexington Books.
Hampson, Ian, and Anne Junor. 2005. "Invisible Work, Invisible Skills: Interactive Customer Service as Articulation Work." New Technology, Work and Employment 20(2): 166-181.
Hazard Owen, Laura. 2018. "What Have Tech Companies Done Wrong with Fake News? Google (Yup) Lists the Ways." Nieman Lab. October 12. Last accessed November 30, 2018. Retrieved from: http://www.niemanlab.org/2018/10/what-have-tech-companies-done-wrong-with-fake-news-google-yep-lists-the-ways/
Helberger, Natali, Kari Karppinen, and Lucia D'Acunto. 2016. "Exposure Diversity as a Design Principle for Recommender Systems." Information, Communication & Society 21(2): 191-207.
Helberger, Natali. 2017. "Challenging Diversity—Social Media Platforms and a New Conception of Media Diversity." Pp. 153-175 in Martin Moore and Damian Tambini (eds.) Digital Dominance: The Power of Google, Amazon, Facebook, and Apple. New York, NY: Oxford University Press.
Helles, Rasmus, and Klaus Bruhn Jensen. 2013. "Making Data—Big Data and Beyond: Introduction to the Special Issue." First Monday 18(10). Last accessed July 13, 2019. Retrieved from: https://firstmonday.org/article/view/4860/3748

Hempel, Jessi. 2016. "Social Media Made the Arab Spring, But Couldn't Save It." WIRED. January 26. Last accessed August 10, 2019. Retrieved from: https://www.wired.com/2016/01/social-media-made-the-arab-spring-but-couldnt-save-it/
Hermida, Alfred. 2010. "Twittering the News: The Emergence of Ambient Journalism." Journalism Practice 4(3): 297-308.
Hermida, Alfred. 2012. "Tweets and Truth: Journalism as a Discipline of Collaborative Verification." Journalism Practice 6(5-6): 659-668.

Herscher, Andrew. 2014. "Surveillant Witnessing: Satellite Imagery and the Visual Politics of Human Rights." Public Culture 26(3): 469-500.
Hiatt, Keith. 2016. "Open Source Evidence on Trial." Yale Law Journal Forum 125: 323-330. Retrieved from: http://www.yalelawjournal.org/forum/open-source-evidence
Higgins, Eliot. 2014. "A Beginner's Guide to Geolocating Videos." Bellingcat. July 9. Last accessed March 21, 2019. Retrieved from: https://www.bellingcat.com/resources/how-tos/2014/07/09/a-beginners-guide-to-geolocation/
Hill, Evan. 2018. "Silicon Valley Can't Be Trusted With Our History." BuzzFeed News. April 29. Last accessed June 4, 2019. Retrieved from: https://www.buzzfeednews.com/article/evanhill/silicon-valley-cant-be-trusted-with-our-history
Hill, Kashmir. 2018. "The Wildly Unregulated Practice of Undercover Cops Friending People on Facebook." The Root. October 23. Last accessed February 21, 2019. Retrieved from: https://www.theroot.com/the-wildly-unregulated-practice-of-undercover-cops-frie-1828731563
Hopgood, Stephen. 2006. Keepers of the Flame: Understanding Amnesty International. Ithaca: Cornell University Press.
Howard, Philip, and Muzammil Hussain. 2013. Democracy's Fourth Wave? Digital Media and the Arab Spring. New York, NY: Oxford University Press.
Hu, Evanna. 2016. "Responsible Data Concerns with Open Source Intelligence." Responsible Data Forum. November 14. Last accessed March 7, 2019. Retrieved from: https://responsibledata.io/2016/11/14/responsible-data-open-source-intelligence/
Hughes, Roland. 2018. "Myanmar Rohingya: How a 'Genocide' was Investigated." BBC News. September 18. Last accessed February 6, 2019. Retrieved from: https://www.bbc.com/news/world-45341112
Human Rights Center. 2012. "Beyond Reasonable Doubt: Using Scientific Evidence to Advance Prosecutions at the International Criminal Court." Last accessed March 25, 2019. Retrieved from: https://www.law.berkeley.edu/files/HRC/HRC_Beyond_Reasonable_Doubt_FINAL.pdf

Human Rights Center. 2014a. "Digital Fingerprints: Using Electronic Evidence to Advance Prosecutions at the International Criminal Court." Retrieved from: https://www.law.berkeley.edu/files/HRC/Digital_fingerprints_interior_cover2.pdf
Human Rights Center. 2014b. "First Responders: An International Workshop on Collecting and Analyzing Evidence of International Crimes." University of California Berkeley Law School. Last accessed March 25, 2019. Retrieved from: https://www.law.berkeley.edu/wp-content/uploads/2018/03/First-Responders_final_with_cover5.pdf
Human Rights Center. 2017. "Digital Verification Corps Student Summit: Evaluating the First Year of University-Based Open Source Investigations for Human Rights." University of California Berkeley Law School. Last accessed March 25, 2019. Retrieved from: https://www.law.berkeley.edu/wp-content/uploads/2015/04/Summit_report_2017_final4.pdf
Human Rights Center. 2018a. "The New Forensics: Using Open Source Information to Investigate Grave Crimes." University of California Berkeley Law School. Last accessed March 25, 2019. Retrieved from: https://www.law.berkeley.edu/wp-content/uploads/2018/07/Bellagio_report_July2018_final.pdf
Human Rights Center. 2018b. "Chemical Strikes on Al-Lataminah March 25 & 30, 2017: A Student-Led Open Source Investigation." University of California Berkeley, School of Law. Last accessed January 25, 2019. Retrieved from: https://www.law.berkeley.edu/research/human-rights-center/programs/technology/187406-2/
Human Rights Center. 2019. "Open Source Investigations Protocol." Last accessed June 5, 2019. Retrieved from: https://humanrights.berkeley.edu/programs-projects/tech-human-rights-program/open-source-investigations-protocol
Human Rights Watch. 2017a. "New Satellite Imagery Partnership: Planet Boosts Human Rights Watch Research Capacity." Human Rights Watch News. Last accessed May 28, 2019. Retrieved from: https://www.hrw.org/news/2017/11/30/new-satellite-imagery-partnership
Human Rights Watch. 2017b. "Death by Chemicals: The Syrian Government's Widespread and Systematic Use of Chemical Weapons." May 1. Last accessed January 25, 2019. Retrieved from: https://www.hrw.org/sites/default/files/report_pdf/syria0517_web.pdf
Human Rights Watch. 2017c. "Burma: Satellite Imagery Shows Mass Destruction: 214 Villages Almost Totally Destroyed in Rakhine State." September 19. Last accessed January 22, 2019. Retrieved from: https://www.hrw.org/news/2017/09/19/burma-satellite-imagery-shows-mass-destruction
Hussain, Maaz. 2017. "Rohingya Mobile Reporter Network Crumbles in Myanmar." Voice of America. October 4. Last accessed June 22, 2019. Retrieved from: https://www.voanews.com/east-asia/rohingya-mobile-reporter-network-crumbles-myanmar

International Criminal Court, Office of the Prosecutor. 2015. "Strategic Plan: 2016-2018." International Criminal Court. November 16. Last accessed March 26, 2019. Retrieved from: https://www.icc-cpi.int/iccdocs/otp/en-otp_strategic_plan_2016-2018.pdf
International Association of Chiefs of Police. 2015. "2015 Social Media Survey Results." Last accessed February 21, 2019. Retrieved from: http://www.iacpsocialmedia.org/wp-content/uploads/2017/01/FULL-2015-Social-Media-Survey-Results.compressed.pdf
Internet Society. 2017. "Internet Shutdowns: An Internet Society Public Policy Briefing." Internet Society. November 14. Last accessed August 10, 2019. Retrieved from: https://www.internetsociety.org/wp-content/uploads/2017/11/ISOC-PolicyBrief-Shutdowns-20171109-EN.pdf
Irving, Emma. 2017. "And So It Begins… Social Media Evidence in an ICC Arrest Warrant." Opinio Juris. August 17. Last accessed March 21, 2019. Retrieved from: http://opiniojuris.org/2017/08/17/and-so-it-begins-social-media-evidence-in-an-icc-arrest-warrant/
Irving, Emma. 2018. "'The Role of Social Media is Significant': Facebook and the Fact-Finding Mission on Myanmar." Opinio Juris. September 7. Last accessed March 21, 2019. Retrieved from: http://opiniojuris.org/2018/09/07/the-role-of-social-media-is-significant-facebook-and-the-fact-finding-mission-on-myanmar/
Irwin, Aisling. 2019. "Digital Evidence Opens Doors to Human Rights Probes." SciDev.Net. March 20. Last accessed March 21, 2019. Retrieved from: https://www.scidev.net/global/human-rights/feature/digital-evidence-opens-doors-to-human-rights-probes.html
International Committee of the Red Cross and Privacy International. 2018. The Humanitarian Metadata Problem: "Doing No Harm" in the Digital Age. October 2018. Last accessed July 15, 2019. Retrieved from: https://www.icrc.org/en/download/file/85089/the_humanitarian_metadata_problem_-_icrc_and_privacy_international.pdf
Introna, Lucas, and Helen Nissenbaum. 2000. "Shaping the Web: Why the Politics of Search Engines Matters." The Information Society 16: 169-185.
Ioannou, Filipa. 2017. "UC Berkeley Program Seeks to Help Prosecute War Criminals." San Francisco Chronicle. April 18. Last accessed March 26, 2019. Retrieved from: https://www.sfchronicle.com/bayarea/article/UC-Berkeley-program-seeks-to-help-prosecute-war-11075013.php
Jarvis, Jeff. 2006. "Networked Journalism." BuzzMachine. July 5. Last accessed March 22, 2019. Retrieved from: https://buzzmachine.com/2006/07/05/networked-journalism/

Jasanoff, Sheila. 1998. "Contingent Knowledge: Implications for Implementation and Compliance." Pp. 63-87 in Edith Brown Weiss and Harold K. Jacobson Engaging Countries: Strengthening Compliance with International Environmental Accords. Cambridge, MA: MIT Press.
Jasanoff, Sheila (ed). 2004. States of Knowledge: The Co-production of Science and the Social Order. London and New York: Routledge.
Jasanoff, Sheila, and Sang-Hyun Kim. 2015. Dreamscapes of Modernity: Sociotechnical Imaginaries and the Fabrication of Power. Chicago: University of Chicago Press.

Jenkins, Henry. 2009. "What Happened Before YouTube." Pp. 109-135 in Jean Burgess and Joshua Green YouTube: Online Video and Participatory Culture. Cambridge, UK: Polity Press.
Jørgensen, Rikke Frank. 2017. "What Platforms Mean When They Talk about Human Rights." Policy & Internet 9(3): 280-296.
Jørgensen, Rikke Frank. 2018. "Framing Human Rights: Exploring Storytelling within Internet Companies." Information, Communication & Society 21(3): 340-355.
Jules, Bergis, Ed Summers, and Vernon Mitchell. 2018. Documenting the Now White Paper: Ethical Considerations for Archiving Social Media Content Generated by Social Movements: Challenges, Opportunities, and Recommendations. Documenting the Now. April. Last accessed March 19, 2019. Retrieved from: https://www.docnow.io/docs/docnow-whitepaper-2018.pdf

Kaye, David. 2018. "Report of the Special Rapporteur on the Promotion and Protection of the Right to Freedom of Opinion and Expression." Presented to the Human Rights Council, Session 38. Last accessed June 3, 2019. Retrieved from: https://freedex.org/wp-content/blogs.dir/2015/files/2018/05/G1809672.pdf
Kaye, David. 2019. Speech Police: The Global Struggle to Govern the Internet. New York: Columbia Global Reports.
Kayyali, Dia. 2019. "WITNESS Brings Together Voices to Push Back on Dangerous EU 'Dissemination of Terrorist Content' Proposal." WITNESS Blog. January 28. Last accessed May 14, 2019. Retrieved from: https://blog.witness.org/2019/01/witness-brings-together-voices-push-back-dangerous-dissemination-terrorist-content-proposal-civil-society-letter/
Kell, Gretchen. 2017. "World's Next Generation of Human Rights Investigators Meets at Berkeley." Berkeley News. Last accessed March 26, 2019. Retrieved from: https://news.berkeley.edu/2017/06/28/worlds-next-generation-of-human-rights-investigators-meets-at-berkeley/
Kell, Gretchen. 2019. "Doctor, Lawyer, Open Source Investigator? New Field Plucks Berkeley Grads." U.C. Berkeley News. May 1. Last accessed July 7, 2019. Retrieved from: https://news.berkeley.edu/2019/05/01/this-one-doctor-lawyer-open-source-investigator-new-field-seeks-berkeley-grads/

Kitchin, Rob. 2017. "Thinking Critically About and Researching Algorithms." Information, Communication & Society 20(1): 14-29.
Klonick, Kate. 2018. "The New Governors: The People, Rules, and Processes Governing Online Speech." Harvard Law Review 131: 1598-1670.
Koenig, Alexa, Félim McMahon, Nikita Mehandru, and Shikha Silliman Bhattacharjee. 2018. "Open Source Fact-Finding in Preliminary Examinations." Pp. 681-710 in Morten Bergsmo and Carsten Stahn (eds.) Quality Control in Preliminary Examination: Volume 2. Brussels: Torkel Opsahl Academic EPublisher. Last accessed July 14, 2019. Retrieved from: https://www.legal-tools.org/doc/6706c9/pdf/
Koenig, Alexa, Keith Hiatt, and Khaled Alrabe. 2018. "Access Denied? The International Criminal Court, Transnational Discovery, and The American Servicemembers Protection Act." Berkeley Journal of International Law 36(1): 1-35.
Koettl, Christoph. 2016a. Citizen Media Research and Verification: An Analytical Framework for Human Rights Practitioners. Human Rights in the Digital Age: CGHR Practitioner Paper #1. Cambridge, England: Centre of Governance and Human Rights, University of Cambridge.
Koettl, Christoph. 2016b. "Digital Evidence: Using New Data Streams in Human Rights Research." Amnesty International. February 15. Last accessed July 7, 2019. Retrieved from: https://www.amnesty.org/en/latest/news/2016/02/digital-evidence-using-new-data-streams-in-human-rights-research/
Koettl, Christoph. 2017. "Sensors Everywhere: Using Satellites and Mobile Phones to Reduce Information Uncertainty in Human Rights Crisis Research." Genocide Studies and Prevention: An International Journal 11(1): 36-54.
Koettl, Christoph, and Haley Willis. 2016. "Eyes on Cameroon: Videos Capture Human Rights Violations by the Security Forces in the Fight Against Boko Haram." Medium. Last accessed July 7, 2019. Retrieved from: https://medium.com/lemming-cliff/eyes-on-cameroon-videos-capture-human-rights-violations-by-the-security-forces-in-the-fight-ae537a5cdc4b
Knorr Cetina, Karin. 1999. Epistemic Cultures: How the Sciences Make Knowledge. Cambridge: Harvard University Press.
Knorr Cetina, Karin, and Alex Preda (eds). 2006. The Sociology of Financial Markets. Oxford: Oxford University Press.
Lampros, Andrea. 2017. "Bellagio Workshop Examines Open Source Information as Evidence." Medium. October 8. Last accessed March 25, 2019. Retrieved from: https://medium.com/humanrightscenter/bellagio-workshop-examines-open-source-information-as-evidence-dad2475fac7d

Lampros, Andrea, and Alexa Koenig. 2018. "What Students are Teaching Us about Resiliency and Human Rights." Medium. May 12. Last accessed March 27, 2019. Retrieved from: https://medium.com/humanrightscenter/what-students-are-teaching-us-about-resiliency-and-human-rights-9a34f3af75a
Land, Molly, Patrick Meier, Mark Belinsky, and Emily Jacobi. 2012. "#ICT4HR: Information and Communication Technologies for Human Rights." World Bank Institute.
Land, Molly. 2009. "Peer Producing Human Rights." Alberta Law Review 46(4): 1115-1139.
Land, Molly. 2016. "Democratizing Human Rights Fact-Finding." Pp. 399-424 in P. Alston and S. Knuckey The Transformation of Human Rights Fact-Finding. Oxford: Oxford University Press.
Lapowsky, Issie. 2019. "New Film Shows How Bellingcat Cracks the Web's Toughest Cases." Wired. March 9. Last accessed June 2, 2019. Retrieved from: https://www.wired.com/story/bellingcat-documentary-south-by-southwest/
Latour, Bruno, and Steve Woolgar. 1986[1979]. Laboratory Life: The Construction of Scientific Facts. Princeton: Princeton University Press.
Latour, Bruno. 1986. "Visualization and Cognition: Thinking with Eyes and Hands." Pp. 1-40 in Henrika Kuklick (ed.) Knowledge and Society: Studies in the Sociology of Culture Past and Present. Greenwich, Connecticut: JAI Press.
Laux, Johann. 2018. "A New Type of Evidence? Cyberinvestigations, Social Media, and Online Open Source Video Evidence at the ICC." Archiv des Völkerrechts 56(3): 324-360.
Law, John. 2008. "On Sociology and STS." The Sociological Review 56(4): 623-649.
Lazer, David, and Jason Radford. 2017. "Data ex Machina: Introduction to Big Data." Annual Review of Sociology 43: 19-39.

Leavitt, Alex. 2015. "'This is a Throwaway Account': Temporary Technical Identities and Perceptions of Anonymity in a Massive Online Community." CSCW '15 Proceedings of the 2015 ACM Conference on Computer Supported Cooperative Work and Social Computing. 317-327.
Leavitt, Alex, and John Robinson. 2017. "The Role of Information Visibility in Network Gatekeeping: Information Aggregation on Reddit during Crisis Events." CSCW '17 Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing. 1246-1261.
Lefton, Adam. 2018. "As a Designer, I Refuse to Call People 'Users.'" Medium. January 15. Last accessed July 7, 2019. Retrieved from: https://medium.com/s/user-friendly/why-im-done-saying-user-user-experience-and-ux-in-2019-4fdfc6b7de23
Lemov, Rebecca. 2011. "Filing the Total Human: Anthropological Archives from 1923 to 1963." Pp. 119-150 in C. Camic, N. Gross, and M. Lamont Social Knowledge in the Making. Chicago and London: The University of Chicago Press.

Lemov, Rebecca. 2015. Database of Dreams: The Lost Quest to Catalog Humanity. New Haven and London: Yale University Press.
Levinson-Waldman, Rachel. 2018. "Government Access to and Manipulation of Social Media: Legal and Policy Challenges." Howard Law Journal 61(3): 523-561.
LexisNexis. 2014. "Social Media Use in Law Enforcement: Crime Prevention and Investigative Activities Continue to Drive Usage." Last accessed February 21, 2019. Retrieved from: https://risk.lexisnexis.com/insights-resources/infographic/law-enforcement-usage-of-social-media-for-investigations-infographic
Litfin, Karen. 2002. "Public Eyes: Satellite Imagery, the Globalization of Transparency, and New Networks of Surveillance." Pp. 65-89 in James Rosenau and J.P. Singh (eds.) Information Technologies and Global Politics: The Changing Scope of Power and Governance. Albany: State University of New York Press.
Litt, Eden. 2012. "Knock, Knock. Who's There? The Imagined Audience." Journal of Broadcasting and Electronic Media 56(3): 330-345.
Litt, Eden, and Eszter Hargittai. 2016. "The Imagined Audience on Social Network Sites." Social Media + Society 2(1): 1-12.
Little, Mark. 2012. "Finding the Wisdom in the Crowd." NiemanReports. Last accessed March 22, 2019. Retrieved from: https://niemanreports.org/articles/finding-the-wisdom-in-the-crowd/
Liu, Sophia. 2014. "Crisis Crowdsourcing Framework: Designing Strategic Configurations of Crowdsourcing for the Emergency Management Domain." Computer Supported Cooperative Work 23: 389-443.
Livingston, Steven. 2016. "Digital Affordances and Human Rights Advocacy." SFB-Governance Working Paper Series, No. 69, Collaborative Research Center (SFB) 700, Freie Universität Berlin, March 2016. Last accessed April 1, 2019. Retrieved from: https://refubium.fu-berlin.de/bitstream/handle/fub188/18659/WP69_Druckversion.pdf?sequence=1
Livingston, Steven, and Sushma Raman. 2016. "Conference Report: Technology & Human Rights in the 21st Century." Carr Center for Human Rights Policy, Harvard Kennedy School. Cambridge, MA: Carr Center for Human Rights Policy. Last accessed July 14, 2019. Retrieved from: https://carrcenter.hks.harvard.edu/publications/conference-report-technology-human-rights-21st-century
Lunden, Ingrid. 2013. "News Corp Pays $25M for Storyful, Which Digs Up and Verifies News From Social Sites Like Twitter and Instagram." TechCrunch. December 20. Last accessed March 22, 2019. Retrieved from: https://techcrunch.com/2013/12/20/news-corp-buys-storyful-for-25m-to-dig-up-verified-news-from-social-media-sites-like-twitter-and-instagram/

Lynch, Michael. 1985. Art and Artifact in Laboratory Science: A Study of Shop Work and Shop Talk in a Research Laboratory. London: Taylor and Francis, Ltd.
Mannheim, Karl. 1936. Ideology and Utopia: An Introduction to the Sociology of Knowledge, edited by Louis Wirth and Edward Shils. New York: Harcourt, Brace & World.
Mannheim, Karl. 1952[1928]. "The Problem of Generations." Pp. 276-322 in Essays on the Sociology of Knowledge. London: Routledge & Kegan Paul.
Manoff, Marlene. 2004. "Theories of the Archive from Across the Disciplines." portal: Libraries and the Academy 4(1): 9-25.
Martin, Aryn, and Michael Lynch. 2009. "Counting Things and People: The Practices and Politics of Counting." Social Problems 56(2): 243-266.
Marwick, Alice, and Rebecca Lewis. 2017. Media Manipulation and Disinformation Online. Data & Society. Last accessed July 9, 2019. Retrieved from: https://datasociety.net/pubs/oh/DataAndSociety_MediaManipulationAndDisinformationOnline.pdf
Marwick, Alice, and Nicole Ellison. 2012. "'There Isn't Wifi in Heaven!': Negotiating Visibility on Facebook Memorial Pages." Journal of Broadcasting & Electronic Media 56(3): 378-400.
Marwick, Alice, and danah boyd. 2011. "I Tweet Honestly, I Tweet Passionately: Twitter Users, Context Collapse, and the Imagined Audience." New Media & Society 13: 114-133.
Mayernik, Matthew, and Amelia Acker. 2017. "Tracing the Traces: The Critical Role of Metadata Within Networked Communications." Journal of the Association for Information Science and Technology 69(1): 177-180.
McKelvey, Fenwick, and Robert Hunt. 2019. "Discoverability: Toward a Definition of Content Discovery through Platforms." Social Media + Society January-March 2019: 1-15.
McLagan, Meg. 2003. "Principles, Publicity, and Politics: Notes on Human Rights Media." American Anthropologist 105(3): 605-612.
McLagan, Meg. 2006. "Introduction: Making Human Rights Claims Public." American Anthropologist 108(1): 191-195.
McPherson, Ella. 2015a. "Digital Human Rights Reporting by Civilian Witnesses: Surmounting the Verification Barrier." Pp. 193-2019 in R.A. Lind (ed.) Produsing Theory in a Digital World 2.0: The Intersection of Audiences and Production in Contemporary Theory. Volume 2. New York, NY: Peter Lang Publishing.
McPherson, Ella. 2015b. "ICTs and Human Rights Practice: Report Prepared for the UN Special Rapporteur on Extrajudicial, Summary, or Arbitrary Executions." Centre of Governance and Human Rights. Cambridge, England: University of Cambridge.

McPherson, Ella. 2015c. "Advocacy Organizations' Evaluation of Social Media Information for NGO Journalism: The Evidence and Engagement Models." American Behavioral Scientist 59(1): 124-148.
McPherson, Ella. 2018. "Risk and the Pluralism of Digital Human Rights Fact-Finding and Advocacy." Pp. 188-214 in Molly Land and Jay Aronson (eds.) New Technologies for Human Rights Law and Practice. Cambridge, U.K.: Cambridge University Press.
McNeil, Maureen, Michael Arribas-Ayllon, Joan Haran, Adrian Mackenzie, and Richard Tutton. 2017. "Conceptualizing Imaginaries of Science, Technology, and Society." Pp. 435-464 in Ulrike Felt, Rayvon Fouché, Clark Miller, and Laurel Smith-Doerr (eds.) The Handbook of Science and Technology Studies, 4th Edition. Cambridge, MA: MIT Press.
Meier, Patrick. 2015. Digital Humanitarians: How Big Data is Changing the Face of Humanitarian Response. Boca Raton, London, and New York: CRC Press, Taylor & Francis Group.
Mégret, Frédéric. 2016. "Do Facts Exist, Can They Be 'Found,' and Does it Matter?" Pp. 27-48 in Philip Alston and Sarah Knuckey (eds.) The Transformation of Human Rights Fact-Finding. Oxford: Oxford University Press.
Melendez, Lyanne. 2017. "UC Berkeley Students Work to Authenticate Photos, Videos From Conflict Zones." ABC-7 News. July 13. Last accessed March 26, 2019. Retrieved from: https://abc7news.com/education/cal-students-work-to-authenticate-photos-videos-from-conflict-zones/2214248/
Merry, Sally Engle. 2016. The Seductions of Quantification: Measuring Human Rights, Gender Violence, and Sex Trafficking. Chicago: University of Chicago Press.
Meyers, Paul. 2019. Research Clinic. Retrieved from: http://researchclinic.net/facebook/facebook.html
Mol, Annemarie. 2010. "Actor-Network Theory: Sensitive Terms and Enduring Tensions." Kölner Zeitschrift für Soziologie und Sozialpsychologie 50(1): 253-269.
Morville, Peter. 2005. Ambient Findability: What We Find Changes Who We Become. Sebastopol, California: O'Reilly Media, Inc.
Moon, Claire. 2012. "What One Sees and How One Files Seeing: Human Rights Reporting, Representation and Action." Sociology 46(5): 876-890.
Mozur, Paul. 2018. "A Genocide Incited On Facebook, With Posts from Myanmar's Military." New York Times. October 15. Last accessed January 25, 2019. Retrieved from: https://www.nytimes.com/2018/10/15/technology/myanmar-facebook-genocide.html
Mroué, Rabih. 2012. "The Pixelated Revolution." Translated by Ziad Nawfal, introduction by Carol Martin. TDR: The Drama Review 56(3): 19-35.
Murthy, Dhiraj. 2011. "Twitter: Microphone for the Masses?" Media, Culture & Society 33: 779-789.

Myers, Natasha. 2008. "Molecular Embodiments and the Body-work of Modeling in Protein Crystallography." Social Studies of Science 38(2): 163-199.
Nardi, Bonnie. 2015. "Virtuality." Annual Review of Anthropology 44: 15-31.
Ní Aoláin, Fionnuala. 2018. "Mandate of the Special Rapporteur on the Promotion and Protection of Human Rights and Fundamental Freedoms while Countering Terrorism." July 24. Last accessed July 14, 2019. Retrieved from: https://www.ohchr.org/Documents/Issues/Terrorism/OL_OTH_46_2018.pdf
Ng, Yvonne. 2015. "Ethical Wednesdays: Archives and Our Ethical Guidelines for Using Eyewitness Videos." WITNESS. November. Last accessed March 19, 2019. Retrieved from: https://archiving.witness.org/2015/11/ethical-wednesdays-archives-and-our-ethical-guidelines-for-using-eyewitness-videos/
Noble, Safiya. 2018. Algorithms of Oppression: How Search Engines Reinforce Racism. New York: New York University Press.
OHCHR, United Nations Office of the High Commissioner for Human Rights. 2018. "Report of the Detailed Findings of the Independent Fact-Finding Mission on Myanmar." Presented to the Human Rights Council, Session 39. September 18. Last accessed July 14, 2019. Retrieved from: https://www.ohchr.org/EN/HRBodies/HRC/MyanmarFFM/Pages/Index.aspx
OHCHR, United Nations Office of the High Commissioner for Human Rights. 2011a. "Basic Principles of Human Rights Monitoring." In Manual on Human Rights Monitoring. Last accessed July 9, 2019. Retrieved from: https://www.ohchr.org/Documents/Publications/Chapter02-MHRM.pdf
OHCHR, United Nations Office of the High Commissioner for Human Rights. 2011b. "Trauma and Self-Care." In Manual on Human Rights Monitoring. Last accessed July 9, 2019. Retrieved from: https://www.ohchr.org/Documents/Publications/Chapter12-MHRM.pdf
OHCHR, United Nations Office of the High Commissioner for Human Rights. 2017. Report of the Independent International Commission of Inquiry on the Syrian Arab Republic. Presented to the Human Rights Council, Session 36. September 6. Last accessed July 14, 2019. Retrieved from: https://www.ohchr.org/EN/HRBodies/HRC/IICISyria/Pages/Documentation.aspx
Omand, Sir David, Jamie Bartlett, and Carl Miller. 2012a. "Introducing Social Media Intelligence (SOCMINT)." Intelligence and National Security 27(6): 801-823.
Omand, Sir David, Jamie Bartlett, and Carl Miller. 2012b. #Intelligence. London: Demos. Last accessed May 28, 2019. Retrieved from: https://www.demos.co.uk/wp-content/uploads/2017/03/intelligence-Report.pdf
O'Neill, Kathleen, Della Sentilles, and Daniel Brinks. 2014. "New Wine in Old Wineskins? New Problems in the Use of Electronic Evidence in Human Rights Investigations and Prosecutions." Report prepared under the auspices of the Bernard and Audre Rapoport Center for Human Rights and Justice. Last accessed March 21, 2019. Retrieved from: https://repositories.lib.utexas.edu/handle/2152/27996

OPCW, Organization for the Prohibition of Chemical Weapons Fact-Finding Mission. 2017. "Report of the OPCW Fact-Finding Mission in Syria: Regarding an Alleged Incident in Ltamenah, the Syrian Arab Republic, 30 March 2017." Note by the Technical Secretariat. November 2. Last accessed July 14, 2019. Retrieved from: https://www.opcw.org/sites/default/files/documents/S_series/2017/en/s-1548-2017_e_.pdf
OPCW, Organization for the Prohibition of Chemical Weapons Fact-Finding Mission. 2018. "Report of the OPCW Fact-Finding Mission in Syria Regarding Alleged Incidents in Ltamenah, the Syrian Arab Republic, 24 and 25 March 2017." Last accessed July 14, 2019. Retrieved from: https://www.opcw.org/sites/default/files/documents/S_series/2018/en/s-1636-2018_e_.pdf
Orentlicher, Diane. 1990. "Bearing Witness: The Art and Science of Human Rights Fact-Finding." Harvard Human Rights Journal 3: 83-135.
Pabian, Frank. 2015. "Commercial Satellite Imagery as an Evolving Open-Source Verification Technology: Emerging Trends and Their Impact for Nuclear Nonproliferation Analysis." European Commission Joint Research Centre Technical Reports. Last accessed January 22, 2019. Retrieved from: https://ec.europa.eu/jrc/en/publication/commercial-satellite-imagery-evolving-open-source-verification-technology-emerging-trends-and-their
Papacharissi, Zizi. 2002. "The Virtual Sphere: The Internet as a Public Sphere." New Media & Society 4(1): 9-27.
Patton, Desmond Upton, Douglas-Wade Brunton, Andrea Dixon, Reuben Jonathan Miller, Patrick Leonard, and Rose Hackman. 2017. "Stop and Frisk Online: Theorizing Everyday Racism in Digital Policing in the Use of Social Media for Identification of Criminal Conduct and Associations." Social Media + Society July-September: 1-10.
Physicians for Human Rights. 2016. "No Peace Without Justice in Syria." Physicians for Human Rights. Last accessed February 14, 2019. Retrieved from: https://phr.org/resources/no-peace-without-justice-in-syria/
Physicians for Human Rights. 2019. "Methodology." Last accessed June 18, 2019. Retrieved from: http://syriamap.phr.org/#/en/methodology
Pittaway, Eileen, Linda Bartolomei, and Richard Hugman. 2010. "'Stop Stealing Our Stories': The Ethics of Research with Vulnerable Groups." Journal of Human Rights Practice 2(2): 229-251.
Piracés, Enrique. 2018a. "Collecting, Preserving, and Verifying Online Evidence of Human Rights Violations." Open Global Rights. January 30. Last accessed February 22, 2019. Retrieved from: https://www.openglobalrights.org/collecting-preserving-and-verifying-online-evidence-of-human-rights-violations/

Piracés, Enrique. 2018b. "The Future of Human Rights Technology." Pp. 289-308 in Molly K. Land and Jay D. Aronson (eds.) New Technologies for Human Rights Law and Practice. Cambridge, UK: Cambridge University Press.
Prentice, Rachel. 2012. Bodies in Formation: An Ethnography of Anatomy and Surgery Education. Durham, NC: Duke University Press.
Price, Megan, and Patrick Ball. 2014. "Data Collection and Documentation for Truth-Seeking and Accountability." Syrian Justice and Accountability Centre. Memorandum Series: Documentation in Transitional Justice. Last accessed July 7, 2019. Retrieved from: https://hrdag.org/wp-content/uploads/2015/07/SJAC-Data-Collection-Documentation-Truth-seeking-Accountability.pdf
Privacy International. 2019. "Social Media Intelligence." Last accessed July 15, 2019. Retrieved from: https://privacyinternational.org/explainer/55/social-media-intelligence
Polakow-Suransky, Sasha. "Taking on the Kremlin from his Couch." Foreign Policy. Last accessed March 23, 2019. Retrieved from: https://foreignpolicy.com/gt-essay/taking-on-the-kremlin-from-his-couch-eliot-higgins-bellingcat-russia/
Pool, Hans. 2018. Bellingcat: Truth in a Post Truth World. Documentary.
Porter, Theodore. 1996. Trust in Numbers: The Pursuit of Objectivity in Science and Public Life. Princeton: Princeton University Press.
Porter, Elizabeth. 2014. "Taking Images Seriously." Columbia Law Review 114: 1687-1782.

Posetti, Julie, Suelette Dreyfus, and Naomi Colvin. 2019. "The Perugia Principles for Journalists Working with Whistleblowers in the Digital Age." Blueprint for Free Speech. Last accessed January 29, 2019. Retrieved from: https://blueprintforfreespeech.net/wp-content/uploads/2019/01/Blueprint_Perugia_Principles.pdf
Postigo, Hector. 2014. "The Socio-Technical Architecture of Digital Labor: Converting Play into YouTube Money." New Media & Society 18(2): 332-349.

Powers, Matthew. 2016. "The New Boots on the Ground: NGOs in the Changing Landscape of International News." Journalism 17(4): 401-416.
Pozen, David. 2005. "The Mosaic Theory, National Security, and the Freedom of Information Act." The Yale Law Journal 115: 628-679.
Public Broadcasting Service. 2017. "A New Generation of Human Rights Investigators Turns to High-Tech Methods." PBS NewsHour. February 13. Last accessed March 26, 2019. Retrieved from: https://www.pbs.org/newshour/show/new-generation-human-rights-investigators-turns-hi-tech-methods

Rader, Emilee, and Rebecca Gray. 2015. "Understanding User Beliefs about Algorithmic Curation in the Facebook News Feed." CHI '15 Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, Seoul, Korea. 173-182.
Rajagopalan, Megha. 2018. "The Histories of Today's Wars Are Being Written on Facebook and YouTube. But What Happens When They Get Taken Down?" BuzzFeed News. December 22. Last accessed July 7, 2019. Retrieved from: https://www.buzzfeednews.com/article/meghara/facebook-youtube-icc-war-crimes
Rauchfleisch, Adrian, Xenia Artho, Julia Metag, Senja Post, and Mike Schäfer. 2017. "How Journalists Verify User-Generated Content During Terrorist Crises: Analyzing Twitter Communication during the Brussels Attacks." Social Media + Society July-September: 1-13.

Reporters Without Borders. 2018. "RSF Warns Myanmar about Threat to World Press Freedom Index Ranking." October 1. Last accessed February 6, 2019. Retrieved from: https://rsf.org/en/news/rsf-warns-myanmar-about-threat-world-press-freedom-index-ranking
Ristovska, Sandra. 2016a. "The Rise of Eyewitness Video and its Implications for Human Rights: Conceptual and Methodological Approaches." Journal of Human Rights 15(3): 347-360.
Ristovska, Sandra. 2016b. "Strategic Witnessing in an Age of Video Activism." Media, Culture & Society 38(7): 1034-1047.
Ritzer, George, and Nathan Jurgenson. 2010. "Production, Consumption, Prosumption: The Nature of Capitalism in the Age of the Digital 'Prosumer.'" Journal of Consumer Culture 10(1): 13-36.
Rutkin, Aviva. 2016. "Human Rights Squad Detects Abuse in Warzone Social Media Images." New Scientist. November 11. Last accessed March 26, 2019. Retrieved from: https://www.newscientist.com/article/2112483-human-rights-squad-detects-abuse-in-warzone-social-media-images/
Roberge, Jonathan, and Robert Seyfert. 2016. "What are Algorithmic Cultures?" Pp. 1-25 in Robert Seyfert and Jonathan Roberge (eds.) Algorithmic Cultures: Essays on Meaning, Performance, and New Technologies. Oxon and New York: Routledge.
Roberts, Sarah. 2018. "Digital Detritus: 'Error' and the Logic of Opacity in Social Media Content Moderation." First Monday 23(3). Last accessed July 12, 2019. Retrieved from: https://firstmonday.org/ojs/index.php/fm/rt/printerFriendly/8283/6649
Rosen, Armin. 2018. "Erasing History: YouTube's Deletion of Syria War Videos Concerns Human Rights Groups." Fast Company. March 7. Last accessed June 9, 2019. Retrieved from: https://www.fastcompany.com/40540411/erasing-history-youtubes-deletion-of-syria-war-videos-concerns-human-rights-groups
Rothe, Delf, and David Shim. 2018. "Sensing the Ground: On the Global Politics of Satellite-based Activism." Review of International Studies 44(3): 414-437.

Said, Edward. 1978. Orientalism. New York: Pantheon Books.
Santa Clara Principles. 2018. The Santa Clara Principles on Transparency and Accountability in Content Moderation. Last accessed July 13, 2019. Retrieved from: https://santaclaraprinciples.org/
Satariano, Adam. 2019. "Europe is Reining in Tech Giants. But Some Say it is Going Too Far." The New York Times. May 6. Last accessed May 6, 2019. Retrieved from: https://www.nytimes.com/2019/05/06/technology/europe-tech-censorship.html
Satterthwaite, Margaret. 2013. "Finding, Verifying, and Curating Human Rights Facts." Proceedings of the Annual Meeting (American Society of International Law) 107: 62-65.
Savage, Mike, and Roger Burrows. 2007. "The Coming Crisis of Empirical Sociology." Sociology 41(5): 885-899.
Schäfer, Mirko Tobias, and Karin van Es (eds). 2017. The Datafied Society: Studying Culture through Data. Amsterdam: Amsterdam University Press.
Schatzki, Theodore, Karin Knorr Cetina, and Eike von Savigny. 2001. The Practice Turn in Contemporary Theory. London: Routledge.
Schou, Jannick, and Johan Farkas. 2016. "Algorithms, Interfaces, and the Circulation of Information: Interrogating the Epistemological Challenges of Facebook." KOME: An International Journal of Pure Communication Inquiry 4(1): 36-49.

Segev, Elad. 2010. Google and the Digital Divide: The Bias of Online Knowledge. Oxford, UK: Chandos Publishing.
Seu, Bruna. 2003. "Your Stomach Makes You Feel That You Don't Want to Know Anything About It: Desensitization, Defence Mechanisms, and Rhetoric in Response to Human Rights Abuses." Journal of Human Rights 2(2): 183-196.
Shaheen, Kareem. 2016. "MSF Stops Sharing Syria Hospital Locations after 'Deliberate' Attacks." The Guardian. February 18. Last accessed July 15, 2019. Retrieved from: https://www.theguardian.com/world/2016/feb/18/msf-will-not-share-syria-gps-locations-after-deliberate-attacks
Shaheen, Joseph. 2015. "Network of Terror: How Daesh Uses Adaptive Social Networks to Spread its Message." North Atlantic Treaty Organization Strategic Communications Centre of Excellence (NATO StratCom COE). Riga, Latvia. Last accessed January 10, 2019. Retrieved from: https://www.stratcomcoe.org/download/file/fid/3312
Shapin, Steven. 1995. "Cordelia's Love: Credibility and the Social Studies of Science." Perspectives on Science 3(3): 255-275.

Sharp, Dustin. 2016. "Human Rights Fact-Finding and the Reproduction of Hierarchies." Pp. 69-88 in Philip Alston and Sarah Knuckey (eds.) The Transformation of Human Rights Fact-Finding. Oxford: Oxford University Press.

Shu, Catherine. 2019. "Changes to Facebook Graph Search Leaves Online Investigators in a Lurch." TechCrunch. June 10. Last accessed July 9, 2019. Retrieved from: https://techcrunch.com/2019/06/10/changes-to-facebook-graph-search-leaves-online-investigators-in-a-lurch/

Sienkiewicz, Matt. 2014. "Start Making Sense: A Three-Tier Approach to Citizen Journalism." Media, Culture & Society 36(5): 691-701.

Silverman, Craig (ed). 2014. Verification Handbook: A Definitive Guide to Verifying Digital Content for Emergency Coverage. Maastricht, the Netherlands: European Journalism Center. Last accessed June 1, 2017. Retrieved from: http://verificationhandbook.com/
Silverman, Craig. 2018. "Journalists are Criticizing Facebook for its Data Collection. At the Same Time, They Often Use it to their Advantage." BuzzFeed News. April 11. Last accessed July 9, 2019. Retrieved from: https://www.buzzfeednews.com/article/craigsilverman/facebook-cambridge-analytica-journalism-data-criticism-osint
Silverman, Craig. 2019. "Facebook Turned Off Search Features Used to Catch War Criminals, Child Predators, and Other Bad Actors." BuzzFeed News. June 10. Last accessed July 9, 2019. Retrieved from: https://www.buzzfeednews.com/article/craigsilverman/facebook-graph-search-war-crimes
Smith, Dorothy. 1990. The Conceptual Practices of Power: A Feminist Sociology of Knowledge. Toronto: University of Toronto Press.
Starbird, Kate. 2012. Crowdwork, Crisis and Convergence: How the Connected Crowd Organizes Information During Mass Disruption Events. PhD Dissertation. University of Colorado at Boulder, Colorado: Alliance for Technology, Learning, and Society (ATLAS) Institute.
Stark, Laura. 2012. Behind Closed Doors: IRBs and the Making of Ethical Research. Chicago: University of Chicago Press.
Stecklow, Steve. 2018. "Why Facebook is Losing the War on Hate Speech in Myanmar." Reuters. August 15. Last accessed August 26, 2019. Retrieved from: https://www.reuters.com/investigates/special-report/myanmar-facebook-hate/
Strauss, Anselm. 1985. "Work and the Division of Labor." The Sociological Quarterly 26(1): 1-19.
Stray, Jonathan. 2010. "Drawing out the Audience: Inside BBC's User-Generated Content Hub." Nieman Lab. May 5. Last accessed March 21, 2019. Retrieved from: http://www.niemanlab.org/2010/05/drawing-out-the-audience-inside-bbc%E2%80%99s-user-generated-content-hub/

228

Sui, Daniel, Sarah Elwood, and Michael Goodchild. 2013. Crowdsourcing Geographic Knowledge: Volunteered Geographic Information (VGI) in Theory and Practice. Dordrecht and New York: Springer.

Summers, Ed. 2019. “Consent.” InkDroid. Last accessed July 13, 2019. Retrieved from: https://inkdroid.org/2019/04/24/consent/

Swidler, Ann, and Jorge Arditi. 1994. “The New Sociology of Knowledge.” Annual Review of Sociology 20: 305-329.

Syrian Archive, Syrians for Truth and Justice, Justice for Life, and Bellingcat. “Medical Facilities Under Fire: An Investigation About Attacking 8 Syrian Hospitals in Idlib.” Last accessed February 14, 2019. Retrieved from: https://syrianarchive.org/en/investigations/Medical-Facilities-Under-Fire/

Syrian Archive. 2019a. “About: Mission, Vision, and Workflow.” Last accessed July 9, 2019. Retrieved from: https://syrianarchive.org/en/about

Syrian Archive. 2019b. “Removals of Syrian Human Rights Content: May 2019.” Retrieved from: https://syrianarchive.org/en/tech-advocacy/may-takedowns.html

Syrian Justice and Accountability Center. 2019. “What We Do: Collect and Preserve Documentation.” Syrian Justice and Accountability Center. Last accessed March 19, 2019. Retrieved from: https://syriaaccountability.org/what-we-do/

Tannenbaum, Barbara. 2017. “Antidote to Fake News: The Investigations Lab Teaches Digital Skepticism.” California Magazine. October 19. Last accessed March 26, 2019. Retrieved from: https://alumni.berkeley.edu/california-magazine/just-in/2017-10-19/antidote-fake-news-investigations-lab-teaches-digital

Taub, Ben. 2016. “The Assad Files.” The New Yorker. April 18. Last accessed March 1, 2019. Retrieved from: https://www.newyorker.com/magazine/2016/04/18/bashar-al-assads-war-crimes-exposed

Taylor, Charles. 2003. Modern Social Imaginaries. Durham, NC: Duke University Press.

Teixeira, Fabricio. 2019. “A Comprehensive (and Honest) List of UX Clichés: A Guide to Newcomers.” Medium. February 25. Last accessed August 10, 2019. Retrieved from: https://uxdesign.cc/a-comprehensive-and-honest-list-of-ux-clich%C3%A9s-96e2a08fb2e9

Thompson, John. 2005. “The New Visibility.” Theory, Culture & Society 22(6): 31-51.

Timmermans, Stefan, and Betina Freidin. 2007. “Caretaking as Articulation Work: The Effects of Taking Up Responsibility for a Child with Asthma on Labor Force Participation.” Social Science & Medicine 65(7): 1351-1363.

Tufekci, Zeynep. 2013. “‘Not This One:’ Social Movements, the Attention Economy, and Microcelebrity.” American Behavioral Scientist 57(7): 848-870.

229

Tufekci, Zeynep. 2017. Twitter and Tear Gas: The Power and Fragility of Networked Protest. New Haven: Yale University Press.

Tufekci, Zeynep. 2018a. “How Social Media Took Us From Tahrir Square to Donald Trump.” MIT Technology Review. August 14. Last accessed July 13, 2019. Retrieved from: https://www.technologyreview.com/s/611806/how-social-media-took-us-from-tahrir-square-to-donald-trump/

Tufekci, Zeynep. 2018b. “YouTube, the Great Radicalizer.” The New York Times. March 10, sec. Sunday Review. Retrieved from: https://www.nytimes.com/2018/03/10/opinion/sunday/youtube-politics-radical.html

Tufekci, Zeynep. 2019. “The Imperfect Truth About Finding Facts in a World of Fakes.” WIRED. February 18. Last accessed April 26, 2019. Retrieved from: https://www.wired.com/story/zeynep-tufekci-facts-fake-news-verification/

Tuhiwai Smith, Linda. 2012. Decolonizing Methodologies: Research and Indigenous Peoples. London: Zed Books.

Turner, David. 2012. “Inside the BBC’s Verification Hub.” The Nieman Foundation for Journalism at Harvard. Last accessed August 2, 2017. Retrieved from: http://niemanreports.org/articles/inside-the-bbcs-verification-hub/

Traweek, Sharon. 1992. Beamtimes and Lifetimes: The World of High Energy Physicists. Cambridge, MA: Harvard University Press.

Trewinnard, Tom. 2019. “Facebook Graph Search Tools are Down: Why That Might Not be a Bad Thing.” Medium. June 10. Last accessed July 13, 2019. Retrieved from: https://medium.com/@tomt/facebook-graph-search-tools-are-down-why-that-might-not-be-a-bad-thing-3d14d5b665a1

Trottier, Daniel. 2012. “Policing Social Media.” Canadian Review of Sociology/Revue Canadienne de Sociologie 49(4).

U.C. Berkeley Public Affairs. 2018. “Activism 2.0: Can Social Media be Used to Solve War Crimes?”

Van Couvering, Elizabeth. 2007. “Is Relevance Relevant? Market, Science, and War: Discourses of Search Engine Quality.” Journal of Computer-Mediated Communication 12(3).

Vertesi, Janet. 2015. Seeing Like a Rover: How Robots, Teams, and Images Craft Knowledge of Mars. Chicago and London: The University of Chicago Press.

Wahl-Jorgensen, Karin. 2015. “Resisting Epistemologies of User-Generated Content? Cooptation, Segregation and the Boundaries of Journalism.” Pp. 335-367 in Matt Carlson and Seth C. Lewis (eds.) Boundaries of Journalism. New York: Routledge.

Wall, Melissa, and Sahar el Zahed. 2015. “Embedding Content from Syria Citizen Journalists: The Rise of the Collaborative News Clip.” Journalism 16(2): 163-180.

230

Walker, Shawn. 2017. The Complexity of Collecting Digital and Social Media Data in Ephemeral Contexts. PhD Dissertation. University of Washington.

Wardle, Claire, and Andrew Williams. 2008. UGC@TheBBC: Understanding its Impact upon Contributors, Non-Contributors, and BBC News. September 16. Cardiff School of Journalism, Media and Cultural Studies. Last accessed March 21, 2019. Retrieved from: http://www.bbc.co.uk/blogs/knowledgeexchange/cardiffone.pdf

Wardle, Claire, and Hossein Derakhshan. 2017. “Information Disorder: Toward an Interdisciplinary Framework for Research and Policy Making.” Council of Europe Report DGI(2017)09. Last accessed January 21, 2019. Retrieved from: https://rm.coe.int/information-disorder-toward-an-interdisciplinary-framework-for-researc/168076277c

Wardle, Claire, Sam Dubberley, and Peter Brown. 2014. “Amateur Footage: A Global Study of User-Generated Content.” New York: Tow Center for Digital Journalism, Columbia University.

Warner, Bernhard. 2019. “Tech Companies are Deleting Evidence of War Crimes.” The Atlantic. May 8. Last accessed May 28, 2019. Retrieved from: https://www.theatlantic.com/ideas/archive/2019/05/facebook-algorithms-are-making-it-harder/588931/

Weiss, Robert. 1994. Learning from Strangers: The Art and Method of Qualitative Interview Studies. New York: The Free Press.

Whiting, Alex. 2015. “The ICC Prosecutor’s New Draft Strategic Plan.” Just Security. July 22. Last accessed July 13, 2019. Retrieved from: https://www.justsecurity.org/24808/icc-prosecutors-draft-strategic-plan/

Williams, Heather, and Ilana Blum. 2018. Defining Second Generation Open Source Intelligence (OSINT) for Defense Enterprise. Santa Monica, CA: RAND Corporation. Last accessed July 13, 2019. Retrieved from: https://www.rand.org/pubs/research_reports/RR1964.html

Wilson, Richard. 1997. “Representing Human Rights Violations: Social Contexts and Subjectivities.” Pp. 134-60 in Richard Wilson (ed.) Human Rights, Culture, and Context: Anthropological Perspectives. London: Pluto Press.

Witjes, Nina, and Philipp Olbrich. 2017. “A Fragile Transparency: Satellite Imagery Analysis, Non-State Actors, and Visual Representations of Security.” Science and Public Policy 44(4): 524-534.

WITNESS. 2011. “Cameras Everywhere: Current Challenges and Opportunities at the Intersection of Human Rights, Video and Technology.” Brooklyn, New York: WITNESS. Last accessed August 18, 2017. Retrieved from: http://www.ohchr.org/Documents/Issues/Opinion/Communications/Witness_1.pdf

231

WITNESS. 2016. Video as Evidence Field Guide. Brooklyn, New York: WITNESS. Last accessed July 15, 2019. Retrieved from: https://vae.witness.org/video-as-evidence-field-guide/

Wittes, Benjamin. 2011. “Databuse: Digital Privacy and the Mosaic.” Governance Studies at Brookings. April 1. Retrieved from: https://www.brookings.edu/wp-content/uploads/2016/06/0401_databuse_wittes.pdf

Woodruff, Betsy. 2017. “Exclusive: Facebook Silences Rohingya Reports of Ethnic Cleansing.” Daily Beast. September 18. Last accessed July 7, 2019. Retrieved from: https://www.thedailybeast.com/exclusive-rohingya-activists-say-facebook-silences-them

Central Statistical Organization, the United Nations Development Program, and the World Bank. 2018. “Myanmar Living Conditions Survey 2017: Key Indicators Report.” Nay Pyi Taw and Yangon, Myanmar: Ministry of Planning and Finance, UNDP and WB. Last accessed February 6, 2018. Retrieved from: http://documents.worldbank.org/curated/en/739461530021973802/pdf/127618-REVISED-14-12-2018-18-51-31-KIMLCSEnglishFinalOctlowresolution.pdf

Zarsky, Tal. 2014. “Social Justice, Social Norms and the Governance of Social Media.” Pace Law Review 35(1): 154-191.

Zimmer, Michael. 2015. “The Twitter Archive at the Library of Congress: Challenges for Information Practice and Information Policy.” First Monday 20(7). Last accessed March 19, 2019. Retrieved from: https://firstmonday.org/article/view/5619/4653#p3

Zimmer, Michael, and Katharina Kinder-Kurlanda (eds.). 2017. Internet Research Ethics for the Social Age: New Challenges, Cases, and Contexts. New York, NY: Peter Lang.

Zuberi, Tukufu. 2003. Thicker than Blood: How Racial Statistics Lie. Minneapolis: University of Minnesota Press.

Zuberi, Tukufu, and Eduardo Bonilla-Silva. 2008. White Logic, White Methods: Racism and Methodology. Rowman & Littlefield Publishers.

232