
robots.txt: An Ethnographic Investigation of Automated Software Agents in User-Generated Content Platforms

By Richard Stuart Geiger II

A dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy in Information Management and Systems and the Designated Emphasis in New Media in the Graduate Division of the University of California, Berkeley

Committee in charge: Professor Jenna Burrell, Chair; Professor Coye Cheshire; Professor Paul Duguid; Professor Eric Paulos

Fall 2015

robots.txt: An Ethnographic Investigation of Automated Software Agents in User-Generated Content Platforms

© 2015 by Richard Stuart Geiger II

Freely licensed under the Creative Commons Attribution-ShareAlike 4.0 License (License text at https://creativecommons.org/licenses/by-sa/4.0/)

Abstract

robots.txt: An Ethnographic Investigation of Automated

Software Agents in User-Generated Content Platforms

by

Richard Stuart Geiger II

Doctor of Philosophy in Information Management and Systems

with a Designated Emphasis in New Media

University of California, Berkeley

Professor Jenna Burrell, Chair

This dissertation investigates the roles of automated software agents in two user-generated content platforms: Wikipedia and Twitter. I analyze ‘bots’ as an emergent form of sociotechnical governance, raising many issues about how code intersects with community. My research took an ethnographic approach to understanding how participation and governance operate in these two sites, including participant-observation in everyday use of the sites and in developing ‘bots’ that were delegated work. I also took a historical and case-study approach, exploring the development of bots in Wikipedia and Twitter. This dissertation represents an approach I term algorithms-in-the-making, which extends the lessons of scholars in the field of science and technology studies to this novel domain. Instead of just focusing on the impacts and effects of automated software agents, I look at how they are designed, developed, and deployed – much in the same way that ethnographers and historians of science tackle the construction of scientific facts. In this view, algorithmic agents come on the scene as ways for people to individually and collectively articulate the kind of governance structures they want to see in these platforms. Each bot that is delegated governance work stands in for a wide set of assumptions and practices about what Wikipedia or Twitter is and how it ought to operate. I argue that these bots are most important for the activities of collective sensemaking that they facilitate, as developers and non-developers work to articulate a common understanding of what kind of work they want a bot to do. Ultimately, these cases have strong implications and lessons for those who are increasingly concerned with ‘the politics of algorithms,’ as they touch on issues of gatekeeping, socialization, governance, and the construction of community through algorithmic agents.

Acknowledgements

There are too many people for me to thank in completing this dissertation and my Ph.D. This work is dedicated to my personal infrastructure of knowing and knowledge production – the vast assemblage of individuals who I have relied on for guidance and support throughout my time as a graduate student. There could be a whole dissertation of material documenting all the people who have helped me get to this point, but these few pages will have to suffice for now.

The School of Information at UC-Berkeley has given me a wealth of resources to help me get to this moment, but the people have been the greatest part of that. My fellow members of my Ph.D cohort, Galen Panger and Meena Natarajan, have been on the same path with me for years, helping each other through all the twists and turns of this journey. I also want to thank many members of my Ph.D program, who have helped me refine and shape my ideas and myself. I’ve presented much of the work in this dissertation at both formal colloquia and informal talks over drinks in the Ph.D lounge, so thanks to Andy Brooks, Ashwin Matthew, Christo Sims, Dan Perkel, Daniela Rosner, Elaine Sedenberg, Elizabeth Goodman, Ishita Ghosh, Jen King, Megan Finn, Nick Doty, Nick Merrill, Noura Howell, Richmond Wong, Sarah Van Wart, and Sebastian Benthall for their ideas and support. The School of Information is also home to three members of my committee, without whom this dissertation could not have been written. I want to thank my chair and advisor Jenna Burrell for her time and energy in supporting me as a Ph.D student, as well as Paul Duguid and Coye Cheshire, for their years of service and support. They have helped me not only with research for this dissertation, but also in various other aspects of becoming an academic. I’d also like to thank Eric Paulos, my outside committee member, for his advising around my dissertation. And there are other professors at Berkeley who have helped shape this work and my thinking through classes and conversations: Greg Niemeyer, Ken Goldberg, Abigail De Kosnik, Cathryn Carson, Massimo Mazzotti, and Geoffrey Nunberg. I’d also like to thank the South Hall administrative staff for all of their work in getting me to this point, including: Meg St. John, Lety Sanchez, Roberta Epstein, Siu Yung Wong, Alexis Scott, and Daniel Sebescen, as well as Lara Wolfe at the Berkeley Center for New Media and Stacey Dornton at the Berkeley Institute for Data Science.

I’ve had the opportunity to collaborate with amazing researchers, who have shaped and expanded my thought. Aaron Halfaker, my long-time collaborator, has been a constant source of support and inspiration, and I know that we have both expanded our ways of thinking in our collective work over the years. I also want to thank Heather Ford, my fellow ethnographer of Wikipedia, for her collaboration, friendship, and support. I’m thankful to have met the members of the Wikimedia Summer of Research 2011, who early on in my Ph.D helped me understand the full breadth of Wikipedia research: Fabian Kaelin, Giovanni Luca Ciampaglia, Jonathan Morgan, Melanie Kill, Shawn Walker, and Yusuke Matsubara. I also want to thank many people at the Wikimedia Foundation, who have been incredibly helpful throughout this research, including Dario Taraborelli, Diederik van Liere, Oliver Keyes, Maryana Pinchuk, Siko Bouterse, Steven Walling, and Zack Exley.


I’d also like to thank the Wikipedians and Twitter blockbot developers who have assisted with this research, both in the time they took to talk with me and in the incredibly important work that they do in these spaces.

I am also grateful to have an abundant wealth of friends and colleagues in academia, who have helped shape both me and my work. Many of these people I met when we were fellow graduate students, although some are now faculty! I’ve met many of these people at conferences where I’ve presented my work or heard them present their work, and our conversations have helped me get to the place where I am today. These people include: Aaron Shaw, Airi Lampinen, Alex Leavitt, Alyson Young, Amanda Menking, Amelia Acker, Amy Johnson, Andrea Wiggins, Andrew Schrock, Anne Helmond, Brian Keegan, Caroline Jack, Casey Fiesler, Charlotte Cabasse, Chris Peterson, Dave Lester, Heather Ford, Jed Brubaker, Jen Schradie, Jodi Schneider, Jonathan Morgan, Jordan Kraemer, Kate Miltner, Katie Derthick, Kevin Driscoll, Lana Swartz, Lilly Irani, Mako Hill, Matt Burton, Melanie Kill, Meryl Alper, Molly Sauter, Nate Tkacz, Nathan Matias, Nick Seaver, Norah Abokhodair, Paloma Checa-Gismero, Rachel Magee, Ryan Milner, Sam Woolley, Shawn Walker, Stacy Blasiola, Stephanie Steinhardt, Tanja Aitamurto, Tim Highfield, Whitney Phillips, and Whitney Erin Boesel. I’m sure I’m also leaving some people out of this long list, but this only goes to show you that so much of what I have accomplished is because of all the people with whom I’ve been able to trade ideas and insights.

There have also been many faculty members who have generously helped me think about my work and my career, as well as given me many opportunities during my time as a graduate student. I want to particularly thank David Ribes, my advisor in my M.A. program at Georgetown, for helping me first dive into these academic fields, and for continuing to help support me. I also want to thank those scholars who have helped me think through my work and my career in academia: Amy Bruckman, Andrea Tapia, Andrea Forte, Annette Markham, Chris Kelty, Cliff Lampe, David McDonald, Geert Lovink, Gina Neff, Janet Vertesi, Kate Crawford, Katie Shilton, Loren Terveen, Malte Ziewitz, Mark Graham, Mary Gray, Nancy Baym, Nicole Ellison, Paul Dourish, Phil Howard, Sean Goggins, Steve Jackson, Steve Sawyer, T.L. Taylor, Tarleton Gillespie, Zeynep Tufekci, and Zizi Papacharissi.

Institutionally, I’d like to thank many departments, centers, and institutes for supporting me and my work, both at Berkeley and beyond. Locally, I’m thankful for the support of the UC-Berkeley School of Information, the Department of Sociology, the Center for Science, Technology, Medicine, and Society (CSTMS), the Center for New Media, and the Berkeley Institute for Data Science. Further out, I’m thankful for the support of the Center for Media, Data, and Society at Central European University, the Oxford Internet Institute, the MIT Center for Civic Media, the Berkman Center at Harvard Law School, the Consortium for the Science of the Sociotechnical, and the Wikimedia Foundation. People and programs at these institutions have graciously hosted me and other like-minded academics throughout the years, giving me an invaluable opportunity to grow as a scholar.

Last but certainly not least, I’d like to thank the people in my family, who have supported me for longer than anybody else mentioned so far. My parents, Scott and Kathleen, and my sister Audrey Scott, have given me so much support in getting me to the place where I am today. I also want to thank my Aunt Susie and Uncle Scott, for everything they have done in my many years as a student. They have all supported me throughout the highs and the lows, and I’ve never doubted their love and care for me. There is no way I’d be in the place where I am right now without them, and that means the world to me.


Table of Contents

Chapter 1: Introduction ...... 1
1. Thematic overview ...... 1
1.1 Vignette: “Bots are editors too!” ...... 1
1.2. The politics of algorithms and the Soylent Green argument ...... 3
2. How to study algorithms: lessons from science and technology studies ...... 4
2.1 Studying the production of knowledge ‘in the making’ ...... 4
2.2 Algorithms as a mode of knowledge production ...... 6
2.3 The black box ...... 9
2.4 Methods for studying black boxes ...... 10
3. Chapter overview ...... 13
Section 1: An algorithms-in-the-making approach (ch 2,3) ...... 13
Chapter 2: Bespoke Code and the Bot Multiple ...... 13
Chapter 3: Exclusion compliance ...... 14
Section 2: Bots in administrative spaces in Wikipedia (ch 4, 5, 6) ...... 15
Chapter 4: Articulation Work ...... 15
Chapter 5: Membership and socialization ...... 16
Chapter 6: An algorithmic history of Articles for Deletion ...... 16
Section 3: Blockbots in Twitter (chapter 7 and 8) ...... 17
Chapter 7: The blockbot tutorial-ethnography: a composite account ...... 17
Chapter 8: Blockbots as projects of collective sensemaking and reconfiguration ...... 18
Chapter 9: Conclusion ...... 18
Section 1: Algorithms in the Making (chapter 2 and 3) ...... 19
Chapter 2: Bespoke code and the bot multiple ...... 19
1. Introduction ...... 19
1.1 Defining and contextualizing bespoke code ...... 19
1.2 Why study bespoke code? ...... 20
1.3 IRC bots and the delegation of governance work ...... 21
2. Bespoke code in Wikipedia ...... 22
2.1 Infrastructural inversions ...... 22
2.2 Bots in Wikipedia ...... 26
3. Theoretical issues around studying bespoke code ...... 28


3.3 Bespoke code as a way to call attention to the materiality of software ...... 31
3.4 Code as law-in-the-making ...... 32
4. About a bot ...... 37
4.2 A legitimate alternative account ...... 39
4.3: Moving out and away ...... 41
4.4: Code/coda ...... 43
5. Conclusion ...... 45
Chapter 3: Exclusion compliance ...... 47
1. Introduction ...... 47
1.1 Chapter overview ...... 47
1.2 What is the antibot_template_protected_text function? ...... 48
1.3 Theorizing delegation in organizations ...... 51
2: A history of exclusion compliance in Wikipedia ...... 53
2.1: A problem and a solution ...... 53
2.2: A problem with the solution ...... 55
2.3: An Unexpected Ally ...... 59
2.4: The formalization of exclusion compliance ...... 60
2.5 The banning of Betacommand ...... 62
3. Conclusion ...... 63
Section 2: Specialized administrative processes in Wikipedia (Chapter 4, 5, and 6) ...... 66
1. Summary of section 1 (chapter 2 & 3) ...... 66
2. Overview of section 2 ...... 68
Chapter 4: Articulation work ...... 70
1. Introduction ...... 70
1.1 Chapter overview ...... 70
2. Bots behind the scenes in Wikipedia ...... 72
2.1 The edit request process ...... 72
2.2: Previous literature and popular commentary on Wikipedia behind the scenes ...... 76
2.3. Quantifying behind the scenes coordination work by Wikipedians ...... 80
3. Discussion: bots as articulation work ...... 83
3.1 Bots as “clerks” for a process, rather than independent decision-makers ...... 83
3.2: The bureaucracy of crowds ...... 85
3.3: The concept of articulation work ...... 86


4. Conclusion ...... 88
Chapter 5: Socialization and Membership in Administrative processes ...... 90
1. Introduction ...... 90
1.1 Chapter overview ...... 90
1.2: Introductory vignette: “Save saveMLAK!” ...... 91
2: Specialization and socialization ...... 94
2.1 Articulating trials of strength ...... 94
2.2 Specialization and socialization ...... 95
2.3. Generalization and organizational literacy ...... 99
3: Discussion: The encyclopedia that anyone can edit? ...... 101
3.1 Overview ...... 101
3.2 Wikipedia and its discontents ...... 102
3.3 Organizational literacy ...... 102
4. The (Tea)house that bots built ...... 103
4.1. The Teahouse ...... 103
4.2 Discussion ...... 105
5. Conclusion ...... 106
Chapter 6. An algorithmic history of Articles for Deletion ...... 108
1. Chapter Summary ...... 108
2. A History of Deletion ...... 110
2.1: The problem of server-sovereigns: Wikipedia’s early years (2001-2002) ...... 110
2.2: New process, new software (2002-2003) ...... 112
2.3: The bots enter the process (2004-5) ...... 115
3. Conclusion ...... 121
Section 3: Blockbots ...... 123
1. Lessons learned from Wikipedia ...... 123
2. Twitter’s blockbots as a new case to apply the algorithms-in-the-making approach ..... 124
2.1 Introducing harassment in Twitter and various responses ...... 124
2.2 Contrasting blockbots with malicious bots ...... 125
2.3 Contrasting blockbots with other automated software agents in Twitter ...... 126
3. Overview of the two chapters ...... 127
Chapter 7: The blockbot tutorial-ethnography: a composite account ...... 129
1. Introduction ...... 129


1.1 A composite account ...... 129
1.2 The genre of the tutorial ...... 129
2. The tutorial-ethnography ...... 131
2.0: blocking on Twitter’s interface without a bot ...... 131
2.1: the API, queried by a software program ...... 131
2.2: Synchronizing blocklists ...... 133
2.3: oAuth ...... 135
2.4: coordinating servers: blocktogether.org ...... 137
2.5: Deployment ...... 140
2.6: single-purpose accounts ...... 141
2.7: encapsulation of the organization ...... 142
3. Conclusion ...... 147
Chapter 8: Blockbots as projects of collective sensemaking and reconfiguration ...... 148
1. Introduction ...... 148
1.1 Classification and its consequences: beyond “What really is harassment?” ...... 148
1.2 Blockbots are embedded in and emerge from counterpublic communities ...... 149
2. Literature review: harassment, technological determinism, and networked publics ...... 150
3. How blockbots help counterpublic groups moderate their own experiences ...... 154
4. Blockbots are embedded in and emerge from counterpublic communities ...... 158
5. Discussion ...... 160
6. Conclusion ...... 163
Chapter 9. Conclusion ...... 166
2. Themes ...... 167
2.1 proceduralization ...... 167
2.2 The regulation of bots ...... 168
2.3 Bots as speculative projects ...... 169
3. Infrastructures and ideologies of user-generated content platforms ...... 170
3.1 The rise of user-generated content ...... 170
3.2 Discourses of libertarian governance ...... 171
3.3 Who does the work of sustaining order? ...... 173
3.4 Where is a platform? ...... 174
3.5 A delegation and a re-distribution of governance work ...... 174
10. Works cited ...... 176


robots.txt: An Ethnographic Investigation of Automated Software Agents in User-Generated Content Platforms

Chapter 1: Introduction

1. Thematic overview

1.1 Vignette: “Bots are editors too!”

In late 2006, members of the English-language version of Wikipedia1 began preparing for their third annual election for Wikipedia’s Arbitration Committee -- or ArbCom, for short. According to the official Wikipedia policy defining arbitration and ArbCom’s role at the time, the fifteen-member committee “has the authority to impose binding solutions to disputes between editors, primarily for serious conduct disputes that have previously proven unresolvable through all other dispute resolution practices and processes.”2 As ArbCom is tasked with making controversial decisions when there is no clear consensus on a given issue, arbitrators hold some of the most powerful positions of authority in the encyclopedia project.3 The decisions reached in such cases not only make front-page news in the Wikipedia Signpost, the project’s weekly newspaper reporting on internal issues; such cases also occasionally make headlines in mass media, such as ArbCom’s ruling that banned members of the Church of Scientology from editing4 and a controversial ruling about the article on the GamerGate controversy.5 ArbCom is often referred to as Wikipedia’s high or supreme court,6 and it should be no surprise that elections for the few seats that open each year are hotly contested. In this particular election, nominations for open seats were accepted during November 2006.

1 When I discuss Wikipedia in this dissertation, I focus on the English-language version. Different language versions of Wikipedia have a strong degree of autonomy when it comes to setting and enforcing internal norms and processes about how articles are to appear and how editors are to interact. Different language versions of Wikipedia have similarities and differences, but a comparative study is outside the scope of this dissertation.

2 https://en.wikipedia.org/w/index.php?title=Wikipedia:Arbitration_Committee&oldid=664663537

3 Throughout this work, I use many emic terms common among highly-active Wikipedia contributors. I use the emic term “the project” to refer to a broad notion of the English-language Wikipedia as a collective endeavor to author and curate an encyclopedic text, which takes place in a variety of mediated and face-to-face settings and involves a wide variety of work.

4 See http://www.telegraph.co.uk/technology/wikipedia/5408761/Church-of-Scientology-members-banned-from-editing-Wikipedia.html

5 See https://www.washingtonpost.com/news/the-intersect/wp/2015/01/29/gamergate-wikipedia-and-the-limits-of-human-knowledge/

6 Although arbitrators continually emphasize that there are differences between the committee and courts of law.


According to the newly-established rules for ArbCom elections, all editors who had made at least 1,000 edits to the encyclopedia project as of October of that year were eligible to run.

In all, about forty Wikipedians7 meeting these requirements nominated themselves or accepted the nominations of others, which formally involved submitting a brief statement to potential voters with reasons why they would be good arbitrators. One of these nominations came from an account with the username AntiVandalBot, which was used by an automated software agent that reviewed all edits to the project as they were made in near real-time, then reverted those that it determined were blatant acts of vandalism or spam, according to various algorithms. This bot was developed and operated by a well-known Wikipedia administrator8 named Tawker, who, in a common convention, used the separate user accounts to distinguish between edits he personally made and those authored by the automated software agent. AntiVandalBot’s statement to voters drew on many tropes common in Wikipedian internal politics, including a satirical description of its accomplishments and adherence to project norms (like Neutral Point of View, or NPOV) in the same rhetorical style as many other candidates:9

I always express NPOV on any decision I make because I have no intelligence, I am only lines of code. I also never tire, I work 24 hours a day, 7 days a week. I think I have the most of edits of any account on this now, I have not counted since the toolserver died. Taking a look at my talk page history, my overseers ensure that all concerns are promptly responded to. In short, a bot like me who can function as a Magic 8 Ball is exactly what we need on ArbCom! -- AntiVandalBot 05:20, 17 November 2006 (UTC)

While some Wikipedians treated the bot with at least an ironic level of seriousness, others expressed frustration at Tawker, who denied he was acting through his bot and insinuated it had become self-aware. One editor removed the bot’s candidate statement from the nomination page without prior discussion, but Tawker had AntiVandalBot quickly revert this removal of content as an act of vandalism. Another editor deleted the statement again and urged seriousness in the matter, but Tawker replaced the bot’s nomination statement again, this time under his own user account. Tawker then came to the aid of his bot in the election’s designated discussion space, passionately defending the right of any editor – human or bot – with over a thousand edits to run in the election. Right on cue, the bot joined in the discussion and staunchly

7 “Wikipedian” is an emic term used by active contributors in the Wikipedia projects to describe themselves, and by corollary, is sometimes deployed to exclude those who are not considered members -- readers, lurkers, donors, vandals, spammers, and self-promoters are not typically considered Wikipedians. I use the emic meaning of the term “Wikipedian” as I understand it throughout this work, in which status as a Wikipedian involves both frequent, reoccurring participation on the site, as well as a blurrier notion that Wikipedians participate in alignment with a broader set of norms, policies, processes, discourses, and ideologies which structure participation in the project.

8 Administrators are volunteer editors who have been approved by “the community” in a vote-like process to have special technical privileges in Wikipedia. As of Summer 2015, there are approximately 1,600 administrators in the English-language Wikipedia.
9 Note: all quotes from discussions in Wikipedia are directly copied and appear with no corrections. [sic] marks are not included due to the significant number of errors present in some of the quotes.

defended its place in the election by exclaiming, “I do not like this utter bot abuse. Bots are editors too!”

I make the same argument in this dissertation, although in a markedly different context. Tawker – speaking through his bot’s account – was ironically claiming that algorithmic editors like AntiVandalBot ought to be capable of running for the project’s highest elected position – and if successful, be allowed to influence the process of encyclopedia-building at one of its highest and most visible levels. I argue (with all seriousness) that these automated software agents have long had a substantial level of influence on how Wikipedia operates as a project to collectively author and curate a general-purpose encyclopedia – which has become a dominant site of cultural production despite facing continual criticism over its “anyone can edit” model. Bots are ubiquitous in Wikipedia and have long been explicitly delegated key editorial and administrative tasks involved in the encyclopedia project. There are 410 user accounts10 that have been authorized to run 1,848 different automated tasks11 on the English-language Wikipedia as of 20 April 2015. Yet for many readers and casual contributors to Wikipedia, it is quite possible to pass over the many different kinds of committees that have been delegated key decision-making responsibilities in the project – and by “committees,” I refer equally to collectives that are ostensibly made up of humans like ArbCom, as well as those ostensibly made up of algorithms like AntiVandalBot.
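To give a concrete sense of the kind of work that is delegated to a bot like AntiVandalBot, the following is a minimal, illustrative sketch of the general shape of such an agent: a loop that polls the public MediaWiki API for recent edits, applies a heuristic to flag likely vandalism, and marks revisions for reverting. This is not AntiVandalBot’s actual code; the scoring heuristic, threshold, and revert step are hypothetical placeholders, and a real anti-vandalism bot uses far more elaborate rules and must be approved and authenticated under Wikipedia’s bot policy.

# Illustrative sketch only: the general shape of an anti-vandalism bot.
# The heuristic and the revert step are hypothetical placeholders, not the
# logic of AntiVandalBot or any other approved bot.
import time
import requests

API = "https://en.wikipedia.org/w/api.php"

def recent_changes(limit=50):
    """Fetch the most recent edits from the public MediaWiki API."""
    params = {
        "action": "query",
        "list": "recentchanges",
        "rctype": "edit",
        "rcprop": "title|ids|user|comment|sizes",
        "rclimit": limit,
        "format": "json",
    }
    return requests.get(API, params=params).json()["query"]["recentchanges"]

def looks_like_vandalism(change):
    """Toy heuristic: flag edits that remove a large amount of text with no edit summary."""
    removed = change.get("oldlen", 0) - change.get("newlen", 0)
    return removed > 2000 and not change.get("comment")

def main():
    seen = set()
    while True:
        for change in recent_changes():
            if change["rcid"] in seen:
                continue
            seen.add(change["rcid"])
            if looks_like_vandalism(change):
                # A real bot would log in under its own approved bot account and
                # issue an authenticated edit undoing revision change["revid"].
                print("Would revert revision", change["revid"], "on", change["title"])
        time.sleep(10)  # poll in near real-time

if __name__ == "__main__":
    main()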

1.2. The politics of algorithms and the Soylent Green argument

When Tawker logged in under his bot’s account and participated in a discussion within one of Wikipedia’s formalized meta-level decision-making processes,12 he surfaced an often obscured aspect of algorithmic systems: they are deeply human, despite the many longstanding and powerful narratives which emphasize the separation of humans and machines.13 This argument – which I facetiously term the Soylent Green argument14 – has a long lineage in the social scientific and humanist scholarship on science and technology, particularly from scholars in Science and Technology Studies (e.g. Law & Bijker, 1994). This argument has more recently been extended to scholarship on “the politics of algorithms” (for commentary on this emerging literature, see Barocas, Hood, & Ziewitz, 2013; Gillespie, 2014; Seaver, 2013). Just like the elected members of ArbCom, bots and their developers do not emerge in Wikipedia out of a vacuum. They are situated in and emerge out of a broad and diverse array of social and technical systems that exist in, around, and beyond Wikipedia. Such context makes bots difficult to examine and evaluate purely in terms of brute impacts and effects, although such impacts and effects are indeed substantial. For veteran Wikipedians who spend a substantial amount of their time contributing to the encyclopedia project, bots are part of the fabric of Wikipedia’s organizational culture and deeply integrated into many of the project’s average, everyday activities. Bots are sometimes invisible parts of the infrastructure that go unnoticed, but other

10 List at http://en.wikipedia.org/wiki/Special:ListUsers/bot
11 List at http://en.wikipedia.org/wiki/Category:Approved_Wikipedia_bot_requests_for_approval
12 Several levels of meta are at work in approving an ArbCom election nomination: it involves making a decision about how to conduct the process for deciding who ought to be making decisions about issues that have been left unresolved after multiple lower levels of dispute resolution.
13 Such a separation is critiqued by many, including (Suchman, 2007), which I review and discuss in chapters 2 and 3.
14 “It’s….. people!” (Heston, 1973)

times bots are hotly contested. Such debates are one of the many different ways in which Wikipedians work out fundamental issues about what they want Wikipedia to be – both a set of encyclopedia articles and a community responsible for curating those articles.

A secondary interpretation of Tawker’s exclamation that “bots are editors too” shows why it is important to emphasize the roles that people play in the design, development, and deployment of algorithmic systems: bots are editors,15 curators and gatekeepers of cultural content on one of the largest and most visited websites in the world. On average, hundreds of edits are made to Wikipedia a minute, and there is a substantial amount of gatekeeping work that takes place in Wikipedia to enforce standards of quality and relevance. As I argue, the observed order of Wikipedia is not due to an alleged “wisdom of the crowds” (Surowiecki, 2004), but rather a highly organized assemblage of socio-technical systems, which are configured in particular ways and for particular purposes – and not others. A substantial amount of scholarship on Wikipedia has focused on the more human committees like ArbCom (e.g. Konieczny, 2010; Tkacz, 2015; Wattenberg, Viegas, & McKeon, 2007a). In addition, the more heterogeneous committees of humans and algorithms like AntiVandalBot are also crucially important for understanding how Wikipedia – a site that is often raised as an emblem and an exemplar of an entire ‘revolution’ of “user-generated content” and “Web 2.0” – is curated as a site of cultural production. In order to understand how Wikipedia works today and how it has dramatically changed in the decade and a half since its creation in 2001, bots cannot be ignored, any more than any of the other ways in which Wikipedians work out what it means to use an “anyone can edit” wiki to author a general-purpose encyclopedia.

2. How to study algorithms: lessons from science and technology studies

2.1 Studying the production of knowledge ‘in the making’

There is a longstanding tradition in the field of science and technology studies (STS) for ethnographers to enter into the labs and workplaces of scientific inquiry, studying science “in action” (Latour, 1987) or “in the making” (Shapin, 1992). As Shapin summarizes, the goal of such research goes well beyond more longstanding efforts in the public understanding of science. Public understanding of science seeks to “inform the public about what scientists know” (28) and what the public ought to do given a certain state of scientific findings (often discussed as public policy or activism). In contrast to these more macro-sociological goals, many ethnographers and historians of science have taken a more micro-sociological route, gaining long-term, extended access to the average, everyday work that takes place in these

15 However, among Wikipedians, the term “editor” is typically used in the way “user” functions in other sites to generally refer to anyone who takes any action on the site. This is in part because in the MediaWiki software hosting Wikipedia, almost every interaction involves editing some kind of wiki page: encyclopedia articles are publicly accessible wiki pages that (almost) anyone can edit, but so are “talk” pages for discussing the content of particular articles, “user talk” pages for interpersonal communication, and meta-level pages for documenting and debating the project’s norms. Wikipedians even submit cases to ArbCom by editing a new wiki page in a special section of Wikipedia, which is also how ArbCom’s decisions are published. ArbCom does have a private mailing list due to the committee’s general belief that the sensitive nature of the decisions they make requires them to take place “off wiki.” However, this is more of an exception, and some Wikipedians object to such non-public discussion spaces as being “unwiki.”

relatively tight-knit and closed laboratories. With this privileged access, these researchers can certainly relate current findings in the field to the public and are likely in a better position to understand the role of science in public policy. However, with access before ‘the science is settled,’ they are also uniquely suited to study how scientists collectively work to produce the findings that the public typically has to take for granted. The findings of such ethnographies and histories have generally complicated more traditional idealized accounts of a universal scientific method – one that progresses linearly towards truth, driven by dispassionate individuals who are largely insulated from the ‘social’ phenomena that introduce bias into other forms of knowledge production. Such understandings of science were formalized by mid-20th century philosophers of science like Popper and Merton, and they are now a cornerstone of science education from an early age.

As Bruno Latour notes in his introduction to his ethnography of a laboratory at the Salk Institute, such universal visions of science become quite complicated and messy when observing science in the making (Latour & Woolgar, 1979). Shapin synthesizes a long line of research in STS when he discusses how scholars in the field have generally found cases speaking to topics like:

the collective basis of science, which implies that no single scientist knows all of the knowledge that belongs to his or her field; the ineradicable role of trust in scientific work, and the consequent vulnerability of good science to bad practices; the contingency and revisability of scientific judgment, and thus the likelihood that what is pronounced true today may, without culpability, be judged wrong tomorrow; the interpretive flexibility of scientific evidence, and the normalcy of situations in which different good-faith and competent practitioners may come to different assessments of the same evidence. (28).

Understanding these social aspects of scientific practice is not directly opposed to broader goals around the public understanding of science, nor does it summarily dismiss the validity of findings made by scientists – contrary to the claims of some in the so-called “science wars” of the 1980s and 1990s. The canonical science studies critique of objectivity as socially constructed is a subtle one that is tightly linked to the thick, rich descriptions of scientific practice sought by ethnographers and historians. The goal is to understand how objectivity is produced, investigating the many different ways in which scientists develop, negotiate, and practice such ideals. As Shapin discusses with climate research, “fairy-tales” about “a universal efficacious scientific method” (28) may seem to champion science, but they can actually make it difficult for those unfamiliar with the routine, mundane practices of scientific work to contextualize certain events that are made visible to the general public. A paper in which one scientific lab viciously critiques a dominant climate model of the greenhouse effect may be interpreted as an attack or even a dismantling of the finding (supported by that model) that the earth’s climate is warming. As Mody and Kaiser (2008) find in their review of science studies scholarship on pedagogy, training, and professionalization of scientists, part of the professionalization process involves learning about the particular ways in which science is not an isolated enterprise, but takes place within the structures established through specific disciplines, labs, universities, funding structures, publications, professional associations, governments, popular media, and so on. Senior scientists know all too well that what is

published in peer-reviewed journals only represents a fraction of the work that goes on behind the scenes, as well as how science certainly does not operate in an isolated ivory tower independent from political and economic pressures (as much as they may try to insulate their subordinates from such pressures).

While these social aspects may be seen by some as evidence against the validity and objectivity of science, the canonical science studies approach sees them as what actually makes certain forms of validity and objectivity possible. As Daston and Galison argue around the emergence of photography in science, discourses celebrating objectivity as a way for “nature to speak for itself” (Daston & Galison, 1992, p. 81) belie a more complex, overlapping, and continually-evolving set of social and material practices. In this sense, objectivity is a moralizing ideal, one that establishes discourses, codes of conduct, procedures, and artifacts which humans use to judge other humans in the course of speaking for nature – not something humans do to let nature speak for itself. Similarly, Shapin and Schaffer (1985) historicize the development of experimental methods, arguing that this form of knowledge production was built upon elements ranging from the genre of the laboratory report and the networks of print that carried the first journals across continents to the socio-economic status of the ‘gentlemen’ across early modern Europe who had the resources and respectability that enabled them to dedicate substantial efforts to building air pumps and telescopes.

Ethnographers and historians have found countless cases of scientists not behaving as mechanically disinterested instruments for turning raw data into truth according to a monolithic scientific method. Instead, scientists have many different ways of constructing many different kinds of objectivity across the various contexts in which science takes place. In the 1990s, some academics and commentators of early science studies literature argued that scholars like Latour and Shapin were saying that science is no different than fiction or religion, such that any claim to truth is nothing more than an arbitrary expression of brute power relations that has been improperly elevated to the status of truth.16 Furthermore, a number of scholars (as well as journalists and activists) were increasingly focused on investigating and exposing bias, malpractice, corruption, and fraud in science, particularly in laboratories that had close ties to either the military or large corporations. These characterizations of science as ‘merely’ a social construction aligned with a number of movements across the political spectrum, from those who denied that the ‘science was settled’ regarding evolution or global warming to those raising concerns about genetically-modified organisms or pharmaceuticals. As Bruno Latour notes in a reflection on this “weaponization of critique” (Latour, 2004), the goal of science studies is instead to give context to these forms of knowledge production, showing how objectivity and truth have the strength that they do precisely because they are socially constructed.

2.2 Algorithms as a mode of knowledge production

Algorithmic systems are similarly deployed in sites of knowledge production, with many similarities to scientific inquiry. Such systems are often presented as more objective and universal ways of knowing the world, particularly those that are bound up in the analysis of

16 One lay version of this is the belief that statistics can be manipulated to say anything, such as in Mark Twain’s quote about there being “lies, damned lies, and statistics” or Darrell Huff’s “How to Lie With Statistics.”

“Big Data” -- which a number of scholars have argued is best seen as a contemporary moment in which longstanding debates about objectivity and positivism are yet again playing out (boyd & Crawford, 2011; Gillespie, 2014; Jurgenson, 2014). I see much of the contemporary critical scholarship and commentary around algorithms as more analogous to the lineage of science studies than to the related wing of the Science and Technology Studies field that has focused on the social aspects of technological design and development. First, with the growing scholarly and public attention on algorithmic systems – which Seaver (2013) has termed “critical algorithms studies” – there are increasing efforts around the public understanding of algorithms. This scholarship includes documenting recent developments of algorithmic systems that are of importance to the general public, discussing broader societal and public policy concerns that arise from such developments, and examining more generalizable issues about how technocratic experts and publics interact. Like Shapin, I do not take issue at all with these more accessible explanations about what kinds of algorithmic agents are operating in our digitally-mediated public and private spaces or the societal and public policy focus on what to do about it; these help a wide range of social groups understand and relate to the world around them.

There is a second set of scholarship and commentary focused on critiques of algorithmic systems (and “big data”), which have largely emerged as responses to more celebratory discourses emphasizing the objectivity of such approaches to knowledge production and decision-making. Scholars like Morozov and Jurgenson draw extensively from the lineage of STS ethnographers and historians who have studied science in the making, using the findings about the social construction of scientific objectivity to dismiss the hyperbolic claims by those like Chris Anderson and Tim O’Reilly. Such critiques often discuss the rhetorical power of casting algorithmic systems as ways to remove humans from the loop and let data ‘speak for itself,’ arguing how such discourses erase the people who actually do this work – who are generally quite privileged, non-representative of the general population, and in positions where such algorithmically-produced knowledge can benefit them greatly. Yet considering the overwhelming amount of critique and commentary drawing on lessons learned from studies of science in the making, the field is severely lacking in these kinds of highly-situated, empirically-driven studies about algorithmic systems. It is crucial to study algorithms in the making, investigating how computer scientists and software developers come to make what they make. Like with science in the making, it is important to do this before the source code is finalized and deployed in the broader systems that members of the general public use and interact with on an average, everyday basis (as well as those that members of the public do not use, but are instead used to make judgements on a wide range of issues).

Another reason why it is important to study algorithms in the making in the same way that STS scholars have studied science in the making is the second similarity I have found between the social study of science and the social study of algorithms. In the broader scholarly and popular discourses around algorithms, I see a polarization that mirrors the twin extremes of the “science wars” debates of the 1990s, particularly around the issue of algorithmic objectivity. On one extreme are the celebratory positions that imagine algorithmic systems as the ideal ways to objectively know the world and make decisions about it, often cast as taking decision-making away from subjective, emotional, and error-prone humans. Chris Anderson infamously declared “the end of theory” (Anderson, 2008) due to a rise of automated, data-driven scientific inquiry, while Tim O’Reilly has advanced a framework of “algorithmic regulation”

(O’Reilly, 2013) and “government as a platform” (O’Reilly, 2011). O’Reilly’s framework is based on the premise that if a governmental regulation or law can be computationally specified and enforced, it should be. Even some humanistic critiques of these algorithmic systems accept the premise of algorithmic objectivity as the ultimate science, like in Lewis’s celebration of human error: “Algorithms, which are by definition more objective, have no human fallibilities … Making mistakes is a human frailty … the essence of what it means to be human” (Lewis, 2014).

On the other extreme are those who critique algorithmic systems of knowledge production as superstition, mysticism, or pseudoscience, arguing that the outputs of algorithmic systems express nothing more than brute power relations. In fact, in many of these critiques, the argument is that unlike ‘real’ science – which is often counterpoised as an ‘actually objective’ mode of knowledge production – algorithmic systems should be given no more epistemological weight than judgements made by humans. For example, Robin James compares algorithmic forecasting to astrology, locating “the same pseudo-rationality found in both astrology and big data” as “a trendy, supposedly more objective upgrade to unfashionable superstitions” (James, 2015). Such critics typically discuss two kinds of concerns, which both differently draw on ideals of a universal scientific method. Some focus on mistakes and biases that are made by such systems as evidence of their lack of objectivity, while others focus on the fact that the inner workings of such systems are not made public and open to review – or argue that due to their complexity, not even their developers can truly know how such systems operate. For example, Cory Doctorow argues in a lengthy critique that a crime prediction system developed for the Chicago Police Department is “the Big Data version of witchburning, a modern pseudoscience cloaked in the respectability of easily manipulated statistics,” drawing on a Popperian notion of science as openness:

the attribution of guilt (or any other trait) through secret and unaccountable systems is a superstitious, pre-rational way of approaching any problem. … The core tenet of science, the thing that distinguishes it from all other ways of knowing, is the systematic publication and review of hypotheses and the experiments conducted to validate them. The difference between a scientist and an alchemist isn't their area of study: it's the method they use to validate their conclusions. An algorithm that only works if you can't see it is not science, it's a conjuring trick. (Doctorow, 2014)

Evgeny Morozov, who has continually focused on issues of algorithmic governance, critiques algorithmic regulation on issues of transparency and accountability, making the common argument that casts “algorithms” as mystical processes that even their developers cannot know or understand. He argues this in a critique of a piece by Tim O’Reilly, who had claimed that algorithmic systems would be better for governance than contemporary bureaucracies, because rules and procedures could be computationally specified and enforced. Morozov contests this, arguing that “instead of seeing the logic driving our bureaucratic systems and making that logic more accurate and less Kafkaesque, we would get more confusion because decision making was becoming automated and no one knew how exactly the algorithms worked” (E. Morozov, 2014).


2.3 The black box

One of the core issues stems from the general inaccessibility of the spaces where most algorithmic systems are developed, both in terms of gaining access at all and of gaining the expertise to understand the development of such systems. As Seaver (2013) argues, this opacity makes it easy to attribute a mystical quality to what is produced there and to the people who produce it. To avoid falling into the trap laid when examining only ready-made algorithms, ethnographers must look into the broader “algorithmic systems” in which those algorithms are designed, developed, and deployed. Seaver argues that this work of studying algorithms in the making involves a broader scope than the standard computer science definition of the term “algorithm” as a defined, encodable procedure for turning data inputs into data outputs. Rather, he states:

It is not the algorithm, narrowly defined, that has sociocultural effects, but algorithmic systems — intricate, dynamic arrangements of people and code … When we realize that we are not talking about algorithms in the technical sense, but rather algorithmic systems of which code strictu sensu is only a part, their defining features reverse: instead of formality, rigidity, and consistency, we find flux, revisability, and negotiation. (Seaver 2013, 9-10)

In line with the longstanding science studies position on objectivity, we should not deny the existence of objectivity with algorithms (nor should we instinctively lash out at anyone who uses such a term in relation to algorithms). Rather, a more situated approach is to understand the ways in which algorithmic objectivity is made more and less possible, especially across different contexts. This certainly does not mean following Anderson’s claim that “numbers speak for themselves,” any more than Daston and Galison accept scientists’ similar claims that photography lets nature speak for itself. Rather, this means that objectivity is a complex and continuously-evolving ideal, an inherently fuzzy concept that humans develop and deploy in conversations and controversies they have in relation to the active, ongoing development and deployment of algorithmic systems. And these ideals and socio-material practices are as much a part of the algorithmic system as its source code.

The presumed opacity of the site where such systems are developed risks a fetishization of algorithmic systems, potentially locating some special essence of algorithms inside a conveniently inaccessible interior space. Many academic and popular commentators have taken up calls to release the source code of algorithms as ways to “open up the black box” of algorithmic systems; these calls for algorithmic accountability or transparency often assume that the relevant factors of an algorithm are found in its source code. Yet as Bruno Latour argued in his studies of scientific laboratories (Latour, 1987), what is black boxed is the entire enterprise of science, such that science is only graspable as something that takes in data about the natural world (and possibly grant funding) and outputs objective findings. Like with the lineage of social studies of science, work on the public understanding of algorithmic systems and the roles of algorithmic systems in society must be accompanied by a related but distinct inquiry of algorithms in the making. Seaver – who has conducted extended fieldwork in a major corporation that develops machine learning systems for recommending music to their clients – argues for such broad, sustained ethnographic inquiries into these spaces where algorithms are


developed. In his own fieldwork, he writes about the shift between his initial assumptions about algorithms and what has emerged through such extended participant-observation – a shift that mirrors my own:

These algorithmic systems are not standalone little boxes, but massive, networked ones with hundreds of hands reaching into them, tweaking and tuning, swapping out parts and experimenting with new arrangements. If we care about the logic of these systems, we need to pay attention to more than the logic and control associated with singular algorithms. We need to examine the logic that guides the hands, picking certain algorithms rather than others, choosing particular representations of data, and translating ideas into code. (9).

If we want to problematize the determinist narratives of inevitability around algorithmic systems and support the work of those involved in the public understanding of and responses to algorithms, it is not enough to make the Soylent Green argument common in critical algorithm studies: to point out that algorithms are made of people; that they are designed, developed, and deployed by those in privileged positions to do so, who engage in this work in ways that make sense to them given their positions in society. Problematization involves more than summary dismissal of a narrative; it demands consideration of what supports and structures such seemingly ‘obvious’ assumptions. In this case, the opacity and inaccessibility of the spaces where algorithmic systems are designed, developed, and deployed contributes to the broader characterization of algorithmic systems in general as inherently opaque, fully-autonomous agents that invisibly dominate human activity in ways that even their creators cannot understand. This inaccessibility also plays into the less severe narratives, which imagine solutions to algorithmic accountability as a problem of getting access to the source code and the expertise to understand such code. Given such demands for algorithmic transparency and accountability, I ask: are the problems critics raise around algorithmic systems inherent in the process of developing encodable procedures to autonomously make decisions, or are they based more in the lack of transparency and accountability by the institutions and organizations that have increasingly turned to algorithmic systems to make decisions? In other words, what would we see if we were able to examine not only the source code of algorithmic systems that support the production of knowledge and the regulation of behavior, but also the full range of their design, development, and deployment? What if we got everything that the critical scholars and activists are asking for in opening up the black box of algorithmic systems?

2.4 Methods for studying black boxes

The core problem around algorithmic systems remains: what do we do when the space where such technology is designed, developed, and deployed is generally impenetrable, such that everyone except for a select few is limited to dealing with these spaces as black boxes? Scholars seeking to engage in such STS-based ethnographic work on algorithmic systems frequently lament their inability to gain access to these sites (Barocas et al., 2013; Gillespie, 2014). This is a longstanding methodological problem faced by those “studying up” (Nader, 1969), whether those being studied are the administrators of colonies or social networking sites. Scholars of algorithmic systems who do not have privileged access to such sites instead generally rely on what is made public about such systems in often-vague statements from their


developers, or they use reverse engineering, which Diakopoulos (2015) reviews in the context of “algorithmic accountability.” One powerful technique is to make use of filtered/unfiltered feeds like those in Facebook’s algorithmically-filtered news feed: the filtered feed is displayed by default, but it can instead be sorted chronologically to show all posts. In one study, researchers had people compare their filtered and unfiltered news feeds, using this as a way to explore what kinds of content were being filtered and to provoke reflection about this algorithmic filtering by people who use Facebook (Eslami et al., 2015). Others trace the contours of algorithmic systems as they are made visible to end-users in more subtle ways, as in Crawford and Gillespie’s analysis of interfaces in major social networking and social media sites developed for users to flag or report inappropriate content. They argue that such interfaces are articulations of a “vocabulary of complaint” that structures a highly-automated human-computational system used for moderation work, the inner workings of which are opaque to all but a few who work for Facebook, YouTube, Twitter, etc. (Crawford & Gillespie, 2014).

In investigating these issues, my strategy has been to leverage my existing ethnographic experience in a space where algorithmic systems proliferate: Wikipedia. My original research questions involving Wikipedia did not involve these issues of algorithmic systems, but were rather focused on issues of community and organization. Through extended participant-observation, interviews, and descriptive statistics, I have conducted a substantial amount of research over the past nine years on how Wikipedia operates as an encyclopedia project that is authored and curated almost exclusively by volunteers. Later in this research, I paid more attention to the fact that a multitude of automated software agents operate in Wikipedia, performing a wide variety of work, from directly editing encyclopedia articles to more administrative and higher-level tasks. Building on my existing level of access and expertise within Wikipedia more broadly, I have turned my attention towards these “bots” and explored how they have been designed, developed, and deployed in the encyclopedia project. Such work has been made possible both through my insider status within Wikipedia and through my background in software development. Together, these skills give me the ability to not only look into the source code of bots that work in Wikipedia, but also understand the broader context in which those algorithmic systems operate. Furthermore, I also found that in two online spaces in which I have been an active participant for multiple years – Twitter and reddit – automated software agents also proliferate, performing different kinds of tasks that nevertheless have a significant influence on how people interact on these sites. I focus only on bots in Twitter in this dissertation, but similar issues also arise in reddit, which I leave for future work.

Unlike the algorithmic systems that automate various activities in Facebook, Google, or the NSA, the bots I studied in Wikipedia and Twitter were not designed, developed, or deployed internally by staff at the organizations that own and operate these websites. Rather, these algorithmic systems are generally run by relatively-independent volunteers. The software code supporting such algorithmic systems is also not directly integrated into server-side codebases; these are often programs that run on a developer’s personal computer or on cloud computing services. They act in the capacity of user accounts, with the same affordances and constraints as the accounts that humans use, and their activity is generally visible (or invisible) as that of any other user account. With the bots I have studied in researching this dissertation, a far wider range of the design, development, and deployment involved in algorithmic systems is publicly accessible; no NDAs were violated in the making of this dissertation. Furthermore,

almost all of these bots are openly identified as bots by their developers, unlike most “socialbots” that pose as humans. (Those bots, used for spamming, astroturfing, identity theft, etc. are generally seen as inherently malicious and a violation of a site’s terms of service, and therefore typically do not advertise their status as bots.) The bots I studied in Wikipedia and Twitter are of a class of automated software agents that are generally accepted as potentially legitimate actors by the organizations that own and operate these websites – typically a tacit acceptance, in that bots deemed to be malicious or disruptive can (and are) be blocked server- side by administrators. As I found in many of the cases I studied, there are passionate debates that erupt between a bot’s developer and others on the site about whether a particular bot ought to exist. Sometimes, petitions are made to server administrators to block controversial bots.17 Other times, debates occur between non-developers over a hypothetical, proposed bot that may never come to exist. While there are many differences between Wikipedia and Twitter, both are sites which user-authored bots have significant impacts for those who use them and are made the subject of public debate – unlike in sites that universally prohibit all user-authored bots, like Facebook.

As I have investigated the exceptional and mundane work that takes place around bots in Wikipedia and Twitter, the easier narratives emphasizing the inevitability, opacity, instrumentality, and invisibility of algorithms have given way to more complex stories of contingency, negotiation, and a plurality of partial perspectives. My exploration into the average, everyday practices around the design, development, and deployment of algorithmic systems revealed that people are not “algorithmic dopes” – a term adapted from Garfinkel’s critique of the “cultural dope” (Garfinkel, 1967). I found that many people (including those who are not software developers) express strong beliefs about how they think an automated software agent ought to be programmed to perform a certain task, because many of them have very strong ideas about what kinds of tasks are important in these spaces and how they ought to be performed. In fact, I argue that these bots are often most important not for the specific tasks they have been delegated, but for the broader conversations they provoke as people work to articulate and negotiate what kinds of socio-technical systems they want to exist in the world.

At one level, these bots have been delegated important work in the spaces they continuously inhabit, and the tasks they perform certainly have far-reaching impacts and implications as to how sites like Wikipedia and Twitter operate in the manner that they do. Yet at a broader level, these automated software agents are even more important as pivotal moments in which fundamental values, goals, ideals, and visions are articulated, made explicit and contestable. For example, a spell-checking bot is unleashed on the entire English-language Wikipedia, using an American English dictionary to ‘correct’ everything it analyzes as misspellings. This understandably sparks a debate, which is ultimately not about which national variety of English ought to be used, but rather over whether Wikipedia ought to uniformly adhere to any national variety of English across all encyclopedia articles. In deciding not to allow any fully-automated spell-checking bots to operate site-wide, Wikipedians not only came to a decision about this particular algorithmic agent, but worked further towards a broader collective understanding about how they thought Wikipedia ought to exist as an encyclopedia – a specific genre of text – read and authored by people from all around the world. In this dissertation, I tell of many similar cases and controversies about bots in Wikipedia and Twitter, which give us a quite different understanding of what algorithmic systems are and what roles they play in the governance of user-generated content sites.

17 Twitter has a general bot policy (~1,000 words in June 2015) that largely lists prohibited actions, which is enforced by staff at Twitter, Inc. Wikipedia has a far more developed and devolved bot policy and process in which bots must be approved by an ad-hoc committee of volunteers who are active in contributing to Wikipedia; the committee is a mix of bot developers and non-developers, many of whom have been delegated the authority to block (or unblock) any particular account from editing.

3. Chapter overview

Section 1: An algorithms-in-the-making approach (ch 2,3)

The chapters in the first section introduce the core concepts that I use to study bots in Wikipedia and Twitter from an algorithms-in-the-making approach: first, a focus on bots as bespoke code, which calls attention to the infrastructures that support automated software agents, and then a focus on the delegation of governance work through such bots. In this way, I study bots before they become the kind of ready-made artifacts that appear as external to societies and organizations (and are therefore best studied for their impacts and effects). In my approach, bots as bespoke algorithmic systems are one of many ways in which people work to understand, articulate, negotiate, and enact norms and meta-norms about how Wikipedia as a digitally-mediated environment is and ought to be. These chapters expand my analytical frame of a bot beyond just seeing bots as software agents powered by source code. I show how they are ongoing projects: first, a continual effort by the bot’s operator to ensure that the software agent keeps running; and second, a collective effort by the bot’s operator and others in Wikipedia to decide what the software agent ought to do and how.

Chapter 2: Bespoke Code and the Bot Multiple

In chapter 2, I discuss one aspect of an algorithms-in-the-making approach, which investigates the specific material and infrastructural contexts in which software code is designed, developed, and deployed. In my studies, I found that the code of the bots I examined was generally not designed, developed, and deployed by ‘server sovereigns’ who had the privilege to modify server-side code. Instead, this work was done by people who acted relatively independently from the organizations that owned and operated the site’s servers. Bots are what I term “bespoke code” (Geiger, 2014), a broader phenomenon I introduce and discuss in this second chapter. This sensitizing concept calls attention to software code that extends or modifies the functionality of centrally-hosted digitally-mediated environments in ways that are traditionally limited to server-side code. Bespoke code – which includes fully-automated bots, third-party ‘power tools,’ browser extensions, and mashups – raises many compelling issues about the sociality of software and the politics of algorithms.

In this second chapter, I discuss how bespoke code has emerged as a response to the rise of cloud computing and software-as-a-service, in which applications are increasingly run as server-side “platforms” that people access through web browsers, rather than as standalone applications. While this may initially seem to give server sovereigns even more authority over how to enact a particular form of life on the people who access their sites and services, bespoke code complicates this more traditional narrative that privileges the server as a site of inquiry. I then discuss how bespoke code as a concept calls attention to the material and infrastructural aspects of software, aligned with but extending a long literature across various fields (e.g. Blanchette, 2011; Star, 1999). I argue that bespoke code is a particularly compelling example showing how “opening up the black box” of software should include but go beyond analyzing source code to see how a bot operates. The social study of software in the making also involves investigating the specific, concrete, historically-contingent contexts in which that code is designed, developed, and deployed. I empirically support such an argument by relating a set of vignettes about my own experience as a bot developer in Wikipedia, emphasizing the importance of looking beyond source code to understand issues relating to the broader “algorithmic systems” (Seaver, 2013) in which that code is embedded.

Chapter 3: Exclusion compliance

In chapter 3 on “exclusion compliance,” I expand the previous chapter’s focus on the materiality of bespoke code to issues around the delegation of governance work to algorithmic systems. I draw on theories of delegation from science and technology studies and organizational studies, as well as a variety of other approaches that emphasize the indeterminate and constructed nature of information technology. In particular, I draw on Bruno Latour’s writings on delegation in science (Latour, 1999a) and engineering (Latour, 1992), as well as the work of other actor-network theorists, who conceptualize society as a heterogeneous network of assembled relations between human and non-human actors (Akrich & Latour, 1992; Callon, 1986; Law & Mol, 1995). Moments of delegation to automated software agents – particularly delegation of governance work – are key sites in which broader values and assumptions become visible and articulated. Controversies around delegation are a rich site of inquiry, as they often demand that participants reflect on the broader issues, practices, and concerns that exist across an organization.

This chapter of the dissertation examines these theoretical issues through a case study of a controversy over the first bot in Wikipedia that was delegated the work of enforcing a norm about how Wikipedians were to interact with each other (as opposed to earlier bots that made automated edits to the text of encyclopedia articles). This norm – that people should sign and date their comments in discussion spaces, a feature not built into the stock MediaWiki platform – was seemingly universal and uncontroversial, enshrined in a document in Wikipedia’s “policy environment” (Beschastnikh, Kriplean, & McDonald, 2008). However, when a bot began universally enforcing this norm, those who felt they had the “right” to not have their comments signed and/or dated contested the code of the bot. In the ensuing controversy, bot developers and non-developers debated not only the norm of signing and dating comments, but also the meta-norms about if and when Wikipedians could opt out of having a bot enforce such a norm. The compromise reached in this case established both new norms and new software-based standards guiding how bot developers were to interact with those who objected to bots – or more accurately, with those who objected to a bot developer’s normative vision of how Wikipedia ought to operate, which was being implemented project-wide through the code of the bot.


Section 2: Bots in administrative spaces in Wikipedia (ch 4, 5, 6)

In the three chapters that comprise this section of the dissertation, I use the algorithms-in-the-making approach established in section 1 to study the bots involved in Wikipedia’s behind-the-scenes administrative spaces. These are the specialized processes in which highly-active and veteran Wikipedians perform specific, commonly recurring tasks, such as conflict resolution or deciding which articles ought to be kept or deleted. I argue that an algorithms-in-the-making approach is crucial to understanding how the encyclopedia project has shifted dramatically over its 15-year history, going from a frequently-lampooned site of cultural production where the only rule was “” (Reagle, 2010) to a highly-organized and bureaucratic organization. Bots are crucial to understanding the contemporary operation of Wikipedia as a now-dominant site of cultural production, whose veteran members rely on them to perform a substantial amount of the gatekeeping work needed to curate content contributed to the encyclopedia. Furthermore, bots are also crucial to understanding how the project came to exist in this way and not others – although I do not advance a ‘rise of the machines’ trope in which I hold these algorithmic systems (or even their developers) solely responsible for the rise of Wikipedia’s administrative processes. Rather, I argue that bots and other bespoke code were and continue to be one of many ways in which Wikipedians work out high-level issues about how to govern the site.

Chapter 4: Articulation Work

In chapter 4, I introduce these specialized processes and venues within the context of the bespoke code that supports them. Many Wikipedia researchers have explored specific processes that make up “the hidden order of Wikipedia” (Wattenberg et al., 2007), but I first provide an updated description of the prevalence of these spaces using a mix of ethnographic, historical, and statistical methods. I then extend this literature by showing how such processes are supported by automated software agents and other bespoke code. Such bots do not generally make autonomous, independent decisions that determine the outcome of specific administrative issues. Instead, these “clerk” bots (as their developers sometimes call them) are more often involved in “articulation work” (Strauss, 1985), which involves coordinating the more human Wikipedians as they participate in these formalized practices. I use the concept of articulation work to show how such bots can become sites of contestation and negotiation for much broader issues; a bot does not have to be fully delegated a task to provoke and resolve debates about how that task ought to be done or why the task is important. My account of these bot-articulated administrative spaces also challenges longstanding characterizations of Wikipedia as an unstructured anarchy ruled only by a mystical “wisdom of the crowds” (Surowiecki, 2004), which is common in both popular culture and academic scholarship. In performing articulation work, bots provide an organizational infrastructure that makes it possible for veteran Wikipedians to quickly and efficiently perform a wide variety of tasks at scale – in a way that can easily be seen as a form of the “uncoordinated coordination” (Benkler, 2007, p. 5) that economists like Yochai Benkler celebrate in theories of peer production, arguing that sites like Wikipedia lack traditional organizational structures and instead operate in ways that are closer to marketplaces.


Chapter 5: Membership and socialization

In chapter 5, I discuss the relationship between these increasingly formalized administrative processes and issues of participation, socialization, and membership in Wikipedia. Becoming a Wikipedian involves not just learning the project’s discourses and norms of participation, the wiki’s complicated user interface, or specialized reference and writing skills, which many Wikipedia researchers have analyzed (e.g. Bruckman, Bryant, & Forte, 2005). Newcomers must also become familiar with these automated processes and the bots that coordinate administrative work as they go about many average, everyday activities. Many early accounts of socialization in Wikipedia celebrated a successful model of “legitimate peripheral participation” (Lave & Wenger, 1991), where newcomers gradually take on increasingly complex tasks (reviewed by Preece & Shneiderman, 2009). However, newcomers are now often immediately thrust into these venues as soon as they make a contribution, a phenomenon my collaborators and I have termed “gatekeeping socialization” (Geiger et al., 2012), which has significantly increased the attrition of newcomers (Halfaker et al., 2013). In response, Wikipedians have created different kinds of newcomer-focused specialized venues, using bespoke code to create safe mentoring spaces for newcomers to get help. These issues show how, like many information technologies in organizations, the bots that support Wikipedia’s specialized venues are best seen not as “automated plumbing” that simply makes Wikipedia more efficient, but as something deeply woven into the “fabric” of the project’s organizational culture (Orlikowski & Scott, 2008; Zammuto et al., 2007) – learned as a part of membership and integrated into everyday practices. In applying these lessons from the study of information technology in more traditional organizations, I further specify my algorithms-in-the-making approach to include how bots have become part of the average, everyday work of being a veteran Wikipedian.

Chapter 6: An algorithmic history of Articles for Deletion

The previous two chapters in this section illustrated various elements of Wikipedia’s specialized administrative spaces. In these bot-supported spaces, a wide range of decisions are made across the encyclopedia project, and participating in them is a core mode of engagement for many of the project’s most active contributors. In chapter 6, I ask: how did such a situation develop as it is now, especially given the early rhetorical position of Wikipedia as an anti-institutional mode of cultural production? And did such processes initially come on the scene with the kind of highly automated algorithmic support I detailed in Wikipedia’s contemporary operation in the previous chapters? This second question is easier to answer than the first, but I discuss both in analyzing the encyclopedia’s process for deciding which articles ought to be deleted. This was the first of Wikipedia’s processes to be formalized, starting in 2002, and these same processes were also the first to be automated using bots and other bespoke code years after their initial formation. I show how such “clerk” bots were designed, developed, and deployed as Wikipedians worked out a broader project-wide shift towards a focus on quality and gatekeeping, which played out in conflicts between “inclusionists” and “deletionists.” This chapter extends scholarship by Wikipedia researchers who made similar arguments about this factional conflict in the history of Wikipedia (Kostakis, 2010; Reagle, 2010), but have largely limited their methods and cases to the more immediately ‘human’ aspects of the project. My contribution with this historical analysis is to make such an argument using an algorithms-in-the-making approach, which illustrates how code can be a way in which people work to imagine, articulate, negotiate, enact, and contest ideas about what a site like Wikipedia is and ought to be.

Section 3: Blockbots in Twitter (chapter 7 and 8)

This next section asks: are the issues and frameworks which have been useful for studying the delegation of articulation work to bespoke code limited to Wikipedia, or can similar kinds of issues be studied in similar kinds of bots that exist in other sites? I focus on the social networking site Twitter, run by the for-profit corporation Twitter, Inc. In this context, I examine bot-supported collective blocklists, or “blockbots,” which were created in response to coordinated harassment campaigns on Twitter. Like bots in Wikipedia, blockbots are bespoke algorithmic systems which have been designed, developed, and deployed by volunteers who are relatively independent from staff at Twitter, Inc. Blockbots support the work of curating lists of identified or suspected harassers and then automatically removing such accounts from subscribers’ experience of Twitter – functioning similarly to ad blocking from a user experience perspective. I discuss how blockbots are delegated articulation work around the task of curating a list of blockworthy accounts, supporting counterpublic groups as they seek to collectively moderate their own experiences of Twitter. I also discuss how this bot-enacted governance work plays out quite differently in Twitter than it does in Wikipedia, for a variety of reasons, which speaks to broader issues about the conditions in which algorithmic governance takes place.

Chapter 7: The blockbot tutorial-ethnography: a composite account

In chapter 7, I present a composite ethnographic account of blockbot development, drawing on the genre of a software development tutorial. Collective blocklists were not a feature built into Twitter by designers and developers at Twitter, Inc., but due to the particular configuration of other affordances implemented into the site’s server-side codebase, these bespoke blockbots were possible in Twitter in ways that they were not in other social networking sites (most notably Facebook). Yet in order for any given person who uses Twitter to be able to click a few buttons and have hundreds or even thousands of Twitter accounts identified as harassers removed from their experience of Twitter, many different kinds of systems must be aligned. This software tutorial uses second-person declarative statements, expository text, excerpts of source code, and increasingly-complex diagrams to place the reader in the position of a blockbot developer. The tutorial does not describe any particular individual, as it is assembled from many different cases I have seen in my studies of blockbots. I begin with automating a basic task in Twitter using the Application Programming Interface, then successively include the various heterogeneous elements and activities involved in designing, developing, and deploying a blockbot. In doing so, I take the reader through the same expansion of the analytical frame of the bot as I did in section 1, moving from the bot as a software agent powered by source code to a project of collective sensemaking, where decisions about what kind of articulation work the bot ought to perform raise much larger issues about what it means to curate a shared blocklist on Twitter. This account also extends beyond the “technical” work of programming, illustrating how developing and operating a blockbot involves tasks like community building, fundraising, and responding to threats from hostile opponents.
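As a concrete point of reference, here is a minimal, hypothetical sketch of the kind of basic task the tutorial starts from: a script that uses Twitter’s REST API (here through tweepy, one commonly used Python wrapper) to block every account on a shared list. The credentials, the blocklist URL, and the account names are placeholders, and the sketch is an illustration of the general pattern rather than the code of any actual blockbot, which involves far more machinery (subscriber databases, OAuth sign-in flows, appeals processes, rate-limit handling) than this fragment suggests.

    import tweepy    # a commonly used Python wrapper for Twitter's REST API
    import requests  # used here to fetch a hypothetical shared blocklist

    # Placeholder credentials for a subscriber's account, which authorizes the
    # blockbot to issue blocks on that person's behalf.
    auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
    auth.set_access_token("ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")
    api = tweepy.API(auth, wait_on_rate_limit=True)

    # A hypothetical, collectively curated list of screen names deemed blockworthy.
    blocklist = requests.get("https://blockbot.example.org/list.txt").text.split()

    for screen_name in blocklist:
        # Each call makes the same blocks/create request that clicking "block"
        # in the web interface would trigger for the subscriber's own account.
        api.create_block(screen_name=screen_name)
        print(f"Blocked @{screen_name}")

Even this toy version points to why blockbots were possible in Twitter and not in sites that prohibit third-party automation: everything above works through interfaces that are available to any registered third-party application.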


Chapter 8: Blockbots as projects of collective sensemaking and reconfiguration

After giving a rich description of the kind of work that is involved in developing and operating a blockbot, I move to more specific empirical cases. I present a historical account of the first major blockbot to operate in Twitter, showing how this project developed and changed over time. I analyze blockbots as information infrastructures bound up in a classification problem (Bowker & Star, 1999): deciding who is and is not a blockworthy individual. I focus on specific moments of controversy in which the bot’s developer reconfigured the software agent to differently perform the articulation work of supporting this classification problem. I then more briefly discuss a more recent case that took the blockbot model in a different direction, using a second automated software agent to generate a list of blockworthy accounts. Yet even in this seemingly fully-automated system, I show that there is still a substantial amount of human work involved. Most notably, the bot was reconfigured to include a human appeals board, which established its own deliberative processes for deciding which accounts ought to be whitelisted from the algorithmically-generated blocklist. Because such bots perform articulation work for a broader sociotechnical project, encapsulating an organization, the specific configuration of their source code can become a site for the negotiation of broader ideas about what Twitter as a social networking site ought to be. Ultimately, I argue that blockbots are important not only for the impacts they have in letting targets of harassment campaigns moderate their own experiences of Twitter; they can also provoke debates and discussions about how Twitter as a privately-owned public space ought to be governed and moderated.

Chapter 9: Conclusion

In the conclusion, I first give a summary of the overall argument of the dissertation, elaborating on the algorithms-in-the-making approach I took in studying bots in Wikipedia and Twitter. Next, I identify three themes that have emerged across these cases: proceduralization, the regulation of bots, and bots as speculative projects. Finally, I conclude with an essay on the ideological assumptions that are present in many of today’s major user-generated content sites, including not only Wikipedia and Twitter, but sites like reddit as well. I stress that we must not forget that bot developers are volunteers who support moderation practices that benefit the platform’s owner-operators by keeping the site orderly, but in ways that neither cost the owner-operators money nor open them up to critiques of censorship or gatekeeping.


Section 1: Algorithms in the Making (chapter 2 and 3)

Chapter 2: Bespoke code and the bot multiple

1. Introduction

1.1 Defining and contextualizing bespoke code

In this chapter, I introduce and discuss the concept of bespoke code (Geiger, 2014), which I define as software code that runs alongside a centrally-hosted digitally-mediated environment, rather than code that is directly incorporated into server-side codebases and runs on the same servers that host a site. As a sensitizing concept (Blumer, 1954), bespoke code calls attention to how this increasingly prevalent type of software development can be used to extend, modify, or even subvert the functionality and operation of centrally-hosted digitally-mediated environments – particularly in ways that have typically been limited to the ‘server sovereigns’ who have an exclusive privilege to modify server-side code. Bespoke code includes the fully-automated bots I discuss throughout this dissertation in Wikipedia and Twitter, as well as third-party clients,18 third-party services,19 browser extensions,20 and mashups.21

The word “bespoke” traditionally describes highly-customized fashion, such as a bespoke suit; the Oxford English Dictionary defines it as “goods; ordered to be made, as distinguished from ready-made” (OED, 2013). I have seen sparse use of “bespoke code” in software development forums to refer to custom-made software based on client specifications, but I use the term differently, to identify and consolidate a large range of practices I have seen not just in Wikipedia, but in a variety of other contemporary software development environments. Like a made-to-order dinner jacket, “bespoke” indicates that this code is highly-customized and specifically written (and rewritten) to fit some already-existing entity. In my experience as a developer of Wikipedia and Twitter bots, as well as a user of many browser-based add-ons, scripts, and extensions that dramatically change the way I experience the web, this constant customization has emerged as one of the most salient commonalities between these distinct but linked software development practices. In the specific studies of bots (a subset of bespoke code) that I have conducted in Wikipedia and Twitter, my analytical focus on the bespokeness of bots surfaces several issues around the sociality and materiality of software.

In this chapter, I first introduce bespoke code generally to define and contextualize this concept, then give several examples of how bespoke code operates in Wikipedia, including vignettes relating my own experience as a bot developer in Wikipedia. The goal of this chapter is to establish bespokeness as the first of several core concepts involved in the algorithms-in-the-making approach to bots I take in this dissertation. I focus on bespokeness to show that what it means to ‘unpack’ or ‘open up the black box’ of algorithmic systems should not be thought of as just an exercise in parsing through server-side code – or even all the bespoke code that runs alongside it. Reading code can be quite revealing, but it requires a relational, networked, infrastructural understanding of code and how it is socio-materially situated in many different kinds of spaces.

18 Including ‘power tools’ like the Hootsuite tool used heavily by corporate social media teams, who use social networking sites in ways that are quite different from profiles inhabited by individuals (Geho & Dangelo, 2012)
19 Such as Google’s reCAPTCHA (von Ahn, Maurer, McMillen, Abraham, & Blum, 2008) or Facebook’s JavaScript like button that can be added to non-Facebook webpages (Gerlitz & Helmond, 2013)
20 Such as AdBlock, the Facebook Demetricator (Grosser, 2013), or Turkopticon (Irani & Silberman, 2013)
21 See (Wong & Hong, 2008)

1.2 Why study bespoke code?

Bespoke code as a mode of software development has emerged largely in response to software-as-a-service and cloud computing, in which software programs are increasingly run on servers instead of local machines. In this model, users typically access an interface via a web browser, as with Gmail and Google Docs in contrast to Microsoft Outlook and Word. These developments are also linked to the growing importance of web “platforms” (Gillespie, 2010), a term used to describe the vast set of services, programs, and media channels available through a single company’s shared infrastructure, like the interlinked ecosystems provided by Google or Facebook. Open source advocates including Richard Stallman have critiqued this model of software development as one in which “the server operator controls your computing” (Stallman, 2013), as it initially appears that the functionality and affordances of such software programs are even more locked down than in traditional closed-source programs, because the code does not even run on the user’s computer.

There has been substantial recent scholarship, commentary, and activism about various algorithmic systems in these platforms in a growing topic area focused on “the politics of algorithms” (Barocas et al., 2013). One common concern is with how content in centrally-hosted software services is filtered, ranked, indexed, and moderated by algorithmic systems – such as the many controversies and debates over Facebook’s filtered news feed. Understandably, much of this inquiry is focused on code that is designed, developed, and deployed by staff who work at the companies that own and operate such sites, who are assumed to have the exclusive authority to write the “code is law” (Lessig, 1999) that structures interactions on the site. For example, when e-mail clients were generally standalone desktop programs, spam filtering was typically a feature that ran inside a client and could be highly configurable by end-users. In contrast, many contemporary centrally-hosted platforms (like those run by Google or Facebook) filter content in ways that are typically “black boxed” to end-users. As those who advocate for “algorithmic accountability” (e.g. Diakopoulos, 2015) argue, even those with advanced expertise in software development may not be able to know how their own experiences of a site are being curated on their behalf, much less able to change how such curation of their news feeds or search results takes place.

Bespoke code complicates this situation, as bots, browser extensions, and third-party tools are designed, developed, and deployed somewhat independently from a site’s servers and the people who own and administer them. Existing literature on individual cases of bespoke code emphasizes how such software can dramatically depart from (and even subvert) the design decisions, assumptions, and values of those who develop a server-side platform. For example, Turkopticon (Irani & Silberman, 2013) is a bespoke browser extension for the Amazon Mechanical Turk crowdsourcing service. Amazon has designed the site in such a way that employers are able to rate and review workers, but such an affordance is not reciprocated for workers to review employers. With the Turkopticon browser extension, workers are able to rate and review employers, with these ratings aggregated on a third-party server and displayed by the browser extension. The Facebook Demetricator (Grosser, 2014) is another bespoke browser extension, one that removes all quantified counts from the social networking site, such as the number of replies or likes to posts or the number of friends a user has. As Grosser argues, the program is an explicit critique-in-practice of the ways in which “quantification prescribes social interaction on Facebook” (p. 1). Through such bespoke code, software developers who do not have the ability to change how a software program operates are nevertheless able to extend or even subvert the assumptions that are embedded into server-side codebases.

1.3 IRC bots and the delegation of governance work

One of the best historical cases of how bespoke code can transform the operation of a centrally-hosted digitally-mediated environment is the set of bots (or automated software agents) that have long been present in Internet Relay Chat (IRC) servers (Latzko-Toth, 2000). As Latzko-Toth argues, when IRC was initially released, both the IRC server software and the server-client protocol were notoriously thin in terms of the features and affordances available to users and administrators. IRC was heavily extended through the development of bots, most notably adding administration and moderation features. A standard user account operated by an automated software agent could be made the administrator (or ‘op’) of an individual chat channel or a set of channels in an IRC network, and could then enforce particular algorithmically-defined rules. Such a feature was not supported in IRC servers in the way it was supported in the server-side software used to run later chat rooms – like the algorithmically-administered AOL chat rooms Lessig discussed (Lessig, 1999). Bots also provided various services to IRC users, including responding to on-demand reference requests or linking a chat room with another kind of online environment. During the peak of IRC’s popularity, bots were part of the fabric of average, everyday life in many IRC channels, with struggles over norms, moderation, and leadership playing out in part through such automated software agents.

One of the more humorous examples of how bots were used to extend the functionality of IRC can be found in one of the top-rated transcripts in the bash.org collection of IRC lore. An IRC user with the username Abstruse enters a channel for discussing Christianity, which uses two bots: Word_of_God, which will quote any bible verse on request (among other tasks), and an administrative bot that will kick any user who breaks certain rules, including a prohibition on swearing. Abstruse requests that Word_of_God quote a bible verse containing the word “ass,” which leads to Word_of_God being immediately banned by the administrator bot. This demonstrates two aspects of bots as bespoke code: the extension of a digitally-mediated environment to support a new feature (quoting bible verses on request) and the delegation of governance work to an automated agent (banning users who swear).

The humorous intersection between these two issues speaks to the non-traditional way in which these features are implemented. Bots operate through user accounts rather than server-side code, and those accounts function in the server-side software in the same way that human users’ accounts do. Unlike a moderation feature incorporated into server-side code, both the bible-quoting and anti-swearing features would stop operating if the computer running these bots crashed, was disconnected from the IRC server, or if another administrator kicked the account. This means that with bot-based bespoke code, the ability to block or ban a human’s user account also becomes an ability to turn off certain kinds of features – an ability more traditionally limited to ‘server sovereigns.’
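To make these mechanics concrete, the following is a minimal sketch (in Python, with a hypothetical server, channel, and word list) of how such a bot operates purely as an ordinary IRC client: it registers a user account, joins a channel, and enforces a no-swearing rule by issuing KICK commands, which only succeed if human operators have granted the bot’s account ‘op’ status. A real bot would add the reconnection and error handling omitted here.

    import socket

    SERVER, PORT = "irc.example.net", 6667     # hypothetical IRC network
    NICK, CHANNEL = "ModBot", "#bibletalk"     # hypothetical bot account and channel
    BANNED_WORDS = {"ass"}                     # hypothetical rule the bot enforces

    sock = socket.create_connection((SERVER, PORT))

    def send(line):
        sock.sendall((line + "\r\n").encode("utf-8"))

    # Register like any other client: the bot is just a user account.
    # (Most servers queue the JOIN until registration completes; a robust bot
    # would wait for the 001 welcome message before joining.)
    send(f"NICK {NICK}")
    send(f"USER {NICK} 0 * :A bespoke moderation bot")
    send(f"JOIN {CHANNEL}")

    buffer = ""
    while True:
        buffer += sock.recv(4096).decode("utf-8", errors="ignore")
        lines = buffer.split("\r\n")
        buffer = lines.pop()                   # keep any incomplete trailing line
        for line in lines:
            if line.startswith("PING"):        # keepalive required by the protocol
                send("PONG " + line.split(" ", 1)[1])
            elif " PRIVMSG " in line:          # a message sent to the channel
                prefix, _, rest = line.partition(" PRIVMSG ")
                sender = prefix.lstrip(":").split("!")[0]
                message = rest.partition(" :")[2].lower()
                if any(word in message.split() for word in BANNED_WORDS):
                    # Enforcement works only because human ops gave this account op
                    # status; if the script crashes, the 'feature' simply vanishes.
                    send(f"KICK {CHANNEL} {sender} :Please keep the language clean")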

2. Bespoke code in Wikipedia

2.1 Infrastructural inversions

In Wikipedia, bespoke code extends almost every aspect of the encyclopedia project, to such an extent that it is as much a part of Wikipedia’s infrastructure as the core MediaWiki platform itself. I have found that Wikipedians rely on this bespoke code to such an extent that it often becomes invisible and sinks into the background – as scholars of infrastructure and philosophers of technology have long noted (Star & Ruhleder, 1996; Suchman, 1995; Heidegger, 1993). Such code is bound up in practices and issues across almost all areas of the encyclopedia project, including extending the mark-up language in which editors write articles, curating the discussions that take place about article content, reviewing contributions and reverting those identified as spam or vandalism, and supporting the behind-the-scenes administrative processes in which veteran Wikipedians set and enforce policies that govern participation in the project. Infrastructure typically only becomes visible through what Bowker and Star call an “infrastructural inversion” (Bowker & Star, 1999), a kind of gestalt shift in which infrastructures are brought to the foreground – something that usually occurs in such environments only when they break down, malfunction, or become contested. As I discuss in the next section, the challenges that I and other MediaWiki sysadmins have faced when seeking to host our own Wikipedia-like sites are one way in which such infrastructures can be inverted and brought from background to foreground.

2.1.1 What is code in Wikipedia?

A standard installation of MediaWiki, the software platform powering Wikipedia, has over 600,000 lines of code in about 900 files, mostly written in PHP and released under an open source license. It is easy for a systems administrator to configure and install their own instance of MediaWiki, comparable to other platforms like WordPress, Drupal, or Joomla. In a matter of minutes, a seasoned sysadmin can set up their own wiki and have a site that will look and feel like Wikipedia, at least on the surface. There will be wiki pages that users can collaboratively edit. The history of a page’s revisions can be accessed, and undesirable changes rolled back. Editors can communicate with each other using talk pages. Administrators can protect pages from editing and block problematic users. It’ll be a wiki, the kind of site that has come to stand in for an entire revolution in content creation, management, economics, and politics (for a critical analysis of this discourse, see Van Dijck & Nieborg, 2009 and Tkacz, 2013).

However, as I and many other founders of MediaWiki-based sites quickly learn, many of the features and functionalities that are taken for granted in Wikipedia are nowhere to be found in a “stock” installation of MediaWiki. While Wikipedia does run on a version of MediaWiki, the version it runs is a highly-customized one, relying on a substantial amount of software code that does not even run on the same servers operated by the Wikimedia Foundation to host Wikipedia as a MediaWiki instance. By my estimate, there are at least ten times the number of lines of code in software that runs alongside the server-side platform hosting Wikipedia, compared to the code in a stock version of MediaWiki. This code – some of which fundamentally changes how a wiki operates as a wiki – takes many forms, including third-party services, template scripts, user scripts, standalone tools, browser extensions, and fully-automated bots. This code is written in a multitude of programming languages, coded in a variety of environments, and is often executed on computers that are relatively independent from those run by the Wikimedia Foundation to host Wikipedia. Without this ‘extra’ code, those on a new wiki can find themselves quickly overrun by spammers, as spam filtering is one of the many features supported by highly-customizable bespoke code. Editors of stock MediaWiki sites are also not even able to leave the much celebrated “[citation needed]” tag in articles – an artifact and practice that has become a cultural icon of wiki-based collaboration.

One of the most notorious cases of breakdown for new wiki sysadmins is “infoboxes.” Wikipedians have long used bespoke code to support these infobox templates, which produce compact summary boxes typically placed in the top right of articles. They are a core feature Wikipedians use to coordinate editorial work (Ford, 2015). New sysadmins often come to forums, Q&A sites, and help desks to ask for help setting up their own wikis, and many want to know how to get Wikipedia-style infoboxes on their site. People who start their own wikis often do so because they have previously edited Wikipedia and found that they wanted a similar kind of way to support the work of collaboratively curating documents or other texts. Yet the layers of code which must be assembled and aligned in particular ways to get even a single infobox to work in a non-Wikipedia wiki go far beyond the already complex 600,000 lines of code in stock MediaWiki. As one MediaWiki sysadmin related his experience with infoboxes in a blog post:

If you have your own MediaWiki instance, you’ve probably thought they’d [infoboxes] be a nice thing to have, so maybe you copy and pasted the code from Wikipedia and then were surprised when it didn’t just magically work. Turns out that the infobox stuff is part of MediaWiki’s extensive Templating system, so first of all you need the templates. Sounds easy, right?

Well, no. You don’t just flip a switch or download a file, and when you do a search you might find this article which details a process that it says might take 60-90 minutes. I started looking into it and quickly got lost; you basically need to create a billion different Templates and do all sorts of weird stuff to get it to work. 22

2.1.2 Contrasting stock MediaWiki and Wikipedia

I demonstrate this visually in figures 1 and 2, which contrast stock MediaWiki with the version of MediaWiki that Wikipedia runs. To produce these figures, I set up a new MediaWiki site on my personal web server using the latest stock version, then copied and pasted the text from the Wikipedia article on “Wikipedia” to a page on the new site. Such a comparison most immediately shows how Wikipedia articles, as Wikipedians have written them, appear broken without this typically invisible infrastructure. Furthermore, the new site is not just ‘broken’ – it functions in particular ways but not others. A comparison of these two sites is a rich way to show how particular aspects of the more generic and general-purpose MediaWiki platform have been specifically modified by Wikipedians, as they worked out what it meant to use a wiki to host a project to collaboratively author and curate a general-purpose encyclopedia.

22 https://web.archive.org/web/20150606181630/http://trog.qgl.org/20110815/setting-up-infobox-templates-in- /

Figure 1: The Wikipedia article on “Wikipedia,” logged in under my account (material © Wikipedia contributors, freely licensed under Creative Commons BY-SA license)

Figure 2: The same article, copied to a page on a stock version of MediaWiki (material © Wikipedia contributors, freely licensed under Creative Commons BY-SA license)

The tension between what it means for Wikipedia to be both a wiki- (a collaboratively authored set of hyperlinked documents) and a -pedia (a high quality, general purpose reference work) has long been identified as a core tension by researchers and Wikipedians who have watched the encyclopedia project develop over the decade and a half of its existence (Ayres, Matthews, & Yates, 2008; Reagle, 2010; Tkacz, 2015). Such tensions play out more visibly and explicitly in the many ongoing conflicts over the content of encyclopedia articles, as well as in conflicts over the project’s norms and meta-norms. Many scholars have examined these processes of norm formation and collective sensemaking, studying how Wikipedians debate the definition of a “reliable source” or the processes for determining whether articles should be temporarily protected from editing. Beyond these more immediately visible conflicts, the tension between Wikipedia as a wiki- and a -pedia can also be seen in how Wikipedians have modified what a wiki is and how it operates as they have developed a particular understanding of what a wiki-based encyclopedia is and ought to be. Each of the features that are made visible in this comparison had to be specifically added into Wikipedia’s instance of MediaWiki, as Wikipedians worked out their own particular understandings of how they wanted wiki-based collaboration to take place. Behind each feature is a moment in which someone decided to extend MediaWiki to support something like infoboxes, automatically formatted references, article quality ratings, editor-to-editor interactions, or IPA pronunciation notation. Sometimes these extensions and modifications were made through bespoke code, while other times they were made more directly through server-side code.

The most obvious difference between the two sites is the absence of formatting, layout, and reference elements in my site, like infoboxes and references. Such features are implemented in templates, a MediaWiki feature in which editors write customized scripts in special wiki pages on the site, which are called as functions when left in the text of articles. A 2007 study found that there were over 100,000 templates in the English-language Wikipedia, and a query I ran on the Wikimedia Foundation’s analytics cluster in July 2015 found over 630,000 existing templates, with over 535,000 of them used in at least one wiki page. Templates that do not exist in my wiki are displayed as red links beginning with “Template:”, and there are also many red links to articles that do not exist on my wiki.
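To illustrate the mechanism, here is a deliberately simplified, hypothetical sketch of how such a template works in MediaWiki markup: a page in the Template namespace defines the layout and declares named parameters (the triple-brace syntax, with an optional default after a pipe), and an article invokes it with double braces. Wikipedia’s actual infobox templates are vastly more elaborate, layering nested templates, parser functions, and styling.

    Contents of a hypothetical page named "Template:Infobox town":

    {| class="infobox"
    ! colspan="2" | {{{name}}}
    |-
    | Population || {{{population|unknown}}}
    |-
    | Country || {{{country|unknown}}}
    |}

    Invocation left in the text of an article:

    {{Infobox town|name=Exampleville|population=10,000|country=Freedonia}}

If the template page does not exist on a given wiki, as on my stock installation, the invocation simply renders as a red link to “Template:Infobox town.”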

Beyond the article text, there are also a number of differences throughout the interface. The left-hand sidebar supports far more features and tools in Wikipedia’s version, such as the “Cite this page” feature not supported in my site. The top of the page indicates that my account has received a new message,23 which in Wikipedia’s version of the site appears in a streamlined notifications bar rather than the blocky yellow banner – a project called “Flow” that seeks to improve communication between Wikipedians. The Wikipedia version has a banner advertising Wikimania 2015, the annual conference/convention of Wikipedians. This is presented through a feature called CentralNotice that supports targeted banners, which is also used for the annual fundraiser. The “Cite this page” feature, Flow, and CentralNotice are PHP extensions that are not a part of “stock” MediaWiki, but do require server-side access to install.

23 This is a genuine unread message from another editor in Wikipedia, but I sent myself a message in my own site to demonstrate the difference.

There are also the “More”, “Page”, and “TW” tabs at the top of the page, which are bespoke code – JavaScript-based browser extensions – that I have installed to help me perform complex tasks within Wikipedia. Such tools are not enabled by default for all registered Wikipedia editors, as they must be specifically installed. One of these is the “Copyright vio detector” (Figure 4), which queries search engines to try to find versions of the page in other pages on the web. Many of these tools are specific to Wikipedia’s particular specialized processes and would not make sense in other wikis, such as the ability to nominate the page for deletion under Wikipedia’s Articles for Deletion process. Finally, the clock in the top-right corner of the Wikipedia version is a live-updating clock in Coordinated Universal Time (UTC) / Greenwich Mean Time (GMT), a tool that helps Wikipedians deal with the fact that timestamps for discussions in MediaWiki are in UTC, with no easy way for the server-side software to convert these to the local timezone of the editor.24

Figure 3: The sub-menus for one of the power tools (material © Wikipedia contributors, freely licensed under Creative Commons BY-SA license)

2.2 Bots in Wikipedia

The previous comparison between stock MediaWiki and Wikipedia reveals some backgrounded infrastructural elements in Wikipedia that have become a part of the project despite being implemented through bespoke code, rather than server-side code. What cannot be seen from this comparison is one of the biggest cases where bespoke code plays out in MediaWiki versus Wikipedia: bots, or automated software agents that operate in the capacity of a user account, making requests to a server in the same way that a web browser does. People who do not have access to modify the source code hosting a site are developing and deploying bots to automate particular tasks in and across a variety of sites, and not just in Wikipedia. One 2013 industry report estimated that 61.5% of all web traffic was from “non-human bots” (Zeifman, 2013). However, the way bots are deployed as bespoke code in Wikipedia differs from the longstanding and dominant use of automated software agents as scrapers by search engines to index the web, or the malicious bots that typically pose as humans in order to send spam, get people to divulge personal information, or attempt to penetrate private systems (Boshmaf, Muslukhov, Beznosov, & Ripeanu, 2011a). Such bots also differ from the “chatbots” or “chatterbots” that seek to mimic human interaction using natural language processing for a variety of purposes, including the famous therapist bot ELIZA (Weizenbaum, 1966) and the many automated personal assistants that are often imagined as “butlers” (Suchman, 2007).

24 In an aspect of Wikipedia that will become highly relevant for a different reason in the next chapter, such timestamps cannot be easily automatically converted to the user’s local timezone. Wikipedia’s discussion spaces are the same kind of flat text pages that anyone can edit as encyclopedia articles. This means that commenting in a discussion space involves editing the page; users must manually indent to indicate a threaded reply, for example. It also means that users must manually sign their comments to leave their username and timestamp so that others can know when the comment was made and who made it without having to look through the revision history. MediaWiki supports a special keyword, where leaving four tildes in a row (“~~~~”) is converted to text containing the user’s signature and a timestamp in UTC.

Bots are ubiquitous in Wikipedia and have long been explicitly delegated key editorial and administrative tasks involved in the encyclopedia project. There are 410 user accounts25 that have been authorized to run 1,848 different automated tasks26 on the English-language Wikipedia as of 20 April 2015. The work that these bots have been delegated extends to many aspects of the project, such as writing new encyclopedia articles – like RamBot, which doubled the number of articles in Wikipedia in 2002 when it created an article for every U.S. city and town using public domain census data (Kennedy, 2010; Lih, 2009). However, bots also perform less visible tasks behind the scenes of the encyclopedia project, as a number of researchers have investigated (Geiger, 2009, 2011; Niederer & Van Dijck, 2010; Halfaker & Riedl, 2012; Müller-Birn, Dobusch, & Herbsleb, 2013). Bots perform tasks related to information organization, coordination work, articulation work, quality control work, dispute resolution, normative enforcement, newcomer socialization, user interface enhancements, and administrative processes. Without bots, Wikipedia would look like a quite different place, due to the roles that these bots play in the continual and routine operation of the encyclopedia project.
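As a concrete illustration of what operating ‘in the capacity of a user account’ looks like in practice, here is a minimal, hypothetical sketch of a bot edit using Pywikibot, one widely used Python framework for MediaWiki bots (many Wikipedia bots use it, though others are custom scripts in other languages). It assumes a configured user-config.py with credentials for a placeholder account named ExampleBot; an actual bot on the English-language Wikipedia would also need approval for its task before editing beyond its own userspace.

    import pywikibot  # a framework that wraps the same web API a browser-based editor uses

    # Connect to the English-language Wikipedia as whatever account is
    # configured in user-config.py (assumed here to be "ExampleBot").
    site = pywikibot.Site("en", "wikipedia")

    # Fetch the page's wikitext, modify it, and save it with an edit summary,
    # the same cycle a human editor performs through the editing interface.
    page = pywikibot.Page(site, "User:ExampleBot/sandbox")
    page.text = page.text + "\n\nThis edit was made by a bot for demonstration purposes."
    page.save(summary="Bot: demonstration edit in the operator's userspace")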

Like the bespoke code that supports infoboxes or references, bots can profoundly extend how the site supports certain kinds of interactions, having the kind of governmental “code is law” (Lessig, 1999) consequences that are traditionally only available to the sysadmins who have access to the server-side code hosting the site. Many bots in Wikipedia perform the same kind of content moderation work seen in the example of the anti-swearing IRC bot, but with a key difference. Because many actions in Wikipedia involve editing pages that “anyone can edit,” bots in Wikipedia can be delegated specific kinds of governance work without needing to even have the same kind of administrative account privileges needed to block an account. For example, anti-spam bots review edits in near real time and automatically revert edits identified as spam, using the same ability to edit articles that is available to human editors. Such automated moderation of contributions can be highly contentious, and there have been countless conflicts over what encyclopedia articles ought to look like or how editors ought to interact with each other that play out through bots. Furthermore, bots extend the functionality and affordances of a site in ways that go beyond automated moderation of content, such as the “clerk” bots that turn standard wiki pages into sophisticated queues for making particular kinds of commonly recurring decisions.

25 List at http://en.wikipedia.org/wiki/Special:ListUsers/bot
26 List at http://en.wikipedia.org/wiki/Category:Approved_Wikipedia_bot_requests_for_approval
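To give a sense of how such patrolling works at the level of the web API, here is a minimal, read-only sketch in Python that polls the English-language Wikipedia’s recent changes feed and flags edits whose summaries contain hypothetical suspicious keywords. It illustrates the general pattern rather than the implementation of any particular anti-spam bot, which would apply far more sophisticated heuristics and would then revert through an authenticated edit request (requiring a login and an edit token).

    import requests  # the bot uses the same public web API that browsers and power tools use

    API_URL = "https://en.wikipedia.org/w/api.php"
    SUSPICIOUS_WORDS = {"casino", "cheap pills"}  # hypothetical keywords, for illustration only

    # Ask the MediaWiki API for the most recent edits, excluding those flagged as bot edits.
    params = {
        "action": "query",
        "list": "recentchanges",
        "rcprop": "title|user|comment",
        "rcshow": "!bot",
        "rclimit": 25,
        "format": "json",
    }
    changes = requests.get(API_URL, params=params).json()["query"]["recentchanges"]

    for change in changes:
        comment = change.get("comment", "").lower()
        if any(word in comment for word in SUSPICIOUS_WORDS):
            # A deployed bot would revert here with an authenticated edit request;
            # this sketch only reports the candidate.
            print("Possible spam edit to", change["title"], "by", change["user"])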

As I argue throughout this dissertation, bots and other bespoke code play such a role in how Wikipedia operates as a site of cultural production that many issues in the site cannot be fully studied without understanding this different kind of software development. On one level, bots and bespoke code are important because of the substantial impacts they have on both the content of encyclopedia articles and how Wikipedians interact with each other. Such code enacts, supports, structures, and enforces particular understandings of what an encyclopedia article ought to look like and how editors ought to collaborate when editing an encyclopedia. Yet at a more fundamental level, bots should not be seen as mere force multipliers that extend the will of individual bot developers – as bots are sometimes characterized in rise-of-the-machine narratives (which I further discuss and critique in the next section). Wikipedia’s bespoke software developers – who write and deploy code that runs outside of the standard server-side platform – have been an active part of the Wikipedia project almost since its inception. While individual developers and bots have come and gone over the almost fifteen years of Wikipedia’s existence, bots and bespoke code have long been a core mode in which Wikipedians collectively work out what kinds of tasks they think ought to be done. Decisions about the tasks that bots perform are rarely made in isolation. The ways in which bots have been delegated tasks at any given point in Wikipedia’s history speak to the broader issues and concerns facing the encyclopedia project as a whole. Debates about national varieties of English play out in debates about spellchecking bots, debates between the factions of inclusionists and deletionists play out in debates about bots that help curate contributions, and debates about participation and the demographic gaps in Wikipedia’s contributor base play out in debates about bots that interact with newcomers.

3. Theoretical issues around studying bespoke code

3.1. Motivation: Rethinking “Who What is in Control of Wikipedia?”

In this next section, I first discuss the importance of taking such an in-the-making approach by briefly showing how I came to shift my own research agenda around bots in Wikipedia. When I first started researching Wikipedia in 2006, I didn’t set out to find bots, bespoke code, or even issues around software code at all. Like many others, I was interested in understanding how Wikipedia could possibly work, given that it seemed to lack the traditional organizational structures that most other institutions of knowledge production and content curation had. Yet alongside all the norms, policies, roles, committees, discourses, routines, and bureaucracies that made up what has been called “the hidden order of Wikipedia” (Wattenberg et al., 2007), I encountered something else: hundreds of automated software agents, diligently editing behind the scenes to make some seemingly miraculous aspect of Wikipedia work.

Unlike the malicious bots used for spamming, identity theft, or denial of service attacks that typically masquerade as humans (Boshmaf, Muslukhov, Beznosov, & Ripeanu, 2011b), these bots were developed explicitly as bots, by Wikipedians and for Wikipedians, doing work that was typically recognized as important for the project’s goal of writing and curating a wiki-based encyclopedia. Even in the cases in which these bots performed a controversial or unintentionally-destructive act, the response by both other bot developers and non-developers in the encyclopedia project was generally swift and severe. While much has been written about the norms, practices, motivations, and beliefs of the more exclusively human members of the community27 of editors responsible for writing Wikipedia’s articles (Kuznetsov, 2006; Oreg & Nov, 2008; van Liere & Fung, 2011; Welser et al., 2011; Yang & Lai, 2010), bots and their developers are just as much a part of Wikipedia as those who worked to add scholarly references to articles, upload photos of World War II era aircraft, remove errors and spam, or resolve heated content disputes. In fact, bots often assist heavily in these tasks, albeit in ways that function more as “clerks” – as some of their developers call them.

When I began to publish my first findings on bots in 2009, I was largely limited to documenting their prevalence using statistical methods and a handful of striking examples. Two findings – that bots made 16.33% of all edits to the English-language Wikipedia and that 22 of the top 30 editors by number of edits were bots (Geiger, 2009) – were quickly taken up by many people writing about Wikipedia in both academic and popular venues. While I was grateful for the unexpected attention these findings received, I soon realized that a strong narrative was being crafted around them, one that used these statistics and cases to argue that bots are 'taking over' Wikipedia. This should not be surprising: automation is often cast as a zero-sum game in which 'the machine' steals agency from humans, which is alternatively celebrated or lamented. This positioning of artificial agents against human agents is an age-old trope found in morality tales and social commentaries across cultures and generations. It can be found in the tragic myths of golems and Shelley's Frankenstein, in the satire of Chaplin's Modern Times, and in the dystopic AI-dominated futures of Terminator and The Matrix (Covino, 1996; Henry, 2014; Waldby, 2004).

I admit that I played into this trope as well, finding it to be a compelling rhetorical device. I opened up a number of presentations on Wikipedia bots by initially focusing on the question “Who is in control of Wikipedia?” – referring to questions many people were asking about the demographic makeup of Wikipedia’s contributor base, either in aggregate or in terms of who held positions of authority or power in the project – then striking out “who” and replacing it with “what.”

27 In addition to being a sociological term with a long lineage, the term "community" is also an emic term used by Wikipedians to collectively describe themselves (and by corollary, exclude those who are not considered members). The two are linked for those who use these terms: Wikipedians are members of the community, and the community is made up of Wikipedians. The meta page on "Wikipedia:Community" has long redirected to the meta-level page on "Wikipedia:Wikipedians." I have found that Wikipedians frequently position membership in "the community" as demanding that one edit enwiki in the best interests of "the encyclopedia," placing the goals of having a high quality, freely-licensed, and collectively-edited encyclopedia above one's own personal points of view or self-interest. Wikipedians have several heuristics to help quickly judge whether a user account editing enwiki belongs to a Wikipedian or not, often treating those who are not considered Wikipedians differently.

The narrative around how bots in Wikipedia have taken over the work of cultural production from humans continues today, with the 'rise of bots' alternatively celebrated or lamented. Ray Kurzweil, famous for his optimistic advocacy of artificial intelligence, wrote a blog post in 2014 titled "Are bots taking over Wikipedia?", which celebrated the bots for "pick[ing] up the slack" as the encyclopedia allegedly grew too large for humans to maintain it. A 2014 post on TheDailyDot was titled "Bots have conquered Wikipedia – and that's a good thing" (Sankin, 2014), focusing on anti-spam and counter-vandalism bots. Other characterizations of bots in and out of Wikipedia are far more negative, such as a Guardian article titled "How bots are taking over the world" (O'Hara & Mason, 2012). The article cites my own research on the prevalence of bots in Wikipedia as part of a broader review of how "life is being manipulated by internet algorithms" – from phishing bots seeking to steal our personal information, to financial trading algorithms, to the ranking and filtering of stories in Facebook's news feed and Google's search engine results, to highly-automated governmental surveillance programs. The article closes with a common conclusion about the loss of control, in which a sharp divide is drawn between bots and people. O'Hara and Mason discuss this through the case of someone who had their identity stolen by (someone who ran) a botnet:

When Ronson [who had his identity stolen] looks for the people trying to control the internet, he's looking in the right place, but at the wrong species. The internet is increasingly becoming a post-user environment, regulated by something much more uncontrollable than humans. (O’Hara & Mason, 2012)

3.2: Theorizing the separation and co-constitution of humans and automata

Both critical and celebratory versions of this narrative often position the emergence of bots in relation to the inadequacies of humans: for champions like Kurzweil, bots are doing work in Wikipedia that humans can't or won't do, while for critics like O'Hara and Mason, bots have taken control away from humans. This characterization exists in scholarship and commentary about algorithmic systems outside of Wikipedia as well, such as in debates over the automated filtering of Facebook's news feed (see Bakshy, Messing, and Adamic 2015 versus Tufekci 2015) or surveillance-based predictive policing systems (Doctorow, 2014). Whether positive or negative, there is often an assumed separation between humans and algorithmic agents, which ignores the roles of the people who design, develop, and deploy them – a group that often includes but extends beyond the bot's developer. Lucy Suchman (2007) argues that this separation between humans and artificial intelligence is not only central to discourses of artificial intelligence, but bound up in a broader issue around the erasure of work: "discourses of information technology have tended to erase the human labor that continues to be involved in technological production, implementation, maintenance, and the like" (217).

In seeing the technological artefact or automaton as something separate from those who develop, design, build, and maintain it, we can easily slip into a mindset where the human work involved is displaced and removed from view – at least for a certain set of people. For example, a photocopier is a way to automatically reproduce documents far faster than the typing pools that were major organizational units in early 20th century corporations. Like all technology, photocopiers must be designed, built, maintained, and repaired, which involves a substantial amount of work – as Julian Orr's ethnography of Xerox technicians in the 1980s illustrates (Orr, 1996). What the photocopier enables is as much a delegation of work from inside of the company to outside the company as it is a delegation of work from people to machines. With Xerox's business model, a substantial amount of the work needed to reproduce documents could be outsourced to an external firm. Orr notes how one of Xerox's priorities for achieving market dominance was in letting companies treat their photocopiers as a black box. Companies didn't buy a photocopier, they leased it, and leases included access to an army of roving technicians, who provided an unprecedented amount of support to keep these machines operating. The total work required to reproduce a document did likely decrease from typing pools to photocopiers – but rather than eliminating human work, it displaced it in a way that made such work less of a concern for those in offices who relied on the products of that work.

Similarly, Cowan argues in More Work for Mother (Cowan, 1983) that the many mechanical and electrical household technologies which achieved market dominance in the 20th century were long celebrated as labor-saving devices, freeing up time for married women to become part of a growing leisure class. However, she found that as technologies like the washing machine and the vacuum were introduced, married women actually spent slightly more time working in their homes. Cowan argues that such women were certainly more productive and efficient, but instead of keeping up with the same standards of living and using the remaining time for leisure, they were expected to keep up with higher and higher standards. The utopian narrative that labor-saving devices will lead to the expansion of the leisure class – and even a post-scarcity economy in which no one has to work – is heard in almost every major wave of technological innovation. Yet such visions of the future are continually empirically disproven, and Suchman quotes Chasin (1995) in arguing that they serve to silently reinforce “the idea that a service class of being(s) is proper and even necessary,” which is the idea “upon which the myth of a constantly expanding middle class depends” (Suchman 2007, p. 93).

3.3 Bespoke code as a way to call attention to the materiality of software

Stacey and Suchman (2012) argue that when thinking about the sociality of automata, it is not productive to draw stark divisions between humans and artificial agents. Instead, Stacey and Suchman draw on developments in science and technology studies (STS), particularly work extending Donna Haraway’s theorizing of the cyborg (Haraway, 1991) as a metaphor for understanding the co-mingling of agency between humans and artifacts.

The aim of a critical engagement with contemporary projects in making autonomous machines is not to retain some special qualities to the human, nor to oppose the prospects of a future filled with animated things. Recent writings in STS, rather, emphasize the inseparability of the human from the artifactual, and render the relation as a more irreducible and intimate one. With respect to automata, the approach is, first, to slow down the rhetorics of life-like machines and to attend closely to the material practices of their realization …to the contingent and ongoing labours involved in sustaining the agencies of things. (Stacey & Suchman, 2012)


Stacey and Suchman's call to focus on the materiality, contingency, and continual labor involved in sustaining software is aligned with a long set of approaches dedicated to the study of technology and society "in the making." Such approaches are generally used to respond to depictions of technology that champion or critique machines for having the same agency as humans, yet pass over all the human work that is needed for such machines to have such agency. This has certainly been the case not only with bots in Wikipedia, but also in broader discourses around "the politics of algorithms," in which algorithmic agents or systems that make crucial decisions or judgments are sometimes discussed as if they are autonomous rogue agents, governed not by humans but by their own computational logic run amok. In response, scholars like Gillespie (2014) and Seaver (2013) advocate critical social science scholarship into algorithmic systems, but argue against this tendency to see such systems as emerging out of nowhere: "A sociological analysis must not conceive of algorithms as abstract, technical achievements, but must unpack the warm human and institutional choices that lie behind these cold mechanisms" (Gillespie, 2014). Such a line of inquiry has examined not only specific abstracted algorithmic procedures, but also the concrete, historically-contingent ways in which those abstracted algorithmic procedures are designed, developed, and deployed in the world. This "in the making" approach to software extends the more widely accepted notions that it is important to study software development as a socio-cultural or organizational practice (e.g. Crowston & Howison, 2005; Mockus, Fielding, & Herbsleb, 2002) or to study the economic, political, social, cultural, psychological, and organizational impacts of software on our world (Kelty, 2008).

I further critique the assumed separation of humans and automata through focusing on bespoke code, which calls attention to a material distinction about where software code is run. Bespoke code runs separately from the server-side codebases hosting a digitally-mediated environment, which has consequences for how such software agents operate in a community that inhabits a centrally-hosted digital environment. Bespoke code is a vivid reminder that what software is as software should not be divorced from the conditions under which it is developed and deployed. This is a materialist argument, opposing the "trope of immateriality" (Blanchette, 2011: 3), a discourse that alternatively celebrates or laments the disembodied nature of information technology. As Blanchette reviews, the supposed immateriality of mediated technology is nothing new – it was used to describe the telegraph with the same celebratory rhetoric that can be found in contemporary commentary on new technologies. Yet such an assumption is important to interrogate, given the roles that such technologies play as infrastructure, supporting particular kinds of practices better than others. Blanchette cites Hayles in arguing that this trope, in which digital information is proclaimed to be "free from the material constraints that govern the material world" (Hayles, 1999: 13), is not just a casual metaphor of technologists. Rather, hard distinctions between "bits" and "atoms" (e.g. Negroponte, 1995) that imagine information technology as supporting "virtual communities" or "virtual organizations" can obscure the material conditions and labor which make it possible for people to "seamlessly" interact.

3.4 Code as law-in-the-making

Bespoke code provides a noticeably different set of cases for exploring how, as Lawrence Lessig famously argued, "code is law" (Lessig, 1999). Rather than code being law in the sense that software is a way a server sovereign enforces particular power relations on people who are subject to its rule, my studies of bots have given me a different angle on Lessig's phrase. The software code of the bots I studied was designed, developed, and deployed by people who did not have access to modify server-side code of these centrally-hosted environments. The code of bots was not law in the sense of a rigid set of already enacted rules that everyone must follow, but rather it was law more in the sense of public policy – a way of deciding high-level issues in the project. For example, spellchecking may seem like an obvious and uncontroversial task for a bot to perform, until the question is asked about what national variety of English will be used for the bot's dictionary. And this issue raised by spellchecking bots opens onto many high-level questions about what Wikipedia is as an encyclopedia and a global community of editors. I do not wish to imply that all of Wikipedia's high-level normative issues are resolved exclusively through bots. Rather, I emphasize that bots are one of many core ways in which Wikipedians make such decisions about what Wikipedia is and ought to be. Code (and descriptions of algorithms) is a more expressive medium in which people seek to build a common abstract understanding that represents their ideas about what kind of world they want to live in. This is a stark difference compared to the kind of code-as-domination that Lessig discusses – such as his argument that the code-as-law passed by server sovereigns is even more potentially totalitarian than the laws passed by traditional governments because developers can change the "laws of nature" (Lessig, 1999, p. 70).

The kinds of negotiations I continually observed between bot developers and others (including other bot developers, non-developer users, and server sovereigns) reminded me of the kind of gestalt shift Science and Technology Studies scholars often discuss between taking something as "ready-made" versus "in-the-making" (Callon, 1987; Latour, 1987; Shapin, 1992). In Science and Technology Studies, "in-the-making" approaches departed from the previously dominant theoretical literatures which emphasized the impacts of technology on society and showed how "artifacts have politics" (Winner, 1986) in their deployment, diffusion, and use. An in-the-making approach takes place before science or technology becomes the kind of stable, coherently existing entity where it can have this kind of 'billiard ball' impact on society. Studies using this approach often use ethnographic or historical methods to observe the development of science or technology as it is actively constructed in a particular context. This approach critiques competing literatures that present technological or scientific development as linear progress, instead emphasizing the highly contingent and often unpredictable paths that the development of technological systems and systems of knowledge production take. Scholars in this area take many strategies to emphasize the constructivist lesson that "things could have been otherwise" (Hacking, 2000), paying attention to the paths not taken (Bijker, 1995), rejection and refusal (Novek, 2002), beta versions (Neff & Stark, 2003), catastrophic failures (Vaughan, 1997), mundane failures (Latour, 1996), breakdowns (Star, 1999), repair (Jackson, 2014; Orr, 1996), invisible work (Star & Strauss, 1999), unintended uses (Oudshoorn & Pinch, 2003), and unintended users (Burrell, 2012). These are all windows into an understanding of the "practical politics" (Bowker & Star, 1999, p. 44) that are a core aspect of the design, development, and deployment of technological systems and systems of knowledge production.

3.5. Related literature on the materiality of software

Critiques of seeing code as the essence of software are far from new, and my focus on bespoke code draws on and synthesizes a wide range of scholarship. Scholars have examined software not only through its code and algorithmic routines, but through the practices and meetings of network operators (Mathew & Cheshire, 2010), the magnetic storage mechanisms of hard drives (Kirschenbaum, 2008), the copyright licenses of open source software communities (Kelty, 2008), clerical and support staff in datacenters (Ribes et al., 2013), the role of Internet cafes and social clubs in Ghana (Burrell, 2012) and hackerspaces in China (Lindtner & Li, 2012), and the roles of universities, businesses, and government agencies in Rio de Janeiro (Takhteyev, 2012) or the San Francisco Bay Area (Saxenian, 2006).

Furthermore, ever since Marx's socio-political analysis of factory machinery and other engines and artifacts of capitalism (Marx, 1973; cf. MacKenzie, 1996), scholars have interrogated the infrastructures, artifacts, and work practices that are taken for granted in a variety of social institutions: prisons (Foucault, 1977), museums and art galleries (Becker, 1982; Star & Griesemer, 1989), hospitals (Garfinkel, 1967a), scientific research (Latour & Woolgar, 1979; Shapin & Schaffer, 1985), public infrastructure (Winner, 1986), economic markets (Mackenzie, 2006), and organizations and firms (Orlikowski & Scott, 2008), to name a few. These institutions and ideologies are supported and configured in particular ways and not others through specific infrastructures, artifacts, people, and practices. Taking an in-the-making approach involves focusing on the role of these often taken-for-granted, behind-the-scenes apparatuses in making possible a particular version of the world in which we live. Just as the legal system is more than the text of laws and precedent (Latour, 2009), software systems are more than the text of code.

My argument is also related to a long line of scholarship on "virtual" or "online communities" which has critiqued the assumption that there is a strong boundary between the "online" and the "offline" world (Baym, 1995; Jurgenson, 2012). To problematize this tendency, many scholars are self-reflexively using different approaches to counter these metaphors of bounded spatiality. These include Burrell's reconceptualization of the "fieldsite as network" (Burrell, 2009) in her ethnography of youth and Internet cafes in Ghana and Beaulieu's shifting from "co-location to co-presence" (Beaulieu, 2010) to better capture the range of multi-faceted interactions taking place in a given organization or community. Similarly, scholars in organizational studies have critiqued the idea of a "virtual organization" (Fiol & O'Connor, 2005; O'Leary, Cummings, & O'Leary, 2007), instead emphasizing how co-located teams and workplaces can rely just as much on mediated interactions as those in distributed teams and workplaces. I argue that in a similar way, a site like Wikipedia is also not best understood by making sharp distinctions between the online and offline or the virtual and real life. Many issues about how Wikipedia works as a project that seeks to summarize "the sum total of human knowledge" are found in and around the infrastructures that host Wikipedia as a software platform. To understand how Wikipedia works to support this goal, I argue that it should be studied as a networked assemblage, something that is performed and made present in various materially-existing settings and contexts. And as I show across this dissertation, when we pay attention to these material infrastructures, they tell us much about how and why Wikipedia as a project works in the way that it does. Wikipedia's operation is often attributed to an empirically underspecified "wisdom of the crowds" (Surowiecki, 2004), but as Niederer and Van Dijck (2010) argue, there are specific ways that Wikipedians have built Wikipedia to support this particular form of collective knowledge production.


3.6 Unpacking the black box beyond revealing source code

These issues around the materiality of code play out in a prevalent assumption among scholars and commentators who are concerned with 'the politics of algorithms': the idea that to best understand how an algorithmic system operates, one needs to gain access to (or reverse engineer) the source code that constitutes the essence of "the algorithm." Seaver notes this issue in his critique of some versions of the contemporary "critical algorithms studies" literature (Seaver 2013). He begins this critique by acknowledging that those interested in the politics of algorithms can certainly find many interesting issues in looking at specific lines of code that turn inputs into outputs. I am similarly supportive of scholarship, journalism, and activism in the area of "algorithmic accountability" (e.g. Diakopoulos, 2015) which uses a variety of approaches to determine what criteria, evidence, and processes are used in algorithmically deciding what messages are marked as spam, what posts are filtered out of news feeds, what applicants are denied loans, or what individuals are identified as suspicious by law enforcement, for example. However, Seaver argues that it is also important to understand the broader "algorithmic systems" in which they operate, which include phenomena typically characterized as purely 'social' as well as those characterized as purely 'technical':

The use of phrases like “the Google algorithm” or “the Facebook algorithm” should not fool us into thinking that our objects are simple, deterministic black boxes that need only to be opened. These algorithmic systems are not standalone little boxes, but massive, networked ones with hundreds of hands reaching into them, tweaking and tuning, swapping out parts and experimenting with new arrangements. If we care about the logic of these systems, we need to pay attention to more than the logic and control associated with singular algorithms. We need to examine the logic that guides the hands, picking certain algorithms rather than others, choosing particular representations of data, and translating ideas into code. (Seaver, 2013)

Using spam filtering as an example, it is important to understand that the same algorithmically-specified procedure for separating spam from not-spam in Wikipedia articles could operate quite differently depending on how the source code implementing such a classifier was deployed. One way would be a server-side extension to MediaWiki that runs when users submit edits, returning an error and refusing to commit the edit if the new version of the page contains spam text. A second way would involve a browser-based extension that examines all paragraphs in an article and does not display those containing spam text, which only changes the experience for those who have installed it. A third way would be a bot operated by a volunteer that examines recently-made edits and reverts those containing spam text; these reverts are visible in the page's revision history. The first way involves a more typical 'server sovereign' approach compared to the second and third ways, which are both bespoke, but those two modes operate differently from each other as well.
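To make the contrast concrete, the sketch below illustrates only the third, bespoke mode. It is a minimal, hypothetical example: the keyword rule, account details, and timings are illustrative assumptions, not drawn from any actual Wikipedia anti-spam bot, and login is assumed to have already produced a cookie file and CSRF token.

<?php
// Sketch of the third deployment mode: a volunteer-operated bot that polls the
// recent changes feed and reverts edits matching a (deliberately trivial,
// hypothetical) spam rule. Error handling and rate limiting are omitted.
$api       = 'https://en.wikipedia.org/w/api.php';
$cookieJar = '/tmp/antispambot.cookies';  // session cookies from a prior login
$csrfToken = 'CSRF_TOKEN_HERE';           // from action=query&meta=tokens

function apiGet($url) {
    return json_decode(file_get_contents($url), true);
}

function isSpam($wikitext) {
    // Stand-in classifier: a real anti-spam bot would use something far richer.
    return (bool) preg_match('/cheap pills online/i', $wikitext);
}

// 1. Fetch the fifty most recent edits.
$changes = apiGet($api . '?' . http_build_query([
    'action' => 'query', 'list' => 'recentchanges', 'rctype' => 'edit',
    'rcprop' => 'title|ids', 'rclimit' => 50, 'format' => 'json',
]));

foreach ($changes['query']['recentchanges'] as $change) {
    // 2. Retrieve the text of the new revision and classify it.
    $rev = apiGet($api . '?' . http_build_query([
        'action' => 'query', 'prop' => 'revisions', 'revids' => $change['revid'],
        'rvprop' => 'content', 'format' => 'json',
    ]));
    $text = current($rev['query']['pages'])['revisions'][0]['*'];
    if (!isSpam($text)) {
        continue;
    }

    // 3. Undo the offending revision; the revert is visible in the page history.
    $post = curl_init($api);
    curl_setopt_array($post, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_COOKIEFILE     => $cookieJar,
        CURLOPT_POSTFIELDS     => http_build_query([
            'action' => 'edit', 'title' => $change['title'],
            'undo' => $change['revid'], 'summary' => 'Bot: reverting likely spam',
            'token' => $csrfToken, 'format' => 'json',
        ]),
    ]);
    curl_exec($post);
    curl_close($post);
}

A server-side extension would instead intercept the edit before it was ever saved, and a browser extension would only hide the offending text for the user running it; the classifier (isSpam here) could be identical in all three cases.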

Despite using the same core procedures for identifying spam, these three cases involve three different kinds of relationships between a software developer and the people for whom the software is designed, developed, and deployed. Multiple versions of the browser-based spam filter extension can peacefully co-exist using incommensurable spam filtering algorithms, which would mean that the people who used these different filters would have different experiences of reading Wikipedia. Multiple anti-spam bots that enact their spam filters through editing articles can co-exist with different spam filtering algorithms as well, but every edit identified by any of the bots is removed from everyone's experience of reading Wikipedia. These are different ways in which the same classifiers can be put into practice, and the material differences between them are important.


4. About a bot

This chapter has synthesized a wide set of literatures about the social study of software code. I argued that what it means to unpack or open up the black box of the algorithmic systems that operate in a site like Wikipedia should not be thought of as just an exercise in parsing through all the server-side code – or even all the bespoke code that runs alongside it. Reading code can be quite revealing, but it requires a relational, networked, infrastructural understanding of code and how it is situated in many different kinds of spaces. In this next section, I demonstrate this in a set of personal vignettes relating several different ways of remembering my own experience as a bot developer in Wikipedia, which give quite different ways of understanding what a bot is and how it operates. The purpose of these vignettes is to expand the analytical frame of what a bot is, going beyond the source code or even the software agent powered by that code. I designed, developed, and deployed a bot that performed a small but useful task in Wikipedia, which ran for almost a year. I detail the processes that were required for me to get this bot running and approved by Wikipedia's administrative apparatus. However, I was running the bot from my personal computer in my shared studio apartment, and when personal issues led to upheaval in my life, I was ultimately unable to keep the bot operating. This illustrates how the work of bot development is not just writing code, but also the ongoing work of ensuring that the bot is able to keep performing its delegated task – from negotiating with administrators to keeping its server online.

4.1 AfDStatBot: My first bot

AfDStatBot was the first Wikipedia bot I built, which operated in 2008-9 when I was studying Wikipedia's Articles for Deletion (AfD) processes. Hundreds of Wikipedia articles are nominated for deletion every day, and editors discuss whether or not each one should be kept or deleted. AfDStatBot was created for my quantitative research purposes, to generate statistics on these deletion discussions in real time. However, it also served a few secondary purposes, including giving this data back to the community I was also ethnographically studying. I could say that this was part of my participant-observation as a Wikipedian, but it just made sense that if I was capturing and analyzing this information, I ought to use those resources to make the lives of other Wikipedians easier. It's just what you do. I had seen hundreds of these unofficial bots, tools, and scripts that Wikipedians developed for themselves and each other, supporting specific tasks and practices. I thought I had something to contribute. I really wanted to contribute. So I did.

AfDStatBot was one of many minor bot-based features in the English-language Wikipedia. It generated a near real-time noticeboard of active and recently closed deletion discussions (Figure 4), with statistics such as how long the discussion had been open, how many Wikipedians were in favor of keeping vs. deleting the article, when the last comment was made, and so on. Then for the few who opted in, it curated personalized watchlists that helped Wikipedians keep track of all the deletion discussions they had personally participated in. Finally, in what was more of an exercise to see if I could do it, the bot posted to Twitter (@WikiWars) every time it saw a new article nominated for deletion (Figure 5). It wasn't the most popular or important bot on Wikipedia by far — most of the bot's followers on Twitter were random auto-follow bots — but it did provide a service for a niche audience. A few years later, another Wikipedian, Scotty, would build a much better bot for these tasks, Snotbot, which is still in operation today.

Figure 4: AfDStatBot's list of recently closed AfDs. (material © Wikipedia contributors, freely licensed under Creative Commons BY-SA license)

Figure 5: The Twitter feed AfDStatBot maintained, tweeting every new AfD debate (text © the author, © Wikipedia contributors, freely licensed under Creative Commons BY-SA license)


4.2 A legitimate alternative account

If you want to know about AfDStatBot, you might start with the bot’s userpage on Wikipedia (User:AfDStatBot, Figure 6). Profiles take many different forms, but they have long been the spaces in which software platforms represent users (Brubaker & Hayes, 2011). In Wikipedia, these “user pages” are technically little more than special wiki pages, meaning that anyone can (but not necessarily should) edit anyone else’s profile — as I typically did for AfDStatBot. Like most bot userpages, there are several curious elements that indicate this profile is different from that of most human Wikipedia editors. There is a large, red power button that administrators can use to stop the bot from operating if it begins to malfunction. That’s not a formal requirement, but it is standard practice in Wikipedia bot development. There are also warning templates at the top that tell the reader that the account is a bot operated by me, and that it is a “legitimate alternative account” – not an unauthorized “sockpuppet.”

In a single line in an infobox on the righthand side of my bot's user page, there is a link to a formal approval from the Bot Approvals Group, which I had to petition before legitimately running my bot. If we think of bots as laws like in Lessig's "code is law" argument, then this approval perhaps captures another side of what this bot was – although it emphasizes bots as lawmaking, rather than law as already-enacted power relations. In order to be accepted as a bot operator and for my bot to be accepted as a legitimate bot, I had to translate the bot into text that others could come to understand and evaluate. If I had run the bot without approval, it might have been discovered by humans (or anti-bot bots), and both I and the bot could have ended up banned from Wikipedia. So this code-as-law had to be approved by the Bot Approvals Group in a discussion (Figure 7), but this didn't take long – although when my proposed rate of editing for updating the statistics pages was deemed too high, I lowered it. If my bot's task had been more controversial – like a bot that would remove all images without proper copyright documentation from every article – then there might have been more discussion, but no controversy followed. All of this was recorded, captured, and preserved for a number of socio-technical ends, serving to represent the intent of Wikipedia's administrative apparatus, such that it could be easily linked to from an infobox on my bot's userpage.


Figure 6: AfDStatBot's user page (material © Wikipedia contributors, freely licensed under Creative Commons BY-SA license)

Figure 7: Excerpt from the now archived Bot Approvals Group discussion about AfDStatBot (material © Wikipedia contributors, freely licensed under Creative Commons BY-SA license)


4.3: Moving out and away

AfDStatBot has since left Wikipedia (and Twitter), although its traces in those spaces do remain. As another mode of visual representation shows — screenshots of its Wikipedia contribution history (Figure 8) — my bot abruptly ceased all activity on June 1st, 2009.

Figure 8: AfDStatBot's contribution history (material © Wikipedia contributors, freely licensed under Creative Commons BY-SA license)

These traces relate what the bot did every single day, serving as a diary of sorts. By June 1st, 2009, the bot had seen 23,240 articles nominated for deletion, each of which it tracked to the best of its ability. The last recorded action the bot made was at 5:30pm EST, when it saw a new AfD open for the article "K38IM," a low-power repeater for an Illinois-based Christian television network. It saved it in its database, posted this development to Twitter, and, had it stayed online for just a few minutes more, it would have included this new discussion in its regular half-hour update to the AfDStatBot information pages on Wikipedia. But it didn't. Instead, for what has now been just over four years, the bot went offline and the account was never logged in again. As with so many other Wikipedia editors who suddenly stop contributing to the project, the bot's userpage remained in the same state it was in for years, representing a user who was no longer present. Three years later, someone using an automated tool found that my bot "appears to be inactive" and changed a tag to indicate that it was no longer an active bot (as seen in Figure 6).

June 1st, 2009 was an emotional day for me, one I'll remember for some time — and not because it was the day my bot died. I was living in Washington, D.C. My bot had been running on my own personal desktop computer in a corner of a tiny apartment in Dupont Circle (Figure 9), where I had been living for almost two years with my partner at the time. We had recently broken up, and our lease ended on June 1st. We had both found new places to live and had been moving our stuff out of the apartment bit by bit, but June 1st was the date we had to get everything out. At the time, unplugging my desktop computer and taking the bot down was the least of my concerns. It seemed that everything in my life was in a state of transition, and I was already shirking obligations far more pressing than my bot. And my Master's on Wikipedia — the original reason I developed the bot — was done, defended. As everyone feels after a big project, I was ready to take a break from Wikipedia. At the time, I don't know if I even remembered that the bot was still running on that computer when I shut it down and pulled the plug. There were a lot of things I lost that day, from the many physical and digital objects I threw away or destroyed, to parts of myself I wouldn't know were gone until weeks, months, even years later. That evening, as I entered a new apartment in Southeast D.C. which would never really be home, I was struggling to work out a lot of changes in who I was. But the fact that I wasn't a bot operator anymore didn't even register.


Figure 9: My former apartment; the cord in the foreground is the Ethernet cable connecting the computer to the router (author’s own photo).

4.4: Code/coda

When I went searching through various archives and backups to tell the story of AfDStatBot years later, it was far easier for me to find the bot's source code (Figure 10) and the database it used (Figure 11) than to find those photos of my life from years ago – both logistically and emotionally. However, I didn't want to start this story by talking about code and data. That said, if you want to know what AfDStatBot did, looking at the code and database will tell you a number of things you could not otherwise know. What it meant to make this bot capable of identifying specialized administrative processes in Wikipedia can be seen in how it used traces that were left in the creation of internal MediaWiki categories to detect new deletion discussions. Wikipedia-wide policies about how much load bots could place on the server can be seen in rate-limiting functions that are bundled with each request. The bot interacted via Wikipedia's Application Programming Interface (API), which had been specifically designed to enable bot development — a situation that is not the case on many other platforms.
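The bot's actual source code is shown only in the figures; the fragment below is a hedged reconstruction of the general approach just described — polling a maintenance category through the API and pausing between requests as a crude form of rate limiting. The category name, timings, and variable names are illustrative assumptions, not AfDStatBot's actual values.

<?php
// Hedged reconstruction (not the bot's actual code): detect new deletion
// discussions by polling a maintenance category through the MediaWiki API,
// with a pause after each request as a crude form of rate limiting.
$api  = 'https://en.wikipedia.org/w/api.php';
$seen = [];   // discussions already recorded in the bot's database

function apiGet($url) {
    $result = json_decode(file_get_contents($url), true);
    sleep(5);   // rate limiting: never hammer the servers between requests
    return $result;
}

while (true) {
    // Pages categorized as open AfD debates; the exact category is an assumption.
    $members = apiGet($api . '?' . http_build_query([
        'action'  => 'query', 'list' => 'categorymembers',
        'cmtitle' => 'Category:AfD debates', 'cmlimit' => 100, 'format' => 'json',
    ]));

    foreach ($members['query']['categorymembers'] as $page) {
        if (!isset($seen[$page['title']])) {
            $seen[$page['title']] = true;
            echo "New deletion discussion: {$page['title']}\n";
            // ...record it in the database, post to Twitter, and fold it into
            // the next half-hourly update of the statistics pages...
        }
    }

    sleep(300);   // poll again in five minutes
}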


Figure 10: Excerpt of AfDStatBot's source code (author’s own code)

Figure 11: “SELECT * FROM open” query result from AfDStatBot's mySQL database (author’s own code)


In all, someone who can read PHP may be able to walk through my code and get a good understanding about what that bot was doing when it still operated. You can find all kinds of socio-technical artifacts in that code – in two senses of the term, as they are both technological artifacts as well as artifacts of a certain social, historical, and political situation.

With this source code, AfDStatBot could have been revived after I got my living situation settled. It could have been one of the many times when bots in Wikipedia mysteriously go down for a few days or a week, then suddenly re-appear. In fact, it still could go through such an act of resurrection, although it might need some updates given new developments that have been made to Wikipedia’s API and how Wikipedians organize and track deletion debates. Once I had everything settled and got Internet access in my new apartment, I could have started the bot back up again. Instead, when I got settled in my new apartment, I took that desktop computer apart, sold some components, upgraded it to a gaming machine, and formatted the hard drive. The bot never made it back. Yet even during the move, I could have “just” taken these few hundred kilobytes of code and run them on a different server. I couldn’t run it on the shared server I use for webhosting because they do not allow any kinds of bots, though I could have easily found a home for it. But I didn’t, because I just didn’t care -- the bot (and much of Wikipedia’s internal operations) had ceased to be a part of the world in which I lived. If I had cared, the bot may still be running today. Scotty may never have made Snotbot to do the work it wasn’t doing anymore.

If I had not chosen to run that bot from that tiny apartment I shared in Dupont Circle, would it have been the same bot? What if, from the beginning, I had decided to run my bot on the toolserver, a shared server funded and maintained by a group of German Wikipedians for all kinds of purposes, including bots? If so, the bot may have run the same code in the same way, producing the same effects in Wikipedia, but in another sense it would have been an entirely different thing. I could have forgotten about it, it could have ceased to matter in my life, but it would have kept running – until something changed in Wikipedia that broke the bot until I fixed it. What if AfDStatBot was the kind of project I could further develop into an even more sophisticated bot as a way of getting my mind off everything else that was going on? What if it had been the kind of bot that, when it went down, my talk page on Wikipedia would have been flooded with frantic inquiries about what happened to the bot and when it would be back? What if it had been the kind of bot that was not an individual project, but one where I was part of a team sharing responsibility for the bot’s development and operation? What if things with my partner hadn’t gotten to the point where that apartment lease was the only thing keeping us together anymore? What if? What if?

5. Conclusion

This vignette illustrates how an algorithms-in-the-making approach focusing on bots as bespoke code presents a quite different account of algorithms, compared to those which focus on the ready-made algorithms that have already been designed, developed, and deployed. The work of unpacking the black box of "algorithms" includes revealing and reading through source code, but there are important materially and historically contingent aspects to the operation of a bot that go beyond source code. Reading code can be quite revealing, but it requires a relational, networked, material, and infrastructural understanding of software design, development, and deployment to understand how code is situated in many different kinds of spaces. My experiences as both a long-time Wikipedia editor in general as well as a bot developer have given me a different perspective on what a bot is as an algorithmic system, which is the kind of gestalt shift Seaver (2013) has similarly located in his ethnography of music recommendation services:

When we realize that we are not talking about algorithms in the technical sense, but rather algorithmic systems of which code strictu sensu is only a part, their defining features reverse: instead of formality, rigidity, and consistency, we find flux, revisability, and negotiation. (Seaver, 2013)

In the subsequent chapters in this dissertation, I tell many similar stories about the bots I have studied across first Wikipedia and then Twitter, although with quite different cases and findings. Yet they all speak to how bots can be part of broader conversations, goals, visions, and ideals about what these online spaces are for and what they ought to become. The work of designing, developing, and deploying automated software agents is deeply embedded in not only dominant and nascent technical systems and codebases, but also dominant and nascent social systems and institutions. Important moments in the history of both Wikipedia and Twitter cannot be told independently of the failed and successful bots that, in continuously performing some task a volunteer software developer thought needed to be automated, raised broader issues about what kinds of tasks ought to be done, how they ought to be done, and the meta-level question of who gets to make these kinds of decisions in the first place. As I explore in several cases, bots that are delegated governance work raise issues that are common across many digitally-mediated environments: formalization and proceduralization, socialization and membership, struggles between developers and non-developers, invisible and infrastructural work, and the classification of cultural content. Each of the chapters in this dissertation focuses on a specific case or set of cases that speaks heavily to one of these issues, but such issues are also found throughout all of these cases.


Chapter 3: Exclusion compliance

1. Introduction

1.1 Chapter overview

This chapter examines the emergence of the concept of "exclusion compliance" in Wikipedia, an emic term first created by Wikipedians to help resolve a 2006 controversy over a new kind of bot in the encyclopedia project. By 2006, many Wikipedia bots had been designed, developed, and deployed to write or modify encyclopedia articles according to particular standards and procedures, but "HagermanBot" was the first to universally enforce a norm about how Wikipedians ought to interact with each other in discussion spaces. The case of HagermanBot is an important moment in the early history of Wikipedia bots, as the bot raised issues about both the specific task it was delegated and about how the recently-founded "Bot Approvals Group" was to mediate between bot developers and Wikipedians who opposed certain bots. The controversy was resolved through the development of a new norm that bots ought to support opt-out procedures when performing certain tasks in Wikipedia. This norm was then supported by new technical standards specifying how Wikipedians could indicate their intent to opt out. Today, this standard of exclusion compliance is implemented by default in many of the specialized software libraries that have been built to support the work of bot development in Wikipedia, and bot developers who do not abide by such norms have been subject to severe consequences by Wikipedia's administrators.

In the broader argument running throughout this dissertation, this chapter further expands the analytic frame of bots as more than their source code. The previous chapter (particularly the vignettes about my first bot) established bots as a two-member project, a team effort between the software agent and the human operator. Without the bot operator maintaining the computational infrastructure hosting the software agent and going through the steps needed to get it approved in Wikipedia, the bot falls away to nothing more than an idea. Picking up from this argument, this chapter expands the concept of the bot as a project to include other people in Wikipedia beyond the bot developer, who are also involved in a kind of team effort. I focus more on the administrative apparatuses in which Wikipedia bots are approved and regulated, particularly the Bot Approvals Group. A bot without an operator may just be an idea, but an idea that a certain task ought to be performed in a specific way by a bot is just as important to a bot's operation as its computational infrastructure. If a bot loses the support of the 'ideological infrastructure,' performing tasks in a way that has little resonance with how Wikipedians think the encyclopedia project ought to operate, then it is likely to be banned – which is as swift a death as pulling the plug on the server. In this chapter, I expand the idea of the bot beyond its code, operator, and computational infrastructure to encompass a project to produce a collective understanding about the task the bot is to perform. In the case I investigate, the 'team members' in this 'project' ultimately produced a different version of the software agent than Hagerman initially developed, as they worked out what it meant to automate the bot's task of cleaning up comments in discussion spaces. Furthermore, their lasting legacy goes beyond this specific task, as the compromise produced a much broader set of products: a norm about how software agents ought to behave toward those who want to opt out, a universal standard for Wikipedians to signal their intent to opt out, opt-out detection functions implemented in multiple programming languages, and policies for ensuring that relevant bots were made exclusion compliant.

While I emphasize the deeply human aspects of bots as projects that go beyond the bot’s software agent and the bot’s developer, I also continually keep the work of software development in close view. It is just as problematic to discuss bots as purely ‘social’ as it is to discuss bots as purely ‘technical,’ a position I take from theorists in science and technology studies (e.g. Haraway 1991; Latour 1988). The history of exclusion compliance in Wikipedia was a multi-faceted achievement, one that operates in multiple spaces and is sustained by a wide variety of work. Accordingly, before I begin with the case of HagermanBot, I present an introductory vignette unpacking a function that is today buried deep in the code of a popular Wikipedia bot development library, which makes all bots developed with this library exclusion compliant by default. As I open up this black-boxed element in mid-2015, I show how it is the product of a set of norms and standards which were first developed in Wikipedia almost ten years ago.

1.2 What is the antibot_template_protected_text function?

For software developers who want to create a bot for Wikipedia, there are dozens of libraries and toolkits which have been created to make bot development simple, easy, and standardized. The apibot package is one of these libraries, written in PHP by a team of three developers. The library is sophisticated, with over 90,000 lines of code supporting a variety of automated tasks related to wikis. As the developers explain in their documentation, the library makes it easy to write a program that will automatically perform tasks like going through a page in a wiki and replacing one word with another – or doing a find-and-replace through every article in a wiki. Once the modules are imported, all of Wikipedia's content can be encapsulated into a computational object, called a "core." Code can be written that interacts with the wiki as if the text and metadata of every page were stored in local memory. When a request is made to access or modify the core, the library connects to Wikipedia's Application Programming Interface (or API) and passes the request along. A short example in the documentation imports the apibot library, creates the core object, creates a page object for the article on San Francisco, and prints the text of the page on the screen.
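That listing is not reproduced in this text. As a rough stand-in only – it does not use apibot's actual core and page classes, whose names I do not reproduce here – the sketch below performs the same steps directly against the MediaWiki API:

<?php
// Stand-in for the documentation's example (not apibot's own classes): fetch
// the current wikitext of the "San Francisco" article through the MediaWiki
// API and print it to the screen.
$params = http_build_query([
    'action' => 'query',
    'prop'   => 'revisions',
    'rvprop' => 'content',
    'titles' => 'San Francisco',
    'format' => 'json',
]);
$response = file_get_contents('https://en.wikipedia.org/w/api.php?' . $params);
$data = json_decode($response, true);

foreach ($data['query']['pages'] as $page) {
    echo $page['revisions'][0]['*'];   // the wikitext of the current revision
}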


The encapsulation of Wikipedia into a single computational object like apibot's core is useful for bot developers, as it backgrounds the routine work involved in connecting to Wikipedia's servers, authenticating with login credentials, requesting information about a page or a user, and sending a new version of a page (or taking any number of other actions on the site). There are dozens of variables which can be easily accessed in a page object. Furthermore, there are page revision objects for each distinct version of every page, as well as objects for users, logging events, and administrative actions. apibot contains a number of functions built into these objects that work to streamline commonly occurring tasks. For example, every page object has not only the text of the page accessible through a simple variable, but also the built-in "replace_string" function. This means that once a core and page object have been instantiated, replacing all instances of one word or phrase with another can be executed with a single call to that function.

Buried many levels deep in the dependencies of this “replace_string” function are hundreds of sub-functions which make it possible to open up a connection with Wikimedia’s servers, authenticate with login credentials, download the text of the page into local memory, rewrite the page in local memory, request permission to edit the page, and then pass the new version of the page to the server. No matter what kind of task a bot using apibot is performing, if it makes an edit to a page, many of these same functions are called. This is standard in contemporary software development and why a library like apibot is so powerful and useful for bot developers: when they want their bot to edit a page, they do not have to re-write (or even copy and paste) all the code needed to edit a page.
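To make those buried steps concrete, here is a minimal sketch of the same round trip written directly against the MediaWiki API rather than through apibot. The function name and structure are my own, not the library's; it assumes the bot has already logged in and stored its session cookies in a cookie file.

<?php
// Sketch (not apibot's code) of what a call like replace_string() ultimately
// has to do: fetch the page and an edit token, rewrite the text in local
// memory, and pass the new version back to the server via action=edit.
function replaceInPage($apiUrl, $cookieJar, $title, $search, $replace) {
    // 1. Fetch the current wikitext and a CSRF edit token in a single query.
    $query = http_build_query([
        'action' => 'query', 'prop' => 'revisions', 'rvprop' => 'content',
        'titles' => $title, 'meta' => 'tokens', 'format' => 'json',
    ]);
    $ch = curl_init("$apiUrl?$query");
    curl_setopt_array($ch, [CURLOPT_RETURNTRANSFER => true, CURLOPT_COOKIEFILE => $cookieJar]);
    $data = json_decode(curl_exec($ch), true);
    curl_close($ch);

    $page  = current($data['query']['pages']);
    $text  = $page['revisions'][0]['*'];
    $token = $data['query']['tokens']['csrftoken'];

    // 2. Rewrite the page in local memory.
    $newText = str_replace($search, $replace, $text);
    if ($newText === $text) {
        return false;   // nothing to change, so make no edit at all
    }

    // 3. Pass the new version of the page back to the server.
    $ch = curl_init($apiUrl);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_COOKIEFILE     => $cookieJar,
        CURLOPT_COOKIEJAR      => $cookieJar,
        CURLOPT_POSTFIELDS     => http_build_query([
            'action' => 'edit', 'title' => $title, 'text' => $newText,
            'summary' => "Replacing '$search' with '$replace'",
            'token' => $token, 'format' => 'json',
        ]),
    ]);
    $result = json_decode(curl_exec($ch), true);
    curl_close($ch);
    return $result;
}

// Example use (hypothetical page and phrases):
// replaceInPage('https://en.wikipedia.org/w/api.php', '/tmp/mybot.cookies',
//               'Wikipedia:Sandbox', 'colour', 'color');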

Of these many dependent functions, there is one curious function that has made its way into the core functionality of apibot, called whenever any page is to be edited: antibot_template_protected_text. As any bot using apibot prepares to tell the Wikipedia API to edit a page in a particular way, it queries the API for the current version of the page, even if it had been retrieved seconds before as part of another task (like the find-and-replace task in the example documentation). It then calls this "antibot" function, which scans through the entire text of the page, looking for particular strings of text wrapped in double curly brackets: text like {{nobots}} or {{bots|deny=OurBotUsername}}. The body of the function amounts to a short series of pattern checks against the page text.
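apibot's actual antibot_template_protected_text source is not reproduced in this text; the function below is my own minimal reconstruction of the behavior just described, and the gate that follows it mirrors the nobottemplate / honor_nobottemplate logic discussed in the next paragraph.

<?php
// Minimal reconstruction (not apibot's actual code) of an exclusion-compliance
// check: returns true if the wikitext asks bots in general, or this bot in
// particular, not to edit the page.
function isExcludedByTemplate($wikitext, $botUsername) {
    $patterns = [
        '/\{\{\s*nobots\s*\}\}/i',                        // {{nobots}}
        '/\{\{\s*bots\s*\|\s*allow\s*=\s*none\s*\}\}/i',  // {{bots|allow=none}}
        '/\{\{\s*bots\s*\|\s*deny\s*=\s*all\s*\}\}/i',    // {{bots|deny=all}}
        '/\{\{\s*bots\s*\|\s*optout\s*=\s*all\s*\}\}/i',  // {{bots|optout=all}}
        // {{bots|deny=...}} naming this bot somewhere in its deny list
        '/\{\{\s*bots\s*\|\s*deny\s*=[^}]*' . preg_quote($botUsername, '/') . '[^}]*\}\}/i',
    ];
    foreach ($patterns as $pattern) {
        if (preg_match($pattern, $wikitext)) {
            return true;   // the equivalent of setting nobottemplate to true
        }
    }
    return false;
}

// The gate a writer function would then apply before editing, assuming a
// per-bot setting equivalent to apibot's honor_nobottemplate (true by default):
$pageText           = "{{bots|deny=OurBotUsername}}\nSome article text.";
$honorNobotTemplate = true;

if ($honorNobotTemplate && isExcludedByTemplate($pageText, 'OurBotUsername')) {
    error_log('Skipping page: exclusion template found');   // log it and move on
} else {
    // ...proceed with the edit through the API...
}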

The result of this antibot function is then fed into the core "writer" functions, setting a variable named "nobottemplate" to true if one of those patterns was matched. After this, another segment of code is called whenever an apibot-based bot goes to run the functions that will query the API to edit the page. If the nobottemplate variable is set to true and the bot's settings have the variable "honor_nobottemplate" set to true (which can be overridden, but is set to true by default), then the bot will not direct the API to edit the page, instead throwing an error. This error will not crash the bot or stop it from operating; the bot will log the event and then move on to the next edit to be made, if there are more. In all, this means that apibot is built such that by default, any bot built using this framework will not edit any page that contains the text {{nobots}}, {{bots|allow=none}}, {{bots|deny=all}}, {{bots|optout=all}}, or {{bots|deny=OurBotUsername}}. In the markup language used in the MediaWiki platform, this text is visible to those who view and edit the page's source, but invisible to readers.

These lines of code make apibot an "exclusion compliant" bot development framework, which is an important feature in Wikipedian bot development and can be found in many – but not all – bot development frameworks released for Wikipedia. The concept is one that defines and specifies a particular kind of ideal interaction between bots, bot developers, and Wikipedia editors: an exclusion compliant bot will look for a machine-readable trace in the text of a wiki page indicating that bots are not welcome to edit the page, and it will respect that request. Wikipedia's exclusion compliance is similar to the robots exclusion standard – the 'robots.txt' files that direct search engine crawlers about what pages of a website they are and are not allowed to crawl (Elmer, 2009; Lundblad, 2007). Like the robots exclusion standard, exclusion compliance in Wikipedia is not a firm software-based barrier preventing undesired interactions from automated agents, but one that is based more on a particular notion of how bots are to interact with those who disagree with their tasks. As I show with the case of HagermanBot in this chapter, standards like exclusion compliance are certainly technical standards implemented in code, but this code is not all there is to the story. Functions like antibot_template_protected_text are bound up in more complex norms, processes, and administrative apparatuses, which are mutually co-constructed through controversies like the case of HagermanBot.

1.3 Theorizing delegation in organizations

To specifically investigate this case, I draw on theories of delegation from science and technology studies and organizational studies, which show how controversies over the delegation of work to bots are a particularly powerful window to explore various issues about how Wikipedia operates. This chapter's theoretical approach to delegation draws on two related literatures from organizational studies and Science and Technology Studies. Over the past two decades, the sub-field of organizational studies concerned with the role of technology has increasingly emphasized the "sociomateriality" of information systems within organizations, focusing on how people and information technology are entangled in practices (Orlikowski & Scott, 2008). This literature opposes a more traditional view that, as Zammuto et al. critique, sees information technology as a kind of "automated plumbing" which simply makes the organization more efficient without many more substantial consequences (Zammuto et al., 2007). The sociomateriality literature calls attention to broader shifts that IT can bring in reshaping organizational roles, responsibilities, and relationships. I also draw on Bruno Latour's writings on delegation in science and on other actor-network theorists who conceptualize society as a heterogeneous network of assembled relations between human and non-human actors (Callon, 1986; Latour, 1999a). Acts of delegation – either to humans or to non-humans – are key moments in which both work and assumptions about work are made visible and brought to the foreground. Controversies and breakdowns can emerge over the creation or modification of "delegation regimes" (Ribes et al., 2013), which provoke reflection by participants on the broader issues at work in a given organization.

Like Latour's example of delegation in policing speeding cars (Latour, 1999b), bots are not mere tools like radar guns that simply improve the effectiveness of a single human enforcer. They are instead closer to the speed bumps Latour analyzes as sociotechnical actors, which can initially seem like autonomous, independent agents. A neighborhood that decides to punish speeding cars can delegate this responsibility to speed bumps instead of police officers, which perform roughly equivalent actions. Yet while Latour and other actor-network theorists defend a functional equivalence between human and non-human actors in their ability to engage in social interactions, Latour stresses that the nature of the task being performed and the constellation of actors around it can be fundamentally changed when the delegation is to a technological actor instead of a human one. For example, compared to police officers, speed bumps are unceasing in their enforcement of the 'no speeding' rule, equally punishing reckless teenagers and on-call ambulances. A similar 'ruthlessness' can be seen in the initial wave of HagermanBot's operation, which sparked the controversy over the bot in Wikipedia in the first place.

However, Latour insists that this stance towards the agency of technology is opposed to the claims of technological determinism and domination, such as Langdon Winner's account of the low bridges on Long Island that let private cars but not public buses pass beneath them (Winner, 1986). The speed bump (and the bot) may appear to be “nonnegotiable,” but he stresses that we must not be fooled into thinking that we have “abandoned meaningful human relations and abruptly entered a world of brute material relations” (187). Instead, the actor-network theory approach sees technologies and humans as interdependent social actors, calling on researchers to trace out the network in which such an artifact operates – particularly the often invisible work required to sustain them in their proper operation. In such an investigation, a seemingly ‘technical’ artifact is simply the end product of a much longer endeavor to remake the world into something else:

In artifacts and technologies we do not find the efficiency and stubbornness of matter, imprinting chains of cause and effect onto malleable humans. The speed bump is ultimately not made of matter; it is full of engineers and chancellors and lawmakers, commingling their will and their story lines with those of gravel, concrete, paint, and standard calculations. (190).

In this broader view, it may actually be easier for someone to “negotiate” with speed bumps than a police officer, particularly if someone has more influence in a city's public works department than the police department. However, those with no influence in either department are likely to remain subject to the will of both.

Similar to Latour's speed bumps, bots in Wikipedia are non-human actors, constructed by humans and delegated the highly social task of enforcing order in society, but they also rest on an existing set of social and technical infrastructures which make their development and continued operation possible. Bots may initially appear to be as non-negotiable as speed bumps, with their creators seemingly able to dominate the unsuspecting masses with their technical skills and remake Wikipedia in their own image – particularly for those who do not have much influence or status within Wikipedia. To push back on these narratives of determinism (in either their social or technical flavors), I pay close attention to the conditions in which bots emerge within the organizational culture of Wikipedia – a complex collective of editors, administrators, committees, discussions, procedures, policies, jargon, precedents, and shared understandings. At the same time, I discuss these seemingly ‘social’ phenomena as they are bound up in what may be cast as purely ‘technical’ phenomena, like software libraries, algorithmically-encoded procedures, and administrative capacities to block a bot account from accessing the site.

Bots, like infrastructures in general, simultaneously produce and rely upon a particular vision of how the world is and ought to be, a regime of delegation that often sinks into the background – that is, until they do not perform as expected and generate controversies and breakdowns (Star, 1999b).28 In these moments of breakdown, these worldviews about what work ought to be done in Wikipedia are expressed by Wikipedians in many different ways, from conversations to code. Crucially, people who have no direct access to modify the bot's source code (or even block it from operating) are able to express ideas about how the bot ought to be programmed. In this way, bot development in Wikipedia provides an invaluable moment for community members to make assumptions about governance explicit. This explication is done not only in directly editing the software code of a bot (which only the operator can do), but also in discussions about what ought to be done about that software code. These accounts of controversies over bots touch on a wide range of other norms and organizational structures inside of Wikipedia, including the infrastructures used by bot developers to support their own work.

28 For more on the phenomenology of breakdown and repair, see (Heidegger, 1993; Vygotsky, 1968; Jackson, 2014).

2: A history of exclusion compliance in Wikipedia

The bot I primarily examine in this chapter was an early bot in Wikipedia's history, one of the first to be delegated the work of enforcing a norm about how Wikipedians were to interact with each other, as opposed to earlier bots that made programmatic changes to encyclopedia articles. This norm – that people should sign and date their comments in discussion spaces, something the stock MediaWiki platform did not do automatically – was seemingly universal, enshrined in a document in Wikipedia's policy environment. When a bot began universally enforcing this norm, those who felt they had the right to not sign and/or date their comments emerged to contest the code of the bot. In the ensuing controversy, bot developers and non-developers debated not only the utility of signatures and timestamps, but also the meta-norms about if and when Wikipedians could opt out of having a bot enforce such a norm. The compromise reached in this case established both new norms and new software-based standards structuring how bot developers were to interact with those who objected to their bot – or more specifically, those who objected to the developer's normative vision of how Wikipedia ought to operate, which was being implemented project-wide through the code of the bot.

2.1: A problem and a solution

Wikipedians conduct a significant amount of communication through the wiki site using designated discussion (or talk) spaces. These spaces are, at a software level, functionally identical to the collaboratively-edited encyclopedia articles: they are both flat text files that anyone can edit. Unlike the vast majority of on-line communication platforms, such as message boards, chat rooms, e-mail listservs, or social media sites, MediaWiki is not specifically designed to support threaded communication. To add a comment to a discussion, a user edits the entire discussion page, appends a comment to the proper location in the page (usually manually indented to indicate threaded replies), and saves the new revision. This use of wiki pages for discussion means that malicious users can remove or edit someone else's comments just as easily as they can edit an encyclopedia article, although this is highly discouraged and mitigated by the fact that the wiki platform saves a public history of each revision. However, another problem had arisen in wiki pages used as discussion spaces: many people left comments in talk pages, but did not leave a signature or a timestamp. MediaWiki had long had a parser function that let editors automatically append their username and the current time and date by typing four tildes: ~~~~. Yet this had to be done for every comment, and many Wikipedians would forget to do so. For the purposes of discussions, this made it difficult to determine not only who made a certain statement, but also when it was made. Someone could go through the revision histories to find this information, but doing so is tedious, especially in large discussions. As with many tedious tasks in Wikipedia, a few editors sensed that there was a need for someone to do this work and began to do it manually – users like ZeroOne.
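As a schematic illustration of this shortcut – and not MediaWiki's actual parser code – the substitution can be thought of as a text replacement performed when the page is saved; the exact rendered signature format varies with user preferences and software versions.

def expand_signature(wikitext: str, username: str, timestamp: str) -> str:
    # At save time, four tildes become a linked username plus a timestamp,
    # rendering as something like "[[User:ZeroOne|ZeroOne]] 06:15, 17 October 2006 (UTC)".
    signature = f"[[User:{username}|{username}]] {timestamp}"
    return wikitext.replace("~~~~", signature)

It is the forgetting of those four characters that the rest of this section turns on.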

Among other tasks, ZeroOne’s contributions to Wikipedia included scanning through discussion pages for unsigned comments and manually appending a signature on the commenter’s behalf. For example, at 06:15 on 17 October 2006, user ZeroOne made his 4,072nd contribution to Wikipedia, editing the discussion page for the article on “Sonic weaponry.” Instead of adding a comment of his own about the article, he appended the text {{unsigned|71.114.163.227|17 October 2006}} to the end of a comment made by another user about twenty-five minutes earlier [05:50]. When ZeroOne clicked the Submit button, the MediaWiki software parsed his contribution and transformed it into a pre-formatted message. Together, the edits of 71.114.163.227 and ZeroOne added the following text to the article’s designated discussion page:

Ultrasound as a weapon is being used against American citizens in Indiana. Any experts out there wish to make a study, look to Terre Haute, maybe its the communication towers, that is my guess. It is an open secret along with its corrupt mental health system. —Preceding unsigned comment added by 71.114.163.227 (talk • contribs) 17 October 2006

Two minutes later [06:17], ZeroOne performed the same task for an unsigned comment made by a registered user on the talk page for the “Pseudocode” article – adding {{unsigned|Blueyoshi321|17 October 2006}}. About two hours later [08:40], he spent twenty minutes appending {{unsigned}} messages to the end of eight comments, each made on a different discussion page. While ZeroOne could have manually added the “Preceding unsigned comment added by…” text to issue the message, this process was made standard and swift because of the {{unsigned}} template. Templates are a form of bespoke code; they are a MediaWiki feature in which editors write customized scripts, which are called as functions when placed in the text of wiki pages. The first parameter of this function takes the author's name or IP address and the second takes the date. For those who edit the page's source, the text still appears as a template function call, but for readers, it appears as the marked-up signature.

While the existence of templates made ZeroOne's work somewhat automated, this editor felt that it could be made even more so with a bot. ZeroOne soon posted this suggestion in a discussion space dedicated to requests for new bots. Over the next few weeks, a few users mused about its technical feasibility and potential effects without making any concrete decisions on the matter. The discussion stagnated after about a dozen comments and was automatically moved into an archive by a bot named Werdnabot on 16 November 2006, after having been on the discussion page for fourteen days without a new comment. The next month, another user named Hagerman was hard at work realizing ZeroOne's vision of a bot that would monitor talk pages for unsigned comments and append the {{unsigned}} template message without the need for human intervention, although it is unclear if Hagerman knew of ZeroOne's request. Like ZeroOne, Hagerman had used the template to sign many unsigned comments, although many of these were his own comments he had forgotten to sign, rather than ones left by others.

On 30 November 2006, having finished programming the bot, Hagerman registered a new user account for HagermanBot and wrote up a proposal the next day. In line with Wikipedia's policies on bot operation at the time, Hagerman submitted his proposal to the members of the Bot Approvals Group (BAG), an ad-hoc committee tasked with reviewing bot proposals and ensuring that bots are operated in accordance with Wikipedia's policies. Tawker, the operator of the prolific AntiVandalBot and a member of the BAG, asked Hagerman for a proof of concept and posed a technical question about how the bot was gathering data. Hagerman provided this information, and Tawker approved the bot about 24 hours later, with no other editors taking part in the discussion. At 00:06 on 3 December, HagermanBot began operation, automatically appending specialized {{unsigned}} messages to every comment that it identified as lacking a signature. On the first day, 790 comments were autosigned, and HagermanBot made slightly over 5,000 edits over the next five days. By the end of December 2006, HagermanBot had become one of the most prolific users to edit Wikipedia in that month in terms of the raw number of edits to wiki pages, outpacing all other humans and almost all other bots.
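The core of the task that HagermanBot was delegated can be sketched as follows. This is purely illustrative and is not Hagerman's actual code: the signature-detection heuristic, the names, and the handling of dates are my own assumptions, and the machinery for watching recent changes to talk pages and extracting the newly added text is omitted.

import re

# Heuristic: treat a comment as signed if its last line links to a user or
# user talk page (the usual result of the four-tilde signature).
SIGNATURE_LINK = re.compile(r"\[\[\s*User(?:[ _]talk)?\s*:[^\]]+\]\]", re.IGNORECASE)

def needs_signing(added_text: str) -> bool:
    lines = added_text.rstrip().splitlines()
    return bool(lines) and not SIGNATURE_LINK.search(lines[-1])

def autosign(added_text: str, username: str, date: str) -> str:
    # Append the same {{unsigned}} template that editors like ZeroOne added by
    # hand, which renders as the "Preceding unsigned comment added by ..." notice.
    return added_text.rstrip() + " {{unsigned|" + username + "|" + date + "}}\n"

Even in this toy form, the sketch makes visible the interpretive work the bot performs: it encodes one particular notion of what counts as a signature, which is precisely what editors like Sensemaker would soon contest.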

2.2: A problem with the solution

There were a few problems with the bot's comment and signature identification algorithms, making it malfunction in certain areas, but Hagerman promptly fixed these errors. One issue arose with edits to talk pages that were not comments, like adding an informational banner at the top of a talk page noting that the associated article would be featured on the main page next week (a common practice). To prevent this, Hagerman extended the bot's code so that it would not sign an edit whose edit summary contained the text !NOSIGN!.29 However, some users were annoyed with the bot's normal functioning, although for a variety of different reasons. One group complained that it instantly signed their comments instead of giving them time to sign their own comments after the fact. For these editors, HagermanBot's message was “embarrassing,” as one editor stated, making them appear as if they had blatantly violated the Signatures guideline. They requested a short delay so they could have the opportunity to add their own signature before HagermanBot beat them to it. Others did not want bots editing messages other users left for them on their own user talk pages as a matter of principle. Finally, a small but vocal group did not want the bot adding signatures to their own comments, taking issue with the seemingly-universal norm of signatures and/or timestamps.

29 The edit summary is a 200-character field where editors can describe the changes they made. Edit summaries are metadata that show up in the revision history of articles. By 2006, Wikipedians had already begun leaving short codes in edit summaries to specify certain aspects of their edits to bots and semi-automated tools, primarily in the areas of anti-spam and counter-vandalism activity, as described in (Geiger & Ribes, 2010; Geiger & Ribes, 2011).
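Continuing the illustrative sketch from the previous section – again my own reconstruction, not the bot's code – the !NOSIGN! mechanism amounts to one more guard placed in front of the signing logic, and the delay Hagerman soon added for editors who wanted a chance to sign for themselves can be expressed the same way; the interval shown here is an assumption, as the length of the actual delay is not specified.

def should_autosign(added_text: str, edit_summary: str, seconds_since_edit: float, grace_period: float = 60.0) -> bool:
    # Honor the !NOSIGN! opt-out code left in the edit summary (see footnote 29).
    if "!NOSIGN!" in (edit_summary or ""):
        return False
    # Give the editor a brief window to sign their own comment first; the
    # 60-second default is an assumption, not a figure from the case.
    if seconds_since_edit < grace_period:
        return False
    # Otherwise fall back to the unsigned-comment check sketched earlier.
    return needs_signing(added_text)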

By adding a short delay before the bot signed a comment, Hagerman placated those who did not want to be embarrassed, but the issue raised by the other group of objecting editors was more complicated. These users were, for various reasons, firmly opposed to having the bot transform their own comments. One user in particular, Sensemaker, did not follow what was claimed to be the generally-accepted practice of using four tildes (~~~~) to automatically attach a linked signature and timestamp, instead manually adding “-Sensemaker” to comments. HagermanBot did not recognize this as a valid signature and would therefore add the {{unsigned}} template message to the end, which Sensemaker would usually remove. After this occurred about a dozen times in the first few days of HagermanBot's existence, Sensemaker left a message on Hagerman's user talk page, writing:

HangermanBot keeps adding my signature when I have not signed with the normal four tilde signs. I usually just sign by typing my username and I prefer it that way. However, this Bot keeps appearing and adding another signature. I find that annoying. How do I make it stop? -Sensemaker

As with the previous request, Hagerman initially responded quickly, agreeing to exclude Sensemaker within ten minutes of his message and altering the bot's code fifteen minutes later. However, Hagerman soon reversed his position on the matter after another editor said that granting Sensemaker's request for exclusion would go against the purpose of the bot, emphasizing the importance of timestamps in discussion pages. Sensemaker's manual signature did not make it easy for a user to see when each comment was made, which Fyslee, a vocal supporter of the bot, argued was counterproductive to the role of discussion spaces. Hagerman struck the earlier comments and recompiled the bot to automatically sign Sensemaker's comments, again calling Fyslee's remarks “Very insightful!” As may be expected, Sensemaker expressed frustration at Hagerman's reversal and Fyslee's comment – in an unsigned comment which was promptly ‘corrected’ by HagermanBot.

For Sensemaker and other editors with similar complaints, it was not clear “who gave you [Hagerman] the right to do this,” as one anonymous user who contested HagermanBot exclaimed. Hagerman responded to such rights-based arguments by linking to his bot proposal, which had been approved by the Bot Approvals Group – trying to enroll this committee as an ally in defense of the bot. At the time, Hagerman had a strong set of allies mobilized: a growing number of enthusiastic supporters, the BAG, the Signatures guideline, ideals of openness and transparency, visions of an ideal discursive space, the {{unsigned}} template, and a belief that signing unsigned comments was a routine act that had long been performed by humans. Yet for some reason, a growing number of editors began objecting to this regular, uncontroversial practice when HagermanBot began performing it automatically for every unsigned comment in Wikipedia.

Having failed to convince Hagerman one-on-one, Sensemaker shifted venues and brought the issue to the members of the Bot Approvals Group. Sensemaker formally asked the BAG to require an opt-out mechanism for the bot, lamenting that Hagerman could apparently “force something upon people who expressly ask to be excluded.” Many editors who had previously left their comments either unsigned or signed with non-standard signatures began to make themselves visible, showing up at Hagerman's user talk page and other bot-related discussion spaces to contest what they portrayed as an unfair imposition of what they believed ought to be optional guidelines. The anti-HagermanBot group was diverse in their stated rationales and suggested solutions, but all objected to the bot's operation on some level. Some objectors were staunchly opposed to any user signing their comments, bot or human, and took issue with the injunction to sign one's comments using the four tilde mechanism. Sensemaker was in this group, although others did not want to sign with either their name or a timestamp at all.30 Another group did not want to see a bot universally enforcing such a norm, independent of their stance on the necessity of signatures:

I don't really like this bot editing people's messages on other people's talk pages without either of their consent or even knowledge. I think it's a great concept, but it should be an opt-in thing (instead of opt-out), where people specify with a template on their userpage if they want it, like Werdnabot, it shouldn't just do it to everyone. Just my two cents. --Rory096 01:36, 11 December 2006 (UTC)

In the ensuing discussion – which comprised BAG members, administrators, and other Wikipedians – it became clear that this was not a debate simply about signatures and timestamps. The debate had become a full-blown controversy about the appropriateness of delegating social tasks to technologies, and a number of the participants reflected that they had entered new territory with this issue. There had been debates about bots in Wikipedia before, but most were not about bots in the abstract. Instead, prior bot-related discussions generally revolved around whether a particular task – which just happened to be performed by a bot – was a good idea or not. If there was a consensus for performing the task, the bot was approved and began operating; if there was no consensus, the bot was rejected, or suspended if it had already been operating. In the case of HagermanBot, critics increasingly began to claim that there was something fundamentally different between humans sporadically correcting violations of a generally-accepted norm and a bot relentlessly ensuring total compliance with its interpretation of this norm. For these objectors, the burden was on Hagerman and his allies to reach a consensus in favor of the current implementation of the bot if the bot was to continue operating.

The bot's supporters rejected this, claiming that HagermanBot was only acting in line with a well-established and agreed-upon understanding that the community had long ago reached regarding the importance of signatures and timestamps in discussion spaces. For them, the burden was on the critics to reach a consensus to amend the Signatures guideline if they wanted to stop the bot from operating. Hagerman portrayed the two supported opt-out systems (!NOSIGN! in edit summaries and in pages) not as ways for users to decide for themselves if they ought to abide by the Signatures guideline, but rather as ways to keep the bot from signing particular contributions to talk pages that are not actually comments and therefore, according to the guideline, do not need to be signed. These exceptions included the various informational banners routinely placed on talk pages to let editors know, for example, that the article is being nominated for deletion or that it will be featured on the main page the next week. HagermanBot assumed total editorial compliance with the Signatures guideline; the two opt-out features were to ensure more conformity, not less, by allowing editors to tell the bot when a signature would be unwarranted according to the guideline. Those who were opposed to the Signatures guideline in general could use the tedious !NOSIGN! feature to make the bot not enforce the guideline when they made comments, but Hagerman urged them not to attempt to opt out in this manner.

30 In WikiWikiWeb, the first wiki created by Ward Cunningham in 1995, there is a strong norm against having timestamps in comments. Signing a comment with one's username is also seen as an option; some do, some do not. Veteran WikiWikiWeb contributors talk about how “wiki is a perpetual now,” with one essay celebrating the potential to have an argument with yourself five years ago. I do not know if any of those who opposed the use of signatures and/or timestamps brought this tradition from WikiWikiWeb.

HagermanBot's allies were thus able to specifically articulate a shared vision of how discussion spaces were and ought to be, placing strong moral emphasis on the role of signatures and timestamps in maintaining discursive order and furthering the ideals of openness and verifiability. Like all approved bots which came before it, HagermanBot was acting to realize a community-sanctioned vision of what Wikipedia was and how it ought to be. The Signatures guideline was clear, stating that users were not to be punished for failing to sign their comments, but that all comments should be signed, given that signatures were essential to the smooth operation of Wikipedia as an open, discussion-based community. Yet this proved inadequate to settle the controversy, because those opposed to HagermanBot were articulating a different view of how Wikipedia was and ought to be – one which did not directly contest the claims made regarding the importance of signatures, discussion pages, and communicative conventions. Instead, those like Sensemaker advanced an opposing view of how Wikipedians, and especially bot operators, ought to act toward each other in the project, a view that drew heavily on notions of mutual respect:

Concerning your emphasis on the advantages of the bot I am sure that it might be somewhat convenient for you or others to use this bot to sign everything I write. However, I have now specifically requested to not have it implemented against my will. I would not force something upon you that you expressly said you did not want for my convenience. Now I humbly request that the same basic courtesy be extended to me. - Sensemaker

For HagermanBot’s allies, these objections were categorically interpreted as irrational, malicious, or indicative of what an administrator named Rich Farmbrough called “botophobia.” While this seems to be a pejorative description that would strengthen Hagerman’s position, it restructured the controversy and allowed it to be settled in Sensemaker’s favor. In entering the debate, Farmbrough argued that while Hagerman and his allies were entirely correct in their interpretation of the Signatures guideline, Hagerman should still allow an opt-out system:

On the one hand, you can sign your edits (or not) how you like, on the other it is quite acceptable for another user to add either the userid, time or both to a talk edit which doesn't conatin them. Nonetheless it might be worth allowing users to opt out of an automatic system - with an opt out list on a WP page (the technical details will be obvious to you)- after all everything is in history. This is part of the "bots are better behaved than people" mentality whihc is needed to avoid botophobia. Rich Farmbrough, 18:22 6 December 2006 (GMT).


This mediation between the seemingly incommensurable views of how Wikipedia as a sociotechnical system ought to operate proved enough to resolve the controversy. Declarations of what either side was entitled to, largely articulated in the language of positive rights, were displaced by notions of responsibility, good behavior, and mutual respect. What it meant to be a good bot operator now included maintaining good relations with editors who objected to bots, or else risk a wave of anti-bot sentiment. The next day, Hagerman agreed, and the issue was settled:

Very true. That sounds like a great idea. I'll implement those changes this evening. Thanks, Hagerman(talk) 23:20, 6 December 2006 (UTC)

Ok, I've implemented an opt out procedure. Thanks again, Hagerman(talk) 03:35, 7 December 2006 (UTC)

Excellent, an opt out option was all I was asking for. A sensible but persistant appeal to a gentleman's sense of decency and reciprocity almost always succeeds. I am glad it did so in this case too. I have done a simple test and it seems to have worked. Thank you. –Sensemaker — Preceding unsigned comment added by Sensemaker (talk • contribs) 16:19, 7 December 2006

Glad it's working out for you. Best, Hagerman(talk) 23:43, 7 December 2006 (UTC)

2.3: An unexpected ally

While the opt-out list may seem like a concession made by Hagerman, it proved to be one of his strongest allies in defending HagermanBot from opponents, who were arriving in numbers at his user talk page and other spaces, even after the Sensemaker/Hagerman dispute had been settled. Most editors left value-neutral bug reports or positive expressions of gratitude, but a small but steadily-increasing number of editors continued to complain about the bot's automatic signing of their comments. The arguments made against HagermanBot were diverse in their rationales, ranging from complaints based on annoyance to accusations that the bot violated long-established rights of editors in Wikipedia. As one editor asked:

Who gave you the right to do this ? It is not mandatory that we sign, AFAIK. Instead of concocting this silly hack, why not get the official policy changed? I suppose you effectively did that by getting permission to run your bot on WP. How did you manage that anyway? (I won't bother with typing the fourtildas).

It isn't a policy, however, it is a guideline. You can view its approval at Wikipedia:Bots/Requests for approval/HagermanBot. Feel free to opt out if you don't want to use it. Best, Hagerman(talk) 02:29, 5 January 2007 (UTC)

As seen in Hagerman's reply to this objection, an institutional ally was useful in rebutting the objections made against his bot: the Bot Approvals Group, which had reviewed and approved the bot according to established protocols. The Signatures guideline, including the distinction between guidelines and policies, was also invoked to justify HagermanBot's actions, as shown in both examples. It would seem that these actors in Wikipedia's policy environment – which are generally taken to draw their legitimacy from a broad, project-wide consensus – would be the most powerful allies that Hagerman could deploy in support of HagermanBot's actions and its vision of how discussion spaces in Wikipedia ought to operate. However, a much stronger ally proved to be the opt-out list, through which angry editors could be made to lose interest in the debate altogether. It is this last actor that was most widely used by Hagerman and his human allies, who began to routinely use the opt-out list to respond to a wide array of objections made against the bot.

The strength of the opt-out list was its flexibility in rebutting two kinds of arguments: first, the largely under-articulated claims that the bot was annoying or troublesome to particular editors; and second, the ideological or rights-based arguments that the bot was acting against fundamental principles of the project's normative structure. The first argument was easy to rebut, given that the opt-out list completely responded to these more practical concerns. In contrast, those making the second kind of argument called forth juridico-political concepts of rights, autonomy, and freedom. Yet the same opt-out list could be invoked in HagermanBot's defense against these objections, as it foreclosed their individual claims that the bot was violating their editorial rights. While some objectors stated they would have preferred that the bot use an opt-in list to preemptively ensure the rights of all editors, the opt-out list allowed HagermanBot to be characterized as a supremely respectful entity that was, as the new philosophy of bot building held, “better behaved than people.”

2.4: The formalization of exclusion compliance

HagermanBot’s two new features – the opt-out list and the