Characterization of Internet Censorship from Multiple Perspectives

UCAM-CL-TR-897 Technical Report ISSN 1476-2986 Number 897 Computer Laboratory Characterization of Internet censorship from multiple perspectives Sheharbano Khattak January 2017 15 JJ Thomson Avenue Cambridge CB3 0FD United Kingdom phone +44 1223 763500 http://www.cl.cam.ac.uk/ c 2017 Sheharbano Khattak This technical report is based on a dissertation submitted January 2017 by the author for the degree of Doctor of Philosophy to the University of Cambridge, Robinson College. Technical reports published by the University of Cambridge Computer Laboratory are freely available via the Internet: http://www.cl.cam.ac.uk/techreports/ ISSN 1476-2986 Summary Internet censorship is rampant, both under the support of nation states and private actors, with important socio-economic and policy implications. Yet many issues around Internet censorship remain poorly understood because of the lack of adequate approaches to measure the phenomenon at scale. This thesis aims to help fill this gap by developing three methodologies to derive censorship ground truths, that are then applied to real- world datasets to study the effects of Internet censorship. These measurements are given foundation in a comprehensive taxonomy that captures the mechanics, scope, and dynamics of Internet censorship, complemented by a framework that is employed to systematize over 70 censorship resistance systems. The first part of this dissertation analyzes user-side censorship, where a device near the user, such as the local ISP or the national backbone, blocks the user's online communication. This study provides quantified insights into how censorship affects users, content providers, and Internet Service Providers (ISPs); as seen through the lens of traffic datasets captured at an ISP in Pakistan over a period of three years, beginning in 2011. The second part of this dissertation moves to publisher-side censorship. This is a new kind of blocking where the user's request arrives at the Web publisher, but the publisher (or something working on its behalf) refuses to respond based on some property of the user. Publisher-side censorship is explored in two contexts. The first is in the context of an anonymity network, Tor, involving a systematic enumeration and characterization of websites that treat Tor users differently from other users. Continuing on the topic of publisher-side blocking, the second case study examines the Web's differential treatment of users of adblocking software. The rising popularity of adblockers in recent years poses a serious threat to the online advertising industry, prompting publishers to actively detect users of adblockers and subsequently block them or otherwise coerce them to disable the adblocker. This study presents a first characterization of such practices across the Alexa top 5K websites. This dissertation demonstrates how the censor's blocking choices can leave behind a detectable pattern in network communications, that can be leveraged to establish exact mechanisms of censorship. This knowledge facilitates the characterization of censorship from different perspectives; uncovering entities involved in censorship and targets of censorship, and the effects of such practices on stakeholders. More broadly, this study complements efforts to illuminate the nature, scale, and effects of opaque filtering practices; equipping policy-makers with the knowledge necessary to systematically and effectively respond to Internet censorship. 3 To Shahmeer for putting the colour inside of my world 4 Acknowledgements This work has been shaped by the support and encouragement of a whole host of people, to whom I wish to express my profound gratitude. I am grateful to Ali Khayam for instilling in me a passion for research and exploration. His perennial optimism and energy have been a source of great inspiration to me. He also played a key role in materializing our study of censorship in Pakistan. The seeds of this dissertation were planted during my internship with Vern Paxson at The International Computer Science Institute (ICSI), Berkeley, in Autumn, 2012. I am fortunate to have been mentored by him since then. Vern patiently listened to many ideas and provided feedback on several drafts, teaching me along the way about a range of topics|from soundness and rigour in measurement studies, to effective management of collaborative research. I am immensely grateful to Jon Crowcroft for all the freedom he gave me in pursuing my research interests. Jon's insightful comments on my research work have helped me to see the bigger picture. The consideration and respect he affords to his students has made it a privilege to work with him. My heartfelt thanks to Steven Murdoch for helping me every step of the way. His feedback proved invaluable to shape and develop my research ideas into mature work. From Steven, I learnt going the extra mile to do things right; and the value of patient and consistent hard work. I am deeply indebted to Ross Anderson for his unwaivering support and generous encouragement. The life lessons he imparted have equipped me with fundamental capabilities to cope with challenging situations. I have greatly benefited from the support and mentorship of Mobin Javed; without her it would not have been the same. I am grateful to all my collaborators: Colleen Swanson, Damon McCoy, David Fifield, Emiliano De Cristofaro, Hamed Haddadi, Ian Goldberg, Julia E. Powles, Laurent Simon, Marjan Falahrastegar, Narseo Vallina-Rodriguez, Rishab Nithyanand, Sadia Afroz, Srikanth Sundaresan, Tariq Elahi, and Zartash Afzal Uzmi. I would like to thank a number of others who facilitated this work: Arturo Filastò,Bjoern A. Zeeb, Georg Koppen, George Danezis, Juris Vetra, Michael Tschantz, Moritz Bartl, Philipp Winter, and Zakir Durumeric. This work would not have been possible without the cooperation of the anonymous Internet Service Provider in Pakistan and the operators of the Tor exit nodes used in this study. I am grateful to the technical staff at the University of California, Berkeley, University of Cambridge, and University of Michigan for facilitating our scanning experiments. I thank the anonymous IMC, NDSS, IEEE S&P, PETS, and USENIX FOCI reviewers, and our shepherds, Olaf Maennel and Lujo Bauer, whose valuable feedback led to this work being more solid. I am grateful to my Ph.D. examiners, Robert Watson at the Computer Lab and Renata Cruz Teixeira at Inria Paris, for their thorough and constructive feedback; the final draft of this dissertation has 5 benefited greatly from their input. I have been privileged to have had the opportunity to work with many brilliant and helpful people at the Computer Lab. Specifically, I thank Alice Hutchings, Dongting Yu, Ilias Marinos, Kumar Sharad, Richard Mortier, Robert Watson, Rubin Xu, and Sophie Van Der Zee. Special thanks to Julia E. Powles for providing useful comments on framing this work, and generally on effective writing. I am grateful to Caroline Stewart and Lise Gough at the Computer Lab, and Dr Julie Smith at Robinson College in helping me overcome several administrative hurdles that inevitably pop up during graduate years. Thanks to Mateja Jamnik and others involved in women@CL for creating a warm and supportive environment, and for consistently putting together exciting events. I am thankful to Aliya Khalid, Amna Abdul Wahid, Heidi Howard, Jyothish Soman, Mohibi Hussain, Negar Miralaei, Sharmeen Lodhi, and all others who gave freely of their time and friendship. I am deeply grateful to have Jeunese Payne as my friend, who has been a continual source of support. Her kindness and understanding have brightened up many days. Warmest appreciation also to Farwa Bukhari for her relentless support. Finally, I thank my son Shahmeer for being my anchor always|this dissertation is dedicated to him. This work was supported by the Engineering and Physical Sciences Research Council [grant number EP/L003406/1]. 6 Contents 1 Introduction 11 1.1 Background . 11 1.2 Motivation . 14 1.3 Research Question and its Substantiation . 15 1.4 Dissertation Organization and Contributions . 16 1.5 Published Work . 18 1.6 Work Done in Collaboration . 19 2 Internet Censorship and Censorship Resistance 21 2.1 Internet Censorship . 21 2.1.1 Censorship Distinguishers . 22 2.1.2 Scope of Censorship . 22 2.1.3 An Abstract Model of Censorship . 23 2.1.4 Censor's Attack Model . 24 2.2 Censorship Resistance . 28 2.3 Systematization Methodology . 31 2.3.1 Security Properties . 32 2.3.2 Privacy Properties . 33 2.3.3 Performance Properties . 34 2.3.4 Deployability Properties . 35 2.4 Communication Establishment . 36 2.4.1 High Churn Access . 37 2.4.2 Rate-Limited Access . 37 2.4.3 Active Probing Resistance Schemes . 37 2.4.4 Trust-Based Access . 40 2.4.5 Discussion . 40 2.5 Conversation . 42 2.5.1 Access-Centric Schemes . 43 2.5.2 Publication-Centric Schemes . 46 2.5.3 Discussion . 46 2.6 Related Work . 49 2.7 Open Areas and Research Challenges . 50 CONTENTS CONTENTS 2.7.1 Modular System Design . 50 2.7.2 Revisiting Common Assumptions . 51 2.7.3 Security Gaps . 52 2.7.4 Considerations for Participation . 53 2.8 Summary . 54 3 The Consequences of Internet Censorship 55 3.1 Background and Related Work . 55 3.2 Data Sources for the Study . 57 3.2.1 Capture Location and ISP Overview . 57 3.2.2 Data Description . 59 3.2.3 Data Sanitization and Characterization . 59 3.2.4 Final Datasets . 61 3.2.5 User Survey . 61 3.2.6 Ethical Standards . 63 3.3 Establishing Ground Truth . 64 3.3.1 Censorship Indicators . 65 3.3.2 Mechanism of YouTube Censorship . 67 3.3.3 Mechanism of Porn Censorship . 68 3.4 Metrics Relevant to Content Providers . 70 3.5 Changes in User Behaviour . 72 3.5.1 Changes in Traffic . 73 3.5.2 Effects on User Behaviour .

Characterization of Internet Censorship from Multiple Perspectives

We Need Net Neutrality As Evidenced by This Article to Prevent Corporate

Threat Modeling and Circumvention of Internet Censorship by David Fifield

Edri Submission to UN Special Rapporteur David Kaye's Call on Freedom of Expression and the Private Sector in the Digital Age

Array AG 9.4 CLI Handbook

Self-Censorship

A Privacy Threat for Internet Users in Internet-Censoring Countries

Self- Censorship by Pakistani Journalists: Causes and Effects

Internet Censorship in Thailand: User Practices and Potential Threats

Information Commons - a Public Policy Report - by Nancy Kranich

Corporate Censorship and Its Troubling Implications for the First Amendment

Internet Content Regulation in Liberal Democracies. a Literature Review

Understandings and Practices of Freedom of Expression and Press Freedom in Pakistan: Ethnography of Karachi Journalistic Environment