The Anatomy of Web Censorship in Pakistan
Zubair Nabi
Information Technology University, Pakistan

Abstract

Over the years, the Internet has democratized the flow of information. Unfortunately, in parallel, authoritarian regimes and other entities (such as ISPs) have, for their vested interests, curtailed this flow by partially or fully censoring the web. The policy, mechanism, and extent of this censorship vary from country to country.

We present the first study of the cause, effect, and mechanism of web censorship in Pakistan. Specifically, we use a publicly available list of blocked websites and check their accessibility from multiple networks within the country. Our results indicate that the censorship mechanism varies across websites: some are blocked at the DNS level while others are blocked at the HTTP level. Interestingly, the government shifted to a centralized, Internet exchange level censorship system during the course of our study, enabling us to compare two generations of blocking systems. Furthermore, we report the outcome of a controlled survey to ascertain the mechanisms that people actively employ to circumvent censorship. Finally, we discuss some simple but surprisingly unexplored methods of bypassing restrictions.

1 Introduction

In recent years, the Internet has faced an onslaught of censorship and restrictions with local [19, 14] and global [1] ramifications. The world over, authoritarian regimes have been blocking web access under the pretext of maintaining public order. This is more pronounced in the developing world, where freedom of speech and freedom of information are largely undefined. In the same vein, Pakistan has become a poster child for web censorship rooted in religion, politics, and conflict/security [16]. It has also been revealed as one of the 36 countries which host FinFisher Command & Control servers to spy on their citizens [13].

According to a 2012 World Bank study, 9% or around 16 million Pakistanis have access to the Internet [20]. Out of these 16 million users, 64% employ the Internet to access news websites [22]. Therefore, the government has a high incentive to stifle this access. Practically, filtering in Pakistan is largely geared towards blocking content which is considered a threat to national security and/or content which is blasphemous. The largest Internet Exchange Point (IXP) in the country is owned by the state, which simplifies the enforcement of state-wide censorship. This censorship has been applied in waves over the past decade, often mandated by the judiciary [16]. Its side effects have at times had global impact. For instance, in 2008 a naive attempt by the authorities to censor YouTube rendered the website unreachable for a large number of ASes across the world [5].

A large number of studies have recently been conducted on the mechanism and effect of censorship around the world. Verkamp and Gupta [19] used PlanetLab nodes and volunteer machines to study the mechanisms of censorship in 11 countries. One of their key insights is that these mechanisms vary from country to country. Similarly, Mathrani and Alipour [14] presented the results of tests conducted across 10 countries using private VPNs and volunteer nodes. Their results show that restrictions in these countries are applicable to all categories of websites: politics, social networking, culture, news, entertainment, and religion. Likewise, Dainotti et al. [9] made use of publicly accessible datasets to dissect the Internet outages in Libya and Egypt during the Arab Spring. In the same vein, a large body of work [21, 7, 8, 1] is dedicated to analyzing the modus operandi and consequences of censorship due to the Great Firewall of China.

To the best of our knowledge, this is the first work to dissect in detail the mechanism and effect of censorship in Pakistan. Unlike studies conducted for other countries, which relied on volunteer machines and PlanetLab nodes, we directly use 5 different networks within Pakistan as vantage points to carry out our tests. More importantly, during the course of our tests the country underwent an upgrade to a central and standardized censorship system, reportedly developed by the Canadian firm Netsweeper Inc.¹ [18]. Therefore, our results juxtapose two generations of systems. Moreover, we present the outcome of a controlled survey to gauge the mechanisms through which citizens are currently circumventing online blockages. Finally, we augment these mechanisms by discussing the use of CDNs and search engine caches. Our results can be summarized as follows:

• A large number of websites are blocked using DNS injection
• The alternative mode of censorship in the previous system (at the ISP level) was HTTP 302 redirection and in case of the current system (at the IXP level), it is fake HTTP response injection
• Websites restricted at the DNS-level are also blocked at the HTTP-level
• A large fraction of people either use public VPN services or web proxies to access restricted content
• CDNs and search engine caches are currently viable options to access blocked content

¹ The same firm has in the past provided its filtering services to Qatar, the United Arab Emirates, Kuwait, and Yemen [18].

The rest of the paper is organized as follows. We give a brief history of Internet censorship in Pakistan in §2. The methodology employed for our study is discussed in §3. §4 presents the results of our tests and survey. Alternative anti-censorship mechanisms are discussed in §5. We finally conclude in §6 and also discuss future directions.

2 Background

Both telephony and Internet services in Pakistan are managed by an arm of the state called the Pakistan Telecommunication Authority (PTA). It is in charge of regulation and licensing of fixed-line telephony, cellular services, cable TV, and Internet services within the country. Internet censorship is also enforced by the government through the PTA. To give the reader some perspective, the following is a timeline of censorship enforced by the government:

• 2006: 12 websites blocked for hosting blasphemous content. The content which was deemed offensive included a Blogspot blog. Lacking the infrastructure to block a particular blog, the entire Blogspot website was blocked for two months.
• 2008: A number of YouTube videos marked as offensive by the government. Instead of implementing a URL/IP-specific restriction, an IP-wide block of YouTube via BGP misconfiguration was enforced, making YouTube inaccessible for much of the Internet for 2 hours [5].
• 2010: Facebook, YouTube, Flickr, and Wikipedia partially or fully blocked in reaction to “Everybody Draw Muhammad Day”. These websites were subsequently unblocked. The same year, the government also sanctioned the PTA to “order temporary or permanent termination of telecom services of any service provider, in any part or whole of Pakistan” [2].
• 2012 (March): The government requests proposals for a country-wide URL filtering and blocking system [15]. According to the advertisement, filtering at the time was enabled by manual mechanisms deployed at the ISP level and the desired system was required to enable centralized blocking at the national IP backbone. Some other features² of the system included:
  – Filtering from domain level to sub-folder level as well as blocking of individual files and file types
  – Blocking individual IPs and/or an entire range
  – Remote network monitoring via SNMP and configuration through HTTP and HTTPS
  – Operation at L2 and L3
  – Modularity and scalability through stand-alone, plug-and-play hardware units capable of blocking up to 50 million URLs with a processing latency of less than 1ms
  – Decoupling of policy and mechanism via storage of blacklists in an external database
• 2012 (September): Indefinite ban on YouTube imposed in retaliation to a controversial movie [11]. The side-effects of this ban disrupted other Google services such as Maps, Drive, Play Store, and Analytics [4]. This was due to the fact that the same IPs are shared across all of these services.

² This is the first time a government has made the exact requirements and details of a full-fledged censorship system public.

3 Methodology

We use a publicly available dataset of websites³ to perform connectivity tests. The prime reason for using this particular dataset is that the same list of 597 websites was circulated by the PTA to all ISPs for filtering [3]. While not an exhaustive list, it provides a fairly rich set of both complete domains and subdomains for analysis.

As the list was compiled in 2010, the status of a number of websites has changed. For instance, in 2010 the government banned a small number of YouTube videos, so the list contains individual entries for each video. Since then, the government has blocked YouTube entirely. Therefore, we removed the individual video URLs and added a single entry for YouTube. This reduced the size of the dataset to 562 links. In addition, the list also contains a number of duplicate entries. The removal of these entries further reduced the size of the list to 429. Finally, a number of websites, mostly proxy sites, have now gone offline, bringing down the final tally of test sites to 307. To ensure that these final websites are actually restricted, we tested their connectivity using a public VPN service which terminates in the US. The results indicate that they are accessible via the VPN and are thus restricted within Pakistan.

³ http://propakistani.pk/wp-content/uploads/2010/05/blocked.html

ID        Nature           Location
Network1  University       Lahore
Network2  University       Lahore
Network3  Home             Lahore
Network4  Home             Islamabad
Network5  Cellular (EDGE)  Islamabad

Table 1: Details of Test Networks

Network1 and Network2, due to their academic nature, are connected to 3 and 2 ISPs, respectively, with gigabit con-

Mechanism    No. of Affected Sites  Percent
DNS          187                    60.91
IP           0                      0
URL-keyword  0                      0
HTTP (302)   5                      1.62
Total        192                    62.53

Table 2: Breakdown of Pre-April Test Results

3.1 Test Script

Our test script, dubbed Samizdat⁴, closely mimics the CensMon [17] system with a few modifications.