An Analysis of China's “Great Cannon”
Total Page:16
File Type:pdf, Size:1020Kb
An Analysis of China’s “Great Cannon” Bill Marczak Nicholas Weaver Jakub Dalek Roya Ensafi UC Berkeley, Citizen Lab ICSI, UC Berkeley Citizen Lab Princeton University David Fifield Sarah McKune Arn Rey John Scott-Railton Ron Deibert UC Berkeley Citizen Lab Citizen Lab Citizen Lab Vern Paxson ICSI, UC Berkeley Abstract content as a man-in-the-middle. The operational deployment of the GC represents a On March 16th, 2015, the Chinese censorship apparatus em- significant escalation in state-level information control: ployed a new tool, the “Great Cannon”, to engineer a denial- of-service attack on GreatFire.org, an organization dedicated the normalization of the widespread use of an attack tool to resisting China’s censorship. We present a technical analy- to enforce censorship by weaponizing users. Specifi- sis of the attack and what it reveals about the Great Cannon’s cally, the GC manipulates the traffic of “bystander” sys- working, underscoring that in essence it consitutes a selective tems outside China, silently programming their browsers nation-state Man-in-the-Middle attack tool. Although sharing to create a massive DDoS attack. While in this case em- some code similarities and network locations with the Great ployed for a highly visible attack, the GC clearly has Firewall, the Great Cannon is a distinct tool, designed to com- the capability for use in a manner similar to the NSA’s promise foreign visitors to Chinese sites. We identify the Great QUANTUM system [48], affording China the opportu- Cannon’s operational behavior, localize it in the network topol- nity to deliver exploits targeting any foreign computer ogy, verify its distinctive side-channel, and attribute the system that communicates with any China-based website not as likely operated by the Chinese government. We also discuss fully protected by HTTPS. the substantial policy implications raised by its use, including the potential imposition on any user whose browser might visit We begin in x 2 with a review of the Great Firewall’s (even inadvertently) a Chinese web site. operation, and then describe how the GC operates in x 3. We analyze the history of the injections as observed at 1 Introduction a medium-sized enterprise (x 4) and the impact of the Great Cannon in the DoS attack on GreatFire.org in x 5. On March 16, 2015, GreatFire.org observed that Ama- In x 6 we provide the technical basis for attributing the zon CloudFront services they rented to make blocked operation of the GC to the Chinese government, and dis- websites accessible in China were targeted by a dis- cuss the consequent policy implications in x 7. We re- tributed denial-of-service (DDoS) attack. On March 26, flect on some plausible enhancements that the GC’s op- two GitHub pages run by GreatFire.org also came under erators could employ (x 8) and offer final conclusions the same type of attack. The attacker targeted services in x 9. designed to circumvent Chinese censorship. A report released by GreatFire.org fingered malicious Javascript 2 Review of The Great Firewall returned by Baidu servers as the source of the at- tack [25]. Baidu denied that their servers were compro- In general, firewalls serve as in-path barriers between mised [42]. Several previous technical reports [5, 36, 41] networks: all traffic between the networks must flow suggested that the Great Firewall of China (GFW) or- through the firewall. Thus the name is actually a chestrated these attacks by injecting malicious Javascript misnomer for China’s “Great Firewall”, which instead into Baidu connections as they transited China’s network operates as an on-path system. The GFW eaves- border. drops on traffic between China and the rest of the In this work we show that while the attack infrastruc- world (TAP in Figure1), and terminates requests for ture was co-located with the GFW, the perpetrators car- banned content (for example, upon seeing a request for ried out the attack using a separate offensive system, “http://www.google.com/?falun”, regardless of the ac- with different capabilities and design, that we term the tual destination server) by injecting a series of forged “Great Cannon” (GC). The Great Cannon is not simply TCP Reset (RST) packets that tell both the requester and an extension of the Great Firewall, but a distinct attack the destination to stop communicating [15]. tool that hijacks traffic to (or presumably from) individ- On-path systems have architectural advantages for ual IP addresses, and can arbitrarily replace unencrypted censorship due to their less disruptive nature in the pres- Figure 2: The Great Cannon’s decision flow Figure 1: A simplified logical topology of the Great Cannon and Great Firewall examine all traffic on a link, but only intercepts traffic to (or presumably from) a set of targeted addresses. It ence of failure, but are less flexible and stealthy than is plausible that this reduction of the full traffic stream in-path systems as attack tools, because while they can to just a (likely small) set of addresses significantly aids inject additional packets, they cannot prevent in-flight with enabling the system to keep up with the very high packets (packets already sent) from reaching their desti- volume of traffic: the GC’s full processing pipeline only nation [49]. Thus, one generally can identify the pres- has to operate on the much smaller stream of traffic to ence of an on-path system interacting with active flows or from the targeted addresses. In addition, the GC only by observing anomalies resulting from the presence of examines individual packets in determining whether to both the injected and legitimate traffic. take action, avoiding the computational costs of TCP The GFW keeps track of connections and reassem- bytestream reassembly. The GC also maintains a flow bles their packets to determine if it should block traf- cache of connections that it uses to ignore recent con- fic [29]. As opposed to considering each packet in iso- nections it has deemed no longer requiring examination. lation, this reassembly process requires additional com- We also show that the GC however shares several fea- putational resources, but facilitates better blocking ac- tures with the GFW. Like the GFW, the GC also uses a curacy. While a web request usually fits within a single multi-process design, with different source IP addresses packet, web replies may be split across several packets, handled by distinct processes. The packets injected by and the GFW needs to reassemble these packets to un- the GC have the same peculiar TTL side-channel as derstand whether a web reply contains banned content. those injected by the GFW, suggesting that both the On any given physical link, the GFW runs its re- GFW and the GC likely share some common code. We assembly and censorship logic in multiple parallel pro- find the GC deployed at the same network locations as cesses [4] (perhaps running on a cluster of many differ- the GFW. Taken together, these findings suggest that al- ent computers). Each process handles a subset of the though the GC and GFW are independent systems with link’s connections, with all packets on a connection go- different functionality, there are significant structural re- ing to the same process. This load-balanced architec- lationships between the two. ture reflects a common design decision when a physi- In the attack on GitHub and GreatFire.org, the GC cal link carries more traffic than a single computer can intercepted traffic sent to Baidu infrastructure servers track [46]. Each GFW process also exhibits a highly dis- that host commonly used analytics, social, or advertis- tinctive side-channel—it progressively increments the ing scripts. If the GC saw a request for certain Javascript IP TTL field on successive packets injected into the same files on one of these servers, it appeared to probabilisti- connection [4]. cally take one of two actions: it either passed the request on to Baidu’s servers unmolested (roughly 98.25% of the 3 The Great Cannon time), or it dropped the request before it reached Baidu The Great Cannon differs from the GFW: the GC oper- and instead sent a malicious script back to the request- ates as an in-path system, capable of not only injecting ing user (roughly 1.75% of the time). In this case, the traffic but also directly suppressing traffic. Doing so en- requesting user is an individual outside China browsing ables it to act as a full “man-in-the-middle” (MITM) for a website making use of a Baidu infrastructure server targeted flows. We show that the GC does not actively (e.g., a website with ads served by Baidu’s ad network). The malicious script enlisted the requesting user as an port, we sent one request designed to trigger GC injec- unwitting participant in the DDoS attack against Great- tion, followed by a request designed to trigger GFW in- Fire.org and GitHub. Figure2 shows our overall view of jection. We repeated the experiment from a number of the GC’s decision flow. source ports. While packets injected by both the GFW and GC exhibited a similar (peculiar) TTL side-channel 3.1 Evaluating the GC’s Functionality (previously reported in [4]) indicative of shared code be- We began our investigation by confirming the continued tween the two systems, we found no apparent correlation normal operation of the GFW’s censorship features.1 between the GFW and GC TTL values themselves. We did so with measurements between our test system The GC appears to be co-located with the GFW. outside of China and a Baidu server that we observed We used the same TTL technique to localize the GC on returning the malicious Javascript.