Salmon: Robust Proxy Distribution for Censorship Circumvention
Total Page:16
File Type:pdf, Size:1020Kb
Proceedings on Privacy Enhancing Technologies ; 2016 (4):4–20 Frederick Douglas*, Rorshach, Weiyang Pan, and Matthew Caesar Salmon: Robust Proxy Distribution for Censorship Circumvention Abstract: Many governments block their citizens’ ac- action. This last point has become a particular concern cess to much of the Internet. Simple workarounds are in the past few years, as social media platforms have unreliable; censors quickly discover and patch them. proven excellent for coordinating protests. Previously proposed robust approaches either have As a result, many countries impose restrictions on non-trivial obstacles to deployment, or rely on low- their citizens’ Internet use. These restrictions can take performance covert channels that cannot support typi- the form of dropping packets to/from particular IP ad- cal Internet usage such as streaming video. We present dresses, disrupting resolution of DNS queries, and even Salmon, an incrementally deployable system designed inspecting packets headed to search engines to filter out to resist a censor with the resources of the “Great Fire- undesirable queries. Any attempt to circumvent censor- wall” of China. Salmon relies on a network of volunteers ship must defeat all of these techniques. The simplest in uncensored countries to run proxy servers. Although solution is to tunnel all traffic through an encrypted con- any member of the public can become a user, Salmon nection to a secret proxy server: the secure tunnel hides protects the bulk of its servers from being discovered destination IP addresses, and guards against DNS inter- and blocked by the censor via an algorithm for quickly ference and deep packet inspection. The vulnerability is identifying malicious users. The algorithm entails identi- then the server’s identity: the server is viable for by- fying some users as especially trustworthy or suspicious, passing the censorship only so long as the censor does based on their actions. We impede Sybil attacks by re- not know there is a proxy server at that IP address. quiring either an unobtrusive check of a social network Salmon starts with the simple approach of assem- account, or a referral from a trustworthy user. bling a collection of proxy servers, and fixes the vulner- ability with an algorithm for distributing those proxy Keywords: Censorship servers to the general public, while minimizing the num- DOI 10.1515/popets-2016-0026 ber of servers a censor whose agents control some user Received 2016-02-29; revised 2016-06-02; accepted 2016-06-02. identities can discover and block. Salmon’s algorithm has three core components. 1) We track the probability that each user is an agent of the 1 Introduction censor (suspicion), and ban likely agents. 2) We track how much trust each user has earned, and distribute The Internet has proven to be an extremely powerful higher-quality servers accordingly. Trust also helps keep tool for enabling the free flow of information. Authori- innocent users out of trouble, by isolating them from tarian governments have always been highly concerned newer users, where the censor agents are more likely to with the control of information, and so they perceive the be found. 3) Highly trusted users are allowed to rec- Internet in its natural form to be a grave threat. They ommend their friends to Salmon. We maintain a social seek to eliminate dissemination of inconvenient facts, to graph of user recommendations, and assign members of stifle free discussion of viewpoints different from their the same connected component — a recommendation own, and especially to prevent the coordination of mass ancestry tree — to the same servers, whenever possible. Other works [16, 22] have employed techniques with the same overall theme of gradually identifying and punishing suspicious users. Salmon is more robust than *Corresponding Author: Frederick Douglas: University other approaches, thanks mainly to its trust levels. Our of Illinois Urbana-Champaign, [email protected] simulations show that the addition of our trust level Rorshach: [email protected] logic significantly reduces how many users the censor Weiyang Pan: University of Illinois Urbana-Champaign, can cut off from access to proxy servers. rBridge [22], [email protected] Matthew Caesar: University of Illinois Urbana-Champaign, the most robust previous server distribution technique, [email protected] Salmon: Robust Proxy Distribution for Censorship Circumvention 5 allows over three times as many users to be cut off from – The censor cannot identify proxies via traffic finger- the system as Salmon (§5.5). printing. Anti-fingerprinting is important and has Salmon’s other main advantage over similar systems been studied [18, 21, 23], but is orthogonal to server lies in striking the proper balance between making the distribution. system easily accessible to the general public, and mak- – The censor may try to block as many servers as it ing it difficult for the censor to insert many of its agents can immediately, or may be willing to simply dis- into the system. Previous proposals have gone a bit too cover servers, and not take action for months. far in one of those directions. In addition to recommen- dations, Salmon can optionally accept new users who To reiterate, we are assuming that the censor cannot can prove ownership of a well-established Facebook ac- identify proxies via traffic fingerprinting only because count. If Facebook is blocked, existing robust but low- proxy distribution is not an interesting question if the bandwidth techniques are sufficient for getting the user censor has this capability. Careful proxy distribution through Salmon’s registration process. The system can schemes are useful when the censor can block circum- therefore be made open even to people who do not know vention resources if and only if it receives the same in- any longtime Salmon users. formation that a legitimate user needs to utilize them. Salmon’s name comes from its trust levels: just as If the censor can identify circumvention traffic, it has salmon swimming upstream must hop up small water- no reason to bother attacking the distribution mecha- falls, and might briefly fall backwards, users need to nism. Both the fingerprinting problem and its solutions advance up the trust levels. As a bonus, ‘salmon’ can are generic to most circumvention systems. be translated to Persian as ‘free (as in freedom) fish’! 2.2 Problem statement 2 Salmon Salmon is built to solve a single deceptively simple prob- lem. We are in possession of some sensitive information 2.1 Threat model that we wish to distribute to a large group of users, most of whom are strangers to us. Mixed in among those users We assume that the censor is a branch of a national gov- are agents of an adversary — in our case, the censor. The ernment, able to take full control of any router or server adversary can use the secret information to damage us, inside the country. We also assume that the censor is and so tries to learn as much as possible. When the content to merely block access, rather than seek out adversary inflicts that damage, the fact that they have and arrest those who attempt to access forbidden ma- learned the information is revealed to us. However, we terial. If the government did decide to carry out large- do not know how they learned it. scale arrests against our users, they could easily do so In terms of censorship circumvention, we have: by contributing many servers to our system as honey- – We want to give proxy server IP addresses to any pots, and recording the users they receive. There is no interested user living under censorship. good solution to this problem, short of manually verify- – The censor’s employees can pose as ordinary users. ing server volunteers. Even using onion routing will not – A censor can choose to block any proxy it discovers. prevent the censor from observing that specific citizens – Such a block reveals to us that the censor somehow are evading censorship. learned there was a proxy server at that address. We assume the censor can employ large amounts of human labor. A task that requires tens or hundreds This problem’s difficulty comes from the requirement to of full-time employees is within the capabilities of our keep the system open to all potential users, combined censor. The Chinese government already has a much with censors’ ability to behave identically to good users larger force monitoring posts on Chinese social media for as long as they wish. In fact, it’s clear from these two sites such as Weibo and Renren [1]. facts that a perfect solution — one that keeps all servers In summary, we assume: from being blocked — is impossible: until the censor – The censor has enough employees to perform labor- blocks a server, its agents can remain indistinguishable intensive tasks, such as examining a list of servers. from good users. – Our users are not personally targeted by the censor. Salmon: Robust Proxy Distribution for Censorship Circumvention 6 20000 Level 6 14000 Level 6 Level 5 Level 5 12000 15000 Level 4 Level 4 Level 3 Level 3 10000 Level 2 Level 2 Level 1 8000 Level 1 10000 Level 0 Level 0 Users Users 6000 5000 4000 2000 0 0 40 60 80 100 120 140 0 20 40 60 80 100 Time (days) Time (days) Fig. 1. Evolution over time of the proportions of users in different Fig. 2. Similar to fig. 1, but the system launches with 20 special trust levels. Every day, for every 30 users, one new user enters the users (friends of the administrators), who are considered to be system at level 0. above the trust hierarchy, and can recommend one level 6 user per day. Level 6 users recommend one level 5 user every four weeks.