Flywheel: Google's Data Compression Proxy for the Mobile
Total Page:16
File Type:pdf, Size:1020Kb
Flywheel: Google’s Data Compression Proxy for the Mobile Web Victor Agababov∗ Michael Buettner Victor Chudnovsky Mark Cogan Ben Greenstein Shane McDaniel Michael Piatek Colin Scott† Matt Welsh Bolian Yin Google, Inc. †UC Berkeley Abstract Although the number of sites that are tuned for mo- bile devices is growing, there is still a huge opportunity Mobile devices are increasingly the dominant Internet to save users money by compressing web content via a access technology. Nevertheless, high costs, data caps, proxy. This paper describes Flywheel, a proxy service and throttling are a source of widespread frustration, and integrated into Chrome for Android and iOS that com- a significant barrier to adoption in emerging markets. presses proxied web content by 58% on average (50% This paper presents Flywheel, an HTTP proxy service median across users). While proxy optimization is an that extends the life of mobile data plans by compress- old idea [15, 22, 29, 39, 41] and the optimizations we ap- ing responses in-flight between origin servers and client ply are known, we have gained insights by studying a browsers. Flywheel is integrated with the Chrome web modern workload for a service deployed at scale. We browser and reduces the size of proxied web pages by describe Flywheel from an operational and design per- 50% for a median user. We report measurement results spective, backed by usage data gained from several years from millions of users as well as experience gained dur- of deployment and millions of active users. ing three years of operating and evolving the production service at Google. Flywheel’s data reduction benefits rely on coopera- tion between the browser and server infrastructure at 1 Introduction Google. For example, Chrome has built-in support for the SPDY [11] protocol and the WebP [12] image com- This paper describes our experience building and running pression format. Both improve efficiency, yet are rarely a mobile web proxy for millions of users supporting bil- used by website operators because they require cumber- lions of requests per day. In the process of developing some, browser-specific configuration. Rather than wait- and deploying this system, we gained a deep understand- ing for all sites to adopt best practices, Flywheel applies ing of modern mobile web traffic, the challenges of de- optimizations automatically and universally, transcoding livering good performance, and a range of policy issues content on-the-fly at Google servers as it is served. that informed our design. We focus on mobile devices because they are fast be- This paper makes two key contributions. First, our ex- coming the dominant mode of Internet access. Trends are perience with Flywheel has given us a deep understand- clear: in many markets around the world, mobile traffic ing of the performance issues with proxying the mod- volume already exceeds desktop [23], and double-digit ern mobile web. Although proxy optimization delivers growth rates are typical [32]. clear-cut benefits for data reduction, its impact on la- tency is mixed. Measurements of Flywheel’s overall per- Despite these trends, web content is still predomi- formance demonstrate the expected result: compression nantly designed for desktop browsers, and as such is in- improves latency. In practice however we find that Fly- efficient for mobile users. This situation is made worse wheel’s impact on latency varies significantly depending by the high cost of mobile data. In developed markets, on the user population and metric of interest. For exam- data usage caps are a persistent nuisance, requiring users ple, we find that Flywheel decreases load time of large to track and manage consumption to avoid throttling or pages but increases load time for small pages. overage fees. In emerging markets, data access is of- ten priced per-byte at prohibitive cost, consuming up to Our second contribution is a detailed account of the 25% of a user’s total income [18]. In the face of these many design tradeoffs and measurement findings that we costs, supporting the continued growth of the mobile In- encountered in the process of developing and deploy- ternet and mobile web browsing in particular is our pri- ing Flywheel. While the idea of an optimizing proxy mary motivation. is conceptually simple, our design has evolved contin- uously in response to deployment experience. For ex- ∗Authors are listed in alphabetical order. ample, we find that middleboxes within mobile carri- ers are widespread, and often modify HTTP headers in ways that break na¨ıve proxied connections. While use of HTTPS and SPDY would prevent tampering, always en- crypting traffic to the proxy is at odds with features such as parental controls enforced by mobile carriers. Perhaps Google Datacenter HTTP Site unsurprisingly, addressing these tussles consumes signif- Figure 1: Flywheel sits between devices and origins, au- icant engineering effort. We report on the incidence and tomatically optimizing HTTP page loads. variety of these tussles, and map them to a clearer picture that most content providers still do not employ these ser- of mobile web operation with the hope that future sys- vices. tem designs will be informed by the practical concerns More recent optimizations such as WebP [12] and we have encountered. As far as we know, we are the first SPDY [11] have been available for years yet have very to publish a discussion of these tradeoffs. low adoption rates. We find that 0.8% of images on the 2 Background web are encoded in the WebP format, and only 0.9% of sites support SPDY [49]. We built Flywheel in response to the practical stumbling Users should not have to wait for sites to catch up to blocks of today’s mobile web. Ideally, Flywheel would best practices. Modern browsers such as Chrome are up- be unnecessary. Mobile data would be cheap, and con- dated as often as every six weeks, providing a constant tent providers would be quick to adopt new technologies. stream of new opportunities for optimization that are dif- Neither is true today. ficult for web developers to track. Moreover, as mobile Mobile Internet usage is large and growing rapidly. devices proliferate, the complexity of optimizing sites The massive growth of mobile Internet traffic has cre- to conform to the latest platforms (e.g. high-resolution ated a tremendous opportunity for automatic optimiza- tablets requiring higher image quality) is a daunting task tion. In Asia and Africa, 38% of web page views are for all but the most committed site owners. performed on mobile devices as of May 2014, a year- In sum, it is not surprising that most site owners do over-year increase exceeding 10% [36]. In North Amer- not take advantage of all browser- and device-specific ica, mobile page loads are 19% of total traffic volume optimizations. Just as we do not expect programmers to with 8% growth yearly. In February 2014, research firm manually unroll loops, we should not expect site owners comScore reported that time spent using the Internet on to remember to apply an ever-expanding set of optimiza- mobile devices exceeded desktop PCs for the first time tions to their sites. We need an optimizing compiler for in the United States [23]. These trends match our expe- the web—a service that automatically applies optimiza- rience at Google. Mobile is increasingly dominant. tions appropriate for a given platform. Growth in emerging markets is hampered by cost. Emerging markets are growing faster than developed 3 Design & Implementation markets. Year-over-year growth in mobile subscriptions This section describes Flywheel’s design and implemen- is 26% in developing countries compared to 11.5% in tation. The high-level design (depicted in Figure1) is developed countries. In Africa, growth exceeds 40% an- conceptually simple: Flywheel is an optimizing proxy nually [32]. Despite surging popularity, the high cost of service. Chrome sends HTTP requests to Flywheel mobile access encumbers usage. One survey of 17 coun- servers running in Google datacenters. These proxy tries in sub-Saharan Africa reports that mobile phone servers fetch, optimize, and serve origin responses. An spending was 10-26% of individual income in the lower- example optimization is transcoding a large, lossless 75% income bracket [18]. PNG image into a small, lossy WebP. The remainder of Site operators are slow to adopt new technologies. this section describes our goals, the flow of requests and Adapting websites for mobile involves manual and of- responses, and the optimizations applied in transit. We ten complex optimizations, and most sites are poorly conclude the section with a discussion of fault tolerance. equipped to make even simple changes. For example, 3.1 Goals & Constraints measurements of Flywheel’s workload show that 42% of HTML bytes on the web that would benefit from com- Flywheel’s primary goal is to reduce mobile data usage pression are uncompressed, despite GZip being univer- for web traffic. To achieve this goal we must address the sally supported in modern web browsers [13]. This is in practical requirements of integrating with the Chrome part because GZip is not enabled by default on most web browser, used by hundreds of millions of people. servers, yet only a single-line change to the server con- Opt-in. Recognizing that many users are sensitive to the figuration is needed. While hosting-providers and CDNs privacy issues of proxying web content through Google’s deal with such configuration issues on the behalf of con- servers, we choose to keep the Flywheel proxy off by de- tent providers, the pervasive lack of GZip usage indicates fault. Users must explicitly enable the service. Proxy HTTP URLs only. Flywheel applies only to quests are multiplexed. Flywheel in turn translates these HTTP URLs.