
WebCore: Architectural Support for Mobile Web Browsing Yuhao Zhu Vijay Janapa Reddi Department of Electrical and Computer Engineering The University of Texas at Austin [email protected], [email protected] Abstract The Web browser is undoubtedly the single most impor- Browser Browser tant application in the mobile ecosystem. An average user 63% 54% spends 72 minutes each day using the mobile Web browser. Web browser internal engines (e.g., WebKit) are also growing 23% 8% 32% Media 6% in importance because they provide a common substrate for 7% 7% Others developing various mobile Web applications. In a user-driven, Media Games Others interactive, and latency-sensitive environment, the browser’s Email performance is crucial. However, the battery-constrained na- (a) Time dist. of window focus. (b) Time dist. of CPU processing. ture of mobile devices limits the performance that we can de- Fig. 1: Mobile Web browser share study conducted by our industry liver for mobile Web browsing. As traditional general-purpose research partner on their employees’ devices [2]. Similar observa- techniques to improve performance and energy efficiency fall tions were reported by NVIDIA on Tegra-based mobile handsets [3,4]. short, we must employ domain-specific knowledge while still maintaining general-purpose flexibility. network limited. However, this trend is changing. With about In this paper, we first perform design-space exploration 10X improvement in round-trip time from 3G to LTE, network to identify appropriate general-purpose architectures that latency is no longer the only performance bottleneck [51]. uniquely fit the characteristics of a popular Web browsing Prior work has shown that over the past decade, network engine. Despite our best effort, we discover sources of energy technology advancements have managed to keep webpage inefficiency in these customized general-purpose architectures. transmission overhead almost stable, whereas the client-side To mitigate these inefficiencies, we propose, synthesize, and computational requirements have increased by as much as evaluate two new domain-specific specializations, called the 10X [78]. Fig. 1b confirms that the browser consumes a sig- Style Resolution Unit and the Browser Engine Cache. Our opti- nificant portion of the CPU time. Similarly, other research mizations boost energy efficiency and at the same time improve reports over 80% CPU usage for mobile browsing [51]. mobile Web browsing performance. As emerging mobile work- Providing high processing capability in mobile CPUs is loads increasingly rely more on Web browser technologies, the challenging due to the limited battery capacity. Recent studies type of optimizations we propose will become important in the suggested that the advancement in Lithium-ion battery den- future and are likely to have lasting widespread impact. sity has slowed down significantly, and that battery capacity will be mostly limited by its volume [6, 72]. Under such a 1. Introduction constraint, we are now faced with a challenge. On one hand, The proliferation of mobile devices and the fast penetration the energy budget for mobile devices is unlikely to drastically of new Web technologies (such as HTML5) has ushered in a increase in the short term. On the other hand, mobile proces- new era of mobile Web browsing. Mobile users now spend sors are becoming power hungry. Core designs have not only a significant amount of time on the Web browser every day. gone from in-order to out-of-order (e.g., ARM Cortex-A8 to A study conducted by our industry partner reports that the A15), but they have also gone multicore (e.g., Exynos 5410 in browser occupies 63% of the window focus time on their Samsung Galaxy S4 has eight cores). Experiments conducted employees’ mobile devices, as shown in Fig. 1a. As a general by Carroll et al. suggested that processor power of mobile trend, comScore shows that mobile users prefer Web browsers Web browsing doubled from 2010 to 2013 [38, 39]. over native applications for important application domains Our work aspires to bridge the widening gap between high- such as online e-commerce and electronic news feeds [1]. performance and energy-constrained mobile processor designs In the user-centric mobile Web browsing context, delivering for Web browsing. Domain-specific specializations have long high performance is critical to end users’ quality-of-service been known to be extremely energy efficient [57,71,76]. Re- experience. Studies in 2013 indicate that 71% of mobile Web cent proposals in data computation domains (such as H.264 en- users expect website performance on their mobile devices to coding [49] and convolution [67]) have begun showing that it be not worse than their desktop experience–up from 58% in is critical and feasible to balance the efficiency of application- 2009 [5]. Traditionally, Web browsing performance has been specific specializations with general-purpose programmability. 978-1-4799-4394-4/14/$31.00 c 2014 IEEE User JavaScript DOM DOM Actions Engine 17% Layout 16% Render Layout Render 21% 20% Data structures 11% CSS 10% DOM Style Tree Kernels Rules 19% Style 28% 24% GPU Style 35% Others Servers Dom Style Layout Render Painting Others Render Tree Fig. 3: Execution time break- Fig. 4: Energy consumption Browser Engine (our focus) down of the browser’s kernels. breakdown of the kernels. Fig. 2: Web browser overview. 2. Web Browser Background and Overview Sharing the same architecture design philosophy, we pro- We first describe the Web browser engine’s computation ker- pose the WebCore, a general-purpose core customized and nels and their communication patterns. Such understanding specialized for the mobile Web browsing workload. In compar- helps us design effective customizations and specializations, ison to prior work that either takes a fully software approach such as those described in the later sections. Nearly all main- on general-purpose processors [41, 62] or a fully hardware stream browser engines fit into our description. In addition, specialization approach [34], our design strikes a balance be- we show each kernel’s performance and energy breakdown to tween the two. On one hand, WebCore retains the flexibility show their importance and demonstrate our study’s coverage. and programmability of a general-purpose core. It naturally Browser engine kernels Fig. 2 shows the overall flow fits in the multicore SoC that is already common in today’s of execution within any typical Web browser. The engine mainstream mobile devices. On the other hand, it achieves has two core modules: the browser engine (e.g., WebKit for energy-efficiency improvement via modest hardware special- Chrome and Gecko for Firefox) and the JavaScript engine. izations that create closely coupled datapath and data storage. In this work, we focus on the internals of the Web browser We begin by examining existing general purpose designs engine. The JavaScript engine’s performance that involves the for the mobile Web browsing workload. Through exhaustive compiler, garbage collector, etc., is a separate issue beyond the design space exploration, we find that existing general pur- scope of this work. Please refer to the Related Work section pose designs bear inherent sources of energy-inefficiency. In (Sec. 8) for a more elaborate discussion about JavaScript. particular, instruction delivery and data feeding are two major The browser engine mainly consists of four kernels: Dom, bottlenecks. We show that customizing them by tuning key Style, Layout, and Render. The kernels, shown in boxes, pro- design parameters achieves better energy efficiency (Sec. 4). cess the webpage and prepare pixels for a GPU to paint. The Building on the customized general-purpose baseline, we figure also shows the important data structures that the kernels develop specialized hardware to further overcome the instruc- consume. The DOM tree, CSS style rules, and Render tree are tion delivery and data feeding bottlenecks (Sec. 5). We propose those important data structures, and they are heavily shared two new optimizations: the “Style Resolution Unit” (SRU) and across the kernels. The data structures are shown in circles a “Software-Managed Browser Engine Cache.” The SRU is with arrows indicating information flow between the kernels. a software-assisted hardware accelerator for the critical style- The Dom kernel is in charge of parsing the webpage con- resolution kernel within the Web browser engine. It exploits tents. Specifically, it constructs the DOM tree from the HTML fine-grained parallelism that aggregates enough computation files, and extracts the CSS style rules from the CSS files. Given to offset the instruction and data communication overhead. the DOM tree and CSS style rules, the Style kernel computes The proposed cache structure exploits the unique data locality the webpage’s style information and stores the results in the of the browser engine’s principal data structures. It is a small render tree. Each render tree node corresponds to a visible and fast memory that achieves a high hit rate for the important element in the webpage. Once the style information of each data structures, but with extremely low accessing energy. webpage element is calculated, the Layout kernel recursively Our results show that customizations alone on the exist- traverses the render tree to decide each visible element’s po- ing general-purpose mobile processor design lead to 22.2% sition information based on each element’s size and relative performance improvement and 18.6% energy saving. Our spe- positioning. The final < x;y > coordinates are stored back cialization techniques achieve an additional 9.2% performance into the render tree. Eventually, the Render kernel examines improvement and 22.2% overall energy saving; the acceler- the render tree to decide the z-ordering of each visible element ated portion itself achieves up to 10X speedup. Finally, we so that they can be displayed in the correct overlapping order. also show that our specialization incurs negligible area over- Performance Fig.
Details
-
File Typepdf
-
Upload Time-
-
Content LanguagesEnglish
-
Upload UserAnonymous/Not logged-in
-
File Pages12 Page
-
File Size-