
Isolating Web Programs in Modern Browser Architectures

Charles Reis, Steven D. Gribble
University of Washington / Google, Inc. ∗
{creis, gribble}@cs.washington.edu

Abstract

Many of today’s web sites contain substantial amounts of client-side code, and consequently, they act more like programs than simple documents. This creates robustness and performance challenges for web browsers. To give users a robust and responsive platform, the browser must identify program boundaries and provide isolation between them.

We provide three contributions in this paper. First, we present abstractions of web programs and program instances, and we show that these abstractions clarify how browser components interact and how appropriate program boundaries can be identified. Second, we identify backwards compatibility tradeoffs that constrain how web content can be divided into programs without disrupting existing web sites. Third, we present a multi-process browser architecture that isolates these web program instances from each other, improving fault tolerance, resource management, and performance. We discuss how this architecture is implemented in Google Chrome, and we provide a quantitative performance evaluation examining its benefits and costs.

Categories and Subject Descriptors D.2.11 [Software Engineering]: Software Architectures - Domain-specific architectures; D.4.5 [Operating Systems]: Reliability - Fault tolerance; H.4.3 [Information Systems Applications]: Communications Applications - Information browsers

General Terms Design, Experimentation, Performance, Reliability

Keywords Web browser architecture, isolation, multi-process browser, reliability, robustness

∗ This work was partially performed at Google while the first author was a graduate student intern and the second was on sabbatical.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
EuroSys’09, April 1–3, 2009, Nuremberg, Germany.
Copyright © 2009 ACM 978-1-60558-482-9/09/04. $5.00

1. Introduction

Today’s publishers are deploying web pages that act more like programs than simple documents, and these programs are growing in complexity and demand for resources. Current web browser architectures, on the other hand, are still designed primarily for rendering basic pages, in that they do not provide sufficient isolation between concurrently executing programs. As a result, competing programs within the browser encounter many types of interference that affect their fault tolerance, memory management, and performance.

These reliability problems are familiar from early PC operating systems. OSes like MS-DOS and MacOS only supported a single address space, allowing programs to interfere with each other. Modern operating systems isolate programs from each other with processes to prevent these problems.

Surprisingly, web browsers do not yet have a program abstraction that can be easily isolated. Neither pages nor origins are appropriate isolation boundaries, because some groups of pages, even those from different origins, can interact with each other within the browser. To prevent interference problems in the browser, we face three key challenges: (1) finding a way for browsers to identify program boundaries, (2) addressing the complications that arise when trying to preserve compatibility with existing web content, and (3) rearchitecting the browser to isolate separate programs.

In this paper, we show that web content can in fact be divided into separate web programs, and we show that separate instances of these programs can exist within the browser. In particular, we consider the relationships between web objects and the browser components that interact with them, and we define web program instances based on the access control rules and communication channels between pages in the browser. Our goal is to use these abstractions to improve the browser’s robustness and performance by isolating web program instances. We find they are also useful for reasoning about the browser’s execution and trust models, though we leave security enhancements as a goal for future work.

We show that these divisions between web program instances can be made without losing compatibility with existing content or requiring guidance from the user, although doing so requires compromises. We define a web program as pages from a given site (i.e., a collection of origins sharing the same domain name and protocol), and a web program instance as a site instance (i.e., pages from a given site that share a communication channel in the browser). Compatibility with existing web content limits how strongly site instances can be isolated, but we find that isolating them can still effectively address many interference problems.

To better prevent interference between web program instances, we present a browser architecture that uses OS processes as an isolation mechanism. The architecture dedicates one process to each program instance and the browser components required to support it, while the remaining browser components run safely in a separate process. These web program processes leverage support from the underlying OS to reduce the impact of failures, isolate memory management, and improve performance. As a result, the browser becomes a more robust platform for running active code from the web. Web program processes can also be sandboxed to help enforce some aspects of the browser’s trust model, as we discuss in a related technical report [Barth 2008].

Google has implemented the architecture described above in the open source Chromium web browser. The Google Chrome browser is based on the Chromium source code; we will refer to both browsers as Chromium in this paper. While at Google, the first author of this paper helped add support for site instance isolation to Chromium’s multi-process architecture, allowing each site instance to run in a separate process. While the current implementation does not always provide strict isolation of pages from different sites, we argue that it achieves most of the potential benefits and that strict isolation is feasible.

We evaluate the improvements this multi-process architecture provides for Chromium’s robustness and performance. We find that it provides a more robust and responsive platform for web programs than monolithic browsers, with acceptable overhead and compatibility with existing web content.

The rest of this paper is organized as follows. We present ideal program abstractions and show how they can be applied to real web content and browser components in Section 2, along with how these abstractions are limited by backwards compatibility. In Section 3, we present a multi-process architecture that can isolate these abstractions. We also describe Chromium’s implementation of the architecture and how it can address concrete robustness problems. We evaluate the benefits and costs of the architecture in Section 4. Finally, we discuss related work in Section 5 and conclude in Section 6.

2. Finding Programs in Browsers

Many robustness problems could be solved if independent web-based programs were effectively isolated by the browser’s architecture. Crashes and leaks could be contained, and different programs in the browser could accomplish work concurrently. This isolation requires strong boundaries between programs, but we currently lack a precise program abstraction for browsers. As a result, most browsers have monolithic architectures in which all content and browser components are combined in one address space and process (see Figure 1).

[Figure 1 diagram; component labels: DOM Trees, DOM Bindings, HTML Renderer, JavaScript Engine, Network, Storage, User Interface, Plug-ins]

Figure 1. All web content and browser components share fate in a single process in monolithic browser architectures, in contrast to the multi-process architecture in Figure 5.

Unfortunately, it is challenging to find an appropriate way to define program boundaries in today’s browsers. Consider a straw man approach that treats each web page as a separate program, isolating pages from each other. This approach breaks many real web programs that have multiple communicating pages. For example, mapping sites often allow a parent page to script a separate map page displayed in a frame. Similarly, calendar pop-up windows are frequently used to fill in dates in web forms. Isolating every page would break these interactions.

Origins are another inadequate straw man. Two copies of the same page are often independent and can be isolated from each other, while two pages from partially differing origins can sometimes access each other. Thus, isolating pages based solely on origin would be too coarse in some situations and too fine grained in others.

In this section, we show that it is possible for a browser to identify program boundaries in a way that lets it isolate program instances from each other, while preserving backwards compatibility. We provide ideal abstractions to capture the intuition of these boundaries, and we provide concrete definitions that show how programs can be defined with today’s content, as summarized in Figure 2. We then discuss the complications that arise due to the browser’s execution and trust models, and what compromises are required to maintain compatibility with existing content.

2.1 Ideal Abstractions

We first define two idealized abstractions that represent isolated groups of web objects in the browser.
[...] define web programs based on which pages may be able to access each other. Taking this approach, pages permitted to interact should be grouped together into web programs, while pages that are [...]

Web Program: A set of related web pages and their sub-resources that provide a common service.
Web Program Instance: Copies of pages from a web program that are tightly coupled within the browser.
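
As a rough illustration of the concrete definitions given in the introduction (a site as a collection of origins sharing the same domain name and protocol, and a site instance as pages from a given site that share a communication channel), the following Python sketch groups pages accordingly. It is not Chromium's implementation. The names `site_key`, `Page`, `channel_group`, and `site_instances` are hypothetical, and approximating the domain name by the last two host labels is a simplifying assumption that mishandles suffixes such as co.uk and hosts given as IP addresses.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple
from urllib.parse import urlsplit

SiteKey = Tuple[str, str]


def site_key(url: str) -> SiteKey:
    """Return (protocol, domain name) for a URL, ignoring subdomain and port.

    Assumption: the domain name is approximated by the last two labels of
    the host. A real browser would need a proper suffix list instead.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    labels = host.split(".")
    domain = ".".join(labels[-2:]) if len(labels) >= 2 else host
    return (parts.scheme, domain)


class Page(NamedTuple):
    url: str
    # Hypothetical id for a set of windows and frames that can reach
    # each other, standing in for a shared communication channel.
    channel_group: int


def site_instances(pages: Iterable[Page]) -> Dict[Tuple[int, SiteKey], List[str]]:
    """Group page URLs by (channel group, site).

    Pages land in the same bucket only if they are from the same site
    and belong to the same channel group.
    """
    buckets: Dict[Tuple[int, SiteKey], List[str]] = defaultdict(list)
    for page in pages:
        buckets[(page.channel_group, site_key(page.url))].append(page.url)
    return dict(buckets)


if __name__ == "__main__":
    demo = [
        Page("https://maps.example.com/", 1),
        Page("https://www.example.com/frame", 1),
        Page("https://www.example.com/", 2),
        Page("https://other.test/", 1),
    ]
    for key, urls in site_instances(demo).items():
        print(key, urls)
```

Under these assumptions, the two example.com pages in channel group 1 share one bucket, while the independent copy in group 2 and the page from a different site each get their own. This mirrors the observation above that page-level isolation is too fine and origin-level isolation is sometimes too coarse.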