Embassies: Radically Refactoring the Web Jon Howell, Bryan Parno, John R
Total Page:16
File Type:pdf, Size:1020Kb
Embassies: Radically Refactoring the Web Jon Howell, Bryan Parno, John R. Douceur, Microsoft Research Abstract of evolving complexity. On the Internet, application Web browsers ostensibly provide strong isolation for providers, or vendors, run server-side applications over the client-side components of web applications. Unfor- which they exercise total control, from the app down tunately, this isolation is weak in practice; as browsers to the network stack, firewall, and OS. Even when ven- add increasingly rich APIs to please developers, these dors are tenants of a shared datacenter, each tenant au- complex interfaces bloat the trusted computing base and tonomously controls its software stack down to the ma- erode cross-app isolation boundaries. chine code, and each tenant is accessible only via IP. We reenvision the web interface based on the notion The strong isolation among virtualized Infrastructure-as- of a pico-datacenter, the client-side version of a shared a-Service datacenter tenants derives not from physical server datacenter. Mutually untrusting vendors run their separation but from the execution interface’s simplicity. code on the user’s computer in low-level native code con- This paper extends the semantics of datacenter rela- tainers that communicate with the outside world only via tionships to the client’s web experience. Suspending dis- IP. Just as in the cloud datacenter, the simple semantics belief momentarily, suppose every client had ubiquitous makes isolation tractable, yet native code gives vendors high-performance Internet connectivity. In such a world, the freedom to run any software stack. Since the datacen- exploiting datacenter semantics is easy: The client is ter model is designed to be robust to malicious tenants, it merely a screencast (VNC) viewer; every app runs on is never dangerous for the user to click a link and invite its vendor’s servers and streams a video of its display to a possibly-hostile party onto the client. the client. The client bears only a few responsibilities, primarily around providing a trusted path, i.e., enabling 1 Introduction the user to select which vendor to interact with and pro- viding user input authenticity and privacy. A defining feature of the web application model is its We can restore reality by moving the vendors’ code ostensibly strong notion of isolation. On the desktop, a down to the client, with the client acting as a notional user use caution when installing apps, since if an app pico-datacenter. On the client, apps enjoy fast, reliable misbehaves, the consequences are unbounded. On the access to the display, but the semantics of isolation re- web, if the user clicks on a link and doesn’t like what she main identical to the server model: Each vendor has au- sees, she clicks the ‘close’ button, and web app isolation tonomous control over its software stack, and each ven- promises that the closed app has no lasting effect on the dor interacts with other vendors (remote and local) only user’s experience. through opt-in network protocols. Sadly, the promise of isolation is routinely broken, and The pico-datacenter abstraction offers an escape from so in practice, we caution users to avoid clicking on “dan- the battle between isolation and richness, by deconflating gerous links”. Isolation fails because the web’s API, re- the goals into two levels of interface. The client imple- sponsible for application isolation, has simultaneously ments the client execution interface (CEI), which is dedi- pursued application richness, accreting HTTP, MIME, cated to isolating applications and defines how a vendor’s HTML, DOM, CSS, JavaScript, JPG, PNG, Java, Flash, bag of bits is interpreted by the client. Different ven- Silverlight, SVG, Canvas, and more. This richness intro- dors may employ, inside their isolated containers, differ- duces so much complexity that any precise specification ent developer programming interfaces (DPIs). Today’s of the web API is virtually impossible. Yet we can’t hope web API is stuck in a painful battle because it conflates for correct application isolation until we can specify the these goals into a single interface [11]: The API is simul- API’s semantics. Thus, the current web API is a battle taneously a collection of rich, expressive DPI functions between isolation and richness, and isolation is losing. for app developers, and also a CEI that separates vendors. The same battle was fought—and lost—on the desk- The conflated result is a poor CEI that is neither simple top. The initially-simple conventional OS evolved into a nor well-defined. Indeed, this conflation explains why it rich, complex desktop API, an unmanageable disaster of took a decade to prevent text coloring from leaking pri- complexity. Is there hope? Or do isolation (via simple vate information [63], and why today’s web allows cross- specification) and richness inevitably conflict? site fetches of JPGs or JavaScript but not XML [67]. The There is, in fact, a context in which mutually- semantics of web app isolation wind through a teetering untrusting participants interact in near-perfect auton- stack of rich software layers. omy, maintaining arbitrarily strong isolation in the face We deconflate the CEI and DPI by following the pico- and a simple, well-specified CEI. While it’s difficult to datacenter analogy, arriving at a concrete client archi- prove such a radical change unequivocally superior, this tecture called Embassies.1 We pare the web CEI down paper aims to demonstrate that the goal is both realistic to isolated native code picoprocesses [25], IP for com- and valuable. It makes these contributions: munication beyond the process, and minimal low-level • With the pico-datacenter model, we exploit the UI primitives to support the new display responsibilities lessons of autonomous datacenter tenancy in the identified above. client environment (§3), and argue that the collat- The rich DPI, on the other hand, becomes part of the eral effects of the shift are mostly harmless (§8). web app itself, giving developers unparalleled freedom. • We show a small, well-defined CEI specification This proposal doesn’t require Alice, a web app devel- (§3) that admits small implementations (§6.1) and oper, to start coding in assembly. When she writes a hence suggests that correct isolation is achievable. geotagging site, she codes against the familiar HTML, • With a variety of rich DPI implementations running CSS, and JavaScript DPI. But, per the datacenter model, against our CEI, we demonstrate that application that DPI is implemented by the WebKit library [62] that richness is not compromised but enhanced (§6.2). Alice’s client code links against, just as her server-side • We show how to replace the cross-app interactions code links against PHP. Because Alice chooses the li- baked into today’s browser with bilateral protocols brary, browser incompatibilities disappear. (§4), maintaining familiar functionality while obey- Suppose a buffer overflow is discovered in libpng [50], ing pico-datacenter semantics. a library Alice’s DPI uses to draw images. Because Al- • We implement this refactoring (§5) and show that it ice links WebKit by reference, as soon as the WebKit can achieve plausible performance (§6.3, 6.4). developers patch the bug, her client code automatically inherits the fix. Just like when Alice fixes a bug in libphp 2 Trends in Prior Work on her server, the user needn’t care about this update. Later, Alice adds a comment forum to her application. Embassies is not the first attempt to improve web app Rendering user-generated HTML has always been risky, isolation and richness, and indeed prior proposals im- often leading to XSS vulnerabilities [29]. But Alice prove on one or both of these axes. However, they do not hears about WebGear, a fork of WebKit, that enhances provide true datacenter-style isolation — they incorpo- HTML with sandboxes that solve this problem robustly. rate, for reasons of compatibility, part or all of the aggre- DPI libraries like WebGear can innovate just as browser gate web API inside their trusted computing base (TCB). vendors do today, but without imposing client browser 2.1 Better Browsers for the Same API upgrades; Alice simply changes her app’s linkage. Chrome and IE8+ both shift from a single process Ultimately, independent development of alternative model to one that encapsulates each tab in a separate host DPIs outpace WebGear, and Alice graduates to a .NET OS process. This increases robustness to benign failures, or GTK+ stack that is more powerful, or more secure, or but these modifications don’t change the web interface— more elegant. Alice chooses a feature-full new frame- multiple apps still occupy one tab, and complex cross- work, while Bob sticks with WebBSD, a spartan frame- app interactions still occur across tabs—hence isolation work renowned for robustness, for his encrypted chat among web apps is still weak. OP’s browser refactor- app. Taking the complex, rich semantics out of the ing [20] is also constrained by the web API’s complex CEI gives developers more freedom, while making cross- semantics. vendor isolation—the primary guarantee established by Given this constraint, IBOS pushes the idea of refac- the client—more robust than today’s web API. toring the browser quite far [55]. It realizes the idea of Via the pico-datacenter model, we develop a CEI with: sites as first-class OS principals [26, 57], and container- • a minimal native execution environment, izes renderers to improve isolation. IBOS must still in- • a minimal notion of application identity, clude HTTP to define hscheme; host; porti web prin- • a minimal primitive for persistent state, cipals, and must use deep-packet inspection on HTML • an IP interface for all external app communication, and MIME to partially enforce the Same-Origin Policy (SOP) [67].