Securing Distributed Systems with Information Flow Control
Nickolai Zeldovich, Silas Boyd-Wickizer, and David Mazières
Stanford University
ABSTRACT

Recent operating systems [12, 21, 26] have shown that decentralized information flow control (DIFC) can secure applications built from mostly untrusted code. This paper extends DIFC to the network. We present DStar, a system that enforces the security requirements of mutually distrustful components through cryptography on the network and local OS protection mechanisms on each host. DStar does not require any fully-trusted processes or machines, and is carefully constructed to avoid covert channels inherent in its interface. We use DStar to build a three-tiered web server that mitigates the effects of untrustworthy applications and compromised machines.

1 INTRODUCTION

Software systems are plagued by security vulnerabilities in poorly-written application code. A particularly acute example is web applications, which are constructed for a specific web site and cannot benefit from the same level of peer review as more widely distributed software. Worse yet, a sizeable fraction of web application code actually comes from third parties, in the form of libraries and utilities for tasks such as image conversion, data indexing, or manipulating PDF, XML, and other complex file formats. Indeed, faced with these kinds of tasks, most people start off searching for existing code. Sites such as FreshMeat and SourceForge make it easy to find and download a wide variety of freely available code, which unfortunately comes with no guarantee of correctness. Integrating third-party code is often the job of junior engineers, making it easy for useful but buggy third-party code to slip into production systems. It is no surprise we constantly hear of catastrophic security breaches that, for example, permit a rudimentary attacker to download 100,000 PayMaxx users' tax forms [20].

If we cannot improve the quality of software, an alternative is to design systems that remain secure despite untrustworthy code. Recent operating systems such as Asbestos [21], HiStar [26], and Flume [12] have shown this can be achieved through decentralized information flow control (DIFC). Consider again the PayMaxx example, which runs untrustworthy application code for each user to generate a tax form. In a DIFC operating system, we could sandwich each active user's tax form generation process between the payroll database and an HTTPS server, using the DIFC mechanism to prevent the application from communicating with any other component. Figure 1 illustrates this configuration. Suppose the HTTPS server forwards the username and password supplied by the browser to the database for verification, the database only allows the application to access records of the verified user, and the HTTPS server only sends the application's output to the same web browser that supplied the password. Then even a malicious application cannot inappropriately disclose users' data. Effectively we consider the application untrusted, mitigating the consequences if it turns out to be untrustworthy.

Figure 1: High-level structure of a web application. The shaded application code is typically the least trustworthy of all components, and should be treated as untrusted code to the extent possible. Dashed lines indicate how the web server can be decomposed into three kinds of machines for scalability, namely front-end, application, and data servers.

While DIFC OSes allow us to tolerate untrustworthy applications, they can do so only if all processes are running on the same machine. Production web sites, on the other hand, must scale to many servers. A site like PayMaxx might use different pools of machines for front-end HTTPS servers, application servers, and back-end database servers. To achieve Figure 1's configuration when each component runs on a separate machine, we also need a network protocol that supports DIFC.

This paper presents DStar, a protocol and framework that leverages OS-level protection on individual machines to provide DIFC in a distributed system. At a high level, DStar controls which messages sent to a machine can affect which messages sent from the machine, thereby letting us plumb together secure systems out of untrusted components. DStar was designed to meet the following goals:

Decentralized trust. While an operating system has the luxury of an entirely trusted kernel, a distributed system might not have any fully-trusted code on any machine. Web applications rely on fully-trusted certificate authorities such as Verisign to determine what servers to trust: for example, Google's web applications might trust any server whose certificate from Verisign gives it a name ending in .google.com. However, we believe that requiring such centralized trust by design needlessly stifles innovation and reduces security. Instead, DStar strives to provide a more general, decentralized mechanism that lets applications either use authorities such as Verisign by convention or manage their trust in other ways. Moreover, the lack of a single trusted authority simplifies integration of systems in different administrative domains.

Egalitarian mechanisms. One of the features that differentiates DIFC work from previous information flow control operating systems is that DIFC mechanisms are specifically designed for use by applications. Any code, no matter how unprivileged, can still use DIFC to protect its data or further subdivide permissions. This property greatly facilitates interaction between mutually distrustful components, one of the keys to preserving security in the face of untrustworthy code. By contrast, military systems, such as [22], have also long controlled information flow, but using mechanisms only available to privileged administrators, not application programmers. Similarly, administrators can already control information flow in the network using firewalls and VLANs or VPNs. DStar's goal is to offer such control to applications and to provide much finer granularity, so that, in the PayMaxx example, every user's tax information can be individually tracked and protected as it flows through the network.

No inherent covert channels. Information flow control systems inevitably allow some communication in violation of policy through covert channels. However, different applications have different sensitivities to covert channel bandwidth. PayMaxx could easily tolerate a 1,000-bit/second covert channel: even if a single instance of the untrusted application could read the entire database, it would need almost a day to leak 100,000 user records of 100 bytes each. (Exploiting covert channels often involves monopolizing resources such as the CPU or network, which over such a long period could easily be detected.) By contrast, military systems consider even 100 bits/second to be high bandwidth. To ensure that DStar can adapt to different security requirements, our goal is to avoid any covert channels inherent in its interface. This allows any covert channels to be mitigated in a particular deployment without breaking backwards compatibility. Avoiding inherent covert channels is particularly tricky with decentralized trust, as even seemingly innocuous actions such as querying a server for authorization or allocating memory to hold a certificate can inadvertently leak information.

DStar runs on a number of operating systems, including HiStar, Flume, and Linux.

[...] machine, our web server ensures that the SSL certificate private key is accessible only to a small part of the SSL library, and can only be used for legitimate SSL negotiation. It also protects authentication tokens, such as passwords, so that even the bulk of the authentication code cannot disclose them. In a distributed setting, DStar allows the web server to ensure that private user data returned from a data server can only be written to an SSL connection over which the appropriate user has authenticated himself. DStar's decentralized model also ensures that even web server machines are minimally trusted: if any one machine is compromised, it can only subvert the security of users that use or had recently used it.

2 INFORMATION FLOW CONTROL

DStar enforces DIFC with labels. This section first describes how these labels, which are similar to those of DIFC OSes, can help secure a web application like PayMaxx. We then detail DStar's specific label mechanism.

2.1 Labels

DStar's job is to control how information flows between processes on different machines; in other words, to ensure that only processes that should communicate can do so. Communication permissions are specified with labels. Each process has a label; whether and in which direction two processes may communicate is a function of their labels. This function is actually a partial order, which we write $\sqsubseteq$ (pronounced "can flow to").

Roughly speaking, given processes S and R labeled $L_S$ and $L_R$, respectively, a message can flow from S to R only if $L_S \sqsubseteq L_R$. Bidirectional communication is permitted only if, in addition, $L_R \sqsubseteq L_S$. Labels can also be incomparable; if $L_S \not\sqsubseteq L_R$ and $L_R \not\sqsubseteq L_S$, then S and R may not communicate in either direction. Such label mechanisms are often referred to as "no read up, no write down," where "up" is the right hand side of $\sqsubseteq$. Given that labels can be incomparable, however, a more accurate description might be, "only read down, only write up."

Because a distributed system cannot simultaneously observe the labels of processes on different machines, DStar also labels messages. When S sends a message M to R, S specifies a label $L_M$ for the message. The property enforced by DStar is that $L_S \sqsubseteq L_M \sqsubseteq L_R$. Intuitively, the message label ensures that untrusted code cannot inap-
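
As a minimal sketch of the delivery rule stated in Section 2.1, the Python below models a label as a set of secrecy categories and treats $\sqsubseteq$ as the subset relation. The set model, the category names, and the function names `can_flow_to` and `may_deliver` are assumptions made for illustration only; the text requires only that labels form a partial order, and DStar's actual label format may differ.

```python
from typing import FrozenSet

# Assumption: a label is a set of secrecy categories, and "can flow to"
# is the subset relation. Any partial order would fit the rule in the text.
Label = FrozenSet[str]


def can_flow_to(a: Label, b: Label) -> bool:
    """Return True if a can flow to b: b carries every category a carries."""
    return a <= b


def may_deliver(l_s: Label, l_m: Label, l_r: Label) -> bool:
    """Check the message property L_S can-flow-to L_M can-flow-to L_R."""
    return can_flow_to(l_s, l_m) and can_flow_to(l_m, l_r)


# Hypothetical labels for illustration.
PUBLIC: Label = frozenset()
ALICE: Label = frozenset({"alice_tax"})
BOB: Label = frozenset({"bob_tax"})

if __name__ == "__main__":
    # Unlabeled data may flow to a more restricted process.
    print(can_flow_to(PUBLIC, ALICE))
    # Data tainted with Alice's category may not flow to a public process.
    print(can_flow_to(ALICE, PUBLIC))
    # Incomparable labels: neither direction is allowed.
    print(can_flow_to(ALICE, BOB), can_flow_to(BOB, ALICE))
    # A public sender may send a message labeled for Alice's process.
    print(may_deliver(PUBLIC, ALICE, ALICE))
    # A sender tainted with Alice's data may not send a public message.
    print(may_deliver(ALICE, PUBLIC, PUBLIC))
```

In this toy model the rule splits into two comparisons, one involving only the sender's label and the message label, the other involving only the message label and the receiver's label. That split is one plausible reading of why the message carries its own label when no single host can observe both process labels at once.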