
Analyzing Capsicum for Usability and Performance

Benjamin Farley December 23, 2010

Abstract

In this paper I investigate Capsicum, an extension to UNIX that introduces a new security model on top of existing UNIX architecture. This model consists of several new security primitives and system calls that replace existing UNIX functionality. I focus on two aspects of Capsicum: performance and usability. For performance, I compare the performance of Capsicum system calls to corresponding UNIX calls and analyze these differences. I also implement a small file-hosting server that makes use of Capsicum's sandboxing library, in order to determine the feasibility of writing new applications using Capsicum and modifying existing applications to use Capsicum.

1 Introduction

Computer security is becoming an increasingly important problem in today's world. Viruses, worms, trojans and every other type of malware imaginable continue to propagate, despite the industry's best efforts to eradicate them. Furthermore, with recent large-scale attacks such as Stuxnet [2] and the 2008 denial of service attack on Georgia, cyber warfare is becoming a threat we must take into account when we design systems. Unfortunately, as virtually any user or developer of applications knows, security is also a very difficult problem to solve. Cyber attacks target software applications, or hardware, which are made and designed by programmers; thus, the security of an application is only as good as the developer can make it. For simple applications and programmers experienced in security, making a program secure is usually a reasonable task. However, the difficulty of this problem increases very quickly as software evolves and grows. As an application becomes bigger and more complex, it becomes increasingly difficult to understand it and determine how it will behave in a given situation. For especially large systems such as operating system kernels, which can reach millions of lines of code and are developed and supported by a large group of people, many of whom are not well-versed in security, this problem becomes nearly impossible.

For these reasons, it is important that the security primitives provided by a system be able to take into account today's vastly complex software. In this work, I analyze a new system, Capsicum [6], with that goal in mind. In particular, I will focus on two aspects of the system: performance and usability. For performance, I measure how long it takes various system calls to complete when used in capability mode versus in normal mode and analyze the differences I find. I also compare the performance of Capsicum calls to corresponding UNIX calls. Finally, I create a basic server application and instrument it using Capsicum in order to understand how usable it is from an outside perspective and what changes need to be made to a normal application to let it make use of Capsicum functionality.

The remainder of this paper is organized as follows. Section 2 gives an account of related work. Section 3 gives a brief overview of Capsicum's security model and the functionality it provides. In Section 4 I discuss the implementation of a real world application using Capsicum. Results are discussed in Section 5. Section 6 talks about some of the limitations I found in Capsicum and places where it could be improved. Finally, Section 7 gives my conclusions.

2 Related work

Object capability systems have been used in previous systems. For example, Google Caja [5] uses an object-capability model to attempt to secure JavaScript and online web browsing. It does so by creating and passing around objects which grant access to resources like web services. It also includes functionality to rewrite existing JavaScript programs to use capabilities. In [4], the authors analyze the security aspects of this capability-oriented security model. Wedge [1] uses a security model similar to the capabilities that Capsicum uses, but also includes the improved functionality of being able to control access to parts of memory on the heap. Harris [3] discusses instrumenting existing programs for use in Flume, an OS based on the idea of decentralized information flow control (DIFC). In this system, security is achieved by controlling how information can flow between entities in a system. Harris discusses the problem of instrumenting complex, large programs with a small set of small security primitives in order to achieve a given high-level security policy.

3 Capsicum

Capsicum is a UNIX extension that provides a new security model on top of existing UNIX architecture. The basis for this model is the idea of a capability. A capability is an unforgeable token of authority which can be passed around between processes and is used to access resources. A capability can be thought of as an extension, or a wrapper, for basic UNIX file descriptors; it provides the same functionality they do and is used in much the same way, but also provides additional security and a fine-grained measure of control over the resource. For example, whereas traditionally one opens a file with simple read or write privileges, Capsicum provides over sixty options for specifying precisely how the resource will be used (including things like read, write, seek, fstat, etc).

One of the reasons Capsicum is relevant to this work is that capabilities provide an intuitive, powerful method of implementing security. In a capability system, holding a capability is sufficient to allow access to a resource. What this means is that once a process holds a capability for a given resource, the system does not need to do any additional security checks to make sure that the process can access that resource; the fact that the process has the capability means that it can access that resource. Conversely, if a process cannot access a resource it will not even know about the existence of said resource.

This is a useful property for several reasons. In terms of performance, it means that the only time a process needs to take the time to do security checks is when a resource is first acquired. Thus, if any time-consuming checks do need to be done, they will usually be a one-time cost. Once a process has obtained the resources it needs, they should behave the same way as normal file descriptors. In terms of programmer usability, this means that, for most intents and purposes, a capability can simply be used as if it were a file descriptor. It will be acquired using Capsicum-specific syntax, but after that the developer and the application interact with it the same way they would with a normal file descriptor.

The other main primitive Capsicum provides is capability mode. Capability mode is denoted by a simple process credential flag, which processes can set through a new system call, cap_enter. Once in capability mode, a process has limited access to global namespaces such as the file system or the PID namespace. This means it cannot access files and has limited ability to communicate with other processes. However, a process inherits capabilities that it held before entering capability mode, as well as ones its parent process held.

Finally, the developers of Capsicum provided their own library, libcapsicum, that uses the above to provide higher level functionality, with a focus on sandboxing. This library was what I used to instrument my example program, and as such will be the focus of my evaluation.

4 Implementation

To evaluate Capsicum on the above criteria, I wrote a basic file-hosting server. This server lets clients store and access files in a secure manner. Specifically, the security policy I chose to implement was that clients could access only their own files. They could not access the files of other clients, and could not see any of the filesystem except their own directory. In this model, I make the assumption that workers are untrusted processes. Thus, we can assume that a worker thread may be compromised, but if it is, the security policy of the system still must not be violated.

The implementation of this server in a normal setting is straightforward, and works much as one would expect. The flow of events is summarized in Figure 1. The server accepts incoming connections from clients and spawns a worker thread to deal with each request it receives. After the initial request, most interaction occurs between the client and the worker. The only aspect of my implementation that may be slightly surprising is that the server spawns threads upon receiving requests, rather than using a thread pool. Since thread pools are often a better solution for client/server needs, it seems that it would make more sense to use one in my implementation. Unfortunately, it turns out that, due to limitations in Capsicum, a thread pool does not work well for this particular scenario. This will be discussed further in Section 6.

From here, I used libcapsicum to instrument this server to use Capsicum's security functionality. In doing this, I hoped to understand what changes need to be made to a normal program to get it to work with Capsicum, and to analyze how usable Capsicum is for developers from an outside perspective. I also wanted to determine what kind of performance difference a real world application would experience when using Capsicum. Figure 2 shows the flow of the modified server. As the diagrams show, the instrumented application runs much the same way as the original. The main difference is that with Capsicum, the worker must do an extra call to obtain the correct capabilities from the server before it can begin to talk to the client. My findings while making these changes will be discussed in Section 5.

Figure 1: The flow of events in the normal server

Figure 2: The flow of events in the Capsicum server

5 Results

We can now turn to the results obtained from the above implementation.

Performance. Figure 3 shows the performance of system calls in Capsicum compared to normal UNIX. There are some stark differences here, specifically in the calls to open and create files. These calls take around two to three times longer in Capsicum than in normal mode. This is a significant difference, but an understandable one. As mentioned previously, with capabilities most security checks must be done upon acquiring a resource. The system must verify that the process is able to access the resource it is requesting. However, once the resource is acquired, the process can treat the capability the same way it would a normal file descriptor, and can expect it to act the same way.
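The distinction drawn above, between the cost of acquiring a resource and the cost of using one already held, can be explored with a simple timing loop. The following is a minimal sketch, not the harness used for Figure 3: it assumes an ordinary POSIX system and plain file descriptors (running the same loop in capability mode would require a Capsicum-enabled system), and the names time_call, benchmark and run_demo are hypothetical.

```python
import os
import tempfile
import time


def time_call(fn, repeats=1000):
    """Return the mean wall-clock time of fn() in microseconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    elapsed = time.perf_counter() - start
    return elapsed / repeats * 1e6


def benchmark(path):
    """Time acquiring a descriptor versus using one that is already held."""
    def open_close():
        # Acquisition path: access checks happen here.
        fd = os.open(path, os.O_RDONLY)
        os.close(fd)

    fd = os.open(path, os.O_RDONLY)
    try:
        def read_block():
            # Use path: read 64 bytes at offset 0 from the held descriptor.
            os.pread(fd, 64, 0)

        return {
            "open+close": time_call(open_close),
            "read": time_call(read_block),
        }
    finally:
        os.close(fd)


def run_demo():
    """Create a scratch file, benchmark it, and clean up."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * 4096)
        path = tmp.name
    try:
        return benchmark(path)
    finally:
        os.unlink(path)


if __name__ == "__main__":
    for name, micros in run_demo().items():
        print(f"{name}: {micros:.2f} us per call")
```

Averaging over many repetitions smooths out scheduler noise; the absolute numbers depend on the machine and are not comparable to those reported in this paper.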

Figure 3: Performance of various system calls in normal mode compared to the same calls in capability mode.

We can see this in the bars for reading and writing a file. Here, there is virtually no difference between Capsicum and normal calls. Again, now that the process has acquired the resource, there are no extraneous security checks. The process possesses the capability; therefore, it can access the resource. This is one of the nice properties of capabilities: once they are acquired, they enforce security with minimal additional overhead. The final two bars in Figure 3 show Capsicum's performance on two basic calls, cap_new and cap_enter, which create a new capability out of a file descriptor and enter capability mode, respectively. Both turn out to be quick calls at around eight to nine microseconds each, which is a reasonable amount of overhead for such simple calls. The cost of entering capability mode is especially reasonable, seeing as it can only be called once per process; a ten microsecond overhead once per program is virtually unnoticeable.

In Figure 4 we can see the performance of both systems when spawning new processes. For UNIX, I measured the time taken for a simple fork() call. For Capsicum, I measured the performance of the lch_start() call. This is part of libcapsicum, and is simply a way of setting up a sandboxed environment for a new process to run in. Here we once again see a fairly significant performance hit, an increase of around 50% for Capsicum. Again, this is fairly intuitive. Capsicum is setting up an entire sandboxed environment, limiting its access to only certain resources, and spawning a new process in that sandbox, so it makes sense that it takes longer to do this than to simply fork a new process.
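When timing process creation, the result depends on where the clock is stopped: when the spawning call returns in the parent, or when the new process actually starts running. The sketch below records both intervals for a plain fork(). It is an illustration under stated assumptions rather than the measurement code behind Figure 4: it assumes a POSIX system whose monotonic clock is shared across processes, it does not use lch_start() (which needs libcapsicum), and time_fork is a hypothetical name.

```python
import os
import struct
import time


def time_fork():
    """Return (parent_return_us, child_start_us) for one fork().

    parent_return_us: time until fork() returns in the parent.
    child_start_us:   time until the child reports that it is running.
    """
    read_end, write_end = os.pipe()
    start = time.monotonic()
    pid = os.fork()
    if pid == 0:
        # Child: report the moment it started running, then exit at once.
        os.close(read_end)
        os.write(write_end, struct.pack("d", time.monotonic()))
        os._exit(0)
    returned = time.monotonic()
    os.close(write_end)
    # The child's 8-byte timestamp arrives through the pipe.
    (child_started,) = struct.unpack("d", os.read(read_end, 8))
    os.close(read_end)
    os.waitpid(pid, 0)
    return (returned - start) * 1e6, (child_started - start) * 1e6


if __name__ == "__main__":
    parent_us, child_us = time_fork()
    print(f"fork() returned after {parent_us:.0f} us")
    print(f"child running after   {child_us:.0f} us")
```

The two numbers can differ noticeably, so a benchmark of a spawning call should state which of them it reports.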

Figure 4: Spawning a new process. This compares the UNIX fork() call to Capsicum's lch_start(), used to spawn a new process in a sandbox.

Figure 5: Retrieving a single file from the Capsicum server and the normal server.

Finally, let us turn to the performance of the real world server application. Figure 5 shows the time taken for a client to retrieve a single file from the Capsicum server versus how long it takes to retrieve the same file from the normal server. Here we see a huge difference, around three milliseconds for the normal server compared to almost nine milliseconds for the Capsicum server. Based on the above measurements, this seems excessive. We saw some overhead for some calls, but nothing that would cause nearly six milliseconds of overhead.

To figure out this anomaly, it is useful to break the flow of the server down into its basic parts and analyze them individually. The flow for the server and worker can be found in Table 1. As we can see, even with the extra calls Capsicum makes, the time it takes a Capsicum worker to serve the file is barely more than a millisecond longer than it takes the normal server. Clearly, it is not the worker itself that is causing this performance hit.

                            Normal server    Capsicum server
server: open                           45                 33
server: create fdlist                 n/a                 85
server: spawn worker                  398                618
worker: get capabilities              n/a                862
worker: ACK client                    182                171
worker: read filename                 324                305
worker: open file                      85                 73
worker: read from file                 21                 18
worker: send file                     158                159
total                                1213               2324

Table 1: Server and worker flow. This table shows the time taken (in microseconds) for all actions taken by the server and worker at each step of the process in retrieving a file.

The issue becomes clearer when we look at the flow of the client program, shown in Table 2. Here we see that, of those nine milliseconds, almost eight are spent waiting for the worker to begin communication with the client. This implies that the measurement we saw in Figure 4 for time taken to start a sandbox is misleading. It turns out that that measurement is simply the time it takes for control to return to the host process, and it takes much longer for the sandboxed process to actually begin running.

                            Normal server    Capsicum server
send request                          132                129
wait ACK from worker                 1429               7854
send filename                         134                134
receive file                          722                688

Table 2: Client flow. This table shows the time taken (in microseconds) at each step for the client in receiving a file.

Though these results may seem disappointing, they are actually not all bad news. Most of the calls that suffer significant time delays are calls that are only done a few times in a program, such as creating and opening files. The problem with overhead during sandbox creation is more serious, but this is also a call that will usually only be done a few times per program. Many calls that will be used more often, like read and write, suffer no overhead when used in capability mode.

Usability. A large part of my goal in implementing a server using Capsicum was to see how usable Capsicum is in the eyes of a programmer who has never used the system before. Given a program written for a normal system, how much do I have to change it to make it work with Capsicum? I came to several conclusions in this regard. Firstly, there is a significant amount of code that does not need to be touched. The entire client portion of the application remained unchanged; the same client could interact easily with either server. Furthermore, for the most part, the worker was also unchanged. As we saw in Figure 2, the main difference between the normal worker and the Capsicum worker was that the Capsicum worker had to retrieve capabilities from the server before communicating with the client. This required a small change at the beginning of the worker, around five lines to get the capabilities and initialize the correct pointers. After that, the worker code for the two systems was identical. Similarly, the only difference between the servers was the way in which they spawned a new process: fork() for the normal server and lch_start() for the Capsicum server. To use this call, the Capsicum version also needed to add around ten lines of code to initialize sandbox variables; other than that, the servers were identical.

These are promising results. The fact that capabilities are so similar to file descriptors in most uses means that the majority of existing code does not have to change. Calls that access a file descriptor can access a capability in the exact same way. Communication via sockets is unchanged. In nearly all cases, the only changes that need to be made to a program are in the places where it acquires resources. This means that a developer who wants to add Capsicum functionality to a program does not have to rework the entire code base, and that a developer who is writing a new program can worry about the security aspects of the program at specific points and then forget about them.

6 Limitations

Capsicum is research software, still under development, and as such is far from perfect. In performing these experiments, I found several ways in which limitations in the Capsicum implementation prevented me from achieving a desired goal. The most important of these is that Capsicum does not provide a way to revoke capabilities from a process; once a process has a capability to access a given resource, it will always have that capability. We can illustrate why this is a problem by going back to the problem mentioned earlier having to do with thread pools. With thread pools, a server would follow a similar flow to that shown in Figures 1 and 2. However, after completing the request, instead of exiting, the worker would retrieve another job from the server. This poses a problem for Capsicum, given the security policy that clients must not be able to access each other's files. After serving a request from a user, a worker thread has the capabilities corresponding to that user's files. If the worker immediately goes to retrieve another job, it will then begin interaction with the next user while still holding the original capabilities. This means that the next user could access the first user's files. Unfortunately, Capsicum does not provide a means of preventing this, as capabilities cannot be revoked.

There are other minor limitations that I found in Capsicum. Some of the implementation is somewhat ad hoc or even lazy. Many of the calls Capsicum provides, such as sending capabilities between host processes and the sandboxes they have started, are simply piggybacked on top of existing UNIX calls. Thus there are often unnecessary or redundant parameters or calls made in performing these actions. While this is an easy solution to the problem of passing capabilities, it could perhaps be solved more cleanly and efficiently if a dedicated capability passing scheme were used instead.

7 Conclusions

In this work I analyzed Capsicum to determine how it performs compared to existing UNIX architecture and how usable it is from a developer's point of view. I found that in terms of performance, there are both good and bad aspects of Capsicum. Many calls function exactly as they do in a normal system, with little to no overhead for common calls like read and write. Unfortunately, some calls entail significant overhead, much of which is created by Capsicum's under-the-hood manipulations and is not immediately visible to the programmer. Fortunately, the calls that introduce overhead are usually calls that only need to be made once, such as entering a sandbox or creating a file.

In terms of usability, Capsicum again has pros and cons. Though it has a complicated call structure and occasionally seems to make arbitrary design decisions, it is fairly straightforward to use once a programmer learns its syntax and calling conventions. Furthermore, the majority of Capsicum's primitives deal with a small percentage of the total code in a program, which means that to modify a program to use Capsicum, a programmer can get away with looking at only a limited portion of the code. With all this taken into account, Capsicum is a promising system that has potential to be quite useful in the security community.

References

[1] Andrea Bittau, Petr Marchenko, Mark Handley, and Brad Karp. Wedge: Splitting applications into reduced-privilege compartments. 5th USENIX Symposium on Networked Systems Design and Implementation, 2008.

[2] Nicolas Falliere, Liam O Murchu, and Eric Chien. W32.Stuxnet dossier. http://www.symantec.com/content/en/us/enterprise/media/security_response/whitepapers/w32_stuxnet_dossier.pdf, 2010.

[3] William R. Harris, Thomas Reps, and Somesh Jha. DIFC programs by automatic instrumentation. Computer and Communications Security, 2010.

[4] Leo A. Meyerovich, A. Porter Felt, and M. S. Miller. Object views: Fine-grained sharing in browsers. World Wide Web Conference, 2010.

[5] M. S. Miller, M. Samuel, B. Laurie, I. Awad, and M. Stay. Caja - safe active content in sanitized JavaScript. http://google-caja.googlecode.com/files/caja-spec-2007-10-11.pdf, October 2007.

[6] Robert Watson. Capsicum: practical capabilities for UNIX. 19th USENIX Security Symposium, 2010.
