Evaluating the Performance of User-Space and Kernel-Space Web Servers

Amol Shukla, Lily Li, Anand Subramanian, Paul A.S. Ward, Tim Brecht
University of Waterloo (School of Computer Science; Department of Electrical and Computer Engineering)
{ashukla,l7li,anand,pasward,brecht}@uwaterloo.ca

Copyright © 2004 Amol Shukla, Lily Li, Anand Subramanian, Paul A.S. Ward, and Tim Brecht. Permission to copy is hereby granted provided the original copyright notice is reproduced in copies made.

Abstract

There has been much debate over the past few years about the practice of moving traditional user-space applications, such as web servers, into the kernel for better performance. Recently, the user-space µserver web server has shown promising performance for delivering static content. In this paper we first describe how we augmented the µserver to enable it to serve dynamic content. We then evaluate the performance of the µserver and the kernel-space TUX web server, using the SPECweb99 workload generator under a variety of static and dynamic workloads. We demonstrate that the gap in the performance of the two servers becomes less significant as the proportion of dynamic-content requests increases. In fact, for workloads with a majority of dynamic requests, the µserver outperforms TUX. We conclude that a well-designed user-space web server can compete with an in-kernel server on performance, while retaining the reliability and security benefits that come from operating in user space. The results presented in this paper will help system developers and administrators in choosing between the in-kernel and the user-space approach for web servers.

1 Introduction

The demand for web-based applications has grown rapidly over the past few years, which has motivated the development of faster and more efficient web servers. High-performance web servers need to multiplex between thousands of simultaneous connections without degrading the performance.

Many modern web applications require the generation of dynamic content, where the response to a user request is the result of a server-side computation. Dynamic content creation allows web pages to be personalized based on user preferences, thereby enabling a more interactive experience for the user than that possible with static-only content. On-the-fly response generation also provides a web interface to the information stored in databases. Dynamic requests are often compute-bound [1] and involve more processing than static-content requests. The on-demand creation of content can significantly impact the scalability and the performance provided by a web server [1, 2, 3].

Various techniques have been investigated to improve the performance of web servers. These include novel server architectures [4, 5] as well as modifications to operating system interfaces and mechanisms [6, 7, 8, 9, 10, 11]. An emerging trend has seen web servers being moved into the kernel for better performance. TUX [12] (now known as Content Accelerator) is a kernel-space web server from Red Hat. By running in kernel-space, TUX avoids the overheads of context switching and event notification, and implements several optimizations to reduce redundant data reads and copies. TUX supports the delivery of both static and dynamic content.

Initial research has indicated that kernel-based web servers provide a significant performance advantage over their user-space counterparts in the delivery of static content, and the in-kernel TUX web server has been shown to be nearly twice as fast as the best user-space servers (IIS on Windows 2000 and Zeus on Linux) [13]. In this paper we revisit the kernel-space versus user-space debate for web servers. In particular, we expand it to include dynamic content workloads. Recent research has shown that a significant proportion of workloads in real-world web sites consists of dynamic requests [1, 14], with Arlitt et al. [1] reporting that over 95% of the requests in an e-commerce web site are for dynamic content.

Recently, the user-space µserver [15] web server has shown promising results that rival the performance of TUX on certain static-content workloads [16]. Prior to our work, the µserver did not have the ability to handle dynamic requests. We first demonstrate how to augment the µserver to enable it to serve dynamic content. We then compare the performance of the µserver and TUX (the fastest reported in-kernel web server in Linux) under a variety of static and dynamic workloads. We demonstrate that the gap in the performance of the two servers becomes less significant as the proportion of dynamic-content requests increases. Further, we show that for workloads with a majority of dynamic requests, the µserver can outperform TUX. Based on the results of our experiments, we suggest that a well-designed user-space web server can compete with an in-kernel server on performance, while retaining the reliability and security benefits that come from operating in user space.

The rest of this paper is organized as follows: Section 2 provides additional background information and summarizes some of the related work. In Section 3 we present an overview of various approaches for supporting the delivery of dynamic content in web servers, and describe how we augmented the µserver to handle dynamic requests. Section 4 includes a detailed description of our experimental environment, methodology and workloads. In Section 5 we present the results of our experiments and analyze them. In Section 6 we summarize our findings, discuss their implications and outline some ideas for future work.

2 Background and Related Work

Modern web servers require special techniques to handle a large number of simultaneous connections. In order to understand the performance characteristics of web servers, it is important to consider the steps performed by a web server in handling a single client request. Figure 1, due to Brecht et al. [16], illustrates this process.

1. Wait for and accept an incoming network connection.
2. Read the incoming request from the network.
3. Parse the request.
4. For static requests, check the cache and possibly open and read the file.
5. For dynamic requests, compute the result based on the data provided in the request and the information stored at the server.
6. Send the reply to the requesting client.
7. If required, close the network connection.

Figure 1: Steps performed by a web server to process a client request.

Many of the steps in the figure can block due to network or disk I/O. A high-performance web server has to handle thousands of such requests simultaneously without degrading the performance. Different server architectures [4, 5] have been proposed to handle this problem and are summarized by Pai et al. [4]. An event-driven approach is often implemented in high-performance network servers to multiplex a large number of requests over a few server processes. Using an event notification mechanism, such as the select system call, the server identifies connections on which forward progress can be made without blocking, and processes only those connections. By using just a few processes, event-driven servers avoid excessive context-switches required by the thread- or process-per-connection approach taken by multi-threaded or multi-process servers like Apache [4].

Other researchers have suggested modifications in operating system interfaces and mechanisms for efficient notification and delivery of network events to user-space servers [6, 7, 8, 17], reducing the amount of data copied between the kernel and the user-space [10], reducing the number of kernel-boundary crossings [9], and a combination of the above [11].

In light of the considerable demands placed on the operating system by web servers, some researchers have sought to improve the performance of user-space web servers by migrating them into the kernel. By running in the kernel, the cost of event notification can be minimized through fewer kernel crossings, and the redundant copying and buffering of data in the user-space application can be avoided. The web server can also implement optimizations to take advantage of direct access to kernel-level data structures. In particular, a closer integration with the TCP/IP stack, the network interface, and the file-system cache is possible.

Following an in-kernel implementation of the Network File System (NFS) server in Linux, two kernel-space web servers were proposed. kHTTPd [18] runs in the kernel as a module and handles only static-content requests. TUX [12] is a kernel-based high-performance web server with the ability to serve both static and dynamic data. It implements several optimizations such as zero-copy parsing, zero-copy disk reads, zero-copy network writes, and caching complete responses in the kernel's networking layer to accelerate static content [...]

[...] ability and the performance of a web server [2, 3]. Our hypothesis was that since the generation of dynamic content is CPU-bound [1], processing these requests in the kernel might not prove to be as advantageous as the case with static-content delivery. Recent workload-characterization studies by Arlitt et al. [1] and Wang et al. [14] report that a significant proportion of the requests handled by modern e-commerce web sites are for dynamic content. To our knowledge, no studies examining the relative benefits of the user-space versus kernel-space approach for web servers using workloads containing dynamic-content requests have been published, and our work attempts to fill this void.

3 Implementing Dynamic Content Support in the µserver

Prior to this paper, the µserver did not support the delivery of dynamic content.
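
As background for what follows, the event-driven model described in Section 2 can be illustrated with a short sketch. It is not the µserver's code: it assumes Python's standard select and socket modules, and the names run_event_loop, handle_request and max_replies are hypothetical, introduced only for this illustration. A single process uses select to find the connections on which progress can be made without blocking, and then walks through the steps of Figure 1 for those connections only.

```python
import select
import socket


def run_event_loop(listener, handle_request, max_replies):
    """Serve clients from one process until max_replies replies are sent.

    handle_request is a hypothetical callback (bytes -> bytes) standing in
    for request parsing and static or dynamic response generation.
    """
    listener.setblocking(False)
    clients = set()
    replies_sent = 0
    while replies_sent < max_replies:
        # select() returns only the sockets that can be read without blocking.
        readable, _, _ = select.select([listener, *clients], [], [])
        for sock in readable:
            if sock is listener:
                # Step 1 of Figure 1: accept an incoming connection.
                conn, _addr = listener.accept()
                conn.setblocking(False)
                clients.add(conn)
                continue
            # Step 2: read the request (assumed to fit in a single recv).
            request = sock.recv(4096)
            if request:
                # Steps 3 to 6: parse, compute, and send the reply.
                # Simplification: a real server would also wait for the
                # socket to become writable before sending a large reply.
                sock.sendall(handle_request(request))
                replies_sent += 1
            # Step 7: close the connection (also when the peer has closed).
            clients.discard(sock)
            sock.close()
    for sock in clients:
        sock.close()
    return replies_sent
```

Because a connection is touched only after select reports it as readable, one process can interleave many clients without a thread or process per connection, which is the property the event-driven architecture relies on.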