APPLYING THE PROACTOR PATTERN TO HIGH-PERFORMANCE WEB SERVERS

James Hu, Irfan Pyarali, and Douglas C. Schmidt

Department of Computer Science, Washington University

St. Louis, MO 63130, USA

This paper is to appear at the 10th International Conference on Parallel and Distributed Computing and Systems, IASTED, Las Vegas, Nevada, October 28-31, 1998.

This work was supported in part by Siemens Med and Siemens Corporate Research.

ABSTRACT

Modern operating systems provide multiple concurrency mechanisms to develop high-performance Web servers. Synchronous multi-threading is a popular mechanism for developing Web servers that must perform multiple operations simultaneously to meet their performance requirements. In addition, an increasing number of operating systems support asynchronous mechanisms that provide the benefits of concurrency, while alleviating much of the performance overhead of synchronous multi-threading.

This paper provides two contributions to the study of high-performance Web servers. First, it examines how synchronous and asynchronous event dispatching mechanisms impact the design and performance of JAWS, which is our high-performance Web server framework. The results reveal significant performance improvements when a proactive concurrency model is used to combine lightweight concurrency with asynchronous event dispatching.

In general, however, the complexity of the proactive concurrency model makes it harder to program applications that can utilize asynchronous concurrency mechanisms effectively. Therefore, the second contribution of this paper describes how to reduce the software complexity of asynchronous concurrent applications by applying the Proactor pattern. This pattern describes the steps required to structure object-oriented applications that seamlessly combine concurrency with asynchronous event dispatching. The Proactor pattern simplifies concurrent programming and improves performance by allowing concurrent applications to have multiple operations running simultaneously without requiring a large number of threads.

1 INTRODUCTION

Computing power and network bandwidth on the Internet have increased dramatically over the past decade. High-speed networks (such as ATM and Gigabit Ethernet) and high-performance I/O subsystems (such as RAID) are becoming ubiquitous. In this context, developing scalable Web servers that can exploit these innovations remains a key challenge for developers. Thus, it is increasingly important to alleviate common Web server bottlenecks, such as inappropriate choice of concurrency and dispatching strategies, excessive filesystem access, and unnecessary data copying.

Our research vehicle for exploring the performance impact of applying various Web server optimization techniques is the JAWS Adaptive Web Server (JAWS) [1]. JAWS is both an adaptive Web server and a development framework for Web servers that runs on multiple OS platforms including Win32, most versions of UNIX, and MVS Open Edition.

Our experience [2] building Web servers on multiple OS platforms demonstrates that the effort required to optimize performance can be simplified significantly by leveraging OS-specific features. For example, an optimized file I/O system that automatically caches open files in main memory via mmap greatly reduces latency on Solaris. Likewise, support for asynchronous event dispatching on Windows NT can substantially increase server throughput by reducing context switching and synchronization overhead incurred by multi-threading.

Unfortunately, the increase in performance obtained through the use of asynchronous event dispatching on existing operating systems comes at the cost of increased software complexity. Moreover, this complexity is further compounded when asynchrony is coupled with multi-threading. This style of programming, i.e., proactive programming, is relatively unfamiliar to many developers accustomed to the synchronous event dispatching paradigm. This paper describes how the Proactor pattern can be applied to improve both the performance and the design of high-performance communication applications, such as Web servers.

A pattern represents a recurring solution to a software development problem within a particular context [3]. Patterns identify the static and dynamic collaborations and interactions between software components. In general, applying patterns to complex object-oriented concurrent applications can significantly improve software quality, increase software maintainability, and support broad reuse of components and architectural designs [4]. In particular, applying the Proactor pattern to JAWS simplifies asynchronous application development by structuring the demultiplexing of completion events and the dispatching of their corresponding completion routines.

The remainder of this paper is organized as follows: Section 2 provides an overview of the JAWS server framework design; Section 3 discusses alternative event dispatching strategies and their performance impacts; Section 4 explores how to leverage the gains of asynchronous event dispatching through application of the Proactor pattern; and Section 5 presents concluding remarks.

2 JAWS FRAMEWORK OVERVIEW

Figure 1 illustrates the major structural components and design patterns that comprise the JAWS framework [1]. JAWS is designed to allow the customization of various Web server strategies in response to environmental factors. These factors include static factors (e.g., number of available CPUs, support for kernel-level threads, and availability of asynchronous I/O in the OS), as well as dynamic factors (e.g., Web traffic patterns and workload characteristics).

[Figure 1: Architectural Overview of the JAWS Framework. Components shown: Event Dispatcher, Concurrency Strategy Framework, I/O Strategy Framework, Protocol Handler, Protocol Pipeline Framework, Protocol Filter, Cached Virtual Filesystem, and Tilde Expander. Patterns listed along the borders: Reactor/Proactor, Strategy, Singleton, Memento, State, Asynchronous Completion Token, Acceptor, Service Configurator, Adapter, Pipes and Filters, and Active Object.]

JAWS is structured as a framework of frameworks. The overall JAWS framework contains the following components and frameworks: an Event Dispatcher, Concurrency Strategy, I/O Strategy, Protocol Pipeline, Protocol Handlers, and Cached Virtual Filesystem. Each framework is structured as a set of collaborating objects implemented using components in ACE [5]. The collaborations among JAWS components and frameworks are guided by a family of patterns, which are listed along the borders in Figure 1. An outline of the key frameworks, components, and patterns in JAWS is presented below; Section 4 then focuses on the Proactor pattern in detail.¹

¹ Due to space limitations it is not possible to describe all the patterns mentioned below in detail. The references provide complete coverage of each pattern, however.

Event Dispatcher: This component is responsible for coordinating JAWS' Concurrency Strategy with its I/O Strategy. The passive establishment of connection events with Web clients follows the Acceptor pattern [6]. New incoming HTTP request events are serviced by a concurrency strategy. As events are processed, they are dispatched to the Protocol Handler, which is parameterized by an I/O strategy. JAWS' ability to dynamically bind to a particular concurrency strategy and I/O strategy from a range of alternatives follows the Strategy pattern [3].

Concurrency Strategy: This framework implements concurrency mechanisms (such as single-threaded, thread-per-request, or thread pool) that can be selected adaptively at run-time using the State pattern [3] or pre-determined at initialization-time. The Service Configurator pattern [7] is used to configure a particular concurrency strategy into a Web server at run-time. When concurrency involves multiple threads, the strategy creates protocol handlers that follow the Active Object pattern [8].

I/O Strategy: This framework implements various I/O mechanisms, such as asynchronous, synchronous, and reactive I/O. Multiple I/O mechanisms can be used simultaneously. In JAWS, asynchronous I/O is implemented using the Proactor pattern [9], while reactive I/O is accomplished through the Reactor pattern [10]. These I/O strategies may utilize the Memento [3] and Asynchronous Completion Token [11] patterns to capture and externalize the state of a request so that it can be restored at a later time.

Protocol Handler: This framework allows system developers to apply the JAWS framework to a variety of Web system applications. A Protocol Handler is parameterized by a concurrency strategy and an I/O strategy. These strategies are decoupled from the protocol handler using the Adapter pattern [3]. In JAWS, this component implements the parsing and handling of HTTP/1.0 request methods. The abstraction allows for other protocols (such as HTTP/1.1, DICOM, and SFP [12]) to be incorporated easily into JAWS. To add a new protocol, developers simply write a new Protocol Handler implementation, which is then configured into the JAWS framework.

Protocol Pipeline: This framework allows filter operations to be incorporated easily with the data being processed by the Protocol Handler. This integration is achieved by employing the Adapter pattern. Pipelines follow the Pipes and Filters pattern [13] for input processing. Pipeline components can be linked dynamically at run-time using the Service Configurator pattern.

Cached Virtual Filesystem: This component improves Web server performance by reducing the overhead of filesystem access. Various caching strategies, such as LRU, LFU, Hinted, and Structured, can be selected following the Strategy pattern [3]. This allows different caching strategies to be profiled and selected based on their performance. Moreover, optimal strategies can be configured statically or dynamically using the Service Configurator pattern. The cache for each Web server is instantiated using the Singleton pattern [3].

Tilde Expander: This component is another cache component that uses a perfect hash table [14] that maps abbreviated user login names (e.g., schmidt) to user home directories (e.g., /home/cs/faculty/schmidt). When personal Web pages are stored in user home directories, and user directories do not reside in one common root, this component substantially reduces the disk I/O overhead required to access a system user information file, such as /etc/passwd. By virtue of the Service Configurator pattern, the Tilde Expander can be unlinked and relinked dynamically into the server when a new user is added to the system.

Our previous work on high-performance Web servers has focused on (1) the design of the JAWS framework [1] and (2) detailed measurements on the performance implications of alternative Web server optimization techniques [2]. In our earlier work, we discovered that a concurrent proactive Web server can achieve substantial performance gains [15].

This paper focuses on a previously unexamined point in the high-performance Web server design space: the application of the Proactor pattern to simplify Web server software development, while maintaining high performance. Section 3 motivates the need for concurrent proactive architectures by analyzing empirical benchmarking results of JAWS and outlining the software design challenges involved in developing proactive Web servers. Section 4 then demonstrates how these challenges can be overcome by designing the JAWS Web server using the Proactor pattern.

3 CONCURRENCY ARCHITECTURES

Developing a high-performance Web server like JAWS requires the resolution of the following forces:

• Concurrency: The server must perform multiple client requests simultaneously;

• Efficiency: The server must minimize latency, maximize throughput, and avoid utilizing the CPU(s) unnecessarily;

• Adaptability: Integrating new or improved transport protocols (such as HTTP 1.1 [16]) should incur minimal enhancement and maintenance costs;

• Programming simplicity: The design of the server should simplify the use of various concurrency strategies, which may differ in performance on different OS platforms.

The JAWS Web server can be implemented using several concurrency strategies, such as multiple synchronous threads, reactive synchronous event dispatching, and proactive asynchronous event dispatching. Below, we compare and contrast the performance and design impacts of using conventional multi-threaded synchronous event dispatching versus proactive asynchronous event dispatching, using our experience developing and optimizing JAWS as a case study.

3.1 CONCURRENT SYNCHRONOUS EVENTS

Overview: An intuitive and widely used concurrency architecture for implementing concurrent Web servers is to use synchronous multi-threading. In this model, multiple server threads can process HTTP GET requests from multiple clients simultaneously. Each thread performs connection establishment, HTTP request reading, request parsing, and file transfer operations synchronously. As a result, each operation blocks until it completes.

The primary advantage of synchronous threading is the simplification of server code. In particular, operations performed by a Web server to service client A's request are mostly independent of the operations required to service client B's request. Thus, it is easy to service different requests in separate threads because the amount of state shared between the threads is low, which minimizes the need for synchronization. Moreover, executing application logic in separate threads allows developers to utilize intuitive sequential commands and blocking operations.

Evaluation: Although the synchronous multi-threaded model is intuitive and maps relatively efficiently onto multi-CPU platforms, it has the following drawbacks:

• The threading policy is tightly coupled to the concurrency policy: The synchronous model requires a dedicated thread for each connected client. A concurrent application may be better optimized by aligning its threading strategy to available resources (such as the number of CPUs) rather than to the number of clients being serviced concurrently;

• Increased synchronization complexity: Threading can increase the complexity of synchronization mechanisms necessary to serialize access to a server's shared resources (such as cached files and logging of Web page hits);

• Increased performance overhead: Threading can perform poorly due to context switching, synchronization, and data movement among CPUs [5];

• Non-portability: Threading may not be available on all OS platforms. Moreover, OS platforms differ widely in terms of their support for preemptive and non-preemptive threads. Consequently, it is hard to build multi-threaded servers that behave uniformly across OS platforms.

As a result of these drawbacks, multi-threading may not always be the most efficient nor the least complex solution to develop concurrent Web servers. The solution may not be obvious, since the disadvantages may not result in any actual performance penalty except under certain conditions, such as a particularly high number of long-running requests intermixed with rapid requests for smaller files. Therefore, it is important to explore alternative Web server architecture designs, such as the concurrent asynchronous architecture described next.

3.2 CONCURRENT ASYNCHRONOUS EVENTS

Overview: When the OS platform supports asynchronous operations, an efficient and convenient way to implement a high-performance Web server is to use proactive event dispatching. Web servers designed using this dispatching model handle the completion of asynchronous operations with one or more threads of control.

JAWS implements proactive event dispatching by first issuing an asynchronous operation to the OS and registering a callback (which is the Completion Handler) with the Event Dispatcher.² This Event Dispatcher notifies JAWS when the operation completes. The OS then performs the operation and subsequently queues the result in a well-known location. The Event Dispatcher is responsible for dequeuing completion notifications and executing the appropriate Completion Handler.

² In our discussion, JAWS framework components presented in Section 2 appear in italics and pattern participants presented in Section 4 appear in typewriter font.

Evaluation: The primary advantage of using proactive event dispatching is that multiple operations can be initiated and run concurrently without requiring the application to have as many threads as there are simultaneous I/O operations. The operations are initiated asynchronously by the application and they run to completion within the I/O subsystem of the OS. Once the asynchronous operation is initiated, the thread that started the operation becomes available to service additional requests.

In the proactive example above, for instance, the Event Dispatcher could be single-threaded, which may be desirable on a uniprocessor platform. When HTTP requests arrive, the single Event Dispatcher thread parses the request, reads the file, and sends the response to the client. Since the response is sent asynchronously, multiple responses could potentially be sent simultaneously. Moreover, the synchronous file read could be replaced with an asynchronous file read to further increase the potential for concurrency. If the file read is performed asynchronously, the only synchronous operation performed by a Protocol Handler is the HTTP protocol request parsing.

The primary drawback with the proactive event dispatching model is that the application structure and behavior can be considerably more complicated than with the conventional synchronous multi-threaded programming paradigm. In general, asynchronous applications are hard to develop since programmers must explicitly retrieve OS notifications when asynchronous events complete. In addition, completion notifications need not appear in the same order in which the asynchronous events were requested. Moreover, combining concurrency with asynchronous events is even harder since the thread that issues an asynchronous request may ultimately handle the completion of an event started by a different thread. The JAWS framework alleviates many of the complexities of concurrent asynchronous event dispatching by applying the Proactor pattern described in Section 4.

3.3 SUMMARY OF PERFORMANCE RESULTS

In our empirical measurements [15] we have observed that there is significant variance in throughput and latency depending on the concurrency and event dispatching mechanisms. For small files, the synchronous Thread Pool strategy provides better overall performance. Under moderate loads, the synchronous event dispatching model provides slightly better latency than the asynchronous model. Under heavy loads and with large file transfers, however, the asynchronous model using TransmitFile provides better quality of service. Thus, under Windows NT, an optimal Web server should adapt itself to either event dispatching and file I/O model, depending on the server's workload and distribution of file requests.

Despite the potential for substantial performance improvements, it is considerably harder to develop a Web server that manages concurrency using asynchronous event dispatching, compared with traditional synchronous approaches. This is due to the additional details associated with asynchronous programming (e.g., explicitly retrieving OS notifications that may appear in non-FIFO order), and the added complexity of combining the approach with multi-threaded concurrency. Moreover, proactive event dispatching can be difficult to debug since asynchronous operations are often non-deterministic. Our experience with designing and developing a proactive Web server indicates that the Proactor pattern provides an elegant solution to managing these complexities.
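The non-FIFO delivery of completion notifications discussed in Section 3 can be illustrated with a small, single-threaded C++ sketch. It is not JAWS or ACE code. `SimulatedOs`, `EventDispatcher`, `Token`, and `run_out_of_order_demo` are hypothetical names introduced only for illustration, and the "OS" is assumed to finish smaller transfers first so that the reordering is deterministic. Each request is tagged with a token, in the spirit of the Asynchronous Completion Token pattern [11], so the dispatcher can find the right handler regardless of arrival order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Token that ties a completion notification back to the request that caused it.
using Token = std::size_t;

// Notification queued by the (simulated) OS when an operation finishes.
struct Completion {
    Token token;
    std::size_t bytes;
};

// Hypothetical stand-in for the OS I/O subsystem: it finishes smaller
// transfers first, so completion order differs from request order.
class SimulatedOs {
public:
    void start_read(Token token, std::size_t bytes) {
        pending_.push_back(Completion{token, bytes});
    }

    // Complete every pending operation and queue its notification.
    void run(std::queue<Completion>& completions) {
        std::stable_sort(pending_.begin(), pending_.end(),
                         [](const Completion& a, const Completion& b) {
                             return a.bytes < b.bytes;
                         });
        for (const Completion& c : pending_) completions.push(c);
        pending_.clear();
    }

private:
    std::vector<Completion> pending_;
};

// Single-threaded event dispatcher: dequeues notifications and runs the
// completion handler registered under the matching token.
class EventDispatcher {
public:
    using Handler = std::function<void(const Completion&)>;

    Token register_handler(Handler handler) {
        assert(static_cast<bool>(handler));  // handler must be callable
        const Token token = next_token_++;
        handlers_[token] = std::move(handler);
        return token;
    }

    std::queue<Completion>& completions() { return completions_; }

    void dispatch_all() {
        while (!completions_.empty()) {
            const Completion c = completions_.front();
            completions_.pop();
            auto it = handlers_.find(c.token);
            if (it == handlers_.end()) continue;  // unknown token: ignore
            Handler handler = std::move(it->second);
            handlers_.erase(it);
            handler(c);
        }
    }

private:
    Token next_token_ = 0;
    std::map<Token, Handler> handlers_;
    std::queue<Completion> completions_;
};

// Issues three reads and returns the order in which their handlers ran.
std::vector<std::string> run_out_of_order_demo() {
    SimulatedOs os;
    EventDispatcher dispatcher;
    std::vector<std::string> handled;
    const std::vector<std::pair<std::string, std::size_t>> files = {
        {"large.bin", 9000}, {"index.html", 120}, {"logo.gif", 1500}};
    for (const auto& file : files) {
        const std::string name = file.first;
        const Token token = dispatcher.register_handler(
            [&handled, name](const Completion& c) {
                handled.push_back(name + ":" + std::to_string(c.bytes));
            });
        os.start_read(token, file.second);
    }
    os.run(dispatcher.completions());
    dispatcher.dispatch_all();
    return handled;
}
```

Calling `run_out_of_order_demo()` from a `main` function returns the handlers' names in completion order rather than request order. That is the bookkeeping burden the text attributes to asynchronous programming: the application cannot assume that the first request issued is the first to complete.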

4 THE PROACTOR PATTERN

In general, patterns help manage complexity by providing insight into known solutions to problems in a particular software domain. In the case of concurrent proactive architectures, the complexity of the additional details of asynchronous programming is compounded by the complexities associated with multi-threaded programming. Fortunately, patterns identified in software solutions to other proactive architectures have yielded the Proactor pattern, which is described below.³

³ For brevity, portions of the complete description have been elided. Detailed coverage of implementation and sample code are available in [9].

4.1 INTENT

The Proactor pattern supports the demultiplexing and dispatching of multiple event handlers, which are triggered by the completion of asynchronous events. This pattern simplifies asynchronous application development by integrating the demultiplexing of completion events and the dispatching of their corresponding event handlers.

4.2 APPLICABILITY

Use the Proactor pattern when one or more of the following conditions hold:

• An application needs to perform one or more asynchronous operations without blocking the calling thread;

• The application must be notified when asynchronous operations complete;

• The application needs to vary its concurrency strategy independent of its I/O model;

• The application will benefit by decoupling the application-dependent logic from the application-independent infrastructure;

• An application will perform poorly or fail to meet its performance requirements when utilizing either the multi-threaded approach or the reactive dispatching approach.

4.3 STRUCTURE AND PARTICIPANTS

The structure of the Proactor pattern is illustrated in Figure 2 using OMT notation.

[Figure 2: Participants in the Proactor Pattern. Participants shown: Proactive Initiator, Asynchronous Operation Processor, Asynchronous Operation, Completion Handler, and Completion Dispatcher.]

The key participants in the Proactor pattern include the following:

Proactive Initiator (Web server application's main thread):

• A Proactive Initiator is any entity in the application that initiates an Asynchronous Operation. The Proactive Initiator registers a Completion Handler and a Completion Dispatcher with an Asynchronous Operation Processor, which notifies it when the operation completes.

Completion Handler (the Acceptor and HTTP Handler):

• The Proactor pattern uses Completion Handler interfaces that are implemented by the application for Asynchronous Operation completion notification.

Asynchronous Operation (the methods Async Read, Async Write, and Async Accept):

• Asynchronous Operations are used to execute requests (such as I/O and timer operations) on behalf of applications. When applications invoke Asynchronous Operations, the operations are performed without borrowing the application's thread of control.⁴ Therefore, from the application's perspective, the operations are performed asynchronously. When Asynchronous Operations complete, the Asynchronous Operation Processor delegates application notifications to a Completion Dispatcher.

⁴ In contrast, the reactive event dispatching model [10] steals the application's thread of control to perform the operation synchronously.

Asynchronous Operation Processor (the Operating System):

• Asynchronous Operations are run to completion by the Asynchronous Operation Processor. This component is typically implemented by the OS.

Completion Dispatcher (the Notification Queue):

• The Completion Dispatcher is responsible for calling back to the application's Completion Handlers when Asynchronous Operations complete. When the Asynchronous Operation Processor completes an asynchronously initiated operation, the Completion Dispatcher performs an application callback on its behalf.

4.4 COLLABORATIONS

There are several well-defined steps that occur for all Asynchronous Operations. At a high level of abstraction, applications initiate operations asynchronously and are notified when the operations complete. Figure 3 shows the following interactions that must occur between the pattern participants:

[Figure 3: Interaction Diagram for the Proactor Pattern. Participants: Proactive Initiator, Asynchronous Operation Processor, Asynchronous Operation, Completion Dispatcher, and Completion Handler. Sequence: register (operation, handler, dispatcher) when the asynchronous operation is initiated; execute while the operation is performed asynchronously; dispatch when the operation completes; handle event when the Completion Handler is notified.]

1. Proactive Initiator initiates operation: To perform asynchronous operations, the application initiates the operation on the Asynchronous Operation Processor. For instance, a Web server might ask the OS to transmit a file over the network using a particular socket connection. To request such an operation, the Web server must specify which file and network connection to use. Moreover, the Web server must specify (1) which Completion Handler to notify when the operation completes and (2) which Completion Dispatcher should perform the callback once the file is transmitted.

2. Asynchronous Operation Processor performs operation: When the application invokes operations on the Asynchronous Operation Processor, it runs them asynchronously with respect to other application operations. Modern operating systems (such as Solaris and Windows NT) provide asynchronous I/O subsystems within the kernel.

3. The Asynchronous Operation Processor notifies the Completion Dispatcher: When operations complete, the Asynchronous Operation Processor retrieves the Completion Handler and Completion Dispatcher that were specified when the operation was initiated. The Asynchronous Operation Processor then passes the Completion Dispatcher the result of the Asynchronous Operation and the Completion Handler to call back. For instance, if a file was transmitted asynchronously, the Asynchronous Operation Processor may report the completion status (such as success or failure), as well as the number of bytes written to the network connection.

4. Completion Dispatcher notifies the application: The Completion Dispatcher calls the completion hook on the Completion Handler, passing it any completion data specified by the application. For instance, if an asynchronous read completes, the Completion Handler will typically be passed a pointer to the newly arrived data.

4.5 CONSEQUENCES

This section details the consequences of using the Proactor pattern.

4.5.1 BENEFITS

The Proactor pattern offers the following benefits:

Increased separation of concerns: The Proactor pattern decouples application-independent asynchrony mechanisms from application-specific functionality. The application-independent mechanisms become reusable components that know how to demultiplex the completion events associated with Asynchronous Operations and dispatch the appropriate callback methods defined by the Completion Handlers. Likewise, the application-specific functionality knows how to perform a particular type of service (such as HTTP processing).

Improved application logic portability: The Proactor pattern improves application portability by allowing its interface to be reused independently of the underlying OS calls that perform event demultiplexing. These system calls detect and report the events that may occur simultaneously on multiple event sources. Event sources may include I/O ports, timers, synchronization objects, signals, etc. On real-time POSIX platforms, the asynchronous I/O functions are provided by the aio family of APIs [17]. In Windows NT, I/O completion ports and overlapped I/O are used to implement asynchronous I/O [18].

The Completion Dispatcher encapsulates the concurrency mechanism: A benefit of decoupling the Completion Dispatcher from the Asynchronous Operation Processor is that applications can configure Completion Dispatchers with various concurrency strategies without affecting other participants. The Completion Dispatcher can be configured to use several concurrency strategies including single-threaded and Thread Pool solutions.

Threading policy is decoupled from the concurrency policy: Since the Asynchronous Operation Processor completes potentially long-running operations on behalf of Proactive Initiators, applications are not forced to spawn threads to increase concurrency. This allows an application to vary its concurrency policy independently of its threading policy. For instance, a Web server may only want to have one thread per CPU, but may want to service a higher number of clients simultaneously.

Increased performance: Multi-threaded operating systems perform context switches to cycle through multiple threads of control. While the time to perform a context switch remains fairly constant, the total time to cycle through a large number of threads can degrade application performance significantly if the OS context switches to an idle thread. For instance, threads may poll the OS for completion status, which is inefficient. The Proactor pattern can avoid the cost of context switching by activating only those logical threads of control that have events to process. For instance, a Web server does not need to activate an HTTP Handler if there is no pending GET request.

Simplification of application synchronization: As long as Completion Handlers do not spawn additional threads of control, application logic can be written with little or no regard to synchronization issues. Completion Handlers can be written as if they existed in a conventional single-threaded environment. For instance, a Web server's HTTP GET Handler can access the disk through an Async Read operation (such as the Windows NT TransmitFile function [15]).

4.5.2 DRAWBACKS

The Proactor pattern has the following drawbacks:

Hard to debug: Applications written with the Proactor pattern can be hard to debug since the inverted flow of control oscillates between the framework infrastructure and the method callbacks on application-specific handlers. This increases the difficulty of "single-stepping" through the run-time behavior of a framework within a debugger since application developers may not understand or have access to the framework code. This is similar to the problems encountered trying to debug a compiler lexical analyzer and parser written with LEX and YACC. In these applications, debugging is straightforward when the thread of control is within the user-defined action routines. Once the thread of control returns to the generated Deterministic Finite Automata (DFA) skeleton, however, it is hard to follow the program logic.

Scheduling and controlling outstanding operations: Proactive Initiators may have no control over the order in which Asynchronous Operations are executed. Therefore, the Asynchronous Operation Processor must be designed carefully to support prioritization and cancellation of Asynchronous Operations.

4.6 KNOWN USES

The following are some widely documented uses of the Proactor pattern:

I/O Completion Ports in Windows NT: The Windows NT operating system implements the Proactor pattern. Various Asynchronous Operations such as accepting new network connections, reading and writing to files and sockets, and transmission of files across a network connection are supported by Windows NT. The operating system is the Asynchronous Operation Processor. Results of the operations are queued up at the I/O completion port (which plays the role of the Completion Dispatcher).

ACE Proactor: The Adaptive Communications Environment (ACE) [5] implements a Proactor component that encapsulates I/O Completion Ports on Windows NT. The ACE Proactor abstraction provides an OO interface to the standard C APIs supported by Windows NT. The source code for this implementation can be acquired from the ACE website at www.cs.wustl.edu/schmidt/ACE.html.

The UNIX AIO Family of Asynchronous I/O Operations: On some real-time POSIX platforms, the Proactor pattern is implemented by the aio family of APIs [17]. These OS features are very similar to the ones described above for Windows NT. One difference is that UNIX signals can be used to implement a truly asynchronous Completion Dispatcher (the Windows NT API is not truly asynchronous).

Asynchronous Procedure Calls in Windows NT: Some systems (such as Windows NT) support Asynchronous Procedure Calls (APCs). An APC is a function that executes asynchronously in the context of a particular thread. When an APC is queued to a thread, the system issues a software interrupt. The next time the thread is scheduled, it will run the APC. APCs made by the operating system are called kernel-mode APCs. APCs made by an application are called user-mode APCs.
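To make the participants of Section 4.3 and the four collaboration steps of Section 4.4 concrete, the following is a minimal, portable C++ sketch. It is an illustration under stated assumptions, not the ACE Proactor interface and not the JAWS implementation: the class names mirror the pattern participants, but every signature here (`async_write`, `post`, `handle_one`, `run_proactor_demo`) is hypothetical. Worker threads stand in for the OS as the Asynchronous Operation Processor, and no real network I/O is performed.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Completion status reported for an operation (step 3).
struct Result {
    bool success;
    std::size_t bytes_transferred;
};

// Completion Handler: interface implemented by the application.
class CompletionHandler {
public:
    virtual ~CompletionHandler() = default;
    virtual void handle_event(const Result& result) = 0;
};

// Completion Dispatcher: a notification queue drained by the application.
class CompletionDispatcher {
public:
    // Called by the operation processor when an operation completes.
    void post(CompletionHandler* handler, Result result) {
        assert(handler != nullptr);
        {
            std::lock_guard<std::mutex> guard(mutex_);
            queue_.push(std::make_pair(handler, result));
        }
        ready_.notify_one();
    }

    // Step 4: wait for one completion, then call back on the caller's thread.
    void handle_one() {
        std::pair<CompletionHandler*, Result> item;
        {
            std::unique_lock<std::mutex> lock(mutex_);
            ready_.wait(lock, [this] { return !queue_.empty(); });
            item = queue_.front();
            queue_.pop();
        }
        item.first->handle_event(item.second);
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::pair<CompletionHandler*, Result>> queue_;
};

// Asynchronous Operation Processor: worker threads simulate the OS here.
class AsyncOperationProcessor {
public:
    AsyncOperationProcessor() = default;
    AsyncOperationProcessor(const AsyncOperationProcessor&) = delete;
    AsyncOperationProcessor& operator=(const AsyncOperationProcessor&) = delete;
    ~AsyncOperationProcessor() {
        for (std::thread& worker : workers_) worker.join();
    }

    // Steps 1-3: start the operation, run it off the caller's thread, and
    // hand the result plus handler to the dispatcher named by the caller.
    void async_write(std::string data, CompletionHandler* handler,
                     CompletionDispatcher* dispatcher) {
        workers_.emplace_back([data = std::move(data), handler, dispatcher] {
            // A real processor would transmit `data`; this sketch only
            // reports how many bytes it would have written.
            dispatcher->post(handler, Result{true, data.size()});
        });
    }

private:
    std::vector<std::thread> workers_;
};

// Application-specific handler; it runs only on the dispatching thread,
// so it needs no locking of its own.
class HttpHandler : public CompletionHandler {
public:
    void handle_event(const Result& result) override {
        if (result.success) bytes_sent_ += result.bytes_transferred;
        ++completions_;
    }
    std::size_t bytes_sent() const { return bytes_sent_; }
    std::size_t completions() const { return completions_; }

private:
    std::size_t bytes_sent_ = 0;
    std::size_t completions_ = 0;
};

// Proactive Initiator: issues two writes, then dispatches both completions.
std::size_t run_proactor_demo() {
    CompletionDispatcher dispatcher;
    HttpHandler handler;
    {
        AsyncOperationProcessor processor;
        processor.async_write("HTTP/1.0 200 OK\r\n\r\n", &handler, &dispatcher);
        processor.async_write("hello", &handler, &dispatcher);
        dispatcher.handle_one();
        dispatcher.handle_one();
    }  // processor joins its workers here
    return handler.bytes_sent();
}
```

In this sketch the initiating thread never blocks inside `async_write`, and `HttpHandler` is written as ordinary single-threaded code because callbacks are delivered only through `handle_one()`. Running `handle_one()` from a pool of threads instead would change the concurrency strategy without touching the operation processor, at the cost of requiring synchronization inside the handlers.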

5 CONCLUDING REMARKS

Over the past several years, computer and network performance has improved substantially. However, the development of high-performance Web servers has remained expensive and error-prone. The JAWS framework described in this paper aims to support Web server developers by simplifying the application of various server designs and optimization strategies.

This paper illustrates that a Web server based on traditional synchronous event dispatching performs adequately under light server loads. However, when the Web server is subject to heavy loads, a design based on a concurrent proactive architecture provides significantly better performance. However, programming this model adds complexity to the software design and increases the effort of developing high-performance Web servers.

Much of the development effort stems from the repeated rediscovery and reinvention of fundamental design patterns. Design patterns describe recurring solutions found in existing software systems. Applying design patterns to concurrent software systems can reduce software development time, improve code maintainability, and increase code reuse over traditional software engineering techniques.

The Proactor pattern described in this paper embodies a powerful technique that supports both efficient and flexible event dispatching strategies for high-performance concurrent applications. In general, applying this pattern enables developers to leverage the performance benefits of executing operations concurrently, without constraining them to synchronous multi-threaded or reactive programming. In our experience, applying the Proactor pattern to the JAWS Web server framework has made it considerably easier to design, develop, test, and maintain.

REFERENCES

[1] D. C. Schmidt and J. Hu, “Developing Flexible and High-performance Web Servers with Frameworks and Patterns,” in ACM Computing Surveys, vol. 30, 1998.
[2] J. Hu, S. Mungee, and D. C. Schmidt, “Principles for Developing and Measuring High-performance Web Servers over ATM,” in Proceedings of INFOCOM ’98, March/April 1998.
[3] E. Gamma, R. Helm, R. Johnson, and J. Vlissides, Design Patterns: Elements of Reusable Object-Oriented Software. Reading, MA: Addison-Wesley, 1995.
[4] D. C. Schmidt, “Experience Using Design Patterns to Develop Reuseable Object-Oriented Communication Software,” Communications of the ACM (Special Issue on Object-Oriented Experiences), vol. 38, October 1995.
[5] D. C. Schmidt, “ACE: an Object-Oriented Framework for Developing Distributed Applications,” in Proceedings of the 6th USENIX C++ Technical Conference, (Cambridge, Massachusetts), USENIX Association, April 1994.
[6] D. C. Schmidt, “Acceptor and Connector: Design Patterns for Initializing Communication Services,” in Pattern Languages of Program Design (R. Martin, F. Buschmann, and D. Riehle, eds.), Reading, MA: Addison-Wesley, 1997.
[7] P. Jain and D. C. Schmidt, “Service Configurator: A Pattern for Dynamic Configuration of Services,” in Proceedings of the 3rd Conference on Object-Oriented Technologies and Systems, USENIX, June 1997.
[8] R. G. Lavender and D. C. Schmidt, “Active Object: an Object Behavioral Pattern for Concurrent Programming,” in Pattern Languages of Program Design (J. O. Coplien, J. Vlissides, and N. Kerth, eds.), Reading, MA: Addison-Wesley, 1996.
[9] T. Harrison, I. Pyarali, D. C. Schmidt, and T. Jordan, “Proactor – An Object Behavioral Pattern for Dispatching Asynchronous Event Handlers,” in The 4th Pattern Languages of Programming Conference (Washington University technical report #WUCS-97-34), September 1997.
[10] D. C. Schmidt, “Reactor: An Object Behavioral Pattern for Concurrent Event Demultiplexing and Event Handler Dispatching,” in Pattern Languages of Program Design (J. O. Coplien and D. C. Schmidt, eds.), pp. 529–545, Reading, MA: Addison-Wesley, 1995.
[11] I. Pyarali, T. H. Harrison, and D. C. Schmidt, “Asynchronous Completion Token: an Object Behavioral Pattern for Efficient Asynchronous Event Handling,” in Pattern Languages of Program Design (R. Martin, F. Buschmann, and D. Riehle, eds.), Reading, MA: Addison-Wesley, 1997.
[12] Object Management Group, Control and Management of Audio/Video Streams: OMG RFP Submission, 1.2 ed., Mar. 1997.
[13] F. Buschmann, R. Meunier, H. Rohnert, P. Sommerlad, and M. Stal, Pattern-Oriented Software Architecture - A System of Patterns. Wiley and Sons, 1996.
[14] D. C. Schmidt, “GPERF: A Perfect Hash Function Generator,” in Proceedings of the 2nd C++ Conference, (San Francisco, California), pp. 87–102, USENIX, April 1990.
[15] J. Hu, I. Pyarali, and D. C. Schmidt, “Measuring the Impact of Event Dispatching and Concurrency Models on Web Server Performance Over High-speed Networks,” in Proceedings of the 2nd Global Internet Conference, IEEE, November 1997.
[16] J. C. Mogul, “The Case for Persistent-connection HTTP,” in Proceedings of ACM SIGCOMM ’95 Conference in Computer Communication Review, (Boston, MA, USA), pp. 299–314, ACM Press, August 1995.
[17] “Information Technology – Portable Operating System Interface (POSIX) – Part 1: System Application Program Interface (API) [C Language],” 1995.
[18] Microsoft Developers Studio, Version 4.2 - Software Development Kit, 1996.
