Proactor: An Object Behavioral Pattern for Demultiplexing and Dispatching Handlers for Asynchronous Events

Irfan Pyarali, Tim Harrison, and Douglas C. Schmidt
{irfan, harrison, schmidt}@cs.wustl.edu
Dept. of Computer Science
Washington University
St. Louis, MO 63130
(314) 935-7538

Thomas D. Jordan
[email protected]
SoftElegance
Hartford, WI 53027
(414) 673-9813

This paper will appear at the 4th annual Pattern Languages of Programming conference held in Allerton Park, Illinois, September, 1997. Permission is granted to make copies of this paper for the PLoP '97 conference proceedings.

Abstract

Modern operating systems provide multiple mechanisms for developing concurrent applications. Synchronous multi-threading is a popular mechanism for developing applications that perform multiple operations simultaneously. However, threads often have high performance overhead and require deep knowledge of synchronization patterns and principles. Therefore, an increasing number of operating systems support asynchronous mechanisms that provide the benefits of concurrency while alleviating much of the overhead and complexity of multi-threading.

The Proactor pattern presented in this paper describes how to structure applications and systems that effectively utilize asynchronous mechanisms supported by operating systems. When an application invokes an asynchronous operation, the OS performs the operation on behalf of the application. This allows the application to have multiple operations running simultaneously without requiring the application to have a corresponding number of threads. Therefore, the Proactor pattern simplifies concurrent programming and improves performance by requiring fewer threads and leveraging OS support for asynchronous operations.

1 Intent

The Proactor pattern supports the demultiplexing and dispatching of multiple event handlers, which are triggered by the completion of asynchronous events. This pattern simplifies asynchronous application development by integrating the demultiplexing of completion events and the dispatching of their corresponding event handlers.

2 Motivation

This section provides the context and motivation for using the Proactor pattern.

2.1 Context and Forces

The Proactor pattern should be applied when applications require the performance benefits of executing operations concurrently, without being constrained by synchronous multi-threaded or reactive programming. To illustrate this, consider networking applications that need to perform multiple operations concurrently. For example, a high-performance Web server must concurrently process HTTP requests sent from multiple clients [1]. Figure 1 shows a typical interaction between Web browsers and a Web server. When a user instructs a browser to open a URL, the browser sends an HTTP GET request to the Web server. Upon receipt, the server parses and validates the request and sends the specified file(s) back to the browser.

Developing high-performance Web servers requires the resolution of the following forces:

• Concurrency: The server must perform multiple client requests simultaneously;

• Efficiency: The server must minimize latency, maximize throughput, and avoid utilizing the CPU(s) unnecessarily;

• Programming simplicity: The design of the server should simplify the use of efficient concurrency strategies;

• Adaptability: Integrating new or improved transport protocols (such as HTTP 1.1 [2]) should incur minimal maintenance costs.

A Web server can be implemented using several concurrency strategies, including multiple synchronous threads, reactive synchronous event dispatching, and proactive asynchronous event dispatching. Below, we examine the drawbacks of conventional approaches and explain how the Proactor pattern provides a powerful technique that supports an efficient and flexible asynchronous event dispatching strategy for high-performance concurrent applications.

1 Alternative point of contact is thomas [email protected].
2 This research is supported in part by a grant from Siemens MED.

Figure 1: Typical Web Server Communication Software Architecture

2.2 Common Traps and Pitfalls of Conventional Concurrency Models

Synchronous multi-threading and reactive programming are common ways of implementing concurrency. This section describes the shortcomings of these programming models.

2.2.1 Concurrency Through Multiple Synchronous Threads

Perhaps the most intuitive way to implement a concurrent Web server is to use synchronous multi-threading. In this model, multiple server threads process HTTP GET requests from multiple clients simultaneously. Each thread performs connection establishment, HTTP request reading, request parsing, and file transfer operations synchronously. As a result, each operation blocks until it completes.

The primary advantage of synchronous threading is the simplification of application code. In particular, the operations performed by a Web server to service client A's request are mostly independent of the operations required to service client B's request. Thus, it is easy to service different requests in separate threads because the amount of state shared between the threads is low, which minimizes the need for synchronization. Moreover, executing application logic in separate threads allows developers to utilize intuitive sequential commands and blocking operations.

Figure 2 shows how a Web server designed using synchronous threads can process multiple clients concurrently. This figure shows a Sync Acceptor object that encapsulates the server-side mechanism for synchronously accepting network connections. The sequence of steps that each thread uses to service an HTTP GET request using a Thread Pool concurrency model can be summarized as follows:

1. Each thread synchronously blocks in the accept socket call waiting for a client connection request;
2. A client connects to the server, and the connection is accepted;
3. The new client's HTTP request is synchronously read from the network connection;
4. The request is parsed;
5. The requested file is synchronously read;
6. The file is synchronously sent to the client.

Figure 2: Multi-threaded Web Server Architecture

A C++ code example that applies the synchronous threading model to a Web server appears in Appendix A.1.

As described above, each concurrently connected client is serviced by a dedicated server thread. The thread completes a requested operation synchronously before servicing other HTTP requests. Therefore, to perform synchronous I/O while servicing multiple clients, the Web server must spawn multiple threads. Although this synchronous multi-threaded model is intuitive and maps relatively efficiently onto multi-CPU platforms, it has the following drawbacks:

• Threading policy is tightly coupled to the concurrency policy: This architecture requires a dedicated thread for each connected client. A concurrent application may be better optimized by aligning its threading strategy to available resources (such as the number of CPUs via a Thread Pool) rather than to the number of clients being serviced concurrently;

• Increased synchronization complexity: Threading can increase the complexity of the synchronization mechanisms necessary to serialize access to a server's shared resources (such as cached files and logging of Web page hits);

Figure 3: Client Connects to Reactive Web Server

Figure 4: Client Sends HTTP Request to Reactive Web Server

• Increased performance overhead: Threading can perform poorly due to context switching, synchronization, and data movement among CPUs [3];

• Non-portability: Threading may not be available on all OS platforms. Moreover, OS platforms differ widely in terms of their support for pre-emptive and non-pre-emptive threads. Consequently, it is hard to build multi-threaded servers that behave uniformly across OS platforms.

As a result of these drawbacks, multi-threading is often neither the most efficient nor the least complex solution for developing concurrent Web servers.

2.2.2 Concurrency Through Reactive Synchronous Event Dispatching

Another common way to implement a synchronous Web server is to use a reactive event dispatching model. The Reactor pattern [4] describes how applications can register Event Handlers with an Initiation Dispatcher. The Initiation Dispatcher notifies the Event Handler when it is possible to initiate an operation without blocking.

A single-threaded concurrent Web server can use a reactive event dispatching model that waits in an event loop for a Reactor to notify it to initiate appropriate operations. An example of a reactive operation in the Web server is the registration of an Acceptor [5] with the Initiation Dispatcher. When data arrives on the network connection, the dispatcher calls back the Acceptor. The Acceptor accepts the network connection and creates an HTTP Handler. This HTTP Handler then registers with the Reactor to process the incoming URL request on that connection in the Web server's single thread of control.

Figures 3 and 4 show how a Web server designed using reactive event dispatching handles multiple clients. Figure 3 shows the steps taken when a client connects to the Web server. Figure 4 shows how the Web server processes a client request. The sequence of steps for Figure 3 can be summarized as follows:

1. The Web Server registers an Acceptor with the Initiation Dispatcher to accept new connections;
2. The Web Server invokes the event loop of the Initiation Dispatcher;
3. A client connects to the Web Server;
4. The Acceptor is notified by the Initiation Dispatcher of the new connection request and the Acceptor accepts the new connection;
5. The Acceptor creates an HTTP Handler to service the new client;
6. The HTTP Handler registers the connection with the Initiation Dispatcher for reading client request data (that is, when the connection becomes "ready for reading");
7. The HTTP Handler services the request from the new client.

Figure 4 shows the sequence of steps that the reactive Web Server takes to service an HTTP GET request. This process is described below:

1. The client sends an HTTP GET request;
2. The Initiation Dispatcher notifies the HTTP Handler when client request data arrives at the server;
3. The request is read in a non-blocking manner such that the read operation returns EWOULDBLOCK if the operation would cause the calling thread to block (steps 2 and 3 repeat until the request has been completely read);
4. The HTTP Handler parses the HTTP request;
5. The requested file is synchronously read from the file system;
6. The HTTP Handler registers the connection with the Initiation Dispatcher for sending file data (that is, when the connection becomes "ready for writing");
7. The Initiation Dispatcher notifies the HTTP Handler when the TCP connection is ready for writing;
8. The HTTP Handler sends the requested file to the client in a non-blocking manner such that the write operation returns EWOULDBLOCK if the operation would cause the calling thread to block (steps 7 and 8 will repeat until the data has been delivered completely).

A C++ code example that applies the reactive event dispatching model to a Web server appears in Appendix A.2.

Since the Initiation Dispatcher runs in a single thread, network I/O operations are run under control of the Reactor in a non-blocking manner. If forward progress is stalled on the current operation, the operation is handed off to the Initiation Dispatcher, which monitors the status of the system operation. When the operation can make forward progress again, the appropriate Event Handler is notified.

The main advantages of the reactive model are portability, low overhead due to coarse-grained concurrency control (that is, single-threading requires no synchronization or context switching), and modularity via the decoupling of application logic from the dispatching mechanism. However, this approach has the following drawbacks:

• Complex programming: As seen from the list above, programmers must write complicated logic to make sure that the server does not block while servicing a particular client.

• Lack of OS support for multi-threading: Most operating systems implement the reactive dispatching model through the select system call [6]. However, select does not allow more than one thread to wait in the event loop on the same descriptor set. This makes the reactive model unsuitable for high-performance applications since it does not utilize hardware parallelism effectively.

• Scheduling of runnable tasks: In synchronous multi-threading architectures that support pre-emptive threads, it is the operating system's responsibility to schedule and time-slice the runnable threads onto the available CPUs. This scheduling support is not available in reactive architectures since there is only one thread in the application. Therefore, developers of the system must be careful to time-share the thread between all the clients connected to the Web server. This can be accomplished by only performing short-duration, non-blocking operations.

As a result of these drawbacks, reactive event dispatching is not the most efficient model when hardware parallelism is available. This model also has a relatively high level of programming complexity due to the need to avoid blocking I/O.

2.3 Solution: Concurrency Through Proactive Operations

When the OS platform supports asynchronous operations, an efficient and convenient way to implement a high-performance Web server is to use proactive event dispatching. Web servers designed using a proactive event dispatching model handle the completion of asynchronous operations with one or more threads of control. Thus, the Proactor pattern simplifies asynchronous Web servers by integrating completion event demultiplexing and event handler dispatching.

An asynchronous Web server would utilize the Proactor pattern by first having the Web server issue an asynchronous operation to the OS and registering a callback with a Completion Dispatcher that will notify the Web server when the operation completes. The OS then performs the operation on behalf of the Web server and subsequently queues the result in a well-known location. The Completion Dispatcher is responsible for dequeueing completion notifications and executing the appropriate callback that contains application-specific Web server code.

Figures 5 and 6 show how a Web server designed using proactive event dispatching handles multiple clients concurrently within one or more threads. Figure 5 shows the sequence of steps taken when a client connects to the Web Server:

1. The Web Server instructs the Acceptor to initiate an asynchronous accept;
2. The Acceptor initiates an asynchronous accept with the Operating System and passes itself as a Completion Handler and a reference to the Completion Dispatcher that will be used to notify the Acceptor upon completion of the asynchronous accept;
3. The Web Server invokes the event loop of the Completion Dispatcher;
4. The client connects to the Web Server;
5. When the asynchronous accept operation completes, the Operating System notifies the Completion Dispatcher;
6. The Completion Dispatcher notifies the Acceptor;
7. The Acceptor creates an HTTP Handler;
8. The HTTP Handler initiates an asynchronous operation to read the request data from the client and passes itself as a Completion Handler and a reference to the Completion Dispatcher that will be used to notify the HTTP Handler upon completion of the asynchronous read.

Figure 5: Client connects to a Proactor-based Web Server

Figure 6 shows the sequence of steps that the proactive Web Server takes to service an HTTP GET request. These steps are explained below:

1. The client sends an HTTP GET request;
2. The read operation completes and the Operating System notifies the Completion Dispatcher;
3. The Completion Dispatcher notifies the HTTP Handler (steps 2 and 3 will repeat until the entire request has been received);
4. The HTTP Handler parses the request;
5. The HTTP Handler synchronously reads the requested file;
6. The HTTP Handler initiates an asynchronous operation to write the file data to the client connection and passes itself as a Completion Handler and a reference to the Completion Dispatcher that will be used to notify the HTTP Handler upon completion of the asynchronous write;
7. When the write operation completes, the Operating System notifies the Completion Dispatcher;
8. The Completion Dispatcher then notifies the Completion Handler (steps 6-8 continue until the file has been delivered completely).

Figure 6: Client request to a Proactor-based Web Server

A C++ code example that applies the proactive event dispatching model to a Web server appears in Section 8.

The primary advantage of using the Proactor pattern is that multiple concurrent operations can be started and can run in parallel without necessarily requiring the application to have multiple threads. The operations are started asynchronously by the application and they run to completion within the I/O subsystem of the OS. The thread that initiated the operation is then available to service additional requests. For instance, in the example above, the Completion Dispatcher could be single-threaded. When HTTP requests arrive, the single Completion Dispatcher thread parses the request, reads the file, and sends the response to the client. Since the response is sent asynchronously, multiple responses could potentially be sent simultaneously. Moreover, the synchronous file read could be replaced with an asynchronous file read to further increase the potential for concurrency. If the file read is performed asynchronously, the only synchronous operation performed by an HTTP Handler is the HTTP protocol request parsing.

The primary drawback of the proactive model is that the programming logic is at least as complicated as in the reactive model. Moreover, the Proactor pattern can be difficult to debug since asynchronous operations are often non-deterministic. Section 7 describes how to apply other patterns (such as the Asynchronous Completion Token [7]) to simplify the asynchronous application programming model.

3 Applicability

Use the Proactor pattern when one or more of the following conditions hold:

• An application needs to perform one or more asynchronous operations without blocking the calling thread;

• The application must be notified when asynchronous operations complete;

• The application needs to vary its concurrency strategy independently of its I/O model;

• The application will benefit by decoupling the application-dependent logic from the application-independent infrastructure;

• An application will perform poorly or fail to meet its performance requirements when utilizing either the multi-threaded approach or the reactive dispatching approach.

4 Structure and Participants

The structure of the Proactor pattern is illustrated in Figure 7 (OMT notation). The key participants in the Proactor pattern include the following:

Figure 7: Participants in the Proactor Pattern

Figure 8: Interaction Diagram for the Proactor Pattern

Proactive Initiator (the Web server application's main thread):

• A Proactive Initiator is any entity in the application that initiates an Asynchronous Operation. The Proactive Initiator registers a Completion Handler and a Completion Dispatcher with an Asynchronous Operation Processor, which notifies it when the operation completes.

Completion Handler (the Acceptor and HTTP Handler):

• The Proactor pattern uses Completion Handler interfaces that are implemented by the application for Asynchronous Operation completion notification.

Asynchronous Operation (the methods Async Read, Async Write, and Async Accept):

• Asynchronous Operations are used to execute requests (such as I/O and timer operations) on behalf of applications. When applications invoke Asynchronous Operations, the operations are performed without borrowing the application's thread of control.3 Therefore, from the application's perspective, the operations are performed asynchronously. When Asynchronous Operations complete, the Asynchronous Operation Processor delegates application notifications to a Completion Dispatcher.

Asynchronous Operation Processor (the Operating System):

• Asynchronous Operations are run to completion by the Asynchronous Operation Processor. This component is typically implemented by the OS.

Completion Dispatcher (the Notification Queue):

• The Completion Dispatcher is responsible for calling back to the application's Completion Handlers when Asynchronous Operations complete. When the Asynchronous Operation Processor completes an asynchronously initiated operation, the Completion Dispatcher performs an application callback on its behalf.

3 In contrast, the reactive event dispatching model steals the application's thread of control to perform the operation synchronously.

5 Collaborations

There are several well-defined steps that occur for all Asynchronous Operations. At a high level of abstraction, applications initiate operations asynchronously and are notified when the operations complete. Figure 8 shows the following interactions that must occur between the pattern participants:

1. Proactive Initiator initiates operation: To perform asynchronous operations, the application initiates the operation on the Asynchronous Operation Processor. For instance, a Web server might ask the OS to transmit a file over the network using a particular socket connection. To request such an operation, the Web server must specify which file and network connection to use. Moreover, the Web server must specify (1) which Completion Handler to notify when the operation completes and (2) which Completion Dispatcher should perform the callback once the file is transmitted.

2. Asynchronous Operation Processor performs operation: When the application invokes operations on the Asynchronous Operation Processor, it runs them asynchronously with respect to other application operations. Modern operating systems (such as Solaris and Windows NT) provide asynchronous I/O subsystems within the kernel.

3. The Asynchronous Operation Processor notifies the Completion Dispatcher: When operations complete, the Asynchronous Operation Processor retrieves the Completion Handler and Completion Dispatcher that were specified when the operation was initiated. The Asynchronous Operation Processor then passes the Completion Dispatcher the result of the Asynchronous Operation and the Completion Handler to call back. For instance, if a file was transmitted asynchronously, the Asynchronous Operation Processor may report the completion status (such as success or failure), as well as the number of bytes written to the network connection.

4. Completion Dispatcher notifies the application: The Completion Dispatcher calls the completion hook on the Completion Handler, passing it any completion data specified by the application. For instance, if an asynchronous read completes, the Completion Handler will typically be passed a pointer to the newly arrived data.

6 Consequences

This section details the consequences of using the Proactor pattern.

6.1 Benefits

The Proactor pattern offers the following benefits:

Increased separation of concerns: The Proactor pattern decouples application-independent asynchrony mechanisms from application-specific functionality. The application-independent mechanisms become reusable components that know how to demultiplex the completion events associated with Asynchronous Operations and dispatch the appropriate callback methods defined by the Completion Handlers. Likewise, the application-specific functionality knows how to perform a particular type of service (such as HTTP processing).

Improved application logic portability: The pattern improves application portability by allowing its interface to be reused independently of the underlying OS calls that perform event demultiplexing. These system calls detect and report the events that may occur simultaneously on multiple event sources. Event sources may include I/O ports, timers, synchronization objects, signals, etc. On real-time POSIX platforms, the asynchronous I/O functions are provided by the aio family of APIs [8]. In Windows NT, I/O completion ports and overlapped I/O are used to implement asynchronous I/O [9].

The Completion Dispatcher encapsulates the concurrency mechanism: A benefit of decoupling the Completion Dispatcher from the Asynchronous Operation Processor is that applications can configure Completion Dispatchers with various concurrency strategies without affecting other participants. As discussed in Section 7, the Completion Dispatcher can be configured to use several concurrency strategies, including single-threaded and Thread Pool solutions.

Threading policy is decoupled from the concurrency policy: Since the Asynchronous Operation Processor completes potentially long-running operations on behalf of Proactive Initiators, applications are not forced to spawn threads to increase concurrency. This allows an application to vary its concurrency policy independently of its threading policy. For instance, a Web server may only want to have one thread per CPU, but may want to service a higher number of clients simultaneously.

Increased performance: Multi-threaded operating systems perform context switches to cycle through multiple threads of control. While the time to perform a context switch remains fairly constant, the total time to cycle through a large number of threads can degrade application performance significantly if the OS context switches to an idle thread. For instance, threads may poll the OS for completion status, which is inefficient. The Proactor pattern can avoid the cost of context switching by activating only those logical threads of control that have events to process. For instance, a Web server does not need to activate an HTTP Handler if there is no pending GET request.

Simplification of application synchronization: As long as Completion Handlers do not spawn additional threads of control, application logic can be written with little or no regard to synchronization issues. Completion Handlers can be written as if they existed in a conventional single-threaded environment. For instance, a Web server's HTTP GET Handler can access the disk through an Async Read operation (such as the Windows NT TransmitFile function [1]).

6.2 Drawbacks

The Proactor pattern has the following drawbacks:

Hard to debug: Applications written with the Proactor pattern can be hard to debug since the inverted flow of control oscillates between the framework infrastructure and the method callbacks on application-specific handlers. This increases the difficulty of "single-stepping" through the run-time behavior of a framework within a debugger, since application developers may not understand or have access to the framework code. This is similar to the problems encountered trying to debug a compiler lexical analyzer and parser written with LEX and YACC. In these applications, debugging is straightforward when the thread of control is within the user-defined action routines. Once the thread of control returns to the generated Deterministic Finite Automata (DFA) skeleton, however, it is hard to follow the program logic.

Scheduling and controlling outstanding operations: Proactive Initiators may have no control over the order in which Asynchronous Operations are executed. Therefore, the Asynchronous Operation Processor must be designed carefully to support prioritization and cancellation of Asynchronous Operations.

7 Implementation

The Proactor pattern can be implemented in many ways. This section discusses the steps involved in implementing the Proactor pattern.

7.1 Implementing the Asynchronous Operation Processor

The first step in implementing the Proactor pattern is building the Asynchronous Operation Processor. The Asynchronous Operation Processor is responsible for executing operations asynchronously on behalf of applications. As a result, its two primary responsibilities are exporting Asynchronous Operation APIs and implementing an Asynchronous Operation Engine to do the work.

7.1.1 Define Asynchronous Operation APIs

The Asynchronous Operation Processor must provide an API that allows applications to request Asynchronous Operations. There are several forces to be considered when designing these APIs:

 Portability: The APIs should not tie an application nor its Proactive Initiators to a particular platform.

 Flexibility: Often, asynchronous APIs can be shared for many types of operations. For instance, asynchronous I/O operations can often be used to perform I/O on multiple mediums (such as network and files). It may be beneficial to design APIs that support such reuse.

 Callbacks: The Proactive Initiators must register a callback when the operation is invoked. A common approach to implementing callbacks is to have the calling objects (clients) export an interface known by the caller (server). Therefore, Proactive Initiators must inform the Asynchronous Operation Processor which object's Completion Handler should be called back when an operation completes.

 Completion Dispatcher: Since an application may use multiple Completion Dispatchers, the Proactive Initiator also must indicate which Completion Dispatcher should perform the callback.

Given all of these concerns, consider the following API for asynchronous reads and writes. The Asynch_Stream class is a factory for initiating asynchronous reads and writes. Once constructed, multiple asynchronous reads and writes can be started using this class. An Asynch_Stream::Read_Result will be passed back to the handler when the asynchronous read completes via the handle_read callback on the Completion_Handler. Similarly, an Asynch_Stream::Write_Result will be passed back to the handler when the asynchronous write completes via the handle_write callback on the Completion_Handler.

class Asynch_Stream
  // = TITLE
  //   A Factory for initiating reads
  //   and writes asynchronously.
{
  // Initializes the factory with information
  // which will be used with each asynchronous
  // call.  The <handler> is notified when the
  // operation completes.  The asynchronous
  // operations are performed on the <handle>
  // and the results of the operations are
  // sent to the <Completion_Dispatcher>.
  Asynch_Stream (Completion_Handler &handler,
                 HANDLE handle,
                 Completion_Dispatcher *);

  // This starts off an asynchronous read.
  // Up to <bytes_to_read> will be read and
  // stored in the <message_block>.
  int read (Message_Block &message_block,
            u_long bytes_to_read,
            const void *act = 0);

  // This starts off an asynchronous write.
  // Up to <bytes_to_write> will be written
  // from the <message_block>.
  int write (Message_Block &message_block,
             u_long bytes_to_write,
             const void *act = 0);

  ...
};

7.1.2 Implement the Asynchronous Operation Engine

The Asynchronous Operation Processor must contain a mechanism that performs the operations asynchronously. In other words, when an application thread invokes an Asynchronous Operation, the operation must be performed without borrowing the application's thread of control. Fortunately, modern operating systems provide mechanisms for Asynchronous Operations (for example, POSIX asynchronous I/O and WinNT overlapped I/O). When this is the case, implementing this part of the pattern simply requires mapping the platform APIs to the Asynchronous Operation APIs described above.

If the OS platform does not provide support for Asynchronous Operations, there are several implementation techniques that can be used to build an Asynchronous Operation Engine. Perhaps the most intuitive solution is to use dedicated threads to perform the Asynchronous Operations for applications. To implement a threaded Asynchronous Operation Engine, there are three primary steps:

1. Operation invocation: Because the operation will be performed in a different thread of control from the invoking application thread, some type of thread synchronization must occur. One approach would be to spawn a thread for each operation. A more common approach is for the Asynchronous Operation Processor to control a pool of dedicated threads. This approach would require that the application thread queue the operation request before continuing with other application computations.

2. Operation execution: Since the operation will be performed in a dedicated thread, it can perform "blocking" operations without directly impeding progress of the application. For instance, when providing a mechanism for asynchronous I/O reads, the dedicated thread can block while reading from socket or file handles.

3. Operation completion: When the operation completes, the application must be notified. In particular, the dedicated thread must delegate application-specific notifications to the Completion Dispatcher. This will require additional synchronization between threads.

7.2 Implementing the Completion Dispatcher

The Completion Dispatcher calls back to the Completion Handler that is associated with the application objects when it receives operation completions from the Asynchronous Operation Processor. There are two issues involved with implementing the Completion Dispatcher: (1) implementing callbacks and (2) defining the concurrency strategy used to perform the callbacks.

7.2.1 Implementing Callbacks

The Completion Dispatcher must implement a mechanism through which Completion Handlers are invoked. This requires Proactive Initiators to specify a callback when initiating operations. The most common way is to establish some form of callback interface. The following are common callback alternatives:

Callback class: The Completion Handler exports an interface known by the Completion Dispatcher. The Completion Dispatcher calls back on a method in this interface when the operation completes and passes it information about the completed operation (such as the number of bytes read from the network connection).

Function pointer: The Completion Dispatcher invokes the Completion Handler via a callback function pointer. This approach effectively breaks the knowledge dependency between the Completion Dispatcher and the Completion Handler. This has two benefits:

1. The Completion Handler is not forced to export a specific interface; and

2. There is no need for compile-time dependencies between the Completion Dispatcher and the Completion Handler.

Rendezvous: The Proactive Initiator can establish an event object or a condition variable, which serves as a rendezvous between the Completion Dispatcher and the Completion Handler. This is most common when the Completion Handler is the Proactive Initiator. While the Asynchronous Operation runs to completion, the Completion Handler processes other activity. Periodically, the Completion Handler will check at the rendezvous point for completion status.

7.2.2 Defining Completion Dispatcher Concurrency Strategies

A Completion Dispatcher will be notified by the Asynchronous Operation Processor when operations complete. At this point, the Completion Dispatcher can utilize one of the following concurrency strategies to perform the application callback:

Dynamic-thread dispatching: A thread can be dynamically allocated for each Completion Handler by the Completion Dispatcher. Dynamic-thread dispatching can be implemented with most multi-threaded operating systems. On some platforms, this may be the least efficient technique of those listed for Completion Dispatcher implementations due to the overhead of creating and destroying thread resources.

Post-reactive dispatching: An event object or condition variable established by the Proactive Initiator can be signaled by the Completion Dispatcher. Although polling and spawning a child thread that blocks on the event object are options, the most efficient method for Post-reactive dispatching is to register the event with a Reactor. Post-reactive dispatching can be implemented with aio_suspend in POSIX real-time environments and with WaitForMultipleObjects in Win32 environments.

Call-through dispatching: The thread of control from the Asynchronous Operation Processor can be borrowed by the Completion Dispatcher to execute the Completion Handler. This "cycle stealing" strategy can increase performance by decreasing the incidence of idle threads. In the cases where older operating systems will context switch to idle threads just to switch back out of them, this approach has a great potential of reclaiming "lost" time.

Call-through dispatching can be implemented in Windows NT using the ReadFileEx and WriteFileEx Win32 functions. For example, a thread of control can use these calls to wait on a semaphore to become signaled. When it waits, the thread informs the OS that it is entering into a special state known as an "alertable wait state." At this point, the OS can seize control of the waiting thread of control's stack and associated resources in order to execute the Completion Handler.

Thread Pool dispatching: A pool of threads owned by the Completion Dispatcher can be used for Completion Handler execution. Each thread of control in the pool has been dynamically allocated to an available CPU. Thread pool dispatching can be implemented with Windows NT's I/O Completion Ports.

When considering the applicability of the Completion Dispatcher techniques described above, consider the possible combinations of OS environments and physical hardware shown in Table 1.
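To make the Thread Pool dispatching strategy concrete, the following sketch uses standard C++ threads rather than the platform APIs named above; all names here (Completion_Dispatcher::post, the Completion struct) are illustrative stand-ins, not ACE or Win32 interfaces. Worker threads block on a queue of completed operations and invoke the associated callback on whichever thread becomes available:

```cpp
#include <atomic>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// A completed operation: the completion callback plus its result.
struct Completion {
  std::function<void(long)> handler;   // e.g. handle_read_stream
  long bytes_transferred;
};

// Thread-pool dispatcher (illustrative): workers block on a queue
// of completions and run each callback on an available thread.
class Completion_Dispatcher {
public:
  explicit Completion_Dispatcher (int n_threads) {
    for (int i = 0; i < n_threads; ++i)
      workers_.emplace_back ([this] { run (); });
  }

  // Called by the Asynchronous Operation Processor when an
  // operation finishes.
  void post (Completion c) {
    std::lock_guard<std::mutex> g (lock_);
    queue_.push (std::move (c));
    ready_.notify_one ();
  }

  // Drain the queue, then stop the workers.
  ~Completion_Dispatcher () {
    {
      std::lock_guard<std::mutex> g (lock_);
      done_ = true;
    }
    ready_.notify_all ();
    for (auto &t : workers_)
      t.join ();
  }

private:
  void run () {
    for (;;) {
      std::unique_lock<std::mutex> g (lock_);
      ready_.wait (g, [this] { return done_ || !queue_.empty (); });
      if (queue_.empty ())
        return;                         // done_ set and queue drained
      Completion c = std::move (queue_.front ());
      queue_.pop ();
      g.unlock ();
      c.handler (c.bytes_transferred);  // perform the callback
    }
  }

  std::mutex lock_;
  std::condition_variable ready_;
  std::queue<Completion> queue_;
  std::vector<std::thread> workers_;
  bool done_ = false;
};

// Post ten simulated completions and verify all callbacks ran.
long demo () {
  std::atomic<long> total{0};
  {
    Completion_Dispatcher dispatcher (4);
    for (long i = 0; i < 10; ++i)
      dispatcher.post (Completion{[&total] (long n) { total += n; }, i});
  } // destructor joins the workers, so every callback has run
  return total.load ();
}
```

Real implementations replace the in-process queue with the OS completion queue (for example, an I/O Completion Port), but the synchronization structure is the same.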

                         System Type
  Threading model   Single-processor  Multi-processor
  Single-threaded          A                 B
  Multi-threaded           C                 D

Table 1: Completion Dispatcher Concurrency Strategies

If your OS only supports synchronous I/O, then refer to the Reactor pattern [4]. However, most modern operating systems support some form of asynchronous I/O.

In combinations A and B from Table 1, the Post-reactive approach to asynchronous I/O is probably the best, assuming you are not waiting on any semaphores or mutexes. If you are, a Call-through implementation may be more responsive. In combination C, use a Call-through approach. In combination D, use a Thread Pool approach. In practice, systematic empirical measurements are necessary to select the most appropriate alternative.

7.3 Implementing Completion Handlers

The implementation of Completion Handlers raises the following concerns.

7.3.1 State Integrity

A Completion Handler may need to maintain state information concerning a specific request. For instance, the OS may notify the Web server that only part of a file was written to the network communication port. As a result, a Completion Handler may need to reissue the request until the file is fully written or the connection becomes invalid. Therefore, it must know the file that was originally specified, how many bytes are left to write, and what the file pointer position was at the start of the previous request.

There is no implicit limitation that prevents Proactive Initiators from assigning multiple Asynchronous Operation requests to a single Completion Handler. As a result, the Completion Handler must tie request-specific state information throughout the chain of completion notifications. To do this, Completion Handlers can utilize the Asynchronous Completion Token pattern [7].

7.3.2 Resource Management

As with any multi-threaded environment, the Proactor pattern does not alleviate Completion Handlers from ensuring that access to shared resources is thread-safe. However, a Completion Handler must not hold onto a shared resource across multiple completion notifications. If it does, it risks invoking the dining philosophers problem [10]. This problem is the deadlock that results when a logical thread of control waits forever for a semaphore to become signaled. It is illustrated by imagining a dinner party attended by a group of philosophers. The diners are seated around a circular table with exactly one chopstick between each philosopher. When a philosopher becomes hungry, he must obtain the chopstick to his left and to his right in order to eat. Once philosophers obtain a chopstick, they will not release it until their hunger is satisfied. If all philosophers pick up the chopstick on their right, a deadlock occurs because the chopstick on the left will never become available.

7.3.3 Preemptive Policy

The Completion Dispatcher type determines if a Completion Handler can be preempted while executing. When attached to Dynamic-thread and Thread Pool dispatchers, Completion Handlers are naturally preemptive. However, when tied to a Post-reactive Completion Dispatcher, Completion Handlers are not preemptive with respect to each other. When driven by a Call-through dispatcher, the Completion Handlers are not preemptive with respect to the thread of control that is in the alertable wait state.

In general, a handler should not perform long-duration synchronous operations unless multiple completion threads are used, since this will significantly decrease the overall responsiveness of the application. This risk can be alleviated by increased programming discipline. For instance, all Completion Handlers are required to act as Proactive Initiators instead of executing synchronous operations.

8 Sample Code

This section shows how to use the Proactor pattern to develop a Web server. The example is based on the Proactor pattern implementation in the ACE framework [3].

When a client connects to the Web server, the HTTP_Handler's open method is called. The server then initializes the asynchronous I/O object with the object to call back when the Asynchronous Operation completes (which in this case is this), the network connection for transferring the data, and the Completion Dispatcher to be used once the operation completes (proactor_). The read operation is then started asynchronously and the server returns to the event loop.

The HTTP_Handler::handle_read_stream method is called back by the dispatcher when the asynchronous read operation completes. If there is enough data, the client request is then parsed. If the entire client request has not arrived yet, another read operation is initiated asynchronously.

In response to a GET request, the server memory-maps the requested file and writes the file data asynchronously to the client. The dispatcher calls back on HTTP_Handler::handle_write_stream when the write operation completes, which frees up dynamically allocated resources.
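The reissue-on-partial-write logic performed by handle_write_stream depends on the request state described in Section 7.3.1. A minimal sketch of that bookkeeping follows, in standard C++; the struct fields and names are illustrative, mirroring the act parameter of the Asynch_Stream API rather than any real ACE class:

```cpp
#include <cstddef>
#include <string>

// Request-specific state a Completion Handler must retain across
// completion notifications (fields follow Section 7.3.1).
struct Write_State {
  std::string filename;         // the file originally specified
  std::size_t bytes_remaining;  // how many bytes are left to write
  std::size_t file_offset;      // file pointer position so far
};

// A completed write as the Completion Dispatcher might report it:
// the byte count plus the Asynchronous Completion Token supplied
// when the operation was initiated (cf. the `act` parameter of
// Asynch_Stream::write).
struct Write_Result {
  std::size_t bytes_transferred;
  void *act;
};

// The handler recovers its per-request state from the token and
// decides whether the write must be reissued.
bool handle_write (const Write_Result &result) {
  Write_State *state = static_cast<Write_State *> (result.act);
  state->bytes_remaining -= result.bytes_transferred;
  state->file_offset += result.bytes_transferred;
  return state->bytes_remaining == 0;  // true once fully written
}

// Simulate a partial write followed by the remainder.
bool demo () {
  Write_State state{"index.html", 1000, 0};
  bool done_after_partial = handle_write (Write_Result{400, &state});
  bool done_after_rest = handle_write (Write_Result{600, &state});
  return !done_after_partial && done_after_rest
         && state.file_offset == 1000;
}
```

Because the token travels with each operation, a single Completion Handler can serve many outstanding requests without any lookup table.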

The Appendix contains two other code examples for implementing the Web server using a synchronous threaded model and a synchronous (non-blocking) reactive model.

class HTTP_Handler
  : public Proactor::Event_Handler
  // = TITLE
  //   Implements the HTTP protocol
  //   (asynchronous version).
  //
  // = PATTERN PARTICIPANTS
  //   Proactive Initiator = HTTP_Handler
  //   Asynch Op = Network I/O
  //   Asynch Op Processor = OS
  //   Completion Dispatcher = Proactor
  //   Completion Handler = HTTP_Handler
{
public:
  void open (Socket_Stream *client)
  {
    // Initialize state for request
    request_.state_ = INCOMPLETE;

    // Store reference to client.
    client_ = client;

    // Initialize asynch read stream
    stream_.open (*this,
                  client_->handle (),
                  proactor_);

    // Start read asynchronously.
    stream_.read (request_.buffer (),
                  request_.buffer_size ());
  }

  // This is called by the Proactor
  // when the asynch read completes
  void handle_read_stream
    (u_long bytes_transferred)
  {
    if (request_.enough_data (bytes_transferred))
      parse_request ();
    else
      // Start reading asynchronously.
      stream_.read (request_.buffer (),
                    request_.buffer_size ());
  }

  void parse_request (void)
  {
    // Switch on the HTTP command type.
    switch (request_.command ())
      {
      // Client is requesting a file.
      case HTTP_Request::GET:
        // Memory map the requested file.
        file_.map (request_.filename ());

        // Start writing asynchronously.
        stream_.write (file_.buffer (),
                       file_.buffer_size ());
        break;

      // Client is storing a file
      // at the server.
      case HTTP_Request::PUT:
        // ...
      }
  }

  void handle_write_stream
    (u_long bytes_transferred)
  {
    if (file_.enough_data (bytes_transferred))
      ; // Success....
    else
      // Start another asynchronous write
      stream_.write (file_.buffer (),
                     file_.buffer_size ());
  }

private:
  // Set at initialization.
  Proactor *proactor_;

  // Memory-mapped file.
  Mem_Map file_;

  // Socket endpoint.
  Socket_Stream *client_;

  // HTTP Request holder
  HTTP_Request request_;

  // Used for Asynch I/O
  Asynch_Stream stream_;
};

9 Known Uses

The following are known uses of the Proactor pattern:

I/O Completion Ports in Windows NT: The Windows NT operating system implements the Proactor pattern. Various Asynchronous Operations such as accepting new network connections, reading and writing to files and sockets, and transmission of files across a network connection are supported by Windows NT. The operating system is the Asynchronous Operation Processor. Results of the operations are queued up at the I/O completion port (which plays the role of the Completion Dispatcher).

ACE Proactor: The Adaptive Communication Environment (ACE) [3] implements a Proactor component that encapsulates I/O Completion Ports on Windows NT. The ACE Proactor abstraction provides an OO interface to the standard C APIs supported by Windows NT. The source code for this implementation can be acquired from the ACE website at www.cs.wustl.edu/~schmidt/ACE.html.

The UNIX AIO Family of Asynchronous I/O Operations: On some real-time POSIX platforms, the Proactor pattern is implemented by the aio family of APIs [8]. These OS features are very similar to the ones described above for Windows NT. One difference is that UNIX signals can be used to implement a truly asynchronous Completion Dispatcher (the Windows NT API is not truly asynchronous).

Asynchronous Procedure Calls in Windows NT: Some systems (such as Windows NT) support Asynchronous Procedure Calls (APCs). An APC is a function that executes asynchronously in the context of a particular thread. When an APC is queued to a thread, the system issues a software interrupt. The next time the thread is scheduled, it will run

the APC. APCs made by the operating system are called kernel-mode APCs. APCs made by an application are called user-mode APCs.

Business processes of mail order companies: A real-life example comes from the domain of mail order companies. In these companies, Marketing publishes well-known asynchronous APIs known as Mail Order Catalogs. A potential customer becomes a Proactive Initiator when they utilize the defined phone number to initiate an order operation asynchronously. At this time, the customer specifies the Completion Handler to be "cash on delivery" and the Completion Dispatcher to be "next-day-air." The mail order company's order fulfillment process is the Asynchronous Operation that would be executed on the customer's behalf. The Workflow Director/Workflow Management System is the Asynchronous Operation Processor that drives the order fulfillment process to completion. When the order is filled, the Workflow Director hands off the item ordered and the bill of sale to the pre-defined Completion Dispatcher.

Notice that an exception occurs if the customer does not have any money at the time of delivery. This exception is specific to the Completion Dispatcher chosen by the Proactive Initiator. This implies that Completion Dispatchers should be prepared to expect exceptional conditions. In this instance, the item is returned to the mail order company.

Natural instantiations of the Proactor pattern exist in most systems that employ asynchronous interaction across their system boundaries. In many of these systems, the system's actors are Proactive Initiators and the system's use-case scenarios are Asynchronous Operations.

10 Related Patterns

Figure 9 illustrates patterns that are related to the Proactor.

[Figure 9: Proactor Pattern's Related Patterns]

The Asynchronous Completion Token (ACT) pattern [7] is generally used in conjunction with the Proactor pattern. When Asynchronous Operations complete, applications may need more information than simply the notification itself to properly handle the event. The Asynchronous Completion Token pattern allows applications to efficiently associate state with the completion of Asynchronous Operations.

The Proactor pattern is related to the Observer pattern [11] (where dependents are updated automatically when a single subject changes). In the Proactor pattern, handlers are informed automatically when events from multiple sources occur. In general, the Proactor pattern is used to asynchronously demultiplex multiple sources of input to their associated event handlers, whereas an Observer is usually associated with only a single source of events.

The Proactor pattern can be considered an asynchronous variant of the synchronous Reactor pattern [4]. The Reactor pattern is responsible for demultiplexing and dispatching of multiple event handlers that are triggered when it is possible to initiate an operation synchronously without blocking. In contrast, the Proactor supports the demultiplexing and dispatching of multiple event handlers that are triggered by the completion of asynchronous events.

The Active Object pattern [12] decouples method execution from method invocation. The Proactor pattern is similar because Asynchronous Operation Processors perform operations on behalf of application Proactive Initiators. That is, both patterns can be used to implement Asynchronous Operations. The Proactor pattern is often used in place of the Active Object pattern to decouple the system's concurrency policy from the threading model.

A Proactor may be implemented as a Singleton [11]. This is useful for centralizing event demultiplexing and completion dispatching into a single location within an asynchronous application.

The Chain of Responsibility (COR) pattern [11] decouples event handlers from event sources. The Proactor pattern is similar in its segregation of Proactive Initiators and Completion Handlers. However, in COR, the event source has no prior knowledge of which handler will be executed, if any. In Proactor, Proactive Initiators have full knowledge of the target handler. However, the two patterns can be combined by establishing a Completion Handler that is the entry point into a responsibility chain dynamically configured by an external factory.

11 Concluding Remarks

The Proactor pattern embodies a powerful technique that supports both efficient and flexible event dispatching strategies for high-performance concurrent applications. The Proactor pattern provides the performance benefits of executing operations concurrently, without constraining the developer to synchronous multi-threaded or reactive programming.

References

[1] J. Hu, I. Pyarali, and D. C. Schmidt, "Measuring the Impact of Event Dispatching and Concurrency Models on Web Server Performance Over High-speed Networks," in Proceedings of the 2nd Global Internet Conference, IEEE, November 1997.

[2] J. C. Mogul, "The Case for Persistent-connection HTTP," in Proceedings of ACM SIGCOMM '95 Conference in Computer Communication Review, (Boston, MA, USA), pp. 299-314, ACM Press, August 1995.

[3] D. C. Schmidt, "ACE: an Object-Oriented Framework for Developing Distributed Applications," in Proceedings of the 6th USENIX C++ Technical Conference, (Cambridge, Massachusetts), USENIX Association, April 1994.

[4] D. C. Schmidt, "Reactor: An Object Behavioral Pattern for Concurrent Event Demultiplexing and Event Handler Dispatching," in Pattern Languages of Program Design (J. O. Coplien and D. C. Schmidt, eds.), pp. 529-545, Reading, MA: Addison-Wesley, 1995.

[5] D. C. Schmidt, "Acceptor and Connector: Design Patterns for Initializing Communication Services," in Pattern Languages of Program Design (R. Martin, F. Buschmann, and D. Riehle, eds.), Reading, MA: Addison-Wesley, 1997.

[6] M. K. McKusick, K. Bostic, M. J. Karels, and J. S. Quarterman, The Design and Implementation of the 4.4BSD Operating System. Addison-Wesley, 1996.

[7] I. Pyarali, T. H. Harrison, and D. C. Schmidt, "Asynchronous Completion Token: an Object Behavioral Pattern for Efficient Asynchronous Event Handling," in Pattern Languages of Program Design (R. Martin, F. Buschmann, and D. Riehle, eds.), Reading, MA: Addison-Wesley, 1997.

[8] "Information Technology – Portable Operating System Interface (POSIX) – Part 1: System Application Program Interface (API) [C Language]," 1995.

[9] Microsoft Developers Studio, Version 4.2 – Software Development Kit, 1996.

[10] E. W. Dijkstra, "Hierarchical Ordering of Sequential Processes," Acta Informatica, vol. 1, no. 2, pp. 115–138, 1971.

[11] E. Gamma, R. Helm, R. Johnson, and J. Vlissides, Design Patterns: Elements of Reusable Object-Oriented Software. Reading, MA: Addison-Wesley, 1995.

[12] R. G. Lavender and D. C. Schmidt, "Active Object: an Object Behavioral Pattern for Concurrent Programming," in Proceedings of the 2nd Annual Conference on the Pattern Languages of Programs, (Monticello, Illinois), pp. 1–7, September 1995.

A Alternative Implementations

This Appendix outlines the code used to develop alternatives to the Proactor pattern. Below, we examine both synchronous I/O using multi-threading and reactive I/O using single-threading.

A.1 Multiple Synchronous Threads

The following code shows how to use synchronous I/O with a pool of threads to develop a Web server. When a client connects to the server, a thread in the pool accepts the connection and calls the open method in class HTTP_Handler. The server then synchronously reads the request from the network connection. When the read operation completes, the client request is then parsed. In response to a GET request, the server memory-maps the requested file and writes the file data synchronously to the client. Note how blocking I/O allows the Web server to follow the steps outlined in Section 2.2.1.

class HTTP_Handler
  // = TITLE
  //   Implements the HTTP protocol
  //   (synchronous threaded version).
  //
  // = DESCRIPTION
  //   This class is called by a
  //   thread in the Thread Pool.
{
public:
  void open (Socket_Stream *client)
  {
    HTTP_Request request;

    // Store reference to client.
    client_ = client;

    // Synchronously read the HTTP request
    // from the network connection and
    // parse it.
    client_->recv (request);
    parse_request (request);
  }

  void parse_request (HTTP_Request &request)
  {
    // Switch on the HTTP command type.
    switch (request.command ())
      {
      // Client is requesting a file.
      case HTTP_Request::GET:
        // Memory map the requested file.
        Mem_Map input_file;
        input_file.map (request.filename ());

        // Synchronously send the file
        // to the client.  Block until the
        // file is transferred.
        client_->send (input_file.data (),
                       input_file.size ());
        break;

      // Client is storing a file at
      // the server.
      case HTTP_Request::PUT:
        // ...
      }
  }

private:
  // Socket endpoint.
  Socket_Stream *client_;
  // ...
};
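Since the class above relies on ACE framework types, the following self-contained sketch (standard C++ threads; the request strings and handle_request logic are purely illustrative stand-ins for real protocol handling) shows the same synchronous model in runnable form — each connection gets its own thread of control that blocks until its request is fully processed:

```cpp
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Synchronous request processing: every step may block, exactly as
// in the threaded HTTP_Handler above (the protocol is a stand-in).
std::string handle_request (const std::string &request) {
  if (request.rfind ("GET ", 0) == 0)     // parse the command type
    return "FILE:" + request.substr (4);  // "send" the named file
  return "ERROR";
}

// Thread-per-connection: one thread runs each blocking handler to
// completion, so a slow client does not stall the others.
std::vector<std::string>
serve (const std::vector<std::string> &requests) {
  std::vector<std::string> replies (requests.size ());
  std::vector<std::thread> pool;
  for (std::size_t i = 0; i < requests.size (); ++i)
    pool.emplace_back ([&replies, &requests, i] {
      replies[i] = handle_request (requests[i]);
    });
  for (auto &t : pool)
    t.join ();
  return replies;
}
```

The cost noted in the paper's introduction shows up here directly: concurrency is bought with one thread per outstanding request.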

A.2 Single-threaded Reactive Event Dispatching

The following code shows the use of the Reactor pattern to develop a Web server. When a client connects to the server, the HTTP_Handler::open method is called. The server registers the I/O handle and the object to call back (which in this case is this) when the network handle is "ready for reading." The server then returns to the event loop.

When the request data arrives at the server, the reactor calls back the HTTP_Handler::handle_input method. The client data is read in a non-blocking manner. If there is enough data, the client request is parsed. If the entire client request has not yet arrived, the application returns to the reactor event loop.

In response to a GET request, the server memory-maps the requested file and registers with the reactor to be notified when the network connection becomes "ready for writing." The reactor then calls back on the HTTP_Handler::handle_output method when writing data to the connection would not block the calling thread. When all the data has been sent to the client, the network connection is closed.

class HTTP_Handler :
  public Reactor::Event_Handler
  // = TITLE
  //   Implements the HTTP protocol
  //   (synchronous reactive version).
  //
  // = DESCRIPTION
  //   The Event_Handler base class
  //   defines the hooks for
  //   handle_input()/handle_output().
  //
  // = PATTERN PARTICIPANTS
  //   Reactor = Reactor
  //   Event Handler = HTTP_Handler
{
public:
  void open (Socket_Stream *client)
  {
    // Initialize state for request
    request_.state_ = INCOMPLETE;

    // Store reference to client.
    client_ = client;

    // Register with the reactor for reading.
    reactor_->register_handler
      (client_->handle (),
       this,
       Reactor::READ_MASK);
  }

  // This is called by the Reactor when
  // we can read from the client handle.
  void handle_input (void)
  {
    int result = 0;

    // Non-blocking read from the network
    // connection.
    do
      result = request_.recv (client_->handle ());
    while (result != SOCKET_ERROR
           && request_.state_ == INCOMPLETE);

    // No more progress possible,
    // blocking will occur
    if (request_.state_ == INCOMPLETE
        && errno == EWOULDBLOCK)
      reactor_->register_handler
        (client_->handle (),
         this,
         Reactor::READ_MASK);
    else
      // We now have the entire request
      parse_request ();
  }

  void parse_request (void)
  {
    // Switch on the HTTP command type.
    switch (request_.command ())
      {
      // Client is requesting a file.
      case HTTP_Request::GET:
        // Memory map the requested file.
        file_.map (request_.filename ());

        // Transfer the file using Reactive I/O.
        handle_output ();
        break;

      // Client is storing a file at
      // the server.
      case HTTP_Request::PUT:
        // ...
      }
  }

  void handle_output (void)
  {
    // Asynchronously send the file
    // to the client.
    if (client_->send (file_.data (),
                       file_.size ())
        == SOCKET_ERROR
        && errno == EWOULDBLOCK)
      ; // Register with reactor...
    else
      // Close down and release resources.
      handle_close ();
  }

private:
  // Set at initialization.
  Reactor *reactor_;

  // Memory-mapped file.
  Mem_Map file_;

  // Socket endpoint.
  Socket_Stream *client_;

  // HTTP Request holder.
  HTTP_Request request_;
};
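For comparison with the Proactor-based version in Section 8, the dispatching core of this reactive model can be reduced to the following sketch. It is standard C++, with all names illustrative; readiness is supplied by the caller so the example stays self-contained, whereas a real Reactor would obtain the ready set from select() or poll():

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

// A minimal single-threaded Reactor sketch: handlers are registered
// per handle and dispatched when that handle is reported ready.
class Mini_Reactor {
public:
  using Handler = std::function<void (int)>;

  void register_handler (int handle, Handler h) {
    handlers_[handle] = std::move (h);
  }

  void remove_handler (int handle) {
    handlers_.erase (handle);
  }

  // One iteration of the event loop: dispatch every registered
  // handler whose handle appears in the ready set.
  void handle_events (const std::vector<int> &ready) {
    for (int h : ready) {
      auto it = handlers_.find (h);
      if (it != handlers_.end ())
        it->second (h);  // single thread: handlers never preempt
    }
  }

private:
  std::map<int, Handler> handlers_;
};

// Register a read handler for handle 3 and drive two loop
// iterations; handle 5 has no handler and is ignored.
std::string demo () {
  Mini_Reactor reactor;
  std::string log;
  reactor.register_handler (3, [&log] (int h) {
    log += "read:" + std::to_string (h) + ";";
  });
  reactor.handle_events ({3, 5});
  reactor.handle_events ({3});
  return log;
}
```

The contrast with the Proactor is visible in where the work happens: here the handler performs the (non-blocking) I/O itself upon readiness, whereas in the Proactor the OS performs the I/O and the handler only consumes the completed result.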
