Software Architectures for Reducing Priority Inversion
and Non-determinism in Real-time Object Request Brokers
Douglas C. Schmidt ([email protected])
Electrical & Computer Engineering, University of California, Irvine

Sumedh Mungee, Sergio Flores-Gaitan, and Aniruddha Gokhale ({sumedh,sergio,gokhale}@cs.wustl.edu)
Department of Computer Science, Washington University, St. Louis
This paper was published in the Kluwer Journal of Real-time Systems, Volume 21, Number 2, 2001.

Abstract

There is increasing demand to extend Object Request Broker (ORB) middleware to support distributed applications with stringent real-time requirements. However, conventional ORB implementations, such as CORBA ORBs, exhibit substantial priority inversion and non-determinism, which makes them unsuitable for applications with deterministic real-time requirements. This paper provides two contributions to the study and design of real-time ORB middleware. First, it illustrates empirically why conventional ORBs do not yet support real-time quality of service. Second, it evaluates connection and concurrency software architectures to identify strategies that reduce priority inversion and non-determinism in real-time CORBA ORBs. The results presented in this paper demonstrate the feasibility of using standard OO middleware like CORBA to support certain types of real-time applications over the Internet.

Keywords: Real-time CORBA Object Request Broker, QoS-enabled OO Middleware, Performance Measurements

1 Introduction

Next-generation distributed real-time applications, such as video conferencing, avionics mission computing, and process control, require endsystems that can provide statistical and deterministic quality of service (QoS) guarantees for latency [1], bandwidth, and reliability [2]. The following trends are shaping the evolution of software development techniques for these distributed real-time applications and endsystems:

Increased focus on middleware and integration frameworks: There is a trend in real-time R&D projects away from developing real-time applications from scratch and toward integrating applications using reusable components based on object-oriented (OO) middleware [3]. The objective of middleware is to increase quality and decrease the cycle-time and effort required to develop software by supporting the integration of reusable components implemented by different suppliers.

Increased focus on QoS-enabled components and open systems: There is increasing demand for remote method invocation and messaging technology to simplify the collaboration of open distributed application components [4] that possess deterministic and statistical QoS requirements. These components must be customizable to meet the functionality and QoS requirements of applications developed in diverse contexts.

Increased focus on standardizing and leveraging real-time COTS hardware and software: To leverage development effort and reduce training, porting, and maintenance costs, there is increasing demand to exploit the rapidly advancing capabilities of standard commercial-off-the-shelf (COTS) hardware and COTS operating systems. Several international standardization efforts are currently addressing QoS-related issues associated with COTS hardware and software.

One particularly noteworthy standardization effort has yielded the Object Management Group's (OMG) Common Object Request Broker Architecture (CORBA) specification [5]. CORBA is OO middleware that allows clients to invoke operations on objects without concern for where the objects reside, what language the objects are written in, what OS/hardware platform they run on, or what communication protocols and networks are used to interconnect distributed objects [6].

There has been recent progress towards standardizing CORBA for real-time [7] and embedded [8] systems. Several OMG groups, most notably the Real-Time Special Interest Group (RT SIG), are defining standard extensions to CORBA to support distributed real-time applications. The goal of standardizing real-time CORBA is to enable real-time applications to interwork throughout embedded systems and heterogeneous distributed environments, such as the Internet.

This work was supported in part by AFOSR grant F49620-00-1-0330, Boeing, CDI/GDIS, DARPA contract 9701516, Lucent, Motorola, NSF grant NCR-9628218, Siemens, and Sprint.
However, developing, standardizing, and leveraging distributed real-time Object Request Broker (ORB) middleware remains hard, notwithstanding the significant efforts of the OMG RT SIG. There are few successful examples of standard, widely deployed distributed real-time ORB middleware running on COTS operating systems and COTS hardware. Conventional CORBA ORBs are generally unsuited for performance-sensitive, distributed real-time applications due to their (1) lack of QoS specification interfaces, (2) lack of QoS enforcement, (3) lack of real-time programming features, and (4) overall lack of performance and predictability [9].

Although some operating systems, networks, and protocols now support real-time scheduling, they do not provide integrated end-to-end solutions [10]. Moreover, relatively little systems research has focused on strategies and tactics for real-time ORB endsystems. For instance, QoS research at the network and OS layers is only beginning to address key requirements and programming models of ORB middleware [11].

Historically, research on QoS for high-speed networks, such as ATM, has focused largely on policies for allocating virtual circuit bandwidth [12]. Likewise, research on real-time operating systems has focused largely on avoiding priority inversions in synchronization and dispatching mechanisms for multi-threaded applications [13]. An important open research topic, therefore, is to determine how best to map the results from QoS work at the network and OS layers onto the OO programming model familiar to many real-time application developers who use ORB middleware.

This paper is organized as follows: Section 2 outlines the general factors that impact real-time ORB endsystem performance and predictability; Section 3 describes software architectures for real-time ORB Cores, focusing on alternative ORB Core concurrency and connection software architectures; Section 4 presents empirical results from systematically measuring the efficiency and predictability of alternative ORB Core architectures in four contemporary CORBA implementations: CORBAplus, miniCOOL, MT-Orbix, and TAO; Section 5 compares our research with related work; and Section 6 presents concluding remarks. For completeness, Appendix A provides an overview of the CORBA reference model.

2 Factors Impacting Real-time ORB Endsystem Performance

Meeting the QoS needs of next-generation distributed applications requires much more than defining Interface Definition Language (IDL) interfaces or adding preemptive real-time scheduling into an OS. It requires a vertically and horizontally integrated ORB endsystem architecture that can deliver end-to-end QoS guarantees at multiple levels throughout a distributed system [10]. The key levels in an ORB endsystem include the network adapters, OS I/O subsystems, communication protocols, ORB middleware, and higher-level services shown in Figure 1.

Figure 1: Components in the TAO Real-time ORB Endsystem

The main focus of this paper is on software architectures that are suitable for real-time ORB Cores. The ORB Core is the component in the CORBA reference model that manages transport connections, delivers client requests to an Object Adapter, and returns responses (if any) to clients. The ORB Core also typically implements the transport endpoint demultiplexing and concurrency architecture used by applications. Figure 1 illustrates how an ORB Core interacts with other CORBA components. Appendix A describes the standard CORBA components in more detail.

For completeness, Section 2.1 briefly outlines the general sources of performance overhead in ORB endsystems. Section 2.2 describes the key sources of priority inversion and non-determinism that affect the predictability and utilization of real-time ORB endsystems. After this overview, Section 3 explores alternative ORB Core concurrency and connection architectures.

2.1 General Sources of ORB Endsystem Performance Overhead

Our experience [14, 15, 16, 17] measuring the throughput and latency of CORBA implementations indicates that performance overheads in real-time ORB endsystems arise from inefficiencies in the following components:

1. Network connections and network adapters: These endsystem components handle heterogeneous network connections and bandwidths, which can significantly increase latency and cause variability in performance. Inefficient design of network adapters can cause queueing delays and lost packets [18], which are unacceptable for certain types of real-time systems.

2. Communication protocol implementations and integration with the I/O subsystem and network adapters: Inefficient protocol implementations and improper integration with I/O subsystems can adversely affect endsystem performance. Specific factors that cause inefficiencies include the protocol overhead caused by flow control, congestion control, retransmission strategies, and connection management. Likewise, lack of proper I/O subsystem integration yields excessive data copying, fragmentation, reassembly, context switching, synchronization, checksumming, demultiplexing, marshaling, and demarshaling overhead [19].

3. ORB transport protocol implementations: Inefficient implementations of ORB transport protocols, such as the CORBA Internet Inter-ORB Protocol (IIOP) [5] and Simple Flow Protocol (SFP) [20], can cause significant performance overhead and priority inversion. Specific factors responsible for these inversions include improper connection management strategies, inefficient sharing of endsystem resources, and excessive synchronization overhead in ORB protocol implementations.

4. ORB Core implementations and integration with OS services: An improperly designed ORB Core can yield excessive memory accesses, cache misses, heap allocations/deallocations, and context switches [21]. In turn, these factors can increase latency and jitter, which is unacceptable for distributed applications with deterministic real-time requirements. Specific ORB Core factors that cause inefficiencies include data copying, fragmentation/reassembly, context switching, synchronization, checksumming, socket demultiplexing, timer handling, request demultiplexing, marshaling/demarshaling, framing, error checking, and connection and concurrency architectures. Many of these inefficiencies are similar to those listed in bullet 2 above. Since they occur at the user level rather than at the kernel level, however, ORB implementers can often address them more readily.

Figure 2 pinpoints where the various factors outlined above impact ORB performance and where optimizations can be applied to reduce key sources of ORB endsystem overhead, priority inversion, and non-determinism. Below, we describe the components in an ORB endsystem that are chiefly responsible for priority inversion and non-determinism.

2.2 Sources of Priority Inversion and Non-determinism in ORB Endsystems

Minimizing priority inversion and non-determinism is important for real-time operating systems and ORB middleware in order to bound application execution times. In ORB endsystems, priority inversion and non-determinism generally stem from resources that are shared between multiple threads or processes. Common examples of shared ORB endsystem resources include (1) TCP connections used by a CORBA IIOP protocol engine, (2) threads used to transfer requests through client and server transport endpoints, (3) process-wide dynamic memory managers, and (4) internal ORB data structures like connection tables for transport endpoints and demultiplexing maps for client requests. Below, we describe key sources of priority inversion and non-determinism in conventional ORB endsystems.

2.2.1 The OS I/O Subsystem

An I/O subsystem is the component in an OS responsible for mediating ORB and application access to low-level network and OS resources, such as device drivers, protocol stacks, and the CPU(s). Key challenges in building a high-performance, real-time I/O subsystem are (1) to minimize context switching and synchronization overhead and (2) to enforce QoS guarantees while minimizing priority inversion and non-determinism [22].

A context switch is triggered when an executing thread relinquishes the CPU it is running on, voluntarily or involuntarily. Depending on the underlying OS and hardware platform, a context switch may require hundreds of instructions to flush register windows, memory caches, instruction pipelines, and translation look-aside buffers [23]. Synchronization overhead arises from locking mechanisms that serialize access to shared resources like I/O buffers, message queues, protocol connection records, and demultiplexing maps used during protocol processing in the OS and ORB.

The I/O subsystems of general-purpose operating systems, such as Solaris and Windows NT, do not perform preemptive, prioritized protocol processing [24]. Therefore, the protocol processing of lower priority packets is not deferred when higher priority packets arrive. Instead, incoming packets are processed in their order of arrival, rather than by their priority. For instance, in Solaris, if a low-priority request arrives immediately before a high-priority request, the I/O subsystem will process the lower priority packet and pass it to an application servant before the higher priority packet. The time spent in the low-priority servant represents the degree of priority inversion incurred by the ORB endsystem and application.
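The cost of this FIFO protocol processing can be illustrated with a toy simulation. The function names and service times below are invented for illustration only; they are not measurements from this paper:

```python
def fifo_dispatch(requests):
    """Process requests strictly in arrival order, as a non-preemptive,
    non-prioritized I/O subsystem does."""
    clock, completion = 0, {}
    for name, _prio, service in requests:
        clock += service
        completion[name] = clock
    return completion

def priority_dispatch(requests):
    """Process the same backlog highest-priority-first (lower number =
    higher priority), approximating prioritized protocol processing."""
    clock, completion = 0, {}
    for name, _prio, service in sorted(requests, key=lambda r: r[1]):
        clock += service
        completion[name] = clock
    return completion

# A low-priority request arrives just before a high-priority one;
# service times are in hypothetical milliseconds.
backlog = [("low", 10, 50), ("high", 1, 5)]

fifo = fifo_dispatch(backlog)
prio = priority_dispatch(backlog)
inversion = fifo["high"] - prio["high"]  # time "high" spends behind "low"
print(f"high completes at {fifo['high']} ms under FIFO, "
      f"{prio['high']} ms under priority dispatch "
      f"(inversion: {inversion} ms)")
```

Under FIFO dispatch the high-priority request absorbs the low-priority request's entire service time; the gap between the two completion times is exactly the priority inversion described above.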
[22] examines key issues that cause priority inversion in I/O subsystems and describes how TAO's real-time I/O subsystem avoids many forms of priority inversion by co-scheduling pools of user-level and kernel-level real-time threads. Interestingly, the results in Section 4 illustrate that much of the overhead, priority inversion, and non-determinism in ORB endsystems does not stem from protocol implementations in the I/O subsystem, but arises instead from the software architecture of the ORB Core.

Figure 2: Optimizing Real-time ORB Endsystem Performance

2.2.2 The ORB Core

An ORB Core is the component in CORBA that implements the General Inter-ORB Protocol (GIOP) [5], which defines a standard format for interoperating between (potentially heterogeneous) ORBs. The ORB Core establishes connections and implements concurrency architectures that process GIOP requests. The following discussion outlines common sources of priority inversion and non-determinism in conventional ORB Core implementations.

Connection architecture: The ORB Core's connection architecture, which defines how requests are mapped onto network connections, has a major impact on real-time ORB behavior. Therefore, a key challenge for developers of real-time ORBs is to select a connection architecture that can utilize the transport mechanisms of an ORB endsystem efficiently and predictably. The following discussion outlines the key sources of priority inversion and non-determinism exhibited by conventional ORB Core connection architectures:

Dynamic connection management: Conventional ORBs create connections dynamically in response to client requests. Dynamic connection management can incur significant run-time overhead and priority inversion, however. For instance, a high-priority client may need to wait for the connection establishment of a lower-priority client. In addition, the time required to establish connections can vary widely, ranging from microseconds to milliseconds, depending on endsystem load and network congestion.

Connection establishment overhead is difficult to bound. For instance, if an ORB needs to establish connections dynamically between a client and a server, it is hard to provide a reasonable guarantee of the worst-case execution time, since this time must include the (often variable) connection establishment time. Moreover, connection establishment often occurs outside the scope of general end-to-end OS QoS enforcement mechanisms, such as retransmission timers [25]. To support applications with deterministic real-time QoS requirements, therefore, ORB endsystems often must pre-allocate connections a priori.

Connection multiplexing: Conventional ORB Cores typically share a single multiplexed TCP connection for all object references to servants in a server process that are accessed by threads in a client process. This connection multiplexing is shown in Figure 3.

Figure 3: A Multiplexed Connection Architecture
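Such sharing means every client thread, regardless of priority, writes through one lock-guarded bytestream. A minimal sketch follows; the class name and frame layout are invented for illustration and are not CORBA's actual GIOP message format:

```python
import struct
import threading

class MultiplexedConnection:
    """Sketch of a shared client connection: all requests are framed and
    written through one lock-serialized bytestream, so high-priority
    callers can block behind low-priority ones holding the lock."""
    def __init__(self):
        self._lock = threading.Lock()  # serializes all writers
        self._seq = 0
        self.stream = bytearray()      # stands in for the TCP socket

    def send_request(self, payload: bytes) -> int:
        with self._lock:
            self._seq += 1
            header = struct.pack("!II", self._seq, len(payload))
            self.stream += header + payload
            return self._seq

def parse_frames(stream: bytes):
    """Split the bytestream back into (sequence number, payload) frames."""
    frames, off = [], 0
    while off < len(stream):
        seq, size = struct.unpack_from("!II", stream, off)
        off += 8
        frames.append((seq, bytes(stream[off:off + size])))
        off += size
    return frames

conn = MultiplexedConnection()
threads = [threading.Thread(target=conn.send_request, args=(b"req%d" % i,))
           for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# The lock keeps each frame intact despite concurrent writers.
assert sorted(seq for seq, _ in parse_frames(conn.stream)) == list(range(1, 9))
```

The lock is exactly the shared resource the surrounding text warns about: it keeps frames from interleaving, but it also queues writers without regard to their priorities.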
The goal of connection multiplexing is to minimize the number of connections open to each server, e.g., to improve server scalability over TCP. However, connection multiplexing can yield substantial packet-level priority inversions and synchronization overhead, as shown in Sections 4.2.1 and 4.2.2.

Concurrency architecture: The ORB Core's concurrency architecture, which defines how requests are mapped onto threads, also has a substantial impact on its real-time behavior. Therefore, another key challenge for developers of real-time ORBs is to select a concurrency architecture that can effectively share the aggregate processing capacity of an ORB endsystem and its application operations among one or more threads. The following outlines the key sources of priority inversion and non-determinism exhibited by conventional ORB Core concurrency architectures:

Two-way operation reply processing: On the client side, conventional ORB Core concurrency architectures for two-way operations can incur significant priority inversion. For instance, multi-threaded ORB Cores that use connection multiplexing incur priority inversions when low-priority threads awaiting replies from a server block out higher priority threads awaiting replies from the same server.

Thread pools: On the server side, ORB Core concurrency architectures often use thread pools [26] to select a thread in which to process an incoming request. However, conventional ORBs do not provide programming interfaces that allow real-time applications to assign the priority of threads in this pool. Therefore, the priority of a thread in the pool is often inappropriate for the priority of the servant that ultimately executes the request. An improperly designed ORB Core increases the potential for, and duration of, priority inversion and non-determinism [27].

2.2.3 The Object Adapter

An Object Adapter is the component in CORBA that is responsible for demultiplexing incoming requests to the servant operations that handle them. A standard GIOP-compliant client request contains the identity of its object and operation. An object is identified by an object key, which is an octet sequence. An operation is represented as a string. As shown in Figure 4, the ORB endsystem must perform the following demultiplexing tasks:

Steps 1 and 2: The OS protocol stack demultiplexes the incoming client request multiple times, starting from the network interface, through the data link, network, and transport layers, up to the user/kernel boundary (e.g., the socket layer), where the data is passed to the ORB Core in a server process.

Steps 3 and 4: The ORB Core uses the addressing information in the client's object key to locate the appropriate POA and servant. POAs can be organized hierarchically. Therefore, locating the POA that contains the designated servant can involve a number of demultiplexing steps through the nested POA hierarchy.

Steps 5 and 6: The POA uses the operation name to find the appropriate IDL skeleton, which demarshals the request buffer into operation parameters and performs the upcall to code supplied by servant developers to implement the object's operation.

Figure 4: CORBA 2.2 Logical Server Architecture

The conventional deeply-layered ORB endsystem demultiplexing implementation shown in Figure 4 is generally inappropriate for high-performance and real-time applications for the following reasons [28]:

Decreased efficiency: Layered demultiplexing reduces performance by increasing the number of internal tables that must be searched as incoming client requests ascend through the processing layers in an ORB endsystem. Demultiplexing client requests through all these layers can be expensive, particularly when a large number of operations appear in an IDL interface and/or a large number of servants are managed by an Object Adapter.
Increased priority inversion and non-determinism: Layered demultiplexing can cause priority inversions because servant-level quality of service (QoS) information is inaccessible to the lowest-level device drivers and protocol stacks in the I/O subsystem of an ORB endsystem. Therefore, an Object Adapter may demultiplex packets according to their FIFO order of arrival. FIFO demultiplexing can cause higher priority packets to wait for a non-deterministic period of time while lower priority packets are demultiplexed and dispatched [29].

Conventional implementations of CORBA incur significant demultiplexing overhead. For instance, [15, 17] show that conventional ORBs spend 17% of the total server time processing demultiplexing requests. Unless this overhead is reduced and demultiplexing is performed predictably, ORBs cannot provide uniform, scalable QoS guarantees to real-time applications.

[14] presents alternative ORB demultiplexing techniques and describes how TAO's real-time Object Adapter provides optimal demultiplexing strategies that execute deterministically in constant time. The demultiplexing strategies used in TAO also avoid priority inversion via de-layered demultiplexing, which removes unnecessary layering within TAO's Object Adapter.

3 Alternative ORB Core Concurrency and Connection Architectures

This section describes a number of common ORB Core concurrency and connection architectures. Each architecture is used by one or more commercial or research CORBA implementations. Below, we qualitatively evaluate how each architecture manages the aggregate processing capacity of ORB endsystem components and application operations. Section 4 then presents quantitative results that illustrate how efficiently and predictably these alternatives perform in practice.

3.1 Alternative ORB Core Connection Architectures

There are two general strategies for structuring the connection architecture of an ORB Core: multiplexed and non-multiplexed. We describe and evaluate various design alternatives for each approach below, focusing on client-side connection architectures in our examples.

3.1.1 Multiplexed Connection Architectures

Most conventional ORBs multiplex all client requests emanating from threads in a single process through one TCP connection to their corresponding server process. This multiplexed connection architecture is used to build scalable ORBs by minimizing the number of TCP connections open to each server. When multiplexing is used, however, a key challenge is to design an efficient ORB Core connection architecture that supports concurrent read and write operations.

TCP provides untyped bytestream data transfer semantics. Therefore, multiple threads cannot read or write from the same socket concurrently. Likewise, writes to a socket shared within an ORB process must be serialized. Serialization is typically implemented by having a client thread acquire a lock before writing to a shared socket.

For one-way operations, there is no need for additional locking or processing once a request is sent. Implementing two-way operations over a shared connection is more complicated, however. In this case, the ORB Core must allow multiple threads to concurrently "read" from a shared socket endpoint.

If server replies are multiplexed through a single TCP connection, then multiple threads cannot read simultaneously from that socket endpoint. Instead, the ORB Core must demultiplex incoming replies to the appropriate client thread by using the GIOP sequence number sent with the original client request and returned with the servant's reply.

Several common ways of implementing connection multiplexing to allow concurrent read and write operations are described below.

Active connection architecture:

Overview: One approach is the active connection architecture shown in Figure 5.

Figure 5: Active Connection Architecture

An application thread invokes a two-way operation (1), which enqueues the request in the ORB (2). A separate thread in the ORB Core services this queue (3) and performs a write operation on the multiplexed socket.
The ORB thread then selects (4) on the socket, waiting for the server to reply; reads the reply from the socket (5); and enqueues the reply in a message queue (6). (The select call is typically used because a client may have multiple multiplexed connections to multiple servers.) Finally, the application thread retrieves the reply from this queue (7) and returns back to its caller.

Advantages: The advantage of the active connection architecture is that it simplifies ORB implementations by using a uniform queueing mechanism. In addition, if every socket handles packets of the same priority level, i.e., packets of different priorities are not received on the same socket, the active connection can handle these packets in FIFO order without causing request-level priority inversion [22].

Disadvantages: The disadvantage of this architecture, however, is that the active connection forces an extra context switch on all two-way operations. Therefore, to minimize this overhead, many ORBs use a variant of the active connection architecture, described next.

Leader/Followers connection architecture:

Overview: An alternative to the active connection model is the leader/followers architecture shown in Figure 6. As before, an application thread invokes a two-way operation call (1). Rather than enqueueing the request in an ORB message queue, however, the request is sent across the socket immediately (2), using the application thread that invoked the operation to perform the write. Moreover, no single thread in the ORB Core is dedicated to handling all the socket I/O in the leader/follower architecture. Instead, the first thread that attempts to wait for a reply on the multiplexed connection will block in select waiting for a reply (3). This thread is called the leader.

Figure 6: Leader/Follower Connection Architecture

To avoid corrupting the socket bytestream, only the leader thread can select on the socket(s). Thus, all client threads that "follow the leader" to read replies from the shared socket will block on semaphores managed by the ORB Core. If replies return from the server in FIFO order, this strategy is optimal, since there is no unnecessary processing or context switching. Since replies may arrive in non-FIFO order, however, the next reply from a server could be for any one of the client threads blocked on semaphores.

When the next reply arrives from the server, the leader reads the reply (4). It uses the sequence number returned in the GIOP reply header to identify the correct thread to receive the reply. If the reply is for the leader's own request, the leader thread releases the semaphore of the next follower (5) and returns to its caller (6). The next follower thread becomes the new leader and blocks on select.

If the reply is not for the leader thread, however, the leader must signal the semaphore of the appropriate thread. The signaled thread then wakes up, reads its reply, and returns to its caller. Meanwhile, the leader thread continues to select for the next reply.

Advantages: Compared with active connections, the advantage of the leader/follower connection architecture is that it minimizes the number of context switches incurred when replies arrive in FIFO order.

Disadvantages: The disadvantage of the leader/follower model is that its complex implementation logic can yield significant locking overhead and priority inversion. The locking overhead stems from the need to acquire mutexes when sending requests and to block on semaphores while waiting for replies. The priority inversion occurs if the priorities of the waiting threads are not respected by the leader thread when it demultiplexes replies to client threads.

3.1.2 Non-multiplexed Connection Architectures

Overview: One technique for minimizing ORB Core priority inversion is to use a non-multiplexed connection architecture, such as the one shown in Figure 7. In this connection architecture, each client thread maintains a table of pre-established connections to servers in thread-specific storage [30]. A separate connection is maintained in each thread for every priority level, e.g., P1, P2, P3, etc. As a result, when a two-way operation is invoked (1), it shares no socket endpoints with other threads. Therefore, the write (2), select (3), read (4), and return (5) operations can occur without contending for ORB Core resources with other threads in the process.
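The per-thread, per-priority connection table can be sketched as follows. This is a minimal illustration: the class name is invented, the "connections" are plain strings, and the table fills lazily, whereas a real-time ORB would pre-establish one socket per entry at startup:

```python
import threading

class ConnectionTable(threading.local):
    """Sketch of a non-multiplexed client connection table kept in
    thread-specific storage: each thread owns a private map from
    priority level to a dedicated connection, so lookups and I/O
    need no locks."""
    def __init__(self):
        # Runs once per thread on first access, giving each thread
        # its own empty table.
        self.by_priority = {}

    def connection_for(self, priority):
        conn = self.by_priority.get(priority)
        if conn is None:
            # Stand-in for a pre-established socket to the server.
            conn = f"conn(thread={threading.get_ident()}, prio={priority})"
            self.by_priority[priority] = conn
        return conn

table = ConnectionTable()
results = {}

def worker(prio):
    # Each thread sees a private table, so the same priority level
    # maps to a different connection in different threads.
    results[prio] = table.connection_for(prio)

threads = [threading.Thread(target=worker, args=(p,)) for p in (1, 2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert len(set(results.values())) == 2  # no sharing across threads
```

Because the table is thread-local and keyed by priority, a two-way call never contends with another thread for a socket, which is precisely the property the non-multiplexed design trades memory footprint for.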
7 APPLICATION SERVANTS 1: invoke_twoway() 5: dispatch upcall() ORB CORE 5: return() ORB CORE 4: read() 4: dequeue() WORKER THREADS
P1 P2 P3 P4 P1 P2 P3 P4 3: enqueue() BORROWED THREAD BORROWED THREAD I/O REQUEST 2: read() 3: select() THREAD QUEUE 1: select() 2: write()
I/I/OO SUBSYSTEM I/I/OO SUBSYSTEM
Figure 7: Non-multiplexed Connection Architecture Figure 8: Server-side Worker Thread Pool Concurrency Ar- chitecture requests through ORB endsystems. In addition, since connec- tions are not shared, this design incurs low synchronization in the pool dequeues (4) the next request from the head of the overhead because no additional locks are required in the ORB queue and dispatches it (5). Core when sending and receiving two-way requests [31].
Disadvantages: The disadvantage with a non-multiplexed connection architecture is that it can use a larger number of socket endpoints than the multiplexed connection model, which may increase the ORB endsystem memory footprint. Moreover, this approach does not scale with the number of priority levels. Therefore, it is most effective when used for statically configured real-time applications, such as avionics mission computing systems [22], which possess a small, fixed number of connections and priority levels, where each priority level maps to an OS thread priority.

3.2 Alternative ORB Core Concurrency Architectures

There are a variety of strategies for structuring the concurrency architecture in an ORB [26]. Below, we describe a number of alternative ORB Core concurrency architectures, focusing on server-side concurrency.

3.2.1 The Worker Thread Pool Architecture

Overview: This ORB concurrency architecture uses a design similar to the active connection architecture described in Section 3.1.1. As shown in Figure 8, the components in a worker thread pool include an I/O thread, a request queue, and a pool of worker threads. The I/O thread selects (1) on the socket endpoints, reads (2) new client requests, and (3) inserts them into the tail of the request queue. A worker thread in the pool dequeues (4) the next request from the head of the queue and dispatches it (5).

Advantages: The chief advantage of the worker thread pool concurrency architecture is its ease of implementation. In particular, the request queue provides a straightforward producer/consumer design.

Disadvantages: The disadvantages of this model stem from the excessive context switching and synchronization required to manage the request queue, as well as the request-level priority inversion caused by connection multiplexing. Since requests of different priorities share the same transport connection, a high-priority request may wait until a low-priority request that arrived earlier is processed. Moreover, thread-level priority inversions can occur if the priority of the thread that originally reads the request is lower than the priority of the servant that processes the request.

3.2.2 The Leader/Follower Thread Pool Architecture

Overview: The leader/follower thread pool architecture is an optimization of the worker thread pool model. It is similar to the leader/follower connection architecture discussed in Section 3.1.1. As shown in Figure 9, a pool of threads is allocated and a leader thread is chosen to select (1) on connections for all servants in the server process. When a request arrives, this thread reads (2) it into an internal buffer. If this is a valid request for a servant, a follower thread in the pool is released to become the new leader (3) and the leader thread dispatches the upcall (4). After the upcall is dispatched, the original leader thread becomes a follower and returns to the thread pool. New requests are queued in socket endpoints until a thread in the pool is available to execute the requests.
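The worker thread pool's producer/consumer structure, and the queue hand-off that the leader/follower model is designed to avoid, can be sketched as follows (an illustrative Python sketch; `run_worker_pool` is a hypothetical helper, with the caller standing in for the I/O thread):

```python
import queue
import threading

def run_worker_pool(requests, n_workers=2):
    """Minimal worker thread pool: a FIFO request queue fed by the
    caller (standing in for the I/O thread) and drained by workers."""
    q = queue.Queue()
    dispatched = []
    lock = threading.Lock()

    def worker():
        while True:
            req = q.get()              # (4) dequeue the next request
            if req is None:            # shutdown sentinel
                break
            with lock:
                dispatched.append(req) # (5) dispatch the upcall
            q.task_done()

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for r in requests:
        q.put(r)                       # (3) insert at the tail, in FIFO order
    q.join()
    for _ in threads:
        q.put(None)
    for t in threads:
        t.join()
    return dispatched
```

Because the queue is FIFO, a high-priority request enqueued behind a low-priority one is dequeued later regardless of its importance, which is precisely the request-level inversion described above.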
Figure 9: Server-side Leader/Follower Concurrency Architecture
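The leader/follower promotion shown in Figure 9 can be sketched with a condition variable guarding a leadership token (an illustrative Python sketch; `LeaderFollowerPool` and `demo` are hypothetical names). Note that the thread that reads a request also dispatches it, so the request is never handed off between threads:

```python
import threading

class LeaderFollowerPool:
    """Sketch of leader/follower promotion: one thread at a time is the
    leader; it reads (2) a request, releases leadership so a follower
    can become the new leader (3), then dispatches the upcall (4) in
    the same thread that performed the read."""
    def __init__(self, source):
        self.source = source           # iterator of incoming requests
        self.cv = threading.Condition()
        self.leader = None             # the current leader thread
        self.dispatched = []           # list.append is atomic in CPython

    def serve(self):
        me = threading.current_thread()
        while True:
            with self.cv:
                while self.leader is not None:
                    self.cv.wait()     # wait in the pool as a follower
                self.leader = me       # become leader; select (1)
                try:
                    req = next(self.source)      # read (2) a request
                except StopIteration:
                    self.leader = None
                    self.cv.notify_all()         # let everyone exit
                    return
                self.leader = None     # promote a follower (3)
                self.cv.notify()
            self.dispatched.append((me.name, req))  # dispatch upcall (4)

def demo(n_requests=6, n_threads=3):
    pool = LeaderFollowerPool(iter(range(n_requests)))
    threads = [threading.Thread(target=pool.serve) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return pool.dispatched
```

Which thread dispatches which request is nondeterministic, but every request is read and dispatched by a single thread, avoiding the worker pool's queue transfer.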
Advantages: Compared with the worker thread pool design, the chief advantage of the leader/follower thread pool architecture is that it minimizes the context switching overhead incurred by incoming requests. Overhead is minimized since a request need not be transferred from the thread that read it to another thread in the pool that processes it.

Figure 10: Server-side Thread Framework Concurrency Architecture
Disadvantages: The leader/follower architecture's disadvantages are largely the same as those of the worker thread design. In addition, the leader/follower model is harder to implement than the worker thread pool model.

3.2.3 Threading Framework Architecture

Overview: A very flexible way to implement an ORB concurrency architecture is to allow application developers to customize hook methods provided by a threading framework. One way of structuring this framework is shown in Figure 10. This design is based on the MT-Orbix thread filter framework, which implements a variant of the Chain of Responsibility pattern [32].

In MT-Orbix, an application can install a thread filter at the top of a chain of filters. Filters are application-programmable hooks that can perform a number of tasks. Common tasks include intercepting, modifying, or examining each request sent to and from the ORB.

In the thread framework architecture, a connection thread in the ORB Core reads (1) a request from a socket endpoint and enqueues the request on a request queue in the ORB Core (2). Another thread then dequeues the request (3) and passes it through each filter in the chain successively. The topmost filter, i.e., the thread filter, determines the thread that will handle this request. In the thread-pool model, the thread filter enqueues the request into a queue serviced by a thread with the appropriate priority. This thread then passes control back to the ORB, which performs operation demultiplexing and dispatches the upcall (4).

Advantages: The main advantage of a threading framework is its flexibility. The thread filter mechanism can be programmed by server developers to support various concurrency strategies. For instance, to implement a thread-per-request [33] strategy, the filter can spawn a new thread and pass the request to this new thread. Likewise, the MT-Orbix threading framework can be configured to implement other concurrency architectures, such as thread-per-object [34] and thread pool [35].

Disadvantages: There are several disadvantages with the thread framework design, however. First, since there is only a single chain of filters, priority inversion can occur because each request must traverse the filter chain in FIFO order. Second, there may be FIFO queueing at multiple levels in the ORB endsystem; therefore, a high-priority request may be processed only after several lower-priority requests that arrived earlier. Third, the generality of the threading framework can substantially increase locking overhead, since locks must be acquired to insert requests into the queue of the appropriate thread.

3.2.4 The Reactor-per-Thread-Priority Architecture

Overview: The Reactor-per-thread-priority architecture is based on the Reactor pattern [36], which integrates transport endpoint demultiplexing and the dispatching of the
corresponding event handlers. This threading architecture associates a group of Reactors with a group of threads running at different priorities. As shown in Figure 11, the components in the Reactor-per-thread-priority architecture include multiple pre-allocated Reactors, each of which is associated with its own real-time thread of control for each priority level, e.g., P1 ... P4, in the ORB. For instance, avionics mission computing systems [37] commonly execute their tasks in fixed priority threads corresponding to the rates, e.g., 20 Hz, 10 Hz, 5 Hz, and 1 Hz, at which operations are called by clients. Within each thread, the Reactor demultiplexes (1) all incoming client requests to the appropriate connection handler, i.e., connect1, connect2, etc. The connection handler reads (2) the request and dispatches (3) it to a servant that executes the upcall at its thread priority.

Each Reactor in an ORB server thread is also associated with an Acceptor [38]. The Acceptor is a factory that listens on a particular port number for clients to connect to that thread and creates a connection handler to process the GIOP requests. In the example in Figure 11, there is a listener port for each priority level.

Figure 11: Server-side Reactor-per-Thread-Priority Concurrency Architecture

Advantages: The advantage of the Reactor-per-thread-priority architecture is that it minimizes priority inversion and non-determinism. Moreover, it reduces context switching and synchronization overhead by requiring the state of servants to be locked only if they interact across different thread priorities. In addition, this concurrency architecture supports scheduling and analysis techniques that associate priority with rate, such as Rate Monotonic Scheduling (RMS) and Rate Monotonic Analysis (RMA) [39, 40].

Disadvantages: The disadvantage with the Reactor-per-thread-priority architecture is that it serializes all client requests for each Reactor within a single thread of control, which can reduce parallelism. To alleviate this problem, a variant of this architecture can associate a pool of threads with each priority level. Though this will increase potential parallelism, it can incur greater context switching overhead and non-determinism, which may be unacceptable for certain types of real-time applications.

3.3 Integrating Connection and Concurrency Architectures

The Reactor-per-thread-priority architecture can be integrated seamlessly with the non-multiplexed connection model described in Section 3.1.2 to provide end-to-end priority preservation in real-time ORB endsystems, as shown in Figure 12. In this diagram, the Acceptors listen on ports that correspond to the 20 Hz, 10 Hz, 5 Hz, and 1 Hz rate group thread priorities, respectively. Once a client connects, its Acceptor creates a new socket queue and connection handler to service that queue. The I/O subsystem uses the port number contained in arriving requests as a demultiplexing key to associate requests with the appropriate socket queue. Each queue is served by an ORB Core thread that runs at the appropriate priority.

Figure 12: End-to-end Real-time ORB Core Software Architecture

The combination of the Reactor-per-thread-priority architecture and the non-multiplexed connection architecture minimizes priority inversion through the entire distributed ORB endsystem by eagerly demultiplexing incoming requests onto the appropriate real-time thread that services the priority level of the target servant. As shown in Section 4.2, this design is well suited for real-time applications with deterministic QoS requirements.
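The per-priority demultiplexing at the heart of this integrated design can be sketched as one queue and one thread per priority level (an illustrative Python sketch; `ReactorPerPriority` is a hypothetical name, and a real Reactor would select on socket endpoints rather than drain in-process queues):

```python
import queue
import threading

class ReactorPerPriority:
    """One 'socket queue' and one dedicated thread per priority level.
    Arriving requests carry a priority that serves as the
    demultiplexing key, so a high-priority request never queues
    behind a low-priority one and no state is shared across levels."""
    def __init__(self, priorities):
        self.queues = {p: queue.Queue() for p in priorities}
        self.dispatched = {p: [] for p in priorities}
        self.threads = {p: threading.Thread(target=self._run, args=(p,))
                        for p in priorities}
        for t in self.threads.values():
            t.start()

    def _run(self, p):
        while True:
            req = self.queues[p].get()      # select (1) on this level only
            if req is None:                 # shutdown sentinel
                return
            self.dispatched[p].append(req)  # read (2) and dispatch (3)

    def demux(self, priority, request):
        """Stand-in for the I/O subsystem: route by demultiplexing key."""
        self.queues[priority].put(request)

    def shutdown(self):
        for q in self.queues.values():
            q.put(None)
        for t in self.threads.values():
            t.join()
```

Mapping the four rate groups (20, 10, 5, and 1 Hz) to four such threads gives each priority level its own independent path from demultiplexing key to dispatch, which is how the integrated design avoids cross-priority contention.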
4 Real-time ORB Core Performance Experiments
This section describes the results of experiments that measure the real-time behavior of several commercial and research ORBs, including IONA MT-Orbix 2.2, Sun miniCOOL 4.32, Expersoft CORBAplus 2.1.1, and TAO 1.0. MT-Orbix and CORBAplus are not real-time ORBs, i.e., they were not explicitly designed to support applications with real-time QoS requirements. Sun miniCOOL is a subset of the COOL ORB that is specifically designed for embedded systems with small memory footprints. TAO was designed at Washington University to support real-time applications with deterministic and statistical quality of service requirements, as well as best effort requirements. TAO has been used in a number of production real-time systems at companies like Boeing [37], SAIC, Lockheed Martin, and Raytheon.

4.1 Benchmarking Testbed

This section describes the experimental testbed we designed to systematically measure sources of latency and throughput overhead, priority inversion, and non-determinism in ORB endsystems. The architecture of our testbed is depicted in Figure 13. The hardware and software components used in the experiments are described below.

Each UltraSPARC-2 contains two 168 MHz SuperSPARC CPUs with a 1 Megabyte cache per CPU. The Solaris 2.5.1 TCP/IP protocol stack is implemented using the STREAMS communication framework [41]. Each UltraSPARC-2 has 256 Mbytes of RAM and an ENI-155s-MF ATM adaptor card, which supports 155 Megabits per-sec (Mbps) SONET multimode fiber. The Maximum Transmission Unit (MTU) on the ENI ATM adaptor is 9,180 bytes. Each ENI card has 512 Kbytes of on-board memory. A maximum of 32 Kbytes is allotted per ATM virtual circuit connection for receiving and transmitting frames (for a total of 64 Kb). This allows up to eight switched virtual connections per card. The CORBA/ATM hardware platform is shown in Figure 14.

Figure 14: Hardware for the CORBA/ATM Testbed (FORE Systems ASX-200BX ATM switch, 16 port, OC3 155 Mbps/port, 9,180 MTU; UltraSPARC-2s with FORE ATM adaptors and Ethernet)
4.1.2 Client/Server Configuration and Benchmarking Methodology