Alleviating Priority Inversion and Non-determinism in Real-time CORBA ORB Core Architectures

Douglas C. Schmidt, Sumedh Mungee, Sergio Flores-Gaitan, and Aniruddha Gokhale
{schmidt,sumedh,sergio,gokhale}@cs.wustl.edu
Department of Computer Science, Washington University

St. Louis, MO 63130, USA

This paper appeared in the proceedings of the 4th IEEE Real-time Technology and Applications Symposium (RTAS), Denver, Colorado, June 3-5, 1998.

This work was supported in part by Boeing, CDI, DARPA contract 9701516, Lucent, Motorola, NSF grant NCR-9628218, Siemens, and US Sprint.

Abstract

There is increasing demand to extend Object Request Broker (ORB) middleware to support distributed applications with stringent real-time requirements. However, conventional ORB implementations, such as CORBA ORBs, exhibit substantial priority inversion and non-determinism, which makes them unsuitable for applications with deterministic real-time requirements. This paper provides two contributions to the study and design of real-time ORB middleware. First, it illustrates empirically why conventional ORBs do not yet support real-time quality of service. Second, it evaluates connection and concurrency software architectures to identify strategies that reduce priority inversion and non-determinism in real-time CORBA ORBs.

Keywords: Real-time CORBA Object Request Broker, QoS-enabled OO Middleware, Performance Measurements

1 Introduction

Meeting the QoS needs of next-generation distributed applications requires much more than defining IDL interfaces or adding preemptive real-time scheduling into an OS. It requires a vertically and horizontally integrated ORB endsystem architecture that can deliver end-to-end QoS guarantees at multiple levels throughout a distributed system. The key levels in an ORB endsystem include the network adapters, OS I/O subsystems, communication protocols, ORB middleware, and higher-level services [1].

The main focus of this paper is on software architectures that are suitable for real-time ORB Cores. The ORB Core is the component in the CORBA reference model that manages transport connections, delivers client requests to an Object Adapter, and returns responses (if any) to clients. The ORB Core also typically implements the transport endpoint demultiplexing and concurrency architecture used by applications. Figure 1 illustrates how an ORB Core interacts with other CORBA components; [2] describes each of these components in more detail.

Figure 1: Components in the CORBA Reference Model

This paper is organized as follows: Section 2 presents empirical results from systematically measuring the efficiency and predictability of alternative ORB Core architectures in four contemporary CORBA implementations: CORBAplus, miniCOOL, MT-Orbix, and TAO; Section 3 presents concluding remarks.

2 Real-time ORB Core Performance Experiments

This section describes the results of experiments that measure the real-time behavior of several commercial and research ORBs, including IONA's MT-Orbix 2.2, Sun miniCOOL 4.3 (COOL was previously developed by Chorus, which was recently acquired by Sun), Expersoft CORBAplus 2.1.1, and TAO 1.0. MT-Orbix and CORBAplus are not real-time ORBs, i.e., they were not explicitly designed to support applications with real-time QoS requirements. Sun miniCOOL is a subset of the COOL ORB that is specifically designed for embedded systems with small memory footprints. TAO was designed at Washington University to support real-time applications with deterministic and statistical quality of service requirements, as well as best-effort requirements.

2.1 Benchmarking Testbed

This section describes the experimental testbed we designed to systematically measure sources of latency and throughput overhead, priority inversion, and non-determinism in ORB endsystems. The architecture of our testbed is depicted in Figure 2; the hardware and software components used in the experiments are outlined below.

Figure 2: ORB Endsystem Benchmarking Testbed

2.1.1 Hardware Configuration

The experiments in this section were conducted using a FORE Systems ASX-1000 ATM switch connected to two dual-processor UltraSPARC-2s running SunOS 5.5.1. The ASX-1000 is a 96-port, OC12 622 Mbps/port switch. Each UltraSPARC-2 contains two 168 MHz Super SPARC CPUs with a 1 Megabyte cache per CPU. The SunOS 5.5.1 TCP/IP protocol stack is implemented using the STREAMS communication framework.

Each UltraSPARC-2 has 256 Mbytes of RAM and an ENI-155s-MF ATM adaptor card, which supports 155 Megabits-per-sec (Mbps) SONET multimode fiber. The Maximum Transmission Unit (MTU) on the ENI ATM adaptor is 9,180 bytes. Each ENI card has 512 Kbytes of on-board memory, of which a maximum of 32 Kbytes is allotted per ATM virtual circuit connection for receiving and transmitting frames (for a total of 64 K). This allows up to eight switched virtual connections per card.

2.1.2 Client/Server Configuration and Benchmarking Methodology

Server benchmarking configuration: As shown in Figure 2, our testbed server consists of two servants within an ORB's Object Adapter. One servant runs in a higher-priority thread than the other. Each thread processes requests that are sent to its servant by client threads on the other UltraSPARC-2. Solaris real-time threads [3] are used to implement servant priorities: the high-priority servant thread has the highest real-time priority available on Solaris and the low-priority servant has the lowest real-time priority, as illustrated by the sketch below.
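The following minimal sketch (ours, not the testbed's actual code) illustrates how servant thread priorities of this kind can be assigned. It uses the portable POSIX scheduling API rather than the Solaris-specific real-time class; servant_loop and spawn_rt_thread are placeholder names, and SCHED_FIFO typically requires elevated privileges.

// rt_priority.cpp: assign real-time priorities to servant threads.
// Minimal sketch using the POSIX scheduling API; compile with -lpthread.
#include <pthread.h>
#include <sched.h>
#include <cstdio>

static void *servant_loop(void *arg) {
  std::printf("%s servant thread running\n",
              static_cast<const char *>(arg));
  // ... run the ORB dispatching loop for this servant here ...
  return nullptr;
}

// Spawn a thread with a fixed preemptive (SCHED_FIFO) priority.
static pthread_t spawn_rt_thread(int prio, const char *name) {
  pthread_attr_t attr;
  sched_param param;
  pthread_attr_init(&attr);
  // Use the attribute's scheduling settings instead of inheriting them.
  pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
  pthread_attr_setschedpolicy(&attr, SCHED_FIFO);
  param.sched_priority = prio;
  pthread_attr_setschedparam(&attr, &param);
  pthread_t tid;
  if (pthread_create(&tid, &attr, servant_loop,
                     const_cast<char *>(name)) != 0) {
    // Real-time scheduling usually needs privileges; fall back.
    std::fprintf(stderr, "no RT scheduling; using default policy\n");
    pthread_create(&tid, nullptr, servant_loop, const_cast<char *>(name));
  }
  return tid;
}

int main() {
  int hi = sched_get_priority_max(SCHED_FIFO);  // highest RT priority
  int lo = sched_get_priority_min(SCHED_FIFO);  // lowest RT priority
  pthread_t high = spawn_rt_thread(hi, "high-priority");
  pthread_t low = spawn_rt_thread(lo, "low-priority");
  pthread_join(high, nullptr);
  pthread_join(low, nullptr);
}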

The server benchmarking configuration is implemented in the various ORBs as follows:

• CORBAplus: which uses the worker thread pool architecture shown in Figure 3. In version 2.1.1 of CORBAplus, multi-threaded applications have an event dispatcher thread and a pool of worker threads. The dispatcher thread receives the requests and passes them to application worker threads, which process the requests. In the simplest configuration, an application can choose to create no additional threads and rely upon the main thread to process all requests. A sketch of this pattern appears below.

Figure 3: CORBAplus' Worker Thread Pool Concurrency Architecture
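The following minimal sketch (ours, not CORBAplus source) illustrates the worker thread pool pattern of Figure 3: an I/O thread enqueues requests on a single queue that a pool of worker threads services. Note that the queue's single lock and FIFO ordering are shared by requests of all priorities, which is one root of the priority inversion measured later.

// worker_pool.cpp: minimal worker thread pool in the style of Figure 3.
#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

struct Request { int id; };

class RequestQueue {
  std::queue<Request> q_;
  std::mutex m_;                  // single lock shared by all priorities
  std::condition_variable cv_;
  bool done_ = false;
public:
  void enqueue(Request r) {       // step 3 in Figure 3: enqueue()
    { std::lock_guard<std::mutex> g(m_); q_.push(r); }
    cv_.notify_one();
  }
  bool dequeue(Request &r) {      // step 4 in Figure 3: dequeue()
    std::unique_lock<std::mutex> g(m_);
    cv_.wait(g, [this] { return done_ || !q_.empty(); });
    if (q_.empty()) return false;
    r = q_.front(); q_.pop();
    return true;
  }
  void shutdown() {
    { std::lock_guard<std::mutex> g(m_); done_ = true; }
    cv_.notify_all();
  }
};

int main() {
  RequestQueue queue;
  std::vector<std::thread> workers;
  for (int i = 0; i < 4; ++i)
    workers.emplace_back([&queue, i] {
      Request r;
      while (queue.dequeue(r))    // step 5 in Figure 3: dispatch upcall()
        std::printf("worker %d dispatches request %d\n", i, r.id);
    });
  // The dispatcher thread: in a real ORB this would select()/read()
  // requests from socket endpoints (steps 1-2) before enqueueing them.
  for (int id = 0; id < 8; ++id) queue.enqueue(Request{id});
  queue.shutdown();
  for (auto &w : workers) w.join();
}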

• miniCOOL: which uses the leader/follower thread pool architecture shown in Figure 4. Version 4.3 of miniCOOL allows application-level concurrency control: the application developer can choose between thread-per-request and thread pool. The thread-pool concurrency architecture was used for our benchmarks since it is better suited than thread-per-request for deterministic real-time applications. In the thread-pool concurrency architecture, the application initially spawns a fixed number of threads. In addition, when the initial thread pool size is insufficient, miniCOOL can be configured to dynamically spawn threads on behalf of server applications to handle requests, up to a maximum limit. A sketch of the leader/follower pattern appears below.

Figure 4: miniCOOL's Leader/Follower Concurrency Architecture
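A minimal sketch of the leader/follower pattern follows (ours, not miniCOOL source). One thread at a time acts as leader and waits for work; before servicing a request it promotes a follower to become the new leader. The in-memory request deque stands in for select on the shared socket.

// leader_follower.cpp: minimal leader/follower pool in the style of Figure 4.
#include <condition_variable>
#include <cstdio>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

std::mutex m;
std::condition_variable leader_cv;  // followers wait to become leader
std::condition_variable work_cv;    // leader waits for work
bool leader_present = false;
std::deque<int> pending;            // simulated socket: ids of requests
bool done = false;

void pool_thread(int id) {
  for (;;) {
    std::unique_lock<std::mutex> g(m);
    // Follower role: wait until no other thread is leader.
    leader_cv.wait(g, [] { return !leader_present; });
    leader_present = true;
    // Leader role: wait for a request, like select() on the shared socket.
    work_cv.wait(g, [] { return done || !pending.empty(); });
    if (pending.empty()) {          // shutting down: hand off and exit
      leader_present = false;
      leader_cv.notify_one();
      return;
    }
    int req = pending.front();
    pending.pop_front();
    // Promote a follower to leader *before* doing the work.
    leader_present = false;
    leader_cv.notify_one();
    g.unlock();
    std::printf("thread %d services request %d\n", id, req);
  }
}

int main() {
  std::vector<std::thread> pool;
  for (int i = 0; i < 3; ++i) pool.emplace_back(pool_thread, i);
  {
    std::lock_guard<std::mutex> g(m);
    for (int r = 0; r < 6; ++r) pending.push_back(r);
    done = true;                    // no more requests after these
  }
  work_cv.notify_one();             // wake the current leader
  for (auto &t : pool) t.join();
}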

• MT-Orbix: which uses the thread pool framework architecture based on the Chain of Responsibility pattern shown in Figure 5. Version 2.2 of MT-Orbix is used to create two real-time servant threads at startup. The high-priority thread is associated with the high-priority servant and the low-priority thread is associated with the low-priority servant. Incoming requests are assigned to these threads using the Orbix thread filter mechanism, as sketched below. Each priority has its own queue of requests to avoid priority inversion within the queue; this inversion could otherwise occur if a high-priority servant and a low-priority servant dequeued requests from the same queue.

Figure 5: MT-Orbix's Thread Framework Concurrency Architecture
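The sketch below (ours, not MT-Orbix source) illustrates a thread filter built with the Chain of Responsibility pattern: each request traverses a single serial chain of filters, and the terminal filter routes it to the queue of the servant thread with the matching priority. The serialization of the chain is what Section 2.2.2 identifies as a source of priority inversion.

// thread_filter.cpp: minimal thread-filter sketch in the style of Figure 5.
#include <cstdio>
#include <deque>
#include <map>

struct Request { int priority; int id; };

// Chain of Responsibility: each filter may consume the request or pass it on.
struct Filter {
  Filter *next = nullptr;
  virtual ~Filter() = default;
  virtual void handle(const Request &r) { if (next) next->handle(r); }
};

// Terminal filter: route the request to the per-priority queue.
struct PriorityRouter : Filter {
  std::map<int, std::deque<Request>> &queues;
  explicit PriorityRouter(std::map<int, std::deque<Request>> &q) : queues(q) {}
  void handle(const Request &r) override {
    queues[r.priority].push_back(r);  // dequeued later by the servant thread
  }
};

struct LoggingFilter : Filter {
  void handle(const Request &r) override {
    std::printf("filter saw request %d (priority %d)\n", r.id, r.priority);
    Filter::handle(r);                // pass to the next filter in the chain
  }
};

int main() {
  std::map<int, std::deque<Request>> queues;  // one queue per priority
  PriorityRouter router(queues);
  LoggingFilter logger;
  logger.next = &router;
  // All requests traverse the single chain sequentially, so a high-priority
  // request entering behind a low-priority one waits its turn here.
  logger.handle(Request{0, 1});  // low priority
  logger.handle(Request{1, 2});  // high priority
  std::printf("high-priority queue depth: %zu\n", queues[1].size());
}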

• TAO: which uses the thread-per-priority concurrency architecture described in [4]. Version 1.0 of TAO integrates the thread-per-priority concurrency architecture with a non-multiplexed connection architecture, as shown in Figure 6 and sketched below. In contrast, the other three ORBs multiplex all requests from client threads in each process over a single connection to the server process.

Figure 6: TAO's Thread-per-Priority Thread Pool Architecture
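The following minimal sketch (ours, not TAO source) conveys the thread-per-priority idea: each rate group owns its thread and its request queue and, in TAO, its own pre-established connection and Reactor, so no locks or sockets are shared across priorities. The port-per-priority scheme and the priority values are assumptions for illustration.

// thread_per_priority.cpp: minimal sketch in the style of Figure 6.
#include <cstdio>
#include <deque>
#include <functional>
#include <thread>
#include <vector>

struct RateGroup {
  int priority;              // real-time priority for this rate group
  int port;                  // dedicated transport endpoint (assumed)
  std::deque<int> requests;  // touched only by this group's thread
};

void run_group(RateGroup &g) {
  // In TAO this loop would be a per-priority Reactor handling a dedicated,
  // pre-established connection; no other thread touches g.requests.
  while (!g.requests.empty()) {
    int id = g.requests.front();
    g.requests.pop_front();
    std::printf("priority %d thread (port %d) dispatches request %d\n",
                g.priority, g.port, id);
  }
}

int main() {
  std::vector<RateGroup> groups = {
    {127, 20001, {1, 2}},    // 20 Hz, high priority
    {0,   20002, {3, 4}},    // 10 Hz, low priority
  };
  std::vector<std::thread> threads;
  for (auto &g : groups)
    threads.emplace_back(run_group, std::ref(g));
  for (auto &t : threads) t.join();
}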

Client benchmarking configuration: Figure 2 shows how the benchmarking test used one high-priority client C0 and n low-priority clients, C1 ... Cn. The high-priority client runs in a high-priority real-time OS thread and invokes operations at 20 Hz, i.e., it invokes 20 CORBA twoway calls per second. All low-priority clients have the same lower-priority OS thread priority and invoke operations at 10 Hz, i.e., they invoke 10 CORBA twoway calls per second. In each call, the client sends a value of type CORBA::Octet to the servant. The servant cubes the number and returns it to the client.

When the test program creates the client threads, they block on a barrier so that no client begins work until the others are created and ready to run. When all threads inform the main thread they are ready to begin, the main thread unblocks all client threads. These threads execute in an arbitrary order determined by the Solaris real-time thread dispatcher. Each client invokes 4,000 CORBA twoway requests at its prescribed rate. The structure of this client is sketched below.
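A minimal sketch of the client structure follows (ours, not the actual benchmark code). invoke_twoway is a placeholder for the CORBA call that sends an octet and receives its cube; the barrier is built from a mutex and condition variable, and each client paces itself with sleep_until.

// client_bench.cpp: minimal sketch of the client benchmark structure.
#include <chrono>
#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <thread>
#include <vector>

std::mutex m;
std::condition_variable cv;
int waiting = 0, total = 0;
bool go = false;

void barrier_wait() {                // no client starts before the others
  std::unique_lock<std::mutex> g(m);
  if (++waiting == total) { go = true; cv.notify_all(); }
  else cv.wait(g, [] { return go; });
}

void invoke_twoway(unsigned char) { /* CORBA twoway call placeholder */ }

void client(int hz, int calls) {
  using clock = std::chrono::steady_clock;
  barrier_wait();
  auto period = std::chrono::milliseconds(1000 / hz);
  auto next = clock::now();
  for (int i = 0; i < calls; ++i) {
    auto start = clock::now();
    invoke_twoway(static_cast<unsigned char>(i));
    auto latency = clock::now() - start;  // sample for mean/jitter statistics
    (void)latency;
    next += period;                       // 20 Hz -> one call every 50 msec
    std::this_thread::sleep_until(next);
  }
}

int main() {
  const int n_low = 4;                    // low-priority clients C1..Cn
  total = n_low + 1;
  std::vector<std::thread> clients;
  clients.emplace_back(client, 20, 4000); // high-priority client C0 at 20 Hz
  for (int i = 0; i < n_low; ++i)
    clients.emplace_back(client, 10, 4000);
  for (auto &t : clients) t.join();
  std::printf("benchmark complete\n");
}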

2.2 Performance Results on Solaris

Two categories of tests were used in our benchmarking experiments: blackbox and whitebox.

Blackbox benchmarks: We computed the average twoway response time incurred by various clients. In addition, we computed twoway operation jitter, which is the standard deviation from the average twoway response time. High levels of latency and jitter are undesirable for deterministic real-time applications since they complicate the computation of worst-case execution time and reduce CPU utilization. Section 2.2.1 explains the blackbox results.

Whitebox benchmarks: To precisely pinpoint the sources of priority inversion and performance non-determinism, we employed whitebox benchmarks. These benchmarks used profiling tools such as UNIX truss and Quantify [5]. These tools trace and log the activities of the ORBs and measure the time spent on various tasks, as explained in Section 2.2.2.

Together, the blackbox and whitebox benchmarks indicate the end-to-end latency/jitter incurred by CORBA clients and help explain the reasons for these results. In general, the results reveal why ORBs like MT-Orbix, CORBAplus, and miniCOOL are not yet suited for applications with deterministic real-time performance requirements. Likewise, the results illustrate empirically how and why the non-multiplexed, priority-based ORB Core architecture used by TAO is better suited for these types of real-time applications.

Figure 7: Comparative Latency for CORBAplus, MT-Orbix, miniCOOL, and TAO (latency per twoway request in milliseconds vs. number of low-priority clients)

Figure 8: Comparative Jitter for CORBAplus, MT-Orbix, miniCOOL, and TAO (jitter in milliseconds vs. number of low-priority clients)

2.2.1 Blackbox Results

As the number of low-priority clients increases, the number of low-priority requests sent to the server also increases. Ideally, a real-time ORB endsystem should exhibit no variance in the latency observed by the high-priority client, irrespective of the number of low-priority clients. Our measurements of end-to-end twoway ORB latency yielded the results in Figure 7.

Figure 7 shows that as the number of low-priority clients increases, MT-Orbix and CORBAplus incur significantly higher latencies for their high-priority client thread. Compared with TAO, MT-Orbix's latency is 7 times higher and CORBAplus' latency is 25 times higher. Note the irregular behavior of the average latency that miniCOOL displays, i.e., from 10 msec latency running 20 low-priority clients down to 2 msec with 25 low-priority clients. Such a source of non-determinism is clearly undesirable for applications that require real-time bounds.

The low-priority clients for MT-Orbix, CORBAplus, and miniCOOL also exhibit very high levels of jitter. Compared with TAO, CORBAplus incurs 300 times as much jitter and MT-Orbix 25 times as much jitter in the worst case, as shown in Figure 8. Likewise, miniCOOL's low-priority clients display erratic behavior with several high bursts of jitter, which makes it undesirable for deterministic real-time applications. The blackbox results for each ORB are explained below.

CORBAplus results: CORBAplus incurs priority inversion at various points in the graph shown in Figure 7. After displaying a high amount of latency for a small number of low-priority clients, the latency drops suddenly at 10 clients, then eventually rises again. Clearly, this behavior is not suitable for deterministic real-time applications. Section 2.2.2 reveals how the poor performance and priority inversions stem largely from CORBAplus' concurrency architecture. Figure 8 shows that CORBAplus generates high levels of jitter, particularly when tested with 40, 45, and 50 low-priority clients. These results show an erratic and undesirable behavior for applications that require real-time guarantees.

MT-Orbix results: MT-Orbix incurs substantial priority inversion as the number of low-priority clients increases. After the number of clients exceeds 10, the high-priority client performs increasingly worse than the low-priority clients. This behavior is not conducive to deterministic real-time applications. Section 2.2.2 reveals how these inversions stem largely from MT-Orbix's concurrency architecture on the server. In addition, MT-Orbix produces high levels of jitter, as shown in Figure 8. This behavior is caused by priority inversions in its ORB Core, as explained in Section 2.2.2.

miniCOOL results: As the number of low-priority clients increases, the latency observed by the high-priority client also increases, reaching 10 msec at 20 clients, at which point it decreases suddenly to 2.5 msec with 25 clients. This erratic behavior becomes more evident as more low-priority clients are run. Although the latency of the high-priority client is smaller than that of the low-priority clients, the non-linear behavior of the clients makes miniCOOL problematic for deterministic real-time applications.

The difference in latency between the high- and the low-priority clients is also non-deterministic; for instance, it grows from 0.55 msec to 10 msec. Section 2.2.2 reveals how this behavior stems largely from the connection architecture used by the miniCOOL client and server.

The jitter incurred by miniCOOL is also fairly high, as shown in Figure 8. This jitter is similar to that observed for the CORBAplus ORB, since both spend approximately the same percentage of time executing locking operations; Section 2.2.2 evaluates ORB locking behavior.

TAO results: Figure 7 reveals that as the number of low-priority clients increases from 1 to 50, the latency observed by TAO's high-priority client grows by 0.7 msecs. However, the difference between the low-priority and high-priority clients starts at 0.05 msec and ends at 0.27 msec. In contrast, in miniCOOL this difference evolves from 0.55 msec to 10 msec, and in CORBAplus it evolves from 0.42 msec to 15 msec. Moreover, the rate of increase of latency with TAO is significantly lower than with MT-Orbix, Sun miniCOOL, and CORBAplus. In particular, when there are 50 low-priority clients competing for the CPU and network bandwidth, the low-priority client latency observed with MT-Orbix is more than 7 times that of TAO, the miniCOOL latency is 3 times that of TAO, and the CORBAplus latency is 25 times that of TAO.

In contrast to the other ORBs, TAO's high-priority client always performs better than its low-priority clients. This demonstrates that the connection and concurrency architectures in TAO's ORB Core can maintain real-time request priorities end-to-end. The key difference between TAO and the other ORBs is that its GIOP protocol processing is performed on a dedicated connection by a dedicated real-time thread with a suitable end-to-end real-time priority. Thus, TAO shares the minimal amount of ORB endsystem resources, which substantially reduces opportunities for priority inversion and locking overhead.

The TAO ORB produces very low jitter (less than 11 msecs) for the low-priority requests and negligible jitter (less than 1 msec) for the high-priority requests. The stability of TAO's latency is clearly desirable for applications that require predictable end-to-end performance.

In general, the blackbox results described above demonstrate that an improper choice of ORB Core concurrency and connection software architectures can play a significant role in exacerbating priority inversion and non-determinism.

2.2.2 Whitebox Results

For the whitebox tests, we used a configuration of ten concurrent clients similar to the one described in Section 2.1. Nine clients were low-priority and one was high-priority. Each client sent 4,000 twoway requests to the server, which had a low-priority servant thread and a high-priority servant thread.
Our previous experience using CORBA for real-time avionics mission computing [6] indicated that locks constitute a significant source of overhead, non-determinism, and potential priority inversion for real-time ORBs. Using Quantify and truss, we measured the time the ORBs consumed performing tasks like synchronization, I/O, and protocol processing.

In addition, we computed a metric that records the number of calls made to user-level locks (i.e., mutex_lock and mutex_unlock) and kernel-level locks (i.e., _lwp_mutex_lock, _lwp_mutex_unlock, _lwp_sema_post, and _lwp_sema_wait). This metric computes the average number of lock operations per-request. In general, kernel-level locks are considerably more expensive since they incur kernel/user mode switching overhead.

The whitebox results from our experiments are presented below.

CORBAplus whitebox results: Our whitebox analysis of CORBAplus reveals high levels of synchronization overhead from mutex and semaphore operations at the user-level for each twoway request, as shown in Figure 13. This synchronization overhead arises from locking operations that implement the connection and concurrency architecture used by CORBAplus. As shown in Figure 9, CORBAplus exhibits high synchronization overhead (52%) from kernel-level locks on the client-side, and the server incurs high levels of processing overhead (45%) due to kernel-level lock operations.

Figure 9: Whitebox Results for CORBAplus

For each CORBA request/response, CORBAplus' client ORB performs 199 user-level lock operations, whereas the server performs 216 user-level lock operations, as shown in Figure 13. This locking overhead stems largely from excessive dynamic memory allocation, as described in Section 2.3: each dynamic allocation causes two user-level lock operations, i.e., one acquire and one release.

The CORBAplus connection and concurrency architectures are outlined briefly below.

• CORBAplus connection architecture: The CORBAplus ORB connection architecture multiplexes all requests to the same server through one active connection thread, which simplifies ORB implementations by using a uniform queueing mechanism.

• CORBAplus concurrency architecture: The CORBAplus ORB concurrency architecture uses the thread pool architecture with a single I/O thread to accept and read requests from socket endpoints. This thread inserts each request on a queue that is serviced by a pool of worker threads.

The CORBAplus connection architecture and the server concurrency architecture help reduce the number of simultaneous open connections and simplify the ORB implementation. However, concurrent requests to the shared connection incur high overhead since each send operation incurs a context switch. In addition, on the client-side, threads of different priorities can share the same transport connection, which can cause priority inversion; for instance, a high-priority thread may be blocked until a low-priority thread finishes sending its request. Likewise, the priority of the thread that blocks on the semaphore to receive a reply from a twoway connection may not reflect the priority of the request that arrives from the server, thereby causing additional priority inversion.

miniCOOL whitebox results: Our whitebox analysis of miniCOOL reveals that synchronization overhead from mutex and semaphore operations consumes a large percentage of the total miniCOOL ORB processing time. As with CORBAplus, synchronization overhead in miniCOOL arises from locking operations that implement its connection and concurrency architecture. Locking overhead accounted for 50% of the processing time on the client-side and more than 40% on the server-side, as shown in Figure 10.

Figure 10: Whitebox Results for miniCOOL

For each CORBA request/response, miniCOOL's client ORB performs 94 lock operations at the user-level, whereas the server performs 231 lock operations, as shown in Figure 13. As with CORBAplus, this locking overhead stems largely from excessive dynamic memory allocation: each dynamic allocation causes two user-level lock operations, i.e., one acquire and one release.

The number of calls per-request to kernel-level locking mechanisms at the server (shown in Figure 14) is unusually high. This overhead stems from the fact that miniCOOL uses "system scoped" threads on Solaris, which require kernel intervention for all synchronization operations [7]; the sketch below contrasts the two thread scopes.
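The sketch below (ours, for illustration) contrasts the two POSIX contention scopes. System-scoped (bound) threads are scheduled by the kernel, which is why their synchronization operations can require kernel entry; process-scoped (unbound) threads can often synchronize at user level. PTHREAD_SCOPE_PROCESS is unsupported on some platforms, which the sketch reports rather than hides.

// thread_scope.cpp: minimal sketch of POSIX thread contention scopes.
#include <pthread.h>
#include <cstdio>

static void *work(void *) { return nullptr; }

static void spawn_with_scope(int scope, const char *label) {
  pthread_attr_t attr;
  pthread_attr_init(&attr);
  if (pthread_attr_setscope(&attr, scope) != 0)
    std::printf("%s scope not supported on this platform\n", label);
  pthread_t tid;
  pthread_create(&tid, &attr, work, nullptr);
  pthread_join(tid, nullptr);
  std::printf("joined %s-scope thread\n", label);
}

int main() {
  spawn_with_scope(PTHREAD_SCOPE_SYSTEM, "system");    // bound: kernel-scheduled
  spawn_with_scope(PTHREAD_SCOPE_PROCESS, "process");  // unbound: library-scheduled
}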

The miniCOOL connection and concurrency architectures are outlined briefly below.

• miniCOOL connection architecture: The miniCOOL ORB connection architecture uses a "leader/follower" model that allows the leader thread to block in select on the shared socket. All following threads block on semaphores, waiting for one of two conditions: (1) the leader thread will read their reply message and signal their semaphore, or (2) the leader thread will read its own reply and signal another thread to enter and block in select, thereby becoming the new leader.

• miniCOOL concurrency architecture: The Sun miniCOOL ORB concurrency architecture also uses a leader/follower model that waits for connections in a single thread. Whenever a request arrives and validation of the request is complete, the leader thread (1) signals a follower thread in the pool to wait for incoming requests and (2) services the request.

The miniCOOL connection architecture and the server concurrency architecture help reduce the number of simultaneous open connections and the amount of context switching when replies arrive in FIFO order. As with CORBAplus, however, this design yields high levels of priority inversion. For instance, threads of different priorities can share the same transport connection on the client-side; therefore, a high-priority thread may block until a low-priority thread finishes sending its request. In addition, the priority of the thread that blocks on the semaphore to access a connection may not reflect the priority of the response that arrives from the server, which yields additional priority inversion.

MT-Orbix whitebox results: Figure 11 shows the whitebox results for the client-side and server-side of MT-Orbix.

Figure 11: Whitebox Results for MT-Orbix

• MT-Orbix connection architecture: Like miniCOOL, MT-Orbix uses the leader/follower multiplexed connection architecture. Although this model minimizes context switching overhead, it causes intensive priority inversions.

• MT-Orbix concurrency architecture: In the MT-Orbix implementation of our benchmarking testbed, multiple servant threads were created, each with the appropriate priority, i.e., the high-priority servant had the highest-priority thread. A thread filter was then installed to look at each request, determine the priority of the request (by examining the target object), and pass the request to the thread with the correct priority. The thread filter mechanism is implemented by a high-priority real-time thread to minimize dispatch latency.

The thread pool instantiation of the MT-Orbix mechanism is flexible and easy to use. However, it suffers from high levels of priority inversion and synchronization overhead. MT-Orbix provides only one filter chain; thus, all incoming requests must be processed sequentially by the filters before they are passed to the servant thread with an appropriate real-time priority. As a result, if a high-priority request arrives after a low-priority request, it must wait until the low-priority request has been dispatched before the ORB processes it.

In addition, a filter can only be called after (1) GIOP processing has completed and (2) the Object Adapter has determined the target object for this request. This processing is serialized since the MT-Orbix ORB Core is unaware of the request priority. Thus, a higher-priority request that arrives after a low-priority request must wait until the lower-priority request has been processed by MT-Orbix.

MT-Orbix's concurrency architecture is chiefly responsible for the substantial priority inversion shown in Figure 7. This figure shows how the latency observed by the high-priority client increases rapidly, growing from 2 msecs to 14 msecs as the number of low-priority clients increases from 1 to 50.

The MT-Orbix filter mechanism also causes an increase in synchronization overhead. Because there is just one filter chain, concurrent requests must acquire and release locks to be processed by the filter. The MT-Orbix client-side performs 175 user-level lock operations per-request, while the server-side performs 599 user-level lock operations per-request, as shown in Figure 13. Moreover, MT-Orbix displays a high number of kernel-level locks per-request, as shown in Figure 14.

TAO whitebox results: As shown in Figure 12, TAO exhibits negligible synchronization overhead. TAO performs 41 user-level lock operations per-request on the client-side, and 100 user-level lock operations per-request on the server-side.

Figure 12: Whitebox Results for TAO

This low amount of synchronization results from the design of TAO's ORB Core, which allocates a separate connection for each priority, as shown in Figure 6. Therefore, TAO's ORB Core minimizes additional user-level locking operations per-request and uses no kernel-level locks in its ORB Core.

• TAO connection architecture: TAO uses a non-multiplexed connection architecture, which pre-establishes connections to servants, as described in [4] and sketched after this list. One connection is pre-established for each priority level, thereby avoiding the non-deterministic delay involved in dynamic connection setup. In addition, different priority levels each have their own connection. This design avoids request-level priority inversion, which would otherwise occur from FIFO queueing across client threads with different priorities.

• TAO concurrency architecture: TAO supports several concurrency architectures, as described in [4]. The thread-per-priority architecture was used for the benchmarks in this paper. In this concurrency architecture, a separate thread is created for each priority level, i.e., each rate group. Thus, the low-priority client issues CORBA requests at a lower rate than the high-priority client (10 Hz vs. 20 Hz, respectively). On the server-side, client requests sent to the high-priority servant are processed by a high-priority real-time thread. Likewise, client requests sent to the low-priority servant are handled by the low-priority real-time thread. Locking overhead is minimized since these two servant threads share minimal ORB resources, i.e., they have separate Reactors, Acceptors, Object Adapters, etc. In addition, the two threads service separate client connections, thereby eliminating the priority inversion that would otherwise arise from connection multiplexing, as exhibited by the other ORBs we tested.
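The following minimal sketch (ours, not the TAO API) shows connection pre-establishment with plain BSD sockets: one connection per priority level is opened during initialization, so no request path incurs dynamic connection setup. The port-per-priority mapping is an assumed convention for illustration.

// preconnect.cpp: minimal sketch of per-priority connection pre-establishment.
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstdio>
#include <cstring>
#include <map>

// Connect one TCP socket per priority level at startup (assumed scheme:
// the server listens on a separate port per priority).
std::map<int, int> preconnect(const char *host, int base_port,
                              int n_priorities) {
  std::map<int, int> connection_per_priority;
  for (int prio = 0; prio < n_priorities; ++prio) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    sockaddr_in addr;
    std::memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(static_cast<unsigned short>(base_port + prio));
    inet_pton(AF_INET, host, &addr.sin_addr);
    if (connect(fd, reinterpret_cast<sockaddr *>(&addr), sizeof addr) == 0)
      connection_per_priority[prio] = fd;  // dedicated, never multiplexed
    else {
      std::perror("connect");
      close(fd);
    }
  }
  return connection_per_priority;
}

int main() {
  auto conns = preconnect("127.0.0.1", 20000, 4);
  std::printf("pre-established %zu connections\n", conns.size());
  for (auto &kv : conns) close(kv.second);
}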

Figure 13: User-level Locking Overhead in ORBs (user-level lock operations per-request, client-side and server-side)

Figure 14: Kernel-level Locking Overhead in ORBs (kernel-level lock operations per-request, client-side and server-side)

Locking overhead: Our whitebox tests measured user-level locking overhead (shown in Figure 13) and kernel-level locking overhead (shown in Figure 14) in the CORBAplus, MT-Orbix, miniCOOL, and TAO ORBs. User-level locks are typically used to protect shared resources within a process. A common example is dynamic memory allocation using the global C++ operators new and delete, which allocate memory from a globally managed heap in each process.

Kernel-level locks are more expensive since they typically require mode switches between user-level and the kernel. The semaphore and mutex operations depicted in the whitebox results for the ORBs evaluated above arise from kernel-level lock operations.

TAO limits user-level locking by using buffers that are pre-allocated off the run-time stack. These buffers are subdivided to accommodate the various fields of a request. Kernel-level locking is limited because TAO can be configured so that ORB resources are not shared between its threads.

2.3 Evaluation and Recommendations

The results of our benchmarks illustrate the non-deterministic performance incurred by applications running atop conventional ORBs. In addition, the results show that priority inversion and non-determinism are significant problems in conventional ORBs.
As a result, these ORBs are not currently suitable for applications with deterministic real-time requirements. Based on our results, and our prior experience [8, 9, 10, 11] measuring the performance of CORBA ORB endsystems, we suggest the following recommendations to decrease non-determinism and limit priority inversion in real-time ORB endsystems.

1. Real-time ORBs should avoid dynamic connection establishment: ORBs that establish connections dynamically suffer from high jitter; thus, the performance seen by individual clients can vary significantly from the average. Neither CORBAplus, miniCOOL, nor MT-Orbix provides APIs for pre-establishing connections; TAO provides these APIs as extensions to CORBA. We recommend that APIs to control the pre-establishment of connections be defined as an OMG standard for real-time CORBA [12, 13].

2. Real-time ORBs should avoid multiplexing requests of different priorities over a shared connection: Sharing connections requires synchronization; thus, high-priority requests can be blocked until low-priority threads release the shared connection lock. Moreover, priority inversion is exacerbated if multiple threads with multiple priority levels share common locks. For instance, medium-priority threads can preempt a low-priority thread that is holding a lock required by a high-priority thread, which can lead to unbounded priority inversion [3]. We recommend that real-time ORBs allow application developers to determine whether requests with different priorities are multiplexed over shared connections. Currently, neither miniCOOL, CORBAplus, nor MT-Orbix supports this level of control, though TAO provides this model by default.

3. Real-time ORBs should minimize dynamic memory allocation: Thread-safe implementations of dynamic memory allocators require user-level locking. For instance, the C++ new operator allocates memory from a global pool shared by all threads in a process. Likewise, the C++ delete operation, which releases allocated memory, also requires user-level locking to update the global shared pool. This lock sharing contributes to the overhead shown in Figure 13. We recommend that real-time ORBs avoid excessive sharing of dynamic memory locks via the use of OS features such as thread-specific storage [14], which allocates memory from separate heaps that are unique to each thread, as illustrated by the sketch below.
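A minimal sketch of this idea follows, using C++ thread_local as the thread-specific storage mechanism (our illustration, not the full pattern implementation from [14]). Each thread draws from its own arena, so allocation takes no process-wide lock, unlike the global heap behind operator new and delete.

// tss_alloc.cpp: minimal sketch of a thread-specific allocator.
#include <cstddef>
#include <cstdio>
#include <thread>

class ThreadArena {
  static constexpr std::size_t kSize = 64 * 1024;
  alignas(std::max_align_t) char buf_[kSize];
  std::size_t used_ = 0;
public:
  void *allocate(std::size_t n) {
    // Round up to keep every returned pointer maximally aligned.
    n = (n + alignof(std::max_align_t) - 1) & ~(alignof(std::max_align_t) - 1);
    if (used_ + n > kSize) return nullptr;  // a real allocator would refill
    void *p = buf_ + used_;
    used_ += n;
    return p;
  }
};

// One arena per thread: no lock is ever taken on the allocation path.
thread_local ThreadArena arena;

int main() {
  auto worker = [] {
    void *p = arena.allocate(128);      // touches only this thread's arena
    std::printf("thread got %p\n", p);
  };
  std::thread t1(worker), t2(worker);
  t1.join();
  t2.join();
}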

4. Real-time ORB concurrency architectures should be flexible, yet efficient and predictable: Many ORBs, such as miniCOOL and CORBAplus, create threads on behalf of server applications. This design prevents application developers from customizing ORB performance by selecting a custom concurrency architecture. Conversely, other ORB concurrency architectures are flexible, but inefficient and non-deterministic, as shown by Section 2.2.2's explanation of the MT-Orbix performance results. Thus, a balance is needed between flexibility and efficiency. We recommend that real-time ORBs provide APIs that allow application developers to select concurrency architectures that are flexible, efficient, and predictable. For instance, TAO offers a range of concurrency architectures (such as thread-per-priority, thread pool, and thread-per-connection) that can selectively use thread-specific storage to minimize unnecessary sharing of ORB resources.

5. Real-time ORB endsystem architectures should be guided by empirical performance benchmarks: Our prior research on pinpointing performance bottlenecks and optimizing middleware like Web servers [15, 16] and CORBA ORBs [9, 8, 11, 10] demonstrates the efficacy of a measurement-driven research methodology. We recommend that the OMG adopt standard real-time CORBA benchmarking techniques and metrics. These benchmarks will simplify the communication and comparison of performance results and real-time ORB behavior patterns. The real-time ORB benchmarking test suite described in this section is available at www.cs.wustl.edu/schmidt/TAO.html.

3 Concluding Remarks

Conventional CORBA ORBs exhibit substantial priority inversion and non-determinism. Consequently, they are not yet suitable for distributed, real-time applications with deterministic QoS requirements. Meeting these demands requires that ORB Core software architectures be designed to reduce priority inversion and increase end-to-end determinism.

The TAO ORB Core described in this paper reduces priority inversion and enhances determinism by using a priority-based concurrency architecture and a non-multiplexed connection architecture that share a minimal amount of resources among threads. The architectural principles used in TAO can be applied to other ORBs and other real-time software systems.

TAO has been used to develop a number of real-time applications, including a real-time audio/video streaming service [17] and a real-time ORB endsystem for avionics mission computing [6]. The avionics application manages sensors and operator displays, navigates the aircraft's course, and controls weapon release. To meet the scheduling demands of real-time applications, TAO supports predictable scheduling and dispatching of periodic processing operations [1], as well as efficient event filtering and correlation mechanisms [6].

The C++ source code for TAO and ACE is freely available at www.cs.wustl.edu/schmidt/TAO.html.

Acknowledgments

We gratefully acknowledge Expersoft, IONA, and Sun for providing us with their ORB software for the benchmarking testbed. In addition, we would like to thank Frank Buschmann and Bil Lewis for their comments on this paper.

References

[1] D. C. Schmidt, D. L. Levine, and S. Mungee, "The Design and Performance of Real-Time Object Request Brokers," Computer Communications, vol. 21, pp. 294-324, Apr. 1998.

[2] S. Vinoski, "CORBA: Integrating Diverse Applications Within Distributed Heterogeneous Environments," IEEE Communications Magazine, vol. 14, February 1997.

[3] S. Khanna et al., "Realtime Scheduling in SunOS 5.0," in Proceedings of the USENIX Winter Conference, pp. 375-390, USENIX Association, 1992.

[4] D. C. Schmidt, R. Bector, D. Levine, S. Mungee, and G. Parulkar, "An ORB Endsystem Architecture for Statically Scheduled Real-time Applications," in Proceedings of the Workshop on Middleware for Real-Time Systems and Services, (San Francisco, CA), IEEE, December 1997.

[5] PureAtria Software Inc., Quantify User's Guide, 1996.

[6] T. H. Harrison, D. L. Levine, and D. C. Schmidt, "The Design and Performance of a Real-time CORBA Event Service," in Proceedings of OOPSLA '97, (Atlanta, GA), ACM, October 1997.

[7] J. Eykholt, S. Kleiman, S. Barton, R. Faulkner, A. Shivalingiah, M. Smith, D. Stein, J. Voll, M. Weeks, and D. Williams, "Beyond Multiprocessing... Multithreading the SunOS Kernel," in Proceedings of the Summer USENIX Conference, (San Antonio, Texas), June 1992.

[8] A. Gokhale and D. C. Schmidt, "Evaluating the Performance of Demultiplexing Strategies for Real-time CORBA," in Proceedings of GLOBECOM '97, (Phoenix, AZ), IEEE, November 1997.

[9] A. Gokhale and D. C. Schmidt, "Measuring the Performance of Communication Middleware on High-Speed Networks," in Proceedings of SIGCOMM '96, (Stanford, CA), pp. 306-317, ACM, August 1996.

[10] A. Gokhale and D. C. Schmidt, "The Performance of the CORBA Dynamic Invocation Interface and Dynamic Skeleton Interface over High-Speed ATM Networks," in Proceedings of GLOBECOM '96, (London, England), pp. 50-56, IEEE, November 1996.

[11] A. Gokhale and D. C. Schmidt, "Evaluating Latency and Scalability of CORBA Over High-Speed ATM Networks," in Proceedings of the International Conference on Distributed Computing Systems, (Baltimore, Maryland), IEEE, May 1997.

[12] Object Management Group, Minimum CORBA - Request for Proposal, OMG Document orbos/97-06-14 ed., June 1997.

[13] Object Management Group, Realtime CORBA 1.0 Request for Proposals, OMG Document orbos/97-09-31 ed., September 1997.

[14] D. C. Schmidt, T. Harrison, and N. Pryce, "Thread-Specific Storage - An Object Behavioral Pattern for Accessing per-Thread State Efficiently," in The 4th Pattern Languages of Programming Conference (Washington University technical report #WUCS-97-34), September 1997.

[15] J. Hu, I. Pyarali, and D. C. Schmidt, "Measuring the Impact of Event Dispatching and Concurrency Models on Web Server Performance Over High-speed Networks," in Proceedings of the 2nd Global Internet Conference, IEEE, November 1997.

[16] J. Hu, S. Mungee, and D. C. Schmidt, "Principles for Developing and Measuring High-performance Web Servers over ATM," in Proceedings of INFOCOM '98, March/April 1998.

[17] Object Management Group, Control and Management of Audio/Video Streams: OMG RFP Submission, 1.2 ed., Mar. 1997.