On Transactional Memory in Distributed Real-Time Programs

Sachin Hirve∗, Aaron Lindsay†, Binoy Ravindran∗ and Roberto Palmieri∗ Bradley Department of Electrical and Computer Engineering Virginia Tech, Blacksburg, Virginia 24060 Email: ∗{hsachin, binoy, robertop}@vt.edu, †[email protected]

Abstract—We consider distributed transactional memory (DTM) for concurrency control in distributed real-time programs, and present an algorithm called RT-TFA. RT-TFA transparently handles object relocation and versioning using an asynchronous clock-based validation technique, and resolves transactional contention using task time constraints. We implement RT-TFA on top of JChronOS, a layer extending the scheduling capabilities of ChronOS for Java programs. We conduct an extensive evaluation study comparing RT-TFA with well-known competitors for real-time distributed applications. Our results reveal that RT-TFA outperforms these competitors in most scenarios by up to 43%, with the added advantage of better programmability and composability.

I. INTRODUCTION

Distributed embedded software is inherently concurrent, as it monitors and controls concurrent physical processes. Often, its computations need to concurrently access (i.e., read, write) shared data objects, which must be properly coordinated so that consistency properties (e.g., [1], serializability [2]) can be ensured. Furthermore, it must satisfy application time constraints. The usual way to manage the concurrency of different processes in a system is to use locks, which inherently suffer from programmability, scalability, and composability challenges [3]. Additionally, implementations of complex algorithms based on mutual exclusion are hard to debug, resulting in longer development time.

Software transactional memory (STM) [4] is a novel synchronization abstraction that promises to alleviate these difficulties. Leveraging the proven concept of atomic and isolated transactions, STM spares programmers from the pitfalls of conventional lock-based synchronization, significantly simplifying the development of parallel and concurrent applications. With STM, in-memory operations are organized as memory transactions that read/write shared objects. These transactions execute optimistically, logging object changes in a private space. Two transactions conflict if they access the same object and at least one access is a write. When that happens, a contention manager resolves the conflict by aborting one transaction and committing the other, yielding (the illusion of) atomicity. Aborted transactions are rolled back and restarted.

The challenges of lock-based concurrency control are exacerbated in distributed systems, due to the additional complexity of multi-computer concurrency (e.g., distributed deadlock, priority inversion, and composability challenges). Distributed TM (or DTM) [5]–[7] has been similarly motivated as an alternative to distributed locks. DTM can be supported in any distributed execution model, with concomitant tradeoffs, including a) control flow [8], where objects are immobile and transactions invoke object operations through RMIs/RPCs, and b) dataflow [9], where transactions are immobile and objects are migrated to invoking transactions.

Given DTM's programmability, scalability, and composability advantages, it is also a compelling abstraction for distributed embedded real-time programs. In fact, developing distributed concurrent applications with time constraints is several orders of magnitude harder than developing conventional, non-real-time distributed applications. DTM is a potential solution; however, to cope with time constraints, it requires bounding (distributed) transactional retries, as the (distributed) real-time tasks that subsume transactions must satisfy deadlines. Bounding transactional retries, in turn, requires resolving transactional contention using time constraints.

Motivated by these needs, in this paper we present RT-TFA, the Real-Time Transaction Forwarding Algorithm. RT-TFA extends the distributed concurrency control scheme called TFA [10], used in recent works such as [11]–[13], with real-time contention management. RT-TFA has been implemented on top of ChronOS [14], which enables only part of a thread to execute under real-time constraints, in contrast to other existing real-time OSes (e.g., Integrity, µITRON), where the whole thread executes under a real-time constraint. To allow Java applications to define time constraints and inject them into ChronOS, we designed and implemented JChronOS, a user-space library that provides the hooks to interface with the ChronOS kernel from Java applications.

We assess the performance of RT-TFA through an extensive evaluation study, comparing our approach with widely used protocols for implementing real-time distributed applications. Our results show that RT-TFA outperforms these competitors in most scenarios by up to 43%. To the best of our knowledge, RT-TFA is the first algorithm that enables DTM concurrency control in real-time programs.

II. PRELIMINARIES AND ASSUMPTIONS

We consider a distributed system of N nodes, where each node has one or more processing cores. We consider a programming model similar to Real-Time CORBA [15]: a distributed task (or simply, task, hereafter) is programmed as a thread that may read/write local as well as remote objects. Unlike CORBA, which is based on control-flow execution, we consider Herlihy and Sun's dataflow execution model [16], where threads are immobile and objects are (transparently) migrated to invoking threads using a directory-based lookup protocol (e.g., [16]). A time constraint on a task is expressed using a scheduling segment (as in CORBA), which is a task code segment subject to the time constraint.

We assume n (≠ N) sporadic tasks τ1, τ2, ..., τn, with a minimum period of t(τi). A task τi's mth instance is denoted τi^m, and has a relative deadline D(τi^m) = t(τi). Since we focus on real-time (distributed) tasks, for the rest of the paper we restrict our attention to code inside task scheduling segments. Each task's scheduling segment may contain non-atomic as well as atomic code sections. The worst-case execution time (WCET) ci of a task τi is the sum of the worst-case execution times of its non-atomic sections (Ci) and its atomic sections (si). Each object θ can be accessed by multiple atomic sections (and therefore by multiple tasks). An atomic section may involve reading/writing (one or more) local/remote shared objects. An atomic section's length is the sum of the execution costs of: 1) acquisition of objects from remote nodes, 2) local modifications on objects, 3) acquisition of locks from remote nodes, and 4) the communication cost ρ for remote requests. If one or more atomic sections conflict, the conflict is resolved by committing one section and aborting/retrying the others, which increases the time to commit the aborted ones. The retry cost RC(τi) is the total time that a task τi executes its aborted atomic sections. The response time R(τi) of τi is the sum of τi's WCET ci, its retry cost RC(τi), and the interference caused to τi by other tasks.

We assume a network with bounded communication delay (e.g., [17]), denoted ρmax. Node clocks are synchronized (e.g., [18]), and clock drift is bounded by δmax. Each task is assumed to have a programmer-supplied exception handler. We consider a termination model [19] for task failures, including those due to time-constraint violations. If a task has missed its deadline, the handler is immediately executed on all nodes where the task may be accessing objects. The handler is assumed to perform the necessary recovery actions.

III. OVERVIEW OF TFA

For completeness, we overview TFA. TFA [20] builds on the TL2 algorithm proposed for multiprocessor TM [21]. It is a dataflow-based, distributed transaction management algorithm that provides the atomicity, consistency, and isolation properties. Under TFA, operations on distributed objects are buffered, and locks on objects are acquired at commit time. On successful acquisition of the locks, the objects are updated. Otherwise, the transaction is aborted by releasing all previously acquired locks, and retried.

In contrast to TL2's central clock, TFA uses independent, per-node transactional clocks and provides a mechanism to establish the "happens before" relationship between significant events (e.g., write-after-write, read-after-write). Upon a transaction's successful commit, a node increments its local clock. An object's version is defined by the local clock at the time of the object's last modification. When a local object is accessed by a transaction, as part of validation, the object version is compared with the transaction's starting time. If the object's version is newer, the transaction is aborted and retried.

For validating remote objects, TFA employs a technique called "transaction forwarding": when a transaction requests access to a remote object, the local clock is piggybacked with the request to the remote node. The remote node advances its clock to the sender's clock if its own clock is older; otherwise, no update is made to the remote clock. The remote node then sends the object copy with its clock value. Upon receipt, the local node (i.e., the sender) compares the remote clock value with the transaction starting time. If the remote clock is newer, the transaction's read-set is validated by checking whether any other object in the read-set has been updated to a version newer than the transaction starting time. If the read-set validation succeeds, the transaction starting time is advanced to the remote clock value (i.e., "forwarded"). Otherwise, the transaction is aborted and re-issued.

When a transaction reaches the commit stage, it first acquires locks on all the objects in its write-set. On successful lock acquisition, the objects are updated and the committing transaction's node is published as the new host of the updated objects. If lock acquisition fails for any object, all acquired locks are released, and the transaction is aborted and re-issued.

Illustrative Example. Consider three nodes, N1, N2, and N3, running transactions T1, T2, and T3, respectively (see Figure 1). N2 hosts object θ and is considered θ's "owner." Nodes maintain a local transactional clock, which is incremented when transactions running on them commit. T3 starts at local clock lc3 = 12 and requests θ from N2 at lc3 = 16. N2 compares the received clock value with its local clock lc2 = 12 and advances its clock to lc2 = 16. After some time, T1 starts at local clock lc1 = 14 and requests θ from N2 at lc1 = 19. At N2, no change is made to its local clock lc2, since rc = 19 < lc2 = 21. When N1 receives the response from N2, it observes that lc1 = 19 < rc = 21. Therefore, it forwards T1 to start at lc1 = 21 and validates all other objects accessed earlier by T1. Later, T3 acquires the lock on θ at N2 and updates θ at local clock lc3 = 25. Now, N3 holds the ownership of θ, but θ remains locked at N2. When T1 tries to acquire the lock on θ, it may find N2 or N3 as the object owner, depending on when N3 successfully publishes its ownership of θ. If T1 requests the lock on θ from N2, it fails due to the existing lock on θ, and aborts. Otherwise, if T1 finds N3 as the owner, it acquires the lock, but fails during read-set validation. Therefore, T1 releases the lock acquired on θ at N3 and aborts. T1 retries and requests θ again from the new owner N3. Concurrently, another transaction T2 also receives θ, but T1 acquires the lock on θ earlier than T2 and commits. As a result, T2 aborts and retries. T2 finally commits at lc2 = 44 and becomes the new object owner.

IV. RT-TFA

Overview. Since TFA is agnostic to task time constraints, it can cause priority inversions, causing higher-priority tasks to miss their time constraints. For example, in Figure 1, if the deadlines d1, d2, and d3 of the parent tasks of transactions T1, T2, and T3, respectively, are such that d1 > d2 > d3, then it is possible that the transactions will commit in an order different from that of their deadlines. In Figure 1, even though T2's urgency is greater than T1's, T1 commits earlier than T2. Therefore, T2 may not be able to meet its deadline, even though there might be enough slack time for T1 to commit later.

RT-TFA extends TFA to support transactions that execute under time constraints. Under RT-TFA, task scheduling segments are scheduled at each node using EDF. Transactions inherit the deadlines of their parent tasks: when atomic sections attempt to access and modify local or remote shared objects, they may be subject to object access conflicts. These conflicts are resolved using the deadlines of the sections' parent tasks.

RT-TFA inherits all properties of TFA: objects are acquired at encounter time, transaction clock values are exchanged, and transaction forwarding is conducted.
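To make the forwarding rule concrete, the following sketch models the check a node performs when a reply for a remote object arrives. It is our own minimal illustration, not HyFlow's or TFA's actual code; the class and field names are invented, and a read-set entry simply records the version currently visible for an object the transaction has read.

```java
import java.util.List;

// Sketch of TFA's remote-read validation and "transaction forwarding".
// Illustrative only: names and structure are ours, not HyFlow's API.
class ForwardingSketch {
    // A read-set entry: the version currently visible for an object we read.
    static class ReadEntry {
        final String objectId;      // the shared object's identifier
        final long currentVersion;  // its latest committed version
        ReadEntry(String objectId, long currentVersion) {
            this.objectId = objectId;
            this.currentVersion = currentVersion;
        }
    }

    static final long ABORT = -1;

    // Called when the reply for a remote object arrives. txStart is the
    // transaction's starting time (the local clock when it began); rc is the
    // clock value piggybacked by the remote node. Returns the (possibly
    // forwarded) starting time, or ABORT if read-set validation fails.
    static long onRemoteReply(long txStart, long rc, List<ReadEntry> readSet) {
        if (rc <= txStart) {
            return txStart;                 // remote clock not newer: nothing to do
        }
        for (ReadEntry e : readSet) {       // validate: nothing we read may have changed
            if (e.currentVersion > txStart) {
                return ABORT;               // a read object was updated: abort and re-issue
            }
        }
        return rc;                          // validation passed: forward start time to rc
    }
}
```

With the clock values of the example above (start time lc1 = 19, remote clock rc = 21), an unchanged read-set yields a forwarded starting time of 21.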
However, in addition to the local clock, the object requester also sends the deadline of its transaction (when the object is remote). The object holder maintains a list of all the requesters of the shared object in a deadline-ordered queue, and grants the object to the earliest-deadline transaction. If an earlier-deadline transaction Ta requests an object θ that was previously granted to a later-deadline transaction Tb, then Tb is aborted and θ is granted to Ta. Tb joins the deadline-ordered wait queue for θ. When Ta commits, it signals Tb to retry.

Fig. 1. Example transaction execution under TFA.

After acquiring the object, a transaction moves forward to acquire the remaining objects or to make changes to locally buffered objects. Shared objects are updated after the transaction acquires locks on them. A committing transaction signals the next transaction in the (deadline-ordered) wait queue to resume, which then retries the object acquisition from the new object owner. In this way, transactions commit in their deadline order.

Illustrative Example. Nodes N1, N2, and N3, which are clock-synchronized, run tasks τ1, τ2, and τ3, respectively. The absolute deadlines of these tasks are in the order d1 > d2 > d3. N3 hosts an object θ and is its owner.

τ1 starts executing on N1 and contains a transaction T1. T1 starts a read or write operation on θ, as shown in Figure 2. T1 sends the request to access θ to N3, and piggybacks its transaction clock and absolute deadline (lc1, d1) with the request. When N3 receives this request, it replies with its remote clock (rc3) and adds this entry to its object ID/deadline list map <ID(θ), list(d(τi))>. A little later, τ2 at N2 initiates transaction T2 on θ and sends a request for θ to N3. On receiving this request, N3 finds a later-deadline transaction T1 accessing θ. N3 replies to N2 with rc3 and sends a request to N1 to abort T1. On receiving this request at N1, T1 is aborted and τ1 is suspended. Later, when T2 commits on θ, it sends a request to resume τ1, whereupon T1 retries and requests access to θ from the new host N2. Meanwhile, τ3 at N3 starts T3 on θ and sends a request for θ to N2. N2 replies with its clock (rc2) and adds this entry to its object ID/deadline list map. Later, when N2 receives a request from T1, it declines the request, since T1 has a later deadline than T3. On receiving the response, T1 is aborted and τ1 is again suspended. When T3 commits, τ1 resumes. Finally, T1 receives θ from N2 and commits.

Transactions are committed in deadline order (T2 before T1, and T3 before T1) only if there is contention for a shared object. On the other hand, a later-deadline task may start early, depending on its arrival pattern, and commit on an object before an earlier-deadline task attempts to access that object (e.g., T2 commits earlier than T3).

Fig. 2. Example distributed real-time transaction execution under RT-TFA.

V. ARCHITECTURE

The nodal architecture of RT-TFA's implementation consists of a stack of the ChronOS Linux kernel, JChronOS, the JVM, RT-TFA, and the application. In the following, we describe the core modules of this architecture.

ChronOS. ChronOS [14] is derived from the 2.6.33.9 version of the Linux kernel. It uses the CONFIG_PREEMPT_RT real-time patch [22], which enables complete preemption in the Linux kernel and improves interrupt latencies. ChronOS provides a set of APIs and a scheduler plugin infrastructure that can be used to implement and evaluate a variety of single- and multiprocessor scheduling algorithms. The ChronOS real-time scheduler is implemented as an extension of the Linux O(1) scheduler. ChronOS includes implementations of various single-processor (e.g., EDF, DASA, RMA) and multiprocessor (e.g., G-EDF, NG-GUA, G-GUA) scheduling algorithms. Single-processor schedulers in ChronOS support priority inheritance to address priority inversion.

In ChronOS, a time constraint on a thread is expressed using the notion of a scheduling segment (introduced in Section II). A real-time application in ChronOS specifies the start and end of a scheduling segment. Scheduling segments of a task are identified using special APIs, and their boundaries are scheduling events. A scheduling event is defined as a trigger that forces the system into a scheduling cycle, resulting in a call to the scheduler, where new task(s) are selected based on the scheduling algorithm. In ChronOS, we define four scheduling events: the beginning of a scheduling segment, the end of a scheduling segment, a resource request, and a resource release.

Under different load conditions, real-time segments may violate their time constraints. Until they finish their execution, such segments may block the execution of other real-time segments. ChronOS supports a thread-abort mechanism by which the application is notified about threads with expired time constraints. On receiving this information, the application can run the abort handler for those threads, thereby terminating expired real-time segments. RT-TFA uses this mechanism to abort transactions when they violate their thread's deadline.
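The deadline-ordered contention management described above can be sketched, for a single object and ignoring distribution and clocks, as follows. The paper shows no code, so the names and structure here are our own minimal model: requests carry the parent task's absolute deadline, the object is granted to the earliest-deadline requester, and a later-deadline holder is aborted (and re-queued) when an earlier-deadline request arrives.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Our own sketch of one object holder's deadline-ordered request queue
// under RT-TFA-style contention management (illustrative, not the paper's code).
class DeadlineQueueSketch {
    static class Request {
        final String txId;
        final long deadline;   // absolute deadline of the parent task
        Request(String txId, long deadline) { this.txId = txId; this.deadline = deadline; }
    }

    // Waiters ordered by earliest absolute deadline first.
    private final PriorityQueue<Request> waiters =
            new PriorityQueue<>(Comparator.comparingLong((Request r) -> r.deadline));
    private Request holder;    // transaction currently granted the object

    // Handle an incoming request; returns the id of a transaction that must
    // be aborted (the preempted later-deadline holder), or null.
    String request(Request r) {
        if (holder == null) { holder = r; return null; }   // object free: grant it
        if (r.deadline < holder.deadline) {                // earlier deadline wins
            Request preempted = holder;
            holder = r;
            waiters.add(preempted);                        // aborted tx joins the wait queue
            return preempted.txId;
        }
        waiters.add(r);                                    // later deadline waits its turn
        return null;
    }

    // Called when the holder commits: grant the object to the next earliest-
    // deadline waiter and return its id (the transaction signalled to retry).
    String commit() {
        holder = waiters.poll();
        return holder == null ? null : holder.txId;
    }

    String holderId() { return holder == null ? null : holder.txId; }
}
```

Mirroring the example: if T1 (later deadline) holds θ when T2 (earlier deadline) requests it, T1 is aborted and re-queued; when T2 commits, T1 is signalled to retry.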

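As a rough illustration of how an application might bracket a time-constrained region with the begin/end scheduling events described above, consider the sketch below. The RtScheduler interface and its method names are hypothetical stand-ins of our own; they are not ChronOS's or JChronOS's actual API.

```java
// Hypothetical sketch of bracketing work with a scheduling segment.
// The RtScheduler interface is invented for illustration; in ChronOS the
// begin/end calls would trap into the kernel as scheduling events.
class SegmentSketch {
    interface RtScheduler {
        void beginSegment(long absDeadline);  // scheduling event: segment start
        void endSegment();                    // scheduling event: segment end
    }

    // Runs work inside a scheduling segment. The finally block mirrors the
    // requirement that the segment-end event fires even if the
    // time-constrained (possibly transactional) code is aborted.
    static void runSegment(RtScheduler sched, long absDeadline, Runnable work) {
        sched.beginSegment(absDeadline);
        try {
            work.run();
        } finally {
            sched.endSegment();
        }
    }
}
```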
JChronOS. In this section, we present JChronOS, a library that extends the scheduling capabilities of ChronOS for Java programs. This is accomplished by accessing the existing ChronOS library, libchronos, through JNI. libchronos is written in C and interacts with the ChronOS scheduler through system calls to initialize the real-time scheduler, begin or end a real-time segment, acquire a ChronOS mutex, etc.

JChronOS consists of two components. The first component, written in Java, provides the necessary interfaces to Java applications. It enables Java applications to define scheduling segments and to acquire resources simply by importing the classes provided by its interface; these classes are distributed in a jar file, libjchronos.jar. When invoked, these methods call the other component of JChronOS, the native library, through JNI. The second component, maintained in Linux as the shared object libjchronos.so, provides a mechanism for converting data structures between their Java forms and those required by libchronos, the existing ChronOS library, and its system calls. Once it has converted any arguments appropriately, libjchronos.so calls libchronos.so, marshals the returned data into Java objects, and returns from the JNI call.

VI. IMPLEMENTATION AND EVALUATION

Implementation. We implemented RT-TFA in the HyFlow Java DTM framework [23], which provides pluggable support for policies for directory lookup, transactional synchronization and recovery, and contention management. We integrated this implementation with the ChronOS real-time Linux kernel. We modified HyFlow to transparently support distributed real-time transactions: atomic sections defined within scheduling segments transparently use thread time constraints for local and remote contention management.

Experimental setting. We used a 14-node testbed, where each node has an AMD Opteron 1.9GHz processor. The average network delay is less than 1 ms. Each node ran a set of periodic tasks. Each task is composed of a fixed-duration non-atomic section and atomic sections. In the experiments, 5 tasks were fired at each node, resulting in 70 concurrent tasks over 50 Bank objects. Tasks were defined such that they contained atomic sections with local and remote read/write operations, resulting in time-critical, distributed transactions. Table I shows the task details.

TABLE I. TASK PROPERTIES

Task                  | T1  | T2   | T3   | T4   | T5
Period (ms)           | 500 | 1000 | 1500 | 3000 | 5000
Number of transfers   | 1   | 1    | 1    | 1    | 1
Bank objects accessed | 2   | 2    | 2    | 2    | 2

We used a well-known benchmark for STM and DTM called Bank, a monetary application. It maintains a set of accounts distributed over bank branches, and contains two transactions: A) a write operation, i.e., transfer, which transfers a given amount between two accounts, and B) a read operation, i.e., total balance, which computes the total balance of given accounts. The object size and the number of operations per object during a transaction are configurable.

We considered three competitors for RT-TFA, all of them using Java RMI and a version of the classic two-phase locking protocol (2PL) [24]. Our first competitor (2PL) uses mutual exclusion locks to guard critical sections. The second implementation (2PLRW) uses readers-writer locks. The last version (2PLPI) uses mutual exclusion locks with distributed priority inheritance, i.e., the (dynamic) priorities of transactions are propagated [25] with the transactions to resolve contention on remote shared objects. The main competitor in our comparison is 2PLPI, which is widely used in distributed real-time systems for lock-based synchronization. Though it does not provide a guarantee against deadlock, it reduces priority inversions by inheriting the priority of the highest-priority blocked task. The other competitors further highlight the blocking duration experienced by distributed tasks. We do not consider Distributed Priority Ceiling [26], because it requires prior information about the object access pattern, whereas our application imposes no such limitation.

Results. Figures 3(a) and 3(b) show the deadline satisfaction ratio (DSR) and the throughput, respectively, of RT-TFA and the three 2PL versions for different percentages of read operations. All other parameters are fixed: 1 transaction per task, 14 nodes, and 50 Bank objects. DSR is the fraction of the total number of tasks that met their deadlines, and throughput is the total number of successfully committed transactions per second.

RT-TFA yields a DSR comparable to 2PL with priority inheritance (2PLPI) and a higher DSR than the other versions of 2PL. RT-TFA underachieves against 2PLPI for lower percentages (e.g., below roughly 35%) of read operations, due to the higher retry cost of aborted transactions. As the percentage of read operations increases, RT-TFA outperforms all three versions of 2PL, which is directly due to optimistic concurrency control: RT-TFA does lazy locking and acquires locks at the commit stage. In contrast, 2PL serializes access to objects by acquiring the locks before entering the critical section, and therefore suffers a higher cost of priority inversion.

Fig. 3. DSR and throughput (0-100% read operations).

Next, we evaluated RT-TFA for different contention levels, resulting from varying the number of objects with a fixed number of nodes and transactions. The DSR and throughput plots are omitted due to space constraints, but we describe our observations. RT-TFA's DSR improves gradually; this is due to the reduced contention from the increased number of objects, owing to which RT-TFA incurs a lower cost of transaction aborts and retries. RT-TFA outperforms all competitors except 2PLPI for lower percentages (less than 35%) of read operations, where it suffers from a higher number of aborted and retried transactions.

VII. CONCLUSIONS

We presented the RT-TFA algorithm for supporting DTM in distributed real-time programs. The algorithm inherits TFA's core mechanisms for transactional synchronization and extends them with real-time contention management. We built RT-TFA on a distributed, real-time OS-based infrastructure called ChronOS, and provided JChronOS, a layer enabling direct interaction between RT-TFA and ChronOS. We implemented RT-TFA in the HyFlow DTM framework and compared it with different protocols usually adopted for developing distributed real-time applications. Our results revealed that RT-TFA yields deadline satisfaction comparable to 2PLPI and outperforms the other 2PL-based locking protocols (by as much as 43%). Finally, from these results we can state that DTM can be supported in distributed real-time programs, allowing programmers to reap DTM's better programmability and composability properties.

ACKNOWLEDGMENT

This work is supported in part by US National Science Foundation under grants CNS 0915895, CNS 1116190, CNS 1130180, and CNS 1217385.

REFERENCES

[1] J. Aspnes, M. Herlihy, and N. Shavit, "Counting networks," J. ACM, vol. 41, no. 5, pp. 1020–1048, 1994.
[2] P. A. Bernstein, V. Hadzilacos, and N. Goodman, Concurrency Control and Recovery in Database Systems. Addison-Wesley, 1986.
[3] M. Herlihy, V. Luchangco, and M. Moir, "A flexible framework for implementing software transactional memory," in OOPSLA, 2006.
[4] J. Larus and R. Rajwar, Transactional Memory. Morgan & Claypool, 2006.
[5] S. Peluso, P. Romano, and F. Quaglia, "SCORe: A scalable one-copy serializable partial replication protocol," in Middleware, 2012.
[6] R. Palmieri, F. Quaglia, and P. Romano, "OSARE: Opportunistic speculation in actively replicated transactional systems," in SRDS, 2011.
[7] M. Couceiro, P. Romano, N. Carvalho, and L. Rodrigues, "D2STM: Dependable distributed software transactional memory," in Proc. of PRDC. IEEE Computer Society Press, 2009.
[8] K. Arnold, R. Scheifler et al., Jini Specification. Addison-Wesley, 1999.
[9] E. Tilevich and Y. Smaragdakis, "J-Orchestra: Automatic Java application partitioning," in ECOOP, 2002.
[10] M. M. Saad and B. Ravindran, "Transactional forwarding: Supporting highly-concurrent STM in asynchronous distributed systems," in SBAC-PAD, 2012, pp. 219–226.
[11] A. Turcu and B. Ravindran, "On open nesting in distributed transactional memory," in SYSTOR, 2012.
[12] J. Kim, R. Palmieri, and B. Ravindran, "Scheduling open-nested transactions in distributed transactional memory," in COORDINATION. Springer, 2013.
[13] A. Turcu and B. Ravindran, "On closed nesting in distributed transactional memory," in TRANSACT, 2012.
[14] M. Dellinger, P. Garyali, and B. Ravindran, "ChronOS Linux: A best-effort real-time multiprocessor Linux kernel," in DAC. ACM, 2011.
[15] V. Fay-Wolfe, L. C. DiPippo, G. Copper, R. Johnston, P. Kortmann, and B. Thuraisingham, "Real-time CORBA," TPDS, Oct. 2000.
[16] M. Herlihy and Y. Sun, "Distributed transactional memory for metric-space networks," Distributed Computing, 2007.
[17] Robert Bosch GmbH, "CAN Specification, 2.0 ed.," 1991.
[18] D. Mills, "Network time protocol (version 3) specification, implementation," United States, 1992.
[19] E. Curley, J. Anderson, B. Ravindran, and E. D. Jensen, "Recovering from distributable thread failures with assured timeliness in real-time distributed systems," SRDS, vol. 0, pp. 267–276, 2006.
[20] M. Saad and B. Ravindran, "Transactional Forwarding Algorithm," Virginia Tech, Tech. Rep., 2010, available at http://hyflow.org/hyflow/chrome/site/pub/tfa tech.pdf.
[21] D. Dice, O. Shalev, and N. Shavit, "Transactional Locking II," in DISC, 2006.
[22] Real-time Linux Wiki, "RT-PREEMPT kernel patch for Linux," available at https://rt.wiki.kernel.org.
[23] M. Saad and B. Ravindran, "Supporting STM in distributed systems: Mechanisms and a Java framework," in TRANSACT, 2011.
[24] P. Felber and M. K. Reiter, "Advanced concurrency control in Java," Concurrency and Computation: Practice and Experience, 2002.
[25] I. Pyarali, D. C. Schmidt, and R. Cytron, "Techniques for enhancing real-time CORBA quality of service," Proceedings of the IEEE, 2003.
[26] R. Rajkumar, L. Sha, and J. P. Lehoczky, "Real-time synchronization protocols for multiprocessors," in RTSS, 1988, pp. 259–269.