An Efficient Semaphore Implementation Scheme for Small-Memory Embedded Systems*

Khawar M. Zuberi and Kang G. Shin

Real-Time Computing Laboratory, Department of Electrical Engineering and Computer Science, The University of Michigan, Ann Arbor, MI 48109-2122
{zuberi, kgshin}@eecs.umich.edu

Abstract

In object-oriented programming, updates to the state variables of objects (by the methods of the object) have to be protected through semaphores to ensure mutual exclusion. Semaphore operations are invoked each time an object is accessed, and this represents significant run-time overhead. This is of special concern in cost-conscious, small-size embedded systems - such as those used in automotive applications - where costs must be kept to an absolute minimum. Object-oriented programming can be feasible in such applications only if the OS provides efficient, low-overhead semaphores. We present a new semaphore implementation scheme which saves one context switch per semaphore operation in most circumstances and gives performance improvements of 18-25% over traditional semaphore implementation schemes.

1 Introduction

Real-time computing [1] today is no longer limited to large and expensive systems such as planetary exploration robots or the space shuttle. The sharp drop in microprocessor prices over recent years and the introduction of the microcontroller, which incorporates a microprocessor with peripherals like timers, memory, and I/O in a single package, have led to digital control now being used in much smaller and simpler embedded systems such as automotive control, cellular phones, and home electronics (camcorders, TVs, and VCRs).

These embedded systems are mass-produced, making low production costs one of the primary concerns in their design. Automotive applications alone account for millions of embedded systems produced annually. At these volumes, extra costs of even a few dollars per unit translate into a loss of millions of dollars overall, so the microcontrollers used in these cost-conscious applications are those which have been in production for several years and whose prices have dropped to a few dollars per unit. These microcontrollers have relatively slow processing cores (typically running at 10-30 MHz), small on-chip RAMs (about 32-64 kbytes, hence the name "small-memory" embedded systems), and all applications are in-memory (there are no disks/file systems in our target applications). This necessitates that any real-time operating system (RTOS) [2] used in these applications be both time-efficient and memory-efficient.

In this paper, we focus on OS support for object-oriented (OO) programming in embedded systems. OO design gives benefits such as reduced software design time and software re-use [3]. But with these benefits comes the extra cost of ensuring mutual exclusion when an object's internal state is updated. Semaphores¹ [4,5] are typically used to provide this mutual exclusion. Because semaphore system calls are invoked every time an execution thread enters or exits an object, it becomes essential that the RTOS provide efficient, low-overhead semaphores; otherwise, OO design will not be feasible for embedded applications because of high costs.

*The work reported in this paper was supported in part by the Advanced Research Projects Agency, monitored by the US Air Force Rome Laboratory under Grant F30602-95-1-0044, by the NSF under Grant MIP-9203895, and by the ONR under Grant N00014-94-1-0229. Any opinions, findings, and conclusions or recommendations are those of the authors and do not necessarily reflect the views of the funding agencies.

¹The optimization scheme presented in this paper applies equally well to both semaphores and mutexes. However, for simplicity, we concern ourselves only with semaphores in this paper.

Most research in the area of reducing synchronization overhead has focused on multiprocessors [6,7]. But our target architectures are either uniprocessor (as in home appliances) or very loosely-coupled distributed systems (as in automotive applications). Even with the latter, threads typically do not need to access remote objects, so our concern is only with improving task synchronization performance for a single processor. Previous work in this area has focused on either relaxing the semaphore semantics to get better performance [8] or coming up with new semantics and new synchronization policies [9]. The problem with this approach is that these new/modified semantics may be suitable for some particular applications, but usually they do not have wide applicability.

We took the approach of providing full semaphore semantics (with priority inheritance [10]), but optimizing the implementation of these semaphores by exploiting certain features of embedded applications. As a result, our semaphore scheme has wide applicability within the domain of embedded applications, while significantly improving performance over standard implementation methods for semaphores. We have implemented this new semaphore scheme in the EMERALDS (Extensible Microkernel for Embedded, ReAL-time, Distributed Systems) RTOS [11], which is being developed in the Real-Time Computing Laboratory at the University of Michigan to satisfy the specific memory and performance requirements of small-size embedded systems.

In the next section, we give a brief overview of OO programming as it pertains to embedded real-time systems, focusing on the OS support needed for OO programming. In Section 3, we describe our new implementation scheme. Section 4 discusses some limitations of the scheme and ways to overcome these limitations so that our scheme can be used in almost all embedded applications. Section 5 evaluates the performance of our new scheme, and we conclude with Section 6.

2 Objects and Semaphores in Embedded Real-Time Systems

An object is a collection of private state information (or data) and a set of methods which manipulate the data. Objects are ideal for representing real-world entities: the object's internal data represents the physical state of the entity (such as temperature, pressure, position, RPM, etc.) and the methods allow the state to be read or modified. These notions of encapsulation and modularity greatly help software design because various system components such as sensors, actuators, and controllers can be modeled by objects. Then, under the OO paradigm, real-time software is just a collection of threads of execution, each invoking various methods of various objects [12].

Conceptually, this OO paradigm is very appealing and gives benefits such as reduced software design time and software re-use. But practically speaking, these benefits come at a cost. The methods of an object must synchronize their access to the object's data to ensure mutual exclusion. Because object invocations occur very frequently, it is essential that any scheme used to achieve this synchronization be both memory-efficient and time-efficient; otherwise, OO design will be infeasible for small-memory embedded systems due to high costs.

2.1 Active and Passive Object Models

There are two fundamentally different ways for objects and execution threads to interact with each other, and this has some bearing on the type of synchronization scheme used to ensure mutual exclusion.

Under the active object model [13], one or more server threads are permanently bound to an object. When a client thread invokes a method, a server thread executes the method on behalf of the client. With the passive object model [13], objects do not have threads of their own. To invoke a method, a thread will enter the object, execute the method, and then exit the object.

From the point of view of synchronization, the active object model has an advantage if only one thread is assigned per object. Since only one thread is in the object at any time, there is no need to worry about mutual exclusion. But the active object model has several disadvantages. First of all, having a thread per object means that there will be a large number of threads in the system (anywhere from several tens to more than a hundred, depending on the application). Each thread needs its own stack, thread control block, etc., which makes the active object model very memory-inefficient. Moreover, each object invocation requires a context switch from the client thread to the server thread, so this model is time-inefficient as well.

With the passive object model, multiple threads can be inside the same object at one time, so they must synchronize their activities. Semaphores [4,5] are commonly used for this purpose (e.g., to provide the monitor construct [14]). Even though locking based on semaphores incurs time overhead, it is decidedly much more memory-efficient than the active object model.
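As an illustration of the passive model, the following sketch (ours, not from the paper) shows a passive object whose method is bracketed by semaphore operations. The call names acquire_sem() and release_sem() follow the EMERALDS primitives discussed later, but the struct layout and call signatures shown here are assumptions made only for illustration:

    /* A passive object: private state plus methods, with every
       method bracketed by semaphore lock/unlock calls.          */
    struct engine_obj {
        int   rpm;             /* private state                  */
        sem_t obj_sem;         /* one semaphore per object       */
    };

    void engine_set_rpm(struct engine_obj *o, int rpm)
    {
        acquire_sem(&o->obj_sem);   /* enter the object           */
        o->rpm = rpm;               /* update the internal state  */
        release_sem(&o->obj_sem);   /* exit the object            */
    }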

2.2 OO Design Under EMERALDS

For the above-stated reasons, we advocate the passive object model for embedded software design. Because a semaphore system call is made every time an object's method is invoked, semaphore operations (acquire_sem() and release_sem() calls under EMERALDS, used to lock and unlock semaphores, respectively) become some of the most heavily used OS primitives when OO design is used. This motivated us to investigate new and efficient schemes for implementing semaphore locking in EMERALDS, as described next.

3 An Efficient Semaphore Implementation Scheme

The first step in designing efficient semaphores is to look at the way semaphores are typically implemented in various systems, identify the distinct steps involved in locking/unlocking semaphores, and try to eliminate or optimize those steps which incur the greatest overhead. To do these optimizations, we will use characteristics peculiar to small-memory embedded applications.

3.1 Standard Semaphore Implementation

The standard procedure to lock a semaphore can be summarized as follows:

    if (sem locked) {
        do priority inheritance;
        add caller thread to wait queue;
        block;
    }
    lock sem;

If the semaphore happens to be already locked by some other thread, the thread making the semaphore lock system call is put on a wait queue and is blocked. It is unblocked as part of the semaphore release operation and it then proceeds to reserve the semaphore for itself.

If the caller is to block, priority inheritance [9,10] also takes place, under which the current lock holder thread's priority is increased to that of the caller thread (if the former is less than the latter). This is needed to avoid unbounded priority inversion [10]. If a high-priority thread Th calls acquire_sem() on a semaphore already locked by a low-priority thread Tl, the latter's priority is temporarily increased to that of the former. Without priority inheritance, a medium-priority thread Tm could get control of the CPU by preempting Tl while Th remains blocked on the semaphore, thus causing unbounded priority inversion. With priority inheritance, Tl will keep on running until it unlocks the semaphore. At that point, its priority goes back to its original value, but now Th is unblocked and it can continue execution.
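To make the steps above concrete, here is a rough C sketch of a standard acquire/release pair with priority inheritance (our illustration; the kernel data structures, the current pointer, and helpers such as wait_enqueue() and block_current() are assumptions, not the actual EMERALDS code):

    /* Standard semaphore lock with priority inheritance (sketch). */
    void acquire_sem(sem_t *s)
    {
        disable_interrupts();                    /* assume a uniprocessor kernel  */
        while (s->locked) {
            if (s->holder->prio < current->prio)
                s->holder->prio = current->prio; /* priority inheritance          */
            wait_enqueue(&s->wait_q, current);
            block_current();                     /* context switch away (C2)      */
        }
        s->locked = 1;
        s->holder = current;
        enable_interrupts();
    }

    void release_sem(sem_t *s)
    {
        disable_interrupts();
        s->locked = 0;
        current->prio = current->base_prio;      /* drop any inherited priority   */
        if (!wait_empty(&s->wait_q))
            unblock(wait_dequeue(&s->wait_q));   /* may switch back to waiter (C3)*/
        enable_interrupts();
    }
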
Figure 1: A typical scenario showing thread T2 attempting to lock a semaphore already held by thread T1. Tx is an unrelated thread which was executing while T2 was blocked. Conceptually, Tx can be T1.

First of all, notice that if the semaphore is free when acquire_sem() is called, then the semaphore lock operation has very little overhead². In fact, for this case, only one counter has to be incremented and some other variables updated.

    ²This is especially true in EMERALDS where system call overhead is comparable to subroutine call overhead even with full memory protection between processes [11].

The situation is very different when the semaphore is already locked by thread T1 when some thread T2 invokes the acquire_sem() call. Figure 1 shows a typical scenario for this situation. Thread T2 wakes up (after completing some unrelated blocking system call) and then calls acquire_sem(). This results in priority inheritance and a context switch to T1, the current lock holder. After T1 releases the semaphore, its priority returns to its original value and a context switch occurs to T2.

We observe that it is these context switches which are responsible for much of the overhead (as much as 40-50%) associated with locking and unlocking semaphores (see Section 5 for timing measurements).

Schedulability Analysis: In all critical real-time systems, an off-line guarantee is needed that the task workload is feasible and all execution deadlines will be met at run-time. Schedulability tests [15-17] are used for this purpose. The worst-case execution time of each task is first calculated and then the appropriate schedulability test is used to determine feasibility. The worst-case execution time for acquire_sem() occurs when the semaphore is already locked when the system call is made. This means that the context switches C2 and C3 shown in Figure 1 must be included when calculating worst-case task execution times.

Any scheme to make semaphores more efficient must target this worst-case scenario. The other scenario (the semaphore happens to be free when acquire_sem() is called) is quite efficient as is and is of no concern when calculating worst-case execution times, so, from now on, we focus on optimizing the worst-case scenario in which the semaphore is already locked by some thread when acquire_sem() is called.

3.2 Semaphore Implementation in EMERALDS

Going back to Figure 1, we want to eliminate context switch C2. Recall that the progression of events was as follows: T2 blocks (say, waiting for an event such as a message arrival; call this event E); some other threads execute; then event E occurs and T2 is unblocked. Now, the next blocking call T2 is to make is to acquire semaphore S. Under our scheme - as part of the blocking call just preceding acquire_sem() - we instrument the code (using a code parser described later) to indicate which semaphore T2 intends to lock (semaphore S in this case). When event E occurs and T2 is to be unblocked (Figure 2), the OS checks if S is available or not. If S is unavailable, then priority inheritance from T2 to the current lock holder T1 occurs right here. T2 is added to the waiting queue for S and it remains blocked, waiting for S. As a result, the scheduler picks T1 to execute - which eventually releases S - and T2 is unblocked as part of this release_sem() call by T1. Comparing Figure 2 to Figure 1, we see that context switch C2 is eliminated. The semaphore lock/unlock pair of operations now incurs only one context switch instead of two, resulting in considerable savings in execution time overhead (see Section 5 for performance results).

Figure 2: The new semaphore implementation scheme. Context switch C2 is eliminated.

Code Parser: In EMERALDS, all blocking calls take an extra parameter which is the identifier of the semaphore to be locked by the upcoming acquire_sem() call. This parameter is set to -1 if the next blocking call is not acquire_sem(). Semaphore identifiers are statically defined (at compile time) in EMERALDS, as is commonly the case in OSs for small-memory applications, so it is possible to write a parser which examines the application code and automatically inserts the correct semaphore identifier into the argument list of blocking calls just preceding acquire_sem() calls. Parser design issues are discussed further in Section 4.
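A minimal sketch of how the unblock path could perform this ahead-of-time check (our illustration; the next_sem field holding the hint, and helpers such as wait_enqueue() and make_runnable(), are assumptions about one possible structure, not the actual EMERALDS implementation):

    /* When event E unblocks thread t, consult the semaphore hint passed
       with t's last blocking call before making t runnable.              */
    void unblock_thread(thread_t *t)
    {
        sem_t *s = t->next_sem;           /* hint from the preceding blocking call */
        if (s != NULL && s->locked) {
            if (s->holder->prio < t->prio)
                s->holder->prio = t->prio; /* do priority inheritance right here   */
            wait_enqueue(&s->wait_q, t);   /* t stays blocked, waiting for s       */
            return;                        /* scheduler keeps running the holder   */
        }
        make_runnable(t);                  /* semaphore free, or no hint given     */
    }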

Schedulability Analysis for the New Scheme: From the viewpoint of schedulability analysis, there can be two concerns regarding the new semaphore scheme (refer back to Figure 2):

1. What if thread T2 does not block on the call preceding acquire_sem()? This can happen if event E has already occurred when the call is made.

2. Is it safe to delay execution of T2 even though it may have higher priority than T1 (by doing priority inheritance earlier than would occur otherwise)?

Regarding the first concern, if T2 does not block on the call preceding acquire_sem(), then a context switch has already been saved. In such a situation, T2 will continue to execute till it reaches acquire_sem(), and a context switch will occur there. What our scheme really provides is that a context switch is saved either on the acquire_sem() call or on the preceding blocking call. Where the savings actually occur at run-time does not matter for the calculation of worst-case execution times for schedulability analysis.

For the second concern, the answer is that yes, it is safe to let T1 execute earlier than it would otherwise. The concern here is that T2 may miss its deadline. But this cannot happen because, under all circumstances, T2 must wait for T1 to release the semaphore before T2 can complete. So from the schedulability analysis point of view, all that really happens is that chunks of execution time are swapped between T1 and T2 without affecting the completion time of T2. Another similar concern is that after event E, T2 may have to produce an output or send a message/signal to another thread (call it T3). Delaying T2 may cause T3 to miss its deadline. The answer to all such scenarios is that, as just discussed, T2 completes by its deadline (even though it may be delayed). As long as T2 completes by its deadline, no other thread that depends on T2 will miss its deadline, so schedulability of the task workload is not adversely affected.
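In worst-case execution time terms, the saving can be summarized as follows (a rough accounting of our own; t_PI, t_q, and t_cs denote the priority-inheritance, queue-manipulation, and context-switch costs and are not notation from the paper):

    C_lock/unlock(standard) ≈ t_PI + t_q + 2 * t_cs    (switches C2 and C3)
    C_lock/unlock(new)      ≈ t_PI + t_q + 1 * t_cs    (only C3 remains)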

4 Applicability of the New Scheme

There can be three circumstances under which our proposed semaphore scheme may not work:

1. The code parser is unable to identify which semaphore is to be locked next due to conditional constructs such as loops with a variable number of iterations or if-then-else statements.

2. The blocking call preceding an acquire_sem() is another acquire_sem(), so that only one context switch is saved between these two calls.

3. The lock holder T1 (Figure 2) blocks after event E but before releasing the semaphore. Then with standard semaphores, T2 will be able to execute, but under our scheme it cannot, which may lead to T2 missing its deadline.

In the rest of this section, we discuss how often (if at all) these scenarios can occur in embedded real-time systems, which specific forms they can take, and how these problems can be resolved.

4.1 Code Parser Issues

Most threads in embedded systems execute sensor-controller-actuator loops as shown in Figure 3. Each device (sensor or actuator) is represented by an object protected by its own semaphore. Each device may be a real sensor/actuator or a logical one representing several devices being controlled as one group. Note that the same devices are accessed each time the loop executes. The order in which semaphores are locked is fixed, so there is no ambiguity for the code parser.

    for (;;) {
        read sensor 1;
        read sensor 2;
        ...
        read sensor x;
        update actuator 1;
        update actuator 2;
        ...
        update actuator y;
        block till timer expiry
            or event occurrence;
    }

Figure 3: A typical sensor-controller-actuator loop commonly found in embedded control applications.

At run-time, the method which gets invoked on an object may depend on the input data:

    if (sensorReading > A)
        valve.open;
    else
        valve.close;

but this does not change the order in which semaphores are locked, because all methods of an object are protected by the same semaphore. In other words, most embedded applications are structured as in Figure 3, and for such a structure, the parser can easily determine which semaphore is to be locked after a given blocking call.

In case a blocking call occurs inside a loop followed by an acquire_sem() outside the loop, the argument to be passed for the semaphore identifier is calculated conditionally as follows:

    while (cond) {
        ...
        if (cond)
            sem = -1;
        else
            sem = S;
        some_blocking_call(..., sem);
        ...
    }
    ...
    acquire_sem(S);

This way, -1 is passed as the parameter for all but the last iteration of the loop. Again, this code can be automatically inserted by the code parser without the application programmer having to make any manual modifications to the code. Note that this scheme works as long as the condition cond does not depend on the blocking call or on code after the call. This is true for loops which execute for a fixed number of iterations, which is the most common case in embedded control systems. One example is code which steps a stepper motor x number of times. The value of x may depend on sensor readings, but it stays fixed while the loop executes.

Regarding loops with a variable number of iterations, our experience shows that such loops typically do not contain blocking calls in embedded real-time systems. A variable-iteration loop is used to wait for a condition to come true (such as a spin lock), but that is what blocking calls do as well (wait for a condition). The two may be combined if the result of the blocking call is uncertain (such as for condition variables with Mesa semantics used in general-purpose computing), but such a situation rarely occurs in embedded real-time systems.

4.2 Consecutive acquire_sem() Calls

Going back to Figure 3, the bodies of the methods invoked by the thread may contain blocking calls, especially condition variable and message-passing calls. In these calls, the parser will insert the identifier of the upcoming acquire_sem(). But if such calls are not present, then two or more acquire_sem() calls can occur with no other blocking call in between them. Then, only one context switch will be saved per pair of acquire_sem() calls. This leads to an interesting avenue for future research. Our scheme can be generalized so that the blocking call at the end of the control loop will not unblock until all the semaphores needed by the thread for execution become available. In other words:

    for (;;) {
        obj_1.method    // protected by sem S1
        obj_2.method    // protected by sem S2
        ...
        obj_n.method    // protected by sem Sn
        block(..., S1, S2, ..., Sn);
    }

This is somewhat similar to the Spring kernel's notion of reserving all resources a task needs before letting the task execute [18], but with an important difference: the Spring kernel executes tasks non-preemptively, while under our proposal, threads execute preemptively. This allows higher-priority threads to preempt a given thread (giving good schedulable utilization) while reducing the number of context switches seen by the thread waiting for resources (giving shorter execution times). However, advance reservation of all semaphores will increase scheduler complexity and may also adversely affect task schedulability. The impact of these issues on performance must be studied to determine the viability of this extension.

4.3 Blocking by the Lock Holder Thread

Going back to Figure 2, suppose the lock holder T1 blocks after event E but before releasing the semaphore. With standard semaphores, T2 will then be able to execute (at least, till it reaches acquire_sem()), but under our scheme, T2 stays blocked. This gives rise to the concern that with this new semaphore scheme, T2 may miss its deadline.

In Figure 2, T1 had priority less than that of T2 (call this case A). A different problem arises if T1 has higher priority than T2 (call it case B). Suppose semaphore S is free when event E occurs. Then T2 will become unblocked and it will start executing (Figure 4). But before T2 can call acquire_sem(), T1 wakes up, preempts T2, locks S, then blocks for some event. T2 resumes, calls acquire_sem(), and blocks because S is unavailable. The context switch is not saved and no benefit comes out of our semaphore scheme.

Figure 4: If a higher-priority thread T1 preempts T2, locks the semaphore, and blocks, then T2 incurs the full overhead of acquire_sem() and a context switch is not saved.

All these problems occur when a thread blocks while holding a semaphore. To resolve these problems, we first make a small modification to our semaphore scheme to change the problem in case B into the same problem as in case A. This leaves us with only one problem to address. Then, by looking at the larger picture and considering threads other than just T1 and T2, we can show that this problem is easily circumvented and our semaphore scheme works for all blocking situations that occur in practice, as discussed next.

Modification to the Semaphore Scheme: For the situation shown in Figure 4, we want to somehow block T2 when the higher-priority thread T1 locks S, and unblock T2 when T1 releases S. This will prevent T2 from executing while S is locked, which makes this the same as the situation in case A.

Recall that when event E occurs (Figure 4), the OS first checks if S is available or not before unblocking T2. Now, let us extend the scheme so that the OS adds T2 to a special queue associated with S. This queue holds the threads which have completed their blocking call just preceding acquire_sem() but have not called acquire_sem() yet. Thread T1 will also get added to this queue as part of its blocking call just preceding acquire_sem(). When T1 calls acquire_sem(), the OS first removes T1 from this queue, then puts all threads remaining in the queue into a blocked state. Then, when T1 calls release_sem(), the OS unblocks all threads in the queue. This way, T2 is prevented from executing while S is locked, which results in the same behavior as in case A. Also, if done properly, addition and removal of threads from this queue incurs very little overhead (about 5-7 μs on a 25 MHz MC 68040 without caches and just 1-2 μs with caches).
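The queue manipulation just described could look roughly like this (our sketch; the pending_q field and the helper names are assumptions used only to illustrate the modification, not the EMERALDS code):

    /* Each semaphore keeps a "pending" queue of threads that have finished
       the blocking call preceding acquire_sem() but have not locked yet.    */
    void acquire_sem(sem_t *s)
    {
        pending_remove(&s->pending_q, current);  /* caller is about to lock     */
        block_all(&s->pending_q);                /* other pending threads must
                                                    not run while s is locked   */
        /* ... normal lock path: wait queue, priority inheritance, etc. ...     */
    }

    void release_sem(sem_t *s)
    {
        /* ... normal unlock path: restore priority, wake one waiter, etc. ...  */
        unblock_all(&s->pending_q);              /* pending threads may run now  */
    }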

With this modification, the only remaining concern (for both cases A and B) is: if execution of T2 is delayed like this while other threads (of possibly lower priority) execute, then T2 may miss its deadline. This concern is addressed next.

Applicability under Various Blocking Situations: There can be two types of blocking:

• Wait for an internal event, i.e., wait for a signal from another thread after it reaches a certain point.

• Wait for an external event from the environment. This event can be periodic or aperiodic.

The first type of blocking is used by threads to synchronize with each other, and the second type is used to interact with the environment.

Blocking for Internal Events: The typical scenario for this type of blocking is for thread T1 to enter an object (and lock semaphore S), then block waiting for a signal from another thread Ts. Meanwhile, T2 stays blocked (Figure 5). The question is: is it safe to delay T2 like this even if Ts is lower in priority than T2? The answer is yes, because T2 cannot lock S till T1 releases it, and T1 will not release it till it receives the signal from Ts; so even though Ts may be lower in priority than T2, it is safe to let Ts execute earlier. This leads to T1 releasing S earlier than it would otherwise, which leaves enough time for T2 to complete by its deadline.

Figure 5: Situation when the lock holder T1 blocks for a signal from another thread Ts.

Blocking for External Events: External events can be either periodic or aperiodic. For periodic events, polling is usually used to interact with the environment and blocking does not occur. A common example is a periodic sensor-controller-actuator loop where sensors are read and actuator commands are updated periodically and no blocking calls are involved. One common exception is to block on a timer (usually, to wait for the current period to end), but this blocking call occurs at the end of the main loop of execution of the thread; it is not inside any object, and no semaphores are held by the thread when this call is made.

Blocking calls are used to wait for aperiodic events, but it does not make sense to have such calls inside an object. There is always a possibility that an aperiodic event may not occur for a long time. If a thread blocks waiting for such an event while inside an object, it may keep that object locked forever, preventing other threads from making progress. So the usual practice is to not have any semaphores locked when blocking for an aperiodic event.

In short, dealing with external events (whether periodic or aperiodic) does not affect the applicability of our semaphore scheme under the commonly-established ways of handling external events. But in case some application does require blocking for external events while inside an object, our semaphore scheme can be turned off by specifying -1 as the semaphore identifier in the blocking call just preceding acquire_sem(). This will cause EMERALDS' semaphores to behave just like standard-implementation semaphores, but we do not believe this will be needed very often, if at all.

5 Performance Evaluation

To measure the improvement in performance resulting from our new semaphore scheme, we implemented it under EMERALDS and measured performance on a 25 MHz Motorola 68040 processor [19].

When a thread enters an object, it first acquires the semaphore protecting the object, and when it exits the object, it releases the semaphore. The cumulative time spent in these two operations represents the overhead associated with synchronizing thread access to objects. To determine by how much this overhead is reduced when our scheme is used, we measured the time for the acquire/release pair of operations for both standard semaphores and our new scheme and then compared the two results. In the following, we first describe our evaluation procedure, then present the results.

5.1 The Test Procedure

We want to measure the worst-case overhead for acquire/release because this is what is used in schedulability analysis. The worst case occurs if

• the semaphore is already locked when acquire_sem() is called, and

• priority inheritance occurs.

To get this behavior, we use two threads in our tests, T1 and T2, with T2 having higher priority. For the standard semaphore implementation, the test proceeds as shown in Figure 6. T2 executes first and blocks waiting for a signal from T1. T1 executes, locks semaphore S, and signals T2, which is unblocked, goes on to execute acquire_sem(), and priority inheritance occurs. Thread T1 then releases S, its priority goes back to its original value, and a context switch occurs back to T2. We measure interval t1, which is the time for an acquire plus a release and includes the relevant context switches.

Figure 6: Test procedure for standard semaphores. Interval t1 is the overhead for acquire/release operations.

We repeated this test with the new semaphore scheme. Figure 7 shows the new sequence of events. In this case, priority inheritance is done by the OS when T1 signals T2, so T1 continues after the signal and unlocks S. T1's priority goes back to its original value, T2 is unblocked, and it goes on to lock S without needing any more context switches. Then the difference t2 - t3 (Figures 6 and 7) represents the improvement due to the new scheme, and t1 - (t2 - t3) is the overhead for acquire/release under the new scheme. Note that we cannot directly measure the acquire/release overhead for the new scheme because priority inheritance occurs well before the rest of the acquire operation.

Figure 7: Test procedure for the new semaphore scheme.
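The measurement harness can be pictured roughly as follows, following the standard-scheme sequence of Figure 6 (our sketch; signal_thread(), wait_for_signal(), read_timer(), and T2_ID are hypothetical primitives standing in for whatever signalling and timing facilities the RTOS provides):

    static sem_t S;                        /* semaphore under test (assumed API)  */
    static volatile unsigned long t_start, t_end;

    /* T1: lower-priority thread */
    void T1(void)
    {
        acquire_sem(&S);                   /* lock S before waking T2             */
        signal_thread(T2_ID);              /* T2 has higher priority              */
        /* T2 runs, calls acquire_sem(&S), blocks; control returns here           */
        release_sem(&S);                   /* inherited priority drops; back to T2 */
    }

    /* T2: higher-priority thread */
    void T2(void)
    {
        wait_for_signal();                 /* block until T1 holds S              */
        t_start = read_timer();
        acquire_sem(&S);                   /* spans inheritance, switch to T1,
                                              T1's release_sem(), and switch back */
        t_end = read_timer();              /* t_end - t_start corresponds to t1   */
        release_sem(&S);
    }
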
5.2 Experimental Results

EMERALDS uses dynamic thread scheduling³, so the context switch overhead depends on the number of threads in the system. Because our semaphore scheme eliminates one context switch, the improvement in performance depends on the number of threads in the scheduler queue. Experience shows that typical embedded applications have about 10-20 threads (anything more will consume too much memory for stacks, thread control blocks, etc.). For evaluation purposes, we chose a slightly wider range of thread counts, from 3 to 30. For each case, two of the threads are the T1 and T2 mentioned in Section 5.1, while the remaining threads just execute infinite loops and serve only to fill the scheduler queue.

    ³With priority inheritance, thread priorities change so often that it makes no sense to have fixed-priority scheduling.

First, we ran our tests on the MC 68040 with caches disabled (to simulate processors which do not have caches). Figure 8 shows the results for both the standard and the new semaphore implementation schemes. Since the context switch overhead is a linear function of the number of threads, the acquire/release times also increase linearly with the thread count. But the standard implementation's overhead involves two context switches while our new scheme incurs only one, which is why the measurements for the standard scheme have a slope twice that of our new scheme. For a typical thread count of 15 threads, our new scheme gives savings of about 35 μs over the standard implementation, and these savings grow even larger as the thread count increases.

We repeated our tests on the MC 68040 with both instruction and data caches enabled. The results are shown in Figure 9. Again, the results for our new scheme have a slope roughly half that of the standard scheme. But notice that the percent improvement in performance is greater with caches enabled than with caches disabled, as shown in Figure 10. The reason is that the context switch overhead is greater (relatively speaking) when caches are used because of the cache misses incurred when a new thread begins to execute. The old context is flushed out to main memory, the new context is fetched, and this increases the context switch overhead, which is why our scheme gives greater improvement over the standard implementation with caches enabled than with caches disabled.
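The measurements are consistent with a simple linear model (our own back-of-the-envelope reading of Figures 8 and 9, not an analysis from the paper): if t_cs(n) is the context-switch cost with n threads in the scheduler queue and c is the fixed cost of the semaphore bookkeeping, then roughly

    t_standard(n) ≈ c + 2 * t_cs(n)
    t_new(n)      ≈ c + 1 * t_cs(n)
    savings(n)    = t_standard(n) - t_new(n) ≈ t_cs(n)    (≈ 35 μs at n = 15, caches off)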

Figure 8: Performance measurements with caches disabled. The overhead for the standard implementation increases twice as rapidly as for the new scheme.

Figure 9: Performance measurements with caches enabled.

Figure 10: Percent improvement in performance due to our new semaphore scheme.

These results show that our new scheme improves performance by 10-40%, depending on the number of threads in the application and whether caches are used or not. Since most embedded applications have about 10-20 threads, they can expect improvements of about 18-25% (without caches) or 25-30% (with caches).

6 Conclusion

Embedded application programmers generally tend to avoid object-oriented programming, one reason being the high overhead associated with synchronizing thread access to objects. Semaphores must be used to ensure mutual exclusion when updating the state variables of objects, and this usually means a large enough overhead to make object-oriented programming infeasible for cost-conscious embedded applications.

In this paper, we presented a new semaphore implementation scheme which saves one context switch per semaphore acquire/release pair of operations (for most scenarios found in embedded applications) and improves performance by 18-25%. We used the fact that in small-size embedded applications, the identifiers of semaphores are fixed at compile time. Then, during run-time, we use these known identifiers to do ahead-of-time checks on the status of semaphores (whether they are available or not). If a semaphore is unavailable, we delay the execution of threads until the semaphore is released. This way, the semaphores are always available when threads actually make the acquire_sem() system call and the call does not block, saving one context switch.

Future work includes studying the advantages and disadvantages of extending our scheme so that instead of looking ahead only to the next acquire_sem() call, the scheduler will consider all the semaphores a thread may need to execute, so that all resource-conflict-related context switches are eliminated. Also, in this paper we focused only on improving the semaphore lock operation. In the future, we plan to investigate optimizations related to the release operation to get further improvements in synchronization overheads.

References

[1] K. G. Shin and P. Ramanathan, "Real-time computing: a new discipline of computer science and engineering," Proceedings of the IEEE, vol. 82, no. 1, pp. 6-24, January 1994.

[2] K. Ramamritham and J. A. Stankovic, "Scheduling algorithms and operating systems support for real-time systems," Proceedings of the IEEE, vol. 82, no. 1, pp. 55-67, January 1994.

[3] B. Meyer, Object-Oriented Software Construction, Prentice-Hall, 1988.

[4] E. W. Dijkstra, "Cooperating sequential processes," Technical Report EWD-123, Technical University, Eindhoven, the Netherlands, 1965.

[5] A. N. Habermann, "Synchronization of communicating processes," Communications of the ACM, vol. 15, no. 3, pp. 171-176, March 1972.

[6] J. Mellor-Crummey and M. Scott, "Algorithms for scalable synchronization on shared-memory multiprocessors," ACM Transactions on Computer Systems, vol. 9, no. 1, pp. 21-65, February 1991.

[7] C.-D. Wang, H. Takada, and K. Sakamura, "Priority inheritance spin locks for multiprocessor real-time systems," in 2nd International Symposium on Parallel Architectures, Algorithms, and Networks, pp. 70-76, 1996.

[8] H. Takada and K. Sakamura, "Experimental implementations of priority inheritance semaphore on ITRON-specification kernel," in 11th TRON Project International Symposium, pp. 106-113, 1994.

[9] H. Tokuda and T. Nakajima, "Evaluation of real-time synchronization in Real-Time Mach," in Second Mach Symposium, pp. 213-221, Usenix, 1991.

[10] L. Sha, R. Rajkumar, and J. Lehoczky, "Priority inheritance protocols: an approach to real-time synchronization," IEEE Trans. on Computers, vol. 39, no. 9, pp. 1175-1185, September 1990.

[11] K. M. Zuberi and K. G. Shin, "EMERALDS: A microkernel for embedded real-time systems," in Proc. Real-Time Technology and Applications Symposium, pp. 241-249, June 1996.

[12] Y. Ishikawa, H. Tokuda, and C. W. Mercer, "An object-oriented real-time programming language," IEEE Computer, vol. 25, no. 10, pp. 66-73, October 1992.

[13] R. S. Chin and S. T. Chanson, "Distributed object-based programming systems," ACM Computing Surveys, vol. 23, no. 1, pp. 91-124, March 1991.

[14] C. A. R. Hoare, "Monitors: An operating system structuring concept," Communications of the ACM, vol. 17, no. 10, pp. 549-557, October 1974.

[15] C. L. Liu and J. W. Layland, "Scheduling algorithms for multiprogramming in a hard real-time environment," Journal of the ACM, vol. 20, no. 1, pp. 46-61, January 1973.

[16] N. C. Audsley, A. Burns, and A. J. Wellings, "Deadline monotonic scheduling theory and application," Control Engineering Practice, vol. 1, no. 1, pp. 71-78, 1993.

[17] Q. Zheng and K. G. Shin, "On the ability of establishing real-time channels in point-to-point packet-switched networks," IEEE Trans. Communications, pp. 1096-1105, February/March/April 1994.

[18] J. Stankovic and K. Ramamritham, "The Spring Kernel: a new paradigm for real-time operating systems," ACM Operating Systems Review, vol. 23, no. 3, pp. 54-71, July 1989.

[19] M68040 User's Manual, Motorola Inc., 1992.
