Semaphores Aid Multiprocessor Designs

Semaphores afford significant flexibility for providing interprocessor synchronization and mutual exclusion.

By Ted Raineault

Multiprocessing designs that share memory among DSPs and other processors beg for some form of mutual exclusion or interprocessor synchronization. Such designs are becoming pervasive: DSPs are widely used in dense multiprocessor arrangements at the network edge, and systems-on-chip often include DSP cores to accelerate math-intensive computation. Although the DSP/BIOS kernel provides a standard, efficient, robust API for uniprocessor applications, designers sometimes encounter situations in which interprocessor synchronization mechanisms would be very useful. One method implements interprocessor semaphores by using DSP/BIOS.

Shared-memory semaphores are basic tools for interprocessor synchronization. Although self-imposed design constraints can often reduce synchronization requirements, semaphores offer multiprocessor system designers significant flexibility. In addition, a multitasking OS, like DSP/BIOS, is virtually invaluable for shared-memory multiprocessing systems.

Assume that two or more processors share a physical pool of memory, in the sense that each processor sees the memory as directly addressable. Indeed, many multiprocessor DSP systems are designed in this manner.

SHARED-MEMORY ARCHITECTURES

One common architecture uses a large region of single-port RAM shared by all devices, often including a host. Although arbitration issues complicate the hardware design for this architecture, software engineers appreciate a large shared-memory pool visible to multiple processors. When you can reduce bus contention and arbitration inefficiencies with appropriate data-transport strategies, software-related conveniences make this architecture an attractive option.

A second architecture uses dual-port RAM between processors. The downside here is the relatively high cost and small storage capacity of these devices; large banks of expensive dual-port RAM are seldom practical. However, in applications that use segmented data transport or small data sets for which small amounts of dual-port RAM are sufficient, this type of memory is very useful. Dual-port RAM is relatively fast; it's easy to design into a system; and, unlike FIFOs, it can store shared data structures used for interprocessor communication.

A word of caution: When processors have on-chip cache or systems use write posting, you must pay attention to shared-variable coherence. To prevent loss of coherence, you can, depending on the processor, disable the cache, use cache bypass, or flush the cache to ensure that a shared location is in a proper state. The cache control API in Texas Instruments' comprehensive Chip Support Library, for example, provides an excellent tool for managing cache subsystems. Solutions to write-post delay problems are system-specific.


Assume that two processors use a common shared-memory buffer to pass data or to operate cooperatively on a data set. In either case, one or more tasks on the processors might need to know the state of the buffer before accessing it, and possibly to block while the buffer is in use. As in the case of single-processor multitasking, a mutual exclusion mechanism to prevent inappropriate concurrent operations on the shared resource is needed. A quick review of mutual exclusion will help clarify multiprocessor issues.

Shared-resource management is a fundamental challenge of multitasking. A task (or process, or thread) needs the ability to execute sequences of instructions without interference so that it can manipulate shared data atomically. These sequences, known as critical sections, are bracketed by entry and exit protocols that satisfy four properties: mutual exclusion, absence of deadlock, absence of unnecessary delay, and eventual entry (no starvation). The focus here is mutual exclusion; the remaining properties are detailed in any number of textbooks and will be satisfied by the multiprocessor semaphore discussed below.

Relative to a shared resource, mutual exclusion requires that only one task at a time execute in a critical section. The entry and exit protocols use such mechanisms as polled flags (often called simple locks or spin locks) or more abstract entities, like blocking semaphores. Simple locks can be used to build protection mechanisms of greater complexity.

Introduced by Edsger Dijkstra in the mid-1960s, the semaphore is a system-level abstraction used for interprocess synchronization. The semaphore provides two atomic operations, wait (P) and signal (V), which are invoked to manipulate a nonnegative integer within the semaphore. The wait operation checks the value of the integer and either decrements it if it's positive or blocks the calling task. The signal operation, in turn, checks for tasks blocked on the semaphore and either unblocks a task waiting for the semaphore or increments the semaphore if no tasks are waiting. A binary semaphore, which has counter values limited to 0 and 1, can be used effectively by an application to guard critical sections.

You can implement a multiprocessor semaphore by placing its data structure in shared memory and using RTOS services on each processor to handle blocking. Before outlining an implementation, let's look at two aspects of semaphores that cause complications in a multiprocessor environment. One is low-level mutual exclusion to protect shared data within a semaphore, and the other is wake-up notification when a semaphore is released.

LOW-LEVEL MUTUAL EXCLUSION

At its core, a semaphore has a count variable and possibly other data elements that must be manipulated atomically. System calls use simple mutual exclusion mechanisms to guard very short critical sections where the semaphore structure is accessed. This arrangement prevents incorrect results caused by concurrent modification of shared data within the semaphore.

In a uniprocessor environment, interrupt masking is a popular technique used to ensure that sequential operations occur without interference. With this technique, interrupts are disabled on entrance to a critical section and re-enabled on exit. In a multiprocessor situation, however, interrupt masking isn't an option. Even if one processor could disable the interrupts of another (rarely the case), the second processor would still execute an active thread and might inadvertently violate mutual exclusion requirements.
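
To make the uniprocessor technique concrete, here is a minimal sketch assuming the DSP/BIOS HWI module's disable/restore pair; the shared variable and function names are illustrative only.

#include <std.h>
#include <hwi.h>

/* A counter shared by task code and ISRs on one processor. */
static volatile int sharedCount;

void incrementShared(void)
{
    Uns key;

    key = HWI_disable();   /* mask interrupts: enter the critical section        */
    sharedCount++;         /* read-modify-write completes without interference   */
    HWI_restore(key);      /* restore the previous interrupt state               */
}

Nothing in this sequence stops a second processor from touching sharedCount, which is exactly the limitation described above.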


A second technique uses an atomic test-and-set (or similar) instruction to manipulate a variable. This variable might be the semaphore count itself or a simple lock used to guard critical sections where semaphore data is accessed. Either way, a specialized instruction guarantees atomic read-modify-write in a multitasking environment.

Although this solution looks straightforward, test-and-set instructions have disadvantages in both uniprocessor and multiprocessor scenarios. One drawback is dependence on machine instructions, which vary across processors, provide only a small number of atomic operations, and are sometimes unavailable.

A second problem is bus locking: If multiple processors share a common bus that doesn't support locking during test-and-set, these processors might interleave accesses to a shared variable at the bus level while executing seemingly atomic test-and-set instructions.

A third problem concerns test-and-set behavior in multiport RAM systems: Even if all buses can be locked, simultaneous test-and-set sequences at different ports might produce overlapped accesses.

Now consider two approaches that are very useful in shared-memory scenarios. One relies on simple atomic hardware locks; the other is a general-purpose software solution known as Peterson's algorithm.

In shared-memory systems, hardware-assisted mutual exclusion can be implemented with special hardware flags found in multiport RAMs. Dual-port RAM logic prevents overlap of concurrent operations on the hardware flags, forcing them to maintain the correct state during simultaneous accesses. Also, because processors use standard read and write instructions to manipulate the flags, specialized atomic test instructions aren't required. However, this solution is still limited, as shared-memory systems often lack dedicated hardware flags.

Let's take one more step to arrive at a general-purpose, hardware-independent method.

PETERSON'S ALGORITHM

Peterson's algorithm, published in 1981, provides an elegant software solution to the n-process critical section problem and has two key advantages over test-and-set spin locks. One is that atomic test-and-set isn't required: the algorithm eliminates the need for special instructions and bus locking. The other is eventual entry: A task waiting for entry to a critical section won't starve in a typical scheduling environment. Although Peterson's algorithm looks deceptively simple, it's the culmination of many attempts by researchers to solve the critical section problem.

The pseudocode in Listing 1 shows the entry and exit protocols used to enforce two-process mutual exclusion. Note that Peterson adds a secondary turn variable. This variable prevents incorrect results caused by race conditions and also ensures that each waiting task will eventually enter the critical section.

We can easily imagine situations in which more than two processes try to enter their critical sections concurrently. Peterson's algorithm can, as noted, be generalized to n processes and used to enforce mutual exclusion for more than two tasks, and other n-process solutions, such as the bakery algorithm, are readily available in computer science textbooks. For reasons of clarity and brevity, the discussion here is limited to the two-process case. Pseudocode for the n-process Peterson's algorithm is available at www.electricsand.com.
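
Listing 1 itself isn't reproduced here, but a minimal C sketch of the classic two-process protocol follows. It assumes that lock[] and turn live in shared memory that both processors access uncached (or with coherence managed explicitly, as noted earlier); the function names are illustrative.

volatile int lock[2] = { 0, 0 };  /* lock[i] != 0: processor i wants the critical section */
volatile int turn = 0;            /* which processor must defer when both want in         */

void peterson_enter(int self)     /* self is 0 on one processor, 1 on the other */
{
    int other = 1 - self;

    lock[self] = 1;               /* announce intent to enter         */
    turn = other;                 /* give the other side priority     */
    while (lock[other] && turn == other) {
        ;                         /* spin until it's safe to proceed  */
    }
}

void peterson_exit(int self)
{
    lock[self] = 0;               /* leave the critical section       */
}

Each processor calls peterson_enter with its own ID before touching shared semaphore data and peterson_exit immediately afterward.
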
Now that we have a low-level mutual exclusion tool with which to safely manipulate shared data within a semaphore, consider the other key ingredient of semaphores: blocking. Assuming that each processor runs DSP/BIOS or another multitasking OS, we'll develop our wait operation using services that are already available on each individual processor. DSP/BIOS provides a flexible semaphore module (SEM) that we'll use in the implementation.

When the owner of a uniprocessor semaphore releases it with a signal system call, the local scheduler has immediate knowledge of the signal event and can unblock a task waiting for the semaphore. In contrast, a multiprocessor semaphore implies that the owner and the requestor can reside on different processors. Because a remote kernel has no implicit knowledge of signal calls to a local kernel, the remote kernel needs to be notified of local signal events in a timely manner. Our solution uses interprocessor interrupts to notify other processors of local activity involving a shared semaphore.


This implementation of a multiprocessor binary semaphore (MBS) assumes that the hardware supports interprocessor interrupts and that a task won't try to acquire a semaphore while a task on the same processor owns it. The latter restriction simplifies the example and can be easily removed with some additional design work.
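
The shared structure itself isn't listed in the article; one plausible C layout, placed at a shared-memory address known to both processors, might look like this (the type and field names are assumptions, not taken from Listing 2).

typedef struct MBS_Obj {
    volatile int count;       /* binary semaphore value: 1 = free, 0 = owned       */
    volatile int notify[2];   /* per-processor release-notification request flags  */
} MBS_Obj;

The Peterson lock and turn variables also live in shared memory but, as noted later, are kept distinct from the semaphore data they protect.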

USING SEMAPHORES

MBS_wait is invoked to acquire a shared-memory semaphore. If the semaphore is available, MBS_wait decrements it and continues. If the semaphore is already owned, the requesting task blocks within MBS_wait until a release notification interrupt makes the task ready to run. Once the interrupt occurs and higher-priority tasks have relinquished the CPU, the task waiting for the semaphore wakes up within MBS_wait and loops to retest it. Note that the task doesn't assume ownership immediately when unblocked. Because a remote task might reacquire the semaphore by the time the requestor wakes up, MBS_wait loops to compete for the semaphore again.

When MBS_wait determines that a semaphore is unavailable, it sets a notification request flag in the shared-semaphore data structure to indicate that the processor should be interrupted when the semaphore is released elsewhere in the system. To avoid a condition known as the lost wake-up problem, MBS_wait tests the semaphore atomically and sets the notification request flag if the semaphore is unavailable.

Code for the wait operation is divided into two distinct parts: MBS_wait, which contains the blocking code and is called by an application, and MBS_interrupt, which runs in response to the notification interrupt and posts a local signal to the task waiting on the semaphore. This arrangement is very similar to that of a device driver: The upper part of a driver suspends a task pending I/O service, and the interrupt-driven lower part wakes up the task.
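
Listing 2 isn't reproduced here, but the sketch below shows one way the three routines might look in C on top of DSP/BIOS, using the MBS_Obj layout and peterson_enter/peterson_exit sketched earlier plus a hypothetical notify_remote routine that raises the interprocessor interrupt. Treat it as an illustration of the scheme described in the text, not the article's actual listing.

#include <std.h>
#include <sys.h>
#include <sem.h>
#include <tsk.h>

#include "mbs_shared.h"         /* hypothetical header with the MBS_Obj layout sketched above */

/* Assumed helpers and handles (all illustrative): */
extern void peterson_enter(int self);  /* low-level mutual exclusion (see earlier sketch) */
extern void peterson_exit(int self);
extern void notify_remote(void);       /* raise the interprocessor interrupt              */
extern MBS_Obj *mbs;                   /* semaphore object in shared memory               */
extern SEM_Handle mbsSem;              /* local DSP/BIOS semaphore, created with count 0  */
#define MY_ID 0                        /* would be 1 when built for the other processor   */

void MBS_wait(void)
{
    Bool acquired = FALSE;

    while (!acquired) {
        TSK_disable();                  /* no task switch while inside the Peterson lock */
        peterson_enter(MY_ID);
        if (mbs->count > 0) {
            mbs->count--;               /* semaphore is free: take ownership             */
            acquired = TRUE;
        } else {
            mbs->notify[MY_ID] = 1;     /* request an interrupt when it is released      */
        }
        peterson_exit(MY_ID);
        TSK_enable();

        if (!acquired) {
            SEM_pend(mbsSem, SYS_FOREVER);  /* block until MBS_interrupt posts           */
        }                                   /* then loop and compete for the semaphore   */
    }
}

void MBS_signal(void)
{
    Bool wake = FALSE;

    TSK_disable();
    peterson_enter(MY_ID);
    mbs->count++;                       /* release the semaphore                         */
    if (mbs->notify[1 - MY_ID]) {       /* remote processor asked to be notified         */
        mbs->notify[1 - MY_ID] = 0;
        wake = TRUE;
    }
    peterson_exit(MY_ID);
    TSK_enable();

    if (wake) {
        notify_remote();                /* causes MBS_interrupt to run on the other side */
    }
}

void MBS_interrupt(void)                /* plugged into the notification interrupt       */
{
    SEM_post(mbsSem);                   /* ready the local task blocked in MBS_wait      */
}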


MBS_signal releases a semaphore by incrementing its value and posting an interrupt to the processor that requested release notification. These actions cause MBS_interrupt to execute on the remote processor where a task is blocked waiting for the semaphore. Note that this sequence of events varies slightly from that of the uniprocessor signal operation described earlier, in which the semaphore is incremented only if no tasks are blocked.

PSEUDOCODE

Now that we have a notion of the shared-semaphore architecture, let's look at the pseudocode describing the wait and signal operations, shown in Listing 2. General solutions for multiple tasks per processor and for a greater number of processors can be implemented with modified MBS operations using the n-process Peterson's algorithm.

Note that the critical sections enforced by Peterson's algorithm (Peterson entry and Peterson exit) are very short instruction sequences used to manipulate the semaphore data structure. The details of Peterson's algorithm aren't shown; they're implicit in the Peterson entry and exit operations. The lock and turn variables used in Peterson's algorithm are distinct from the semaphore data elements accessed in the critical sections.

The critical sections are preceded by DSP/BIOS TSK_disable calls to prevent task switching. A task switch during a critical section could cause another processor trying to enter the same critical section to spin indefinitely in Peterson entry if it tried to acquire the same semaphore. The critical sections should be executed as quickly as possible.

Also note that the example omits error checking, return values, and timeouts. The pseudocode is meant to highlight discussion topics rather than provide a detailed implementation.

Ted Raineault is the cofounder and technical director of Electric Sand Inc. in Poway, Calif. He previously worked as a sales executive for a DSP board company and has 12 years' experience as an embedded software engineer.
