A Highly Safe Self-Stabilizing Mutual Exclusion Algorithm

I-Ling Yen
Department of Computer Science, A Wells Hall, Michigan State University, East Lansing, MI
Email: yen@cps.msu.edu

Farokh B. Bastani
Department of Computer Science, University of Houston, Houston, TX
Email: FBastani@uh.edu

This material is based in part upon work supported by NSF under Grant No. CCR.

Abstract

Conventional self-stabilizing algorithms cannot be used for safety-critical systems due to the period of vulnerability that exists after a transient failure occurs until the system stabilizes. In this paper, we consider a highly safe self-stabilizing system in which the vulnerability problem is tackled. The design principles we use to achieve this goal include the sobriety test and processor specialization. The sobriety test is used to prevent the system from performing incorrect actions when the system state may be faulty. Specialization disables individual processors from making faulty moves. We have developed a self-stabilizing mutual exclusion algorithm that guarantees mutual exclusion with a very high probability, even in the presence of failures.

Keywords: Self-stabilization systems, mutual exclusion algorithm, fault tolerance, distributed computing.

Introduction

The concept of self-stabilizing systems was first proposed by Dijkstra. He illustrated the concept with a cyclic relaxation algorithm and three mutual exclusion protocols. Over the past two decades, Dijkstra's self-stabilizing mutual exclusion algorithm has been enhanced in several ways: different topologies, reduction in state, elimination of the centralized daemon, elimination of asymmetry, randomized versions, and various applications.

The major attraction of self-stabilization is the perceived elegance of eliminating clumsy mechanisms for detecting illegal states and initiating recovery actions. It also does not require the system to be initialized, a task which is difficult to coordinate in distributed systems. Further, the decentralized control, where communication is
required only between neighboring processors, greatly reduces the potential overhead for global control.

A major problem that prevents the self-stabilizing approach from being used for critical applications is its vulnerability. Essentially, the system may go through a vulnerable period after failures that bring the system into an illegitimate state. During this period, the system behavior may not fully satisfy the specified requirements. In order to deal with this vulnerability, we classify failures into two types, i.e., random and malicious failures. In the random failure model, a transient failure can bring the system into any illegitimate state with equal probability. In the malicious failure model, some failed processors may maliciously try to violate the system legitimacy without being detected by local checks, and subsequently cause critical damage. Generally, algorithms with much higher complexity are required to tackle malicious failures. In this paper, we develop a self-stabilizing mutual exclusion algorithm that copes with the vulnerability problem, assuming random failures.

The system we consider consists of a group of processors or intelligent devices interconnected via a ring network, where each processor has only a one-way communication channel with the processor immediately to its right. Let $P_i$ denote the $N$ processors in the system. We assume that only simplex communication is allowed, from processor $P_i$ to its neighbor on the ring, with indices taken modulo $N$. The goal is to provide mutually exclusive access to some shared resource for the $N$ processors. The required properties of the algorithm include decentralized control, where only the state of the nearest neighbor of processor $P_i$ (index taken modulo $N$) can be examined for the global control, and fully asynchronous execution of all processors without requiring any centralized daemon process. The algorithm we develop guarantees that, with a high probability, the system is safe from failures.

Self-Stabilizing Mutual Exclusion Algorithm
Self-stabilizing systems naturally achieve decentralized control. However, due to the decentralized control, it is very difficult for a self-stabilizing system to deal with the vulnerability problem. When violation of the global requirement does not directly imply violation of the local requirement, the system can be in a state in which no processor can determine by itself whether the system state is legitimate, and hence the system may misbehave. However, algorithms can be designed such that the system will enter the small set of states that can make it misbehave only with a very small probability. Also, various processors can be assigned different key tasks and monitor each other's behavior. These new design principles for self-stabilizing algorithms are discussed in the following.

Sobriety test. The system must be designed so that the set of legitimate states is a very small fraction of the set of all possible states. This will allow nonfaulty units to detect illegitimate states and go to a home state that is known to be safe. This also allows rapid restabilization of the system to a stable operating mode.

Specialization. To prevent a group of faulty units from collectively subverting the safety of the system, the privileged processors in the system are specialized into two or more categories such that no processor by itself has sufficient information or capability to damage the system. To operate properly, two or more processors must cooperate with each other, either by sharing information or by pooling their capabilities to effect changes in the system state. Thus, this approach guarantees that the failure of a processor will not result in any unsafe operation.

In our algorithm, the sobriety test is implemented by using two methods: expanding the set of possible states and using public key cryptography. In the mutual exclusion algorithm, for a given state of a given processor, there is only one value that can be the new state value if the system is to continue to be
legitimate. On the other hand, if the possible states of any processor can be any integer, then the ratio of the number of legitimate states to the number of illegitimate states is very small. Thus, with random failures, the probability that the state values of the current and next states represent a legitimate transition is very small. Also, due to the use of public key cryptography, the probability that a nonrandomly generated value can be the next state value is also very small.

The specialization approach can be realized by using different private keys for the top and bottom processors. A new state value is not generated solely by the top processor or any other single processor, but by the bottom and top processors together. Each one is specialized by its own private key, which is known only to itself. Thus, the failure of some processors will not make the system vulnerable.

As defined by Dijkstra, one processor is the top processor and one is the bottom processor; the remaining $P_i$ are the other processors. The top processor has knowledge of a private encryption key $K_T$, while the bottom processor has knowledge of a private encryption key $K_B$. The public keys for decryption, $D_T$ and $D_B$, are known to all the processors. All the processors have a state variable, which is an integer. In the code for processor $P_i$ this is referred to as $S$, while $R$ refers to the state of the processor to its right, i.e., its neighbor with index taken modulo $N$. Also, we use CR to indicate that the processor is in the critical region. The self-stabilizing algorithm without a vulnerable period is given in the following.

Top processor:
    do
        S = decrypt(D_B, R)  →  CR; S := encrypt(K_T, R)
    end do

Other processors:
    do
        S = decrypt(D_B, decrypt(D_T, R))  →  CR; S := R
        S ≠ decrypt(D_B, decrypt(D_T, R)) and S ≠ R  →  S := R
    end do

Bottom processor:
    do
        S = decrypt(D_T, R)  →  CR; S := encrypt(K_B, R)
        S ≠ decrypt(D_T, R) and decrypt(D_B, S) ≠ R  →  S := encrypt(K_B, R)
    end do

Functions decrypt and encrypt perform the public key decryption and encryption operations. Instead of having an increase-by-one
function for the top processor to generate new state values, as in Dijkstra's algorithm, we use the encrypt function. Thus, the legitimacy of the new state propagated from the right processor can be easily verified by decryption using the public keys. Consequently, the faulty state of a processor can be easily detected when referenced by its left processor, and hence will not allow incorrect access to a critical section.

To prevent a faulty top processor from arbitrarily generating a stream of tokens, we let the bottom processor also encrypt the state using its private encryption key $K_B$. In the case of failure, a new value generated by the top processor will not be effective for entering the critical section unless it has traversed the entire ring and has been encrypted by the bottom processor. Here we assume that $K_T$ and $K_B$ have been chosen such that $\mathit{encrypt}(K_B, \mathit{encrypt}(K_T, x))$ does not generate the same value within $N$ iterations. More formally, let $f(x)$ denote $\mathit{encrypt}(K_B, \mathit{encrypt}(K_T, x))$. The property $\forall k,\ 1 \le k \le N:\ f^k(x) \neq x$ should hold.

Assume that decoding of the public key cryptosystem is not possible other than by random trials. It can be formally proved that, with a high probability, the algorithm guarantees that no two units will enter the critical section at the same time, in spite of nonmalicious processor failures or arbitrary state transitions. Let $p$ denote the probability that an encoded value using a private key can be decoded with a random selection of a number. Consider a bit value We have p Let
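
The guarded commands for the three processor roles can be exercised with a small simulation. The following is only a sketch under several assumptions that are not taken from the paper: the keys are toy textbook-RSA exponents over one shared small modulus (insecure, chosen only so that decrypt undoes encrypt), the top processor sits at index 0 and the bottom at index N-1, processor i reads the state of processor i-1 on the ring, and the relational operators in the guards are a reading of the algorithm consistent with the surrounding prose. The names `move`, `pick_seed`, `legitimate_state` and `simulate` are hypothetical helpers.

```python
import random

# Toy textbook-RSA parameters (assumption, insecure): one shared modulus and
# two key pairs. D_* are public decryption exponents, K_* private ones.
MOD = 61 * 53
PHI = 60 * 52
D_T, D_B = 17, 7
K_T = pow(D_T, -1, PHI)
K_B = pow(D_B, -1, PHI)


def encrypt(key, x):
    return pow(x, key, MOD)


def decrypt(key, y):
    return pow(y, key, MOD)


def f(x):
    """One trip around the ring: encrypt(K_B, encrypt(K_T, x))."""
    return encrypt(K_B, encrypt(K_T, x))


def move(i, s):
    """Return (enters_cr, new_state) for processor i, or None if no guard holds."""
    r, mine, last = s[i - 1], s[i], len(s) - 1   # assumption: P_i reads P_(i-1)
    if i == 0:                                   # top processor
        return (True, encrypt(K_T, r)) if mine == decrypt(D_B, r) else None
    if i == last:                                # bottom processor
        if mine == decrypt(D_T, r):
            return True, encrypt(K_B, r)
        if decrypt(D_B, mine) != r:
            return False, encrypt(K_B, r)
        return None
    if mine == decrypt(D_B, decrypt(D_T, r)):    # other processors
        return True, r
    return (False, r) if mine != r else None


def pick_seed(n):
    """Smallest x >= 2 with f^k(x) != x for every k in 1..n."""
    for x in range(2, MOD):
        y = x
        for _ in range(n):
            y = f(y)
            if y == x:
                break
        else:
            return x
    raise ValueError("no suitable seed for this ring size")


def legitimate_state(n, b):
    """All non-bottom processors hold decrypt(D_B, b); the bottom holds b."""
    return [decrypt(D_B, b)] * (n - 1) + [b]


def simulate(states, steps, rng):
    """Run a random asynchronous schedule.

    Returns (largest number of simultaneously privileged processors,
    per-processor count of critical-region entries).
    """
    s = list(states)
    entries = [0] * len(s)
    worst = 0
    for _ in range(steps):
        moves = {}
        for i in range(len(s)):
            m = move(i, s)
            if m is not None:
                moves[i] = m
        worst = max(worst, sum(1 for cr, _ in moves.values() if cr))
        if not moves:
            break
        i = rng.choice(sorted(moves))
        enters_cr, new_state = moves[i]
        entries[i] += int(enters_cr)
        s[i] = new_state
    return worst, entries


if __name__ == "__main__":
    n = 5
    start = legitimate_state(n, pick_seed(n))
    print(simulate(start, 50, random.Random(0)))
```

Starting from `legitimate_state`, the schedule should show the privilege passing from the top processor around the ring; overwriting one entry of the state list with an unrelated value is a simple way to watch the non-CR guards copy states without granting access until the ring is consistent again. With a modulus this small, accidental matches are far more likely than in the setting the paper has in mind, so the sketch illustrates the mechanism rather than the probability bound.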
