
A Study of Software Multithreading in Distributed Systems

T.A. Marsland and Yaoqing Gao                  Francis C.M. Lau
Computing Science Department                   Computer Science Department
University of Alberta                          University of Hong Kong
Edmonton, Canada T6G 2H1                       Hong Kong
{tony,gaoyq}@cs.ualberta.ca                    fmclau@cs.hku.hk

Technical Report TR

November

Abstract

Multiple threads can be used not only as a mechanism for tolerating unpredictable communication latency but also for facilitating dynamic scheduling and load balancing. Multithreaded systems are well suited to highly irregular and dynamic applications, such as tree search problems, and provide a natural way to achieve performance improvement through such new concepts as active messages and remote memory copy. Although already popular in single-processor and shared-memory processor systems, multithreading on distributed systems encounters more difficulties and needs to address new issues such as communication, scheduling and migration between threads located in separate addressing spaces. This paper addresses the key issues of multithreaded systems and investigates existing approaches for distributed concurrent computations.

Keywords: Multithreaded computation, virtual processor, thread safety, active message, data-driven

This research was supported by the Natural Sciences and Engineering Research Council of Canada.


Introduction

Distributed computing on interconnected high performance workstations has been increasingly prevalent over the past few years, and provides a cost-effective alternative to the use of supercomputers. Much effort has been put into developing distributed programming systems such as PVM, P4, MPI and Express. Most existing distributed software systems support only a process-based message-passing paradigm, i.e., processes with local memory that communicate with each other by sending and receiving messages. This paradigm fits well on separate processors connected by a communication network, where a process has only a single thread of execution, i.e., it is doing only one thing at a time. However, the process-based paradigm suffers a performance penalty through communication latency and high context-switch overhead, especially in the case where computing work is frequently created and destroyed. To overcome these problems, lightweight processes or threads (subprocesses in a sheltered environment, having a lower context-switch cost) have received much attention as a mechanism for (a) providing a finer granularity of concurrency, (b) facilitating dynamic load balancing, (c) making virtual processors independent of the physical processors, and (d) overlapping computation and communication to tolerate communication latency. In addition, multithreading provides a natural way to implement non-blocking communication operations, active messages and remote memory copy.

Multithreading can be supported by software or even hardware (e.g., MTA, Alewife, EARTH, START). Although it has been popular in single-processor and shared-memory processor systems, multithreading on distributed systems encounters more difficulties, mainly because of the high-latency and low-bandwidth communication links underlying distributed memory.

In this paper we first address some major issues in designing and implementing multithreading in distributed systems. Then we investigate typical existing multithreaded systems for exploiting the potential of multicomputers and workstation networks, and compare their capabilities. To provide a framework, we use a uniform terminology of thread, process and processor, but provide at the end a table showing the varied terminology used by the originating researchers. From this study we identify three areas where further work is necessary to advance the performance of multithreaded distributed systems.

Major issues

A runtime system defines the compiler's or the programmer's view of a machine: it can be used either as a compiler target or for programmer use. It is responsible for management of computational resources, communication, and synchronization among parallel tasks in a program. Designing and implementing multiple threads in distributed systems involves the following major issues: how to implement threads; how to provide communication and scheduling mechanisms for threads in separate addressing spaces; what kinds of programming paradigms are supported for ease of concurrent programming; and how to make the system thread-safe.

Threads

A process may be defined loosely as an address space together with a current state consisting of a stack, register values and a program counter. A process has only one program counter and does one thing at a time (a single thread of control). As we know, one of the obvious obstacles to the performance of workstation networks is the relatively slow network interconnection hardware. One of the motivations for introducing the notion of threads is to provide greater concurrency within a single process. This is possible through more rapid switching of control of the CPU from one thread to another, because context switching is simplified. When one thread is waiting for data from another processor, other threads can continue executing. This improves performance if the overhead associated with the use of multiple threads is less than the communication latency masked by computation. From the viewpoint of programmers, it is desirable for reasons of clarity and simplicity that programs are coded using a known number of virtual processors, irrespective of the availability of the physical processors. Threads support a scenario where each process is a virtual processor and multiple threads run in the context of a process. The low overhead of thread creation and destruction is especially suitable for highly dynamic, irregular problems where threads are frequently created and destroyed. Threads can be supported:


- at the system level, where all functionality is part of the operating system kernel;

- by user-level code, where all functionality is part of the user program and can be linked in; and

- by a mixture of the above.

User-level library implementations, such as Cthreads, PCR, FastThreads and pthreads, multiplex a potentially large number of user-defined threads on top of a single kernel-implemented process. This approach can be more efficient than relying on operating system kernel support, since the overhead of entering and leaving the kernel at each call and the context-switch overhead of kernel threads are high. User-level threads are, on the other hand, not only flexible but can also be customized to the needs of the language or user without kernel modification. However, a user-level library implementation complicates signal handling and some thread operations such as synchronization and scheduling. Two different scheduling mechanisms are needed: one for kernel-level processes and the other for user-level threads. User-level threads built on top of traditional processes may exhibit poor performance, and even incorrect behavior in some cases, when operating system activities such as I/O and page faults distort the mapping from virtual processors to physical processors. Kernel thread support, such as in Topaz and Mach, simplifies control over thread operations and signal handling, but like traditional processes these kernel threads carry too much overhead. A mixture of the above can combine the functionality of kernel threads with the performance and flexibility of user-level threads. The key issue is to provide a two-way communication mechanism between the kernel and user threads, so that they can exchange information such as scheduling hints. The Psyche system provides such a mechanism by the following approaches: shared data structures for asynchronous communication between the kernel and the user; software interrupts to notify the user level of some kernel events; and a scheduler interface for interactions between dissimilar thread packages. Another example is Scheduler Activations, through which the kernel can notify the thread scheduler of every event affecting the user, and the user can notify the kernel of the subset of user-level events affecting processor allocation decisions. In addition, Scheduler Activations address some problems not handled by the Psyche system, such as page faults and upward compatibility with traditional kernel threads.
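As a concrete illustration of the user-level approach, the minimal sketch below creates several POSIX threads inside one process and waits for them; the pthread calls are standard, while the worker function and its argument are ours for illustration only:

    #include <pthread.h>
    #include <stdio.h>

    /* Hypothetical worker: each thread performs one unit of work. */
    static void *worker(void *arg)
    {
        long id = (long) arg;
        printf("thread %ld running\n", id);
        return NULL;
    }

    int main(void)
    {
        pthread_t tid[4];
        long i;

        /* Create four threads inside this single process; no
         * separate kernel process is spawned per thread. */
        for (i = 0; i < 4; i++)
            pthread_create(&tid[i], NULL, worker, (void *) i);

        /* Wait for all of them to finish. */
        for (i = 0; i < 4; i++)
            pthread_join(tid[i], NULL);
        return 0;
    }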


Thread safety

A software system is said to be thread-safe if multiple threads in the system can execute successfully without data corruption and without interfering with each other. If a system is not thread-safe, then each thread must have mutually exclusive access to the system, forcing serialization of other threads. To support multithreaded programming efficiently, a system needs to be carefully engineered in the following ways to make it thread-safe: use as little global state information as possible; explicitly manage any global state that cannot be removed; use reentrant functions that may be safely invoked by multiple threads concurrently without yielding erroneous results, race conditions or deadlocks; and provide atomic access to non-reentrant functions.
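A minimal sketch of the last two points, using a shared counter (our own example, not code from any system surveyed here): the first increment is a classic read-modify-write race, while the second serializes access with a mutex, the usual way to provide atomic access to global state that cannot be removed.

    #include <pthread.h>

    static long counter = 0;   /* shared global state */
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    /* NOT thread-safe: two threads may read the same old value
     * and one increment is then lost. */
    void increment_unsafe(void) { counter = counter + 1; }

    /* Thread-safe: the mutex makes the read-modify-write atomic. */
    void increment_safe(void)
    {
        pthread_mutex_lock(&lock);
        counter = counter + 1;
        pthread_mutex_unlock(&lock);
    }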

Thread scheduling

Scheduling plays an important role in distributed multithreaded systems. There are two kinds of thread scheduling: scheduling within a process and scheduling among processes. The former is similar to the priority-based, preemptive, or round-robin strategies in traditional single-processor systems. Here we focus on the more complicated thread scheduling among processes, especially for a dynamically growing multithreaded computation. Although some compilers can cope with the scheduling issues statically, their scope is often limited due to the complexity of the analysis and the lack of runtime information. For efficient multithreaded computations of dynamic, irregular problems, runtime-supported dynamic scheduling must meet the following requirements: provide enough active threads to prevent a processor from sitting idle; limit resource consumption (i.e., keep the total number of active threads within the space constraint); and provide computation locality (i.e., try to keep related threads on the same processor to minimize the communication overhead).

There are typically two classes of dynamic scheduling strategies: work sharing (sender-initiated) and work stealing (receiver-initiated). In the work-sharing strategy, whenever a processor generates new threads the scheduler decides whether to migrate some of them to other processors, on the basis of the current system state. Control over the maintenance of the state information may be centralized in a single processor or distributed among the processors. In the work-stealing strategy, whenever processors become idle or under-utilized they attempt to steal threads from busier processors. Many work-sharing and work-stealing strategies have been proposed, for example Chowdhury's greedy load sharing, Eager et al.'s adaptive load-sharing algorithm, and Karp et al.'s work-stealing algorithm. These algorithms vary in the amount of system information they use in making a task-migration decision. Usually the work-stealing strategy outperforms the work-sharing strategy, since if all processors have plenty of work to do there is no thread migration in the work-stealing strategy, while threads are always migrated in the work-sharing strategy. Eager et al. have shown that the work-sharing strategy outperforms work stealing at light to moderate system loads, but the latter is preferable at high system loads, when the costs of task transfer under the two strategies are comparable.
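The receiver-initiated idea can be sketched as a scheduler loop; all types and helper routines below are hypothetical, not the API of any system surveyed in this paper:

    /* A minimal work-stealing (receiver-initiated) scheduler loop. */
    typedef struct thread thread_t;          /* opaque thread descriptor */

    thread_t *ready_queue_pop(int self);     /* local work; may return NULL */
    thread_t *try_steal_from(int victim);    /* one steal attempt; may fail */
    int       random_victim(int self);       /* pick some other processor */
    void      run_thread(thread_t *t);       /* run the thread to completion */

    void scheduler_loop(int self)
    {
        for (;;) {
            thread_t *t = ready_queue_pop(self);   /* prefer local work */
            if (t == NULL)                         /* idle: become a thief */
                t = try_steal_from(random_victim(self));
            if (t != NULL)
                run_thread(t);
        }
    }

A work-sharing scheduler inverts the roles: after creating new threads, the owning processor would decide, from its view of the system state, whether to push some of them to other processors.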

Multithreaded programming paradigms

A programming paradigm provides a method for structuring programs to reduce programming complexity. Multithreaded systems can be classified into shared-memory, explicit message-passing, active message-passing, and implicit data-driven programming paradigms. Multiple threads of control are popular in uniprocessors, shared-memory multiprocessors, and tightly-coupled multicomputers with the support of distributed systems, where a process has multiple threads and each thread has its own register set and stack. All the threads within a process share the same address space, and shared variables are used as a means of communication among threads. These systems support the shared-memory parallel programming paradigm: an application consists of a set of threads which cooperate with each other through shared global variables, as Figure 1(a) shows.

Most loosely-coupled multicomputers and networks of workstations are process-based. Their software supports only message passing as the means of communication between processors, because there are no shared global variables. When threads are introduced to these systems, they are able to draw the performance and concurrency benefits from multithreaded computing, but usually a shared address space is precluded (see Figure 1(b)).

Multiple threads increase the convenience and efficiency of the implementation of active messages. An active message contains not only the data but also the address of a user-level handler (a code segment) which is executed on message arrival. It provides a mechanism to reduce the communication latency, and an extended message-passing paradigm as well, as illustrated in Figure 1(c). There are fundamental differences between remote procedure calls and active messages: the latter have no acknowledgment or return value from the call. Active messages are also used to facilitate remote memory copying, which can be thought of as part of the active-message paradigm, typified by put and get operations on such parallel machines as the CM-5, nCUBE/2, Cray T3D and Meiko CS-2.
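In outline, an active message carries a handler reference plus its arguments, and the receiver simply invokes the handler on arrival; no receive is posted and no reply is implied. The sketch below is illustrative only, with an invented packet layout rather than any real interface:

    /* Sketch of the active-message idea. */
    typedef void (*am_handler_t)(void *payload, int len);

    typedef struct {
        am_handler_t handler;      /* user-level handler run on arrival */
        int          len;          /* number of payload bytes */
        char         payload[64];  /* message data */
    } am_packet_t;

    /* The receiving side invokes the handler directly on arrival;
     * no receive is posted and no acknowledgment is sent back. */
    void am_dispatch(am_packet_t *pkt)
    {
        pkt->handler(pkt->payload, pkt->len);
    }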

Figure 1: Multithreaded programming paradigms

In addition to the traditional shared-memory, message-passing and active message-passing parallel programming paradigms, there could be a data-driven, non-traditional paradigm (see Figure 1(d)), which is somewhat related to dataflow computing. There an application consists of a collection of threads, but with data dependencies among the threads; a thread can be activated only when all its data are available. This paradigm simplifies the development of distributed concurrent programs by providing implicit synchronization and communication.

Multithreaded runtime systems

Since varied and conflicting terminology is used, confusingly, in the literature, below we compare the different systems using a uniform terminology for continuity.

Cilk

Cilk is a C-based runtime system for multithreaded parallel programming; it provides mechanisms for thread communication, synchronization and scheduling, as well as primitives callable within Cilk programs. It was developed by a group of researchers at the MIT Laboratory for Computer Science.

Figure 2: Threads, closures and continuations in Cilk

The Cilk language provides both a procedure abstraction in implicit continuation-passing style and a thread abstraction in explicit continuation-passing style. A continuation describes what to do when the currently executing computation terminates. The programming paradigm provided by Cilk is a dataflow-like procedure-call tree, where each procedure consists of a collection of threads. The entire computation can be represented by a directed acyclic graph of threads. A thread is a piece of code implemented as a non-blocking C function. When a thread is spawned, the runtime system allocates a closure (a thread control table entry) for it. A closure is a data structure that consists of a pointer to the thread's code, all argument slots needed to enable the thread, and a join count indicating the number of missing arguments that the thread depends on. The Cilk preprocessor translates the thread into a C function of one argument (a pointer to a closure) and void return type. Each thread is activated only after all its arguments are available, i.e., its corresponding closure changes from the waiting state to the full state; it runs to completion without waiting or suspending once it has been activated. A thread generates parallelism at runtime by spawning a child thread, so that the calling thread may execute concurrently with its child threads (for an example, see Figure 2). The creation relationships between threads form a spawn tree. Threads may wait for some arguments to arrive in the future. Explicit continuation-passing is used as a communication mechanism among threads: a continuation is a global reference to an empty argument slot that a thread is waiting for.
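The flavor of explicit continuation-passing can be seen in the familiar Fibonacci example, adapted here from the published Cilk papers (details may differ from any particular Cilk release): fib spawns a successor thread sum whose closure waits on two argument slots, and each child eventually fills one slot through its continuation.

    thread fib (cont int k, int n)
    {
        if (n < 2)
            send_argument(k, n);        /* fill the waiting slot k */
        else {
            cont int x, y;
            spawn_next sum (k, ?x, ?y); /* successor waits on x and y */
            spawn fib (x, n - 1);       /* children send results back */
            spawn fib (y, n - 2);
        }
    }

    thread sum (cont int k, int x, int y)
    {
        send_argument(k, x + y);        /* pass result to parent's slot */
    }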

Cilk's scheduler employs a work-stealing strategy for load balancing. Each processor maintains a local ready queue to hold full closures (threads whose data are available). Each thread has a level, which is a refinement of the distance to the root of the spawn tree. A processor always selects a thread with the deepest level to run from its local ready queue. When a processor runs out of work, it selects a remote processor at random and steals the shallowest ready thread from that processor, based on the heuristic that a shallower thread is likely to carry heavier work than a deep one. Computations in Cilk are fully strict, i.e., a procedure only sends values to its parent. This reflects ordinary procedure call/return semantics and has good time and space efficiency. Cilk also allows the programmer to have control over the runtime system.

Preliminary versions of the Cilk system have been implemented on massively parallel machines (the CM-5 MPP, the Intel Paragon MPP and the Silicon Graphics Power Challenge SMP) and on networks of workstations (the MIT Phish). Cilk has demonstrated good performance for dynamic, asynchronous, tree-like MIMD computations, but not yet for more traditional data-parallel applications. The explicit continuation-passing style results in a simple implementation, but it needs more frequent copying of arguments and puts some burden on the programmer. Although the procedure abstraction with implicit continuation-passing is provided to simplify parallel programming, it is less good at controlling the runtime system.

Multithreaded Parallel Virtual Machine

PVM has emerged as one of the most popular message-passing systems for workstations, mainly because of its interoperability across networks and the portability of its TCP- and XDR-based implementation. PVM supports only the process-based message-passing paradigm, where processes are both the unit of parallelism and the unit of scheduling. TPVM and LPVM are two PVM-based runtime systems. They aim at enhancing PVM by introducing threads: to reduce task initiation and scheduling costs, to overlap computation and communication, and to obtain a finer load balance.

Threads-oriented PVM

The TPVM (Threads-oriented PVM) system was developed by Ferrari and Sunderam at the University of Virginia and Emory University. It is an extension of the PVM system that models threads-based distributed concurrent computation: an application's components form a collection of instances (threads) that cooperate through message passing. These instances, the threads, are the basic units of parallelism and scheduling in TPVM.

TPVM supports user-level threads, and provides both the traditional process-based explicit message-passing paradigm and a data-driven paradigm. In the traditional interacting-sequential-processes paradigm, threads execute in the context of disjoint address spaces, and message passing is the only means of communication between threads. This paradigm typically forces the programmer to explicitly manage all thread creation and synchronization in a program. To implement this paradigm, PVM processes are first spawned and registered with the PVM system. The PVM processes then export entry points describing threads, as Figure 3 shows. Threads can be activated at appropriate locations by a thread startup mechanism (the tpvm_spawn primitive).


Figure 3: TPVM message-passing paradigm

The data-driven paradigm provides automatic synchronization and communication based on data dependencies. Threads are declared with named data dependencies: they are initially created at appropriate locations, but are actually activated only when a set of specified messages (data) is available, as shown in Figure 4. As soon as a thread is activated, it can perform non-blocking receives to obtain its required inputs.

TPVM is implemented as three separate modules. The first is the library: the user interfaces with the TPVM system via library calls, namely the export functions (tpvm_export, tpvm_unexport), the message-passing functions (tpvm_send, tpvm_recv) and the spawn function (tpvm_spawn or tpvm_invoke). The second is the portable thread scheduling module, covering thread creation, exiting the running thread, yielding the running thread, obtaining and releasing mutual exclusion, and determining a unique identifier associated with the running thread. The third is the thread server module: a centralized thread server (a task/process) provides support for thread export, scheduling and data-driven thread creation. The scheduling of threads is done in a round-robin fashion on the appropriate processes.
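In rough outline, usage then looks as follows. The function names below come from the module description above, but the argument lists and types are our assumptions for illustration, not TPVM's actual signatures:

    /* Illustrative TPVM-style usage; signatures are hypothetical. */
    extern int tpvm_export(char *entry_point);
    extern int tpvm_spawn(char *entry_point, int where);
    extern int tpvm_send(int tid, int tag);
    extern int tpvm_recv(int tid, int tag);

    void worker_process(void)
    {
        tpvm_export("solver");            /* advertise a thread entry point */
    }

    void master_process(void)
    {
        int t = tpvm_spawn("solver", 0);  /* activate the exported thread */
        tpvm_send(t, 1);                  /* threads cooperate by messages */
        tpvm_recv(t, 2);
    }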


Figure 4: TPVM data-driven paradigm

Preliminary experiments on a small network of Sparc computers show that some applications can benefit from TPVM's finer load balance, whereas many SPMD (Single-Program Multiple-Data) style applications with very regular communication patterns do not.

TPVM is built on top of PVM and is portable, but the lack of thread safety is still not addressed. As a result, the entire PVM library is treated as a critical region: semaphores are used to ensure that only one active thread accesses the library at a time. Such an exclusive-access requirement on the library is restrictive; e.g., the use of blocking PVM calls is precluded. The current version of TPVM manages threads through a centralized server, and this may become a bottleneck as the number of processors increases.

Lightweight-process PVM

LPVM (Lightweight-process PVM) was developed by Zhou and Geist at Oak Ridge National Laboratory. It is a modified version of the PVM system and is implemented on shared-memory multiprocessors.

In PVM, INET-domain UDP sockets are used to communicate among tasks. When one task sends a message to another, three copying operations are involved: from the sender's user space to the system's send buffer, from there to the system's receive buffer, and from there to the receiver's user space. With the availability of physical shared memory, the communication latency and bandwidth can be greatly improved. To retain the traditional PVM interface, the whole user program in LPVM is one process with multiple tasks; each task is a thread in LPVM, instead of being a process as in PVM. However, LPVM supports only the process-based message-passing paradigm.

To make LPVM thread-safe, PVM was modified extensively to remove many global states: all the memory and buffer management code of PVM was modified, and pvm_spawn was changed so that the PVM library spawns new tasks instead of the PVM daemon. The user interface is changed accordingly: e.g., most LPVM calls need to have the caller's task ID as a parameter, and use it as a key to locate resources related to the task; the pvm_pack routines explicitly require the send buffer id returned by pvm_initsend, so that they can pack messages into the buffer safely.
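The flavor of the change can be sketched as follows. The pvm_initsend, pvm_pkint and pvm_send calls are the standard PVM 3 interface; the lpvm_* variants in the comment are only our approximation of the explicit-state style described above, not the real LPVM signatures:

    #include "pvm3.h"   /* standard PVM 3 interface */

    /* Standard PVM: the current send buffer is implicit global state,
     * which is exactly what makes the library unsafe for threads. */
    void send_value_pvm(int desttid, int value)
    {
        pvm_initsend(PvmDataDefault);
        pvm_pkint(&value, 1, 1);
        pvm_send(desttid, 9 /* message tag */);
    }

    /* LPVM style (illustrative only): state is passed explicitly, so
     * concurrent threads cannot corrupt each other's buffers, e.g.
     *
     *   int bufid = lpvm_initsend(mytid, PvmDataDefault);
     *   lpvm_pkint(mytid, bufid, &value, 1, 1);
     *   lpvm_send(mytid, bufid, desttid, 9);
     */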

LPVM is an experimental system, and its current version runs on IBM SMP and Sun SMP systems with a single host. Threads are supported at the system level, and thread scheduling within a process is fully taken care of by the underlying operating system. Thread scheduling among different hosts has not yet been addressed in the current version, because LPVM does not support TPVM's export of thread entry points to another host.

Nexus

The Nexus runtime system was developed at Argonne National Laboratory and the California Institute of Technology. It is designed primarily as a good compilation target to support task-parallel and mixed data- and task-parallel computation in heterogeneous environments.

Nexus supports the active message-passing paradigm. It provides five core abstractions: the processor, process, thread, global pointer, and remote service request. A processor is a physical processing resource. A process is an address space plus an executable program, and corresponds to a virtual processor. A computation consists of a set of threads; all threads in the same process communicate with each other through shared variables. One or more processes can be mapped onto a single node. Processes are created and destroyed dynamically; once created, a process cannot be migrated between nodes. A thread can be created in the same process as, or in a different process from, the currently executing thread. Nexus provides the operations of thread creation, termination and yielding. To support a global name space for the compiler, a global pointer is introduced, which consists of three components: processor, process and address. Threads are created in a remote process pointed to by a global pointer by issuing a remote service request (RSR). Since Nexus does not require a uniform address space on all processors, an RSR specifies the name, rather than the address, of the handler function. It is similar to an active message, but less restrictive. In a remote service request, data can be passed as an argument to a remote RSR handler by means of a buffer (see Figure 5).
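Schematically, issuing an RSR means naming a handler, packing its arguments into a buffer, and sending both to the process a global pointer designates. Every identifier in the sketch below is a hypothetical stand-in for the Nexus operations just described, not the actual Nexus API:

    /* Schematic RSR issue sequence; all names are invented. */
    typedef struct { int processor; int process; void *address; } global_ptr_t;
    typedef struct buffer buffer_t;

    buffer_t *buffer_new(void);
    void      buffer_put_int(buffer_t *b, int *v, int n);
    void      rsr_issue(global_ptr_t dest, const char *handler_name,
                        buffer_t *args);

    void remote_spawn(global_ptr_t dest, int arg)
    {
        buffer_t *args = buffer_new();       /* pack the handler argument */
        buffer_put_int(args, &arg, 1);
        /* The handler is named, not addressed: the destination process
         * resolves "spawn_handler" in its own address space. */
        rsr_issue(dest, "spawn_handler", args);
    }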

Figure 5: Threads, processes and processors in Nexus

The Nexus implementation encapsulates thread and communication functions in threads and protocol modules to support heterogeneity. Threads are implemented by a user-level library. It is the user's and compiler's responsibility to map computation to physical processors, which involves both the mapping of threads to processes and the mapping of processes to processors. Therefore Nexus is more suitable as a compiler target than for programmer use. Nexus is operational on TCP/IP networks of workstations, the IBM SP-2, and the Intel Paragon (using NX). It has been used to implement two task-parallel programming languages, CC++ and Fortran M. Experiments showed that Nexus is competitive with PVM despite its lack of optimization.

Threaded Abstract Machine

TAM is a Threaded Abstract Machine for fine-grain parallelism, developed at the University of California, Berkeley. It serves as a framework for implementing a general-purpose parallel programming language. TAM provides efficient support for the dynamic creation of multiple threads of control, for locality of reference (through the utilization of the storage hierarchy and scheduling mechanisms), and for efficient synchronization through dataflow variables. TAM supports a novel data-driven paradigm.

Figure 6: TAM structure


A TAM program is a collection of code-blocks, where each code-block consists of several non-blocking, non-branching instruction sequences and inlets (message handlers). When a caller invokes its child code-block it does not suspend itself, while an activation frame (roughly comparable to a thread) is allocated for the child code-block. The activation frame provides storage for the local variables and for the resources required for synchronization and scheduling of instruction sequences. The dynamic call structure forms a thread tree. A TAM instruction sequence runs in the context of an activation frame without any suspension or branching. Argument and result passing is represented in terms of inter-thread communication: the caller sends arguments to predefined inlets of the callee, which in turn sends results back to inlets the caller specifies. An inlet is a compiler-generated message handler which processes the receipt of a message for a target thread (see Figure 6).

Threads are the basic unit of parallelism between processors. A thread is ready to run if it has some enabled instruction sequences waiting to be executed; a thread has a continuation vector holding the addresses of all its enabled instruction sequences. Each processor maintains a scheduling queue of ready threads. Finer parallelism within a thread is represented by instruction sequences, which are used to reduce communication latency. Furthermore, parallelism within an instruction sequence can be used to mask processor pipeline latency.

TAM provides mechanisms to control the location of threads, and the compiler may determine this mapping statically. For highly irregular parallel programs, the runtime system needs to provide dynamic load-balancing techniques.

To validate the effectiveness of the TAM execution model, a threaded machine language for TAM, called TL0, was implemented on the CM-5 multiprocessor. The fine-grain parallelism achieved makes TAM especially suitable for tightly-coupled multicomputers.

Multithreaded Message Passing Interface

MPI (Message Passing Interface) tries to collect the best features of many existing message-passing systems, and to improve and standardize them. MPI is a library which encompasses point-to-point and collective message passing, communication scoping, virtual topologies, datatypes, profiling, and environment inquiry. It supports a process-based message-passing paradigm. Processes in MPI belong to groups, and contexts are used to restrict the scope of message passing. A group of processes plus a safe communication context form a communicator, which guarantees that one communication is isolated from another. There are two types of communicators: intra-communicators, for operations within a single group of processes, and inter-communicators, for point-to-point communication between two non-overlapping groups of processes.
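For example, the usual MPI point-to-point exchange inside the default intra-communicator looks like the following minimal sketch (standard MPI-1 calls; the tag and message contents are arbitrary):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, value;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* rank within the group */

        if (rank == 0) {
            value = 42;
            /* Send to rank 1 inside MPI_COMM_WORLD; the communicator
             * isolates this traffic from any other communication. */
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("rank 1 received %d\n", value);
        }
        MPI_Finalize();
        return 0;
    }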

Haines et al. at the NASA Langley Research Center developed a runtime system called Chant, which is layered on top of MPI and POSIX pthreads. In Chant, threads are supported at the user level, and pthreads are extended to support two kinds of global threads: the chanter, and the rope (a collection of chanters). A global thread identifier is composed of a process identifier (a process rank within the group) and a thread rank within the process. Communication operations resemble the MPI operations. Communication between threads within the same process can use shared-memory primitives, while communication between threads in different processes utilizes point-to-point primitives. In point-to-point communication, Chant employs user-level polling for outstanding messages. Like Nexus, Chant supports remote service requests, for remote fetch, remote state update, and remote procedure calls. The existing pthreads primitives are extended to provide a coherent interface for global threads. Chant supports the message-passing and active message-passing paradigms, but it has not addressed the issue of thread scheduling across different addressing spaces.

Chant is currently running both on a network of workstations and on the Intel Paragon multicomputer. It is being used to support Opus, a superset of High Performance Fortran (HPF) with extensions that support parallel tasks and shared data abstractions.

Conclusion

Threads are an emerging model for exploiting multiple levels of concurrency and computational efficiency in both parallel machines and networks of workstations. This paper investigates typical existing multithreaded distributed systems, using a uniform terminology for clarity. Table 1 summarizes the varied terminology of the different research teams, and provides the association with the terms thread, process and processor which we use. Compared to the single-processor and shared-memory multiprocessor cases, multithreading on distributed systems encounters more difficulties, due to the underlying physically distributed memories and the need to address new issues such as thread scheduling, migration and communication across separate addressing spaces.

Table 1: Threads, processes and processors

Term                 | Cilk                 | TPVM                 | LPVM                 | Nexus                | TAM                  | Chant
Instruction sequence | instruction sequence | instruction sequence | instruction sequence | instruction sequence | thread               | instruction sequence
Thread               | thread (closure)     | thread               | thread               | thread               | activation frame     | chanter (rope)
Process              | -                    | task                 | task                 | context              | -                    | process
Processor            | processor            | processor            | host                 | node                 | processor            | processor

The systems we discuss here are either built on top of such existing systems as PVM and MPI, through the integration of a user-level threads package, or are self-contained. Because of the nature of the underlying systems, usually there is no shared address space among threads, even if those threads belong to the same process. The multithreaded programming environments supported by these systems include explicit message-passing, active message-passing and implicit data-driven paradigms. Thread scheduling can be taken care of by the software systems themselves, or by the underlying operating system, or simply left to users and compilers.

TPVM supports both the message-passing and data-driven paradigms. Threads have no shared data, even if they belong to the same PVM process. This makes it easier to activate a thread at any appropriate location, but communication between threads within the same process by message passing, instead of through shared memory, is clearly not natural and may result in extra overhead. In TPVM the lack of thread and signal safety has not yet been addressed, and its centralized server may become a bottleneck for large applications. Modifications in LPVM make it thread- and signal-safe, and thread scheduling is completely taken care of by the operating system. To retain the user interface, LPVM supports only the message-passing paradigm, i.e., threads cannot share data, even though the underlying machines have physical shared memories. The merit of LPVM's approach of combining PVM with multithreading on shared-memory multiprocessors remains unclear: LPVM reduces the communication latency by taking advantage of shared memory in a multiprocessor, but seems not yet to have addressed multithreading on multiple hosts.

Although most user thread packages, such as pthreads, support communication only between threads located in the same address space, Chant extends POSIX pthreads by adding a new global thread, the chanter, and provides point-to-point and remote-service-request communication operations by utilizing the underlying MPI system. Threads within the same process can share data. Furthermore, Chant supports collective operations among thread groups using a scoping mechanism called ropes. Chant supports the message-passing paradigm. In contrast, Nexus is a multithreaded system mainly used as a compiler target. It supports dynamic processor acquisition, dynamic address-space creation, a global memory model, and remote service requests; Nexus supports the active message-passing paradigm. The burden of mapping threads to processes and processes to physical processors is left to users or compilers. Finally, Cilk is a C-based multithreaded system suitable for dynamic tree-like computations. It supports the data-driven paradigm, and thread scheduling is automatically taken care of by the runtime system. Cilk has not yet shown its efficiency in data-parallel applications, and its continuation-passing style results in extra copying overhead, but it benefits from its work-stealing strategy. TAM also supports the data-driven paradigm; compared with Cilk it exploits even finer-grain parallelism, and is therefore more suitable for tightly-coupled distributed-memory multicomputers. Table 2 summarizes the important features of the systems discussed here.

For further advances in multithreading on distributed systems, the following three topics should be addressed:

- Integration of multiple programming paradigms, to support both data-parallel and task-parallel applications. Multithreading has demonstrated its suitability for task-parallel computations, but most existing multithreaded systems have not yet obtained good efficiency for data-parallel problems.

- Coordination between thread and process scheduling. In multithreading systems both thread and process scheduling are supported, but little attention has been paid so far to coordinating these two activities for system-wide global load balancing. Thread scheduling can be viewed as a mechanism for short-term load balancing within a single user process, while process scheduling helps long-term load balancing. One must therefore recognize when a workstation is disconnected or newly available, and migrate processes accordingly.

- Fast communication mechanisms. The protocols of existing communication networks are heavyweight, and the software overhead of communication dominates the hardware cost. Improvements in communication performance can dramatically increase the applicability of multithreading.

In our view, advances in each of these areas will have a dramatic effect on the utility and performance of thread-based systems.


Table 2: Capability comparison of multithreading systems

Capability | Cilk | TPVM (PVM-based) | LPVM (PVM-based) | Nexus | TAM | Chant (MPI-based)
Parallel programming paradigms | data-driven | message passing, data-driven | message passing | active message passing | data-driven | active and message passing
Languages supported | Cilk (a C-based language) | C language | C language | mainly used as a compiler target (for pC++, HPF, Fortran D) | TL0, an abstract machine language for Id | Opus (Fortran-based)
Thread implementation | user-level | user-level | system-level | user-level | user-level | user-level
Underlying software systems required | no special software system required | PVM without any modification; GNU REX thread library | PVM with minimal modification for thread safety; OS-supported threads | POSIX, DCE, C and Solaris threads; various protocols (TCP, Intel NX) | no special software system required | MPI; pthreads with extensions
Thread scheduling strategies | work stealing (select one processor at random) | dynamic scheduling, at thread creation or at the request of users | taken care of by the OS | taken care of by users or compilers | compiler-controlled and runtime-supported frame and thread scheduling | taken care of by users or compilers
Underlying architectures | multiprocessors, multicomputers, a cluster of workstations | heterogeneous distributed environment | shared-memory multiprocessors | heterogeneous distributed environments | tightly-coupled multicomputers | multicomputers, a network of workstations
Strength | efficient tree-like computations; dynamic scheduling strategy | supports two paradigms; portable on any PVM system | takes advantage of shared memory; modifies PVM for thread safety | global name space; supports mixed parallel computations; remote service requests | extended data-driven paradigm; compiler-controlled scheduling | asynchronous remote service requests
Weakness | frequent argument-copying overhead; not suitable for data-parallel problems | only a subset of the library usable, due to thread safety; a bottleneck caused by the server | no support for thread migration from one host to another | no support for an automatic scheduling strategy; not suitable for direct programming | only suitable for tightly-coupled machines, due to fine parallelism | users take care of thread scheduling


References

The Express way to distributed processing. Supercomputing Review, May.

Anant Agarwal, Ricardo Bianchini, David Chaiken, and six others. The MIT Alewife machine: Architecture and performance. In Proceedings of ISCA.

Robert Alverson, David Callahan, Daniel Cummings, and three others. The Tera computer system. In Proceedings of the International Conference on Supercomputing, June.

Thomas E. Anderson, Brian N. Bershad, Edward D. Lazowska, and Henry M. Levy. Scheduler activations: Effective kernel support for the user-level management of parallelism. ACM Transactions on Computer Systems, February.

David L. Black. Scheduling support for concurrency and parallelism in the Mach operating system. IEEE Computer, May.

Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, and Yuli Zhou. Cilk: An efficient multithreaded runtime system. In Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, July.

Robert D. Blumofe and Charles E. Leiserson. Scheduling multithreaded computations by work stealing. In Proceedings of the Annual Symposium on Foundations of Computer Science (FOCS), Santa Fe, NM, USA, November.

Ralph Butler and Ewing Lusk. User's guide to the p4 parallel programming system. Technical report, Argonne National Laboratory, October.

C. Thacker, L. Stewart, and E. Satterthwaite, Jr. Firefly: A multiprocessor workstation. IEEE Transactions on Computers, August.

K. Mani Chandy and Carl Kesselman. CC++: A declarative concurrent object-oriented programming notation. In Gul Agha, Peter Wegner, and Aki Yonezawa, editors, Research Directions in Concurrent Object-Oriented Programming. MIT Press, Cambridge, MA.

S. Chowdhury. The greedy load sharing algorithm. Journal of Parallel and Distributed Computing, May.

E. Cooper and R. Draves. C threads. Technical Report CMU-CS, Department of Computer Science, Carnegie Mellon University.

David E. Culler, Seth Copen Goldstein, Klaus Erik Schauser, and Thorsten von Eicken. TAM: A compiler controlled threaded abstract machine. Journal of Parallel and Distributed Computing, Special Issue on Dataflow, June.

D. Cheriton. The V distributed system. Communications of the ACM, March.

D.L. Eager, E.D. Lazowska, and J. Zahorjan. Adaptive load sharing in homogeneous distributed systems. IEEE Transactions on Software Engineering, May.

D.L. Eager, E.D. Lazowska, and J. Zahorjan. A comparison of receiver-initiated and sender-initiated adaptive load sharing. Performance Evaluation.

Adam Ferrari and V.S. Sunderam. TPVM: Distributed concurrent computing with lightweight processes. In Proceedings of IEEE High Performance Distributed Computing, August.

Ian Foster, Carl Kesselman, and Steven Tuecke. Nexus: Runtime support for task-parallel programming languages. Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL, August.

Ian T. Foster and K. Mani Chandy. Fortran M: A language for modular parallel programming. Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL, June.

Al Geist, Adam Beguelin, Jack Dongarra, Weicheng Jiang, Robert Manchek, and Vaidyalingam S. Sunderam. PVM: Parallel Virtual Machine. A User's Guide and Tutorial for Network Parallel Computing. Scientific and Engineering Computation Series, MIT Press, Cambridge, MA.

Matthew Haines, David Cronk, and Piyush Mehrotra. On the design of Chant: A talking threads package. In Proceedings of Supercomputing. Also as ICASE Report, Institute for Computer Applications in Science and Engineering, NASA Langley Research Center.

M. Homewood and M. McLaren. Meiko CS-2 interconnect Elan-Elite design. In Proceedings of Hot Interconnects, August.

Herbert H.J. Hum, Olivier Maquelin, Kevin B. Theobald, and fourteen others. A design study of the EARTH multiprocessor. In Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, June.

Michael J. Beckerle. Overview of the START multithreaded computer. In Proceedings of IEEE COMPCON, February.

Richard M. Karp and Yanjun Zhang. Randomized parallel algorithms for backtrack search and branch-and-bound computation. Journal of the ACM, July.

M. Weiser, A. Demers, and C. Hauser. The portable common runtime approach to interoperability. In Proceedings of the ACM Symposium on Operating Systems Principles, Litchfield Park, Arizona.

Brian D. Marsh, Michael L. Scott, Thomas J. LeBlanc, and Evangelos P. Markatos. First-class user-level threads. In Proceedings of the ACM Symposium on Operating Systems Principles (SOSP), October.

D.E. Maydan, J.L. Hennessy, and M.S. Lam. Efficient and exact data dependence analysis. In Proceedings of the SIGPLAN Conference on Programming Language Design and Implementation, June.

Piyush Mehrotra and Matthew Haines. An overview of the Opus language and runtime system. Technical report, ICASE, Institute for Computer Applications in Science and Engineering, NASA Langley Research Center, May.

Frank Mueller. A library implementation of POSIX threads under UNIX. In Proceedings of the USENIX Conference.

Anthony Skjellum, Nathan E. Doss, Kishore Viswanathan, Aswini Chowdappa, and Purushotham V. Bangalore. Extending the message passing interface (MPI). In Anthony Skjellum and Donna S. Reese, editors, Proceedings of the Scalable Parallel Libraries Conference II. IEEE Computer Society Press, October.

T. Anderson, E. Lazowska, and H. Levy. The performance implications of thread management alternatives for shared-memory multiprocessors. IEEE Transactions on Computers, December.

Thorsten von Eicken, David E. Culler, Seth Copen Goldstein, and Klaus Erik Schauser. Active Messages: A mechanism for integrated communication and computation. In Proceedings of the International Symposium on Computer Architecture, Gold Coast, Australia, May.

Honbo Zhou and Al Geist. LPVM: A lightweight process (thread) based PVM for SMP systems. Technical report, Oak Ridge National Laboratory.