Murks – A POSIX Threads Based DSM System

M. Pizka, C. Rehn
Institut für Informatik, Technische Universität München, 80290 Munich, Germany
email: {pizka,rehn}@in.tum.de

ABSTRACT

The shared memory paradigm provides an easy-to-use programming model for communicating processes. Its implementation in distributed environments has proved to be harder than expected. Most distributed shared memory (DSM) systems either suffer from poor performance or are very complicated to use. The DSM system Murks, presented in this paper, is the result of the sobering experiences we made when trying to integrate an existing DSM system into a distributed OS.

Murks' distinctive features are 1) full support for POSIX multithreading instead of a proprietary thread package and 2) that it does not burden the programmer with additional, difficult-to-use memory management services. These advantages are achieved by a tight integration of the DSM subsystem into the OS. This way, Murks is able to provide a better combination of performance and ease of use than other DSM systems.

KEY WORDS

distributed systems, operating systems, memory management, multi-threading, distributed shared memory

1. Motivation

Murks is a page-based distributed shared memory (DSM) system providing full transparency concerning the physical distribution of the hardware (HW) configuration. Instead of focusing on peak performance, our major design goal was to develop a DSM that can be seamlessly integrated into a distributed and parallel operating system (OS). The reason for this was the observation, and the (frustrating) practical experience, that there are certainly many different DSM systems with well-elaborated management strategies, but none of them proved to be suitable as an OS-level service cross-cutting several applications of a long-lived system.

To substantiate these statements, we first sketch in Section 1.1 the key design principles of the distributed OS MoDiS [PE97], for which a DSM system was sought. Afterwards, we are able to define requirements for a suitable DSM system and to discuss the pros and cons of the memory and execution models provided by existing DSMs in Section 2. Section 3 gives an explanation for the lack of support for POSIX multi-threading [IEE95] in most DSM systems. Following these observations, we introduce in Section 4 the concepts of our DSM system Murks, which aims at overcoming the observed shortcomings, before we describe implementation details, such as Linux kernel modifications, in Section 5. Section 6 discusses the performance of Murks, and Section 7 concludes with a brief summary of our observations and Murks' capabilities.

1.1 Distributed OS Context

In the research project MoDiS (Model oriented Distributed Systems) [EP99] we follow a language-based, integrated, and top-down oriented approach to develop a distributed OS that provides a high-quality programming environment (i.e., ease of use, reusability, maintainability, etc.) as well as proper performance for general purpose distributed computing.

1.1.1 Language-based

The term language-based means that the development of the OS is driven by the development of high-level language concepts for the specification of parallel and distributed systems. For example, concurrency, cooperation and synchronization are language concepts instead of runtime-level services. This way, we are able to provide suitable programming concepts without being restricted either to the features of a predefined OS or to the concepts of an already existing programming language, such as passing object references in Java, which must be regarded as inadequate in distributed environments.

INSEL

This programming model is implemented and offered to the programmer by the syntax of the Ada-like [Ada83] programming language INSEL [Win96] and its optimizing source to machine-level compiler [Piz97]. INSEL can be characterized as a high-level, type-safe, imperative and object-based programming language supporting explicit task parallelism.

Objects are dynamically created during execution as instances of class-describing objects. To prevent dangling pointers, objects are implicitly deleted by a garbage collector. As determined by the class definition, objects are either active, in which case they are called actors, or passive. An actor defines a separate flow of control and executes concurrently with its creator.

Actors may communicate directly in a rendezvous style as well as indirectly by accessing shared passive objects (shared memory paradigm). All requests to objects are served synchronously.

DSM-Requirements Part 1

R1 Trivial: We need a DSM subsystem that efficiently implements actor interaction via shared passive objects of arbitrary granularity. Note that passive objects may be as small as a single integer variable or as large as complex data structures composed of other passive and active objects.

R2 The DSM must be compatible with varying degrees of parallelism defined by actors of arbitrary number and size (in both space and time). Due to the high level of abstraction of the language concepts, it would be completely inappropriate to rigidly map the actor concept to, e.g., heavy-weight processes. Instead, an actor might be implemented as a kernel-level or user-level thread or even inlined into another thread. The DSM must not obstruct this flexibility.

R3 The language-based philosophy demands that the application level is left unspoiled by DSM concepts. The programmer must not be burdened with any memory management concepts besides creating objects in stack or heap space.

1.1.2 Integrated

The entire distributed system in execution, comprising several applications and the distributed OS, is regarded as a single globally structured system [PE97], which can be dynamically extended with new applications and possibly also with new OS functionality.

On the one hand, this holistic view provides an ideal integration of knowledge about the properties of distributed objects and their interdependencies. Thus, at a conceptual level, the OS may consider non-local information for important resource management decisions instead of being limited to local optimization a priori. On the other hand, the single system view implies longevity of the system, which has a surprisingly strong impact on low-level OS management mechanisms and thereby also on the DSM subsystem.

DSM-Requirements Part 2

R4 The DSM subsystem has to support a single (integrated) virtual address space (VA) across all nodes. Note that a DSM system does not necessarily implement a single VA!

R5 The DSM must support multiple applications simultaneously and provide partial cleanup of memory partitions previously allocated by terminated applications.

R6 It must also support long-lived large-scale applications that eventually migrate between arbitrary nodes of the system.

1.1.3 Top-Down Management

The philosophy behind our approach to automate resource management is to free the application level from all tasks but specifying parallel and cooperative computations. This especially holds for all tasks dealing with the physical distribution of the execution environment. Note that this principle is a prerequisite for achieving portability, scalability and enhanced maintainability of distributed applications.

It is solely the responsibility of the OS to map the abstract requirements of applications, specified by means of the language INSEL, to physically distributed resources, such as implementing the abstract creation of a dynamic integer object by allocating a memory object in shared heap space. Conceptually, this mapping between abstract concepts and concrete HW features is achieved by top-down oriented stepwise refinement, performed by the OS. Practically, some of these transformation steps are performed at compile-time while others are executed at run-time. Hence, all mechanisms, strategies and services of the OS are to be tailored to the language-level concepts to match the requirements of distributed applications [ea97].

1.2 Generalization

It should be obvious that most of the requirements stated above are relevant for other distributed systems, too, and are not limited to the scope of MoDiS. Thus, MoDiS should only be regarded as the initial spark for our considerations. Its major contribution is to allow greater creativity in the design of a distributed computing infrastructure by being language-based and top-down oriented instead of concentrating on how to extend legacy systems.

We believe that integration, automation and transparency are important prerequisites to achieve the desired ease of use and performance of distributed systems in general. Hence, these design principles will also be significant for DSM systems beyond the context of MoDiS.

2. Related Work

DSMs were of strong interest to the OS community from the mid 80s to the late 90s [Li86, ea99]. The main motivation was to provide a simplified programming model relative to the omnipresent message passing paradigm [Sun93, For94]. But several attempts to implement the DSM concept exposed difficulties that proved to be hard or impossible to overcome without specialized HW. [Car98] summarizes that DSM systems suffer from

- being too difficult to use, or (non-exclusive)
- severe performance penalties.

Since we are in need of support for a combination of shared memory and parallelism, we have to examine both the memory and the execution models of DSM systems.
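
For orientation, the combination of shared memory and parallelism asked for by requirements R1 and R3 can be pictured on a single node with nothing but standard POSIX threads. The following is a minimal sketch and not Murks code: the names `counter`, `actor` and `run_actors` are hypothetical, and it assumes, purely for illustration, that each actor maps to one pthread and that a passive object is a mutex-protected integer.

```c
#include <pthread.h>

#define NUM_ACTORS 4
#define INCREMENTS 1000

/* Hypothetical "passive object": one shared integer plus its lock. */
struct counter {
    pthread_mutex_t lock;
    long value;
};

static struct counter shared = { PTHREAD_MUTEX_INITIALIZER, 0 };

/* Hypothetical "actor": a separate flow of control that updates
 * the shared passive object. Note the absence of any DSM-specific call. */
static void *actor(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&shared.lock);
        shared.value++;
        pthread_mutex_unlock(&shared.lock);
    }
    return NULL;
}

/* Starts all actors, waits for them, and returns the final value,
 * or -1 if a thread could not be created. */
long run_actors(void)
{
    pthread_t tid[NUM_ACTORS];
    int created = 0;
    int failed = 0;

    shared.value = 0;
    for (int i = 0; i < NUM_ACTORS; i++) {
        if (pthread_create(&tid[i], NULL, actor, NULL) != 0) {
            failed = 1;
            break;
        }
        created++;
    }
    /* Join only the threads that were actually started. */
    for (int i = 0; i < created; i++)
        pthread_join(tid[i], NULL);

    return failed ? -1 : shared.value;
}
```

Called from a program's `main` function, `run_actors()` returns `NUM_ACTORS * INCREMENTS` (4000 with the constants above) when all threads start successfully. The sketch runs on one machine only; it merely shows the kind of application code, free of explicit memory management services, that a transparent DSM in the sense of R3 would be expected to leave unchanged.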
2.1 Memory Models Single-Threaded DSMs Most DSM-Systems provide a single threaded execution Ivy [Li86] was the first page-based DSM, followed