IV. Parallel Operating Systems
João Garcia, Paulo Ferreira, and Paulo Guedes
IST/INESC, Lisbon, Portugal

1. Introduction
2. Classification of Parallel Computer Systems
   2.1 Parallel Computer Architectures
   2.2 Parallel Operating Systems
3. Operating Systems for Symmetric Multiprocessors
   3.1 Process Management
   3.2 Scheduling
   3.3 Process Synchronization
   3.4 Examples of SMP Operating Systems
4. Operating Systems for NORMA environments
   4.1 Architectural Overview
   4.2 Distributed Process Management
   4.3 Parallel File Systems
   4.4 Popular Message-Passing Environments
5. Scalable Shared Memory Systems
   5.1 Shared Memory MIMD Computers
   5.2 Distributed Shared Memory
References

Summary. Parallel operating systems are the interface between parallel computers (or computer systems) and the applications (parallel or not) that are executed on them. They translate the hardware's capabilities into concepts usable by programming languages.

Great diversity marked the beginning of parallel architectures and their operating systems. This diversity has since been reduced to a small set of dominating configurations: symmetric multiprocessors running commodity applications and operating systems (UNIX and Windows NT), and multicomputers running custom kernels and parallel applications. Additionally, there is some (mostly experimental) work done towards the exploitation of the shared memory paradigm on top of networks of workstations or personal computers.

In this chapter, we discuss the operating system components that are essential to support parallel systems and the central concepts surrounding their operation: scheduling, synchronization, multi-threading, inter-process communication, memory management and fault tolerance.

Currently, SMP computers are the most widely used multiprocessors. Users find it a very interesting model to have a computer which, although it derives its processing power from a set of processors, does not require any changes to applications and only minor changes to the operating system. Furthermore, the most popular parallel programming languages have been ported to SMP architectures, also enabling the execution of demanding parallel applications on these machines.

However, users who want to exploit parallel processing to the fullest use those same parallel programming languages on top of NORMA computers. These multicomputers with fast interconnects are the ideal hardware support for message passing parallel applications. The surviving commercial models with NORMA architectures are very expensive machines, which one will find running calculus intensive applications, such as weather forecasting or fluid dynamics modelling.
We also discuss some of the experiments that have been made both in hardware (DASH, Alewife) and in software systems (TreadMarks, Shasta) to deal with the scalability issues of maintaining consistency in shared-memory systems and to prove their applicability on a large scale.

1. Introduction

Parallel operating systems are primarily concerned with managing the resources of parallel machines. This task faces many challenges: application programmers demand all the performance possible, many hardware configurations exist and change very rapidly, yet the operating system must increasingly be compatible with the mainstream versions used in personal computers and workstations, due both to user pressure and to the limited resources available for developing new versions of these systems.

In this chapter, we describe the major trends in parallel operating systems. Since the architecture of a parallel operating system is closely influenced by the hardware architecture of the machines it runs on, we have divided our presentation into three major groups: operating systems for small scale symmetric multiprocessors (SMP), operating system support for large scale distributed memory machines, and scalable distributed shared memory machines.

The first group includes the current versions of Unix and Windows NT, where a single operating system centrally manages all the resources. The second group comprises both large scale machines connected by special purpose interconnections and networks of personal computers or workstations, where the operating system of each node locally manages its resources and collaborates with its peers via message passing to globally manage the machine. The third group is composed of non-uniform memory access (NUMA) machines, where each node's operating system manages the local resources and interacts at a very fine-grained level with its peers to manage and share global resources.

We address the major architectural issues in these operating systems, with a greater emphasis on the basic mechanisms that constitute their core: process management, file system, memory management and communications. For each component of the operating system, the presentation covers its architecture, describes some of the interfaces included in these systems to provide programmers with a greater level of flexibility and power in exploiting the hardware, and addresses the implementation efforts to modify the operating system in order to exploit the parallelism of the hardware. These topics include support for multi-threading, internal parallelism of the operating system components, distributed process management, parallel file systems, inter-process communication, low-level resource sharing and fault isolation. These features are illustrated with examples from both commercial operating systems and research proposals from industry and academia.

2. Classification of Parallel Computer Systems

2.1 Parallel Computer Architectures

Operating systems were created to present users and programmers with a view of computers that allows their use while abstracting away from the details of the machine's hardware and the way it operates. Parallel operating systems in particular enable user interaction with computers with parallel architectures.
The physical architecture of a computer system is therefore an important starting point for understanding the operating system that controls it. There are two famous classifications of parallel computer architectures: Flynn's [Fly72] and Johnson's [Joh88].

2.1.1 Flynn's classification of parallel architectures. Flynn divides computer architectures along two axes, according to the number of data sources and the number of instruction sources that a computer can process simultaneously. This leads to four categories of computer architectures:

SISD - Single Instruction Single Data. This is the most widespread architecture, where a single processor executes a sequence of instructions that operate on a single stream of data. Examples of this architecture are the majority of computers in use nowadays, such as personal computers (with some exceptions such as Intel Dual Pentium machines), workstations and low-end servers. These computers typically run either a version of the Microsoft Windows operating system or one of the many variants of the UNIX operating system (Sun Microsystems' Solaris, IBM's AIX, Linux...), although these operating systems include support for multiprocessor architectures and are therefore not strictly operating systems for SISD machines.

MISD - Multiple Instruction Single Data. No existing computer corresponds to the MISD model, which was included in this description mostly for the sake of completeness. However, we could envisage special purpose machines that could use this model. For example, a "cracker computer" with a set of processors where all are fed the same stream of ciphered data, which each tries to crack using a different algorithm.

SIMD - Single Instruction Multiple Data. The SIMD architecture is used mostly for supercomputer vector processing and calculus. The programming model of SIMD computers is based on distributing sets of homogeneous data among an array of processors. Each processor executes the same operation on a fraction of a data element (see the sketch below). SIMD machines can be further divided into pipelined and parallel SIMD. Pipelined SIMD machines execute the same instruction for an array of data elements by fetching the instruction from memory only once and then executing it, in an overlapped fashion, on a pipelined high performance processor (e.g. Cray 1). In this case, each fraction of a data element (vector) is in a different stage of the pipeline. In contrast to pipelined SIMD, parallel SIMD machines are made from dedicated processing elements connected by high-speed links. The processing elements are special purpose processors with narrow data paths (some as narrow as one bit) which execute a restricted set of operations. Although there is a design effort to keep processing
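To make the data-parallel programming model of SIMD machines more concrete, the following sketch, written in plain C purely for illustration (the array size N and the element-wise addition are arbitrary choices, not taken from the chapter), shows a single operation being specified once and applied uniformly to every element of a homogeneous data set. On an actual SIMD machine the loop would not run sequentially: the addition would be issued once and carried out on all elements either in successive pipeline stages (pipelined SIMD) or simultaneously by the processing elements (parallel SIMD).

    /* Minimal, hypothetical sketch of the SIMD data-parallel model:
       one operation (an element-wise add) is applied to every element
       of two homogeneous arrays.  On SIMD hardware the N iterations
       would be carried out in lockstep by an array of processing
       elements (or by vector hardware), not by a sequential loop. */
    #include <stdio.h>

    #define N 8                        /* number of data elements (arbitrary) */

    int main(void)
    {
        float a[N], b[N], c[N];

        for (int i = 0; i < N; i++) {  /* initialise the data streams */
            a[i] = (float)i;
            b[i] = 2.0f * (float)i;
        }

        /* Single instruction, multiple data: one operation, N elements. */
        for (int i = 0; i < N; i++)
            c[i] = a[i] + b[i];

        for (int i = 0; i < N; i++)
            printf("c[%d] = %.1f\n", i, c[i]);
        return 0;
    }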