Efficient High-Level Parallel Programming


Theoretical Computer Science 196 (1998) 71-107, Elsevier

George Horatiu Botorog a,*,1, Herbert Kuchen b

a Aachen University of Technology, Lehrstuhl für Informatik II, Ahornstr. 55, D-52074 Aachen, Germany
b University of Münster, Institut für Wirtschaftsinformatik, Grevener Str. 91, D-48159 Münster, Germany

Abstract

Algorithmic skeletons are polymorphic higher-order functions that represent common parallelization patterns and that are implemented in parallel. They can be used as the building blocks of parallel and distributed applications by embedding them into a sequential language. In this paper, we present a new approach to programming with skeletons. We integrate the skeletons into an imperative host language enhanced with higher-order functions and currying, as well as with a polymorphic type system. We thus obtain a high-level programming language which can be implemented very efficiently. We then present a compile-time technique for the implementation of the functional features, which has an important positive impact on the efficiency of the language. After describing a series of skeletons which work with distributed arrays, we give two examples of parallel algorithms implemented in our language, namely matrix multiplication and Gaussian elimination. Run-time measurements for these and other applications show that our programs run within a factor of 1 to 1.5 of message-passing C.

© 1998 Elsevier Science B.V. All rights reserved.

Keywords: Algorithmic skeletons; High-level parallel programming; Efficient implementation of functional features; Distributed arrays

1. Introduction

Although parallel and distributed systems are steadily gaining importance, the state of the art in parallel software is far from satisfactory.
Not only is the programming of such systems a tedious and time-consuming task, but its outcome is usually machine-dependent and hence its portability is restricted. One of the main reasons for these difficulties is the rather low level of the parallel programming languages available today, in which the user has to account explicitly for all aspects of parallelism, i.e., for communication, synchronization, data and task distribution, load balancing, etc. Moreover, program testing is impeded by deadlocks, non-deterministic program runs, non-reproducible errors, and sparse debugging facilities. Finally, programs run only on certain machines or on restricted classes of machines. However, because these languages are close to the machine, high performance can be achieved.

* Corresponding author. E-mail: [email protected].
1 The work of this author was supported by the "Graduiertenkolleg Informatik und Technik" at the Aachen University of Technology.
0304-3975/98/$19.00. PII S0304-3975(97)00196-8

Attempts have been made to design high-level languages that hide the details of parallelism and allow the programmer to concentrate on the problem at hand. Such high-level approaches naturally lead to less efficient programs than those written directly in low-level languages, and the crux is to find a good trade-off between efficiency losses and gains in ease of programming, as well as in safety and portability of programs.

In this paper, we present such an approach, namely a language with algorithmic skeletons, which combines the advantages of high-level (declarative) languages and those of efficient (imperative) ones. The main aim is to enable easy parallel programming and at the same time to obtain an efficient implementation.

A skeleton is an algorithmic abstraction common to a series of applications, which can be implemented in parallel [6].
Skeletons are embedded into a sequential host language, thus being the only source of parallelism in a program. Most skeletons, for instance map, farm, and divide&conquer [7], are polymorphic higher-order functions and can thus be defined in functional languages in a straightforward way. This is why most languages with skeletons employ a functional host [8,14]. However, programs written in these languages are still 5 to 10 times slower than their low-level counterparts, e.g., C with message passing [14].

A series of approaches use an imperative language as a host, for instance P3L [1], which builds on top of C++ and in which skeletons are internal language constructs. The main drawbacks are the difficulty of adding new skeletons and the fact that only a restricted number of skeletons can be used.

A related approach is PCN [9], where templates provide a means to define reusable parallel program structures. The language also contains lower-level features, such as definitional variables, by which processes can communicate, but which can lead to problems such as deadlocks.

In the language ILIAS [18], arithmetic and logic operators are extended by overloading to global operators that can be applied pointwise to the elements of matrices. These operations are not skeletons in the sense of the definition above, but their functionality resembles that of certain skeletons.

SPP(X) [8] uses a 'two-layer' language: a high-level, functional language for the application and a low-level language (at present Fortran) for efficient sequential code that is coordinated by the skeletons.

In this paper, we present a new approach to parallel programming with algorithmic skeletons. We define an imperative language called Skil,2 which we use as a framework in which we embed the skeletons. Skeletons are thus not part of the language, but they are implemented in it. In order to enable a straightforward integration of the skeletons, Skil contains a series of functional features, mainly higher-order functions, currying and polymorphic types. These features are eliminated at compile time by an instantiation procedure, which generates for each polymorphic higher-order function one or more monomorphic first-order instances. Thus, as our results show, an efficient overall implementation can be achieved. On the other hand, since the transformation is static, a restriction has to be imposed on certain recursively defined higher-order functions. However, this restriction applies only to special cases, which seldom appear in practice.

2 Skil is an acronym for Skeleton Imperative Language.

Unlike NESL [2] or P3L [1], Skil does not support nested parallelism, because the emphasis was placed here on the efficiency of the implementation and on the integration of the functional and imperative features, rather than on the composition of skeletons.

Using Skil, we construct the data structure 'distributed array' and define a series of data-parallel skeletons operating on it. We then show how two parallel applications can be programmed in a sequential style by using these skeletons. Run-time measurements show that our results are considerably better than those obtained by using pure functional languages with skeletons, approaching those of direct low-level (C) implementations.

The paper is organized as follows. Section 2 presents the language Skil with an emphasis on its functional features. Section 3 describes distributed arrays and Section 4 presents a series of skeletons working on this data structure. Section 5 describes the core of the implementation of Skil, namely the instantiation procedure. Section 6 then presents two parallel applications implemented by means of skeletons and gives run-time results, as well as comparisons with other implementations. Finally, Section 7 concludes the paper.

2. The language Skil

As mentioned in the introduction, Skil aims at combining a high programming level with an efficient implementation. We have therefore started with an imperative language, namely C,3 and enhanced it by a series of functional constructs. The goal was to allow skeletons, which are polymorphic higher-order functions, to be specified in Skil as naturally as in functional languages. As an example, consider the function map, which applies a given argument function to a given list of items, yielding a list of results. In Haskell [11], this function can be defined as follows:

    map :: (a -> b) -> [a] -> [b]
    map f []     = []
    map f (x:xs) = f x : map f xs

3 This choice is only a pragmatic one, based on the widespread use of C, for which mature compilers are available (we use a C compiler as a back-end in our implementation). However, other imperative languages could equally well be used instead.

We shall now present the language enhancements that are necessary in order to make such function definitions possible.

2.1. Higher-order functions

Skil allows the definition of higher-order functions (HOFs), i.e., of functions with functional arguments or results. Although this feature could be simulated in C by passing or returning pointers to functions, this simulation would not allow the use of functional arguments created dynamically by partial applications. Hence, we allow the explicit definition of HOFs. For instance, a (for the time being monomorphic) version of the function map, working on lists of integers, can be defined in Skil as follows, provided the standard list functions nil, cons, and empty are available:

    intlist map (int f (int), intlist l)
    {
        if (empty (l))
            return nil ();
        else
            return cons (f (l -> elem), map (f, l -> next));
    }

Although functions can be arguments of other functions, they cannot occur in data structures and in certain expressions.
More exactly, the following restrictions apply:

• No element of a data structure may have a functional type. Pointers to functions are nevertheless allowed for compatibility with C.
• Expressions with functional type may occur only in function applications and in return statements.

In this paper, we concentrate on HOFs with functional arguments, both in the description of the implementation of HOFs and in the examples. HOFs with functional results are less important from the point of view of algorithmic skeletons; therefore, we shall only outline them here.
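
The remark in Section 2.1 that function pointers cannot express functional arguments created by partial application can be made concrete in plain C. The following is only a sketch under stated assumptions, not code from the paper: the list representation `intlist_node` and the names `map_fp`, `map_env`, `map_add`, `free_list`, `sum` and `demo` are hypothetical. `map_fp` is the function-pointer simulation of map, `map_env` shows the explicit environment that C needs in order to mimic a partial application such as add(10), and `map_add` is a hand-written first-order instance of the kind we assume a compile-time instantiation could produce; the actual instantiation procedure is the subject of Section 5 and is not reproduced here.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical list type; the text only names nil, cons, and empty. */
typedef struct intlist_node {
    int elem;
    struct intlist_node *next;
} *intlist;

intlist nil(void) { return NULL; }
int empty(intlist l) { return l == NULL; }

intlist cons(int elem, intlist next) {
    intlist cell = malloc(sizeof *cell);
    if (cell == NULL) abort();          /* sketch only: no error recovery */
    cell->elem = elem;
    cell->next = next;
    return cell;
}

void free_list(intlist l) {
    while (!empty(l)) {
        intlist next = l->next;
        free(l);
        l = next;
    }
}

/* Plain-C simulation of the higher-order map: f is a function pointer. */
intlist map_fp(int (*f)(int), intlist l) {
    if (empty(l)) return nil();
    return cons(f(l->elem), map_fp(f, l->next));
}

int square(int x) { return x * x; }

/* A binary function we would like to apply partially, as in add(10). */
int add(int x, int y) { return x + y; }

/* C has no partial application, so the fixed first argument has to be
   threaded through as an explicit environment (here a single int). */
intlist map_env(int (*f)(int, int), int env, intlist l) {
    if (empty(l)) return nil();
    return cons(f(env, l->elem), map_env(f, env, l->next));
}

/* Hand-written first-order instance specialised to add: the functional
   parameter has disappeared and add is called directly. This is an
   assumed illustration of instantiation, not the paper's output. */
intlist map_add(int x, intlist l) {
    if (empty(l)) return nil();
    return cons(add(x, l->elem), map_add(x, l->next));
}

int sum(intlist l) {
    int s = 0;
    for (; !empty(l); l = l->next) s += l->elem;
    return s;
}

/* Builds [1, 2, 3], adds 10 to each element in both styles, checks that
   the results agree, and returns the sum of the mapped list. */
int demo(void) {
    intlist xs = cons(1, cons(2, cons(3, nil())));
    intlist via_env = map_env(add, 10, xs);
    intlist via_instance = map_add(10, xs);
    int total = sum(via_instance);
    assert(sum(via_env) == total);
    free_list(xs);
    free_list(via_env);
    free_list(via_instance);
    return total;
}
```

Calling demo() from a main function returns 36, the sum of 11, 12 and 13. The specialised map_add contains no indirect call, which suggests why eliminating functional arguments at compile time can help efficiency; whether Skil's generated code looks like this is an assumption of the sketch.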