Design of efficient parallel algorithms for the TSP on multicore processors with shared memory

Steinar Sørli Hegna

Master’s Thesis Informatics: Programming and Networks 60 credits

Department of Informatics Faculty of Mathematics and Natural Sciences

UNIVERSITY OF OSLO

February 2019


Design of efficient parallel algorithms for the travelling salesman problem on multicore processors with shared memory


© Steinar Sørli Hegna

2019

Design of efficient parallel algorithms for the travelling salesman problem on multicore processors with shared memory

Steinar Sørli Hegna http://www.duo.uio.no/

Print: Reprosentralen, University of Oslo


Abstract

As CPU (central processing unit) clock rates have stagnated due to physical limitations, and core counts are ever-increasing, we must utilize the available processing power by implementing parallelism in our algorithms and programs.

The NP-hard symmetric travelling salesman problem, with a complexity of (N − 1)!/2, isn't solvable within a reasonable amount of time. Therefore, we first design sequential algorithms for this computationally hard problem and then further improve their efficiency by utilizing Java's thread functionality.

We will demonstrate how we can start with initial approximations found by the nearest neighbour algorithm and improve them with improvement-only based algorithms to solve symmetric travelling salesman problems of size N < 300 optimally.

Furthermore, we will parallelize the algorithms with the highest runtimes, and demonstrate how we can reduce the runtimes by a factor of the number of available cores.


Preface

This thesis was completed between January 2018 and February 2019 with the Department of Informatics at the University of Oslo as the final part of my Master’s Degree in Informatics: Programming and Networks.

I want to first thank my supervisor Arne Maus for giving me this problem. I greatly appreciate the many interesting conversations and discussions while working on these fascinating and difficult topics. The total experience has truly been beyond any expectations, and I am extremely grateful to both Arne Maus and the Department of Informatics for the opportunity.

Finally, I want to thank my family and friends for their never-ending support in all of my endeavours.


Table of contents

Abstract
Preface
Table of contents
Introduction
Background
History
Multicore CPU performance with shared memory
Challenges of Parallel programming
Parallel programming in Java
Motivation
The Travelling Salesman Problem
TSP focus
Real world applications
Symmetric travelling salesman and cost
TSP time and complexity
TSPLIB and datasets
Data structure for tours
Existing TSP solutions
Knowing the optimal solution
TSP initial approximations
Nearest Neighbour algorithm
Group, solve and merge algorithm
Conclusions
Improvement algorithms
Swapping edges algorithm
Path move algorithm
Split and merge algorithm
Conclusions
Parallel implementations and improving efficiency
Trivial, top-level parallelization
Parallel group, solve and merge
Parallel split and merge (SplitMergePara)
Conclusions
Results
Selecting an initial approximation
Selecting improvement algorithms
Improvement stack
TSP cost results
TSP efficiency results
Conclusions
Further work
Bibliography
Appendix A
Appendix B
Appendix C
Appendix D

Introduction

Parallel programming is a very useful and necessary tool in the search for ways to solve complex problems quickly, run large computations at higher speeds and provide efficient multitasking. Especially now that today's CPUs are limited in clock rate but ever-increasing in core count and available memory, we must examine for which use cases parallel algorithms can empirically speed up execution and outperform sequential algorithms.

Using Java and its threads for running parallel execution of programs we will implement and design several sequential algorithms for solving the NP-hard symmetric travelling salesman problem. We will then utilize parallelization to create alternative solutions and discuss the changes that must be made to preserve determinism, prevent deadlocks/livelocks and how to maximize efficiency. The resulting sequential and parallel algorithms will be compared in their performance, and they will be combined to produce the final TSP results.


Background

History

The history of parallel programming is extensive, and the most commonly used programming languages today support some form of parallel programming [2], either explicitly or through extended libraries. It is no longer an advanced topic to be used only by specialists or left to be handled solely by the operating system or compiler itself, but a necessary tool for all programmers to use in most non-trivial applications to increase performance.

Desktop computers, and more specifically CPUs, have evolved a lot over the last 40+ years, closely following Moore's law, which observed that the number of transistors on a computer chip doubles about every two years. This has resulted in a major increase in computing power available at the fingertips of consumers, either inside their mobile phones or at home on their personal computers. Although Moore's law has been quite accurate, there are other factors and limitations which have proven difficult to overcome when increasing clock rates, and which have shifted the focus of hardware manufacturers from only cramming more transistors onto chips to also adding more cores to one CPU.

Figure 2-1 Advancements in technology: transistor counts following Moore's law closely, while single-thread performance, frequency and typical power level off at around 2005 and the number of logical cores increases after 2005 [3]


The first multicore processor was the IBM POWER4 in 2001, but it was not until 2005 and later that Intel and AMD released their multicore processors for the average consumer [4]. This has resulted in what is now a market of mobile phones and computers which all include a multicore CPU. Therefore, to fully utilize the multicore resources available to consumers, programmers must increase their knowledge and use of parallel programming to design safe and efficient algorithms and programs.

Multicore CPU performance with shared memory

There are many ways computing performance can be and has been improved, by the manufacturers of processor technology like Intel and AMD, by operating system developers like Microsoft, and by consumers themselves. Here we will go through some of these existing solutions and technologies which affect performance on a multicore CPU with shared memory.

Clock rate

Clock rate is one of the most important CPU performance indicators when comparing CPUs based on the same technology model (also known as the same "family"), but it has incorrectly been used by salesmen and people with less technological knowledge as a definitive performance measurement (also known as the "megahertz myth"). Since a CPU can work on many different types of workload, and with many different technologies that affect performance, the only way to know what performance to expect from a CPU is to find empirical tests (also known as benchmarks) on those specific workloads. As an example, an 8-core AMD 1700X 3.8GHz CPU will outperform a 4-core Intel 7700K 4.5GHz in some programs, such as tools for video editing, which utilize the full capability of double the number of available cores. However, the 7700K wins in some lower-resolution gaming applications, as they are simply not optimized for the higher number of cores. [5]

The manufacturing process of CPUs is quite complicated, and each batch can have widely different performance, so companies will take the lower performing CPUs and sell them as lower end processors with lower clock rates, and reserve the higher performing ones for the higher end processors. This is called binning and is commonly used in both CPU and GPU manufacturing.


CPU boosting and overclocking

Intel and AMD have both implemented their own versions of on-the-fly CPU overclocking. Intel Turbo Boost and AMD Turbo Core are technologies that work in a similar way: when the computer needs more performance while running applications, the CPU will raise its clock rate and adjust voltages for stability as long as the system has not reached its thermal and power limits. This ensures a longer lifespan, lower average stress and lower average power consumption for the CPU, while still being ready to deliver higher performance when needed.

Enthusiast users have for a long time hacked and modified their hardware to increase performance in different ways. The most common way for consumers to increase the performance of a CPU is overclocking. Usually this is done by accessing the BIOS of the computer, adjusting the CPU voltage and increasing the CPU multiplier and/or the bus clock (clock rate = CPU multiplier × bus clock). This used to be a technique that voided the warranty, but it has lately been embraced by CPU manufacturers and is even supported with overclocking software from motherboard manufacturers, which has made it easier than ever.

Figure 2-2 Own 3DMark performance results from overclocking an Intel 7700K: ~10% increase in performance from 4.5GHz to 5GHz. Physics is a pure CPU stress test and Combined is both CPU and GPU intensive (GPU: Nvidia GTX 1080ti).

Figure 2-3 Stable overclocking results for the Intel 7700K from users at overclock.net [1]. Standard is 4.5GHz @ 1.24 Vcore.

Limitations for increasing clock rate are mostly power and temperatures. Power delivery can be affected by the stability of your PSU (power supply unit) and the quality of the motherboard. Temperatures are usually the biggest concern for performance, as a modern CPU will throttle (underclock) when it reaches a given temperature to protect itself. Thankfully this also makes it almost impossible for the average consumer to destroy a CPU through overclocking

themselves. This has also given rise to a huge CPU cooler market which includes large air-cooled heatsinks and water cooling with radiators.

Over the last few years an even more extreme solution to CPU cooling has become popular due to high temperatures on high-end CPUs. There have also been claims of Intel using subpar TIM (thermal interface material, also known as thermal paste) instead of soldering the IHS (integrated heat spreader) directly to the die [6]. Delidding refers to the technique of removing the IHS and either leaving it off and connecting your cooling solution directly to the die, or simply changing the TIM between the die and the IHS to a better thermal paste or liquid metal.

Memory, cache and registers

Accessing data and instructions from the computer's memory can take a long time and slow down algorithms and programs. Therefore, each CPU core comes with both incredibly fast registers and multiple levels of cache which can store instructions and data, reducing access times by many orders of magnitude compared to accessing the system memory or hard drive directly. Since cache is costly to produce it can be quite limited, but there are several clever methods manufacturers have used to exploit the available cache. One of these is to have multiple levels of cache, where the smaller caches have lower latency but increased miss rates, and inversely the larger caches have higher latency but increased hit rates. An Intel 7700K has L1 data, L1 instruction and L2 caches per core, as well as an 8MB L3 cache shared between all cores. Some architectures have implemented an L4 cache, but it is not yet commonplace.

Figure 2-4 Intel I7-6700 Skylake latency in instruction cycles (log scale) for registers, L1/L2/L3 cache, fetches from main memory and fetches from a new disk location [7]


Cache blocks or cache lines refer to the size of the data blocks which are copied from the system memory to the available cache. In many modern processors the size of this block is a fixed 64 bytes, which means it must be considered for efficiency, as cache misses increase runtimes substantially.

The processor's registers are the fastest way to access instructions and data in a processor. Through compilation and optimization, frequently used variables will be stored in the registers, which can dramatically improve efficiency. One of the CPU's speedup techniques is called cache prefetching, and it is used to move data or instructions stored in slower memory to faster cache memory by predicting what might be used soon, instead of always fetching directly from main memory.

Hyper-threading and Simultaneous multithreading

Intel's hyper-threading (HTT) and AMD's simultaneous multithreading (SMT) are technologies that virtually increase the number of cores available to multithreaded applications in modern operating systems. Hyper-threading specifically creates two logical cores per physical core. Some of the resources available to the logical cores are shared and some are duplicated. This can create extra overhead which can negatively impact performance in some cases, especially for single-threaded processes, but it has also been shown to improve performance in some multithreaded workloads. Since most operating systems now support this technology and have improved their implementations, it has worked better over time, especially when applications are also programmed to take advantage of it.

Challenges of Parallel programming

The resources and technology available to programmers today and in the future show that we need to use parallel programming to further increase efficiency. This does not come without a cost, as parallel programs can often be more difficult to code, while also increasing the number of issues that can arise, especially when operating on shared variables.

Reading and writing local data in a single sequential program is easier, as there is only one thread running and there is no possibility of mixing up the order of execution due to different thread timings. If there is no randomness added, a sequential program should always return the same result if it is given the same parameters. A parallel program on the other hand

cannot guarantee the same result for the same parameters unless it has been coded to be thread safe, and it cannot guarantee that it will complete. This means we must take measures to prevent race conditions and deadlocks, and to ensure efficiency. We can refer to these new conditions as safety (nothing bad should ever happen) and liveness (something good should eventually happen). [8]

Race conditions and thread safety

Race conditions are a common problem in parallel programs, as they can arise whenever two or more threads try to do something to a shared variable at the same time. If two threads (thread A and thread B) want to read, add 1 to and update a variable X at the same time, different executions of the program can and eventually will lead to different results. As the execution order can be quite random, it is not always easy to spot these mistakes while testing, since they might manifest rarely. Therefore, we need to follow smart rules while programming threaded programs to ensure that these mistakes don't happen. The example of the two threads A and B operating on variable X can execute in many ways, and if both read the variable before the other has added 1 and updated it, there will be an error in the expected result, as only 1 will have been added when 2 was expected.

To solve this issue we can add a lock to the shared variable X to make sure the program executes correctly, making the code thread safe and fulfilling its specification. In the case of these two threads, X should increase by 2 when both have added 1 to it, and only when it does so can the program be said to preserve correctness. [8] However, adding locks and synchronization in general can both affect performance and add more possible cases for deadlocks.
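As a minimal illustration of the scenario above (a sketch of my own, not code from the thesis): two threads each increment a shared variable X one million times. With the unsynchronized method some increments can be lost, while the synchronized version always ends at 2 000 000.

class Counter {
    private int x = 0;

    // Unsafe: the read-modify-write of x++ is not atomic, so increments can be lost.
    void unsafeIncrement() { x++; }

    // Thread safe: the intrinsic lock makes each increment atomic,
    // so two concurrent increments always raise x by 2.
    synchronized void safeIncrement() { x++; }

    synchronized int get() { return x; }
}

class RaceDemo {
    public static void main(String[] args) throws InterruptedException {
        Counter c = new Counter();
        Runnable work = () -> {
            for (int i = 0; i < 1_000_000; i++) c.safeIncrement(); // swap in unsafeIncrement() to observe lost updates
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println("x = " + c.get()); // always 2000000 with safeIncrement
    }
}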

Deadlocks and livelocks

Deadlocks are the condition where threads reach a state in which they cannot progress. They are common in both parallel and sequential programs, but as with most programming, adding parallel threads increases the complexity of detecting such conditions. This could happen in situations where two threads are operating on a non-thread-safe variable and the threads depend on this resource's value being correct to continue, introducing the possibility of a livelock where the threads are running but not progressing.


When adding locks to create thread safe variables we introduce new possible deadlock situations, where one thread might wait forever on another thread which is holding (locking) a shared resource without ever releasing it. Having several locks can make this quite complicated, and programmers must consider and solve these problems to ensure liveness. However, there is a simple solution for this specific problem: if the threads agree on a specific ordering of the locks (e.g. based on their names), and only ever acquire the locks in that order, then the potential deadlock is avoided.
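A minimal sketch of this lock-ordering rule (my own illustration, with hypothetical lock names): every code path acquires lockA before lockB, so a circular wait between the two locks cannot occur.

import java.util.concurrent.locks.ReentrantLock;

class LockOrderingDemo {
    // Agreed global order: lockA is always acquired before lockB.
    private static final ReentrantLock lockA = new ReentrantLock();
    private static final ReentrantLock lockB = new ReentrantLock();

    static void updateBothResources() {
        lockA.lock();           // first lock in the agreed order
        try {
            lockB.lock();       // second lock in the agreed order
            try {
                // ... work on both shared resources ...
            } finally {
                lockB.unlock();
            }
        } finally {
            lockA.unlock();
        }
    }
}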

Efficiency

In many cases efficiency is the most important consideration when deciding whether to write a program or algorithm sequentially or in parallel. If the program is simple and doesn't benefit from the use of threading, there isn't much point in implementing it, unless the programmer wants to ensure scaling or believes efficiency will improve in the future with technological advances. Since creating threads adds overhead and increases complexity, this isn't always a simple decision. Still, in almost all non-trivial cases parallelism can be beneficial in many different ways, increasing speed and efficiency. Asynchronous applications like graphical user interfaces (GUIs) are often run in separate threads to ensure that the end-user experience is not affected while long algorithms or computations are running. All current operating systems and most applications use threads to handle their processes to ensure a smooth response while multitasking. In short, we want to use parallel solutions to divide tasks across threads to improve the speed and responsiveness of programs.

Amdahl’s and Gustafson’s law

Related to efficiency is the computation of speedup, the ratio between the execution time of a sequential program and that of a parallel solution. This is a useful tool to measure the efficiency of a parallel solution, and researchers have found several theoretical equations trying to calculate the potential speedup given different assumptions.

Amdahl's law says that the theoretical speedup S running with k cores (or the system speedup of parallelized code), with a parallelized fraction P of the code, is:

S(k) = 1 / ((1 − P) + P/k)


This means that with infinitely many cores the maximum achievable speedup is:

S = 1 / (1 − P)

which is quite pessimistic and assumes the problem is of fixed length. [9]

Gustafson's law says that the theoretical speedup S running with k cores, with a sequential fraction A of the code, is:

S(k) = k − A(k − 1)

Compared to Amdahl’s law there is no longer a problem of fixed length and instead it assumes that with more computing power more complex computations can be made to use the available resources.[10]

Figure 2-5 Gustafson's law plotted: theoretical speedup S for k = 0-120 cores and sequential fraction A = 0.1-0.9.
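As a small worked example (my own sketch, not part of the thesis code), both laws can be computed directly. With a parallel fraction P = 0.9 and k = 8 cores, Amdahl's law gives a speedup of roughly 4.7, while Gustafson's law with a sequential fraction A = 0.1 gives 7.3.

class SpeedupLaws {
    // Amdahl: fixed problem size, p is the parallelized fraction of the code.
    static double amdahl(double p, int k) {
        return 1.0 / ((1.0 - p) + p / k);
    }

    // Gustafson: scaled problem size, a is the sequential fraction of the code.
    static double gustafson(double a, int k) {
        return k - a * (k - 1);
    }

    public static void main(String[] args) {
        System.out.printf("Amdahl    (P = 0.9, k = 8): %.2f%n", amdahl(0.9, 8));    // ~4.71
        System.out.printf("Gustafson (A = 0.1, k = 8): %.2f%n", gustafson(0.1, 8)); // 7.30
    }
}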


The PRAM model

The Parallel Random Access Machine (PRAM) model is commonly taught in universities for modelling the performance of parallel algorithms. The problem is that it makes three wrong assumptions: synchronous execution, infinitely available cores and uniform access time to data.

Synchronous execution is wrong to assume, as parallel algorithms are usually run asynchronously and must handle synchronization to provide correctness, which can slow down an algorithm substantially. The nature of hardware as a physical and limited resource makes it impossible to assume that we have infinitely many cores available. Uniform access time is wrong, as discussed earlier and shown in Figure 2-4, and utilizing these real-world resources efficiently provides opportunities to improve runtimes by many orders of magnitude.

Because of these assumptions we can get incorrect results when evaluating algorithms.[11] To improve on this we will instead test performance empirically on real world systems to provide results which are relevant to modern computer systems.

Parallel programming in Java

Now that we have gone through the general topics and history of parallel programming and the technologies behind it, we will briefly go over what the Java programming language offers, as this is the programming language used for this thesis.

Threads

All Java programs begin with the main() thread, which starts the execution of the program. You can then create and start new threads from classes that extend Thread or implement Runnable. The creation of threads adds overhead and their use must be considered for efficiency; for small datasets it will make a parallel algorithm run slower than a sequential one.
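A minimal sketch of both ways of creating threads (my own illustration, with hypothetical class names), including join() to wait for the threads to finish:

// Option 1: extend Thread and override run()
class Worker extends Thread {
    public void run() {
        System.out.println("work done in " + getName());
    }
}

class ThreadDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Worker();

        // Option 2: implement Runnable (here as a lambda) and wrap it in a Thread
        Thread t2 = new Thread(() -> System.out.println("runnable work done"));

        t1.start();
        t2.start();
        t1.join();   // wait for both threads to finish before continuing
        t2.join();
    }
}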

Locks and synchronization

There are several ways to achieve synchronization, and there are performance gains in using some over others. With Java's synchronized there is a lot of overhead, but the benefit is

readability, and when used infrequently it doesn't extensively impact runtimes. As an example, a solution using ReentrantLock and AtomicInteger can be 5x faster than synchronized. [12]
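A small sketch of the two lower-overhead alternatives mentioned above (my own illustration, not code from the thesis): an AtomicInteger for a simple counter, and a ReentrantLock guarding a larger critical section.

import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

class CheaperSynchronization {
    // Lock-free atomic counter
    private final AtomicInteger hits = new AtomicInteger();
    void countHit() { hits.incrementAndGet(); }

    // Explicit lock guarding a larger critical section
    private final ReentrantLock lock = new ReentrantLock();
    private long total = 0;
    void addToTotal(long value) {
        lock.lock();
        try {
            total += value;
        } finally {
            lock.unlock();
        }
    }
}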

Synchronizing and using too many locks can impact efficiency negatively, causing threads to wait for a long time (or forever, as mentioned before; see Chapter 2.3.2) while doing nothing or simply not progressing. This is an extremely undesirable result and must be controlled if efficiency is a goal.

Just In Time compilation and optimization

Just-in-time (JIT) compilation describes the compilation of code during run time instead of only before execution of a program. This has many possible performance gains, as optimizations can be done on the fly, gradually improving performance. When writing parallel algorithms in Java the JIT compiler is extremely good at eventually optimizing the code. This means that after about 2-3 runs the code will be close to peak optimization and fully compiled rather than interpreted, so the runtimes will be much lower.

Garbage collection

Java's memory management includes automatic garbage collection to handle memory used by unreachable objects. Since the programmer is not in direct control of memory allocation and deallocation, it can lead to some unexpected results. At times this can substantially impact performance, but it also makes it easier to manage the available memory. This usually results in fewer memory errors, which other programming languages like C and C++ are more prone to.


Motivation

As we have seen through history and with current-day technologies, parallel programming is an essential tool for improving program efficiency. It is used in almost all types of software and is a critical part of computer science to understand and take advantage of. Although there are several pitfalls to be aware of, the potential benefits are simply too great to be ignored.

This thesis will use the Java programming language running on multicore machines with shared memory to explore some sequential algorithms and create multiple alternative parallel solutions to empirically test their efficiency. The parallel algorithms will be evaluated to try and find new ways to improve efficiency on these types of systems.

The PRAM model and other similar models have too many weaknesses to ignore when evaluating algorithms. Instead of using these we will be running empirical tests on real hardware, and use full knowledge of the hardware and software to implement parallel solutions rather than only exploring carbon copies of existing sequential algorithms.

Testing will be done on an Intel Core i7-7700K (4 cores, 8 logical) overclocked to 5GHz with 8MB L3 cache. Considering the JIT compilation and the variable nature of Java's garbage collection, the tests will be run many times and the results will be derived from the median of those runs.


The Travelling Salesman Problem

The travelling salesman problem describes the problem of finding a solution for a hypothetical salesman wanting to travel through (or 'visit') N locations in the shortest length (or 'distance'/'weight'/'cost') while only visiting each location once and ending in the same starting position. This can naturally be translated to the problem of finding a path of edges through a graph while only visiting each node once and ending in the original node, creating a closed loop. The TSP is a classical mathematical problem, and thanks to its complexity ((N − 1)!/2 for symmetric TSP), NP-hardness and algorithmic possibilities it is interesting for computer scientists as well.

We will call the path from one node to another an edge from now on, and a complete TSP path will be called a tour. The complete tour will have a total number of edges equal to the number of nodes (N > 1).

TSP focus

This problem was selected as the focus of this thesis because of the many interesting mathematical and algorithmic challenges it poses. The goals of the thesis are to create good algorithms to solve the travelling salesman problem, where some might be suited for parallelism, and to improve all or some of the algorithms with parallelism. Because the travelling salesman problem has been very well researched over many decades, the existing solutions are already quite phenomenal. Instead of implementing parallel versions of existing solutions we will attempt to create our own algorithms, which potentially gives us more interesting challenges and problems to explore.

There are many ways to tackle the symmetric travelling salesman problem, and my focus will be on using parallelism where possible to speed up execution and on solving the TSP differently than in a purely sequential way. The improvement of tours will be made by improvement-only based algorithms, which means 'dead ends' (a state where the algorithms cannot find an improvement for the tour) could be reached even if the optimal solution hasn't been found.

We will start off with some 'common sense' approximations of finding tours and then try to improve them with sound mathematical principles and other techniques. The two phases of the TSP implementation are roughly defined as:

Phase 1: The first approximations will be some simple best-effort algorithms to later improve upon. These approximations have no constraints or inherent value in how low their cost initially is, as there are many higher cost initial tours which could produce better end results when improved later.

Phase 2: The improvement phase is defined as only using algorithms that lower the total cost of the tour (improvement-only based algorithms). This means there are some limitations to how improvements can be made for different initial tours, as changes that would increase the total cost of the tour are discarded. Equal-cost changes are also discarded to prevent loops.

Real world applications

The travelling salesman problem isn't interesting because it would actually solve the tour for a real travelling salesman travelling through N locations, visiting them at least once, and returning to the same starting place with the lowest total cost. But there are some real-world applications that still make this topic relevant to others than mathematicians and computer scientists.

Logistics and package delivery are a natural place for TSP-like problems, e.g. the postman delivering packages to customers and returning to the distribution center.

Another example is the drilling problem included in the 'A280.tsp' file, where a robotic drilling head must drill holes in computer boards and wants to move the shortest distance while drilling all holes and returning to the starting node to get ready to drill the next board.

Another example of a TSP-like problem is the operation of a new bus/taxi hybrid service in Sauda, Norway called "the pick me up bus" (Hent-Meg-bussen), by tech company Spare Labs in cooperation with transportation company Kolumbus. Because so few people were using the existing, expensive public transportation buses, they are running a pilot project with minibuses that can be called by customers with an app to pick people up at their location and drive them to their destination within the area. Since the bus can handle multiple customers at a time, they say it uses "advanced" algorithms to calculate the route in real time based on several different factors or "costs", such as customer wait time, customer travel time and total travel route [13, 14].


Symmetric travelling salesman and cost

The most important factor to consider when finding tours for the travelling salesman is the cost of travelling from one node to another, and the tour which is considered the best is the one with the lowest total cost.

The cost can be made up of several different factors, including ticket prices, distance, time spent travelling and requirements for travel like visas. The total cost can also be directional for asymmetric travelling salesman problems. As an example, a plane ticket from New York to Oslo can vary in price from a plane ticket from Oslo to New York.

For this thesis we will only be looking at the symmetric travelling salesman problem, which means the cost from node A to node B is the same as from node B to A, i.e. symmetric, non-directional edge costs. This means an edge can be traversed in both directions with the same total cost, and the two directions are considered equal paths.

Thanks to the triangle inequality we know that travelling A → C cannot be done at a lower cost by visiting any intermediate city B:

A → C ≤ A → B + B → C

This means the direct path A → C will always be shorter than or equal to A → B → C. For them to be equal, the node B must lie on the line segment A → C.

Figure 4-1 Triangle inequality.

A path will be defined as any sequence of nodes connected by edges which is non-complete (nodes in path < N). It then follows that every symmetric TSP path that is not a closed loop is only equal to itself. As with tours, there could be multiple paths with the same total cost, but they are not considered equal unless the order and number of the nodes are exactly the same. A closed-loop TSP path follows the same rules as a TSP tour.


TSP time and complexity

Assuming we have N nodes in a dataset, there are N! possible ways to arrange the nodes. A TSP tour is considered equivalent to another if the nodes are arranged in the exact same order; a different starting node doesn't matter, since the arrangement can just be shifted to whatever starting node we desire and still have the same total cost and route. Therefore, the complexity for an asymmetric travelling salesman problem is (N − 1)!.

Because we are only considering symmetric TSP, the direction doesn't matter and the complexity is halved to (N − 1)!/2. As an example, with a dataset of 4 nodes there are 4! = 24 possible arrangements, (4 − 1)! = 6 possible asymmetric TSP tours and (4 − 1)!/2 = 3 symmetric TSP tours.

Figure 4-2 Graphs of the 3 possible symmetric TSP tours when N = 4

From the three graphs in Figure 4-2 we can derive all possible arrangements, all asymmetric TSP tours and all symmetric TSP tours:

Tour1: A ↔ B ↔ C ↔ D
Tour2: A ↔ B ↔ D ↔ C
Tour3: A ↔ C ↔ B ↔ D

            Tour1     Tour2     Tour3
forwards    A-B-C-D   A-B-D-C   A-C-B-D
            B-C-D-A   B-D-C-A   C-B-D-A
            C-D-A-B   D-C-A-B   B-D-A-C
            D-A-B-C   C-A-B-D   D-A-C-B
backwards   D-C-B-A   C-D-B-A   D-B-C-A
            C-B-A-D   D-B-A-C   B-C-A-D
            B-A-D-C   B-A-C-D   C-A-D-B
            A-D-C-B   A-C-D-B   A-D-B-C

Table 4-1 All possible arrangements of a graph with 4 nodes. Each column corresponds to one possible symmetric TSP tour, which is 3 in total. Considering direction for asymmetric TSP gives 6 possible tours.


One might believe that because of technological advancements in computing power we could solve many mathematical problems, including the TSP, through brute-force techniques. But since the complexity is an incredible (N − 1)!/2 we run into runtime problems quickly. At N = 10 the number of possible TSP tours to consider is 181 440, which is easily handled in milliseconds by any modern CPU. However, at N = 20 the number of possible TSP tours is 6.0822502 × 10^16, and for the smallest dataset chosen in this thesis, N = 48, the number is 6.206957795 × 10^60, or:

6 206 957 795 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000.

At N = 60, with a complexity of 4.160493556 × 10^81, we have reached a number almost equal to the estimated number of atoms in the universe, ~10^80. [15]


Figure 4-3 Number of possible symmetric travelling salesman tours for N = 3-20

There are several optimizations possible for brute-force solving of the TSP, but they can only improve it so much; one could expect to solve up to N = 20 without spending more than 24 hours. Solving even the smallest 48-node problem would require an unfathomable amount of time and would probably never complete. Therefore, to solve any non-trivial (N > 20) travelling salesman problem we need to use approximations and other algorithms with lower complexity.
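The tour counts above can be reproduced with exact integer arithmetic; below is a minimal sketch of my own (not part of the thesis code) using java.math.BigInteger, with illustrative class and method names.

import java.math.BigInteger;

class TourCount {
    // Number of distinct symmetric TSP tours: (N - 1)! / 2
    static BigInteger symmetricTours(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n - 1; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result.divide(BigInteger.valueOf(2));
    }

    public static void main(String[] args) {
        System.out.println(symmetricTours(10)); // 181440
        System.out.println(symmetricTours(20)); // ~6.08 * 10^16
    }
}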


TSPLIB and datasets

The main resource we will be using for this problem is the TSPLIB (a library of travelling salesman problems) [16]. It gathers many real-world examples of TSP graph data to use for testing algorithms, as well as providing optimal solutions which algorithms can be compared against. More specifically we will focus on the 'att48.tsp' (48 capitals in the US), 'att532.tsp' (532 cities in the US), 'bier127.tsp' (127 beer gardens in Augsburg), 'a280.tsp' (drilling problem) and 'brd14051.tsp' (14 051 cities in western Germany, 1989) datasets to test this thesis' algorithms.

Datasets        att48    bier127   a280     att532   brd14051
Dimension N     48       127       280      532      14 051
Optimal tour    10 628   118 282   2 579    27 686   469 385
Cost type       ATT      EUC_2D    EUC_2D   ATT      EUC_2D

Table 4-2 TSP datasets showing how many nodes N there are, the optimal tour cost and the cost type.

The optimal solutions are provided for all TSPLIB data, as they have all been solved throughout the years. An optimal tour is also provided for some of the problems, including 'att48.opt.tour' and 'a280.opt.tour', which are relevant in our case.

The choice of datasets was made to get a range of different types of data, to test the algorithms' performance in multiple situations rather than on one specific type of data. This increases algorithm and testing complexity, as there are situations where one configuration of improvements would be better than another based on the data it's running on. It has also been important to work on small datasets to try to find the optimal solutions, and to look at larger datasets which increase runtimes substantially, so that parallelism eventually becomes beneficial.


Figure 4-5 Att48 dataset
Figure 4-4 Att532 dataset

The att48 data is historically important as it is one of the first 'large' problems solved by TSP algorithms. It contains the 48 state capitals in the United States, excluding Alaska and Hawaii, creating the well-known outline of the mainland United States. It's a very simple geographical set which visually shows changes and improvements clearly, making it very useful in the development of algorithms. There is naturally a higher density group of nodes on the north-eastern coast of the US and fewer nodes with longer edges around the rest, which gives only one optimal tour.

Att532 is similar to att48 but with 532 cities in the mainland 48 states of the United States instead of just the capitals. It is simply a more detailed version of the same type of structure, with more cities on the north-east coast. As with att48 there is only one optimal tour.


Figure 4-6 Bier127 dataset
Figure 4-7 A280 dataset

Bier127 is another geography-based dataset, of beer gardens in Augsburg, Germany, which has a high-density middle and very distant nodes around the edges, creating a very special shape of tours and providing a couple of different challenges which will be explained later. There is only one optimal tour, which creates a unique shape resembling a butterfly.

A280 is a drilling problem for drilling holes in computer boards and is a quite realistic real-world situation where a TSP approximation could be useful. The drilling robot would both save time and increase its longevity by travelling the shortest distance per board. Most of the nodes are arranged in a grid with many equal-distance edges, which creates a lot of different possible optimal tours. But as with any non-trivial travelling salesman problem, the number of non-optimal solutions is unimaginably higher.

Brd14051 is the largest dataset chosen and includes 14 051 nodes in western Germany from 1989. The nodes are very dense but with a clearly defined shape, creating interesting challenges in keeping that shape to prevent high cost edges outside of it. There are also a few outliers, including a group of nodes to the east in what was then Western-controlled Berlin, as well as a few islands in the north. There is only one optimal tour.

Figure 4-8 Brd14051 dataset


Cost and coordinate types

The TSPLIB features many different weight and coordinate types. All the aforementioned datasets use the same coordinate type, called NODE_COORD_SECTION, and represent each node with a unique integer ID and integer X and Y coordinates. There are other datasets of the same type which represent each coordinate as a real number.

The unique IDs are a simple counter starting from 1 and naturally going up to N. When the Java program parses these files, the coordinates are stored in two arrays, one per coordinate, with the array position corresponding to the node ID − 1:

x[ID - 1] = x_coord
y[ID - 1] = y_coord

class Node {
    int id;
    Node prev, next;
    ...
}

Javacode 4-1 Node class. The unique id is used to find this node's coordinates, which are stored in the arrays x[] and y[].

Cost type (or 'weight' type) defines the method of calculating the cost between two nodes.

Euclidean 2-dimensional (EUC_2D) is the simplest form and can be calculated using the standard Euclidean distance function, which is the same as the Pythagorean theorem and uses the differences between the x and y coordinates. The result is rounded to the nearest integer. It is used by the bier127, a280 and brd14051 datasets, as well as several others available on the TSPLIB website.

EUC2D(a, b) = round( sqrt( (x[a] − x[b])² + (y[a] − y[b])² ) )

double calcCostEuc2d(int from, int to) {
    return Math.round(Math.hypot(x[from] - x[to], y[from] - y[to]));
}

Javacode 4-2 Euclidean 2D distance function implemented as calcCostEuc2d.


ATT is a "pseudo-Euclidean" distance function and is very similar to EUC_2D, but it divides by 10 before taking the square root and rounds the result up to the smallest integer greater than or equal to it (the ceil function). It is used exclusively by the att48 and att532 datasets.

ATT(a, b) = ceil( sqrt( ((x[a] − x[b])² + (y[a] − y[b])²) / 10 ) )

double calcCostAtt(int from, int to) {
    int xd = x[from] - x[to];
    int yd = y[from] - y[to];
    return Math.ceil(Math.sqrt(((xd*xd) + (yd*yd))/10.0));
}

Javacode 4-3 ATT cost function implemented as the calcCostAtt method.

Data structure for tours

As described previously, the coordinates are stored in two integer arrays x and y, with a specific node's coordinates located at the index given by its unique integer ID (minus 1). For the tours themselves we use two implementations:

1. An integer array of length N, where one assumes the first and last nodes are 'connected', creating the complete closed-loop tour. Since arrays are passed by reference, one can call methods with an array and make changes to the original array without using global arrays.

2. A doubly linked list of class Node. Each node has a next and a prev variable pointing to the next and previous node in the list.

An array can be quicker to use for simple algorithms and is a lot faster to copy with Java's built-in mechanisms. The advantages of using a list are that it's more intuitive, that it's easier to make changes to edges without moving everything, and that it's naturally a closed loop. Both data structures are used and are easily converted from one to the other depending on the algorithm. There could also be some efficiency gains based on how we get the next and previous nodes, as either internal get methods or the algorithms themselves could handle direction.
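To make the two representations concrete, here is a small sketch of my own (not taken from the thesis code) showing how the total cost of an array-based tour can be computed and how such a tour can be converted into the circular, doubly linked Node representation. It reuses the x[], y[], Node and calcCostEuc2d definitions shown earlier; the method names are my own.

// Total cost of a tour stored as an int[] of node ids;
// the edge from the last node back to the first closes the loop.
double tourCost(int[] tour) {
    double cost = 0;
    for (int i = 0; i < tour.length; i++) {
        int from = tour[i];
        int to = tour[(i + 1) % tour.length];   // wraps around to close the loop
        cost += calcCostEuc2d(from, to);        // or calcCostAtt, depending on the dataset
    }
    return cost;
}

// Convert an array tour into the circular doubly linked list of Node objects.
Node toLinkedTour(int[] tour) {
    Node first = new Node();
    first.id = tour[0];
    Node prev = first;
    for (int i = 1; i < tour.length; i++) {
        Node n = new Node();
        n.id = tour[i];
        n.prev = prev;
        prev.next = n;
        prev = n;
    }
    prev.next = first;   // close the loop
    first.prev = prev;
    return first;
}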


Existing TSP solutions

As mentioned before, the travelling salesman problem is one of the most well-researched computational problems, so there are naturally many great solutions. The work done by William Cook et al. is freely available on the University of Waterloo website, including the full TSP solver program Concorde (available for both Windows and iOS) [16]. This solver can find the optimal solution for all of the TSPLIB datasets. It uses several of the most important algorithms created to solve the travelling salesman problem, including the Lin-Kernighan heuristic (a k-opt heuristic that swaps k edges), which was introduced in 1973 [17], branch and bound algorithms and cutting-plane methods.

In 2018 a team led by William Cook found the optimal tour through 49 687 pubs in the United Kingdom (uk49687) [18]. The total runtime is estimated to be about 250 years in 'computer time' when run sequentially. This was a substantial increase from their previous uk24727 problem in 2016, which 'only' required 10 months of 'computer time' [19]. They reduced the runtime by running their search in parallel on up to 48 cores. However, for the newest and largest problem we can imagine they used some sort of supercomputer or cluster, as they claim they worked on the project for about 1.5 years [18].

Knowing the optimal solution

Due to extensive research and algorithmic progress, all of the problems in the TSPLIB have been solved optimally [20]. This gives us several advantages when designing new algorithms and testing them on these datasets, as we already know the best solution and can evaluate cost performance directly.

None of the algorithms presented in this thesis consider the optimal solution at any stage. They run without knowing when or whether the optimal is found, and therefore work the same for any known or unknown dataset. The optimal solutions are only used to evaluate the final TSP cost results found by our algorithms. If we didn't know the optimal solutions, we would have to use other methods of evaluating cost performance; for example, we could compare the results to known lower bounds, or compare them directly to results found by alternative algorithms.


However, to ensure that a problem has been solved optimally, the verification process must be extensive. Some of the methods used by the Concorde program are: geometric lower bounds with control zones, subtour elimination, the cutting-plane method, local cuts, branch-and-cut, etc. [21]


TSP initial approximations

After parsing the datasets, our first phase for solving the travelling salesman problem is to use some common-sense algorithms and techniques to create an initial tour which can be improved upon later. The value of these initial tours isn't always self-evident: depending on the algorithms, a lower cost tour could be harder to improve, resulting in higher cost final tours. This will be explored throughout phase two and in depth in Chapter 8.

- We will first be looking at the well-known nearest neighbour algorithm and evaluate 3 variations of it (5.1).

- Then we will create and explore a new algorithm we will call group, solve and merge, where we divide the nodes into subgroups, solve them and finally merge them (5.2).


Nearest Neighbour algorithm

The nearest neighbour ('greedy') algorithm is a very simple and classic algorithm for quickly finding a tour through a graph. It begins with a starting node, which it marks as visited. It then checks the distance to all unvisited nodes and finds the closest one, which it sets as the new current node and marks as visited. This repeats until there are no unvisited nodes left, which means the tour is done; for some data structures the next pointer of the last node is set to the original node, while for others it is just assumed that the first and last node are connected, creating a complete closed loop.

1 set i as start_node
2 set i as visited
3 add i to tour
4 for all unvisited nodes
5     find unvisited node j with shortest distance to i
6     set j as visited
7     add j to tour
8     set i as j

Pseudocode 5-1 Nearest neighbour algorithm

Usually this algorithm will produce an approximation about 22-26% (Figure 5-1) worse than the optimal tour. This algorithm will be used later both for comparisons and tested as a starting point for other algorithms.

There are several variations and tricks possible to improve the nearest neighbour algorithm. One, discussed with my supervisor, is to set the second node to the node farthest away, to naturally let the algorithm work its way back to the original node. This doesn't necessarily make anything better in the short term, but it will be shown later to influence the end result.

Time complexity is simply Gauss' formula: O(n(n + 1)/2), i.e. O(n²).
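As an illustration, a Java version of the nearest neighbour pseudocode could look roughly like the sketch below (my own formulation, not the thesis implementation). calcCost stands for whichever cost function the dataset uses (calcCostEuc2d or calcCostAtt).

// Nearest neighbour tour starting in startNode; n is the number of nodes.
int[] nearestNeighbour(int startNode, int n) {
    boolean[] visited = new boolean[n];
    int[] tour = new int[n];
    int current = startNode;
    visited[current] = true;
    tour[0] = current;

    for (int step = 1; step < n; step++) {
        int best = -1;
        double bestCost = Double.MAX_VALUE;
        for (int j = 0; j < n; j++) {             // find the closest unvisited node
            if (!visited[j]) {
                double c = calcCost(current, j);
                if (c < bestCost) {
                    bestCost = c;
                    best = j;
                }
            }
        }
        visited[best] = true;
        tour[step] = best;
        current = best;
    }
    return tour;   // the first and last nodes are assumed to be connected
}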


Best nearest neighbour algorithm

Another way of deriving a starting point is to run the nearest neighbour algorithm multiple times (K) with different starting nodes to find a lower cost one. As the initial tour can affect the final solution this can be quite beneficial, since running improvements on a bad initial tour often produces worse results. This is not true in all cases though; due to the nature of which improvements happen and in which order, a higher cost initial tour can sometimes be improved to a better end result.

1  for all nodes as start_node
2      set i as start_node
3      set i as visited
4      add i to current_tour
5      for all unvisited nodes
6          find unvisited node j with shortest distance to i
7          set j as visited
8          add j to current_tour
9          set i as j
10     if current_tour cost < best_tour cost
11         set best_tour as current_tour
12     set all nodes as unvisited

Pseudocode 5-2 Best nearest neighbour algorithm

Running the best nearest neighbour algorithm on all nodes can be time-consuming and not valuable enough on large datasets like brd14051, but for many of the smaller sets it gives reasonable improvements. Using the best nearest neighbour algorithm will produce approximately 6-10% lower cost tours when compared to the average nearest neighbour tour (Figure 5-1).


All nearest neighbours algorithm

As with the best nearest neighbour algorithm, you could run the algorithm from all possible starting nodes, which is feasible for the smaller datasets (every dataset chosen except brd14051). The average nearest neighbour tour cost over all possible starting nodes is between 22-28% (Figure 5-1) worse than optimal. We could then run all improvements on every one of these nearest neighbour tours, which means we are explicitly increasing the runtime by a factor of N. With all improvements run on all possible nearest neighbour tours, we can simply store the best result and calculate both the average tour cost and how many tours reached the best result. As we have defined improvement as only making changes that lower the tour cost, this method is quite useful and effective.

This is simple to fully parallelize, as each run for a specific starting node can be handled by its own thread, while setting the maximum number of threads to the number of available cores (see Chapter 7.1).
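A possible sketch of that trivial parallelization (my own, anticipating Chapter 7.1), reusing the nearestNeighbour and tourCost sketches above and submitting one task per starting node to a fixed-size thread pool:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Runs the nearest neighbour algorithm from every possible starting node in parallel
// and keeps the cheapest tour found.
int[] allNearestNeighboursParallel(int n) throws InterruptedException, ExecutionException {
    int cores = Runtime.getRuntime().availableProcessors();
    ExecutorService pool = Executors.newFixedThreadPool(cores);

    List<Future<int[]>> futures = new ArrayList<>();
    for (int start = 0; start < n; start++) {
        final int s = start;
        futures.add(pool.submit(() -> nearestNeighbour(s, n)));  // each starting node is an independent task
    }

    int[] best = null;
    for (Future<int[]> f : futures) {            // collect the results and keep the cheapest tour
        int[] tour = f.get();
        if (best == null || tourCost(tour) < tourCost(best)) best = tour;
    }
    pool.shutdown();
    return best;
}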


Cost and runtime performance nearest neighbour

Nearest neighbour, albeit a simple algorithm, is an efficient way to generate decent cost tours at low runtimes. We have discussed and implemented 3 different versions and tested them all.


Figure 5-1 Cost performance of the nearest neighbour tour starting in node 0, the average of all nearest neighbour tours over all possible starting nodes, and the best nearest neighbour tour, versus the optimal tour cost.

The difference in cost between the best nearest neighbour tour and the nearest neighbour tour starting in node 0 is between 1-8%. There is always a chance (1/N) that the nearest neighbour tour starting in node 0 is the best nearest neighbour tour, but the only way to be sure is to test all other starting points, which means the runtime is directly proportional to N:

All NN runtime = NN Node(0) runtime × N

Datasets   Nodes N   Quotient AllNN/NN(0)   Runtime NN Node(0)   Runtime All NN
att48      48        ~11                    0.079ms              0.849ms
bier127    127       ~111                   0.223ms              24.652ms
a280       280       ~280                   0.799ms              223.632ms
att532     532       ~523                   1.185ms              619.956ms

Table 5-1 Runtimes of the nearest neighbour algorithm starting in node 0 and of the all nearest neighbours algorithm, comparing the quotient of the runtimes with the number of nodes N. Run on an I7 7700K, 4(8) cores @ 5GHz.


Through empirical testing (Table 5-1) we can see that the runtime, as expected, is proportional to the number of nodes in the dataset (N), with the only outlier being att48, which is too small and quick.

Because the combined runtime for 532 runs of the nearest neighbour algorithm on 532 nodes only requires 0.62 seconds, we will not continue with any parallel implementation of this. As mentioned before, though, if we wanted to further improve each tour separately and increase each run's runtime, we could make a trivial, high-level parallel version (see Chapter 7.1).


Group, solve and merge algorithm

As mentioned before, the travelling salesman problem is solvable on most home computers within a 'reasonable amount of time' when N is less than 20. Using this knowledge, we can find another starting point for a TSP tour by dividing the nodes into groups of size 20 or less. We can then merge these solved closed-loop paths back together to complete the full tour.

This is intended to be a parallel algorithm in the dividing of the nodes and the finding of optimal tours for each group, with the merging then done sequentially using an improvement-only based algorithm. Although the merging specifically is done sequentially, it can be done simultaneously with the dividing and solving. This will be explained further in Chapter 7.2.

Grouping nodes

The first problem to handle is which nodes should be grouped together into subgraphs. Either we can hope that the given data is somewhat sorted by location (or ‘pre-grouped’), or we can pre-process the data to hopefully improve the solution from the start. The implementations tested are:

1. Random: Using the order of the given data structure and simply dividing into subgraphs of size K or less.

2. Nearest Neighbour: Using the NN algorithm before dividing into subgraphs of size K or less.

3. Best Nearest Neighbour: Using the BNN algorithm before dividing into subgraphs of size K or less.

4. All Nearest Neighbour: Repeat 2. for all possible starting nodes.

5. Nearest Neighbour Grouping: Running the nearest neighbour algorithm repeatedly with max size K and setting each run as a subgraph until all nodes are grouped.


Solving subgraphs

After all nodes have been divided into groups we can begin the solving, which means that we find the optimal path for each subgraph. The algorithm used is a basic brute-force method that generates all possible paths and returns the best one. It has been further optimized by ignoring paths whose current cost is higher than or equal to the current best cost.

1  function solve
2      for all groups
3          add all nodes in group to linked list nodes_left
4          remove first node in nodes_left and set as i
5          add i to path
6          call traverse(path, nodes_left, i, 0, group_id)
7  function traverse(path, nodes_left, i, cost, id)
8      if nodes_left is not empty
9          for each node in nodes_left as j
10             set cost as cost + calcCost(i to j)
11             if cost < best_cost[id]
12                 add last j to path
13                 traverse(path, nodes_left, j, cost, id)
14                 remove last in path
15             else
16                 add last j to nodes_left
17         add i to nodes_left
18     else
19         get first node in nodes_left as j
20         set cost as cost + calcCost(i to j)
21         if cost < best_cost[id]
22             best_path[id] = path

Pseudocode 5-3 Solving subgraphs: recursively finding the optimal path for each subgroup. calcCost(i to j) calculates the cost from node i to j.
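A simplified Java sketch of the same idea (my own formulation, handling one group at a time and using an int[] of the group's node ids instead of the linked list above): recursively extend the path, close the loop once all nodes are used, and prune any branch whose partial cost already meets or exceeds the best complete loop found so far. calcCost again stands for the dataset's cost function.

// Brute-force search for the cheapest closed loop through one subgraph.
double bestCost;
int[] bestPath;

void solveGroup(int[] group) {
    bestCost = Double.MAX_VALUE;
    bestPath = null;
    boolean[] used = new boolean[group.length];
    int[] path = new int[group.length];
    path[0] = group[0];                 // fix the first node to avoid redundant rotations
    used[0] = true;
    traverse(group, used, path, 1, 0.0);
}

void traverse(int[] group, boolean[] used, int[] path, int depth, double cost) {
    if (cost >= bestCost) return;       // prune: this branch cannot beat the best loop found
    if (depth == group.length) {        // all nodes used: close the loop back to the start
        double total = cost + calcCost(path[depth - 1], path[0]);
        if (total < bestCost) {
            bestCost = total;
            bestPath = path.clone();
        }
        return;
    }
    for (int i = 1; i < group.length; i++) {
        if (!used[i]) {
            used[i] = true;
            path[depth] = group[i];
            traverse(group, used, path, depth + 1,
                     cost + calcCost(path[depth - 1], group[i]));
            used[i] = false;            // backtrack
        }
    }
}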


Merging subgraphs

The last and most important issue with this algorithm is how to merge the subgraphs after finding the optimal path for each of them. Merging the graphs together based on which nodes were connected before the grouping is certainly not optimal, and results in worse tours than even a standard nearest neighbour algorithm.

Another way to merge the optimal paths of two subgraphs A and B is to calculate the cost between each possible pair of adjacent nodes in A (two nodes next to each other) and each possible pair of adjacent nodes in B. This works because attaching two adjacent nodes in one closed loop to two adjacent nodes in another closed loop is trivial, and depending on the chosen direction one of the loops will change its direction.

This merging method implemented is described as:

1. Take two closed loops A and B.

2. Calculate the cost between each neighbouring pair of nodes in A and each neighbouring pair of nodes in B. Include both directions: A1 → B2, B1 → A2 and A1 → B1, B2 → A2.

a. Store the best pair of merge nodes, with the lowest cost, and its direction.

3. Merge the best nodes and change the direction of loop B if direction = false.

4. Repeat from 1. until only one closed loop is left.

Starting with the first subgraphs A and B, which are closed loops, the result will be the closed loop AB. The next merge will be done with the newly formed closed loop AB and the next subgroup C, which then creates the closed loop ABC. The first merging group is ever-increasing and adds each subsequent subgroup to its mass. Since this is done in sequence, it increases the number of possible lower cost merging pairs at each step.

Another similar alternative would be to merge the subgroups two and two, giving AB, CD, EF, GH, with the next step giving ABCD, EFGH and finally ABCDEFGH. But this increases the number of merges between small groups, which have fewer nodes and therefore fewer possible merging points.
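A sketch of the cost evaluation in step 2 above (my own formulation, with both closed loops stored as int arrays in tour order): for every adjacent pair in A and every adjacent pair in B, compute the cost of removing the two edges and reconnecting the loops across them, in both possible directions, and remember the cheapest option. calcCost stands for the dataset's cost function, and the field names are my own.

int bestI, bestJ;              // positions of the best edges to cut in loops a and b
boolean bestSameDirection;     // false means loop b must be reversed when merging

// Returns the lowest extra cost of merging closed loops a and b by cutting the edge
// (a[i], a[i+1]) and the edge (b[j], b[j+1]) and reconnecting across the two loops.
double bestMergeCost(int[] a, int[] b) {
    double best = Double.MAX_VALUE;
    for (int i = 0; i < a.length; i++) {
        int a1 = a[i], a2 = a[(i + 1) % a.length];
        double removedA = calcCost(a1, a2);
        for (int j = 0; j < b.length; j++) {
            int b1 = b[j], b2 = b[(j + 1) % b.length];
            double removed = removedA + calcCost(b1, b2);

            // direction 1: connect A1 -> B2 and B1 -> A2 (loop b keeps its direction)
            double d1 = calcCost(a1, b2) + calcCost(b1, a2) - removed;
            // direction 2: connect A1 -> B1 and B2 -> A2 (loop b is reversed)
            double d2 = calcCost(a1, b1) + calcCost(b2, a2) - removed;

            double d = Math.min(d1, d2);
            if (d < best) {
                best = d;
                bestI = i;
                bestJ = j;
                bestSameDirection = (d1 <= d2);
            }
        }
    }
    return best;
}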


Cost and runtime performance of group, solve and merge

The idea of dividing the nodes into groups, finding the optimal path for each subtour and finally merging these back together again could seem like a good one, as some parts are 'optimal' after all. But there are several problems with this approach, some of which are:

1. Only small groups can be solved in a reasonable time, which can result in many groups.

2. Merging efficiently is difficult, and with potentially many groups the tour cost will suffer.


Figure 5-2 Cost performance of group, solve and merge tours with group sizes 10 to 14 versus the optimal tour. Grouping based on nearest neighbour (starting node 0).

As we can see from Figure 5-2, the tours produced are fairly similar in cost for each dataset even though the group size is varied. This is because the method of grouping (nearest neighbour starting in node 0) is more impactful than small changes to the group size. This doesn't mean it's a total waste though, as our further work with the improvement algorithms has shown that the lowest cost initial tour doesn't necessarily provide the lowest cost final tour. More specifically, this was shown while working with the bier127 dataset, where we found the optimal solution earlier with the group, solve and merge algorithm with improvements than we did with the nearest neighbour algorithm with improvements.

34

By running the group, solve and merge algorithm on all nearest neighbour tours instead of only the nearest neighbour tour starting in node 0 we get a better evaluation of the algorithm.

Cost of average AllNN, GSM (node 0), All GSM and best GSM (group size 10) vs optimal cost

Figure 5-3 Cost performance of group, solve and merge algorithms (group size 10) with different selections of nodes based on NN starting node 0 and All NN, including the best GSM created tour and the standard all nearest neighbour average tour.

From the above figure we can see that the average group, solve and merge tour is only lower cost than the average nearest neighbour tour for the att48 dataset, and the gap increases for each larger dataset. The best group, solve and merge initial tour is, however, only 1.19% worse than the optimal tour! But the average nearest neighbour tour produces much more stable results even when N increases, which is one of the reasons why we will be using the nearest neighbour algorithm instead of group, solve and merge when we attempt to solve TSP in Chapter 8.

However, the more interesting part of this algorithm is that it was initially designed to use parallelism to attempt to solve what essentially is a runtime problem. The problem is unimaginably large, with a time complexity of O((N−1)!/2). So, in theory we can divide the problem into groups small enough to solve within reasonable time, and to further improve the runtimes we can naturally do so in parallel. We will later implement and evaluate the parallel version in chapter 7.2. The selection of nodes will still be based on the nearest neighbour (starting node 0) algorithm.

Runtimes, group size

Datasets     10         11         12          13          14
att48        0.045s     0.120s     0.770s      1.134s      6.994s
bier127      0.122s     0.439s     2.278s      8.358s      53.436s
a280         0.194s     0.637s     2.334s      8.251s      38.346s
att532       0.361s     1.325s     4.913s      45.363s     194.205s
brd14051     21.837s    50.657s    174.405s    787.184s    4 605.023s

Table 5-2 Runtimes of the sequential group, solve and merge algorithm (selection of nodes: nearest neighbour starting node 0) with group size 10-14. Run on I7 7700K 4(8) cores @ 5GHz.

From Table 5-2 we can observe that for higher group sizes there are substantial runtime improvements to be made, especially on the brd14051 dataset. Even running the algorithm with a group size of 5, which means the dividing and solving is almost instant, results in a runtime of ~13 seconds. This confirms that the merging itself has a considerable time consumption, which will also be improved with parallelism in chapter 7.2.

Another interesting observation is that the runtimes are directly proportional to neither the group size nor N. The bier127 dataset, with less than half the number of nodes of a280, manages to overtake it when the group size exceeds 12. This is because some groups require much more time to solve than others, and this will give us some interesting results when done in parallel.

Conclusions

Nearest neighbour is a simple but effective algorithm which produces stable average results for varying datasets and increasing N (Figure 5-3). Its speed also allows it to be run many times on our 4 smallest datasets, which means we can use multiple (N) initial approximations (the all nearest neighbours algorithm) for improvements later, increasing our chances of reaching an optimal solution in Chapter 8.

Group, solve and merge is based on a novel idea: finding smaller optimal tours which are solvable ad hoc, and then merging them together. In practice this doesn't give us very impressive initial results (Figure 5-2) for the runtimes it costs (Table 5-2). The poor cost performance comes from suboptimal merging, and it only gets worse when the size of the dataset increases (more merging, Figure 5-3). However, throughout the development of the improvement algorithms there were several cases where group, solve and merge reached optimal earlier, and sometimes more often, than nearest neighbour.

The observation that a lower cost initial approximation doesn't necessarily result in a lower cost end result means we cannot evaluate the true value of the initial approximation methods until they are tested with the improvement algorithms.

Improvement algorithms

After the initial approximations have been made we can proceed with phase 2 of attempting to solve TSP, which is trying to improve these solutions through the use of multiple different algorithms. Some are based on mathematical rationale and some are more ad hoc in their approach. They all have in common, however, that a change is only made to the tour if it improves the tour cost.

There are three final improvement algorithms presented:

- Swapping edges (2-opt); comparing pairs of edges and swapping them when beneficial.

- Path move; removing and reattaching one or more nodes from the tour.

- Split and merge; attempting to remove a high cost edge by splitting the tour into multiple groups and merging them together.

We will explain the thought process behind creating each algorithm, and which other algorithms they were based on. Then we will look at how they perform and compare them to their earlier versions both in cost and runtime. In Chapter 8 we will examine how these algorithms can work together to improve a rough initial approximation into an optimal, or close to optimal tour.


Swapping edges algorithm

One of the first mathematical improvements my supervisor and I discussed was the rule that if there are two intersecting edges A → B and C → D, then there exist two lower cost replacement edges A → C and B → D. This was the initial idea for this algorithm, and it was later generalized to evaluate all possible edge pairs (O(N²)), with any number of nodes between them.

Clearing Intersections

The first version was based on the mathematical fact that when there are two intersecting edges A → B and C → D, there exist two lower cost edges A → C and B → D. The algorithm clearIntersections() was created to remove and replace the existing intersections in a given tour.

Figure 6-1 clearIntersections() algorithm. Swapping AB and CD to AC and BD while changing direction of nodes between B and C.

It can be described as follows:

1. Create a stack of all nodes

2. Take one node 퐴 off stack and set 퐵 as 퐴. 푛푒푥푡

3. Check all other edges and calculate if the edges (line segments) intersect with 퐴 → 퐵

a. If edge 퐶 → 퐷 intersects then replace with 퐴 → 퐶 and 퐵 → 퐷.

i. Flip direction of nodes between C and B

b. Else continue next edge.

4. Repeat 2 until no nodes are left.

5. Repeat 1 until no changes have been made


1  add all nodes to stack nodes_left
2  remove first node from nodes_left as A
3  set B as A.next
4  for all nodes in nodes_left as C
5      set D as C.next
6      if AB intersects with CD
7          set A.next as C
8          set i as A and set j as C
9          do
10             set j.next as j.prev
11             set j.prev as i
12             set i as j
13             set j as j.next
14         while j is not B
15         set j.prev as i and set j.next as D
16         set D.prev as j
17         repeat 1
18 if any changes are made repeat 1

Pseudocode 6-1 clearIntersections(). Finds all intersecting line segments AB and CD, and replaces them with AC and BD, while flipping the direction of the nodes between B and C.

One way to calculate if two line segments intersect is the following: [22]

aCalc = (y3 − y4) * (x1 − x3) + (x4 − x3) * (y1 − y3)

bCalc = (y1 − y2) * (x1 − x3) + (x2 − x1) * (y1 − y3)

cCalc = (x4 − x3) * (y1 − y2) − (x1 − x2) * (y4 − y3)

intersecting if ((0 ≤ aCalc/cCalc ≤ 1) && (0 ≤ bCalc/cCalc ≤ 1))

Here x1 and y1 represent the x and y coordinates of node 1, and similarly for nodes 2, 3 and 4, with node 1 and node 2 forming one edge and node 3 and node 4 the other. Although the calculations run fairly quickly, there are certainly better ways to do this, which are explained further in the finalized version.
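Translated directly into Java this test becomes the short method below. It reuses the simplified Node class from the earlier merge sketch, and the cCalc == 0 guard for parallel segments is an added assumption, so this is only a sketch of the calculation, not the thesis implementation.

// Returns true if segment (n1, n2) intersects segment (n3, n4), using the formulas above.
static boolean intersects(Node n1, Node n2, Node n3, Node n4) {
    double aCalc = (n3.y - n4.y) * (n1.x - n3.x) + (n4.x - n3.x) * (n1.y - n3.y);
    double bCalc = (n1.y - n2.y) * (n1.x - n3.x) + (n2.x - n1.x) * (n1.y - n3.y);
    double cCalc = (n4.x - n3.x) * (n1.y - n2.y) - (n1.x - n2.x) * (n4.y - n3.y);
    if (cCalc == 0) return false;                 // parallel (or collinear) segments: no swap
    double t = aCalc / cCalc;
    double u = bCalc / cCalc;
    return 0 <= t && t <= 1 && 0 <= u && u <= 1;
}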

Four nodes swap

Another improvement we discussed was how to improve nodes in a sequence. As an example, four nodes in a row A → B → C → D can easily be changed to A → C → B → D. The algorithm runs through all nodes, checks whether this swap would improve the cost, and completes it if it does.

Figure 6-2 Four nodes swap algorithm, swapping the sequence ABCD to ACBD

The time complexity is 푂(푁) which makes it negligible when considering efficiency. As with clearing intersections it can be generalized and is covered in the finalized version.
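Because only the edges A-B and C-D change when A → B → C → D is reordered to A → C → B → D (the middle edge keeps the same cost in a symmetric TSP), the whole check reduces to one comparison. A minimal sketch, assuming the simplified Node class and calcCost helper from the earlier examples:

// Returns true if reordering the sequence A -> B -> C -> D to A -> C -> B -> D saves cost.
static boolean fourNodeSwapImproves(Node a) {
    Node b = a.next, c = b.next, d = c.next;
    return calcCost(a, c) + calcCost(b, d) < calcCost(a, b) + calcCost(c, d);
}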

Finalized swapping edges (2-opt)

The clearing intersections and four nodes swap algorithms can be generalized into the final algorithm, which compares all pairs of edges and swaps them if doing so reduces the cost. The definition of the algorithm is the same as for the clearing intersections algorithm, but instead of calculating whether the edges intersect, the cost of the potential swap is calculated and the swap is completed if it decreases the cost. This results in a '2-optimal' tour.

Figure 6-3 Swapping edges algorithm, compares all edges and swaps them if they reduce cost, e.g. AB…CD to AC…BD while also changing direction of nodes between B and C.

As an example, the algorithm can start at edge A → B, checking if a swap saves cost with C → D, D → E, E → F etc. If the swap can be made in the first instance, C → D, then it is equivalent to the four nodes swap (four nodes in a sequence). If the swap is done with another edge, e.g. D → E, which also intersects with A → B, then it is equivalent to the clearing intersections swap. The extra value of the finalized algorithm compared to the previous algorithms is that it can also swap edges which are neither in sequence nor intersecting, whenever this decreases the cost.

1  add all nodes to stack nodes_left
2  remove first node from nodes_left as A
3  set B as A.next
4  for all nodes in nodes_left as C
5      set D as C.next
6      set curr_cost as CalcCost(A,B) + CalcCost(C,D)
7      set new_cost as CalcCost(A,C) + CalcCost(B,D)
8      set cost_saved as curr_cost - new_cost
9      if cost_saved > 0
10         set A.next as C
11         set i as A and set j as C
12         do
13             set j.next as j.prev
14             set j.prev as i
15             set i as j
16             set j as j.next
17         while j is not B
18         set j.prev as i and set j.next as D
19         set D.prev as j
20         repeat 1
21 if any changes are made repeat 1

Pseudocode 6-2 Swapping edges algorithm. Compares all edges and swaps them if they reduce cost, and changes direction of nodes between B and C. CalcCost(i,j) function calculates cost from node i to node j
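As a Java sketch of the same logic, assuming the simplified doubly linked Node class and calcCost helper from the earlier examples (not the exact thesis implementation), one swap attempt including the pointer reversal between C and B could look like this:

// Tries to replace edges A-B and C-D with A-C and B-D; if that saves cost, the
// next/prev pointers of the nodes between C and B are reversed and true is returned.
static boolean trySwap(Node a, Node c) {
    Node b = a.next;
    Node d = c.next;
    if (c == a || c == b) return false;           // the two edges share a node, nothing to swap
    double saved = calcCost(a, b) + calcCost(c, d)
                 - (calcCost(a, c) + calcCost(b, d));
    if (saved <= 0) return false;

    a.next = c;
    Node i = a;
    Node j = c;
    do {                                          // walk from C back towards B, flipping each node
        Node oldPrev = j.prev;
        j.next = oldPrev;
        j.prev = i;
        i = j;
        j = oldPrev;
    } while (j != b);
    j.prev = i;                                   // j == B: reattach B between the reversed segment and D
    j.next = d;
    d.prev = j;
    return true;
}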


Cost and runtime performance of swapping edges

Reducing the cost is the most important factor when evaluating the improvement algorithms. When testing the three swapping algorithms we will be running the improvements on all nearest neighbour initial tours (except for brd14051 which will be NN(node 0)), and calculating the average savings and average runtimes. All the algorithms are run until no further improvements can be made.

Cost of average AllNN, and average improvements by clearIntersections, fourNodeSwap and swapping edges vs optimal cost

Figure 6-4 Cost performance of average improvements to all nearest neighbours (only NN(node 0) on brd14051) by fourNodeSwap, clearIntersections and Swapping edges algorithms.

FourNodeSwap improved tours are only about 2% better than the nearest neighbour tours. However, clearIntersections gives larger improvements of about 8-10%. Since the finalized Swapping edges algorithm covers both previously mentioned algorithms it naturally does even better (~13-20%), while also improving more cases, like edges which are neither in sequence nor intersecting.

Runtimes, algorithms

Datasets     fourNodeSwap    clearIntersections    Swapping edges
att48        0.01ms          0.06ms                0.23ms
bier127      0.05ms          0.40ms                3.98ms
a280         0.05ms          1.23ms                17.44ms
att532       0.05ms          7.65ms                14.83ms
brd14051     6.74ms          6 806.66ms            58 484.16ms

Table 6-1 Runtimes of fourNodeSwap, clearIntersections and Swapping edges algorithms. Run on I7 7700K 4(8) cores @ 5GHz.

The runtimes of fourNodeSwap are negligible, completing all improvements in only ~7 milliseconds on our largest dataset. ClearIntersections is negligible as well until it reaches brd14051, where it spends on average ~7 seconds to make all improvements. Swapping edges spends on average a minute before completing all of its improvements on brd14051, which isn't that long when comparing it to algorithms like group, solve and merge, and split and merge.

Another thing to consider is that when an algorithm makes a change it could create a new intersection, or the tour could possibly be further improved in other ways. So an algorithm's runtime (like swapping edges) isn't directly comparable to a similar algorithm (clearIntersections), because the more changes it makes, the more loops it will have to make until it's complete.

Path move algorithm

The next simple possible improvement which was considered is removing one node from the tour and checking if there exists a lower cost position for it. This was later generalized into removing K number of nodes in a sequence (or a ‘path’) and finding a lower cost position for it. Because they are so similar only the final one will be described.

Finalized path move

Instead of removing one node B from a tour and inserting it wherever it saves the most cost, we can remove a path of K nodes and set its first and last node as B and B2. If K = 1 then B == B2, which means we are only moving one node at a time, which is equivalent to the first implementation of improving only one node's position.

Figure 6-5 Path move algorithm, removes path of length K (B…B2) from AB and B2C and inserts path at DB2 and BE if it saves cost. Also changes direction of nodes between B and B2.

To remove a path B … B2 from the tour we need to remove the attaching edges A → B and B2 → C, while adding A → C to seal the loop. To reattach the path to another pair of nodes D and E, we must remove the existing edge D → E and add the edges D → B2 and B → E, and consequently change the direction of the nodes between B2 and B.

To calculate if a move will save cost we simply compare the new edges versus the old edges:

if ((A → C + D → B2 + B → E) < (A → B + B2 → C + D → E)) then move
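As a small sketch (using the calcCost helper from the earlier examples), the test is a single comparison of the three new edges against the three old ones:

// The path B...B2 currently sits between A and C; moving it in between D and E
// adds the three new edges and removes the three old ones. Move only if the result is > 0.
static double pathMoveSaving(Node a, Node b, Node b2, Node c, Node d, Node e) {
    double oldEdges = calcCost(a, b) + calcCost(b2, c) + calcCost(d, e);
    double newEdges = calcCost(a, c) + calcCost(d, b2) + calcCost(b, e);
    return oldEdges - newEdges;
}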

Cost and runtime performance of path move

As done previously with the swapping algorithms, we will also run the path move algorithm on all nearest neighbour starting points (except for brd14051). Because the path move algorithm can run with different size K's we will be testing three variations and running them in sequence. K = 1 will check each node and see if it can be moved to lower the cost. The last two variants which we will evaluate are K = 1…5 and K = 1…20. These will be run in sequence from 1 until all improvements have been made, then 2 until all improvements have been made, and so on. This means it will not go backwards or loop around even if K > 1 implements changes. This is different to the final implementation of the improvement process (Chapter 8), where the whole stack of improvement algorithms will be looped whenever improvements are made.

Cost of average AllNN and average improvements by path move with K = 1, K = 1…5 and K = 1…20, vs optimal cost

Figure 6-6 Cost performance of average improvements to all nearest neighbours (only NN(node 0) on brd14051) by the path move algorithm with K = 1, K = 1…5 and K = 1…20, vs optimal.

Cost performance while improving only one node is very good, improving by about 12-17% compared to the average all nearest neighbour tours. By running all path moves up to K = 20 we further improve the tour by an extra ~2-4%.

Runtimes (milliseconds)

Datasets     K = 1           K = 1…5          K = 1…20
att48        0.38ms          0.97ms           1.63ms
bier127      4.90ms          11.96ms          26.47ms
a280         17.00ms         57.41ms          131.68ms
att532       26.21ms         77.39ms          180.23ms
brd14051     78 335.15ms     293 294.08ms     744 704.00ms

Table 6-2 Runtimes of the path move algorithm with K = 1, K = 1…5 and K = 1…20. Each step (each K) is run until completion, but it does not loop backwards even though an improvement has been made. Run on I7 7700K 4(8) cores @ 5GHz.

Path move is at its most effective at lower K's, but increasing K yields decent further improvements at a less than proportional runtime cost (the runtime quotient between K = 1 and K = 1…20 is only ~4-10). This is because fewer changes result in fewer loops, and for very high K's there are simply fewer nodes left in the tour to compare positions against. It is possible to run K up towards N/2 within reasonable time for the smaller datasets, but it would require extreme amounts of time for brd14051.

Split and merge algorithm

Removing and rearranging edges in a tour while preserving a complete loop can be quite difficult. With the two previous algorithms we can swap edges with ease, and remove a path or a node and insert it elsewhere. However, they have some quite substantial limitations, especially in dealing with high cost edges which are far from their optimal position. Therefore, to address these, I wanted to create a new algorithm specifically aimed at improving or removing such edges.

Description

First, we must remove an edge 퐴 → 퐵, and somehow try to fix the now open path created. Based on the previous work done with the group, solve and merge algorithm I figured we could use multiple closed loops and merge them together. However, to create closed loops from the path with now two open ends in 퐴 and 퐵 we must at least remove one more edge. This would then create two separate paths, each with their two open ends which can be added together. With two newly created separate closed loops (subgraphs) we could either keep on splitting them into more subgraphs or we could finalize by merging into the complete tour again.

Figure 6-7 Split and merge first split: Searching forwards from B. Removing AB and CD, then adding CB and AD, creating two subgraphs.

There are 3 stages to examine in this algorithm: first, where to split; then, how many splits we should create (depth); and finally, how to merge. Because there could be many splits, an equal number of merges, and new splits and merges depend on previous ones, it is hard to compute quickly.

Splitting

Assuming we begin by wanting to remove an edge A → B, we will then need to find a first splitting point to create the two first subgraphs. We can start at the node B (the 'left' side) and try to search forward ('next') to find another node C which could save 'some' cost. There are three different ways used to calculate the cost of selecting a node for splitting:

퐒퐞퐚퐫퐜퐡 ퟎ: 퐶푎푙푐퐶표푠푡(퐵 → 퐶)

The simple yet effective way to find a splitting node. It finds the closest node 퐶 from node 퐵. If 퐴 == 퐶 then it is skipped as it means the 퐴 → 퐵 edge already is the lowest cost possible.

퐒퐞퐚퐫퐜퐡 ퟏ ∶ 퐶푎푙푐퐶표푠푡(퐵 → 퐶) − 퐶푎푙푐퐶표푠푡(퐶 → 퐷)

Same as previous but includes the cost savings in the removal of the 퐶 → 퐷 edge where 퐷 = 퐶. 푛푒푥푡. Can be highly effective in removing high cost edges without starting in them specifically.

퐒퐞퐚퐫퐜퐡 ퟐ ∶ 퐶푎푙푐퐶표푠푡(퐵 → 퐶) − 퐶푎푙푐퐶표푠푡(퐶 → 퐷) + 퐶푎푙푐퐶표푠푡(푝푟푒푣푁표푑푒 → 퐷)

Same as previous but also includes the cost of the prevNode → D edge, where prevNode is the D node from the previous split. It is therefore the node to which the next D node will be connected to complete the new subgraph. For the first calculation there is no previous splitting node, so prevNode is set to the A node, as the first split from B → C will result in a new edge A → D.

If the search is done from the A node (the 'right' side) then the search will be done backwards ('prev') with the same principles as before, except we use the previous nodes instead of the next nodes.

As the left and right side can use different search methods we get a total of 9 different possible search configurations. Each run only uses one specific search configuration, described as (#left search, #right search). As an example, the run SplitMerge(0,0) will use the search calculation method CalcCost(B → C) on both sides.

                  Right search
Left search       0        1        2
0                 (0,0)    (0,1)    (0,2)
1                 (1,0)    (1,1)    (1,2)
2                 (2,0)    (2,1)    (2,2)

Table 6-3 The 9 different search configurations for split and merge, and SplitMergePara.
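A minimal sketch of the three left-side (forward) search calculations, assuming the simplified Node class and calcCost helper from the earlier examples. The right-side search follows the same pattern using prev instead of next, and the real implementation also skips the case C == A:

// Cost of choosing C as the splitting node when searching forwards from B.
// Lower values are better; searches 1 and 2 also account for the edge C -> D that a
// split would remove and the edge prevNode -> D that it would create.
static double splitSearchCost(int search, Node b, Node c, Node prevNode) {
    Node d = c.next;
    switch (search) {
        case 0:  return calcCost(b, c);
        case 1:  return calcCost(b, c) - calcCost(c, d);
        case 2:  return calcCost(b, c) - calcCost(c, d) + calcCost(prevNode, d);
        default: throw new IllegalArgumentException("unknown search method " + search);
    }
}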


Split order, alternating sides

Splitting could always be done from the B node but alternating the splitting ‘side’ was implemented first and it was also found to be the most effective. As the goal was to remove the 퐴 → 퐵 edge we can expect the best approach would be to utilize both ‘sides’. If we only work from one side, we would potentially have to create a very high number of splits before the algorithm reaches a split where it improves the 퐴 → 퐴. 푛푒푥푡 edge substantially.

Figure 6-8 Split and merge second split. Searching backwards from A. Removing AB and CD, then adding DB and AC, creating 2 new subgraphs, now totalling 3.

The first split will remove the edges 퐴 → 퐵 and 퐶 → 퐷, creating the new edges 퐵 → 퐶 and 퐴 → 퐷. Since the 퐵 → 퐶 edge was found and used in all the three methods mentioned earlier we can assume that 퐴 → 퐷 is probably worse. Therefore, we continue by trying to remove the 퐴 → 퐷 edge by recursively calling the method again with 퐵 = 퐷 and 푝푟푒푣푁표푑푒 = 퐷. Since we don’t want to repeat working only from the 퐵 side we alternate to do the search from the 퐴 node instead while going backwards in the subgraph (Figure 6-8).

Depth

We could split until we reach a state where 퐶 == 퐴 or 퐶 == 퐵, depending on which side we are working on, but because merging is far from perfect there are limits to how many splits in general would produce a lower cost result. By introducing a max depth which tells the algorithm (the recursive function) to stop splitting when it reaches it, we can produce better end results. A max depth of 2 splits would result in an equivalent version of the first idea of this algorithm, where we split once on the 퐵 side then once on the 퐴 side, creating three closed loops before merging together.

For our datasets we have found that running with both max depth 2 and max depth 6 produces the best results. This will be expanded on further in Chapter 8.

Merging

Figure 6-9 Split and merge sequence merging. Merging subgraphs A and B, then AB and C, creating the complete tour ABC.

Merging is done in the same way as described in the group, solve and merge algorithm (5.2.3). It compares all pairs of nodes in the merging subgraphs and finds the best merge point, changing the direction of one subgraph if necessary. As also described, it is done in sequence, adding subgraphs A and B, then AB and C, then finally ABC and D, resulting in ABCD, instead of the alternative pairwise A and B, C and D, then AB and CD, resulting in ABCD. The latter version would be quicker, but it would give fewer merge points to choose from and therefore worse cost on average.

Split and merge example (max depth 2)

Step 0: Create a copy list of the nodes in the TSP tour.

Step 1: Remove a node A from the copied list and attempt to remove the edge A → A.next, which is A → B.

Step 2: Split from the 'left' side: Find the node closest to B, or the one which saves the highest cost (giving the edge D → B). Remove D → E and A → B while adding the new edges D → B and A → E. We now have two separate paths.

Step 3: Split from the 'right' side: Find the node closest to A, or the one which saves the highest cost (giving the edge A → H). Remove G → H and A → E while adding the new edges A → H and G → E. We now have three separate paths and have reached our max depth of 2.

Step 4: Find the highest cost saving merge point between the paths B C D and E F G, which is D → E and G → B, and remove D → B and G → E.

Step 5: Repeat step 4 with the new path B C D E F G and the last path A H I, adding the edges F → H and A → G while removing A → H and F → G.

Step 6: If the savings from the previous operations are higher than 0 then implement the changes to the original tour.

Savings = (AB + DE + GH + DB + GE + AH + FG) − (DB + GE + AH + DE + GB + FH + AG)

Which in this example (because some edges are both added and removed) is equal to:

Savings = AB + GH + FG − (GB + FH + AG)

Step 7: Go to step 1 until the copy of the TSP tour is empty.

Cost and runtime performance of split and merge

Cost of average AllNN, and average improvements by one run of SplitMerge(0,0), SplitMerge(1,1) and SplitMerge(2,2)

Figure 6-10 Cost performance of average improvements to all nearest neighbours (only NN(node 0) on brd14051) by one run of the split and merge algorithm with the (0,0), (1,1) and (2,2) search configurations.

The cost performance is the best of all of the created improvement algorithms. With only one run on all nearest neighbour initial tours we get average results only 2-5% away from optimal. If we run the same configuration until completion (no further improvements can be made) we could achieve even better results. However, as Chapter 8 will show, it is desirable to alternate search configurations rather than running one specific configuration until completion.

Runtimes (milliseconds)

Datasets     SplitMerge(0,0)     SplitMerge(1,1)     SplitMerge(2,2)
att48        1.08ms              1.89ms              0.86ms
bier127      11.77ms             47.31ms             6.30ms
a280         83.01ms             199.08ms            38.17ms
att532       144.74ms            397.76ms            66.59ms
brd14051     2 377 543.60ms      8 757 420.34ms      699 042.62ms

Table 6-4 Runtimes of one run of the split and merge algorithm with the (0,0), (1,1) and (2,2) search configurations. Run on I7 7700K 4(8) cores @ 5GHz.

With great cost performance we also get a higher runtime cost. As we have mentioned before, the method of merging is costly in runtime. It is also highly dependent on the sizes of each subgroup created before attempting a merge.

Num pairs compared = num pairs first merge + num pairs second merge

Num pairs compared = (A * B) + C * (A + B)

Best case merge with depth 2 on brd14051 (splits create 2 small groups and 1 large group):

Best case = (2 * 2) + (14051 − 4) * (2 + 2) = 56 192

Worst case merge with depth 2 on brd14051 (splits create 3 large groups):

Worst case = (4684 * 4684) + (14051 − 9368) * (4684 + 4684) = 65 810 200

That is a difference of a factor of ~1171, resulting in runtimes of the same proportion. This is the reason we get such different results from the different configurations. Some configurations also terminate more often than others before reaching the maximum depth, because C or D == A, which results in lower runtimes as fewer merges are calculated.

Conclusions

Swapping edges (2-opt) combines the improvements possible from clearIntersections and fourNodeSwap, and includes new improvements from edges which are neither intersecting nor in sequence. By traversing all edges, the algorithm checks if swapping edges A → B and C → D with A → C and B → D saves cost. If the swap does save cost it is immediately implemented. The algorithm repeats until no further improvements can be made.

The algorithm improves cost between 13-20% compared to the average nearest neighbour tour resulting in tours between 5-9% away from optimal. It is a very quick algorithm which only runs for about 60 seconds on brd14051.

However, a problem found with swapping edges is that it is cost effective in the short-term but ruins possible better improvements by the other two improvement algorithms. This will be shown in Chapter 8.

Path move checks 퐾 nodes in a path (in-sequence) at a time and compares its current position with every other possible position in the tour. Running with 퐾 = 1 results in only checking and moving one node at a time, which was how the algorithm originated. If the algorithm finds a lower cost position it moves the whole path.


Tour cost improvements are decent but more volatile depending on the dataset, reaching tours between 2-11% worse than optimal when 퐾 is between 1 and 20. It also comes at a higher runtime cost and can take up to 13 minutes to complete on brd14051. However, it has been found to be very effective when 퐾 ≤ 20, together with the split and merge algorithm, and is therefore essential for the final results in Chapter 8.

Split and Merge is my attempt at creating an algorithm to remove/improve an edge 퐴 → 퐵 by continuously splitting a copied tour from left and right until a max depth is reached or a search has ‘failed’/terminated. This creates a number of subgroups which will finally be merged together again to complete the tour. The new copied tour replaces the original tour if the splits and merges improved the total tour cost.

As it has 9 different search configurations, and any number of possible max depths, the possibilities are many but also time consuming. Because of the time consumption we will revise the algorithm and improve it with parallelization in Chapter 7.3. The parallel version is the one which will be used to produce the final results of this thesis in Chapter 8.


Parallel implementations and improving efficiency

An algorithm can be improved in two ways: either by improving the result it provides (a lower tour cost in the end), or by improving the efficiency with which it runs. The efficiency of an algorithm can be improved by:

1. Optimizing the code

2. Modifying the algorithm

3. Utilizing parallelism

We will use these three techniques to improve two of the introduced sequential algorithms with the highest runtimes (Figure 7-1). However, all of the algorithms could be improved in efficiency by using parallelization. We will also look at a trivial, top-level implementation of running all improvements on multiple initial tours and discuss the reasons why we don’t use it for this thesis.

Runtimes (s) of all sequential algorithms on the brd14051 dataset on I7 7700k 4(8) cores @ 5GHz

Figure 7-1 Runtimes (seconds) of all sequential algorithms on the brd14051 dataset: nearest neighbour, group, solve and merge (group size 13), swapping edges, path move (K=1) and split and merge (search configuration 0,0). All improvement algorithms are run until no improvements could be made, except split and merge which is only run once. Run on I7 7700k 4(8) cores @ 5GHz.

From the initial tour phase, we will improve the group, solve and merge algorithm, which will mostly be done by utilizing parallelism, as the algorithm itself was designed to be easily parallelized. However, there are several ways to implement parallelism, both in protecting the algorithm against typical problems like deadlocks, livelocks etc. (Chapter 2.3), and in improving efficiency further.

From the improvement phase, we will improve the split and merge algorithm which needs to undergo multiple changes to the algorithm itself before parallelization to preserve determinism (same input, same output). This is crucial for our case as we only implement improvements to the tour which means we will eventually reach a state where no improvements can be made by our algorithms. If each sub-result is non-deterministic it means that each end result will also be, resulting in unpredictable final tours.


Trivial, top-level parallelization

An easy way to improve efficiency in our program is to introduce parallelism at the highest level. If we want to run all improvements on many initial tours (e.g. all nearest neighbour tours), then we can divide the workload onto threads from the start, as each initial tour and improvement phase is completely separate from the others.

Figure 7-2 Trivial, top-level parallel implementation of running improvements on many initial tours.

At the top level we could start K threads, which all call a method (synchronized by a 푅푒푒푛푡푟푎푛푡퐿표푐푘) which returns the next starting node to be worked on. The thread would then continue to run the initial tour algorithm on the given starting node, and then run all improvements on that tour until completion. The improved tour is then passed to the main thread (list of tours) in another synchronized method (synchronized with a 푅푒푒푛푡푟푎푛푡퐿표푐푘). It would repeat until the synchronized method has no starting nodes left to return (returns −1). Although we do not use this in the thesis, there are parts which are similar to the later parallel implementations.
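A minimal sketch of this scheme is shown below. The class and method names (getNextStartNode, deliverTour) and the placeholder bodies for the initial tour and improvement steps are hypothetical; only the overall structure (K worker threads pulling starting nodes and delivering finished tours through ReentrantLock-guarded methods) reflects the description above.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

class TopLevelParallel {
    private final ReentrantLock nodeLock = new ReentrantLock();
    private final ReentrantLock resultLock = new ReentrantLock();
    private final List<int[]> finishedTours = new ArrayList<>();
    private int nextStart = 0;
    private final int n;

    TopLevelParallel(int n) { this.n = n; }

    int getNextStartNode() {                 // synchronized hand-out of the next starting node
        nodeLock.lock();
        try {
            return nextStart < n ? nextStart++ : -1;   // -1 means no starting nodes left
        } finally {
            nodeLock.unlock();
        }
    }

    void deliverTour(int[] tour) {           // synchronized hand-in of a finished tour
        resultLock.lock();
        try {
            finishedTours.add(tour);
        } finally {
            resultLock.unlock();
        }
    }

    void run(int numThreads) throws InterruptedException {
        Thread[] workers = new Thread[numThreads];
        for (int t = 0; t < numThreads; t++) {
            workers[t] = new Thread(() -> {
                int start;
                while ((start = getNextStartNode()) != -1) {
                    int[] tour = initialTour(start);   // e.g. nearest neighbour from this node
                    improveUntilDone(tour);            // run the full improvement stack
                    deliverTour(tour);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) w.join();   // wait for all workers to finish
    }

    // Placeholders standing in for the algorithms described elsewhere in this thesis.
    private int[] initialTour(int start) { return new int[n]; }
    private void improveUntilDone(int[] tour) { }
}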


There are three main reasons why we do not use this for our thesis:

1. It’s too simple and not interesting. 2. By fully parallelizing the top level we create problems in implementing parallelism for efficiency in the underlying algorithms. 3. Larger datasets like brd14051 can spend up to 40 hours running. Therefore, we decide it is better to use the processing power on reducing the runtime for each run rather than completing multiple runs at the same time.

If, for our 4(8) core test system, we divide the workload immediately onto 8 threads, then we will already be using 100% of the CPU. This means any extra threads or parallelization created by lower-level algorithms run by these threads will not have any unused processing power available, and will therefore not show any improvement over a sequential algorithm.

However, it is interesting, and it shows how simple a parallel implementation can be in some cases while still achieving speedups of about 4 on our 4(8) core system.

Parallel group, solve and merge.

As we have previously explained in chapter 5.2, there are three parts of this algorithm:

1. Divide nodes into groups of size K based on different types of selection of nodes.

2. Solve the optimal tour for each subgroup.

3. Merge each optimal subgroup.

The implementation of a parallel version does parts of step 1, and all of step 2, in parallel. We could also use the alternative merging in step 3, where one would merge pairwise AB, CD, EF, GH instead of in sequence AB, then ABC, ABCD, ABCDE, ABCDEF, ABCDEFG and finally ABCDEFGH. This would improve efficiency, but at a severe detriment to the tour cost. Even if we want to merge in sequence, we can start the merging process while steps 1 and 2 are still running, either in the main thread or in an additional thread.

Figure 7-3 Parallel group, solve and merge algorithm. The main thread starts K threads which solve groups while the main thread merges simultaneously.

Description

The algorithm first uses one of the selection methods outlined in paragraph 5.2.1. This initial tour/grouping is put into an array and used as the starting point (tourStart[]). Then, new threads are created up to a maximum of the number of available CPU cores or the number of calculated subgroups. As an example: if the group size is 12 and N = 48, then the number of threads will be 4, unless the CPU has 3 or fewer cores available.

numThreads = Math.min(numSubgroups, numCores)

Thread solving

Each thread calls the method getNextSubgroup(), synchronized by a ReentrantLock, which when open returns the number of the next subgroup the requesting thread should solve. The thread then converts the given subgroup portion of the integer array into a doubly linked list of nodes. It proceeds to solve the optimal path for the subgroup and stores a reference to one of the nodes in the solved subgroup. This reference to a node in the now solved closed loop linked list of nodes can be used to traverse the full solved path when merging.

The thread can now call the method setCompleteMergeGroup(subgroup_id), synchronized by another ReentrantLock, which when open allows the thread to mark the subgroup_id as solved in a global Boolean array. This means the group is now solved and ready for merging. The thread repeats and requests new subgroups until there are no subgroups left to solve and getNextSubgroup() returns -1, at which point the thread completes.

Merging simultaneously

Considering we have decided to merge in sequence (AB → ABC → ABCD etc.) because of tour cost performance, we shouldn't parallelise the merging itself. We could instead wait until all threads are done and then merge sequentially. However, there is a better solution: we can merge simultaneously with steps 1 and 2. For any problem where there are more than 2 groups, there might be 2 or more groups already solved while one or more threads are still running and solving. This means that if group A and group B are solved, they can be merged, even if there are K groups which still haven't been solved. This essentially gives an earlier start to the merging instead of waiting until all subgroups are complete.

After the main thread has started all sub-threads it will begin calling the method getNextMergeGroup(currentGroup_id). This method returns the id of the (currentGroup_id + 1) subgroup (the next subgroup to be merged) if it is marked as solved and ready to be merged in the global boolean array, and -1 if it is not ready. The main thread continues to merge until it has merged all previously merged subgroups (which now form one closed loop) with the last subgroup, which completes the full tour.
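A sketch of these coordination methods could look as follows; the field names and surrounding class are assumptions for illustration, but the behaviour mirrors the description above.

import java.util.concurrent.locks.ReentrantLock;

// Worker threads request subgroups to solve and mark them as solved, while the
// main thread polls for the next subgroup that is ready to be merged.
class GroupSolveMergeCoordinator {
    private final ReentrantLock groupLock = new ReentrantLock();
    private final ReentrantLock mergeLock = new ReentrantLock();
    private final boolean[] solved;          // one 'ready to merge' flag per subgroup
    private int nextGroup = 0;
    private final int numSubgroups;

    GroupSolveMergeCoordinator(int numSubgroups) {
        this.numSubgroups = numSubgroups;
        this.solved = new boolean[numSubgroups];
    }

    int getNextSubgroup() {                  // called by the solver threads
        groupLock.lock();
        try {
            return nextGroup < numSubgroups ? nextGroup++ : -1;
        } finally {
            groupLock.unlock();
        }
    }

    void setCompleteMergeGroup(int subgroupId) {   // mark a subgroup as solved
        mergeLock.lock();
        try {
            solved[subgroupId] = true;
        } finally {
            mergeLock.unlock();
        }
    }

    int getNextMergeGroup(int currentGroupId) {    // called by the merging main thread
        mergeLock.lock();
        try {
            int next = currentGroupId + 1;
            return (next < numSubgroups && solved[next]) ? next : -1;  // -1: not ready yet
        } finally {
            mergeLock.unlock();
        }
    }
}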

Runtime performance of parallel group, solve and merge

Speedup of Group Solve Merge PARA vs SEQ, group size 10-14, on I7 7700K 4(8) cores @ 5GHz

Figure 7-4 Speedup of runtimes of the group, solve and merge algorithm, parallel vs sequential, group size 10…14. Run on I7 7700k 4(8) cores @ 5GHz.

With partial parallelism of the grouping, full parallelism of the solving and simultaneous merging, we achieve speedups up towards a factor of 5. There are several outliers in Figure 7-4, which can be explained by the fact that some groups require more time to solve than others. If one subgroup dominates the runtime it is the equivalent of having a longer sequential part of the algorithm, and it will therefore lower the total speedup.

Another interesting result is how low the speedup is on brd14051 with group size 10. One explanation is that, since merging is done in sequence in both versions, a lower group size makes the solving faster but increases the number of merges:

numMerges = N / groupSize

The inverse proportion does make sense on paper, but real-world testing shows that the few additional merges caused by a lower group size don't affect the runtime much. Instead, the problem is that the in-sequence merging stays relatively constant in runtime throughout the lower group sizes. The reason is that the grouping and solving are so fast that the merging never has time to start and get ahead simultaneously.

Group, solve and merge, brd14051 approx. runtimes (seconds)

GroupSize    Sequential runtime(s)    Parallel runtime(s)    Time saved    Speedup
10           21.84s                   14.74s                 7.10s         1.48
11           50.66s                   20.30s                 30.36s        2.50
12           174.40s                  41.01s                 133.39s       4.25
13           787.18s                  198.04s                589.14s       3.97
14           4 605.02s                1 066.51s              3 538.51s     4.32

Table 7-1 Runtimes of the sequential and parallel group, solve and merge algorithms on the brd14051 dataset, including the time saved by the parallel version versus the sequential, and the speedup. Run on I7 7700k 4(8) cores @ 5GHz.

If the merging portion of brd14051 requires about 13 seconds, but the solving only requires 0.1 seconds to complete, then there are still at least 12.9 seconds left to merge after the solving is done. Therefore, it is only when groupSize > 10 that the merging has time to run simultaneously with the solving, and thereby reduce the runtimes on brd14051 by up to ~13s.

The big savings come, as expected, from the parallel solving. For higher group sizes (groupSize > 11) and the larger dataset (brd14051) the runtimes for each solve average out and reveal a speedup of about ~4 on our 4 core (8 logical) Intel I7-7700K CPU. This gives us a saving of about 59 minutes with group size 14 on brd14051, resulting in a total runtime of only ~18 minutes.

Parallel split and merge (SplitMergePara)

Determinism (same input, same output) is desirable for most algorithms, and the ones where it isn't are usually based on explicitly wanting a pseudo-random/unpredictable result. This creates a problem for our sequential version of split and merge. The original algorithm implements each improvement as soon as it is found, which isn't parallelizable while keeping a deterministic result. Therefore, we must modify the algorithm to better suit parallelism and preserve determinism.

Description

The most significant change to the algorithm is implementing a queue of candidate improvements rather than 'implement-as-you-go'. This means threads can find candidates rather than finding and implementing the improvements immediately. The candidate queue can then be sorted by highest savings, and the appropriate candidates implemented sequentially.

As the calculation of splits and merges is what consumes the most time, and implementing them takes almost none, we can assume there might be runtime improvements to be made. However, because the algorithm is changed, we might also affect the TSP cost performance of the algorithm. Through further real-world testing we will see that with these changes we have improved both the TSP cost performance and the runtimes compared to the sequential split and merge (see Chapter 7.3.2).

Figure 7-5 The SplitMergePara algorithm. The main thread starts K threads which search for candidate improvements while the main thread waits until they are complete. It then implements the candidate improvements (best first).

New data structures

Several new classes have been implemented for the parallel split and merge algorithm.

NodeC is a copy of a node which includes a reference to the original node (o) but with copied next and previous variables. This gives the threads the ability to split and merge on copies while keeping the original reference, so that when a candidate improvement is found the original node reference can be used, making later implementation by the main thread much easier.

Improvement contains the variables saved (how much cost the improvement can save), depth (maximum depth of the improvement) and startNode (used to ensure deterministic sorting). Each improvement also has a list of added/removed edges and a list of merges. The class is comparable on the variables saved and startNode, which can later be used to easily sort the improvement candidate list with Java's built-in sorting functions.

푀푒푟푔푒 contains the four nodes involved in a merge and the direction of the merge. 퐴 is merged with 퐵, and 퐶 is merged with 퐷.


class NodeC {
    Node o;              // reference to the original node in the tour
    NodeC next;
    NodeC prev;
    ...
}

class Improvement implements Comparable {
    double saved;        // total cost saved by this candidate improvement
    int depth;           // maximum depth reached while splitting
    Node startNode;      // used to ensure deterministic sorting
    LinkedList splits;   // added/removed edges
    LinkedList merges;   // merges to perform
    ...
}

class Merge {
    boolean direction;   // whether the merged loop must change direction
    Node a, b, c, d;     // a is merged with b, and c is merged with d
    ...
}

Javacode 7-1 New data structures for parallel split and merge algorithm.

Description

Initially the main method is called with a specific max depth, and search configuration. The main thread then creates a new global list of improvement candidates for this run called 𝑖푚푝푟표푣푒푚푒푛푡퐿𝑖푠푡. This will be used to add the candidates found by the threads which later will be run through and implemented by the main thread. It starts threads equal to the number of available cores to start searching for improvement candidates.

푛푢푚푇ℎ푟푒푎푑푠 = 퐴푣푎𝑖푙푎푏푙푒퐶표푟푒푠

The main thread then waits on all threads until they are complete.

Threads split and merging

Each thread initially calls the method getNextNode(), which is synchronized with a ReentrantLock and, when open, returns the A node to work on. The B node is then set to A.next. If there are no nodes left it returns null and the thread completes. The edge the thread is attempting to 'remove'/improve is therefore the A → B edge.

The thread will then create a copy of the tour with the new structure NodeC, create a new Improvement with startNode = A, and begin calculating a candidate improvement by running the algorithm on this copy. This means it can make changes to the copied tour while still knowing the references to the original tour.

The thread continues the same way as the sequential algorithm (Chapter 6.3) by first finding the best split forwards ('next') from the B node (the 'left side' of the edge it is working on), using one of the three (0, 1 and 2) search calculation methods:

퐒퐞퐚퐫퐜퐡 ퟎ ∶ 퐶푎푙푐퐶표푠푡(퐵 → 퐶)

퐒퐞퐚퐫퐜퐡 ퟏ ∶ 퐶푎푙푐퐶표푠푡(퐵 → 퐶) − 퐶푎푙푐퐶표푠푡(퐶 → 퐷)

퐒퐞퐚퐫퐜퐡 ퟐ ∶ 퐶푎푙푐퐶표푠푡(퐵 → 퐶) − 퐶푎푙푐퐶표푠푡(퐶 → 퐷) + 퐶푎푙푐퐶표푠푡(푝푟푒푣푁표푑푒 → 퐷)

As the left and right side can use different search methods we get a total of 9 different possible search configurations. Each run only uses one specific search configuration. As an example, the run 푆푝푙𝑖푡푀푒푟푔푒푃푎푟푎(0,0) will use the search calculation method 퐶푎푙푐퐶표푠푡(퐵 → 퐶) on both sides. See Table 6-3 for all possible search configurations.

The removed and added edges are stored in the splits edge list of the candidate improvement for the current run. The added and removed edges are calculated and added to the improvement's total saved value. It then recursively calls itself and continues with the same procedure, but searching backwards from the A node and setting the new D and B nodes. This continues until the algorithm reaches the max depth set, or the search for a split returns a negative result (e.g. the search from A gives B, or vice versa). At that point the current depth is added to the improvement.

Merging is done the same way as sequentially, again with the only exception being that the changes are done to the node copy, and the found merges are also stored in the improvements 푚푒푟푔푒푠 list. The saved cost from the merge is also added to the total saved value of the improvement. After the candidate improvement has been completed with all splits and merges, as well as the calculated saved value, it can be stored into the global 𝑖푚푝푟표푣푒푚푒푛푡퐿𝑖푠푡 by calling a method 푎푑푑퐼푚푝푟표푣푒푚푒푛푡(퐼푚푝푟표푣푒푚푒푛푡 𝑖푚푝). The method is synchronized by a 푅푒푒푛푡푟푎푛푡퐿표푐푘 as multiple threads might want to add improvements at the same time.


Sorting improvements to ensure determinism

When all threads are done, the main thread will continue work by first sorting the list of improvement candidates. The sorting cannot be only based on the saved value, as multiple improvements could have the same saved value. This means we need to use a second, unique variable to differentiate improvements with an equal saved value to ensure determinism.

We could use the startNode's id and simply say that if saved is equal, then the improvement whose node has the highest id is determined to be 'larger'. This does ensure determinism, but isn't based on anything internal to the algorithm or data structure. Instead, the algorithm makes a global copy of the tour, called Node compareTour, before implementing any improvements. Inside the compareTo method in the Improvement class we include the method compareNodePositions(Node a, Node b), which returns the id of the first of the two nodes Node a and Node b it finds in the global copy compareTour. Since compareTour doesn't change during the run, it will give a deterministic result when comparing two improvements by their startNodes (there is at most one improvement per startNode). The list is sorted in reverse to have the highest values first.

class Improvement implements Comparable {
    ...
    public int compareTo(Improvement e) {
        double comp = saved - e.saved;
        if (comp == 0) {
            // equal savings: tie-break on which startNode appears first in compareTour
            if (startNode.id == compareNodePositions(startNode, e.startNode)) {
                return 1;
            } else {
                return -1;
            }
        } else if (comp > 0) {
            return 1;
        } else {
            return -1;
        }
    }
}

Javacode 7-2 compareTo method inside the class Improvement, ensuring determinism when sorting by using the startNode of improvements being compared, and searching the global tour copy which returns the id of the first of the two nodes it finds.
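The compareNodePositions method itself is not listed above. A minimal sketch of the behaviour described (walking the unchanged compareTour copy and returning the id of whichever of the two nodes is met first) could look like this, with compareTour passed in explicitly rather than being a global field as in the thesis code:

// Returns the id of whichever of nodes a and b appears first when traversing
// the unchanged tour copy, giving a stable tie-break for equal savings.
static int compareNodePositions(Node a, Node b, Node compareTour) {
    Node current = compareTour;
    do {
        if (current.id == a.id) return a.id;
        if (current.id == b.id) return b.id;
        current = current.next;
    } while (current != compareTour);
    return a.id;                             // fallback; not reached for nodes in the tour
}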


Implementing improvements

After sorting we can start implementing candidate improvements. If a node is part of an implemented improvement, by either being added or removed in a split or being part of a merge, then it is flagged in a Boolean array as 'improved' for this run. This means the node has changed its state in the tour to the point where a remaining candidate improvement involving it might no longer be possible or beneficial. We could recalculate or check whether the improvement is still possible to implement, but it is fine to discard these candidates, because we know the full algorithm will be run at least one more time after an improvement has been made in a previous run.

Another pitfall is that the structure of the tour can be substantially changed affecting many candidate improvements even if the specific nodes haven’t been part of any improvement. This means the algorithm must check for each implementation of splits and merges to make sure that the target 퐶 node doesn’t cross with the target 퐷 node. If they do cross, then we can use the improvements lists to undo all splits and merges made for that specific improvement and discard it. If only the direction of parts of the tour has changed then the algorithm adjusts accordingly when implementing.

When all improvements are either implemented or discarded then the algorithm has completed one run through all nodes and returns true if any improvements have been implemented.
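A condensed sketch of this implementation phase is shown below. The helper methods (touchesFlaggedNode, applySplitsAndMerges, flagNodes) are hypothetical stand-ins for the bookkeeping described above; only the ordering (sort descending, skip candidates touching already changed nodes, discard candidates whose splits and merges no longer fit) reflects the algorithm.

import java.util.Collections;
import java.util.List;

static boolean implementCandidates(List<Improvement> improvementList, int n) {
    Collections.sort(improvementList, Collections.reverseOrder());  // highest savings first
    boolean[] improvedNode = new boolean[n];     // flags nodes changed during this run
    boolean anyImplemented = false;
    for (Improvement imp : improvementList) {
        if (touchesFlaggedNode(imp, improvedNode)) continue;   // a node already changed: discard
        if (!applySplitsAndMerges(imp)) continue;              // undone and discarded if C and D cross
        flagNodes(imp, improvedNode);
        anyImplemented = true;
    }
    return anyImplemented;                       // true forces at least one more run
}

// Hypothetical helpers standing in for the checks and pointer updates described above.
static boolean touchesFlaggedNode(Improvement imp, boolean[] flags) { return false; }
static boolean applySplitsAndMerges(Improvement imp) { return true; }
static void flagNodes(Improvement imp, boolean[] flags) { }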


Runtime performance of parallel split and merge

Speedup of SplitMergePara(0,0), (1,1) and (2,2), PARA vs SEQ, on I7 7700K 4(8) cores @ 5GHz

Figure 7-6 Speedup of runtimes of the SplitMergePara algorithm, parallel vs sequential, with search configurations (0,0), (1,1) and (2,2). Run on I7 7700k 4(8) cores @ 5GHz.

Comparing the new SplitMergePara running sequentially and in parallel gives predictable results for any non-trivial dataset, as the whole major workload (finding splits and merges) is fully parallelized. The speedup goes up towards 5 on our 4(8) core system; the runtimes themselves, however, are quite interesting:

SplitMergePara, brd14051 approx. runtimes (seconds)

Config    Sequential runtime(s)    Parallel runtime(s)    Time saved      Speedup
0,0       5 082.85s                1 247.78s              3 835.08s       4.07
1,1       71 233.30s               15 257.12s             55 976.18s      4.67
2,2       24 379.30s               5 411.75s              18 967.55s      4.50

Table 7-2 Runtimes of one run of sequential and parallel SplitMergePara on the brd14051 dataset, including the time saved by the parallel version versus the sequential, and the speedup. Run on I7 7700k 4(8) cores @ 5GHz.

Compared to the first version of split and merge, these results make the new algorithm appear to run worse. This is because the first algorithm worked by implementing each improvement immediately as it was found, shortening each subsequent search. As the algorithms are run on a non-improved nearest neighbour tour, the results are quite deceiving, giving us in many cases a worst-case runtime. However, as we will see in Chapter 8.5, the total runtime for the full improvement stack is in fact substantially improved with the new SplitMergePara.

Also, as mentioned before in Chapter 6.3.2, the group sizes created and the number of times the search terminates before reaching the max depth highly affect the runtimes. The group sizes alone can make runtimes more than 1000 times longer (worst case brd14051 with max depth 2, see Chapter 6.3.2) due to the increase in the number of pairs to compare when merging.

Conclusions

A trivial, top-level parallel implementation is possible and would result in a speedup of up to around 4. However, it would be too simplistic, it would leave no resources for improving the lower level algorithms, and it would only improve the total runtime over multiple runs, not the runtime of any specific run. Improving the runtime of a specific run is desirable on large datasets (like brd14051), where only one run is feasible to complete within a reasonable amount of time.

Parallel group, solve and merge is a two-step efficiency improvement over the sequential version. First the main workload of grouping and solving is parallelized into threads. Then the main thread starts merging as soon as the first groups are completed, in order to get a head start on the merging instead of waiting until all solving is completed.

The resulting speedups vary widely, from 1.5 to 5, because on certain datasets and with certain group sizes some subgroups take an extraordinary amount of time to solve. However, the algorithm itself isn't changed and preserves the same cost performance as the original sequential version.

Even though the speedup is improved, the cost and structure of the resulting tours from this algorithm aren't any better than those of the nearest neighbour algorithm, which is why it will not be used to produce the final tour results of this thesis in Chapter 8.

Parallel split and merge (SplitMergePara) changes the original sequential algorithm from an 'implement-as-you-go' approach into creating a candidate list of possible improvements and implementing them only after all possible improvements have been found. This is done to preserve determinism (same input, same output) when run in parallel. The sorting of the candidate list is crucial, as candidates which are equal in cost savings must be differentiated by a second variable to ensure no two candidates compare as equal, and therefore preserve determinism. When candidate improvements are implemented, the nodes participating in the improvements are flagged as improved, and any future candidates in the same run which include these nodes are discarded.

Speedup of the new algorithm run in parallel versus the new algorithm run sequentially is as expected fairly stable around 4-5, as the sequential implementation of candidates is extremely quick compared to the finding (splitting and merging) of candidate improvements.

The cost performance is also changed, and the first couple of runs will implement fewer changes than the original sequential algorithm. This is because the Boolean array protecting nodes results in many discarded candidate improvements (if one long edge is 'bad', we can expect several candidates to include it). However, as we intend to run all 9 possible search configurations and two different depths, we will see that the new algorithm doesn't require any more runs over the complete improvement stack than the original sequential version. This new parallel split and merge algorithm is what will be used alongside the other improvement algorithms in Chapter 8.

Results

As we have now described all of the algorithms created for this thesis, we can explore the most important part: how they perform together. For the specific problem chosen, the travelling salesman problem, the cost performance is the most important factor to consider. Only after the best cost performance has been achieved do we implement efficiency improvements, as we have shown through parallelization and other methods in Chapter 7.

First, we will begin by using one of the initial approximations outlined in Chapter 5 and 7.2. Then we will use an arrangement of the improvement algorithms outlined in Chapter 6 and 7.3 to improve the initial approximation until no further improvements can be found.

Selecting an initial approximation

We have two different algorithms to choose from for creating initial tours. The only way to truly know which one produces the best initial tours is to compare the final tours after all improvements have been made. This is because the structure of the initial tour, and how well it suits our improvement algorithms, is more important than its cost. But there are several properties we can compare:

Nearest neighbour

- Produces on average a better tour, but can easily get locked into producing many similar tours which for some datasets (Att532) give long paths which are far from optimal and hard to improve with our algorithms.

- Extremely quick and can therefore be run many times to get multiple initial tours. (all nearest neighbours)

- Trivial and undesirable in its possible parallelization. (Chapter 7.1)

Parallel Group, solve and merge

- Worse average tours, and rarely best tours (att48). Gives a larger variety of tours, for better or worse.


- Much slower for higher group sizes, even after parallelization.

- Interesting and much improved with parallel version. (Chapter 7.2)

Although group, solve and merge did in fact reach optimal earlier during the development of the improvement algorithms (in the case of our two smallest datasets, att48 and bier127), eventually the nearest neighbour algorithm, combined with running the improvements on all possible nearest neighbour tours, produced the best results on average. This is a new method which easily improves cost results, especially when bound by improvement-only algorithms (more initial tours and improvement runs give us more chances to reach optimal). However, group, solve and merge was more interesting in terms of exploring efficiency and parallelism.

Selecting improvement algorithms

For improvement we can use all three final improvement algorithms. However, swapping edges has problems working together with the other algorithms, which means it is never actually used.

Swapping edges

Although it is quick and highly effective in the short term, testing shows that it blocks better improvements that the other two algorithms would otherwise find, which is why it is never actually used to improve anything. The other algorithms improve the tour so much that if we run swapping edges after all the other algorithms are completely done, it never finds any possible improvement.

Path move

Moving a node into a better position in the tour has throughout development been very effective while having a relatively low impact on runtimes. However, when we run the algorithm multiple times for multiple different K’s (K being the number of nodes in the path to move), we eventually increase the runtimes to the point where they are no longer an inconsequential factor.


We have also found that it works well when running K ≤ 20 together with the parallel split and merge algorithm, and that it should only be run with K > 20 when the rest of the stack cannot find any more improvements.

Another possibility is the direction/order of each path move. We could either start at K = 1 and go up towards the limit we want to reach, or start at the limit and go down to K = 1. Testing does give different results, but we cannot conclusively say which direction is best.

Parallel Split and merge (SplitMergePara)

It will be shown to be the most important algorithm in producing this thesis’ lowest cost tours. With 9 different search configurations and any number of different max depths, the possibilities are many. However, through testing we have found it beneficial to run all 9 search configurations, as well as to use two different max depths (2 and 6). This results in 18 different runs of split and merge for each total run of the improvement stack.

Improvement stack

With all 18 versions of parallel split and merge, and 2 versions of path move (ignoring direction) we have 20 different ‘algorithms’ to run in the full improvement stack. That gives a possible number of orderings of:

Number of possible orderings = 20!
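For scale, 20! = 2 432 902 008 176 640 000 ≈ 2.4 × 10^18.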

While testing all possible orderings isn’t feasible, we can try some and see which of those work best on average.

The stack is run until no improvements can be made by any of the chosen algorithms.


boolean improvementsMade;
do {
    improvementsMade = false;
    int maxPath = Math.min((n / 2) + 1, 20);
    for (int i = 1; i <= maxPath; i++) {
        PathMove(i);
    }
    for (int depth = 2; depth <= 6; depth += 4) {   // the two max depths, 2 and 6
        if (SplitMergePara(2, 1, depth)) improvementsMade = true;
        if (SplitMergePara(2, 0, depth)) improvementsMade = true;
        if (SplitMergePara(1, 2, depth)) improvementsMade = true;
        if (SplitMergePara(0, 2, depth)) improvementsMade = true;
        if (SplitMergePara(2, 2, depth)) improvementsMade = true;
        if (SplitMergePara(1, 1, depth)) improvementsMade = true;
        if (SplitMergePara(0, 0, depth)) improvementsMade = true;
        if (SplitMergePara(1, 0, depth)) improvementsMade = true;
        if (SplitMergePara(0, 1, depth)) improvementsMade = true;
    }
    if (!improvementsMade) {
        maxPath = Math.min((n / 2) + 1, 300);
        for (int i = 1; i <= maxPath; i++) {
            if (PathMove(i)) improvementsMade = true;
        }
    }
} while (improvementsMade);

Javacode 8-1 Simplified javacode of the best average improvement stack, which is run until no improvements can be made. The code is simplified; most importantly it doesn’t include the current and previous run cut-off variable, which remembers which algorithm or configuration implemented the last improvement. With the cut-off we can skip the algorithms/configurations in the next run that come after the cut-off point, unless a new improvement is found in the current run. As an example, if algorithm C was the last to improve the tour in run 2, we can skip all algorithms after C in run 3, provided algorithms A-C in run 3 don’t implement any improvements.

Another interesting efficiency improvement is that we do not need to change the improvementsMade boolean even if PathMove(1…20) makes an improvement, because we always run PathMove(1…K) again at the end if no improvements have been made by the SplitMergePara runs. However, if that last path move run does yield an improvement we must run the full stack again.
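A minimal sketch of the cut-off idea described in the caption of Javacode 8-1 could look as follows. Here runStep and numSteps are placeholders for “run improvement step i and report whether it improved the tour” and the number of steps in the stack; this is an assumption, not the thesis implementation.

import java.util.function.IntPredicate;

// Illustrative sketch of the run cut-off - not the exact thesis code.
class ImprovementStackWithCutOff {
    // runStep.test(i) runs improvement step i (one PathMove or SplitMergePara
    // configuration) and returns true if it implemented an improvement.
    static void run(IntPredicate runStep, int numSteps) {
        int lastImproving = numSteps - 1;      // no cut-off in the very first run
        boolean improvementsMade;
        do {
            improvementsMade = false;
            int cutOff = lastImproving;
            for (int step = 0; step < numSteps; step++) {
                if (runStep.test(step)) {
                    improvementsMade = true;
                    lastImproving = step;      // cut-off point for the next run
                }
                // The steps after the cut-off found nothing in the previous run and
                // the tour is unchanged so far in this run, so they can be skipped.
                if (!improvementsMade && step == cutOff) break;
            }
        } while (improvementsMade);
    }
}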


TSP cost results

The results are categorized based on best average results (one configuration on all datasets) and best results (different configurations for different datasets).

Because we use the method of running all nearest neighbour to get N initial tours and subsequently run the full improvement stack on each, we can also review how many of the initial tours reach the optimal cost (‘#Optimal’), and what percentage of the N initial tours that is. For every dataset except a280, each optimal tour found is the exact same tour, as each of those datasets has only one optimal tour. As expected, the program finds several different optimal tours of the same optimal length for the a280 dataset.
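The outer loop over the N initial tours can be sketched roughly as below. The tour construction and improvement calls are passed in as placeholders, since the real method names are not shown here; this is an assumption about structure, not the thesis API.

import java.util.function.IntFunction;
import java.util.function.ToLongFunction;

// Illustrative outer loop over all nearest neighbour initial tours - not the thesis API.
class AllNearestNeighbourDriver {
    static double percentOptimal(int n, long knownOptimalCost,
                                 IntFunction<int[]> nearestNeighbourTour,  // start node -> initial tour
                                 ToLongFunction<int[]> improveAndCost) {   // full stack, returns final cost
        int optimalCount = 0;                                              // the '#Optimal' column
        for (int start = 0; start < n; start++) {
            int[] tour = nearestNeighbourTour.apply(start);
            long cost = improveAndCost.applyAsLong(tour);
            if (cost == knownOptimalCost) optimalCount++;
        }
        return 100.0 * optimalCount / n;   // percentage of the N initial tours reaching optimal
    }
}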

Best average TSP results

Best average TSP cost results (one stack configuration)

Datasets    Approximation   Optimal    % Difference   # Optimal
att48       10 628          10 628     0%             21 (43.75%)
bier127     118 282         118 282    0%             16 (12.60%)
a280        2 579           2 579      0%             44 (15.71%)
att532      27 871          27 686     0.67%          0 (0%)
brd14051    476 645         469 385    1.55%          -

Table 8-1 Best average cost results found based on the same improvement stack configuration as shown in Javacode 8-1. Brd14051 only run on one initial nearest neighbour tour.

By running one specific improvement stack configuration (Javacode 8-1) on our N nearest neighbour initial tours we manage to reach optimal solutions for 3 out of our 5 chosen datasets. This shows versatility in the algorithms, as each of the smaller datasets is quite different in structure.

As well as reaching the optimal cost, we manage to reach it multiple times for different initial tours. Att48 reaches the optimal tour 21 times out of the 48 initial tours. Bier127 reaches the optimal tour 16 times out of the 127 initial tours. A280 reaches the optimal tour cost 44 times out of the 280 initial tours, and 22 of them are different optimal tours.


The optimal tours for the att48 and bier127 datasets are:

Figure 8-1 Att48 optimal tour (cost 10 628) found. See Appendix A for the full-size image.

Figure 8-2 Bier127 optimal tour (cost 118 282) found. See Appendix B for the full-size image.

As mentioned before, there are many different possible optimal tours for the a280 dataset. The reason is mainly because of the grid-like structure of the nodes, and the many possible equal length edges. Here are 5 of the different optimal tours found for a280:

Figure 8-3 A280, 5 of the different optimal tours found (cost 2 579), plotted next to each other, where each tour is represented by a different colour. See Appendix C for a version with 1 tour.


Best TSP results

There are three different methods used to further improve the results of specific datasets.

1. Different ordering of the algorithms in the improvement stack. As previously mentioned we have tested up to 20 different orderings on our default datasets. However, the best average improvement stack we have found usually isn’t far away from the results of any other tested improvement stack ordering regardless of dataset.

2. Improving the longest edges before running the improvement stack. We can run SplitMerge (usually search configuration (2,0)) on the tour’s longest edges, where the minimum edge length is set to a factor K (usually 10) of the average edge cost; a sketch of this pre-pass is given after this list. For larger datasets this method can be quite effective.

minLongestEdge = averageEdgeCost * K

3. Using Group, Solve and Merge on all nearest neighbour initial tours (gives the best result for att532), or using the best nearest neighbour tour for some of the larger datasets.
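As referenced in point 2 above, a rough sketch of the long-edge pre-pass could look like the following. The distance matrix representation and the commented-out split and merge call are assumptions, not the thesis code.

// Illustrative long-edge pre-pass (method 2) - tour/distance representation is assumed.
class LongEdgePrePass {
    static void improveLongestEdges(int[] tour, double[][] dist, double k) {
        int n = tour.length;
        double totalCost = 0;
        for (int i = 0; i < n; i++) {
            totalCost += dist[tour[i]][tour[(i + 1) % n]];
        }
        double minLongestEdge = (totalCost / n) * k;   // averageEdgeCost * K, K usually 10
        for (int i = 0; i < n; i++) {
            if (dist[tour[i]][tour[(i + 1) % n]] >= minLongestEdge) {
                // Run split and merge from this long edge, typically configuration (2,0), e.g.:
                // splitMergePara(tour, i, 2, 0, depth);
            }
        }
    }
}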

We have also tested but discarded several poorly performing variants of SplitMergePara:

- Running all search configurations before implementing.

o Implementing only one improvement before redoing the search.

- Varying/increasing maximum depth.

Best TSP cost results (different configurations)

Datasets    Approximation   Optimal    % Difference   # Optimal
att48       10 628          10 628     0%             32 (66.67%)
bier127     118 282         118 282    0%             18 (14.17%)
a280        2 579           2 579      0%             52 (18.21%)
att532      27 747          27 686     0.22%          0 (0%)
brd14051    476 480         469 385    1.51%          -

Table 8-2 Best cost results found based on using either a different ordering of the algorithms in the improvement stack, improving the longest edges before the rest of the improvements, or using group, solve and merge. Brd14051 only run on one initial nearest neighbour tour.

Using the aforementioned methods, we can increase the number of initial tours N that reach optimal for our 3 smallest datasets, as well as reduce the distance from optimal of the best tour found for att532 (from 0.67% to 0.22%) and brd14051 (from 1.55% to 1.51%).


Figure 8-4 Att532, best tour found, 0.22% from optimal (cost 27 686). See Appendix C for the full-sized version.

Att532’s best result was achieved by running the group, solve and merge algorithm with group size 10 on all N nearest neighbour tours and then the full improvement stack.

Brd14051’s best result was achieved by improving the longest edges of cost averageEdgeCost * 10 on the nearest neighbour tour (node 0) before running the full improvement stack.

Figure 8-5 Brd14051 best tour found, 1.51% from optimal (cost 476 480). See Appendix D for the full-sized version.

TSP results on other relevant datasets

Best TSP cost results

Datasets    Approximation   Optimal    % Difference   # Optimal
att48       10 628          10 628     0%             32 (66.67%)
eil51       426             426        0%             3 (5.88%)
berlin52    7 542           7 542      0%             26 (50.00%)
st70        675             675        0%             21 (30.00%)
eil76       538             538        0%             24 (31.58%)
pr76        108 159         108 159    0%             51 (67.10%)
kroA100     21 282          21 282     0%             16 (16.00%)
kroB100     22 141          22 141     0%             45 (45.00%)
eil101      629             629        0%             1 (0.99%)
lin105      14 379          14 379     0%             105 (100.00%)
pr107       44 303          44 303     0%             30 (28.04%)
pr124       59 030          59 030     0%             68 (54.84%)
bier127     118 282         118 282    0%             18 (14.17%)
pr136       96 772          96 772     0%             7 (5.15%)
pr144       58 537          58 537     0%             129 (89.58%)
pr152       73 682          73 682     0%             9 (5.92%)
u159        42 080          42 080     0%             17 (10.69%)
ts225       126 643         126 643    0%             203 (90.22%)
pr226       80 369          80 369     0%             147 (65.04%)
pr264       49 135          49 135     0%             3 (1.14%)
a280        2 579           2 579      0%             51 (18.21%)
pr299       48 191          48 191     0%             40 (13.38%)
lin318      42 091          42 029     0.15%          0 (0.00%)
pr439       107 576         107 217    0.33%          0 (0.00%)
att532      27 747          27 686     0.22%          0 (0.00%)
pr1002      261 804         259 045    1.06%          -
vm1084      241 273         239 297    0.82%          -
vm1748      339 527         336 556    0.88%          -
u2319       236 323         234 256    0.88%          -
pr2392      382 650         378 032    1.22%          -
brd14051    476 480         469 385    1.51%          -

Table 8-3 Best cost results on all standard datasets (different configurations), as well as 26 other euc_2d, integer valued datasets. For datasets with n > 1000 we only run improvements on one initial tour, and for datasets with n <= 1000 we run on all N nearest neighbour initial tours.

Most of the results in Table 8-3 have been derived from running the standard best average improvement stack configuration. The exceptions are the further improved results on our standard datasets (att48, bier127, a280, att532 and brd14051, as described in Chapter 8.4.2), and some of the newly tested datasets (pr136, pr264, lin318, pr1002, vm1084, vm1748 and pr2392), which use the same methods outlined in Chapter 8.4.2.


From the results we can see that for all tested datasets of N < 300 our algorithms are able to reach the optimal solution.

TSP efficiency results

We have now found the best average improvement stack and tested it on most relevant datasets in the TSP library to get the final TSP tour cost performance. With this standard ordering set we can finally review the speedup between running the improvement stack with SplitMergePara in sequential versus parallel mode.

[Figure: speedup (y-axis 0-4) of the best average improvement stack per dataset (x-axis: att48, bier127, a280, att532, brd14051), SplitMergePara SEQ vs PARA, on an i7 7700K 4(8) cores @ 5GHz.]

Figure 8-6 Speedup of the best average improvement stack on N nearest neighbour initial tours (only 1 initial tour for brd14051), with sequential SplitMergePara vs parallel SplitMergePara. Run on an i7 7700K 4(8) cores @ 5GHz.

The total speedup of the improvement stack is naturally determined by the two algorithms run within the stack: PathMove, which is fully sequential, and SplitMergePara, which is almost fully parallel (the sequential part’s runtime cost is negligible, see Figure 7-6). Therefore, as expected, the total speedup is lower than the number of cores available because of the runtime of the sequential PathMove algorithm (Amdahl’s Law, see Chapter 2.3.4). The algorithms are also affected by the dataset’s structure, which can affect runtimes in a way that is not directly proportional to the dataset’s size (see Table 6-2), and this also correlates with the final speedup.

Another effect on the potential speedup for smaller datasets is that a specific call to the SplitMergePara algorithm might execute too quickly to fully utilize the parallelism. We can also expect runtimes of subsequent runs of the algorithm to drop once the tour has already been substantially improved.

Total runtimes of the best average improvement stack on AllNN initial tours

Dataset     SEQ SplitMergePara   PAR SplitMergePara   Time saved       Speedup
att48       1.19s                0.76s                0.43s            1.56
bier127     88.18s               31.12s               56.06s           2.83
a280        836.15s              350.03s              486.12s          2.39
att532      3 451.33s            1 326.20s            2 125.13s        2.60
brd14051    142 529.24s          37 655.11s           104 874.13s      3.78

Table 8-4 Total runtimes of the best average improvement stack on N nearest neighbour initial tours (only 1 initial tour for brd14051), with sequential SplitMergePara and parallel SplitMergePara, including time saved and speedup.

By parallelizing split and merge we have saved about 30 hours of total runtime on the brd14051 dataset. So, instead of implementing a trivial, top-level parallelization (Chapter 7.1), we have chosen to improve the runtime of the algorithm itself and consequently reduced the runtime from ~40 hours down to a more manageable ~10 hours. However, if we did implement the trivial, top-level parallelization we could get up to about 4 results in the same amount of time as the sequential version, potentially finding a better tour. But as mentioned before, we are mortal beings bound by time, and reducing the total runtime was considered far more important for the larger datasets.
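As a rough, back-of-the-envelope reading of Table 8-4 against Amdahl’s law (Chapter 2.3.4), and assuming the 4 physical cores are the relevant unit of parallelism, the measured brd14051 speedup of 3.78 corresponds to a parallel fraction p of about

S = 1 / ((1 - p) + p/4) = 3.78  =>  p = (4/3) * (1 - 1/3.78) ≈ 0.98

i.e. roughly 98% of the stack’s runtime on brd14051 lies in the parallel SplitMergePara part, consistent with PathMove’s sequential share being small for the largest dataset.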


Conclusions

The goals of this thesis have been to design algorithms to solve the symmetric travelling salesman problem and to improve their efficiency with parallelism. Our approach has been to find an initial approximation and further improve it with improvement only based algorithms. Through real-world testing we have compared both the cost and runtime performance of our algorithms, improved some of them with parallel versions, and examined how they can best work together.

TSP

Our algorithms have achieved good results, with optimal solutions found for three out of our five standard datasets, as well as for all tested datasets where n < 300 (22 different datasets). For all tested problems larger than 300 we managed to find best tours which were less than 2% worse than optimal. In real-world applications of the symmetric travelling salesman problem (e.g. logistics), the problem size can often be expected to lie within the range where the presented algorithms would most likely find an optimal solution (n < 300).

We have performed rigorous testing of both cost and runtime performance on all algorithms, as well as tested and evaluated how they affect each other. We have observed that changing the ordering of the algorithms and their configurations can produce very different results. Another observation is that a lower cost initial tour doesn’t necessarily produce a better end result, as a poor tour structure can impact our improvement only based algorithms substantially. Based on this testing we have also found the three algorithms which perform best together, and we have found a good ordering of these algorithms which produces the best-known results on average for our datasets. First, we use the method of finding All Nearest Neighbour initial tours (N different tours, each with a different starting node), then we improve all of the initial tours with a combination of the algorithms Path Move (paths of K nodes moved) and Parallel Split and Merge (nine different search configurations and two different depths).


Parallelism and efficiency

After all performance testing was done on our sequential algorithms we proceeded to improve efficiency by implementing parallel versions of the two algorithms with the highest runtimes.

The group, solve and merge algorithm was designed for parallelism and could therefore keep the same structure as the sequential version. The main workload of solving was divided into threads, and the merging was improved to be done simultaneously. However, due to its sub-par TSP performance it was not used further to generate the best average results.

Split and merge had to go through several changes to preserve determinism when run in parallel, including the addition of a candidate list of improvements rather than the original ‘implement-as-you-go’ approach. The main workload of splitting and merging is done fully in parallel, while the improvements are implemented sequentially afterwards. Its TSP performance was also slightly improved through these changes.

Both parallel versions were successful and reached speedups about equal to the number of CPU cores available.

Last words

We have been able to find the optimal solution of many datasets for the NP-hard travelling salesman problem within a reasonable amount of time, by using some relatively simple algorithms and some original and somewhat more complex algorithms (GroupSolveMerge and SplitMerge). The parallelization of two algorithms (and the discussion of a third, top-level parallelization) has shown significant runtime improvements, up towards the number of cores available, by utilizing the CPU cores of a modern computer system.


Further work

The experience of writing a Master’s thesis has been both challenging and wonderful. Somehow the most difficult part has been acknowledging that our time is limited, and that some interesting ideas couldn’t or shouldn’t be explored. However, the ideas we managed to complete and fully test were most definitely rewarding in both results and experience. Some ideas for further work are:

- Testing even more possible orderings to potentially find an even better average improvement stack.

- Investigate whether a better and more advanced initial tour construction would give better results for larger datasets.

- Implement other algorithms to work with the existing algorithms to increase performance (and try to reach optimal) on larger datasets. As an example, we could explore non-improvement only based algorithms to relieve the biggest limitation of the thesis, which is the ‘dead ends’ the improvement stack reaches when our algorithms cannot find any further improvements. One could also attempt to combine results from different runs to produce better end results.

- Evaluate efficiency performance on multiple different computer systems with different CPU’s with varying clock-rate, core count and computational power.

- Implement a version to be run on a supercomputer/computer cluster. As an example, the trivial, top-level implementation described in Chapter 7.1 might be suited for a cluster when running on large datasets (long individual computational time and data can be kept separate).

- Evaluate and/or implement parallel versions of existing algorithms and programs for solving TSP (Concorde etc).

- Explore other TSP-like problems, possibly with multiple (K) salesmen and/or multiple (P) start/end points, which simulate other real-world examples in logistics. (As an example, a company like Amazon has many warehouses with many delivery drivers for each)



Appendix A


Appendix B


Appendix C


Appendix D
