applied sciences

Article

Converting MST to TSP Path by Branch Elimination

Pasi Fränti 1,2,* , Teemu Nenonen 1 and Mingchuan Yuan 2

1 School of Computing, University of Eastern Finland, 80101 Joensuu, Finland; [email protected]
2 School of Big Data & Internet, Shenzhen Technology University, Shenzhen 518118, China; [email protected]
* Correspondence: [email protected].fi

Abstract: The travelling salesman problem (TSP) has been widely studied for the classical closed loop variant, but less attention has been paid to the open loop variant. An open loop solution has the property of being also a spanning tree, although not necessarily the minimum spanning tree (MST). In this paper, we present a simple branch elimination algorithm that removes the branches from the MST by cutting one link and then reconnecting the resulting subtrees via selected leaf nodes. The number of iterations equals the number of branches (b) in the MST. Typically, b << n, where n is the number of nodes. With the O-Mopsi and Dots datasets, the algorithm reaches gaps of 1.69% and 0.61%, respectively. The algorithm is suitable especially for educational purposes by showing the connection between MST and TSP, but it can also serve as a quick approximation for more complex metaheuristics whose efficiency relies on the quality of the initial solution.

Keywords: traveling salesman problem; minimum spanning tree; open-loop TSP; Christofides

Citation: Fränti, P.; Nenonen, T.; Yuan, M. Converting MST to TSP Path by Branch Elimination. Appl. Sci. 2021, 11, 177. https://dx.doi.org/10.3390/app11010177

Received: 23 November 2020; Accepted: 24 December 2020; Published: 27 December 2020

1. Introduction

The classical closed loop variant of the travelling salesman problem (TSP) visits all the nodes and then returns to the start node, minimizing the length of the tour. Open loop TSP is a slightly different variant, which skips the return to the start node. Both are NP-hard problems [1], which means that finding the optimal solution fast can be done only for very small inputs.

The open loop variant appears in a mobile-based orienteering game called O-Mopsi (http://cs.uef.fi/o-mopsi/) where the goal is to find a set of real-world targets with the help of smartphone navigation [2]. This includes three challenges: (1) deciding the order of visiting the targets, (2) navigating to the next target, and (3) physical movement. Unlike in classical orienteering, where the order of visiting the targets is fixed, O-Mopsi players optimize the order while playing. Finding the optimal order is not necessary but desirable if one wants to minimize the tour length. However, the task is quite challenging for humans even for short problem instances. In [2], only 14% of the players managed to solve the best

order among those who completed the tour. Most players did not even realize they were dealing with a TSP problem.

Minimum spanning tree (MST) and TSP are closely related algorithmic problems. Specifically, an open loop TSP solution is also a spanning tree, but not necessarily the minimum spanning tree; see Figure 1. The solutions have the same number of links (n − 1) and they both minimize the total weight of the selected links. The difference is that TSP does not have any branches. This connection is interesting for two reasons. First, finding the optimal solution for TSP requires exponential time, while the optimal solution for MST can be found in polynomial time using greedy algorithms like Prim and Kruskal, with time complexity very close to O(n²) with an appropriate data structure.

The connection between the two problems has inspired researchers to develop heuristic algorithms for the TSP by utilizing the minimum spanning tree. A classical schoolbook algorithm is Christofides [3,4], which first creates an MST. More links are then added so


that the Euler tour can be constructed. The travelling salesman tour is obtained from the Euler tour by removing nodes that are visited repeatedly. This algorithm is often used as a classroom example because it provides an upper limit on how far the solution can be from the optimal solution in the worst case. Even if the upper limit, 50%, is rather high for real-life applications, it serves as an example of the principle of approximation algorithms.
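Since every method discussed in this paper starts from, or mimics, the MST construction, the following is a minimal sketch of Prim's algorithm with the simple O(n²) array implementation mentioned above. The function names and the toy instance are our own illustration, not from the paper.

```python
import math

def prim_mst(points):
    """O(n^2) Prim's algorithm on a complete Euclidean graph.
    Returns the MST as a list of (parent, child) index pairs."""
    n = len(points)
    d = lambda a, b: math.dist(points[a], points[b])
    in_tree = [False] * n
    best = [math.inf] * n        # cheapest known link into the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # pick the cheapest node not yet in the tree (linear scan)
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u))
        for v in range(n):       # relax links out of the newly added node
            if not in_tree[v] and d(u, v) < best[v]:
                best[v] = d(u, v)
                parent[v] = u
    return edges

mst = prim_mst([(0, 0), (1, 0), (2, 0), (1, 1)])   # n - 1 = 3 links
```

The two nested linear scans give the O(n²) behaviour; a heap-based variant is preferable for sparse graphs.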

Figure 1. Minimum spanning tree (MST) (left) and open loop travelling salesman problem (TSP) solutions (right) for the same problem instance. The branching nodes are shown in blue, and the links that are different in the two solutions have a thicker black line. In addition to the branches, there are only two other places (red circles) where the two solutions differ. Among all links, 29/37 = 78% are the same in both solutions.

MST has also been used to estimate a lower limit for the optimal TSP solution in [5], and to estimate the difficulty of a TSP instance by counting the number of branches [6]. Prim's and Kruskal's algorithms can also be modified to obtain sub-optimal solutions for TSP. A class of insertion algorithms was studied in [7]. They iteratively add new nodes to the current path. The variant that adds a new node to either end of the existing path can be seen as a modification of Prim's algorithm where additions are allowed only at the ends of the path. Similarly, an algorithm called Kruskal-TSP [8] modifies Kruskal's algorithm so that it allows merging two spanning trees (paths in the case of TSP) only by their end points.

In this paper, we introduce a more sophisticated approach to utilize the minimum spanning tree to obtain a solution for the travelling salesman problem. The algorithm, called MST-to-TSP, creates a minimum spanning tree using any algorithm such as Prim or Kruskal. However, instead of adding redundant links like Christofides, we detect branching nodes and eliminate them by a series of link swaps; see Figure 2. A branching node is a node that connects more than two other nodes in the tree [6]. We introduce two strategies. First, a simple greedy variant removes the longest of all links connecting to any of the branching nodes. The resulting sub-trees are then reconnected by selecting the shortest link between leaf nodes. The second strategy, all-pairs, considers all possible removal and reconnection possibilities and selects the best pair.

We test the algorithm with two datasets consisting of open loop problem instances with known ground truth: O-Mopsi [2] and Dots [9]. Experiments show that the all-pairs variant provides average gaps of 1.69% (O-Mopsi) and 0.61% (Dots). These are clearly better than those of the existing MST-based variants, including Christofides (12.77% and 9.10%), Prim-TSP (6.57% and 3.84%), and Kruskal-TSP (4.87% and 2.66%). Although sub-optimal, the algorithm is useful for providing a reasonable estimate of the TSP solution. It is also not restricted to complete graphs with triangular inequality like Christofides. In addition, we also test a repeated randomized variant of the algorithm. It reaches results very close to the optimum; averages of 0.07% (O-Mopsi) and 0.02% (Dots).

Figure 2. Idea of the MST-to-TSP algorithm. Branching points and the attached links are first detected. At every step, one of these links is removed and the separated trees are merged by connecting two selected leaf nodes. In this example, two steps are needed to reach the solution of the open-loop TSP.

One of the main motivations for developing heuristic algorithms has been to have a polynomial time approximation algorithm for TSP. Approximation algorithms guarantee that the gap between the heuristic and optimal results has a proven upper limit, but these limits are usually rather high: 67% [10], 62% [11], 50% [4], 50%−ε [12], and 40% [13]. The variant in [14] was speculated to have a limit as low as 25% based on empirical observations. The results generalize to the open-loop case so that if there exists an α-approximation for the closed-loop TSP, there exists an (α+ε)-approximation for the open-loop TSP [15,16]. However, for practical purposes, all of these are high. Gaps between 40% and 67% are likely to be considered insufficient for industrial applications.

The second motivation for developing heuristic algorithms is to provide an initial solution for more complex algorithms. For example, the efficiency of local search and other metaheuristics highly depends on the initial solution. Starting from a poor-quality solution can lead to much longer iterations, or even prevent reaching a high-quality solution with some methods. Optimal algorithms like branch-and-cut also strongly depend on the quality of the initialization. The better the starting point, the more can be cut out by branch-and-cut. One undocumented rule of thumb is that the initial solution should be as good as ~1% gap in order to achieve significant savings compared to exhaustive search.

The rest of the paper is organized as follows. Section 2 reviews the existing algorithms, focusing on those based on the minimum spanning tree. The new MST-to-TSP algorithm is then introduced in Section 3. Experimental results are reported in Section 4 and conclusions drawn in Section 5.

2. Heuristic Algorithms for TSP

We review the common heuristics found in the literature. We consider the following algorithms:
• Nearest neighbor [17,18].
• Greedy/Prim-TSP [17,18].
• Kruskal-TSP [8].
• Christofides [3,4].
• Cross removal [19].

Nearest neighbor (NN) is one of the oldest heuristic methods [17]. It starts from a random node and then constructs the tour stepwise by always visiting the nearest unvisited node until all nodes have been reached. It can lead to inferior solutions, such as 25% worse than the optimum in the example in Figure 3. The nearest neighbor strategy is often applied by human problem solvers in real-world environments [6]. A variant of nearest neighbor in [18] always selects the shortest link as long as new unvisited nodes are found. After that, it switches to the greedy strategy (see below) for the remaining unvisited nodes. In [9], the NN-algorithm is repeated n times, starting from every possible node, and then the shortest tour of all n alternatives is selected. This simple trick would avoid the poor choice shown in Figure 3.
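As an illustration (our sketch, not the code of the cited works), the basic nearest neighbor construction and the repeat-from-every-start variant can be written as:

```python
import math

def nn_tour(points, start):
    """Open-loop nearest neighbor: always move to the closest unvisited node."""
    unvisited = set(range(len(points))) - {start}
    tour = [start]
    while unvisited:
        nxt = min(unvisited, key=lambda j: math.dist(points[tour[-1]], points[j]))
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour

def tour_length(points, tour):
    return sum(math.dist(points[a], points[b]) for a, b in zip(tour, tour[1:]))

def repeated_nn(points):
    """Repeat NN from every possible start node and keep the shortest tour."""
    return min((nn_tour(points, s) for s in range(len(points))),
               key=lambda t: tour_length(points, t))
```

The repetition multiplies the running time by n but removes the dependence on an unlucky start node.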
In [14], the algorithm is repeated using all possible links as the starting point instead of all possible nodes. This increases the number of choices up to n·(n − 1) in the case of a complete graph.

Greedy method [17,18] always selects the unvisited node having the shortest link to the current tour. It is similar to the nearest neighbor except that it allows adding the new node to both ends of the existing path. We call this Prim-TSP because it constructs the solution just like Prim except that it does not allow branches. In the case of complete graphs, nodes can also be added in the middle of the tour by adding a detour via the newly added node. A class of insertion methods was studied in [7], showing that the resulting tour is no longer than twice the optimal tour.

Kruskal-TSP [8] also constructs a spanning tree like Prim-TSP. The difference is that while Prim operates on a single spanning tree, Kruskal operates on a set of trees (a spanning forest). It therefore always selects the shortest possible link as long as it merges two different spanning trees.
To construct a TSP path instead of the MST, it also restricts the connections to the end points only. However, while Prim-TSP has only two possibilities for the next connection, Kruskal-TSP has potentially many more: two for every spanning tree in the forest. It is therefore expected to make fewer sub-optimal selections; see Figure 4 for an example. The differences between NN, Prim-TSP, and Kruskal-TSP are illustrated in Figure 5.
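The end-point-restricted Kruskal construction described above can be sketched as follows (our illustration of the idea in [8], not the original code): links are processed shortest first and accepted only if both end points still have degree at most one and belong to different path fragments.

```python
import math
from itertools import combinations

def kruskal_tsp(points):
    """Open-loop Kruskal-TSP: grow a spanning forest of paths."""
    n = len(points)
    links = sorted(combinations(range(n), 2),
                   key=lambda e: math.dist(points[e[0]], points[e[1]]))
    degree = [0] * n
    frag = list(range(n))                 # union-find over path fragments

    def find(x):
        while frag[x] != x:
            frag[x] = frag[frag[x]]       # path halving
            x = frag[x]
        return x

    path = []
    for a, b in links:
        # only end points (degree <= 1) of two different fragments may be joined
        if degree[a] < 2 and degree[b] < 2 and find(a) != find(b):
            frag[find(a)] = find(b)
            degree[a] += 1
            degree[b] += 1
            path.append((a, b))
            if len(path) == n - 1:        # open loop: n - 1 links suffice
                break
    return path
```

The degree test is what turns plain Kruskal into a path builder: interior nodes of a fragment can never receive a third link.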

Figure 3. Nearest neighbor algorithm easily fails if unlucky with the selection of the starting point.

(Figure 4 panels: After 9 merges; After 14 merges; Final result; Optimum.)

Figure 4. TSP-Kruskal [8] selected the optimal choice for the first eight merges, but the ninth merge ends up with a sub-optimal result. Overall, only one sub-optimal choice is included, and in this case, it could be fixed by simple local search operations like 2-opt or link swap [9].

Figure 5. Differences between the Nearest Neighbor, Greedy, and Kruskal-TSP after the fourth merge. Nearest neighbor continues from the previously added point. Prim-TSP can continue from both ends. Kruskal has several sub-solutions that all can be extended further.

Classical Christofides algorithm [3,4] includes three steps: (1) create the MST, (2) add links and create an Euler tour, and (3) remove redundant paths by shortcuts. Links are added by detecting nodes with odd degree and creating new links between them. If there are n nodes with odd degree (odd nodes), then n/2 new links are added. To minimize the total length of the additional links, the perfect matching can be solved by the Blossom algorithm, for instance [20]. An example of Christofides is shown in Figure 6. In [12], a variant of Christofides was proposed by generating an alternative spanning tree that has a higher cost than the MST but fewer odd nodes. According to [21], this reduces the number of odd nodes from 36–39% to 8–12% in the case of TSPLIB and VLSI problem instances.

Most of the above heuristics apply both to the closed-loop and open-loop cases. Prim and Kruskal are even better suited for the open-loop cases, as the last link in the closed loop case is forced and is typically a very long, sub-optimal choice. Christofides, on the other hand, was designed strictly for the closed-loop case due to the creation of the Euler tour. It therefore does not necessarily work well for the open-loop case, where it can be significantly worse.

There are two strategies to create an open-loop solution from a closed-loop solution. A simple approach is to remove the longest link, but this does not guarantee an optimal result; see Figure 7. Another strategy is to create an additional pseudo node, which has an equal (large) distance to all other nodes [9]. Any closed-loop algorithm can then be applied, and the optimization will be performed merely based on the other links except those two

that will be connecting to the pseudo node, which are all equal. These two will become the start and end nodes of the corresponding open-loop solution. If the algorithm provides the optimal solution for the closed-loop problem, it was shown to be optimal also for the corresponding open-loop case [1].

Figure 6. Christofides operates by adding links so that an Euler tour can be achieved. However, the quality of the result depends on the order in which the links are travelled. In this case, it achieves the optimal choice for the closed loop solution but not for the open loop case.
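The pseudo-node transformation can be sketched as follows. All names here are our own illustration; `nn_closed` is only a placeholder closed-loop solver, and any closed-loop heuristic could be plugged in instead.

```python
import math

def open_loop_via_pseudo(points, closed_loop_solver):
    """Pseudo-node strategy: add a node with equal (large) distance to all
    others, solve the closed-loop TSP, then cut the tour at the pseudo node."""
    n = len(points)
    big = 10 * sum(math.dist(p, q) for p in points for q in points)

    def dist(i, j):
        if n in (i, j):                      # any link to the pseudo node
            return 0.0 if i == j else big
        return math.dist(points[i], points[j])

    tour = closed_loop_solver(n + 1, dist)   # closed loop over n + 1 nodes
    k = tour.index(n)
    return tour[k + 1:] + tour[:k]           # drop pseudo node, open the loop

def nn_closed(n, dist):
    """Placeholder closed-loop solver: plain nearest neighbor from node 0."""
    tour, unvisited = [0], set(range(1, n))
    while unvisited:
        tour.append(min(unvisited, key=lambda j: dist(tour[-1], j)))
        unvisited.remove(tour[-1])
    return tour
```

The two nodes adjacent to the pseudo node in the closed tour become the end points of the resulting open path.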


(Figure 6 panels: MST + added links; Euler tour; After shortcuts. Figure 7 label: Pseudo node (Christofides).)

Figure 7. Two strategies for applying closed-loop algorithms to the open-loop cases. The first strategy (top) simply removes the longest link, but this does not guarantee optimality even if the closed-loop solution is optimal. The second strategy adds a pseudo node with equal distance to all other nodes. The consequence is that there are two long links connected to the pseudo node that can be omitted from the closed-loop solution. This does not affect the optimization at all due to the equal weights. The nodes linked to the pseudo node will become the start and end nodes of the open-loop solution.

However, when a heuristic algorithm is applied, the pseudo node strategy may not always work as expected. First, if the weights to the pseudo node are small (e.g., zeros), then algorithms like MST would link all nodes to the pseudo node, which would make no sense for Christofides. Second, if we choose large weights for the pseudo node, it would be connected to the MST last. However, as the weights are all equal, the choice would be arbitrary. As a result, the open loop case will not be optimized any better with the pseudo node than by just removing the longest link. Ending up with a good open-loop solution by Christofides seems merely a question of luck. Nevertheless, we use the pseudo node

strategy as it is the de facto standard, due to it guaranteeing optimality for the optimal algorithm at least.

The existing MST-based algorithms take one of two approaches: (1) modify the MST solution to obtain a TSP solution [3,4]; (2) modify an existing MST algorithm to design an algorithm for TSP [7,8,16,17]. In this paper, we follow the first approach and take the spanning tree as our starting point.

Among other heuristics we mention local search. It aims at improving an existing solution by simple operators. One obvious approach is Cross removal [19], which is based on the fact that an optimal tour cannot cross itself. The heuristic simply detects any cross along the path and, if found, eliminates it by re-connecting the links involved as shown in Figure 8. It can be shown that the cross removal is a special case of the more general 2-opt heuristic [22]. Another well-known operator is relocation, which removes a node from the path and reconnects it to another part of the path. The efficiency of several operators has been studied in [9] within an algorithm called Randomized mixed local search.

While local search is still the state-of-the-art and the focus has been mainly on operators such as 2-opt and its generalized variant k-opt [23,24], other metaheuristics have also been considered. These include tabu search [25], simulated annealing and combined variants [26–28], genetic algorithms [29,30], ant colony optimization [31], and harmony search [32].


Figure 8. Cross removal heuristic to improve an existing tour. The four end points involved in the cross are re-organized by swapping the end point of the first link with the start point of the second link.
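The cross removal described above can be sketched as repeated 2-opt segment reversals on an open path: whenever replacing two links by their "uncrossed" counterparts shortens the path, the segment between them is reversed. This is an illustrative sketch with our own helper names, not the implementation used in [9] or [19].

```python
def path_length(path, dist):
    """Total length of an open path given a distance matrix."""
    return sum(dist[path[i]][path[i + 1]] for i in range(len(path) - 1))

def cross_removal(path, dist):
    """2-opt style cross removal on an open path: whenever replacing links
    (i,i+1) and (j,j+1) by (i,j) and (i+1,j+1) shortens the path,
    reverse the segment between them (this removes the crossing)."""
    path = list(path)
    improved = True
    while improved:
        improved = False
        for i in range(len(path) - 2):
            for j in range(i + 1, len(path) - 1):
                old = dist[path[i]][path[i + 1]] + dist[path[j]][path[j + 1]]
                new = dist[path[i]][path[j]] + dist[path[i + 1]][path[j + 1]]
                if new < old - 1e-12:
                    path[i + 1:j + 1] = reversed(path[i + 1:j + 1])
                    improved = True
    return path
```

On a unit square visited in crossing order, one reversal removes the cross and shortens the path from about 3.83 to 3.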

3. MST-to-TSP Algorithm

The proposed algorithm resembles Christofides in that it also generates a minimum spanning tree to give a first estimate of the links that might be needed in the TSP. Since we focus on the open-loop variant, it is good to notice that the TSP solution is also a spanning tree, although not the minimum one. However, while Christofides adds redundant links to create an Euler tour, which are then pruned by shortcuts, we do not add anything. Instead, we change existing links by a series of link swaps. Specifically, we detect branches in the tree and eliminate them one by one. This is continued until the solution becomes a TSP path, i.e., a spanning tree without any branches. Pseudo codes of the variants are sketched in Algorithms 1 and 2.

Algorithm 1: Greedy variant
MST-to-TSP(X) → TSP:
  T ← GenerateMST(X);
  WHILE branches DO
    e1 ← LongestBranchLink(T);
    e2 ← ShortestConnection(T);
    T ← SwapLink(T, e1, e2);
  RETURN(T);
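A possible Python rendering of the greedy variant, assuming the helper behavior described in Section 3.1: branch links are links touching a node of degree three or more, the longest one is cut, and the two resulting subtrees are reconnected by the cheapest leaf-to-leaf link. Data structures and names are ours, not the authors' implementation.

```python
def greedy_mst_to_tsp(links, dist, n):
    """Sketch of Algorithm 1 (greedy variant): cut the longest link that
    touches a branching node (degree >= 3), then reconnect the two
    resulting subtrees by the shortest link between their leaf nodes."""
    adj = {v: set() for v in range(n)}
    for u, v in links:
        adj[u].add(v)
        adj[v].add(u)

    def component(start):                  # nodes reachable from `start`
        seen, stack = {start}, [start]
        while stack:
            x = stack.pop()
            for w in adj[x]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return seen

    while True:
        branch_links = [(u, v) for u in adj if len(adj[u]) >= 3 for v in adj[u]]
        if not branch_links:
            return adj                     # no branches left: a TSP path
        u, v = max(branch_links, key=lambda e: dist[e[0]][e[1]])
        adj[u].discard(v)
        adj[v].discard(u)                  # cut -> two disjoint subtrees
        side = component(u)
        leaves = [w for w in adj if len(adj[w]) <= 1]
        a, b = min(((a, b) for a in leaves for b in leaves
                    if (a in side) != (b in side)),
                   key=lambda e: dist[e[0]][e[1]])
        adj[a].add(b)
        adj[b].add(a)                      # cheapest leaf-to-leaf reconnection
```

Because cutting a tree link always leaves two subtrees, the swap preserves the tree invariant, and every iteration resolves one branch, matching the claim that the number of iterations equals the number of branches.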

Algorithm 2: All-pairs variant
MST-to-TSP(X) → TSP:
  T ← GenerateMST(X);
  WHILE branches DO
    e1, e2 ← OptimalPair(T);
    T ← SwapLink(T, e1, e2);
  RETURN(T);
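The all-pairs variant can be sketched analogously: every branch-link removal is tried, every leaf-to-leaf reconnection across the resulting cut is priced, and the (removal, addition) pair with the smallest length increase is applied. Again an illustrative sketch under our own data structures, not the authors' implementation.

```python
def all_pairs_mst_to_tsp(adj, dist):
    """Sketch of Algorithm 2 (all-pairs variant): evaluate every valid
    (removal, addition) link pair and apply the one that increases the
    path length least; repeat until no branching node remains.
    `adj` is a dict mapping node -> set of neighbours in the tree."""
    def reachable(start):
        seen, stack = {start}, [start]
        while stack:
            x = stack.pop()
            for w in adj[x]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return seen

    while any(len(adj[v]) >= 3 for v in adj):
        best = None
        for u in list(adj):
            if len(adj[u]) < 3:
                continue
            for v in list(adj[u]):                 # candidate link to remove
                adj[u].discard(v)
                adj[v].discard(u)
                side = reachable(u)
                leaves = [w for w in adj if len(adj[w]) <= 1]
                for a in leaves:
                    for b in leaves:
                        if a in side and b not in side:
                            delta = dist[a][b] - dist[u][v]
                            if best is None or delta < best[0]:
                                best = (delta, u, v, a, b)
                adj[u].add(v)
                adj[v].add(u)                      # undo the trial removal
        _, u, v, a, b = best
        adj[u].discard(v)
        adj[v].discard(u)
        adj[a].add(b)
        adj[b].add(a)
    return adj
```

The joint choice is what allows this variant to accept a cheaper removal when it enables a much cheaper addition, which the greedy variant cannot do.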

Both variants start by generating a minimum spanning tree by any algorithm, such as Prim's or Kruskal's. Prim grows the tree from scratch: in each step, it selects the shortest link that connects a new node to the current tree. Kruskal instead grows a forest by selecting the shortest link that connects two different trees. Actually, we do not need to construct the minimum spanning tree; some sub-optimal spanning tree might also be suitable [33]. For Christofides, it would be beneficial to have a tree with fewer nodes of odd degree [12]. In our algorithm, having a spanning tree with fewer branches might be beneficial. However, to keep the algorithm simple, we do not study this possibility further in this paper.

Both variants operate on the branches, which can be trivially detected during the creation of the MST simply by maintaining a degree counter when links are added. Both algorithms also work sequentially by resolving the branches one by one. They differ only in the way the two links are selected for the next swap; all other parts of the algorithm variants are the same. The number of iterations equals the number of branches in the spanning tree. We will next study the details of the two variants.
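The branch detection described above can be folded into Prim's algorithm at no extra cost by maintaining a degree counter when links are added; the branching nodes are then simply the nodes of degree three or more. A minimal sketch (our own code, for a complete graph given as a distance matrix):

```python
def prim_mst(dist):
    """Prim's algorithm on a complete graph given as a distance matrix.
    Returns the MST links and the degree of each node, from which the
    branching nodes (degree >= 3) are read off directly."""
    n = len(dist)
    in_tree = [False] * n
    best = [float("inf")] * n          # cheapest known link into the tree
    parent = [-1] * n
    in_tree[0] = True
    for j in range(1, n):
        best[j], parent[j] = dist[0][j], 0
    links, degree = [], [0] * n
    for _ in range(n - 1):
        u = min((j for j in range(n) if not in_tree[j]), key=best.__getitem__)
        in_tree[u] = True
        links.append((parent[u], u))
        degree[parent[u]] += 1         # degree counters maintained on the fly
        degree[u] += 1
        for j in range(n):
            if not in_tree[j] and dist[u][j] < best[j]:
                best[j], parent[j] = dist[u][j], u
    branching = [v for v in range(n) if degree[v] >= 3]
    return links, degree, branching
```

For a chain of collinear points the branching list is empty (the MST is already a TSP path), while a plus-shaped point set yields one branching node of degree four.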

3.1. Greedy Variant

The simpler variant follows the spirit of Prim and Kruskal in the sense that it is also greedy. The selection is split into two subsequent choices. First, we consider all links connecting to any branching node and simply select the longest. This link is then removed. As a result, we will have two disconnected trees. Second, we reconnect the trees by considering all pairs of leaf nodes such that one is from the first subtree and the other from the second subtree. The shortest link is selected.

Figure 9 demonstrates the algorithm for a graph with n = 10 nodes. In total, there are two branching nodes and six links attached to them. The first removal creates subtrees of size n1 = 9 (3 leaves) and n2 = 1 (1 leaf), providing 3 possibilities for the re-connection. The second removal creates subtrees of size n1 = 6 (2 leaves) and n2 = 4 (2 leaves). The result is only slightly longer than the minimum spanning tree.

3.2. All-Pairs Variant

The second variant optimizes the removal and addition jointly by considering all pairs of links that would perform a valid swap. Figure 10 demonstrates the process. There are 6 removal candidates, and each has 3–6 possible candidates for the addition. In total, we have 27 candidate pairs. We choose the pair that increases the path length least. Most swaps increase the length of the path, but theoretically it is possible that some swap might also decrease the total path length; this is rare though.

The final result is shown in Figure 11. With this example, the variant finds the same solution as the greedy variant. However, the parallel chains example in Figure 12 shows a situation where the all-pairs variant works better than the greedy variant. Figure 13 summarizes the results of the main MST-based algorithms. Nearest neighbor works poorly because it depends on the arbitrary start point. If all possible start points were considered, it would reach the same result as Prim-TSP, Kruskal-TSP, and the greedy variant of MST-to-TSP. In this example, the all-pairs variant of MST-to-TSP is the only heuristic that finds the optimal solution, which includes the detour in the middle of the longer route.

Figure 9. Example of the greedy variant of MST-to-TSP. At each step, the longest remaining branch link is removed. The two trees are then merged by selecting the shortest of all pairwise links across the trees.


Figure 10. All possible removal-addition pairs.

There are (2·3) + (1·3) + (2·3) + (2·3) + (1·3) + (1·3) = 27 candidates in total.

Figure 11. Results of the various algorithms for the toy example.

Figure 12. Example of the Greedy and All-pairs variants. Greedy removal leads to the same sub-optimal result as Kruskal-TSP, while by allowing a shorter link to be removed, the optimal solution can be easily reached.

Figure 13. Parallel chains where only MST-to-TSP (All-pairs variant) finds the optimal path.

3.3. Randomized Variant

The algorithm can be further improved by repeating it several times. This improvement is inspired by [34], where the k-means clustering algorithm was improved significantly by repeating it 100 times. The only requirement is that the creation of the initial solution includes randomness so that it produces different results every time. In our randomized variant, the final solution is the best result out of the 100 trials. The idea was also applied to Kruskal-TSP and called Randomized Kruskal-TSP [8].

There are several ways to randomize the algorithm. We adopt the method from [8] and randomize the creation of the MST. When selecting the next link for the merge, we randomly choose one out of the three shortest links. This results in sub-optimal choices. However, as we have a complete graph, there are usually several virtually as good choices.
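The randomized MST creation described above amounts to a small change in Prim's algorithm: instead of always taking the single shortest cross link into the tree, one of the k = 3 shortest is chosen at random. A sketch under our own names (the `rng` parameter is ours, added for reproducibility):

```python
import random

def randomized_prim(dist, k=3, rng=random):
    """Randomized MST creation as in Section 3.3: at each step, pick
    uniformly among the k shortest links that connect a new node to the
    current tree, instead of always the single shortest one."""
    n = len(dist)
    in_tree = {0}
    links = []
    while len(in_tree) < n:
        candidates = sorted(
            (dist[u][v], u, v)
            for u in in_tree for v in range(n) if v not in in_tree
        )[:k]                                  # the k cheapest cross links
        _, u, v = rng.choice(candidates)       # randomized, slightly sub-optimal
        in_tree.add(v)
        links.append((u, v))
    return links
```

Every choice still connects a new node to the tree, so the result is always a spanning tree, just not necessarily the minimum one, which is exactly what the repeat-and-keep-best strategy needs.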







Furthermore, the quality of the MST itself is not vital for the overall performance of the algorithm; the branch elimination plays a bigger role than the MST. As a result, we will get multiple spanning trees producing different TSP solutions. Out of the 100 repeats, it is likely that one provides a better TSP path than using only the minimum spanning tree as the starting point.
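The best-of-100 strategy itself is a one-liner; a sketch assuming a randomized solver and a path-length function (names ours):

```python
def best_of_repeats(randomized_solver, length_of, repeats=100):
    """Keep the shortest path over `repeats` independent randomized runs,
    as done for the randomized MST-to-TSP and Kruskal-TSP variants."""
    return min((randomized_solver() for _ in range(repeats)), key=length_of)
```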

4. Experimental Results

4.1. Datasets and Methods

We use two open-loop datasets: O-Mopsi [2] and Dots [35]; see Figure 14. Both can be found on the web: http://cs.uef.fi/o-mopsi/datasets/. O-Mopsi consists of 147 instances of a mobile orienteering game played in the real world. Haversine distance is used with the constant 6356.752 km for the earth radius. Dots consists of 6449 computer-generated problem instances designed for humans to solve on a computer screen. Euclidean distance is used. Optimal ground-truth solutions are available for all problem instances.



Figure 14. Examples of the O-Mopsi (above) and Dots (below) datasets.

Table 1. Datasets used in the experiments.

Dataset    Reference    Distance     Instances    N       Branches (min / average / max)
O-Mopsi    [2]          Haversine    147          4-27    0 / 1.90 / 5
Dots       [35]         Euclidean    6449         4-31    0 / 1.48 / 11

Table 1 summarizes the properties of the datasets. On average, the datasets have only a few branches (1.90 in O-Mopsi; 1.48 in Dots). This means that the MST-to-TSP algorithm needs to run only a few iterations. Even the most difficult Dots instance has only 11 branches to be broken.




The total number of link swaps needed is 280 for the O-Mopsi games and 9548 for the Dots games.



Table 2 shows the methods used in the experiments. We include all the main MST-based algorithms: nearest neighbor, Prim-TSP (greedy), Kruskal-TSP, and the proposed MST-to-TSP. Randomized variants of Kruskal-TSP and MST-to-TSP with 100 repeats are also included. Among the local search heuristics, we include 2-opt [22] and the randomized mixed local search [9]. The performance is evaluated by the gap, which is the relative difference between the length L of the algorithm's solution and the length Lopt of the optimal solution:

gap = (L − Lopt) / Lopt        (1)

Table 2. Accuracy comparison of the algorithms (gap) for the two series of datasets.

Algorithm                      Reference     Average             Worst
                                             O-Mopsi   Dots      O-Mopsi   Dots
Christofides                   [3,4]         12.77%    9.10%     35.41%    39.21%
Nearest neighbor               [17]          20.00%    9.25%     66.57%    48.06%
Nearest neighbor+              [9]           3.47%     1.49%     21.01%    22.80%
Prim-TSP                       [17]          6.57%     3.84%     32.95%    35.23%
Kruskal-TSP                    [8]           4.87%     2.66%     28.59%    28.40%
Kruskal-TSP (100r)             [8]           0.77%     0.26%     7.13%     13.46%
MST-to-TSP (greedy)            (proposed)    4.81%     2.93%     21.98%    28.40%
MST-to-TSP (all-pairs)         (proposed)    1.69%     0.61%     16.32%    15.46%
MST-to-TSP (all-pairs 100r)    (proposed)    0.07%     0.02%     1.98%     2.42%
2-opt                          [22]          1.72%     1.35%     15.53%    22.86%
Random Mix Local Search        [9]           0.00%     0.01%     0.27%     2.59%

4.2. Results

Table 2 also shows the average results with the O-Mopsi and Dots data. From these we can learn a few things.

First, while the greedy variant of MST-to-TSP provides similar results to those of Kruskal-TSP (4.81% and 2.93%), the all-pairs variant is significantly better (1.69% and 0.61%) and the randomized version reaches near-optimal results (0.07% and 0.02%). In other words, it is clearly superior to the existing MST-based heuristics and very close to the more sophisticated local search (0.00% and 0.01%).

Second, by comparing the greedy algorithms we can see that the more options an algorithm has for the next merge, the better its performance. Nearest neighbor performs worst as it has only one possibility to continue from the previously added node. Prim-TSP always has two choices and performs better. Kruskal-TSP often has more options for the merge and is the best among these. Randomized Kruskal-TSP is already more competitive but still inferior to MST-to-TSP.

Third, Christofides reaches average values of 12.77% (O-Mopsi) and 9.10% (Dots), which are clearly the worst among all tested algorithms. Its advantage is that it is the only algorithm that provides a guarantee (50%) for the worst case. However, with these datasets this is merely theoretical, as most other heuristics never get worse than 50% anyway.

The only advantage of Christofides is that this 50% upper limit is theoretically proven. In practice, MST-to-TSP is clearly the better choice.

The dependency on the number of nodes is demonstrated in Figure 15. We can see that the gap tends to increase with the problem size. This is as expected, but it remains an open question whether the gap value would converge to a certain level with this type of 2-dimensional problem instances.


Figure 15. Results for the O-Mopsi and Dots datasets as a function of the problem size.

Table 3 shows how the link swaps increase the length of the path when converting the spanning tree to a TSP path. Out of the 147 O-Mopsi games, 19 (12%) have no branches and the minimum spanning tree is the optimal solution for TSP as such. Most other games have only one (31%) or two (25%) branches, which shows that the task of converting MST to TSP is not too difficult in most cases. Only about 10% of the games require more than two swaps.

Table 3. Cost of the link swaps (meters) in MST-to-TSP during the algorithm with O-Mopsi data.

Swap       1      2      3      4      5
Max        1074   1192   497    1126   681
Average    108    155    145    227    381
Min        1      1      1      13     134
Count      130    84     47     16     3
%          31%    25%    21%    9%     2%

Statistics for the Dots games are similar; see Table 4. Branch eliminations also increase the cost (5.65 on average), but the cost does not vary much during the iterations; only the 9th and 10th swaps are somewhat more costly. In some cases, the swaps can also decrease the path length of the Dots instances. This happens rarely though: only 14 times with the Dots instances and never with the O-Mopsi instances.

Table 4. Cost of the link swaps in MST-to-TSP during the algorithm with Dots data.

Swap      1      2      3      4      5      6      7      8      9      10
Max       30.62  25.75  29.18  42.99  35.55  29.42  27.46  29.68  26.49  15.41
Average   5.65   5.98   5.95   5.94   6.74   6.83   6.71   6.60   10.76  11.01
Min       0.00   −4.12  −5.84  −6.43  0.00   0.00   0.04   0.00   1.90   3.26
Count     4525   2374   1169   622    341    171    87     43     17     6
%         33%    19%    8%     4%     3%     2%     1%     0.4%   0.2%   0.1%

Figure 16 shows more demanding O-Mopsi game instances. Only three games require 5 swaps, while 13 games require 4 swaps. Kruskal-TSP suffers from the sub-optimality of the last selected links. In the case of Odense, the last link goes all the way from the northmost to the southmost node. Christofides works reasonably well with this example but contains too many random-like selections in the tour. MST-to-TSP is the only algorithm that does not produce any crosses, and it also manages to find at least one starting point correctly in all three cases.

Figure 16. Examples of Christofides, Kruskal-TSP, and MST-to-TSP (greedy variant) with the three more challenging O-Mopsi games. Barcelona Grand Tour and Turku City 2.2.2015 have four branches while Odense has five. None of the algorithms finds the optimal solution.

For the O-Mopsi and Dots datasets, all the algorithms work in real time (<1 s). Theoretical time complexities are as follows. Christofides requires O(N³) time due to the perfect matching by Blossom. The MST-based variants require O(N²) to O(N² log N) depending on the data structure. The randomized version of the proposed MST-to-TSP algorithm takes O(N³) due to the search of the leaf nodes, but there are probably more efficient ways to implement it. Random mix local search is fast, O(IR), where I and R are the number of iterations and repeats, respectively; the operations themselves are O(1) due to the random choices.

5. Conclusions

We have introduced a heuristic algorithm that takes a minimum spanning tree as input and converts it to an open-loop TSP solution by eliminating all branches from the tree one by one. The randomized version of the proposed MST-to-TSP algorithm clearly outperforms Christofides and other MST-based heuristics. With the O-Mopsi and Dots datasets, it reaches near-optimal results not far from the more sophisticated local search algorithm. The proposed algorithm has educational value by showing the connection between two problems, MST and TSP, of which one can be solved optimally using greedy heuristics while the other is known to be NP-hard. The result of the algorithm can also be used as an initial solution for local search heuristics or for optimal algorithms like branch-and-cut to improve the efficiency of the search [36,37].

Author Contributions: Conceptualization, P.F. and T.N.; implementation, T.N. and M.Y.; experiments, T.N. and M.Y.; writing, P.F. All authors have read and agreed to the published version of the manuscript.

Funding: No funding to report.

Institutional Review Board Statement: Not applicable.

Informed Consent Statement: Not applicable.

Data Availability Statement: All datasets are publicly available: http://cs.uef.fi/o-mopsi/datasets/.

Acknowledgments: The authors want to thank Lahari Sengupta and Zongyue Li for helping in the experiments.

Conflicts of Interest: The authors declare no conflict of interest.

References

1. Papadimitriou, C.H. The Euclidean travelling salesman problem is NP-complete. Theor. Comput. Sci. 1977, 4, 237–244. [CrossRef]
2. Fränti, P.; Mariescu-Istodor, R.; Sengupta, L. O-Mopsi: Mobile orienteering game for sightseeing, exercising and education. ACM Trans. Multimed. Comput. Commun. Appl. 2017, 13, 1–25. [CrossRef]
3. Christofides, N. Worst-Case Analysis of a New Heuristic for the Travelling Salesman Problem; Report 388; Graduate School of Industrial Administration, CMU: Pittsburgh, PA, USA, 1976.
4. Goodrich, M.T.; Tamassia, R. 18.1.2 The Christofides approximation algorithm. In Algorithm Design and Applications; Wiley: Hoboken, NJ, USA, 2015; pp. 513–514.
5. Valenzuela, C.L.; Jones, A.J. Estimating the Held-Karp lower bound for the geometric TSP. Eur. J. Oper. Res. 1997, 102, 157–175. [CrossRef]
6. Sengupta, L.; Fränti, P. Predicting difficulty of TSP instances using MST. In Proceedings of the IEEE International Conference on Industrial Informatics (INDIN), Helsinki, Finland, 22–25 July 2019; pp. 847–852. [CrossRef]
7. Rosenkrantz, D.J.; Stearns, R.E.; Lewis, P.M., II. An analysis of several heuristics for the travelling salesman problem. SIAM J. Comput. 1977, 6, 563–581. [CrossRef]
8. Fränti, P.; Nenonen, H. Modifying Kruskal algorithm to solve open loop TSP. In Proceedings of the Multidisciplinary International Scheduling Conference (MISTA), Ningbo, China, 12–15 December 2019; pp. 584–590.
9. Sengupta, L.; Mariescu-Istodor, R.; Fränti, P. Which local search operator works best for open loop Euclidean TSP? Appl. Intell. 2019, 9, 3985.
10. Hoogeveen, J.A. Analysis of Christofides' heuristic: Some paths are more difficult than cycles. Oper. Res. Lett. 1991, 10, 291–295. [CrossRef]
11. An, H.-C.; Kleinberg, R.; Shmoys, D.B. Improving Christofides' algorithm for the s-t path TSP. J. ACM 2015, 34, 1–28. [CrossRef]
12. Gharan, S.O.; Saberi, A.; Singh, M. A randomized rounding approach to the traveling salesman problem. In Proceedings of the IEEE Annual Symposium on Foundations of Computer Science, Palm Springs, CA, USA, 22–25 October 2011; pp. 550–559.
13. Sebő, A.; Vygen, J. Shorter tours by nicer ears: 7/5-approximation for graphic TSP, 3/2 for the path version, and 4/3 for two-edge-connected subgraphs. Combinatorica 2012, 34. [CrossRef]
14. Verma, M.; Chauhan, A. 5/4 approximation for symmetric TSP. arXiv 2019, arXiv:1905.05291.
15. Traub, V.; Vygen, J. Approaching 3/2 for the s-t-path TSP. J. ACM 2019, 66, 14. [CrossRef]
16. Traub, V.; Vygen, J.; Zenklusen, R. Reducing path TSP to TSP. In Proceedings of the Annual ACM SIGACT Symposium on Theory of Computing, New York, NY, USA, 22–26 June 2020.
17. Applegate, D.L.; Bixby, R.E.; Chvátal, V.; Cook, W.J. The Traveling Salesman Problem: A Computational Study; Princeton University Press: Princeton, NJ, USA, 2006.
18. Kızılateş, G.; Nuriyeva, F. On the nearest neighbor algorithms for the traveling salesman problem. In Advances in Computational Science, Engineering and Information Technology; Springer: Berlin/Heidelberg, Germany, 2013; Volume 225.
19. Pan, Y.; Xia, Y. Solving TSP by dismantling cross paths. In Proceedings of the IEEE International Conference on Orange Technologies, Xi'an, China, 20–23 September 2014.
20. Edmonds, J. Paths, trees, and flowers. Can. J. Math. 1965, 17, 449–467. [CrossRef]
21. Genova, K.; Williamson, D.P. An experimental evaluation of the best-of-many Christofides' algorithm for the traveling salesman problem. Algorithmica 2017, 78, 1109–1130. [CrossRef]
22. Croes, G.A. A method for solving traveling-salesman problems. Oper. Res. 1958, 6, 791–812. [CrossRef]
23. Lin, S.; Kernighan, B.W. An effective heuristic algorithm for the traveling-salesman problem. Oper. Res. 1973, 21, 498–516. [CrossRef]
24. Helsgaun, K. An effective implementation of the Lin-Kernighan traveling salesman heuristic. Eur. J. Oper. Res. 2000, 126, 106–130. [CrossRef]
25. Glover, F. Tabu search-Part I. ORSA J. Comput. 1989, 1, 190–206. [CrossRef]
26. Kirkpatrick, S.; Gelatt, C.D.; Vecchi, M.P. Optimization by simulated annealing. Science 1983, 220, 671–680. [CrossRef]

27. Chen, S.M.; Chien, C.Y. Solving the traveling salesman problem based on the genetic simulated annealing ant colony system with particle swarm optimization techniques. Expert Syst. Appl. 2011, 38, 14439–14450. [CrossRef]
28. Ezugwu, A.E.S.; Adewumi, A.O.; Frîncu, M.E. Simulated annealing based symbiotic organisms search optimization algorithm for traveling salesman problem. Expert Syst. Appl. 2017, 77, 189–210. [CrossRef]
29. Singh, S.; Lodhi, E.A. Study of variation in TSP using genetic algorithm and its operator comparison. Int. J. Soft Comput. Eng. 2013, 3, 264–267.
30. Akshatha, P.S.; Vashisht, V.; Choudhury, T. Open loop travelling salesman problem using genetic algorithm. Int. J. Innov. Res. Comput. Commun. Eng. 2013, 1, 112–116.
31. Bai, J.; Yang, G.K.; Chen, Y.W.; Hu, L.S.; Pan, C.C. A model induced max-min ant colony optimization for asymmetric traveling salesman problem. Appl. Soft Comput. 2013, 13, 1365–1375. [CrossRef]
32. Boryczka, U.; Szwarc, K. An effective hybrid harmony search for the asymmetric travelling salesman problem. Eng. Optim. 2020, 52, 218–234. [CrossRef]
33. Vygen, J. Reassembling trees for the traveling salesman. SIAM J. Discret. Math. 2016, 30, 875–894. [CrossRef]
34. Fränti, P.; Sieranoja, S. How much can k-means be improved by using better initialization and repeats? Pattern Recognit. 2019, 93, 95–112. [CrossRef]
35. Sengupta, L.; Mariescu-Istodor, R.; Fränti, P. Planning your route: Where to start? Comput. Brain Behav. 2018, 1, 252–265. [CrossRef]
36. Costa, L.; Contardo, C.; Desaulniers, G. Exact branch-price-and-cut algorithms for vehicle routing. Transp. Sci. 2019, 53, 917–1212.
37. Homsi, G.; Martinelli, R.; Vidal, T.; Fagerholt, K. Industrial and tramp ship routing problems: Closing the gap for real-scale instances. arXiv 2018, arXiv:1809.10584. [CrossRef]