Bond University Volume 12| Issue 2 | 2020

Spreadsheet-Based Modeling and Optimization of a Bi-Modal Traveling Salesman Problem: Model, Solution, and Case

Roger Grinde, Decision Sciences, University of New Hampshire [email protected]


Follow this and additional works at: https://sie.scholasticahq.com

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 Licence.

Roger Grinde Decision Sciences, University of New Hampshire [email protected]

Abstract

This paper presents a bi-modal routing problem and a two-phase spreadsheet-based model and solution approach. The problem is the Bi-Modal Covering Salesman Problem, and the context for the problem in this paper is recreational hiking. Two hikers seek to traverse a set of peaks, and they have defined a number of hikes from which to choose. The problem is to a) identify the set of hikes that minimizes a hiking time objective while ascending all peaks, and then b) sequence the chosen hikes to minimize a driving distance objective. The problem essentially combines two well-known problems in Operations Research: the Set Covering Problem and the Traveling Salesman Problem. A mathematical programming formulation is presented, followed by a detailed explanation of the Excel™ model and solution approach with Solver™, using linear and evolutionary engines. The model is enhanced using Visual Basic for Applications. In instruction, it can be utilized in many ways, from a classroom discussion example all the way up to a semester-long project done in several phases. A case, split into two parts, is provided in the appendix. The full workbook and a template file are available upon request from the author.

Keywords: routing, optimization, Operations Research, traveling salesman, spreadsheet, Excel, Solver, VBA, Simplex Method, evolutionary algorithm

1.0 Introduction

This paper presents a model and solution of a bi-modal routing problem related to the well-known Traveling Salesman Problem (TSP). It is bi-modal because travel is both vehicular and on foot, and it is termed the Bi-Modal Covering Salesman Problem (BCSP). A model formulation, spreadsheet-based solution approach (Excel™ with VBA), and case study are provided. It is suited for courses related to Optimization, Operations Research, Management Science, Business Analytics, and programming at various levels and in various programs (e.g., business, engineering, applied mathematics), depending on which aspects of the problem-solving process the instructor wishes to emphasize. From an educational standpoint, the problem has some key differences compared to the TSP, which raise the questions of how to model it and then how to find a solution approach. The problem can be used to emphasize the modeling aspect, the spreadsheet implementation, VBA enhancements, or some combination of these. It could also be used as a guided semester-long project where these aspects are emphasized during different phases. The author has used the problem as a case in an MBA course in Management Science with good results.

The specific problem arises in hiking, but it is related to other variations of the TSP and is applicable to other bi-modal routing situations. There are a number of mountain peaks to climb. For this paper, the specific problem of the 48 mountain peaks above 4000 feet in altitude in the State of New Hampshire, USA, is used (Smith & Dickerman, 2012). The hikers seek to a) identify the set of hikes that ascends each peak in the shortest hiking time; and b) sequence the hikes to minimize the driving distance between and within hikes. Appendices C and D contain case studies for classroom use. The method is generalizable to other situations. Briefly, the peaks are reached by one or more trails, each with trailheads. A specific trail may traverse more than one peak and may be a round-trip trail or a one-way trail. Trailheads are connected by a network of roads. Two hikers utilize two vehicles to reach the trailheads, and then hike one or more hikes from that trailhead. In the case of one-way hikes, the hikers must stage a vehicle at each end of the hike.

The remainder of this section discusses related work. Section 2 presents the problem formulation and outlines the solution approach. The spreadsheet implementation (in Excel™ and VBA) is explained in Section 3, and Section 4 explores educational value and possible uses. Concluding remarks are provided in Section 5. Appendices contain the VBA code and the case study, presented in two parts. The spreadsheet implementation and a template file are available from the author upon request.

The TSP and the related Vehicle Routing Problem (VRP) have a rich literature due to their many practical applications as well as solution challenges. Examples include Gutin (2007), Applegate et al. (2007), Golden et al. (2008), and Braekers et al. (2016). Multi-objective problems are reviewed by Jozefowiez et al. (2008). The Covering Tour Problem (CTP) is similar to the BCSP discussed in this paper, in that there are two node sets. However, in the CTP there is only one mode of transportation. The goal is to find a minimum-cost tour through a subset of the nodes in one set, so that every node in the other set is within some critical distance of a visited node (e.g., rural health clinics might be one node set, and towns/villages might be the other node set). Current & Schilling (1989) introduced this problem, and it has been studied by others.

The problem most directly related to the BCSP presented in this paper is termed the Walking Line of Travel Problem (WLT). It was studied by Levy & Bodin (1988). The example provided was to schedule postal carriers who drive to a parking area, complete a walking tour delivering mail, and return to the vehicle. It differs from the BCSP in that one-way trips are not allowed, so the BCSP is a more general version of the WLT problem. Grinde (2017) presents a formulation of the BCSP, a heuristic solution approach using traditional optimization software, and computational results.

2.0 Problem Formulation and Solution Approach

The BCSP uses two networks, one for each mode of transportation. In the hiking case problem, one network is a set of parking areas (P1, P2, …, Pm) connected by roads. The other network is a set of peaks (“cities” in the traditional TSP language), denoted by (C1, C2, …, Cn) and connected by trails. Some trails have connection points to parking areas. Each of the peaks must be visited at least once. Figure 1 shows an example. The overall objective is to minimize total weighted travel time or distance (driving plus hiking) while visiting each peak at least once. This involves driving from a home base to one of the parking areas, making one or more hikes (either cyclical or chain hikes, to be discussed shortly), driving to another parking area and hiking from there, and returning to the home base after all peaks have been visited.

Figure 1. Road and Trail Networks


A single trip visiting one or more peaks (a hike, in this problem context) works as follows. The car is driven to a parking area Pi, and a hike is performed that visits one or more peaks. The hike can end in one of two ways. In a loop, or cyclical, hike the hikers return to Pi and then drive to another parking location (or perform another hike from Pi). A one-way, or chain, hike ends at a different parking area, say Pj. To perform a chain hike, both hikers drive to Pj, the endpoint of the hike, and leave one car. Then they drive to Pi in the other car and depart on the hike. Finishing the hike at Pj, they drive back to Pi to retrieve the other car. Figure 2 shows a cyclical hike; Figure 3 shows a chain hike.

[Cyclical trip illustration: road, trail, parking area P, and peaks C1, C2, C3. Step 1: Drive 2 cars. Step 2: Hike. Step 3: Drive 2 cars.]

Figure 2. Cyclical Hike

[Chain trip illustration: road, trail, parking areas P1 (beginning of hike) and P2 (end of hike), and peaks C1, C2, C3. Step 1: Drive 2 cars from previous hike. Step 2: Drive 1 car to beginning of hike. Step 3: Hike. Step 4: Drive 1 car to beginning of hike. Step 5: Drive 2 cars to next hike.]

Figure 3. Chain Hike


Chain hike feasibility depends on the application. In the hiking setting, it is common not only to have two hikers, but also for some hikes to be done more naturally as one-way hikes. This has the potential to reduce hiking time considerably. Having two cars increases the total mileage driven, though not necessarily the time. Chain hikes (trips) may be useful in other settings as well, with or without multiple vehicles. One example is when the “parking” areas are airports and customers are to be visited on one-way trips between airports. Another example may occur with bike or car sharing programs, where the pick-up and drop-off locations can be different. Other application notes are discussed as part of Case A (Appendix C).

The modeling and solution approach here assumes that possible hiking trips are developed in advance. This is reasonable, as there are many hiking guides available and there tends to be a finite number of reasonable hikes. The problem becomes two-phased: 1) select which hikes to perform, visiting each peak at least once while minimizing the hiking objective; and 2) sequence the selected hikes so the driving objective is optimized. Each objective can be time-based or distance-based, or a weighted combination. In the spreadsheet implementation developed for this paper, the hiking-based objective has precedence over the driving-based objective.

A mathematical programming formulation of the BCSP is presented here. This is the same as in Grinde (2017). That paper then used traditional optimization tools for the solution and presented computational experience. This paper uses the formulation as a guide for the development of the spreadsheet-based approach which is the focus here. When using this problem in an educational setting, the instructor can choose the depth in which to cover the formulation. In a mathematical programming course, the formulation would probably receive much attention. In an application-based MBA course, the formulation might be used mainly to illustrate the correspondence between the formulation and the resulting Excel™ and Solver™ model.

Notation is as follows:

$i = 1,\dots,m$: trip index (hiking trips, each visiting one or more peaks)
$j = 1,\dots,n$: customer (peak) index
$c_i$ = hiking cost of trip $i$ (e.g., time, distance)
$d_{ik}$ = driving cost of driving from trip $i$ to trip $k$ (e.g., time, distance)
$a_{ij}$ = 1 if trip $i$ visits customer (peak) $j$; 0 otherwise

The direction of travel could affect the hike cost (e.g., terrain features), so it is common to define a hike in each direction, especially for chain hikes (cyclical hikes can be done in whichever direction is faster, with no impact on driving logistics). The driving distance matrix (elements are $d_{ik}$) is the matrix of hike-hike distances.

Decision variables for the problem are xi for each trip being selected or not, and yik used to model the sequencing of trips. Both sets of variables are binary. Specifically,

$x_i$ = 1 if trip (hike) $i$ is selected; 0 otherwise
$y_{ik}$ = 1 if trip (hike) $k$ follows trip (hike) $i$ ($k \ne i$); 0 otherwise

A mathematical programming formulation follows. Because the objective is the combination of hiking and driving objectives, w1 and w2 are used respectively to weight the objectives.

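$$
\begin{aligned}
\min \quad & w_1 \sum_{i=1}^{m} c_i x_i + w_2 \sum_{i=1}^{m} \sum_{k \ne i} d_{ik} y_{ik} \\
\text{subject to} \quad & \sum_{i=1}^{m} a_{ij} x_i \ge 1, && j = 1,\dots,n && (1) \\
& \sum_{k \ne i} y_{ik} - x_i = 0, && i = 1,\dots,m && (2) \\
& \sum_{i \ne k} y_{ik} - x_k = 0, && k = 1,\dots,m && (3) \\
& \text{subtour elimination constraints} \\
& x_i,\ y_{ik} \in \{0,1\}
\end{aligned}
$$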

The objective is the weighted sum of the hiking and driving costs. For relatively larger values of w1, the hiking objective is more important, and vice versa. Constraints (1) ensure that every peak is visited at least once (these constraints are essentially the type of constraints found in set covering problems, hence the “covering” in the BCSP acronym). Constraints (2) and (3) make sure that the selected hikes are connected to the actual driving network. That is, if a hike is not selected, no drives should be selected to or from that hike. Similarly, if a hike is selected, there must be one drive to that hike and one drive away from that hike. The subtour elimination constraints ensure that the hikes selected form a full cycle (aka, Hamiltonian tour). These latter constraints are essentially the constraints found in a typical Traveling Salesman Problem.
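To make the roles of the objective and constraints (1) through (3) concrete, the following minimal Python sketch evaluates them for a tiny instance. The data (three hikes, three peaks) and the function names `objective` and `satisfies_constraints` are hypothetical and are not taken from the paper or its workbook. Subtour elimination is deliberately not checked here.

```python
# Hypothetical toy instance (not data from the paper): 3 candidate hikes, 3 peaks.
c = [4.0, 6.0, 5.0]          # c[i]: hiking cost of hike i
a = [[1, 1, 0],              # a[i][j] = 1 if hike i visits peak j
     [0, 1, 1],
     [0, 0, 1]]
d = [[0, 10, 7],             # d[i][k]: driving cost from hike i to hike k
     [12, 0, 3],
     [8, 4, 0]]


def objective(x, y, w1=1.0, w2=1.0):
    """Weighted sum of hiking cost and driving cost."""
    m = len(x)
    hiking = sum(c[i] * x[i] for i in range(m))
    driving = sum(d[i][k] * y[i][k]
                  for i in range(m) for k in range(m) if k != i)
    return w1 * hiking + w2 * driving


def satisfies_constraints(x, y):
    """Check constraints (1)-(3); subtour elimination is not checked."""
    m, n = len(x), len(a[0])
    # (1) every peak is covered by at least one selected hike
    covered = all(sum(a[i][j] * x[i] for i in range(m)) >= 1 for j in range(n))
    # (2) exactly one drive leaves each selected hike, none leave unselected hikes
    one_out = all(sum(y[i][k] for k in range(m) if k != i) == x[i]
                  for i in range(m))
    # (3) exactly one drive enters each selected hike, none enter unselected hikes
    one_in = all(sum(y[i][k] for i in range(m) if i != k) == x[k]
                 for k in range(m))
    return covered and one_out and one_in


x = [1, 0, 1]                # select hikes 0 and 2
y = [[0, 0, 1],              # drive from hike 0 to hike 2 ...
     [0, 0, 0],
     [1, 0, 0]]              # ... and from hike 2 back to hike 0
print(satisfies_constraints(x, y), objective(x, y))
```

With equal weights, the selected pair of hikes costs 9 in hiking and 15 in driving. Raising `w1` relative to `w2` shifts the emphasis toward the hiking objective, as described above.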

Weights on the objective allow for differential importance of the hiking and driving objectives. That said, the two-objective problem is a very difficult mathematical program to solve. However, the objective and constraints are separable by objective. Specifically, by addressing the hiking objective first, the best set of hikes can be identified. This is essentially Phase I of the solution process. Once this is done, the selected hikes can be sequenced to minimize the driving objective. This is Phase II of the solution process. The spreadsheet implementation developed in this paper is solved using this two-phased approach, in which the hiking objective takes precedence over the driving objective.

3.0 Spreadsheet-Based Solution Approach

3.1 Overview of Workbook

Based on the problem formulation of the previous section and the two-phased solution approach, the problem is modeled and solved using Excel™ and Solver™. Visual Basic for Applications (VBA) is used to automate tasks such as resetting and building the Solver settings, and for solving the Phase II problem using several different random starting sequences. However, VBA is not needed for the core functionality of the model and heuristic solution. Extensive use of range names is made to allow for better self-documentation of formulas, easier scalability, and more general VBA code. A Documentation sheet is included that provides the formula logic in tabular form, definitions of the named ranges, and screen shots of Solver settings. This section first provides an overview of the structure of the workbook and its sheets, and then explains the core functional aspects of those sheets.

Figure 4 shows, in block diagram form, the logical structure of the workbook. The two primary sheets are Model_Cover and Model_Sequence. Model_Cover is for the Phase I problem. It contains the definitions of the hikes and the optimization model to select the hikes that minimize the hiking objective. Model_Sequence is for the Phase II problem. It extracts the selected hikes from Model_Cover and contains the optimization model to identify the best sequence in which to perform the hikes according to the driving objective. Model_Sequence uses driving distances between hikes computed and stored in the Hike_Hike_Dist sheet. In turn, the Hike_Hike_Dist sheet is computed from distance values between parking areas stored in the Drives sheet. Two other sheets are used by VBA in the event the user wants to try one or more random starting sequences for the selected hikes. The Index_Randomizer sheet computes a random sequence of hikes for the correct number of selected hikes. The VBA_Support_Sheet stores the best-found sequence of hikes for the Phase II problem when the user tries one or more random starting sequences.

VBA is used on the Model_Cover sheet to quickly build the Solver settings for either a distance-based or time-based hiking objective, and to solve the Solver model. On the Model_Sequence sheet, VBA is used to reset and rebuild the Solver model (necessary if the number of selected hikes changes), to reset the sequence of hikes to an increasing order of hike number, to compute and show a random sequence of hikes and the resulting driving distance, to solve the Solver model, and to run the Solver model for a user-specified number of random starting solutions.

Figure 4. Logical Structure of Workbook

Table 1 lists the defined ranges in the workbook. Some of these will be mentioned when discussing the detailed logic.

Table 1. Range Names and Associated Cell References

| Worksheet | Range Name | Reference |
|---|---|---|
| Model_Cover | Hike_ID | =Model_Cover!$A$9:$A$50 |
| Model_Cover | Hike_Info | =Model_Cover!$A$8:$G$50 |
| Model_Cover | HikeChosen | =Model_Cover!$BD$9:$BD$50 |
| Model_Cover | HikePeakArray | =Model_Cover!$H$9:$BC$50 |
| Model_Cover | Max_Allowed_Hiking_Time | =Model_Cover!$C$73 |
| Model_Cover | Max_Avg_Hiking_Time | =Model_Cover!$C$72 |
| Model_Cover | Max_Chain_Hikes | =Model_Cover!$C$69 |
| Model_Cover | Max_Number_Hikes | =Model_Cover!$C$71 |
| Model_Cover | Min_Chain_Hikes | =Model_Cover!$C$68 |
| Model_Cover | Min_Number_Hikes | =Model_Cover!$C$70 |
| Model_Cover | Number_Chain_Hikes | =Model_Cover!$C$62 |
| Model_Cover | Number_Hikes_Chosen | =Model_Cover!$C$61 |
| Model_Cover | PeakCoveredActual | =Model_Cover!$H$52:$BC$52 |
| Model_Cover | PeakCoveredRequired | =Model_Cover!$H$53:$BC$53 |
| Model_Cover | Total_Miles | =Model_Cover!$D$56 |
| Model_Cover | TotalTime | =Model_Cover!$C$56 |
| Model_Sequence | Best_Objective | =Model_Sequence!$M$33 |
| Model_Sequence | Current_Run_Number | =Model_Sequence!$M$32 |
| Model_Sequence | IndexSequence | =Model_Sequence!$F$10:$F$51 |
| Model_Sequence | IndexSequenceAnchor | =Model_Sequence!$F$10 |
| Model_Sequence | InterHikeDistance | =Model_Sequence!$M$12 |
| Model_Sequence | NumStartingSolutions | =Model_Sequence!$M$31 |
| Model_Sequence | SelectedTrips | =Model_Sequence!$B$7 |
| Model_Sequence | TotalTrips | =Model_Sequence!$B$6 |
| Drives | Park_Nodes | =Drives!$E$4:$AC$28 |
| Hike_Hike_Dist | Hike_Hike_Dist | =Hike_Hike_Dist!$B$8:$AR$50 |
| Index_Randomizer | RandIndex | =Index_Randomizer!$C$6:$C$47 |
| VBA_Support_Sheet | Best_Index_Sequence | =VBA_Support_Sheet!$A$2:$A$43 |

3.2 Model_Cover Sheet: Select Hikes To Be Performed

The detailed logic of the Model_Cover sheet is presented in this subsection. Recall that this sheet solves the Phase I problem: select the hikes to minimize the hiking objective, either time or distance. Figure 5 shows an overview of the Model_Cover sheet (note: columns U-AZ are hidden for readability). This is intended to give the overall structure; the discussion covers each key area in detail. Table 2 lists key formulas on the Model_Cover sheet.


Figure 5. Model_Cover Worksheet

The range A8:G50 is a named range, Hike_Info. This is essentially a database of the hikes that are defined. Each hike has an ID, a description, a time estimate (in hours) to complete, a mileage, the starting and ending parking “nodes,” or trailheads, and a type (whether a chain hike or cyclical hike). For cyclical hikes, the starting and ending nodes are the same.


Table 2. Key Formulas in Model_Cover Sheet

| Address | Formula | Copy To | Description |
|---|---|---|---|
| G9 | =IF(E9=F9,0,1) | G10:G50 | Hike type: 0=cyclical, 1=chain |
| H51 | =SUM(H9:H50) | I51:BD51 | Number of hikes containing each peak |
| H52 | =SUMPRODUCT(H9:H50, HikeChosen) | I52:BC52 | Calculated number of visits to each peak based on selected hikes |
| C56 | =SUMPRODUCT(HikeChosen, C9:C50) | | Total hiking time of selected hikes (objective function) |
| D56 | =SUMPRODUCT(HikeChosen, D9:D50) | | Total distance of selected hikes |
| G56 | =SUMPRODUCT(HikeChosen, G9:G50) | | Number of chain hikes selected |
| C59 | =C56 | | Total hiking time |
| C60 | =D56 | | Total hiking distance |
| C61 | =BD51 | | Number of hikes selected |
| C62 | =G56 | | Number of chain hikes selected |
| C63 | =C60/C61 | | Average hike distance (informational) |
| C64 | =C59/C61 | | Average hike time (informational) |
| C73 | =Max_Avg_Hiking_Time*Number_Hikes_Chosen | | Maximum allowed hiking time (can be used as a constraint) |

Moving to the right, the HikePeakArray (H9:BC50) is a binary matrix mapping the hikes to the peaks, corresponding to aij in the problem formulation presented earlier. Some hikes traverse numerous peaks; this may occur if multiple peaks are connected by a ridge. Other hikes contain only a single peak.

Continuing to the right, the HikeChosen range (BD9:BD50) is the set of decision variables xi in the problem formulation. A value of 1 means the hike is chosen, and 0 means it is not. These are binary values, but two decimal places are shown so that, after Solver runs, it is easy to verify they are indeed binary. Conditional formatting is used to highlight the hikes actually chosen, and this formatting is carried over to help see the Hike_ID values that are chosen. The decision variables, combined with each column of HikePeakArray, give the number of times each peak is covered, using the SUMPRODUCT function. This is shown in the PeakCoveredActual range. PeakCoveredRequired is a range with values of 1 for all peaks. This will be used in the constraint for the Solver settings. Typical Set Covering Problem constraints have 1’s for all the right-hand-side values. This is used here, but if desired, the user could require one or more peaks to be visited more than once, or not at all. Thus, this model can be used for all the peaks in one big tour, or in a piecewise fashion, by setting some of the PeakCoveredRequired values to 0.

Similarly, the TotalTime and Total_Miles values are computed using SUMPRODUCT functions, based again on the HikeChosen values. The objective used here in Solver is TotalTime, but it is easy to change this to the Total_Miles objective if desired.

At this point, all of the necessary ingredients for the optimization model are present on the worksheet. The decision variables are HikeChosen, the objective function is TotalTime, and the constraints are represented as PeakCoveredActual >= PeakCoveredRequired. The HikeChosen variables are set to be binary. Finally, the LP Simplex algorithm is specified, as this is a linear model. The Solver dialog box is shown in Figure 6. The dialog box contains some additional constraints, utilizing the calculations and parameters shown in rows 59-73 of the spreadsheet. For example, if the user wants to limit the number of chain (one-way) trips to a smaller number than is optimal, the parameter value in cell C69 can be changed, and the model re-solved. However, only the PeakCoveredActual >= PeakCoveredRequired constraints are truly required in this formulation; the others are included to be able to tweak the solution in a direction desired by the user.
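The Phase I logic can be mirrored outside the spreadsheet. The following minimal Python sketch reproduces the SUMPRODUCT calculations and the PeakCoveredActual >= PeakCoveredRequired constraint on a hypothetical four-hike, four-peak instance. It uses brute-force enumeration instead of Solver's Simplex LP engine, which is only practical for very small instances; the data and the function names are illustrative assumptions, not the workbook's contents.

```python
from itertools import product

# Hypothetical data (not from the paper's workbook).
hike_time = [5.0, 4.0, 7.5, 3.0]      # hours per hike (column C analogue)
hike_peak = [[1, 1, 0, 0],            # HikePeakArray analogue: rows = hikes,
             [0, 0, 1, 0],            # columns = peaks
             [0, 1, 1, 1],
             [0, 0, 0, 1]]
peak_required = [1, 1, 1, 1]          # PeakCoveredRequired analogue


def peak_covered_actual(chosen):
    """SUMPRODUCT of each peak column with the 0/1 HikeChosen vector."""
    return [sum(hike_peak[i][j] * chosen[i] for i in range(len(chosen)))
            for j in range(len(peak_required))]


def solve_cover():
    """Enumerate all 0/1 selections and keep the feasible one with least time."""
    best = None
    for chosen in product((0, 1), repeat=len(hike_time)):
        actual = peak_covered_actual(chosen)
        if all(act >= req for act, req in zip(actual, peak_required)):
            total_time = sum(t * x for t, x in zip(hike_time, chosen))
            if best is None or total_time < best[0]:
                best = (total_time, chosen)
    return best


best_time, best_choice = solve_cover()
print(best_time, best_choice)
```

Setting an entry of `peak_required` to 0 drops that peak from the covering requirement, which corresponds to the piecewise use of the model described earlier.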


Figure 6. Solver Box for Model_Cover Sheet

There are three VBA routines related to the Model_Cover sheet. Two of them set up the Solver settings, one for the time-minimization model and one for the distance-minimization model. The third, attached to a button, runs the current Solver model. Code for these is contained in Appendix A.

The Model_Cover sheet can be used without Solver if desired. A user can experiment with different hike combinations, for example, if they have some favorite hikes. In fact, like any optimization model in Solver, the user should be able to manually experiment with different solutions. Of course, optimization takes all constraints into account simultaneously while minimizing the objective. This manual experimentation on the spreadsheet, without Solver, can be a helpful way for students to get a better understanding of the impact of different decisions. It is also useful in the debugging process.

3.3 Model_Sequence Sheet: Sequence the Selected Hikes

The Model_Sequence sheet determines the best sequence of the hikes in order to minimize total driving distance between the hikes, returning to a home location. As described earlier, two cars are used to be able to logistically execute chain hikes. In essence, this is a Traveling Salesman Problem. It is asymmetric due to the existence of chain hikes; the distance from one hike to another hike is dependent on which direction the hikes are performed. The solution engine used for this portion is the evolutionary algorithm in Solver, which accommodates sequencing problems quite well. It should be noted that the “optimization” is really a heuristic, as an evolutionary algorithm cannot guarantee optimality. To help the user experiment, provision has been included to solve the sequencing problem starting with multiple random starting sequences, and choose the best one. This is discussed after the main logic is presented.

Figure 7 shows an overview of the Model_Sequence sheet. As with the Model_Cover sheet, the discussion explains each key area in more detail. Table 3 contains key formulas in Model_Sequence.

Figure 7. Model_Sequence Worksheet

Table 3. Key Formulas in Model_Sequence Worksheet

| Address | Formula | Copy To | Description |
|---|---|---|---|
| B6 | =COUNT(C10:C51) | | Number of defined hikes. |
| B7 | =COUNTIF(C10:C51,"<>0") | | Number of selected hikes in current solution. |
| B10 | ={INDEX(Hike_ID,SMALL(IF(HikeChosen=1, ROW(HikeChosen)-ROW(INDEX(HikeChosen,1))+1),A10))} | B10:B51 (enter as array formula) | Extracts the Hike ID values from the Model_Cover sheet based on the decision variable values, HikeChosen. Conditional formatting is used to hide the error values that occur for rows in excess of the number of hikes chosen. |
| C10 | =IFERROR(B10,0) | C11:C51 | Converts the results in Column B to the same values, but with no Excel error values (necessary for Solver to function properly). |
| E11 | =F10 | E12:E52 | For readability, to better show the sequence of indices of the Hike ID values. |
| G11 | =H10 | G12:G52 | For readability, to better show the Hike ID values corresponding to this position in the sequence. |
| H10 | =INDEX(C$10:C$51,F10) | H11:H51 | Pulls the Hike ID value for the hike in this position in the sequence. |
| I10 | =INDEX(Hike_Hike_Dist,G10+1,H10+1) | I11:I52 | Pulls the inter-hike distance corresponding to the pair of hikes adjacent in the sequence. |
| J10 | =INDEX(Hike_Hike_Dist,H10+1,H10+1) | J11:J52 | Pulls the intra-hike distance of the current hike in the sequence. |
| M12 | =SUM(I10:I52) | N12 | Totals of the inter-hike (M12) and intra-hike (N12) distances of the current sequence (M12 is the objective for Solver). |
| O12 | =SUM(M12:N12) | | Total one-car distance |
| M13 | =2*InterHikeDistance | | Two-car inter-hike distance |
| N13 | =2*N12 | | Two-car intra-hike distance |
| O13 | =SUM(M13:N13) | | Two-car total distance |

The left-hand portion (columns A-C) extracts the Hike_ID values from the Model_Cover sheet for the selected hikes. The main issue in getting the problem ready for Solver is that even though there are 42 defined hikes, the Model_Cover algorithm identifies just a subset (12 in this case) of these to actually be performed. The sequencing aspect of the evolutionary algorithm cannot accommodate “null” hikes, and even if it could, solution time would likely increase unacceptably. Therefore, to get the problem ready for Solver, it is necessary to extract just the selected hikes from Model_Cover into the Model_Sequence sheet, and to do this via Excel formulas. From a formula standpoint, this is the most complex logic. The key formulas are in cells B10:B51. They are entered as an array formula that finds the ith smallest value of the Hike_IDs from among the hikes that are selected on the Model_Cover sheet. The hikes that are not selected (HikeChosen values of 0) are ignored, so what results is just the Hike_ID values of the chosen hikes, in the first positions of this area. If more or fewer hikes are chosen, this area adjusts so that only the rows for the chosen hikes are visible. Conditional formatting is used in these columns to show only relevant Hike_ID values (the formula in column B results in an error for those rows beyond the number of hikes chosen; column C simply converts this to a 0 for those cases; see Table 3 for the formulas).
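The extraction logic of columns B and C can be expressed in a few lines of Python. This is a minimal sketch of the same idea, under the assumption that column A of Model_Sequence holds the counters 1, 2, 3, … used as the k argument of SMALL; the function name `extract_selected` and the sample data are hypothetical.

```python
def extract_selected(hike_id, hike_chosen):
    """Mimic column C of Model_Sequence: chosen Hike_IDs first, then zeros."""
    # 1-based relative row positions of the chosen hikes, as produced by
    # ROW(HikeChosen)-ROW(INDEX(HikeChosen,1))+1 inside the IF.
    positions = sorted(pos for pos, x in enumerate(hike_chosen, start=1)
                       if x == 1)
    column_c = []
    for k in range(1, len(hike_id) + 1):            # k plays the role of A10, A11, ...
        if k <= len(positions):
            kth_smallest = positions[k - 1]         # SMALL(..., k)
            column_c.append(hike_id[kth_smallest - 1])  # INDEX(Hike_ID, ...)
        else:
            column_c.append(0)                      # IFERROR(..., 0)
    return column_c


# Hypothetical example: six defined hikes, three of them chosen.
ids = [1, 2, 3, 4, 5, 6]
chosen = [0, 1, 0, 0, 1, 1]
compact = extract_selected(ids, chosen)
selected_trips = sum(1 for v in compact if v != 0)  # COUNTIF(C10:C51,"<>0") analogue
print(compact, selected_trips)
```

The zero padding at the end matters for the sequencing logic: the position just past the last chosen hike holds a 0, which is the Hike_ID of the home location.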

Moving to the right (columns E-J), it is important to recognize the difference between the sequence position and the Hike_ID value. In this case we have 12 hikes chosen, with Hike_ID values scattered from Hike_ID = 1 to Hike_ID = 40. Column F shows the sequence index found after the Solver solution procedure. Columns G and H translate the sequence index back into Hike_ID numbers. An initial dummy hike (Hike_ID = 0) is listed as the first hike. This hike starts and ends at the home location. The first sequence number in the solution is 12 (cell F10). Looking in columns A-C, this is seen as corresponding to Hike_ID 40 (cell H10). The inter-hike distance from Hike_ID = 0 to Hike_ID = 40 is 68 miles. This comes from the Hike_Hike_Dist sheet, shown in the block diagram (Figure 4), with the specific formula in Table 3. Because Hike_ID 40 is a cyclical hike, both cars are parked at the common beginning and end of the hike, so there is no intra-hike driving needed (cell J10).


Moving to the next line, sequence index 3 is next; this corresponds to Hike_ID = 7. So, from Hike_ID 40, the next hike is Hike_ID 7. From the Hike_Hike_Dist sheet, the distance from the former to the latter is 46 miles. This continues; the third sequence index is 2, corresponding to Hike_ID 5. Using similar logic as before, the distance from Hike_ID 7 to Hike_ID 5 is 9 miles (inter-hike distance, cell I12). But since Hike_ID 5 is a chain hike, it has 3 miles of intra-hike driving in order to shuttle a car, because the end of this hike is 3 miles by road from its beginning. Columns I and J are single-car distances; accounting for both cars is handled in the summary calculations of total miles.

This process continues until the last line in green (cell F22), which indicates that sequence index 13 is the last value. This cell is not a decision variable as will be seen in the Solver model. Rather, this sequence index of 13 ensures that the entire tour completes with Hike_ID = 0, corresponding to the home location. Thus, this logic forces a drive from the last hike (Hike_ID = 32 in this case) back to the home location.

The last part of the logic in the sheet is the summary of distances. The single-car inter-hike distance is in cell M12, and it is the sum of column I. Similarly, the intra-hike distance due to car-shuttling is the sum of column J, shown in cell N12. From the standpoint of minimizing vehicle mileage, intra-hike distance is constant once the set of hikes has been determined, so the optimization model seeks to minimize cell M12. To calculate the total two-car distance traveled, the inter-hike and intra-hike distances are each doubled and then summed; this is shown in cell O13.
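The column F-J logic and the distance summary can be sketched in Python as follows. The distance matrix, the selected hikes, and the function name `tour_distances` are hypothetical; the matrix is indexed directly by Hike_ID with 0 as the home location, so the "+1" offsets of the INDEX formulas are not needed.

```python
# Hypothetical hike-to-hike driving distances, indexed by Hike_ID (0 = home).
# The diagonal holds intra-hike (car-shuttle) distances; hike 1 is a chain hike.
hike_hike_dist = [
    [0, 10, 20, 15],
    [12, 3, 5, 9],
    [20, 5, 0, 6],
    [14, 8, 6, 0],
]


def tour_distances(selected_ids, index_sequence, dist):
    """Return single-car (inter_hike, intra_hike) totals for a sequence.

    selected_ids   : compact list of chosen Hike_IDs (column C analogue)
    index_sequence : permutation of 1..N, the decision cells (column F analogue)
    """
    padded = list(selected_ids) + [0]               # slot N+1 is the home "hike" 0
    sequence = list(index_sequence) + [len(selected_ids) + 1]  # fixed return home
    inter = intra = 0
    previous = 0                                    # the tour starts at home
    for idx in sequence:
        current = padded[idx - 1]                   # column H: INDEX(C..., F)
        inter += dist[previous][current]            # column I
        intra += dist[current][current]             # column J
        previous = current
    return inter, intra


inter_hike, intra_hike = tour_distances([1, 3], [2, 1], hike_hike_dist)
one_car_total = inter_hike + intra_hike             # O12 analogue
two_car_total = 2 * inter_hike + 2 * intra_hike     # O13 analogue
print(inter_hike, intra_hike, one_car_total, two_car_total)
```

Because the intra-hike total does not depend on the order of the hikes, only `inter_hike` changes when the sequence is permuted, which is why it alone serves as the objective.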

The Solver model for the sequencing problem is very compact. Figure 8 shows the Solver dialog box. The objective is to minimize InterHikeDistance (cell M12). The decision variables are F10:F21, corresponding to the sequence indices (values 1,2,…,12 in this case, because there are 12 hikes selected). The constraint uses Solver’s “All Different” constraint, which is designed for sequencing problems such as this. The solution engine is the “Evolutionary” solver, a heuristic technique. Solution times with evolutionary algorithms depend on parameter settings for the solution process, and the user can experiment with those. Solutions with evolutionary algorithms can also depend on the starting solution. The solution shown in Figure 7 was obtained using the ascending sequence 1,2,…,12 as the starting sequence. The total single-car inter-hike distance for the starting solution is 458 miles, and the best solution found is 292 miles. Experimenting with several different random starting solutions gives some reasonable confidence that this “best found” solution is close to optimal.


Figure 8. Solver Model for Model_Sequence Sheet

For the Model_Sequence sheet, there are a number of embedded VBA routines, accessed by buttons on the sheet. Code for these is contained in Appendix B. The first routine resets and rebuilds the Solver model. This is needed any time the number of hikes selected changes. For example, if the user re-runs the Model_Cover algorithm and selects 14 hikes instead of 12, the routine to reset and rebuild the Solver model needs to be run. The reason for this is that the decision variables in the Solver model need to correspond to the number of hikes selected. This routine is also useful if the user makes some modifications to the Solver model, as it can be run to return the model to its default state. Two routines initialize the index sequence. One simply puts it in ascending order, and the other randomizes the order. There is a routine to run the Solver model. Finally, there is a routine that runs Solver a user-specified number of times, each with a different random starting solution. The best ending solution is retained and written back to the Model_Sequence sheet at the end of the routine.
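The multiple-random-start idea can be illustrated with a short Python sketch. Note the substitution: Solver's Evolutionary engine is not reproduced here. A simple pairwise-swap local search stands in for it, purely to show how several random starting sequences are tried and the best ending solution is kept. The distance matrix and all function names are hypothetical.

```python
import random

# Hypothetical asymmetric hike-to-hike distances, indexed by Hike_ID (0 = home).
dist = [
    [0, 10, 20, 15, 30],
    [12, 0, 5, 9, 14],
    [20, 5, 0, 6, 11],
    [14, 8, 6, 0, 7],
    [28, 13, 10, 7, 0],
]


def tour_cost(order, dist):
    """Inter-hike driving distance of home -> hikes in the given order -> home."""
    path = [0] + list(order) + [0]
    return sum(dist[a][b] for a, b in zip(path, path[1:]))


def improve_by_swaps(order, dist):
    """Stand-in heuristic: swap pairs of positions while the cost decreases."""
    order = list(order)
    improved = True
    while improved:
        improved = False
        for i in range(len(order) - 1):
            for j in range(i + 1, len(order)):
                candidate = order[:]
                candidate[i], candidate[j] = candidate[j], candidate[i]
                if tour_cost(candidate, dist) < tour_cost(order, dist):
                    order, improved = candidate, True
    return order


def multi_start(hikes, dist, num_starts, seed=0):
    """Run the heuristic from an ascending start plus num_starts random starts."""
    rng = random.Random(seed)
    best = improve_by_swaps(sorted(hikes), dist)    # ascending-order start
    for _ in range(num_starts):
        start = list(hikes)
        rng.shuffle(start)                          # random starting sequence
        candidate = improve_by_swaps(start, dist)
        if tour_cost(candidate, dist) < tour_cost(best, dist):
            best = candidate                        # retain the best ending solution
    return best, tour_cost(best, dist)


best_order, best_cost = multi_start([1, 2, 3, 4], dist, num_starts=5)
print(best_order, best_cost)
```

As with the evolutionary engine, this is a heuristic: the result is never worse than the ascending starting sequence, but optimality is not guaranteed.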

3.4 Data and Calculation-Support Worksheets

There are four additional worksheets in the workbook that provide input data and calculation support. These are shown in the block diagram (Figure 4). This section provides additional detail.

The Drives sheet contains the driving distance matrix between parking nodes, aka trailheads. These are determined using mapping software (e.g., Google Maps) or a detailed printed map of the area. Node 1 is the home location where the overall tour begins and ends. Nodes 2-25 correspond to trailhead locations. The matrix of distances has a range name Park_Nodes (see Table 1).

The Hike_Hike_Dist sheet uses the information in Park_Nodes, plus information in the range Hike_Info (sheet Model_Cover), to determine the driving distances between each pair of hikes. Figure 9 shows an excerpt of this sheet, with some rows and columns hidden for ease of viewing. The key formula in this sheet is:

Cell B8 = INDEX(Park_Nodes,INDEX(Hike_Info,$A8+1,5),INDEX(Hike_Info,B$7+1,6))

This formula is copied to the right and down for the matrix elements. Briefly, it looks up the driving distance (in Park_Nodes) from the beginning node of the hike represented by the current row, to the ending node of the hike represented by the current column. The inter-hike driving always goes to the ending point of the next hike. For cyclical hikes the beginning and ending nodes are the same, so this is immaterial. But for chain hikes, the two cars are driven to the ending node of the hike, and then one car is driven to the starting node of the hike. This latter driving is intra-hike driving and is captured in this matrix by the entries along the diagonal.

Figure 9. Hike_Hike_Dist Sheet Excerpt
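The lookup performed by the Hike_Hike_Dist formula can be sketched in a few lines of Python. This is an illustrative sketch, not the workbook's implementation: it assumes that columns 5 and 6 of Hike_Info hold each hike's starting and ending parking node (as the description of the formula suggests), uses 0-based node indices with node 0 as home, and the function name and data are hypothetical.

```python
def hike_to_hike(park_dist, hikes):
    """Build the hike-to-hike driving matrix.

    park_dist[a][b] is the driving distance between parking nodes a and b.
    hikes is a list of (start_node, end_node) pairs, one per hike.
    Entry [i][j] is the distance from the start node of hike i (where the
    cars are reunited) to the end node of hike j. The diagonal therefore
    holds the intra-hike shuttle distance, which is zero for cyclical hikes.
    """
    return [[park_dist[start_i][end_j] for (_, end_j) in hikes]
            for (start_i, _) in hikes]

if __name__ == "__main__":
    # Hypothetical 4-node distance matrix (node 0 is home).
    park = [[0, 10, 20, 25],
            [10, 0, 12, 15],
            [20, 12, 0, 6],
            [25, 15, 6, 0]]
    hikes = [(1, 1),   # cyclical hike at node 1
             (2, 3)]   # chain hike from node 2 to node 3
    for row in hike_to_hike(park, hikes):
        print(row)
```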

The Index_Randomizer sheet generates random index values for the first N values, where N is the number of hikes selected. Figure 10 shows an excerpt of this sheet, with some rows hidden. Cell B3 contains the number of hikes (trips). Starting with cell B6 and moving down, the RAND function is used to compute random values. The formula in cell C6 (and copied down) is

C6 = IF(A6<=B$3,RANK(B6,OFFSET(B$6,0,0,B$3,1)),A6)

The OFFSET function allows for a flexible number of random indices, corresponding to the number of selected hikes. If A6 (the current index) is greater than the number of hikes selected, the current index is returned, so indices N+1 and greater simply remain sequential. The first N indices are ordered according to the ranks of the random values generated in column B. The effect is that each time this sheet is calculated, it generates a random starting sequence for the selected trips. The VBA routine mentioned in the section on the Model_Sequence sheet commands the Index_Randomizer sheet to calculate, and then copies the resulting random sequence over to the Model_Sequence sheet, where it is used as a starting solution in the sequencing algorithm.

Figure 10. Index_Randomizer Sheet Excerpt
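The rank-based randomization used on the Index_Randomizer sheet can be mimicked as follows. This is a sketch under assumptions: it reproduces the default behavior of Excel's RANK (the largest value receives rank 1), it ignores ties (which are extremely unlikely with random draws), and the function name is hypothetical.

```python
import random

def random_start_sequence(total_slots, n_selected, rng=None):
    """Return a random permutation of 1..n_selected, followed by the
    untouched sequential indices n_selected+1..total_slots."""
    if rng is None:
        rng = random.Random()
    draws = [rng.random() for _ in range(n_selected)]  # like RAND() in column B
    # Rank of each draw among the first n_selected draws (largest gets rank 1).
    ranks = [1 + sum(other > d for other in draws) for d in draws]
    return ranks + list(range(n_selected + 1, total_slots + 1))

if __name__ == "__main__":
    print(random_start_sequence(total_slots=8, n_selected=5, rng=random.Random(42)))
```

Ranking random draws is a convenient way to obtain a permutation using only worksheet functions; in a general-purpose language a direct shuffle would do the same job.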

Finally, the VBA_Support_Sheet is used by the VBA code when the user asks for multiple random starting solutions. In this case, if the solution after optimizing is better than what has been found thus far, that sequence is stored in the VBA_Support_Sheet. After all runs, this best solution is copied back to the Model_Sequence sheet.

4.0 Educational Use

This problem can be used in multiple ways and in classes in a variety of disciplines such as Operations Research/Management Science, Applied Mathematics, Business Modeling/Prescriptive Analytics, and Spreadsheet Modeling & VBA Programming. If the course is more mathematical in nature, more time can be used to explore the mathematical nature of the problem formulation. A class dealing with different algorithms (e.g., Operations Research) might consider alternate solution methods. A business modeling/analytics course (e.g., Management Science) might use a case-based approach to explore how to model and set up the problem, and experiment with different parameter settings for the solution. If an instructor wishes to cover some aspects of VBA in the course as well, the problem and solution approach are amenable to that. The overall problem could also be assigned as a semester-long project broken into multiple phases.

Included in this paper is a case in two phases, presented as Appendix C and Appendix D. The author has used a variation of these in an MBA course with good results. More detail of the course implementation, and potential uses in other course types, is presented in Section 4.1. Section 4.2 contains several questions for each part of the case. First a brief description of each is given, along with learning objectives.

Hiker Case A (Appendix C). This is primarily a problem-framing and modeling case. The main task is to develop a spreadsheet model/decision tool that a hiker can use to explore different possibilities of hikes and sequencing of the hikes (i.e., manual “what-if” analysis). It stops short of seeking to optimize the selection of hikes or sequence them using Solver.

Case A Learning Objectives. By completing this portion of the case, students will:
• Develop an overall organization and framework for the data of the problem that makes model calculations efficient and easy to interpret.
• Write formulas that keep track of which peaks have been visited and the total hiking time, based on which hikes the user selects to perform.
• Transform the distance matrix between parking nodes into a matrix giving distances between hikes, as well as intra-hike distance for chain hikes.
• Based on a specific set of hikes to perform, develop formulas that compute the driving distances between (and within) hikes for a specific sequence of the hikes.
• Manually develop and evaluate a few possible solutions: a selection of hikes, and the sequence in which they should be done. Compute the total hiking time and the total driving distance. Manually verify that all applicable constraints are satisfied.

Hiker Case B (Appendix D). This presumes that the A part of the case is completed. Part B is an optimization case. Based on Part A, use optimization to find an optimal set of hikes, and sequence the hikes to a) minimize hiking time, and secondarily, b) minimize total driving distance.

Case B Learning Objectives. By completing this portion of the case, students will:
• Develop an optimization model for the selection of hikes. This model minimizes the hiking time, subject to each peak being visited at least once.
• Using the results of the hike selection model, develop an optimization model that sequences these specific hikes. The objective is to minimize total driving distance.
• (Advanced) Use formulas to automatically populate the hike sequencing model with the hike ID values of the selected hikes from the hike selection model.
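To make the structure of the hike selection model concrete, the covering logic can be illustrated on a toy instance. The sketch below is not the Solver model from the paper: it replaces the Simplex LP engine with brute-force enumeration (practical only for a handful of hikes), and the hike names, times, and peak labels are invented for illustration.

```python
from itertools import combinations

def min_time_cover(hikes, peaks):
    """hikes maps a hike name to (time_hours, set_of_peaks_visited).
    Returns (best_time, best_choice) over all subsets that visit every peak."""
    names = sorted(hikes)
    best_time, best_choice = float("inf"), None
    for r in range(1, len(names) + 1):
        for choice in combinations(names, r):
            covered = set().union(*(hikes[h][1] for h in choice))
            total = sum(hikes[h][0] for h in choice)
            # Covering constraint: every required peak is visited at least once.
            if peaks <= covered and total < best_time:
                best_time, best_choice = total, choice
    return best_time, best_choice

if __name__ == "__main__":
    # Hypothetical data: four candidate hikes, three peaks.
    hikes = {
        "H1": (6.0, {"A", "B"}),
        "H2": (4.0, {"B"}),
        "H3": (5.0, {"C"}),
        "H4": (12.0, {"A", "B", "C"}),
    }
    print(min_time_cover(hikes, {"A", "B", "C"}))
```

In the spreadsheet, the same logic is expressed with binary decision variables and a hike-peak 0/1 matrix, which lets Solver handle instances far too large to enumerate.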

4.1 Suggestions for Use in Different Courses

The author has direct experience using the case in an MBA required course in Managerial Decision Making (aka, Management Science). The course primarily helps to develop modeling and analysis capabilities using a spreadsheet environment. Core topics include problem framing and base case modeling, sensitivity analysis, optimization (mostly linear programming), Monte-Carlo simulation, and a brief treatment of regression. Students in the course tend to be focused on using a model to get the solution to a problem, and to perform sensitivity analysis, rather than aspects of the methodology of algorithms.

To utilize this problem in the MBA course, the author provides the background material presented in Case A, and a spreadsheet template for the hike selection portion of the problem. Problem data is also supplied electronically. The spreadsheet template provides the key structure for the hike selection portion of the problem. Cases A and B are then assigned as a single unit. It has been helpful to allocate 10-15 minutes of class time to introduce the concept of the case, the structure of the problem, and to mention some real-world applications of vehicle routing and scheduling problems. Typically this is a group assignment, where one or more groups will present their approach and findings as a basis for further discussion.

With this guidance, student groups are able to develop a spreadsheet model to help select the hikes, and a Solver optimization model to minimize hiking time subject to the constraints of visiting each peak at least once. After students have identified the hikes to perform, they are usually able to build a second optimization model that sequences those specific hikes in order to minimize total vehicle distance. Prior to this, the author presents an example of a Traveling Salesman Problem in class, solved using the Evolutionary Algorithm in Solver. The most challenging part of the hike sequencing model is typically figuring out how to look up the correct distances in the driving distance matrix, so that regardless of the sequence of hikes, the correct vehicle distance is computed. To do this, perhaps the easiest way is to transform the parking node distance matrix to a hike-to-hike vehicle distance matrix (i.e., Figure 9). In this assignment, there is not an expectation to try to automate the connection between the results of the two optimization models. This means that the fairly complex formula discussed in Section 3.3 to extract the Hike_ID values of the selected hikes is not needed. Nor is any VBA code required. The second optimization problem, sequencing the hikes, is usually specific to the hikes chosen in the first optimization problem (a few groups may try to resolve this issue, with varying levels of success). This makes the assignment feasible for most MBA students, especially when they work in teams. The assignment could also be made for undergraduate business students in a course covering prescriptive analytics. The spreadsheet template file provided to the students is available upon request from the author. It is primarily a data file (all formulas, Solver settings, and VBA code are removed). But the hike selection sheet contains the binary mapping between hikes and peaks, which tends to help students get started on the right track.

Suggestions for use in other academic courses follow.

Prescriptive Analytics Course (undergraduate/masters). In this type of course, the cases can be assigned in a similar way to the MBA course, but without providing the spreadsheet template, which greatly helps students to structure the problem (alternately, electronic data can be provided, but without the binary mapping of hikes to peaks). Case A and Case B could be used sequentially, so there can be discussion of the Case A work before moving on to Case B, and so students focus on problem structuring (i.e., modeling) before they worry too much about the actual optimization. The structuring of the data of the problem is essentially the core aspect of modeling the problem. Requiring students to do this without the assistance of the binary hike-to-peak mapping makes the problem much more challenging. After Case A has been completed and discussed, Case B would be assigned. The instructor can choose, depending on the background and capabilities of the students, whether to assign the hike sequencing portion as a manual setup from the results of hike selection, or to expect the selected hike ID values to populate the hike sequencing model.

Operations Research/Industrial Engineering/Applied Mathematics (undergraduate/masters). In a course where students have a stronger mathematical foundation, Cases A and B could be assigned with two primary outputs in mind: a) formulation of the optimization problem (or problems, if the instructor wishes to explicitly separate the selection and the sequencing problems from the start); and b) spreadsheet tool that implements hike selection and hike sequencing in two phases. Students would be challenged to automatically populate the hike sequencing model with the selected hikes from the hike selection model. If students have knowledge of VBA (or capability to figure enough out on their own), the instructor could make VBA aspects a challenge assignment (or make a Case C), that asks students to use VBA to build and solve the Solver models. An additional expectation could be that students identify the pros and cons of a simultaneous versus sequential solution approach.

Operations Research/Industrial Engineering/Applied Mathematics (masters/doctorate). At this level, Cases A and B should probably be assigned simultaneously, with limited guidance. There should be an expectation of a mathematical model specification, solution procedure designed, and solution procedure implemented. There is room for innovation and stretching beyond what is presented in this paper. For example, solving the problem as an explicit dual-objective mathematical program with weights on the objective is much more challenging than the spreadsheet solution presented in this paper, and might lead to a possible conference or journal paper. Also, designing hikes as part of the solution process (versus pre-specifying the possible hikes) is an aspect that makes the problem much more difficult. This can be used to drive ideas for further research in vehicle routing problems in general.

Any of the above classes. Using VBA to automate some of the solution process can be demonstrated fairly easily. If students have no knowledge of VBA, a quick macro recording for some simple task (e.g., cell formatting or copying), followed by viewing the code generated, would be a good lead-in. The Solver add-in comes with a number of VBA procedures/functions, which are well-documented online (Microsoft, 2020). Depending on the background and interest level of students, the instructor can show how to use VBA to automate Solver, or can add that expectation to the case assignment. If students do not have prior experience with VBA, it is suggested the instructor first demonstrate VBA/Solver with a different example, say a linear programming problem that has already been covered previously in the course. The VBA code used on the Model_Sequence sheet to randomize the hike sequence, and to run Solver multiple times for different initial solutions, is somewhat more complex. If students have some background in computer programming, there is nothing conceptually difficult about the code, and it can offer a good opportunity to discuss the pros and cons of using a worksheet of the model to store intermediate calculations as the code runs (e.g., the Index_Randomizer sheet is used to compute a randomly generated starting sequence of hikes).

4.2 Suggested Case Questions

This section provides questions the instructor can use in a way deemed appropriate. For example, a question could be used as a discussion/coverage question while introducing the case prior to students working on it, for discussion purposes in class after students work on the case, and/or as a specific part of the assignment to help guide students in the problem-solving process. Each part of the case is considered, and within each, there are questions for the hike selection model and for the hike sequencing model.

Case A, Hike Selection Model
• How can the hike-peak data (i.e., specification of which peaks are completed by each hike) be organized in the spreadsheet to a) make it easy to verify; b) easy to modify if needed; and c) amenable to efficient calculation formulas? Note: The available spreadsheet template essentially addresses this question. The instructor could use the question as a pre-case discussion question before providing the template.
• Why are binary (0/1) matrices useful in spreadsheets for organizing yes/no data, with respect to writing formulas? Specifically, why are 0/1 numerical values a better choice than using words such as “no” and “yes?”
• What is the best way to represent whether a hike is selected or not? How can this set of decision variables be used with the hike-peak data to compute whether a peak is climbed (“covered”) by a particular choice of decision variable values?
• How can you easily calculate the total hiking time (and distance) of a particular choice of decision variable values?
• Can you also easily calculate the total number of chain (one-way) hikes included in a particular choice of decision variables?

Case A, Hike Sequencing Model
• Given a specific choice of hikes from the hike selection model, what does a solution for the hike sequencing model look like? For example, if there are 10 hikes selected, how many possible sequences exist of those hikes?
• How do you know if one hike sequence is better than another? That is, how do you measure the quality of a sequence of hikes?
• Data is provided for the driving distances between parking nodes, and the hike data includes the starting and ending nodes for each hike. How can this information be used to generate a matrix of driving distances between hikes (inter-hike distance) and within hikes (intra-hike distance)? Note the intra-hike distances for cyclical hikes will be zero and will be greater than zero for chain hikes to account for the car shuttling needed. Note: Transforming the node-node distance matrix into a hike-hike distance matrix is a key part of being able to automatically compute the total driving required for a particular hike sequence.
• Manually experiment with at least three different sequences of hikes, and calculate the total driving required for those sequences. Find the best sequence you can. What is important to consider when trying to find a good sequence of hikes?
• (Advanced) How could formulas be used to automatically extract the hike ID values for selected hikes into the hike sequencing model?

Case B, Hike Selection Model
• For the hike selection model, what are the decision variables, objective function, and constraints?
• What is the optimal solution and objective value? Verify that all constraints are satisfied.
• Do you think there are other solutions that have the same objective value (i.e., are there alternate optimal solutions)? (Instructor note: for the current list of hikes, there are alternate optimal solutions.)
• If the user decides that a particular mountain does not need to be climbed (or alternately, to be visited more than once), what changes are needed in order to find the new optimal solution?
• If a new restriction were added to put a limit on the number of chain (one-way) hikes, what changes to the model are needed to find the new optimal solution?
• If the user wanted to minimize hiking distance instead of hiking time, what changes to the model are needed to find the new optimal solution?

Case B, Hike Sequencing Model
• Based on the solution of the hike selection model, which hikes need to be sequenced to minimize the driving distance?
• How could we use Solver’s Evolutionary Algorithm and the “All Different” constraint type to sequence the hikes?
• For a given pair of hikes put in sequence next to one another, how can the driving distance between the hikes (and within each hike if a chain hike) be determined? (This is the critical step to being able to create a model that seeks to minimize driving distance.)
• If the overall objective is to minimize the total driving distance (including both cars), is the intra-hike distance affected by the sequence in which we do the hikes? Is the inter-hike distance affected? (This is an example where one part of the overall objective is a constant value, and so does not need to be an explicit part of the objective function that we tell Solver to minimize.)
• Do the normal linear programming type of constraints apply for this problem, at least the way you have set it up? Why or why not? Why is the Evolutionary Algorithm required? Note: This can be used to initiate a discussion of different classes of algorithms in as much detail as desired.
• What are the decision variables, objective function, and constraints, as specified to Solver?
• What is the best solution you can find with the objective to minimize driving distance?
• Try a few different starting sequences (i.e., manually reorder the hikes before running Solver). Do you get the same final (“best”) solution after running Solver for all the different starting sequences? (This question can be used to illustrate that the starting solution can make a difference in the final solution found, because the Evolutionary Algorithm is really a heuristic algorithm, without a guarantee of finding the global minimum.)
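For instructors who want a quick check of the point about constant and sequence-dependent parts of the objective, the following Python sketch may help. It assumes a hike-to-hike matrix whose diagonal holds intra-hike (shuttle) distances and whose off-diagonal entries hold inter-hike distances; the data and the function name are hypothetical, and legs to and from home are omitted for brevity.

```python
from itertools import permutations

def split_driving(seq, m):
    """Return (intra, inter) driving for a sequence of hikes.

    intra sums the diagonal entries (shuttling within chain hikes);
    inter sums the entries for consecutive pairs of hikes in the sequence.
    """
    intra = sum(m[h][h] for h in seq)
    inter = sum(m[a][b] for a, b in zip(seq, seq[1:]))
    return intra, inter

if __name__ == "__main__":
    # Hypothetical 3-hike matrix; hike 1 is a chain hike (diagonal entry 6).
    m = [[0, 15, 9],
         [12, 6, 7],
         [9, 11, 0]]
    for seq in permutations(range(3)):
        print(seq, split_driving(seq, m))
```

Every ordering reports the same intra-hike total, while the inter-hike total changes with the order, which is why only the inter-hike distance needs to appear in the objective given to Solver.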

5.0 Summary

The problem and solution approach presented in this paper can be used in multiple ways in an educational environment. It provides a stretch from a traditional textbook-based problem, especially with the two-phased solution approach. At the same time, it ties in with classic optimization models in Operations Research such as the Set Covering Problem and the Traveling Salesman Problem. Opportunities to use the example in multiple ways were discussed. The Excel workbook for the problem is available upon request from the author, as is a data/template file suitable for distribution with the case if the instructor desires.

6.0 References

Applegate, D.L.; Bixby, R.E.; Chvatal, V.; Cook, W.J. (2007). The Traveling Salesman Problem: A Computational Study. Princeton University Press.

Braekers, K.; Ramaekers, K.; Nieuwenhuyse, I.V. (2016). The Vehicle Routing Problem: State of the Art Classification and Review. Computers and Industrial Engineering, 99, 300-313.

Current, J.R.; Schilling, D.A. (1989). The Covering Salesman Problem. Transportation Science, 23, 208-213.

Golden, B.L.; Raghavan, S.; Wasil, E.A., eds. (2008). The Vehicle Routing Problem: Latest Advances and New Challenges. Springer.

Grinde, R. (2017). A Bi-Modal Routing Problem with Cyclical and One-Way Trips: Formulation and Heuristic Solution. Information Technology and Management Science, 20:79-84.

Gutin, G., ed. (2007). The Traveling Salesman Problem and Its Variations. Springer.

Jozefowiez, N.; Semet, F.; Talbi, E.G. (2008). Multi-Objective Vehicle Routing Problems. European Journal of Operational Research, 189:2, 293-309.

Levy, L.; Bodin, L. (1988). Scheduling the Postal Carriers for the United States Postal Service: An Application of Arc Partitioning and Routing. In Vehicle Routing: Methods and Studies, Golden, B.L.; Assad, A.A., eds. North-Holland, Amsterdam, 359-394.

Microsoft. (2020). Using the Solver VBA Functions. Retrieved from https://docs.microsoft.com/en-us/office/vba/excel/concepts/functions/using-the-solver-vba-functions.

Smith, S.D.; Dickerman, M. eds. (2012). AMC White Mountain Guide With Maps, The Appalachian Mountain Club, 29th edition.

Appendix A. VBA Code for Model_Cover Sheet

Option Explicit

Sub BuildCoverModel()
    'First resets the Solver settings, then rebuilds the Solver model.

    'Reset Solver settings.
    SolverReset

    'Set up Solver model, using named ranges on Model_Cover sheet.
    SolverOk SetCell:=Range("TotalTime"), MaxMinVal:=2, ValueOf:=0, _
        ByChange:=Range("HikeChosen"), Engine:=2, EngineDesc:="Simplex LP"

    'Add the constraints.
    SolverAdd CellRef:=Range("HikeChosen"), Relation:=5, FormulaText:="Binary"
    SolverAdd CellRef:=Range("PeakCoveredActual"), Relation:=3, FormulaText:="PeakCoveredRequired"
    SolverAdd CellRef:=Range("TotalTime"), Relation:=1, FormulaText:="Max_Allowed_Hiking_Time"
    SolverAdd CellRef:=Range("Number_Hikes_Chosen"), Relation:=3, FormulaText:="Min_Number_Hikes"
    SolverAdd CellRef:=Range("Number_Hikes_Chosen"), Relation:=1, FormulaText:="Max_Number_Hikes"
    SolverAdd CellRef:=Range("Number_Chain_Hikes"), Relation:=3, FormulaText:="Min_Chain_Hikes"
    SolverAdd CellRef:=Range("Number_Chain_Hikes"), Relation:=1, FormulaText:="Max_Chain_Hikes"
End Sub

Sub BuildCoverModel_Distance()
    'First resets the Solver settings, then rebuilds the Solver model.

    'Reset Solver settings.
    SolverReset

    'Set up Solver model, using named ranges on Model_Cover sheet.
    SolverOk SetCell:=Range("Total_Miles"), MaxMinVal:=2, ValueOf:=0, _
        ByChange:=Range("HikeChosen"), Engine:=2, EngineDesc:="Simplex LP"

    'Add the constraints.
    SolverAdd CellRef:=Range("HikeChosen"), Relation:=5, FormulaText:="Binary"
    SolverAdd CellRef:=Range("PeakCoveredActual"), Relation:=3, FormulaText:="PeakCoveredRequired"
    SolverAdd CellRef:=Range("TotalTime"), Relation:=1, FormulaText:="Max_Allowed_Hiking_Time"
    SolverAdd CellRef:=Range("Number_Hikes_Chosen"), Relation:=3, FormulaText:="Min_Number_Hikes"
    SolverAdd CellRef:=Range("Number_Hikes_Chosen"), Relation:=1, FormulaText:="Max_Number_Hikes"
    SolverAdd CellRef:=Range("Number_Chain_Hikes"), Relation:=3, FormulaText:="Min_Chain_Hikes"
    SolverAdd CellRef:=Range("Number_Chain_Hikes"), Relation:=1, FormulaText:="Max_Chain_Hikes"
End Sub

Sub SolveCoverModel()
    'Run Solver. Don't display dialog box, but do error checking and report to user if there is a problem.
    Dim SolverResult As Integer

    'Run Solver. The UserFinish:=True suppresses the final dialog box.
    SolverResult = SolverSolve(UserFinish:=True)

    'Check different result codes; see https://www.solver.com/excel-solver-solversolve-function for explanation of codes.
    If (SolverResult = 4) Then
        MsgBox ("Solution Error. Problem is Unbounded")
    ElseIf (SolverResult = 5) Then
        MsgBox ("Solution Error. Problem is Infeasible")
    ElseIf (SolverResult >= 6) Then
        MsgBox ("Problem in model or solution process.")
    Else
        MsgBox ("Solver successful. Click to continue.")
    End If
End Sub

Appendix B. VBA Code for Model_Sequence Sheet

Option Explicit

Sub ResetIndexSequence()
    'Orders the IndexSequence from 1...N, where N is the total number of hikes defined in the workbook.
    Dim i As Integer

    For i = 1 To Range("TotalTrips").Value
        Range("IndexSequence").Cells(i) = i
    Next i
End Sub

Sub RandomizeIndexSequence()
    'Randomly orders the IndexSequence of the n hikes currently selected. Note that n <= N, where N is the total
    'number of hikes defined in the workbook.

    'Force calculation of the Index_Randomizer sheet. This sheet is used as a support sheet only for VBA.
    Sheets("Index_Randomizer").Calculate

    'Transfers (i.e., copies) the values of the randomized index sequence into the IndexSequence range used in the model.
    Range("IndexSequence").Value = Sheets("Index_Randomizer").Range("RandIndex").Value
End Sub

Sub SolveModel()
    'Run Solver. Don't display dialog box, but do error checking and report to user if there is a problem.
    Dim SolverResult As Integer

    'Run Solver. The UserFinish:=True suppresses the final dialog box.
    SolverResult = SolverSolve(UserFinish:=True)

    'Check different result codes; see https://www.solver.com/excel-solver-solversolve-function for explanation of codes.
    If (SolverResult = 4) Then
        MsgBox ("Solution Error. Problem is Unbounded")
    ElseIf (SolverResult = 5) Then
        MsgBox ("Solution Error. Problem is Infeasible")
    ElseIf (SolverResult >= 6) Then
        MsgBox ("Problem in model or solution process.")
    Else
        MsgBox ("Click to show best solution found.")
    End If
End Sub

Sub BuildSolverModel()
    'First resets the Solver settings, then rebuilds the Solver model.
    'Each variable is typed explicitly; in "Dim a, b As Integer" only b would be an Integer.
    Dim StartRow As Integer, StartCol As Integer, LastRow As Integer
    Dim StartCell As Range

    'Reset Solver settings.
    SolverReset

    'StartCell is the first cell of the IndexSequence array (1 dimensional column). Get the row and column numbers.
    Set StartCell = Range("IndexSequenceAnchor")
    StartRow = StartCell.Row
    StartCol = StartCell.Column

    'Compute the LastRow of the IndexSequence that will be used for this model, corresponding to n selected trips.
    LastRow = StartCell.Row + Range("SelectedTrips").Value - 1

    'Set the objective cell, optimization direction (min), the decision variable cells, and the solver engine.
    SolverOk SetCell:=Range("InterHikeDistance"), MaxMinVal:=2, ValueOf:=0, _
        ByChange:=Range(StartCell, Cells(LastRow, StartCol)), _
        Engine:=3, EngineDesc:="Evolutionary"

    'Add the constraint that the decision variables must be a permutation of 1...n.
    SolverAdd CellRef:=Range(StartCell, Cells(LastRow, StartCol)), Relation:=6, FormulaText:="AllDifferent"
End Sub

Sub MultipleStarts()
    'Using a user-specified number of random starting solutions, run Solver that many times, and keep the best solution.
    Dim i As Integer, NumStarts As Integer
    Dim SolverResult As Integer
    Dim BestFound As Double
    Dim MilliSecond As Double

    'Compute a MilliSecond, as a fraction of a day; used in Wait procedure to allow screen to update.
    MilliSecond = (1#) / (24# * 60# * 60# * 1000#) ' # character forces floating point calculation to avoid overflow.

    'Initialize iteration counter and best found objective.
    i = 0
    BestFound = 1E+20

    'Record these values in the spreadsheet as a type of progress meter for the user.
    Range("Current_Run_Number").Value = i
    Range("Best_Objective").Value = BestFound

    'Wait for a bit before proceeding; gives the screen a chance to update.
    Application.Wait (Now + 1000 * MilliSecond)

    'NumStarts is the number of starting random index sequences.
    NumStarts = Range("NumStartingSolutions").Value

    For i = 1 To NumStarts
        'Update the current run number in the worksheet.
        Range("Current_Run_Number").Value = i
        Application.Wait (Now + 1000 * MilliSecond)

        'Randomize the Index Sequence.
        RandomizeIndexSequence

        'Solve the problem using this starting solution.
        SolverResult = SolverSolve(UserFinish:=True)

        'Check to see if the solution process was successful (result code <= 3); see
        'https://www.solver.com/excel-solver-solversolve-function for all possible result codes.
        If (SolverResult <= 3) Then
            'Solution process was successful.
            'Check to see if a new best solution found.
            If (Range("InterHikeDistance").Value < BestFound) Then
                'Update the best objective found, and make a copy of the best sequence so far.
                BestFound = Range("InterHikeDistance").Value
                Sheets("VBA_Support_Sheet").Range("Best_Index_Sequence").Value = _
                    Sheets("Model_Sequence").Range("IndexSequence").Value
                'Update the best objective found on the worksheet.
                Range("Best_Objective").Value = BestFound
                Application.Wait (Now + 1000 * MilliSecond)
            End If
        End If
    Next i

    'Now copy over the best found sequence (because of automatic recalculation, the objective will also update).
    Sheets("Model_Sequence").Range("IndexSequence").Value = _
        Sheets("VBA_Support_Sheet").Range("Best_Index_Sequence").Value

    'All done!
    MsgBox ("Click to see best solution found.")
End Sub

Appendix C. Hiker Case, Part A

Chris and Dana are avid hikers and want to climb some of the peaks in the White Mountains of New Hampshire (USA). In particular, they want to hike some or all of the mountains reaching at least 4000 feet above sea level. There are 48 such peaks, ranging from Mount Washington at 6288 feet to Mount Tecumseh at 4003 feet. People climbing all 48 peaks become members of the “4000-footer club.”

Desiring to make the most of their time, they wish to be able to plan their hiking excursions with the ultimate goal of becoming members of the club. An excursion would last one or more days, and include one or more hikes, with each hike visiting one or more peaks. In order to do this, Chris and Dana have collected information about the peaks, the trails, and the trailheads (parking areas) from various sources online and in books. From these, they have identified a number of different hiking trips that climb one or more mountains on each trip. They envision that some of their excursions will be one day, and others multiple days. In fact, they would like to be able to plan for any length of time and number of peaks in a single excursion. For example, they might leave from their home and do a hike that climbs one or more peaks (which may include multi-day hikes involving backcountry camping). Then they would camp, and on the next day perform another hike; this might be repeated. Some towns are within the overall area, where food and supplies are available. On such an excursion, Chris and Dana want to be mindful of the estimated total hiking time, as well as the driving distance between hikes. So both the selection of hikes and the sequencing of those hikes are important. That is, once the hikes are selected, the order in which they perform the hikes will influence how much driving is required. Campsites are available either close to the trailheads (parking areas), or along the driving routes. If an individual hike takes longer than a day to complete (some peaks are rather remote, and it makes sense to do some long hiking trips to visit multiple peaks), they will camp in the backcountry, breaking up the hiking into two or more days.

The hikers would like to organize the information they have gathered and create a tool they can use to a) select hikes, and b) sequence the hikes. While selecting the hikes, they want to know which peaks are visited by the hikes and keep track of the total estimated hiking time. When they sequence the hikes, they want to order the hikes while keeping track of the total driving distance. They have started to organize their information in a spreadsheet, but are at a point where they need help in taking the next step in developing a decision tool they can use to plan their excursions.

Table C.1 lists the 48 peaks above 4000 feet; Figure C.1 shows these on a map. Using online tools and books, Chris and Dana developed 42 hiking trips, starting and ending at trailheads (parking nodes). Each trip visits one or more peaks. They estimated the hiking time in hours for each hike, the distance in miles, the peaks visited, and the starting and ending parking nodes. Hikes can start and end at the same parking node (circuit trips), or they can start and end at different nodes (chain trips or traverses). Chain trips are often useful on hikes that visit several peaks, as these peaks commonly lie along the same ridge, so one-way hikes are natural. However, performing a chain trip requires more complex vehicle logistics. Two cars are required, which obviously increases the combined driving distance. In addition, shuttling of the vehicles is required. Figure C.2 shows the logistics for completing a chain trip. First, both cars drive to the end of the hike. One car drives to the beginning of the hike, and Chris and Dana then perform the hike. Reaching the end of the hike, they drive the remaining car to the beginning of the hike, reuniting both cars. Then they drive to their next hike (or home).
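The chain-trip logistics described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the workbook's logic: the function names are hypothetical, the two distances are copied from Table C.3, and within-hike driving is counted in car-miles as the two single-car shuttle legs (Steps 2 and 4 of Figure C.2), both of which run from the end node to the start node.

```python
# A few driving distances (miles) between parking nodes, copied from Table C.3.
DIST = {
    (14, 15): 21, (15, 14): 21,
    (3, 6): 3, (6, 3): 3,
}


def within_hike_driving(start, end, dist):
    """Car-miles driven as part of one hike (hypothetical helper).

    Circuit trips (start == end) need no shuttling. For a chain trip, one car
    drives from the end node to the start node before the hike (Step 2) and
    the other car makes the same drive after the hike (Step 4).
    """
    if start == end:
        return 0
    return 2 * dist[(end, start)]


def entry_and_exit_nodes(start, end):
    """Node where both cars arrive (Step 1) and node they leave from (Step 5)."""
    return end, start


# Hike 18 (Osceolas to Tripoli Rd) starts at node 14 and ends at node 15.
shuttle_miles = within_hike_driving(14, 15, DIST)
print(shuttle_miles)  # 42
```

Under these assumptions, a chain trip is entered at its end node and left from its start node, which matters when the between-hike driving is added up for a sequence of hikes.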

Table C.1. New Hampshire 4000-foot Peaks
Mountain Peaks: NH 4000-footers

Index  Name               Elevation (ft)
1      Washington         6288
2      Adams              5774
3      Jefferson          5712
4      Monroe             5384
5      Madison            5367
6      Lafayette          5249
7      Lincoln            5089
8      South Twin         4909
9                         4832
10     Moosilauke         4802
11     Eisenhower         4780
12     North Twin         4761
13     Bond               4698
14     Carrigain          4700
15     Middle Carter      4610
16     West Bond          4540
17     Garfield           4500
18     Liberty            4459
19     South Carter       4430
20     Wildcat A          4422
21     Hancock            4420
22     South Kinsman      4358
23     Osceola            4340
24     Flume              4328
25     Field              4340
26     Pierce             4310
27     Willey             4285
28     North Kinsman      4293
29     South Hancock      4319
30     Bondcliff          4265
31     Zealand            4260
32     Cabot              4170
33     East Osceola       4156
34     North Tripyramid   4180
35     Middle Tripyramid  4140
36     Cannon             4100
37     Passaconaway       4043
38     Hale               4054
39     Jackson            4052
40     Moriah             4049
41     Tom                4051
42     Wildcat E          4070
43     Owl's Head         4025
44     Galehead           4024
45     Whiteface          4020
46     Waumbek            4006
47     Isolation          4004
48     Tecumseh           4003

Figure C.1. Map Showing 4000-foot Peaks

[Chain trip illustration: parking areas P1 (begin of hike) and P2 (end of hike), joined by a road and by a trail over the peaks, with labels C1, C2, and C3. Legend: Road, Trail, Parking Area, Peak. Step 1: Drive 2 cars from previous hike. Step 2: Drive 1 car to beginning of hike. Step 3: Hike. Step 4: Drive 1 car to beginning of hike. Step 5: Drive 2 cars to next hike.]

Figure C.2. Logistics of Chain Trip

Table C.2 shows the 42 hiking trips Chris and Dana developed. It shows the name of the hike, the estimated time in hours, the distance, the starting and ending nodes (the same for circuit hikes; different for chain trips), and the peaks visited. Table C.3 is a table of driving distances (in miles) between the parking nodes.

Table C.2. Defined Hiking Trips
(Type: 1 = chain trip, start and end nodes differ; 0 = circuit trip, start and end nodes are the same)

Hike_ID  Hike_Name                                      Hours  Miles  S_Node  E_Node  Type  Peaks Climbed By Hike
1        Presidentials to Dolly Copp Rd                 16.58  22.7   3       4       1     1, 2, 3, 4, 5, 11, 26, 39
2        Presidentials from Dolly Copp Rd               16.92  22.7   4       3       1     1, 2, 3, 4, 5, 11, 26, 39
3        Presidentials to Appalachia Parking            16.58  22.7   3       5       1     1, 2, 3, 4, 5, 11, 26, 39
4        Presidentials from Appalachia Parking          16.92  22.7   5       3       1     1, 2, 3, 4, 5, 11, 26, 39
5        Willey Range to Willey House                   6.17   8.5    3       6       1     25, 27, 41
6        Willey Range to Crawford Depot                 6.17   8.5    6       3       1     25, 27, 41
7        Mt. Carrigain                                  6.75   10     7       7       0     14
8        Mt. Waumbek                                    5.08   7.2    8       8       0     46
9        Mt. Cabot                                      4.92   7.8    9       9       0     32
10       Moriah, Carter Range, Wildcats                 15.08  19.3   10      11      1     9, 15, 19, 20, 40, 42
11       Wildcats, Carter Range, and Moriah             14.33  19.3   11      10      1     9, 15, 19, 20, 40, 42
12       Mt. Isolation and Wildcat E                    12.33  15.8   11      11      0     42, 47
13       Mt. Isolation                                  8.75   12     11      11      0     47
14       Moriah and Carter Range                        13.58  19.6   10      12      1     9, 15, 19, 20, 40
15       Carter Range and Moriah                        13.25  19.4   12      10      1     9, 15, 19, 20, 40
16       Mt. Hancock and South Hancock                  6      9.8    13      13      0     21, 29
17       Osceolas and back to Kancamagus                4.67   6.6    14      14      0     23, 33
18       Osceolas to Tripoli Rd                         4      7      14      15      1     23, 33
19       Osceolas to Kancamagus                         4.75   7      15      14      1     23, 33
20       Osceolas and back to Tripoli Rd                5.58   8.4    15      15      0     23, 33
21       Mt. Moosilauke via Beaver Brook Trail          5.08   6.8    16      16      0     10
22       Mt. Moosilauke via Gorge Brook Trail           4.92   7.4    17      17      0     10
23       Mt. Cannon and back to Tramway                 2.92   4.4    18      18      0     36
24       Cannon and Kinsmans                            8.17   11.9   18      19      1     22, 28, 36
25       Kinsmans and Cannon                            8.42   11.9   19      18      1     22, 28, 36
26       Flume and Liberty                              9.17   12.7   20      19      1     6, 7, 18, 24
27       Liberty and Flume                              8.75   12.7   19      20      1     6, 7, 18, 24
28       All the Franconia Range to Hale                36.5   53.1   20      21      1     6, 7, 8, 12, 13, 16, 17, 18, 24, 30, 31, 38,
29       All the Franconia Range to Flume               34.58  53.2   21      20      1     6, 7, 8, 12, 13, 16, 17, 18, 24, 30, 31, 38,
30       Eastern Franconias to Hale                     31.33  45.7   22      21      1     8, 12, 13, 16, 17, 30, 31, 38, 43, 44
31       Eastern Franconias to Garfield                 29.67  45.8   21      22      1     8, 12, 13, 16, 17, 30, 31, 38, 43, 44
32       Osceolas, Tecumseh, Hancocks                   13.5   21.2   15      25      1     21, 23, 29, 33, 48
33       Tecumseh, Osceolas, Hancocks                   14.25  21.2   25      15      1     21, 23, 29, 33, 48
34       Willey Range back to Crawford Depot            10     16     3       3       0     25, 27, 39, 41
35       Tecumseh and back to Tripoli                   4.42   6.2    15      15      0     48
36       Tecumseh to Waterville Valley                  4.08   5.3    15      25      1     48
37       Tecumseh back to Waterville Valley             3.5    4.4    23      23      0     48
38       Tecumseh to Tripoli Rd                         3.83   5.3    23      15      1     48
39       Tripyramids, Whiteface, Passaconaway           15.25  24.1   23      23      0     34, 35, 37, 45
40       Passaconaway, Whiteface, Tripyramids           15     21.2   24      24      0     34, 35, 37, 45
41       Most Presidentials to Dolly Copp Rd            15.17  20.4   3       4       1     1, 2, 3, 4, 5, 11, 26
42       Most Presidentials to Appalachia Parking Area  15.17  20.4   3       5       1     1, 2, 3, 4, 5, 11, 26

Table C.3. Driving Distances Between Parking Nodes (Node 1 is Home Location)
Matrix of Driving Distances (miles) Between Nodes (Parking Areas)

Nodes   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20  21  22  23  24  25
    1   0  39  98 100 105  95  92 114 119 105  92  97 106 107 112 114 119 117 115 113 107 115 118  68 116
    2  39   0 101 122 116 104 113 107 112 124 130 125  79  78  73  75  80  78  76  74  98  86  79 107  77
    3  98 101   0  38  32   3  12  24  29  36  29  35  44  43  44  40  45  23  25  27   9  17  50  52  48
    4 100 122  38   0   6  34  31  15  20   6   8   3  57  58  65  61  66  44  46  48  42  38  71  54  69
    5 105 116  26   6   0  35  37   9  14   8  14   9  59  58  59  55  60  38  40  42  36  32  65  60  63
    6  95 104   3  34  29   0   9  27  32  37  27  32  47  46  47  43  48  26  28  30  12  20  53  49  51
    7  92 113  12  31  37   9   0  36  41  37  24  29  49  50  56  52  57  35  37  39  21  29  62  46  60
    8 114 107  24  15   9  27  36   0   5  17  23  18  50  49  50  46  51  29  31  33  21  23  56  69  54
    9 119 112  29  20  14  32  41   5   0  22  28  23  55  54  55  51  56  34  36  38  26  28  61  74  59
   10 105 124  36   6   8  37  37  17  22   0  14   9  63  64  67  63  68  46  48  50  44  40  73  60  71
   11  92 130  29   8  14  27  24  23  28  14   0   5  49  50  71  67  72  52  54  56  39  46  77  46  75
   12  97 125  35   3   9  32  29  18  23   9   5   0  54  55  68  64  69  47  49  51  44  41  74  51  72
   13 106  79  44  57  59  47  49  50  55  63  49  54   0   1  22  18  23  21  19  17  41  29  28  49  26
   14 107  78  43  58  58  46  50  49  54  64  50  55   1   0  21  17  22  20  18  16  40  28  27  50  25
   15  12  73  44  65  59  47  56  50  55  67  71  68  22  21   0  18  23  21  19  17  41  29   6  71   4
   16 114  75  40  61  55  43  52  46  51  63  67  64  18  17  18   0  13  17  15  13  37  25  24  67  22
   17 119  80  45  66  60  48  57  51  56  68  72  69  23  22  23  13   0  22  20  18  42  30  29  72  27
   18 117  78  23  44  38  26  35  29  34  46  52  47  21  20  21  17  22   0   2   4  20   8  27  70  25
   19 115  76  25  46  40  28  37  31  36  48  54  49  19  18  19  15  20   2   0   2  22  10  26  68  24
   20 113  74  27  48  42  30  39  33  38  50  56  51  17  16  17  13  18   4   2   0  24  12  23  66  21
   21 107  98   9  42  36  12  21  21  26  44  39  44  41  40  41  37  42  20  22  24   0  14  47  61  45
   22 115  86  17  38  32  20  29  23  28  40  46  41  29  28  29  25  30   8  10  12  14   0  35  72  33
   23 118  79  50  71  65  53  62  56  61  73  77  74  28  27   6  24  29  27  26  23  47  35   0  77   4
   24  68 107  52  54  60  49  46  69  74  60  46  51  49  50  71  67  72  70  68  66  61  72  77   0  75
   25 116  77  48  69  63  51  60  54  59  71  75  72  26  25   4  22  27  25  24  21  45  33   4  75   0

Chris and Dana approach you to help them develop a tool that organizes this information in a useful way and allows them to select one or more hiking trips, seeing which peaks will be visited along with the total hiking time and distance. Some of your modeling expertise is needed to connect the related data and to perform the needed calculations. They also want to be able to sequence their selected hikes so they can see how much driving is required. They remind you that on chain trips, two cars are required, along with some driving (car shuttling) as part of the hiking process. The tool should show the between-hike driving (i.e., driving between home and the hikes, and between hikes if they select multiple hikes), and the within-hike driving needed for the chain hikes. Their primary concern is selecting hikes with an eye toward the peaks visited and the hiking time. But making the excursion as efficient as possible from a driving perspective also factors into their decision. Ultimately, they want to use the tool to experiment with different possibilities.
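As a rough illustration of the bookkeeping such a tool performs, the Python sketch below totals hiking time and distance and lists the peaks visited for a chosen set of hikes. The four hikes are copied from Table C.2; the dictionary layout and the function name `summarize` are assumptions made for this example and are not part of the case materials.

```python
# A small extract of Table C.2: hike_id -> hours, miles, and peaks climbed.
HIKES = {
    16: {"hours": 6.0, "miles": 9.8, "peaks": {21, 29}},
    17: {"hours": 4.67, "miles": 6.6, "peaks": {23, 33}},
    32: {"hours": 13.5, "miles": 21.2, "peaks": {21, 23, 29, 33, 48}},
    35: {"hours": 4.42, "miles": 6.2, "peaks": {48}},
}


def summarize(selected, hikes):
    """Return (sorted peaks visited, total hours, total miles) for a selection."""
    peaks = set()
    for hike_id in selected:
        peaks |= hikes[hike_id]["peaks"]
    hours = sum(hikes[hike_id]["hours"] for hike_id in selected)
    miles = sum(hikes[hike_id]["miles"] for hike_id in selected)
    # Round to the precision used in Table C.2 to hide floating-point noise.
    return sorted(peaks), round(hours, 2), round(miles, 1)


peaks, hours, miles = summarize([16, 17, 35], HIKES)
print(peaks, hours, miles)
```

Using a set for the peaks means a peak climbed by two selected hikes is reported once, which is the view the hikers need when checking progress toward all 48 peaks.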

Application Notes

The context of this case is a hiking situation. But the models that are used to solve this case have wide application in business, especially in transportation, logistics, and operations.

The problem of selecting which hikes to perform, trying to minimize hiking time while making sure that all peaks are visited, is called the Set Covering Problem in Operations Research. The same structure arises in applications such as shift scheduling of personnel. Consider a number of different overlapping shift types (defined by starting and ending times), and requirements for each time period (e.g., each hour requires differing numbers of personnel). The objective might be to minimize the total number of personnel needed, or to minimize the total cost of covering all the requirements. Other personnel applications exist, such as selecting the smallest group of people, where each person has differing skills and every skill must be represented in the chosen group. Set Covering Problems can also be used to identify an efficient sample, where the sample as a whole must satisfy certain characteristics.
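To make the Set Covering idea concrete, the sketch below enumerates every combination of five hikes from Table C.2 and keeps the cheapest one (in hiking hours) that climbs four required peaks. It is a brute-force illustration under stated assumptions, not the Solver-based approach used in the paper: the function name `min_time_cover` and the choice of this small subset are hypothetical, and full enumeration would not be practical for all 42 hikes.

```python
from itertools import combinations

# Extract of Table C.2: hike_id -> (hours, peaks climbed).
HIKES = {
    21: (5.08, {10}),
    22: (4.92, {10}),
    23: (2.92, {36}),
    24: (8.17, {22, 28, 36}),
    25: (8.42, {22, 28, 36}),
}
REQUIRED = {10, 22, 28, 36}  # Moosilauke, South Kinsman, North Kinsman, Cannon


def min_time_cover(hikes, required):
    """Try every subset of hikes; keep the least-hours subset covering all peaks."""
    best_ids, best_hours = None, float("inf")
    ids = sorted(hikes)
    for size in range(1, len(ids) + 1):
        for combo in combinations(ids, size):
            covered = set().union(*(hikes[h][1] for h in combo))
            hours = sum(hikes[h][0] for h in combo)
            if required <= covered and hours < best_hours:
                best_ids, best_hours = combo, hours
    return best_ids, best_hours


chosen, total_hours = min_time_cover(HIKES, REQUIRED)
print(chosen, round(total_hours, 2))
```

For this subset, the selection is hikes 22 and 24; hike 23 is never needed because Cannon (peak 36) is already climbed on the Kinsman traverses.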

The hike sequencing problem, that is, determining the sequence of hikes that minimizes driving distance, is a form of the Traveling Salesman Problem (TSP), probably the most famous problem in Operations Research. In its most basic form, the TSP consists of a network of cities with known distances between them. The goal is to find the route that visits each city once and returns to the starting city while minimizing total distance. This problem has wide applicability in transportation, logistics, vehicle routing and scheduling, etc. UPS’ Orion system (see https://www.ups.com/us/en/services/knowledge-center/article.page?kid=aa3710c2, for example) is essentially a large-scale vehicle routing problem, which breaks down to a TSP for each delivery van. In Operations Management, determining the best sequence of jobs, where setup times differ depending on the sequence (this is true in most job shop situations), essentially becomes a TSP. More broadly, network optimization models are used widely in airline and crew scheduling, and in other industries. There is even a movie named after the TSP, Travelling Salesman (see https://www.imdb.com/title/tt1801123/?ref_=nv_sr_srsg_1)!
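The basic TSP can be illustrated on a tiny instance taken from the case data. The sketch below assumes the hikers have chosen the three circuit hikes 7, 8, and 9 (parking nodes 7, 8, and 9 in Table C.2) and tries every visiting order starting and ending at home (node 1), using distances from Table C.3. The function name `best_sequence` is hypothetical, distances are counted once rather than per car, and exhaustive enumeration is only workable for a handful of stops; it stands in for, and is not, the evolutionary Solver approach described in the paper.

```python
from itertools import permutations

# Driving distances (miles) among home (node 1) and parking nodes 7, 8, 9,
# copied from Table C.3.
DIST = {
    1: {7: 92, 8: 114, 9: 119},
    7: {1: 92, 8: 36, 9: 41},
    8: {1: 114, 7: 36, 9: 5},
    9: {1: 119, 7: 41, 8: 5},
}
HOME = 1


def best_sequence(stops, dist, home=HOME):
    """Enumerate all orders of the stops and return the shortest round trip."""
    best_route, best_miles = None, float("inf")
    for order in permutations(stops):
        route = (home, *order, home)
        miles = sum(dist[a][b] for a, b in zip(route, route[1:]))
        if miles < best_miles:
            best_route, best_miles = route, miles
    return best_route, best_miles


route, miles = best_sequence([7, 8, 9], DIST)
print(route, miles)
```

Here the shortest round trip is 252 miles. More than one order attains it, which is common in small instances; the sketch simply returns the first one found.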

The combination of the Set Covering Problem and the TSP makes this specific case less common. But something quite close to the hiker problem is a postal carrier problem in an urban setting, where a postal carrier parks the vehicle, proceeds to deliver mail on foot, and then returns to the vehicle to drive to the next parking spot. In that problem, the best parking spots need to be determined as well as the customers to visit on each walking tour. Another example is trip planning by a combination of airplane and car, where a number of different cities, attractions, or customers are to be visited. After flying to an airport and renting a car, the car can be returned to the same airport after visiting some attractions (like a circuit hike), or can be returned to a different airport (like a chain hike). If there are numerous attractions to visit and numerous airports, identifying the best selection of trips and their sequence becomes somewhat complex, similar to the selection and sequencing of the hikes.

Appendix D. Hiker Case, Part B

Hiking all 48 peaks above 4000 feet is a goal for Chris and Dana. Following their graduation, each has a couple of months before starting a position in their respective fields. Knowing this may be the last opportunity they have for a significant block of time together, they decide to undertake the challenge of hiking all 48 peaks in the shortest time possible, as a single excursion. Once they depart from their respective homes, they will not return until after climbing all 48 peaks. They will perform a number of hiking trips, probably both circuit and chain trips. Between hikes, and during them as necessary, they will camp, and they will replenish supplies as convenient at stores in the general area of the mountains.

Their primary objective is to hike all 48 peaks, and in so doing, minimize their hiking time. They see this as the guiding principle in selecting the specific hikes to perform. Once they have determined the hikes, they need to sequence the hikes in order to minimize the total driving distance. Driving distance should consider that there will be two cars, and that for chain trips, some within-trip driving is required.

Chris and Dana approach you again for help. They would like an enhancement of the decision tool provided earlier (Part A of the case), so that the tool can automatically identify the best set of hiking trips. Once that is accomplished, they would like to use the tool to find the best sequence of trips.
