An evidential answer for the capacitated vehicle routing problem with uncertain demands

Une réponse évidentielle pour le problème de tournée de véhicules avec contrainte de capacité et demandes incertaines

THÈSE

Présentée et soutenue publiquement le 20 Décembre 2017 en vue de l’obtention du Doctorat de l’Université d’Artois Spécialité : Génie Informatique et Automatique

par Nathalie HELAL

Composition du jury :

M. Didier DUBOIS Directeur de Recherche CNRS, Université Rapporteur Paul Sabatier Toulouse M. Arnaud MARTIN Professeur, Université de Rennes 1 Rapporteur M. Sébastien DESTERCKE Chargé de Recherche CNRS HDR, Université Examinateur de Technologie de Compiègne Mme. Laetitia JOURDAN Professeur, Université de Lille 1 Examinatrice Mme. Caroline THIERRY Professeur, Université Toulouse 2 Le Mirail Examinatrice M. David MERCIER Maître de Conférences HDR, Université d’Artois Invité M. Frédéric PICHON Maître de Conférences, Université d’Artois Co-encadrant M. Daniel PORUMBEL Maître de Conférences, CNAM Paris Co-encadrant M. Éric LEFÈVRE Professeur, Université d’Artois Directeur de thèse

Thèse préparée au Laboratoire de Génie Informatique et d’Automatique de l’Artois Université d’Artois Faculté des sciences appliquées Technoparc Futura 62 400 Béthune

Abstract

The capacitated vehicle routing problem is an important combinatorial optimisation problem that has generated a large body of research over the past sixty years. Its objective is to find a set of routes of minimum cost, such that a fleet of vehicles initially located at a depot service the deterministic de- mands of a set of customers, while respecting capacity limits of the vehicles. Still, in many real life applications, we are faced with uncertainty on customer demands. Most of the research papers that handled this situation, assumed that customer demands are random variables. In this thesis, we pro- pose to represent uncertainty on customer demands using evidence theory - an alternative uncertainty theory that extends the probabilistic representation of uncertainty and offers the advantage of managing epistemic uncertainty more faithfully. To tackle the resulting optimisation problem, we extend classical stochastic programming modelling approaches. Specifically, we propose two models for this problem. The first model is an extension of the chance-constrained programming approach, which imposes cer- tain minimum bounds on the belief and plausibility that the sum of the demands on each route respects the vehicle capacity. The second model extends the stochastic programming with recourse approach: it represents by a belief function for each route the uncertainty on its recourses, i.e., corrective actions performed when the vehicle capacity is exceeded, and defines the cost of a route as its classical cost (without recourse) plus the worst expected cost of its recourses. Some properties of these two models are studied. A algorithm is adapted to solve both models and is experimentally tested.

Keywords : Vehicle routing problem, Uncertain demands, Chance constrained programming, Stochas- tic programming with recourse, Evidence theory.

Résumé

Le problème de tournées de véhicules avec contrainte de capacité est un problème important en optimisation combinatoire, qui a généré un grand nombre de travaux de recherche au cours des soixante dernières années. L’objectif du problème est de déterminer l’ensemble de routes, nécessaire pour servir les demandes déterministes des clients ayant un cout minimal, tout en respectant la capacité limite des véhicules. Cependant, dans de nombreuses applications sur des cas réels, nous sommes confrontés à des incertitudes sur les demandes des clients. La plupart des travaux qui ont traité ce problème ont supposé que les demandes des clients étaient des variables aléatoires. Nous nous proposons dans cette thèse de représenter l’incertitude sur les demandes des clients dans le cadre de la théorie de l’évidence - un formalisme alternatif pour modéliser les incertitudes qui généralise la représentation probabiliste des incertitudes et offre l’avantage de gérer l’incertitude épistémique fidèlement. Pour résoudre le problème d’optimisation qui résulte, nous généralisons les approches de modélisation classiques en programma- tion stochastique. Précisément, nous proposons deux modèles pour ce problème. Le premier modèle, est une extension de l’approche chance-constrainedprogramming , qui impose des bornes minimales pour la croyance et la plausibilité que la somme des demandes sur chaque route respecte la capacité des véhicules. Le deuxième modèle étend l’approche stochastic programming with recourse: l’incertitude sur les recours possibles sur chaque route est représentée par une fonction de croyance, i.e., des actions correctives sont effectuées lorsque la capacité limite des véhicules est dépassée, et le coût d’une route est alors son coût classique (sans recours) additionné du pire coût espéré des recours. Certaines pro- priétés de ces deux modèles sont étudiées. Un algorithme de recuit simulé est adapté pour résoudre les deux modèles et est testé expérimentalement.

Mots-clés : Problème de tournées de véhicules, Demandes incertaines, “Chance constrained program- ming”, “Stochastic programming with recourse”, Théorie de l’évidence.

Remerciements Remerciements

Je voudrais exprimer ma profonde reconnaissance auprès de toutes les personnes qui ont participé de près ou de loin à la réalisation de ce travail de thèse. Mes premiers remerciements vont à l’ensemble de mes encadrants sans qui cette thèse n’aurait pas pu voir le jour. Je tiens à les remercier pour avoir accepté ma candidature sur ce sujet de thèse et pour la confiance qu’ils m’ont accordée. Je les remercie également de m’avoir accompagnée de très près durant ces trois années avec patience et justesse. Je re- tiens nos réunions de travail enrichissantes et la transmission d’un précieux savoir. À mon directeur de thèse, Éric Lefèvre, pour son soutien, sa considération et ses encoura- gements. Je le remercie vivement pour son suivi, son écoute, ses nombreux conseils et sa précision. J’ai beaucoup apprécié sa disponibilité et son aide proactive. À Frédéric Pichon, pour avoir encadré ma thèse avec beaucoup de rigueur et de précision. Je lui suis reconnaissante du temps qu’il a consacré à la réalisation de ce travail. Nos échanges scientifiques et ses conseils m’ont beaucoup apporté, ainsi que son aide et sa considération. J’ai également beaucoup apprécié nos conversations, son écoute, son naturel et sa bonne hu- meur. À mon encadrant Daniel Porumbel, qui a su se rendre disponible malgré la distance et faire le déplacement de Paris lorsque cela était nécessaire. Il a toujours répondu présent lorsque j’avais besoin d’aide. Je le remercie également pour ses idées et son optimisme. À David Mercier avec qui j’ai collaboré sur cette thèse et avec qui ce fût un plaisir de tra- vailler. Mes sincères remerciements à l’ensemble des membres du jury pour l’intérêt qu’ils ont porté à mes travaux. Je remercie tout particulièrement Monsieur Didier Dubois et Monsieur Arnaud Martin pour avoir accepté de rapporter ce mémoire et pour leurs suggestions. Je remercie chaleureusement l’ensemble des collègues du LGI2A, permanents, docto- rants, ingénieurs et stagiaires, pour leur bonne humeur et leur sympathie. Je remercie égale- ment Gilles Goncalves pour m’avoir accueillie au sein du laboratoire pour sa gentillesse et pour l’ambiance qu’il instaure au sein du laboratoire. Je remercie également Nathalie Mor- ganti pour son implication dans la levée des obstacles administratifs. Je tiens aussi à remercier tous les membres de l’IUT de Béthune et particulièrement le corps enseignants avec qui je travaille en qualité d’attaché temporaire d’enseignement et de recherche. Ils ont rendu cette nouvelle expérience agréable. Enfin, mes derniers remerciements vont à mes amis et ma famille pour leur soutien. Je remercie mes proches que j’ai rencontré en France. À Cécilia, pour être une super amie si agréable et gentille, pour son aide et son soutien. À Maxime, pour sa présence, pour m’avoir soutenu surtout pendant les moments durs, pour sa gentillesse et son grand coeur. À mes proches au Liban. À Titi (Roula), qui m’a toujours et quotidiennement soutenue malgré la distance, pour ses conseils, sa générosité et pour être une véritable soeur. À ma mère et mes soeurs, pour leur soutien, leur patience, leur amour inconditionnel qui m’ont aidée à surmonter toutes les difficultés. À mon père, tu as été mon ange gardien, toujours présent pour moi. Tu m’as guidée et aidée pendant tout ce temps malgré ton absence.

Table of contents

List of acronyms 9

List of tables 11

List of figures 12

Introduction 13

1 The Capacitated Vehicle Routing Problem 17 1.1 Introduction ...... 17 1.2 An Overview on the Vehicle Routing Problem ...... 18 1.3 The Capacitated Vehicle Routing Problem ...... 20 1.3.1 Problem formulation ...... 20 1.3.2 Problem variants ...... 22 1.3.3 Problem complexity ...... 23 1.4 Solving Methods ...... 23 1.4.1 Exact methods ...... 23 1.4.2 Heuristic based methods ...... 24 1.5 Conclusions ...... 28

2 The Capacitated Vehicle Routing Problem with Stochastic Demands 31 2.1 Introduction ...... 31 2.2 Modelling the CVRPSD by CCP ...... 32 2.3 Modelling the CVRPSD by SPR ...... 36 2.3.1 Recourse actions ...... 37 2.3.2 The expected penalty cost ...... 38 2.4 Modelling the CVRPSD using SPR or CCP ? ...... 43

7 2.5 Conclusions ...... 44

3 Evidence Theory 45 3.1 Introduction ...... 45 3.2 Representation of Information ...... 46 3.2.1 Mass function ...... 46 3.2.2 Belief and plausibility functions ...... 47 3.2.3 Informative content comparison ...... 50 3.2.4 Handling information on product spaces ...... 50 3.3 Combination of Information ...... 52 3.4 Uncertainty Propagation ...... 55 3.5 Expectations ...... 57 3.6 Conclusions ...... 58

4 A Belief-Constrained Programming Approach to the CVRPED 61 4.1 Introduction ...... 61 4.2 The CVRPED ...... 62 4.3 Modelling the CVRPED by BCP ...... 63 4.3.1 Formalisation ...... 64 4.3.2 Particular cases ...... 67 4.3.3 Influence of the model parameters on the optimal solution cost . . . . 68 4.3.4 Influence of customer demand specificity on the optimal solution cost 69 4.4 Solving the CVRPED Modelled by BCP ...... 72 4.4.1 The simulated annealing algorithm ...... 72 4.4.2 A configuration in the algorithm ...... 74 4.4.3 The CVRPED benchmarks ...... 76 4.4.4 Experimental study ...... 76 4.5 Conclusions ...... 77

5 A Recourse Approach to the CVRPED 82 5.1 Introduction ...... 82 5.2 Modelling the CVRPED by a Recourse Approach ...... 83 5.2.1 Formalisation ...... 83 5.2.2 Uncertainty on recourses ...... 85 9

5.2.3 Interval demands ...... 86 5.2.4 Particular cases ...... 91 5.2.5 Influence of customer demand specificity on the optimal solution cost 93 5.3 Solving the CVRPED Modelled by a Recourse Approach ...... 95 5.3.1 The simulated annealing algorithm ...... 95 5.3.2 A configuration in the algorithm ...... 95 5.3.3 The CVRPED benchmarks ...... 97 5.3.4 Experimental study ...... 97 5.4 Conclusions ...... 99

Conclusions and Future Work 100

A Heuristic Based Methods 102 A.1 ...... 102 A.2 Genetic Algorithms ...... 104 A.3 Swarm Intelligence Methods ...... 109

B Proofs of Chapter 5 112 B.1 Proof of Proposition 5.1 ...... 112 B.2 Proof of Lemma 5.1 ...... 114

Publications 116

Bibliography 118 List of acronyms

ACO Ant Colony Optimisation BCP Belief Constrained Programming CCP Chance Constrained Programming CVRP Capacitated Vehicle Routing Problem CVRPED Capacitated Vehicle Routing Problem with Evidential Demands CVRPSD Capacitated Vehicle Routing Problem with Stochastic Demands PSO Particle Swarm Optimisation SPR Stochastic Programming with Recourse TSP Traveling Salesman Problem VRP Vehicle Routing Problem

10 List of tables

2.1 Travel cost matrix TC of Example 2.3 ...... 40

3.1 Examples of special mass functions ...... 47

X X 3.2 Mass functions m1 and m2 ...... 53 X X 3.3 Performing the combination of m1 and m2 of Table 3.2 ...... 53 3.4 Mass function (mX↑X×Y ⊕ mY ↑X×Y ) of Example 3.6 ...... 55

4.1 Results of the simulated annealing algorithm for the BCP model using the CVRPED instances ...... 78 4.2 Results of the simulated annealing algorithm for the BCP model using the CVRPED+ instances ...... 79 4.3 Results of the simulated annealing algorithm applied to the BCP model for the CVRPED and CVRPED+ instances ...... 80

5.1 Travel cost matrix TC of Example 5.2 ...... 84 5.2 Travel cost matrix TC of Example 5.9 ...... 89 5.3 Travel cost matrix TC ...... 92 5.4 Results of the simulated annealing algorithm for the recourse model using the CVRPED and CVRPED+ instances ...... 98

11 List of figures

1.1 The skeleton of a simulated annealing algorithm ...... 27

2.1 Illustration of clients 1, 2, 3, 4 and the depot denoted by 0 for Example 2.1. . . 33 2.2 Illustration of a possible set of routes to serve the clients in Figure 2.1. . . . . 34 2.3 Illustration of another possible set of routes to serve the clients of Figure 2.1, than that in Figure 2.2...... 35

4.1 Illustration of a solution composed of two routes for the five customers of Example 4.3...... 65

5.1 Recourse tree constructed for Example 5.7 ...... 88 5.2 Recourse tree for 1; 5 × 3; 6 × 4; 5 of Example 5.9 ...... 90 J K J K J K 5.3 Recourse tree for 1; 3 × 2; 4 × 8; 9 of Example 5.9 ...... 90 J K J K J K A.1 A simple tabu search mechanism ...... 103 A.2 Two-point crossover generating two offspring from parents chromosomes for Example A.1 ...... 106 A.3 Mutation generating one child from a parent chromosome for Example A.2 . 107 A.4 The generic mechanism of a ...... 108 A.5 The mechanism for the ACO meta-heuristic ...... 110

12 Introduction

Vehicle routing problems [83] represent an important class of problems in operational research and combinatorial optimisation. A Vehicle Routing Problem (VRP) involves deter- mining the routes of a fleet of vehicles needed to serve customer demands. The objective is to minimize the sum of the route costs. This problem belongs to the class of local transportation or delivery problems affecting the most expensive component in a distribution network of a logistics system [11]. It represents a classic extension of the travelling salesman problem and is part of the class of NP-hard problems. An important variant of the VRP is the Capacitated Vehicle Routing Problem (CVRP), which is the most studied member in the VRP class. In the CVRP, the fleet of vehicles is located at one central depot, vehicles have identical capacity li- mits and capacity constraints are imposed so the sum of customer demands on each route does not exceed the capacity limit of vehicles. Being an NP-hard optimisation problem, the CVRP needs special solution techniques like meta-heuristics because of its combinatorial nature. Yet often in real life transportation problems, customer demands in a CVRP cannot be determined in advance and these demands are thus uncertain at the time vehicle routes must be planned. For instance, in garbage collection problems, the exact quantity of garbage to be collected cannot be known in advance. Similarly a lot of real life applications that can be modelled as a CVRP are faced with uncertainty on customer demands. Accordingly, a great number of authors [38] handled this issue by supposing that customer demands are random variables, which gave rise to a stochastic optimisation problem known as the Capacitated Vehicle Routing Problem with Stochastic Demands (CVRPSD). In the CVRPSD, probability distributions for stochastic demands are assumed available at the time of planning the routes. Probability theory has been a main choice for modelling uncertainties arising in opti- misation problems since the 1950s [19, 6]. This conducted to the development of stochas- tic programming approaches [9, 72]. Two of the most common stochastic programming ap- proaches are Chance Constrained Programming (CCP) and Stochastic Programming with Re- course (SPR). Modelling the CVRPSD via the CCP approach amounts to having a constraint stating that the probability that any route exceeds vehicle capacity limit, must be below a gi- ven (small) value. The CCP approach does not consider the additional cost of recourse (or corrective) actions necessary if capacity constraints fail to be satisfied. The SPR approach for the CVRPSD does consider situations needing recourses and it aims at minimizing the initially-planned travel cost plus the expected cost of the recourses executed along routes, e.g., returning to the depot and unloading in order to bring a violated capacity constraint back to feasibility. The probabilistic approach is well-suited to modelling aleatory uncertainty. However, in the case of epistemic uncertainty - uncertainty arising from lack of knowledge - a number of shortcomings have been identified when uncertain parameters are represented as random

13 14 variables [1, 5]. The typical and basic approach to representing epistemic uncertainty is the set-valued approach [35]. When this approach is used to handle uncertainty on customer de- mands in the CVRP, customer demands are assumed to belong to some uncertainty set, e.g., all that is known about the customer demands is that they belong to some intervals, and the problem is handled through a robust optimisation approach [7, 8, 36]. The most popular me- thodologies in robust optimisation aim at optimising, while protecting against worst case rea- lisation of customer demands that are deemed possible [79, 62]. Nevertheless, the set theoretic framework may be too coarse and may therefore lead to solutions that are too conservative. In the last forty years, the necessity to account for all facets of uncertainty has been reco- gnized and alternative uncertainty frameworks extending both the probabilistic and set-valued ones have appeared [5]. In particular, the theory of evidence introduced by Shafer [71], based on some previous work from Dempster [22], has emerged as a theory offering a compromise between expressivity and complexity, which seems interesting in practice as its successful application in several domains testifies (see [25] for a recent survey of evidence theory ap- plications). This theory, also known as belief function theory, may be used to model diverse forms of information like statistical evidence and expert judgements. Moreover, it provides tools to combine and propagate uncertainty. In the context of the CVRP, the theory of evi- dence may be used to represent uncertainty on customer demands leading to an optimisation problem, which may be referred to as the CVRP with Evidential Demands (CVRPED). Using the theory of evidence in this problem seems particularly interesting as it allows one to ac- count for imperfect knowledge about customer demands, such as knowing that each customer demand belongs to one or more sets with a given probability allocated to each set - an inter- mediary situation between probabilistic and set-valued knowledge. In this thesis, we propose to address the CVRPED by extending the CCP and SPR modelling approaches into the formalism of evidence theory. Although the focus will be to extend stochastic programming approaches, we will also connect our formulations with robust optimisation. To our knowledge, evidence theory has not yet been considered to model uncer- tainty in large-scale instances of an NP-hard optimisation problem like the CVRP. Indeed, it seems that so far, only other non classical uncertainty theories, and in particular fuzzy set theory [80, 82, 13, 63, 15], have been used in such problems. Besides, modelling uncertainty in optimisation problems using evidence theory has concerned only continuous design op- timisation problems1 [61, 77] and continuous linear programs [59]. Specifically in [61], the reliability of the system is optimized, while uncertainty is handled by limiting the plausibility of constraints violation into a small degree; while in [77] the problem was handled diffe- rently, and the plausibility of a constraint failure was converted into a second objective to the problem that should be minimized. Of particular interest is the work of Masri and Ben Ab- delaziz [59], who extended the CCP and SPR modelling approaches, in order to model conti- nuous linear programs embedding belief functions, which they called the Belief Constrained Programming (BCP) and the recourse approaches, respectively. 
In comparison, in this work, we generalise CCP and SPR to an integer linear program involving uncertainty modelled by evidence theory. Borrowing from [59], we propose to model the Capacitated Vehicle Rou- ting Problem with Evidential Demands (CVRPED) by methods that may be called the BCP modelling of the CVRPED and the recourse modelling of the CVRPED. For both models,

1Designing physical systems in the engineering field using optimisation techniques, so design costs are mi- nimized, while the system performance is fulfilled [2]. INTRODUCTION 15 the resolution algorithm is a simulated annealing algorithm; we use a meta-heuristic, as the CVRPED derives from the CVRP, which is NP-hard.

Structure of the report

This report is composed of five chapters. The first chapter presents the CVRP, which is the preliminary problem and the basic version of our problem of interest: the CVRPED. In this chapter, we overview the concepts and the nature of the CVRP along with the solution methods that can be employed to solve problems of such combinatorial nature. In the second chapter, a stochastic variant of the CVRP is presented and exami- ned, which is the CVRPSD. In this problem, customer demands become random variables. We review the appropriate modelling techniques to handle this stochastic optimisation problem. Specifically, we focus on the most popular modelling techniques in stochastic programming: CCP and SPR. The third chapter recalls the concepts of evidence theory, which are needed to handle evidential demands in the CVRPED. In particular, the representation, combination and propa- gation of uncertainty within evidence theory are recalled. The fourth chapter exposes our first contribution, which is the BCP approach to the CVRPED. This chapter has three major parts: description of the problem, the model and the solution method. The problem is introduced in the first part. In the modelling part, we study and examine the properties and characteristics of the model in details. In the part related to the solution method, we explain our method as well as the experiments that we have conducted. The fifth chapter presents our second contribution, that is the recourse approach to the CVRPED. This chapter has two parts: the model and the solution technique. A part that ex- plains and studies the recourse model and its properties, and a part that illustrates the solution method employed to solve the model. The report ends with a general conclusion, summarising our contributions and providing directions for future work. 16 Chapter 1

The Capacitated Vehicle Routing Problem

Contents 1.1 Introduction ...... 17 1.2 An Overview on the Vehicle Routing Problem ...... 18 1.3 The Capacitated Vehicle Routing Problem ...... 20 1.3.1 Problem formulation ...... 20 1.3.2 Problem variants ...... 22 1.3.3 Problem complexity ...... 23 1.4 Solving Methods ...... 23 1.4.1 Exact methods ...... 23 1.4.2 Heuristic based methods ...... 24 1.5 Conclusions ...... 28

1.1 Introduction

The VRP is the problem of determining a set of least cost routes for a fleet of vehicles, such that they serve a set of customers from one or multiple depots, while respecting some secondary constraints [56]. It is a well-known and important combinatorial optimisation pro- blem, that was originally introduced in [20]. The interest and the importance of the VRP has several reasons:

• its broad difficulty: so far, there is no general exact algorithm that can solve this problem in reasonable time when having a large number of customers – there exist algorithms that can solve large instances for this problem only in special cases.

• its practical usefulness: it is related to diverse real-life applications such as (school) bus routing, garbage collection, routing of maintenance units, transport of people with disa- bilities, delivery of petrol or gas to service stations, newspaper delivery, routing of hun- dreds of vehicles for big industrial companies, telecommunication network problems, pickup, or delivery distribution systems, etc.

17 18 1.2. AN OVERVIEW ON THE VEHICLE ROUTING PROBLEM

• it involves savings in global transportation costs: a lot of applications that can be forma- lized as a VRP belong to the class of local transportation or delivery problems, which affect the most expensive component in the distribution network [11].

Hence the VRP generated a large body of research that combined academic scientists as well as industrial engineers. Scientists main aim was to develop modelling and algorithmic mecha- nisms, that support all properties of a transportation problem. Industrials task was to incorpo- rate the developed tool into the productive and commercial course [84] and make sure to solve the real world problem effectively. The thorough studies accomplished since the VRP has been introduced, lead to significant advances both in the theoretical and applications aspects to the problem. In this chapter, we examine from a general perspective the VRP in Section 1.2. The most important and fundamental problem in this thesis - the CVRP, which is a basic extension of the VRP - is formally described in Section 1.3, along with an overview on some of its variants and its complexity. Section 1.4 presents solution methods to this problem, before concluding in the last section of this chapter.

1.2 An Overview on the Vehicle Routing Problem

In logistics, the VRP tackles the decision making at the operational level of effective distribution management systems. It concerns the final notch in the distribution chain, known as a local transportation or a delivery problem [11]. More specifically, given a fleet of vehicles and a set of customers with known demands, the aim is to construct a set of minimum cost routes, one for each vehicle, so they collect or deliver client demands, while respecting ope- rational constraints of the problem. Vehicles are located at one or more depots and the routes they perform, start and end at these depots. Operational constraints may include one or more of the following constraints:

• vehicles have a capacity limit and they cannot transport an amount of goods exceeding this limit;

• vehicles are associated to a maximal travelling distance (or travelling time) that should be respected in the route performed by a vehicle;

• a customer demand should be fully collected or delivered in only one visit by a ve- hicle, or a customer demand may be split (hence its demand may be fulfilled by several vehicles);

• distribution to customers and collection from customers are required simultaneously, or customer demands are only collected or only distributed;

• service time windows are requested by customers and hence should be respected by the servicing vehicles;

• customers should be visited (consequently served) in a predefined order. This issue usually emerges when goods should be collected and delivered in the same transporta- tion network, or if there is a service time window defined by customers. CHAPTER 1. THE CAPACITATED VEHICLE ROUTING PROBLEM 19

Usually, the system of interconnected routes (or roads) representing the transportation network is represented by a graph. Its nodes represent customers, depots, or roads intersec- tions, whereas its arcs are directed or undirected leading to a directed or undirected graph [84], respectively. Arcs are directed (respectively undirected) when they represent one-way (res- pectively two-way) carriageways. Such graphs may end up being sparse. However, the VRP overview that we present in the following, considers fully connected graphs, where nodes only represent customers or depots and edges represent paths that transpose the shortest travel cost between pair of nodes; such problems are known as node routing problems [51]. The travel cost depends either on the travelled distance or the travelled time on the associated arc. When edges are undirected, traversing them in both directions has the same travel cost and the asso- ciated VRP is said to be symmetric. Otherwise, they are directed and the problem is said to be asymmetric. In the graph defining a VRP, a depot node may be associated to one or more of the following specifications

• the depot location;

• the number of vehicles available at this depot;

• the quantities that vehicles can transport in the VRP network.

In addition, customer nodes may be associated to one or more of the following specifi- cations:

• a customer demand, that must be collected or delivered when its associated customer node is visited by a vehicle.

• a customer location;

• a customer time window during which it can be serviced;

• a customer service time, which is the time needed to drop or collect goods at a customer node;

• the number of vehicles that a customer needs, e.g., in certain cases a customer may require quantities to be dropped and collected at his location, which can be performed by different type of vehicles, which affects vehicles accessibility;

• a penalty that could be generated, if it is not possible to service a customer demand.

Vehicles are located at nodes of the graph and may have one or more of the following specifications:

• a capacity limit, representing the maximal load that a vehicle can carry. Capacity limits could be identical for all vehicles in a problem or they could be heterogeneous;

• different capacity limits that are relative to various types of transported products; 20 1.3. THE CAPACITATED VEHICLE ROUTING PROBLEM

• one or more associated depots, indicating the starting point of a vehicle route and its ending point;

• extra costs implied by usage of vehicles.

The objective in the VRP is to determine the minimum cost routes, which depends on the characteristics of the customers, the vehicles, and the depots, as well as respecting the operational constraints of the problem. There exist different possible objectives for the VRP and one or more of these objectives2 can be considered, such as:

• to minimize the number of vehicles employed;

• to minimize the total travelled cost (total travelled duration or distance), which can consider also costs associated to vehicles usage, drivers, etc., if they arise;

• to minimize penalties occurring when customers are not fully serviced, or when they are not correctly serviced in a specified time window;

• to minimize violation of other constraints in the problem.

Some of these objectives can be contradictory, when considered in the same problem. For instance, minimizing the number of vehicles and the total travelled distance, as a decrease in the number of vehicles, usually implies an increase in the travelled distance [12]. A solution to the VRP depends on the attributes that the components of the problem might have, in addition to the problem objective. This led to several variants of the VRP, where each one combines one or more of the attributes associated to customers, vehicles, and depots.

1.3 The Capacitated Vehicle Routing Problem

A basic variant in the VRP class, that has been extensively studied is the CVRP. In the CVRP, vehicles have identical capacities, they are located at one depot and customers have deterministic demands. The objective is to determine a least cost set of routes to collect3 client demands, starting and ending at the depot, such that the transported commodities on a route cannot exceed the capacity limit of a vehicle, and each client is serviced by exactly one vehicle.

1.3.1 Problem formulation

In the CVRP, a fleet of m identical vehicles with a given capacity limit Q, initially located at a depot, must collect goods from n customers, with di such that 0 < di ≤ Q the indivisible deterministic collect demand of client i, i = 1, . . . , n. The objective in the CVRP is to find a set of m routes with minimum cost to serve all the customers such that:

2When more than one objective is assumed in a problem, the problem is said to be a multi-objective problem. 3The problem can also be presented in terms of delivery, rather than collection of goods [39]. CHAPTER 1. THE CAPACITATED VEHICLE ROUTING PROBLEM 21

• total customer demands on any route must not exceed Q;

• each route starts and ends at the depot; and

• each customer is serviced only once.

Formally, it is convenient to represent the depot by an artificial client i = 0, whose demand always equals 0, i.e., d0 = 0. The CVRP may then be defined on a graph G = (V,E) such that V = {0, . . . , n} is the vertex set and E = {(i, j) |i 6= j; i, j ∈ V } is the arc set. V represents the customers and the depot that corresponds to vertex 0. A travel cost (or travel time or distance – these terms are interchangeable) ci,j is associated with every edge in E. We consider the symmetric version of the CVRP, where travel costs are such that

ci,j = cj,i, ∀ (i, j) ∈ E.

In addition travel costs satisfy the triangle inequality:

ci,j ≤ ci,l + cl,j, ∀i, l, j ∈ V.

In other words, the direct path connecting two nodes is at worst as costly as a path connecting those two same nodes but deviating from the direct one. Note that, since the direct path bet- ween two nodes in G is assumed to be the shortest path, the triangular inequality is verified automatically [83]. k Let Rk be the route associated to vehicle k and wi,j a binary variable that equals 1 if vehicle k travels from i to j and serves j (except if j is the depot), and 0 if it does not. The CVRP can be written as an integer linear program formulated as [11, 56]:

m X min C(Rk), (1.1) k=1 where n n X X k C(Rk) = ci,jwi,j, (1.2) i=0 j=0 subject to

n m X X k wi,j = 1, j = 1, . . . , n, (1.3) i=0 k=1 n n X k X k wi,` = w`,j, k = 1, . . . , m , ` = 0, . . . , n, (1.4) i=0 j=0 n X k w0,j ≤ 1, k = 1, . . . , m, (1.5) j=1 m X X k wi,j ≤ |L] − 1,L ⊆ V \{0}, (1.6) i,j∈L k=1 i6=j n n X X k di wi,j ≤ Q, k = 1, . . . , m. (1.7) i=1 j=0 22 1.3. THE CAPACITATED VEHICLE ROUTING PROBLEM

Constraints (1.3) make sure that exactly one vehicle arrives at client j, j = 1, . . . , n. Constraints (1.4) ensure the continuity of the routes (flow): if vehicle k leaves vertex `, ve- hicle k must also enter vertex `, ensuring that the route is a proper unbroken cycle in the graph. Constraints (1.5) oblige vehicle k, k = 1, . . . , m to leave at most one time the de- k pot. The choices of the arcs that are represented by wi,j is also restricted by constraints (1.6), that forbids subtours solutions [11]. Without these latter constraints, we can have a vehicle performing the path (i1, i2, . . . , it) with 0 ∈/ {i1, i2, . . . , it}. Constraints (1.7) state that every vehicle cannot carry more than its capacity limit. We note that constraints (1.3) and (1.4) im- n m P P k ply wj,i = 1, j = 1, . . . , n, i.e., exactly one vehicle leaves client j. Constraints (1.5) i=0 k=1 n P k and (1.4) imply wi,0 ≤ 1, k = 1, . . . , m, i.e., vehicle k is obliged to return at most one time i=1 to the depot. Remark that this model requires using at most m vehicles, since for some k, we k might have wi,j = 0, i, j = 1, . . . , n. In the optimisation problem defined in Equations (1.1) - (1.7), Equation (1.1) is referred as the objective of the problem and Equations (1.3) - (1.7) are referred as the constraints of the problem. Besides, the set of solutions defined by the constraints of this problem (where the objective function should be minimized), is called the feasible set of the problem.

1.3.2 Problem variants

Several extensions to the CVRP exist. Main ones are known to be the deterministic extensions [83, 51, 11]. In the following, we list some of these variants.

• The distance constrained CVRP is a CVRP in which each vehicle route should respect a maximal distance constraint, in addition to the capacity constraint.

• The split delivery CVRP relaxes the constraint that each customer should be visited exactly once, and each client demand could be serviced by more than one vehicle.

• The CVRP with time windows appends a service time window constraint to each client, that should be respected by servicing vehicles.

• The CVRP with backhauls requires, for a part of the customers, their demands to be de- livered, whether for the remaining part of the customers, their demands to be collected.

• The CVRP with pick-up and delivery requires that each client node is associated to two types of demands: collect demands and delivery demands. At a client node, collect de- mands represent the quantity of goods to be collected at this client and delivery demands are the quantities to be delivered to this client.

• The CVRP with backhauls and time windows is the combination of two problems: the CVRP with backhauls and the CVRP with time windows. In other words it combines constraints and requirements of both problems.

• The CVRP with pick-up and delivery and time windows is the generalisation of two problems: the CVRP with pick-up and delivery, and the CVRP with time windows. CHAPTER 1. THE CAPACITATED VEHICLE ROUTING PROBLEM 23

1.3.3 Problem complexity

The CVRP is a generalisation of the Traveling Salesman Problem (TSP), that calls for determining a single least cost route that visits all customers. In other words, the TSP is a n P variant of the CVRP, when m = 1 vehicle and the sum of all customer demands di ≤ Q. i=1 The CVRP is a NP-hard optimisation problem, as the TSP is known to be a classical NP-hard optimisation problem. NP-hard problems are problems, for which their exists no algorithms until now, that can solve them to optimality in a polynomial time at the worst case. The general idea of NP-hard problems is that, if there exists a problem η in the NP-hard problems class that is at least as hard as the others, being able to solve η in a polynomial time gives a way to solve all other NP problems. Problems that falls in the category of NP-hard problems are known to suffer from a combinatorial explosion, that is searching their “full state-space graph” (their search space) grows exponentially [50]. More specifically, as vehicle routing problems are represented by graphs, the size of the problem can be determined by the number of nodes (or edges) in the graph representing the given problem [11]. As the problem size increases, the computational complexity of searching the associated graph for the optimal solution increases exponentially. This induced a lot of efforts and several techniques in the literature were proposed to solve in the most efficient way such combinatorial optimisation problems. A general overview on such methods is provided in the next section.

1.4 Solving Methods

The purpose of this section is to inspect the appropriate solving methods for the CVRP. These methods fall into two main categories: exact methods and heuristic methods.

1.4.1 Exact methods

Exact methods are methods that can find optimal solutions for optimisation problems. Basically, these methods exhaustively explore the whole search space of an optimisation pro- blem. Nevertheless, in the case of combinatorial optimisation problems, exhaustive exact me- thods are inappropriate, thus intelligent exact techniques were developed. The first exact me- thods developed for the CVRP [68] were branch and bound algorithms, branch and cut al- gorithms and methods. We will present a brief overview on the generic components of the most basic and earlier method among them: the branch and bound tech- nique. The reader is referred to [68] and the references therein for a detailed presentation on classical exact techniques for the CVRP. The branch and bound technique develops an intelligent enumeration of candidate solu- tions for the optimal solution in the form of a rooted tree. The term branch refers to making partitions in the solution set space (sub problems), while the bound term refers to introducing bounds onto each branch of the search tree. The algorithm recursively generates branches and at each step minimizes (or maximizes) the objective function, while checking at each step 24 1.4. SOLVING METHODS if the solution respects the bound limit already set in the predecessor problem. For instance, in a minimisation problem like the CVRP, a branch and bound algorithm tries to transform the original problem into several small sub-problems, by truncating decision variables. If the solution of a sub-problem is greater than the upper bound of a predecessor sub-problem, then this sub-problem is abandoned. Moreover, branching this sub-problem stops, since its sub- problems (or descendants) will never contain an optimal solution. Branch and cut algorithms are an extension of branch and bound algorithms. In each iteration in generating tree nodes, a branch and cut algorithm adds some problem constraints (referring to the term cut) which are violated in the current solution using a cutting plane al- gorithm. This process assists in removing infeasible solutions and thus reducing the search tree. Column generation methods rely on formulating the problem as a set partitioning pro- blem [65]. Many contributions and improvements and new exact methods were developed for the CVRP. New exact techniques consisted on combining cutting planes and column generation notions into one algorithm, known as the branch and cut and price algorithm. We refer the reader to [64] and the references within for more details on new exact methods applied to the CVRP. Computations required by exact methods are exponential in the worst case. Their main advantage is that they guarantee finding optimal solutions. Nevertheless, optimality comes at the expenses of significant computational efforts.

1.4.2 Heuristic based methods

The growing complexity of optimisation problems, like the CVRP, led researchers to develop methods based on heuristic techniques, that are less time consuming than exact tech- niques, but do not guarantee finding optimal solutions and thus provide near optimal (ap- proximate) solutions. Methods that are based on heuristics, search the solution space of an optimisation problem in a reasonable time, by reducing the search space of the problem, and are able to find satisfying near optimal solutions. Such methods can be classified into two general categories: heuristics and meta-heuristics.

• Heuristics are specific and problem dependent algorithms. Examples of heuristic tech- niques that were developed for the CVRP are the Clarke and Wright savings algo- rithm [17], the sweep algorithm [40] and petal heuristics [66]. • Meta-heuristics are general and problem independent algorithmic frameworks, to which some tuning is applied so they act as powerful heuristic methods. They can be applied to a broad class of optimisation problems, by simply adapting the algorithmic concept of the meta-heuristic to the particularities of the problem. Some of the most common meta-heuristic algorithms are: simulated annealing [54], tabu search [41], genetic algo- rithms [50] and particle swarm optimisation [63, 15].

Generally, heuristics are less time consuming than meta-heuristics. However, the quality of the solutions obtained with meta-heuristics is better than the quality of those obtained with heuristics. Consequently, meta-heuristics allow for a trade off between i) the complexity that we face when using exact methods; and ii) inferior solutions quality that we face when using heuristic techniques. We will illustrate the main concepts of some meta-heuristic techniques in the following. CHAPTER 1. THE CAPACITATED VEHICLE ROUTING PROBLEM 25

Generically speaking, meta-heuristics we will be describing are iterative algorithms, that are based on local search methods. A local moves iteratively from a solution to another in the space of candidate solutions (the search space) by applying local changes, until a satisfying solution is found. Local changes are applied using so-called neighbourhood methods. In other words, a solution in a current iteration is obtained from the solution in the preceding iteration using a neighbourhood method. Tabu search and simulated annealing pro- ceed iteratively in their search with one solution at each iteration, whereas genetic algorithms and ant colony optimisation algorithms consider multiple solutions in parallel at each iteration in their iterative process. A termination criterion specifies the point at which the search must be stopped, in these iterative techniques, otherwise the search could go on forever, unless we know the optimal value of the problem in advance. A termination criterion can be specified in different manners, for instance:

• when the best solution found so far reaches a pre-specified lower bound value;

• after a pre-specified number of iterations;

• after a certain number of iterations without an improvement in the best solution.

In the following, we describe simulated annealing as it was employed in the contributions of this thesis. Furthermore, we provide a description of tabu search, genetic algorithms and ant colony optimisation algorithms in Appendix A.

Simulated annealing

Simulated annealing was first introduced by Kirkpatrick et al. [54] who proposed to imitate the process of annealing used in statistical mechanics4 into handling combinatorial optimisation problems (as an optimisation technique).

The real annealing process In statistical mechanics, the annealing strategy aims to form a crystallized (well-ordered) solid state of minimum energy, that is to form an optimal state with the lowest energy, from an initially disordered material having a high energy. Forming a crystal from a melt is a good example to illustrate this process. It involves heating a material of misplaced atoms so it melts, and thus all its atoms are in a free movement. At this stage, the material is at its maximal energy. Then, the temperature is slowly cooled, while making sure there is enough time at each cooling temperature for the atoms to rearrange their positions, and thus form an organized crystal lattice. If the cooling was not slow, it will result in solidifying disordered atoms and thus forming a defected crystal or a glass with no crystalline order that are called “locally optimal structures”. Note that, during the cooling process, some defects might happen that can be eliminated by local reheating. This process of careful cooling, guide into having a crystalline solid structure with a stable state that corresponds to its minimal energy. Similarly, the simulated annealing optimisation technique mimics the annealing process, by considering the energy of a material as the objective function of an optimisation problem,

4Statistical mechanics is the practice for analysing collective behaviour and properties of condensed matter physics [54]. 26 1.4. SOLVING METHODS and it tries to optimize it by introducing an artificial temperature parameter. This temperature parameter should have the same impact as the impact of a physical system temperature in the annealing process. After all, atoms of a material naturally interact together based on the tem- perature of the system and on the time spent interacting in that specific temperature, leading to:

• either reaching its optimal energy if the cooling was well-controlled; • or its local energy if the cooling was precipitous.

The simulated annealing algorithm The simulated annealing algorithm starts from a known initial configuration of an optimisation problem (candidate solution), having a high temperature parameter T . The temperature of the system is gradually decreased, through a number of iterations, until reaching the freezing temperature of the system. For every itera- tion, the configuration is locally rearranged using a neighbourhood method, through repeated steps until reaching a satisfactory number of neighbourhood transitions. Note that, each confi- guration can be evaluated by the objective function of the concerned optimisation problem, which stands for the energy of a material in the real annealing process, and is denoted by E. Hence, after a neighbourhood transition of a configuration, the variation in the objective function ∆E is measured:

• if the neighbourhood configuration improves the objective function, it becomes the star- ting configuration of the next step ;

∆ − E • otherwise it is accepted with a probability e T known as the metropolis rule of accep- tance.

The termination criterion of this algorithm is the freezing temperature of the system. The described optimisation technique based on a simulated annealing process is represented in Figure 1.1 and one way to instantiate it in the case of the CVRP is provided by Example 1.1.

∆ − E Remark 1.1. In the metropolis criterion as T decreases, e T decreases. Hence when T is high, a new configuration that does not ameliorate the system energy is most probably accepted. Nevertheless, when the temperature T is low, the probability of accepting an inferior solution reduces. Hence, the algorithm behaves similarly to the real annealing process, since as the temperature cools down, the atoms move less freely. Moreover, in a simulated annealing algorithm, we do not want to diminish a solution very often, otherwise the simulated annealing process degenerates into a random search. Nevertheless, accepting inferior solutions from times to times leads the algorithm into avoiding being trapped in local optima.

Example 1.1. The simulated annealing algorithm may be instantiated for the CVRP as fol- lows:

• an initial configuration may consist in generating randomly a set of routes while res- pecting the CVRP constraints; • applying a neighbourhood method to a configuration may consist in swapping the po- sitions of a pair of customers (except the depot) on a route chosen randomly (swapping pairs of customers could be applied to more than one route); CHAPTER 1. THE CAPACITATED VEHICLE ROUTING PROBLEM 27

Begin with an initial configuration as the current configuration of the system

Initialize temperature T at its highest value

Apply a neighbourhood me- thod to the current configuration

Calculate the change ∆E

• If ∆E denotes an improvement in the objective func- tion, then the neighbourhood configuration becomes the current one;

• if ∆E denotes a degradation in the objective function, then the neighbourhood configuration becomes the ∆ − E current one with a probability e T .

number of neighbourhood no Decrease T transitions for T reached?

yes

T reached freezing no temperature?

yes

Stop and return the confi- guration with the best E

FIGURE 1.1 : The skeleton of a simulated annealing algorithm 28 1.5. CONCLUSIONS

• ∆E consists in calculating the travel cost difference between the two sets of routes before and after the neighbourhood transition;

• the simulated annealing algorithm returns the set of routes (a configuration) having the minimal travel cost.

It is important to state that several elements of the simulated annealing algorithm have a crucial role in the capability of the algorithm, like determining:

1. the design of a configuration;

2. the generation of the initial configuration;

3. the neighbourhood method;

4. and the parameters of the algorithm such as the number of neighbourhood transitions for each temperature and the number of iterations to reach the freezing temperature.

All these elements, and in particular the parameters of the algorithm, affect also the com- putational time of the algorithm. Their exists some standard values for the parameters of the simulated annealing algorithm. Nevertheless, it is often suitable to determine them empirically depending on the nature of the problem. Overall, the advantages of the simulated annealing algorithm are that this algorithm :

• is trivial to program;

• is a general optimisation technique and can be applied to any optimisation problem that can be handled iteratively;

• has shown to be a powerful technique that gives very good quality solutions;

• requires no memory;

• and generally, does not get stuck in local optima.

1.5 Conclusions

In this chapter, we examined the underlying problem of interest in this thesis: the CVRP. The CVRP is an important VRP variant, that asks to determine the set of routes of minimum cost that can serve a set of customers with known (deterministic) demands, while respecting the capacity limit of each vehicle and some side constraints to the problem. This problem can be formulated as an integer linear program and is known to belong to the class of NP-hard problems. We presented the formal model of this problem, along with some of its variants and its complexity. Afterwards, we exposed solution methods to solve this NP-hard problem. Those methods can be divided into two parts: exact methods and heuristic based methods. We presented a quick overview on the exact methods. But we examined in more details, the simulated annealing meta-heuristic technique, as this solution method is employed in this thesis. CHAPTER 1. THE CAPACITATED VEHICLE ROUTING PROBLEM 29

Still, in many real-life applications, we may not be able to acquire all of the data ne- cessary to specify a CVRP, for instance i) customer demands cannot be determined exactly before arriving to customer locations, e.g., in municipal waste collection or daily delivery of dairy goods [42]; ii) travel costs between customers could be affected by travel times that are not known in advance, e.g., traffic conditions affect travel times; iii) customers to be serviced are not known in advance, e.g., in daily distribution problems where daily visits are not the same subset as the initial set of customers. Such data uncertainty gave rise to an extension of the CVRP, which is the stochastic CVRP, where uncertain data is assumed to be ran- dom, i.e., described in terms of probability distributions. Main variants of stochastic CVRPs are [38, 18]: the CVRPSD [37], the CVRP with stochastic travel times [57] and the CVRP with stochastic customers. The most studied between all of these stochastic CVRP variants is the CVRPSD, as the domain of its application is very wide, this problem is the subject of the following chapter. 30 1.5. CONCLUSIONS Chapter 2

The Capacitated Vehicle Routing Problem with Stochastic Demands

Contents 2.1 Introduction ...... 31 2.2 Modelling the CVRPSD by CCP ...... 32 2.3 Modelling the CVRPSD by SPR ...... 36 2.3.1 Recourse actions ...... 37 2.3.2 The expected penalty cost ...... 38 2.4 Modelling the CVRPSD using SPR or CCP ? ...... 43 2.5 Conclusions ...... 44

2.1 Introduction

In the previous chapter, we recalled the CVRP, and some solution methods that can be employed to solve this NP-hard optimisation problem. In this chapter, we turn our attention to an extension of the CVRP: the CVRPSD. In the CVRPSD, customer demands are not known with certainty at the moment a set of routes must be planned, specifically, they are assumed to be random variables and a customer demand is revealed upon the arrival of a vehicle at his location. Formally, the CVRPSD is a stochastic integer linear program, where customer demands di, i = 1, . . . , n, become random variables, such that P (di ≤ Q) = 1 and P (di > 0) = 1. The capacity constraints of (1.7) are then ill-defined and thus so is the feasible set over which the objective function (1.1) should be minimized. The problem formulation provided in Section 1.3.1 becomes therefore formally ill-posed. A naive way to handle this issue, is to require verifying Equation (1.7) for all possible realisations of the stochastic variables di, i = 1, . . . , n. However, this is generally imprac- ticable and unrealistic. An alternative is to impose capacity constraints with respect to the expected value of the random variables. Nevertheless, such approach is not always sensible, for instance in the case of high variance of the variables di. Stochastic programming offers modelling approaches to solve these limitations.

31 32 2.2. MODELLING THE CVRPSD BY CCP

Stochastic programming is a modelling framework for stochastic optimisation pro- blems, like the CVRPSD. The term stochastic refers to dealing with random data, whereas programming means that the problem under consideration can be modelled as a mathemati- cal program [9]. Initially, stochastic programming modelling approaches appeared in Dant- zig [19], Beale [6] and Charnes et al. [14]. Since then, the stochastic programming metho- dologies and concepts developed, and the literature around this topic became more and more rich. Stochastic programming models stochastic optimisation problems in two stages: a first stage solution is established a priori, and then in the second stage the realisations of the ran- dom variables - the actual demands in the case of the CVRPSD - are revealed and corrective actions are carried out if necessary on the first stage solution [38]. The most popular stochastic programming modelling approaches are CCP [14] and SPR [9]. CCP and SPR approaches were broadly used for the CVRPSD [32, 31, 4, 81, 30, 39, 58, 16, 37]. Authors that handled this stochastic optimisation problem, often resorted to assuming that the random variables were independent [32, 31, 4, 81, 30, 39, 58, 16], and a minority of researchers assumed the random demands to be correlated [43, 86]. In this chapter, we recall the CCP and SPR modelling techniques for the CVRPSD and suppose a real life representation of customer demands, where they are positive integers. Section 2.2 examines CCP for the CVRPSD and Section 2.3 reviews SPR for the CVRPSD. While Section 2.4 inspects the difference between the applicability of each one of these models. We conclude the chapter in Section 2.5.

2.2 Modelling the CVRPSD by CCP

CCP introduces so-called chance constraints into the formal model of the CVRPSD. Chance constraints represent boundaries within which the decision maker would like to operate “most of the time” [53]. More specifically, a CCP model for the CVRPSD consists in finding a first stage solution for which the probability that the total demand on any route exceeds the capacity limit is constrained to be below a given threshold. Formally, a CCP formulation for the CVRPSD corresponds to the same optimisation problem described for the CVRP in Section 1.3.1, except that the deterministic capacity constraints represented by Equation (1.7) are replaced by the following chance constraints:

P( ∑_{i=0}^{n} ∑_{j=0}^{n} di w^k_{i,j} ≤ Q ) ≥ 1 − β,  k = 1, . . . , m,   (2.1)

where 1 − β is the minimum allowable probability that any route respects the vehicle capacity and thus succeeds.

The parameter β in Equation (2.1) is the maximal probability allowed for violating the condition ∑_{i=0}^{n} ∑_{j=0}^{n} di w^k_{i,j} ≤ Q on each route k, k = 1, . . . , m. In other words, the probability level 1 − β is the safety margin reflecting the feasibility of a solution to the CVRPSD modelled by CCP. Thus, decisions taken in such models guarantee a feasibility of degree 1 − β. The value of β is specified by the decision maker. Generally, it is delicate to specify β, since the decision maker must have a proper understanding of the system and its safety requirements [48]. At least, he should know that the set of feasible solutions to the CCP model of the CVRPSD decreases as β decreases (1 − β increases), which leads to higher cost optimal solutions; thus he should look for a good trade-off between the vulnerability of the system and its cost [48].

FIGURE 2.1 : Illustration of clients 1, 2, 3, 4 and the depot denoted by 0 for Example 2.1.

The left term in Equation (2.1) is calculated with respect to the sum of the random variables di on a route k. Indeed, every route in a candidate solution may consist of one or more customers, thus one must evaluate the probability mass function associated to the sum of the customer demands on that route. In order to illustrate the computations involved in Equation (2.1), let us assume that the stochastic demands are independent and let us recall that the sum of two independent random variables di and dj, with respective probability mass functions pi and pj and which can take values only in the set Θ = {1, 2, . . . , Q}, is a random variable with probability mass function pi + pj obtained as

(pi + pj)(c) = ∑_{a,b ∈ Θ : c = a+b} pi(a) · pj(b).   (2.2)

The sum of independent stochastic demands is commutative and associative. These properties are useful to compute the sum of the stochastic demands of N customers on some route k, regardless of their order. In the following example, we illustrate the computation of the chance constraints for some candidate solutions to a CVRPSD modelled by CCP.

Example 2.1. Suppose we have n = 4 customers as in Figure 2.1, m = 2 vehicles, each with a capacity limit Q = 10, and that the maximal probability of violating the capacity limit of a vehicle is β = 0.2. Besides, the demands di, i = 1, . . . , 4, of the 4 customers are assumed to be independent random variables. Consider that the probability mass functions p1, p2, p3 and p4 associated to the random variables d1, d2, d3 and d4, respectively, are defined as:

p1(4) = 0.6,  p1(5) = 0.4,
p2(5) = 0.8,  p2(8) = 0.2,
p3(3) = 0.6,  p3(5) = 0.4,
p4(5) = 0.7,  p4(6) = 0.3.

FIGURE 2.2 : Illustration of a possible set of routes to serve the clients in Figure 2.1.

Now, consider the solution in Figure 2.2, in which one vehicle serves customers 1 and 3 and the other serves customers 2 and 4. To check if this solution satisfies the chance constraints in Equation (2.1), we need to determine P(d1 + d3 ≤ Q) and P(d2 + d4 ≤ Q), which means that the probability mass functions associated to the sums d1 + d3 and d2 + d4 should be determined. These probability mass functions can be computed using Equation (2.2), thus we obtain:

p1+3(7) = 0.36,  p1+3(8) = 0.24,  p1+3(9) = 0.24,  p1+3(10) = 0.16, and
p2+4(10) = 0.56,  p2+4(11) = 0.24,  p2+4(13) = 0.14,  p2+4(14) = 0.06.

Thus, we have

P (d1 + d3 ≤ Q) = 1 ⇒ P (d1 + d3 ≤ Q) > 0.8, (2.3)

P (d2 + d4 ≤ Q) = 0.56 ⇒ P (d2 + d4 ≤ Q) < 0.8. (2.4)

As Equation (2.4) violates the chance constraint, the solution illustrated in Figure 2.2 is not a feasible solution for this problem, even though Equation (2.3) satisfies the probability level 1 − β imposed in this problem: all routes in a solution must satisfy the chance constraints of the problem separately.

FIGURE 2.3 : Illustration of another possible set of routes to serve the clients of Figure 2.1, different from that of Figure 2.2.

Consider now the solution in Figure 2.3, in which one vehicle serves customers 1 and 2 and the other serves customers 3 and 4. For this solution, we need to determine P(d1 + d2 ≤ Q) and P(d3 + d4 ≤ Q). Using Equation (2.2), the probability mass function associated to the sum of random variables d1 and d2 is

p1+2(9) = 0.48,  p1+2(10) = 0.32,  p1+2(12) = 0.12,  p1+2(13) = 0.08.

On the other hand, the probability mass function associated to the sum of random variables d3 and d4 is

p3+4(8) = 0.42,  p3+4(9) = 0.18,  p3+4(10) = 0.28,  p3+4(11) = 0.12.

Thus, we have

P (d1 + d2 ≤ Q) = 0.8 ⇒ P (d1 + d2 ≤ Q) ≥ 0.8, (2.5)

P (d3 + d4 ≤ Q) = 0.88 ⇒ P (d3 + d4 ≤ Q) > 0.8. (2.6)

From Equations (2.5) and (2.6), we can see that both routes in the solution of Figure 2.3 respect the chance constraints of this problem. Therefore, this solution is a feasible solution to the problem.
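The computations of Example 2.1 are easily mechanised. The following sketch (in Python; the dictionary representation of probability mass functions and the helper names are illustrative choices, not part of the formal model) convolves the independent demand distributions as in Equation (2.2) and checks the individual chance constraints (2.1) for both candidate solutions.

```python
from itertools import product

# Probability mass functions of the independent demands of Example 2.1.
p = {
    1: {4: 0.6, 5: 0.4},
    2: {5: 0.8, 8: 0.2},
    3: {3: 0.6, 5: 0.4},
    4: {5: 0.7, 6: 0.3},
}

def convolve(pa, pb):
    """Pmf of the sum of two independent demands (Eq. 2.2)."""
    out = {}
    for (a, wa), (b, wb) in product(pa.items(), pb.items()):
        out[a + b] = out.get(a + b, 0.0) + wa * wb
    return out

def prob_route_ok(route, Q=10):
    """P(total demand on the route <= Q), the left-hand side of Eq. (2.1)."""
    pmf = {0: 1.0}
    for customer in route:
        pmf = convolve(pmf, p[customer])
    return sum(w for load, w in pmf.items() if load <= Q)

beta = 0.2
for route in ([1, 3], [2, 4], [1, 2], [3, 4]):
    prob = prob_route_ok(route)
    print(route, round(prob, 3), prob >= 1 - beta)
# [1, 3] ~1.0 True and [2, 4] ~0.56 False -> the solution of Figure 2.2 is infeasible;
# [1, 2] ~0.8 True and [3, 4] ~0.88 True  -> the solution of Figure 2.3 is feasible.
```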

Many references in the literature modelled the CVRPSD as a CCP program; we refer the reader to [42, 11, 73] for more detailed investigations of such models in the context of the CVRPSD. Note that the chance constraints (2.1) of the CVRPSD are called individual chance constraints, since the inequality in Equation (2.1) must be satisfied for every route k separately. If we want to consider one safety margin for all routes of a solution, i.e., k = 1, . . . , m, then the constraints subject to random parameters should be modelled in the form of joint chance constraints. A joint chance constrained version of the CVRPSD consists in replacing the constraints (2.1) with the following joint chance constraint:

P( ∑_{i=0}^{n} ∑_{j=0}^{n} di w^k_{i,j} ≤ Q, k = 1, . . . , m ) ≥ 1 − β.   (2.7)

Mathematically speaking, a joint chance constraint is a single constraint involving all routes simultaneously, whereas individual chance constraints are imposed separately on each route. Joint chance constraints are thus more restrictive and more difficult to handle, from both a theoretical and a computational point of view.

Example 2.2. Let us pursue Example 2.1 and suppose now that β is the maximal probability of violating the joint chance constraint of the problem. In other words, the probability of respecting the capacity constraints of the whole set of routes in a solution should be greater than or equal to 0.8. Let us therefore evaluate the joint chance constraint for the solutions in Figures 2.2 and 2.3, respectively. For the solution illustrated in Figure 2.2, since the individual chance constraints were not satisfied, the joint chance constraint is certainly not satisfied either. As a matter of fact, evaluating Equation (2.7) for the solution in Figure 2.2 means that we need to evaluate P({d1 + d3 ≤ Q, d2 + d4 ≤ Q}), which amounts to

P ({d1 + d3 ≤ Q, d2 + d4 ≤ Q}) = 1 · 0.56 = 0.56 < 0.8.

For the solution in Figure 2.3, to determine Equation (2.7), we need to evaluate P ({d1 + d2 ≤ Q, d3 + d4 ≤ Q}), which amounts to

P ({d1 + d2 ≤ Q, d3 + d4 ≤ Q}) = 0.8 · 0.88 = 0.704.

Thus, this solution is not a feasible solution in the joint chance constrained model for the problem in Example 2.1, when β = 0.2.
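Under the independence assumption of Example 2.1, the joint probability in Equation (2.7) factorises over the routes of a solution, since disjoint routes involve disjoint sets of independent demands. A minimal illustration (Python), reusing the per-route probabilities computed in Example 2.1:

```python
# Joint chance constraint (2.7) of Example 2.2: the total demands of the two routes
# of a solution are independent here, so the joint probability is their product.
beta = 0.2

# Solution of Figure 2.2, with per-route probabilities 1.0 and 0.56.
print(1.0 * 0.56 >= 1 - beta)    # False: 0.56 < 0.8, infeasible

# Solution of Figure 2.3, with per-route probabilities 0.8 and 0.88.
print(0.8 * 0.88 >= 1 - beta)    # False: 0.704 < 0.8, infeasible as well
```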

For a comprehensive literature on joint chance constraints and on the theory of chance constrained programming, we refer the reader to [69, 9, 67, 26].

2.3 Modelling the CVRPSD by SPR

In the CCP modelling approach that we examined in the previous section, a first stage solution is determined such that it satisfies a minimum probability degree of respecting the capacity limit of vehicles. This model does not consider the cost of corrective actions that may be necessary when the first stage solution is implemented. Indeed, when implementing this solution, it is unlikely yet possible that the vehicle capacity is exceeded, i.e., that route failures occur, when the actual demands are revealed, and thus corrective actions may have to be carried out in the second stage. SPR explicitly deals with the possibility of a first stage solution failure, by incorporating into the objective of the problem the penalty cost of corrective, or recourse, actions such as allowing vehicles to return to the depot to unload. More specifically, in the SPR modelling of the CVRPSD, the expected penalty cost of the recourse actions happening in the second stage is considered, and the problem is to find a set of routes which has the minimal expected cost, defined as the cost of the first stage solution plus the expected penalty cost of the recourse actions of the second stage.

Formally, let CE(Rk) denote the expected cost of a route Rk defined by

CE(Rk) = C(Rk) + CP(Rk),   (2.8)

with C(Rk) the cost defined by Equation (1.2) in Section 1.3.1, representing the cost of travelling along Rk if no recourse action is performed, and CP(Rk) the expected penalty cost on Rk. In Equation (2.8), CP(Rk) may be defined in different ways, corresponding to the different recourse actions that can be considered in the model; this is detailed in the following subsection. Then, an SPR model for the CVRPSD consists in modifying the CVRP model presented in Section 1.3.1 as follows. The objective is to find a set of routes that

min ∑_{k=1}^{m} CE(Rk),

subject to constraints (1.3) - (1.6), where constraint (1.5) is replaced by

∑_{j=1}^{n} w^k_{0,j} = 1,  k = 1, . . . , m,   (2.9)

that is, exactly m vehicles must be used. Constraints (1.5) may be considered instead of (2.9), but then the problem becomes even more difficult to solve [79]. Indeed, the SPR modelling of the CVRPSD generates a wide feasible solution set, because the capacity constraints are dropped from the model. In addition, note that the binary (decision) variables w^k_{i,j} do not encode recourse actions: they represent only the initially planned solution routes, i.e., the first stage solution.

2.3.1 Recourse actions

The expected penalty cost CP(Rk) depends on the chosen recourse action. A variety of recourse actions have been considered in the literature [4]. In the following, we enumerate some examples of recourse actions:

1. A recourse action may be to dispatch another vehicle to replace a fully loaded vehicle that has not finished serving the customers on its originally planned route. In such a case, penalty costs are equivalent to the cost associated to dispatching supplementary vehicles.

2. A fictitious recourse action that considers disappointed customers, who are not (fully) serviced, e.g., some customers might be too important for a company to lose, so one wants to avoid leaving their demands unfulfilled. In such a case, penalty costs could be equivalent to the degree of customer annoyance, which is proportional to the unsatisfied quantity of goods at each customer, in addition to some degree measuring each customer's importance for the company.

3. In the case of delivery problems, a recourse action could consist in buying the customers' unsatisfied demands at higher prices from the market. Then, penalty costs are equivalent to the new (higher) cost of the unsatisfied customer demands, in addition to the extra travel cost to deliver those demands.

4. A round trip to the depot could be considered as a recourse action, so that vehicles unload and then continue to serve the customer demands left unsatisfied on their first stage route. Penalty costs in this case are equivalent to the extra costs generated by the supplementary trips to the depot. More specifically, when a vehicle arrives at a customer on its planned route, it is loaded with the actual customer demand up to its remaining capacity, if this remaining capacity is sufficient to pick up that entire customer demand. However, if it is not sufficient, i.e., there is a failure, then the vehicle returns to the depot, is unloaded, and afterwards goes back to that same customer to pick up its remaining demand. Note that this service policy implies that a customer demand is divisible, but only at points of failure.

Remark 2.1. It is important to distinguish cases where the assumed recourse action produces a fixed recourse (penalty) cost from cases where this is not true. A recourse cost is said to be fixed (or deterministic) [9] if the associated recourse matrix is fixed (not stochastic). For instance, the recourse action consisting in round trips to the depot has a fixed recourse cost, as the travel cost between customers and the depot is deterministic. Nevertheless, if it were assumed that recourse actions consist in buying unsatisfied customer demands from the market, then the recourse cost would not be fixed, since the recourse matrix can only be determined from a random variable, namely the customers' unsatisfied demands.

2.3.2 The expected penalty cost

In the following, we detail how the expected penalty cost CP(R) on a route R is computed in important references on the SPR modelling of the CVRPSD [16, 37, 58, 31]. These references considered the recourse action consisting of round trips to the depot. Consider a route R having N customers. A failure cannot occur at the first customer on a route since, by problem definition, we have

P (di ≤ Q) = 1, i = 1, . . . , n.

Moreover, it is possible to have more than one failure per route, the worst case scenario being that there is a failure at each customer on R except the first one, which occurs when all customer demands on R are equal to Q.

Let dri be the demand of the i-th customer on R (e.g., if the second client on R is client 4, then dr2 = d4), and let ϕi be a random variable denoting the sum of the stochastic customer demands from the first to the i-th customer on R, i.e.

ϕi = ∑_{ℓ=0}^{i} drℓ,

with dr0 = d0 such that P(d0 = 0) = 1. The probability that the u-th failure along R occurs at the i-th customer is [81, 58]:

P(ϕi−1 ≤ uQ) − P(ϕi ≤ uQ),   (2.10)

where P(ϕi ≤ uQ) is the probability that the cumulative demand ϕi does not exceed uQ, and u is an integer such that u > 0 and u ≤ i − 1. Equation (2.10) can be interpreted as the probability of having the u-th failure at the i-th customer, given that it has not occurred at any previously visited customer along the route. In [81], the failure probability at the i-th customer was computed using Equation (2.10) with u = 1, since routes were constructed while prohibiting more than one failure per route. In the general case, for any route path with cumulative demand ϕi, adding the failure probabilities of each of the u-th failures happening at the i-th customer yields the probability of having a failure at that i-th customer, which is denoted by Fail(i) and computed as [16]:

Fail(i) = ∑_{u=1}^{i−1} [ P(ϕi−1 ≤ uQ) − P(ϕi ≤ uQ) ].   (2.11)

Note that at the first customer on R, we always have Fail(1) = 0, since we have P(ϕ0 ≤ Q) = 1 and P(ϕ1 ≤ Q) = 1.

Since a recourse action upon failure at a customer imposes a return trip to the depot, the penalty cost at the i-th customer on R is 2c0,i, with c0,i the travel cost between the depot and the i-th customer. The expected failure cost EFC(i) at the i-th customer is then [31, 16]:

EFC(i) = 2 c0,i Fail(i).   (2.12)

Hence the expected penalty cost on a route R having N customers is

CP(R) = ∑_{i=2}^{N} EFC(i).   (2.13)

The following example illustrates the computation of the expected cost of a route R in a CVRPSD modelled by SPR.

Example 2.3. Consider n = 3 customers and m = 1 vehicle with a capacity limit Q = 10, i.e., in this case we have n = N. Stochastic customer demands di, i = 1, 2, 3 are discrete independent random variables, associated respectively to probability mass functions pi, i = 1, 2, 3, defined as follows:

p1(5) = 0.7,  p1(8) = 0.3,
p2(4) = 0.8,  p2(7) = 0.2, and
p3(3) = 0.4,  p3(4) = 0.6.

The depot is denoted by 0. The travel cost matrix TC = (ci,j), i, j ∈ {0, 1, 2, 3}, giving the travel costs between the customers and the depot, is shown in Table 2.1.

TABLE 2.1 : Travel cost matrix TC of Example 2.3

      0     1     2     3
0    +∞    3     1.1   1
1     3   +∞     2.1   2.1
2    1.1   2.1  +∞     1
3     1    2.1   1    +∞

Suppose a route R where the first customer is customer 3, the second is customer 1 and the third is customer 2, i.e., dr1 = d3, dr2 = d1 and dr3 = d2. First, let us determine the random variables ϕi denoting the cumulative collected demand on R up to the i-th customer.

• Up to the first client on R, the cumulative demand ϕ1 is associated to p3.

• Using Equation (2.2), we can determine the probability mass function associated to ϕ2, which is p3+1 defined as:

p3+1(8) = 0.28,  p3+1(9) = 0.42,  p3+1(11) = 0.12,  p3+1(12) = 0.18.

• The cumulative demand ϕ3 is associated to p3+1+2 defined as:

p3+1+2(12) = 0.224,  p3+1+2(13) = 0.336,  p3+1+2(15) = 0.152,
p3+1+2(16) = 0.228,  p3+1+2(18) = 0.024,  p3+1+2(19) = 0.036.

In order to calculate CP(R), the probability of failure at each of the i-th customers on R is determined using Equation (2.11). Afterwards, the expected failure cost at each i-th customer on R can be computed through Equation (2.12). Both steps are reported respectively for each customer on R in the following:

• Fail(1) = 0 =⇒ EFC(1) = 2 × 1 × 0 = 0;
• Fail(2) = 0.3 =⇒ EFC(2) = 2 × 3 × 0.3 = 1.8;
• Fail(3) = 0.7 =⇒ EFC(3) = 2 × 1.1 × 0.7 = 1.54.

On the other hand, the cost of travelling along R when no recourse action is performed, C(R), defined by Equation (1.2), is

C(R) = c0,3 + c3,1 + c1,2 + c2,0 = 1 + 2.1 + 2.1 + 1.1 = 6.3,

while the expected penalty cost on R is

CP(R) = ∑_{i=2}^{N} EFC(i) = 1.8 + 1.54 = 3.34.

Recall that, for a route R with N customers, its expected cost is defined by Equation (2.8). Consequently CE(R) = 6.3 + 3.34 = 9.64.
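The quantities of Examples 2.3 and 2.4 can be recomputed with a short script. The sketch below (Python; the helper names and data structures are illustrative) implements Equations (2.11)-(2.13) for the recourse policy of round trips to the depot and returns the expected cost CE(R) of a planned route.

```python
from itertools import product

Q = 10
# Demand pmfs of Example 2.3 and travel costs of Table 2.1 (0 denotes the depot).
p = {1: {5: 0.7, 8: 0.3}, 2: {4: 0.8, 7: 0.2}, 3: {3: 0.4, 4: 0.6}}
costs = {(0, 1): 3, (0, 2): 1.1, (0, 3): 1, (1, 2): 2.1, (1, 3): 2.1, (2, 3): 1}

def c(i, j):
    """Symmetric travel cost between locations i and j (Table 2.1)."""
    return costs[(min(i, j), max(i, j))]

def convolve(pa, pb):
    """Pmf of the sum of two independent discrete variables (Eq. 2.2)."""
    out = {}
    for (a, wa), (b, wb) in product(pa.items(), pb.items()):
        out[a + b] = out.get(a + b, 0.0) + wa * wb
    return out

def cdf(pmf, t):
    """P(variable <= t) for a pmf given as a {value: probability} dictionary."""
    return sum(w for v, w in pmf.items() if v <= t)

def expected_cost(route):
    """CE(R) = C(R) + CP(R): travel cost plus expected penalty cost (Eqs. 2.8, 2.11-2.13)."""
    stops = [0] + route + [0]
    travel = sum(c(a, b) for a, b in zip(stops, stops[1:]))   # C(R), Eq. (1.2)
    phi, penalty = {0: 1.0}, 0.0                              # phi_0, and CP(R) accumulator
    for i, customer in enumerate(route, start=1):
        phi_prev, phi = phi, convolve(phi, p[customer])       # phi_{i-1} and phi_i
        fail = sum(cdf(phi_prev, u * Q) - cdf(phi, u * Q) for u in range(1, i))  # Eq. (2.11)
        penalty += 2 * c(0, customer) * fail                  # EFC(i), Eq. (2.12)
    return travel + penalty

print(expected_cost([3, 1, 2]))   # ~9.64  (route R of Example 2.3)
print(expected_cost([2, 1, 3]))   # ~10.06 (reversed route, Example 2.4)
```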

It is important to distinguish route directions in an SPR modelling of a CVRPSD, as the direction affects a route’s expected cost. More specifically, it was shown in [32] that a route R with path (0, 1, . . . , N, 0) and its reverse R−1, on which the planned order of customer visits is reversed (the N-th customer on R becomes the first customer on R−1, the (N − 1)-th customer on R becomes the second, and so on, the first customer on R becoming the last on R−1), have the same travel cost, but not necessarily the same expected penalty cost. This is directly related to the fact that the cumulative demands ϕi, i = 1, . . . , N − 1, on R are not necessarily identical to the corresponding cumulative demands on R−1. Consequently, the probabilities of failure at the customers of R and of R−1 will not necessarily be equal, and thus the places where return trips to the depot happen may differ on the two routes. This is illustrated in the following example.

Example 2.4. Let us consider the reverse of route R in Example 2.3, denoted R−1: the first customer on R−1 is customer 2, the second is customer 1 and the third is customer 3, so that the demands encountered along R−1 are d2, d1 and d3, in that order. The cumulative stochastic demands along R−1 are then as follows:

• The cumulative demand up to the first customer on R−1 is associated to p2, with p2(4) = 0.8 and p2(7) = 0.2.

• The cumulative demand up to the second customer on R−1 is associated to p2+1, defined as p2+1(9) = 0.56, p2+1(12) = 0.38, p2+1(15) = 0.06.

• The cumulative demand up to the third customer on R−1 is associated to p2+1+3, defined as p2+1+3(12) = 0.224, p2+1+3(13) = 0.336, p2+1+3(15) = 0.152, p2+1+3(16) = 0.228, p2+1+3(18) = 0.024, p2+1+3(19) = 0.036.

Using Equations (2.11) and (2.12), we compute, respectively, the probability of failure and the expected failure cost at each of the i-th customers on R−1. They are reported for each of these customers in the following:

• Fail(1) = 0 =⇒ EFC(1) = 2 × 1.1 × 0 = 0;
• Fail(2) = 0.44 =⇒ EFC(2) = 2 × 3 × 0.44 = 2.64;
• Fail(3) = 0.56 =⇒ EFC(3) = 2 × 1 × 0.56 = 1.12.

The cost of travelling along R−1 when no recourse action is performed remains the same as the travel cost on R: C(R−1) = C(R) = 6.3, while the expected penalty cost on R−1 is

CP(R−1) = ∑_{i=2}^{N} EFC(i) = 2.64 + 1.12 = 3.76.

Therefore, the expected cost of R−1 is CE(R−1) = 6.3 + 3.76 = 10.06. We conclude that the expected cost of route R−1 is worse than that of route R in Example 2.3.

To conclude, in the SPR model we recalled for the CVRPSD, first stage routes are designed so that they take into account the expected penalty recourse costs of second stage routes. Several extensions of this modelling approach could be considered. Multi-stage models [19, 9] extend this approach by considering more than two stages: first stage solutions are determined so that they have the lowest impact on future consequences (recourse actions) related to the outcomes of the random variables, and the key difference is that these outcomes happen at several stages instead of a single one as in two-stage SPR (in two-stage SPR, the outcome of the experiment on the random variables happens between the first stage and the second stage solutions). In multi-stage programming, recourse actions at later stages are always implied by the solutions of earlier stages, and first stage solutions cannot be modified at later stages. Alternatively, Markov decision process models for the CVRPSD [31, 29] consider the possibility of modifying first stage solutions by performing re-optimisation upon vehicle arrival at a customer location. Re-optimisation means that upon visiting a customer, when more knowledge about its demand is acquired, the next customer to be visited has to be planned at that point; in other words, the next sequence of customers is re-optimised and could differ from what was initially planned. Re-optimisation could happen only at customer locations where a failure occurred [31] or at other, preventive, locations [89]. It was shown in [89] that if round trips to the depot are anticipated (not necessarily at failure points), this can induce savings in the expected travel cost. More precisely, the optimal policy is to proceed to the next customer if the remaining load in the vehicle is greater than a threshold value; otherwise the vehicle must return to the depot. The aim of Markov decision process modelling techniques is to improve the expected cost of routes by trying to take advantage of new information, investigating all possible solutions each time a vehicle arrives at a customer in order to obtain better costs. These techniques inspect all the states that the system can reach due to the random variables. Thus, they may end up with a huge number of states, where for each state an NP-hard optimisation problem should be solved. Consequently, such models are inherently intractable and of extreme computational complexity. To tackle them, different relaxation techniques may be applied to the original model, in order to reduce the number of states and thus the curse of dimensionality. Such models have been discussed and studied in detail for the CVRPSD in [31, 29].

2.4 Modelling the CVRPSD using SPR or CCP?

When tackling an application of the CVRPSD type, it is important to decide which modelling technique to use between CCP and SPR. The answer lies in the nature of the concrete application that is tackled. Consider the problem of collecting cash from bank branches and bringing it to the central bank. The central bank represents the depot, the amount of cash in the branches cannot be determined in advance, and a threshold on the amount of money carried by the vehicles is imposed for safety reasons, which represents the capacity limit of the vehicles. If a vehicle cannot collect the entire amount of money at some branch because the capacity limit would be exceeded, then the interest on the uncollected amount is lost. In such a problem, it is natural to use the SPR modelling technique, since uncollected cash implies an extra penalty, associated to the wasted interest. The nature of this problem can be easily adapted and modelled by SPR, which imposes penalties upon constraint violations. A problem similar to this example was investigated in [55]. On the other hand, in many situations, the cost of failing to satisfy a constraint cannot be accurately determined, i.e., there is no logical transformation of a constraint failure into a penalty cost function. For instance, consider the distribution of medical supplies in large-scale emergency interventions, like natural health crises or terror attacks [73]. In such situations, a chance constraint modelling amounts to maximising survival chances, which corresponds exactly to the problem goal. Other applications for which CCP is highly recommended concern engineering problems, such as electricity network expansion and chemical engineering. In such problems, it is extremely hard to quantify the cost of failing to make the right decision. Instead, decisions must be made based on a high confidence level. Note that SPR models for the CVRPSD are more difficult to solve than CCP models. Many papers studied and investigated both modelling techniques for the CVRPSD [86, 32, 31, 4, 30], and thus provide further considerations to help the engineer in choosing the modelling technique that is best adapted to his problem.

2.5 Conclusions

In this chapter, the CVRPSD was investigated. It is a stochastic optimisation problem in which the constraints are affected by stochastic variables. We investigated how this stochastic integer linear program can be modelled using stochastic programming modelling approaches: the CCP approach and the SPR approach. CCP aims at generating first stage solutions subject to a probabilistic constraint, without considering the expected recourse cost, while SPR deals with the expected recourse cost by considering that a first stage solution might fail and by allowing recourse actions that generate a recourse cost. Models based on SPR are more involved than models based on CCP. Nevertheless, determining the appropriate model for a problem depends on the nature of the application, as some problems are more amenable to one of the two approaches than to the other. Models based on stochastic programming approaches correspond to the assumption that parameters that are not known with certainty are known in the form of probability distributions. As discussed in the introduction of this report, the probabilistic approach to modelling uncertainty is not necessarily well suited to all real life situations, and other theories, in particular evidence theory, may be used as alternatives to model uncertainty. This theory is introduced in the following chapter, as the main contribution of this thesis is to represent uncertainty on customer demands in the CVRP by evidence theory.

Chapter 3

Evidence Theory

Contents

3.1 Introduction
3.2 Representation of Information
3.2.1 Mass function
3.2.2 Belief and plausibility functions
3.2.3 Informative content comparison
3.2.4 Handling information on product spaces
3.3 Combination of Information
3.4 Uncertainty Propagation
3.5 Expectations
3.6 Conclusions

3.1 Introduction

Evidence theory, also known as belief function theory and Dempster-Shafer theory, is an alternative uncertainty framework to probability theory. It has the advantage of being a more general formalism than probability theory and the set-valued approach, and offers appropriate tools to handle uncertainty and to make decisions under uncertainty. Evidence theory originates in the work of Arthur Dempster [22, 23]. The theory introduced by Dempster was then developed by Glenn Shafer [71] into a formalism that represents and handles uncertain information beyond the probabilistic scope. This theory may be used to model various forms of information, such as expert judgements and statistical evidence, and it also offers tools to combine and propagate uncertainty [1]. In this chapter, we review the concepts of evidence theory that are necessary for the developments in this thesis. Section 3.2 and Section 3.3 introduce the notions for representing and combining information, respectively. Section 3.4 details uncertainty propagation within evidence theory and Section 3.5 presents how to derive upper and lower expectations with respect to uncertainty. The chapter is then concluded in Section 3.6.

3.2 Representation of Information

Generally, we can list all the elementary (exclusive) values x1, . . . , xK that a variable x might take. A finite domain X = {x1, . . . , xK}, which contains all these possible values, is called the frame of discernment. In probability theory, knowledge about x corresponds to associating weights (probabilities) to the elementary values in X, by means of a probability mass function. In the case of epistemic uncertainty, for instance when uncertainty arises from low quality, incomplete or unreliable historical data, or from conflicting expert opinions, it becomes problematic to associate such weights to elementary values in X. Evidence theory is an uncertainty theory suited to such situations, since it associates weights to subsets of X. The main particularity of this theory is that it does not need to distribute the weights given to subsets over the elementary values of those subsets when not enough knowledge or information is available. The mapping of subsets of a domain X to weights is done using a mass function. The concept of a mass function is recalled in the following.

3.2.1 Mass function

A mass function represents uncertain knowledge about a variable x. It is defined as a mapping

mX : 2^X → [0, 1],

such that mX(∅) = 0 and

∑_{A⊆X} mX(A) = 1.

The superscript X can be omitted when there is no risk of confusion. Each mass mX(A) represents the probability of knowing only that x ∈ A.

Definition 3.1. To be consistent with the stochastic case terminology, a variable x whose true value is known in the form of a mass function will be called an evidential variable.

Subsets A ⊆ X, such that mX (A) > 0 are called the focal sets of mX . Special mass functions arise, depending on their focal sets:

• A mass function mX is said to be simple if it has at most two focal sets and one of them is the domain X.

• A mass function mX is said to be categorical (or logical), if it has only one focal set, i.e., mX (A) = 1 for some A ⊆ X. In this case, the information about x and represented by mX is said to be imprecise if |A| > 1, with |A| denoting the cardinality of A. A particular mass function may arise in this case, which is the vacuous mass function. Specifically, a mass function mX is said to be vacuous, if its only focal set is the domain X, i.e., mX (X) = 1. This mass function represents complete ignorance.

• A mass function mX is said to be Bayesian, if its focal sets are singletons, i.e., mX(A) > 0 if and only if |A| = 1.

TABLE 3.1 : Examples of special mass functions

A ⊆ X          m1     m2     m3     m4
∅               0      0      0      0
{x1}            0.3    0      0      0.8
{x2}            0      0      0      0.15
{x1, x2}        0      1      0      0
{x3}            0      0      0      0.05
{x1, x3}        0      0      0      0
{x2, x3}        0      0      0      0
{x1, x2, x3}    0.7    0      1      0

These special mass functions are illustrated in the following example.

Example 3.1. Let the frame of discernment be X = {x1, x2, x3}, meaning that variable x can only take the values x1, x2 or x3. Table 3.1 shows examples of the special mass functions detailed above:

• m1 is a simple mass function, since it has two focal sets and X is one of those;

• m2 is a categorical mass function, as it has only one focal set;

• m3 is the vacuous mass function (its only focal set is X);

• m4 is a Bayesian mass function, since all its focal sets have cardinality 1.

3.2.2 Belief and plausibility functions

Equivalent representations of a mass function are the belief and plausibility functions. More specifically, the belief and plausibility functions can be determined from a mass function, and vice versa a mass function can be determined from the belief or the plausibility function [52]. Hence, there exists a one-to-one correspondence between a mass function and these two functions. The belief function is denoted by Bel (or BelX) and is a mapping

Bel : 2^X → [0, 1]

defined by

Bel(x ∈ A) = ∑_{C⊆A} mX(C),  ∀A ⊆ X.

The degree of belief Bel(x ∈ A) can be interpreted as the probability that the evidence about x represented by mX implies x ∈ A, as Bel(x ∈ A) assembles the evidence supporting x ∈ A. The plausibility function is denoted by Pl (or PlX) and is a mapping

Pl : 2^X → [0, 1]

defined as

Pl(x ∈ A) = ∑_{C∩A≠∅} mX(C),  ∀A ⊆ X.

The degree of plausibility Pl(x ∈ A) is the probability that the evidence is consistent with x ∈ A, as Pl(x ∈ A) assembles the evidence making x ∈ A possible. There is also a one-to-one correspondence between a belief and a plausibility function, as a belief function can be determined from a plausibility function and vice versa, using the following formulas:

Bel(x ∈ A) = Pl(x ∈ X) − Pl(x ∈ Aᶜ),  ∀A ⊆ X,
Pl(x ∈ A) = Bel(x ∈ X) − Bel(x ∈ Aᶜ),  ∀A ⊆ X,   (3.1)

where Aᶜ denotes the complement of A. Therefore, a mass function, a belief function and a plausibility function are three equivalent representations of a piece of evidence, or of a state of belief induced by this evidence.

Remark 3.1. Until now, the term “belief function” referred to the function Bel. Nevertheless, it is common to call a mass function, a belief function or a plausibility function a “belief function”, because of the one-to-one correspondence between these functions.

Example 3.2. Suppose X = {x1, x2, x3, x4, x5} is the frame of discernment of a variable x. Let mX be a mass function representing uncertain knowledge about x, defined as:

mX({x2}) = 0.2,
mX({x1, x2}) = 0.07,
mX({x2, x3}) = 0.15,
mX({x4}) = 0.3,
mX({x1, x5}) = 0.1,
mX({x2, x3, x4, x5}) = 0.18.

Consider A = {x2, x3, x4}. We have

Bel(x ∈ A) = mX({x2}) + mX({x2, x3}) + mX({x4}) = 0.2 + 0.15 + 0.3 = 0.65

and

Pl(x ∈ A) = mX({x2}) + mX({x1, x2}) + mX({x2, x3}) + mX({x4}) + mX({x2, x3, x4, x5}) = 0.2 + 0.07 + 0.15 + 0.3 + 0.18 = 0.9.

Besides, we can determine the degree of plausibility in the complement of A as follows:

Pl(x ∈ Aᶜ) = mX({x1, x2}) + mX({x1, x5}) + mX({x2, x3, x4, x5}) = 0.07 + 0.1 + 0.18 = 0.35.

Since Pl(x ∈ X) = 1, we can verify that Bel(x ∈ A) = 1 − Pl(x ∈ Aᶜ).

Conversely, we can determine the degree of belief in the complement of A as follows:

Bel(x ∈ Aᶜ) = mX({x1, x5}) = 0.1.

Since Bel(x ∈ X) = 1, we can verify that Pl(x ∈ A) = 1 − Bel(x ∈ Aᶜ).
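Since Bel and Pl only require summing the masses of focal sets included in, respectively intersecting, the event of interest, they are straightforward to compute. A minimal sketch (Python, with focal sets represented as frozensets; names are illustrative) reproducing Example 3.2:

```python
# Mass function of Example 3.2: focal sets (as frozensets) mapped to masses.
m = {
    frozenset({"x2"}): 0.2,
    frozenset({"x1", "x2"}): 0.07,
    frozenset({"x2", "x3"}): 0.15,
    frozenset({"x4"}): 0.3,
    frozenset({"x1", "x5"}): 0.1,
    frozenset({"x2", "x3", "x4", "x5"}): 0.18,
}

def bel(m, event):
    """Bel(x in A): total mass of the focal sets included in A."""
    return sum(mass for focal, mass in m.items() if focal.issubset(event))

def pl(m, event):
    """Pl(x in A): total mass of the focal sets intersecting A."""
    return sum(mass for focal, mass in m.items() if focal.intersection(event))

A = frozenset({"x2", "x3", "x4"})
print(bel(m, A), pl(m, A))   # ~0.65 and ~0.9, as in Example 3.2
```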

The belief and plausibility functions satisfy the following properties:

• Bel(x ∈ ∅) = Pl(x ∈ ∅) = 0;

• Bel(x ∈ X) = Pl(x ∈ X) = 1;

• Bel(x ∈ A) ≤ Pl(x ∈ A), for all A ⊆ X;

• Bel and Pl are non-additive measures, as:
  – Bel(x ∈ A ∪ C) ≥ Bel(x ∈ A) + Bel(x ∈ C) − Bel(x ∈ A ∩ C);
  – Pl(x ∈ A ∪ C) ≤ Pl(x ∈ A) + Pl(x ∈ C) − Pl(x ∈ A ∩ C).

Note that, if mX is a Bayesian mass function, then Bel(x ∈ A) = Pl(x ∈ A) for all A ⊆ X. In this case, Bel is additive and is a probability measure, and mX is a probability mass function. Thus, evidence theory extends the probabilistic representation of uncertainty. Moreover, extensions of probabilistic concepts such as marginalisation, conditioning, stochastic ordering, etc., have been defined for belief functions; some of these concepts are covered in the following sections.

3.2.3 Informative content comparison

The least commitment principle [76] postulates that, given a set of belief functions compatible with a set of constraints, the most appropriate one is the least informative. To make this principle operational, one needs means to compare the informative contents of belief functions. The informative content of two set-valued pieces of information x ∈ A and x ∈ B, A, B ⊆ X, about x is naturally compared by saying that x ∈ A is more informative than x ∈ B if A ⊂ B. Several extensions of this concept to compare the informative content of belief functions have been proposed [33, 88].

In particular, an extension relies on the notion of specialisation: a mass function m1 is said to be at least as informative (or specific) as another mass function m2, which is denoted by m1 ⊑ m2, if and only if there exists a non-negative square matrix, called a specialisation matrix,

S = [S(A, B)],  A, B ∈ 2^X,

verifying

∑_{A⊆X} S(A, B) = 1,  ∀B ⊆ X,

S(A, B) > 0 ⇒ A ⊆ B,  ∀A, B ⊆ X,

and

m1(A) = ∑_{B⊆X} S(A, B) m2(B),  ∀A ⊆ X.

The term S(A, B) may be seen as the proportion of the mass m2(B) that is transferred to A.

3.2.4 Handling information on product spaces

Often, we need to represent knowledge about several variables, where each variable may be defined on a different domain. Similarly to probability theory, dealing with several variables involves

• using mass functions defined on product spaces;

• and using some specific operations dedicated to reasoning with such mass functions.

Evidence theory offers two basic operations for handling information on product frames: marginalisation and vacuous extension.

Marginalisation

Let mX×Y denote a mass function defined on the Cartesian product X × Y of the finite domains X and Y of two variables x and y. Namely, the mass function mX×Y is a joint mass function representing partial knowledge about the variables x and y on the product space X × Y. Suppose we are only interested in the variable x. Then, we can infer knowledge about x from the partial knowledge on the domain X × Y, using the marginalisation operation denoted by ↓ (a downward arrow). Specifically, the marginalisation of mX×Y on X is the mass function mX×Y↓X on X, defined as

mX×Y↓X(A) = ∑_{B⊆X×Y : B↓X=A} mX×Y(B),  ∀A ⊆ X,

where B↓X denotes the projection of B onto X. The mass function mX×Y↓X is called the marginal of mX×Y on X.

Example 3.3. Suppose the frames of discernment X and Y are defined, respectively, as X = {x1, x2, x3} and Y = {y1, y2}. Consider the joint mass function mX×Y on the product space X × Y defined as:

mX×Y({(x1, y1), (x3, y1)}) = 0.4,
mX×Y(X × Y) = 0.6.

Computing the marginal mass function mX×Y↓X yields:

mX×Y↓X({x1, x3}) = 0.4,
mX×Y↓X(X) = 0.6.

Vacuous extension

Conversely to marginalisation, let mX be a mass function defined on X representing marginal knowledge about a variable x. One may extend the knowledge represented by mX to the domain X × Y by performing the vacuous extension operation [71], denoted by ↑ (an upward arrow). Precisely, the vacuous extension of mX to X × Y is the mass function mX↑X×Y on X × Y defined as:

mX↑X×Y(B) = mX(A)  if B = A × Y for some A ⊆ X,
mX↑X×Y(B) = 0  otherwise.

Example 3.4. To illustrate the vacuous extension operation, suppose we have the same frames of discernment X and Y as defined in Example 3.3. Now suppose we initially have a piece of knowledge on X represented by the mass function mX defined as:

mX({x1}) = 0.5,
mX({x1, x3}) = 0.3,
mX(X) = 0.2.

The vacuous extension of mX to X × Y is:

mX↑X×Y({(x1, y1), (x1, y2)}) = 0.5,
mX↑X×Y({(x1, y1), (x1, y2), (x3, y1), (x3, y2)}) = 0.3,
mX↑X×Y(X × Y) = 0.2.
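Both operations act only on the focal sets, leaving the masses unchanged (marginalisation may merge focal sets with identical projections). The following sketch (Python; the representation of joint focal sets as sets of pairs and the function names are illustrative) reproduces Examples 3.3 and 3.4:

```python
# Joint mass function of Example 3.3 on X x Y, with X = {x1, x2, x3}, Y = {y1, y2}.
X, Y = {"x1", "x2", "x3"}, {"y1", "y2"}
m_xy = {
    frozenset({("x1", "y1"), ("x3", "y1")}): 0.4,
    frozenset((x, y) for x in X for y in Y): 0.6,
}

def marginalise_on_x(m_joint):
    """Marginal on X of a joint mass function: project each focal set onto X."""
    out = {}
    for focal, mass in m_joint.items():
        proj = frozenset(x for (x, _) in focal)
        out[proj] = out.get(proj, 0.0) + mass
    return out

def vacuous_extension(m_x, Y):
    """Vacuous extension of a mass function on X to X x Y: focal set A becomes A x Y."""
    return {frozenset((x, y) for x in focal for y in Y): mass
            for focal, mass in m_x.items()}

print(marginalise_on_x(m_xy))          # {x1, x3}: 0.4 and X: 0.6 (Example 3.3)

m_x = {frozenset({"x1"}): 0.5, frozenset({"x1", "x3"}): 0.3, frozenset(X): 0.2}
print(vacuous_extension(m_x, Y))       # the three extended focal sets of Example 3.4
```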

3.3 Combination of Information

Evidence theory provides mechanisms that allow combining information from multiple sources. Such mechanisms are known as combination rules. These rules are associated to metaknowledge about the sources providing the pieces of information. The oldest and most popular combination rule is Dempster’s rule [22]. This rule is justified when sources providing pieces of evidence are reliable and independent.

Formally, let m1 and m2 be two mass functions on X provided by two independent and reliable sources. Then m1 and m2 can be combined by Dempster’s rule, denoted by ⊕, and the result of the combination m1 ⊕ m2 is the mass function m1⊕2 defined by:

m1⊕2(A) = (1 / (1 − κ)) ∑_{B∩C=A} m1(B) m2(C),  ∀A ⊆ X, A ≠ ∅,   (3.2)

where

κ = ∑_{B∩C=∅} m1(B) m2(C),

and m1⊕2(∅) = 0.

Remark 3.2. Dempster’s rule is not defined when κ = 1.

Example 3.5. Let m1 and m2 be two mass functions representing pieces of evidence about a variable x defined on X = {x1, x2, x3}. Let m1 and m2 be defined as in the second and third columns of Table 3.2, respectively.

Combining the two pieces of evidence represented by m1 and m2 by Dempster’s rule involves intersecting their focal sets, as detailed in Table 3.3.

The mass function m1⊕2 resulting from their combination is then:

m1⊕2({x2}) = 0.2,
m1⊕2({x1, x2}) = 0.04,
m1⊕2({x2, x3}) = 0.76.

TABLE 3.2 : Mass functions m1 and m2

Focal sets        m1      m2
∅                  0       0
{x1}               0       0
{x2}               0.2     0
{x1, x2}           0       0.05
{x3}               0       0
{x1, x3}           0       0
{x2, x3}           0       0.95
{x1, x2, x3}       0.8     0

TABLE 3.3 : Performing the combination of m1 and m2 of Table 3.2

Focal sets                   m1⊕2
{x2} ∩ {x1, x2}              0.01
{x2} ∩ {x2, x3}              0.19
{x1, x2, x3} ∩ {x1, x2}      0.04
{x1, x2, x3} ∩ {x2, x3}      0.76
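Equation (3.2) can be implemented by intersecting focal sets pairwise, exactly as laid out in Table 3.3. A possible sketch (Python; the function name and data representation are illustrative) reproducing Example 3.5:

```python
from itertools import product

def dempster(m1, m2):
    """Dempster's rule of combination (Eq. 3.2) for mass functions over frozensets."""
    combined, conflict = {}, 0.0
    for (f1, w1), (f2, w2) in product(m1.items(), m2.items()):
        inter = f1.intersection(f2)
        if inter:
            combined[inter] = combined.get(inter, 0.0) + w1 * w2
        else:
            conflict += w1 * w2            # mass kappa assigned to the empty set
    if conflict == 1.0:
        raise ValueError("Dempster's rule is not defined under total conflict (kappa = 1)")
    return {focal: mass / (1.0 - conflict) for focal, mass in combined.items()}

# Mass functions of Table 3.2.
m1 = {frozenset({"x2"}): 0.2, frozenset({"x1", "x2", "x3"}): 0.8}
m2 = {frozenset({"x1", "x2"}): 0.05, frozenset({"x2", "x3"}): 0.95}

print(dempster(m1, m2))
# {x2}: 0.2, {x1, x2}: 0.04, {x2, x3}: 0.76 (as in Table 3.3, with kappa = 0)
```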

Note that Dempster’s combination rule is:

• commutative, i.e., for any m1 and m2, we have m1 ⊕ m2 = m2 ⊕ m1;

• associative, i.e., for any m1, m2 and m3, we have (m1 ⊕ m2) ⊕ m3 = m1 ⊕ (m2 ⊕ m3).

Now suppose we are given two mass functions mX and mY about two variables x and y. Their combination by Dempster’s rule on X × Y is obtained by combining their vacuous extensions on X × Y : mX ⊕ mY := mX↑X×Y ⊕ mY ↑X×Y , (3.3)

which comes down to

(mX ⊕ mY)(A) = mX(B) mY(C)  if A = B × C for some B ⊆ X, C ⊆ Y,
(mX ⊕ mY)(A) = 0  otherwise.

Example 3.6. Suppose we have the same frames of discernment X and Y as in Examples 3.3 and 3.4, and we are given the mass function mX defined as

mX({x3}) = 0.1,
mX({x2, x3}) = 0.2,
mX(X) = 0.7,

and the mass function mY defined as

mY({y2}) = 0.9,
mY(Y) = 0.1.

To combine mX and mY by Dempster’s rule, first the vacuous extension of mX to X ×Y should be determined:

mX↑X×Y({(x3, y1), (x3, y2)}) = 0.1,
mX↑X×Y({(x2, y1), (x2, y2), (x3, y1), (x3, y2)}) = 0.2,
mX↑X×Y(X × Y) = 0.7,

and the vacuous extension of mY to X × Y should be determined too:

mY↑X×Y({(x1, y2), (x2, y2), (x3, y2)}) = 0.9,
mY↑X×Y(X × Y) = 0.1.

Afterwards, we can perform mX↑X×Y ⊕ mY ↑X×Y , and the result of this operation is given in Table 3.4.

In practice, the operation in Equation (3.3) can be used to obtain joint knowledge about x and y from marginal knowledge on each of these variables, when the sources supplying this marginal knowledge can be safely assumed to be independent. Furthermore, note that if joint knowledge about x and y is obtained in this way, then x and y are said to be evidentially independent (or independent for short) [71].

TABLE 3.4 : Mass function mX↑X×Y ⊕ mY↑X×Y of Example 3.6

Focal sets                                    mX↑X×Y ⊕ mY↑X×Y
{(x3, y2)}                                    0.09
{(x3, y1), (x3, y2)}                          0.01
{(x2, y2), (x3, y2)}                          0.18
{(x2, y1), (x2, y2), (x3, y1), (x3, y2)}      0.02
{(x1, y2), (x2, y2), (x3, y2)}                0.63
X × Y                                         0.07

3.4 Uncertainty Propagation

Let x and y be two variables defined on respective finite domains X and Y and such that y = f(x) for some function f : X → Y. Assume that the value of x is known in the form of mass function mX . In the following, we detail what can be inferred about y given the type of mX , starting from the simplest case where mX is precise and categorical to the general case where mX has an arbitrary number of focal sets.

Case 1: If mX is defined by mX({xi}) = 1,

for some xi ∈ X, meaning that the value of x is known without any uncertainty. Then, it is clear that what is known about the value of y is that it is equal to f(xi) ∈ Y.

Case 2: Assume now mX is defined by

mX (A) = 1,

for some A ⊆ X, meaning that the value of x is known imprecisely (mX is a categorical mass function). Then, it can only be inferred that

y ∈ B ⊆ Y,

with B defined as (using a common abuse of notation for the image of a set)

B = f(A) = ∪_{xi∈A} f(xi).

Case 3: More generally, if mX has a finite number of focal sets, then the probability m(A) should be transferred to f(A) and thus what is known about y is represented by a mass function mY defined as

mY(B) = ∑_{f(A)=B} mX(A),  ∀B ⊆ Y.   (3.4)

Note that a more formal derivation of Equation (3.4) can be obtained by modelling and handling, with the tools of evidence theory already exposed, all the available pieces of evidence about x and y. Specifically, function f can be encoded by a categorical mass function mf on X × Y such that

mf(Af) = 1,  with Af = {(xi, f(xi)) | xi ∈ X}.

It is then easy to show that mY defined by Equation (3.4) verifies

mY = (mf ⊕ mX↑X×Y)↓Y,

that is, mY is obtained by combining mf and mX using Dempster’s rule, and marginalising the result on Y.

The preceding reasoning can be extended to functions of more than one variable. Let x1, . . . , xN and y be N + 1 variables defined on the respective finite domains X1, . . . , XN and Y, and such that y = f(x1, . . . , xN) for some function f : X1 × · · · × XN → Y. Assume that the values of the variables x1, . . . , xN are known with some uncertainty represented by a mass function mX1×···×XN on X1 × · · · × XN. Then, uncertainty about the value of y is represented by a mass function mY defined as [34]:

mY(B) = ∑_{f(A)=B} mX1×···×XN(A),  ∀B ⊆ Y,   (3.5)

with f(A) = ∪_{(x1,...,xN)∈A} f(x1, . . . , xN) for all A ⊆ X1 × · · · × XN. Let us illustrate Equation (3.5) in an important particular case, which will be met later in this report.

Suppose X1 = X2 = · · · = XN = ℕ*. Suppose further that mX1×···×XN has a finite number of focal sets and that they are all Cartesian products of N intervals, i.e., for all A ⊆ X1 × · · · × XN such that mX1×···×XN(A) > 0, we have

A = A↓X1 × · · · × A↓XN,

with, for i = 1, . . . , N,

A↓Xi = ⟦Ai; Āi⟧

for some integers Ai, Āi ∈ Xi such that Ai ≤ Āi. Let f(x1, . . . , xN) = ∑_{i=1}^{N} xi. Then, Equation (3.5) can be rewritten as

mY(⟦B; B̄⟧) = ∑_{⟦A1; Ā1⟧ + · · · + ⟦AN; ĀN⟧ = ⟦B; B̄⟧} mX1×···×XN(⟦A1; Ā1⟧ × · · · × ⟦AN; ĀN⟧),

where ⟦T; T̄⟧ + ⟦R; R̄⟧ denotes the extension to intervals of the sum of integers, i.e., ⟦T; T̄⟧ + ⟦R; R̄⟧ = ⟦T + R; T̄ + R̄⟧ [78]. Note that one obtains a mass function mX1×···×XN having focal sets that are all Cartesian products of N intervals if, e.g., one has independent marginal knowledge on each of the variables xi, represented by mass functions mXi having interval focal sets. Actually, in this latter particular case, Equation (3.5) reduces to [87]:

mY(⟦B; B̄⟧) = ∑_{⟦A1; Ā1⟧ + · · · + ⟦AN; ĀN⟧ = ⟦B; B̄⟧} ∏_{i=1}^{N} mXi(⟦Ai; Āi⟧).   (3.6)

Equation (3.6) is illustrated by Example 3.7.

Example 3.7. Let x1 and x2 be two variables with domains X1 = X2 = ℕ*. Suppose the value of x1 is known in the form of a mass function mX1 defined as

mX1(⟦7; 9⟧) = 0.8,
mX1(⟦2; 4⟧) = 0.2,

and that the value of x2 is known in the form of a mass function mX2 defined as

mX2(⟦4; 9⟧) = 0.25,
mX2(⟦5; 6⟧) = 0.75.

Suppose we want to infer what can be said about the sum x1 + x2 of these variables given the uncertain marginal knowledge mX1 and mX2 about them. Assuming that these pieces of knowledge are independent, we obtain from (3.6) that knowledge about their sum is represented by a mass function on ℕ* defined as:

m(⟦11; 18⟧) = 0.2,
m(⟦12; 15⟧) = 0.6,
m(⟦6; 13⟧) = 0.05,
m(⟦7; 10⟧) = 0.15.

Let us remark that the degree of belief that, e.g., x1 + x2 ≤ 15 is then

Bel(x1 + x2 ≤ 15) = m(⟦12; 15⟧) + m(⟦6; 13⟧) + m(⟦7; 10⟧) = 0.8.

Similarly, the degree of plausibility that x1 + x2 ≤ 15 is

Pl(x1 + x2 ≤ 15) = m(⟦12; 15⟧) + m(⟦6; 13⟧) + m(⟦7; 10⟧) + m(⟦11; 18⟧) = 1.
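When focal sets are integer intervals and the marginal mass functions are independent, Equation (3.6) amounts to adding interval bounds and multiplying masses. The sketch below (Python; names are illustrative) reproduces Example 3.7, including the belief and plausibility that the sum does not exceed 15.

```python
from itertools import product

def sum_interval_masses(*masses):
    """Mass function of the sum of independent evidential variables whose focal
    sets are integer intervals (lo, hi), following Equation (3.6)."""
    out = {(0, 0): 1.0}
    for m in masses:
        new = {}
        for ((lo1, hi1), w1), ((lo2, hi2), w2) in product(out.items(), m.items()):
            key = (lo1 + lo2, hi1 + hi2)           # interval sum [[lo1+lo2; hi1+hi2]]
            new[key] = new.get(key, 0.0) + w1 * w2
        out = new
    return out

m1 = {(7, 9): 0.8, (2, 4): 0.2}
m2 = {(4, 9): 0.25, (5, 6): 0.75}
m_sum = sum_interval_masses(m1, m2)
print(m_sum)   # (11, 18): 0.2, (12, 15): 0.6, (6, 13): 0.05, (7, 10): 0.15

# Bel and Pl that the sum is at most 15 (cf. Example 3.7).
bel = sum(w for (lo, hi), w in m_sum.items() if hi <= 15)   # ~0.8
pl = sum(w for (lo, hi), w in m_sum.items() if lo <= 15)    # 1.0
print(bel, pl)
```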

3.5 Expectations

Suppose that knowledge about a variable x taking its values in a domain X is represented by mX. In addition, consider that we have a function

h : X → ℝ+.

In evidence theory, it is possible to compute the lower and upper expected values of h relative to mX. The lower expected value E∗(h, mX) and the upper expected value E^∗(h, mX) of h relative to mX are defined, respectively, as [24]

E∗(h, mX) = ∑_{A⊆X} mX(A) min_{x∈A} h(x),   (3.7)

E^∗(h, mX) = ∑_{A⊆X} mX(A) max_{x∈A} h(x).   (3.8)

Example 3.8. Consider that the frame of discernment of a variable x is X = {x1, x2, x3}. Now suppose the function h mapping the values taken by x onto ℝ+ is defined as

h(x1) = 8.5,

h(x2) = 5,

h(x3) = 12.8.

Let mass function mX representing knowledge about x be defined, as:

mX({x1, x2}) = 0.6,
mX({x1, x3}) = 0.4.

The lower expected value of h relative to mX is:

E∗(h, mX) = mX({x1, x2}) · 5 + mX({x1, x3}) · 8.5 = 0.6 · 5 + 0.4 · 8.5 = 6.4.

The upper expected value of h relative to mX is:

E^∗(h, mX) = mX({x1, x2}) · 8.5 + mX({x1, x3}) · 12.8 = 0.6 · 8.5 + 0.4 · 12.8 = 10.22.
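Computing the lower and upper expectations (3.7)-(3.8) only requires the minimum and maximum of h over each focal set. A minimal sketch (Python) reproducing Example 3.8:

```python
# Function h and mass function of Example 3.8.
h = {"x1": 8.5, "x2": 5, "x3": 12.8}
m = {frozenset({"x1", "x2"}): 0.6, frozenset({"x1", "x3"}): 0.4}

# Lower and upper expected values of h relative to m (Eqs. 3.7 and 3.8).
lower = sum(mass * min(h[x] for x in focal) for focal, mass in m.items())
upper = sum(mass * max(h[x] for x in focal) for focal, mass in m.items())
print(lower, upper)   # ~6.4 and ~10.22
```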

Remark 3.3. If mX is Bayesian, then E∗(h, mX) and E^∗(h, mX) reduce to the classical (probabilistic) expected value: the expected value of h relative to the probability mass function mX.

3.6 Conclusions

In this chapter, we reviewed the necessary concepts of evidence theory. This theory offers tools to represent uncertain knowledge about variables, and to combine and propagate such knowledge. Lower and upper expectations relative to uncertainty can also be derived using this theory. The concepts recalled in this chapter will be needed in this thesis to represent and manipulate uncertainty on customer demands in the CVRP. Namely, when the theory of evidence represents uncertainty on customer demands, this leads to an optimisation problem which will be referred to as the CVRPED. Evidence theory can naturally account for uncertainty on customer demands in diverse situations, e.g., when pieces of information on customer demands are partially reliable or when statistical data is not sufficient to predict customer demands. The remaining chapters of this thesis are dedicated to the CVRPED. Specifically, we propose to model the CVRPED by methods that we call the BCP modelling of the CVRPED and the recourse modelling of the CVRPED. The next chapter investigates the CVRPED when modelled by a BCP approach.

Chapter 4

A Belief-Constrained Programming Approach to the CVRPED

Contents

4.1 Introduction
4.2 The CVRPED
4.3 Modelling the CVRPED by BCP
4.3.1 Formalisation
4.3.2 Particular cases
4.3.3 Influence of the model parameters on the optimal solution cost
4.3.4 Influence of customer demand specificity on the optimal solution cost
4.4 Solving the CVRPED Modelled by BCP
4.4.1 The simulated annealing algorithm
4.4.2 A configuration in the algorithm
4.4.3 The CVRPED benchmarks
4.4.4 Experimental study
4.5 Conclusions

4.1 Introduction

The CVRP, introduced in Chapter 1, is an important variation of the VRP, where m vehicles have an identical capacity limit Q, and n customers have deterministic demands di, i = 1, . . . , n, such that di > 0 and di ≤ Q, i = 1, . . . , n. The objective in this problem is to determine the set of routes of minimum cost that can collect the customer demands di, i = 1, . . . , n, while respecting the capacity limit of each vehicle. Nevertheless, uncertainty on customer demands arises in many situations, and it may be difficult to determine customer demands before a vehicle arrives at a customer location, for instance in garbage collection problems, where quantities of garbage cannot be determined in advance.


Accordingly, several authors (see Chapter 2) tackled uncertainty on customer demands by assuming they are random variables; the associated problem is the well-known CVRPSD. This stochastic integer linear program can be modelled via the CCP approach, which is one of the main approaches to addressing stochastic mathematical programs (see Section 2.2). Other authors [79, 62] addressed uncertainty on customer demands using a set-valued representation of uncertainty: they determine the set of routes having the minimum cost against the worst case realisation of customer demands. Such an optimisation approach is known as the robust optimisation approach, and is sensible when all that is known about a customer demand is that it belongs to some interval. Evidence theory, which was introduced in Chapter 3, is an alternative uncertainty framework to probability theory that extends both the probabilistic and the set-valued approaches. Uncertainty on customer demands may be naturally represented by belief functions, and in such a case a new VRP is obtained: the CVRPED. Using the theory of evidence in this problem seems particularly interesting as it allows one to account for uncertain information about customer demands, such as knowing that each customer demand belongs to one or more sets with a given probability allocated to each set - an intermediary situation between probabilistic and set-valued knowledge. In this chapter, we generalise the CCP modelling of the CVRPSD into a BCP modelling of the CVRPED. Section 4.2 introduces the CVRPED problem in more detail. Section 4.3 describes and studies the CVRPED when modelled by a BCP approach. In particular, the latter section discusses important special cases of this model and provides theoretical results. In Section 4.4, the BCP model is solved using a simulated annealing algorithm and experiments are presented for instances derived from CVRP benchmarks. We conclude the chapter in Section 4.5.

4.2 The CVRPED

The CVRPED is an integer linear program involving uncertainty represented by belief functions. Thus, customer demands in the CVRP are no longer deterministic or random, but evidential, i.e., the variables di, i = 1, . . . , n, are evidential. As in Chapter 2, we assume actual customer demands to be positive integers, hence the value of the demand of any customer belongs to the set Θ = {1, 2, . . . , Q}. In addition, since the CVRPED involves n evidential variables di, i = 1, . . . , n, with respective domains Θi := Θ, i = 1, . . . , n, this formally means that knowledge about customer demands in this problem is represented by a mass function mΘn on

Θn := Θ1 × · · · × Θn.   (4.1)

n mΘ ({(2, 3, 4), (3, 5, 9), (3, 4, 6)}) = 0.7, (4.2) n mΘ ({(5, 5, 6), (7, 6, 8)}) = 0.3. (4.3) To understand the above mass function, let us interpret, e.g., Equation (4.3): it means that a weight 0.3 is allocated to the fact of knowing only that either client 1 demand is 5, client 2 demand is 5, and client 3 demand is 6, or client 1 demand is 7, client 2 demand is 6, and client 3 demand is 8. CHAPTER 4. A BELIEF-CONSTRAINED PROGRAMMING APPROACH TO THE CVRPED 63

In practical situations, it may be the case that only marginal knowledge, in the form of a mass function mi defined on Θi, is available about the individual demand of each customer i, i = 1, . . . , n. In such a case, as explained in Section 3.3, mΘn can be derived by assuming that these pieces of knowledge about individual customer demands have been provided by independent sources, in which case, using Equation (3.3), we have

n Θ n Θi m = ⊕i=1mi . (4.4) Example 4.2. Suppose we have n = 2 customers and 1 vehicle with capacity limit Q = 5. Knowledge about each customer i, i = 1, 2, demand is represented by the respective inde- Θ1 Θ2 pendent mass functions m1 and m2 , defined as follows:

Θ1 m1 ({2, 3}) = 1; and Θ2 m2 ({4, 5}) = 1. Θn As Q = 5, we have Θ1 = Θ2 = {1, 2, 3, 4, 5}. In addition, to determine m , we perform the following: 2 mΘ = mΘ1↑Θ1×Θ2 ⊕ mΘ2↑Θ1×Θ2 . (4.5)

Θ1 Performing the vacuous extension of m to Θ1 × Θ2 yields mΘ1↑Θ1×Θ2 ({(2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (3, 1), (3, 2), (3, 3), (3, 4), (3, 5)}) = 1.

Similarly mΘ2↑Θ1×Θ2 is such that mΘ2↑Θ1×Θ2 ({(1, 4), (2, 4), (3, 4), (4, 4), (5, 4), (1, 5), (2, 5), (3, 5), (4, 5), (5, 5)}) = 1. Consequently using Equation (4.5), we obtain mΘ2 defined as: mΘ2 ({(2, 4), (2, 5), (3, 4), (3, 5)}) = 1. Mass function mΘ2 represents knowledge about the 2 customers in this example, given mass Θ1 Θ2 functions m1 and m2 which are independent and represent knowledge about the demand of client 1 and client 2, respectively.
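The derivation of $m^{\Theta^2}$ in Example 4.2 can be reproduced with a few lines of code. The sketch below is only illustrative: it assumes mass functions are encoded as Python dictionaries mapping focal sets to masses, and it exploits the fact that, for vacuous extensions of marginal mass functions, Dempster's rule involves no conflict, so the joint focal sets are simply the Cartesian products of the marginal focal sets with product masses (Equation (4.4)).

from itertools import product

def combine_independent(marginals):
    """Joint mass function from independent marginal mass functions.

    Each marginal is a dict {focal_set: mass}, where a focal set is an
    iterable of possible demand values for that customer. Because the
    vacuous extensions of the marginals never conflict, Dempster's rule
    reduces to taking Cartesian products of focal sets and multiplying
    the masses (Equation (4.4))."""
    joint = {}
    for combo in product(*(m.items() for m in marginals)):
        focal = frozenset(product(*(sorted(fs) for fs, _ in combo)))
        mass = 1.0
        for _, w in combo:
            mass *= w
        joint[focal] = joint.get(focal, 0.0) + mass
    return joint

# Example 4.2: two customers, Q = 5
m1 = {frozenset({2, 3}): 1.0}
m2 = {frozenset({4, 5}): 1.0}
print(combine_independent([m1, m2]))
# one focal set {(2, 4), (2, 5), (3, 4), (3, 5)} with mass 1.0, as in Example 4.2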

In other words, if necessary and justified, the evidential variables $d_i$, $i = 1, \ldots, n$, may be assumed to be independent, similarly to what may be done in the stochastic case. Let us underline that the BCP modelling of the CVRPED presented in the following section is general and does not rely on such an independence assumption, i.e., the model does not need $m^{\Theta^n}$ to be derived in this way. Let us already emphasize that the same holds for the recourse approach to the CVRPED introduced in the next chapter.

4.3 Modelling the CVRPED by BCP

A generalisation of the (individual) chance-constrained model for the CVRPSD presented in Section 2.2 to the case of evidential demands is proposed in this section. The model is provided in Section 4.3.1. Important particular cases of this model are discussed in Section 4.3.2. The influence of the model parameters and of customer demand specificity on the optimal solution cost are studied in Sections 4.3.3 and 4.3.4, respectively.

4.3.1 Formalisation

A BCP modelling of the CVRPED amounts to keeping the same optimisation problem described in Section 1.3.1, except that capacity constraints (1.7) are replaced by the following belief-constraints:
\[
\mathrm{Bel}\left(\sum_{i=1}^{n} d_i \sum_{j=0}^{n} w_{i,j}^{k} \le Q\right) \ge 1 - \overline{\beta}, \quad k = 1, \ldots, m, \quad (4.6)
\]
\[
\mathrm{Pl}\left(\sum_{i=1}^{n} d_i \sum_{j=0}^{n} w_{i,j}^{k} \le Q\right) \ge 1 - \underline{\beta}, \quad k = 1, \ldots, m, \quad (4.7)
\]
with $\overline{\beta} \ge \underline{\beta}$ and where $1 - \overline{\beta}$ (resp. $1 - \underline{\beta}$) is the minimum allowable degree of belief (resp. plausibility) that a vehicle capacity is respected on any route.

Remark 4.1. From (3.1), constraints (4.7) are equivalent to
\[
\mathrm{Bel}\left(\sum_{i=1}^{n} d_i \sum_{j=0}^{n} w_{i,j}^{k} > Q\right) \le \underline{\beta}, \quad k = 1, \ldots, m. \quad (4.8)
\]
Hence, constraints (4.6) and (4.7) amount to requiring that, for any route, there is a lot (at least $1 - \overline{\beta}$) of support (belief) for respecting the vehicle capacity and not a lot (at most $\underline{\beta}$) of support for violating it.

To evaluate the belief-constraints (4.6) and (4.7), the total demand on every route must be determined by summing all customer demands on that route. Consider a route R having N clients; the sum of the customer demands on R is obtained using Equation (3.5) of Section 3.4, where f is the addition of integers and where $m^{X_1 \times \cdots \times X_N}$ is the marginalisation of $m^{\Theta^n}$ on the domains of the evidential variables $d_{r_1}, \ldots, d_{r_N}$ associated with the N clients on the route, with $X_i$ the domain of the evidential variable $d_{r_i}$ associated with the i-th client on R. For instance, if the 3rd client on R is client 5, then $d_{r_3} = d_5$.

Suppose further that the number of focal sets of mass function $m^{X_1 \times \cdots \times X_N}$ is at most c. Then the worst-case complexity of evaluating the belief-constraints (4.6) and (4.7) on this route is $O(Q^N \cdot N \cdot c)$. This complexity emerges from the following:

• the $Q^N$ factor is the maximal number of elements of a focal set of mass function $m^{X_1 \times \cdots \times X_N}$;

• the N factor comes from the fact that there are N clients on the route, so for each element of a focal set of $m^{X_1 \times \cdots \times X_N}$ an addition of N integers must be performed; this explains the N term alongside $Q^N$;

• the last factor c comes from performing the product $N \cdot Q^N$ for each of the c focal sets of mass function $m^{X_1 \times \cdots \times X_N}$.

Nonetheless, in the particular and realistic case where the focal sets of $m^{X_1 \times \cdots \times X_N}$ are all Cartesian products of N intervals, i.e., for all $A \subseteq X_1 \times \cdots \times X_N$ such that $m^{X_1 \times \cdots \times X_N}(A) > 0$ we have
\[
A = A^{\downarrow X_1} \times \cdots \times A^{\downarrow X_N}
\]
with, for $i = 1, \ldots, N$, $A^{\downarrow X_i} = ⟦\underline{A}_i; \overline{A}_i⟧$ for some integers $\underline{A}_i, \overline{A}_i \in X_i$ such that $\underline{A}_i \le \overline{A}_i$, the worst-case complexity drops down to $O(N \cdot c)$.

Example 4.3 illustrates the evaluation of the belief-constraints on a route in a given solution to a CVRPED modelled by BCP.

FIGURE 4.1 : Illustration of a solution composed of two routes for the five customers of Example 4.3 (one route serving customers 4, 1 and 2, the other serving customers 3 and 5).

Example 4.3. Suppose we have n = 5 customers and m = 2 vehicles, where each vehicle has a capacity limit Q = 15, $\overline{\beta} = 0.1$ and $\underline{\beta} = 0.05$. Moreover, we are given the mass function $m^{\Theta^n}$ representing knowledge about the demands of the n customers, i.e.,
\[
\Theta^n = \Theta^5 = \Theta_1 \times \Theta_2 \times \Theta_3 \times \Theta_4 \times \Theta_5,
\]
such that $m^{\Theta^5}$ is defined as:
\[
m^{\Theta^5}(\{(2, 3, 8, 4, 5), (3, 5, 6, 7, 4), (3, 4, 7, 6, 2)\}) = 0.5,
\]
\[
m^{\Theta^5}(\{(5, 5, 6, 4, 7), (7, 6, 5, 3, 4)\}) = 0.3,
\]
\[
m^{\Theta^5}(\{(4, 6, 7, 4, 6), (5, 5, 6, 5, 7)\}) = 0.2.
\]

Consider a solution to this problem, specifically the one illustrated in Figure 4.1. Let us calculate the belief constraints on the route that collects the demand of customer 4, then the demand of customer 1 and finally the demand of customer 2. Call this route R. On this route we have

$d_{r_1} = d_4$, $X_1 = \Theta_4$;

$d_{r_2} = d_1$, $X_2 = \Theta_1$;

$d_{r_3} = d_2$, $X_3 = \Theta_2$.

The marginalisation of $m^{\Theta^5}$ on $X_1 \times X_2 \times X_3$ is the mass function $m^{\Theta^5 \downarrow X_1 \times X_2 \times X_3}$, defined as:
\[
m^{\Theta^5 \downarrow X_1 \times X_2 \times X_3}(\{(4, 2, 3), (7, 3, 5), (6, 3, 4)\}) = 0.5,
\]
\[
m^{\Theta^5 \downarrow X_1 \times X_2 \times X_3}(\{(4, 5, 5), (3, 7, 6)\}) = 0.3,
\]
\[
m^{\Theta^5 \downarrow X_1 \times X_2 \times X_3}(\{(4, 4, 6), (5, 5, 5)\}) = 0.2.
\]
Now, given $m^{\Theta^5 \downarrow X_1 \times X_2 \times X_3}$ and using Equation (3.5) with f being the addition of customer demands, we can evaluate knowledge on the sum of client demands on route R, which is the mass function m defined as:

m({9, 15, 13}) = 0.5, m({14, 16}) = 0.3, m({14, 15}) = 0.2.

Evaluating the belief constraints on R amounts to:

\[
\mathrm{Bel}(d_4 + d_1 + d_2 \le 15) = 0.7, \quad (4.9)
\]
\[
\mathrm{Pl}(d_4 + d_1 + d_2 \le 15) = 1. \quad (4.10)
\]

Note that the belief constraints in (4.9) and (4.10) are equivalent respectively to

\[
\mathrm{Bel}(d_{r_1} + d_{r_2} + d_{r_3} \le 15) = 0.7 < 0.9, \quad (4.11)
\]
\[
\mathrm{Pl}(d_{r_1} + d_{r_2} + d_{r_3} \le 15) = 1 > 0.95. \quad (4.12)
\]

In Equation (4.12), the minimal allowable degree of plausibility (which is $1 - \underline{\beta} = 0.95$) is respected. Nevertheless, in Equation (4.11), the minimal allowable degree of belief is not respected, since the belief that the sum of customer demands on route R respects the vehicle capacity is equal to 0.7, which is less than $1 - \overline{\beta} = 0.9$. This means that the belief-constraints are not respected on this route, and any solution (set of routes) containing this route is not a feasible solution. Note that there is then no need to evaluate the belief-constraints on the route that serves customers 3 then 5.
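The computation carried out in Example 4.3 can be checked with a short script. The following sketch is illustrative only: it assumes the marginalised mass function of a route is stored as a dictionary mapping focal sets (frozensets of demand vectors) to masses, applies Equation (3.5) with f the addition of integers, and then evaluates the left-hand sides of the belief-constraints.

def sum_mass(route_mass):
    """Propagate a joint mass function through the sum of demands (Equation (3.5))."""
    m = {}
    for focal, w in route_mass.items():
        image = frozenset(sum(vec) for vec in focal)
        m[image] = m.get(image, 0.0) + w
    return m

def bel_pl_leq(m, Q):
    """Belief and plausibility that the summed demand does not exceed Q."""
    bel = sum(w for A, w in m.items() if max(A) <= Q)   # A entirely within [0, Q]
    pl = sum(w for A, w in m.items() if min(A) <= Q)    # A intersects [0, Q]
    return bel, pl

# Example 4.3, route R serving customers 4, 1 and 2, Q = 15
m_route = {
    frozenset({(4, 2, 3), (7, 3, 5), (6, 3, 4)}): 0.5,
    frozenset({(4, 5, 5), (3, 7, 6)}): 0.3,
    frozenset({(4, 4, 6), (5, 5, 5)}): 0.2,
}
m_sum = sum_mass(m_route)     # focal sets {9, 13, 15}, {14, 16}, {14, 15}
print(bel_pl_leq(m_sum, 15))  # Bel = 0.7 and Pl = 1, as in Equations (4.9)-(4.10)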

In the above example, we illustrated how to evaluate the belief-constraints starting from the assumption that customer demands are correlated and that the only information available is a joint mass function. As discussed above, this case may have an important worst-case complexity. On the other hand, if the focal sets of the joint mass function are Cartesian products of intervals, then the complexity is significantly reduced. In particular, as explained in Section 3.4, this happens if we have independent evidential demands $d_i$, $i = 1, \ldots, n$, and the focal sets of the mass functions $m_i^{\Theta_i}$, $i = 1, \ldots, n$, are all intervals of positive integers, in which case we may compute knowledge on the sum of customer demands using Equation (3.6), as illustrated by Example 4.4.

Example 4.4. Let $\overline{\beta} = 0.1$ and $\underline{\beta} = 0.05$. Suppose a route R with N = 2 customers, a vehicle capacity limit Q = 10, and independent evidential demands $d_{r_1}$ and $d_{r_2}$ (of the first and second customers on R) associated with the respective mass functions
\[
m_1^{X_1}(⟦3; 4⟧) = 0.5, \quad m_1^{X_1}(⟦2; 5⟧) = 0.5
\]
and
\[
m_2^{X_2}(⟦4; 5⟧) = 0.9, \quad m_2^{X_2}(⟦6; 7⟧) = 0.1.
\]
Using Equation (3.6), we obtain the mass function representing knowledge about $d_{r_1} + d_{r_2}$, defined as:
\[
m(⟦7; 9⟧) = 0.45, \quad m(⟦9; 11⟧) = 0.05, \quad m(⟦6; 10⟧) = 0.45, \quad m(⟦8; 12⟧) = 0.05.
\]
Now we can determine the belief-constraints on this route, which amounts to:

\[
\mathrm{Bel}(d_{r_1} + d_{r_2} \le 10) = 0.9 \ge 0.9,
\]
\[
\mathrm{Pl}(d_{r_1} + d_{r_2} \le 10) = 1 > 0.95.
\]
Therefore, route R of Example 4.4 is a feasible route as it respects the belief-constraints.
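When the demands are independent and their focal sets are intervals, the computation of Example 4.4 reduces to interval additions. The sketch below is illustrative only; it encodes each focal interval as a (lower, upper) pair of integers and assumes the interval-based counterpart of Equation (3.6), i.e., focal intervals are summed endpoint-wise and their masses multiplied.

from itertools import product

def interval_sum_mass(marginals):
    """Mass function of the sum of independent demands whose focal sets are
    integer intervals encoded as (lower, upper) pairs."""
    m = {}
    for combo in product(*(mi.items() for mi in marginals)):
        lo = sum(a for (a, b), _ in combo)
        hi = sum(b for (a, b), _ in combo)
        w = 1.0
        for _, wi in combo:
            w *= wi
        m[(lo, hi)] = m.get((lo, hi), 0.0) + w
    return m

def bel_pl_leq_interval(m, Q):
    """Belief and plausibility that the summed demand does not exceed Q."""
    bel = sum(w for (lo, hi), w in m.items() if hi <= Q)
    pl = sum(w for (lo, hi), w in m.items() if lo <= Q)
    return bel, pl

# Example 4.4: Q = 10
m1 = {(3, 4): 0.5, (2, 5): 0.5}
m2 = {(4, 5): 0.9, (6, 7): 0.1}
m = interval_sum_mass([m1, m2])       # (7,9):0.45, (9,11):0.05, (6,10):0.45, (8,12):0.05
print(bel_pl_leq_interval(m, 10))     # Bel = 0.9 and Pl = 1, as in Example 4.4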

4.3.2 Particular cases

It is interesting to remark that, depending on the values chosen for $\overline{\beta}$ and $\underline{\beta}$ as well as on the nature of the evidential demands $d_i$, $i = 1, \ldots, n$, the BCP modelling of the CVRPED may degenerate into simpler or well-known optimisation problems.

• In particular, if $m^{\Theta^n}$ is Bayesian, i.e., we are really dealing with a CVRPSD, then we have, for $k = 1, \ldots, m$,
\[
\mathrm{Bel}\left(\sum_{i=1}^{n} d_i \sum_{j=0}^{n} w_{i,j}^{k} \le Q\right) = \mathrm{Pl}\left(\sum_{i=1}^{n} d_i \sum_{j=0}^{n} w_{i,j}^{k} \le Q\right),
\]
and the BCP modelling of the CVRPED can be converted into an equivalent optimisation problem, which is the CCP modelling of this CVRPSD, with β in constraint (2.1) set to $\underline{\beta}$.

• In contrast, if $m^{\Theta^n}$ is categorical and its only focal set is the Cartesian product of n intervals, i.e., we are dealing with a CVRP where each customer demand $d_i$ is only known in the form of an interval $⟦\underline{d}_i; \overline{d}_i⟧$, then the total demand on any given route is also an interval (its endpoints are obtained by summing the endpoints of the interval demands of the customers on the route) and thus, for any $k = 1, \ldots, m$,
\[
\mathrm{Bel}\left(\sum_{i=0}^{n} d_i \sum_{j=0}^{n} w_{i,j}^{k} \le Q\right) =
\begin{cases}
1, & \text{iff } \sum_{i=0}^{n} \overline{d}_i \sum_{j=0}^{n} w_{i,j}^{k} \le Q,\\
0, & \text{otherwise},
\end{cases}
\]
and
\[
\mathrm{Pl}\left(\sum_{i=0}^{n} d_i \sum_{j=0}^{n} w_{i,j}^{k} \le Q\right) =
\begin{cases}
1, & \text{iff } \sum_{i=0}^{n} \underline{d}_i \sum_{j=0}^{n} w_{i,j}^{k} \le Q,\\
0, & \text{otherwise}.
\end{cases}
\]
Then, since
\[
\sum_{i=0}^{n} \overline{d}_i \sum_{j=0}^{n} w_{i,j}^{k} \le Q \;\Rightarrow\; \sum_{i=0}^{n} \underline{d}_i \sum_{j=0}^{n} w_{i,j}^{k} \le Q,
\]
the belief-constraints (4.6) and (4.7) reduce, when $\overline{\beta} < 1$, to the following constraints:
\[
\sum_{i=0}^{n} \overline{d}_i \sum_{j=0}^{n} w_{i,j}^{k} \le Q, \quad k = 1, \ldots, m. \quad (4.13)
\]
In other words, in the case of interval demands, the BCP modelling amounts to searching for the solution which minimises the overall cost of servicing the customers (Equation (1.1)) under constraints (4.13), i.e., assuming the maximum (worst) possible customer demands, and thus it corresponds to the minimax optimisation procedures encountered in robust optimisation [79].

• If $\overline{\beta} = \underline{\beta}$, then constraints (4.7) can be dropped, that is, only constraints (4.6) need to be evaluated (since if constraints (4.6) are satisfied then constraints (4.7) are necessarily satisfied, due to the relation between the belief and plausibility functions). As a matter of fact, the BCP approach originally introduced in [59] is of this form, as no constraint based on Pl is considered. Most importantly, when $\overline{\beta} = \underline{\beta}$ and the evidential variables $d_i$, $i = 1, \ldots, n$, are independent (and thus $m^{\Theta^n} = \oplus_{i=1}^{n} m_i^{\Theta_i}$), the BCP modelling of the CVRPED can be converted into an equivalent optimisation problem, which is the CCP modelling (with β in constraint (2.1) set to $\overline{\beta}$) of a CVRPSD where customer demands are represented by independent stochastic variables denoted $d_i$, $i = 1, \ldots, n$, with associated probability mass functions $p_i$ obtained from $m_i^{\Theta_i}$ as follows: for each focal set $A \subseteq \Theta_i$ of $m_i^{\Theta_i}$, the mass $m_i^{\Theta_i}(A)$ is transferred to the element $\theta = \max(A)$ (a small sketch of this transformation is given after this list). Indeed, with such a definition of $p_i$, it is easy to show that we have, for $k = 1, \ldots, m$,
\[
\mathrm{Bel}\left(\sum_{i=1}^{n} d_i \sum_{j=0}^{n} w_{i,j}^{k} \le Q\right) = P\left(\sum_{i=1}^{n} d_i \sum_{j=0}^{n} w_{i,j}^{k} \le Q\right).
\]

• Let us eventually remark that the case $\overline{\beta} = 1 > \underline{\beta}$ is the converse of the case $\overline{\beta} = \underline{\beta}$, in the sense that constraints (4.6) can be dropped (as they are necessarily satisfied) and only constraints (4.7) need then to be evaluated. Moreover, in this case, if the evidential variables $d_i$, $i = 1, \ldots, n$, are independent, the BCP modelling of the CVRPED can be converted into an equivalent optimisation problem, which is the CCP modelling (with β in (2.1) set to $\underline{\beta}$) of a CVRPSD where customer demands are represented by independent stochastic variables denoted $d_i$, $i = 1, \ldots, n$, with associated probability mass functions $p_i$ obtained from $m_i^{\Theta_i}$ as follows: for each focal set $A \subseteq \Theta_i$ of $m_i^{\Theta_i}$, the mass $m_i^{\Theta_i}(A)$ is transferred to the element $\theta = \min(A)$. Indeed, with such a definition of $p_i$, it is easy to show that we have, for $k = 1, \ldots, m$,
\[
\mathrm{Pl}\left(\sum_{i=1}^{n} d_i \sum_{j=0}^{n} w_{i,j}^{k} \le Q\right) = P\left(\sum_{i=1}^{n} d_i \sum_{j=0}^{n} w_{i,j}^{k} \le Q\right).
\]

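The last two cases rely on transforming each evidential demand into an ordinary probability mass function by moving the mass of every focal set to its maximum (pessimistic reading, case $\overline{\beta} = \underline{\beta}$) or to its minimum (optimistic reading, case $\overline{\beta} = 1 > \underline{\beta}$). The following sketch is illustrative only; the dictionary encoding of $m_i^{\Theta_i}$ and the example data are assumptions of the sketch.

def to_pmf(m_i, pessimistic=True):
    """Probability mass function obtained from an evidential demand by moving
    each focal set's mass to its maximum (pessimistic) or minimum (optimistic)."""
    p = {}
    for A, w in m_i.items():
        v = max(A) if pessimistic else min(A)
        p[v] = p.get(v, 0.0) + w
    return p

# hypothetical demand: in {3, 4} with mass 0.5 and in {2, ..., 5} with mass 0.5
m = {frozenset({3, 4}): 0.5, frozenset({2, 3, 4, 5}): 0.5}
print(to_pmf(m, pessimistic=True))   # {4: 0.5, 5: 0.5}
print(to_pmf(m, pessimistic=False))  # {3: 0.5, 2: 0.5}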
4.3.3 Influence of the model parameters on the optimal solution cost

In this section, we study the influence of the parameters $\overline{\beta}$, $\underline{\beta}$ and Q on the optimal solution cost of the CVRPED modelled via BCP.

To simplify the presentation, we will denote by $\Sigma_{Q,\overline{\beta},\underline{\beta}}$ the set of solutions to the CVRPED modelled by BCP and by $\hat{C}_{Q,\overline{\beta},\underline{\beta}}$ the cost of an optimal solution in $\Sigma_{Q,\overline{\beta},\underline{\beta}}$, for some $\overline{\beta}$, $\underline{\beta}$ and Q. The following propositions state how the optimal solution cost changes as Q, $\overline{\beta}$ or $\underline{\beta}$ vary.

Proposition 4.1. The optimal solution cost is non-increasing in Q.

Proof. Let us consider a set $C = \{R_1, \ldots, R_m\}$ composed of m routes $R_k$, $k = 1, \ldots, m$, such that it is not known whether this set respects the belief-constraints (4.6) and (4.7), but it is known that it respects all the other constraints of the CVRPED modelled by BCP. It is clear that, for any $\overline{\beta}$ and $\underline{\beta}$, as Q increases (starting from 1), it necessarily reaches a value at which constraints (4.6) and (4.7) are satisfied, and thus at which C becomes a solution to the CVRPED modelled by BCP. Hence,
\[
\Sigma_{Q,\overline{\beta},\underline{\beta}} \subseteq \Sigma_{Q',\overline{\beta},\underline{\beta}} \quad \text{for } Q' \ge Q,
\]
and thus $\hat{C}_{Q',\overline{\beta},\underline{\beta}} \le \hat{C}_{Q,\overline{\beta},\underline{\beta}}$.

Proposition 4.2. The optimal solution cost is non-increasing in $\overline{\beta}$.

Proof. Let us consider a set $C = \{R_1, \ldots, R_m\}$ composed of m routes $R_k$, $k = 1, \ldots, m$, such that it is not known whether this set respects the belief-constraints (4.6), but it is known that it respects all the other constraints of the CVRPED modelled by BCP, in particular constraints (4.7). It is clear that, for any Q, as $\overline{\beta}$ increases from $\underline{\beta}$ to 1, it necessarily reaches a value at which constraints (4.6) are satisfied, and thus at which C becomes a solution to the CVRPED modelled by BCP. Hence,
\[
\Sigma_{Q,\overline{\beta},\underline{\beta}} \subseteq \Sigma_{Q,\overline{\beta}',\underline{\beta}} \quad \text{for } \overline{\beta}' \ge \overline{\beta},
\]
and thus $\hat{C}_{Q,\overline{\beta}',\underline{\beta}} \le \hat{C}_{Q,\overline{\beta},\underline{\beta}}$.

Proposition 4.3. The optimal solution cost is non-increasing in $\underline{\beta}$.

Proof. The proof is similar to that of Proposition 4.2.

Informally, Propositions 4.1–4.3 state that if the decision maker is willing to buy vehicles with a higher capacity, or to accept a higher risk of having the vehicle capacity exceeded on any route, then he will obtain at least as good (at most as costly) solutions.

4.3.4 Influence of customer demand specificity on the optimal solution cost

In this section, we study the influence on the optimal solution cost of the specificity of the knowledge about customer demands, in the case where the evidential variables $d_i$, $i = 1, \ldots, n$, are independent and where the focal sets of the mass functions $m_i^{\Theta_i}$ representing individual customer demands are all intervals of integers.

Specifically, let us first consider the case where, for $i = 1, \ldots, n$, the demand of customer i is known in the form of a mass function $m_{i+}^{\Theta_i}$ built from $m_i^{\Theta_i}$ as follows: for each interval $A = ⟦\underline{A}; \overline{A}⟧$ such that $m_i^{\Theta_i}(A) > 0$, the mass $m_i^{\Theta_i}(A)$ is transferred to the interval $A^{+} = ⟦\underline{A}; \overline{A} + a^{+}⟧$, with $a^{+} \in ⟦0; Q - \overline{A}⟧$.

Example 4.5. For instance, let Q = 15 and $m_i^{\Theta_i}$ be defined by
\[
m_i^{\Theta_i}(⟦4; 6⟧) = 0.6, \quad m_i^{\Theta_i}(⟦4; 7⟧) = 0.3, \quad m_i^{\Theta_i}(⟦6; 9⟧) = 0.1,
\]
and assume $m_i^{\Theta_i}(⟦4; 6⟧)$, $m_i^{\Theta_i}(⟦4; 7⟧)$ and $m_i^{\Theta_i}(⟦6; 9⟧)$ are transferred respectively to $⟦4; 7⟧$, $⟦4; 7⟧$ and $⟦6; 12⟧$. Then $m_{i+}^{\Theta_i}$ is defined by
\[
m_{i+}^{\Theta_i}(⟦4; 7⟧) = 0.9, \quad m_{i+}^{\Theta_i}(⟦6; 12⟧) = 0.1.
\]

Let us note that when $m_{i+}^{\Theta_i}$ is built from $m_i^{\Theta_i}$ as above, then $m_i^{\Theta_i}$ is at least as specific as $m_{i+}^{\Theta_i}$ (in the sense of the specificity notion introduced in Section 3.2.3), as shown by Lemma 4.1:

Lemma 4.1. Let $m_i^{\Theta_i}$ be a mass function with interval focal sets. Let $m_{i+}^{\Theta_i}$ be the mass function built by transferring $m_i^{\Theta_i}(A)$ to the interval $A^{+} = ⟦\underline{A}; \overline{A} + a^{+}⟧$, with $a^{+} \in ⟦0; Q - \overline{A}⟧$. We have $m_i^{\Theta_i} \sqsubseteq m_{i+}^{\Theta_i}$.

Proof. $m_i^{\Theta_i}$ is recovered from $m_{i+}^{\Theta_i}$ by transferring, for each focal set A of $m_i^{\Theta_i}$, a proportion $\frac{m_i^{\Theta_i}(A)}{m_{i+}^{\Theta_i}(A^{+})}$ of $m_{i+}^{\Theta_i}(A^{+})$ to $A \subseteq A^{+}$. Moreover, if the masses allocated to several focal sets $A_1, \ldots, A_L$ of $m_i^{\Theta_i}$ are transferred to the same interval $A^{+}$, then this means that $m_i^{\Theta_i}$ is recovered from $m_{i+}^{\Theta_i}$ by transferring a proportion $\frac{\sum_{\ell=1}^{L} m_i^{\Theta_i}(A_\ell)}{m_{i+}^{\Theta_i}(A^{+})} = 1$ of $m_{i+}^{\Theta_i}(A^{+})$ to subsets of $A^{+}$, and thus, since this is true for all focal sets of $m_{i+}^{\Theta_i}$, $m_i^{\Theta_i}$ is at least as specific as $m_{i+}^{\Theta_i}$.

Let $\Sigma_{Q,\overline{\beta},\underline{\beta},⟦⟧}$ denote the set of solutions to the CVRPED modelled by BCP and $\hat{C}_{Q,\overline{\beta},\underline{\beta},⟦⟧}$ the cost of an optimal solution in $\Sigma_{Q,\overline{\beta},\underline{\beta},⟦⟧}$, for some $\overline{\beta}$, $\underline{\beta}$ and Q, when the evidential demands $d_i$, $i = 1, \ldots, n$, are independent and when the focal sets of the associated mass functions $m_i^{\Theta_i}$ are all intervals. Furthermore, let $\Sigma^{+}_{Q,\overline{\beta},\underline{\beta},⟦⟧}$ denote the set of solutions to the BCP modelling of the CVRPED and $\hat{C}^{+}_{Q,\overline{\beta},\underline{\beta},⟦⟧}$ the cost of an optimal solution in $\Sigma^{+}_{Q,\overline{\beta},\underline{\beta},⟦⟧}$, for some $\overline{\beta}$, $\underline{\beta}$ and Q, when the evidential demands are independent and known in the form of mass functions $m_{i+}^{\Theta_i}$, $i = 1, \ldots, n$.

Proposition 4.4. We have $\hat{C}^{+}_{Q,\overline{\beta},\underline{\beta},⟦⟧} \ge \hat{C}_{Q,\overline{\beta},\underline{\beta},⟦⟧}$.

Proof. Let R denote a route containing N clients. Without loss of generality, assume that the i-th client on R is client i. Let $m_\Sigma$ denote the mass function representing the sum of the customer demands on R when the demand of client i is known in the form of the mass function $m_i^{\Theta_i}$, and let $m_\Sigma^{+}$ denote the mass function representing the sum of the customer demands on R when the demand of client i is known in the form of the mass function $m_{i+}^{\Theta_i}$.

It is clear that $m_\Sigma$ and $m_\Sigma^{+}$ are respectively obtained by transferring, for each
\[
(A_1, \ldots, A_N) \in \times_{i=1}^{N} 2^{\Theta}
\]
such that $A_i = ⟦\underline{A}_i; \overline{A}_i⟧$ is a focal set of $m_i^{\Theta_i}$, the mass $\prod_{i=1}^{N} m_i^{\Theta_i}(A_i)$ to the intervals
\[
⟦\underline{A}_1 + \cdots + \underline{A}_N; \overline{A}_1 + \cdots + \overline{A}_N⟧
\]
and
\[
⟦\underline{A}^{+}_1 + \cdots + \underline{A}^{+}_N; \overline{A}^{+}_1 + \cdots + \overline{A}^{+}_N⟧ = ⟦\underline{A}_1 + \cdots + \underline{A}_N; \overline{A}_1 + \cdots + \overline{A}_N + a^{+}_1 + \cdots + a^{+}_N⟧,
\]
respectively, with $A^{+}_i = ⟦\underline{A}^{+}_i; \overline{A}^{+}_i⟧ = ⟦\underline{A}_i; \overline{A}_i + a^{+}_i⟧$ the focal set of $m_{i+}^{\Theta_i}$ to which $m_i^{\Theta_i}(A_i)$ is transferred.

Let $bel$ and $bel^{+}$ (resp. $pl$ and $pl^{+}$) denote the belief functions (resp. plausibility functions) associated with $m_\Sigma$ and $m_\Sigma^{+}$, respectively. We then have
\[
bel\left(\sum_{i=1}^{N} d_i \le Q\right) = \sum_{(A_1, \ldots, A_N):\, \sum_{i=1}^{N} \overline{A}_i \le Q} \; \prod_{i=1}^{N} m_i^{\Theta_i}(A_i),
\]
\[
bel^{+}\left(\sum_{i=1}^{N} d_i \le Q\right) = \sum_{(A_1, \ldots, A_N):\, \sum_{i=1}^{N} \overline{A}_i + a^{+}_i \le Q} \; \prod_{i=1}^{N} m_i^{\Theta_i}(A_i),
\]
hence
\[
bel\left(\sum_{i=1}^{N} d_i \le Q\right) \ge bel^{+}\left(\sum_{i=1}^{N} d_i \le Q\right), \quad (4.14)
\]
and
\[
pl\left(\sum_{i=1}^{N} d_i \le Q\right) = \sum_{(A_1, \ldots, A_N):\, \sum_{i=1}^{N} \underline{A}_i \le Q} \; \prod_{i=1}^{N} m_i^{\Theta_i}(A_i) = pl^{+}\left(\sum_{i=1}^{N} d_i \le Q\right). \quad (4.15)
\]
The proposition follows from the fact that Equations (4.14) and (4.15) hold for any route.

Informally, Proposition 4.4 shows that the more pessimistic the knowledge about customer demands is, the greater the cost of the optimal solution.

A counterpart to Proposition 4.4 can similarly be shown to hold if, instead of being more pessimistic, knowledge about customer demands is more optimistic, in the sense that customer demands are known in the form of mass functions $m_{i-}^{\Theta_i}$ rather than $m_i^{\Theta_i}$, $i = 1, \ldots, n$, with $m_{i-}^{\Theta_i}$ built from $m_i^{\Theta_i}$ as follows: for each interval $A = ⟦\underline{A}; \overline{A}⟧$ such that $m_i^{\Theta_i}(A) > 0$, the mass $m_i^{\Theta_i}(A)$ is transferred to the interval $A^{-} = ⟦\underline{A} - a^{-}; \overline{A}⟧$, with $a^{-} \in ⟦0; \underline{A}⟧$. This leads, for any route R containing N clients, to (with obvious notations):
\[
bel\left(\sum_{i=1}^{N} d_i \le Q\right) = bel^{-}\left(\sum_{i=1}^{N} d_i \le Q\right), \quad (4.16)
\]
\[
pl\left(\sum_{i=1}^{N} d_i \le Q\right) \le pl^{-}\left(\sum_{i=1}^{N} d_i \le Q\right), \quad (4.17)
\]
which in turn leads to $\hat{C}^{-}_{Q,\overline{\beta},\underline{\beta},⟦⟧} \le \hat{C}_{Q,\overline{\beta},\underline{\beta},⟦⟧}$, i.e., the more optimistic the knowledge about customer demands is, the less costly the solutions are.

Note that, similarly to how we have shown that $m_i^{\Theta_i} \sqsubseteq m_{i+}^{\Theta_i}$, we can show that $m_i^{\Theta_i} \sqsubseteq m_{i-}^{\Theta_i}$. Hence, perhaps contrary to intuition, the results in this section show that the optimal solution cost will not necessarily be higher if knowledge about customer demands is less specific: it will be higher if it is less specific and more pessimistic.

Propositions 4.1–4.4 provide theoretical properties of the CVRPED solutions obtained under the BCP approach when using exact optimisation methods. For now, such methods cannot solve large instances of the CVRP, from which the CVRPED derives. As a matter of fact, Section 4.4 reports a solution strategy for the BCP modelling of the CVRPED using a meta-heuristic algorithm.

4.4 Solving the CVRPED Modelled by BCP

In this section, we solve the BCP model exposed previously for the CVRPED. It is solved using a simulated annealing algorithm, which was presented in Chapter 1, as exact solution methods might need a prohibitively large time to solve large instances. First, in Section 4.4.1, we present the simulated annealing algorithm used to solve the BCP model. Then, in Section 4.4.2, details on generating a configuration in the simulated annealing algorithm are explained. Benchmarks for the CVRPED are then exposed in Section 4.4.3. At last, experimental studies using these benchmarks are exposed in Section 4.4.4.

4.4.1 The simulated annealing algorithm

Recall that simulated annealing is a local search optimisation method that moves iteratively from solution to solution in the space of candidate solutions (the search space) by applying local changes, until a satisfying near-optimal solution is found.

The simulated annealing algorithm that we employed for the BCP model of the CVRPED is shown in Algorithm 1. It starts by generating an initial configuration (candidate solution) C using the initial_config(...) routine, when the initial temperature of the system T is at its highest value. Afterwards, T is progressively decreased until reaching the freezing temperature freez, while a sequence of tot_iter iterations is performed for each T. Throughout each iteration iter, a neighbourhood configuration C* of the current configuration C is generated, and the variation in cost ∆cost is computed. In other words, each configuration represents an intermediate solution with a different cost, computed using the cost routine, and ∆cost is equal to the difference between the cost of the neighbourhood configuration, C*cost, and the cost of the current configuration, Ccost. If the cost decreases then the move to the new configuration is accepted (lines 16–18).

However, if ∆cost is positive then the move is accepted or rejected with a probability equal to $e^{-\Delta cost / T}$. Effectively, the probability of accepting inferior solutions is a function of the temperature T and of the change in cost ∆cost. We repeat this whole process for a total number of trials tot_tr and the algorithm finally returns the best configuration ever visited.

Algorithm 1 Simulated annealing algorithm for the CVRPED modelled by BCP
Require: initial temperature T, temperature reduction multiplier κ, freezing temperature freez, total number of iterations tot_iter, total number of trials tot_tr
Ensure: best solution ever visited BestC
 1: Bestcost = ∞
 2: for tr = 0 to tot_tr do
 3:   if tr == 0 then
 4:     C = initial_config(greedy, BCP)              ▷ greedy generation
 5:   else
 6:     C = initial_config(random, BCP)              ▷ random generation
 7:   end if
 8:   Ccost = cost(C, BCP)
 9:   TBestC = C
10:   TBestcost = Ccost
11:   repeat
12:     for iter = 0 to tot_iter do
13:       C* = neighbourhood_configuration(C, BCP)
14:       C*cost = cost(C*, BCP)
15:       ∆cost = C*cost − Ccost
16:       if (∆cost < 0) then
17:         C = C*
18:         Ccost = C*cost
19:         if C*cost < TBestcost then
20:           TBestC = C*
21:           TBestcost = C*cost
22:         end if
23:       else if rnd ≤ e^(−∆cost/T) then            ▷ rnd is a random number in [0, 1]
24:         C = C*
25:         Ccost = C*cost
26:       end if
27:     end for
28:     T = κ · T
29:   until (T == freez)
30:   if (TBestcost < Bestcost) then
31:     BestC = TBestC
32:     Bestcost = TBestcost
33:   end if
34: end for

The set of parameters controlling Algorithm 1 was experimentally determined using the Augerat set A instances for the CVRP [70], in which vertex coordinates were randomly constructed [3]. Specifically, the initial temperature T was set to 5000 and was decreased by a temperature reduction multiplier κ set to 0.82 until reaching a freezing temperature freez equal to 1. The total number of iterations tot_iter was set to 30000, while the total number of trials of the algorithm tot_tr was set to 5. The results with our algorithm varied between 1% and 12% from the optimal solutions of the CVRP, with an average running time under 30 minutes, for all instances. This algorithm is an adaptation of the algorithm introduced in [44] for the CVRP. Let us remark that a configuration in Algorithm 1 is a set of routes that can be generated either by the initial_config(...) or the neighbourhood_configuration(...) routines, while the cost routine evaluates the objective value of the configuration. All these routines depend on a common parameter specifying that we are solving a CVRPED modelled by BCP. This is presented in greater detail in the following subsection.

4.4.2 A configuration in the algorithm

The initial_config routine generates an initial configuration either randomly or using a first-fit greedy approach (a blind algorithm that inserts clients in routes in turn: each client is inserted in the first route that can serve it), while respecting the constraints of the BCP model from Section 4.3.1. Besides, the neighbourhood_configuration(...) routine generates a neighbourhood configuration that respects the BCP constraints, and is based on two neighbourhood operators that are applied consecutively at each iteration of Algorithm 1. These neighbourhood operators are called fix_minimum and replace_highest_average and are described below.

• Fix_minimum: This operator is applied 80% of the time in each iteration. It is based on selecting and freezing the positions in routes of the five customers with the shortest distances to their right-side customer (given a pair of consecutive customers <i − 1, i>, the right-side customer is the i-th customer). This is done by computing the distances between each pair of consecutive customers on all routes, including distances to the depot. Accordingly, fix_minimum selects the five smallest distance values and fixes their corresponding left-side customers. Next, fix_minimum selects five random customers, excluding the depot and the customers fixed before, and removes them from their routes. Subsequently, every removed customer is inserted in a random route, while satisfying the problem constraints. The insert position of each customer on the selected route is determined based on the shortest distance separating it from its new left-side customer.

Example 4.6. Suppose configuration C consists of the set of routes C = {(0, 3, 6, 10, 0), (0, 1, 5, 8, 4, 9, 7, 0), (0, 2, 0)}. Consider that the smallest distances between consecutive clients are those between clients <3, 6>, <6, 10>, <1, 5>, <4, 9> and <7, 0>. This means that customers 3, 6, 1, 4 and 7 cannot be removed from their routes by the fix_minimum operator. Consequently, fix_minimum selects five random customers excluding 3, 6, 1, 4, 7 and the depot; in this example, the selected customers can only be 2, 5, 8, 9 and 10. For instance, customer 5 is selected, removed from the second route in C and inserted randomly in one of the three available routes in C, while respecting all problem constraints. After selecting the new route for client 5, it is inserted at the position with the resulting smallest distance to client 5. The same process is repeated to move the four other selected customers.

The fix_minimum operator was inspired by the intensification process of the tabu search meta-heuristic algorithm recalled in Appendix A.1, where good components of a promising solution are frozen (fixed). Hence, when moving from one solution to another, promising components of a set of routes are frozen and the neighbourhood of this solution is explored while keeping the good portions.

• Replace_highest_average: This neighbourhood operator calculates the average distance separating every customer from its neighbours in the current route configuration. Computing the average distance for a client i reduces to calculating $(c_{i-1,i} + c_{i,i+1})/2$, assuming clients i − 1, i and i + 1 are consecutive on the route (the notation $c_{i-1,i}$ indicates the travel cost between i − 1 and i, as defined in Section 1.3.1, which in our algorithm is assumed to be the Euclidean distance separating the customers). Afterwards, replace_highest_average selects the five customers having the five highest average distances and removes them from their routes; a small sketch of this selection step is given after this list. The removed clients are then randomly inserted in the available routes, as long as the problem constraints are respected. Furthermore, every removed customer is inserted into the route position leading to the smallest average distance separating this customer from its new left and right neighbours.

Example 4.7. Suppose a configuration C that consists of the set of routes C = {(0, 1, 3, 8, 0), (0, 2, 5, 4, 9, 10, 0), (0, 6, 7, 0)}. Suppose 1, 8, 5, 9 and 7 are the clients having the five highest average distances (separating each one of these clients from its neighbours). Then, these clients are removed from their routes, and we obtain C = {(0, 3, 0), (0, 2, 4, 10, 0), (0, 6, 0)}. Afterwards, each removed client is inserted randomly in one of the available routes. The position where a customer is inserted is chosen such that: i) the problem constraints are respected; and ii) the new position of that customer on the chosen route has the smallest average distance compared to all other possible positions of this client on the chosen route. For instance, suppose the first route in C was chosen randomly for client 1. We know that inserting client 1 on that route does not violate the belief-constraints, and it can be inserted either before client 3 or right after client 3. Suppose also that inserting client 1 right after client 3 yields a smaller average distance than inserting it right before client 3. Then, client 1 is inserted right after client 3. This same process is repeated for the remaining clients 8, 5, 9 and 7.

The replace_highest_average operator was inspired by the crossover operator in genetic algorithms, recalled in Appendix A.2. Indeed, this operator concentrates on creating new configurations that inherit good parts of previous configurations.
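The selection step of replace_highest_average can be sketched as follows. This is only an illustration, not the thesis implementation: the route encoding (lists of customer indices with the depot 0 at both ends), the cost function passed as a parameter and the small coordinate set used in the usage example are all assumptions of the sketch.

import math

def avg_neighbour_distance(route, pos, cost):
    """Average distance between the customer at position pos on the route and
    its two neighbours (Section 4.4.2)."""
    i_prev, i, i_next = route[pos - 1], route[pos], route[pos + 1]
    return (cost(i_prev, i) + cost(i, i_next)) / 2.0

def highest_average_customers(routes, cost, k=5):
    """Customers with the k highest average distances to their neighbours;
    these are the ones removed by replace_highest_average."""
    scored = []
    for route in routes:                      # e.g. [0, 1, 3, 8, 0]
        for pos in range(1, len(route) - 1):  # skip the depot at both ends
            scored.append((avg_neighbour_distance(route, pos, cost), route[pos]))
    scored.sort(reverse=True)
    return [c for _, c in scored[:k]]

# usage with Euclidean costs computed from hypothetical customer coordinates
coords = {0: (0, 0), 1: (2, 1), 3: (5, 5), 8: (6, 1)}
cost = lambda a, b: math.dist(coords[a], coords[b])
print(highest_average_customers([[0, 1, 3, 8, 0]], cost, k=2))  # e.g. [8, 3]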

The cost(...) routine in Algorithm 1 corresponds to the objective function of the problem. Recall that the objective function is the (classical) total travelled distance of the routes (Equation (1.1) in Section 1.3.1).

The complexity of an iteration in Algorithm 1 emerges from evaluating the BCP constraints, in particular the belief-constraints (4.6) and (4.7), whose complexity is provided in Section 4.3.1. Finally, note that if the neighbourhood_configuration(...) routine in Algorithm 1 cannot find a neighbourhood configuration that satisfies the BCP constraints, then the routine is attempted a second time. If the routine does not succeed at the second attempt either, then the configuration C is not modified, i.e., C* = C.

4.4.3 The CVRPED benchmarks

We have generated two sets of instances, called CVRPED instances and CVRPED+ instances respectively, based on the Augerat set A instances of the CVRP [70]. Each instance in these two sets corresponds to an instance in Augerat set A and has the same customer coordinates and capacity limit as this instance.

For each instance of the first set, i.e., the CVRPED instances, the knowledge $m^{\Theta^n}$ on customer demands is obtained by assuming that the evidential client demands $d_i$, $i = 1, \ldots, n$, of this instance are independent. Moreover, each $d_i$ is associated with the mass function $m_i^{\Theta_i}$ defined by
\[
m_i^{\Theta_i}(\{d_i^{det}\}) = 0.8, \quad m_i^{\Theta_i}([z_i, \overline{z}_i]) = 0.2, \quad (4.18)
\]
with $d_i^{det}$ the original deterministic demand of client i in the corresponding instance of Augerat set A, and with $z_i$ and $\overline{z}_i$ drawn at random in $(d_i^{det}, Q]$ and $[z_i, Q]$, respectively.

For each instance of the second set, i.e., the CVRPED+ instances, the evidential client demands $d_i$ are also assumed to be independent, and their associated mass function is denoted by $m_{i,+}^{\Theta_i}$ and defined from $m_i^{\Theta_i}$ as:
\[
m_{i,+}^{\Theta_i}([d_i^{det}, d_i^{det} + a_i^{+}]) = 0.8, \quad m_{i,+}^{\Theta_i}([z_i, \overline{z}_i]) = 0.2, \quad (4.19)
\]
with $a_i^{+}$ drawn randomly in $[0, z_i - d_i^{det} - 1]$. Note that $m_i^{\Theta_i} \sqsubseteq m_{i,+}^{\Theta_i}$, $i = 1, \ldots, n$.

In the next sections, an experimental study is presented for the BCP model solved using Algorithm 1 on the described CVRPED and CVRPED+ instances.

Remark 4.2. The CVRPED instances, the CVRPED+ instances and their description file can be found at the private links provided in [45, 46] and in [47], respectively. The programs were written in Java and the experiments were conducted on the 5 nodes of a cluster. The configuration of each node is as follows: 2 Intel® Xeon® E5-2630 v3 processors with 8 cores per processor and 48 GB of memory shared between the 2 × 8 cores of the node. Each instance was executed on one core that has a memory of 2.8 GB.
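As an illustration of how such instances can be generated, the sketch below draws the mass function of a single customer according to Equations (4.18) and (4.19). It is only a sketch: encoding intervals as (lower, upper) pairs, representing a precise demand by a degenerate interval and the rng parameter are choices made here, not part of the original instance generator.

import random

def evidential_demand(d_det, Q, plus=False, rng=random):
    """Mass function of one customer's demand, following Eqs. (4.18)-(4.19).

    Assumes 1 <= d_det < Q. Intervals are encoded as (lower, upper) integer
    pairs; a precise demand is the degenerate interval (d, d)."""
    z_lo = rng.randint(d_det + 1, Q)           # z_i drawn in (d_det, Q]
    z_hi = rng.randint(z_lo, Q)                # upper bound drawn in [z_i, Q]
    if not plus:                               # CVRPED instance, Eq. (4.18)
        return {(d_det, d_det): 0.8, (z_lo, z_hi): 0.2}
    a_plus = rng.randint(0, z_lo - d_det - 1)  # CVRPED+ instance, Eq. (4.19)
    return {(d_det, d_det + a_plus): 0.8, (z_lo, z_hi): 0.2}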

4.4.4 Experimental study

Our set of experiments on the BCP model of the CVRPED involves varying the values of $\overline{\beta}$ and $\underline{\beta}$ used in the Bel and Pl constraints (4.6)–(4.7), for the CVRPED and the CVRPED+ instances separately, where for each variation each instance was solved 30 times.

Cost variation based on $\overline{\beta}$ and $\underline{\beta}$

In this first part, we show the results of our experiments for the CVRPED and the CVRPED+ instances in Tables 4.1 and 4.2, respectively. We solved the BCP model of the CVRPED for two different values chosen for the pair $(\overline{\beta}, \underline{\beta})$, such that $\overline{\beta} \ge \underline{\beta}$, and both values were employed for the CVRPED and the CVRPED+ instances separately. The columns appearing in each of these tables are explained in the following. The first column is the name of each instance. The first field in this column (l in Table 4.1 and l+ in Table 4.2) gives the identification number of the instance, and the second field "A" stands for aleatory, indicating that the coordinates of the problem graph vertices were generated randomly in the original Augerat set A [3]. The third field designates the number n of vertices, while the last field provides the number m of vehicles. The "Best cost", "Std. dev." and "Avg. runtime" columns show respectively the best solution cost, the standard deviation and the average running time obtained for each indicated value of the pair $(\overline{\beta}, \underline{\beta})$.

We notice that the costs of the best solutions obtained with $\overline{\beta} = 0.4$, $\underline{\beta} = 0.25$ are lower than the costs of the best solutions obtained with $\overline{\beta} = 0.2$, $\underline{\beta} = 0.15$, in Table 4.1 as well as in Table 4.2; that is, the most constraining pair $(\overline{\beta}, \underline{\beta})$ induces the worst costs, as can be expected from Propositions 4.2 and 4.3. This shows that while our solving algorithm is not an exact optimisation method, it does exhibit experimentally a sound behaviour with respect to the parameters $\overline{\beta}$ and $\underline{\beta}$.

Cost variation based on client demand specificity

This section compares the BCP results on the CVRPED instances with the BCP results on the CVRPED+ instances. Specifically, Table 4.3 compares the best costs from Table 4.1 (CVRPED instances) with the best costs from Table 4.2 (CVRPED+ instances), for the same instance id.

Recall that for each client i in an l instance, its mass function $m_i^{\Theta_i}$ is at least as specific as the mass function associated with client i in the corresponding l+ instance, i.e.,
\[
m_i^{\Theta_i} \sqsubseteq m_{i,+}^{\Theta_i}, \quad i = 1, \ldots, n.
\]
Proposition 4.4 predicted an increase in the cost of an optimal solution to the BCP model when knowledge about client demands becomes less specific and more pessimistic, i.e., when knowledge about each client demand is given by the mass function $m_{i,+}^{\Theta_i}$, $i = 1, \ldots, n$, which is more pessimistic and not as specific as $m_i^{\Theta_i}$, $i = 1, \ldots, n$. We can observe this behaviour in the results presented in Table 4.3: for each pair $(\overline{\beta}, \underline{\beta})$, the best cost obtained with the CVRPED+ instances is higher than the one obtained with the CVRPED instances. This constitutes another experimental validation of the behaviour of our algorithm.

4.5 Conclusions

This chapter studied a BCP modelling approach for the CVRPED. In this approach, evidence theory models uncertainty on customer demands and the BCP model postulates a minimum bound on the credibility and the plausibility that the capacity constraints are respected. Particular cases of the model were examined. Specifically, by examining special types of evidential demands or particular values of the model parameters, the BCP model proved to be related to the CCP model as well as to the robust optimisation technique for modelling uncertainty on customer demands in a CVRP. Moreover, theoretical results relating variations of the optimal solution cost with i) variations of the parameters involved in the model, and ii) variations of customer demand specificity, were provided. The studied model was solved and experimentally tested with a meta-heuristic algorithm using instances generated from well-known CVRP instances.

The BCP modelling of the CVRPED examined in this chapter is a generalisation of the CCP modelling of the CVRPSD to the evidence theory framework. In the following chapter, the CVRPED is modelled using an extension of the other main approach to modelling stochastic programs, that is, the extension of the SPR approach used for the CVRPSD.

TABLE 4.1 : Results of the simulated annealing algorithm for the BCP model using the CVRPED instances. Each instance l-A-nn-mm gives the instance id l, the number n of clients and the number m of vehicles; the two column groups correspond to $(\overline{\beta}, \underline{\beta}) = (0.4, 0.25)$ and $(\overline{\beta}, \underline{\beta}) = (0.2, 0.15)$.

Instance        Best cost  Std. dev.  Avg. runtime | Best cost  Std. dev.  Avg. runtime
1-A-n32-m12       1418.3      3.7       3881s.     |  1850.9      5.3        3733s.
2-A-n33-m13       1055.3      0         4199s.     |  1491.6     17.5        4496s.
3-A-n33-m13       1073.1      6         4495s.     |  1480.2      0.6        4549s.
4-A-n34-m14       1320.6      0.1       3818s.     |  1749.1      0          3852s.
5-A-n36-m12       1318.9      2.7       5316s.     |  1718.6      0.2        4914s.
6-A-n37-m13       1110.6      5         4918s.     |  1358.8     34.5        6158s.
7-A-n37-m14       1597.9      0.5       4135s.     |  2113.9      2.8        3756s.
8-A-n38-m13       1154.5      0.9       5041s.     |  1571.1      5.1        5002s.
9-A-n39-m15       1485        8.1       4654s.     |  1944.8      0.9        4622s.
10-A-n39-m14      1403.9      8.6       4894s.     |  1906.9      0.2        5108s.
11-A-n44-m17      1693.4     10.1       4956s.     |  2158.2      1.2        4951s.
12-A-n45-m17      1660.3      0.1       5093s.     |  2184.8      5.1        5169s.
13-A-n45-m18      1890.1      5.7       4991s.     |  2573.2      0.7        5211s.
14-A-n46-m17      1552.2      5.4       5323s.     |  1980.3     38          5707s.
15-A-n48-m17      1872.4     10.8       5996s.     |  2397.5      3.6        6395s.
16-A-n53-m19      1806.1     11.9       7405s.     |  2358.2     10          6928s.
17-A-n54-m19      2052.6     13.8       7578s.     |  2636.8     60.8        6747s.
18-A-n55-m22      1755.3      9.5       6310s.     |  2352.6      9.8        6172s.
19-A-n60-m22      2263.9     17         8169s.     |  2969.1     34.8        7449s.
20-A-n61-m24      1793.8      9.9       6965s.     |  2345       42.3        7319s.
21-A-n62-m22      2532.3     19.1       8212s.     |  3207.2     32.2        7752s.
22-A-n63-m24      2946.9     14.6       7164s.     |  3918.9     22          7216s.
23-A-n63-m25      2179        8.9       6969s.     |  2881.1      6.9        7238s.
24-A-n64-m23      2629.1     16.8       7979s.     |  3261.5     13.6        7848s.
25-A-n65-m25      2214.7     16.4       7537s.     |  3070.9      6          7601s.
26-A-n69-m25      2056.5     11.1       8876s.     |  2668.1     52.9        8377s.
27-A-n80-m27      3507.2     21.5      11110s.     |  4524.9     21.5        9751s.

TABLE 4.2 : Results of the simulated annealing algorithm for the BCP model using the CVRPED+ instances. Each instance l+-A-nn-mm gives the instance id l+, the number n of clients and the number m of vehicles; the two column groups correspond to $(\overline{\beta}, \underline{\beta}) = (0.4, 0.25)$ and $(\overline{\beta}, \underline{\beta}) = (0.2, 0.15)$.

Instance        Best cost  Std. dev.  Avg. runtime | Best cost  Std. dev.  Avg. runtime
1+-A-n32-m16      1830.8     28.3       3087s.     |  2225.4      0          3565s.
2+-A-n33-m16      1428.8      0.7       4048s.     |  1676.9      0.4        3790s.
3+-A-n33-m14      1196        0         4330s.     |  1502       24.2        4604s.
4+-A-n34-m16      1596        0.01      3389s.     |  1993        0          3719s.
5+-A-n36-m16      1755.3      4         3928s.     |  2145.7      0          4178s.
6+-A-n37-m18      1379.2      8.7       5179s.     |  1761.2      0          4347s.
7+-A-n37-m19      1957        0         3344s.     |  2542.6      0          4112s.
8+-A-n38-m16      1437.4      2.2       4017s.     |  1846.4      2.2        4479s.
9+-A-n39-m19      1915.5     11.2       3915s.     |  2203.5      0          4089s.
10+-A-n39-m18     1759        4.2       4255s.     |  2148.1      0          4140s.
11+-A-n44-m26     2234.2      1.5       4218s.     |  2796.6      0.4        5698s.
12+-A-n45-m22     2165.7      3.1       4360s.     |  2690.8      0          4601s.
13+-A-n45-m22     2287.7      0.9       4339s.     |  3099.4      0          4783s.
14+-A-n46-m22     1950.4      6.7       4603s.     |  2690.1      0          4795s.
15+-A-n48-m22     2359.8      5.9       5742s.     |  2956.3      1.8        5500s.
16+-A-n53-m25     2411.2      9.4       5747s.     |  3199.8      0          5883s.
17+-A-n54-m24     2591       11.4       6025s.     |  3165.7      0          5746s.
18+-A-n55-m27     2237.1      5.1       5716s.     |  2803.1      0          5493s.
19+-A-n60-m27     2744.8      9.5       6224s.     |  3415.5      6.4        6548s.
20+-A-n61-m30     2313.4      9.1       6125s.     |  3059.2      0          5877s.
21+-A-n62-m31     3217.7      8.9       6814s.     |  4291        3.1        6276s.
22+-A-n63-m30     3833        9.7       6057s.     |  4942.1      4.9        6257s.
23+-A-n63-m31     2755        5.8       5757s.     |  3638.6      0.4        6127s.
24+-A-n64-m29     3311.2     27.9       6656s.     |  4123.1      1.7        6848s.
25+-A-n65-m34     2748.1      7.7       6762s.     |  3509.1      9          6842s.
26+-A-n69-m33     2573.4      8.5       7135s.     |  3300.5     41.1        7125s.
27+-A-n80-m38     4995.8     14.7       7444s.     |  5986.3     13.5        7612s.

TABLE 4.3 : Results of the simulated annealing algorithm applied to the BCP model for the CVRPED and CVRPED+ instances: best costs for the pairs $(\overline{\beta}, \underline{\beta}) = (0.4, 0.25)$ and $(\overline{\beta}, \underline{\beta}) = (0.2, 0.15)$.

CVRPED instances                      | CVRPED+ instances
Id    (0.4, 0.25)   (0.2, 0.15)       | Id     (0.4, 0.25)   (0.2, 0.15)
1       1418.3        1850.9          | 1+       1830.8        2225.4
2       1055.3        1491.6          | 2+       1428.8        1676.9
3       1073.1        1480.2          | 3+       1196          1502
4       1320.6        1749.1          | 4+       1596          1993
5       1318.9        1718.6          | 5+       1755.3        2145.7
6       1110.6        1358.8          | 6+       1379.2        1761.2
7       1597.9        2113.9          | 7+       1957          2542.6
8       1154.5        1571.1          | 8+       1437.4        1846.4
9       1485          1944.8          | 9+       1915.5        2203.5
10      1403.9        1906.9          | 10+      1759          2148.1
11      1693.4        2158.2          | 11+      2234.2        2796.6
12      1660.3        2184.8          | 12+      2165.7        2690.8
13      1890.1        2573.2          | 13+      2287.7        3099.4
14      1552.2        1980.3          | 14+      1950.4        2690.1
15      1872.4        2397.5          | 15+      2359.8        2956.3
16      1806.1        2358.2          | 16+      2411.2        3199.8
17      2052.6        2636.8          | 17+      2591          3165.7
18      1755.3        2352.6          | 18+      2237.1        2803.1
19      2263.9        2969.1          | 19+      2744.8        3415.5
20      1793.8        2345            | 20+      2313.4        3059.2
21      2532.3        3207.2          | 21+      3217.7        4291
22      2946.9        3918.9          | 22+      3833          4942.1
23      2179          2881.1          | 23+      2755          3638.6
24      2629.1        3261.5          | 24+      3311.2        4123.1
25      2214.7        3070.9          | 25+      2748.1        3509.1
26      2056.5        2668.1          | 26+      2573.4        3300.5
27      3507.2        4524.9          | 27+      4995.8        5986.3

Chapter 5

A Recourse Approach to the CVRPED

Contents
5.1 Introduction
5.2 Modelling the CVRPED by a Recourse Approach
  5.2.1 Formalisation
  5.2.2 Uncertainty on recourses
  5.2.3 Interval demands
  5.2.4 Particular cases
  5.2.5 Influence of customer demand specificity on the optimal solution cost
5.3 Solving the CVRPED Modelled by a Recourse Approach
  5.3.1 The simulated annealing algorithm
  5.3.2 A configuration in the algorithm
  5.3.3 The CVRPED benchmarks
  5.3.4 Experimental study
5.4 Conclusions

5.1 Introduction

In the previous chapter, we modelled the CVRPED by an evidential extension of the CCP approach used for the CVRPSD. In this chapter, the CVRPED is modelled using an extension of the other main approach to modelling stochastic programs, that is, the recourse approach, which is an extension of the SPR approach used for the CVRPSD. The SPR modelling approach for the CVRPSD has a wider range of application than the CCP modelling approach, but is generally more involved.

In the context of the CVRPSD, the recourse approach allows so-called recourse actions to be performed along a route, such as returning to the depot to unload, in order to bring back to feasibility a violated capacity limit. The cost of these actions is considered directly in the objective of the problem. Specifically, the total expected travel cost is subject to minimisation, this cost covering the classical travel cost, i.e., the cost of travel if no recourse action is performed, as well as the expected cost of the recourse actions. Recall that in Section 2.3 it was emphasised that a recourse action should be specified among several alternatives.

In this chapter, we propose to extend the recourse approach to the CVRPED, for the recourse action consisting in performing round trips to the depot when the vehicle capacity limit is not enough to load the full customer demand (a failure happened). Specifically, in case of failure, the vehicle is loaded with the actual customer demand up to its remaining capacity, then it returns to the depot, is emptied, goes back to the client to pick up the remaining customer demand and continues its originally planned route. Section 5.2 presents the formal recourse model for the CVRPED along with its properties. Section 5.3 solves the recourse model using a simulated annealing algorithm and performs experiments on CVRPED instances. The chapter is concluded in Section 5.4.

5.2 Modelling the CVRPED by a Recourse Approach

This section proposes a recourse modelling approach for the CVRPED. The general model, extending the SPR model of the CVRPSD, is presented in Section 5.2.1. In Section 5.2.2, we detail how uncertainty on recourse actions is obtained in this model. Then, in Section 5.2.3, we provide a method to compute this latter uncertainty efficiently in a realistic particular case. Similarly to what has been done for the BCP model in Chapter 4, we discuss particular cases of our recourse model in Section 5.2.4 and study the influence of customer demand specificity on the optimal solution cost in Section 5.2.5.

5.2.1 Formalisation

Recall that knowledge on customer demands is given as shown in Section 4.2. Consider a given route R containing N customers and assume, without loss of generality, that the i-th customer on R is customer i. As customer demands cannot exceed the vehicle capacity, a failure cannot occur at the first customer on R. However, it may occur at any other customer on R, and there may even be failures at multiple customers on R (at worst, if the actual demand of each customer is equal to the capacity of the vehicle, failure occurs at each customer except the first one).

Formally, let us introduce a binary variable $r_i$ that equals 1 if failure occurs at the i-th customer on R and 0 otherwise. By problem definition, we have $r_1 = 0$. Then, the possible failure situations that may occur along R may be represented by the vectors
\[
(r_2, r_3, \ldots, r_N) \in \{0, 1\}^{N-1}.
\]
To simplify the exposition, a set Ω is defined as the space of binary vectors representing the possible failure situations along R: each failure situation $(r_2, r_3, \ldots, r_N)$ is then a binary vector belonging to $\Omega = \{0, 1\}^{N-1}$.

Example 5.1. For instance, when R contains only N = 3 customers, we have

Ω = {(0, 0) , (1, 0) , (0, 1) , (1, 1)}.

TABLE 5.1 : Travel cost matrix TC of Example 5.2

      0     1     2     3     4     5
0    +∞     2    3.2   4.7   5.6   2.8
1     2    +∞    1.7   4.5   6.1    4
2    3.2   1.7   +∞    3.1   5.3    4
3    4.7   4.5   3.1   +∞    2.4   3.2
4    5.6   6.1   5.3   2.4   +∞     3
5    2.8    4     4    3.2    3    +∞

The binary vectors in Ω mean that the vehicle needs to perform a round trip to the depot,

• “never” in vector (0, 0);

• “when it reaches the second customer” in vector (1, 0);

• “when it reaches the third customer” in vector (0, 1); and

• “when it reaches both the second and third customers” in vector (1, 1).

Furthermore, let $g: \Omega \to \mathbb{R}_{+}$ be a function representing the cost of each failure situation $\omega \in \Omega$, with ω being the binary vector $(r_2, r_3, \ldots, r_N)$ representing a failure situation. As already mentioned, a failure implies a return trip to the depot, thus the penalty cost upon failure at customer i is $2 c_{0,i}$, with $c_{0,i}$ the travel cost between the depot and the i-th customer. Therefore, the cost associated with failure situation ω is
\[
g(\omega) = \sum_{i=2}^{N} r_i \, 2 c_{0,i}.
\]

Example 5.2. Suppose a route R consisting of N = 5 customers, the depot being customer 0 on R. The travel cost matrix $TC = (c_{i,j})$, $i, j \in \{0, 1, \ldots, N\}$, in Table 5.1 gives the travel costs between the customers on R. Consider $\omega = (1, 0, 1, 0)$ the binary vector representing a failure situation on R. In other words, ω asserts a failure at the second and the fourth customer on R, i.e.,
\[
r_2 = 1, \quad r_3 = 0, \quad r_4 = 1, \quad r_5 = 0.
\]
The penalty cost of ω is g(ω), which amounts to
\[
g(\omega) = r_2 \cdot 2 \cdot c_{0,2} + r_3 \cdot 2 \cdot c_{0,3} + r_4 \cdot 2 \cdot c_{0,4} + r_5 \cdot 2 \cdot c_{0,5} = 1 \cdot 2 \cdot 3.2 + 1 \cdot 2 \cdot 5.6 = 17.6.
\]

Let $m^{\Omega}$ be a mass function representing uncertainty pertaining to the actual failure situation occurring on R - as will be shown in the next section, evidential demands may induce such a mass function. Then, adopting a similar pessimistic attitude as in the recourse approach to belief linear programming [59], the upper expected penalty cost $C^{*}_{P}(R)$ of route R may be obtained using Equation (3.8) as follows:
\[
C^{*}_{P}(R) = E^{*}(g, m^{\Omega}), \quad (5.1)
\]
with $m^{\Omega}$ defined in the next section.

Accordingly, the upper expected cost $C^{*}_{E}(R)$ of route R may be defined as
\[
C^{*}_{E}(R) = C(R) + C^{*}_{P}(R), \quad (5.2)
\]
with C(R) the cost, defined in Equation (1.2), of travelling along route R when no failure occurs.

The CVRPED may then be modelled using a modified version of the CVRP model of Section 1.3.1. Specifically, our recourse modelling of the CVRPED aims at
\[
\min \sum_{k=1}^{m} C^{*}_{E}(R_k), \quad (5.3)
\]
subject to constraints (1.3)–(1.6), with constraints (1.5) replaced by constraints (2.9).

Evaluating the objective function in Equation (5.3) requires the computation, for each route, of the mass function $m^{\Omega}$ representing uncertainty on the actual failure situation occurring on the route. This is detailed in the next section.

5.2.2 Uncertainty on recourses

Consider again a route R containing N customers. In addition, let us first assume that the client demands on R are known without any uncertainty, that is, we know that the demand of client i, $i = 1, \ldots, N$, is some value $\theta_i \in X_i$. Then, it is clear that the above recourse policy amounts to the following definition of the binary failure variables $r_i$:
\[
r_i =
\begin{cases}
1, & \text{if } q_{i-1} + \theta_i > Q,\\
0, & \text{otherwise},
\end{cases}
\quad \forall i \in \{2, \ldots, N\}, \quad (5.4)
\]
where $q_j$, $j = 1, \ldots, N$, denotes the load in the vehicle after serving the j-th customer, such that $q_j = \theta_1$ for $j = 1$ and, for $j = 2, \ldots, N$,
\[
q_j =
\begin{cases}
q_{j-1} + \theta_j - Q, & \text{if } q_{j-1} + \theta_j > Q,\\
q_{j-1} + \theta_j, & \text{otherwise}.
\end{cases}
\quad (5.5)
\]

In other words, when it is known that the demand of the i-th customer is $\theta_i$, $i = 1, \ldots, N$, we have a precise demand vector on R that induces a precise binary failure situation vector $(r_2, r_3, \ldots, r_N)$, with $r_i$ defined by Equation (5.4). This can be encoded by a function
\[
f: X_1 \times \cdots \times X_N \to \Omega, \quad (5.6)
\]
such that
\[
f(\theta_1, \ldots, \theta_N) = (r_2, r_3, \ldots, r_N). \quad (5.7)
\]

Example 5.3. For example, suppose we have N = 4 customers on route R, with respective demands $\theta_1 = 3$, $\theta_2 = 6$, $\theta_3 = 8$ and $\theta_4 = 4$, and the vehicle capacity limit is Q = 10. In such a case, $f(\theta_1, \theta_2, \theta_3, \theta_4)$ yields the failure situation vector $(r_2 = 0, r_3 = 1, r_4 = 1)$.
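The function f of Equations (5.4)–(5.7) is easy to implement for precise demands. The sketch below is illustrative only: demands are passed as a tuple of integers in route order, and the function returns the binary failure vector $(r_2, \ldots, r_N)$.

def failure_vector(demands, Q):
    """Failure situation (r_2, ..., r_N) induced by precise demands on a route,
    following Eqs. (5.4)-(5.5): upon failure the vehicle unloads at the depot
    and resumes the route with the remaining part of the demand."""
    r = []
    load = demands[0]                 # q_1 = theta_1 (no failure at the first customer)
    for theta in demands[1:]:
        if load + theta > Q:          # failure: round trip to the depot
            r.append(1)
            load = load + theta - Q
        else:
            r.append(0)
            load = load + theta
    return tuple(r)

# Example 5.3: demands (3, 6, 8, 4), Q = 10  ->  (0, 1, 1)
print(failure_vector((3, 6, 8, 4), 10))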

In the general case, the client demands on R are known in the form of a mass function $m^{X_1 \times \cdots \times X_N}$, which is the marginalisation of $m^{\Theta^n}$ on the domains of the evidential variables $d_{r_1}, \ldots, d_{r_N}$ associated with the N clients on the route, with $X_i$ the domain of the evidential variable $d_{r_i}$ associated with the i-th client on R. In such a case, using Equation (3.5) with f defined in Equations (5.6)–(5.7), uncertainty on the actual failure situation on R is represented by a mass function $m^{\Omega}$ defined as
\[
m^{\Omega}(B) = \sum_{A:\, f(A) = B} m^{X_1 \times \cdots \times X_N}(A), \quad \forall B \subseteq \Omega. \quad (5.8)
\]

Computing $m^{\Omega}$ defined by Equation (5.8) involves evaluating f(A) for every focal set A of $m^{X_1 \times \cdots \times X_N}$. Evaluating f(A) for some $A \subseteq X_1 \times \cdots \times X_N$ implies |A| (and thus at worst $Q^N$) evaluations of the function f at some point $(\theta_1, \ldots, \theta_N) \in X^N := \times_{i=1}^{N} X_i$. Hence, computing Equation (5.8) is generally intractable.

Nonetheless, in the particular and realistic case where the focal sets of $m^{X_1 \times \cdots \times X_N}$ are all Cartesian products of N intervals, i.e., for all $A \subseteq X_1 \times \cdots \times X_N$ such that $m^{X_1 \times \cdots \times X_N}(A) > 0$ we have
\[
A = A^{\downarrow X_1} \times \cdots \times A^{\downarrow X_N}
\]
with, for $i = 1, \ldots, N$, $A^{\downarrow X_i} = ⟦\underline{A}_i; \overline{A}_i⟧$, it becomes possible to compute f(A), and thus Equation (5.8), with a much more manageable complexity. This is detailed in the next section.

5.2.3 Interval demands

Let us consider a route R with N customers, such that the demand of the i-th customer, $i = 1, \ldots, N$, is known in the form of an interval of positive integers, which we denote by $⟦\underline{A}_i; \overline{A}_i⟧$, where $\underline{A}_i \ge 1$ and $\overline{A}_i \le Q$. In this case, the failure situation on R surely belongs to
\[
f\left(⟦\underline{A}_1; \overline{A}_1⟧ \times \cdots \times ⟦\underline{A}_N; \overline{A}_N⟧\right) \subseteq \Omega.
\]
Hereafter, we provide a method to compute $f\left(⟦\underline{A}_1; \overline{A}_1⟧ \times \cdots \times ⟦\underline{A}_N; \overline{A}_N⟧\right)$ efficiently.

In a nutshell, this method consists in generating a rooted binary tree, which represents synthetically yet exhaustively what can possibly happen on R in terms of failure situations. More precisely, this tree is based on the following remark. Suppose a vehicle travelling along R and suppose all that is known about its load when it arrives at the i-th customer on R is that this load belongs to an interval $⟦\underline{q}; \overline{q}⟧$. Let us denote by $q_i$ its load after visiting the i-th customer. Then, there are three exclusive cases:

1. either $\overline{q} + \overline{A}_i \le Q$, hence there will surely be no failure at that customer and all that is known is that $q_i \in ⟦\underline{q}; \overline{q}⟧ + ⟦\underline{A}_i; \overline{A}_i⟧$;

Example 5.4. Suppose the capacity limit Q = 10, $⟦\underline{q}; \overline{q}⟧ = ⟦3; 4⟧$ and $⟦\underline{A}_i; \overline{A}_i⟧ = ⟦2; 5⟧$. Then there is surely no failure after collecting $⟦\underline{A}_i; \overline{A}_i⟧$, as $⟦3; 4⟧ + ⟦2; 5⟧ = ⟦5; 9⟧$ and thus the load plus the demand is necessarily lower than Q.

2. or $\underline{q} + \underline{A}_i > Q$, hence there will surely be a failure at that customer and all that is known is that $q_i \in ⟦\underline{q}; \overline{q}⟧ + ⟦\underline{A}_i; \overline{A}_i⟧ - Q$;

Example 5.5. Suppose Q = 10, $⟦\underline{q}; \overline{q}⟧ = ⟦5; 7⟧$ and $⟦\underline{A}_i; \overline{A}_i⟧ = ⟦6; 8⟧$. Then there is surely a failure after collecting $⟦\underline{A}_i; \overline{A}_i⟧$, as $⟦5; 7⟧ + ⟦6; 8⟧ = ⟦11; 15⟧$ and thus the load plus the demand is necessarily greater than Q.

3. or $\underline{q} + \underline{A}_i \le Q < \overline{q} + \overline{A}_i$, hence it is not sure whether there will be a failure or not at that customer. However, we can be sure that if there is no failure at that customer, i.e., the sum of the actual vehicle load and of the actual customer demand is lower than or equal to Q, then $q_i \in ⟦\underline{q} + \underline{A}_i; Q⟧$; and if there is a failure at that customer, then $q_i \in ⟦1; \overline{q} + \overline{A}_i - Q⟧$.

Example 5.6. Suppose Q = 10, $⟦\underline{q}; \overline{q}⟧ = ⟦5; 8⟧$ and $⟦\underline{A}_i; \overline{A}_i⟧ = ⟦2; 5⟧$. As $⟦5; 8⟧ + ⟦2; 5⟧ = ⟦7; 13⟧$, we cannot be sure whether there will be a failure after collecting $⟦2; 5⟧$. Nevertheless, if no failure happened after collecting $⟦\underline{A}_i; \overline{A}_i⟧$, then the remaining load in the vehicle satisfies $q_i \in ⟦7; 10⟧$. On the other hand, if a failure happened, then $q_i \in ⟦11; 13⟧ - 10 = ⟦1; 3⟧$.

By applying the above reasoning repeatedly, starting from the first customer and ending at the last customer, whilst accounting for and keeping track of all possibilities and their associated failures (or absence thereof) along the way, one obtains a binary tree. The tree levels are associated with the customers according to their order on R. The nodes at level i represent the different possibilities in terms of imprecise knowledge about the vehicle load after the i-th customer, and they also store whether these imprecise pieces of knowledge about the load were obtained following a failure or an absence of failure at the i-th customer. The pseudo-code of the complete tree induction procedure is provided in Algorithm 2 and is illustrated afterwards by Example 5.7.

Example 5.7. Let us illustrate Algorithm 2 on a route R with Q = 10 and containing 3 customers, with $⟦3; 6⟧$, $⟦2; 6⟧$ and $⟦4; 8⟧$ the imprecise demands of the first, second and third customers, respectively. Since the demand of the first customer is $⟦3; 6⟧$, there is no failure by definition at the first customer, and the customer following the first customer is the second customer, the tree is obtained with RT($⟦3; 6⟧$, 0, 2) and is shown in Figure 5.1.

For each leaf of the tree, by concatenating in a vector the Boolean failure variables $r_i$ at levels $i = 2, \ldots, N$ written on the path from the root to the leaf, we obtain the binary failure situation vector $(r_2, r_3, \ldots, r_N)$. Hence, all the leaves of the tree yield the subset $B \subseteq \Omega$.

Example 5.8. For instance, the rightmost leaf of the tree in Figure 5.1 yields the failure situation vector $(r_2 = 1, r_3 = 0)$, the leftmost leaf yields $(r_2 = 0, r_3 = 0)$ and the remaining leaf yields $(r_2 = 0, r_3 = 1)$. The tree in this example thus yields the set $B = \{(0, 0), (1, 0), (0, 1)\}$.

FIGURE 5.1: Recourse tree constructed for Example 5.7. The root (1st level) is $(\llbracket 3; 6 \rrbracket, 0)$; its children (2nd level) are $(\llbracket 5; 10 \rrbracket, 0)$ and $(\llbracket 1; 2 \rrbracket, 1)$; at the 3rd level, the node $(\llbracket 5; 10 \rrbracket, 0)$ has children $(\llbracket 9; 10 \rrbracket, 0)$ and $(\llbracket 1; 8 \rrbracket, 1)$, and the node $(\llbracket 1; 2 \rrbracket, 1)$ has the single child $(\llbracket 5; 10 \rrbracket, 0)$.

Algorithm 2 Induction of Recourse Tree (RT)
Require: interval load $\llbracket \underline{q}; \overline{q} \rrbracket$, Boolean failure variable $r$, next customer number $i$
Ensure: final tree $Tree$
1: create a root node containing interval load $\llbracket \underline{q}; \overline{q} \rrbracket$ and Boolean failure $r$
2: if $i = N + 1$ then
3:   return $Tree$ = {root node}
4: else if $\overline{q} + \overline{A}_i \le Q$ then
5:   $\llbracket \underline{q}_L; \overline{q}_L \rrbracket = \llbracket \underline{q}; \overline{q} \rrbracket + \llbracket \underline{A}_i; \overline{A}_i \rrbracket$
6:   $r_L = 0$
7:   $Tree_L$ = RT($\llbracket \underline{q}_L; \overline{q}_L \rrbracket$, $r_L$, $i + 1$)
8:   attach $Tree_L$ as left branch of $Tree$
9: else if $\underline{q} + \underline{A}_i > Q$ then
10:   $\llbracket \underline{q}_R; \overline{q}_R \rrbracket = \llbracket \underline{q}; \overline{q} \rrbracket + \llbracket \underline{A}_i; \overline{A}_i \rrbracket - Q$
11:   $r_R = 1$
12:   $Tree_R$ = RT($\llbracket \underline{q}_R; \overline{q}_R \rrbracket$, $r_R$, $i + 1$)
13:   attach $Tree_R$ as right branch of $Tree$
14: else
15:   $\llbracket \underline{q}_L; \overline{q}_L \rrbracket = \llbracket \underline{q} + \underline{A}_i; Q \rrbracket$
16:   $r_L = 0$
17:   $Tree_L$ = RT($\llbracket \underline{q}_L; \overline{q}_L \rrbracket$, $r_L$, $i + 1$)
18:   attach $Tree_L$ as left branch of $Tree$
19:   $\llbracket \underline{q}_R; \overline{q}_R \rrbracket = \llbracket 1; \overline{q} + \overline{A}_i - Q \rrbracket$
20:   $r_R = 1$
21:   $Tree_R$ = RT($\llbracket \underline{q}_R; \overline{q}_R \rrbracket$, $r_R$, $i + 1$)
22:   attach $Tree_R$ as right branch of $Tree$
23: end if
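To complement the pseudo-code, here is a minimal Python sketch of the case analysis performed by Algorithm 2, assuming each demand is given as a pair (lo, hi) of integers. It only collects the failure situations reaching the leaves (the set B), without storing the tree itself, and the function and variable names are illustrative rather than taken from the thesis.

def recourse_failure_situations(demands, Q):
    """demands: list of (lo, hi) integer intervals, one per customer on the route."""
    n = len(demands)
    leaves = set()

    def expand(q_lo, q_hi, failures, i):
        # failures collects the Boolean indicators r_2, ..., r_i gathered so far
        if i == n:                          # all customers processed: record a leaf
            leaves.add(tuple(failures))
            return
        a_lo, a_hi = demands[i]
        if q_hi + a_hi <= Q:                # case 1: surely no failure
            expand(q_lo + a_lo, q_hi + a_hi, failures + [0], i + 1)
        elif q_lo + a_lo > Q:               # case 2: surely a failure
            expand(q_lo + a_lo - Q, q_hi + a_hi - Q, failures + [1], i + 1)
        else:                               # case 3: a failure is possible but not certain
            expand(q_lo + a_lo, Q, failures + [0], i + 1)      # no-failure branch
            expand(1, q_hi + a_hi - Q, failures + [1], i + 1)  # failure branch

    lo1, hi1 = demands[0]
    expand(lo1, hi1, [], 1)                 # by definition, no failure at the first customer
    return leaves

# Example 5.7: Q = 10, demands [3;6], [2;6], [4;8]
print(recourse_failure_situations([(3, 6), (2, 6), (4, 8)], 10))   # {(0, 0), (0, 1), (1, 0)}

In the worst case the third case is triggered at every customer, so the recursion produces the $2^{N-1}$ leaves mentioned in the complexity discussion below.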

Proposition 5.1. The set $B$ built using the tree generated by Algorithm 2 verifies
$$B = f\left(\llbracket \underline{A}_1; \overline{A}_1 \rrbracket \times \cdots \times \llbracket \underline{A}_N; \overline{A}_N \rrbracket\right).$$

TABLE 5.2: Travel cost matrix TC of Example 5.9
        0     1     2     3
  0    +∞     3     3    3.3
  1     3    +∞     5    6.2
  2     3     5    +∞    2.5
  3    3.3   6.2   2.5   +∞

Proof. The proof of this proposition is given in Appendix B.1.

The maximum number of leaf nodes in the tree is $2^{N-1}$. Thus, the algorithmic complexity to obtain the set $B \subseteq \Omega$ is of the order of $2^N$. In addition, suppose that the number of focal sets of the mass function $m^{X_1 \times \cdots \times X_N}$ in Equation (5.8) is at most $c$, and that the focal sets of this mass function are all Cartesian products of $N$ intervals. Then, the worst-case complexity to evaluate $m^{\Omega}$ defined by Equation (5.8) on a route $R$ with $N$ clients using the recourse tree is $O(2^N \cdot c)$.

Let us illustrate by an example how to calculate the upper expected cost $C^*_E(R)$ for some route $R$ using the recourse tree.

Example 5.9. Suppose a route $R$ having $N = 3$ customers, where 0 is the depot and the capacity limit is $Q = 10$. The travel cost matrix $TC = (c_{i,j})$, provided in Table 5.2 where $i, j \in \{0, 1, 2, 3\}$, gives the travel costs between the customers on $R$. Moreover, client demands on this route are known in the form of the mass function $m^{X_1 \times X_2 \times X_3}$ defined as

$$m^{X_1 \times X_2 \times X_3}(\llbracket 1; 5 \rrbracket \times \llbracket 3; 6 \rrbracket \times \llbracket 4; 5 \rrbracket) = 0.6, \qquad (5.9)$$
$$m^{X_1 \times X_2 \times X_3}(\llbracket 1; 3 \rrbracket \times \llbracket 2; 4 \rrbracket \times \llbracket 8; 9 \rrbracket) = 0.4. \qquad (5.10)$$

Note that, as $R$ contains three customers, the set of possible failure situations on this route is $\Omega = \{(0, 0), (1, 0), (0, 1), (1, 1)\}$.

The recourse tree for the focal set in Equation (5.9) is illustrated in Figure 5.2. This recourse tree yields the set $\{(0, 0), (1, 0), (0, 1)\}$. The recourse tree for the focal set in Equation (5.10) is illustrated in Figure 5.3. This recourse tree yields the set $\{(0, 1)\}$.

In other words, using Equation (5.8) with $m^{X_1 \times \cdots \times X_N}$ being $m^{X_1 \times X_2 \times X_3}$ and $f$ being the recourse tree method, we obtain

$$m^{\Omega}(\{(0, 0), (1, 0), (0, 1)\}) = 0.6, \qquad m^{\Omega}(\{(0, 1)\}) = 0.4.$$

Let us calculate, for each $\omega \in \Omega$, the penalty cost function $g(\omega)$:

$$g((0, 0)) = 0; \quad g((1, 0)) = 2 \cdot 3 = 6; \quad g((0, 1)) = 2 \cdot 3.3 = 6.6; \quad g((1, 1)) = 2 \cdot 3 + 2 \cdot 3.3 = 12.6.$$

FIGURE 5.2: Recourse tree for $\llbracket 1; 5 \rrbracket \times \llbracket 3; 6 \rrbracket \times \llbracket 4; 5 \rrbracket$ of Example 5.9. The root is $(\llbracket 1; 5 \rrbracket, 0)$; its children are $(\llbracket 4; 10 \rrbracket, 0)$ and $(\llbracket 1; 1 \rrbracket, 1)$; the node $(\llbracket 4; 10 \rrbracket, 0)$ has children $(\llbracket 8; 10 \rrbracket, 0)$ and $(\llbracket 1; 5 \rrbracket, 1)$, and the node $(\llbracket 1; 1 \rrbracket, 1)$ has the single child $(\llbracket 5; 6 \rrbracket, 0)$.

FIGURE 5.3: Recourse tree for $\llbracket 1; 3 \rrbracket \times \llbracket 2; 4 \rrbracket \times \llbracket 8; 9 \rrbracket$ of Example 5.9. The root is $(\llbracket 1; 3 \rrbracket, 0)$; it has the single child $(\llbracket 3; 7 \rrbracket, 0)$, which in turn has the single child $(\llbracket 1; 6 \rrbracket, 1)$.

Now we can determine the upper expected penalty cost (Equation (5.1)), which amounts to

$$C^*_P(R) = E^*(g, m^{\Omega}) = \sum_{B \subseteq \Omega} m^{\Omega}(B) \max_{\omega \in B} g(\omega)$$
$$= m^{\Omega}(\{(0, 0), (1, 0), (0, 1)\}) \cdot g((0, 1)) + m^{\Omega}(\{(0, 1)\}) \cdot g((0, 1)) = 0.6 \cdot 6.6 + 0.4 \cdot 6.6 = 6.6.$$

The travel cost between customers on R is

$$C(R) = c_{0,1} + c_{1,2} + c_{2,3} + c_{3,0} = 3 + 5 + 2.5 + 3.3 = 13.8.$$

Now we can determine the upper expected cost of route $R$ using Equation (5.2), which amounts to
$$C^*_E(R) = 13.8 + 6.6 = 20.4.$$
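The final numbers of Example 5.9 can be reproduced with the following minimal sketch, which evaluates the penalty function $g$ on each focal set of $m^{\Omega}$ and takes the corresponding upper expectation; the numerical values are those of the example and the encoding (dictionaries, function names) is only illustrative.

c0 = {2: 3.0, 3: 3.3}                      # depot-to-customer travel costs c_{0,i} on this route
def g(omega):
    # omega = (r2, r3): one return trip to the depot per failed customer
    return sum(2 * c0[i + 2] * r for i, r in enumerate(omega))

m_omega = {frozenset({(0, 0), (1, 0), (0, 1)}): 0.6,   # focal set from the tree of Figure 5.2
           frozenset({(0, 1)}): 0.4}                   # focal set from the tree of Figure 5.3

upper_penalty = sum(mass * max(g(w) for w in B) for B, mass in m_omega.items())
travel_cost = 3 + 5 + 2.5 + 3.3                        # C(R) = c01 + c12 + c23 + c30
print(upper_penalty, travel_cost + upper_penalty)      # 6.6 and 20.4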

5.2.4 Particular cases

In this section, some comments are provided on the behaviour of our recourse modelling, especially with respect to some particular evidential demands. If $m^{\Theta^n}$ is Bayesian, i.e., we are really dealing with a CVRPSD, then $m^{\Omega}$ is Bayesian on any given route $R$ having $N$ customers. Incorporating the Bayesian mass function $m^{\Omega}$ in Equation (5.1), we obtain

$$C^*_P(R) = E^*(g, m^{\Omega}) = \sum_{\omega \in \Omega} m^{\Omega}(\{\omega\}) g(\omega) = \sum_{\omega \in \Omega} \left( \sum_{i=2}^{N} r_i 2 c_{0,i} \right) m^{\Omega}(\{\omega\}) = \sum_{i=2}^{N} 2 c_{0,i} \left( \sum_{\{\omega \in \Omega \mid r_i = 1\}} m^{\Omega}(\{\omega\}) \right).$$

Note that
$$\sum_{\{\omega \in \Omega \mid r_i = 1\}} m^{\Omega}(\{\omega\}) \qquad (5.11)$$
is the probability of having a failure at the $i$-th customer, i.e., we have
$$\sum_{\{\omega \in \Omega \mid r_i = 1\}} m^{\Omega}(\{\omega\}) = Fail(i), \quad i = 2,\ldots,N,$$
with $Fail(i)$ the probability of having a failure at the $i$-th customer given by Equation (2.11) of Section 2.3.2. Hence, the upper expected penalty cost $C^*_P(R)$ reduces to the expected penalty cost $C_P(R)$ (Equation (2.13)) when knowledge about customer demands is Bayesian, and thus

our recourse modelling of the CVRPED clearly degenerates into the recourse modelling of the CVRPSD presented in Section 2.3.

We showed in Section 4.3.2 that a BCP model for the CVRPED can be converted, when the minimum bounds imposed on the belief and on the plausibility are equal and the evidential variables $d_i$, $i = 1, \ldots, n$, are independent, into an equivalent CVRPSD modelled via CCP. Specifically, this can be done by transforming each evidential demand represented by the mass function $m_i^{\Theta_i}$ into a stochastic demand represented by the probability mass function $p_i$ obtained from $m_i^{\Theta_i}$ by transferring the mass $m_i^{\Theta_i}(A)$ to the element $\theta = \max(A)$. In the following, a counter-example shows that this latter transformation cannot be used in general to convert a CVRPED into an equivalent CVRPSD under the recourse approach.

Suppose we have one available vehicle with capacity limit $Q = 14$ and $n = 3$ clients, with the imprecise demands of clients 1, 2 and 3 being $\llbracket 2; 8 \rrbracket$, $\llbracket 3; 8 \rrbracket$ and $\llbracket 3; 8 \rrbracket$, respectively. The depot is denoted by 0 and the travel cost matrix $TC = (c_{i,j})$, where $i, j \in \{0, 1, 2, 3\}$, is shown in Table 5.3.

TABLE 5.3: Travel cost matrix TC
        0     1     2     3
  0    +∞     1    1.1    3
  1     1    +∞     1    2.1
  2    1.1    1    +∞    2.1
  3     3    2.1   2.1   +∞

• Under the recourse approach, the optimal solution to this CVRPED instance is the route defined by the path (0, 3, 2, 1, 0) and its upper expected cost is 9.3.

• On the other hand, using the above-mentioned transformation to convert the evidential demands into stochastic demands, we obtain

$$p_1(8) = 1, \quad p_2(8) = 1, \quad \text{and} \quad p_3(8) = 1.$$

Then, under the recourse approach, the optimal solution to this CVRPSD instance is either the route defined by the path (0, 2, 1, 3, 0) or the one defined by (0, 3, 1, 2, 0), as the expected cost of each of these routes is 9.2; both differ from the optimum found for the CVRPED. This is due to the fact that the optimal solution of the CVRPED does not use this probability measure.
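For illustration, here is a minimal sketch of the transformation recalled above, in which each focal set of an evidential demand is mapped to its maximum element; the representation of focal sets as tuples of values and the function name are assumptions made for this sketch.

def to_stochastic_max(mass_fn):
    """mass_fn: dict mapping focal sets (tuples of possible demand values) to masses."""
    p = {}
    for focal_set, mass in mass_fn.items():
        worst = max(focal_set)              # transfer the mass to the largest demand value
        p[worst] = p.get(worst, 0.0) + mass
    return p

# Client 1 of the counter-example: demand known as the interval [2; 8] with mass 1
m1 = {tuple(range(2, 9)): 1.0}
print(to_stochastic_max(m1))                # {8: 1.0}, i.e. p1(8) = 1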

The objective function in Equation (5.3) of our recourse model relies on optimising $C^*_E(R)$, which is the upper, i.e., worst, expected cost of a route. In particular, if $m^{\Theta^n}$ is categorical, then $C^*_E(R)$ is the worst possible cost of $R$. Hence, optimising Equation (5.3) has some similarities with the protection against the worst case popular in robust optimisation [79]. Though, another approach could be followed [85], where one optimises the lower, i.e., best, expected cost

$$C_{E*}(R) = C(R) + C_{P*}(R),$$
where $C_{P*}(R)$ is evaluated using Equation (3.7) such that

$$C_{P*}(R) = E_*(g, m^{\Omega}).$$

This approach is appropriate when we are interested in the most optimistic solution. More complex decision schemes could also be considered, such as interval dominance [85], which would rely on both $C^*_E(R)$ and $C_{E*}(R)$ and yield in general a set of optimal (non-dominated) solutions. Borrowing from what is done in label ranking [28], an interesting study would then be to identify, from the set of non-dominated solutions, some parts of routes that would be more relevant (or preferred) to be included in a solution, over some irrelevant ones. This is left for future work.
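As a small numerical illustration of the two decision attitudes, the sketch below computes both $E_*(g, m^{\Omega})$ and $E^*(g, m^{\Omega})$ on the mass function of Example 5.9; it is only meant to show that the optimistic and pessimistic expected penalties bracket one another, under the same illustrative encoding as before.

c0 = {2: 3.0, 3: 3.3}
def g(omega):
    return sum(2 * c0[i + 2] * r for i, r in enumerate(omega))

m_omega = {frozenset({(0, 0), (1, 0), (0, 1)}): 0.6, frozenset({(0, 1)}): 0.4}

lower = sum(mass * min(g(w) for w in B) for B, mass in m_omega.items())   # E_*(g, m^Omega)
upper = sum(mass * max(g(w) for w in B) for B, mass in m_omega.items())   # E^*(g, m^Omega)
print(lower, upper)   # 2.64 and 6.6: any penalty expectation compatible with m^Omega lies in between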

5.2.5 Influence of customer demand specificity on the optimal solution cost

In this section, we study the behaviour of the optimal solution cost of the CVRPED modelled via the recourse approach, when the specificity of the knowledge about customer demands varies (specifically, decreases).

As for the BCP model (in Section 4.3.4), our study concerns the case where the evidential variables $d_i$, $i = 1, \ldots, n$, are independent. In addition, the following change in knowledge specificity is considered: for each client $i$, $i = 1, \ldots, n$, his demand is known in the form of a mass function $m_i'^{\Theta_i}$ built from $m_i^{\Theta_i}$ by transferring, for each $A \subseteq \Theta_i$ such that $m_i^{\Theta_i}(A) > 0$, the mass $m_i^{\Theta_i}(A)$ to a subset $A'$ such that $A \subseteq A' \subseteq \Theta_i$. Note that, using a similar proof to that of Lemma 4.1, it can be shown that $m_i^{\Theta_i} \sqsubseteq m_i'^{\Theta_i}$, i.e., $m_i'^{\Theta_i}$ is at most as specific as $m_i^{\Theta_i}$.

In order to show the impact of this decrease in knowledge specificity about customer demands on the optimal solution cost, let us first provide a result which shows how the upper expected cost $C^*_E(R)$ of a route $R$ varies as knowledge specificity about its recourses changes.

Lemma 5.1. Let $m^{\Omega}$ and $m'^{\Omega}$ be two mass functions representing uncertainty about the recourses on a given route $R$, such that $m^{\Omega} \sqsubseteq m'^{\Omega}$. Let $C^*_E(R)$ and $C'^*_E(R)$ denote the upper expected costs of $R$ under $m^{\Omega}$ and $m'^{\Omega}$, respectively. We have

$$C^*_E(R) \le C'^*_E(R).$$

Proof. The proof of this lemma is given in Appendix B.2.

Let $\hat{C}_{Rec}$ denote the cost of an optimal solution to the recourse modelling of the CVRPED, when the evidential demands $d_i$, $i = 1, \ldots, n$, are independent and known in the form of the mass functions $m_i^{\Theta_i}$, $i = 1, \ldots, n$. Furthermore, let $\hat{C}'_{Rec}$ denote the cost of an optimal solution to the recourse modelling of the CVRPED, when the evidential demands are independent and known in the form of the mass functions $m_i'^{\Theta_i}$, $i = 1, \ldots, n$.

Proposition 5.2. We have $\hat{C}'_{Rec} \ge \hat{C}_{Rec}$.

Proof. Let $R$ denote a route containing $N$ clients. Without loss of generality, assume that the $i$-th client on $R$ is client $i$. Let $m^{\Omega}$ denote the mass function representing uncertainty about recourses on $R$ when the demand of client $i$, $i = 1, \ldots, N$, is known in the form of the mass function $m_i^{\Theta_i}$, and let $m'^{\Omega}$ denote the mass function representing uncertainty about recourses on $R$ when the demand of client $i$ is known in the form of the mass function $m_i'^{\Theta_i}$.

It is clear that $m^{\Omega}$ and $m'^{\Omega}$ are obtained by transferring, for each
$$(A_1, \ldots, A_N) \in \times_{i=1}^{N} 2^{\Theta_i}$$
such that $A_i$ is a focal set of $m_i^{\Theta_i}$, the mass $\prod_{i=1}^{N} m_i^{\Theta_i}(A_i)$ to the subsets
$$f(A_1 \times \cdots \times A_N) \quad \text{and} \quad f(A'_1 \times \cdots \times A'_N),$$
respectively, with $f$ the function defined in Equations (5.6)–(5.7) and $A'_i$ the focal set of $m_i'^{\Theta_i}$ to which $m_i^{\Theta_i}(A_i)$ is transferred.

$m^{\Omega}$ is recovered from $m'^{\Omega}$ by transferring, for each combination $(A_1, \ldots, A_N)$ of focal sets of $m_i^{\Theta_i}$, $i = 1, \ldots, N$, a proportion
$$\frac{\prod_{i=1}^{N} m_i^{\Theta_i}(A_i)}{m'^{\Omega}\left(f(A'_1 \times \cdots \times A'_N)\right)}$$
of $m'^{\Omega}(f(A'_1 \times \cdots \times A'_N))$ to $f(A_1 \times \cdots \times A_N) \subseteq f(A'_1 \times \cdots \times A'_N)$.

Moreover, if there exist several combinations $(A_1^1, \ldots, A_N^1), \ldots, (A_1^L, \ldots, A_N^L)$ of focal sets of $m_i^{\Theta_i}$, $i = 1, \ldots, N$, such that their masses $\prod_{i=1}^{N} m_i^{\Theta_i}(A_i^{\ell})$, $\ell = 1, \ldots, L$, are transferred to the same subset $B = f(A_1'^1 \times \cdots \times A_N'^1) = \cdots = f(A_1'^L \times \cdots \times A_N'^L)$, then this means that $m^{\Omega}$ is recovered from $m'^{\Omega}$ by transferring a proportion
$$\frac{\sum_{\ell=1}^{L} \prod_{i=1}^{N} m_i^{\Theta_i}(A_i^{\ell})}{m'^{\Omega}(B)}$$
of $m'^{\Omega}(B)$ to the subsets $f(A_1^{\ell} \times \cdots \times A_N^{\ell})$, $\ell \in \{1, \ldots, L\}$, of $B$, and thus, since this is true for all focal sets of $m'^{\Omega}$, $m^{\Omega}$ is at least as specific as $m'^{\Omega}$.

From Lemma 5.1, we then obtain $C^*_E(R) \le C'^*_E(R)$.

Considering that the optimal solution with mass functions $m_i'^{\Theta_i}$ consists of a set $S$ of routes $\{R_1, \ldots, R_m\}$, we then have

$$C'^*_E(R_k) \ge C^*_E(R_k) \quad \text{for each } k \in \{1, \ldots, m\},$$
which yields

$$\hat{C}'_{Rec} = \sum_{k=1}^{m} C'^*_E(R_k) \ge \sum_{k=1}^{m} C^*_E(R_k) \ge \hat{C}_{Rec}.$$

Informally, Proposition 5.2 shows that the less specific the knowledge about customer demands is, the greater the cost of the optimal solution. This proposition provides theoretical properties of the CVRPED solutions obtained under the recourse approach when using exact optimisation methods. For now, such methods cannot solve large instances of the CVRP, from which the CVRPED derives. As a matter of fact, the following section reports a solution strategy for this CVRPED model using a meta-heuristic algorithm.

5.3 Solving the CVRPED Modelled by a Recourse Approach

In this section, the recourse model for the CVRPED is solved using a simulated annealing algorithm. The algorithm for the model is first presented in Section 5.3.1. Then, Section 5.3.2 describes how the algorithm generates a configuration. The benchmarks are described in Section 5.3.3 and the experimental study is presented in Section 5.3.4.

5.3.1 The simulated annealing algorithm

The algorithm employed for the recourse modelling of the CVRPED is somewhat similar to the simulated annealing algorithm employed for the BCP model in the previous chapter. Specifically, the simulated annealing algorithm for the recourse model is illustrated in Algorithm 3; the difference with Algorithm 1 concerns each configuration generated in the algorithm, specifically:

• generating a configuration is done using the routines initial_config(...) or neighbourhood_configuration(...), which depend on another parameter than that in Algorithm 1; and

• evaluating a configuration is performed by the cost(...) routine, which also depends on a parameter different from that in Algorithm 1.

The following subsection details how the simulated annealing algorithm for the recourse model generates a configuration or evaluates it.

Remark 5.1. The parameters that control Algorithm 3: T, κ, freez, tot_iter, tot_tr, were set to the same values used for the BCP model (see Section 4.4.1).

5.3.2 A configuration in the algorithm

In Algorithm 3, the initial_config(...) method generates the initial configurations of the algorithm using a first-fit greedy approach or randomly. The parameter “recourse” specifies that the initial configurations generated by initial_config(...) are subject to the constraints of the recourse modelling of the CVRPED, which are mentioned in Section 5.2.1. The neighbourhood_configuration(...) routine, which depends on the parameter “recourse”, applies three consecutive neighbourhood operators at each iteration of Algorithm 3: fix_minimum, replace_highest_average and flip_route.

Algorithm 3 Simulated annealing algorithm for the CVRPED modelled by a recourse approach
Require: initial temperature T, temperature reduction multiplier κ, freezing temperature freez, total number of iterations tot_iter, total number of trials tot_tr
Ensure: best solution ever visited BestC
1: Bestcost = ∞
2: for tr = 0 to tot_tr do
3:   if tr == 0 then
4:     C = initial_config(greedy, recourse)            ▷ greedy generation
5:   else
6:     C = initial_config(random, recourse)            ▷ random generation
7:   end if
8:   Ccost = cost(C, recourse)
9:   TBestC = C
10:  TBestcost = Ccost
11:  repeat
12:    for iter = 0 to tot_iter do
13:      C* = neighbourhood_configuration(C, recourse)
14:      C*cost = cost(C*, recourse)
15:      ∆cost = C*cost − Ccost
16:      if (∆cost < 0) then
17:        C = C*
18:        Ccost = C*cost
19:        if C*cost < TBestcost then
20:          TBestC = C*
21:          TBestcost = C*cost
22:        end if
23:      else if rnd ≤ e^(−∆cost/T) then               ▷ rnd is a random number in [0, 1]
24:        C = C*
25:        Ccost = C*cost
26:      end if
27:    end for
28:    T = κ · T
29:  until (T == freez)
30:  if (TBestcost < Bestcost) then
31:    BestC = TBestC
32:    Bestcost = TBestcost
33:  end if
34: end for

• Fix_minimum and replace_highest_average operate similarly as in the BCP model (Section 4.4.2), except that the problem constraints are now the constraints of the recourse modelling of the CVRPED. Recall that the recourse model relaxes the capacity constraints, in the sense that any capacity excess (overflow) is addressed by the objective function using recourse decisions. This means that, as fix_minimum and replace_highest_average are iteratively applied by the simulated annealing, we may end up with routes holding excessive total demands. Consequently, these routes will systematically fail, while other routes will hold limited total customer demands. Therefore, we incorporate into each of the fix_minimum and replace_highest_average operators a method that maintains relatively balanced total demands on routes during the process of moving customers from one route to another. More specifically, before a customer is inserted into a new route, the total customer demand on each route yields a probability indicating whether a route is favourable for servicing an additional customer. The probability associated to each route is inversely proportional to the total customer demand on that route, i.e., the smaller the total customer demand on a route, the more likely it is that this route is chosen to include an additional customer (a minimal sketch of this selection rule is given after this list).

• The method flip_route is applied 25% of the time in each iteration. It considers reversing the order of a route if this improves its upper expected cost. Indeed, a route $R$ having the path $(0, 1, \ldots, N, 0)$ and its reverse $R^{-1}$ with the path $(0, N, \ldots, 1, 0)$ do not necessarily have the same upper expected penalty cost.
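A minimal sketch of the balancing rule mentioned in the first bullet is given below; the exact weighting scheme used in the thesis is not spelled out, so taking weights inversely proportional to the current total demand of each route is an assumption made here for illustration.

import random

def pick_route(total_demands):
    """total_demands: current total (e.g. average) customer demand carried by each route."""
    weights = [1.0 / d if d > 0 else 1.0 for d in total_demands]   # lighter routes weigh more
    return random.choices(range(len(total_demands)), weights=weights, k=1)[0]

# Routes with total demands 40, 10 and 25: the second route is the most likely target
print(pick_route([40, 10, 25]))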

Besides, the cost(...) routine corresponds to the recourse objective function, which aims at minimising the total upper expected cost defined in Equation (5.3). The computational complexity of each iteration in Algorithm 3 corresponds to evaluating the recourse tree on each of the m routes of a configuration C, the complexity of which is given in Section 5.2.3.

5.3.3 The CVRPED benchmarks

The set of instances that we use to perform the experimental study on our recourse model, using the algorithm already described, are the CVRPED instances and CVRPED+ instances presented in Section 4.4.3. Recall that these instances are obtained from the Augerat set A [70] instances by only transforming each client deterministic demand $d_i^{det}$ into an evidential demand associated to a mass function $m_i^{\Theta_i}$ in the CVRPED instances and to a mass function $m_{i,+}^{\Theta_i}$ in the CVRPED+ instances. The mass functions $m_i^{\Theta_i}$ and $m_{i,+}^{\Theta_i}$ are defined in Section 4.4.3, Equations (4.18) and (4.19), respectively. Moreover, $m_i^{\Theta_i} \sqsubseteq m_{i,+}^{\Theta_i}$, $i = 1, \ldots, n$.

Remark 5.2. For the recourse model, the programs were written in Java and the experiments were performed on the same cluster described in Remark 4.2.

5.3.4 Experimental study

The experiments conducted for the recourse modelling of the CVRPED, using the CVRPED and CVRPED+ instances, are reported in Table 5.4. The columns “Instance id l” and “Instance id l+” in this table represent the CVRPED and the CVRPED+ instance ids, respectively. Each of these instances was solved 30 times, and the best, average and standard deviation of the costs, along with the average running times, are reported in the respective columns “Best cost”, “Avg cost”, “Stand. dev.” and “Avg. runtime” for the CVRPED and the CVRPED+ instances, separately. In the “Penalty cost” column

TABLE 5.4: Results of the simulated annealing algorithm for the recourse model using the CVRPED and CVRPED+ instances. For each half of the table, the columns are: Instance id, Best cost, Penalty cost, Avg cost, Stand. dev., Avg. runtime.

CVRPED instances                          | CVRPED+ instances
1   1750.3  16.8%  1783.9  16.1   1958s.  | 1+   2252.6  18.3%  2283.5  13.2  1272s.
2   1327.5  16.2%  1353.2  13.6   1704s.  | 2+   1650.6  17.7%  1676     9.6  1329s.
3   1296.1  18%    1338.8  16.4   1642s.  | 3+   1490.3  16.6%  1510.6  11.3  1540s.
4   1661.9  19.8%  1698.7  24.6   1728s.  | 4+   1999.6  19.7%  2044.9  18.7  1428s.
5   1670.1  24.2%  1741.9  29     2673s.  | 5+   2205.3  16.9%  2247.3  18.6  1554s.
6   1391.2  20.1%  1425.8  12.5   3586s.  | 6+   1697.5  14.9%  1737    13.2  1612s.
7   1895.6  24.4%  1947.3  21.2   2286s.  | 7+   2561.5  17%    2593.8  17.5  1382s.
8   1493.8  16.1%  1525.6  15.9   2450s.  | 8+   1769.7  16.6%  1802.9  18.9  1686s.
9   1851    21.1%  1897.6  27.4   2580s.  | 9+   2319.9  19.9%  2355    20.5  1783s.
10  1715.2  22.2%  1755.6  22.9   3264s.  | 10+  2099.3  20.7%  2146.3  20.8  1863s.
11  2127.8  20.7%  2216.7  25.3   2349s.  | 11+  2858.5  11.8%  2889    15.5  1491s.
12  2147.1  17%    2193.7  21.1   2344s.  | 12+  2667.9  18%    2705.2  22    1808s.
13  2530.2  22.7%  2629.7  33.6   2427s.  | 13+  3084.7  15.7%  3145.9  29.5  1755s.
14  1994.9  24.9%  2089.4  32.6   2948s.  | 14+  2483.1  17.2%  2524.8  23.3  1950s.
15  2499.5  21.2%  2559.1  30.2   3348s.  | 15+  3135.5  15.8%  3168.4  21.3  2293s.
16  2420.4  18.3%  2499.7  35     4715s.  | 16+  3100    14.9%  3132.5  17.7  2294s.
17  2709.6  19.8%  2792.8  39.1   4129s.  | 17+  3366.9  16.6%  3427    29.9  2572s.
18  2301.7  16.3%  2348.4  27.3   2844s.  | 18+  2788.3  13.5%  2837.6  24.8  2110s.
19  3083    23.4%  3190.1  45.9   4193s.  | 19+  3696.4  16%    3759.3  32.8  2706s.
20  2322.7  15.7%  2378    30.7   3398s.  | 20+  2960.8  15.4%  3000.5  19.2  2484s.
21  3317.8  23.6%  3426.4  54.9   6604s.  | 21+  4437.5  17.5%  4517.8  46.8  2382s.
22  4158.8  21%    4261.4  51.9   4148s.  | 22+  5249.4  22.2%  5395.1  47.3  2692s.
23  2966.7  19.6%  3043.6  39.9   4217s.  | 23+  3578.1  19.4%  3648.8  40.6  2727s.
24  3528.3  23.9%  3631.1  54.3   6365s.  | 24+  4435.1  17.2%  4548.4  42.8  3162s.
25  2889.7  22.4%  3040    55.1   3857s.  | 25+  3665.9  15.4%  3712.6  23.4  2573s.
26  2712.5  19.2%  2847.3  49.1   4461s.  | 26+  3466.7  14.2%  3525.2  32.7  2748s.
27  5016.2  24.2%  5137.7  69.2  10401s.  | 27+  6790.5  16.5%  6953.5  51.5  3357s.

for the CVRPED instances, the contribution of the expected penalty costs to the overall costs of the best solutions to the CVRPED instances is provided as percentages: it varies between 16% and 25%. Similarly, the “Penalty cost” column for the CVRPED+ instances illustrates the contribution of the expected penalty costs to the overall costs of the best solutions to the CVRPED+ instances: as can be seen, it varies between 11% and 23%.

As expected from Proposition 5.2 of Section 5.2.5, the best costs obtained with the CVRPED+ instances are higher than those obtained with the CVRPED instances, which shows that our algorithm for the recourse model also exhibits experimentally a sound behaviour.

5.4 Conclusions

In this chapter, uncertainty on customer demands in the CVRP was represented by evidence theory and the problem was handled by extending the recourse modelling approach of stochastic programming. The general recourse model was first formulated. Computing uncertainty on recourses in the model is intractable in the general case. We solved this issue by providing a technique that makes computations tractable in a realistic case. In addition, particular cases of evidential demands were considered, which allowed us to connect our models not only to stochastic programming with recourse but also to robust optimisation. The last investigation of the recourse model of the CVRPED was devoted to the optimal solution cost behaviour with respect to customer demand specificity. In the last part of this chapter, the recourse modelling of the CVRPED was solved by a simulated annealing meta-heuristic algorithm that uses a combination of operators aiming at minimising the objective of the problem. The algorithm was tested on instances derived from well-known CVRP instances. The reported experiments showed that our algorithm behaves according to the theoretical results studied for the model.

Conclusions and Future Work

Conclusions

This thesis proposed to represent uncertainty on customer demands in the CVRP by the theory of evidence. This resulted in an optimisation problem called the CVRPED. The CVRPED is a generalisation of the CVRPSD, which is characterised i) by being a stochastic integer linear program that can be modelled by two of the most popular stochastic programming approaches – CCP and SPR; and ii) by being an extension of the CVRP, which is a very important and classical NP-hard optimisation problem. According to these characteristics, two chief contributions were accomplished in this thesis.

In the first one, the CVRPED was addressed by extending the CCP approach used for the CVRPSD. In this extension, the modelling approach for the CVRPED is called the BCP approach. This modelling approach imposes minimum bounds on the belief and plausibility that the sum of the demands on each route respects the vehicle capacity limit, and the objective in the problem is to minimise the travel cost of routes, which should respect these minimum bounds along with other deterministic side constraints of the problem. Particular cases of the model were investigated, which showed that the BCP modelling approach is not only connected to CCP, but also to robust optimisation, which are two different and separate frameworks to handle uncertainty in optimisation problems. Theoretical results were provided regarding the optimal solution cost behaviour with respect to i) the model parameter variations; and ii) customer demand specificity variations. Being a derivative of the CVRP, the belief-constrained model of the CVRPED was solved using a simulated annealing meta-heuristic algorithm. The operators adopted in the solving algorithm were inspired by techniques used in several other meta-heuristics, with the purpose of improving the objective function of the problem. Evidential instances were derived from well-known data sets of the CVRP and were experimentally tested, which proved a sound behaviour of the developed algorithm with regard to the theoretical results of the model.

In the second contribution developed in this thesis, the CVRPED was addressed by extending the other main approach in stochastic programming, that is the SPR approach, which is more involved than CCP. This extension resulted in a recourse modelling approach to the CVRPED. The developed modelling technique represents by a belief function the uncertainty on the recourses of each route, allowing recourse actions in order to bring to feasibility a violated capacity limit, and defines the cost of a route as its classical cost (without recourse) plus the worst expected cost of its recourses. A method to compute efficiently the uncertainty on recourses in an important particular case was provided. Similarly to what was done for the BCP model, particular cases of the recourse model were investigated, which led to bridging the recourse approach with SPR as well as with the robust optimisation approach.

Theoretical results related to the optimal solution cost behaviour with regard to variations in customer demand specificity were derived. Afterwards, the recourse model was solved by a simulated annealing algorithm. The algorithm is somewhat similar to the one developed for the BCP model. Nevertheless, it considers different constraints and adopts supplementary methods that are necessary to handle the nature of the recourse model. The algorithm was experimentally tested using the same benchmarks used for the belief-constrained model of the CVRPED. Experiments included varying customer demand specificity. The obtained results validated our algorithm, as they verify the theoretical results obtained for the recourse model.

Future work

Several further developments to the work done in this thesis are considered in our future research. They include the following.

• To compare the evidential models developed in this thesis for the CVRPED to the stochastic programming models used for the CVRPSD, in order to show the advantages of our approaches. As the CVRPED and the CVRPSD are two different problems, evidential demands for the CVRPED and stochastic demands for the CVRPSD need to be derived from real historic data on customer demands.

• To perform a sensitivity analysis [49] in order to identify the customers such that more knowledge about their demands leads to better solutions.

• To generate a set of (non-dominated) solutions based on interval dominance [85] and then identify, from this set of solutions, some parts of routes that are more preferred or more relevant to be included in a solution, similarly to what is done in label ranking [28].

• To improve our solution method by ameliorating the operators as well as the termination criterion of our simulated annealing, and to investigate other solution methods for the CVRPED.

• To extend other stochastic modelling methodologies used for the CVRPSD to the evidence theory framework, like Markov decision process models and multi-stage models.

• To extend the evidential models to the case of incomplete knowledge about the dependency between evidential demands [27].

• The problem we studied involved uncertainty affecting only the constraints of the associated optimisation problem. It seems interesting to extend to the evidence theory framework other combinatorial optimisation problems where uncertainty affects not only the constraints but also the objective of the problem, such as the CVRP with stochastic travel times.

Annexe A

Heuristic Based Methods

Contents
A.1 Tabu Search
A.2 Genetic Algorithms
A.3 Swarm Intelligence Methods

A.1 Tabu Search

Tabu search was originally proposed by Glover in 1986 [41]. The idea of Glover was to design a solution to the problem that most local search methods encounter, which is getting stuck at local optima. Its fundamental distinctive feature is a tabu list that records previously performed moves, also known as tabu moves (or banned moves); it was inspired by human memory. Similarly to human behaviour that generally “learns lessons from the past”, the search engine in a tabu search depends on the tabu list: it forbids performing moves leading to already visited solutions, thus preventing the search from cycling back to known solutions and getting stuck at local optima [74]. The basic elements of tabu search are:

• the search space, which is the space of all possible solutions that can be examined during the search for a solution to a problem;
• a neighbourhood structure, which contains the local transformations, so-called moves. Moves in a neighbourhood structure lead to a new neighbourhood configuration if applied to a current configuration.

The main process of tabu search proceeds from a single current configuration to a neighbourhood configuration using predefined moves, throughout consecutive iterations. At each iteration, the objective function evaluates the neighbourhood configurations. To avoid being stuck in local optima, a move leading to a worse solution is allowed in tabu search. Nevertheless, this might lead to being stuck in cycles, since it is possible to go from a solution S1 to a better solution S2 and then go back to solution S1. Hence, the transition to the next iteration is driven by some elements singular to tabu search:


FIGURE A.1: A simple tabu search mechanism. Begin with an initial configuration and set it as the current configuration; initialise an empty tabu list Tl; generate the list of moves that could be applied to the current configuration; choose from this list a move Mv such that Mv ∉ Tl and Mv leads to the best neighbourhood configuration; insert Mv into Tl (removing the oldest move from Tl if necessary); repeat until the stopping criterion is met, then stop and return the best configuration.

• a tabu list, also known as short-term memory: moves executed in previous iterations are stored in a tabu list and their reverse moves are considered as tabus for the successive ones, to prevent the algorithm from returning to an already visited solution. It is practical in a tabu search to inspect the moves instead of the neighbourhood configurations, since it can be easier to identify in a later configuration the reverse of a recorded move that has been performed earlier and forbid it. Moves are not kept throughout the whole process, since this may lead to a considerable increase in memory requirements. Instead, a certain number of moves is recorded, commonly the most recent ones. For instance, a move can be kept in a tabu list for a fixed or even a random number of iterations. Though a tabu list is the central powerful element of tabu search, it might prohibit attractive solutions. Moreover, it may guide the search into total torpidity. Thus, an algorithmic mechanism was developed to correct such situations, called the aspiration criterion;

• an aspiration criterion: the basic idea of this mechanism is to cancel a move listed as tabu if this move leads to a solution that is better than the current best-known solution.

A simple tabu search mechanism that integrates the elements discussed above is illustrated in Figure A.1. This version of tabu search is known as the “best improvement”.
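The following is a minimal, generic sketch of the “best improvement” scheme of Figure A.1, with the tabu list storing reverse moves and a simple aspiration test; the problem-specific ingredients (moves, their application and reversal, and the cost function) are placeholders supplied by the user, and the toy usage at the end is purely illustrative.

from collections import deque

def tabu_search(initial, moves, apply_move, reverse, cost, tabu_size=10, max_iter=100):
    current, best = initial, initial
    tabu = deque(maxlen=tabu_size)          # oldest tabu moves drop out automatically
    for _ in range(max_iter):
        allowed = [mv for mv in moves(current)
                   if mv not in tabu or cost(apply_move(current, mv)) < cost(best)]  # aspiration
        if not allowed:
            break
        mv = min(allowed, key=lambda m: cost(apply_move(current, m)))  # best allowed neighbour
        tabu.append(reverse(mv))            # forbid undoing the move just performed
        current = apply_move(current, mv)
        if cost(current) < cost(best):
            best = current
    return best

# Toy usage: minimise x**2 over the integers with moves +1 / -1
print(tabu_search(7, lambda x: [1, -1], lambda x, mv: x + mv, lambda mv: -mv, lambda x: x * x))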

The elements of tabu search described so far have been shown to be effective in several cases. Nevertheless, some additional elements have been included in the strategy to improve it. These additional elements include:

• an intensification process: the idea behind this process is to examine intensively the portions of the search space that seem promising, more specifically, to double-check that the best solutions in some areas of the search space are discovered. It is based on some intermediate-term memory. A classic approach to intensification is that one can occasionally pause the basic search process and reorient it towards the best solution obtained until now and freeze its good components, i.e., freeze the portions of the solution that seem very good. Then the neighbourhood of this solution is intensively examined while keeping the good parts of this solution intact.

• a diversification process: the goal of this mechanism is to orient the search into previously unexplored areas. Indeed, the main problem of all methods based on local search approaches is that they tend to spend most of the time in confined areas of the search space; consequently, they are too local. The diversification mechanism treats this problem by diversifying the search towards a solution that has not been visited yet, even if it does not belong to the neighbourhood of the current solution. It is based on some long-term memory. Typical diversification techniques are:

– restart diversification, which performs several random restarts of the search;
– continuous diversification, which penalises solutions (respectively moves) that have been often visited (respectively performed);
– allowing infeasible solutions, since it is possible that bad or infeasible solutions lead the search towards good quality solutions. More specifically, when an infeasible solution is retrieved, a penalty could be added to this solution's actual cost.

It should be noted that the design of a tabu search algorithm should be addressed with extreme care, above all the diversification and intensification processes, since they do not always guarantee results up to expectations. These mechanisms frequently add a high computational complexity to the algorithm. Moreover, the size of the tabu list and the neighbourhood structure are also critical, since they may require a considerable allocation of memory space.

A.2 Genetic Algorithms

Genetic algorithms were originally introduced in 1975 by John Holland [50]. They were inspired by the Darwinian theory of evolution [21], which states that individuals compete for survival and generally the fitter survives and produces progeny. The children best equipped for survival inherit from their parents the necessary beneficial characteristics through sexual reproduction [75]. Genetic algorithms operate somewhat similarly. More specifically, individuals that will evolve and reproduce are called chromosomes and each chromosome represents a solution. A set of chromosomes existing in the same period is called a population. The evolution of a population during time is mimicked by successive iterations called generations.

Primarily, a genetic algorithm proceeds by creating an initial population of chromosomes (candidate solutions); then distinct components of genetic algorithms are applied successively to improve populations. In the following, we list the components of a genetic algorithm in the chronological order of their application.

1. The basic step when designing a genetic algorithm is to encode chromosomes. Basically, chromosomes representing solutions should be encoded in the best way to map a solution, while keeping in mind to design a structure that will not slow down the speed of the search. For instance, for the CVRP, a chromosome is a string of customers representing vehicle routes.

2. Afterwards, the fitness of each chromosome should be evaluated in order to reflect the goodness of the associated solution. This is done by modelling a fitness function that maps the fitness value (goodness) of a chromosome, and thus how well it competes with other chromosomes in a population, so that the fittest has a higher chance of survival.

3. When the genetic algorithm evolves to pass from one generation to the next, some chromosomes are selected to produce progeny. Chromosomes that are selected in the population of a generation are called parents, while their generated children in the next generation's population are called offspring. Each chromosome is assigned a selection probability that is proportional to its fitness; this can be done by different selection techniques. Some well-known selection techniques are:

• roulette wheel selection: when genetic algorithms were first introduced, it was the most commonly used selection technique. In this technique, each chromosome is assigned a portion of a wheel that is proportional to its fitness. Then a marble is thrown and the chromosome where the marble lands is selected. This method aims to give fitter chromosomes a higher chance of being selected. Nevertheless, they are not necessarily selected. Indeed, if bad quality chromosomes are numerous, then they might be chosen over some fitter ones.
• K-tournament selection: this method tries to correct the drawback of the roulette wheel selection method. It involves performing tournaments and, at each tournament, selecting the fittest chromosome. More specifically, in a tournament, K random chromosomes are selected and the fittest chromosome of the K individuals is chosen to be a parent. Then the K−1 remaining chromosomes are returned to the population and the process is repeated to select another parent. The number of tournaments equals the number of parents fixed to be chosen for reproduction.

Other selection techniques exist, like rank selection, stochastic universal sampling, etc. We refer the reader to [10, 75] for a presentation of other selection techniques.

4. Genetic operators make evolution possible in a genetic algorithm; they include crossover and mutation operators.

• A crossover operator creates one or more (generally two) offspring chromosomes from a pair of parent chromosomes. The goal of this operator is that offspring inherit good solution parts from both parents. Hence, crossover operators concentrate on good parts of chromosomes dispersed in a population. In genetics, when two chromosomes bump into each other, it is possible that crossover does not happen. Similarly, in a genetic algorithm, a crossover operator is not always applied, but it is applied with a certain probability. The most common crossover operators are the single-point crossover, the two-point crossover and the uniform crossover. In a single-point crossover, parents get cut at one point (which may be chosen randomly or could be a fixed parameter) and each offspring gets the part located at the left of the cutting point of one parent and the part located at the right of the cutting point of the second parent. A two-point crossover is somewhat similar to single-point crossover, but parents are cut at two points and offspring get alternating parts from both parents.

Example A.1. For example, performing a two-point crossover for the CVRP could consist in a two-cut of the parent chromosomes at the same locations of their first route, while making sure the cuts do not coincide with a depot. Then the subroute between the cut points of the first parent is copied to the first child. Afterwards, the second parent is scanned and the customers of its first route that do not appear in the first child are copied to that same child, in the same order of their appearance in the second parent. Then, the same operation is repeated for the second child by starting with the subroute between the cut points of the second parent. This example is illustrated in Figure A.2, where we have n = 6 customers (0 denotes the depot) and m = 1 vehicle. Note that such a two-point crossover operator respects all constraints of the CVRP.

FIGURE A.2: Two-point crossover generating two offspring from parent chromosomes for Example A.1. First parent: 0 2 4 5 1 3 6 0; second parent: 0 6 2 1 5 4 3 0; first child: 0 4 5 1 6 2 3 0; second child: 0 2 1 5 4 3 6 0.

Uniform crossover operates differently from one-point and two-point crossover: instead of treating parts of chromosomes, it treats each element in a chromosome randomly. For instance, in the CVRP, an element of a chromosome is a customer. More specifically, it decides for each element of each offspring from which parent it is inherited, based on a certain probability, for instance a 50% probability.

• The mutation operator involves mutating elements of a single chromosome randomly. It is a very important operator in genetic algorithms, since it prevents the search from being stuck in local optima. Nevertheless, it should not happen frequently, because a genetic algorithm would then turn into a random search. Basically, the mutation should happen with a tiny probability, between 0.1% and 1%.

Example A.2. For instance, a mutation for the CVRP could consist in reversing a subroute of a chromosome (solution) route. This mutation is illustrated in Figure A.3, for n = 6 customers (0 denoting the depot) with m = 1 vehicle.

FIGURE A.3: Mutation generating one child from a parent chromosome for Example A.2. Parent: 0 2 4 5 1 3 6 0; child: 0 2 1 5 4 3 6 0.

A code sketch reproducing Examples A.1 and A.2, together with the K-tournament selection described above, is given after this list.

5. The population of a generation evolves into the population of the next generation according to replacement policies. In other words, replacement policies decide which chromosomes (parents and offspring) survive into the next generation. Traditionally in a genetic algorithm, the population size remains constant or may be customised in a controlled manner. A traditional strategy in genetic algorithms that might be included in a replacement policy is the elitism strategy. This strategy preserves the elite (best) chromosomes and copies them to the next generation, thus preventing the algorithm from losing the best chromosomes found so far. The two popular replacement policies in genetic algorithms are the generational and steady-state replacement techniques.

• The generational replacement technique can copy only the elite offspring to the next population. Thus, the population of the next generation is composed only of offspring. Another type of generational replacement is to copy some of the offspring along with the parents by performing an elitism on both parents and offspring. In such a case, offspring compete with their parents.
• In a steady-state replacement, a small number of offspring is created by the genetic operators for the new generation. Then, the new generation is completed by copying some chromosomes from the current population. Hence, in this replacement policy, only a small portion of the older generation is replaced, thus establishing a gradual evolution of the populations.
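As announced above, here is a minimal Python sketch of the operators of Examples A.1 and A.2 and of the K-tournament selection, assuming a single-route chromosome encoded as the list 0, c1, ..., cn, 0 as in Figures A.2 and A.3; the function names and the choice of cut points are illustrative and not taken from a specific implementation.

import random

def tournament_select(population, fitness, k=3):
    """K-tournament selection: sample k chromosomes and keep the fittest one."""
    contenders = random.sample(population, k)
    return max(contenders, key=fitness)

def two_point_crossover(p1, p2, cut1, cut2):
    """p1, p2: chromosomes [0, c1, ..., cn, 0]; cuts index the inner customer sequence.
    The subroute p1[cut1:cut2] of the inner sequence is copied first, then the child is
    completed with the customers of p2 that are still missing, in p2's order."""
    inner1, inner2 = p1[1:-1], p2[1:-1]
    middle = inner1[cut1:cut2]
    rest = [c for c in inner2 if c not in middle]
    return [0] + middle + rest + [0]

def reverse_subroute(chrom, i, j):
    """Mutation of Example A.2: reverse the slice chrom[i:j] of the chromosome."""
    mutated = chrom[:]
    mutated[i:j] = reversed(mutated[i:j])
    return mutated

p1 = [0, 2, 4, 5, 1, 3, 6, 0]
p2 = [0, 6, 2, 1, 5, 4, 3, 0]
print(two_point_crossover(p1, p2, 1, 4))   # [0, 4, 5, 1, 6, 2, 3, 0], first child of Figure A.2
print(two_point_crossover(p2, p1, 1, 4))   # [0, 2, 1, 5, 4, 3, 6, 0], second child of Figure A.2
print(reverse_subroute(p1, 2, 5))          # [0, 2, 1, 5, 4, 3, 6, 0], the child of Figure A.3

pop = [4, 8, 15, 16, 23, 42]               # toy population of "chromosomes"
print(tournament_select(pop, fitness=lambda x: x, k=3))   # usually one of the larger values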

A generic view of a genetic algorithm containing the components described above is provided in Figure A.4.

FIGURE A.4: The generic mechanism of a genetic algorithm. Begin with an initial generation; compute the fitness value and selection probability of each individual; select chromosomes to reproduce based on a selection technique; apply crossover to the selected individuals with a certain probability; apply mutation to the selected individuals with a certain mutation rate probability; apply replacement to evolve from the current population to the next population; repeat until the stopping criterion is met, then exit and return the best individual.

To conclude, genetic algorithms explore the search space and exploit knowledge at the same time. They explore the search space, since they are parallel in nature (a population does not necessarily contain neighbouring solutions). They exploit knowledge, as they look at the neighbourhood of a solution. Moreover, they perform very well in optimisation tasks with multiple parameters, like when having several objectives. This is directly related to the concept of a population of chromosomes. More specifically, because genetic algorithms handle multiple solutions in parallel, they are adequate to find a trade-off solution when having different objectives in an optimisation problem [75]. Yet, genetic algorithms are computationally expensive and their encoding is not simple. A successful extension of traditional genetic algorithms are memetic algorithms [60], which are more focused on exploiting knowledge by using local search techniques instead of, or after, performing a mutation operator.

A.3 Swarm Intelligence Methods

Swarm intelligence is a research area whose goal is to (try to) simulate the collective behaviour of large systems composed of (swarms of) social insects. The swarm intelligence methods are based on the study of the collective behaviours of decentralised self-organised systems. The basic idea is that a very large group of agents (particles, ants, insects) can be smart in a way that none of its members is. Although all agents can be just simple reactive units, together they can achieve good performances without needing a centralised control. Two popular algorithms based on swarm intelligence are Particle Swarm Optimisation (PSO) and Ant Colony Optimisation (ACO).

Essentially, a PSO algorithm works with a population (called a swarm) of candidate solutions (called particles). These particles (solutions) move around in the search space according to a few simple update formulas. The evolutions and trajectories of the particles are guided by their own best known location in the search space as well as the entire swarm's best known position. When improved positions (better solutions) are discovered, these will then guide the movements of the whole swarm (population). By repeating this process by trial and error, a high-quality solution could eventually be discovered. The PSO algorithms have similarities with genetic algorithms, through the use of a population and the update of the positions (this could be a local search as in memetic algorithms).

The ACO approach is based on the swarm approach used by ants to find the most efficient routes to a food source. The ants do nothing more than following the strongest pheromone¹⁰ trail left by other ants. But, by a repeated process of trial and error by many ants, the best route to the food source is quickly revealed. Let us now further discuss this ACO approach in greater detail.

Naturally observed ant behaviour. Information sharing between ants when searching for food materialises through pheromones. Specifically, ants drop pheromone on their path while searching for food. This substance attracts ants in such a way that they follow pheromone trails with high probability. The higher the pheromone concentration on a trail, the more it attracts ants. On the other hand, this pheromone substance evaporates with time. Hence, if supplementary pheromones are not added to pheromone trails, they disappear. Information sharing between ants to find the shortest path can be summarised by the following characteristics:

• multiple ants in each step: many ants leave the nest to search for food autonomously;

• their movements rely on a somewhat stochastic behaviour: if no pheromone trails exist they take random paths, but when pheromone trails exist, they tend to follow the trails with the highest pheromone concentration with high probability;

• their machinery relies on a positive feedback loop, that is, when pheromones are reinforced;

¹⁰ Pheromone is a chemical substance laid by ants.

FIGURE A.5: The mechanism of the ACO meta-heuristic. Initialise pheromone trails; initialise ant agents that will construct feasible solutions according to pheromones; improve solutions using local search (this step is optional); update pheromones depending on the quality of solutions; repeat until the stopping criterion is met, then exit.

• their machinery also relies on a negative feedback loop, that is, when pheromones evaporate; therefore, the ants cannot be stuck in permanent trajectory loops.

This natural process was translated into the ACO meta-heuristic algorithm, which is described for shortest path problems in the following.

Basics of the ACO meta-heuristic for shortest path problems. Generally, the objective in problems that involve determining the shortest path is to find the shortest path from a node that we will call the nest node to a node that we call the food node, while respecting some constraints of the problem. In the ACO meta-heuristic (Figure A.5), a certain number of artificial ants, also called agents, are initialised at the nest node in each iteration of the algorithm. In each iteration, agents build solutions independently from other agents according to the pheromone and the constraints of the problem: after constructing each solution, a global variable representing the artificial pheromone concentration on each edge¹¹ of a solution is increased or decreased according to the quality of the solution. Hence, while constructing a solution, more specifically when expanding the solution edge by edge until reaching the food node, a feasible edge¹² is chosen probabilistically according to the pheromone concentration on this edge. An optional step is to apply a local search method that improves the solutions constructed by the ants before updating (increasing or decreasing) pheromones.

This technique has several inherent features. For instance, the parallelism of the working agents makes it convenient for multi-objective optimisation. In addition, it is an adaptive search strategy: the environment always changes (pheromone quantities) and the search strategy adapts to this new environment. Nevertheless, such methods are not easy to employ, and one should pay great attention and be careful when employing them: they involve many parameters whose values might considerably affect solution quality (like the number of agents, which also affects pheromone concentration), they are not easy to implement and they are computationally costly.

¹¹ The information about pheromone concentration on a path can be associated to nodes instead of edges.
¹² A feasible edge is an edge such that, if added, the associated solution will respect the problem constraints.
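As a small illustration of the probabilistic edge choice described above, the sketch below picks the next node with probability proportional to a pheromone term; the visibility term eta and the exponents alpha and beta are standard ACO ingredients that go beyond what the text above specifies, so they are assumptions of this sketch.

import random

def choose_next(i, feasible, tau, eta, alpha=1.0, beta=2.0):
    """i: current node; feasible: candidate next nodes; tau: pheromone levels; eta: heuristic."""
    weights = [tau[i][j] ** alpha * eta[i][j] ** beta for j in feasible]
    return random.choices(feasible, weights=weights, k=1)[0]

# Toy usage: from node 0, node 2 carries more pheromone and is closer, so it is favoured
tau = {0: {1: 0.2, 2: 0.8}}
eta = {0: {1: 1 / 5.0, 2: 1 / 2.0}}         # e.g. the inverse of the edge length
print(choose_next(0, [1, 2], tau, eta))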

Annexe B

Proofs of Chapter 5

Contents
B.1 Proof of Proposition 5.1
B.2 Proof of Lemma 5.1

B.1 Proof of Proposition 5.1

Let $B_i \subseteq \Omega_i := \{0, 1\}^{i-1}$, $i = 2, \ldots, N$, denote the set of possible failure situations that may occur at the $i$-th customer on route $R$, i.e.,
$$B_i = f(A_1 \times \cdots \times A_i), \qquad (B.1)$$
with $A_\ell := \llbracket \underline{A}_\ell; \overline{A}_\ell \rrbracket$, $\ell = 1, \ldots, i$.

Let $h_i$ be the function from $X_1 \times \cdots \times X_i$ to $\mathbb{N}^*$ defined by $h_i(\theta_1, \ldots, \theta_i) = q_i$, with $q_i$ defined by Equation (5.5) of Section 5.2.2. In other words, $h_i$ provides the load in the vehicle after serving the $i$-th customer given that customer demands are $(\theta_1, \ldots, \theta_i)$.

Remark that any $\omega^i \in B_i$ may be obtained by several vectors $(\theta_1, \ldots, \theta_i) \in A_1 \times \cdots \times A_i$. As a consequence, when it is known that the failure situation $\omega^i$ has occurred at the $i$-th customer, then the load in the vehicle after serving the $i$-th customer is known only in the form of a set $L_{\omega^i}$ such that
$$L_{\omega^i} = \left\{ h_i(\theta_1, \ldots, \theta_i) \mid (\theta_1, \ldots, \theta_i) \in A_1 \times \cdots \times A_i,\ f(\theta_1, \ldots, \theta_i) = \omega^i \right\}.$$

Consider the tree built according to Algorithm 2 and remove all its nodes below level $i$. Call $Tree_i$ the resulting tree. Then, for a given leaf of $Tree_i$, by concatenating in a vector the Boolean failure variables $r_\ell$ at levels $\ell = 2, \ldots, i$ written on the path from the root to the leaf, we obtain the binary failure situation vector $t^i = (r_2, r_3, \ldots, r_i) \in \Omega_i$, and this leaf contains also an interval $LT_{t^i}$ of integers representing imprecise knowledge about the vehicle load after serving the $i$-th customer when $t^i$ has occurred. Besides, all the leaves of $Tree_i$ yield the subset $BT_i \subseteq \Omega_i$.

We will now show by induction that for $i = 2, \ldots, N$, we have: $B_i = BT_i$ and $\forall \omega^i \in B_i$, $L_{\omega^i} = LT_{t^i}$ for $t^i \in BT_i$ such that $t^i = \omega^i$. Note that from the definition of the addition of two intervals of integers $I_1$ and $I_2$, i.e., $I_1 + I_2 = \{x_1 + x_2 \mid x_1 \in I_1, x_2 \in I_2\}$, we have $\forall x \in I_1 + I_2$, $\exists x_1 \in I_1, x_2 \in I_2$ such that $x_1 + x_2 = x$.

• Consider first the case $i = 2$, hence $\Omega_2 = \{\omega_1^2, \omega_2^2\}$ with $\omega_1^2 = (0)$ and $\omega_2^2 = (1)$. In such a case, either $B_2 = \{\omega_1^2\}$ or $B_2 = \{\omega_2^2\}$ or $B_2 = \{\omega_1^2, \omega_2^2\}$.

  – If $B_2 = \{\omega_1^2\}$, then it implies that $\overline{A}_1 + \overline{A}_2 \le Q$ and clearly $L_{\omega_1^2} = A_1 + A_2$. Besides, if $\overline{A}_1 + \overline{A}_2 \le Q$, then according to Algorithm 2 we have $BT_2 = \{\omega_1^2\}$ and $LT_{\omega_1^2} = A_1 + A_2$.

  – If $B_2 = \{\omega_2^2\}$, then it implies that $\underline{A}_1 + \underline{A}_2 > Q$ and clearly $L_{\omega_2^2} = A_1 + A_2 - Q$. Besides, if $\underline{A}_1 + \underline{A}_2 > Q$, then according to Algorithm 2 we have $BT_2 = \{\omega_2^2\}$ and $LT_{\omega_2^2} = A_1 + A_2 - Q$.

  – If $B_2 = \{\omega_1^2, \omega_2^2\}$, then it implies that $\exists (\theta_1, \theta_2) \in A_1 \times A_2$ such that $f(\theta_1, \theta_2) = \omega_1^2$, and thus $\exists (\theta_1, \theta_2) \in A_1 \times A_2$ such that $\theta_1 + \theta_2 \le Q$, and it also implies $\exists (\theta_1, \theta_2) \in A_1 \times A_2$ such that $f(\theta_1, \theta_2) = \omega_2^2$, and thus $\exists (\theta_1, \theta_2) \in A_1 \times A_2$ such that $\theta_1 + \theta_2 > Q$. In particular, it implies that $\underline{A}_1 + \underline{A}_2 \le Q < \overline{A}_1 + \overline{A}_2$. Hence, since for $\theta_1 + \theta_2 \le Q$ we have $q_2 = \theta_1 + \theta_2$, and for $\theta_1 + \theta_2 > Q$ we have $q_2 = \theta_1 + \theta_2 - Q$, we obtain that $L_{\omega_1^2} = \llbracket \underline{A}_1 + \underline{A}_2; Q \rrbracket$ and $L_{\omega_2^2} = \llbracket 1; \overline{A}_1 + \overline{A}_2 - Q \rrbracket$. Besides, if $\underline{A}_1 + \underline{A}_2 \le Q < \overline{A}_1 + \overline{A}_2$, then according to Algorithm 2 we have $BT_2 = \{\omega_1^2, \omega_2^2\}$, $LT_{\omega_1^2} = \llbracket \underline{A}_1 + \underline{A}_2; Q \rrbracket$ and $LT_{\omega_2^2} = \llbracket 1; \overline{A}_1 + \overline{A}_2 - Q \rrbracket$.

• Suppose that for $i < N$ we have: $B_i = BT_i$ and $\forall \omega^i \in B_i$, $L_{\omega^i} = LT_{t^i}$ for $t^i \in BT_i$ such that $t^i = \omega^i$. Let us show that it holds for $i + 1$.

  From the preceding assumption, we have $\forall \omega^i = (r_2, \ldots, r_i) \in B_i$ that $L_{\omega^i}$ is the interval $LT_{t^i}$, i.e., $L_{\omega^i} = \llbracket \underline{L}_{\omega^i}; \overline{L}_{\omega^i} \rrbracket = \llbracket \underline{LT}_{t^i}; \overline{LT}_{t^i} \rrbracket$ for $t^i \in BT_i$ such that $t^i = \omega^i$. In addition, we have $\forall \omega^i = (r_2, \ldots, r_i) \in B_i$:

  – Either $\overline{L}_{\omega^i} + \overline{A}_{i+1} \le Q$, in which case the failure situation $\omega^i$ at the $i$-th customer will induce a failure situation $\omega^{i+1} \in B_{i+1}$ at the $(i+1)$-th customer such that $\omega^{i+1} = (r_2, \ldots, r_i, 0)$ and $L_{\omega^{i+1}} = L_{\omega^i} + A_{i+1}$. In addition, $\overline{L}_{\omega^i} + \overline{A}_{i+1} \le Q$ is equivalent to $\overline{LT}_{t^i} + \overline{A}_{i+1} \le Q$, in which case the leaf of $Tree_i$ associated to $t^i$ will induce according to Algorithm 2 the leaf of $Tree_{i+1}$ with associated vector $t^{i+1} = \omega^{i+1}$ and interval $LT_{t^{i+1}} = LT_{t^i} + A_{i+1}$.

  – Or $\underline{L}_{\omega^i} + \underline{A}_{i+1} > Q$, in which case the failure situation $\omega^i$ at the $i$-th customer will induce a failure situation $\omega^{i+1} \in B_{i+1}$ at the $(i+1)$-th customer such that $\omega^{i+1} = (r_2, \ldots, r_i, 1)$ and $L_{\omega^{i+1}} = L_{\omega^i} + A_{i+1} - Q$. In addition, $\underline{L}_{\omega^i} + \underline{A}_{i+1} > Q$ is equivalent to $\underline{LT}_{t^i} + \underline{A}_{i+1} > Q$, in which case the leaf of $Tree_i$ associated to $t^i$ will induce according to Algorithm 2 the leaf of $Tree_{i+1}$ with associated vector $t^{i+1} = \omega^{i+1}$ and interval $LT_{t^{i+1}} = LT_{t^i} + A_{i+1} - Q$.

  – Or $\underline{L}_{\omega^i} + \underline{A}_{i+1} \le Q < \overline{L}_{\omega^i} + \overline{A}_{i+1}$, in which case the failure situation $\omega^i$ at the $i$-th customer will induce a failure situation $\omega_L^{i+1} \in B_{i+1}$ at the $(i+1)$-th customer such that $\omega_L^{i+1} = (r_2, \ldots, r_i, 0)$ and $L_{\omega_L^{i+1}} = \llbracket \underline{L}_{\omega^i} + \underline{A}_{i+1}; Q \rrbracket$, since for $q_i + \theta_{i+1} \le Q$ we have $q_{i+1} = q_i + \theta_{i+1}$. It will also induce a failure situation $\omega_R^{i+1} \in B_{i+1}$ at the $(i+1)$-th customer such that $\omega_R^{i+1} = (r_2, \ldots, r_i, 1)$ and $L_{\omega_R^{i+1}} = \llbracket 1; \overline{L}_{\omega^i} + \overline{A}_{i+1} - Q \rrbracket$, since for $q_i + \theta_{i+1} > Q$ we have $q_{i+1} = q_i + \theta_{i+1} - Q$. In addition, $\underline{L}_{\omega^i} + \underline{A}_{i+1} \le Q < \overline{L}_{\omega^i} + \overline{A}_{i+1}$ is equivalent to $\underline{LT}_{t^i} + \underline{A}_{i+1} \le Q < \overline{LT}_{t^i} + \overline{A}_{i+1}$, in which case the leaf of $Tree_i$ associated to $t^i$ will induce according to Algorithm 2 the leaf of $Tree_{i+1}$ with associated vector $t_L^{i+1} = \omega_L^{i+1}$ and interval $LT_{t_L^{i+1}} = \llbracket \underline{LT}_{t^i} + \underline{A}_{i+1}; Q \rrbracket$. It will also induce the leaf of $Tree_{i+1}$ with associated vector $t_R^{i+1} = \omega_R^{i+1}$ and interval $LT_{t_R^{i+1}} = \llbracket 1; \overline{LT}_{t^i} + \overline{A}_{i+1} - Q \rrbracket$.

B.2 Proof of Lemma 5.1

Let $C_P^*(R)$ and $C_P^{\prime*}(R)$ denote the upper expected penalty costs of some route $R$, under $m^{\Omega}$ and $m^{\prime\Omega}$, respectively. We have

$$C_P^{\prime*}(R) = \sum_{B \subseteq \Omega} m'(B) \max_{\omega \in B} g(\omega),$$
and

$$C_P^*(R) = \sum_{A \subseteq \Omega} m(A) \max_{\omega \in A} g(\omega).$$

Since $m \sqsubseteq m'$,
$$C_P^*(R) = \sum_{A \subseteq \Omega} \Bigl( \sum_{B \subseteq \Omega} S(A, B)\, m'(B) \Bigr) \max_{\omega \in A} g(\omega) = \sum_{B \subseteq \Omega} m'(B) \Bigl( \sum_{A \subseteq \Omega} S(A, B) \max_{\omega \in A} g(\omega) \Bigr).$$

Since $S(A, B) = 0$, $\forall A \not\subseteq B$, we can replace the condition of the second sum from $A \subseteq \Omega$ to $A \subseteq B$:
$$C_P^*(R) = \sum_{B \subseteq \Omega} m'(B) \Bigl( \sum_{A \subseteq B} S(A, B) \max_{\omega \in A} g(\omega) \Bigr).$$

In addition, for any $A, B \subseteq \Omega$ such that $A \subseteq B$, we have
$$\max_{\omega \in A} g(\omega) \le \max_{\omega \in B} g(\omega),$$
hence
$$\begin{aligned}
C_P^*(R) &= \sum_{B \subseteq \Omega} m'(B) \Bigl( \sum_{A \subseteq B} S(A, B) \max_{\omega \in A} g(\omega) \Bigr) \\
&\le \sum_{B \subseteq \Omega} m'(B) \Bigl( \sum_{A \subseteq B} S(A, B) \max_{\omega \in B} g(\omega) \Bigr) \\
&= \sum_{B \subseteq \Omega} m'(B) \max_{\omega \in B} g(\omega) \\
&= C_P^{\prime*}(R),
\end{aligned}$$
where we used the fact that $S$ is stochastic, so that $\max_{\omega \in B} g(\omega) = \sum_{A \subseteq B} S(A, B) \max_{\omega \in B} g(\omega)$ for any $B \subseteq \Omega$. Using $C_P^{\prime*}(R) \ge C_P^*(R)$ we obtain

$$C_P^{\prime*}(R) + C(R) \ge C_P^*(R) + C(R),$$
which means
$$C_E^{\prime*}(R) \ge C_E^*(R).$$
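As a purely numerical illustration of the lemma (the frame, masses, penalty values and function name below are ours and chosen only for this example), the following sketch computes the upper expected penalty of a mass function and of one of its specializations, and checks the claimed domination.

# Toy check of Lemma 5.1: a specialization m of m' cannot have a larger upper
# expected penalty. Focal sets are frozensets and g gives the penalty of each
# failure situation; all numbers are illustrative.

def upper_expected_penalty(mass, g):
    return sum(w * max(g[omega] for omega in focal) for focal, w in mass.items())

g = {"w1": 0.0, "w2": 3.0, "w3": 7.0}

m_prime = {frozenset({"w1", "w2", "w3"}): 0.6, frozenset({"w2", "w3"}): 0.4}
# m is obtained by transferring the whole mass of each focal set of m' to one of
# its subsets, which is one simple way to build a specialization of m'.
m = {frozenset({"w1", "w2"}): 0.6, frozenset({"w2"}): 0.4}

c_star, c_prime_star = upper_expected_penalty(m, g), upper_expected_penalty(m_prime, g)
print(c_star, c_prime_star)    # 3.0 7.0
assert c_star <= c_prime_star  # the less committed m' is the more pessimistic one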

Publications

Journal articles

N. Helal, F. Pichon, D. Porumbel, D. Mercier, and É. Lefèvre. The Capacitated Vehicle Routing Problem with Evidential Demands. International Journal of Approximate Reasoning, 2018, 10.1016/j.ijar.2018.02.003.

International conferences

N. Helal, F. Pichon, D. Porumbel, D. Mercier, and É. Lefèvre. A recourse approach for the capacitated vehicle routing problem with evidential demands. In ECSQARU 2017, Lecture Notes in Computer Science, pages 190–200. Springer, 2017.

N. Helal, F. Pichon, D. Porumbel, D. Mercier, and É. Lefèvre. The capacitated vehicle routing problem with evidential demands: a belief-constrained programming approach. In Belief Functions: Theory and Applications, volume 9861 of Lecture Notes in Computer Science, pages 212–221. Springer, 2016.

National conferences

N. Helal, F. Pichon, D. Porumbel, D. Mercier, and É. Lefèvre. Le problème de tournées de véhicules avec des demandes évidentielles. In Rencontres Francophones sur la Logique Floue et ses Applications, LFA 2017, pages 15–22, Amiens, France, 2017. (Best paper award).

N. Helal, F. Pichon, D. Porumbel, D. Mercier, and É. Lefèvre. Optimisation discrète sous incertitudes modélisées par des fonctions de croyance. In 17ème congrès ROADEF de la société Française de Recherche Opérationnelle et Aide à la Décision, Compiègne, France, 2016.

Bibliography

[1] N. Ben Abdallah, N. Mouhous-Voyneau, and T. Denoeux. Combining statistical and expert evidence using belief functions: Application to centennial sea level estimation taking into account climate change. International Journal of Approximate Reasoning, 55(1):341–354, 2014.

[2] H. Agarwal. Reliability Based Design Optimization: Formulations and Methodologies. PhD thesis, University of Notre Dame, Indiana, 2004.

[3] P. Augerat. Approche polyédrale du problème de tournées de vehicules. PhD thesis, Institut National Polytechnique de Grenoble, 1995.

[4] C. Bastian and A. H.G. Rinnooy Kan. The stochastic vehicle routing problem revisited. European Journal of Operational Research, 56:407–412, 1992.

[5] C. Baudrit, I. Couso, and D. Dubois. Joint propagation of probability and possibility in risk analysis: Towards a formal framework. International Journal of Approximate Reasoning, 45(1):82–105, 2007.

[6] E. M. L. Beale. On minimizing a convex function subject to linear inequalities. Journal of the Royal Statistical Society, 17(2):173–184, 1955.

[7] A. Ben-Tal, L. El Ghaoui, and A. Nemirovski. Robust Optimization. Princeton University Press, 2009.

[8] D. Bertsimas, D. B. Brown, and C. Caramanis. Theory and applications of robust optimization. SIAM Review, 53(3):464–501, 2011.

[9] J. R. Birge and F. Louveaux. Introduction to Stochastic Programming. Springer-Verlag, New York, 1997.

[10] T. Blickle and L. Thiele. A comparison of selection schemes used in evolutionary algorithms. Evolutionary Computation, 4(4):361–394, 1996.

[11] L. D. Bodin, B. L. Golden, A. A. Assad, and M. O. Ball. Routing and scheduling of vehicles and crews: The state of the art. Computers and Operations Research, 10(2):63–211, 1983.

[12] O. Bräysy. Local search and variable neighborhood search algorithms for the vehicle routing problem with time windows. PhD thesis, University of Vaasa, Finland, 2001.


[13] J. Brito, J. A. Moreno, and J. L. Verdegay. Fuzzy optimization in vehicle routing problems. In Proceedings of the Joint 2009 International Fuzzy Systems Association World Congress and 2009 European Society of Fuzzy Logic and Technology Conference, Lisbon, Portugal, 2009.

[14] A. Charnes, W. W. Cooper, and G. H. Symonds. Cost horizons and certainty equivalents: an approach to stochastic programming of heating oil. Management Science, 4(3):235–263, 1958.

[15] J. Q. Chen, W. L. Li, and T. Murata. Particle swarm optimization for vehicle routing problem with uncertain demand. In Proceedings of the 4th IEEE International Conference on Software Engineering and Service Science (ICSESS), Beijing, China, 2013. IEEE.

[16] C. H. Christiansen and J. Lysgaard. A branch-and-price algorithm for the capacitated vehicle routing problem with stochastic demands. Operations Research Letters, 35:773–781, 2007.

[17] G. Clarke and J. W. Wright. Scheduling of vehicles from a central depot to a number of delivery points. Operations Research, 12(4):568–581, 1964.

[18] J.-F. Cordeau, G. Laporte, M. W. P. Savelsbergh, and D. Vigo. Handbooks in Operations Research and Management Science, volume 14, chapter Vehicle Routing, pages 367–428. Elsevier, 2007.

[19] G. B. Dantzig. Linear programming under uncertainty. Management Science, 1(3 and 4):197–206, 1955.

[20] G. B. Dantzig and J. H. Ramser. The truck dispatching problem. Management Science, 6(1):80–91, 1959.

[21] C. Darwin. On The Origin of Species by Means of Natural Selection or the Preservation of Favored Races in the Struggle for Life. J. Murray, 1859.

[22] A. P. Dempster. Upper and lower probabilities induced by a multivalued mapping. The Annals of Mathematical Statistics, 38(2):325–339, 1967.

[23] A. P. Dempster. A generalization of Bayesian inference. Journal of the Royal Statistical Society. Series B (Methodological), 30(2):205–247, 1968.

[24] T. Denoeux. Analysis of evidence-theoretic decision rules for pattern classification. Pattern Recognition, 30(7):1095–1107, 1997.

[25] T. Denoeux. 40 years of Dempster-Shafer theory. International Journal of Approximate Reasoning, 79(C):1–6, 2016.

[26] D. Dentcheva. Lectures on Stochastic Programming: Modeling and Theory, chapter Optimization Models with Probabilistic Constraints, pages 87–153. MOS-SIAM Series on Optimization. Mathematical Programming Society and Society for Industrial and Applied Mathematics, 2009.

[27] S. Destercke and D. Dubois. Idempotent conjunctive combination of belief functions: Extending the minimum rule of possibility theory. Information Sciences, 181(18):3925–3945, 2011.

[28] S. Destercke, M-H. Masson, and M. Poss. Cautious label ranking with label-wise decomposition. European Journal of Operational Research, 246(3):927–935, 2015.

[29] M. Dror. Modeling vehicle routing with uncertain demands as a stochastic program: Properties of the corresponding solution. European Journal of Operational Research, 64:432–441, 1993.

[30] M. Dror, G. Laporte, and F. V. Louveaux. Vehicle routing with stochastic demands and restricted failures. ZOR - Methods and Models of Operations Research, 37:273–283, 1993.

[31] M. Dror, G. Laporte, and P. Trudeau. Vehicle routing with stochastic demands: Properties and solution frameworks. Transportation Science, 23(3):166–176, 1989.

[32] M. Dror and P. Trudeau. Stochastic vehicle routing with modified savings algorithm. European Journal of Operational Research, 23:228–235, 1986.

[33] D. Dubois and H. Prade. A set-theoretic view of belief functions: logical operations and approximations by fuzzy sets. International Journal of General Systems, 12(3):193–226, 1986.

[34] D. Dubois and H. Prade. Random sets and fuzzy interval analysis. Fuzzy Sets and Systems, 42:87–101, 1991.

[35] D. Dubois and H. Prade. Formal representations of uncertainty. In Decision-making Process: Concepts and Methods, pages 85–156, London, UK, 2009. ISTE.

[36] V. Gabrel, C. Murat, and A. Thiele. Recent advances in robust optimization: An overview. European Journal of Operational Research, 235(3):471–483, 2014.

[37] C. Gauvin, G. Desaulniers, and M. Gendreau. A branch-cut-and-price algorithm for the vehicle routing problem with stochastic demands. Computers and Operations Research, 50:141–153, 2014.

[38] M. Gendreau, G. Laporte, and R. Séguin. Stochastic vehicle routing. European Journal of Operational Research, 88:3–12, 1996.

[39] M. Gendreau, G. Laporte, and R. Séguin. A tabu search heuristic for the vehicle routing problem with stochastic demands and customers. Operations Research, 44:469–477, 1996.

[40] B. E. Gillett and L. R. Miller. A heuristic algorithm for the vehicle-dispatch problem. Operations Research, 22(2):340–349, 1974.

[41] F. Glover. Future paths for integer programming and links to artificial intelligence. Computers and Operations Research, 13(5):533–549, 1986.

[42] B. L. Golden and W. Stewart. Vehicle routing with probabilistic demands. In Computer Science and Statistics: Tenth Annual Symposium on the Interface. NBS special publication 503, 1978.

[43] B. L. Golden and J. R. Yee. A framework for probabilistic vehicle routing. AIIE Transactions, pages 109–112, 1979.

[44] H. Harmanani, D. Azar, N. Helal, and W. Keirouz. A simulated annealing algorithm for the capacitated vehicle routing problem. In Proceedings of the 26th International Conference on Computers and their Applications, New Orleans, USA, 2011.

[45] N. Helal, F. Pichon, D. Porumbel, D. Mercier, and É. Lefèvre. CVRPED benchmarks. https://www.lgi2a.univ-artois.fr/spip/IMG/zip/cvrped_instances.zip, 2017. Accessed: 2017-09-01.

[46] N. Helal, F. Pichon, D. Porumbel, D. Mercier, and É. Lefèvre. CVRPED+ benchmarks. https://www.lgi2a.univ-artois.fr/spip/IMG/zip/cvrped_plus_instances.zip, 2017. Accessed: 2017-09-01.

[47] N. Helal, F. Pichon, D. Porumbel, D. Mercier, and É. Lefèvre. Description file for the CVRPED and CVRPED+ benchmarks. https://www.lgi2a.univ-artois.fr/spip/IMG/pdf/cvrped_instances_readme.pdf, 2017. Accessed: 2017-09-01.

[48] R. Henrion. Introduction to chance-constrained programming. Tutorial paper for the Stochastic Programming Community Home Page, 2004. downloadable at http://stoprog.org.

[49] F. S. Hillier and G. J. Lieberman. Introduction to Operations Research, chapter Duality Theory and Sensitivity Analysis, pages 230–307. McGraw-Hill, seventh edition, 2001.

[50] J. H. Holland. Adaptation in natural and artificial systems. University of Michigan Press, Ann Arbor, Michigan, U.S.A., 1975. (Second edition, 1992, MIT Press, Cambridge, Massachusetts).

[51] S. Irnich, P. Toth, and D. Vigo. Vehicle Routing: Problems, Methods, and Applications, Second Edition, chapter The Family of Vehicle Routing Problems, pages 1–33. MOS-SIAM Series on Optimization. Society for Industrial and Applied Mathematics and the Mathematical Optimization Society, 2014.

[52] R. Kennes and P. Smets. Computational aspects of the Möbius transformation. In Proceedings of the Sixth Annual Conference on Uncertainty in Artificial Intelligence, UAI '90, pages 401–416. Elsevier Science Inc., 1990.

[53] M. J.L. Kirby. The current state of chance-constrained programming. In H. W. Kuhn, editor, Proceedings of the Princeton Symposium on Mathematical Programming, pages 93–111. Princeton University Press, 1970.

[54] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi. Optimization by simulated annealing. Science, 220(4598):671–680, 1983.

[55] V. Lambert, G. Laporte, and F. Louveaux. Designing collection routes through bank branches. Computers and Operations Research, 20(7):783–791, 1993.

[56] G. Laporte. The vehicle routing problem: An overview of exact and approximate algorithms. European Journal of Operational Research, 59(3):345–358, 1992.

[57] G. Laporte, F. Louveaux, and H. Mercure. The vehicle routing problem with stochastic travel times. Transportation Science, 26:161–170, 1992.

[58] G. Laporte, F. Louveaux, and L. van Hamme. An integer L-shaped algorithm for the capacitated vehicle routing problem with stochastic demands. Operations Research, 50:415–423, 2002.

[59] H. Masri and F. Ben Abdelaziz. Belief linear programming. International Journal of Approximate Reasoning, 51:973–983, 2010.

[60] P. Moscato. On evolution, search, optimization, genetic algorithms and martial arts: Towards memetic algorithms. Technical Report 826, Caltech Concurrent Computation Program, 1989.

[61] Z. P. Mourelatos and J. Zhou. A design optimization method using evidence theory. Journal of Mechanical Design, 128:901–908, 2006.

[62] F. Ordónez. Tutorials in Operations Research, chapter Robust Vehicle Routing, pages 153–178. INFORMS, 2014.

[63] Y. Peng and J. Chen. Vehicle routing problem with fuzzy demands and the particle swarm optimization solution. In Proceedings of the International Conference on Management and Service Science (MASS), Wuhan, China, 2010. IEEE.

[64] M. Poggi and E. Uchoa. Vehicle Routing: Problems, Methods, and Applications, Second Edition, chapter New Exact Algorithms for the Capacitated Vehicle Routing Problem, pages 59–86. MOS-SIAM Series on Optimization. Society for Industrial and Applied Mathematics and the Mathematical Optimization Society, 2014.

[65] D. Porumbel. Ray projection for optimizing polytopes with prohibitively many constraints in set-covering column generation. Mathematical Programming, 155(1):147–197, 2016.

[66] J. Renaud, F. Boctor, and G. Laporte. An improved petal heuristic for the vehicle routing problem. Journal of the Operational Research Society, 47:329–336, 1996.

[67] A. Ruszczyński and A. Shapiro. Stochastic programming models. In Stochastic Programming, volume 10 of Handbooks in Operations Research and Management Science, pages 1–64. Elsevier, 2003.

[68] F. Semet, P. Toth, and D. Vigo. Vehicle Routing: Problems, Methods, and Applications, Second Edition, chapter Classical Exact Algorithms for the Capacitated Vehicle Routing Problem, pages 37–57. MOS-SIAM Series on Optimization. Society for Industrial and Applied Mathematics and the Mathematical Optimization Society, 2014.

[69] J. K. Sengupta. A generalization of some distribution aspects of chance-constrained linear programming. International Economic Review, 11(2):287–304, June 1970.

[70] Vehicle Routing Data sets. http://www.coin-or.org/SYMPHONY/branchandcut/VRP/data/index.htm. Accessed: 2016-03-20.

[71] G. Shafer. A mathematical theory of evidence. Princeton University Press, 1976.

[72] A. Shapiro, D. Dentcheva, and A. Ruszczyński. Lectures on Stochastic Programming: Modeling and Theory. Mathematical Programming Society and Society for Industrial and Applied Mathematics, 2009.

[73] Z. Shen, F. Ordonez, and M. M. Dessouky. Optimization and Logistics Challenges in the Enterprise, chapter The Stochastic Vehicle Routing Problem for Minimum Unmet Demand, pages 349–371. Springer, New York, 2009.

[74] P. Siarry. Metaheuristics, chapter Tabu Search, pages 51–76. Springer International Publishing, 2016.

[75] P. Siarry. Metaheuristics, chapter Evolutionary Algorithms, pages 115–178. Springer International Publishing, 2016.

[76] P. Smets. Belief functions: The disjunctive rule of combination and the generalized Bayesian theorem. International Journal of Approximate Reasoning, 9(1):1–35, 1993.

[77] R. K. Srivastava, K. Deb, and R. Tulshyan. An evolutionary algorithm based approach to design optimization using evidence theory. Journal of Mechanical Design, 135(8):081003–081003-12, June 2013.

[78] T. Sunaga. Theory of an interval algebra and its application to numerical analysis. Japan Journal of Industrial and Applied Mathematics, 26(2):125–143, 2009.

[79] I. Sungur, F. Ordónez, and M. Dessouky. A robust optimization approach for the capacitated vehicle routing problem with demand uncertainty. IIE Transactions, 40:509–523, 2008.

[80] D. Teodorovic and S. Kikuchi. Application of fuzzy sets theory to the saving based vehicle routing algorithm. Civil Engineering Systems, 8(2):87–93, 1991.

[81] D. Teodorovic and G. Pavkovic. A simulated annealing technique approach to the vehicle routing problem in the case of stochastic demand. Transportation Planning and Technology, 16:261–273, 1992.

[82] D. Teodorovic and G. Pavkovic. The fuzzy set theory approach to the vehicle routing problem when demand at nodes is uncertain. Fuzzy Sets and Systems, 82(3):307–317, 1996.

[83] P. Toth and D. Vigo. The Vehicle Routing Problem, chapter An Overview of Vehicle Routing Problems, pages 1–26. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2002.

[84] P. Toth and D. Vigo, editors. The Vehicle Routing Problem. Monographs on Discrete Mathematics and Applications. Society for Industrial and Applied Mathematics, 2002.

[85] M. C. M. Troffaes. Decision making under uncertainty using imprecise probabilities. International Journal of Approximate Reasoning, 45(1):17–29, 2007.

[86] W. R. Stewart Jr. and B. L. Golden. Stochastic vehicle routing: A comprehensive approach. European Journal of Operational Research, 14(4):371–385, 1983.

[87] R. R. Yager. Arithmetic and other operations on Dempster-Shafer structures. International Journal of Man-Machine Studies, 25(4):357–366, 1986.

[88] R. R. Yager. The entailment principle for Dempster-Shafer granules. International Journal of Intelligent Systems, 1(4), 2007.

[89] J. R. Yee and B. L. Golden. A note on determining operating strategies for probabilistic vehicle routing. Naval Research Logistics Quarterly, 27(1):159–163, 1980.