A Pointer Neural Network for the Vehicle Routing Problem with Task
Total Page:16
File Type:pdf, Size:1020Kb
Information Technology and Control 2020/2/49 237 ITC 2/49 A Pointer Neural Network for the Vehicle Routing Problem with Task Priority and Limited Resources Information Technology and Control Received 2019/11/13 Accepted after revision 2020/03/22 Vol. 49 / No. 2 / 2020 pp. 237-248 DOI 10.5755/j01.itc.49.2.24613 http://dx.doi.org/10.5755/j01.itc.49.2.24613 HOW TO CITE: Ma, H., Sheng, Y., Xia, W. (2020). A Pointer Neural Network for the Vehicle Routing Problem with Task Priority and Limited Resources. Information Technology and Control, 49(1), 237-248. https://doi.org/10.5755/j01.itc.49.2.24613 A Pointer Neural Network for the Vehicle Routing Problem with Task Priority and Limited Resources Huawei Ma, Yuxiang Sheng, Wei Xia School of Management, Hefei University of Technology, Hefei 230000, P.R. China, e-mails: [email protected], [email protected], [email protected] Corresponding author: [email protected] The vehicle routing problem with task priority and limited resources (VRPTPLR) is a generalized version of the vehicle routing problem (VRP) with multiple task priorities and insufficient vehicle capacities. The objec- tive of this problem is to maximize the total benefits. Compared to the traditional mathematical analysis meth- ods, the pointer neural network proposed in this paper continuously learns the mapping relationship between input nodes and output decision schemes based on the actual distribution conditions. In addition, a global at- tention mechanism is adopted in the neural network to improve the convergence rate and results. To verify the effectiveness of the method, we model the VRPTPLR and compare the results with those of genetic algorithm and differential evolution algorithm. The parameter sensitivity of each algorithm is assessed using different datasets. Then, comparison experiments with the three algorithms employing optimal parameter configura- tions are performed for the validation sets, which are generated at different instance scales. It is found that the solution time of the pointer neural network is much shorter than that of the genetic algorithm and the proposed method provides better solutions for large-scale instances. KEYWORDS: Task Priority, Limited Resources, Pointer Neural Network, Global Attention Mechanism. 238 Information Technology and Control 2020/2/49 1. Introduction The vehicle routing problem with task priority and way of obtaining an adaptive solution to real-time limited resources (VRPTPLR) is a type of combina- scheduling problems, e.g., VRPTPLR. torial optimization problem that maximizes the total Compared with the traditional VRP, the VRPTPLR benefit by arranging tasks with different priorities studied in this paper can only access some custom- and appropriately constructing heterogeneous ve- er nodes due to limited resources, and each vehicle hicle routes. In this approach, the resources cannot should visit the customers following the order of pri- meet all the demands, and the priorities are given in orities. Therefore, a pointer neural network model is advance according to the importance degree and the proposed in this paper to select served customers first urgency demands of the customer. This problem has and then assign tasks to heterogeneous vehicles. been addressed in many fields, such as the emergen- The primary contributions of the paper are as follows. cy distribution of goods [28], and research to date has 1) The latent rule that relates the vehicle routing sets focused on the design and implementation of efficient and objective function values is learned by the pointer algorithms. neural network by combining deep learning and rein- A number of operations research methods have been forcement learning, and an excellent feasible solution proposed to solve this problem, which can be divid- can be obtained in a short time after training. 2) Be- ed into exact algorithms [7] and heuristic algorithms cause the attention mechanism can selectively learn [19]. The former always finds the optimal solution, input data and because the sequence of the model but their efficiency in solving large-scale problems output is related to this mechanism, a global atten- cannot satisfy the practical requirements. The latter tion mechanism [14] is introduced in the proposed can obtain feasible solutions quickly; however, the pointer neural network to calculate the weights of all algorithm performance cannot be validated by math- nodes and exclude those nodes that violate certain ematical theory. Therefore, based on mathematical constraints. Then, a polynomial sampling distribu- abstracting and modeling, there is often a trade-off tion is used to select the next node to be accessed by between solution performance and solution time. the vehicle. In recent years, with the rapid development and suc- This paper is structured as follows: Section 2 sum- cessful application of machine learning technology marizes the findings for similar types of VRPs; Sec- in various fields, such as management operation re- tion 3 details the model of the pointer neural network; search [10, 16], medicine [4, 22], computer science Section 4 describes the parameter experiments con- [11], etc., several technologies have been introduced ducted with the three algorithms, presents the exper- to solve combinatorial optimization problems [26, imental results and compares the results achieved 29]. Vinyals et al. [29] proposed a model consisting of with different datasets; and Section 5 summarizes the two recurrent neural networks (RNNs) and an atten- work of this paper. tion mechanism to solve combinatorial optimization problems. However, this method cannot effectively solve combinatorial optimization problems in the ab- sence of label data. Subsequently, Bello et al. [3] im- 2. Related Work proved the efficiency of the model and adjusted the parameters through reinforcement learning; exper- The VRP and its generalized problems, including iments were then performed based on the traveling VRPTPLR, have been a focus of studies involving salesman problem (TSP) with 20, 50, and 100 node in- combinatorial optimization problems, and various stances. The results showed that the performance of algorithms with different mechanisms have been pro- reinforcement learning methods was better than that posed. In addition to numerous operations research of Christofide’s heuristic algorithm [5]. The type of methods, reinforcement learning methods have re- model presented by Bello does not require mathemat- cently been introduced to solve such problems [17]. ical modeling and achieves a trade-off between solu- As mentioned above, exact algorithms and heuristic tion time and performance, thereby providing a new algorithms are the primary classes of operation re- Information Technology and Control 2020/2/49 239 search methods. In some instances, exact algorithms, and comparison experiments between non-hybrid such as dynamic programming [20], column genera- algorithms and hybrid algorithms were performed tion [8] and the branch-and-price [2] method, have for multiple instances. been used. Yu et al. [32] propose an improved branch- In addition to mathematical programming meth- and-price (BAP) algorithm to precisely solve the het- ods, machine learning methods have been increas- erogeneous fleet green vehicle routing problem with ingly used to solve VRPs; for example, Cooray et al. time windows, and a multi-vehicle approximate dy- [6] enhanced the genetic algorithm with machine namic programming algorithm was designed to speed learning technology and decreased the calculation up the pricing problem in BAP. Then, the effectiveness time to below that of the general genetic algorithm. of the BAP algorithm is verified by extensive com- Moreover, machine learning can not only be com- putational experiments performed on the Solomon bined with heuristic algorithms but can also be used benchmark instances. In addition to a specific type of to construct neural network models to solve VRPs. VRP, Pessoa et al. [21] introduce a Branch-Cut-and- Wang et al. [30] built neural network models for TSP Price algorithm for a family of VRP variants. The al- and then trained neural network models and adjust- gorithm was extensively tested in instances of the lit- ed the parameters iteratively. Experiments verified erature and was shown to be significantly better than that the solution quality of the well-trained neural previous exact algorithms. network was better than that of state-of-the-art re- Because the efficiency of exact algorithms declines sults of learning algorithms. Yu et al. [31] propose greatly in large-scale instances, heuristic algorithms a novel deep reinforcement learning-based neural have attracted considerable attention for large-scale combinatorial optimization strategy, which used an instance problems; such algorithms include the ant unsupervised auxiliary network to train the model colony algorithm [13, 33], tabu search algorithm parameters. The simulation results show that the [15], genetic algorithm [23] and others. Arnold et al. proposed strategy can significantly outperform con- [1] and Triki et al. [27] designed heuristic algorithms ventional strategies with limited computation time for their specific large-scale mathematical problems in both static and dynamic logistic systems. Nazari to obtain improved feasible solutions. Zhang et al. et al. [18] present an end-to-end framework for solv- [33] proposed a multi-objective solution strategy ing VRP using deep reinforcement learning. After based on the ant colony algorithm and three muta- problem