Prepared Test Instances Extracted from OpenStreetMap Data Using Different Network Reductions
Kaj Holmberg
[email protected]
Department of Mathematics, Linköping Institute of Technology, SE-581 83 Linköping, Sweden
LiTH-MAT-R--2018/04--SE, Department of Mathematics, Linköping University, S-581 83 Linköping
April 5, 2018

Abstract: We investigate the effect of different reductions when importing networks from OpenStreetMap data. We describe the network reductions and report computational tests of the network extraction and reduction. We also show the effect of the reductions by solving a few standard optimization problems in the resulting networks. Computational tests show that the reductions have a dramatic effect on the network size and on the time needed for solving the optimization problems. In many cases, the reductions are necessary in order to be able to solve the optimization problem in reasonable time. A practical result of this work is a set of networks that will be used as benchmarks in future research, and that are publicly available to other researchers.

1 Introduction

OpenStreetMap (OSM) data makes it possible to obtain input data for many optimization problems in city networks. However, the data might not be in the proper form for optimization, and often has to be processed in different ways before it can be used. In [5], we describe how to extract useful data from OpenStreetMap data. In general, we wish to extract network data suitable for optimization. Examples are finding shortest paths, minimal spanning trees and shortest postman tours. A more advanced application is multi-vehicle arc routing, for example for snow removal [2], [3], [6]. All of these optimization problems require the network to be given as nodes and links (arcs/edges), not as the paths or “ways” used in OpenStreetMap. For details, see [5].
Similar work is probably being done in many places around the world. As a single example, we mention [7], where some algorithms for network extraction are given and an implementation in C++ is presented. However, we have not seen a discussion and comparison of reductions like the one in this paper. Some kind of standardization may be useful.

Here we describe some groups of test instances, which will be used in forthcoming work. One goal is to make the instances available to other researchers. We describe different reductions of the networks, and motivate the reductions by solving some standard optimization problems in the resulting networks.

2 Data extraction and reduction

Let us give a brief description of the method used in this paper. The main task is to read the data in the available format, modify it according to certain rules, and write it in a format suitable for further optimization. We use OpenStreetMap data in XML format. Such files contain a great deal of information, much of which is not useful for our optimization problems. The rules for building the network depend on which link types to include, and also on how nodes with degree two or one are eliminated.

The OSM data is first read and parsed. This results in a number of nodes, and a number of paths using these nodes. Nodes not used by any path are removed. The paths are then divided into links as described below.

When reading the nodes from the OSM data, each node has a name, longitude and latitude. Nodes outside of a predetermined window are dismissed.

Then each path is treated. First its type is checked against a list of types to include. The path is dismissed if its type is not in the list. Our first selection keeps all paths with label “highway”. We have also done tests with only streets usable by car, i.e.
keeping the following secondary labels: “motorway”, “trunk”, “primary”, “secondary”, “tertiary”, “road”, “residential”, “living_street” (including the same with a trailing “_link”), but leaving out “pedestrian”, “footway”, “path” and “cycleway”.

The set of nodes associated with the path is put in a list. At this stage, uninformative tags are filtered out. The nodes in each path are associated with the nodes given in the initial node list, and each node is labeled with the associated path number. Nodes not in the list are dismissed, as they lie outside our area of interest. A path is dismissed if all of its nodes are dismissed.

Then the paths are divided into links. Basically, each adjacent pair of nodes in a list is made into a link. The cost of the link is set to the Euclidean distance between the two nodes. However, if a node is not the first or the last in the path, it may be eliminated, depending on the settings. Since each node is labeled with its associated paths, we can easily count the number of paths using the node. If this number is one, the node will get degree two, and for some settings such a node is eliminated. The distance is then added to an accumulated distance, which is put on the aggregated link. Each link is given the same type as its path. (After this, we do not use the paths anymore.) In the process, the degree of each node is calculated.

The procedure above will eliminate all isolated nodes. This is often not enough to make the problem solvable. The network needs to be reduced further in a suitable way.

Reduction  Description
R1         Splitting paths into links, removing isolated nodes.
R2         R1 plus elimination of nodes with degree 2.
R3         R1 plus selective elimination of nodes with degree 2.
R4         R2 plus elimination of trees.
R2c        R2, keeping only links for cars.
R4c        R4, keeping only links for cars.

Table 1: Reductions
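The splitting of paths into links described in this section can be sketched as follows. This is a minimal illustration and not the code used for the tests: the function name split_paths, the input format (a list of typed node lists plus a coordinate dictionary) and the flag eliminate_degree_two are assumptions made for the example. Coordinates are assumed to be planar already, paths are assumed to be non-empty, and the filtering on path type is assumed to have been done beforehand.

```python
import math
from collections import Counter


def split_paths(paths, coords, eliminate_degree_two=True):
    """Split paths into links (R1), optionally merging across bend nodes (R2).

    paths:  list of (path_type, [node ids]); hypothetical input format
    coords: dict mapping node id to planar (x, y) coordinates
    Returns a list of links (from_node, to_node, length, path_type).
    """
    # Count how many paths use each node (a node is counted once per path).
    use_count = Counter()
    for _, nodes in paths:
        use_count.update(set(nodes))

    links = []
    for path_type, nodes in paths:
        start, acc = nodes[0], 0.0
        last = len(nodes) - 1
        for k in range(1, len(nodes)):
            (x0, y0), (x1, y1) = coords[nodes[k - 1]], coords[nodes[k]]
            acc += math.hypot(x1 - x0, y1 - y0)  # Euclidean distance
            interior = k < last
            if eliminate_degree_two and interior and use_count[nodes[k]] == 1:
                continue  # node used by one path only: keep accumulating
            links.append((start, nodes[k], acc, path_type))
            start, acc = nodes[k], 0.0
    return links


# Small made-up example: one street with a bend at b, and a footway leaving at c.
coords = {"a": (0, 0), "b": (3, 0), "c": (3, 4), "d": (6, 4), "e": (3, 8)}
paths = [("residential", ["a", "b", "c", "d"]), ("footway", ["c", "e"])]

r1_links = split_paths(paths, coords, eliminate_degree_two=False)
r2_links = split_paths(paths, coords, eliminate_degree_two=True)
```

In this example node b is an interior node used by a single path, so with elimination switched on the links a-b and b-c are merged into one link of length 7. Node c is kept, since two paths meet there.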
We have used the reductions in Table 1, namely full or selective elimination of nodes with degree two and recursive elimination of nodes with degree one.

Reduction R2 is the most relevant one if the nodes with degree two are only introduced to describe the curvature of the street, as we are often not interested in the curvature. The method in [7] seems to give the result of R2.

There could, however, be a conflict between simplification of the network and map matching of GPS traces, see [4], in the sense that the simplifications made to make the route optimization easier will make the map matching harder. Map matching is in principle the task of associating coordinates from a GPS track with streets in a city map. The coordinates are compared to the positions of the streets, so the curvature of the streets obviously matters. If we plan to do map matching, the selective reduction R3 is probably better.

Selective elimination of nodes with degree two is done in the following way. Consider a node i connected only to nodes j1 and j2. As parameter, we use the direct distance between nodes j1 and j2 divided by the sum of the distance between j1 and i and the distance between i and j2. If this value is greater than 0.98, we consider the curvature in the node to be unimportant, and node i is eliminated. This corresponds approximately to an angle between the two links close to 180 degrees (within about ±20 degrees). Details are given in Section 6.1 of [4].

Reduction R4 can be used when the treatment of turning places and the single street leading there can only be done in one way. This is true for snow removal, and probably for many other applications. In such a case one can plan the treatment of that part separately, note how long it will take, remove that part from the network, and afterwards just add that time to the time in the adjacent node left in the network. This is repeated recursively, so that no nodes with degree one remain.
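The two tests described above can be sketched in a few lines. The function names straightness_ratio and eliminate_trees, the link format (u, v, cost) and the dictionary of accumulated costs are assumptions made for this illustration. In particular, simply summing the link costs of a removed part is a placeholder for the separately planned treatment time mentioned in the text.

```python
import math
from collections import defaultdict


def straightness_ratio(p_j1, p_i, p_j2):
    """Direct distance j1-j2 divided by the distance j1-i plus i-j2."""
    direct = math.hypot(p_j2[0] - p_j1[0], p_j2[1] - p_j1[1])
    via = (math.hypot(p_i[0] - p_j1[0], p_i[1] - p_j1[1])
           + math.hypot(p_j2[0] - p_i[0], p_j2[1] - p_i[1]))
    return 1.0 if via == 0.0 else direct / via


def eliminate_trees(links):
    """Recursively remove nodes with degree one (the tree part of R4).

    links: list of (u, v, cost).
    Returns (remaining_links, extra), where extra[node] is the total cost
    of the removed parts that were attached to that node.
    """
    alive = dict(enumerate(links))
    incident = defaultdict(set)
    degree = defaultdict(int)
    for idx, (u, v, _) in alive.items():
        incident[u].add(idx)
        incident[v].add(idx)
        degree[u] += 1
        degree[v] += 1  # a self-loop counts twice, so it is never a leaf

    extra = defaultdict(float)
    leaves = [n for n in degree if degree[n] == 1]
    while leaves:
        n = leaves.pop()
        if degree[n] != 1:
            continue  # already removed, or no longer a leaf
        (idx,) = incident[n]
        u, v, cost = alive.pop(idx)
        other = v if u == n else u
        incident[n].clear()
        degree[n] = 0
        incident[other].discard(idx)
        degree[other] -= 1
        # Move the cost of the removed part to the node that remains.
        extra[other] += cost + extra.pop(n, 0.0)
        if degree[other] == 1:
            leaves.append(other)
    return list(alive.values()), dict(extra)


# Made-up example: a triangle a-b-c with a dead-end street c-d-e attached.
links = [("a", "b", 1.0), ("b", "c", 1.0), ("c", "a", 1.0),
         ("c", "d", 2.0), ("d", "e", 3.0)]
remaining, extra = eliminate_trees(links)
```

Here the dead end c-d-e is removed in two steps and its total cost of 5 is noted at node c, while the triangle is left untouched. Two parallel links between the same pair of nodes give both nodes degree two, so they are not removed by this sketch either.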
The elimination of nodes with degree two is only done when reading the OSM data, and is not applied recursively to the resulting network. In other words, it is only done when splitting a path into links. The reason is that if a node belonging to two paths gets degree two, the links from the two paths may have different properties that should not be lost.

Paths in OSM rarely form cycles, so a cycle is usually made up of more than one path. If a cycle is made up of two paths, the nodes with degree two within each path are eliminated by reduction R2, but not the nodes where the paths meet. The cycle might then be reduced to two nodes with two seemingly parallel links between them. This structure will not be eliminated by R4. (In a picture such as Figure 2, the two parallel links appear as one, so it may look as if a node has degree one, even though it does not.)

3 Computational tests

3.1 Computational details

The code is implemented in Python, using NumPy and SciPy. In particular, we use the SciPy routines for finding the connected components of a graph and for finding a minimal spanning tree. We also use imposm.parser, a Python library that parses OpenStreetMap data in XML and PBF format. (The parts of the code we wrote ourselves in Python could obviously be made faster by implementing them in C.)

The tests were run on an Acer Aspire X3 X3995 3.4GHz, running Linux, Fedora Core 25. The machine has four CPUs, but only one was used in the test runs, except for imposm.parser, which can take advantage of several CPUs.
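As an illustration of the SciPy routines mentioned above, the following sketch builds a sparse matrix from a small link list and calls connected_components and minimum_spanning_tree from scipy.sparse.csgraph. The link list, the node numbering 0 to n-1 and the variable names are made up for the example, and this is not necessarily how the routines are called in our code.

```python
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import connected_components, minimum_spanning_tree

# Hypothetical link list (from_node, to_node, length), nodes numbered 0..n-1.
links = [(0, 1, 4.0), (1, 2, 3.0), (0, 2, 6.0), (3, 4, 2.0)]
n = 5

rows, cols, costs = zip(*links)
# Note: entries with the same (row, col) are summed when converting to CSR,
# so parallel links should be reduced to the cheapest one beforehand.
graph = coo_matrix((costs, (rows, cols)), shape=(n, n)).tocsr()

# Treat the links as undirected when looking for components.
n_components, labels = connected_components(graph, directed=False)

# For a disconnected network the result is a spanning forest.
mst = minimum_spanning_tree(graph)
mst_length = mst.sum()
```

In this example the network has two components, nodes 0 to 2 and nodes 3 to 4, and the spanning forest consists of the links with lengths 3, 4 and 2.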