A Polygon and Point-Based Approach to Matching Geospatial Features
International Journal of Geo-Information
Article
A Polygon and Point-Based Approach to Matching Geospatial Features
Juan J. Ruiz-Lendínez *, Manuel A. Ureña-Cámara and Francisco J. Ariza-López
Departamento de Ingeniería Cartográfica, Geodésica y Fotogrametría, Escuela Politécnica Superior de Jaén, Universidad de Jaén, 23071 Jaén, Spain; [email protected] (M.A.U.-C.); [email protected] (F.J.A.-L.)
* Correspondence: [email protected]; Tel.: +34-953-21-24-70; Fax: +34-953-21-28-54
Received: 19 October 2017; Accepted: 1 December 2017; Published: 5 December 2017
Abstract: This paper presents a methodology for matching bidimensional entities. The matching is proposed for both area and point features extracted from geographical databases. Homologous entities are obtained in a two-step process. The first step, polygon-to-polygon matching (inter-element matching), uses a genetic algorithm that classifies area features from two geographical databases. The second step is a point-to-point matching (intra-element matching) based on comparing changes in the turning functions of the matched polygons. This study shows that genetic algorithms are suitable for matching polygon features even if these features are quite different. Our results show up to 40% of matched polygons with differences in geometrical attributes. With regard to point matching, that is, matching the vertices of homologous polygons, the function and threshold values proposed in this paper provide a useful method for obtaining precise vertex matching.
Keywords: area entities matching; point entities matching; genetic algorithm; turning function descriptor; geographical databases
1. Introduction
Pattern recognition and feature matching are essential processes in many problems that deal with graphic information [1,2]. There are many studies related to graphic information, whether in vector (object-oriented) or raster format.
One specific type of graphic information is geographic information, or spatial data, which is in high demand because location is used in many areas (business, military, international aid, tourism, fleet management, smart territories, etc.) and has great economic importance [3]. The main feature of spatial data is geographical position; that is, spatial data refer to positions in the real world. Position is the basis of mapping and navigation, which are essential for engineering, natural sciences, land management, etc.
Different matching techniques based on position have been developed in response to the need to create new cartographic products that result from integrating Geospatial Databases (GDB) from heterogeneous sources with equally heterogeneous levels of quality [4–6]. This integration process has not only allowed the development of new ways of mapping based on Volunteered Geographic Information (VGI) and Spatial Data Infrastructures (SDI), but has also created an ever-increasing need for more efficient accuracy assessment procedures to evaluate the quality of these products.
Under this new geospatial paradigm, knowledge of the positional accuracy of a GDB is critical. A GDB whose positional accuracy is insufficient for a particular purpose can still be useful when integrated with another GDB of higher accuracy. Therefore, whatever the level of accuracy, it must be known. Positional accuracy has traditionally been evaluated by means of statistical methods based on measuring errors in points between the real world and the dataset (Figure 1a).
ISPRS Int. J. Geo-Inf. 2017, 6, 399; doi:10.3390/ijgi6120399; www.mdpi.com/journal/ijgi
From a statistical point of view, one of the most controversial aspects of these methods is the number of points that must be used to ensure, with a given level of confidence, that a GDB with an unacceptable quality level will not be acquired [7]. Thus, although recommendations always suggest at least 20 points [8,9], other studies [10] and institutions [11,12] suggest that this size (n = 20) seems to be very small, and larger sizes are proposed [13]. In this sense, the new capabilities offered by Global Navigation Satellite Systems (GNSS) in the last few decades have allowed a great improvement in the process of field data acquisition; however, their use still presents the inconvenience of having to use a limited number of points. This factor, combined with the very high cost of field surveying, represents the main weakness of the traditional points-based methods (Figure 1a).
Figure 1. (a) Traditional points-based methods of positional quality assessment and (b) the new approach based on automated positional assessment. GDB, Geospatial Databases.
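As a rough illustration of the traditional points-based assessment described above, the following sketch computes the horizontal offset at each check point and summarizes the offsets as a root-mean-square error (RMSE). RMSE is an assumption made here for illustration, since the text does not name a specific statistic; the function name `horizontal_rmse` and all coordinates are hypothetical.

```python
import math


def horizontal_rmse(surveyed, dataset):
    """Return the RMSE of horizontal offsets between paired (x, y) points.

    `surveyed` holds field-measured positions (the "real world") and
    `dataset` holds the positions of the same points in the tested GDB.
    Both names are illustrative assumptions, not taken from the paper.
    """
    if len(surveyed) != len(dataset) or not surveyed:
        raise ValueError("need two non-empty point lists of equal length")
    # Squared planimetric distance for each pair of homologous points.
    squared = [
        (xs - xd) ** 2 + (ys - yd) ** 2
        for (xs, ys), (xd, yd) in zip(surveyed, dataset)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical check points; a real assessment would use at least the
# 20 points mentioned in the text, and usually more.
field_points = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
gdb_points = [(3.0, 4.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
rmse = horizontal_rmse(field_points, gdb_points)
```

The same calculation applies unchanged if the field-measured positions are replaced by positions taken from a second, more accurate dataset.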
Having taken into account these considerations, and in order to overcome the difficulties mentioned previously, a new assumption was developed that positional accuracy could also be defined by measuring the differences between the location stored in a GDB (tested source) and a location determined by another GDB (reference source) of higher accuracy (Figure 1b). Thus, if the accuracy of the second source is high enough, the unmeasured difference between it and the real one can be ignored [14]. Under this new approach to the overall process of positional assessment, the first and most important stage is to identify the homologous elements between both GDB, quickly and reliably. Thus, we consider that now is the right time for developing new methodologies based on the automated assessment of the positional and geometric components of geographic information. In this sense, the authors developed a methodology to analyze the assessment of the positional components of lines based on a previous matching and a set of descriptors [15]. Latterly, they continued their study on positional accuracy by means of point-based methodologies [16]. Both studies are based on a two-step matching process that is extensively described in this paper. Thus, we can consider that the assessment process is essentially dealing with a matching problem where there are two key issues [2]: (1) how to measure the similarity between elements from both datasets (in our case, GDB) in order to match them (Figure 2a) and (2) how to achieve the optimal vertex-to-vertex correspondence between the previously-matched elements (Figure 2b). Following these ideas, in this paper, we show the matching process in a two-step process (inter-elements and intra-elements) that we have been developing in order to allow quality control assessment of two GDBs.
Figure 2. (a) Similarity between elements from both GDB and (b) vertex-to-vertex correspondence between the elements matched previously.
1.1. Inter-Elements Matching (Inter-Polygons)
With regard to the measurement of similarity in order to match elements, this is generally evaluated by means of a quantitative variable, which measures the distance between the elements compared and analyzes