
SEARCH IMPROVEMENT WITHIN THE GEOSPATIAL WEB IN THE CONTEXT OF SPATIAL DATA INFRASTRUCTURES

Aneta Jadwiga Florczyk

PhD DISSERTATION

RESEARCH ADVISORS

Dr. Francisco Javier Zarazaga–Soria
Dr. Francisco Javier López–Pellicer

May 2012

Computer Science and Systems Engineering Department
Universidad de Zaragoza

© Copyright by Aneta Jadwiga Florczyk, 2012. All Rights Reserved.


Acknowledgments

I would like to thank all the members of the IAAA research group of the University of Zaragoza, to which I had the pleasure to belong during the development of this thesis, for their collaboration and support in every aspect, and especially my research advisors, F. Javier Zarazaga–Soria and F. Javier López–Pellicer, for their endless forbearance and patience.

I would also like to express my gratitude to the members of the IFGI of the University of Muenster, Germany, and to every other person I met there, who made the effort to make my stay a wonderful experience, which has contributed to my professional and personal development. I feel especially grateful to professor Werner Kuhn, Tomi Kauppinen and Patrick Maué for their help in this development.

I would like to thank the people that have reviewed this thesis. Despite all of their help, I take full responsibility for any errors or omissions herein.

Finally, I would like to thank particularly my parents for their unconditional support during the work on this thesis. They always managed to find the right words to encourage me to follow the path I have chosen, even if this meant personal sacrifices for them.


Contents

Acknowledgments

1 Context and research issues
  1.1 Context
  1.2 Motivation
  1.3 Problem statement
  1.4 Research questions
  1.5 Methodology
  1.6 Scope
  1.7 Contributions
  1.8 Thesis structure

2 Enhanced search for a geospatial entity
  2.1 Introduction
  2.2 Terminology
  2.3 Geocoding Web services
  2.4 Quality of geocoding service
    2.4.1 Web service
    2.4.2 Spatial data quality
    2.4.3 Geocoding quality
  2.5 Compound geocoder
  2.6 Geocoding for urban management
    2.6.1 Data model
    2.6.2 Geospatial resources
    2.6.3 Geocoding framework
    2.6.4 Application
  2.7 Summary

3 Content–based semantics for geospatial resources
  3.1 Introduction
  3.2 Geoidentifier
    3.2.1 Linking geographic features
    3.2.2 Spatial search
    3.2.3 Administrative geography of Spain
    3.2.4 OGC services catalog application
  3.3 Semantics of the WMS layer
    3.3.1 Related work
    3.3.2 The method
    3.3.3 Experiment
    3.3.4 Implementation of the method as a WPS
  3.4 Application of the methods developed
  3.5 Summary

4 Semantic characterisation of Geospatial Web resources
  4.1 Introduction
  4.2 Web community and geographic metadata
  4.3 Knowledge Generator
    4.3.1 Workflow
    4.3.2 Architecture
  4.4 Applying the Knowledge Generator
    4.4.1 Method for coverage estimation
    4.4.2 Prototype
    4.4.3 Experiment
    4.4.4 Evaluation of the content–based heuristic
    4.4.5 Improvement discussion
  4.5 Summary

5 Geospatial Web Search Engine
  5.1 Introduction
  5.2 Domain searches on the Web
    5.2.1 Searching for Web services
    5.2.2 Searching for Geospatial Web resources
  5.3 Searching via a general search engine
    5.3.1 Search goals
    5.3.2 Search strategy
  5.4 Non–expert user support
    5.4.1 Faceted classification
    5.4.2 Faceted search and browsing
  5.5 Domain–specific search user interface
    5.5.1 Web service community
    5.5.2 Geospatial community
  5.6 Geospatial Web Search Engine
    5.6.1 Facets for search, browsing and navigational interface
    5.6.2 Search User Interface design
    5.6.3 OWS search and indexing procedure
    5.6.4 Architecture and Implementation
  5.7 System Evaluation
    5.7.1 Precision testing
    5.7.2 Interface Evaluation
  5.8 Summary

6 Conclusions
  6.1 Summary of Contributions
  6.2 Future Work

A KnowledgeGenerator: Prototype details

B SKOS classification schemes for faceted–search.

C Generation of OWS dataset summaries.

Bibliography

List of Tables

1.1 Examples of SDIs.
1.2 OGC Web Service interface specifications relevant in this work.

2.1 Geocoding parameters of the Web services used.

3.1 Modelling the spatial object.
3.2 Number of layers per layer type for the layer collection used in the experiment.
3.3 Parameters of the GetMap request for the definition of the spatial constraint according to the WMS service version and CRS.
3.4 Values of functions of the content collecting procedure per iteration.
3.5 Image analysis parameters.
3.6 The results of the performed experiment.

4.1 Geospatial meta elements used in Web pages.
4.2 Example of the Hhip heuristic results.
4.3 Metadata model and mapping to the HTML meta elements.
4.4 Classification of Web pages in the corpus according to Web site characteristics.
4.5 Classification of Web pages in the corpus according to the coverage estimated.
4.6 Summary of metadata extraction.
4.7 Trimmed corpus.
4.8 Results of the experiment on coverage estimation.
4.9 Evaluation of the coverage estimation method.

5.1 The main operation of OGC Web Services.
5.2 Searching interface supported by the Geospatial Web Search Engine.
5.3 Search operators and user interface design.
5.4 Accomplishment of the design principles for exploratory search interfaces.
5.5 OWS resources of potential interest to be indexed by the Geospatial Web Search Engine.
5.6 Geospatial search precision for different combinations of search strategies with search goals.
5.7 Classification of the survey participants.
5.8 The survey results.

A.1 Summary of the existing methods for the metadata extraction and proposed extracting rules.

C.1 Examples of values of analysed fields of two different features.
C.2 List of fields and some examples of their possible values. Part 1.
C.3 List of fields and some examples of their possible values. Part 2.
C.4 List of fields which have been dismissed from further evaluation.

List of Figures

1.1 Spatial dimension of information in information management.
1.2 Augmentation property of Geospatial Cyberinfrastructure that supports domain and cross–domain research.
1.3 Geospatial Portal Reference Architecture.
1.4 Geospatial Portal Reference Architecture Services Distribution.
1.5 Geoportals as access points into multidisciplinary GCI cube for exploring research results.
1.6 Scenario for searching for information about a feature within various resources, according to a general approach and its enhanced version.
1.7 Scenario for searching within an SDI for resources that provide information about a feature, according to a general approach and its enhanced version.
1.8 Scenario for searching for resources among different SDIs.
1.9 Nested evaluation frameworks of an IR system.

2.1 Direct and indirect geospatial representation.
2.2 Extracting Quality Factors of Web Service.
2.3 Structure of Web Services quality factor.
2.4 Different approaches to geographic information quality from the quality management viewpoint.
2.5 The generalised abstraction of the geocoding process.
2.6 The alternative paths in production.
2.7 The selected characteristics of geocoding service.
2.8 Workflow for the metadata generation of a mediator service.
2.9 Overview of the Compound Geocoding Service Architecture.
2.10 Domain data model evolution.
2.11 Mapping domain data model to data models of Google and IDEZarSG services.
2.12 Mapping domain data model to data models of CartoCiudad and Cadastre services.
2.13 Application of the domain data model.
2.14 Layer view of the compound geocoding service and client components.
2.15 Introduction and publication of urban incidents.

3.1 Modelling the spatial representation of a geo–concept.
3.2 Overlapping MBBOXES of administrative areas (a case from Spain).
3.3 Geoidentifiers and modelling the spatial representation of a geo–concept.
3.4 Administrative geography as support for services catalog.
3.5 Relations between the service description and the administrative geography.
3.6 Overview of the method for the orthoimage layer detection.
3.7 Workflow of the description–based analysis.
3.8 Workflow of the content–based analysis.
3.9 Example of image division in the collection procedure.
3.10 Pixel test heuristic.
3.11 Class diagram of the implemented WPS.
3.12 Sequence diagram of the communication between a client and the OGC WPS for the identification of the orthoimage WMS layer.
3.13 Services catalog as the support component for application based on on–the–fly data integration.
3.14 Integration of the WPS for orthoimage detection within the Virtual Spain project.

4.1 Overview of the system functionality.
4.2 Overview of the system architecture.
4.3 Overview of the coverage estimation method.
4.4 Overview of the content–based heuristic.

5.1 Three kinds of search activities and exploratory search.
5.2 Geospatial service taxonomy proposed in Bai et al. (2009).
5.3 Multi–layer logical structure of the URN taxonomy.
5.4 Prototyped GUI of the search result offered by the Geospatial Web Search Engine.
5.5 Overview of the searching and indexing tasks performed by Geospatial Web Search Engine.
5.6 Overview of the architecture of the Geospatial Web Search Engine.

Listings

2.1 Example of RDF description of a mediator service which uses a set of the Cadastre Services of Spain.
3.1 Example of RDF description which represents the Zaragoza municipality.
3.2 SPARQL request pattern for service selection via an MBBOX.
4.1 Metadata generated when applying the coverage estimation method.
B.1 SKOS vocabulary for the OWS service Taxonomy.
B.2 SKOS vocabulary for the OWS resources.
B.3 SKOS vocabulary for the domains of OWS services (an example).
B.4 SKOS vocabulary for the providers of OWS services (an example).
C.1 Examples of requests.
C.2 Example of the GetFeature response which retrieves only one instance.

List of Algorithms

1 Algorithm for collecting image fragments of a WMS layer.

Nomenclature

CI Cyberinfrastructure

CSDGM Content Standard for Digital Geospatial Metadata

DC Dublin Core

DL Digital Library

ESDIN European Spatial Data Infrastructure with a Best Practice Network

FGDC Federal Geographic Data Committee

GCI Geospatial Cyberinfrastructure

GEOSS Global Earth Observation System of Systems

GIBO Geographic information–bearing objects

GIR Geographic Information Retrieval

GIS Geographic Information System

GML Geography Markup Language

GPS Global Positioning System

GUI Graphical User Interface

HTML HyperText Markup Language

HTTP Hypertext Transfer Protocol

INSPIRE Infrastructure for Spatial Information in the European Community

IR Information Retrieval

ISO International Organization for Standardization

ISO/TC 211 ISO Technical Committee 211 Geographic information/Geomatics

LBS Location based service

MBBOX Minimum Bounding Box

NAP North American Profile of ISO 19115: Geographic Information - Metadata

NER Named Entity Recognition

NSDI National Spatial Data Infrastructure

OASIS The Organization for the Advancement of Structured Information Standards

OGC Open Geospatial Consortium

OWL Web Ontology Language

OWS OGC Web Service

QoS Quality of Service

RDF Resource Description Framework

RDFS RDF Schema

SDI Spatial Data Infrastructure

SDTS Spatial Data Transfer Standard

SE Search engine

SEIS European Shared Environmental Information Space

SKOS Simple Knowledge Organization System

SLA Service–level agreement

SOA Service Oriented Architecture

ToS Terms of Service

UDDI Universal Description, Discovery and Integration

UNGIWG United Nations Geographic Information Working Group

UNSDI United Nations Spatial Data Infrastructure

URI Uniform Resource Identifier

WSDL Web Services Description Language

WSQF OASIS Web Service Quality Factor

WSQM OASIS Web Services Quality Model

What magical trick makes us intelligent? The trick is that there is no trick. The power of intelligence stems from our vast diversity, not from any single, perfect principle.

Minsky, 1986

Chapter 1

Context and research issues

Geographic information technologies facilitate the integration of scientific, social and economic data through space and time in spatially enabled societies (see Figure 1.1). Williamson et al. (2010) define a society as a Spatially Enabled Society when "location and spatial information are regarded as common goods made available to citizens and businesses to encourage creativity and product development". The Spatial Data Infrastructure (SDI) concept is a milestone in these societies. An SDI is a System of Systems (Béjar, 2009) that promotes economic development, improves the stewardship of natural resources, and protects the environment. Nebert (2004) provides the following definition of an SDI:

“The relevant base collection of technologies, policies and institutional arrangements that facilitate the availability of and access to spatial data. The SDI provides a basis for spatial data discovery, evaluation, and application for users and providers within all levels of government, the commercial sector, the non–profit sector, academia and by citizens in general”.

There are many ongoing initiatives on establishing SDIs at local, national, regional or global levels. Table 1.1 presents some examples. However, the management, sharing and use of geographic information within those SDIs require standardisation efforts. Therefore, the SDI community usually adopts existing standards about geographic information to achieve these goals. The main international standardisation bodies that deal with geographic information are the International Organization for Standardization (ISO) Technical Committee 211 'Geographic information/Geomatics' (ISO/TC 211), and the Open Geospatial Consortium3 (OGC). ISO/TC 211 is responsible for the ISO geographic information series of standards, which aim to establish a structured set of standards for information concerning objects or phenomena that are directly or indirectly associated with a location relative to the Earth.4 The OGC is an international industry consortium of 439 companies, government agencies and universities participating in a consensus process to develop publicly available interface standards.5 The OGC standards specify well–defined interfaces for spatial data services to ensure interoperability across information communities. As ISO and OGC are liaised, this often results in virtually identical standards (Kresse and Fadaie, 2004).

3 www.opengeospatial.org/
4 http://www.isotc211.org/
5 http://www.opengeospatial.org/ogc

Figure 1.1: Spatial dimension of information in information management (source: Wallace, 2007).

Table 1.1: Examples of SDIs.

Level | SDI | Area | Note
Global | United Nations Spatial Data Infrastructure (UNSDI) | United Nations | "A voluntary network of UN specialized agencies, programmes and funds", established in March 2000 (UNGIWG, 2007). The United Nations Geographic Information Working Group (UNGIWG) is the coordination body.
Regional | Infrastructure for Spatial Information in the European Community (INSPIRE) | European Union | Established by Directive 2007/2/EC (EC, 2007a) of the European Parliament and of the Council of 14 March 2007 "to support Community environmental policies, and policies or activities which may have an impact on the environment."1
National | National Spatial Data Infrastructure (NSDI) | USA | Established by The White House (1994) to "reduce duplication of effort among agencies, improve quality and reduce costs related to geographic information, to make geographic data and to establish key partnerships with states, counties, cities, tribal nations, academia and the private sector to increase data availability."2 The Federal Geographic Data Committee (FGDC) is the coordination body.
National | Infraestructura de Datos Espaciales de España (IDEE) | Spain | Jefatura de Estado (2010), on infrastructure and geographic information services in Spain, transposes the INSPIRE directive.

The adoption of open OGC standards for the implementation of Geospatial Web services in SDIs (Nebert, 2004) has favoured the development of a public, open and interoperable Geospatial Web. Table 1.2 presents the main OGC Web Service (OWS) interface specifications. López-Pellicer (2011) describes the Geospatial Web as "the collection of Web services, geospatial data and metadata that supports the use of geospatial data in a range of domain applications". In the present thesis, Web services, geospatial data and metadata that belong to the Geospatial Web are called Geospatial Web resources. Such a definition embraces a variety of resources that bear geographic information (Goodchild and Zhou, 2003). Geospatial Web resources may be encoded using open standards, closed standards and proprietary formats. These resources include online systems which support capturing, storing, analysing, managing, and presenting data with a geospatial dimension (e.g. ESRI ArcGIS Server). This kind of system is also known as an online Geographic Information System (GIS). Finally, spatial browsing systems (for example Google Maps) also belong to the Geospatial Web.

Table 1.2: OGC Web Service interface specifications relevant in this work.

Name | Current version | Objective
Web Map Service (WMS) | 1.3.0 (de la Beaujardiere, 2006) | Portrayal
Web Feature Service (WFS) | 2.0 (Vretanos, 2010b) | Download features
Web Coverage Service (WCS) | 2.0.0 (Baumann, 2010) | Download coverages
Catalogue Service for the Web (CSW) | 2.0.2 (Nebert et al., 2007) | Discovery
Web Processing Service (WPS) | 1.0.0 (Schut, 2007) | Remote invocation

Proper support for finding information should be considered a prerequisite for an effective information–based community. Therefore, every community, including the SDI community, is encouraged by its users and stakeholders to develop its own approach to the task of searching for information with regard to the characteristics of the domain. Any other activity in the community tightly depends on its effectiveness.

1.1 Context

The most prominent examples of formal SDI programs are driven by national or federal officially recognised governments. However, there are other ongoing initiatives, such as the Global Earth Observation System of Systems6 (GEOSS) or the European Shared Environmental Information Space7 (SEIS), which aim to make an abundance of geographic information about the environment available through OGC services. SDIs and these initiatives support each other with their own unique emphases. An SDI focuses on data collection, data sharing and data reuse. GEOSS, for example, is an initiative which considers, analyses, and integrates isolated Earth observation systems that have been maintained by the involved nations (Bai et al., 2009), and its aim is to build a system of systems for global Earth observations to provide systematic monitoring and assessment of nine societal benefit areas (disasters, health, energy, climate, water, weather, ecosystems, agriculture, biodiversity) (Mitsos et al., 2005). All these initiatives contribute to the development of the Geospatial Cyberinfrastructure (GCI). GCI has its origins in Cyberinfrastructure (CI), a generic information infrastructure to collect, archive, share, analyse, visualise, and simulate data, information, and knowledge (NSF, 2003, 2007), which is especially required to support data– and computation–intensive scientific fields (Ellisman, 2005). Therefore, a GCI can be defined as a combination of geospatial data resources, network protocols, computing platforms, and computational services brought together to perform data–intensive applications focused on geospatial information within the geographic information community and across science and business domains (Yang et al., 2010). Since current practices in geospatial science provide cross–cutting geospatial analysis and modelling within and across many scientific domains (Yang et al., 2010; Wang and Zhu, 2008), it can be argued that GCI promotes cross–domain e–science. Diverse scientific fields produce new outcomes that frequently state new research questions. New data and novel inquiry approaches raise new demands on geospatial science. Therefore, new dedicated GCI–based solutions are needed to process and integrate geospatial information to support new requirements, for example to enable managing user–generated information (Díaz et al., 2011). As a result, a GCI–based solution utilises an integrated architecture that builds upon past investments to share spatial data, information, and knowledge (Yang et al., 2010).

6 http://www.earthobservations.org/geoss.shtml
7 http://ec.europa.eu/environment/seis/index.htm

Figure 1.2: Augmentation property of Geospatial Cyberinfrastructure that supports domain and cross–domain research. (Other dependencies among the domains are not shown to avoid overloading the presentation.)

Figure 1.2 outlines the idea of GCI, where the resources generated by a dedicated solution support the e–science performed within a domain and across domains. For example, research in hydrology might require outcomes (data, processes and tools) from previous studies in the same field, and research in the biogeography field might need to use results from various domains. The enablement of intensive computing infrastructures offers new opportunities for e–science communities (e.g. the environmental sciences communities (Giuliani et al., 2011)), which contributes to SDI goals. However, one of the important issues in promoting new e–science is information and knowledge sharing across the boundaries of the community or organisation (Longueville, 2010).

In this respect, a geoportal is employed as one of the building blocks of an SDI (Bernard et al., 2005; Maguire and Longley, 2005). It can be understood as "a web site considered to be an entry point to geographic content on the web or, more simply, a web site where geographic content can be discovered" (Tait, 2005). Using the service taxonomy from ISO/DIS 19119 (Percivall, 2002), geoportals can be classified as (1) human interaction services (i.e. services for the management of user interfaces, graphics, multimedia, and for the presentation of compound documents); they are closely related to (2) model/information management services (i.e. services for the management of the development, manipulation, and storage of metadata, conceptual schemas, and datasets), and also to (3) system management services, which support authorisation, authentication or e–commerce.

As for the technical aspects of geoportals built on open standards, the OGC provides a discussion paper on the Geospatial Portal Reference Architecture (OGC GP–RA): "The Geospatial Portal Reference Architecture documents a 'core' set of interoperability agreements that provide instructions for bridging the gaps between different organizations and communities that have heretofore shared geospatial information only with great difficulty. The portal technical interoperability between diverse systems and it also helps address 'information interoperability' between groups whose content has been created with different data models and metadata schemas." (Rose, 2004) Figure 1.3 presents the idea of the OGC GP–RA and Figure 1.4 shows the OGC GP–RA services distribution. The OGC GP–RA defines an abstract architecture for interoperable geoportals, which has been widely adopted in Web–based geospatial systems, and specifies four classes of Web services that are required to implement an efficient SDI geoportal using related interoperability specifications:

• Portal Services. These services offer an entry point to discover and access data as well as management and administration capabilities;

• Catalog Services. These services provide information about data and services;

• Portrayal Services. These services are used to process geospatial information and prepare it for presentation to the user by offering mapping and styling capabilities;

• Data Services. These services concentrate on data access and processing.

The above architecture is based on Service Oriented Architecture (SOA) principles (Erl, 2005), i.e. the popular publish–find–bind and self–describing service patterns. All OGC Web Services are self–describing by providing a GetCapabilities operation that returns a Capabilities document that describes the service endpoints, the operations exposed, as well as information about the geospatial data that they supply.

Figure 1.3: Geospatial Portal Reference Architecture (source: Percivall (2002)).

Figure 1.4: Geospatial Portal Reference Architecture Services Distribution (source: Percivall (2002)).
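As an illustration of this self–describing pattern, the minimal sketch below (added here as an aid, not taken from the thesis prototypes) issues a GetCapabilities request against a WMS endpoint and lists the layer names advertised in the returned Capabilities document. The endpoint URL is a placeholder; any public WMS base URL could be substituted.

```python
# Minimal sketch: retrieve a WMS Capabilities document and list the layers
# it advertises. The endpoint URL is only an example; any WMS base URL
# that supports GetCapabilities could be used instead.
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

def get_capabilities(base_url: str) -> ET.Element:
    """Issue a WMS GetCapabilities request and return the parsed XML root."""
    params = {"SERVICE": "WMS", "REQUEST": "GetCapabilities"}
    url = base_url + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url, timeout=30) as response:
        return ET.fromstring(response.read())

def list_layer_names(capabilities: ET.Element) -> list:
    """Collect the <Name> of every advertised layer, ignoring XML namespaces."""
    names = []
    for element in capabilities.iter():
        if element.tag.endswith("Layer"):
            for child in element:
                if child.tag.endswith("Name") and child.text:
                    names.append(child.text.strip())
    return names

if __name__ == "__main__":
    # Hypothetical endpoint; replace with a real WMS base URL.
    root = get_capabilities("http://example.org/wms")
    print(list_layer_names(root))
```

The same GetCapabilities entry point exists, with service-specific content, for the other OWS interfaces listed in Table 1.2, which is what makes automated discovery and indexing of these services feasible later in this thesis.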

1.2 Motivation

This thesis aims to determine some basic search patterns that are relevant to the SDI community considering the characteristics of geographic information and the geospatial community. These patterns are used to identify approaches for the improvement of search in the Geospatial Web from the SDI perspective.

A search for specific information is an important issue in any information system. In the SDI context, catalogue services8 are discovery and access systems for geographic information that use indexed and searchable metadata against which intelligent geospatial searches can be performed within or across SDI communities (Nebert, 2004). The metadata profiles applied in SDIs use different standards as their base. For example, the INSPIRE Implementing Rules on metadata (INSPIRE DTM and EC/JRC, 2010) identify ISO 19115:2003 (ISO, 2003), and the FGDC recommends the Content Standard for Digital Geospatial Metadata (CSDGM) (FGDC, 1998a) and the North American Profile (NAP) of ISO 19115:2003 (ANSI, 2009). Although metadata standards vary across SDIs, crosswalks enable the translation of this information in order to make it conform to a selected metadata standard or profile (Nogueras-Iso et al., 2004; Batcheller, 2008; Khoo and Hall, 2010). For example, Nogueras-Iso et al. (2004) present a transformation between the ISO 19115 Core and Dublin Core (DC) metadata standards. The DC metadata standard is the outcome of an open metadata initiative for the description of cross–domain information resources (Powell et al., 2007) that has become an ISO standard (ISO 15836:2009 (ISO, 2009)). Crosswalks enable users to perform different search strategies among different SDIs, such as federated or centralised search. For example, Leite et al. (2006) propose a Web–based GIS architecture where the central point is a catalogue that offers federated search over distributed catalogues. Another proposal from the geospatial community is a multi–catalogue search engine that searches across various catalogues that implement the OGC Catalogue Service standard (Nebert et al., 2007) and offers an integrated result (Li et al., 2011).

Traditionally, Web resources, from Web pages to SDI geoportals, have not received much attention from the geospatial community. Metadata about these resources, often potentially interesting for users and stakeholders, are seldom found in SDI catalogue services. However, there are indications that the status quo may change. The number of geoportals available online is growing, a fact that paradoxically increases the difficulty of searching for geospatial information. Geoportals are the visible part of a GCI that can be exploited by a potential user, who will try to discover such areas of interest (i.e. geoportals) for their further exploration. In addition, there are resources generated by experts in the field of geographic information that are published on the Web but are not published for their discovery in SDI catalogues or geoportals. Moreover, some works point to geographic information generated by communities of Web users (i.e. neogeography (Turner, 2006), naïve geography (Egenhofer and Mark, 1995) or Volunteered Geographic Information (VGI) (Goodchild, 2007)) as an important source of information for SDIs, or at least a complementary one (Craglia et al., 2008; Keßler and Bishr, 2009).

These new requirements and the augmentative character of the content produced by the geospatial community require the identification of new approaches for the improvement of search in the Geospatial Web. Web crawling and the Semantic Web have been considered as relevant for this purpose.

8 Different names are used in the geospatial community when referring to such a system, for example, catalogue services (OpenGIS Consortium), Spatial Data Directory (Australian Spatial Data Infrastructure), and Clearinghouse and the Geospatial One–Stop Portal (FGDC).
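To make the idea of a metadata crosswalk mentioned above more tangible, the sketch below maps a handful of ISO 19115–like elements onto Dublin Core elements. The element names and the mapping are simplified illustrations for this paragraph only, not the crosswalk defined in Nogueras-Iso et al. (2004).

```python
# Illustrative sketch only: a partial crosswalk in the spirit of the
# ISO 19115 / Dublin Core transformations cited above. The element names
# and the mapping are simplified examples, not the published crosswalk.
ISO19115_TO_DC = {
    "title": "dc:title",
    "abstract": "dc:description",
    "keyword": "dc:subject",
    "pointOfContact": "dc:creator",
    "language": "dc:language",
    "geographicBoundingBox": "dc:coverage",
}

def crosswalk(iso_record: dict) -> dict:
    """Translate a flat ISO 19115-like record into Dublin Core elements,
    dropping elements that have no mapping."""
    dc_record = {}
    for element, value in iso_record.items():
        target = ISO19115_TO_DC.get(element)
        if target is not None:
            dc_record.setdefault(target, []).append(value)
    return dc_record

sample = {                      # invented sample metadata record
    "title": "Hydrography of the Ebro basin",
    "abstract": "River network dataset",
    "keyword": "hydrography",
    "language": "spa",
}
print(crosswalk(sample))
```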

Today, search engines (SEs) are popularly employed to perform search activity on the Web. In general, a search engine, in response to a user query, returns an ordered list of informative items on potentially relevant Web resources (e.g. Web pages, images) retrieved from a repository. Three main types might be distinguished: directory–based, crawler–based or hybrid (Inthiran et al., 2010). A directory–based SE is based on curated lists (e.g. Yahoo9). The directories existing on the Web usually ensure better precision, but their Web coverage is lower and frequently domain–oriented. In this sense, geospatial catalogues can be classified as directory–based SEs. In contrast to a directory–based SE, the repository of a crawler–based SE is created and maintained automatically. Automatic Web crawlers (also called robots or spiders) are responsible for finding, analysing and indexing Web resources. Considering Web resource accessibility through a crawler–based SE, the Web might be divided into three layers:

• Surface Web. It is the part of the Web whose resources are reachable by Web crawlers via hyperlinks (Bergman, 2001);

• Deep Web. The deep Web is composed of the online databases whose data are exposed via forms, applications or Web services (Bergman, 2001);

• Invisible Web. The rest of the Web, which is not easily indexable or does not have valuable content (Price and Sherman, 2001).

Domain portals found through a general SE require additional exploration by users to find the required resources (Green, 2000). For example, a geoportal is a domain portal which requires further interaction by the user in order to access the deep Web resources (Figure 1.5). This is a characteristic of domain niches, the domains whose resources are of low interest for a general search engine. In other words, the effort required for discovering, indexing and supporting proper search functionality is not compensated by the interest of a typical Web user. Therefore, in terms of specialisation, search engines might be divided into (1) general SEs (for example the Google Search Engine10) that offer horizontal search, and (2) domain–oriented SEs (e.g. Google Scholar11) that offer vertical search. The domain–oriented SEs are optimised to support specific characteristics of the domain in terms of thematic search (e.g. in the context of medicine (Chakrabarti et al., 1999)), kind of Web resource (e.g. Web service (Al-Masri and Mahmoud, 2008), picture, video), Web coverage (i.e. a controlled list of Web addresses), or combinations of these characteristics. There are works that present a dedicated crawler capable of retrieving and indexing Geospatial Web services (López-Pellicer et al., 2011e; Li et al., 2010). Such a crawler can expose deep Web resources of the Geospatial Web (López-Pellicer et al., 2010c). In addition, research on Web search engines has a long tradition in handling the dynamic aspects of Web content, and the aspects of user search by means of a search engine have been investigated intensively by the Web community (Broder, 2002; Rose and Levinson, 2004).

9 http://dir.yahoo.com/
10 http://www.google.com/
11 http://scholar.google.es/

Figure 1.5: Geoportals as access points into the multidisciplinary GCI cube for exploring research results.

Another research community whose technological advances are relevant to the search approaches used in the geospatial community is the Semantic Web community. These technologies are of particular importance for the geospatial community, because the Semantic Web aims to improve the reflection of human reasoning in sharing and processing the information contained in Web resources using automated tools. Such an approach requires the creation of ontologies, "a formal, explicit specification of a shared conceptualisation" (Gruber, 1993). The semantics that capture the cognitive content of Web resources might be presented in different ways. The easiest way is to add simple metadata, e.g. specially designed tags in an XML–based format. The semantics might also be represented as data models via other Web resources that provide conceptual structures, for example as the Resource Description Framework (RDF) (Klyne and Carroll, 2004). The most complex but also the richest in meaning are ontology–based semantics expressed in the form of RDF+RDFS (RDF Schema) (Brickley and Guha, 2004) or the Web Ontology Language (OWL) (W3C OWL Working Group, 2009). The combination of RDF documents and the Hypertext Transfer Protocol (HTTP) has gained considerable interest in the Semantic Web community, as it allows publishing structured data on the Web as Linked Data (Bizer et al., 2009), and offers logic references to any related resources. The potential of the created Web of Data consists of the identification of a concept via a dereferenceable Uniform Resource Identifier (URI) that permits retrieving the description of the concept from the Web as an RDF document. This document may contain references to other documents about the same concept (i.e. identifies its instances) or states the logic relation with other concepts referenced via their URIs. In this simple manner it is possible to create a web of interlaced concepts.

Geographic information also has to be accompanied by a knowledge backbone to be appropriately handled, due to its peculiarities (Egenhofer, 2002). Therefore, the proper semantic description of geographic information seems to be the first step in the improvement of its usage. A methodology for referencing plain–text annotations to a backbone ontology is being considered by the geospatial information industry (Maué et al., 2009). These additional annotations might be added on three levels: (1) resource metadata (e.g., an OWS Capabilities document), (2) data model (e.g. a Geography Markup Language (GML) Application Schema (Portele, 2012)), and (3) data entities (e.g., a GML file). The formal specifications of concepts from the reference ontologies can then be used for tasks such as semantics–based information retrieval during a workflow definition process. Semantic–based solutions have been applied successfully to improve searching within a catalogue service, for example by search query expansion to add synonyms or to support multilingual queries (Latre et al., 2009), or even to enable on–demand delivery of geospatial information and knowledge (Yue et al., 2011).
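The following sketch illustrates, with the rdflib library, the kind of structured description discussed above: a municipality modelled as an RDF resource with a label, a footprint and a link to a related description, queried with SPARQL. All URIs and property names are invented for the example and do not correspond to the vocabularies used later in this thesis.

```python
# Illustrative sketch (not from the thesis): a tiny RDF graph describing a
# municipality as a Linked Data resource, queried with SPARQL via rdflib.
# All URIs and property names below are made up for the example.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, RDFS

EX = Namespace("http://example.org/geo#")          # hypothetical vocabulary
PLACE = URIRef("http://example.org/id/Zaragoza")   # hypothetical identifier

g = Graph()
g.bind("ex", EX)
g.add((PLACE, RDF.type, EX.Municipality))
g.add((PLACE, RDFS.label, Literal("Zaragoza", lang="es")))
# A footprint published as plain WKT text, with its reference system stated.
g.add((PLACE, EX.footprintWKT, Literal("POINT(-0.8891 41.6488)")))
g.add((PLACE, EX.crs, Literal("EPSG:4326")))
# A link to another description of the same concept elsewhere on the Web.
g.add((PLACE, EX.sameAsCandidate, URIRef("http://example.org/other/Zaragoza")))

# SPARQL query: retrieve every municipality with its label and footprint.
results = g.query(
    """
    SELECT ?place ?label ?wkt WHERE {
        ?place a ex:Municipality ;
               rdfs:label ?label ;
               ex:footprintWKT ?wkt .
    }
    """,
    initNs={"ex": EX, "rdfs": RDFS},
)
for place, label, wkt in results:
    print(place, label, wkt)
```

Publishing such a description at a dereferenceable URI is what turns it into Linked Data; Listing 3.1 later in the thesis gives the actual RDF description used for the Zaragoza municipality.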

1.3 Problem statement

Marvin Minsky (Minsky, 1986) acknowledges that there is no single, perfect principle, and that vast diversity enables intelligence. Although he refers to the human mind, this statement can easily be carried over to other contexts. For example, a Spatially Enabled Society requires applications which use a variety of resources, including the spatial data and services offered by SDIs. E–science also consumes those resources and produces new outcomes that may be candidates for incorporation into the existing infrastructures. This aspect of SDIs raises some relevant issues. First, an SDI becomes part of a broader digital geographic information community, and the search task might involve non–SDI resources. Next, the profile of a user of SDI resources evolves as well. There is a general assumption that SDI resources are addressed to expert users (Boes and Pavlova, 2008). However, the advances in semantic–based solutions dedicated to SDIs (Maué, 2008; Maué and Schade, 2009; Janowicz et al., 2009, 2010) and initiatives that promote an open and massive usage of geospatial data (e.g. Digital Earth (Craglia et al., 2008)) encourage non–expert users to join the SDI user community. Additionally, the Web is used as the technological solution for SDIs. This fact, and the open standards used by SDIs, make it possible to provide SDIs with an enhanced search for geospatial resources based on the practices of the Web community. Thus, the vast diversity of SDI resources and the need to support intelligent queries are the basis for the problem statement of this thesis. This thesis addresses the following hypothesis on search improvement within the Geospatial Web in the context of SDI:

“In order to improve searching for geospatial information and resources in the context of SDI, it is necessary to develop and provide systems which are able to use semantics and content–based heuristics for exploiting the published resources in an automatic manner.”

1.4 Research questions

Searching is the process of trying to locate something specific (Beale, 2006). Therefore, considering the perspective of the geographic information community, two main categories of searching can be identified:

• a search for information about a geographic feature, and

• a search for geographic information–bearing objects (GIBO) (i.e. data sets that relate to well–defined areas or footprints on the Earth's surface (Goodchild and Zhou, 2003)), which includes the search for GIBOs related to a geographic feature.

In general, the world can be represented using field–based and object–based views (Goodchild et al., 2007), where a geo–field (i.e. a surface or coverage) represents a set of continuous characteristics of a natural phenomenon, and a geo–object (i.e. an entity or feature) represents a set of discrete characteristics of a natural phenomenon. The field/object distinction is closely related to human perception of the world, which is populated with discrete objects (e.g. a named place), while environment properties are perceived as continuously varying fields (e.g. noise) (Couclelis, 1992). These distinctions in human perception influence the way of searching for information related to a location in the world. The application of the object–based view during a search task can be related to searching by place names or textual descriptions of locations (e.g. an address). Human perception and SDI characteristics determine searching tasks. In general, there are three basic search patterns that might be performed:

• Searching for information about a feature. It requires searching through discovered datasets and involves recognition of the searched feature in the collection.

• Searching within an SDI for resources that provide information about a feature. It requires searching through an SDI catalogue.

• Searching among SDIs for resources that provide information about a feature. It requires searching through a variety of SDI catalogues.

In the first scenario, a user has to access a variety of geographic information sources in order to find the required information (see the left side of Figure 1.6). A system that offers an enhanced search for information about a geographic feature should be able to access different resources. Considering the other search scenarios, the resources should belong to distinct SDIs, and non–SDI resources should be used as well. Additionally, user requirements should be considered when performing the search (see the right side of Figure 1.6).

Figure 1.6: Scenario for searching for information about a feature within various resources, according to a general approach (on the left) and its enhanced version (on the right).

Figure 1.7: Scenario for searching within an SDI for resources that provide information about a feature, according to a general approach (on the left) and its enhanced version (on the right).

The second scenario corresponds to a typical search task within an SDI (see the left side of Figure 1.7). Here, a user accesses a catalogue to search for resources. Then, the discovered resources are evaluated and eventually selected. The search results can be improved by applying semantic and content–based methods to extract additional metadata that permit developing approaches to improve the user experience (see the right side of Figure 1.7).

Figure 1.8: Scenario for searching for resources among different SDIs.

In the last scenario, a user searches for resources across the boundaries of different SDIs. The SDI approach is based on a catalogue which offers an integrated search over several catalogues from different SDIs. In this work, the Web context is considered. New catalogues can be discovered using typical approaches from the Web community, i.e. exploiting search engines to discover at least geoportals for their further exploration (as in the second scenario). In such a scenario, the search can be improved by enabling a system which is able to discover, analyse and index relevant geographic information resources found on the Web. Geoportals are entry points to SDIs which can be exploited by automatic crawlers (see Figure 1.8). Additionally, non–SDI resources can be considered, or those which are traditionally not registered within an SDI catalogue (e.g. the Web site of a provider of geographic information resources).

A search system based on crawling the Web for the discovery of geographic information resources has some important advantages. First of all, the experience of the Web service community shows that the publish–find–bind pattern applied by SDIs might not be successful, and a crawler–based system can handle this issue to some degree. However, this problem can be mitigated by the policy–based character of an SDI, whose members are required to provide (i.e. publish and register) a minimal set of services (e.g. INSPIRE). Additionally, such a system can discover and index relevant geographic information resources which do not belong to any SDI (i.e. are not published in any catalogue) and which might be produced by experts of the geospatial community or other communities of relevance.

Searching within various SDIs in the style of a Web search engine seems to be a requirement caused by the natural evolution of SDIs. Taking into account the incremental character of spatio–temporal data and e–science results (which might be available as raw data), the number of resources produced by the live SDI community grows steadily as well. The user community, even if restricted to specialists (Craglia, 2007; Boes and Pavlova, 2008), is quite broad, and not all of them have to be geographic information professionals who "normally know what data are available and where" (Gould, 2007). Advances in offering technological solutions to non–experts from a concrete domain (for example, grid applications, or semantic frameworks that are dedicated to non–domain experts) increase the number of potential users.

1.5 Methodology

The work is divided into separate although complementary applications which investigate possibilities for search improvement in the identified search scenarios in the context of SDI. The applied systematic methodology is related to software engineering. First, the problem is presented and analysed. The solution proposed to the problem is the result of a cyclic incremental development process, which is composed of:

1. Analysis. During the analysis process, the relevant research literature is reviewed to identify advances related to the research issue.

2. Problem specification. It provides a rationale of the motivations or the challenges for the research question.

3. Conceptualisation. A solution is proposed during the conceptualisation stage.

4. Implementation. The conceptualisation guides the implementation developments.

5. Evaluation. The evaluation applies the implementation to a concrete problem and evaluates its usefulness.

1.6 Scope

This research work has the following scope:

• Feature–based Search. Although Goodchild et al. (2007) identify two possible views of the world (i.e. the geo–field and geo–object based views), the search task is limited only to the feature–based approach (e.g. searching for information about 'Madrid').

• Evaluation method. Research literature on the evaluation of Information Retrieval (IR) systems identifies four main theories about IR systems (Järvelin, 2011): ranking theory, search theory, information access theory, and information interaction theory. Figure 1.9 presents them as part of nested evaluation frameworks. In this work, only the first two theories (ranking and search) are considered. The algorithms and methods developed are evaluated using ranking theory. One of the systems developed in this work is evaluated using ranking and search theory. The two other systems are used within real applications. In such cases, an empirical study of a prototyped version has not been considered.

Figure 1.9: Nested evaluation frameworks of an IR system (source: Järvelin (2011)).

• Semantic support for search in SDI. One of the relevant research questions is the semantic support in searching within and among SDI catalogues. For example, the support for thematic search involves vocabulary mappings and/or query expansion. This is not targeted in this work. Here, semantic technologies are used as supporting tools to provide the demanded functionality, and the conceptualisation process is a secondary activity. These technologies are limited to the D2R server and the Apache Jena toolkit.

• Knowledge representation. In this work, several knowledge representation systems have been used. These systems are mainly simple SKOS (Simple Knowledge Organization System (Isaac and Summers, 2009), a W3C standard for porting knowledge organisation systems to the Semantic Web) vocabularies and RDF/XML graphs.

• SDI resources. In general, SDI resources that follow OGC specifications have been the focus of this work. However, there are some exceptions. In the case of the compound geocoding application, services from different SDIs have been integrated, whose interfaces do not necessarily follow the corresponding OGC specifications. A non–SDI resource has been used as well. The system for automatic generation of geographic metadata for Web resources focuses on Web sites of OWS providers. Web sites are also considered by the Geospatial Web Search Engine.

• Web metadata. A prototype of the system for automatic generation of geographic metadata for Web resources is limited to HyperText Markup Language (HTML) documents compliant with the HTML 4 W3C Recommendation (Raggett et al., 1999). It uses the header section to extract the meta elements. Although there exist other manners to embed semantic information within a Web page (e.g. microformats12), the presented content–based methods only process the HTML elements (a minimal extraction sketch is given after this list).

12 http://microformats.org/
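The sketch below is an illustration added here, not the thesis prototype: it extracts a few widely used geotagging meta elements from the header of an HTML document. The meta names geo.position, geo.region, geo.placename and ICBM are common publishing conventions; the extraction rules actually used by the prototype are those summarised in Table A.1.

```python
# Illustrative sketch: pull common geotagging meta elements from the header
# of an HTML 4 document. The meta names used here (geo.position, geo.region,
# geo.placename, ICBM) are widespread conventions; the actual extraction
# rules of the thesis prototype may differ.
from html.parser import HTMLParser

GEO_META_NAMES = {"geo.position", "geo.region", "geo.placename", "icbm"}

class GeoMetaExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.geo_meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        name = (attrs.get("name") or "").lower()
        if name in GEO_META_NAMES:
            self.geo_meta[name] = attrs.get("content", "")

html_doc = """
<html><head>
  <title>Example provider</title>
  <meta name="geo.position" content="41.6488;-0.8891">
  <meta name="geo.region" content="ES-Z">
  <meta name="geo.placename" content="Zaragoza">
</head><body></body></html>
"""

parser = GeoMetaExtractor()
parser.feed(html_doc)
print(parser.geo_meta)
```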

1.7 Contributions

This work aims to define basic search patterns within the Geospatial Web from the SDI perspective. These patterns are used to show the need for systems that apply semantic and content–based heuristics to enable enhanced search.

• First, this thesis identifies three approaches for searching information in the context of SDI.

• Second, this thesis develops a compound architecture for searching information about a geo- graphic feature within several datasets.

• Third, this thesis describes two approaches that enhance the task of searching for resources within an SDI catalogue.

• Fourth, this thesis presents a system for the automatic generation of geographic metadata for Web resources.

• Fifth, this thesis presents a Geospatial Web Search Engine tailored to the needs of a non–expert user.

1.8 Thesis structure

Due to the interdisciplinary nature of this thesis, each Chapter may include a background section. The Chapters are organised as follows.

Chapter 2 describes an approach for improving the search for information about a geographic feature. It introduces the concept of geocoding and related terms, such as georeferencing and geolocating. Aspects of the quality of a service–based geocoding system are identified (i.e. quality of the Web service, quality of the spatial data and quality of the geocoding process). The application developed for address geocoding is deployed in a real environment.

Chapter 3 discusses the improvement of an SDI catalogue by applying semantics and content–based methods. Two different approaches are presented and applied in a real application. The first one shows how Semantic Web technologies can be used to improve the precision of spatial search. Linking approaches from the Semantic Web community and the Geospatial Web community are discussed. The method proposed is based on the idea of abstracting a geographic feature from its spatial definition. Modelling the spatial representation of a geo–concept and the idea of the geoidentifier are shown, and used to create the administrative geography of Spain. The second part is devoted to the automatic identification of orthoimages offered by a WMS service. A general classification of the content published by WMS layers is presented. The method developed uses heuristics based on the analysis of the published content in order to provide additional semantic annotations. In addition, the work outlines the characteristics of the content published by WMS services at the end of 2010.

Chapter 4 presents an architecture for the automatic generation of geographic metadata for Web resources. First, the methods for the generation of metadata from the Web community and the geospatial community are contrasted, and geographic metadata in HTML Web pages are examined. A method for the geographic coverage estimation of Web pages is proposed as well. An empirical study shows that straightforward heuristics can provide this information automatically when a publisher does not offer it. Moreover, the empirical study provides a brief overview of the characteristics of the OWS publishers, and reveals the current practices in the geospatial community in the provision of metadata for Web pages.

Chapter 5 presents a Geospatial Web Search Engine which is intended to support non–expert users (i.e. users who lack expertise in the geospatial domain) in achieving their search goals. Searching for Geospatial Web resources is discussed in more detail from the Web community perspective, and the issue of searching for Web services is treated as a starting point.

Finally, Chapter 6 presents the central result of this thesis, summarises the contributions, and ends with suggestions for future work.

Chapter 2

Enhanced search for a geospatial entity

2.1 Introduction

This Chapter presents an enhanced search for information about a geographic feature within different information sources, i.e. Geospatial Web resources. Geocoding is the principal functionality of a system which supports text–based searches within the geospatial domain (Section 2.2). Therefore, this functionality is the target of this work. Existing geocoding systems are generally limited to assigning a geographic coordinate to an absolute location, such as a street address. However, many applications (e.g. urban management systems) need to georeference a location description in a more flexible way. Different types of information about a geographic feature may be required when the search task is performed (e.g. a polygon or a point as the spatial object), and the search system should be able to adapt to such changes. A solution might be the usage of various sources of information. In such systems, the quality of the Web service (i.e. QoS for the Web), the quality of the spatial data, and the quality of the geocoding process influence the system behaviour (Section 2.4). A compound architecture to support georeferencing is developed that integrates existing geospatial data services by using patterns (Section 2.5). Then, a framework for address geocoding is developed (Section 2.6), and the implemented application is applied in a real environment (Section 2.6.4). Finally, some conclusions are presented in Section 2.7.

2.2 Terminology

The term georeferencing means the act (and processes) of "relating information to geographic location" (Hill, 2006). As "whatever occurs, occurs in space and time" (Wegener, 2000), almost any information may contain a reference to space. Although the object of georeferencing can be almost anything, the Digital Library (DL) community identifies two main means to refer to locations: the informal and the formal (Hill, 2006). The first one, which is used in ordinary discourse, means referring to locations by using place names (i.e. toponyms). The formal representations are geospatial footprints, based on longitude and latitude coordinates or other spatial referencing systems (SRS). In this thesis, an informal reference can be a textual reference to a geographic feature (e.g. an address, a place name), a concept that can be related to a specific location in an established, permanent manner (e.g. a telephone number prefix), or any textual description of a location, including relative locations (Hutchinson and Veenendall, 2005). As for the formal reference, it may be a direct or an indirect reference. The direct formal reference is a footprint. A geographic reference system is a source of indirect formal references, i.e. geo–codes, which can be transformed into corresponding footprints. For example, an ISO country code (ISO, 2007a) (e.g. ESP) can be used as a geo–code if its geographic reference system is available (e.g. a geospatial dataset that contains footprints associated with ISO country codes). A reference system is crucial for the appropriate interpretation of coordinates as well. For example, longitude and latitude coordinates can have WGS84 or ETRS89 as their spatial reference system. Therefore, a footprint should have an associated reference system as well.

Geocoding is a kind of georeferencing, and in this work it means the act of turning a textual description of a location into a formal geospatial reference, where the formal geospatial reference is a footprint (direct geocoding) or a geo–code that allows the user to identify the footprint unambiguously via the associated geographic reference system (indirect geocoding). There are many different definitions of the term geocoding. Nevertheless, the definition used in this thesis is close to the one used in Margoulies (2001), where the geocoding process "transforms a description of a feature location, such as a place name, street address or postal code, into a normalised description of the location, which includes a coordinate geometry." Figure 2.1 shows the general idea and an example of direct and indirect geocoding.

Figure 2.1: Direct and indirect geospatial representation.
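As a toy illustration of the distinction (not code from the thesis), the sketch below resolves a location both ways: directly, from a textual description to a footprint, and indirectly, from a geo–code through an associated reference table. The lookup tables and coordinates are invented.

```python
# Toy sketch contrasting direct and indirect geocoding.
# The lookup tables and the direct_geocode stub are invented for illustration.

# Indirect geocoding: a geo-code (here an ISO 3166-1 alpha-3 country code)
# is resolved through an associated geographic reference system, modelled
# as a lookup table of representative footprints (lon, lat, WGS84).
COUNTRY_FOOTPRINTS = {
    "ESP": (-3.7, 40.4),   # illustrative footprint for Spain
    "DEU": (10.0, 51.2),   # illustrative footprint for Germany
}

def indirect_geocode(geo_code: str):
    """Resolve a geo-code into a footprint via its reference system."""
    return COUNTRY_FOOTPRINTS.get(geo_code.upper())

def direct_geocode(address: str):
    """Turn a textual location description directly into a footprint.
    A real implementation would call a geocoding Web service; here a
    hard-coded answer stands in for that call."""
    known = {"Calle Mayor 1, Zaragoza": (-0.876, 41.654)}
    return known.get(address)

print(indirect_geocode("ESP"))                    # footprint via geo-code
print(direct_geocode("Calle Mayor 1, Zaragoza"))  # footprint via description
```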

Although the type of geocoded information can be diverse (Margoulies, 2001), availability of georeferenced datasets and geocoding techniques determine the geocoding target. Address has been the target of study in the field of geocoding for a long time (Ratcliffe, 2001; Bonner et al., 2003; Davis et al., 2003; McElroy et al., 2003; Cayo and Talbot, 2003; Yang et al., 2004; Bakshi et al., 2004; Whitsel et al., 2004). There are systems that permit geocoding to other classes of geospatially related objects such as toponyms (Kimler, 2004), host IP or telephone numbers. In general, any geospatial dataset that gathers textual representations of geographic features or geospatially related objects can be used for geocoding.

Figure 2.1: Direct and indirect geospatial representation.

Existing geocoding systems are generally limited to assigning a geographic coordinate to an absolute location such as a street address. However, many applications (e.g. urban management systems) need to geocode location descriptions in a more flexible way. For example, it is common that a citizen who calls an emergency centre does not know the address where he/she is, and the descriptive information that the citizen provides is ambiguous or even confusing (e.g. "100 meters from a memorial statue in the park, which is situated by 'Colon' Street"). In general, according to Kimler (2004), the process of georeferencing an unstructured text with geographical references involves geoparsing (i.e. extracting geographical references from texts) and geocoding (i.e. resolving geographical references). However, in such a situation the user needs to be geolocated in order to estimate an unknown position by using the known positions of other objects. Long before the existence of the Global Positioning System (GPS), sailors used geolocation by fixing their relative position according to the sun, stars, and landmarks (Beal, 2003). Hutchinson and Veenendall (2005) define geolocating as a process that permits users to assign valid geographic codes to free–form textual descriptions of locations. Conventional geocoding services that work with absolute locations will not be able to determine the coordinates of the place of the incident. The best solution is a geolocation service, which might correctly interpret the relative location (e.g. distance, direction) with respect to landmarks (e.g. a park, a railway station) and features (e.g. a coffee shop). Hutchinson and Veenendall (2005) enumerate the following elements that contribute to the concept of geolocating: query syntax flexibility, user interaction, user context consciousness and complex site representation. It is important to stress that the authors argue for a new methodology; however, they do not include details of how one would implement it.

Although some authors understand geolocating as an evolution of geocoding (Goldberg et al., 2007), here this process is seen as a kind of georeferencing. Such a process involves not only geocoding the parsed location descriptions, but also handling the spatial relationships among the identified objects, i.e. topological ("inside of"), metric ("500 meters from"), fuzzy ("near") and ordered ("in front of") relationships (Lo and Yeung, 2006).
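The sketch below resolves one metric, directional relation of this kind ("100 metres north–east of a landmark") by offsetting the known position of the landmark. The landmark coordinates and the flat–Earth approximation are assumptions made only for illustration; a real geolocation service would combine several such relations and use geodetic computations.

// Sketch: resolving a metric relative location ("d metres, bearing b, from landmark L").
// The landmark coordinates and the flat-Earth approximation are illustrative assumptions.
public class RelativeLocationSketch {

    record Point(double lon, double lat) {}

    static final double METRES_PER_DEGREE_LAT = 111_320.0; // rough WGS84 value

    // Offsets a known landmark position by a distance (metres) and bearing (degrees from north).
    static Point offset(Point landmark, double distanceM, double bearingDeg) {
        double bearing = Math.toRadians(bearingDeg);
        double dLat = distanceM * Math.cos(bearing) / METRES_PER_DEGREE_LAT;
        double dLon = distanceM * Math.sin(bearing)
                / (METRES_PER_DEGREE_LAT * Math.cos(Math.toRadians(landmark.lat())));
        return new Point(landmark.lon() + dLon, landmark.lat() + dLat);
    }

    public static void main(String[] args) {
        // Hypothetical landmark: a memorial statue already geocoded by a conventional service.
        Point statue = new Point(-0.889, 41.649);
        // "100 metres north-east of the statue" -> a metric plus a directional relation.
        System.out.println(offset(statue, 100, 45));
    }
}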

2.3 Geocoding Web services

The most prevalent way of providing direct geocoding functionality is a geocoding service. One of the early proposals for the standardisation of the geocoding service is the OGC Geocoder Service (Margoulies, 2001). Currently, geocoding functionality has become part of the OpenLS core services (Mabrouk et al., 2008), where it is defined as "A network–accessible service that transforms a description of a location, such as a place name, street address or postal code, into a normalized description of the location with a Point geometry (...)." This specification permits the implementation of the traditional geocoding functionality. Since WPS services allow some geospatial processing functionality to be enclosed, a geocoding service can also be provided in conformance with the OGC WPS specification. Today, most of the geocoding service providers use their own specifications, despite the existence of the open standards. Nevertheless, geocoding Web services are not difficult to use. The problem is rather finding and selecting the right supplier. The type of geocoded object and the terms of service (ToS) are the principal characteristics considered by a Web user. In general, regardless of the service origin, which might be the private sector (e.g. Google1, ViaMichelin2 or GeoNames (Wick, 2012)), the public sector (e.g. SwissSearch3) or volunteer communities (e.g. opengeocoding.org (Behr and Rimayanti, 2008)), the services may be divided into four groups according to their ToS: community access services (e.g. SwissSearch, where the Swiss cantons only allow the Web sites of the federal government to use the address search service), paid access services (e.g. GeoNames for professional users), gratis services with some restrictions (e.g. Google), and gratis services of free use (Behr and Rimayanti, 2008). The private sector offers ad–hoc designed paid services, which often guarantee the quality of data and service. Free services of the public sector or open communities offer less quality than the dedicated ones. Usually, the largest providers offer free access to their address geocoding services but with a lower quality and some use restrictions. Their ToS restrict the presentation (e.g. the license requires the use of the supplier's visualisation APIs), prohibit the reuse of data, and have an influence on the quality of applications based on the service (i.e. by establishing limits, such as a rate limit or a maximum number of requests per day). Also, it should be considered that new types of geocoding service applications, such as support for mobile applications, demand supplementary characteristics. Location based services (LBSs) require the support of geocoding services for tracking the user location and for reverse geocoding at the level of the operating system (e.g. the Android4 or GeoClue5 projects). The availability and capabilities of these services have to be adjusted to the requirements of mobile devices (e.g. battery life, a cellular network, access to the Web).

1 http://maps.google.com/
2 http://www.viamichelin.co.uk/
3 http://api.geo.admin.ch/main/wsgi/doc/build/services/sdiservices.html
4 http://code.google.com/android/
5 http://www.freedesktop.org/wiki/Software/GeoClue

The choice of service is determined by the use case. Free geocoding Web services are appropriate for geotagging (i.e. the act of adding geographic metadata to any kind of on–line resource) of local news or incidents (e.g. a water supply shortage, planned roadworks), because such information does not require geocoding services of high quality. On the other hand, the systems on which public health (Bonner et al., 2003), public security (Ratcliffe, 2004) or environmental services (Ratcliffe, 2001) depend require a high quality of service and data. For example, the speed and efficacy of fire–fighters depend on the information they possess, such as the location, accessibility and characteristics of the building on fire (e.g. number of floors, shape, location of entrances, accessibility of the building, nearby buildings) and of fire hydrants. The vast heterogeneity of geocoding services and the specific features of geographic data pose the open problem of provider selection. There are many works in the context of service discovery and selection. Some proposals need prior service evaluation (e.g. by a rating agency (Sriharee, 2006) or a user (Manikrao and Prabhakar, 2005)), but most of the works in this area use typical QoS features (Yu and jay Lin, 2005; Wang et al., 2006; Tsesmetzis et al., 2008). The research community has also shown interest in geographic information services (Fallahi et al., 2008; Lan and Huang, 2007), but these works include only basic concepts (e.g. coverage) and do not exploit the specific characteristics of geographic data in the discovery and selection processes (e.g. reasoning based on coverage, quality of geographic objects).

2.4 Quality of geocoding service

Quality is defined as the "totality of characteristics of a product that bear on its ability to satisfy stated and implied needs" (ISO/TC 211, 2002). In this work, the quality factors of a geocoding Web service are considered. Therefore, recommendations from both the Web service community and the geospatial community should be taken into account. Additionally, some specific issues related to the geocoding process will be discussed.

2.4.1 Web service

The principal standardising organisations that work on standards related to Web services are the W3C6 and the Organization for the Advancement of Structured Information Standards (OASIS) (OASIS, 2012). The W3C's primary activity is "to develop protocols and guidelines that ensure long-term growth for the Web. W3C's standards define key parts of what makes the World Wide Web work".7 OASIS is "a not–for–profit consortium that drives the development, convergence and adoption of open standards for the global information society" (OASIS, 2012). The W3C provides a draft document that describes QoS requirements for Web services (Lee et al., 2003).

6 http://www.w3.org/
7 http://www.w3.org/Help/

Figure 2.2: Extracting Quality Factors of Web Service (source: (Kim et al., 2011)).

However, a more recent and more complete model of QoS for Web services is the Web Service Quality Factor (WSQF) (Kim et al., 2011), a standard approved by the OASIS Web Services Quality Model (WSQM) Technical Committee. This standard defines common criteria to evaluate quality levels for the interoperability, security, and manageability of services. A WSQF refers to "a group of items which represent web service's functional and non–functional properties (or values) to share the concept of web services quality among web service stakeholders". Functional quality reflects how well a service complies with or conforms to a given design, based on functional requirements or specifications. Non–functional quality refers to how well a service meets the non–functional requirements that support the delivery of the functional requirements, such as robustness or interoperability, and to the degree to which the service was produced correctly. The WSQFs have been induced from the basic characteristics of Web services (Figure 2.2), and are composed of business value quality, service level measurement quality, interoperability quality, business processing quality, manageability quality and security quality. Depending on the business perspective or the system perspective, they can be categorised into two groups: (1) the business quality group and (2) the system quality group (Figure 2.3). There is a proposal (Mabrouk et al., 2009) for an extension of the WSQM dedicated to service environments such as user mobility and context awareness of application services. It provides an overview of several ontologies:

• QoS Core ontology (based on an obsolete version of Web Services Quality Description Language (WSQDL) (Lee and Kim, 2006) published by the WSQM Technical Committee in 2007).

• Infrastructure QoS ontology.

• Service QoS ontology (based on an obsolete version of OASIS WSQM published in 2007).

• User QoS ontology.

Figure 2.3: Structure of Web Services quality factor (source: (Kim et al., 2011)).

The service QoS ontology considers dynamic capabilities and domain–specific qualities as well. The proposed model can be a general framework that provides the appropriate ground for several engineering capabilities, including QoS requirements engineering and QoS–based service engineering (e.g. service discovery, Service–Level Agreement (SLA) management, monitoring). Although the model is built on obsolete works of OASIS, the discussed issues remain relevant. Quality of service is also addressed in SDIs. INSPIRE provides an Implementing Rules document that refers to service quality: the Network Services Performance Guidelines (EC, 2007b). It tries to define minimum performance criteria for the INSPIRE Network Services. Although not all technical specifications of the Network Services are currently available, these guidelines are intended to be applicable to all of them and to be compatible with the Network Services Architecture Document (INSPIRE NS DT, 2008). These general guidelines have been defined by revising existing standards and recommendations (including those described above), and identify the following attributes of QoS for the INSPIRE Network Services: Performance, Reliability, Capacity, Availability, Security, Regulatory, and Interoperability. The concept definitions correspond with those provided in Lee et al. (2003).

2.4.2 Spatial data quality

van Oort (2005) identifies several reasons for concern about spatial data quality issues, as follows:

• There is an increasing availability, exchange and use of spatial data.

• There is a growing group of users less aware of spatial data quality.

• GIS enable the use of spatial data in all sorts of applications, regardless of the appropriateness with regard to data quality.

• Current GIS offer hardly any tools for handling spatial quality.

• There is an increasing distance between those who use the spatial data (the end users) and those who are best informed about the quality of the spatial data (the producers).

There is intensive standardisation work on handling quality within the geospatial domain. The ISO standards, for example, provide quality principles and define specific concepts (ISO 19113 (ISO/TC 211, 2002)), define principles for quality evaluation (ISO 19114 (ISO/TC 211, 2003)), and provide a description of quality assessment methodologies (ISO 19138 (ISO/TC 211, 2006)). As for data quality, the ISO 19157 standard revises ISO 19113, ISO 19114 and ISO 19138, and defines a set of measures for the spatial data quality elements identified in ISO 19113. At the time of writing this thesis, the standard is still under development (i.e. at the enquiry stage8). Sharing and reusing spatial data require paying special attention to the quality of spatial data; therefore, this issue is relevant in any SDI. Quality has to be considered from different perspectives in an SDI, and different viewpoints might be used to describe it (Garvin, 1988; Jakobsson, 2006). In his thesis, Jakobsson (2006) argues that quality management viewpoints are important from the SDI's point of view. He discusses geographic quality concepts using these four viewpoints (Figure 2.4):

• Production–centred viewpoint. This perspective focuses on the variations in the production process, where the most common measure is the number of defective or non–conforming products.

• Planning–centred viewpoint. This perspective focuses on the characteristics of products.

• Customer–centred viewpoint. This perspective focuses on the value of products and services to the customer.

• System–centred viewpoint. This perspective takes into account all stakeholders who are influenced by the organisation or its products.

In the case of INSPIRE, Article 17 of the INSPIRE Directive (EC, 2007a) says: “Each Member State shall adopt measures for the sharing of spatial data sets between its public authorities ... for the purposes of public tasks that may have an impact on the environment.” Therefore, an effort within INSPIRE is dedicated to the development of some methods for assessing, measuring, reporting and controlling spatial data quality. These aspects have been considered by the European Spatial Data Infrastructure with a Best Practice Network (ESDIN) project9 supported by eContent+ programme.

8 http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=32575
9 http://www.esdin.eu/

Figure 2.4: Different approaches to geographic information quality from the quality management viewpoint (source: (Jakobsson, 2006)).

Data quality is also considered by the FGDC CSDGM standard (which incorporates the Spatial Data Transfer Standard (SDTS) (FGDC, 1998b), which contains a section on spatial data quality elements) and by the NAP standard. There is a remarkable agreement among these documents on the elements of spatial data quality. Each of the standards that approaches this question describes the same core elements:

• Attribute (Thematic) Accuracy. CSDGM and SDTS use the term "Attribute Accuracy", and ISO 19115 refers to the same content as "Thematic Accuracy". It can be defined as "an assessment of the accuracy of the identification of entities and assignment of attribute values in the data set" (FGDC, 1998a).

• Completeness. It refers to “information about omissions, selection criteria, generalization, definitions used, and other rules used to derive the data set” (FGDC, 1998a).

• Lineage. It refers to the “information about the events, parameters, and source data which constructed the data set, and information about the responsible parties” (FGDC, 1998a).

• Logical Consistency. It refers to “an explanation of the fidelity of relationships in the data set and tests used.” (FGDC, 1998a).

• Positional Accuracy. It is "an assessment of the accuracy of the positions of spatial objects" (FGDC, 1998a).

• Temporal Accuracy. It is usually defined as the "accuracy of the temporal attributes and temporal relationships of features" (ISO/TC 211, 2002).

2.4.3 Geocoding quality

Geocoding quality has been investigated intensively in related application fields such as public health, and there is a lot of work on the evaluation of geocoding system accuracy. In general, the typical metrics used to determine fitness-for-use are as follows (Goldberg et al., 2010); a minimal computation sketch is given after the list:

• Match–rate. It can be compared to the precision and recall of IR systems, and it refers to the proportion of input addresses that a geocoding system was able to match.

• Match type. It is the level of geographic feature matched to, e.g. a parcel centroid, a postal code or a street address.

• Match certainty. This value describes the level of similarity and/or the likelihood of a match between the input address and the address associated with the matched feature, derived either probabilistically or deterministically.

• Spatial accuracy. It is usually defined as average values of distance and direction "from truth", obtained by comparing computed output locations with known locations for a subset of the data.
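The following sketch computes two of these metrics, the match rate and a simple distance–based spatial accuracy, over a hypothetical test set; the sample data, the planar distance and the class names are assumptions for illustration only.

import java.util.List;
import java.util.Optional;

// Sketch of the match-rate and spatial-accuracy metrics over a hypothetical test set.
public class GeocodingMetricsSketch {

    record Point(double x, double y) {}
    record TestCase(String address, Optional<Point> geocoded, Point truth) {}

    // Match rate: proportion of input addresses for which the geocoder produced a match.
    static double matchRate(List<TestCase> cases) {
        long matched = cases.stream().filter(c -> c.geocoded().isPresent()).count();
        return (double) matched / cases.size();
    }

    // Spatial accuracy: average distance "from truth" over the matched subset
    // (planar distance; a real evaluation would use geodetic distance).
    static double meanDistanceFromTruth(List<TestCase> cases) {
        return cases.stream()
                .filter(c -> c.geocoded().isPresent())
                .mapToDouble(c -> distance(c.geocoded().get(), c.truth()))
                .average().orElse(Double.NaN);
    }

    static double distance(Point a, Point b) {
        return Math.hypot(a.x() - b.x(), a.y() - b.y());
    }

    public static void main(String[] args) {
        List<TestCase> cases = List.of(
                new TestCase("Calle A 1", Optional.of(new Point(0, 0)), new Point(5, 0)),
                new TestCase("Calle B 2", Optional.empty(), new Point(10, 10)));
        System.out.println(matchRate(cases));             // 0.5
        System.out.println(meanDistanceFromTruth(cases)); // 5.0
    }
}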

However, these metrics do not properly represent the uncertainty of geocoding results, and therefore cannot be used adequately in spatial analysis (Zandbergen, 2009). A typical geocoding process is composed of (1) data cleaning, (2) feature matching and (3) feature interpolation. Figure 2.5 outlines a generalised workflow of this process. Each component of the geocoding workflow has an influence on geocoding quality:

• The quality of the input data which is the text describing a location.

Figure 2.5: The generalised abstraction of the geocoding process (source: (Goldberg, 2008)).

Figure 2.6: The alternative paths in geocode production (source: (Goldberg et al., 2010)).

• The input data parsing and normalisation algorithms which identify the pieces of the input text and transform them to standard values (Davis and Fonseca, 2007).

• The feature matching algorithms which identify candidate matching geographic features in the reference data sources.

• The feature interpolation algorithms.

Some works attempt to quantify an overall propagated uncertainty value of geocoding results (Davis and Fonseca, 2007; Zandbergen, 2009; Goldberg et al., 2010). An interesting approach to geocoding accuracy is presented in Goldberg et al. (2010). A geocoding process is seen as "a decision tree with multiple potential outcomes at each level (transformation) that guide the set of choices available to all subsequent levels" (see Figure 2.6). The authors provide a discussion of quantitative accuracy metrics to describe the quality of geocoded data from the perspective of spatial certainty, and propose reporting the results of the geocoding process as spatial probability distributions. This method assumes the availability of several quantitative factors to describe the spatial–temporal aspects of accuracy and uncertainty for each component. However, the authors admit that the quantities are not well defined, which is stated as future work.

2.5 Compound geocoder

The geolocating process demands spatial information of a wide range of types. Therefore, a compound approach seems to be a suitable architecture model to provide an enhanced geocoding functionality. The proposed compound geocoding architecture allows the building of hybrid solutions composed of different services (e.g. geocoding, gazetteer or cadastral services), enhanced with a geocoding layer if necessary. The task of geocoding service selection exploits geographic ontologies. The Administrative Unit Ontology (AUO) (López-Pellicer et al., 2008) is used for (i) provider selection and (ii) data integration. This approach may increase the flexibility and adaptability of applications. In the case of public services, it provides access to different services, national and local, in a transparent manner and ensures the use of updated data.

First, the characteristics of geospatial services considered in this work for service selection are presented. Then, a pattern–based method for service integration is briefly outlined. Finally, the proposed architecture is described.

Figure 2.7: The selected characteristics of geocoding service.

Geocoding service quality

The proposed architecture allows the service selection according to the use case requirements. Therefore, the proper description of each source is vital for the behaviour of the whole system. Figure 2.7 presents the features to consider during the evaluation of providers. The obtained values are stored in RDF files with corresponding semantic annotations. These values are essential for the service selection and the decision making process. The main features of each geocoding service are (1) the coverage, (2) the type of content and (3) the type of spatial object. The first two features are always given by the provider or might be indicated by the name of the service. The first one defines the area in which the offered data are situated. Usually, this area corresponds with a unit or a set of units from the political division of the territory, because the majority of georeferencing Web services are provided by public administrations at the local and central levels. The solution proposed is based on the use of the AUO as the ontology of the political organisation of the territory. Using instances of classes which are specialisations of the jurisdictional geographic object concept from the AUO as the value of the coverage allows reasoning on the semantic relations among them, rather than relying on their spatial objects. For example, the province of Zaragoza has a set of municipality members and is part of Spain. All municipalities which belong to the province of Zaragoza are also members of the Aragon Autonomous Community. If searching for services which provide data from the province of Zaragoza, the system will also consider those services whose coverage equals the coverage of Aragon, Spain or the world, or it might even return a set of services whose coverage combination corresponds with the required coverage, e.g. the local services of all municipalities of the province of Zaragoza. The type of content strictly depends on the types of geographic feature provided by the service. The last one, the type of spatial object, indicates the list of provided types of spatial object, such as point, polygon or 3D entity. For example, the Cadastre Service of Spain10 (Servicio de Catastro de España), as the name of the Web service indicates, has the coverage of Spain and offers centroids of parcels. Google Maps, according to the online documentation, has world coverage and its type of content is the street address geocoded via a point.

From the analysis of the spatial data, it is possible to obtain two additional indicators (of range 0 – 1): (5) the reliability and (6) the precision. The reliability indicates the capacity of the content to represent the elements of the physical world. A service that offers all elements of the real world has a reliability value equal to '1' (100%). The precision indicator informs about the average positional error of the whole dataset. It is important to note that this indicator may be influenced by the difference between the provided spatial object and the searched one. For example, when using cadastral data for address geocoding, there will be a decrease in spatial data precision. The above presented factors can be used to express the quality of a spatial dataset as a result of a comparison with a baseline dataset. Such a relative estimation requires the semantic accuracy of the compared objects. This aspect is related to (4) the result accuracy, a typical geocoding quality factor (i.e. the match type). It should not be misinterpreted as "data accuracy", the term commonly used in the literature to describe positional accuracy. The result accuracy is estimated for each source via an analysis of the source data model and the application domain model. It indicates the capacity of the source to fulfil the domain model, and is represented via the last domain model field which the source might score. Frequently, the values of the reliability and/or the precision of spatial content can vary as a function of the area of relevance (e.g. new suburbs might even be omitted). Such a feature, called the granularity, might be obtained from an exhaustive evaluation of the spatial content along with its semantic analysis (e.g. by distinguishing types of geographic features as: cities of population more than 500,000, cities of population more than 100,000, towns of population more than 10,000, villages, and hamlets).

10 http://ovc.catastro.meh.es/
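A minimal sketch of the coverage reasoning described above is given below: a service is applicable when its declared coverage equals the requested administrative unit or one of its ancestors in the part–of hierarchy. The hard–coded hierarchy and unit names merely stand in for the AUO individuals and are assumptions for illustration.

import java.util.Map;
import java.util.Optional;

// Sketch of coverage-based reasoning over administrative units. The part-of hierarchy
// below stands in for the AUO individuals; the unit names are only illustrative.
public class CoverageReasoningSketch {

    // child unit -> parent unit (jurisdictional "is part of" relation)
    static final Map<String, String> PART_OF = Map.of(
            "Municipality-of-Zaragoza", "Province-of-Zaragoza",
            "Province-of-Zaragoza", "Aragon",
            "Aragon", "Spain",
            "Spain", "World");

    // A service whose coverage equals the requested unit or any of its ancestors is applicable.
    static boolean covers(String serviceCoverage, String requestedUnit) {
        for (Optional<String> unit = Optional.of(requestedUnit);
             unit.isPresent();
             unit = Optional.ofNullable(PART_OF.get(unit.get()))) {
            if (unit.get().equals(serviceCoverage)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // A service with coverage "Spain" is acceptable for a query about the province of Zaragoza.
        System.out.println(covers("Spain", "Province-of-Zaragoza"));        // true
        System.out.println(covers("Municipality-of-Zaragoza", "Aragon"));   // false
    }
}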

Mediator layer

Services selected to be used as external resources for the information search may vary in terms of the technological solutions used by the providers. Adding a new, common functionality to a service, or over a set of existing services, requires the creation of a mediator service. Depending on the user needs and the characteristics of the service which should be adapted, one of the following design patterns might be used to develop a mediator service: adapter or façade. The adapter pattern allows the user to access the functionality of an object via known interfaces and/or by adopting a message channel (Gamma et al., 1994; Hohpe and Woolf, 2003). In the majority of cases, the mediator service will implement this pattern to encapsulate the invocation of the original service according to specific requirements. In addition, it may hide the logic of multiple requests or error handling. The façade pattern provides "a unified interface to a set of interfaces in a subsystem. Façade defines a higher–level interface that makes the subsystem easier to use." (Gamma et al., 1994). In terms of services, this approach addresses subsystems which combine several services and allows their logic to be concealed.

For generalisation purposes, an abstract method for service integration into an OGC mediator layer is proposed. The method applies the adapter or façade design patterns to create a mediator service. The implementation of a mediator service requires the identification of the source service (or set of services), i.e. the Adaptee Service, and of the functionality which should be provided by the target service. The target service is an OGC service which may offer the desired functionality. The method produces several metadata documents which will be used as guidance during the implementation of the mediator, i.e. the Capabilities document that conforms to the selected specification and some additional service metadata (e.g. the ToS of the Adaptee Service). The first step of the proposed method is the selection of a base service (or services) to be adapted (see Figure 2.8). A simple metadata template should be filled with information found in the available documentation of the chosen service. Although this might sound rather straightforward, in practice it is not so simple. Popular Web services (e.g. the Google service family) lack documentation in a standardised format, for example a Web Service Definition Language (WSDL) document. Commonly, they only provide human–readable documents, an API, or sample code that the service provider publishes for developers to use. Less frequently, a WSDL description is provided. If there is no WSDL file, this method requires the creation of one during the documentation analysis. The created Adaptee WSDL File and the filled service metadata template, which produces the Adaptee Service Metadata, are the results of the documentation analysis.

Figure 2.8: Workflow for the metadata generation of a mediator service.

The task of the target service selection might be performed independently from the documentation analysis. In the most general case, the characteristics of the Adaptee Service and the desired functionality will provide enough information. The selected specification allows the development of a set of template files, i.e. the OGC Capabilities Template file and the OGC Metadata Template file. In this work, the OGC WPS has been chosen as the base to offer a common geocoding interface. The next step is the creation of the mediator service metadata by merging the simple metadata file and the WSDL file of the Adaptee Service. The resulting metadata (i.e. the ServiceMetadata) will be used with an OGC Capabilities Template in order to create the Capabilities document of the target service. The merging process is semi–automatic, and the resulting files should be revised to check whether they fulfil the OGC and target functionality requirements. The Capabilities document generated in this workflow is used as a guide for the implementation of the mediator service.
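As a sketch of the adapter variant of this method, the code below wraps a hypothetical provider client behind a common geocoding interface. The client, its response format and the method names are assumptions for illustration and do not correspond to any concrete provider or to the actual mediator implementation.

// Sketch of the adapter pattern applied to a mediator service. The third-party client
// and its method names are hypothetical; a real mediator would wrap the provider's API
// behind the common geocoding interface exposed by the mediator layer.
public class MediatorAdapterSketch {

    record GeocodedFeature(String normalisedAddress, double lon, double lat) {}

    // Common geocoding interface offered by every mediator.
    interface GeocodingMediator {
        GeocodedFeature geocode(String locationText);
    }

    // Hypothetical client generated for an existing provider (the "Adaptee Service").
    static class ThirdPartyClient {
        String lookup(String query) {
            // Placeholder for the provider-specific request/response handling.
            return "CALLE EJEMPLO 1;-0.88;41.65";
        }
    }

    // Adapter: translates between the common interface and the adaptee's own protocol.
    static class ThirdPartyAdapter implements GeocodingMediator {
        private final ThirdPartyClient adaptee = new ThirdPartyClient();

        @Override
        public GeocodedFeature geocode(String locationText) {
            String[] fields = adaptee.lookup(locationText).split(";");
            return new GeocodedFeature(fields[0],
                    Double.parseDouble(fields[1]), Double.parseDouble(fields[2]));
        }
    }

    public static void main(String[] args) {
        GeocodingMediator mediator = new ThirdPartyAdapter();
        System.out.println(mediator.geocode("Calle Ejemplo 1, Zaragoza"));
    }
}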

Architecture

The compound geocoding architecture can use different sources of geographic information, such as gazetteer, street data, geocoding or cadastral services. Each source has to be enhanced with a geocoding capacity (here, a generic connector to the geocoding mediator is provided) and described in terms of the introduced geocoding service characteristics, whose values are provided as a semantic annotation of the service. The proper description of the sources is vital for the behaviour of the whole system, because the obtained values are used as clues for the source selection, which determines the adequate functioning of the system. The result accuracy feature is defined via a comparison of the application domain model of the searched information with the response data model of the service. This requires a well defined domain model, or a set of them. To simplify the data integration tasks, all data models, both domain and source data models, should be described with the help of one domain ontology. The main elements of the compound geocoding architecture (see Figure 2.9) are:

• Inputdata Processor. This component is responsible for performing the preprocessing of text from the input data. The steps in this phase of geocoding are common techniques among geocoders: cleaning, parsing and standardising.

• Core. This component implements the geocoding logic of the system. It is responsible for the whole process of the source selection and the result data generation.

• Resource Manager. This component is responsible for the management of mediators.

• Geocoding Layer. It is a set of mediators which provide geocoding functionality over the selected geospatial Web services.

Figure 2.9: Overview of the Compound Geocoding Service Architecture.

The Resource Manager contains information on the available mediators (the SC Repository), which have been registered previously (the Reg/Load Mng). The source selection task (the Service Selector) is based on rules defined in the Decision Maker. These rules apply several search criteria (the Search Criterion) to reason on the source annotations (the service characteristics, SCs) registered in the system. Search criteria are also defined in terms of the service characteristics, and they might be provided by (1) the application context (the search profile) and/or (2) the user requirements as part of the input data. An example of a search criterion is:

hasCoverage(Zaragoza-Province) and hasPrecision > 0.9 and hasReliability > 0.5

The strategy of service selection can relax the search constraints, e.g. by decreasing the required value of precision and/or reliability, if there is no service available which conforms to the requirements or if the response data are not satisfactory. A generic connector is responsible for the communication with the selected geocoding mediator. It is loaded and configured (the Reg/Load Mng) as the result of the selection process. Mediators ensure the abstraction from the communication protocols, invocation styles or interfaces used by the selected geospatial services. They are responsible for the data harmonisation, which consists of data model mapping and, if required, coordinate transformation. In practice, each mediator is a geocoding service, and may require the implementation of some additional techniques related to the matching and ranking used in a typical geocoding process in order to retrieve relevant results. This architecture allows different types of named places to be geocoded, and permits complementing data from one source with data from others in order to improve the reliability and the precision of the system. It also gives the user more freedom in deciding the search strategy. For example, users can decide at runtime which service should be accessible, or which search strategy should be applied (the best response of the entire system, the best answer for each source, the best answers from a chosen source, etc.). In addition, as the mediator layer exposes one common interface, adding new resources (i.e. mediators) does not demand any changes in the system implementation.
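A minimal sketch of this selection-and-relaxation step is given below. The service characteristics mirror the kind of annotations shown later in Listing 2.1, but the registered values, the string–based coverage test and the class names are illustrative assumptions; a full implementation would reason on the AUO hierarchy rather than on string equality.

import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Sketch of the source selection step: a search criterion over registered service
// characteristics, with a simple relaxation strategy when no service qualifies.
public class ServiceSelectionSketch {

    record ServiceCharacteristics(String id, String coverage, double precision, double reliability) {}

    static Optional<ServiceCharacteristics> select(List<ServiceCharacteristics> registered,
                                                   String coverage,
                                                   double minPrecision,
                                                   double minReliability) {
        return registered.stream()
                .filter(s -> s.coverage().equals(coverage))   // a real system would use AUO reasoning here
                .filter(s -> s.precision() >= minPrecision && s.reliability() >= minReliability)
                .max(Comparator.comparingDouble(ServiceCharacteristics::precision));
    }

    public static void main(String[] args) {
        List<ServiceCharacteristics> registered = List.of(
                new ServiceCharacteristics("catastro", "Spain", 0.8, 1.0),
                new ServiceCharacteristics("idezar", "Zaragoza-Province", 0.85, 0.98));

        // hasCoverage(Zaragoza-Province) and hasPrecision > 0.9 and hasReliability > 0.5
        Optional<ServiceCharacteristics> chosen =
                select(registered, "Zaragoza-Province", 0.9, 0.5);

        // Relaxation: lower the precision threshold if nothing satisfied the criterion.
        if (chosen.isEmpty()) {
            chosen = select(registered, "Zaragoza-Province", 0.7, 0.5);
        }
        System.out.println(chosen);
    }
}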

2.6 Address geocoding for urban management

In Spain, there are some proposals for geocoding services supported by public authorities at the state level, e.g. the Cadastre Service of Spain11 (i.e. a set of Web services) or the CartoCiudad services12. The first one is characterised by the best reliability among the existing geocoding services at the state level in Spain but, due to the fact that its content type is the parcel, the precision for address geocoding is decreased. CartoCiudad combines the spatial contents provided by diverse public institutions (i.e. the General Direction of Cadastre, the Postal Office, the National Institute of Statistics, and the General Direction of the National Institute of Geography) and by local authorities. The main disadvantage of the CartoCiudad services is the lack of an update procedure and the gaps in coverage. Additionally, both proposals share the problem of an inconvenient search, as it is necessary to indicate the search area (province and municipality), and both require the definition of a workflow for address geocoding. It is more common to find Web services offered by local authorities at their portals, e.g. the Street Data Web Service of the Zaragoza city council (IDEZar SG). These services tend to be characterised by a high data precision. However, their granularity may vary depending on the area (i.e. urban centre, village) and, usually, there are coverage gaps in areas such as motorways or new urban zones.

Address geocoding is the most important functionality offered by any urban GIS, which is able "to locate addresses, in any form employed by the population, in a quick and efficient way" (Davis et al., 2003). An example may be Zaragoza, the fifth largest metropolitan area of Spain. The urban management systems of the local administration of the Zaragoza municipality require address geocoding functionality in numerous applications and tasks from diverse areas, such as the management of local incidents (e.g. traffic cuts, water or electricity supply shortages), event management (e.g. a demonstration, match or concert), the street map of the local administration portal (council of Zaragoza) or the Web service that supports the point–of–interest guide for mobile devices. All those applications have to deal with the problems mentioned above. The proposed architecture aims to overcome them. The implementation of a generic compound geocoding service provides instances which might be used in diverse applications with different requirements. These instances adapt to each environment according to the constraints provided by the search profile. This approach significantly reduces the development costs and improves the reliability and precision of the response.

The first step of the development of the compound service for address geocoding in the Zaragoza municipal area is the identification of the address domain model. This task requires the extension of the Administration Unit Ontology of Spain, AUSpain (López-Pellicer et al., 2008), with entities from the Aragon Autonomous Community (e.g. the municipality of Zaragoza, a province of Aragon, the comarca of Aranda). Then, several geospatial Web services are selected to populate the mediator layer, and a set of geocoding mediators is created. Each mediator is evaluated in order to provide the geocoding service characteristics used for service selection. Finally, a geocoding logic is defined.

11 http://www.sedecatastro.gob.es
12 http://www.cartociudad.es/visor/

2.6.1 Data model

The AUSpain domain ontology is an application of the AUO; therefore, the introduced individuals, which represent the units of the territorial division, might be used as the coverage description of a service. Additionally, as addresses are bound to the political organisation of the territory, the concepts of the AUSpain ontology were applied for data integration via a mapping process. According to the general approach (Walker, 2008), the data model of the street address in Spain contains at least the province, locality, zip code and street address (street name, portal number and other elements such as the floor or the door letter). Most of the Web services that might be used as a street address source do not provide the zip code. Thus, the main address data model has been extended (autonomous community and comarca) and modified to disambiguate the search results (municipality and district).
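A minimal sketch of this extended domain model is shown below as a plain data type; the field names follow the description above, while the exact structure and types used in the implementation may differ.

// Sketch of the extended address domain model described above; field names follow the
// text, but the actual model of the implementation may differ.
public record DomainAddress(
        String autonomousCommunity,  // extension
        String province,
        String comarca,              // extension
        String municipality,         // added to disambiguate results
        String district,             // added to disambiguate results
        String locality,
        String zipCode,              // often missing in the sources
        String streetName,
        String portalNumber,
        String other                 // e.g. floor or door letter
) {
    // Defaults for a Zaragoza deployment can pre-initialise part of the model (illustrative values).
    public static DomainAddress zaragozaDefaults(String streetName, String portalNumber) {
        return new DomainAddress("Aragon", "Zaragoza", null, "Zaragoza", null,
                "Zaragoza", null, streetName, portalNumber, null);
    }
}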

As this implementation is dedicated to the municipality of Zaragoza, it is possible to initialise some fields of the domain model, and the default values are used to restrict the results from sources during the data integration process. Figure 2.10 presents an overview of the evolution of the domain data model.

Figure 2.10: Domain data model evolution.

Figure 2.11: Mapping domain data model to data models of Google and IDEZarSG services.

2.6.2 Geospatial resources

The compound service implemented uses the following services: a set of the Cadastre services of Spain13, a set of CartoCiudad services, the Google geocoding service14, and the IDEZar SG service. A set of geocoding services (i.e. mediators) has been created by applying the method to populate the façade layer. For example, in the case of the Google geocoding service (a typical developer–oriented Web service), the online documentation15 is descriptive and provides an API description and examples of use. The ToS16 also has to be analysed. The WSDL document and the service metadata are the results of the documentation analysis. A simple tool has been developed that automates the creation of the service metadata file and the Capabilities document of the future mediator. This tool performs a simple mapping from the corresponding fields of the WSDL and the other metadata files used in the method. However, it should be pointed out that the resulting files need human revision to add missing elements. Each mediator service exposes a common geocoding interface and uses the domain data model to return geocoded features. Figure 2.11 and Figure 2.12 present the mappings between the domain model and the source models. The data models of the CartoCiudad and the Cadastre services are simplified and show only the address part obtained from requesting the various services as defined in the workflows.

The service features considered by this implementation are the coverage, precision, reliability and result accuracy. The rest of the features described in this Chapter are omitted from the selection strategy due to the fact that all the used sources provide points as spatial objects and the searched information is of one type (street address). The determination of the values of the basic service features, such as the coverage and the type of spatial object, requires the analysis of the service's online documentation. The reliability and the precision metrics are obtained from a set of evaluation tests performed against reference datasets. The choice of the reference datasets is of great importance. The Cadastre Service of Spain, being the official land and property registry, has been taken as the reference dataset for the reliability tests. The IDEZar SG service is characterised by high quality and its coverage corresponds with the target application coverage; therefore, this service has been chosen as the precision reference dataset. The values of reliability and precision are obtained using statistical methods: the reliability is calculated on the basis of the average hit error, and the precision on the basis of the mean square error. The result accuracy is obtained from the data model mapping. Table 2.1 shows the values used in this implementation. Listing 2.1 shows an example of a service characteristics file which describes a created mediator service that uses a set of the Cadastre Services of Spain. The evaluation process of this service is designed for a system dedicated to address geocoding.

Figure 2.12: Mapping domain data model to data models of CartoCiudad and Cadastre services.

13 The services' endpoints were http://ovc.catastro.meh.es and http://ovc2.catastro.meh.es during the period of the system development. Currently, new versions of these services are available at http://www.sedecatastro.gob.es and http://www1.sedecatastro.gob.es, respectively.
14 The Geocoding API V2 (http://maps.googleapis.com/maps/geo?), used in the developed system, is obsolete and replaced by version 3 (http://maps.googleapis.com/maps/api/geocode/output?).
15 http://code.google.com/intl/es-ES/apis/maps/documentation/services.html
16 https://developers.google.com/maps/terms

Listing 2.1: Example of RDF description of a mediator service which uses a set of the Cadastre Services of Spain.

@prefix st: .
@prefix sc: .
@prefix gsa: .
@prefix scat: .
@prefix gsc: .
@prefix ag: .
@prefix rdf: .

scat:catastrospain2008 rdf:type gsc:AddressGeocodingService;
    rdfs:label "Servicio de Catastro"@es;
    sc:reliability "1";
    sc:precision "0.8";
    sc:resultAccuracy gsa:portal;
    sc:spatialObject st:point;
    sc:coverage ag:ESP_STATE.

Web service                    Coverage      Reliability          Precision             Result accuracy
IDEZar SG Service of Zaragoza  municipality  0.98                 1 (baseline dataset)  Portal
GoogleMaps Service             World         0.96                 0.99                  Portal
Cadastral Service of Spain     Spain         1 (baseline source)  0.85                  Portal
CartoCiudad services           Spain         0.90                 0.98                  Portal

Table 2.1: Geocoding parameters of used Web services.
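The reliability and precision values reported in Table 2.1 come from statistical tests against the baseline datasets. The sketch below shows one way such indicators might be derived, computing reliability from the hit rate and a 0–1 precision indicator from the root of the mean square positional error; the normalisation tolerance and the sample data are assumptions and do not reproduce the actual evaluation procedure.

import java.util.List;

// Sketch of deriving reliability (hit rate) and a normalised precision indicator (RMSE-based).
public class ServiceEvaluationSketch {

    record Sample(boolean hit, double errorMetres) {}

    // Reliability: share of reference addresses that the evaluated service resolves.
    static double reliability(List<Sample> samples) {
        return samples.stream().filter(Sample::hit).count() / (double) samples.size();
    }

    // Precision indicator derived from the mean square error, normalised to the 0-1 range
    // with an assumed tolerance (e.g. 50 metres).
    static double precision(List<Sample> samples, double toleranceMetres) {
        double mse = samples.stream()
                .filter(Sample::hit)
                .mapToDouble(s -> s.errorMetres() * s.errorMetres())
                .average().orElse(Double.MAX_VALUE);
        return Math.max(0.0, 1.0 - Math.sqrt(mse) / toleranceMetres);
    }

    public static void main(String[] args) {
        List<Sample> samples = List.of(
                new Sample(true, 4.0), new Sample(true, 10.0), new Sample(false, 0.0));
        System.out.println(reliability(samples));      // 0.666...
        System.out.println(precision(samples, 50.0));  // ~0.85
    }
}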

2.6.3 Geocoding framework

The developed service has been applied as a generic component in the urban management systems of the Zaragoza city council. Figure 2.13 presents an overview of the domain data model applied in this framework. The instances of this service are dedicated to diverse applications whose requirements are stated by the profile descriptors. Figure 2.14 shows an overall view of the compound architecture of the geocoding service and some examples of the types of client applications that use the service. The unified access to georeferencing Web services (e.g. the Cadastre Service) is offered by connectors that implement the AdvancedGeocoder interface and hide the request workflow in the case of the Cadastre and the CartoCiudad services. This interface is also implemented by the Geocode Wrapper, which accesses the local INE database containing information about census area units in Spain. These data are stored locally because they are not provided through any Web service and must be downloaded as CSV files from the National Statistics Institute Web page17 (Instituto Nacional de Estadística, INE). As the INE information is official and complete, it is used as a reference point in case of result data ambiguity. The cache component improves the performance of the system, and the logger (Log) permits tracking of its behaviour and offers feedback information on the user request profile, which is a valuable hint for the cache strategy. The simplified BasicGeocoder interface is offered via the HTTP protocol and it is the request point for the majority of the client applications.

Figure 2.13: Application of the domain data model.

Figure 2.14: Layer view of the compound geocoding service and client components.

17 http://www.ine.es/
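The sketch below illustrates the relation between the two interfaces mentioned above: a simplified BasicGeocoder, backed by an AdvancedGeocoder connector and a small cache. The method signatures, the search profile keys and the sample connector are illustrative assumptions and not the actual interfaces of the framework.

import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: a simplified BasicGeocoder facade backed by an AdvancedGeocoder connector and a cache.
public class GeocoderFacadeSketch {

    record Result(String normalisedAddress, double lon, double lat) {}

    interface AdvancedGeocoder {
        List<Result> geocode(String address, Map<String, String> searchProfile);
    }

    interface BasicGeocoder {
        Result geocodeBest(String address);
    }

    static class CachingBasicGeocoder implements BasicGeocoder {
        private final AdvancedGeocoder delegate;
        private final Map<String, Result> cache = new ConcurrentHashMap<>();

        CachingBasicGeocoder(AdvancedGeocoder delegate) {
            this.delegate = delegate;
        }

        @Override
        public Result geocodeBest(String address) {
            // The cache improves performance for repeated requests; a logger could record
            // the user request profile here to tune the cache strategy.
            return cache.computeIfAbsent(address, a -> {
                List<Result> candidates = delegate.geocode(a, Map.of("coverage", "Zaragoza"));
                return candidates.isEmpty() ? null : candidates.get(0);
            });
        }
    }

    public static void main(String[] args) {
        BasicGeocoder geocoder = new CachingBasicGeocoder(
                (address, profile) -> List.of(new Result(address.toUpperCase(), -0.88, 41.65)));
        System.out.println(geocoder.geocodeBest("Calle Ejemplo 1"));
    }
}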

Figure 2.15: Introduction and publication of urban incidents.

2.6.4 Application

An example of these applications is the system used by the city council to support the management of local incidents in the Zaragoza locality. This system is used by the council workers to introduce the local incident information, which is then provided to citizens via an official online portal. Figure 2.15 shows the general overview of the application architecture along with screenshots of its Graphical User Interface (GUI) components. The form of the Input Component allows the user to introduce a new incident, and the results of the address geocoding are presented as a list ordered via a recommendation based on street name and type matching. The selected, normalised street address, its spatial coordinates (a point) and the information about the incident (time interval, incident type, etc.) are saved in the local incident database, which is then used by the GeoRSS service (Reed, 2006) that publishes new information to subscription lists. This information might also be visualised on the local incident map of the official Web page of the Zaragoza city council.

2.7 Summary

This Chapter has proposed an approach to the enhanced search for a geospatial entity from the perspective of traditional geocoding. The proposed compound geocoding ensures the improvement of geocoding results (e.g. reliability, spatial data quality) thanks to the use of different geographic information suppliers. The use of multiple sources involves the development of an ontology for describing the geocoding sources. Moreover, the use of ontologies yields an advanced architecture in terms of extensibility, flexibility and adaptability. The framework for geocoding service selection permits the development of a methodology to geocode diverse categories of data types (e.g. spatial features, points of interest), which is an essential functionality of a geolocating service. In this context, there is a strong demand for a generic search model, which involves the formalisation of the search model. Additionally, the knowledge integration system adds new issues to the gazetteer concept. Usually, the spatial data of gazetteers are obtained from geocoding processes and come from diverse data sources. Nevertheless, the data models of gazetteers do not offer information about the spatial data accuracy and its origins. The principal disadvantage of this approach is the need to implement a mediator for each source. Although the implementation permits adaptation to changes (e.g. via pluggable connectors), the effort of mediator creation is significant.

Chapter 3

Content–based generation of the semantic characterisation of geospatial resources

3.1 Introduction

This Chapter presents methods for the semantic characterisation of geospatial resources that go beyond the specifications from the geospatial community. It shows applications where the search result can be improved by more precise resource descriptions. The descriptions are generated automatically by analysing the content of the geospatial resource. As a variety of possible approaches might be considered within the geospatial community, two examples of such applications are investigated. The first (Section 3.2) focuses on providing more precise spatial objects when interacting with end users. Here, semantic technologies are applied to improve the precision of the spatial search within a services catalog. The second application (Section 3.3) is dedicated to imagery layer identification in WMS services. The method developed has been implemented as a WPS service. The prototypes developed are applied in applications where usability is affected (Section 3.4). Finally, some conclusions are presented in Section 3.5.

3.2 Geoidentifier

One of the main advantages of applying linking–based approaches for referencing geographic features is the maintenance of the abstraction from their spatial representations. Usually, geographic features are characterised by the blurriness of their footprints. Topological elements of the physical world usually lack well defined conceptual boundaries and, consequently, this influences their spatial definition (e.g. rivers, chains of mountains). What is more, the computational representation of a geographic feature footprint is limited due to resolution limits. Any geographic feature may be represented via different spatial objects, which depend on the system application, the data model, and the applied technological solution. For example, a point is the best choice as a location reference, while a polygon is better for landscape visualisation. Therefore, the spatial object should be interpreted as one of the possible representations of an entity. Figure 3.1 presents the global view of the spatial reference definition, and Table 3.1 provides an overview of the process of conceptualisation.

Figure 3.1: Modelling the spatial representation of a geo–concept.

Spatial Reference       A geo–concept from a top–level ontology represents a general understanding of a concept. The definition of its spatial realisation is quite difficult and comes as a fuzzy definition (e.g. a spatial interpretation of a river).
Spatial Interpretation  Definition of the spatial realisation of a geo–concept linked to a concrete domain.
Footprint               The definition of a perfect footprint of a geo–concept entity (linked to a concrete domain). Its computational representation might not exist.
Spatial Object          A computational representation of the Footprint. It is not a unique identifier of a geo–concept entity. It permits its visualisation and reasoning on spatial relations.

Table 3.1: Modelling the spatial object.

The objective of this work is to present how an SDI might take advantage of best practices from the Semantic Web. An administrative geography (AG), i.e. an ontology of the political organisation of the territory, published as Linked Data will be introduced as a relevant element of an SDI, since it permits (1) reasoning on logic relations among administrative units, and (2) accessing different representations (including footprints) of a geographic feature. Such an ontology can improve the functionality of a services catalogue deployed in an SDI by applying geographic reasoning. One of the principal characteristics of an OWS service is its geographic extent, provided by the publisher as part of its descriptive metadata (i.e. the getCapabilities response). According to the INSPIRE Metadata Implementing Rules (INSPIRE DTM and EC/JRC, 2010), the geographic location is defined as a minimum bounding box (MBBOX). The descriptive metadata are used by the discovery service to answer the user requests, and a spatial restriction is one of the requestable parameters. Since services from an SDI are provided by public administrations, their geographic extent frequently corresponds with the administrative area of the provider. The usage of the MBBOX introduces false positives in the catalog response because administrative areas are not rectangular. A catalog provided with knowledge about the hierarchy of administrative units and accurate spatial objects of each administrative area might increase considerably the precision and recall of the requests, which is especially important for applications based on on–the–fly data integration.

The rest of the Chapter is organised as follows. First, the state of the art in linking geographic features is reviewed, both in the Semantic Web community (starting from Linked Data approaches to deal with geographic information, and ending with examples of semantic geographic platforms existing on the current Web) and in the geospatial community. Next, the infrastructure for the improvement of the services catalog is presented.
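The false–positive problem of MBBOX–based restrictions can be seen in the small sketch below, which compares a bounding–box test with a polygon test for the same query point; the triangular "administrative area" is purely illustrative.

import java.awt.geom.Path2D;
import java.awt.geom.Rectangle2D;

// Sketch: an MBBOX-based spatial restriction produces false positives for
// non-rectangular administrative areas, while an accurate footprint does not.
public class SpatialFilterSketch {

    public static void main(String[] args) {
        // A triangular administrative area and its minimum bounding box.
        Path2D.Double area = new Path2D.Double();
        area.moveTo(0, 0);
        area.lineTo(10, 0);
        area.lineTo(0, 10);
        area.closePath();
        Rectangle2D mbbox = area.getBounds2D();

        // A query point inside the MBBOX but outside the administrative area.
        double x = 9, y = 9;
        System.out.println("MBBOX match:   " + mbbox.contains(x, y)); // true  (false positive)
        System.out.println("Polygon match: " + area.contains(x, y));  // false (correct answer)
    }
}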

3.2.1 Linking geographic features

Geographic features are managed differently in the Semantic Web and in the Geospatial Web. The first community treats geographic features as additional contextual information, which might link to some other concepts. For the Geospatial Web, the concept of the geographic feature is the core element of a geographic platform. However, the Geospatial Web has focused on interoperability issues, maintaining the boundaries among the concepts from different geographic sources.

The Semantic Web

As the contents of SDIs are concealed from common Web users, the Web community has created its own solutions by applying successful linking approaches, such as geographic Web platforms (e.g. the semantic geocoding service of GeoNames (Wick, 2012)) and platforms for publishing geo–data (e.g. LinkedGeoData (Auer et al., 2009b)). Also, Linked Data has been applied successfully in spatial solutions (e.g. DBpedia Mobile (Becker and Bizer, 2008, 2009)). Both approaches, standardised OGC services and Linked Data, might complement each other. For example, a created ontology of geographic features might link instances of the same geographic feature across different sources. This framework would provide an integrated view of geographic features, rich in logic and spatial information. The richness of the geographic feature description, whether direct or provided by linked sources, might be helpful in dealing with conflation for database integration, a well–known problem in the database field (Dolbear and Hart, 2007). Additionally, such a unified ontology might be used for defining and publishing complementary logic relations among geographic features (e.g. a representation of a different territorial organisation) instead of creating new instances of OGC services. As for the Semantic Web community, the publication of geographic ontologies by official providers (i.e. a public administration organ) according to Linked Data principles might be a valuable source of references, for example the Administrative Geography of Great Britain (Goodwin et al., 2008).

Nowadays, it is possible to generate Linked Data from a variety of data sources, supporting a wide variety of data representation and serialisation formats. There are two main approaches for generating RDF Linked Data from existing Web content and deploying them on the Web. The first one maps the data gathered in relational databases to one or more ontologies/schemas, producing RDF Linked Data, and the other approach uses RDFizers, i.e. tools capable of extracting data from one or more sources, which are then mapped. D2R Server (Bizer and Seaborne, 2004) is one of the first tools for publishing the content of a database. OpenLink Virtuoso (Erling and Mikhailov, 2009) is a more complete technological solution that supports both approaches. The most popular ways of accessing Linked Data are via a SPARQL endpoint or a REST Web service (Alarcón and Wilde, 2010). Linked Data clients vary from simple RDF browsers to graph visualisation tools. Also, there are techniques for requesting RDF data from different endpoints that provide a transparent on–the–fly view to the end user (Langegger et al., 2008).

The most important initiative related to creating and publishing interlinked contents on the Web is the W3C Linking Open Data (LOD) project (W3C SWEO, 2012; Heath and Bizer, 2011), founded in January 2007. In May 2007, the number of LD datasets in the LOD cloud diagram was 12, and by September 2011 it had risen to 295 (Cyganiak and Jentzsch, 2011). Since online contents involve geographic information, geographic features have become part of Linked Data datasets, such as DBpedia, where geographic information has been extracted from Wikipedia (Auer et al., 2008; Becker and Bizer, 2009). In May 2007 there were only six geographic datasets in the LOD cloud diagram, in July 2009 eight, and by September 2011 the number of geographic datasets had risen to 31, providing more than 6 billion RDF triples (Cyganiak and Jentzsch, 2011). For example, the LinkedGeoData (Auer et al., 2009b; Stadler et al., 2011) and GeoNames (Wick, 2012) datasets aim at adding geo–semantic meaning to the Web. LinkedGeoData offers a Linked Geo Data Knowledge Base with RDF descriptions of more than 350 million spatial features from the OpenStreetMap database. GeoNames provides a set of REST Web services to access geographic features (about 8 million unique features). From the point of view of the LOD community, a core of interlinked data about geographical locations is in DBpedia, since both geographic datasets are linked to it (Heath and Bizer, 2011). However, this is due to historical reasons (i. e. DBpedia is one of the first LD datasets) and the applied linking paradigm (i. e. a unidirectional view).

An example of applying Linked Data principles in a location–based solution is DBpedia Mobile (Becker and Bizer, 2009), a location–aware Semantic Web client for mobile devices. The current user location is used to extract the corresponding data from the underlying DBpedia dataset, which is interlinked with various other location–related datasets.

An interesting proposal is Triplify (Auer et al., 2009a), which supports circle–based spatial requests. This system directly uses the database view model as a base for creating the RDF documents and URLs of published datasets, which facilitates development. The underlying database is responsible for processing spatial and semantic queries, which are encoded explicitly in the requested URL. The spatial query permits users to retrieve the geographic features located in a circular region defined via a point and a radius added to the requested URL. This system enables a limited form of spatial query, although Semantic Web techniques cannot take advantage of this facility. The idea of linking geographic features to create the Web of Data has influenced the development of geographic Web platforms, such as Yahoo! GeoPlanet (YDN, 2012) or GeoNames (Wick, 2012). Both of them belong to a new branch of geocoders, the semantic geocoders, which return URIs to identify uniquely the named places instead of a standardised textual description and a location reference (e.g. a point). Although they use the idea of linking, they do not follow the pure Linked Data approach. GeoPlanet uses URIs to identify the named place, which permits users to retrieve its semantic description; however, the platform uses a simple XML file instead of RDF. Additionally, the important spatial relations (e.g. child, neighbour, siblings) are encapsulated into the URI definitions. GeoNames is almost Linked Data based. It also uses unique identifiers of concepts to identify the named places. The RDF description of features contains spatial relations defined in the published OWL reference ontology1. However, GeoNames distinguishes the concept from the descriptive document. The feature (i.e. the concept) is identified via a URI, but the GeoNames server uses a 303 redirection to display its location on a map. The RDF description is available by adding /about.rdf at the end of the feature URI2. The spatial requests supported by the presented solutions of the current Semantic Geo Web are based on a set of predefined spatio–logic relations (e.g. near–by, belongs–to, child, siblings). Since it is impossible to express all spatial relations among geographic features via the definition of logic relations, the Semantic Web needs to use a spatial representation of features. Currently, the spatial objects used in the Semantic Web are usually limited to points or, at most, MBBOXes. Complex spatial requests that mix spatial objects and spatial relations defined in a rich ontology still remain an open issue.
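The GeoNames access pattern described above (a feature URI resolved with a 303 redirection, and an RDF document reachable by appending about.rdf) can be sketched as follows; this is an illustrative snippet only, reusing the Embrun feature cited in footnote 2, and is not part of the developed system.

import urllib.request

def fetch_geonames_rdf(feature_uri):
    # GeoNames separates the feature (concept) URI from its descriptive
    # document: the RDF/XML description is reached by appending 'about.rdf'
    # to the feature URI (which ends with a slash).
    with urllib.request.urlopen(feature_uri + "about.rdf") as response:
        return response.read().decode("utf-8")

# Feature URI of the town of Embrun, as given in footnote 2.
print(fetch_geonames_rdf("http://sws.geonames.org/3020251/")[:300])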

The Geospatial Web

The requirement of unique identifiers for geographic features, geoidentifiers, has been present in the geospatial community from its beginning. Any Geospatial Web framework which publishes information about geographic features uses unique identifiers (at least within this framework), and it might be seen as a source of geoidentifiers. Therefore, any gazetteer (Hill, 2006) or OGC Web Feature Service from an SDI might be such a source of geoidentifiers. Currently, there are several

1http://www.geonames.org/ontology/ontology_v3.01.rdf 2For example, the town ’Embrun’ has these two associated URIs: http://sws.geonames.org/3020251/ and http://sws.geonames.org/3020251/about.rdf

instances of WFS services in the SDI of Spain, which frequently contain instances of the same geographic feature. The services model the geographic feature in different manners and use different identifiers that are usually derived from keys of relational databases; therefore, there are problems of entity identification among different contents, and as a consequence, the common problems of data integration. One of the earlier proposals from the OGC community to apply common geographic identifiers for linking purposes is a framework based on the Geolinked Data Access Service (Schut, 2004a) and the Geolinking Service (GLS) (Schut, 2004b). This approach is dedicated to publishing geographically linked information (e.g. statistical data) separately from its spatial representation (spatial objects). Since this proposal is based on merging datasets from different sources by using a linkage field defined in the sources, it fixes geographic data to only one source of spatial representations, and the geoidentifiers are used only as syntactic links. In practice, it can be seen as a technological facilitation for data publishers. The proposal evolved into the OGC Geographic Linkage Service (Schut, 2009) (the result of the Geolinking Interoperability Experiment3), and finally into the OGC Georeferenced Table Joining Service (TJS) standard (Schut, 2010). A proposal for providing an integrated view across distributed services is the EuroGeoNames project, implemented as an INSPIRE compliant service (Jakobsson and Zaccheddu, 2009). Apart from defining the data model to be followed by all community members, it provides some rules for identifier definition. The named place identifier has to be composed of (1) its name, (2) the two–letter ISO 3166 code (ISO, 2007a) and (3) a code generated according to the Base36 encoding system (Oscarsson, 2001) (e.g. “2YC67000B”). However, these identifiers still remain unique only within this distributed gazetteer. Interlinking the corresponding instances of the same geo–concept entity across different providers might be interesting for the Geospatial Web. It might be a way of adding logic relations among geo–concept entities and avoiding the necessity of providing a new separate platform. Additionally, it can be useful for the improvement of search in an SDI.

3.2.2 Spatial search

Using linking principles for search in the geospatial domain has some advantages. One of them is the accessibility of multiple representations or models of a geo–concept entity, including access to alternative spatial objects. Traditionally, when searching with explicit spatial restrictions (i.e. using spatial objects), MBBOXES are used. Some applications support circle–based spatial queries (Auer et al., 2009b; YDN, 2012); nevertheless, the circle defined might be seen as an approximation of a MBBOX. During the retrieval process, spatial indexes created on georeferenced elements are exploited. The most popular indexes used in spatial databases apply boxes for performance reasons (Manolopoulos et al.,

3http://www.opengeospatial.org/projects/initiatives/geolinkie

Figure 3.2: Overlapping MBBOXES of administrative areas (a case from Spain).

2006). Indexing and searching geospatial resources by means of MBBOXES usually introduces false positives. This happens because the footprints of geographic features are usually not rectangular. For example, the administrative boundaries of Spain are far from being rectangular and their MBBOXES overlap significantly. Figure 3.2 shows the issue of overlapping MBBOXES of administrative areas. The shadowed rectangle represents the MBBOX from a spatial search request. In this scenario the search application would return resources whose geographic extent corresponds with $BBOX1, $BBOX2 or $BBOX3 when, in reality, it should provide only those resources whose MBBOX corresponds with $BBOX3. The precision of spatial search might be improved by operating on more precise spatial objects. A more promising approach could be the application of identifiers from an administrative geography, which not only provides footprints but also permits reasoning on logic relations among concepts.
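The following minimal sketch, assuming the Shapely library and an invented L–shaped footprint, illustrates how an MBBOX test produces a false positive that a test against the actual footprint avoids; it is not taken from the implementation described here.

from shapely.geometry import Polygon, box

# Illustrative, simplified footprint of an L-shaped administrative area and
# a rectangular search region (coordinates are arbitrary).
footprint = Polygon([(0, 0), (4, 0), (4, 1), (1, 1), (1, 4), (0, 4)])
search_bbox = box(2, 2, 3, 3)

# Coarse test on the minimum bounding box of the footprint: a false positive,
# because the MBBOX covers area that the feature itself does not.
mbbox = box(*footprint.bounds)
print(mbbox.intersects(search_bbox))      # True  (candidate selected)

# Precise test on the footprint itself: the candidate is correctly discarded.
print(footprint.intersects(search_bbox))  # False (no actual overlap)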

3.2.3 Administrative geography of Spain

The existing domain ontologies for modelling the political organisation of a territory are usually based on the part–of relation (parental relation). Such a model is not flexible enough to capture the complexity of the territorial organisation of countries, which apart from the main division units (e.g. municipality, province and autonomous community in Spain) has to involve units with a different status (e.g. the autonomous cities of Ceuta and Melilla, or associations of administrative units in Spain). Therefore, a dedicated administrative ontology is usually required

Figure 3.3: Geoidentifiers and modelling the spatial representation of a geo–concept.

to model the political organisation of the territory of a country. There is a variety of OGC WFS services which publish information about administrative unit entities within the Spanish SDI. These services are provided by central and local authorities. Some of them offer administrative boundaries with different resolutions (e.g. the Infraestructura de datos Espaciales de España WFS service, IDEE–WFS4) separately for each level of administrative division, and others, such as the gazetteers focused on gathering named places (e.g. IDEE–WFS–Nomenclator–NGC5), contain administrative units among the published features. However, none of these models permits the expression of the full administrative model that exists in Spain. One of the contributions of this work is an early design of the administrative geography of Spain using Linked Data. The Administrative Unit Ontology has been proposed as the domain ontology (López-Pellicer et al., 2008). Apart from the part–of and has–part relations, this domain ontology defines the is–member–of and has–member relations to distinguish the associations of administrative units whose spatial representation might overlap the boundaries of their direct parental units. An example of such an association might be the comarca of the Aragon Autonomous Community, which groups municipalities. Each municipality might lie within the boundaries of only one province; however, one comarca might aggregate municipalities from different provinces. The D2R Server has been used to publish the administrative geography of Spain, and to generate a data dump. The national gazetteer, IDEE–WFS–Nomenclator–NGC, has been used as the reference source to extract the entities of the administrative geography. Although this model does consider logical relations among features (e.g. parent–child), the location of features in the administrative structure is defined indirectly by their names, offered via the LocationEntity element, whose

4http://www.idee.es/IDEE-WFS/ogcwebservice? 5http://www.idee.es/IDEE-WFS-Nomenclator-NGC/services?

Listing 3.1: Example of RDF description which represents the Zaragoza municipality.

# Project namespace URIs elided.
@prefix au:    <…> .
@prefix agont: <…> .
@prefix ag:    <…> .
@prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos:  <http://www.w3.org/2004/02/skos/core#> .

ag:ZaragozaMun rdf:type au:Municipality ;
    rdfs:label "Zaragoza"@es , "Saragossa"@en ;
    au:part-of ag:Spain , ag:AragonComunidad ;
    au:member-of ag:ZaragozaComarca ;
    skos:relatedMatch <…> ;
    (...)
    agont:bond-100 <…> .

structure contains concepts from the territorial organisation of Spain (e.g. autonomous community, province, municipality or island). Since the published data are not complete (e.g. there is no assignment of comarca names to municipalities), the INE online catalog (Instituto Nacional de Estadística, the National Statistics Institute of Spain) has been used to complement the data. Then, the resulting data have been linked to their corresponding instances from different WFS services via the skos:relatedMatch relation. This approach has produced the administrative geography of Spain published as Linked Data, and one of its advantages might be the maintenance of the references to different instances across the SDI of Spain. Figure 3.3 presents the idea of the administrative geography of Spain as a source of geoidentifiers, and Listing 3.1 shows the model used, taking the Zaragoza municipality as an example.
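As a hedged illustration of how the published model could be queried, the following Python sketch uses rdflib over a fragment modelled after Listing 3.1; the namespace URIs are hypothetical placeholders, since the actual namespaces are not reproduced here.

import rdflib

# Hypothetical namespace URIs standing in for the elided ones in Listing 3.1.
TTL = """
@prefix au:   <http://example.org/administrative-unit-ontology#> .
@prefix ag:   <http://example.org/administrative-geography/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

ag:ZaragozaMun a au:Municipality ;
    rdfs:label "Zaragoza"@es , "Saragossa"@en ;
    au:part-of ag:Spain , ag:AragonComunidad ;
    au:member-of ag:ZaragozaComarca .
"""

graph = rdflib.Graph()
graph.parse(data=TTL, format="turtle")

# Which units is the Zaragoza municipality part of, and which association
# (e.g. a comarca) does it belong to as a member?
QUERY = """
PREFIX au: <http://example.org/administrative-unit-ontology#>
PREFIX ag: <http://example.org/administrative-geography/>
SELECT ?relation ?unit WHERE {
    ag:ZaragozaMun ?relation ?unit .
    FILTER (?relation IN (au:part-of, au:member-of))
}
"""
for relation, unit in graph.query(QUERY):
    print(relation, "->", unit)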

An example of the application of the administrative geography of Spain published as Linked Data for the improvement of search in an SDI is an enhanced services catalog.

3.2.4 OGC services catalog application

The services catalog is one of the elements of an SDI. Since it is responsible for service discovery, its functionality determines the reusability of the services offered in the SDI and might be improved by applying best practices from the Semantic Web. This work proposes a framework where the administrative geography published as Linked Data is one of the core elements. A services catalog dedicated to supporting visualisation applications based on on–the–fly data integration is presented as a use case. The principal advantage of this approach is the improved functionality of the end application.

Coverage issue

In INSPIRE, the discovery service allows users to search for geospatial resources, i.e. datasets and services. The search requests can contain a restriction on the geographic extent of the searched resources. The spatial restriction is provided in the form of a MBBOX. Frequently, the geographic extent of published data and services corresponds with the coverage of an individual from a geographic ontology (e.g. Europe from a geographic region ontology, the European Union from a political organisation ontology). Therefore, using a MBBOX to describe available resources usually introduces false positives in the collection of results. In the SDI of Spain the coverage of a published service frequently corresponds with the administrative area of its provider (e. g., the council of Zaragoza). This characteristic can be exploited to extend the service description which is maintained by a catalog and to add a geoidentifier of the corresponding administrative area, if identified. In this way, a precise spatial object can be used instead of the MBBOX.

Architecture and implementation

The usage of geographic feature identifiers requires an annotation of the registered resources in a catalog with the corresponding geoidentifiers. The creation of metadata for registered services is one of the characteristics of the services catalog deployed in the SDI of Spain (Nogueras-Iso et al., 2009). Its architecture has been extended with the Knowledge Content (KC) component, which is responsible for the service search process (see Nogueras-Iso et al. (2009) for the description of the services catalog architecture). Figure 3.4 presents the main elements of the KC component. The KC uses two RDF dataset sources: the Administrative Geography (AG), i.e. the administrative geography of Spain, and the Service Description Register (SDR), which contains the RDF serialisation of the registered service descriptions. The reference ontologies, i.e. the Administrative Geography Ontology (agont) and the Service Description Ontology (svont), are applied during the reasoning process. The concepts from the administrative geography (1) are linked to the entities from the reference WFS service, the source of boundary spatial definitions, and (2) the URIs of the administrative units are used in the service description as an indication of the geographic extent (dc:coverage property). Figure 3.5 represents an example of the link between the administrative geography and the service description.

Figure 3.4: Administrative geography as support for services catalog.

Figure 3.5: Relations between the service description and the administrative geography.

During the registration of a new service, the catalog uses the getCapabilities response to create a proper description of the service. One of its elements is the service geographic extent, which is expressed via a MBBOX. The metadata model of the description has been extended with the geoidentifier metadata in order to contain a URI from the administrative geography of Spain. This element is not an ISO 19119 element; therefore, it is neither visible to users nor published by the OGC CSW. The geoidentifier value is obtained from the analysis of the service MBBOX offered by the provider. This MBBOX is used to request the KC to identify the prime administrative unit (i.e. the most extensive administrative unit that fits within the MBBOX). The validation of the result consists in checking if the service provides any data from the disjoint area of the MBBOX and the prime administrative unit coverage (i.e. a set of retrieval tests). If it is impossible to identify the prime administrative unit (e.g. in the case of the hydrography service of the Ebro river basin, where some data lie in France as well) or the validation fails, the URI of the Non concept (i.e. the concept disjoint with administrativeUnit) is returned. The service registration ends with the deployment of the registered service description in the RDF container. The search process for services with an MBBOX restriction exploits the semantic links between the semantic description of services and the features from the AG. A simplified example of a request (i.e. only the spatial part) which is consumed by the KC is shown in Listing 3.2. The request pattern uses as input the searched MBBOX ($BBOX) expressed as a literal (e.g. “1.16311, 41.0937, 1.7132, 41.6686”). The first part of the request looks for those services whose coverages (dc:coverage) point to administrativeUnits whose boundaries are in interaction with the searched MBBOX (i.e. within it or intersecting it). The boundary of an administrativeUnit is defined via the bond–1000 property, containing a request which retrieves the corresponding feature from a WFS service. The feature is retrieved and its spatial object is extracted automatically by applying the profile instructions. The second part of the request looks for those services whose geoidentifier is defined as Non. They are filtered in a similar way as in the previous version of the services catalog, i.e. by comparing the service MBBOX with the requested one. The KC component has required the implementation of spatial functions, such as intersect or within, known from spatial databases. Therefore, the Apache Jena framework (Apache Software Foundation, 2012) has been chosen to deploy the RDF datasets, because Jena ARQ (i.e. its SPARQL query engine, which supports extensions to the SPARQL RDF Query language (Harris and Seaborne, 2012)) allows the implementation of such additional functionality.
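A minimal sketch of the prime administrative unit selection described above is given below; it assumes Shapely geometries for the unit footprints (e.g. retrieved from the reference WFS) and is only an approximation of the KC behaviour, omitting the retrieval–based validation step.

from shapely.geometry import box

def prime_administrative_unit(service_bbox, unit_footprints):
    # The prime administrative unit is the most extensive unit whose footprint
    # fits within the service MBBOX; `unit_footprints` maps unit URIs to
    # Shapely geometries, `service_bbox` is (minx, miny, maxx, maxy).
    mbbox = box(*service_bbox)
    fitting = {uri: geom for uri, geom in unit_footprints.items()
               if geom.within(mbbox)}
    if not fitting:
        # No unit fits (e.g. the data spill over a border); the catalog would
        # then fall back to the Non concept.
        return None
    return max(fitting, key=lambda uri: fitting[uri].area)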

3.3 Semantics of the WMS layer

Orthoimages are essential in many Web applications in order to provide the background context that helps to understand other georeferenced information. For instance, in the field of disaster management, satellite data play an increasingly important role in supporting decision making (Meisner et al., 2009; Kwan and Ransberger, 2010). Rapid data integration and visualisation are essential

to make data accessible and convey them in an easier–to–perceive way (Iosifescu-Enescu et al., 2010). Especially when presenting information to a non–expert audience, visualisation of the data improves the understanding of the situation at hand. Other applications of remote sensing products cover change analysis for monitoring and tracking the type and rate of landscape changes (Julea et al., 2010), urban environment modelling (Krauss et al., 2007), or assessing geospatial information quality (Skirvin et al., 2004). With the constant improvement of technologies in high–resolution satellite remote sensors, GPS systems, databases and geoprocessing sources, there are nowadays increasing amounts of imagery and gridded data. Additionally, thanks to the development and increasing importance of SDIs, the availability and accessibility of these data through standardised and interoperable Web services have increased exponentially in the last years. The OGC WMS and WCS service specifications provide the means for the implementation of services which offer visualisation and download of imagery data in well–known formats such as HDF–EOS (Larry Klein, 2007), GeoTIFF (Ritter and Ruth, 2000), DTED (NGA, 1996), NITF (DoD, 2006) or GML. However, two main problems can be observed that become obstacles for the access to imagery data on the Web. On the one hand, SDI catalogues (the services provided by SDIs to locate data and services) deployed at national, regional or local levels do not necessarily register all the services providing access to imagery data on the Web. On the other hand, it is not easy to automatically identify whether the data offered by a Web service are imagery data or not. With respect to the first problem, the discovery of Web services not directly subscribed in SDI catalogues, some researchers have proposed different strategies based on crawling the Web (Li et al., 2010; López-Pellicer et al., 2011e). However, the second problem, the automatic categorisation of the content offered by services, has received little

Listing 3.2: SPARQL request pattern for service selection via an MBBOX.

PREFIX svont: <…>
PREFIX agont: <…>
PREFIX rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dc:    <…>
PREFIX sf:    <…>
SELECT ?s WHERE {
    ?s rdf:type svont:service ;
       dc:coverage ?x .
    ?x rdf:type agont:au ;
       agont:bond-1000 ?g .
    FILTER(sf:INTERACT(?g, $BBOX))
} UNION
SELECT ?s WHERE {
    ?s rdf:type svont:service ;
       dc:coverage agont:Non ;
       svont:bbox ?g .
    FILTER(sf:INTERACT(?g, $BBOX))
}

attention until now. The purpose of this work is to investigate this second problem and to propose a method for the automatic analysis of WMS services in order to detect whether they contain imagery data, i. e. aerial photos or orthorectified satellite images of high resolution. WMS services are considered because their availability on the Web is higher. Currently, the number of WCS service instances is very low in comparison with WMS services (López-Pellicer et al., 2011a). Additionally, as WCS services provide full access to data, complex issues about access rights usually restrict their public deployment. Various heuristic methods for the automatic analysis of WMS services have been investigated in this work. First, a description–based method has been developed. It exploits information offered by the service provider. Then, a content–based method has been investigated. The characteristics of the information that should be detected (i. e. orthoimages) derive from the requirements on the radiometric and spatial resolution of the sensors that produce it. Therefore, the effort focuses on deterministic methods that exploit image features generated by these kinds of sensors. Additionally, an effective algorithm for gathering spatial information from OGC WMS services is an important part of this content–based method. In the context of this work, an effective retrieval algorithm for content–based analysis means an algorithm for collecting a representative set of fragments of an image (i. e. non–empty image fragments), which enables layer analysis. The procedure developed has been published as a geoprocessing service, which accepts the URL of a WMS Capabilities document and returns a list of those of its layers which offer orthoimages and the scale at which the images have been discovered. This geoprocessing service has been applied within the Virtual Spain project6 to develop a catalogue of orthoimages and associated applications. The rest of this section is structured as follows. First, the existing related work is outlined. Next, the method proposed for the analysis of WMS services is presented. Then some experiments are performed over a crawled collection of WMS services and the efficiency of this method is discussed. Finally, the publication of the proposed method as a Web service compliant with the OGC WPS specification is presented.

3.3.1 Related work

The accessibility of geospatial resources influences any geospatial–based task related to geospatial processing. For example, on–demand data production (Mansourian et al., 2008) assumes the discoverability of proper geospatial resources as an input. The discovery of resources within an SDI is based on the DL paradigm (Béjar et al., 2009). For instance, Li et al. (2011) present a framework for searching over multiple standardised catalogues, which can provide an integrated search across local and regional SDIs. However, not all Web resources valuable for these kinds of geospatial processes are part of an SDI, e. g. OGC Web services offered by a public administration but not published within any SDI, or resources produced by Web users. In these new scenarios, the automatic discovery

6España Virtual project (Virtual Spain project): http://www.españavirtual.org/

of geospatial resources on the Web has recently gained interest within the geospatial community (Li et al., 2010; López-Pellicer et al., 2011e). This new approach might provide users with a wider range of geospatial resources. Although standardised geospatial services are thought to be self–descriptive, some researchers indicate that there is a lack of good practices. For example, it is common to find inconsistencies between metadata in registries and the metadata offered directly by the service (Wu et al., 2010). As automatic discovery on the Web relies mainly on resource analysis, it might motivate providers to ensure more accurate descriptions. For example, in the case of OGC services, it might be supposed that if the discoverability of a resource depends on the quality of its Capabilities document, providers will put more effort into generating more valuable information in comparison with current practices. In the context of this work, WMS instances crawled on the Web are used as input in the experiments to test the performance of the method proposed for filtering orthoimages. As these services have been found on the Web and are deployed by different providers, they represent a realistic scenario that combines heterogeneous metadata of different quality levels.

Although resources are usually described in conformance with a specific description model, this does not necessarily imply that all the information potentially interesting to the user is provided. For example, the WMS Capabilities document does not contain a field to state the data presentation scale, i. e. the scale range for which a service renders a map. The capabilities specification just includes an optional field with the range of scales for which it is appropriate to generate a map of a layer. Although this information has different semantics (i. e. it refers rather to the data resolution), it might be assumed that the service renders a map at least at these scales. Additionally, as this field is optional, it may happen that some service providers have included this information in free text fields. In this case, a support tool (e. g. for a keyword–based or semantic–enabled search) is necessary, or an additional effort is required from a potential client (e. g. manual visualisation) to estimate the resource utility. Content–based analysis can be useful to extract additional information from a resource which is not directly provided within its description. The automatic discovery of geospatial resources based on content analysis, instead of relying on keywords in the metadata, is gaining interest. For instance, Zhang et al. (2010) present a semantic–based application that is able to perform an intelligent content–based search of Web Feature Services (WFS). Although this approach is not directly applicable to non–textual resources such as imagery, image analysis techniques can be used. Image classification and annotation have been extensively investigated in the field of IR, and machine learning approaches have been commonly applied in this area (Baharudin et al., 2007; Gonzalez-Garcia et al., 2007; Sinha and Jain, 2008). These techniques have also been applied to remote sensing imagery for the classification of multispectral imagery over a semi–urban area (Alonso and Malpica, 2008), or to support the update of existing land use databases (Kressler et al., 2005). The heuristics of the method proposed include both techniques for processing the WMS Capabilities document, and techniques for analysing the content of the layers offered by a

WMS service instance. Finally, it must be noted that any technique for the content–based analysis of a WMS service requires an effective algorithm for collecting a representative set of fragments of an image, instead of analysing the whole extent of an image. The geographic extent defined via an MBBOX is commonly accepted by the geospatial community (including WMS providers) for indexing and coarse resource selection. However, the type of the geographic feature (e. g. a forest, a named place), the spatial object used to represent it (e. g. multipolygons, icons) and the dispersion of spatial data within the dataset have influence on the efficiency of an application that searches for geographic information using a MBBOX. As presented in the previous Section, the semantic annotation of the geographic extent of OWS instances can improve the accuracy in the discovery of WMS services and their layers. Although the WMS services provided as input for the method proposed are only annotated by bounding boxes and it is not always possible to infer a more accurate description of their geographic extent, the algorithm for the collection of image fragments takes into account the variability in the dispersion of data by including a space division strategy.

3.3.2 The method

Overview

The development of appropriate heuristics requires a prior analysis of the characteristics of the studied phenomenon or process. For this purpose, the differences between orthoimage layers and the other layer types have been studied. The lessons learned have been used to develop the deterministic heuristics. A WMS produces spatially referenced maps which are dynamically rendered in a graphical format. The geographic information used for rendering imagery is organised into logical units, i. e. layers, which might be organised in a hierarchical structure. The GetMap operation allows a user to request a single layer, a parent layer, or any combination of those via explicit indication of the layers in the request. According to the content offered by a layer, five general categories can be identified:

• Orthoimage Layers. These layers offer orthoimages of high quality such as low altitude aerial photography or satellite orthoimages (normal colour images of high spatial and radiometric resolution) (Schmitt and Stilla, 2010).

• Non–orthoimage Satellite Layers. These layers offer non–orthoimage satellite imagery, e. g. radar or laser products.

• Coverage Data Layers. These layers offer images produced from coverage data representing continuous phenomena, for example a temperature map produced as an interpolation of sensor measurements (Iosifescu-Enescu et al., 2010), or a coastal ocean model (Schoenhardt et al., 2010).

Figure 3.6: Overview of the method for the orthoimage layer detection.

• Toposheet Layers. These layers offer digitised but not vectorised toposheets, for example scanned maps.

• Vector Layers. These layers offer vector datasets, for example country boundaries or icon– based feature representations.

Various features of existing WMS services have been taken into account in order to detect the discriminating characteristics of an orthoimage layer. The analysis has considered descriptive features (i. e. the keywords used and the declared scale in the Capabilities document), behavioural features (i. e. the presentation scale at which data are visualised by a viewer) and physical features (i. e. the characteristics of the layer image fragments, for example, the number of colours). The proposed methodology for layer analysis is a combination of two different approaches, as depicted in Figure 3.6. First, the methodology analyses the metadata contained in the WMS Capabilities document (the MetadataAnalysis). The second part of the method gathers a sample collection of image fragments to explore the content served by the WMS service (the ContentAnalysis). If it is possible to apply the image analysis to the collected samples, the result of the metadata analysis is not considered during the final annotation of the evaluated layer (the LayerAnnotation).
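The combination logic of the two analyses can be summarised by the following Python sketch; the two analysis callables are placeholders for the procedures detailed in the next subsections, and the annotation labels follow the names used in Figure 3.6.

def annotate_layer(layer, capabilities_doc, metadata_analysis, content_analysis):
    # `metadata_analysis` and `content_analysis` stand in for the description-
    # based and content-based procedures described below; the content-based
    # verdict, when available, overrides the metadata one.
    content_result = content_analysis(layer)            # 'Orthoimage', 'NonOrthoimage' or 'NotKnown'
    if content_result != "NotKnown":
        return content_result
    return metadata_analysis(capabilities_doc, layer)   # 'Orthoimage' or 'NonOrthoimage'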

Description–based analysis

The description–based approach considers the OWS Capabilities document as an information source to identify the keywords that appear with high frequency in the description of orthoimage layers. Although the Capabilities document might be extended by a service provider, only the metadata elements that follow the OGC WMS specification have been considered during this analysis. In the case of the WMS Capabilities document, the offered metadata are split into four groups:

• General Service Metadata. This group gathers general service metadata fields (e.g. title, abstract, keywords, bbox).

• Parental Layer Metadata. This group gathers descriptive elements from the parental path in the hierarchical layer structure (e.g. name, title, abstract, keywords, bbox).

• Layer Metadata. This group gathers descriptive elements of the layer which might be inherited from the parental path (e.g. name, title, abstract, keywords, bbox).

• Layer Metadata Document. This group contains more precise layer description defined via a link to a standardised metadata document that describes the presented data.

It is necessary to identify, per each metadata group g, the deterministic set of reference keywords that distinguishes an orthoimage layer. Having a keyword set which is common for a group of orthoimage layers (Koi(g)) and a keyword set which is common for a group of non–orthoimage layers (Knoi(g)), the deterministic set of reference keywords is the difference of Koi(g) and Knoi(g),

Koi(g) \ Knoi(g) (i.e. the set of all elements of Koi(g) that do not belong to Knoi(g)). The deterministic sets of reference keywords are then used to analyse the metadata of the evaluated layer. Figure 3.7 shows the workflow of the description–based analysis. First, the keywords are extracted per each metadata group from the Capabilities document of the analysed layer (the GeneralServiceMetaKeywExtraction, the ParentalLayerMetaKeywExtraction, the LayerMetaKeywExtraction, the LayerMetaDocKeywExtraction). If any keyword of the extracted set matches the corresponding reference keywords, the layer is assumed to be an orthoimage layer. Otherwise, it is a NonOrthoimage layer. As for the keyword extraction procedure, it processes the metadata groups separately. First, the XML document is parsed and the text from those fields is extracted and converted into a set of words. The applied tokenisation rules are similar to those that search engines use: (1) transformation to lowercase, (2) use of spaces and punctuation as word separators, and (3) exclusion of words with encoding errors. Digits, one– and two–letter words, and words commonly ignored by search engines are also excluded, except those with semantic relevance in the geographic domain. Then, only the words with the highest frequency are considered.
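A minimal Python sketch of the tokenisation and reference keyword computation described above is shown below; the stop–word list and the frequency cut–off are illustrative assumptions, not the values used in the experiment.

import re
from collections import Counter

STOPWORDS = {"the", "and", "for", "with", "del", "las", "los"}  # illustrative only

def tokenise(text):
    # Lowercase, split on spaces and punctuation, drop digits and one- and
    # two-letter words as well as common stop words.
    words = re.split(r"[^\w]+", text.lower())
    return [w for w in words
            if w and not w.isdigit() and len(w) > 2 and w not in STOPWORDS]

def frequent_keywords(layer_texts, top_n=20):
    # Keep only the most frequent words extracted from a group of layers.
    counts = Counter(word for text in layer_texts for word in tokenise(text))
    return {word for word, _ in counts.most_common(top_n)}

def reference_keywords(ortho_texts, non_ortho_texts):
    # Koi(g) \ Knoi(g): frequent for orthoimage layers, not for the others.
    return frequent_keywords(ortho_texts) - frequent_keywords(non_ortho_texts)

def matches_reference(layer_text, reference):
    # A layer is annotated as an orthoimage layer if any extracted keyword
    # matches the reference set.
    return bool(set(tokenise(layer_text)) & reference)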

Content–based analysis

Any content–based analysis requires the development of heuristics for effective data collection. In the case of a WMS service, an effective retrieval procedure refers to a process for making requests to the service and collecting a set of representative fragments of the image layer, instead of referring to the full extent of the layer. This set of image fragments allows the development of techniques for content evaluation, in this case, identifying an orthoimage layer via colour–based tests. Figure 3.8 shows the workflow of the content–based analysis. First, the template of the image fragment request (GetMap) is defined using the information obtained from the OWS Capabilities document. Then, the collection procedure is invoked. If a valuable image fragment collection has been gathered, the image analysis is performed over the collection. It combines the colour test and the pixel test. The result of the content–based analysis can be Orthoimage or NonOrthoimage, according to the result of the image analysis procedure. If the image fragment collection is not valuable, the layer is annotated as NotKnown. The following subsections present the procedure for gathering a valuable image fragment collection (Section 3.3.2), and the details of the proposed heuristics for image analysis (Section 3.3.2).

Figure 3.7: Workflow of the description–based analysis.

Collection procedure The goal of the collection procedure developed is to gather a set of valuable image fragments from a WMS layer to perform the image analysis. A valuable image fragment is a valid response from a WMS service whose number of colours (NImgC) is bigger than an established threshold (TNImgC), i. e. NImgC > TNImgC. In the case of orthoimage layers, a one–colour image fragment is assumed to be non–valuable (TNImgC = 1). It is probable that this condition causes

Figure 3.8: Workflow of the content–based analysis.

an erroneous interpretation of an image with vector data, e. g. when an image fragment represents the inside of a polygon rendered with one colour. However, in this work orthoimages are targeted, and the risk of estimating a vector layer as an empty layer (i.e. no positive responses) is acceptable. The collection procedure automatically generates and invokes a series of GetMap requests for the layer which is the object of evaluation (the layer name, IDLayer). The definition of the request parameters should ensure homogeneous characteristics of the images in the collection, i. e. image type,

image size and scale. The image format (FImg), for example JPEG or PNG, must not differ, in order to reduce the variability caused by image generation processes, which could distort the image analysis

results. The requested image size, i. e. the number of pixels per height and width (HImg and WImg respectively), should not change either, for uniformity of the evaluated collection. As the image size is constant, the scale of a WMS image is determined by the dimensions of the bounding box specified in the request (BBoxr). For every GetMap request a BBoxr must be randomly generated within the geographic extent of interest (the work BBox, BBoxw), while preserving the same scale (the request scale) and preventing deformation of the representation.
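A minimal sketch of generating a random request BBox of a fixed scale within a work BBox is given below; it assumes a projected CRS expressed in metres and the common WMS convention of a 0.28 mm rendering pixel, so the actual implementation may differ.

import random

PIXEL_SIZE_M = 0.00028  # common WMS rendering pixel size (0.28 mm)

def random_request_bbox(work_bbox, scale, width_px, height_px):
    # Pick a random request BBox of the given scale inside the work BBox.
    # Keeping the image size constant and deriving the ground extent from the
    # scale preserves the request scale and avoids deformation.
    ground_w = width_px * PIXEL_SIZE_M * scale
    ground_h = height_px * PIXEL_SIZE_M * scale
    minx, miny, maxx, maxy = work_bbox
    if ground_w > maxx - minx or ground_h > maxy - miny:
        raise ValueError("the requested extent exceeds the work BBox")
    x0 = random.uniform(minx, maxx - ground_w)
    y0 = random.uniform(miny, maxy - ground_h)
    return (x0, y0, x0 + ground_w, y0 + ground_h)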

The scale also plays a significant role in gathering a valuable image fragment collection, as the layer may offer data only for a specific scale range. The request scale should be a presentation scale supported by the layer, i. e. a scale for which the layer offers spatial information (i. e. it produces non–empty images). Therefore, a scale change should be considered if a valuable image fragment is not found at the selected scale. The Capabilities document does not specify a field to declare the presentation scale in the sense it is used here. It allows defining the range of scales for which it is appropriate to generate a map of a layer. Additionally, a provider may not specify this information, and the scale recommendation can appear within a free text description of the layer or even in the layer name. In this case, pattern–based searching is used to extract this information. Another issue that should be considered when designing the collection algorithm is the layer geographic extent. The Capabilities document informs about the geographic extent for which the layer offers spatial data, i. e. the bounding box of the layer (BBoxLayer). However, this information might not be accurate enough to get a valuable collection. The BBoxLayer indicates the extent covered by data, but it does not mean that any request within this search area will provide a valuable response.

The layer may offer data only for a part of the defined BBoxLayer. For example, especially in the case of vector data, data might be widely dispersed and their relative size might be considerably tiny. This complicates the development of an automatic procedure for collecting valuable images. If the area covered by data is very small compared to the search area, it can be difficult to find any data, and the task of collecting images of the whole area at a given scale might not be efficient. Therefore, it is necessary to restrict the number of requests, and if the collecting task has not gathered a valuable collection, the search area is divided into several parts (ND). Then, another series of requests is performed per each new search area to ensure a better distribution of spatial requests. This division–requesting task can be repeated if necessary. Figure 3.9 shows a sample layer which requires several division iterations to find a valuable image collection. Although the iteration can be repeated any number of times, the number of repetitions should depend on the BBoxLayer, because the method is developed to work on a sample collection of image fragments and not on the whole layer geographic extent. The performance aspect must be considered as well, especially for those bounding boxes of layers producing an image collection of considerable dimension (e.g. Europe) at a selected scale. For this reason, in this work, the number of iterations is limited. Algorithm 1 presents the developed collection procedure. It applies the division–requesting task and considers the change of the request scale. Additionally, the algorithm requires the following auxiliary functions to define several thresholds:

• ND(i) – a function that defines the number of areas into which the BBoxLayer is split (i. e. the number of BBoxw) in each iteration (i): ND(i) = 4^(i−1);

• NRw(i) – a function that defines the number of requests per each BBoxw (at iteration i): NRw(i) = { an initial value (for example 100), if i = 1; NRw(i − 1)/2, if i > 1 };

Figure 3.9: Example of image division in the collection procedure.

• NImg(i) – a function that defines the expected number of images in the collection after each iteration (if no error occurred): NImg(i) = { ND(i) ∗ NRw(i), if i = 1; ND(i) ∗ NRw(i) + NImg(i − 1), if i > 1 };

• TNvImg(i) – a function that defines the threshold to discriminate a valuable image collection (at iteration i), i. e.:

Algorithm 1 Algorithm for collecting image fragments of a WMS layer.

function ImageCollection(TNImgC, RqTmpl, iter)
    ▷ Inputs: the threshold to determine a valuable image (TNImgC = 1); the GetMap request template (RqTmpl); the number of iterations (iter ≥ 1).
    BBoxLayer ← RqTmpl.bbox()                        ▷ The layer BBox.
    NvImg ← 0                                        ▷ The number of valuable GetMap responses.
    error ← FALSE                                    ▷ The GetMap response error.
    Sr ← S(1)                                        ▷ Initialising the request scale for the first iteration.
    changeScale ← FALSE
    i ← 1                                            ▷ The counter of iterations.
    repeat
        SetBBoxw ← split(BBoxLayer, ND(i))           ▷ The BBoxLayer division.
        if changeScale then                          ▷ Scale change.
            Sr ← S(i)
        end if
        for all BBoxw in SetBBoxw do
            r ← 1                                    ▷ The counter of requests per work area.
            repeat
                BBoxr ← rand(BBoxw, Sr, RqTmpl)      ▷ The request bounding box.
                res ← getMap(RqTmpl, BBoxr)          ▷ The response of the GetMap request.
                error ← chkError(res)                ▷ The response error.
                if ¬error then
                    img ← getImage(res)              ▷ The response image fragment.
                    imgCollection.add(img)           ▷ Adding img to the image collection.
                    NImgC ← colourNumber(img)        ▷ The number of colours of img.
                    if NImgC > TNImgC then
                        NvImg ← NvImg + 1            ▷ Counting valuable image fragments.
                    end if
                end if
                r ← r + 1
            until (error) ∨ (r > NRw(i))             ▷ Ends if there is a WMS error or all requests for this BBoxw have been performed.
        end for
        changeScale ← (NvImg < TNvImgS(i))           ▷ The request scale should be changed in the next iteration if the number of valuable image fragments is too small.
        i ← i + 1
    until (error) ∨ (NvImg > TNvImg(i)) ∨ (i > iter) ▷ Ends if there is a WMS error, a valuable collection has been gathered, or all permitted iterations have already been performed.
    return imgCollection                             ▷ The image collection.
end function

NvImg > TNvImg(i), where TNvImg(i) = 10% ∗ NImg(i) − 1, and NvImg is the number of valuable images gathered so far;

• TNvImgS(i) – a function that defines the threshold to decide a scale change (at iteration i), i. e.:

NvImg < TNvImgS(i), where TNvImgS(i) = 2% ∗ NImg(i) + 1, and NvImg is the number of valuable images gathered so far;

• S(i) – a function that defines the appropriate scale for each iteration (i); this function must be customised for each experiment, using a sample set of inputs (Section 3.3.3). A sketch of these auxiliary functions is given after this list.
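A minimal Python sketch of these auxiliary functions follows; the initial number of requests per work area (100) matches the example above, while the scale schedule used for S(i) is purely illustrative.

def ND(i):
    # Number of work areas into which the layer BBox is split at iteration i.
    return 4 ** (i - 1)

def NRw(i, initial=100):
    # Number of requests per work area, halved at every iteration.
    return initial if i == 1 else NRw(i - 1, initial) // 2

def NImg(i):
    # Expected number of images in the collection after iteration i.
    return ND(i) * NRw(i) + (NImg(i - 1) if i > 1 else 0)

def TNvImg(i):
    # Threshold above which the gathered collection is considered valuable.
    return 0.10 * NImg(i) - 1

def TNvImgS(i):
    # Threshold below which the request scale is changed at the next iteration.
    return 0.02 * NImg(i) + 1

def S(i, schedule=(25_000, 50_000, 100_000)):
    # Request scale denominator per iteration; the schedule is illustrative and
    # would be tuned per experiment (Section 3.3.3).
    return schedule[min(i, len(schedule)) - 1]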

First, the variables are initialised, most of them to 0 or false. The work BBox set (SetBBoxw) is initialised with one work BBox (BBoxw) equal to the layer BBox (i. e. ND(1) = 1). Then, a limited number of requests (NRw) are performed with a random request BBox (BBoxr) within the work BBox. If the WMS response is an error or an image cannot be opened, the algorithm finishes and the image collection is returned. Otherwise, the image is added to the image collection. Additionally, a colour histogram is created per each image and, if the image is valuable, the counter of valuable images in the collection (NvImg) increases. When an image collection has been gathered at the first iteration, the conditions on a valuable collection and on a scale change (changeScale) are checked for the next iteration, and the iteration counter increases. If the collection is not valuable and there is no error response, the next iteration starts. The BBoxLayer is split into ND(i) elements (i. e. a new work BBox set is defined). A new request scale is set (Sr) if changeScale is true. Then, the gathering task is performed again per each work BBox in the work BBox set. The number of iterations is limited by the parameter iter. The output of the algorithm is an image fragment collection.

Image analysis procedure Two heuristics of image analysis have been investigated. The first one, the colour test, operates on the colour histogram of an image. The other one, the pixel test, observes changes in the colour characteristics of a selected area between images generated for two spatially overlapping requests. The final image analysis procedure combines both of them. The colour test performs the analysis of the number of colours. High–quality orthoimages are characterised by the use of a larger number of colours compared to other types of datasets. An orthoimage consists of an array of pixels picked up by the sensor and may contain hundreds, or even thousands, of different colours depending on the capture conditions (light, sensor quality, etc.). Based on the above assumption, the collecting procedure has to gather images in a format that does not employ lossy data compression, to prevent loss of quality. Therefore, the PNG format has been chosen. Then, the average number of colours of the valuable images in the total image collection (Nct) is compared to a threshold value (i. e. the colour test threshold, Tct). The proper value of Tct is estimated from examples of orthoimages during an evaluation step previous to the experiment (Section 3.3.3). The second heuristic, the pixel test, is a supporting procedure allowing the detection of lossy raster compression formats (e. g. ECW, JPEG2000), which are typically used for storing imagery. It assumes that the images portrayed by a WMS are stored in such a format, and that it is possible to detect the pixel colour variance in images with lossy formats, which are rendered by a WMS in response to overlapping spatial requests. In the first step, a valuable image fragment is selected from the image collection. It is requested again with a lossy format (i. e. JPEG). A colourful pixel (i. e. the base pixel) is chosen from the obtained base image. Then, a test image is retrieved with a request that overlaps the base image in the area represented by the selected pixel (Figure 3.10). The match pixel is identified in the test image and compared with the base pixel. If a number of such tests determine that the colour varies in any of the pixel pairs, it is probably an image using a raster representation internally.

Figure 3.10: Pixel test heuristic. Area 1 represents the request BBox of the base image and Area 2 shows the request BBox of the test image.

Layers using a vector representation format internally are always rendered in the same way, assuming that the request parameters differ only in the request BBox. Figure 3.8 presents the workflow of the developed image analysis procedure. If it has been possible to retrieve a valuable collection (i. e. NvImg > TNvImg), the image analysis is performed. First, the valuable image collection is analysed applying the colour test. If the average colour number (Nct) is equal to or bigger than the colour test threshold (Tct), the layer is assumed to be an orthoimage layer (Orthoimage). If the number of colours is slightly lower (i. e. Nct >= Tpt), the pixel test is performed. For example, the pixel test threshold value might be 10% lower than the colour test threshold. A valuable image fragment is selected from the collection and the request that has generated it is identified. Then, two image fragments are requested (CollectImages). The first one is requested using the identified request but with the image format necessary for the pixel test. It is the base image (Img1). Then, the test image (Img2) is requested and the test is performed as described previously. As a result of the image analysis procedure, the layer is assessed to be an Orthoimage or NonOrthoimage layer.
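The two image tests can be sketched as follows, assuming the Pillow imaging library; the thresholds, the pixel selection and the request machinery are illustrative simplifications of the procedure described above.

import io
from PIL import Image

def open_fragment(getmap_response_body):
    # Open a GetMap response body (bytes) as a Pillow image.
    return Image.open(io.BytesIO(getmap_response_body))

def colour_count(image):
    # Number of distinct colours of a fragment requested as PNG, i.e. without
    # lossy compression.
    return len(set(image.convert("RGB").getdata()))

def colour_test(valuable_images, t_ct):
    # Compare the average colour count of the valuable fragments with the
    # colour test threshold Tct.
    counts = [colour_count(img) for img in valuable_images]
    return sum(counts) / len(counts) >= t_ct

def pixel_test(base_image, test_image, base_xy, test_xy, tolerance=0):
    # Compare the base pixel with the matching pixel of an overlapping request
    # (both fragments requested as JPEG); a colour difference suggests a lossy
    # raster source, i.e. imagery stored as compressed raster data.
    base_pixel = base_image.convert("RGB").getpixel(base_xy)
    test_pixel = test_image.convert("RGB").getpixel(test_xy)
    return any(abs(a - b) > tolerance for a, b in zip(base_pixel, test_pixel))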

3.3.3 Experiment

This section describes the experiment performed over a collection of WMS layers (i. e. the layer collection) to filter orthoimage layers.

Data corpus and tuning of parameters

An OWS crawler has been used to exploit existing search engines to identify WMS instances on the Web (see López-Pellicer et al. (2011e) for more details) from the Spain and region, and the discovered WMS Capabilities documents have been stored in a repository. The layer collection has been extracted from the repository applying the following restrictions: (1) the service must be accessible and respond correctly to the standardised operations, and (2) the layer must be requestable

Layer collection (NLTotal): 5848 (100%)
Vector (NLVec): 5259 (89.93%)
Coverage (NLCov): 241 (4.12%)
Toposheet (NLTopo): 108 (1.85%)
Satellite non–orthoimage (NLSatNOrt): 117 (2%)
Orthoimage (NLSatOrt): 123 (2.1%)

Table 3.2: Number of layers per each layer type for the layer collection used in the experiment.

and a single layer. As a result, the layer collection consisted of 5848 layers offered by 708 different WMS services. First, the layer collection has been manually analysed and annotated: (1) the associated capabilities information has been analysed, (2) the offered content has been visually examined, and (3) the features of the GetMap responses have been identified. Table 3.2 presents the number of layers in the layer collection per each layer type. Additionally, per each layer type a set of ten random layers (i. e. a type layer set) has been selected for tuning the parameters of the algorithm. An evaluation of these groups has allowed discovering the characteristics of each layer type, which have helped to establish the algorithm parameters used in the experiment. For the selection of typical keywords describing orthoimages, the Capabilities documents of the type layer sets have been processed to extract Koi(g) \ Knoi(g) per each metadata group (Section 3.3.2). The Layer Metadata Document group has not been considered, as the capabilities hardly provide this information (barely 1% of the layer collection). Application of this method to the General Service Metadata and Parental Layer Metadata groups has resulted in empty lists. Only the Layer Metadata group has produced a deterministic set of reference keywords (i. e.

Koi \ Knoi(LayerMetadata)): “orto”, “ortho”, “photo”, “ortofoto”, “ortofotografias”, “ortofotografía”, “pseudoorto”, “vuelo”, “LIDAR”, “PNOA”. Due to the characteristics of the layer collection that has been used, the resulting list contains mainly Spanish vocabulary. For the construction of the GetMap operation requests used in the content–based analysis, the Capabilities document must also be analysed to derive the appropriate parameters of the request. The template of a GetMap request is defined as follows:

SERVICE=WMS&VERSION=&REQUEST=GetMap&LAYER=&STYLE=