International Journal of Geo-Information

Article Global-Scale Resource Survey and Performance Monitoring of Public OGC Web Map Services

Zhipeng Gui 1,2,3,*, Jun Cao 2,3, Xiaojing Liu 2,3, Xiaoqiang Cheng 2,3 and Huayi Wu 2,3,*

1 School of Remote Sensing and Information Engineering, Wuhan University, 129 Luoyu Road, Wuhan 430079, China 2 State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, 129 Luoyu Road, Wuhan 430079, China; [email protected] (J.C.); [email protected] (X.L.); [email protected] (X.C.) 3 Collaborative Innovation Center of Geospatial Technology, Wuhan University, 129 Luoyu Road, Wuhan 430079, China * Correspondence: [email protected] (Z.G.); [email protected] (H.W.); Tel.: +86-027-6877-7167 (Z.G.); +86-027-6877-8311 (H.W.)

Academic Editor: Wolfgang Kainz Received: 19 March 2016; Accepted: 26 May 2016; Published: 8 June 2016

Abstract: One of the most widely-implemented service standards provided by the Open Geospatial Consortium (OGC) to the user community is the Web Map Service (WMS). WMS is widely employed globally, but there is limited knowledge of the global distribution, adoption status or the service quality of these online WMS resources. To fill this void, we investigated global WMS resources and performed distributed performance monitoring of these services. This paper explicates a distributed monitoring framework that was used to monitor 46,296 WMSs continuously for over one year and a crawling method to discover these WMSs. We analyzed server locations, provider types, themes, the spatiotemporal coverage of map layers and the service versions for 41,703 valid WMSs. Furthermore, we appraised the stability and performance of basic operations (i.e., GetCapabilities and GetMap) for 1210 selected WMSs. We discuss the major reasons for request errors and performance issues, as well as the relationship between service response times and the spatiotemporal distribution of client monitoring sites. This paper will help service providers, end users and developers of standards to grasp the status of global WMS resources, as well as to understand the adoption status of OGC standards. The conclusions drawn in this paper can benefit geospatial resource discovery, service performance evaluation and guide service performance improvements.

Keywords: web map service; map resource survey; map subject; service provider; spatiotemporal distribution; service performance; response time; quality of service; WMS crawling; OGC service discovery

ISPRS Int. J. Geo-Inf. 2016, 5, 88; doi:10.3390/ijgi5060088 www.mdpi.com/journal/ijgi

1. Introduction

Web technologies, new standards and commercial applications are rapidly changing the nature and extent of available online geospatial resources. Thus, it is imperative to investigate whether Web Map Service (WMS) remains the best solution to share and interoperate maps over the Internet or if a new standard and direction is needed. To understand the WMS adoption situation and appraise the WMS standard, a global-scale survey to investigate the distribution, usage and quality of public WMSs (i.e., the web map services that are available over the Internet for public use) is highly desirable.

WMS is an Open Geospatial Consortium (OGC) standard protocol initially proposed in 2000 for serving geo-referenced maps over the Internet [1]. The latest version WMS 1.3.0 was published in 2006 [2] and is also available as ISO 19128 (i.e., ISO 19128:2005 Geographic information-Web map server interface) [3] issued by the International Organization for Standardization (ISO). WMS has become the most widely-used OGC data portrayal service, recognized globally and supported by both mainstream commercial geospatial tools and open-source software.

Currently, a huge number of WMSs are available over the Internet for public use providing abundant map resources, but varying in map content and quality. In this context, finding an appropriate WMS becomes a challenging problem [4]. Therefore, for more effective and efficient utilization of these invaluable online geospatial resources, the following questions need to be asked: What is the adoption situation of WMS; who are the primary contributors; and where are the providers located? Are there any patterns regarding the different attributes of WMS resources? For example, what are the most popular themes, specification versions and map projections, spatial and time coverages of the map layers? How is the quality of global WMSs, in terms of accessibility, successability and performance? Are there any patterns that can provide hints for service selection and server-side improvement?

In this paper, we carried out a global-scale resource investigation and executed distributed performance monitoring on a set of crawled public WMSs. The potential contributions of this research can be drawn from the following perspectives:

(1) Geospatial resource discovery: the investigation of WMS resources and their metadata helps us to grasp the server locations, provider types and content distribution of the global geospatial resources. Consequently, this knowledge will benefit service discovery by bringing to light on-demand WMSs that have those properties expected by service consumers.

(2) Service performance evaluation: by developing a distributed monitoring framework, spatiotemporal heterogeneity and individual service-level performance differences can be analyzed over space and time. This will guide performance-aware service selection for time-critical applications, as well as steer performance improvements from the service provider perspective.

(3) Evolution of service standard: this investigation can help researchers and standard makers from both industry and academia review the development and adoption of open service standards for geospatial data portrayal. By relating cutting-edge web visualization technologies to the prevailing interoperation modes (e.g., crowdsourcing [5,6] and collaboration [7,8]), we can rethink the appropriateness of the WMS standard, thus inspiring us to conceive new directions for advancement.

This article is organized as follows. Section 2 reviews relevant research. Section 3 introduces our monitoring method and data collection. Section 4 details the discoveries. Section 5 concludes with results and discusses future research.

2. Related Work

2.1. Development of Online Mapping Technologies and WMS

With the development of web technologies, interactive online mapping has made great progress over the past several decades. Successively emerging commercial map services and crowdsourcing projects, like MapQuest, OpenStreetMap and Google Maps, that implemented various technologies and open standards are significantly changing the application of online mapping [9]. Online mapping relies on rendering strategies, data models and markup languages; nowadays, both server-side rendering and client-side rendering play important roles in online mapping.

Server-side rendering generates maps (typically raster images overlaid with vector objects and annotations) on map servers and delivers them to clients. Since it relies on server-side functionality and computing power, maps can be accessed and presented without any advanced processing on the client side. Server-side rendering, however, results in poor interactivity. Intensive concurrent map requests introduce frequent data conversion and transfer issues and may lead to server-side overload and low responsiveness on the client side [10]. Newly-developed data cube technologies and algorithms, e.g., nanocubes, enable interactive exploration of large multidimensional spatiotemporal datasets (billions of entries) with very low latencies [11]. In comparison to server-side rendering, client-side rendering enables data rendering, animation and enriched interaction features directly in web browsers. Datasets, instead of rendered images, are retrieved by the client. Data models based on Extensible Markup Language (XML) and JavaScript Object Notation (JSON) prevail, such as Scalable Vector Graphics (SVG), Geography Markup Language (GML) and GeoJSON.

Conventional plugin-based Rich Internet Application (RIA) technologies (e.g., Adobe Flash, Microsoft Silverlight and Oracle-Sun JavaFX) have wide applications [12], but also disadvantages [13,14], such as extra installation, security and compatibility concerns. With the evolution and standardization of web technologies, native HyperText Markup Language (HTML) 5.0 and JavaScript packages have expanded to include powerful visualization and rendering functionalities. Other open standards, such as HTML Canvas and WebGL [15], also show great potential for data rendering. These standards and technologies have alleviated the dependence on plugins and make data rendering more interactive and flexible.

The WMS standard relies on a server-side rendering mode; thus, the maps are generated by a map server using geospatial data from geospatial databases or other data sources. WMS defines a set of standardized operations (i.e., interfaces) to facilitate map requests. GetCapabilities accesses the metadata of the service and map layers; GetMap generates a map with well-defined geographic and dimensional parameters; and GetFeatureInfo retrieves extra properties for particular features shown on a map [2]. These operations can be integrated with other geospatial tools or invoked using a standard web browser by submitting HyperText Transfer Protocol (HTTP) requests. Clients can easily request images on demand with changing parameters.

Currently, WMS is being challenged by competitors, such as the Tile Map Service Specification (TMS) [16] and ESRI RESTful services. The stability and the performance of a map server deteriorate rapidly when high concurrency and frequent interaction occur. To tackle this problem, various approaches have been developed and are widely used, e.g., message queue, indexing, auto-scaling and dynamic load balancing technologies. Besides these methods, map tile caching mechanisms provide an alternative approach to fetch and cache pre-rendered map tiles efficiently, according to specified geographical extent and scales. TMS is one of the earliest standards for tiling maps and has gained a great deal of support from many open source communities. As Representational State Transfer (REST) advances as the mainstream architectural style for web services, ESRI has proffered RESTful Service Application Programming Interfaces (APIs) starting in 2010. With the vast deployments of the ArcGIS Service architecture, a growing number of ESRI RESTful services are available over the Internet.

To meet these challenges and ever-changing demands, OGC constantly improves its standards. Inspired by TMS, OGC published the Web Map Tile Service (WMTS) standard in 2009 to develop scalable, high performance services for web-based distribution of maps [17]. WMTS provides a complementary approach to WMS for tiling maps. Moreover, in 2011 OGC formed a working group to explore the implementation of geospatial services through RESTful approaches [18].

OGC standards have their own advantages when compared to the industrial web service standards advocated by the World Wide Web Consortium (W3C). OGC service standards provide abundant metadata about the provider and the geographic data through GetCapabilities, and the capability is continuously enhanced. In contrast, Web Service Description Language (WSDL) only focuses on syntactic service APIs and message interaction. For example, WMS has supported the description and access to time series data through dimension parameters since Version 1.1.0. Data animation functions can be implemented on the client side to visualize the dynamics of natural phenomena or socioeconomic processes conveniently [19].

Contemporary commercial and open-source map APIs (e.g., Google Maps, Bing Maps and OpenLayers) provide capabilities to integrate server-side and client-side mapping in applications. An integrated solution combining WMSs with vector data overlays provides a productive choice for advanced users [15]. In summary, WMS plays a crucial role in online mapping and is widely supported by both commercial (e.g., ArcGIS products [20], Autodesk’s Map 3D [21] and Civil 3D products [22]) and open-source (e.g., OpenLayers [23], GRASS GIS [24], QGIS [25]) software providers and various applications. Meanwhile, there are urgent and evolving demands for the standard to be more interactive, analytic and collaborative [26].
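As a rough illustration of how a WMS operation is invoked over HTTP, the sketch below assembles a GetMap request URL from key-value pairs. The parameter names follow WMS 1.3.0; the endpoint `https://example.org/wms`, the layer name `roads` and the helper `build_getmap_url` are hypothetical and are not taken from the services surveyed in this paper.

```python
from urllib.parse import urlencode


def build_getmap_url(endpoint, layers, bbox, width, height,
                     crs="CRS:84", image_format="image/png", version="1.3.0"):
    """Return a WMS GetMap URL for an endpoint that has no query string yet.

    `endpoint` and `layers` are placeholders chosen by the caller; the
    parameter names are the ones defined for GetMap in WMS 1.3.0.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": version,
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": "",  # empty value requests the default style
        "CRS": crs,    # CRS:84 uses longitude/latitude axis order
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return f"{endpoint}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical endpoint and layer, whole-world extent.
    print(build_getmap_url("https://example.org/wms", ["roads"],
                           (-180, -90, 180, 90), 512, 256))
```

Changing only `bbox`, `width` or `height` yields a new image on demand, which is the convenience, and the per-request server cost, that the text above attributes to server-side rendering.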

2.2. Online Geospatial Web Service Survey

Investigating global online geospatial web services is essential for geospatial resources discovery and a better understanding of the status of online resources [27,28]. Spatial Data Infrastructures (SDIs) have been widely used in the Earth science domain to facilitate geospatial resources discovery and sharing [29]. Catalogues and portals, such as Global Earth Observation System of Systems (GEOSS) Clearinghouse [30] and data.gov [31], maintain millions of metadata entries for geospatial resources, allowing users to specify query constraints by using indexed metadata property fields. An SDI-level geospatial resource survey would provide invaluable information for both geospatial resource producers and consumers, even for policy makers. However, most of the SDIs only provide coarse-grained statistics, such as available resource types and regional-level resource distributions. A detailed, publicly-available resource survey of global geospatial web services has never been executed or, alternatively, is not available to end-users. Furthermore, registry records may be stale and incomplete in SDIs because the metadata maintenance depends on service owners. Lopez-Pellicer et al. [32,33] analyzed the discovery capability of common search engines and SDIs toward OGC services. Common search engines, like Google, Yahoo and Bing, can only index and recall half of the OGC services at most, and SDIs have limited resource coverage. Therefore, an active resource investigation is required.

There are many available online WMS lists provided by third parties. Refractions Research [34] collected 615 WMSs by using Google Web APIs and extracted the basic metadata fields (e.g., name, title, layer bounding box). Skylab Mobilesystems Ltd. [35] also offers a frequently-updated list of unrestricted accessible WMSs at the global extent, but only the number of layers was counted; furthermore, the number of WMSs was limited (994 WMSs).

To understand the provenance of online OGC web services, the geo-distribution of service providers [36] and the service deployment situation (e.g., the number of services deployed on each server and the number of datasets provided by each service) were studied [32,37]. This research revealed the imbalances in geospatial resources in terms of service location and provider. By analyzing the service types and version proportion of online OGC services in Europe, Lopez-Pellicer et al. [32] found that WMS is the most popular OGC service online since it is easy to deploy and use. However, the maintenance and updates of these online WMSs were inefficient. Li et al. [37] investigated the web diffusion of WMS by developing an active crawler, determining that the total number of WMSs continuously increased, while at the same time, some WMSs became invalid (i.e., the access URL becomes invalid or the GetCapabilities operation consistently fails to respond properly). Thus, the maintenance and stability of online WMSs are big issues.

So far, there has been little research conducted from the perspective of the map contents provided by WMSs. For the automatic detection of orthoimage layers offered by WMSs, Florczyk et al. [38] proposed a heuristic method that combines capabilities document description-based analysis with content-based computation. This research has been integrated with the Virtual Spain project to benefit Digital Elevation Model (DEM) and realistic 3D view generation.

Current research provides pioneering, but limited work. The major drawbacks are as follows: (1) none have conducted a global-scale resource survey, and the number of investigated WMSs discussed in the literature was small; (2) the analysis of the service content was limited. Only a few metadata properties were studied. Rarely does the existing research explain and discuss discoveries in relation to policy and technical issues. So far, we still only have limited knowledge of the global distribution of WMS servers, provider types or content (e.g., primary map subjects, data collection times and spatial coverage). The adoption and usage status of WMSs are also unclear, such as the most frequently-used service version, the updating status of services, map content or the widely-used Coordinate Reference System (CRS). Therefore, a global-scale investigation to grasp the resource distribution and adoption status of OGC standards is urgently needed.

2.3. Quality of Service Monitoring

The GetCapabilities, GetMap and GetFeatureInfo operations fulfil the functional requirements that a service is expected to implement during interoperation. In contrast, non-functional requirements, so-called Quality of Service (QoS) attributes, such as reliability, maintainability and performance, measure the overall properties of web services [39]. The Infrastructure for Spatial Information in the European Community (INSPIRE) established QoS requirements for spatial dataset viewing services. INSPIRE insists that QoS criteria, such as performance, capacity and availability, shall be ensured for regulatory requirements [40,41]. To evaluate the QoS offered by service providers, quality monitoring becomes imperative and urgent. Among these QoS criteria, availability and performance are especially expected by the user, since they explicitly impact the user experience [42].

To acquire quality data, various monitoring methods have been proposed. General test tools, such as Apache JMeter and LoadRunner, provide powerful capabilities for conventional performance and load tests. Utilizing these tools, experiments have been conducted to analyze the key performance factors of OGC web services [43–45]. However, general tools cannot parse service-specific data packages. As a result, the metadata of geospatial web services cannot be extracted, and advanced monitoring cannot be conducted automatically [42]. To address this issue, domain-oriented monitoring infrastructures were proposed. The Federal Geographic Data Committee (FGDC) established the Service Status Checker (SSC) to verify and grade various types of geospatial web services [46]. MapMatters is a monitoring platform providing a web portal to visualize the quality of WMSs exclusively [42]. To facilitate geospatial resources discovery, Li et al. [47] and Gui et al. [48] designed one-stop discovery portal prototypes to integrate service performance monitoring and visualization functions. In order to alleviate the loading burden on the monitored servers, Wu et al. [49] proposed a flexible framework to adjust the monitoring time interval dynamically according to the recent performance of selected services.

Currently, most of the existing monitoring frameworks employ a single monitoring site mode and sparse monitoring time intervals. However, web application performance varies in space and time, impacted by many factors [50]. Therefore, the performance data collected from one geo-location at one fixed time point cannot describe the performance in another geo-location or at another time. Biased monitoring data may mislead quality evaluation and service selection [50,51].

Although geospatial web service monitoring and analysis have yielded abundant outcomes, the following issues still need to be addressed. (1) Sophisticated monitoring strategies are needed to capture comprehensive performance data at an acceptable cost. A distributed monitoring framework must be developed on the most up-to-date distributed computing technologies and global cyberinfrastructure. For example, cloud computing and volunteer computing (i.e., a type of distributed computing in which computer owners donate their computing resources to support projects of others temporarily, such as SETI@home [52] and Climate@home [53]) technologies can extend the spatiotemporal coverage of monitoring sites [50]; (2) More metadata, access behaviors and fine-grained monitoring metrics could be recorded or monitored. For example, request error analysis is helpful to locate backend issues; (3) In addition to the basic statistics on performance and accessibility, advanced analysis is needed to reveal the spatiotemporal patterns in service performance (e.g., response time). These features are critical for QoS prediction and evaluation [54]. Service selection [49] and server-side optimization [55] could benefit from reliable QoS prediction and evaluation results.

These issues motivated us to conduct a thorough resource survey and quality analysis of global WMSs. To capture comprehensive QoS data, a distributed monitoring framework with 27 dispersed monitoring sites was deployed based on public cloud services. The metadata and QoS data of 46,296 WMSs from 72 countries were collected. We discovered imbalances and features in the server locations, provider types, supported service versions, popular map subjects, layer spatiotemporal distribution and supported Coordinate Reference System (CRS). We analyzed stability, request error types and potential reasons for error and discovered a power law for the response time distribution. We discuss the spatiotemporal features of response time for individual WMS and show how these discoveries could provide guidelines for service discovery and selection.

3. Data Collection and Methodology

The data collection and analysis workflow is illustrated in Figure 1. First, WMSs are discovered using a developed topic crawler. After importing the crawled WMSs into a database, the WMS resource survey and QoS analysis were conducted. The WMS resource survey is based on service metadata (i.e., capabilities documents retrieved from GetCapabilities operation). QoS analysis is based on our monitoring result on the two mandatory operations, GetCapabilities and GetMap. Routine monitoring permits acquisition or updating of long-term QoS behaviors (e.g., stability, performance) and the availability status (whether the URL is not valid any more) of all WMSs, while intensive monitoring captures the spatiotemporal features of QoS for selected services.

[Figure 1: workflow diagram with three parts: WMS search and import; WMS Resource Survey (capability documents retrieving and parsing, analysis of metadata properties and supported versions); Quality of Service (QoS) Monitoring and Analysis (routine monitoring, intensive monitoring, stability and performance analysis, advanced analysis of response time).]

Figure 1. Data collection and analysis workflow.

3.1. Online Web Map Service Discovery

The WMSs investigated in this research were all collected from the World Wide Web by our active topic crawler [56]. The discovery workflow is illustrated in Figure 2. To collect as many WMSs as possible and to ensure the global coverage of collected WMSs, the crawler adopted a hybrid search strategy that integrates search engine-based search and directed search. Common search engines (e.g., Google and Bing) have powerful crawling and indexing capabilities that capture global web pages. Therefore, a search engine-based search guarantees the breadth of search and, therefore, ensures that our search can reach the regions of the web indexed by common search engines. More to the point, WMS is a type of domain-specific web resource, usually published through geospatial web portals and catalogues, such as the OGC Catalogue Service for the Web (CSW). We used directed search to make a dedicated search of these SDIs using standard APIs and web page crawling. Reputable SDIs (e.g., data.gov [31], GEOSS clearinghouse [30], EuroGEOSS broker [57]) and other geospatial web portals found through search engines were set as seed pages.

In terms of search engine-based search, we developed two methods. The first method searches WMS directly. Keyword-based search (e.g., “WMS”, “Web Map Services” or “OGC”) or advanced search functions (e.g., in Google Search, we use “service = WMS” as a query constraint for an “inurl” statement) provided by the search engine were utilized to retrieve candidate web pages (i.e., seed pages) that may contain WMS URLs. Then, our crawler crawled these web pages recursively to discover WMSs. The second method searches WMS by locating ArcGIS REST Service Directories using keyword-based search (i.e., “ArcGIS REST Service Directory”) because many online service directories [58,59] generated by ArcGIS servers [20] provide OGC-compliant geospatial web services. To discover WMSs from such a service directory, the crawler firstly identified real service directory folder pages from retrieved search results by analyzing the HTML structure and content. Then, a dedicated and recursive crawl was conducted on the qualified web pages.


Figure 2. Search strategies and discovery workflow of web map services.

The procedure for WMS discovery from a specified web page was treated as a matching and validation process on the WMS capabilities document URLs hidden among all of the HTTP URLs included on the web pages. The HTTP request URL of a WMS capabilities document or its URL prefix is usually posted as a hyperlink or text on web pages explicitly. A canonical HTTP request for the WMS GetCapabilities operation contains multiple Key-Value Pairs (KVPs) as essential request parameters, such as “request = GetCapabilities” and “service = WMS”. Based on the WMS URL prefix, the capabilities document URL can be easily formed by appending a necessary query parameter. Thus, we used the KVP feature as a criterion to search or form URLs. To avoid unnecessary URL formation processes, only the hyperlinks whose anchor texts and URL syntax followed specified rules [56] were selected as candidate URL prefixes. Nevertheless, a URL with such a syntax structure cannot guarantee a valid WMS, and a further HTTP request is needed for validation. Our crawler accepts a URL as valid only if the payload of the HTTP response is a valid XML document that contains mandatory element tags (e.g., ). A valid WMS URL was added into the database if it did not replicate a URL already present in the database.
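The matching and validation steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the crawler from [56]: the helper names are hypothetical, the HTTP fetch is left to the caller, and validity is approximated by checking only the root element name of the capabilities document (`WMT_MS_Capabilities` in WMS 1.1.x, `WMS_Capabilities` in WMS 1.3.0) rather than the full set of mandatory elements.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Root element names used by WMS 1.1.x and WMS 1.3.0 capabilities documents.
CAPABILITIES_ROOTS = {"WMT_MS_Capabilities", "WMS_Capabilities"}


def to_capabilities_url(candidate):
    """Form a GetCapabilities URL from a candidate URL or URL prefix."""
    parts = urlsplit(candidate)
    # Keep provider-specific parameters, drop any existing service/request
    # KVPs (WMS parameter names are case-insensitive), then append ours.
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k.lower() not in ("service", "request")]
    kept += [("service", "WMS"), ("request", "GetCapabilities")]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), ""))


def looks_like_capabilities(payload):
    """Return True if payload is well-formed XML with a WMS capabilities root."""
    try:
        root = ET.fromstring(payload)
    except ET.ParseError:
        return False
    local_name = root.tag.rsplit("}", 1)[-1]  # strip an XML namespace, if any
    return local_name in CAPABILITIES_ROOTS


def register(candidate, payload, known_urls):
    """Add the URL to known_urls (a set) if it validates and is not a duplicate."""
    url = to_capabilities_url(candidate)
    if url in known_urls or not looks_like_capabilities(payload):
        return False
    known_urls.add(url)
    return True
```

In use, `payload` would be the body returned by requesting `to_capabilities_url(candidate)`; keeping the network call outside these helpers makes the matching and validation logic easy to test in isolation.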

3.2. Distributed Monitoring Framework and Strategy

To collect QoS monitoring data from geographically-dispersed monitoring sites, we developed a distributed monitoring framework that consists of three components (Figure 3). (1) The data collector collects service metadata and real-time performance data of WMSs using a group of monitoring sites. Each monitoring site monitors the assigned WMSs in parallel using multithreading technologies. A monitoring manager configures and coordinates the monitoring tasks of all monitoring sites and is in charge of data management; (2) The data access service provides both the RESTful and Simple Object Access Protocol (SOAP)-based web service APIs to retrieve monitoring data (e.g., monitoring sites, service metadata and historical performance in a given time period); (3) A web portal, developed based on our previous research [60], visualizes the layers, service performance, as well as the spatial distribution of the monitored WMSs and monitoring sites using maps and charts.


[Figure 3 diagram: the Data Collector (monitoring manager, monitoring sites, database), the RESTful/SOAP Data Access Service (monitoring sites info, service metadata, quality info) and the Web Portal (map layers visualizer, performance visualizer).]

Figure 3. The architecture of the monitoring framework.

Both routine monitoring and intensive monitoring were based on our distributed monitoring framework, but the monitoring strategies were different. Routine monitoring utilized several fixed monitoring sites to guarantee that the recorded availability of the WMSs was not affected by the connectivity of a single monitoring site. The two mandatory operations of each WMS were both tested on a weekly basis, with one request for each operation from each routine monitoring site per week. In contrast, to capture performance differences and investigate spatiotemporal patterns, intensive monitoring used more monitoring sites from different locations, and the monitoring time intervals were shorter and adjustable. In practice, to provide stable accessibility, some WMS providers may set up license policies [61] to prohibit excessively frequent accesses from a single IP address. We respected these policies and conducted distributed round-the-clock monitoring. Multiple monitoring sites were deployed at dispersed locations around the world. They worked collaboratively to capture the performance differences caused by the diversity of spatial positions from which users access the services.
In the temporal dimension, WMSs were divided into groups and monitored periodically, according to a configurable schedule. The monitoring schedule helped us to investigate the daily pattern in performance and to avoid being blocked by the WMS servers due to excessively frequent requests. Meanwhile, by dividing WMSs into groups, the accumulated response delay within a group could be controlled and the monitoring time interval for a single WMS could be guaranteed. After iterative monitoring and merging multiple-day monitoring data into one day, we obtained intensive daily monitoring records with almost equal time intervals for each mandatory operation of a selected WMS, i.e., a record around every five minutes.

For a consistent and comparable test, we created testing rules for the two operations. During GetMap operation monitoring, the first named layer [2] of the WMS was requested for testing. We restricted the output image to a height of 200 pixels, a width of 400 pixels and the PNG format. If this format was not supported, JPEG or other formats were specified accordingly.
Although output maps may be distorted as the geographic extent (i.e., bounding box) varies, the data volume can be roughly controlled, as well as the data transfer cost. Meanwhile, to eliminate the impact of differences in server-side processing procedures, we limited each request to a single layer, and requests that combined multiple layers simultaneously were excluded. However, a request upon cascading parent layers that contain child layers was allowed, since a WMS treats a parent layer as a single layer in terms of GetMap requests. For GetCapabilities operation monitoring, as a WMS may support multiple service versions, testing requests that do not specify the optional parameter "version" were used as the measurement to obtain QoS for the default version.
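The testing rules above translate into request URLs roughly as sketched below. The function names, the example endpoint and the default CRS are illustrative assumptions; the parameter names follow the WMS key-value-pair encoding, where WMS 1.3.0 uses `CRS` and earlier versions use `SRS`. The sketch also assumes that the endpoint URL carries no query string of its own.

```python
from urllib.parse import urlencode


def get_capabilities_url(endpoint):
    """GetCapabilities request that omits the optional VERSION parameter."""
    return endpoint + "?" + urlencode({"SERVICE": "WMS", "REQUEST": "GetCapabilities"})


def get_map_url(endpoint, layer, bbox, crs="EPSG:4326", version="1.3.0",
                image_format="image/png"):
    """GetMap request for one named layer at 400 x 200 pixels."""
    crs_key = "CRS" if version == "1.3.0" else "SRS"  # name differs by version
    params = {
        "SERVICE": "WMS",
        "VERSION": version,
        "REQUEST": "GetMap",
        "LAYERS": layer,   # a single layer only
        "STYLES": "",      # default style
        crs_key: crs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": 400,
        "HEIGHT": 200,
        "FORMAT": image_format,
    }
    return endpoint + "?" + urlencode(params)
```

Passing `image_format="image/jpeg"` would cover the fallback case in which a service does not offer PNG.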


3.3. Survey Data Collection

From September 2014 to November 2015, a total of 27 public cloud-based (Windows Azure) monitoring sites at 13 locations were deployed on four continents and in seven countries (Figure 4). All monitoring sites used the same virtual machine configuration, including the network, computing resources and operating system, to avoid any impact on monitoring results as much as possible. Four monitoring sites located in the U.S. (Bristow, VA; Redmond, WA), Ireland (Dublin) and China (Hong Kong) were constantly employed for routine monitoring from September 2014 to November 2015. The remaining 23 monitoring sites from 12 locations were set up for intensive monitoring from 23 August 2015 to 3 October 2015. At any time point during intensive monitoring, 12 sites from 12 different locations were selected to conduct monitoring simultaneously; other sites from replicated locations were used as alternatives.

Figure 4. The geo-locations of monitoring sites.

Over more than one year of routine monitoring, 46,296 WMSs (from 72 countries and six continents) collected by our web crawler were constantly monitored. Among these services, 41,703 WMSs in total were valid and contained 318,102 layers. Since the total number of WMSs was too large to conduct a comprehensive test, we selected 1210 WMSs to be intensively monitored for 42 days. To avoid the impact of access overload, the number of WMSs from a single provider, the same domain name or IP address was strictly limited to at most five during WMS selection. The global distribution of the selected WMSs was considered, as well. The GetCapabilities test was based on these 1210 WMSs. Among these WMSs, 876 WMSs were actually valid for access. We selected them to conduct further GetMap tests. Table 1 lists the number of WMSs from each continent in testing. In keeping with the round-the-clock monitoring strategy described in Section 3.2, we finished a monitoring cycle every six days. From each monitoring site location, we collected 2016 GetCapabilities QoS records (i.e., 48 records per day) for each WMS selected for the GetCapabilities test and 2016 GetMap QoS records for each WMS selected for the GetMap test.
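The record counts can be checked with a few lines of arithmetic. The first two results follow directly from the figures in the text. The last two rest on an assumption, namely that "merging multiple-day monitoring data into one day" means overlaying the six days of one monitoring cycle, which would reproduce the roughly five-minute spacing mentioned in Section 3.2.

```python
MINUTES_PER_DAY = 24 * 60

records_per_site_per_day = 48   # stated in the text
monitoring_days = 42            # stated in the text
cycle_days = 6                  # one monitoring cycle, stated in the text

# Records per WMS and operation from one monitoring site location.
records_per_site = records_per_site_per_day * monitoring_days

# Interval between two requests to one WMS from one site location.
interval_per_site_min = MINUTES_PER_DAY / records_per_site_per_day

# Assumption: the days of one cycle are merged into a single day.
merged_records_per_day = records_per_site_per_day * cycle_days
merged_interval_min = MINUTES_PER_DAY / merged_records_per_day

print(records_per_site, interval_per_site_min,
      merged_records_per_day, merged_interval_min)
```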

Table 1. The WMSs selected in intensive monitoring.

| WMSs | North America | Europe | Asia | South America | Africa | Oceania | Total |
|---|---|---|---|---|---|---|---|
| GetCapabilities test | 718 | 428 | 11 | 16 | 3 | 34 | 1210 |
| GetMap test | 526 | 306 | 4 | 15 | 0 | 25 | 876 |


To conduct a comprehensive investigation, we collected more metadata fields (e.g., contact information, CRS, spatial coverages of layers) and more fine-grained quality information than the research reported in the literature [42–45,49,50]. Response time, error type, size of response message, download speed and relevant monitoring site were recorded from each monitoring request. Response time is affected by the latency on the client, network and server side. To analyze the key impact factors of WMS response times in the future, we obtained fine-grained response time costs for different phases in the interaction. The total response time in a complete HTTP request-and-response operation was decomposed into different time spans using the cURL command line tool [62]. The attributes of these time spans include Domain Name System (DNS) parsing time; connecting time; total time for sending a request and server processing; and the data transfer time. In addition, the average response time and the successability for each WMS were updated periodically as quality metrics.
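One way to obtain such a decomposition is from cURL's cumulative `--write-out` timers, as sketched below. The mapping of the four time spans onto `time_namelookup`, `time_connect`, `time_starttransfer` and `time_total` is an assumption, since the text does not name the variables used. The function name is hypothetical, and TLS handshake time is ignored.

```python
# The timers could be collected with a command such as:
#   curl -s -o /dev/null \
#        -w "%{time_namelookup} %{time_connect} %{time_starttransfer} %{time_total}" URL
# cURL reports each timer in seconds since the start of the transfer.


def split_response_time(timers):
    """Convert cumulative cURL timers (seconds) into per-phase durations."""
    return {
        # DNS parsing time
        "dns": timers["time_namelookup"],
        # connecting time (TCP connect after name resolution)
        "connect": timers["time_connect"] - timers["time_namelookup"],
        # sending the request plus server processing, until the first byte
        "request_and_server": timers["time_starttransfer"] - timers["time_connect"],
        # data transfer time
        "transfer": timers["time_total"] - timers["time_starttransfer"],
    }
```

By construction the four phases sum to `time_total`, so the decomposition accounts for the full response time.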

4. Data Analysis

4.1. Global WMS Resource Survey

In the resources survey, primary metadata properties, including the title, abstract, keywords, coordinate reference system, version and provider, were extracted from the WMS capability documents to investigate service usage and resource distribution.

4.1.1. Server Location and Provider Type

The geo-location of monitored public WMSs suggests that North America (especially the U.S.) and Europe have the most abundant WMS resources (Figures 5 and 6a). In the U.S. and Europe, Open Government Data initiatives promote the accessibility and re-use of official data and information by citizens, communities and developers via open development repositories. Spatial data infrastructures (SDIs) developed in these regions are widely used by cross-domain users all over the world for geospatial resource discovery and sharing. For example, data.gov [31] integrates the U.S. government's open data to build a one-stop data catalog, and the INSPIRE geoportal [63,64] is a European Union (EU) SDI to facilitate the sharing of environmental spatial information for public use. As a result, the global accessibility of geospatial resources published in the U.S. and Europe is significantly enhanced, and online service exploration becomes much easier. The geo-location distribution of WMSs does not necessarily mean there are more map resources in the U.S. and Europe. This distribution indicates that OGC service standards are more widely adopted for publishing open data in those places. Consequently, these resources are more easily reached by global public users through geoportals and search engines.

Figure 5. The geo-location of monitored public WMSs.


Figure 6. The regional distribution and types of monitored public WMS providers; (a) while most services are in North America and Europe, among the remaining 0.6%, Asia contains 0.23%; (b) providers by sector; government has the largest proportion (37.69%), followed by academic institutions (34.19%), while industry has the smallest proportion (1.78%).

From an analysis of the 13,352 WMSs from 989 providers that include provider information, such as organization tags in their capability documents, we found that the server locations and provider types of WMSs are imbalanced. The WMS standard draws wider support from governments, academic institutions and intergovernmental organizations to share non-profit (e.g., public welfare and resources) geospatial data (Figure 6b) than from industry. Providing public data services is one of the duties of governments and intergovernmental organizations in order to benefit society. Therefore, they actively publish data related to Societal Benefit Areas (SBAs) [65] and other public interest areas using open standards. Academic institutions also play a major role in proposing and using open standards to facilitate scientific data sharing. In contrast, services provided by industry only represent a small proportion (1.78%), because WMS is an open standard for geospatial data access rather than a commercial-level data exchange protocol.

By counting the number of WMSs published by each of the identified 989 providers, we found that the top ten service providers (1%) of analyzed WMSs published 7367 WMSs (55.18% of the 13,352 in total), but the number of WMSs they offered varies significantly (Table 2). The Earth Data Analysis Center (EDAC) of the University of New Mexico and the National Oceanic and Atmospheric Administration (NOAA) make the largest contributions. EDAC has hosted and managed the New Mexico Resource Geographic Information System (RGIS) [66] for over 23 years specifically to share statewide public geospatial data. The data are managed by the RGIS data repository and published through OGC-compliant web services. NOAA, as well, provides a huge amount of oceanic- and atmospheric-related data through WMS. The imbalanced service contribution among providers follows a power law [67], as shown in Figure 7, and reflects the differences in geospatial resource possession, data sharing policies and effectiveness.
Table 2. Top ten service providers in the monitored WMSs.

| Provider Name | Service Amount | Provider Type |
|---|---|---|
| Earth Data Analysis Center (University of New Mexico) | 3202 | Academic institution |
| National Oceanic and Atmospheric Administration (NOAA) | 2958 | Government |
| Food and Agriculture Organization of the United Nations (FAO) | 295 | Intergovernmental organization |
| Vlaamse Overheid (Flemish Government) | 262 | Government |
| Senatsverwaltung für Stadtentwicklung und Umwelt (Senate Department for Urban Development and the Environment of Berlin) | 169 | Government |
| United States Geological Survey (USGS) | 133 | Government |
| Landesamt fuer Geologie und Bergbau, Rheinland-Pfalz (Department for Geology and Mining of Rheinland-Pfalz) | 101 | Government |
| Arizona Geological Survey | 98 | Government |
| Illinois State Geological Survey | 91 | Government |
| Kansas Biological Survey | 58 | Government |

Figure 7. Log-log plot of the Cumulative Density Functions (CDF) p(x) for the number of WMSs contributed by each provider according to a discrete power law with $\alpha = 1.792$ and $x_{\min} = 1$.
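For reference, a discrete power law with lower bound $x_{\min}$ is commonly written in the form below. This is the standard textbook expression and is only assumed to match the model fitted here, since the text does not spell out the formula; $\zeta$ denotes the Hurwitz zeta function, which serves as the normalizing constant.

```latex
p(x) = \frac{x^{-\alpha}}{\zeta(\alpha, x_{\min})},
\qquad
\zeta(\alpha, x_{\min}) = \sum_{n=0}^{\infty} (n + x_{\min})^{-\alpha},
\qquad
x = x_{\min},\, x_{\min}+1,\, \dots
```

With $x_{\min} = 1$ the normalizing constant reduces to the Riemann zeta function evaluated at $\alpha$.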

We further summarized the prevalent software used to publish WMSs by analyzing WMS Uniform Resource Locators (URL) and capabilities documents. Among a total of 46,296 WMSs, there were 3484, 1987 and 515 WMSs published by ArcGIS Server, GeoServer and MapServer, respectively; the publishing software for the remainder is unknown. Government agencies published 1721 of the ArcGIS-based WMSs. In contrast, only 229 WMSs from government agencies were published using GeoServer and MapServer. The two open-source servers have academic origins and hence are more widely adopted in the academic sector than in the public sector. Both commercial and open-source geospatial software contribute to WMS publishing, but open-source tools are more popular in academia. Commercial software, such as ArcGIS, is more likely to be used by government agencies given the governmental contracts with software companies.

4.1.2. Popular Map Subjects

Investigating the map subject matter of global WMSs can benefit global cross-disciplinary users in discovering and selecting map resources, as well as increasing our understanding of open data sharing policies, public issues, and trends in the Earth sciences and other disciplines. Although the languages used to describe WMS metadata vary (e.g., English, German, French, Dutch, Italian, Norwegian, Swedish and Chinese), according to our investigation, around 87.85% of the valid WMSs (36,638) contained English words in their service descriptions, and 82.57% of the layers contained keyword fields that were described in English. Subsequently, we studied the top map layer subjects based purely on English keywords to reduce the complexity of data analysis. We extracted a group of English keywords for each map layer from layer description fields (e.g., title, abstract, keywords) of capability documents. The stop words, replicated words and words that were not nouns were eliminated during extraction. Then, we calculated and sorted the occurrence frequency of keywords among all map layers.

Figure 8 shows the top English language keywords that had the highest word frequencies. According to the GEOSS Societal Benefit Areas (SBAs) [65] and INSPIRE directive [63], we analyzed the obtained keywords and found that the following subjects related to the natural environment and resources appear most frequently: geology, climate, energy, land cover, water, biodiversity, agriculture and ecosystem. Given the explosive growth in Earth observation technologies over the past several decades, governments, academic institutions and non-profit organizations have collected and processed a large amount of geographic data about natural phenomena. Many of the data were published using standard OGC services and form the majority of open map resources. For example, the INSPIRE directive is primarily oriented to share spatial environmental information among public sector providers [63].
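The keyword extraction and counting steps described in this subsection can be sketched as follows. This is a simplified illustration under stated assumptions: the stop-word list is a tiny placeholder, the function names are hypothetical, and the removal of non-noun words is omitted because it would need a part-of-speech tagger.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real list would be much longer.
STOP_WORDS = {"the", "of", "and", "in", "for", "a", "an", "to", "by", "on"}


def layer_keywords(*fields):
    """Lower-cased, de-duplicated keywords from one layer's description fields."""
    words = re.findall(r"[a-z]+", " ".join(fields).lower())
    return {w for w in words if w not in STOP_WORDS}


def keyword_frequencies(layers):
    """Count in how many layers each keyword occurs, most frequent first."""
    counts = Counter()
    for fields in layers:
        counts.update(layer_keywords(*fields))
    return counts.most_common()
```

Because each layer contributes a set of keywords, a word repeated within one layer's title and abstract is counted once for that layer.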

Figure 8. Top English language keywords found in service descriptions for map layers.

The big data era has arrived, and given the advancement of location-aware technologies, social sensing is becoming easier and less expensive. Accordingly, social and economic activity-related data are dramatically increasing. The Socioeconomic Data and Applications Center of Columbia University (SEDAC) published a WMS [68] to represent the spatiotemporal distribution of population size, population density and education degree in America. The National Geomatics Center of China (NGCC) published thematic maps of Chinese employment and wages in 2013 as a WMS [69]. It is expected that online maps with social behaviors and economic activities as primary subjects will grow exponentially in the near future.

4.1.3. Spatial Coverages of Map Layers

By analyzing the geographic extent (i.e., bounding box) of 318,102 map layers (Figure 9), we found that continents are covered more intensively than the oceans, except for Antarctica, and the Northern Hemisphere has more coverage than the Southern Hemisphere. Many of the layers have a global extent (more than 25,000 layers). This phenomenon reveals our global Earth observation behaviors and also reflects human activities in space, to some degree. The differences in the richness of open geospatial data also reflect the differences in data sharing policies at the country and regional levels. More specifically, spatial coverages of map layers are concentrated in North America and Europe, especially over the mainland of the U.S. and Europe, and Northern Africa has the second largest number of coverages, after North America and Europe. However, due to information security and data policy issues, geospatial resources are strictly controlled in some countries. Geospatial data are often shared internally and between governmental agencies through hardcopy or secured private networks rather than by publishing them through standardized web services for public use.


Figure 9. The spatial coverages of map layers.

4.1.4. Yearly Distribution of Map Layers and WMSs with Current Map Layers

The data collection time is a measure of the timeliness of geospatial data (i.e., data currency). Among all 318,102 layers, 62,925 layers from 4587 WMSs contain data collection times in their capability documents. For example, the SEDAC GeoServer WMS [68] embeds the data collection time directly in the layer name field (e.g., "Population Density 2000"), and the NASA Earth observation WMS [70] describes an extended time dimension in each layer (e.g., "2015-01-01/2015-08-29/P8D"). Using such information, we can estimate approximately the update status of map layers or service metadata for a WMS. As data become more open, the map resources collected each year are increasing in number. At the same time, most of these data published using WMSs are up-to-date data in general, not archived historical data (as shown in Figure 10).

Figure 10. Yearly distribution of map layers and WMSs with current map layers.

Figure 10 illustrates that the significant increase in WMSs and map layers in 2000 and 2006 was strongly associated with the release of the WMS standards Version 1.0.0 and Version 1.3.0, respectively. The release of the ESRI RESTful Service APIs and successive updates, which are compliant with the WMS standard, caused another significant increase, starting in 2010. Moreover, INSPIRE required member states to provide discovery and view services (i.e., WMS and WMTS) in 2011 at the latest, which may also have promoted the deployment and updating of WMSs. These trends reveal that new technologies, standards and software products are adopted relatively rapidly in the geoinformation domain. However, the enthusiasm for exploring new technologies is hard to maintain when it comes to routine service maintenance. Many of the existing WMSs seldom add new layers or update layer metadata after deployment. Thus, the data collection time for the latest layers in 4124 WMSs among the 4587 WMSs studied (89.91%) is earlier than the year 2013. This may affect the quality of map resources found in online WMSs.
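The two ways of recording data collection time mentioned in Section 4.1.4 (a year embedded in the layer name, and a "start/end/period" time dimension) can be read with a few lines of code. The following is a minimal sketch, not the extraction procedure used in this study; the function names are hypothetical, and only day-based periods such as "P8D" are handled.

```python
import re
from datetime import date, timedelta


def parse_time_extent(extent):
    """Split a 'start/end/period' extent into (start, end, step).

    Assumption: the period is expressed in whole days (e.g. 'P8D').
    """
    start_s, end_s, period_s = extent.split("/")
    match = re.fullmatch(r"P(\d+)D", period_s)
    if match is None or int(match.group(1)) == 0:
        raise ValueError(f"unsupported period: {period_s}")
    step = timedelta(days=int(match.group(1)))
    return date.fromisoformat(start_s), date.fromisoformat(end_s), step


def latest_timestamp(extent):
    """Return the last time step that does not exceed the end date."""
    start, end, step = parse_time_extent(extent)
    n_steps = (end - start).days // step.days
    return start + n_steps * step


def year_from_layer_name(name):
    """Return the first four-digit year (19xx or 20xx) in a layer name, if any."""
    match = re.search(r"\b(19|20)\d{2}\b", name)
    return int(match.group(0)) if match else None
```

For the examples quoted above, `latest_timestamp("2015-01-01/2015-08-29/P8D")` yields 29 August 2015 and `year_from_layer_name("Population Density 2000")` yields 2000. A layer whose latest year falls before a chosen cutoff could then be counted as not current.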

4.1.5. Supported Coordinate Reference Systems and Service Versions

Among the 318,102 layers examined, various ellipsoidal CRSs and projected CRSs were supported. Ellipsoidal CRSs were supported by 97.57% of the layers. Web Mercator, Universal Transverse Mercator, Antarctic stereographic and Albers are the most used projections (Table 3). Web Mercator was supported by 76.39% of the layers. Maps in this projection can be conveniently visualized on the web, since it is easy to conduct map splitting and seamless splicing. Meanwhile, Web Mercator guarantees the correctness of direction and relative position on maps. Therefore, it is widely adopted, and many online public map services use this projection. Some layers support the Antarctic stereographic projection or Albers projection due to the geographic extent of the maps or because of their specific application needs. For example, administrative region maps may need equal-area projection.

Table 3. Common map projections of widely-supported projected Coordinate Reference Systems (CRSs).

Map Projection                      Layers Amount and Percentage    CRS Sample
Web Mercator                        241,779 (76.39%)                EPSG:3857, EPSG:102100
Universal Transverse Mercator       163,092 (51.52%)                EPSG:26919, EPSG:32633
Antarctic stereographic projection  122,399 (38.67%)                EPSG:3031
Albers projection                   119,024 (37.61%)                EPSG:3005

Among 41,703 valid WMSs, there were 9920, 10,122, 41,325 and 40,861 WMSs supporting Versions 1.0.0, 1.1.0, 1.1.1 and 1.3.0, respectively. Therefore, most of the WMSs were implemented with geospatial software instances compliant with the latest WMS standard Version 1.3.0, released in 2006, and remain compatible with Version 1.1.1. In Version 1.3.0, the access interfaces were unified, and the data structure of response messages was refined. Therefore, the version is much more mature than previous versions, contributing to the high adoption rate. Meanwhile, after several years of popularization, the recognition of OGC standards was significantly promoted among users. More open-source and commercial GIS software developers began to support the latest OGC standards. It was also noted that most of the WMSs with Version 1.3.0 are backward compatible with Version 1.1.1, but only around 1/4 of WMSs are backward compatible with the earlier versions (1.0.0 and 1.1.0).
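One practical consequence of supporting several versions is that a client must adapt its request parameters to the negotiated version. The sketch below builds a GetMap URL and switches the reference-system key, which is named SRS in Version 1.1.1 and CRS in Version 1.3.0. The endpoint, layer name and helper name are hypothetical and serve only as illustration.

```python
from urllib.parse import urlencode


def build_getmap_url(base_url, version, layer, bbox,
                     crs="EPSG:3857", width=256, height=256,
                     fmt="image/png"):
    """Build a WMS GetMap URL for the given protocol version."""
    params = {
        "SERVICE": "WMS",
        "VERSION": version,
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": fmt,
    }
    # Version 1.3.0 renamed the reference-system parameter from SRS to CRS.
    # Note: 1.3.0 also follows the axis order defined by the CRS itself
    # (e.g. latitude first for EPSG:4326), which this sketch does not handle.
    params["CRS" if version == "1.3.0" else "SRS"] = crs
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and layer, shown for both versions.
bbox = (-20037508.34, -20037508.34, 20037508.34, 20037508.34)
url_130 = build_getmap_url("https://example.org/wms", "1.3.0", "demo", bbox)
url_111 = build_getmap_url("https://example.org/wms", "1.1.1", "demo", bbox)
```

Keeping the version-dependent detail in one place lets a monitoring client probe the same layer under each version a server advertises.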

4.2. Stability and Performance Analysis

Stability and performance are two essential service-level measurement quality factors for web services [71] and software entities [72]. Stability measures the reliability and the maintainability of a software entity by investigating the runtime robustness, while performance measures runtime efficiency. Evaluating these two factors is significant in that it may provide guidelines for WMS selection and server-side improvements for service consumers and providers, respectively. In this section, based on the intensive monitoring result of 1210 WMSs (listed in Table 1), we analyze the overall status of the two factors using selected metrics.

4.2.1. Stability Analysis

To analyze the stability, we investigated accessibility, successability and error types based on two mandatory operations. Accessibility represents the probability that a web service operation is accessible (receiving acknowledgement or response message) while the service is available. Successability is the ratio of successful responses to all requests in a time period and measures the ability to correctly respond to user requests. Error type describes briefly the reason why a request failed. We categorized accessibility into three types in this research: always accessible, temporally inaccessible and constantly inaccessible. Table 4 shows that many WMSs are constantly inaccessible (i.e., invalid WMSs), since these are unavailable or they may have changed URLs. Meanwhile, WMS operations identified as temporally inaccessible were a nontrivial problem, caused by network and service maintenance issues. Since we cannot detect or be informed of the maintenance times of all WMSs, maintenance time was not excluded from the accessibility analysis. The accessibility of GetMap is worse than that of GetCapabilities even for the valid WMSs. Only around 1/5 of layer GetMap operations are always accessible. The successability histograms for GetCapabilities and GetMap seen in Figure 11 also indicate that the successability of GetCapabilities is higher than that of GetMap for the valid WMSs.

Table 4. Accessibility of GetCapabilities and GetMap operations.

Accessibility             GetCapabilities for All Selected WMSs    GetMap for Valid WMSs
Constantly inaccessible   27.60%                                   17.21%
Temporally inaccessible   13.64%                                   61.43%
Always accessible         58.76%                                   21.36%

Figure 11. Successability histograms of the two mandatory operations for valid WMSs: (a) GetCapabilities; (b) GetMap.

To analyze operation errors, we classified various detailed error types into two categories. The server access error represents the errors that occurred when connecting to the servers, e.g., being unable to connect to the host, a time-out or no response from the server. Request processing errors happen during server-side processing after successfully connecting to the server, e.g., semantic error of request, server refusing to execute the request or server overload. From Table 5, we can see that more errors were caused by request processing errors for GetMap. On the one hand, the processing of GetMap operations is relatively more complex than GetCapabilities. GetMap needs to load geospatial data and conduct requisite geoprocessing (e.g., subsetting, transformation and rendering) to generate maps, while GetCapabilities only responds with a metadata document that can be generated in advance. On the other hand, incorrect or outdated layer descriptions in the capabilities document are another reason for request processing errors. We can conclude that stability varies significantly for services and operations. Accessibility to the metadata cannot guarantee the accessibility to the map layers. Therefore, the stability of WMS servers and metadata timeliness maintenance are big issues and need to be further addressed by service providers.
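The stability metrics defined in Section 4.2.1 can be derived from a log of monitoring probes. The following is a minimal sketch under stated assumptions: the `Probe` record and its two fields are hypothetical simplifications of a monitoring log, and the three accessibility categories are assigned from the whole observation period of one operation.

```python
from dataclasses import dataclass


@dataclass
class Probe:
    reached_server: bool  # an acknowledgement or response message was received
    succeeded: bool       # the response was the expected, valid result


def accessibility_category(probes):
    """Classify one operation over its whole monitoring period."""
    if not probes:
        raise ValueError("no probes recorded")
    reached = [p.reached_server for p in probes]
    if all(reached):
        return "always accessible"
    if not any(reached):
        return "constantly inaccessible"
    return "temporally inaccessible"


def successability(probes):
    """Ratio of successful responses to all requests in the period."""
    if not probes:
        raise ValueError("no probes recorded")
    return sum(p.succeeded for p in probes) / len(probes)


def error_type(probe):
    """Map a failed probe to one of the two error categories."""
    if probe.succeeded:
        return None
    if probe.reached_server:
        return "request processing error"
    return "server access error"
```

A failure after the server was reached is counted as a request processing error, and a failure to reach the server at all as a server access error, mirroring the two categories described above.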

Table 5. Error types in GetCapabilities and GetMap operations.

Error Type                 GetCapabilities    GetMap for Valid WMSs
Request processing error   61.64%             76.42%
Server access error        38.36%             23.58%

4.2.2. The Power Laws in Response Times

Response time, maximum throughput and computing resource occupation are three common measures of performance. In our research, we selected response time as the performance measurement because testing maximum throughput requires intensive concurrent requests and may result in request rejection by service providers, while computing resource occupation measurements are hard to obtain. We investigated the overall trends in the response times of all tested WMSs by analyzing the minimum, average and maximum response times of the two mandatory operations recorded for each valid WMS among all successful responses from all monitoring sites. Figure 12 indicates that the response times of the two mandatory operations for valid WMSs, in general, follow power laws. Most WMSs respond rapidly, but a few have very long response times. The numerical differences between the minimum and the maximum response times are illustrated in Figure 13, reflecting instability in WMS performance. The vertical red lines in the bar charts for average response times indicate that a majority of valid WMSs (more than 80%) can respond to user requests within three seconds in most cases. In contrast, our previous investigations show that less than 40% of WMS GetCapabilities and GetMap operations responded within eight seconds [48,49]. Thus, the overall response time of WMSs was significantly reduced, reflecting improvement in WMS software and hardware environments, as well as upgrades in the global network, to some extent.

Figure 12. Log-log plot of the CDFs P(x) for minimum, average and maximum response times of (a) GetCapabilities operations and (b) GetMap operations for selected WMSs reflecting continuous power laws; a Kolmogorov–Smirnov test (p > 0.05) indicated that the original data were likely to be drawn from the fitted power-law distribution. The plots show a sharp change in the upper boundaries of the response time dropping significantly at about 60 s, especially in the maximum response time, because the maximum timeout was set to 60 s during monitoring.
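A continuous power law can be fitted to response times with the standard maximum-likelihood estimator, and the Kolmogorov–Smirnov distance then measures how far the empirical CDF lies from the fitted one. The sketch below illustrates that general technique and is not the fitting code used in this study. The lower cutoff `xmin` is taken as given, and the step that converts the KS distance into a p-value (commonly done by bootstrapping) is omitted.

```python
import math


def fit_power_law(samples, xmin):
    """Fit p(x) ~ x^(-alpha) for x >= xmin and return (alpha, ks_distance).

    alpha is the maximum-likelihood estimate
        alpha = 1 + n / sum(ln(x_i / xmin)),
    and the model CDF is F(x) = 1 - (x / xmin)^(-(alpha - 1)).
    """
    tail = sorted(x for x in samples if x >= xmin)
    n = len(tail)
    log_sum = sum(math.log(x / xmin) for x in tail)
    if n == 0 or log_sum == 0.0:
        raise ValueError("not enough data above xmin")
    alpha = 1.0 + n / log_sum

    # KS distance: largest gap between the model CDF and the empirical
    # step function, checked on both sides of every step.
    ks = 0.0
    for i, x in enumerate(tail):
        model_cdf = 1.0 - (x / xmin) ** (-(alpha - 1.0))
        ks = max(ks, abs(model_cdf - i / n), abs(model_cdf - (i + 1) / n))
    return alpha, ks
```

A small KS distance suggests that the tail is compatible with a power law. With a hard 60 s timeout, as in the monitoring described here, values at the cutoff are censored and would distort the upper tail, so they may need to be excluded before fitting.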


Figure 13. Unit area histograms for the minimum, average and maximum response times of (a) GetCapabilities operations and (b) GetMap operations for selected WMSs; the vertical red line in each chart divides the WMSs into 80% and 20% proportions, respectively. The density, calculated as frequency/(total_frequency*bin_width), shows the proportions of WMSs per unit of response time.

4.2.3. Spatiotemporal Characteristics of Response Times

The response time of a web service is affected by various factors. Among these factors, the network connectivity between service providers and users, instantaneous network condition, as well as the concurrency pattern of global users can be generalized as spatiotemporal access factors. In this section, we explore the relation between spatiotemporal access factors and the response time.

(1) Spatial characteristics

Response time is significantly impacted by network connectivity, and the connectivity of a network in cyber space relies on the establishment of network equipment and their linkages in geographic space. Therefore, the association between spatial distance and response time is inherent. We found that 60.27% of valid WMSs (528 of 876 in total) obtained the shortest average response times from their closest monitoring site locations globally. At the continental level, this trend was more apparent (Table 6). Most of the WMSs tend to get the shortest response time from the monitoring sites in the same regions as the server locations, except in the case of South America. We also calculated the linear regression coefficient of determination (R²) of the response time for individual WMSs using the average response times at each monitoring location for multiple locations including both the public cloud-based sites and local sites. The average R² was 74.08% for all 876 WMSs. Figure 14 reveals a positive correlation between average response time and the spatial distance from monitoring site to server. The scatter plot in Figure 14a is for all 876 valid WMSs. We can see a positive correlation, but the graph also shows data heterogeneity, since response time is impacted by many other factors, such as the response data volume, server performance, network bandwidth and the distribution of global optical cables. The impact from these factors was partially reduced by selecting 393 WMSs located in the U.S. whose capabilities documents were less than 1 MB and had an average response time of less than 2 s. The positive correlation becomes more visible in Figure 14b. Furthermore, the shorter the distance, the smaller the variance in response times, as response time becomes more stable when the uncertainty of a network is reduced. Although the response time is impacted by many factors and hard to predict precisely, we make the following suggestions. From the perspective of service selection, a WMS that has a closer geographic distance to users may have a higher priority among services with comparable functionalities and map resources. From the perspective of performance optimization, a map server should be deployed as close as possible to potential users. Cloud computing can be utilized to achieve dynamic spatiotemporal deployment of servers, and the state-of-the-art site selection algorithms could be developed to improve performance.

Table 6. The percentage of WMSs from each continent that yielded the shortest average response times from the monitoring sites on each continent.

                   WMSs by Continents (Service Amount)
Monitor Location   North America (526)   Europe (306)   Asia (4)   South America (15)   Oceania (25)
North America      91.25%                3.59%          0          60%                  4%
Europe             3.04%                 95.10%         0          0                    0
Asia               3.80%                 0.65%          100%       6.67%                96%
South America      1.90%                 0.65%          0          33.33%               0
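The spatial analysis in this section rests on two computations: the distance between a monitoring site and a server, and the coefficient of determination of a linear fit between distance and average response time. The following is a minimal sketch, assuming great-circle (haversine) distance on a spherical Earth as the distance measure; the paper does not state which measure was used, and all function names are hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def r_squared(xs, ys):
    """R^2 of a simple linear regression of ys on xs.

    For one predictor, R^2 equals the squared Pearson correlation.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("constant input has no defined R^2")
    return (sxy * sxy) / (sxx * syy)


def closest_site_is_fastest(sites):
    """sites maps a site name to (distance_km, average_response_seconds)."""
    nearest = min(sites, key=lambda s: sites[s][0])
    fastest = min(sites, key=lambda s: sites[s][1])
    return nearest == fastest
```

Applying `closest_site_is_fastest` to every WMS and counting the positive cases would give a share comparable to the 60.27% reported above, while `r_squared` over the per-site pairs of one WMS gives its individual R².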

Figure 14. Correlation between response times and spatial distance from monitoring sites to WMS servers. (a) The 876 valid WMSs and (b) the selected 393 WMSs located in the U.S. whose capabilities documents were less than 1 MB with average response times less than two seconds.

Figure 15. Time series characteristics of WMS response times.


(2) Time series characteristics

The monthly response time series for a WMS from a single monitoring site is steady in general, but it combines a few prominent random fluctuations with many small local variations. Within the 24 h of a day, a time series shows local fluctuations. As shown in Figure 15, for a WMS [73] provided by Arizona Geological Survey, there is a set of intensive peaks between 8:00 A.M. and 11:00 A.M. in the local time of the service, while the response time fluctuates slightly during the night. This phenomenon reveals the local network status, as well as concurrent accesses to a WMS during a certain time period, to some extent. The local network traffic and user concurrency from the same or neighboring time zones to the WMS server are relatively small in the nighttime, but increase sharply during the daytime. Concurrent access to the WMS generates server-side load pressure. The fluctuation is more violent for the WMSs with a longer average response time, since the network condition has a huge impact on the stability of the response time [50].
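A daily pattern such as the morning peaks described above can be exposed by grouping response times by the hour of day in the server's local time. The sketch below assumes that each record is a pair of a timezone-aware UTC timestamp and a response time in seconds, and that the server's offset from UTC is fixed; the function name and record layout are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def hourly_mean_response(records, utc_offset_hours):
    """Average response time per local hour of day (0-23).

    records: iterable of (timezone-aware datetime, seconds).
    utc_offset_hours: fixed offset of the server's local time from UTC.
    """
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    buckets = defaultdict(list)
    for timestamp, seconds in records:
        buckets[timestamp.astimezone(local_tz).hour].append(seconds)
    return {hour: sum(v) / len(v) for hour, v in sorted(buckets.items())}


# Two daytime probes and one nighttime probe, with an offset of UTC-7.
records = [
    (datetime(2015, 6, 1, 16, 0, tzinfo=timezone.utc), 2.0),
    (datetime(2015, 6, 1, 16, 30, tzinfo=timezone.utc), 4.0),
    (datetime(2015, 6, 2, 6, 0, tzinfo=timezone.utc), 0.5),
]
profile = hourly_mean_response(records, utc_offset_hours=-7)
```

Here the two probes at 16:00 and 16:30 UTC fall into local hour 9 and the probe at 06:00 UTC into local hour 23. A fixed offset ignores daylight saving time, so a region-aware time zone would be needed for servers that observe it.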

5. Conclusions and Future Work

5.1. Conclusions

We conducted a comprehensive WMS resource survey and quality analysis for global WMSs based on a proposed distributed monitoring framework. Based on a WMS resource survey of 41,703 valid WMSs, we found that the WMS standard is widely adopted with an imbalanced distribution. More specifically:

(1) The providers and server locations are extremely imbalanced. A few providers provided a large amount of public WMSs. WMSs are readily adopted by governments, academic institutions and intergovernmental organizations. In contrast, the contribution from companies is relatively small, since WMS is an open standard for geospatial data access rather than a commercial-level data exchange protocol. Public WMSs also have a skewed spatial distribution due to data policy issues and the imbalanced development of SDIs. Specifically, North America (especially the U.S.) and Europe contributed most of the public WMSs (around 99%).
(2) Map resources are abundant, but also disproportionate. The natural environment and resources are the dominant map subjects. North America and Europe have the most concentrated layer coverages. Most of the map data were collected since 2000 when the first version of WMS 1.0.0 was released.
(3) The ellipsoidal coordinate system is supported by most of the WMSs, and the Web Mercator projection is widely supported. Most WMSs are published based on the latest Version 1.3.0 and compatible with Version 1.1.1, but the downward compatibility with the old versions (i.e., 1.0.0 and 1.1.0) is deficient.

From the quality analysis, we found that the quality monitoring, evaluation and optimization are imperative and of critical importance for WMS.

(1) The quality of the WMSs varies on services, operations and request parameters. Plenty of WMSs are inaccessible due to invalid URLs. GetMap has weak stability and accessibility when compared to GetCapabilities due to the relatively complicated processing in the GetMap operation and incorrect layer descriptions. Request processing errors are the major factor that causes request failures.
(2) The response times of all valid WMSs obey power laws. The majority of WMSs can respond rapidly (within three seconds) generally, while a small number of them have long response times. However, when compared to contemporary commercial online map services, the large interval between the average and the maximum response times reveals ubiquitous performance issues in public WMSs.
(3) The response time shows spatiotemporal patterns. Our experiment results indicate a positive correlation between WMS response time and the spatial distance from users to servers. The closest monitoring site tends to have the smallest average response time. Furthermore, the shorter the distance, the slighter the fluctuation and the more stable the response time. The trend in the response time series fluctuates significantly with local network traffic and synthesizes the minor random variances.

These findings are important to understand the factors affecting service performance. Our investigation provides a valuable guideline for selecting WMS resources and optimizing WMS performance.

5.2. Suggestion and Future Work To improve WMS discovery, selection and application, suggestions for standards developers and our potential future research directions include:

(1) Redesigning or redefining the WMS standard and improving both client-side and server-side functionality. The current WMS standard provides very simple and easy-to-use operations to retrieve maps rendered on the server side in widely-used industrial image formats. As web technologies develop, the computing and interactive capabilities of web browsers are becoming more powerful. Under such circumstances, the WMS standard should be refined to give the client side finer-grained control for enhanced interactivity and visual analytics functions. For example, new operations can be added to provide access to, and interaction with, individual features and layers. Rendering and animation, e.g., the style of map symbols, can be customized on the client side. Meanwhile, the server side could improve performance, concurrent access capacity and validation functions for metadata.

(2) Building sophisticated WMS quality models. Response time prediction can facilitate service selection for time-critical applications. By analyzing the key impact factors and utilizing the spatiotemporal patterns of response times, prediction models could be built to achieve precise prediction. To support quality-driven WMS resource discovery, a comprehensive quality evaluation model could consider more quality metrics, e.g., data quality of maps, user feedback and preferences.

(3) Developing a state-of-the-art web portal for better service discovery. Interactive query and visual analytics functions must be enhanced for the next generation of geospatial web portals. Firstly, quality (e.g., performance) and user scoring should be integrated and supported as search criteria. Secondly, service comparison and visual analytics functions should be enabled. For example, users could be permitted to compare the response time, user feedback and successability of selected services visually, in an interactive way.

(4) Optimizing the proposed monitoring framework. The scalability and flexibility of our distributed framework could be improved to accommodate a larger number of monitoring sites and services. More types of geospatial web services (e.g., ESRI RESTful services, OGC CSW, OGC Sensor Observation Service, OGC ) and operations should be supported.
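As an illustration of suggestion (3), monitored quality indicators could be combined into a single score and used as a search criterion. The following is a minimal sketch under stated assumptions, not part of the monitoring framework described in this paper: the class `ServiceQuality`, the functions `quality_score` and `rank_services`, the weights and the 10 s response-time cap are all hypothetical choices made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceQuality:
    """Hypothetical per-service quality record (field names are assumptions)."""
    name: str
    mean_response_s: float  # mean response time in seconds
    success_rate: float     # fraction of successful requests, in [0, 1]
    user_rating: float      # mean user score, in [0, 5]


def quality_score(service, max_response_s=10.0, weights=(0.4, 0.4, 0.2)):
    """Combine three indicators into one score in [0, 1].

    The weights and the response-time cap are illustrative assumptions,
    not values derived from the monitoring results.
    """
    # Map response time to [0, 1]: 0 s gives 1.0, the cap or slower gives 0.0.
    capped = min(service.mean_response_s, max_response_s)
    speed = 1.0 - capped / max_response_s
    # Normalize the user rating from [0, 5] to [0, 1].
    rating = service.user_rating / 5.0
    w_speed, w_success, w_rating = weights
    return w_speed * speed + w_success * service.success_rate + w_rating * rating


def rank_services(services, **kwargs):
    """Return the services sorted from highest to lowest quality score."""
    return sorted(services, key=lambda s: quality_score(s, **kwargs), reverse=True)


if __name__ == "__main__":
    candidates = [
        ServiceQuality("slow-but-popular", 8.0, 0.70, 4.5),
        ServiceQuality("fast-and-stable", 1.0, 0.99, 4.0),
    ]
    for s in rank_services(candidates):
        print(f"{s.name}: {quality_score(s):.3f}")
```

A portal could expose the weights to users so that time-critical applications emphasize response time, while other users emphasize successability or feedback.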

Acknowledgments: This work is supported by the National Natural Science Foundation of China (No. 41501434 and No. 41371372), the Research Foundation of Wuhan University (No. 2042014kf0026) and the Natural Science Foundation of Hubei Province (No. 2015CFA053). Thanks to Mr. Steve McClure for language assistance.

Author Contributions: Zhipeng Gui designed the methods and wrote the paper. Jun Cao implemented the monitoring system and performed the experiments. Xiaojing Liu analyzed the data. Xiaoqiang Cheng and Huayi Wu contributed to the conceptualization and methods.

Conflicts of Interest: The authors declare no conflict of interest.

References

1. Doyle, A. OpenGIS Web Map Server Interface Implementation Specification Revision 1.0.0; Open Geospatial Consortium: Wayland, MA, USA, 2000. 2. De la Beaujardiere, J. OpenGIS Web Map Service (WMS) Implementation Specification Version 1.3.0; Open Geospatial Consortium: Wayland, MA, USA, 2006. 3. International Organization for Standardization. Geographic Information—Web Map Server Interface; ISO/TC 211, ISO 19128:2005; International Organization for Standardization: Geneva, Switzerland, 2005. 4. Shen, S.; Liu, W.; Wu, H.; Chen, Y. A multi-level comprehensive evaluation method for quality of WMS based on fuzzy mathematics. In Proceedings of the 17th International Conference on Geoinformatics, Fairfax, VA, USA, 12–14 August 2009. 5. Sui, D.; Elwood, S.; Goodchild, M. Crowdsourcing Geographic Knowledge: Volunteered Geographic Information (VGI) in Theory and Practice; Springer: Berlin, Germany, 2012. ISPRS Int. J. Geo-Inf. 2016, 5, 88 22 of 24

6. Neis, P.; Zipf, A. Analyzing the contributor activity of a volunteered geographic information project—The case of OpenStreetMap. ISPRS Int. J. Geo-Inform. 2012, 1, 146–165. [CrossRef] 7. Gong, J.; Wu, H.; Zhang, T.; Gui, Z.; Li, Z.; You, L. Geospatial Service Web: Towards integrated cyberinfrastructure for GIScience. GSIS 2012, 15, 73–84. 8. Wu, H.; You, L.; Gui, Z.; Hu, K.; Shen, P. GeoSquare: Collaborative geoprocessing models’ building, execution and sharing on Azure Cloud. Ann. GIS 2015, 21, 109–121. [CrossRef] 9. Geller, T. Imaging the world: The state of online mapping. IEEE Comput. Graph. 2007, 27, 8–13. [CrossRef] 10. Zavlavsky, I. A new technology for interactive online mapping with vector markup and XML. Cartogr. Perspect. 2000, 37, 65–77. [CrossRef] 11. Lins, L.; Klosowski, J.T.; Scheidegger, C. Nanocubes for real-time exploration of spatiotemporal datasets. IEEE Trans. Vis. Comput. Graph. 2013, 19, 2456–2465. [CrossRef][PubMed] 12. Boulos, M.N.K.; Warren, J.; Gong, J.; Yue, P. Web GIS in practice VIII: HTML5 and the canvas element for interactive online mapping. Int. J. Health Geogr. 2010, 9.[CrossRef][PubMed] 13. Neumann, A.; Winter, A.M. Time for SVG—Towards high quality interactive web-maps. In Proceedings of the 20th International Cartographic Conference, Beijing, China, 6–10 August 2001. 14. Jenny, B.; Jenny, H.; Räber, S. Map design for the Internet. In International Perspectives on Maps and the Internet; Peterson, M.P., Ed.; Springer: Berlin, Germany, 2008; pp. 31–48. 15. Lienert, C.; Jenny, B.; Schnabel, O.; Hurni, L. Current trends in vector-based Internet mapping: A technical review. In Online Maps with APIs and WebServices; Peterson, M.P., Ed.; Springer: Berlin, Germany, 2012; pp. 23–36. 16. Open Source Geospatial Foundation. Tile Map Service Specification. Available online: http://wiki.osgeo. org/wiki/Tile_Map_Service_Specification (accessed on 18 March 2016). 17. Masó, J.; Pomakis, K.; Julià, N. 
OpenGIS Web Map Tile Service (WMTS) Implementation Standard Version 1.0.0; Open Geospatial Consortium: Wayland, MA, USA, 2010. 18. Jiang, L.; Yue, P.; Lu, X. Restful implementation of catalogue service for geospatial data provenance. ISPRS Arch. 2013, 1, 121–125. [CrossRef] 19. Gui, Z.; Yang, C.; Xia, J.; Li, J.; Rezgui, A.; Sun, M.; Xu, Y.; Fay, D. A visualization-enhanced graphical user interface for geospatial resource discovery. Ann. GIS 2013, 19, 109–121. [CrossRef] 20. ArcGIS Server Website. Available online: http://www.esri.com/software/arcgis/arcgisserver/ (accessed on 13 April 2016). 21. AUTODESK AUTOCAD MAP 3D. To Add an Image from WMS (Web Map Service). Available online: https://knowledge.autodesk.com/support/autocad-map-3d/learn-explore/caas/ CloudHelp/cloudhelp/2015/ENU/MAP3D-Use/files/GUID-A9F620AD-6B9A-487D-9B33-7D365307D571- htm.html (accessed on 13 April 2016). 22. AUTODESK AUTOCAD CIVIL 3D. To Add an Image from WMS (Web Map Service). Available online: https://knowledge.autodesk.com/support/autocad-civil-3d/learn-explore/ caas/CloudHelp/cloudhelp/2017/ENU/MAP3D-Use/files/GUID-A9F620AD-6B9A-487D-9B33- 7D365307D571-htm.html (accessed on 13 April 2016). 23. OpenLayers. Web Map Service Layers. Available online: http://openlayers.org/workshop/layers/wms.html (accessed on 13 April 2016). 24. GRASS GIS. GRASS GIS Manuals. Available online: https://grass.osgeo.org/index.php?mact=News, cntnt01,detail,0&cntnt01articleid=42&cntnt01returnid=58 (accessed on 13 April 2016). 25. QGIS. QGIS as OGC Data Client. Available online: http://docs.qgis.org/2.8/en/docs/user_manual/ working_with_ogc/ogc_client_support.html#wms-wmts-client (accessed on 13 April 2016). 26. Neumann, A. and web cartography. In Encyclopedia of GIS; Shekhar, S., Xiong, H., Eds.; Springer: Berlin, Germany, 2008; pp. 1261–1269. 27. Fan, J.; Kambhampati, S. A snapshot of public web services. Sigmod. Rec. 2005, 34, 24–32. [CrossRef] 28. Li, Y.; Liu, Y.; Zhang, L.; Li, G.; Xie, B.; Sun, J. 
An exploratory study of web services on the internet. In Proceedings of the 2007 IEEE International Conference on Web Services, Salt Lake City, UT, USA, 9–13 July 2007. 29. Zhang, T.; Tsou, M.H. Developing a grid-enabled spatial Web portal for Internet GIServices and geospatial cyberinfrastructure. Int. J. Geogr. Inf. Sci. 2009, 23, 605–630. [CrossRef] 30. GEOSS Clearinghouse Website. Available online: http://clearinghouse.cisc.gmu.edu/geonetwork/srv/en/ main.home (accessed on 13 April 2016). ISPRS Int. J. Geo-Inf. 2016, 5, 88 23 of 24

31. Data.gov Website. Available online: http://catalog.data.gov/dataset (accessed on 13 April 2016). 32. Lopez-Pellicer, F.J.; Béjar, R.; Florczyk, A.J.; Muro-Medrano, P.R.; Zarazaga-Soria, F.J. A review of the implementation of OGC Web Services across Europe. IJSDIR 2011, 6, 168–186. 33. Lopez-Pellicer, F.J.; Rentería-Agualimpia, W.; Nogueras-Iso, J.; Zarazaga-Soria, F.J.; Muro-Medrano, P.R. Towards an active directory of geospatial web services. In Bridging the Geographic Information Sciences; Gensel, J., Josselin, D., Vandenbroucke, D., Eds.; Springer: Berlin, Germany, 2012; pp. 63–79. 34. Refractions Research. OGC Services Survey. Available online: http://www.refractions.net/expertise/ whitepapers/ogcsurvey/ogcsurvey/ (accessed on 18 March 2016). 35. Skylab Mobilesystems Ltd. OGC WMS Server List. Available online: http://www.skylab-mobilesystems. com/en/wms_serverlist.html (accessed on 18 March 2016). 36. Bartley, J.D. MAPDEX: A global index of distributed web map services. In Proceedings of the FGDC Coordination Meeting Summary, Washington, DC, USA, 7 May 2005. 37. Li, W.; Yang, C.; Yang, C. An active crawler for discovering geospatial web services and their distribution pattern—A case study of OGC web map service. Int. J. Geogr. Inf. Sci. 2010, 24, 1127–1147. [CrossRef] 38. Florczyk, A.J.; Nogueras-Iso, J.; Zarazaga-Soria, F.J.; Béjar, R. Identifying orthoimages in web map services. Comput. Geosci. 2012, 47, 130–142. [CrossRef] 39. Lee, K.; Jeon, J.; Lee, W.; Jeong, S.H.; Park, S.W. QoS for web services: Requirements and possible approaches. W3C Work. Group Note 2003, 25, 1–9. 40. INSPIRE. Implementing Directive 2007/2/EC of the European Parliament and of the Council as Regards the Network Services. Available online: http://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX: 32009R0976&from=EN (accessed on 13 April 2016). 41. INSPIRE. Technical Guidance for the Implementation of INSPIRE View Services. 
Available online: http://inspire.ec.europa.eu/documents/Network_Services/TechnicalGuidance_ViewServices_v3.11.pdf (accessed on 13 April 2016). 42. Seip, C.; Bill, R. Evaluation and monitoring of service quality: Discussing ways to meet INSPIRE requirements. Trans. GIS 2015.[CrossRef] 43. Anderson, B.; Deoliveira, J. WMS performance tests! Mapserver & Geoserver. In Proceedings of the Free and Open Source Software for Geospatial Conference, Victoria, BC, Canada, 24–27 September 2007. 44. Horák, J.; Ardielli, J.; Horáková, B. Testing of web map services. In Proceedings of the Global Spatial Data Infrastructure Association World Conference, Rotterdam, The Netherland, 15–19 June 2009. 45. Giuliani, G.; Dubois, A.; Lacroix, P.M.A. Testing OGC Web Feature and Coverage Service performance: Towards efficient delivery of geospatial data. J. Spat. Inf. Sci. 2013, 7, 1–23. [CrossRef] 46. Anthony, M.; Nebert, D. Monitoring the performance and reliability of geospatial web services: Service Status Checker (SSC) system overview. In Proceedings of the Global Spatial Data Infrastructure World Conference, Québec, QC, Canada, 14–17 May 2012. 47. Li, Z.; Yang, C.; Wu, H.; Li, W.; Miao, L. An optimized framework for seamlessly integrating OGC Web Services to support geospatial sciences. Int. J. Geogr. Inf. Sci. 2011, 25, 595–613. [CrossRef] 48. Gui, Z.; Yang, C.; Xia, J.; Liu, K.; Xu, C.; Li, J.; Lostritto, P. A performance, semantic and service quality-enhanced distributed search engine for improving geospatial resource discovery. Int. J. Geogr. Inf. Sci. 2013, 27, 1109–1132. [CrossRef] 49. Wu, H.; Li, Z.; Zhang, H.; Yang, C.; Shen, S. Monitoring and evaluating the quality of Web Map Service resources for optimizing map composition over the internet to support decision making. Comput. Geosci. 2011, 37, 485–494. [CrossRef] 50. Xia, J.; Yang, C.; Liu, K.; Li, Z.; Sun, M.; Yu, M. Forming a global monitoring mechanism and a spatiotemporal performance model for geospatial services. Int. J. 
Geogr. Inf. Sci. 2015, 29, 375–396. [CrossRef] 51. Wu, H.; Zhang, H. QoGIS: Concept and research framework. Geomat. Inform. Sci. Wuhan Univ. 2007, 32, 385–388. (In Chinese) 52. SETI@home Website. Available online: http://seti.ssl.berkeley.edu/ (accessed on 13 April 2016). 53. Climate@Home Website. Available online: http://www.nasa.gov/offices/ocio/ittalk/08--2010_climate. html#.VxR-__mF6Ul (accessed on 13 April 2016). 54. Zhang, H.; Gong, J.; Wu, H. Research on conception and methods of geospatial information services quality evaluation. Sci. Surv. Map. 2012, 37, 161–164. (In Chinese) ISPRS Int. J. Geo-Inf. 2016, 5, 88 24 of 24

55. Yang, C.; Cao, Y.; Evans, J. Web map server performance and client design principles. Gisci. Remote Sens. 2007, 44, 320–333. [CrossRef] 56. Shen, P.; Gui, Z.; You, L.; Hu, K.; Wu, H. A topic crawler for discovering geospatial web services. J. Geo-Inform. Sci. 2015, 17, 185–190. (In Chinese) 57. EuroGEOSS Broker Website. Available online: http://www.eurogeoss-broker.eu/ (accessed on 13 April 2016). 58. An ArcGIS REST Service Directory from NOAA Office for Coastal Management. Available online: https://coast.noaa.gov/arcgis/rest/services (accessed on 13 April 2016). 59. An ArcGIS REST Service Directory from IndianaMAP. Available online: http://maps.indiana.edu/arcgis/ rest/services (accessed on 13 April 2016). 60. Wu, S.; Zhang, M.; Huang, Q.; Zhang, Y.; Wan, C.; Zhang, K.; Cao, J.; Gui, Z.; Qin, K. Design a web portal for visualizing and exploring service quality of global OGC web map services. In Proceedings of the Geoinformatics 2015, Wuhan, China, 19–21 June 2015. 61. USGS Water Services. Frequently Asked Questions. Available online: http://waterservices.usgs.gov/docs/ faq.html (accessed on 13 April 2016). 62. cURL. Man Page. Available online: https://curl.haxx.se/docs/manpage.html (accessed on 13 April 2016). 63. Directive, I.N.S.P.I.R.E. Directive 2007/2/EC of the European Parliament and of the Council of 14 March 2007 Establishing An Infrastructure for Spatial Information in the European Community (INSPIRE). Available online: http://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:32007L0002&from=EN (accessed on 13 April 2016). 64. INSPIRE Geoportal Website. Available online: http://inspire-geoportal.ec.europa.eu/ (accessed on 13 April 2016). 65. Group on Earth Observations. GEO Societal Benefit Areas. Available online: https://www.aprsaf.org/data/ feature/f_086_3.pdf (accessed on 13 April 2016). 66. RGIS Website. Available online: http://rgis.unm.edu/ (accessed on 13 April 2016). 67. Newman, M.E. Power laws, Pareto distributions and Zipf’s law. 
Contemp. Phys. 2005, 46, 323–351. [CrossRef] 68. A GeoServer WMS from the Socioeconomic Data and Applications Center (SEDAC). Available online: http://sedac.ciesin.columbia.edu/geoserver/wms?SERVICE=WMS&REQUEST= GetCapabilities&VERSION=1.3.0 (accessed on 18 March 2016). 69. China Employment and Wage Maps from the National Geomatics Center of China (NGCC). Available online: http://gisserver.tianditu.com/TDTService/ew/2014/wms?SERVICE=WMS&REQUEST= GetCapabilities&VERSION=1.3.0 (accessed on 18 March 2016). 70. NASA Earth Observations (NEO) WMS. Available online: http://neowms.sci.gsfc.nasa.gov/wms/wms? SERVICE=WMS&REQUEST=GetCapabilities&VERSION=1.3.0 (accessed on 18 March 2016). 71. Kim, E.; Lee, Y. Quality Model for Web Services; Organization for the Advancement of Structured Information Standards: Burlington, MA, USA, 2005. 72. Coallier, F. Software Engineering–Product Quality—Part 1: Quality Model; International Organization for Standardization: Geneva, Switzerland, 2001. 73. A Geological Data WMS from Arizona Geological Survey. Available online: http://services.azgs.az.gov/ ArcGIS/services/OneGeology/AZGS_Arizona_Geology/MapServer/WMSServer?SERVICE=WMS& REQUEST=GetCapabilities&VERSION=1.3.0 (accessed on 18 March 2016).

© 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).