Free and Open Source Software for Geospatial (FOSS4G) Conference Proceedings

Volume 14 Portland, Oregon, USA Article 5

2014 Evaluation of Web Processing Service Frameworks M. Ebrahim Poorazizi University of Calgary (Canada)

Andrew J.S. Hunter

Follow this and additional works at: https://scholarworks.umass.edu/foss4g Part of the Geography Commons

Recommended Citation Poorazizi, M. Ebrahim and Hunter, Andrew J.S. (2014) "Evaluation of Web Processing Service Frameworks," Free and Open Source Software for Geospatial (FOSS4G) Conference Proceedings: Vol. 14 , Article 5. DOI: https://doi.org/10.7275/R5PG1PXB Available at: https://scholarworks.umass.edu/foss4g/vol14/iss1/5

This Paper is brought to you for free and open access by ScholarWorks@UMass Amherst. It has been accepted for inclusion in Free and Open Source Software for Geospatial (FOSS4G) Conference Proceedings by an authorized editor of ScholarWorks@UMass Amherst. For more information, please contact [email protected]. Evaluation of Web Processing Service Frameworks

Evaluation of Web Processing Service Frameworks by M. Ebrahim Poorazizi and Andrew J.S. Hunter publishing vector spatial data, generally encoded us- ing Geography Markup Language (GML) (Vretanos, University of Calgary (Canada). [email protected] 2002). A WCS defines a standard interface and oper- ations that enable interoperable access to spatial cov- Abstract erage (Spatial information representing space/time- As geoprocessing on the web has matured in recent varying phenomena) datasets (Evans, 2003). A WMS years, an increasing number of geoprocessing ser- delivers visualizations of data and, unlike WFS and vices and functionality are becoming available in the WCS, does not deliver the data directly (de La Beau- form of online Web Processing Services (WPS). Con- jardiere, 2004). sequently, the quality of such geoprocessing services In the context of processing services, the Open is of importance to ensure that WPS instances ful- Geospatial Consortium (OGC) has standardized the fill users’ expectations. In this paper, we illustrate, WPS interface for publishing of spatial processes, and discuss initial results from a quantitative analy- the discovery of, and binding to, those processes by sis of the performance of WPS servers. To do so, we users (Schut, 2007). A spatial process may include used two test scenarios to measure response time, re- algorithms, calculations, or various kinds of models, sponse size, throughput, and failure rate of five WPS which are exposed as a service instance, and oper- servers including 52◦ North, Deegree, GeoServer, Py- ating on spatial data. A WPS, thus, can be used to WPS, and Zoo. We also assess each WPS server in design and develop a wide variety of GIS function- terms of qualitative metrics such as software archi- alities, and be made available to users across a net- tecture, perceived ease of use, flexibility of deploy- work, as well as provide access to previously defined ment, and quality of documentation. A case study functions, calculations, or computational models. addressing accessibility assessment is used to eval- With the emergence of geoprocessing on the web, uate the relative advantages and disadvantages of the WPS specification and its (application) profiles each implementation, and point to challenges expe- have been applied to a wide array of use cases, rienced while working with these WPS servers. from accessibility assessment (Steiniger, Poorazizi, Keywords: OGC WPS; Geoprocessing; Perfor- & Hunter, 2013) to ecological modeling (Dubois, mance Evaluation; Benchmark. Schulz, Skøien, Bastin, & Peedell, 2013). The increas- ing use of WPS instances has also raised pertinent quality concerns — users/developers are likely to 1 Introduction be concerned about the Quality of Service (QoS) at- tributes such as performance, reliability, and security. With the development of geospatial services, web- The performance of a particular WPS is often of based GIS (Geographic Information Systems) have importance to users, arguably the most important, progressed towards a service-oriented paradigm when evaluating the QoS of a specific service. More- (Mayer, Stollberg, & Zipf, 2009). Today, spatial ser- over, performance has a direct effect on other QoS vices can be used to effectively support common attributes; for example, poor performance will affect tasks undertaken by spatial information users, for reliability, scalability, capacity, accuracy, accessibility, example, discovery and access to, process of, or vi- and availability (Cibulka, 2013). sualization of spatial data. Catalogue Services for A developer’s concerns, during designing and the Web (CSW), Web Feature Services (WFS), Web development of a WPS, are often twofold. As noted, Coverage Services (WCS), Web Mapping Services from a quantitative perspective, performance is one (WMS), and WPS are common services defined by of the key principles that can ensure both user and the OWS (Open Geospatial Consortium Web Ser- application developer satisfaction. From a qualita- vice) initiative. A CSW provides the ability to pub- tive point of view, quality concerns such as software lish and search collections of descriptive informa- architecture, perceived ease of use, flexibility of de- tion (metadata) (Solntseff & Yezerski, 1974) for spa- ployment, quality of documentation, and support ac- tial data and services (Nebert, Whiteside, & Vretanos, cessibility are important factors that can guide devel- 2007). A WFS is the main geospatial service for opers during selection of a WPS framework that fits

OSGEO Journal Volume 14 Page 29 of 48 Evaluation of Web Processing Service Frameworks a particular application domain best. WPS, and Zoo – using two test plans in an accessibil- Several reviews have been reported in the litera- ity assessment scenario. To do so, the WalkYourPlace ture that evaluates spatial services from both a quan- Transit Model (Steiniger et al., 2013) was used to de- titative and qualitative perspectives. MapServer’s sign the geoprocessing workflow. The workflow was WMS has been assessed and optimized by Kalbere then developed using Python and wrapped and ex- (2010). COSMC (Czech Office for Surveying, Map- posed as a standard WPS using the candidate WPS ping and Cadastre) and CENIA (Czech Environmen- servers. The sample locations were selected using tal Information Agency) WMSs have been tested a stratified random sampling approach within the for availability and performance (Horák, Ardielli, bounds of the City of Calgary, Alberta, Canada. Dur- & Horáková, 2009). Bermudez et al. (2009) com- ing experiments we controlled the number of concur- pared the ability of WFS and SOS (Sensor Observa- rent requests, and the WPS input parameters to as- tion Service) to publish time series data. Tamayo sess the performance and load capacity of the WPS et al. (2011) presented an empirical study of in- servers. The remainder of the paper is structured stances of servers implementing SOS in terms of as follows. Section two introduces the WPS speci- compliance with OGC’s SWE (Sensor Web Enable- fication. The specification of candidate WPS servers ment) and interoperability, and in our previous work is described in section three. Section four explains we evaluated performance of three SOS servers – 52◦ the methodology used to evaluate the WPS servers, North, MapServer, and Deegree – based on differ- along with a description of the case study, technical ent test scenarios (Poorazizi, Liang, & Hunter, 2012). architecture, test scenarios, and hardware configura- Moreover, a WMS performance shootout has been tion of the servers used. Section five presents the re- presented annually since 2007 at the FOSS4G (Free sult. In section six, the WPS servers are assessed in and Open Source Software for Geospatial) confer- terms of qualitative metrics. Section seven summa- ence, which provides a standardized procedure for rizes our findings. measuring and comparing the performance of WMS server installations (http://wiki.osgeo.org/wiki/ FOSS4G_Benchmark). 2 Web Processing Service Within the geoprocessing domain, there have been few attempts to evaluate WPS servers. Scholten The OGC released version 1.0.0 of the WPS specifi- et al. (2006) investigated efficiency of web ser- cation in June 2007 (Schut, 2007). The specification, vices for geoprocessing in a Spatial Data Infras- along with the OGC Web Processing Service Best tructure (SDI), but focused on caching, network Practice discussion paper, describe a web service in- adaptation, data granularity, and communication terface that defines how a client and server should modes. Michaelis and Ames (2009) evaluated the cooperate during the execution of a spatial analysis, WPS 0.4.0 specification, identified challenges, and and how results of the process should be presented proposed potential enhancements from an imple- (Schäffer, 2012). Clients can send requests via three mentation perspective. In addition, a WPS shootout core operations using three methods: Key Value was presented at the FOSS4G conference 2011, which Pairs (KVP) encoding via HTTP’s (HyperText Trans- evaluated five WPS servers, 52◦ North, Deegree, fer Protocol) GET, XML (eXtensible Markup Lan- GeoServer, PyWPS, and Zoo, in terms of compli- guage) via HTTP’s POST, or a SOAP/WSDL (Sim- ance with OGC’s WPS, and interoperability (http: ple Object Access Protocol/Web Service Description //wiki.osgeo.org/wiki/WPS_Shootout). The main Language) approach. The WPS specification defines achievement of the aforementioned works is that three mandatory operations that enable spatial pro- they concentrated on influential performance issues, cessing on the Internet (Schut, 2007). The GetCapa- the WPS protocol and its specification, and compli- bilities operation allows a client to request and re- ance and interoperability testing. However, there is ceive service metadata documents that describe the also a need to evaluate WPS functionality and per- capabilities of a specific server implementation. The formance. Through performance evaluation, WPS DescribeProcess operation returns detailed informa- developers can (i) identify the strengths and weak- tion about a process’ requirements, such as input nesses of each system, and (ii) improve WPS servers and output parameters, as well as allowable data for- to meet both application user and developer require- mats. The Execute operation invokes a specific pro- ments (Zhu, 2003). These issues are addressed in this cess implemented by the WPS, using the input pa- paper. We have evaluated the performance of five rameters provided, and returns the results of the ser- WPS servers – 52◦ North, Deegree, GeoServer, Py- vice to a client.

OSGEO Journal Volume 14 Page 30 of 48 Evaluation of Web Processing Service Frameworks

3 WPS Servers service metadata are defined through XML configu- ration files, and business logic is implemented as a In this paper five WPS servers were used for per- Java class. GeoServer WPS supports GeoTools and formance evaluation. 52◦ North WPS (http:// JTS spatial libraries. 52north.org/communities/geoprocessing/wps/) PyWPS (http://pywps.wald.intevation.org/) is developed by the 52◦ North Initiative for Geospa- is a Python-based WPS implementation developed tial Open Source Software GmbH. It implements by Intevation GmbH. It implements the mandatory the three mandatory operations of the WPS 1.0.0 operations of the WPS 1.0.0 specification. It runs as specification. The 52◦ North WPS server is re- a CGI (Common Gateway Interface) application and alized as a servlet and can be deployed in any can therefore be deployed in any HTTP Server en- servlet container such as Apache Tomcat (http: vironment, Apache HTTP Server, for example. De- //tomcat.apache.org/). Developing a custom veloping a custom process requires the creation a WPS process is implemented using 52◦ North’s python file to implement the business logic and de- WPS SDK (Software Development Kit) to define pa- fine service metadata and configuration parameters. rameters necessary for service configuration, ser- PyWPS enables access to a wide range of analy- vice metadata, and business logic. Spatial analysis sis functions via GRASS, GDAL (http://www.gdal. functions can be integrated using various libraries org/), and R libraries. such as JTS (http://www.vividsolutions.com/jts/ Zoo (http://www.zoo-project.org/ is an JTSHome.htm), GeoTools (http://www.geotools. OSGeo Foundation project that enables existing org/), R (http://www.r-project.org/), GRASS open source libraries to interact through its WPS (http://grass.osgeo.org/), SEXTANTE (http:// framework. It supports the mandatory operations www.sextantegis.com/), and ArcGIS Server (http: of the WPS 1.0.0 specification. It runs as a CGI //www.esri.com/software/arcgis/arcgisserver), application and so can be deployed in any HTTP for example. Server environment. Developing a custom pro- Deegree WPS (http://www.deegree.org/) is a cess requires the creation of a configuration file service built into the Deegree Java framework for (.zcfg) that defines service metadata and config- geospatial applications and OGC service implemen- uration parameters. Business logic can be imple- tations, deegree 3. deegree 3 is an Open Source mented in several programming languages includ- Geospatial (OSGeo) Foundation project. It sup- ing C/C++, PHP, JavaScript, Java, Perl, Python, or ports the core profile operations of the WPS 1.0.0 FORTRAN. Several spatial libraries such as GRASS, standard specification. The Deegree WPS server is GEOS (http://trac.osgeo.org/geos/), and GDAL are supported by default in Zoo WPS framework. implemented as a servlet and can be deployed in ◦ any servlet container, i.e., Apache Tomcat. Devel- Table 1 lists the technical characteristics of 52 oping a custom process requires the creation of a North, Deegree, GeoServer, PyWPS, and Zoo WPS Maven (http://maven.apache.org/) project. Con- servers. figuration parameters and service metadata are de- fined through XML configuration files and busi- ness logic is implemented as a Java class. Dee- 4 Methodology gree WPS currently supports the SEXTANTE spa- tial library, but other spatial libraries such as FME In this section, we explain the methodology used (http://www.safe.com/fme/fme-technology/) and to test and measure the performance of the WPS GRASS (http://grass.osgeo.org/) are being con- servers. sidered. GeoServer WPS (http://docs.geoserver.org/ 4.1 Case Study wps) is part of the popular open-source GIS project GeoServer, a project of the OSGeo Foundation. It In order to evaluate performance of the WPS servers, supports the three mandatory operations contained we used the WalkYourPlace Transit Model (http: in the WPS 1.0.0 specification. The GeoServer WPS //webmapping.ucalgary.ca/WPSClient/), which is server is built using Java technology as a servlet, and one of the accessibility assessment models developed runs in an integrated Jetty or Apache Tomcat web for the PlanYourPlace project (Steiniger et al., 2013). server environment. Developing a custom process is Based on this model, if the users provide (i) their cur- accomplished by creating a Maven (https://maven. rent location, or perhaps a location they would like apache.org/) project. Configuration parameters and to start walking from, (ii) a maximum time they are

OSGEO Journal Volume 14 Page 31 of 48 Zoo18 is an OSGeo Foundation project that enables existing open source libraries to interact through its WPS framework. It supports the mandatory operations of the WPS 1.0.0 specification. It runs as a CGI application and so can be deployed in any HTTP Server environment. Developing a custom process requires the creation of a configuration file (.zcfg) that defines service metadata and configuration parameters. Business logic can be implemented in several programming languages including C/C++, PHP, JavaScript, Java, Perl, Python, or FORTRAN. Several spatial libraries such as GRASS9, GEOS19, and GDAL17 are supported by default in Zoo WPS framework. Table 1 lists the technical characteristics of 52°North, Deegree, GeoServer, Evaluation of Web Processing Service Frameworks PyWPS, and Zoo WPS servers. Table 1: WPS servers’ technical specifications 52°North Deegree GeoServer PyWPS Zoo Development Java Java Java Python C/C++ Platform License GNU GPL LPGL GNU GPL GNU GPL MIT/X-11 v2 v2 v2 style Supported JTS SEXTANTE JTS GRASS GRASS Libraries GeoTools GeoTools GDAL GEOS SEXTANTE R GDAL R GRASS ArcGIS Natively Java Java Java Python C/C++ Supported Fortran Languages Java for Process Python Development PHP Perl JavaScript Service Servlet Servlet Servlet CGI CGI DCP Request GET, POST, GET, POST, GET, GET, GET, SOAP SOAP POST POST, POST SOAP

Table 1: WPS servers’ technical specifications. 4 Methodology willing to walk to a point of interest, or a transit stop, a way that to be accessible via HTTP GET/POST. (iii) average walk-speed,18 http://www.zoo-project.org/ (iv) a maximum time they In this context, PostGIS spatial functions were used 19 http://trac.osgeo.org/geos/ would like to wait for transit, and (v) and the maxi- to perform geometric computations such as calcu- mum time they would like to travel by transit, then lating distances between pairs of points, calculating the system will evaluate the extent of the area that the area of polygons, and merging multiple geomet- is accessible using pedestrian and transit infrastruc- ric objects. Remaining functionality was developed ture. The services within an accessibility area are using Python libraries. The geoprocessing services then analysed (e.g., point of interests (POI) such as were then wrapped and exposed as standard WPSs 6 parks, stores, libraries, etc.) to determine an accessi- using 52◦ North, Deegree, GeoServer, PyWPS, and bility score for the accessibility area. Should the user Zoo frameworks. In this context, the WPS server wish, they can ask for a distance decay function to acts as a gateway, which enables standard commu- be applied that discounts the contribution of POIs nication with the back-end (Python-based) geopro- that are further away from the users start location. cessing services. It actually accepts the Execute re- Next, an assessment of crime is undertaken for the quest, parses the query, and sends it to the cor- accessibility area. The accessibility area, accessibil- responding Python-based service using HTTP han- ity score, and the crime index are final outputs of the dlers. After getting the result, the WPS server pre- model. For more details about accessibility assess- pares it as a standard WPS response and sends it ment models deployed as part of the WalkYourPlace back to the client. In this study, we developed framework see Steiniger et al. (2013). seven Python-based geoprocessing modules to per- form the analysis, and seven WPS instances using 4.2 Technical Architecture each WPS server to wrap and expose them as stan- dard WPS services (see Figure 1). A PostgreSQL/- Figure 1 illustrates the processing service architec- PostGIS database was used to store various spatial ture for the WalkYourPlace Transit Model. The datasets such as the street and transit networks, the service architecture has been designed to reduce transit schedule, and crime data, obtained originally complexity and enable reuse of geoprocessing ser- from OpenStreetMap, Calgary Transit, and the Cal- vices. From a service design perspective, a bottom- gary Police, respectively. To search for attractions up (Granell, Díaz, & Gould, 2010) approach was within accessibility areas, POI datasets were fetched used to design the services. The geoprocessing ser- on demand from OpenStreetMap and MapQuest vices were then implemented using Python in such databases using REST (REpresentational State Trans-

OSGEO Journal Volume 14 Page 32 of 48 Evaluation of Web Processing Service Frameworks fer) APIs (Application Programming Interfaces). For the POI and Crime WPS’s, and a Boolean variable to the calculation of transit-based accessibility areas we indicate whether the distance decay function should used the General Transit Feed Specification (GTFS) be applied or not. The response from the Aggre- formatted data published by the City of Calgary. gation WPS includes an accessibility score, a crime score, and an accessibility area. The Management WPS then returns the Aggregation WPS’s response to the client for presentation.

4.3 Test Scenario In this study, to ensure the same test conditions for all WPS servers were used, we developed the geopro- cessing services using Python and then wrapped and exposed them as WPS services. Given this imple- mentation the WPS servers (i.e., 52◦ North, Deegree, Figure 1. The WalkYourPlace processing service ar- GeoServer, PyWPS, and Zoo) act as a gateway that chitecture. enables standard interaction between clients (i.e., the user or other services) and back-end geoprocessing services, which implemented using Python. For ex- The geoprocessing service framework includes ample, when the client sends an Execute request to an accessibility assessment engine that performs the Management WPS, it then sends a request to a the accessibility analysis through chaining of geo- corresponding Python service, which is accessible processing services in a multi-step pattern, i.e. a via HTTP GET/POST. After getting the response, the workflow. To achieve desired application flexibil- Management WPS sends Execute requests to other ity, service reusability, and improve performance, WPS services (i.e., Walkshed, Transit, Union, POI, the workflow-managed chaining method was used Crime, and Aggregation WPSs), which in turn com- (Alameh, 2003). municate with back-end Python services to get the Figure 2 presents a UML sequence diagram that processing result. As such, the Execute method de- outlines how an accessibility score is calculated for pends on external service calls, and the response time pedestrian and transit infrastructure. The client for invocation of the whole workflow (Figure 2) was sends a WPS Execute command to the Manage- measured to evaluate the “end-to-end” performance, ment WPS, which then initiates an Execute call to i.e., the response time includes communication time the Walkshed WPS. The Walkshed WPS returns a and processing time. GeoJSON polygon of the network-based accessibility To evaluate the performance of the WPS servers, area. The Management WPS then sends an Execute we designed two test scenarios based on the acces- request to the Transit WPS to find all transit stops sibility assessment case study. In the first scenario within the accessibility area and generates an accessi- (Scenario A), we randomly chose the WPS input pa- bility area for each transit stop based on the user de- rameters to generate 45 Execute requests. The num- fined constraints described in section 4.1. The Tran- ber of concurrent requests was assumed constant sit WPS returns a GeoJSON-encoded multi-polygon (n=1). The input parameters were selected using the feature. Next, the Management WPS sends an Exe- following criteria: cute request to the Union WPS to merge all the ac- cessibility areas generated by the Transit WPS. The • Walking Start Point: sample locations were se- Union WPS returns a single polygon feature encoded lected using a stratified random sampling ap- as GeoJSON. The Management WPS then sends an proach within the bounds of the City of Cal- Execute request to the POI WPS to find all attrac- gary (see Figure 3). tors within the accessibility area. The POI WPS re- • Walking Start Time: random timestamps be- turns a point set of services encoded as GeoJSON tween 5 a.m. and 12 p.m., which is the Cal- points, along with attributes describing the types of gary Transit hours of operation (http://www. features found. The Management WPS then repeats calgarytransit.com/accesscalgary/hours. the same request to the Crime WPS to obtain inci- html). dent locations. Finally, the Management WPS sends an Execute request to the Aggregation WPS along • Walking Time Period: we selected random val- with the accessibility polygon, the responses from ues between 5 minutes and 20 minutes.

OSGEO Journal Volume 14 Page 33 of 48 Evaluation of Web Processing Service Frameworks

Figure 2. UML activity diagram of accessibility assessment workflow.

• Walking Speed: we selected random values be- tween 3 km/h and 6 km/h with step values of 0.5 km/h.

• Bus Waiting Time: we selected random values between 0 minutes and the Walking Time Pe- riod.

• Bus Ride Time: we selected random values be- tween 0 minutes and Walking Time Period – Bus Waiting Time.

• Distance Decay Function: a Boolean variable (i.e., True/False) was selected randomly.

For the second scenario (Scenario B), we focused on the number of concurrent requests. In this context, the number of concurrent requests was generated us- ing a 2n pattern, while variable “n” was selected be- tween 0 and 7 with step value of 1. 30 WPS Execute requests were generated for each WPS service and replicated according to the concurrent request pat- tern. All other criteria were determined using the above mentioned approach for Scenario A. Figure 3. Map of the City of Calgary highlighting the locations used for evaluating the WPS servers.

OSGEO Journal Volume 14 Page 34 of 48 Evaluation of Web Processing Service Frameworks

4.4 Test Environment was used, as it is a widely accepted performance- testing tool for web applications. To more accurately reflect the users experience, all the tests have been measured from the client-side. On the server-side, a Dell OptiPlex 990 was used as the host machine, with an Intel Core i5 (3.1GHz) 5 Experimental Results CPU, 8GB of RAM, and 500GB of disk space, run- ning Microsoft Windows 7 Professional (64-bit). In Since each WPS server uses database connections to order to deploy and test the WPS servers under the execute queries, a warm-up run was first performed. same conditions, each WPS package was installed This ensures that the overhead of establishing a con- on a separate virtual machine with the same hard- nection to the database is not accounted for in the ware configuration. VMware Player 5.0.1 (http: metrics (elapsed time). During each performance //www.vmware.com/) was used to setup five virtual test, only one virtual machine was run. Response machines with access to 4GB of RAM, 40GB of disk time, response size, and whether or not a request space, and use of 4 out of 8 CPU cores, running was successful were logged. This data allowed the Ubuntu 12.04 LTS (64-bit). The network configura- estimation of average response times, average server tion of the virtual machines was set to “Bridged”, al- throughput, average server failure rate, and average lowing them to connect directly to the physical net- response size returned by each WPS server. Figure 4 work and obtain a dedicated IP address. Table 2 to Figure 7 and Table 3 below report the results of the summarizes the configuration of the server machine experiments. (host), and virtual machines. For more information First, the performance test for Scenario A is re- about the configuration of database server and soft- ported. The average response time, time taken for ware libraries used see Appendix A. a service call to return all response bytes, the aver- age response size, the quantity of data exchanged be- Dell OptiPlex 990 Hardware VMware (VM) tween client and server, for each of the WPS servers (Host) are listed in Table 3 and plotted on Figure 4. Intel Core i5 CPU 4 Cores of 8 Given the data, the most rapid WPS server was 3.1GHz Deegree, with an average response time of 2.499 RAM 8GB 4GB ± 1.259 s (95% confidence interval (CI)), followed ◦ HDD 500GB 40GB by GeoServer WPS, 52 North WPS, Zoo WPS, and Windows 7 Pro- Ubuntu 12.04 LTS PyWPS. A one-way Analysis of Variance (ANOVA) OS fessional (64-bit) (64-bit) test indicates that all WPS servers respond simi- larly with no significant difference between them, F(4,220)=0.739, p=0.566. Table 2. Experimental server configuration. In terms of response size, there was no signifi- cant difference (F(4,220)=1.071, p=0.372) either with The machine used to run the tests at the client- all WPS servers returning similar response package side was the host machine. In this study, we used the data volumes (≈ 2.484 kB). The GeoServer WPS re- same machine to set up the servers and test them, turned the least amount of data (2.301 ± 0.267 kB) to while according to (VMware, 2006), “an ideal setup the client, and PyWPS had the most (2.686 ± 0.269 for workloads that involve network traffic is to use kB). an external client (on a different physical system) to The reason of having different response sizes send network traffic to and receive network traffic was because of a slight different in XML tags from a virtual machine”. Although this could affect within the Execute response. For example, the performance of the WPS servers, the test con- wps:ExecuteResponse content is listed in Table 4 ditions (i.e., hardware and software configurations) for PyWPS and GeoServer WPSs’ Execute response, were the same for all the servers, which are shown in which returned the most, and the least amount of Table 2, Table 6, and Table 7. It was assumed that net- data, respectively. work time would be constant and therefore would Scenario B was designed to assess the effect of in- not contribute significantly to differences in response creased load on each server. The effect of load was times. assessed by increasing the number of concurrent re- In order to run the tests and measure perfor- quests from 1, to 2, 4, 8, 16, 32, 64, and finishing with mance factors (e.g., response time, response size, 128 concurrent requests. Individual services and the etc.), Apache JMeter (http://jmeter.apache.org/) service chain were tested using pre-defined input pa-

OSGEO Journal Volume 14 Page 35 of 48 Evaluation of Web Processing Service Frameworks rameters under normal condition (n=1) and no error (n=128). In addition, 52◦ North and PyWPS failed was observed. Those parameters were then used to to respond while processing more than 16 and 64 re- measure the performance of the WPS servers under quests respectively. The failure rate of Deegree and high loads (n>1). To get representative results, all of GeoServer exhibited a same pattern. We observed a the experiments were repeated 30 times and the re- failure rate of 0.8% under high loads (n > 4). PyWPS sponse time, response size, and server success/fail- and Zoo also followed a same failure rate pattern. ure were recorded. These data allowed the estima- It was constant (≈1.6%) between four and 64 con- tion and comparison of server throughput. The re- current requests and then approached 100% under sults are depicted in Figure 5 to Figure 7. higher loads (n > 64). All the WPS servers performed similarly in terms of throughput, for example with Response Time Response Size WPS Server four concurrent requests they processed around 600 (s) (kB) requests per hour. It suggests that the WPS servers 52◦ North 2.784 ± 1.269 2.448 ± 0.267 were capable of handling a request every six seconds Deegree 2.499 ± 1.259 2.505 ± 0.267 (n = 4). This result requires further investigation to determine if the servers can be tuned to function GeoServer 2.753 ± 1.255 2.301 ± 0.267 more effectively under real-world conditions. These PyWPS 3.995 ± 1.661 2.686 ± 0.269 results are summarized in Figure 5 to Figure 7. Zoo 2.999 ± 1.313 2.479 ± 0.269 6 Lessons Learned Table 3. Results for Execute request (Scenario A). In this section, the relative advantages and disad- Figure 5 shows that Deegree, GeoServer, and Zoo vantages of each WPS server, and challenges expe- generally perform similarly, the only difference is rienced while working with them are discussed. In an improvement in response time by Deegree for this context, the WPS servers were evaluated from 128 concurrent requests. With an increase from 64 a qualitative perspective in terms of: ease of instal- to 128 concurrent requests Deegree’s response time lation and configuration; perceived ease of use and improves to approximately half that of PyWPS and flexibility for creating new processes; native support Zoo. It is apparent that 52◦ North and PyWPS had for development languages; quality of documenta- difficulty when more than 16 and 64 concurrent re- tion; and community support. The qualitative com- quests were received for processing respectively. It is parison results are shown in Table 5. also evident that when more than 64 concurrent re- Installation – as 52◦ North WPS, Deegree WPS, quests were sent to PyWPS and Zoo WPS servers fail- and GeoServer WPS servers are servlet-based appli- ure rates increased dramatically, approaching 100% cations, the installation process was straightforward. at 128 concurrent requests. Throughput was also af- For 52◦ North and Deegree, installation is accom- fected significantly by the number of concurrent re- plished by deploying the downloaded/built WAR quests, particularly for 52◦ North, which returned (Web ARchive) file into a servlet container such as less than 1 successful request per hour once concur- Apache Tomcat. For GeoServer, after deploying the rent requests increased above 16. All servers per- WAR file into a servlet container, the WPS Extension formed substantially better when only one request should be extracted to the WEB-INF/lib directory of was received at a time, with 52◦ North achieving a the GeoServer installation. Library dependency was throughput of 1,445 successful requests per hour, fol- the main issue with PyWPS and Zoo WPS servers’ lowed by Zoo with 1,145, GeoServer with 1,115, Dee- installation process. They have several library de- gree with 1,024, then PyWPS with 894 requests per pendencies that must be installed first. PyWPS fol- hour. lows a typical Python installation procedure using a Because of the variation in the data, when ana- setup.py script. Further configuration is necessary lyzing the results using a two-way ANOVA, only the to set server paths, and the process folder locations. number of concurrent requests had an effect on load Installation of the Zoo Kernel, configuration, and in- testing (F(1,30)=20.640, p<0.001), individual servers stallation of the Zoo Service Provider were the main did not contribute to differences observed. As the steps required to deploy a service on the Zoo WPS number of concurrent requests increased GeoServer server. and Zoo followed a similar (linear) trend. Deegree Creating a new process and configuration – tended to perform better, especially under high loads as 52◦ North WPS, GeoServer WPS, and Zoo WPS

OSGEO Journal Volume 14 Page 36 of 48 Evaluation of Web Processing Service Frameworks

Deegree 2.499 ± 1.259 2.505 ± 0.267 GeoServer 2.753 ± 1.255 2.301 ± 0.267 PyWPS 3.995 ± 1.661 2.686 ± 0.269 Zoo 2.999 ± 1.313 2.479 ± 0.269

Given the data, the most rapid WPS server was Deegree, with an average response time of 2.499 ± 1.259 s (95% confidence interval (CI)), followed by GeoServer WPS, 52°North WPS, Zoo WPS, and PyWPS. A one-way Analysis of Variance (ANOVA) test indicates that all WPS servers respond similarly with no significant difference between them, F(4,220)=0.739, p=0.566. In terms of response size, there was no significant difference (F(4,220)=1.071, p=0.372) either with all WPS servers returning similar response package data volumes (≈ 2.484 kB). The GeoServer WPS returned the least amount of data (2.301 ± 0.267 kB) to the client, and PyWPS had the most (2.686 ± 0.269 kB). The reason of having different response sizes was because of a slight different in X M LFigure t a 4. g Response s w i time t h i (left) n and t h size e Execute (right) for r Execute e s p o nrequests s e . F(Scenario o r e x A). a m p l e , wps:ExecuteResponse content is listed in Table 4 for PyWPS and Table 4: A portion of the Execute response document returned by PyWPS and GeoServer WPS. WPS Server The Execute Response

PyWPS

GeoServer GeoServer WPSs’ Execute response, which returned the most, and the least Tableamount 4: A of portion data, ofrespectively. the Execute response document returned by PyWPS and GeoServer WPS.

OSGEO Journal Volume 14 Page 37 of 48

13 Evaluation of Web Processing Service Frameworks

Figure 5. Response time when increasing concurrent requests.

Figure 6. Failure rate with increasing concurrent requests.

Figure 7. Throughput with increasing concurrent request.

OSGEO Journal Volume 14 Page 38 of 48 documentation; and community support. The qualitative comparison results are shown in Table 5. Evaluation of Web Processing Service Frameworks

Table 5: WPS servers and their features from a qualitative perspective 52°North Deegree GeoServer PyWPS Zoo Installation* Easy Easy Easy Difficult Difficult Create new processes and Easy Medium Easy Difficult Easy Configuration* C/C++ Fortran Native Java Development Java Java Java Python Python Languages PHP Perl JavaScript Quality of Great Good Great Good Good Documentation** Mailing Mailing Mailing list, Mailing list, Wiki, list, Wiki, Forum, list, Forum, Forum, Issue Mailing Forum, Community Issue Issue Tracker, list, Issue Support Tracker, Tracker, SVN, IRC GitHub Tracker, SVN, SVN, Meeting, SVN, GitHub GitHub GitHub GitHub * Ranking ranges: Easy; Medium; Difficult ** Ranking ranges: Weak; Good; Great

InstallationTable 5.– WPSas 52°North servers and WPS, their Deegree features WPS, froma and qualitative GeoServer perspective. WPS servers are servlet-based applications, the installation process was straightforward. For 52°North and Deegree, installation is accomplished by deploying the frameworks were well-documented, a new process Java class and an XML configuration file for the pro- was simple todownloaded/built create and easy to WAR configure. (Web For ARchive) 52◦ cess, file (ii) into compile a servlet the project container as a WAR such file, as and (iii) North WPS,Apache this procedure Tomcat. was For accomplished GeoServer, using after deployingdeploy the theservlet WAR application file into into a anyservlet servlet con- the WPS SDKcontainer, in three steps:the WPS (i) createExtension a Java should class be tainer.extracted To addto the a newWEB-INF/lib process to PyWPS directory framework, for the process,of (ii) the export GeoServer the process installation. as a JAR (Java Library two dependency steps should was be the followed: main (i) issue create with a service file ARchive) file,PyWPS and (iii) and deploy Zoo the WPS process servers’ into installation 52◦ and process.modify the They configuration have several files library (i.e., pywps.cfg North’s WPSdependencies framework. For that GeoServer must WPS,be installed a new first.and pywps.cgi), PyWPS follows and (ii) a deploy typical the Python CGI application process is developedinstallation by creatingprocedure a Maven using projecta setup.py in script.into PyWPS’s Further framework.configuration On is occasion necessary the PyWPS three steps: (i) create a Java class and an XML config- server returned an HTTP Error 500 that prevented it uration file forto the set process, server (ii) paths, compiling and the the projectprocess folderfrom fulfilling locations. WPS Installation requests, especially of the Zoo after a new as a JAR file,Kernel, and (iii) configuration, deploying the and process installation into of processthe Zoo had Service been added.Provider To were resolve the this main failure sev- GeoServer’ssteps WPS framework.required to deploy To create a service a new pro-on the Zooeral WPS access server. permission settings were required (for cess for Zoo WPS, two steps have to be completed: (i) more details see Hamre (2011)). create a service file using one of the supported pro- Native Development languages – 52◦ North gramming languages, and a zcfg configuration file WPS, Deegree WPS, GeoServer WPS, and PyWPS for the process, and (ii) deploy the CGI application frameworks support one native programming lan- into Zoo’s WPS framework. Although Deegree’s guage each for the development of a new process, documentation (http://download.deegree.org/ while the Zoo WPS framework supports seven pro- documentation/3.3.3/html/), was well-organized gramming languages. This adds flexibility for devel- and comprehensive, it was not particularly clear how 17opers, as they are able to either develop new process- to build and deploy a new process within Deegree’s ing services in their language of choice, or develop WPS framework, nor were there many examples to services as independent modules that may draw on base development on. However, a Maven project libraries from many different languages. should be created and three steps should be followed Quality of documentation – 52◦ North WPS and to add a new process to Deegree’s WPS: (i) create a GeoServer WPS documentation was comprehensive

OSGEO Journal Volume 14 Page 39 of 48 Evaluation of Web Processing Service Frameworks and provide clear instruction for installation and CPU core; 52◦ North, GeoServer, and Zoo each used configuration of the WPS servers, along with clear around 20% of one CPU core and 5% of the other instructions for developing new process instances. cores. PyWPS used all cores during testing. In addi- Community support – 52◦ North WPS, Deegree tion, memory usage of all WPS servers was constant WPS, GeoServer WPS, and Zoo WPS frameworks (with minor fluctuations) during testing. On aver- have large communities of users/developers and age memory use was 30%. This suggests that perfor- provide different communication mediums to sup- mance improvements may be possible if server spe- port them. PyWPS does not appear to have an active cific tuning, or more effective development strategies community of users/developers, which may make are implemented. For example, the use of multiple access to support difficult. CPU cores in Java-based applications is handled via JVM (Java Virtual Machine), which generally tends to be problematic. In this context, if particular im- 7 Discussion and Conclusions plementation approaches or software libraries (e.g., concurrency libraries) are used it may result in a bet- We have evaluated performance of WPS servers us- ter performance. ing two test scenarios via a case study that focuses We must also note that a WPS server’s response on accessibility assessment. In the first scenario, the time is dependent upon the intensity of a service’s WPS servers were tested using 45 randomly gener- processing requirements. As such, performance re- ated Execute requests, holding the number of con- sults will depend on the complexity of the work- current requests constant (n=1). The results show flow, the complexity of individual back-end pro- that on average Deegree returns the response pack- cesses, and the complexity of the data. age most rapidly. However, a one-way ANOVA test The WPS servers have also been assessed in terms showed that, given the data, there is no significant of qualitative metrics. 52◦ North WPS, Deegree WPS, difference in response time between the WPS servers and GeoServer WPS servers are easy-to-install and tested (F(4,220)=0.739, p=0.566), nor data volume re- are well documented. They also have worldwide turned (F(4,220)=1.071, p=0.372). communities of developers/users, and provide dif- In the second scenario, load testing was under- ferent ways of communication to support their user- taken by varying the number of concurrent requests. s/developers. The documentation for PyWPS was Overall Deegree and GeoServer performed similarly, not complete, nor was it always clear and concise, although Deegree tended to perform better under making it difficult to install and configure the Py- high loads. 52◦ North had difficulty when more WPS server. PyWPS does not appear to have an ac- than 16 concurrent requests were received for pro- tive community of users/developers, and users/de- cessing, but performed more effectively under low velopers, as a consequence, may suffer from lack of loads compared to other WPS servers. Under low support. Zoo WPS does have accessible documen- loads, n=1, 52◦ North had the highest throughput tation and an accessible support community. It also completing 1,445 requests per hour, followed by Zoo supports several programming languages and offers with 1,145 requests per hour, GeoServer with 1,115 powerful and flexible approaches to develop WPS in- requests per hour, and Deegree and PyWPS com- stances. Generally, compared to other WPS servers, ◦ pleting 1,024 and 894 requests per hour respectively. 52 North and GeoServer seem to be the best choices Throughput for 52◦ North effectively went to zero re- when considering qualitative metrics, as they met quests per hour once the load increased to more than most of the evaluation criteria we chose in this study. 16 concurrent requests. Although no failed requests It should be noted that standard compliance is a were encountered under low loads, n=1, success rate major issue in the WPS domain, which was not inves- for Deegree and GeoServer stabilized at four or more tigated in this study. Interoperability and standard concurrent requests, to approximately 99.2%. Py- compliance tests can be undertaken as a part of qual- WPS and Zoo followed the same pattern, with a suc- itative evaluation process, which focuses on schema, cess rate of 98.4% between four and 64 concurrent semantics, and encodings. requests. To conclude, when selecting an appropriate WPS While four CPU cores were allocated to each WPS server, we believe it is important to consider both server during testing, upon reviewing CPU load quantitative and qualitative metrics. The impor- logs it was evident that, except for PyWPS, only tance of each metric can be weighted based on dif- one CPU core was generally being used at any time ferent application requirements. Generally speak- during testing. Specifically, Deegree used only one ing, from a user’s perspective, performance is one

OSGEO Journal Volume 14 Page 40 of 48 Evaluation of Web Processing Service Frameworks

of the most important factors when choosing a web- Hamre, T. (2011, December 29). Open Service Net- based application, while from developers’ perspec- work for Marine Environmental Data - WPS Cook- book. http://netmar.nersc.no/. Retrieved from tive, qualitative factors such as perceived ease of in- http://netmar.nersc.no/sites/netmar.nersc.no/files/ stallation and configuration, variety of development NETMAR_D7.7_WPS_Cookbook_r1_20111229.pdf languages, quality of documentation and accessibil- Horák, J., Ardielli, J., & Horáková, B. (2009). Testing of web map ity of support may be more critical. To choose a services. International Journal of Spatial Data Infrastruc- WPS server, we suggest starting an evaluation pro- tures Research,(Special Issue: GSDI 11), 25. Retrieved from cess with a basic set of questions that are linked to http://www.gsdi.org/gsdiconf/gsdi11/papers/pdf/330.pdf Kalbere, P. (2010). Performance and statistical analysis of WMS the evaluation criteria. The questions could be “who servers. In Proceedings of FOSS4G 2010. Barcelona, Spain. is the user of the system?” “What should the end- Mayer, C., Stollberg, B., & Zipf, A. (2009). Providing Near user be able to do with the system?” “What program- Real-Time Traffic Information within Spatial Data Infras- ming languages are developers comfortable with for tructures. In Proceedings of International Conference develop of the system?” “How complex are the back- on Advanced Geographic Information Systems Web Ser- vices (GEOWS ’09) (pp. 104 –111). Cancun, Mexico. end processes?” “How should the system function, doi:10.1109/GEOWS.2009.17 synchronous or asynchronous?” “What is the ar- Michaelis, C., & Ames, D. (2009). Evaluation and Im- chitecture used to design the processing workflow?” plementation of the OGC Web Processing Service for “What is the expected number of users?” In the Use in Client-Side GIS. GeoInformatica, 13(1), 109–120. end, the most appropriate WPS server should be se- doi:10.1007/s10707-008-0048-1 lected based on a trade-off between quantitative per- Nebert, D., Whiteside, A., & Vretanos, P. (2007). OpenGIS˝o Catalogue Services Specification (No. OGC 07-006r1). formance metrics and qualitative “ease of use” met- Open Geospatial Consortium Inc. Retrieved from por- rics for a specific application or use case. This may tal.opengeospatial.org/files/?artifact_id=20555 lead to the selection of different WPS servers for dif- Poorazizi, M. E., Liang, S. H. L., & Hunter, A. J. S. (2012). Testing ferent applications. of sensor observation services: a performance evaluation. Note: the developed Python-based geoprocess- In Proceedings of the First ACM SIGSPATIAL Workshop on (pp. 32–38). Redondo Beach, ing services, WPS instances, and test scripts are pub- CA, USA.: ACM. doi:10.1145/2451716.2451721 licly available, see Appendix B. Schäffer, B. (2012). Web Processing Service Best Practices Discus- sion Paper. Open Geospatial Consortium. Scholten, M., Klamma, R., & Kiehle, C. (2006). Evaluat- References ing Performance in Spatial Data Infrastructures for Geo- processing. IEEE Internet Computing, 10(5), 34–41. Alameh, N. (2003). Chaining geographic information Web doi:10.1109/MIC.2006.97 services. IEEE Internet Computing, 7(5), 22 – 29. Schut, P. (2007). OpenGIS Web Processing Ser- doi:10.1109/MIC.2003.1232514 vice (No. OGC 05-007r7). Open Geospa- Bermudez, L., Cook, T., Forrest, D., Bogden, P., Galvarino, C., tial Consortium Inc. Retrieved from por- Bridger, E., . . . Graybeal, J. (2009). tal.opengeospatial.org/files/?artifact_id=28772&version=2 (WFS) and sensor observation service (SOS) comparison Solntseff, N., & Yezerski, A. (1974). A survey of extensi- to publish time series data. In Proceedings of Interna- ble programming languages. Annual Review in Auto- tional Symposium on Collaborative Technologies and Sys- matic Programming, 7, Part 5, 267–307. doi:10.1016/0066- tems, 2009 (CTS ’09) (pp. 36 –43). Baltimore, MD , USA. 4138(74)90001-9 doi:10.1109/CTS.2009.5067460 Steiniger, S., Poorazizi, M. E., & Hunter, A. J. S. (2013). WalkY- Cibulka, D. (2013). Performance Testing of Web Map Services ourPlace - evaluating neighbourhood accessibility at street in three Dimensions – X, Y, Scale. Slovak Journal of Civil level. In Proceedings of the 29th Urban Data Management Engineering, XXI(1). doi:10.2478/sjce-2013-0005 Symposium (Vol. XL–4/W1). London, UK: ISPRS Interna- De La Beaujardiere, J. (2004). (No. OGC 04- tional Archives of Photogrammetry, Remote Sensing, and 024). Open Geospatial Consortium Inc. Retrieved from Spatial Information Science. portal.opengeospatial.org/files/?artifact_id=5316 Tamayo, A., Viciano, P., Granell, C., & Huerta, J. (2011). Empiri- Dubois, G., Schulz, M., Skøien, J., Bastin, L., & Peedell, S. (2013). cal Study of Sensor Observation Services Server Instances. eHabitat, a multi-purpose Web Processing Service for eco- In S. Geertman, W. Reinhardt, F. Toppen, W. Cartwright, logical modeling. Environmental Modelling & Software, G. Gartner, L. Meng, & M. P. Peterson (Eds.), Advancing 41, 123–133. doi:10.1016/j.envsoft.2012.11.005 Geoinformation Science for a Changing World (Vol. 1, pp. 185–209). Springer Berlin Heidelberg. Evans, J. D. (2003). (WCS), VMware. (2006). Performance Benchmarking Guidelines for Version 1.0.0 (No. OGC 03-065r6). Open VMware Workstation 5.5. VMware, Inc. Geospatial Consortium Inc. Retrieved from por- tal.opengeospatial.org/files/?artifact_id=3837 Vretanos, P. A. (2002). OpenGIS Web Feature Service Implemen- tation Specification. Open Geospatial Consortium Inc. Granell, C., Díaz, L., & Gould, M. (2010). Service-oriented ap- plications for environmental models: Reusable geospatial Zhu, J. (2003). Quantitative Models for Performance Evalua- services. Environmental Modelling & Software, 25(2), 182– tion and Benchmarking: Data Envelopment Analysis With 198. doi:10.1016/j.envsoft.2009.08.005 Spreadsheets and Dea Excel Solver. Springer.

OSGEO Journal Volume 14 Page 41 of 48 Appendix A Test Scripts: • Hardware Dell OptiPlex 960 https://github.com/mepa1363/foss4g-test- script CPU Intel Core 2 Quad 3.0GHz Python-based geoprocessing services: RAM 8GB HDD 500GB • https://github.com/mepa1363/wyp-server- 52north-foss4g OS Ubuntu 13.04 (64-bit) • https://github.com/mepa1363/wyp-server- Table 6. Experimental database server configuration deegree-foss4g

• https://github.com/mepa1363/wyp-server- Software Version -foss4g 52◦ North 3.2.0 • https://github.com/mepa1363/wyp-server- Deegree 3.0.4 pywps-foss4g GeoServer 2.4.3 • https://github.com/mepa1363/wyp-server- PyWPS 3.2.1 zoo-foss4g Zoo 1.3.0 WPS instance: Java Oracle JDK 7 • https://github.com/mepa1363/wyp-wrapper- Servlet Container Apache Tomcat 7.0.30 52north-centralized-transit Python 2.7.3 • https://github.com/mepa1363/wyp-wrapper- PostgreSQL/PostGIS 9.1.12/1.5.3 deegree-centralized-transit

Table 7. Software libraries used to setup WPS servers • https://github.com/mepa1363/wyp-wrapper- geoserver-centralized-transit Appendix B • https://github.com/mepa1363/wyp-wrapper- pywps-centralized-transit The developed Python-based geoprocessing ser- vices, WPS instances, and test scripts are publicly • https://github.com/mepa1363/wyp-wrapper- available at the following URLs: zoo-centralized-transit

OSGEO Journal Volume 14 Page 42 of 48