Open CyberGIS Software for Geospatial Research and Education in the Big Data Era
SoftwareX (www.elsevier.com/locate/softx). Available online at www.sciencedirect.com.

Shaowen Wang (a,b,c,d,e,f,*), Yan Liu (a,b,c,f), Anand Padmanabhan (a,b,c,f)

a CyberGIS Center for Advanced Digital and Spatial Studies, University of Illinois at Urbana–Champaign, Urbana, IL 61801, USA
b CyberInfrastructure and Geospatial Information Laboratory, University of Illinois at Urbana–Champaign, Urbana, IL 61801, USA
c Department of Geography and Geographic Information Science, University of Illinois at Urbana–Champaign, Urbana, IL 61801, USA
d Department of Urban and Regional Planning, University of Illinois at Urbana–Champaign, Urbana, IL 61801, USA
e Graduate School of Library and Information Science, University of Illinois at Urbana–Champaign, Urbana, IL 61801, USA
f National Center for Supercomputing Applications, University of Illinois at Urbana–Champaign, Urbana, IL 61801, USA

Received 16 March 2015; received in revised form 3 July 2015; accepted 28 October 2015

Abstract

CyberGIS represents an interdisciplinary field combining advanced cyberinfrastructure, geographic information science and systems (GIS), spatial analysis and modeling, and a number of geospatial domains to improve research productivity and enable scientific breakthroughs. It has emerged as new-generation GIS that enable unprecedented advances in data-driven knowledge discovery, visualization and visual analytics, and collaborative problem solving and decision-making. This paper describes three open software strategies – open access, source, and integration – to serve various research and education purposes of diverse geospatial communities. These strategies have been implemented in a leading-edge cyberGIS software environment through three corresponding software modalities: CyberGIS Gateway, Toolkit, and Middleware, and achieved broad and significant impacts.

© 2015 Published by Elsevier B.V.
Keywords: CyberGIS; Cyberinfrastructure; Geospatial big data

Code metadata

Current code version: v0.6
Permanent link to code/repository used of this code version: https://github.com/ElsevierSoftwareX/SOFTX-D-15-00005
Legal Code License: NCSA open source license
Code versioning system used: git
Software code languages, tools, and services used: C, C++, Python, Bash; MPI, OpenMP, CUDA
Compilation requirements, operating environments & dependencies: Compilers: GNU/Intel/Cray; OS: Linux (RedHat, Debian, Ubuntu, CentOS, SUSE); Dependencies: GDAL, GEOS, PROJ4, SPRNG, PySAL, OpenGeoDa, etc.
If available, link to developer documentation/manual: https://github.com/cybergis/cybergis-toolkit http://cybergis.cigi.uiuc.edu/cyberGISwiki/doku.php/ct
Support email for questions: CyberGIS Helpdesk ([email protected])

* Corresponding author at: CyberGIS Center for Advanced Digital and Spatial Studies, University of Illinois at Urbana–Champaign, Urbana, IL 61801, USA. E-mail address: [email protected] (S. Wang).

http://dx.doi.org/10.1016/j.softx.2015.10.003
2352-7110/© 2015 Published by Elsevier B.V.

1. Motivation and significance

Geospatial data and related analytics have become ubiquitous as continued growth in geographic information science and technology enables scientific investigations and decision-making support in a plethora of science and engineering fields including for example ecology, environmental science and engineering, public health, geosciences, and social sciences [1,2]. Extensive computational capabilities are needed to manage and analyze massive quantities of complex and heterogeneous geospatial data collected across multiple scales and used for diverse applications by many geospatial communities [3]. However, conventional GIS approaches and associated software tools are primarily developed using sequential computing and cannot adequately resolve this increasing data intensity, complexity, and diversity of applications [4]. CyberGIS – defined as GIS based on advanced cyberinfrastructure (CI) – collectively harnessing heterogeneous CI resources (e.g., cloud, high-end, and high-throughput) has emerged as new-generation GIS for resolving geospatial big data challenges [5,6].

The rapid development of cyberGIS as an interdisciplinary field has been pushed by advanced digital technologies and pulled by a large number of scientific innovation and discovery challenges and opportunities that exist in numerous geospatial communities. CyberGIS has evolved as a complex ecosystem of hardware, infrastructure, software and services, and applications [7]. Open software is critical to effectively resolve the complexity of the ecosystem and support the diversity of geospatial communities. Our open cyberGIS software approach has three key strategies: open access, source, and integration enabled by three corresponding modalities: CyberGIS Gateway, Toolkit, and Middleware [8,5,9]. CyberGIS Gateway (referred to as Gateway hereafter) provides an online problem-solving environment for geospatial communities to access cyberGIS software and data capabilities based on CI. CyberGIS Toolkit maintains a suite of community-selected open source spatial analysis and modeling software that is scalable on high performance computing resources. GISolve Middleware bridges Gateway and Toolkit to manage the complexity of CI access. These three modalities form open software architecture (see Figure 1 in [8] and Figure 1 in [10]) to address open access, open API, open source, and CI-based integration and computation for cyberGIS software. This cyberGIS approach has already had significant impact in a number of domains (e.g., biosciences [11], coupled human–natural systems [12], econometrics [13], and public health [14,15]).

2. CyberGIS Gateway – open access

Gateway is the leading online geospatial problem-solving environment providing cyberGIS capabilities to serve various research and education purposes [8,5]. As a pioneer of science gateways [16], Gateway is built on the TeraGrid GIScience Gateway approach to bridging advanced CI and GIS capabilities through friendly user interfaces based on rich-client web technologies [17]. Gateway capabilities are made available to users at two levels: service and application. As an open access platform, Gateway represents a software-as-service approach that significantly reduces the complexity of accessing advanced CI and managing cyberGIS software. In general, advancing scientific software requires both software engineering and domain-specific scientific knowledge. In particular, cyberGIS software exhibits additional dimensions of complexity due to the integration with high-performance parallel and distributed computing resources and services, and diverse geospatial user communities. Each Gateway service is currently implemented as a RESTful web service (https://en.wikipedia.org/wiki/Representational_state_transfer).

The service-oriented approach alone is not sufficient for broad open access to cyberGIS capabilities. Using cyberGIS functions often needs highly interactive user interfaces because geospatial data and analytics require frequent user involvement in such tasks as data and study area selection, map projection, feature extraction, and map visualization. Therefore, Gateway is designed to provide a rich set of interactive user interface components for cyberGIS data and analytics by exploiting advances in web technologies such as HTML5 and geospatial visualization software. A Gateway application is a standalone web application within the Gateway online framework to interact with backend CI and services for a suite of geospatial data and analytical functions. Detailed discussion about Gateway application development can be found in [8,10]. This video: https://www.youtube.com/watch?v=hrJ_cZkG-Xs&t=12 provides an illustrative example of a geoscience application while demonstrating how Gateway can interoperate with online data services.

CyberGIS Gateway has been advanced as an open access environment for a large number of users to perform compute- and data-intensive, and collaborative geospatial problem solving enabled by advanced CI. The development of Gateway software focuses on reusable cyberGIS user interface components and Gateway portal management. Reusable user interface components such as map panel, visualization and symbology, data layer ordering, and map-making functions are built in Gateway as JavaScript library for application development. A coding framework is established for scalable integration of individual application codes and portal management.

3. CyberGIS Toolkit – open source

CyberGIS Toolkit integrates a set of loosely coupled scalable geospatial software components for the following purposes [10]:

• Sustain the CyberGIS Toolkit as a reliable community software toolbox for scalable cyberGIS analytics through rigorous software building, testing, packaging, and deployment based on open source software practice;
• Capture spatial characteristics of software elements to achieve optimal computational performance, scalability, and portability in various CI environments; and
• Engage computational and data scientists to advance scalable geospatial computing.

Table 1. Software components in CyberGIS Toolkit.
Name | Description | Scalable computing | Deployment | Scalability (cores)
--- | --- | --- | --- | ---
PABM | Scalable agent-based modeling | MPI + MPI IO | XSEDE | 16,384
Parallel PySAL | Scalable PySAL functions | Multi-core | XSEDE | 32+
PGAP | Parallel Genetic Algorithm Library | MPI | XSEDE, Blue Waters | 262,144