A CyberGIS-Jupyter Framework for Geospatial Analytics at Scale


Dandong Yin
Geography and Geographic Information Science
CyberGIS Center for Advanced Digital and Spatial Studies
University of Illinois at Urbana-Champaign
Urbana, Illinois 61801
[email protected]

Yan Liu
Geography and Geographic Information Science
CyberGIS Center for Advanced Digital and Spatial Studies
National Center for Supercomputing Applications
University of Illinois at Urbana-Champaign
Urbana, Illinois 61801
[email protected]

Anand Padmanabhan
Geography and Geographic Information Science
CyberGIS Center for Advanced Digital and Spatial Studies
National Center for Supercomputing Applications
University of Illinois at Urbana-Champaign
Urbana, Illinois 61801
[email protected]

Jeff Terstriep
CyberGIS Center for Advanced Digital and Spatial Studies
National Center for Supercomputing Applications
University of Illinois at Urbana-Champaign
Urbana, Illinois 61801
[email protected]

Johnathan Rush
CyberGIS Center for Advanced Digital and Spatial Studies
National Center for Supercomputing Applications
University of Illinois at Urbana-Champaign
Urbana, Illinois 61801
[email protected]

Shaowen Wang
Geography and Geographic Information Science
CyberGIS Center for Advanced Digital and Spatial Studies
National Center for Supercomputing Applications
University of Illinois at Urbana-Champaign
Urbana, Illinois 61801
[email protected]

ABSTRACT

The interdisciplinary field of cyberGIS (geographic information science and systems (GIS) based on advanced cyberinfrastructure) has a major focus on data- and computation-intensive geospatial analytics. The rapidly growing needs across many application and science domains for such analytics based on disparate geospatial big data pose significant challenges to conventional GIS approaches. This paper describes CyberGIS-Jupyter, an innovative cyberGIS framework for achieving data-intensive, reproducible, and scalable geospatial analytics using the Jupyter Notebook based on ROGER - the first cyberGIS supercomputer. The framework adapts the Notebook with built-in cyberGIS capabilities to accelerate gateway application development and sharing, while associated data, analytics and workflow runtime environments are encapsulated into application packages that can be elastically reproduced through cloud computing approaches. As a desirable outcome, data-intensive and scalable geospatial analytics can be efficiently developed and improved, and seamlessly reproduced among multidisciplinary users in a novel cyberGIS science gateway environment.

CCS CONCEPTS

• Information systems → Geographic information systems;
• Computer systems organization → Distributed architectures;
• World Wide Web → Web services;

KEYWORDS

CyberGIS, computational reproducibility, geospatial big data, flood mapping, science gateway

ACM Reference format:
Dandong Yin, Yan Liu, Anand Padmanabhan, Jeff Terstriep, Johnathan Rush, and Shaowen Wang. 2017. A CyberGIS-Jupyter Framework for Geospatial Analytics at Scale. In Proceedings of PEARC17, New Orleans, LA, USA, July 09-13, 2017, 8 pages.
https://doi.org/10.1145/3093338.3093378

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
PEARC17, July 09-13, 2017, New Orleans, LA, USA
© 2017 Association for Computing Machinery.
ACM ISBN 978-1-4503-5272-7/17/07...$15.00
https://doi.org/10.1145/3093338.3093378

1 INTRODUCTION

From the late 1990s, cyberinfrastructure has been playing increasingly important roles in mainstream scientific discoveries [5]. Dedicated to reducing its barriers to broad scientific communities, science gateways have achieved significant impacts on numerous scientific domains including particle physics [3], molecular chemistry [10], public health [1] and many others.

In geospatial domains, cyberGIS - geospatial information science and systems (GIS) based on advanced computing and cyberinfrastructure (CI) [22] - has enabled computation- and data-intensive knowledge discovery by gaining unprecedented insights into the complex and geospatially connected world from both natural and social sciences perspectives [12, 23, 28]. Pushing the frontiers of science gateways, a suite of cyberGIS gateway applications [6, 7, 11] has been developed to simplify access to advanced cyberGIS and cyberinfrastructure by providing interactive, online interfaces to scalable geospatial analytics. However, as the diversity of cyberGIS data and applications keeps growing, cyberGIS-enabled geospatial analytics poses challenges to traditional web-based gateway approaches.

Due to the high variety and complexity of cyberGIS analytics, it is difficult to provide a comprehensive solution in a single gateway application mimicking traditional desktop GIS. Therefore, it is typical for each gateway application to focus on a specific type of analytics (e.g. CyberGIS-BioScope [7] for biomass-to-biofuel supply chain system optimization; FluMapper [16] for mapping the spread of influenza-like illnesses from Twitter data; and TopoLens [6] for accessing high-resolution national topographic datasets).

Given the enormous application space of cyberGIS, agile development for new gateway applications is urgently needed. In the most desirable cases, domain researchers should be able to implement and customize gateway applications for their unique needs, instead of depending on dedicated developers. However, most traditional cyberGIS gateways were developed with web GIS, i.e. GIS systems that adopt a browser/server architecture, typically with interactive graphical user interfaces as the front end and a set of dedicated services in the back end. In traditional web GIS development cycles, it is difficult to achieve agile development, especially for domain researchers. The three main reasons are as follows:

(1) Developing a complete web application with interactive graphical user interfaces (GUIs) from scratch requires professional skills that most geospatial researchers do not possess;
(2) Handling large-scale computation with middleware orchestration behind front-end interfaces requires substantial knowledge of cyberGIS and HPC;
(3) Operational maintenance, including deployment and user and data management, poses significant overhead for common researchers.

As a result, the intensive development requirements of web GIS become a major bottleneck in meeting the proliferating needs for cyberGIS capabilities from domain researchers. Therefore, to fully leverage the power of cyberGIS, it is necessary not only to reduce the barrier of accessing cyberGIS via gateway applications, but also [...]

[...] development environment. Meanwhile, the emergence of JupyterHub (https://jupyterhub.readthedocs.io) makes it possible to deploy Jupyter on distributed infrastructure.

This paper describes CyberGIS-Jupyter, an innovative framework that integrates cloud-based Jupyter notebooks (a highly interactive read-eval-print loop (REPL) environment) [9] with HPC resources as part of a hybrid computing environment [14, 18]. The framework addresses the development challenge as follows: 1) it adopts Jupyter notebooks instead of web GIS as the front-end interface to provide a consistent and agile playground for both developers and users; 2) it encapsulates advanced cyberGIS capabilities within a pre-configured and containerized environment; 3) it achieves on-demand provisioning through cloud computing to elastically deploy and manage multiple instances of gateway applications. Furthermore, the reproducible deployment enables researchers to share and build on each other’s work to innovate large-scale geospatial analytics cumulatively in a collaborative fashion. With this framework, community-driven gateway development and deployment becomes feasible. To the best of our knowledge, our work provides the first general framework to modularize gateway development and deployment for domain researchers.

The remainder of this paper is organized as follows. Section 2 examines the related work of CyberGIS-Jupyter; Section 3 presents the design and architecture of CyberGIS-Jupyter; Section 4 demonstrates the framework’s agility in transforming complex cyberGIS computation into interactive gateway applications with a case study of computing height above nearest drainage (HAND) at 10m resolution for the conterminous US (CONUS); and Section 5 concludes the paper and discusses future work.

2 RELATED WORK

To lower the barrier of entry to CI, a series of science gateways have been developed to enable computation- and data-intensive research and education [11, 25, 26]. In order to provide easy access, most science gateways adopt the Software as a Service (SaaS) [21] approach, i.e. providing applications through interactive web services with Web 2.0 technology and Service-Oriented Architecture (SOA) [13]. A similar architecture was adopted for cyberGIS gateways [11]. On top of the generic science gateway architecture, a typical geospatial gateway usually adopts web-based GIS capabilities such as OpenLayers (http://openlayers.org)
