Science Gateways: the Long Road to the Birth of an Institute
Total Page:16
File Type:pdf, Size:1020Kb
Proceedings of the 50th Hawaii International Conference on System Sciences | 2017 Science Gateways: The Long Road to the Birth of an Institute Sandra Gesing Katherine Lawrence University of Notre Dame University of Michigan Notre Dame, USA Ann Harbor, USA Linda B. Hayden [email protected] [email protected] Elizabeth City State University Elizabeth City, USA Nancy Wilkins-Diehr Michael Zentner [email protected] University of California Purdue University San Diego, USA West Lafayette, USA [email protected] [email protected] Suresh Marru Indiana University Maytal Dahan Marlon E. Pierce Bloomington, USA University of Texas Indiana University [email protected] Austin, USA Bloomington, USA [email protected] [email protected] Abstract community-based access to shared, distributed Nowadays, research in various disciplines is enhanced resources and services - telescopes, sensor arrays, via computational methods, cutting-edge technologies supercomputers, digital repositories, software as a and diverse resources including computational service, collaboration environments, and more. infrastructures and instruments. Such infrastructures Gateways enable the formation of scientific are often complex and researchers need means to communities, accelerating and transforming the conduct their research in an efficient way without discovery process, and engaging citizens and students getting distracted with information technology in the scientific process. They represent a fundamental nuances. Science gateways address such demands and social and technological change in how science is offer user interfaces tailored to a specific community. being conducted. Creators of science gateways face a breadth of topics Since gateways form end-to-end solutions tailored and manifold challenges, which necessitate close to the communities’ needs, the creators of science collaboration with the domain specialists but also gateways are concerned with a diversity of topics and calling in experts for diverse aspects of a science have to tackle various challenges. The fundamental gateway such as project management, licensing, team first step is to understand the requirements of the composition, sustainability, HPC, visualization, and community and an at least high-level insight into the usability specialists. The Science Gateway Community research area. The following domain-related topics Institute tackles the challenges around science need to be addressed in close collaboration with a gateways to support domain specialists and developers target community. via connecting them to diverse experts, offering • Specific goal of a science gateway consultancy as well as providing a software • Visions/demands on the layout collaborative, which contains ready-to-use science • Priorities of features and options, e.g., a list gateway frameworks and science gateway components. from must-have to great-to-have options • Integration of existing applications or 1. Introduction development of applications • Technologies of the applications Billions of people use the web every day. From • Demand on computational resources for an banking to travel arrangements to connecting with efficient and effective use of an application family, the web has changed how we conduct our lives. • Visualization of results and job submissions Its impact on scientific research is no different. Science • Security demands, i.e., users may want to share gateways are used by many researchers to conduct data after they received a patent or published it their work. Implemented as web, desktop, and mobile- • Need for workflows device applications, science gateways provide URI: http://hdl.handle.net/10125/41919 ISBN: 978-0-9981331-0-2 CC-BY-NC-ND 6243 • Data management, e.g., the location of input Gateway development has often been done in an data, the demand on storage for input, ad-hoc way, limiting success and long-term impact. intermediate and output data The science gateways name originated from a focus When these topics are clarified, creators should area within the NSF TeraGrid program (2005-2011) consider further topics for a seamless integration of [11, 12]. This program developed policies and cyberinfrastructure (CI) to achieve efficiently a procedures that made the use of high performance sustainable solution. computers via the web viable. Gateway developers in • Available infrastructure including security the program worked together both to design the infrastructure and resources policies and procedures and then use them as they • Available support of suitable technologies incorporated the use of supercomputers in their own • Scalability of suitable technologies gateways. But the program had surprising, unintended benefits. In addition to working on the common goal to • Effort for extending existing technologies use supercomputers in gateways, developers benefitted compared to novel developments from the mere existence of a community designing Synergy effects with other science gateway • advanced web portals for science. There was projects unexpected commonality across disciplines and few • Experience with available solutions and/or other forums for sharing experiences in scientific programming languages development. National Science Foundation (NSF) reports [1,2,3,4,5,6,7] describe the enormous effect that CI is In the course of this work, we noticed problems having on science, but they do not highlight the with duplication of effort and the long-term dramatic shift taking place in how the average user sustainability of gateways. Science gateways enable accesses advanced CI. Specialized, technical interfaces research, sometimes for thousands, but are not research such as the Unix command line no longer serve the projects in and of themselves. As a result, they struggle broad, diversifying community. Because research to find the right avenues for sustainable funding [13]. challenges are more interdisciplinary, scientists This work revealed a key finding: Gateways funded via demand the ability to focus on their research questions, short-term research grants can follow a destructive as well as to convey their results to the general public. cycle. Prototypes are developed, early adopters are Moreover, faculty wants to more efficiently and identified, and interest in using the resulting gateway is effectively integrate their research with teaching. encouraged. Then the project ends. Understandably, Programming skills amongst interdisciplinary teams researchers who had invested time in gateway use can vary widely. For all of these reasons, science teams become disillusioned and less likely to use gateways are turning to the web [7,8]. But the significance of again. gateways goes beyond replacement of command-line A 2008 white paper [13] highlighted these tools to include sharing of research (methods, data, challenges and led to an NSF-funded study (2009- visualizations, etc.), increasing transparency of 2011) to investigate what contributes to gateway methods and data used, reproducing others’ results, success [14]. Participants in focus groups helped shielding users from much of the complex underlying identify solutions. One outcome is that gateway infrastructure, and applying research-grade tools in builders need a common, reliable reservoir of software, classrooms and training. expertise, and support they can call on to increase their chances of successful development and 2. Background implementation. Developers should be able to easily share experience and use community-contributed tools Over the past ten years, research, workshops, to more quickly create robust, less expensive, and collected papers, and special journal issues [9, 10] have sustainable gateways. Their efforts should maximize called attention to the challenges and successes of the likelihood that their respective communities science gateways, while extending our understanding become interested in and maintain those gateways of how to address these challenges. Despite notable when initial funding ends. Similarly, gateway software successes, building and operating gateways is still too and service providers need to plan for the future with costly and too prone to failure. The success rate and clear leadership to more effectively serve the gateway capabilities of gateways must be further increased community. while reducing associated costs and effort spent. The next step is to leverage and enhance every dollar that is invested in gateways by supporting best practices for development, deployment, and sustainability. 6244 3. Large-Scale Survey 4. Institute Components A further award (2012-2016) allowed to explore via The Science Gateways Community Institute is the a 5000-respondent survey [15] whether the ideas result of years of study and community input. An gleaned from small but carefully chosen focus groups organization at this level can serve as a focal point that resonated with the larger research community. To our galvanizes the community, leverages the extensive but best knowledge the large-scale survey is the most disjointed investments in gateway resources, codes, comprehensive of its kind, covering the breadth of the and expertise, and supports longer-term career paths scientific and engineering disciplines represented at for gateway developers by highlighting the importance NSF and across all geographic areas in the U.S. The of gateways in the conduct of research today. The survey targeted NSF principal investigators, gateway Institute’s mission is to provide resources, expertise, developers, and leaders in higher