Simplifying Complex Software Assembly: the Component Retrieval Language and Implementation

Eric L. Seidel (Presenting author)
City College of New York
New York, NY 10031

Gabrielle Allen
Department of Computer Science
Center for Computation & Technology
Louisiana State University
Baton Rouge, LA 70803

Steven Brandt
Center for Computation & Technology
Louisiana State University
Baton Rouge, LA 70803

Frank Löffler
Center for Computation & Technology
Louisiana State University
Baton Rouge, LA 70803

Erik Schnetter
Department of Physics & Astronomy
Center for Computation & Technology
Louisiana State University
Baton Rouge, LA 70803

arXiv:1009.1342v1 [cs.PL] 7 Sep 2010

ABSTRACT

Assembling simulation software along with the associated tools and utilities is a challenging endeavor, particularly when the components are distributed across multiple source code versioning systems. It is problematic for researchers compiling and running the software across many different supercomputers, as well as for novices in a field who are often presented with a bewildering list of software to collect and install.

In this paper, we describe a language (CRL) for specifying software components with the details needed to obtain them from source code repositories. The language supports public and private access. We describe a tool called GetComponents which implements CRL and can be used to assemble software.

We demonstrate the tool for application scenarios with the Cactus Framework on the NSF TeraGrid resources. The tool itself is distributed with an open source license and is freely available from our web page.

Categories and Subject Descriptors

D.2.7 [Software Engineering]: Distribution, Maintenance, and Enhancement - Version Control, Extensibility; D.3.2 [Programming Languages]: Language Classifications - Specialized application languages

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
TeraGrid '10, August 2-5, 2010, Pittsburgh, PA, USA.
Copyright 2010 ACM 978-1-60558-818-6/10/08 ...$10.00.

1. INTRODUCTION

Compute resources, along with their associated data storage and network connectivity, are growing ever more powerful. The current computational environment provided by the National Science Foundation to support its academic research agenda includes several petascale machines as part of the distributed TeraGrid facility and the multi-petaflop "Blue Waters" machine, which should be operational in 2011. This increase in compute capacity is needed to satisfy the requirements of software applications that are being developed to model Grand Challenge scientific problems with unprecedented fidelity in fields such as climate change, nuclear fusion, astrophysics, and material science, as well as non-traditional applications in social sciences and humanities. As these applications grow in size they are also growing more complex, coupling together different physical models across varying spatial and temporal scales, and involving distributed teams of interdisciplinary researchers, heralding a new era of collaborative multi-scale and multi-model simulation codes.

One approach to developing application codes in an efficient, sustainable and extensible manner is through the use of application-level component frameworks or programming environments. Component frameworks can support reuse and community development of software by encapsulating common tools or methods within a domain or set of domains. Cactus, a component framework for high performance computing [12, 10], provided the motivation for the work described in this paper. As we describe in Section 2, Cactus users typically assemble their simulation codes from many different software modules distributed from different locations, which poses a number of challenges for users in both describing the needed modules and actually retrieving them (Figure 1).

Figure 1: Applications such as the Einstein Toolkit (Section 7) built from component frameworks such as Cactus can involve assembling hundreds of modules from distributed, heterogeneous source code repositories.

Version control systems, such as CVS, Subversion, or Git, are used to manage and maintain the modules that make up these frameworks. Such systems track changes to the source code and allow developers to recover a stable version of their software, should an error be introduced. There are a large number of version control systems in use, and while some are relatively compatible (tools exist to convert a CVS repository to Subversion or Git), many are not (see footnote 1). This can create issues when users want to assemble, and then maintain, a component framework that includes modules from a variety of systems. A complex framework like Cactus would be very difficult to maintain without some way of automating the checkout/update process.

Footnote 1: An in-depth comparison of version control systems can be found at http://en.wikipedia.org/wiki/Comparison_of_revision_control_software

To address this issue in a general manner for complex code assembly for any application, we have designed a new language, the Component Retrieval Language (or CRL), that can be used to describe modules along with the information needed for their retrieval from remote, centralized repositories. We have implemented a tool based on this language that is now being used by Cactus users for large scale code assembly.

This paper starts by describing the Cactus framework in Section 2, which provides the motivation for the Component Retrieval Language. Then it describes related work in Section 3 before detailing the design issues for the component retrieval language in Section 4. Section 5 describes the grammar of the new component retrieval language, and Section 6 discusses the GetComponents tool that has been written to implement this language. Section 7 provides an example showcasing the use of GetComponents on the resources of the NSF TeraGrid for a community of Cactus users. Section 8 describes planned future work in improving code assembly for complex software efforts in scientific computing before concluding in Section 9.

2. CACTUS EXAMPLE FOR DISTRIBUTED CODE ASSEMBLY

Cactus [12, 10] is an open-source framework designed for the collaborative development of large scale simulation codes in science and engineering. Computational toolkits distributed with Cactus already provide a broad range of capabilities for solving initial value problems in a parallel environment. The Cactus Computational Toolkit includes modules for I/O, setting up coordinate systems, outer and symmetry boundary conditions, domain decomposition and message passing, standard reduction and interpolation operators, numerical methods such as method of lines, as well as tools for debugging, remote steering, and profiling. Cactus is supported and used on all the major NSF TeraGrid machines, as well as others outside the TeraGrid, and is included in the advanced tools development for the NSF Blue Waters facility.

Cactus is used by applications in areas including relativistic astrophysics, computational fluid dynamics, reservoir simulations, quantum gravity, coastal science and computer science. Cactus users assemble their codes from a variety of independent components (called thorns), which are typically developed and distributed from different source code repositories that are geographically, institutionally and politically varied. Source code repositories can be public (with anonymous read access) or private (with authentication by user or group); they are of different types (e.g. CVS, SVN, darcs, git, Mercurial); and the location of thorns within a repository varies. Cactus simulations, for example in the field of numerical relativity, can involve some 200 thorns from some ten different repository servers around the world. In addition, Cactus users typically use other tools or utilities that are not part of the actual simulation code, such as the Simulation Factory [1] for building and deploying, or visualization clients and shared parameter files.

3. RELATED WORK

Cactus already included a tool for assembling codes from thorns: GetCactus, which was released in 1999 with the first general release of Cactus and addressed several of the issues alluded to in the introduction. GetCactus was written specifically to check out Cactus [...]

[...] ETICS, and consists only of software that the user installs, without providing actual testing hardware. Being Python based, software is checked out via commands in a Python script. BuildBot provides some abstraction to access CVS, SVN, etc. repositories, but the download process is described in a procedural manner as a sequence of commands, not in a descriptive manner. This means that the information that has been specified to download the software is "hidden" in the Python script and is not accessible to other tools.

Debian, Red Hat, SUSE etc. are Linux distributions where a complete installation consists of a set of packages. These packages are available in a specific format (e.g. deb, rpm) which contains their source code (or binary code) as well as metadata describing e.g. package dependencies and installation procedures. Usually, these packages are available from a single, centralized source (e.g. the distributor itself), and they thus do not need [...]
