A Survey of the Practice of Computational Science


Prakash Prabhu, Thomas B. Jablin, Arun Raman, Yun Zhang, Jialu Huang, Hanjun Kim, Nick P. Johnson, Feng Liu, Soumyadeep Ghosh, Stephen Beard, Taewook Oh, Matthew Zoufaly, David Walker, David I. August
Princeton University
fpprabhu,tjablin,rarun,yunzhang,[email protected]
fhanjunk,npjohnso,fengliu,soumyade,[email protected]
ftwoh,mzoufaly,dpw,[email protected]

ABSTRACT

Computing plays an indispensable role in scientific research. Presently, researchers in science have different problems, needs, and beliefs about computation than professional programmers. In order to accelerate the progress of science, computer scientists must understand these problems, needs, and beliefs. To this end, this paper presents a survey of scientists from diverse disciplines, practicing computational science at a doctoral-granting university with very high research activity. The survey covers many things, among them, prevalent programming practices within this scientific community, the importance of computational power in different fields, use of tools to enhance performance and software productivity, computational resources leveraged, and prevalence of parallel computation. The results reveal several patterns that suggest interesting avenues to bridge the gap between scientific researchers and programming tools developers.

1. INTRODUCTION

Computational science [53], a multidisciplinary field encompassing various aspects of science, engineering, and computational mathematics, is increasingly being seen as the "third approach" [23], after theory and experiment, to answering fundamental scientific questions. Researchers practicing computational science typically face two concerns competing for their time. Primarily, they must concentrate on their scientific problem by forming hypotheses, developing and evaluating models, performing experiments, and collecting data. At the same time, they also have to spend considerable time converting their models into programs and testing, debugging, and optimizing those programs.

In the past two decades, there has been an exponential increase in the amount of data generated and computation performed within many scientific disciplines [53, 55], signifying an increasing need for high performance computing. Writing correct and high performance programs is difficult even for computer scientists [33]. Given this background, this paper seeks to answer the question: How are scientists coping with the growing computing demands?

Recently, an online survey conducted a broad study of the programming practices of a wide range of researchers, revealing many potential problems encountered in correctly writing scientific programs [30, 43]. Continuing in the same spirit, this paper presents an in-depth study of the practice of computational science at Princeton University, a RU/VH¹ institution. This study is conducted through a survey of researchers from diverse scientific disciplines. This survey covers important aspects of computational science including programming practices commonly employed by researchers, the importance of computational power, and the performance enhancing strategies in use. The results are presented in the context of the university's prevailing computational environment, providing insights into diverse computational practices followed within the institution.

The analysis of survey results reveals several patterns that suggest various areas of improvement. In contrast to the popular view that scientists use only numerical algorithms written in MATLAB and FORTRAN, the survey discovered that C, C++, and Python were popular among many scientists and there is a growing need for non-numerical algorithms. Despite the availability of clusters and large-scale shared memory systems within the University and a general desire for higher performance through parallel computation, a substantial portion of scientific computation still takes place on scientists' personal computers. Although many scientists use shared-memory multicore desktops and not clusters for scientific computation, knowledge of shared-memory parallelization techniques in the scientific community is low. Furthermore, the survey determined that scientists frequently do not leverage performance analysis tools to track down the causes of poor performance and consequently "optimize" cold-code while ignoring a computation's real bottlenecks. The contributions of this paper are:

• An in-depth survey of the practice of computational science at a RU/VH institution. The survey was conducted through personal interviews with 114 researchers randomly selected from diverse fields of natural sciences, engineering, interdisciplinary sciences, and social sciences.

• An analysis of survey results that suggests several areas of improvement, both in terms of practices employed by scientific researchers and future research directions for programming tools developers.

¹ RU/VH stands for "very high research activity doctoral-granting university", as classified by the Carnegie Foundation [11].

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Copyright is held by the author/owner(s). SC’11, November 12–18, 2011, Seattle, Washington, USA. ACM 978-1-4503-0771-0/11/11.

2. SURVEY METHODOLOGY

The survey covers a set of 114 randomly selected researchers from diverse fields of science and engineering at Princeton University. The pool of survey candidates includes all graduate students, post doctoral associates, and research staff in various scientific disciplines at Princeton University. An email soliciting participation in the survey was initially sent to randomly selected candidates from the university database. The email mentioned "use of computation in research" as a criterion for participation. After a candidate replied indicating interest in the survey, an interview was conducted by at least two of the authors, exploring, in depth, the various aspects of scientific computing related to the candidate's research.

Table 1: Subject population distribution

Field                       Discipline                             Count
Natural Sciences            Astrophysics                               3
                            Atmospheric and Oceanic Sciences           2
                            Chemistry                                  5
                            Ecology and Evolutionary Biology           5
                            Physics                                    5
                            Geosciences                                6
                            Molecular Biology                          4
                            Plasma Physics                             2
Engineering                 Chemical                                   7
                            Civil and Environmental                    5
                            Mechanical and Aerospace                  11
                            Electrical                                12
                            Operations Research and Financial          5
Interdisciplinary Sciences  Music                                      4
                            Applied and Computational Math             2
                            Computational Biology                      4
                            Neuroscience/Psychology                   13
Social Sciences             Economics                                 10
                            Sociology                                  5
                            Politics                                   4
Total                                                                114

Table 1 shows the distribution of subjects across different scientific fields. In this paper, the word "scientist" is used in a broad sense, to cover researchers from natural sciences, engineering, interdisciplinary sciences, and social sciences. A total of 20 disciplines were represented. Of the 114 subjects, 32 were from the natural sciences, 40 from engineering, 23 [...]

[...] the results of the survey are presented suitably categorized into the three themes mentioned above. Each theme is introduced by posing a broad set of questions, and then answering these questions through a general set of patterns observed during the survey along with data to substantiate each observation. To highlight these key patterns, and other central ideas or conclusions that appear later in the paper, we set them apart from the main text as an italicized comment.

3.1 Computing Environment

Researchers at Princeton University are heavily supported in terms of computational resources and expertise. The Princeton Institute for Computational Science and Engineering (PICSciE) [13] aims to foster the computational sciences by providing computational resources as well as the experience necessary to capitalize on those resources. At the time of writing, these resources include the larger cluster hardware available through the Terascale Infrastructure for Groundbreaking Research in Engineering and Science (TIGRESS) [10]. TIGRESS is a high performance computing center that is an outcome of collaboration between PICSciE, various research centers [8, 9, 12, 14], and a number of academic departments and faculty members.

TIGRESS offers four Beowulf clusters (with 768, 768, 1024, and 3584 processors), and a 192 processor NUMA machine with shared memory and 1 petabyte of network attached storage. These clusters serve the computational needs of 192 researchers. Administrators at TIGRESS estimate that their systems are at 80% utilization. Additionally, PICSciE offers courses, seminars and colloquia to aid the computational sciences. Since 2003, PICSciE has offered mini-courses on data visualization, scientific programming in Python, FORTRAN, MATLAB, Maple, Perl and other languages, technologies for parallel computing (including MPI and OpenMP), as well as courses on optimization and debugging parallel programs. Recently, PICSciE began offering a course on scientific computing. PICSciE also offers programming support for troubleshooting malfunctioning programs, parallelizing existing serial codes, and tuning software for maximum performance.

3.2 Programming Practices

Representative questions concerning this theme included: