Journal of eScience Librarianship
Volume 1, Issue 1, Article 3
2012-02-14

DataONE: Facilitating eScience through Collaboration
Suzie Allard
University of Tennessee, Knoxville, TN, USA

Repository Citation
Allard S. DataONE: Facilitating eScience through Collaboration. Journal of eScience Librarianship 2012;1(1): e1004. https://doi.org/10.7191/jeslib.2012.1004. Retrieved from https://escholarship.umassmed.edu/jeslib/vol1/iss1/3

This work is licensed under a Creative Commons Attribution 4.0 License.

JESLIB 2012; 1(1): 4-17 doi:10.7191/jeslib.2012.1004

Correspondence to Suzie Allard: [email protected]

Keywords: eScience, DataONE, data-intensive science, cyberinfrastructure

Abstract

Objective: To introduce DataONE, a multi-institutional, multinational, and interdisciplinary collaboration that is developing the cyberinfrastructure and organizational structure to support the full information lifecycle of biological, ecological, and environmental data and tools to be used by researchers, educators, and the public at large.

Setting: The dynamic world of data intensive science at the point it interacts with the grand challenges facing environmental sciences.

Methods: Briefly discuss science’s “fourth paradigm,” then introduce how DataONE is being developed to answer the challenges presented by this new environment. Sociocultural perspectives are the primary focus of the discussion.

Results: DataONE is highly collaborative. This is a result of its cyberinfrastructure architecture, its interdisciplinary nature, and its organizational diversity. The organizational structure of an agile management team, diverse leadership team, and productive working groups provides for a successful collaborative environment where substantial contributions to the DataONE mission have been made by a large number of people.

Conclusions: Librarians and information science researchers are key partners in the development of DataONE. These roles are likely to grow as more scientists engage data at all points of the data lifecycle.

Introduction

EScience is changing the way librarians work and the services they provide. An important aspect of eScience is the focus on data, as noted by Kafel (2010): “A prominent feature of eScience is the generation of immense data sets that can be rapidly disseminated to other researchers via the internet.” There is an enormous increase in the amount of data collected, analyzed, re-analyzed, and stored, which is a result of developments in computational simulation and modeling, automated data acquisition, and communication technologies (National Academies of Science 2009). These data-intensive activities present challenges that librarians will be addressing with their science communities and that librarians are uniquely trained to negotiate successfully.

Beyond technological changes, as scientific research is becoming more data intensive, a “fourth paradigm” (Hey, Tansley, and Tolle 2009) has emerged. Gray (2007) identifies the first three paradigms over a temporal span beginning at a thousand years ago when science was empirically describing natural phenomena. In the last few hundred years, science added a theoretical branch using models and generalizations. Within the last few decades, science added a third paradigm which is a computational branch enabling simulations. The fourth paradigm is emerging now and is best described as data exploration that unifies theory, experiment, and simulation. It is often referred to as eScience. The fourth paradigm is changing how science is conducted (Hunt, Baldocchi and van Ingen 2009), as well as how scientists and publishers engage the scholarly record (Lynch 2009). The fourth paradigm, eScience, focuses on unifying theory, experiment, and simulation. The sociocultural changes brought about by the fourth paradigm also have implications for libraries and librarianship, suggesting the extension of current relationships within the scientific community, including publishers and the development of new collaborations. Ultimately, the key to benefitting society is to find solutions to the challenges that arise from conducting data intensive science (Hey, Tansley and Tolle 2009).

The science librarian can play an essential part in enabling the cyberinfrastructure, including both technology and people, that supports eScience, but this role is still emerging and may not be adequately defined in existing job descriptions. This paper is designed to help set the context of eScience so that the role of the eScience librarian can be explored. The paper begins by briefly discussing the cyberinfrastructure that is needed to make eScience successful, and then introduces one project, DataONE, as an exemplar to illustrate how a cyberinfrastructure may be configured, with particular attention to the participation of librarians.

The Need for Cyberinfrastructure

Many scientific problems are both data intensive and complex. For example, the grand challenges facing science, such as climate change (International Panel on Climate Change 2007), destructive pandemics (World Health Organization 2009), or sustainable energy (World Energy Council 2010), are not confined to one or two disciplines, but rather cross many scientific domains, creating a situation in which the information is becoming more interconnected (Hannay 2009). Recognizing that interconnections exist is important because it allows us to address complex issues with a better contextual understanding. However, interconnected information demands that we be able to make sense of information across disparate vocabularies, heterogeneous information artifacts, and diverse paradigms. This creates intellectual and technological challenges that may not be addressed sufficiently with traditional information tools and methods. It also suggests new roles for the information managers and librarians who work with the information, and for the people who create and use the information.

The foundation to successfully negotiate this complex data intensive environment is a robust cyberinfrastructure that provides the technology and associated tools to support scientists in their activities and to facilitate new ways to engage science (National Science Foundation Cyberinfrastructure Council 2007). The definition of cyberinfrastructure includes technological and sociological perspectives (National Science Foundation Blue-Ribbon Panel on Cyberinfrastructure 2003). Both perspectives are needed to address the challenges presented by the increased amount of data collected, analyzed, and stored, including a heightened need for technology that assures data preservation, for processes that enable digital curation, and for approaches to enable metadata interoperability. This means that data intensive science challenges extend beyond the traditional hard sciences and require research engagement from the social sciences. It also suggests that while data-driven science requires persistent and reliable data and tools for scientists to create and use these data, it also will benefit from tools that can be used by a variety of stakeholders beyond scientists, including government decision-makers, academic researchers, industry leaders, non-governmental organizations, and even the public at large.

Over the last five decades, the National Science Foundation (NSF) has played an important role in supporting the transformation to data-intensive science, beginning with funding campus-based computational facilities in the 1960s, Supercomputer Center Programs in the 1980s, and the High Performance Computing and Communications program in the 1990s. In the new millennium, the Office of Cyberinfrastructure created the vision and coordinated the efforts to provide insights into complex problems in science and engineering with the help of advanced computational facilities and instruments (National Science Foundation Cyberinfrastructure Council 2007; Computer Science and Telecommunications Board 1995).

NSF also envisioned the concept that cyberinfrastructure organizations could be created to find solutions to support data-intensive scientific and engineering research by inte- [...] in that cyberinfrastructure.

Introducing DataONE

DataONE is a multi-institutional, multinational, and interdisciplinary collaboration working to develop an organizational structure that will support the full information lifecycle of biological, ecological, and environmental data and tools to be used by researchers, educators, and the public at large. DataONE focuses on enabling data-intensive biological and environmental research through cyberinfrastructure that can be used as a tool to enable new science and evidence-based policy. The key tenet is that data must be robust, accessible, and secure; therefore data management, from both the technical and sociocultural perspectives, is