The Iplant Collaborative: Cyberinfrastructure for Plant Biology

View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by PubMed Central REVIEW ARTICLE published: 25 July 2011 doi: 10.3389/fpls.2011.00034 The iPlant collaborative: cyberinfrastructure for plant biology Stephen A. Goff 1*, Matthew Vaughn 2, Sheldon McKay1, Eric Lyons1, Ann E. Stapleton3,4, Damian Gessler1, Naim Matasci1, Liya Wang5, Matthew Hanlon 2, Andrew Lenards1, Andy Muir1, Nirav Merchant1, Sonya Lowry1, Stephen Mock 2, Matthew Helmke1, Adam Kubach 2, Martha Narro1, Nicole Hopkins1, David Micklos 6, Uwe Hilgert 6, Michael Gonzales 2, Chris Jordan 2, Edwin Skidmore1, Rion Dooley 2, John Cazes 2, Robert McLay 2, Zhenyuan Lu 5, Shiran Pasternak 5, Lars Koesterke 2, William H. Piel 7, Ruth Grene 8, Christos Noutsos 5, Karla Gendler 2, Xin Feng 5,9, Chunlao Tang 5, Monica Lent1, Seung-Jin Kim1, Kristian Kvilekval10, B. S. Manjunath10,27, Val Tannen11, Alexandros Stamatakis12, Michael Sanderson13, Stephen M. Welch14, Karen A. Cranston15, Pamela Soltis16, Doug Soltis17, Brian O’Meara18, Cecile Ane19,20, Tom Brutnell 21, Daniel J. Kleibenstein 22, Jeffery W. White 23, James Leebens-Mack24, Michael J. Donoghue25, Edgar P. Spalding26, Todd J. Vision 28, Christopher R. Myers32, David Lowenthal29, Brian J. Enquist13, Brad Boyle13, Ali Akoglu 30, Greg Andrews 29, Sudha Ram 31, Doreen Ware5, Lincoln Stein5,9 and Dan Stanzione 2 1 BIO5 Institute, University of Arizona, Tucson, AZ, USA 2 Texas Advanced Computer Center, University of Texas, Austin, TX, USA 3 Department of Biology, University of North Carolina, Wilmington, NC, USA 4 Department of Marine Sciences, University of North Carolina, Wilmington, NC, USA 5 Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA 6 DNA Learning Center, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA 7 Yale Peabody Museum, Yale University, New Haven, CT, USA 8 Department of Plant Pathology, Physiology and Weed Science, Virginia Tech University, Blacksburg, VA, USA 9 Ontario Center for Cancer Research, Toronto, ON, Canada 10 Center for Bio-image Informatics, University of California, Santa Barbara, CA, USA 11 Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA 12 Scientific Computing Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany 13 Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, USA 14 Department of Agronomy, Kansas State University, Manhattan, KS, USA 15 National Evolutionary Synthesis Center (NESCent), Durham, NC, USA 16 Florida Museum of Natural History, University of Florida, Gainesville, FL, USA 17 Department of Biology, University of Florida, Gainesville, FL, USA 18 Department of Ecology and Evolutionary Biology, University of Tennessee, Knoxville, TN, USA 19 Department of Statistics, University of Wisconsin, Madison, WI, USA 20 Department of Botany, University of Wisconsin, Madison, WI, USA 21 Boyce Thompson Institute for Plant Research, Cornell University, Ithaca, NY, USA 22 Department of Plant Sciences, University of California, Davis, CA, USA 23 Arid-Land Agricultural Research Center, United States Department of Agriculture-Agricultural Research Service, Maricopa, AZ, USA 24 Department of Plant Biology, University of Georgia, Athens, GA, USA 25 Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, USA 26 Department of Botany, University of Wisconsin, Madison, WI, USA 27 Electrical and Computer Engineering, University of California, Santa Barbara, CA, USA 28 Department of Biology, University of North Carolina, Chapel Hill, NC, USA 29 Department of Computer Science, University of Arizona, Tucson, AZ, USA 30 Department of Electrical and Computer Engineering, University of Arizona, Tucson, AZ, USA 31 Eller School of Business, University of Arizona, Tucson, AZ, USA 32 Department of Physics, Cornell University, Ithaca, NY, USA Edited by: The iPlant Collaborative (iPlant) is a United States National Science Foundation (NSF) funded Gane Ka-Shu Wong, University of project that aims to create an innovative, comprehensive, and foundational cyberinfrastructure Alberta, Canada in support of plant biology research (PSCIC, 2006). iPlant is developing cyberinfrastructure Reviewed by: Michael Deyholos, University of that uniquely enables scientists throughout the diverse fields that comprise plant biology to Alberta, Canada address Grand Challenges in new ways, to stimulate and facilitate cross-disciplinary research, Sean W. Graham, University of British to promote biology and computer science research interactions, and to train the next generation Columbia, Canada of scientists on the use of cyberinfrastructure in research and education. Meeting humanity’s *Correspondence: projected demands for agricultural and forest products and the expectation that natural Stephen A. Goff, iPlant Collaborative, BIO5 Institute, University of Arizona, ecosystems be managed sustainably will require synergies from the application of information Tucson, AZ 85721, USA. technologies. The iPlant cyberinfrastructure design is based on an unprecedented period of e-mail: [email protected] research community input, and leverages developments in high-performance computing, data storage, and cyberinfrastructure for the physical sciences. iPlant is an open-source project with application programming interfaces that allow the community to extend the infrastructure to meet its needs. iPlant is sponsoring community-driven workshops addressing specific www.frontiersin.org July 2011 | Volume 2 | Article 34 | 1 Goff et al. The iPlant collaborative scientific questions via analysis tool integration and hypothesis testing. These workshops teach researchers how to add bioinformatics tools and/or datasets into the iPlant cyberinfrastructure enabling plant scientists to perform complex analyses on large datasets without the need to master the command-line or high-performance computational services. Keywords: cyberinfrastructure, bioinformatics, plant biology, computational biology WHAT IS iPLANT? based on modern molecular analysis techniques, through a bet- iPlant is a cyberinfrastructure development project designed to ter understanding of the evolution of important plant traits, and create the foundation to support the computational needs of the through better predictions of the environment’s impact on plant research community and facilitate progress toward solutions of physiology. Biology is a data-driven and data-intensive science major problems in plant biology. This cyberinfrastructure founda- (Smith et al., 2011a). Biologists are inundated with new data, tion must support a diverse group of plant science researchers and from ever-cheaper DNA sequence data to complex traits, species bring together experts in various fields of biology and computer sci- relationships, environmental impacts and responses, and molecu- ence. The platform created by iPlant helps researchers use tools and lar phenotypes. Plant science data range in scope from complete data more easily and efficiently, gain access to high-performance genome sequences of individual plant varieties to geospatial maps computing (HPC) when it is needed, and provide interoperable of plant species distribution across the entire biosphere (Hughes, software analysis and large data access in a sustainable fashion. 2006; Armstead et al., 2009). These data vary in scale from the The cyberinfrastructure platform is useful to scientists at all levels results published in a single journal article to data entries in enor- of expertise ranging from students to traditional biology research- mous databases. Analytical methods are being developed at an ers and computational biology experts. An initial release of the accelerating pace – but data sets are not necessarily easy to integrate iPlant cyberinfrastructure is available and enhancements are being and tools to analyze these data are often inaccessible or poorly scal- released on a regular basis. able. The data integration problem is larger than a single lab can handle, and the solution requires cross-disciplinary approaches WHY THE RESEARCH COMMUNITY NEEDS with expertise from computer science, information science, and the CYBERINFRASTRUCTURE NOW life sciences. Investment in the creation of the existing analysis tools Consider the following example. Tara1, a plant biology researcher, and datasets has been significant and must be leveraged by iPlant wants to know how varieties of a major crop species can be devel- (Benfey et al., 2010; Buell and Last, 2010; Cook and Varshney, 2010; oped to better suit a changing environment. She is coordinating Edwards and Batley, 2010; Hirayama and Shinozaki, 2010; Paterson a collaborative project to address this question by identifying and et al., 2010; Pichersky and Gerats, 2011; Proost et al., 2011). Use analyzing small molecules, drought responsive regulatory/signaling of analysis tools in isolation contributes to the lack of experimen- pathways, and key epigenetic events in plants with a long history tal verifiability/reproducibility for computational analyses. This of adaption to limited water. To do this, her global team generates, article describes how iPlant’s cyberinfrastructure addresses these and uses, molecular genetics, transcriptomic (including small RNA profound needs, and how researchers like Tara will benefit from expression) and metabolite profiling data from related genotypes/ the cyberinfrastructure. varieties within the species under study, and data from closely Cyberinfrastructure (CI), as

The Iplant Collaborative: Cyberinfrastructure for Plant Biology

Iplant Collaborative Taking Plant Biology Into Cyberspace by Janni Simner and Susan Mcginley Leslie Johnston

Plant Pathology, Physiology, and Weed Science 2013

The Institutional Challenges of Cyberinfrastructure and E-Research

Cyberinfrastructure for High Energy Physics in Korea

Arxiv:1810.00224V2 [Q-Bio.PE] 7 Dec 2020 Humanity Is Increasingly Inﬂuencing Global Environments [195]

The Iplant Collaborative:Cyberinfrastructure for the Life Sciences

Downloaded Tiles H08v04 and H09v04 for the 2001–2010 Time Period from NASA [60])

Boston University Cyberinfrastructure Plan

Biodiversity Is More Than Just a Numbers Game 7 October 2014, by Daniel Stolte

Plant Sciences

Cyverse Austria—A Local, Collaborative Cyberinfrastructure

The Seamless Cyberinfrastructure: the Challenges of Studying Users of Mass Digitization and Institutional Repositories