Open Source Bioimage Informatics for Cell Biology

CORE Metadata, citation and similar papers at core.ac.uk Provided by Elsevier - Publisher Connector Opinion Special Issue – Imaging Cell Biology Open source bioimage informatics for cell biology Jason R. Swedlow1 and Kevin W. Eliceiri2 1 Wellcome Trust Centre for Gene Regulation and Expression, College of Life Sciences, University of Dundee, Dundee, Scotland DD1 5EH, UK 2 Laboratory for Optical and Computational Instrumentation, Departments of Molecular Biology and Biomedical Engineering, Graduate School, University of Wisconsin at Madison, Madison, WI 53706, USA Open access under CC BY license. Significant technical advances in imaging, molecular reduced representations that scientists can manipulate, biology and genomics have fueled a revolution in cell analyze, share with colleagues, and ultimately under- biology, in that the molecular and structural processes of stand. Despite the diversity of applications of imaging in the cell are now visualized and measured routinely. biology, there are common unifying challenges such as Driving much of this recent development has been the displaying a multi-gigabyte time-lapse movie on a laptop advent of computational tools for the acquisition, visual- screen, or identifying, tracking, and measuring the objects ization, analysis and dissemination of these datasets. in that movie and presenting the resulting measurements These tools collectively make up a new subfield of com- in a graph that reveals the mechanisms that drive their putational biology called bioimage informatics, which is movements. These requirements have spawned the new facilitated by open source approaches. We discuss why field of bioimage informatics [11], which aims to deliver open source tools for image informatics in cell biology tools for data visualization, management, storage, and are needed, some of the key general attributes of what analysis. While still a relatively young field, bioimage make an open source imaging application successful, informatics has already had a major impact in cell biology and point to opportunities for further operability that particularly in the area of quantitative cell imaging where should greatly accelerate future cell biology discovery. advanced feature recognition, segmentation, annotation and data mining approaches are used regularly [12–20]. Bioimage Informatics as a discovery tool in cell biology Almost all commercially provided image acquisition sys- Imaging is used as a tool for discovery throughout basic life tems include software tools that provide sophisticated ima- science, and biomedical and clinical research. In these ge visualization and analysis functions for the images domains, advances in light and electron microscopy have recorded by the instrument they control. However, in recent transformed biological discovery, enabling visualization of years, many non-commercial projects have appeared, mechanism and dynamics across scales of nanometers to almost always based in research laboratories that require millimeters and picoseconds to many days. Fluorescent functionality not available in commercial products. Here, we protein (FP)-tagged fusions can be used as reporters of discuss the application of bioimage informatics in cell biomolecular interactions in cultured living cells [1], and biology and focus specifically on the development of open the same reporter can reveal the localization and growth of source solutions for bioimage informatics that have emerged a tumor in a living animal [2,3]. In short, the last 20 years over the last few years. have provided us with a wealth of sophisticated biological reporters and image data acquisition tools for biomedical What are the informatics challenges in quantitative cell research. Many of these imaging and instrumentation biology imaging? developments have been driven by partnerships between Given the rapid development in image acquisition systems academic laboratories that invent and prototype new tech- in the last 20 years, it is worth considering why a corre- nology and commercial entities that develop and market sponding rapid development of informatics tools has them as commercial products. This development and deliv- occurred only recently. Certainly, one of the barriers to ery pipeline of commercial imaging instrumentation and providing universal tools for bioimage informatics is the software has been quite successful, having delivered the diversity of data structures and experimental applications laser scanning confocal [4,5], spinning disc confocal [6,7], that produce imaging data. In optical microscopy alone wide-field deconvolution [8,9] and multiphoton micro- there are a substantial number of different types of ima- scopes [10] that are engines of discovery in cell and devel- ging modalities and, indeed, a method like fluorescence opmental biology. microscopy encapsulates a huge and rapidly growing field All of these methodologies produce complex, multi- of image acquisition approaches [21]. Informatics tools that dimensional data sets that must be transformed into support this range of methods must be capable of capturing Corresponding author: Swedlow, J.R. ([email protected]) the raw data (the individual pixels) and the metadata 656 0962-8924 ß 2009 Elsevier Ltd. All rights reserved. doi:10.1016/j.tcb.2009.08.007 Available online 14 October 2009 Opinion Trends in Cell Biology Vol.19 No.11 around the acquisition methodology itself, including works may be distributed. Table 1 gives some examples of instrument settings, exposure details etc. This diversity commonly used open source software licenses and summar- of data structures makes delivering common informatics izes their terms. For any users or developers, these details solutions difficult, and this complexity is multiplied by the are important and must be understood given the great large number of commercial imaging systems that use implications for development and deployment. individually specified, and often proprietary, file formats The ability to see and make changes to the work of for data storage. Our current estimates are that there are another developer is a critical component of open source approximately 80 proprietary file formats for optical micro- software. The attractive aspect of this approach for science scopy alone (and not including other common imaging is that users and developers can directly see, evaluate, and techniques) that must be supported by any bioimage infor- use another’s work (really, their intellectual property) and, matics tool that aims to provide a generalizable solution. In if necessary, build upon it. This is a key and often over- short, the lack of standardized access to data makes the looked part of open source software. Successful open source generation of informatics tools quite difficult. software development projects are dynamic, evolving A deeper challenge resides in each individual laboratory enterprises allowing input, feedback, and often contri- that uses imaging as part of its experimental repertoire. butions from their community. The sheer size of the raw data sets and the rate of pro- This evolving, adaptable aspect makes open source soft- duction mean that individual researchers can easily gen- ware particularly useful for scientific discovery and, more erate many tens of gigabytes of data per day. This means specifically, for the rapidly evolving and diverse set of that large laboratories or departmental imaging facilities imaging applications used in biological research. Commer- generate many hundreds of gigabytes to terabytes per cial and closed source applications have certainly sup- week, and are now enterprise-level data production facili- ported many significant advances in imaging. However, ties. However, the expertise for developing enterprise soft- an essential part of bioimaging data analysis is the ability ware tools or even simply running the hardware necessary to easily try new methodology and approaches or even to for this scale of data management and analysis rarely combine existing ones to generate a derivative result based exists in individual laboratories. In short, the sophisticated on the combination of two approaches. Open source systems and development expertise that are used to deliver approaches make this possible. As such, there is a natural genomics databases and applications are required in indi- fit between open source software and the process of scien- vidual imaging laboratories and facilities. The delivery of tific discovery. In addition, a consequence of the growth of tools that provide access to a broad range of data types, the open source community is a de facto establishment of manage and analyze large sets of data, and help run the standardized documentation methods (http://java.sun.- systems that store and process this data is the challenge com/j2se/javadoc/) and software specifications (http://java.- that bioimage informatics seeks to address. sun.com/products/ejb/docs.html). These specifications ensure that developers can understand and use each Why are open source approaches essential? other’s code and, most importantly, that two independent A critical development in the field of bioimage informatics software packages can use a specified, common interface. has been the introduction of many open source projects in This software ‘interoperability’, enforced by the community the last few years [11,22–30]. These projects range from either formally or informally, is a general hallmark of open being open source distributions where the code is available source software, and perhaps

Load more