SBOL Visual: A Graphical Language for Genetic Designs
COMMUNITY PAGE

Jacqueline Y. Quinn1☯, Robert Sidney Cox III2☯, Aaron Adler3, Jacob Beal3, Swapnil Bhatia4, Yizhi Cai5, Joanna Chen6,7, Kevin Clancy8, Michal Galdzicki9, Nathan J. Hillson6,7, Nicolas Le Novère10, Akshay J. Maheshwari11, James Alastair McLaughlin12, Chris J. Myers13, Umesh P14, Matthew Pocock12,15, Cesar Rodriguez16, Larisa Soldatova17, Guy-Bart V. Stan18, Neil Swainston19, Anil Wipat12, Herbert M. Sauro20*

1 Autodesk Research, Autodesk Inc., San Francisco, California, United States of America
2 Chemical Science and Engineering, Kobe University, Kobe, Japan
3 Information and Knowledge Technologies, Raytheon BBN Technologies, Cambridge, Massachusetts, United States of America
4 Electrical and Computer Engineering, Boston University, Boston, Massachusetts, United States of America
5 School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
6 Fuels Synthesis and Technologies Divisions, Joint BioEnergy Institute, Emeryville, California, United States of America
7 Lawrence Berkeley National Lab, Berkeley, California, United States of America
8 Synthetic Biology Unit, ThermoFisher Scientific, Carlsbad, California, United States of America
9 Arzeda Corp, Seattle, Washington, United States of America
10 Babraham Institute, Cambridge, United Kingdom
11 Stanford University School of Medicine, Stanford, California, United States of America
12 School of Computing Science, Newcastle University, Newcastle upon Tyne, United Kingdom
13 Department of Electrical and Computer Engineering, University of Utah, Salt Lake City, Utah, United States of America
14 Department of Computational Biology & Bioinformatics, University of Kerala, Kerala, India
15 Turing Ate My Hamster LTD, Newcastle upon Tyne, United Kingdom
16 Department of Biomedical Sciences, College of Medicine, Florida State University, Tallahassee, Florida, United States of America
17 Computer Science, Brunel University, London, United Kingdom
18 Department of Bioengineering, Centre for Synthetic Biology and Innovation, Imperial College London, South Kensington Campus, London, United Kingdom
19 Centre for Synthetic Biology of Fine and Specialty Chemicals (SYNBIOCHEM), University of Manchester, Manchester, United Kingdom
20 Bioengineering, University of Washington, Seattle, Washington, United States of America

☯ These authors contributed equally to this work.
* [email protected]

OPEN ACCESS

Citation: Quinn JY, Cox RS III, Adler A, Beal J, Bhatia S, Cai Y, et al. (2015) SBOL Visual: A Graphical Language for Genetic Designs. PLoS Biol 13(12): e1002310. doi:10.1371/journal.pbio.1002310

Published: December 3, 2015

Copyright: © 2015 Quinn et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: The authors acknowledge funding from Autodesk, Inc., Stanford University, Brunel University, and the National Science Foundation Synthetic Biology Engineering Research Center and Grant EF-0850100. The portion of this work conducted by the Joint BioEnergy Institute was supported by the Office of Science, Office of Biological and Environmental Research, of the US Department of Energy (Contract No. DE-AC02-05CH11231, to NJH). The portion of the work conducted by the University of Washington was supported by the National Library of Medicine (R41 LM010745), the National Human Genome Research Institute (R42 HG006737) and NSF awards: Theoretical Biology 0827592, Molecular and Cellular Biosciences 1158573. The portions of this work conducted by Newcastle University and Imperial College London were supported by the UK Engineering and Physical Sciences Research Council (Flowers Consortium, Grant No. EP/J02175X/1, to GBVS). The work conducted by the University of Manchester was supported by the UK Biotechnology and Biosciences Research Council (Centre for Synthetic Biology of Fine and Speciality Chemicals (SYNBIOCHEM); BB/M017702/1). The work conducted by the University of Edinburgh was supported by a Chancellor's Fellowship and BBSRC grant (Building national hardware and software infrastructure for UK DNA Foundries; BB/M025640/1). Finally, SBOL is supported by the National Science Foundation under Grant Nos. DBI-1356041 to CJM, and DBI-1355909 to HMS. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation or our other funding agencies. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing Interests: I have read the journal's policy and have the following conflicts: SB is co-founder of Lattice Automation, a company that develops software tools for synthetic biology.

Abbreviations: BBF RFC, BioBrick Foundation Request for Comments; CAD, computer aided design; CDS, coding sequence; CFP, cyan fluorescent protein; CSS, Cascading Style Sheets; GFP, green fluorescent protein; IEEE, Institute of Electrical and Electronics Engineers; IPTG, isopropyl-beta-D-thiogalactopyranoside; RFP, red fluorescent protein; SBOL, Synthetic Biology Open Language; SBGN, Systems Biology Graphical Notation; SVG, Scalable Vector Graphics; YFP, yellow fluorescent protein.

PLOS Biology | DOI:10.1371/journal.pbio.1002310 | December 3, 2015

Abstract

Synthetic Biology Open Language (SBOL) Visual is a graphical standard for genetic engineering. It consists of symbols representing DNA subsequences, including regulatory elements and DNA assembly features. These symbols can be used to draw illustrations for communication and instruction, and as image assets for computer-aided design. SBOL Visual is a community standard, freely available for personal, academic, and commercial use (Creative Commons CC0 license). We provide prototypical symbol images that have been used in scientific publications and software tools. We encourage users to use and modify them freely, and to join the SBOL Visual community: http://www.sbolstandard.org/visual.

Background

By the 1970s, molecular biologists had already developed many variations in the language used to describe functional regions of DNA, or genetic sequence features, with different terms used to describe similar features in different organisms. A protein-coding DNA sequence might be called a coding sequence (CDS), an open reading frame, an exon, or simply a gene, depending on the organism and method of study. To address such concerns, the Sequence Ontology [1] maintains a standard set of terms for describing different genetic features. This effort helped unify annotation efforts during the rise of high-throughput genome sequencing in the last decade.

Across that same decade, synthetic biology has advanced capabilities for forward engineering of complex genetic systems with multiple sequence features. This has increased the need for consistent terminology and representations of genetic designs. A visual representation of genetic sequence elements and their arrangement can quickly communicate adjacency, contiguity, repetition, and uniqueness. These properties become relevant as genetic designs become more complex, with multiple promoters, CDSs, etc. This is especially true for genetic designs expressed heterologously and when a system is engineered first in one organism (e.g., [2]), then moved to a different host (e.g., [3,4]).

Standards are enabling technologies for communication: standard symbols have had a profound impact in other engineering disciplines, such as the Institute of Electrical and Electronics Engineers (IEEE) standards for representing electronic components and circuits [5,6], or computer-aided design (CAD) standards for representing architecture and mechanical engineering [7,8]. Standard symbols simplify figures and user interfaces, enhance familiarity, and streamline the design process. SBOL Visual aims to have a similar salutary effect for the engineering of biological systems.

SBOL Visual Specification

Synthetic Biology Open Language (SBOL) Visual is the product of an ongoing community effort to develop and standardize a graphical language for synthetic biology and biological engineering, focusing initially on symbols for commonly used sequence features [9]. In its current form, SBOL Visual is a set of symbols that correspond to sequence features encoded by a DNA molecule. The meaning of each symbol is established by association with terms in the Sequence Ontology (S1 and S2 Tables). SBOL Visual builds on the Sequence Ontology's ten years of work on standardizing precise definitions of genetic sequence features, and the success of this work ensures that SBOL Visual symbols are well aligned with established scientific vocabulary. The mapping to Sequence Ontology terms also connects SBOL Visual to the SBOL data exchange standard, enabling automatic mapping from data to a graphical representation [9].

Though SBOL Visual makes use of Sequence Ontology terms, the two projects address objectives of differing scope. The Sequence Ontology provides a controlled vocabulary for all functional genetic features, while SBOL Visual focuses on facilitating the communication of engineered genetic designs. One driving need for SBOL Visual is to abstract and simplify the full complexity of sequence features that may be represented with a single symbol, e.g., the promoter,