Bioinformatics S Ranganathan, Macquarie University, Sydney, NSW, Australia


© 2017 Elsevier Inc. All rights reserved.

In 1991, Nobel laureate Walter Gilbert said, “The new paradigm, now emerging, is that all the ‘genes’ will be known (in the sense of being resident in databases available electronically), and that the starting point of a biological investigation will be theoretical. An individual scientist will begin with a theoretical conjecture, only then turning to experiment to follow or test that hypothesis.”¹ In a nutshell, Gilbert is referring to the use of bioinformatics in all biological research endeavors.

Bioinformatics, also known as computational biology, enables the scientific understanding of living systems through computation. It links molecular descriptors of organisms to biological processes, facilitating data mining and knowledge discovery. Emerging from an essential enabling technology in the Life Sciences, bioinformatics is now a fundamental research discipline.

Bioinformatics is a key platform technology, underpinning biotechnology and genome research. The integration of computational approaches with experimental life science and biomedical research generates theoretical analysis-based hypotheses, which will then steer experimental design. Some of the current challenges addressed by bioinformatics are:

• Novel approaches to solving biological problems presented by genomics and proteomics.
• Unraveling the genetic and environmental basis of health and disease.
• Developing software to facilitate bioinformatics analyses.
• Standardization of functional descriptors for different data types.
• Improving knowledge management systems for intuitive use by life and biomedical scientists.

Building on the applications of informatics (computer science) to biology, bioinformatics has established linkages with mathematics and statistics; physics and chemistry; medicine and pharmacology. Bioinformatics research thus entails inputs from diverse disciplines. The ultimate goal of bioinformatics is to provide a complete representation of living cells and organisms and understand the principles of how they function, so that, in the words of Nobel laureate Sydney Brenner, “computational biology becomes biological computation.”

Systematic Organization of Sequences

As molecular biologists uncover innumerable gene (DNA), transcript (RNA) and protein sequences, biological databases are essential for efficient data management and to facilitate sequence analysis. Structural bioinformatics, on the other hand, involves the analysis of three-dimensional structures of biological molecules. Needless to say, a working knowledge of biostatistics is crucial in this computational analysis.

From Sequences to Organisms

An important application of bioinformatics to molecular data is in phylogenetic analysis, for linking the genotype to the observed phenotype. Most analysis methods developed for sequences can be applied to organisms, leading to biodiversity informatics.

An Integrated Approach to Biological Data

Rather than looking at individual sequences, the analysis of groups of genes or proteins involved in a specific biological function is referred to as the study of biological pathways, analogous to biochemical and signaling pathways. Here, the terminology used to describe biological function has to be robust, and this need has led to the development of gene ontology. With the immense accumulation of biological and biomedical literature, it is impractical today to collate information on specific biomolecules and their function by hand, which has given rise to text mining. The study of interacting biological entities comprising a system constitutes biological networks.
Ranging from gene and protein interaction networks, network biology extends to evolutionary and ecological networks. Bioinformatics applied to the immune system is labeled immunoinformatics.

Sequence Data at the Organism Level

With high-throughput technologies, the entire molecular data of an organism can be captured as its genome, transcriptome or proteome. Strategies for analyzing the raw data from instruments, together with the analysis of whole genome and proteome sequences, have led to genome and proteome informatics. How genes, transcripts and proteins interact and communicate to carry out essential biological functions is studied as functional genomics.

Bioinformatics in Health and Disease

Genetic variations and gene–environment interactions in health and disease are important from a public health perspective as well as for the health of farmed species. With the knowledge of the genomes of several infectious agents, disease informatics provides us with biomarkers for diagnosis and monitoring disease progression. The molecular changes occurring in diseases such as cancer, autoimmune and neurological disorders are studied in translational bioinformatics, for facilitating personalized healthcare. Drug design is an area of bioinformatics that focuses on drug molecules and inhibitors that can ameliorate diseases and disorders in a targeted manner.

The Bioinformatics section of the Reference Module in Life Sciences provides a great opportunity to keep readers updated with clear and authoritative descriptions of bioinformatics terms and areas. This section will be of interest to both life science researchers and students who wish to learn about the most important and current features of bioinformatics.

Reference Module in Life Sciences, doi:10.1016/B978-0-12-809633-8.12387-8

Reference

1. Gilbert, W., 1991. Towards a paradigm shift in biology. Nature 349, 99. Available at: http://www.nature.com/nature/journal/v349/n6305/pdf/349099a0.pdf.

Recommended publications
  • Biological Computation
    Portland State University PDXScholar Computer Science Faculty Publications and Presentations Computer Science 9-21-2010 Biological Computation Melanie Mitchell Portland State University Follow this and additional works at: https://pdxscholar.library.pdx.edu/compsci_fac Part of the Biology Commons, and the Computer Engineering Commons Let us know how access to this document benefits you. Citation Details Mitchell, Melanie, "Biological Computation" (2010). Computer Science Faculty Publications and Presentations. 2. https://pdxscholar.library.pdx.edu/compsci_fac/2 This Working Paper is brought to you for free and open access. It has been accepted for inclusion in Computer Science Faculty Publications and Presentations by an authorized administrator of PDXScholar. Please contact us if we can make this document more accessible: [email protected]. Biological Computation Melanie Mitchell SFI WORKING PAPER: 2010-09-021 SFI Working Papers contain accounts of scientific work of the author(s) and do not necessarily represent the views of the Santa Fe Institute. We accept papers intended for publication in peer-reviewed journals or proceedings volumes, but not papers that have already appeared in print. Except for papers by our external faculty, papers must be based on work done at SFI, inspired by an invited visit to or collaboration at SFI, or funded by an SFI grant. ©NOTICE: This working paper is included by permission of the contributing author(s) as a means to ensure timely distribution of the scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the author(s). It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright.
  • Molecular and Neural Computation
    CSE590b: Molecular and neural computation. Georg Seelig. Administrative: Georg Seelig, Office: CSE 228, [email protected]. TA: Kevin Oishi, [email protected]. Grading: 30% class participation: ask questions. It’s more fun if it’s interactive. 30% Homework: Due at the end of class one week after they are handed out. Late policy: -10% each day for the first 3 days, then not accepted. 40% Class project: A small design project using one (or several) of the design and simulation tools that will be introduced in class. Books: There is no single book that covers the material in this class. Any book on molecular biology and on neural computation might be helpful to dig deeper. https://www.coursera.org/course/compneuro Computation can be embedded in many substrates: Computation controls physical substrates (output of computation is the physical substrate). Alternative physical substrates can be used to make computers. The molecular programming project: The history of computing has taught us two things: first, that the principles of computing can be embodied in a wide variety of physical substrates from gears and springs to transistors, and second that the mastery of a new physical substrate for computing has the potential to transform technology. Another revolution is just beginning, one that will result in new types of programmable systems based on molecules. Like the previous revolutions, this “molecular programming revolution” will take much of its theory from computer science, but will require reformulation of familiar concepts such as programming languages and compilers, data structures and algorithms, resources and complexity, concurrency and stochasticity, correctness and robustness.
  • Biological Computation As the Revolution of Complex Engineered Systems
    Biological Computation as the Revolution of Complex Engineered Systems Nelson Alfonso Gómez Cruz and Carlos Eduardo Maldonado Modeling and Simulation Laboratory Universidad del Rosario Calle 14 No. 6-25, Bogotá, Colombia ABSTRACT: Provided that there is no theoretical frame for complex engineered systems (CES) as yet, this paper claims that bio-inspired engineering can help provide such a frame. Within CES bio-inspired systems play a key role. The disclosure from bio-inspired systems and biological computation has not been sufficiently worked out, however. Biological computation is to be taken as the processing of information by living systems that is carried out in polynomial time, i.e., efficiently; such processing however is grasped by current science and research as an intractable problem (for instance, the protein folding problem). A remark is needed here: P versus NP problems should be well defined and delimited but biological computation problems are not. The shift from conventional engineering to bio-inspired engineering needs to bring the subject (or problem) of computability to a new level. Within the frame of computation, so far, the prevailing paradigm is still the Turing-Church thesis. In other words, conventional engineering is still ruled by the Church-Turing thesis (CTt). However, CES is ruled by CTt, too. Contrarily to the above, we shall argue here that biological computation demands a more careful thinking that leads us towards hypercomputation. Bio-inspired engineering and CES thereafter, must turn its regard toward biological computation. Thus, biological computation can and should be taken as the ground for engineering complex non-linear systems. Biological systems do compute in terms of hypercomputation, indeed.
  • Bioinformatics in the Post-Sequence Era
    Review: Bioinformatics in the post-sequence era. Minoru Kanehisa & Peer Bork. doi:10.1038/ng1109. http://www.nature.com/naturegenetics In the past decade, bioinformatics has become an integral part of research and development in the biomedical sciences. Bioinformatics now has an essential role both in deciphering genomic, transcriptomic and proteomic data generated by high-throughput experimental technologies and in organizing information gathered from traditional biology. Sequence-based methods of analyzing individual genes or proteins have been elaborated and expanded, and methods have been developed for analyzing large numbers of genes or proteins simultaneously, such as in the identification of clusters of related genes and networks of interacting proteins. With the complete genome sequences for an increasing number of organisms at hand, bioinformatics is beginning to provide both conceptual bases and practical methods for detecting systemic functional behaviors of the cell and the organism. The exponential growth in molecular sequence data started in the early 1980s when methods for DNA sequencing became widely available. The data were accumulated in databases such as GenBank, EMBL (European Molecular Biology Laboratory nucleotide sequence database), DDBJ (DNA Data Bank of Japan), PIR (Protein Information Resource) and SWISS-PROT, and computational methods were developed for data retrieval and analysis, including algorithms for sequence similarity searches, structural predictions and functional predictions. The single most important event was the arrival of the Internet, which has transformed databases and access to data, publications and other aspects of information infrastructure. The Internet has become so commonplace that it is hard to imagine that we were living in a world without it only 10 years ago. The rise of bioinformatics has been largely due to the diverse range of large-scale data that require sophisticated methods for handling and analysis, but it is also due to the Internet, which has made both
  • IT 468 – Natural Computing
    July 2020 IT 468 – Natural Computing Instructor Manish K Gupta (http://www.mankg.com) Room 2209 Faculty Block 2 mankg [at] daiict.ac.in Phone: +91-79-68261549 WhatsApp:+91-9898512703 Lab: http://www.guptalab.org Overview In the last 60 years Information and Communication Technology (ICT) has had a great impact on our society. The most profound and accelerated impact of ICT can be seen in the last decade in the form of cell phones, connected computers and the Internet. We even have a virtual currency. ICT is an interdisciplinary discipline combining IT (Information Technology) and CT (Communication Technology). IT has its roots in computer science and CT has its roots in the theory of communication. Both fields can now be seen as two sides of the same coin. Both deal with information: in IT we store (send information from now to then) and manipulate the information, and in CT we send information from here to there (communicate). The mathematical principles of ICT lie in theoretical computer science (Turing machine) and information and coding theory (work of Shannon and Hamming). Realization of ICT is via logic gates and circuits in the area of Electronics and VLSI. If you look at the Nature around you, many times you ask: What are the principles of Natural ICT? Can we use these principles to create Natural ICT engineering? We are fortunate that, due to advances in many fields, we are now in a position to talk about Natural ICT. There are three main areas of Natural ICT: 1. ICT inspired by Nature 2.
  • Expanding the Landscape of Biological Computation with Synthetic Multicellular Consortia
    Expanding the Landscape of Biological Computation with Synthetic Multicellular Consortia Ricard V. Solé Javier Macia SFI WORKING PAPER: 2013-05-020 SFI Working Papers contain accounts of scientific work of the author(s) and do not necessarily represent the views of the Santa Fe Institute. We accept papers intended for publication in peer-reviewed journals or proceedings volumes, but not papers that have already appeared in print. Except for papers by our external faculty, papers must be based on work done at SFI, inspired by an invited visit to or collaboration at SFI, or funded by an SFI grant. ©NOTICE: This working paper is included by permission of the contributing author(s) as a means to ensure timely distribution of the scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the author(s). It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may be reposted only with the explicit permission of the copyright holder. www.santafe.edu SANTA FE INSTITUTE Ricard V. Solé (1, 2, 3) and Javier Macia (1, 2). (1) ICREA-Complex Systems Lab, Universitat Pompeu Fabra, Dr Aiguader 88, 08003 Barcelona, Spain; (2) Institut de Biologia Evolutiva, UPF-CSIC, Psg Barceloneta 37, 08003 Barcelona, Spain; (3) Santa Fe Institute, 1399 Hyde Park Road, Santa Fe NM 87501, USA. Computation is an intrinsic attribute of biological entities. All of them gather and process information and respond in predictable ways to an uncertain external environment.
  • Digital Circuit Design for Biological and Silicon Computers Matthias Függer, Manish Kushwaha, Thomas Nowak
    Digital Circuit Design for Biological and Silicon Computers Matthias Függer, Manish Kushwaha, Thomas Nowak. To cite this version: Matthias Függer, Manish Kushwaha, Thomas Nowak. Digital Circuit Design for Biological and Silicon Computers. Advances in Synthetic Biology, Springer Singapore, pp.153-171, 2020, 10.1007/978-981-15-0081-7_9. hal-02549707. HAL Id: hal-02549707 https://hal.inrae.fr/hal-02549707 Submitted on 21 Apr 2020. HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers. 1 Introduction. Evolving for the past 4 billion years [3], life on Earth today is highly diverse with an estimated 8.7 million species [21]. Despite this enormous diversity, one key characteristic that differentiates all living cells from non-living matter is their property of response to stimuli [14]. Living cells receive information from their environment, process that information, and then effect a response. Therefore, in the simplest “information processing” sense of computing [30], cells can be seen as tiny computing machines. This notion has sustained itself through Monod to today as the successes of molecular and cell biology have continued to reveal insights into the mechanistic bases of cellular functioning [18, 23].
  • Biological Computation by Melanie Mitchell Computer Science Department, Portland State University and Santa Fe Institute
    Ubiquity, an ACM publication. February, 2011. Ubiquity Symposium: What is Computation? Biological Computation, by Melanie Mitchell, Computer Science Department, Portland State University and Santa Fe Institute. Editor’s Introduction: In this thirteenth piece to the Ubiquity symposium discussing ‘What is computation?’ Melanie Mitchell discusses the idea that biological computation is a process that occurs in nature, not merely in computer simulations of nature. Peter J. Denning, Editor. http://ubiquity.acm.org ©2011 Association for Computing Machinery. What Is Meant By “Biological Computation”? In this article, the term biological computation refers to the proposal that living organisms themselves perform computations, and, more specifically, that the abstract ideas of information and computation may be key to understanding biology in a more unified manner. It is important to point out that the study of biological computation is typically not the focus of the field of computational biology, which applies computing tools to the solution of specific biological problems. Likewise, biological computation is distinct from the field of biologically-inspired computing, which borrows ideas from biological systems such as the brain, insect colonies, and the immune system in order to develop new algorithms for specific computer science applications. While there is some overlap among these different meldings of biology and computer
  • Z34bio: an SMT-Based Framework for Analyzing Biological Computation
    Z34Bio: An SMT-based Framework for Analyzing Biological Computation Boyan Yordanov, Christoph M. Wintersteiger, Youssef Hamadi and Hillel Kugler Microsoft Research, Cambridge, UK, http://research.microsoft.com/z3-4biology Abstract The basic principles governing the development and function of living organisms remain only partially understood, despite significant progress in molecular and cellular biology and tremendous breakthroughs in experimental methods. The development of system-level, mechanistic, computational models has the potential to become a foundation for improving our understanding of natural biological systems, and for designing engineered biological systems with wide-ranging applications in nanomedicine, nanomaterials and computing. We describe Z34Bio (Z3 for Biology), a unified SMT-based framework for the automated analysis of natural and engineered biological systems. Z34Bio enables addressing important biological questions, and studying models more complex than previously possible. The framework provides a formalization of the semantics of several model classes used widely for biological systems, which we illustrate through the treatment of chemical reaction networks and Boolean networks. We present case-studies which we make available as SMT-LIB benchmarks, to enable comparison of different analysis techniques, and towards making this new domain accessible to the formal verification community. 1 Introduction Many mechanisms and properties of biological systems remain only partially understood, thus limiting our comprehension of natural living systems and processes. Recently, advanced experimental techniques have enabled the rational design and construction of biological systems, delineating a branch of biology as an engineering discipline, with potential applications in nanomedicine, nanomaterials and computing. However, understanding the system-level behavior of organisms or designing ones with specific behavior remains a major challenge for the engineering and the reverse engineering of biological systems.
  • Toward Complexity Measures for Systems Involving Human Computation
    Human Computation (2014) 1:1:45-65 © 2014, Crouser et al. CC-BY-3.0 ISSN: 2330-8001, DOI: 10.15346/hc.v1i1.4 Toward Complexity Measures for Systems Involving Human Computation R. JORDAN CROUSER, TUFTS UNIVERSITY BENJAMIN HESCOTT, TUFTS UNIVERSITY REMCO CHANG, TUFTS UNIVERSITY ABSTRACT This paper introduces the Human Oracle Model as a method for characterizing and quantifying the use of human processing power as part of an algorithmic process. The utility of this model is demonstrated through a comparative algorithmic analysis of several well-known human computation systems, as well as the definition of a preliminary characterization of the space of human computation under this model. Through this research, we hope to gain insight about the challenges unique to human computation and direct the search for efficient human computation algorithms. 1. INTRODUCTION Computational complexity theory is a branch of theoretical computer science dedicated to describing and classifying computational problems according to their fundamental difficulty, which we measure in terms of the resources required to solve them. One way to measure a problem’s difficulty is with respect to time; we may ask how many operations do I need to perform to find an answer? Alternatively, one might want to measure difficulty in terms of space; here we could ask how much memory will I need to execute this process? These questions, which do not rely on the specific implementation details, are at the heart of computer science. Theoretical arguments ground our intuitions about the problem space, and pave the way for us to design future systems that make these provably correct solutions tractable.
  • Programming in Biomolecular Computation
    CS2Bio 2010 Programming in Biomolecular Computation Lars Hartmann, Neil D. Jones, Jakob Grue Simonsen Department of Computer Science, University of Copenhagen (DIKU), Copenhagen, Denmark Abstract Our goal is to provide a top-down approach to biomolecular computation. In spite of widespread discussion about connections between biology and computation, one question seems notable by its absence: Where are the programs? We introduce a model of computation that is evidently programmable, by programs reminiscent of low-level computer machine code; and at the same time biologically plausible: its functioning is defined by a single and relatively small set of chemical-like reaction rules. Further properties: the model is stored-program: programs are the same as data, so programs are not only executable, but are also compilable and interpretable. It is universal: all computable functions can be computed (in natural ways and without arcane encodings of data and algorithm); it is also uniform: new “hardware” is not needed to solve new problems; and (last but not least) it is Turing complete in a strong sense: a universal algorithm exists, that is able to execute any program, and is not asymptotically inefficient. A prototype model has been implemented (for now in silico on a conventional computer). This work opens new perspectives on just how computation may be specified at the biological level. Keywords: biomolecular, computation, programmability, universality. 1 Biochemical universality and programming It has been known for some time that various forms of biomolecular computation are Turing complete [7,8,10,12,25,29,32,33]. The net effect is to show that any computable function can be computed, in some appropriate sense, by an instance of the biological mechanism being studied.
  • Machine Learning Inspired Synthetic Biology: Neuromorphic Computing in Mammalian Cells by Andrew Moorman
    Machine Learning Inspired Synthetic Biology: Neuromorphic Computing in Mammalian Cells by Andrew Moorman B.Arch. Cornell University, 2017 Submitted to the MIT Department of Architecture and the Department of Electrical Engineering and Computer Science in Partial Fulfillment of the Requirements for the Degrees of Master of Science in Architecture Studies and Master of Science in Electrical Engineering and Computer Science at the Massachusetts Institute of Technology February 2020 © 2020 Massachusetts Institute of Technology. All rights reserved. Signature of Author: MIT Department of Architecture, MIT Department of Electrical Engineering and Computer Science, January 17, 2020. Certified by: Ron Weiss, Professor of Biological Engineering and Electrical Engineering and Computer Science, Thesis Supervisor. Certified by: Skylar Tibbits, Associate Professor of Architecture, Thesis Supervisor. Accepted by: Leslie K. Norford, Professor of Building Technology