Research News
Total Page:16
File Type:pdf, Size:1020Kb
Recommended publications
-
Algorithms for Computational Biology 8th International Conference, AlCoB 2021 Missoula, MT, USA, June 7–11, 2021 Proceedings
Lecture Notes in Bioinformatics 12715. Subseries of Lecture Notes in Computer Science. Series Editors: Sorin Istrail, Brown University, Providence, RI, USA; Pavel Pevzner, University of California, San Diego, CA, USA; Michael Waterman, University of Southern California, Los Angeles, CA, USA. Editorial Board Members: Søren Brunak, Technical University of Denmark, Kongens Lyngby, Denmark; Mikhail S. Gelfand, IITP, Research and Training Center on Bioinformatics, Moscow, Russia; Thomas Lengauer, Max Planck Institute for Informatics, Saarbrücken, Germany; Satoru Miyano, University of Tokyo, Tokyo, Japan; Eugene Myers, Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany; Marie-France Sagot, Université Lyon 1, Villeurbanne, France; David Sankoff, University of Ottawa, Ottawa, Canada; Ron Shamir, Tel Aviv University, Ramat Aviv, Tel Aviv, Israel; Terry Speed, Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, Australia; Martin Vingron, Max Planck Institute for Molecular Genetics, Berlin, Germany; W. Eric Wong, University of Texas at Dallas, Richardson, TX, USA. More information about this subseries at http://www.springer.com/series/5381 Carlos Martín-Vide • Miguel A. Vega-Rodríguez • Travis Wheeler (Eds.) Algorithms for Computational Biology, 8th International Conference, AlCoB 2021, Missoula, MT, USA, June 7–11, 2021, Proceedings. Editors: Carlos Martín-Vide, Rovira i Virgili University, Tarragona, Spain; Miguel A. Vega-Rodríguez, University of Extremadura, Cáceres, Spain; Travis Wheeler, University of Montana, Missoula, MT, USA. ISSN 0302-9743 ISSN 1611-3349 (electronic) Lecture Notes in Bioinformatics ISBN 978-3-030-74431-1 ISBN 978-3-030-74432-8 (eBook) https://doi.org/10.1007/978-3-030-74432-8 LNCS Sublibrary: SL8 – Bioinformatics © Springer Nature Switzerland AG 2021 This work is subject to copyright. -
UC Irvine UC Irvine Previously Published Works
UC Irvine UC Irvine Previously Published Works Title: The capacity of feedforward neural networks. Permalink: https://escholarship.org/uc/item/29h5t0hf Authors: Baldi, Pierre; Vershynin, Roman. Publication Date: 2019-08-01. DOI: 10.1016/j.neunet.2019.04.009. License: https://creativecommons.org/licenses/by/4.0/ 4.0. Peer reviewed. eScholarship.org, Powered by the California Digital Library, University of California. THE CAPACITY OF FEEDFORWARD NEURAL NETWORKS. PIERRE BALDI AND ROMAN VERSHYNIN. Abstract. A long standing open problem in the theory of neural networks is the development of quantitative methods to estimate and compare the capabilities of different architectures. Here we define the capacity of an architecture by the binary logarithm of the number of functions it can compute, as the synaptic weights are varied. The capacity provides an upper bound on the number of bits that can be extracted from the training data and stored in the architecture during learning. We study the capacity of layered, fully-connected, architectures of linear threshold neurons with $L$ layers of size $n_1, n_2, \ldots, n_L$ and show that in essence the capacity is given by a cubic polynomial in the layer sizes: $C(n_1, \ldots, n_L) = \sum_{k=1}^{L-1} \min(n_1, \ldots, n_k)\, n_k n_{k+1}$, where layers that are smaller than all previous layers act as bottlenecks. In proving the main result, we also develop new techniques (multiplexing, enrichment, and stacking) as well as new bounds on the capacity of finite sets. We use the main result to identify architectures with maximal or minimal capacity under a number of natural constraints. -
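To make the cubic polynomial in the abstract above concrete, here is a minimal Python sketch that evaluates the sum over consecutive layer pairs, weighting each term by the smallest layer seen so far. The function name `capacity_estimate` is an illustrative assumption, not something taken from the paper, and since the abstract states the result only "in essence", the value should be read as an order-of-magnitude estimate rather than an exact capacity.

```python
from typing import Sequence


def capacity_estimate(layer_sizes: Sequence[int]) -> int:
    """Evaluate sum_{k=1}^{L-1} min(n_1..n_k) * n_k * n_{k+1}.

    Hypothetical helper for illustration only; layer_sizes is
    [n_1, ..., n_L] and must contain at least two layers.
    """
    if len(layer_sizes) < 2:
        raise ValueError("need at least two layers")
    total = 0
    running_min = layer_sizes[0]
    for k in range(len(layer_sizes) - 1):
        # Smallest layer among the first k+1 layers acts as the bottleneck.
        running_min = min(running_min, layer_sizes[k])
        total += running_min * layer_sizes[k] * layer_sizes[k + 1]
    return total


if __name__ == "__main__":
    # Same layers, different order: the early bottleneck lowers the estimate.
    print(capacity_estimate([10, 10, 2]))
    print(capacity_estimate([2, 10, 10]))
```

For example, with layers `[10, 2, 10]` the first term is 10·10·2 and the second is 2·2·10, because the middle layer of size 2 caps every later term.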
Are Profile Hidden Markov Models Identifiable?
Are Profile Hidden Markov Models Identifiable? Srilakshmi Pattabiraman, Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois, [email protected]; Tandy Warnow, Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois, [email protected] ABSTRACT Profile Hidden Markov Models (HMMs) are graphical models that can be used to produce finite length sequences from a distribution. In fact, although they were only introduced for bioinformatics 25 years ago (by Haussler et al., Hawaii International Conference on Systems Science 1993), they are arguably the most commonly used statistical model in bioinformatics, with multiple applications, including protein structure and function prediction, classifications of novel proteins into existing protein families and superfamilies, metagenomics, and multiple sequence alignment. The standard use of profile HMMs in bioinformatics has two steps: first a profile HMM is built for a collection of molecular sequences (which may not be in a multiple sequence alignment), and then the profile HMM is used in some subsequent analysis of new molecular sequences. 1 INTRODUCTION Profile Hidden Markov Models (HMMs) are arguably the most common statistical models in bioinformatics. Originally introduced by Haussler and colleagues in [10, 12], and then expanded later in many subsequent texts [4–6, 9, 11, 21, 25], profile HMMs are now used in many analytical steps in biological sequence analysis [15, 17–19, 22]. Profile Hidden Markov models are graphical models with match states, insertion states, and deletion states; and the match and insertion states emit letters from an underlying alphabet Σ (i.e., Σ may be the 20 amino acids, the four nucleotides, or some other set of symbols). In the standard form presented in [4] (widely in use in bioinformatics applications), each profile Hidden Markov -
Computational Biology and Bioinformatics
Vol. 30 ISMB 2014, pages i1–i2 BIOINFORMATICS EDITORIAL doi:10.1093/bioinformatics/btu304 Editorial This special issue of Bioinformatics serves as the proceedings of the 22nd annual meeting of Intelligent Systems for Molecular Biology (ISMB), which took place in Boston, MA, July 11–15, 2014 (http://www.iscb.org/ismbeccb2014). The official conference of the International Society for Computational Biology (http://www.iscb.org/), ISMB, was accompanied by 12 Special Interest Group meetings of one or two days each, two satellite meetings, a High School Teachers Workshop and two half-day tutorials. Since its inception, ISMB has grown to be the largest international conference in computational biology and bioinformatics. It is expected to be the premiere forum in the field for presenting new research results, disseminating methods and techniques and facilitating discussions among leading researchers, practitioners and students in the field. The conference used a two-tier review system, a continuation and refinement of a process begun with ISMB 2013 in an effort to better ensure thorough and fair reviewing. Under the revised process, each of the 191 submissions was first reviewed by at least three expert referees, with a subset receiving between four and eight reviews, as needed. These formal reviews were frequently supplemented by online discussion among reviewers and Area Chairs to resolve points of dispute and reach a consensus on each paper. Among the 191 submissions, 29 were conditionally accepted for publication directly from the first round review based on an assessment of the reviewers that the paper was clearly above par for the conference. A subset of 16 papers were viewed as potentially in the top tier but raised significant -
EMT Biocomputing Workshop Report
FINAL WORKSHOP REPORT Sponsor: 004102 : Computing Research Association Award Number: AGMT dtd 7-2-08 Title: NSF Workshop on Emerging Models and Technologies in Computing: Bio-Inspired Computing and the Biology and Computer Science Interface. Report Prepared by: Professor Ron Weiss, Departments of Electrical Engineering and Molecular Biology, Princeton University; Professor Laura Landweber, Departments of Ecology and Evolutionary Biology and Molecular Biology, Princeton University. Other Members of the Organizing Committee Include: Professor Mona Singh, Departments of Computer Science and Molecular Biology, Princeton University; Professor Erik Winfree, Department of Computer Science, California Institute of Technology; Professor Mitra Basu, Department of Computer Science, Johns Hopkins University and the United States Naval Academy. Workshop Website: https://www.ee.princeton.edu/Events/emt/ Draft: 09/07/2008 Table of Contents I. Executive Summary 1. Workshop Objectives 2. Summary of the Workshop Methodology and Findings 3. Recommendations II. Grand Challenges III. Breakout Group Presentations Appendices A1. Agenda A2. Participant List A3. Abstracts of Invited Presentations A4. Presentations I. Executive Summary I.1 Workshop Goals The National Science Foundation, Division of Computer and Information Science and Engineering (CISE), Computing and Communications Foundations (CCF) sponsored a Workshop on Emerging Models and Technologies in Computing (EMT): Bio-Inspired Computing. This workshop brought together distinguished leaders in the fields of synthetic biology, bio-computing, systems biology, and protein and nucleic acid engineering to share their vision for science and research and to learn about the research projects that EMT has funded. The goal was to explore and to drive the growing interface between Biology and Computer Science. -
I S C B N E W S L E T T
ISCB NEWSLETTER FOCUS ISSUE, volume 5. issue 2. summer 2002 {contents} President’s Letter (Member Involvement Encouraged); Register for ISMB 2002 (Registration and Tutorial Update); Host ISMB 2004 or 2005; David Baker (2002 Overton Prize Recipient); Overton Endowment; ISMB 2002 Committees; ISMB 2002 Opportunities (Sponsor and Exhibitor Benefits); Best Paper Award by SGI; ISMB 2002 SIGs (New Program for 2002); ISMB Goes Down Under (Planning Underway for 2003); Hot Jobs! Top Companies! (ISMB 2002 Job Fair); ISCB Board Nominations; Bioinformatics Pioneers (ISMB 2002 Keynote Speakers); Invited Editorial (Anna Tramontano: Bioinformatics in Europe); Software Recommendations (ISCB Software Statement); Community Development (ISCB’s Regional Affiliates Program); ISCB Staff Introduction; Fellowship Recipients (Awardees at RECOMB 2002); Events and Opportunities (Bioinformatics events world wide). INTERNATIONAL SOCIETY FOR COMPUTATIONAL BIOLOGY A NOTE FROM ISCB PRESIDENT This newsletter is packed with information on the ISMB2002 conference. With over 200 paper submissions and over 500 poster submissions, the conference promises to be a scientific feast. On behalf of the ISCB’s Directors, staff, and membership, I would like to thank the organizing committee, local organizing committee, and program committee for their hard work preparing for the conference. ... development and dissemination of bioinformatics. Issues arise from recommendations made by the Society’s committees, Board of Directors, and membership at large. Important issues are defined as motions and are discussed by the Board of Directors on a bi-monthly teleconference. Motions that pass are enacted by the Executive Committee which also serves ... EXECUTIVE COMMITTEE: Philip E. Bourne, Ph.D., President; Michael Gribskov, Ph.D., Vice President -
Bch394p-364C
Assembling Genomes. BCH394P/364C Systems Biology / Bioinformatics, Edward Marcotte, Univ of Texas at Austin. www.yourgenome.org/facts/timeline-the-human-genome-project Nature 409, 860-921 (2001). Beijing Genomics Institute: “If it tastes good you should sequence it... you should know what's in the genes of that species” Wang Jun, Chief executive, BGI (Wikipedia). https://www.technologyreview.com/s/615289/china-bgi-100-dollar-genome/ https://www.prnewswire.com/news-releases/oxford-nanopore-sequencers-have-left-uk-for-china-to-support-rapid-near-sample-coronavirus-sequencing-for-outbreak-surveillance-300996908.html http://www.triazzle.com ; The image from http://www.dangilbert.com/port_fun.html Reference: Jones NC, Pevzner PA, Introduction to Bioinformatics Algorithms, MIT press. “mapping” “shotgun” sequencing (Translating the cloning jargon). Thinking about the basic shotgun concept • Start with a very large set of random sequencing reads • How might we match up the overlapping sequences? • How can we assemble the overlapping reads together in order to derive the genome? Thinking about the basic shotgun concept • At a high level, the first genomes were sequenced by comparing pairs of reads to find overlapping reads • Then, building a graph (i.e., a network) to represent those relationships • The genome sequence is a “walk” across that graph. The “Overlap-Layout-Consensus” method. Overlap: Compare all pairs of reads (allow some low level of mismatches). Layout: Construct a graph describing the overlaps [figure labels: read, sequence overlap]; simplify the graph; find the simplest path through the graph. Consensus: Reconcile errors among reads along that path to find the consensus sequence. Building an overlap graph 5’ 3’ EUGENE W. -
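The Overlap step described in the slides above can be sketched in a few lines of Python. This is an illustrative toy, not the course's code: the function names `overlap_length` and `build_overlap_graph` and the `min_len` threshold are assumptions, and for brevity it requires exact suffix-prefix matches, whereas the slides allow a low level of mismatches.

```python
from typing import Dict, List, Tuple


def overlap_length(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 when no exact overlap of at least `min_len` exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-n:] == b[:n]:
            return n
    return 0


def build_overlap_graph(reads: List[str], min_len: int = 3) -> Dict[Tuple[int, int], int]:
    """Compare all ordered pairs of reads; keep an edge (i, j) -> overlap length."""
    graph: Dict[Tuple[int, int], int] = {}
    for i, a in enumerate(reads):
        for j, b in enumerate(reads):
            if i != j:
                n = overlap_length(a, b, min_len)
                if n:
                    graph[(i, j)] = n
    return graph


if __name__ == "__main__":
    reads = ["ATGGC", "GGCTA", "CTAAT"]
    # Edges point from a read to a read that can follow it in the layout.
    for (i, j), n in sorted(build_overlap_graph(reads).items()):
        print(reads[i], "->", reads[j], "overlap", n)
```

The all-pairs comparison is quadratic in the number of reads, which is fine for a toy example; the Layout and Consensus steps would then operate on the resulting graph.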
Curriculum Vitae
Curriculum Vitae Tandy Warnow, Grainger Distinguished Chair in Engineering. 1 Contact Information: Department of Computer Science, The University of Illinois at Urbana-Champaign. Email: [email protected] Homepage: http://tandy.cs.illinois.edu 2 Research Interests: Phylogenetic tree inference in biology and historical linguistics, multiple sequence alignment, metagenomic analysis, big data, statistical inference, probabilistic analysis of algorithms, machine learning, combinatorial and graph-theoretic algorithms, and experimental performance studies of algorithms. 3 Professional Appointments • Co-chief scientist, C3.ai Digital Transformation Institute, 2020-present • Grainger Distinguished Chair in Engineering, 2020-present • Associate Head for Computer Science, 2019-present • Special advisor to the Head of the Department of Computer Science, 2016-present • Associate Head for the Department of Computer Science, 2017-2018. • Founder Professor of Computer Science, the University of Illinois at Urbana-Champaign, 2014-2019 • Member, Carl R. Woese Institute for Genomic Biology. Affiliate of the National Center for Supercomputing Applications (NCSA), Coordinated Sciences Laboratory, and the Unit for Criticism and Interpretive Theory. Affiliate faculty member in the Departments of Mathematics, Electrical and Computer Engineering, Bioengineering, Statistics, Entomology, Plant Biology, and Evolution, Ecology, and Behavior, 2014-present. • National Science Foundation, Program Director for Big Data, July 2012-July 2013. • Member, Big Data Senior Steering Group of NITRD (The Networking and Information Technology Research and Development Program), subcommittee of the National Technology Council (coordinating federal agencies), 2012-2013 • Departmental Scholar, Institute for Pure and Applied Mathematics, UCLA, Fall 2011 • Visiting Researcher, University of Maryland, Spring and Summer 2011. • Visiting Researcher, Smithsonian Institute, Spring and Summer 2011. • Professeur Invité, École Polytechnique Fédérale de Lausanne (EPFL), Summer 2010. -
Fast Non-Parametric Improvement of Estimated Gene Trees Sarah Christensen University of Illinois at Urbana-Champaign, USA [email protected] Erin K
TRACTION: Fast Non-Parametric Improvement of Estimated Gene Trees. Sarah Christensen, University of Illinois at Urbana-Champaign, USA, [email protected]; Erin K. Molloy, University of Illinois at Urbana-Champaign, USA, [email protected]; Pranjal Vachaspati, University of Illinois at Urbana-Champaign, USA, [email protected]; Tandy Warnow, University of Illinois at Urbana-Champaign, USA, http://tandy.cs.illinois.edu [email protected] Abstract Gene tree correction aims to improve the accuracy of a gene tree by using computational techniques along with a reference tree (and in some cases available sequence data). It is an active area of research when dealing with gene tree heterogeneity due to duplication and loss (GDL). Here, we study the problem of gene tree correction where gene tree heterogeneity is instead due to incomplete lineage sorting (ILS, a common problem in eukaryotic phylogenetics) and horizontal gene transfer (HGT, a common problem in bacterial phylogenetics). We introduce TRACTION, a simple polynomial time method that provably finds an optimal solution to the RF-Optimal Tree Refinement and Completion Problem, which seeks a refinement and completion of an input tree t with respect to a given binary tree T so as to minimize the Robinson-Foulds (RF) distance. We present the results of an extensive simulation study evaluating TRACTION within gene tree correction pipelines on 68,000 estimated gene trees, using estimated species trees as reference trees. We explore accuracy under conditions with varying levels of gene tree heterogeneity due to ILS and HGT. We show that TRACTION matches or improves the accuracy of well-established methods from the GDL literature under conditions with HGT and ILS, and ties for best under the ILS-only conditions. -
PSB 2009 Attendees (As of January 9, 2009) Mario Albrecht Max Planck
PSB 2009 Attendees (as of January 9, 2009) Mario Albrecht Steven Brenner Max Planck Institute for Informatics University of California, Berkeley Hesham Ali Andrzej Brodzik University of Nebraska at Omaha The MITRE Corporation Gil Alterovitz Lukas Burger Harvard/MIT Institute of Bioinformatics, University of Basel Gregory Arnold Amgen Gregory Burrows OHSU Adam Asare UCSF/ITN William Bush Vanderbilt University Pierre Baldi University of California Andrea Califano Columbia University Lars Barquist UC Berkeley J. Gregory Caporaso University of Colorado Denver James Bassingthwaighte University of Washington Vincent Carey Harvard Medical School Tanya Berger-Wolf University of Illinois at Chicago Yuhui Cheng University of California, San Diego Ghislain Bidaut INSERM U891 Institut Paoli-Calmette Raymond Cheong Johns Hopkins University Marco Blanchette Stowers Institute for Medical Research Annie Chiang Stanford University Ben Blencowe Gilsoo Cho Anthony Bonner Yonsei University University of Toronto Sung Bum Cho Ingrid Borecki Seoul National University Bioinformatics Washington University Kevin Cohen John Bouck University of Colorado Denver Ceres, Inc. Dilek Colak Philip Bourne King Faisal Specialist Hospital and University of California, San Diego Research Centre James Costello Laura Elnitski Indiana University NHGRI Anneleen Daemen Drew Endy Dept. Electrical Engineering, KULeuven Stanford University Denise Daley Eric Fahrenthold University of British Columbia Yiping Fan Paul Davis St Jude Children's -
2014 ISCB Accomplishment by a Senior Scientist Award: Gene Myers
Message from ISCB. 2014 ISCB Accomplishment by a Senior Scientist Award: Gene Myers. Christiana N. Fogg (Freelance Science Writer, Kensington, Maryland, United States of America), Diane E. Kovats (Executive Director, International Society for Computational Biology, La Jolla, California, United States of America). The International Society for Computational Biology (ISCB; http://www.iscb.org) annually recognizes a senior scientist for his or her outstanding achievements. The ISCB Accomplishment by a Senior Scientist Award honors a leader in the field of computational biology for his or her significant contributions to the community through research, service, and education. Dr. Eugene "Gene" Myers of the Max Planck Institute of Molecular Cell Biology and Genetics in Dresden has been selected as the 2014 ISCB Accomplishment by a Senior Scientist Award winner. Myers (Image 1) was selected by the ISCB’s awards committee, which is chaired by Dr. Bonnie Berger of the Massachusetts Institute of Technology (MIT). Myers will receive his award and deliver a keynote address at ISCB’s 22nd Annual Intelligent Systems for Molecular Biology (ISMB) meeting. This meeting is being held in Boston, Massachusetts, on July 11–15, 2014, at the John B. Hynes Memorial Convention Center (https://www.iscb.org/ismb2014). Myers was captivated by computer programming as a young student. [Image 1. Gene Myers. Image credit: Matt Staley, HHMI. doi:10.1371/journal.pcbi.1003621.g001] ... guidance of his dissertation advisor, Andrzej Ehrenfeucht, who had eclectic interests that included molecular biology. Myers, along with fellow graduate students ... sequences and how to build evolutionary trees. Myers landed his first faculty position in the Department of Computer Science at -
Probabilistic Models for Species Tree Inference and Orthology Analysis
Probabilistic Models for Species Tree Inference and Orthology Analysis. IKRAM ULLAH. Doctoral Thesis, Stockholm, Sweden 2015. TRITA-CSC-A-2015:12 ISSN-1653-5723 ISRN-KTH/CSC/A–15/12-SE ISBN 978-91-7595-619-0. KTH School of Computer Science and Communication, SE-100 44 Stockholm, SWEDEN. Academic dissertation which, with the permission of KTH Royal Institute of Technology (Kungl Tekniska högskolan), is submitted for public examination for the degree of Doctor of Technology in Computer Science on Friday, 12 June 2015, at 13:00 in conference room Air, Scilifelab, Solna. © Ikram Ullah, June 2015. Printed by: Universitetsservice US AB. To my family. Abstract A phylogenetic tree is used to model gene evolution and species evolution using molecular sequence data. For artifactual and biological reasons, a gene tree may differ from a species tree, a phenomenon known as gene tree-species tree incongruence. Assuming the presence of one or more evolutionary events, e.g., gene duplication, gene loss, and lateral gene transfer (LGT), the incongruence may be explained using a reconciliation of a gene tree inside a species tree. Such information has biological utilities, e.g., inference of orthologous relationship between genes. In this thesis, we present probabilistic models and methods for orthology analysis and species tree inference, while accounting for evolutionary factors such as gene duplication, gene loss, and sequence evolution. Furthermore, we use a probabilistic LGT-aware model for inferring gene trees having temporal information for duplication and LGT events. In the first project, we present a Bayesian method, called DLRSOrthology, for estimating orthology probabilities using the DLRS model: a probabilistic model integrating gene evolution, a relaxed molecular clock for substitution rates, and sequence evolution.