Computational Elucidation of the Regulatory SNPs in the Non-Coding Regions of the Human Genome

AN ABSTRACT OF THE DISSERTATION OF

Yao Yao for the degree of Doctor of Philosophy in Computer Science presented on February 16, 2021.

Title: Computational Elucidation of the Regulatory SNPs in the Non-Coding Regions of the Human Genome

Abstract approved: Stephen Ramsey

We describe a series of novel computational models, CERENKOV (Computational Elucidation of the REgulatory NonKOding Variome) and its successors CERENKOV2, CERENKOV3, and Convolutional CERENKOV3, for discriminating regulatory single nucleotide polymorphisms (rSNPs) from non-regulatory SNPs within non-coding genetic loci. The CERENKOV models are designed for recognizing rSNPs in the context of a post-analysis of a genome-wide association study (GWAS); they include a novel accuracy scoring metric (average rank, or AVGRANK) and a novel cross-validation strategy (locus-based sampling) that both correctly account for the "sparse positive bag" nature of the GWAS post-analysis rSNP recognition problem. We trained and validated the CERENKOV series models using a set of reference SNPs whose composition is based on selection criteria (linkage disequilibrium and minor allele frequency) that we designed to ensure relevance to GWAS post-analysis. The CERENKOV models are based on a machine-learning algorithm (gradient boosted decision trees) incorporating various SNP annotation features that are from genomic, epigenomic, phylogenetic, and chromatin data. CERENKOV2 includes features based on the geometry of the annotation features in data-space, and the CERENKOV3 models include features derived from SNP clustering, molecular network and convolutional output on genomic signals.
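The two evaluation ideas named above, locus-based sampling and AVGRANK, can be illustrated with a minimal sketch. The abstract does not spell out either procedure, so everything below is an assumption made for illustration: `locus_folds` keeps all SNPs of a locus in the same cross-validation fold, and `avgrank` averages the within-locus rank of each rSNP (rank 1 being the highest-scoring SNP in its locus, ties resolved in favor of the rSNP). The function names, the round-robin fold assignment, and the toy data are hypothetical and are not taken from the dissertation.

```python
from collections import defaultdict


def locus_folds(locus_ids, n_folds):
    """Assign each SNP a fold index so that all SNPs in a locus share one fold.

    Assumption: loci are dealt to folds round-robin in sorted order; the
    dissertation's actual sampling scheme may differ.
    """
    loci = sorted(set(locus_ids))
    fold_of_locus = {locus: i % n_folds for i, locus in enumerate(loci)}
    return [fold_of_locus[locus] for locus in locus_ids]


def avgrank(locus_ids, labels, scores):
    """Average, over all rSNPs, of each rSNP's score rank within its own locus.

    Assumption: rank = 1 + number of SNPs in the same locus with a strictly
    higher score, so a lower AVGRANK means better within-locus prioritization.
    """
    by_locus = defaultdict(list)
    for locus, label, score in zip(locus_ids, labels, scores):
        by_locus[locus].append((score, label))

    ranks = []
    for members in by_locus.values():
        for score, label in members:
            if label == 1:  # 1 marks an rSNP, 0 a non-regulatory SNP
                ranks.append(1 + sum(s > score for s, _ in members))
    if not ranks:
        raise ValueError("no rSNPs (label 1) found")
    return sum(ranks) / len(ranks)


# Toy data: two loci, each a "bag" with exactly one rSNP.
toy_loci = ["A", "A", "A", "B", "B"]
toy_labels = [1, 0, 0, 0, 1]
toy_scores = [0.9, 0.2, 0.5, 0.8, 0.3]

print(locus_folds(toy_loci, n_folds=2))
print(avgrank(toy_loci, toy_labels, toy_scores))
```

Grouping by locus keeps correlated SNPs from the same locus out of both the training and validation sides of a split, and ranking within a locus mirrors the post-GWAS task of picking the likely regulatory SNP out of a set of linked candidates.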
We compared the validation performance of CERENKOV to nine other methods for rSNP recognition (including GWAVA, RSVP, DeltaSVM, DeepSEA, EIGEN, and DANQ), and found that CERENKOV's validation performance is the strongest out of all of the classifiers that we tested, by both traditional global rank-based measures (AUPRC, AUROC) and AVGRANK. From the performance comparison between CERENKOV and its successors, we found that rSNP recognition performance benefits from data-space geometry, SNP clustering and molecular network-derived features.

©Copyright by Yao Yao
February 16, 2021
All Rights Reserved

Computational Elucidation of the Regulatory SNPs in the Non-Coding Regions of the Human Genome

by Yao Yao

A DISSERTATION submitted to Oregon State University in partial fulfillment of the requirements for the degree of Doctor of Philosophy

Presented February 16, 2021
Commencement June 2021

Doctor of Philosophy dissertation of Yao Yao presented on February 16, 2021.

APPROVED:

Major Professor, representing Computer Science
Head of the School of Electrical Engineering and Computer Science
Dean of the Graduate School

I understand that my dissertation will become part of the permanent collection of Oregon State University libraries. My signature below authorizes release of my dissertation to any reader upon request.

Yao Yao, Author

ACKNOWLEDGEMENTS

I would like to express my sincere gratitude to my advisor Dr. Stephen Ramsey for the continuous support of my PhD study and related research, for his patience, motivation, and immense knowledge. His guidance helped me in all the time of research and writing of this thesis. I could not have imagined having a better advisor and mentor for my PhD study.

Besides my advisor, I would like to thank the rest of my thesis committee: Dr. Glencora Borradaile, Dr. Xiaoli Fern, and Dr. Amir Nayyeri, for their insightful comments and encouragement, and also for the questions which incented me to widen my research from various perspectives.
Also I thank Dr. Harold Bae for his service as a GCR.

My sincere thanks also goes to my colleagues and friends for their support and help. Special thanks to Tanjin Xu, Jun He, Zheng Liu, Satpreet Singh, Qi Wei, Deqing Qu, Steven Carrell, Finn Womack, Meghamala Sinha, Brandy Nagamine, Janice Blouse for creating the best memories throughout my graduate life in Dryden Hall.

Last but not least, I would like to thank my parents, my aunts, my uncles, my grandmas for supporting me spiritually throughout writing this thesis and my life in general. Great thanks to my girlfriend Suzy for her deep love and standing behind me all the time.

TABLE OF CONTENTS

1 Introduction
  1.1 Genome-Wide Association Studies
  1.2 Single-Nucleotide Polymorphisms
  1.3 The rSNP Detection Problem
  1.4 Previous Approaches
    1.4.1 rSNV-Unsupervised Approaches
    1.4.2 rSNV-Supervised Approaches
  1.5 Limitations of Previous Approaches
  1.6 Our Approaches
    1.6.1 CERENKOV
      1.6.1.1 Overview
      1.6.1.2 New Performance Measure: AVGRANK
      1.6.1.3 Locus-Based Sampling
    1.6.2 CERENKOV2
      1.6.2.1 Overview
      1.6.2.2 The Importance of Data-Space Geometry
      1.6.2.3 Data-Space Geometric Features for rSNP Recognition
    1.6.3 CERENKOV3
      1.6.3.1 Hypothesis
      1.6.3.2 Overview
    1.6.4 Convolutional CERENKOV3
      1.6.4.1 Hypothesis
      1.6.4.2 Convolutional Neural Network
      1.6.4.3 Genomic Signals
      1.6.4.4 Convolutional Variational Autoencoder
      1.6.4.5 Overview of Convolutional CERENKOV3
  1.7 Dissertation Organization and Structure
2 Materials and Methods
  2.1 CERENKOV
    2.1.1 The OSU17 Reference SNP Set
    2.1.2 CERENKOV Features
      2.1.2.1 Features Extracted from UCSC Genome Browser
      2.1.2.2 Features Extracted from Ensembl
      2.1.2.3 GTEx Feature
      2.1.2.4 DNA Shape Feature
    2.1.3 Features for the Other Classifiers in Comparison
      2.1.3.1 GWAVA
      2.1.3.2 DeltaSVM
      2.1.3.3 RSVP
      2.1.3.4 DeepSEA
      2.1.3.5 DANQ
      2.1.3.6 EIGEN, CADD, DANN, fitCons
    2.1.4 Machine Learning
      2.1.4.1 Random Forest
      2.1.4.2 Gradient Boosted Decision Trees and Cross-Validation
      2.1.4.3 Tuning CERENKOV
    2.1.5 t-SNE and Statistical Testing
  2.2 CERENKOV2
    2.2.1 The OSU18 Reference SNP Set
    2.2.2 Adjustment to the Non-geometric Features
    2.2.3 Computing the Geometric Features
    2.2.4 Machine Learning
  2.3 CERENKOV3
    2.3.1 Adjustment to the Reference SNP Set and Annotation Features
    2.3.2 A Clustering-Derived Feature: Locus Sizes
    2.3.3 Construction of Molecular Networks
      2.3.3.1 Detailed Procedure for Obtaining SNP-Gene Edges
      2.3.3.2 Detailed Procedure for Obtaining Gene-Gene Edges
    2.3.4 Network-Derived Features
    2.3.5 Machine Learning Pipeline and Hyperparameter Tuning
  2.4 Convolutional CERENKOV3
    2.4.1 Construction of Genomic Signal Tracks
    2.4.2 Adjustment to the Reference SNP Set and Cross-Validation
    2.4.3 Machine Learning
      2.4.3.1 Building Blocks of Convolutional CERENKOV3 Models
      2.4.3.2 Construction of the Ensemble Classifiers
      2.4.3.3 Construction of the Convolutional VAE-Embedded Classifiers
3 Results
  3.1 Metric Comparison between AUPRC and AVGRANK
  3.2 Performance Comparison
    3.2.1 CERENKOV vs. Nine Previous Approaches
    3.2.2 CERENKOV2 vs. CERENKOV
    3.2.3 CERENKOV3 vs. GWAVA, CERENKOV, CERENKOV2
    3.2.4 Convolutional CERENKOV3 vs. CERENKOV3
      3.2.4.1 Convolutional CERENKOV3 Ensemble vs. CERENKOV3
      3.2.4.2 VAE-Embedded Convolutional CERENKOV3 vs. CERENKOV3
  3.3 Feature Analysis
    3.3.1 CERENKOV Feature Importance
    3.3.2 Analysis of the Intralocus Radius Distributions in CERENKOV2
    3.3.3 Analysis of the Intralocus Radius Likelihood Ratios in CERENKOV2
    3.3.4 CERENKOV2 Feature Importance
    3.3.5 Analysis of the Newly Engineered Features in CERENKOV3
4 Conclusion and Discussion
Bibliography

LIST OF FIGURES

1.1 Regional association plot of SNP rs16946931 (±400 kbp)
1.2 Distributions of intralocus radii computed using Pearson distance applied to OSU18 SNPs' feature data (see Sec. 2.2.1), conditioned on the type of reference SNP (rSNP or cSNP) for the intralocus radius calculation
1.3 The geometric idea behind the intra-SNP distance features that are used in CERENKOV2
1.4 Representation of a simple, fictional transcription factor network
1.5 The underlying mechanism of node2vec
1.6 A typical convolutional neural network architecture for medical image classification
1.7 1-D deep convolutional neural network structure
1.8 Two PhyloP score tracks (based on vertebrate species genomic sequence alignments) of two SNPs, rs6663784 and rs3737717
1.9 Illustration of variational autoencoder model with the multivariate Gaussian assumption
2.1 Class-wise frequency histograms of SNPs vs. locus size
2.2 Data sources and types of relations used to construct the molecular networks in CERENKOV3
2.3 Pipeline of CERENKOV3 machine learning approach
2.4 Structure of LeNet-5 CNN
2.5 Structure of VGG16 CNN
2.6 Structure of Convolutional CERENKOV3 ensemble classifier
2.7 Structure of convolutional VAE-embedded ensemble classifier
3.1 The AUPRC and AVGRANK performance measures are functionally distinct
3.2 Validation performance of CERENKOV improves upon nine methods
3.3 Performance of GWAVA, CERENKOV and CERENKOV2 on the OSU18 reference SNP set, by three performance measures
3.4 Performance of GWAVA, CERENKOV, CERENKOV2 and CERENKOV3 on the OSU18 reference SNP set, by three performance measures