UC San Diego Electronic Theses and Dissertations

Title: Computational methods for genome-wide non-coding RNA discovery and analysis
Permalink: https://escholarship.org/uc/item/5qc2h8tf
Author: Zhang, Shaojie
Publication Date: 2007
Peer reviewed | Thesis/dissertation
eScholarship.org, powered by the California Digital Library, University of California

UNIVERSITY OF CALIFORNIA, SAN DIEGO

Computational Methods for Genome-Wide Non-Coding RNA Discovery and Analysis

A dissertation submitted in partial satisfaction of the requirements for the degree Doctor of Philosophy in Computer Science

by

Shaojie Zhang

Committee in charge:
Professor Vineet Bafna, Chair
Professor Sanjoy Dasgupta
Professor Pavel Pevzner
Professor Glenn Tesler
Professor Steven Wasserman

2007

Copyright Shaojie Zhang, 2007. All rights reserved.

The dissertation of Shaojie Zhang is approved, and it is acceptable in quality and form for publication on microfilm:

Chair

University of California, San Diego
2007

To my parents.

TABLE OF CONTENTS

Signature Page ... iii
Dedication ... iv
Table of Contents ... v
List of Figures ... vii
List of Tables ... viii
Acknowledgements ... ix
Vita, Publications, and Fields of Study ... xi
Abstract ... xiii

1 Introduction ... 1
  1.1 Non-coding RNAs ... 1
  1.2 RNA secondary structure ... 4
  1.3 The Challenge of ncRNA Discovery and Analysis ... 6
    1.3.1 RNA Homolog Search ... 7
    1.3.2 RNA Consensus Folding for ncRNA Discovery ... 8
  1.4 Dissertation Outline ... 9

2 FastR: Fast RNA Search Using Structure-based Filters ... 11
  2.1 Introduction ... 11
  2.2 Methods ... 14
    2.2.1 Structure-based Filters ... 14
    2.2.2 Structure-based Filter Design ... 17
    2.2.3 Optimal Structure-based Filter Design ... 20
    2.2.4 Structure-based Filtering Algorithms ... 21
    2.2.5 Computing RNA Sequence Structure Alignment ... 22
    2.2.6 P-value Computation ... 27
  2.3 Testing Results ... 27
    2.3.1 Filtering for ncRNA ... 28
    2.3.2 Alignment ... 30
    2.3.3 Search Riboswitches using FastR ... 30
  2.4 Summary ... 37

3 PFastR: Profile-based Fast RNA search using sequence-based filters ... 40
  3.1 Introduction ... 40
  3.2 Formalizing ncRNA Filters ... 44
  3.3 Sequence-based Filters ... 46
    3.3.1 Multiple Keyword (Chain) Filtering ... 47
    3.3.2 Accuracy of Chain Filters ... 49
    3.3.3 Implementing Chain Filters ... 50
  3.4 RNA-Profile Scoring and Alignment ... 52
    3.4.1 Choosing the Scoring Functions ... 53
    3.4.2 The Alignment Procedure ... 54
  3.5 Experimental Results ... 54
    3.5.1 Filter Efficiency and Accuracy ... 56
    3.5.2 Discovering Novel Riboswitches ... 61
    3.5.3 Mining Environmental Sequence Data ... 61
  3.6 PFastR Web Server ... 66
  3.7 Summary ... 68

4 RNAscf: Consensus folding of unaligned RNA sequences ... 71
  4.1 Introduction ... 71
  4.2 RNA Secondary Structure and Stack Configurations ... 75
    4.2.1 Predicting Putative Stacks ... 77
    4.2.2 Stack Configurations ... 78
  4.3 Stack-based Consensus Folding ... 81
    4.3.1 Computing Optimal Stack Configuration in Two RNA Sequences ... 81
    4.3.2 Consensus Fold Computation for Multiple RNA Sequences ... 84
    4.3.3 Implementation Details ... 86
  4.4 Testing Results ... 87
  4.5 Summary ... 92

5 Conclusions ... 93
  5.1 Summary of Contribution ... 93
  5.2 Future Work ... 94

Bibliography ... 98

LIST OF FIGURES

Figure 1.1 MicroRNA block protein formation ... 2
Figure 1.2 RNA secondary structure ... 5
Figure 2.1 Alignment of two tRNA sequences ... 15
Figure 2.2 An RNA structure with various structural elements including stacked base-pairs, bulges, hairpin, and multi-loops ... 16
Figure 2.3 A $(k, \vec{w}, 4)$-multiloop stack for tRNA with distance constraints ... 19
Figure 2.4 Procedure to create a binary tree for s with structure S, having O(m) nodes such that each node has at most 2 children ... 24
Figure 2.5 An algorithm for aligning a query RNA s of length m with a database string t of length n ... 25
Figure 2.6 ROC plots for the alignments generated by RSEARCH and FastR ... 31
Figure 2.7 Representative riboswitch secondary structures derived from the alignments of the top novel hits for each query ... 38
Figure 3.1 A plot of $\log(e_F)$ versus $m$, when $L = 150$, $l = 8$ and $\delta = 20$. Different lines correspond to different values of $s_K$ ... 49
Figure 3.2 An algorithm for aligning an RNA profile R with m columns against a database string t of length n ... 55
Figure 3.3 ROC curves for selected families with accurate filter and alignment ... 69
Figure 4.1 Two stack configurations match to each other for both unpaired regions and paired regions ... 76
Figure 4.2 Statistics of the stacks in Rfam database ... 79
Figure 4.3 The procedure for computing anchor configuration ... 86
Figure 4.4 The procedure RNAscf for computing consensus folds ... 87
Figure 4.5 Sensitivity and accuracy of RNA secondary structure prediction on 12 RNA families ... 89
Figure 4.6 Improved sensitivity and accuracy of RNAscf as the number of input sequences grows for the thiamine family ... 91
Figure 4.7 A comparison of predicted stack configurations by different programs ... 91
Figure 5.1 RNAz classifies alignments using a support vector machine ... 95
Figure 5.2 Evofold scores on the alignments ... 96
Figure 5.3 Shifted stacks on a multispecies alignment ... 96

LIST OF TABLES

Table 2.1 Expected number of hits in a random string in a $(k, w)$-filter ... 17
Table 2.2 The results of applying nested and multiloop filters to random databases that contain true positives ... 29
Table 2.3 Comparison of FastR and RSEARCH ... 32
Table 2.4 Summary of the FastR riboswitch search ... 34
Table 2.5 Description of the 18 most promising candidates discovered by FastR ... 36
Table 3.1 Riboswitch sub-families in Rfam database ... 56
Table 3.2 Filtering performance of chain filters (CF), HMM filters (HMM), and composite filters (CF·HMM) on synthetic sequences ... 57
Table 3.3 Comparison of RNA profile alignment (PAln) and CMsearch (CM) on synthetic sequences ... 59
Table 3.4 Filtering performance of chain filters (CF), HMM filters (HMM), and composite filters (CF·HMM) on two real genomes ... 60
Table 3.5 Summary of searching riboswitches against the whole bacterial and archaeal genomes ... 62
Table 3.6 Summary of searching riboswitch elements against GOS data ... 65
Table 3.7 Summary of predicted functions of the confident ORFs downstream of riboswitch predictions ... 67
Table 3.8 Statistics for accurate option and efficient option ... 68
Table 4.1 Effect of parameters $k$, $w$ and $s$ on the probability of predicting conserved stacks at random ... 85
Table 4.2 A complete list of the comparison of sensitivity and accuracy of RNA secondary structure prediction on 12 RNA families shown in Figure 4.5 ... 90

ACKNOWLEDGEMENTS

I am very grateful to my advisor, Dr. Vineet Bafna, for his guidance and support throughout my Ph.D. studies. I feel fortunate to have worked with him. The work presented in this dissertation benefited most from his advice.

I would also like to thank Dr. Pavel Pevzner for kindly supporting me during my first two years and for his guidance throughout my Ph.D. studies.

I would like to thank Dr. Haixu Tang, Dr. Roded Sharan, and Dr. Eleazar Eskin for all the successful collaborations.

I wish to thank Dr. Vineet Bafna, Dr. Sanjoy Dasgupta, Dr. Pavel Pevzner, Dr. Glenn Tesler, and Dr. Steven Wasserman for taking the time and patience to review my dissertation and serve on my defense committee.

I would like to thank all CSE bioinformatics lab members. All of them have made my Ph.D. study a very precious and unique experience. The science presented in this dissertation greatly benefited from interactions with Max Alekseyev, Nuno Bandeira, Vikas Bansal, Ali Bashir, Fjola Bjornsdottr, Mark Chaisson, Banu Dost, Ari Frank, Neil Jones, Julio Ng, Qian Peng, Alkes Price, Ben Raphael, Stephen Tanner, Jeffrey Wang and Degui Zhi.

My dissertation work was supported by a grant from the National Science Foundation (NSF-DBI:0516440). I made extensive use of computers through the UCSD FWGrid Project (NSF Research Infrastructure Grant Number EIA-0303622).

Finally, I am deeply indebted to my family for their everlasting support and love.

Chapter 2, in part, is a reprint of the paper "Searching Genomes for non-coding RNA using FastR", co-authored with Brian Haas, Eleazar Eskin and Vineet Bafna in IEEE/ACM Transactions on Computational Biology and Bioinformatics, Vol. 2, Issue 4, pp. 366–379, 2005. The dissertation author was the primary investigator and author of this paper.

Chapter 3, in part, is a reprint of the paper "A sequence-based filtering method for ncRNA identification and its application to searching for riboswitch elements", co-authored with Ilya Borovok, Yair Aharonowitz, Roded Sharan, and Vineet Bafna in Bioinformatics (ISMB 2006), Vol. 22, pp. e557–e565, 2006. The dissertation author was the primary investigator and author of this paper.

Chapter 4, in part, is a reprint of the paper "Consensus folding of unaligned RNA sequences revisited", co-authored with Vineet Bafna and Haixu Tang in Journal of Computational Biology, Vol. 13, Issue 2, pp. 283–295, 2006. The dissertation author was the primary investigator and author of this paper.

VITA

1997 B.S. in Computer Science, Peking University, Beijing, P.R. China
2001 M.Eng. in Information Engineering, Nanyang Technological University, Singapore
2001–2007 Graduate Research Assistant, University of California, San Diego
2005 C.Phil., University of California, San Diego
2007 Ph.D. in Computer Science, University of California, San Diego

PUBLICATIONS

Jeffrey C. Wang, Roded Sharan, Vineet Bafna, and Shaojie Zhang, "PFastR: a web-based fast RNA family identification tool", in preparation, 2007.