LNBI 3909, Pp
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Proquest Dissertations
Automated learning of protein involvement in pathogenesis using integrated queries Eithon Cadag A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy University of Washington 2009 Program Authorized to Offer Degree: Department of Medical Education and Biomedical Informatics UMI Number: 3394276 All rights reserved INFORMATION TO ALL USERS The quality of this reproduction is dependent upon the quality of the copy submitted. In the unlikely event that the author did not send a complete manuscript and there are missing pages, these will be noted. Also, if material had to be removed, a note will indicate the deletion. UMI Dissertation Publishing UMI 3394276 Copyright 2010 by ProQuest LLC. All rights reserved. This edition of the work is protected against unauthorized copying under Title 17, United States Code. uest ProQuest LLC 789 East Eisenhower Parkway P.O. Box 1346 Ann Arbor, Ml 48106-1346 University of Washington Graduate School This is to certify that I have examined this copy of a doctoral dissertation by Eithon Cadag and have found that it is complete and satisfactory in all respects, and that any and all revisions required by the final examining committee have been made. Chair of the Supervisory Committee: Reading Committee: (SjLt KJ. £U*t~ Peter Tgffczy-Hornoch In presenting this dissertation in partial fulfillment of the requirements for the doctoral degree at the University of Washington, I agree that the Library shall make its copies freely available for inspection. I further agree that extensive copying of this dissertation is allowable only for scholarly purposes, consistent with "fair use" as prescribed in the U.S. -
And Should Not, Change Toxic Tort Causation Doctrine
THE MORE WE KNOW, THE LESS INTELLIGENT WE ARE? - HOW GENOMIC INFORMATION SHOULD, AND SHOULD NOT, CHANGE TOXIC TORT CAUSATION DOCTRINE Steve C. Gold* Advances in genomic science are rapidly increasing our understanding of dis- ease and toxicity at the most fundamental biological level. Some say this heralds a new era of certainty in linking toxic substances to human illness in the courtroom. Others are skeptical. In this Article I argue that the new sciences of molecular epidemiology and toxicogenomics will evince both remarkable explanatory power and intractable complexity. Therefore, even when these sciences are brought to bear, toxic tort claims will continue to present their familiar jurisprudential problems. Nevertheless, as genomic information is used in toxic tort cases, the sci- entific developments will offer courts an opportunity to correct mistakes of the past. Courts will miss those opportunities, however, if they simply transfer attitudesfrom classical epidemiology and toxicology to molecular and genomic knowledge. Using the possible link between trichloroethyleneand a type of kidney cancer as an exam- ple, I describe several causation issues where courts can, if they choose, improve doctrine by properly understanding and utilizing genomic information. L Introduction ............................................... 370 II. Causation in Toxic Torts Before Genomics ................... 371 A. The Problem .......................................... 371 B. Classical Sources of Causation Evidence ................ 372 C. The Courts Embrace Epidemiology - Perhaps Too Tightly ................................................ 374 D. Evidence Meets Substance: Keepers of the Gate ......... 379 E. The Opposing Tendency ................................ 382 III. Genomics, Toxicogenomics, and Molecular Epidemiology: The M ore We Know ........................................ 383 A. Genetic Variability and "Background" Risk .............. 384 B. Genetic Variability and Exposure Risk ................... 389 C. -
Algorithms for Computational Biology 8Th International Conference, Alcob 2021 Missoula, MT, USA, June 7–11, 2021 Proceedings
Lecture Notes in Bioinformatics 12715 Subseries of Lecture Notes in Computer Science Series Editors Sorin Istrail Brown University, Providence, RI, USA Pavel Pevzner University of California, San Diego, CA, USA Michael Waterman University of Southern California, Los Angeles, CA, USA Editorial Board Members Søren Brunak Technical University of Denmark, Kongens Lyngby, Denmark Mikhail S. Gelfand IITP, Research and Training Center on Bioinformatics, Moscow, Russia Thomas Lengauer Max Planck Institute for Informatics, Saarbrücken, Germany Satoru Miyano University of Tokyo, Tokyo, Japan Eugene Myers Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany Marie-France Sagot Université Lyon 1, Villeurbanne, France David Sankoff University of Ottawa, Ottawa, Canada Ron Shamir Tel Aviv University, Ramat Aviv, Tel Aviv, Israel Terry Speed Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, Australia Martin Vingron Max Planck Institute for Molecular Genetics, Berlin, Germany W. Eric Wong University of Texas at Dallas, Richardson, TX, USA More information about this subseries at http://www.springer.com/series/5381 Carlos Martín-Vide • Miguel A. Vega-Rodríguez • Travis Wheeler (Eds.) Algorithms for Computational Biology 8th International Conference, AlCoB 2021 Missoula, MT, USA, June 7–11, 2021 Proceedings 123 Editors Carlos Martín-Vide Miguel A. Vega-Rodríguez Rovira i Virgili University University of Extremadura Tarragona, Spain Cáceres, Spain Travis Wheeler University of Montana Missoula, MT, USA ISSN 0302-9743 ISSN 1611-3349 (electronic) Lecture Notes in Bioinformatics ISBN 978-3-030-74431-1 ISBN 978-3-030-74432-8 (eBook) https://doi.org/10.1007/978-3-030-74432-8 LNCS Sublibrary: SL8 – Bioinformatics © Springer Nature Switzerland AG 2021 This work is subject to copyright. -
Computational Pan-Genomics: Status, Promises and Challenges
bioRxiv preprint doi: https://doi.org/10.1101/043430; this version posted March 12, 2016. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. Computational Pan-Genomics: Status, Promises and Challenges Tobias Marschall1,2, Manja Marz3,60,61,62, Thomas Abeel49, Louis Dijkstra6,7, Bas E. Dutilh8,9,10, Ali Ghaffaari1,2, Paul Kersey11, Wigard P. Kloosterman12, Veli M¨akinen13, Adam Novak15, Benedict Paten15, David Porubsky16, Eric Rivals17,63, Can Alkan18, Jasmijn Baaijens5, Paul I. W. De Bakker12, Valentina Boeva19,64,65,66, Francesca Chiaromonte20, Rayan Chikhi21, Francesca D. Ciccarelli22, Robin Cijvat23, Erwin Datema24,25,26, Cornelia M. Van Duijn27, Evan E. Eichler28, Corinna Ernst29, Eleazar Eskin30,31, Erik Garrison32, Mohammed El-Kebir5,33,34, Gunnar W. Klau5, Jan O. Korbel11,35, Eric-Wubbo Lameijer36, Benjamin Langmead37, Marcel Martin59, Paul Medvedev38,39,40, John C. Mu41, Pieter Neerincx36, Klaasjan Ouwens42,67, Pierre Peterlongo43, Nadia Pisanti44,45, Sven Rahmann29, Ben Raphael46,47, Knut Reinert48, Dick de Ridder50, Jeroen de Ridder49, Matthias Schlesner51, Ole Schulz-Trieglaff52, Ashley Sanders53, Siavash Sheikhizadeh50, Carl Shneider54, Sandra Smit50, Daniel Valenzuela13, Jiayin Wang70,71,72, Lodewyk Wessels56, Ying Zhang23,5, Victor Guryev16,12, Fabio Vandin57,34, Kai Ye68,69,72 and Alexander Sch¨onhuth5 1Center for Bioinformatics, Saarland University, Saarbr¨ucken, Germany; 2Max Planck Institute for Informatics, Saarbr¨ucken, -
Grammar String: a Novel Ncrna Secondary Structure Representation
Grammar string: a novel ncRNA secondary structure representation Rujira Achawanantakun, Seyedeh Shohreh Takyar, and Yanni Sun∗ Department of Computer Science and Engineering, Michigan State University, East Lansing, MI 48824 , USA ∗Email: [email protected] Multiple ncRNA alignment has important applications in homologous ncRNA consensus structure derivation, novel ncRNA identification, and known ncRNA classification. As many ncRNAs’ functions are determined by both their sequences and secondary structures, accurate ncRNA alignment algorithms must maximize both sequence and struc- tural similarity simultaneously, incurring high computational cost. Faster secondary structure modeling and alignment methods using trees, graphs, probability matrices have thus been developed. Despite promising results from existing ncRNA alignment tools, there is a need for more efficient and accurate ncRNA secondary structure modeling and alignment methods. In this work, we introduce grammar string, a novel ncRNA secondary structure representation that encodes an ncRNA’s sequence and secondary structure in the parameter space of a context-free grammar (CFG). Being a string defined on a special alphabet constructed from a CFG, it converts ncRNA alignment into sequence alignment with O(n2) complexity. We align hundreds of ncRNA families from BraliBase 2.1 using grammar strings and compare their consensus structure with Murlet using the structures extracted from Rfam as reference. Our experiments have shown that grammar string based multiple sequence alignment competes favorably in consensus structure quality with Murlet. Source codes and experimental data are available at http://www.cse.msu.edu/~yannisun/grammar-string. 1. INTRODUCTION both the sequence and structural conservations. A successful application of SCFG is ncRNA classifica- Annotating noncoding RNAs (ncRNAs), which are tion, which classifies query sequences into annotated not translated into protein but function directly as ncRNA families such as tRNA, rRNA, riboswitch RNA, is highly important to modern biology. -
Dean, School of Science
Dean, School of Science School of Science faculty members seek to answer fundamental questions about nature ranging in scope from the microscopic—where a neuroscientist might isolate the electrical activity of a single neuron—to the telescopic—where an astrophysicist might scan hundreds of thousands of stars to find Earth-like planets in their orbits. Their research will bring us a better understanding of the nature of our universe, and will help us address major challenges to improving and sustaining our quality of life, such as developing viable resources of renewable energy or unravelling the complex mechanics of Alzheimer’s and other diseases of the aging brain. These profound and important questions often require collaborations across departments or schools. At the School of Science, such boundaries do not prevent people from working together; our faculty cross such boundaries as easily as they walk across the invisible boundary between one building and another in our campus’s interconnected buildings. Collaborations across School and department lines are facilitated by affiliations with MIT’s numerous laboratories, centers, and institutes, such as the Institute for Data, Systems, and Society, as well as through participation in interdisciplinary initiatives, such as the Aging Brain Initiative or the Transiting Exoplanet Survey Satellite. Our faculty’s commitment to teaching and mentorship, like their research, is not constrained by lines between schools or departments. School of Science faculty teach General Institute Requirement subjects in biology, chemistry, mathematics, and physics that provide the conceptual foundation of every undergraduate student’s education at MIT. The School faculty solidify cross-disciplinary connections through participation in graduate programs established in collaboration with School of Engineering, such as the programs in biophysics, microbiology, or molecular and cellular neuroscience. -
UNIVERSITY of CALIFORNIA RIVERSIDE Unsupervised And
UNIVERSITY OF CALIFORNIA RIVERSIDE Unsupervised and Zero-Shot Learning for Open-Domain Natural Language Processing A Dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy in Computer Science by Muhammad Abu Bakar Siddique June 2021 Dissertation Committee: Dr. Evangelos Christidis, Chairperson Dr. Amr Magdy Ahmed Dr. Samet Oymak Dr. Evangelos Papalexakis Copyright by Muhammad Abu Bakar Siddique 2021 The Dissertation of Muhammad Abu Bakar Siddique is approved: Committee Chairperson University of California, Riverside To my family for their unconditional love and support. i ABSTRACT OF THE DISSERTATION Unsupervised and Zero-Shot Learning for Open-Domain Natural Language Processing by Muhammad Abu Bakar Siddique Doctor of Philosophy, Graduate Program in Computer Science University of California, Riverside, June 2021 Dr. Evangelos Christidis, Chairperson Natural Language Processing (NLP) has yielded results that were unimaginable only a few years ago on a wide range of real-world tasks, thanks to deep neural networks and the availability of large-scale labeled training datasets. However, existing supervised methods assume an unscalable requirement that labeled training data is available for all classes: the acquisition of such data is prohibitively laborious and expensive. Therefore, zero-shot (or unsupervised) models that can seamlessly adapt to new unseen classes are indispensable for NLP methods to work in real-world applications effectively; such models mitigate (or eliminate) the need for collecting and annotating data for each domain. This dissertation ad- dresses three critical NLP problems in contexts where training data is scarce (or unavailable): intent detection, slot filling, and paraphrasing. Having reliable solutions for the mentioned problems in the open-domain setting pushes the frontiers of NLP a step towards practical conversational AI systems. -
120421-24Recombschedule FINAL.Xlsx
Friday 20 April 18:00 20:00 REGISTRATION OPENS in Fira Palace 20:00 21:30 WELCOME RECEPTION in CaixaForum (access map) Saturday 21 April 8:00 8:50 REGISTRATION 8:50 9:00 Opening Remarks (Roderic GUIGÓ and Benny CHOR) Session 1. Chair: Roderic GUIGÓ (CRG, Barcelona ES) 9:00 10:00 Richard DURBIN The Wellcome Trust Sanger Institute, Hinxton UK "Computational analysis of population genome sequencing data" 10:00 10:20 44 Yaw-Ling Lin, Charles Ward and Steven Skiena Synthetic Sequence Design for Signal Location Search 10:20 10:40 62 Kai Song, Jie Ren, Zhiyuan Zhai, Xuemei Liu, Minghua Deng and Fengzhu Sun Alignment-Free Sequence Comparison Based on Next Generation Sequencing Reads 10:40 11:00 178 Yang Li, Hong-Mei Li, Paul Burns, Mark Borodovsky, Gene Robinson and Jian Ma TrueSight: Self-training Algorithm for Splice Junction Detection using RNA-seq 11:00 11:30 coffee break Session 2. Chair: Bonnie BERGER (MIT, Cambrige US) 11:30 11:50 139 Son Pham, Dmitry Antipov, Alexander Sirotkin, Glenn Tesler, Pavel Pevzner and Max Alekseyev PATH-SETS: A Novel Approach for Comprehensive Utilization of Mate-Pairs in Genome Assembly 11:50 12:10 171 Yan Huang, Yin Hu and Jinze Liu A Robust Method for Transcript Quantification with RNA-seq Data 12:10 12:30 120 Zhanyong Wang, Farhad Hormozdiari, Wen-Yun Yang, Eran Halperin and Eleazar Eskin CNVeM: Copy Number Variation detection Using Uncertainty of Read Mapping 12:30 12:50 205 Dmitri Pervouchine Evidence for widespread association of mammalian splicing and conserved long range RNA structures 12:50 13:10 169 Melissa Gymrek, David Golan, Saharon Rosset and Yaniv Erlich lobSTR: A Novel Pipeline for Short Tandem Repeats Profiling in Personal Genomes 13:10 13:30 217 Rory Stark Differential oestrogen receptor binding is associated with clinical outcome in breast cancer 13:30 15:00 lunch break Session 3. -
BIOGRAPHICAL SKETCH NAME: Berger
BIOGRAPHICAL SKETCH NAME: Berger, Bonnie eRA COMMONS USER NAME (credential, e.g., agency login): BABERGER POSITION TITLE: Simons Professor of Mathematics and Professor of Electrical Engineering and Computer Science EDUCATION/TRAINING (Begin with baccalaureate or other initial professional education, such as nursing, include postdoctoral training and residency training if applicable. Add/delete rows as necessary.) EDUCATION/TRAINING DEGREE Completion (if Date FIELD OF STUDY INSTITUTION AND LOCATION applicable) MM/YYYY Brandeis University, Waltham, MA AB 06/1983 Computer Science Massachusetts Institute of Technology SM 01/1986 Computer Science Massachusetts Institute of Technology Ph.D. 06/1990 Computer Science Massachusetts Institute of Technology Postdoc 06/1992 Applied Mathematics A. Personal Statement Advances in modern biology revolve around automated data collection and sharing of the large resulting datasets. I am considered a pioneer in the area of bringing computer algorithms to the study of biological data, and a founder in this community that I have witnessed grow so profoundly over the last 26 years. I have made major contributions to many areas of computational biology and biomedicine, largely, though not exclusively through algorithmic innovations, as demonstrated by nearly twenty thousand citations to my scientific papers and widely-used software. In recognition of my success, I have just been elected to the National Academy of Sciences and in 2019 received the ISCB Senior Scientist Award, the pinnacle award in computational biology. My research group works on diverse challenges, including Computational Genomics, High-throughput Technology Analysis and Design, Biological Networks, Structural Bioinformatics, Population Genetics and Biomedical Privacy. I spearheaded research on analyzing large and complex biological data sets through topological and machine learning approaches; e.g. -
Microblogging the ISMB: a New Approach to Conference Reporting
Message from ISCB Microblogging the ISMB: A New Approach to Conference Reporting Neil Saunders1*, Pedro Beltra˜o2, Lars Jensen3, Daniel Jurczak4, Roland Krause5, Michael Kuhn6, Shirley Wu7 1 School of Molecular and Microbial Sciences, University of Queensland, St. Lucia, Brisbane, Queensland, Australia, 2 Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, California, United States of America, 3 Novo Nordisk Foundation Center for Protein Research, Panum Institute, Copenhagen, Denmark, 4 Department of Bioinformatics, University of Applied Sciences, Hagenberg, Freistadt, Austria, 5 Max-Planck-Institute for Molecular Genetics, Berlin, Germany, 6 European Molecular Biology Laboratory, Heidelberg, Germany, 7 Stanford Medical Informatics, Stanford University, Stanford, California, United States of America Cameron Neylon entitled FriendFeed for Claire Fraser-Liggett opened the meeting Scientists: What, Why, and How? (http:// with a review of metagenomics and an blog.openwetware.org/scienceintheopen/ introduction to the human microbiome 2008/06/12/friendfeed-for-scientists-what- project (http://friendfeed.com/search?q = why-and-how/) for an introduction. room%3Aismb-2008+microbiome+OR+ We—a group of science bloggers, most fraser). The subsequent Q&A session of whom met in person for the first time at covered many of the exciting challenges The International Conference on Intel- ISMB 2008—found FriendFeed a remark- for those working in this field. Clearly, ligent Systems for Molecular Biology -
UC Irvine UC Irvine Previously Published Works
UC Irvine UC Irvine Previously Published Works Title The capacity of feedforward neural networks. Permalink https://escholarship.org/uc/item/29h5t0hf Authors Baldi, Pierre Vershynin, Roman Publication Date 2019-08-01 DOI 10.1016/j.neunet.2019.04.009 License https://creativecommons.org/licenses/by/4.0/ 4.0 Peer reviewed eScholarship.org Powered by the California Digital Library University of California THE CAPACITY OF FEEDFORWARD NEURAL NETWORKS PIERRE BALDI AND ROMAN VERSHYNIN Abstract. A long standing open problem in the theory of neural networks is the devel- opment of quantitative methods to estimate and compare the capabilities of different ar- chitectures. Here we define the capacity of an architecture by the binary logarithm of the number of functions it can compute, as the synaptic weights are varied. The capacity provides an upperbound on the number of bits that can be extracted from the training data and stored in the architecture during learning. We study the capacity of layered, fully-connected, architectures of linear threshold neurons with L layers of size n1, n2,...,nL and show that in essence the capacity is given by a cubic polynomial in the layer sizes: L−1 C(n1,...,nL) = Pk=1 min(n1,...,nk)nknk+1, where layers that are smaller than all pre- vious layers act as bottlenecks. In proving the main result, we also develop new techniques (multiplexing, enrichment, and stacking) as well as new bounds on the capacity of finite sets. We use the main result to identify architectures with maximal or minimal capacity under a number of natural constraints. -
Lior Pachter Genome Informatics 2013 Keynote
Stories from the Supplement Lior Pachter! Department of Mathematics and Molecular & Cell Biology! UC Berkeley! ! November 1, 2013! Genome Informatics, CSHL The Cufflinks supplements • C. Trapnell, B.A. Williams, G. Pertea, A. Mortazavi, G. Kwan, M.J. van Baren, S.L. Salzberg, B.J. Wold and L. Pachter, Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation, Nature Biotechnology 28, (2010), p 511–515. Supplementary Material: 42 pages. • C. Trapnell, D.G. Hendrickson, M. Sauvageau, L. Goff, J.L. Rinn and L. Pachter, Differential analysis of gene regulation at transcript resolution with RNA-seq, Nature Biotechnology, 31 (2012), p 46–53. Supplementary Material: 70 pages. A supplementary arithmetic progression? • C. Trapnell, B.A. Williams, G. Pertea, A. Mortazavi, G. Kwan, M.J. van Baren, S.L. Salzberg, B.J. Wold and L. Pachter, Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation, Nature Biotechnology 28, (2010), p 511–515. Supplementary Material: 42 pages. • C. Trapnell, D.G. Hendrickson, M. Sauvageau, L. Goff, J.L. Rinn and L. Pachter, Differential analysis of gene regulation at transcript resolution with RNA- seq, Nature Biotechnology, 31 (2012), p 46–53. Supplementary Material: 70 pages. • A. Roberts and L. Pachter, Streaming algorithms for fragment assignment, Nature Methods 10 (2013), p 71—73. Supplementary Material: 98 pages? The nature methods manuscript checklist The nature methods manuscript checklist Emperor Joseph II: My dear young man, don't take it too hard. Your work is ingenious. It's quality work. And there are simply too many notes, that's all.