BMC Bioinformatics Biomed Central
Total Page:16
File Type:pdf, Size:1020Kb
Load more
										Recommended publications
									
								- 
												  Proquest DissertationsAutomated learning of protein involvement in pathogenesis using integrated queries Eithon Cadag A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy University of Washington 2009 Program Authorized to Offer Degree: Department of Medical Education and Biomedical Informatics UMI Number: 3394276 All rights reserved INFORMATION TO ALL USERS The quality of this reproduction is dependent upon the quality of the copy submitted. In the unlikely event that the author did not send a complete manuscript and there are missing pages, these will be noted. Also, if material had to be removed, a note will indicate the deletion. UMI Dissertation Publishing UMI 3394276 Copyright 2010 by ProQuest LLC. All rights reserved. This edition of the work is protected against unauthorized copying under Title 17, United States Code. uest ProQuest LLC 789 East Eisenhower Parkway P.O. Box 1346 Ann Arbor, Ml 48106-1346 University of Washington Graduate School This is to certify that I have examined this copy of a doctoral dissertation by Eithon Cadag and have found that it is complete and satisfactory in all respects, and that any and all revisions required by the final examining committee have been made. Chair of the Supervisory Committee: Reading Committee: (SjLt KJ. £U*t~ Peter Tgffczy-Hornoch In presenting this dissertation in partial fulfillment of the requirements for the doctoral degree at the University of Washington, I agree that the Library shall make its copies freely available for inspection. I further agree that extensive copying of this dissertation is allowable only for scholarly purposes, consistent with "fair use" as prescribed in the U.S.
- 
												  Curriculum VitaeCurriculum Vitae Tandy Warnow Grainger Distinguished Chair in Engineering 1 Contact Information Department of Computer Science The University of Illinois at Urbana-Champaign Email: [email protected] Homepage: http://tandy.cs.illinois.edu 2 Research Interests Phylogenetic tree inference in biology and historical linguistics, multiple sequence alignment, metage- nomic analysis, big data, statistical inference, probabilistic analysis of algorithms, machine learning, combinatorial and graph-theoretic algorithms, and experimental performance studies of algorithms. 3 Professional Appointments • Co-chief scientist, C3.ai Digital Transformation Institute, 2020-present • Grainger Distinguished Chair in Engineering, 2020-present • Associate Head for Computer Science, 2019-present • Special advisor to the Head of the Department of Computer Science, 2016-present • Associate Head for the Department of Computer Science, 2017-2018. • Founder Professor of Computer Science, the University of Illinois at Urbana-Champaign, 2014- 2019 • Member, Carl R. Woese Institute for Genomic Biology. Affiliate of the National Center for Supercomputing Applications (NCSA), Coordinated Sciences Laboratory, and the Unit for Criticism and Interpretive Theory. Affiliate faculty member in the Departments of Mathe- matics, Electrical and Computer Engineering, Bioengineering, Statistics, Entomology, Plant Biology, and Evolution, Ecology, and Behavior, 2014-present. • National Science Foundation, Program Director for Big Data, July 2012-July 2013. • Member, Big Data Senior Steering Group of NITRD (The Networking and Information Tech- nology Research and Development Program), subcomittee of the National Technology Council (coordinating federal agencies), 2012-2013 • Departmental Scholar, Institute for Pure and Applied Mathematics, UCLA, Fall 2011 • Visiting Researcher, University of Maryland, Spring and Summer 2011. 1 • Visiting Researcher, Smithsonian Institute, Spring and Summer 2011. • Professeur Invit´e,Ecole Polytechnique F´ed´erale de Lausanne (EPFL), Summer 2010.
- 
												  Proceedings of the Eighteenth International Conference on Machine Learning., 282 – 289Abstracts of papers, posters and talks presented at the 2008 Joint RECOMB Satellite Conference on REGULATORYREGULATORY GENOMICS GENOMICS - SYSTEMS BIOLOGY - DREAM3 Oct 29-Nov 2, 2008 MIT / Broad Institute / CSAIL BMP follicle cells signaling EGFR signaling floor cells roof cells Organized by Manolis Kellis, MIT Andrea Califano, Columbia Gustavo Stolovitzky, IBM Abstracts of papers, posters and talks presented at the 2008 Joint RECOMB Satellite Conference on REGULATORYREGULATORY GENOMICS GENOMICS - SYSTEMS BIOLOGY - DREAM3 Oct 29-Nov 2, 2008 MIT / Broad Institute / CSAIL Organized by Manolis Kellis, MIT Andrea Califano, Columbia Gustavo Stolovitzky, IBM Conference Chairs: Manolis Kellis .................................................................................. Associate Professor, MIT Andrea Califano ..................................................................... Professor, Columbia University Gustavo Stolovitzky....................................................................Systems Biology Group, IBM In partnership with: Genome Research ..............................................................................editor: Hillary Sussman Nature Molecular Systems Biology ............................................... editor: Thomas Lemberger Journal of Computational Biology ...............................................................editor: Sorin Istrail Organizing committee: Eleazar Eskin Trey Ideker Eran Segal Nir Friedman Douglas Lauffenburger Ron Shamir Leroy Hood Satoru Miyano Program Committee: Regulatory Genomics:
- 
												  Classifying Peroxiredoxin Subgroups and IdentifyingCLASSIFYING PEROXIREDOXIN SUBGROUPS AND IDENTIFYING DISCRIMINATING MOTIFS VIA MACHINE LEARNING BY JIAJIE XIAO A Thesis Submitted to the Graduate Faculty of WAKE FOREST UNIVERSITY GRADUATE SCHOOL OF ARTS AND SCIENCES in Partial Fulfillment of the Requirements for the Degree of MASTER OF SCIENCE Computer Science May, 2018 Winston-Salem, North Carolina Approved By: William Turkett, Jr., Ph.D., Advisor David John, Ph.D., Chair Grey Ballard, Ph.D. James Pease, Ph.D. Dedication This work is dedicated to my parents, Mingfu Xiao and Lizhu Xu, my sister, Wanling Xiao. Because of their hard work, I get to be exactly who I want to be. ii Acknowledgments First, I would like to show my gratitude towards the Center for Molecular Communication and Signaling for supporting me as a Research Fellow for the fall semester in 2017-2018 academic year. The extra time and focus has helped me expand my research project in directions I would not have been able to explore otherwise. I would also like to thank the Department of Computer Science at Wake Forest University. I appreciate all of the resources and support that the department has provided me with in order to complete my thesis. I would also like to thank the professors who have been very helpful and welcoming during coursework, seminars, and hallway passings. I want to thank Dr. John, Dr. Ballard, and Dr. Pease for being on my committee. I appreciate your input and the time you are taking to help me complete my thesis. I want to thank Dr. Poole for the interactions in the Prx subgroup classification project.
- 
												  Probabilistic Protein EngineeringProbabilistic Protein Engineering Thesis by Kevin Kaichuang Yang In Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy in Chemical Engineering CALIFORNIA INSTITUTE OF TECHNOLOGY Pasadena, California 2019 Defended 14 Dec 2018 ii © 2019 Kevin Kaichuang Yang ORCID: 0000-0001-9045-6826 Some rights reserved. This thesis is distributed under a Creative Commons Attribution-NonCommercial-ShareAlike License iii ACKNOWLEDGEMENTS First, I’d like to thank my advisor Frances Arnold for her support and guidance throughout my PhD. I applied to Caltech because I wanted to work for Frances, and she has not disappointed. Even as I took a detour to teach high school, she welcomed me into her lab first for a summer, and then for my PhD. Frances has a keen eye for important but tractable scientific problems and gave me the freedom to pursue them past the boundaries of her discipline. Frances has pushed me to be a better scientist, and her lessons are the foundation for the rest of my career. One thing Frances has done very well is to build a world-class research group. I would like to specifically thank several group members. During my first stint in the lab, Sabine Brinkmann-Chen, John McIntosh, and Chris Farwell taught me to perform directed evolution. Sabine also keeps us all safe, happy, and productive. When I rejoined the lab, Lukas Herwig reacquainted me with molecular biology. It was a joy to work with Claire Bedbrook and Austin Rice on my first three papers. Thank you for collecting my data for me. Zachary Wu has been a great co-belligerent for the cause of machine learning in the Arnold lab.
- 
												  Alignment-Free Phylogeny Reconstruction Based on Quartet TreesAlignment-free Phylogeny Reconstruction Based On Quartet Trees Dissertation for the award of the degree „Doctor rerum naturalium“ of the Georg-August-Universität Göttingen within the doctoral program „Computer Science“ (PCS) of the Georg-August-University School of Science (GAUSS) submitted by Thomas Dencker from Nordenham January 27, 2020 Members of the Thesis Committee Prof. Dr. Burkhard Morgenstern (1st Referee) Institute for Microbiology and Genetics, Georg-August-Universität Göttingen Prof. Dr. Stephan Waack (2nd Referee) Institute for Computer Science, Georg-August-Universität Göttingen Dr. Johannes Söding Quantitative and Computational Biology, Max-Planck Institute for Biophysical Chemistry Göttingen Further Members of the Examination Board Prof. Dr. Christoph Bleidorn Johann-Friedrich-Blumenbach Institute for Zoology & Anthropology, Georg-August-Universität Göttingen Prof. Dr. Anja Sturm Institute for Mathematical Stochastics, Georg-August-Universität Göttingen Prof. Dr. Jan de Vries Institute for Microbiology and Genetics, Georg-August-Universität Göttingen Date of the oral examination: March 4 th 2020 ii Affidavit I hereby confirm that this thesis has been written independently and with no other sources and aids than quoted. Göttingen, 27. January 2020 Thomas Dencker iii Danksagung An dieser Stelle möchte ich mich bei allen bedanken, die mich in den letzten Jahren be- gleitet und unterstützt haben. Zuallererst bedanke ich mich bei meinem Doktorvater Prof. Dr. Burkhard Morgenstern für die Betreuung dieser Doktorarbeit und die andauernde Unterstützung seit der Bache- lorarbeit. Ohne ihn wäre ich gar nicht in der Bioinformatik gelandet. Besonders dankbar ich bin für die vielen Ratschläge, mit denen ich vor allem meine Präsentationsfähigkeiten verbessern konnte. Außerdem hat er mir die Möglichkeit gegeben, an der RECOMB-CG in Kanada teilzunehmen und viele neue Erfahrungen zu sammeln.
- 
												  UNIVERSITY of CALIFORNIA, SAN DIEGO Identifying AndUNIVERSITY OF CALIFORNIA, SAN DIEGO Identifying and Accommodating Context Dependent Effects in Studies of Genetic Variation and Human Disease A dissertation submitted in partial satisfaction of the requirements for the degree Doctor of Philosophy in Biomedical Sciences by Danjuma Quarless Committee in charge: Professor Nicholas Schork, Chair Professor Richard Kolodner, Co-Chair Professor John Kelsoe Professor Victor Nizet Professor Bing Ren 2017 Copyright Danjuma Quarless, 2017 All rights reserved. The dissertation of Danjuma Quarless is approved, and it is acceptable in quality and form for publication on microfilm and electronically: Co-Chair Chair University of California, San Diego 2017 iii DEDICATION This thesis is dedicated most importantly to my family and friends. Beyond contributing my drop in the scientific bucket, my motivation for completing this degree was demonstrate that goals are achievable. After my family, I’d like to address the incredible support from an incredible group of individuals which as guided me this far. Gwen White and the entire community at Bellarmine who uplifted my intellectual abilities. Dr. Whitehouse, Tim Herron, Erin Jones, Dr. Kent Jones, Dr. Pond who first introduced me to genomics and the entire Whitworth community who challenged me in the perfect combination of support and doubt. Coach Toby and Coach Basket who solidified my character and Dr. Brown and Dr. Whitman for investing in me during the summer that changed my life. Travis Styles, Erick Scott, the Schork Lab, and the BMS program at large: they listened, guided, and supported me throughout the entire process. Jessica who’s become my anchor, where I started this journey without her, however there’s no way I could have finished in the same state.
- 
												  Eleazar EskinEleazar Eskin Department of Computer Science [email protected] University of California, Los Angeles http://www.cs.ucla.edu/˜eeskin Engineering VI rm 286A Los Angeles, CA 90095-1596 EDUCATION Ph.D., Computer Science, Columbia University, October 2002. M.S., Computer Science, Columbia University, May 2000. B.S., Computer Science (with Honors), University of Chicago, May 1997. B.A., Economics (with Honors), University of Chicago, May 1997. B.S., Mathematics, University of Chicago, May 1997. WORK EXPERIENCE Professor and Chair: Department of Computational Medicine. University of California, Los Angeles. December 2017 - present. Professor: Department of Computer Science. Department of Human Genetics. University of California, Los Angeles. July 2014 - present. Associate Professor: Department of Computer Science. Department of Human Genetics. University of California, Los Angeles. July 2009 - June 2014. Assistant Professor: Department of Computer Science. Department of Human Genetics. University of California, Los Angeles. October 2006 - June 2009. Assistant Professor in Residence: Department of Computer Science and Engineering. University of California, San Diego. July 2003 - October 2006. Post Doctoral Researcher: School of Computer Science and Engineering. The Hebrew University. October 2002 - July 2003. TEACHING EXPERIENCE Instructor: Computational Genetics. University of California, Los Angeles, Spring 2007, Spring 2008, Spring 2009, Spring 2010, Spring 2011, Spring 2012, Winter 2013, Spring 2014, Spring 2015, Spring 2016. Instructor: Algorithms in Systems Biology and Bioinformatics. University of California, Los Angeles, Fall 2014, Winter 2016. Winter 2017. Winter 2018. Instructor: Introduction to Bioinformatics. University of California, Los Angeles, Fall 2010. Instructor: Current Topics in Bioinformatics. University of California, Los Angeles, Winter 2008, Fall 2008, Winter 2010, Fall 2011, Fall 2012.
- 
												![Arxiv:1609.08391V1 [Stat.ML] 27 Sep 2016 Chapter 1](https://docslib.b-cdn.net/cover/7947/arxiv-1609-08391v1-stat-ml-27-sep-2016-chapter-1-6707947.webp)  Arxiv:1609.08391V1 [Stat.ML] 27 Sep 2016 Chapter 1Multiple protein feature prediction with statistical relational learning Luca Masera September 27, 2018 arXiv:1609.08391v1 [stat.ML] 27 Sep 2016 Chapter 1 Introduction 1.1 Motivations Proteins are the workhorse molecules of life, they take part in almost every- thing what happens in and outside the cells, from building structures such as hair or nails, to enzymes, which accelerate chemical reactions. It is sufficient to consider some number to understand their importance and variety: the total number of proteins in human cells is estimated to be between 250; 000 to one million and the dry weight of our bodies is made by about the 75% of proteins [40]. Hence, it is easy to understand that an accurate analysis of their features and properties is crucial to understand the deepest mechanism of life. Since their identification in the eighteenth century, proteins have been a focal point of micro-biologist researches, but it is just after the half of the twentieth century that the study of proteins makes its biggest progresses. Indeed, in that years Sanger accurately determined the correct amino acid sequence of insulin, and Max Perutz and Sir John Cowdery Kendrew respec- tively resolved the three dimensional structure of hemoglobin and myoglobin through X-ray crystallography. These discoveries opened the floodgates to a completely new set of protein analysis techniques. In the 80's indeed, bi- 1 ologist recognized the potential of the firsts portable computer and began to store their newly obtained protein sequences and structures data in digi- tal format. Since them marriage between computer science and biology has been long and fruitful.
- 
												  Ismb 2010 OrganizationVol. 26 ISMB 2010, pages i2–i6 BIOINFORMATICS doi:10.1093/bioinformatics/btq246 ISMB 2010 ORGANIZATION PROCEEDINGS COMMITTEE CHAIRS Gene Regulation and Transcriptomics Mona Singh, Proceedings Chair, Princeton University, USA Hanah Margalit, The Hebrew University of Jerusalem, Israel Joel S. Bader, Proceedings Co-Chair, Johns Hopkins University, Eric Xing, Carnegie Mellon University, Pittsburgh, USA Baltimore, USA Population Genomics Eleazar Eskin, University of California, Los Angeles, USA Eran Halperin, Tel-Aviv University, Israel PROCEEDINGS AREA CHAIRS Protein Interactions and Molecular Networks Bioimaging Alfonso Valencia, Spanish National Cancer Research Centre, Gene Myers, Howard Hughes Medical Institute, Ashburn, Madrid, Spain USA Roded Sharan, Tel-Aviv University, Israel Robert F. Murphy, Carnegie Mellon University, Pittsburgh, Protein Structure and Function USA Bonnie Berger, Massachusetts Institute of Technology, Cambridge, Databases and Ontologies USA Alex Bateman, Wellcome Trust Sanger Institute, Hinxton, UK Nir Ben-Tal, Tel-Aviv University, Israel Suzanna Lewis, Lawrence Berkeley National Labs, Berkeley, Sequence Analysis USA Michael Brudno, University of Toronto, Canada Disease Models and Epidemiology Cenk Sahinalp, Simon Fraser University, Vancouver, Canada Thomas Lengauer, Max-Planck Institute for Informatics, Text Mining Saarbruken, Germany Andrey Rzhetsky, University of Chicago, USA Yves Moreau, Catholic University of Leuven, Belgium Hagit Shatkay, Queen’s University, Kingston, Canada Evolution and Comparative Genomics
- 
												  Program Booklet Gives You the Keynote Speakers 10 - 11 Agenda for Both RECOMB and RECOMB-SeqRECOMB 2017 The 21st Annual International Conference on Research in Computational Molecular Biology May 3-7, 2017 RECOMB-Seq 2017 The Seventh RECOMB Satellite Workshop on Massively Parallel Sequencing May 7-8, 2017 Sponsors TABLE OF CONTENTS TABLE Welcome to RECOMB & RECOMB-Seq 2017 On behalf of the local organizing committee, The University of Hong Kong and The Chinese University of Hong Kong, I Agenda at a Glance 2 - 3 welcome you to Hong Kong. Apart from attending the conference, I do wish that you will have the chance to explore our really wonderful city in your spare time. Detailed Schedule 4 - 9 RECOMB 2017 will be action-packed, consisting of a three-and-a-half-day conference program (May 4-7, 2017) followed by RECOMB-Seq 2017 with one-and-a-half-day workshop program (May 7-8, 2017). This program booklet gives you the Keynote Speakers 10 - 11 agenda for both RECOMB and RECOMB-Seq. Note that, we have six keynotes for RECOMB (one in the morning and one in the afternoon on the first three days) and two keynotes for RECOMB-Seq (one on each day). Posters are divided into two Poster Session Schedule 12 - 15 parts (see the detailed schedule in this booklet). Sponsors 16 Upon registration of attendance, you will receive the e-copy of the proceedings containing the accepted papers and the abstracts of the accepted posters. The meal tickets you receive will include one for the Banquet on May 5. Committees 17 Please note that wireless Internet access is available throughout most of the campus of the Chinese University of Hong Kong (via eduroam if you have an account) or throughout the venue of the conference (the login and password are shown at Floor Plan 18 - 19 “Others” section of this booklet and at the back of your conference badge).
- 
												  Discovery and Analysis of Aligned Pattern Clusters from Protein Family SequencesDiscovery and Analysis of Aligned Pattern Clusters from Protein Family Sequences by En-Shiun Annie Lee A thesis presented to the University of Waterloo in fulfillment of the thesis requirement for the degree of Doctor of Philosophy in Systems Design Engineering Waterloo, Ontario, Canada, 2014 c En-Shiun Annie Lee 2014 I hereby declare that I am the sole author of this thesis. This is a true copy of the thesis, including any required final revisions, as accepted by my examiners. I understand that my thesis may be made electronically available to the public. ii Abstract Protein sequences are essential for encoding molecular structures and functions. Conse- quently, biologists invest substantial resources and time discovering functional patterns in proteins. Using high-throughput technologies, biologists are generating an increasing amount of data. Thus, the major challenge in biosequencing today is the ability to conduct data analysis in an efficient and productive manner. Conserved amino acids in proteins reveal important functional domains within protein families. Conversely, less conserved amino acid variations within these protein sequence patterns reveal areas of evolutionary and functional divergence. Exploring protein families using existing methods such as multiple sequence alignment is computationally expensive, thus pattern search is used. However, at present, combinatorial methods of pattern search generate a large set of solutions, and probabilistic methods require richer representations. They require biological ground truth of the input sequences, such as gene name or taxonomic species, as class labels based on traditional classification practice to train a model for predicting unknown sequences. However, these algorithms are inherently biased by mislabelling and may not be able to reveal class characteristics in a detailed and succinct manner.