Eval2011.Pdf
Total Page:16
File Type:pdf, Size:1020Kb
										Recommended publications
									
								- 
  Ontology-Based Methods for Analyzing Life Science Data
  Habilitation à Diriger des Recherches presented by Olivier Dameron: Ontology-based methods for analyzing life science data. Publicly defended on 11 January 2016 before a jury composed of: Anita Burgun, Professor, Université René Descartes Paris (Examiner); Marie-Dominique Devignes, CNRS Research Scientist, LORIA Nancy (Examiner); Michel Dumontier, Associate Professor, Stanford University, USA (Reviewer); Christine Froidevaux, Professor, Université Paris Sud (Reviewer); Fabien Gandon, Research Director, Inria Sophia-Antipolis (Reviewer); Anne Siegel, CNRS Research Director, IRISA Rennes (Examiner); Alexandre Termier, Professor, Université de Rennes 1 (Examiner). Contents: 1 Introduction (1.1 Context; 1.2 Challenges; 1.3 Summary of the contributions; 1.4 Organization of the manuscript). 2 Reasoning based on hierarchies: 2.1 Principle (2.1.1 RDF for describing data; 2.1.2 RDFS for describing types; 2.1.3 RDFS entailments; 2.1.4 Typical uses of RDFS entailments in life science; 2.1.5 Synthesis); 2.2 Case study: integrating diseases and pathways (2.2.1 Context; 2.2.2 Objective; 2.2.3 Linking pathways and diseases using GO, KO and SNOMED-CT; 2.2.4 Querying associated diseases and pathways); 2.3 Methodology: Web services composition (2.3.1 Context; 2.3.2 Objective; 2.3.3 Semantic compatibility of services parameters; 2.3.4 Algorithm for pairing services parameters); 2.4 Application: ontology-based query expansion with GO2PUB (2.4.1 Context; 2.4.2 Objective
- 
  PREDICTD: Parallel Epigenomics Data Imputation with Cloud-Based Tensor Decomposition
  bioRxiv preprint doi: https://doi.org/10.1101/123927; this version posted April 4, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC 4.0 International license. PREDICTD: PaRallel Epigenomics Data Imputation with Cloud-based Tensor Decomposition. Timothy J. Durham (Department of Genome Sciences, University of Washington), Maxwell W. Libbrecht (Department of Genome Sciences, University of Washington), J. Jeffry Howbert (Department of Genome Sciences, University of Washington), Jeff Bilmes (Department of Electrical Engineering, University of Washington), William Stafford Noble (Department of Genome Sciences and Department of Computer Science and Engineering, University of Washington). April 4, 2017. Abstract: The Encyclopedia of DNA Elements (ENCODE) and the Roadmap Epigenomics Project have produced thousands of data sets mapping the epigenome in hundreds of cell types. However, the number of cell types remains too great to comprehensively map given current time and financial constraints. We present a method, PaRallel Epigenomics Data Imputation with Cloud-based Tensor Decomposition (PREDICTD), to address this issue by computationally imputing missing experiments in collections of epigenomics experiments. PREDICTD leverages an intuitive and natural model called "tensor decomposition" to impute many experiments simultaneously. Compared with the current state-of-the-art method, ChromImpute, PREDICTD produces lower overall mean squared error, and combining methods yields further improvement. We show that PREDICTD data can be used to investigate enhancer biology at non-coding human accelerated regions. PREDICTD provides reference imputed data sets and open-source software for investigating new cell types, and demonstrates the utility of tensor decomposition and cloud computing, two technologies increasingly applicable in bioinformatics.
- 
  Motif Selection Using Simulated Annealing Algorithm with Application to Identify Regulatory Elements
  Motif Selection Using Simulated Annealing Algorithm with Application to Identify Regulatory Elements. A thesis presented to the faculty of the Russ College of Engineering and Technology of Ohio University in partial fulfillment of the requirements for the degree Master of Science. Liang Chen, August 2018. © 2018 Liang Chen. All Rights Reserved. This thesis titled Motif Selection Using Simulated Annealing Algorithm with Application to Identify Regulatory Elements by LIANG CHEN has been approved for the Department of Electrical Engineering and Computer Science and the Russ College of Engineering and Technology by Lonnie Welch, Professor of Electrical Engineering and Computer Science, and Dennis Irwin, Dean, Russ College of Engineering and Technology. Abstract: CHEN, LIANG, M.S., August 2018, Computer Science Master Program. Motif Selection Using Simulated Annealing Algorithm with Application to Identify Regulatory Elements (106 pp.). Director of Thesis: Lonnie Welch. Modern research on gene regulation and disorder-related pathways utilizes tools such as microarray and RNA-Seq to analyze changes in the expression levels of large sets of genes. In silico motif discovery was performed based on the gene expression profile data, which generated a large set of candidate motifs (usually hundreds or thousands of motifs). How to pick a set of biologically meaningful motifs from the candidate motif set is a challenging biological and computational problem. As a computational problem it can be modeled as the motif selection problem (MSP). Building solutions for the motif selection problem will give biologists direct help in finding transcription factors (TF) that are strongly related to specific pathways and in gaining insight into the relationships between genes.
- 
  Genome Informatics
  Joint Cold Spring Harbor Laboratory/Wellcome Trust Conference GENOME INFORMATICS, September 15–September 19, 2010. Arranged by Inanc Birol, BC Cancer Agency, Canada; Michele Clamp, BioTeam, Inc.; James Kent, University of California, Santa Cruz, USA. SCHEDULE AT A GLANCE. Wednesday 15th September 2010: 17.00-17.30 Registration (finger buffet dinner served from 17.30-19.30); 19.30-20:50 Session 1: Epigenomics and Gene Regulation; 20.50-21.10 Break; 21.10-22.30 Session 1, continued. Thursday 16th September 2010: 07.30-09.00 Breakfast; 09.00-10.20 Session 2: Population and Statistical Genomics; 10.20-10:40 Morning Coffee; 10:40-12:00 Session 2, continued; 12.00-14.00 Lunch; 14.00-15.20 Session 3: Environmental and Medical Genomics; 15.20-15.40 Break; 15.40-17.00 Session 3, continued; 17.00-19.00 Poster Session I and Drinks Reception; 19.00-21.00 Dinner. Friday 17th September 2010: 07.30-09.00 Breakfast; 09.00-10.20 Session 4: Databases, Data Mining, Visualization and Curation; 10.20-10.40 Morning Coffee; 10.40-12.00 Session 4, continued; 12.00-14.00 Lunch; 14.00-16.00 Free afternoon; 16.00-17.00 Keynote Speaker: Alex Bateman; 17.00-19.00 Poster Session II and Drinks Reception; 19.00-21.00 Dinner. Saturday 18th September 2010: 07.30-09.00 Breakfast; 09.00-10.20 Session 5: Sequencing Pipelines and Assembly; 10.20-10.40
- 
  (Title of the Thesis)*
  Discovery of Flexible Gap Patterns from Sequences, by En Hui Zhuang. A thesis presented to the University of Waterloo in fulfillment of the thesis requirement for the degree of Doctor of Philosophy in Systems Design Engineering. Waterloo, Ontario, Canada, 2014. © En Hui Zhuang 2014. AUTHOR'S DECLARATION: I hereby declare that I am the sole author of this thesis. This is a true copy of the thesis, including any required final revisions, as accepted by my examiners. I understand that my thesis may be made electronically available to the public. Abstract: Human genome contains abundant motifs bound by particular biomolecules. These motifs are involved in the complex regulatory mechanisms of gene expressions. The dominant mechanism behind the intriguing gene expression patterns is known as combinatorial regulation, achieved by multiple cooperating biomolecules binding in a nearby genomic region to provide a specific regulatory behavior. To decipher the complicated combinatorial regulation mechanism at work in the cellular processes, there is a pressing need to identify co-binding motifs for these cooperating biomolecules in genomic sequences. The great flexibility of the interaction distance between nearby cooperating biomolecules leads to the presence of flexible gaps in between component motifs of a co-binding motif. Many existing motif discovery methods cannot handle co-binding motifs with flexible gaps. Existing co-binding motif discovery methods are ineffective in dealing with the following problems: (1) co-binding motifs may not appear in a large fraction of the input sequences, (2) the lengths of component motifs are unknown and (3) the maximum range of the flexible gap can be large.
- 
  Genomic and Transcriptomic Investigation of Endemic Burkitt Lymphoma and Epstein Barr Virus
  University of Massachusetts Medical School, eScholarship@UMMS. GSBS Dissertations and Theses, Graduate School of Biomedical Sciences. 2017-07-31. Genomic and Transcriptomic Investigation of Endemic Burkitt Lymphoma and Epstein Barr Virus. Yasin Kaymaz, University of Massachusetts Medical School. Let us know how access to this document benefits you. Follow this and additional works at: https://escholarship.umassmed.edu/gsbs_diss Part of the Bioinformatics Commons, Computational Biology Commons, Genetics Commons, Genomics Commons, Hematology Commons, Immunology of Infectious Disease Commons, Molecular Genetics Commons, Oncology Commons, Other Genetics and Genomics Commons, Parasitology Commons, Pathology Commons, and the Pediatrics Commons. Repository Citation: Kaymaz Y. (2017). Genomic and Transcriptomic Investigation of Endemic Burkitt Lymphoma and Epstein Barr Virus. GSBS Dissertations and Theses. https://doi.org/10.13028/M2R95Z. Retrieved from https://escholarship.umassmed.edu/gsbs_diss/914 Creative Commons License: This work is licensed under a Creative Commons Attribution-Noncommercial 4.0 License. This material is brought to you by eScholarship@UMMS. It has been accepted for inclusion in GSBS Dissertations and Theses by an authorized administrator of eScholarship@UMMS. For more information, please contact [email protected].
  GENOMIC AND TRANSCRIPTOMIC INVESTIGATION OF ENDEMIC BURKITT LYMPHOMA AND EPSTEIN BARR VIRUS. A Dissertation Presented by YASIN KAYMAZ. Submitted to the Faculty of the University of Massachusetts Graduate School of Biomedical Sciences, Worcester in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY. July 31st, 2017. GENOMIC AND TRANSCRIPTOMIC INVESTIGATION OF ENDEMIC BURKITT LYMPHOMA AND EPSTEIN BARR VIRUS. A Dissertation Presented by YASIN KAYMAZ. The signatures of the Dissertation Defense Committee signify completion and approval as to style and content of the Dissertation. Jeffrey A.
- 
  Here Have Been Major Community Efforts on Algorithm and Software Development
  ACM-BCB 2013 ACM-BCB Organization Committee Steering Committee Chair Poster Chairs Local Arrangement Chairs Amarda Shehu Aidong Zhang Dongxiao Zhu George Mason University State University of New York at Buffalo Wayne State University Liliana Florea Yu-Ping Wang Johns Hopkins University General Chairs Tulane University Sridhar Hannenhalli Women in Bioinformatics University of Maryland Industry Chairs Panel Chair Cathy H. Wu Anastasia Christianson May Dongmei Wang University of Delaware & AstraZeneca Pharmaceutical Georgia Tech & Emory University Georgetown University Michael Liebman Strategic Medicine Program Chairs Panel Chair Iosif Vaisman Srinivas Aluru Health Informatics George Mason University Georgia Institute of Technology Symposium Chairs Donna Slonim Maricel G. Kann Tufts University University of Maryland, Baltimore County Publicity Chair Philip Payne Jianlin Cheng Workshop Chair Ohio State University University of Missouri, Columbia Ümit V. Çatalyürek Ohio State University Exhibit/System Demo Chair Proceedings Chair Nathan Edwards Jing Gao Tutorial Chairs Georgetown University State University of New York at Buffalo Clare Bates Congdon University of Southern Maine PhD Forum Chair Registration Chair Vasant Honavar Yanni Sun Preetam Ghosh Iowa State University Michigan State University Virginia Commonwealth University. ACM-BCB 2013 Conference Schedule: Sunday Sep. 22, Monday Sep. 23, Tuesday Sep. 24, Wednesday Sep. 25. 8:15am – 8:30am Opening Remarks 8:30am – 10:00am 8:30am – 10:25am 8:30am – 9:30am Paper Session 7 Paper Session 4 Keynote
- 
  Organizing Committee
  Logo design by Barbara Pixton. The best poster award is kindly offered by High Throughput, an open access journal from MDPI. Organizing Committee General Chairs: Nurit Haspel, University of Massachusetts Boston, USA Student Travel Award Chairs: Lenore J. Cowen, Tufts University, USA May D. Wang, Georgia Institute of Technology and Emory University, USA Program Chairs: Anna Ritz, Reed College, USA Amarda Shehu, George Mason University, USA Zhaohui (Steve) Qin, Emory University, USA Tamer Kahveci, University of Florida, USA Ying Sha, Georgia Institute of Technology, USA Giuseppe Pozzi, Politecnico di Milano, Italy Women in Bioinformatics (WiB) Chair: Workshop Chairs: May D. Wang, Georgia Institute of Technology and Jianlin Cheng, University of Missouri, USA Emory University, USA Bhaskar DasGupta, University of Illinois at Chicago, USA Lydia Tapia, University of New Mexico, USA Proceedings Chairs: Xinghua Mindy Shi, University of North Carolina at Tutorial Chairs: Charlotte, USA Filip Jagodzinski, Western Washington University, USA Yang Shen, Texas A&M University, USA Dario Ghersi, University of Nebraska, USA Benjamin Hescott, Tufts University, USA Giuseppe Tradigo, University Magna Graecia of Catanzaro, Italy Web Admin: Jonathan Kho, Georgia Institute of Technology, USA Demo and Exhibit Chairs: Robert (Bob) Cottingham, Oak Ridge National Publicity Chairs Laboratory, USA A. Ercument Cicek, Bilkent University, Turkey Narayan Ganesan, Stevens Institute of Technology, USA Oznur Tastan, Bilkent University, Turkey Rolf Backofen, University of Freiburg, Germany Poster Chairs: Pierangelo Veltri, University Magna Graecia of Dong Si, University of Washington Bothell, USA Catanzaro, Italy A. Ercument Cicek, Bilkent University, USA Noah Daniels, University of Rhode Island, USA Steering Committee: Aidong Zhang, State University of New York at Registration Chair: Buffalo, USA, Co-Chair Preetam Ghosh, Virginia Commonwealth University, May D.
- 
  Computational Study of Transcriptional Regulation - from Sequence to Expression
  Computational Study of Transcriptional Regulation - From Sequence To Expression. Shan Zhong. CMU-CB-13-101. May 2013. School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213. Thesis Committee: Ziv Bar-Joseph (Chair), Roni Rosenfeld, Seyoung Kim, Takis Benos (University of Pittsburgh). Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy. Copyright © 2013 Shan Zhong. Keywords: Motif finding, Transcriptional regulatory network, p53, Protein binding microarray, Tissue specificity, EIN3, Ethylene response. To my parents, my wife, and my soon-to-be-born son. Abstract: Transcription is the process during which RNA molecules are synthesized based on the DNAs in cells. Transcription leads to gene expression, and it is the first step in the flow of genetic information from DNA to proteins that carry out biological functions. Transcription is tightly regulated both spatially and temporally at multiple levels, so that the amount of mRNAs produced for different genes is controlled across different kinds of cells and tissues, as well as in different developmental stages and in response to different environmental stimuli. In eukaryotes, transcription is a complicated process and its regulation involves both cis-regulatory elements and trans-acting factors. By studying spatiotemporally what genes are regulated by which cis-elements and trans-factors, we can get a better understanding of how we develop, how we react to environmental signals, and the mechanisms behind diseases like cancer that, at least in part, result from failures in proper transcriptional regulation.
  In this thesis, we present a suite of computational methods and analyses that, combined, provide a solution to problems related to the identification of DNA binding motifs, linking these motifs to the TFs that bind them and the genes that they control, and integrating these motifs and interactions with time series expression data to model dynamic regulatory networks.
- 
  The University of Chicago Interrogating the 3D Structure of Primate Genomes a Dissertation Submitted to the Faculty of the Divis
  THE UNIVERSITY OF CHICAGO. INTERROGATING THE 3D STRUCTURE OF PRIMATE GENOMES. A DISSERTATION SUBMITTED TO THE FACULTY OF THE DIVISION OF THE BIOLOGICAL SCIENCES AND THE PRITZKER SCHOOL OF MEDICINE IN CANDIDACY FOR THE DEGREE OF DOCTOR OF PHILOSOPHY. DEPARTMENT OF HUMAN GENETICS. BY ITTAI ETHAN ERES. CHICAGO, ILLINOIS, DECEMBER 2020. Copyright © 2020 by Ittai Ethan Eres. All Rights Reserved. Freely available under a CC-BY 4.0 International license. "If I am not for myself, who will be for me? But if I am only for myself, who am I? And if not now, when?" Rabbi Hillel. Table of Contents: LIST OF FIGURES; LIST OF TABLES; ACKNOWLEDGMENTS; ABSTRACT; 1 INTRODUCTION (1.1 The evolution of gene regulation; 1.2 Gene regulatory evolution insights from comparative primate genomics; 1.3 The growing importance of the 3D genome); 2 REORGANIZATION OF 3D GENOME STRUCTURE MAY CONTRIBUTE TO GENE REGULATORY EVOLUTION IN PRIMATES: 2.1 Abstract; 2.2 Introduction; 2.3 Results (2.3.1 Inter-species differences in 3D genomic interactions; 2.3.2 The relationship between inter-species differences in contacts and gene expression; 2.3.3 The chromatin and epigenetic context of inter-species differences in 3D genome structure); 2.4 Discussion (2.4.1 Contribution of variation in 3D genome structure to expression divergence; 2.4.2 Functional annotations); 2.5 Materials and methods (2.5.1 Ethics statement; 2.5.2 Induced pluripotent stem cells (iPSCs)
- 
  Analysis, Visualization, and Machine Learning of Epigenomic Data
  University of Massachusetts Medical School, eScholarship@UMMS. GSBS Dissertations and Theses, Graduate School of Biomedical Sciences. 2017-12-12. Analysis, Visualization, and Machine Learning of Epigenomic Data. Michael J. Purcaro, University of Massachusetts Medical School. Let us know how access to this document benefits you. Follow this and additional works at: https://escholarship.umassmed.edu/gsbs_diss Part of the Computational Biology Commons, Genomics Commons, and the Integrative Biology Commons. Repository Citation: Purcaro MJ. (2017). Analysis, Visualization, and Machine Learning of Epigenomic Data. GSBS Dissertations and Theses. https://doi.org/10.13028/M23T1Q. Retrieved from https://escholarship.umassmed.edu/gsbs_diss/938 Creative Commons License: This work is licensed under a Creative Commons Attribution-Noncommercial 4.0 License. This material is brought to you by eScholarship@UMMS. It has been accepted for inclusion in GSBS Dissertations and Theses by an authorized administrator of eScholarship@UMMS. For more information, please contact [email protected]. ANALYSIS, VISUALIZATION, AND MACHINE LEARNING OF EPIGENOMIC DATA. A Dissertation Presented By MICHAEL JOSEPH PURCARO. Submitted to the Faculty of the University of Massachusetts Graduate School of Biomedical Sciences, Worcester in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY. DECEMBER 12, 2017. BIOINFORMATICS AND COMPUTATIONAL BIOLOGY. M.D., PH.D. PROGRAM I-ii ANALYSIS, VISUALIZATION, AND MACHINE LEARNING OF EPIGENOMIC DATA A Dissertation Presented
- 
  Protein Structural Alignments from Sequence
  bioRxiv preprint doi: https://doi.org/10.1101/2020.11.03.365932; this version posted November 4, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license. Protein Structural Alignments From Sequence. James T. Morton (Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, NY, 10010), Charlie E. M. Strauss (Bioscience Division, Los Alamos National Laboratory, Los Alamos NM 87544), Robert Blackwell (Scientific Computing Core, Flatiron Institute, Simons Foundation, New York, NY, 10010), Daniel Berenberg (Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, NY, 10010), Vladimir Gligorijevic (Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, NY, 10010), Richard Bonneau (Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, NY, 10010). Abstract: Computing sequence similarity is a fundamental task in biology, with alignment forming the basis for the annotation of genes and genomes and providing the core data structures for evolutionary analysis. Standard approaches are a mainstay of modern molecular biology and rely on variations of edit distance to obtain explicit alignments between pairs of biological sequences. However, sequence alignment algorithms struggle with remote homology tasks and cannot identify similarities between many pairs of proteins with similar structures and likely homology. Recent work suggests that using machine learning language models can improve remote homology detection. To this end, we introduce DeepBLAST, which obtains explicit alignments from residue embeddings learned from a protein language model integrated into an end-to-end differentiable alignment framework.