Scalable Approaches for Auditing the Completeness of Biomedical Ontologies
Total Pages: 16
File Type: pdf, Size: 1020 KB
Recommended publications
D3.3 Initial Technical & Logical Architecture and Future Work
D3.3 Initial Technical & Logical Architecture and future work recommendations. ECP 2008 DILI 558001, Europeana v1.0. Deliverable number: D3.3. Dissemination level: Public. Delivery date: 30 July 2010. Status: Final. Author(s): Makx Dekkers, Stefan Gradmann, Jan Molendijk. eContentplus: This project is funded under the eContentplus programme (OJ L 79, 24.3.2005, p. 1), a multiannual Community programme to make digital content in Europe more accessible, usable and exploitable. 1. Introduction. This deliverable has two tasks: to characterise the technical and logical architecture of Europeana as a system in its 1.0 state (that is to say, by the time of the ‘Rhine’ release), and to outline the future work recommendations that can reasonably be made at that moment. This also provides a straightforward and logical structure to the document: characterisation comes first, followed by the recommendations for future work. 2. Technical and Logical Architecture. From a high-level architectural point of view, Europeana.eu is best characterized as a search engine and a database. It loads metadata delivered by providers and aggregators into a database, and uses that database to allow users to search for cultural heritage objects, and to find links to those objects. Various methods of searching and browsing the objects are offered, including a simple and an advanced search form, a timeline, and an openSearch API. It is also important to describe what Europeana.eu does not do, even though people sometimes expect it to.
A Knowledge Discovery Approach
Semantic XML Tagging of Domain-Specific Text Archives: A Knowledge Discovery Approach. Dissertation for the award of the academic degree Doktoringenieur (Dr.-Ing.), accepted by the Faculty of Computer Science of Otto von Guericke University Magdeburg, by Diplom-Kaufmann Peter Karsten Winkler, born 1 October 1971 in Berlin. Reviewers: Prof. Dr. Myra Spiliopoulou, Prof. Dr. Gunter Saake, Prof. Dr. Stefan Conrad. Place and date of the doctoral colloquium: Magdeburg, 22 January 2009. Karsten Winkler. Semantic XML Tagging of Domain-Specific Text Archives: A Knowledge Discovery Approach. Dissertation, Faculty of Computer Science, Otto von Guericke University Magdeburg, Magdeburg, Germany, January 2009. Contents: List of Figures; List of Tables; List of Algorithms; Abstract; Zusammenfassung (Summary); Acknowledgments; 1 Introduction; 1.1 The Abundance of Text; 1.2 Defining Semantic XML Markup; 1.3 Benefits of Semantic XML Markup; 1.4 Research Questions; 1.5 Research Methodology; 1.6 Outline; 2 Literature Review; 2.1 Storage, Retrieval, and Analysis of Textual Data; 2.1.1 Knowledge Discovery in Textual Databases; 2.1.2 Information Storage and Retrieval; 2.1.3 Information Extraction; 2.2 Discovering Concepts in Textual Data; 2.2.1 Topic Discovery in Text Documents; 2.2.2 Extracting Relational Tuples from Text; 2.2.3 Learning Taxonomies, Thesauri, and Ontologies; 2.3 Semantic Annotation of Text Documents; 2.3.1 Manual Semantic Text Annotation; 2.3.2 Semi-Automated Semantic Text Annotation; 2.3.3 Automated Semantic Text Annotation; 2.4 Schema Discovery in Marked-Up Text Documents
ProQuest Dissertations
Automated learning of protein involvement in pathogenesis using integrated queries. Eithon Cadag. A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy, University of Washington, 2009. Program Authorized to Offer Degree: Department of Medical Education and Biomedical Informatics. UMI Number: 3394276. All rights reserved. INFORMATION TO ALL USERS: The quality of this reproduction is dependent upon the quality of the copy submitted. In the unlikely event that the author did not send a complete manuscript and there are missing pages, these will be noted. Also, if material had to be removed, a note will indicate the deletion. UMI Dissertation Publishing. Copyright 2010 by ProQuest LLC. All rights reserved. This edition of the work is protected against unauthorized copying under Title 17, United States Code. ProQuest LLC, 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106-1346. University of Washington Graduate School. This is to certify that I have examined this copy of a doctoral dissertation by Eithon Cadag and have found that it is complete and satisfactory in all respects, and that any and all revisions required by the final examining committee have been made. Chair of the Supervisory Committee: Reading Committee: Peter Tarczy-Hornoch. In presenting this dissertation in partial fulfillment of the requirements for the doctoral degree at the University of Washington, I agree that the Library shall make its copies freely available for inspection. I further agree that extensive copying of this dissertation is allowable only for scholarly purposes, consistent with "fair use" as prescribed in the U.S.
Molecular Biology for Computer Scientists
CHAPTER 1. Molecular Biology for Computer Scientists. Lawrence Hunter. “Computers are to biology what mathematics is to physics.” (Harold Morowitz) One of the major challenges for computer scientists who wish to work in the domain of molecular biology is becoming conversant with the daunting intricacies of existing biological knowledge and its extensive technical vocabulary. Questions about the origin, function, and structure of living systems have been pursued by nearly all cultures throughout history, and the work of the last two generations has been particularly fruitful. The knowledge of living systems resulting from this research is far too detailed and complex for any one human to comprehend. An entire scientific career can be based in the study of a single biomolecule. Nevertheless, in the following pages, I attempt to provide enough background for a computer scientist to understand much of the biology discussed in this book. This chapter provides the briefest of overviews; I can only begin to convey the depth, variety, complexity and stunning beauty of the universe of living things. Much of what follows is not about molecular biology per se. In order to explain what the molecules are doing, it is often necessary to use concepts involving, for example, cells, embryological development, or evolution. Biology is frustratingly holistic. Events at one level can affect and be affected by events at very different levels of scale or time. Digesting a survey of the basic background material is a prerequisite for understanding the significance of the molecular biology that is described elsewhere in the book.
Bibliography [Abiteboul and Kanellakis, 1989] Serge Abiteboul and Paris Kanellakis
Bibliography. [Abiteboul and Kanellakis, 1989] Serge Abiteboul and Paris Kanellakis. Object identity as a query language primitive. In Proc. of the ACM SIGMOD Int. Conf. on Management of Data, pages 159–173, 1989. [Abiteboul et al., 1995] Serge Abiteboul, Richard Hull, and Victor Vianu. Foundations of Databases. Addison Wesley Publ. Co., Reading, Massachusetts, 1995. [Abiteboul et al., 1997] Serge Abiteboul, Dallan Quass, Jason McHugh, Jennifer Widom, and Janet L. Wiener. The Lorel query language for semistructured data. Int. J. on Digital Libraries, 1(1):68–88, 1997. [Abiteboul et al., 2000] Serge Abiteboul, Peter Buneman, and Dan Suciu. Data on the Web: from Relations to Semistructured Data and XML. Morgan Kaufmann, Los Altos, 2000. [Abiteboul, 1997] Serge Abiteboul. Querying semi-structured data. In Proc. of the 6th Int. Conf. on Database Theory (ICDT’97), pages 1–18, 1997. [Abrahams et al., 1996] Merryll K. Abrahams, Deborah L. McGuinness, Rich Thomason, Lori Alperin Resnick, Peter F. Patel-Schneider, Violetta Cavalli-Sforza, and Cristina Conati. NeoClassic tutorial: Version 1.0. Technical report, Artificial Intelligence Principles Research Department, AT&T Labs Research and University of Pittsburgh, 1996. Available as http://www.bell-labs.com/project/classic/papers/NeoTut/NeoTut.html. [Abrett and Burstein, 1987] Glen Abrett and Mark H. Burstein. The KREME knowledge editing environment. Int. J. of Man-Machine Studies, 27(2):103–126, 1987. [Abrial, 1974] J. R. Abrial. Data semantics. In J. W. Klimbie and K. L. Koffeman, editors, Data Base Management, pages 1–59. North-Holland Publ. Co., Amsterdam, 1974. [Achilles et al., 1991] E. Achilles, B.
Ontology-Based Methods for Analyzing Life Science Data
Habilitation à Diriger des Recherches (accreditation to supervise research) presented by Olivier Dameron: Ontology-based methods for analyzing life science data. Publicly defended on 11 January 2016 before a jury composed of: Anita Burgun, Professor, Université René Descartes Paris (Examiner); Marie-Dominique Devignes, CNRS Research Scientist, LORIA Nancy (Examiner); Michel Dumontier, Associate Professor, Stanford University, USA (Reviewer); Christine Froidevaux, Professor, Université Paris Sud (Reviewer); Fabien Gandon, Research Director, Inria Sophia-Antipolis (Reviewer); Anne Siegel, CNRS Research Director, IRISA Rennes (Examiner); Alexandre Termier, Professor, Université de Rennes 1 (Examiner). Contents: 1 Introduction; 1.1 Context; 1.2 Challenges; 1.3 Summary of the contributions; 1.4 Organization of the manuscript; 2 Reasoning based on hierarchies; 2.1 Principle; 2.1.1 RDF for describing data; 2.1.2 RDFS for describing types; 2.1.3 RDFS entailments; 2.1.4 Typical uses of RDFS entailments in life science; 2.1.5 Synthesis; 2.2 Case study: integrating diseases and pathways; 2.2.1 Context; 2.2.2 Objective; 2.2.3 Linking pathways and diseases using GO, KO and SNOMED-CT; 2.2.4 Querying associated diseases and pathways; 2.3 Methodology: Web services composition; 2.3.1 Context; 2.3.2 Objective; 2.3.3 Semantic compatibility of services parameters; 2.3.4 Algorithm for pairing services parameters; 2.4 Application: ontology-based query expansion with GO2PUB; 2.4.1 Context; 2.4.2 Objective
Description Logics
Description Logics. Franz Baader (Institut für Theoretische Informatik, TU Dresden, Germany, [email protected]), Ian Horrocks and Ulrike Sattler (Department of Computer Science, University of Manchester, UK, {horrocks,sattler}@cs.man.ac.uk). Summary. In this chapter, we explain what description logics are and why they make good ontology languages. In particular, we introduce the description logic SHIQ, which has formed the basis of several well-known ontology languages, including OWL. We argue that, without the last decade of basic research in description logics, this family of knowledge representation languages could not have played such an important rôle in this context. Description logic reasoning can be used both during the design phase, in order to improve the quality of ontologies, and in the deployment phase, in order to exploit the rich structure of ontologies and ontology based information. We discuss the extensions to SHIQ that are required for languages such as OWL and, finally, we sketch how novel reasoning services can support building DL knowledge bases. 1 Introduction. The aim of this section is to give a brief introduction to description logics, and to argue why they are well-suited as ontology languages. In the remainder of the chapter we will put some flesh on this skeleton by providing more technical details with respect to the theory of description logics, and their relationship to state of the art ontology languages. More detail on these and other matters related to description logics can be found in [6].
Ontologies. There have been many attempts to define what constitutes an ontology, perhaps the best known (at least amongst computer scientists) being due to Gruber: “an ontology is an explicit specification of a conceptualisation” [47] (this was later elaborated to “a formal specification of a shared conceptualisation” [21]). In this context, a conceptualisation means an abstract model of some aspect of the world, taking the form of a definition of the properties of important
Will the Semantic Web Scale?
Panel Discussion P3: Will the Semantic Web scale? New York Sheraton, May 19th, 2004
Proposers: Raphael Volz, Carole Goble, Rudi Studer
Panelists: Dr. Cathy Marshall, Prof. Dr. Alon Y. Halevy, Prof. Dr. Jürgen Angele, Prof. Dr. Ian Horrocks
Organizers: Raphael Volz, Daniel Oberle
Panelist 1: Dr. Cathy Marshall, Microsoft Corporation, Texas A&M University
Why the Semantic Web won’t scale: the scaled semantic web seen as mass-market product
“the Flowbee uses the suction power of your household vacuum to draw the hair up to the desired length, and then gives it a perfect cut.....every time.”
Three important questions:
• Will it really work?
• Who needs it?
• Is it safe?
will it work? evaluating the semantic web as metadata
• compare the semantic web to a widely adopted metadata scheme like the MARC record used for library cataloging
– MARC practitioners are members of a community and are trained to create metadata
– MARC reduces interpretive load by careful choice of attributes, authority lists, & cataloging rules (AACR, e.g.) to constrain values
– MARC records are controlled for interoperability and consistency in various ways (e.g. by clearinghouses like OCLC)
– so... on-line catalog (OPAC) users know what to expect
• by contrast, the semantic web is subject to the following pitfalls as it scales:
– social structures for creating universal semantic web metadata are missing (local culture/practices/needs prevail)
– semantic web metadata requires substantial interpretation of domain knowledge; underlying assumptions about use are highly situated
– no way of ensuring interoperability, consistency, accuracy
• e.g.
3Rd NASA/IEEE Workshop on Formal Approaches to Agent-Based Systems
Proceedings, 3rd NASA/IEEE Workshop on Formal Approaches to Agent-Based Systems. Springer-Verlag LNCS vol. 3228. Edited by Michael G. Hinchey, James L. Rash, Walter F. Truszkowski and Christopher A. Rouff. Please note: This is a complete draft of the text; the publisher will add the correct page numbers, etc. The current formatting, although not very neat, is the way the publisher requires it from us. Preface: Greenbelt, MD, October 2004. Organizing Committee: Mike Hinchey, NASA Goddard Space Flight Center; Jim Rash, NASA Goddard Space Flight Center; Walt Truszkowski, NASA Goddard Space Flight Center; Chris Rouff, SAIC. Contents: Ecology Based Decentralized Agent Management System (Maxim D. Peysakhov, Vincent A. Cicirello and William C. Regli); From Abstract to Concrete Norms in Agent Institutions (Davide Grossi and Frank Dignum); Meeting the Deadline: Why, When and How (Frank Dignum, Jan Broersen, Virginia Dignum and John-Jules Meyer); Multi-Agent Systems Reliability, Fuzziness, and Deterrence (Michel Rudnianski and Hélène Bestougeff); Formalism Challenges of the Cougaar Model Driven Architecture (Shawn A. Bohner, Boby George, Denis Gracanin and Michael G. Hinchey); Facilitating the Specification Capture and Transformation Process in the Development of Multi-Agent Systems (Aluizio Haendchen Filho, Nuno Caminada, Edward Hermann Haeusler and Arndt von Staa); Using Ontologies to Formalize Services Specifications in Multi-Agent Systems (Karin Koogan Breitman, Aluizio Haendchen Filho, Edward Hermann Haeusler and Arndt von Staa); Two Formal Gas Models For Multi-Agent Sweeping and Obstacle Avoidance (Wesley Kerr, Diana Spears, William Spears and David Thayer); A Formal Analysis of Potential Energy in a Multi-agent System (William M. Spears, Diana F.
Toward an Open Knowledge Research Graph
THE SERIALS LIBRARIAN. https://doi.org/10.1080/0361526X.2019.1540272. Toward an Open Knowledge Research Graph. Sören Auer (Presenter) and Sanjeet Mann (Recorder). ABSTRACT: Knowledge graphs facilitate the discovery of information by organizing it into entities and describing the relationships of those entities to each other and to established ontologies. They are popular with search and e-commerce companies and could address the biggest problems in scientific communication, according to Sören Auer of the Technische Informationsbibliothek and Leibniz University of Hannover. In his NASIG vision session, Auer introduced attendees to knowledge graphs and explained how they could make scientific research more discoverable, efficient, and collaborative. Challenges include incentivizing researchers to participate and creating the training data needed to automate the generation of knowledge graphs in all fields of research. KEYWORDS: Knowledge graph; scholarly communication; Semantic Web; linked data; scientific research; machine learning. Change in the digital world: Thank you to Violeta Ilik and the NASIG Program Planning Committee for inviting me to this conference. I would like to show you where I come from. Leibniz University of Hannover has a castle that belonged to a prince, and next to the castle is the Technische Informationsbibliothek (TIB), responsible for supporting the scientific and technology community in Germany with publications, access, licenses, and digital information services. Figure 1 is an example of a knowledge graph about TIB. The basic ingredients of a knowledge graph are entities and relationships. We are the library of Leibniz University of Hannover and we are a member of Leibniz Association (a German research association).
The International Society for Computational Biology 10th Anniversary Lawrence Hunter, Russ B
Message from ISCB. The International Society for Computational Biology 10th Anniversary. Lawrence Hunter, Russ B. Altman, Philip E. Bourne*
Introduction. PLoS Computational Biology is the official journal of the International Society for Computational Biology (ISCB), a partnership that was formed during the Journal’s conception in 2005. With ISCB being the only international body representing computational biologists, it made perfect sense for PLoS Computational Biology to be closely affiliated. The Society had to take more of a chance than similar societies, choosing to step away from an existing financially beneficial subscription journal to align with an open access publication as a
training programs in bioinformatics or computational biology. By 1996, however, high throughput molecular biology and the attendant need for informatics had clearly arrived. In addition to the founding of ISCB, 1996 saw the sequencing of the first genome of a free-living organism (yeast) and Affymetrix’s release of its first commercial DNA chip. The hot topics in bioinformatics circa 1996, at least as reflected by the conference on Intelligent Systems for Molecular Biology (ISMB) at Washington University in St. Louis, included issues which have largely been solved (such as gene finding or
significant part of the original motivation for founding the Society was to provide a stable financial home for the ISMB conference. The first few conferences were sufficiently successful to create a financial nest egg that was used each year to start the process of planning and executing the next year’s meeting. Initially, relatively modest checks were cut and sent from one organizer to another informally. As the size of these checks increased, the organizers (and their home institutions) became increasingly uncomfortable exchanging them informally, and decided that they needed an organization: thus, ISCB.
Biocuration 2016 - Posters
Biocuration 2016 - Posters. Source: http://www.sib.swiss/events/biocuration2016/posters. 1. RAM: A standards-based database for extracting and analyzing disease-specified concepts from the multitude of biomedical resources. Jinmeng Jia and Tieliu Shi. Each year, millions of people around the world suffer from the consequences of misdiagnosis and ineffective treatment of various diseases, especially intractable and rare diseases. Integrating the various data related to human diseases helps us not only to identify drug targets, connect genetic variations with phenotypes and understand molecular pathways relevant to novel treatments, but also to couple clinical care with biomedical research. To this end, we built the Rare disease Annotation & Medicine (RAM) standards-based database, which provides a reference for mapping and extracting disease-specified information from a multitude of biomedical resources such as free-text articles in MEDLINE and Electronic Medical Records (EMRs). RAM integrates disease-specified concepts from ICD-9, ICD-10, SNOMED-CT and MeSH (http://www.nlm.nih.gov/mesh/MBrowser.html), extracted from the Unified Medical Language System (UMLS) based on the UMLS Concept Unique Identifiers for each disease term. We also integrated phenotypes from OMIM for each disease term, which link underlying mechanisms and clinical observations. Moreover, we used disease-manifestation (D-M) pairs from existing biomedical ontologies as prior knowledge to automatically recognize D-M-specific syntactic patterns from full-text articles in MEDLINE. Considering that most of the record-based disease information in public databases is in textual format, we extracted disease terms and their related biomedical descriptive phrases from Online Mendelian Inheritance in Man (OMIM), the National Organization for Rare Disorders (NORD) and Orphanet using the UMLS Thesaurus.