Transforming Databases with Recursive Data Structures

TRANSFORMING DATABASES WITH RECURSIVE DATA STRUCTURES

Anthony Kosky

A DISSERTATION in COMPUTER AND INFORMATION SCIENCE

Presented to the Faculties of the University of Pennsylvania in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy. 1996

Susan Davidson, Supervisor of Dissertation
Peter Buneman, Supervisor of Dissertation
Peter Buneman, Graduate Group Chairperson

© Copyright 2003 by Anthony Kosky

To my parents.

WARRANTY

Congratulations on your acquisition of this dissertation. In acquiring it you have shown yourself to be a computer scientist of exceptionally good taste with a true appreciation for quality. Each proof, algorithm or definition in this dissertation has been carefully checked by hand to ensure correctness and reliability. Each word and formula has been meticulously crafted using only the highest quality symbols and characters. The colours of inks and paper have been carefully chosen and matched to maximize contrast and readability.

The author is confident that this dissertation will provide years of reliable and trouble free service, and offers the following warranty for the lifetime of the original owner: If at any time a proof or algorithm should be found to be defective or contain bugs, simply return your dissertation to the author and it will be repaired or replaced (at the author’s choice) free of charge. Please note that this warranty does not cover damage done to the dissertation through normal wear-and-tear, natural disasters or being chewed by family pets. This warranty is void if the dissertation is altered or annotated in any way.

Concepts described in this dissertation may be new and complicated. The author accepts no liability for any confusion or damage incurred during the reading and contemplation of the dissertation. Children under the age of five should not attempt to read this dissertation without proper adult supervision.
Comments, suggestions and personal abuse are all welcome and should be sent to the author via electronic mail.

ACKNOWLEDGMENTS

This dissertation marks the end of six years which I spent engaged in studies and research at the Department of Computer and Information Science of the University of Pennsylvania. Though only a part of that time was spent directly on the work described in this dissertation, it nevertheless reflects many influences, both from my time at Penn and from my studies prior to that in England. There are many people to thank, both for their direct contributions to this work, and also for their roles in developing my understanding and appreciation of theoretical computer science, databases, programming languages, and many other subjects of relevance.

Firstly I would like to thank my advisors, Peter Buneman and Susan Davidson, for their help, suggestions, support, advice and encouragement, and for introducing me to the subject of databases. Peter was also responsible for giving me the opportunity to enroll in a PhD program at Penn. I would like to thank my committee members, Tim Griffin, Victor Markowitz, Carl Gunter, Val Tannen and Chris Overton, for their comments and advice.

This work has also been influenced greatly by the discussions of the “Tuesday afternoon group” including Leonid Libkin, Limsoon Wong, Dan Suciu, Rona Machlin, Wenfei Fan and Kyle Hart. I would especially like to thank Leonid for his many helpful comments and advice, and for his thorough reading of the proposal for this dissertation. Barbara Eckman and Carmem Hara did much of the work on the trials of the prototype transformation system described in part IV. Barbara also helped to explain the Molecular Biology Databases and the database problems that inspired much of this work. I am also grateful to Catriel Beeri, Jan Van den Bussche and Serge Abiteboul for their comments on my other papers related to this work.

Edward T. Bear gave consistent support and encouragement, and helped with some of the more technically difficult proofs in this dissertation.

One of the most enjoyable aspects of my research at Penn was the collaboration with members of the computational biology group, not only because it gave me an opportunity to look at some practical applications for my work, but also because it gave me a chance to learn a little about the fascinating subjects of molecular biology and genetics. I would like to thank Chris Overton and David Searls for sharing their enthusiasm for these subjects, and for their many impromptu biology lessons.

There are also many people who have contributed to my development first as a mathematician and then as a computer scientist. I would like to thank the lecturers of the Department of Mathematics at the University of Kent at Canterbury, in particular John Earl, who helped me to develop an appreciation for the beauty of pure mathematics. My introduction to computer science came when I did a Master's degree at the Department of Computing at Imperial College of Science and Technology. In particular I was introduced to the subjects of formal methods and functional programming by the lectures of Samson Abramsky, Mike Smyth, Steve Vickers, Pete Harrison, Chris Hankin and others. Samson Abramsky also supervised my master's thesis and recommended me as a possible PhD student at the University of Pennsylvania, for which I am especially grateful. My knowledge and appreciation of theoretical computer science has been extended further while at Penn, through the lectures of Val Tannen, Carl Gunter, Scott Weinstein, Peter Freyd and others.

Many of the staff at the University of Pennsylvania have helped me in dealing with bureaucracy and various administrative details. I would particularly like to thank Mike Felker, whose help allowed me to finish off and co-ordinate this PhD while working in California.
I would also like to thank Karen Carter, Nan Blitz, Susan Deysher, Elaine Benedetto and Jackie Caliman, and members of the computing staff Mark Foster, Mark-Jason Dominus and Alex Garthwaite.

There are also many people who helped to make my time at Penn enjoyable, and helped me to maintain a semblance of sanity. I would like to thank the Old Quaker Computer Scientists for some very bizarre and amusing times, the Penn Magic play-testers, the Saturday-morning Reading Terminal crowd, and all at Bicycle Therapy for keeping my bikes running nicely.

Finally, but most importantly of all, I would like to thank my parents and my family. Their love, support and encouragement have been a constant comfort to me in spite of the long distances between us, and I could not have achieved any of this without them.

ABSTRACT

TRANSFORMING DATABASES WITH RECURSIVE DATA STRUCTURES

Anthony Kosky

Advisors: Susan Davidson and Peter Buneman.

This thesis examines the problems of performing structural transformations on databases involving complex data-structures and object-identities, and proposes an approach to specifying and implementing such transformations. We start by looking at various applications of such database transformations, and at some of the more significant work in these areas. In particular we will look at work on transformations in the area of database integration, which has been one of the major motivating areas for this work. We will also look at various notions of correctness that have been proposed for database transformations, and show that the utility of such notions is limited by the dependence of transformations on certain implicit database constraints. We draw attention to the limitations of existing work on transformations, and argue that there is a need for a more general formalism for reasoning about database transformations and constraints.
We will also argue that, in order to ensure that database transformations are well-defined and meaningful, it is necessary to understand the information capacity of the data-models being transformed. To this end we give a thorough analysis of the information capacity of data-models supporting object identity, and will show that this is dependent on the operations supported by a query language for comparing object identities. We introduce a declarative language, WOL, based on Horn-clause logic, for specifying database transformations and constraints. We also propose a method of implementing transformations specified in this language, by manipulating their clauses into a normal form which can then be translated into an underlying database programming language. Finally we will present a number of optimizations and techniques necessary in order to build a practical implementation based on these proposals, and will discuss the results of some of the trials that were carried out using a prototype of such a system.

Contents

Acknowledgements
Abstract
Foreword
    1.1 A Roadmap
    1.2 Some Comments on the Mathematical Approach and Assumptions

I Database Transformations
2 Introduction
    2.1 Methods of Implementing Database Transformation
3 Transformations in Database Integration
    3.1 Database Integration: An Example
    3.2 Resolving Structural Conflicts in Database Integration
    3.3 Schema Integration Techniques
    3.4 Merging Data
4 Data Models for Database Transformations
5 Information Dominance in Transformations
    5.1 Hull’s Hierarchy of Information Dominance Measures
    5.2 Information Capacity and Constraints

II Observable Properties of Models for Recursive Data-Structures
6 Introduction
7 A Data-Model with Object Identities and Extents
    7.1 Types and Schemas
    7.2 Database Instances
8 A Query Language Based on Structural Recursion
    8.1 Queries and the Language SRI(=)
    8.2 Indistinguishable Instances in SRI(=)
9 Bisimulation and Observational Equivalence without Equality
    9.1 Bisimulation and Correspondence Relations
    9.2 Distinguishing Instances without Equality on Identities
10 Observable Properties of Object Identities with Keys
    10.1 A Data-Model with Keys