Adding Regular Expressions to Graph Reachability and Pattern Queries
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Fast Data Analytics by Learning
Fast Data Analytics by Learning by Yongjoo Park A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy (Computer Science and Engineering) in The University of Michigan 2017 Doctoral Committee: Associate Professor Michael Cafarella, Co-Chair Assistant Professor Barzan Mozafari, Co-Chair Associate Professor Eytan Adar Professor H. V. Jagadish Associate Professor Carl Lagoze Yongjoo Park [email protected] ORCID iD: 0000-0003-3786-6214 c Yongjoo Park 2017 To my mother ii Acknowledgements I thank my co-advisors, Michael Cafarella and Barzan Mozafari. I am deeply grateful that I could start my graduate studies with Mike Cafarella. Mike was the one of the nicest persons I have met (I can say, throughout my life). He also deeply cared of my personal life as well as my graduate study. He always encouraged and supported me so I could make the best decisions for my life. He was certainly more than a simple academic advisor. Mike also has excellent talents in presentation. His descriptions (on any subject) are well-organized and straightforward. Sometimes, I find myself trying to imitate his presentation styles when I give my own presentations, which, I believe, is natural since I have been advised by him for more than five years now. I started work with Barzan after a few years since I came to Michigan. He is excellent in writing research papers and presenting work in the most interesting, formal, and concise way. He devoted tremendous amount of time for me. I learned great amount of writing and presentation skills from him. -
O. Peter Buneman Curriculum Vitæ – Jan 2008
O. Peter Buneman Curriculum Vitæ { Jan 2008 Work Address: LFCS, School of Informatics University of Edinburgh Crichton Street Edinburgh EH8 9LE Scotland Tel: +44 131 650 5133 Fax: +44 667 7209 Email: [email protected] or [email protected] Home Address: 14 Gayfield Square Edinburgh EH1 3NX Tel: 0131 557 0965 Academic record 2004-present Research Director, Digital Curation Centre 2002-present Adjunct Professor of Computer Science, Department of Computer and Information Science, University of Pennsylvania. 2002-present Professor of Database Systems, Laboratory for the Foundations of Computer Science, School of Informatics, University of Edinburgh 1990-2001 Professor of Computer Science, Department of Computer and Information Science, University of Pennsylvania. 1981-1989 Associate Professor of Computer Science, Department of Computer and Information Science, University of Pennsylvania. 1981-1987 Graduate Chairman, Department of Computer and Information Science, University of Pennsyl- vania 1975-1981 Assistant Professor of Computer Science at the Moore School and Assistant Professor of Decision Sciences at the Wharton School, University of Pennsylvania. 1969-1974 Research Associate and subsequently Lecturer in the School of Artificial Intelligence, Edinburgh University. Education 1970 PhD in Mathematics, University of Warwick (Supervisor E.C. Zeeman) 1966 MA in Mathematics, Cambridge. 1963-1966 Major Scholar in Mathematics at Gonville and Caius College, Cambridge. Awards, Visting positions Distinguished Visitor, University of Auckland, 2006 Trustee, VLDB Endowment, 2004 Fellow of the Royal Society of Edinburgh, 2004 Royal Society Wolfson Merit Award, 2002 ACM Fellow, 1999. 1 Visiting Resarcher, INRIA, Jan 1999 Visiting Research Fellowship sponsored by the Japan Society for the Promotion of Science, Research Institute for the Mathematical Sciences, Kyoto University, Spring 1997. -
RSE Fellows Ordered by Area of Expertise As at 11/10/2016
RSE Fellows ordered by Area of Expertise as at 11/10/2016 HRH Prince Charles The Prince of Wales KG KT GCB Hon FRSE HRH The Duke of Edinburgh KG KT OM, GBE Hon FRSE HRH The Princess Royal KG KT GCVO, HonFRSE A1 Biomedical and Cognitive Sciences 2014 Professor Judith Elizabeth Allen FRSE, FMedSci, Professor of Immunobiology, University of Manchester. 1998 Dr Ferenc Andras Antoni FRSE, Honorary Fellow, Centre for Integrative Physiology, University of Edinburgh. 1993 Sir John Peebles Arbuthnott MRIA, PPRSE, FMedSci, Former Principal and Vice-Chancellor, University of Strathclyde. Member, Food Standards Agency, Scotland; Chair, NHS Greater Glasgow and Clyde. 2010 Professor Andrew Howard Baker FRSE, FMedSci, BHF Professor of Translational Cardiovascular Sciences, University of Glasgow. 1986 Professor Joseph Cyril Barbenel FRSE, Former Professor, Department of Electronic and Electrical Engineering, University of Strathclyde. 2013 Professor Michael Peter Barrett FRSE, Professor of Biochemical Parasitology, University of Glasgow. 2005 Professor Dame Sue Black DBE, FRSE, Director, Centre for Anatomy and Human Identification, University of Dundee. ; Director, Centre for Anatomy and Human Identification, University of Dundee. 2007 Professor Nuala Ann Booth FRSE, Former Emeritus Professor of Molecular Haemostasis and Thrombosis, University of Aberdeen. 2001 Professor Peter Boyle CorrFRSE, FMedSci, Former Director, International Agency for Research on Cancer, Lyon. 1991 Professor Sir Alasdair Muir Breckenridge CBE KB FRSE, FMedSci, Emeritus Professor of Clinical Pharmacology, University of Liverpool. 2007 Professor Peter James Brophy FRSE, FMedSci, Professor of Anatomy, University of Edinburgh. Director, Centre for Neuroregeneration, University of Edinburgh. 2013 Professor Gordon Douglas Brown FRSE, FMedSci, Professor of Immunology, University of Aberdeen. 2012 Professor Verity Joy Brown FRSE, Provost of St Leonard's College, University of St Andrews. -
WWW2007 Program Guide
AOL_AD_WWW2007-Open_FIXED.pdf 4/13/2007 3:13:25 PM C M Y CM MY CY CMY K www2007.org Message from the Chair of IW3C2 On behalf of the International World Wide Web Conference Steering Committee (IW3C2), I welcome you to the 16th conference of our series. As you already know, the World Wide Web was first conceived in 1989 by Tim Berners-Lee at CERN in Geneva, Switzerland. The first conference, WWW1, was held at CERN in 1994. The conference series aims to provide the world a premier forum about the evolution of the Web, the standardization of its associated technologies, and the Web’s impact on society and culture. These conferences bring together researchers, developers, users, and vendors – indeed all of you who are passionate about the Web and what it has to offer, now and in the future. The IW3C2 was founded by Joseph Hardin and Robert Cailliau in August 1994 and has been responsible for the International WWW Conference series ever since. Except for 1994 and 1995 when two conferences were held each year, WWWn became an annual event held in late April or early May. The location of the conferences rotates among North America, Europe, and Asia. In 2002 we changed the conference designator from a number (1 through 10) to the year it is held; i.e., WWW11 became known as WWW2002, and so on. You may browse our website http://www. iw3c2.org/ for information on past and future conferences. I wish to recognize our long-term partner, the World Wide Web Consortium (W3C), who has contributed a significant part of the conferences’ contents since the very beginning. -
Data Quality: from Theory to Practice
Data Quality: From Theory to Practice Wenfei Fan School of Informatics, University of Edinburgh, and RCBD, Beihang University [email protected] ABSTRACT “more than 25% of critical data in the world’s top compa- Data quantity and data quality, like two sides of a coin, are nies is flawed” [53], and “pieces of information perceived equally important to data management. This paper provides as being needed for clinical decisions were missing from an overview of recent advances in the study of data quality, 13.6% to 81% of the time” [76]. It is also estimated that from theory to practice. We also address challenges intro- “2% of records in a customer file become obsolete in one duced by big data to data quality management. month” [31] and hence, in a customer database, 50% of its records may be obsolete and inaccurate within two years. 1. INTRODUCTION Dirty data is costly. Statistics shows that “bad data or poor When we talk about big data, we typically emphasize the data quality costs US businesses $600 billion annually” [31], quantity (volume) of the data. We often focus on techniques “poor data can cost businesses 20%-35% of their operating that allow us to efficiently store, manage and query the data. revenue” [92], and that “poor data across businesses and the For example, there has been a host of work on developing government costs the US economy $3.1 trillion a year” [92]. scalable algorithms that, given a query Q and a dataset D, Worse still, when it comes to big data, the scale of the data compute query answers Q(D) when D is big. -
Download the Trustees' Report and Financial Statements 2018-2019
Science is Global Trustees’ report and financial statements for the year ended 31 March 2019 The Royal Society’s fundamental purpose, reflected in its founding Charters of the 1660s, is to recognise, promote, and support excellence in science and to encourage the development and use of science for the benefit of humanity. The Society is a self-governing Fellowship of distinguished scientists drawn from all areas of science, technology, engineering, mathematics and medicine. The Society has played a part in some of the most fundamental, significant, and life-changing discoveries in scientific history and Royal Society scientists – our Fellows and those people we fund – continue to make outstanding contributions to science and help to shape the world we live in. Discover more online at: royalsociety.org BELGIUM AUSTRIA 3 1 NETHERLANDS GERMANY 5 12 CZECH REPUBLIC SWITZERLAND 3 2 CANADA POLAND 8 1 Charity Case study: Africa As a registered charity, the Royal Society Professor Cheikh Bécaye Gaye FRANCE undertakes a range of activities that from Cheikh Anta Diop University 25 provide public benefit either directly or in Senegal, Professor Daniel Olago from the University of indirectly. These include providing financial SPAIN UNITED STATES Nairobi in Kenya, Dr Michael OF AMERICA 18 support for scientists at various stages Owor from Makerere University of their careers, funding programmes 33 in Uganda and Professor Richard that advance understanding of our world, Taylor from University College organising scientific conferences to foster London are working on ways to discussion and collaboration, and publishing sustain low-cost, urban water supply and sanitation systems scientific journals. -
Duplication in Biological Databases Further Limit the Development of the Related Duplicate Detection Methods
School of Computing and Information Systems The University of Melbourne DUPLICATIONINBIOLOGICAL DATABASES:DEFINITIONS, IMPACTSANDMETHODS Qingyu Chen ORCID ID: 0000-0002-6036-1516 Supervisors: Prof. Justin Zobel Prof. Karin Verspoor Submitted in total fulfilment of the requirements of the degree of Doctor of Philosophy Produced on archival quality paper August 2017 ABSTRACT Duplication is a pressing issue in biological databases. This thesis concerns duplication, in terms of its definitions (what records are duplicates), impacts (why duplicates are significant) and solutions (how to address duplication). The volume of biological databases is growing at an unprecedented rate, populated by complex records drawn from heterogeneous sources; the huge data volume and the diverse types cause concern for the underlying data quality. A specific challenge is dupli- cation, that is, the presence of redundant or inconsistent records. While existing studies concern duplicates, the definitions of duplicates are not clear; the foundational under- standing of what records are considered as duplicates by database stakeholders is lacking. The impacts of duplication are not clear either; existing studies have different or even inconsistent views on the impacts. The unclear definitions and impacts of duplication in biological databases further limit the development of the related duplicate detection methods. In this work, we refine the definitions of duplication in biological databases through a retrospective analysis of merged groups in primary nucleotide databases – the duplicates identified by record submitters and database staff (or biocurators) – to understand what types of duplicates matter to database stakeholders. This reveals two primary representa- tions of duplication under the context of biological databases: entity duplicates, multiple records belonging to the same entities, which particularly impact record submission and curation, and near duplicates (or redundant records), records sharing high similarities, particularly impact database search. -
Data Quality: from Theory to Practice
Data Quality: From Theory to Practice Wenfei Fan School of Informatics, University of Edinburgh, and RCBD, Beihang University [email protected] ABSTRACT deed, some attributes of t1,t2 and t3 have become obso- 2 Data quantity and data quality, like two sides of a coin, lete and thus inaccurate. are equally important to data management. This paper provides an overview of recent advances in the study of data quality, from theory to practice. We also address The example shows that if the quality of the data is challenges introduced by big data to data quality man- bad, we cannot find correct query answers no matter agement. how scalable and efficient our query evaluation algo- rithms are. 1. INTRODUCTION Unfortunately, real-life data is often dirty: inconsis- When we talk about big data, we typically empha- tent, inaccurate, incomplete, obsolete and duplicated. size the quantity (volume) of the data. We often focus Indeed, “more than 25% of critical data in the world’s on techniques that allow us to efficiently store, manage top companies is flawed” [53], and “pieces of infor- and query the data. For example, there has been a host mation perceived as being needed for clinical decisions of work on developing scalable algorithms that, given a were missing from 13.6% to 81% of the time” [76]. It query Q and a dataset D, compute query answers Q(D) is also estimated that “2% of records in a customer file when D is big. becomeobsoletein one month”[31] and hence, in a cus- tomer database, 50% of its records may be obsolete and But can we trust Q(D) as correct query answers? inaccurate within two years. -
Some Undecidable Implication Problems for Path Constraints
University of Pennsylvania ScholarlyCommons Technical Reports (CIS) Department of Computer & Information Science April 1997 Some Undecidable Implication Problems for Path Constraints Peter Buneman University of Pennsylvania Wenfei Fan University of Pennsylvania Scott Weinstein University of Pennsylvania, [email protected] Follow this and additional works at: https://repository.upenn.edu/cis_reports Recommended Citation Peter Buneman, Wenfei Fan, and Scott Weinstein, "Some Undecidable Implication Problems for Path Constraints", . April 1997. University of Pennsylvania Department of Computer and Information Science Technical Report No. MS-CIS-97-14. This paper is posted at ScholarlyCommons. https://repository.upenn.edu/cis_reports/84 For more information, please contact [email protected]. Some Undecidable Implication Problems for Path Constraints Abstract We present a class of path constraints of interest in connection with both structured and semi-structured databases, and investigate their associated implication problems. These path constraints are capable of expressing natural integrity constraints that are not only a fundamental part of the semantics of the data, but are also important in query optimization. We show that, despite the simple syntax of the constraints, the implication problem for the constraints is r.e. complete and the finite implication problem for the constraints is co-r.e. complete. Indeed, we establish the existence of a conservative reduction of the set of all first-order sentences to the path constraint -
RSE Fellows Ordered by Academic Discipline As at 15/04/2014
RSE Fellows ordered by Academic Discipline as at 15/04/2014 HRH Prince Charles The Prince of Wales KG KT GCB Hon FRSE HRH The Duke of Edinburgh KG KT OM, GBE Hon FRSE HRH The Princess Royal KG KT GCVO, HonFRSE A1 Biomedical and Cognitive Sciences 2014 Professor Judith Elizabeth Allen FRSE, Professor of Immunobiology and Director of Research, School of Biological Sciences, University of Edinburgh. 1998 Dr Ferenc Andras Antoni FRSE, Head, Division of Preclinical Research, EGIS Pharmaceuticals plc. 1993 Sir John Peebles Arbuthnott MRIA PRSE, FMedSci, Former Principal and Vice-Chancellor, University of Strathclyde. Member, Food Standards Agency, Scotland; Chair, NHS Greater Glasgow and Clyde. 2010 Professor Andrew Howard Baker FRSE, Professor of Molecular Medicine, University of Glasgow. 1986 Professor Joseph Cyril Barbenel FRSE, Professor, Department of Electronic and Electrical Engineering, University of Strathclyde. 2013 Professor Michael Peter Barrett FRSE, Professor of Biochemical Parasitology, University of Glasgow. 2005 Professor Susan Margaret Black OBE FRSE, Director, Centre for Anatomy and Human Identification, University of Dundee. Director, Centre for Anatomy and Human Identification, University of Dundee. 2007 Professor Nuala Ann Booth FRSE, Personal Professor of Molecular Haemostasis and Thrombosis, University of Aberdeen. 2001 Professor Peter Boyle CorrFRSE, FMedSci, Former Director, International Agency for Research on Cancer, Lyon. 1991 Professor Sir Alasdair Muir Breckenridge CBE KB FRSE, FMedSci, Emeritus Professor of Clinical Pharmacology, University of Liverpool. 2007 Professor Peter James Brophy FRSE, FMedSci, Professor of Anatomy, University of Edinburgh. Director, Centre for Neuroregeneration, University of Edinburgh. 2013 Professor Gordon Douglas Brown FRSE, Professor of Immunology, University of Aberdeen. 2012 Professor Verity Joy Brown FRSE, Provost of St Leonard's College, University of St Andrews. -
Download Version of Record (PDF / 2MB)
Open Research Online The Open University’s repository of research publications and other research outputs Capturing and Exploiting Citation Knowledge for the Recommendation of Scientific Publications Thesis How to cite: Khadka, Anita (2020). Capturing and Exploiting Citation Knowledge for the Recommendation of Scientific Publications. PhD thesis The Open University. For guidance on citations see FAQs. c 2020 The Author https://creativecommons.org/licenses/by-nc-nd/4.0/ Version: Version of Record Link(s) to article on publisher’s website: http://dx.doi.org/doi:10.21954/ou.ro.00011b89 Copyright and Moral Rights for the articles on this site are retained by the individual authors and/or other copyright owners. For more information on Open Research Online’s data policy on reuse of materials please consult the policies page. oro.open.ac.uk CAPTURINGANDEXPLOITINGCITATIONKNOWLEDGE FORTHERECOMMENDATIONOFSCIENTIFIC PUBLICATIONS Anita Khadka Thesis submitted in fulfilment of the requirements for the degree of Doctor of Philosophy Knowledge Media Institute Faculty of Science, Technology, Engineering & Mathematics The Open University September 2020 ABSTRACT With the continuous growth of scientific literature, it is becoming in- creasingly challenging to discover relevant scientific publications from the plethora of available academic digital libraries. Despite the current scale, important efforts have been achieved towards the research and development of academic search engines, reference management tools, review management platforms, scientometrics systems, and recom- mender systems that help finding a variety of relevant scientific items, such as publications, books, researchers, grants and events, among others. This thesis focuses on recommender systems for scientific public- ations. Existing systems do not always provide the most relevant scientific publications to users, despite they are present in the recom- mendation space.