Shadow Theory: Data Model Design for Data Integration

Total Page:16

File Type:pdf, Size:1020Kb

Shadow Theory: Data Model Design for Data Integration Shadow Theory: Data Model Design for Data Integration Jason T. Liu* (Draft. Date:9/12/2012) ABSTRACT In information ecosystems, semantic heterogeneity is known as General Terms Algorithms, Design, Human Factors, Theory. the root issue for the difficulties of data integration, and the Relational Model is not designed for addressing such challenges, i.e. to re-use data that is modeled from other sources in the local Keywords data model design. Although researchers have proposed many Data integration, information ecosystems, shadows, meanings, different approaches, and software vendors have designed tools to mental entity, semantic heterogeneity, tags, help the data integration task, it remains an art relying on human labors. Due to lacks of a comprehensive theory to guide the overall modeling process, the quality of the integrated data heavily depends on the data integrators’ experiences. 1. Introduction Based on observations of practical issues for enterprise customer Data integration is essential for most enterprise systems that rely data integration, we believe that the needed solution is a new data on data from multiple sources. Lenzerini defined data integration model. This new data model needs to manage the difficulties of as to combine data residing at different sources in order to provide semantic heterogeneity: different data collected from different users with a unified view of these data [1]. Specifically, the goal perspectives about the same subject matter can naturally be of Enterprise Information Integration (EII) is to provide a uniform inconsistencies or even in conflicts. In other words, existing data access to multiple data sources without having to load the data models are designed to support single version of the truth, and into a central place [2]. Consumes about 40% of IT budget, EII is naturally they will have difficulties during data integration when cited as the biggest and most expensive challenge in [3], where data collected from different sources are based on different Bernstein and Haas provide an overview for related difficulties perspectives or at different levels of abstraction. and technologies. Therefore, we propose Shadow Theory to serve as the In this paper, our main concern is in the conceptual modeling philosophical foundation in order to design a new data model. The level, especially for data integration in large-scale information kernel of the theory is based on the notion of shadow, which can ecosystems. The context is the design activities for data modeling be traced back to Plato’s Allegory of the Cave over 2000 years where a local (downstream) database needs to re-use existing data ago. The basic idea is that whatever we can observe and store into provided by upstream systems, and we focus on the specific issue databases about the subject matter are just shadows. Meanings of that there are multiple sources for the same subject matter (e.g. shadows are mental entities that exist only in the viewers’ customer). These data sources are designed for satisfying different cognitive structures. Such mental entities are constrained by the business requirements based on different understanding, different viewers’ internal model about the reality, especially the implicitly perspectives or even different ontology implicitly. or explicitly chosen perspective(s) or ontology, if formally Therefore, we use the term data integration in a more generalized represented. sense in this paper, for how we can design data models used in the In this paper, we propose six basic principles to guide the overall context of information ecosystems to fulfill its local business data model design. Further, we also propose algebra with a set of functionality through data provided by multiple sources. For basic operators to support data operations by their meanings, not example, ordering systems in an enterprise need to access by their logical structure. The representation is based on point- customer data that exist in billing, marketing, or sales databases. free geometry such that any meaning is represented as an area in However, these different sources have their own perspectives semantic space, which can be decomposed or aggregated in about the same subject matter. To develop an isolated model of different ways concurrently. We use W(hat)-tags to attach on customer data for ordering only is not an acceptable solution since shadows for their meanings, and E(quivalence)-tags to recognize it will not improve the overall performance for the enterprise, and what meanings can be treated as the same. We use enterprise isolated customer data will create more chances of inconsistencies customer data integration as an example to illustrate the data or errors. model design and operation principles. By data model design, there are two levels of meanings here: Categories and Subject Descriptors (1) The generic principles to design the data models for specific applications, for example, to model enterprise customer data H.2.5 [Heterogeneous Databases] in a marketing database, or used in a data warehouse for * The author can be reached at [email protected] business intelligence. 1 (2) The design of a generic data model like the Relational Model Content with algebra to serve as the foundation for data operations. In general, the difficulties of data integration are due to 1. INTRODUCTION heterogeneity at physical level, logical (structural) level, and semantics level. Since semantic heterogeneity is usually hidden 2. ENTERPRISE CUSTOMER DATA INTEGRATION behind structural heterogeneity, it is the most difficult one to 2.1 THE NOTION OF UNIQUE KEYS AND SEMANTIC identify, and the resolution by mapping between semantic HETEROGENEITY heterogeneous data elements usually requires design decisions 2.2 SCHEMA AND RELATED ISSUES made by human data integrators (one of the reasons for why data 2.3 EQUIVALENCE, SIMILARITY, AND DATA INTEGRATION integration still relies on human labors since the beginning of 2.4 EXPECTATIONS AND PRACTICAL NEEDS OF DATA database era). Without a commonly accepted comprehensive INTEGRATION theory to guide such design decisions, data integrators rely on 2.5 SUMMARY their own experiences and personal preferences like performing an art in current practice. 3. SHADOW THEORY The trends of related research are more in making such manual 3.1 SHADOW THEORY process automatic or semi-automatic (e.g. see the survey for 3.2 SIX PRINCIPLES FOR DATA INTEGRATION matching schemas automatically in [4, 5]), in order to reduce the 3.3 SHADOWS IN PLATO’S CAVE dependence of human work. Researchers usually start developing 3.4 W-TAGS: MEANINGS OF SHADOWS and testing their solutions from relative small number of data 3.5 SEMANTIC HETEROGENEITY AND MEANING INDEPENDENCE sources, and push for scale up later. Significant progresses were 4. DATA INTEGRATION made in design software tools at application level to review schema and identify potential mapping candidates. 4.1 E-TAG: EQUIVALENCE OF SHADOWS However, for large-scale information ecosystems, data integration 4.2 MEANINGFUL DATA INTEGRATION AND CRITERIA FOR faces different kinds of challenges. For example, it is not EVALUATION uncommon for an enterprise to have 500 or more different data 4.3 DATA MODEL FEATURES FOR HELPING USERS TO sources that provide similar, overlapped, inconsistent or even UNDERSTAND THEIR INTEGRATED DATA AND USE IT PROPERLY conflicting customer data. First challenge is due to the number of 5. ALGEBRA FOR SHADOW THEORY-BASED DATA schema, which indicates the scale of complexities for mapping MODEL between different logical representations. Second challenge is due to the dynamic nature for subject matters themselves, that 5.1 BASIC STRUCTURE millions of enterprise customers are in endless M&A activities. 5.2 REGIONS AS THE ONLY PRIMITIVE UNITS Third challenge is due to the different business requirements for 5.3 ALGEBRA FOR SHADOW THEORY-BASED DATA MODEL each individual data models, that each one has its procedures to 5.4 SIMULATING RELATIONAL SCHEMA BY TEMPLATES IN decide when to update their data due to M&A results. When there SEMANTIC SPACE is a need to integrate individual information ecosystems due to 6. DISCUSSION AND FUTURE WORK merger of the service providers, each merged department like the integrated billing or ordering usually has to perform their own 6.1 THE CHALLENGES TO SUPPORT MULTIPLE VERSIONS OF THE integration, since enterprise-wide enterprise customer data TRUTH integration for all departments is slow or even not possible to 6.2 COMPARISON WITH RELATIONAL MODEL happen. 6.3 MEANING INDEPENDENCE VERSUS LOGICAL DATA Under these surface issues, the difficulties of semantic INDEPENDENCE heterogeneity can be summarized as the two basic characteristics: 6.4 COMPARISONS WITH ENTITY-RELATIONSHIP MODEL 6.5 BUSINESS PUSH ON CUSTOMER DATA INTEGRATION (CDI) (i) There exist different representations for the same meaning, 6.6 RELATED DATA INTEGRATION APPROACHES (ii) There exist different meanings for the same representation. 6.7 MODEL MANAGEMENT IN SEMANTIC SPACE 6.8 ONTOLOGY, XML, AND SEMANTIC WEB Since Relational Databases are the dominant backbone for 6.9 OTHER RELATED ISSUES enterprise systems, we will focus on Relational Model to make comparison with in this paper. Different representations can be 7. CONCLUSION interpreted as different schema, which is explicitly represented in relational database systems. But semantics is not; it is something out of the scope of Relational Model. Although explicitness was is an uninterpreted finite relational structure” [7]. When one of the original objectives of the Relational Model (as compared with the design of ontology, researchers further summarized by Date: "the meaning of the data should be as explained, “Database schemas often do not provide explicit obvious and explicit as possible” see p.68 in [6]), there is no semantics for their data. Semantics is usually specified explicitly mechanism to enforce meanings to be explicitly represented. at design-time, and frequently is not becoming a part of a What is even worse is that there is no mechanism to prevent database specification, therefore it is not available“ [5, 8].
Recommended publications
  • GRAPH DATABASE THEORY Comparing Graph and Relational Data Models
    GRAPH DATABASE THEORY Comparing Graph and Relational Data Models Sridhar Ramachandran LambdaZen © 2015 Contents Introduction .................................................................................................................................................. 3 Relational Data Model .............................................................................................................................. 3 Graph databases ....................................................................................................................................... 3 Graph Schemas ............................................................................................................................................. 4 Selecting vertex labels .............................................................................................................................. 4 Examples of label selection ....................................................................................................................... 4 Drawing a graph schema ........................................................................................................................... 6 Summary ................................................................................................................................................... 7 Converting ER models to graph schemas...................................................................................................... 9 ER models and diagrams ..........................................................................................................................
    [Show full text]
  • Data Definition Language (Ddl)
    DATA DEFINITION LANGUAGE (DDL) CREATE CREATE SCHEMA AUTHORISATION Authentication: process the DBMS uses to verify that only registered users access the database - If using an enterprise RDBMS, you must be authenticated by the RDBMS - To be authenticated, you must log on to the RDBMS using an ID and password created by the database administrator - Every user ID is associated with a database schema Schema: a logical group of database objects that are related to each other - A schema belongs to a single user or application - A single database can hold multiple schemas that belong to different users or applications - Enforce a level of security by allowing each user to only see the tables that belong to them Syntax: CREATE SCHEMA AUTHORIZATION {creator}; - Command must be issued by the user who owns the schema o Eg. If you log on as JONES, you can only use CREATE SCHEMA AUTHORIZATION JONES; CREATE TABLE Syntax: CREATE TABLE table_name ( column1 data type [constraint], column2 data type [constraint], PRIMARY KEY(column1, column2), FOREIGN KEY(column2) REFERENCES table_name2; ); CREATE TABLE AS You can create a new table based on selected columns and rows of an existing table. The new table will copy the attribute names, data characteristics and rows of the original table. Example of creating a new table from components of another table: CREATE TABLE project AS SELECT emp_proj_code AS proj_code emp_proj_name AS proj_name emp_proj_desc AS proj_description emp_proj_date AS proj_start_date emp_proj_man AS proj_manager FROM employee; 3 CONSTRAINTS There are 2 types of constraints: - Column constraint – created with the column definition o Applies to a single column o Syntactically clearer and more meaningful o Can be expressed as a table constraint - Table constraint – created when you use the CONTRAINT keyword o Can apply to multiple columns in a table o Can be given a meaningful name and therefore modified by referencing its name o Cannot be expressed as a column constraint NOT NULL This constraint can only be a column constraint and cannot be named.
    [Show full text]
  • A Dictionary of DBMS Terms
    A Dictionary of DBMS Terms Access Plan Access plans are generated by the optimization component to implement queries submitted by users. ACID Properties ACID properties are transaction properties supported by DBMSs. ACID is an acronym for atomic, consistent, isolated, and durable. Address A location in memory where data are stored and can be retrieved. Aggregation Aggregation is the process of compiling information on an object, thereby abstracting a higher-level object. Aggregate Function A function that produces a single result based on the contents of an entire set of table rows. Alias Alias refers to the process of renaming a record. It is alternative name used for an attribute. 700 A Dictionary of DBMS Terms Anomaly The inconsistency that may result when a user attempts to update a table that contains redundant data. ANSI American National Standards Institute, one of the groups responsible for SQL standards. Application Program Interface (API) A set of functions in a particular programming language is used by a client that interfaces to a software system. ARIES ARIES is a recovery algorithm used by the recovery manager which is invoked after a crash. Armstrong’s Axioms Set of inference rules based on set of axioms that permit the algebraic mani- pulation of dependencies. Armstrong’s axioms enable the discovery of minimal cover of a set of functional dependencies. Associative Entity Type A weak entity type that depends on two or more entity types for its primary key. Attribute The differing data items within a relation. An attribute is a named column of a relation. Authorization The operation that verifies the permissions and access rights granted to a user.
    [Show full text]
  • Datalog on Infinite Structures
    Datalog on Infinite Structures DISSERTATION zur Erlangung des akademischen Grades doctor rerum naturalium (Dr. rer. nat.) im Fach Informatik eingereicht an der Mathematisch-Naturwissenschaftlichen Fakultät II Humboldt-Universität zu Berlin von Herr Dipl.-Math. Goetz Schwandtner geboren am 10.09.1976 in Mainz Präsident der Humboldt-Universität zu Berlin: Prof. Dr. Christoph Markschies Dekan der Mathematisch-Naturwissenschaftlichen Fakultät II: Prof. Dr. Wolfgang Coy Gutachter: 1. Prof. Dr. Martin Grohe 2. Prof. Dr. Nicole Schweikardt 3. Prof. Dr. Thomas Schwentick Tag der mündlichen Prüfung: 24. Oktober 2008 Abstract Datalog is the relational variant of logic programming and has become a standard query language in database theory. The (program) complexity of datalog in its main context so far, on finite databases, is well known to be in EXPTIME. We research the complexity of datalog on infinite databases, motivated by possible applications of datalog to infinite structures (e.g. linear orders) in temporal and spatial reasoning on one hand and the upcoming interest in infinite structures in problems related to datalog, like constraint satisfaction problems: Unlike datalog on finite databases, on infinite structures the computations may take infinitely long, leading to the undecidability of datalog on some infinite struc- tures. But even in the decidable cases datalog on infinite structures may have ar- bitrarily high complexity, and because of this result, we research some structures with the lowest complexity of datalog on infinite structures: Datalog on linear orders (also dense or discrete, with and without constants, even colored) and tree orders has EXPTIME-complete complexity. To achieve the upper bound on these structures, we introduce a tool set specialized for datalog on orders: Order types, distance types and type disjoint programs.
    [Show full text]
  • A Theory of Regular Queries
    A Theory of Regular Queries Moshe Y. Vardi Rice University [email protected] ABSTRACT [22], he proposed using first-order logic as a declarative database A major theme in relational database theory is navigat- query language [23]. This led to the development of both ing the tradeoff between expressiveness and tractability for SEQUEL [16] and QUEL [54], as practical relational query query languages, where the query-containment problem is languages realizing Codd's idea, ultimately giving in 1986 considered a benchmark of tractability. The query class rise to the SQL standard [27], which has continued to evolve UCQ, consisting off unions of conjunctive queries, is a frag- over the past 30 years. ment of first-order logic that has a decidable query contain- Starting in the late 1970s, Codd's definition of first-order ment problem, but its expressiveness is limited. Extend- logic as \expressively complete" came under serious criti- ing UCQ with recursion yields Datalog, an expressive query cism [4], and various proposals emerged on how to extend language that has been studied extensively and has recently its expressive power [4, 6, 17], whose essence was to extend become popular in application areas such as declarative net- first-order logic with recursion, which was added to SQL in working. Unfortunately, Datalog has an undecidable query 1999 by way of common table expressions [29]. On the re- containment problem. Identifying a fragment of Datalog search side, the most popular relational language embodying that is expressive enough for applications but has a decid- recursion is Datalog [15], describe in more details below.
    [Show full text]
  • Database Theory
    DATABASE THEORY Lecture 1: Introduction / Relational data model Markus Krotzsch¨ TU Dresden, 4 April 2016 Course information • 14 lectures: Thursday, DS4 (13:00–14:30) • 13 excercise classes: Monday, DS3 (11:10–12:40) { taught by Francesco Kriegel • Oral examination (details based on applicable examination regulations) • Course homepage (dates, slides, excercise sheets): https://ddll.inf.tu-dresden.de/web/Database_ Theory_%28SS2016%29/en Markus Krötzsch, 4 April 2016 Database Theory slide 2 of 42 Aims of the course Obtain an understanding of key topics in database theory with a special focus on query formalisms: • Relational data model • Basic and advanced query languages • Expressive power of query languages • Complexity of query answering + some algorithmic approaches • Modelling with constraints Connect databases with other advanced topics in logic/KR/formal methods Markus Krötzsch, 4 April 2016 Database Theory slide 3 of 42 Literature, prerequisites, related courses • Serge Abiteboul, Richard Hull, Victor Vianu: Foundations of Databases. Addison-Wesley. 1994. – Available at http://webdam.inria.fr/Alice/ – Slight deviations in the lecture – Further literature will be given for advanced topics • Prerequisites: basics of first-order logic, Turing machines, worst-case complexity • Related courses at TUD: – Advanced Logic – Foundations of Semantic Web Technologies – Introduction to Logic Programming – Introduction to Constraint Programming – Datenbanken (Grundlagen) – Intelligent Information Systems Markus Krötzsch, 4 April 2016 Database
    [Show full text]
  • Database Theory
    DATABASE THEORY Lecture 11: Introduction to Datalog Markus Krotzsch¨ Knowledge-Based Systems TU Dresden, 12th June 2018 Announcement All lectures and the exercise on 19 June 2018 will be in room APB 1004 Markus Krötzsch, 12th June 2018 Database Theory slide 2 of 34 Introduction to Datalog Markus Krötzsch, 12th June 2018 Database Theory slide 3 of 34 Introduction to Datalog Datalog introduces recursion into database queries • Use deterministic rules to derive new information from given facts • Inspired by logic programming (Prolog) • However, no function symbols and no negation • Studied in AI (knowledge representation) and in databases (query language) Example 11.1: Transitive closure C of a binary relation r C(x, y) r(x, y) C(x, z) C(x, y) ^ r(y, z) Intuition: • some facts of the form r(x, y) are given as input, and the rules derive new conclusions C(x, y) • variables range over all possible values (implicit universal quantifier) Markus Krötzsch, 12th June 2018 Database Theory slide 4 of 34 Syntax of Datalog Recall: A term is a constant or a variable. An atom is a formula of the form R(t1, ::: , tn) with R a predicate symbol (or relation) of arity n, and t1, ::: , tn terms. Definition 11.2: A Datalog rule is an expression of the form: H B1 ^ ::: ^ Bm where H and B1, ::: , Bm are atoms. H is called the head or conclusion; B1 ^ ::: ^ Bm is called the body or premise. A rule with empty body (m = 0) is called a fact. A ground rule is one without variables (i.e., all terms are constants).
    [Show full text]
  • Database Theory in Practice: Learning from Cooperative Group Projects*
    DATABASE THEORY IN PRACTICE: LEARNING FROM COOPERATIVE GROUP PROJECTS* Suzanne W. Dietrich Susan D. Urban Department of Computer Science and Engineering Arizona State University BOX 875406 Tempe, AZ 85287-5406 U.S.A. [email protected] [email protected] ABSTRACT only because of the foundation it provides for research, but lhis paper describes the use of cooperative group learning also because it is a prerequisite for effective use of database concepts in support of an undergraduate database mrtnage- management systems [1]. Database practice is important ment course that emphasizes the theoretical and practical for preparing students to address the needs of real-world aspects of database application development. ‘Ihe course applications. There is. therefore, a wealth of both theoretical project is divided into three main phases, involving require- and practical material that must be covered within a typical ments analysis and conceptual design, relational database undergraduate course on database management systems. mapping and prototyping, and database system implemen- Another concern with respect to database education is the tation using Microsoft Access. The project deliverables overall approach to teaching database concepts, such as the are designed so that students not only develop a database use of cooperative learning [4] in the college classroom. In implementation, but also evaluate their design in terms of cooperative learning, in contrast to individual or competitive functional dependencies, normal forms, the lossless join learning, students work in small groups, helping each other property, and the dependency preservation property, thus learn the assigned material. Cooperative learning has advan- establishing the need for sound database design principles.
    [Show full text]
  • An Extension of the Relational Data Model to Incorporate Ordered Domains
    An Extension of the Relational Data Model to Incorporate Ordered Domains Wilfred Siu Hung Ng A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy o f th e University of London Department of Computer Science University College London 1, July 1997 ProQuest Number: 10055432 All rights reserved INFORMATION TO ALL USERS The quality of this reproduction is dependent upon the quality of the copy submitted. In the unlikely event that the author did not send a complete manuscript and there are missing pages, these will be noted. Also, if material had to be removed, a note will indicate the deletion. uest. ProQuest 10055432 Published by ProQuest LLC(2016). Copyright of the Dissertation is held by the Author. All rights reserved. This work is protected against unauthorized copying under Title 17, United States Code. Microform Edition © ProQuest LLC. ProQuest LLC 789 East Eisenhower Parkway P.O. Box 1346 Ann Arbor, Ml 48106-1346 Abstract In this thesis, we extend the relational data model to incorporate partial orderings into data domains and present a comprehensive formalisation of the extended model. The main contributions of the thesis are that we show how and why such an extension can considerably improve the capabilities of capturing semantics in a wide spectrum of advanced applications such as tree-structured information, temporal information, in­ complete information, fuzzy information and spatial information, whilst preserving the elegance of the theoretical basis of the model. Within the extended model, we extend the relational algebra to the Partially Or­ dered Relational Algebra (FORA) and the relational calculus to the Partially Ordered Relational Calculus (PORC), respectively.
    [Show full text]
  • Foundations of Databases
    Foundations of Databases Foundations of Databases Serge Abiteboul INRIA–Rocquencourt Richard Hull University of Southern California Victor Vianu University of California–San Diego Addison-Wesley Publishing Company Reading, Massachusetts . Menlo Park, California New York . Don Mills, Ontario . Wokingham, England Amsterdam . Bonn . Sydney . Singapore Tokyo . Madrid . San Juan . Milan . Paris Sponsoring Editor: Lynne Doran Cote Associate Editor: Katherine Harutunian Senior Production Editor: Helen M. Wythe Cover Designer: Eileen R. Hoff Manufacturing Coordinator: Evelyn M. Beaton Cover Illustrator: Toni St. Regis Production: Superscript Editorial Production Services (Ann Knight) Composition: Windfall Software (Paul C. Anagnostopoulos, Marsha Finley, Jacqueline Scarlott), using ZzTEX Copy Editor: Patricia M. Daly Proofreader: Cecilia Thurlow The procedures and applications presented in this book have been included for their instructional value. They have been tested with care but are not guaranteed for any purpose. The publisher does not offer any warranties or representations, nor does it accept any liabilities with respect to the programs and applications. Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and Addison-Wesley was aware of a trademark claim, the designations have been printed in initial caps or all caps. Library of Congress Cataloging-in-Publication Data Abiteboul, S. (Serge) Foundations of databases / Serge Abiteboul. Richard Hull, Victor Vianu. p. cm. Includes bibliographical references and index. ISBN 0-201-53771-0 1. Database management. I. Hull, Richard, 1953–. II. Vianu, Victor. III. Title. QA76.9.D3A26 1995 005.74’01—dc20 94-19295 CIP Copyright © 1995 by Addison-Wesley Publishing Company, Inc.
    [Show full text]
  • Day 1: Introduction to Database Theory and Design
    Outline The Database as a Concept and Tool The Database model and its Evolution Database Applications Day 1: Introduction to Database Theory and Design Database Theory and Design Tyler Peterson International Summer School on Language Documentation and Description Leiden University Centre for Linguistics, Leiden November 26, 2011 Database Theory and Design Day 1: Introduction to Database Theory and Design Outline The Database as a Concept and Tool The Database model and its Evolution Database Applications My Details: Tyler Peterson office: LUCL Van Wijkplaats 4 Room 205a telephone: 071-5272059 email: [email protected] [email protected] (for Google docs) office hours: Most afternoons until 18:00 I Please fill out the short survey, and don't hesitate to contact me! Database Theory and Design Day 1: Introduction to Database Theory and Design I Theory: Terminology used in database theory; Understanding the design features of a database (entities and attributes). I Design: Assessing your goals; planning a database on paper (Entity-Relationship diagrams); Best practices. I Cross tabulation: a front-line analytical tool. I Trigger{Target Database for Phonological Processes I Programma de Fonologia Experimental e Hist´orica I Databases vs. spreadsheets. Types of databases. I Different commercial and free database programs: advantages and limitations. I To familiarize you with the concepts in database theory and design. I Looking at the practical implemetation of an analytical database in MS Access. Outline The Database as a Concept and Tool The Database model and its Evolution Database Applications Goals for the Course: I Databases: what they are, what linguists can use them for.
    [Show full text]
  • Database Theory
    Course information 14 lectures: Thursday, DS4 (13:00–14:30) • 13 excercise classes: Monday, DS3 (11:10–12:40) • DATABASE THEORY { taught by Francesco Kriegel Oral examination (details based on applicable examination • Lecture 1: Introduction / Relational data model regulations) Course homepage (dates, slides, excercise sheets): Markus Krotzsch¨ • https://ddll.inf.tu-dresden.de/web/Database_ Theory_%28SS2016%29/en TU Dresden, 4 April 2016 Markus Krötzsch, 4 April 2016 Database Theory slide 2 of 27 Aims of the course Literature, prerequisites, related courses Serge Abiteboul, Richard Hull, Victor Vianu: Obtain an understanding of key topics in database theory with a • Foundations of Databases. Addison-Wesley. 1994. special focus on query formalisms: – Available at http://webdam.inria.fr/Alice/ Relational data model • – Slight deviations in the lecture Basic and advanced query languages – Further literature will be given for advanced topics • Expressive power of query languages Prerequisites: basics of first-order logic, Turing machines, • • Complexity of query answering + some algorithmic worst-case complexity • approaches Related courses at TUD: • Modelling with constraints – Advanced Logic • – Foundations of Semantic Web Technologies – Introduction to Logic Programming Connect databases with other advanced topics in logic/KR/formal – Introduction to Constraint Programming methods – Datenbanken (Grundlagen) – Intelligent Information Systems Markus Krötzsch, 4 April 2016 Database Theory slide 3 of 27 Markus Krötzsch, 4 April 2016 Database
    [Show full text]