“The Third Manifesto” Peter Vogel

Book Review, Smart Access, June 2001, page 16, http://www.smartaccessnewsletter.com

In this article, Peter Vogel looks at a book by two of the gurus of relational database design, where they discuss the essence of the relational database theory, its relationship with object-oriented databases, and its possible future.

The Third Manifesto
Foundation for Future Database Systems: The Third Manifesto
By C. J. Date and Hugh Darwen
Addison-Wesley, ISBN 0201709287

I’ve said on many occasions that to build great applications, you have to build great databases, and that building great databases means understanding the relational database theory. Currently, we’re fortunate enough to have two excellent books on relational database design available: Database Design for Mere Mortals by Michael Hernandez (Addison-Wesley, ISBN 0201694719) and Designing Relational Database Systems by Rebecca Riordan (Microsoft Press, ISBN 073560634X). Equally important to creating great applications is a command of SQL. There’s no single book that stands out here, but a combination of SQL Queries for Mere Mortals by Michael Hernandez, John Viescas, and Joe Celko (Addison-Wesley, ISBN 0201433362) and Joe Celko’s own SQL for Smarties (especially the second edition, from Morgan Kaufmann Publishers, ISBN 1558605762) probably should be on every developer’s shelf. In my mind, SQL and relational database design go together: If you get the database design right, you can use SQL to solve your problems; if you can use SQL, you can eliminate code and get the highest possible level of performance in your application.

Foundation for Future Database Systems: The Third Manifesto goes beyond these books to discuss the foundations of relational database theory and sketch out one possible future for relational database design. The authors’ particular goal is to describe how the current interest in object-oriented development relates to relational databases. The authors of the book are two of the most prominent gurus in the field and have written a number of books together. If you’re interested in the SQL standard, for instance, their A Guide to the SQL Standard is an introduction to the SQL language that’s independent of any particular vendor’s implementation.

Blunders

The Third Manifesto takes its name from the authors’ intent. The book is designed to follow up on two previous books that advocated significant changes to the relational theory to support future application development. Date and Darwen’s contention is simple: No such modifications are necessary. The authors believe that the relational database theory isn’t simply one way to store and manage data. The book makes it clear that the authors feel that the relational theory is the only sound basis for understanding the structure of data and for managing data in applications.

The authors do believe, however, that it’s necessary to clear up some misunderstandings among both relational database designers and object-oriented developers. They feel that two issues (or, as they call them in the book, “blunders”) are central to confusion about the nature of objects and relational database theory.

The primary blunder is a confusion between “values” and “variables.” The authors feel that a relation (or a table design, to be more concrete) is a value. A value, after all, is like the number 3: unchanging and independent of any location in time or space. So, too, a relation is unchanging and independent of any location in time or space. A relation expressed in a table design with columns like CustomerNumber, CustomerName, CreditHistory, and so forth doesn’t change over the life of the table.

A table’s contents, however, do change over time as records are added, deleted, and modified. The contents of a relation (or table) are a set of facts (or rows), each of which is considered to be true. So the Customer table that I described previously will contain different customer relations (rows) at different times. A table, therefore, is a variable that holds different values at different times.

A variable that holds different string values is called a “string variable.” In the terminology of The Third Manifesto, a table is a relation variable (or “relvar”): a variable containing different relation values at different times.

The distinction between values and variables is important because the authors go on to discuss a second misunderstanding, this time in the field of object-oriented development. They tackle the thorny question of “What is an object?” They begin by assuming that an object must be one of a value, a variable, or a data type. Given their definition of value and variable, their conclusion is that the only definition of an object that makes sense is that an object is a data type. The authors feel that you should speak of a “Customer object” in the same way that you speak of a “string variable.” A string variable holds different string values over time. In the same way, a Customer object holds different Customer values over time.

Objects and variables

This definition of an object might strike you as odd (it certainly struck me as odd). An object, after all, has methods, properties, and events, something that I don’t think of data types as having. However, the authors make the case that any data type has a set of operators designed for it and that these operators correspond to the methods of an object. For instance, in VBA, there’s a set of operators that I can use with numeric variables (for instance, the math operators, +, -, /, *). In VBA, I’m also constrained from using operators that aren’t defined for numeric variables (for example, the concatenation operator &, which is defined for strings). In the same way that a numeric variable has operators designed to be used with it, an object has a set of operators (methods) that are designed to be used with it.

If you think of a variable as a simple data item with a set of routines designed to work with its data, then it’s not hard to think of an object as a complex data item with a set of routines designed to work with its data. With this definition in hand, the authors go on to say that the relational theory can fully support objects as simply another data type for a column in a relation. All that’s necessary is to allow developers a way to specify the operators for any data type that they want to use.

The corollary to this view is that developers shouldn’t think of relational databases as a place where only simple database types can be stored. I’ve always assumed that I can only use in a database those data types defined by the system. Date and Darwen suggest that there’s no reason why I shouldn’t be able to define and use my own data types (objects). All that’s necessary is that I be able to define the operators for those data types rather than have to settle for system-defined operators. To that end, they propose a language for describing operators for data types. One feature of that language is the privileged operator(s) that extract and update the value of the data [...]

[...] runs about 30 pages and is organized around a set of recommendations. Fortunately for lay readers like myself, a large part of this book is a commentary on the manifesto, explaining, exploring, and explicating it for people who find the manifesto’s language a struggle to read and appreciate (for instance, me).

The manifesto proper and its commentary form only part of the book, though. Recognizing that accessing data is as important as structuring it, Date and Darwen go on to discuss a true relational data access language, which isn’t SQL.

While Darwen sits on the SQL standards committee, the authors’ attitude toward SQL is far from their unreserved support for relational database theory. Date has been especially critical of the language, as he feels that it perverts the nature of the relational theory. In The Third Manifesto, the authors distinguish between “SQL databases” and true relational databases. They even advance a new language as the basis for a true relational access tool. They make it very clear that their language is designed as a model for an actual implementation, but they also make it clear that without a true relational language we’re all hampered by having to use SQL. This new language supports using user-defined operators to allow objects to be stored in tables.

Other sections of the book discuss a way of handling inheritance by allowing for operators to be inherited, review some of the literature on object-oriented databases, and discuss other, related topics.

I’ll be the first to admit that I lack the theoretical understanding of database theory to appreciate all of this book (many paragraphs flew straight over my head). I suspect that I’ve sufficiently simplified the authors’ discussion to distort it, at least in part. However, if you’re interested in relational database theory, you owe it to yourself to take the time to tackle this book. It will stretch your mind, expand your understanding, and help make you a more sophisticated database designer. And, if Date and Darwen’s work does turn out to be the basis of a renaissance in database implementations, you’ll be ready to take advantage of it.

Peter Vogel (MBA, MCSD) is the editor of Smart Access and a principal in PH&V Information Services. PH&V specializes in system design and development for COM/COM+ based systems. Peter has designed, built, and installed intranet and component-based systems for Bayer AG, Exxon, Christie Digital, and the Canadian Imperial Bank of Commerce.
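
For readers who want to see the review's two central ideas in concrete form, here is a minimal Python sketch. It is not the language Date and Darwen propose and it is not taken from the book; it only illustrates, under my own assumptions, (a) a relation value as an unchanging set of rows versus a relation variable ("relvar") that holds different values over time, and (b) a column whose data type is user-defined and carries its own operator. The names CustomerRow, CreditHistory, and is_good are hypothetical.

```python
from dataclasses import dataclass


# Hypothetical user-defined data type; the name and its operator are illustrative only.
@dataclass(frozen=True)
class CreditHistory:
    late_payments: int

    def is_good(self) -> bool:
        """An operator defined for this type, analogous to a method on an object."""
        return self.late_payments == 0


# Hypothetical row layout loosely following the review's Customer example.
@dataclass(frozen=True)
class CustomerRow:
    customer_number: int
    customer_name: str
    credit_history: CreditHistory  # a column whose type is user-defined


# A relation value: an immutable set of rows. Like the number 3, it never changes.
v1 = frozenset({CustomerRow(1, "Ann", CreditHistory(0))})

# A relation variable ("relvar"): a name that holds different relation values over time.
customer = v1
# "Inserting" a row assigns a new relation value to the variable; v1 itself is untouched.
customer = customer | {CustomerRow(2, "Bob", CreditHistory(3))}

# Using the type's own operator in a query over the current value of the variable.
good_names = sorted(r.customer_name for r in customer if r.credit_history.is_good())
```

In this sketch the variable customer ends up holding a two-row value while v1 still holds its original single row, which is the value/variable distinction in miniature; the query relies only on the operator that CreditHistory defines for itself.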