Experiencing Intuitionistic Logic Programming in SQL Puzzles (Work in Progress)∗

Experiencing Intuitionistic Logic Programming in SQL Puzzles (Work in Progress)∗ Fernando Saenz-P´ erez´ Faculty of Computer Science, Complutense University of Madrid, 28040 Madrid, Spain [email protected] This work presents some SQL puzzles making use of the application of intuitionistic logic programming (ILP) to implement an SQL system. ILP provides a way of declaring SQL Common Table Expressions as used to specify recursive queries and local view definitions. We present the concepts of ILP that will be used to translate SQL queries to Hypothetical Datalog, providing its syntax, an inference system and translation rules. Then, several novel SQL puzzles used during teaching standard SQL in a database subject are proposed, showing that when following the SQL standard the implemented system is more expressive than current DMBS’s. 1 Introduction SQL is usually understood as a specific-purpose, declarative programming language, intended to insert and extract data from databases. However, since the addition of recursive common table expressions (CTE, included for the first time in the ISO/IEC Standard SQL:1999, also known as SQL3) its nature evolved to become a Turing-complete language. To acknowledge this, a specific class of tag systems, CTS’s (Cyclic Tag Systems) were proposed in [6], and a corollary of its work was that a CTS can emulate a Turing-complete class of tag systems. Thus, as a CTS can be implemented in standard SQL,1 this proves that SQL is Turing complete. This result can be used to pose more advanced problems as puzzles to students with the aim of sharpening their programming skills and promoting open-mindedness. Using puzzles to this end is a well-known technique [8], and in fact they are used all around the world as the Hour of Code 2, Pro- gramming Praxis3, CodeKata 4, ¡Acepta el reto!5 and also specific for SQL [5].6 Our experience along last years during teaching database concepts to advanced students (typically in double grade degrees) confirms the usefulness of puzzles as an intellectual challenge posed to students (proposed as extra- curricular activities), and this paper collects some of those SQL puzzles, ranging from the easier ones to ∗Work partially funded by the Spanish Ministry of Economy and Competitiveness, under the grant TIN2017-86217-R (CAVI-ART-2), and by the Madrid Regional Government, under the grant S2018/TCS-4339 (BLOQUES-CM), co-funded by EIE Funds of the European Union. 1https://wiki.postgresql.org/index.php?title=Cyclic_Tag_System&oldid=15106 2https://hourofcode.com 3https://programmingpraxis.com 4http://codekata.com 5https://www.aceptaelreto.com 6 SQLzoo https://sqlzoo.net, HackerRank https://www.hackerrank.com/domains/sql c F. Saenz-P´ erez´ Submitted to: This work is licensed under the PROLE 2019 Creative Commons Attribution License. 2 Intuitionistic Logic Programming and SQL Puzzles the more complex and appealing ones. Though SQL cannot be considered as an esoteric language [17], expressing these puzzles with this language might be viewed as an esoteric usage. This paper explores these puzzles in a system that in particular extends the SQL language beyond the standard, including the possibility of making assumptions, both positive (assuming data which is not in the database) and negative (neglecting data which is already in the database) for querying the database. However, since these puzzles are targeted at SQL students, we restrict ourselves to uses admitted by the standard. The system is built with concepts from intuitionistic logic programming (ILP) [3] which deals to a Hypothetical Datalog language. Following [14], an extended SQL query is translated into a Hypothetical Datalog program, which is ultimately solved by a Datalog engine. We also show that this extension supports a syntax that, though supported by the standard, are not implemented in current SQL systems. In particular, nested WITH clauses (basic constructor for recursion and local view definitions) are allowed, recursive queries are not restricted to include only a single recursive case, and both mutual recursion and non-linearity are allowed. All this framework has been implemented in the Datalog Educational system (DES) [10], for which not only desktop applications are available for different OS’s (Windows, Mac, and Linuxes), but also as the DESweb7 on-line system (which can be used from a web browser with HTML 5 and JavaScript; this includes the possibility of using both mobile phones and tablets with Android, iOS, BlackBerry . ). 1.1 Motivation SQL data-retrieval queries usually start with a SELECT keyword, which builds the outcome in terms of its expression list. An exception to this is the WITH clause, which permits local view declarations which can be used in its outcome SQL. These declarations are known as Common Table Expressions (CTE’s), and its syntax in WITH queries can be simplified as: WITH RECURSIVE R1 AS SQL1, -- CTE1 ..., Rn AS SQLn -- CTEn SQL -- Outcome SQL where each Ri is a local, temporary view name defined by the SQL query SQLi, and can be referenced only in the context of both each SQLi and SQL , the query that builds the outcome. Each declaration Ri AS SQLi is a CTE. Whilst a local view declaration is similar to a CREATE VIEW declaration, its scope is restricted, and its lifetime is the same as for SQL . Though CTE’s are used for a number of reasons (divide-and-conquer, readability, create views for which the user has no rights, . ), we are interested in the use that enables SQL as a Turing complete language: by specifying recursive queries. As an example of a recursive query, the following defines natural numbers: WITH RECURSIVE nat(n) AS (SELECT 0 UNION ALL SELECT n+1 FROM nat) SELECT n FROM nat; where the keyword RECURSIVE is required by the standard to specify a (potentially) recursive CTE, the first SELECT is the base case (simply returning the tuple (0)),8 which is unionised with the inductive case. This single CTE is used in the outcome SQL to deliver all the naturals. Query semantics can be 7https://desweb.fdi.ucm.es 8This is a FROM-less statement intended to simply return a tuple of expressions. F. Saenz-P´ erez´ 3 understood as finding its fixpoint for the application of the unionised queries on an enlarging relation nat, which is increased with a new number in each fixpoint-search iteration. In this case, the fixpoint would be infinite, though top-N queries9 can limit the number of retrieved tuples. Current SQL systems include limitations on recursive queries: only linear recursion is allowed, only one recursive case can be set, and duplicate elimination is not allowed in the union of the base and inductive cases. These limitations can be overcome with other languages as Datalog, which can express non-linear recursion, and extensions with duplicates [1, 11] can handle duplicates. However, local view definitions are not available in logic programming (LP) systems as Datalog. Here is where intuitionistic logic programming (ILP) can be of help because it adds embedded implications in the body of clauses. Intuitively, in an embedded implication A ) B, the atom A is “assumed” to be true for proving B [4], in contrast to the classical logic programming implication A ! B, where the atom A is “executed”for proving B. A database language based on ILP is Hypothetical Datalog [3]. The works [14, 13, 12] in particular extend Hypothetical Datalog with rules (not only atoms) in assumptions, a feature that can be used for local view definitions. Next section recalls this feature in the system DES10 that implements it, which in addition enjoys features needed to support SQL, as e.g. duplicates and aggregates. 2 Intuitionistic Logic Programming for SQL The extended Hypothetical Datalog in [14] implements an intuitionistic logic programming system. In this paper, we further extend its syntax and semantics with expressions and metapredicates for supporting common SQL needs: top-N queries (i.e., at most, N solutions to its goal are allowed), aggregates, and expressions. vars(T) represents the set of variables occurring in T. Definition 2.1. r := A j A f1 ^ ::: ^ fn f := A j :A j E1 E2 j d(A) j t(N;A) j G (A;Xs;f) j r1 ^ ::: ^ rn ) f where r stands for rule, f for goal, A for atom (possibly containing variables and constants, but no compound terms), N is a natural number, Xs is a list of variables, Ei are expressions, is a comparison operator (=, <, >, ≤, ≥), and vars(ri) do not occur out of ri. Here, the symbol ) denotes the embedded implication (it builds a goal that can occur in the body of a rule), and d is a duplicate elimination meta-predicate. The first condition (on compound terms) is needed for ensuring a finite fixpoint in absence of infinite relations, and the second one (on variables of each assumed rule ri) ensures that ri is not specialized by any substitution out of it along inferencing, so ri can be assumed “as is” for any inference step, which is an adequate requirement for modelling local definitions in SQL WITH statements. Symbols :, d, t and G are metapredicates, standing respectively for negation, duplicate elimination, top-N queries, and group-by (for solving aggregations as counts and summations, which can occur in expressions as other scalar functions). A group-by goal G (A;Xs;f) means that ground instances of A are grouped by the list of variables Xs (grouping criteria) and, for each group, there is at most one instantation such that it fulfils f. Next, we extend the inference system in [14] with inference rules for comparison operators, top- N, and group-by goals. D ` y is an inference expression, where D is a stratifiable database [14, 16], y is an identified ground atom id : A, pred(A) is the predicate of atom A, and str(D;P) is the stratum 9Somewhat similar to the function take in functional programming languages.

Load more