A Retrospective on Constraint Databases∗

A Retrospective on Constraint Databases∗ Peter Revesz Department of Computer Science and Engineering University of Nebraska-Lincoln Lincoln, Nebraska 68588, USA [email protected] ABSTRACT The organization of this paper is the following. Section 2 In this paper we give a review of constraint databases, a field describes the basic concepts of constraint databases giving that was started by Paris Kanellakis, Gabriel Kuper and the several examples. Section 3 describes Datalog queries of con- author. The review includes basic concepts of data repre- straint databases. Section 4 describes the main techniques sentation, constraint query languages, and query evaluation. for the evaluation of Datalog queries of constraint databases. We also illustrate applications of constraint databases in the Section 5 discusses model checking, Section 6 discusses data areas of model checking, data mining, trust management, mining, Section 7 discusses trust management, Section 8 dis- Diophantine polynomial equations, and moving objects. cusses Diophantine polynomial equations, and Section 9 discusses moving objects as sample applications of constraint databases. Section 10 discusses menu-based application de- Categories and Subject Descriptors velopment on top of a constraint database system. Finally, H.2 [Database Management]: Languages, Logical Design Section 11 gives some conclusions and directions for further research in this area. General Terms 2. CONSTRAINT DATABASES Languages, Management, Security Consider a shallow river with some stones in it as shown Keywords in Figure 1. We can represent the stones in a relational database as follows: constraint databases, variable elimination, data mining, Dio- phantine equations, model checking, moving points, trust Stone management XY 019 1. INTRODUCTION 68 15 12 In this paper we review some of the basic concepts of 25 5 constraint databases that were introduced in 1990 in [20, 21] by Paris Kanellakis, Gabriel Kuper and the author, who The above relational database is equivalent to the follow- was a student of Paris Kanellakis between 1985 and 1991 ing constraint database, where comma means “and”: and obtained his Ph.D. degree in 1991 at Brown University with a doctoral dissertation on this topic [29]. Stone Since its introduction, constraint databases developed into XY an interesting and active subfield of database systems. There are several recent books on the subject, including the edited xyx =0, y =19 comprehensive volume [24] and the introductory textbook [32]. xyx =6, y =8 xyx = 15, y =12 ∗ This work was supported in part by USA National Science xyx = 25, y =5 Foundation grant EIA-0091530 and a Gallup Research Pro- fessorship. In the above we used only equality constraints of the form u = c where u is a variable and c is a constant. We call each row of the table a constraint tuple. The intended meaning of a constraint tuple is that any instantiation of the variables Permission to make digital or hard copies of all or part of this work for that satisfies the constraint belongs to the relation. For personal or classroom use is granted without fee provided that copies are example, if we instantiate x by 6 and y by 8 in the second not made or distributed for profit or commercial advantage and that copies row, then we get the constraint 6 = 6 and 8 = 8, which is bear this notice and the full citation on the first page. To copy otherwise, to obviously true. Hence (6, 8) satisfies the constraint in the republish, to post on servers or to redistribute to lists, requires prior specific second row and belongs to the Stone relation as expected. permission and/or a fee. PCK50, June 8, 2003, San Diego, California, USA Any relational database can be translated similarly into Copyright 2003 ACM 1-58113-604-8/03/0006 ...$5.00. a constraint database. However, constraint databases can 12 B (0,19) (15,12) (6,8) (25,5) A Figure 1: A shallow river with some stones. represent much more. In particular, we can express the set consists of a finite set of rules of the form: of points that belong to the two river banks as follows: R0(x1,... ,xk):—R1(x1,1,... ,x1,k1 ), . Bank . Name X Y Rn(xn,1,... ,xn,kn ). (1) nxyn=“A”,−x − y ≥ 0 where each Ri is either an input relation name or a defined nxyn=“B”,+x + y ≥ 41 relation name, and the xs are either variables or constants. The relation names R0,... ,Rn are not necessarily dis- tinct. They could also be constraint relation names, such The above constraint database uses equality constraints as the equality constraint relation = and the addition con- ± ± ≥ over strings, and addition constraints of the form u v straint relation defined above. We write the constraint re- b where u and v are integer variables and b is an integer lations using the usual notation. For example, we will write constant. u = v instead of Equal(u, v)andwriteu + v ≥ b instead of In the case of the Stone relation only a finite number of Addition(u, v, b). When we have u ≥ v and v ≥ u, for any solutions exist. However, in the case of the Bank relation, u and v, then we also abbreviate it as u = v. even if we restrict the variables to range over the integer The preceding rule is read “R0 is true if R1 and ... and numbers, there is an infinite number of solutions. Hence the Rn are all true.” R0 is called the head and R1,... ,Rn is Bank relation is infinite, although it is finitely represented called the body of the rule. using constraints. We can represent as a set of Datalog rules any constraint database. For example, the input relation Stone can be represented by: 3. DATALOG WITH CONSTRAINTS Stone(x, y):—x =0,y=19. Datalog is a rule-based language that is related to Prolog, Stone(x, y):—x =6,y=8. which is a popular language for implementing expert sys- Stone(x, y):—x =15,y=12. tems. Each rule is a statement saying that if some points Stone(x, y):—x =25,y=5. belong to some relations, then other points must also be- Similarly, the two banks of the river can be represented by: long to a defined relation. Each Datalog query contains a − − ≥ Datalog program and an input database. Bank(n, x, y):—n =“A , x y 0. We divide the set of relation names R into defined relation ≥ names D and input relation names I. Each Datalog query Bank(n, x, y):—n =“B , +x + y 41. 13 3.1 The Stepping Stone Problem 4. EVALUATION TECHNIQUES Imagine that we arrive at a big but shallow river and we would like to get to the other shore. The river has in 4.1 Proof Tree it several large stones and we would like to cross the river without getting wet. The question is whether there is a way Consider a Datalog rule of the form (1). We can define a to do that assuming that the maximum size step we can take Datalog rule instantiation as a substitution of each variable is a fixed length, say 10 units. by constants in a rule. We say that an instantiated relation An instance of the stepping stone problem is shown in Ri in the rule body is true if an only if the instantiation is Figure 1. In that instance, we can cross the river starting a solution of a constraint tuple in Ri. We also define rule from point (0, 0) of bank A, then stepping on the stone at lo- application as the addition of the instantiated tuple in the cation (6, 8), then stepping on the stone at location (15, 12), rule head to the relation R0, if all the instantiated tuples in and from there stepping on point (22, 19) of bank B.Itcan the rule body are true. For example, be calculated that the distance between any pair of adjacent Step(0, 0) :— Bank(“A, 0, 0), “A =“A. points on this path is at most 10 units. In this problem we clearly need the Euclidean distance is an instantiation of the first rule for the Step relation. We function, which depends on multiplication and difference. instantiated x and y by 0 and n by the string “A”. Since the Since we do not have these operators, we will define them tuple (“A , 0, 0) is a solution of the first constraint tuple of using Datalog with addition constraints. First we define the the Bank relation, and “A =“A is obviously true, we difference relation D as: can add to the Step relation the tuple (0, 0). As another example, D(x, y, z):—x = y, z =0. Step(6, 8) :— Step(0, 0),Stone(6, 8), Close(0, 0, 6, 8). D(x, y, z):—D(x,y,z), is an instantiation of the second rule for Step.Inthiscase, x = x +1, Step(0, 0) is true because we just added above the tuple z = z +1. (0, 0) to the Step relation. Stone(6, 8) is true because (6, 8) Then we define the multiplication relation M as: satisfies the second constraint tuple of the Stone relation. Similarly, it can be shown that (0, 0, 6, 8) satisfies the con- M(x, y, z):—x =0,y =0,z =0. straint relation Close. By a sequence of rule applications, we can prove any par- M(x, y, z):—M(x,y,z),D(z,z,y), ticular constant tuple that is in an output relation to be in x = x +1. it. It is easy to draw any proof as a tree in which each inter- nal node is the head of an instantiated rule, and its children M(x, y, z):—M(x, y,z),D(z,z,x), are the instantiated relations in the body of the rule.

A Retrospective on Constraint Databases∗

Tarjan Transcript Final with Timestamps

Diffie and Hellman Receive 2015 Turing Award Rod Searcey/Stanford University

Curriculum Vitae

Finding Rising Stars in Co-Author Networks Via Weighted Mutual Influence

Moshe Y. Vardi

Michael Oser Rabin Automata, Logic and Randomness in Computation

AWARDS SESSION New Orleans Convention Center New Orleans, Louisiana Tuesday, 18 November 2014

The Distortion of Locality Sensitive Hashing

Biography Five Related and Significant Publications

Extended Abstract

Iasonas Kokkinos Curriculum Vitae

Paris C. Kanellakis