ESSLLI 2016, day 2, Bolzano, Italy
Logical foundations of databases
Diego Figueira, Gabriele Puppis
CNRS, LaBRI

Recap
•Relational model (tables)
•Relational Algebra (union, product, difference, selection, projection)
•SQL (SELECT … FROM … WHERE …)
• RA ≈ basic SQL
•First-order logic (syntax, semantics)
•Expressiveness: FO =* RA

Formulas as queries
FO can serve as a declarative query language on relational databases: we express the properties of the answer.
• Tables = Relations
• Rows = Tuples
• Queries = Formulas
• RA =* FO (How = What)
RA and FO logic have roughly* the same expressive power! [E.F. Codd 1972]
*FO without functions, with equality, on finite domains, …

Formulas as queries
RA ⊆ FO
• R1 × R2 ⤳ R1(x1, …, xn) ⋀ R2(xn+1, …, xm)
• R1 ∪ R2 ⤳ R1(x1, …, xn) ∨ R2(x1, …, xn)
• σ{i1=j1,…,in=jn}(R) ⤳ R(x1, …, xm) ⋀ (xi1=xj1)⋀ ··· ⋀ (xin=xjn)
• π{i1,…,in}(R) ⤳ ∃({x1,…,xm} \ {xi1,…,xin}). R(x1, …, xm)
• R1 \ R2 ⤳ R1(x1, …, xn) ⋀ ¬R2(x1, …, xn)
• …

Formulas as queries
FO ⊈ RA: FO ⊆ RA does not hold in general!
“the complement of R” ∉ RA, but it is ∈ FO: ¬R(x)
⇝ We restrict variables to range over the active domain (the elements in the relations)
FOact = FO restricted to active domain
Example, on a graph G with nodes v1, v2, v3, v4 [figure: graph G]:
• φ1(x) = ∀y E(y,x), φ1(G) = {v2}
• φ2(x,y) = ¬E(x,y), φ2(G) = {(v1,v1),(v3,v1),(v2,v3)}

First-order logic restricted to active domain
Formal Semantics of FOact
G ⊧α ∃x φ iff for some v ∈ ACT(G) and α' = α ∪ {x ↦ v} we have G ⊧α' φ
G ⊧α ∀x φ iff for every v ∈ ACT(G) and α' = α ∪ {x ↦ v} we have G ⊧α' φ
G ⊧α φ⋀ψ iff G ⊧α φ and G ⊧α ψ
G ⊧α ¬φ iff it is not true that G ⊧α φ
G ⊧α x=y iff α(x)=α(y)
G ⊧α E(x,y) iff (α(x),α(y)) ∈ E
ACT(G) = {v | for some v': (v,v') ∈ E or (v',v) ∈ E}
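
The semantics above can be sketched as a small recursive evaluator. This is only an illustration and not part of the original slides: the tuple encoding of formulas, the names `act` and `holds`, and the example graph are all assumptions made for this sketch.

```python
def act(E):
    """Active domain ACT(G): every node that occurs in some edge."""
    return {v for edge in E for v in edge}


def holds(E, alpha, phi):
    """Decide G ⊧α φ for the graph with edge set E (active-domain semantics).

    Formulas are nested tuples (an encoding assumed for this sketch):
    ("E", x, y), ("eq", x, y), ("and", f, g), ("not", f),
    ("exists", x, f), ("forall", x, f), where x, y are variable names.
    """
    op = phi[0]
    if op == "E":
        return (alpha[phi[1]], alpha[phi[2]]) in E
    if op == "eq":
        return alpha[phi[1]] == alpha[phi[2]]
    if op == "and":
        return holds(E, alpha, phi[1]) and holds(E, alpha, phi[2])
    if op == "not":
        return not holds(E, alpha, phi[1])
    if op == "exists":
        # α' = α ∪ {x ↦ v} for some v in the active domain
        return any(holds(E, {**alpha, phi[1]: v}, phi[2]) for v in act(E))
    if op == "forall":
        # α' = α ∪ {x ↦ v} for every v in the active domain
        return all(holds(E, {**alpha, phi[1]: v}, phi[2]) for v in act(E))
    raise ValueError(f"unknown connective: {op}")


# Hypothetical graph in which every node has an edge into node 2.
E = {(1, 2), (2, 2), (3, 2)}
phi1 = ("forall", "y", ("E", "y", "x"))  # φ1(x) = ∀y E(y,x)
answers = {v for v in act(E) if holds(E, {"x": v}, phi1)}
```

Here `answers` collects the nodes v such that G ⊧α φ1 for α = {x ↦ v}, i.e. the query result φ1(G) on the hypothetical graph.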

First-order logic restricted to active domain
FOact ⊆ RA
Assume:
1. φ is in normal form: (∃* (¬∃)*)* + quantifier-free ψ(x1,…,xn)
2. φ has n variables
E.g. ∃x1 ∃x2 ¬∃x3 ∃x4 . ( E(x1,x3) ⋀ ¬E(x4,x2) ) ⋁ (x1=x3)
Adom = RA expression for active domain = “π1(E) ∪ π2(E)”
Translation:
• (R(xi1,…,xit))✢ ⤳ R, more precisely π{1,…,n}(σ{i1=n+1,…,it=n+t}(Adom^n × R))
• (xi = xj)✢ ⤳ σ{i=j}( Adom × · · · × Adom ), i.e. σ{i=j}(Adom^n)
• (ψ1 ⋀ ψ2)✢ ⤳ ψ1✢ ∩ ψ2✢, where A∩B = ((A∪B) \ (A \ B)) \ (B \ A)
• (¬ψ)✢ ⤳ Adom × · · · × Adom \ ψ✢, i.e. Adom^t \ ψ✢ if t is the arity of ψ✢
• (∃xi φ(xi1,…,xit))✢ ⤳ π{i1,…,it}\{i}( φ✢ )

Corollary
FOact is equivalent to RA
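
The translation ✢ behind this corollary can be sketched bottom-up in Python. This is a simplified illustration under stated assumptions, not the slides' exact construction: every subformula is evaluated to a set of n-tuples over Adom (so all intermediate relations have arity n), the ∃ step projects a column away and then pads it back with Adom to keep that arity, and the tuple encoding of formulas and the function names are hypothetical.

```python
from itertools import product


def adom(E):
    """Adom = π1(E) ∪ π2(E)."""
    return {u for (u, _) in E} | {v for (_, v) in E}


def translate(phi, E, n):
    """Evaluate the RA expression for phi as a set of n-tuples over Adom.

    Variables are column indices 0..n-1; formulas are nested tuples
    (an encoding assumed for this sketch):
    ("E", i, j), ("eq", i, j), ("and", f, g), ("not", f), ("exists", i, f).
    """
    A = adom(E)
    full = set(product(A, repeat=n))  # Adom × ··· × Adom (n times)
    op = phi[0]
    if op == "E":
        _, i, j = phi
        return {t for t in full if (t[i], t[j]) in E}  # select on Adom^n × E, project
    if op == "eq":
        _, i, j = phi
        return {t for t in full if t[i] == t[j]}  # σ{i=j}(Adom^n)
    if op == "and":
        return translate(phi[1], E, n) & translate(phi[2], E, n)  # intersection
    if op == "not":
        return full - translate(phi[1], E, n)  # Adom^n \ ψ✢
    if op == "exists":
        _, i, sub = phi
        # Project column i away, then pad it back with Adom to keep arity n.
        return {t[:i] + (a,) + t[i + 1:] for t in translate(sub, E, n) for a in A}
    raise ValueError(f"unknown connective: {op}")


E = {(1, 2), (2, 3)}  # hypothetical edge relation
has_successor = ("exists", 1, ("E", 0, 1))  # ∃x1 E(x0, x1)
sources = {t[0] for t in translate(has_successor, E, 2)}
sinks = {t[0] for t in translate(("not", has_successor), E, 2)}
```

Only set union, difference, product, selection and projection are used (intersection is definable from union and difference), which is what makes the result an RA expression.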

Question 1: How is π2(σ1=3(R1 × R2)) expressed in FO?
Remember: R1, R2 are binary
Answer: ∃x1,x3,x4 (R1(x1,x2) ⋀ R2(x3,x4) ⋀ x1 = x3)
Question 2: How is ∃y,z . (R1(x,y) ⋀ R1(y,z) ⋀ x≠z ) expressed in RA?
Remember: The signature is the same as before (R1, R2 binary)
• R1 ∪ R2
• R1 × R2
• R1 \ R2
• σ{i1=j1,…,in=jn}(R) ≔ {(x1, …, xm) ∈ R | (xi1=xj1) ⋀ ··· ⋀ (xin=xjn)} (each = may also be ≠)
• π{i1,…,in}(R) ≔ {(xi1,…,xin) | (x1, …, xm) ∈ R}
Answer: π1(σ{2=3,1≠4}(R1 × R1))
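
The answer to Question 2 can be checked on a small instance. The following sketch implements selection, projection and product over Python sets of tuples; the helper names, the 1-based column convention in code, and the sample relation `R1` are assumptions of this example.

```python
def select(R, eq=(), neq=()):
    """σ: keep tuples with column i = column j (eq) and column i ≠ column j (neq).

    Columns are 1-based, as in the slides.
    """
    return {t for t in R
            if all(t[i - 1] == t[j - 1] for i, j in eq)
            and all(t[i - 1] != t[j - 1] for i, j in neq)}


def project(R, cols):
    """π: keep only the listed (1-based) columns, in that order."""
    return {tuple(t[i - 1] for i in cols) for t in R}


def cross(R, S):
    """×: concatenate every tuple of R with every tuple of S."""
    return {r + s for r in R for s in S}


R1 = {(1, 2), (2, 3), (2, 1)}  # hypothetical binary relation

# RA answer: π1(σ{2=3,1≠4}(R1 × R1))
ra_answer = project(select(cross(R1, R1), eq=[(2, 3)], neq=[(1, 4)]), [1])

# Direct reading of the formula ∃y,z . (R1(x,y) ⋀ R1(y,z) ⋀ x≠z)
fo_answer = {(x,) for (x, y) in R1 for (y2, z) in R1 if y == y2 and x != z}
```

On this instance both expressions return the same unary relation, as the translation predicts.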

FO = RA = SQL
• FO: logic (over active domain)
• RA: algebra (on finite domains)
• SQL: programming language (very basic)

Algorithmic problems for query languages
Evaluation problem: Given a query Q, a database instance db, and a tuple t, is t ∈ Q(db)?
⇝ How hard is it to retrieve data?
Emptiness problem: Given a query Q, is there a database instance db so that Q(db) ≠ ∅?
⇝ Does Q make sense? Is it a contradiction? (Query optimization)
Equivalence problem: Given queries Q1, Q2, is Q1(db) = Q2(db) for all database instances db?
⇝ Can we safely replace a query with another? (Query optimization)

Complexity theory
What can be mechanized? ⤳ decidable/undecidable (e.g. Hilbert’s 10th, Domino, PCP, K, …)
How hard is it to mechanize? ⤳ complexity classes
Usage of resources:
• time
• memory
Algorithm Alg is TIME-bounded (resp. SPACE-bounded) by a function f : N ⟶ N if Alg(input) uses less than f(|input|) units of TIME (resp. SPACE).
[Figure: time used by Alg plotted against input size, bounded above by f]
LOGSPACE ⊆ PTIME ⊆ PSPACE ⊆ EXPTIME ⊆ · · ·
• LOGSPACE: SPACE-bounded by log(n)
• PTIME: TIME-bounded by a polynomial
• PSPACE: SPACE-bounded by a polynomial

Algorithmic problems for FO
Evaluation problem: Given a FO formula φ(x1, …, xn), a graph G, and a binding α, does G ⊧α φ?
DECIDABLE ⇝ foundations of the database industry
Satisfiability problem: Given a FO formula φ, is there a graph G and binding α such that G ⊧α φ?
UNDECIDABLE ⇝ both for ⊧ and ⊧finite
Equivalence problem: Given FO formulae φ, ψ, is G ⊧α φ iff G ⊧α ψ for all graphs G and bindings α?
UNDECIDABLE ⇝ by reduction from the satisfiability problem

Algorithmic problems for FO
Satisfiability problem: Given a FO formula φ, is there a graph G and binding α such that G ⊧α φ?
UNDECIDABLE ⇝ both for ⊧ and ⊧finite [Trakhtenbrot ’50]
Proof: By reduction from the Domino (aka Tiling) problem.
Reduction from P to P': Algorithm that solves P using an O(1) procedure “P'(x)” that returns the truth value of P'(x).

The (undecidable) Domino problem
Domino
Input: 4-sided dominos [figure: a set of dominos with coloured sides]
Output: Is it possible to form a white-bordered rectangle (of any size)?
Rules: sides must match, you can’t rotate the dominos, but you can ‘clone’ them.

The (undecidable) Domino problem
Domino - Why is it undecidable?
It can easily encode halting computations of Turing machines:
[Figure: a tiling whose rows are successive configurations of the machine, built from the following kinds of dominos]
• (head is elsewhere, symbol is not modified)
• (head is here, symbol is rewritten, head moves right)
• (head is here, symbol is rewritten, head moves left)
• (initial configuration)
• (halting configuration)

Domino ⇝ Sat-FO (domino has a solution iff φ satisfiable)
1. There is a grid: H( , ) and V( , ) are relations representing bijections such that…
[Figure: a ∀/∃ condition on H- and V-edges, and the resulting grid whose rows are linked by H and whose columns are linked by V]
2. Assign one domino to each node: a unary relation Da(x) for each domino a
3. Match the sides: ∀x,y if H(x,y), then Da(x) ⋀ Db(y) for some dominos a,b that ‘match’ horizontally (idem vertically)
4. Borders are white.
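
To make the Domino problem concrete, here is a bounded search for a white-bordered rectangle. Since the problem is undecidable, this sketch can only confirm solutions up to a chosen size and says nothing beyond that bound. The tile representation `(top, right, bottom, left)`, the colour name `"w"` for white, and the function names are assumptions of this example.

```python
from itertools import product

WHITE = "w"


def tiles_rectangle(tiles, width, height):
    """Backtracking search: can the tiles fill a width x height rectangle
    so that adjacent sides match and the outer border is white?
    Tiles are (top, right, bottom, left), may be reused, never rotated."""
    grid = {}

    def fits(tile, r, c):
        top, right, bottom, left = tile
        if r == 0 and top != WHITE:
            return False
        if r == height - 1 and bottom != WHITE:
            return False
        if c == 0 and left != WHITE:
            return False
        if c == width - 1 and right != WHITE:
            return False
        if r > 0 and grid[r - 1, c][2] != top:   # match the tile above
            return False
        if c > 0 and grid[r, c - 1][1] != left:  # match the tile to the left
            return False
        return True

    def place(k):
        if k == width * height:
            return True
        r, c = divmod(k, width)
        for tile in tiles:
            if fits(tile, r, c):
                grid[r, c] = tile
                if place(k + 1):
                    return True
                del grid[r, c]
        return False

    return place(0)


def bounded_domino(tiles, max_side):
    """Try every rectangle with sides in 1..max_side (a bounded check only)."""
    sizes = product(range(1, max_side + 1), repeat=2)
    return any(tiles_rectangle(tiles, w, h) for w, h in sizes)


# Hypothetical tile set: a left piece and a right piece sharing colour "a".
left_piece = (WHITE, "a", WHITE, WHITE)
right_piece = (WHITE, WHITE, WHITE, "a")
solvable = bounded_domino([left_piece, right_piece], 2)
unsolvable_up_to_3 = not bounded_domino([left_piece], 3)
```

The two sample pieces form a 2 x 1 rectangle, while the left piece alone can never close its right border.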

Algorithmic problems for FO
Equivalence problem: Given FO formulae φ, ψ, is G ⊧α φ iff G ⊧α ψ for all graphs G and bindings α?
UNDECIDABLE ⇝ by reduction from the satisfiability problem
φ is satisfiable iff φ is not equivalent to ⊥
Satisfiability problem undecidable ⇝ Equivalence problem undecidable
Actually, there are reductions in both directions:
φ(x1,…,xn) and ψ(y1,…,ym) are equivalent iff
• n=m
• (x1=y1) ⋀ ··· ⋀ (xn=yn) ⋀ φ(x1,…,xn) ⋀ ¬ψ(y1,…,yn) is unsatisfiable
• (x1=y1) ⋀ ··· ⋀ (xn=yn) ⋀ ψ(x1,…,xn) ⋀ ¬φ(y1,…,yn) is unsatisfiable

Evaluation problem for FO
Input: φ(x1,…,xn), G = (V,E), α : {x1,…,xn} ⟶ V
Output: G ⊧α φ?
Encoding of G = (V, E)
• each node is coded with a bit string of size log(|V|),
• edge set is encoded by its tuples, e.g. (100,101), (010,010), …
Cost of coding: ||G|| = |E|·2·log(|V|) ≈ |V| (mod a polynomial)
Encoding of α : {x1,…,xn} ⟶ V
• each node is coded with a bit string of size log(|V|),
Cost of coding: ||α|| = n·log(|V|)

Evaluation problem for FO in PSPACE
Input: φ(x1,…,xn), G = (V,E), α : {x1,…,xn} ⟶ V
Output: G ⊧α φ?
• If φ(x1,…,xn) = E(xi,xj): answer YES iff (α(xi),α(xj)) ∈ E
  ⇝ use 4 pointers: LOGSPACE
• If φ(x1,…,xn) = ψ(x1,…,xn) ⋀ ψ'(x1,…,xn): answer YES iff G ⊧α ψ and G ⊧α ψ'
  ⇝ MAX( SPACE(G ⊧α ψ), SPACE(G ⊧α ψ') )
• If φ(x1,…,xn) = ¬ψ(x1,…,xn): answer NO iff G ⊧α ψ
  ⇝ SPACE(G ⊧α ψ)
• If φ(x1,…,xn) = ∃y . ψ(x1,…,xn,y): answer YES iff for some v ∈ V and α' = α ∪ {y↦v} we have G ⊧α' ψ
  ⇝ 2·log(|G|) + SPACE(G ⊧α' ψ)
Question: How much space does it take?
2·log(|G|) + ··· + 2·log(|G|) (≤ |φ| times) + k·log(|α|+|G|) space

Evaluation pb for FO is PSPACE-complete
PSPACE-complete problem: QBF (satisfaction of Quantified Boolean Formulas)
QBF = a boolean formula with quantification over the truth values (T,F)
E.g. ∃p ∀q . (p ⋁ ¬q) where p,q range over {T,F}
Theorem: Evaluation for FO is PSPACE-complete (combined complexity)
Polynomial reduction QBF ⤳ FO:
1. Given ψ ∈ QBF, let ψ'(x) be the replacement of each ‘p’ with ‘p=x’ in ψ.
   E.g. ψ'(x) = ∃p ∀q . ( (p=x) ⋁ ¬(q=x) )
2. Note: ∃x ψ' holds in a 2-element graph iff ψ is QBF-satisfiable.
   E.g. ∃x ∃p ∀q . ( (p=x) ⋁ ¬(q=x) )
3. Test if G ⊧∅ ∃x ψ' for G=({v,v'},{})
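
The reduction above can be tried out on small formulas. In this sketch a single evaluator handles both QBF (atoms are propositional variables, quantifiers range over {T,F}) and equality-only FO (atoms are p=x, quantifiers range over the two nodes). The tuple encoding, the function names, and the reserved variable name `"x"` (assumed not to occur in the QBF) are assumptions of this example.

```python
def evaluate(f, domain, env):
    """Evaluate a quantified formula whose quantifiers range over `domain`."""
    op = f[0]
    if op == "var":                      # propositional variable (QBF side)
        return env[f[1]]
    if op == "eq":                       # equality atom (FO side)
        return env[f[1]] == env[f[2]]
    if op == "not":
        return not evaluate(f[1], domain, env)
    if op == "or":
        return evaluate(f[1], domain, env) or evaluate(f[2], domain, env)
    if op == "and":
        return evaluate(f[1], domain, env) and evaluate(f[2], domain, env)
    if op == "exists":
        return any(evaluate(f[2], domain, {**env, f[1]: d}) for d in domain)
    if op == "forall":
        return all(evaluate(f[2], domain, {**env, f[1]: d}) for d in domain)
    raise ValueError(f"unknown connective: {op}")


def to_fo(f, x="x"):
    """Step 1: replace each propositional variable p by the atom p = x."""
    op = f[0]
    if op == "var":
        return ("eq", f[1], x)
    if op == "not":
        return ("not", to_fo(f[1], x))
    if op in ("or", "and"):
        return (op, to_fo(f[1], x), to_fo(f[2], x))
    return (op, f[1], to_fo(f[2], x))    # exists / forall keep their variable


def qbf_true(psi):
    return evaluate(psi, (True, False), {})


def fo_true(psi):
    """Steps 2-3: evaluate ∃x ψ' on the 2-element graph with nodes v, v'."""
    return evaluate(("exists", "x", to_fo(psi)), ("v", "v'"), {})


psi = ("exists", "p", ("forall", "q", ("or", ("var", "p"), ("not", ("var", "q")))))
chi = ("forall", "p", ("var", "p"))
```

Intuitively, the node chosen for x plays the role of T and the other node plays the role of F, so both evaluations agree on every closed QBF.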

Combined, Query, and Data complexities [Vardi, 1982]
Problem: Usual scenario in databases
• A database of size 10^6
• A query of size 100
Input: query + database
But we don’t distinguish this in the analysis: TIME(2^|query| + |data|) = TIME(|query| + 2^|data|)
Query and data play very different roles.
Separation of concerns: How the resources grow with respect to
• the size of the data
• the query size

Combined, Query, and Data complexities
Combined complexity: input size is |query| + |data|
Query complexity (|data| fixed): input size is |query|
Data complexity (|query| fixed): input size is |data|
O(2^|query| + |data|) is
• exponential in combined complexity
• exponential in query complexity
• linear in data complexity
O(|query| + 2^|data|) is
• exponential in combined complexity
• linear in query complexity
• exponential in data complexity
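
A quick calculation shows why the two bounds differ so much in practice. The sizes below are the ones from the usual scenario (a query of size 100, a database of size 10^6); measuring each cost by its number of decimal digits is a choice made for this sketch.

```python
import math

query_size = 100
data_size = 10**6

# O(2^|query| + |data|): exponential in the query, linear in the data.
cost_a = 2**query_size + data_size
digits_a = len(str(cost_a))

# O(|query| + 2^|data|): the term 2^|data| dominates, so we only count
# its decimal digits instead of printing the number itself.
digits_b = int(data_size * math.log10(2)) + 1
```

The first cost is a 31-digit number, large but still a number one can write down; the second has more than three hundred thousand digits.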

Question
What is the data, query and combined complexity for the evaluation problem for FO?
Remember:
• data complexity, input size: |data|
• query complexity, input size: |query|
• combined complexity, input size: |data| + |query|
|φ| · 2 · log(|G|) + k·log(|α|+|G|) space, where φ is the query and G is the data
= O(log(|data|)·|query|) space
• PSPACE combined and query complexity
• LOGSPACE data complexity
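
The three answers can be read off the space bound by fixing one parameter at a time. The derivation below is a sketch under the assumption that the number of free variables is at most |φ|, so that the k·log(|α|+|G|) term is dominated by the first term; the constants c and d are names introduced for this example.

```latex
\begin{align*}
S(\varphi, G) &\le |\varphi| \cdot 2\log|G| + k\log(|\alpha| + |G|)
   = O\bigl(|\varphi| \cdot \log|G|\bigr) \\[4pt]
\text{data complexity } (|\varphi| = c \text{ fixed}):\quad
   S &= O(c \cdot \log|G|) = O(\log|\mathit{data}|)
   &&\Rightarrow \text{LOGSPACE} \\
\text{query complexity } (|G| = d \text{ fixed}):\quad
   S &= O(|\varphi| \cdot \log d) = O(|\mathit{query}|)
   &&\Rightarrow \text{PSPACE} \\
\text{combined complexity}:\quad
   S &= O(|\mathit{query}| \cdot \log|\mathit{data}|)
   \le O\bigl((|\mathit{query}| + |\mathit{data}|)^2\bigr)
   &&\Rightarrow \text{PSPACE}
\end{align*}
```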

Recap
• UNDECIDABLE: Equivalence-RA, Equivalence-SQL, Equivalence-FO, Sat-FO, Domino
• PSPACE: Eval-FO (combined), QBF
• LOGSPACE: Eval-FO (data)

Trading expressiveness for efficiency
[Figure: expressiveness weighed against efficiency]
Alternation of quantifiers significantly affects complexity (recall that evaluation of QBF is PSPACE-complete: ∀x ∃y ∀z ∃w … φ).
What happens if we disallow ∀ and ¬ ?

The class NP
LOGSPACE ⊆ PTIME ⊆ NP ⊆ PSPACE ⊆ EXPTIME
NP = Problems whose solutions can be witnessed by a certificate to be guessed and checked in polynomial time (e.g. a colouring)
Examples:
• 3-COLORABILITY: Given a graph G, can we assign a colour from {R,G,B} to each node so that adjacent nodes have always different colours ?
• SAT: Given a propositional formula, e.g. (p ⋁ ¬q ⋁ r) ⋀ (¬p ⋁ s ) ⋀ (¬s ⋁ ¬p), can we assign a truth value to each variable so that the formula becomes true ?
• MONEY-CHANGE: Given an amount of money A and a set of coins {B1, …, Bn}, can we find a subset S ⊆ {B1, …, Bn} such that ∑ S = A ?
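
For 3-COLORABILITY, the "check" half of guess-and-check is easy to write down. The sketch below verifies a given colouring in one pass over the edges; the function name, the dictionary representation of a colouring, and the sample triangle are assumptions of this example. Finding a colouring is the hard part and is not attempted here.

```python
COLOURS = {"R", "G", "B"}


def is_valid_colouring(edges, colouring):
    """Certificate check: every endpoint has a colour from {R,G,B}
    and adjacent nodes have different colours."""
    return all(
        colouring[u] in COLOURS
        and colouring[v] in COLOURS
        and colouring[u] != colouring[v]
        for u, v in edges
    )


triangle = [(1, 2), (2, 3), (1, 3)]  # hypothetical graph
good = is_valid_colouring(triangle, {1: "R", 2: "G", 3: "B"})
bad = is_valid_colouring(triangle, {1: "R", 2: "R", 3: "B"})
```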

The class NP
LOGSPACE ⊆ PTIME ⊆ NP ⊆ PSPACE ⊆ EXPTIME
NP = Problems whose solutions can be witnessed by a certificate to be guessed and checked in polynomial time (e.g. a colouring)
[Figure: computation tree from the initial configuration to several final configurations]
• Non-deterministic transitions
• Many paths, each has length bounded by a polynomial
• A solution exists if there is at least one successful path.

Question
Consider: Positive FO = FO without ∀,¬
E.g. φ = ∃ x ∃ y ∃ z . (E(x, y) ⋁ E(y, z)) ⋀ ( y=z ⋁ E(x, z))
What is the complexity of evaluating Positive FO on graphs?
Solution
This is in NP: Given φ and G=(V, E) it suffices to guess a binding α : { x, y, z, … } → V and then verify that the formula holds.
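
The guess-and-check argument can be illustrated on the example formula. In this sketch `check` is the polynomial-time verification of a guessed binding, and `evaluate` simulates the non-deterministic guess deterministically by trying every binding, which takes |V|^3 attempts here. The function names and sample graphs are assumptions of this example.

```python
from itertools import product


def check(E, alpha):
    """Verify a guessed binding against the quantifier-free part of
    φ = ∃x ∃y ∃z . (E(x,y) ⋁ E(y,z)) ⋀ (y=z ⋁ E(x,z))."""
    x, y, z = alpha["x"], alpha["y"], alpha["z"]
    return ((x, y) in E or (y, z) in E) and (y == z or (x, z) in E)


def evaluate(V, E):
    """Deterministic stand-in for the guess: try all bindings of x, y, z."""
    return any(check(E, dict(zip("xyz", vals))) for vals in product(V, repeat=3))


holds_on_edge = evaluate({1, 2}, {(1, 2)})   # x=1, y=z=2 is a witness
holds_on_empty = evaluate({1}, set())        # no edges, so no witness
```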

Conjunctive Queries
Def. CQ = FO without ∀,¬,⋁
Normal form: “∃ x1, …, xn . φ(x1, …, xn)”, where φ is quantifier-free and has no equalities!
Eg: φ(x, y) = ∃ z . (Parent(x, z) ⋀ Parent(z, y))
Usual notation: “Grandparent(X,Y) :– Parent(X,Z), Parent(Z,Y)”
• It corresponds to positive “SELECT-FROM-WHERE” SQL queries: Select ... From ... Where Z (no negation or disjunction)
• It corresponds to “π-σ-×” RA queries: πX(σZ(R1 ×···× Rn)) (no negation)
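
The Grandparent query can be written in the “π-σ-×” shape directly. In this sketch the `Parent` facts are made up for illustration, and the three steps (product, selection, projection) are spelled out separately to mirror πX(σZ(R1 × R2)).

```python
# Hypothetical Parent relation: (parent, child) pairs.
parent = {("ann", "bob"), ("bob", "cid"), ("bob", "dan")}

# × : Parent × Parent, tuples (x, z, z2, y)
pairs = {p + q for p in parent for q in parent}

# σ : keep tuples where column 2 = column 3 (the shared variable Z)
joined = {t for t in pairs if t[1] == t[2]}

# π : keep columns 1 and 4, i.e. Grandparent(X, Y)
grandparent = {(t[0], t[3]) for t in joined}
```

The same result is what the rule “Grandparent(X,Y) :– Parent(X,Z), Parent(Z,Y)” computes on these facts.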

Bibliography
Abiteboul, Hull, Vianu, “Foundations of Databases”, Addison-Wesley, 1995 (freely available at http://webdam.inria.fr/Alice/). Chapters 1, 2, 3.