Conjunctive Queries

Giuseppe De Giacomo
Università di Roma “La Sapienza”

Course: Seminars in Software Engineering: Data and Service Integration
Master's degree (Laurea Specialistica) in Computer Engineering
Università degli Studi di Roma “La Sapienza”
Academic year 2005-06

FOL queries

A FOL query is an (open) FOL formula. Let φ be a FOL query with free variables (x1,...,xk); we sometimes write it as φ(x1,...,xk).

Given an interpretation I, the assignments we are interested in are those that map the variables x1,...,xk (and only those). We sometimes write such an assignment explicitly: α(xi) = ai (i = 1,...,k) is written simply as a1,...,ak.

The answer to a query φ(x1,...,xk) over I is defined as follows:

    φ(x1,...,xk)^I = {(a1,...,ak) | I, a1,...,ak |= φ(x1,...,xk)}

Note: We will also use the notation φ^I, which keeps the free variables implicit, and φ(I), which makes apparent that φ becomes a function from interpretations to sets of tuples.

Conjunctive queries (CQs)

A conjunctive query (CQ) q is a query of the form

    ∃y.conj(x, y)

where conj(x, y) is a conjunction (an “and”) of atoms and equalities, with free variables x and y (here x and y stand for tuples of variables).

• CQs are the most frequently asked queries
• CQs correspond to relational algebra Select-Project-Join (SPJ) queries

CQs: datalog notation

A conjunctive query q = ∃y.conj(x, y) is denoted in datalog notation as

    q(x′) ← conj′(x′, y′)

where conj′(x′, y′) is the list of atoms in conj(x, y) obtained after having equated the variables x, y according to the equalities in conj(x, y). As a result of such equality elimination, x′ and y′ can actually contain constants and multiple occurrences of the same variable.

We call q(x′) the head of q, and conj′(x′, y′) the body. Moreover, we call the variables in x′ the distinguished variables of q and those in y′ the non-distinguished variables.
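The equality elimination step behind the datalog notation can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original slides: the function name eliminate_equalities, the encoding of atoms as (predicate, terms) pairs, and the explicit set of constant names are all assumptions made here. It also assumes that no equality forces two distinct constants to be equal.

```python
def eliminate_equalities(head, atoms, equalities, constants=frozenset()):
    """Apply the equalities of a CQ to its head and atoms (hypothetical helper).

    head       -- tuple of terms (the free variables x)
    atoms      -- list of (predicate, terms) pairs
    equalities -- list of (term, term) pairs
    constants  -- names that denote constants; every other term is a variable
    """
    parent = {}  # union-find forest over terms

    def find(t):
        parent.setdefault(t, t)
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path halving
            t = parent[t]
        return t

    def union(s, t):
        rs, rt = find(s), find(t)
        if rs == rt:
            return
        if rs in constants and rt in constants:
            # Assumption of this sketch: such queries are out of scope.
            raise ValueError(f"cannot equate distinct constants {rs} and {rt}")
        # Keep a constant as the representative whenever one is present.
        if rs in constants:
            parent[rt] = rs
        else:
            parent[rs] = rt

    for s, t in equalities:
        union(s, t)

    new_head = tuple(find(t) for t in head)
    new_body = [(pred, tuple(find(t) for t in args)) for pred, args in atoms]
    return new_head, new_body


# ∃y,z. E(x, y) ∧ E(y, z) ∧ z = x ∧ y = a, where a is a constant
new_head, new_body = eliminate_equalities(
    head=("x",),
    atoms=[("E", ("x", "y")), ("E", ("y", "z"))],
    equalities=[("z", "x"), ("y", "a")],
    constants={"a"},
)
print(new_head, new_body)  # q(x) ← E(x, a), E(a, x)
```

As the slide states, the resulting body contains a constant and repeated occurrences of the same variable.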
Example

• Consider an interpretation I = (Δ^I, E^I), where E^I is a binary relation. Note that such an interpretation is a (directed) graph.
• The following CQ q returns all nodes that participate in a triangle in the graph:
      ∃y, z. E(x, y) ∧ E(y, z) ∧ E(z, x)
• The query q in datalog notation becomes:
      q(x) ← E(x, y), E(y, z), E(z, x)
• The query q in SQL, with E(x, y) stored as a table Edge(F, S), is:
      select e1.F
      from Edge e1, Edge e2, Edge e3
      where e1.S = e2.F and e2.S = e3.F and e3.S = e1.F

Nondeterministic CQ evaluation algorithm

    boolean ConjTruth(I, α, ∃y.conj(x, y)) {
      GUESS assignment α[y → a];
      return Truth(I, α[y → a], conj(x, y));
    }

    boolean Truth(I, α, φ) {
      if (φ is t1 = t2)
        return TermEval(I, α, t1) = TermEval(I, α, t2);
      if (φ is P(t1,...,tk))
        return P^I(TermEval(I, α, t1),...,TermEval(I, α, tk));
      if (φ is ψ ∧ ψ′)
        return Truth(I, α, ψ) ∧ Truth(I, α, ψ′);
    }

    o ∈ Δ^I TermEval(I, α, t) {
      if (t is a variable x) return α(x);
      if (t is a constant c) return c^I;
    }

CQ evaluation: combined, data, query complexity

Combined complexity: complexity of {⟨I, α, q⟩ | I, α |= q}, i.e., interpretation, tuple, and query are all part of the input:
• time: exponential
• complexity class: NP (NP-complete; see below for hardness)

Data complexity: complexity of {⟨I, α⟩ | I, α |= q}, i.e., the query is fixed (not part of the input):
• time: polynomial
• space: LOGSPACE (LOGSPACE-complete; see [Vardi82] for hardness)

Query complexity: complexity of {⟨α, q⟩ | I, α |= q}, i.e., the interpretation is fixed (not part of the input):
• time: exponential
• complexity class: NP (NP-complete; see below for hardness)

3-colorability

3-colorability: Given a graph G = (V, E), is it 3-colorable?

Thm: 3-colorability is NP-complete.

Can we reduce 3-colorability to conjunctive query evaluation? YES
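The guess-and-check procedure ConjTruth can be simulated deterministically by replacing GUESS with an exhaustive enumeration of the assignments to the existential variables. The Python sketch below does this for the triangle query. It is an illustration under assumptions, not the slides' own code: the dictionary layout of the interpretation, the function names, and the sample graph are all hypothetical. The enumeration visits |Δ^I|^m assignments for m existential variables, which matches the exponential time bound in the size of the query and the polynomial bound when the query is fixed.

```python
from itertools import product


def term_eval(interp, alpha, t):
    """TermEval: a variable is looked up in alpha, a constant in the interpretation."""
    if t in alpha:
        return alpha[t]
    return interp["constants"][t]


def truth(interp, alpha, atoms):
    """Truth for a conjunction, given as a list of (predicate, terms) pairs.

    The predicate "=" is reserved here for equalities.
    """
    for pred, args in atoms:
        values = tuple(term_eval(interp, alpha, t) for t in args)
        if pred == "=":
            if values[0] != values[1]:
                return False
        elif values not in interp["relations"][pred]:
            return False
    return True


def conj_truth(interp, alpha, exist_vars, atoms):
    """Deterministic stand-in for GUESS: try every assignment to exist_vars."""
    for values in product(interp["domain"], repeat=len(exist_vars)):
        extended = {**alpha, **dict(zip(exist_vars, values))}
        if truth(interp, extended, atoms):
            return True
    return False


# Hypothetical graph: a triangle 1 -> 2 -> 3 -> 1 plus an extra edge 3 -> 4.
graph = {
    "domain": [1, 2, 3, 4],
    "relations": {"E": {(1, 2), (2, 3), (3, 1), (3, 4)}},
    "constants": {},
}

# q(x) ← E(x, y), E(y, z), E(z, x)
triangle = [("E", ("x", "y")), ("E", ("y", "z")), ("E", ("z", "x"))]

answers = {
    a for a in graph["domain"]
    if conj_truth(graph, {"x": a}, ["y", "z"], triangle)
}
print(answers)  # the nodes lying on a triangle
```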
Reduction from 3-colorability to CQ evaluation

Let G = (V, E) be a graph. We define:

• Interpretation: I = (Δ^I, E^I) where:
  – Δ^I = {r, g, b}
  – E^I = {(r, g), (g, r), (r, b), (b, r), (b, g), (g, b)}
• Conjunctive query: Let V = {x1,...,xn}, then consider the boolean conjunctive query q defined as:
      ∃x1,...,xn. ⋀_{(xi,xj)∈E} E(xi, xj) ∧ E(xj, xi)

Thm: G is 3-colorable iff I |= q.

Thm: CQ evaluation is NP-hard in query and combined complexity.

Homomorphism

Let I = (Δ^I, P^I,..., c^I,...) and J = (Δ^J, P^J,..., c^J,...) be two interpretations over the same alphabet (for simplicity, we consider only constants as functions). Then a homomorphism from I to J is a mapping h: Δ^I → Δ^J such that:

• h(c^I) = c^J
• h(P^I(a1,...,ak)) = P^J(h(a1),...,h(ak)), i.e., (a1,...,ak) ∈ P^I implies (h(a1),...,h(ak)) ∈ P^J

Note: An isomorphism is a homomorphism that is one-to-one and onto, and whose inverse is also a homomorphism.

Thm: FOL is unable to distinguish between interpretations that are isomorphic (see any standard book on logic).

Recognition problem and boolean query evaluation

Consider the recognition problem associated with the evaluation of a query q. Then

    I, α |= q(x) iff I_{α,c} |= q(c)

where I_{α,c} is identical to I but includes new constants c, which are interpreted as c^{I_{α,c}} = α(x).

That is, we can reduce the recognition problem to the evaluation of a boolean query.

Canonical interpretation of a (boolean) CQ

Let q be a conjunctive query ∃x1,...,xn.conj. Then the canonical interpretation I_q associated with q is the interpretation I_q = (Δ^{I_q}, P^{I_q},..., c^{I_q},...), where

• Δ^{I_q} = {x1,...,xn} ∪ {c | c constant occurring in q}, i.e., all the variables and constants
• c^{I_q} = c for all constants in q
• (t1, t2) ∈ P^{I_q} iff the atom P(t1, t2) occurs in q

Sometimes the procedure for obtaining the canonical interpretation is called freezing of q.
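Freezing is mechanical enough to sketch directly. The Python fragment below builds the canonical interpretation from the body of a boolean CQ. It is a minimal sketch under assumptions: the function name freeze, the (predicate, terms) encoding of atoms, and the dictionary layout of the result are hypothetical choices made for illustration, and atoms of any arity are accepted rather than only binary ones.

```python
def freeze(atoms, constants=frozenset()):
    """Build the canonical interpretation I_q of a boolean CQ (hypothetical helper).

    atoms     -- list of (predicate, terms) pairs forming the body of q
    constants -- names that denote constants; every other term is a variable
    """
    domain = set()
    relations = {}
    for pred, args in atoms:
        domain.update(args)  # every variable and constant becomes an element
        relations.setdefault(pred, set()).add(tuple(args))  # each atom is a fact
    # Each constant occurring in q is interpreted as itself.
    constant_interp = {c: c for c in constants if c in domain}
    return {"domain": domain, "relations": relations, "constants": constant_interp}


# Boolean query: ∃x,y. E(x, y) ∧ E(y, a), where a is a constant
canonical = freeze([("E", ("x", "y")), ("E", ("y", "a"))], constants={"a"})
print(canonical["domain"])
print(canonical["relations"])
print(canonical["constants"])
```

The variables x and y appear in the domain as ordinary elements, which is what makes I_q usable as a database on which other queries can be evaluated.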
Example

Given the boolean query q:

    q(c) ← E(c, y), E(y, z), E(z, c)

the canonical structure I_q is defined as

    I_q = (Δ^{I_q}, E^{I_q}, c^{I_q})

where

• Δ^{I_q} = {y, z, c}
• c^{I_q} = c
• E^{I_q} = {(c, y), (y, z), (z, c)}

Canonical interpretation and query evaluation

Thm [Chandra&Merlin77]: For (boolean) CQs, I |= q iff there exists a homomorphism from I_q to I.

Proof.
⇒ Let I |= q, let α be the assignment to the existential variables that makes the query true in I, and let ᾱ be its extension to constants. Then ᾱ is a homomorphism from I_q to I.
⇐ Let h be a homomorphism from I_q to I. Then, restricting h to the variables only, we obtain an assignment of the existential variables that makes q true in I. □

In other words, (the recognition problem associated with) query evaluation can be reduced to finding a homomorphism.

Finding a homomorphism between two interpretations (aka relational structures) is also known as solving a CSP (Constraint Satisfaction Problem), well-studied in AI; see also [Kolaitis&Vardi98].

Query containment

Query containment: given two FOL queries φ and ψ, check whether φ ⊆ ψ, i.e., whether for all interpretations I and all assignments α we have that

    I, α |= φ implies I, α |= ψ

(In logical terms: check whether φ |= ψ.)

Note: of special interest in query optimization.

Thm: For FOL queries, query containment is undecidable.

Proof: Reduction from FOL logical implication. □

Query containment for CQs

For CQs, query containment can be reduced to query evaluation!

Step 1 – freeze the free variables: q(x) ⊆ q′(x) iff
• I, α |= q(x) implies I, α |= q′(x), for all I and α; or equivalently
• I_{α,c} |= q(c) implies I_{α,c} |= q′(c), for all I_{α,c}, where c are new constants, and I_{α,c} extends I to the new constants as follows: c^{I_{α,c}} = α(x).
Step 2 – construct the canonical interpretation of the CQ on the left, q(c): consider the canonical interpretation I_{q(c)}.

Step 3 – evaluate the CQ on the right, q′(c), on I_{q(c)}: check whether I_{q(c)} |= q′(c).

Query containment for CQs (cont.)

Thm [Chandra&Merlin77]: For CQs, q(x) ⊆ q′(x) iff I_{q(c)} |= q′(c), where c are new constants.

Proof.
⇒ Assume that q(c) ⊆ q′(c):
• since I_{q(c)} |= q(c), it follows that I_{q(c)} |= q′(c).
⇐ Assume that I_{q(c)} |= q′(c):
• by Thm [Chandra&Merlin77] on homomorphism, for every I such that I |= q(c) there exists a homomorphism h from I_{q(c)} to I;
• on the other hand, since I_{q(c)} |= q′(c), again by Thm [Chandra&Merlin77] on homomorphism, there exists a homomorphism h′ from I_{q′(c)} to I_{q(c)};
• the mapping h ∘ h′ obtained by composing h and h′ is a homomorphism from I_{q′(c)} to I. Hence, once again by Thm [Chandra&Merlin77] on homomorphism, I |= q′(c).
So we can conclude q(c) ⊆ q′(c). □

Thm: Containment of CQs is NP-complete.

Union of conjunctive queries (UCQs)

A union of conjunctive queries (UCQ) q is a query of the form

    ⋁_{i=1,...,n} ∃yi.conj_i(x, yi)

where each conj_i(x, yi) is, as before, a conjunction of atoms and equalities with free variables x and yi.
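Returning to the three-step containment test for CQs, the whole procedure fits in one short Python function. This is a hedged sketch rather than the slides' algorithm verbatim: the name contained_in and the (head, body) encoding are assumptions, the queries are assumed to be in datalog notation with no constants, and every head variable is assumed to occur in the body. Freezing is done implicitly by treating the terms of the left query as domain elements, and the evaluation of the right query is an exhaustive search for a homomorphism, so the running time is exponential in the size of the right query, in line with the NP-completeness result.

```python
from itertools import product


def contained_in(q1, q2):
    """Check q1 ⊆ q2 for two constant-free CQs given as (head, body) pairs."""
    head1, body1 = q1
    head2, body2 = q2
    if len(head1) != len(head2):
        raise ValueError("queries must have the same arity")

    # Steps 1 and 2: freeze q1. Its terms become domain elements, its atoms facts.
    facts = {(pred, tuple(args)) for pred, args in body1}
    domain = sorted({t for _, args in body1 for t in args} | set(head1))

    # Step 3: evaluate q2 on the frozen q1, with the head of q2 bound
    # position by position to the frozen head of q1.
    fixed = {}
    for t2, t1 in zip(head2, head1):
        if fixed.setdefault(t2, t1) != t1:
            return False  # a repeated head variable of q2 would need two values
    free = sorted({t for _, args in body2 for t in args} - set(fixed))
    for values in product(domain, repeat=len(free)):
        h = {**fixed, **dict(zip(free, values))}
        if all((pred, tuple(h[t] for t in args)) in facts for pred, args in body2):
            return True  # h is a homomorphism from the frozen q2 to the frozen q1
    return False


# q(x) ← E(x, y), E(y, z), E(z, x): nodes on a triangle
triangle = (("x",), [("E", ("x", "y")), ("E", ("y", "z")), ("E", ("z", "x"))])
# q(x) ← E(x, y): nodes with an outgoing edge
has_successor = (("x",), [("E", ("x", "y"))])

print(contained_in(triangle, has_successor))
print(contained_in(has_successor, triangle))
```

Every node on a triangle has an outgoing edge, so the first containment holds, while the converse fails because a single edge cannot be mapped onto by a triangle.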
