A Tree Logic...And an Application for the Analysis of Cascading Style

A Tree Logic... and an Application for the Analysis of Cascading Style Sheets
Pierre Genevès, CNRS – Tyrex team
Toccata seminar, LRI – Feb. 22nd, 2013

Outline
1. Insights on the Lµ Tree Logic
2. Overview of Perspectives and Applications
3. Zoom on the Analysis of CSS

Data Model for the Logic
- Trees: the logic was originally designed for XML trees
- Specifically: finite binary labeled trees
- They model finite ordered unranked labeled trees without loss of generality
- Bijective encoding of unranked trees as binary trees
[Figure: an unranked tree with nodes 0 to 3 and the corresponding binary tree]

Formulas of the Lµ Logic
Programs α ∈ {1, 2, 1̄, 2̄} for navigating binary trees (the converse of ᾱ is α)
Lµ ∋ φ, ψ ::=            formula
      ⊤                   true
    | p | ¬p              atomic proposition (negated)
    | n | ¬n              nominal (negated)
    | φ ∨ ψ | φ ∧ ψ       disjunction (conjunction)
    | ⟨α⟩φ | ¬⟨α⟩⊤        existential (negated)
    | µX.φ                unary fixpoint (finite recursion)
    | µ(Xi.φi) in ψ       n-ary fixpoint

Sample Formula and Satisfying Tree
The formula is built up in three steps:
1. a
2. a ∧ ⟨2⟩b
3. a ∧ ⟨2⟩b ∧ µX.(⟨2⟩c ∨ ⟨1̄⟩X)
[Figure: a satisfying tree containing nodes a, b and c, plus two nodes whose labels are unconstrained]
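To make the sample formula a ∧ ⟨2⟩b ∧ µX.(⟨2⟩c ∨ ⟨1̄⟩X) concrete, here is a minimal sketch of how it could be evaluated on one small binary tree. It rests on several assumptions that are not taken from the slides: program 1 is read as "first child" and program 2 as "next sibling", formulas are encoded as nested tuples, the names `Node`, `step` and `sat` are hypothetical, negation is left out for brevity, and the fixpoint is computed by plain iteration from the empty set as a stand-in for the logic's finite recursion. It is a model checker for one given tree, not the satisfiability solver discussed in the talk.

```python
class Node:
    """Binary-tree node; `first` is reached by program 1, `nxt` by program 2."""
    def __init__(self, label, first=None, nxt=None):
        self.label = label
        self.succ = {"1": first, "2": nxt}
        self.pred = {"1": None, "2": None}  # targets of the converse programs
        for prog, child in self.succ.items():
            if child is not None:
                child.pred[prog] = self

def all_nodes(root):
    """Collect every node reachable from root through programs 1 and 2."""
    found, stack = [], [root]
    while stack:
        node = stack.pop()
        if node is not None:
            found.append(node)
            stack.extend(node.succ.values())
    return found

def step(node, prog):
    """Follow "1", "2", or a converse written "-1", "-2"; None if undefined."""
    return node.pred[prog[1:]] if prog.startswith("-") else node.succ[prog]

def sat(f, nodes, env=None):
    """Return the set of nodes at which formula f holds."""
    env = env or {}
    kind = f[0]
    if kind == "true":
        return set(nodes)
    if kind == "prop":
        return {n for n in nodes if n.label == f[1]}
    if kind == "and":
        return sat(f[1], nodes, env) & sat(f[2], nodes, env)
    if kind == "or":
        return sat(f[1], nodes, env) | sat(f[2], nodes, env)
    if kind == "dia":  # <prog> g
        target = sat(f[2], nodes, env)
        return {n for n in nodes if step(n, f[1]) in target}
    if kind == "var":
        return env[f[1]]
    if kind == "mu":  # iterate from the empty set until nothing changes
        current = set()
        while True:
            updated = sat(f[2], nodes, {**env, f[1]: current})
            if updated == current:
                return current
            current = updated
    raise ValueError(f"unknown formula kind: {kind}")

# A tree in the spirit of the slide: "x" marks nodes with unconstrained labels.
b = Node("b")
a = Node("a", nxt=b)
c = Node("c")
p = Node("x", first=a, nxt=c)
root = Node("x", first=p)
nodes = all_nodes(root)

# a ∧ <2>b ∧ µX.(<2>c ∨ <-1>X)
phi = ("and",
       ("and", ("prop", "a"), ("dia", "2", ("prop", "b"))),
       ("mu", "X", ("or", ("dia", "2", ("prop", "c")),
                          ("dia", "-1", ("var", "X")))))

selected = sat(phi, nodes)
print([n.label for n in selected])
```

In this tree the formula holds only at the node labeled a: it has b as its next sibling, and following the converse of program 1 leads to a node whose next sibling is c.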
Semantics: models of φ are finite trees for which φ holds at some node.
Interesting balance between succinctness and expressive power: XPath, CSS selectors, and XML types can all be translated into the logic linearly.

Example: Translation of an XPath Expression into Lµ
- The formula holds at the selected nodes
- µZ.φ: finite recursion
- Converse programs are crucial
- More generally, we have a compiler: txpath(e, χ): LXPath × Lµ → Lµ
- χ is the latest navigation step; initially, χ = ¬⟨1̄⟩⊤ ∧ ¬⟨2̄⟩⊤ for absolute expressions
Translated query for child::a[child::b]:
    a ∧ (µZ.⟨1̄⟩χ ∨ ⟨2̄⟩Z) ∧ ⟨1⟩µY.(b ∨ ⟨2⟩Y)
[Figure: a sample tree with nodes labeled a, b, c and d, with the formula attached to the selected a node]

Lµ Closure under Negation
Cycle-freeness: a key property
- If both a program and its converse occur between a µX. binder and X, the formula has a cycle, e.g.: µX.⟨α⟩X ∨ ⟨ᾱ⟩X
- Otherwise the formula is cycle-free
- In practice, most (all?) formulas are cycle-free (e.g. XPath translations are always cycle-free)
[Figure: finite trees shown inside the class of infinite structures, split between models of φ and models of ¬φ]
Cycle-freeness of Lµ implies closure under negation
- The negation of finite recursion is finite recursion (see paper)
- ¬φ is easily (linearly) expressible in Lµ for all φ ∈ Lµ
- Crucial for BC: implication (subtyping, containment tests...)
- Crucial for implementation

Deciding Lµ Satisfiability
Is a formula ψ ∈ Lµ satisfiable?
- Given ψ, determine whether there exists a finite tree that satisfies ψ
- Validity: test whether ¬ψ is unsatisfiable
Principles: Automatic Theorem Proving
- Search for a proof tree
- Build the proof bottom up: "if ψ holds then it is necessarily somewhere up"

Search Space Optimization
Idea: Truth Status is Inductive
- The truth status of ψ can be expressed as a function of its subformulas
- For boolean connectives, it can be deduced (truth tables)
- Only base subformulas really matter: Lean(ψ)
Lean(ψ) contains:
- topological propositions: ⟨1⟩⊤, ⟨2⟩⊤, ⟨1̄⟩⊤, ⟨2̄⟩⊤
- atomic propositions: a, b, σ
- existential subformulas: ⟨1⟩φ, ⟨2⟩φ
A Tree Node: Truth Assignment of Lean(ψ) Formulas
- With some additional constraints, e.g. ¬⟨1̄⟩⊤ ∨ ¬⟨2̄⟩⊤

Satisfiability-Testing Algorithm: Principles
- Bottom-up construction of a proof tree: a set of nodes is repeatedly updated (fixpoint computation)
- Step 1: all possible leaves are added
- Step i > 1: all possible parents of previous nodes are added
- Compatibility relation between nodes. Nodes from the previous step are proof support: ⟨α⟩φ is added if φ holds in some node added at the previous step
  Example: ¬b ∧ η, where η = µX.(b ∨ ⟨2̄⟩X)
- Progressive bottom-up reasoning (partial satisfiability): formulas ⟨ᾱ⟩φ are left unproved until a parent is connected
- Termination: if ψ is present in some root node, then ψ is satisfiable. Otherwise, the algorithm terminates when no more nodes can be added

Satisfiability-Testing Algorithm: Implementation techniques
- Crucial optimization: symbolic representation

Correctness & Complexity
Theorem: The satisfiability problem for a formula ψ ∈ Lµ is decidable in time 2^O(n), where n = |Lean(ψ)|.
System fully implemented:
- decision procedure
- compilers (XPath, DTD, XML Schema, CSS selectors, ...)

Overview of Some Experiments

Table: Types used in experiments.
DTD                 Symbols   Binary type variables
SMIL 1.0            19        11
XHTML 1.0 Strict    77        325

Table: Some decision problems and corresponding results.
XPath decision problem       XML type     Time (ms)
e1 ⊆ e2 and e2 ⊈ e1          none         353
e4 ⊆ e3 and e4 ⊆ e3          none         45
e6 ⊆ e5 and e5 ⊈ e6          none         41
e7 is satisfiable            SMIL 1.0     157
e8 is satisfiable            XHTML 1.0    2630
e9 ⊆ (e10 ∪ e11 ∪ e12)       XHTML 1.0    2872

For the last test, the size of the Lean is 550. The search space is 2^550 ≈ 10^165, more than the square of the number of atoms in the universe (10^80).

Tree Logics: an Overview
On the theoretical side: Lµ offers an interesting combination of expressivity, succinctness, and an optimal complexity bound.

Year        Logic                                        Expr.       Sat.             Impl.
1968        WS2S                                         MSO         Non-elementary   MONA
1977        PDL(tree)                                    ? (<MSO)    EXPTIME          ?
1981        CTL                                          FO          EXPTIME          ?
1983        µ-calculus                                   MSO         EXPTIME          ?
2006-2013   Lµ, forward + backward (for finite trees)    MSO         2^O(n)           Lµ Solver

On the practical side: apart from MONA (hyperexponential), this is the only implementation of a satisfiability solver for such an expressive logic.
It can be useful for graphs too: the sublogic without backward modalities enjoys the finite tree model property.

Going Further: Challenges
Several directions:
- Growing logical expressive power? (currently MSO)
- Decreasing combined complexity? (impossible without dropping features: containment for regular tree grammars is EXPTIME-hard)
- Augmenting succinctness of the logic
  → good potential
Succinctness is crucial
- A blow-up in the logical translations affects the combined complexity
- Augmenting succinctness is a way to address more problems in EXPTIME

Further Perspectives in Gaining Succinctness
Nominals
- A nominal p is an atomic proposition whose interpretation is a singleton, card(p)=1
- Captured! Idea of the translation into logic: "p and nowhereElse(p)"
[Figure: partition of the tree around a node by XPath axes: ancestor, self, parent, child, preceding-sibling, following-sibling, following, preceding, descendant]
    p ∧ ¬descendant(p) ∧ ¬descendant-or-self(following-sibling(ancestor-or-self(p)))
- This is a formula with a constant-size footprint in the Lean
- ... Now, what about card(phi)=n?

Further Perspectives: card(phi)=n
Even if this remains regular, this is not a priori succinct.
For instance, L2a2b: the set of strings over Σ = {a, b, c} containing at least two occurrences of a and at least two occurrences of b:
      (a|b|c)*a(a|b|c)*a(a|b|c)*b(a|b|c)*b(a|b|c)*
    | (a|b|c)*a(a|b|c)*b(a|b|c)*a(a|b|c)*b(a|b|c)*
    | (a|b|c)*a(a|b|c)*b(a|b|c)*b(a|b|c)*a(a|b|c)*
    | (a|b|c)*b(a|b|c)*b(a|b|c)*a(a|b|c)*a(a|b|c)*
    | (a|b|c)*b(a|b|c)*a(a|b|c)*b(a|b|c)*a(a|b|c)*
    | (a|b|c)*b(a|b|c)*a(a|b|c)*a(a|b|c)*b(a|b|c)*
If we add ∩ to the regular expression operators:
    ((a|b|c)*a(a|b|c)*a(a|b|c)*) ∩ ((a|b|c)*b(a|b|c)*b(a|b|c)*)
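That the intersection of the two short expressions denotes the same language L2a2b as the six-way union can be checked by brute force on short strings. The sketch below uses Python's `re` module, which has no intersection operator, so the intersection is emulated by requiring two separate full matches; the helper names (`ordered`, `in_union`, `in_intersection`) are hypothetical and the check only covers strings up to a fixed length, so it illustrates the claim rather than proving it.

```python
import re
from itertools import product

ANY = "[abc]*"  # (a|b|c)* written as a character class

def ordered(*letters):
    """Regex text for strings containing `letters` in this order, anything in between."""
    return ANY + ANY.join(letters) + ANY

# Plain regular expression: one alternative per interleaving of two a's and two b's.
ORDERINGS = ["aabb", "abab", "abba", "bbaa", "baba", "baab"]
UNION = re.compile("(?:" + "|".join(ordered(*o) for o in ORDERINGS) + ")")

# With intersection: two independent constraints, checked separately.
TWO_A = re.compile(ordered("a", "a"))
TWO_B = re.compile(ordered("b", "b"))

def in_union(s):
    return UNION.fullmatch(s) is not None

def in_intersection(s):
    return TWO_A.fullmatch(s) is not None and TWO_B.fullmatch(s) is not None

# Compare both definitions on every string over {a, b, c} up to length 7.
MAX_LEN = 7
agree = all(
    in_union("".join(chars)) == in_intersection("".join(chars))
    for n in range(MAX_LEN + 1)
    for chars in product("abc", repeat=n)
)

print("definitions agree:", agree)
print("union pattern length:", len(UNION.pattern))
print("intersection pattern length:", len(TWO_A.pattern) + len(TWO_B.pattern))
```

The union needs one alternative per interleaving of the required letters, whereas the intersection only needs one expression per letter, which is the size gap the slide points at.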
In logical terms, conjunction offers a dramatic reduction in expression size.
If we now consider the ability to describe numerical constraints on the frequency of occurrences, we get another exponential reduction in size:
    ((a|b|c)*a(a|b|c)*)^2 ∩ ((a|b|c)*b(a|b|c)*)^2
Crucial when the complexity of the decision procedure depends on the formula size.

Further Perspectives: card(phi)=n
Querying all the articles with 4 or more authors
- Navigational XPath expression:
    article[author/following-sibling::author/following-sibling::author/following-sibling::author]
- or, using the counting operator in XPath:
    article[count(author)>=4]
→ The counting operator is exponentially more succinct
→ Again, we would like efficient static analyzers that directly operate on the succinct form (i.e. without paying the price of the blow-up)

Facts
Nominals + Backward modalities + card(phi)=n:
- undecidable over graphs [Bonatti-AI'04]
- decidable over finite trees
Ongoing research...
- What is the precise complexity for card(phi)=n for finite trees?
- ... or more generally of rich logical combinators that may duplicate formulas of arbitrary length (but in a particular manner)?
→ Hint: look at the factorization power of the Lean

Further Perspectives: Follow the Arrows
So far: logical description of structural constraints stemming from queries and schemas.
Can we also logically capture a notion of computation performed by programs (i.e. functions)?
For example, can the logic capture the type algebra on which CDuce sits?
τ ::= b          basic type
    | τ × τ      product type
    | τ → τ      function type
    | τ ∨ τ      union type
    | ¬τ         complement type
    | 0          empty type
    | v          recursion variable
    | µv.τ       recursive type
Yes. We interpret the type algebra in a purely logical manner...
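The type grammar above can be written down as a small algebraic datatype, which may help when reading it as an input language for a logical translation. This is only a sketch under assumptions: the class names (`Basic`, `Product`, `Arrow`, `Or`, `Not`, `Empty`, `Var`, `Mu`) and the helpers are hypothetical and do not come from CDuce or from the talk, and the sample basic types `nil` and `int` are placeholders. The derived intersection and top type use only the standard Boolean identities.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Basic:            # b
    name: str

@dataclass(frozen=True)
class Product:          # τ × τ
    left: object
    right: object

@dataclass(frozen=True)
class Arrow:            # τ → τ
    left: object
    right: object

@dataclass(frozen=True)
class Or:               # τ ∨ τ
    left: object
    right: object

@dataclass(frozen=True)
class Not:              # ¬τ
    arg: object

@dataclass(frozen=True)
class Empty:            # 0
    pass

@dataclass(frozen=True)
class Var:              # v
    name: str

@dataclass(frozen=True)
class Mu:               # µv.τ
    var: str
    body: object

def any_type():
    """Top type, derived as the complement of the empty type."""
    return Not(Empty())

def and_(s, t):
    """Intersection, derived from union and complement (De Morgan)."""
    return Not(Or(Not(s), Not(t)))

def free_vars(t):
    """Recursion variables of t that are not bound by an enclosing µ."""
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, Mu):
        return free_vars(t.body) - {t.var}
    if isinstance(t, (Product, Arrow, Or)):
        return free_vars(t.left) | free_vars(t.right)
    if isinstance(t, Not):
        return free_vars(t.arg)
    return set()  # Basic and Empty contain no variables

# Lists of integers as a recursive type: µv.(nil ∨ (int × v))
int_list = Mu("v", Or(Basic("nil"), Product(Basic("int"), Var("v"))))
print(free_vars(int_list))            # the recursive type is closed
print(free_vars(int_list.body))       # its body mentions v freely
```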
Further Perspectives: Follow the Arrows
Representing functions
f = {(d1, d1'), (d2, d2'), ...} models a function such that:
- f di may evaluate (nondeterministically) to di'
- f x where x ∉ {di} never terminates (and is well-typed)
- if di' = ERR then f di is a type error
Lemma (Frisch et al.): considering only finite such sets of pairs is sufficient for defining semantic subtyping.
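The three readings of such a set of pairs can be spelled out on a toy example. The sketch below is an illustration only: the sentinel values `ERR` and `DIVERGES`, the helper names `outcomes` and `is_type_error`, and the sample relation are all hypothetical, and divergence is represented by a marker value rather than by an actual non-terminating computation.

```python
ERR = "ERR"            # sentinel standing for a type error
DIVERGES = "diverges"  # marker standing for non-termination

def outcomes(f, x):
    """All results that applying the finite relation f to x may produce."""
    results = {out for (arg, out) in f if arg == x}
    # An argument outside the domain of f never yields a value.
    return results if results else {DIVERGES}

def is_type_error(f, x):
    """True if some pair for x has ERR as its second component."""
    return ERR in outcomes(f, x)

# A finite relation: 1 may evaluate to 2 or to 3, and applying it to 2 is a type error.
f = {(1, 2), (1, 3), (2, ERR)}

print(outcomes(f, 1))        # nondeterministic choice between two results
print(outcomes(f, 5))        # 5 is outside the domain
print(is_type_error(f, 2))
```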
