Language, Automata and Logic for Finite Trees

Olivier Gauwin (UMons), Feb/March 2010

Languages, Automata, Logic

[Diagram relating Automata, Logic, Grammars / Expressions, and Complexity]

Example for regular word languages: Σ∗.a.b.Σ∗

Automata: [Figure: word automaton; recoverable labels: S, A, B and transitions labelled a, b]

Logic: ∃x. ∃y. lab_a(x) ∧ lab_b(y) ∧ succ(x, y)

Grammars / Expressions:
  S → aS    S → bS    S → aB
  B → bX
  X → aX    X → bX    X → ε

Example for regular tree languages: all trees with an a-node having a b-child

Automata:
  (S, X) → S     (X, S) → S
  a(B, X) → S    a(X, B) → S
  ...
  b → B
  ...

Logic: ∃x. ∃y. lab_a(x) ∧ lab_b(y) ∧ child(x, y)

Grammars / Expressions:
  S → (S, X)     S → (X, S)
  S → a(B, X)    S → a(X, B)
  B → b(X, X)    B → b
  X → (X, X)     X →

References

The main reference for this talk is the TATA book [CDG+07]: Tree Automata, Techniques and Applications, by Hubert Comon, Max Dauchet, Rémi Gilleron, Christof Löding, Florent Jacquemard, Denis Lugiez, Sophie Tison, and Marc Tommasi. Other references will be mentioned progressively.
Outline

1 Ranked Trees
    Trees on Ranked Alphabet
    Tree Automata
    Tree Grammars
    Logic
2 Unranked Trees
    Unranked Trees
    Automata
    Logic

Trees on Ranked Alphabet

Ranked alphabet
Ranked alphabet = finite alphabet + arity function.
  Σr:  Σ = {a, b, c},  ar : Σ → N,  ar(a) = 2,  ar(b) = 2,  ar(c) = 0

Ranked trees over Σr
T_Σr, the set of ranked trees, is the smallest set of terms f(t1, ..., tk) such that f ∈ Σr, k = ar(f), and ti ∈ T_Σr for all 1 ≤ i ≤ k.

Example tree: b(a(b(c, c), c), b(c, c))

A tree language T is a set of trees: T ⊆ T_Σr.

Trees as relational structures

We will sometimes consider a ranked tree t as a relational structure
  t = (nodes^t, {lab_a^t, lab_b^t, lab_c^t, ch_1^t, ch_2^t}).

For the example tree, with nodes addressed by their path from the root ε:
  lab_a^t = {1}
  lab_b^t = {ε, 1.1, 2}
  lab_c^t = {1.1.1, 1.1.2, 1.2, 2.1, 2.2}
  ch_1^t = {(ε, 1), (1, 1.1), (2, 2.1), ...}
  ch_2^t = {(ε, 2), (1, 1.2), (2, 2.2), ...}

For convenience we write lab(π) for the label of node π.

Tree Automata over Ranked Trees

Definition
A tree automaton (TA) over Σr = (Σ, ar) is a tuple A = (Q, F, ∆, Σr) where:
  Q is a finite set of states,
  F ⊆ Q is a set of final states,
  ∆ is a set of rules of type a(q1, ..., qk) → q, with a ∈ Σ, k = ar(a) and q, q1, ..., qk ∈ Q.

Runs
A run ρ of A on t is a function ρ : nodes^t → Q such that: if π ∈ nodes^t has children π1, ..., πk and label a, then a(ρ(π1), ..., ρ(πk)) → ρ(π) ∈ ∆.

Bottom-up vs top-down

Bottom-up view
a(q1, ..., qk) → q is a bottom-up point of view.

Rules:
  a(qS, qX) → qS    a(qX, qS) → qS
  b(qS, qX) → qS    b(qX, qS) → qS
  a(qB, qX) → qS    a(qX, qB) → qS
  b(qX, qX) → qB
  a(qX, qX) → qX    b(qX, qX) → qX
  c → qX

Run built step by step on the example tree b(a(b(c, c), c), b(c, c)), from the leaves up to the root:
  ρ(1.1.1) = qX    ρ(1.1.2) = qX
  ρ(1.1) = qB      ρ(1.2) = qX    ρ(2.1) = qX    ρ(2.2) = qX
  ρ(1) = qS        ρ(2) = qX
  ρ(ε) = qS

A run ρ of A on t is accepting if ρ(ε) ∈ F.
L(A) = {t | there exists an accepting run ρ of A on t}

Top-down view
We could have written rules this way: a(q) → (q1, ..., qk), and name F the initial states. This corresponds to a top-down definition.

Initial: {qS}

Rules:
  a(qS) → (qS, qX)    a(qS) → (qX, qS)
  b(qS) → (qS, qX)    b(qS) → (qX, qS)
  a(qS) → (qB, qX)    a(qS) → (qX, qB)
  b(qB) → (qX, qX)
  a(qX) → (qX, qX)    b(qX) → (qX, qX)
  c(qX)

Top-down run on the example tree, as far as shown: the root b receives qS, then its children a and b receive qS and qX.
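The bottom-up reading of the rules can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: trees are encoded as (label, children) tuples, the helper names `node`, `add_rule`, `reachable_states` and `accepts` are hypothetical and not part of the slides, and F = {qS} is taken from the set of initial states named in the top-down view. Because the automaton is nondeterministic, the sketch computes, for each subtree, the set of all states that some run can assign to its root.

```python
from itertools import product

# Ranked alphabet from the slides: a and b are binary, c is a constant.
ARITY = {"a": 2, "b": 2, "c": 0}

# Bottom-up rules label(q1, ..., qk) -> q, stored as
# (label, tuple of child states) -> set of target states.
RULES = {}

def add_rule(label, children, target):
    RULES.setdefault((label, tuple(children)), set()).add(target)

for label in ("a", "b"):
    add_rule(label, ("qS", "qX"), "qS")
    add_rule(label, ("qX", "qS"), "qS")
    add_rule(label, ("qX", "qX"), "qX")
add_rule("a", ("qB", "qX"), "qS")
add_rule("a", ("qX", "qB"), "qS")
add_rule("b", ("qX", "qX"), "qB")
add_rule("c", (), "qX")

FINAL = {"qS"}  # assumption: F is the set named "Initial" in the top-down view

def node(label, *children):
    """Build a tree as a (label, children) tuple (illustrative encoding)."""
    return (label, children)

def reachable_states(tree):
    """All states q such that some run on `tree` labels its root with q."""
    label, children = tree
    assert len(children) == ARITY[label], "arity mismatch"
    child_sets = [reachable_states(child) for child in children]
    states = set()
    # Try every combination of states reachable at the children.
    for combo in product(*child_sets):
        states |= RULES.get((label, combo), set())
    return states

def accepts(tree):
    """True if there is an accepting run, i.e. a final state at the root."""
    return bool(reachable_states(tree) & FINAL)

c = node("c")
example = node("b", node("a", node("b", c, c), c), node("b", c, c))
print(accepts(example))
```

On the example tree the a-node at position 1 has a b-child at position 1.1, so a run can reach qS at the root. A tree such as b(c, c) has no a-node at all and is rejected.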
