A Note on Nested Words

A Note on Nested Words Andreas Blass¤ Yuri Gurevichy Abstract For every regular language of nested words, the underlying strings form a context-free language, and every context-free language can be obtained in this way. Nested words and nested-word automata are gen- eralized to motley words and motley-word automata. Every motley- word automation is equivalent to a deterministic one. For every regular language of motley words, the underlying strings form a ¯nite intersection of context-free languages, and every ¯nite intersection of context-free languages can be obtained in this way. 1 Introduction In [1], Rajeev Alur and P. Madhusudan introduced and studied nested words, nested-word automata and regular languages of nested words. A nested word is a string of letters endowed with a so-called nested relation. Alur and Madhusudan prove that every nested-word automaton is equivalent to a deterministic one. In Section 2, we recall some of their de¯nitions. We quote from the introduction to [1]. \The motivating application area for our results has been soft- ware veri¯cation. Given a sequential program P with stack-based control ﬂow, the execution of P is modeled as a nested word with nesting edges from calls to returns. Speci¯cation of the program is given as a nested word automaton A, and veri¯cation corre- sponds to checking whether every nested word generated by P is ¤Math Dept, University of Michigan, Ann Arbor, MI 48109; [email protected] yMicrosoft Research, Redmond, WA 98052; [email protected] 1 accepted by A. Nested-word automata can express a variety of requirements such as stack-inspection properties, pre-post conditions, and interprocedural data-ﬂow properties." In Section 3, we show that all context-free properties, and only context- free properties, can be captured by a nested-word automaton in the following precise sense. For every regular language of nested words, the underlying strings form a context-free language, and every context-free language can be obtained in this way. We continue the quotation from [1]. \If we were to model program executions as words, all of these properties are non-regular, and hence inexpressible in classical speci¯cation languages based on temporal logics, automata, and ¯xpoint calculi (recall that context-free languages cannot be used as speci¯cation languages due to nonclosure under intersection and undecidability of key decision problems such as language in- clusion)." Intersections of context-free languages naturally arise in applications. Think of multi-threaded programs for example. In Section 4, we general- ize nested words to motley words: strings with several nested relations. We introduce motley-word automata and use the nested-word-automaton deter- minization procedure to show that every motley-word automaton is equivalent to a deterministic one. We show that every ¯nite intersection (that is the intersections of ¯nitely many) of context-free languages, and only such intersections, can be captured by a motley-word automaton in the following precise sense. For every regular (that is accepted by a motley-word automaton) language of motley words, the language of underlying strings is a ¯nite intersection of context-free languages, and every intersection of context-free languages can be obtained in this way. 2 2 Preliminaries To make this note self-contained, we recapitulate here some de¯nitions from [1]. 2.1 Nested words A nested relation º over f1; 2; : : : ; kg is a binary relation satisfying the following condition: if º(i; i0) and º(j; j0) and i · j then either i < i0 < j < j0 or else i < j < j0 < i0. If º(i; j) then i is a call position and the call-predecessor of j, and j is a return position and the return-successor of i. A nested word over an alphabet § is a pair (a1 : : : ak; º) where each ai 2 § and º is a nested relation over f1; 2; : : : ; kg. A position i that is neither a call position nor a return position is an internal position of w. NW(§) is the set of nested §-words. A language of nested §-words is a subset of NW(§). 2.2 NW automata An nw automaton A over an alphabet § is a quadruple (Q; Qin; ±; Qf ) similar to a usual nondeterministic automaton with set Q of states, set Qin ⊆ Q of initial states and set Qf ⊆ Q of ¯nal states, except that ± is not a single transition relation but a triple h±c; ±i; ±ri where ² ±c ⊆ Q £ § £ Q is the transition relation for call positions, ² ±i ⊆ Q £ § £ Q is the transition relation for internal positions, and ² ±r ⊆ Q £ Q £ § £ Q is the transition relation for return positions. The automaton A is deterministic if it satis¯es the following conditions: - There is exactly one initial state. - ±c is the relational form of a function from Q £ § to Q. In other words, 0 0 if (p; a; q) and (p; a; q ) belong to ±c then q = q . - ±i is the relational form of a function from Q £ § to Q. - ±c is the relational form of a function from Q £ Q £ § to Q. 3 A run of A over an nested word (a1 : : : ak; º) is a sequence q0; q1; : : : ; qk of states such that q0 is an initial state, if i is a call position then (qi¡1; ai; qi) 2 ±c, if i is an internal position then (qi¡1; ai; qi) 2 ±i, and if i is a return position with call-predecessor j then (qi¡1; qj¡1; ai; qi) 2 ±r. The run q0; q1; : : : ; qk is accepting if qk is a ¯nal state. A accepts a word w if it has an accepting run on w. The language L(A) of A is the set of nested words accepted by A. A language (that is a set) of nested words is regular if there is an nw automaton A such that L = L(A). Two nw automata A; B are equivalent if L(A) = L(B). Alur and Madhusudan prove that every nw automaton A is equivalent to a deterministic nw automaton [1, Theorem 1]. 3 Nested-Word Automata and Context-Free Languages In this section, we show that, ² for every regular language of nested words, the underlying strings form a context-free language, and ² every context-free language can be obtained in this way. De¯nition 1. A is call explicit if ±c and ±i are disjoint. Lemma 2. For every nested-word automaton A = (Q; Qin; ±; Qf ), there is a call-explicit nw automaton A0 such that L(A0) = L(A). Furthermore, if A is deterministic then so is A0. 0 0 0 0 0 0 0 0 Proof. The desired A = (Q ;Qin; ± ;Qf ) where Q , Qin and Qf are Q£f0; 1g, Qin £ f0g and Qf £ f0g respectively. Further: 0 ±c = f((p; d); a; (q; 1)) : (p; a; q) 2 ±c ^ d 2 f0; 1gg 0 ±i = f((p; d); a; (q; 0)) : (p; a; q) 2 ±i ^ d 2 f0; 1gg 0 ±r = f((p1; d1); (p2; d2); a; (q; 0)) : (p1; p2; a; q) 2 ±r ^ d1; d2 2 f0; 1gg 0 0 0 Clearly ±c and ±i are disjoint, and A is deterministic if A is so. Every accepting run q0; : : : ; qk of A on (a1 : : : ak; º) gives rise to an accepting run (q0; d0);:::; (qk; dk) where di = 1 if and only if i is a call position. And 0 every accepting run (q0; d0);:::; (qk; dk) of A on (a1 : : : ak; º) gives rise to an accepting run q0; : : : ; qk of A. 4 De¯nition 3. The projection P (w) of a nested word w = (x; º) is the string x. The projection P (L) of a language L of nested words is the language fP (w): w 2 Lg. Proposition 4. The projection of a regular nw language is context-free. Proof. Let L = L(A) where A is an nw automaton (Q; Qin; ±; Qf ). We con- struct a (nondeterministic) pushdown automaton B with L(B) = P (L) that accepts on empty stack and ¯nal state. The sets of states, initial states and ¯nal states of B are Q, Qin and Qf respectively. The stack alphabet of B is Q[f?g where ? is the bottom-of-the-stack symbol. The transition function ¢ = ¢c [ ¢i [ ¢r where the components are as follows. 0 0 ¢i = f(p; a; p ; q):(p; a; q) 2 ±i ^ p 2 Qg 0 0 ¢c = f(p; a; p ; q; +p):(p; a; q) 2 ±c ^ p 2 Qg 0 0 0 ¢r = f(p; a; p ; q; ¡):(p; p ; a; q) 2 ±r ^ p 2 Qg The intended meaning of an instruction (p; a; p0; q) is this: if the current state is p, the input symbol is a and the top stack symbol is p0 then to go state q (and move one-letter to the right on the input word). The additional \+p" means: push p onto the stack. The additional \¡" means: pop the stack. An accepting run q0; : : : ; qk of A on (a1 : : : a`; º) gives rise to an accepting run (q0;U0) ::: (qk;Uk) of B on a1 : : : ak where U0;:::;Uk are stack contents de¯ned inductively. U0 = ?. Let i > 0. If i is an internal position then Ui = Ui¡1. If i is a call position then Ui = qi¡1Ui¡1. Suppose that u is a return position with call-predecessor j. The number of call positions < i exceeds the number of return positions < i, and so Ui¡1 6= ?. Ui is obtained from Ui¡1 by popping the top symbol.

A Note on Nested Words

Details

Download

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

Support