Generating Context-Free Grammars Using Classical Planning
Total Page:16
File Type:pdf, Size:1020Kb
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI-17) Generating Context-Free Grammars using Classical Planning Javier Segovia-Aguas1, Sergio Jimenez´ 2, Anders Jonsson 1 1 Universitat Pompeu Fabra, Barcelona, Spain 2 University of Melbourne, Parkville, Australia [email protected], [email protected], [email protected] Abstract S ! aSa S This paper presents a novel approach for generating S ! bSb /|\ Context-Free Grammars (CFGs) from small sets of S ! a S a /|\ input strings (a single input string in some cases). a S a Our approach is to compile this task into a classical /|\ planning problem whose solutions are sequences b S b of actions that build and validate a CFG compli- | ant with the input strings. In addition, we show that our compilation is suitable for implementing the two canonical tasks for CFGs, string produc- (a) (b) tion and string recognition. Figure 1: (a) Example of a context-free grammar; (b) the corre- sponding parse tree for the string aabbaa. 1 Introduction A formal grammar is a set of symbols and rules that describe symbols in the grammar and (2), a bounded maximum size of how to form the strings of certain formal language. Usually the rules in the grammar (i.e. a maximum number of symbols two tasks are defined over formal grammars: in the right-hand side of the grammar rules). Our approach is compiling this inductive learning task into • Production : Given a formal grammar, generate strings a classical planning task whose solutions are sequences of ac- that belong to the language represented by the grammar. tions that build and validate a CFG compliant with the input • Recognition (also known as parsing): Given a formal strings. The reported empirical results show that our compi- grammar and a string, determine whether the string be- lation can generate CFGs from small amounts of input data longs to the language represented by the grammar. (even a single input string in some cases) using an off-the- Chomsky defined four classes of formal grammars that dif- shelf classical planner. In addition, we show that the com- fer in the form and generative capacity of their rules [Chom- pilation is also suitable for implementing the two canonical sky, 2002]. In this paper we focus on Context-Free Grammars tasks of CFGs, string production and string recognition. (CFGs), where the left-hand side of a grammar rule is always a single non-terminal symbol. This means that the symbols 2 Background on the left-hand side of CFG rules do not appear in the strings This section defines the formalization of CFGs and the plan- that belong to the corresponding language. ning models that we adopt in this work. To illustrate this Figure 1(a) shows an example CFG that contains a single non-terminal symbol, S, and three terminal 2.1 Context-Free Grammars symbols (a, b and , where denotes the empty string). This We define a CFG as a tuple G = hV; v0; Σ;Ri where: CFG defines three production rules that can generate, for in- • V is the finite set of non-terminal symbols, also called stance, the string aabbaa by applying the first rule twice, then variables. Each v 2 V represents a sub-language of the the second rule once and finally, the third rule once again. language defined by the grammar. The parse tree in Figure 1(b) exemplifies the previous rule application and proves that the string aabbaa belongs to the • v0 2 V is the start non-terminal symbol that represents language defined by the grammar. the whole grammar. Learning the entire class of CFGs using only positive ex- • Σ is the finite set of terminal symbols, which are disjoint amples, i.e. strings that belong to the corresponding formal from the set of non-terminal symbols, i.e. V \ Σ 6= ;. language, is not identifiable in the limit [Gold, 1967]. In this The set of terminal symbols is the alphabet of the lan- paper we address the generation of CFGs from positive exam- guage defined by G and can contain the empty string, ples but: (1) for a bounded maximum number of non-terminal which we denote by 2 Σ. 4391 Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI-17) • R : V ! (V [Σ)∗ is the finite set of production rules in cond(a). Each conditional effect C B E 2 cond(a) is com- the grammar. By definition rules r 2 R always contain posed of two sets of literals C 2 L(F ) (the condition) and a single non-terminal symbol on the left-hand side. E 2 L(F ) (the effect). Figure 1(a) shows a 1-variable CFG since it defines a single An action a 2 A is applicable in state s if and only if pre(a) ⊆ s, and the resulting set of triggered effects is non-terminal symbol. In this example V = fSg, v0 = S, Σ = fa; b; g, and R = fS ! aSa; S ! bSb; S ! g. [ ∗ eff(s; a) = E; For any two strings e1; e2 2 (V [ Σ) we say that e1 di- CBE2cond(a);C⊆s rectly yields e2, denoted by e1 ) e2, iff there exists a rule α ! β 2 R such that e1 = u1αu2 and e2 = u1βu2 with ∗ i.e. effects whose conditions hold in s. The result of applying u1; u2 2 (V [ Σ) . Furthermore we say e1 yields e2, de- ∗ a in s is a new state θ(s; a) = (s n :eff(s; a)) [ eff(s; a). noted by e1 ) e2, iff 9k ≥ 0 and 9u1; : : : ; uk such that Given a frame Φ = hF; Ai, a classical planning instance e1 = u1 ) u2 ) ::: ) uk = e2. For instance, Fig- is a tuple P = hF; A; I; Gi, where I 2 L(F ) is an initial ure 1(b) shows how the string S yields the string aabbaa. ∗ ∗ state (i.e. jIj = jF j) and G 2 L(F ) is a goal condition. We The language of a CFG, L(G) = fe 2 Σ : v0 ) eg, is consider the fragment of classical planning with conditional the set of strings that contain only terminal symbols and that effects that includes negative conditions and goals. can be yielded from the string that contains only the initial A plan for P is an action sequence π = ha1; : : : ; ani that non-terminal symbol v0. induces a state sequence hs ; s ; : : : ; s i such that s = I Given a CFG G and a string e 2 L(G) that belongs to its 0 1 n 0 and, for each i such that 1 ≤ i ≤ n, ai is applicable in si−1 language, we define a parse tree tG;e as an ordered, rooted and generates the successor state s = θ(s ; a ). The plan tree that determines a concrete syntactic structure of e ac- i i−1 i π solves P if and only if G ⊆ sn, i.e. if the goal condition is cording to the rules in G: satisfied following the application of π in I. • Each node in a parse tree tG;e is either: 2.3 Generalized Planning – An internal node that corresponds to the applica- tion of a rule r 2 R. Our approach for learning CFGs from positive examples is – A leaf node that corresponds to a terminal symbol to model this task as a generalized planning problem that is σ 2 Σ and has no outgoing branches. eventually solved with an off-the-shelf classical planner using the compilation proposed by Jimenez´ and Jonsson (2015). • Edges in a parse tree tG;e connect non-terminal symbols We define a generalized planning problem as a finite to terminal or non-terminal symbols following the rules set of classical planning instances P = fP1;:::;PT g R in G. within the same planning frame Φ. Therefore, P1 = hF; A; I1;G1i;:::;PT = hF; A; IT ;GT i share the same flu- 2.2 Classical Planning with Conditional Effects ents and actions and differ only in the initial state and goals. We use the model of classical planning with conditional ef- Although actions are shared, each action can have differ- fects because conditional effects allow to adapt the execution ent interpretations in different states due to conditional ef- of a sequence of actions to different initial states, as shown in fects. Given a planning frame Φ = hF; Ai, then Γ(Φ) = conformant planning [Palacios and Geffner, 2009]. The sup- fhF; A; I; Gi : I 2 L(F ); jIj = jF j;G 2 L(F )g denotes the port of conditional effects is now a requirement of the Inter- set of all planning instances that can be instantiated from Φ national Planning Competition [Vallati et al., 2015] and most by defining an initial state I and a goal condition G. current classical planners natively cope with conditional ef- A solution Π to a generalized planning problem P is a fects without compiling them away [Helmert, 2006]. generalized plan that solves every individual instance Pt, We use F to denote a set of propositional variables or flu- 1 ≤ t ≤ T . Generalized plans may have diverse forms ents describing a state. A literal l is a valuation of a fluent that range from DS-planners [Winner and Veloso, 2003] and f 2 F , i.e. l = f or l = :f. A set of literals L on F rep- generalized polices [Mart´ın and Geffner, 2004] to finite state resents a partial assignment of values to fluents (WLOG we machines [Bonet et al., 2010; Segovia-Aguas et al., 2016b], assume that L does not assign conflicting values to any flu- AND/OR graphs or CFGs [Ramirez and Geffner, 2016].