Context Free Languages
COMP2600: Formal Methods for Software Engineering
Katya Lebedeva
Australian National University
Semester 2, 2016
Slides by Katya Lebedeva and Ranald Clouston.

Ambiguity

The definition of context-free (CF) grammars allows a sentence to have more than one structure. This ambiguity may make the meaning of a sentence unclear.

A context-free grammar G is unambiguous iff every string has at most one parse tree in G.

G is ambiguous iff there exists a word w ∈ L(G) that has more than one parse tree.

Dangling else

Take the code

    if e1 then if e2 then s1 else s2

where e1, e2 are boolean expressions and s1, s2 are subprograms. Does this mean

    if e1 then (if e2 then s1 else s2)

or

    if e1 then (if e2 then s1) else s2

The dangling else is a problem in computer programming in which an optional else clause in an "if-then(-else)" statement makes nested conditionals ambiguous.

This problem often comes up in compiler construction, especially in parsing.

Inherently Ambiguous Languages

Not all context-free languages can be given unambiguous grammars; some are inherently ambiguous.

Consider the language

    $L = \{a^i b^j c^k \mid i = j \text{ or } j = k\}$

How do we know that this is context-free? First, notice that

    $L = \{a^i b^i c^k\} \cup \{a^i b^j c^j\}$

We then combine CFGs for each side of this union (a standard trick):

    S → T | W
    T → U V            W → X Y
    U → a U b | ε      X → a X | ε
    V → c V | ε        Y → b Y c | ε

The problem with L is that its sub-languages $\{a^i b^i c^k\}$ and $\{a^i b^j c^j\}$ have a non-empty intersection. If i = j = k in a string w, then w contains the same number of a's, b's, and c's, and we cannot say by which rules of the grammar w was generated.

Consider, for example, w = abc. It has one parse tree via T:

          S
          |
          T
        /   \
       U     V
     / | \   | \
    a  U  b  c  V
       |        |
       ε        ε

and another via W:

          S
          |
          W
        /   \
       X     Y
      / \  / | \
     a   X b Y  c
         |   |
         ε   ε

Language L is inherently ambiguous: every grammar that generates L is ambiguous. (The two trees above show only that this particular grammar is ambiguous; that no unambiguous grammar for L exists is a known result that needs a separate proof.)

The Problem of Incomputability

Ideally, there would be a procedure that turns an ambiguous grammar into an equivalent unambiguous one (i.e. a procedure of disambiguation).

However, this problem is incomputable. There is no general algorithm that can disambiguate a given ambiguous grammar.

Moreover, determining whether a grammar is ambiguous in the first place is also incomputable!

The existing algorithms for ambiguity detection can only determine whether a given string (or a finite set of strings) is ambiguous with respect to a given grammar.

Approaches for disambiguating a grammar

A Question of (Non-)associativity

Consider the grammar

    S → S − S | int

where int could be any integer and "−" is interpreted as subtraction.

This grammar is ambiguous. The string 5 − 3 − 1 has two parse trees:

          S                    S
        / | \                / | \
       S  −  1              5  −  S
     / | \                      / | \
    5  −  3                    3  −  1

The left tree evaluates 5 − 3 − 1 as (5 − 3) − 1 = 1. The right tree evaluates the same string as 5 − (3 − 1) = 3.

We would like "−" to be left-associative, i.e. we want 5 − 3 − 1 to be parsed as the left tree and interpreted as 1.

To achieve this we force the operator "−" to associate to the left:

    S → S − int | int

If we wanted our grammar to be right-associative, we would disambiguate the production rules as follows:

    S → int − S | int

Idea: break the symmetry.

A Question of Precedence

Consider the grammar

    S → S ∗ S | S + S | int

where ∗ is to be interpreted as multiplication and + as addition.

The grammar is ambiguous: 1 + 2 ∗ 3 could evaluate to 7 or 9.

We want ∗ to have higher precedence than +. Thus we redefine our grammar as follows:

    S → S + T | T
    T → T ∗ int | int

Given a string 1 + 2 ∗ 3, or 2 ∗ 3 + 1, we have no choice but to expand S to S + T first, so that (thinking bottom-up in the tree) + will be the last operation to be executed.

Suppose we tried to derive 1 + 2 ∗ 3 by first doing S ⇒ T ⇒ T ∗ 3. We are then stuck, because we cannot derive 1 + 2 from T!

As with associativity, this trick consists of breaking symmetry.

Just an Example

    S → S + T | S − T | T
    T → T ∗ U | T / U | U
    U → (S) | int

This grammar provides brackets. Note that the language defined by this grammar is context-free but not regular, because recognising it requires keeping track of bracket balancing to an arbitrary depth.
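
To see how the layered grammar with S, T and U fixes both associativity and precedence, here is a minimal sketch of an evaluator that follows the three nonterminals one-to-one. It is an illustration, not part of the original slides. The names `tokenize` and `evaluate` are hypothetical, and two assumptions are made: the left-recursive rules (S producing S + T, T producing T ∗ U) are realised as loops, a standard way to keep left-associativity in a recursive-descent parser, and "/" is treated as ordinary (non-integer) division.

```python
import re

# One token is either an integer literal or one of the symbols + - * / ( ).
TOKEN = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    """Split the input into integer literals and operator/bracket symbols."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(text):
    """Evaluate an expression according to the S/T/U grammar."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def parse_s():
        # S -> S + T | S - T | T, written as "T, then repeatedly (+|-) T".
        # Folding the running value from the left makes + and - left-associative.
        value = parse_t()
        while peek() in ("+", "-"):
            op = advance()
            right = parse_t()
            value = value + right if op == "+" else value - right
        return value

    def parse_t():
        # T -> T * U | T / U | U, handled the same way one level lower,
        # so * and / bind tighter than + and -.
        value = parse_u()
        while peek() in ("*", "/"):
            op = advance()
            right = parse_u()
            value = value * right if op == "*" else value / right  # assumption: true division
        return value

    def parse_u():
        # U -> ( S ) | int
        tok = advance()
        if tok == "(":
            value = parse_s()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if tok is not None and tok.isdigit():
            return int(tok)
        raise SyntaxError(f"unexpected token {tok!r}")

    result = parse_s()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return result
```

With this sketch, `evaluate("5 - 3 - 1")` gives 1 (the left-associative reading), `evaluate("1 + 2 * 3")` gives 7, and `evaluate("(1 + 2) * 3")` gives 9. The recursive call from `parse_u` back into `parse_s` is what handles brackets nested to arbitrary depth.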