Context Free Languages
COMP2600 — Formal Methods for Software Engineering
Katya Lebedeva
Australian National University Semester 2, 2016
Slides by Katya Lebedeva and Ranald Clouston.
COMP 2600 — Context Free Languages
Ambiguity
The definition of CF grammars allows for the possibility of more than one structure for a given sentence. This ambiguity may make the meaning of a sentence unclear.
A context-free grammar G is unambiguous iff every word w ∈ L(G) has exactly one parse tree.
G is ambiguous iff there exists a word w ∈ L(G) that has more than one parse tree.
Dangling else
Take the code
    if e1 then if e2 then s1 else s2
where e1, e2 are boolean expressions and s1, s2 are subprograms.
Does this mean
    if e1 then (if e2 then s1 else s2)
or
    if e1 then (if e2 then s1) else s2 ?
The dangling else is a problem in computer programming in which an optional else clause in an "if-then(-else)" statement makes nested conditionals ambiguous.
This is a problem that often comes up in compiler construction, especially parsing.
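One common way to resolve the dangling else (a convention assumed here, not stated in these slides) is to attach each else to the nearest if that does not yet have one. The sketch below is a minimal recursive-descent parser in Python; the token format and the tuple shape of the tree are hypothetical choices made for illustration.

```python
def parse_stmt(tokens, pos=0):
    """Parse one statement starting at tokens[pos]; return (tree, next_pos)."""
    if tokens[pos] == "if":
        cond = tokens[pos + 1]            # assumption: a condition is one token, e.g. "e1"
        assert tokens[pos + 2] == "then"
        then_branch, pos = parse_stmt(tokens, pos + 3)
        # Greedy choice: an 'else' seen here is claimed by this innermost 'if'.
        if pos < len(tokens) and tokens[pos] == "else":
            else_branch, pos = parse_stmt(tokens, pos + 1)
            return ("if", cond, then_branch, else_branch), pos
        return ("if", cond, then_branch, None), pos
    # Anything else is treated as an atomic subprogram such as "s1".
    return tokens[pos], pos + 1


tokens = "if e1 then if e2 then s1 else s2".split()
tree, end = parse_stmt(tokens)
print(tree)
```

With this choice the example is read as if e1 then (if e2 then s1 else s2): the else belongs to the inner if, and the outer if has no else branch.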
Inherently Ambiguous Languages
Not all context-free languages can be given unambiguous grammars – some are inherently ambiguous. Consider the language
L = {a^i b^j c^k | i = j or j = k}
How do we know that this is context-free? First, notice that
L = {a^i b^i c^k} ∪ {a^i b^j c^j}
We then combine CFGs for each side of this union (a standard trick):
S → T | W
T → UV            W → XY
U → aUb | ε       X → aX | ε
V → cV | ε        Y → bYc | ε
The problem with L is that its sub-languages {a^i b^i c^k} and {a^i b^j c^j} have a non-empty intersection. If i = j = k in a string w, then w contains the same number of a's, b's, and c's, and we cannot say by which rules of the grammar w was generated. Consider, for example, w = abc.
Parse tree via T:  S[ T[ U[ a U[ε] b ] V[ c V[ε] ] ] ]
Parse tree via W:  S[ W[ X[ a X[ε] ] Y[ b Y[ε] c ] ] ]
Language L is inherently ambiguous: every grammar that generates L is ambiguous.
The Problem of Incomputability
To guarantee unambiguity in general, we would need a procedure that turns any ambiguous grammar into an equivalent unambiguous one (i.e. a procedure of disambiguation).
However, this problem is incomputable: there is no general algorithm that can disambiguate a given ambiguous grammar.
Moreover, determining whether a grammar is ambiguous or not in the first place is also incomputable!
The existing algorithms for ambiguity detection can only determine whether a given string (or a finite set of strings) is ambiguous with respect to a given grammar.
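Checking a single string against a grammar can be sketched as counting its parse trees. The Python code below is a minimal illustration, not one of the existing algorithms referred to above. It assumes the grammar has no ε-productions and no cycles of unit productions (otherwise the recursion would not terminate), and the dictionary encoding of the grammar is a hypothetical choice.

```python
from functools import lru_cache


def count_parse_trees(grammar, start, tokens):
    """Count parse trees of `tokens` from `start`.

    `grammar` maps each nonterminal to a list of right-hand sides (tuples of
    symbols). Any symbol that is not a key of `grammar` is a terminal.
    Assumes no epsilon-productions and no cyclic unit productions.
    """
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count_symbol(sym, i, j):
        """Number of trees rooted at sym whose leaves are tokens[i:j]."""
        if sym not in grammar:  # terminal: must match exactly one token
            return 1 if j == i + 1 and tokens[i] == sym else 0
        return sum(count_seq(rhs, i, j) for rhs in grammar[sym])

    @lru_cache(maxsize=None)
    def count_seq(rhs, i, j):
        """Number of ways the symbol sequence rhs yields tokens[i:j]."""
        if len(rhs) == 1:
            return count_symbol(rhs[0], i, j)
        total = 0
        # The first symbol takes tokens[i:k]; each remaining symbol needs >= 1 token.
        for k in range(i + 1, j - len(rhs) + 2):
            left = count_symbol(rhs[0], i, k)
            if left:
                total += left * count_seq(rhs[1:], k, j)
        return total

    return count_symbol(start, 0, len(tokens))


# ASCII "-" stands in for the minus sign; "int" stands for any integer token.
ambiguous = {"S": [("S", "-", "S"), ("int",)]}
left_assoc = {"S": [("S", "-", "int"), ("int",)]}
sentence = ["int", "-", "int", "-", "int"]

n_ambiguous = count_parse_trees(ambiguous, "S", sentence)
n_left = count_parse_trees(left_assoc, "S", sentence)
print(n_ambiguous, n_left)
```

A count greater than 1 shows that the string is ambiguous with respect to the grammar. A count of 1 says nothing about other strings, which is why this kind of check cannot establish that a grammar is unambiguous.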
Approaches for disambiguating a grammar
A Question of (Non-)associativity
Consider the grammar S → S − S | int
where int could be any integer and ‘−’ is interpreted as subtraction.
This grammar is ambiguous:
Left tree:   S[ S[ 5 − 3 ] − 1 ]
Right tree:  S[ 5 − S[ 3 − 1 ] ]
The left tree evaluates 5 − 3 − 1 to 1. The right tree evaluates the same string to 3.
We would like “−” to be left-associative, i.e. we want 5 − 3 − 1 to be parsed as the left tree and interpreted as 1.
To achieve this we force the operator “−” to associate to the left:
S → S − int | int
If we wanted our grammar to be right associative, we would disambiguate the production rules as follows:
S → int − S | int
Idea: Break the symmetry.
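The effect of the two disambiguated grammars on evaluation can be sketched in Python. The function names are hypothetical; each one mirrors the shape of one of the two grammars.

```python
from functools import reduce


def eval_left(nums):
    """Evaluate n1 - n2 - n3 ... as ((n1 - n2) - n3) ..., as in S → S − int | int."""
    return reduce(lambda acc, n: acc - n, nums)


def eval_right(nums):
    """Evaluate n1 - n2 - n3 ... as n1 - (n2 - (n3 ...)), as in S → int − S | int."""
    if len(nums) == 1:
        return nums[0]
    return nums[0] - eval_right(nums[1:])


print(eval_left([5, 3, 1]))   # (5 - 3) - 1 = 1
print(eval_right([5, 3, 1]))  # 5 - (3 - 1) = 3
```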
A Question of Precedence
Consider the grammar
S → S ∗ S | S + S | int
where ∗ is to be interpreted as multiplication and + as addition.
The grammar is ambiguous: 1 + 2 ∗ 3 could evaluate to 7 or 9.
We want ∗ to have higher precedence than +. Thus we redefine our grammar as follows:
S → S + T | T
T → T ∗ int | int
Given a string 1 + 2 ∗ 3, or 2 ∗ 3 + 1, we have no choice but to expand S to S + T first, so that (thinking bottom-up in the tree) + will be the last operation to be executed.
Suppose we tried to derive 1 + 2 ∗ 3 by first doing S ⇒ T ⇒ T ∗ 3. We are then stuck because we cannot substitute T with 1 + 2!
As with associativity, this trick consists of breaking symmetry.
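The layered grammar maps naturally onto a small evaluator with one function per nonterminal. The Python sketch below is only an illustration: a naive recursive-descent parser would loop forever on the left-recursive rules, so each of them is replaced by an equivalent loop (T is read as int (∗ int)∗ and S as T (+ T)∗), which keeps both operators left-associative. All function names are hypothetical, and ASCII * stands in for ∗.

```python
import re


def tokenize(text):
    """Split into integer and operator tokens (other characters are ignored)."""
    return re.findall(r"\d+|[+*]", text)


def parse_T(tokens, pos):
    """T → T * int | int, read iteratively as int (* int)*."""
    value = int(tokens[pos])
    pos += 1
    while pos < len(tokens) and tokens[pos] == "*":
        value *= int(tokens[pos + 1])
        pos += 2
    return value, pos


def parse_S(tokens, pos=0):
    """S → S + T | T, read iteratively as T (+ T)*."""
    value, pos = parse_T(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "+":
        rhs, pos = parse_T(tokens, pos + 1)
        value += rhs
    return value, pos


def evaluate(text):
    tokens = tokenize(text)
    value, pos = parse_S(tokens)
    if pos != len(tokens):
        raise ValueError("unexpected token: " + tokens[pos])
    return value


print(evaluate("1 + 2 * 3"))
print(evaluate("2 * 3 + 1"))
```

Because parse_S can only combine complete T values, every multiplication is finished before any addition is applied, so both inputs evaluate to 7.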
Just an Example
S → S + T | S − T | T
T → T ∗ U | T / U | U
U → (S) | int
This grammar provides brackets.
Note that the language defined by this grammar is context-free but not regular, because of the need to keep track of bracket balancing to an arbitrary depth.
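The bracket-balancing requirement can be illustrated with a depth counter. The Python sketch below checks only the brackets, not the rest of the expression syntax, and the function name is hypothetical. The counter has no fixed upper bound, which is the kind of unbounded memory a finite automaton lacks.

```python
def brackets_balanced(text):
    """Return True iff every '(' has a matching ')' and none closes too early."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ')' with no open '(' before it
                return False
    return depth == 0


for s in ["((1+2)*3)", "(1+2", ")("]:
    print(s, brackets_balanced(s))
```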