Chapter 4
Properties of Regular Languages
2/14/03 ==> 10/1/03
Note: text cross-references in parentheses are for the 2nd ed.
NOTE: The “prime” symbol (as in L’) means complement
Closure
If L1 and L2 are regular languages,
then so are L1 ∪ L2, L1 ∩ L2, L1L2, L1’, and L1*.
Regular languages are closed under reversal.
Homomorphism:
Let Σ and Γ be two alphabets.
h: Σ → Γ* is a homomorphism: h(ai) = u ∈ Γ*, ∀ ai ∈ Σ
extending to strings we have:
w = a1a2...an
h(w) = h(a1)h(a2)...h(an)
Homomorphic image:
If L is a language on Σ, then the homomorphic image of L is
h(L) = { h(w) : w ∈ L }, h = homomorphism
Example 4.2, 4.3
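The definitions above can be sketched in a few lines of Python. This is only an illustration on a finite language; the particular map h and the sample language L below are assumptions chosen for the sketch, not taken from these notes.

```python
def apply_homomorphism(h, w):
    """Extend a symbol map h (dict: symbol -> string) to a whole string w."""
    return "".join(h[symbol] for symbol in w)


def homomorphic_image(h, language):
    """Image h(L) of a finite language L, given as a set of strings."""
    return {apply_homomorphism(h, w) for w in language}


# Assumed illustrative homomorphism from {a, b} into strings over {a, b, c}.
h = {"a": "ab", "b": "bbc"}
L = {"aa", "aba"}
image = homomorphic_image(h, L)
print(sorted(image))
```

The extension to strings is just concatenation of the images of the individual symbols, so h of the empty string is the empty string.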
Theorem 4.3
Let h be a homomorphism.
If L is a regular language, then its homomorphic image h(L) is also regular.
==> Regular languages are closed under homomorphisms.
CS373 Fall 2003 Chapter 4 N. Guydosh Page 7
Definition 4.2 Right Quotient
Let L1 and L2 be languages on the same alphabet Σ.
Then the right quotient is defined as
L1/L2 = {x : xy ∈ L1 for some y ∈ L2}
==> “Set of all strings in L1
that have a suffix in L2
with this suffix removed”
Example: L1 = {a^n b^m : n ≥ 1, m ≥ 0} ∪ {ba}
L2 = {b^m : m ≥ 1}
L1/L2 = {a^n b^m : n ≥ 1, m ≥ 0}
aaabbbbb ∈ L1 and bbb ∈ L2 is a suffix ==> aaabb ∈ L1/L2
==> aaab ∈ L1/L2 (suffix bbbb removed)
==> aaa ∈ L1/L2 (suffix bbbbb removed)
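The right quotient example can be checked by brute force on finite samples of the two languages. The sketch below assumes a cutoff (BOUND) on the exponents purely to keep the sets finite, so it only approximates the infinite languages; the function name right_quotient is hypothetical.

```python
def right_quotient(L1, L2):
    """Right quotient of finite sets of strings: {x : xy in L1 for some y in L2}."""
    result = set()
    for w in L1:
        for y in L2:
            if w.endswith(y):
                # Strip the suffix y from w to obtain x.
                result.add(w[: len(w) - len(y)])
    return result


BOUND = 6  # assumed cutoff on exponents, only to keep the samples finite
L1 = {"a" * n + "b" * m for n in range(1, BOUND) for m in range(0, BOUND)} | {"ba"}
L2 = {"b" * m for m in range(1, BOUND)}
Q = right_quotient(L1, L2)
print("aaabb" in Q, "aaa" in Q, "b" in Q)
```

The string ba contributes nothing to the quotient because it has no suffix in L2.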
Theorem 4.4
If L1 and L2 are regular languages over Σ, then L1/L2 is also regular.
==> The family of regular languages is closed under right quotient with a regular language.
See examples 4.4 and 4.5
Elementary questions about regular languages (4.2)
- or -
Regular questions about elementary languages!
Standard representation
A regular language is given in standard representation (SR) if, and only if, it is described by:
(1) Finite automaton,
(2) Regular expression, or
(3) Regular grammar
Informal descriptions and set notation are excluded.
Theorem 4.5
Given a standard representation of a regular language L on Σ and any w ∈ Σ*,
there exists an algorithm for determining whether or not w ∈ L.
==> Construct a DFA for L and test w.
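A minimal sketch of the membership test, assuming the DFA is already built and stored as a transition dictionary. The example DFA (for strings of one or more a's followed by zero or more b's) and the state names are assumptions made for illustration.

```python
def accepts(delta, start, finals, w):
    """Run a DFA on w; delta maps (state, symbol) -> state."""
    state = start
    for symbol in w:
        state = delta[(state, symbol)]
    return state in finals


# Assumed example DFA for {a^n b^m : n >= 1, m >= 0}; "dead" is a trap state.
delta = {
    ("q0", "a"): "q1", ("q0", "b"): "dead",
    ("q1", "a"): "q1", ("q1", "b"): "q2",
    ("q2", "a"): "dead", ("q2", "b"): "q2",
    ("dead", "a"): "dead", ("dead", "b"): "dead",
}
start, finals = "q0", {"q1", "q2"}
print(accepts(delta, start, finals, "aaabb"))
```

The test takes one transition per input symbol, so it always terminates after |w| steps.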
Theorem 4.6
There exists an algorithm for determining whether a regular language in standard representation form is empty, finite, or infinite.
==> Construct a DFA.
If there is a simple path (no edge or vertex repeated) from q0 to some qf ∈ F, then L is non-empty; otherwise L is empty.
Question: Does the existence of any walk from q0 to qf imply that a simple path also exists?
If some path from q0 to qf passes through a vertex that is the base of a cycle, then L is infinite; otherwise L is finite.
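One way to carry out this test on the transition graph is sketched below. It is not necessarily the procedure intended in the text: it keeps only the "useful" states (reachable from the start state and able to reach a final state) and then looks for a cycle among them. The function name classify and the example DFA are assumptions.

```python
def classify(delta, start, finals):
    """Return 'empty', 'finite', or 'infinite' for the language of a DFA."""
    # Successor and predecessor maps of the transition graph.
    succ, pred = {}, {}
    for (p, _symbol), q in delta.items():
        succ.setdefault(p, set()).add(q)
        pred.setdefault(q, set()).add(p)

    def closure(seeds, edges):
        # All vertices reachable from seeds along the given edge map.
        seen, stack = set(seeds), list(seeds)
        while stack:
            for nxt in edges.get(stack.pop(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    useful = closure({start}, succ) & closure(set(finals), pred)
    if not useful:
        return "empty"  # no final state is reachable from the start state
    # Repeatedly delete useful states with no useful successor.
    # Whatever survives lies on, or leads into, a cycle of useful states.
    remaining = set(useful)
    changed = True
    while changed:
        changed = False
        for s in list(remaining):
            if not (succ.get(s, set()) & remaining):
                remaining.discard(s)
                changed = True
    return "infinite" if remaining else "finite"


# Assumed example DFA for {a^n b^m : n >= 1, m >= 0}; "dead" is a trap state.
delta = {
    ("q0", "a"): "q1", ("q0", "b"): "dead",
    ("q1", "a"): "q1", ("q1", "b"): "q2",
    ("q2", "a"): "dead", ("q2", "b"): "q2",
    ("dead", "a"): "dead", ("dead", "b"): "dead",
}
print(classify(delta, "q0", {"q1", "q2"}))
```

Restricting to useful states matters: the self-loop on the trap state is a cycle, but it lies on no accepting path and so must not count.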
Theorem 4.7
Given two regular languages L1 and L2 in standard representation form,
there exists an algorithm to determine whether or not L1 = L2.
==> Define the regular language: L3 = (L1 ∩ L2’) ∪ (L1’ ∩ L2) = L1 ⊕ L2 ... “Exclusive OR”
Since L3 is a regular language, ∃ a DFA M accepting L3.
Use Theorem 4.6 to see if L3 is empty.
L3 = ∅ if, and only if, L1 = L2. (Why?)
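The equality test can be sketched with a product construction: run both DFAs in lockstep and look for a reachable pair of states where exactly one machine accepts. Such a pair exists exactly when the "exclusive OR" language is non-empty. The tuple layout (delta, start, finals) and the example machines are assumptions made for this sketch, and both transition functions are assumed to be total.

```python
from collections import deque


def equivalent(dfa1, dfa2, alphabet):
    """True if two DFAs, each given as (delta, start, finals), accept the same language."""
    (d1, s1, f1), (d2, s2, f2) = dfa1, dfa2
    seen = {(s1, s2)}
    queue = deque(seen)
    while queue:
        p, q = queue.popleft()
        if (p in f1) != (q in f2):
            return False  # some string is in exactly one of the two languages
        for a in alphabet:
            nxt = (d1[(p, a)], d2[(q, a)])
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True


# Assumed examples over {a}: two machines for "even number of a's", one for "odd".
even2 = ({("e", "a"): "o", ("o", "a"): "e"}, "e", {"e"})
even4 = ({(0, "a"): 1, (1, "a"): 2, (2, "a"): 3, (3, "a"): 0}, 0, {0, 2})
odd2 = ({("e", "a"): "o", ("o", "a"): "e"}, "e", {"o"})
print(equivalent(even2, even4, "a"), equivalent(even2, odd2, "a"))
```

The search visits at most |Q1| x |Q2| pairs, so it always terminates.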
Identifying Non-Regular Languages
(Sect 4.3)
Pigeon hole principle (a target of denial)
Put n objects into m boxes.
If n > m, then at least one box must have more than one item in it.
We can prove that L is a non-regular language by assuming the contrary ==> this forces a denial of the pigeon hole principle.
Example 4.6 Prove L = {a^n b^n : n ≥ 0} is non-regular
Assume the contrary: (Assume L is a regular language)
==> ∃ DFA, M = (Q, {a,b}, δ, q0, F)
Consider δ*(q0, a^i), i > 0
==> infinitely many values of i but only a finite number of states.
==> ∃ q ∈ Q and some n, m (n ≠ m) such that δ*(q0, a^n) = q = δ*(q0, a^m)
due to the pigeon hole principle.
... q would be the first state to repeat for a sufficiently long string of a’s.
But since, by assumption (contrary),
M accepts a^n b^n, we have δ*(q, b^n) = qf ∈ F
[Uses: δ*(q, w1w2) = δ*(δ*(q, w1), w2)]
But also,
==> δ*(q0, a^m b^n) = δ*(δ*(q0, a^m), b^n) = δ*(q, b^n) = qf ∈ F
Contradiction since m ≠ n and a^m b^n should be rejected. QED
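The pigeon hole step can be made concrete: for any given DFA, feeding it a, aa, aaa, ... must eventually revisit a state, which yields two different exponents that the machine cannot tell apart. The sketch below finds such a pair for one assumed example DFA and confirms that the strings a^n b^n and a^m b^n end in the same state. The helper names are hypothetical, and the search starts at exponent 0 rather than 1, which does not affect the argument.

```python
def run(delta, state, w):
    """Extended transition function: the state reached from `state` on input w."""
    for symbol in w:
        state = delta[(state, symbol)]
    return state


def colliding_exponents(delta, start):
    """Find n < m such that a^n and a^m lead from `start` to the same state."""
    first_seen = {}  # state -> smallest exponent that reaches it
    state, i = start, 0
    while state not in first_seen:
        first_seen[state] = i
        state = delta[(state, "a")]
        i += 1
    return first_seen[state], i


# Assumed example DFA over {a, b}; any DFA with a total transition map would do.
delta = {
    ("q0", "a"): "q1", ("q0", "b"): "dead",
    ("q1", "a"): "q1", ("q1", "b"): "q2",
    ("q2", "a"): "dead", ("q2", "b"): "q2",
    ("dead", "a"): "dead", ("dead", "b"): "dead",
}
n, m = colliding_exponents(delta, "q0")
same = run(delta, "q0", "a" * n + "b" * n) == run(delta, "q0", "a" * m + "b" * n)
print(n, m, same)
```

Because the loop stops at the first repeated state, it runs for at most |Q| + 1 iterations on any DFA.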
“Pumping Lemma” (another object of denial)
· Application of pigeon hole principle
Theorem 4.8
Let L be an infinite regular language.
Then ∃ some integer m > 0 such that ∀ w ∈ L with |w| ≥ m,
we can decompose w as
w = xyz
with |xy| ≤ m
|y| ≥ 1
such that w_i = x y^i z, i ≥ 0 (pumped up w)
is also in L.
Proof: L is a regular language ==> ∃ DFA
M with states q0, q1, q2, ..., qn (n+1 states)
Let w ∈ L such that |w| ≥ m = n + 1
M goes through the following sequence of states as it processes w:
q0, qi, qj, ..., qf
Since |w| ≥ n + 1,
Number of states in walk = |w| + 1 > n + 1
See example below ...
Number of states in walk > number of states
==> at least 1 repeated state (pigeon hole principle)
Example: w = abc
number of states in walk = |w| + 1 = 3 + 1 = 4.
Sequence looks like:
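q0, qi, qj, ..., qr, ..., qr, ..., qf
(x is read from q0 up to the first qr, y is read between the two occurrences of qr, and z is read from the second qr to qf)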
∃ substrings x, y, z such that
δ*(q0, x) = qr
δ*(qr, y) = qr
δ*(qr, z) = qf
with |xy| ≤ n + 1 and |y| ≥ 1
Note: Reason for |xy| ≤ n + 1 = m:
Let qr be the first state that repeats in the walk.
Its first occurrence comes no later than the nth move, since M has only n+1 states.
No state repeats inside the “y loop or segment” before the loop closes, and at least 1 move is needed to complete the loop.
It follows that the number of moves in xy = |xy| ≤ number of states = n+1 = m.
Remember the example above relating the number of moves (string length) to the number of states in the walk.
Thus δ*(q0, xz) = qf (skip the loop), and |xy| ≤ n + 1 = m … see p. 116 (120, 2nd ed)
and δ*(q0, x y^i z) = qf, i ≥ 0 (any number of loops will work)
==> the pumped string x y^i z is accepted. QED
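The proof is constructive, so the decomposition can be read off from the run of a concrete DFA. The sketch below splits w at the first repeated state and then checks that several pumped strings are still accepted. The example DFA (4 states, so m = 4) and the function name pumping_decomposition are assumptions made for illustration.

```python
def run(delta, state, w):
    """Extended transition function: the state reached from `state` on input w."""
    for symbol in w:
        state = delta[(state, symbol)]
    return state


def pumping_decomposition(delta, start, w):
    """Split w = xyz at the first repeated state of the run; return (x, y, z)."""
    seen = {start: 0}  # state -> number of symbols read when it was first visited
    state = start
    for pos, symbol in enumerate(w, start=1):
        state = delta[(state, symbol)]
        if state in seen:
            i = seen[state]
            return w[:i], w[i:pos], w[pos:]
        seen[state] = pos
    return None  # no state repeated, so w is shorter than the number of states


# Assumed example DFA for {a^n b^m : n >= 1, m >= 0}; 4 states, so m = 4.
delta = {
    ("q0", "a"): "q1", ("q0", "b"): "dead",
    ("q1", "a"): "q1", ("q1", "b"): "q2",
    ("q2", "a"): "dead", ("q2", "b"): "q2",
    ("dead", "a"): "dead", ("dead", "b"): "dead",
}
finals = {"q1", "q2"}
x, y, z = pumping_decomposition(delta, "q0", "aabb")
pumped_ok = all(run(delta, "q0", x + y * i + z) in finals for i in range(5))
print((x, y, z), pumped_ok)
```

Note that the decomposition depends on the DFA, which is why a proof of non-regularity may not assume any particular x, y, z.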
Observation:
- If we assume a language L is regular (accepted by a DFA)
and the pumping lemma is contradicted in the process,
then L is not a regular language (contradiction).
- We are guaranteed the existence of m as well as the decomposition xyz,
but
we do not know what they are.
- Cannot claim a contradiction just because the pumping lemma is violated for some specific value of m or some specific decomposition xyz … see middle p. 117 (bottom p. 120).
- But: the pumping lemma holds ∀ w ∈ L with |w| ≥ m and for all i ≥ 0.
Thus, if for every m there is even one such w for which every allowed decomposition fails for some i, then L is not a regular language.
· Game Theory Approach to Applying Pumping Lemma
- Opponent picks m
- Given m, we pick w ∈ L such that |w| ≥ m (totally free to choose)
- Opponent chooses a decomposition xyz with |xy| ≤ m and |y| ≥ 1. (Opponent makes the optimal choice.)
- We try to pick some particular i such that the pumped string w_i = x y^i z is not in L.
… make no assumptions on what partition is – must guarantee violation for all possible partitions …
If we can do this, we win.
Else inconclusive (opponent wins).
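For a fixed m the game can be played exhaustively by machine. The sketch below does this for L = {a^n b^n : n ≥ 0} with our choice w = a^m b^m: it enumerates every decomposition the opponent could pick and checks that some exponent pumps the string out of L. Checking a handful of values of m is only a sanity check, not a proof for all m; the names in_L and we_win and the candidate exponents (0, 2) are assumptions.

```python
def in_L(s):
    """Membership in L = {a^n b^n : n >= 0}."""
    n = len(s) // 2
    return s == "a" * n + "b" * n


def we_win(m, w, exponents=(0, 2)):
    """True if every legal decomposition of w can be pumped out of L."""
    assert in_L(w) and len(w) >= m
    for xy_len in range(1, m + 1):      # opponent's constraint |xy| <= m
        for x_len in range(0, xy_len):  # opponent's constraint |y| >= 1
            x, y, z = w[:x_len], w[x_len:xy_len], w[xy_len:]
            if all(in_L(x + y * i + z) for i in exponents):
                return False  # the opponent has a decomposition we cannot beat
    return True


results = [we_win(m, "a" * m + "b" * m) for m in range(1, 8)]
print(results)
```

With w = a^m b^m the constraint |xy| ≤ m forces y to consist only of a's, so pumping changes the number of a's without changing the number of b's.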