
THEORY OF

There are four sorts of men:

He who knows not and knows not he knows not: he is a fool – shun him;

He who knows not and knows he knows not: he is simple – teach him;

He who knows and knows not he knows: he is asleep – wake him;

He who knows and knows he knows: he is wise – follow him

Arabian proverb

Chapter 1: Sets, Relations and Languages

Recommended Readings: Textbook + A Basis for Theoretical Computer Science by M. A. Arbib, A. J. Kfoury, R. N. Moll.

A set is a collection of objects. Examples of sets are:

– the set of integers, denoted by Z,
– the set of nonnegative integers, also called natural numbers, denoted by N,
– the set of truth values B = {T, F},
– the set of students taking the theory of computation course.

The elements or members of a set are the objects comprising it. If b is an element of a set L, then we write b ∈ L.

There are two ways to display a set:

either explicitly listing the elements belonging to it like in the case of the set of truth values B = {T,F },

or by specifying the properties that characterize the elements of this set like: the set of even integers {x | x ∈ Z and x mod 2 = 0}

Two sets are equal if they have the same elements.

A set without any element is called the empty set, denoted by ∅.

A set with infinitely many elements is said to be infinite.

A set with finitely many elements is said to be finite.

A set X is a subset of a set Y, written X ⊆ Y , if each element of X is also an element of Y.

X is a proper subset of Y if X is a subset of Y and X and Y are not equal.

For example, the set of nonnegative integers is a proper subset of the set of integers, N ⊂ Z.

Basic facts: For any set A, the empty set ∅ is a subset of A, ∅ ⊆ A, and A is a subset of A, A ⊆ A.

If A ⊆ B and B ⊆ A then A = B.

Set operations: Union, Difference, Intersection

The union A ∪ B of two sets A, B is the set of elements which belong to at least one of them.

The intersection A ∩ B of two sets A, B is the set of elements which belong to both of them. We say two sets are disjoint if their intersection is empty.

The set difference A \ B of two sets A,B is the set of those elements in A that are not in B.
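The three operations map directly onto Python's built-in set type. The following is a minimal sketch; the concrete sets A and B are made up for illustration and do not come from the notes.

```python
# Illustrative sets only; A and B are hypothetical example data.
A = {0, 1, 2, 3}
B = {2, 3, 4, 5}

union = A | B          # elements belonging to at least one of A, B
intersection = A & B   # elements belonging to both A and B
difference = A - B     # elements of A that are not in B (A \ B)

print(sorted(union))          # [0, 1, 2, 3, 4, 5]
print(sorted(intersection))   # [2, 3]
print(sorted(difference))     # [0, 1]

# Subset, proper subset, and disjointness tests.
print({2, 3} <= A)            # True: {2, 3} is a subset of A
print(A < A)                  # False: A is not a proper subset of itself
print(A.isdisjoint({8, 9}))   # True: the intersection is empty
```

Python's `<=` corresponds to ⊆ and `<` to the proper-subset relation.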

Laws for set operations:

Idempotency
A ∪ A = A
A ∩ A = A

Commutativity
A ∪ B = B ∪ A
A ∩ B = B ∩ A

Associativity
(A ∪ B) ∪ C = A ∪ (B ∪ C)
(A ∩ B) ∩ C = A ∩ (B ∩ C)

Suppose now that there is a big set U such that both A and B are subsets of U. Let Ā = U \ A and B̄ = U \ B. Then
A \ B = A ∩ B̄
U \ (A ∩ B) = Ā ∪ B̄
U \ (A ∪ B) = Ā ∩ B̄

Exercises Prove the above equalities. 

Distributivity
A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)

Proof
1. To show A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C), we show A ∪ (B ∩ C) ⊆ (A ∪ B) ∩ (A ∪ C) and (A ∪ B) ∩ (A ∪ C) ⊆ A ∪ (B ∩ C).

(a) Let x ∈ A ∪ (B ∩ C). Then x ∈ A or x ∈ B ∩ C. If x ∈ A then x ∈ A ∪ B and x ∈ A ∪ C, hence x ∈ (A ∪ B) ∩ (A ∪ C). If x ∈ B ∩ C then x ∈ B and x ∈ C, so x ∈ A ∪ B and x ∈ A ∪ C, hence again x ∈ (A ∪ B) ∩ (A ∪ C). Therefore, if x ∈ A ∪ (B ∩ C) then x ∈ (A ∪ B) ∩ (A ∪ C).

(b) Let x ∈ (A ∪ B) ∩ (A ∪ C). We show x ∈ A ∪ (B ∩ C). From x ∈ (A ∪ B) ∩ (A ∪ C) it follows that x ∈ A ∪ B and x ∈ A ∪ C, hence (x ∈ A or x ∈ B) and (x ∈ A or x ∈ C). There are four cases: (x ∈ A) or (x ∈ A and x ∈ B) or (x ∈ A and x ∈ C) or (x ∈ B and x ∈ C). It follows that x ∈ A or (x ∈ B and x ∈ C), which means x ∈ A ∪ (B ∩ C).

2. Exercise.

Absorption
A ∩ (A ∪ B) = A
A ∪ (A ∩ B) = A

Proof Exercise.

DeMorgan's Laws
A \ (B ∪ C) = (A \ B) ∩ (A \ C)
A \ (B ∩ C) = (A \ B) ∪ (A \ C)

Proof
1. Suppose there is a set U such that all sets A, B, C are subsets of U, and write X̄ = U \ X for any X ⊆ U. Hence

A \ (B ∪ C) = A ∩ (U \ (B ∪ C)) = A ∩ (B̄ ∩ C̄) = (A ∩ B̄) ∩ (A ∩ C̄) = (A \ B) ∩ (A \ C)

2. Exercise.

If S is a collection of sets then ⋃S is the set whose elements are the elements of the sets in S.

Example: S = {{a, b}, {c}, {a, d}}

⋃S = {a, b, c, d}

S = {{a, b}, {c}, {a, {d, a}}}

⋃S = {a, b, c, {d, a}}

The power set of a set S, denoted by 2^S, is the set of all subsets of S.

Example: S = {a}, 2^S = {∅, {a}}

S = {a, b}, 2^S = {∅, {a}, {b}, {a, b}}

A partition of a set S is a set Π of subsets of S, i.e. Π ⊆ 2^S, such that
1. Each element of Π is nonempty
2. Distinct members of Π are disjoint
3. ⋃Π = S

Example S = {a, b, c}

Π = {{a}, {b, c}} is a partition of S.

A = {{a}, {a, b}, {c}} is not a partition.

B = {{a}, {b, c}, ∅} is not a partition.

Functions or Maps

Ordered pairs are written as (a,b) where a is the first component and b the second.

The Cartesian product of two sets A, B, denoted by A × B, is the set of all ordered pairs (a, b) with a ∈ A, b ∈ B.

Example {a, b} × {a} = {(a, a), (b, a)}

The plane could be represented as a Cartesian product R × R where R is the set of all real numbers.

A map or function from a set A to a set B, denoted f : A → B, is an assignment to each element a in A of a single element, denoted by f(a), in B. A is called the domain of f and B is the codomain of f. f(a) is called the image of a under f. The range of f is denoted by

f(A) = {b | there is a in A such that b = f(a)}

For A′ ⊆ A, f(A′) = {f(a) : a ∈ A′} is called the image of A′ under f.

Example of functions: The integer addition + : N × N → N is a function from N × N to N.

Exercises
1) If A is empty, how many functions are there from A to B?
2) If B is empty, how many functions are there from A to B?
3) If A contains exactly one element, how many functions are there from A to B?

A function f : A → B is one-to-one if for any two distinct elements a, a′ ∈ A, f(a) ≠ f(a′).

A function f : A → B is onto B if B = f(A).

A function f : A → B is a bijection between A and B if it is both one-to-one and onto B.

Let A and B be two finite sets. A and B have the same number of elements iff there is a bijection between A and B.
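For finite sets these properties can be checked mechanically. The sketch below assumes a function is stored as a Python dict whose keys form the domain; the helper names and the sample functions f and g are hypothetical, chosen only for illustration.

```python
def is_one_to_one(f):
    """True if distinct arguments have distinct images (f is a dict)."""
    images = list(f.values())
    return len(images) == len(set(images))

def is_onto(f, codomain):
    """True if the range f(A) equals the codomain B."""
    return set(f.values()) == set(codomain)

def is_bijection(f, codomain):
    return is_one_to_one(f) and is_onto(f, codomain)

B = {"x", "y", "z"}
f = {1: "x", 2: "y", 3: "z"}   # one-to-one and onto B
g = {1: "x", 2: "x", 3: "y"}   # neither: "x" is hit twice, "z" never

print(is_bijection(f, B))                 # True
print(is_one_to_one(g), is_onto(g, B))    # False False
```

Consistent with the statement above, the bijection f exists here because its domain {1, 2, 3} and B both have three elements.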

Given functions f : A → B and g : B → C, the composition f ◦ g : A → C is defined by (f ◦ g)(a) = g(f(a)). Note the order: f is applied first.

Relations

A binary relation on two sets A, B is a subset of A × B. An example of a binary relation is the greater-or-equal relation ≥ on the set of integers, R = {(n, m) | n ≥ m} ⊆ Z × Z.

An ordered n-tuple is written as (a_1, ..., a_n) where a_i is the i-th component of (a_1, ..., a_n).

The n-fold Cartesian product of the sets A_1, ..., A_n, denoted by A_1 × ... × A_n, is the set of all ordered n-tuples (a_1, ..., a_n) with a_i ∈ A_i for i = 1, ..., n.

An n-ary relation on sets A_1, ..., A_n is a subset of A_1 × ... × A_n.

Property (A function is a special relation) A function from a set A to a set B is a binary relation R on A, B such that the following property is satisfied:
– For each a ∈ A there is exactly one ordered pair in R with first component a.

A binary relation R ⊆ A × B has an inverse R^−1 ⊆ B × A defined by:

(b, a) ∈ R^−1 iff (a, b) ∈ R.

For binary relations Q ⊆ A × B and R ⊆ B × C, the composition Q ◦ R is defined by

Q ◦ R = {(a, c) : ∃ b ∈ B s.t. (a, b) ∈ Q and (b, c) ∈ R}

Note that for two functions f : A → B and g : B → C, f ◦ g is again a function from A to C.
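Inverse and composition are easy to try out on small finite relations stored as sets of pairs. This is a sketch; the function names and the relations Q and R are hypothetical examples.

```python
def inverse(R):
    """R^-1: swap the components of every pair."""
    return {(b, a) for (a, b) in R}

def compose(Q, R):
    """Q o R in the order used in these notes: first Q, then R."""
    return {(a, c) for (a, b) in Q for (b2, c) in R if b == b2}

Q = {(1, "x"), (2, "y")}
R = {("x", "p"), ("y", "p"), ("y", "q")}

print(sorted(inverse(Q)))      # [('x', 1), ('y', 2)]
print(sorted(compose(Q, R)))   # [(1, 'p'), (2, 'p'), (2, 'q')]
```

Q is a function on {1, 2} but R relates "y" to two elements, so the composition is a relation that is not a function.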

Exercise Show that if f : A → B is a bijection, then f^−1 is also a bijection from B to A.

Special Types of Relations

A relation R ⊆ A × A can be represented as a directed graph.

A relation R ⊆ A × A is reflexive if for each a ∈ A, (a, a) ∈ R.

A relation R ⊆ A × A is symmetric if (a, b) ∈ R whenever (b, a) ∈ R. A symmetric relation can be represented by an undirected graph.

A relation R ⊆ A × A is transitive if whenever (a, b) ∈ R and (b, c) ∈ R then (a, c) ∈ R.

A relation R ⊆ A × A is an equivalence relation if it is reflexive, transitive and symmetric.

Let R be an equivalence relation on a set A. Then for each a in A, the equivalence class of a with respect to R is denoted by [a]_R and is defined formally by

[a]_R = {b | (a, b) ∈ R}

When the context R is clear, we simply write [a] for [a]_R.

A representation of an equivalence relation R ⊆ A × A as an undirected graph consists of a number of "clusters" where within clusters each pair is connected by a line. The set of nodes in a cluster is an equivalence class.

Property Let R be an equivalence relation on a set A. Then for any two elements a, b in A, either [a]_R = [b]_R or [a]_R, [b]_R are disjoint.

Proof Exercise.

Let R be an equivalence relation on a set A. Define A modulo R to be the set

A/R = {[a]_R | a ∈ A}

Theorem 1. Let R ⊆ A × A be an equivalence relation. Then A modulo R is a partition of A.

Proof Exercise.
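The statement of Theorem 1 can be checked on a small example. The sketch below is not a proof: it computes A modulo R for congruence modulo 3 on {0, ..., 5} (an example chosen here, not taken from the notes) and tests the three partition conditions. All function names are hypothetical.

```python
def equivalence_class(a, R):
    """[a]_R = {b | (a, b) in R}, as a hashable frozenset."""
    return frozenset(b for (x, b) in R if x == a)

def quotient(A, R):
    """A/R = {[a]_R | a in A}."""
    return {equivalence_class(a, R) for a in A}

def is_partition(Pi, S):
    """Check the three conditions in the definition of a partition."""
    blocks = list(Pi)
    nonempty = all(len(block) > 0 for block in blocks)
    disjoint = all(blocks[i].isdisjoint(blocks[j])
                   for i in range(len(blocks))
                   for j in range(i + 1, len(blocks)))
    covers = set().union(*blocks) == set(S) if blocks else set(S) == set()
    return nonempty and disjoint and covers

A = set(range(6))
R = {(a, b) for a in A for b in A if a % 3 == b % 3}   # congruence mod 3

classes = quotient(A, R)
print(sorted(sorted(c) for c in classes))   # [[0, 3], [1, 4], [2, 5]]
print(is_partition(classes, A))             # True
```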

Example of partition and equivalence relation: Let Z+ be the set of positive integers. Define an equivalence relation ≡ on Z+ × Z+, i.e. ≡ ⊆ (Z+ × Z+) × (Z+ × Z+), as follows:

(p, q) ≡ (r, s) iff ps = qr

The relation ≡ is an equivalence relation characterizing the positive rational numbers: the class of (p, q) corresponds to the fraction p/q, so the set Z+ × Z+ modulo ≡ can be identified with the set of positive rational numbers.

A relation R ⊆ A × A is antisymmetric if whenever (a, b) ∈ R and a ≠ b then (b, a) ∉ R.

A relation that is reflexive, transitive and antisymmetric is called a partial order.

A partial order R ⊆ A × A is called a total order if for all a, b ∈ A, either (a, b) ∈ R or (b, a) ∈ R.

A path from a to b in a binary relation R ⊆ A × A is a sequence (a_1, ..., a_n), n ≥ 1, such that a = a_1, b = a_n, and (a_i, a_{i+1}) ∈ R for each i = 1, ..., n−1. A path (a_1, ..., a_n) is a cycle if a_1 = a_n and a_1, ..., a_{n−1} are distinct.

Two sets A, B are equinumerous if there is a bijection f : A → B.

We say that the cardinality of A is n if A is equinumerous with the set {1, ..., n}.

A is countably infinite if it is equinumerous with the set of natural numbers N.

A is uncountable if it is neither finite nor countably infinite.

Exercise Every subset of a finite set is finite.

Exercise Every subset of a countably infinite set is finite or countably infinite.

Exercise The set N × N is countably infinite.

Theorem 2. The union of a countably infinite collection of countably infinite sets is again countably infinite.

Exercise Is the set of all finite subsets of N countably infinite?

Three Fundamental Proof Techniques

Principle of Mathematical Induction: Let A be a set of natural numbers such that
1. 0 ∈ A, and
2. for each natural number n, if {0, 1, ..., n} ⊆ A then n + 1 ∈ A.
Then A = N.

Exercise Prove that for each natural number n

1 + ... + n = n(n + 1)/2

Exercise Suppose f, g : N → (N − {0}) satisfy the following properties:
1. g(0) ≤ f(0), and
2. for each n ≥ 0,

g(n + 1)/g(n) ≤ f(n + 1)/f(n)

Then for any n ≥ 0: g(n) ≤ f(n).

The pigeonhole principle If A, B are finite sets and |A| > |B|, then there is no one-to-one function from A to B.

Exercise Suppose there are 100 new students for the August semester at AIT and only 97 vacant rooms. Is it possible to give each student a separate room? Prove it.

Exercise Suppose there are 10 scholarships and 50 applicants. Is it possible that each applicant gets a scholarship? (Note that a scholarship cannot be shared.)

Exercise Could you try to prove the pigeonhole principle using mathematical induction?

The Diagonalization Principle

Let R be a binary relation on a set A, and let
– D = {a | a ∈ A and (a, a) ∉ R}. D is called the diagonal set of R.
– for each a ∈ A,

R_a = {b | b ∈ A and (a, b) ∈ R}

Then for each a ∈ A, D ≠ R_a.
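The reason is that D and R_a always disagree about the element a itself: a ∈ D iff (a, a) ∉ R iff a ∉ R_a. The sketch below checks this on a small relation invented for illustration (deliberately different from the one in the exercise that follows); the function names are hypothetical.

```python
def diagonal_set(A, R):
    """D = {a in A | (a, a) not in R}."""
    return {a for a in A if (a, a) not in R}

def row(a, A, R):
    """R_a = {b in A | (a, b) in R}."""
    return {b for b in A if (a, b) in R}

A = {1, 2, 3}
R = {(1, 1), (1, 2), (2, 3), (3, 1), (3, 3)}

D = diagonal_set(A, R)
print(D)                                    # {2}
for a in sorted(A):
    # D and R_a differ on whether they contain a.
    print(a, sorted(row(a, A, R)), (a in D) != (a in row(a, A, R)))
print(all(D != row(a, A, R) for a in A))    # True
```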

Exercise Let A = {1, 2, 3, 4, 5} and R = {(1, 3), (2, 4), (3, 5), (5, 1), (2, 3), (1, 1), (4, 2), (4, 4)}. Determine the sets D, R_1, R_2, R_3, R_4, R_5. Check whether the diagonal principle holds in this case.

Exercise Would the diagonal principle still hold if the sets R_a are replaced by the sets R′_a = {b | b ∈ A and (b, a) ∈ R}?

Exercise Would the diagonal principle still hold if the set D is replaced by the set C = {a | a ∈ A and (a, a) ∈ R}?

Theorem 3. The set 2^N is uncountably infinite.

Proof (Sketch) Suppose the theorem is not correct. Then the set 2^N is countably infinite, i.e. there is a bijection (a one-to-one and onto function) f : N → 2^N. Define R = {(n, m) | m ∈ f(n)}. Apply the diagonal principle to R and derive a contradiction to the assumption that f is a bijection.

Exercise Is the set of all infinite subsets of N countably infinite?

Definition Let R ⊆ A^2 be a directed graph on a set A. The reflexive transitive closure of R is the relation

R∗ = {(a, b) | a, b ∈ A and there is a path from a to b in R}

Algorithms

The question we are interested in in this course is:

What is an algorithm?

Example of an algorithm (computing R∗):

Initially S := ∅
for i = 1, ..., n do
    for each i-tuple (b_1, ..., b_i) ∈ A^i do
        if (b_1, ..., b_i) is a path in R then add (b_1, b_i) to S

How much time does the algorithm need to terminate? The total number of operations can be no more than

n + n^2 + ... + n^n

which belongs to O(n^(n+1)), where n is the number of elements of A. This algorithm is therefore not efficient.

An efficient algorithm:

Initially S := R ∪ {(a_i, a_i) | a_i ∈ A}
for each j = 1, 2, ..., n do
    for each i = 1, 2, ..., n and k = 1, 2, ..., n do
        if (a_i, a_j), (a_j, a_k) ∈ S but (a_i, a_k) ∉ S then add (a_i, a_k) to S.

The algorithm terminates after no more than n^3 steps.
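The efficient algorithm translates almost line by line into Python. This is a sketch under the assumption that A is a finite collection and R a set of pairs; the function name and the sample relation are made up for illustration.

```python
def reflexive_transitive_closure(A, R):
    """Compute R* with three nested loops, the intermediate node j outermost."""
    nodes = list(A)
    S = set(R) | {(a, a) for a in nodes}      # start from R plus all loops
    for j in nodes:                            # allowed intermediate node
        for i in nodes:
            for k in nodes:
                if (i, j) in S and (j, k) in S:
                    S.add((i, k))              # no effect if already present
    return S

A = {1, 2, 3, 4}
R = {(1, 2), (2, 3), (4, 1)}
closure = reflexive_transitive_closure(A, R)

print(sorted(closure))
# [(1, 1), (1, 2), (1, 3), (2, 2), (2, 3), (3, 3), (4, 1), (4, 2), (4, 3), (4, 4)]
```

The loop body runs exactly n^3 times, which matches the bound stated above, assuming each set lookup and insertion counts as one step.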

Is this algorithm correct, i.e. is S = R∗ after termination of the algorithm? (Could you use mathematical induction to show its correctness?)

Rate of Growth of Functions

Definition
1. Let f be a function from N into N. The order of f, denoted by O(f), is the set of all functions g : N → N such that there are positive natural numbers c, d > 0 such that for each n ∈ N

g(n) ≤ c·f(n) + d

2. We write f ≍ g iff f ∈ O(g) and g ∈ O(f).

Lemma ≍ is an equivalence relation.

Proof Exercise.

Definition The equivalence class of a function f : N → N with respect to ≍ is called the rate of growth of f.

Lemma Let f(n) = a_0 + a_1·n + ... + a_k·n^k be a polynomial of degree k with nonnegative coefficients. Then
1. f ≍ n^k
2. n^k ∈ O(n^(k+1))
3. n^(k+1) ∉ O(n^k)

Proof
1. Let c = a_0 + a_1 + ... + a_k and d = a_0 + 1. It is easy to see that f(n) ≤ c·n^k for n ≥ 1 and f(0) = a_0 ≤ d, so f ∈ O(n^k). Conversely, n^k ∈ O(f) because a_k > 0.
2. Obviously n^k ≤ n^(k+1).
3. Suppose n^(k+1) ∈ O(n^k). Hence there are c, d > 0 such that n^(k+1) ≤ c·n^k + d for each n. Let n = c + d + 1. It follows that n^(k+1) = (c + d + 1)·n^k = c·n^k + d·n^k + n^k > c·n^k + d. Contradiction. Therefore n^(k+1) ∉ O(n^k).

Lemma 1. Let f, g : N → N be monotonic, i.e. for all m ≥ n: f(m) ≥ f(n) and g(m) ≥ g(n). Further let n_0 ∈ N be a natural number such that for each n ≥ n_0,

g(n + 1)/g(n) ≤ f(n + 1)/f(n)

Then g ∈ O(f).

Proof Let c be a number such that

g(n_0) ≤ c·f(n_0)

From an earlier exercise:

∀n ≥ n_0 : g(n) ≤ c·f(n)

Let d = g(n_0). We can prove that for each n ∈ N:

g(n) ≤ c·f(n) + d

Theorem Let r > 1. Then
1. n^i ∈ O(r^n)
2. r^n ∉ O(n^i)

Proof
1. It follows immediately from lemma 1. (Elaborate)
2. Suppose r^n ∈ O(n^i). Hence n^i ≍ r^n ≍ n^(i+1). This contradicts the lemma on polynomials, which gives n^(i+1) ∉ O(n^i). (Elaborate)

Alphabet and Languages

An alphabet is a finite set of symbols.

A string over an alphabet is a finite sequence of symbols from this alphabet.

The empty string is denoted by e.

The length of a string is its length as a sequence.

The concatenation of two strings x,y, denoted by x ◦ y or simply xy, is the string x followed by string y.

Concatenation is associative, i.e. (x ◦ y) ◦ z = x ◦ (y ◦ z).

A string v is a substring of a string w if there are strings x, y such that w = xvy.

For each string w and each natural number n, define
– w^0 = e
– w^(n+1) = w^n ◦ w
This is an example of an inductive definition.

The reversal of a string is defined inductively as follows:
– e^R = e
– (ua)^R = a u^R, for any string u and symbol a

Exercise Prove (w ◦ v)^R = v^R ◦ w^R

The set of all strings over an alphabet Σ is denoted by Σ∗.

Any set of strings over an alphabet Σ is a language.

Question: Is Σ∗ countably infinite?

Language Operations

– Complement: The complement of a language A over Σ is Σ∗ − A.
– Concatenation: Let L_1, L_2 be two languages over Σ. The concatenation of L_1, L_2, denoted by L_1 ◦ L_2, is defined by:

L_1 ◦ L_2 = {x ◦ y | x ∈ L_1, y ∈ L_2}

– Denote
1. L^0 = {e},
2. L^1 = L,
3. L^2 = L ◦ L,
4. L^(n+1) = L^n ◦ L.

The Kleene star of a language L, denoted by L∗, is

L∗ = {w_1 ◦ w_2 ◦ ... ◦ w_n | w_i ∈ L, n ≥ 0} = L^0 ∪ L^1 ∪ L^2 ∪ ... ∪ L^n ∪ ...

Note: ∅∗ = {e}

Define

L+ = {w_1 ◦ w_2 ◦ ... ◦ w_n | w_i ∈ L, n ≥ 1}

Regular Languages

The class of regular languages is the least set of languages satisfying the following properties:
– The empty set is regular
– For each character a ∈ Σ, the language {a} is regular
– If A, B are regular languages then A∗, A ∪ B, A ◦ B are also regular.

Finite Representation of Languages

If we want to study languages, we need to represent them. For example, a context-free language could be represented by a context-free grammar. Such a representation must be finite even if the represented language is infinite, otherwise it would be useless (just imagine an infinite set of grammar rules for Thai, Chinese or English. Would you be able to learn such a set of rules?). On the other hand, different languages should have different representations.

Could we have enough representations for every possible language over a nonempty alphabet Σ?

How many languages are there over a nonempty alphabet Σ?

Since Σ∗ is countably infinite, 2^(Σ∗) is uncountably infinite.

The representation of languages must again be some form of language. Hence every finite representation can be viewed as a finite string over some alphabet Σ′. Since Σ′∗ is countably infinite, we cannot find a finite representation for each language over Σ.

Finite representation of regular languages

The set of regular expressions over Σ is the least language over Σ ∪ {∗, (, ), ∅, ∪, ◦} satisfying the following properties:
– ∅ is a regular expression
– Every symbol in Σ is a regular expression
– If α, β are regular expressions, then so are (αβ), α∗, α ∪ β

Any regular expression represents a language as follows:
– L(∅) = ∅, the empty set
– For every symbol a ∈ Σ, L(a) = {a}
– If α, β are regular expressions, then L(αβ) = L(α) ◦ L(β), L(α∗) = L(α)∗, L(α ∪ β) = L(α) ∪ L(β)

Theorem 4. A language A is regular iff there exists a regular expression α such that A = L(α).

Proof Exercise.

Exercise Show that the Kleene star, concatenation and union operators are monotonic, i.e. for languages L_0, L_1, L the following hold:
1. If L_0 ⊆ L_1 then (L_0)∗ ⊆ (L_1)∗
2. If L_0 ⊆ L_1 then L_0 ◦ L ⊆ L_1 ◦ L
3. If L_0 ⊆ L_1 then L_0 ∪ L ⊆ L_1 ∪ L

Exercise Check whether the following equations are correct.
1. ((a ∪ b)∗)∗ = (a ∪ b)∗
2. (a ∪ b)∗(a ∪ b)∗ = (a ∪ b)∗
3. (a ∪ b)∗ = a∗ ∪ (a∗b)∗
4. (a ∪ b)∗ = ∅∗ ∪ (a∗b)∗
5. (a ∪ b)∗ = (b∗a)∗ ∪ (a∗b)∗

Exercise Exercise 1.8.7 in the textbook.

Question Given a regular language L and a string w, how could we check whether w ∈ L?

Example Let L = (a ∪ b∗a)∗ and w = aababa.

Question: w ∈ L?

Algorithm:

s : x := get-next-symbol;
    if x = end-of-file then accept;
    else if x = a then goto s;
    else if x = b then goto q;

q : x := get-next-symbol;
    if x = end-of-file then reject;
    else if x = a then goto s;
    else if x = b then goto q;

Fig. 1.

Fig. 2. Fig. 3.

A shortened representation of the algorithm in the figure above is M = (K, Σ, δ, s, F):

1. K = {s, q}

2. Σ = {a, b}

3. F = {s}

4. δ : K × Σ −→ K

δ = {(s, a, s), (s, b, q), (q, a, s), (q, b, q)}

Chapter 2: Finite Automata

Chapter 2.1: Deterministic Finite Automata

Definition 1. A deterministic finite automaton is a quintuple M = (K, Σ, δ, s, F) where

– K is a finite set of states

– Σ is an alphabet

– s ∈ K is the initial state

– F ⊆ K is the set of final states

– δ, the transition function, is a function from K × Σ to K. 

An example of a finite automaton is given in figure 3.

A configuration of a finite automaton M = (K, Σ, δ, s, F) is a pair (q, w) where q ∈ K and w ∈ Σ∗.

We say that a configuration (q, w) yields another configuration (q′, w′) in one step, denoted by (q, w) ⊢_M (q′, w′), if there is a symbol σ ∈ Σ such that w = σw′ and q′ = δ(q, σ).

⊢*_M is the reflexive and transitive closure of ⊢_M, i.e. we write (q_0, w_0) ⊢*_M (q_n, w_n) iff there are (q_1, w_1), ..., (q_{n−1}, w_{n−1}) such that

(q_0, w_0) ⊢_M (q_1, w_1) ⊢_M ... ⊢_M (q_{n−1}, w_{n−1}) ⊢_M (q_n, w_n)

– A string w is said to be accepted by M = (K, Σ, δ, s, F) if and only if there exists a state q ∈ F such that (s, w) ⊢*_M (q, e).

– The language accepted by M, L(M), is the set of all strings accepted by M.
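Acceptance by a deterministic automaton can be simulated by following one transition per input symbol. The sketch below assumes δ is stored as a dict keyed by (state, symbol) pairs and uses the two-state machine for (a ∪ b∗a)∗ given at the end of Chapter 1; the function name is hypothetical.

```python
def dfa_accepts(delta, start, finals, w):
    """Follow the unique computation (s, w) |- ... |- (q, e) and test q in F."""
    q = start
    for symbol in w:
        q = delta[(q, symbol)]   # one step of the yields relation
    return q in finals

# M = (K, Sigma, delta, s, F) with K = {s, q}, Sigma = {a, b}, F = {s}.
delta = {
    ("s", "a"): "s", ("s", "b"): "q",
    ("q", "a"): "s", ("q", "b"): "q",
}

print(dfa_accepts(delta, "s", {"s"}, "aababa"))  # True
print(dfa_accepts(delta, "s", {"s"}, "aab"))     # False
print(dfa_accepts(delta, "s", {"s"}, ""))        # True: the empty string e
```

The loop performs one dictionary lookup per symbol, so the running time is proportional to |w|.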

Exercise Design an automaton for recognizing the language a∗ba∗b.

Chapter 2.2: Nondeterministic Finite Automata

Definition 2. A nondeterministic finite automaton is a quintuple M = (K, Σ, ∆, s, F) where
– K is a finite set of states
– Σ is an alphabet
– s ∈ K is the initial state
– F ⊆ K is the set of final states
– ∆, the transition relation, is a finite subset of K × (Σ ∪ {e}) × K.

Example A nondeterministic machine for (a ∪ ab)∗.

Fig. 4. Fig. 5.

M = (K, Σ, ∆, s, F):

– K = {s, q}

– Σ = {a, b}

– F = {s}

– ∆ ⊆ K × (Σ ∪ {e}) × K:

∆ = {(s, a, q), (q, e, s), (q, b, s)}

Exercise Design a deterministic machine for (a ∪ ab)∗.

Exercise Could you design a nondeterministic machine for (a ∪ ab)∗ without using e?

A configuration of a nondeterministic finite automaton M = (K, Σ, ∆, s, F) is a pair (q, w) where q ∈ K and w ∈ Σ∗.

We say that a configuration (q, w) yields another configuration (q′, w′) in one step, denoted by (q, w) ⊢_M (q′, w′), if there is u ∈ Σ ∪ {e} such that w = uw′ and (q, u, q′) ∈ ∆.

⊢*_M is the reflexive and transitive closure of ⊢_M, i.e. we write (q_0, w_0) ⊢*_M (q_n, w_n) iff there are (q_1, w_1), ..., (q_{n−1}, w_{n−1}) such that

(q_0, w_0) ⊢_M (q_1, w_1) ⊢_M ... ⊢_M (q_{n−1}, w_{n−1}) ⊢_M (q_n, w_n)

– A string w is said to be accepted by M = (K, Σ, ∆, s, F) if and only if there exists a state q ∈ F such that (s, w) ⊢*_M (q, e).

– The language accepted by M, L(M), is the set of all strings accepted by M.
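One way to decide acceptance for a nondeterministic automaton is to track the set of all states the machine could currently be in. The sketch below assumes ∆ is a set of triples with the empty string "" standing for e, and uses the machine for (a ∪ ab)∗ from the example above; `e_closure` and `nfa_accepts` are hypothetical helper names.

```python
def e_closure(states, Delta):
    """All states reachable from `states` using e-transitions only."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for (p, u, r) in Delta:
            if p == q and u == "" and r not in closure:
                closure.add(r)
                stack.append(r)
    return closure

def nfa_accepts(Delta, start, finals, w):
    """Track every state reachable after reading each prefix of w."""
    current = e_closure({start}, Delta)
    for symbol in w:
        moved = {r for (p, u, r) in Delta if p in current and u == symbol}
        current = e_closure(moved, Delta)
    return bool(current & set(finals))

# Delta = {(s, a, q), (q, e, s), (q, b, s)}, F = {s}
Delta = {("s", "a", "q"), ("q", "", "s"), ("q", "b", "s")}

print(nfa_accepts(Delta, "s", {"s"}, "aab"))   # True:  a . ab
print(nfa_accepts(Delta, "s", {"s"}, "abb"))   # False
print(nfa_accepts(Delta, "s", {"s"}, ""))      # True
```

Each input symbol costs at most a pass over ∆ plus a closure computation, so the running time is polynomial in |w| and the size of M.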

Exercise Design a deterministic and a nondeterministic automaton for recognizing the language (ab ∪ aba)∗.

Chapter 2.3: Equivalence of Deterministic and Nondeterministic Finite Automata

Two finite automata M, M′ are equivalent iff L(M) = L(M′).

Theorem 5. For each nondeterministic finite automaton there exists an equivalent deterministic finite automaton.

Proof Let M = (K, Σ, ∆, s, F) be a nondeterministic automaton. For each q ∈ K, define

E(q) = {p ∈ K | (q, e) ⊢*_M (p, e)}

Define M′ = (K′, Σ, δ′, s′, F′) where
– K′ = 2^K
– s′ = E(s)
– δ′ : K′ × Σ → K′ is given by

δ′(Q, σ) = ⋃ {E(p) | p ∈ K and (q, σ, p) ∈ ∆ for some q ∈ Q}

– F′ = {Q ⊆ K | Q ∩ F ≠ ∅}

Property For each w ∈ Σ∗: (q, w) ⊢*_M (p, e) iff (E(q), w) ⊢*_M′ (P, e) for some P ⊆ K such that p ∈ P.

Therefore, it follows that

w ∈ L(M)
iff (s, w) ⊢*_M (p, e) for some p ∈ F
iff (s′, w) ⊢*_M′ (P, e) for some P ∈ F′
iff w ∈ L(M′).

Example Let Σ = {a_1, ..., a_n}, n ≥ 2, and

L = {w | there is a symbol in Σ not appearing in w}

A nondeterministic finite automaton accepting L:

M = (K, Σ, ∆, s, F ) where

– K = {s, q_1, ..., q_n}

– F = {q_1, ..., q_n}

– ∆ = {(s, e, q_i) | 1 ≤ i ≤ n} ∪ {(q_i, σ, q_i) | 1 ≤ i ≤ n, σ ≠ a_i}

Translate M into an equivalent deterministic finite automaton.

Chapter 2.4: Properties of Languages Accepted by Finite Automata

Theorem 6. The class of languages accepted by finite automata is closed under

1. union

2. concatenation

3. Kleene star

4. complementation

5. intersection

Theorem 7. A language is regular iff it is accepted by a finite automaton.

Proof From the definition of regular languages, it is clear that every regular language is accepted by a finite automaton (exercise).

It remains to prove that every language accepted by a finite automaton is also regular.

Let M = (K, Σ, δ, s, F) be a deterministic finite automaton.

Let K = {q_1, ..., q_n} with s = q_1.

For i, j = 1, ..., n and k = 0, ..., n, define

R(i, j, k) = { σ_1 ... σ_m ∈ Σ∗ | (q_i, σ_1 ... σ_m) ⊢_M (q_{k_1}, σ_2 ... σ_m) ⊢_M ... ⊢_M (q_{k_{m−1}}, σ_m) ⊢_M (q_j, e) and max{k_1, ..., k_{m−1}} ≤ k }

It is clear that

L(M) = ⋃ {R(1, j, n) | q_j ∈ F}

Now we want to show that all the sets R(i, j, k) are regular. We prove this by induction on k.

Base case: k = 0. Obvious from the definition of R(i, j, 0).

Inductive step: Follows from the following equation

R(i, j, k) = R(i, j, k−1) ∪ R(i, k, k−1) ◦ R(k, k, k−1)∗ ◦ R(k, j, k−1)

Pumping Theorem

Theorem 8. Let L be a regular language. There is an integer n > 0 such that any string w ∈ L with |w| ≥ n can be rewritten as w = xyz such that y ≠ e, |xy| ≤ n and x y^k z ∈ L for any k ≥ 0.

Example

L = {a^i b^i | i ≥ 0} is not regular.

Example

L = {w ∈ {a, b}∗ | w has an equal number of a's and b's}

is not regular.

STATE MINIMIZATION

Problem: Given a regular language L, design a finite automaton M such that L = L(M) and M has as few states as possible, i.e. for each finite automaton M′ that accepts L, the number of states of M′ is greater than or equal to the number of states of M.

Definition 3. Let M = (K, Σ, δ, s, F) be a deterministic automaton. We say that two strings x, y ∈ Σ∗ are equivalent wrt M, denoted by x ∼_M y, if there is a state q such that (s, x) ⊢*_M (q, e) and (s, y) ⊢*_M (q, e).

Definition 4. Let L ⊆ Σ∗ and x, y ∈ Σ∗. We say that x and y are equivalent wrt L, denoted by x ≈_L y, if for all z ∈ Σ∗,

xz ∈ L iff yz ∈ L.

It is obvious that for all x, y ∈ Σ∗ and a ∈ Σ, x ≈_L y implies xa ≈_L ya.

Theorem 9. For any deterministic finite automaton M = (K, Σ, δ, s, F) and for any strings x, y ∈ Σ∗, if x ∼_M y then x ≈_L(M) y.

Let M = (K, Σ, δ, s, F) be a deterministic finite automaton, let q ∈ K, and let L_q be the set of all strings x such that (s, x) ⊢*_M (q, e). Further, for each x ∈ Σ∗, let [x]_M be the equivalence class of x wrt ∼_M.

It is obvious that for each x ∈ L_q the following assertions hold:

– L_q = [x]_M.
– The mapping µ(q) = L_q is a bijection from K onto {[x]_M | x ∈ Σ∗} such that for all p, q ∈ K,

δ(q, a) = p iff [xa]_M = L_p

Therefore M is equivalent to the following automaton M′ = (K′, Σ, δ′, s′, F′) where

– K′ = {[x]_M | x ∈ Σ∗}
– s′ = [e]_M
– δ′([x]_M, a) = [xa]_M
– F′ = {[x]_M | (s, x) ⊢*_M (q, e) and q ∈ F}

Let L ⊆ Σ∗ be a regular language.

Define a deterministic finite automaton M = (K, Σ, δ, s, F) as follows:

– K = {[x]_L | x ∈ Σ∗} where [x]_L is the equivalence class of x wrt ≈_L
– s = [e]_L
– F = {[x]_L | x ∈ L}
– δ([x]_L, a) = [xa]_L

Theorem 10. L = L(M)

Question Given a deterministic finite automaton M = (K, Σ, δ, s, F), how can we construct a deterministic finite automaton M′ such that L(M) = L(M′) and M′ has a minimal number of states?

– Define

L′_p = {w | ∃f ∈ F : (p, w) ⊢*_M (f, e)}

L′_{p,n} = {w | w ∈ L′_p, |w| ≤ n}

– For p, q ∈ K, define p ≡ q iff L′_p = L′_q

– For p, q ∈ K, define p ≡_n q iff L′_{p,n} = L′_{q,n}

It is obvious that

≡_0 ⊇ ≡_1 ⊇ ... ⊇ ≡_n ⊇ ... ⊇ ≡

Because K is finite, there is n such that

≡_n = ≡_{n+1} = ≡

Lemma 2. For any two states p, q and any number n ≥ 0, p ≡_{n+1} q iff the following conditions hold:

– p ≡_n q
– for each a ∈ Σ: δ(p, a) ≡_n δ(q, a)

ALGORITHMS FOR FINITE AUTOMATA

Theorem 11.
1. There is an exponential algorithm which, given a nondeterministic finite automaton, constructs an equivalent deterministic finite automaton.
2. There is a polynomial algorithm which, given a deterministic finite automaton, constructs an equivalent deterministic finite automaton with a minimal number of states.
3. There is a polynomial algorithm which, given two deterministic finite automata, decides whether they are equivalent.
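Lemma 2 suggests one possible shape for the polynomial minimization algorithm in item 2: start from ≡_0, which only separates final from non-final states, and refine until nothing changes. The following is a sketch, not necessarily the algorithm the notes have in mind. It assumes δ is a total function stored as a dict, and it does not remove unreachable states. The function name and both sample automata are hypothetical.

```python
def equivalent_state_classes(K, Sigma, delta, F):
    """Group DFA states into the classes of the limit relation of ≡_0, ≡_1, ..."""
    states, alphabet = sorted(K), sorted(Sigma)
    # ≡_0: two states are related iff both or neither are final.
    block = {p: int(p in F) for p in states}
    while True:
        # Lemma 2: p ≡_{n+1} q iff p ≡_n q and delta(p,a) ≡_n delta(q,a) for all a.
        signature = {
            p: (block[p],) + tuple(block[delta[(p, a)]] for a in alphabet)
            for p in states
        }
        ids = {}
        refined = {p: ids.setdefault(signature[p], len(ids)) for p in states}
        # Refinement only splits classes, so an equal count means no change.
        if len(set(refined.values())) == len(set(block.values())):
            break
        block = refined
    classes = {}
    for p in states:
        classes.setdefault(block[p], []).append(p)
    return sorted(classes.values())

# Four states accepting the strings over {a, b} that end in a.
delta = {
    ("s", "a"): "p", ("s", "b"): "r",
    ("p", "a"): "q", ("p", "b"): "s",
    ("q", "a"): "p", ("q", "b"): "r",
    ("r", "a"): "q", ("r", "b"): "r",
}
print(equivalent_state_classes({"s", "p", "q", "r"}, {"a", "b"}, delta, {"p", "q"}))
# [['p', 'q'], ['r', 's']]
```

Merging the states in each class then gives a two-state automaton for the same language. Each round costs about |K|·|Σ| lookups and there are at most |K| rounds, which is consistent with a polynomial bound.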

Theorem 12. Given a deterministic finite automaton M = (K, Σ, δ, s, F) and a string w, there is an algorithm to decide whether w is accepted by M in O(|w|) time.

Theorem 13. Given a nondeterministic finite automaton M = (K, Σ, ∆, s, F) and a string w, there is a polynomial algorithm to decide whether w is accepted by M.

Proof The algorithm is given below:

S_0 := E(s); n := 0;
repeat the following {
    n := n + 1;
    σ := the n-th input symbol;
    if σ ≠ end-of-file then
        S_n := ⋃ {E(q) | ∃p ∈ S_{n−1} : (p, σ, q) ∈ ∆}
} until σ = end-of-file
if S_{n−1} ∩ F ≠ ∅ then accept else reject

The algorithm is polynomial.

Context Free Languages

Definition 5. A context-free grammar G is a quadruple (V, Σ, R, S) where
– V is an alphabet
– Σ (the set of terminals) is a subset of V
– R (the set of rules) is a finite subset of (V − Σ) × V∗
– S (the start symbol) is an element of V − Σ.

– The elements of V − Σ are called nonterminals.
– We write A →_G u for any rule (A, u) ∈ R.
– For any strings w, v ∈ V∗, we write w ⇒_G v iff there are strings x, y ∈ V∗ and a rule A →_G u in R such that w = xAy and v = xuy.
– A sequence

w_0 ⇒_G w_1 ⇒_G ... ⇒_G w_n, n ≥ 0

is called a derivation of w_n from w_0.

– We write w ⇒*_G v iff there is a derivation of v from w.

The language generated by G is

L(G) = {w ∈ Σ∗ | S ⇒*_G w}

A context-free language is a language generated by some context-free grammar.

Example A grammar for expressions:
– Σ = {x, +, ∗, (, )}
– V = Σ ∪ {T, F, E}
– R = { E → E + T, E → T, T → T ∗ F, T → F, F → (E), F → x }

Give a derivation for (x), (x + x) ∗ x.

Regular Languages are Context-free

Theorem 14. Any regular language is context-free.

Proof Let L be a regular language accepted by a deterministic finite automaton M = (K, Σ, δ, S, F). Construct a grammar G = (V, Σ, R, S) as follows:
– V = K ∪ Σ
– R consists of rules of the forms
  • P → aQ where δ(P, a) = Q, and
  • P → e for P ∈ F

We show by induction on n that

(S, a_1 ... a_n) ⊢_M (Q_1, a_2 ... a_n) ⊢_M ... ⊢_M (Q_n, e)
iff
S ⇒_G a_1 Q_1 ⇒_G ... ⇒_G a_1 ... a_n Q_n

– Basic step: n = 0. Obvious.
– Inductive step: Suppose the assertion holds for n. We show that it holds for n + 1. Let

(S, a_1 ... a_n a_{n+1}) ⊢_M (Q_1, a_2 ... a_{n+1}) ⊢_M ... ⊢_M (Q_{n+1}, e)

be a computation in M. Hence,

(S, a_1 ... a_n) ⊢_M (Q_1, a_2 ... a_n) ⊢_M ... ⊢_M (Q_n, e)

is also a computation in M. From the induction hypothesis,

S ⇒_G a_1 Q_1 ⇒_G ... ⇒_G a_1 ... a_n Q_n

is a derivation wrt G. From (Q_n, a_{n+1}) ⊢_M (Q_{n+1}, e), it follows that δ(Q_n, a_{n+1}) = Q_{n+1}. Hence Q_n → a_{n+1} Q_{n+1} is a rule in R. Therefore a_1 ... a_n Q_n ⇒_G a_1 ... a_n a_{n+1} Q_{n+1}. The other direction holds obviously.

Hence for any string w ∈ Σ∗, w ∈ L iff (S, w) ⊢*_M (P, e) and P ∈ F iff S ⇒*_G wP and P → e in R iff S ⇒*_G w.

Parse Trees

Let G = (V, Σ, R, S).

1. For each a ∈ Σ, the tree consisting of a single node labelled a is a parse tree. The root of this tree is a, which is also its only leaf. The yield of this tree is also a.

2. If A −→ e is a rule in G, then

Fig. 6.

is a parse tree whose root is A, whose only leaf is e and whose yield is e.

3. If

Fig. 7.

are parse trees (n ≥ 1), with roots A_1, ..., A_n and yields y_1, ..., y_n respectively, and A → A_1 ... A_n is a rule in R, then

Fig. 8.

is also a parse tree whose root is A, whose leaves are the leaves of T_1, ..., T_n and whose yield is y_1 ... y_n.

4. Nothing else is a parse tree.

Examples Construct parse trees yielding (x), (x + x) ∗ x wrt the grammar for arithmetic expressions.

Lemma 3. Let G = (V, Σ, R, S) be a context-free grammar, and let A ∈ V − Σ and w ∈ Σ∗. Then the following statements are equivalent:
1. A ⇒∗ w
2. There is a parse tree with root A and yield w.

– We write x ⇒_L y if the nonterminal symbol being replaced is the leftmost nonterminal symbol in the string, i.e. x = wAv, y = wuv where w ∈ Σ∗, v ∈ V∗, A → u ∈ R.

A leftmost derivation is of the form

x_1 ⇒_L x_2 ⇒_L ... ⇒_L x_n

– We write x ⇒_R y if the nonterminal symbol being replaced is the rightmost nonterminal symbol in the string, i.e. x = wAv, y = wuv where v ∈ Σ∗, w ∈ V∗, A → u ∈ R.

A rightmost derivation is of the form

x_1 ⇒_R x_2 ⇒_R ... ⇒_R x_n

Theorem 15. Let G = (V, Σ, R, S) be a context-free grammar, and let A ∈ V − Σ and w ∈ Σ∗. Then the following statements are equivalent:

1. A ⇒∗ w

2. There is a parse tree with root A and yield w

3. There is a leftmost derivation A ⇒*_L w

4. There is a rightmost derivation A ⇒*_R w

AMBIGUITY

Let Σ = {x, +, ∗, (, )}. Consider the following two grammars:

– G1 = (V1,Σ,R1,E)

• V1 = Σ ∪ {T,F,E}

• R1 = { E → E + T, E → T, T → T ∗ F, T → F, F → (E), F → x }

– G2 = (V2,Σ,R2,E)

• V2 = Σ ∪ {E}

• R2 = { E → E + E, E → E ∗ E, E → (E), E → x }

G1 is unambiguous in the sense that for each string w ∈ L(G1), there is exactly one parse tree that yields w. On the contrary, G2 is ambiguous as there are distinct parse trees yielding the same string.

Exercise Could we use the following grammar as the generator of arithmetic expressions in pro- gramming languages: – Σ = {x, +, ∗, (, )} – V = Σ ∪ {T,F,E} – R = { E −→ E ∗ T E −→ T T −→ T + F T −→ F F −→ (E) F −→ x } Pushdown Automata

Definition 6. A pushdown automaton is a sextuple M = (K, Σ, Γ, ∆, s, F) where

– K is a finite set of states

– Σ is an alphabet (the input symbols)

– Γ is an alphabet (the stack symbols)

– s ∈ K is the initial state

– F ⊆ K is the set of final states

– ∆, the transition relation, is a finite subset of (K × (Σ ∪ {e}) × Γ ∗) × (K × Γ ∗)

For ease of understanding, we often write (p, a, α) → (q, β) for ((p, a, α), (q, β)) ∈ ∆.

Let L = { wcwR | w ∈ {a, b}∗ }.

L is accepted by M = (K, Σ, Γ, ∆, s, F )

– K = {s, p, f}, F = {f}

– Σ = {a, b, c}, Γ = {a, b}

– ∆ consists of the following rules

(s, a, e) → (p, a)   (s, b, e) → (p, b)   (s, c, e) → (f, e)

(p, a, e) → (p, a)   (p, b, e) → (p, b)   (p, c, e) → (f, e)

(f, a, a) → (f, e)   (f, b, b) → (f, e)

Fig. 9.

A configuration of a pushdown automaton is a triple (q, w, v) where q ∈ K, w ∈ Σ∗ and v ∈ Γ∗.

A configuration (q, w, v) yields another configuration (q′, w′, v′) in one step, denoted by (q, w, v) ⊢M (q′, w′, v′), if there is a rule (q, a, α) → (q′, β) in ∆ such that w = aw′, v = αv0 and v′ = βv0 for some v0 ∈ Γ∗.

We write (q, w, v) ⊢M∗ (q′, w′, v′) if there is a sequence

(q1, w1, v1) ⊢M ... ⊢M (qn, wn, vn) such that (q, w, v) = (q1, w1, v1) and (qn, wn, vn) = (q′, w′, v′).

A string w is said to be accepted by M = (K, Σ, Γ, ∆, s, F) if and only if there exists a state q ∈ F such that (s, w, e) ⊢M∗ (q, e, e).
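The yields relation and the acceptance condition can be traced mechanically. The sketch below searches the configurations of the automaton for { wcwR } given earlier; it is an illustration only. The names DELTA and accepts are hypothetical, the empty string "" plays the role of e, and the stack is a string whose first character is the top.

```python
from collections import deque

# Transitions ((state, input, pop), (state, push)); "" plays the role of e.
DELTA = [
    (("s", "a", ""), ("p", "a")), (("s", "b", ""), ("p", "b")), (("s", "c", ""), ("f", "")),
    (("p", "a", ""), ("p", "a")), (("p", "b", ""), ("p", "b")), (("p", "c", ""), ("f", "")),
    (("f", "a", "a"), ("f", "")), (("f", "b", "b"), ("f", "")),
]
START, FINAL = "s", {"f"}


def accepts(w: str) -> bool:
    """Breadth-first search over configurations (state, remaining input, stack)."""
    start = (START, w, "")
    seen, queue = {start}, deque([start])
    while queue:
        q, rest, stack = queue.popleft()
        if q in FINAL and rest == "" and stack == "":
            return True  # reached (q, e, e) with q in F
        for (p, a, alpha), (r, beta) in DELTA:
            if p == q and rest.startswith(a) and stack.startswith(alpha):
                nxt = (r, rest[len(a):], beta + stack[len(alpha):])
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return False
```

Every transition of this particular automaton consumes an input symbol, so the search is finite; for automata with e-moves that push symbols, an unbounded search like this one might not terminate.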

The language accepted by M, L(M), is the set of all strings accepted by M.

Theorem 16. A language is context-free iff it is accepted by a pushdown automaton.

Lemma 4. Each context-free language is accepted by some pushdown automaton.

Proof Let G = (V,Σ,R,S) be a context-free grammar. The language generated by G is accepted by the pushdown automaton M = (K, Σ, Γ, ∆, s, F):

– K = {s, q}, F = {q}

– Γ = V

– ∆ consists of the following transitions:

• (s, e, e) → (q, S)

• (q, e, A) → (q, x) for each rule A → x in R

• (q, a, a) → (q, e) for each a ∈ Σ.
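The construction in the proof can be written down directly. The following sketch makes several assumptions that are not in the notes: every grammar symbol is a single character, "" stands for e, and the names cfg_to_pda and accepts are hypothetical. The search prunes configurations whose stack holds more terminals than there is input left; this keeps the search finite for the sample grammar S → aSb, S → e, but it is not a general termination guarantee.

```python
from collections import deque


def cfg_to_pda(terminals, rules, start):
    """Transition set of the two-state automaton built in the proof."""
    delta = {(("s", "", ""), ("q", start))}            # (s, e, e) -> (q, S)
    for head, body in rules:
        delta.add((("q", "", head), ("q", body)))      # (q, e, A) -> (q, x)
    for a in terminals:
        delta.add((("q", a, a), ("q", "")))            # (q, a, a) -> (q, e)
    return delta


def accepts(delta, terminals, w):
    """Search configurations (state, remaining input, stack) breadth first."""
    start = ("s", w, "")
    seen, queue = {start}, deque([start])
    while queue:
        state, rest, stack = queue.popleft()
        if state == "q" and rest == "" and stack == "":
            return True
        for (p, a, alpha), (r, beta) in delta:
            if p == state and rest.startswith(a) and stack.startswith(alpha):
                nxt = (r, rest[len(a):], beta + stack[len(alpha):])
                # Prune: each terminal on the stack must still be matched by input.
                if sum(ch in terminals for ch in nxt[2]) > len(nxt[1]):
                    continue
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return False


TERMINALS = {"a", "b"}
DELTA = cfg_to_pda(TERMINALS, [("S", "aSb"), ("S", "")], "S")
```

With this sample grammar, accepts(DELTA, TERMINALS, "aabb") follows the leftmost derivation S ⇒ aSb ⇒ aaSbb ⇒ aabb on the stack.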



Exercise Construct a pushdown automaton accepting the language generated by the following grammar G = (V,Σ,R,S) where

– V = {S, a, b}, Σ = {a, b}

– R consists of the rules S → aSa, S → bSb, S → e

A pushdown automaton M = (K, Σ, Γ, ∆, s, F) is called simple if the following condition is satisfied:

For each ((q, a, β), (p, γ)) ∈ ∆ such that q ≠ s: β ∈ Γ and |γ| ≤ 2

Claim For every pushdown automaton there exists an equivalent simple pushdown automaton.

Proof Let M = (K, Σ, Γ, ∆, s, F) be a pushdown automaton. Construct a simple pushdown automaton M′ = (K′, Σ, Γ ∪ {Z}, ∆′, s′, {f′}) as follows:

– Add to ∆ the transitions ((s′, e, e), (s, Z)) and ((f, e, Z), (f′, e)) for each f ∈ F.

– Replace transitions with |β| ≥ 2. Let ((q, a, β), (p, γ)) be a transition with β = B1 ... Bn and n > 1. Replace this transition by the following transitions, where q_B1, q_B1B2, ... are new states:

((q, e, B1), (q_B1, e))

((q_B1, e, B2), (q_B1B2, e))

......

((q_B1B2...Bn−2, e, Bn−1), (q_B1B2...Bn−1, e))

((q_B1B2...Bn−1, a, Bn), (p, γ))

– Get rid of transitions with |γ| > 1. Let ((q, a, β), (p, γ)) be a transition with γ = C1 ... Cm and m > 1. Replace this transition by the following transitions, where r1, ..., rm−1 are new states:

((q, a, β), (r1, Cm))

((r1, e, e), (r2, Cm−1))

......

((rm−2, e, e), (rm−1, C2))

((rm−1, e, e), (p, C1))

– Get rid of transitions with β = e. Replace all transitions of the form ((q, a, e), (p, γ)) with q ≠ s′ by all transitions of the form ((q, a, A), (p, γA)) with A ∈ Γ ∪ {Z}.

Lemma 5. Any language accepted by a simple pushdown automaton is context-free.

Proof Let M = (K, Σ, Γ, ∆, s, F) be a pushdown automaton and let M′ be constructed as above. Define G = (V,Σ,R,S) as follows:

– V = {S} ∪ Σ ∪ K × (Γ ∪ {e}) × K

– R consists of the following rules:

• S → ⟨s, Z, f′⟩

• for each transition ((q, a, B), (r, C)) ∈ ∆′ and for each p ∈ K′: ⟨q, B, p⟩ → a⟨r, C, p⟩

• for each transition ((q, a, B), (r, C1C2)) ∈ ∆′ and for all p, p′ ∈ K′: ⟨q, B, p⟩ → a⟨r, C1, p′⟩⟨p′, C2, p⟩

• ⟨q, e, q⟩ → e for each q ∈ K′

Properties of Context-free Languages

Theorem 17. The context-free languages are closed under union, concatenation and Kleene star.

Proof Let L, L′ be context-free languages generated by context-free grammars G = (V,Σ,R,S), G′ = (V′,Σ,R′,S′) where we assume that V − Σ and V′ − Σ are disjoint. Construct new grammars from G, G′ to generate L ∪ L′, L ◦ L′, L∗.
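The proof leaves the three constructions to the reader. One standard way to carry them out is to add a fresh start symbol; the sketch below assumes this approach. A grammar is represented here as a triple (nonterminals, rules, start) with rules as (head, body tuple) pairs, and the names union, concatenation, kleene_star and the fresh symbol "S0" are hypothetical.

```python
def union(g1, g2, new_start="S0"):
    """Grammar for L(g1) ∪ L(g2): new rules S0 -> S and S0 -> S'."""
    (n1, r1, s1), (n2, r2, s2) = g1, g2
    assert not (n1 & n2) and new_start not in (n1 | n2)  # disjointness assumption
    return (n1 | n2 | {new_start}, r1 + r2 + [(new_start, (s1,)), (new_start, (s2,))], new_start)


def concatenation(g1, g2, new_start="S0"):
    """Grammar for L(g1) ◦ L(g2): new rule S0 -> S S'."""
    (n1, r1, s1), (n2, r2, s2) = g1, g2
    assert not (n1 & n2) and new_start not in (n1 | n2)
    return (n1 | n2 | {new_start}, r1 + r2 + [(new_start, (s1, s2))], new_start)


def kleene_star(g, new_start="S0"):
    """Grammar for L(g)*: new rules S0 -> e and S0 -> S S0."""
    n, r, s = g
    assert new_start not in n
    return (n | {new_start}, r + [(new_start, ()), (new_start, (s, new_start))], new_start)


# Sample grammars (illustrative): G generates {a^n b^n}, G' generates {c}.
G = ({"S"}, [("S", ("a", "S", "b")), ("S", ())], "S")
G_PRIME = ({"T"}, [("T", ("c",))], "T")
```

The disjointness of the nonterminal sets matters: without it, rules of one grammar could be applied inside derivations of the other.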

Theorem 18. The intersection of a context-free language with a regular language is a context-free language.

Proof Let L be a context-free language accepted by a pushdown automaton M = (K, Σ, Γ, ∆, s, F) and L′ be a regular language accepted by a deterministic finite automaton M′ = (K′, Σ, δ, s′, F′) where K, K′ are disjoint. Construct a new pushdown automaton from M, M′ to accept L ∩ L′.

Theorem 19. The intersection of two context-free languages is not always context-free. The complement of a context-free language is not always context-free.

Proof Let L0 = {a^m b^n c^n | m, n ≥ 0 }, L1 = {a^m b^m c^n | m, n ≥ 0 }. Both L0, L1 are context-free (exercise). L = L0 ∩ L1 = {a^n b^n c^n | n ≥ 0 }. We show below that L is not context-free. Writing co(X) for the complement of X, we use the equation L0 ∩ L1 = co(co(L0) ∪ co(L1)), together with closure under union, to show that the complement of a context-free language is not always context-free (exercise).

Theorem 20. (Pumping Theorem) Let G = (V,Σ,R,S) be a context-free grammar. Then there is a number n such that any string w ∈ L(G) of length greater than n can be rewritten as w = uvxyz in such a way that either v or y is nonempty and uv^k x y^k z is in L(G) for any k ≥ 0.
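The statement of the pumping theorem can be explored by brute force on small strings. The sketch below is only an illustration, with hypothetical names pumpable, anbn and anbncn: it tries every decomposition w = uvxyz and tests uv^k x y^k z for a few values of k. A result of False means no decomposition survives the tested values of k; a result of True only means that some decomposition survives those tested values, not all k.

```python
from itertools import combinations_with_replacement


def pumpable(w, in_language, ks=(0, 2)):
    """Is there a split w = uvxyz, vy nonempty, with u v^k x y^k z in the language for all k in ks?"""
    n = len(w)
    for i, j, k, l in combinations_with_replacement(range(n + 1), 4):
        u, v, x, y, z = w[:i], w[i:j], w[j:k], w[k:l], w[l:]
        if v + y and all(in_language(u + v * m + x + y * m + z) for m in ks):
            return True
    return False


def anbn(s):
    """Membership in {a^n b^n | n >= 0}."""
    n = len(s) // 2
    return s == "a" * n + "b" * n


def anbncn(s):
    """Membership in {a^n b^n c^n | n >= 0}."""
    n = len(s) // 3
    return s == "a" * n + "b" * n + "c" * n
```

For a^n b^n a matching pair v = a, y = b can be pumped, while for a^3 b^3 c^3 every decomposition already fails at k = 2, which is the idea behind the proof that {a^n b^n c^n} is not context-free.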

Proof The fanout of a grammar G = (V,Σ,R,S), denoted by φ(G), is the largest number of symbols on the right-hand side of any rule in G. A path in the parse tree is a sequence of distinct nodes, each connected to the previous one by a line segment; the first node is the root and the last node is a leaf. The length of a path is the number of line segments in it. The height of the parse tree is the length of the longest path in it.

Lemma 6. The yield of any parse tree of G with height h has length at most φ(G)^h.

Lemma 7. Let G = (V,Σ,R,S) be a context-free grammar. Then any string w ∈ L(G) of length greater than φ(G)^|V−Σ| can be rewritten as w = uvxyz in such a way that either v or y is nonempty and uv^k x y^k z is in L(G) for any k ≥ 0.

Algorithmic Properties

Definition 7. A context-free grammar G = (V,Σ,R,S) is said to be in Chomsky normal form if all rules are of the form A → BC where A ∈ V − Σ and B, C ∈ V.

Theorem 21. There is a polynomial algorithm which, given a context-free grammar G and a string w, decides whether w ∈ L(G).

Proof The proof consists of three main steps:

– A polynomial translation of G into a grammar G′ in Chomsky normal form such that L(G) − (Σ ∪ {e}) = L(G′).

– A polynomial algorithm to decide whether w ∈ L(G′) for |w| ≥ 2.

– A polynomial algorithm to decide whether a ∈ L(G) for a ∈ Σ ∪ {e}.

Lemma 8. For every context-free grammar G = (V,Σ,R,S) there is a context-free grammar G′ in Chomsky normal form such that L(G′) = L(G) − ({e} ∪ Σ). G′ can be constructed in time polynomial in the size of G.

Proof

– Replace each rule A → B1B2 ... Bn with n ≥ 3 by A → B1A1, A1 → B2A2, ......, An−2 → Bn−1Bn where A1,...,An−2 are new nonterminals.

– Construct E = {A | A ⇒∗ e} as follows:

• E := ∅

• While there is a rule A → α with α ∈ E∗ and A ∉ E do E := E ∪ {A}.

– Removing e-rules:

• Delete all rules of the form A → e

• For each rule of the form A → BC or A → CB where B ∈ E add the rule A → C

– Construct for each A, D(A) = {B | A ⇒∗ B} as follows:

• D(A) := {A}

• While there is a rule B → C with B ∈ D(A) and C ∉ D(A) do D(A) := D(A) ∪ {C}.

– Removing rules containing only one symbol on the right-hand side:

• Delete all rules of the form A → B

• Replace each rule of the form A → BC by all possible rules of the form A → B′C′ where B′ ∈ D(B), C′ ∈ D(C)

• Add the rule S → BC for each rule A → BC such that A ∈ D(S) − {S}.

Lemma 9. Let G be a grammar in Chomsky normal form. There is a polynomial algorithm to decide whether w ∈ L(G) for |w| ≥ 2.

Proof Let w = x1 ... xn. Define N[i, j] = {A | A ⇒∗ xi ... xj}. The sets N[i, j] can be computed by

– N[i, i] = {xi}

– N[i, j] = {A | A → BC ∈ R, B ∈ N[i, s], C ∈ N[s + 1, j] for some s with i ≤ s < j }

w ∈ L(G) iff S ∈ N[1, n].

Example Translate the grammar S → (S), S → SS, S → e into Chomsky normal form:

S → SS   S → ()

S → (S1   S1 → S)

Check whether the string ( ( ) ( ( ) ) ) is generated by the Chomsky normal form grammar.
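The table N[i, j] from the proof of Lemma 9 can be filled in bottom-up. The sketch below follows that recurrence with 0-based indices; the names cyk and RULES are hypothetical, and each grammar symbol is a Python string (so "S1" is one symbol).

```python
def cyk(rules, start, w):
    """Decide w in L(G) for |w| >= 2, with G in the notes' Chomsky normal form.

    rules: list of (A, (B, C)) for rules A -> BC, where B and C may be terminals.
    """
    n = len(w)
    # N[(i, i)] = {x_i}: a single symbol derives only itself.
    N = {(i, i): {w[i]} for i in range(n)}
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            cell = set()
            for s in range(i, j):  # split point: x_i..x_s and x_{s+1}..x_j
                for head, (b, c) in rules:
                    if b in N[(i, s)] and c in N[(s + 1, j)]:
                        cell.add(head)
            N[(i, j)] = cell
    return n >= 2 and start in N[(0, n - 1)]


# The Chomsky normal form grammar of the example.
RULES = [
    ("S", ("S", "S")),
    ("S", ("(", ")")),
    ("S", ("(", "S1")),
    ("S1", ("S", ")")),
]
```

The three nested loops over i, j and s give a running time cubic in |w| for a fixed grammar, which is where the polynomial bound comes from.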

Fig. 10.

TURING MACHINES

Definition 8. A Turing machine is a 5-tuple M = (K, Σ, δ, s, H) where

– K is a finite set of states,

– Σ is an alphabet containing the blank symbol ⊔ and the left end symbol ▷, but not containing the symbols ← and →

– s ∈ K is the initial state

– H ⊆ K is the set of halting states

– δ, the transition function, is a function from (K − H) × Σ to K × (Σ ∪ {←, →}) such that

• for all q ∈ K − H there exists p such that δ(q, ▷) = (p, →)

• for all q ∈ K − H, and a ∈ Σ, if δ(q, a) = (p, b) then b ≠ ▷

Example Design a Turing machine for accepting L = {w | w ∈ {a, b}∗, |w| is even }.

M = (K, Σ, δ, s, H) with

– K = {s, q0, q1, y, n}, H = {y, n}

– Σ = {a, b, ⊔, ▷}

– δ : (K − H) × Σ → K × (Σ ∪ {←, →})

q     σ        δ(q, σ)
s     ⊔        (q0, →)
q0    ⊔        (y, ⊔)
q0    a or b   (q1, →)
q1    ⊔        (n, ⊔)
q1    a or b   (q0, →)

In the following computations the scanned symbol is shown in brackets.

M accepts aa:

(s, ▷[⊔]aa) ⊢M (q0, ▷⊔[a]a) ⊢M (q1, ▷⊔a[a]) ⊢M (q0, ▷⊔aa[⊔]) ⊢M (y, ▷⊔aa[⊔])

M rejects a:

(s, ▷[⊔]a) ⊢M (q0, ▷⊔[a]) ⊢M (q1, ▷⊔a[⊔]) ⊢M (n, ▷⊔a[⊔])

A configuration of a Turing machine M = (K, Σ, δ, s, H) is a member of K × ▷Σ∗ × (Σ∗(Σ − {⊔}) ∪ {e})

Definition 9. Let M = (K, Σ, δ, s, H) and let (q1, w1a1u1) and (q2, w2a2u2) be configurations of M. Then

(q1, w1a1u1) ⊢M (q2, w2a2u2)

iff for some b ∈ Σ ∪ {←, →}, δ(q1, a1) = (q2, b) and either

1. b ∈ Σ, w1 = w2, u1 = u2 and a2 = b, or

2. b = ←, w1 = w2a2, and either

(a) u2 = a1u1, if a1 ≠ ⊔ or u1 ≠ e, or

(b) u2 = e, if a1 = ⊔ and u1 = e, or

3. b = →, w2 = w1a1, and either

(a) u1 = a2u2, or

(b) u1 = u2 = e and a2 = ⊔.

A halted configuration is of the form (h, w1aw2) with h ∈ H.
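The step relation of Definition 9 can be simulated directly. The sketch below runs the even-length machine from the example above; it is an illustration only. The names run, DELTA and HALTING are hypothetical, the characters "_" and ">" stand in for ⊔ and ▷, the strings "->" and "<-" stand for the moves → and ←, and the step limit is an added safety guard that is not part of the definition.

```python
BLANK, LEFT_END = "_", ">"  # stand-ins for the blank and left end symbols

# delta(q, sigma) = (new state, action); an action is a move or a symbol to write.
DELTA = {
    ("s", BLANK): ("q0", "->"),
    ("q0", BLANK): ("y", BLANK),
    ("q0", "a"): ("q1", "->"), ("q0", "b"): ("q1", "->"),
    ("q1", BLANK): ("n", BLANK),
    ("q1", "a"): ("q0", "->"), ("q1", "b"): ("q0", "->"),
}
HALTING = {"y", "n"}


def run(w, max_steps=10_000):
    """Start in the initial configuration (head on the blank before w); return the halting state."""
    tape, head, state = [LEFT_END, BLANK] + list(w), 1, "s"
    for _ in range(max_steps):
        if state in HALTING:
            return state
        state, action = DELTA[(state, tape[head])]
        if action == "->":
            head += 1
            if head == len(tape):      # moving past the written part reveals a blank
                tape.append(BLANK)
        elif action == "<-":
            head -= 1
        else:
            tape[head] = action        # write a symbol, head stays in place
    raise RuntimeError("step limit reached")
```

On input aa the simulated machine passes through the states s, q0, q1, q0 and halts in y, matching the computation shown in the example.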

A computation is a sequence of configurations C1, C2, ..., Cn such that C1 ⊢M C2 ⊢M ... ⊢M Cn. We say that the computation has length n and C1 yields Cn. M is said to halt on input w iff (s, ▷⊔w) yields some halted configuration.

COMBINING TURING MACHINES

Symbol writing machine: For each symbol a ∈ Σ ∪ {←, →} − {▷}, Ma is a Turing machine which writes the symbol a into the scanned tape square.

Ma = ({s, h}, Σ, δ, s, {h}): δ(s, b) = (h, a) for each b ∈ Σ − {▷}

Notations:

We often write

– a for Ma

– L for M←

– R for M→

– L⊔: Finds the first blank square to the left of the currently scanned symbol

– R⊔: Finds the first blank square to the right of the currently scanned symbol

An initial configuration is of the form (s, ▷⊔w).

Definition 10. Let M = (K, Σ, δ, s, H) be a Turing machine with H = {y, n}. Any halting configuration whose state is y is said to be an accepting configuration, while a halting configuration whose state is n is said to be a rejecting configuration.

We say that M accepts an input w ∈ (Σ − {⊔, ▷})∗ if (s, ▷⊔w) yields an accepting configuration.

We say that M rejects an input w ∈ (Σ − {⊔, ▷})∗ if (s, ▷⊔w) yields a rejecting configuration.

Let Σ0 ⊆ Σ − {⊔, ▷} be an alphabet, called the input alphabet. We say that M decides a language L ⊆ Σ0∗ if for any string w ∈ Σ0∗:

If w ∈ L then M accepts w

If w ∉ L then M rejects w

A language L is called recursive if it is decided by some Turing machine.

Example Show that L0 = {a^n b^n | n ≥ 0 } and L1 = {a^n b^n c^n | n ≥ 0 } are recursive.

Fig. 11.

Fig. 12.

Example Each deterministic finite automaton can be ”simulated” by a TM.

Let M = (K, Σ, δ, s, F) be a DFA.

Define a TM M′ = (K ∪ {s′, y, n}, Σ′, δ′, s′, {y, n}) with Σ′ = Σ ∪ {▷, ⊔} and

δ′ : (K ∪ {s′}) × Σ′ → (K ∪ {s′, y, n}) × (Σ′ ∪ {←, →})

– δ′(s′, ⊔) = (s, →)

– δ′(p, a) = (δ(p, a), →) for p ∈ K, a ∈ Σ

– δ′(p, ⊔) = (y, ⊔) for p ∈ F

– δ′(p, ⊔) = (n, ⊔) for p ∉ F

It holds:

∀w ∈ Σ∗ : w ∈ L(M) iff (s′, ▷⊔w) ⊢M′∗ (y, ▷⊔w⊔)

∀w ∈ Σ∗ : w ∉ L(M) iff (s′, ▷⊔w) ⊢M′∗ (n, ▷⊔w⊔)

Recursive Functions

Definition 11. Let M = (K, Σ, δ, s, {h}) be a Turing machine. Let Σ0 ⊆ Σ − {⊔, ▷} and w ∈ Σ0∗. Suppose (s, ▷⊔w) ⊢M∗ (h, ▷⊔u). Then u is called the output of M on input w and is denoted by M(w).

1. We say that M computes a function f : Σ0∗ → Σ0∗ if for all w ∈ Σ0∗, M(w) = f(w).

2. We say that M computes a function f : N^k → N, (k ≥ 1)

if for all binary numbers w1, ..., wk,

M(w1; ...; wk) = f(w1, ..., wk)

where f(w1, ..., wk) is also in binary representation.

A function f is recursive if there is a Turing machine that computes f.

Example Show that g(n) = 2n and f(n) = n + 1 are recursive functions.

Recursively Enumerable Languages

Definition 12. Let M = (K, Σ, δ, s, H) be a Turing machine. Let Σ0 ⊆ Σ − {⊔, ▷} be an alphabet, and L ⊆ Σ0∗. We say that M semidecides L if for any string w ∈ Σ0∗: w ∈ L iff M halts on input w.

A language L is called recursively enumerable if it is semidecided by some Turing machine.

Theorem 22.

– If a language is recursive then it is also recursively enumerable.

– If a language is recursive then its complement is also recursive.

Extensions of Turing Machines

We study two important extensions of Turing machines:

1. Allowing several tapes instead of one

2. Allowing nondeterminism

Multi-tape Turing Machines

A k-tape Turing machine consists of a finite control together with k infinite tapes. Each tape is scanned by a read/write head. In one step the machine can sense the symbols scanned by all its heads and then, depending on those symbols and its current state, rewrite some of the scanned squares or move some of the heads to the left or right.

Definition 13. A k-tape Turing machine is a 5-tuple M = (K, Σ, δ, s, H) where

– K, Σ, s, H are as in the definition of an ordinary Turing machine

– δ, the transition function, is a function δ : (K − H) × Σ^k → K × (Σ ∪ {←, →})^k

Example Design a two-tape machine to decide L = { wcw | w ∈ {a, b}∗ } where c ∉ {a, b}.

Tape 1: | ▷ | ⊔ | w | c | w | ⊔ | ...

Tape 2: | ▷ | ⊔ | ⊔ | ...

q     r1   r2   δ(q, r1, r2)
s     ⊔    ⊔    (p, →, →)
p     ⊔    ⊔    (n, ⊔, ⊔)
p     σ    ⊔    (p′, ⊔, σ)
p′    ⊔    σ    (p, →, →)
p     c    ⊔    (r2, →, ←)
r2    x    σ    (r2, x, ←)
r2    x    ⊔    (f, x, →)
f     σ    σ    (f, →, →)
f     ⊔    ⊔    (y, ⊔, ⊔)
f     x    y    (n, x, y) if x ≠ y

σ stands for a or b

x stands for a or b or ⊔

y stands for a or b or ⊔

Extending Turing machines with more than one tape does not increase the power of the machines.

Theorem 23. Let k > 1 and M = (K, Σ, δ, s, H) be a k-tape Turing machine. Then there is a standard Turing machine M′ = (K′, Σ′, δ′, s′, H) where Σ ⊆ Σ′, and such that for any w ∈ (Σ − {⊔, ▷})∗:

M halts on input w with output y on its first tape after t steps if and only if M′ halts on input w with the same output after a number of steps polynomial in t and the size of w.

Nondeterministic Turing machines

Definition 14. A nondeterministic Turing machine is a quintuple M = (K, Σ, ∆, s, H) where

– K, Σ, s, H are as for standard Turing machines, and

– ∆, the transition relation, is a subset of ((K − H) × Σ) × (K × (Σ ∪ {←, →}))

Definition 15. Let M = (K, Σ, ∆, s, H) and let (q1, w1a1u1) and (q2, w2a2u2) be configurations of M. Then

(q1, w1a1u1) ⊢M (q2, w2a2u2)

iff for some b ∈ Σ ∪ {←, →}, ((q1, a1), (q2, b)) ∈ ∆ and

1. either b ∈ Σ, w1 = w2, u1 = u2 and a2 = b

2. or b = ←, w1 = w2a2, and

(a) either u2 = a1u1, if a1 ≠ ⊔ or u1 ≠ e

(b) or u2 = e, if a1 = ⊔ and u1 = e

3. or b = →, w2 = w1a1, and

(a) either u1 = a2u2

(b) or u1 = u2 = e and a2 = ⊔

Definition 16. Let M = (K, Σ, ∆, s, H) be a nondeterministic Turing machine. We say that M accepts w ∈ (Σ − {▷, ⊔})∗ if (s, ▷⊔w) ⊢M∗ (h, uav) for some h ∈ H, a ∈ Σ, u, v ∈ Σ∗. We say that M semidecides L ⊆ (Σ − {▷, ⊔})∗ if for any string w ∈ (Σ − {▷, ⊔})∗: w ∈ L iff M accepts w.

Definition 17. Let M = (K, Σ, ∆, s, {y, n}) be a nondeterministic Turing machine. We say that M decides L ⊆ (Σ − {▷, ⊔})∗ if the following conditions hold for any string w ∈ (Σ − {▷, ⊔})∗:

1. There is a number N depending on M and w such that no computation starting from (s, ▷⊔w) has a length greater than N

2. w ∈ L iff (s, ▷⊔w) ⊢M∗ (y, uav) for some a ∈ Σ, u, v ∈ Σ∗.

We say that M computes a function f : (Σ − {▷, ⊔})∗ → (Σ − {▷, ⊔})∗ if the following conditions hold for any string w ∈ (Σ − {▷, ⊔})∗:

1. There is a number N depending on M and w such that no computation starting from (s, ▷⊔w) has a length greater than N

2. (s, ▷⊔w) ⊢M∗ (h, uav) iff ua = ▷⊔, v = f(w).

Theorem 24. If a nondeterministic Turing machine semidecides or decides a language or computes a function then there is a standard one that semidecides or decides the same language or computes the same function.

Unrestricted Grammars

Definition 18. An unrestricted grammar G is a quadruple (V,Σ,R,S) where

– V is an alphabet

– Σ (the set of terminals) is a subset of V

– R (the set of rules) is a finite subset of V∗(V − Σ)V∗ × V∗

– S (the start symbol) is an element of V − Σ.

– We write u →G v for any rule (u, v) ∈ R.

– For any strings w, w′ ∈ V∗, we write w =G⇒ w′ iff there is a rule u →G v such that w = xuy and w′ = xvy.

– A sequence w0 =G⇒ w1 =G⇒ ... =G⇒ wn, n ≥ 0

is called a derivation of wn from w0.

– We write w =G⇒∗ v iff there is a derivation of v from w.

The language generated by G is L(G) = {w ∈ Σ∗ | S =G⇒∗ w }

Example Design an unrestricted grammar for L = {a^n b^n c^n | n ≥ 1 }

G = (V,Σ,R,S) where

– V = {S, A, B, C, Ta,Tb,Tc, a, b, c} – Σ = {a, b, c}

– R consists of the following rules:

S → ABCS S → Tc

CA → AC BA → AB CB → BC

CTc → Tcc

BTc → Tbb BTb → Tbb

ATb → Taa ATa → Taa

Ta → e 
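The grammar above can be explored by enumerating derivations. The following sketch is an illustration under stated assumptions: the nonterminals Ta, Tb, Tc are renamed X, Y, Z so that every symbol is one character, only sentential forms up to a chosen length are kept, and the names RULES and generated are hypothetical.

```python
from collections import deque

# Ta, Tb, Tc are written X, Y, Z; "" plays the role of e.
RULES = [
    ("S", "ABCS"), ("S", "Z"),
    ("CA", "AC"), ("BA", "AB"), ("CB", "BC"),
    ("CZ", "Zc"),
    ("BZ", "Yb"), ("BY", "Yb"),
    ("AY", "Xa"), ("AX", "Xa"),
    ("X", ""),
]


def generated(max_len):
    """Terminal strings derivable from S via sentential forms of length <= max_len."""
    seen, queue, words = {"S"}, deque(["S"]), set()
    while queue:
        form = queue.popleft()
        if form and all(ch in "abc" for ch in form):
            words.add(form)            # only terminals left: a generated string
            continue
        for lhs, rhs in RULES:
            start = form.find(lhs)
            while start != -1:         # apply the rule at every occurrence of lhs
                nxt = form[:start] + rhs + form[start + len(lhs):]
                if len(nxt) <= max_len and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
                start = form.find(lhs, start + 1)
    return words
```

The shortest derivation found this way is S ⇒ ABCS ⇒ ABCTc ⇒ ABTcc ⇒ ATbbc ⇒ Taabc ⇒ abc; forms in which the letters A, B, C are not sorted before Tc passes through them get stuck with nonterminals left over.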

Theorem 25. A language is generated by an unrestricted grammar if and only if it is recursively enumerable.

UNDECIDABILITY

The Church - Turing Thesis

An algorithm is anything that can be viewed as corresponding to a Turing machine that halts on all inputs.

Nothing is considered to be an algorithm if it cannot be rendered as a Turing machine that is guaranteed to halt on all inputs.

Universal Turing Machine

Let M = (K, Σ, δ, s, H) be a Turing machine. M is encoded over the alphabet {a, q, 0, 1} as follows:

– Let i, j be the smallest integers such that 2^i ≥ |K| and 2^j ≥ |Σ| + 2

– each state in K is represented by a string from q{0, 1}^i

– each symbol in Σ is represented by a string from a{0, 1}^j

– The representation of the special symbols is as follows:

⊔    a0^j
▷    a0^(j−1)1
←    a0^(j−2)10
→    a0^(j−2)11

– The start state is represented by q0^i

The representation of a Turing machine M, denoted by ”M”, consists of a sequence of strings of the form (p,b,r,c) with p,r representations of states and b,c representations of symbols.

Example Let M = (K, Σ, δ, s, {h}) be a TM where K = {s, p, q, h}, Σ = {0, 1, ⊔, ▷} and

δ(s, ⊔) = (p, →), δ(p, 0) = (p, →), δ(p, 1) = (p, →), δ(p, ⊔) = (q, 0), δ(q, 0) = (q, ←), δ(q, 1) = (q, ←), δ(q, ⊔) = (h, ⊔).

state   representation
s       q00
p       q01
q       q10
h       q11

symbol  representation
⊔       a000
▷       a001
←       a010
→       a011
0       a100
1       a101

The representation ”M” is the following string:

(q00, a000, q01, a011)(q01, a100, q01, a011)(q01, a101, q01, a011) (q01, a000, q10, a100)(q10, a100, q10, a010)(q10, a101, q10, a010) (q10, a000, q11, a000)

The universal Turing machine U is a machine which uses the encodings of other machines to direct its operations.
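The encoding scheme described above (before the introduction of U) can be sketched in a few lines. The code is an illustration under assumptions: "_" and ">" stand in for ⊔ and ▷, "<-" and "->" for the moves, the quadruples are written without spaces, and the names smallest_exponent, code_table and encode are hypothetical. States are numbered in the order given (start state first) and ordinary symbols follow the four special ones.

```python
BLANK, LEFT_END, LEFT, RIGHT = "_", ">", "<-", "->"


def smallest_exponent(n):
    """Smallest e with 2**e >= n."""
    e = 0
    while 2 ** e < n:
        e += 1
    return e


def code_table(prefix, names, width):
    """Map each name to prefix + its index as a binary string of the given width (width >= 1)."""
    return {name: prefix + format(k, "0{}b".format(width)) for k, name in enumerate(names)}


def encode(states, symbols, delta):
    """states: start state first; delta: list of ((p, b), (r, c)) in the order to be written."""
    i = smallest_exponent(len(states))
    j = smallest_exponent(len(symbols) + 2)
    q = code_table("q", states, i)
    ordinary = [s for s in symbols if s not in (BLANK, LEFT_END)]
    a = code_table("a", [BLANK, LEFT_END, LEFT, RIGHT] + ordinary, j)
    return "".join("({},{},{},{})".format(q[p], a[b], q[r], a[c]) for (p, b), (r, c) in delta)


# The machine of the example.
EXAMPLE = encode(
    ["s", "p", "q", "h"],
    ["0", "1", BLANK, LEFT_END],
    [
        (("s", BLANK), ("p", RIGHT)), (("p", "0"), ("p", RIGHT)), (("p", "1"), ("p", RIGHT)),
        (("p", BLANK), ("q", "0")), (("q", "0"), ("q", LEFT)), (("q", "1"), ("q", LEFT)),
        (("q", BLANK), ("h", BLANK)),
    ],
)
```

For the example machine this yields i = 2 and j = 3, so states are written with two bits and symbols with three, as in the tables above.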

Intuitively U takes two arguments: a description of a Turing machine M, ”M”, and a description of an input string w, ”w”.

U has the following property: U halts on ”M””w” iff M halts on w

U is constructed in two steps:

1. A three-tape universal machine U′ is constructed.

2. U′ is transformed into an equivalent single-tape machine U.

U′ simulates a TM M as follows:

1. The first tape contains the encoding of the tape of M

2. The second tape contains the encoding of M

3. The third tape contains the encoding of the state of M at the current point in the simulated computation

U′ starts with the string ”M””w” on its first tape and the other tapes blank.

– ”M” is moved onto the second tape, and

– ”w” is shifted down to the left end of the first tape, preceded by ▷⊔

– U′ extracts the coding of the initial state of M and puts it on the third tape.

The Halting Problem:

Given an arbitrary Turing machine M and an input w, is there an algorithm which can decide whether M accepts w ?

Let H = {”M””w” | Turing machine M accepts input string w }.

By the Church-Turing thesis, the answer to the halting problem is ”yes” iff the language H is recursive.

It is clear that the universal Turing machine accepts H. Hence H is recursively enumerable.

Theorem 26. H is not recursive.

Proof Suppose H is recursive. Let

H1 = {w | w = ”M” for some TM M and M accepts w }

Because H is recursive, H1 is also recursive. Let ΣU = {a, q, 0, 1, (, ), ,} be the alphabet of the universal TM. Define R ⊆ ΣU∗ × ΣU∗ by (u, w) ∈ R iff u = ”M” for some Turing machine M accepting w.

For each string u, let Ru = {w | (u, w) ∈ R }. Therefore, for each language L,

L is r.e. iff ∃u = ”M”: L = Ru = L(M).

Let D = {w | (w, w) ∉ R }. From the diagonalization principle, there is no u such that D = Ru. Therefore D is not r.e. It is not difficult (exercise) to see that

H1 = {w | (w, w) ∈ R }, i.e. H1 is the complement of D.

Since H1 is recursive and D is the complement of H1, D is also recursive, hence r.e. This is a contradiction because we have shown that D is not r.e. Hence, the assumption that H is recursive leads to a contradiction, and H is not recursive.

Undecidable Problems

Since H is not recursive, there is no algorithm to decide whether a TM M accepts an input string w. We say the halting problem is undecidable.

Definition 19. Let L1, L2 ⊆ Σ∗ be languages. A reduction from L1 to L2 is a recursive function τ : Σ∗ → Σ∗ such that x ∈ L1 iff τ(x) ∈ L2.

Theorem 27. If L1 is not recursive and there is a reduction from L1 to L2 then L2 is not recursive.

Theorem 28. The following problems are undecidable:

1. Given a Turing machine M, does M halt on the empty tape?

2. Given a Turing machine M, is the language accepted by M empty?

3. Given a Turing machine M, does M accept every input?

4. Given Turing machines M1, M2, do they accept the same language?

5. Given a Turing machine M, is the language accepted by M regular? Is it context-free? Is it decidable?

Proof

1. Let He = {”M” | M halts on the empty tape}. We give a reduction τe from H to He. Let w = a1 ... an. Define τe(”M””w”) = ”M′” where M′ = Ra1Ra2 ... RanL⊔M. It is clear that M accepts w iff M′ accepts the empty tape. From the reduction theorem, He is not recursive. Therefore, the problem is undecidable.

2. Let Hi = {”M” | M halts on some input}. We reduce He to Hi. Let M be a TM.

Fig. 13.

Let τi(”M”) = ”M′”. It is clear that if M halts on the empty tape then M′ halts on all inputs (and hence on some input). Hi is not recursive.

3. Similar to the previous case.

4. Let Ha = {”M” | M accepts all inputs}. Ha is not recursive. Let Heq = {”M””M′” | M, M′ accept the same language}. Let u be a string representing a TM that accepts all inputs in the first step, i.e. δ(s, ⊔) = (h, ⊔). Define τa(”M”) = ”M”u

It is clear that ”M” ∈ Ha iff ”M”u ∈ Heq. Since Ha is not recursive, Heq is not recursive.

5. See Rice's theorem.

Theorem 29. (Rice's Theorem) Suppose that C is a proper nonempty subset of the class of all recursively enumerable languages. Then the following question is undecidable: Given a Turing machine M, is L(M) ∈ C?

Proof Let H′ = {”M” | L(M) ∈ C }. We reduce H to H′. Let L ∈ C and let ML be a Turing machine that accepts L.

Without loss of generality, we can assume that ∅ ∉ C.

Let ”M””w” ∈ H. Let τ(”M””w”) = ”M′” where M′ is a TM rendering the following algorithm:

If U(”M””w”) halts then ML(x)

where x is the input of M′.

It is clear that if M accepts w then M′ accepts L, a language in C, and otherwise M′ accepts ∅, which is not in C. Hence, τ is a reduction from H to H′. Therefore, H′ is not recursive.

Computational Complexity

Example Consider the problem of a travelling salesman who has to visit 10 offices in 10 cities. The salesman has a map with the distances between the cities and he should find an itinerary with the shortest total distance.

A simple algorithm would check all possible itineraries. Their number would be 9! = 362880. If the number of cities is, say, 40, no presently foreseeable computer could handle it, as it would take billions of years to finish.
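The exhaustive search described above can be sketched as follows. This is an illustration with assumptions not stated in the notes: the itinerary is taken to be a round trip that starts and ends in city 0, distances are given as a matrix dist, and the name shortest_tour is hypothetical.

```python
from itertools import permutations
from math import factorial


def shortest_tour(dist):
    """Try every order of the cities 1..n-1; return (length, tour) of a shortest round trip from 0."""
    n = len(dist)
    best = None
    for order in permutations(range(1, n)):   # (n-1)! candidate itineraries
        tour = (0,) + order + (0,)
        length = sum(dist[a][b] for a, b in zip(tour, tour[1:]))
        if best is None or length < best[0]:
            best = (length, tour)
    return best


# A small illustrative distance matrix for 4 cities.
DIST = [
    [0, 1, 4, 3],
    [1, 0, 2, 5],
    [4, 2, 0, 1],
    [3, 5, 1, 0],
]
```

Fixing the starting city is what gives (n − 1)! rather than n! candidates, so 10 cities lead to factorial(9) itineraries.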

What represents a practically feasible algorithm?

The Class P

Definition 20. A Turing machine M = (K, Σ, δ, s, H) is said to be polynomially bounded if there is a polynomial p(n) such that for any input x, there is no configuration C such that

(s, ▷⊔x) ⊢M^(p(|x|)+1) C

that is, M always halts after at most p(|x|) steps. A language is said to be polynomially decidable if there is a polynomially bounded Turing machine that decides it.

The class of all polynomial decidable languages is denoted by P.

Theorem 30. P is closed under complementation.

Some Well-Known Problems

Reachability Problem Given a directed graph G ⊆ V × V where V = {v1, . . . , vn} and two nodes vi, vj, is there a path from vi to vj ?

Independent Set Given an undirected graph G and an integer K ≥ 2, is there a set of nodes C with |C| ≥ K such that for all vi, vj ∈ C, there is no edge between vi and vj ?

Boolean Satisfiability

X = {x1, ..., xn}: a finite set of Boolean variables, and

¬X = {¬x1, ..., ¬xn} where ¬xi is the negation of xi. The elements of X ∪ ¬X are called literals; variables are positive literals, whereas negations of variables are negative literals.

A clause is a set of literals. A Boolean formula is a set of clauses. A truth assignment T is a mapping from X into {⊤, ⊥} where ⊤ and ⊥ stand for true and false respectively.

We say that a truth assignment T satisfies a Boolean formula F if for each clause C of F there is at least one variable xi such that xi ∈ C and T(xi) = ⊤, or ¬xi ∈ C and T(xi) = ⊥.

F is satisfiable if there is a truth assignment that satisfies F.

Satisfiability: Given a Boolean formula F, is F satisfiable ?

3-Satisfiability: Given a Boolean formula F each of whose clauses contains three or fewer literals, is F satisfiable?

2-Satisfiability: Given a Boolean formula F each of whose clauses contains two or fewer literals, is F satisfiable?

Reachability is in P.

We have given before a polynomial algorithm that computes the set of all pairs (v, u) in a graph such that there is a path from v to u. Hence reachability is in P.
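For a single pair of nodes, reachability can also be decided by breadth-first search, which is a different technique from the all-pairs algorithm referred to above. The sketch below assumes the graph is given as a list of directed edges and treats every node as reachable from itself; the name reachable is hypothetical.

```python
from collections import deque


def reachable(edges, source, target):
    """Breadth-first search: is there a path from source to target?"""
    successors = {}
    for u, v in edges:
        successors.setdefault(u, []).append(v)
    seen, queue = {source}, deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return True
        for v in successors.get(u, []):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return False
```

Each node enters the queue at most once and each edge is inspected at most once, so the running time is linear in the size of the graph.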

2-satisfiability is in P.

A 2-satisfiability problem can be reduced to reachability by representing any clause x ∨ y as a link from node ¬x to node y and a link from node ¬y to node x.

An instance of 2-satisfiability is unsatisfiable if and only if there is a variable x such that there is a path from x to ¬x and vice versa (exercise: by mathematical induction).
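The reduction to reachability can be sketched as follows. The encoding is an assumption made for illustration: variable xi is the integer i, its negation is −i, each clause is a pair of literals (a one-literal clause x is written as (x, x)), and the names reaches and two_sat_satisfiable are hypothetical.

```python
def reaches(graph, source, target):
    """Depth-first search for a path from source to target."""
    stack, seen = [source], {source}
    while stack:
        u = stack.pop()
        if u == target:
            return True
        for v in graph.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return False


def two_sat_satisfiable(clauses):
    """Clause (x, y) gives the links -x -> y and -y -> x in the implication graph."""
    graph = {}
    for x, y in clauses:
        graph.setdefault(-x, set()).add(y)
        graph.setdefault(-y, set()).add(x)
    variables = {abs(lit) for clause in clauses for lit in clause}
    # Unsatisfiable iff some variable reaches its negation and vice versa.
    return not any(reaches(graph, v, -v) and reaches(graph, -v, v) for v in variables)
```

Two reachability queries per variable are enough, so the whole test stays polynomial in the size of the formula.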

Are there practical problems that are recursive but not polynomially decidable?

The Class NP

Definition 21. A nondeterministic Turing machine M = (K, Σ, ∆, s, H) is said to be polynomially bounded if there is a polynomial p(n) such that for any input x, there is no configuration C such that

(s, ▷⊔x) ⊢M^(p(|x|)+1) C

A language is said to be nondeterministic polynomially decidable if there is a nondeterministic polynomially bounded Turing machine that decides it.

The class of all nondeterministic polynomial decidable languages is denoted by NP.

The Boolean satisfiability, 3-satisfiability and independent set problems are all in NP.

Theorem 31. P ⊆ NP

Open Problem P = NP?

NP-Completeness

Definition 22.

– A function τ : Σ∗ → Σ∗ is said to be polynomial-time computable if there is a polynomially bounded Turing machine that computes it.

– Let L1, L2 ⊆ Σ∗ be languages. A polynomial reduction from L1 to L2 is a polynomial-time computable function τ : Σ∗ → Σ∗ such that x ∈ L1 iff τ(x) ∈ L2.

Definition 23. A language L is said to be NP-complete if and only if:

1. L ∈ NP

2. For every language L′ ∈ NP, there is a polynomial reduction from L′ to L.

Theorem 32. Let L be an NP-complete language. Then P = NP iff L ∈ P.

Theorem 33. Satisfiability is NP-complete.

Theorem 34. 3-satisfiability is NP-complete.

Proof We give a polynomial reduction from satisfiability to 3-satisfiability. Let C ≡ (x1 ∨ x2 ∨ ... ∨ xk) be a clause with k > 3 literals. Let y1, ..., yk−3 be new Boolean variables. C is reduced to the following set of 3-clauses:

(x1 ∨ x2 ∨ y1), (¬y1 ∨ x3 ∨ y2), (¬y2 ∨ x4 ∨ y3), ..., (¬yk−3 ∨ xk−1 ∨ xk)

τ(F) is the collection of all 3-clauses obtained by transforming the clauses in F. F is satisfiable iff τ(F) is satisfiable.

Theorem 35. The independent set problem is NP-complete.

Proof We reduce 3-satisfiability to the independent set problem with K = m, where m is the number of clauses. For each clause, there are three nodes labelled by the literals in the clause, and these three nodes are linked to each other. Further, two nodes labelled by complementary literals are linked. For example:

(x1 ∨x2 ∨x3)(x1 ∨x2 ∨x3)(x1 ∨x2 ∨x3)(x1 ∨ x2 ∨ x3)

Fig. 14.
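The reduction from 3-satisfiability to the independent set problem can be tried out on small formulas. The sketch below assumes the standard construction in which the nodes of each clause are linked to each other, in addition to the links between complementary literals. Literals are encoded as nonzero integers (−i for the negation of xi), the literals within a clause are assumed distinct, and the names to_independent_set and has_independent_set are hypothetical. The independent set test is a brute-force check meant only for tiny instances.

```python
from itertools import combinations


def to_independent_set(clauses):
    """Build (nodes, edges, K) from a list of clauses; a node is (clause index, literal)."""
    nodes = [(c, lit) for c, clause in enumerate(clauses) for lit in clause]
    edges = set()
    for u, v in combinations(nodes, 2):
        same_clause = u[0] == v[0]
        complementary = u[1] == -v[1]
        if same_clause or complementary:
            edges.add((u, v))
    return nodes, edges, len(clauses)


def has_independent_set(nodes, edges, k):
    """Exhaustively test all k-element node sets (exponential, for illustration only)."""
    return any(
        all((u, v) not in edges for u, v in combinations(subset, 2))
        for subset in combinations(nodes, k)
    )
```

An independent set of size m must pick exactly one node per clause and never a literal together with its complement, so it corresponds to a satisfying truth assignment.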