Determismistic & Nondeterministic Finite Automata

Martin Franzle¨

Informatics and Mathematical Modelling The Technical University of Denmark

02140 and Parsing, MF, Fall 2004 – p.1/22

Preliminaries for Formal Languages:

Alphabets, strings, languages

02140 Languages and Parsing, MF, Fall 2004 – p.2/22 What you’ll learn

You will be introduced to the basic notions of

Alphabet: A set of “symbols”, String/word: a sequence of symbols from the alphabet, : a set of strings.

02140 Languages and Parsing, MF, Fall 2004 – p.3/22

Alphabets

Def.: An alphabet is a finite, nonempty set.

¤ ¢

¥ ¡ Ex.: The binary alphabet £ ,

the set of all printable ASCII characters,

red green blue yellow magenta cyan ¥ ¡ the set £ £ £ £ £ of basic colors.

N.B.: This notion of alphabet is distinctly different from everyday use:

alphabets need not feature an ordering,

the members of an alphabet need not be letters in the usual sense, but can well be compounds of some kind (e.g., tuples).

We will often denote alphabets by the greek letter ¦ .

02140 Languages and Parsing, MF, Fall 2004 – p.4/22 Strings (a.k.a. words)

Def.: A string (or word) over an Alphabet ¦ is a finite sequence of

symbols from ¦ .

§ §

The length of a string , denoted ¨ ¨ , is the number of positions for symbols in the string.

¤ ¤ ¢ ¢ ¢ ¢ ¢ ¢

Ex.: (or, more formally, £ £ £ ) is a string of length 4 over alphabet

©

¤ ¢

¥ ¡ £ .

¥ ¡ is the string of length 0 over alphabet £ £ (as well as over any

©

other alphabet). It is called the empty string and is generically denoted , irrespective of the particular alphabet.

 

§ § § §          

  Def.:     Given strings    and    ,

   

§ the concatenation of  and is the string

def



§ § § §          

     

      

 

  of length  .

02140 Languages and Parsing, MF, Fall 2004 – p.5/22

Powers of an alphabet

¦  Given an alphabet and a natural number  N,



¦ ¦ is the set of all strings over of length  ,

!

def "!

 



 

¦ ¦ ¦ ¦      is the set of all nonempty strings #



over ¦ ,

def

$

 



 

¦ ¦ ¦ ¦ ¦ &    &    

% ' % ' is the set of all strings over .

02140 Languages and Parsing, MF, Fall 2004 – p.6/22 Languages

$

Def.: A set of strings over an alphabet ¦ (i.e., a subset of ¦ ) is

called a language over ¦ .

Ex.: The set of strings consisting of ( 0’s followed by ( 1’s for arbitrary

¤ ( ¢ ) N ¥ ¡ forms a language over £ :



+ +

*

¤ ¤ ¤ ¤ ¤ ¤ ¤ ¤ ¤ ¤ ¤ ¢ ¢ ¢ ¢ ¢ ¢ ¢ ¢ ¢ ¢ ¢ ( ) N ¥ , ¥ ¡ ¡ £ £ £ £ £

- ¢

¥ ¡ The set of strings over £ £ representing a prime number forms a

- ¢

¥ ¡ language over £ £ :

. 1 ¤ ¤ ¤ ¤ 1 / 0 /

¥ ¡ £ £ £ £ £ £ £

3

2 is a language over 2 ,

4 is a language (over arbitrary alphabet),

 

*

4

¥ ¥ ¡ is a language (over arbitrary alphabet). Note: ¡ . 5

Note: Languages can be finite or infinite!

02140 Languages and Parsing, MF, Fall 2004 – p.7/22

Problems

¦ Def.: Given a language 6 over some alphabet , the problem 6 is:

¦ § § 6 Given a string over , decide whether or not  .

3

7 ¤ ¢

)

¥ ¡ Ex.: Problem of deciding whether some £ is a binary representation of a prime number.

3

7

)

¥ ¡ Problem of deciding whether some £ £ is the imperfect of an english verb.

Problem of deciding whether a string

3

; ; ; : : 7 9 ¤ 9 ¤ < <

)

¥ ¥ ¥ ¥ ¡ ¡ ¡ ¡ 8 £ £ £ £ £ £ £ £ = describes

a game of chess where ‘white’ has won,

a game of chess that has reached a position where ‘white’ can enforce a win.

Questions concerning a given problem are in particular

whether it can be solved by a computer,

if so, what would be the best to do so,

how much computational effort this requires.

02140 Languages and Parsing, MF, Fall 2004 – p.8/22 Finite Automata

The deterministic variant: DFAs

02140 Languages and Parsing, MF, Fall 2004 – p.9/22

DFAs — informally

An automaton is a computational device that changes its state depending on the inputs it receives.

A finite automaton has finitely many states for remembering its computation history.

In a deterministic finite automaton, the current state is completely determined by the sequence of inputs seen so far.

Automata define languages through the following mechanism: 1. Some states are marked as being “accepting”; 2. whenever the automaton is in an accepting state, the string of inputs seen so far is a member of the language accepted by the automaton.

b a a

? ? ? ? ? ? ? ? ? ? @

¥ ¡ £ £ £ £ £

a,b >

? ? ? ?

¥ ¡ b b £ £ £ £ b a a

02140 Languages and Parsing, MF, Fall 2004 – p.10/22 DFAs — formally



¦ G

A D

F B H    E  A deterministic finite automaton (DFA) consist of 1. a finite set of states , C

¦ 2. a finite set of input symbols , called the “alphabet of A ”,

J ¦ K

D I

3. a transition function C C ,



F

4. a start state E , C

G

L

5. a set of final or accepting states C .

The automaton

F 1. starts in E ,

D M M

B H 2. proceeds from state E to state E  if the current input is ,

3. accepts a string (  sequence of symbols) iff that sequence of

G



F

inputs drives it from E to some E .

02140 Languages and Parsing, MF, Fall 2004 – p.11/22

Transition diagrams are a graphical notation for FAs.



¦ G A D

F B H    E  A transition diagram for a DFA C is a graph s.t. 1. For each state in there is a node, C

N N



D M

B H 2. if E  E then (and only then) there is an arc from E to E

labelled M , (we allow ourselves to collate multiple arcs between the same nodes into a single one labelled with a list) 3. there is an arrow without source node into the start state, 4. nodes representing accepting states are marked with double circles.

b a a a,b b b b a a

02140 Languages and Parsing, MF, Fall 2004 – p.12/22 Transition tables are tabular representations of transition functions, where information about initiality and finality of states is added:

P O

K



Q F F

E E E

  

E E E

  

E E E

As the state space and the alphabet can be deduced from the table, transition tables are a complete representation of DFAs!

02140 Languages and Parsing, MF, Fall 2004 – p.13/22

Extending the trans. fct. to strings

R

$

J ¦ K J ¦ K D I D I

B H Given C C , we define a function C C via :

R def



D & Base case:  B H E  E E for each C ; Recursion

U V

T W

R def R

$



¦ ¦

D S M D D S M S M Recursion:   

B H B B H H E  E   E for each C , , .

V X U

W T

Y

N.B.: We could as well have defined

R def R



D M S D D M S

B H B B H H E  E  

without altering the function defined.

02140 Languages and Parsing, MF, Fall 2004 – p.14/22 The language of a DFA



¦ G 6 Def: A A D F

B H B H    E  The language of a a DFA C is

def R $

[



¦ G 6 § § A D 

 

F

B H B H E 

Z \

[

6 A A

The language B H thus is the set of strings that drive from the initial state to some accepting state.



6 6 6 Def: A A If B H for some DFA then we call a .

The regular languages are exactly those languages 6 which are



6 6 A A

definable by DFAs in the sense that B H for some DFA .

02140 Languages and Parsing, MF, Fall 2004 – p.15/22

Finite Automata

The nondeterministic variant: NFAs

02140 Languages and Parsing, MF, Fall 2004 – p.16/22 NFAs — informally

Has the freedom to do various different moves when in a state and seeing some input,

has the help of an “oracle” for performing the right guess,

this is modeled mathematically as 1. the ability to be in various states at once, and 2. accepting a string whenever at least one of those states is accepting.

a,b

a b b

02140 Languages and Parsing, MF, Fall 2004 – p.17/22

NFAs — formally



¦ G A D

F B H    E  An nondeterministic finite automaton (NFA) C consist of

1. a finite set of states C ,

¦ 2. a finite set of input symbols , called the “alphabet of A ”,

J ¦ K

D I

]

3. a transition function B H , (which may be identified C C

N

J ¦ J D with a transition relation ^ ), C C



F

E 4. a start state C ,

G

5. a set of final or accepting states L . C

The automaton

` ¥ ¡ 1. starts in state set _ ,

a a b

2. proceeds from state set to state set 8 £ = if the current input is ,

3. accepts a string ( * sequence of symbols) iff that sequence of inputs drives

c

)

` it from _ to a state set containing some _ .

02140 Languages and Parsing, MF, Fall 2004 – p.18/22 Transition diagrams are a graphical notation for FAs.



¦ G

A D

F B H    E  A transition diagram for a NFA C is a graph s.t. 1. For each state in there is a node, C

N N

D M



B H 2. if E E  then (and only then) there is an arc from E to E

labelled M , (we allow ourselves to collate multiple arcs between the same nodes into a single one labelled with a list) 3. there is an arrow without source node into the start state, 4. nodes representing accepting states are marked with double circles. a,b

a b b

02140 Languages and Parsing, MF, Fall 2004 – p.19/22

Transition tables

Do now assign sets of states:

M d

K

 F F F

% ' % ' E E E  E

e

  Q % ' E E

e

 

% ' E E

02140 Languages and Parsing, MF, Fall 2004 – p.20/22 Extending the trans. fct. to strings

R

$

J ¦ K J ¦ K

D I D I

] ]

B H B H B H Given C C , we define a function C C via recursion:

R def



D & Base case:  B H % ' E  E E for each C ;

R def

$

i h



¦ ¦

D S M D M S M Recursion:   

B H B H E   E o for each C , , .

m

g

j l n f k

02140 Languages and Parsing, MF, Fall 2004 – p.21/22

The language of an NFA



¦ G 6 Def: A A D F B H B H    E  The language of a an NFA C is

def R $

[

 

¦ G 6 § § A D  p



e

F B H B H E 

q

Z \

[

6 A A

The language B H thus is the set of strings that drive from the initial state to a set of states containing some accepting state.

02140 Languages and Parsing, MF, Fall 2004 – p.22/22