and Kleene Star on Deterministic Finite Automata

GQ Zhang, Xiangnan Zhou, Robert Fraser, Licong Cui

Department of EECS, Case School of Engineering Division of Medical Informatics, School of Medicine Case Western Reserve University Cleveland, Ohio, USA

GQ Zhang (Cleveland, Ohio, USA) Concatenation and Star on DFA LICS2011 1 / 9 Automata and Boolean Matrices

b b a

start 1 2

a

 0 1   1 0  ∆a = , ∆b = 1 0 0 1

∆ = I; ∆abbba = ∆a∆b∆b∆b∆a

GQ Zhang (Cleveland, Ohio, USA) Concatenation and Star on DFA LICS2011 2 / 9 Automata and Boolean Matrices

Each n-state DFA M = (Q, Σ, δ, q0, F) determines a matrix system {∆a | a ∈ Σ}, where ∆a is the (n × n) adjacency matrix of the a-labeled subgraph associated with the DFA. w For a string w = a1a2 ··· an over Σ, we write ∆ for the matrix product ∆a1 ∆a2 ··· ∆an . The language accepted by M, denoted w t L(M), is the {w | Iq0 ∆ IF = 1}.

The characteristic vector of a A of {1, ··· , n} is the row n n vector IA ∈ Bn such that the p-th component of IA is a 1 if and only if p ∈ A. The characteristic vector of a singleton set {p} is written n t as Ip, or simply Ip. A column vector is the transpose () of a row vector.

cf. GQ Zhang. Automata, Boolean matrices, and ultimate periodicity. Information and Computation, 152 (1): 138–154, 1999

GQ Zhang (Cleveland, Ohio, USA) Concatenation and Star on DFA LICS2011 3 / 9 Example: Brzozowski’s derivation

u−1L := {w | uw ∈ L}; show u−1 preserves regularity Suppose L is accepted by an n-state DFA M = (Q, Σ, δ, F), with {∆a | a ∈ Σ} its matrix system. Then a DFA accepting u−1L can be given as 0 0 0 0 0 M = (Q , Σ, δ , q0, F ), where

0 Q = {A | A ∈ Bn×n}, 0 u q0 = ∆ , δ0(A, a) = A∆a, 0 t F = {A | I1A IF = 1}. This approach can simulate inductive constructions such as Fibonacci numbers and polynomials (f (n) defined inductively in n)

GQ Zhang (Cleveland, Ohio, USA) Concatenation and Star on DFA LICS2011 4 / 9 Concatenation

a a Suppose matrix systems {∆1 | a ∈ Σ} and {∆2 | a ∈ Σ} are associated with m- and n-state DFAs M1 = (Q1, Σ, δ1, F1) and M2 = (Q2, Σ, δ2, F2), respectively. The DFA M = (Q, Σ, δ, q0, F)

Q = {(A, B) | A ∈ Bm×m, B ∈ Bm×n}, 0 q0 = (T , T ), δ((A, B), a) = (A, B)∆a a a a (= (A∆1, A∆1T + B∆2)), F = {(A, B) | ImB It = 1}, 1 F2

 ∆a ∆aT  where ∆a = 1 1 for a ∈ Σ, T = It In, and T 0 is the (m × m) a F1 1 0 ∆2 identity matrix, has the property that L(M) = L(M1) ◦ L(M2).

GQ Zhang (Cleveland, Ohio, USA) Concatenation and Star on DFA LICS2011 5 / 9 Lemma for Concatenation

For the DFA M = (Q, Σ, δ, q0, F)

Q = {(A, B) | A ∈ Bm×m, B ∈ Bm×n}, 0 q0 = (T , T ), δ((A, B), a) = (A, B)∆a a a a (= (A∆1, A∆1T + B∆2)), F = {(A, B) | ImB It = 1}, 1 F2

suppose δ(q0, w) = (A, B) in M, and suppose w = a1 ··· a`.

We have

` X a1a2···ai ai+1ai+2···a` B = ∆1 T ∆2 . i=0

GQ Zhang (Cleveland, Ohio, USA) Concatenation and Star on DFA LICS2011 6 / 9 Kleene Star

a Suppose the matrix system {∆1 | a ∈ Σ} is associated with an n-state DFA M = (Q , Σ, δ , F ). The DFA M = (Q, Σ, δ, q , F) with H = It I 1 1 1 1 0 F1 1 and

Q = {A | A ∈ Bn×n} ∪ {s},

q0 = s,  a 0 1 ∆1(H + H ), if q = s, δ(q, a) = a 0 1 A∆1(H + H ), if q = A, F = {A | I A It = 1} ∪ {s}, 1 F1

∗ 1 0 has the property that L(M) = (L(M1)) . Here, H = H and H is the identity matrix.

GQ Zhang (Cleveland, Ohio, USA) Concatenation and Star on DFA LICS2011 7 / 9 Lemma for Kleene Star

For the DFA M = (Q, Σ, δ, q , F) with H = It I and 0 F1 1

Q = {A | A ∈ Bn×n} ∪ {s},

q0 = s,  a 0 1 ∆1(H + H ), if q = s, δ(q, a) = a 0 1 A∆1(H + H ), if q = A, F = {A | I A It = 1} ∪ {s}, 1 F1 we have (where ` =| w |)

X w1 w2 wk i δ(s, w) = ∆1 H∆1 H ··· ∆1 H . w=w1···wk ,1≤k≤` wj 6=,1≤j≤k i=0,1

GQ Zhang (Cleveland, Ohio, USA) Concatenation and Star on DFA LICS2011 8 / 9 Conclusion

Operations on regular expressions can be directly translated to constructions on DFA. We obtained along the way a proof of the classical Kleene’s Theorem avoiding the use of NFA (using Arden’s Lemma in the other direction). Laws of Boolean matrices capture language operations inductively and algebraically. The “natural” constructions using matrix systems are also optimal in the usage of the number of (reachable) states, agreeing with known results in state complexity. See newton.case.edu/pub.html for more details

GQ Zhang (Cleveland, Ohio, USA) Concatenation and Star on DFA LICS2011 9 / 9