1

Demonstratio Mathematica, Vol. 38, No. 3, 2005, pp 761-775 A Complex Measure for Linear Grammars‡

Ishanu Chattopadhyay Asok Ray [email protected] [email protected] The Pennsylvania State University University Park, PA 16802, USA

Keywords: Language Measure; Discrete Event Supervisory Control; Linear Grammars

Abstract— The signed real measure of regular languages, introduced supervisors that satisfy their own respective specifications. Such and validated in recent literature, has been the driving force for a partially ordered set of sublanguages requires a quantitative quantitative analysis and synthesis of discrete-event supervisory (DES) measure for total ordering of their respective performance. To control systems dealing with finite state automata (equivalently, regular languages). However, this approach relies on memoryless address this issue, a signed real measures of regular languages state-based tools for supervisory control synthesis and may become has been reported in literature [7] to provide a mathematical inadequate if the transitions in the plant dynamics cannot be captured framework for quantitative comparison of controlled sublanguages by finitely many states. From this perspective, the measure of regular of the unsupervised plant language. This measure provides a languages needs to be extended to that of non-regular languages, total ordering of the sublanguages of the unsupervised plant such as Petri nets or other higher level languages in the . Measures for non-regular languages has not apparently and formalizes a procedure for synthesis of DES controllers for been reported in open literature and is an open area of research. finite state automaton plants, as an alternative to the procedure This paper introduces a complex measure of linear context free of Ramadge and Wonham [5]. Optimal control of finite state grammars (LCFG) that belong to the class of non-regular languages. automata has been recently have been reported [6] based on The proposed complex measure becomes equivalent to the signed real measure, reported in recent literature, if the LCFG is degenerated to the language measure and formalizes quantitative analysis and a . synthesis of DES control laws. The approach is state-based and the language measure parameters are identified from physical experiments or simulation on a deterministic finite state automaton I. INTRODUCTION (DFSA) model of the plant [8]. However, using memoryless state- Finite state automata (FSA) (equivalently, regular languages) based tools for supervisory control synthesis may suffer serious have been widely used to model and synthesize supervisory control shortcomings if the details of transitions cannot be captured by laws for discrete-event plants [5] because the task of discrete- finitely many states. This problem has been partially circumvented event supervisory (DES) control synthesis becomes mathemati- by Petri nets that can accommodate certain classes of non-regular cally tractable and computationally efficient due to simplicity of languages [4] in the Chomsky hierarchy [3]. There is apparently regular languages. According to the paradigm of DES control, a no quantitative tool for supervisory control synthesis of Petri nets finite-state automaton (e.g., the discrete-event model of a physical compared to what are available for finite state automata [6]. Hence, plant) is a language generator whose behavior is constrained by there is a need for developing measures of non-regular languages the supervisor to meet a given specification. The (controlled) sub- as a quantitative tool of supervisory control synthesis for discrete- language of the plant behavior could be different under different event systems that cannot be represented by regular languages. Toward achieving this goal, the first step is to construct measure(s) ‡This work has been supported in part by Army Research Office (ARO) under Grant No. DAAD19-01-1-0646; and NASA Glenn Research Center of non-regular languages where the state-based approach [7] may under Grant No. NNC04GA49G. not be applicable. 2

This paper first shows that the measure of a alternative pairs of forms (i.e., there are either right derivations proposed in [7] is equivalent to that of the regular grammar, or left derivations but not both): without referring to states of the automaton. Then, it extends the v → αw v → wα  or  (1) signed real measure of regular languages to a complex measure  v → α  v → α   for the class of non-regular languages [2], generated by the   where v, w ∈ V and α ∈ T ∪{}; and is the empty string. (deterministic) linear context free grammar (LCFG) that is a Remark 2.2: The generated language for a deterministic finite subclass of deterministic pushdown automata (DPDA) [3]. The state automaton (DFSA) is a regular language [3]. signed real measure is extended to a complex measure over the real field, where the multiplication operation of complex numbers Definition 2.3: A linear grammar is a CFG (V, T, P, S ) where is different from the conventional one. In this case, the complex every production in P takes one of the following forms: space over the real field degenerates to the union of a pair of one- v → αw; v → wα; v → α (2) dimensional real spaces instead of being isomorphic to the two- where v, w ∈ V and α ∈ T ∪{}. dimensional real space. The extended complex-valued language measure, formulated in this paper, is potentially applicable to Remark 2.3: In view of Remark 2.1 and Definition 2.3, the set analysis and synthesis of DES control laws where the plant model of production rules P in a linear grammar Γ= (V, T, P, S ) can be could be represented by an LCFG. modified as Γ=˜ (V˜ , T, P˜, S ) by augmenting the set V of variables as V˜ ≡ V ∪ A and by updating the set P of production rules by The paper is organized in six sections including the present P˜. That is, ϕ = (v → α) ∈ P is replaced by ϕ˜ = (v → aα) ∈ P˜, one. Section II briefly introduces the basic concepts, notations and where v ∈ V, α ∈ T ∪{} and a ∈ A. This is analogous to the trim background materials for formal languages. Section III discusses operation in regular languages [5]. the measure of regular grammars and shows its equivalence with that of regular languages. Section IV extends the measure to linear Remark 2.4: The modified grammar Γ˜ = (V˜ , T, P˜, S ) is a grammars and the concept is elucidated with an example in Section superset of the original grammar Γ = (V, T, P, S ) in the sense IV-C. Section V briefly describes two important future research that it contains the generated language of Γ = (V, T, P, S ) and, directions. The paper is summarized and concluded in Section VI. in addition, has productions of the type v → aw. The production A →  is added to P in each step of the modification; and v →  must exist ∀v ∈ V.

II. CONCEPTS AND NOTATIONS Remark 2.5: Regular grammars have only right (or left) deriva-

This section introduces notations and background materials for tions with a single variable. In contrast, linear grammars include formal languages [3] along with definitions of key concepts. both right and left derivations with a single variable. This is precisely what allows the linear grammars to model a certain class Definition 2.1: A context free grammar (CFG) is a 4-tuple Γ= of non-regular languages. (V, T, P, S ), where V and T are mutually disjoint (i.e., V ∩ T = ∅) finite sets of variables and terminals, respectively; and P is a finite A geometric approach is adopted in this paper to deal with the set of productions by which strings are derived from the start possible presence of both right and left derivations in a linear symbol S . Each production in P is of the form v → α, where context free grammar. A production rule of the type V −→ σV1 v ∈ V and α ∈ (V ∪ T)∗. is fundamentally different form a production V −→ V1σ. If V1

is subsequently replaced by a symbol σ1, the order of generation Remark 2.1: The language generated by a grammar Γ consists of σ and σ1 is maintained in the derived string in the first case of all strings obtained from legal (i.e., permissible) productions and it is reversed in the second. Note that if V −→ V1σ is the beginning with the start symbol. first right linear production rule used in a particular derivation, Definition 2.2: A regular grammar is a CFG (V, T, P, S ) where then σ is necessarily the last terminal in the derived string. This every production in P takes exactly one of the following two possible non-causality of derivations is handled by introducing 3

the following notions: imaginary transitions, generated path, path erated through production rule sequences {P11, P12, ··· , P1n} and mapping function, and the event plane . {P21, P22, ··· , P2m} such that the final variable on the righthand

In this paper, generation of a symbol through a right linear side of the derivation is identical (say Vk) in the two cases, then production is denoted by an imaginary transition as opposed to a λ1Nλ2. This follows immediately from noting that if λ is a path symbol generated by a left linear production which is denoted by a initiating from Vk, then both λ1λ and λ2λ are elements of PΓ, ({,i}×Σ)? real transition. An imaginary transition is labelled by the prefixing and if λ ∈ 2 cannot be generated from the variable Vk, i with the generated symbol. For example, V −→ V1σ implies iσ then λ1λ, λ2λ < PΓ. This observation suggests that at least in an has occurred while V −→ σV1 implies simply σ has occurred. The LCFG, variables convey the same meaning as states in the context concept of real and imaginary transitions facilitates the notion of of regular languages. a generated path. In the sequel, the terms state and variable are used interchange- ? ably as they convey the similar meaning in the present context; the Definition 2.4: A generated path λ ∈ {, i} × Σ is the   same applies to the terms terminals and events. Note, the context- sequential order of transitions (real or imaginary) in any particular free nature of LCFG implies each variable can be rewritten by derivation. the specified production rules, irrespective of where the variable For example, if a derivation proceeds sequentially through the occurred. This is precisely the kind of Markov property that production rules V1 −→ V2σ1, V2 −→ σ2V3, V3 −→ V4σ3, V4 −→ associates variables with states and terminals with events. σ4V5, then the generated path λ ≡ iσ1σ2iσ3σ4 Definition 2.7: The event mapping η : Σ → Z is a function Definition 2.5: The set of all paths, generated by an LCFG Γ, that maps the event alphabet into the set of integers. Let Σ = ({,i}×Σ)? is denoted as PΓ ∈ 2 {σ1, ··· ,σk, ··· ,σm}, then A given generated path λ corresponds to a particular derived = string. However in non-regular grammars, a single string maybe η(σk) k (4) derived through more than one paths. Mathematically, there exists Definition 2.8: The event plane can be viewed as the complex a surjective mapping from the set of all generated paths by a LCFG plane itself on which the trajectory of the discrete-event system to the set of all derived strings i.e. the language generated by the is reconstructed as the strings are generated. The transitions grammar. Note, if the LCFG is regular, this map is injective and S → σkvi and S → viσk transfer the state located at the origin hence bijective. (0, 0), to (η(σk), 0) and (0,η(σk)), respectively. Thus, there exists two possible directions in which the same event σ may cause Definition 2.6: The path mapping function ℘ : PΓ −→ L(Γ) k is defined as follows: For any path λ, the corresponding derived transition from the same state vi depending on whether the event string is obtained by concatenating all the symbols generated by is causing an imaginary transition or a real transition. Figures 1 real transitions followed by a concatenation of the ones generated and 2 illustrate the idea. Note, for the regular grammar of Figure by imaginary transitions in a reversed order. 2, the derivations always occur along the real axis of the event plane. For example, ℘ λ ≡ iσ1σ2iσ3σ4 7−→ σ2σ4σ3σ1. In regular   grammars, the absence of imaginary transitions implies that ℘ is III. MEASURE OF REGULAR GRAMMARS the identity map. This section first introduces the concept of regular-grammar- An example of the right invariant relation is the well-known based measures and then shows its equivalence to that of recently Nerode equivalence relation (N) on a language L which is defined reported state-based measure [7]. In essence, the concept of the as follows [3]: state-based language measure is reformulated in terms of regular grammars, followed by construction of the measure. While detailed ∀ x, y ∈ L, xNy, if ∀ u ∈ Σ∗, xu ∈ L ⇔ yu ∈ L (3) proofs of the supporting theorems are given in [3], sketches of the A language L is regular if and only if there exists a Nerode proofs that are necessary for developing the underlying theory are equivalence relation of finite index. Applying the notion of Nerode presented here. equivalence on PΓ, it follows that if two paths λ1, λ2 is gen- 4

{W}, T, δ, S, {W}) where W is the only marked state and δ is defined as follows:

S a T

δ(vi, sr) ≡ δ1(vi, sr) ∪ δ2(vi, sr) ∪ δ3(vi, sr)

b δ1(vi, sr) = v j, if vi → srv j ∈ P  = where  δ2(vi, sr) {W}, if vi → sr ∈ P   δ (v , s ) = φ, otherwise S a T  3 i r   Remark 3.1: It follows from Theorem 3.2 and Theorem 3.1 that b a language L is regular iff there is a regular grammar Γ such that either L = L(Γ) or L = L(Γ) ∪{}. Therefore, there is a regular

S a T grammar for every finite state automaton that exactly generates the

Fig. 1. Derivations on the event plane for L = {anbn : n ∈ N} language of the regular grammar and vice versa. having the linear grammar S → aT; T → S b A. Formulation of Regular Grammar Measures

This section follows the same construction procedure as in [7] because there exists a one-to-one-correspondence between the state set Q of an automaton and the variable set V of the corresponding regular grammar. The same holds true for the alphabet set Σ and the terminal T of the regular grammar. The notion of marked states as well as that of good and bad marked states translates naturally to this framework. The variable set V can be partitioned into sets

of marked variables Vm and non-marked variables V − Vm and the

set Vm is further partitioned into good and bad marked variables + − as Vm and Vm [7]. b S T S a Definition 3.1: The language L(Γi) generated by a context free Fig. 2. Derivations on the event plane for L = {(ab)?} having the grammar (CFG) Γi initialized at state vi ∈ V is defined as: right linear grammar S → aT; T → bS . Hence L is regular

∗ L(Γi) = {s ∈ Σ | there is a derivation of s from Γi} (5)

Theorem 3.1: If L is a regular language, then there is a regular Definition 3.2: The language Lm(Γi) generated by a CFG Γi grammar Γ such that either L = L(Γ) or L = L(Γ) ∪{}. initialized at state vi ∈ V is defined as:

Proof: Let G = (Q, Σ, δ, qi, A) be an FSA. Let us construct ∗ Lm(Γi) = {s ∈ Σ | there is a derivation of s from Γi which Γ = = Σ the grammar with V Q and T . The set of productions is terminates on a marked variable } constructed as follows: Definition 3.3: For every vi, vk ∈ V, the set of all strings that, Add q → s q if δ(q , s ) = q starting from v , terminate on v is defined as the language L(v , v ). Σ i r j i r j i k i k ∀qi, q j ∈ Q, sr ∈ ,   Add q → s if δ(q , s ) ∈ A That is, L(v , v ) = {s ∈ Σ∗| there is a derivation of s from v that  i r i r i k i   terminates on vk}. Theorem 3.2: If Γ is a regular grammar, then L(Γ) is a regular Definition 3.4: The characteristic function χ : V → [−1, 1] is language. defined in exact analogy with the state based approach [7]: − Proof: Let Γ = (V, T, P, S ) be a regular grammar. Let us [−1, 0), v ∈ Vm ∀ ∈ ∈  construct a (possibly) nondeterministic finite state automaton G vi V, χ(vi)  {0}, v < Vm  Γ =  ∈ + that exactly accepts the language L( ). Specifically, let G (V ∪  (0, 1], v Vm   5

≡ = { ∈ Σ → }∩ and thus χ assigns a signed real weight to each of the sublan- σ∈Σ:∃vi→σv j∈P π˜[σ, vi] πij and πij 0 if σ : vi σv j P= guages L(vi, v). ∅P. The n × n state transition cost Π-matrix is defined as: π π ... π Similar to the measure of regular languages [7], the character- 11 12 1n  π π ... π  istic vector is denoted as: X = [χ χ ··· χ ]T , where χ ≡ χ(v ),  21 22 2n  1 2 n j j Π=    . . .. .  is called the X-vector. The j-th element χ of X-vector is the  . . . .  j     weight assigned to the corresponding terminal state v j. Hence, the  πn1 πn2 ... πnn    X-vector is also called the state weighting vector in the sequel.   Definition 3.7: The signed real measure µ of every singleton

Γ L(Γi) The marked language Lm( i) consists of both good and bad string set S = {s} ∈ 2 where s ∈ L(vi, v) is defined as µ(S ) ≡ + event strings that, starting from the initial state vi, lead to Vm π˜(s, vi)χ(v). The signed real measure of the sublanguage L(vi, v) ⊆ − and Vm respectively. Any event string belonging to the language L(Γi) is defined as 0 L (Γi) = L(Γi)− Lm(Γi) terminates on one of the non-marked states µ(L(v , v)) = π˜[s, v ] χ(v) (8) 0 i  i  belonging to V − Vm; and L does not contain any one of the good s∈XL(vi,v)  Γ Γ   or bad strings. The regular languages L( i) and Lm( i) can be The signed real measure of the language of a regular grammar expressed as: Γi initialized at a state vi ∈ V, is defined as: n

L(Γi) = L(vi, vk) = L(vi, vk) (6) µi ≡ µ(L(Γi)) = µ(L(v, vi)) (9) v[k∈V [k=1 Xv∈Γ Γ = = + Γ − Γ The language measure vector, denoted as: µ = [µ1 µ2 ··· µn], is Lm( i) L(vi, vk) Lm( i) ∪ Lm( i) (7) vk[∈Qm called the µ-vector. where the sublanguage L(vi, vk) ⊆ Γi, having the initial state Based on the reasoning of the state based approach [7], it follows vi, is uniquely labelled by the terminal state vk, k ∈ I and that: + L(v , v ) ∩ L(v , v ) = ∅ ∀ j , k L ≡ + L(v , v) = + i j i k ; and m v∈Vm i and µi πijµ j χi (10) − S Γ Xj Lm ≡ v∈V− L(vi, v) are good and bad sublanguages of Lm( i), m = Π + S 0 Γ = Γ = 0 Γ In vector form, µ µ X whose solution is given by: respectively. Then, L ( i) v

L(Γ ) Now we construct a signed real measure µ : 2 i → R ≡ Remark 3.2: The matrix Π is a contraction operator [6] and Γ + = L( i) (−∞, ∞) on the σ-algebra K 2 . The construction is exactly hence (I − Π) is invertible. So, the µ-vector in Equation (10) is equivalent to that for the state-based automata [7]. With the choice uniquely defined. of this σ-algebra, every singleton set made of an event string

ω ∈ L(Γi) is a measurable set, which qualifies itself to have a IV. LANGUAGE MEASURE FOR LINEAR GRAMMARS Γ numerical quantity based on the above decomposition of L( i) This section extends the concept of language measure to linear 0 + − into L , L , and L , respectively called null, positive, and negative grammars (V, T, P, S ) that are a generalization of regular grammars sublanguages. The event costs are defined below. [3].

Definition 3.5: The event cost of the regular grammar Γi is ∗ defined as: π˜ : Σ × V → [0, 1] such that ∀vi ∈ V, ∀σ j ∈ Σ, A. Linear Grammar Measure Construction Σ∗ ∀s ∈ , It follows from Definition 2.6 that, given a path ω in the

(1) π˜[σ j, vi] ≡ π˜ ij ∈ [0, 1); j π˜ ij < 1; language, the generated string is obtained by the path mapping P (2) π˜[σ j, vi] = 0 if @vi → σ jvk ∈ P, where P is the set of ℘(ω). The approach of this paper is to construct a measure of the

production rules; and π˜[, vi] = 1; set of all such paths rather than the measure of strings. This is

(3) π˜[σ jω, vi] = π˜[σ j, vi]π ˜[ω, vk], where vi → σ jvk ∈ P. necessary since any attempt to apply the Myhill-Nerode theorem Definition 3.6: The state transition cost of the regular grammar on a linear non-regular language results in an infinite number

Γi is defined as: π : V ×V → [0, 1) such that ∀vi, v j ∈ V, π[vi, v j] = of equivalence classes (the Nerode equivalence relation is not of 6 finite index). However, as we will argue shortly, the language of left-linear grammar with production rules of the type S −→ σV all paths (PΓ) is regular and hence a right invariant relation of or S −→ iσV. finite index exists on PΓ. Then it follows from the reasoning in We continue with our explicit construction as follows:

[7], that a well-defined signed real measure exists on PΓ, In this Let Υ ≡ {x : x = a + ib and a ∈ [0, 1], b ∈ [0, 1]}, i.e., Υ is the section we will construct a complex measure on PΓ allowing one closed unit square on the complex plane C. Let a binary operator to differentiate between the real and imaginary transitions and is ? : C × C → C be defined as: (a + ib) ? (c + id) = ac + ibd thus more intuitive in the case of linear non-regular grammars. ∀a, b, c, d ∈ R. The identity for the operator ? is 1 + i since ∀z ∈ Moreover, it is argued that the defined complex measure coincides C, z ? (1 + i) = z and if z = a + ib with a , 0 and b , 0, 1 1 PΓ Γ −1 −1 = + for the and L( ). then there exist a a unique z ∈ C such that z a i b . That is, z ? z−1 = (1 + i). The operator ? can be extended to multi- Lemma 4.1: If there exists a complex measure ϑ on PΓ with

PΓ PΓ n×m m×l n×l the σ-algebra 2 , i.e. if (PΓ, 2 ,ϑ) is a well-defined measure dimensional cases by ? : C × C → C as follows: space then there exists a measure µ on L(Γ) with the σ-algebra If A ∈ Cn×m, B ∈ Cm×l, then A ? B = C ∈ Cn×l in the sense that L(Γ) = Σn = + = + 2 , such that cij k=1aik ?bkj. Further, if A Ar iAim and B Br iBim where

ϑ PΓ = µ L(Γ) (12) the pairs (Ar, Br) and (Aim, Bim) denote the real and imaginary parts Proof:   of the matrices A and B, respectively, it follows that A ? B =

PΓ + Note the statement of the lemma does not state that (PΓ, 2 ,ϑ) Ar Br iAimBim. is a well-defined measure space, but makes a claim in case it is. The identity for the above ? operation is (1 + i)I where I is Assume this is in fact the case. the standard identity matrix of dimension n × n. Let us denote the Γ −→ C Now define the map µ : L( ) as follows: identity for the ? operation by:

∀ ω ∈ L(Γe), µ(ω) = ϑ(λ) (13) I = (1 + i)I (18) λ∈X℘−1(ω) e That µ is well-defined follows immediately from the facts that where A ? I = I ? A = A ∀A∈ Cn×n. PΓ n×n (PΓ, 2e ,ϑ) is a measure space and ℘ is surjective. Now, µ induces Remark 4.2: The inverse of a matrix A ∈ C under the ? L(Γ) C a map µ : 2 −→ as follows: e operation, if it exists, is given as:

∀ Ω ⊆ Γ Ω = −1 −1 −1 −1 L( ), µ( ) µ(ω) (14) A = (Ar + iAim) = Ar + iAim (19) ωX∈ Ω e It is trivial to check that µ is in fact a measure on the space L(Γ). that is different from the standard inverse A−1. Notice that both

It then follows immediately, that real (Ar) and imaginary (Aim) parts of the matrix A must be individually invertible in the usual sense for existence of A−1. µ L(Γ) = µ(ω) (15) Γ ?  ω∈XL( ) In view of the operator, definitions of the language measure e = ϑ(λ) (16) parameters, χ, π˜ and π, are generalized as follows: λ∈℘X−1(L(Γ)) = ϑ PΓ (17) Definition 4.1: The characteristic function χ : V → Υ is  defined as:

Remark 4.1: The construction of Lemma 4.1 is actually the ∀v ∈ V, χ(v) = k(1 + i) (20) − natural way of inducing a measure on the quotient space of a k ∈ [−1, 0) i f v ∈ Vm  measure space. where  k = 0 i f v < Vm (21)   + Lemma 4.2: For a linear grammar Γ, the path language PΓ is k ∈ (0, 1] i f v ∈ Vm   regular. The characteristic value χ assigns a complex weight to a Proof: The statement immediately follows from the fact that language L(v, u) that, starting at the variable v, ends at the variable

PΓ is a language on the alphabet Σ t iΣ and can be generated by a u. A real weight, in the range of -1 to +1, is assigned to each state 7 as it was done in the case of real grammars, and then this weight B. Computation of the µ-Vector + is made complex by multiplying (in the usual sense) with 1 i. This subsection presents a procedure to formulate the complex

Definition 4.2: The event cost of the LCFG Γ is defined as a measure of PΓ and by Lemma 4.1 it coincides with the complex

∗ measure for L(Γk). Now, function π˜ : (Σ t iΣ) × V → Υ such that ∀vk, v` ∈ V, ∀σ j ∈ Σ, ∗ ∀ ∈ Σ t Σ j ω ( i ) , Γ = Γ P k ∪ jσk P j ∪k εk   (1) π˜[σ j, vk] ≡ Re(˜πkj) ∈ [0, 1); j Re(˜πkj) < 1; where the null event εk is defined as P (2) π˜[iσ j, vk] ≡ Im(˜πkj) ∈ [0, 1); j Im(˜πij) < 1; , if self loop at v = P = i (3) π˜[σ j, vk] i if @vk → σ jv` ∈ P; εk   ∅, otherwise =  (4) π˜[iσ j, vk] 1 if @vk → v`σ j ∈ P;  The above expression formalizes the fact that the set of paths (5) π˜[, v`] = 1 + i; from a state vk is exactly equal to the union of the sets of paths (6) π˜[τω, vk] = π˜[τ, vk] ? π˜[ω, v`] obtained by looking at the first event and then considering all ∗ where τ ∈ Σ t iΣ; ω ∈ (Σ t iΣ) ; and vk → τvl if τ ≡ σ and possible legal paths thereafter. Hence, if the first event is σ` and vk → v`τ if τ ≡ iσ. the current state changes to v j, then the set of all paths thereafter Definition 4.3: The state transition cost of the LCFG Γ is is exactly equal to PΓ j . The expression is structurally identical to Υ defined as π : V × V → such that ∀vk, v j ∈ V, that given for DFSA in [7] with the understanding that the event j σ` can be either real or imaginary. π[vk, v j] = Re(˜π[σ`, vk]) + i Im(˜π[k, l]) σX`∈Σ: σX`∈Σ: Hence, ∃vk→σ`v j∈P ∃vk →v jσ`∈P

≡ πkj (22) j Γ = Γ µ(P k ) µ ∪ jσk P j ∪k εk     and πkj = 0 if {σ ∈ Σ : vk → σv j or σ ∈ Σ : vk → v jσ}∩ P= ∅. The j = Γ + µ ∪ jσk P j µ(εk) Π n × n complex-valued state transition cost -matrix is defined as:  j  = Γ + µ σk P j χ(vk) Xj π11 π12 ... π1n    = πkj ? µ(PΓ ) + χ(vk) (25)  π21 π22 ... π2n  j   Xj Π=  . . .   . . .. .   . . . .    The first three steps in Eq. (25) follow from the fact that if the  π π ... π   n1 n2 nn    first symbol for two paths is different, then the paths cannot be Definition 4.4: Let Γ be a linear grammar, initialized at a state k identical. However, the generated strings may still be the same. v ∈ V. The complex measure µ of every singleton path set Λ = k The fourth step follows from property (6) of the π˜ function in PΓ {λ} ∈ 2 k is defined as: µ(Λ) ≡ π˜(λ, v ) ? χ(v). The complex k Definition 4.2. The final step trivially follows from Definition 4.4 Γ measure µ of every singleton string set Ω= {ω} ∈ 2L( k) is defined of the measure. In vector form, the complex measure µ is given as: µ(Ω) ≡ = π˜(λ, v ) ? χ(v). Then, the complex measure of λ:℘(λ) ω k by: the sublanguageP L(vk, v) ⊆ L(Γk) of all strings terminated at the −1 Γ Γ = Π + = Π state v ∈ V is defined as: µ L( k) ≡ µ P k ? µ X (I− ) ? X     −1 = − R Π + − I Π µ , = π ω, ? χ (I e ) i(I m ) ? X (L(vk v))  ˜[ vk] (v) (23)   ω∈XL(vk,v)  = Π −1 + Π −1   (I − Re ) ReX i(I − Im ) Im(26)X   Complex measure of the language of the linear grammar Γk is where Re and Im refer to the real and imaginary parts of the defined as: matrices, respectively; and existence of the matrix inverses is µ ≡ µ(L(Γ )) = µ(L(v , v)) (24) k k k guaranteed by the following conditions: Xv∈Γ

T The language measure vector, denoted as µ = [µ1 µ2 ··· µn] , is Re(˜πkj) < 1; Im(˜πkj) < 1 ∀k. called the µ-vector. Xj Xj 8

regular grammar, such a quotient space construction always yields a n-wedge of circles (Figure 4), whereas for a non-regular linear A simple example is presented below to illustrate how the grammar, the geometric object generated is more complicated complex µ is computed for an LCFG. (e.g. in Figure 3, it is a 3-dimensional object). Future work will investigate the relationship between the computational properties C. Example of linear grammars and the geometric properties of the associated Let a language L generate all strings of the type {anbn : n ≥ 0} quotient spaces. over the alphabet Σ = {a, b}. The non-regular language L can be The other direction is further extension of the notion of measure generated by the grammar {v → avb|} that can be rewritten as: to a general context free language. One possible approach is to v1 → av2; and v2 → v1b. The resulting π˜ matrix is expressed in generalize the concept of the event plane to that of the event the matrix form as: sheaf. To handle a general CFG, one must deal with production p 0 rules of the type S −→ α where α is a general string of variables Π˜ =   0 iq and terminals. One possible method to handle a production rule     where the parameters p and q can be identified from the experi- S −→ aAB is to consider a sheaf of event planes and lift the mental time series data of the system dynamics [8]. The Π-matrix second variable B to the next higher plane. The idea is illustrated is then obtained as: in Figure 5 where a grammar S −→ aTV | aTVW; V −→  0 p 0 0 TbS ; ··· is considered. Future work will focus on such event Π Π Re =   and Im =  .  0 0 q 0 sheaf representations of general context free grammars and the         Assigning characteristic values (i.e., weights) of the two states possibility of generalizing the results of this paper to the entire v1 and v2 to be χ1 and χ2 respectively, the complex measure vector class of context free languages. is evaluated as: VI. SUMMARY AND CONCLUSIONS (χ1 + pχ2) + iχ1 µ(L) = This paper introduces the notion of a quantitative measure of ( χ2 + i(χ2 + qχ1) ) non-regular languages [2], generated by linear context free gram- mars (LCFG) that belong to the low end of Chomsky hierarchy [3]. V. FUTURE WORK It shows that the measure of regular languages, reported in [7] can Future research will focus two main directions. Further devel- be obtained by its generating regular grammar, without referring opment of the notion of the event plane will help investigating to states of the automaton. Then, the paper extends the signed important geometric properties of formal languages on which real measure to a complex measure for the class of non-regular little work is reported. Relating theory to the languages, generated by LCFG. The extended language measure is notions of algebraic topology and geometry will have far reaching potentially applicable to quantitative analysis and synthesis of DES implications. A deformation retract of the event plane followed by control systems where the plant model of a complex dynamical an identification of the edges described by the derivations lead system is not restricted to be a finite state machine. to a very interesting geometric interpretation. Figures 3 and 4 illustrate the idea. The first step in Figure 3 is to continuously REFERENCES deform the event plane to collapse it along the dotted lines (a [1] Hatcher A., Algebraic topology, Cambridge University Press; 1st deformation retract [1]). The succeeding steps are identification edition, 2001. of edges marked by identical paths on the plane. The result [2] I. Chattopadhyay, A. Ray, and X. Wang, A complex measure of non- is a compact topological space, namely the double wedge of regular languages for discrete-event supervisory control, Proceedings of American Control Conference, Boston, MA (2004), in press. spheres [1]. In Figure 4, the same construction is carried out [3] J. E. Hopcroft, R. Motwani, and J. D. Ullman, Introduction to au- for the regular grammar of Figure 2 which results in a double tomata theory, languages, and computation, 2nd ed., Addison-Wesley, wedge of circles. It immediately follows that even for a general 2001. 9 Identifying . . "b" .a . . . b S . T ...... • . • . . . b S • • T . b . . S T . a . . . b Identifying "a" . . homotopic S T ...... a S . equivalence . b . . b •. S, T a . S T . . b ...... S T deformation . a a . retract S T . . . . . Fig. 3. Quotient space construction for the grammar of Figure 1. The quotient space is homotopic in the natural sense to wedge of spheres.

a b a S T S T a S T

W identifying edges T b

V • S, T

S a T Fig. 4. Corresponding construction for a strictly regular grammar. The quotient space Fig. 5. Suggested event sheaf for a non-linear grammar (only a is a double wedge of circles. few derivations shown)

[4] J.O. Moody and P.J. Antsaklis, Supervisory control ofdiscrete event 1083–1100. systems using petri nets, Kluwer Academic, 1998. [7] A. Surana and A. Ray, Measure of regular languages, Demonstratio [5] P. J. Ramadge and W. M. Wonham, Supervisory control of a class of Mathematica 37 (2004), no. 2, 485–503. discrete event processes, SIAM J. Control and Optimization 25 (1987), [8] X. Wang, A. Ray and A. Khatkhate, On-line identification of language no. 1, 206–230. measure parameters for discrete event supervisory control, IEEE [6] A. Ray, J. Fu, and C. Lagoa, Optimal supervisory control of finite Conference on Decision and Control, Maui, Hawaii (2003), 6307– state automata, International Journal of Control 77 (2004), no. 12, 6312.