Indonesian Journal of Electrical Engineering and Computer Science Vol. 8, No. 1, October 2017, pp. 36 ~ 42 DOI: 10.11591/ijeecs.v8.i1.pp36-42  36

Multiset Controlled Grammars: A Normal Form and Closure Properties

Salbiah Ashaari1, Sherzod Turaev*2, M. Izzuddin M. Tamrin3, Abdurahim Okhunov4, Tamara Zhukabayeva5 1,2,3Kulliyyah of Information and Communication Technology, International Islamic University Malaysia 53100 Kuala Lumpur, Malaysia 4Kulliyyah of Engineering, International Islamic University Malaysia, 53100 Kuala Lumpur, Malaysia 5Faculty of Information Technology, L.N. Gumilyov Eurasian National University, 010008 Astana, Kazakhstan *Corresponding author, e-mail: [email protected], [email protected], [email protected], [email protected], [email protected]

Abstract Multisets are very powerful and yet simple control mechanisms in regulated rewriting systems. In this paper, we review back the main results on the generative power of multiset controlled grammars introduced in recent research. It was proven that multiset controlled grammars are at least as powerful as additive valence grammars and at most as powerful as matrix grammars. In this paper, we mainly investigate the closure properties of multiset controlled grammars. We show that the family of languages generated by multiset controlled grammars is closed under operations union, concatenation, kleene-star, homomorphism and mirror image.

Keywords: multiset, regulated grammar, multiset controlled grammar, generative capacity, closure property

Copyright © 2017 Institute of Advanced Engineering and Science. All rights reserved.

1. Introduction A regulated grammar is depicted as a grammar with an additional (control) mechanism that able to restrict the use of the productions (a.k.a. rules) during derivation process in order to avoid certain derivations as well as to obtain a subset of the language generated in usual way. The primary motivation for introducing regulated grammars came from the fact that a plenty of languages of interest are seen to be non-context free such as the languages with reduplication, multiple agreement or crossed agreement properties. Therefore, the main aim of regulated grammars is to achieve a higher computational power and yet at the same time does not increase the complexity of the model [1, 2]. It is believed that the first regulated grammar, which is a matrix grammar, was introduced by Abraham in 1965 with the idea such in a derivation step, a sequence of productions are applied together [3]. Since then, a plenty of regulated grammars have been introduced and investigated in several papers such as [1-13], where each has a different control mechanism, and provides useful structures to handle a variety of issues in formal languages, programming languages, DNA computing, security, bioinformatics and many other areas. In this paper, we continue our research on multiset controlled grammars (see [4]); we investigate the closure properties of the family of languages generated by multiset controlled grammars. The study of closure properties is one of a crucial investigation in theory since it provides a meaningful merit in both theory and practice of grammars. This paper is structured as follows. First, we give some basic notations and knowledge concerning to the theory of formal languages in which include grammars with regulated rewriting and set-theoretic operations on languages that will be used throughout the study. Then, we recall the definitions of multiset controlled grammars defined in [4] together with results on their generative power. Then, we demonstrate that for multiset controlled grammars one can construct equivalent normal forms, which will be used in study of the closure properties. In the last section, we put in a nutshell all materials studied previously with some possible future research topics on those grammars.

Received June 6, 2017; Revised August 16, 2017; Accepted September 1, 2017 IJEECS ISSN: 2502-4752  37

2. Preliminaries In this section, we present some basic notations, terminologies and concepts concerning to the formal languages theories, multiset and regulated rewriting grammars that will be used in the following sections. For details, the reader is referred to [1, 2, 14-16]. Throughout the paper, we use the following basic notations. Symbols and represent the set membership and negation of set membership of an element to a set. Symbol signifies the set inclusion and marks the strict inclusion. Then, for a two sets and ∈ , ∉ if and . Further, notation | | is used to portray the cardinality of a set in which is⊆ the number of elements in the set⊂ as well as notation 2 is used to depict the 퐴power 퐵set퐴 of⊊ a퐵 set퐴 ⊆. Symbol퐵 퐴 ≠ denotes퐵 the empty set.퐴 The sets of integer,퐴 natural, real and rational퐴 numbers are denoted by , , and , respectively.퐴 퐴 ∅ An alphabet is a finite and nonempty set of symbols or letters, which is ℤdenotedℕ ℝ byℚ and a string (sometimes referred as word) over the alphabet is a finite sequence of symbols (concatenation of symbols) from . The string without symbols is called null or empty string andΣ denoted by . The set of all strings (including ) over the alphabet,Σ is denoted by and = { }. A string is a substringΣ of a string if and only if there exist , such∗ that += ∗ where휆 , , , . String is a휆 proper substring of ifΣ and Σ . The 1 2 lengthΣ Σ of− string휆 , denoted푤 by | |∗, is the number of symbols휈 in the string. Obviously,푢 푢 | | = 0 and 1 2 1 2 |휈 |푢=푤|푢| + | |, 푢, 푢 푤 . 휈A ∈languageΣ is푤 a subset of . A language휈 is푤 ≠ -휆free if 푤 ≠ 휈. For a set , a multiset푤 is a mapping∗ 푤: . The set of all multisets∗ over is denoted by 휆 . Then, the푤� set 푤 is 푣a called푤 휈 ∈ theΣ basic set of퐿 . For a multisetΣ and퐿 element휆 휆 ∉⊕퐿 , ( ) represents퐴 the number of occurrences휇 퐴 → ℕ of ⊕in . ⊕ 퐴 퐴 A퐴 Chomsky grammar is a quadruple퐴 = ( , , ,휇 )∈ 퐴where is an alphabet푎 ∈ 퐴 휇 푎of nonterminals, is an alphabet of terminals푎 and휇 = , is the start symbol and is a finite set of productions such that ( ) 퐺( 푁)푇×푆(푃 ) . We푁 simply use notation for a production푇 ( , ) . A direct derivation∗푁 ∩ 푇 relation∅∗ 푆 ∈ over푁 (∗ ) , denoted by 푃 , is defined as provided if and only푃 ⊆ if푁 there∪ 푇 푁is 푁a ∪rule푇 푁 ∪ 푇 such ∗that = and 퐴 =→ 푤 for some , 퐴 푤( ∈ 푃 ) . Since is a relation, then its th,푁 ∪ 푇 0, power is denoted⇒ 1 2 by , its transitive푢 ⇒ 푣 closure by ∗ , and its reflexive and퐴 →transitive푤 ∈ 푃 closure by푢 푥. 퐴A푥 string 1 2 1 2 푣 푥(푛푤푥 ) is a sentential푥 푥 ∈ form푁 ∪ 푇if + . If⇒ , then is called푛 a푛 sentence≥ or ∗a terminal ⇒ ⇒ ⇒ …. string and ∗ is said to be a successful∗ derivation.∗ We also use the notations or to 푤 ∈ 푁 ∪ 푇 푆 ⇒ 푤 푤 ∈ 푇 푤 푚 푟0푟1 푟푛 denote the derivation∗ that uses the sequence of rules = … , , 1 . The language 푆ge⇒nerated푤 by , denoted by ( ), is defined as ( ) = { | ⇒ ��}�.� �Two� 0 1 푛 푖 grammars and are called to be equivalent if and only if푚 they푟 generate푟 푟 푟 the∈ 푃 ∗same≤ 푖 ∗language,≤ 푛 i.e., ( ) = ( ). There are퐺 five main types퐿 퐺of grammars depending퐿 퐺 on their푤 ∈ 푇productions푆 ⇒ 푤 forms 1 2 : 퐺 퐺 1 2 a)퐿 퐺 a regular퐿 퐺 if and , 푢 →b)푣 a linear if ∗ ∗ and , c) a context-free푣 ∈ if∗푇 ∪ ∗푁( 푇 ∗ ) 푢and∈ 푁 ,

d) a context-푣sensitive∈ 푇 ∪ 푇 if 푁푇 ( ∗ 푢 )∈ 푁 ( ) and ( ) where | | | | and e) a recursively enumerable푣 ∈ 푁 ∪ 푇 or unrestricted∗푢 ∈+ 푁 if∗ ( ) and+ ( ) where contains at least one 푢nonterminal∈ 푁 ∪ 푇 symbol.푁 푁 ∪ 푇 푣 ∈ 푁 ∪+ 푇 푢 ≤ ∗푣 The families of languages generated by arbitrary,푢 ∈context푁 ∪ 푇 sensitive,푣 context∈ 푁 ∪ 푇free, regular,푢 linear and finite grammars are denoted by , , , , , and , respectively. For these language families, holds: 퐑퐑 퐂퐂 퐂퐂 퐑퐑퐑 퐋퐋퐋 𝐅� .

𝐅�Before⊂ 퐑퐑퐑 moving⊂ 퐋퐋퐋 to ⊂the퐂퐂 operations⊂ 퐂퐂 ⊂ 퐑퐑 on languages, we recall some definition of regulated grammars mentioned in this study. A matrix grammar is a quadruple = ( , , , ) where , and are defined as for context-free grammar and is a set of matrices, that are finite sequences of context-free rules from × ( ) . Its language is defined퐺 푁by푇 푆( 푀) = { 푁| 푇 푆 푀 and }. ∗ ∗ 휋 푁 푁 ∪ 푇 퐿 퐺 푤 ∈ 푇 푆 An additive∗ valence grammar is a 5-tuples = ( , , , , ) where , , , are defined as⇒ 푤for a 휋context∈ 푀 -free grammar and is a mapping from into set (set ). The language generated by the grammar consists of all string퐺 푁 푇 such푆 푃 푣that there푁 푇is 푆a 푃derivation 푣 ∗푃 ℤ ℚ 푤 ∈ 푇 푆 Multiset Controlled Grammars: Normal Form and Closure Properties (Salbiah Ashaari) 38  ISSN: 2502-4752

… where ( ) = 0 . The families of languages generated by matrix and additive 푟1푟2 푟푛 valence grammars 푛(with erasing rues) are denoted by , , ( , ), respectively. 푘=1 푘 Now,������ �we푤 recall some∑ 푣 set푟 -theoretic operations that will be used to investigate휆 휆 the closure properties of a grammar. Let and be two languages.퐌퐌퐌 Then,푎퐕𝐕 the퐌퐌퐌 union 푎(퐕𝐕), intersection ( ), difference ( ) and concatenation ( ) of and are defined as: 1 2 퐿 퐿 ∪ ∩ 1 2 − = { ⋅ }퐿 퐿 = { } 1 2 1 2 퐿 ∪ 퐿 = {푤 ∶ 푤 ∈ 퐿 표� 푤 ∈ 퐿 } 1 2 1 2 퐿 ∩ 퐿 = { 푤 ∶ 푤 ∈ 퐿 푎�푎 푤 ∈ 퐿 } 1 2 1 2 퐿 − 퐿 푤 ∶ 푤 ∈ 퐿 푎�푎 푤 ∉ 퐿 1 2 1 2 1 1 2 2 The complement퐿 ⋅ 퐿 of푤 푤 ∶ 푤 with∈ 퐿 respect푎�푎 푤 to∈ 퐿 is defined as = . For a language , the iterations of is defined as∗: ∗ ∗ 퐿 ⊆ Σ Σ 퐿� Σ − 퐿 퐿 =퐿{ }, 0 = , 퐿1 = 휆 , 퐿 2 퐿 퐿 = 퐿퐿 (iteration closure: Kleene star). ∗⋯= 푖 (positive iteration closure: Kleene plus). 푖≥0 퐿+ ⋃ 퐿 푖 푖≥1 Given퐿 ⋃ two퐿 alphabets , , a mapping : is called a morphism or synonymously a homomorphism if and only if: ∗ ∗ 1 2 1 2 (i) for every , there existsΣ Σ such that =ℎ Σ( )→ andΣ is distinct, (ii) for every , ∗ ( ) = ( ) (∗ ) . 1 2 A morphism푤 ∈ Σ is∗ -free if for휈 every∈ Σ , 휈( )ℎ 푤. Then,휈 a morphism is known as an 1 isomorphism when푤 휈 ∈ forΣ every∶ ℎ 푤 � , ℎ 푤 ,ℎ if 휈 ( ) =∗ ( ) then = . 1 For a word =휆 … , ∗ 푤, 1∈ Σ ℎ 푤, the≠ 휆mirror image of (a.k.a. reversal) is 1 obtaining by writing the word푤 휈 in∈ theΣ reverseℎ 푤 orderℎ 휈 such 푤 …휈 and it is denoted by . 1 2 푛 푖 Therefore, for a language푤 푤 푤 푤, we 푤defined∈ Σ its≤ mirror푖 ≤ 푛 image as = { : 푤 }. 푅 푛 1 2 푤∗ 푤 푤푅 푤 푅 푤 퐿 ⊆ Σ 퐿 푤 푤 ∈ 퐿 3. Multiset Controlled Grammars: Definitions and Generative Power Informally, a multiset controlled grammar is a context-free grammar = ( , , , ) equipped with an arithmetic expression over multisets on terminals. For each production : , the multiset [ ] , called “counter”, is defined representing퐺 the푁 terminal푇 푆 푃 occurrences in . For example, if ⨁= { , } and : is a production, then [ ] = 푟(2,퐴1)→. A푤 derivation∈ 푃 in the grammar휔 푟 ∈ 푇is successful if only if the value of the expression of multisets resulted from the푤 derivation in a true푇 relation푎 푏 with 푟a certain퐴 → 푎푎 𝑎threshold∈ 푃 [4]. The formal definition휔 푟 of multiset controlled grammars is portrayed in the following definition. Definition 1 [4] A multiset controlled grammar is a 6-tuples 훼 = ( , , , , , ) where , and are defined as for a context-free grammar, is a finite subset of × ( ) × and : is a linear or nonlinear function. A triple ( , , ) 퐺 is written푁 푇 푆 as푃 ⊕ 퐹 ∗[ ]. ⨁If 푁(푇 , , …⨁푆, ), , 1 is a linear, then it푃 is in the form of 푁 ( 푁, ∪,푇… , )푇= 퐹 푇( )→+ℤ where , 0 . Then, as a nonlinear퐴 푤 휔 function∈ 푃 , we can퐴 → consider푤 휔 1 2 푛 푖 1 2 푛 퐹logarithms,푛푎 푎 푎polynomials,푎 ∈ 푇 ≤rational,푖 ≤ 푛 exponential, power and so on. Thus, the language퐹 푎 푎 generated푎 푖=1 푖 푖 0 푖 by∑ multiset푐 휇 푎 controlled푐 grammar푐 ∈ ℤ ≤is 푖defined≤ 푛 by ( , , ) = { [ ] ( 퐹), ( ) } where the relation {=, <, >, , }, , is a cut point set and ( ) = { [ ] × 퐿 퐺 훼 ∗ 푤 ∣ 푤 휔 ∈ 퐿 퐺 퐹 휔 ∗ 훼 [ ] where = }. ∗ ⨁ 휋 ∗ ∈ ≤ ≥ 훼 ∈ 푊 푊 ⊆ ℤ 퐿 퐺 푤 휔 ∈ 푇 푇 ∣ 푆 Definition 2 A multiset controlled grammar = ( , , , , , ) is called 푛 푛 1 2 푛 a)⇒ 푤regular휔 if each휋 production푟 푟 ⋯ 푟 in the grammar has the form such [ ] or [ ] where and , . 퐺 푁 푇 푆 푃 ⊕ 퐹 b) linear if each∗ production in the grammar has the form such 퐴 → 푤� [휔 ] or 퐴 → 푤[휔] where 푤,∈ 푇, 퐴 and퐵 ∈ 푁, . 1 2 c) context-free if each∗ production in the grammar has the form퐴 such→ 푤 퐵푤 휔[ ] where퐴 → 푤 휔 1 2 ( )푤 and푤 푤 ∈ 푇. 퐴 퐵 ∈ 푁 ∗ 퐴 → 푥 휔 푥 ∈ 푁 ∪ 푇 퐴 ∈ 푁 IJEECS Vol. 8, No. 1, October 2017 : 36 – 42 IJEECS ISSN: 2502-4752  39

The families of languages generated by multiset controlled regular, linear and context- free grammars with and without erasing rules are denoted by , , ( , , ) respectively. We also use bracket notation [ ], { , , } to show that휆 a statement휆 휆 holds both cases of with and without erasing rules.휆 The 푚following퐑퐑퐑 푚 퐋퐋퐋theorem푚퐂퐂 shows푚퐑퐑퐑 the푚 computational퐋퐋� 푚퐂퐂 powers possessed by multiset controlled grammars.푚퐗 퐗 ∈ Combining퐑퐑퐑 퐋퐋퐋 퐂퐂the results above, we form the following relations as in Figure 1. Theorem 1 [4] ...... [ ] [ ]. 퐑퐑퐑 ⊂ 푚퐑퐑퐑 퐋퐋퐋 ⊂ 푚퐋퐋퐋 푚퐂퐂 −휆 푎퐕𝐕 ≠ 휆∅ 퐂퐂 ⊂ 푚퐂퐂 푚퐑퐑퐑 − 퐂퐂 ≠ ∅ 푚퐂퐂 ⊆ 퐌퐌퐌

Figure 1. The hierarchy of families of language generated by multiset controlled grammar.

4. A Multiset Chomsky Normal Form A normal form is introduced with the intent to transform a grammar into a standard form by imposing it with restrictions. In formal language theory, a variety of normal forms were first investigated and developed to solve the rudimentary problems involving context-free languages such for making it easy to analyze and to construct proofs regarding particular properties of the grammars, i.e., for testing emptiness and deciding membership matters with more easily. One of the most useful, well-constructed and popular normal forms to deal with context-free grammar is Chomsky normal form (CNF) due to its simple structure (binary tree). The use of CNF allows easily to determine whether a string is generated by the context-free grammar or not using polynomial time algorithms (for instance, CYK algorithm). A context-free grammar is said to be in CNF if and only if all its productions are in form of and where , , are variables and is exactly a terminal. Here, we prove that our multiset controlled context-free grammars can also be transformed in to equivalent CNFs. 퐴 → 푋푋Theorem퐴 → 1푥 For any퐴 푋multiset� controlled context푥 -free grammar , there exists an equivalent multiset controlled context-free grammar in multiset Chomsky normal form 푚 (mCNF). ′ 퐺 푚 Proof: Let be a multiset controlled context-free퐺 grammar. Then, any such grammar can be converted into an equivalent grammar where all its productions are in form of 푚 퐺 ′ or/and with > 0, where is the zero푚 vector, , , are variables and is a 퐺 terminal.ퟎ It is done휔 in three phases. 퐴 → 푋푋 Phase 1퐴. →We푥 construct휔 a grammarퟎ that is equivalent퐴 to푋 �grammar and does푥 not have any production in the form of where , . Suppose that we have productions 1 푚 in that lead to a series of form of derivation퐺 such 퐺 퐴 → 푋 퐴 푋 ∈ 푁 퐴 ⟶ 푋 퐺푚 with . 휔1 휔2 휔3 휔푛 휔푛+1 휔푛+2 퐴 �� 푋1 �� 푋2 �� ⋯ �� 푋푛 �⎯⎯� 푋 �⎯⎯� 푝 푝 ∉ 푁

Multiset Controlled Grammars: Normal Form and Closure Properties (Salbiah Ashaari) 40  ISSN: 2502-4752

Accordingly, we substitute all such “sequence” productions , , … , in 휔1 휔2 휔푛+1 by a single production , where = + + + . 1 Thus,1 the2 grammar푛 is푚 휔 퐴 �� 푋 푋 �� 푋 푋 �⎯⎯� 푋 퐺 equivalent to the grammar . 1 2 푛+1 1 Phase 2. We construct퐴 → 푝 a grammar휔 휔 that휔 is equivalent⋯ 휔 to grammar with condition퐺 푚 such all productions in are퐺 not in the form of 2 1 퐺 퐺 퐺2 , > 0, > 2 휔 1 2 푛 where 퐴s→ are푥 푥 terminals.⋯ 푥 휔 For푛 every rule of the form above, we introduce a new rule

푖 where all where terminals are replaced with new variables s, and rules of the ퟎ 푥 퐴 form1 2 푛 for each where푖 is the vector containing a signle one which푖 is at position . → � � ⋯ퟏ풊 � 푥 � Therefore,푖 푖we get 푖 with all its풊 productions are only in the forms , or/and � → 푥 푥 ퟏ 휔 푖 , 2, , 2 , … , . Here, it is obvious that grammar is equivalent to 퐺 퐴 → 푥 푥 ∈ 푇 퐴 grammarퟎ . 1 푛 1 2 푛 2 → 푋 ⋯ 푋Phase푛 ≥ 3. We푋 푋 construct푋 ∈ a푁 grammar that is equivalent to grammar퐺 where all its 퐺1 productions are only in the form of or 푚 with > 0, , , and 2 . Consider 휔 퐺′ ퟎ 퐺 a production in form of with > 2 in . Then, we substitute this production with the 퐴 → 푥 퐴 → 푋푋 휔 퐴 푋 � ∈ 푁 푥 ∈ 푇 productions 0 1 푛 2 퐴 → 푋 ⋯ 푋 푛 퐺

0 1 1 퐴 → 푋 � 0 �1 → 푋2 �2

⋮ 0 푛−2 푛−1 푛 where �’ are→ new� nonterminals.� Thus, the obtained grammar is equivalent to grammar , which is multiset Chomsky normal form. 푚 푚 �Example푠 1 Let = ({ , , }, { , , }, , , , ) be a퐺 ′multiset context-free grammar퐺 where consists of the following productions: 1 퐺 퐴 퐵 푆 푎 푏 푐 푆 푃 ⊕ 퐹 푃 [(0,0,0)], [(1,1,0)], 0 푟 ∶ 푆 → 퐴�[(0,0,1)], 1 푟 ∶ 퐴 → 푎𝑎[(1,1,0)], 2 푟 ∶ 퐵 → 푐[�(0,0,1)], 3 and ( ,푟 ,∶ 퐴) =→ 푎(� ) + ( ) + ( 1) ( ) + ( 1) ( ). 4 푟 ∶ 퐵 → 푐 Then,퐹 to푎 convert푏 푐 휇 the푎 grammar휇 푏 − into휇 푏Chomsky− 휇normal푐 form, we proceed as below: First, replace [(1,1,0)] with [(0,0,0)], [(1,0,0)], [(0,1,0)]. Then, 1 [(0,0,1)] by [(0퐺,0,0)], [(0,0,1)]. Next, [(1,1,0)] by [(0,1,0)]. 푎 푏 푎 푏 Last, replace 퐴 → 푎𝑎 [(0,0,0)] by 퐴 → 푇 [퐴(푇0,0,0)] and 푇 → 푎 [(0,0,0)].푇 → 푏 푐 푐 푎 푏 퐵 → 푐� Hence, we 퐵can→ 푇have퐵 a multiset푇 → controlled푐 context퐴 free→ 푎 �grammar in 퐴Chomsky→ 푇 푇 normal 푎 푏 푎 푏 form with productions퐴 → 푇 퐴 푇such: 퐴 → 푇 퐶 퐶 → 퐴푇

[(0,0,0)], [(0,0,0)], 0 푟 ∶ 푆 → 퐴� [(0,0,0)], 1 푎 푟 ∶ 퐴 → 푇 퐶[(0,0,0)], 2 푏 푟 ∶ 퐶 → 퐴푇 [(0,0,0)], 3 푐 푟 ∶ 퐵 → 푇[(퐵0,0,1)], 4 푎 푏 푟 ∶ 퐴 → 푇 [푇(1,0,0)], 5 푟 ∶ 퐵 → 푐 [(0,1,0)], 6 푎 푟 ∶ 푇 → 푎[(0,0,1)], 7 푏 푟and∶ 푇 (→, 푏, ) = ( ) + ( ) + ( 1) ( ) + ( 1) ( ). 푟8 ∶ 푇푐 → 푐 퐹 푎 푏 푐 휇 푎 휇 푏 − 휇 푏 − 휇 푐 IJEECS Vol. 8, No. 1, October 2017 : 36 – 42 IJEECS ISSN: 2502-4752  41

5. Closure Properties Closure properties are often handy in proving theoretical properties of grammars and languages as well as in constructing new and complex languages from existing languages. Therefore, here by using the standard proof, we investigate the closure properties that can be owned by multiset controlled grammars. The families of languages generated by multiset controlled regular, linear and context- free grammars with linear counter ( ) functions are denoted by , , . Lemma 1 (union) The families , and are closed under union 푙 푙 푙 operation. 퐹 푚퐑퐑� 푚퐋퐋� 푚퐂� 푙 푙 푙 Proof: Let and be two languages푚퐑퐑 �in 푚 with퐋퐋� { 푚퐂�, , } generated by multiset controlled grammars = ( , , , , , ) and = ( , , , , , ), 1 2 respectively, where퐿 and퐿 are linear functions.퐗 Without퐗 ∈ loss푚퐑퐑퐑 of generality,푚�퐼� 푚𝑚 we assume that 1 1 1 1 1 1 2 2 2 2 2 2 = , and set = {퐺} where푁 푇 푆is 푃a new⊕ 퐹nonterminal 퐺symbol.푁 Then,푇 푆 푃we⊕ define퐹 1 2 the grammars = ( 퐹, , , 퐹, , ) where = { , } and = + . Thus, it 1 2 1 2 is푁 not∩ 푁 difficult∅ to notice푁 that푁 ∪ 푁 ∪ 푆 푆 1 2 1 2 1 2 퐺 푁 푇 푆 푃 ⊕ 퐹 푃 푃 ∪ 푃 ∪ 푆 → 푆 푆 → 푆 퐹 퐹 퐹 ( , , ) = ( , ,∶ ) ( , , ).

1 2 퐿Lemma퐺 훼 ∗ 2 (Kleene퐿 퐺 훼 ∗-star)∪ 퐿 The퐺 훼 family∗ of and are closed under Kleene-star operation. Proof ( ): For a given multiset controlled푚퐑퐑퐑 regular푚퐂퐂 language , let = ( , , , , , ) be a multiset controlled with = ( ). Then, it is not difficult to notice that the language 푚 is퐑퐑퐑 generated by multiset controlled regular grammar 퐿 퐺 푁 푇 푃 푆 ⊕ 퐹 ∗ 퐿 퐿 퐺 =퐿 { { }, , { , } { : , }, , , } ′ ′ ′ ′ ∗ ′ where is퐺 a new푁 nonterminal∪ 푆 푇 푃 ∪ symbol.푆 → 휆 푆 → 푆 ∪ 퐴 → 푤� 퐴 → 푤 ∈ 푃 푤 ∈ 푇 푆 ⊕ 퐹 Proof ( ′ ): Let a language is generated by multiset controlled context-free grammar = ( 푆, , , , , ). Then, it is easy to see that the language is generated by grammar such 푚=퐂퐂 { { }, , { 퐿 | }, , , } where is a new∗ nonterminal symbol. 퐺 푁′ 푇Lemma푃 푆 ⊕ 퐹3′ (homomorphism)′ ′ The′ families ′ , 퐿 and are 푚closed𝑚 under homomorphism.퐺 푁 ∪ 푆 푇 푃 ∪ 푆 → 푆푆 휆 푆 ⊕ 퐹 푆 Proof: Let , { , , } 푚be퐑퐑퐑 a la푚nguage퐋퐋퐋 generated푚퐂퐂 by a multiset controlled grammar = ( , , , , , ) and let : be a homomorphism. Then, there is a multiset controlled 퐿grammar∈ 퐗 퐗 ∈ 푚=퐑퐑퐑( , 푚,퐋퐋퐋, 푚, 퐂퐂, ∗) such∗ that ( ) = ( ). 1 1. regular: for퐺 every푁 푇production푃′ 푆 ⊕ 퐹 in the ℎform푇 →of푇 : ′[ ] in , we construct the 1 1 1 production ( ): 퐺( ) [푁 ]푇 in 푃 푆where⊕ 퐹 ′ , 퐿 퐺 { } andℎ 퐿 , ; 2. linear: for every production in the form of : ∗푟 퐴 → 푤�[휔] in 푃 , we ⨁construct⨁ the 1 1 production ℎ(푟): 퐴 → ℎ(푤 푋) 휔(′ )푃[ ], where푤 ∈ 푇, 푋 ∈ 푁, ∪ 휆 {휔} ∈and푇 휔′ ∈ 푇 , 1 2 푟 퐴 → 푤 푋∗푤 휔 푃 ⨁ ; 1 2 1 2 3. context⨁ -free:ℎ 푟 for퐴 →everyℎ 푤 production푋ℎ 푤 휔 ′ in the form푤 푤 of∈ 푇: 푋 ∈ 푁 ∪ 휆 … 휔 ∈ 푇 휔[ ′ ]∈, 1 푇 0, we construct the production ( ): ( ) ( ) … ( ) ( )[ ], 1 1 2 2 푘 푘 푘+1 0 , , 1 + 1 { }, 1 푟 퐴 → 푤 푋 푤 푋, 푤 푋 푤 휔 where , 1 1 and2 2 푘 푘 . 푘+1 푘We≥ define in the ∗above productions ℎas푟 퐴=→0ℎ if 푤| ( 푋)ℎ| =푤0, 푋 =ℎ⨁ 푤if | 푋( ℎ)⨁|푤= 1, and휔 푖 푖 1 = /|푘 ≥( )| if | ( 푤)| >∈ 푇1. Then≤ 푖 ≤ 푘has the푋 same∈ 푁 ∪ cooficient′ 휆 ≤ 푖 for≤ 푘 each 휔symbol′∈ 푇 휔′=∈ 푇( ) as ′ . In every successful휔′ derivation in generating휔 the stringℎ 푤 휔, we휔 replace′ ℎ 푤 with 1 휔( ) 휔 ℎ in푤 the correspondinℎ 푤 g derivation퐹′ and obtain ( ) . Thus (∗ ) = (푎 ). ℎ 푎 ∈ 푇 푎 ∈ 푇 Lemma 4 (mirror image) The families퐺 , and∗ 푤 ∈ 푇 are closed′ under푟 ∈ 푃mirror 1 1 ℎimage푟 ∈ operation.푃 ℎ 푤 ∈ 푇 ℎ 퐿 퐿 퐺 Proof: Let be a language generated by a multiset푚퐑퐑퐑 controlled푚퐋퐋퐋 regular푚퐂퐂 grammar (linear grammar, context-free in Chomsky normal form grammar) = ( , , , , , ), i.e., = ( ). Then, we define a multiset퐿 controlled regular grammar (linear grammar, context-free in Chomsky normal form grammar) = ( , , , , , ) such that (퐺 ) =푁 (푇 )푃 푆by⊕ performing퐹 퐿 reverse퐿 퐺 operation on production rules′ of the grammar . It is clear that′: 푅 1. 퐺 : for푁 each푇 푆 푃 ′production⊕ 퐹 rule of퐿 퐺the form퐿 퐺 [ ] in , we define the production [ ] where퐺 , { } and ; 퐿 ∈ 푚퐑퐑퐑 푅 ∗ 퐴 → 푤�⨁휔 푃 퐴 → 푋푤 휔 푤 ∈ 푇 푋 ∈ 푁 ∪ 휆 휔 ∈ 푇

Multiset Controlled Grammars: Normal Form and Closure Properties (Salbiah Ashaari) 42  ISSN: 2502-4752

2. : for each production rule of the form [ ] in , we define the production [ ] where , , { } and ; 1 2 3. 퐿 ∈ 푚퐋퐋퐋: for every푅 production푅 rule of the forms∗ 퐴 → 푤 [푋푤] and휔 ⨁ 푃 [ ] with , 2 1 1 2 , and퐴 → 푤 푋푤, we휔 define the푤 productions푤 ∈ 푇 푋 ∈ 푁 ∪ [휆 ] and휔 ∈ 푇 [ ]. 퐿 ∈ 푚퐂퐂 ⊕ 퐴 → 푋푋 휔 퐴 → 푎 휔 푋 � ∈ Then, it 푁is not푎 ∈ difficult푇 휔 to∈ see푇 that ( ) = . 퐴 → 𝑌 휔 퐴 → 푎 휔 Theorem 2 and ′ are closed푅 under union, Kleene-star, homomorphism and mirror image operations. 퐿 퐺 퐿 Theorem 3 푚퐑퐑퐑 is closed푚퐂퐂 under union, homomorphism and mirror image operations.

푚퐋퐋퐋 6. Conclusion In a nutshell, we have reviewed back the definition and computational powers of multiset controlled grammars defined in [4] where in addition we have investigated the closure properties of multiset controlled grammars. However, there are still vast questions about other closure properties, decidability problems and etc to be answered.

Acknowledgements This research has been supported by the grants RIGS16-368-0532 and FRGS13-074- 0315 of Ministry of Education, Malaysia through International Islamic University Malaysia.

References [1] Dassow J, Paun G. Regulated Rewriting in Formal Language Theory. Springer Berlin Heidelberg: New York, 1989. [2] Meduna A, Zemek P. Regulated Grammars and Automata. Springer Berlin Heidelberg: New York, 2014. [3] Abraham A. Some Questions of Phrase Structure Grammars. Computational Linguistics. 1965; 4: 61- 70. [4] Ashaari S, Turaev S, M Tamrin MI, Okhunov A, Zhukabayeva T. Multiset Controlled Grammars. Journal of Theoretical and Applied Information Technology. 2017. [5] Ashaari S, Turaev S, Okhunov A. Structurally and Arithmetically Controlled Grammars. International Journal on Perceptive and Cognitive Computing. 2016; 2(2): 25-35. [6] Lobo D, Vico FJ, Dassow J. Graph Grammars with String-Regulated Rewriting. Theoretical Computer Science. 2011; 412(43): 6101-6111. [7] Stiebe R. On Grammars Controlled by Parikh Vectors. In: Language Alive, LNCS 7300. Springer Berlin Heidelberg: New York. 2012: 246-264. [8] Dassow J, Turaev S. Petri Net Controlled Grammars with a Bounded Number of Additional Places. Journal of Acta Cybernetica. 2010; 19: 609-634. [9] Dassow J, Turaev S. k-Petri Net Controlled Grammars. In: Language and and Applications, LATA 2008, LNCS 5196, Springer Berlin Heidelberg: New York. 2008: 209-220. [10] Dassow J, Turaev S. Arbitrary Petri Net Controlled Grammars. Proceedings of 2nd International Workshop on Non-Classical Formal Languages in Linguistics. Tarragona, Spain. 2008: 27-40. [11] Stiebe R, Turaev S. Capacity Bounded Grammars and Petri Nets. In: J. Dassow, G. Pighizzini, B. Truthe (Eds.), 11th International Workshop on Descriptional Complexity of Formal Systems (DCFS 2009) EPTCS 3. 2009: 193-203. [12] Castano J.M. Global Index Grammars and Descriptive Power. Journal of Logic, Language and Information. 2004; 13(4): 403-419. [13] Kudlek M, Martin C, Paun G. Toward a Formal Macroset Theory. In: Calude, C.S Paun, G, Rozenberg, G., Salomaa, A. (Eds.), Multiset Processing, WMC 2000, LNCS 2235, Springer Berlin Heidelberg: New York. 2001: 123-133. [14] Salomaa A. Formal languages. Academic Press: New York. 1973. [15] Sipser M. Introduction to the Theory of Computation. 3rd edn. Cengage Learning: United States of America. 2013. [16] Martin-Vide C. Formal Languages for Linguists: Classical and Nonclassical Models. Oxford Handbook of Computational Linguistics. Oxford University Press: Oxford. 2001.

IJEECS Vol. 8, No. 1, October 2017 : 36 – 42