Sådhanå (2018) 43:134 © Indian Academy of Sciences

https://doi.org/10.1007/s12046-018-0833-ySadhana(0123456789().,-volV)FT3](0123456789().,-volV)

Error tolerance for the recognition of faulty strings in a regulated grammar using fuzzy sets

AJAY KUMAR*, NIDHI KALRA and SUNITA GARHWAL

Computer Science and Engineering Department, Thapar Institute of Engineering & Technology, Patiala 147004, India e-mail: [email protected]

MS received 25 July 2017; revised 5 January 2018; accepted 8 January 2018; published online 10 July 2018

Abstract. To overcome the limitations of context-free and context-sensitive grammars, regulated grammars have been proposed. In this paper, an algorithm is proposed for the recognition of faulty strings in regulated grammar. Furthermore, depending on the errors and certainty, it is decided whether the string belongs to the 2 language or not based on string membership value. The time complexity of the proposed algorithm is O(|GR|·|w|), where |GR| represents the number of production rules and |w| is the length of the input string, w. The reader is provided with numerical examples by applying the algorithm to regularly controlled and matrix grammar. Finally, the proposed algorithm is applied in the Hindi language for the recognition of faulty strings in regulated grammar as a real-life application.

Keywords. Fuzzy regulated grammar; fuzzy sets; Chomsky normal form; regularly controlled grammar; matrix grammar.

1. Introduction inference methods [10–12] are used in the design of fuzzy systems. Context-free grammars are not able to cover all aspects of Motivated by the prior work of recognition of imperfect linguistic phenomena. The membership of a string in a strings in context-free grammar [13] and context-sensitive context-sensitive grammar is decided in exponential time. grammar [14], in this paper, the recognition of faulty strings To overcome the limitations of context-free and context- generated by the regulated grammar is described. As well, sensitive grammars, the concept of regulated grammar the problem of single and multiple fault occurrences is [1–3] was introduced. Regulated grammar is a class of discussed. Faulty string refers to some imperfection caused grammar in which certain restrictions are imposed on in the string. Single fault refers to imperfection at one context-free productions, making them more powerful than position in a string, whereas multiple faults refer to context-free grammar. Regulated grammar is classified into imperfection at more than one position in a string. Here, context and rule-based grammar. In context-based regu- fuzzy sets are used to compute the membership of strings lated grammar, restrictions are imposed on the context, that do not exactly match with the grammar. whereas in rule-based regulated grammar, the restrictions are imposed on the application of rules. In formal languages, the input string is either accepted 1.1 Previous work (μ(w)=1) or rejected (μ(w)=0), where μ denotes the Regularly controlled and matrix grammar were proposed by membership of a string, w. To reduce the gap between Ginsburg et al [15] and Cremers et al [16], respectively. In natural and classical formal languages, fuzziness has been regularly controlled grammar, the control is regulated by introduced into formal languages, thereby giving rise to the sequence of productions in the regular expression. In fuzzy languages. In recent years, fuzzy languages have matrix grammar, control is regulated by following the same been applied in various areas, such as intelligent interface sequence given in the matrix. Solar and Meduna [17] pro- design [4, 5], clinical monitoring [6], neural networks posed a state translation scheme, which is an extended [7, 8] and pattern recognition [9]. Various fuzzy version of state grammar [18], for producing two output strings in a single derivation. Meduna and Zemek [19] introduced one-sided random context grammars, a variant of random context grammar [20]. Furthermore, Meduna and Zemek [21] introduced one-sided random context *For correspondence

1 134 Page 2 of 12 Sådhanå (2018) 43:134 grammars, with a limited number of right random context Definition 2.1 Regular expression over an alphabet T is rules. Meduna [22] put forth generalized forbidding gram- defined using following rules: mar. Furthermore, Meduna and Zemek [23] also introduced i) ϕ and λ are regular expressions representing the one-sided forbidding grammars, where each rule is further empty set and empty string. divided into left and right forbidden rules, and each rule ii) If a∊T, then a is a regular expression representing the consists of a finite set of forbidden strings. Meduna and singleton set {a}. Zemek [24] established that the power of one-sided for- iii) If r and r are regular expressions, then r r ,r |r and bidding grammars and selective substitution grammars 1 2 1 2 1 2 r * are regular expressions representing concatena- [25, 26] are equivalent in the case of erasing rule grammar. 1 tion, union, and Kleene closure. Kalra and Kumar [27, 28] introduced the concept of deterministic deep pushdown automata and fuzziness in The regulated grammar consists of production rules similar deep pushdown automata. Garhwal and Jiwari [29] intro- as context-free grammar, but it has a larger generative duced the concept of fuzziness in parallel regular capacity than context-free grammar. Regulated grammar expression. lies between context-free and context-sensitive grammar. n The concept of fuzzy languages, a fuzzy subset of strings Context-sensitive language (such as ww, anbncn and a2 ) over an alphabet, was introduced by Lee and Zadeh [30]. involving interleaved dependencies can be represented Asveld [31] proposed fuzzy context-free K-grammar for using regulated grammar. describing correct and erroneous sentences. Asveld [32] fi devised the modified Cocke–Younger–Kasami (CYK) De nition 2.2 Regulated grammar is GR=(G,R), where G algorithm for recognizing fuzzy context-free grammar. =(N,T,S,P) is a context-free grammar and R is the Additionally, Schinder et al [13] described an approach for restriction applied on the derivations of strings. R depends recognizing imperfect strings generated by the context-free on the type of regulated grammar [1–3]. grammar. Inui et al [14] expanded on the work of Schinder Regularly controlled grammar is a rule-based regulated et al [13], describing an approach for identifying imperfect grammar, where the restriction on rules is applied using a strings generated by the context-sensitive grammar. regular expression. Schinder et al [13] and Inui et al [14] applied the concept of fuzzy sets to determine the membership values of strings. Definition 2.3 Regularly controlled grammar [1–3]is Here, the concept of errors and fuzzy sets in this context is GRC=(G,r), where G=(N,T,S,P) is a context-free grammar utilized which was earlier used by Schinder et al [13] and and r is a regular expression over P for controlling the Inui et al [14] regarding context-free grammar and context- derivations of strings. sensitive grammar. Recently, Zhanga et al [33] and Bag Language L (G ) represented by regularly controlled et al [34] have applied the concept of pattern recognition in RC grammar G consists of all words w∊T* generated by the intelligent computing systems and prediction of consumer RC following sequence: intention. After introducing preliminary concepts in section2, the pr1 pr2 prn S ! w1 ! w2 ÁÁÁ!w such that pr1; pr2; ...; prn 2 P following topics have been explained. Example 2.1 Consider the regularly controlled grammar ● Recognition of faulty strings generated by regulated [2] G =(G ,r), where G =({S,C,D},{c,d},S,P) where the grammar using the different fuzzy sets has been RC 1 1 production P is of the following forms: proposed in section3. ● The problem of occurrence of multiple faults has been pr0 : S ! CD pr1 : C ! cC solved by converting the regulated grammar into pr2 : D ! cD pr3 : C ! dC Chomsky normal form in section3. pr4 : D ! dD pr5 : C ! c ● The complete procedure is depicted by numerical pr6 : D ! cpr7 : C ! d example in section4. pr8 : D ! d ● The proposed algorithm has been applied on the Hindi à language to demonstrate its real- life application in r ¼ pr0ðpr1pr2; pr3pr4Þ ðpr5pr6; pr7pr8Þ section5 and the conclusions are provided in section6. Consider w=cdcd

pr0 pr1 pr2 pr7 pr8 2. Preliminaries S ! CD ! cCD ! cCcD ! cdcD ! cdcd Hence, L(G )={ww|w∈(c,d)?} is a non-context-free Let T be an alphabet, λ denotes empty string and μ(w) de- RC language. notes the membership of a string such that 0≤μ(w)≤1. _ represents the symbol that can be replaced by any symbol Matrix grammar is a rule-based regulated grammar, from T. where restrictions on productions are applied using Sådhanå (2018) 43:134 Page 3 of 12 134 matrices. Here, we have to apply all productions of a matrix ● Replacement Error [13, 14]: In replacement before starting with another matrix. error, a terminal is substituted with another terminal symbol from T. Definition 2.4 Matrix grammar [1–3]isGM=(G,rm), ● Addition Error [13, 14]: In addition error, an where G=(N,T,S,M) and M=(m ,m ,…,m ),n≥1 is a finite 1 2 n extra new terminal symbol is added from T.This set of matrices, where each m consists of a finite set of i new terminal can be placed before or after the context-free production and r is a regular expression over m terminal. the set of matrices mi. ● Removal Error [13, 14]: In removal error, some Example 2.2 Consider GM=(G,rm), where G=({S,C,D}, terminal is deleted. {c,d},S,{m ,m ,m ,m ,m }) be the matrix grammar [2] 0 1 2 3 4 iii) w is a weighted function such that {(p,w(p))|p∊P∧w where (p)∊[0,1]}. It is used to assign the degree of membership to the production rules. m0 ¼ðp0 : S ! CDÞ; m1 ¼ðp1 : C ! cC; p2 : D ! cDÞ; m2 ¼ðp3 : C ! dC; p4 : D ! dDÞ; 3. Proposed algorithm for the recognition of faulty strings in regulated grammar using fuzzy sets m3 ¼ðp5 : C ! c; p6 : D ! cÞ; m p : C d; p : D d ; 4 ¼ð 7 ! 8 ! Þ In classical , a string is either accepted or The control set is r =m (m ,m )*(m ,m ). rejected. Fuzzy languages close the gap between natural m 0 1 2 3 4 and formal languages. In the fuzzy set, the degree to Consider w=dcdc which an element belongs to the set is referred to as the

m0:p0 m2:p3 m2:p4 m3:p5 m3:p6 degree of membership [35]. In this section, an algorithm is S ! CD ! dCD ! dCdD ! dcdD ! dcdc proposed for the recognition of faulty strings in regulated ? grammar using fuzzy sets (RFSRG). Given a regularly Hence, L(G )={ww|w∊(c,d) } is a non-context-free M controlled grammar G =(G,r). For avoiding multiple language. RC errors, in step 1, GRC=(G,r) is converted into GRC0 ¼ G0; r0 ′ Definition 2.5 Fuzzy grammar [30] is quadruple Gf=(N, ð Þ where G is a Chomsky normal form of G. Every 1 2 T,P,S), where P is a set of fuzzy productions of the form production pi 2 G is converted into productions pi ,pi , l k ≥ ′ fa À! bj0  l  1 ^ a; b 2ðN [ TÞÃg. …,pi |k 1ofG . Specifically, G is converted into Chom- sky normal form G′ [36]. fi De nition 2.6 Fuzzy language L [30] is a fuzzy set of In step 2, control derivation r of GRC is converted into r′ μ ∊ ≤μ ≤ μ T*·L={(x, (x))|x T* and 0 (x) 1} where (x) denote the of GRC0 such that L(G′,r′)=L(G,r)−{λ}. degree of membership of string x. μ(x) is obtained by taking a minimum of all derivation rules from S to derive x if a single derivation exists for the string x. If there exists more than one derivation for a particular string x, then max is taken over the minimum of all the derivation rules from starting from S to derive x. Example 2.3 [30] Consider L={(0,0.8),(00,0.4),(10,0.2), (1,0.2),(11,0.6)}. Here L describes a fuzzy language over {0, 1}. Zero membership strings are not represented in L. Example 3.1 Consider example 2.1, where the production rules P is of the following forms: Definition 2.7 Fuzzy regulated grammar Gfr(GR,E,w), where pr0 : S ! CD pr1 : C ! cC pr2 : D ! cD pr3 : C ! dC i) GR is a regulated grammar in Chomsky normal form. pr : D dD pr : C c ii) E is the error set of productions constructed from the 4 ! 5 ! pr6 : D ! cpr7 : C ! d original productions P using the replace operator pr : D d (replacing a terminal symbol with another terminal in 8 ! p∊P), add operator (adding an extra terminal in p∊P) Ã r ¼ pr0ðpr1pr2; pr3pr4Þ ðpr5pr6; pr7pr8Þ and remove operator (removing a terminal in p∊P). Errors in the input string are classified into following Equivalent Chomsky normal form (CNF) for the grammar three types: GRC: 134 Page 4 of 12 Sådhanå (2018) 43:134

pr : S CD pr1 : C A C l w 1 S shaped membership function 0 ! 1 ! 1 Tf ð Þ¼ À 2 1 pr1 : A1 ! cpr2 : D ! A2D 8 9 2 1 0; x a pr : A2 ! cpr: C ! A3C >  > 2 3 > 2 > pr2 : A dpr1 : D A D > x À a > 3 3 ! 4 ! 4 <> 2 ; a\x ða þ bÞ=2 => 2 b a pr : A4 dpr5 : C c À 4 ! ! lT ðwÞ¼1 À 2 f > x À b > pr6 : D ! cpr7 : C ! d > 1 2 ; a b =2\x\b > > À b a ð þ Þ > pr8 : D ! d : À ; 1; x  b ′ 1 2 1 2 1 2 1 2 r is converted into r =pr0(pr1pr1pr2pr2,pr3pr3pr4pr4)*(pr5- ð1Þ pr6,pr7pr8). where a and b represent the minimum and maximum In step 3, we apply replace, add and remove operators to number of productions used to derive the faulty string, w,in → ∊ ∊ the productions of the form A a|A N and a T. Hence, all derivations, respectively. x denotes the actual number of there is no possibility of occurrence of multiple errors using productions used in a particular derivation of the faulty one production. To accumulate the faulty string, replace, string, w. In this a and b plots the extreme points of the add and remove operators are considered. Similar attempts sloped curve, therefore making it S-shape. In step 7, the have been made for the recognition of faulty strings in optimal derivation for the string, w, was chosen. The context-free and context-sensitive grammar [13, 14].

As regularly controlled grammar is ambiguous, in step 4, optimal derivation was found by choosing the highest different derivations for the string, w∉L, were generated. In degree of membership obtained from E-set and Tf-set. one derivation of a string, total 2|w|−1 productions are used. In each derivation of a string, w, the number of productions of the form A→BC|A,B,C∊N can be |w|−1 and the number of productions of the form A→a|A∊N∧a∊ T can be |w|. In step 5, a fuzzy set E is used for determining the number of erroneous substitutions. Membership of a faulty l w jwjÀjEpj string is determined by Eð Þ¼ jwj where |w| represents the length of the string, w, and |Ep| represents the number of error productions used for the generation of string w. A certainty factor was defined that can be used for the acceptance and rejection of strings. In step 6, a fuzzy set Tf is used for determining the derivation with minimal pro- If a matrix grammar GM=(G,rm) is given instead of duction rules utilized to derive the faulty string, w. Mem- regularly controlled grammar. During, the conversion of G bership of a faulty string is determined by: into Chomsky normal form G′, every production pi 2 G is Sådhanå (2018) 43:134 Page 5 of 12 134

1 2 k ≥ ′ replaced by production pi ,pi ,…pi |k 1ofG . In step 2 of Theorem 1 Given a regulated grammar GR and string RFSRG, we call procedure Matrix_Control_derivation (GM, w. The time complexity of the RFSRG algorithm is equal to ′ 2 GM ). The rest of the algorithm remains the same. O(|GR|·|w|), where |GR| represents the size of productions

Example 3.2 Consider GM=(G,rm), where G=({S,C,D}, and |w| represents the length of the string respectively. {c,d},S,{m ,m ,m ,m ,m }) be the matrix grammar of 0 1 2 3 4 Proof In Step1 of RFSRG algorithm, regulated grammar example 2.2 where GR is converted into Chomsky Normal form GR0 using G m0 ¼ðp0 : S ! CDÞ; DEL, TERM, UNIT and BIN operations [36]. Size of R0 i. e. | GR0 | depends on the order in which DEL, BIN, UNIT m1 ¼ðp1 : C ! cC; p2 : D ! cDÞ; and TERM are carried out [36]. TERM always give a linear m p : C dC; p : D dD ; 2 ¼ð 3 ! 4 ! Þ increase in the size of the grammar. If we carry out the 2 m3 ¼ðp5 : C ! c; p6 : D ! cÞ; transformation in BIN→DEL→UNIT, then |GR0 |=O(|GR|). G m4 ¼ðp7 : C ! d; p8 : D ! dÞ; In step 2, the control derivation GR is converted into R0 . Each production pi∊GR may be converted into a number of 1 2 k The control set is rm=m0(m1,m2)*(m3,m4). productions say pri ,pri ,…,pri |k≥1inGR0 and each sub- stitution of p in control derivation r gives the corre- Equivalent CNF for the grammar G : i M sponding control derivation r′. Each substitution can be

0 carried out in O(|1|). Therefore, the control derivation r can m0 ¼ðp0 : S ! CDÞ; be converted into r′ in O(|GR|). 0 1 2 1 2 m1 ¼ðp1 : C ! A1C; p1 : A1 ! c; p2 : D ! A2D; p2 : A2 ! cÞ; In step 3, we add additional productions using replace, m0 p1 : C A C; p2 : A d; p1 : D A D; p2 : A d ; 2 ¼ð 3 ! 3 3 3 ! 4 ! 4 4 4 ! Þ add and remove operators. Thus, the number of productions 0 m3 ¼ðp5 : C ! c; p6 : D ! cÞ; 2 is of the size O(|GR|). For each production, four faulty 0 m4 ¼ðp7 : C ! d; p8 : D ! dÞ productions (one each by replace and remove operation and two productions by add operations) are generated. Addition The control set is rm′ is m0′(m1′,m2′)*(m3′,m4′). of each faulty productions can be carried out in O(|1|), 134 Page 6 of 12 Sådhanå (2018) 43:134 thereby giving the overall complexity of the second step is Using fuzzy replace operator: 2 O(|GR|). In each derivation of a string, w, the number of pro- pr9 : A1 ! pr10 : A2 ! ductions of the form A→BC|A,B,C∊N can be |w|−1 and the pr11 : A3 ! pr12 : A4 ! number of productions of the form A→a|A∊N∧a∊T can be pr13 : C ! pr14 : D ! |w|. Therefore total 2|w|−1 productions are used in one pr15 : C ! pr16 : D ! derivation of a string. Considering at each step of the 2 Using fuzzy add operator: derivation, we can choose a production from O(|GR|) pro- 2 ductions. Hence the overall complexity is O(|GR|·|w|). Step pr : A c c pr : A c c 5 can be carried out in O(|1|). In step 6, for finding a, b and 17 1 ! j 18 2 ! j 2 pr19 : A3 ! djd pr20 : A4 ! djd x values, complexity is O(|GR|) and further computation of pr21 : C ! cjc pr22 : D ! cjc lT ðwÞ can be carried out in O(|1|). So overall complexity of f pr : C d d pr : D d d 2 23 ! j 24 ! j Step 6 is O(|GR|). Step 7 can be carried out in O(|1|). Thus 2 the overall complexity of the algorithm is O(|GR|·|w|). Using fuzzy remove operator: The proposed algorithm works on all rule-based regu- lated grammars with little modifications in applications of pr25 : A1 ! k pr26 : A2 ! k rules. pr27 : A3 ! k pr28 : A4 ! k pr29 : C ! k pr30 : D ! k pr31 : C ! k pr32 : D ! k 4. Numerical examples Consider the string w ¼ ddddcd 62 L. Table 1 depicts some ways for deriving the string w. In this section, we apply RFSRG algorithm to regularly Table 2 represents the confidence level for the string controlled grammar. w using E-set. Example 4.1 Consider the regularly controlled grammar Table 3 represents the confidence level for the string w of example 2.1 GRC=(G1,r), where G1=({S,C,D},{c,d},S, using Tf-set. Table 4 represents the final interpretations with P) where the production P is of the following forms: fuzzy confidence for the string w ¼ ddddcd. The optimal derivation for the string w ¼ ddddcd is as pr : S CD pr : C cC 0 ! 1 ! follows: pr2 : D ! cD pr3 : C ! dC pr4 : D ! dD pr5 : C ! c lðwÞ¼maxð0; 0; 0:6667; 0; 0; 0; 0; 0; 0:3333Þ pr6 : D ! cpr7 : C ! d ¼ 0:6667 pr8 : D ! d λ à By choosing a certain level of confidence c, we say the r ¼ pr0ðpr1pr2; pr3pr4Þ ðpr5pr6; pr7pr8Þ string is accepted if μ(w)≥λc. Application of RFSRG algorithm to matrix grammar (for details refer “Appendix 1”). Equivalent Chomsky normal form (CNF) for the gram- mar GRC:

1 pr0 : S ! CD pr1 : C ! A1C 2 1 5. Real -life application of RFSRG algorithm pr1 : A1 ! cpr2 : D ! A2D 2 1 in the Hindi language pr2 : A2 ! cpr3 : C ! A3C 2 1 pr3 : A3 ! dpr4 : D ! A4D 2 Regulated languages are the most appropriate formal pr4 : A4 ! dpr5 : C ! c method for representing human languages. pr6 : D ! cpr7 : C ! d In linguistic topology, the Hindi language contains cross- pr8 : D ! d serial dependency, and its structure is similar to the German 1 2 1 2 1 2 1 2 r is converted into r′=pr0(pr1pr1pr2pr2,pr3pr3pr4pr4)*(pr5- and Dutch languages. Consider the following sentence of pr6,pr7pr8). the Hindi Language spoken in India. Sådhanå (2018) 43:134 Page 7 of 12 134

Table 1. Various derivation for the string w=ddddcd.

Derivation 1 S p0 CD p1 A3CD p2 dCD p1 dCA4D p2 dCdD p1 dA3CdD ! ! 3 ! 3 ! 4 ! 4 ! 3

!p19 dddCdD !p40 dddCdA4D !p20 dddCdcdD !p29 ddddcdD !p30 ddddcd

Derivation 2 S p0 CD p1 A3CD dCD p1 dCA4D p2 dCdD p1 dA1CdD ! ! 3 !p2 ! 4 ! 4 ! 1 3

p9 ddCdD p1 ddCdA2D p2 ddCdcD p7 ddddcD p8 ddddcd ! ! 2 ! 2 ! ! Derivation 3 S p0 CD 1 A3CD p19 ddCD 1 ddCA4D p20 ddCdcD ! !p3 ! !p4 ! !7 ddddcD !8 ddddcd

Derivation 4 S p0 CD p1 A3CD p2 dCD p1 dCA4D p2 dCdD p1 dA3CdD ! ! 3 ! 3 ! 4 ! 4 ! 3

p2 ddCdD p1 ddCdA4D p12 ddCdcD p7 ddddcD p8 ddddcd ! 3 ! 4 ! ! ! Derivation 5 S p0 CD p1 A1CD p9 dCD p1 dCA2D p10 dCdD p1 dA1CdD ! ! 1 ! ! 2 ! ! 1

p9 ddCdD p1 ddCdA2D p2 ddCdcD p7 ddddcD p8 ddddcd ! ! 2 ! 2 ! ! Derivation 6 S p0 CD 1 A1CD p9 dCD 1 dCA2D p10 dCdD 1 dA3CdD ! !p1 ! !p2 ! !p3

p19 dddCdD p1 dddCdA4D p20 dddCdcdD p29 ddddcdD p30 ddddcd ! ! 4 ! ! ! Derivation 7 S p0 CD p1 A1CD p9 dCD p1 dCA2D p10 dCdD p1 dA3CdD ! ! 1 ! ! 2 ! ! 3

p2 ddCdD p1 ddCdA4D p12 ddCdcD p7 ddddcD p8 ddddcd ! 3 ! 4 ! ! ! Derivation 8 S p0 CD p1 A3CD p2 dCD p1 dCA4D p2 dCdD dA3CdD ! ! 3 ! 3 ! 4 ! 4 !p1 3

p2 ddCdD p1 ddCdA4D p12 ddCdcD p13 ddddcD p14 ddddcd ! 3 ! 4 ! ! ! Derivation 9 S p0 CD p1 A3CD p19 ddCD ddCA4D p20 ddCdcD ! ! 3 ! !p1 ! 4 !13 ddddcD !14 ddddcd

Table 2. Confidence level for the string w using E-set.

Derivation no. 1 2 3 4 5 6 7 8 9 Grade of 0.3333 0.8333 0.6667 0.8333 0.5 0 0.5 0.5 0.3333 membership

Table 3. Confidence level for the string w using Tf-set.

Derivation no. 1 2 3 4 5 6 7 8 9 Grade of membership 1–1=0 1–1=0 1–0=1 1–1=0 1–1=0 1–1=0 1–1=0 1–1=0 1–0=1 134 Page 8 of 12 Sådhanå (2018) 43:134

Table 4. Final interpretations with fuzzy confidence for the string w=ddddcd.

Derivation no. 1 2 3 4 5 6 7 8 9 μ(w∈E) 0.3333 0.8333 0.6667 0.8333 0.5 0 0.5 0.5 0.3333 μ(w∈Tf) 001001001 Min 0 0 0.6667 0 0 0 0 0 0.3333

Table 5. Various derivation for the string w=cdcdd.

Derivation 1 Derivation 2 Derivation 3 Derivation 4 pr0:S→CD pr0:S→CD pr0:S→CD pr0:S→CD 1 1 1 1 pr1:S→A1CD pr1:S→A1CD pr3:S→A3CD pr3:S→A3CD 2 2 pr1:S→cCD pr1:S→cCD pr11:S→cCD pr19:S→cdCD 1 1 1 1 pr2:S→cCA2Dpr2:S→cCA2Dpr4:S→cCA4Dpr4:S→cdCA4D 2 2 2 pr18:S→cCcdD pr2:S→cCcD pr4:S→cCdD pr4:S→cdCdD pr7:S→cdcdD pr7:S→cdcD pr23:S→cdcdD pr5:S→cdcdD pr8:S→cdcdd pr24:S→cdcdd pr8:S→cdcdd pr14:S→cdcdd

Table 6. Confidence level for the string w using E-set.

Derivation no. Derivation 1 Derivation 2 Derivation 3 Derivation 4 Grade of membership 0.75 0.75 0.5 0.5

Table 7. Confidence level for the string w using Tf-set.

Derivation no. Derivation 1 Derivation 2 Derivation 3 Derivation 4 Grade of membership 1–0=1 1–0=1 1–0=1 1–0=1

Table 8. Final interpretations with fuzzy confidence for the string w=cdcdd.

Derivation No. Derivation 1 Derivation 2 Derivation 3 Derivation 4 μ(w∈E) 0.75 0.75 0.5 0.5 μ(w∈Tf) 1111 Min 0.75 0.75 0.5 0.5

m n m n In the above sentences, sequential noun phrases मोहन Therefore, sentence S2 is of the form a b c d , which is a (Mohan), सेब (apple), राम (Ram), पेन (pen) and the verb non-context-free language. Cross-serial dependency of the phrases ख़ाया (ate), खराब हो गया (rotten), िलखा (written), Hindi language can be represented by regulated grammar. उपहार मे िमला था (gifted) both form two separate series of Now consider the sentence S1 and If the users uninten- constituents. Let m′ be the morphism which maps to मोहन tionally write the S1 sentence as and ख़ाया to c, सेब and खराब हो गया to d and all other words S1′: मोहन ने वो सेब ख़ाया जो सेब खराब हो गया| of the sentence to the empty string. Therefore, S1:cdcd. Therefore, sentence S1 is of the form ww, which is a non- In this sentence, the user makes an insertion error of one context-free language. noun phrase सेब (apple). Now the sentence S1′ maps to Consider m″ be the morphism, which maps राम, पेन, cdcdd instead of cdcd. In , this string is िलखा, उपहार मे िमला था to a,b,c and d, respectively and the completely rejected. However, in case of fuzzy regulated dependency relation between राम (Ram) and िलखा (written) grammar, we cannot completely reject the string w.By is denoted by m and पेन (pen) and उपहार मे िमला था (gifted) considering the grammar of example 4.1, Erroneous sen- by n and other words of the sentence by empty string. tence acceptance S1′ is shown using proposed algorithm Sådhanå (2018) 43:134 Page 9 of 12 134

RFSRG. Table 5 depicts few ways for deriving the string w m0 ¼ðp0 : S ! CDÞ; =cdcdd. m1 ¼ðp1 : C ! cC; p2 : D ! cDÞ; Table 6 represents the confidence level for the string m p : C dC; p : D dD ; w using E-set. 2 ¼ð 3 ! 4 ! Þ Table 7 represents the confidence level for the string w m3 ¼ðp5 : C ! c; p6 : D ! cÞ; using Tf-set. Table 8 represents the final interpretations with m4 ¼ðp7 : C ! d; p8 : D ! dÞ; fuzzy confidence for the string w ¼ cdcdd. = The optimal derivation for the string w ¼ cdcdd is as The control set is rm m0(m1, m2)*(m3, m4). follows: Equivalent CNF for the grammar GM:

lðwÞ¼maxð0:75; 0:75; 0:5; 0:5Þ¼0:75 0 m0 ¼ðp0 : S ! CDÞ; λ 0 By choosing a certain level of confidence c, we say the m p1 : C A C; p2 : A c; p1 : D A D; μ ≥λ 1 ¼ð 1 ! 1 1 1 ! 2 ! 2 string is accepted if (w) c. 2 p2 : A2 ! cÞ; 0 1 2 1 m2 ¼ðp3 : C ! A3C; p3 : A3 ! d; p4 : D ! A4D; 6. Conclusions 2 p4 : A4 ! dÞ; 0 In this paper, an algorithm for deriving a string, w∉L, with m3 ¼ðp5 : C ! c; p6 : D ! cÞ; a certain level of confidence is described. Regulated 0 m4 ¼ðp7 : C ! d; p8 : D ! dÞ grammar is converted to Chomsky normal form to avoid multiple error elimination. Furthermore, the algorithm is The control set is rm′ is m0′(m1′, m2′)*(m3′, m4′). applied to regularly controlled and matrix grammar. The Using fuzzy replace operator: time complexity of the proposed algorithm is O(|G2|·|w|). R pr : A pr : A By way of numerical examples for a string, w∉L, the 9 1 ! 10 2 ! pr : A pr : A determination of a confidence level for the string, w, is also 11 3 ! 12 4 ! pr : C pr : D demonstrated Further, the real-life application of a pro- 13 ! 14 ! pr : C pr : D posed algorithm for the recognition of faulty strings for the 15 ! 16 ! natural language, Hindi, is exhibited. Using fuzzy add operator:

pr17 : A1 ! cjc pr18 : A2 ! cjc Acknowledgements pr19 : A3 ! djd pr20 : A4 ! djd pr21 : C ! cjc pr22 : D ! cjc One of the authors, Nidhi Kalra was supported under pr23 : C ! djd pr24 : D ! djd Visvesvaraya PhD Scheme Fellowship by Ministry of Electronics and Information Technology, Government of Using fuzzy remove operator: India. pr25 : A1 ! k pr26 : A2 ! k pr27 : A3 ! k pr28 : A4 ! k pr29 : C ! k pr30 : D ! k Appendix 1: Numerical example pr31 : C ! k pr32 : D ! k In this appendix, the proposed algorithm has been applied On considering string w = ddcd ∉ L. Table A1 depicts to a matrix grammar. few ways for deriving the string w = ddcd. Table A2 represents the confidence level for the string Example 4.2 Consider G = (G, r ), where M m wusing E-set. G = {{S, C, D}, {c, d}, S,{m , m , m , m , m }} be the 0 1 2 3 4 Table A3 and A4 represent the confidence level for the matrix grammar of example 2.2 where string wusing Tf-set and final interpretations with fuzzy confidence for the string w = cdcdcccd respectively. 134 Page 10 of 12 Sådhanå (2018) 43:134

Table A1. Various derivation for the string w ¼ cdcdcccd.

m0 m1 m1 m1 m1 m2 m2 Derivation S ! CD ! A1CD ! cCD ! cCA2D ! cCcD ! cA3CcD ! cdCcD p0 p1 p2 p1 p2 p1 p2 1 1 1 2 2 3 3 m2 m2 m1 m1 m1 m1 ! cdCcA4D ! cdCccD ! cdA1CccD ! cdcCccD ! cdcCccA2D ! p1 p14 p1 p2 p1 p2 4 1 1 2 2

m4 m4 cdcCcccD ! cdcdcccD ! cdcdcccd p7 p8 m0 m1 m1 m1 m1 m1 m1 Derivation S ! CD ! A1CD ! cCD ! cCA2D ! cCcD ! cA1CcD ! ccCcD p 1 1 2 1 0 p p2 p p p p9 2 1 1 2 2 1

m1 m1 m1 m1 m1 ! cdCcA2D ! cdCccD ! cdA1CccD ! cdcCccD ! cdcCccA2D 1 2 1 1 p p p p2 p 2 2 1 1 2

m1 m4 m4 ! cdcCcccD ! cdcdcccD ! cdcdcccd p2 p7 p8 2 m0 m1 m1 m1 m1 m1 m1 Derivation S ! CD ! A1CD ! cCD ! cCA2D ! cCcD ! cA1CcD ! p0 p1 p2 p1 p2 p1 p17 3 1 1 2 2 1 m1 m1 m4 m4 cdcCcD ! cdcCcA2D ! cdcCcccD ! cdcdcccD ! cdcdcccd p1 p18 p7 p8 2 m0 m1 m1 m1 m1 m1 m1 Derivation S ! CD ! A1CD ! cCD ! cCA2D ! cCcD ! cA1CcD ! cdcCcD p0 p1 p2 p1 p2 p1 p17 4 1 1 2 2 1 m1 m1 m2 m2 ! cdcCcA2D ! cdcCcccD ! cdcA3CcccD ! cdcdCcccD p1 p18 p1 p2 2 3 3

m2 m2 m4 m4 ! cdcdCcccA4D ! cdcdCcccdD ! cdcdcccdD ! cdcdcccd p1 p2 p31 p32 4 4 m0 m2 m2 m2 m2 m2 m2 Derivation S ! CD ! A3CD ! cCD ! cCA4D ! cCcD ! cA3CcD ! cdCcD p0 p1 p11 p1 p12 p1 p2 5 3 4 3 3 m2 m2 m1 m1 m1 ! cdCcA4D ! cdCccD ! cdA1CccD ! cdcCccD ! cdcCccA2D p1 p12 p1 p2 p1 4 1 1 2

m1 m4 m4 ! cdcCcccD ! cdcdcccD ! cdcdcccd 2 p7 p8 p2 m0 m2 m2 m2 m2 m1 m1 Derivation S ! CD ! A3CD ! cCD ! cCA4D ! cCcD ! cA1CcD ! ccCcD p0 p1 p11 p1 p12 p1 p9 6 3 4 1 m1 m1 m1 m1 m1 ! cdCcA2D ! cdCccD ! cdA1CccD ! cdcCccD ! cdcCccA2D p1 p2 p1 p2 p1 2 2 1 1 2

m1 m4 m4 ! cdcCcccD ! cdcdcccD ! cdcdcccd p2 p7 p8 2 m0 m2 m2 m2 m2 m1 m1 Derivation S ! CD ! A3CD ! cCD ! cCA4D ! cCcD ! cA1CcD ! cdcCcD p0 p1 p11 p1 p12 p1 p17 7 3 4 1 m1 m1 m4 m4 ! cdcCcA2D ! cdcCcccD ! cdcdcccD ! cdcdcccd p1 p18 p7 p8 2 m0 m2 m2 m2 m2 m1 m1 Derivation S ! CD ! A3CD ! cCD ! cCA4D ! cCcD ! cA1CcD ! cdcCcD p0 p1 p11 p1 p12 p1 p17 8 3 4 1 m1 m1 m2 m2 m2 ! cdcCcA2D ! cdcCcccD ! cdcA3CcccD ! cdcdCcccD ! cdcdCcccA4D 1 p18 1 2 1 p2 p3 p3 p4

m2 m4 m4 ! cdcdCcccdD ! cdcdcccdD ! cdcdcccd p2 p31 p32 4 m0 m1 m1 m1 m1 m2 m2 Derivation S ! CD ! A1CD ! cCD ! cCA2D ! cCcD ! cA3CcD ! cdCcD p0 p1 p2 p1 p2 p1 p2 9 1 1 2 2 3 3 m2 m2 m2 m2 m2 ! cdCcA4D ! cdCccD ! cdA3CccD ! cdcCccD ! cdcCccA4D 1 p12 1 p11 1 p4 p3 p4

m2 m4 m4 ! cdcCcccD ! cdcdcccD ! cdcdcccd p12 p7 p8 m0 m1 m1 m1 m1 m1 m1 Derivation S ! CD ! A1CD ! cCD ! cCA2D ! cCcD ! cA1CcD ! ccCcD p0 p1 p2 p1 p2 p1 p9 10 1 1 2 2 1 m1 m1 m2 m2 m2 ! cdCcA2D ! cdCccD ! cdA3CccD ! cdcCccD ! cdcCccA4D p1 p2 p1 p11 p1 2 2 3 4

m2 m4 m4 ! cdcCcccD ! cdcdcccD ! cdcdcccd p12 p7 p8 Sådhanå (2018) 43:134 Page 11 of 12 134

Table A2. Confidence level for the string w using E-set.

Derivation no. 1 2 3 4 5 6 7 8 9 10 Grade of Membership 0.875 0.875 0.75 0.5 0.625 0.625 0.5 0.25 0.625 0.625

Table A3. Confidence level for the string w using Tf-set.

Derivation no. 1 2 3 4 5 678910 Grade of Membership 1-1 = 0 1-1 =0 1-0 =1 1-1 =0 1-1 = 1-1 = 0 1-0 = 1 1-1 = 0 1-1 = 0 1-1 =0 0

Table A4. Final interpretations with fuzzy confidence for the string w = cdcdcccd.

Derivation no. 12345678910 μ(w 2 E) 0.875 0.875 0.75 0.5 0.625 0.625 0.5 0.25 0.625 0.625 μ(w 2Tf) 1-1=0 1-1=0 1-0=1 1-1=0 1-1=0 1-1=0 1-0=1 1-1=0 1-1=0 1-1=0 Min 0 0 0.75 0 0 0 0.5 0 0 0

lðwÞ¼maxð0; 0; 0:75; 0; 0; 0; 0:5; 0; 0; 0Þ¼0:75 [9] DePalma G F and Yau S S 1975 Fractionally fuzzy grammars with application to pattern recognition. In: Zadeh L A, Fu K By choosing a certain level of confidence λc, we say string S, Tanaka K and Shimura M (Eds.) Fuzzy Sets and their is accepted if μ(w) ≥ λc. Application to Cognitive and Decision Processes. New York: Academic Press, pp. 329–351 [10] Qiu D 2007 Automata theory based on quantum logic: Reversibilities and pushdown automata. Theoretical Com- References puter Science 386(1–2): 38–56 [11] Beˇlohla´vek R 2002 Determinism and fuzzy automata. Infor- mation Sciences 143(1–4): 205–209 [1] Dassow J and Pa˘un G 1989 Regulated rewriting in Formal [12] Ignjatovic´ J, C´ iric´ M and Bogdanovic´ S 2008 Deter- Language Theory (1st ed)., Berlin, Heidelberg: Springer minization of fuzzy automata with membership values in [2] Dassow J 2004 Grammars with regulated rewriting. complete residuated lattices. Information Sciences 178(1): In: Martin-vide C, Mitrana V and Paun G (Eds.) Formal 164–180 Languages and Applications. Berlin, Heidelberg: Springer, [13] Schneider M, Lim H and Shoaff W 1992 The utilization of pp. 249–273 fuzzy sets in the recognition of imperfect strings. Fuzzy Sets [3] Meduna A and Zemek P 2014 Regulated Grammars and and Systems 49(3): 331–337 Automata (1st ed). New York: Springer [14] Inui M, Shoaff W, Fausett L and Schneider M 1994 The [4] Senay H 1992 Fuzzy command grammars for intelligent recognition of imperfect strings generated by fuzzy context interface design. IEEE Transactions on Systems, Man and sensitive grammars. Fuzzy sets and systems 62(1): 21–29 Cybernetics 22(5): 1124–1131 [15] Ginsburg S and Spanier E H 1968 Control sets on grammars. [5] Hopcroft J E, Motwani R and Ullman J D 2006 Introduction Math. Systems Theory 2(2): 159–177 to automata theory, languages and computation (3rd ed). [16] Cremers A and Mayer O 1973 On matrix languages. Infor- Boston: Addison-Wesley mation and Control 23(1): 86–96 [6] Steimann F and Adlassnig K P 1994 Clinical monitoring with [17] Solar P 2014 Deep Pushdown Transducers and State Trans- fuzzy automata. Fuzzy Sets and Systems 61(1): 37–42 lation Schemes. In: Proceedings of the 20th Conference [7] Giles C L, Omlin C W and Thornber K K 1999 Equivalence STUDENT EEICT, Brno University of Technology, 24 April, in knowledge representation: automata, recurrent neural pp. 264–268 networks, and dynamical fuzzy systems. Proceedings of the [18] Kasai T 1970 An hierarchy between context-free and con- IEEE 87(9):1623–1640 text-sensitive languages. Journal of Computer and System [8] Zadeh L A 2004 A note on web intelligence, world knowl- Sciences 4(5): 492–508 edge and fuzzy logic. Data and Knowledge Engineering 50 [19] Zemek P 2013 One-sided random context grammars: (3): 291–304 Established results and open problems. In: Proceedings of the 134 Page 12 of 12 Sådhanå (2018) 43:134

19th Conference STUDENT EEICT, Brno University of [28] Kalra N and Kumar A 2016 Fuzzy state grammar and fuzzy Technology, 25 April, pp. 222–226 deep . Journal of Intelligent and Fuzzy [20] Van der Walt A P J 1970 Random context grammars. In: Systems 31(1): 249–258 Proceedings IFIP Congress. North-Holland, Amsterdam, [29] Garhwal S and Jiwari R 2016 Parallel fuzzy regular pp. 66–68 expression and its conversion to epsilon-free fuzzy automa- [21] Meduna A and Zemek P 2014 One-sided random context ton. The Computer Journal 59(9):1383–1391 grammars with a limited number of right random context [30] Lee E T and Zadeh L A 1969 Note on fuzzy languages. rules. Theoretical Computer Science 516: 127–132 Information Sciences 1(4): 421–434 [22] Meduna A 1990 Generalized forbidding grammars. Inter- [31] Asveld P R J 2005 Fuzzy context-free languages. Part 1: national Journal of Computer Mathematics 36(1–2): 31– generalized fuzzy context-free grammars. Theoretical Com- 38 puter Science 347(1): 167–190 [23] Meduna A and Zemek P 2013 Generalized one-sided for- [32] Asveld P R J 2005 Fuzzy context-free languages. Part 2: bidding grammars. International Journal of Computer Recognition and parsing algorithms. Theoretical Computer Mathematics 90(2): 172–182 Science 347(1): 191–213 [24] Meduna A and Zemek P 2012 One-sided forbidding gram- [33] Zhanga J, Williams S O and Wang H 2017 Intelligent mars and selective substitution grammars. International computing system based on pattern recognition and data Journal of Computer Mathematics 89(5): 586–596 mining algorithms. Sustainable Computing: Informatics and [25] Kleijn H C M 1983 Selective Substitution Grammars Based Systems. https://doi.org/10.1016/j.suscom.2017.10.010 on Context-Free Productions. Ph.D. Thesis, Leiden Univer- [34] Bag S, Tiwari M K and Chan F T S 2017 Predicting the sity, Netherlands consumer’s purchase intention of durable goods: An attri- [26] Kleijn H C M 1987 Basic ideas of selective substitution bute-level analysis. Journal of Business Research. https://doi. grammars. In: Kelemenova A and Kelemen J (Eds.) Trends org/10.1016/j.jbusres.2017.11.031 Techniques and Problems in Theoretical Computer Science. [35] Zadeh L A 1965 Fuzzy sets. Information and Control 8(3): Berlin, Germany: Springer, pp. 75–95 338–353 [27] Kalra N and Kumar A 2017 Deterministic Deep Pushdown [36] Lange M and Leiß H 2009 To CNF or not to CNF? An Transducer and its Parallel Version. The Computer Journal efficient yet presentable version of the CYK algorithm. In- 61(1): 63–73 formatica Didactica 8: 2008–2010