arXiv:2105.09368v1 [cs.LO] 19 May 2021 7--6449-/1$10 22 IEEE ©2021 978-1-6654-4895-6/21/$31.00 titybtenpositions between strictly falwrsstsyn h formula the satisfying words all of FO( t frltdlgc.Tenihorand neighbour The logics. related of ity od,namely words, varieti semigroups. involution involution of pseudovarieties between and Eilenberg-t established languages An morphisms. is classe involutory correspondence namely of th images inv languages, define inverse quotients, hermitia operations, and of then Boolean semigroup, locally We under varieties closed a semigroup. involution involution languages by involution commutative of recognised trivial notion is aperiodic locally it a an i if and definable only of is language and product a if that logic states semigrou characterisatio neighbour the involution with The FO of product. for product de hermitian rem languages locally semidirect we such the logic, of called an for the kind this, with characterise special algebras For To alphabets a natural words. semigroups. over The involution words finite are them. finite on on of relation, involution languages neighbour consider the we with logic language h enr predicate ternary the — 9.Atog n ol xrs htapsto soeo the of one is position a that express could one Although [9]. enstelanguage the defines neighbour, Let ehv h qaiypredicate equality the have we y o xml,teformula the example, For tmcfrua ftelgcaetefloig h predicat The The following: respectively. the word are given logic P a the of of positions formulas atomic last and first the h letter the h oi r lsdudrBoenoeain n rtorder first and operations Boolean i.e., under quantifications, closed are logic the od vrtealphabet the over words omlso h oi,if logic, the of formulas P nAgbacCaatrsto fFrtOdrLogic First-Order of Characterisation Algebraic An a a r egbus .. 
either i.e., neighbours, are eoew oit h hrceiainpolmof problem characterisation the into go we Before egv nagbaccaatrsto falgcoe finite over logic a of characterisation algebraic an give We . Abstract ( ( x min A, A ) for , min eafiieapae.Frua ftelgcF with FO logic the of Formulas alphabet. finite a be ) defined a ∧ W iea leri hrceiaino first-order of characterisation algebraic an give —We h iaypredicate binary The . a , FO( P max ∈ b ( h rtodrlgcwt h egbu relation neighbour the with logic first-order the max A, A , yaformula a by eoe htteposition the that denotes , N min ninIsiueo ehooyGoa Technology of Institute Indian ϕ ) ) .I I. ti neetn ont h expressibil- the note to interesting is it , ∀ ∧ ∨ a , ( A ,ϕ ψ, max ba NTRODUCTION x h constants The . ϕ bet x ∀ ) ⋆ y mle Manuel Amaldev and and [email protected] ∧ b , x ( ,y z y, x, N ( ψ, = 1 + N vrtealphabet the over ϕ x ) z ψ eoe as denoted , r nepee vrfinite over interpreted are , ( = ¬ N ,y x, eefis tde n[8], in studied first were — ϕ, r omlso h logic. the of formulas are ϕ ) y ( ,y x, . h e ffrua of formulas of set The . y ∀ ) stu fposition if true is ϕ x → or min ) between and , y ( eoe that denotes P ihNeighbour with = 1 + a x and L ( x ( slble by labelled is ∃ ϕ ) ϕ x max { ) ↔ steset the is , ,b a, x predicates Finally, . P r also are olution, theo- n } denote b x The . ( sof es y y of s and fine ype ))) ps, is n n e e . ntrso h te ymndcscn-re formulas; second-order monadic by other the thus of terms in h re eain( relation order the aeof case hierarchies. alternation fier FO( proi eua agae agae htaedefinable are logic that languages the — in languages regular aperiodic 7,[] tcnb hw that shown be can it [8], [7], ugs Likewise, guages. vrtealphabet the over FO( in ntne h language the instance, FO( osbet ielnug-hoei hrceiain of characterisations language-theoretic give to possible logic. this in definable languages the acterising rktnrtsterm h eainhpbtentepred the between relationship icates The theorem. 
Trakhtenbrot’s at,nml,tetescesrrlto ( relation successor the the namely, parts, ensaltewrso vnlnt.I un u that out turns It length. even of words the MSO( all defines ugs .. thstesm xrsiepwras power expressive same the has the it with i.e., guages, ∃ formula the e fpositions of set oml ftelgc h formula The logic; the of formula h xeso htas losmndcscn-re quan- second-order monadic allows — also tification that extension the and fgo leri rpris eicueteconstants the include we properties, s and the algebraic For operation. good reverse of the under closed are predicates and 9 nti setting. this in [9] hmevsaentepesbeusing expressible not are themselves npit using endpoints X oee,teprle ecie ofrbek oni the in down breaks far so described parallel the However, sn afsterm[]fo nt oe hoy tis it theory, model finite from [7] theorem Hanf’s Using h oai eododrlogic second-order monadic The FO( ∀ A, A, A, max N max x MSO( ∀ A, N A, +1) min min FO( y r hi w ett-ih ul fteconstants the if dual, left-to-right own their are ( min successor +1) norvcblr.Nx esaeterslso [8], of results the state we Next vocabulary. our in r o sdte agae enbeuigthese using definable languages then used not are N hna ahmtclInstitute Mathematical Chennai and u,uiga hefuh-r¨s´ argument Ehrenfeucht-Fra¨ıss´e an using But, . , , A, A, ( max max ∃ ,y x, , FO( Xϕ htaentepesbei h omrlgc For logic. former the in expressible not are that min max min bet bet X ) , , [email protected] hu Nevatia Dhruv , bet N FO( ,< A, → , , , htstse h formula the satisfies that ∀ r nlgu oteroine counter oriented their to analogous are y < x max swl as well as max N relation, ) Xϕ { ) hs ehv h usino char- of question the have we Thus, . ( ,b c b, a, ) A, X and ) nfc,ti aalls between parallelism this fact, In . , L , ( enspeieyalrglrlan- regular all precisely defines min N bet x r lofrua whenever formulas also are .Both ). = ) FO( ) } hr r agae expressible languages are There . 
¬ ↔ ) MSO( c , N ∗ max sepesbei h logic the in expressible is lodfie l eua lan- regular all defines also ,< A, abc h constants the , X MSO( ∗ , N ∃ ( bet A, ) L bet y Xϕ ))) xed oterquanti- their to extends +1) and and ) is ∧ A, enspeieyall precisely defines stu fteei a is there if true is X yB¨uchi-Elgot- by , not x N bet min ( ϕ min = 1 + ic both Since . o example, For . min r definable are , enbein definable max ) ∧¬ and , X y ϕ N and ) MSO ( max both min min max ) sa is ake bet is , - ) FO(A, +1) and FO(A, min, max, N ). For t > 0, we define isation of FO(A, +1) is a deep result [1], [5], [15], [26] in the equality with threshold t on the set N of natural numbers the theory of finite semigroups. The first observation is that by LTT is a variety of languages — a class of languages that
t i = j if i < t, is closed under Boolean operations, quotients with respect to i = j := (1) words, and inverse images under homomorphisms. Eilenberg’s (i ≥ t and j ≥ t otherwise. variety theorem states that varieties of languages correspond + + The word y ∈ A is a factor of the word u ∈ A if u = xyz to pseudovarieties of semigroups — a pseudovariety of fi- ∗ for some x, z in A . We use ♯(u,y) to denote the number of nite semigroups is a set of finite semigroups that is closed times the factor y appears in u, i.e. the number of pairs (x, z), under finite direct products, subsemigroups and quotients. ∗ where x, z ∈ A , such that u = xyz. The characterisation of LTT proceeds by first writing the ◮ Definition 1. Let ≈t , for k,t > 0, be the equivalence on class as a semidirect product of pseudovarieties of aperiodic k Acom A∗, whereby two words u and v are equivalent if either they commutative monoids ( ) and righty trivial semigroups D both have length at most k − 1 and u = v, or otherwise they ( is the pseudovariety of semigroups that satisfy the equation have se = e for all elements s and idempotents e). This step is the 1) the same prefix of length k − 1, algebraic analogue of synthesising an automaton for an LTT 2) the same suffix of length k − 1, language as a cascade of a scanner and an acceptor [1]. The 3) and the same number of occurrences, up to threshold t, next step in the proof uses the framework of categories for for all factors of length ≤ k, i.e. ♯(u,y) =t ♯(v,y) for obtaining the identities capturing the semidirect product [21], each word y ∈ A+ of length at most k. [22], [25]. In the case of wLRTT, this approach does not work; A language is locally threshold testable (or LTT for short) if because wLRTT does not form a variety of languages — t ∗ ∗ it is a unionof ≈k classes, for some k,t > 0. Locally threshold for instance, the language (xy) ab(xy) over the alphabet testable languages are precisely the class of languages defin- {a,b,x,y} is in wLRTT. 
Its inverse image under the mor- able in FO(A, +1) [1], [23]. phism a 7→ a,b 7→ b,c 7→ xy is the language c∗abc∗, Since the neighbour predicate N is definable using the over the alphabet {a,b,c}. As we have seen, this language successor relation in first-order logic, FO(A, min, max, N ) is not in wLRTT. Therefore, wLRTT is not closed under definable languages are a subset of LTT. But this inclusion is inverse image of morphisms, and hence not a variety of strict, as we have seen. languages. This necessitates a rubric, of the like of varieties r t We define a coarser equivalence ≈k by the counting factors and operations on varieties, to study the class wLRTT. only up to reverse. Let ♯r(w, v) denote the number of occur- rences of v or vr in w, i.e. the number of pairs (x, y), where Our results. It was already observed in [8] that one needs x, y ∈ A∗, such that w = xvy or w = xvry. to extend semigroups with an involution (also called ⋆- ◮ Definition 2. Let k,t > 0. Two words w, w′ ∈ A∗ are semigroups) — an involution on a semigroup S is an operation ⋆ ⋆ ⋆ ⋆ ⋆ r t ′ ′ ⋆ such that (a ) = a and (ab) = b a , for each a,b ∈ S ≈k-equivalent if |w| < k and w = w , or w, w both are of length at least k, and they have — to characterise wLRTT. The involution operation is a gen- eralisation of the reversal operation on words. An involutory 1) the same prefix of length k − 1, alphabet A is a finite alphabet A with a bijection † on it. The 2) the same suffix of length k − 1, and map † extends to words over A as (a ...a )† = a† ...a†, 3) ♯r(w, v) =t ♯r(w′, v) for each word v ∈ A+ of length 1 n n 1 where each a ∈ A. This map is an involution on the semi- at most k. i group A+. We lift the notions of recognisability and syntactic It is shown in [8] that a language is definable in FO(A, N ) semigroups to the case of languages over involutory alphabets. 
if it is closed under the reverse operation and is a union of Syntactic algebras for such languages are semigroups with an r t equivalence classes of ≈k for some k,t > 0. Such languages involution. A finite alphabet A can be seen as an involutory are called locally-reversible threshold testable languages. The alphabet with the identity function id as †; such alphabets class of languages accepted by FO(A, min, max, N ) are are called hermitian. This just means that A+ is equipped characterised in a similar manner. A language L is weakly with the reverse operation as involution. Syntactic algebras locally-reversible threshold testable, wLRTT for short, if it is for languages over hermitian alphabets are semigroups with r t a union of equivalence classes of ≈k for some k,t > 0. By involutions that are generated by their hermitian elements (a a straight-forward adaptation of the proofs in [8], [9] we can is hermitian if a⋆ = a). show that Assume (A, †) and (B,⋆) are involutory alphabets and h: A+ → B+ is a morphism. The morphism h is involutory if ◮ Theorem 1. A language is definable in the logic h(w†)= h(w)⋆. An involution variety of languages maps each FO(A, min, max, N ) if and only if it is weakly locally- involutory alphabet A = (A, †) to a class of languages over A reversible threshold testable. that is closed under involutions, Boolean operations, quotients Obtaining an algebraic characterisation for the class with respect to words and inverse images under involutory wLRTT is an interesting problem. The analogous character- morphisms. We establish an Eilenberg-type correspondence between involution varieties of languages and pseudovarieties study of pseudovarieties of finite involution semigroups or of finite involution semigroups — classes of finite involution results similar to ours. semigroups that are closed under finite direct products, sub-⋆- semigroups and quotients. Orgranisation of the paper. 
In Section II, we describe in- volution semigroups and extend the notions of recognisability If S = (S,⋆) and T = (T, †) are involution semigroups, a two-sided action of T on S is compatible with the involution by semigroups to the case of languages with an involution. In Section III, involutory semidirect products are defined, and we if (tst′)⋆ = t′†s⋆t†, where tst′ denotes the left action by introduce the notion of locally hermitian semidirect product. t ∈ T and right action by t′ ∈ T on the element s ∈ S. A compatible two-sided action of T on S is locally hermitian Then, it shown that a language is wLRTT if and only if it is recognised by the locally hermitian semidirect product if for all idempotents e ∈ T and elements s ∈ S it holds of an aperiodic commutative semigroup and a locally trivial that ese† = es⋆e†. Intuitively this means that within an idempotent context that looks the same on the outside from semigroup. We introduce the notion of involution varieties of either direction, one can reverse factors. Bilateral semidirect languages and pseudovarieties of involution semigroups, and products with locally hermitian actions are called locally prove an Eilenberg correspondence between them in Section hermitian semidirect products. If V and W are pseudovarieties IV. In Section V, we conclude with some avenues for further of involution semigroups, the locally hermitian product of V inquiry. W and is the pseudovariety generated by all locally hermitian II. LANGUAGES OVER INVOLUTORY ALPHABETS semidirect products of involution semigroups from V and W. It is shown that a language is wLRTT if and only if it is A. Recognisable languages recognised by a locally hermitian semidirect product of an ape- Let A be a finite alphabet. Then, A+ denotes the set of riodic commutative semigroup and a locally trivial semigroup. 
nonempty finite words over A, and A∗ denotes the set of all This result can also be stated in terms of the corresponding finite words over A including the empty word ǫ. Extending this pseudovarieties of involution semigroups Acom⋆ and L1⋆. notation, A≤k denotes the subset of A∗ consisting of words Therefore, we have an algebraic characterisation of the class of length at most k, including ǫ. FO(A, min, max, N ). A semigroup S is a set together with an associative binary operation. Unless specified, the semigroup operation is written as x · y, for semigroup elements x, y; but we omit notating it Related work. The FO and MSO logics using the predicates whenever possible. If the operation has an identity, then S is N and bet were introduced in [8], [9]. Formulas of these logics called a monoid. All semigroups and monoids we consider are that do not use the constants min and max, are self-dual, finite, except when stated otherwise. For a finite alphabet A, i.e., replacing each predicate by its dual, for instance N(x, y) the set A+ forms a semigroup under concatenation, while the by N(y, x), results in an equivalent formula. Therefore, set A∗ is a monoid with the empty word ǫ as the identity. The MSO(A, N ) and MSO(A, bet) recognise precisely the class set S′ ⊆ S forms a subsemigroup of S if S′ is closed under of reversible regular languages — a language is reversible if it the semigroup operation. is invariant under taking the reverse of words in it. Similarly, Let (T, +) be a semigroup. A semigroup morphism from S FO(A, bet) recognises all aperiodic reversible languages. In to T is a map h: S → T that satisfies the equation h(xy)= [8], the question of deciding membership in these classes was h(x)+ h(y), for all x, y ∈ S. If S and T are monoids, then h studied. For the logics MSO(A, bet), FO(A, bet), the exist- is a monoid morphism if additionally h maps the identity of S ing decidable characterisations of MSO(A, <) and FO(A, <) to the identity of T . 
A semigroup T divides the semigroup S if immediately yield an answer. For the case of FO(A, N ), a T is the image of a subsemigroup of S under some morphism. partial answer was given; it was shown that if a language L is Let A be a finite alphabet. A language L ⊆ A+ is recog- definable in FO(A, N ), then its syntactic semigroup satisfies nized by a semigroup S if there is a morphism h: A+ → S the identity ex⋆e⋆ = exe⋆, in addition to those defining the and a subset P ⊆ S such that L = h−1(P ). The set P is called class LTT. the accepting set of L. The class of languages recognised by A different but related between predicate (namely a(x, y), finite semigroups coincide with the class of regular languages for a ∈ A, is true if there is an a-labelled position between po- [10], p. 160. sitions x and y) was introduced in [11]–[13]. They characterise The above definitions are easily adapted to the case of lan- the expressive power of two-variable first-order logic with the ∗ 2 guages over A . The definitions are assumed in the subsequent order relation (FO (<)) enriched with the between predicates sections. a(x, y) for a ∈ A, and show an algebraic characterisation of the resulting family of languages. B. Languages over Involutory Alphabets Since the class of involution semigroups contains many A finite involutory alphabet A = (A, †) is a finite alphabet important classes of semigroups such as inverse semigroups, A with a bijection †: A → A such that (a†)† = a for each there is substantial literature on varieties of involution semi- letter a ∈ A. The function † is extended to words over A as groups from a semigroup theoretic perspective (see [6] for a † † † survey). However we are not aware of a language theoretic w = an ...a1 , (2) + †† † if w = a1 ··· an ∈ A . Clearly, w = w. Moreover, (uv) = Clearly, if a language is recognised by an involution semi- v†u†, for all words u, v ∈ A+. Hence, the operation †: A+ → group, then it is recognised by a semigroup as well. 
The other A+ is an involution: a function that is its own inverse. If L ⊆ direction also holds, as we show next. A+ is a language, then L† denotes the set {w† | w ∈ L}. Let S = (S, †) be an involution semigroup and let T If † is the identity function on A, then A is called a be a semigroup. By T op, we denote the opposite of T — hermitian alphabet. In that case, † is the reverse operation the semigroup with the same set of elements but reversed u 7→ ur, for u ∈ A+, and L† is the set Lr, for L ⊆ A+. multiplication, i.e., a ·op b = b · a, for all a,b ∈ T . Assume Unless specified, a finite alphabet is taken to be a hermitian there is a semigroup morphism from S to T . Let h† be the alphabet, i.e. with the reverse operation as the involution. map from S to T op, given by, for each s ∈ S, ◮ Definition 3. An involution semigroup (also called a ⋆- h†(s)= h(s†) . semigroup) S = (S,⋆) is a semigroup S extended with an ◮ Lemma 1. If h: S → T is a morphism, then h† : S → T op operation ⋆: S → S such that for all elements a,b of S, is also a morphism. ⋆ ⋆ ⋆ ⋆ ⋆ (a ) = a , and (a · b) = b · a . Proof. For x, y ∈ S, S We say that S is the semigroup reduct of . The semigroup h†(xy)= h((xy)†) A+ with the operation †: A+ → A+, defined in (2), forms h y†x† an involution semigroup, denoted as A +. Similarly, A∗ with = ( ) † † † forms a monoid with an involution, denoted as A ∗. The = h(y )h(x ) (product in T ) + involution semigroup A is not a free involution semigroup = h†(x)h†(y) (product in T op) . in general; it is only so when the relation † is irreflexive on † op A ( [14], p. 172). Hence h : S → T is a semigroup morphism. ◭ A ◮ Example 1. Let T = {a,b,ab,ba} be the semigroup with Let = (A, †) be an involutory alphabet. Assume that + the multiplication L ⊆ A is recognised by the semigroup S with the morphism h and the accepting set P . Lemma 1 gives that h† : A+ → Sop aa = a, bb = b, aba = bab = ba. is also a morphism; by definition, h†(w†) = h(w), for all w ∈ A+. 
Hence, h†(L†) = h(L) = P ; hence, the semigroup Let ⋆ be the map a⋆ = b. The map extends to an involution Sop recognises L† with the morphism h†. on T , Next, we show that (T × T op, flip : (x, y) 7→ (y, x)) is an b⋆ = (a⋆)⋆ = a , (ab)⋆ = b⋆a⋆ = ab , (ba)⋆ = a⋆b⋆ = ba . involution semigroup. It suffices to show ◮ op However, the map a† = a,b† = b is not a well-defined Lemma 2. flip is an involution on T × T . involution on T ; Proof. Clearly, for (a,b) ∈ T × T op † † † † † † † † flip (ba) = a b = ab 6= ba = aba = a b a = (aba) = (ba) . (a,b)flip = (a,b) . It is not possible to associate an involution with every Let (a,b), (c, d) ∈ T × T op. Then, semigroup; for instance, the semigroup of right zeros, where flip flip the multiplication follows the identity xy = y for any elements ((a,b)(c, d)) = (ac,db) x, y, does not admit an involution if it has more than one = (db,ac) element. However, it is possible to obtain a ⋆-semigroup, from = (d, c)(b,a) a given semigroup, that shares similar properties, as we shall = (c, d)flip(a,b)flip . see later. Let S = (S,⋆) and T = (T, †) be involution semigroups. Hence flip is an involution on T × T op. ◭ A morphism of involution semigroups, also called an involu- ◮ Proposition 1. If a language is recognised by a semi- tory morphism, from S to T , denoted as h: S → T , is a group, then it is recognised by a ⋆-semigroup as well. semigroup morphism h: S → T such that h(a⋆)= h(a)†, for all a ∈ S. Proof. Let A = (A, †) be an involutory alphabet. Assume that L is recognised by S using the morphism h: A+ → S. ◮ Definition 4. Let A = (A, †) be an involutory alphabet. By Lemma 1, h† : A+ → Sop is a morphism. By Lemma 2, The involution semigroup S = (S,⋆) recognises the language (S × Sop, flip) is an involution semigroup. Then, the product L ⊆ A+ if there is a morphism of involution semigroups map g : w 7→ (h(w),h†(w)), for w ∈ A+, is a morphism from h: A + → S and a subset P ⊆ S such that L = h−1(P ). A+ to S × Sop. Since ◮ Example 2. 
Let A be the involutory alphabet {a,b} with g(w†) = (h(w†),h†(w†)) = (h†(w),h(w)) = (g(w))flip , the map a⋆ = b. Let T = {a,b,ab,ba} be the semigroup from Example 1. Then, T accepts the language L = a+b+ with the g is a morphism of involution semigroups from A + to morphism h(a)= a,h(b)= b and the accepting set {ab}. S × Sop. Finally, the proof is completed by observing that L is recognised by the involution semigroup S × Sop with the The actions l and r are compatible if for all t,t′ ∈ T , and accepting set {(h(w),h†(w)) | w ∈ L}. ◭ s ∈ S, it is the case that Therefore, languages recognised by involution semigroups (ts)t′ = t(st′) . (7) are precisely the class of recognisable languages. A pair of compatible actions (l, r) is called a bilateral Let S = (S,⋆) be an involution semigroup. If S′ ⊆ S, then action of T on S. Given a bilateral action (l, r) of T on S′⋆ denotes the set {s⋆ | s ∈ S′}. A subset S′ of S forms S, the bilateral semidirect product (also called the two-sided a sub-⋆-semigroup of S if S′ is closed under the semigroup semidirect product) S ∗∗T is the semigroup with the elements operation and the involution, i.e., if S′2 ⊆ S′ and S′⋆ = S′. {(s,t) | s ∈ S,t ∈ T } and with the operation If h: S → T is an involution semigroup morphism, then the T image of h is a sub-⋆-semigroup of . (s1,t1)(s2,t2) = (s1t2 + t1s2,t1t2) . (8) An element s ∈ S is hermitian if it is its own involution, i.e., Let S = (S,⋆) and T = (T, ⋄) be involution semigroups. if s⋆ = s.A ⋆-semigroup is hermitian-generated if it is gener- Let l, r be a bilateral action of T on S. Then, the pair l, r ated by its hermitian elements using the semigroup operation ( ) ( ) is an involutory action of T on S if for all t,t′ T,s S, and the involution. For convenience, we shorten hermitian- ∈ ∈ generated involution semigroup to hermitian semigroup. For (st)⋆ = t⋄s⋆ . (9) every hermitian alphabet A , the involution semigroup A + is ◮ T hermitian. 
Images of hermitian semigroups under involution Lemma 3. Assume = (T, ⋄) has an involutory action S ′ semigroup morphisms are hermitian as well. on = (S,⋆). Then, for all t,t ∈ T,s ∈ S, ⋆ ⋄ ⋆ ⋄ ◮ Example 3. In the involution semigroup T = (T,⋆) (t1st2) = t2s t1 . in Example 1, the set of hermitian elements is {ab,ba}. Since Proof. By taking ⋆ on both sides in (9), we get they don’t generate all the elements of the semigroup, T is not ⋄ ⋆ ⋆ a hermitian semigroup. But ({ab,ba},⋆) is a sub-⋆-semigroup st = (t s ) . (10) of T that is hermitian. Exchanging t and t⋄, and s and s⋆ in (10), we obtain ∗ For recognising languages that are subsets of A , the notion s⋆t⋄ = (ts)⋆ . (11) of recognisability by ⋆-semigroups is insufficient; one needs recognisability by ⋆-monoids. The definitions are straightfor- Finally, ward adaptations of the ones presented above. ⋆ ⋆ (t1st2) = ((t1s)t2) by (7) ⋄ ⋆ III.INVOLUTORY SEMIDIRECT PRODUCTS AND THE = t2(t1s) by (9) CHARACTERISATION OF FO ⋄ ⋆ ⋄ = t2s t1 . by (11) In this section, we first define the involutory semidirect ◭ product of two involution semigroups. Then, a particular case, called locally hermitian semidirect product, is defined. ◮ Definition 5. Assume T = (T, ⋄) has an involutory It is then shown that wLRTT languages are precisely the action on S = (S,⋆). Then, their involutory semidirect ones recognised by locally hermitian products of aperiodic product S ∗∗ T is the semigroup S ∗∗ T with the involution commutative ⋆-semigroups and locally trivial ⋆-semigroups. †: (s,t) 7→ (s⋆,t⋄) . (12) A. Bilateral Semidirect Products of Involution Semigroups ◮ Lemma 4. S ∗∗ T is an involution semigroup. Let S and T be finite semigroups. We denote the semigroup operation of S additively (by +) and of T multiplicatively. Proof. It suffices to show that † is an involution. By definition, † However, we don’t assume that S or T is commutative. A left (s,t)† = (s⋆⋆,t⋄⋄) = (s,t) . 
action l of T on S is a map l : T × S → S satisfying the following conditions. Denote the image of (t,s) under l as ts, Next, for all s1,s2 ∈ S,t1,t2 ∈ T , ′ for t ∈ T,s ∈ S. Then, for all s ∈ S and t,t ∈ T , † † ((s1,t1)(s2,t2)) = (s1t2 + t1s2,t1t2) by (8) ′ ′ ⋆ ⋄ t(s + s )= ts + ts , and, (3) = ((s1t2 + t1s2) , (t1t2) ) by (12) ′ ′ ⋆ ⋆ ⋄ ⋄ (tt )s = t(t s) . (4) = ((t1s2) + (s1t2) ,t2t1) ⋆ ⋄ ⋄ ⋆ ⋄ ⋄ Right actions of T on S are defined analogously; a right = (s2t1 + t2s1,t2t1) ⋆ ⋄ ⋆ ⋄ action r of T on S is a map r : S × T → S such that for all = (s2,t2) (s1,t1) s ∈ S and t,t′ ∈ T , † † = (s2,t2) (s1,t1) . ′ ′ (s + s )t = st + s t , and, (5) Hence † is an involution. ◭ s(tt′) = (st)t′ , (6) Next, we introduce a particular case of involutory semidirect where st denotes the image of (s,t) under the map r. product. An element e in a semigroup is an idempotent if e · e = e. The right action is analogous. In the case of involution semigroups, if e is an idempotent, xyˆ · first(w) if z is ǫ then involution of e is also an idempotent. xyzˆ ⊗ w = xyzˆ otherwise, ◮ Definition 6. Let S = (S,⋆) and T = (T, ⋄) be ( involution semigroups. Let (l, r) be an involutory action of T Let S′ be the semigroup formed by the powerset of U with on S . The action is locally hermitian if for each idempotent set union as the semigroup operation. For each element I ⊆ U e ∈ T , and each element s ∈ S, of S′, let Ir = {wr | w ∈ I}. The operation I 7→ Ir, for ′ ′ S ′ ese⋄ = es⋆e⋄ . (13) I ∈ S , is an involution on the semigroup S . Let = (S′, r). The action of T on U, extends to the elements of S ′ In other words, in the case of a locally hermitian action, pointwise. The elements A · Aˆ · A are called the zeros for the the elements of the form ese⋄ are hermitian (observe that, by action. Let ∼ be a congruence on S ′, given by, for I,J ∈ S′, ⋄ ⋆ ⋄⋄ ⋆ ⋄ ⋆ ⋄ Lemma 3, (ese ) = e s e = es e ). • x ∈ I if and only if x ∈ J, for every nonzero x ∈ U, S T T Let and be involution semigroups. 
If the action of and S r r is locally hermitian action on , then the semidirect product • x or x is in I if and only if x or x is in J, for each S T ∗∗ is called locally hermitian. zero x ∈ U. B. Characterisation of FO with Neighbour Let S be the quotient of S ′ with respect to ∼. It is easy S A semigroup S is aperiodic if there is an n N such that to see that for each idempotent e ∈ T , and each I ∈ , ∈ r r r an+1 = an, for all elements a in the semigroup. A semigroup e ⊗ I ⊗ e = e ⊗ I ⊗ e . Hence the actions are locally S T is commutative if the semigroup operation is commutative, i.e., hermitian. Finally L is accepted by ∗∗ with the morphism x·y = y·x, for all elements x and y. A semigroup S is locally h: a 7→ ({aˆ},a). trivial if ese = e, for each element s and each idempotent e in ◮ Theorem 2. Each wLRTT language is recognised by a S. The class of languages LTT has several characterisations in locally hermitian semidirect product of an aperiodic commuta- terms of various products of semigroups (see [1] for a detailed tive ⋆-semigroup and a locally trivial ⋆-semigroup. Conversely, discussion); the one relevant to us is, if L is recognised by a locally hermitian semidirect product S T S T ◮ Proposition 2. A language is in LTT if and only if it is ∗∗ , where is aperiodic and commutative and is recognised by a bilateral semidirect product of an aperiodic locally trivial, then L is wLRTT. commutative semigroup and a locally trivial semigroup. Proof. Assume L ⊆ A+, for an involutory alphabet A = Using the construction outlined in the proof of Proposition (A, id), is a wLRTT language. To prove the first claim, we S T S 1, it can be shown that a language is in LTT if and only define ⋆-semigroups and such that is aperiodic and T if it is recognised by an involutory semidirect product of commutative and is locally trivial, and show that their an aperiodic commutative involution semigroup and a locally locally hermitian semidirect product recognises L. 
We let r m trivial involution semigroup. k,m > 0 be such that L is a union of ≈2k+1-classes. Since r t r t Next, we characterise the class wLRTT. First, we give an the equivalence ≈d+1 refines the equivalence ≈d, for t,d > 0, example. there is no loss of generality. Let ⊙ be the operation on the set A≤2k \{ǫ} defined as ◮ Example 4. Consider the language L = (abc)+ over the alphabet A = {a,b,c}. We show that L is defined by a locally S T ′ ′ hermitian product of two ⋆-semigroups and . ′ t · t if |tt | < 2k, ≤2 t t Let T be the semigroup with the set A \{ǫ} and the ⊙ = ′ ′ (pref k(t · t ) · suff k(t · t ) otherwise, operation ⊙ given by, for x, y ∈ T where pref () and suff () denote the k-length prefix and x · y if |xy|≤ 2 k k x ⊙ y = suffix respectively. It is not difficult to see that the operation (first(xy) · last(xy) otherwise, ⊙ is associative. Hence, T = (A≤2k \{ǫ}, ⊙) is a semigroup. It is straight-forward to check that u v r vr ur and where first(w) and last(w) denote the first and last letter of a ( ⊙ ) = ⊙ r r T word w. Reverse operation is an involution on the semigroup (u ) = u, for the reverse operation r. Let = (T, r) be the involution semigroup with the reverse operation as the T . Let T = (T, r). Idempotents of T are precisely those involution. The idempotents of T are precisely the words of words of length 2. length k. Also, e t e e, for words t T and e A2k. Let Aˆ = {a,ˆ ˆb, cˆ}. Let U be the set A≤1 · Aˆ· A≤1. Elements 2 ⊙ ⊙ = ∈ ∈ Hence, T is locally trivial. of T act on words in U in the following way: for w ∈ T and xyzˆ ∈ U, where x, z ∈ A≤1 and yˆ ∈ Aˆ, we define the right Next, we set up some notation in preparation for defining S ˆ action ⊗ as . Let A denote the set {aˆ | a ∈ A} of anchored letters. A k-anchored word wˆ is a word from the set A≤k · Aˆ · A≤k. last(w) · yzˆ if x is ǫ Let W denote the set of all k-anchored words. We denote w ⊗ xyzˆ = k (xyzˆ otherwise, anchored words by w,ˆ uˆ etc. Let wˆ = uavˆ , where u, v ∈ A∗ and aˆ ∈ Aˆ. 
We write $l(\hat{w})$ and $r(\hat{w})$ for the distance of the anchored letter from the left end and from the right end respectively; hence $l(\hat{w}) = |u|$ and $r(\hat{w}) = |v|$. Clearly, $0 \le l(\hat{w}), r(\hat{w}) \le k$ for all $\hat{w} \in W_k$.

The semigroup $T$ acts on the set $W_k$, on the left as well as on the right, in the following manner. For each letter $a \in A$ and $\hat{w} \in W_k$, the left action $a \otimes \hat{w} \in W_k$ of $a$ on $\hat{w}$ is given by

\[
a \otimes \hat{w} =
\begin{cases}
\hat{w} & \text{if } l(\hat{w}) = k, \\
a \cdot \hat{w} & \text{otherwise.}
\end{cases}
\tag{14}
\]

Similarly, the right action $\hat{w} \otimes a$ appends $a$ on the right if there are fewer than $k$ positions to the right of the anchored position; otherwise it leaves $\hat{w}$ intact. We extend these actions to all the elements of $T$ by letting, for $u \in T$, $a \in A$ and $\hat{w} \in W_k$,

\[
(u \odot a) \otimes \hat{w} = u \otimes (a \otimes \hat{w}), \quad \text{and} \quad \hat{w} \otimes (a \odot u) = (\hat{w} \otimes a) \otimes u .
\tag{15}
\]

Clearly the left and right actions are compatible with each other, i.e.,

\[
(u \otimes \hat{w}) \otimes v = u \otimes (\hat{w} \otimes v)
\tag{16}
\]

for all $\hat{w} \in W_k$ and $u, v \in T$. Also, the actions are involutory with respect to the involution on $T$, since

\[
(u \otimes \hat{w})^{r} = \hat{w}^{r} \otimes u^{r} .
\tag{17}
\]

An anchored word $\hat{w}$ is a zero for the action $\otimes$ if $a \otimes \hat{w} = \hat{w} \otimes a = \hat{w}$ for all $a \in A$. The zero elements of $W_k$ are precisely the anchored words whose anchored letter is at distance $k$ from each end.

Next, we define the semigroup $\mathbf{S}$. A multiset in the universe $U$, where $U$ is a set, is a collection of elements from $U$ that allows for multiple instances. For instance, the multiset $\{3, 3, 4\}$, in the universe $\mathbb{N}$, contains two occurrences of 3 and one occurrence of 4. Given a multiset $I$ and an element $x \in I$, the multiplicity of $x$ in $I$, denoted $m_I(x)$, is the number of times $x$ occurs in $I$. The support of $I$, denoted $\mathrm{Supp}(I)$, is the set of all elements occurring in $I$. Given two multisets $I$ and $J$, the union of $I$ and $J$, denoted $I \cup J$, is the multiset $K$ with support $\mathrm{Supp}(I) \cup \mathrm{Supp}(J)$ and multiplicity $m_K(x) = m_I(x) + m_J(x)$ for all $x \in \mathrm{Supp}(K)$. Let $M(U)$ denote the set of all non-empty multisets in the universe $U$. The set $M(U)$ is a semigroup with $\cup$ as the semigroup operation.

Consider the semigroup $M(W_k)$. Clearly, $M(W_k)$ is infinite. For each multiset $I$ in $M(W_k)$, we let $I^{r}$ denote the multiset with support $\{x^{r} \mid x \in I\}$ and multiplicity $m_{I^{r}}(x^{r}) = m_I(x)$ for each $x \in I$. The reverse operation is an involution on $M(W_k)$.

We define the following equivalence relation $\equiv^{m}$ on the multisets in $M(W_k)$: for $I, J$ in $M(W_k)$, $I \equiv^{m} J$ if
1) $m_I(x) =^{m} m_J(x)$, for each non-zero element $x$ of $W_k$, and
2) $m_I(x) + m_I(x^{r}) =^{m} m_J(x) + m_J(x^{r})$, for each zero element $x$ of $W_k$.

The equivalence relation $\equiv^{m}$ is a congruence relation on the involution semigroup $M(W_k)$: if $I \equiv^{m} J$, then $I^{r} \equiv^{m} J^{r}$, and if $I \equiv^{m} J$ and $K \equiv^{m} L$, then $I \cup K \equiv^{m} J \cup L$, for all $I, J, K, L \in M(W_k)$.

We let $S$ be the quotient $M(W_k)/{\equiv^{m}}$. Clearly, $S$ is a finite semigroup. We denote the semigroup operation on $S$ additively. The ⋆-semigroup $\mathbf{S} = (S, r)$ is aperiodic and commutative.

We can extend the involutory action $\otimes$ of $\mathbf{T}$ on $W_k$ to the ⋆-semigroup $\mathbf{S}$ pointwise, i.e., we let the left action $u \otimes I$, for $u \in T$ and $I \in S$, be the multiset of anchored words with support $\{u \otimes \hat{w} \mid \hat{w} \in I\}$ and multiplicity

\[
m_{u \otimes I}(\hat{v}) = \sum_{\hat{w} \in I,\; u \otimes \hat{w} = \hat{v}} m_I(\hat{w}),
\tag{18}
\]

for $\hat{v} \in W_k$. The right action $I \otimes u$ is defined dually. By Equations 15, 16 and 17, the actions of $\mathbf{T}$ on $\mathbf{S}$ are involutory. It is easily observed that, by the definition of the action $\otimes$ and the congruence $\equiv^{m}$, for each idempotent $e \in T$ and each multiset $I \in S$,

\[
e \otimes I \otimes e^{r} = e \otimes I^{r} \otimes e^{r} .
\tag{19}
\]

Hence the action of $\mathbf{T}$ on $\mathbf{S}$ is locally hermitian. For convenience, below we identify the anchored word $\hat{w}$ with the singleton multiset $\{\hat{w}\}$. Also, we omit writing the action $\otimes$ explicitly.

Let $\mathbf{S} ** \mathbf{T}$ be the locally hermitian semidirect product of $\mathbf{S}$ and $\mathbf{T}$ with the action $\otimes$. We define the ⋆-morphism $h$ from $\mathbf{A}^{+}$ to $\mathbf{S} ** \mathbf{T}$ as

\[
h \colon a \mapsto (\hat{a}, a) \in S \times T .
\]

It is easy to verify that $h$ is a morphism of involution semigroups.

Next, we show that the morphism $h$ recognises the language $L$; it suffices to show that if $w \approx^{r\,m}_{2k+1} w'$, then $h(w) = h(w')$. Assume $w = a_1 \cdots a_n \in A^{+}$ and $w' = a'_1 \cdots a'_\ell$ are two words such that $w \approx^{r\,m}_{2k+1} w'$. If $|w| < 2k+1$ or $|w'| < 2k+1$, then clearly $w = w'$ and $h(w) = h(w')$. Therefore, assume that both words have length at least $2k+1$. The image of $w$ under $h$ is of the form

\[
\begin{aligned}
h(w) &= (\hat{a}_1, a_1) \cdots (\hat{a}_n, a_n) \\
&= (\hat{a}_1 a_2 \cdots a_n + a_1 \hat{a}_2 \cdots a_n + \cdots + a_1 \cdots a_{n-1} \hat{a}_n,\; a_1 \cdots a_n) .
\end{aligned}
\tag{20}
\]

Observe that if $u, v \in T$ are such that $|u|, |v| \ge k$, then $u \otimes \hat{w} \otimes v = \mathrm{suff}_k(u) \otimes \hat{w} \otimes \mathrm{pref}_k(v)$ and $u \odot v = \mathrm{pref}_k(u) \cdot \mathrm{suff}_k(v)$, for each $\hat{w} \in S$. Since $n \ge 2k+1$, we derive

\[
h(w) = (\hat{a}_1 a_2 \cdots a_{k+1} + a_1 \hat{a}_2 \cdots a_{k+2} + \cdots + a_{n-k} \cdots a_{n-1} \hat{a}_n,\; a_1 \cdots a_k a_{n-k+1} \cdots a_n) .
\]

Similarly, for $w' = a'_1 \cdots a'_\ell$ the image is of the form

\[
h(w') = (\hat{a}'_1 a'_2 \cdots a'_{k+1} + a'_1 \hat{a}'_2 \cdots a'_{k+2} + \cdots + a'_{\ell-k} \cdots a'_{\ell-1} \hat{a}'_\ell,\; a'_1 \cdots a'_k a'_{\ell-k+1} \cdots a'_\ell) .
\]

Since $w$ and $w'$ have a common prefix and suffix of length $2k$, their images in $T$ are identical. Hence, all that remains to conclude the claim is to show that the summations

\[
\begin{aligned}
L &= \hat{a}_1 a_2 \cdots a_{k+1} + a_1 \hat{a}_2 \cdots a_{k+2} + \cdots + a_{n-k} \cdots a_{n-1} \hat{a}_n \\
R &= \hat{a}'_1 a'_2 \cdots a'_{k+1} + a'_1 \hat{a}'_2 \cdots a'_{k+2} + \cdots + a'_{\ell-k} \cdots a'_{\ell-1} \hat{a}'_\ell
\end{aligned}
\]

have the same value in $S$. Observe that, since $S$ is aperiodic and commutative, we can freely commute the terms in $L$ and $R$. The zero elements in the summation $L$ are precisely the terms of the form

\[
a_{i-k} \cdots a_{i-1} \hat{a}_i a_{i+1} \cdots a_{i+k} ,
\]

where $k$ […] $0$, such that $s^{t} = s^{t+1}$ for all elements $s \in S$.

Let $w, w' \in \mathbf{A}^{+}$ be two words such that $w \approx^{r\,t}_{4k+1} w'$. We claim that $w \in L$ if and only if $w' \in L$; proving the claim implies the lemma. We assume that $w$ and $w'$ are of length at least $4k+1$; otherwise $w = w'$ and the claim is immediate.

Let $w = a_1 \cdots a_n \in \mathbf{A}^{+}$ be a word. Let $h(a_i) = (s_i, t_i)$. The image of $w$ under $h$ is

\[
\begin{aligned}
h(w) &= (s_1, t_1) \cdots (s_n, t_n) \\
&= (s_1 t_2 \cdots t_n + t_1 s_2 t_3 \cdots t_n + \cdots + t_1 \cdots t_{n-1} s_n,\; t_1 \cdots t_n) .
\end{aligned}
\tag{23}
\]

Similarly, let $w' = a'_1 \cdots a'_m \in \mathbf{A}^{+}$. Assuming that $h(a'_i) = (s'_i, t'_i)$, the image of $w'$ under $h$ is

\[
h(w') = (s'_1 t'_2 \cdots t'_m + t'_1 s'_2 t'_3 \cdots t'_m + \cdots + t'_1 \cdots t'_{m-1} s'_m,\; t'_1 \cdots t'_m) .
\tag{24}
\]

Since $m, n \ge 4k+1$ and $w \approx^{r\,t}_{4k+1} w'$, by definition the words $w$ and $w'$ have the same prefix and suffix of length $4k$. Using Equation 22, we conclude that $t_1 \cdots t_n = t'_1 \cdots t'_m$. Therefore, it only remains to show that the summations

\[
\begin{aligned}
L &= s_1 t_2 \cdots t_n + t_1 s_2 t_3 \cdots t_n + \cdots + t_1 \cdots t_{n-1} s_n \\
R &= s'_1 t'_2 \cdots t'_m + t'_1 s'_2 t'_3 \cdots t'_m + \cdots + t'_1 \cdots t'_{m-1} s'_m
\end{aligned}
\]

[…]

\[
\sum_{i=1}^{2k-1} L_i \quad \text{and} \quad \sum_{i=1}^{2k-1} R_i
\]

are identical. Similarly, since the words have the same suffix of length $4k$, the partial sums

\[
\sum_{i=n-2k+1}^{n} L_i \quad \text{and} \quad \sum_{i=m-2k+1}^{m} R_i
\]

are also identical. Thus, all that remains is to show the equality of the following partial sums:

\[
L_f = \sum_{i=2k}^{n-2k} L_i \quad \text{and} \quad R_f = \sum_{i=2k}^{m-2k} R_i .
\]

The terms occurring in $L_f$ and $R_f$ are called special. Let $x = t_1 \cdots t_k = t'_1 \cdots t'_k$ and $y = t_{n-k+1} \cdots t_n = t'_{m-k+1} \cdots t'_m$. Observe that, for each special term $L_i$,

\[
L_i = x \cdot (t_{i-k} \cdots t_{i-1})\, s_i\, (t_{i+1} \cdots t_{i+1+k}) \cdot y .
\]

Similarly, for each special term $R_i$,

\[
R_i = x \cdot (t'_{i-k} \cdots t'_{i-1})\, s'_i\, (t'_{i+1} \cdots t'_{i+1+k}) \cdot y .
\]

Before proceeding, we make an observation about special terms. Let $e$ and $e^{\diamond}$ be two idempotents in $T$. Let $p, q, u, v \in T^{+}$ be such that $|p|, |q|, |u|, |v| \ge k$, and let $s \in S$; then

\[
\begin{aligned}
pqsuv &= p\,e\,(qsu)\,e^{\diamond}\,v && \text{(using Equation 22)} && (27) \\
&= p\,e\,(qsu)^{\star}\,e^{\diamond}\,v && (\because \text{the actions are locally hermitian}) && (28) \\
&= p\,e\,u^{\diamond} s^{\star} q^{\diamond}\,e^{\diamond}\,v && (\because \text{the actions are involutory}) && (29) \\
&= p\,u^{\diamond} s^{\star} q^{\diamond}\,v . && \text{(using Equation 22)} && (30)
\end{aligned}
\]

Applying the above observation and Equation 21 to the term $L_i$ implies that

\[
\begin{aligned}
x \cdot (t_{i-k} \cdots t_{i-1})\, s_i\, (t_{i+1} \cdots t_{i+1+k}) \cdot y
&= x \cdot (t^{\diamond}_{i+1+k} \cdots t^{\diamond}_{i+1})\, s_i^{\star}\, (t^{\diamond}_{i-1} \cdots t^{\diamond}_{i-k}) \cdot y \\
&= x \cdot (t_{i+1+k} \cdots t_{i+1})\, s_i\, (t_{i-1} \cdots t_{i-k}) \cdot y .
\end{aligned}
\tag{31}
\]

Similarly, for $R_i$,

\[
\begin{aligned}
x \cdot (t'_{i-k} \cdots t'_{i-1})\, s'_i\, (t'_{i+1} \cdots t'_{i+1+k}) \cdot y
&= x \cdot (t'^{\diamond}_{i+1+k} \cdots t'^{\diamond}_{i+1})\, {s'_i}^{\star}\, (t'^{\diamond}_{i-1} \cdots t'^{\diamond}_{i-k}) \cdot y \\
&= x \cdot (t'_{i+1+k} \cdots t'_{i+1})\, s'_i\, (t'_{i-1} \cdots t'_{i-k}) \cdot y .
\end{aligned}
\tag{32}
\]

We are ready to conclude the claim $L_f = R_f$. Factors of $w$ of length $2k+1$, centred at positions $i$ for $2k \le i < n - 2k$, are called special. Similarly, factors of $w'$ of length $2k+1$, centred at positions $i$ for $2k \le i$ […]

[…] finite/infinite words, finite trees etc. can be seen as instances of this unified result. In this framework, the involution operation is a monad that satisfies a certain distributive law. One advantage of this approach is that the variety theorem for involutory languages can be deduced from the general case. But it also requires the significantly more complex language of categories. For the sake of simplicity, we do the variety theorem in the usual way.

A. Syntactic Involution Semigroups

The syntactic congruence of a language $L \subseteq A^{+}$, denoted $\sim_L$, is the congruence relation on $A^{+}$ defined as

\[
x \sim_L y \;:=\; uxv \in L \text{ if and only if } uyv \in L, \text{ for all words } u, v \in A^{*} .
\tag{33}
\]

The quotient semigroup $A^{+}/{\sim_L}$, denoted $S(L)$, is called the syntactic semigroup of $L$. Let $[u]_L$ denote the equivalence class of $u \in A^{+}$ under the congruence $\sim_L$. Then the surjective morphism $\varphi_L \colon u \mapsto [u]_L$, for $u \in A^{+}$, from $A^{+}$ to $S(L)$, recognises $L$ with the accepting set $\varphi_L(L)$. The morphism $\varphi_L$ is called the syntactic morphism of $L$.

It is possible to lift the notion of syntactic semigroup to the case of recognition by involution semigroups.

◮ Definition 7. Let $\mathbf{A} = (A, \dagger)$ be an involutory alphabet. The syntactic ⋆-congruence of $L \subseteq \mathbf{A}^{+}$ is the relation $\approx_L$ on $\mathbf{A}^{+}$ given by

\[
x \approx_L y \;:=\; x \sim_L y \text{ and } x \sim_{L^{\dagger}} y .
\]

Since $\approx_L$ is an intersection of two congruence relations on $A^{+}$, it is a congruence on $A^{+}$. Hence $A^{+}/{\approx_L}$ is a semigroup. Let $[[w]]_L$ denote the equivalence class of $w \in A^{+}$ under the congruence $\approx_L$. Firstly, we observe