An algebraic theory of real-time formal languages Catalin Dima

To cite this version:

Catalin Dima. An algebraic theory of real-time formal languages. Modeling and Simulation. Univer- sité Joseph-Fourier - Grenoble I, 2001. English. ￿tel-00004672￿

HAL Id: tel-00004672 https://tel.archives-ouvertes.fr/tel-00004672 Submitted on 16 Feb 2004

HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. UNIVERSITE´ JOSEPH FOURIER -GRENOBLE 1 SCIENCES ET GEOGRAPHIE

THESE`

pour obtenir le grade de DOCTEUR de l’UNIVERSITE´ JOSEPH FOURIER Specialit´ e:´ Informatique

present´ ee´ et soutenue publiquement par M. Cat˘ alin˘ DIMA le 11 decembre´ 2001

THEORIE´ ALGEBRIQUE´ DES LANGAGES FORMELS TEMPS REEL´

Directeurs de these:` Prof. Eugene` Asarin, Dr. Oded Maler

Composition du jury : Jean-Claude Fernandez President´ Paul Gastin Rapporteur Nils Klarlund Rapporteur P. S. Thiagarajan Examinateur Pascal Weil Examinateur Eugene` Asarin Directeur de These` Oded Maler Directeur de These`

These´ preparee´ au sein du Laboratoire Verimag´

Lui Gabi si Iuliei, cu drag 4 Remerciements

Je remercie d’abord a` Oded Maler et Eugene` Asarin pour m’avoir donne´ la chance de finir mes etudes´ sous leur direction et soutenir ma these` aV` erimag.´ Eugene` a patiemment resist´ e´ aux tous mes essais echou´ es´ et ses critiques m’ont fait apprendre un nouveau sens de la recherche. Je remercie a` Joseph Sifakis pour m’avoir accueili pendant plus d’une annee´ au sein du labora- toire Verimag,´ en me donnant ainsi la possibilite´ de faire la recherche dans un milieu effervescent, entretenu par des chercheurs de haute qualite.´ Je remercie aussi a` Alain Girault et aux membres du groupe BIP de l’INRIA Rhone-Alpes,ˆ ou` j’ai et´ e´ accueilli pendant une annee´ en 2000. C’est eux qui on fait mon accomodation plus facile. Graceˆ a eux, j’ai pu poursuivre mes recherches pour la these,` en parallele` avec mon travail dans le cadre du projet TOLERE.` Je remercie a` Gheorghe S¸tefanescu˘ qui m’a toujours pousse´ a` contacter diverses groupes de recherce et a` qui je dois en fait mon arrivee´ en France. Many thanks to the UNU/IIST, in particularly to Prof. Zhou Chaochen. It was there, during my fellowship at UNU/IIST from Febrary to August 1998, that my interest for the theory of real-time systems emerged. I also thank the TCS group at TIFR Mumbai for giving me the means to visit from September to December 1998. Xu Qiwen and Dang Van Hung, at UNU/IIST Macau, and Paritosh Pandya at TIFR Mumbai have helped me a lot in the early stages of the research for the thesis. Merci a` Liana et Marius pour leur chaleureuse amitie.´ Merci a` Yasmina et Moez, qui sont des bons copains de bureau, et a` Ana et Gerardo qui sont des bons copains tout court. Merci finalement a` tous les chercheurs du Verimag´ pour cet environnement de bonne qualite´ qu’ils entretiennent. 6

Table of Contents

1. Introduction 11

2. Signals and their basic properties 19 2.1 Basic notions...... 20 2.1.1 Coproduct ...... 20 2.1.2 Kleene algebras ...... 21 2.2 Signals ...... 22 2.2.1 Timed languages: basic properties ...... 24 2.3 Timed words ...... 24 2.3.1 Relating the monoids of signals and of timed words ...... 25 2.4 Timed regular languages defined by inverse morphisms ...... 26

2.4.1 Essentially untimed regular languages ...... 27

Sig

2.4.2 Syntactic monoids on are not interesting ...... 28

3. Real-time automata 33 3.1 Real-time automata and their regular expressions ...... 34 3.1.1 Real-time automata defined ...... 34 3.1.2 Regular expressions and the Kleene theorem ...... 35 3.1.3 The problem of complementation of real-time automata ...... 38 3.2 The Kleene algebra of sets of real numbers ...... 40 3.2.1 Normal forms ...... 41 3.2.2 A normal form theorem ...... 43 3.2.3 Matrices of normal forms ...... 45 3.3 Determinization and complementation of RTA ...... 46 3.4 The Pumping Lemma and expressiveness issues ...... 50 3.5 Stuttering-free concatenation ...... 51

3.5.1 Syntactic monoids for stuttering-free concatenation and real-time automata . . 52

4. Timed automata 55 4.1 Clocks and clock constraints ...... 55 4.2 Timed automata and their clock valuation semantics ...... 56 4.2.1 A Kleene theorem with indexed concatenation ...... 61 8 Table of Contents

4.3 Reset time semantics for timed automata ...... 65

5. Timed regular expressions 69 5.1 Basic properties of timed regular expressions ...... 70 5.1.1 Timed regular expressions without brackets ...... 70 5.2 Undecidability of the language emptiness problem for extended timed regular ex- pressions ...... 72 5.3 Relating timed regular expressions and timed automata ...... 75 5.4 Colored parentheses: basic ideas and problems ...... 76 5.4.1 Changing the concatenation ...... 77

5.4.2 The “overlapping” concatenation for timed automata ...... 79

6. Matrices of signals 83 n 6.1 n-dominoes and -signals ...... 84

6.1.1 n-dominoes over a one-letter alphabet ...... 85

6.2 Operations on n-dominoes ...... 86 6.2.1 Projection ...... 86 6.2.2 Juxtaposition ...... 88 6.2.3 Properties of juxtaposition ...... 90 6.2.4 Concatenation ...... 93

6.3 n-domino languages ...... 96 6.4 Regminoes, regsignals, and regular expressions over them ...... 97

6.4.1 Projection and juxtaposition on n-regsignals ...... 98

n n

6.4.2 -domino regular expressions and -signal regular expressions ...... 101 n

6.5 -signal regular expressions and timed automata ...... 102 n

6.6 The emptiness problem for -signal regular expressions is undecidable ...... 105

7. n-words and their automata 109

7.1 n-words ...... 110

7.2 n-automata ...... 112

7.2.1 The emptiness problem for n-automata...... 114 n 7.2.2 -transitions in -automata ...... 116

7.2.3 Basic operations with n-automata ...... 118

7.2.4 Relationship with n-regwords ...... 120 7.3 Non-elasticity ...... 121

7.4 The non-elastic star closure theorem ...... 125

8. Representing timing information with n-words 151 8.1 Difference bound matrices ...... 158 8.2 Regions ...... 164 8.2.1 Juxtaposition and concatenation on regions ...... 166

Table of Contents 9 n 8.3 Representing DBMs with the aid of n-words and -relations ...... 169

8.3.1 n-relations ...... 170

8.3.2 Operations on n-relations ...... 171

8.3.3 n-word representations ...... 175

8.3.4 Operations on n-word representations...... 178

8.4 n-region automata ...... 181

8.4.1 Basic closure properties for n-region automaton ...... 182 n 8.4.2 Non-elasticity for -DBMs ...... 186

8.4.3 Closure under concatenation and star ...... 188

9. Applications 197 n 9.1 Decomposition and recomposition of -signal regular expressions ...... 198

9.2 Shuffled n-words ...... 200 9.2.1 Projection on shuffled words ...... 201 9.2.2 Juxtaposition on shuffled words ...... 202

9.2.3 Concatenation and star on shuffled words ...... 205 n 9.2.4 A method for checking whether the semantics of a -signal regular expres-

sion is empty ...... 206 n

9.3 Checking emptiness of timed automata with -signal regular expressions ...... 207

10. Conclusions 211

References 215

Index 221

Glossary 223 10 Table of Contents 1. Introduction

Formal methods make themselves increasingly needed in a wide range of areas of computer sci- ence, from hardware specification and verification to the design and validation of computer sys- tems. They are especially needed when critical properties of systems have to be insured. Within formal methods, the two main directions for producing evidence of the correct design of a system are the model checking approach and the theorem proving approach. Within the model checking approach, systems are usually modeled as automata (very frequently with a finite state space) and properties are themselves specified in a descriptive language, such as logic or process algebra. For a large subclass of safety properties, the model-checking problem can be reformulated as the problem of checking whether the language of an automaton is empty. Two basic features a specification language needs are sequentiality and parallelism, which in the automata model translate to concatenation, resp. intersection of automata. Whereas sequentiality interacts optimally with emptiness checking, parallelism brings in the well-known “state space explosion problem”. Sequentiality is well studied and understood, while parallelism still raises problems at both theoretical and practical level. Regular expressions [Kle56] are among the most basic specification language. They model, however, only the sequential structure of systems. Still regular expressions are able to represent also parallel structure, due to the intersection construction and the celebrated Kleene theorem.

Timed systems and their automata model: timed automata

Timed systems (or real-time systems) are computer systems in which the components interact continuously with one another and with the environment, in order to provide a certain service in which time plays an important role.ˆ This roleˆ might be bounded response to some stimuli, limited duration of execution of tasks, and so on. The now classical automata model for timed systems is the timed automata model [AD94]. Timed automata are finite automata enhanced with the possibility to record time passage, by means of real-valued clocks. Clocks evolve synchronously at rate 1, and transitions are taken when some simple arithmetic conditions on the clocks are met, and some transitions might reset some clocks to 0.

A wealth of algorithms ([Yov98, LPWY95] give surveys) and dedicated tools [BDM 98, LPY97] are now available for model-checking with timed automata. The main problem which lim- its the efficiency of any algorithm for model checking with timed automata is that the emptiness 12 1. Introduction problem for timed automata, though a decidable problem, has a very high complexity (PSPACE- complete), hence being even harder than model-checking for untimed systems. On the specification side, several process algebras with time have been proposed in the be- ginning of the 90’s [WY91, NSY93, BB91]. The semantics of these algebras rely upon timed automata. Curiously, the search for regular expressions that allow specification of timing behaviors suc- ceeds the concern for process algebras (though, in the untimed case, it is regular expressions that have preceded and issued process algebras [Mil80]). Only recently there have been issued several results for timed automata [ACM97, BP99], or for subclasses [Dim99b] or superclasses [BP01] of timed automata. Timed regular expressions [ACM97] are a very convenient specification language for timed systems. They are regular expressions enhanced with the possibility to express intervals between two moments during the computation, by the use of interval-labeled parentheses. A left parenthesis

corresponds to resetting a clock and a right parentheses, labeled with an interval I , corresponds to

checking whether the clock value is in the interval I . In spite of their elegance in use, timed regular expressions bear some expressiveness problems: intersection and renaming are essential in proving the reverse implication of the Kleene theorem for timed regular expressions and timed automata.

Subject and contributions of the thesis

In our thesis we study the relationship between timed automata and timed regular expressions. In the first part, we study the simpler case of timed regular expressions in which we bound only state duration. The automata associated with this class behave like finite automata, most notably being closed under negation and algebraically definable via inverse monoid morphisms. In the sequel we try to expand the technique developed in the first part for the whole class of timed automata. We start from the consideration that the parallel composition operation on automata destroys the sequential structure. Our idea is to drop the intersection operator from timed regular expressions by using colored parentheses, in which each color corresponds to one clock. The feature brought in by this idea is that the structuring of the specification would be preserved to a certain extent. Also renaming is no longer necessary. However this idea brings in some difficulties as well, mainly a different view of sequentialization. In our calculus, an atomic regular expression contains parentheses of different colors. If we

apply to such an atom a “color filter” which retains only parentheses of a certain color and deletes

E hE i E E E

I

the other colors, we would get a timed regular expression of the form . Here , and

E I

are untimed (i.e. nonparenthesized) regular expressions and is some interval. The semantics of such an atom consists of signals in which a number of points have been distinguished: two points per each color, one “startpoint” for resetting the clock associated with the color, and one “endpoint” for checking the value of the clock. 1. Introduction 13

This special form of atoms has nevertheless a huge expressive power in combination with the concatenation operation. This is a partial operation which allows two signals with distinguished points to be concatenated iff the distinguished endpoint for each color in the first signal matches the distinguished startpoint for the same color in the second signal. In the second part of the thesis we study the algebraic structure of signals with distinguished points and the regular expressions with colored parentheses that represent sets of such signals. We prove that the emptiness problem is undecidable for regular expressions with colored parentheses, the problem lying in their untimed structure.

We then study this untimed structure by associating, for regular expressions with n colors, a

n n class of finite automata with accepting sets, which we call -automata. The idea is to have two accepting sets for each color: one for the startpoint associated with the color and one for

endpoint for that color. n

We show that the class of -automata is closed under union, intersection, concatenation and n shuffle. The central theorem of this thesis is then that, under mild assumptions, -automata are also closed under star. On the other hand, we show that these automata can be used to represent timing information in the regular expressions. In other words, they can be used for representing constraints over the real domain. The idea is that each run in an automaton represents a clock region, in the sense of Alur and Dill [AD94]. We also show that the mild assumptions necessary for star closure are satisfied when model- ing timed automata. As a consequence, we provide a method for checking whether the language denoted by a given regular expression with colored parentheses is empty. As an auxiliary result, our technique allows the computation of reachability relations defined by timed automata. These are the relations on clock values defined by the behaviors of timed automata, such as starting from one state and reaching another. The computation of such relations is useful in verification, since the language accepted by a timed automaton is empty iff the reachability relation defined by initial and final states is the empty relation. Summarizing, the main contributions of our thesis are the following:

The presentation of a new class of regular expressions that generalize the timed regular expres- sions of [ACM97]. Our regular expressions do not need renaming or intersection and are more

expressive than timed automata.

n The introduction of a new class of finite automata with accepting states, automata that cor- respond to the atomic regular expressions that we utilize. The emptiness problem for these au-

tomata is decidable, but NP-complete.

n The translation of regular expressions into a class of finite automata with accepting sets. This translation works for regular expressions bearing a certain non-elasticity property. This gives a method for checking for emptiness the semantics of a regular expression with the non-elasticity property.

A new method for checking emptiness of a timed automaton by constructing the regular expres- sion for it. 14 1. Introduction

A new method for computing the reachability relations defined by timed automata, based upon

the same class of n-automata.

A collateral result is that our finite automata give a new method for representing general clock constraints. In some cases, clock constraints are represented more compactly than with existing methods. The detailed study of a that embodies this representation method is not the subject of this thesis.

Related work

As we have already mentioned, our study has started from the results in [ACM97, ACM01, Asa98] which introduce timed regular expressions, and [Her99], which shows the necessity of renaming in timed regular expressions. A different approach to regular expressions is given in [BP99, BP01].

The regular expressions in these studies do not need intersection or renaming, being based upon

a C X a C X atoms of the type for some symbol , clock constraint and reset . The paper [BP99] gives a variant of this, in which the reset sets are shifted to the concatenation symbols, hence having a whole range of indexed concatenations. However none of these papers study in de- tail the algebraic structure on which the semantics of regular expressions is based, and neither they give the possibility to lift the semantic operations of concatenation and star to syntactic operations on atoms. In fact, it can be observed that, with the clock valuation semantics, if one to lift concate- nation and star at a syntactic level then he would run into problems with the representation of the

results. More specifically, timing in the atomic regular expressions of [BP99] or [BP01] is speci-

x x y y i

fied by clock constraints which utilize nondiagonal constraints of the type i (this

i i

x y i

is an expression which says that the clocks i and evolve synchronously). As a consequence, n the “zones” in the -th dimensional space which satisfy such constraints are no longer unions of clock regions [AD94] and hence need representations for more general polyhedra. But it is known [Sor01] that general polyhedra-based representations are less efficient that representations which take advantage of the fact that the polyhedra in discussion are unions of regions. The study whose results are perhaps the closest to our approach is Yan Jurski’s PhD thesis [Jur99] (see also the journal version [CJ99]). In his thesis, Jurski proves that the reachability re- lation in timed automata is expressible in Pressburger arithmetic. The technique employed in his work can be characterized as a generalization of constraint graphs (which are the graphical repre- sentation of DBMs), with a construction of the “star” of a constraint graph. The problem with this

approach is that constraint graphs cannot record “disjunctive information”: they can only record

x x I j conjunction of “diagonal” constraints over clocks, i.e., of the type i . Whereas the star is naturally built as an infinite disjunction. Therefore Jurski needs to “flatten” each timed automa- ton, such that no nested loops be allowed, and only afterwards apply his star construction. On

the contrary, our n-automata can record disjunctive information too (the set of accepting runs is a union) and therefore we may iterate the star closure theorem without any problem. Besides this, 1. Introduction 15

our presentation allows not only expressing the timing behavior of timed automata, but also the representation of both the timing and untimed information in the same expression. Finally, Jurski’s result is limited to timed automata without diagonal constraints, and it is not clear whether this restriction is essential or not. We may also mention the approach on using Pressburger arithmetics and its decision procedures in systems with infinite state-space [Boi99, WB00]. [BC96] report on solving systems of linear equations over the by coding solutions into finite words and using finite automata to accept such words. The idea dates back to Buchi’s¨ work on weak monadic second order theory of one succesor function (WS1S) [Buc60],¨ see also the two comprehensive handook articles [Tho90, Tho97] on the subject. We note here that our coding of integer solutions of constraints (which are nothing else but systems of linear equations) is different from the one used by [BC96, Boi99]. Our coding takes advantage of the quasidiagonal format of systems of linear equations which are associated with the clock constraints. There are already several data structures that are used for reachability algorithms for timed au- tomata [Tri98, LWYP99, ABK 97, MLAH99], to cite a few only. Most of them are based upon the DBM technique, and DDDs are generalizations of BDDs [Bry86] for representing clock con- straints. Our automata-based technique, with constraints regarded as runs in a finite automaton, is therefore new and might yield new data structures for reachability algorithms. Let us also mention that there is a whole theory concerning constraint propagation [DMP91],

which is an essential ingredient for the representation of reachable states [Tri98]. The most gen-

max eral way of looking at these is perhaps the max -algebra [Gau99, GP97]. However - algebra does not deal with the possibility to chain timing constraints, that is, to specify algebraically the behavior of timed automata. Another related work that we might mention here is the study [CG00] on employing periodic constraints in timed automata. Our presentation of timed automata requires that constraints use only intervals, but our theory of regular expressions allows the use of periodic constraints. We finally mention the interest for Kleene theorems for subclasses of timed automata [Dim99b], or for superclasses [BP01].

Organization of the thesis

The thesis is divided in chapters, including this introductory one and a conclusion chapter.

2. In the second chapter we give some basic properties of signals and timed languages. We prove here that the monoid of signals and the monoid of timed words are not “algebraically” related, and that the idea of producing timed languages via inverse morphisms from finite monoids issues only languages with no timing information. 3. In the third chapter we study the special class of one-clock timed automata, called real-time automata, in which the clock is reset at each transition. We show that language emptiness and universality are decidable, we give a “pumping lemma” characterization of the associated class 16 1. Introduction

of languages, and show that, by utilizing a “stuttering-free concatenation” on the set of signals, we get exactly the class of languages accepted by real-time automata. 4. In the fourth chapter we review briefly the notion of timed automata and the possibility to have a Kleene theorem for them. The regular expressions we utilize here are taken from [BP99, BP01]. They involve clock constraints and resets, and therefore they are easily related to timed automata. However it is not the class of expressions which we aim to study, the reason (given in chapter 5 also) being that clock usage in expressions is specific to low-level specification languages, while regular expressions are meant to be a high-level specification language. 5. In the fifth chapter we review the timed regular expressions of [ACM97] and discuss their prob- lems. We also present here our ideas for solving these problems - usage of colored parentheses and of a partial concatenation operation. This chapter is meant as an intuitive presentation of the problems we have sought to solve and the solution we have found. We also give here an undecidability result concerning timed regular expressions with negation. 6. In the sixth chapter we introduce and study our algebraic framework of signals with distin- guished points. These signals are given a matricial presentation, mainly by similarity to Dif- ference Bound Matrices [Bel57], which are, on their own, a subject of discussion in chapter 8. We define the partial concatenation operation on signals and establish some basic algebraic properties for it. Wee introduce concatenation by means of two “more basic” operations: jux- taposition, which can be thought as “conjugating” the two signals and projection, which can be thought as quantification. A first try to lift this operation at the specification level, that is, to provide a calculus with regular expressions with colored parentheses, is shown to fail, the reason being that projection is not compositional. More specifically, there is no way to de- fine the projection operation on regular expressions, such that the semantics of the projection be the projection of the semantics. An equally worrying result is the undecidability of the emptiness problem for the general class of regular expressions with colored parentheses. This result follows by showing that the Post Correspondence Problem [Pos46] can be reduced to the emptiness problem for regular expressions with colored parentheses. Hence the undecidability problem is hidden in the untiming structure of the regular expressions.

7. In the seventh chapter we investigate the class of n-automata for their possibility to represent regular expressions with colored parentheses, but over a discrete time domain. The necessity of this study is emphasized by the undecidability result. We provide here a mild property – the non-elasticity property – that assures star closure of automata, and hence accomplish in part the task of representing regular expressions. 8. The eighth chapter is concerned with the generalization of these results for continuous time domains, in particular with the possibility to represent timing information in a continuous time domain with the automata defined in chapter 7. This approach is successful and provides the

possibility to represent timing constraints in the continuous domain by n-automata. 9. In the ninth chapter we gather together all the results obtained so far in order to provide a compositional calculus with regular expressions with colored parentheses. In this chapter we also show that the non-elasticity property discovered in chapter 7 is satisfied by the regular 1. Introduction 17

expressions which encode timed automata. We then provide a method for checking language emptiness in timed automata, by transformation to regular expressions.

Each chapter starts with a short presentation of the problems and solutions that are treated within it. 18 1. Introduction 2. Signals and their basic properties

In this chapter we study some of the algebraic properties of signals and timed words. Signals and timed words are the two alternative models for the behavior of timed systems. While signals put the accent on states in which the system is and on state durations, timed words put the accent on actions that a system is executing and on moments at which actions take place. We take the approach of [ACM01] and present these monoids as coproduct monoids – or, in an alternative terminology, as direct sums. This algebraic presentation makes some proofs more succinct. Since the two notions, signals and timed words, try to model the same phenomena it is natural to search a connection between monoids of signals and monoids of timed words. In this chapter we prove that this connection is not of an algebraic nature: we prove that the monoid morphisms between the monoid of signals and the monoid of timed words are unable to relate state changes to actions. More formally, we prove that each signal is mapped, by such a monoid morphism, into a timed word with no action. The second result of this chapter concerns the nonexistence of a Myhill-Nerode characterization of timed languages. We prove that any timed language (i.e. set of signals) that can be defined as the inverse image of a subset of a finite monoid, does not carry any timing information. That is,

whenever a signal is in the language, any other signal with the same sequence of states (and with any other durations of these states) is in the language too. This property is based upon a lemma stating that there are at most two morphisms from the monoid of nonnegative reals to any finite monoid: the trivial morphism and, in the eventuality the target monoid has a “zero”, the morphism which takes any positive number to this “zero”. Hence the problem is traced to the “stuttering” structure of the real numbers, and we will see in the next chapter that, if we allow two signals to concatenate only when at the concatenation point they create a discontinuity, then the “inverse morphisms” approach will produce timed languages with nontrivial timing information. This second result is proved using a “diagram-chasing” technique specific to category theory [Mac71]. This proof takes advantage of the algebraic presentation of signals and we believe it is drastically shorter than any other proof, that would need to mimic the uniqueness properties of the monoid of signals. The chapter is organized as follows: the first section presents some basic properties about monoids, especially the construction of the coproduct (or direct sum) of a family of monoids. We also remind here the notion of Kleene algebra. We also incude in this section a short subsec- tion recalling the definition of Kleene algebras. Then in the second section we recall the coproduct representation of signals and the fact that, similarly to languages of words, the powerset of signals 20 2. Signals and their basic properties can be organized as a Kleene algebra. The third section presents also timed words as a coproduct monoid and gives the negative result about the monoids of signals and timed words. The fourth section is concerned with the other negative result, concerning the “lack of interest” of timed lan- guages defined by inverse monoid morphisms.

2.1 Basic notions

In this section we remind the notion of coproduct (or direct sum) of a family of monoids, and the notion of Kleene algebra.

2.1.1 Coproduct monoids

M e M e

i iI

Definition 2.1.1. The coproduct of a family of monoids i is the monoid de-

M M

fined as the quotient: i where

iI

M fm i j i I m M g

i i i

is the usual disjoint sum: i ;

iI

M M i

i is the free monoid over (with concatenation denoted as juxtaposition and

iI iI

empty sequence as );

M

and is the congruence on i generated by the equations:

iI

m im i m m i m m M i I

i i i

i (2.1)

i i i

e i e j i j I j

i (2.2)

e i i I

i (2.3)

L L

M e M M

i i iI i

We denote the coproduct of the family as i . Note that the unit of

iI iI e

is the class of . we denote this unit as .

L

M m

Observe that each element i can be uniquely represented as a finite concatenation

iI

of “atomic” elements

m m i m i

i k

i (2.4)



k

m i e j k k

in which i for all .

j

L M

Theorem 2.1.2. The monoid i has the following universality property: for any monoid

iI

L

M e M M M

i i

and family of morphisms i , there exists a unique morphism

iI

M m i m h i

i i i iI

such that i . This morphism is denoted and is called the coproduct of

iI

the family i .

This theorem is depicted in Figure 2.1. Here, i denotes the inclusion morphism.

L

m M

The construction of the coproduct morphism is the following: each element i is

iI

i m m i m

k i i

decomposed as in Identity 2.4, . We then put:

 k

2.1 Basic notions 21

L

i



M M

i i

iI

H

H

H

H

h i

i iI

H

H

i

j H M

Fig. 2.1. The commutative diagram for Theorem 2.1.2.

h i m m m

i iI i i i i

 

k k

L

n

M I I f ng

When the family is finite, say , we denote the coproduct as i , and

i M

when the monoids i are identical (this implies that their operations and units are the same too)

L

I f ng M M i I M

we denote it as . Finally, when and i for all , we denote the

iI

L n

coproduct as M .

i

2.1.2 Kleene algebras

We remind here one of the possible axiomatizations of Kleene algebras.

A Definition 2.1.3. A Kleene algebra is a structure which satisfies the following

properties:

1. A is a , that is:

A

is an idempotent monoid;

A

is a monoid;

distributes over .

2. satisfies the following equations [Con71, Koz94]:

Y Y X Y Y

X (2.5)

X Y Y X Y

Y (2.6)

X X X

(2.7)

X X X

(2.8)

where is the partial order induced by the idempotent [Bir79], that is,

Y X Y Y X iff

A Kleene algebra is called commutative iff it satisfies the following identity [Con71]:

Y X Y X Y

X (2.9)

The classical example for Kleene algebras is the set of all languages over an alphabet ,

fg P . An example of a commutative Kleene algebra is the Kleene algebra over a one-letter alphabet. 22 2. Signals and their basic properties

2.2 Signals

R R R Q

We denote , and the sets of negative, nonnegative, resp. positive numbers, the N

set of nonnegative rational numbers, Z the set of integers and the set of nonnegative integers

n n Z n n fn n

(also called naturals). For each , denotes the interval of integers

n g

, while denotes the left-open, right-closed interval of reals whose limits are , resp.

Q Int Q fg

. denotes the set of intervals having bounds in and including the empty set, while

Int Z f g

Z denotes the set of intervals having bounds in and including the empty set. An

a b a b

open interval is denoted as , while a closed one is denoted as . We will extensively use

a b

left-closed right-open intervals, which are thence denoted .

R A a A f

For each function f , real number and each , we say that the left limit of at

a lim f a

is and denote it iff the following property holds:

t

t f t a

there exists some such that for all

f a

Right limits lim can be defined similarly. This definition amounts to considering that the set

t

A is equipped with the discrete topology.

R lim f t f lim f t

A left discontinuity in f is some for which . The discontinu-

t t

f t f lim f t

ity is right if we rather have that lim .

t t

Definition 2.2.1. A signal over a finite alphabet is a function where is a nonnegative number, function which has finitely many discontinuities, all of them being left discon-

tinuities.

We denote dom the domain of and its endpoint. is also called the length of .

Signals can be given graphical representation. For example, Figure 2.2 gives the graphical rep-

fa b cg

resentation of the signal defined by:

t a

iff

t

b t

iff (2.10)

t

c iff

b

a a

c

  

Fig. 2.2. A graphical representation of the signal defined in Identity 2.10.

Sig denotes the set of signals over . Note that there exists a unique signal with empty

domain .

2.2 Signals 23

Sig dom e i

i i

For with ( ) define their concatenation as

dom e e

the signal with and such that

t t e

for

t

t e t e e e

for

For example, the signal in Figure 2.2 can be regarded as the concatenation of the two signals

in Figure 2.3.

b b b

a a a a

c c

  

Fig. 2.3. An example of signal concatenation.

Sig

Proposition 2.2.2 ([ACM01]). is a a noncommutative monoid, called the monoid

of signals with concatenation.

L

Sig R car d

Moreover is isomorphic to the coproduct of copies of the

a

R

monoid of nonnegative reals .

e e i

i

Proof. It is clear that the domain of each signal splits into finitely many intervals i (

n

) on which is constant. Therefore we may identify with the formal concatenation:

t t

t

 

n

a a a t e e a t t e e

i i i i i

where i and (2.11)

n

L

Sig R ut

If we add we obtain the isomorphism .

a

As we can see, this proposition gives a more “friendly” presentation of signals [ACM97]: the

L

R

signal presented in Figure 2.2 is represented also by the following element of :

a

a b a c

a

Observe also that the empty signal is denoted by the empty sequence and that for

a Sig

any . We will utilize both notations and for the unit of concatenation in .

Proposition 2.2.2 allows us to define the length of a signal as a monoid morphism induced by

R Sig

the coproduct property: denote first a the coproduct inclusion for each copy of

R Sig R

. Then is the unique morphism defined by the following diagram:

L

a



R R

a

H

H

H

H

H

H

R



Hj

R

Sig a

For each symbol a we denote the set of signals whose value is constantly :

a

t

Sig R fa j t R g

a a 24 2. Signals and their basic properties

2.2.1 Timed languages: basic properties

X P X

For any set X , the set of subsets of is denoted as .

Subsets of Sig are called timed languages. Signal concatenation can be naturally extended

to timed languages:

L L f j L L g

and gives rise to star:

n

L L

nN

n n

L f g L L L

where and .

P Sig f g

Proposition 2.2.3. is a Kleene algebra.

fg Proof. All properties follow by transporting the proof that P is a Kleene

algebra [Sal66, Koz94]. For example, the implication 2.5 follows by proving, by induction on

n

N X Y Y X Y Y ut n , that implies .

2.3 Timed words

w w jw j

Given a set of symbols and a word , the length of , denoted , is the number of

j j N

symbols in w . It can be regarded as the unique morphism determined

aj a

by j for all . Remind that is the free monoid generated by .

w

Definition 2.3.1. A timed word over the alphabet is a pair consisting of a word

w jw j R

, and a function .

The word w is called the sequence of actions in and the function is called the sequence of time

labels in . The length of a timed word is simply the length of the sequence of actions in it, and is

w j

denoted j .

We denote TW the set of timed words over . Subsets of it are called timed word languages.

On TW we can define a concatenation operation as an extension of the concatenation on

w w w w w w

words: given two timed words and define where

i jw j i

iff

i

jw j i w

iff

jw j i jw j jw j i jw j

iff

Note again that there exists a unique timed word whose sequence of time labels is the

function which maps the unique element in its domain to .

2.3 Timed words 25

TW

Proposition 2.3.2. is a noncommutative monoid.

TW R R

Moreover is isomorphic to the coproduct monoid , where is the monoid of nonnegative reals.

Proof. The isomorphism is simply the “rearrangement” of each timed word as follows:

w w w jw j ut

jw j

j TW N It follows that j is a monoid morphism.

2.3.1 Relating the monoids of signals and of timed words

Sig TW

For the sequel we fix two alphabets . We call a monoid morphism as

Sig

trivial if for each signal

R

for some

j Sig

Observe that a morphism is trivial iff j for all .

TW Proposition 2.3.3. Any monoid morphism from Sig to is a trivial morphism in the

above sense.

Sig TW a Proof. Assume is a monoid morphism. We first prove that for each ,

is a trivial morphism. Then the result would follows since each signal is a concatenation

Sig a

of constant signals.

R

Observe first that, for each two nonnegative reals ,

ja j ja j

if then (2.12)

j

Here we have used the fact that the length of a timed word j is a monoid morphism. This comes

a a j j a

as a and hence, if we apply the composition of morphisms to we get

ja j ja j ja j ja j

n n

Sig Sig Sig

N a

This implies that there is a countable partition n of such that the

a a

j j n

set of constant a-signals which are mapped by the morphism to the integer :

n

Sig fa j ja j ng

a

n



n Sig

Let us denote the first integer with the property that has a nonempty interior. Such

a

n

n N Sig

a number must exist since, for each , if there exist some with

a

n n



Sig Sig

then due to implication 2.12. As a consequence, for some

a a

R

.

But then, for each , too and hence:

26 2. Signals and their basic properties

n ja j ja a j ja a j ja j ja j n n

n

It follows that .

R k d j e j

But then, for each , if we denote then and hence . It follows

k k

that

k

k k

ja j ja j j k ja

j j Sig

This proves our claim that maps all signals from a to . But every signal

t

t p



a a a Sig p

is a concatenation of constant signals where i . Hence we have that

Sig

for all

t

t



p

j j ja a j ut

p

TW

Remark 2.3.4. In general, trivial morphisms from Sig to are linear on each submonoid

Sig fa g Sig

i of , but the slopes might be different. Hence we might add to the above proof

t

Sig fa g K R a Kt

the observation that, on each i there exists some such that . Conse-

TW

quently, we may conclude that all morphisms from Sig to are “piecewise-linear”, that

K K m

is, there exist such that

t

t



m

K t K t a a

m m

m

card where m . We have another option for relating signals and timed words: to join the two structures into a single one, i.e. define signals with actions or timed words with states:

Definition 2.3.5. The monoid of signals with actions over the set of states and the set of actions

M

R car d R

is the coproduct of copies of the monoid of nonnegative reals

a

with the free monoid . Hence now the monoids of signals and of timed words can be regarded as “particularizations” (or projections, in the algebraic sense) of the monoid of signals with actions. However we will not utilize this notion since it will make all derived notions and proofs unnecessarily more complicated.

2.4 Timed regular languages defined by inverse monoid morphisms

We start this section by reminding the way regular languages are related to monoid morphisms

[Eil74].

M e M A

Definition 2.4.1. Given a (possibly infinite) monoid, an -automaton is a tuple

q Q Q Q Q M Q q Q Q M q Q

f f

where , and has the property that for all ,

m m M

:

q m m q m m q e q

and (2.13)

2.4 Timed regular languages defined by inverse monoid morphisms 27

Q A LA fm M j

A is called finite whenever is finite. The accepted language of is

q m Q g q Q M f

. It is also assumed that all states in an -automaton are accessible, i.e.

m M q m q

there exists some such that .

M Reg M

Definition 2.4.2. Given a monoid M , the family of -regular languages, denoted ,isthe

M M

family of subsets L which are the accepted language of some finite -automaton.

M L M

Theorem 2.4.3 ([Eil74]). Reg coincides with the family of subsets for which there

F e M F

exists some finite monoid , some surjective monoid morphism and some

F L F subset F such that .

2.4.1 Essentially untimed regular languages

The untiming of a signal is the sequence of symbols that appear in it. Observe that in the untim- ing of a signal, two consecutive symbols are distinct. Hence the untiming application (actually,

morphism) is not surjective.

We will take advantage of the algebraic definition of Sig and define the untiming as a co-

product of morphisms. We define first the monoid SF of stuttering-free words, that is, the set of words in which no two consecutive letters are equal. This monoid is endowed with a concatena-

tion operation that “fuses” two identical letters. For example,

aba ac abac

We may define SF in two ways: as a coproduct monoid and as a quotient monoid:

a S

The coproduct definition is the following: for each consider the monoid a

f ag a a a SF

where a (i.e. is idempotent). Define then as the coproduct of the

S

a

family monoids a ,

M

SF S

a

a



S SF a

and denote a the inclusion morphism which defines this coproduct.

The quotient presentation of SF is the following: consider the relation defined

by

faa a j a g

This relation can be uniquely extended to a congruence on as follows:

fw aaw w aw w aw w aaw j w w a g

Then SF is isomorphic to the quotient of the by the congruence . That is, elements of

z

SF can be thought as equivalence classes w.r.t. . We denote the canonical

z

abcba projection. For example, aabcbba .

28 2. Signals and their basic properties

Concatenation of stuttering-free words, i.e. the internal operation on SF that makes it a

w SF w a w monoid, can then be defined as follows: for each w , not ending in and not

beginning in b,

w abw b

iff a

wa bw

a b waw iff

We may now proceed to the algebraic definition of untiming: consider the following monoid

morphism:

t

a iff

R S t

a a a

iff t

The following diagram defines, by theorem 2.1.2, the untiming morphism U :

L

a



R Sig R

a

U h i

a a a a

a



S SF

a

Sig

Definition 2.4.4. A timed language L is called essentially untimed iff there exists some

SF L U L set L (i.e. a language in the classical sense) such that . The class of

essentially untimed regular languages consists of essentially untimed languages that are inverse

images of regular languages in SF .

Observe that SF -regular languages in the sense of definition 2.4.2 are in fact regular lan- guages in the classical sense [HU92] with the restriction that in each word no two consecutive

letters are equal – and this restriction means intersection with a regular set.

Sig 

2.4.2 Syntactic monoids on  are not interesting

In this section we show that Sig -regular languages are essentially untimed. Remind first that a

M e M

zero element in a monoid is an element which satisfies

x M x x

Our result relies on the following lemma:

f R M e M e Lemma 2.4.5. Suppose is a surjective monoid morphism and

is a finite monoid. Then

f f x e x R M feg

either is the trivial morphism , (and hence );

M fe g f x e or where is a zero element and maps any to and to .

2.4 Timed regular languages defined by inverse monoid morphisms 29

Proof. We note first that the surjectivity of f implies that is commutative. We may also prove by

k

k x R k N f k x f x induction on that for all and for all , . Let’s prove first the following:

Claim. If M contains only idempotents then it has at most two elements, and one of these elements

is a zero element (in fact the zero element, since it is unique).

M fm m g m m m

n n

Proof. Suppose . Then is the zero element because

m m m m m m m m m

i n i i i n

m m m m

i n

where we have used the commutativity of .

e M Two cases arise then: the first is when m , the unit of . But the unit can be a zero element

only if the monoid has only one element, the unit:

e e m e m e i

i (as is the zero element) (as is the unit)

e f m

The second case is m . In this case there must exist such that .

f f f k k N

Take then any . Since is idempotent we have that for any .

e d k k f k

If we take then k we obtain and hence , hence is defined.

f f f f k

But then, as m is the zero element we have that . Therefore

m f f f k f k f k f

ut

which proves that any positive real is mapped to the zero element. M

So what is left to prove is that if f is surjective then contains only idempotents. Take a

G f N

positive real and denote the image under of the submonoid of multiples of :

G f n j n N

p p

G p Z M

It is clear that G for any . But since is finite, there must exist

p p

p

 



G p Z G min

some such that . Let us then denote , hence

G

G .

k card G

We may prove that G is cyclic, that is, if we denote then

f i j i k f f f k

G (2.14)

k f

f (2.15)

k j N i j f i f j

To this end, note that if there exist i , with and such that

i n f j n n N

then we would have that f for all . But this would imply that

G f f j

30 2. Signals and their basic properties

ard G j j k

and hence c , which implies that . This shows that Identity 2.14 holds.

k f i i

For showing that Identity 2.15 holds too, suppose that f for some

n N k

. By a simple induction argument, we may prove that, for all ,

k nk i f i nk i

f (2.16)

f j k f G

On the other hand, since , we must have some such that

j

f too. Therefore,

f f f f j f j f j

j k

It follows that . By recursively appying the Identity 2.16 we further get that:

f j f j k i f j k i

j k

j k

j f k i

(2.17)

k i

j k

j k

j k i i k

and .

k i

f l l i k

But this rewrites to the fact that f for some , which would be impossible,

l

by Identity 2.14, unless i . Hence Identity 2.15 holds too.

But then, starting from Identity 2.17 and replacing i with we may further conclude that

j k k j

j k j

f f j f j k f k j

k k

k j

j

k k j

with . This also implies, by Identity 2.14, that

k

j k

j

j k

k

j j j k

As a consequence, k , fact which, corroborated with the hypothesis that , implies

j that k .

Therefore,

f k f

f (by Identity 2.15)

or, by multiplying by ,

f f f

p

f

Hence f is an idempotent. By an easy induction we may show then that is an idem-

p



f f ut potent too, which implies that is an idempotent.

2.4 Timed regular languages defined by inverse monoid morphisms 31

G G Observe that it was essential to utilize the fact that f to show that is cyclic. The difference between this lemma and Proposition 2.3.3 is that the monoid of natural numbers is an ordered monoid, in which the order is compatible with the monoidal structure. On the con- trary, in a finite monoid there is no compatible ordering for free. This is why the two proofs are

different, with the proof of Lemma 2.4.5 much more involved than the proof of Proposition 2.3.3.

Theorem 2.4.6. The family of Sig -regular languages equals the family of essentially untimed regular languages.

Proof. This is a corollary of theorem 2.4.3 and the above lemma. The right-to-left inclusion follows

SF

by an easy argument: if K is a regular language in , then we have some surjective morphism

SF M M e F M K

where is a finite monoid, and some subset such that

F U Sig M L

. But then is a surjective monoid morphism too and defining

L Sig U K U F

we get that is a regular language in .

M e

For the reverse inclusion, suppose we have some finite monoid and some surjective

Sig M U morphism . We look for a decomposition of in which the untimed morphism

occurs.

a R Sig

Remember that, for any , a is the inclusion morphism for the coproduct

R M

property. It follows that a is a monoid morphism. This implies, by the above

Im M

lemma, that the image of this morphism, a , is a submonoid of having at most two

fe g

elements, M with .

S f ag U

But the monoid a which was used to define on page 27 is isomorphic to

Im aa a S j S M

a a a

a , since in . Define then the morphism as

j e j a t t

a a a

a and for all

j S M a

That is, a is either the isomorphism from to or the trivial morphism.

R S S S

a a a a

Define also the morphisms a and as follows:

t e a

iff a

t a a and for all

otherwise

a t t

a a a

a for any

j

a a a a a Then a and . Now we are ready to chase the following diagram, in which all the squares and triangles com-

mute:

a

a a

S S R

a a

P

P

 



P

P

j

P

a

P

a a a

P

P

P

h i h i hj i

P q

a a a a a a a a

Sig SF SF M

32 2. Signals and their basic properties

The upper triangle and the outer square commute as shown above. The inner squares and the right triangle commute by just the coproduct property. The outcome of this diagram is the commu-

tativity of the bottom square, i.e.

hj i h i h i hj i h i U

a a a a a a a a a a a a

a (2.18)

U h i

a a since a .

This commutativity follows by the coproduct property: both the left-hand side and the right-

j

a a

hand side, when composed with a ,give . Therefore, both must be equal to the unique

hj i U h

a a a

coproduct a . Hence we have managed to show that the untiming morphism

i

a

a occurs in some decomposition of .

L F M

So suppose we have some Sig -regular language , witnessed by the subset , i.e.

L F SF K hj i h i F

a a a a . Consider then the -regular language a .

Then, using the decomposition 2.18 we get

L F hj i h i U F U K

a a a a a

ut Hence L is indeed essentially untimed. 3. Real-time automata

In this chapter we study a class of automata which seems to be the largest extension from finite automata still carrying the decidability of both the emptiness and the universality problems, a Pumping Lemma and, moreover, a Kleene theorem in which the semantics of the associated regular expressions is based upon a total concatenation operation. The automata we study, called Real- Time Automata (RTA), can be regarded as timed automata with a single clock which is reset at each transition, and they appeared (in a slightly different version) in connection with the so-called Simple Duration Calculus [HJ96]. However, even at this lowest level of introduction of timing constraints into finite automata we find that complementation raises a specific problem: the subset construction can be adapted to handle timing constraints, but it works only if the automata are stuttering-free, i.e., two states labeled with the same symbol are not connected by any transition. We also find out that language determinism, i.e., the property that each signal is associated with a unique run that starts in an initial state, cannot be captured by local properties like state-determinism or stuttering-freeness: stuttering-free state-deterministic RTA are less expressive than RTA. We solve this problem by introducing the Kleene algebra of sets of real numbers. The roleˆ of concatenation from Kleene algebras of languages is taken here by addition of sets of real numbers. This operation models the process of removing one stuttering transition by “fusing” the adjacent states. We then study the sub-Kleene algebra generated by finite unions of intervals with rational bounds and prove a normal form theorem for this subalgebra, result which is based on properties of integer division and roughly says that elements in this subalgebra are “ultimately periodic”. This result is not a corollary of the normal form for regular languages over a one letter alphabet because our Kleene algebra has two generators whose generated subalgebras are not disjoint but which cannot generate one another. We also show here that the class of languages recognizable by real-time automata can be given an “algebraic” characterization, that is, can be presented as inverse morphic images of subsets in finite monoids, but the monoid of signals needs to be redefined: the concatenation has to be a “stuttering-free” concatenation, allowing two signals to concatenate only if they produce a discon- tinuity at the concatenation point. The chapter runs as follows: in the next section we remind what RTA are, their associated regular expressions and the problem raised by their complementation. In the second section we introduce the Kleene algebra of sets of nonnegative reals and prove the normal form theorem. The third section contains the constructions that accomplish stuttering elimination and determinization 34 3. Real-time automata

and the fourth is reserved for a pumping lemma and expressiveness issues concerning real-time automata. Finally, the fifth section is concerned with the “algebraic” presentation of the class of languages accepted by real-time automata. This chapter contains the results from [Dim00a, Dim01b]

3.1 Real-time automata and their regular expressions

Real-time automata are state-labeled timed automata [ACM97] with a single clock which is reset at each transitions.

3.1.1 Real-time automata defined

A Q Q Q Q f

Definition 3.1.1. A real-time automaton (RTA) is a tuple where is

Q Q

the (finite) set of states, is the (finite nonempty) alphabet, is the transition relation,

Q Q Q Q f

are the sets of initial, resp. final states, is the state labeling function and

Q Q Int

is the time labeling function.

q q q

We also call the pair the label of the state .

n q

i

n

RTA work over signals: a run of length is a sequence of states i connected by

q q i n

i , i.e., i . The run is associated with a signal iff there exists a

decomposition

t t

n



q q

n

t q i n i

such that i for all . Or, in other words, iff there exist some sequence of

e e e e q t q

n i i i i

“splitting” points such that and

t e e i n

i for all i and all . Observe that the “splitting” points must contain all the discontinuities in the signal but this inclusion might be strict, case in which we say that the run is

stuttering.

Q Q f

A run is accepting if it starts in and ends in . When a signal is associated with some

A A

accepting run we say that is accepted by . The language of is the set of signals associated

LA

with some accepting run of A and is denoted . Two RTA are equivalent if they have the same

language. If we denote the class of all RTA whose alphabet is as RTA , then we may define

the class of real-time recognizable languages over as

fL Sig j A LA Lg

TRec RTA s.t.

a b

As an example, the automaton in Figure 3.1 accepts the signal The accepting run

q r s t e e

which is associated with this signal is and the splitting points are , ,

e e e

and . Note that the run is stuttering.

A real-time automaton whose alphabet is is, from a syntactic point of view, a “finite presen-

R q tation” of a classical automaton over the (uncountable) set of symbols , where a state

3.1 Real-time automata and their regular expressions 35

a

b

q a rb s b t b

q r s t

Fig. 3.1. An example of a real-time automaton and a signal accepted by it.

a I a I

labeled embodies a whole family of states labeled with for all . However the

R comparison stops at this syntactic level since semantically comes with a structure which is unavailable for the alphabet of a classical automaton: according to this structure, we may “fuse”

two symbols sharing the same state label. It is this structure that allows the acceptance of the signal

b b b b in the above example by splitting the symbol into three symbols , and . We end this subsection with the following adaptation of the decidability of the emptiness prob- lem for finite automata.

Proposition 3.1.2. The emptiness problem for RTA is decidable.

The proof relies on the algorithm for computing the sets of accessible states and then checking

Q car d whether a final state is accessible, which can be done in linear time w.r.t. car d . Real-time automata can also be defined such that their accepted language would consist of timed words instead of signals. Most notably stuttering steps would translate into epsilon-transitions in the timed words setting. The whole contents of this chapter can then be translated to such automata without very much difficulty.

3.1.2 Regular expressions and the Kleene theorem

R

We have observed that labels in RTA are finite presentations of sets of symbols from . This

Q Int Int observation can be further extended to considering regular expressions over Q

with the aim of obtaining a Kleene theorem:

Rat

Int Q Int Definition 3.1.3. Consider Q , the set of rational (or regular) expressions over , i.e.,

defined by the following grammar

E j j a j E E j E E j E

I

a

Q Int

where the atoms I are symbols from .

Rat Int Rational expressions in Q will be mainly called real-time regular expressions.

There are two types of semantics for real-time regular expressions: the first one, called hence-

Int

forth abstract, is the classic semantics in terms of words over the set of symbols Q and is

j jj jj

denoted j . For this semantics, is the empty set and is the set containing the empty word

Int over Q , word which is denoted too. 36 3. Real-time automata

The second semantics, called the real-time semantics or simply the semantics, is in terms of

k

signals and is denoted k :

t

ka k f Sig j t I a g kE k kE k

I such that

kE F k kE k kF k kk

kE F k kE k kF k kk f g

ka k f g a

Note also that for any .

The following straightforward property relates the two types of semantics:

E Rat Int

Proposition 3.1.4. For each real-time regular expression Q ,

kw k j w jE j kE k

We define the class of real-time regular languages over as

TReg fL Sig j E Rat kE k Lg Int

Q such that

TReg Theorem 3.1.5 (Kleene theorem for RTA). TRec .

The Kleene theorem would follow almost immediately from proposition 3.1.4 and the classical Kleene theorem [HU92] if we would have transition-labeled real-time automata rather than state- labeled. Since this kind of automata will further show useful, we define them here and provide the

straightforward translations from and to RTA:

A Q Q Q Q f

Definition 3.1.6. A transition-labeled RTA (t-RTA) is a tuple where , ,

Q Q f

and have the same names and properties as in RTA and the transition relation satisfies

Q Q Int Q car d

with .

q a I r a

Transitions of the form will be called -transitions.

Int

Since a transition-labeled RTA is a finite automaton over a finite subset of Q , we may speak

Int

of its language in the classical sense, as the set of words over Q which are concatenations of the

L A

labels of some accepting run. Let’s call this the abstract language and denote it as abs . The

A LA

real-time language accepted by A, or simply the language of , denoted , is then the union

L A

of the semantics of each word in abs , with this abstract word viewed as a regular expression

Int

over Q , that is,

LA kw k j w L A

abs (3.1)

The translations between RTA and transition-labeled RTA are the usual transformations of a state-labeled automaton into a transition-labeled one and back, with a special case when the empty signal is accepted by the t-RTA:

3.1 Real-time automata and their regular expressions 37

A Q Q Q f

Given some RTA , a transition-labeled RTA with the same language

B Q ft g ft g Q t Q

f

is where and

q r r r j q r t q q q j q Q

B Q Q Q f

For the reverse, given some transition-labeled RTA , a RTA whose lan-

LB A fq g T T

f

guage is is where

q a I r q a I r a q a I r I

Ð for each RTA state , and ;

q a a q

Ð for some (assumed nonempty) and ;

T fq a I r j q Q g fq g

Ð ;

T fq a I r j r Q g fq j LB g

f

Ð f ;

q a I r r b J s j q a I r r b J s

Ð

LB A Hence, when we must add to an initial and final state for accepting . Note that this state is neither the source nor the target of any transition.

Proof (of the Kleene theorem 3.1.5). The proof is then the following: we have already seen that B

each RTA A is equivalent to some t-RTA . Then, by applying the classical Kleene theorem we

E Rat Int

get a regular expression Q , that is, a real-time regular expression, whose abstract

L B

semantics equals abs . Then, by combining properties 3.1.4 and Identity 3.1 we obtain that the

A ut (timed) semantics of E and the language of are equal. The reverse implication is similar.

We end this subsection with a procedure for removing the zeroes from the time labels in RTA

which is a straightforward adaptation of the epsilon-elimination procedure for finite automata

q a r

[HU92]: First observe that transitions of the kind play the roleˆ of -transitions

A Q Q Q f

in finite automata. Consequently, in each t-RTA we split each transition

q a I r I q a I n fg r q a r

with in two parts, the first being and the second . A

We further define, for each state q in ,

q a q q fq Q j q q q q g

i i i n

there exists a run with

in

A Q Q Q

Then, the t-RTA in which

f

q a I n fg r j q a I s r s

and

Q Q fq Q j Q q g

f f f

is equivalent to A. Note that when translating transition-labeled RTA without zero labels into state-

q

labeled RTA, we will get the special initial state whose time label is , needed for not loosing the empty signal from the accepted language. All the above observations can be gathered together in the following: Proposition 3.1.7. Each transition-labeled RTA is equivalent to some t-RTA in which the interval

labels of the transitions do not contain .

Each RTA is equivalent to some RTA in which there exists a single state whose interval label

contains , the label of this state is actually and no transition enters or leaves this state. 38 3. Real-time automata

3.1.3 The problem of complementation of real-time automata

The usual way of showing that a class of automata is closed under complementation is to prove that the automata can be transformed such that for each entry there exists a unique run that starts in an initial state, for then complementation would be accomplished simply by complementing the set of final states. The notions that assure the uniqueness of the run for RTA are state-determinism

combined with stuttering freeness:

LA Definition 3.1.8. ARTAA is language deterministic if each signal in is associated with a unique run that starts in an initial state.

A is stuttering-free if

q Q

there exists a state which is not connected to any other state and whose time label is

q

;

the time labels of all the other states do not contain ;

q r q r for each transition , .

A is state-deterministic if initial states have disjoint (state- or interval-)labels and transitions

starting in the same state have disjoint (state- or interval-)labels too:

r s r s Q q r q s r s

Whenever and either or then either or

r s .

A is called deterministic iff it is both state-deterministic and stuttering-free. The translations of these notions for transition-labeled RTA are the following:

Definition 3.1.9. A t-RTA A is transition-deterministic if it has a single initial state and for each

Q a a q state q and symbol , if two distinct -transitions leave then their time labels are

disjoint, i.e.,

q a I r q a J s I J I J r s If and then and .

A is stuttering-free if the time labels of the transitions do not contain zero and there are no two

q a I r r b J s

distinct adjacent transitions labeled with the same symbol, i.e., if then

b a .

A is deterministic if it is state-deterministic and stuttering-free. Proposition 3.1.10. The translations between RTA and t-RTA provided in section 3.1.2 are such that

state determinism in RTA is translated to transition determinism in t-RTA and vice-versa and

stuttering-freeness in RTA is translated to stuttering-freeness in t-RTA and vice-versa. It is clear that determinism implies language determinism. On the other hand, state-determinism without stuttering freeness does not imply language determinism, due to the nondeterministic na- ture of choosing the stuttering steps. But a more important observation is that stuttering-free RTA are strictly less expressive than general RTA: consider the language of constant signals with integer

3.1 Real-time automata and their regular expressions 39

n

L fa j n N g

length N , which is accepted by the RTA in the Figure 3.2. Observe that this RTA

is language deterministic.

q a

L

Fig. 3.2. An RTA for the language N . L

Proposition 3.1.11. N cannot be accepted by any stuttering-free RTA. L

Proof. The proof is based on the intuition that a stuttering-free RTA for N would need an infinite

number of states:

A Q T T f

Suppose we had a stuttering-free automaton which would recognize

L fag

N . We may consider since any state with other state-label cannot be in an accepting

A

run. Then, since the automaton is stuttering-free, . Hence the number of accepting runs in

R fg

equals the number of states that are both initial and final. Denote then the max in of

the time labels of these initial and final states. But then both the assumption and

lead to a contradiction:

q Q Q q hl h f

If then for some state we have that where is any left paren-

t

a t hl A thesis. Then any constant signal with would be accepted by , contradicting the

assumption that only signals with natural endpoints are accepted.

n

a n N n

On the other hand, if then any constant signal with , ,isnot

ut accepted by A, again a contradiction.

This proof can be easily adapted for showing that state-clock automata [RS97] cannot accept

the following language:

k t t

 

j k N t t R a b kb j L jb k b

N

A similar property can be shown for event-clock automata, but in which stuttering is replaced

by -steps. L

Despite Proposition 3.1.11, there is no problem in building a RTA for the complement of N :it is the RTA in Figure 3.3 below. Figure 3.4 below gives an example of how to transform some stuttering RTA into a stuttering- free RTA. Hence we discover the need of computing the “sum” of two intervals and the “star” of some in- terval, or, in a more formal setting, the need for defining some operations that satisfy the following

relations suggested by Figures 3.3 and 3.4:

n fg fg fg R and

40 3. Real-time automata

q a ra

Sig fag n L

Fig. 3.3. A RTA for the language N .

q a q a ra

a b

a c Fig. 3.4. The stuttering RTA at is equivalent to the stuttering-free one at .

3.2 The Kleene algebra of sets of real numbers

P R The powerset of the nonnegative numbers is naturally endowed with a concatenation op-

eration: it is addition extended to sets of reals:

Y fx y j x X y Y g X Y R

X for all

fg whose unit is the set .

We can also define a star operation via the usual least fixpoint construction

X nX

nN

X n X nX X where the multiples of X are defined as usual: and .

The following theorem can be easily verified:

P R P R

Theorem 3.2.1. The structure is a commutative Kleene al-

gebra (see Definition 2.1.3).

X R n X Because a complement operation is available, , we actually get a commuta- tive complemented Kleene algebra, i.e., a boolean algebra which is also a commutative Kleene algebra.

Note also that summation with singletons distributes over intersection:

xg Y Z fxg Y fxg Z f (3.2) but distributivity of summation over intersection is not valid in general as the following example

shows:

3.2 The Kleene algebra of sets of real numbers 41

3.2.1 Normal forms

Q Int Q Int

Denote K the sub-(commutative complemented Kleene) algebra generated by in

P R Q Int , that is the family of sets which can be obtained from intervals of by applying

union, summation, star and complementation.

X K Q Int X X

Definition 3.2.2. A set can be written in normal form if there exist two

k Q N N

finite unions of rational intervals and , such that

X X X fk g

(3.3)

X Nk X Nk N k and (3.4)

We call N the bound of the normal form.

X X We will work with normal forms in which and are unions of disjoint intervals. It is straight-

forward how to transform some normal form such that this property holds.

N Normal forms are not unique: for the normal form in the definition and some p , the

following expression:

X fpk g fk g X X X f k k p k g

p

is a normal form too, but with bound N . X

Any finite union of rational intervals X can be put into normal form: when is bounded from

X X fg dM e X

above by M then is a normal form with bound . When is unbounded,

X X M M N

suppose is some decomposition of it into disjoint intervals, where . Then

X X M dM e dM e dM e fg

M e

is a normal form with bound d .

X X X X fk g X

Proposition 3.2.3. For each set written into normal form as , iff

X X both and are empty.

This property, though trivial, has its own importance since we will use normal forms as time labels in automata and we still want to have a decidable emptiness problem for the resulting automata. Sometimes, after the application of different operations to normal forms we might not be able to get directly a normal form; instead, we might get a weak normal form, which is a decomposition as in Identity 3.3 but without the additional requirement 3.4 on the existence of the bound. As an

example we have the following:

fg X

is written in normal form with bound since we

X X

have , .

Y fg

is a weak normal form which is not a normal form: there

N N N N is no N such that and . 42 3. Real-time automata

However both expand to the same set:

X Y n n

n

Lemma 3.2.4. Weak normal forms can be transformed into normal forms.

X X X fk g M sup sup X sup X

Proof. Take some weak normal form and define .

Two cases arise:

M L R hL X h

1. . This means that there exists some such that , where denotes

L

nk L n k

some left parenthesis. Define then n , hence . Then for each

k

i n X fik g n k hL X

, . It follows that is a finite union of intervals

n

X fik g hL X X

i

and thus we know how to transform it into normal form.

M

M n n k M nk

2. . Define then , hence . Define further

k

n

X fik g nk Z X

i

n

nk n k Z X fik g

i

X Z n Z fk g

We claim that which is a normal form with bound .

j i j N

To prove this, observe first that for each i , ,

X fik g jk

Moreover, distributivity of summation of singletons over intersection (property 3.2) implies that,

N

for each j ,

X fjkg n j k X nk fjkg

X M nk i j i j N

due to the fact that . This also implies that, for each , ,

X fik g n j k

Therefore, by the same distributivity property 3.2 we get

3.2 The Kleene algebra of sets of real numbers 43

X fk g n j k n j k

nj

X fik g n j k n j k

ij

n

X fik g nk n k fjkg

i

and further

X X X fk g

X X fk g nk n j k n j k

j

n

X X fik g nk

i

n

X fik g nk n k fjkg

i

j

Z Z fk g ut

3.2.2 A normal form theorem

The key result for normal forms is the following:

K Q Int Theorem 3.2.5. Each X can be written in normal form.

Proof. We must show that the result of any operation applied to some normal forms can be put into

R

normal form. We first list some useful identities valid in P [Con71]:

X

X (3.5)

X Y X Y

(3.6)

X Y fg X Y Y

(3.7)

lcmp q gcdp q p q Q lcm We employ the notations and where as the generalization of

and gcd from integers. The formal definitions are:

lcmp q minfr Q j l m N lp r mq g

such that

pq

gcdp q

lcmp q

We also use the following ultimately periodicity property:

44 3. Real-time automata

n a Q i n fa a g

n For each distinct positive rationals i ( ) we have that

is ultimately periodic, i.e., there exist some finite set of rationals B and some rationals

q r Q

such that

fa a g B fq g fr g n (3.8)

This property can be seen as an equivalent form of the normal form theorem for regular languages

p q Q

over a one letter alphabet. For a direct proof note first that, given two rationals ,

lcmp q gcdp q fp q g B

B Q j lcmp q lp mq l m N fp q g lcmp q where . The

property 3.8 follows from this by induction upon the number of elements in the starred set.

X X X fk g M Y Y Y fl g

Fix now two normal forms with bound and

m lcmk l X Y

with bound N and denote . We then get the following form for :

mk ml

A

X Y fmg X fik g Y fil g

i i

This is a weak normal form and Lemma 3.2.4 shows how to transform it into normal form.

Y

For X , distributivity of over transforms it into:

X Y X Y fl g X Y fk g X Y fk g fl g

k g fl g fk l g An instantiation of identity 3.6 gives f . The ultimately periodicity property 3.8 gives a normal form for this set and thence we have above a union of weak normal forms which

we already know how to bring to normal form.

X X X

For we have two cases. The first one occurs when one of and contains a nonpoint

interval. Then the set X is a finite union of rational intervals, so we know how to bring it into

normal form.

a b b a

To prove this claim, note that for each nonpoint interval, e.g., (that is ), denoting

m



a

m a b ia ibm a m

, we have that since the choice of assures

b a

i

m a m b m

that . Hence from the -th iteration the intervals start to overlap.

X X X

The second case for is when both and consist of point intervals. Applying identity

X X X fk g X

3.6 we get . Then by the ultimately periodicity property 3.8 can be

X fk g

written into normal form, so we may concentrate on .

By identities 3.7 and 3.6 we further get

X fk g fg X X fk g fg X X fk g

X fk g

Finally the ultimately periodicity property 3.8 tells us that can be put into normal form and therefore this case reduces to a summation of normal forms.

3.2 The Kleene algebra of sets of real numbers 45 X For the strength of the normal form, i.e., the additional requirement on the existence of the

bound N helps us giving directly the result: it is

X Nk N k fk g X X Nk

ut and the bound of this normal form is N too.

In [CG00] it is proved that the set of finite unions of n-dimensional normal forms in which

X

forms a boolean algebra. The essential novelty in our result is the closure of -dimensional normal forms under star. Though the proof of Theorem 3.2.5 is based on the same technique that gives the normal form of regular languages over a one-letter alphabet, it cannot be a simple corollary of that result: even

if we restrict our attention to the algebra generated by intervals with natural bounds, denote it

Int fg

N ,wefindtwo generators: the point set and the nonpoint interval . Neither of them

g

may generate the other: f generates just sets with isolated points or complements of such sets

(i.e., countable or co-countable sets), while generates just finite unions of intervals (it cannot

fg generate ). One might also think that the result follows from Eilenberg’s theory of automata with multiplic- ities [Eil74]. But this is not the case either since in that work star is defined via some formal power

series and one cannot prove, unless defining some suitable equivalence on power series, that e.g.

. Int Finally note the interesting relation which holds between the two generators of N , showing

they are not independent:

fg (3.9)

3.2.3 Matrices of normal forms

At the end of this section we make a brief excursion into matrix theory. We construct, as in [Koz94]

n n P R

the Kleene algebra of matrices over whose operations are the matrix extensions of

P R

the operations in the Kleene algebra :

n

A B A B A B A B

ij ij ij ij ik kj

k

nA A

nN

A I n A nA A I n

where n and , denoting the unit for matrix summation, i.e.

i j g

f iff

I

n ij

j

iff i

If we write in detail the components of A we have:

46 3. Real-time automata

A A A j i i n m N f j i j g A

ii i i i j m

ij (3.10)

m

  

The star of a matrix A can be computed by the following well-known Floyd-Warshall-Kleene

Ak k n algorithm [Con71, Eil74]: we recursively define a sequence of n matrices ( )

with

A A I

n

Ak Ak Ak Ak Ak

kj ij ik kk

ij (3.11)

An A P R

Proposition 3.2.6 ([Eil74]). for any matrix over .

nA mA n mA

kj ij

The classical proof may run as follows: one proves first that ik .

A A A A Ak A

kj ij kk ij ij This implies that ik . Then one shows that by

induction on k and hence get the left-to-right inclusion.

The right-to-left inclusion follows by proving that

A A A j i i k m N f j i j g Ak

ii i i i j m ij

m

  

by induction on k .

A Corollary 3.2.7. If A is a matrix of normal forms then can be transformed into a matrix of

normal forms too.

A i j A

Corollary 3.2.8. For each matrix of normal forms if for all indices we have that ij

i j A

then for all indices , ij .

i i n A

m ii

Proof. This is a corollary of relation 3.10: for any consider the sum



A A i j p m i i

i i i j p

. As we assumed we must have some such that p .

m

 

A i

Thence i and therefore the sum itself does not contain . By identity 3.10, it follows that

p

p

A ut

ij .

i A

Note however that for any , ii will always contain .

3.3 Determinization and complementation of RTA

The above theory suggests that “periodic” constraints may replace intervals in RTA:

A Q Q Q f

Definition 3.3.1. An augmented real-time automaton is a tuple where

Q Q Q Q K Q Int f , , , , and are the same as in RTA while (actually gives a normal form).

Augmented RTA work similarly to RTA: runs have the same definition and a signal is associ-

t t

n



q q q

n i

n ated with a run i iff The emptiness problem is again decidable

3.3 Determinization and complementation of RTA 47

Q q in linear time w.r.t. car d . Note that we need a preliminary step in which states whose interval label denotes the empty set are removed. It is here where we need Proposition 3.2.3. The different notions of determinism remain unchanged for augmented RTA; hence we will speak of state-deterministic augmented RTA and stuttering-free augmented RTA in the sense of Definition 3.1.8.

We also have a transition-labeled version of augmented RTA, called in the sequel augmented

B Q Q Q f

t-RTA, which are tuples like t-RTA, the difference being that the transition

Q K Q Int relation is time-labeled with normal forms instead of just intervals:

Q. The different notions of determinism in Definition 3.1.9 are the same for augmented t-RTA, the translations between RTA and t-RTA and back from subsection 3.1.2 work with augmented automata too and Proposition 3.1.10 is valid for augmented automata. The following theorem says that we do not increase the expressive power of RTA if we use

normal forms instead of mere intervals:

Theorem 3.3.2. TReg equals the class of languages accepted by augmented RTA. The proof is very close to the one of Theorem 3.1.5 and is based on the following property of

regular expressions:



ka k ka a a k

abcdfk g ab cd

fk g

Of course, we also have to redefine regular expressions allowing normal forms as indices for atoms. The first step in determinization is the achievement of stuttering-freeness and the proof runs smoother for augmented t-RTA: Theorem 3.3.3. Each augmented t-RTA is equivalent to some stuttering-free augmented t-RTA. Proof. As a preliminary step, in the given augmented t-RTA we remove zeroes from the time labels by applying Proposition 3.1.7, slightly modified for handling normal forms instead of mere

intervals. We also assume that all transitions with empty time label have been removed.

a

We achieve stuttering freeness by removing all stuttering a-transitions for some , and then

q q

repeating this for all the other letters in . The idea is to find, for each pair of states the set

q q of durations of signals that are associated with runs starting in , ending in and containing only

a-transitions. For this we need to recursively add all the intervals of the transitions that may lie on such a run. This is the place where we apply the normal form theorem 3.2.5 and the algorithm for

computing the star of a matrix of sets of positive numbers. The formalization is the following:

A Q q Q Q f

Start with some augmented t-RTA and number its states as

fq q g A a p

. Construct a matrix whose elements are the interval labels of the -labeled states:

q a X q X j

iff i

A ij

otherwise

A q q

i j Then ij consists of the lengths of signals associated with runs starting in , ending in

and consisting of a-transitions only. More formally,

48 3. Real-time automata

X X j r a X r A A

m i i i jk

n

i is a run in and

f j j k g r q r q

j n k (3.12)

This fact is a corollary of identity 3.10.

Computation of A is done by the Floyd-Warshall-Kleene algorithm (3.11). Note here the im-

portance of Corollary 3.2.7: the elements of A are still normal forms, hence they may be used

for labeling some new transitions of an augmented RTA. Hence, while non-a-transitions will be

a

preserved, the nonempty components of A will replace all -transitions: their time label will be

A a

ij and they will be connected only to states from which no other -transition is issued.

Q Q fq j q Qg

Formally, consider a disjoint copy of , i ; the primed states will be reached

i

a B Q Q Q Q Q Q f

exactly after an -transition. Build then where is the set of

f f

copies of final states and

q b X r q b X r j b a q b X r

n o

q a A n fg q j A n fg

i ij ij j

The need for removing zero from the new transitions comes from the fact that we do not want to

add stuttering steps involving the other symbols from .

r a X r A B

i i i

The equivalence of and follows from the observation that a run i in

i n

B

A associated with some signal , can be transformed into a run in for by replacing all maximal

a A sequences of a-transitions with the appropriate -transition time-labeled from and by priming the state that follows this transition.

Observe that, by construction, no two a-transitions are directly connected. On the other hand,

all non-a-transitions involving nonprimed states are just copied, hence no stuttering transitions are

added on these states. Finally, the primed states are not involved in any stuttering transitions since a

they are targets of a-transitions and sources of non- -transitions. This shows that by recursively

ut

applying this construction for all letters in we end with a stuttering-free augmented RTA.

car d

car dQ

The number of states in the final t-RTA is , since at each step the states are

at most duplicated. Concerning the number of transitions, note that, for each a , at each a step the number of a-transitions is either doubled (if is not chosen at that moment for stuttering

elimination) or squared. Since there is a single step in which the number of a-transitions is squared,

car d

a m m

an upper bound for the number of -transitions would be , where a is the initial

a a number of a-transitions. Note that the earlier we choose to eliminate the stuttering -transitions, the

smaller the number of a-transitions we obtain. This is because squaring would apply to a smaller number of transitions. The last step in the determinization process is the achievement of determinism in stuttering-free automata. This time, the construction works smoother for state-labeled automata:

Theorem 3.3.4. Each stuttering-free augmented RTA is equivalent to some deterministic aug- mented RTA. 3.3 Determinization and complementation of RTA 49

Proof. Note that, as we work with state-labeled RTA, the given stuttering-free RTA has the special

q

initial state whose time-label is and which is not connected to any other state.

B Q fq g Q fq g Q

f

Start with a stuttering-free augmented RTA with

q Q S Q S a

. For some subset of states we write as a shortcut for saying that all states a in S are identically labeled with .

If B were untimed, the states of the deterministic automaton would have been identically state-

Q S S a S

labeled subsets of and we would draw a transition from some with to some

S b S fr Q j q S q r g

with iff s.t. . Taking into account the time labels S

is done by splitting into several “smaller” states, each one with its distinct time label, such that

R

their time give a partition of .

Q U a U

To each U with we associate the set of time labels appearing in :

U fX K Q Int j q U q X g

Tl s.t.

S S a a S S Q S a

Let R denote the set of triples where and with . Define then

S S a a

and

S S a R TlS TlS n S

T S

R

where the usual conventions and apply. Intuitively the control passes through

S S a B S

iff in the control may pass through some state in but not through any of the states

S n S R S S a in .Weput in front of because otherwise we would lose stuttering-freeness.

Also note that it is here where we need the result that normal forms are closed under complemen-

S S a S S a tation, because we need to put into normal form and contains complemen-

tation.

C R fq g R R

f

Hence we build in which

S S a R U U b

consists of transitions going from each to each tuple defined by

U r q q b U U Q j r S

q s.t. and and

Case U stands for the situation when the length of the current state in the signal is not in

TlU a R any of the sets from . Note how states time-labeled with play the role of the trap states in finite automata.

initial and final states are

n o

R S S a Q j S q Q j q a fq g

Q j S Q q j LB R S S a

f f B The proof that C is equivalent to proceeds by induction on the number of discontinuities in

a signal. The construction assures that, at each discontinuity, exactly one state can be chosen such t that the control goes to that state. u

50 3. Real-time automata

The complexity of this construction is exponential in the number of states: by denoting n

n

k

Q S S a car dS k

car d , observe first that the number of states where is at most k

(at most due to the fact that some sets of states S might not be consistently state-labeled). Therefore

the cardinality of R is at most

n

X

n

n k

k

k

Theorem 3.3.5. TRec is closed under complementation.

The universality problem for TRec is decidable.

Proof. This is a corollary of Theorem 3.3.4 and Proposition 3.1.2. The important property provided

by the construction of the deterministic augmented RTA C in this theorem is that each signal T

(including the empty signal!) is associated with a unique run that starts in . Hence the augmented

n LC C ut RTA that accepts Sig is obtained by complementing the set of final states of .

Let us finally underline the need for theorem 3.2.5 in determinization: in our construction, we actually build an automaton whose time labels are in fact extended regular expressions (i.e., using complementation) over intervals. In the absence of theorem 3.2.5, such an automaton would not be an augmented RTA any longer and we would be in no position to decide whether, after complementing the set of final states, the resulting automaton would still be an augmented RTA. This would make questionable the decidability of the universality problem.

It is actually this problem what stops the application of the determinization construction for Int RTA whose time labels lie in a class larger than Q in which comparison of the time bounds is effective - for example, the class of intervals whose bounds are algebraic numbers. If this class of intervals is chosen for time labels, it is unclear whether the universality problem remains decidable.

3.4 The Pumping Lemma and expressiveness issues

N N

Lemma 3.4.1 (Pumping Lemma). If a language L is accepted by a RTA then there exists

N

such that each signal having at least discontinuities can be factored into three signals

n

n N L

, such that contains at least one discontinuity and for any we have .

Proof. The proof of this lemma is almost the same as in the untimed case, the difference lying

A Q Q Q f

in the reference to discontinuities. Take a stuttering-free augmented

N car dQ L N

RTA accepting L and define . It is clear that each signal having

discontinuities must be accepted by some run having exactly N states, hence one of the state

must be repeated throughout the run. Since we assumed that A is stuttering-free we cannot have

self loops at the repeated state. Hence the part of the run which can be repeated must contain at

ut

least two distinctly state-labeled states and therefore must contain a discontinuity.

3.5 Stuttering-free concatenation 51

L f Sig fa bg j g

Proposition 3.4.2. The language nonr eg is not real-time

regular.

L L nonr eg Proof. Supposing nonr eg is real-time regular, we may pick up a signal such that its

number of discontinuities is more than the natural number N provided by the Pumping Lemma.

An example is the signal

k k k k

N N N N

b a a a

z

N times

k k k k

N t a t t b t

that is, for each k , for and for .

N N N N

Then by the Pumping Lemma can be factored as such that has at least a

n

L n N n

nonr eg

discontinuity, (and hence ) and for any . But then

n N ut

for all , which is in obvious contradiction with .

It is easy to build a state-labeled timed automaton [ACM97] with a single clock accepting

L U L nonr eg nonr eg . Note also that the untiming of this language is a regular (untimed) language.

3.5 Stuttering-free concatenation

Theorem 2.4.6 seems to provide a disappointing result concerning the possibility to have some results on syntactic monoids for real-time languages. But it this is not the case: we just have to

shift our attention to other monoidal structures on the set of signals.

A “quasimonoidal” structure on Sig arises if we consider a partial concatenation operation

as follows: for each nonempty signal ( ) we define the last symbol occurring

last lim t last U

in as . Alternatively, is the last letter in . The partial operation,

t

Sig n f g

called stuttering-free concatenation, is defined as follows: for each

,iff last

(3.13)

undefined , iff last

Sig

Further, for any , put

We may easily extend this operation to a total one, by augmenting Sig with a fresh symbol

, standing for “undefined”, which becomes a “zero element”:

Sig

Of course then instead of having undefined we put .

Sig Sig Sig fg

Proposition 3.5.1. is a monoid, where .

Hence the whole theory of regularity from Chapter 2 applies. It should be noted that, though

we have augmented Sig with the “undefined” element we may still define regular subsets of

Sig

Sig as those that are regular in . The question is whether in this case we will not get

again “uninteresting” regular languages. We will show that this is not the case by relating Sig - regular languages with languages accepted by real-time automata. 52 3. Real-time automata

3.5.1 Syntactic monoids for stuttering-free concatenation and real-time automata

In this subsection we prove that real-time regular languages can be characterized by inverse monoid

morphisms whose domain is Sig . Unfortunately the mechanism of finite monoid recognizabil- ity does not give in general finite representations of the generated class of timed languages. This means that, by inverse monoid morphisms, we may obtain languages in which the timing informa-

tion might not necessarily consists of time intervals, like in real-time regular languages.

L Sig fag

Take, for example, the language dirichlet consisting of signals whose length is a

rational number

t

L fa j t Q g

dirichlet

fe ag M

This language can be given as the inverse morphic image of the subset in the monoid

e a g aa

f in which , under the morphism

Sig M

t Q a

iff

t

a

t Q

iff

because

 

t t t t

a a a a

We interpret this as the fact that monoid recognizability and finite generation of timed languages

are “orthogonal” properties1.

To cope with this problem, we will utilize here extended RTAs here, which are tuples A

Q Q Q f

like RTAs, but in which the only constraint on is that it gives a finite set of tuples

Q PR Q q X r X R

, that is, each tuple might contain any subset of reals .For this class of automata, all the closure results, including complementation, hold. The only property

that is not valid is their decidability.

LA Sig

Theorem 3.5.2. Given some RTA A, is a -regular language. The following reverse

Sig L L nfg also holds: for each Reg -regular language , is accepted by some extended RTA.

Proof. We will utilize here transition-labeled RTA.

B L L

Assume is some t-RTA accepting , with . Using the Theorems 3.3.3 and 3.3.4, we

C Q q Q B f

get a deterministic t-RTA with the same language as . Then define the

t

Sig A Q f g Sig q Q a f

-automaton as follows: for each constant signal , with

a t R and , define

 Observe that the algebraic characterization in [BPT01] also is insensitive to finite presentation of each set in the finite decompo-

n

R  sition of  .

3.5 Stuttering-free concatenation 53

r X R q a X r t X

iff s.t. and

t

q a

otherwise

r r Q

This definition makes a partial function, i.e., correctly defined: we cannot have with

q a X r q a X q X X t

for some , both containing since we would contradict

transition determinism in B .

t



a

Then extend on all signals using again the decomposition property 2.11: for each

t

t t



n n

q q a a q q q a

we put . Finally put and .

n n

By now is a total function.

A LC C

The equality L follows from the deterministic character of . The left-to-right

A LC

inclusion L is straightforward. For the right-to-left inclusion, observe that, by deter-

q a X q C LC

i i i

minism of we have that iff there exists a unique accepting run i

i n

a a i n

i

associated with . As an outcome of stuttering freeness we have i for all .

t

t



n

a a t X i

This implies that for the (unique!) decomposition , we must have i and

n

t

i

q a q

i

i and therefore

i

t t

t t

 

n n

q q a a q a a q Q

n f

(3.14)

n n

LA

and hence .

Reg Sig Sig

For the reverse implication, take some set L , hence there exists some -

A Q Sig q Q L LA f

automaton such that .

q r Q a X a q r R a

For each and define as the set of lengths of -signals which r

lead from q to :

t

X a q r ft j q a r g

B Q fq g q Q

f

Define the (extended) t-RTA where

q a b X b q r r b j a b a b q r Q

q a X a q q q a j a q Q

L q B

(If then just add to the set of final states). Note that in no two transitions with the same B

-label are consecutive, i.e. is a stuttering-free extended t-RTA.

B LA

The equality L follows by the decomposition property 2.11 and the stuttering- A

freeness of A which assures that, when a signal is associated with some run in , the decomposi- t

tion points that witness this must be exactly the discontinuity points within the signal. u

Sig Hence Sig -regular languages are “more interesting” than -regular languages, since the class of languages accepted by RTA contains nontrivial examples with timing information. At the end of this chapter we will give a simple property which argues our view of monoid

recognizability being orthogonal to finite generation.

Sig

Denote first e the class of signals whose discontinuities occur only at rational points

and whose endpoints are rational too. We call a Sig -automaton effective if there exists some

54 3. Real-time automata

Sig q r q r

algorithm for deciding, for each signal e and states , whether .Onthe

X A X Q other hand, we call a RTA effective if for each time label in , the set is a recursive

set.

Proposition 3.5.3. The translations provided by Theorem 3.5.2 are such that effective Sig -

automata are translated into effective deterministic RTA and vice-versa.

X

Proof. The first implication is straightforward, since the algorithm for deciding whether p , p

for each time label X and rational , is a particularization of the algorithm provided by the given

effective Sig -automaton. For the reverse we will consider the following algorithm: for each

Sig q r Q

signal e and pair of states , consider all the paths in the RTA which lead from

r U

q to , whose number of transitions equals the number of symbols of (the untiming of ) and

i U are such that the i-th transition is labeled with the -th symbol in . This set is finite for each signal due to stuttering-freeness of the RTA. Then, for each such run, using the algorithm provided

by the given RTA, check whether the length of the i-th constant component of the signal is in the

ut time label of the i-th transition within the run.

We have no answer to the questions whether the other constructions in this chapter (concatena- tion, star closure, complementation) preserve effectiveness. 4. Timed automata

This chapter gives an outlook of semantics of timed automata [AD94] and of the clock regular expressions [BP99, BP01] that can be associated with them. We remind the Kleene theorem which connects them, and provide an alternative proof of this theorem for regular expressions that use indexed concatenations, theorem first proved in [BP99]. We also remind an alternative semantics for timed automata, semantics which makes reference to reset points rather than clock values, like in [BJLWY98]. The clock valuation semantics and the reset clock semantics are interchangeable, but we will see in the next chapters that the latter works better for timed regular expressions. The chapter runs as follows: the first section presents clock valuations and clock constraints. In the second we remind the semantics of timed automata as timed transition systems, and show how this semantics can be transformed into a compositional one, such that clock regular expressions be equivalent to timed automata We also present here the alternative proof of the Kleene theorem for clock regular expressions with indexed concatenation, proof which is based upon the possibility to define classical regular expressions with indexed concatenation. The final section presents the reset time semantics for timed automata.

4.1 Clocks and clock constraints

Throughout this and the subsequent chapters, will denote the countable set of symbols

fx x g n fx x g

n n while n will denote the first symbols from , . We name

the symbols from as clocks as they will be used to remember the time passage in the class of automata under study here. From these symbols we construct logical formulas which will be used to express constraints on clocks values which are to be satisfied at different moments while

processing the signal.

An atomic clock constraint over n is a formula of the following type:

x U i n U N Int

i , for some and nonnegative interval ;

x x U i j n i j U Z Int j i for some and interval .

Observe that we allow also comparisons of clocks w.r.t. negative intervals.

An elementary clock constraint over n is conjunction of the form

n

x U x x U

i i j ij

i (4.1)

i

ij ni j

56 4. Timed automata

where each conjunct is an atomic clock constraint. A clock constraint over n is any boolean

combination of atomic clock constraints. The set of clock constraints over the set of clocks n is

C EC

n n

denoted n , while the set of elementary clock constraints over is denoted .

v R

A clock valuation is a function . Usually we are interested only in the values

n v v R

associated with the first clocks, that is, in the restriction n , which we denote

n too.

Clock valuations can be extended to interpretations of clock constraints in the well-known way:

x U x i

First, each atomic clock constraint i is interpreted by “replacing” the clock with its

v x

value i and then computing the truth value of the resulting formula, where the symbol is interpreted as membership.

Then the boolean operations are applied to the resulting truth values to get the truth value of the

whole interpreted formula.

j C C v

We denote v if the interpretation of induced by is the truth value “true”.

n

v R n v R

We will identify a clock valuation n with an -dimensional point .

Therefore we may import different operations on n-dimensional points to clock valuations. The

two operations we use in the sequel are

n

v R t R v t

1. Addition with a nonnegative integer: given and , we denote the clock

v tx v x t n

valuation defined by n .

n S

2. Resetting the set of clocks in X , or, equivalently, projection onto a subspace of

n

R S fx j i X g v R v X

i

defined by the equations :given , we denote the

clock valuation given by

X

iff i

v X x

i

v x

i otherwise

4.2 Timed automata and their clock valuation semantics

n N

In the sequel we fix a set of symbols and a positive integer .

n A Q Q Q Q f Definition 4.2.1. A timed automaton with clocks is a tuple where,

denotes the (finite) set of states, denotes the transition relation

Q EC P Q card n

n with

Q Q Q Q f denotes the state labeling function , and are the sets of initial, resp. final states.

The classical way to give semantics to each timed automaton is to build a timed transition system first from the specified automaton, then to consider the set of accepting runs in this transition system, and finally to concatenate the labels of all transitions in each such run in order to get

4.2 Timed automata and their clock valuation semantics 57

the set of signals accepted by the given timed automaton. Hence, given a timed automaton A

Q Q Q f

, we associate a transition system whose set of configurations is the uncountable

n

q v q Q v R

set of tuples comprising a state and a clock valuation and whose transitions

are classified as instantaneous or timed, as they are produced either by a transition in or by the A

passage of time while resting in a state q . Formally, the timed transition system associated with

n n

T A Q R Q f g Q R

n f

is where

q v q v j q C X q v j C v v X

such that and (4.2)

t n

q v a q v t j v R a q

(4.3)

We call transitions of the form 4.2 as instantaneous transitions while those of the form 4.3 are

called timed transitions.

q v z q v

i i i i

In this transition system, the set of runs is the set of sequences i

ik

q v z q v i k q Q

i i i i i

with for all .Anaccepting run is a run in which ,

v q Q A

n k f and . The language accepted by is then the set of concatenations of labels of

transitions of each accepting run:

LA z z z j q v z q v

k i i i i i

is an accepting run

ik

z z q v z q v

k i i i i

We also say that the signal is associated with the run i .

ik This semantics has the drawback of being unstructured and hence noncompositional. To make

it compositional, we observe first that we may consider only runs in which instantaneous and timed

A

transitions from T alternate. More formally:

q v

1. Suppose that in some run we have two consecutive instantaneous transitions, and

q v q v a q v a . Then insert , where is an arbitrary letter, in between them. Do

this for all such consecutive occurrences.

t

q v a q v t

2. Suppose now we are given a run in which two timed transitions and



t

q v t b q v tt a b

are consecutive. Since is a function, we necessarily have .



tt

q v t t q v a Then replace these two transitions with a single timed transition .

We may therefore join together timed transitions with instantaneous transitions and “forget” the

A

in-between configuration. Hence, we transform the transition system into the following: T

n n

Q R Q f g Q R

n f

where

t

q v a q v j q C X q

such that

t j C v v X q a

v and

t

i

q v q v a

i i i

For such a transition system, a run is defined as a sequence i in which

i

ik

t

i

q v q v a i k

i i i

i for all . Accepting runs have the same defining require-

i

A A ments as in the timed transition system T , and the language accepted by still consists of the concatenation of the labels on each accepting run, that is,

58 4. Timed automata

t

t

t

i 

k

j q v a LA a a q v T A

i i i i

k is an accepting run in

i

ik

card q C X q

Further we may split into a union of sets, one set for each tuple :

t

q X C q q v a q v j v t j C v v X q a and

Then we “classify” accepting runs according to the sequences of tuples of transitions in which

q C Y q

i i i

are employed. That is, we define a -sequence as a sequence of tuples i ,

ik

each tuple belonging to . We also define the set of runs subsumed by the -sequence

q C Y q

i i i

i as follows:

ik

t

i

S q v a i k q v j

i i i

i for each

i

ik

t

i

q v a q v q C Y q

i i i i i i i i i

This set naturally provides a set of signals which are associated to :

t

t

t

 k i

L a a j q v a q v

i i i

i is a run subsumed by (4.4)

i

k

ik

In order to define accepting -sequences we must observe that these have to insure that all the

n

R

subsumed runs must begin with the zero clock valuation n . Therefore an accepting -

sequence must not only start in an initial state and end in a final state, but also contain an initial

tuple whose constraint imposes that all clocks are zero.

q C Y q

i i i

Formally, we define an accepting -sequence as a sequence i in which

ik

n

x q q Q C Y q Q q C Y q

i k f i i i

i is a -sequence, , , , and .

ik

i

n

x q q

i

This amounts to adding all transitions to and requiring that all accepting

i runs start with one of these.

As a consequence, we get that

LA A j L is an accepting -sequence in

Let us now observe that we have, in some sense, already separated an abstract level, in which runs have exactly the classical meaning as sequences of transitions in an automaton, and a “seman-

tic” level, in which each abstract run is interpreted as some set of signals.

A new step consists of hiding away from states: note that, when building the set L , the

information regarding states is used only for retrieving the symbols that compose the associated

a C Y

i i i

k

signal. Hence we may build abstract runs i for which, if we consistently add

S T A

states into tuples, we get a -sequence. And further build the set of runs in which are

L subsumed by and the set of timed words associated with . This is nothing but the spirit of the Kleene theorem.

4.2 Timed automata and their clock valuation semantics 59

a C Y

i i i

k

More formally, an abstract run is a sequence i for which there exists

q C Y q q i

i i i i i

k

a sequence of states i such that is a -sequence and for each

ik

q a S L n

i

, i . The two sets and are then built as follows:

t

i

S v a v j q

i i i

k

i such that

i

ik

t

i

q v a i k q v q C Y q

i i i i i i i

i for all

i

t

t

t

 k i

L a a j v a v S

i i

i

k

ik

Again it is easy to see that

L A j A L is an accepting abstract run in

Moreover, the Kleene theorem for finite automata assures us that the set of abstract accepting runs

a C X EC

can be generated by some regular expression over atoms of the type n

P

n .

Now we observe that the sets S can be subject of a concatenation operation by matching

a C Y

i i i

k

on the clock valuation “in the middle”. That is, given two abstract runs i and



b C Y

i

k

i ,

i i

t t

i i

S j v c v S S v c v

i i i i

 and

i i

ik ik k

t

i

v c S v

i i



i

ik k k

c a i k c b i k k k

i i ik That is, we have that i for and for .

And the final observation is that the intermediary clock valuations in each run belonging to

L

some S , are useless, both for concatenation purposes and when constructing the set of

v v

signals associated with . We mean that we may consider only tuples consisting of a

n

Sig v v R

signal and two clock valuations , tuples which are called signals with

clock valuations. Then the set S may be replaced by the following:

t

t

 k

a S v a v v v v v j v q

k i i

k ik

i such that and

k

t

i

q v q C Y q q v a

i i i i i i i i

i

while the set L could be described as:

t t

t t

 k  k

L a j v a v S a a

k k

S Clearly, the concatenation operation on sets S could be easily “adapted” to the sets . We may summarize the above not completely formal discussion as follows: we define first the

set of signals with clock valuations

n

Sigclk v v j Sig v v R

60 4. Timed automata

We then define a partial operation of concatenation on Sigclk as follows: for all pairs of

v v v v Sigclk

signals with clock valuations ,

v v v v

iff

v v v v

otherwise

We further extend this partial operation to a total operation on subsets of Sigclk by putting,

S S Sigclk

for each ,

S S j S S

and

whose unit is the set

n

S v v j v R

Sig

By the usual least fipoint construction, we then get the star operation: for each S ,

n

S S

nN

n n

S S S S S n N

where and for all .

Definition 4.2.2. The set of n-clocked regular expressions as the language generated by the fol-

lowing grammar:

a C X j E E j E E j E j j

E (4.5)

C EC a X n

where n , and .

CReg

We denote the set of n-clocked regular expressions as .

n

k k CReg

The semantics of n-clocked regular expressions is an application

n

Sigclk

P inductively defined as follows:

kk

kk S

t

v v tX a C X k v a v j v t j C

k and

kE E k kE k kE k

kE E k kE k kE k

kE k kE k

Besides this, each n-clocked regular expression is endowed with an abstract semantics, which

EC P n

is the classical semantics as a set of words over n . We denote this abstract

j j CReg EC P n

semantics as n . n The following property, similar to the Proposition 3.1.4 and relates the abstract semantic of a regular expression to its semantics in terms of signals with clock valuations:

4.2 Timed automata and their clock valuation semantics 61 E

Proposition 4.2.3. For each n-clocked regular expression ,

kE k k k j jE j

E j E in which j is the abstract semantics of .

Proof. By easy structural induction over the n-clocked regular expression. The following straightforward property shows how to associate signals with clock valuations to

-sequences:

A Q Q Q f

Proposition 4.2.4. Given some timed automaton consider some -sequence

q C X q E n

i i i

i and denote the following -clocked regular expression:

i k

n

x q C X q C X E a

k k k

i (4.6)

i

Then

L kE k

in which L is the language associated with a run, defined in Identity 4.4 above.

TRec

Define also the family n as the family of timed languages which are accepted by some

TRat n

timed automaton and n as the family of timed languages which are the semantics of -

V

n

x E a E CReg

clocked regular expressions of the form i where .

n

i

Theorem 4.2.5 (Kleene theorem for timed automata, [BP01]). The classes of timed languages

TRec TRat n n and are equal, and the equality is effective. Proof. Corollary of the classical Kleene theorem and the Lemma 4.2.3.

4.2.1 A Kleene theorem with indexed concatenation

In [BP99], another Kleene theorem is presented in the framework of indexed concatenations and stars. There is a natural question concerning the connections between this result and the above Kleene theorem 4.2.5. We show here that these results are intimately related and it is still the classical Kleene theorem which can be put at the basis of both. This proof can be seen as a rear- rangement of the proof in [BP99].

In the cited paper, the semantics of timed automata is given in terms of constrained generators:

n

G G R P Sig

a constrained generator is a pair consisting of two mappings and

n n n

R Sig P R u Sig v R

, with the further requirement that for each and ,

G v v u

u iff .

1

G The aim is to associate, to each sequence of transitions , , a pair that gives the following

information: 

Actually, to each set of sequences, see the definition of n-clocked regular expressions with indexed concatenation.

62 4. Timed automata

n

v R G v

For each , gives the set of signals that describe the behavior of the timed automaton

v through the sequence of transitions , if the automaton starts with the clock valuation and

can be performed starting with v .

n

v R Sig v

For each and , gives the possible clock valuations in which the

timed automaton might arrive after performing the sequence of transitions , provided it starts

with the clock valuation v and is able to “parse” the signal .

C EC a

More specifically, with each clock constraint n and symbol , the following

atomic constrained generators is associated:

t

G v a j v t j C

aC

t

v a v t j v t j C

aC

The idea is then to build regular expressions over such atoms, and the full expressivity is ac-

quired only if concatenation might reset some clocks. This feature is brought in by defining indexed

G G

concatenations as follows: given two constrained generators and , and some sub-

X X G G

set n , the -indexed concatenation of with is the constrained generator

G G G

X

denoted as and defined as:

G v j G v v v

and there exists some such that

G v X

G v v v v v X j

for some and

X

Each of these indexed concatenations induce naturally an indexed iteration, denoted and

defined as follows:

i

X X

G G

i

where

X

G G

i i

X X

G G G

X

X

G

Observe that does not “contain” the zero iteration.

G X

Let us also denote the following constrained generator:

G v fg

(4.7)

v fv X g

X (4.8)

IReg The set of n-clocked regular expressions with indexed concatenation is the set

defined by the following grammar:

X

E j j a C j E E j E E j E X

4.2 Timed automata and their clock valuation semantics 63

C EC a X n where n , and .

Their semantics is given by the following rules:

kk kE E k kE k kE k

kk G kE E k kE k kE k

X X X

X X

ka C k G kE k kE k

aC aC

We will show here that there exists a straightforward bidirectional translation between clocked

regular expressions with indexed concatenation and n-clocked regular expressions, translation which works by regarding constrained generators as sets of signals with clock valuations and vice- versa. This translation relies on a simple property of untimed languages which deals with indexed concatenations, property which we will state and prove here. In other words, we show that the Kleene theorem from [BP99] is a corollary of the Kleene theorem for finite automata too.

In order to relate the constrained generators semantics with the signals with clock valuations

X n

semantics, let us consider, for each n , the atomic -clock expression

n

x X a

i X

i

whose semantics is

n

k k v v X j v R

X

a C X

Then each n-clocked atom can be decomposed as follows:

ka C X k ka C k k k

X

C EC a ka C k

The next observation to be made is that, for any n and , is the n

2 n

R Sig P R G

aC aC aC

graph of the function from the constrained generator

v v j v v G

aC aC

– that is, the set . Moreover, gives the “domain” of , that

n

v R Sig v

is, the set of tuples for which . This observation can be easily

generalized as follows:

Sigclk

Each set of signals with clock valuations S is the graph of the second compo-

G

nent of a constrained generator .

G

X In particular, the sets X are the graphs of the constrained generator defined in the Iden-

tities 4.7 and 4.8.

Two other important observations to be made are that concatenation of subsets of Sigclk cor-

responds to the -indexed concatenation of constrained generators, and that nonemptyset-indexed

concatenation of constrained generators can be reduced to -indexed concatenation by the aid of

G X

the constrained generators :

 n

R

 In fact, this is even a partial function with values in  .

64 4. Timed automata

S S

Proposition 4.2.6. Given two sets of signals with clock valuations and which are the graphs

G G S S S

of the constrained generators , resp. , the set is the graph of the

G G

constrained generator .

Moreover,

G G G G G

X X

Proof. By straightforward verification.

By now, we may state the following:

k C EC i k k a i k

n i

Lemma 4.2.7. Given constraints i ( ), symbols ( )

k Y i k n n and subsets i ( ), consider the -clocked regular expression (without indexed

concatenation)

E a C a C a C

Y Y k k

 k

and the n-clocked regular expression with indexed concatenation

F a C a C

Y Y k k



k

E k kF k Then k is the graph of the second component of .

Proof. By induction on k , using Proposition 4.2.6 for the induction step.

To complete the claimed connection we introduce regular expressions with indexed concatena- tions for untimed languages and prove a simple property concerning the translation from expres-

sions with indexed concatenations into classical regular expressions: Definition 4.2.8. The set of regular expressions over with indexed concatenations from is

defined as follows:

x

E j j a j E E j E E j E

x

x fg

where a and .

The semantics of these expressions is in terms of languages over as follows:

jj jE E j jE j jE j

jj fg jE E j jE j fxg jE j

x

x

jaj fag jE j jE j fxg jE j

Lemma 4.2.9. The set of languages which are the semantics of a regular expression with indexed

concatenation from equals the set of regular languages over . 4.3 Reset time semantics for timed automata 65

Proof. The direct inclusion is a straightforward consequence of the semantics of regular expres- sions with indexed concatenation.

The inverse inclusion follows by induction upon the structure of the (classical) regular expres-

a x

sion: the base cases , and are trivial, while for we consider the expression x .

E E

For the induction step, suppose that we have two classical regular expressions and and

E E jE j jE j

that we have built regular expressions with indexed concatenation and such that

jE j jE j

and . Then

jE E j jE E j

jE E j j E E j

jE E j j j ut

As a corollary we have

TRat Theorem 4.2.10. The classes TRec and are equal to the class of timed languages

which are the semantics of some clocked regular expression with indexed concatenation of the

V

n

a x E E IReg

form i , where .

n

i

Proof. This is a corollary of Proposition 4.2.6 and Lemma 4.2.9: for the direct inclusion, we trans-

n E EC

form each -clocked regular expression into a regular expression over n with

E j X n indexed concatenation over X , denote it . At this point we introduce the

timed semantics of such a regular expression as the union of the timed semantics of all the words

in its abstract semantics over .

X X

Now, we only have to replace operations of the form with and with and

X X

observe that, for any clocked regular expression with indexed concatenation E ,

kE k kW k j W jE j

E j E

where j is the semantics of as a regular expression over . This property, corroborated

E

with Proposition 4.2.6, assures us that the timed semantics of (with the replacements of

X

E

with X ) equals the timed semantics of . t The reverse inclusion follows by the same argument. u

4.3 Reset time semantics for timed automata

In this section we will show another semantics that can be given to timed automata, semantics originally proposed in [BJLWY98]. The idea is to record the reset time for each clock, and the

current time. In other words, we only make a change of variables, from clock values to reset times,

x n

change of variables defined as follows: for each clock i ,

v x t r x t i i where is the “current time point”. 66 4. Timed automata

Though this semantics is almost the same as the clock valuation semantics, it has certain features that will help us develop our theory concerning reachability. The regular expressions we utilize

here are n-clocked regular expressions defined in 4.5, we will only provide a different semantics

for them in terms of signals with reset times.

t t t t t t Sig n

Definition 4.3.1. A signal with reset times is a tuple where

n

t t R t t i n

and i for each .

i t

The intuition is that the real i represents some moment before a chain of transitions when the

x t t

clock i was reset, is the “initial” moment, is the moment when the last transition in the chain is

t x t

taken, and represents the last reset time for the clock i before the moment . The set of signals

i

with reset times is denoted Sigreset . Similarly to signals with clock valuations, signals with reset times can be concatenated if and

only if the intermediary time points match. More formally, given two signals with reset times

t u t u t t t t u u u u

n n

and , the concatenation

n n

is defined as follows:

t t t u u u i n t u t u

n i

iff for all and

n i

otherwise

This concatenation operation is extended, as usual, to sets of signals with reset times: for each

S S Sigreset

pair of sets ,

S S j S S

and (4.9)

which is a total operation on Sigreset whose unit is the set of signals with reset times

S t t t t t t j t t R

n n i

Again as usual, concatenation on sets gives rise to the star operation

i

S S

iN

i i

S S S S S i N

where and for all .

The configurations of the timed transition system for the reset time semantics are tuples com-

n prising a state and n positive numbers, the first representing the reset time for each clock

and the last recording the current moment. That is, the timed transition system is the tuple

n n

T Q R Q Q R

n f

where:



t

q t t t a q t t t j q C X q n

such that

n

q a t t t t t i X t t i X v j C

for all i for all and

i i

v v x t t i n i where is the clock valuation defined by i for all 4.3 Reset time semantics for timed automata 67

By an argument similar to the one given in section 4.2.1 we may transform this semantics into a

compositional one by giving reset time semantics to each -sequence in the timed automaton. This

compositional reset time semantics is built using the following basic rule:



t

ka C X k t t t a t i X t t j t t t t t n

for all

n i

t t i X v j C v

i for all and where is the clock valuation defined by

i

v x t t i n i

i for all (4.10)

q C Y q A

i i i i

k

and the “compositionality” rule 4.9. Hence each -sequence i in gives

EC P n

rise to the following word over n

a q w a C X a C X

i i k k k where

Then the language of the given timed automaton is:

n

LA A w x j

i is an accepting run in

i

The semantics of n-clock regular expressions may be given similarly by the usual rules which

allow commuting semantics with union, concatenation and star, that is

kE E k kE k kE k

kE E k kE k kE k

kE k kE k provided that the atoms have the semantics given in 4.10 above and the expressions and have

the following semantics:

kk kk S 68 4. Timed automata 5. Timed regular expressions

In this section we investigate the possibility to define some timed regular expressions that do not use clocks and clock constraints. The reason for searching such expressions is, at a first sight, esthetic, since the clocked regular expressions are harder to write. But this reason hides a more profound one: namely that, at the specification level, properties refer to (i.e., bind) state dura- tions, or intervals separating two actions, or delays. Clock manipulation might be regarded as a “low-level” language, like automata, whereas regular expressions are intended to be a “high-level” language easy to handle. There exists a “high-level” approach to regular expressions that has preceded the clocked reg- ular expressions: it is the timed regular expressions of [ACM97]. These expressions do not use

clocks, they only provide time binding by the use of some interval-indexed parentheses. For ex-

ahbci a

ample, the timed regular expression specifies the set of signals in which an -state with an

c b c arbitrary duration is followed by a b-state and then by a -state, the overall duration of the and

states being equal to . Though giving a neat specification language, timed regular expressions hide some mathematical problems, connected to the density of the set of real numbers. Namely, and contrary to classical regular expressions, they are not closed under intersection, and hence this operation must be put between the basic operations such that the generative power be reasonable. Even with intersection they still show less expressive power than timed automata, and another operation is needed then: renaming. This chapter recalls these problems and discusses one possible solution to them. This solution is the use of colored parentheses. As simple it seems, this solution shows itself some hurdles: first, the language of “colored” and balanced parentheses is not a context-free language, hence it might raise difficult problems concerning parsing and translating. The solution we find is to consider a different concatenation operation, that allows two expressions with colored parentheses to concatenate on “matching” parentheses. But the algebraic bases for this interpretation must be laid, and the subsequent chapters are concerned with this task. The chapter is more of a “hand-waiving” style, presenting more intuition and discussions than formalization. It runs as follows: the first section recalls the definition of timed regular expressions and their relationship to timed automata. We also show here some peculiarities of interpreting a timed regular expression without parentheses as a classical regular expression. In other words, we investigate succinctly the effect of the untiming morphism at the regular expression level. The sec- ond section presents an undecidability result concerning the extension of timed regular expressions 70 5. Timed regular expressions

with negation. This is a rather expected result, however it does not follow from the undecidabil- ity of the universality problem for timed automata, due to the “incomplete” Kleene connection between timed automata and timed regular expressions. The third section is a short abstract of the results that connect timed regular expressions with timed automata. The last section discusses the problems and the possible solutions for the generalization of timed regular expressions with colored parentheses. This section is informal and will be developed in the following chapters.

5.1 Basic properties of timed regular expressions

Definition 5.1.1 ([ACM97]). The set of timed regular expressions is given by the following gram-

mar:

E j j a j E E j E E j E E j E j hE i

I

I where a is any symbol in and is any positive interval.

The semantics of timed regular expressions is, of course, in terms of signals. The idea is that

the angle brackets bind the duration of signals:

t

kak fa j t R g kE E k kE k kE k

kE kE k kE k kE k kE E k kE k kE k

kE k kE k khE ik f kE k j I g I

There is an alternative way of generalizing from real-time regular expressions: namely allow

A A

atoms of the type I for any set . The semantics of such an atom would be the following:

A Sig j t A t dom

I for all

Of course, we need conjunction in both cases. Then we may replace, in an “inside-out” manner,

hE i E I each expression of the type I with .

5.1.1 Timed regular expressions without brackets

Timed regular expressions without brackets can be given also an untimed semantics, that is, in

terms of words over . We would expect that this semantics be related to the untiming morphism

U E U . More formally, if we denote the classical regular expression which we associate to the 1

timed regular expression E , then we would like to have

U E j U kE k

j (5.1)

U E j U E

where j is the set of words defined by the classical regular expression .



E E E That is, U when contains no parentheses 5.1 Basic properties of timed regular expressions 71

But there is a small problem: the semantics of the timed regular expression aa is equal to the

semantics of the timed regular expression a, fact which does not hold when the two expressions are viewed as (classical) regular expressions. In other words, this “brute” transformation of timed regular expressions without brackets into classical regular expressions is not compatible with the

untiming morphism:

jaa aj fag U kaak U kak

We recall then that the untiming morphism is in fact a morphism whose target is the monoid of stuttering-free words, endowed with the stuttering-free concatenation. This implies that if we want

Identity 5.1 to hold, we need to consider a different semantics for classical regular expressions:

namely, to interpret each classical regular expression into elements of SF and to interpret regular expression concatenation as stuttering-free concatenation, see page 27. This is not a nice

solution since it requires a reconsideration of the theory of finite automata and regular expressions

for the special monoid SF .

There is yet another solution which does not induce this reconsideration: recall that SF is

also representable as the quotient (Chapter 2, page 27), where is generated by the relation

aa a a for all . We may then consider the closure under for the semantics of the

regular expression. In other words, the timed regular expression a, when interpreted as a classical

n

fa j n N g jaa j regular expression, would have the semantics .

Syntactically, this can be done as follows: given a timed regular expression without braces, we

aa U

replace each symbol a with the classical regular expression . Denote this syntactic operation.

E

Of course, a formal definition of U would be done by structural induction on the timed regular U expression E . Hence commutes with all operations – summation, conjunction, concatenation,

star.

z

SF

This solution implies a weaker version of Identity 5.1: remind that denotes

the canonical projection induced by the congruence . Its action consists of transforming each

z

ababca

sequence of identical symbols into one symbol, e.g., abbabcca . Then

z

U E j U kE k

j (5.2)

E E One may think that this property also holds for the “brute” transformation U . But this

is not true, especially due to the use of conjunction in timed regular expressions:

z

jaa aj fag U kaa ak

The translation of timed regular expressions without brackets into classical regular expressions will be instrumental in Chapter 9. 72 5. Timed regular expressions

5.2 Undecidability of the language emptiness problem for extended timed regular expressions

As for the case of classical regular expressions, we may extend the grammar by allowing the use of negation. The resulting expressions will be called extended timed regular expressions, by similarity with extended classical regular expressions which utilize negation. The generating grammar for this

class of expressions is the following:

E a j E E j E E j E E j E j hE i j E I

and the semantics for the negation is, naturally, based upon set complementation:

kE k Sig n kE k

In this section we show that the emptiness problem for the semantics of extended timed regular expressions is undecidable. The technique we use is drawn from one of the undecidability results

concerning Duration Calculus [ZCHS93], namely the undecidability of the fragment that allow

a formulas. We prove the result by showing that the halting problem for two-counter machines [HU92] is reducible to our emptiness problem. We mention that this negative result does not follow from the undecidability of the universality problem for timed automata [AD94], since the Kleene theorem relating timed automata and timed

regular expressions involves renaming.

C Q q T Q q

A 2-counter machine consists of a set of locations , an initial location

T q s t x y q s t f g

and a set of transitions which are tuples where and

y f g

x . The meaning is the following:

x y The machine works on two counters and which can hold arbitrarily large, nonnegative values,

and which may be checked and/or modified by each transition.

s

A transition in which is is taken iff the first counter is zero, and similarly a transition in

which t is is taken iff the second counter is zero.

x x

Taking a transition in which increases the first counter by one. When the first

counter is not changed, while for x the first counter is decreased by one, if its value is

positive, and is left unchanged otherwise. Similarly for the values of y and the second counter.

It is additionally required that the machine be deterministic in the following sense: for each state

Q s t f g q

q and preconditions , at most one transition can be enabled in by the t

preconditions s and :

card q s t x y q j q Q x y f g

We may see the states of the 2-counter machine as labels of the statements of a program containing test conditions over each counter and increments and/or decrements of each counter.

5.2 Undecidability of the language emptiness problem for extended timed regular expressions 73

q x y A configuration of a 2-counter machine is then a triple containing a location and the values of each counter. A run of the 2-counter machine is a (finite or infinite) sequence of config-

urations connected by transitions which starts with both tapes holding . The halting problem for 2-counter machines is the problem of whether a given 2-counter ma- chine has a finite run. Here, the 2-counter machine is an input to the problem.

Theorem 5.2.1 ([HU92]). The halting problem for 2-counter machines is undecidable.

Theorem 5.2.2. The language emptiness problem for extended timed regular expressions is unde- cidable.

Proof. We will prove that the emptiness problem for extended timed regular expressions is many-

one reducible (in the sense of [HU92]) to the halting problem for 2-counter machines:

q m n C Q q T

i i i

Start with a 2-counter machine and suppose it has a finite run, .

i N

Q fa b c dg

We associate to this run a family of signals over the set of symbols where

b c d Q i N

a . The association will be such that, for each signal in this family and each ,

i i

the interval of the signal consists of a first part in which the signal is constantly equal to

q m a b b i

i and then a sequence of discontinuities where the signal jumps from to and from to

a n c d d

, and another sequence of i -discontinuities where the signal jumps from to and from

to c.

k k

Formally, we say that the signal encodes within the interval the configuration

q m n

if

t t t t t t t t t

mn mn m m m m   

d c d c b a b a q

kk

t i m n

with i for all .

q n m

i i

We then say that the signal encodes the run i if it encodes the configuration

i N

q m n i i i N

i i i within the interval for each . We aim at building an extended timed regular expression that accepts only signals which are associated with the run. This expression must therefore specify the initial configuration of the run and each of the transitions. The initial configuration is specified by an expression which says that the first interval of each signal encodes the first configuration of the 2-counter machine. Then,

each transition is simulated by an extended timed regular expression which accepts some signal iff,

k k

whenever in some interval the signal encodes some configuration in which the transition

k k is enabled, then in the subsequent interval the signal encodes the configuration which results by taking the respective transition. Then, the expression that simulates the run is the intersection of all these expressions.

The initial configuration is encoded by the extended timed regular expression

Init hq a b c di

q s t x y r

Then each transition is simulated by a regular expression which we will

denote tr and build in the sequel. There can be types of transitions: due to 74 5. Timed regular expressions

the two different modes of checking the contents of a tape and due to different ways the contents

of each tape can be updated.

q

We give as an example the expression that simulates a transition of the type

r ktr k

. Our aim is that any signal has the property that if

t t t t t t t t t

     n n n n

q a b c d c d c d

kk

with n then

t u v w t t t t t t t

    n n n n

r a b a b c d c d

k k

u v w t

where .

The expression tr is a conjunction of the following three subexpressions:

k k

the subexpression saying that if the signal encodes, within some interval , a configuration

k k r

in which is enabled then the interval of the signal starts with state :

hq a b c d c i n fr g

k k

a conjunction of two expressions saying that if the interval encodes a configuration in

k k r a b a b

which transition is enabled then in the state is followed by states , , and in b

this order such that the length of the last b-state equals the length of the only within the interval

k k aba a

while the length of the -signal equal the length of the only state in the interval

k k

:

q a a b hb c d c d c d r a b i n fbg

q a hb c d c d c d r a b ai

k k

a conjunction of three expressions saying that if the interval encodes a configuration

n c n d

where transition can be taken and there are -states and -states within this interval

k k n c n d

then in the interval there have to be states and states such that the length of

c k k i c k k

the i-th -state within is equal to the length of the -th -state within for all

n i d k k i c

i , the length of the -th -state within is equal to the length of the -th -state

k k i n d k k

within for all and finally the length of the last -state within

c n n d

equals the sum of the lengths of the n -th -state, the -th and the -th -state within

k k

:

q a b c d c hc d c d c d r a b c d i n fcg

q a b c d c d hd c d c d r a b c d i n fdg

q a b c d c d c hc d r a b c d ci n fdg

q a b c d c d hd r a b c d ci n fdg 5.3 Relating timed regular expressions and timed automata 75

The specification of the remaining types of transitions can be done similarly. Then the (finite) run of the 2-counter machine, if it exists, is simulated by the following extended

timed regular expression:

E I nit tr ut

C

T

5.3 Relating timed regular expressions and timed automata

Timed regular expressions are a nice specification language, but they carry some expressivity prob- lems. The following theorem and the discussion after it shows them:

Theorem 5.3.1 ([ACM97]). The class of timed languages accepted by timed automata equals the

class of timed languages accepted by timed regular expressions with intersection and renaming.

f

Here renaming refers to signals: formally, given two sets and and a mapping ,

Sig Sig

this can be extended canonically to a monoid morphism f . The morphism

f simply replaces symbols from with symbols from in each signal. For example, for the

fa bg fc dg f a c f b d f a b c d function f given by and , . Renamings are not necessarily bijective, and it is this feature that is essential in the Kleene theorem relating timed regular expressions with timed automata. The direct inclusion follows by showing first that automata with a single clock can be embedded

into timed regular expressions without intersection, and then by decomposing a timed automaton n with n clocks into automata with a single clock, building the timed regular expressions for each timed automaton and intersecting the results. The timed regular expression for each one-clock au- tomaton will specify the runs rather than the signals accepted by it, that is, the one-clock automaton is considered to work over signals whose symbols are exactly the states of the automaton. This is why, after building the intersection, one needs to apply the renaming that associates to each state in the timed automaton, its label. The reverse inclusion follows by proving that the usual union/intersection/concatenation/star constructions can be generalized to timed automata.

It was also shown in [ACM97] that intersection is necessary for representing timed automata.

ahbci habi c Their example is the timed regular expression , which cannot be expressed without

conjunction. The timed language accepted by this expression is:

L fa b c j g

(5.3)

Moreover, in [Her99] it was shown that renaming also is necessary. An example of timed automa- ton whose language cannot be represented by regular expressions without renaming is presented

in Figure 5.1 (modification from [Her99]).

b

The language of this automaton equals the renaming x applied to the semantics of the

xa hbax i hxa bi ax timed regular expression .

76 5. Timed regular expressions

y

b a b a b

y x

Fig. 5.1. A timed automaton that is not equivalent to any timed regular expression, even with intersection.

5.4 Colored parentheses: basic ideas and problems

Let us first start with an observation with a “language theoretical” flavor: timed regular languages

can be redefined with matching parentheses, that is, by putting interval indices on each left paren-

habi h abi h abi

thesis as well. Hence, instead of we would have , and a construction like would

not be a timed regular expression. What we would get this way is the Dyck language over the set

P h i j I Q Int g

I I

The generating grammar will be almost the same:

E a j E E j E E j E E j E j h E i

I I

a I Q Int where and . Let us use the notation for the Dyck language over a (possibly

infinite) set of symbols . The big problem with timed regular expressions is that they cannot be “interleaved”, due to

their context-freeness. But it is exactly interleaving what is necessary for specifying the language L

in 5.3 without intersection! L

Our idea is to use colored parentheses: for example, would be specified by the following timed regular expression with colored parentheses:

blue red blue red

h a h bi ci

But this idea poses some big language-theoretic problem: if we want to specify also cyclic behav- iors, we fall into non-context-free specification languages! Consider just a cyclic behavior of the kind

blue red blue blue red red blue

h a h bi h ai h bi

with arbitrarily many repetitions. Specifying the union of all such timed regular expressions is not

easy: if we try

blue red red blue

h i h bi

a (5.4)

then the first red parenthesis is a right parenthesis! This implies both syntactic and semantic prob- lems: 5.4 Colored parentheses: basic ideas and problems 77

At the syntactic level, if we accept expressions like 5.4, then what to do with expressions like

red

a h b I

If such expressions are also acceptable, then how to interpret them? Intuitively, such an ex- pression resets a potentially infinite number of clocks, hence we might run into trouble with decidability.

Even when such expressions are to be rejected – by some non-context-free rules – how to prove then a Kleene theorem? Its proof for the timed regular expressions of [ACM97] is essentially based upon the context-free presentation of expressions. To put all the above considerations into a more formal and “language-theoretic” framework, we

have to consider n (infinite!) sets of matching parentheses, each indexed with an interval:

i i

j I Q Int i P h i n

i for all

I I

n e e

i i

n

We may then define deletion morphisms, i , each deleting all parentheses not in P

i : these morphisms are the canonical extensions of the following deletion functions:

n

a a P

iff i

p P P fg p a

i i i i

otherwise

i

n

e P P e a a p a p a

i i i i n i i n

i

n

P

Then define the language of correctly matching parentheses over i ,

i

n

n o

L w i n e w P j

i P i for each

par i

i

L

This language is unfortunately context-sensitive for n : just consider the intersection of par

k l k l

h i i fa b c d j k l N g

with h , which gives a language of the form which is an

easy prey to the Bar-Hillel (pumping) lemma for context-free languages [HU92].

Let us mention, at the end of these considerations, that the language Lpar can be generated by matricial grammars [DP89], or with the so-called contextual grammars with distributed catenation and shuffle [KMM97].

5.4.1 Changing the concatenation

Here we come with the idea to use a different concatenation operation: the expression

blue red blue blue red red blue red

a h bi h ai h bi ai h

could be represented by an expression like

78 5. Timed regular expressions

  

a b a

  

a b a



    

a b a b a

Fig. 5.2. An example of “overlapping” concatenation.

blue red blue red blue red blue red

a h bi ai h a h bi ai h

At the semantic level, concatenation would require two signals to match on their a-parts, as depicted in Figure 5.2: The question is then how to identify the subsignals on the right and on the left that must match. Our idea is to use some distinguished points in each signal, like some markers for the moments

when each parenthesis opens or closes. If we order these points such that the left point for the i-th

t i t

ni color is i and the right point for the -th color is then we may represent the above concatena-

tion like in Figure 5.3.

  

a b a

t t t t

   

  

a b a

t t t t

   

    

a b a b a

t t t t t t

      Fig. 5.3. An example of concatenation with distinguished points.

Of course, the result has more distinguished points than the operands, so we would need also a convention how to index them. But this very fact to increase the number of distinguished points creates some problems when trying to define star: namely we need to manage with unbounded

numbers of points and to rearrange the indices after each concatenation etc. t

The issue from this is to observe that, once the time point was concatenated, it has played its role, since the right parenthesis it represents has found a matching left parenthesis. Then we may simply forget it: the result of concatenation from Figure 5.3 would no longer have six distinguished points, but four. Hence we need two operations: a “juxtaposition” operation that “fuses” two signals with distinguished points along a certain subset of points, and a “projection” operation, that forgets the points that have “actively” participated to the concatenation. Let us see now how this idea works for timed automata too. 5.4 Colored parentheses: basic ideas and problems 79

5.4.2 The “overlapping” concatenation for timed automata

In the previous chapter we have introduced the reset time semantics which associates to each run in

t t t t t t t t t t

n i

the timed automaton a signal with reset times where

n i

Sig

are nonnegative reals and is a signal. can be concatenated to another signal with

u u u u u u n n

reset times iff the last components of the left operand

n

match the first n components of the right operand.

Observe first that, when and above can be concatenated, the signals with reset times which

can be obtained from and by translating all the reals by some constant can be concatenated

too. That is, the following two signals with reset times can be concatenated:

t t t t t t

n

n

u u u u u u

n

n

This means that the only useful timing information in each signal with reset times is the set of

differences between the components. We may then define an equivalence relation on Sigreset

which relates each pair of signals which “differ by a constant”. Hence, if there exists

R

such that

t t t t t t

n

n

t t t t t t

n

n

Though this equivalence is not a congruence, our observation above shows that it satisfies the

following property:

If and then there exists such that and the reverse.

In other words, it is a bisimulation w.r.t. concatenation.

Hence in the equivalence class of and the timing information refers only to the dif-

ferences between the reset points and therefore can be represented by an antisymmetric matrix

A M R A A i j n

n ij ji

, (i.e., with for all ) having the property that for

j n

each i :

A t t A t t

ij j i ninj

i j

A t t A t t

nj j nnj

j

A t t A t t

in i nin

i

A t t A t t

inj i nnj

j j

A t t A t t

nj j nn

A

This way a signal with reset times can be represented as a tuple consisting of an

A M R

n antisymmetric matrix , a real number which represents the “offset” of w.r.t.

some representative in its equivalence class, and a signal .

80 5. Timed regular expressions



b

 

a a

 t   t 





 

  t   t





t  t 



a b a Fig. 5.4. A graphical representation of the signal with reset times .

On the other hand, signals with reset times may be graphically represented by putting all the timing and signal information on the time axis:

Observe that this representation is asymmetric: there is no information regarding the signal that

t t

has passed in between the first reset time and the “start of observation” .

For a more “symmetric” graphical representation we might use objects like in the Figure 5.5.

 

b b

  

a a a

t   t   t  t  

   

Fig. 5.5. A signal with “symmetric” reset time information. We will call the class of an object like in Figure 5.5 as an -signal. This presentation provides

information about what happened since the first reset time in consideration: the piece of signal

a b a t t

tells the “history” between the time point and the time point .

For a more algebraic setting, the -signal in Figure 5.5 can be represented as a matrix

i j i j

whose entry records the piece of signal in between the -th and the -th time point. In order

t t t t

j i j to distinguish the case when i from the case when we may employ antisignals,or

signals in which time flows in the opposite direction, or, moreover, signals which are “read” in the

b b a reverse order. Intuitively, the antisignal corresponding to the signal a should be . The

matrix representing the -signal in Figure 5.5 is the following:

a a b a b a a b a

C B

a b a b a b a

C B

A

C B

a b a b a a b a b a b a

A

b a a b a b a a

A i j k A A A

jk ik Note that in the matrix we have that, for each , ij . We will extensively use this “triangle identity” in the subsequent chapters.

Then, if we generalize concatenation of signals with reset times to -signals we get exactly the overlapping concatenation we have defined in the previous subsection. A graphical representation of concatenation is given in Figure 5.6.

The necessary condition for the two -signals in Figure 5.6 to concatenate can be put also in terms of matrix representation: suppose that we have the following block-decomposition of the

5.4 Colored parentheses: basic ideas and problems 81

 

b b

  

a a a

t   t   t  t  

   

 

b b

 

a a

t   t   t 

  

t  



 

b b

  

a a a

t  



t   t  

 

t   

Fig. 5.6. Concatenation of two -signals.

matrices that represent each -signal in Figure 5.6:

A A A A

A A

A A A A

A A A A M R

with being antisymmetric matrices in .

A A A A

Then can be concatenated to iff . The result will be the matrix

B A

A

t

B A

t

B A A i j k B

ik kj

where ij for all and some . Here is the transpose of

the matrix B .

The next chapter discusses this formalization, and in particular the regular expressions which n

work over -signals and their relationship to timed automata. Our choice for a matricial presen- n tation of -signals, which could be thought as holding a lot of “redundant” information due to the triangle identity, comes not only from an aim to “algebraize” everything, but also because this presentation is closely related to a certain data structure which is aimed at representing timing information: the Difference Bound Matrices (DBMs) [Bel57]. We will take full advantage of the

intimate relationship between DBMs and our matricial presentation of n-signals. 82 5. Timed regular expressions 6. Matrices of signals

In this chapter we provide the basic algebraic properties of the overlapping concatenation. We base our definition of concatenation on two other operations: projection, which “forgets” certain rows and columns in a matrix, and juxtaposition, which “fuses” two matrices along a certain submatrix. This choice is again not only algebraic, but also emphasizes special properties of each of these two

basic operations in different contexts. n We then define a class of regular expressions whose semantics is based on -signals (parity

is needed for concatenation). The atoms of these expressions are matrices whose components are n timed regular expressions – we call them -regsignals. These are in fact our algebraization of reg-

ular expressions with colored parentheses. We also show here that timed automata can be simulated n by regular expressions over -signals. We then make a first try to lift concatenation at the specification level, that is, we try to provide a compositional calculus with regular expressions, in order to be able to check whether a regular

expression has a nonempty semantics. But we discover very quickly that no compositional concate- n nation operation can be defined on -regsignals, and the problem lies in the noncompositionality

of projection. This means that we cannot have the wished calculus of emptiness for free. On the n contrary, we show that juxtaposition can be lifted to a compositional operation on -regsignals. We also discover an equally serious problem, namely that the emptiness problem for our regular

expressions is undecidable. This result follows by encoding any instance of the Post Correspon- n dence Problem into a regular expression of a very simple form: a star of a sum of -regsignals. The problem lies therefore in the untimed structure of the expressions, and we leave this problem for study in the next chapter.

This chapter is organized as follows: in the first section we present the definition of n-signals.

They are in fact presented as a particular case of n-dominoes, which are matrices whose com- ponents are a mixture of signals and antisignals. In the second section we give the definition of the projection, juxtaposition and concatenation operations, and provide some algebraic properties

relating them. In a short third section we present the notion of n-signal language and the more gen-

eral notion of n-domino language and show that these languages form a Kleene algebra w.r.t. the n concatenation inherited from -signals and the star operation which is induced by concatenation.

The fourth section presents the notions of regsignals and regminoes, which are “compact”, single- n matricial representations of n-signal languages, respectively -domino languages. We show here the possibility to define a compositional juxtaposition and the impossibility to define a composi- tional projection on regsignals (regminoes). We also define here the class of regular expressions 84 6. Matrices of signals

whose atoms are regsignals (resp. regminoes). In the fifth section we give the translation of timed

automata semantics into regular expressions over regsignals, or, in other words, we give an n- signal-semantics of timed automata. And in the sixth and last section we give the undecidability

result concerning the emptiness problem for regular expressions over regsignals. n 6.1 n-dominoes and -signals

As pointed out at the end of Chapter 5, matricial presentations of n-signals require working not

t

t t

  k

a a

only with signals, but also with antisignals, that is, sequences of the type a where

k

t t R

k

. Intuitively, such a sequence denotes the history of states and their duration, and

t t

a Sig

hence a . Formally, we replace the set of signals (which is the coproduct of

card R card

copies of the monoid ) with the coproduct of copies of the group

R Bi Sig BiSig

. We denote this coproduct group by . Of course, in we would also

b have “words” containing positive and negative powers like a and such words are counterintu-

itive in the “timed world”, but we are forced to use them as they naturally occur by concatenation

b of signals and antisignals. In algebraic terms, working with mixed words like a is a must since the union of the set of signals and the set of antisignals does not have a nice algebraic structure –

it is not stable w.r.t. concatenation.

t t t u

Bi Sig a a b

We denote the inverse operation in as . Hence and a

u t

b a Sig L L

. For a timed language L we denote the set of inverses of signals in . The

Sig

set of antisignals over is denoted and does not contain, by definition, the empty signal

.

n w w BiSig

ij

n Definition 6.1.1. An -domino over is a matrix ij of elements from

with the following property:

w w w i j k n

jk ik

ij for each (6.1)

w Sig Sig i j n w n

When ij for all we say that is an -signal over .

Identity 6.1 will be referred to as the triangle identity.

D n Sig n

We denote by n the class of -dominoes over and by the class of -signals

n

n w i j n w w w ij

over . Observe that in an -domino we have that for all , ii and . ji

The above definition does not faithfully formalize the drawings we have made in the previous t chapter since in those drawings we have also associated a real number, i , to each entry in the

matrix. But this difference is inessential since, given any real number , we may associate it to the

t t t t w

i i i

first index, that is, put , and then build the sequence of s by putting .By

n t

abusing notation, we will still draw the -signals as in the first section, with the i s in place.

n n i i n

Remark 6.1.2. For every -signal there exists some ordering on , say (or,

n n j i w Sig

i i

equivalently, a bijection , with j ) such that for

k k 

n

all k . When satisfies this property, we say that it is an ordering compatible with

w w i j

. The ordering is unique when ij for all . n

6.1 n-dominoes and -signals 85

 

b c

  

a b a

t  t    t  t  t 

    

t  t   t 

 

Fig. 6.1. An example of a -signal.

For example, for the -signal presented in Figure 6.1 we may have the ordering

, as it follows by reading the time points from left to right. Observe also that

this ordering is not unique, the ordering being also compatible

with the -signal in the figure.

n w

Remark 6.1.3. Let us observe that, in order to give an -domino we may specify only the first

n w i j n w i

ini

components ij with and the “pseudodiagonal” components with

n w i j n n n

, or, similarly, the last components ij with and the pseudodiagonal components. This follows since the remaining components can be defined by the aid of the triangle

identity 6.1. For example, if we have specified the first n components and the pseudodiagonal

w i j n n n

components, the remaining components to be specified are ij with

n n n n n n n . These can be recovered as

follows:

i n j n n w w w w w

ij n j nj ji

For and , ij and .

ij

i j n n w w w w

iin inj n j nj For , ij .

6.1.1 n-dominoes over a one-letter alphabet

The class of n-dominoes over a one-letter alphabet form a special class, due to the fact that any

concatenation of a signal and an antisignal is a signal or an antisignal – Therefore, any n-domino

is also an n-signal. n

In fact, instead of working with n-signals we might employ -tuples of reals – that is, instead

n

n w Sig n t t R n

of working with a -signal , we might work with an -tuple for

n

t t

j i

w a

which ij . This observation was already made at the end of last chapter. Then the projection/juxtaposition/concatenation operations are straightforward operations on

tuples:

n

t R X n X t t

1. Given an n-tuple and , the -projection of is the tuple denoted

X

with the property that

t t t

i i



k

X

X fi i g i i

k j j

where , .

m n p N p minm n p m t t t m

2. Given with , the -juxtaposition of an -tuple

m n

R n u u u R t u i p

n mpi i

with an -tuple is defined iff for all and

m n p v z z

mnp is the -tuple with

86 6. Matrices of signals

t i m

i for

v

i

i m m n p u

i for

z x y

We denote p .

n

n t u R n

3. Given two -tuples , their concatenation is the -tuple

t u t u

n

nn

The operations we will build on n-dominoes are then “matricial” translations of these three operations.

One question can then be raised here: why haven’t we adapted this n-tuple approach to signals,

and use the more cumbersome matricial presentation for n-signals? Different reasons for our choice will be given at different moments throughout this chapter, especially when defining the three

operations on n-signals. These reasons are, to a certain extent, related to the fact that the set of signals and antisignals is not closed under concatenation.

6.2 Operations on n-dominoes

In this section we introduce several operations on n-dominoes, operations which aim at modeling the concatenation on signals with distinguished time points from the introduction of the chapter.

We start with presenting some notations.

n N X n l n card X

Given a natural and a set we denote X the

surjection defined by

l i card fj X j j ig

X (6.2)

l X l

Observe that the restriction of X to is a bijection. We denote the inverse of this bijection by .

X

card X X

Hence l .

X

A B A B x y x A y B

Observe that, when X with , that is for all and ,wehave:

l i i A

A iff

l i

B

A (6.3)

i B card A l i

B iff

6.2.1 Projection

A useful operation is projection, which cuts some of the rows and columns of the matrix, such that

the remaining matrix be still a square matrix.

n w D X w w

Definition 6.2.1. Given an -domino n the -projection of is denoted and is X

defined as:

 

w w i j card X

for all (6.4)

l il j

X

ij

X X

6.2 Operations on n-dominoes 87

 

b c

  

a b a

a

t  t  t  t  t 

    

t  t   t 

 

 

b c

 

a b

b

t 



t  t 

 

a X f g b

Fig. 6.2. The projection of the -signal at onto the set gives the -signal at .

n card X It is clear that the X -projection of an -signal is a -signal. An example of projection is

given in Figure 6.2.

w D X n card X p Y p

Proposition 6.2.2. For each n , with and ,

w w 

 (6.5)

X Y l Y X

Proof. The components of the left-hand side in identity 6.5 are:

   

w w w

 

l l il l j

X Y X

ij l il j

X Y X Y

Y Y



l l l l Y card Y X

The identity follows if we prove Y and this can be

X

l Y X

showed as follows:

j i Y j l j l l l i card j Y j j l i card l

Y X X

X X X



i card k l Y j k i l

X

Y l

X

l X p

We have applied here the fact that X is a strictly increasing bijection.

w w D X Y n X Y X Y

Proposition 6.2.3. For each n and with ,

n w w w w w

,if w and then .

X X Y Y

X X X n X X n X X

k k i i

In general, given such that and

w i k w i k w w

for all ,if for all then .

X X

i i

w w i X j Y

Proof. We only need to prove that ij for and , since the other cases hold

ij

X Y X Y

by hypothesis or by symmetry: take some k , which must exist since . Since

i k X w w k j Y w w kj

,wehave ik . Similarly, implies . Therefore

ik kj

w w w w w w

ij ik kj

ik kj ij

w w ut

The second property follows by showing, by induction on i, that .

X X X X

 

i i 88 6. Matrices of signals

6.2.2 Juxtaposition The juxtaposition operation joins two matrices along a common “submatrix”, such that both ma-

trices can be found in the result:

m w D n w D n

Definition 6.2.4. Given an -domino m ,an -domino and an integer

minm n p w w w

p , the -indexed juxtaposition of and is defined if and only if

mpm

w  w m n p w D w

mnp

, is denoted p and is the -domino with the property that

p

w w w

w and (6.6)

m mpmnp

m n p w w  w

Note that Proposition 6.2.3 assures the uniqueness of the -domino p .

w  w

The explicit construction of p is the following:

w i j m

ij iff

w  w

i j m p m n p w

p iff

impj mp

ij

w w i m j m p m n p k m p m

ik iff

k mpj mp

(6.7)

w i m p m n p j m

The components ij with and can be recovered as

w w

ji from the third line in the definition. Note again that the definition of for the case when

ij

m j m p m n p k m i is independent of the choice of .

An example of juxtaposition is depicted in Figure 6.3.

 

c b

  

b a a

a

t  t  t  t  t  

    

t  t   t 

 

 

b b

 

a a

b

t  t   t 

  

t  t   t  t  t 

   

  

b b c

  

a b a

c

t  t  t  t  t  t   t 

     

t  t   t  t  t 

   

a b c

Fig. 6.3. The -juxtaposition of the -signals at and gives the -signal at .

w D p q m p q m

Remark 6.2.5. Observe that, for each n and such that ,

w  w w

q m

p .

p mq m 6.2 Operations on n-dominoes 89

This result is a direct corollary of property 6.2.3.

m n m n p

Remark 6.2.6. The p-juxtaposition of a -signal with a -signal does not yield a -

signals in general. As an example, the -juxtaposition of the two -signals in Figure 6.4 cannot

t

yield a -signal since, intuitively, the point of the second -signal does not correctly fit in between

t t

the first two points and of the first -signal. This rewrites as the fact that the component

w b a

of the result is neither a word nor an antiword.



c

 

b a

t  t  t 

  

t  







c

 

a a

t  t 

 

t   t  

 

b b c a b a

C B

b b c a a

C B

C B

w

b b c a a c a

C B

C B

A

a c b a c a c b c a

a b a a c a a c

Fig. 6.4. The -juxtaposition of the two -signals in this figure does not yield a -signal. The result

is the -domino depicted below them.

Sig This is an outcome of the fact that Sig is not closed under concatenation, and

argues in favor of our need of introducing n-dominoes as the basis of the study.

m w n w m

A sufficient condition for the p-juxtaposition of a -signal with a -signal to be a

p

n -signal is the conjunction of the following properties:

i m p j m p m w Sig

1. For each , ij .

p j p n w Sig

2. For each i , .

ij

w  w

This follows since, under this condition, the third line in the detailed definition of p (Defini-

tion 6.7) would yield only elements of Sig .

This problem with concatenation which is not an internal operation on n-signals is one of our

reasons for using the matricial presentation of n-signals, and of defining them as special cases of

n-dominoes. Had we worked only with signals with distinguished points, we would have had to use the above sufficient condition for correctly defining the juxtaposition. But, as we will see later,

90 6. Matrices of signals n this condition is not satisfied by the n-signals which are issued by the -signals semantics of timed

automata, as we will see in a further section in this chapter. Hence, our calculus with n-signals would have been less expressive than timed automata.

6.2.3 Properties of juxtaposition

w D w D X m card X n

Proposition 6.2.7. 1. For each m , given with

p n card Y q r minp q m r

, given Y with , and given such that and

m X r Y

and , we have that

 w w  w w r

r (6.8)

X Y X Y mr

w D w D w D q minm n

n p

2. For each m , and and for each and

minn p

r ,

w  w  w w  w  w

r q r q (6.9)

Proof. Both properties will be proved with the aid of Propositions 6.2.2 and 6.2.3:

card X q card Y

For the first identity, denote first p and . Let us observe first that the

w  w w  w r

left-hand side r is defined iff the right-hand side is:

X Y X Y mr

w w 

 (by identity 6.5)

X pr p l pr p

X

w

mr m

m r m X m l m r m

since, by hypothesis, and hence X

p r p

, and, respectively,

w w 

 (by identity 6.5)

r Y r l

Y

w

r

r Y n l r r

since, again by hypothesis, and hence Y .

p p r

We will prove then that the projections of both sides of identity 6.5 onto and

p q r are respectively equal. The projections of the left-hand side of Identity 6.8 are, by

definition of , the following:

w  w w

r

X Y p X

w  w w

r

X Y pr pq r Y

For the projections of the right-hand side of Identity 6.8 we will apply the identity 6.5. First, the

p projection onto gives:

6.2 Operations on n-dominoes 91

w  w w w



r (by identity 6.5)

X Y mr p l p

X  Y mr

w  w l

 X

r (by property 6.3 of )

l p

X

w  w l X p X

r (since )

X

w  w X m



r (since )

l X

m

w  w

R (by identity 6.5)

m X

w

X

p r p q r l X n

Y mr

Before computing the projection onto we observe that X

Y m r p r

. Therefore:

w  w

r

X Y mr pr pq r

w  w



r (by identity 6.5)

l pr pq r

X  Y mr

w  w



r (by the above observation)

l q

Y mr

w  w l Y m r q

Y mr

r (since )

Y mr

w  w Y m r m r m n r



r (since )

Y l

mr mnr

w  w

r (by identity 6.5)

mr mnr Y

w Y

For proving the Identity 6.9, let us observe first that the right-hand side is defined iff the left-

w  w w  w r

hand side is defined, and both iff q and are defined, since:

w  w w  w



q q

nr n mnq r mnq l

mq mnq

l m n q r m n q n r n

mq mnq

(since )

w  w

q (by identity 6.5)

mq mnq nr n

w 

(by definition of q )

nr n

w  w w  w l q q



r r

n

(since )

q q l

n

w  w

q (by identity 6.5)

n q

w 

(by definition of q )

q

Hence

w w w w  w

if and only if r .

mq m q mq m q

w w w  w w

if and only if q .

nr n r mnq r mnq r

92 6. Matrices of signals

m

We will then show that the projections of both sides of this identity onto the sets ,

m q m n q m n q r m n p q r and are respectively equal.

To this end, observe first that

l m m

mnr

(6.10)

l m q m n q m q m n q

mnq (6.11)

Then, the three projections of the left-hand side of Identity 6.9 can be rewritten as follows:

w  w  w w  w  w



q r r

q (by observation 6.10)

m l m

mnq

w  w  w r

q (by identity 6.5)

mnq m

w  w  r

q (by definition of )

m

w 

(by definition of q )

w  w  w

q r

mq mnq

w  w  w

 r

q (by observation 6.11)

mq mnq l

mnq

w  w  w r

q (by identity 6.5)

mnq mq mnq

w  w  r

q (by definition of )

mq mnq

w 

(by definition of q )

w  w  w

q r

mnq r mnpq r

w 

(by definition of q )

Before computing the projections of the right-hand side of Identity 6.9, note the following

properties:

l m q m n q n

mq mnpq r

(6.12)

l m n q r m n p q r n r n p r

mq mnpq r (6.13)

The projections of the right-hand side of Identity 6.9 are the following:

w  w  w w 

q r

(by definition of q )

m

w  w  w

q r

mq mnq

w  w  w

 r

q (by observation 6.12)

l n

mq mnpq r

w  w  w r

q (by identity 6.5)

mq mnpq r n

w  w  q

r (by definition of )

n

w 

(by definition of r )

6.2 Operations on n-dominoes 93

w  w  w

q r

mnq r mnpq r

w  w  w

 r

q (by observation 6.13)

l nr npr

mq mnpq r

w  w  w r

q (by identity 6.5)

mq mnpq r nr npr

w  w  q

r (by definition of )

nr npr

w 

(by definition of r ) t Hence, an application of Proposition 6.2.3 ends our proof. u

At the end of this subsection we prove a property which relates orderings compatible with

-signals to juxtaposition and will be used in the proof of the undecidability theorem.

6.2.4 Concatenation

n w w D n w n

The concatenation of two -dominoes is defined as the -juxtaposition of and

n

w followed by the projection onto the first and the last components.

n w w D w w n

Definition 6.2.8. Given two -dominoes , the concatenation of and is

w

denoted as w and defined as

w w w  w

n (6.14)

nnn

w w w

In detail, w is defined iff and in this case we have

nn n

i j n w

ij iff

w w

i j n n w

ij iff (6.15)

ij

w w i n j n n k n n

ik iff and

k nj

w i n n j n w ji The components ij with and can be recovered as from the third line in the definition. An example of concatenation is given in Figure 6.5 below. The reader

may observe now that the concatenation of the two -signals from Figure 6.5, is the projection of

their -juxtaposition, as given in Figure 6.3, onto the set . n

Remark 6.2.9. As an outcome of the remark 6.2.6, the composition of two -signals might not be n

a -signal in general. n Proposition 6.2.10. Composition is associative, has no unit but each -domino has a left and a

right unit and a left and right inverse w.r.t. this unit:

n w w w D n

1. For each triplet of -dominoes

w w w w w w (6.16)

94 6. Matrices of signals

 

c b

  

a a b

t  t  t  t  t  

    

t  t   t 

 

 

b b

 

a a

t  t   t 

  

t  t   t  t  t 

   

  

c b b

  

b a a

t  t  t  t  t   t 

     

t  t 



Fig. 6.5. Concatenation of two -signals.

l r

w D n n

2. For each , denote , resp. the -dominoes defined as follows: for each

w w

j n

i ,

l l l l

w

nij inj ninj ij

ij (6.17)

w w w w

r r r r

w

nij inj ninj ninj

ij (6.18)

w w w w

l r

i n

i ini

Observe that in , for any .

w w

Then

l r

w w w

(6.19)

w w

w D n

3. In the same setting, define as follows:

w i j n

nj n

i iff

i n j n n w

nj n

i iff

w

ij (6.20)

w i n n j n

nj n

i iff

i j n n w

nj n i iff

Then

l r

w w w

w and (6.21)

w w

Proof. The associativity property follows from the definition of composition and the associativity

w w w w w w

of juxtaposition. Let us observe first that is defined iff is defined,

w w w and both are defined iff w and are defined. This follows directly from the proof of

associativity of . Then:

6.2 Operations on n-dominoes 95

w w w w  w  w

n n

nnn nnn

w  w  w n

n (by identity 6.8)

nnnnn nnn

w  w  w

 n

n (by identity 6.5)

l nnn

n  nn

w  w  w

n n

nnn

l n n n n n n

nnn (since )

Similarly,

w w w w  w  w

n n

nnn nnn

w  w  w n

n (by identity 6.8)

nnnnn nnn

w  w  w

 n

n (by identity 6.5)

l nnn

n  nn

w  w  w

n n

nnn

l n n n n n n

nnn

(since )

l w

For the second property, observe first that and can be concatenated (in this order) since by w

definition

l

w

w

nn n

Then

l

 w

n ij

w

l

i j n

ij for

w

i j n n w

ij for

l

w i n j n n k n n

k nj

ik for and some

w

w i j n n i j n n

ij for or

l

w i n j n n

i ij

in for and

w

w

ij r

The proof is similar for the right unit .

w

w w

Finally, for proving that is the inverse of w.r.t. concatenation we only must observe that,

j n

for any i we have

w w w w w

ij

ij ninj

w w w w w w w ut

ini inj ini nij ij

inj 96 6. Matrices of signals

6.3 n-domino languages n

Having defined n-dominoes and their operations, we may turn our attention to sets of -dominoes n

now. These sets will be called n-domino languages, respectively -signal languages when they

n n DL

consist of -signals only. We denote the family of -domino languages as n and the family

n WL

of -signal languages as n .

All the operations built so far extend naturally to n-domino languages. We will only write here n

the extension of the concatenations to -domino languages, since this gives rise to a star operation:

n L L D L L n n given two -domino languages , the concatenation of and is the -domino

language

L L fw w j w L w L g

Let us consider the following language:

l

w D j i n w j w D

n n ini n

(6.22)

w

P D

n n n Proposition 6.3.1. is the unit for concatenation on sets, hence is a

monoid.

n L D n Concatenation gives rise to a star operation: for each -domino language , the

star of L is defined as:

 k

L L

k

k k

L L L L k N n

where and for all .

P D 

n n Proposition 6.3.2. The structure is a Kleene algebra.

We will also use the positive star operation , defined as the positive iterations of the given

language:

k

L L

k

A nice property relating projection and juxtaposition is the following:

L n L

Proposition 6.3.3. Given an m-domino language ,an -domino language and a positive num-

minm n L m n p L

ber p , denote the set of -dominoes which project onto elements of

m n p L

and L the set of -dominoes which project onto elements of , that is:

L L w D j w

mnp

m

L w D j w L

mnp

mpmnp

Then

L L L L

p (6.23) 6.4 Regminoes, regsignals, and regular expressions over them 97

Proof. By straightforward verification:

L L w  w j w L w L

p p

L w L w D j w

mnp

n mpmnp

w D j w L w D j w L

mnp mnp

n mpmnp

L L ut

6.4 Regminoes, regsignals, and regular expressions over them

Going back to the “colored parentheses” idea, we would like to decompose each object of the form

blue red blue blue red red blue red

h a h bi ai bi h h ai

into a concatenation of the kind

blue red blue red blue red blue red

h a h bi ai h a h bi ai

Consequently, we would like to base our regular expressions with colored parentheses on atoms

blue red blue red

ah bi ai

like h , that is, in which, if we apply a “color filter” for the any of the colors, we

E hE i E E E E

I would get a timed regular expression of the kind , in which and are regular expressions without timing parentheses.

blue red blue red

ah bi ai

This rough idea still needs some refinement: observe that, in an atom of the kind h ,

there exist some implicit timing constraints limiting the duration of each state: neither of the two

b states a or the state may last more than time unit. A graphical presentation of the resulting

object is the following:

hbai

red red

i h

h

b

i

blue red blue red i

h a h bi ai

a

h

i

i

a

a

h ab

blue h blue

h i

habi

We will put all this information in a matricial presentation: we define regminoes as matrices of timed regular expressions, whose semantics consists of sets of dominoes. These are the atoms of

our calculus of regular expressions with colored parentheses:

n R R

ij

n

Definition 6.4.1. An -regmino is a matrix ij whose components are timed regu-

i j n R

lar expressions: for each , ij is a timed regular expression over .

n n R i j n R hE i hE i

I I

An -regsignal is an -regmino for which, for each , ij for

E

some untimed regular expression E over , some untimed regular expression over and

Q Int some interval I .

98 6. Matrices of signals

n RD n RSig

The set of -regminoes is denoted n while the set of -regsignals is denoted .

n

n kk RD P D n

The semantics of -regminoes is the mapping n defined as follows:

R RD

for each n ,

kR k w D j w kR k i j n

ij ij

n for all (6.24) n

Remark 6.4.2. Observe that the semantics of an n-regsignal contains only -signals.

Figure 6.6 gives an example of a -regsignal and a -signal in its semantics.

hbi hb i hbca i

fg

C B

hb i hb i ha cai

C B

fg

R

C B

hb i hbi hbac a b i

A

ha c b i ha c a i hc a b bai



c

 

b a

t  t  t  t 

   

Fig. 6.6. A -regsignal and a -signal in its semantics.

hbaca b i hbaci ha b i

We have used here the expression as a shortcut for .

Observe also that, since signals cannot have negative length and antisignals cannot have positive

hbaci ha b i

length, we may further replace this timed regular expression with .

n R RSig R R R R

For each -regsignal we will denote ij with being the

n

ij ij ij

“positive part” and R the “negative part” of the timed regular expression, that is:

ij

hE i R E I

I with regular expression over and

ij

hE i R E I

I with regular expression over and ij

6.4.1 Projection and juxtaposition on n-regsignals

We have seen that the domino operations can be naturally extended to languages. We may ask then whether there is a way to represent the results of each operation, when applied to languages which are representable by regminoes. We show here that, for juxtaposition and intersection, such a representation can be found, but not for union and projection, and, hence, neither for concatenation and star. We present these results in an algebraic setting, that is, we define juxtaposition and intersection on regminoes and prove that they are compositional. On the other hand, we show that the natural

candidate for the X -projection operation on regminoes - the operation which removes the rows and

columns not in X -isnot compositional.

R RD X n X R card X

Definition 6.4.3. Given n and , the -projection of is the -

regmino denoted R and defined as follows: X

6.4 Regminoes, regsignals, and regular expressions over them 99

 

R i j card X R

for all

l il j

X

ij

X X

R RD R RD p minm n p

m n

Given , , and a nonnegative integer, the -juxta-

R R m n p R RD R R  R

mnp p position of with is the -regmino denoted and

defined as follows:

R  R

p

ij

R i m p j m

ij

iff

i m j m p

or

i m p m n p j m m n p R

impj mp

iff

m m n p j m p m n p

or i

R R i j m p m

ij impj mp

iff

m

i m p j m m n p R R

ik k mpj mp

iff

k mp

m

j m p i m m n p R R

impk mp kj

iff

k mp

Unfortunately projection is not a good syntactic operation since it does not commute with se-

mantics of regminoes: we might have an n-regmino with an empty semantics whose projection

onto some subset has a nonempty semantics. For example, consider the following -regsignal over

fag

a one-letter alphabet (in fact, a matrix whose entries are sets of reals):

a a a a a a

C B

a a a a a a

C B

R C

B (6.25)

a a a a a a

A

a a a a a a A

This -regsignal has an empty semantics: if we construct all the integer-valued matrices

R A R ij

whose components belong to the respective components of , that is, with ij for all

j

i , we observe that none is a -signal as none satisfies the triangle identity. However

X f g

the projection of R onto the set gives the following -regsignal:

fa a g

R

fg

fa a g

whose semantics is nonempty, since:

a

Sig fag w R

fg

a

In general, we only have the inclusion

100 6. Matrices of signals

R w j w kR kg

(6.26)

X X

RD X n for each R and .

Contrary to projection, indexed juxtaposition is compositional w.r.t. semantics:

R RD R RD

m n

Proposition 6.4.4. The following property holds for any , , and

minm n

p :

kR  R k kR k kR k

p p

(6.27)

Proof. The property follows by easy verification: for the direct inclusion observe that, if w

kR  R k w kR k w kR k w w  w

p p

then and . But ,

m mpmnp m mpmnp

w kR k kR k

p

hence .

w kR k w kR k

For the inverse inclusion we just have to observe that, if we are given and

w  w i m p j m m n p

p

such that is defined, then for all , and

k m p m w w w w

ij ik k mpj mp

, and this implies that

m

w w kR k kR k kR  R k ut

ij ik k mpj mp p ij

k mp

An operation which is available for n-regsignals only is intersection:

n R R RSig R R

Definition 6.4.5. Given two -regsignals the intersection of and is the

n R

n-regsignal with

R R R i j n

ij ij

ij for all (6.28)

R R R We denote then .

Remark 6.4.6. Of course, to actually obtain n-regsignals we need to transform the intersection in

each component into a regular expression. Observe that it is essential that both operands are n-



hE i hE i E I

regsignals since then each component can be still written in the form I with an

E I

untimed regular expression over and an untimed regular expression over . Here is

intended to be a nonnegative interval and I a nonpositive interval.

On the contrary, the intersection of the semantics of two n-regminoes might not be representable

n as an n-regmino. This follows even for by the nonclosure of timed regular expressions under

intersection [ACM97, Her99].

n R R RSig

Proposition 6.4.7. For each pair of -regsignals we have

n

kR R k kR k kR k

p R  R

p

We may then to alternatively define -juxtaposition by the aid of projection and inter-

R RSig R RSig

section as follows: suppose and . Consider then the following two

m n

m n p R R -regsignals which, intuitively, extend , resp. :

6.4 Regminoes, regsignals, and regular expressions over them 101

R i j m ij

iff

R

ij

otherwise

R i j m p m n p

impj mp

iff

R

ij

otherwise

kR k kR k kR k kR k

Observe that and .

m mpmnp

Then

kR  R k kR R k

p (6.29)

since we have:

kR  R k kR k kR k

p p

by proposition 6.4.4

kR k kR k

by Proposition 6.3.3

kR R k

by Proposition 6.4.7

n n 6.4.2 -domino regular expressions and -signal regular expressions

We may push the theory further by defining regular expressions whose atoms are regminoes. n

Definition 6.4.8. The class of -domino regular expressions is generated by the grammar:



R j E E j E E j E

E (6.30)

n n E n

where R is a -regmino. When the atoms in a -domino regular expression are all -

n

regsignals we say that E is a -signal regular expression.

n RegSig

We denote by RegD the class of -domino regular expressions over and by

n n

n

the subclass of -signal regular expressions over .

n n The semantics of a -domino regular expression is in terms of -dominoes and uses the

indexed concatenations and stars:

kE E k kE k kE k

kE E k kE k kE k

 

kE k kE k

E E k kE k kE k

Observe that the definition k does not contradict the fact that the semantics of n-regminoes is noncompositional w.r.t. concatenation. The left-hand side of is an abstract operation on regular expressions, that is, its result is a regular expression, and not a

regmino.

n n

We would like to define a specific semantics of -signal regular expressions in terms of - n signal languages. But -signal regular expressions cause a special problem due to the fact that

102 6. Matrices of signals

n

Sig is not closed under concatenation. We then need to restrict the semantics of each -

n

n

signal regular expression to its intersection with Sig . We denote the -signal language-

n

n k k E RegSig

semantics of a -signal regular expression as s . Hence, for each ,

n

kE k kE k Sig

s (6.31)

n

kR k kR k n R

Observe that, due to Remark 6.4.2 we have that s for each -regsignal .

Remark 6.4.9. Occasionally, we will also speak of n-signal regular expressions. These are noth- n ing else but formal sums of n-regsignals, since for odd no concatenation operation is available.

This notion is useful as n-regsignals are not closed under summation. n

We may also define a class of automata equivalent to -domino regular expressions, equiv- n

alence which follows via the Kleene theorem and the compositionality of -domino regular ex- n pression semantics. We call them -regsignal automata. We will only provide, in Figure 6.7, an

example of such an automaton, the general definition being easily deducible.

R



R



q r s

R



n n

Fig. 6.7. An example of a -regsignal automata that corresponds to the -domino regular ex-



R R R n R R R

pressions , for any -regsignals . n 6.5  -signal regular expressions and timed automata

We have started the study of n-signals with the aim of modeling timed languages. This section

provides the formalization of this modeling, namely the way the language of an n-clock timed n automaton can be presented by some -signal regular expression.

In the introduction to this chapter we have intuitively presented the way to encode a signal with

n n reset times into a -signal. The decoding of a -signal into a signal with reset times works as follows: we simply need to consider the component with the largest length in the matrix, and then

distinguish some points in it, according to the timing constraints. This idea can be generalized to a n

definition of the timed language associated with a -signal regular expression, as the set of largest

n n components that occur in some -signal which belongs to the semantics of the given -signal

regular expression. More formally:

n E RegSig

Definition 6.5.1. Given a -signal regular expression , the timed language

n

associated with E consists of the following set of signals:

LE j w kE k i j n w

s an ordering compatible with such that

w k n i k j

ij and (6.32) n 6.5 -signal regular expressions and timed automata 103

The following theorem formalizes the intuition that n-clocked timed automata can be presented n as -signal regular expressions:

Theorem 6.5.2. The class of languages accepted by timed automata with n clocks is included in

n the class of timed languages associated with some -signal regular expression.

Proof. We will actually prove that the semantics of each n-clocked timed automaton in which each

n transition resets at least one clock can be associated with a -signal regular expression. The “ ” increment in the theorem statement comes from an augmentation of the number of clocks by one which is reset on each transition. Throughout this proof we will consider the reset time semantics

of timed automata.

n A Q Q Q f

So take a timed automaton with clocks, , in which each transition resets

CX

q r X q a

at least one clock. We code each transition in which and into a

n R

-regsignal as follows: suppose that the constraint in the atom is

C x I x x J

i i i j ij

in ij n i j

n R

Then the components of the -regsignal are:

i j n h i

J for

ij

h i i n k j n l k l X J

for

kl

i n k j n l k l X

for

h i i n j n k k X i k

J for

ik

j n i n k k X j k h i

J for

kj

R

ij

j n i i X i n j j X

for or

h ai i n j n k k X

I for

i

i n l j n k k X l X

or

ha i j n i n k k X

I

for

i

n k j n l k X l X

or i

I f j I g

Here .

X x I i

Observe the utility of having , since we may then code the subconstraint i by a x

comparison on the duration between the last reset point for clock i and the reset point of any clock

in X .

n

n E i j n

Consider also the matrix E whose all entries are , for all . The ij meaning of this matrix is the following: when this matrix is concatenated to the left of a regular

expression, all the starting points of the result are the same.

n B Q fq g fq g Q

f

We then build a -regsignal automaton in which

E n R CX

q j q Q fq r j q r fq

104 6. Matrices of signals

To prove that this construction is correct, observe first that each accepting run in A can be

B fq g

uniquely transformed into an accepting run in that starts in by just appending some transi-

n E n

tion labeled with the -regsignal , and vice-versa.

C X

i i

q q i k q

i i i i

k Consider then a run i with for all . We may

associate to this run the following regsignal:

R R R

k

E Consider now the “word-like” n-clocked expression associated with the run , as defined

in Identity 4.6 on page 61.

n

x q C X q C X E a

i k k k

i

t t t t t t n

Then, if a signal with reset times is in the (reset time) semantics

n

E t t t j

of then i and the signal with reset times can be represented by the following n

-signal:

j n

for all i

w

ij

j n k i k n

 for

t t

i

k n

(of course, the whole -signal results with the aid of the triangle identity).

t

It is clear that there exists a bijection between the set of signals with reset times with

t t t n w Sig w ij

n and and the set of -signals with for all

n

i

j n n E

i . We may apply this bijection to the -clocked semantics of and hence get a n

-signal semantics for it.

n E

The proof ends if we show that, for each run , this -signal semantics for equals the

n E n R semantics of the -signal regular expression . This will be proved by induction on the length of the run.

For zero-length runs the proof is trivial. Let us suppose then that we have proved the property

k k q

i

k

for all runs of length up to and take some run of length , say i , with

C X

i i

q i k q q

i i i i

k

for all . Denote also i , the run reduced to the

first k steps. Hence we have

E E a C X

k k k

Then each signal with reset times in the reset semantics of E can be decomposed as

t t t t t t a t t t

n n k

n

t t t kE k t t t a t t t

n n k

where and

n

ka C X k

k k k . n

6.6 The emptiness problem for -signal regular expressions is undecidable 105

n w

We may build then the -signal associated with , which, by induction hypothesis, is in

n R n

the semantics of E . We further consider the following -signal which, intuitively, is

associated with :

w i j n

inj

n iff

w

(6.33)

ij

w a i n j n n

ij k n iff

the rest being derivable by the triangle identity.

w

Observe that w is defined and the result is

w w w

n E E n

which proves in fact that the -signal semantics of is included in the semantics of

R .

For the reverse inclusion we proceed by mirroring the above argument: for the induction step,

n z kE n R k kE n R R a C X k

k k

consider a -signal k . Hence,

z w w w kE n R k w kR a C X k

k k

with and k . By the induction

n E w w

hypothesis, w is in the -signal semantics of , hence for some signal with reset

t t t kE k n

times . We may then build a signal with reset times from

w t t t a t t t

n k

the information provided by as follows: where

n

t t w i n

i for each

i ini

t t i X

for some k

i

t t t

But is defined and produces the signal with reset times

n

E k w z ut

k which clearly has the property that . n 6.6 The emptiness problem for  -signal regular expressions is undecidable

In this section we show that our regular expressions over regsignals have an undecidable emptiness

problem, hence being more expressive than timed automata. In particular, we show here how to n encode each instance of the Post Correspondence Problem into a -signal regular expression. Interestingly, the problem comes from the “untimed” part, the time playing no role in this result. We remind here briefly the Post Correspondence Problem [Pos46] and the result concerning its

undecidability:

u v u v

i i i

Definition 6.6.1. A PCP instance consists of a finite list i of pairs of words,

ip

i

j

p

.Asolution of this instance consists of a finite list of indices j such that

u u u v v v

i i i i i i

p p

   

The Post Correspondence Problem is the problem of checking whether a given PCP instance has a solution. 106 6. Matrices of signals

Theorem 6.6.2 ([Pos46, HU92]). The Post Correspondence Problem is undecidable. n Theorem 6.6.3. The emptiness problem for -signal regular expressions is undecidable.

Proof. We encode each PCP instance into a -signal regular expression. Hence, supposing we

x y x y

i i i

are given the instance i , we associate to each PCP-domino the following

i p

-regsignal:

x

i

C B

y

C B

i

R

C B

i (6.34)

x

A

i

y

i

E

Then, by using the -regsignal defined in the proof of Theorem 6.5.2,

p

X



E E R

i

i

iff the given PCP has a nontrivial solution.

P



p

w E R E

To observe this, consider first a -signal i . By definition, we

i

w w w w w w w kE k w kR k j k

k k k j l

have with and for all

j

l p w u w v j k

j l j l

and j . This implies that and for all , and, by

j j

E w w

k

construction of , that .

w w

k We first need to prove that is a -signal. The following proposition will help

us:

z z Sig z z z z

Proposition 6.6.4. Given two -signals , suppose that are not an-

z z z z Sig z  z z  z

tisignals, that is, . Suppose also that is defined. Then is a

-signal.

Proof. The proof of this property is done by case study on the possible orderings which are com-

z

patible with z , respectively with .

For each of the two -signals, they are cases that can occur, under the assumption that the

-components and the -components are not antisignals, three when :

a b c

and the symmetric cases in which . Since not all combinations are possible due to the need

z  z

to have defined, the correct combinations are as much as 18. Let us denote the ordering

z compatible with z and the ordering compatible with and suppose that , hence

due to correct juxtaposition.

z  z z  z z  z

The only problematic components of are and . To see that the other

z  z z z Sig

components are indeed signals or antisignals, observe that and

z  z z z Sig

by hypothesis. n

6.6 The emptiness problem for -signal regular expressions is undecidable 107

z z Sig

1. For the six cases in which and , that is, when , we have that

Sig z z z z  z z z

z  z z z Sig

By similarity, the other two cases in which and are also solved.

2. For the last remaining case, when and , observe first that we

Sig

get z , and therefore

z  z z z Sig

we observe that

z z z z

But the four factors of this identity are signals, and signals have the following equidivisibility

property:

z z z z

either and

Sig

there exists such that

z z z z

or and

Graphically, the two possibilities are depicted in Figure 6.8

z z z

z  

  

  

or

   

z z z

z  

     

Fig. 6.8. The equidivisibility property.

z z z z

Let us consider the first variant, that is, and . It follows that:

z  z z z z z Sig

z  z Sig z  z

We have already seen that , hence none of the components of is a

z  z Sig

mixture of signals and antisignals, which means that . (Observe that, in this

z  z

case, on , we may choose the compatible ordering .)

z z z z ut

The same result follows if we choose the second variant, that is and .

Proof (of Theorem 6.6.3, continued). We may prove, by induction on j and by means of Proposi-

w w j

tion 6.6.4, that is a -signal.

w w w

k On the other hand, if we explicitely build from we get that

108 6. Matrices of signals

w w w w w u u u

k k l l l

 

k

w w w w w v v v

k k l l l

 

k

w w w w w w w

Then, the triangle identity 6.1 implies that and therefore ,

l

j

k

fact which assures that j is a solution of the given PCP instance.

u v

i i

p

For the reverse implication, suppose now that the PCP instance i has a solution

l j k u u u v

j l l l l

k

j . Let us denote, for simplicity, for each , and

j j j

k

v v l

l .

j

k

w j

j

k We build a sequence j of -signals which, intuitively, record the positioning of the -th

domino in the chain of concatenations. Formally:

u v u u v

l l l

j j j

l l

j j 

B C

u v u v v

l l l

B C

j j  j

l l

j j

w

B C

j

u u v u v

A

l l

j j 

l l l

j j  j 

u v v u v

l l

j j 

l l l

j  j j 

w u v

l l

For example, j holds the word or antiword that lies in between the occurrence of and .

j j

i k u v l

l j

k

The fact that i is a solution implies that, for each , is either a

j

l

j

w kR k j

word or an antiword. Hence j .

w w w

j

It is then easy to check by induction that j and that the concatenation

fg fg

w w w

k k

is a -signal. The proof is accomplished if we observe that , hence

we may concatenate at left and right with the matrix E , viewed this time as a -signal, to get

that

p

X



E w w E E R E ut

k i

i n

Note that the problems concerning the semantics of -signal regular expressions, problems

due to nonclosure of Sig under concatenation, are harmless for the proof of this theorem.

n n Corollary 6.6.5. -signal regular expressions are strictly more expressive than timed automata.

Throughout the following chapters we will search for the following two things:

n A subclass of -signal regular expressions having a decidable emptiness problem and the same

expressive power as timed automata and

n A discrete representation of this class, which allows manipulating only untimed -signals. The search will proceed hand-in-hand, since the discrete representation for the subclass will eventually lead to the decision procedure. 7. n-words and their automata

A closer look at Theorem 6.6.3 shows that time plays no roleˆ in the undecidability of the empti-

ness problem. It is only the untimed structure of n-dominoes that gives the possibility to encode

PCP instances into -domino regular expressions. We therefore need to study in deeper detail this

n n untimed structure, that is, untimed n-dominoes and untimed -signals. We will call the latter as - words. Actually, the whole theory of juxtaposition and concatenation might have been introduced

on untimed dominoes and n-words, but we have preferred introducing it for signals in order to justify its utility for the study of timed automata.

We investigate in this chapter a class of finite automata that is naturally associated with these

n n n-words. We will call these automata as -automata. The idea is to have accepting sets, such

that a run accepts an n-word iff it passes through an accepting set exactly when it crosses one of

the distinguished points in the n-word. Of course, the accepting sets are indexed, such that when i

crossing the distinguished point i, the -th accepting set is reached. In the matrix presentation of i

n-words, this is rephrased as follows: the run in between the moment of passing through the -th

j w

accepting set and the moment of passing through the -th accepting set is labeled with ij .Or,in

w w

the case ij is an antiword, its inverse labels the run between the moment of passing through

ij i

the j -th accepting set and the moment of passing through the -th accepting set. n We show here that n-automata are as expressive as sums of -regwords (that is, sums of untimed

n-regsignals) and that they are closed under concatenation. We also show that they have a decidable

emptiness problem, though with a high complexity solution (in the NP class [GJ79]). n This allows us to identify better what harms the emptiness problem for -word regular expres-

sions: it is the star operation in combination with the elasticity of -regwords that represent each

PCP domino. By elasticity we name the property that, for some -regword which represents a PCP domino, allows the two words in the domino to be arbitrarily far away from one other. Our idea is

then to forbid this elasticity both at the untimed and timed level and to show that, when simulating

n n timed automata with -signal regular expressions, we obtain non-elastic -regsignals too.

Non-elasticity does not prove to be a nice algebraic property since it is not closed under concate- n nation. But our search is for a property that assures decidability rather than for a class of -word

regular expressions which is decidable, since such a property can be checked on different classes n of algebraically closed classes of -signal languages. One question might be asked here: why do we “complicate” our life and use a fussy class of automata and not work with classical finite automata and the intersection construction? But in fact, our class of automata is nothing else but a compact representation of an “asynchronous” 110 7. n-words and their automata

composition of finite automata. Then, in a certain sense, our non-elasticity property requires a bound on the asynchronicity in order for the emptiness problem to become decidable. Even more, our class of automata will be able to represent also timing constraints over the continuous time

domain, as we will see in the next chapter. n

This chapter runs as follows: in the first section, we define n-words, -regwords and the reg-

n n

ular expressions over -regwords, and show that all the algebraic properties of -signals and

n n n-regsignals from Chapter 6 hold for -words and -regwords. The second section contains the

definition of n-automata and their basic closure properties, must notably the closure under projec-

tion and juxtaposition. We also show here that the emptiness problem for n-automata is decidable n and that n-automata are equivalent to -regwords. The third section serves for the introduction of

the non-elasticity property and some basic observations on it. The fourth section contains the main n

result of this chapter, the star closure property of -automata whose accepted languages have the n property that all their powers are non-elastic -word languages.

7.1 n-words n n-words can be thought as words with distinguished points, similarly to -signals. Hence, when

presenting words with distinguished points, we must employ antiwords, that is, words over the set

fa j a g

of symbols . Algebraically, we work on the free group generated by the

set of symbols , which is nothing else but the set , endowed with a concatenation

operation which “cancels” inverse letters.

n w w

ij

n Definition 7.1.1. An untimed -domino is a matrix ij of elements from

which satisfies the triangle identity 6.1, that is,

i j k n w w w

jk ik

for all ij

w w n

When ij we say that is an -word over .

A graphical representation of a -word is given in Figure 7.1:

a abab

a b a b

A

W

a bab

b a b a b a b

Fig. 7.1. A -word and its graphical representation.

The whole theory of projection/juxtaposition/concatenation will be used directly for n-words n ( -words where needed) without rephrasing, as it can be easily adapted to the untimed structure. We translate in this introduction all the notations and results for the untimed case:

7.2 n-automata 111

n WD n

The set of -words is denoted n while the set of untimed -dominoes over is denoted

UD n n

n . Note that juxtaposition of -words does not necessarily yield -words.

n WD n

An -word language is any subset of n . Similarly to -signal languages, the set of n

-word languages can be given a Kleene algebra structure with the concatenation inherited from n

-words and the resulting star operation.

R n n

An n-regword is then a matrix whose entries are (untimed) regular expressions over

n RW

. The set of -regwords is denoted n .

n n w

The semantics of an -regword consists of untimed -dominoes with the property that ij

R i j n kR k

ij for each and is denoted :

kR k w UD j w jR j i j n

ij ij

n for all

E j E

Remind that j denotes the semantics of the classical regular expression . n Similarly to n-regsignals, -regwords semantics is not compositional w.r.t. projection but is

compositional w.r.t. juxtaposition.

n R WD R i j

For each -regword n we will denote the “positive part” of the -

ij

R i j R

component of R and the “negative part” of the -component of , that is,

ij

R R R R R R R

ij ij

ij with

ij ij ij ij

R

In fact, we will utilize this decomposition for both R and being regular expressions that

ij ij

R R ij

denote respectively ij and . n

The set of -word regular expressions is defined by the following grammar:



R j E E j E E j E

E (7.1)

n n where R is any -regword. Their semantics is based upon the -word language operations as

usually:

kE E k kE k kE k

kE E k kE k kE k

 

kE k kE k

n n All the properties that hold for n-regminoes and -regsignals will also hold for untimed -

regminoes and n-regwords.

Remark 7.1.2. Proposition 6.4.7 and identity 6.29 hold for untimed regminoes, since any intersec-

tion of untimed regular expressions over can be transformed into a regular expression

over .

7.2 n-automata

We define here a class of finite automata that are equivalent to n-regwords. The idea is to generalize

from finite automata by utilizing n sets of accepting states and requiring that the accepting runs 112 7. n-words and their automata

pass at least once through each of these sets. The class can be generalized to support also untimed

n-regminoes but we will not present this generalization.

n A Q Q Q n

Definition 7.2.1. An -automaton over an alphabet is a tuple in which

Q Q i n

Q is the finite set of states, is the transition function, and for each ,

Q Q i

i is the set of accepting states for index .

q a q

j j j

k

A run in such an automaton is simply a sequence of transitions i with

q a q i k

j j

j for all . We also have word-labeled transitions, as in finite automata:

w

q q q w

q if there exists a sequence of transitions from to whose concatenation of labels gives .

q a q i i k word i i

j j j

k

For any run i and two indices , we denote

i i the word or antiword which labels the transitions in between the -th state and the -th state in

the run:

a a a i i

i i

i iff

  

word i i

(7.2)

a a a i i

iff

i i i

  

w

q q

By mirroring this definition we also get antiword-labeled transitions: for w , if



w

q q

.

q a q

j j j

k

An accepting run is a run j that passes through each accepting set, i.e.,

i n fq q g Q

k i

for each

l

q a q n l

j j j

k

Given an accepting run j and a set of indices within this run,

l l k q Q i n n w WD

i i l i n

n

i with , such that for all ,an -word is

i

l

l

said to be accepted by the run and the index sequence iff

l l a a a

w

i j l l l

ij iff

i i j

i j n q w q

l ij

for each l that is,

i j

a a a l l j

iff i

l l l

i i j

l

l n We say that the sequence of indices l witnesses the acceptance of the -word by the run .

A first example is provided in the Figures 7.2 and 7.3:

b

a b a b

q q q q q

    

    

a

Q fq q g Q fq q g

Fig. 7.2. An example of a -automaton. The accepting sets are , and

Q fq g

.

a

The -automaton in Figure 7.2 accepts the -word in Figure 7.3 : the associated run is

q a q q b q q a q q b q

and the witnessing sequence . Note that the

b same run, but with the witnessing sequence accepts the -word in Figure 7.3 .

7.2 n-automata 113

a b a b b a b

     

q q q q q q q q q q

         

a b

Fig. 7.3. Two -words accepted by the automaton in Figure 7.2. The accepting runs are depicted below each word. The witnessing sequences can be retrieved by identifying the indices in the run

which correspond to the distinguished positions inside the -words.

Remark 7.2.2. Observe that a run might be longer in both directions than the word actually ac-

b

cepted by it. For example, the -word in Figure 7.3 might be accepted by a shorter run, namely

q b q q a q q b q

in combination with the witnessing sequence .

Hence in any n-automaton we may consider only runs that start and end in some accepting set,

in pair with witnessing sequences that contain the index and the final index in the run. Remark 7.2.3. Observe also that the witnessing sequence does not necessarily capture all the mo-

ments when the accepting run passes through an accepting set. In Figure 7.3, the two runs pass Q

twice through , but only once this pass is really needed and used.

There exists an alternative way of accepting n-words: we may define an accepting run as a

X q X a X q X

i i i i

sequence of tuples i with the following properties:

i i

ik

q a q

i i

i .

i k X X

For each , i .

i

i k j X n X q Q

i j

For each ,if i then .

i

X X n

k

and .

X i

The components i record the “history” of passing through accepting sets up to the -th state

i

while the X components also take into account the -th state. The translation from the “witness- i

ing” presentation to the “history” presentation is straightforward: given an accepting run and

l X fl j j ig

i i j

k

a witnessing sequence i , we construct the “history” components as

X fl j j ig

and j . For the reverse, given an accepting run in the “history” presentation

i

X q X a X a X l

i i i i i i i

n

we associate the witnessing sequence i in

i

ik

l j j X n X i

which i iff .

i

A LA n Definition 7.2.4. The n-word language accepted by , denoted , is the set of -words ac-

cepted by some accepting run in A, together with some witnessing set of indices. n The class of regular n-languages consists of the family of -word languages which can be

accepted by some n-automaton.

7.2.1 The emptiness problem for n-automata

Proposition 7.2.5. The problem of checking whether the language of an n-automaton is nonempty is an NP-complete problem. 114 7. n-words and their automata

Proof. Let us first show that the non-emptiness problem is NP-easy. To this end, consider the

Q

following “nondeterministic algorithm” (in the sense of [HU92]) that associates to each state q

n q an index set X with the property that there exists a run that starts anywhere, ends in

and passes through each of the accepting sets whose indices are in X :

Q

- pick q ;

X i n j q Q

- put i ;

T fq X g

- put S - put ;

n S T

- while X and do

T

- put S ;

q X T

- pick ;

- pick a ;

Q

- pick r ;

q a r

- if then stop;

X X fi n j r Q

- compute the new i ;

T n fr g P n fr X g - compute the new T ;

endwhile;

n if X then write(‘‘nonempty’’) else write(‘‘empty’’). This nondeterministic algorithm runs in polynomial time and linear space in the size of the

given n-automaton. It is clear that, if there exists an accepting run, then one of the choices in this algorithm will find it. For the NP-hardness part, we show that the Hamiltonian Path Problem (HP) can be polynomi-

ally reduced to checking emptiness of an n-automaton. Remind that the Hamiltonian Path Problem [GJ79] is the problem whether a given directed graph contains a path which visits each node ex- actly once.

The polynomial reduction of HP to the emptiness problem for n-automata is the following:

G V E V fv v g k k

given a graph with , we construct the following -automaton:

A V k Q Q k

where

v j v j j v v E j k

Q v j j j k

i i G We then have that each accepting run through A corresponds to a Hamiltonian path in and vice-

versa. This follows since an accepting run must have k nodes as it visits each accepting set at least

once and each visit increases the “level” of the node by one, and also the number of sets visited is t less or equal than the number of nodes in the run. u

We present here an algorithm that is an adaptation of the Floyd-Warshall-Kleene algorithm,

card Q

hence containing O iterations, but each iteration might take exponential time since it

n involves operations on possibly exponentially many subsets of . It associates to each pair

7.2 n-automata 115

q r n n qr

of states in the given -automaton, a set of subsets qr of . The set has the

X q r Q i

property that, for each qr there exists a run from to that passes through each for

X

each i . Once the matrix is constructed, the answer is “YES” if and only if there exists one

q r n

component with qr .

Q Q fq q g p p

For the computation of , we suppose an ordering of is given, say with

N P n

. A special operation on P , denoted , is used. This operation works as follows:

Y P n

given X ,

X Y Z Z j Z X Z Y

Remark 7.2.6. is associative.

k

p

The algorithm works by constructing a sequence of matrices k with

fX g i j l X q Q l

iff and for all i

ij

fg otherwise

k ij k ik k k k k k j

i j n k p X ij

Proposition 7.2.7. For each and , k iff there exists a run that

q q k j

starts in i , ends in , whose intermediary states are labeled with indices less than or equal to ,

Q l X

and which passes through all accepting sets l for each .

Proof. By induction on k .

k

For k the proof is trivial. Suppose we have proved the result for . Then, for each

i j n X X Z Z Z

ij

and k , by associativity we have that with

Z Z Z

k ik k k k k k j and . By induction hypothesis we get the follow-

ing three runs:

r a r

m

A run h with



h h h

r q r q

k

1. i , ;

m



r fq q g h m

k

2. for all ;

h

Q l Z h m r

l

3. For each there exists such that .

h

r a r

m

A run h with



h h h

r q r q

k

1. k , ;

m



r fq q g h m

k

2. for all .

h

l Z h m r Q

l

3. For each there exists such that .

h

r a r

m

A run h with



h h h

r r q q

j

1. k , ;

m



r fq q g h m

k

2. for all ;

h

l Z h m r Q

l

3. For each there exists such that . h

But then, by concatenating these three runs we obtain the run

r a r r a r r a r

hm hm h m

h h h h h h h h h    t which verifies the claimed property. The inverse implication follows similarly. u

116 7. n-words and their automata

LA i j p n ij

Corollary 7.2.8. iff there exist such that p .

The advantage of this algorithm is that the sets ij can be represented as BDDs, since they give

q q j

an “and-or” information concerning the runs that connect i to .

n

The search for a component that contains in the matrix might prove a lengthy process,

S

n Q

even if we restrict ourselves to only the union i . But, with a simple trick, we may only need

i

Q q

to check a single component: we append to the set of states two special states, denoted and

q q q

. is used for looping before any run and for looping after any run. We will call the state

q q

as the source state and the state as the sink state.

A n A Q Q Q n

Formally, we transform into the -automaton where:

Q Q fq q g

a a a a

q q q j q Q i n a q q q q q

i

A

Definition 7.2.9. The automaton A is called the completion of .

A

As a consequence, once we have constructed the matrix for , we only need to check whether

q q n

the component corresponding to contains . n

7.2.2 -transitions in -automata The class of n-automata was defined without allowing -transitions. However in the sequel we will

sometimes need them in order to make simpler constructions of n-automata.

n A Q Q Q Q n

An -automaton with -transitions is a tuple in which

g Q n f . The notions of run, accepting run and -word accepted by an accepting run are the same

as for “ordinary” n-automata.

The elimination of -transitions proceeds, like for finite automata, by computing the reflexive-

n transitive closure of the relation on Q. There is however a problem specific to -automata,

the recomputation of accepting sets. Remind that, in the process of removing -transitions from

finite automata, a state is declared as “accepting” (i.e. final) iff it reaches, after finitely many -

transitions, a final state. This cannot be the case for n-automata, as we may see from the example

in Figure 7.4.

b

The -automaton in Figure 7.4 is obtained by removing all the -transitions from the -

x

a q r

automaton in Figure 7.4 with the usual technique, that is, by putting in the new au-

x

q r q q r r

tomaton iff there exist states such that . We have denoted here the

Q

reflexive and transitive closure of the relation on .

Q fq g Q

But we need to redefine also the accepting states, and besides the choice and

fq g b

there is no way for redefining the accepting sets that renders the resulting -automaton at

a q

equivalent to the one at . The reason for this deadlock is that, if we choose to be in both

Q Q a

and then the resulting -automaton would accept the -word represented in Figure 7.5 ,

a which is not accepted by the -automaton in Figure 7.4 . The same situation occurs if we choose

7.2 n-automata 117

a

a

a b

a

q q q q q

b

a b

a

b a

b

q q q q q

?

?

a b

a b

Fig. 7.4. The “brute force” removal of -transitions from the -automaton at is drawn at .In

q q

this -automaton there is no way to establish the accepting sets to which and must belong.

a b a a

q q q

a b

n

Fig. 7.5. Two -words for exemplifying the peculiarities of removing -transitions in -automata.

q Q Q

to be in both and , whereas if we leave the accepting sets unchanged then the resulting

b automaton would not accept the -word depicted in Figure 7.5 .

The solution is to replicate each state that takes part into a sequence of -transitions, according to the number of distinct runs with -transitions that pass through it. For our -automaton in Figure

7.4 the solution is the -automaton in Figure 7.6.

a

q q

 

a

b a b

q q q q q

 

a

q q

 

b

Fig. 7.6. A -automaton without -transitions equivalent to the -automaton in Figure 7.4.

n A Q Q Q n

In general, suppose we are given an -automaton with -transitions .

q q X The states of the n-automaton without -transitions are pairs consisting of two states in

118 7. n-words and their automata

X n q

Q and an index set . The idea is to encode in each such state an -run that starts in ,

X

ends in q and passes through all the accepting sets whose indices are in . A

Formally, the following automaton without -transitions can be showed equivalent to :

Q Q Q

B where

n

q Q q q X j q q q q q q i k

i i k i

k

i such that and for all

j X i n q Q j

and for all there exists some with i

a a

q q X r r Y j q r

Q q q X j j X n

for all j j

7.2.3 Basic operations with n-automata

Proposition 7.2.10. The class of regular n-languages is closed under union and intersection. Proof. Both results are straightforward generalizations of the closure results for regular languages.

We will provide only the proof for intersection:

n A Q Q Q A Q Q Q n n

Given two -automata and , the -automaton

n

LA LA A Q Q Q Q Q Q

that accepts is n where

n

a a a

q q r r j q q r r ut

and

q q

Remark 7.2.11. Observe that the resulting automaton is a completion since the pair state

q q q q

plays the roleˆ of and the pair state works as .

n A Q Q Q J n n

Proposition 7.2.12. Given an -automaton and a subset

ard J p p LA p

with c , then the -word language can be recognized by some -automaton.

J

A Proof. The first step is to take the completion A of . Then we transform this automaton by

remembering in each state the set of indices from J which “can be visited” through a run that starts q

in .

J fi i g J p

Formally, suppose that is the presentation of in increasing order, that is,

k i l p B Q P J Q Q J

k . We then construct the -automaton where

np J

denotes the complement of J and

a a

q X r Y j q r X Y J r Q j Y n X

and j for all



P J Q Q Q Q P J

i or, in other words,

k k

k

l k J

This automaton is not yet the desired one since there is no guarantee that in an accepting run

J

in B , the index component of each state records exactly the set of indices from which have

q q

been visited. But we will take advantage of the existence of the completion states and in the

q

following way: we restrict the set of states to those that are reachable from and coreachable

q J Q B p

from . Denote the resulting set of states and the restricted -automaton. We claim that,

LA

with this restriction, only p-words in are accepted. J 7.2 n-automata 119

To this end, observe that an accepting run in B must necessarily be extensible to an accepting

q q J

run that starts in and ends in . But the construction of the transition function assures

i J Q P J

then that, for all this run must pass through a state belonging to i . Since it

Q i J

is an accepting run, it also passes through each i for all . Therefore, if we forget the set

p

component of each state we get an accepting run in A . Hence the -word associated with the run

J n A

in B is the -projection of the -word associated with the run in . The reverse proof follows t

similarly. u

m A Q Q Q n B m

Proposition 7.2.13. Given an -automaton ,an -automaton

Q Q Q p minm n p LA LB

and a number , the -juxtaposition of with is

n

m n p accepted by some -automaton.

Proof. The essential tool used here is the relationship between juxtaposition, extension and projec-

m n p

tion on n-word languages given by Identity 6.23. Hence we build the -automaton which

LA LB A B

accepts p as an intersection of an extension of with an extension of . The idea is

B m n p to use the completed automata A, resp. , transformed into -automata by adding new

accepting components. The new components will simply contain all the states in the automata.

A m n p A Q Q Q

mnp Formally, we transform into the -automaton

where:

Q Q fq q g

a a a a

q q q j q Q i m q q q q q

i

k m m n p Q Q

k for each

B B Q S S

mnp

Similarly we extend into where:

Q Q fq q g

a a a a

q q q q q q q q j q Q i m

i

Q m p

for each k

S

k

Q m p m n p

for each k

k mp

Observe that

LA w WD j w LA

mnp

m

L B w WD j w LB

mnp

mpmnp

Then identity 6.23 implies that

LA LB LA LB ut p 120 7. n-words and their automata

7.2.4 Relationship with n-regwords n Once at this point, we may ask what is the relationship between n-regwords and -automata. The

following theorem gives this relationship:

n n Theorem 7.2.14. The class of n-word languages accepted by -automata equals the class of -

word languages which are the semantics of a sum of n-regwords. Proof. The left-to-right inclusion works by the aid of the classical Kleene theorem as follows:

first, we decompose each n-automaton into several smaller ones, in which each accepting set is a

n A Q Q Q n

singleton. Hence, the language of the -automaton equals the union of

B Q fq g fq g q Q i n

n i i the languages of all the automata with for all .It

there remains to prove the inclusion for such automata.

card Q Q fq g i n

i i

So consider that for the given automaton i , i.e., for all .

i j n A

For each , denote ij the finite automaton whose transition function is and whose

q q A Q q fq g

j ij i j

initial, resp. final states are i , resp. , that is, . This automaton constrains

i j n

all the n-words whose -component is positive (i.e. in ). The constraint for the -words

i j A

whose -component is negative (i.e. in ) is provided by the inverted language of ji,

LA

ji .

E

Let us denote then by ij the regular expression over equivalent (by the Kleene theorem) to

A n R

ij . Construct the -regword whose components are

R E E ij

ij (7.3)

ji

E We have denoted here by E the expression obtained from by replacing each letter in with

its inverse.

A kR k We claim that L . The inclusion is assured by construction. For the reverse we will

essentially use the triangular identity characterizing n-words:

kR k n w

Consider w and let be an ordering on which is compatible with , that is, if

i j w A w

then ij . We construct a run in for , run which passes through the accepting states

in the order indicated by . The run is constructed inductively on the order as follows: denote

i k k n

k the -th index in the order , for .

k w jE j LA A q

i i i i i i

For we have i and hence we get a run in that starts in and

       q

ends in i .



k q q i

Suppose we have built a run, up to , that passes through i in this order. To extend



k

w A

i i i

this run we consider first a run associated with i in . This run exists since, by

k k  k k 

j LA jE w q q

i i i i i i i

hypothesis, i . Moreover this run starts in and ends in .

k k  k k  k k  k k  Then we just append this run to the one we have built so far.

The fact that this concatenation is consistent with the other constraints imposed on the word

k

follows by the triangle identity: for each l the fraction of the run that is associated with

w w A

i i i i i

i concatenated to the run associated with in is an accepting run associated

l k k k  k k 

w A

i i i

with i in . Observe also that the compatibility of the ordering assures that the

l k  l k 

w w

i i i

concatenation i gives a word and not a mixture of letters and antiletters.

k k  l k

7.3 Non-elasticity 121 n

The embedding of n-regwords into -automata works by an argument similar to the intersection

n R WD i j n n

argument. Hence, given an -regword n , for each pair we

consider the finite automaton which is associated with the regular expression R by the classical ij

Kleene theorem, be it

i j Q A Q i j i j Q i j

ij

f

Consider also the finite automaton which is associated with R ,beit

ij

i j Q i j Q i j i j Q A

ij

f

LA

Observe that ij .

A A A A

Build then the automaton ij as the union of and the inverse of . The inverse of is

ij ij ij

i j i j Q i j i j Q

Q where

f



a a

i j q i j g fq r j r

A Qi j i j Q i j Q i j

f

We denote ij .

A n

We transform ij into an -automaton by the same technique used in the proof of Proposition

7.2.13. That is, we do the following steps:

a a

q q Qi j q q Q i j q q q

Append two new states and to and put for each

a a

a q q Q i j a Q i j q q q

f

and , and for each and . We denote

Q fq q g

.

k n n fi j g Q Q i j Q Q i j Q Q i j

i j f

For each we put k and .

Denote the resulting automaton as A .

ij

i j n

Build then the intersection of all A for all by a straightforward generalization ij

of the intersection construction from Proposition 7.2.10. An easy check shows that the resulting

kR k ut n-automaton accepts indeed .

7.3 Non-elasticity n

In our search of a decidable class of -word regular expressions we may recall that, when we have n

constructed the -signal semantics to timed automata in 6.5.2, we have produced only signals with

i n t t j n t t n j

reset times in which for each , either i , or, for each , .Or,in -

i i

i n w j n w Sig

i jni signal format, for each , either in , or, for each , .

In the sequel we will focus on the following weaker property (written here for n-words):

(N) For each i < j ≤ n, one of the following requirements holds:
  (N1) w_{(i,n+i)} = ε;
  (N2) w_{(j,n+j)} = ε;
  (N3) w_{(i,n+j)} and w_{(j,n+i)} are both words.

Equivalently, this property says that for each i < j ≤ n, if w_{(i,n+i)} ≠ ε and w_{(j,n+j)} ≠ ε, then w_{(i,n+j)} and w_{(j,n+i)} are both words. We prefer the above formulation since we will make some reference to n-words that have only property (N3).

Observe that, for PCP dominoes, this property forbids the situation in which one of the words ends before the other begins. If we recall the proof of the undecidability of PCP, we may observe that the simulation of a Turing machine by a PCP instance requires copying the contents of certain cells, and this procedure needs "elastic" dominoes in which the relative distance between the two words composing a domino can be arbitrarily large.

We will show that this property assures the star closure of n-automata. As a consequence, it assures the decidability of the emptiness problem for n-word regular expressions.
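To make property (N) concrete, the following sketch tests it on an n-word represented by the positions t_1, ..., t_{2n} of its distinguished points. Both the representation and the reading of the clauses in terms of these positions (the factor between points i and j is empty exactly when t_i = t_j and is a genuine word exactly when t_i ≤ t_j) are assumptions made only for this illustration, not definitions from the text.

def is_non_elastic(t, n):
    """Check property (N) on an n-word given as the list t of the positions of its
    distinguished points (t has length 2n+1, index 0 unused).

    Assumed reading: (N1) t[i] == t[n+i]; (N2) t[j] == t[n+j];
    (N3) the two intervals [t[i], t[n+i]] and [t[j], t[n+j]] overlap,
    i.e. t[i] <= t[n+j] and t[j] <= t[n+i]."""
    for i in range(1, n + 1):
        for j in range(i + 1, n + 1):
            n1 = t[i] == t[n + i]
            n2 = t[j] == t[n + j]
            n3 = t[i] <= t[n + j] and t[j] <= t[n + i]
            if not (n1 or n2 or n3):
                return False
    return True

def is_strictly_non_elastic(t, n):
    """Under the same assumed reading, strict non-elasticity is taken here to mean
    that clause (N3) holds for every pair i < j."""
    return all(t[i] <= t[n + j] and t[j] <= t[n + i]
               for i in range(1, n + 1) for j in range(i + 1, n + 1))

Under this reading, an elastic n-word is simply one for which is_non_elastic returns False for some pair of indices.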

Definition 7.3.1. An n-word w is called non-elastic if property (N) holds for it. If for each i < j only the property (N3) holds then we say that w is strictly non-elastic.

For each n-word regular expression E, the non-elastic semantics of E consists of the non-elastic n-words in its semantics. For each n-automaton A, the non-elastic language of A consists of the non-elastic n-words in its language.

The set of indices i ≤ n which satisfy property (N3) for an n-word w is called the set of strictly non-elastic indices for w. Examples of elastic, non-elastic and strictly non-elastic words are given in Figure 7.7.

Fig. 7.7. An elastic word, a non-elastic word and a strictly non-elastic word.

For a more general scheme of the positioning of distinguished points in non-elastic or strictly non-elastic n-words, see Figure 7.8. This figure also presents intuitively the "interface" part and the "contribution" part of a non-elastic n-word. Note that in a strictly non-elastic n-word the interface part is empty.

Remark 7.3.2. Observe that, for any non-elastic n-word w, the factor w_{(i,n+i)} is a word. Moreover, if w is strictly non-elastic and w_{(i,n+i)}, w_{(j,n+j)} ≠ ε, then w_{(i,n+j)} and w_{(j,n+i)} are nonempty words as well.

Fig. 7.8. Positioning of distinguished points in a strictly non-elastic n-word and in a non-elastic n-word: the left and right "interface" parts surround the "contribution" part, on which all the points t_i precede all the points t_{n+i}; distinguished points falling in an interface part must occur at the same position.

Let us see now how non-elasticity works in n-automata: consider an n-automaton A, an accepting run ρ = (q_{t-1} a_t q_t)_{t ≤ m} in A, an n-word w ∈ WD_n and a sequence of indices (l_i)_{i ≤ 2n} witnessing the acceptance of w by ρ. If w is non-elastic then the witnessing sequence bears the following property: for each i < j ≤ n,

1. l_i ≤ l_{n+i};
2. if l_i ≠ l_{n+i} and l_j ≠ l_{n+j}, then l_i ≤ l_{n+j} and l_j ≤ l_{n+i}.

We will call a pair consisting of a run and a witnessing sequence with the above properties a non-elastic pair. It is clear that, in any n-automaton, non-elastic pairs are associated only with non-elastic n-words accepted by the automaton.
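Phrased operationally, the two conditions above are a simple check on the witnessing sequence. The following sketch assumes the reconstruction of conditions 1–2 given above, with the sequence stored as l[1..2n] (index 0 unused); it is illustrative only.

def is_non_elastic_pair(l, n):
    """Check that a witnessing sequence of indices into an accepting run satisfies
    the two conditions reconstructed above: (1) l[i] <= l[n+i] for each i, and
    (2) whenever l[i] != l[n+i] and l[j] != l[n+j] for i < j, the two index
    intervals overlap, i.e. l[i] <= l[n+j] and l[j] <= l[n+i]."""
    for i in range(1, n + 1):
        if l[i] > l[n + i]:
            return False
        for j in range(i + 1, n + 1):
            if l[i] != l[n + i] and l[j] != l[n + j]:
                if not (l[i] <= l[n + j] and l[j] <= l[n + i]):
                    return False
    return True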

The following property shows the way of constructing the non-elastic semantics for n-regwords.

Proposition 7.3.3. Start with some n-regword R. Denote by X the set of subsets X ⊆ [n] which satisfy the property that ε ∈ |R_{(i,n+i)}| for all i ∈ X. For each X ∈ X, denote by R^X the n-regword whose components R^X_{ij} are obtained from the components R_{ij} by the case distinction (7.4), depending on whether i and j lie in [n] or in [n+1, 2n] and on whether the corresponding indices belong to X.   (7.4)

Then the non-elastic semantics of R equals the semantics of the union of the n-word regular expressions R^X for X ∈ X.

Proof. The property follows by double inclusion. One direction is straightforward, since clearly the semantics of each R^X is non-elastic and included in the semantics of R. The reverse follows again easily, since each non-elastic n-word w in the semantics of R also belongs to the semantics of R^X, where X is the complement of the set of strictly non-elastic indices in w. □

Unfortunately, non-elasticity is not preserved by concatenation, as the example in Figure 7.9 shows.

Fig. 7.9. The concatenation of two non-elastic words does not necessarily yield a non-elastic word.

As a consequence, the undecidability Theorem 6.6.3 can be proved even for non-elastic words, since we may decompose the regwords corresponding to each domino in a PCP instance into a concatenation of non-elastic words.

We might be tempted to restrict to a "more partial" concatenation, let us call it non-elastic concatenation, that would produce only non-elastic n-words. But observe that this non-elastic concatenation is nonassociative, as exemplified in Figure 7.10.

Fig. 7.10. An example of nonassociativity of the non-elastic concatenation: the first parenthesis layout leads to an undefined result, since the concatenation of the two words inside the parentheses is elastic, whereas the second parenthesis layout gives a non-elastic word.

This would make doubtful the possibility to construct the associated non-elastic star, since the powers L^i of an n-word language L might not be uniquely defined.

Hence, non-elasticity is not a good algebraic property, and one might be tempted to search for other properties. But our aim in this chapter is not to find algebraic structures that are "good" w.r.t. decidability. We only want to isolate some property that assures decidability and then, in a subsequent chapter, to show that particular structures, associated to special classes of n-words, bear these properties and hence have decidable emptiness problems.

We will show in the next section that non-elasticity, when carefully handled, leads indeed to decidability. Careful handling means the following two conditions:

1. If we intend to concatenate two non-elastic n-word languages, we need to check first that only non-elastic n-words are produced.
2. If we intend to build the star of a non-elastic n-word language, we need to check first that, by concatenating the given non-elastic n-word language with itself an arbitrary number of times, we get only non-elastic n-words.

Observe that, in a certain sense, the second condition says that the non-elastic star is built in a "canonical" manner, since it assures that all the non-elastic concatenations on which it relies are associative.
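As an illustration of how the second condition might be tested in practice on a finite language, here is a small, bounded sketch; the concatenation operation concat (returning None when undefined), the predicate is_non_elastic and the bound max_power are all assumptions of the illustration, not notions defined in the text, and the actual condition quantifies over all powers rather than a fixed bound.

def powers_stay_non_elastic(language, concat, is_non_elastic, max_power):
    """Check, up to max_power, that concatenating the (finite, hashable) language
    with itself produces only non-elastic words.  Returns False as soon as an
    elastic product is found."""
    current = set(language)
    for _ in range(max_power - 1):
        next_power = set()
        for u in current:
            for v in language:
                uv = concat(u, v)
                if uv is None:              # concatenation undefined
                    continue
                if not is_non_elastic(uv):  # an elastic word was produced
                    return False
                next_power.add(uv)
        current = next_power
    return True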

7.4 The non-elastic star closure theorem

We start this section by noting that, as a combination of the juxtaposition and the projection constructions, the family of n-word languages accepted by n-automata is closed under concatenation. It is clear that if we require the given n-automata to accept only non-elastic n-words, and the concatenation of the two languages to produce only non-elastic n-words, we still get the same construction. We provide here a direct concatenation construction, as we will get in this way some intuition for the main theorem of this section, the star closure theorem.

Take two n-automata A = (Q, (Q_i)_{i ≤ 2n}, δ) and B = (Q', (Q'_i)_{i ≤ 2n}, δ'), both of which accept only non-elastic n-words. Suppose also that L(A) · L(B) is composed of non-elastic n-words. Our aim is to build an n-automaton for this set, and we proceed as follows.

If both automata accepted only strictly non-elastic n-words then the idea would be the following: we start A on the "NW quarter" of the given n-word and check whether on this section we pass through all the accepting sets Q_1, ..., Q_n. Then we continue until we reach an accepting state from Q_{n+1}, ..., Q_{2n} (the assumption that the word is non-elastic implies that the first such moment succeeds all the moments of passing through Q_1, ..., Q_n). At this moment we start B and synchronously fire transitions from both automata.

From now on, A must pass through some accepting set Q_{n+i} (i ≤ n) synchronously with a passage of B through the accepting set Q'_i. We record all indices i for which we have observed such a "synchronous passage" into some set X ⊆ [n]. Once this set equals [n], we are sure we have identified in our given n-word a section which is accepted by A, and we may now proceed with finalizing the run in B.

Formally, the n-automaton for L(A) · L(B) is built as follows.

Its states record either a state of A alone (while A parses the first part of the word), or a pair formed of a state of A and a state of B together with a subset of [n] recording the synchronous passages observed so far, or a state of B alone (while B finishes its run); its transitions advance the active component or components on the current letter, as described above, and its accepting sets are inherited from the accepting sets of A and B.

Fig. 7.11. Graphical exemplification of the concatenation construction: A parses alone on the first part, A and B parse synchronously on the part where they need to find the concatenation points, and B parses alone on the last part; the outer parts carry the distinguished time points.

Observe that an accepting run assures, by construction, that all the pairs of accepting sets Q_{n+i}, Q'_i are visited "in between" the last moment when an accepting set Q_j with j ≤ n is visited and the first moment when an accepting set Q'_{n+k} with k ≤ n is visited. This property is consistent with the hypothesis that all n-words in L(A) and L(B) are strictly non-elastic.

The conditions (N1) and (N2) pose specific problems, since we might need to "start" the automaton B "before" the automaton A has visited all the accepting sets Q_j with j ≤ n. But the idea is the same, namely to "synchronize" the two automata as follows: each time A passes through the accepting set Q_{n+i}, B must pass through the accepting set Q'_i, and vice versa.

Accepting states would then be simply tuples (q, q', X) in which q is a state in A, q' is a state in B and X is the set recording synchronous passages.

At this point, one problem might arise, a problem which we have observed also in the projection construction: to actually be sure that the X component has recorded all the synchronous passages, we need to start with an empty X and finish with X = [n]. We do this by working with completed automata, and then reducing the state space of the resulting n-automaton to the states reachable from the state combining the two source states with X = ∅ (which acts as a source state) and coreachable from the state combining the two sink states with X = [n] (which acts as a sink state).
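Trimming to the states that are reachable from the source and coreachable from the sink is a standard operation on automata. As a reminder of what this step amounts to (and only as a generic sketch, not the construction defined next), here is one way to compute it with a forward and a backward breadth-first search; the representation of transitions as triples is an assumption of the illustration.

from collections import deque

def trim(states, transitions, sources, sinks):
    """Keep only states reachable from `sources` and coreachable from `sinks`.
    `transitions` is a set of triples (p, a, q); returns surviving states and
    transitions.  Illustrative sketch only."""
    forward, backward = {}, {}
    for (p, a, q) in transitions:
        forward.setdefault(p, set()).add(q)
        backward.setdefault(q, set()).add(p)

    def closure(start, edges):
        seen, queue = set(start), deque(start)
        while queue:
            p = queue.popleft()
            for q in edges.get(p, ()):
                if q not in seen:
                    seen.add(q)
                    queue.append(q)
        return seen

    keep = set(states) & closure(sources, forward) & closure(sinks, backward)
    return keep, {(p, a, q) for (p, a, q) in transitions if p in keep and q in keep}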

Formally, we build first the completions of A and B; we denote by q_A^s and q_B^s their source states and by q_A^f and q_B^f their sink states. We then define the following n-automaton:

  C = ( Q × Q' × P([n]), (S_i)_{i ≤ 2n}, δ_C ),

where Q and Q' now denote the state spaces of the completed automata, the transitions are of the form

  ((q, q', X), a, (r, r', Y))  with (q, a, r) a transition of the completed A, (q', a, r') a transition of the completed B, X ⊆ Y ⊆ [n], and r ∈ Q_{n+i}, r' ∈ Q'_i for every i ∈ Y \ X,

and the accepting sets are S_i = Q_i × Q' × P([n]) and S_{n+i} = Q × Q'_{n+i} × P([n]) for all i ≤ n.

Finally, drop all the states of C that are not reachable from (q_A^s, q_B^s, ∅) or not coreachable from (q_A^f, q_B^f, [n]); that is, consider the automaton D obtained by restricting C to the set of states (q, q', X) for which there exist words u and v with

  (q_A^s, q_B^s, ∅) →^u (q, q', X) →^v (q_A^f, q_B^f, [n]).

Proposition 7.4.1. L(D) = L(A) · L(B).

Proof. For the right-to-left inclusion, take w ∈ L(A) and w' ∈ L(B) such that w · w' is defined. Both n-words come with an accepting run, be they ρ = (q_{t-1} a_t q_t)_{t ≤ k} for w and ρ' = (q'_{t-1} a'_t q'_t)_{t ≤ k'} for w', and with two witnessing sequences of indices, (i_l)_{l ≤ 2n}, respectively (j_l)_{l ≤ 2n}, such that q_{i_l} ∈ Q_l and word(i_l, i_m) = w_{(l,m)} for all l, m ≤ 2n, and q'_{j_l} ∈ Q'_l and word'(j_l, j_m) = w'_{(l,m)} for all l, m ≤ 2n.

Observe first that the assumption that w · w' is defined implies that, for each l, m ≤ n, w_{(n+l,n+m)} = w'_{(l,m)}. Hence the piece of the run ρ that lies in between q_{i_{n+l}} and q_{i_{n+m}} has the same length as the piece of the run ρ' that lies in between q'_{j_l} and q'_{j_m}. More formally, word(i_{n+l}, i_{n+m}) = word'(j_l, j_m). This has the following important consequence:

  i_{n+m} − i_{n+l} = j_m − j_l   for each l, m ≤ n.   (7.5)

Our first aim is then to extend the two runs such that they have equal length and the moment when the first run passes through Q_{n+l} is the same as the moment when the second run passes through Q'_l. This extension will be accomplished by adding loops in the source and/or sink states of the completed automata.

We may assume, according to Remark 7.2.2, that both runs start and end in some accepting set, and that both witnessing sequences contain the first and the last index of the respective run. Moreover, by the non-elastic assumption we may consider that the indices among i_1, ..., i_n precede the indices among i_{n+1}, ..., i_{2n}, and similarly for the second run.

On the other hand, Identity 7.5 relates the earliest moment at which ρ passes through some accepting set Q_{n+h} (h ≤ n) with the latest moment at which ρ' passes through some accepting set Q'_h (h ≤ n). That is, if we denote

  μ = min{ i_{n+l} | l ≤ n },  μ' = min{ j_l | l ≤ n },  ν = max{ i_{n+l} | l ≤ n },  ν' = max{ j_l | l ≤ n },

then, due to Identity 7.5, these extremal moments satisfy the relation (7.6). On the other hand, from the equality word(i_{n+l}, i_{n+m}) = word'(j_l, j_m) we must also have that the letters read by the two runs at corresponding positions coincide, for each h:

  a_{μ+h} = a'_{μ'+h}.   (7.7)

We then extend ρ by adding replicas of its sink state at the end and one replica of its source state at the beginning; similarly, we extend ρ' by adding replicas of its source state at the beginning and one replica of its sink state at the end. Observe that the two extended runs would then have the same length. More formally, we consider the runs ρ̄ = (r_{t-1} b_t r_t)_{t ≤ k̄} and ρ̄' = (r'_{t-1} b_t r'_t)_{t ≤ k̄}, obtained by shifting the original runs so that the positions μ and μ' coincide. Property 7.7 assures that the labels b_t can be uniquely chosen for all t ≤ k̄.

Observe that these two extended runs bear the desired property: the moment when the first run passes through Q_{n+l} is the same as the moment when the second run passes through Q'_l.   (7.8)

We finally construct the run ρ_C = ((r_{t-1}, r'_{t-1}, I_{t-1}) b_t (r_t, r'_t, I_t))_{t ≤ k̄}, in which I_0 = ∅ and

  I_t = I_{t-1} ∪ { l ≤ n | r_t ∈ Q_{n+l} }.

Property 7.8 assures that ρ_C is indeed a run in C, because it implies the following fact: if l ∈ I_t \ I_{t-1} then r_t ∈ Q_{n+l} and r'_t ∈ Q'_l. Observe also that, for each t ≤ k̄, I_t records all the indices l ≤ n with the property that the run has passed through Q_{n+l} × Q'_l.

Moreover, since the run contains only states that are accessible from (q_A^s, q_B^s, ∅) and coaccessible from (q_A^f, q_B^f, [n]), it is actually a run in D.

We consider then the family of indices (p_l)_{l ≤ 2n}, defined as follows: for l ≤ n, p_l is the position, in the extended runs, of the l-th witnessing index of the first run, and for n < l ≤ 2n it is the position of the (l − n)-th witnessing index of the second run. It remains to prove that the pair consisting of ρ_C and this family is associated with w · w', that is, that it witnesses the acceptance of w · w' by ρ_C:

1. For l, m ≤ n we have (w · w')_{(l,m)} = w_{(l,m)} = word(i_l, i_m); since the corresponding positions in ρ_C are p_l and p_m, the piece of ρ_C lying between them is labeled by precisely this factor.

2. For n < l, m ≤ 2n we have (w · w')_{(l,m)} = w'_{(l−n,m−n)} = word'(j_{l−n}, j_{m−n}), and the same argument applies through the second component of ρ_C.

3. For l ≤ n and n < m ≤ 2n we have to decompose (w · w')_{(l,m)} into a concatenation of two words or two antiwords. We then have three subcases, according to the relative position of the two distinguished points and of the concatenation point: the first two reduce to the cases above, while in the third the non-elasticity assumption, combined with Identity 7.5, allows us to split (w · w')_{(l,m)} at the concatenation point and to conclude as in the previous cases.

For the reverse inclusion, take some accepting run ((r_{t-1}, r'_{t-1}, I_{t-1}) b_t (r_t, r'_t, I_t))_{t ≤ k} in D, with (r_0, r'_0, I_0) = (q_A^s, q_B^s, ∅) and (r_k, r'_k, I_k) = (q_A^f, q_B^f, [n]), and fix some family of indices (l_i)_{i ≤ 2n} such that (r_{l_i}, r'_{l_i}, I_{l_i}) ∈ S_i for all i ≤ 2n. Suppose also that the associated word is w, that is,

  word(l_i, l_j) = w_{(i,j)}   for all i, j ≤ 2n.

Let us first observe that I_k = [n] implies that for each j ≤ n there exists some index p_j ≤ k such that j ∈ I_{p_j} \ I_{p_j − 1}. By construction, this implies that r_{p_j} ∈ Q_{n+j} and r'_{p_j} ∈ Q'_j.

But then the sequence (r_{t-1} b_t r_t)_{t ≤ k} is an accepting run in the completed A: consider the n-word z accepted by this run and the sequence of indices (l_1, ..., l_n, p_1, ..., p_n), that is, bearing the following properties:

  word(l_i, l_j) = z_{(i,j)},  word(p_i, p_j) = z_{(n+i,n+j)},  word(l_i, p_j) = z_{(i,n+j)},  word(p_i, l_j) = z_{(n+i,j)}.

As a consequence, z_{(i,j)} = w_{(i,j)} for all i, j ≤ n.

Similarly, the sequence (r'_{t-1} b_t r'_t)_{t ≤ k} is an accepting run in the completed B, and we may construct the n-word z' accepted by this run and the sequence of indices (p_1, ..., p_n, l_{n+1}, ..., l_{2n}), with the corollary that z'_{(i,j)} = w_{(n+i,n+j)} for all i, j ≤ n.

But the above relations also imply that, for all i, j ≤ n, z_{(n+i,n+j)} = z'_{(i,j)}, hence z · z' is defined. If we corroborate this with the observations that z_{(i,j)} = w_{(i,j)} and z'_{(i,j)} = w_{(n+i,n+j)}, we get that

  w = z · z' ∈ L(A) · L(B).  □

Remark 7.4.2. 1. Reachability plays an essential role in the proof of the reverse inclusion. Without the hypothesis that the run of D starts in (q_A^s, q_B^s, ∅) and ends in (q_A^f, q_B^f, [n]) we would not be able to split this run into accepting runs of A and B. Specifically, the existence of the family of indices (p_i)_{i ≤ n} (which helped constructing the two n-words z and z') could not be assured.

2. The indices p_i cannot be placed anywhere in the run, since, by hypothesis, A and B are non-elastic. Concretely, if an index p_i preceded an index l_j with j ≤ n, then the corresponding factor of z would be an antiword, and this may happen only when that factor is empty. Hence we must have l_j ≤ p_i.

The main result of this chapter is the following:

Theorem 7.4.3. Given an n-automaton A, suppose that, for any k ∈ N, (L(A))^k is a non-elastic n-word language or, equivalently, that (L(A))* is a non-elastic n-word language. Then (L(A))* is accepted by an n-automaton.

Proof. We will actually construct the automaton for the union of the nonzero powers of L(A), since the automaton for (L(A))* then follows by applying the union construction. The technique draws much from the concatenation construction for non-elastic n-automata. We will first explain the construction for an n-automaton that accepts only strictly non-elastic n-words, and then generalize this construction to arbitrary non-elastic n-automata.

The idea for the simpler case of strictly non-elastic n-automata is to use two component states, as in the proof for concatenation, but each time one component has completed an accepting run in A, a new component starts checking for an accepting run in A. That is, at each moment during the parse of the given n-word we have one or two copies of A working synchronously, and on the sections where two copies are working, a passage of the first copy through Q_{n+i} must be simultaneous with a passage of the second copy through Q_i. Also, each such index i that witnesses a synchronous passage must be recorded in a set I. (Of course, there might be passages through Q_i or Q_{n+i} that are ignored.) We may consider the points where these "synchronous passages" happen as the "concatenation points" between two factors of the given n-word.

When the index set equals [n], the first copy has identified a section of the n-word which is accepted by A, hence it stops. The second copy must continue its search, since it has only passed through the accepting sets Q_1, ..., Q_n. Next time it reaches an accepting set from Q_{n+1}, ..., Q_{2n}, a new copy of A is restarted and proceeds synchronously with the old copy as described above. The whole process is stopped with a choice not to start a new copy of A after a passage of the active copy through one of Q_{n+1}, ..., Q_{2n}, hence leaving to this active copy only the task of finishing an accepting run in A.
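The bookkeeping behind this description can be pictured as one or two active copies of A together with the set of indices already witnessed by a synchronous passage. The following sketch only illustrates that bookkeeping; the field names are ours, the formal automaton B is defined below, and in particular the construction may nondeterministically ignore a passage, which the sketch does not model.

from dataclasses import dataclass

@dataclass(frozen=True)
class StarState:
    """first:   state of the copy of A started earlier (always present);
    second:  state of the more recently started copy, or None if only one runs;
    witnessed: indices i for which a passage of the first copy through Q_{n+i}
               has been observed simultaneously with a passage of the second
               copy through Q_i.  When witnessed == {1, ..., n}, the first copy
               has identified a factor accepted by A and may be dropped."""
    first: object
    second: object = None
    witnessed: frozenset = frozenset()

def observe(state, in_Q_n_plus, in_Q, n):
    """After one synchronous step, record every index i <= n for which the first
    copy is in Q_{n+i} while the second copy is in Q_i.  The membership tests
    in_Q_n_plus(q, i) and in_Q(q, i) are assumed to be supplied by the caller."""
    if state.second is None:
        return state
    new = frozenset(i for i in range(1, n + 1)
                    if in_Q_n_plus(state.first, i) and in_Q(state.second, i))
    return StarState(state.first, state.second, state.witnessed | new)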

The formalization of this construction is the following: start with a strictly non-elastic n-automaton A = (Q, (Q_i)_{i ≤ 2n}, δ). The automaton (with ε-transitions) accepting the union of the nonzero powers of L(A) would then be

  B = ( Q ∪ (Q × Q × P([n])), (Q^B_i)_{i ≤ 2n}, δ_B ),

whose transitions either advance a single copy of A, or advance two copies of A synchronously on the same letter while recording, in the P([n]) component, each index i for which the first copy lies in Q_{n+i} at the same time as the second copy lies in Q_i, together with ε-transitions that start or stop a second copy as described above.

A graphical presentation of the way B parses a strictly non-elastic n-word is given in Figure 7.12. Observe that strict non-elasticity assures the fact that, at each moment, only two copies of A are necessary, because it is not possible that more than two n-words overlap on the same part.

Fig. 7.12. A graphical presentation of the way B parses a strictly non-elastic n-word. We also suggest here that no three consecutive factors in a decomposition of w may overlap.

This is a very important property that is not valid for elastic concatenations – for example, when concatenating dominoes of a PCP instance.

In order to cope with properties (N1) and (N2) we need to take into account the fact that more than two copies of A might have to be initiated when the active copy passes through some accepting set Q_{n+i}. Intuitively, when concatenating (non-strict) non-elastic n-words, interface parts of the factors overlap and hence at each point we might have more than two copies of A that need to work synchronously.

Suppose we have to parse a non-elastic n-word w whose decomposition is

  w = w_1 · w_2 · … · w_p   with w_j ∈ L(A) for all j ≤ p.   (7.9)

Two important observations are to be made here: the first is that on each part of w no more than two factors may overlap on their contribution parts – and, as a consequence, these two factors must be successive factors, i.e., w_k and w_{k+1} for some k ≤ p.

The second observation is related to the interface parts of the factors: on each part of w (viewed as a word with distinguished points), several factors may overlap on their left interface part – be they w_k, ..., w_l, with k ≤ l ≤ p. We may consider that this part starts at the leftmost symbol from w_l – that is, that it is a prefix part of w_l. The number of factors which overlap on their left interface part is no longer uniformly bounded, as it was the case with the contribution part. But they bear an important property: if, in the considered part of w, the factor w_l contains some distinguished point t_i for some i ≤ n, then the preceding factor must have its distinguished point t_{n+i} on the same position – and, as that point lies on its left interface part, its distinguished point t_i must also be on the same position. Inductively, we obtain that all the factors w_k, ..., w_l must have their distinguished points t_i and t_{n+i} on the same position.

This property is very important, since it implies that any run in A which is associated to w_k during this part of w – that is, which passes through all the accepting sets Q_i at the distinguished points t_i of this part – is also a run associated to w_l! Of course, this does not mean that the same run will be extensible to an accepting run for the whole w_l: maybe the run for w_k will eventually lead to a deadlock when trying to associate it to w_l. But this property says that we do not need to memorize the whole sequence of states in the run; we only need to memorize the set of states in which the automaton A might be on a run which is associated to this prefix part of w. And this imposes a uniform bound on the memory that is needed for a device that accepts the union of the nonzero powers of L(A): this bound is proportional to the cardinality of Q. Hence, at least intuitively, a finite automaton would suffice.
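This bounded-memory argument is the one behind the classical subset construction on finite automata: instead of remembering one particular run, we remember the set of states the automaton might be in. For reference only, here is a minimal sketch of that classical construction for a plain NFA; it is an analogy rather than the construction developed below, which keeps richer components, and the representation is illustrative.

def determinize(initial, finals, delta, alphabet):
    """Classical subset construction for an NFA whose transition map delta sends
    (state, letter) to a set of states.  Returns the subset-states, the
    deterministic transition map and the accepting subsets."""
    start = frozenset(initial)
    states, worklist, dtrans = {start}, [start], {}
    while worklist:
        S = worklist.pop()
        for a in alphabet:
            T = frozenset(q for p in S for q in delta.get((p, a), ()))
            dtrans[(S, a)] = T
            if T not in states:
                states.add(T)
                worklist.append(T)
    accepting = {S for S in states if S & set(finals)}
    return states, dtrans, accepting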

We will now introduce several notations and ideas for our construction. Denote first

X n j i n i n X i X

X (7.10)

Q X q X j X X X X X i X n X q Q

and for all we have i (7.11)

S T S T Q

The states of our “starred” automaton will be quadruples with and

S Q fg. is called the left active component, is the right active component while is the history

component and T is the prophecy component.

The pair plays the same role as the pair of states in the construction for the strictly non-

1

n Q

i

elastic case for strictly non-elastic -words as must pass through an accepting set n at

Q i n

the same time passes through the accepting set i for the same . We say here that

X q X Q

is passing through some i iff The prophecy component’s utility is then the following: it provides the bounded memory which is needed for parsing the left interface parts of the factors. Symmetrically, the history component is utilized for parsing the right interface parts of the factors. 7.11

Formally, the states are of the following three forms:

X q X T X q X Q T Q

1. where and , with the property that, for all

Y r Y T

,

n n Y n n

a) X .

n n Y n n

b) X .

Y n Y n n Y n Y n n X n X n n c) .

These states are used during the search for the first set of concatenation points, the ones that

w w

“separate” from . The requirements intuitively say that the tuples in the prophecy compo- Q

nent cannot consider passing through some accepting set i unless the right active component

Q i

i

is passing through the accepting set n for the same .

S X q X Y r Y T X q X Y r Y Q S T Q 2. with and , with the following

properties:

n n Y n n

a) X .

n n Y n n

b) X .

U s U S

c) For each ,

n n X n n

i. U .

n n X n n

ii. U .

U n U n U n U n n n X n X n

iii. .

V t V T d) For each ,

 Note that, in fact, we have a difference with the construction in the strictly non-elastic case since both the left and the right active

components record all the indices of the accepting sets throughout which they pass, that is, their memory needs to reach the set

n A before considering they have accomplished their duty of tracking an accepting run in .


n n V n n

i. Y .

n n V n n

ii. Y .

V n V n n V n V n n Y n Y n n

iii. .

w w

j

These states are used when trying to find concatenation points in between the j and for

p

all j .

S X q X X q X Q S Q Y r Y S

3. with , , with the property that for all ,

n n X n n

a) U .

n n X n n

b) U .

U n U n U n U n n n X n X n

c) .

w

These states are used when trying to find concatenation points for the last concatenation p w

p .

The transitions are the following:

a

X q X T X q X T

1. iff

a

q

a) q ;

a

V t V T V t V T t t b) For all there exists such that . These transitions are used during the first search for “concatenation points”. The above re-

quirements, corroborated with the consistency requirements on hex-uples, say that, if the active

Q

i

component considers passing through some accepting set n then each tuple in the prophecy Q

component must also consider passing through the accepting set i .

a

S X q X Y r Y T S X q X Y r Y T

2. iff

a a

q r r

a) q , ;

a

U s U S U s U S s s

b) For all there exists such that ;

a

V t V T V t V T t t c) For all there exists such that .

This is the general pattern for the evolution of all the copies of A during their search for

“concatenation points”.

a

S X q X S X q X

3. iff

a

q

a) q ;

a

U s U S U s U S s s b) For all there exists such that .

These transitions are used after finding the last “concatenation points”, that is, while parsing

w w

the last factor p of .

S X q X Y r Y T S Y r Y Z s Z T

4. iff

X X q X S

a) There exists X such that ;

X Y r Y T

b) There exists Y such that ;

Z s Z S Z X Z s Z S

c) For each there exists such that .

Z s Z T Z X Z s Z T d) For each there exists such that . This transition is taken upon the decision that the current left active component needs to post- pone its search for concatenation points – because it has arrived at the “right interface” of the

current factor.

X q X T X q X Y r Y T 5. iff


X Y r Y T

a) There exists Y such that ;

Z s Z T Z X Z s Z T b) For each there exists such that .

Transitions of this type are taken upon decision to activate a second copy of A in order to check

the first set of “concatenation points”.

S X q X Y r Y S Y r Y

6. iff

X X q X S

a) There exists X such that ;

Z s Z S Z X Z s Z S b) For each there exists such that . This transition is taken upon decision that the latest set of “concatenation points” should be the

last one to be checked.

n

The accepting sets are, for all i ,

U S X q X Y r Y T j i X n X Z s Z S i Z n Z

i and for all

X q X T j i X n X

(7.12)

U S X q X Y r Y T j i n Y n Y Z s Z T i n Z n Z

i

n and for all

S X q X j n i X n X

(7.13)

q q

Finally, we restrict the state space only to states reachable from

n q n n q n

and coreachable from . Let us de-

D Q U U

 n

note  the automaton build above.



A LD Claim. L .

Proof (of the Claim). Consider some accepting run in A,

S T a S T

i i i i i i i i i

i m

with

S T q q

S T n q n n q n

m m m m

n w WD p p

n i i

n

Consider also some non-elastic -word and a sequence i with

w m

, sequence which witnesses the acceptance of by , that is,

S T U i n

p p p i

p for all , and

i i i i

w

ij

S T S T i j n

p p p p p p p

p for all .

i i i i j j j j

q q

Since the first state in the run is and the last state in the run

n q n n q n is , the run must contain some - transitions which shift tuples from the right active to the left active component or make tuples “arise” or “disappear” in the left and right active components. More specifically, we may find an


k k

j j

p

increasing sequence of indices j such that the -th transition in is of type 4,5 or 6. At

k k p

a closer look, we note that the -th transition must be of the type 4, the -th transition must be of

k k p

p the type 6 and the transitions must be of the type 5, and also that .

More formally:

S S

k k k

  

X q X

k k k k

   

k



X q X

k k k

  

k



Y r Y T Y r Y T Y r Y j Y X

k k

k such that or

  

T T

k k k

p p



X q X

k k k k

p p p p

k

p

X q X

k k k

p p p

k

p

S Y r Y j Y X Y r Y S Y r Y

k k

such that k or

p p p

k k

j j

S Y r Y j Y X Y r Y Y r Y S

k k

such that k or

j j j

T Y r Y j Y X Y r Y T Y r Y

k k

such that k or

j j j

n p n

Our aim is to show that w , the -word associated with , can be factored into -words,

w w w w LA j p

p j with for all

Let us denote also

X q X i k m

i i

i for all

i

Y r Y i k

i i i

for all p i

Observe first that the following concatenation of transitions:

r a r r q q a q

i i i k k i i i

ik ik k

 

  

A k is a run in , since the -th moment corresponds to the shift of the right active component into

the left active component.

Y X

k ik k

Moreover, the sequence i records the indices of accepting states of

  

i i

A X n

through which has passed. Since it is possible that , we cannot say that this

k 

run is accepting. We would therefore like to extend it to an accepting run, by carefully extracting

S i k

more information from the history components i with :

Z Z Z s a Z s

We extend to a run i by induction by choosing, for

i i i i i i

i m

a

i

i k Z Z S s s A s

i

each , a tuple such that in . The fact that extends means

i i i i i that


Y Y i k i k

i

for for

i

Z Z

i i

i k k i k k X X

i for for

i

r i k i

iff

s

i

i k k q

i iff

The possibility to choose at each step a state s is assured by the requirements 2.b, 3.b,

i i

4.c, and/or 6.b, that transitions in D must obey, according to the situation in which the -th tu-

S T a Z Z

i i i i

ple i falls and also according to the label . Consequently we have that

i i

m

for all i .

Z Z S

m

Observe that and the last tuple in must belong to and hence must be

n q n Z Z n j Z n Z

m m

, therefore . Moreover, for each we

i i

Q s A

must have, by construction, that j . Hence is an accepting run in in the “history”

i

q q

presentation, run that starts in , ends in .

l

ll l l i

n

Therefore, if we associate to this run the sequence of indices u with iff

u u

l

u Z n Z n w ll

, then this sequence witnesses that accepts some -word . Observe that

i i

w n w

is a non-elastic pair, hence is a non-elastic -word. Our intuition is that is the first factor of

n LA w in its decomposition into -words from .

The remaining factors can be recovered, together with their accepting runs, by generalizing the

p

above argument: for each j the concatenation

r a r r q q a q

j i i i k k i i i

ik k ik k

j j

j  j j j 

n

is a run in A that we intuit to correspond to a part of an accepting run associated with a -word

w k k m

p

j ; we have denoted here and . We extend this run in both directions to an

q q

accepting run which starts in and ends in as follows:

j

j

j j

k k s r Z Y Z Y

j i i

For the part of the run in between j and we put , and , while

i i

i i

j

j j

q X k k s Z Z X

i i j

For the part of the run in between j and we put , and .

i i

i i

i k i k

j

Suppose we have build the run from to j , for some. . We then choose a tuple

j

a

i

j j j j

Z s Z T s A s

i such that, in .

i

i i i i The availability of this choice follows from the requirements 1.b, 2.c, 4.d, respectively 5.b from

the definition of the transition function of D .

k i i k

j

Suppose also that we have build the run from j to for some We then choose

j

x

j j

Z s Z x i s S s

i i

a tuple i such that if denotes the -th transition in , then then

i

i i

in A. The possibility to choose is assured by the requirements 2.b, 3.b, 4.c, respectively 6.b from the

definition of the transition function in D .

j

j

p Z Z

Also observe that in each such run, for each i , we have that as

i

i

assured by the definition of the transition function  . We call this property the Consecu- tiveness Property. 138 7. n-words and their automata

Finally, we associate to each run the sequence of integers

j

j

j

j j j

l

l l l i u Z n Z

n

u in which iff

u u i

i

n w A

This sequence witnesses that accepts some -word j in , which is actually a non-elastic

j n

-word.

A w

Though we have identified these accepting runs in , we still need to prove that j correctly

w w w

j concatenates to j and that the result of concatenating all is . To this end we prove the

following property:

p i m

(*) For all j and for all ,

j j

n n n n Z

Z and

i i

j j

n n n n Z

Z .

i i

Q

i

In other words, we prove that passes through some accepting set n at the same moment

j

Q i n

when passes through the accepting set i for the same . It is clear that this

j

w w

j

requirement is sufficient for proving that j and correctly concatenate.

i k k

j

For proving the desired property (*), let us observe first that for all j , the

run “gives” the left active component and “gives” the right active component of . That

j j

j j

j j j j

Z s Z X q X Z s Z Y r Y

i i i

is, i and . But then, by construction

i i i i

i i i i

Q X n n Y n n

i i

(requirements 2.a and 2.b in the definition of  ), we have

n n Y n n

and X . This implies that our property (*) holds over the

i i

k k k k

j j j

interval j . Observe also that the interval is nonempty since the

k

j

p

sequence j is strictly increasing.

k k

j

Consider now the interval j . In this interval, is the right active compo-

j

j

j j

Z s Z Y r Y i

nent while is included in the prophecy component. That is, i and

j

i i

i i

j

j j

s Z Z T i k j

i . Note that half of the property (*) holds for : by construction, we

i

i i

j j

j j

Z Z k Z Z

have and . On the other hand, the property (*) holds for j as

k k

k k

j j

j j

j j

Z n n Z n n

proved above, hence we have . Therefore, by the

k k

j j

j j

n n Z n n

Consecutiveness Property, Z .

k k

j j

j

We will then prove the property (*) by a “decreasing induction” argument: suppose that Z

i

j j

j j

Z n n i k k Z s Z T n n

j i

for some j . Since

i i

i i

j

j j

Z Z s

and is the right active component, we must have by construction (requirements

i

i i Q

2.c.i and 2.c.iii in the definition of  ) that

j j

n n Z n n

Z (7.14)

i i

j j

j j

Z n Z n n n Z n Z n n

(7.15)

i i

i i

We then have the following sequence of identities:


j j

j j

Z Z Z n n n n n n Z n n

i i

i i

j j

j

Z n n n Z n Z n n

i i

i

(by assumption)

j j

j

n n n n Z n n Z Z

i i

i

(by inclusion 7.15)

j

Z n n

i

j

Z n n

(7.16)

i

j j

n n Z

From identities 7.14 and 7.16 we get by double inclusion that Z

i i

i n n

, that is, the other half of property (*) holds for .

k k v j

v

Let us now consider the intervals v for . Within these intervals, both

j j

j j j j

Z s Z Z Z s T

and take part into the prophecy component, i.e., i .

j j

i i

i i i i

k k

v

Again as above, the validity of the property (*) within the interval v and the

j

n i k Z

Consecutiveness Property assures that half of the property (*) holds for v , namely

k

v

j

n n n

Z . We will prove by decreasing induction that it holds within all

k

v

k k

v

the interval v .

i

Let us provide first the properties that connect the right active component of the -th tuple

Q

in the run with the tuples of the runs and , as implied by the definition of  :

j j

j

Y n n Z n n

i (7.17)

i

j

Y n n Z n n

i (7.18)

i

j

n n Z n n

Y (7.19)

i i

j

n n Z n n

Y (7.20)

i i

j

j

Z n Z n n Y n Y n n

i (7.21)

i i

i

j

j

Z n Z n n Y n Y n n

i (7.22)

i i

i

j j

n n Z n n

Suppose that Z , hence, by the Consecutiveness Property,

i i

j j

j j

n n Z n n V Z n n n Z

Z . Denote further

i i

i i

j j j

Z X Z n n Z n n n n

. Since , we have that . Hence,

i i i

by inclusion 7.17,

j

V Z n n Y n n

i

i

On the other hand,

j j j

Z n n n Z n n Z X

V (since )

i i i

j j

Z n Z n n

i i

j

j

n n Z n Z

i

i

j

j

n Z n n

Z (by induction hypothesis)

i

i

Y n Y n n

i (by inclusion 7.22)

i


V Y n n V

Hence, in order to avoid the contradiction with i we must have .

j j

n n n n n Z Z

Similarly, if we denote V , we get that

i i

V Y n n

i

j

j

V n n Z n n n Z

i

i

j

j

Z n n n Z n n

i

i

j

j

n Z Z n n

i

i

j

j

n n n Z Z

i

i

Y n Y n n

i

i

j j

n n n n Z Z

which implies V too. Hence, and this proves

i i

k that property (*) holds within the interval j .

By mirroring the above arguments, we may show that property (*) also holds within the interval

k m k k v j

v v

j . We only show the argument for the intervals for :

In this interval, both and take part in the history component. Since the property holds

j j

j j j

j

i k Z Z Z n n n n Z

for v ,wehave . But since and

k k k

k

v v v

v

j

j

Z Z i k

we also have that the “first half” of the property (*) holds for v , that is,

k

k

v

v

j j

Z n n Z n n

. We will prove by increasing induction that it holds

k k

v v

i k k

v

for all v .

j j j j

j j

n n Z n n Z Z Z Z

Suppose that Z . Since and

i i i i

i i

j j

n n Z n n

we then get that Z , so it only remains to check the other

i i

half of the property (*).

j j

Z n n n Z n n

Denote V . We then have:

i i

j j

j j

Z n n n Z n n Z Z

V (since )

i i

i i

j

j

Z n n n Z n n

(by induction hypothesis)

i

i

j

j

Z n Z n n

i

i

X n X n n

i (by requirement 2.c.iii)

i

X n n

i

j

n n Z

(by requirement 2.c.ii)

i

j j

Z n n Z X

(since )

i i

Q V

(references are to requirements in the definition of  ) which proves that in order to avoid

contradiction between the first and the last inclusion.

j j

Z n n n Z n n

On the other hand, if we denote V we would

i i have the following sequence of inclusions:


j

j

V Z n n n Z n n

i

i

j

j

Z n n n Z n n

i

i

j

j

Z n Z n n

i

i

X n X n n

i (by requirement 2.c.iii)

i

X n n

i

j

n n

Z (by requirement 2.c.ii)

i

Hence again V in order to avoid the contradiction between the first and the last inclusion.

k

l p l p i n

ni

It remains to prove that i and for all , that is, that the concatena-

i ni

w w w k

tion really gives . But this property is evidently true, since, by construction of the

D Z Z q

accepting sets in (Identities 7.12 and 7.13), regardless of the component to which

p p p

i i i

belongs (but note it may belong only to the right or left active component or to the history compo-

i Z p n Z

i

nent), we have that , and similarly for n .

p p

i i t

This ends our proof of our first Claim. u



A LD

Claim. L

w w w w LA i p

p i

Proof. Start with a concatenation with for all , and

p n A w consider accepting runs in the completed -automaton , one for each i , together with their

witnessing sequences of indices:

i i i i

q a q l

i j m

n

with witnessing index sequence k

j j j i k

q q

We assume that each run starts in and ends in . i

Transform these runs into runs in the “history” presentations, that is, denote X the set of indices

j

i

j X

of the accepting states which were visited by each run i just before the -th step and by the set

j

i

j

of indices of accepting states visited by i up to the -th step, and also denote their difference:

j

i i

u n j v j l v

X such that

j u

i

i

u n j v j l v

X such that

j u

i

i i

X n X

j j j

w w

i

Note first that, due to the fact that i and correctly concatenate, we have that for all

i i

u v n j j m j j m u v

i i

and for all , if and

j j

 

i i

u n j j j j v n

then .

j j

 

By a trick similar to the right-to-left proof for concatenation, we bring all runs to equal length

Q Q

nj i j such that when i passes through , passes through and vice-versa (we call this prop-

erty the Synchronicity Property)

m i i

To this end, we extend each run i to a run of length (the same for all -s) by adding loops

i

i i i

q q l X X

in and/or , then suitably redefine the indices and the index sets , and such that

j j j j


i

i

i p n n n n

for all we have . This can be done as

j j

follows:

u n i p m i

Take some . Then, for each define i as the unique index

i i

u m u n

for which . Similarly, define i as the unique index for which .

i

i

i

i

Define further i .

i

q

i

This integer gives the number of loops in that must be appended to such that it is brought

to the same length as i .



This follows because, if the runs would have the Synchronicity Property then i must equal , regardless of the choice

i

n

of the index u .

i p

i

Then, for each we append, at the beginning of i , copies of the state

P

i

j

q w i

. Also, the indices that witness the acceptance of are then shifted by , that is,

j

i

X

j i i

l l

k k

j

i

i i

X

It is routine to check then that, after suitably redefining the index sets X , and , we have that

j j j

i i

n n n n i p j m

(7.23)

j j

However the runs do not have the same length yet. To bring them to the same length, define

P

i

j

j i p m max m m m

i i

first i , and then append, at the end of each run ,

j q

copies of the state , and we are done.

m

We suppose from now on that the runs i have equal length and that their associated index sets

n w

satisfy condition 7.23. Observe that, by the hypothesis that all -words i correctly concatenate,

we may choose the runs i such that all the labels of the transitions are the same (a similar property

a a m

was obtainted for concatenation), hence we may consider that there exist such that

p

for all i ,

i i

q a q

i j j m

j j

i

i i

X

Let us show now some properties of the index sets X , and :

j j j

i

i

p j m X X X

(X1) For each i and , .

j j

i

i

p j m X n n X n n

(X2) For each i and , and

j

j

i i

X n n X n n

j j

i i

i

i

p j m X X X X

(X3) For each i and , and .

j j j

j



i i

i i p j m X n n X n n

(X4) For each and , and

j j



i i

n n X n n

X .

j j

i

i i

X

Property (X1) follows due to the fact that X , while property (X2) follows by

j j j

induction on j :


i i i

X n n X n n n n

j j j

i i

n n n n X

j j

i

n n X

j

Property (X3) is a consequence of the first two properties, while property (X4) follows by induction

i

on i , since the first three properties imply that







i

i i

n n n n X X n n X

j j

j

i i

X n n X n n

j

j w

The idea that guides us in building the D run for is that the union of the history, left active,

i

i i

j fX X j i q

right active and prophecy components at each step in must be the set

j j j

pg . The problem is to correctly choose the left and the right active components at each step, and to check the additional constraints on the states of this run, and it is here where the non-elastic

assumption plays its role.

Clearly, the order in which the runs i “become” left active components is the order of con-

i

i i

X q X j

catenation. Or, in other words, if is the left active component at step , then the right

j j j



i i

 

i i

i i

X q X X X q j i i

active component is , the prophecy component is

j j j j

j j



i

 

i i

X q j i i

and the history component is X .

j j j

The choice of the moments at which a run i passes from the history component into the right

active component, then into the left active component and finally into the history component need

not be unique. But the bottom line is that i cannot “sleep” all the time it parses the contribution

w i

part of i This translates to the fact that cannot be in the history component at the end of

w w

unu i unu i and cannot be in the prophecy component before the beginning of .

The non-elasticity property intervenes then in the fact that, if we have decided to shift

i , say from the prophecy into the history component, then we will never “regret” this

decision, that is, we will never need i in the prophecy component back.

This means that each run i will be, at its turn, in the prophecy component, then a left active component, then a right active component, and finally in the history component. Observe then that

each time we shift the left active component into the right active component we must employ an -

m p m

transition. This means that the run would have length – , since it simulates each transition

p p

in all the runs i and since there must be -transitions for shifting left active into right active

components.



i

Our choice for the shifting moments is a “lazy” one: a run is moved from its place only when there is a run i with



i

i that needs to be “pulled out” of the history component because it is about to finish its parsing of the contribution

 w

part of i – it needs to be “waked up before it’s too late”.

Formally, the construction runs by induction as follows: the first tuple of in the run is

i

i

X X X q j i X q

144 7. n-words and their automata

i

i

X X i q Q u n u

(observe that in fact for all since for any ).

k

Assume that we have built the run up to the -th tuple. Denote the run built so far as k

S T

k l l l l

k

and its last tuple as l . For the induction, we also assume that

i

i i

X q X k j i

either k with

j j j

p

p p

X X k j p q

k

or k and then with .

j

j j

These properties hold for k , hence the induction has indeed a base case.

We then have the following cases, triggering specific ways to extend the run k :

i

 

i i i i

n n n n X q X i i

1. If k and for all , (that

j j j j j



w

i u is, we have not reached the end of some component un ) then we extend the run with the

tuple

S T a S T

k k k k k i k k k k

in which



 

i

i i

S X q X j i i

k

j j j

i

iff

k

i

i i

X i q X

iff

j

j j

i

i i

X q X

k

j j j



 

i

i i

T X q j i i X

k

j j j

i



i i i

X q X i i n n n

2. If k and there exists some for which

j j j j



i

n n

then let

j

 

i i

max i i j n n n n n

j j

X X q

j j

Observe that in this case the tuple j cannot belong to the history component since it would contradict the



requirements 1.c or 2.c.iii, according to whether k or not.

i We then append tuples as follows:

a) The first tuple to be appended is

S T S T

k k k k k k k k

and has the following components:



 

i

i i

S X X j i i q

k

j j j

i

i i

X X q

k

j j j

i i i i

X q X n n n

k

j j j j

  

i i i

T X q X j i i

k

j j j

Observe that we do not change the third component in the tuples belonging to the prophecy

i p

component. Also observe that k because .


i l

b) For each l , the -th appended tuple is:

S T S T

k l k l k l k l k l k l k l k l

in which

 



i i

i

S X q X j i i l

k l

j j j

il

il il il

X n n n q X

k l

j

j j j

il il il il

X q X n n n

k l

j j j j

  

i i i

j i i l X q T X

k l

j j j

i l These are the transitions that “pull” the -th tuple from the prophecy component into

the right active component. This operation is accompanied by the modification of the index

i l set of the -th tuple, but the index sets of all tuples in the prophecy component are left

unchanged.

i i

c) For l , the appended tuple is:

S T a S T

k i k i k i k i k k i k i k i k i

with

 



i i

i

q j i S X X

i k

j j j

X X n n n q

i k

j j j j

p

iff

k i

X p X q

iff

j

j j



 

i

i i

T X j i q X

k i

j j j

Hence, once the -th tuple is pulled out from the prophecy component, we also change the

index sets of all tuples remaining in the prophecy component. This is possible since, by

T

k i choice of , we will prove that all the tuples in do not contradict requirements 1.c

and 2.c.iii.

p

p p

X q X

k

3. If k , which can only happen when , we append to the run

j

j j

the following tuple:

S T a S T

k k k k k p k k k k

in which



 

i

i i

S X X j i p q

k

j j j

p

p p

X q X

k

j

j j

k

T k


i i

Observe that, after case 2, we already get at stage k since we have appended

k m p

tuples to k . We denote the run obtained after stage . Q

It remains to prove that the appended tuples are indeed states from  . We will only prove the

validity of requirements 2 (a, b, c.i, c.ii, c.iii, d.i, d.ii, d.iii) since the other requirements can be

regarded as special cases of requirements 2(a d.iii). Consequently, we will only study the cases

1 and 2 in the construction of . Let us show first that the properties (X1)-(X4) imply that the newly appended tuples satisfy requirements 2.a, 2.b, 2.c.i, 2.c.ii, 2.d.i, 2.d.ii, respective of the case the tuple falls in. In case 1, requirements 2.a and 2.b are restatements of property (X2), while requirements 2.c.i,

2.c.ii, 2.d.i, and 2.d.ii are restatements of property (X4).

S T

k k k For the case 2, subcase a, that is, for the tuple k , requirement 2.a results directly

from (X2). On the other hand, we have that

i i i i

n n n n n n X X

j j j j

i

n X j

hence requirement 2.b is implied by (X2). Next, properties 2.c.i, 2.c.ii and 2.d.i are trivially implied

i

by (X4). Finally, due to (X4) again, we observe that for all i ,



i i i

i

X n n n n n X n n X

j

j j j

hence requirement 2.d.ii holds also for the case 2, subcase a.

S T l i

l k l k l k l For the case 2, subcase b, that is, for each tuple k with ,

the requirements 2.a and 2.b can be similarly shown to derive from to (X2). For requirement 2.c.i

i l

observe that for all i we have by (X4)



i il

il il

X X X n n n

j j

j j

while requirement 2.c.ii is directly implied by (X4). Then, requirement 2.d.i is a direct consequence

i l

of (X4) while for 2.d.ii we have that for all i ,





il i

il il i

n n n n X X n X X

j j j

j j

S T

k i k i k i k i Finally, in case 2, subcase c, that is, for the tuple , the proof that all requirements 2.a, 2.b, 2.c.i, 2.c.ii, 2.d.i, and 2.d.ii hold is very similar to the other cases.

Consider now the requirement 2.d.iii. For the case 1, requirement 2.d.iii can be proved as fol-

 

i i

i n n n n

lows: if we suppose that for all i , then we can

j j

i

show that for all i ,



i i

n n n n

(7.24)

j j



i

i u n

We can prove this inclusion as follows: for each i , consider some index

j





i

i

n n n n n n u n

. Identity 7.23 says that , hence

j j


 

i i

n n n n

. On the other hand, the hypothesis of case 1 says that

j j

i i

u n n n n n

, and hence , which shows that our inclusion 7.24

j j

holds.

S T

k k k

For the case 2 subcase a, consider the tuple k which is the first to be appended to

i i

k . The requirement 2.d.iii for this tuple is the following: for all ,

   

i i i i

X n X n n X n X n n

j j j j

i i i

X n n n n X n n

j j j

But this property holds trivially since all the three sets involved in this chain of inclusions are

empty.

S T l i

l k l k l k l

For case 2, subcase b, consider each tuple k with , But

i l

for the requirement 2.d.iii to hold for this tuple, we must have that for all i ,

   

i i i i

n n n X n n X n X X

j j j j

il il il

X n n n n X n n

j j j

which again holds trivially since all sets are empty.

T S

i k i k i k i

Finally, in case 2, subcase c, requirement 2.d.iii for tuple k says

that, for all i ,

 

i i

n n n n n n

j j

j

i First observe that, by choice of , for all the first inclusion holds. We then only have to

prove that the second inclusion holds too.

Suppose this does not hold for some i . We show that this would be in contradiction with



i

n n u n u n n n n

the choice of : take some with .

j

j



i

n X u n X u n u n X

Then u , and by (X3), . Since we must have .

j j

j j

i

i i

n X u n X u n

Again by (X3), we get that u , hence and further . But then, by

j j j

n

gathering all the information we obtained on u we get:



i i

n n n n n n

u (7.25)

j j

p

This is in contradiction with the choice of , since the greatest integer in which verifies

i property 7.25 is , and we have assumed . Let us finally check requirement 2.c.iii. First, we observe that for the case 2, subcases b and c, requirement 2.c.iii holds trivially since in the respective chain of inclusions the two sets are empty. (The proof of this observation is similar to the proof that requirement 2.d.iii holds in cases 2, subcases b and c.) For checking case 1 and case 2, subcase a, we will first show that the first part of the inclusion

from requirement 2.c.iii holds, that is,

 

i i

i n n n n

for all i (7.26)

j j 148 7. n-words and their automata

This property will be proved by contradiction, with essential use of the non-elasticity assumption:

 

i i

n n i i u n u

Suppose that there exists some and some such that

j j

i n n n

. The first observation to be made is that since is in the history component

j j i

at j , there must exist a moment at which had to be pulled out from the prophecy

i i

 

i i n n n n

 

component because for some , we had that . Pick

j j

i i

 

v n v n n n n n n

 

up then some such that .

j j

i i i

  

i



v n X X inX v X v X

  



Hence , and since we have that , and further .

j j j

j



i

i



u n X u n X

Let us further observe that and hence, by (X3) Moreover, we have

j

j



i

i



v X v X j j





that which by (X3) gives that . This means that there exists such that

j

j



i

i



u n j j v

and there exists such that .

j

j

 

We then need the following property:



k k

k k p l l m u v n

(W) For each and each , suppose and 

l

l

u v n a t k

for some . Denote also t the label of the -th transition in any of the runs



or k . Then,



l l w w

uv n k

If then k .

 

l l w w a a a

k k n l l l

If then uv .



a a l l w w a



k k n

If then uv .

l l

l

k k k Proof (of property (W)). The property (W) can be proved by induction on k : for

it holds straightforwardly since it says that the concatenation of the labels of the transitions in

Q l Q l

u v n

between the moment k passes through (i.e., ) and the moment it passes through (i.e., )

w

n

equals uv .



k

k k v n

Suppose then it holds for , and we want to prove it for . Since  it follows

l





k

k

l l v v n

that there exists some such that . And further, by 7.23, that .We

l

l





l l l thence have 12 cases concerning the relative position of , and . We will only study three of

them and discuss the similarities with the other cases.

l l l k k l l

Suppose . Then, by the induction hypothesis for and ,wehave

  

a a w w a a a w

vvn l l k uv n l l l k

k . On the other hand, . There-

 

fore,

  

w w w w w

k k uv n k k uv n k vvn



a a a a a

l l l l l

 



a a a

l l l p

since the label of the t-th transition in any of the runs is the same. Hence property (W) holds

k l l l

for . This proof works also for all the cases in which is in between and .

l l l k k l l w

k

Suppose . Then, by the induction hypothesis for and ,wehave

 

w a a w a a



uv n k vvn

k . On the other side, . Therefore,

l l l

l

 


  

w w w w w

k k uv n k k uv n k vvn



a a a a

l l



l l





a a

l

l

hence property (W) holds for k also in this case. This proof works also for all the cases in

l l l

which is in between and .

l l l w k

Suppose . In this case we have, by the induction hypothesis, that

  

w a a w a a

k uv n k n l l

and vv . But



l l



  

w w w w w

vvn uv n k k uv n k k k



a a a a

l l



l l





a a a

l l l

Hence property (W) holds for k in this last case. A similar proof can be produced in the case

l l l ut

.

u u v v k i

We may now particularize the property (W) as follows: we put , , ,

k i l j l j

, , and get that



w w

i uun

i ;





w w

i vvn

i ;





w w

i uv n

i is an antiword;





w w i

facts which clearly contradict the non-elasticity assumption on i . Hence our assump- 

tion on the nonvalidity of Inclusion 7.26 is itself false.

i Let us then observe that, based on the validity of the first half of requirement 2.c.iii for all i

and on Identity 7.23, the last half of requirement 2.c.iii can be proved by induction as follows:

 

i

i

n n n n

(by Identity 7.23)

j

j



i

n n

(by Inclusion 7.26) j .

.

i

n n

j

i

n n

(by Identity 7.23)

j

D

p

It follows that m is indeed a run in . We need now to show it is an accepting run,

t t U

u u u

n

and then associate a sequence of indices u such that the -th state in this run is in

u n U

and such that, for , the passage through u of be synchronous with the passage of

Q U

u nu p

through and the passage of through be synchronous with the passage of through

Q

u

n .

l Q u

 u

Remind that represents the moment passes through , i.e., . Consider then the

u

l

u

m p l

index k which denotes the moment in the run which corresponds to . That is,

u

k l i p i

for some , and further: u


Z q i X Z X



  k

If then k and . Note that we may have either or

l

l l

u

u u

n n n Z X l

 

p

, according to whether the -th transition in m falls in

u

l l

u u

case 1 or case 2, subcase a from the construction of .

i

i i

q X q X X X S f g

 

 

 

k k

Otherwise, and hence k .

l l

l l

l l

u u

u u

u u

Z T U q S T X

 

k u k k k

In the first case, it is clear that k because by

l l

u u

u Z n X Z

construction we have  , in any of the cases falls in.

l

u

q i X X S



  k

In the second case, if then k , and we get again that

l

l l

u

u u

q S T U i X X S



 

k k k u k

k .For we must have that . Let us observe that

l

l l

u

u u

i i

n Q u



 

requirement 2.c.iii in the construction of  says that , and hence .

l

l l

u

u u

S T U i i i

k k k u

Suppose then that k . This implies that there exists some



i i

i

 

u i i u u n



 

such that . Then there must exist some such that and . But this

l

l l

u

u u

is in contradiction with the requirement 2.c.iii, first inclusion. Hence the assumption is false, that

S T U

k k k u

is, k .

p p

t l i t l l i i

u nu

u nu

We may therefore define . We also define n , where is the index

u

p

l i p

p i

in the run m which corresponds to the -th transition in each of the runs , .

nu

t S T U

t t t t nu

Similarly to the proof for u , we may get that .

nu nu nu nu

v n

It remains just to observe that, by property (W), for each u , the word or antiword

t t

nv

that labels the transitions in between the u -th state and the -th state equals the word or antiword

p

l l

i

v

that labels the transitions in between the -th state and the n -th state of any of the runs , that

u

w w

p unv

is, equals . Similarly, by concatenating the labels of the piece of run from

t t t t

p u v u v

m that lies in between the -th state and the -th state (eventually in reversed order, if )

w t t w

uv un v n p unv n mp

we get , and similarly for , and . Hence, the run accepts indeed

w w p

This ends our proof of the second claim and, consequently, the proof of Theorem 7.4.3. ⊓⊔

8. Representing timing information with n-words

Up to this moment we have investigated only the possibility to use n-automata for representing the discrete information in n-signal regular expressions. In this chapter we investigate the possibility of representing also the timing information in n-signal regular expressions. By the timing information we mean the set of tuples representing the duration of each n-signal in the semantics of an n-signal regular expression $E \in RegSig_n$, that is, the set
$$\{\, \ell(\sigma) \mid \sigma \in \|E\| \,\}$$
Here $\ell$ is the extension of the length morphism to n-signals, that is, an n-signal over a one-letter alphabet $\{a\}$ with $\ell(\sigma)_{ij} = \ell(\sigma_{ij})$. Or, in other words, a matrix $\ell(\sigma) \in Mat_{n\times n}(\mathbb{R})$ which satisfies the triangle identity $\ell(\sigma)_{ij} + \ell(\sigma)_{jk} = \ell(\sigma)_{ik}$.
Let us note that, for the n-signal semantics of timed automata, our aim translates to the construction of the reachability relation on clocks, that is, the dependence between clock values when starting in initial states and the clock values with which final states may be reached. This problem is nothing else but the problem of representing timing constraints over a set of real

variables (or clocks). Conjunctive timing constraints are usually represented as difference bound

matrices (DBMs, see [Bel57, Yov98]). These can be thought of as $n \times n$ matrices over intervals (not necessarily positive), $D \in Mat_{n\times n}(\mathbb{Z}Int)$, such that $D_{ij} = -D_{ji}$.

The idea is that each variable is associated with an index in the index set of the matrix, and each difference constraint $x_j - x_i \in I$ puts the interval $I$ in the $(i,j)$-component (note that the indices have swapped their places). To represent single-variable constraints like $x \in I$, a special variable $x_0$ is appended, whose value is considered always $0$, such that constraints like $x \in I$ may be written as $x - x_0 \in I$.

y x y

For example, the two-variable constraint x is represented by

the following matrix over intervals:

A

D

(8.1)
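As an illustration of this matrix-of-intervals reading of a DBM, here is a minimal sketch in Python. The interval encoding, the helper names and the two-variable constraint used below are my own choices for illustration, not a transcription of the example above.

```python
# A minimal sketch (not from the thesis) of storing a DBM as a matrix of
# intervals, one row/column per variable plus the extra variable x0 = 0.
# Entry D[i][j] is the interval constraining x_j - x_i (note the swap of
# indices mentioned in the text).

from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    lo: float                      # lower bound (float('-inf') if absent)
    hi: float                      # upper bound (float('inf') if absent)
    lo_closed: bool = True
    hi_closed: bool = True

    def contains(self, v: float) -> bool:
        above = v > self.lo or (self.lo_closed and v == self.lo)
        below = v < self.hi or (self.hi_closed and v == self.hi)
        return above and below

INF = float('inf')

# Hypothetical DBM for the constraint  1 <= x <= 3  and  0 <= y - x <= 2,
# with index 0 for the special variable x0 (always 0), 1 for x, 2 for y.
D = [
    [Interval(0, 0),      Interval(1, 3),   Interval(-INF, INF)],
    [Interval(-3, -1),    Interval(0, 0),   Interval(0, 2)],
    [Interval(-INF, INF), Interval(-2, 0),  Interval(0, 0)],
]

def satisfies(D, valuation):
    """Check whether a valuation (x_0, ..., x_{n-1}) lies in the DBM."""
    n = len(D)
    return all(D[i][j].contains(valuation[j] - valuation[i])
               for i in range(n) for j in range(n))

print(satisfies(D, (0, 2, 3)))   # True:  x = 2, y = 3
print(satisfies(D, (0, 2, 5)))   # False: y - x = 3 > 2
```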


Actually, to avoid redundant representation of the same interval, the important information is kept as follows: if $D_{ij} = \langle a, b\rangle$, the $(i,j)$-th component keeps only the upper bound $b$ while the $(j,i)$-th component keeps only the lower bound $a$. In other words, instead of the matrix of intervals $D$ one keeps a matrix of pairs $(p, \prec)$, where $p$ is a real number and $\prec$ is a relation symbol from $\{<, \leq\}$: the symbol is $\leq$ iff the corresponding bound belongs to $D_{ij}$ (in particular when $D_{ij}$ is a point interval), in which case $p$ is $\max D_{ij}$, resp. $\min D_{ij}$, and the symbol is $<$ with $p = \sup D_{ij}$, resp. $p = \inf D_{ij}$, otherwise. We will however utilize the "redundant" matrix of intervals $D$ in order not to complicate certain proofs.
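The following small sketch illustrates this non-redundant encoding. The pairing convention is an assumption consistent with classical DBM presentations such as [Yov98], not a transcription of the exact definition used here.

```python
# Sketch of the non-redundant encoding: the interval constraining x_j - x_i is
# split into an upper bound on x_j - x_i and an upper bound on x_i - x_j,
# each stored as a pair (bound, strict?).

def interval_to_bounds(lo, hi, lo_closed=True, hi_closed=True):
    """Split the interval for x_j - x_i into the two DBM entries."""
    upper = (hi, not hi_closed)      # x_j - x_i  <=/<  hi
    lower = (-lo, not lo_closed)     # x_i - x_j  <=/<  -lo
    return upper, lower

def bounds_to_interval(upper, lower):
    (hi, hi_strict), (neg_lo, lo_strict) = upper, lower
    return (-neg_lo, hi, not lo_strict, not hi_strict)

up, lo = interval_to_bounds(1, 3, lo_closed=True, hi_closed=False)
print(up, lo)                       # (3, True) (-1, False)
print(bounds_to_interval(up, lo))   # (1, 3, True, False)
```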

If we think of the set of interpretations that validate the constraint above, then this set has more than one representation. The reason is that the first constraint can be deduced from the other two by arithmetic manipulation. However, the representation given in 8.1 can be thought of as the "canonical" one, since it is "closed" under these arithmetical manipulations – even more, it is "minimal", that is, no other representation can be found in which some of the intervals are smaller than the ones in $D$. As already said, a DBM can only represent conjunctions of atomic constraints; therefore general constraints (i.e. containing disjunction also) require using sets of DBMs. Our aim is to show that n-automata over one-letter alphabets can also be used for representing arbitrary clock constraints. We will start by showing, as a corollary of the previous chapter, how to represent timing constraints over the discrete time domain $\mathbb{Z}$:

The constraint x can be represented by a finite automaton (i.e. a -automaton) with four

states and three transitions in a chain. The idea is that the constraint x is satisfied by the

g fg

set of integers f , which is a regular language over the one-symbol set – simply because

.
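A minimal illustration of this remark, using $1 \le x \le 3$ as a stand-in constraint (the particular constants are not essential): the satisfying integers form a finite, hence regular, language over the one-symbol alphabet.

```python
# Over the one-symbol alphabet {a}, the integers satisfying 1 <= x <= 3,
# namely {1, 2, 3}, form the regular language a + aa + aaa, recognised by a
# chain of four states with three transitions: reading a^x walks x steps.

ACCEPTING = {1, 2, 3}      # accepting states of the chain

def chain_accepts(x: int) -> bool:
    """Run a^x through the partial chain automaton 0 -a-> 1 -a-> 2 -a-> 3."""
    if x > max(ACCEPTING):             # no transition out of the last state
        return False
    return x in ACCEPTING

print([x for x in range(6) if chain_accepts(x)])   # [1, 2, 3]
```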

y

To represent a two-clock constraint x one might transform the above -automaton

a into the -automaton depicted in Figure 8.1 at . What we have done is to identify, during the

1

y

run that accepts , a point in which is satisfied. Note that we have an implicit subconstraint

y y

x , which is represented by the two arrows in between the state labeled and the state

x y x y b labeled x. A “nonpoint” constraint, , is depicted at (remind

that we work with discrete clocks for the moment):

 y x  y x

Fig. 8.1. Two n-automata representations of timing constraints over a discrete domain.

Q Q

Of course, to actually have -automata we would put as the state labeled , as the state

x Q y labeled and as the state labeled .

 Which would result after bringing the constraint to the “normal form” [Yov98]. 8. Representing timing information with n-words 153

A more complicated constraint is depicted in Figure 8.2.
Fig. 8.2. An n-automata representation of a more complicated timing constraint.

Of course, our approach makes constraint representation more sensitive to the numbers used in the constraints than other representations (e.g., unions of DBMs). But observe that the automaton in Figure 8.2 uses few states, compared to the number of DBMs that would be necessary. Also the clock difference diagrams approach [LWYP99] does not provide a better representation for this constraint, since the intervals used in each atomic constraint are distinct. Hence the n-automata representation of clock constraints might be better in some cases. So far, so good with the intuition about discrete timing, but how to export it to continuous timing? Here we will get help from the notion of clock regions [AD94]. A region is a special kind

of DBM, in which each interval is either a point or an open unit length interval. For example,

A

R

(8.2)

Of course, this is not exactly the classical definition of a region: they were originally defined

n

R as sets of points x which have the same integral parts and the same ordering between the

fractional parts [AD94]. For our example, the graphical representation of the region R defined in

y x

Identity 8.2 (considering that this region represents in fact the constraint x

y ) is depicted in Figure 8.3:

Fig. 8.3. The graphical representation of the region defined in 8.2 is the interior of the shadowed triangle.

It is clear that any DBM whose intervals have integer bounds can be decomposed into a (finite or infinite) set of regions.

Now the key idea is to observe that the -regword of all endpoints of the intervals of a region has

a nonempty semantics: for example, our region R described in Identity 8.2 defines the following R

-regword whose components are the bounds of the intervals of :

f g f g

A

g f g

f (8.3)

f g f g

2

g f g

Here the sets represent regular expressions over the symbol set f : for example ,

. In other words, we use here the unary encoding of integers. Then, if from the upper triangular part we keep the upper bounds and from the lower triangular

part we keep the lower bounds we get the following -word:

A

w

Note that this -word is (represented by the) lower vertex of the shadowed triangle in Figure 8.3.

We then just have to add the information about what kind of bound is each component in w .

M

This information is given by a matrix whose entries are relation symbols from the set

f g M w R

ij ij

. Hence, we put ij iff is the supremum of but does not belong

R M w max R M w inf R w R

ij ij ij ij ij ij ij ij to ij ,weput if and iff and .

For our example, the following pair represents the region in 8.2:

A AA

This representation of the region in 8.2 is not unique: the following pair is also a representation

of it:

A AA

Note that the -word in it is the upper left vertex of the triangle in Figure 8.3.

It is then clear that not all the matrices having lower or upper bounds from the intervals involved

in 8.2 are -words, since some do not verify the triangle identity: they are such matrices but only

vertices of the region!

 We have preferred this set notation instead of the regular expression notation due to the ambiguous overloading of summation it would imply. 8. Representing timing information with n-words 155

There is yet one more thing to observe: the matrices of relation symbols have themselves a sort of "triangle" property, due to the fact that they have to represent correct regions. We will see later

the exact formulation of this property, but let’s see an example of an incorrect matrix:

A

M

M M M M

Intuitively, is incorrect because the cycle is inconsistent

M M with the component : the cycle requires that be , which is unimaginable for a diagonal component which must always be zero!

Once we are convinced regions may be represented this way, we only have to think that n-automata can be adapted to accept pairs consisting of an n-word over a one-letter alphabet and a matrix of relational symbols. There are two ways: either put the relational matrices into states, or put them into transitions. The two ways are completely interchangeable, as are state-labeled or transition-labeled finite automata. Then what we need to ensure is that in an accepting run all states or all transitions are labeled with the same relational matrix. Yet we are not through with the problems: concatenation is a clear operation on n-words, but how do we generalize it to DBMs/regions/pairs like the above? In fact, we need to define a concatenation operation on regions, a concatenation that would be compositional w.r.t. the "semantic" concatenation on n-signals.

If we regard the problem from the logical point of view, when we want to concatenate two

C C n

DBMs which represent two constraints over variables, what we need is to take their

C C n C n C

conjunction , then to identify the last variables of with the first variables of , and

x x

n

finally to project the result over these variables “in the middle”: denoting the variables

C y y C

n

in and the variables in , we need

n

C C x x C C x y x y x y

n n n n n ni i

i

xy y x C in which the notation C stands for the syntactic replacement of variable with in . But if we proceed by pure arithmetic tools to compute the concatenation of two regions we will

find out that it might not be a region but a more general DBM: even for 2-dimensional DBMs, that is, constraints over a single clock, we have
$$\big(x \in (0,1)\big) \,;\, \big(x \in (0,1)\big) \;=\; \big(x \in (0,2)\big) \qquad (8.4)$$
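For a single clock this phenomenon is easy to compute mechanically: concatenation amounts to summing the two duration intervals, and the sum then decomposes into point regions and open unit regions, as discussed below. The following sketch (my own interval encoding) illustrates both steps.

```python
# Sketch: concatenating two single-clock constraints sums their duration
# intervals; the result then decomposes into point and open unit regions.

def interval_sum(a, b):
    """(lo, hi, lo_open, hi_open) intervals; open-ness propagates under +."""
    (alo, ahi, alo_o, ahi_o), (blo, bhi, blo_o, bhi_o) = a, b
    return (alo + blo, ahi + bhi, alo_o or blo_o, ahi_o or bhi_o)

def regions_of(interval):
    """Decompose an interval with integer bounds into integer points and
       open unit intervals."""
    lo, hi, lo_open, hi_open = interval
    out = []
    if not lo_open:
        out.append(('point', lo))
    k = lo
    while k < hi:
        out.append(('open', (k, k + 1)))
        k += 1
        if k < hi or not hi_open:
            out.append(('point', k))
    return out

open01 = (0, 1, True, True)                 # x in (0, 1)
print(interval_sum(open01, open01))         # (0, 2, True, True)
print(regions_of(interval_sum(open01, open01)))
# [('open', (0, 1)), ('point', 1), ('open', (1, 2))]
```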

And here is an example concerning -regions:


C B C B

C B C B

C B C B

A A

B C

B C

B

C (8.5)

A

The issue from this is to decompose the result into regions. That is, we will define the concate- nation of two regions as a set of regions. For the simple example described by Formula 8.4 we

put

x x x x x

The advantage is that, contrary to constraints, which, after each conjunction, need to be brought

to normal forms, region concatenation gives directly normal forms.

Since we represent regions by pairs $(w, M)$ consisting of an n-word over $\{a\}$ and an n-relation $M$, we may wonder how this concatenation can be implemented over such items, and also why it gives sets of regions rather than mere regions, since n-word concatenation is not a set-based partial operation. The answer lies in the need for a concatenation operation on n-relations, which itself returns sets of n-relations. For our simple example 8.4 of 2-regions, we have

because

We might also observe that some pairs representing regions may fail to concatenate, even when

the represented regions concatenate: for example, the following pairs cannot concatenate because

the -words in it cannot, though they represent the concatenation of the -regions in 8.5:

B B C B C C

B B C B C C

B B C B C C

A A A

C C C B BB

C C C B BB

C C C B BB

A A A


But this does not mean that no pair of representations of the -regions from Identity 8.5 can be concatenated: we just have to be more careful when choosing the representations. For example, the two representations in Identity 8.6 below can be concatenated, since the lower right corners of the

first representation equal the upper left corners of the second representation. The result is obtained by first concatenating the -words in each representation, then concatenating the two -relation

matrices. We present the result as a cartesian product between the resulting -word and a set of

-relation matrices, just in order to save space.

B C C B B C

B C C B B C

B C C B B C

A A A

B B C B C C

B B C B C C

B B C B C C

A A A

C B C B

C B C B

C B C B

A A

B C B C

B C B C

B C B

C (8.6)

A A

The component of the resulting -relation matrices is fixed, because it must be consistent

with the fact that the component in the left operand is a and the component in the

second operand is a too. Similarly, the components and are uniquely generated.

However, the component is not uniquely generated, and this is why the above concatenation produces three representations. Clearly, the above result is not the representation of all regions that are included in the resulting DBM as given in Identity 8.5. But if we try all the combinations of representations of the two factor regions and join together all the results, we will obtain the expected decomposition of the DBM into regions.

To summarize, in order to get a correct representation of the concatenation of regions by pairs $(w, M)$, we need to work with sets of such pairs, and to ensure that, in each such set, together with a pair $(w, M)$ representing a region $R$, all the pairs that represent this region are contained.

We will develop in this chapter the theory of DBMs, regions and n-word representations for regions. Part of the chapter is a restatement of some well-known properties concerning DBM normalization and/or constraint propagation [Gau99, DMP91, vH89]. But the bulk of it is new, and concerns the restatement of the properties of concatenation and star on n-signals and n-words, as given in Chapter 6, for regions and n-word representations of regions. A permanent concern in this chapter is again the compositionality of the projection, juxtaposition, resp. concatenation operations. This is quite normal, since we try to define syntactic operations on representations of sets of n-signals, and we already know from Chapter 6 that, for n-regminoes, such compositional operations are not possible (excepting juxtaposition). The key property that makes projection compositional is the convexity of intervals. This chapter runs as follows: the first section resumes some well-known properties concerning DBM normalization. In the second section we introduce our concatenation operation on regions

and prove its compositionality. The third section serves for introducing the n-word representations

for regions and for defining concatenation on them. In the fourth section we generalize n-automata

to a class that works on n-word representations and prove that this class enjoys the same properties

as n-automata. Most notably, we show that the non-elastic star closure theorem also holds for n automata that work on non-elastic -word representations.

Let us note that n-automata representing clock constraints are different from region automata

in the sense of Alur and Dill [AD94]: in n-automata, each region is represented by a run, while in region automata, each region is a state.

8.1 Difference bound matrices

Traditionally, n-regsignals $R \in RSig_n(\{a\})$ over a one-letter alphabet bearing the property

$R_{ij} \in \mathbb{Z}Int$ for all $i, j \in [n]$

are called difference bound matrices, or n-DBMs. By generalizing, we call any n-regsignal over a one-letter alphabet an extended difference bound matrix, or n-EDBM. Observe that an EDBM is in fact a matrix whose components are regular expressions over intervals, in the sense used in Chapter 3. For example, the following matrix is an EDBM and not a DBM:

fg

A

fg

When speaking of regular expressions over intervals, we have in mind the theory developed in Section 3.2. Which, of

ZInt course, needs to be extended over the whole Kleene algebra of intervals with integer bounds K (in Section 3.2 we

have studied only intervals with natural bounds).

The class of n-DBMs is denoted $Dbm_n$ while the class of n-EDBMs is denoted $Edbm_n$. The semantics of an n-EDBM $D \in Edbm_n$ is formally defined as follows:
$$\|D\| = \big\{\, \sigma \in Sig_n(\{a\}) \;\big|\; \text{for all } i, j \in [n] \text{ there exists } a_{ij} \in D_{ij} \text{ such that } \ell(\sigma_{ij}) = a_{ij} \,\big\}$$
We present, in this section, several properties concerning DBMs, most of them well-known in the domain of $(\max,+)$-algebras [Gau99, GP97, Gau92]:


Proposition 8.1.1 ([Gau99]). Given an n-DBM $D \in Dbm_n$, $\|D\| \neq \emptyset$ if and only if for each cycle $i_0, i_1, \ldots, i_k$ with $i_0 = i_k$ and $i_j \in [n]$, we have that
$$0 \in \sum_{j=0}^{k-1} D_{i_j i_{j+1}} \qquad (8.7)$$

j j 

j

n Sig fag i

j

k

Proof. We may observe first that, given an -signal , for each cycle j

n

k

X

a i n i i k

i i k

with j and , . This property follows by induction on

j j 

j

P

k

kD k i

from the triangle identity 6.1. We then only have to observe that for each , i

j j 

j

P

k

ut D i

i .

j j 

j

Definition 8.1.2. An n-DBM $D \in Dbm_n$ is said to be in normal form iff the following two properties hold:

1. For each $i \in [n]$, $D_{ii} = \{0\}$.
2. For each $i, j, k \in [n]$,
$$D_{ij} + D_{jk} \supseteq D_{ik} \qquad (8.8)$$

We will refer to the property 8.8 as the triangle inclusion, by similarity to the triangle identity for n-dominoes. The set of n-DBMs in normal form is denoted $Dnf_n$.
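For intuition, here is a small sketch of checking the two normal-form conditions. The encoding of intervals as closed pairs [lo, hi] is my own simplification; the thesis allows arbitrary intervals.

```python
# Sketch of checking Definition 8.1.2 over closed intervals [lo, hi]:
# D_ii = {0}, and the triangle inclusion D_ik ⊆ D_ij + D_jk.

def add(I, J):                       # Minkowski sum of two closed intervals
    return (I[0] + J[0], I[1] + J[1])

def included(I, J):                  # I ⊆ J for closed intervals
    return J[0] <= I[0] and I[1] <= J[1]

def in_normal_form(D):
    n = len(D)
    if any(D[i][i] != (0, 0) for i in range(n)):
        return False
    return all(included(D[i][k], add(D[i][j], D[j][k]))
               for i in range(n) for j in range(n) for k in range(n))

# Example with x1 in [1,3], x2 - x1 in [0,2], x2 in [1,5] (index 0 is the
# reference variable whose value is 0).
D = [[(0, 0), (1, 3), (1, 5)],
     [(-3, -1), (0, 0), (0, 2)],
     [(-5, -1), (-2, 0), (0, 0)]]
print(in_normal_form(D))   # True
```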

D D nf

Remark 8.1.3. The triangle inclusion implies that a DBM in normal form n is antisym-

D D i j n ji metric, that is, it has the property that ij for all .

Proposition 8.1.4. Any n-DBM in normal form has a nonempty semantics.

Proof. We first check the hypotheses of Proposition 8.1.1 for cycles of length 3. To this end, take

j k n

three indices i . Two cases occur:

D D f g Z ik

1. ik is a point interval, for some . Then

D D D D D f g f g fg

ij jk ki ik ki

hence Proposition 8.1.1 holds.

D inf D sup D

ik ik 2. ik has a nonempty interior. Denote then and . Clearly .

Then

D D D D D

ij jk ki ik ki

due to the property .


i i i

j k

k On the other hand, for any cycle j with , the triangle inclusion 8.8 can be

repeatedly used to prove that

k k

X X

D D D D D D

i i i i i i i i i i i i

   

j j  j j 

k  k  k k

j j

D D D

i i i i i

But we already know that i from the first part of the proof. Therefore,

 

k  k  k k

i ut

j

k

this chain of inclusions implies that property 8.7 must also hold for the whole cycle j . D

Observe that, in the above proof, we have used the fact that ij is an interval. Hence the property does not hold for EDBMs. In fact, the example 6.25 from Chapter 6, page 99 is an example of an

EDBM which is in normal form, but has an empty semantics:

f g f g f g

C B

f g f g f g

C B

R

C B

f g f g f g

A

f g f g f g n As a corollary, an n-DBM is equivalent to an -DBM in normal form (is normalizable) iff its semantics is nonempty.

The following result shows that, unlike the case of n-regsignals, projection is compositional for DBMs in normal form. Recall that the projection of an n-regsignal $R \in RegSig_n$ onto some set $X \subseteq [n]$ was defined in Chapter 6, Definition 6.4.3 as the matrix resulting after deleting the rows and columns of $R$ whose indices are not in $X$.

Proposition 8.1.5. For each n-DBM $D$ in normal form and $X \subseteq [n]$, the projection $D|_X$ is a $card(X)$-DBM in normal form too and $\|D|_X\| = \|D\||_X$.

X X

Proof. The first part of the property is straightforward. We prove the second part by induction on

card X

the size of n , and the proof of the induction step relies on interval convexity.

n D D nf D

Take some -DBM in normal form n and consider its projection . Take

n

kD k

further some n-signal . We will prove that this signal can be (perhaps not uniquely)

n

n kD k

extended to a -signal .

R i n

i i

Fix some real and denote , for all . Observe then that

j i j i j i ij ij

by straightforward applications of the triangle identity 6.1.

Let us observe that, if we find a real n such that

D i n

i in

n for all (8.9) then we are done, because the matrix defined by


i j n

ij iff

j n i n

n i

ij iff

i n j n

n

j iff

n

would be an -signal, as the triangle identity 6.1 can be easily checked on , e.g.:

n i j n j i ij

nj in

D D

i in

and further, would be in the semantics of , since n by construction.

in

j n

For proving property 8.9 let us first prove that, for all i ,

D D

in j jn i (8.10)

To this end observe that, by the triangle inclusion 8.8,

D D D D D

i in j jn i j in nj ji ij

D D

ij ij ji ij

and since ji it follows that . Hence

D D

i in j jn

which is equivalent to property 8.10.

But then, due to convexity,

n

D

i in

i

ut fact which shows the existence of a real n that satisfies property 8.9. The following property says that the adjective “normal form” is correctly chosen to characterize

the property:

n D D D nf D D ij

Proposition 8.1.6. For any two -DBMs in normal form, n with for all

ij

i j n D D i j n

and ij for some we have that

ij

kD k kD k

D

Proof. This property is a corollary of Proposition 8.1.5: take some n-DBM in normal form

D nf D

n . The property is trivial if all components are point intervals, because choosing as in the

i j n

statement would lead to having D for some .

ij

D i j n D ij Suppose then that ij has a nonempty interior for some , say (the

cases with other parentheses are treated similarly). Suppose also we have some other n-DBM in

D D D D n D ij

normal form with ij . Take then . Note that can be regarded as a -signal.

ij ij

But then, by the construction in Proposition 8.1.5, this -signal can be inductively extended to

D kD k

an n-signal, denote it , which belongs to the semantics of . This ends the proof, since

D ut

due to the fact that .

ij ij


For each set of real numbers $A \subseteq \mathbb{R}$, denote $\langle A\rangle$ the convex closure of $A$, that is:
$$\langle A\rangle = \{\, z \mid x \le z \le y \text{ for some } x, y \in A \,\} \qquad (8.11)$$
Observe that the convex closure commutes with summation: for any two sets $A, B \subseteq \mathbb{R}$,
$$\langle A + B\rangle = \langle A\rangle + \langle B\rangle \qquad (8.12)$$

D D n Proposition 8.1.7. Given two n-DBMs in normal form the following -DBM is also in nor-

mal form:

D D D D

ij ij

ij

n D D

iI More generally, for any set of -DBMs in normal form i (nonnecessarily finite or

countable), the following n-DBM is in normal form:

i h

D D

ij ij

D D

j k n

Proof. By verification of the triangle inclusion 8.8: given a triplet i ,

D D D D D D D D

ij jk ij jk

ij jk

D D D D

ij jk

ij jk

D D D D D D D D

ij jk ij jk

jk ij ij jk

D D D D

ij jk

ij jk

D D D

ik by assumption that is in normal form

ik

D D ik

The generalization to arbitrary families of n-DBMs in normal form follows along the same lines,

due to the distributivity of summation over union. Note that we rely on the fact that the convex t closure of any set of real numbers is an interval. u

Propositions 8.1.6 and 8.1.7 allow us to define the normalization of a DBM D with nonempty

semantics: it is the largest DBM D (with respect to inclusion) satisfying the triangle inclusion and

D D i j n D D

bearing the property that ij for all . That is, is the normalization of iff

ij

kD k kD k D kD k kD k kD k kD k D is in normal form, and for all with we have .

Proposition 8.1.8. The normalization of each DBM is unique.

D kD k kD k If D is the normalization of then .

Proof. The first part of the statement follows by Proposition 8.1.7, since this proposition says that

D k the set of DBMs in normal form whose semantics is included in k forms a complete superior

lattice [Bir79], which hence has a supremum.

D k kD k

The second part can be proved by contradiction: suppose that k , hence we may pick

kD k n kD k D

a n-signal in and produce the convex closure , which would be, by Proposition

D ut 8.1.7, a n-DBM in normal form strictly larger than , hence contradicting the assumption.

To bring an n-DBM $D$ with the property 8.7 into normal form, we may utilize another form of the Floyd-Warshall-Kleene algorithm: namely, we build the sequence of n-DBMs $\big(D^{(k)}\big)_{0 \le k \le n}$ inductively as follows [Gau99]:
$$D^{(0)} = D, \qquad D^{(k)}_{ij} = D^{(k-1)}_{ij} \cap \big( D^{(k-1)}_{ik} + D^{(k-1)}_{kj} \big) \quad \text{for each } i, j \in [n]$$
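A sketch of this normalization loop over closed intervals might look as follows; the interval encoding is mine (the thesis works with arbitrary intervals with integer bounds), and an empty intersection is taken here as a sign of empty semantics.

```python
# Floyd-Warshall-Kleene normalization sketch over closed intervals [lo, hi].

def add(I, J):                            # Minkowski sum
    return (I[0] + J[0], I[1] + J[1])

def meet(I, J):                           # intersection, None if empty
    lo, hi = max(I[0], J[0]), min(I[1], J[1])
    return (lo, hi) if lo <= hi else None

def normalize(D):
    """D^(k)_ij = D^(k-1)_ij  ∩  (D^(k-1)_ik + D^(k-1)_kj)."""
    n = len(D)
    D = [row[:] for row in D]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                D[i][j] = meet(D[i][j], add(D[i][k], D[k][j]))
                if D[i][j] is None:       # empty entry: empty semantics
                    return None
    return D

INF = float('inf')
D = [[(0, 0), (1, 3), (-INF, INF)],
     [(-3, -1), (0, 0), (0, 2)],
     [(-INF, INF), (-2, 0), (0, 0)]]
for row in normalize(D):
    print(row)
# [(0, 0), (1, 3), (1, 5)]
# [(-3, -1), (0, 0), (0, 2)]
# [(-5, -1), (-2, 0), (0, 0)]
```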

The correctness of this algorithm is the subject of the following proposition. Similar results can

be found in [Gau99, Tri98]3.

Proposition 8.1.9. Provided $D$ has a nonempty semantics, $D^{(n)}$ is the normalization of the DBM $D$.

D k i j n

Proof. Observe first that the nonemptiness hypothesis on k implies that, for each ,

p

n o

X

D j i n p N i i i j

i i l p

l l

l

p

n o

X

D j i n l l p i i i i i j p N

i l l l p

i (8.13)

 

l l

l

i i D D

l i i i i

This follows because, if we have l then and hence

 

l l  l  l

   

D D D D D D

i i i i i i i i i i i i

  p

l  l l l  l  l l l  l

       

D D D D

i i i i i i i i

  p

l  l l l  l

   

and this implies that, in the intersection in 8.13, sums in which an index is repeated are “useless”.

kD k kD k

Let us denote D the set defined in Identity 8.13. Then due to the fact that all

kD k D

n-signals in must obey the triangle identity. It only remains to show that is in normal form.

j n

To this end, observe that, for each i ,

p

o n

X

D j i n l l p i i i i i j p N D D

i i l l l p ij jk

 

l l

l

q

n o

X

D j j n l l q j j j j j k q N

j j l l l q

 

l l

l (8.14)

Remind now that the distributivity of intersection over summation

A B C A B A C holds iff in the right hand side the intersections are nonempty. This is our case since we have

assumed that the semantics of D is nonempty and hence each intersection in Identity 8.14 must be nonempty. Hence we may apply this distributivity (in the reverse direction) to get:

 Such results are an essential tool in the computation of the set of reachable states in timed automata.


p q

n

X X

D D D D j i n l l p i i i i

ij jk i i j j l l l

 

l l l l

l l

o

i j p N j n l l q j j j j j k q N

p l l l q

 

and then observe that from the two sequences of indices we may construct a single sequence that k

starts in i and ends in , which means that the above intersection is included in the left-hand side

k ut of identity 8.13, written for the indices i and .

8.2 Regions

In this chapter we will be interested in a special class of DBMs in normal form, namely DBMs in which each interval is either a point interval or a unit interval. The reason to do this is that we want to represent (E)DBMs as sets of regions. We will call such DBMs regions, due to their close connection to the regions in timed automata [AD94].

Definition 8.2.1. An n-DBM $D \in Dbm_n$ is called an n-region if it has a nonempty semantics and, for each $i, j \in [n]$,
– Either $D_{ij} = \{z\}$ is a point interval for some $z \in \mathbb{Z}$,
– Or $D_{ij} = (z, z+1)$ is an open interval of unit length for some $z \in \mathbb{Z}$.
The set of n-regions is denoted $Reg_n$.
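Leaving the nonemptiness requirement aside, the shape condition of Definition 8.2.1 is easy to check mechanically; the following sketch uses my own interval encoding.

```python
# Sketch of the shape condition: every entry must be an integer point {z}
# or an open unit interval (z, z+1).  Intervals: (lo, hi, lo_open, hi_open).

def is_region_entry(I):
    lo, hi, lo_open, hi_open = I
    point = lo == hi and not lo_open and not hi_open and float(lo).is_integer()
    open_unit = hi == lo + 1 and lo_open and hi_open and float(lo).is_integer()
    return point or open_unit

def has_region_shape(D):
    n = len(D)
    return all(is_region_entry(D[i][j]) for i in range(n) for j in range(n))

# Region with x1 in (0,1), x2 = 1 and x2 - x1 in (0,1), index 0 being the
# reference variable.
R = [[(0, 0, False, False), (0, 1, True, True),  (1, 1, False, False)],
     [(-1, 0, True, True),  (0, 0, False, False), (0, 1, True, True)],
     [(-1, -1, False, False), (-1, 0, True, True), (0, 0, False, False)]]
print(has_region_shape(R))   # True
```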

Proposition 8.2.2. Each region is in normal form.

Regn

Proof. Take some region D and suppose it is not in normal form, that is, there exist

n

i j k n D D D

ik kj some such that ij . We will show then that it does not verify the

nonemptiness property 8.7. We have the following cases to analyze:

D D D D f g D f g

ik kj ij ik

1. All three intervals ij , and are point intervals, say , and

D f g

kj . Then it follows that . But this implies that

D D D f g fg

ij jk ki

which is in contradiction with property 8.7.

D D D f g D D D

ik kj ij ik kj

2. ij , and . Then the assumption that

rewrites to . Hence we must have either or

.

It then follows that

D D D

ij jk ki

D D ik The case when jk is a point interval and is an open unit interval is similar.


D D D D D D

ik kj ij ik kj

3. ij , and . Then the assumption that

rewrites to . Hence we must have either or .

D D D ut

jk ki But again this implies that ij , as it can be easily seen by verification.

In the introduction to this subsection we have talked about the decomposition of each DBM into a union of regions. The formalization of this is given by the "inclusion" relation $\sqsubseteq\; \subseteq Reg_n \times Edbm_n$, defined as follows:

$R \sqsubseteq D$ iff $R_{ij} \subseteq D_{ij}$ for all $i, j \in [n]$

If $R \sqsubseteq D$ for some region $R$ and EDBM $D$, then we say that $R$ is included in $D$, or that $D$ includes $R$; we also denote $D \sqsupseteq R$ in this case. Of course, $\sqsubseteq$ can be defined as a relation on EDBMs, but we utilize it only on $Reg_n \times Edbm_n$. For each $D \in Edbm_n$ we also denote by $\sqsupseteq\!(D)$ the set of regions which are included in $D$. The following property shows that, by replacing an EDBM $D$ with the set of regions which are included in $D$ we lose nothing w.r.t. semantics:

Proposition 8.2.3. For each n-EDBM $D \in Edbm_n$,
$$\bigcup \big\{\, \|R\| \;\big|\; R \sqsubseteq D \,\big\} = \|D\|$$

Proof. The inverse inclusion is straightforward. For the direct inclusion, take some n-signal

D k

k . Define then the following DBM:

f g N ij

iff ij

R

ij

N b c d e

ij ij

ij iff

R

By definition R , hence has a nonempty semantics, hence it is a region. On the other

i j n D ij

hand, for each , ij by assumption. But then the following two cases arise,

according to whether ij is integer or not:

Z R D

ij ij

ij . Then clearly .

Z D b c D b c D

ij ij ij ij ij

ij . But has integer bounds, hence either or is the lower bound of .

d e D d e D

ij ij ij

Similarly, either ij or is the upper bound of .

R D ut ij Hence in this case too ij .
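The construction used in this proof – the region containing a given duration matrix – can be sketched as follows; the encoding is mine, with a point interval at integer entries and the enclosing open unit interval otherwise.

```python
# Sketch: build the region containing a given matrix of duration values.

import math

def region_of_point(X):
    """X[i][j] is the real difference for the pair (i, j)."""
    n = len(X)
    R = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            v = X[i][j]
            if float(v).is_integer():
                R[i][j] = (v, v, False, False)                      # point {v}
            else:
                R[i][j] = (math.floor(v), math.ceil(v), True, True)  # open unit
    return R

X = [[0.0, 0.3, 1.0],
     [-0.3, 0.0, 0.7],
     [-1.0, -0.7, 0.0]]
for row in region_of_point(X):
    print(row)
# yields point entries at the integer values and open unit intervals elsewhere
```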

Another essential property of regions is the fact that region representation of a set of n-signals

is unique:

R R Regn kR k kR k R R

Proposition 8.2.4. For each pair of n-regions ,if then .

n

i j n R R R ij

Proof. The hypothesis implies that, for each , ij . But as the intervals

ij

R R R

and are either point intervals or open intervals of unit length, ij is equivalent to

ij ij

R R ut

ij . ij


R R Regn kR k kR k

Corollary 8.2.5. For each two sets of regions , if and only if

n

R R

.

R R

Proof. The inverse inclusion is straightforward. For the direct inclusion, take some region ,

kR k kR k kR k kR k R

hence which implies that . But then, for each there must exist a

R R R kR k kR k R R ut region with . But , which implies that .

8.2.1 Juxtaposition and concatenation on regions

The representation of EDBMs by regions would not be satisfactory if we did not have a compositional concatenation on these representations: compositionality is essentially needed for further representing EDBMs – and regular expressions over them – with the aid of n-automata. In Chapter 6, Definition 6.4.3 we have introduced a juxtaposition operation on regsignals, hence we may think we only need to consider its particularization to regions. Since projection is compositional on regions, we would have the desired compositional concatenation. However, with the definition from Chapter 6, juxtaposition is not an internal operation on regions, that is, the result might not necessarily be a region, but rather a DBM. An example was provided in the introductory part of this chapter. The way out of this situation is to define juxtaposition as an operation that associates, to each pair of regions, a set of regions, namely those regions which are included in the DBM constructed by Definition 6.4.3. To avoid working with that heavy definition, but also due to the compositionality of projection, we may define the juxtaposition of regions much more simply as follows:

R Regn R Regn p

Definition 8.2.6. Given two regions and and some integer

m n

m n

min , suppose that

R R

(8.15)

npn p

p R R m n p

The region -juxtaposition of and is the following set of -regions:

R R R R  R R Regn j R

p

(8.16)

mnp

m npmnp

R R R  R

p

If then we put .

In the sequel, the juxtaposition operation from Definition 6.4.3, with regions as arguments, will be called DBM juxtaposition. The next concern is to check that all the (signal-)juxtaposition properties, as stated in Chapter

6, hold for region juxtaposition too.

R Regn R

Proposition 8.2.7. 1. Region juxtaposition is compositional: for each and

m

Regn minm n

and p ,

n

kR  R k kR k kR k

p p (8.17)


Regn R Regn X m card X p

2. For each R , , for each with , for each

m n

n card Y q r minp q m r m X

Y with and given , suppose that

r Y

and . Then we have that

 R R  R R r

r (8.18)

X Y X Y mr

R Regn R Regn R Regn

3. Region juxtaposition is associative: for each , and , and

m n p

minm n l minn p

for each k and ,

R  R  R R  R  R

k l k l (8.19)

Proof. The first property follows due to the compositionality of projection: suppose that require-

ment 8.15 holds. Then:

kR  R k R j R R R R

p

m npmnp

kR k kR k R j

m npmnp

kR k kR k

p

j p

In the case requirement 8.15 does not hold, there must exist some indices i such

R R R R

impj mp ij impj mp ij

that . But this implies that , fact which as-

n kR k kR k

impj mp ij

sures that for any two -signals and we must have .

kR k kR k p kR k kR k

p

Therefore, the sets and cannot be -juxtaposed, i.e. . As a conse-

kR  R k kR k kR k

p p quence, . The other two properties can be proved with the aid of the compositionality of juxtaposition

and projection, by using Corollary 8.2.5 and Proposition 6.2.7:

kR  R k kR k  kR k

r

r by compositionality of juxtaposition

X Y X Y

kR k kR k

r by Proposition 6.2.7

X Y mr

kR  R k

r by compositionality of projection

X Y mr

kR  R  kR k kR k kR k  kR k

k l k l

kR k R kR k 

l k

by associativity of p

kR  R  R k ut

k l n

Consequently we may define the desired compositional concatenation on -regions:

R R Regn n R R

Definition 8.2.8. Given , the -region-concatenation of and is the fol-

n n

lowing set of -regions:

R R R j R R  R

n

(8.20)

nnn

The following proposition shows that region concatenation shares almost the same properties n as -signal concatenation:


n R R

Proposition 8.2.9. 1. Region concatenation is compositional: for each two -regions, egn

R ,

n

kR R k kR k kR k

(8.21)

Observe that here we have implicitly extended the semantics application kk to sets of regions.

n R R R Regn

2. Region concatenation is associative: for each triplet of -regions ,

n

R R R R R R

(8.22)

l r

Regn Regn

3. Each region has a left and a right unit: for each R there exist such

n n

R R

that

l r

R R fR g

(8.23)

R R

The definitions of the left and right units are the following:

l l l l

R

ij nij inj ninj ij

R R R R

l l l r

R

ninj ninj inj nij ij

R R R R

Regn R

4. For each region R there exists a weak inverse with the property that

n

l r

R R R R

(8.24)

R R

The definition of the weak inverse is the following:

i j n R

nj n

i iff

R i n j n n

nj n

i iff

R

ij (8.25)

i n n j n R

nj n

i iff

R i j n n

nj n i iff

Proof. The first property is a straightforward corollary of the compositionality of juxtaposition and

projection, while the second property follows from the associativity of region juxtaposition and its n proof runs exactly as the proof of the associativity of -signal composition – see Proposition 6.2.10. The other two properties have more specific proofs, compared to their “relatives” form Propo- sition 6.2.10, and this is due to the particularity of region juxtaposition. Still both of them rely

essentially on compositionality.

r r

n

For proving property 3, observe first that each -signal is a unit of the form for

R

r

Sig fag n Sig

some . On the other hand, for any -signal ,if is defined then

n n

r r r

. But this implies that and therefore

n nn n


r r

kR k j kR k k k

R R

r r

j kR k

n nn

r

j kR k

kR k

r

fR g

The equality R follows then by the uniqueness property 8.2.5. The validity of the R

identity regarding the right unit can be established similarly.

R

The proof of the last property proceeds along similar lines: for each we have that

l l l

kR R k kR k k k kR R k

and . Hence , which implies that . And here

R

l

R R ut

we apply Corollary 8.2.5 to get that . R

Let us observe here that only the weak inverse property can be obtained, that is, we cannot have

R equality in Identity 8.24, since in general the set R might have cardinality greater than . As usual, any operation defined on elements of a certain type can be easily extended to sets

of elements. Therefore we also dispose of the following compositional concatenation on sets of

R R Regn

regions: for each pair of sets of regions ,

n

R R R R j R R R R

n n

Let us also denote the set of all left and right units for -region concatenation as :

l

fR Regn j i X R g j R Regn

n ini

(8.26)

n n

R

n

It is easy to see that is a unit for concatenation on sets of regions. Once in the possession of

R Regn

this unit we may define the star operation  on sets of regions as follows: for each

n

 k

R R

k N

k k

R R R R k N n

where and for all .

R Regn

Proposition 8.2.10. For any sets of regions R ,

n

R k kRk kR k

kR (8.27)

 

k kRk

kR (8.28)



P Regn n

Consequently, is a Kleene algebra.

8.3 Representing DBMs with the aid of n-words and n-relations

In this section we formalize the possibility to represent sets of n-regions with pairs consisting of an n-word and a matrix of relational symbols.

8.3.1 n-relations

Definition 8.3.1. An n-relation is an $n \times n$ matrix $M$ over the set of relation symbols $\{<, =, >\}$ satisfying the following property (called in the sequel the consistency property):

There exists no cycle $i_0, i_1, \ldots, i_k$ in the matrix, with $i_0 = i_k$ and $i_j \in [n]$, such that
– For all $j < k$, $M_{i_j i_{j+1}} \in \{<, =\}$;
– For some $j < k$, $M_{i_j i_{j+1}} = \;<$.

We denote the set of n-relations as $Rel_n$.

Observe that any n-relation $M$ is an antisymmetric matrix, that is: $M_{ii} = \,=\,$ for all $i$, and
– If $M_{ij} = \,<\,$ then $M_{ji} = \,>\,$;
– If $M_{ij} = \,>\,$ then $M_{ji} = \,<\,$;
– If $M_{ij} = \,=\,$ then $M_{ji} = \,=\,$.

m

An alternative definition of n-relations is the following: for each sequence of indices

m m m

r p

p

r with , denote

Am r p j M

m m

r

r 

B m r p j M

m m

r

r 

C m r p j M

m m

r

r 

n n

Then an n-relation is an matrix over bearing the property that

ard Am p card B m card C m

If c then both and (8.29)

M M n

n

Let us also provide a simple way for checking whether a matrix $M$ over $\{<,=,>\}$ is an n-relation:

Proposition 8.3.2. $M$ is an n-relation if and only if it is antisymmetric and the consistency property holds for all cycles of length equal to 3.

Proof. The reverse implication can be proved as follows: take some arbitrary sequence of indices

m m m m

r p

p

r with . Suppose that property 8.29 is false for this sequence. Of course,

the antisymmetry and the hypothesis imply that p . We will show that the same property is false for a shorter sequence, fact which, by induction, would imply that property 8.29 would be false for

a sequence of length 3.

ard C m card Amcard B m

To this end, assume w.l.o.g. that c , hence we have

M n m

.Wehave cases to study, according to whether m is or , respectively to whether

 

M m

m is or .

 

M M

m m m

Consider first the case m . Since by hypothesis, the sequence

   

m m m M

m m

satisfies property 8.29, we must have , hence, by antisymmetry,

  n


M m m m m

m p

m . But then the sequence does not satisfy property 8.29,

 

because we have replaced two signs by with one sign, hence

Am Am B m B m C m C m

M M M

m m m m m

Similarly, if m and then and again the sequence

     

ut

m does not satisfy property 8.29. The other cases are treated similarly. M

Remark 8.3.3. By Proposition 8.3.2, we have that in an n-relation the length-3 cycles may only be of the types $(=,=,=)$, $(<,<,>)$ and $(<,>,=)$, up to circular permutation and reversal.
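The check suggested by Proposition 8.3.2 can be sketched as follows, under my reading of the consistency property (a cycle is inconsistent when all its labels lie in {<, =} and at least one is <); the character encoding of the relation symbols is my own.

```python
# Sketch of the n-relation check: antisymmetry plus consistency of all
# cycles of length 3.

REVERSE = {'<': '>', '>': '<', '=': '='}

def antisymmetric(M):
    n = len(M)
    return (all(M[i][i] == '=' for i in range(n)) and
            all(M[j][i] == REVERSE[M[i][j]] for i in range(n) for j in range(n)))

def cycle_consistent(a, b, c):
    labels = (a, b, c)
    return not (all(s in '<=' for s in labels) and '<' in labels)

def is_n_relation(M):
    n = len(M)
    if not antisymmetric(M):
        return False
    return all(cycle_consistent(M[i][j], M[j][k], M[k][i])
               for i in range(n) for j in range(n) for k in range(n))

M_good = [['=', '<', '<'],
          ['>', '=', '<'],
          ['>', '>', '=']]
M_bad  = [['=', '<', '>'],        # cycle 0 -> 1 -> 2 -> 0 is labelled <, <, <
          ['>', '=', '<'],
          ['<', '>', '=']]
print(is_n_relation(M_good), is_n_relation(M_bad))   # True False
```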

8.3.2 Operations on n-relations

Since our aim is to represent sets of regions (and thus EDBMs) with the aid of n-words and n-relations, we need to provide an algebraic calculus of composition and star on n-relations too. This is the issue of this subsection.

The first operation to be defined is projection: for each $M \in Rel_n$ and $X \subseteq [n]$, the $X$-projection of $M$ is the $card(X)$-relation resulting by deleting from $M$ the rows and columns that are not in $X$.

m M n M

m n

Definition 8.3.4. Given an -relation ,an -relation and a positive integer

minm n

p ,if

M M

(8.30)

mpm p

p M M m n p

then the -juxtaposition of with is the following set of -relations:

M  M M M M M j M

p np

m (8.31)

m mpmnp

M  M

p If requirement 8.30 is not satisfied then we put . The following proposition shows that relation juxtaposition enjoys all the properties of the

juxtaposition operations seen so far:

m M n M

m n

Proposition 8.3.5. 1. Given an -relation ,an -relation and a positive

p minm n M  M M M

p

integer , if and only if and verify the requirement 8.30.

M Regn M Regn X m card X p

2. For each , , for each with , for each

m n

n card Y q r minp q m r m X

Y with and each , suppose that

r Y

and . Then we have that

M  M M  M

r r

(8.32)

X Y X Y mr

M M M

m n p

3. Relation juxtaposition is associative: for each , and , and for each

minm n l minn p

k and ,

M  M  M M  M  M

k l k l (8.33) 172 8. Representing timing information with n-words

Proof. Concerning the first property, note first that we are interested in proving the reverse impli-

cation, since the direct implication is contained in the definition of juxtaposition. Hence consider

M M m n p

two relations and satisfying the hypotheses. In order to build a -relation from

i j

them, we must first see what could be the possible choices for the components in the new

m r j m m n p

relation, where i and .

i j m r m m n p So let’s consider, for each such pair of indices ,

the following set of pairs of relation symbols:

j k m p m M M

ij ik k mpj

Let us then call pairs that are different from or as critical.We

ij

would like to prove that, once some critical pair is in ij , all the other critical pairs from do

not “contradict” it. There could be several “contradictory” pairs that may occur: and

,or and ,or and ,or

and , or their symmetrics/cyclic permutations. We will prove the impossibility of

occurrence only for the first case, the other proofs being similar.

k l m p m M ik

Suppose the contrary, hence there exists such that ,

M M M

k mpj il l mpj

, and , . But then, by the consistency re-

M M

kl k mpl

quirement, and , which is in contradiction with the assumption

8.30. Hence, the two critical pairs cannot occur both in ij .

M M

ik k mpj

It follows that, if is a critical pair, then there is only one choice of a

M M M M

ik k mpj ji third relation symbol ji such that the triplet satisfies the consistency requirement for cycles of length 3. Moreover, this choice is the same for two critical pairs that are

“noncontradictory”.

m n p M  M

p

As a consequence, an -relation in can be constructed the following way:

i j m M M

ij

For each , ij .

i j m p m n p M M

impj mp

For each , ij .

i m p j m m n p M

For each and , ij is a choice which is consistent

M M

ji ij

with any (i.e. all) of the critical pairs in ij . Also is the reverse of .

Observe that the pairs and are noncritical since they are consistent

with any choice of a third relation symbol. That is, if a set ij contains only noncritical pairs then M

the choice of ij can be any relation symbol. This observation ends the proof of the first property. However we will observe that the same

proof could be employed for establishing the truth of the following two related properties:

m M n M p minm n

m n

(A) Given an -relation ,an -relation , a positive integer and a

X m card X p M M M

mnp

set with ,if then there exists some

X p

M M M M

such that and .

m mpmnp

n card Y p

(B) With the same hypotheses as above, and considering also a set Y with ,

M M M M M

mnp

if then there exists some such that and

mpm Y m

M M

.

mpmnp n 8.3 Representing DBMs with the aid of n-words and -relations 173

In fact, these properties can be regarded as connected to a definition of a set-indexed juxtaposition, something of the type

M  M M  M

Xp   pY 

 , respectively .

M M X Y

m n

For proving the second property, start with , , and with sets and

q r

integers p as in the statement of the property. The right-to-left inclusion of Identity 8.32 is

m n p M M  M

p

straightforward since for each -relation we have that

X Y mr

M M  M M  M fM g

p p

k X Y mr p X X

M  M M  M fM g M

p p

pr pq r X Y mr pr pq r Y mr Y

For the left-to-right inclusion we will essentially rely on the two properties (A) and (B): take a

 M M M p q r M M M M

r

-relation , that is, and .

X Y p X pr pq r Y

M M m q r

From we conclude, by means of property (A), that there exists an -

p X

M

mq r

relation such that

M M M M

and

m mpmq r

M M n p r

From we deduce, by applying property (B), that there exists a -

pr pq r Y

M

mq r

relation such that

M M M M

and

pq r pr npr

But these two choices imply that

M M

mpmq r pq r

M

nr

hence there must further exist m such that

M M M M

and

mq r mpmnr

m n r M  M

p

We then only have to observe that this -relation belongs to because:

M M M M

m mq r m m

M M M M

mr mnr mpmnr pr npr pr npr

Finally, the proof of the third property follows by easy verification:


M  M  M M j M M  M

k l mnpk l k

mnk

M M

and

mnk l mnpk l

M j M M

mnpk l

mnk m

M M

mnk mk mnk

M M

and

mnk l mnpk l

M j M M M M

mnpk l

m mk mnk

M M

and

mnk l mnpk l

M M  M  M M j M

k l mnpk l

m

M M  M

l

and

mk mnpk l

M j M M M M

mnpk l

m mk mnpk l n

M M

and

mk mnpk l nl npl

M M M M j M

mnpk l

m mk mnk

M M ut

and

mnk l mnpk l

Having these properties, we proceed further to defining concatenation and star:

n M M

n

Definition 8.3.6. Given two -relation , their concatenation is defined as follows:

M M M  M

n

(8.34)

nnn

The good properties of concatenation are the following:

n n

Proposition 8.3.7. 1. -relation concatenation is associative: for each triplet of -relations

M M M

n

,

M M M M M M

(8.35)

l r

n M

n n

2. Each -relation has a left and a right unit: for each there exist such

M M

that

l r

M M M

(8.36)

M M

The definitions of the left and right pseudounits are the following:

l l l l

M

ij nij inj ninj ij

M M M M

r r r r

M

ij nij inj ninj ninj

M M M M

l

Observe that, similarly to all units for the concatenations encountered so far, ii

M

r

i n

ii for all . M n

8.3 Representing DBMs with the aid of n-words and -relations 175

n M M n

3. For each -relation there exists a weak inverse with the property that

l r

M M M M

(8.37)

M M

The definition of the weak inverse is the following:

i j n M

nj n

i iff

M i n j n n

nj n

i iff

M

ij (8.38)

M i n n j n

nj n

i iff

i j n n M

nj n i iff

Proof. The proof of this proposition is similar to the proof of Proposition 8.2.9. We will only prove

the first property, whose proof relies on Proposition 8.3.5:

M M M M M  M

n

nnn

M  M  M

n n

nnn nnn

M  M  M

n n

nnnnn nnn

M  M  M

n n

nnn

M M M M  M M

n

nnn

M  M  M

n n

nnn nnn

M  M  M

n n

nnnnn nnn

ut M  M  M

n n

nnn n The powerset -relations becomes then a monoid with concatenation and with the following

unit:

M j M

n ini

n

n n M n Let us finally introduce star on sets of -relations: given a set of -relations , the star

of M is defined as:

 k

M

M (8.39)

k

k k

M M M k N

where M and for each .

n

8.3.3 n-word representations

Definition 8.3.8. A tuple $(W, M)$, consisting of an n-word $W$ over a one-letter alphabet and an n-relation $M \in Rel_n$, is called an n-word representation. The n-region $R \in Reg_n$ represented by the tuple $(W, M)$ is defined as follows: for all $i, j \in [n]$,

$R_{ij} = \{W_{ij}\}$ iff $M_{ij} = \,=\,$
$R_{ij} = (W_{ij}, W_{ij}+1)$ iff $M_{ij} = \,<\,$   (8.40)
$R_{ij} = (W_{ij}-1, W_{ij})$ iff $M_{ij} = \,>\,$

The region represented by the tuple $(W, M)$ is denoted $R(W, M)$. That is, we define a mapping, called representation, which associates to each n-word representation the region which is represented by it.
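A sketch of the mapping (8.40), under the orientation convention assumed above for which open unit interval the symbols < and > select (that orientation is my assumption); the interval encoding is also mine.

```python
# Sketch: compute the region represented by an n-word representation (W, M).

def region_of_representation(W, M):
    n = len(W)
    R = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            w = W[i][j]
            if M[i][j] == '=':
                R[i][j] = (w, w, False, False)        # point interval {w}
            elif M[i][j] == '<':
                R[i][j] = (w, w + 1, True, True)      # open unit interval above w
            else:                                     # '>'
                R[i][j] = (w - 1, w, True, True)      # open unit interval below w
    return R

# This (W, M) represents the region with x1 in (0,1), x2 = 1 and
# x2 - x1 in (0,1), relative to the reference index 0.
W = [[0, 0, 1], [0, 0, 1], [-1, -1, 0]]
M = [['=', '<', '='],
     ['>', '=', '>'],
     ['=', '<', '=']]
for row in region_of_representation(W, M):
    print(row)
```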

The above definition would be incorrect unless we prove the following:

W M W R W M n

Proposition 8.3.9. For each n , has a nonempty semantics.

W M

Proof. The proof idea is to check that is in normal form. By Remark 8.3.3, we have to

check four (representative) cases on M . We check here only two, the case and

the case, the proof in the other cases being similar.

i j k n M M M

jk ki 1. Take some and suppose that ij , and . Then,

by the triangle identity

W M W M fW g fW g fW W g fW g W M

ij jk ij jk ij jk ik ik

i j k n M M M

jk ki 2. Take some and suppose that ij , and . Then,

again by the triangle identity

W M W M W W W W W W W W

ij jk ij ij jk jk ij jk ij jk

W W ik

ik while

W M W W

ik ik ik

W M W W W W

ki ki ki ik

because, by construction, ki and . Hence, we get that

W M W M W M ut

ij jk

ik in this case too.

W R n

The mapping defines an equivalence relation on n (the kernel of , in category

theoretic terms) denoted in the sequel n and defined as follows:

W M W M W M W M

n iff (8.41) n The following proposition can be used to prove that each n-region has at least one -word

representation:

n W M W R n

Proposition 8.3.10. Given some -word representation n , suppose that there

Regn R W M n

exists some region R such that . Then there exists a -word

n

n

R W M W R W M W M

n

representation of , n , such that .

n

R R D bm

Proof. We first “close” each open interval in , that is, compute the DBM n with

R

iff ij

R

ij

R f g f g

iff ij n 8.3 Representing DBMs with the aid of n-words and -relations 177

It is easy to see that R is still in normal form since the triangle inclusion is preserved by taking

closures of the sets involved in it.

We now simulate in part the proof of Proposition 8.1.5, namely choose a set of integers i such

n

W W R

j i ij i i in

that (for simplicity we put ) and then prove that i .

i Then we observe that this intersection is a closed unit length interval with integer bounds, since

each factor of the intersection is. Thence we may choose one of the bounds of this interval, denote

K n W WD W

it , and build from it a -word n which extends :

W W i j n

ij for all

ij

W W K i n

i for all

in

W i n W

in for all

ni

n n

Similarly to the proof of Proposition 8.1.5 we get that W is a -word (in fact a -

R k M

signal) which belongs to k . We now need to choose some relation symbols that extend to

n M W M R

some -relation such that . This extension is done as follows:

R fW g

in

iff in

M

R W W

in in in

in iff

R W W

in in

iff in

M

and of course M .

n

n M M

Let us show that this is a -relation, i.e., that it is consistent. Since we only

n

need to check cycles of length 3 in which n participates.

M M M

jn ni

Suppose such a cycle is inconsistent, say ij and . But

R W W R W W R W W

ij ij jn jn jn ni ni ni then ij . It follows

that

R R R W W W W W W

ij jn ni ij jn ni ij jn ni

which obviously contradicts the assumption that R has a nonempty semantics, that is, the Identity

8.7 The other cases of inconsistent cycles are treated similarly.

n W M n

Hence M is indeed a -relation. As a consequence, is one of the -word

ut

representations for R . n The following corollary says that representation of sets of n-regions with the aid of -word

representations is complete:

Regn n

Corollary 8.3.11. For each region R there exists an -word representation of it. n

Proof. We only have to apply Proposition 8.3.10 iteratively, starting with a representation of R ,

fg

ut that is, with . 178 8. Representing timing information with n-words

8.3.4 Operations on n-word representations

Our quest of providing representations of EDBMs now demands extending projection, juxtaposition, concatenation and positive star from n-words and n-relations to n-word representations. The extensions are then, as expected, the following:

W M W M

(8.42)

X X X

W M  W M W  W M j M M  M

p p

p (8.43)

W M W M W M  W M

n (8.44)

nnn

 k

W

W (8.45)

k

W W R

n n where .

The main concern is then to show that the n-word representation operations correctly simulate the n-signal/region operations, that is, to show they are compositional, because this would assure us of the usefulness of n-word representations. The following proposition paves the way for proving this compositionality:

W M W R n

Proposition 8.3.12. 1. For each n ,

W M W M

(8.46)

X X X

m W M W R n

m m

2. For each -word representation , each -word representation

W M W R p minm n

n n

and each integer ,

W M  W M W  W M j W M W M

p p

M M  M W M W M

p

and (8.47)

n W M W M W R

n n

3. For each pair of -word representations ,

W M M M j W W W M W M W

W M W M M M M

and (8.48)

Proof. The first property follows by easy verification and we skip its proof.

m n p

For the second property, the inverse inclusion is straightforward: given any -region

R W  W M j W M W M W M W M M M  M

p

p and ,we

R W M R W M

have, by the first property 8.46, that and . But

m mpmnp

R W M  W M 

p p this implies that by definition of on regions.

For the left-to-right inclusion we rely upon Proposition 8.3.10 in the following way: take

R W M  W M R W M

p

some . Take some representation of , say

mpm

R . Then, using Proposition 8.3.10, extend this representation to two other represen-

mpm

W M W M

tations: one equivalent to and the other equivalent to . Denote these two repre-

W M W M

sentations as , respectively . n

8.3 Representing DBMs with the aid of n-words and -relations 179

W M

Observe that, since both these representations are extensions of we have that

W W W

mpm p

M M M

mpm p

W  W M M M

p

But thence is defined, and it remains to choose an extension of and such

W  W M R i m p j

p

that . This extension is the following: for each and

m m n p

,

R W  W g

p ij

iff ij

M

R W  W W  W

ij (8.49)

p ij p ij

iff ij

R W  W W  W

p ij p ij

iff ij

M M M M

Of course, and .

m mpmnp

R M

It is routine to check that only these three conditions on ij really hold and that is consistent.

W  W M R

p

And, by construction, we have . t The last two properties are easy corollaries of the property 8.47. u

Remark 8.3.13. Observe that the identity

W M  W M W  W M j M M  M

p p p

W M W M

is not valid in general since it might be possible that but

nn n

W W

just because the same region might have different representations.

An example of this mismatch is provided in the introductory part of this chapter.
This observation raises the question whether we may correctly represent region concatenation with n-word representation concatenation. The idea that helps us overcome this problem is that, for each pair of n-regions which correctly concatenate, there must exist a pair of n-word representations which correctly concatenate, and hence represent the concatenation of the two given regions. In other words, we will be interested in composing sets of n-word representations which bear the property that all the n-word representations associated with a certain region are in the set. The formalization of this idea is the following notion of convexity:

n N W R n

Definition 8.3.14. A set of -word representations n is called convex if it is satu-

rated by the equivalence relation n , that is, if the following property holds:

n W W R n n

For each set of -word representations n and each -word representation

W M N W M W M W R WR W M N n

,if for some n then also .

n W W R W n In the sequel, for each set of -word representations n , we denote as the

set of regions which are represented by some element of W :

W W M j W M W

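To make the saturation requirement concrete, the following is a minimal Python sketch (not the thesis' notation) of convexity over a finite universe of candidate representations; the function `region_of`, which maps a representation to the region it represents, is assumed to be given.

```python
from typing import Callable, Hashable, Iterable, Set, Tuple

Rep = Tuple[object, object]        # an n-word representation, viewed abstractly as a pair (W, M)

def convex_closure(reps: Set[Rep],
                   universe: Iterable[Rep],
                   region_of: Callable[[Rep], Hashable]) -> Set[Rep]:
    """Saturate `reps` by the equivalence 'represents the same region':
    add every representation from `universe` whose region already occurs in `reps`."""
    regions = {region_of(r) for r in reps}
    return set(reps) | {r for r in universe if region_of(r) in regions}

def is_convex(reps: Set[Rep],
              universe: Iterable[Rep],
              region_of: Callable[[Rep], Hashable]) -> bool:
    # A set is convex exactly when saturation adds nothing new.
    return convex_closure(reps, universe, region_of) == set(reps)
```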

n W W R n

Proposition 8.3.15. 1. For each convex set of -word representations n and each

n W card X

X , is also a convex set of -word representations.

X

W W R W W R

m m n n

2. For each two convex sets of word representations , , and

p minm n W  W m n p

p

integer , is a convex set of -word representations and

W  W W  W

p p

(8.50)

n W W W R W W

n n 3. For each two convex sets of -word representations , is a

convex set and

W W W W



n W W R W

n n

4. For each convex set of -word representations , is convex and

 

W W

W M

Proof. All the properties rely on Proposition 8.3.10. For the first property, observe that, if

W W M W M R W R W M W M

and then there must exist such that .

X

W M n W M

But then we may recursively apply Proposition 8.3.10 to extend to a -region

W M W M W M R R W

that is, with , such that . But since , by convexity

X

W M W W M W

of W it follows that . Fact which implies that . X For the second property, observe that identity 8.47 from Proposition 8.3.12 gives the left-to-right

inclusion:

W  W W M W W  W M j M M

p p m n

such that

M M M M W M W

and

m mpmnp

W  W M j W M W W M W

p with

W M W M W M W M M M  M

p

and

W M  W M j W M W W M W

p

W  W

p

R W  W R W

p

For the reverse identity, suppose we have some region . Hence

m

R W W M W R W M

and . It follows that there exist with .

mpmnp m

W M

Consider now . We have that

mpm

W M R

mpm mpm n hence, by Proposition 8.3.10 we may extend this p-word representation to an -word representation

that represents R , say

mpmnp

W M W M

and (8.51)

p mpm

W M R

(8.52)

mpmnp


W M W W

Observe that Identity 8.52 implies that , hence, by convexity of it follows

W M W W W p

that . On the other hand, Identity 8.51 says that and can be -juxtaposed

M M W  W M  M W  W M  M

p p p p

and so can and . Hence is nonempty and

W  W

p

.

m n p M M  M R W  W M

p p It remains just to pick some -relation such that ,

and the choice is the same as for the M defined in 8.49 in the proof of Proposition 8.3.12. Also it

W W W  W

p

is easy to observe that the convexity of both and implies the convexity of . t The proof of the last two properties is a straightforward corollary of the first two. u

8.4 n-region automata

n A Q Q Q n

Definition 8.4.1. An -region automaton is a tuple in which all but

n A Q Q Q n

the last components form an -automaton over the one-letter alphabet

fg Q n n

while n is the -relation labeling function, associating a -relation to each

state.

n A Q Q Q n A n

The -automaton is called the underlying -automaton of . n
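As an illustration only, here is a hedged Python sketch of the data carried by an n-region automaton as defined above: the underlying one-letter n-automaton plus an n-relation label attached to each state. All field names and the `NRelation` encoding are assumptions of the sketch, not notation from the text.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple

State = int
NRelation = Tuple[Tuple[int, int, str], ...]   # placeholder encoding of an n-relation

@dataclass
class NRegionAutomaton:
    """Sketch: an n-automaton over a one-letter alphabet plus an n-relation label per state."""
    states: Set[State]
    transitions: Set[Tuple[State, State]]   # one-letter alphabet, so transitions carry no symbol
    initial: Set[State]
    accepting_sets: List[Set[State]]        # the family of accepting sets of the n-automaton
    label: Dict[State, NRelation]           # the n-relation labeling function

    def labeling_is_consistent(self) -> bool:
        # cf. Remark 8.4.2 below: transitions should only connect equally labeled states
        return all(self.label[q] == self.label[r] for (q, r) in self.transitions)
```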

n-region automata are intended to represent EDBMs by means of n-word representations: a run in an n-region automaton is a sequence of transitions which match on intermediate states, with the additional property that all states in the run are labeled with the same n-relation. Because only one symbol can label any transition, we will represent each run as the sequence of states in

q n q

i i

k

the run, i and denote the -relation which identically labels all the states in .

q i

j

k

A run j is accepting if it passes through each accepting set, that is, if for each

n j k q Q n W M i

there exists some such that j . Given a -word representation

l

W R q l l

n n j i

k in

, a run j and a sequence of indices of states in the run with

l

q Q n W M l i

the property that l , we say that the -word representation is accepted by if

i

n A M n

W is accepted by the underlying -automaton and . Similarly to -automata, we call

l

l W M the sequence l as the sequence witnessing the acceptance of by .

Remark 8.4.2. Since in each n-region automaton we are interested only in runs in which states are n

labeled with the same n-relation, we will consider only -region automata in which the transition

a

q r

function is consistent with the n-relation labeling , that is, in which whenever for some

q r a then .

To each n-region automaton we will associate three languages:

The n-word representation language accepted by A, denoted L_rep(A), consists of the n-word representations accepted by some run together with a witnessing sequence of indices, as above.

The region language of A is the set of regions which are represented by some n-word representation in L_rep(A), and is denoted L_rgn(A):

L_rgn(A) = { region represented by (W, M) : (W, M) ∈ L_rep(A) }     (8.53)


Finally, the n-signal language of A is the union of the semantics of the regions in the region language of A:

L_sig(A) = ∪ { ||R|| : R ∈ L_rgn(A) }
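To relate the three languages concretely, here is a small Python sketch, with `region_of` and `semantics` assumed to be the (unspecified here) maps from representations to regions and from regions to sets of n-signals.

```python
from typing import Callable, Hashable, Iterable, Set

def region_language(rep_language: Iterable, region_of: Callable) -> Set[Hashable]:
    # L_rgn(A): the regions represented by some accepted n-word representation.
    return {region_of(rep) for rep in rep_language}

def signal_language(rep_language: Iterable, region_of: Callable,
                    semantics: Callable) -> Set:
    # L_sig(A): the union of the semantics of the regions in the region language.
    signals: Set = set()
    for region in region_language(rep_language, region_of):
        signals |= set(semantics(region))
    return signals
```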

n A B L A L B rg n

Remark 8.4.3. Observe that, for each two -region automata and , rg n iff

L A L B L A L B L A L B

sig rg n rg n rep rep

sig , but it might be possible that and , n

due to the possibility to represent the same n-region by different -word representations. But if

n L A L B rg n

the two -word representation languages are convex, then we also have rg n iff

L A L B rep

Definition 8.4.4. An n-region automaton is called convex if its n-word representation language is convex.

With this definition, the following chain of equivalences is valid for convex n-region automata:

L_rep(A) = L_rep(B)   iff   L_rgn(A) = L_rgn(B)   iff   L_sig(A) = L_sig(B)     (8.54)

Hence, when we need to prove the equality of the languages of two convex n-region automata, we only need to prove the equality of their n-word representation languages.

8.4.1 Basic closure properties for n-region automata

n

Throughout this section we prove that the operations on n-automata can be extended to operations on n-region automata. This subsection is just a restatement, for EDBMs and n-region automata, of the results contained in Chapter 7. We start with the translation of Proposition 7.2.10:

Proposition 8.4.5. The class of n-signal languages accepted by n-region automata is closed under union and intersection. Moreover, if A and B are two convex n-region automata, then one can build convex n-region automata for L_sig(A) ∪ L_sig(B) and L_sig(A) ∩ L_sig(B).

Proof. The constructions are straightforward adaptations of those from Proposition 7.2.10. The convexity property follows from the fact that intersections and unions of saturated sets are saturated sets. ⊓⊔

n n

Theorem 8.4.6. The class of n-signal languages which are the n-signal language of some n-region automaton equals the class of n-signal languages which are the semantics of a sum of n-EDBMs.

Note that the result refers to extended DBMs. It is clear that, in general, n-region automata are more expressive than sums of mere DBMs.

Proof. The proof of the direct inclusion is very similar to the proof of Theorem 7.2.14.

q q q Q

n i i

Consider all tuples of accepting states with and such that all the states

M n M

in the tuple are labeled with the same n-relation . For each such tuple and -relation ,we M

construct the n-region automaton in which only the states labeled with are present and in which

Q Q fq g Aq q M

i i n all i s are singleton sets . Denote this reduced automaton .


j n E

Then, for each i we build the regular expression which denotes the set of

ij

q q Aq q M

j n

positive integers that are the length of a path from i to in . Since we speak

about positive integers, we may put E in the form

ij

E A B fcg ij

4

where A and B are finite sets of integers and c ∈ ℕ.
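The decomposition E_ij = A ∪ B·{c}* describes an ultimately periodic set of path lengths. As a small illustration (under the standard reading of this expression, which is an assumption of the sketch), membership in such a set can be tested as follows:

```python
from typing import Set

def in_path_length_set(m: int, A: Set[int], B: Set[int], c: int) -> bool:
    """Membership test for the set denoted by A + B.{c}* :
    m belongs to the finite set A, or m = b + k*c for some b in B and k >= 0."""
    if m in A:
        return True
    if c == 0:
        return m in B
    return any(m >= b and (m - b) % c == 0 for b in B)
```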

n

Then, from E we build a regular expression over intervals – that is, an -EDBM – as follows:

ij

M

1. If ij then clearly

D E A B fcg

ij ij

M A n

2. If ij then observe that, intuitively, each is an upper bound for a -region

A q q j

which is accepted by along a path from i to . Hence we put

j B fcg j A D

ij

M

3. If ij then we put

D j A j B fcg ij

Finally, from all these regular expressions over nonnegative intervals we build the n-EDBM

D E dbm

n defined by:

D D D

ij

ij ji D where denotes the regular expression over real intervals which results by changing every

bound into its opposite; for example, for the case 3 above,

D j A j B fcg ji

It is then easy to check that the semantics of the sum of all n-EDBMs built for each tuple

q q n M n A n and each -relation equals the -signal language of .

The reverse inclusion can be proved by induction on n as follows: in the base case, we code

each regular expression over real intervals into a -region automaton. The idea is to decompose

each regular expression over intervals

A B fcg R (8.55)

into a union of two regular expressions, one containing only point intervals and the other containing

only open intervals of unit length. Hence the basic case reduces to the following constructions:



Remind that we use for denoting union and for denoting concatenation for regular expressions over intervals.


A B N m max A m max B B 1. Suppose that . Denote A and . Then the -region automaton

equivalent to R is:

C Q Q Q

where

Q m fg m fg c fg

A B

j m j m

A B

j c c b j b B

Q f g

Q A fg B fg fc g

q Q

for all q

Observe that C is a convex -region automaton.

i

card A

2. Suppose there exist two strictly increasing sequences of integers i and

i

card B

i such that:

A j i card A B j i card B

i i i

i and R

Then the -region automaton equivalent to is

C Q Q Q

where

Q fg f g

card A

fg f g

card B

c fg f g

x s x s j x s f g

card A

x s x s j x s f g

card B

x s x s j x c s f g

j i card B

i

j i card B

i

Q f s s j s f gg

Q j i card A

i i

j i card B

i i

fcg fg f g

k l k l k l and for all

Observe again that C is convex.

For the induction step we rely upon the following property:

A n Proposition 8.4.7. Given an n-region automaton , there exists an -region automaton with the property that


L B R Regn j R L A rg n

rg n (8.56)

n

n

L B Sig fag j L A sig

or, equivalently, by Proposition 8.1.5, sig .

n

n B

Moreover, if A is convex, then can be chosen to be convex too.

n A Q Q Q n

Proof. Denote the given -region automaton . We first build an ex-

q q

tension of the automaton, by appending states similar to and from Definition 7.2.9 of the

n card

completion of an -automaton. We will actually append n states, namely the union

Q Q

where

Q q M j M Q q M j M

n n

and

q q n due to the need to have a state and a state labeled with each -relation. These states will be

connected to the others as follows:

q M q M q M q M j M

n

q M q q q M j M i n q Q

n

and there exists such that i

A A

Thus we get an n-region automaton, which we call the completion of and denote . In this

Q Q

automaton, any accepting run can be extended to a run that starts in and end in .

n

We further transform this automaton into an -region automaton by putting any state in

n n n the -th accepting set and by augmenting all -relation labels to -relation labels. The

resulting automaton is

B Q Q Q Q Q Q Q Q

with and

n n

Q q M j q Q M M q

n and

n

q M q M j q q

i n Q q M Q j q Q

i for

i

Q Q

n

q M M q M Q for all

By a straightforward adaptation of the proof for Proposition 7.2.12 we get that

L B W M W R j W M L A

n n rep

rep (8.57)

n n

L B L A rep

Observe that this property is equivalent to the fact that rep .

n

W M W M

The convexity of B follows easily from the above property: suppose that

W M LB

and . By assumption, we then have

W M W M W M W M

n n n n n n

W M LB W M L A

Further, implies that rg n . From these and from the

n n

A W M L A

convexity of we get that rep , which, by identity 8.57, is equivalent

n n

W M L B ut to rep .


R E dbm i j n

Proof (of Theorem 8.4.6, continued). For each EDBM n and each ,we

R n

construct the -region automaton equivalent to ij . Then we extend this automaton to an -region

n

automaton, by recursively applying Proposition 8.4.7. Finally, we intersect the n automata R to get the n-region automaton equivalent to .

If we are given a finite sum of EDBMs, we utilize the above construction for each term of the sum, and then apply the union construction from Proposition 8.4.5. ⊓⊔
8.4.2 Non-elasticity for n-DBMs

To further adapt the results on concatenation and star closure from Chapter 7, we need to transport non-elasticity to n-regsignals, n-regions and n-word representations, and to relate these properties to one another.

n Sig

Definition 8.4.8. A -signal is called non-elastic if the following property holds:

n

i j n

i jnj inj

(NS) For each ,if in and then and

i

jn .

n n

A -signal language is called non-elastic if each -signal in it is non-elastic.

n n

A -region automaton is called non-elastic if its -signal language is non-elastic.

n n

As we intend to represent n-signal languages by sets of n-regions, and further by sets of n-word representations, we need to transport the notion of non-elasticity from n-signals to n-regions and to n-word representations in a consistent way. Moreover, we expect the notion of non-elasticity of n-word representations to rely on the notion of non-elasticity of n-words, similarly to Definition 7.3.1.

n D D bm n

Definition 8.4.9. A -DBM is called non-elastic iff the following property holds:

i j n D n fg D n fg D

i jnj inj

(ND) For each ,if in , then

D

i

and jn .

n D D nf D kD k n Proposition 8.4.10. For each -DBM in normal form , is non-elastic iff is

non-elastic.

D k n

Proof. For proving the first property, observe first that, if k contains a -signal which is elastic D

then D itself must not be non-elastic. For the other implication suppose is elastic, hence there

i j n D n fg D D

i ni j nj i nj

exists a pair of indices such that , but

     

D kD k

ni

or j . Suppose also, for the sake of contradiction, that is non-elastic.

 

D D

ni j nj

Let us first observe that i and similarly , since otherwise we

  

n kD k

ni

may construct, by means of Proposition 8.1.5, a -signal with i , hence

 

D k

contradicting the assumption that k is non-elastic.

D D We will replace first D by the “sub-DBM” which is obtained from by transforming closed

parentheses into open parentheses on all nonpoint components of D . That is,


inf D sup D ij

iff ij and

D

ij

D f g R D ij

ij otherwise, that is, iff for some

Let us first observe that the sub-DBM is also an n-DBM in normal form: for each i, j, k ≤ n, since D_ik ⊆ D_ij + D_jk, it follows also that int(D_ik) ⊆ int(D_ij) + int(D_jk) (where we have denoted by int(A) the interior of a set of reals A ⊆ ℝ). Hence the triangle inclusion is valid if all three components are nonpoint sets. The triangle inclusion also holds for all triplets of point components of the sub-DBM, since such components are copied from D. It remains to check the triangle inclusion for the case when one or two components are point intervals and the other (or the others) is (are) nonpoint interval(s).

Observe first that the situation with D_ij and D_jk being point sets and D_ik a nonpoint set is impossible, since the sum of two point sets is also a point set.

D D D

jk ik

Suppose ij is a point set and are nonpoint sets, say

D f g D D

ij jk ik

D D D ik

the cases with other parentheses for jk and being treated similarly. Since is a DBM we

D D D

ij jk

have ik , that is, . Therefore

D D D

ik ij jk

D D D

ik jk

A similar proof can be done when ij and are both point intervals and is nonpoint.

D D D

ij jk The last distinct case is when ik is a point interval and one of or is a nonpoint interval:

suppose, w.l.o.g., that

D f g D D

ij jk

ik with

D D D

ij jk

Since by hypothesis ij , we must then have

f g

which means that , hence

Similarly we may prove that , hence in fact we must have . But this

D D D

is equivalent to the triangle inclusion D . Hence is a DBM.

ij

ik jk

D D

i ni nj

Observe now that, following our observation that j , we must have

   

D D

i ni nj

j .

   

D

We utilize then D as follows: take some negative number . Such a number must

i nj

 

D D intD

nj i nj

exist, because we have assumed that i and . The number

   

i nj

 

D

can also be regarded as a -signal in the semantics of the -DBM . Then, by recursively

fi nj g

 

n kD k applying the construction from Proposition 8.1.5 we extend this to a -signal .

But obviously is elastic since


D

ni i ni

i hence ,

   

ini

nj

similarly j ,

 

nj

and i .

 

D k kD k kD k ut

which gives a contradiction with the non-elasticity of k , since .

n n

Non-elasticity extends to n-word representations in the following way: an n-word representation (W, M) is called non-elastic if the n-DBM represented by it is non-elastic. Consequently, a non-elastic n-word representation must satisfy the following property: for each i, j ≤ n,

if

W W M

i ini ini in or ( and )

and

W W M

j jnj jnj

jn or ( and )

W W M f g

j inj inj

then in or ( and ).

Observe that if an n-word representation (W, M) is non-elastic then the n-word W is non-elastic.

8.4.3 Closure under concatenation and star

Proposition 8.4.11. Given two convex n-region automata A and B, there exists an n-region automaton D with the property that L_sig(D) = L_sig(A) · L_sig(B). Moreover, if both A and B are non-elastic then D is non-elastic too.

Proof. We adapt the concatenation construction in Section 7.4 for -region automaton: denote the

Q A Q Q Q B Q Q

n

given automata as and . There are two

n

ideas that guide this adaptation (we refer the reader to the construction on page 127):

q q X q q

First, we require that, in each tuple , the labels of and are “consistent”, that is,

q n q the projection of onto the last components equals the projection of onto the first

components.

q q X n M

We then attach to each tuple a -relation label which is in the concatenation of the

n q q

-relation labels and .

B A B

The formalization is the following: we construct first A and , the completions of and ,

A Q Q Q Q Q Q Q

n as in the proof of Proposition 8.4.7. Hence with

where

Q q M j M q M M

n

Q q M j M q M M

n

q M q M q M q q M q M q q M

and , , respectively , for all

S

q Q

i .

i n

B Q Q Q Q Q Q Q

Similarly with where

n


Q q q M j M M M

n

Q q q M j M M M

n

q M q M q M q q M q M q q M

and , , respectively , for all

S

A B Q

q . We will also assume, as stated by Remark 8.4.2, that in both and the transition

i

in n

relation is consistent w.r.t. the -relation labeling, that is, two states are connected by a transition n

iff they are both labeled with the same -relation. n

Then build the following -region automaton:

C Q S S

n

with

Q q q X M j X n q Q q Q q q

with

nn n

q q

and M

a a a

q q X M r r Y M j q q X M r r Y M Q q r q r

X Y n i Y n X r Q r Q

i

and for all n and

i

S q q X M Q j q Q

i i

S q q X M Q j q Q

ni

ni

q q X M M q q X M Q

for all

C Q Q fg

n

Finally drop all the states of that are not reachable from or not coreachable

Q Q n D

n

from and denote the resulting automaton.

L A L B L D n W M

rep rep

To prove that rep , take some -word representation

L A L B W M L A W M L B

rep rep rep rep

, hence there must exist and such

W W W M M M r

i

k

that and . It follows that there exists an accepting run i

A A i

l

n

in (actually we will consider it in ) and a sequence of indices l which witness the

W M A n M

acceptance of by , hence all its states are labeled with the -relation ; also we may



Q Q r B

k

assume that starts in and ends in . Similarly, there exists a run i in and

i

j W M B

l

n

a sequence of indices l which witness the acceptance of by and whose states

M Q Q

are all labeled with . We may also assume that starts in and ends in .

As in the proof of Proposition 7.4.1, we may transform the two runs by addition of loops in

Q Q Q Q i

l

n

and , respectively and , and “translate” the two witnessing sequences l and

j k k

l

n l such that the runs have equal length (we assume thence ) and the following

property holds:

l n i j

l l

For all n

r r I M

i i

k

Then we construct the run i in which

i

I

I I l n j r q

i i i i

ln

190 8. Representing timing information with n-words

p

l

n

and the sequence of witnessing points l with

i l n

l for

p

l

j l n n

l for

Q

Observe first that each tuple in the run is in , since

M M M r r i k

i

for all

i

Q Q Q Q

n n

Moreover, the first state being in and the second state in ,it

follows that all the states are also states of D .

r r I p

i i l

k l n

Then observe that i and the sequence are exactly the run and the

i

W W D

witnessing sequence associated with in the underlying automaton . Hence the run and

n W M

the accepting sequence are associated with the -word representation , which shows that

W M L D

rep .

L D L A L B n W M

rep rep

To prove that rep , take some -word representation

X M r r D L D

i i rep

k

, which is thence associated with a run i in and a sequence

i

l

i

n

i .

r A

i

n

Observe then that i is an accepting run in because we have assumed that

n r

n

the transition relation is consistent with the -relation labeling. Similarly, i is

i

B M n

an accepting run in . Let’s denote the common -relation label of all states in , that is

M M

and .

p p k

j j

n

Then we construct the following sequence of indices: j with and

p i j X n X

i i

j iff

n l p

i i

n in

It follows that the sequence i witnesses the acceptance of some -word rep-

W M A p l

i i

n inn

resentation by the run in and the sequence i witnesses the

n W M B W

acceptance of some -word representation by the run in . Moreover

nn

W p

i

n

because both are associated with the run and the sequence of indices i . Hence

n

W M W M W M W M L A L B ut

rep rep , which shows that .

In the previous chapter we also proved closure under indexed juxtaposition for n-automata. The respective construction can easily be adapted to n-region automata, along the same lines as the above proof. We have preferred to present here only the proof for concatenation since it offers insights for the proof of star closure.

k

n A k N L A

Theorem 8.4.12. Given a convex -region automaton , suppose that, for any , rep



n L A n

is a non-elastic -signal language. Then rep is accepted by a -region automaton.

Proof. We adapt the construction from Theorem 7.4.3 as follows: first, we replace A by its com-

pletion A in which we assume, following Remark 8.4.2, that the transition function connects only n states labeled with the same -relation.




L A

The automaton that accepts sig is denoted as

C Q U U

  n 

S T M M Q

 consists of tuples in which the first components have the same meaning

and utility as in Theorem 7.4.3, while the last two components give the information concerning the

n n

-relation which is to be accepted. Namely, the fifth component is exactly the -relation-label n of the macrostate, while the sixth serves for succesively concatenating all the -relation labels of

the states which have passed through the right active component. The idea is that, at the end of the

M M M parse we want to have M . In some sense, in we make a guess at the beginning for the

that we will get at the end of the parse.

Q n 

Formally,  consists of the following types of states and gives the following -relation X

labeling (we utilize here the notations Q and from the proof of Theorem 7.4.3):

X q X T M M X q X Q T Q M M n

1. where , , and , with the prop-

Y r Y T

erty that, for all ,

q M

a) .

X q X T M M M

b)  .

n n Y n n

c) X .

n n Y n n

d) X .

Y n Y n n Y n Y n n X n X n n

e) .

S X q X Y r Y T M M X q X Y r Y Q S T Q M M n 2. with , , ,

with the following properties:

q r

a) .

nn n

X q X T M M M

b)  .

n n Y n n

c) X .

n n Y n n

d) X .

U s U S

e) For each ,

n n X n n

i. U .

n n X n n

ii. U .

U n U n U n U n n n X n X n

iii. .

V t V T

f) For each ,

n n V n n

i. Y .

n n V n n

ii. Y .

V n V n n V n V n n Y n Y n n

iii. .

S X q X M M X q X Q S Q M M n

3. with , , and , with the property

Y r Y S

that for all ,

X q X T M M M

a)  .

n n X n n

b) U .

n n X n n

c) U .

U n U n U n U n n n X n X n d) . 192 8. Representing timing information with n-words

The transitions are the following:

a

X q X T M M X q X T M M

1. iff

a

q

a) q ;

a

V t V T V t V T t t

b) For all there exists such that .

a

S X q X Y r Y T M M S X q X Y r Y T M M

2. iff

a a

q r r

a) q , ;

a

U s U S U s U S s s

b) For all there exists such that ;

a

V t V T V t V T t t

c) For all there exists such that .

a

S X q X M M S X q X M M

3. iff

a

q

a) q ;

a

U s U S U s U S s s

b) For all there exists such that .

S X q X Y r Y T M M S Y r Y Z s Z T M M

4. iff

M M s

;

X X X q X S

There exists such that ;

Y X Y r Y T

There exists such that ;

Z s Z S Z X Z s Z S

For each there exists such that .

Z s Z T Z X Z s Z T

For each there exists such that .

X q X T M M X q X Y r Y T M M

5. iff

M M r

;

Y X Y r Y T

There exists such that ;

Z s Z T Z X Z s Z T

For each there exists such that .

S X q X Y r Y M M S Y r Y M M

6. iff

X X X q X S

There exists such that ;

Z s Z S Z X Z s Z S

For each there exists such that .

n

The accepting sets are, for all i ,

U X q X T M M j i X n X M M M q

i n

S X q X Y r Y T M M j i X n X M M n

and

Z s Z S i Z n Z

for all (8.58)

U S X q X M M j n i X n X M

ni n

S X q X Y r Y T M M j i n Y n Y M M n

and

Z s Z T i n Z n Z for all (8.59)

Finally, the state space is reduced to the states reachable from the following set

o n

j T Q M M M q M T M M Q

n (8.60)

and coreachable from

n o

Q S n q M n M M j S Q M M

f f n f (8.61)

We will denote this reduced state space as Q and the resulting automaton as D. The correctness of this construction is almost the same as in the proof of Theorem 7.4.3, with some extra considerations on the relation labels of the states. Though it might look redundant, we have to retrace the constructions in Theorem 7.4.3 to show why they work for n-region automata too.



L D L A n W M rep

For proving the inclusion rep , take some -word representation

D D S T M M

i i i i i

m accepted by , that is, associated with some accepting run of i ,

with

T M S T M M q M

S T M M S q M n M M

m m m m m

h

h h h m S T M M

i i h h h h h

n

and with a sequence of indices i with and

i i i i i

i n i j n U

i for all . Hence, for all ,

W

ij

S T M M S T M M

h h h h h h h h h h

i i i i i j j j j j Similarly to the proof of Theorem 7.4.3 we identify a number p of times the run passes

through transitions that “move around” states from the right active to the left active component.

k k p

Denote the indices at which transitions of the type 4,5 or 6 occur in . We then use

j

j j

p Z s Z A i m

j

k

and build runs in “history” presentation, i in ( ) such that

i

i i

the following properties are satisfied:

j

j j

j p i k k Z s Z

j

1. For each and j , is the right active component,

i

i i

j

j j

Z s Z

i .

i

i i

j

j j

j p i k k Z s Z

j

2. For each and j , is the right active

i

i i

j

j j

Z Z s

component, i .

i

i i

j

j j

j p i k Z s Z

3. For each , and j , is part of the prophecy component,

i

i i

j

j j

Z s Z T

i .

i

i i

j

j j

j p i k m Z s Z

4. For each , and j , is part of the history component,

i

i i

j

j j

Z s Z S

i .

i

i i

Q

i

5. passes through some accepting set n at the same moment when passes through the

j j

Q i n

accepting set i for the same , that is, the essential property (*) utilized in the proof

of Theorem 7.4.3:

p i m

For all j and for all ,

j j

j j

n n Z n n Z n n Z n n

Z and .

i i

i i

k k m

p Here we have denoted and .

Once having these runs, we translate them to the witnessing presentation by building p se-

j

j j

i i

l

l l i p l u u Z n Z

n

quences of indices u ( ), with iff , and observe

u i

i i

p n w

j

p

that these sequences witness the acceptance of -words j for which we may prove,

p similarly to the proof of Theorem 7.4.3, that for each j ,

194 8. Representing timing information with n-words

W w w w w

p j j

nn n

w w w w

p

n n nn nn

j

j j j j

Z Z A s i i m s s

m

As i is a run in , it follows that for all .

j

i

i i i i

j

s i n n

Let us denote then M for some . We would like now to relate these -relation

j

i

M labels with the components i in the run , in order to prove that these labels correctly concatenate

and that M is in their concatenation.

j

j j

s p Z Z

Let us note first that for each j the tuple is the right active com-

i

i i

j

j j

Z Z s

ponent while the tuple is the left active component. Hence we must have, by

i

i i Q

requirement 2.b from the construction of  ,

j j

M s s M

j j

i i

nn nn n n

M

Therefore the M s correctly concatenate, it only remains to prove that their concatenation is .

i

j p i k k

j

To this end, observe that, for each and each j (we consider

k k m i M M

p i i

and ), the -th transition is of type 1,2 or 3 and therefore .By

Q M

requirement 1.a from the construction of  , must be the label of the right component of the

M M M M i k

i

first tuple, hence . It follows that for all .

k s

Consider now the -th transition. It is an -transition of type 4 and it pulls the state out

i

M

of the prophecy component into the right active component. Therefore, by construction, k



M s M M M

k

, that is k .

 

i

j

j j

Z s Z

By induction we may then prove that, if the i-th right active component is then

i

i i

M M M i m M M M M M

m m

i . Hence, for we have that . But , and

j p



W M w M w M W M L A

p rep

therefore , fact which shows that .

p

p n w M L A

i rep

For the reverse proof, take -word representations i which correctly

w M w M i p

i i i

concatenate, that is, i for all , and consider

nn n

A p n w M i accepting runs in the completed -automaton , one for each i , together with their

witnessing sequences of indices:

i i

q l

i j m

n

with witnessing index sequence k

j k

i

Q Q

We assume that each run starts in and ends in .

n w M w M w M

p p

Then consider some -word representation , that is, take

w w w M M M w M LD

p p

and . We would like to show that .

The first step is to transform each run i into a run in the “history” presentation, that is, denote

i

X j

the set of indices of the accepting states which were visited by each run i just before the -th

j

i

X j

step and by the set of indices of accepting states visited by i up to the -th step, and also

j i

denote their difference:

j

i i

u n j v j l v

X such that

j u

i

i

u n j v j l v

X such that

j u

i

i i

X n X

j j j

8.4 n-region automata 195

m

Then, similarly to the proof of Theorem 7.4.3, we bring all the runs i to equal length, say ,

S T M M

i i i i

mp

and build up a sequence of tuples i , as follows: i

1. The first tuple in is

i

i i

X q X fX q X j i pg M q

and the last tuple is

i p

i i p p

fX q X j i p g X q X M M

m m m m m m

i

 

i i i i

n n q n n X X i i

2. If k and for all , ,

j j j j j

S T M M

k k k

then we append to the run the tuple k with:

k



 

i

i i

q S X X j i i

k

j j j

i

iff

k

i

i i

X X i q

iff

j

j j

i

i i

X X q

k

j j j



 

i

i i

T q X j i i X

k

j j j

M M

k k

i

i i

X q X i i

3. If k and there exists some for which

j j j

 

i i

n n n n n

j j

then let

 

i i

max i i j n n n n n

j j

i

We then append tuples as follows:

S T M M

k k k

a) The first tuple to be appended is the tuple k in which:

k



 

i

i i

S X X j i i q

k

j j j

i

i i

X X q

k

j j j

i i i i

X q X n n n

k

j j j j

  

i i i

T X q X j i i

k

j j j

i

M M q

k k

j

l i S T M M

l k l k l k l

b) For each we append the tuple k with:

k l

 



i i

i

S X q X j i i l

k l

j j j

il

il il il

X n n n q X

k l

j

j j j

il il il il

X q X n n n

k l

j j j j

  

i i i

T X q X j i i l

k l

j j j

il

M M q

k l k l j

196 8. Representing timing information with n-words

T M M l i S

i k i k i k i

c) For we append the tuple k where:

k i

 



i i

i

S X X q j i i

i k

j j j

X n n n q X

i k

j j j j

p

iff

k i

X q X p

iff

j

j j



 

i

i i

j i X T X q

k i

j j j

k i

M M q

k i k i

j

i p S T M M

k k k k

4. If k , which can only happen when , we append the tuple k

with:



 

i

i i

j i p S X X q

k

j j j

p

p p

X q X

k

j

j j

k

T

k

M M

k k

p

t

t t t l t l

u u nu

un

u

Consider the sequence of indices such that and n for all

u

u n A S T n

i i i i

mp

. Also consider the run in i that is, we purge the -

t

tt w

relations from the components of . Then the pair witnesses the acceptance of by the

underlying automaton D . D

To end the proof, we only need to show that is a run in . The specific requirements (for n

-region automata) that need to be checked are 1.a and 2.a, since all the other requirements are

either trivially true (the case of 1.b, 2.b and 3.a) or implied by the fact that is an accepting run in D (the case of requirements 1.c, 1.d, 1.e, 2.c, 2.d, 2.e.i, 2.e.ii, 2.e.iii, 2.f.i, 2.f.ii, 2.f.iii, 3.b, 3.c,

3.d).

q

The validity of requirement 1.a is straightforward, since M and each step before the

M

first -transition preserves .

For proving the validity of requirement 2.a, we have to observe that, according to the definition

k i j k

of , at each moment at which k and , the left and the right active components

j

i

are the -th tuple of the run i , respectively the run , that is,

i i

i i

i i

X q X X X q

k k

j j j j

j j

i

i

q M q M M

i i

This implies that i and , and then, due to the hypothesis that correctly

j

j

i

i

M q ut q

concatenates to i , we will get that . Hence requirement 2.a also holds.

9. Applications

In this chapter we gather together all the results and techniques developed so far, and provide a method for checking whether the semantics of an n-signal regular expression is empty. Consequently we get a method for checking whether the language of a timed automaton is empty.
We have seen how to study the untimed behavior and the timing behavior of systems by using n-automata. We might then think of decomposing each n-regsignal into the “untimed” part, which represents the qualitative behavior of the modeled system, and the “timing” part, which gives the temporal constraints on the behavior of the system.



i R RSig R hE i hE E

ij ij I ij

More formally, given where I with regular expres-

n

ij

ij

ij

E I R I R

sion over , regular expression over , ij and , we define the following

ij ij

two n-regsignals:

u u

R R R E E i j n

1. , called the untiming of , ij for all .

ij ij

t t



R R R h i h i i j n I

2. , called the timing of , I for all .

ij ij

ij

u t

R k kR R k

Then k .

u t

n R n n

R can be considered as a -regword and as a -DBM. This implies that from each -

kR k

signal we keep only the untiming information and the duration of each component ij ,

for all i, j ≤ n. Hence the two aspects, untimed behavior and timing behavior, can be studied separately.
However, for systems in which both the untimed behavior and the timing constraints are important, studying each one separately might prove an incomplete method, since we might miss the interconnections which limit the behaviors. In our setting, this amounts to expressions which, when decomposed into timing and untimed parts and studied separately, give nonempty semantics, while it is clear that their semantics is empty. For example, the following signal regular expression clearly has an empty semantics:

expressions clearly has an empty semantics:

habi hai habi hai habi

C B C B

hb a i hb i ha i ha i hbi

C B C B

C B C

B (9.1)

ha i hbi hbi hai habi

A A

hb a i hb i hb a i hb i hb a i

The reason why the concatenation of the two regsignals given in 9.1 has an empty semantics lies in the fact that the first regsignal requires an a-state of one length while the second imposes an a-state of a different length. Figure 9.1 below gives a graphical interpretation of this empty concatenation.

Fig. 9.1. A concatenation of two regsignals that has an empty semantics.

On the contrary, both the untiming and the timing of the signal regular expression in 9.1 have nonempty semantics:

B C B C B C

B C B C B C

B C B C B C

A A A

respectively

a ab ab a ab a a ab

B C B C B C

a a b b a b a b

B C B C B C

B C B C B C

a ab a b b a b

A A A

The correct handling of such expressions requires working with both the untimed structure and the timing structure together. But we only know how to handle each one on its own.

The solution is the following: first decompose each n-regsignal into the untimed and the timing part, then build the n-word representation of the timing part, and finally recombine the n-regword of the untimed part with the n-regword over a one-letter alphabet coming from the n-word representation of the timing part.
This recombination is simply the shuffle of the two n-regwords. The simple but essential property of shuffle that we will take advantage of is the fact that, for any two sets L1 and L2, their shuffle is nonempty iff both L1 and L2 are nonempty. Then what remains to be shown is that the union/concatenation/star constructions correctly “combine” with this shuffle operation.
We show here that this idea works fine, in spite of the noncompositionality of projection on our shuffled items. The reason noncompositionality is no longer harmful this time is that we are able to provide a “weak compositionality” result, saying that the shuffle representation has a nonempty semantics iff the semantics of the initial n-signal regular expression is nonempty.

developed here for checking timed automata for emptiness. n 9.1 Decomposition and recomposition of  -signal regular expressions

Throughout this section we will extend several operations from (-dimensional) words/signals to n n-words/ -signals: n

9.1 Decomposition and recomposition of -signal regular expressions 199

z z

1. is the extension of the canonical projection (defined in Chapter 2,

n n w WD i j n n

page 27) to -words: for each -word and ,

z z

w w

ij

ij

z n

Observe that we first need to extend to antiwords and only after that to -words.

U Sig SF n

2. U is the extension of the untiming morphism to arbitrary -signals: for

Sig i j n

each n-signal and ,

n

U U

ij

ij n

Observe again that we first extend U to antisignals and only after that to -signals.

Sig R n

3. is the extension of the length morphism to arbitrary -signals: for each

Sig i j n

n-signal and ,

n

ij ij

Similarly to above we first need to extend to antisignals.

w

The definition of the shuffle operation on words is the following: given two words w1 and w2, the shuffle of w1 and w2 is the language obtained as follows: for each k ∈ ℕ, we decompose w1 into k subwords, and w2 into k subwords too, and then recombine these pieces into a single word by interleaving the subwords of w1 with the subwords of w2.

More formally,

w w w j w w w w k

such that

k

w w w w w w w w w w w

k k

and

k k

We will take advantage of the fact that we utilize disjoint sets of symbols (the set of symbols which represent states within the signals, and the singleton set which is used in n-word representations) and redefine shuffle with the aid of monoid morphisms as follows:
Let us consider two disjoint sets of symbols Σ1 and Σ2. We will define the shuffle of a word w1 over Σ1 and a word w2 over Σ2 as the set of words over Σ1 ∪ Σ2 with the property that, if we delete the symbols from Σ2, the result is w1, and if we delete the symbols from Σ1, we get w2.

The formal definition of “deletion of symbols” is the following: denote first and the

applications

a

a for all

a

for all a

a

a for all

a

a

for all

Then the “deletion of symbols” are the induced morphisms and resp.

. In the sequel we will utilize the notations and for the induced morphisms too.


Definition 9.1.1. Given two words w1 over Σ1 and w2 over Σ2, the shuffle of w1 and w2 is the following set:

{ w over Σ1 ∪ Σ2 : deleting the symbols of Σ2 from w gives w1, and deleting the symbols of Σ1 from w gives w2 }     (9.2)
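Under the disjoint-alphabet assumption, this characterization yields a direct membership test; the following sketch (with hypothetical names) simply implements the two deletions.

```python
def erase(word: str, alphabet: set) -> str:
    """Delete from `word` every symbol belonging to `alphabet`."""
    return "".join(ch for ch in word if ch not in alphabet)

def in_shuffle(w: str, w1: str, sigma1: set, w2: str, sigma2: set) -> bool:
    """Membership test for the shuffle of w1 (over sigma1) and w2 (over sigma2),
    assuming sigma1 and sigma2 are disjoint: erasing the sigma2-symbols from w
    must give w1, and erasing the sigma1-symbols must give w2."""
    return (set(w) <= sigma1 | sigma2
            and erase(w, sigma2) == w1
            and erase(w, sigma1) == w2)

# Example: "a1b2" is in the shuffle of "ab" (over {'a','b'}) and "12" (over {'1','2'}).
assert in_shuffle("a1b2", "ab", {"a", "b"}, "12", {"1", "2"})
```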

The generalization to n-words gives the following:

n w w WD w

Definition 9.1.2. Suppose we are given two -words n . The shuffle of with

w w n i j n i j

w , denoted , is the set of -words for which, for each , the -component

w w

belongs to the shuffle of ij with :

ij

w w w WD j i j n w w w ij

n for all (9.3)

ij ij

Note however that “random shuffling” of components does not, in general, give an n-word in the shuffle, because some results might not satisfy the triangle identity 6.1.
Moreover, we may define a class of n-word regular expressions with shuffle, generated by

the following grammar:



E R j E E j E E j E E j E

The following proposition shows, in essence, that shuffle is expressible by the other operations, that is, its use does not increase the expressive power of n-word regular expressions:
Proposition 9.1.3. The class of n-word languages accepted by n-automata is closed under shuffle.

Proof. The construction is a generalization of a well-known construction for the shuffle of two regular languages. It can be described as an asynchronous composition of two automata: at each moment, the automaton for the shuffled language has the possibility to choose between a transition

in the first automaton and a transition in the second automaton.

n A Q Q Q B Q Q Q n

Formally, for any two -automata and , the

n

A LB

automaton accepting L is the following:

C Q Q theta Q Q Q Q n

where

n

a a a a

r q q r q j q r q q q r j q

C LA LB C

The proof that L is based on the argument that all accepting runs of can

B ut be obtained by shuffling accepting runs of A with accepting runs of .
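The proof works with n-automata and families of accepting sets; as a simplified illustration of the same asynchronous-composition idea, here is a sketch of the classical construction on ordinary nondeterministic automata over disjoint alphabets (the encoding of product states as strings is an arbitrary choice of the sketch).

```python
from itertools import product
from typing import Dict, Set, Tuple

NFA = Tuple[Set[str], Dict[Tuple[str, str], Set[str]], Set[str], Set[str]]
# (states, transitions: (state, symbol) -> successor states, initial states, accepting states)

def shuffle_automaton(a: NFA, b: NFA) -> NFA:
    """Asynchronous product accepting the shuffle of L(a) and L(b),
    assuming the two automata use disjoint alphabets."""
    (qa, da, ia, fa), (qb, db, ib, fb) = a, b
    states = {f"{p}|{q}" for p, q in product(qa, qb)}
    init = {f"{p}|{q}" for p, q in product(ia, ib)}
    final = {f"{p}|{q}" for p, q in product(fa, fb)}
    trans: Dict[Tuple[str, str], Set[str]] = {}
    for (p, sym), targets in da.items():            # a-moves: the left component advances
        for q in qb:
            trans.setdefault((f"{p}|{q}", sym), set()).update(f"{t}|{q}" for t in targets)
    for (q, sym), targets in db.items():            # b-moves: the right component advances
        for p in qa:
            trans.setdefault((f"{p}|{q}", sym), set()).update(f"{p}|{t}" for t in targets)
    return states, trans, init, final
```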

9.2 Shuffled n-words

n M

Definition 9.2.1. An n-dimensional shuffled word,orshuffled -word, is a tuple consist-

n WD fg n M n ing of an -word n and an -relation .


n SW

The set of shuffled -words with symbols in is denoted n . n

The semantics of shuffled n-words is based upon the following observation: if we have a -word

n Sig fag

w and a -signal over a one-letter alphabet , then we may combine them and build

n w (uncountably many) n-signals whose untiming is stuttering equivalent to and whose timing is

exactly .

The semantics of a shuffled n-word is the following set:

z

M Sig j U M

g

and f (9.4)

n

M

Proposition 9.2.2. Each shuffled n-word has a nonempty semantics.

n SW

Similarly to -word representations, we may define an equivalence relation on n as

follows:

M M M M

if and only if

S SW

Then call a set n as convex if it is saturated by this equivalence relation, that is,

M S M M M S

whenever and then .

9.2.1 Projection on shuffled words

M Y n M

Definition 9.2.3. Given a shuffled n-word and a set , the projection of

card Y

onto Y is the shuffled -word

M

Y Y

As for n-regsignals, projection poses problems: it is not compositional w.r.t. the semantics. As

an example, consider the following shuffled -word:

aa aabb

A AA

a a bb

(9.5)

b b a a b b

whose semantics consists exactly of the singleton -signal language

a a b

A

a b

b a b

f g On the contrary, the projection of the shuffled -word defined in 9.5 onto the set ,gives

the following shuffled -word with its nonsingleton semantics

aabb a b j

We have used here the convention that any word w can be regarded as the -word

w

.

However, we have the following “weakly compositional” characterization of the emptiness problem for the semantics of shuffled n-words:


M Y n

Proposition 9.2.4. 1. For each shuffled n-word and ,

M M

(9.6)

Y Y Y

n M M SW

n

2. For each two equivalent shuffled -words , and subset of

Y n M M n

indices ,if then the projections of both shuffled -words on

Y are equivalent.

M M

Proof. 1) Straightforward, since for each , by easy verification.

Y Y Y

z z

M M M

g

2) Let us observe that, if then and f

M

g

f . This implies that

z z

and

Y Y

M M

fg fg

Y Y

By the compositionality of projection on word representation, the last line is equivalent to

M M

fg fg

Y Y Y Y

M M ut

But this means that .

Y Y Y Y

Remark 9.2.5. The inclusion 9.6 is the same as property 6.26 on page 100 for n-regsignals. Hence we may never get “false negative” answers to the emptiness problem by working with projection at the syntactic level.
The question is then whether we may get “false positive” answers. The answer is negative, due to the fact that we always work with shuffled words, that is, items satisfying the triangle identity and whose timing denotes some nonempty region. It will be the task of the normal form algorithm for DBMs, respectively of the emptiness algorithm for n-automata, to “purge” the possible items that have empty semantics.
The difference with n-regsignals is that, for those, we had no algorithm for checking whether an n-regsignal has an empty semantics or not. For shuffled n-words we have n-automata.
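For reference, here is a minimal sketch of the kind of normal-form computation alluded to, under the simplifying assumption that DBM entries are closed intervals only (the thesis' DBMs also distinguish open and closed bounds, which this sketch ignores): entries are repeatedly tightened with the triangle inclusion, and an empty entry signals an empty semantics.

```python
from itertools import product
from typing import List, Optional, Tuple

Interval = Optional[Tuple[float, float]]   # closed interval [lo, hi], or None when empty

def inter(a: Interval, b: Interval) -> Interval:
    if a is None or b is None:
        return None
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None

def add(a: Interval, b: Interval) -> Interval:
    if a is None or b is None:
        return None
    return (a[0] + b[0], a[1] + b[1])

def normalize(D: List[List[Interval]]) -> Optional[List[List[Interval]]]:
    """Tighten every entry with the triangle inclusion D[i][k] <= D[i][j] + D[j][k],
    Floyd-Warshall style; return None if some entry becomes empty."""
    n = len(D)
    for j, i, k in product(range(n), range(n), range(n)):
        D[i][k] = inter(D[i][k], add(D[i][j], D[j][k]))
        if D[i][k] is None:
            return None
    return D
```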

9.2.2 Juxtaposition on shuffled words

The juxtaposition operation on shuffled words can be defined similarly to word representations: one juxtaposes both the word parts and the relational parts in each operand, and the result must be

a set, due to juxtaposition on relation matrices:

m M n M

Definition 9.2.6. Given a shuffled -word , a shuffled -word and an integer

p minm n p M M

, the -juxtaposition of with is the following set of shuffled

m n p

-words

M  M  M j M M  M

p p p 9.2 Shuffled n-words 203

Recall that, even for word representations of regions, juxtaposition is compositional only when defined on convex sets, that is, on sets which are closed under region equivalence.

m M SW M SW

m n

Proposition 9.2.7. For each -word representations ,

minm n

and integer p ,

M  M  M j M M

p p

M M M M  M

and p (9.7)

m n p

Proof. The proof of the inverse inclusion is straightforward: given any -signal

M M M M  M  M j M M

p

p and

M M

we have, by the inclusion 9.6 from Proposition 9.2.4, that and

m

M M M  M

p

. But this implies that by

mpmnp 

definition of p on signals.

M  M

p

The left-to-right inclusion follows this way: given , we will try con-

m n p M M

struct a shuffled -word, denote it , whose semantics contains . is

m M n

a juxtaposition of a shuffled -word equivalent to with a shuffled -word equivalent

M m n p

to . This shuffled -word arises as a shuffle of the untiming of with any

m n p m n p

-word representation of the -region which contains .

m n p w U U

Formally, we pick a -word , which is possible since is nonempty for

R Regn m n p

any n-signal. Then, for the unique region which contains we pick an -

n

M M w m n p word representation , hence . We then shuffle and and pick some -

word in this shuffle. Consequently,

z z

M U

fg

m m m m

M

fg

M

g

since f is the unique region which contains . It then follows that

m m m

M M M M

and

m m mpmnp mpmnp

M M ut

The equality follows similarly.

mpmnp mpmnp

U

Remark 9.2.8. Observe that in the above proof we have utilized the fact that these mappings commute with projection, a fact which can be easily established.

The rest of the good properties of juxtaposition hold as expected:

n S SW X

Proposition 9.2.9. 1. For each convex set of shuffled -words n and each

S card X n

, is also a convex set of shuffled -words. X

204 9. Applications

S SW S SW p

m n

2. For each two convex sets of shuffled words , , and integer

minm n S  S m n p

p

, is a convex set of -word representations and

S  S S  S

p p

(9.8)

S SW S SW S SW

m n p

3. For each three convex sets of shuffled words , , ,

minm n r minn p

and integers q , ,

S  S  S S  S  S

q r q r (9.9)

Proof. The second property is a corollary of Proposition 9.2.7.

ard X M S

The first property can be proved as follows: given a shuffled c -word and

z

z

n M M M

another shuffled -word with , we get that and

X X X

M M

g fg

f .

X X

w w n w

Let us denote and . We may then build a -word such that

z z

w U w U w w

and , as follows:

X

l m

l m

 k  k

i j X w a w a a a

ij

il j

For each ,if and l then put

X X

k k

maxl m maxl m

 

k k

w a a

ij k

1

i w w

ji

Let denote one of the maximal indices in any ordering compatible , that is,



j n j

for any . Let also denote one of the minimal indices in any ordering compatible

w w i n

j i

, that is, for any .



i X w w w i X

j i j i

Then for all for which put , while for all for which

 

i i



w w w w

ii ii

put . Finally complete by the triangle identity.

 

i i



z z

z z

w w w w

It is easy to observe then that while .

X

n M M

g

Further, let us build an -word representation which is equivalent with f

M

g

and extends f . This part of the proof has already been done in Proposition 8.3.15.

n M w

We may then conclude that any shuffled -word in which has the property

M M S M S

that . By convexity of we get that , which assures, on its turn,

M S ut

that .

X

n p q r N r minp q

Proposition 9.2.10. 1. For each five integers m with , shuffled words

M SW M SW X m m r m X

m n

, , given with

ard X p Y n r Y card Y q and c , and given also with and , we have

that

M  M M  M

r

(9.10)

X Y X Y mr

 Such maximal indices are not unique.


M SW M

m

2. Juxtaposition is associative on shuffled words: for each ,

SW M SW k minm n l minn p

p

n and , and for each and ,

M  M M  M  M M 

l k l k (9.11)

Proof. Both properties are straightforward corollaries of the respective properties concerning jux-

n ut taposition and projection on n-words and -word representations.

Note that this proposition refers only to the syntactic properties relating juxtaposition and pro- jection, not to their semantic properties. Therefore it hides no contradiction with the noncomposi- tionality of projection.

9.2.3 Concatenation and star on shuffled words
This is defined as usual, by means of projection and juxtaposition: for each two shuffled n-words

M M SW

n

,

M M M j M M M

Composition enjoys the well-known properties at the syntactic level, namely associativity, existence of two units for each shuffled n-word and existence of pseudoinverses, but, due to the use of projection, it is not compositional. It can also be extended to sets of shuffled n-words in the usual way, and gives rise to the star operation

 k k k

S SW S S S S S S

n n

for each where and

n

Here is the set of all units:

f M j i X M g

The following properties are essential in our “shuffle” approach to checking n-signal regular expressions for emptiness:

n S S SW

n

Proposition 9.2.11. 1. For each pair of sets of convex shuffled -words ,

S S S S

if and only if

n S SW n

2. For each set of convex shuffled -words ,





S S if and only if

Proof. For both properties, the right-to-left implication follows by means of Proposition 9.2.4 and of compositionality of juxtaposition. Note that convexity is essential in this implication. Moreover, the second property follows by induction from the first.


S S

For the proof of the left-to-right implication for the first property, suppose . This

M S M M

implies in fact that there exist and such that is defined and

M M  M  M

n n

is nonempty. But this implies also that is defined and is nonempty,

M M  M  M

n n

hence for any we have that is a nonempty set. The proof ends if we

 M M M

n

pick up any and observe that and ,

n nn

which means that

M M

n nn

M M ut

or, in other words, that is nonempty. n

Shuffle regwords are naturally associated with n-automata that generalize n-region automata: these are tuples in which a labeling function labels states with n-relations. Proposition 9.1.3 implies that we may construct n-automata for shuffled n-regwords by shuffling n-automata for the untiming with n-region automata.

n R RW n

Proposition 9.2.12. For each -regword n and convex set of -word representations

W WdRep n R W

n , the shuffled -word semantics of the shuffle regword is a convex set of

shuffled n-words.

ut

Proof. Easy corollary of the convexity of ⟦W⟧. ⊓⊔
9.2.4 A method for checking whether the semantics of an n-signal regular expression is empty

1. Given an n-signal regular expression E, we decompose each n-regsignal occurring in E into the untimed part and the timing part.
2. We then interpret the untimed n-regsignal as an n-regword (this implies that stuttering is added) and associate an n-automaton to it.
3. On the other hand, we interpret the timing n-regsignal as an n-EDBM and hence associate an n-region automaton to it.
4. Subsequently, we produce the shuffle of the two automata.
5. Then we apply the union/concatenation/star constructions to the resulting automata (provided the non-elasticity assumption holds) until we associate an n-automaton to the whole expression.
6. Finally, we check whether the semantics of this final n-automaton is void (a sketch of the overall pipeline is given after this list).
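As a compact summary of these six steps, the following sketch wires them together; every helper named here (decomposition, automata constructions, shuffle, combination, emptiness test) is assumed to be provided by the constructions of Chapters 7-9 and is only named for illustration.

```python
def expression_is_empty(expr, ops) -> bool:
    """Sketch of the six-step check. `ops` is assumed to bundle the constructions
    developed in Chapters 7-9; none of them is defined in this sketch."""
    automata = []
    for regsignal in ops.atomic_regsignals(expr):                        # step 1: decompose
        untimed = ops.regword_automaton(ops.untiming(regsignal))         # step 2: untimed part -> n-automaton
        timed = ops.region_automaton(ops.edbm_of(ops.timing(regsignal))) # step 3: timing part -> n-region automaton
        automata.append(ops.shuffle_product(untimed, timed))             # step 4: shuffle the two automata
    final = ops.combine(expr, automata)     # step 5: union/concatenation/star along the expression
    return ops.language_is_empty(final)     # step 6: emptiness check on the final n-automaton
```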

In order for this algorithm to be correct, we need to redefine the non-elasticity property for - n

signals, and to prove that this definition is consistent with the representations of sets of -signals.

n

Definition 9.2.13. A -signal is called non-elastic if the following property holds:

i j n

i jnj inj jni

For each ,if in and then both and are not

Sig Sig

j jni antisignals, that is in and . n

9.3 Checking emptiness of timed automata with -signal regular expressions 207

n Sig U

Proposition 9.2.14. For each -signal , is non-elastic if and only if is

n

non-elastic, if and only if is non-elastic.

U ij

Proof. Straightforward, due to the fact that ij if and only if if and only if

ut

ij .

n R RW fg n M R M n

Proposition 9.2.15. Given a -regword and a -relation ,

n kR k n

contains only non-elastic -signals if and only if contains only non-elastic -words

M

g

and f is a non-elastic region. t

Proof. Corollary of the above Proposition and of the Proposition 8.4.10. u n

Theorem 9.2.16. The above procedure terminates with some -automaton with a nonempty lan- n guage if and only if the semantics of the initial expression is nonempty, provided the -signal

regular expression this initial expression E satisfies the following non-elasticity assumption:

L n E

Denote E the union of the semantics of all -regsignals from which is built of. Then

k

n k N

L consists of non-elastic -signals for any .

E n

Proof. Corollary of Proposition 9.2.11, which is applied by structural induction on the -signal n

regular expression involved. The essential property in the proof is that for each -regsignal, the

n ut

shuffle -regword associated as above is convex. n 9.3 Checking emptiness of timed automata with  -signal regular expressions

Recall that in Chapter 6 we have proved that (languages of) timed automata are embeddable into n-signal regular expressions. Our search for a property that assures decidability was then guided in part by some observations on the n-signals that occur during this embedding. We will show here that, indeed, timed automata can be simulated by n-signal regular expressions which have the non-elasticity property.

Recall that, in Theorem 6.5.2, for each timed automaton A with set of clocks X = {x_1, ..., x_n} in which each transition resets at least one clock, and for each transition q → r in this automaton, carrying a clock constraint C, an action label a and a reset set X' ⊆ X, we have associated a 2n-regsignal R(C, X'). More specifically, for C = ⋀_{i ≤ n} x_i ∈ I_i ∧ ⋀_{i,j ≤ n} x_i − x_j ∈ J_{ij}, the 2n-regsignal R(C, X') is defined entrywise by a case analysis on the indices: the entry R(C, X')(i, j) is built from the interval J_{kl} or I_k associated with the pair of clocks determined by i and j, the action label a being attached to the entries that involve the clocks reset by the transition, and the cases being distinguished according to whether i and j lie in {1, ..., n} or in {n+1, ..., 2n} and according to whether the corresponding clocks belong to X'.

Denote L the union of the semantics of all 2n-regsignals R(C, X'), for all clock constraints C and all subsets X' ⊆ X, that is, L = ⋃ { ‖R(C, X')‖ | C a clock constraint over X, X' ⊆ X }.

Proposition 9.3.1. L satisfies the hypothesis in Theorem 7.4.3, that is, L^k consists of non-elastic 2n-signals for all k ∈ ℕ.

Proof. Observe that the sets R(C, X') bear the following important property:

(*) For each C, each X' ⊆ X, each σ ∈ ‖R(C, X')‖ and each i ≤ n, if σ(i, n+i) ∈ Sig then, for all j ≤ n, σ(j, n+i) is not an antisignal, that is, σ(j, n+i) ∈ Sig ∪ {ε}.

We will then actually show that the set of all 2n-signals with property (*) satisfies the hypothesis in Theorem 7.4.3: take 2n-signals σ_1, ..., σ_k, all satisfying property (*), and suppose that σ_1 ⋯ σ_k is an elastic 2n-signal. That is, there exist i, j ≤ n such that

1. (σ_1 ⋯ σ_k)(i, n+i) ∈ Sig and (σ_1 ⋯ σ_k)(j, n+j) ∈ Sig, and
2. (σ_1 ⋯ σ_k)(i, n+j) is an antisignal, i.e. (σ_1 ⋯ σ_k)(i, n+j) ∉ Sig ∪ {ε}.

Let us show that some component σ_l(j, n+i) is an antisignal. We prove this by contradiction: since for all h ≤ k, σ_h is a non-elastic 2n-signal, we must then have σ_h(i, n+i) ∈ Sig and, similarly, σ_h(j, n+j) ∈ Sig. But then (σ_1 ⋯ σ_k)(j, n+i) decomposes into a concatenation built from the components σ_l(i, n+i), σ_l(j, n+i) and σ_l(j, n+j) of the σ_l, hence (σ_1 ⋯ σ_k)(j, n+i) is a signal, fact which contradicts condition 2 above.

On the other hand, condition 1 above implies that there exists some l ≤ k such that σ_l(i, n+i) ∈ Sig. But by property (*), we must then have σ_l(j, n+i) ∈ Sig ∪ {ε}, which we have already seen to be in contradiction with condition 2. ⊓⊔

Remark 9.3.2. Observe that using non-elasticity instead of property (*) does not suffice to prove the above result. Observe also that Proposition 9.3.1 assures that each concatenation produces non-elastic languages.

10. Conclusions

We have presented an approach to checking timed automata for language emptiness, based on regular expressions. The regular expressions we use arise as a generalization of the timed regular expressions of [ACM97], with the use of colored parentheses. Our method associates to each atomic regular expression a class of n-automata with accepting sets, and then applies union/concatenation/star constructions to build an automaton representing the whole regular expression. The essential steps behind this method are the following:

- The possibility to represent timing constraints over the continuous time domain with the aid of n-automata. This possibility is based upon the region decomposition of each timing constraint, and on the representation of each region as a pair consisting of an n-word over a one-letter alphabet and a matrix of relational symbols.
- The star-closure theorem for n-automata with the property that all the powers of the accepted language are composed only of non-elastic n-words.
- The decomposition of each regular expression into an untimed part and a timing part, a decomposition which allows representing the timing part with n-word representations, and then the recomposition of the untimed part with the n-word representation of the timing part, by means of the shuffle operation. This recomposition replaces a (perhaps uneasy) synchronous application of the union/concatenation/star constructions to both the untimed and the timing part, which would be needed when the interactions between untiming and timing are more involved and may lead to emptiness, that is, to unfeasible specifications.

We hope that our study gives new insights into a better understanding of the theory of timed systems. The difficulty of emptiness checking for timed automata keeps the performance of modeling systems with timed automata and model-checking their properties far from the performance reached for untimed modeling. Therefore, any alternative insight might help in identifying subclasses with nice properties. Our theory of n-signal regular expressions and n-automata is such an alternative insight.

For the comparison of our approach to the emptiness problem for timed automata with the classical approaches [Yov98, LPWY95] based on the region construction of [AD94], we may observe the following points:

- The essential property that gives a terminating algorithm in the classical approach is the possibility to collapse the infinite region space into a finite one, whose cardinality is that of the set of regions included in some hypercube. The length of the edge of this hypercube is k, with k computable from the largest constants used in the clock constraints of the given timed automaton. In the reference paper on timed automata [AD94] this constant is exactly the largest constant in any clock constraint, but this is due to the fact that clock constraints there do not use diagonal atomic constraints x_i − x_j ∈ I. In general, this constant needs to be computed by "propagating" also the diagonal constraints [Tri98]. (The region equivalence in question is recalled after this list.)
- On the other hand, in our approach, termination is assured by the non-elasticity property of the n-signals which give the n-signal semantics of a timed automaton, a property which allows us to iteratively build a finite representation of the reachability relation defined by the timed automaton. This still means that we get a finite decomposition of the set of clock regions, but this decomposition is "finer" (that is, it gives in general more equivalence classes) than Alur's decomposition. As a consequence, our algorithm might sometimes require more memory than the classical approach. On the contrary, our approach might sometimes give faster results, due to the possibility of obtaining the behavior of a loop in the timed automaton in a single application of the star closure algorithm, something which is not available in the classical approach, since there one needs to iterate through the loop until the fixpoint is reached.
- The complexity of our algorithm is nonelementary, in contrast with the PSPACE complexity of the classical approach. This follows from the fact that each star produces an exponential explosion of the state space. However, this result concerns only the worst-case complexity, and it is highly dependent on the number of nested stars in the regular expression that one may associate to a timed automaton.
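For reference only, the region equivalence of [AD94] mentioned in the first point above can be recalled in its standard formulation; it is quoted here from the classical construction, not from the present thesis, with c_x denoting the maximal constant against which clock x is compared (after the propagation of diagonal constraints discussed above):

v \approx v' \;\text{ iff }\;
\begin{cases}
\forall x:\ \lfloor v(x)\rfloor = \lfloor v'(x)\rfloor \ \text{ or }\ \bigl(v(x) > c_x \text{ and } v'(x) > c_x\bigr),\\
\forall x,y \text{ with } v(x)\le c_x,\ v(y)\le c_y:\ \operatorname{frac}(v(x))\le \operatorname{frac}(v(y)) \iff \operatorname{frac}(v'(x))\le \operatorname{frac}(v'(y)),\\
\forall x \text{ with } v(x)\le c_x:\ \operatorname{frac}(v(x)) = 0 \iff \operatorname{frac}(v'(x)) = 0,
\end{cases}

where frac(t) denotes the fractional part of t.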

Our contributions are partly theoretical, but of a certain interest for the domain of verification of timed systems. In this sense, our main contribution is the verification method for timed automata by translation to n-signal regular expressions. It can be argued that building a regular expression from a timed automaton is a difficult task, which is a legitimate observation. But we expect to further study the possibility of directly translating the timed regular expressions of [ACM97] into our regular expressions, hence making our technique available without passing through timed automata. This gives one of the directions for future research: the introduction of parallel composition into regular expressions with colored parentheses. Specifically, we would like to have some "distributivity" laws of parallel composition over atomic blocks. Such laws would allow the transformation of parallel compositions of timed regular expressions of [ACM97] into regular expressions with colored parentheses. This would allow emptiness checking for the timed regular expressions of [ACM97] without passing through timed automata.

Another promising direction is given by the applicability of n-automata to representing timing constraints. We have found that an argument in favor of n-automata is that they sometimes give a more compact representation of timing constraints. Unfortunately, n-automata are nondeterministic, hence the problem of finding small representations of timing constraints is somewhat related to the problem of finding small nondeterministic finite automata. In particular, the union construction of several n-automata, which in theory is done by simply joining the state spaces, has to be accompanied by a technique which identifies some states, in order to reduce the state space.
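As a generic illustration of the "joining of state spaces" mentioned above, and only of that (the labeling of states and the subsequent state identification are deliberately left out), the union of two nondeterministic automata by disjoint join can be sketched as follows; all names are illustrative.

def union_automaton(A, B):
    # A and B use the same dict representation as before; tagging the
    # states with 0 or 1 keeps the two state spaces disjoint.
    def tag(aut, t):
        return {
            'states': {(t, s) for s in aut['states']},
            'init':   {(t, s) for s in aut['init']},
            'final':  {(t, s) for s in aut['final']},
            'delta':  {((t, s), a): {(t, q) for q in succs}
                       for (s, a), succs in aut['delta'].items()},
        }
    A1, B1 = tag(A, 0), tag(B, 1)
    return {'states': A1['states'] | B1['states'],
            'init':   A1['init']   | B1['init'],
            'final':  A1['final']  | B1['final'],
            'delta':  {**A1['delta'], **B1['delta']}}

In practice, as noted above, such a plain join would have to be followed by a state-identification step in order to keep the representation small.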

We would also like to transform our closure algorithms for n-automata (that is, the union, concatenation and star closure algorithms) into symbolic algorithms. One might be tempted to say that this would bring us back to the existing symbolic algorithms for reachability in timed automata. But this is not the case, because we would not encode disjunctions of constraints. Symbolic algorithms would allow one to represent states in n-automata symbolically, and we have already pointed out that there is no connection between states in an n-automaton and clock regions.

Another direction of further study is the possibility to combine our emptiness checking technique with partial order reduction methods. For timed automata, partial order reductions involve the possibility to split, at certain moments, the clock set into subsets of dependent clocks, such that two clocks belonging to different subsets are not related by any constraint during the respective moments. For our setting, this would mean splitting an n-regsignal into smaller regsignals and doing concatenation and star on these regsignals. Some care needs to be taken in order to define a "parallel" composition of these smaller regsignals such that one does not obtain elastic regsignals.
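A toy illustration of the clock grouping involved in such a splitting, assuming only that we are given the pairs of clocks related by some constraint at the moment under consideration (a plain union-find, not the thesis's construction):

def clock_classes(clocks, related_pairs):
    # Group clocks into dependency classes: two clocks end up in the same
    # class exactly when they are connected through related pairs.
    parent = {x: x for x in clocks}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x
    for x, y in related_pairs:
        parent[find(x)] = find(y)
    classes = {}
    for x in clocks:
        classes.setdefault(find(x), set()).add(x)
    return list(classes.values())

# e.g. clock_classes({'x', 'y', 'z'}, [('x', 'y')]) returns the classes
# {'x', 'y'} and {'z'} (in some order).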

And a final mention of the idea of finding more general classes of timed systems that may be modeled by n-automata. We mainly think of hybrid automata with stopwatches [Hen96]. It has already been observed that preemptive scheduling can be modeled by such hybrid automata [AM]. The essence is that preemptive scheduling is intimately related to the shuffle operation, hence it remains to be studied how to model this with n-automata and what the resulting class of timed languages would be.

References

[ABK+97] E. Asarin, M. Bozga, A. Kerbrat, O. Maler, A. Pnueli, and A. Rasse. Data-structures for the verification of timed automata. In Proceedings of HART'97, volume 1201 of LNCS, pages 346–360, 1997.
[ACM97] E. Asarin, P. Caspi, and O. Maler. A Kleene theorem for timed automata. In Proceedings of LICS'97, pages 160–171, 1997.
[ACM01] E. Asarin, P. Caspi, and O. Maler. Timed regular expressions. Submitted, 2001.
[AD94] R. Alur and D. Dill. A theory of timed automata. Theoretical Computer Science, 126:183–235, 1994.
[AFH94] R. Alur, L. Fix, and T. A. Henzinger. A determinizable class of timed automata. In Proceedings of CAV'94, volume 818 of LNCS, pages 1–13, 1994.
[AFH96] R. Alur, T. Feder, and T. A. Henzinger. The benefits of relaxing punctuality. Journal of the ACM, 43:116–146, 1996.
[AFH99] R. Alur, L. Fix, and T. A. Henzinger. Event-clock automata: a determinizable class of timed automata. Theoretical Computer Science, 211:253–273, 1999.
[AH92] R. Alur and T. A. Henzinger. Back to the future: Towards a theory of timed regular languages. In Proceedings of FOCS'92, pages 177–186, 1992.
[AM] Y. Abdeddaïm and O. Maler. Preemptive job-shop scheduling using stopwatch automata. Submitted to TACAS 2002.
[AMP98] E. Asarin, O. Maler, and A. Pnueli. On discretization of delays in timed automata and digital circuits. In Proceedings of CONCUR'98, volume 1466 of LNCS, pages 470–484, 1998.
[Asa98] E. Asarin. Equations on timed languages. In Proceedings of HSCC'98, volume 1386 of LNCS, pages 1–12, 1998.
[BB91] J. C. M. Baeten and J. A. Bergstra. Real time process algebra. Formal Aspects of Computing, 3:142–188, 1991.
[BC96] A. Boudet and H. Comon. Diophantine equations, Presburger arithmetic and finite automata. In Proceedings of CAAP'96, volume 1059 of LNCS, pages 30–43. Springer Verlag, 1996.
[BDFP00a] P. Bouyer, C. Dufourd, E. Fleury, and A. Petit. Are timed automata updatable? In Proceedings of CAV'2000, volume 1855 of LNCS, pages 464–479, 2000.
[BDFP00b] P. Bouyer, C. Dufourd, E. Fleury, and A. Petit. Expressiveness of updatable timed automata. In Proceedings of MFCS'2000, volume 1893 of LNCS, pages 232–242, 2000.
[BDGP98] B. Bérard, V. Diekert, P. Gastin, and A. Petit. Characterization of the expressive power of silent transitions in timed automata. Fundamenta Informaticae, 36:145–182, 1998.

[BDM+98] M. Bozga, C. Daws, O. Maler, A. Olivero, S. Tripakis, and S. Yovine. Kronos: a model-checking tool for real-time systems. In Proceedings of CAV'98, LNCS, pages 546–550, 1998.

[BE97] S. L. Bloom and Z. Ésik. Axiomatizing shuffle and concatenation in languages. Information and Computation, 139:62–91, 1997.
[Bel57] R. Bellman. Dynamic Programming. Princeton University Press, 1957.
[Bir79] G. Birkhoff. Lattice theory. American Mathematical Society, Providence, R.I., 1979.
[BJLWY98] J. Bengtsson, B. Jonsson, J. Lilius, and Wang Yi. Partial order reductions for timed systems. In Proceedings of CONCUR'98, volume 1466 of LNCS, pages 485–500, 1998.
[BMT99] M. Bozga, O. Maler, and S. Tripakis. Efficient verification of timed automata using dense and discrete time semantics. In Proceedings of CHARME'99, volume 1703 of LNCS, pages 125–141, 1999.
[Boi99] B. Boigelot. Symbolic Methods for Exploring Infinite State Spaces. PhD thesis, Faculté des Sciences Appliquées de l'Université de Liège, 1999. Volume 189 of the Collection des Publications de la Faculté des Sciences Appliquées de l'Université de Liège.
[BP99] P. Bouyer and A. Petit. Decomposition and composition of timed automata. In Proceedings of ICALP'99, volume 1644 of LNCS, pages 210–219, 1999.
[BP01] P. Bouyer and A. Petit. A Kleene/Büchi-like theorem for clock languages. Journal of Automata, Languages and Combinatorics, 2001. To appear.
[BPT01] P. Bouyer, A. Petit, and D. Thérien. An algebraic characterization of data and timed languages. In Proceedings of CONCUR'2001, volume 2154 of LNCS, pages 248–261, 2001.
[Bry86] R. Bryant. Graph-based algorithms for boolean function manipulation. IEEE Transactions on Computers, C-35:677–691, 1986.
[Büc60] J. Büchi. On a decision method in restricted second-order arithmetic. In Proceedings of the Int. Congress on Logic, Methodology and Philosophy of Science, 1960.
[CG00] Ch. Choffrut and M. Goldwurm. Timed automata with periodic clock constraints. Journal of Automata, Languages and Combinatorics, 5:371–404, 2000.
[CJ99] H. Comon and Y. Jurski. Timed automata and the theory of real numbers. In Proceedings of CONCUR'99, volume 1664 of LNCS, pages 242–257, 1999.
[Con71] J. H. Conway. Regular Algebra and Finite Machines. Chapman and Hall, 1971.
[Dim99a] C. Dima. Automata and regular expressions for real-time languages. In Proceedings of the 9th International Conference on Automata and Formal Languages (AFL'99), Vasszécseny, Hungary, 1999.
[Dim99b] C. Dima. Kleene theorems for event-clock automata. In Proceedings of FCT'99, volume 1684 of LNCS, pages 215–225, 1999.
[Dim99c] C. Dima. Relating signals and timed traces. Annals of the University of Bucharest, XLVIII:79–89, 1999.
[Dim00a] C. Dima. Real-time automata and the Kleene algebra of sets of real numbers. In Proceedings of STACS'2000, volume 1770 of LNCS, pages 279–289, 2000.

[Dim00b] C. Dima. Removing epsilon transitions from event-clock automata. In Proceedings of the National Conference on Theoretical Computer Science and Information Technology (CITTI'2000), pages 75–81, Constanța, Romania, May 25-27, 2000.
[Dim01a] C. Dima. Distributed real-time automata. In C. Martin-Vide and V. Mitrana, editors, Grammars and Automata for String Processing: from Mathematics and Computer Science to Biology, and Back. Gordon and Breach, London, 2001. To appear.
[Dim01b] C. Dima. Real-time automata. Journal of Automata, Languages and Combinatorics, 6:3–23, 2001.
[DMP91] R. Dechter, I. Meiri, and J. Pearl. Temporal constraint networks. Artificial Intelligence, 49:61–95, 1991.
[DP89] J. Dassow and G. Păun. Regulated rewriting in formal language theory, volume 18 of EATCS Monographs on Theoretical Computer Science. Springer Verlag, 1989.
[Eil74] S. Eilenberg. Automata, Languages, and Machines, volume A. Academic Press, 1974.
[EKR82] A. Ehrenfeucht, J. Karhumäki, and G. Rozenberg. The (generalized) Post Correspondence Problem with lists consisting of two words is decidable. Theoretical Computer Science, 21:119–144, 1982.
[Gau92] S. Gaubert. Théorie des systèmes linéaires dans les dioïdes. PhD thesis, École des Mines de Paris, 1992.
[Gau99] S. Gaubert. Introduction aux systèmes dynamiques à événements discrets. Polycopié de cours ENSTA–ENSMP–Orsay (DEA ATS), 1992 (revised 1999).
[GJ79] M. R. Garey and D. S. Johnson. Computers and intractability. W.H. Freeman & Co., 1979.

[GP97] S. Gaubert and Max Plus. Methods and applications of (max,+) linear algebra. In Proceedings of STACS'97, volume 1200 of LNCS, pages 261–282, 1997.
[Hen96] T. A. Henzinger. The theory of hybrid automata. In Proceedings of LICS'96, pages 278–292, 1996.
[Her99] P. Herrmann. Renaming is necessary in timed regular expressions. In Proceedings of FST&TCS'99, volume 1738 of LNCS, pages 47–59, 1999.
[HHH99] V. Halava, T. Harju, and M. Hirvensalo. Generalized PCP is decidable for marked morphisms. In Proceedings of FCT'99, volume 1684 of LNCS, pages 304–315, 1999.
[HJ96] Dang Van Hung and Wang Ji. On the design of hybrid control systems using automata models. In Proceedings of FST&TCS'96, volume 1180 of LNCS, pages 156–167, 1996.
[HNSY94] T. A. Henzinger, X. Nicollin, J. Sifakis, and S. Yovine. Symbolic model checking for real-time systems. Information and Computation, 111:193–244, 1994.
[HRS98] T. Henzinger, J.-F. Raskin, and P.-Y. Schobbens. The regular real-time languages. In Proceedings of ICALP'98, volume 1443 of LNCS, pages 580–591, 1998.
[HU92] J. E. Hopcroft and J. D. Ullman. Introduction to Automata Theory, Languages and Computation. Addison-Wesley/Narosa Publishing House, 1992.

[HZC97] M. R. Hansen and Zhou Chaochen. Duration calculus: Logical foundations. Formal Aspects of Computing, 9:283–330, 1997.
[Jur99] Y. Jurski. Expression de la relation binaire d'accessibilité pour les automates à compteurs plats et les automates temporisés. PhD thesis, École Normale Supérieure de Cachan, France, 1999.
[Kle56] S. C. Kleene. Representation of events in nerve nets and finite automata. In C. Shannon and J. McCarthy, editors, Automata Studies, pages 3–41. Princeton University Press, 1956.
[KMM97] M. Kudlek, S. Marcus, and A. Mateescu. Contextual grammars with distributed catenation and shuffle. In Proceedings of FCT'97, volume 1279 of LNCS, pages 269–280, 1997.
[Kop97] H. Kopetz. Real-Time Systems. Design Principles for Distributed Embedded Applications. Kluwer Academic Publishers, 1997.
[Koz94] D. Kozen. A completeness theorem for Kleene algebras and the algebra of regular events. Information and Computation, 110:366–390, 1994.
[LPWY95] K. G. Larsen, P. Pettersson, and Wang Yi. Model-checking for real-time systems. In Proceedings of FCT'95, volume 965 of LNCS, pages 62–88, 1995. Invited paper.
[LPY97] K. G. Larsen, P. Pettersson, and Wang Yi. Uppaal: Status & developments. In Proceedings of CAV'97, LNCS, pages 456–459, 1997.
[LWYP99] K. G. Larsen, C. Weise, Wang Yi, and J. Pearson. Clock difference diagrams. Nordic Journal of Computing, 6:271–298, 1999.
[Mac71] S. MacLane. Categories for the Working Mathematician. Springer Verlag, 1971.
[Mil80] R. Milner. Communication and Concurrency, volume 92 of LNCS. Springer Verlag, 1980.
[MLAH99] J. Møller, J. Lichtenberg, H. R. Andersen, and H. Hulgaard. Difference decision diagrams. Technical Report IT-TR-1999-023, Department of Information Technology, Technical University of Denmark, 1999.
[MP90] Z. Manna and A. Pnueli. A hierarchy of temporal properties. In Proceedings of PODC'90, pages 377–410, 1990.
[MP92] Z. Manna and A. Pnueli. The temporal logic of reactive and concurrent systems: Specification. Springer Verlag, 1992.
[MP95] Z. Manna and A. Pnueli. Temporal verification of reactive systems: Safety. Springer Verlag, 1995.
[Nic92] X. Nicollin. ATP: une algèbre pour la spécification et l'analyse des systèmes temps réel. PhD thesis, Institut National Polytechnique de Grenoble, 1992.
[NSY93] X. Nicollin, J. Sifakis, and S. Yovine. From ATP to timed graphs and hybrid systems. Acta Informatica, 30:181–202, 1993.
[Pos46] E. Post. A variant of a recursively unsolvable problem. Bull. AMS, 52:264–268, 1946.

[Pra90] V. R. Pratt. Dynamic algebras as a well-behaved fragment of relation algebras. In Algebraic Logic and Universal Algebra in Computer Science, volume 425 of LNCS, pages 77–110, 1990.
[Rab] A. Rabinovich. Automata over continuous time. Available at http://www.math.tau.ac.il/~rabinoa/aut-cont.ps.gz.
[Ras99] J.-F. Raskin. Logics, Automata and Classical Theories for Deciding Real-Time. PhD thesis, Facultés Universitaires Notre-Dame de la Paix, Namur, Belgique, 1999.
[RS97] J.-F. Raskin and P.-Y. Schobbens. State-clock logic: a decidable real-time logic. In Proceedings of HART'97, volume 1201 of LNCS, pages 31–47, 1997.
[Saf88] S. Safra. On the complexity of omega-automata. In Proceedings of FOCS'88, pages 319–327, 1988.
[Sal66] A. Salomaa. Two complete axiom systems for the algebra of regular events. Journal of the ACM, 13:158–169, 1966.
[Sor01] M. Sorea. Tempo: A model checker for event-recording automata. In Proceedings of RT-Tools'01, 2001.

[Sta97] L. Staiger. ω-languages. In G. Rozenberg and A. Salomaa, editors, Handbook of Formal Languages, volume 3, Beyond Words, pages 339–387. Springer Verlag, 1997.
[Tho90] W. Thomas. Automata on infinite objects. In J. van Leeuwen, editor, Handbook of Theoretical Computer Science, volume B, pages 133–191. Elsevier, 1990.
[Tho97] W. Thomas. Languages, automata, and logic. In G. Rozenberg and A. Salomaa, editors, Handbook of Formal Languages, volume 3, Beyond Words, pages 389–455. Springer Verlag, 1997.
[Tri98] S. Tripakis. L'analyse formelle des systèmes temporisés en pratique. PhD thesis, Université Joseph Fourier, Grenoble, 1998.
[vH89] P. van Hentenryck. Constraint Satisfaction in Logic Programming. MIT Press, 1989.
[VW94] M. Y. Vardi and P. Wolper. Reasoning about infinite computations. Information and Computation, 115:1–37, 1994.
[WB95] P. Wolper and B. Boigelot. An automata-theoretic approach to Presburger arithmetic constraints. In Proceedings of SAS'95, volume 983 of LNCS, pages 21–32, 1995.
[WB00] P. Wolper and B. Boigelot. On the construction of automata from linear arithmetic constraints. In Proceedings of TACAS'99, volume 1785 of LNCS, pages 1–19, 2000.
[Wil94] T. Wilke. Specifying timed state sequences in powerful decidable logics and timed automata. In Proceedings of FTRTFT'94, volume 863 of LNCS, pages 694–715, 1994.
[WY91] Wang Yi. CCS + Time = An interleaving model for real time systems. In Proceedings of ICALP'91, volume 510 of LNCS, pages 217–228, 1991.
[Yov98] S. Yovine. Model-checking timed automata. In Lectures on Embedded Systems, volume 1494 of LNCS, pages 114–152, 1998.
[ZCHR91] Zhou Chaochen, C. A. R. Hoare, and A. P. Ravn. A calculus of durations. Information Processing Letters, 40:269–276, 1991.
[ZCHS93] Zhou Chaochen, M. R. Hansen, and P. Sestoft. Decidability and undecidability results for duration calculus. In Proceedings of STACS'93, volume 665 of LNCS, pages 58–68, 1993.
[ZDPK01] Zhe Dang, P. San Pietro, and R. A. Kemmerer. Presburger liveness verification of discrete timed automata. In Proceedings of STACS'01, volume 2102 of LNCS, pages 132–143, 2001.




Title. An algebraic theory of real-time formal languages.

Abstract. A timed automaton is an automaton augmented with several clocks that measure the passage of time and may influence the state changes of the system. Timed automata were introduced as a formal model for real-time systems, in the hope that their role in the verification of such systems would be similar to the role of finite automata in the systematic search for errors in the design of untimed systems. In this thesis we are concerned with several theoretical questions related to timed automata and timed languages. In the first part of the thesis we investigate a simple subclass of timed automata with one clock which is reset at each transition. We show that for this subclass we can obtain simple analogs of the classical results of automata theory, namely the Kleene and Myhill-Nerode theorems and closure under complementation. The second and main part of the thesis is motivated by the timed regular expressions of Asarin, Caspi and Maler, where it was shown that, in order to match the expressive power of timed automata, one needs to introduce intersection into the expressions. We investigate an alternative to intersection by using colored parentheses for defining timing restrictions on overlapping parts of a sequence. This idea leads to an alternative representation of the languages of timed automata, based on a new class of languages called here regmino languages. We develop the theory of regular expressions over regminoes and prove that their emptiness problem is undecidable in general, and decidable for a large subclass of languages. From these results we develop new data structures and algorithms for solving emptiness and reachability problems for timed automata and regular expressions.

Keywords. timed automata, regular expressions, Kleene theorem, timing constraints, decidability, formal languages.

Host laboratory address. Vérimag, 2 av. de Vignate, 38610 Gières.