Axel Thue's papers on repetitions in words: a translation

J. Berstel
LITP, Institut Blaise Pascal, Université Pierre et Marie Curie, Paris, France

July

Contents

Preliminaries
    Notation
    Codes and encodings
    The Thue-Morse sequence
    Symbolic dynamical systems
Thue's First Paper: About infinite sequences of symbols
Thue's Second Paper: On the relative position of equal parts in certain sequences of symbols
    Introductory Remarks
    Sequences over two symbols
    Sequences over three symbols
        First Case: aca and bcb are missing
        Second Case: aba and aca are missing
        Third Case: aba and bab are missing
    Irreducible words over four letters
    Irreducible words over more than four letters
Notes
    Square-free morphisms
    Overlap-free words
    Avoidable patterns

Introduction

In a series of four papers, Axel Thue considered several combinatorial problems which arise in the study of sequences of symbols. Two of these papers deal with word problems for finitely presented semigroups; these papers contain the definition of what is now called a Thue system. He was able to solve the word problem in special cases. It was only later that the general case was shown to be unsolvable, independently by E. L. Post and A. A. Markov.

The other two papers deal with repetitions in finite and infinite words. Perhaps because these papers were published in a journal with restricted availability (this is G. A. Hedlund's guess), this work of Thue was largely ignored for a long time, and consequently some of his results have been rediscovered again and again. Axel Thue's papers on sequences are now more easily accessible, since they are included in the Selected Papers.

It is the purpose of the present text to give a translation of Axel Thue's papers on repetitions in sequences, both in more recent terminology and in relation to new results and directions of research.

There is a noticeable difference, both in style and in the amount of results, between the two papers. The first of these papers mainly contains the construction of an infinite square-free word over three letters. Thue also gives an infinite square-free word over four letters, obtained by what is now called an iterated morphism, whilst the three-letter word is constructed in a slightly more complicated way (a uniform tag-system in the terminology of Cobham).

The second paper attacks the more general problem of what Thue calls irreducible words. He devotes special attention to the case of two and three letters. In particular, he introduces what is now called the Thue-Morse sequence, and shows that all two-sided infinite overlap-free words are derived from this sequence. There are several aspects he did not consider: first, many combinatorial properties of the Thue-Morse sequence, such as the number of factors, the recurrence index, and so on, were only investigated by M. Morse or later; next, the characterization of all one-sided infinite overlap-free words, which is much more difficult than that of two-sided words, was only given later by Fife. However, Thue gives a complete description of circular overlap-free words.

Axel Thue's investigation of square-free words over three letters is even more detailed. He gives in this paper another construction of an infinite square-free word by iterated morphism, and then begins, in a lengthy development, an attempt to describe all square-free words over three letters. He observes that every infinite square-free word is an infinite product of words chosen in a set of six words, and classifies those infinite square-free words that are products of four among these six words. His classification, he observes, is similar both in statement and in proof technique to what is found in diophantine equations: the solutions are parametrized by some variables which are easier to manage.

This text is organized as follows. In the first chapter, we give some preliminary definitions and notation. We introduce the notions of square-free and overlap-free words, avoidable patterns, morphisms and codes. These are useful to present Thue's results in a somewhat more concise manner. As an example, we give some combinatorial properties of the Thue-Morse sequence. The two following chapters contain a translation of Thue's papers. We have tried to formulate Thue's results as faithfully as possible. For the proofs, some easy parts have been simplified and, more frequently, some difficult steps have been developed. In these chapters, footnotes only concern technical details. A longer chapter of notes contains more general remarks and developments, both about the contents of Thue's papers and about the current state of the art.

Chapter 1. Preliminaries

In this preliminary chapter, we first introduce some definitions and notation, and then present the so-called Thue-Morse sequence and some of its properties.

Notation

An alphabet is a finite set of symbols or letters. A word over some alphabet $A$ is a finite sequence of elements in $A$. The length of a word $w$ is denoted by $|w|$. The empty word, of length $0$, is denoted by $\varepsilon$. We denote by $\mathrm{alph}(w)$ the set of letters that occur at least once in the word $w$. An infinite word is a mapping from $\mathbb{N}$ into $A$, and a two-sided infinite word is a mapping from $\mathbb{Z}$ into $A$. A circular word or necklace is the equivalence class of a finite word under conjugacy, or circular permutation. We shall write $u \sim w$ if $u$ and $w$ define the same circular word. Sometimes, we identify a circular word with one of its representatives.

A factor of a word $w$ is any word $u$ that occurs in $w$, i.e. such that there exist words $x, y$ with $w = xuy$. A square is a nonempty word of the form $uu$. A word is square-free if none of its factors is a square. Similarly, an overlap is a word of the form $xuxux$, where $x$ is nonempty. The terminology is justified by the fact that $xux$ has two occurrences in $xuxux$, one as a prefix (initial factor), one as a suffix (final factor), and that these occurrences have a common part (the central $x$). As before, a word is overlap-free if none of its factors is an overlap.

The reversal of a word $u = a_1 \cdots a_n$, where $a_1, \ldots, a_n$ are letters, is the word $\tilde{u} = a_n \cdots a_1$. If $\tilde{u} = u$, then $u$ is a palindrome. The reversal of an infinite word to the right is an infinite word to the left.

The set of words over $A$ is the free monoid generated by $A$ and is denoted by $A^*$. The set of nonempty words over $A$ is denoted by $A^+$. It is the free semigroup generated by $A$. A function $h : A^* \to B^*$ is a morphism if $h(uv) = h(u)h(v)$ for all words $u, v$. If $|h(w)| \geq |w|$ for all words $w$, then $h$ is nonerasing or length-increasing. It is equivalent to say that $h(w) \neq \varepsilon$ for $w \neq \varepsilon$. If there is a letter $a$ such that $h(a)$ starts with the letter $a$, then $h^{n+1}(a)$ starts with the word $h^n(a)$ for all $n$. If the set of words $\{h^n(a) \mid n \geq 0\}$ is infinite, the morphism defines a unique infinite word, say $x$, by the requirement that all $h^n(a)$ are prefixes of $x$. The word $x$ is said to be obtained by iterating $h$ on $a$, and is called a morphic word. Sometimes, $x$ is also denoted by $h^\omega(a)$. Clearly, $x$ is a fixed point of $h$. The Thue-Morse sequence, presented later in this chapter, is an example of a morphic word.

A morphism $h : A^* \to B^*$ easily extends to one-sided infinite words. If $x = a_0 a_1 \cdots a_n \cdots$ is an infinite word, then $h(x) = h(a_0) h(a_1) \cdots h(a_n) \cdots$. The resulting word is infinite if and only if the set of indices $n$ such that $h(a_n) \neq \varepsilon$ is infinite. This holds in particular if $h$ is nonerasing. The extension to two-sided infinite words is similar. The only ambiguity is in the convention adopted to fix the origin of the image. We agree that any origin is convenient. In other words, we consider, insofar as homomorphic images are concerned, the equivalence class under the shift operator $T$ that is defined by $T(x)(n) = x(n+1)$. If $u$ is a finite word, then the infinite periodic word $u^\omega = uuu\cdots$ satisfies $u^\omega = T^{|u|}(u^\omega)$.

Codes and encodings

A code over $A$ is a set $X$ of nonempty words such that each word over $A$ admits at most one factorization as a product of words in $X$. In other words, for all $n, m \geq 1$ and $x_1, \ldots, x_n, y_1, \ldots, y_m \in X$,

$$x_1 \cdots x_n = y_1 \cdots y_m \implies n = m \text{ and } x_i = y_i \quad (i = 1, \ldots, n).$$

It is equivalent to say that the submonoid $X^*$ generated by $X$ is free and that $X$ is its base. A set $X$ is prefix if no word in $X$ is a prefix of any other word in $X$; thus $x, xu \in X$ implies $u = \varepsilon$. Suffix sets are defined symmetrically. Prefix and suffix sets are codes. A biprefix code is a code that is both prefix and suffix.

An encoding is a morphism $h : A^* \to B^*$ that is injective. If $h$ is an encoding, then the set $X = h(A)$ is a code. Conversely, if $X$ is a code over an alphabet $B$, then an encoding of $X$ is obtained by taking a bijection $h$ from an alphabet $A$ onto $X$. This extends to an injective morphism from $A^*$ into $B^*$. It is convenient to implicitly transfer terminology between codes and encodings. Thus we may speak about prefix encodings or about composition of codes. Several special properties of codes are useful and will be introduced when they are needed.

The Thue-Morse sequence

In this section, we recall some basic properties concerning the Thue-Morse sequence. Other properties and proofs can be found in Lothaire and Salomaa, and of course in Thue's second paper.

Let $A = \{a, b\}$ be a two-letter alphabet. Consider the morphism from the free monoid $A^*$ into itself defined by

$$a \mapsto ab, \qquad b \mapsto ba.$$

Setting, for $n \geq 0$, …
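As an informal illustration of the definitions above (squares, overlaps, and words obtained by iterating a morphism), the following Python sketch may help. It is not part of Thue's papers or of the translation; the function names `iterate_morphism`, `has_square`, and `has_overlap`, and the representation of a morphism as a dictionary from letters to words, are assumptions made only for this example. The overlap test relies on the observation that a word contains a factor $xuxux$ with $x$ nonempty exactly when it contains a factor $cvcvc$ with $c$ a single letter.

```python
def iterate_morphism(h, letter, steps):
    """Return h^steps(letter), where h maps each letter to a word.

    h is a dict such as {"a": "ab", "b": "ba"} (illustrative representation).
    """
    word = letter
    for _ in range(steps):
        # A morphism acts letter by letter: h(uv) = h(u)h(v).
        word = "".join(h[c] for c in word)
    return word


def has_square(w):
    """True if some factor of w is a square uu with u nonempty."""
    n = len(w)
    for i in range(n):
        for k in range(1, (n - i) // 2 + 1):
            if w[i:i + k] == w[i + k:i + 2 * k]:
                return True
    return False


def has_overlap(w):
    """True if some factor of w is an overlap xuxux with x nonempty.

    Equivalent test: a square of period p starting at position i,
    followed by one more letter equal to the letter at position i.
    """
    n = len(w)
    for i in range(n):
        for p in range(1, (n - i - 1) // 2 + 1):
            if w[i:i + p] == w[i + p:i + 2 * p] and w[i + 2 * p] == w[i]:
                return True
    return False


# The morphism a -> ab, b -> ba from the text, iterated on the letter a.
thue_morse = {"a": "ab", "b": "ba"}
print(iterate_morphism(thue_morse, "a", 3))  # abbabaab
```

Since the image of `a` starts with `a`, each iterate is a prefix of the next one, and the iterates are finite prefixes of the Thue-Morse sequence. They contain squares such as `bb`, but `has_overlap` finds no overlap in them, in agreement with the result attributed to Thue above. Both checks are brute-force and are meant for experimenting on short words only.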