<<

Journal of Algorithms 37, 267᎐282Ž. 2000 doi:10.1006rjagm.2000.1108, available online at http:rrwww.idealibrary.com on

Fast Algorithms to Generate Necklaces, Unlabeled Necklaces, and Irreducible Polynomials over GFŽ. 2 1

Kevin Cattell

HP Research Labs, Santa Rosa, California E-mail: kevin᎐ [email protected]

Frank Ruskey, Joe Sawada, and Micaela Serra

Department of Computer Science, Uni¨ersity of Victoria, Victoria, British Columbia, Canada E-mail: [email protected]; [email protected], [email protected]

and

C. Robert Miers

Department of Mathematics and Statistics, Uni¨ersity of Victoria, Victoria, British Columbia, Canada E-mail: [email protected]

Received August 22, 1998

Many applications call for exhaustive lists of strings subject to various con- straints, such as inequivalence under group actions. A k-ary necklace is an equivalence class of k-ary strings under rotationŽ. the cyclic group . A k-ary unlabeled necklace is an equivalence class of k-ary strings under rotation and of alphabet symbols. We present new, fast, simple, recursive algo- rithms for generatingŽ. i.e., listing all necklaces and binary unlabeled necklaces. These algorithms have optimal running times in the sense that their running times are proportional to the number of necklaces produced. The algorithm for generat- ing necklaces can be used as the basis for efficiently generating many other equivalence classes of strings under rotation and has been applied to generating bracelets, fixed density necklaces, and chord diagrams. As another application, we describe the implementation of a fast algorithm for listing all degree n irreducible and primitive polynomials over GFŽ. 2 . ᮊ 2000 Academic Press

1 Research supported by NSERC.

267 0196-6774r00 $35.00 Copyright ᮊ 2000 by Academic Press All rights of reproduction in any form reserved. 268 CATTELL ET AL.

1. INTRODUCTION

Four of the most natural group actions on strings over a fixed alphabet are:Ž. a leaving the string unchanged, Ž. b reversing the string, Ž. c rotating the string, andŽ. d permuting the symbols of the string by a permutation of the alphabet symbols. The four groups giving rise to these actions areŽ. a ޚ ޚ ރ 12,bŽ.acting on the indices Ž reversal .Ž. , c the cyclic n acting on the ޓ indices, andŽ. d the symmetric group k acting on the alphabet, assuming the alphabet consists of k symbols. Each , or composition of group actions, partitions the set of k-ary strings into equivalence classes, namely the orbits of the action. Many applications call for exhaustive lists of representatives of these equivalence classes. To generate such equivalence classes, it is natural to choose as representative the lexicographically smallest string. With this representation, efficient algorithms are known for generating the equiva- lence classes of each of the actionsŽ.Ž.Ž. a , b , c , and Ž. d . By ‘‘efficient’’ we mean that the amount of computation used in generating the objects is proportional to the number of objects generated. Such algorithms are said to be CAT, for constant amortized time. ForŽ. a we are simply counting in base k which is known to be efficient for k G 2. ForŽ. b efficient algorithms are easily developed. In case Ž. c the equivalence classes are usually called necklaces. Efficient algorithms for generating necklaces were developed by Fredricksen and Kesslerwx 5 and Fredricksen and Maioranawx 6 ; these algorithms were proven to be efficient by Ruskey et al wx20 . Closely related algorithms for generating Lyndon wordsŽ. aperiodic necklaces were developed by Duvalwx 3 and shown to be efficient by Berstel and Pocchiolawx 1 . In caseŽ. d the representative strings are usually called restricted growth functions and efficient algorithms for generating them have been developed by Erwx 4 , Kaye w 12 x , and others. In addition to these four natural group actions considered in isolation, interesting equivalence classes also emerge by composing two or more of the group actions. For example, composingŽ. b and Ž. c results in the dihedral groups, with the resulting equivalence classes often called bracelets. Only recentlyŽ. using the framework outlined in this paper Sawadawx 23 developed an algorithm to generate k-ary bracelets in constant amortized time. Proskurowski et al.wx 17 show that the orbits of the composition ofŽ. b and Ž. d can be generated in amortized Ok Ž' .time, which is CAT if k is fixed. It remains an interesting challenge to develop efficient algorithms for the other compositions. In this paper we present a new recursive framework for necklace generation. We then use this framework to develop a CAT algorithm to generate equivalence classes of the complemented cycling registerŽ which GENERATING UNLABELED NECKLACES 269 are in one-to-one correspondence with vortex-free tournaments. . We also use this framework to develop a CAT algorithm for generating unlabeled binary necklacesŽ. and Lyndon words , i.e., the composition of Ž.Ž. c and d for k s 2. This algorithm was used in the implementation of a fast algorithm for listing all degree n irreducible polynomials over GFŽ. 2 . The framework has also been used to efficiently generate other classes of strings as summarized below. Undoubtedly, other applications of the framework will also be discovered.

ⅷ Necklaces: Done in this paper, CAT algorithm. ⅷ Vortex-free tournaments: Done in this paper, CAT algorithm. ⅷ Binary unlabeled necklaces: Done in this paper, CAT algorithm. ⅷ Irreducible polynomials over GFŽ. 2 : Done in this paper. ⅷ Fixed density necklaces: Done in Ruskey and Sawadawx 21 , CAT algorithm. ⅷ Bracelets: Done in Sawadawx 23 , a CAT algorithm. ⅷ Chord diagramsŽ i.e., 2n points on an oriented circle embedded in the plane joined pairwise by chords. : to appear. ⅷ Necklaces with forbidden substrings: Done in Ruskey and Sawada wx22 , CAT algorithm if the forbidden substring is a . ⅷ A basis for the nth homogeneous component of the free Lie algebra: to appear. The combinatorial objects which we are listing are fundamental and there is considerable interest in having efficient algorithms to generate them. This interest comes from mathematicians, computer scientists, electrical engineers, and scientists in other disciplines. Our algorithms have all been implemented and are used in the Combinatorial Object ServerŽ. COS at http:rr www.theory.csc.uvic.car˜cos in the sections on Necklaces and Irre- ducible Polynomials over GFŽ. 2 .

1.1. Background and Definitions def ⌺ s y ⌺n Defining the alphabet kkÄ40, 1, . . . , k 1 , the set consists of all ; ⌺U k-ary strings of length n. Define an equivalence relation on k by ␣ ; ␤ ¨ g ⌺q ␣ s ¨ ␤ s ¨ if and only if there exist u, k such that u and u. The equivalence classes of ; are called necklaces and we identify each necklace with the lexicographically least representative in its equivalence class. The set of all necklaces of length n over a k-ary alphabet is denoted

NkŽ.n .

def s ␣ g ⌺n < ␣ F ␤ ␣ ; ␤ NkkŽ.n Ä4for all Ž.1 270 CATTELL ET AL.

s For example N2Ž.4 Ä40000, 0001, 0011, 0101, 0111, 1111 . The cardinality ␣ s иии ␴␣ of NkkŽ.n is denoted Nn Ž.. For a string aa12 an, let Ž .denote a ␣ ␴ иии s иии left shift of by one position: Ž.aa12 an a2 aan 1. In an unlabeled necklace we do not care about which color a bead has, but only about whether two beads have different colors. Unlabeled neck- laces 0001 and 0111 are identical since one can be transformed into the other by interchanging 0 and 1 at every position of the string. Formally, the set NkŽ.n of all k-ary length n unlabeled necklaces is defined as follows:

def s ␣ g ¬ ␣ F ␴␲␣i ᭙␲ g ޓ - - Nˆ kkŽ.n Ä4N Ž.n Ž. Ž . kand 0 i n .2 Ž.

s For example, Nˆˆ2Ž.4 Ä40000, 0001, 0011, 0101 . As before,, we denote NnkŽ. s < < Nˆ kŽ.n . A string ␣ is periodic if ␣ s ␤ k where ␤ is nonempty and k ) 0. An aperiodic necklace is called a Lyndon word. The set of all k-ary Lyndon words of length n is denoted L kŽ.n .

defs ␣ g ¬ ␣ L kkŽ.n Ä4N Ž.n is aperiodic

s For example, L 2Ž.4 Ä40001, 0011, 0111 . The cardinality of L kŽ.n is de- noted LnkŽ.. An unlabeled Lyndon word is an aperiodic unlabeled necklace. The set of all k-ary length n unlabeled Lyndon words is denoted ˆL kŽ.n with s cardinality LnˆˆkŽ.. For example, L 2 Ž.4 Ä40001, 0011 . A word ␣ is called a prenecklace if it is the prefix of some necklace. The set of all k-ary prenecklaces of length n is denoted PkŽ.n .

def s ␣ g ⌺nm¬ ␣␤ g q G ␤ g ⌺ PkkkŽ.n Ä4N Žn m .for some m 0 and k

s j For example P22Ž.4 N Ž.4 Ä40010, 0110 . The cardinality of PkŽ.n is denoted PnkkŽ.. Let Wn Ž.denote the number of prenecklaces of length at most n. These numbers will prove useful in analyzing the algorithms developed below. Formally,

n defs q WnkkŽ.1 Ý Pi Ž..3 Ž. is1

We denote unlabeled prenecklaces PˆˆkkŽ.n with cardinality Pn Ž..

def s ␣ g ⌺nm¬ ␣␤ g q G ␤ g ⌺ PˆˆkkkŽ.n Ä4N Žn m .for some m 0 and k GENERATING UNLABELED NECKLACES 271

THEOREM 1.1. The following formulae are ¨alid for all n G 1, k G 1:

11 s ␮ n r dnˆ s ␮ r d Lnk Ž.ÝÝ Ž.dk , Ln2 Ž. Ž.d 2, Ž. 4 n dn<<2n odd dn 11 s ␾ n r dnˆ s y ␾ r d Nnk Ž.ÝÝ Ž.dk , Nn22 Ž.Nn Ž. Ž.d 2,5 Ž. n dn<<2n odd dn nn s ˆˆs PnkkŽ .ÝÝLi Ž., Pn22 Ž .Li Ž..6 Ž. is1 is1

Proof. The equations for LnkkŽ.and Nn Ž.are well known; a proof of the former may be found in Graham et al.wx 11, p. 141 and of the latter in wx Lothaire 16, p. 9 . The equations for Nnˆˆ22Ž.and Ln Ž.are from Gilbert wx and Riordan 8 . The equation for PnkŽ.follows from Lemma 2.3 and the equation for Pnˆ2Ž.follows from Lemma 3.2.

2. GENERATING NECKLACES

In this section we describe a recursive algorithm for generating neck- laces and Lyndon words. Compared with previous algorithms, our contri- bution is a recursive formulation that leads to simpler proofs and analysis. It is also more amenable to modificationŽ for example, existing algorithms only generate in lexicographic order, whereas ours can generate in many different orders. . The algorithm is based on a result, Theorem 2.1, that tells us how to construct a length n prenecklace from a length n y 1 prenecklace. In the next section, we will use this recursive methodology to generate all unlabeled necklaces. This methodology has already been used, in other papers, to generate necklaces with fixed densitywx 21 and bracelets wx23 , both by CAT algorithms. The following characterizations of Lyndon words and necklaces are well knownŽ e.g., Lothairewx 16, p. 64. and are clearly equivalent to the defini- tions given earlier.

␣ s xy g L if and only if xy - yx, for all nonempty x, y.7Ž. ␣ s xy g N if and only if xy F yx, for all nonempty x, y.8Ž.

We now need to state a couple of technical lemmas. The following lemma is proven inwx 20, Lemma 1, p. 419 . 272 CATTELL ET AL.

LEMMA 2.1. If ␣ g N, then ␣ t g N for t G 1. The lemma below follows directly fromŽ. 8 and the definition of a prenecklace. ␣ s иии ␣ LEMMA 2.2. Let a1 an be a prenecklace. If x is a substring of G иии with length k, then x a1 ak . The following lemma follows from Lemma 7.14 ofwx 18 and the ensuing discussion. ␣ s иии s ␣ ␣ g LEMMA 2.3. Let aa12 an be a string and p lynŽ.. Then P s s q if and only if ajypja for j p 1,...,n. ␣ g ⌺U ␣ ␣ For k , let lynŽ.be the length of the longest prefix of that is a ⌺ : Lyndon word. This function is well defined since k L. More formally, иии defs F F ¬ иии g lynŽ. a12 a an maxÄ4 1 p n aa12 apkL Ž.Ž.p .9 s ␣ s иии For example, lynŽ.001010010 5. Note that if aa12 an is a Lyndon word, then lynŽ.␣ s n. The following theorem is very useful. We are tempted to call it the ‘‘fundamental theorem of necklaces’’! It leads not only to the necklace generation algorithm, but also to algorithms for producing the Lyndon factorization of a word and for determining the necklaceŽ lexicographically minimal rotation. of a string, both in time linear in the length of the string. The theorem specifies exact conditions, in terms of n and lyn, under which a character can be appended to a prenecklace and still remain a preneck- lace. In many ways, the theorem is implicit in the work of Fredricksenwx 5, 6 and Duvalwx 2, 3 but not explicitly stated there. Reutenauer wx 18 has a very similar statement on page 164. ␣ s иии y THEOREM 2.1. Let aa12 any1 be a string in PkŽ.n 1 and let s ␣ ␣ F F y p lynŽ.. The string binPkn Ž.n if and only if a yp b k 1. Further- more, s pifbanyp lynŽ.␣ b s nifa- b F k y 1. ½ nyp

ALGORITHM 2.1. procedure genŽ.t, p : integer ; local j : integer; begin if t ) n then PrintltŽ.p else begin [ q atta yp; genŽ.t 1, p ; GENERATING UNLABELED NECKLACES 273

g q y y for j Ä4atyp 1,...,k 2, k 1 do begin [ q at j; genŽ.t 1, t ; end; end end;

The algorithm genŽ.t, p of Algorithm 2.1 follows directly from Theorem 2.1. The parameter p has the same meaning it had in the theorem, and the ␣ s иии parameter t replaces n. In general, if a1 aty1 is a prenecklace with lynŽ.␣ s p, then a call to gen Žt, p .will generate all length n prenecklaces ␣ s with prefix . To generate all length n prenecklaces assign a0 0 and make the initial call genŽ.1, 1 . If the symbols selected in the for loop are selected in increasing order, then the listing is lexicographic; in opposite order the listing is in reverse lexicographic order. Necklaces and Lyndon words can be produced by adding a test to the function PrintltŽ.p as described in Table I. We now show that algorithm genŽ.t, p is efficient; it has the CAT property.

THEOREM 2.2. Algorithm genŽ.t, p is a CAT algorithm.

Proof. Call the number of nodes in the computation tree WnkŽ.. From the structure of the algorithm, WnkŽ.is equal to the number of preneck- laces of length at most n, as expressed inŽ. 3 . From the expressionsŽ. 4 and Ž. 5 we obtain the following bounds.

1 k n 1 Ž.k nny Žny1 .k r2 FLn Ž.FFNn Ž.F Ž.k nnq Žny1 .k r2 nnnkk Ž.10

TABLE I Different Objects Output by Different Versions of Printlt

Sequence type PrintltŽ.p wx Prenecklaces ŽPk Žn ..Println Ža 1..n . s wx Lyndon words ŽL k Žn ..if p n then Println Ža 1..n . s wx Necklaces ŽNk Žn ..if n mod p 0 then Println Ža 1..n . De Bruijn sequence if n mod p s 0 then PrintŽawx1.. p . 274 CATTELL ET AL.

We have fromŽ. 6 nn1 s F i PnkkŽ.ÝÝLi Ž.k .11 Ž . is1 is1 i Hence

nnj 1 y s F i WnkkŽ.1 ÝÝÝPi Ž. k . js1 js1 is1 i Thus WnŽ.y 1 n n j 1 k F i n ÝÝ k . Nnk Ž. kjs1 is1 i

Inwx 20 it is shown that this last expression converges to ŽŽkr k y 1as ..2 n ª ϱ. Thus the asymptotic running time per necklaceŽ or per Lyndon ⌰ word, noting that LnkkŽ .is ŽNn Ž ..by Ž 10 .. is constant; the necklace algorithm in Algorithm 2.1 is CAT no matter what type of object is being generated.

2.1. Application: Generating Equi¨alence Classes of the Complemented Cycling Register Golombwx 10 considers the complemented cycling registerŽ. CCR over the set of all binary strings of length n. The CCR transforms a given input s иии string b bb12 bn into an output string CŽ.b as shown below. иии s иии CbbŽ.12 bn b2 bbn 1 .

Repeated application of the operation C eventually results in the initial string; the sequence b, CŽb ., CC Ž Žb ..., . . . is called a cycle. If we let CCRŽ. n denote the number of distinct cycles of the CCR taken over all binary strings of length n then

1 CCRŽ. n s Ý ␾ Ž.d 2.n r d Ž 12 . 2n odd dn<

It is shown by Knuthwx 13 that the right side ofŽ. 12 is equal to the number of vortex-free tournaments on n vertices. This number is also the same as the number of odd density binary necklaces with length n. Now consider the mapping f that sends two consecutive members of a cycle to their exclusive-or. For example, fŽ.010101, 101011 s 111110. Ap- plied to other members of the cycle a circular rotation of 111110 results. GENERATING UNLABELED NECKLACES 275

This is true in general, as we show below.

[ s [ [ иии [ [ b CŽ.b Žb1223bb .Žb . Žbny1 bbnn .Ž.b1 [ s [ иии [ [ [ CŽ.b CCŽ. Ž.b Žb23b . Žbny1 bbnn .Ž.bb11Ž.b 2

Since x [ y s x [ y, these strings are shifts of one another. Further- [ [ [ more, they must contain an odd number of 1’s, since b1223b b b [ иии [ [ [ [ s [ s bny1 bnnb b111b b 1. The process is reversible, so long as a value for b1 is specified. Thus there is a natural correspondence between binary necklaces with an odd number of 1’s and distinct cycles of the CCR on binary strings of length n. By making small modifications to genŽ.t, p we can generate a unique representative from each distinct cycle of the CCR, for a given n, as described below. First add an additional parameter d to genŽ.t, p which keeps track of the number of 1’s in the string. If the resulting string of length N has an even number of 1’s then it is rejected. We also maintain another array b which holds a representative for a cycle of the CCRŽ not necessarily the s [ lexicographically least representative. . We use the equation btta y1 bty1 at each recursive call to find the next bit in the representative for the cycle; this adds only a constant amount of computation to each node of the computation tree. The resulting algorithm is CAT since its running time is proportional to the running time for generating all odd density necklaces, since the number of odd density necklaces is asymptotically half the total number of necklaces.

3. GENERATING UNLABELLED BINARY NECKLACES

In this section we develop a CAT algorithm to generate all binary unlabeled necklaces. Recall that an unlabeled necklace is an equivalence class of necklaces under permutation of alphabet symbols. In the binary case, a necklace and its complement are in the same equivalence class. As representative we choose the lexicographically smallest string in the equiv- alence class, giving rise to the set Nˆ 2Ž.n . In Section 1 we gave explicit expressions to count Nnˆˆ22Ž., Ln Ž., and Pnˆ2Ž.. The following two lemmas are analogous to Lemmas 2.1 and 2.3 for necklaces. ␣ s иии g ␣ t g G LEMMA 3.1. If a1 an Nˆ, then Nˆ for t 1. ␤ s иии ␣ Proof. Let b1 bn be equivalent to under permutation of its symbols. Then by definition of an unlabeled necklace, ␤ must be greater 276 CATTELL ET AL. than or equal to ␣ and thus ␤ t must be greater than or equal to ␣ t. From Lemma 2.1 ␣ t is a necklace and therefore by definition ␣ t is an unlabeled necklace for t G 1.

␣ s иии LEMMA 3.2. Let a1 an be a string and let p equal the length of the ␣ ␣ g s longest unlabeled Lyndon prefix of . Then Pˆ if and only if ajypja for j s p q 1,...,n.

s s q ␣ s ␤ t␦ G Proof. If ajypja for j p 1,...,n, then for some t 1 ␤ s иии g ␦ ␤ ␤ tq1 where a1 ap ˆL and is a prefix of . By Lemma 3.1, is an unlabeled necklace and thus ␣ g Pˆˆ. Conversely, assume that ␣ g P.If lynŽ.␣ s q and p / q then there must exist a Lyndon prefix of ␣ with length greater than p that is not in ˆL. This implies that there exists a ␲ ␲ - иии permutation of the alphabet symbols such that Ž.a1,...,aq a1 aq. This contradicts the assumption that ␣ is an unlabeled prenecklace, since no matter what we append to the string ␣, it can never be an unlabeled s ␣ s necklace. Now since we must have p lynŽ., by Lemma 2.3 ajypja for j s p q 1,...,n.

To generate unlabeled necklaces, we could simply generate all necklaces and perform a test for each necklace to determine whether or not it is the unlabeled representative. To perform this test, we take the complement of the generated necklace and use a necklace finding algorithm to find its corresponding necklace. Such an algorithmŽŽ.. which runs in time On can easily be derived from Duval’s algorithm for factoring a string into Lyndon wordswx 2 or from Theorem 2.1. We then compare the resulting necklace to the original; if the original is not larger, then it is an unlabeled necklace. This algorithm yields an overall running time of OnŽŽ..N n , which is far from efficient. In order to improve upon this naıve¨ algorithm to generate unlabeled necklaces we must improve upon the linear time test required at the end of the necklace generation. In the remainder of this section we build up to a theorem, Theorem 3.1, which suggests the addition of another parameter to the necklace algorithm genŽ.t, p . The constant time maintenance of this parameter at each node of the computation tree eliminates the need for the linear time test, thus yielding a CAT algorithm to generate unlabeled necklaces. Before we state our main theorem, we first introduce some additional notation and state some useful lemmas. Note that Nˆ 2Ž.n consists exactly of those necklaces ␣ that satisfy ␣ F ␴␣i Ž.for all 0 - i - n. Observe that

if <

␣ s ␣ иии g LEMMA 3.3. Let 1 an N2Ž.n . If there is a k such that, for ¨ - F ¨ ␣ F ␴␣i иии s иии e ery 0 i k, we ha e Ž.and akq1 an a1 anyk , then ␣ g Nˆ 2Ž.n . Proof. By the definition of an unlabeled necklace, a necklace is its unlabeled representative if and only if it is less than or equal to each of its complemented rotations. We are given that ␣ F ␴␣i Ž.for 0 - i F k so we need only show that ␣ F ␴␣i Ž.for k - i - n. иии s иии s иии Since akq1 an a1 anyki, taking x a q1 an in Lemma 2.2 иии ) иии иии s иии yields either aiq1 an a1 anyiior a q1 an a1 anyi. In the иии former case the result is trivial. In the latter case we consider anyiq1 an and look at two subcases. If n y i q 1 F k, then ␴␣nyiq1 Ž.G ␣ which иии G иии y q ) иии implies anyiq1 an a11a .If n i 1 k, then anyiq1 an is a иии substring of akq1 an and is therefore a substring of the prenecklace иии иии G иии a1 anyin. By Lemma 2.2 we have a yiq1 an a1 ai. Thus in both иии G иии иии G subcases anyiq1 an a1 ai. Now usingŽ. 13 we have a1 ai иии иии иии s ␴␣i G ␣ anyiq1 aniwhich gives us the result a q1 aan 1 ai Ž. for k - i - n. ␣ s иии g иии G иии COROLLARY 3.1. Let a1 an N2Ž.n . If aina a1 anyiq1 for all 1 F i F n then ␣ is an unlabeled necklace. иии ) иии иии иии ) ␣ Proof. If aina a1 anyiq1, then ainaa1 aiy1 . If there иии s иии ␣ exists a smallest i such that aina a1 anyiq1 then, by Lemma 3.3, is an unlabeled necklace. Otherwise, ␣ is an unlabeled necklace by definition. We need one final definition before presenting the main theorem of this иии section. Define comŽ. a1 an to be the smallest positive value c, for which иии s иии acq1 an a1 anyc ,14Ž. or n if no such value of c exists. For example, comŽ.000111000111 s 3, comŽŽ01 .m..s 1, and com Ž0 n. s n; these last two examples represent extreme values for com. First, we give a lemma useful in proving the theorem. ␣ s иии LEMMA 3.4. A binary string a1 an is an unlabeled prenecklace if ␣ иии G иии F F and only if is a prenecklace and aina a1 anyiq1 for all 1 i n. Proof. Assume that ␣ is an unlabeled prenecklace. Since PˆŽ.n is a subset of PnŽ.then ␣ is a prenecklace. By definition of an unlabeled prenecklace there exists an unlabeled necklace ␥ such that ␥ s ␣␦ for ␦ иии G some string . Thus by the definition of an unlabeled necklace aina иии F F a1 anyiq1 for all 1 i n. 278 CATTELL ET AL.

To prove the converse we let p s lynŽ.␣ .If n mod p s 0 then ␣ is a necklace. By Corollary 3.1, ␣ is also an unlabeled necklace and thus by definition ␣ is an unlabeled prenecklace. Otherwise, if n mod p / 0 then ␤ s иии u n r pv ␣ we construct a string Ž.a1 ap of length m by extending .By иии observing that a1 ap is an unlabeled necklaceŽ using the fact that иии иии a1 ap is a necklace and Corollary 3.1. we get a1 ap F иии иии F F иии G aipaa1 aiy11for all 1 i p. Thus byŽ. 13 we have a apia иии иии F F ?@r иии G иии aap 1 aiy1. Therefore for 1 i pn p we have aima a1 иии иии G иии amyiq11. Again since a apipis an unlabeled necklace, a a a1 иии G иии ?@r - F apyiq1. Thus we have aima a1 amyiq1 for pn p i m. Now since ␤ is a necklace Corollary 3.1 shows that ␤ is an unlabeled necklace, and thus by definition ␣ is an unlabeled prenecklace. ␣ s иии g y s ␣ THEOREM 3.1. Let aa12 any12Pˆ Ž.n 1 and c com Ž.. The g g s s string ab Pˆ22Ž.n if and only if Ž.i ab P Ž.n and Ž.ii anycn0 or b a yc. Furthemore

nifbs a s 0 ␣ s nyc comŽ.b s ½ cifbanyc.

␣ g ␣ g Proof. Assume that b Pˆ2Ž.n . By Lemma 3.4 we also have b s s иии ) иии P2Ž.n .If anyc b 1 then the string a1 anycca q1 an, a contra- diction to the definition of an unlabeled prenecklace. Therefore either s s s s s anycnor b 0. If a ycn1 and b 0 then b a ycn. Thus either a yc 0 s or b anyc. иии G иии To prove the converse we need only show that aina a1 anyiq1 F F ␣ g y s ␣ for all 1 i n by Lemma 3.4. Because Pˆ2Ž.n 1 and c com Ž. иии ) иии F F we observe that aina y11a anyi for all 1 i c. Thus we clearly иии ) иии F F s also have aina a1 anyiq1 for 1 i c.Ifanyc b then иии s иии ␣ s s acq1 an a1 anycnby definition of comŽ..If a yc b 0 then we иии ) иии have by a similar argument that acq1 an a1 anyc. Now applying иии G иии q F F Lemma 2.2 for either case we get aina a1 anyiq1 for c 1 i иии G иии F F ␣ g n. Therefore aina a1 anyiq1 for all 1 i n and thus b Pˆ2Ž.n . s ␣ s Furthermore if anyc b, then we clearly have comŽ.b c and if s s ␣ s b anyc 0, then by our convention comŽ.b n since there is no s s value of c for whichŽ. 14 holds. Note that the case anyc b 1 cannot occur by the discussion in the first paragraph of the proof. This theorem implies that we can generate all unlabeled prenecklaces Ž.and thus unlabeled necklaces by introducing the additional parameter c to the prenecklace generation algorithm genŽ.t, p . Pseudocode for this algorithm is given in Algorithm 3.1. The initial call is genUŽ.2, 1, 1 , first GENERATING UNLABELED NECKLACES 279

s s initializing a01a 0. Unlabeled necklaces, Lyndon words, and preneck- laces can all be produced by using Table I as before. In general, if ␣ s иии ␣ s ␣ a1 aty1 is an unlabeled prenecklace with lynŽ.p and com Ž. s c, then a call to genUŽ.t, p, c will generate all length n unlabeled prenecklaces with prefix ␣.

ALGORITHM 3.1. Procedure genU Ž.t, p, c: integer ; begin if t ) n then PrintltŽ.p ; else begin s if atyc 0 then begin s if atyp 0 then begin [ q at 0; genUŽ.t 1, p, t ; end; [ at 1; s q if atyp 1 then genUŽ.t 1, p, c ; else genUŽ.t q 1, t, c ; end else begin [ q at 0; genUŽ.t 1, p, c ; end; end; end; Observe that the computation tree of genUŽ.t, p, c is a subtree of the computation tree of genŽ.t, p and that only constant computation is performed at each node of the tree. Furthermore, the number of unla- beled binary necklaces is at least half the number of labeled binary necklaces. These observations prove the following theorem.

THEOREM 3.2. Algorithm genUŽ.t, p, c is a CAT algorithm. It remains an interesting challenge to extend these ideas to generate unlabeled necklaces over nonbinary alphabets; there seems to be no obvious way to extend Theorem 3.1.

3.1. Application: Generating All Irreducible Polynomials o¨er GFŽ.2 In the finite field GFŽ. 2 , we denote by GF Ž. 2 wxx the set of all polynomi- als over GFŽ. 2 in indeterminate x. A nonzero polynomial p is said to be s irreducible over GFŽ. 2 if it has no nontrivial factorization p pp12.An 280 CATTELL ET AL. irreducible polynomial is also primiti¨e if it has a root ␣ such that ␣ generates the nonzero elements of GFŽ 2n.Ž that is, Ä␣, ␣ 23, ␣ ,...4 s GFŽ 2 n. _ Ä40.If. p has one such root, then all of its roots are generators.

The number of degree n irreducible polynomials over GFŽ. 2 is Ln2 Ž ., which is the same as the number of binary Lyndon words of length n. There is a natural correspondence between the two sets which we use to generate all irreducible and primitive polynomials of given degree. In the following paragraph we explain the correspondence. Let q be a primitive polynomial of degree n and ␣ one of its roots. For each ␭ s ␣ b, where b ranges from 1 to 2 n y 1, there is an equivalence my 1 class Ä␭, ␭24, ␭ ,...,␭ 2 4 of size m, where m ¬ n. These equivalence classes are in one-to-one correspondence with irreducible polynomials whose degree divides n by forming the polynomial pxŽ.s Žx q ␭ .Žx q ␭2 . иии my 1 Ž1 q ␭2 .. This correspondence is explained in Golombwx 9Ž and see Luneberg¨ wx 15 or Reutenauer wx 18 for another correspondence via normal bases. , but we have not before seen it exploited in an algorithmic context. The irreducible polynomials of degree n are those for which m s n. Since successive powers of two amount to taking circular shifts of b, all irreducible polynomials of degree n may be generated by feeding in a primitive root ␣ and generating all Lyndon words of length n. As each Lyndon word is generated it is converted into an integer b and then the corresponding irreducible polynomial pxŽ.is computed. An irreducible polynomial is primitive if and only if b and 2 n y 1 are relatively prime. The reciprocal of a degree n polynomial fxŽ.is the polynomial xfn Ž.1rx . The reciprocal of an irreducible polynomial is irreducible, and the recipro- cal of a primitive polynomial is primitive. Under the correspondence mentioned above reciprocal polynomials correspond to those generated by the complement of the number used to produce b. Thus we do not generate all Lyndon words, but only the unlabeled Lyndon words. This savesŽ. asymptotically a factor of two in the computation time. Our algorithm was implemented in C with each polynomial stored in a single machine word in the obvious manner. The intermediate calculations involve polynomials over GFŽ 2n. , which are stored as length n arrays of words. In one day on a middle-of-the-road workstation we can generate all 134,215,680 irreducible polynomials of degree 32, noting along the way the 67,108,864 that are primitive. Of course, for large values of n it becomes infeasible to generate all irreducible polynomials, but the algorithm can still be used to generate as many as are wanted. The program poly.c, available on COS, was compiled using gcc -04 and run on a Sun Microsys- tems Ultra 1 machine. Theoretically, the computation of each irreducible polynomial takes OnŽ 2 . polynomial multiplications. Our multiplication routine is the naıve¨ algorithm implemented on machine words. For larger n faster algorithms, GENERATING UNLABELED NECKLACES 281 such as those discussed in von zur Gathenwx 24 , should be used. The algorithmŽ. along with the other algorithms outlined in this paper is well suited to parallelization using a system such as Brendan McKay’s autoson, since there is a natural tree structure to the recursive algorithm which can be used to assign different subtrees to different processors. We used the algorithm to compute various statistics of irreducible polynomials and these have led to several results and conjectures about the sizes of various subclasses of irreducible polynomials. For example, the number of irreducible polynomials with an odd number of nonzero odd terms is LnˆkŽ.. As another example, the number of irreducible polynomi- ny1 als with trace 1Ž.Ž. coefficient of x is Lnˆk .

ACKNOWLEDGMENTS

We thank the referees for their helpful comments. Algorithm 2.1 and Theorem 2.1 were discovered in 1993r94 by the second author as he was preparingwx 19 . It was one of the referees that pointed out that the essence of Theorem 2.1 is contained in Reutenauerwx 18 .

REFERENCES

1. J. Berstel and M. Pocchiola, Average cost of Duval’s algorithm for generating Lyndon words, Theoret. Comput. Sci. 132 Ž.1994 , 415᎐425. 2. J.-P. Duval, Factoring words over an ordered alphabet, J. Algorithms 4 Ž.1983 , 363᎐381. 3. J.-P. Duval, Generation´´ d’une section des classes de conjugaison et arbre des mots de Lyndon de longueur bornee,´ Theoret. Comput. Sci. 60 Ž.1988 , 255᎐283. 4. M. C. Er, A fast algorithm for generating set partitions, Comput. J. 31 Ž.1988 , 283᎐284. 5. H. Fredricksen and I. J. Kessler, An algorithm for generating necklaces of beads in two colors, Discrete Math. 61 Ž.1986 , 181᎐188. 6. H. Fredricksen and I. J. Maiorana, Necklaces of beads in k colors and k-ary de Bruijn sequences, Discrete Math. 23 Ž.1978 , 207᎐210. 7. J. von zur Gathen and J. Gerhard, ‘‘Arithmetic and Factorization of Polynomials over

F2 ,’’ Technical report tr-rsfb-96-018, University of Paderborn, Germany, 1996. 8. E. N. Gilbert and J. Riordan, Symmetry types of periodic sequences, Illinois J. Math. 5 Ž.1961 , 657᎐665. 9. S. W. Golomb, ‘‘Irreducible Polynomials, Synchronization Codes, Primitive Necklaces, and the Cyclotomic Algebra,’’ Univ. of North Carolina Monograph Series in Probability and Statistics, Vol. 4, pp. 358᎐370, 1967. 10. S. W. Golomb, ‘‘Shift Register Sequences,’’ Holden-Day, Oakland, 1967. 11. R. L. Graham, D. E. Knuth, and O. Patashnik, ‘‘Concrete Mathematics,’’ Addison- Wesley, Reading, MA, 1989. 12. R. A. Kaye, A Gray code for set partitions, Inform. Process. Lett. 5 Ž.1976 , 171᎐173. 13. D. E. Knuth, ‘‘Axioms and Hulls,’’ Lecture Notes in Compu ter Science, Vol. 606, Springer-Verlag, BerlinrNew York, 1992. 14. R. Lidl and H. Niederreiter, ‘‘Introduction to Finite Fields and Their Applications,’’ Cambridge Univ. Press, Cambridge, UK, 1994. 282 CATTELL ET AL.

15. H. Luneburg,¨ ‘‘Tools and Fundamental Constructions of Combinatorial Mathematics,’’ Wissenschafts, 1989. 16. M. Lothaire, ‘‘ on Words,’’ Cambridge Univ. Press, Cambridge, UK, 1983. 17. A. Proskurowski, F. Ruskey, and M. Smith, Analysis of algorithms for listing equivalence classes of k-ary strings induced by simple group actions, SIAM J. Discrete Math. 11 Ž.1998 , 94᎐109. 18. C. Reutenauer, ‘‘Free Lie Algebras,’’ Clarendon Press, Oxford, 1993. 19. F. Ruskey, Combinatorial generation, manuscript, course notes for CSC 528, Univ. Victoria, 1995. 20. F. Ruskey, C. D. Savage, and T. Wang, Generating necklaces, J. Algorithms 13 Ž.1992 , 414᎐430. 21. F. Ruskey and J. Sawada, An efficient algorithm for generating necklaces with fixed density, SIAM J. Comput. 29 Ž.1999 , 671᎐684. 22. F. Ruskey and J. Sawada, Generating necklaces and strings with forbidden substrings, in ‘‘Proc. 6th Annual International Combinatorics and Computing ConferenceŽ. CO-COON , Sydney, Australia, July 2000,’’ Lecture Notes in Computer Science, to appear, Springer- Verlag, BerlinrNew York. 23. J. Sawada, Generating bracelets in constant amortized time, manuscript, 1999.

24. J. von zur Gathen and J. Gerhard, Arithmetic and factorization of polynomials over F2 , in ‘‘Proc. ISSAC 96,’’ pp. 1᎐9, Assoc. Comput. Mach, New York, 1996.