
Biased Tadpoles: a Fast Algorithm for Centralizers in Large Matrix Groups

Daniel Kunkle and Gene Cooperman
College of Computer and Information Science
Northeastern University
Boston, MA, USA
[email protected], [email protected]

ABSTRACT

Centralizers are an important tool in computational group theory. Yet for large matrix groups, they tend to be slow. We demonstrate an O(√|G| log(1/ε)) black box randomized algorithm that produces a centralizer using space logarithmic in the order of the centralizer, even for typical matrix groups of order 10^20. An optimized version of this algorithm (larger space and no longer black box) typically runs in seconds for groups of order 10^15 and minutes for groups of order 10^20. Further, the algorithm trivially parallelizes, and so linear speedup is achieved in an experiment on a computer with four CPU cores. The novelty lies in the use of a biased tadpole, which delivers an order of magnitude speedup as compared to the classical tadpole algorithm. The biased tadpole also allows a test for membership in a conjugacy class in a fraction of a second. Finally, the same methodology quickly finds the order of a matrix group via a vector stabilizer. This allows one to eliminate the already small possibility of error in the randomized centralizer algorithm.

Categories and Subject Descriptors: I.1.2 [Symbolic and Algebraic Manipulation]: Algebraic algorithms
General Terms: Algorithms, Experimentation
Keywords: centralizer, matrix groups, tadpole, conjugator, group order

1. INTRODUCTION

The tadpole algorithm in computational group theory is reminiscent of the Pollard rho algorithm [15] for integer factorization. Indeed, both the "tadpole" and the "rho" are meant to remind us geometrically of the shape of a certain trajectory. Tadpoles were originally invented by Richard Parker [14]. Our implementation will use appropriate biasing heuristics to create an algorithm for centralizers in large matrix groups. We also apply biased tadpoles to the problems of group order and conjugacy testing.

One defines the conjugate of a group element g by an element h as g^h = h^{-1} g h. This is analogous to the change of basis formula in linear algebra. Given a group G, the centralizer of g ∈ G is defined as C_G(g) = {h : h ∈ G, g^h = g}.

Note that the above definition of centralizer is well-defined independently of the representation of G. Even for permutation group representations, it is not known whether the centralizer problem can be solved in polynomial time. Nevertheless, in the permutation group case, Leon produced a partition backtrack algorithm [8, 9] that is highly efficient in practice both for centralizer problems and certain other problems also not known to be in polynomial time. Theißen produced a particularly efficient implementation of partition backtrack for GAP [17].

In the case of matrix groups, centralizers are well known to be computationally difficult for large groups. In this paper, we demonstrate a randomized centralizer algorithm. The algorithm can be thought of as a randomized Schreier-Sims algorithm under the conjugate action. We also make the assumption that a distribution of uniformly random group elements is available. However, the conjugate action implies that the group acts on a permutation domain of the same size as the group. For large groups, this makes computation of a Schreier generator computationally expensive.

The problem of computing a Schreier generator in the conjugate action reduces to the following problem:

Problem 1.1 (Conjugator). Given two elements g and h of a group G, find an element r ∈ G such that g^r = h.
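To make the definitions concrete, the following Python sketch realizes Problem 1.1 by brute force on a tiny permutation group. It is a specification of the problem only, not the algorithm of this paper, and all helper names are illustrative.

    from itertools import permutations

    def compose(p, q):
        # Apply p first, then q: (p*q)(i) = q(p(i)), matching the x^g convention.
        return tuple(q[p[i]] for i in range(len(p)))

    def inverse(p):
        inv = [0] * len(p)
        for i, pi in enumerate(p):
            inv[pi] = i
        return tuple(inv)

    def conjugate(g, r):
        # g^r = r^{-1} g r
        return compose(compose(inverse(r), g), r)

    def brute_force_conjugator(g, h, elements):
        # Exhaustive search for r with g^r = h; time O(|G|) per query.
        for r in elements:
            if conjugate(g, r) == h:
                return r
        return None

    # The transpositions (0 1) and (2 3) are conjugate in S_4.
    S4 = list(permutations(range(4)))
    r = brute_force_conjugator((1, 0, 2, 3), (0, 1, 3, 2), S4)
    assert conjugate((1, 0, 2, 3), r) == (0, 1, 3, 2)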
A traditional method for computing a conjugator between g and h is by bidirectional search. However, this can require a lot of storage, and the use of such an out-of-CPU-cache algorithm makes it slow using today's technology. This is because of the large gap in speed between CPU and RAM. Further, as CPUs continue to gain additional cores, an in-cache algorithm can execute a different conjugator in each CPU core, unlike a bidirectional search algorithm.

The tadpole algorithm has the advantage of requiring space that is only constant in the size of a group element; i.e., storage proportional to a small constant times the memory required to store one group element. Further, a simple probability argument shows that the expected number of steps required to solve the Conjugator problem is O(√|G| log |G|) (under certain randomness assumptions). Further, for k such conjugacy tests with k ≥ |G|, the amortized cost of one Conjugator is only O(√|G|). (See Section 4.2.) The obvious centralizer algorithm derived from this (Section 3) then operates in time O(√|G| log(1/ε)) with probability of error bounded above by ε, 0 < ε < 1 (see Section 3.3).

The idea of a biased tadpole is to create a modified tadpole that, loosely speaking, acts on a smaller space than the original tadpole. In Section 6.3, we show a 9.6 times speedup by using biasing to compute the centralizer of an involution in an example for the group J4.

The same tadpole idea can be used to more efficiently find a vector stabilizer subgroup of a matrix group.

Problem 1.2 (Vector Stabilizer). If a matrix group G acts on a vector space V, an element g ∈ G stabilizes a vector v ∈ V if the product of v and the matrix representation of g is equal to v itself. A vector stabilizer subgroup of G is a subgroup H such that v^h = v for every h ∈ H.

The rest of the paper has the following structure. First, we briefly conclude the introduction with related work. Section 2 describes the core idea of tadpole walks. Section 3 presents our tadpole-based algorithm for the centralizer problem. Section 4 describes two additional tadpole-based algorithms: vector stabilizer and group order; and conjugacy testing. Section 5 describes the addition of biasing to tadpoles, and estimates the predicted speedup due to biasing. Finally, Section 6 provides a number of experimental results.
1.1 Related Work

Butler describes a backtracking algorithm for centralizers of matrix groups based on examining the action of the matrix group on vectors of the vector space [2]. Eugene Luks showed how to find centralizers in polynomial time for the special case of solvable matrix groups [11].

In the domain of permutation groups, Leon invented the partition backtrack algorithm [8] for centralizer and other permutation group problems not known to be in polynomial time. Theißen added an efficient implementation of this algorithm into GAP [17]. Bray describes a fast algorithm specialized for centralizers of an involution in certain groups [1]. Murray and O'Brien demonstrated how to use a random Schreier-Sims algorithm [7] for the matrix group viewed as a permutation group on vectors or subspaces [12].

This work assumes the ability to generate uniformly random group elements. Celler and Leedham-Green [3] developed the product replacement algorithm, which is extremely efficient, and widely used in GAP. Although the best theoretical guarantee of randomness to date requires that the algorithm use O(n^9) group multiplications [13], a small number of group operations suffice in practice to produce elements that pass all current tests of randomness.

2. TADPOLE WALKS

The core idea of the tadpole algorithm [14] is to create a dynamical system on any orbit of a permutation group. This dynamical system yields a tadpole walk. A tadpole walk on a permutation domain and group generators is a construction in which one takes a pseudo-random, but deterministic, function f (such as a hash function) that acts on the permutation domain and returns an element of the group. (In Richard Parker's original description, it returned one of the group generators.) One then views this as a dynamical system on the points of the permutation domain. Given a point x0, one computes the new point x1 = x0^{f(x0)} (the application of the generator f(x0) to the point x0). One iterates to produce a sequence x0, x1, ..., xj, ..., xi for which xi = xj. Clearly, once one of the xi is equal to an earlier xj, the series will then continue to cycle among the points xj, xj+1, ..., x{i-1}. This cycle is sometimes called an attractor cycle.

It remains to detect when the xi have arrived on the attractor cycle. This is done using landmarks. A landmark is a point xi whose representation as a point in the permutation domain satisfies a chosen property. In our applications, a landmark is a state for which the k initial bits are all zero. The constant k can be tuned. It is assumed that the first k bits are pseudo-random. Where that is not true, alternatives are possible, such as hashing the bits, or using a related algorithm of Sedgewick and Szymanski [16].

Upon arriving at a state xi that is a landmark, one looks it up in a hash table. If it is already in the hash table, then one has already made a circuit around the attractor cycle. Otherwise, one stores xi in the hash array. One adjusts k to simultaneously ensure that the overhead of checking landmarks is small, and that at least one landmark appears on every attractor cycle.

It can easily be shown that the number of steps until a tadpole walk makes one revolution around an attractor cycle is O(√n), for n the size of the permutation domain on which it acts. The space required to store the landmarks is then O(√n / 2^k). A second critical property of tadpoles is that the expected number of attractor cycles is only O(log n), for n the size of the permutation domain on which the tadpole acts [5]. Hence, there are relatively few large tadpoles. This will be critical in Algorithm 1 (Centralizer).

One empirically observes further that in practice, the number of large attractor cycles tends to be less than five. While smaller attractor cycles may exist, our computations almost never encounter such smaller attractor cycles, and so they do not greatly affect the running time. (See Section 6.)

Tadpole Walks Using Constant Space.

The above method uses a hash table of landmark elements to detect the attractor cycle of the tadpole walk, which requires O(√n / 2^k) space. An alternate approach uses the two-finger trick, which reduces the space required to O(1), at the cost of requiring about three times as many generator applications. This trick has a further advantage in that Algorithm 1 (Centralizer) becomes a black box algorithm. In this method, two fingers follow the tadpole walk. The first finger uses the standard definition of the walk. The second finger follows the walk at twice the speed, applying two generators for each one generator the first finger applies. The walk has found an attractor when the two fingers meet.
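The following Python sketch shows a tadpole walk with both detection methods on a toy domain. The hash-based f, the landmark rule, and the affine-map "generators" are stand-ins for the group actions used in this paper.

    import hashlib

    def make_f(num_gens):
        # Pseudo-random but deterministic choice of the next generator.
        def f(x):
            return hashlib.sha256(repr(x).encode()).digest()[0] % num_gens
        return f

    def is_landmark(x, k=4):
        # Landmark: k designated bits of a hash of the state are all zero (k <= 8).
        return hashlib.sha256(repr(x).encode()).digest()[-1] < (1 << (8 - k))

    def tadpole_landmarks(x, apply_gen, f, k=4):
        # Walk until a landmark repeats; return it plus the word of generator
        # indices used.  Assumes at least one landmark lies on the cycle (tune k).
        seen, word = set(), []
        while True:
            i = f(x)
            x = apply_gen(x, i)
            word.append(i)
            if is_landmark(x, k):
                if x in seen:
                    return x, word
                seen.add(x)

    def tadpole_two_finger(x, apply_gen, f):
        # Two-finger trick: O(1) space, about 3x the generator applications.
        slow = fast = x
        while True:
            slow = apply_gen(slow, f(slow))
            fast = apply_gen(fast, f(fast))
            fast = apply_gen(fast, f(fast))
            if slow == fast:
                return slow    # a point on the attractor cycle

    # Toy domain: Z_10007 acted on by three affine maps.
    N, maps = 10007, [(3, 1), (5, 7), (11, 2)]
    apply_gen = lambda x, i: (maps[i][0] * x + maps[i][1]) % N
    print(tadpole_two_finger(1234, apply_gen, make_f(len(maps))))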

3. TADPOLE ALGORITHM FOR CENTRALIZER

Suppose we wish to find the centralizer of an element g ∈ G. We will do so by performing tadpole walks of group elements acting on group elements by conjugation.

Start with the element to be centralized. Conjugate it by a random group element r1 ∈ G. Then, do a tadpole walk on g^{r1} until some element is repeated, and an attractor cycle is found. We keep track of the word used to conjugate, starting with that random group element, and continuing with the generators chosen by the tadpole walk. If the tadpole walk produces a word w1, then we have a path, g^{r1 w1}, that leads from g to a landmark on the attractor cycle.

We then repeat the process, using a different initial random group element for conjugation, r2 ∈ G. If this second tadpole walk leads to the same attractor cycle, then we produce a second word r2 w2 leading to the same landmark. Hence g^{r1 w1} = g^{r2 w2}. This implies that g^{r1 w1 w2^{-1} r2^{-1}} = g, and therefore that r1 w1 w2^{-1} r2^{-1} is in the centralizer of g.

This tadpole-based method for centralizer is described by Algorithm 1. See Section 2 for a description of the landmarks used in the algorithm.

Algorithm 1 (Centralizer)
Input: Group G, g ∈ G, pseudo-random function f : G → G
Output: random elements, RandomCentralizerElts, of C_G(g)
 1: Initialize a hash array of AttractorCycleReps to be empty.
 2: Initialize RandomCentralizerElts to empty.
 3: repeat
 4:   Initialize an empty Landmark hash array, and LandmarkPtr list
 5:   Let r be a uniformly random element of G
 6:   Let x = g^r
 7:   [Begin tadpole walk at random conjugate of g]
 8:   repeat
 9:     Let r = r f(x)
10:     Let x = x^{f(x)}
11:     if x is a landmark then
12:       If x is not in the Landmark array, hash it, and append it to LandmarkPtr
13:     end if
14:   until x is a landmark and x is in the Landmark array
15:   Use the LandmarkPtr list to determine all landmarks occurring at x or later in the Landmark array
16:   Let m be the lexically least such landmark
17:   if m is in AttractorCycleReps then
18:     Get the value w_m, and add r w_m^{-1} to RandomCentralizerElts
19:   else
20:     Set w_m = r, and add the key-value pair (m, w_m) to AttractorCycleReps
21:   end if
22: until stopping condition satisfied (see Section 3.3)
23: Return RandomCentralizerElts

If the hash array AttractorCycleReps in Algorithm 1 becomes too large, we can remove the less frequently occurring elements. However, [5, Appendix] shows the expected number of attractor cycles to be bounded above by (1/2) log2 |G|. Also, the algorithm can use a number of different stopping conditions, which are discussed further in Section 3.3.

In fact, it is not required that g ∈ G, in which case the above algorithm produces an external centralizer subgroup. More generally, although we identify g^r as defined by the conjugate action, one could take g ∈ Ω for any permutation domain Ω. The pseudo-random function f must then be defined as mapping Ω to G.

Theorem 1. If r2 is a random group element of G, then r1 w1 w2^{-1} r2^{-1} of the previous discussion is a random element of the centralizer subgroup C_G(g).

Proof. The group elements r1 and w1 are independent of the random element r2. While the element w2 depends on the choice of r2, in fact, w2 depends only on g^{r2}. This is because w2 is a tadpole walk with starting point g^{r2}.

Let us temporarily fix r2 and choose r2' = h r2, where h is chosen randomly from C_G(g). Note that g^{r2'} = g^{h r2} = g^{r2}, since h ∈ C_G(g). So a tadpole walk starting at g^{r2} depends on r2, but is independent of the choice of h for r2' = h r2 ∈ C_G(g) r2. The random choice of r2' implies a random choice of h ∈ C_G(g) such that r2' = h r2. Since r1 w1 w2^{-1} r2^{-1} ∈ C_G(g) and h is random in C_G(g), r1 w1 w2^{-1} r2'^{-1} is a random element of C_G(g).

We have shown that for a fixed r2, a random choice of r2' ∈ C_G(g) r2 produces a random element of C_G(g) in Algorithm 1. Since this is true independently of the choice of r2 ∈ G, choosing a random r2 ∈ G and then a random r2' ∈ C_G(g) r2 ensures that r1 w1 w2^{-1} r2'^{-1} is a random element of C_G(g). But this two-step choice produces a uniformly random r2' ∈ G. Hence it suffices to choose a random r2' ∈ G directly, and any random choice of r2 produces a random element of C_G(g).
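A compact Python rendering of Algorithm 1 follows. The group interface (random_element, multiply, inverse, conjugate, and a canonical ordering key) is an assumed abstraction standing in for our C implementation, and words are stored here as group elements rather than as generator sequences.

    def centralizer_elements(g, G, f, is_landmark, num_walks):
        # Each walk ends on an attractor cycle; two walks reaching the same
        # lexically least landmark m yield the element r1*w1*(r2*w2)^{-1}.
        attractor_reps = {}                  # m -> word conjugating g to m
        out = []
        for _ in range(num_walks):
            r = G.random_element()           # accumulated conjugating word
            x = G.conjugate(g, r)            # start at a random conjugate of g
            word_at, order = {}, []          # landmark -> word; visit order
            while True:                      # tadpole walk under conjugation
                s = f(x)                     # deterministic generator choice
                r = G.multiply(r, s)
                x = G.conjugate(x, s)
                if is_landmark(x):
                    if x in word_at:         # landmark repeated: cycle traversed
                        break
                    word_at[x] = r
                    order.append(x)
            on_cycle = order[order.index(x):]        # landmarks on the cycle
            m = min(on_cycle, key=G.key)             # lexically least landmark
            if m in attractor_reps:
                out.append(G.multiply(word_at[m], G.inverse(attractor_reps[m])))
            else:
                attractor_reps[m] = word_at[m]
        return out

Each element appended to out conjugates g to the common landmark m and back, and hence centralizes g, in accordance with Theorem 1.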
3.1 Early Termination of Long Walks

As will be seen in Section 6, there can be several common attractor cycles with widely varying lengths. If a particular tadpole walk is excessively long, one gains evidence that one may be in a long attractor cycle. Under these circumstances, it is tempting to terminate the tadpole walk, and begin a new walk using a new random element that will hopefully lead to a shorter attractor cycle. Corollary 2 implies that as long as the decision to terminate is based solely on the number of steps or time passed, the resulting centralizers computed from completed tadpole walks will still be uniformly random. They will not be biased by such a decision procedure.

Corollary 2. In producing the random elements of the centralizer subgroup according to Theorem 1, one may terminate a particular tadpole walk based on the number of steps executed so far, and any properties of previous tadpole walks. The early termination of a particular tadpole may remove one element of the centralizer subgroup, but it will not bias the uniform distribution of the other random elements of the centralizer that are generated.

Proof. The tadpole walk to be terminated has already produced an initial random starting position g^r, for r ∈ G random. If one fixes r and chooses r' ∈ C_G(g) r uniformly at random, then g^{r'} = g^r. If r produces a centralizer element h ∈ C_G(g), then r' produces the element r' r^{-1} h ∈ C_G(g). Since r' r^{-1} is a uniformly random element of C_G(g) independently of when the tadpole walk is terminated, r' r^{-1} h is also uniformly random in C_G(g) independently of when the tadpole walk is terminated. So, termination of the tadpole walk causes removal of this uniformly random element r' r^{-1} h, which does not bias the uniform distribution of the resulting centralizer elements produced by Algorithm 1.

There are several potential heuristics for early termination of a tadpole walk. Such heuristics are not employed in our experiments, since the primary concern is evaluating the core algorithm.
However, one such heuristic is to set a variable cutoff to be the median number of steps over all previous tadpole walks. The median means that half of the walks were longer and half were shorter. One then terminates any tadpole walk that has already used two times cutoff steps. In computing the median, any terminated tadpole walk is considered to have taken an infinite number of steps.
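As an illustration (again, not used in our experiments), the cutoff heuristic can be phrased as follows, with terminated walks recorded as infinitely long:

    import math
    from statistics import median

    def run_with_cutoff(start_walk, history):
        # start_walk(max_steps) -> (result, steps); result is None if cut off.
        cutoff = 2 * median(history) if history else math.inf
        result, steps = start_walk(cutoff)
        history.append(steps if result is not None else math.inf)
        return result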

3.2 Complexity: Number of Tadpoles Required

The following theorem is well known. It is based on the fact that in any subgroup chain, the order of a subgroup is at most 1/2 the order of its parent in the chain. So, the longest possible chain has length log2 |G|. It has been observed that in practice, one can usually generate G using many fewer group elements than would be indicated by O(log n). Elementary abelian 2-groups are a worst case, and many algorithms for those groups reduce to linear algebra. Most groups encountered in practice are far from this worst case.

Theorem 3. O(log |G| - log ε) random elements of a group G suffice to generate G with probability of error bounded above by ε, 0 < ε < 1.

Theorem 4. For a group G, the expected number of tadpole walks needed to generate C_G(g) is O(log |C_G(g)|).

Proof. Theorem 3 states that one needs O(log |C_G(g)|) random centralizer elements. The Appendix of [5] shows that the expected number of attractors is (1/2) log2 |C_G(g)|. Algorithm 1 produces one random centralizer element per tadpole walk, except for those tadpole walks that are the first to reach a new attractor cycle. Since at most (1/2) log2 |C_G(g)| walks can be the first to reach a new attractor cycle, O(log |C_G(g)|) tadpole walks suffice to generate C_G(g).

A standard argument demonstrates the refined estimate of O(log |C_G(g)| - log ε) tadpole walks to generate C_G(g) with upper bound ε on the probability of error.

Let n < |G| be the size of the conjugacy class of the element g to be centralized. One can combine this with the expected length of a tadpole walk, √n, and assume c log |C_G(g)| tadpole walks for some fixed c chosen in advance. Noting that n |C_G(g)| = |G|, it is clear that Algorithm 1 executes in O(√n log |C_G(g)|) = O(√|G|) multiplications, and it succeeds with probability of error at most 1/|C_G(g)|^c. The next section describes a stopping criterion independent of knowledge of |C_G(g)| or |G|.

3.3 Stopping Criteria

Algorithm 1 is mostly a black box algorithm, in that it does not depend on the representation of x, g, or r, except for purposes of landmarks. Even that restriction could be removed by using the cycle-finding algorithm of Sedgewick and Szymanski [16]. As with all such randomized black box algorithms, one must rely on a randomized stopping criterion. Let the maximum length of a chain of subgroups in G be L.

Theorem 5. If L is an upper bound on the length of a subgroup chain in C_G(g), and if the stopping criterion of Algorithm 1 is to stop after k centralizer elements are produced, for k > L, then the probability of error before producing the full centralizer is at most 1/2^{k-L}.

The proof is clear. Note that if the order of the centralizer, |C_G(g)|, or the order |G|, is known, then the number of prime factors of that order (including repeated prime factors) can be used as the upper bound L.

If |C_G(g)| and |G| are not known a priori, then the next stopping criterion can be used. This yields the revised estimate O(√|G| - √n log ε) = O(√|G| (1 - log ε / √|C_G(g)|)), or the coarser estimate of O(√|G| log(1/ε)) for 0 < ε < 1 specifying the desired upper bound on the error probability.

Theorem 6. If the stopping criterion of Algorithm 1 is to stop after k centralizer elements in a row fail to cause the candidate centralizer subgroup to grow, then the probability of error is at most 1/2^{k-1}.

Proof. We will show the probability of error to be bounded above by 1/2^k + 1/4^k + 1/8^k + ... ≤ 1/2^{k-1}. To see this, assume that Algorithm 1 produces an infinite number of generators for the centralizer. Consider the last element that caused the candidate centralizer subgroup to grow. If the k immediately preceding steps had not increased the subgroup size, Algorithm 1 would have stopped before that element. The probability of that happening is at most 1/2^k, since the subgroup size at that point was at most 1/2 the full group size. Then look at the second to last generator that caused the candidate centralizer subgroup to grow. If there were k immediately preceding steps that did not produce a new generator, then we would have stopped at this point. The probability of that is 1/4^k. Continuing in the same way produces the series upper bound, which yields 1/2^{k-1}.
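The stopping criterion of Theorem 6 amounts to the following loop (a sketch; subgroup_order stands for an order oracle such as the vector-stabilizer method of Section 4.1):

    def generate_until_stable(random_centralizer_element, subgroup_order, k):
        # Stop after k consecutive elements fail to grow the candidate
        # subgroup; by Theorem 6 the error probability is at most 1/2**(k-1).
        gens, size, failures = [], 1, 0
        while failures < k:
            gens.append(random_centralizer_element())
            new_size = subgroup_order(gens)
            if new_size > size:
                size, failures = new_size, 0
            else:
                gens.pop()               # redundant element; discard it
                failures += 1
        return gens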
4. OTHER TADPOLE-BASED ALGORITHMS

Here, we present two additional tadpole-based algorithms, both of which are closely related to the algorithm for centralizer presented above. They are: computing vector stabilizers and group order; and conjugacy testing.

4.1 Vector Stabilizer and Group Order

Suppose we wish to compute the order of a matrix group G with generators ⟨g1, g2, ..., gk⟩ = G. We will do so using a method that makes use of tadpoles to compute vector stabilizers. The overall algorithm is closely related to the Schreier-Sims algorithm for permutation groups.

First, choose a vector x such that x^g ≠ x for some generator g ∈ {g1, g2, ..., gk}. Compute the transversal of G with respect to x. The transversal is defined as T_G(x) = {y : ∃ g ∈ G, y = x^g}. We compute the transversal by breadth-first search, beginning with the vector x, and enumerating all vectors reachable by some word in the generators of G.

Then, find a set of generators of the vector stabilizer subgroup of G with respect to x. The vector stabilizer subgroup is defined as S_G(x) = {g : g ∈ G, x^g = x}. To do this, we use a slightly modified version of Algorithm 1 (Centralizer). Instead of the domain consisting of elements of G and the action being conjugation, the domain consists of vectors and the action is vector-matrix multiplication.

The order of G is the product of the order of the transversal and the order of the vector stabilizer subgroup: |G| = |T_G(x)| |S_G(x)|. If the vector stabilizer is trivial, we return the order of the transversal as the order of the group. Otherwise, we recursively compute the order of the vector stabilizer subgroup, using the elements of the vector stabilizer discovered using tadpoles as generators of the group.
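The recursion of this section is summarized by the following Python sketch; the orbit enumeration is the breadth-first search described above, while moved_vector and stabilizer_generators are assumed interfaces (the latter standing for the tadpole-based computation of S_G(x)).

    from collections import deque

    def transversal_size(x, gens, apply_gen):
        # Breadth-first enumeration of T_G(x) = { x^g : g in G }.
        seen, queue = {x}, deque([x])
        while queue:
            y = queue.popleft()
            for g in gens:
                z = apply_gen(y, g)
                if z not in seen:
                    seen.add(z)
                    queue.append(z)
        return len(seen)

    def group_order(gens, apply_gen, moved_vector, stabilizer_generators):
        # |G| = |T_G(x)| * |S_G(x)|, recursing on the vector stabilizer.
        x = moved_vector(gens)       # some x with x^g != x; None if no such x
        if x is None:
            return 1                 # faithful action: only the identity remains
        t = transversal_size(x, gens, apply_gen)
        s_gens = stabilizer_generators(gens, x)
        if not s_gens:
            return t                 # trivial vector stabilizer
        return t * group_order(s_gens, apply_gen, moved_vector,
                               stabilizer_generators)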
4.2 Conjugacy Testing

We first consider the case in which one wants to do many conjugacy tests. In this limit, the time is O(√|G|) steps, the number of steps in a tadpole walk. Suppose we are given an element g ∈ G of order k, and we wish to determine which of the conjugacy classes of elements of order k that element belongs to. First, we will complete a small number of tadpoles for random elements in each of the conjugacy classes of order k. This will provide us with a few landmarks, each associating a tadpole attractor cycle with a specific conjugacy class. Then, we simply compute one more tadpole walk starting from element g, and look up which conjugacy class the attractor cycle landmark is associated with.

As argued in Section 2, and seen empirically in the experimental results of Section 6, there are only a few large attractor cycles that are visited with high probability. So, we need only a small number of landmarks in each of the conjugacy classes to find a match with high probability. In the unlikely event that the landmark has not been seen yet, we can compute another tadpole starting at g^r, where r is a random element of G.

This method is particularly efficient when there are many elements for which one wants to perform a conjugacy test, as the initial cost of associating landmarks with each conjugacy class is amortized over the tadpole computations for each of the elements.

Next, we consider the case of a single conjugacy test. In this case, the number of steps is O(√|G| log |G|). In this case, we consider two elements g, h ∈ G, and we wish to determine whether they are conjugate in G. (This is the Conjugator problem of Section 1.)

5. BIASED TADPOLES

Recall the tadpole walk of Section 3 under the conjugation action: the current element xi is conjugated by the generator f(xi), until some element is repeated. So, one step in the walk is defined as

    g = f(xi)
    xi+1 = g^{-1} xi g

The walk continues until some xi = xj, for j < i. The time to compute one step is dominated by the two matrix multiplication operations required by the conjugation. For matrices of dimension n, this requires 2n^3 operations.

5.1 Biasing Function

To add a bias, we define a prefix function [e1, e2, ..., em] = p(x, g), where x ∈ G, g ∈ G is one of the generators, and [e1, e2, ..., em] is a vector containing the first m elements of the matrix resulting from the conjugation g^{-1} x g. In other words, we are computing the first m out of n^2 entries of the result of one step.

This prefix computation can be done by two vector-matrix multiplication operations: the first multiplying a 1 x n vector of g^{-1} by the full n x n matrix x; the second multiplying the resulting 1 x n vector by an n x m portion of g. The resulting cost is n^2 + nm operations. Typically, m ≪ n.
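Over GF(2) the prefix computation can be sketched with NumPy as follows. The prefix itself is as defined above; the selection rule shown (apply the candidate generator whose prefix is lexicographically least) is one illustrative way to turn the prefix into a deterministic bias, and gens is a list of hypothetical (g, g_inv) matrix pairs.

    import numpy as np

    def prefix(x, g, g_inv, m):
        # p(x, g): first m entries of g^{-1} x g, via two vector-matrix
        # products costing n^2 + n*m operations in total.
        row = (g_inv[0] @ x) % 2            # (1 x n) times (n x n)
        return tuple((row @ g[:, :m]) % 2)  # (1 x n) times (n x m)

    def biased_step(x, gens, m=8):
        # Apply the generator whose prefix of g^{-1} x g is lexically least.
        g, g_inv = min(gens, key=lambda gg: prefix(x, gg[0], gg[1], m))
        return (g_inv @ x @ g) % 2          # full step: two n^3 products

With 100 candidate generators and m = 8 (the parameters of Section 6), the prefix pass costs 100(n^2 + nm) operations against 2n^3 for the full step, consistent with the roughly 50% per-step overhead observed in Section 6.3.1.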

Table 1: Results of computing group order in twelve different matrix groups (* indicates that GAP exceeded its maximum of 4 GB of RAM without completing).

Group            Order        Dim   Size of 1st   Chain    Our       GAP's
                                    Transversal   Length   Time (s)  Time (s)
J3               5.0 x 10^7    80      50232960      1      750.37      *
McL              9.0 x 10^8    22         22275      3        1.27    19.89
He               4.0 x 10^9    51          8330      4        1.25  1161.58
A14              4.3 x 10^10   12          3003      8        2.17     0.50
2^{1+8}.O8+(2)   8.9 x 10^10   24       4147200      3       55.03      *
Ru               1.5 x 10^11   28        417600      3        5.66      *
Co3              5.0 x 10^11   22         37950      6        1.82    10.90
Co2              4.2 x 10^13   22         46575      5        2.03  1108.76
Fi22             6.4 x 10^13   78        142155      6        3.98      *
F4(2)            3.3 x 10^15   26      17821440      3      257.91      *
Co1              4.2 x 10^18   24       8386560      4      124.63      *
E6(2)            2.1 x 10^23   27      69193488      5     1219.85      *

These three algorithms, centered around lexicographic orderings (Linton [10]; Hulpke and Linton [6]), provide the basis for important heuristics for tadpoles acting: on cosets; on sets; and on subgroups. Finding lexicographic orderings on sets leads to biases that are useful for finding set stabilizers for permutation groups, while lexicographic orderings on subgroups can lead to new heuristics for finding normalizer subgroups.

Last, note the following important implementation detail: If a biasing function uses a randomized algorithm as a subroutine, then the seed of the corresponding random number generator must be re-initialized to a fixed value at the beginning of the function. Otherwise, the biasing function would not be deterministic (but pseudo-random), as required.

6. EXPERIMENTAL RESULTS

All of the experimental results given below were produced using a computer with: two 2.00 GHz dual-core Intel Xeon CPUs; 16 GB of RAM; and Linux version 2.6.9. All of our algorithms were implemented in C and compiled with GCC version 3.4.5. All matrix representations and involution words in standard generators were taken from version 3.0 of the ATLAS of Finite Group Representations [18].

In cases where our algorithm requires the use of a random group element, we produce that element using a naive random walk of length at least 100 (and length 10000 for the larger groups). A more sophisticated generator, such as product replacement [3], would have produced the same timings or better.

For all biased tadpoles, we used 100 random group elements as generators and a prefix of size 8 (see Section 5.1 for the definition of prefix).

6.1 Group Order and Centralizer for Matrix Groups

Here, we provide a comparison of our tadpole-based methods for matrix groups to the corresponding algorithms implemented in GAP, including computing group order and the centralizer of an involution. We choose to focus on involutions because they are often useful in other algorithms, and because their centralizers are usually large. Although GAP's strategy appears to be a naive enumeration of all group elements, we compare with GAP as an implementation of the only other general matrix centralizer algorithm in the literature.

Tables 1 and 2 show experimental results for the following twelve groups: Janko group J3; McLaughlin group McL; Held group He; alternating group A14; Conway groups Co3, Co2, and Co1; 2^{1+8}.O8+(2); Rudvalis group Ru; Fischer group Fi22; and untwisted groups of exceptional Lie type F4(2) and E6(2). Note that 2^{1+8}.O8+(2) appears as a subgroup of Co1 and is of interest because it is far from simple.

When choosing a set of groups to test on, we restricted ourselves to groups with an available matrix representation over GF(2) of dimension 128 or less. This is because our implementation of matrix operations was originally developed for J4, which has a representation of dimension 112 over GF(2) (so we represent each row of the matrix with two 64-bit words). Because we have optimized for this case, matrix operations of smaller dimension are suboptimal. In the future, we plan to generalize our implementation to yield maximum performance for arbitrary dimensions, possibly by incorporating the MeatAxe library.

6.1.1 Group Order

We tested our tadpole-based algorithm for computing group order on the twelve matrix groups listed in Table 1. The groups range in size from 5.0 x 10^7 to 2.1 x 10^23, with dimension ranging from 12 to 80.

Recall that the algorithm for computing group order (Section 4.1) works by recursively computing a series of vector stabilizers and transversals. Table 1 lists the size of the first (and largest) transversal, and the total number of transversals computed (i.e., the chain length).

Finally, the table gives the total time required by our algorithm and by the corresponding group order algorithm in GAP. In cases where GAP ran out of memory before completing (max 4 GB), no time is shown.
Of the five out of twelve cases where GAP produced an answer, GAP was faster in one (A14), and our algorithm was between 5 and 1000 times faster in the other four. With the exception of J3 (which has a trivial vector stabilizer), our algorithm completes in seconds for groups of order 10^13 and less, and in minutes for groups up to order 10^23.

Table 2: Results of computing the centralizer of an involution in twelve different matrix groups (* indicates that GAP exceeded its maximum of 4 GB of RAM without completing).

Group            Involution   Centralizer    Our Centralizer   Our Order   GAP Centralizer and
                 Word         Size           Time (s)          Time (s)    Order Time (s)
J3               a                    1920      17.78             0.90          *
McL              a                   40320       4.64             0.69        18.12
He               a                  161280       7.11             1.17      1010.06
A14              (ba)^6               46080      13.39             1.14         1.15
2^{1+8}.O8+(2)   a                   49152      15.11             2.26          *
Ru               a                  116480       7.78             1.16          *
Co3              bb                2903040       8.05             1.32        11.86
Co2              a               743178240       6.91             1.66       1011.21
Fi22             a             18393661440      15.74             5.69          *
F4(2)            a               754974720      20.49            66.84          *
Co1              a              2012774400      86.56            43.20          *
E6(2)            a            135291469824    1643.73            11.87          *

6.1.2 Centralizers of Involutions

Next, we tested our algorithm for computing centralizers using the same twelve groups. In each case, we chose to compute the centralizer of an involution (an element of order 2). Table 2 shows the results of these computations. In that table, the chosen involutions are given as short words in the two generators listed for that group in version 3 of the ATLAS of Finite Group Representations [18]. The centralizers range in order from 1920 to approximately 1.35 x 10^11. In each case, we give the time for our algorithm in two pieces: the time to compute the centralizer (as a set of generators of the subgroup); and the time to compute the order of the centralizer (using our algorithm for group order).

For GAP we list only a single time, because GAP computes the order of a centralizer as that centralizer is produced. Note that the times listed for GAP in Table 2 are nearly identical to those in Table 1, suggesting that GAP is using the same strategy for both group order and centralizer computations.

Again, of the five cases where GAP completes, GAP is faster in one (A14), and our method is faster in the other four, with a maximum speedup of over 100 times. For groups of order 10^15 or less, we compute the centralizer in 20 seconds or less. For the largest case, group order approximately 10^23, the algorithm completes in less than half an hour.

6.2 Case Study of a Large Group: J4

Because our initial motivation for developing biased tadpoles was the problem of conjugacy testing in Janko's group J4, we report here results for the closely related problem of finding centralizers in that group.

We examine two extreme cases: computing the centralizer of an involution; and of an element of order 37. For involutions, the centralizer is relatively large, and the conjugacy class relatively small. For elements of order 37, the reverse is true. Our element of order 2 is the product (abababbb)^6, and our element of order 37 is the product ab (generators as given in v3 of the ATLAS [18]).

Centralizer of an Involution.

In this case, the centralizer is of order 1816657920 ≈ 1.8 x 10^9 and the conjugacy class (the domain of the tadpole walk) is of order 47766599364 ≈ 4.8 x 10^10.

Using biased tadpoles, we were able to produce one element of the centralizer in an average of 11.17 seconds. We produced 100 such elements, for a total time of less than 20 minutes. For these 100 biased tadpoles, an attractor was found after an average of 27500 steps, and there were a total of 4 attractors discovered. Additional analysis of these tadpoles, and a comparison between the biased and unbiased versions, is presented in Section 6.3.1.

Because we expect O(log n) elements of a group to generate that group, these 100 elements are sufficient to generate the centralizer with very high probability. Theorem 5 produces a bound on the probability of error of 2^{-65}, where k = 100 and L = 34, with L being the number of prime factors, including repetition, of |J4|. We could not directly confirm this by computation, because the size of the first transversal in the centralizer exceeded available memory.

Centralizer of an Element of Order 37.

In this case, the centralizer is of order 37 and the conjugacy class (the domain of the tadpole walk) is of order 2345285703948042240 ≈ 2.3 x 10^18. In this large space, the biased tadpoles required an average of 23.5 hours to complete. We computed 20 such elements in parallel on 20 nodes of a compute cluster. The longest cases, and therefore the entire computation, were completed in about 1.5 days. For these 20 tadpoles, an attractor was found after an average of 2.0 x 10^8 steps, and a total of 3 attractors were found.

We confirmed that the order of the computed centralizer was 37. The time for this verification was trivial in comparison to the tadpole computations.

6.3 Tadpole Accelerators

Here, we present experimental results for two tadpole accelerators: biasing and parallelism. The results show a significant speedup of 9.6 times for biasing in the case of computing centralizing elements in J4, and a full linear speedup of 4.0 times on a machine with 4 CPU cores.

6.3.1 Biasing Effectiveness

For this experiment we used the same 100 tadpole computations that were used to compute the centralizer of an involution in J4 (Section 6.2). For each of the 100 cases, we ran one biased and one unbiased tadpole.
For involu- Here, we present experimental results for two tadpole ac- tions, the centralizer is relatively large, and the conjugacy celerators: biasing and parallelism. The results show a sig- class relatively small. For elements of order 37, the reverse nificant speedup of 9.6 times for biasing in the case comput- 6 is true. Our element of order 2 is the product (abababbb) , ing centralizing elements in J4, and a full linear speedup of and our element of order 37 is the product ab (generators as 4.0 times on a machine with 4 CPU cores. given in v3 of the ATLAS [18]). 6.3.1 Biasing Effectiveness Centralizer of an Involution. For this experiment we used the same 100 tadpole com- In this case, the centralizer is of order 1816657920 1.8 putations that were used to compute the centralizer of an 9 ≈ × 10 and the conjugacy class (the domain of the tadpole walk) involution in J4 (Section 6.2). For each of the 100 cases, we is of order 47766599364 4.8 1010. ran one biased and one unbiased tadpole. ≈ × 9. REFERENCES Table 3: Comparison of biased and unbiased tad- poles (averages over 100 trials). [1] J. Bray. An improved method for generating the centraliser of an involution. Arch. Math. (Basel), Biased Unbiased 74:241–245, 2000. [2] G. Butler. Computing in permutation and matrix Tail Length 23751 169497 groups II: Backtrack algorithm. Mathematics of Attr Length 3832 224153 Computation, 39(160):671–680, Oct. 1982. Total Steps 27583 393650 [3] F. Celler, C. Leedham-Green, S. Murray, A. Niemeyer, Steps per sec 2463 3679 and E. O’Brien. Generating random elements of a Time (s) 11.2 107.0 finite group. Comm. Algebra, 23:4931–4948, 1995. [4] G. Cooperman and L. Finkelstein. New methods for Attr Size Visits Attr Size Visits using Cayley graphs in interconnection networks. List 1454 75 35350 7 Discrete Applied Mathematics, 37/38:95–118, 1992. of 7921 11 238364 93 [5] G. Cooperman and M. Tselman. Using tadpoles to Attrs 11218 6 reduce memory and communication requirements for 14961 8 exhaustive, breadth-first search using distributed computers. In Proc. of ACM Symposium on Parallel Architectures and Algorithms (SPAA-97), pages Table 3 presents the results of these 100 trials. The statis- 231–238. ACM Press, 1997. tics include an average of: the length of the tail (number of [6] A. Hulpke and S. Linton. Total ordering on subgroups steps until an attractor cycle is reached); the length of the and cosets. In Proc. of Int. Symp. on Symbolic and attractor cycle; the total number of steps; the time to com- Algebraic Comp. (ISSAC-03), pages 156–160, 2003. plete the walk; and the number of steps per second. Also [7] J. Leon. On an algorithm for finding a base and strong listed are the lengths of the attractor cycles discovered, and generating set for a group given by a set of generating the number of trials out of 100 that visited each attractor. permutations. Math. Comp., 35:941–974, 1980. These results show that biasing can significantly decrease the length of a tadpole walk. In this case, biasing yields a 14 [8] J. S. Leon. Permutation group algorithms based on times reduction in the total walk length, mostly due to the partitions, I: Theory and algorithms. J. Symbolic almost 60 times reduction in the average attractor length. Computation, 12:533–583, 1991. As predicted in Section 5.1, the cost of computing the bias [9] J. S. Leon. Partitions, refinements, and permutation increases the time to complete one step in the tadpole by group computation. In Groups and computation II, 50%. 
Overall in this case, biasing yields an almost 10 times 1995, volume 28 of DIMACS (Discrete Math. Theor. speedup over unbiased tadpoles. Comp. Sci.), pages 123–158. Amer. Math. Soc., 1997. [10] S. Linton. Finding the smallest image of a set. In 6.3.2 Parallel Speedup on Multicore Architectures Proceedings of the 2004 international symposium on Symbolic and algebraic computation (ISSAC-04), For this experiment, we simply chose one of the biased pages 229–234, New York, NY, USA, 2004. ACM. tadpole walks from the previous section and computed that walk four times. We tested three cases: one process com- [11] E. M. Luks. Computing in solvable matrix groups. In puting all four walks serially; two processes, each comput- 33rd Annual Symposium on foundations of Computer ing two walks in parallel; and four processes, each computing Science, pages 111–120. IEEE, 1992. one walk in parallel. The three cases completed in 49.23 sec- [12] S. H. Murray and E. A. O’Brien. Selecting base points onds, 24.64 seconds, and 12.25 seconds, respectively. This for the schreier-sims algorithm for matrix groups. J. test shows the expected four times speedup on a computer Symb. Comput., 19:577–584, 1995. with four CPU cores. [13] I. Pak. The product replacement algorithm is polynomial. In Proc. 41st IEEE Symposium on Foundations of Computer Science (FOCS), pages 7. CONCLUSION AND FUTURE WORK 476–485. IEEE Press, 2000. Biased tadpoles demonstrate a O(p G log(1/ε))) cen- | | [14] R. Parker. Tadpole, 1986. oral communication. tralizer algorithm for matrix groups for ε an upper bound [15] J. Pollard. A Monte Carlo method for factorization. on the probability of error. The method is orthogonal to BIT Numerical Mathematics, 15(3):331–334, 1975. another general method for matrix groups from Magma: [16] R. Sedgewick and T. G. Szymanski. The complexity of the use of the Murray-O’Brien random Schreier-Sims algo- finding periods. In STOC ’79: Proc. of the 11th rithm [12] in combination with Leon’s backtracking [8]. We annual ACM Symposium on Theory of Computing, thank the reviewer for pointing out this other method. A pages 74–80, New York, NY, USA, 1979. ACM. combination of our method with that one is planned for the [17] H. Theißen. Eine Methode zur future. Normalisatorberechnung in Permutationsgruppen mit Anwendungen in der Konstruktion primitiver 8. ACKNOWLEDGMENT Gruppen. PhD thesis, Rheinisch-Westfalische Technische Hochschule, Aachen, Germany, 1997. We acknowledge helpful discussions with Jurgen¨ Muller¨ concerning conjugacy testing, that led us later to consider [18] R. Wilson and et al. ATLAS of finite group the more general problem of centralizers. We also thank the representations – version 3. reviewers for helpful comments. http://brauer.maths.qmul.ac.uk/Atlas/v3/.